#Step-level Divergence

1 post

[Paper Review] Learning from the Self-future: On-policy Self-distillation for dLLMs

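This paper addresses the problem that existing on-policy self-distillation (OPSD) methods are optimized for autoregressive (AR) models and therefore conflict with the non-autoregressive generation that is characteristic of dLLMs.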

#Review #On-policy Self-distillation #Diffusion Large Language Models #dLLMs #Step-level Divergence #Self-future #Reasoning Benchmarks

June 16, 2026