#Representation Distillation

1 post

[Paper Review] OPRD: On-Policy Representation Distillation

This paper identifies two inherent limitations of On-Policy Distillation (OPD), a method essential to the post-training of Large Language Models (LLMs), and proposes a new approach, OPRD (On-Policy Representation Distillation), to address them.

#Review #On-Policy Distillation #Representation Distillation #Large Language Models #Knowledge Distillation #Hidden States #Mathematical Reasoning #Variance Reduction

June 4, 2026