#D-Optimal Selection

1 post

[Paper Review] Pruning and Distilling Mixture-of-Experts into Dense Language Models

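This study proposes a systematic framework for converting expert-based architectures into efficient dense models, addressing the deployment constraints caused by the high memory requirements of MoE models.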

#Review #Mixture-of-Experts #Knowledge Distillation #Model Pruning #D-Optimal Selection #Dense Language Models #Expert Scoring #Submodularity

June 8, 2026