[Paper Review] Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

A detailed review of the paper 'Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training', posted to arXiv by Peng Cheng.

#Review #Mixture-of-Experts #Large Language Models #Checkpoint Recycling #Model Growth #Efficient Pretraining #Depth Growth #Width Growth #Sunk Cost

October 10, 2025