#Distribution Vector

1개의 포스트

[논문리뷰] OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training

LLM의 도메인 및 언어 적응을 위해 CPT 를 수행할 때, 데이터의 혼합 비율(Mixture Ratio)은 매우 민감한 하이퍼파라미터입니다. 기존에는 이 비율을 학습 전에 고정해야 하며, 부적절할 경우 수주간의 GPU 연산 자원이 낭비되는 문제가 있었습니다.

#Review #Continual Pre-training #Model Merging #Distribution Vector #Bayesian Optimization #LLM Adaptation

2026년 3월 31일