#Principal Component Analysis (PCA)

1 post

[Paper Review] Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection

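This paper addresses a problem in pre-training large language models (LLMs): existing score-based data selection methods degrade performance because the data they select lacks diversity.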

#Review #Data Selection #Large Language Models (LLMs) #Data Diversity #Data Quality #Principal Component Analysis (PCA) #Orthogonal Dimensions #Pre-training

October 23, 2025