#Annotated Data

1개의 포스트

[논문리뷰] CARFT: Boosting LLM Reasoning via Contrastive Learning with Annotated Chain-of-Thought-based Reinforced Fine-Tuning

본 논문은 LLM의 추론 능력 향상을 목표로, 기존 SFT(Supervised Fine-Tuning) 방식의 제한된 일반화 능력과 RL(Reinforcement Learning) 기반 방식의 불안정한 추론 경로 샘플링 및 주석된 CoT(Chain-of-Thought) 활용 부족 이라는 두 가지 주요 한계를 해결하고자 합니다.

#Review #LLM Reasoning #Contrastive Learning #Reinforcement Learning #Fine-tuning #Chain-of-Thought (CoT)#Annotated Data #Model Stability

2025년 8월 25일