#Iterative Fine-Tuning

1 post

[Paper Review] MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

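This study aims to overcome the difficulties that multimodal large language models (MLLMs) face on reasoning tasks such as complex mathematical problem solving. In particular, it addresses a limitation of existing approaches: relying on static datasets derived from teacher models restricts a model's ability to adapt to new problems and to generalize robustly.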

#Review #Multimodal Reasoning #Mathematical Problem Solving #Self-Evolving #Iterative Fine-Tuning #Reward Models #Reflection #Large Language Models (LLMs)

November 12, 2025