#RL Fine-tuning

1 post

[Paper Review] On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs

This paper aims to evaluate the robustness and Chain-of-Thought (CoT) consistency of vision-language models (VLMs) fine-tuned with reinforcement learning (RL).

#Review #VLM #RL Fine-tuning #Chain-of-Thought #Robustness #Faithfulness #Textual Perturbations #Visual Grounding #Uncertainty Calibration

February 15, 2026