[논문리뷰] On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language ModelsYanxi Chen이 arXiv에 게시한 'On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models' 논문에 대한 자세한 리뷰입니다.#Review#Reinforcement Fine-Tuning (RFT)#Large Language Models (LLMs)#Entropy Dynamics#Exploration-Exploitation#Policy Optimization#GRPO#Entropy Control#Discriminator Score2026년 2월 8일댓글 수 로딩 중
[논문리뷰] VIDEOP2R: Video Understanding from Perception to ReasoningarXiv에 게시된 'VIDEOP2R: Video Understanding from Perception to Reasoning' 논문에 대한 자세한 리뷰입니다.#Review#Video Understanding#Reinforcement Fine-Tuning (RFT)#Large Video Language Models (LVLMs)#Perception and Reasoning#Chain-of-Thought (CoT)#Process-Aware Learning#Policy Optimization#Credit Assignment2025년 11월 18일댓글 수 로딩 중