[Paper Review] Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation
A detailed review of the paper 'Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation', posted on arXiv.
#Review #Reinforcement Learning #LLM Reasoning #Group Relative Policy Optimization #Advantage Estimation #Exploration-Exploitation #Curriculum Learning #Multi-modal LLMs
February 12, 2026
[Paper Review] Reasoning in Space via Grounding in the World
A detailed review of the paper 'Reasoning in Space via Grounding in the World', posted on arXiv by Li Zhang.
#Review #3D Visual Grounding #Spatial Reasoning #Large Language Models (LLMs) #Chain-of-Thought (CoT) #Hybrid Representation #Multi-modal LLMs #Point Clouds
October 16, 2025
[Paper Review] Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
A detailed review of the paper 'Efficient Multi-modal Large Language Models via Progressive Consistency Distillation', posted on arXiv.
#Review #Multi-modal LLMs #Token Compression #Efficiency #Knowledge Distillation #Progressive Learning #Consistency Distillation #MLLM Training
October 6, 2025