[Paper Review] E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models
A detailed review of the paper 'E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models', posted on arXiv.
#Review #Reinforcement Learning #Flow Models #Entropy-aware Sampling #Group Relative Policy Optimization #SDE #Human Preference Alignment #Image Generation
January 7, 2026

[Paper Review] π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models
A detailed review of the paper 'π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models', posted on arXiv.
#Review #Reinforcement Learning (RL) #Vision-Language-Action Models (VLAs) #Flow-based Models #Policy Optimization #Robotics #Flow Matching #SDE #MDP
November 9, 2025