#Human Preference Alignment

4 posts

[Paper Review] Personalizing Text-to-Image Generation to Individual Taste

A detailed review of the paper 'Personalizing Text-to-Image Generation to Individual Taste', published on arXiv.

#Review #Text-to-Image Generation #Personalization #Reward Modeling #Human Preference Alignment #Subjective Aesthetics

April 9, 2026

[Paper Review] E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

A detailed review of the paper 'E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models', published on arXiv.

#Review #Reinforcement Learning #Flow Models #Entropy-aware Sampling #Group Relative Policy Optimization #SDE #Human Preference Alignment #Image Generation

January 7, 2026

[Paper Review] G^2RPO: Granular GRPO for Precise Reward in Flow Models

A detailed review of the paper 'G^2RPO: Granular GRPO for Precise Reward in Flow Models', published on arXiv.

#Review #Reinforcement Learning #Flow Models #Generative Models #Human Preference Alignment #Stochastic Differential Equations (SDE) #Reward Signal #Multi-Granularity

October 9, 2025

[Paper Review] TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

A detailed review of the paper 'TempFlow-GRPO: When Timing Matters for GRPO in Flow Models', published on arXiv by Jian Yang.

#Review #Flow Matching #Reinforcement Learning #Human Preference Alignment #GRPO #Temporal Credit Assignment #Generative AI #Text-to-Image

August 20, 2025