#Reward Model

15 posts

[Paper Review] VEFX-Bench: A Holistic Benchmark for Generic Video Editing and Visual Effects

[Paper Review] Unified Personalized Reward Model for Vision Generation

[Paper Review] UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

[Paper Review] MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique

[Paper Review] RewardDance: Reward Scaling in Visual Generation

[Paper Review] Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

[Paper Review] Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models

[Paper Review] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

[Paper Review] Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

[Paper Review] A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling
