[Paper Review] RubricBench: Aligning Model-Generated Rubrics with Human Standards. A detailed review of the paper 'RubricBench: Aligning Model-Generated Rubrics with Human Standards', posted on arXiv. #Review #LLM Evaluation #Reward Models #Rubric-Guided Evaluation #Benchmarks #Model Alignment #Human Standards #Cognitive Misalignment. March 2, 2026
[Paper Review] F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare. A detailed review of the paper 'F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare', posted on arXiv. #Review #Reinforcement Learning #LLM #Policy Optimization #Reward Models #Diversity Preservation #Focal Loss #Group Sampling #Mathematical Reasoning. February 8, 2026
[Paper Review] MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models. A detailed review of the paper 'MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models', posted on arXiv. #Review #Reward Models #LLM Memory Management #Benchmarking #Long Context #Evaluation Metrics #Generative RMs #Memory Management Patterns. January 20, 2026
[Paper Review] Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image. A detailed review of the paper 'Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image', posted on arXiv. #Review #Reward Models #Multimodal LLMs #Benchmark #Text-to-Image Generation #Image Editing #Interleaved Generation #Multimodal Reasoning #MLLM-as-a-judge. December 18, 2025
[Paper Review] Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following. A detailed review of the paper 'Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following', posted on arXiv. #Review #Multimodal Judges #LMM Evaluation #Pluralistic Criteria #Criteria-Following #Trade-off Sensitivity #Conflict Resolution #Reward Models #Benchmark. November 27, 2025
[Paper Review] MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning. A detailed review of the paper 'MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning', posted on arXiv. #Review #Multimodal Reasoning #Mathematical Problem Solving #Self-Evolving #Iterative Fine-Tuning #Reward Models #Reflection #Large Language Models (LLMs). November 12, 2025
[Paper Review] Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures. A detailed review of the paper 'Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures', posted on arXiv. #Review #Subjective Preference Learning #Writing Evaluation #Reward Models #RLHF #Cross-Cultural AI #Generative Models #Language Model Judges #Genre Instability. October 17, 2025
[Paper Review] Controlling Multimodal LLMs via Reward-guided Decoding. A detailed review of the paper 'Controlling Multimodal LLMs via Reward-guided Decoding', posted on arXiv by Michal Drozdzal. #Review #Multimodal LLMs #Reward Models #Guided Decoding #Visual Grounding #Hallucination Mitigation #Object Precision #Object Recall #Inference-time Control. August 18, 2025