#Self-Correction

21 posts

[Paper Review] DMax: Aggressive Parallel Decoding for dLLMs

This paper proposes DMax, a paradigm for dLLMs that reformulates the self-refinement of predictions as a transformation within embedding space. Its core technique, OPUT, builds the noisy input from the model's own predictions during training, which reduces the train-inference mismatch and maximizes the model's self-correction ability.

#Review #Diffusion Language Models #Parallel Decoding #Error Accumulation #On-Policy Training #Self-Correction #Embedding Space

April 9, 2026

[Paper Review] Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines

This paper proposes a framework that additively decomposes performance gains using a Four-Condition Design. The method compares generator baseline performance ($x_1$), standard revision ($x_2$), an independent re-solving control ($x_3$), and a structured null-draft control ($x_4$) to estimate each of three effects separately.

#Review #Multi-LLM Pipeline #Iterative Refinement #Self-Correction #Task-Time Scaling #Code Generation #MCQ

April 1, 2026

[Paper Review] DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation

A detailed review of the paper 'DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation', published on arXiv.

#Review #Agentic Systems #Presentation Generation #Large Language Models (LLMs) #Multimodal LLMs (MLLMs) #Environment-Grounded Reflection #Self-Correction #Dual-Agent Framework #Supervised Fine-tuning

March 8, 2026

[Paper Review] Recursive Think-Answer Process for LLMs and VLMs

A detailed review of the paper 'Recursive Think-Answer Process for LLMs and VLMs', posted to arXiv by Yong Man Ro.

#Review #LLMs #VLMs #Reasoning #Self-Correction #Reinforcement Learning #Confidence Estimation #Iterative Refinement #Think-Answer

March 2, 2026

[Paper Review] OCR-Agent: Agentic OCR with Capability and Memory Reflection

A detailed review of the paper 'OCR-Agent: Agentic OCR with Capability and Memory Reflection', published on arXiv.

#Review #OCR #VLM #Self-Correction #Agentic AI #Capability Reflection #Memory Reflection #Iterative Refinement #Chain-of-Thought

February 24, 2026

[Paper Review] UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

A detailed review of the paper 'UniT: Unified Multimodal Chain-of-Thought Test-time Scaling', posted to arXiv by Animesh Sinha.

#Review #Multimodal AI #Chain-of-Thought #Test-time Scaling #Unified Models #Iterative Reasoning #Image Generation #Visual Reasoning #Self-Correction

February 17, 2026

[Paper Review] Distilling Feedback into Memory-as-a-Tool

A detailed review of the paper 'Distilling Feedback into Memory-as-a-Tool', posted to arXiv by vicgalle.

#Review #LLM #Continual Learning #Memory-Augmented Agents #Self-Correction #Feedback Distillation #Tool Use #Inference Cost Amortization #Rubric-based Learning

January 11, 2026

[Paper Review] ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement

A detailed review of the paper 'ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement', posted to arXiv by Zhaohe Liao.

#Review #Table Visualization #Infographic Generation #Multi-modal Large Language Models (MLLMs) #Diffusion Models #Self-Correction #Reinforcement Learning #Graphic Design #Data-to-Visual Mapping

December 16, 2025

[Paper Review] VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning

A detailed review of the paper 'VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning', posted to arXiv by Yansong Tang.

#Review #Tool-integrated Visual Reasoning #Referring Grounded Reasoning #Agentic Reinforcement Learning #Self-Correction #Large Vision-Language Models #Chain-of-Thought #Tool Refinement

December 8, 2025

[Paper Review] Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

A detailed review of the paper 'Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning', published on arXiv.

#Review #Self-Evolving Agent #Vision-Language Models #Tool-Integrated Reasoning #Reinforcement Learning #Self-Correction #Multimodal AI #Generative AI

November 25, 2025

[Paper Review] LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls

A detailed review of the paper 'LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls', published on arXiv.

#Review #Large Language Models (LLMs) #Tool Learning #Data Generation #Model Training #Closed-Loop Framework #Reinforcement Learning (RL) #Data Refinement #Self-Correction

November 12, 2025

[Paper Review] RiddleBench: A New Generative Reasoning Benchmark for LLMs

A detailed review of the paper 'RiddleBench: A New Generative Reasoning Benchmark for LLMs', published on arXiv.

#Review #LLM Reasoning #Generative AI #Benchmark #Logical Deduction #Spatial Reasoning #Constraint Satisfaction #Hallucination Cascade #Self-Correction

November 9, 2025

[Paper Review] From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model

A detailed review of the paper 'From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model', published on arXiv.

#Review #Discrete Diffusion Models #Vision-Language Models #Error Cascades #Self-Correction #Refinement Framework #Parallel Generation #Image Captioning #Hallucination Mitigation

October 27, 2025

[Paper Review] PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold

A detailed review of the paper 'PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold', published on arXiv.

#Review #Deep Research Agent #Reinforcement Learning from AI Feedback #RLOO Algorithm #Large Language Models #Tool Use #Self-Correction #Reasoning Scaffold #Agent Alignment

October 22, 2025

[Paper Review] DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search

A detailed review of the paper 'DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search', published on arXiv.

#Review #Multimodal LLM #Web Search #Visual Question Answering #Reinforcement Learning #Image Cropping #Self-Correction #Tool Use

October 15, 2025

[Paper Review] Test-Time Policy Adaptation for Enhanced Multi-Turn Interactions with LLMs

A detailed review of the paper 'Test-Time Policy Adaptation for Enhanced Multi-Turn Interactions with LLMs', posted to arXiv by Yao Shu.

#Review #Large Language Models #Multi-turn Interaction #Test-Time Adaptation #Reinforcement Learning from Human Feedback #Policy Optimization #Online Learning #Self-Correction

October 1, 2025

[Paper Review] THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning

A detailed review of the paper 'THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning', posted to arXiv by Yicheng Pan.

#Review #Mathematical Reasoning #Tool-Integrated Reasoning #Reinforcement Learning #Hierarchical Optimization #Self-Correction #Large Language Models #Code Generation

September 18, 2025

[Paper Review] Interleaving Reasoning for Better Text-to-Image Generation

A detailed review of the paper 'Interleaving Reasoning for Better Text-to-Image Generation', posted to arXiv by Shixiang Tang.

#Review #Text-to-Image Generation #Interleaving Reasoning #Multimodal Learning #Visual Quality #Fine-grained Detail #Diffusion Models #Self-Correction

September 9, 2025

[Paper Review] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

A detailed review of the paper 'Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off', posted to arXiv by jgkwak.

#Review #Virtual Try-On #Virtual Try-Off #Diffusion Transformer #Bidirectional Learning #Generative AI #Fashion Synthesis #Attention Mechanism #Self-Correction

August 11, 2025

[Paper Review] Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling

A detailed review of the paper 'Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling', posted to arXiv by Ruolin Shen.

#Review #Visual Document Understanding #Visual Question Answering #Multi-Agent System #Test-Time Scaling #Self-Correction #Mixed Reward Modeling #Large Language Models

August 8, 2025

[Paper Review] Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction

A detailed review of the paper 'Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction', posted to arXiv by Jui-Hui Chung.

#Review #Automated Theorem Proving #Formal Verification #Language Models #Self-Correction #Data Synthesis #Reinforcement Learning #Model Averaging #Lean

August 6, 2025