Latest Posts

[Paper Review] Less is More: Early Stopping Rollout for On-Policy Distillation

This paper addresses the Off-policy Teacher Decay problem that arises in existing on-policy distillation (OPD) methods.

#Review #On-policy Distillation #Knowledge Distillation #Language Models #Early Stopping Rollout #Off-policy Teacher Decay #Cascading Alignment #Sub-mode Commitment

May 27, 2026

[Paper Review] Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents

This paper aims to address the problem that small open-source CUAs still fall short in domain-specific performance across diverse software environments.

#Review #Computer-Use Agent #Domain Specialization #Annotation-free #Weakness-aware #Direct Preference Optimization #GUI Agent

May 27, 2026

[Paper Review] Joint Training of Multi-Token Prediction in Reinforcement Learning via Optimal Coefficient Calibration

This paper aims to address the severe performance degradation that occurs when MTP and RL objectives are trained jointly during LLM post-training.

#Review #Multi-Token Prediction #Reinforcement Learning #Optimization #Optimal Coefficient Calibration #Large Language Models #Mathematical Reasoning

May 27, 2026

[Paper Review] HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs

This paper aims to address a key challenge in using Hybrid-Reasoning LLMs efficiently: selecting the optimal reasoning mode for each situation. Because existing studies were each proposed with different models, datasets, and evaluation settings, it is difficult to compare the actual performance or efficiency of their strategies objectively.

#Review #Hybrid-Reasoning LLMs #Adaptive Thinking-Mode Switch #Efficiency-Effectiveness Trade-off #Prompt-Tuning #Routing #Speculative Execution #LLM Benchmarking

May 27, 2026

[Paper Review] Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders

This paper starts from the observation that, although data engineering is key to improving model performance in LLM post-training, existing approaches rely mainly on external feedback (human preferences, reward models, rollout results, and so on), which makes them costly and limits their efficiency.

#Review #Sparse Autoencoder #LLM Post-training #Reinforcement Learning #Data Engineering #Mechanistic Interpretability #Curriculum Learning #Data Selection

May 27, 2026

[Paper Review] GradSentry: Gradient Spectral Entropy for Backdoor Sample Filtering in Large Language Model Fine-Tuning

This paper proposes a new filtering technique for effectively detecting and removing backdoor attacks that occur during LLM fine-tuning.

#Review #LLM Fine-Tuning #Backdoor Defense #Gradient Spectral Entropy #Sample Filtering #SVD #Robustness

May 27, 2026

[Paper Review] Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

This paper addresses the problem that existing video world models focus on single-agent environments and therefore cannot efficiently simulate complex shared environments in which multiple agents interact.

#Review #Generative World Model #Multi-Agent Interaction #Diffusion Transformer #Permutation Symmetry #Rotary Positional Embedding #Sparse Hub Attention

May 27, 2026

[Paper Review] GEM: Generative Supervision Helps Embodied Intelligence

This paper aims to address a limitation of current Embodied VLMs: they are proficient at high-level linguistic reasoning but fail to integrate the fine-grained spatial structure and physical awareness needed to control robots in real physical environments.

#Review #Embodied Intelligence #Vision-Language Models #Generative Supervision #Depth Map Prediction #Diffusion Transformer #Robot Manipulation #Spatiotemporal Planning

May 27, 2026

[Paper Review] GE-Sim 2.0: A Roadmap Towards Comprehensive Closed-loop Video World Simulators for Robotic Manipulation

This paper aims to address the problem that, in modern robot learning, policy models keep growing in complexity while simulation environments that can evaluate them reliably have become a bottleneck.

#Review #Robotic Manipulation #Video World Simulator #Action-Conditioned Generation #Closed-loop Evaluation #Proprioceptive State Expert #World Judge

May 27, 2026

[Paper Review] From Pixels to Words -- Towards Native One-Vision Models at Scale

This paper proposes a unified Native one-vision architecture to resolve the complex pipelines and fragmented visual-language information of existing modular VLMs.

#Review #Native Vision-Language Models #Monolithic Backbone #Spatiotemporal Attention #One-Vision Foundation Model #End-to-End Learning #Spatial Intelligence

May 27, 2026

[Paper Review] Fast-dDrive: Efficient Block-Diffusion VLM for Autonomous Driving

This paper aims to address the trade-off between High-Fidelity Trajectory Planning and Efficient Inference faced by Vision-Language-Action (VLA) models for End-to-End Autonomous Driving.

#Review #Autonomous Driving #VLM #Block-Diffusion #Inference Efficiency #Trajectory Planning #Scaffold Speculative Decoding #Latency #Throughput

May 27, 2026

[Paper Review] Everything at Every Scale: Scale-Invariant Diffusion with Continuous Super-Resolution

This paper points out that image generation and super-resolution are essentially the same process of reversing information loss across scales, and presents a new approach that can unify them.

#Review #Diffusion Models #Scale Invariance #Super-Resolution #Frequency Space #Renormalization Group #Unconditional Generation

May 27, 2026

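[Paper Review] Efficient and Scalable Provenance Tracking for LLM-Generated Code Snippets

This paper starts from the urgent need to trace the provenance of LLM-generated code transparently and to verify copyright compliance. Existing Winnowing-based plagiarism detection tools are highly accurate, but their linear time complexity, which comes from scanning the entire dataset, limits their applicability to the large-scale datasets on which modern LLMs are trained.

#Review #Provenance Tracking #Code Similarity #LLM #Vector Search #Winnowing #SourceTracker #HybridSourceTracker

May 27, 2026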

[Paper Review] ESC-Skills: Discovering and Self-Evolving Skills for Emotional Support Conversations

This paper aims to address the problem that existing ESC systems rely mainly on end-to-end approaches, which makes them hard to interpret and makes systematic skill improvement difficult.

#Review #Emotional Support Conversations #Skill-centric Framework #Intervention Units #Self-Evolutionary #Large Language Models #Simulation-based Verification

May 27, 2026

[Paper Review] DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes

This paper aims to address a limitation of existing RL paradigms, which must rely on strong external teacher models or elaborately curated training data to improve LLM reasoning performance. Existing approaches have a structural limitation in that their performance is constrained by the quality of the training data or by the teacher's level of knowledge.

#Review #Reinforcement Learning #Reasoning Models #Denoising Reasoning #Weak-to-Strong Generalization #Self-correction #Large Language Models

May 27, 2026

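[Paper Review] CubePart: An Open-Vocabulary Part-Controllable 3D Generator

Existing 3D generative models either produce a monolithic mesh or decompose objects only into arbitrary parts that the user cannot control, which makes it difficult to align the output with the specific structures required by game engines or physics simulation environments.

#Review #3D Generation #Part-Controllable #Open-Vocabulary #Diffusion Transformer #Schema-driven #Game Asset

May 27, 2026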

[Paper Review] Clark Hash: Stateless Sparse Johnson-Lindenstrauss Quantization for Neural Embeddings

This paper proposes Clark Hash to address the excessive memory and storage costs incurred when storing large-scale neural embeddings.

#Review #Neural Embeddings #Johnson-Lindenstrauss #Quantization #Sparse Projection #Stateless Codec #Dimensionality Reduction

May 27, 2026

[Paper Review] Chartographer: Counterfactual Chart Generation for Evaluating Vision-Language Models

This paper argues that existing Chart QA benchmarks do not accurately measure the true visual reasoning ability of VLMs and instead allow "shortcuts" based on simple visual pattern matching or pretrained parametric knowledge.

#Review #Vision-Language Models #Chart QA #Counterfactual Generation #Visual Reasoning #Shortcut Learning #Generalization

May 27, 2026

[Paper Review] AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation

This paper proposes AutoScientists to address the inefficient repetition of experiments and the isolated exploration that arise in long-running scientific research.

#Review #Multi-agent Systems #Scientific Experimentation #Self-Organization #Autonomous Discovery #LLM Agents #BioML-Bench

May 27, 2026

[Paper Review] AgentFugue: Agent Scaling for Long-Horizon Tasks through Collective Reasoning

This paper investigates whether a Scaling Out strategy can improve the ability of large language model (LLM) based agents to perform Long-Horizon Tasks.

#Review #Agent Scaling #Collective Reasoning #Long-Horizon Tasks #Shared Reasoning Hub #Multi-Agent Systems #Homogeneous Teams #Heterogeneous Teams #Reinforcement Learning

May 27, 2026