Review

[Paper Review] YoCausal: How Far is Video Generation from World Model? A Causality Perspective

This paper examines whether state-of-the-art Video Diffusion Models (VDMs) are developing into genuine world models or merely overfitting to statistical temporal patterns.

#Review #Video Generation #World Models #Causality #Violation of Expectation #Reverse Surprise Index #Causality Cognition Index #Diffusion Models

May 28, 2026

[Paper Review] WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction

This paper proposes WorldMemArena to address two problems with existing memory benchmarks: they are biased toward static conversational data, and they evaluate memory with a single success metric, which makes the causes of failure hard to diagnose.

#Review #Multimodal Agent #Memory Benchmark #Action-World Interaction #Lifecycle Evaluation #Long-horizon #Lifelong Evolution #Agentic Execution

May 28, 2026

[Paper Review] Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention

This paper seeks to identify the fundamental mechanism by which larger models learn tasks that smaller models fail to learn.

#Review #Scaling Laws #Rare-Task Retention #Gradient Interference #Neural Network Scaling #Multi-Task Learning #Feature Learning

May 28, 2026

[Paper Review] When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

This paper addresses Contextual Belief Management (CBM): deciding which of the information accumulated over a long interaction an LLM should believe, which it should revise, and which it should ignore. Existing LLMs tend to rely excessively on pretrained parametric knowledge or on contextual noise rather than following the formal evidence provided in context.

#Review #Contextual Belief Management #Large Language Models #BeliefTrack #Reinforcement Learning #Contextual Interference #Symbolic Verification

May 28, 2026

[Paper Review] When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems

This study systematically explores the design space of hybrid Multi-Agent Systems (MAS) that combine high-performance, cloud-based Frontier models with high-efficiency Small Language Models (SLMs) running on edge devices.

#Review #Multi-Agent Systems #Hybrid AI #Edge Inference #Cloud Agents #Agentic Workflow #KV-cache #Model Routing

May 28, 2026

[Paper Review] Verifiable Rewards Beyond Math and Code: Lightweight Corpus-Grounded Process Supervision for Factual Question Answering

This paper identifies as its core problem the lack of efficient reward signals for improving the factual accuracy of LLMs on knowledge-intensive QA tasks.

#Review #Reinforcement Learning #Factuality #Process Supervision #Wikipedia #Co-occurrence #Large Language Models #GRPO

May 28, 2026

[Paper Review] Uniform Diffusion Models Revisited: Leave-One-Out Denoiser and Absorbing State Reformulation

This paper resolves a structural mismatch in Uniform Diffusion Models (UDMs): the Bridge Plug-in parameterization they use fails to optimize the standard denoising objective (the denoising posterior).

#Review #Uniform Diffusion Models #Leave-one-out #Denoiser #Absorbing State Reformulation #Discrete Diffusion #Bridge Plug-in

May 28, 2026

[Paper Review] UniSteer: Text-Guided Flow Matching in Activation Space for Versatile LLM Steering

This paper proposes UniSteer to address the scalability and compositional limitations of existing Activation Steering methods for controlling LLM behavior.

#Review #LLM Steering #Activation Space #Flow Matching #Text-Guided Control #Activation Inversion #Multi-Constraint #Zero-shot Classification

May 28, 2026

[Paper Review] UI-KOBE: Knowledge-Oriented Behavior Exploration for Lightweight Graph-Guided GUI Agents

This paper aims to overcome the limits of End-to-End planning that lightweight models face in mobile GUI automation. Most current GUI agents depend on large VLMs, which leads to high inference cost and poor reliability in compute-constrained on-device environments.

#Review #GUI Agent #Knowledge Graph #Autonomous Exploration #On-device AI #Lightweight Model #Mobile Automation

May 28, 2026

[Paper Review] Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

This study addresses the opacity and the limited use of visual material that arise when large language models (LLMs) write long, fact-based reports in Deep Research settings.

#Review #Multi-Agent System #Multimodal Deep Research #Verifiable Generation #Test-Time Scaling #Visual Working Memory #Report Generation

May 28, 2026

[Paper Review] Towards Consistent Video Geometry Estimation

This paper addresses the problem that existing video geometry estimation models are confined to either the offline (full-sequence) or the online (streaming) setting, depending on their architecture or training protocol.

#Review #Foundation Model #Video Geometry Estimation #Dynamic Chunking Attention #Depth Estimation #Surface Normal Estimation #Point Map Estimation

May 28, 2026

[Paper Review] Token-Level Generalization in LoRA Adapter Backdoors: Attack Characterization and Behavioral Detection

This paper points out that LoRA adapters distributed through public model hubs such as HuggingFace can be vulnerable to severe backdoors introduced through data poisoning.

#Review #LoRA Adapter #Backdoor Attack #Data Poisoning #Behavioral Detection #Weight-Level Detection #LLM Security

May 28, 2026

[Paper Review] Thinking Before Constraining: A Unified Decoding Framework for Large Language Models

This paper aims to resolve the trade-off between an LLM's rich reasoning ability and strict guarantees on output format. Existing Constrained Decoding methods enforce the grammar from the very start of generation, which limits the model's reasoning flexibility and degrades performance.

#Review #Large Language Models #Constrained Decoding #Structured Generation #Chain-of-Thought #Parser

May 28, 2026

[Paper Review] SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

This paper proposes SmartDirector to overcome the limited control over narrative structure and temporal pacing that results when video generation models rely only on sparse conditions (text, start/end frames).

#Review #Video Generation #Keyframe-Conditioned #Narrative Pacing #Flow Matching #Multi-Chunk VAE #Director-Gen #Director-SR

May 28, 2026

[Paper Review] Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning

This paper argues that differentiated treatment according to skill type is needed for agents to acquire skills efficiently and to generalize in OOD environments.

#Review #Agentic Reinforcement Learning #Skill Internalization #Out-of-Distribution Generalization #Difficulty-Aware Routing #Privileged Distillation #Shortcut Learning

May 28, 2026

[Paper Review] RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains

This study addresses the subjectivity of LLM evaluation in non-verifiable domains and the model dependence of existing rubric-based evaluation.

#Review #Reward Modeling #Rubric-based Evaluation #Reinforcement Learning #Pointwise Reward #LLM Alignment #Preference Optimization

May 28, 2026

[Paper Review] Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

This paper proposes a unified model to address the fragmentation that arises because existing embodied AI models are specialized for particular tasks or robot platforms. Current approaches suffer from low data utilization and limited generalization.

#Review #Embodied Intelligence #Vision-Language-Action Models #Flow-matching #Multi-task Learning #Cross-embodiment #Reinforcement Learning

May 28, 2026

[Paper Review] PhyGenHOI: Physically-Aware 4D Generation of Dynamic Human-Object Interactions

This paper aims to resolve the physical inconsistencies and visual unnaturalness that arise in text-driven 4D Human-Object Interaction (HOI) generation.

#Review #4D Generation #Human-Object Interaction #Gaussian Splatting #Material Point Method #Diffusion Models

May 28, 2026

[Paper Review] PhoneWorld: Scaling Phone-Use Agent Environments

This paper addresses a bottleneck in mobile agent research: the lack of reproducible and controllable environments. Existing benchmarks focus only on evaluation in already-built environments and offer no scalable way to construct new ones.

#Review #Phone-Use Agent #Environment Synthesis #GUI Trajectories #Autonomous App Construction #Scaling #Multimodal Agent

May 28, 2026

[Paper Review] Parallax: Parameterized Local Linear Attention for Language Modeling

This paper aims to overcome the structural limitations of Softmax Attention in large language model (LLM) training and to improve efficiency.

#Review #Local Linear Attention #Language Modeling #Muon Optimizer #Parameterized Attention #Arithmetic Intensity

May 28, 2026