Review

[Paper Review] SceneCode: Executable World Programs for Editable Indoor Scenes with Articulated Objects

This work starts from the observation that existing static 3D scene representations make it hard to edit the dynamic properties and articulated parts of indoor spaces.

#Review #3D Scene Understanding #Executable World Programs #Articulated Objects #Scene Editing #Inverse Graphics #Program Synthesis

May 19, 2026

[Paper Review] SAGA: A Sequence-Adaptive Generative Architecture for Multi-Horizon Probabilistic Forecasting with Adaptive Temporal Conformal Prediction

This paper aims to address the structural limitations of the parametric income-forecasting processes used in existing microsimulation models.

#Review #Deep Sequence Models #Probabilistic Forecasting #Conformal Prediction #Microsimulation #Transformer #Labor Economics

May 19, 2026

[Paper Review] Process Rewards with Learned Reliability

This paper addresses a limitation of existing PRMs: they provide only a single scalar reward for each intermediate step, so the reliability of that score cannot be assessed.

#Review #Process Reward Model #Beta-Binomial #Adaptive Computation Allocation #Test-Time Scaling #Uncertainty Estimation

May 19, 2026

[Paper Review] PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset

This paper addresses the fact that existing T2I models are largely stuck at 1K to 2K resolution and lack the 100MP-level Ultra-High-Resolution (UHR) generation capability demanded by fields such as digital film production and commercial design.

#Review #Ultra-High-Resolution #Text-to-Image #100MP #PixVerve-95K #PixVerve-Bench #Diffusion Models

May 19, 2026

[Paper Review] PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents

This work addresses the inefficiency of the orientation work that LLM agents must redo each time they query a large external context.

#Review #Long-Context LLM Agents #Context Map #Orientation Cache #Prompt Engineering #LLM Inference

May 19, 2026

[Paper Review] Overcoming Catastrophic Forgetting in Visual Continual Learning with Reinforcement Fine-Tuning

No summary is available: the paper at https://arxiv.org/html/2605.09640 could not be retrieved, so the summary and figure information could not be extracted.

May 19, 2026

[Paper Review] OpenComputer: Verifiable Software Worlds for Computer-Use Agents

This paper proposes OpenComputer to address the difficulty of building environments and the lack of reliable evaluation, both of which hinder the training and evaluation of computer-use agents.

#Review #Computer-Use Agents #Verifiable Software Worlds #Verifier-Grounded #Benchmark Synthesis #Desktop Automation #Self-Evolving Verification

May 19, 2026

[Paper Review] OmniGUI: Benchmarking GUI Agents in Omni-Modal Smartphone Environments

This paper addresses a limitation of existing GUI agent benchmarks: because they are built mainly from static screenshots, they cannot evaluate the dynamic audio and video processing required in real-time environments.

#Review #GUI Agents #Multimodal Benchmark #Smartphone Environments #Temporal Reasoning #Auditory Processing #Action Grounding

May 19, 2026

[Paper Review] Omni-DuplexEval: Evaluating Real-time Duplex Omni-modal Interaction

This paper points out the lack of standardized benchmarks and evaluation methodologies for assessing how well modern MLLMs interact in real-time environments.

#Review #Multimodal Large Language Models #Real-time Duplex Interaction #Streaming Video Understanding #Benchmark #Proactive Interaction

May 19, 2026

[Paper Review] MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation

This paper aims to overcome the limitations in model evaluation that arise as video generation evolves from single-shot clips to multi-shot narrative structures.

#Review #Multi-Shot Audio-Video Generation #Benchmark #Evaluation Framework #Adaptive Hybrid Evaluation #Cinematic Language

May 19, 2026

[Paper Review] Language-Switching Triggers Take a Latent Detour Through Language Models

This work aims to identify the internal mechanisms by which a backdoor implanted in a large language model (LLM) processes its trigger and hijacks the model's output. Prior work treated the trigger as an opaque black box, which limited detection and defense.

#Review #Backdoor Attack #Circuit Interpretability #Activation Patching #Language-Switching #Orthogonal Latent Encoding #Residual Stream #Transformer

May 19, 2026

[Paper Review] GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

This paper identifies as its core problem that current RL training for long-context understanding is inefficient because of biased data composition and non-uniform reward signals.

#Review #Long-Context RL #Capability-Oriented Data #Reinforcement Learning #Multitask Alignment #Advantage Estimation #TMN-Reweight

May 19, 2026

[Paper Review] EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

Agentic Reinforcement Learning (Agentic RL), which equips Large Language Models (LLMs) with tool-use capabilities, faces two main bottlenecks, namely the lack of scalable and robust executable environments and the scarcity of realistic training data that captures implicit human reasoning, which this paper…

#Review #Agentic Reinforcement Learning #Tool-Use Agents #Environment Synthesis #Trajectory Generation #Dependency Graph #LLM Post-training

May 19, 2026

[Paper Review] Echo-Forcing: A Scene Memory Framework for Interactive Long Video Generation

This paper addresses the functional entanglement problem in memory management (KV cache management) that autoregressive video diffusion models face in long video generation and interactive scenarios.

#Review #Video Generation #Autoregressive #KV Cache #Scene Memory #Long-form Video #Interactive Generation

May 19, 2026

[Paper Review] ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop

No summary is available: the paper at https://arxiv.org/html/2605.18746 could not be accessed because of a real-time access error, so its content could not be extracted.

May 19, 2026

[Paper Review] Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding

This paper addresses the Pareto tradeoff between speed and accuracy (MAT) that existing tree-based speculative decoding faces.

#Review #Speculative Decoding #Tree Construction #Dynamic Pruning #Retrieval-based #GPU-resident #Budget Compensation #Long-context

May 19, 2026

[Paper Review] DocAtlas: Multilingual Document Understanding Across 80+ Languages

This paper proposes DocAtlas to overcome the limitations of existing Document Understanding models in handling multilingual data and parsing document structure. Most existing models are either biased toward particular language families or suffer from a generalization problem in which performance degrades on complex document layouts.

#Review #Document Understanding #Multilingual #Vision-Language Models #OCR #Multimodal Learning

May 19, 2026

[Paper Review] Delta Attention Residuals

This paper addresses the routing collapse problem that arises in existing Attention Residuals. Because each layer's output $h_i$ is the cumulative sum of the preceding layers, the redundancy between adjacent $h_i$ and $h_{i-1}$ becomes extremely high as the network gets deeper.

#Review #Attention Residuals #Delta Representation #Additive Routing #Transformer #Model Scaling #Fine-tuning

May 19, 2026

[Paper Review] CopT: Contrastive On-Policy Thinking with Continuous Spaces for General and Agentic Reasoning

This paper addresses the inefficient "think, then answer" ordering of the standard CoT paradigm, along with the Performative Reasoning problem, in which the model keeps reasoning unnecessarily even after it has already reached an answer.

#Review #Large Language Models #Chain-of-Thought #Continuous Embeddings #Contrastive Verification #On-Policy Thinking #Agentic Reasoning

May 19, 2026

[Paper Review] Context Memorization for Efficient Long Context Generation

This paper addresses the performance degradation and inference inefficiency faced by modern LLM applications that rely on long prefixes.

#Review #Attention-State Memory #Long Context Generation #In-Context Learning #Retrieval-Augmented Generation #Online-Softmax Identity #Prefix Caching #LLM Inference

May 19, 2026