Latest Posts

[Paper Review] UniT: Unified Geometry Learning with Group Autoregressive Transformer

This paper addresses the fragmentation of existing feed-forward geometric perception models. Current work focuses on separate paradigms, such as online streaming perception, offline multi-view reconstruction, metric-scale estimation, and long-sequence scalability, so a unified framework is lacking.

#Review #Geometry Perception #Group Autoregressive Transformer #Metric-scale Estimation #Long-horizon Scalability #Multi-modal Fusion #Feed-forward Model

May 20, 2026

[Paper Review] Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning

This study addresses the architectural conflict between understanding and generation tasks that arises when training UMMs (Unified Multimodal Models), and the performance trade-off it causes. Conventional multi-task learning requires complex pipelines and data-balancing techniques, and improving one task often degrades another.

#Review #Unified Multimodal Models #Intelligent Image Editing #Instruction Tuning #Data Synthesis #Multi-task Learning #Reasoning-intensive

May 20, 2026

[Paper Review] Toto 2.0: Time Series Forecasting Enters the Scaling Era

This paper addresses the scaling uncertainty of TSFMs (Time Series Foundation Models): unlike NLP or vision models, their forecasting performance can plateau or even degrade as model size grows.

#Review #Time Series Foundation Models #Scaling Laws #Contiguous Patch Masking #u-μP #Quantile Output Head #NorMuon #Observability Metrics

May 20, 2026

[Paper Review] The Unlearnability Phenomenon in RLVR for Language Models

This paper investigates the paradoxical phenomenon of unlearnability in LLM training: why certain problems persistently fail to be learned even though they receive correct-answer rewards.

#Review #Large Language Models #Reinforcement Learning #RLVR #Unlearnability #Gradient Outliers #Representation Learning

May 20, 2026

[Paper Review] Stitched Value Model for Diffusion Alignment

This paper addresses the problem of building an accurate and efficient value function in the noisy latent regime for effective alignment of diffusion models.

#Review #Diffusion Models #Alignment #Value Function #Model Stitching #Reward Modeling #Inference-time Steering #Reinforcement Learning

May 20, 2026

[Paper Review] SpecBench: Measuring Reward Hacking in Long-Horizon Coding Agents

The URL provided for this request (https://arxiv.org/html/2605.21384) and the related academic search results are currently inaccessible or invalid. The paper is likely either fictitious or not yet formally rendered in the arXiv system.

May 20, 2026

[Paper Review] Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection

This paper shows that the alignment tax incurred during LLM safety alignment is essentially a form of catastrophic forgetting caused by conflicting optimization objectives.

#Review #Safety Alignment #Alignment Tax #Continual Learning #Catastrophic Forgetting #Gradient Projection #Orthogonal Constraint

May 20, 2026

[Paper Review] Rethinking Visual Attribution for Chest X-ray Reasoning in Large Vision Language Models

This paper addresses a key limitation in verifying whether visual attribution methods for the predictions of LVLMs used in medicine accurately reflect the model's actual basis for its decisions.

#Review #Large Vision Language Models #Chest X-ray #Visual Attribution #Causal Framework #Concept-based Interpretability #Optimal Transport

May 20, 2026

[Paper Review] PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models

This paper is proposed to overcome the limitation that existing planning benchmarks rely on fixed sets of instances and therefore fail to adequately capture scenario diversity and structural complexity. Prior work measured difficulty only by surface metrics such as prompt length and lacked automated verification and scalable data generation.

#Review #Large Language Models #PlanningBench #Constraint-driven Synthesis #Reinforcement Learning #Verifiable Data #Taxonomy

May 20, 2026

[Paper Review] PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis

This study identifies a core challenge in synthesizing immersive multi-room indoor environments from sparse architectural inputs: maintaining photorealistic panoramas and cross-view spatial coherence at the same time.

#Review #Generative Spatial World Model #Whole-House Panorama Synthesis #3D Gaussian Splatting #Panoramic LRM #Room-aware Group Attention #Topology-aware Progressive Caching #Decoupled Guidance

May 20, 2026

[Paper Review] On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

This study aims to objectively evaluate the capabilities and reliability of AI reviewers, which have been introduced to address the scalability problem that the rapidly growing volume of scientific papers poses for the peer review system.

#Review #AI Reviewers #Peer Review #LLM Agents #Scientific Evaluation #Expert Annotation

May 20, 2026

[Paper Review] OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation

This study is designed to address complex occlusion between objects in layout-grounded image generation.

#Review #Layout-Grounded Image Generation #Occlusion Modeling #Z-Order #Transformer #Generative Models

May 20, 2026

[Paper Review] OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond

This paper addresses the problem that, with advances in long-context reasoning and multimodal intelligence, the KV cache has become the dominant memory bottleneck in inference. Existing per-channel quantization techniques are effective at handling channel-wise outliers in Key tensors, but their usefulness drops sharply at extreme compression ratios.

#Review #KV Cache Quantization #Token Norm Imbalance (TNI) #Omni-Scaled Canalized Rotation (OScaR) #Extreme Low-bit Quantization #Large Language Models (LLMs) #CUDA Kernel Optimization

May 20, 2026

[Paper Review] OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under optimal Squared error quantization

As long-context models scale, the memory footprint of the KV cache has become a central technical challenge in model serving.

May 20, 2026

[Paper Review] Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs

This paper addresses the input-heavy overhead that arises during inference with agentic LLMs and the performance degradation across computation stages. In agentic workflows, tool use and memory retrieval repeatedly lengthen the context, which makes the prefilling stage the main bottleneck of overall inference.

#Review #Agentic LLMs #Model Quantization #Prefilling #Decoding #NVFP4 #Efficiency

May 20, 2026

[Paper Review] Mem-π: Adaptive Memory through Learning When and What to Generate

This paper is proposed to overcome the limitations of the static memory-retrieval paradigm in existing LLM agents. Current memory-augmented agents rely mainly on retrieving past experiences from external storage, but the retrieved data may not fit the agent's current context or may be too specific to generalize.

#Review #Large Language Model Agents #Generative Memory #Reinforcement Learning #Adaptive Memory #Abstention Policy #Decoupled Policy Optimization

May 20, 2026

[Paper Review] Mega-ASR: Towards In-the-wild^2 Speech Recognition via Scaling up Real-world Acoustic Simulation

This paper aims to resolve the 'acoustic robustness bottleneck': existing ASR technology performs well in clean conditions, but under compound real-world acoustic distortions (noise, reverberation, far-field, obstruction, and so on) WER rises sharply and hallucinations or dropped sentences occur.

#Review #ASR-in-the-wild #Compound Acoustic Simulation #Acoustic-to-Semantic #Progressive Supervised Fine-Tuning #Policy Optimization #Robust Speech Recognition #Acoustic Robustness Bottleneck

May 20, 2026

[Paper Review] MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization

This paper addresses the inefficiency that arises because skill optimization for LLM agents is fundamentally a multi-objective problem, yet existing approaches collapse it into a single objective function.

#Review #Multi-Objective Optimization #LLM Agents #Skill Optimization #Chebyshev Scalarization #Hypervolume #Prompt Engineering #Constraint Satisfaction

May 20, 2026

[Paper Review] LongMINT: Evaluating Memory under Multi-Target Interference in Long-Horizon Agent Systems

This paper addresses the memory errors that current memory-augmented agents suffer in complex, evolving, real-world long-horizon environments.

#Review #Long-Horizon #Agent Systems #Memory Evaluation #Multi-Target Interference #Retrieval-Augmented Generation #Benchmarking

May 20, 2026

[Paper Review] Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency

This paper frames the instability of modern large language model training, and the compute waste it causes, as a system-level control problem.

#Review #Large Language Models #Training Control Governance #LBW-Guard #AdamW #Training Stability #Bounded Autonomous Control #Compute Efficiency

May 20, 2026