Review

[Paper Review] Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

This paper aims to address visually ungrounded hallucinations that arise when multimodal large language models (MLLMs) rely too heavily on language priors rather than on visual content.

#Review #MLLMs #Video Understanding #Hallucinations #Counterfactual Generation #Diffusion Models #Reinforcement Learning #QA Dataset #DNA-Train

January 4, 2026

[Paper Review] SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning

This paper seeks to overcome two limitations of existing VLM-based agents: text-centric reasoning and isolated tool calls.

#Review #Multimodal Agents #Reinforcement Learning #Vision-Language Models #Tool Use #Agentic Reasoning #Image Search #HR-MMSearch #BN-GSPO

January 4, 2026

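[Paper Review] Nested Learning: The Illusion of Deep Learning Architectures

This paper seeks to overcome the limitations that existing deep learning models, especially large language models (LLMs), face in continual learning, self-improvement, and effective problem solving. To this end, it proposes Nested Learning (NL), a new learning paradigm that interprets a machine learning model as a set of nested, multi-level optimization problems.

#Review #Nested Learning #Continual Learning #In-context Learning #Associative Memory #Multi-Timescale Memory #Self-Modifying Models #Optimizers

January 4, 2026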

[Paper Review] NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

This work seeks to overcome the scalability limitations of existing 4D world modeling methods (costly specialized multi-view data and cumbersome offline preprocessing). To this end, it proposes NeoVerse, a versatile and scalable 4D world model capable of 4D reconstruction and novel-trajectory video generation from diverse in-the-wild monocular videos.

#Review #4D World Model #Gaussian Splatting #Monocular Video #Novel View Synthesis #Video Generation #Feed-Forward Reconstruction #Degradation Simulation

January 4, 2026

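[Paper Review] MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing

This paper tackles the challenge of 3D morphing. Specifically, it aims to generate, without training, semantically consistent and temporally smooth deformation sequences between objects from different categories. It seeks to overcome the limitations of existing 3D morphing methods, namely structurally implausible results caused by inaccurate correspondence estimation and poor generalization.

#Review #3D Morphing #Structured Latent (SLAT) #Generative Models #Attention Mechanisms #Training-Free Framework #Cross-Category Transitions #Temporal Coherence

January 4, 2026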

[Paper Review] InfoSynth: Information-Guided Benchmark Synthesis for LLMs

The core goal of this paper is to efficiently generate novel and diverse benchmarks for evaluating the reasoning and code generation capabilities of large language models (LLMs).

#Review #Benchmark Synthesis #LLM Evaluation #Code Generation #Information Theory #Genetic Algorithms #Novelty Metrics #Diversity Metrics

January 4, 2026

[Paper Review] Fast-weight Product Key Memory

This paper aims to resolve the fundamental trade-off between storage capacity and computational efficiency in the sequence modeling layers of modern language models.

#Review #Fast-weight Memory #Product Key Memory #Episodic Memory #Language Models #Long-Context Modeling #Memory Augmented Networks #Continual Learning

January 4, 2026

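[Paper Review] Diversity or Precision? A Deep Dive into Next Token Prediction

This work aims to systematically investigate how an LLM's pre-trained token output distribution affects the exploration space available to subsequent reinforcement learning (RL). In particular, it reinterprets next-token prediction as a stochastic decision process to clarify how the balance between diversity and precision affects overall reasoning performance.

#Review #Next Token Prediction #Reinforcement Learning #Large Language Models #Reward Shaping #Pre-training Objective #Policy Gradient #Exploration-Exploitation

January 4, 2026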

[Paper Review] Deep Delta Learning

This paper addresses the problem that the strictly additive inductive bias of deep residual networks limits their ability to model complex state transitions.

#Review #Deep Residual Networks #Delta Operator #Geometric Transformation #Spectral Analysis #Gated Networks #Householder Reflection #Dynamical Systems #Identity Shortcut

January 4, 2026

[Paper Review] Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

This paper aims to develop an interactive head avatar generation system that enables the real-time two-way interaction and emotional engagement that existing one-way avatar generation models lack.

#Review #Avatar Generation #Real-Time Interaction #Diffusion Models #Preference Optimization #Causal Inference #Multimodal Input #Head Avatar

January 4, 2026

[Paper Review] AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction

This paper aims to achieve high-frequency appearance detail and temporal continuity at the same time, a key difficulty when reconstructing dynamic 3D scenes from monocular video.

#Review #Dynamic Scene Reconstruction #Gabor Representation #Gaussian Splatting #Temporal Continuity #Cubic Hermite Splines #Frequency Adaptivity #Monocular Video

January 4, 2026

[Paper Review] On the Role of Discreteness in Diffusion LLMs

This paper aims to analyze the fundamental problems that arise when diffusion models are applied to language modeling, and to clarify how the discrete, structured nature of text is mismatched with the diffusion mechanism.

#Review #Diffusion Models #Language Models #Discrete Text #Continuous Diffusion #Text Generation #Data Augmentation #Parallel Decoding #Structural Dependency

January 1, 2026

[Paper Review] Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

This paper addresses the inefficiency that arises because existing large language models (LLMs) apply uniform computation to every token even though the information density of language is non-uniform.

#Review #Hierarchical Language Model #Concept-Level Reasoning #Dynamic Segmentation #Adaptive Computation #Scaling Laws #Maximal Update Parametrization #Next-Token Prediction #Flash Attention

January 1, 2026

[Paper Review] DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

The goal is to address the limitations of text-centric reasoning in current Multimodal Large Language Models (MLLMs) and their inefficiency on complex, long-horizon, vision-centric tasks, and to establish a new "generative multimodal reasoning" paradigm built on diffusion models.

#Review #Multimodal Reasoning #Diffusion Models #Image-to-Image Generation #Vision-centric AI #Generative AI #Spatial Planning #Constraint Satisfaction

January 1, 2026

[Paper Review] mHC: Manifold-Constrained Hyper-Connections

This paper addresses the problem that Hyper-Connections (HC), while improving performance by widening the residual stream and diversifying connectivity, compromise the identity mapping property and thereby cause severe training instability, limited scalability, and substantial memory access overhead.

#Review #Hyper-Connections #Residual Connections #Manifold Learning #Doubly Stochastic Matrices #Training Stability #Large Language Models #Infrastructure Optimization #Deep Learning Architecture

December 31, 2025

[Paper Review] Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

This paper aims to give lightweight LLMs intrinsic agentic intelligence while maintaining high computational efficiency. In particular, rather than relying on conventional distillation, it focuses on having a sub-2B model systematically learn reasoning and planning capabilities from scratch.

#Review #Lightweight LLM #Agentic AI #Pre-training #Multi-Latent Attention #Long-Context #Curriculum Learning #Agentic Mid-training #Instruction Tuning

December 31, 2025

[Paper Review] Valori: A Deterministic Memory Substrate for AI Systems

The goal is to resolve the non-deterministic memory states caused by floating-point arithmetic in modern AI systems, particularly RAG (Retrieval Augmented Generation) and agentic workflows.

#Review #Deterministic AI #Reproducible Computation #Fixed-Point Arithmetic #Vector Databases #AI Memory #State Machine #Auditability

December 31, 2025

[Paper Review] SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

This work aims to generatively render dynamic scenes from a single monocular video while independently controlling space (camera viewpoint) and time (motion sequence).

#Review #Video Diffusion Model #Generative Rendering #Novel View Synthesis #Space-Time Disentanglement #Temporal Control #Camera Control #Dynamic Scenes #Temporal Warping

December 31, 2025

[Paper Review] Scaling Open-Ended Reasoning to Predict the Future

This work aims to train language models (LLMs) to make accurate and reliable predictions on open-ended forecasting questions about an uncertain future.

#Review #Language Models #Forecasting #Open-Ended Reasoning #Reinforcement Learning (RL) #Data Generation #Calibration #Retrieval-Augmented Generation (RAG) #Future Prediction

December 31, 2025

[Paper Review] Pretraining Frame Preservation in Autoregressive Video Memory Compression

This paper addresses two problems in autoregressive video generation models: the limits of handling long video contexts, and the trade-off between context quality and context length.

#Review #Video Compression #Autoregressive Models #Memory Compression #Frame Preservation #Pretraining #Video Generation #Diffusion Models #Long-Range Consistency

December 31, 2025