Latest Posts

[PyTorch] Add low-precision K/V input support to FlexAttention

FlexAttention's compiled mode now accepts low-precision K/V inputs such as FP8, enabling quantized inference.

#PyTorch #FlexAttention #FP8 #Quantization

January 5, 2026

[Paper Review] Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

This paper aims to address the high configuration cost and static capabilities of existing LLM agent frameworks.

#Review #LLM Agents #Automated Agent Generation #Reinforcement Learning #Hybrid Policy Optimization #Tool Synthesis #In-context Learning #Agent Framework #Scalability

January 4, 2026

[Paper Review] Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

This paper aims to address visually ungrounded hallucinations, which arise when multimodal large language models (MLLMs) rely too heavily on language priors rather than on visual content.

#Review #MLLMs #Video Understanding #Hallucinations #Counterfactual Generation #Diffusion Models #Reinforcement Learning #QA Dataset #DNA-Train

January 4, 2026

[Paper Review] SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning

This paper seeks to overcome the text-centric reasoning and isolated tool invocation that limit existing VLM-based agents.

#Review #Multimodal Agents #Reinforcement Learning #Vision-Language Models #Tool Use #Agentic Reasoning #Image Search #HR-MMSearch #BN-GSPO

January 4, 2026

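[Paper Review] Nested Learning: The Illusion of Deep Learning Architectures

This paper seeks to overcome the limits that existing deep learning models, particularly large language models (LLMs), face in continual learning, self-improvement, and effective problem solving. To that end, it proposes Nested Learning (NL), a new learning paradigm that interprets a machine learning model as a set of nested, multi-level optimization problems.

#Review #Nested Learning #Continual Learning #In-context Learning #Associative Memory #Multi-Timescale Memory #Self-Modifying Models #Optimizers

January 4, 2026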

[Paper Review] NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

This work seeks to overcome the scalability limits of existing 4D world modeling methods, namely their reliance on costly, specialized multi-view data and cumbersome offline preprocessing. To that end, it proposes NeoVerse, a versatile and scalable 4D world model capable of 4D reconstruction and novel-trajectory video generation from diverse in-the-wild monocular videos.

#Review #4D World Model #Gaussian Splatting #Monocular Video #Novel View Synthesis #Video Generation #Feed-Forward Reconstruction #Degradation Simulation

January 4, 2026

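[Paper Review] MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing

This paper tackles the challenges of 3D morphing. In particular, it aims to generate semantically consistent and temporally smooth deformation sequences between objects of different categories, without any training. It seeks to overcome the limitations of existing 3D morphing methods: structurally implausible results caused by inaccurate correspondence estimation, and poor generalization.

#Review #3D Morphing #Structured Latent (SLAT) #Generative Models #Attention Mechanisms #Training-Free Framework #Cross-Category Transitions #Temporal Coherence

January 4, 2026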

[Paper Review] InfoSynth: Information-Guided Benchmark Synthesis for LLMs

The core goal of this paper is to efficiently generate novel and diverse benchmarks for evaluating the reasoning and code generation abilities of large language models (LLMs).

#Review #Benchmark Synthesis #LLM Evaluation #Code Generation #Information Theory #Genetic Algorithms #Novelty Metrics #Diversity Metrics

January 4, 2026

[Paper Review] Fast-weight Product Key Memory

This paper aims to resolve the fundamental trade-off between storage capacity and computational efficiency in the sequence modeling layers of modern language models.

#Review #Fast-weight Memory #Product Key Memory #Episodic Memory #Language Models #Long-Context Modeling #Memory Augmented Networks #Continual Learning

January 4, 2026

[Paper Review] Diversity or Precision? A Deep Dive into Next Token Prediction

This work aims to systematically investigate how an LLM's pre-trained token output distribution affects the exploration space for subsequent reinforcement learning (RL). In particular, it reinterprets next-token prediction as a stochastic decision process to show how the balance between diversity and precision affects overall reasoning performance.

#Review #Next Token Prediction #Reinforcement Learning #Large Language Models #Reward Shaping #Pre-training Objective #Policy Gradient #Exploration-Exploitation

January 4, 2026

[Paper Review] Deep Delta Learning

This paper seeks to address the way the strictly additive inductive bias of deep residual networks limits their ability to model complex state transitions.

#Review #Deep Residual Networks #Delta Operator #Geometric Transformation #Spectral Analysis #Gated Networks #Householder Reflection #Dynamical Systems #Identity Shortcut

January 4, 2026

[Paper Review] Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

This paper aims to develop an interactive head avatar generation system that enables the real-time bidirectional interaction and emotional engagement that existing one-way avatar generation models lack.

#Review #Avatar Generation #Real-Time Interaction #Diffusion Models #Preference Optimization #Causal Inference #Multimodal Input #Head Avatar

January 4, 2026

[Paper Review] AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction

This paper aims to capture high-frequency appearance detail and temporal continuity at the same time, a key difficulty when reconstructing dynamic 3D scenes from monocular video.

#Review #Dynamic Scene Reconstruction #Gabor Representation #Gaussian Splatting #Temporal Continuity #Cubic Hermite Splines #Frequency Adaptivity #Monocular Video

January 4, 2026

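[triton] Reducing overhead by simplifying the correlation between Proton's Runtime and Metric

An analysis of how redesigning the Proton profiler's Data/Metric interface removed double locking and unnecessary lookups, reducing profiling overhead.

#Triton #Proton #Profiling #Performance #Refactoring

January 4, 2026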

[cpython] gh-124951: 2x to 3x faster base64 encoding/decoding via CPU pipelining optimization

Aligning the lookup table and removing a loop-carried dependency makes base64 processing 2x to 3x faster.

#Python #CPython #Performance #base64 #C

January 2, 2026

[Paper Review] On the Role of Discreteness in Diffusion LLMs

This paper aims to analyze the fundamental problems that arise when diffusion models are applied to language modeling, and to clarify how the discrete, structured nature of text is mismatched with the diffusion mechanism.

#Review #Diffusion Models #Language Models #Discrete Text #Continuous Diffusion #Text Generation #Data Augmentation #Parallel Decoding #Structural Dependency

January 1, 2026

[Paper Review] Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

This paper seeks to address the inefficiency that arises because existing large language models (LLMs) apply uniform computation to every token despite the non-uniform information density of language.

#Review #Hierarchical Language Model #Concept-Level Reasoning #Dynamic Segmentation #Adaptive Computation #Scaling Laws #Maximal Update Parametrization #Next-Token Prediction #Flash Attention

January 1, 2026

[Paper Review] DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

This paper aims to address the limits of text-centric reasoning in current Multimodal Large Language Models (MLLMs) and their inefficiency on complex, long-horizon vision-centric tasks, and to establish a new "generative multimodal reasoning" paradigm built on diffusion models.

#Review #Multimodal Reasoning #Diffusion Models #Image-to-Image Generation #Vision-centric AI #Generative AI #Spatial Planning #Constraint Satisfaction

January 1, 2026

[Paper Review] mHC: Manifold-Constrained Hyper-Connections

This paper seeks to address a problem with Hyper-Connections (HC): although they improve performance by widening the residual stream and diversifying connectivity, they compromise the identity mapping property, causing severe training instability, limited scalability, and substantial memory access overhead.

#Review #Hyper-Connections #Residual Connections #Manifold Learning #Doubly Stochastic Matrices #Training Stability #Large Language Models #Infrastructure Optimization #Deep Learning Architecture

December 31, 2025

[Paper Review] Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

This paper aims to give lightweight LLMs native agentic intelligence while maintaining high computational efficiency. In particular, rather than relying on conventional distillation, it focuses on having a sub-2B model systematically learn reasoning and planning abilities from scratch.

#Review #Lightweight LLM #Agentic AI #Pre-training #Multi-Latent Attention #Long-Context #Curriculum Learning #Agentic Mid-training #Instruction Tuning

December 31, 2025