#Autoregressive

13 posts

[Paper Review] Nemotron-Labs-Diffusion: A Tri-Mode Language Model Unifying Autoregressive, Diffusion, and Self-Speculation Decoding

This paper addresses the low inference parallelism and poor resource utilization of conventional, strictly sequential autoregressive (AR) decoding.

#Review #Language Model #Autoregressive #Diffusion #Self-Speculation #Parallel Decoding #Inference Efficiency #Tri-Mode Decoding

July 7, 2026

[Paper Review] GEAR: Guided End-to-End AutoRegression for Image Synthesis

This paper aims to resolve the inefficiency that arises because modern visual generative models train the tokenizer and the generator separately in two stages.

#Review #GEAR #Autoregressive #Tokenizer #End-to-End #Representation Alignment #Vector Quantization #Image Synthesis

June 30, 2026

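[Paper Review] BrainJanus: A Unified Model for Understanding and Generation across Brain, Vision, and Language

This paper aims to address the limitations of existing brain-computer interface (BCI) research, which treats brain encoding and decoding as independent tasks and takes a fragmented approach that lacks integration across modalities.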

#Review #BrainJanus #Unified Model #Brain Encoding #Brain Decoding #Autoregressive #Omni Space #Tokenization

June 30, 2026

[Paper Review] DreamForge-World 0.1 Preview: A Low-Compute Real-Time Controllable World Model

This paper proposes DreamForge-World 0.1 Preview, which enables real-time interactive simulation in compute-constrained environments.

#Review #World Model #Interactive Generation #Real-time #Consumer GPU #Autoregressive #Multimodal #LoRA

June 29, 2026

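[Paper Review] Lip Forcing: Few-Step Autoregressive Diffusion for Real-time Lip Synchronization

This paper aims to address the high latency and computational complexity of existing diffusion-based audio-video generation models. Existing methods need dozens of sampling steps to produce high-quality output, which makes them hard to deploy in real-time services.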

#Review #Lip Synchronization #Diffusion Models #Autoregressive #Real-time #Audio-Driven Talking Face

June 9, 2026

[Paper Review] dots.tts Technical Report

This paper aims to overcome the limited expressiveness of existing discrete-token TTS models and to achieve stable AR speech generation in a continuous latent space.

#Review #Text-to-Speech #Continuous Latent #Flow-Matching #Autoregressive #AudioVAE #Self-Correction #MeanFlow Distillation

June 7, 2026

[Paper Review] LongLive-RAG: A General Retrieval-Augmented Framework for Long Video Generation

This paper aims to address the error accumulation and identity drift that arise during long-horizon generation in autoregressive (AR) video generation models. For efficiency, existing methods rely solely on sliding-window attention and either discard the early generated latents or use only fixed anchors.

#Review #Long Video Generation #Autoregressive #Retrieval-Augmented Generation #Video Diffusion #Temporal Consistency #Attention

June 1, 2026

[Paper Review] minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

This paper addresses the lack of a pipeline for converting existing high-quality video foundation models into real-time interactive world models.

#Review #Video World Models #Diffusion Models #Autoregressive #Distillation #Real-time Inference #Camera Control

May 28, 2026

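[Paper Review] From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

This paper proposes the Skill Consumption framework to address the inefficiency of agents failing to acquire skills effectively from large volumes of raw experience data. Existing methods suffer from low skill-extraction precision because the data is noisy and poorly structured.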

#Review #Agent Skills #Skill Consumption #Model-Generated Skills #Autoregressive #Skill Acquisition

May 24, 2026

[Paper Review] Echo-Forcing: A Scene Memory Framework for Interactive Long Video Generation

This paper aims to resolve the functional entanglement in memory management (KV cache management) that autoregressive video diffusion models face in long video generation and interactive scenarios.

#Review #Video Generation #Autoregressive #KV Cache #Scene Memory #Long-form Video #Interactive Generation

May 19, 2026

[Paper Review] SNLP: Layer-Parallel Inference via Structured Newton Corrections

This paper aims to address the inference latency caused by layer-wise dependency, a long-standing problem of Transformer models.

#Review #Layer-Parallel Inference #Structured Newton Corrections #Transformer #Autoregressive #Solver-induced Inference Bias #Identity Newton #HC Newton

May 18, 2026

[Paper Review] ERNIE 5.0 Technical Report

ERNIE 5.0 aims to develop a natively autoregressive foundation model for unified multimodal understanding and generation across text, image, video, and audio.

#Review #Multimodal Foundation Model #Autoregressive #Mixture-of-Experts #Elastic Training #Reinforcement Learning #Unified Architecture #Sparse MoE #Efficient Deployment

February 4, 2026

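[Paper Review] HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming

This paper aims to address the problem that high-resolution video generation is computationally bottlenecked by the quadratic complexity of diffusion models, which makes practical inference infeasible.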

#Review #High-Resolution Video Generation #Diffusion Models #Autoregressive #Efficiency #Caching #Attention Mechanisms #Video Streaming #Temporal Consistency

December 24, 2025