#Training-free

22 posts

[Paper Review] Motion4Motion: Motion Transfer Across Subjects at Inference

This paper addresses the limited generality of existing motion transfer methods, which rely too heavily on skeleton structure. Most prior work imposes human-centric skeleton priors, which makes it hard to apply to motion transfer between characters of diverse morphologies, such as animals.

#Review #Motion Transfer #Training-free #Diffusion Transformer #Attention Control #Video Generation #Cross-species #Motion Flow

July 13, 2026

[Paper Review] SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

This paper addresses the limitations of existing skill adaptation methods for LLM agents on long-horizon tasks.

#Review #LLM Agents #Skill Adaptation #Failure Attribution #Trajectory-level #Step-level #Training-free

June 1, 2026

[Paper Review] One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation

This study addresses the problem that existing cell instance segmentation models are tied to their training data, so their performance drops sharply on out-of-distribution (OOD) cell types.

#Review #Cell Instance Segmentation #Foundation Models #Group Prompting #Chain-of-Prompts #Training-free #Histopathology #SAM

May 31, 2026

[Paper Review] EarlyTom: Early Token Compression Completes Fast Video Understanding

This paper points out that the largest bottleneck in Video-LLM inference efficiency lies in the Vision Encoder stage rather than in the LLM itself. Existing token compression work has focused mainly on processing inside or after the LLM, and therefore has not effectively reduced time-to-first-token (TTFT).

#Review #Video-LLMs #Token Compression #Vision Encoder #Time-to-First-Token #Inference Efficiency #Training-free

May 28, 2026

[Paper Review] You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories

This study was designed to address the massive compute consumption of costly RLVR training. Existing RLVR needs a large number of optimization steps to improve performance, but a poor understanding of the geometric structure of the training trajectory has made efficient optimization difficult.

#Review #RLVR #LLMs #Low-rank #Extrapolation #SVD #Training-free #Parameter Trajectories

May 20, 2026

[Paper Review] A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression

Terminal-based software engineering agents require long-horizon decision making, but repetitive and noisy terminal output makes their context increasingly redundant. This redundancy not only drives up token cost exponentially, but also obscures important signals and becomes a major cause of degraded long-horizon reasoning.

#Review #Terminal Agents #Context Compression #Self-evolving Framework #Token Efficiency #Long-horizon Reasoning #Training-free

April 22, 2026

[Paper Review] Speculative Decoding for Autoregressive Video Generation

This paper proposes the SDVG framework, which uses an image quality router to either accept each drafted block or regenerate it with the target model. The drafter generates candidate blocks with four denoising steps, and each candidate is scored by ImageReward using worst-frame aggregation.

#Review #Speculative Decoding #Autoregressive Video Generation #Video Diffusion #Training-free #ImageReward

April 21, 2026

[Paper Review] Elucidating the SNR-t Bias of Diffusion Probabilistic Models

The authors propose DCW (Differential Correction in Wavelet domain) to mitigate the SNR-t bias. The method performs a plug-and-play differential correction that is training-free and can be applied at inference time.

#Review #Diffusion Probabilistic Models #SNR-t Bias #Differential Correction #Wavelet Domain #Generation Quality #Training-free

April 19, 2026

[Paper Review] When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models

This paper addresses numerical misalignment: although recent Text-to-Video (T2V) models generate high-quality video, they fail to accurately reflect the number of objects specified in the prompt.

#Review #Text-to-Video #Diffusion Transformer #Numerical Alignment #Training-free #Layout-guided Generation

April 9, 2026

[Paper Review] Extend3D: Town-Scale 3D Generation

Recent 3D generative models have succeeded in generating high-quality 3D objects, but they still struggle to generate large, town-scale 3D scenes with complex composition.

#Review #3D Scene Generation #Training-free #Latent Flow Model #Overlapping Patch-wise Flow #Under-noising #SDEdit #3D-aware Optimization

March 31, 2026

[Paper Review] Coarse-Guided Visual Generation via Weighted h-Transform Sampling

Coarse-guided visual generation is essential for a range of real-world applications such as deblurring and super-resolution.

#Review #Guided Visual Generation #Diffusion Model #Doob's h-Transform #Coarse-guided Generation #Training-free #Image Restoration #Video Generation #Weighted Sampling

March 12, 2026

[Paper Review] Rethinking Global Text Conditioning in Diffusion Transformers

This paper asks whether modulation-based global text conditioning (the pooled text embedding) is essential in Diffusion Transformers, and whether it can contribute to better performance.

#Review #Diffusion Transformers #Text Conditioning #CLIP Embedding #Modulation Guidance #Text-to-Image Generation #Image Editing #Training-free

February 10, 2026

[Paper Review] HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

The paper aims to address the problems that existing Multimodal Large Language Models (MLLMs) face in streaming video understanding, including unstable performance, high response latency, and high GPU memory usage.

#Review #Streaming Video Understanding #KV Cache Management #Hierarchical Memory #MLLMs #Low Latency #Training-free #Memory Efficiency

January 22, 2026

[Paper Review] KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs

The goal is to extract high-quality text embeddings efficiently by resolving two structural problems that arise when a decoder-only LLM is used as a text embedding backbone without training: information asymmetry caused by causal attention, and a semantic compression bias caused by the next-token prediction objective.

#Review #Text Embedding #Decoder-only LLMs #Training-free #KV Re-routing #Causal Attention #Representation Learning #Intrinsic Dimensionality

January 5, 2026

[Paper Review] Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules

This paper addresses the problem that diffusion language models (dLLMs), despite their potential relative to autoregressive models, are impractical because of their slow, iterative sampling process.

#Review #Diffusion Language Models #Decoding Efficiency #Early Exit #Confidence Schedules #Training-free #Model-agnostic #Progress-aware

December 10, 2025

[Paper Review] Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration

This paper aims to address the slow inference speed of 3D diffusion models.

#Review #3D Geometry Synthesis #Diffusion Models #Acceleration #Caching #Training-free #Flow Matching #Voxel Stabilization #Computational Efficiency

November 30, 2025

[Paper Review] Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

This study aims to enable prompt-based image editing in visual autoregressive (VAR) models without additional training. It seeks to overcome the limited editing ability of existing VAR models and to develop a method that performs targeted edits accurately and controllably according to the text prompt, while preserving unrelated details of the source image.

#Review #Image Editing #Autoregressive Models #Noise Inversion #Text-to-Image #Gumbel-max Trick #Training-free #Location-aware Argmax Inversion

September 3, 2025

[Paper Review] Motion2Motion: Cross-topology Motion Transfer with Sparse Correspondence

This paper aims to solve animation transfer between characters whose skeletal topologies differ substantially.

#Review #Motion Transfer #Cross-topology #Sparse Correspondence #Motion Matching #Animation #Training-free #Few-shot Learning

August 20, 2025

[Paper Review] SpA2V: Harnessing Spatial Auditory Cues for Audio-driven Spatially-aware Video Generation

This paper points out that existing audio-driven video generation models focus mainly on semantic information and therefore lack spatial consistency.

#Review #Audio-driven Video Generation #Spatial Auditory Cues #Video Scene Layout #MLLM #Diffusion Models #Training-free

August 4, 2025

[Paper Review] Reasoning with Sampling: Your Base Model is Smarter Than You Think

This paper asks whether RL post-training of LLMs genuinely confers new reasoning abilities, or whether it merely "sharpens" abilities the base model already has.

#Review #LLMs #MCMC #Sampling #Reasoning #Distribution Sharpening #Reinforcement Learning (RL) #Inference-time Optimization #Training-free

October 27, 2025

[Paper Review] ConsistEdit: Highly Consistent and Precise Training-free Visual Editing

This paper addresses a limitation of existing training-free, text-guided visual editing methods: it is difficult to preserve consistency with the source while maintaining strong editing strength.

#Review #Image Editing #Video Editing #Diffusion Transformer #Attention Control #Training-free #Multi-modal Diffusion Transformer (MM-DiT) #Consistency Preservation

October 21, 2025

[Paper Review] Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition

The core goal of this paper is to improve the performance of diffusion-based or flow-based robot policies without additional model training.

#Review #Diffusion Models #Flow-based Models #Robotics Control #Policy Composition #Test-time Optimization #Score-based Models #Training-free

October 6, 2025