#Interleaved Context

1 post

[Paper Review] VINO: A Unified Visual Generator with Interleaved OmniModal Context

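This paper aims to overcome the limitations of existing, fragmented visual generation pipelines by developing VINO, a unified visual generator that handles both generation and editing of images and videos within a single framework.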

#Review #Unified Generation #Multimodal Diffusion #Vision-Language Model #Image Editing #Video Editing #Interleaved Context #Progressive Training #Diffusion Transformer

January 5, 2026