#Full Self-Attention

1 post

[Paper Review] EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning

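This paper addresses the fragmentation of image and video generation and editing tasks, which stems from architectural limitations and data scarcity. It proposes EditVerse, a framework that unifies image and video editing and generation within a single model, aiming to handle diverse modalities flexibly through in-context learning.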

#Review #Unified Multimodal Model #In-Context Learning #Image and Video Editing #Video Generation #Full Self-Attention #Rotary Positional Embedding #Cross-Modal Knowledge Transfer

September 25, 2025