#Rotary Positional Embedding

4 posts

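[Paper Review] Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

This paper addresses the problem that existing video world models focus on single-agent environments and therefore cannot efficiently simulate complex shared environments in which multiple agents interact.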

#Review #Generative World Model #Multi-Agent Interaction #Diffusion Transformer #Permutation Symmetry #Rotary Positional Embedding #Sparse Hub Attention

May 27, 2026

[Paper Review] MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

Large Language Models (LLMs) have shown strong capabilities across many domains, but they still struggle with long-term, fine-grained memory retention at the scale of millions of tokens.

#Review #Memory Sparse Attention #Long-Context LLMs #Efficient Memory #End-to-End Trainable #KV Cache Compression #Rotary Positional Embedding #Multi-hop Reasoning #Scalability

March 26, 2026

[Paper Review] Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout

This paper aims to address three core limitations of existing autoregressive video diffusion models.

#Review #Autoregressive Video Generation #Rotary Positional Embedding #Infinite Video Generation #Action Control #Cinematic Transitions #Video Diffusion Models #KV Cache

December 1, 2025

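[Paper Review] EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning

This paper addresses the problem that image and video generation and editing tasks are fragmented because of architectural limitations and data scarcity. It proposes EditVerse, a framework that unifies image and video editing and generation within a single model, with the goal of handling diverse modalities flexibly through in-context learning.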

#Review #Unified Multimodal Model #In-Context Learning #Image and Video Editing #Video Generation #Full Self-Attention #Rotary Positional Embedding #Cross-Modal Knowledge Transfer

September 25, 2025