#MotionVLA

1 post

[Paper Review] MotionVLA: Vision-Language-Action Model for Humanoid Motion

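This paper addresses a limitation of existing single-codebook motion tokenization: it is biased toward low-frequency pose information and fails to adequately represent high-frequency physical dynamics. Most prior work discretizes motion as a single unified sequence, which ignores the differing statistical properties of joint positions (low-frequency) and velocities (high-frequency).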
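To make the position/velocity contrast concrete, here is a minimal sketch that compares how much spectral energy sits above a cutoff frequency for a joint trajectory and for its finite-difference velocity. Everything in it is an illustrative assumption rather than something taken from the paper: the synthetic signal (a slow 0.5 Hz swing plus a small 8 Hz jitter), the 30 fps frame rate, the 4 Hz cutoff, and the helper name `high_freq_energy_fraction`.

```python
import numpy as np


def high_freq_energy_fraction(signal, fps, cutoff_hz):
    """Fraction of non-DC spectral energy at or above cutoff_hz (hypothetical helper)."""
    centered = signal - signal.mean()            # remove the DC component
    spectrum = np.abs(np.fft.rfft(centered)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    return float(spectrum[freqs >= cutoff_hz].sum() / spectrum.sum())


fps = 30.0                                       # assumed frame rate
t = np.arange(0.0, 4.0, 1.0 / fps)               # 4 seconds, 120 frames

# Assumed toy joint angle: large slow swing plus small fast jitter.
position = np.sin(2 * np.pi * 0.5 * t) + 0.05 * np.sin(2 * np.pi * 8.0 * t)

# Velocity via finite differences; differentiation scales each component
# roughly in proportion to its frequency, so the jitter gains relative weight.
velocity = np.gradient(position, 1.0 / fps)

pos_frac = high_freq_energy_fraction(position, fps, cutoff_hz=4.0)
vel_frac = high_freq_energy_fraction(velocity, fps, cutoff_hz=4.0)

print(f"high-frequency energy fraction, position: {pos_frac:.4f}")
print(f"high-frequency energy fraction, velocity: {vel_frac:.4f}")
```

In this toy setting the velocity signal carries a far larger share of its energy above the cutoff than the position signal does, which is one way to read the summary's claim that quantizing both with a single codebook tends to favor the low-frequency pose content.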

#Review #Vision-Language-Action #Humanoid Motion #Frequency-Domain Tokenizer #Autoregressive Generation #Dual-Stream Representation #MotionVLA

June 16, 2026