#Robot Control

3 posts

[Paper Review] PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence

This study aims to address a shortcoming of existing VLMs (Vision-Language Models), whose generalization to robots is limited by the viewpoint mismatch problem.

#Review #Egocentric Data #Physical Intelligence #VLM #Robot Control #Embodied AI #VQA Supervision #Human-Robot Interaction #Zero-shot Transfer

December 21, 2025

[Paper Review] EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

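This study aims to address the limited domain coverage and flexibility of existing VLA models, targeting generalist robot control that enables human-level flexible multimodal reasoning and physical interaction in open-ended environments.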

#Review #Embodied AI #Robot Control #Vision-Language-Action Models #Multimodal Pretraining #Flow Matching #Foundation Models #Generalization #Real-world Robotics

September 1, 2025

[Paper Review] Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

This paper aims to address the limitations of existing Vision-Language-Action (VLA) model decoders: either fixed-order autoregressive generation, or continuous diffusion/flow matching heads that are separate from the backbone.

#Review #Vision-Language-Action (VLA) #Discrete Diffusion #Action Decoding #Transformer #Robot Control #Masked Modeling #Adaptive Decoding #Reinforcement Learning

August 28, 2025