#VLA Models

10 posts

[Paper Review] EgoSteer: A Full-Stack System Towards Steerable Dexterous Manipulation from Egocentric Videos

This paper aims to address two limitations of general robot manipulation models: they lack real-time steerability, and they are tied to specific robot environments.

#Review #Steerable Dexterous Manipulation #VLA Models #Egocentric Videos #World Model #Robot Learning #DAgger

July 13, 2026

[Paper Review] EVA-Client: A Unified Data Collection, Inference, and Deployment Framework for Embodied Policies on Real Robots

This paper addresses the gap between a mature training ecosystem for recent Vision-Language-Action (VLA) models and World-Action Models (WAM) and a deployment and evaluation process on real robots that still relies on fragmented scripts.

#Review #Embodied AI #Robot Manipulation #Deployment Framework #Inference Strategies #Data Collection #Real-Robot Evaluation #VLA Models

July 6, 2026

[Paper Review] Tex3D: Objects as Attack Surfaces via Adversarial 3D Textures for Vision-Language-Action Models

This paper proposes Tex3D, which optimizes adversarial 3D textures end-to-end within VLA simulation environments. The proposed Foreground-Background Decoupling (FBD) renders the background in MuJoCo and the object in Nvdiffrast, and synchronizes the MVP (Model-View-Projection) and lighting parameters between the two renderers to obtain a differentiable path.

#Review #VLA Models #3D Adversarial Textures #Embodied Robustness #Differentiable Rendering #Foreground-Background Decoupling

April 2, 2026

[Paper Review] RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models

This paper seeks to overcome the limitations of existing Supervised Fine-Tuning (SFT)-based sim-real co-training for Vision-Language-Action (VLA) models, which treats simulation only as a static data source and does not fully exploit closed-loop interaction.

#Review #Reinforcement Learning #Sim-to-Real #Co-training #VLA Models #Robotic Manipulation #Supervised Fine-tuning #Catastrophic Forgetting

February 15, 2026

[Paper Review] GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

This paper aims to address the limits on long-horizon action planning that current Vision-Language-Action (VLA) models face because of limited scene understanding and weak future prediction.

#Review #VLA Models #World Models #Reinforcement Learning #Robotic Manipulation #Long-Horizon Control #Human-in-the-Loop #Continual Learning

February 12, 2026

[Paper Review] GR-Dexter Technical Report

This paper tackles the challenge of scaling generalist robot manipulation policies based on Vision-Language-Action (VLA) models to high-DoF bimanual robots with dexterous hands.

#Review #Dexterous Manipulation #Bimanual Robotics #VLA Models #Robot Learning #Teleoperation #Cross-Embodiment Data #Robotic Hand Design

December 31, 2025

[Paper Review] DiG-Flow: Discrepancy-Guided Flow Matching for Robust VLA Models

This paper addresses the performance degradation of Vision-Language-Action (VLA) models under distribution shift and on complex multi-step robot manipulation tasks. The degradation arises because the learned representations fail to robustly capture task-relevant semantics, and the paper aims to improve the robustness of VLA models through geometric regularization.

#Review #VLA Models #Flow Matching #Robotics #Robustness #Distribution Shift #Wasserstein Distance #Geometric Regularization #Representation Learning

December 2, 2025

[Paper Review] A Survey on Efficient Vision-Language-Action Models

This paper aims to address the difficulty of deploying large-scale Vision-Language-Action (VLA) models in real robot environments, which stems from their massive compute and data requirements.

#Review #Embodied AI #Robotic Manipulation #VLA Models #Efficient AI #Model Compression #Efficient Training #Data Collection #Multimodal AI

November 9, 2025

[Paper Review] ACG: Action Coherence Guidance for Flow-based VLA models

This paper aims to resolve action incoherence (jerks, pauses, jitter) in Vision-Language-Action (VLA) models trained via imitation learning, particularly diffusion and flow matching models, and thereby prevent the instability and trajectory drift that cause precise manipulation to fail.

#Review #Action Coherence #Flow Matching #VLA Models #Guidance #Robotics #Imitation Learning #Transformer #Self-Attention

October 28, 2025

[Paper Review] RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

This paper addresses the problem of small-scale, fragmented experiments that arises when applying reinforcement learning (RL) to Vision-Language-Action (VLA) models. It aims to provide a unified and efficient framework that supports large-scale experiments and enables fair comparison across different models, algorithms, and evaluation settings.

#Review #Reinforcement Learning #VLA Models #Robotics #GPU Management #PPO #GRPO #Sim-to-Real

October 9, 2025