[Paper Review] OmniGAIA: Towards Native Omni-Modal AI Agents
A detailed review of the paper 'OmniGAIA: Towards Native Omni-Modal AI Agents', posted on arXiv by Guanting Dong.
#Review #Omni-modal AI #Multi-modal Agents #Tool-Integrated Reasoning #Benchmark #Event Graph #Active Perception #Trajectory Synthesis #DPO
February 26, 2026
[Paper Review] SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models
A detailed review of the paper 'SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models', posted on arXiv.
#Review #Vision-Language-Action Models #Self-Uncertainty Estimation #Adaptive Inference #Active Perception #Action Decoding #Visual Attention #Robotic Manipulation
February 10, 2026
[Paper Review] EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models
A detailed review of the paper 'EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models', posted on arXiv by Ziyi Bai.
#Review #Humanoid Robots #Vision-Language Models #Task Planning #Egocentric Control #Mobile Manipulation #Active Perception #Human-Robot Interaction #Real-World Deployment
February 4, 2026
[Paper Review] Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection
A detailed review of the paper 'Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection', posted on arXiv.
#Review #Active Perception #Vision-Language Models (VLMs) #Embodied AI #View Selection #Reinforcement Learning (RL) #Supervised Fine-Tuning (SFT) #Visual Question Answering (VQA) #3D Environments
December 15, 2025
[Paper Review] AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models
A detailed review of the paper 'AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models', posted on arXiv by Zhen Li.
#Review #3D Embodied Reasoning #Multimodal Large Language Models (MLLMs) #Chain-of-Thought (CoT) #Affordance Grounding #Motion Estimation #View Synthesis #Active Perception
November 13, 2025