#3D Grounding

3 posts

[Paper Review] N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

This study aims to address a limitation of existing multimodal models: because they rely on 2D images, they lack 3D spatial understanding.

#Review #3D Grounding #Spatial Reasoning #Vision-Language Models #Depth Estimation #3D Object Detection #Chain-of-Thought #Data Generation #Multimodal AI

December 18, 2025

[Paper Review] Error-Driven Scene Editing for 3D Grounding in Large Language Models

This paper aims to address the failure of current 3D-LLMs to accurately ground language in the visual and spatial elements of 3D environments.

#Review #3D Grounding #3D-LLMs #Scene Editing #Counterfactual Augmentation #Error-Driven Learning #Spatial Reasoning #Visual Grounding

November 18, 2025

[Paper Review] OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment-aware Reasoning

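This paper aims to address two core limitations of existing MLLM-based embodied systems: the Geometric Adaptability Gap (insufficient 3D information for diverse spatial requirements) and the Embodiment Constraint Gap (disregard for the physical constraints of real robots).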

#Review #Embodied AI #Multimodal LLMs #3D Grounding #Task-Adaptive Reasoning #Embodiment-Aware Planning #Robotics #Spatial Reasoning

September 12, 2025