#Spatial Understanding

6 posts

[Paper Review] Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models

[Paper Review] Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

[Paper Review] Enhancing Spatial Understanding in Image Generation via Reward Modeling

[Paper Review] MiMo-Embodied: X-Embodied Foundation Model Technical Report

[Paper Review] LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation
