[Paper Review] JAEGER: Joint 3D Audio-Visual Grounding and Reasoning in Simulated Physical Environments. A detailed review of the paper 'JAEGER: Joint 3D Audio-Visual Grounding and Reasoning in Simulated Physical Environments', posted on arXiv. #Review #3D Audio-Visual Learning #Spatial Grounding #Spatial Reasoning #Large Language Models (LLMs) #Ambisonics #RGB-D #Simulated Environments #Neural Intensity Vector. February 25, 2026.
[Paper Review] GEBench: Benchmarking Image Generation Models as GUI Environments. A detailed review of the paper 'GEBench: Benchmarking Image Generation Models as GUI Environments', posted on arXiv. #Review #GUI Generation #Image Generation Models #Benchmark #Temporal Coherence #Spatial Grounding #Evaluation Metric #Vision Language Models. February 9, 2026.
[Paper Review] Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench. A detailed review of the paper 'Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench', posted on arXiv. #Review #Vision-Language Models #Benchmarking #Visual Measurement Reading #Synthetic Data Generation #Fine-grained Perception #Spatial Grounding #Reinforcement Learning. November 9, 2025.
[Paper Review] InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy. A detailed review of the paper 'InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy', posted on arXiv by Yilun Chen. #Review #Robotics #Vision-Language-Action (VLA) #Spatial Grounding #Generalist Policy #Multimodal Learning #Instruction Following #Simulation-to-Real #Diffusion Models. October 16, 2025.
[Paper Review] See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation. A detailed review of the paper 'See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation', posted on arXiv by Chih-Hai Su. #Review #Vision-Language Models #UAV Navigation #Zero-shot #Spatial Grounding #Waypoint Prompting #Autonomous Navigation #Adaptive Control. September 29, 2025.