[Paper Review] A Systematic Study of Cross-Modal Typographic Attacks on Audio-Visual Reasoning. A detailed review of the paper 'A Systematic Study of Cross-Modal Typographic Attacks on Audio-Visual Reasoning', posted to arXiv by Deepti Ghadiyaram. #Review #Multi-modal Large Language Models #Audio Typography #Adversarial Attack #Cross-modal Robustness #Semantic Steering #Safety Application #Content Moderation. April 8, 2026
[Paper Review] OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding. A detailed review of the paper 'OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding', posted to arXiv. #Review #3D Scene Understanding #Open-Vocabulary Segmentation #Referring Expression Segmentation #Training-Free #Voxel Grouping #Vision-Language Models #Multi-modal Large Language Models #Sparse Voxel Rasterization. January 14, 2026
[Paper Review] OceanGym: A Benchmark Environment for Underwater Embodied Agents. A detailed review of the paper 'OceanGym: A Benchmark Environment for Underwater Embodied Agents', posted to arXiv. #Review #Underwater Robotics #Embodied AI #Benchmark Environment #Multi-modal Large Language Models #Autonomous Underwater Vehicles #Perception #Decision-Making #Simulation. October 1, 2025
[Paper Review] RynnEC: Bringing MLLMs into Embodied World. A detailed review of the paper 'RynnEC: Bringing MLLMs into Embodied World', posted to arXiv by jiangpinliu. #Review #Multi-modal Large Language Models #Embodied AI #Embodied Cognition #Video Understanding #Instance Segmentation #Spatial Reasoning #Robotics. August 21, 2025
[Paper Review] Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation. A detailed review of the paper 'Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation', posted to arXiv by Hao Huang. #Review #Text-to-3D Generation #Prompt Engineering #Visual Analytics #Human-Computer Interaction #Multi-modal Large Language Models #3D Model Evaluation. August 7, 2025