[Paper Review] Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding

A detailed review of the paper 'Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding', posted to arXiv by Rynson W. H. Lau.

#Review #Spatio-Temporal Video Grounding #Multimodal Large Language Models #Zero-Shot Learning #Visual Grounding #Decomposed Spatio-Temporal Highlighting #Logit-Guided Re-attention #Temporal-Augmented Assembling

September 19, 2025