#Predictive Sensing

1 post

[Paper Review] Cambrian-S: Towards Spatial Supersensing in Video

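This paper points out three limitations of current multimodal large language models (MLLMs): they process video as fragmented frames, fail to properly understand spatial structure, and rely excessively on linguistic memory.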

#Review #Spatial Supersensing #Video Understanding #Multimodal LLMs #Predictive Sensing #Memory Management #Event Segmentation #VSI-SUPER #Instruction Tuning

November 9, 2025