#Temporal modeling

1 post

[Paper Review] PEEK: Picking Essential frames via Efficient Knowledge distillation

This paper aims to address a bottleneck in modern Vision-Language Models (VLMs): they can process only a limited number of frames for video understanding.

#Review #Video-language models #Frame selection #Knowledge distillation #Video captioning #Query-free sampling #Temporal modeling

May 31, 2026