#MLLM Benchmark

3 posts

[Paper Review] UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

This study addresses the difficulty that Multimodal Large Language Models (MLLMs) have in understanding perceptual-level image properties such as aesthetics, quality, structure, and texture.

#Review #Perceptual Understanding #Image Aesthetics #Image Quality #Image Structure #Image Texture #MLLM Benchmark #Visual Question Answering #Reward Model

December 28, 2025

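[Paper Review] MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

This paper addresses the lack of a comprehensive benchmark for evaluating video-based spatial intelligence, a capability that MLLMs (Multi-modal Large Language Models) need in order to serve as general-purpose assistants in physical environments.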

#Review #Video-Based Spatial Intelligence #MLLM Benchmark #Spatial Reasoning #Multi-Modal Learning #Perception #Planning #Prediction #Cross-Video Reasoning #Human-AI Gap

December 17, 2025

[Paper Review] HumanSense: From Multimodal Perception to Empathetic Context-Aware Responses through Reasoning MLLMs

This paper addresses the lack of a fine-grained evaluation framework for assessing how deeply MLLMs (Multimodal Large Language Models) understand human-centered scenarios and how well they produce empathetic, context-aware responses.

#Review #Multimodal LLMs #Human-Centered AI #Empathy #Context-Awareness #MLLM Benchmark #Reinforcement Learning #Reasoning

August 15, 2025