#VQA

16 posts

[Paper Review] HakushoBench: A Japanese Chart and Table VQA Benchmark from Governmental White Papers

[Paper Review] AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation

[Paper Review] Differences That Matter: Auditing Models for Capability Gap Discovery and Rectification

[Paper Review] World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models

[Paper Review] Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs

[Paper Review] ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use

[Paper Review] Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation

[Paper Review] EchoVLM: Dynamic Mixture-of-Experts Vision-Language Model for Universal Ultrasound Intelligence

[Paper Review] SeeingEye: Agentic Information Flow Unlocks Multimodal Reasoning In Text-only LLMs

[Paper Review] LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models

[Paper Review] Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training
