본문으로 건너뛰기

#Vision-Language Models (VLMs)

44개의 포스트

[논문리뷰] HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios

댓글 수 로딩 중

[논문리뷰] AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games

댓글 수 로딩 중

[논문리뷰] AdaptMMBench: Benchmarking Adaptive Multimodal Reasoning for Mode Selection and Reasoning Process

댓글 수 로딩 중

[논문리뷰] SketchDynamics: Exploring Free-Form Sketches for Dynamic Intent Expression in Animation Generation

댓글 수 로딩 중

[논문리뷰] Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models

댓글 수 로딩 중

[논문리뷰] Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

댓글 수 로딩 중

[논문리뷰] Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned

댓글 수 로딩 중