본문으로 건너뛰기

#Human Evaluation

5개의 포스트

[논문리뷰] ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks

댓글 수 로딩 중

[논문리뷰] Seedream 4.0: Toward Next-generation Multimodal Image Generation

댓글 수 로딩 중

[논문리뷰] DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context

댓글 수 로딩 중

[논문리뷰] Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble during Multi-Hop Analysis

댓글 수 로딩 중