#Evaluation Framework

18 posts

[Paper Review] EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation

[Paper Review] AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery

[Paper Review] MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

[Paper Review] Rethinking Saliency Maps: A Cognitive Human Aligned Taxonomy and Evaluation Framework for Explanations

[Paper Review] BESPOKE: Benchmark for Search-Augmented Large Language Model Personalization via Diagnostic Feedback

[Paper Review] StatEval: A Comprehensive Benchmark for Large Language Models in Statistics

[Paper Review] Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods

[Paper Review] AInstein: Assessing the Feasibility of AI-Generated Approaches to Research Problems

[Paper Review] HiKE: Hierarchical Evaluation Framework for Korean-English Code-Switching Speech Recognition
