[논문리뷰] AIRS-Bench: a Suite of Tasks for Frontier AI Research Science AgentsarXiv에 게시된 'AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents' 논문에 대한 자세한 리뷰입니다.#Review#AI Research Agents#LLM Agents#Machine Learning Benchmarks#Scientific Discovery#Code Generation#Evaluation Metrics#Scaffolds#Reproducibility2026년 2월 9일댓글 수 로딩 중