[Paper Review] AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents. A detailed review of the paper 'AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents', posted on arXiv. #Review #AI Research Agents #LLM Agents #Machine Learning Benchmarks #Scientific Discovery #Code Generation #Evaluation Metrics #Scaffolds #Reproducibility. February 9, 2026
[Paper Review] AI & Human Co-Improvement for Safer Co-Superintelligence. A detailed review of the paper 'AI & Human Co-Improvement for Safer Co-Superintelligence', posted on arXiv. #Review #AI Safety #Superintelligence #Human-AI Collaboration #Self-Improving AI #Co-Improvement #Alignment #AI Research Agents. December 7, 2025
[Paper Review] What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity. A detailed review of the paper 'What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity', posted on arXiv. #Review #AI Research Agents #Ideation Diversity #MLE-bench #LLM Backbones #Agentic Scaffolds #Shannon Entropy #Machine Learning Engineering #Performance Metrics. November 19, 2025