#Automated Question Generation

1 post

[Paper Review] MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning

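The paper aims to address two shortcomings of existing benchmarks for evaluating large models: their limited scope and their lack of difficulty adaptivity. It proposes MORPHOBENCH, a new benchmark of multidisciplinary questions whose difficulty can be adjusted and updated according to a model's reasoning ability, with the goal of making the evaluation of model reasoning more comprehensive and more valid.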

#Review #LLM Evaluation #Reasoning Benchmark #Difficulty Adaptation #Multimodal AI #Proof Graph #Agent Recognition #Automated Question Generation

October 20, 2025