#Problem Solving

3개의 포스트

[논문리뷰] Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking

본 논문은 텍스트 전용 추론 모델이 암묵적인 공간 및 기하학적 관계를 파악하는 데 어려움을 겪는 복잡한 추론 문제의 한계를 해결하고자 합니다.

#Review #Multimodal Reasoning #Visual Thinking #Reinforcement Learning #Code Generation #Geometric Reasoning #Adaptive Reward Mechanism #Problem Solving

2025년 12월 31일

[논문리뷰] CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics

본 논문은 대규모 언어 모델(LLMs)이 복잡한 과학 도메인, 특히 응집 물질 물리학(Condensed Matter Physics, CMP) 문제 해결에 얼마나 능숙한지 평가하기 위한 새로운 벤치마크인 CMPhysBench 를 제안합니다.

#Review #Large Language Models #Condensed Matter Physics #Benchmark #Scientific Reasoning #Evaluation Metric #Expression Edit Distance #Problem Solving

2025년 8월 27일

[논문리뷰] Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

본 연구는 대규모 언어 모델(LLM)이 고등학교 수준의 수학 및 코딩 과제에서는 진전을 보였지만, 현대 물리학 연구에서 발생하는 복잡하고 개방형의 난제들을 얼마나 효과적으로 추론하고 해결할 수 있는지 평가하는 것을 목표로 합니다.

#Review #AI Reasoning #Physics Research #LLM Evaluation #Scientific Benchmark #Frontier Physics #Problem Solving #Model Reliability #Auto-grading

2025년 10월 1일