#Code Interpreter

2개의 포스트

[논문리뷰] From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models

본 연구는 도구 증강 언어 모델(TaLMs) 이 외부 도구를 사용할 때 발생하는 추론 환각(reasoning hallucinations) 의 새로운 유형인 Tool-Induced Myopia (TIM) 를 식별하고 특성화하는 것을 목표로 합니다.

#Review #Tool-augmented LLMs #Reasoning Hallucinations #Tool-Induced Myopia (TIM)#Code Interpreter #Mathematical Reasoning #LLM Evaluation #Preference Optimization

2025년 11월 16일

[논문리뷰] rStar2-Agent: Agentic Reasoning Technical Report

본 논문은 대규모 언어 모델(LLM)이 복잡한 수학 추론에서 '더 길게 생각하는' 것을 넘어 '더 스마트하게 생각하도록' 돕는 것을 목표로 합니다. 구체적으로, 에이전트형 강화 학습(RL)을 통해 Python 코딩 도구 를 자율적으로 활용하고 환경 피드백으로부터 학습하여 최첨단 성능을 달성하고자 합니다.

#Review #Agentic Reinforcement Learning #Math Reasoning #Code Interpreter #Tool Use #GRPO-RoC #LLM Training Efficiency #Self-Reflection

2025년 8월 29일