#Inference-time Guidance

1개의 포스트

[논문리뷰] Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning

본 연구는 대형 모델의 추론 성능을 소형 모델에서 효율적으로 모사하기 위한 기존 추론 기법들의 한계를 해결하고자 합니다.

#Review #Mathematical Reasoning #Large Language Models #Process Reward Model #Inference-time Guidance #Chunk-Level Generation #Likelihood Scoring #Training-Free

2026년 6월 1일