#Budget Allocation

2 posts

[Paper Review] The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

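This study addresses how to allocate budget efficiently so as to maximize LLM reasoning performance under fixed compute resources. The conventional Uniform policy assigns the same token limit to every query, so it wastes resources on easy problems and fails to give hard problems the resources they need to perform well.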

#Review #Inference-time Scaling #Budget Allocation #Shadow Price #Lambert W Function #Rational Abandonment #LLM Reasoning #Compute-Utility Equilibrium

June 4, 2026

[Paper Review] CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs

The paper targets the inefficient uniform rollout budget allocation of existing methods such as GRPO (Group Relative Policy Optimization) within the RLVR (Reinforcement Learning with Verifiable Rewards) framework, which is used to strengthen LLM reasoning.

#Review #Reinforcement Learning #LLMs #Budget Allocation #Adaptive Learning #Capability-Oriented Value Function #Exploration-Exploitation #Resource Efficiency

February 3, 2026