#Planning

12 posts

[Paper Review] Bridging the Agent-World Gap: Text World Models for LLM-based Agents

This paper addresses the Agent-World Gap: LLM-based agents operating in complex, dynamic environments fail to predict environment changes accurately.

#Review #LLM-based Agents #World Models #Text World Models #Environment Interaction #Planning #Sequential Decision Making

June 9, 2026

[Paper Review] Reward Prediction with Factorized World States

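This study aims to develop an accurate reward prediction model that lets AI agents generalize across new goals and environments. In particular, it addresses the training-data bias and limited generalization of existing supervised reward models, as well as the lack of benchmarks for fine-grained, step-level reward evaluation.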

#Review #Reward Prediction #World Models #State Representation #Large Language Models #Zero-shot Learning #Reinforcement Learning #Planning #Factorization

March 10, 2026

[Paper Review] PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks

This paper aims to evaluate how well unified multimodal models (UMMs) support planning-oriented computer-use tasks that are closely tied to everyday life.

#Review #Multimodal Models #Image Generation #Image Editing #Benchmark #Computer-Use Tasks #Planning #Evaluation Metrics

February 8, 2026

[Paper Review] Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization

This paper addresses two problems with Chain-of-Thought (CoT) reasoning in LLMs: its high computational cost and the collapse of reasoning paths caused by discrete token sampling.

#Review #Latent Reasoning #Chain-of-Thought (CoT) #Large Language Models (LLMs) #Planning #Reinforcement Learning #Mathematical Reasoning #Decoupling #Interpretability

February 1, 2026

[Paper Review] Agentic Reasoning for Large Language Models

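This survey paper aims to systematize the Agentic Reasoning paradigm, in which the reasoning capabilities of large language models (LLMs) move beyond static, closed environments and develop into autonomous agents that interact continuously with dynamic, open-ended environments through planning, acting, and learning.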

#Review #Agentic Reasoning #LLM Agents #Self-Evolving AI #Multi-Agent Systems #Planning #Tool Use #Retrieval-Augmented Generation #Reinforcement Learning

January 21, 2026

[Paper Review] MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

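This paper addresses the absence of a comprehensive benchmark for evaluating video-based spatial intelligence, which multimodal large language models (MLLMs) need in order to serve as general-purpose assistants in physical environments.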

#Review #Video-Based Spatial Intelligence #MLLM Benchmark #Spatial Reasoning #Multi-Modal Learning #Perception #Planning #Prediction #Cross-Video Reasoning #Human-AI Gap

December 17, 2025

[Paper Review] V-REX: Benchmarking Exploratory Visual Reasoning via Chain-of-Questions

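This paper addresses the difficulty that existing VLMs have with multi-step exploration and dynamic planning in complex, open-ended visual reasoning tasks. It aims to develop a benchmark (V-REX) and an evaluation protocol that quantitatively assess the exploratory visual reasoning ability of VLMs, which is hard to evaluate because of the large exploration space.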

#Review #Visual Reasoning #Multi-step Exploration #Chain-of-Questions (CoQ) #Vision-Language Models (VLMs) #Benchmarking #Planning #Following

December 15, 2025

[Paper Review] Budget-Aware Tool-Use Enables Effective Agent Scaling

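This paper studies effective test-time scaling for agents based on large language models (LLMs). In particular, it explores how tool-using agents can make efficient use of their interactions with the external environment (tool calls) to optimize performance under explicit budget constraints.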

#Review #LLM Agents #Tool Use #Budget Awareness #Test-time Scaling #Cost-Performance #Web Search Agents #Planning #Self-Verification

November 24, 2025

[Paper Review] Simulating the Visual World with Artificial Intelligence: A Roadmap

This paper aims to systematically survey how video generation models are evolving into comprehensive physical world models and to present a roadmap for that evolution.

#Review #World Models #Video Generation #AI Simulation #Generative AI #Physical Plausibility #Interactive AI #Planning #Roadmap

November 16, 2025

[Paper Review] UItron: Foundational GUI Agent with Advanced Perception and Planning

This paper presents UItron, an open-source foundation model that strengthens the core capabilities of GUI agents that automate complex tasks in mobile and PC environments.

#Review #GUI Agent #Foundational Model #Multimodal LLM #Perception #Planning #Reinforcement Learning #Data Engineering #Chinese App Scenarios

September 1, 2025

[Paper Review] Dyna-Mind: Learning to Simulate from Experience for Better AI Agents

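This work aims to help AI agents overcome their current limitations on complex, long-horizon interactive tasks by giving them a capacity for "vicarious trial and error": mentally simulating the environment to improve reasoning and decision-making performance.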

#Review #AI Agents #Reinforcement Learning #World Models #Simulation #Reasoning #Language Models #Planning #Interactive AI

October 13, 2025

[Paper Review] Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective

This paper aims to analyze theoretically the benefits and limitations of reinforcement learning (RL) methods for improving the planning ability of large language models (LLMs).

#Review #Reinforcement Learning #Large Language Models #Planning #Policy Gradient #Q-learning #Supervised Fine-Tuning #Diversity Collapse #Reward Hacking

October 1, 2025