본문으로 건너뛰기

#Long-Horizon Tasks

23개의 포스트

[논문리뷰] MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments

댓글 수 로딩 중

[논문리뷰] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

댓글 수 로딩 중

[논문리뷰] A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

댓글 수 로딩 중