본문으로 건너뛰기

#Performance Evaluation

5개의 포스트

[논문리뷰] VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics

댓글 수 로딩 중

[논문리뷰] Model Context Protocol (MCP) Tool Descriptions Are Smelly! Towards Improving AI Agent Efficiency with Augmented MCP Tool Descriptions

댓글 수 로딩 중