#Mobile GUI Agents

5 posts

[Paper Review] MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization

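This paper proposes MobileForge to address the cost and inefficiency of adapting mobile GUI agents to target apps. Existing approaches depend on human-written task data, expert demonstrations, or reward labels, which makes it hard for them to keep up with frequent app updates.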

#Review #Mobile GUI Agents #Annotation-Free Adaptation #Hierarchical Feedback #Policy Optimization #MobileGym #HiFPO #GRPO

June 23, 2026

[Paper Review] VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics

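This paper proposes VenusBench-Mobile, a benchmark comprising 10 user-intent-centric categories, 149 tasks, and 80 environment variations. To analyze the causes of agent failures in fine detail, it introduces the PUDAM capability taxonomy and classifies the difficulty of each task into four levels (Level 1-4).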

#Review #Mobile GUI Agents #User-Centric Benchmark #Capability Diagnostics #Human-Computer Interaction #Performance Evaluation #Robustness

April 8, 2026

[Paper Review] MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments

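This paper argues that existing mobile GUI agent benchmarks fail to evaluate memory capabilities systematically: memory-related tasks account for only 5.2-11.8% of their tasks, and none of them evaluates cross-session learning.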

#Review #Mobile GUI Agents #Memory Benchmarking #Short-Term Memory #Long-Term Memory #LLM-as-Judge #Dynamic Environments #Evaluation Metrics #Task Automation

February 8, 2026

[Paper Review] OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

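This study addresses the safety of autonomous agents in complex mobile GUI environments, in particular the effective detection of unexpected risks such as system compromise and privacy leakage. Its goal is to remedy the shortcomings of existing safety-detection infrastructure and strategies and to lay a systematic foundation for mobile agent safety research.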

#Review #Mobile GUI Agents #Agent Safety #Hybrid Detection #Formal Verification #VLM-based Contextual Judgment #Safety Benchmark #Risk Detection

November 9, 2025

[Paper Review] MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents

This paper addresses the lack of a systematic benchmarking framework for the hybrid paradigm, which combines GUI operations with efficient shortcuts to improve the efficiency of mobile GUI agents.

#Review #Mobile GUI Agents #Hybrid Automation #Shortcut Generation #Benchmark #Task Efficiency #LLM-based Agents #Mobile Robotics

September 9, 2025