#Group Relative Self-Distillation (GRSD)

1 post

[Paper Review] UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

As Multimodal Large Language Models (MLLMs) advance, interest in autonomous mobile GUI agents is growing. Existing methods, however, learn inefficiently from failed trajectories and face an ambiguous credit assignment problem caused by sparse rewards in long-horizon GUI tasks.

#Review #GUI Agent #Self-Evolving Learning #Rejection Fine-Tuning (RFT) #Group Relative Self-Distillation (GRSD) #Credit Assignment #Sparse Rewards #Mobile Automation #Multimodal Large Language Models (MLLMs)

March 25, 2026