#Long-horizon

4 posts

[Paper Review] WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction

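This paper proposes WorldMemArena to address two problems with existing memory benchmarks: they are biased toward static conversational data, and they evaluate memory with only a single success metric, which makes it hard to diagnose the causes of failure.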

#Review #Multimodal Agent #Memory Benchmark #Action-World Interaction #Lifecycle Evaluation #Long-horizon #Lifelong Evolution #Agentic Execution

May 28, 2026

[Paper Review] MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware

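This paper aims to build open infrastructure for collecting long-horizon egocentric data, which is essential for training large-scale VLA models. Existing datasets are limited in scalability: their episodes are short, and they depend on expensive hardware.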

#Review #Egocentric Data #Vision Language Action (VLA) #Long-horizon #SLAM #STERA #Smartphone-based Capture

May 17, 2026

[Paper Review] From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing

Existing diffusion-based image editing models perform well on clear, concrete tasks such as "add a hat", but they struggle with abstract, multi-step, long-horizon instructions such as "make the advertisement vegetarian-friendly".

#Review #Long-horizon #Image Editing #Planner-Orchestrator #Experiential Learning #Reward-driven #Multimodal LLM #Diffusion Models

May 17, 2026

[Paper Review] Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces

This paper addresses the problem that existing user simulation research is confined to isolated scenarios or relies on synthetic data, and therefore fails to capture the holistic nature of human behavior.

#Review #Large Language Models #User Simulation #Human Behavior Modeling #Long-horizon #Cross-scenario #Benchmark

April 9, 2026