#Multi-Agent System

37 posts

[Paper Review] DuMate-DeepResearch: An Auditable Multi-Agent System with Recursive Search and Rubric-Grounded Reasoning

This paper addresses four key limitations of existing Deep Research (DR) systems. The first is the complexity of long-horizon planning when the research scope is underspecified. The second is the risk of error propagation during subtask decomposition and scheduling in a single-agent setting.

#Review #Deep Research #Multi-Agent System #Graph-Based Dynamic Planning #Recursive Execution #Rubric-Grounded Reasoning #Auditability #Test-Time Optimization

June 8, 2026

[Paper Review] EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management

Existing data science agents rely on fixed task workflows and a limited action space, so they lack the ability to systematically accumulate or reuse experience.

#Review #Data Science Agent #Multi-Agent System #Self-Evolving #Agent Skill #Agentic Reinforcement Learning

June 4, 2026

[Paper Review] Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions

This paper explores how a multi-agent system can cooperate autonomously and develop advanced intelligence without centralized control. Existing centralized orchestration must route all information through a single gateway, which creates a performance bottleneck, and coordination complexity grows exponentially as the system scales.

#Review #Multi-Agent System #Economic Interaction #Decentralized Coordination #Credit Assignment #Large Language Models #Agentic Intelligence #Self-Organization

June 3, 2026

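[Paper Review] Multi-Agent Computer Use

This paper addresses the limitations that modern computer use agents (CUAs) show on complex, long-horizon tasks because they mostly operate as a single serial agent. Existing approaches lack task decomposition, parallel execution, and replanning based on new information, so they easily stall on long tasks.

#Review #Multi-Agent System #Computer Use Agent #DAG #Task Decomposition #Parallel Execution #Replanning

June 1, 2026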

[Paper Review] Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

This study addresses the opacity and the limited use of visual material that arise when large language models (LLMs) write long, fact-based reports in Deep Research.

#Review #Multi-Agent System #Multimodal Deep Research #Verifiable Generation #Test-Time Scaling #Visual Working Memory #Report Generation

May 28, 2026

[Paper Review] Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration

This paper tackles Long Cinematic Video Remaking, a core challenge in Long-horizon Video-to-Video Generation.

#Review #Long-Video Remaking #Multi-Agent System #Dual-Bridge Consistency #Character Identity #Narrative Fidelity #Video-to-Video Generation

May 26, 2026

[Paper Review] One Sentence, One Drama: Personalized Short-Form Drama Generation via Multi-Agent Systems

This paper addresses three key problems in existing digital short-form drama production: a lack of narrative pacing, insufficient spatial consistency between clips, and heavy reliance on manual review.

#Review #Short-Form Drama #Multi-Agent System #3D-Grounded Generation #Narrative Pacing #Spatial Consistency #Production-Level Quality Control

May 21, 2026

[Paper Review] From Runnable to Shippable: Multi-Agent Test-Driven Development for Generating Full-Stack Web Applications from Requirements

This paper aims to address the problem that current coding agents fail to meet more than 70% of functional requirements when generating web applications. Existing agents verify their work using only code files or terminal output, but the correctness of a web application can only be assessed through dynamic interaction in a browser environment.

#Review #Multi-Agent System #Test-Driven Development #Web Development #Code Generation #Closed-Loop Validation #Large Language Model

May 18, 2026

[Paper Review] Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

This paper addresses the limitations of the stateless design of existing LLM-based competitive programming agents. Most recent frameworks start from scratch on every problem and have an isolated structure that cannot reuse past debugging experience or failure records.

#Review #Large Language Models #Competitive Programming #Agentic Evolution #Reinforcement Learning #Knowledge Network #Code Generation #Multi-Agent System

May 17, 2026

[Paper Review] CutClaw: Agentic Hours-Long Video Editing via Music Synchronization

Video editing is a complex task that combines visual storytelling with the rhythm of audio, but manually editing hours of raw footage is highly labor-intensive and demands professional aesthetic judgment.

#Review #Multimodal Language Models #Video Editing #Audio-Visual Alignment #Multi-Agent System #Hierarchical Planning

March 31, 2026

[Paper Review] Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models

Large Language Models (LLMs) have made significant progress through extended reasoning such as Chain-of-Thought prompting, but extending this to Multi-modal Large Language Models (MLLMs) remains a major challenge.

#Review #Visual Reasoning #Image Understanding #Video Understanding #Multi-Agent System #Reinforcement Learning #Self-Evolving

March 23, 2026

[Paper Review] WorldAgents: Can Foundation Image Models be Agents for 3D World Models?

Recent 2D Foundation Models have demonstrated outstanding high-fidelity image generation and deep semantic understanding through Text-to-Image Diffusion.

#Review #3D World Generation #Foundation Models #Multi-Agent System #Vision-Language Models #3D Consistency #Gaussian Splatting

March 22, 2026

[Paper Review] BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation

This paper introduces a new task, 'Seamless Brand Integration', to expand the commercial potential of text-to-video (T2V) generation models.

#Review #Text-to-Video Generation #Multi-Agent System #Brand Integration #Prompt Engineering #Large Language Models (LLMs) #LoRA Fine-tuning #Contextual Adaptation

March 10, 2026

[Paper Review] Mozi: Governed Autonomy for Drug Discovery LLM Agents

This work addresses the tool-use hallucinations, irreproducibility, and lack of long-term reliability that unconstrained LLM agents face in high-stakes scientific domains such as drug discovery.

#Review #LLM Agents #Drug Discovery #Governed Autonomy #Multi-Agent System #Workflow Orchestration #Human-in-the-Loop #Computational Biology #Reproducibility

March 5, 2026

[Paper Review] CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

This work aims to address hallucinated references: plausible-looking citations generated by large language models (LLMs) that do not actually exist.

#Review #LLM Hallucination #Citation Verification #Multi-Agent System #Benchmark #Fact Checking #Scientific Integrity #Information Retrieval #Qwen3-VL

March 1, 2026

[Paper Review] LongVideoAgent: Multi-Agent Reasoning with Long Videos

This paper aims to address the information loss from compression, the limited tool sets, and the weak fine-grained temporal reasoning that existing MLLMs (Multimodal Large Language Models) exhibit on long videos.

#Review #Multi-Agent System #Long Video Understanding #Video Question Answering #Reinforcement Learning #Large Language Models #Temporal Grounding #Multimodal Reasoning #Tool-Augmented AI

December 23, 2025

[Paper Review] DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

This paper addresses the fragmentation and lack of standardization of high-quality data preparation pipelines for large language models (LLMs). In particular, it aims to build a unified, extensible, LLM-driven data preparation framework that effectively supports LLM-based data synthesis and iterative semantic refinement.

#Review #LLM Data Preparation #Workflow Automation #Data-Centric AI #Synthetic Data #Multi-Agent System #Framework #Reproducibility

December 22, 2025

[Paper Review] Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

This paper aims to overcome the bottleneck caused by context-length limits when large reasoning models (LRMs) solve extremely difficult mathematics problems at the level of the International Mathematical Olympiad (IMO).

#Review #Mathematical Reasoning #Long-Horizon Reasoning #Multi-Agent System #Reinforcement Learning #Olympiad Problems #Lemma Memory #Context Length #OREAL-H

December 11, 2025

[Paper Review] Asking like Socrates: Socrates helps VLMs understand remote sensing images

This work addresses the 'pseudo reasoning' problem that existing Vision-Language Models (VLMs) exhibit when analyzing remote sensing (RS) images.

#Review #Remote Sensing #Vision-Language Models #Iterative Reasoning #Evidence-Seeking #Socratic Method #Reinforcement Learning #Multi-Agent System #VQA #Grounding

December 1, 2025

[Paper Review] SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System

This paper aims to overcome the limitations of existing multimodal large language models (MLLMs) and video agent systems in scientific video understanding and education. In particular, it seeks to improve model performance and reliability in scientific domains that require integrating external expert knowledge and rigorous step-by-step reasoning.

#Review #Multi-Agent System #Video Understanding #Scientific Education #Deming Cycle #Large Language Models #Iterative Optimization #Knowledge Integration #Educational Content Generation

November 25, 2025

[Paper Review] MADD: Multi-Agent Drug Discovery Orchestra

The goal is to address the enormous resources required for hit molecule identification in early-stage drug development, along with the complexity and poor accessibility of existing AI methods.

#Review #Multi-Agent System #Drug Discovery #LLM #Hit Identification #Virtual Screening #Generative AI #Property Prediction #Automated Machine Learning

November 12, 2025

[Paper Review] The Station: An Open-World Environment for AI-Driven Discovery

This paper introduces The Station, an open-world multi-agent environment for AI-driven autonomous scientific discovery that goes beyond existing rigid optimization paradigms.

#Review #Multi-Agent System #Open-World Environment #Scientific Discovery #AI-Driven Research #Large Language Models #Emergent Behavior #State-of-the-Art (SOTA)

November 10, 2025

[Paper Review] D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents

This paper addresses problems of existing GUI agents such as data bottlenecks, the high cost of delayed error detection, and contradictory guidance.

#Review #Mobile GUI Automation #Multi-Agent System #Cognitive Architecture #Pre-execution Alignment #Post-execution Reflection #Retrieval-Augmented Generation #Multimodal LLM #Deliberative AI

September 29, 2025

[Paper Review] Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution

This paper aims to address the confused action sequencing and excessive trial and error of existing browser agents when performing tasks that follow multi-turn, long-horizon trajectories on real-world web pages.

#Review #Multi-Agent System #Browser Automation #Web Reconnaissance #Tool Generation #Task Execution #Self-Evolving AI #LLM/VLM #VisualWebArena

September 26, 2025

[Paper Review] Interactive Recommendation Agent with Active User Commands

This paper addresses the gap between user intent and system interpretation that arises because the passive feedback mechanisms of existing recommender systems fail to accurately capture users' subtle intentions and satisfaction.

#Review #Interactive Recommendation #Large Language Models #Multi-Agent System #Natural Language Processing #Knowledge Distillation #User Control

September 26, 2025

[Paper Review] DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks

To overcome the data leakage risk and unrealistic evaluation methods of existing benchmarks, this paper proposes DeepResearch Arena, a new benchmark for evaluating the real research abilities of large language model (LLM)-based research agents.

#Review #LLM Evaluation #Research Agents #Benchmark #Multi-Agent System #Seminar-Grounded Tasks #Data Leakage Prevention #Ill-Structured Problems

September 5, 2025

[Paper Review] Spacer: Towards Engineered Scientific Inspiration

Spacer aims to overcome the limited creativity and context dependence of existing LLMs in order to generate creative, fact-grounded scientific concepts without external intervention.

#Review #Scientific Discovery #Large Language Models (LLMs) #Decontextualization #Keyword Graph #Multi-Agent System #Scientific Ideation #Research Automation #Inspiration Engine

August 27, 2025

[Paper Review] AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving

This work aims to address the loss of system reliability and accuracy caused by extended context and noisy or irrelevant tool outputs when large language model (LLM)-based agents use external tools, and to improve the stability and robustness of agent-based systems.

#Review #Multi-Agent System #Agent Stability #LLM #Tool Use #GAIA Benchmark #Robustness #Dynamic Supervision #Maneuvering

August 14, 2025

[Paper Review] Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling

This study addresses the problem that existing vision-language models (VLMs) are constrained in parameter scale, lack robust self-correction, and perform poorly on document-based tasks that require long visual context and complex reasoning.

#Review #Visual Document Understanding #Visual Question Answering #Multi-Agent System #Test-Time Scaling #Self-Correction #Mixed Reward Modeling #Large Language Models

August 8, 2025

[Paper Review] CellForge: Agentic Design of Virtual Cell Models

This paper addresses the autonomous construction of virtual cell models, which is made difficult by complex biological systems, heterogeneous data modalities, and the need for multidisciplinary expertise.

#Review #AI Scientist #Multi-Agent System #Virtual Cell Modeling #Single-Cell Perturbation Prediction #Deep Learning #Automated Model Design #Code Generation #Retrieval-Augmented Generation

August 5, 2025

[Paper Review] SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution

This paper addresses the 'limited observation scope' problem of large language model (LLM)-based software issue resolution systems.

#Review #Multi-Agent System #Software Engineering #Fault Localization #Issue Resolution #Large Language Models #Competitive Debate #Graph Traversal

August 4, 2025

[Paper Review] From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning

This paper aims to go beyond simply predicting 'what' in chemical reaction condition recommendation and to provide explainable evidence for 'why' particular conditions are appropriate.

#Review #Multi-Agent System #Chemical Reaction Prediction #Explainable AI #Evidence-Based Reasoning #Large Language Models #Tool-Augmented LLMs #Scientific Discovery

October 10, 2025

[Paper Review] MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline

Current machine learning engineering (MLE) benchmarks rely on manual curation, which makes them hard to scale and limits their applicability. To address this, the study aims to develop a framework that automatically generates high-quality, scalable MLE tasks for LLM (Large Language Model) agents.

#Review #MLE (Machine Learning Engineering) #Automated Task Generation #Multi-Agent System #LLM Agents #Benchmark #Data Curation #Hybrid Verification #Kaggle

October 9, 2025

[Paper Review] Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1

This paper proposes and tackles a new task: automatically generating high-quality, interactive project webpages from academic papers.

#Review #Human-Agent Collaboration #Project Page Generation #Multi-Agent System #LLM #VLM #Webpage Automation #PageBench #Scientific Communication #Cost-Effective AI

October 24, 2025

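[Paper Review] FinSight: Towards Real-World Financial Deep Research

This paper aims to solve the problem of generating professional financial reports, which existing AI systems have struggled to fully automate. In particular, it proposes FinSight, a framework that generates high-quality multimodal financial reports, performing labor-intensive and intellectually demanding financial research report work at the level of human experts.

#Review #Financial Research #Multi-Agent System #Code Generation #Multimodal Reports #Iterative Visualization #Variable Memory #Deep Learning

October 23, 2025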

[Paper Review] VISTA: A Test-Time Self-Improving Video Generation Agent

This paper addresses the problem that text-to-video (T2V) generation models are highly sensitive to user prompts, so obtaining high-quality video requires repeated prompt revision and filtering.

#Review #Text-to-Video Generation #Prompt Optimization #Multi-Agent System #Test-Time Improvement #MLLM-as-a-Judge #Video Evaluation #Audio-Video Synthesis

October 20, 2025

[Paper Review] JoyAgent-JDGenie: Technical Report on the GAIA

This paper raises the problem that LLM-based agent systems lack robustness, adaptability, and reproducibility when solving complex real-world tasks. This is because existing systems have focused only on individual aspects such as expanding toolkits or improving prompts, leaving no unified framework.

#Review #Generalist Agent #Multi-Agent System #Plan-Execute #ReAct #Hierarchical Memory #Tool Integration #GAIA Benchmark #LLM Agent

October 2, 2025