#Agentic AI

104 posts

[Paper Review] The Hitchhiker's Guide to Agentic AI: From Foundations to Systems

Aimed at researchers and practitioners who want to understand and build the full stack of modern AI systems, this guide gives an integrated account that runs from the foundational architecture of LLMs to autonomous agentic systems.

#Review #LLM #Reinforcement Learning #Agentic AI #System Architecture #Retrieval-Augmented Generation #Chain-of-Thought #Multi-Agent Systems

June 24, 2026

[Paper Review] AGORA: An Archive-Grounded Benchmark for Agentic Workplace Document Reasoning

This paper proposes AGORA to evaluate the archive-grounded reasoning that modern LLM-based agents need in order to carry out real knowledge work over large internal enterprise document archives.

#Review #Agentic AI #Document Reasoning #Archive-Grounded #Benchmark #Multi-Hop QA #Workplace Automation

June 23, 2026

[Paper Review] Skill-3D: Evolving Scene-Aware Skills for Agentic 3D Spatial Reasoning

This paper addresses the performance degradation that arises when existing MLLM-based agents ignore scene characteristics in 3D spatial reasoning tasks and apply a one-size-fits-all tool-use strategy.

#Review #Agentic AI #3D Spatial Reasoning #Scene-Aware Skills #Tool Learning #Skill Evolution

June 8, 2026

[Paper Review] The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence

This paper addresses the efficiency and cost bottlenecks that arise as large language models (LLMs) shift to long-horizon agentic workflows, along with the difficulty of solving intrinsically complex, high-stakes tasks.

#Review #Mixture-of-Experts (MoE) #Mini Activations #Agentic AI #Self-Evolution #Reinforcement Learning (RL) #Multi-Token Prediction (MTP)

May 26, 2026

[Paper Review] TOBench: A Task-Oriented Omni-Modal Benchmark for Real-World Tool-Using Agents

This paper aims to close the gap between the ability of agents to carry out complex, real-world professional workflows and the existing benchmarks that evaluate them.

#Review #Agentic AI #Omni-modal #Tool-using Agents #Model Context Protocol #Closed-loop Verification #Benchmark

May 18, 2026

[Paper Review] Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis

This paper addresses the spatial ambiguity of existing text-driven 3D generation methods, as well as the infinite loops and instability that existing agentic frameworks run into during holistic room generation.

#Review #Agentic AI #3D Room Synthesis #MLLM #Blender Code #Execution Harness #Cross-stage Memory #Top-down View

May 18, 2026

[Paper Review] Code as Agent Harness

This paper points out that, in LLM-based agent systems, code is moving beyond being a mere target artifact of generation and becoming the system's core operational infrastructure.

#Review #Agent Harness #Coding Agent #Harness Engineering #Agentic AI #Code-as-Agent-Harness #Executable Verification

May 18, 2026

[Paper Review] RewardHarness: Self-Evolving Agentic Post-Training

This paper addresses the high cost and limited flexibility of existing reward modeling approaches, which depend on large-scale human feedback data.

#Review #Reward Modeling #Agentic AI #Self-Evolution #Multimodal Evaluation #In-Context Learning #Reinforcement Learning

May 14, 2026

[Paper Review] Results and Retrospective Analysis of the CODS 2025 AssetOpsBench Challenge

This paper deals with the methodological problem of evaluating whether LLM-based agents deliver real capability in complex industrial environments. Existing benchmarks either rely on oversimplified tasks or fail to adequately measure the privacy protection and multi-step execution abilities that real-world practice requires.

#Review #Agentic AI #Industry 4.0 #Benchmarking #Privacy-preserving #Multi-agent systems #Performance Evaluation #AssetOpsBench

May 13, 2026

[Paper Review] AI Co-Mathematician: Accelerating Mathematicians with Agentic AI

This paper proposes the AI co-mathematician, which provides a stateful workflow to support the complex, iterative real-world process of mathematical research.

#Review #Agentic AI #Mathematical Research #Interactive Workspace #Workstream #Stateful Workflow #Uncertainty Management #FrontierMath

May 7, 2026

[Paper Review] Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

This paper addresses the skill retrieval bottleneck and the incomplete skill retrieval that arise as agent skill libraries grow to thousands of skills or more. The existing Vanilla Skills approach places the entire library in the prompt, which overloads the context window, raises token costs, and degrades performance.

#Review #Agentic AI #Skill Retrieval #Graph-based Retrieval #Structural Dependency #Personalized PageRank #LLM Agents

April 9, 2026

[Paper Review] HDP: A Lightweight Cryptographic Protocol for Human Delegation Provenance in Agentic AI Systems

This paper sets out to close the structural accountability gap that arises in agentic AI systems.

#Review #Agentic AI #Delegation Provenance #Cryptographic Authorization #Multi-agent Systems #Ed25519 #Human-in-the-loop Security #IETF

April 6, 2026

[Paper Review] ASI-Evolve: AI Accelerates AI

To resolve the bottlenecks facing modern AI research (high cost, long-horizon tasks, and opaque research loops), this paper proposes ASI-Evolve, in which AI advances AI itself.

#Review #Agentic AI #Autonomous Scientific Discovery #Neural Architecture Design #Pretraining Data Curation #Reinforcement Learning

April 2, 2026

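[Paper Review] Gen-Searcher: Reinforcing Agentic Search for Image Generation

State-of-the-art text-to-image generation models deliver striking visual quality, but they share a fundamental limitation: they rely on fixed knowledge acquired during training. In particular, when a prompt requires up-to-date information or is knowledge-intensive, the model generates images without the correct visual references, leading to factual errors or visual distortions.

#Review #Agentic AI #Image Generation #Multi-hop Search #Reinforcement Learning #Grounded Generation #Multimodal Agent

March 30, 2026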

[Paper Review] EpochX: Building the Infrastructure for an Emergent Agent Civilization

Current AI agent research concentrates on raising the intelligence of individual agents, but creating real economic value requires infrastructure that organizes them and makes them collaborate. Many existing agent platforms treat task execution as a one-off event, so outputs are lost rather than accumulated.

#Review #Agentic AI #Marketplace Infrastructure #Credit Mechanism #Human-Agent Collaboration #Persistent Ecosystem Assets

March 30, 2026

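[Paper Review] RetroAgent: From Solving to Evolving via Retrospective Dual Intrinsic Feedback

This paper aims to enable LLM-based agents to move beyond static problem solving toward continuous adaptation and evolution in complex interactive environments. It seeks to address the inefficient learning and fragile generalization that result from insufficient exploration in the existing RL paradigm and from the implicit nature of learned knowledge.

#Review #Reinforcement Learning #Large Language Models #Self-Reflection #Intrinsic Feedback #Continuous Adaptation #Memory Retrieval #Agentic AI #GRPO

March 11, 2026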

[Paper Review] OpenClaw-RL: Train Any Agent Simply by Talking

This paper proposes a framework in which AI agents learn continuously in real time from next-state signals such as user feedback, tool execution results, and GUI state changes.

#Review #Reinforcement Learning (RL) #Agentic AI #Online Learning #Next-State Signals #Process Reward Models (PRM) #On-Policy Distillation (OPD) #Multi-Modal Agents

March 11, 2026

[Paper Review] Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

The paper points out that existing video understanding datasets are dominated by short clips and fail to reflect natural, long-term daily life, and it aims to rigorously define a true Multimodal Lifelong Understanding task.

#Review #Multimodal Lifelong Understanding #Video Dataset #Agentic AI #Dynamic Memory Management #Long-Context MLLMs #Temporal Reasoning #Concept Drift

March 5, 2026

[Paper Review] APRES: An Agentic Paper Revision and Evaluation System

This paper proposes APRES, a new agentic system for resolving inconsistent feedback in the scientific peer review process and improving the quality and impact of papers.

#Review #Large Language Models #Peer Review #Automated Revision #Citation Prediction #Agentic AI #Rubric Discovery #Scholarly Communication

March 3, 2026

[Paper Review] Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

This paper targets two main problems of existing deep research agents: high inference cost and latency, and poor generalization across heterogeneous research environments. In particular, it seeks to improve efficiency and generalization at the same time on long-horizon agentic search tasks.

#Review #Agentic AI #Long-Horizon Search #Parallel Execution #Data Synthesis #Reinforcement Learning #Generalization #Efficiency #LLM Agent

February 26, 2026

[Paper Review] PyVision-RL: Forging Open Agentic Vision Models via RL

This paper aims to resolve the interaction collapse that occurs during reinforcement learning of agentic multimodal models and, through stable training, to sustain tool use and multi-turn reasoning. It focuses on open-weight multimodal models for image and video understanding tasks.

#Review #Agentic AI #Multimodal Models #Reinforcement Learning #Dynamic Tooling #Interaction Stability #Video Reasoning #Visual Language Models #Rollout Optimization

February 24, 2026

[Paper Review] OCR-Agent: Agentic OCR with Capability and Memory Reflection

The goal is to address the problem that Large Vision-Language Models (VLMs) cannot independently correct their cognitive biases on complex visual understanding tasks and instead fall into repetitive, inefficient revision loops that fail to improve answer quality reliably.

#Review #OCR #VLM #Self-Correction #Agentic AI #Capability Reflection #Memory Reflection #Iterative Refinement #Chain-of-Thought

February 24, 2026

[Paper Review] GLM-5: from Vibe Coding to Agentic Engineering

This paper aims to move AI models beyond reliance on human instructions (vibe coding) toward an Agentic Engineering paradigm capable of autonomous planning, implementation, and iteration.

#Review #Foundation Model #Agentic AI #Reinforcement Learning #Sparse Attention #Software Engineering #Long-Context Models #GPU Optimization

February 17, 2026

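[Paper Review] DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories

This paper aims to address the way the existing paradigm of retrieving images independently overlooks the complex contextual dependencies within visual histories. It recasts image retrieval as an autonomous exploration task and presents a new agentic paradigm in which a model performs multi-step reasoning over raw visual histories to locate targets based on implicit contextual cues.

#Review #Multimodal Agents #Image Retrieval #Context-Aware #Visual Histories #Benchmarking #Vision-Language Models #Agentic AI

February 16, 2026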

[Paper Review] Self-EvolveRec: Self-Evolving Recommender Systems with LLM-based Directional Feedback

It seeks to address the problem that existing code-evolution frameworks for recommender systems rely only on scalar metrics (NDCG, Hit Ratio), which provide no diagnostic insight, and are confined to a fixed search space, which limits innovation.

#Review #Recommender System #LLM-based Code Evolution #Directional Feedback #User Simulator #Model Diagnosis Tool #Agentic AI #AutoML

February 15, 2026

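[Paper Review] Intelligent AI Delegation

This paper seeks to overcome the limitations of existing approaches to AI task decomposition and delegation, namely simple heuristics and fragility under environmental change.

#Review #AI Delegation #Multi-agent Systems #Task Decomposition #Agentic AI #Trust & Safety #LLM #Adaptive Coordination

February 15, 2026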

[Paper Review] Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

This paper introduces Step 3.5 Flash, a 196B Mixture-of-Experts (MoE) model with 11B active parameters, and aims to close the gap between frontier agentic intelligence and compute efficiency.

#Review #Mixture-of-Experts (MoE) #Sparse Models #Inference Efficiency #Hybrid Attention #Multi-Token Prediction (MTP) #Reinforcement Learning (RL) #Agentic AI #Long-Context Understanding

February 11, 2026

[Paper Review] EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies

This paper aims to address the problem that existing frameworks for evaluating the long-horizon plan-and-execute abilities of LLM-based agents are short-horizon, domain-specific, and insufficiently grounded in realistic economic dynamics.

#Review #LLM Evaluation #Long-Horizon Planning #Interactive Economies #Benchmark #Agentic AI #Economic Simulation #Plan-and-Execute

February 11, 2026

[Paper Review] P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

This paper aims to overcome the limitations of existing text-based models by developing an open Vision-Language Model (VLM) that integrates visual information with scientific reasoning to solve complex, Physics Olympiad-level problems.

#Review #Vision-Language Models #Reinforcement Learning #Curriculum Learning #Physics Olympiads #Scientific Reasoning #Agentic AI #Multimodal AI #Physics

February 10, 2026

[Paper Review] Chain of Mindset: Reasoning with Adaptive Cognitive Modes

It addresses the limitation that the fixed, single-mindset reasoning of existing large language models (LLMs) cannot meet the heterogeneous cognitive demands of the different stages of problem solving. The work aims to raise LLM problem-solving ability to a next-generation level of intelligence by flexibly orchestrating mindsets that adapt at each step.

#Review #Adaptive Reasoning #Cognitive Modes #Large Language Models (LLMs) #Agentic AI #Multimodal Reasoning #Mindset Orchestration #Contextual Filtering #Training-free Framework

February 10, 2026

[Paper Review] Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

This paper addresses the shortage of diverse, reliable environments for training large language model (LLM)-based agents.

#Review #Agentic AI #Reinforcement Learning #Synthetic Environments #Tool-Use Agents #World Model #Database-Backed Simulation #LLM-powered Agents

February 10, 2026

[Paper Review] Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling

This paper aims to overcome the limitations of existing image editing models and develop a high-quality, native-resolution image editing system that supports professional workflows.

#Review #Image Editing #Agentic AI #Multi-turn Interaction #High-Fidelity #Native Resolution #LLM #Context Folding #Layer Decomposition

February 10, 2026

[Paper Review] InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

This paper aims to overcome the limitations of existing AI scientist systems (domain-specific design, incomplete reasoning ability, inefficient optimization pipelines, and weak long-horizon autonomous operation) and to develop InternAgent-1.5, a unified agentic framework for end-to-end scientific discovery across computational and empirical domains.

#Review #Agentic AI #Scientific Discovery #Long-Horizon Reasoning #Structured Memory #Knowledge Graph #Experimental Optimization #Multi-disciplinary

February 9, 2026

[Paper Review] V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval

It addresses the reasoning errors that arise in visually ambiguous cases because existing MLLM-based retrieval systems rely on static visual encodings and cannot actively verify visual evidence. It aims to improve the accuracy and reliability of universal multimodal retrieval through an evidence-driven agentic reasoning process grounded in visual inspection.

#Review #Multimodal Retrieval #Agentic AI #Large Language Models (LLMs) #Visual Tools #Chain-of-Thought (CoT) #Reinforcement Learning #Curriculum Learning #Evidence-Driven Reasoning

February 5, 2026

[Paper Review] ProAct: Agentic Lookahead in Interactive Environments

ProAct aims to overcome the limits LLM agents face in long-sequence decision making in interactive environments, especially those caused by compounding simulation errors and high-variance value estimates. In doing so, it seeks accurate multi-turn prediction and stable policy optimization for agents.

#Review #Agentic AI #Large Language Models #Reinforcement Learning #Lookahead Reasoning #Monte-Carlo Tree Search #Supervised Fine-Tuning #Value Estimation #Simulation Drift

February 5, 2026

[Paper Review] Vibe AIGC: A New Paradigm for Content Generation via Agentic Orchestration

This paper aims to address the limitations of generative AI (AIGC), a field dominated by the model-centric paradigm for the past decade, and in particular the Intent-Execution Gap.

#Review #Agentic AI #Content Generation #Orchestration #Vibe Coding #Meta-Planner #Human-in-the-Loop #Intent-Execution Gap

February 4, 2026

[Paper Review] daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently

This paper addresses the shortage of high-quality training data needed to extend large language models (LLMs), which perform well on short-horizon tasks, to complex, realistic long-horizon agentic workflows.

#Review #Long-Horizon Agency #Data Synthesis #Pull Request Chains #Software Evolution #LLM Training #Agentic AI #Self-Distillation #Code Generation

February 3, 2026

[Paper Review] Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

This paper addresses two problems of existing multimodal deep research MLLMs: the hit-rate problem (noise and instability in search engines) and limited reasoning depth and search breadth.

#Review #Multimodal Large Language Models #Deep Research #Agentic AI #Tool Use #Visual Question Answering #Reinforcement Learning #Multi-scale Search

February 2, 2026

[Paper Review] RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

This paper proposes the RLAnything framework, which dynamically builds the environment, policy, and reward model through closed-loop optimization in order to amplify learning signals and strengthen the overall RL system in LLM and agentic scenarios.

#Review #Reinforcement Learning #Large Language Models #Agentic AI #Reward Modeling #Environment Adaptation #Closed-loop Optimization #Multimodal Agents

February 2, 2026

[Paper Review] Robust Tool Use via Fission-GRPO: Learning to Recover from Execution Errors

This paper addresses the problem that large language models (LLMs), especially small ones, are fragile in the face of execution errors during multi-turn tool execution and fall into repeated invalid calls once an error occurs.

#Review #Tool Use #Execution Errors #Error Recovery #Reinforcement Learning #LLMs #Agentic AI #GRPO #FISSION

February 1, 2026

[Paper Review] Language-based Trial and Error Falls Behind in the Era of Experience

The goal is to address the poor performance of Large Language Models (LLMs) in novel, non-language-based environments (e.g., symbolic and spatial tasks).

#Review #Large Language Models #Reinforcement Learning #Exploration Efficiency #Sub-Scale Collaboration #Out-of-Distribution Tasks #Agentic AI #Supervised Fine-Tuning

January 29, 2026

[Paper Review] Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning

This paper aims to solve the inefficient exploration that arises when large language model (LLM)-based agents carry out long-horizon tasks. Existing RL methods allocate compute uniformly across intermediate steps, which wastes resources on unimportant steps and fails to secure high-quality trajectories.

#Review #Agentic AI #Reinforcement Learning #Long-Horizon Tasks #Dynamic Branching #Strategic Exploration #LLM Agents #Sample Efficiency #Policy Optimization

January 28, 2026

[Paper Review] The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation

It aims to address the failure of computer vision models to maintain long narrative coherence and the "semantic gap" between high-level ideas such as dialogue and their cinematic execution. In particular, it seeks to develop an end-to-end framework that automatically generates long, coherent cinematic videos from dialogue input alone.

#Review #Dialogue-to-Video Generation #Agentic AI #Cinematic Scripting #Long-Horizon Video Synthesis #Visual Coherence #Reinforcement Learning #Multimodal LLM

January 26, 2026

[Paper Review] Agentic Very Long Video Understanding

This paper aims to tackle the challenge of very long video understanding that always-on personal AI assistants require.

#Review #Long-Horizon Video Understanding #Agentic AI #Entity Graph #Multimodal Reasoning #Video Question Answering #EgoLifeQA #Retrieval Augmented Generation

January 26, 2026

[Paper Review] LongCat-Flash-Thinking-2601 Technical Report

This paper aims to overcome the limitations of existing models on agentic tasks that require long-horizon interaction and reasoning by developing LongCat-Flash-Thinking-2601, an open-source Mixture-of-Experts (MoE) large language model with strong agentic reasoning ability.

#Review #Agentic AI #Large Language Models (LLMs) #Mixture-of-Experts (MoE) #Reinforcement Learning (RL) #Context Management #Scalable Training #Test-Time Reasoning #Open-Source Model

January 25, 2026

[Paper Review] XR: Cross-Modal Agents for Composed Image Retrieval

The goal is to overcome the limits of the existing similarity-based paradigm in Composed Image Retrieval (CIR) in the AI era and to improve the cross-modal reasoning needed to combine a reference image with textual modifications.

#Review #Composed Image Retrieval #Cross-Modal Agents #Multimodal Reasoning #Training-free Framework #Information Retrieval #Agentic AI #Progressive Retrieval

January 21, 2026

[Paper Review] FARE: Fast-Slow Agentic Robotic Exploration

This work addresses the difficulty that existing methods in autonomous robotic exploration have in using long-term information and adapting to environmental change.

#Review #Robotic Exploration #LLM #Reinforcement Learning #Fast-Slow Thinking #Hierarchical Planning #Agentic AI #Graph Reasoning

January 21, 2026

[Paper Review] Aligning Agentic World Models via Knowledgeable Experience Learning

This paper addresses the "physical hallucinations" that affect agentic world models built on large language models (LLMs).

#Review #Agentic AI #World Models #Experience Learning #LLMs #Physical Hallucinations #Embodied AI #Predictive Coding #Knowledge Repository

January 20, 2026

[Paper Review] Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering

This paper addresses the core bottleneck of ultra-long-horizon autonomy in agentic science.

#Review #Agentic AI #Long-Horizon Autonomy #Cognitive Accumulation #Hierarchical Cognitive Caching (HCC) #Context Management #Machine Learning Engineering (MLE) #LLM Agents

January 15, 2026

[Paper Review] DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation

This paper addresses the difficulty of evaluating the long, complex reports that deep research systems generate. Existing benchmarks require heavy manual annotation, depend on fixed evaluation dimensions, or cannot reliably verify uncited facts.

#Review #Agentic AI #Deep Research Systems #Automated Evaluation #Task Construction #Fact-Checking #LLM Benchmarking #Adaptive Evaluation

January 14, 2026

[Paper Review] Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

This paper addresses the limitations of existing video question answering benchmarks, namely their closed-evidence settings and reliance on text-based retrieval.

#Review #Video Question Answering #Open-domain Search #Multimodal LLMs #Agentic AI #Benchmark #Video Understanding #Multi-hop Reasoning

January 12, 2026

[Paper Review] MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era

This paper addresses the limitations of existing infrastructure for large-scale training and evaluation of interactive, autonomous AI agents.

#Review #Agentic AI #Distributed Orchestration #Scalability #Cloud-Native #Reinforcement Learning #Software Engineering Agents #Resource Management

January 12, 2026

[Paper Review] DocDancer: Towards Agentic Document-Grounded Information Seeking

This work addresses the inefficient tool use of existing Document Question Answering (DocQA) agents and their dependence on closed-source models.

#Review #Agentic AI #Document Question Answering #Tool-use #Information Seeking #Synthetic Data Generation #Long-context Understanding #Multimodal Documents

January 8, 2026

[Paper Review] MiMo-V2-Flash Technical Report

This paper aims to develop MiMo-V2-Flash, an efficient, cost-effective large language model (LLM) that combines fast inference with strong reasoning and agentic capabilities.

#Review #Mixture-of-Experts #Sliding Window Attention #Multi-Token Prediction #Multi-Teacher On-Policy Distillation #Reinforcement Learning #Long-Context Modeling #Agentic AI

January 6, 2026

[Paper Review] Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

This paper aims to give lightweight LLMs native agentic intelligence while preserving high computational efficiency. In particular, rather than relying on conventional distillation, it focuses on having a sub-2B model systematically learn reasoning and planning abilities from scratch.

#Review #Lightweight LLM #Agentic AI #Pre-training #Multi-Latent Attention #Long-Context #Curriculum Learning #Agentic Mid-training #Instruction Tuning

December 31, 2025

[Paper Review] Video-BrowseComp: Benchmarking Agentic Video Research on Open Web

This paper addresses the fact that existing benchmarks focus on text and static multimodal information seeking and overlook dynamic web video content.

#Review #Agentic AI #Video Understanding #Web Browsing #Benchmark #Multimodal LLMs #Temporal Grounding #Cross-Source Reasoning #Information Seeking

December 29, 2025

[Paper Review] GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training

It aims to address the main problems in training Vision-Language Model (VLM) agents with multi-turn reinforcement learning (RL): sparse rewards, long-horizon credit assignment, and the high cost and limited accessibility that come from using external teacher models in existing methods such as Guided Thought Reinforcement (GTR).

#Review #Multi-turn Reinforcement Learning #Vision-Language Models (VLMs) #Agentic AI #Knowledge Distillation #Model Merging #PPO #Thought Guidance #Cost Efficiency

December 25, 2025

[Paper Review] Step-DeepResearch Technical Report

This paper tackles the problem of building robust autonomous agents that can carry out Deep Research, that is, open-ended, long-horizon, complex information-seeking tasks.

#Review #Deep Research Agents #LLMs #Reinforcement Learning #Supervised Fine-tuning #Agentic AI #Multi-hop Reasoning #Benchmarking #Cost-effectiveness

December 23, 2025

[Paper Review] INTELLECT-3: Technical Report

This paper aims to address the complexity and scalability limits of existing open-source RL infrastructure for LLMs and to achieve state-of-the-art performance with INTELLECT-3, a 106B-parameter Mixture-of-Experts (MoE) model.

#Review #Reinforcement Learning #Large Language Models #Mixture-of-Experts #Asynchronous Training #Distributed Systems #Agentic AI #Code Execution #Model Evaluation

December 23, 2025

[Paper Review] Adaptation of Agentic AI

This paper aims to consolidate the fast-growing research area of adaptation in agentic AI systems into a systematic framework and to offer a unified view that covers both agent adaptation and tool adaptation.

#Review #Agentic AI #Adaptation #Agent Adaptation #Tool Adaptation #Reinforcement Learning #Fine-tuning #Modular AI

December 18, 2025

[Paper Review] A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning

This paper addresses the difficulty that existing end-to-end affordance prediction models, in which high-level reasoning and low-level grounding are tightly coupled, have in generalizing to novel objects and complex instructions.

#Review #Affordance Prediction #Zero-Shot Learning #Agentic AI #Foundation Models #Multimodal Reasoning #Visual Grounding #Image Generation #Robotics

December 16, 2025

[Paper Review] Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task

This paper aims to address the difficulty that existing Multimodal Large Language Models (MLLMs) have, on complex Video Question Answering (VideoQA) tasks, in modeling spatiotemporal relationships and understanding the causal dynamics of temporal evolution.

#Review #VideoQA #MLLMs #Tool Learning #Spatiotemporal Reasoning #Video Toolkit #Agentic AI

December 11, 2025

[Paper Review] Thinking with Images via Self-Calling Agent

This paper addresses the difficulty of optimizing the Interleaved Multimodal Chain-of-Thought (iMCoT) of MLLMs with reinforcement learning, which stems from the scarcity of high-quality reasoning data.

#Review #Multimodal LLMs #Self-Calling Chain-of-Thought #Reinforcement Learning #Visual Reasoning #Agentic AI #Tool Calling #Group Relative Policy Optimization

December 11, 2025

[Paper Review] ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

This paper aims to address the hallucination, weak visual grounding, and lack of tool use for verification that affect existing multimodal Reward Models (RMs).

#Review #Multimodal Reward Models #Agentic AI #Tool Use #Reinforcement Learning #Visual Reasoning #Multimodal LLMs #Instruction Following #Evaluation Benchmarks

December 4, 2025

[Paper Review] Qwen3-VL Technical Report

Qwen3-VL aims to be the most capable Vision-Language Model (VLM) in the Qwen series to date and to achieve strong performance across a broad range of multimodal benchmarks.

#Review #Vision-Language Model #Multimodal Reasoning #Long-Context #Interleaved Data #Mixture-of-Experts #DeepStack #Agentic AI

December 3, 2025

[Paper Review] DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

This paper introduces DeepSeek-V3.2 with the aim of narrowing the performance gap between open-source and commercial large language models (LLMs).

#Review #Large Language Models #Sparse Attention #Reinforcement Learning #Agentic AI #Tool Use #Open-source LLM #DeepSeek

December 2, 2025

[Paper Review] Agentic Policy Optimization via Instruction-Policy Co-Evolution

This paper focuses on the problem that fixed, manually designed instructions hold back optimal performance in the reinforcement learning (RL) of LLM-based agents.

#Review #Reinforcement Learning #Large Language Models #Instruction Optimization #Policy Co-Evolution #Agentic AI #Tool-Integrated Reasoning #Self-Reflection

December 1, 2025

[Paper Review] Geometrically-Constrained Agent for Spatial Reasoning

This paper addresses the semantic-to-geometric gap that Vision Language Models (VLMs) face in spatial reasoning.

#Review #Spatial Reasoning #Vision Language Models (VLMs) #Geometric Constraints #Agentic AI #Tool Integration #Semantic-to-Geometric Gap #Task Formalization

November 30, 2025

[Paper Review] MIRA: Multimodal Iterative Reasoning Agent for Image Editing

This paper aims to address the semantic drift and editing failures that occur when diffusion-based image editing models fail to interpret complex user instructions accurately (compositional relationships, contextual cues, referring expressions, and so on).

#Review #Image Editing #Multimodal AI #Iterative Reasoning #Agentic AI #Reinforcement Learning #Diffusion Models #Vision-Language Models #Instruction Following

November 27, 2025

[Paper Review] Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs

This work addresses the limitations VLMs face in multi-step visual interaction and effective tool-integrated reasoning. In particular, it aims to overcome existing VLMs' weak ability to select, invoke, and coordinate tools, and to systematically improve their tool-integrated visual reasoning through a scalable training environment and agentic learning strategies.

#Review #Vision-Language Models (VLMs) #Reinforcement Learning (RL) #Tool-Integrated Reasoning (TIR) #Agentic AI #VQA #Training Environment #Behavioral Cloning #Policy Optimization

November 25, 2025

[Paper Review] MedSAM3: Delving into Segment Anything with Medical Concepts

It aims to address the poor generalization and heavy manual annotation requirements of existing models in medical image segmentation, and to overcome the limitation of relying purely on geometric prompts.

#Review #Medical Image Segmentation #Segment Anything Model (SAM) #Promptable Concept Segmentation (PCS) #Multimodal Large Language Models (MLLMs) #Agentic AI #Domain Adaptation #Text-guided Segmentation

November 25, 2025

[Paper Review] Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution

This paper seeks to overcome the limitations of existing monolithic Vision-Language Models (VLMs) in precision, deterministic control, and handling of composite visual tasks.

#Review #Visual Agent #Multimodal Perception #Tool-Augmented LLM #Agentic AI #Visual Reasoning #Computer Vision #Structured Outputs #ReAct Framework

November 18, 2025

[Paper Review] P1: Mastering Physics Olympiads with Reinforcement Learning

This paper aims to advance large language models (LLMs) beyond puzzle solving toward science-grade reasoning, and in particular to improve their ability to solve complex Physics Olympiad problems. In doing so, it seeks to demonstrate genuine scientific reasoning in LLMs that respects physical reality and the strict constraints of natural law.

#Review #Reinforcement Learning #Large Language Models #Physics Reasoning #Agentic AI #Olympiad Problems #Post-Training #Knowledge Transfer

November 17, 2025

[Paper Review] MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism

The goal is to address the problem that multi-agent reasoning systems built on large language models (LLMs) are hard to generalize to open-source models because of reward noise and training inefficiency.

#Review #Multi-Agent Systems #Reinforcement Learning #LLMs #Pipeline Parallelism #Reasoning #Reward Shaping #Agentic AI

November 16, 2025

[Paper Review] DeepEyesV2: Toward Agentic Multimodal Model

This paper aims to build an agentic multimodal model that goes beyond simply understanding text and images: it actively invokes external tools such as code execution environments and web search and integrates these tool operations seamlessly into its reasoning process.

#Review #Agentic AI #Multimodal Models #Tool Use #Reinforcement Learning #Supervised Fine-tuning #Multimodal Reasoning #Web Search #Code Execution

November 9, 2025

[Paper Review] VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

This paper explores the uncharted territory of vision-centric coding for reasoning and action in the agentic era. Moving beyond the limited symbolic abstraction of conventional RGB pixel-based image representations, it aims to convert images into compact, interpretable, and executable visual representations such as SVG code.

#Review #Multimodal AI #Code Generation #SVG #Visual Representation #Benchmark #Large Vision-Language Models #Agentic AI #Reasoning

November 9, 2025

[Paper Review] The Collaboration Gap

It aims to identify and quantify the "Collaboration Gap": the lack of effective collaboration between independently developed agents in AI agent-based systems.

#Review #AI Collaboration #Multi-Agent Systems #Large Language Models (LLMs) #Maze Solving #Heterogeneous Agents #Collaboration Gap #Relay Inference #Agentic AI

November 9, 2025

[Paper Review] WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

WebSailor-V2 aims to substantially improve the capabilities of open-source web agents and narrow the performance gap with proprietary systems. In particular, it tackles two main challenges, data construction and scalable reinforcement learning (RL) training, to develop agents with advanced reasoning and tool-use abilities in complex web environments.

#Review #Web Agents #Reinforcement Learning #Synthetic Data #Knowledge Graphs #LLMs #Supervised Fine-Tuning #Sim-to-Real Transfer #Agentic AI

September 17, 2025

[Paper Review] WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents

This paper aims to address the limits on reasoning ability that context suffocation and noise contamination impose on existing deep-research agents.

#Review #Agentic AI #Deep Research #Iterative Reasoning #Long-Horizon Tasks #Context Management #Data Synthesis #Tool-Augmented LLMs #Markov Decision Process

September 17, 2025

[Paper Review] Towards General Agentic Intelligence via Environment Scaling

This paper aims to improve the function-calling ability of large language models (LLMs) in order to advance General Agentic Intelligence.

#Review #Agentic AI #Environment Scaling #Function Calling #Tool Use #Large Language Models #Synthetic Data Generation #Supervised Fine-tuning

September 17, 2025

[Paper Review] The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs

This paper takes up the debate over whether continued scaling of large language models (LLMs) leads to diminishing returns, with a particular focus on the ability to execute long-horizon tasks.

#Review #Large Language Models #Long-Horizon Tasks #Execution Capability #Scaling Laws #Self-Conditioning #Thinking Models #Agentic AI

September 15, 2025

[Paper Review] MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools

This paper addresses the absence of a standardized benchmark that can accurately evaluate the real-world performance of language agents that use tools through the Model Context Protocol (MCP).

#Review #Language Agents #Tool Use #Benchmarks #Model Context Protocol (MCP) #LLM Evaluation #Agentic AI #Real-World Performance

September 15, 2025

[Paper Review] EnvX: Agentize Everything with Agentic AI

To address inefficiencies in the reuse of, and collaboration around, open-source code repositories, this paper proposes EnvX, a framework that turns repositories into intelligent autonomous agents.

#Review #Agentic AI #Multi-Agent Systems #Code Repository #Agentization #Natural Language Interaction #Agent-to-Agent Protocol #LLM-based Agents

September 11, 2025

[Paper Review] A Survey of Reinforcement Learning for Large Reasoning Models

This paper aims to comprehensively survey recent advances in which reinforcement learning (RL) has contributed to transforming large language models (LLMs) into large reasoning models (LRMs).

#Review #Reinforcement Learning #Large Reasoning Models #LLMs #Reward Design #Policy Optimization #Verifiable Rewards #Agentic AI #Multimodal AI

September 11, 2025

[Paper Review] Reinforcement Learning Foundations for Deep Research Systems: A Survey

This paper systematically surveys reinforcement learning (RL) foundations for training deep research agents (agentic AI) that solve complex, multi-step tasks.

#Review #Reinforcement Learning #Deep Research Systems #Agentic AI #Tool Use #Hierarchical Agents #Reward Design #Multimodal AI #RL Frameworks

September 9, 2025

[Paper Review] Open Data Synthesis For Deep Research

This paper addresses the limitation that existing benchmarks do not provide enough structural depth for "Deep Research" tasks. In particular, it focuses on tasks that require decomposing complex questions into subproblems, coordinating multi-step reasoning, and synthesizing evidence from diverse sources.

#Review #Data Synthesis #Deep Research #Hierarchical Constraint Satisfaction Problems #Large Language Models #Agentic AI #Reinforcement Learning #Question Answering

September 4, 2025

[Paper Review] Mimicking the Physicist's Eye: A VLM-centric Approach for Physics Formula Discovery

This paper addresses the problem that existing single-modality approaches (symbolic regression or LLMs) overlook the way physicists use phenomenological visual representations and are therefore weak at interpreting the spatiotemporal patterns inherent in dynamic phenomena.

#Review #Physics Formula Discovery #Multimodal AI #Vision-Language Models #Symbolic Regression #Causal Chain of Thought #Reinforcement Learning #Agentic AI

September 1, 2025

[Paper Review] A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

This paper aims to comprehensively analyze the development of scientific large language models (Sci-LLMs) from the perspectives of data foundations and agent frontiers.

#Review #Scientific LLMs #AI for Science #Scientific Data #Agentic AI #Multimodal Integration #Knowledge Representation #Autonomous Discovery #Data Ecosystems

September 1, 2025

[Paper Review] AWorld: Orchestrating the Training Recipe for Agentic AI

This paper aims to resolve inefficient experience generation, a core bottleneck in developing agentic AI systems, and thereby make the "learning from practice" paradigm practical and scalable in complex environments.

#Review #Agentic AI #Reinforcement Learning #Distributed Systems #Experience Generation #LLM Fine-tuning #GAIA Benchmark #Scalability #AWORLD Framework

August 29, 2025

[논문리뷰] Explain Before You Answer: A Survey on Compositional Visual Reasoning

본 설문조사는 복잡한 시각적 장면을 분해하고, 중간 개념을 이해하며, 다단계 논리적 추론을 수행하는 인간과 같은 능력을 기계에 부여하는 것을 목표로 하는 Compositional Visual Reasoning (CVR) 분야의 진화를 체계적으로 분석합니다.

#Review #Compositional Visual Reasoning #Multimodal AI #Vision-Language Models #Large Language Models #Chain-of-Thought #Tool Learning #Agentic AI #Survey

August 26, 2025

[Paper Review] From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery

This paper proposes and positions the 'Agentic Science' paradigm, in which AI systems evolve from simple computational tools into autonomous research partners.

#Review #Agentic AI #Autonomous Scientific Discovery #AI for Science #Large Language Models #Multi-agent Systems #Scientific Workflow Automation #Natural Sciences

August 21, 2025

[Paper Review] SSRL: Self-Search Reinforcement Learning

This paper explores whether large language models (LLMs) can serve as efficient simulators for agentic search tasks in reinforcement learning (RL).

#Review #Reinforcement Learning #Large Language Models #Self-Search #Sim-to-Real Transfer #Agentic AI #Knowledge Retrieval #Reward Modeling

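August 18, 2025

[Paper Review] HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches

This paper focuses on the need for deep search systems that draw on both local (internal documents and knowledge graphs) and web knowledge sources in enterprise settings.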

#Review #Hierarchical Reinforcement Learning #Deep Search #Multi-source RAG #Agentic AI #Knowledge Integration #Enterprise Search #Large Reasoning Models

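August 13, 2025

[Paper Review] GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

This paper introduces GLM-4.5, an open-source large language model based on MoE (Mixture-of-Experts). Its core goals are to achieve strong performance across agentic, reasoning, and coding (ARC) tasks and to maximize computational efficiency through a hybrid reasoning approach that supports both thinking and direct-response modes.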

#Review #Large Language Model #Mixture-of-Experts #Agentic AI #Reasoning #Code Generation #Reinforcement Learning #Foundation Model

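August 11, 2025

[Paper Review] DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning

This paper addresses the problem that Vision Language Models (VLMs) show limited capability in accurate action planning and spatial/temporal reasoning in complex, dynamic physical environments.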

#Review #Vision Language Models (VLMs) #Agentic AI #Physical Reasoning #Benchmark #Simulation Environments #Action Planning #Interactive AI

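August 8, 2025

[Paper Review] Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

The paper aims to overcome the limitations that large language models (LLMs) face on multi-step reasoning problems, particularly hard tasks where correct trajectories are sparse.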

#Review #Supervised Reinforcement Learning #LLMs #Multi-step Reasoning #Reward Shaping #Expert Trajectories #Math Reasoning #Agentic AI

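October 31, 2025

[Paper Review] SeeingEye: Agentic Information Flow Unlocks Multimodal Reasoning In Text-only LLMs

The paper aims to overcome the inability of text-only large language models (LLMs) to process visual information directly, enabling them to perform multimodal reasoning efficiently and cost-effectively.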

#Review #Multimodal Reasoning #Text-only LLM #Agentic AI #Information Flow #VQA #Structured Intermediate Representation #Decoupled Architecture #Tool Use

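October 30, 2025

[Paper Review] ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking

This paper aims to address two weaknesses of existing parallel thinking methods for deep information-seeking (IS) agents: inefficiency (repetitive rollouts) and the difficulty of integrating long-horizon reasoning trajectories (limited context).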

#Review #Agentic AI #Parallel Thinking #Information Seeking #LLM Agents #Context Window Optimization #Exploration Efficiency #Reasoning Aggregation #Tool Use

October 29, 2025

[Paper Review] FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling

This paper addresses the difficulty of generating high-quality training data for developing the complex multi-turn function calling capabilities of large language models (LLMs).

#Review #Function Calling #Multi-Turn Interaction #Large Language Models (LLMs) #Data Synthesis #Agentic AI #Tool Use #Chain-of-Thought (CoT) #Reinforcement Learning

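October 29, 2025

[Paper Review] AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis

This paper proposes a new data synthesis approach, inspired by the educational theory of the Zone of Proximal Development (ZPD), to extend the advanced reasoning capabilities of large language model (LLM) agents.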

#Review #LLM Agents #Data Synthesis #Zone of Proximal Development (ZPD) #Complex Reasoning #Tool Use #Automated Benchmarking #Agentic AI #Rejection Sampling Fine-Tuning

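October 29, 2025

[Paper Review] AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite

This paper aims to overcome the limitations of existing benchmark evaluations of AI agents for scientific research, such as unrealistic measurement, poor reproducibility, and unaccounted costs.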

#Review #AI Agents #Benchmarking #Scientific Research #LLM Evaluation #Agentic AI #Tool Use #Reproducibility #Cost-Aware Evaluation

October 27, 2025

[Paper Review] Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

This paper addresses the problem that, when LLM-based agents perform long-horizon tasks, their limited working memory is easily overloaded by unnecessary or irrelevant context.

#Review #Long-Horizon Tasks #Agentic AI #Context Curation #Working Memory #Reinforcement Learning #Policy Optimization #Large Language Models #Memory-as-Action

October 15, 2025

[Paper Review] Understanding DeepResearch via Reports

This paper focuses on the multifaceted problem of evaluating DeepResearch agents, which carry out knowledge-intensive research tasks.

#Review #DeepResearch Agents #LLM-as-a-Judge #Report Evaluation #Agentic AI #Factuality #Redundancy #Research Automation #Benchmark

October 13, 2025

[Paper Review] Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

This paper aims to address the 'brevity bias' and 'context collapse' problems of existing context adaptation methods for large language models (LLMs).

#Review #LLM Context Adaptation #Agentic AI #Self-Improving Systems #Prompt Engineering #Context Management #Dynamic Playbooks #Incremental Learning

October 7, 2025