Latest Posts

[Paper Review] Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO

This work addresses the reality that, despite the proliferation of Skills in the LLM-based Agent ecosystem, individual developers design them around specific purposes, which leads to functional fragmentation and insufficient coverage.

#Review #Large Language Model #Agent #Skill Self-Evolution #GRPO #Benchmark #Automation

May 5, 2026

[Paper Review] SVGS: Enhancing Gaussian Splatting Using Primitives with Spatially Varying Colors

This paper addresses the inefficiency of existing Gaussian Splatting methods when representing complex textures and geometric shapes.

#Review #Gaussian Splatting #Novel-view Synthesis #Spatially Varying #Gaussian Surfels #Movable Kernels #3D Reconstruction

May 5, 2026

[Paper Review] Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces

This paper points out the limitations of existing single-agent RL and classical MARL methods as LLM-based agents evolve beyond individual tool use into coordinated teams.

#Review #LLM #Multi-Agent Systems #Reinforcement Learning #Orchestration Trace #Credit Assignment #Reward Design #System Engineering

May 5, 2026

[Paper Review] PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination

This paper addresses a limitation of prior patent research: it treats patent examination as simple binary classification (Acceptance Prediction) or static information extraction, and so fails to reflect the iterative, interactive examination process used in practice.

#Review #Patent Examination #Office Action Generation #Rebuttal Generation #Large Language Models #Legal Reasoning #Benchmark

May 5, 2026

[Paper Review] OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories

This work takes a critical look at the reality that high-performance search agent development is tied to a CPT+SFT+RL pipeline driven by companies with vast capital and compute resources. Such complex training recipes raise the barrier to entry for academia and make the research ecosystem more closed.

#Review #Search Agent #SFT #ReAct #Data Quality #Long-horizon Reasoning #Data Synthesis

May 5, 2026

[Paper Review] HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness

This paper aims to identify the actual performance-driving mechanism hidden behind modern, complex Agentic Harness designs and to simplify it. Existing orchestration designs are so complex as systems that the underlying reasoning mechanism is hard to discern.

#Review #Agentic Harness #Heavy Thinking #Large Language Model #Test-Time Scaling #Sequential Deliberation #Parallel Reasoning #RLVR

May 5, 2026

[Paper Review] Healthcare AI GYM for Medical Agents

This paper addresses the problem that medical AI agents struggle to learn stable tool-use policies in complex, multi-step clinical reasoning environments. Existing single-turn medical QA studies do not adequately reflect the multi-step interaction and tool use that are central to real clinical settings.

#Review #Medical AI Agents #Reinforcement Learning #On-Policy Distillation #Clinical Reasoning #Multi-turn Interaction #Healthcare AI GYM

May 5, 2026

[Paper Review] ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue

This paper points out that existing UAV SAR research is confined to traditional vision and path-planning approaches, so there is no unified benchmark for evaluating autonomous decision-making in complex environments.

#Review #Embodied AI #Search and Rescue (SAR) #UAV #Multimodal Large Language Models (MLLMs) #Simulation Platform #Benchmark

May 5, 2026

[Paper Review] Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation

This paper was designed to address the Coarse-grained attribution and Visual semantic loss problems of existing text-based iRAG systems.

#Review #Iterative Retrieval-Augmented Generation #Visual Attribution #Vision-Language Models #Pixel-level Grounding #Multi-hop Reasoning

May 5, 2026

[Paper Review] Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

This paper addresses the distributional drift that arises in SFT→RLVR, the standard post-training pipeline for LMMs. Conventional SFT relies on a uniform token-level objective, so the model learns only superficial patterns, which in turn distorts the model's original capabilities.

#Review #Multimodal LLM #Reinforcement Learning #On-Policy Distillation #Distributional Drift #Mixture-of-Experts (MoE) #Adversarial Alignment

May 5, 2026

[Paper Review] A Benchmark for Interactive World Models with a Unified Action Generation Framework

This paper addresses the difficulty of objectively evaluating the physical interaction capabilities of interactive world models, which stems from the lack of large-scale datasets and a unified benchmark.

#Review #Interactive World Models #Benchmark #Action Generation Framework #Embodied Intelligence #Trajectory Following #Memory Ability

May 5, 2026

[Paper Review] T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning

This paper addresses the Training Collapse phenomenon that frequently occurs in multi-turn Agentic RL settings.

#Review #Agentic Reinforcement Learning #Multi-Turn Reasoning #Uncertainty-Guided Exploration #Token-Level Thinking Intervention #Turn-Level Dynamical Sampling #Training Stability

May 4, 2026

[Paper Review] Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

This paper addresses the strategic dilemma between "securing data diversity" and "strengthening data quality" that arises when training LLMs for high-resource non-English languages with limited data, such as German.

#Review #Large Language Models #Data Filtering #Sample Efficiency #German Language Modeling #Multi-Epoch Training #Semantic Density #High-Signal Data

May 4, 2026

[Paper Review] PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments

This paper addresses the limitation that existing medical AI benchmarks are confined to static knowledge recall or single-step tasks and cannot evaluate the complex, long-horizon clinical workflows required in real medical practice.

#Review #LLM Agents #EHR #Benchmark #FHIR #Clinical Workflows #Agentic Evaluation #Long-horizon Tasks

May 4, 2026

[Paper Review] Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

This paper addresses the Visual Signal Dilution problem that Autoregressive LVLMs face when generating long contexts.

#Review #Large Vision-Language Models #Visual Signal Dilution #Persistent Visual Memory #Autoregressive Generation #Multimodal Reasoning #Bottleneck Adapter

May 4, 2026

[Paper Review] Perceptual Flow Network for Visually Grounded Reasoning

This paper addresses the language bias and hallucination that arise because existing LVLMs cannot control the visual trajectory during standard MLE training.

#Review #Large-Vision Language Models #Visually Grounded Reasoning #Perceptual Flow #Variational Reinforcement Learning #Vicinal Geometric Shaping #Hallucination Mitigation

May 4, 2026

[Paper Review] OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models

This paper addresses the performance bottleneck in marine AI caused by the fragmentation of ocean data and the lack of domain-specific data.

#Review #Multimodal Large Language Models #Marine Science #Foundation Models #Data Corpus #Instruction Tuning #Sonar Detection

May 4, 2026

[Paper Review] Motion-Aware Caching for Efficient Autoregressive Video Generation

This paper proposes MotionCache to address the excessive computational cost caused by the iterative denoising process in autoregressive video generation models.

#Review #Autoregressive Video Generation #Feature Caching #Motion-Aware Acceleration #Residual Stability #Diffusion Transformers

May 4, 2026

[Paper Review] MolmoAct2: Action Reasoning Models for Real-world Deployment

This paper addresses the limitation that VLA models for generalist robot manipulation do not meet the practical requirements of real-world deployment.

#Review #Vision-Language-Action (VLA) Model #Embodied Reasoning #Flow Matching #Adaptive Depth Perception #Open-source Robotics #Real-world Deployment

May 4, 2026

[Paper Review] Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

This work points out that existing Tree-RAG methods are optimized only for single-hop questions within a single document, which limits their ability to handle complex cross-document multi-hop questions and to scale to the corpus level.

#Review #RAG #Tree-RAG #Hierarchical Abstract Tree #Multi-hop Retrieval #Multi-granular Retrieval

May 4, 2026