Latest Posts

[Paper Review] ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?

Existing WAMs infer robot actions by relying on future video generation, but this approach has three serious limitations. First, inference cost is extremely high because spatiotemporal tokens for many frames must be processed.

#Review #World Action Models #Image Editing #Robot Manipulation #Flow Matching #Efficient Inference #Embodied AI

June 18, 2026

[Paper Review] HumanScale: Egocentric Human Video Can Outperform Real-Robot Data for Embodied Pretraining

The key bottleneck in training embodied foundation models is the scarcity of precisely annotated, high-quality robot data and the high cost of collecting it.

#Review #Embodied AI #Egocentric Video #Pretraining #Robot Learning #Scaling Laws #Generalization #World-Action Models

June 18, 2026

[Paper Review] Holo-World: Unified Camera, Object and Weather Control for Video World Model

This work aims to resolve the data scarcity and modeling conflicts that arise when a video world model unifies control of camera, object dynamics, and weather state under a single interface.

#Review #Video World Model #Unified State Control #Weather Transfer #Unified Scene Adapter #Scene-Weather Decomposed CFG #HoloStateData

June 18, 2026

[Paper Review] FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining

This work aims to resolve the content leakage and structural distortion that arise in Dual-Reference Generation, a task that references style and content at the same time.

#Review #Diffusion Models #Dual-Reference Generation #LoRA Mining #Content-Style Disentanglement #Attention Enrichment #RoPE Modulation

June 18, 2026

[Paper Review] FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows

This paper aims to resolve the degraded generation quality and alignment failures that occur because existing conditional generative models treat the conditioning signal only as a static input.

#Review #Flow Matching #Conditional Generation #Feedback-Aware Training #Closed-Loop Inference #Self-Correction

June 18, 2026

[Paper Review] FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines

This paper proposes FAPO to address the cross-stage interaction failures and bottlenecks that arise in complex multi-step LLM pipelines. Existing prompt-only optimization techniques are limited in identifying structural flaws across the whole pipeline, and tuning the prompt of a single stage alone rarely improves performance.

#Review #LLM Pipeline #Prompt Optimization #Autonomous Agent #Claude Code #LangGraph #Failure Attribution #Pipeline Architecture

June 18, 2026

[Paper Review] ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

This paper aims to resolve the current bottleneck in which human intervention is essential for robots to acquire dexterous manipulation skills.

#Review #Physical Autoresearch #Agentic Robot Policy #Robot Fleet #Closed-loop System #Self-Improvement #Task Manipulation

June 18, 2026

[Paper Review] Duration Aware Scheduling for ASR Serving Under Workload Drift

This paper addresses the inefficiency that arises in large-scale ASR systems because FCFS-based scheduling does not account for variability in job duration. Existing serving engines such as vLLM process inputs sequentially, so head-of-line blocking, in which long audio requests hold up short ones, occurs frequently.

#Review #ASR #Scheduling #Latency #vLLM #Whisper #Workload Drift #SJF #HRRN

June 18, 2026

[Paper Review] DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects

This work aims to resolve the physical stability problem of hand-object interaction (HOI) that arises when manipulating articulated objects.

#Review #Dexterous Manipulation #Articulated Object Manipulation #Hand-Object Interaction #Reinforcement Learning #Contact-Driven #Physically Informed #Robustness

June 18, 2026

[Paper Review] DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis

This work addresses the lack of a large-scale, systematic dataset and benchmark for Distractor-Free Radiance Field research, a gap that makes it difficult to identify the strengths and limitations of existing methods.

#Review #Distractor-Free #Novel View Synthesis #Radiance Fields #3D Dataset #Benchmark #Diffusion-based Enhancement #DI2FIX

June 18, 2026

[Paper Review] Current World Models Lack a Persistent State Core

This paper points out that modern world models can generate refined frames but lack a "Persistent State Core", a persistent world state that must keep evolving independently even when the observer is not looking.

#Review #World Models #Persistent State #Viewpoint Intervention #WRBench #Video Generation #Diagnostic Benchmark #World-State Consistency

June 18, 2026

[Paper Review] Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents

This paper points out that modern LLM agents cannot be adequately evaluated by static leaderboards based on a single score, and that this leads to rank instability.

#Review #LLM Agents #Predictive Validity #Benchmark #Evaluation #Out-of-Distribution #MCP #Leaderboard

June 18, 2026

[Paper Review] Adaptive Volumetric Mechanical Property Fields Invariant to Resolution

This paper addresses the problem that existing 3D assets lack the material information essential for physics simulation (Young's modulus, Poisson's ratio, density), which creates a bottleneck for realistic physics simulation.

#Review #Mechanical Properties #Sparse Adaptive Voxels #Physics Simulation #Autoregressive Generation #3D Assets #Material Fields

June 18, 2026

[ray] Resolving Lock Contention in Ray Core: Introducing Asynchronous Processing in the Publisher

Analyzes a case in which Ray's Object Pubsub logic was moved to an IO thread, substantially reducing lock contention in the scheduling loop.

#Ray #C++ #Concurrency #Performance #Distributed Systems

June 17, 2026

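[sglang] SGLang Performance Optimization: Resolving the H2D Bottleneck in Speculative Decoding and Removing Code Duplication

The synchronous H2D copies in the Speculative Decoding path were made asynchronous, and duplicated logic was consolidated to improve performance.

#SGLang #LLM #Performance #PyTorch #SpeculativeDecoding

June 17, 2026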

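[sglang] [Performance Optimization] 30% Higher Decoding Throughput from Making the `latest_output_ids` H2D Copy Asynchronous in SGLang `prepare_for_decode`

A case analysis of making the H2D copy of `latest_output_ids` asynchronous during SGLang decoding, which significantly improved performance.

#SGLang #PyTorch #CUDA #Performance Optimization #GPU #LLM #H2D #Asynchronous Programming

June 17, 2026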

[vllm] Analysis of the Flashinfer-Based Non-gated MoE bf16 Support Optimization in vLLM

Analyzes the technical changes that added Non-gated MoE bf16 support to vLLM's Flashinfer-TRTLLM backend, improving performance by about 15%.

#vLLM #MoE #Flashinfer #DeepLearning #Optimization

June 17, 2026

[Paper Review] iOSWorld: A Benchmark for Personally Intelligent Phone Agents

This paper points out that existing mobile agent benchmarks lack persistent user data and interconnected personal context, and argues for the need to evaluate agents equipped with "Personal Intelligence".

#Review #iOSWorld #Mobile Agents #Personal Intelligence #Human-Computer Interaction #LLM-as-a-Judge #Multi-app Reasoning #Simulator Benchmark

June 17, 2026

[Paper Review] Trust the Right Teacher: Quality-Aware Self-Distillation for GUI Grounding

This paper proposes Quality-Aware Self-Distillation to address the degradation in teacher-model signal quality that occurs during OPSD training.

#Review #GUI Grounding #On-Policy Self-Distillation #Teacher-Signal Reliability #Vision-Language Models #Correctness-Aware Gating #Probability Scaling

June 17, 2026

[Paper Review] Sumi: Open Uniform Diffusion Language Model from Scratch

This work aims to address the absence of a UDLM pre-trained from scratch at large parameter and data scale.

#Review #Uniform Diffusion Language Model #UDLM #Diffusion Models #Pre-training #Scaling #Generation Dynamics #Sumi

June 17, 2026