Latest Posts

[Paper Review] AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

This paper points out that, in complex real-world environments, LLM agents lack the ability to effectively form and revise plans under progressively disclosed dual constraints.

#Review #Large Language Model Agents #Adaptive Planning #Dual Constraints #Progressive Disclosure #Interactive Benchmarking #Constraint-based Planning

June 4, 2026

[Paper Review] AdaCodec: A Predictive Visual Code for Video MLLMs

This paper addresses the inefficiency that arises because existing video MLLMs ignore the temporal redundancy of video and process every frame as an independent RGB image.

#Review #Video MLLMs #Predictive Coding #Visual Token #Efficiency #Temporal Redundancy #GOP (Group of Pictures) #Latency

June 4, 2026

[Paper Review] Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents

This paper addresses the "financial cognition friction" that financial AI agents experience and the performance degradation it causes.

#Review #Financial LLM Agents #Interaction-Native #Knowledge Harness #Temporal Knowledge Graph #Passive Knowledge Injection #Execution Safety #Cognition Friction

June 4, 2026

[Paper Review] AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

Existing LLM agents focus only on the user's literal query and overlook the intent behind it. For example, the question "Who is where?" may hide the intent "Is that person free to talk right now?"

June 4, 2026

[sglang] The FlashAttention Optimization That Improved DeepSeek V4 Prefill Performance by 1.35x

An analysis of the FlashAttention optimization that sped up the prefill stage of the DeepSeek V4 model by 1.35x.

#AI #LLM #Performance Optimization #FlashAttention #DeepSeek V4 #SGLang

June 3, 2026

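[feast] Tuning Feast Online Serving Performance: The Journey to Sub-2ms

A detailed analysis of the performance tuning used to bring the p99 latency of the Feast online feature server below 2 ms.

#Feast #Performance Optimization #Kubernetes #Redis #Python

June 3, 2026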

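[vllm] [ROCm CI Optimization] Cutting Build Time by 26 Minutes with a Three-Stage Docker Build Strategy

An in-depth analysis of the three-stage Docker build architecture and content-addressed caching technique introduced to cut ROCm CI build time in the vLLM project.

#vLLM #ROCm #Docker #CI/CD #Buildkite #Optimization

June 3, 2026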

[transformers] Hugging Face Transformers: Fixing the Slow Tokenizer Performance Regression

A case study of the optimization that brought PreTrainedTokenizer back from an O(T*N*logN) performance regression to O(T).

#HuggingFace #Transformers #Python #Optimization #Tokenizer

June 3, 2026

[Paper Review] ZipSplat: Fewer Gaussians, Better Splats

This paper aims to resolve the structural inefficiency that arises when existing feed-forward 3DGS methods tie the placement of 3D Gaussians to the pixel grid of the input images.

#Review #3D Gaussian Splatting #Feed-forward Reconstruction #Novel View Synthesis #Scene Tokens #Clustering #Pose-free

June 3, 2026

[Paper Review] Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

This study aims to address the black-box nature of deep-research agents, which makes the causes of their errors hard to identify. Existing agent evaluations focus mainly on the accuracy of the final answer, so they are limited in diagnosing at which intermediate step the reasoning went wrong.

#Review #Deep-Research Agents #Error Localization #Agent Trajectories #Span-Level Analysis #LLM Reasoning #Debugging

June 3, 2026

[Paper Review] Unlocking Feature Learning in Gated Delta Networks at Scale

This paper aims to derive the optimal $\mu P$ configuration that supports stable feature learning during large-scale training of efficient linear architectures such as the Gated Delta Network.

#Review #Gated Delta Network #Maximal Update Parametrization #Feature Learning #Hyperparameter Transfer #Linear Recurrent Models #Deep Learning Theory

June 3, 2026

[Paper Review] Training-Free Multi-Concept LoRA Composition with Prompt-Aware Weighting

This study addresses the semantic interference that arises when multiple LoRAs are combined to generate composite concepts, along with the resulting loss of image quality and fidelity.

#Review #LoRA #Diffusion Models #Multi-Concept Composition #Prompt-Aware Weighting #Training-Free #Image Generation

June 3, 2026

[Paper Review] ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

This paper aims to address the problem that LRMs "overthink" during reasoning, generating unnecessarily long CoTs and consuming computational resources inefficiently.

#Review #Large Reasoning Models #Reinforcement Learning #Chain-of-Thoughts #Preference Learning #Reasoning Efficiency #Redundancy Mitigation

June 3, 2026

[Paper Review] Streaming Communication in Multi-Agent Reasoning

This paper sets out to address the unnecessary waiting time and reduced inference efficiency caused by the conventional "generate-then-transfer" paradigm.

#Review #Multi-Agent Reasoning #LLM #Pipeline Parallelism #Streaming Communication #Step-Level Scaling Law #Communication Protocol

June 3, 2026

[Paper Review] Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning

This paper addresses the data scarcity and ground-truth ambiguity that arise when training image layer decomposition models. Existing models are trained on synthesized layer datasets, which forces a single correct answer and thereby limits the flexibility of layer decomposition and the range of possible edits.

#Review #Image Layer Decomposition #Reinforcement Learning #Vision-Language Model #Flow-GRPO #LoRA #VLM-as-Judge

June 3, 2026

[Paper Review] SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes

This paper proposes SpatialAct to evaluate whether VLMs can go beyond simple spatial observation to act in real 3D environments and manage the consequences. Existing spatial reasoning benchmarks mostly measure only a model's understanding of static images or videos, and do not consider interactions in which the model's output changes the environment.

#Review #VLM Agents #3D Spatial Reasoning #Action-Conditioned #Interactive Refinement #Benchmark #Simulator-Grounded

June 3, 2026

[Paper Review] Semi-Supervised Noise Adaptation: Transferring Knowledge from Noise Domain

This study defines the SSNA problem, which uses a randomly generated noise distribution as the source domain, to address the difficulty of obtaining meaningful source data for a target domain with almost no labels.

#Review #Semi-Supervised Learning #Transfer Learning #Noise Adaptation #Generalization Bound #Distribution Alignment #Representation Learning

June 3, 2026

[Paper Review] Self-Distilled Policy Gradient

June 3, 2026

[Paper Review] Score-Control for Hallucination Reduction in Diffusion Models

This paper theoretically establishes that the hallucination problem in modern diffusion models stems from excessive smoothness of the learned score function.

#Review #Diffusion Models #Hallucination Reduction #Score Smoothness #Variance-Guided Score Modulation (VSM) #Lipschitz Constant #Generative AI #Jacobian

June 3, 2026

[Paper Review] STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations

This paper aims to address the computational cost and theoretical limitations of TDA, which traces an LLM's predictions back to the training data.

#Review #Training Data Attribution #LLM #Sparse Recovery #Compressive Sensing #Activation-Space #Steering Operators #Causal Inference

June 3, 2026