Latest Posts

[uv] uv Improves Performance with SIMD-Accelerated TOML Parsing

uv has enabled SIMD-accelerated TOML parsing, improving parsing speed.

#Rust #Performance #Optimization #SIMD #TOML #uv

July 1, 2026

[sglang] Fused EH Norm Optimization for DeepSeek NextN: Maximizing Performance with Kernel Fusion

The EH Norm operation in DeepSeek models was optimized with kernel fusion, substantially improving compute efficiency.

#SGLang #DeepSeek #CUDA #KernelFusion #Optimization

July 1, 2026

[Paper Review] Xiaomi-GUI-0 Technical Report

This study addresses a limitation of existing GUI agent research: the static benchmarks and simulated environments it relies on fail to reflect the complex state distribution of real mobile devices.

#Review #GUI Agent #VLM #Real-Device #Reinforcement Learning #Data Flywheel #End-to-End #Mobile Automation

June 30, 2026

[Paper Review] Unlocking the Visual Record of Materials Science: A Large-Scale Multimodal Dataset from Scientific Literature

This paper addresses the problem that the visual record of materials science, which holds a vast body of experimental knowledge, remains inaccessible to AI models because of the complexity of compound figure structures. Prior work relies mainly on text-based databases and fails to use the rich experimental visuals contained in papers.

#Review #Multimodal dataset #Materials informatics #Compound figure detection #Information extraction #LLM #Vision-language #YOLO12-m

June 30, 2026

[Paper Review] TerraDiT-Ω: Unified Spatial Control for Satellite Image Synthesis with Any Geospatial Primitive

This paper points out that existing satellite image generation models face two limitations arising from data conversion: loss of geometric information and computational bottlenecks. Prior work converts satellite data to raster form, which distorts fine topographic features and limits model generality.

#Review #Satellite Imagery #Generative Models #Spatial Control #Geospatial Primitive #Diffusion Transformer #GALA #Synthetic Data Augmentation

June 30, 2026

[Paper Review] SkillHone: A Harness for Continual Agent Skill Evolution Through Persistent Decision History

This paper addresses the problem that agent skills are treated as static artifacts, which makes them hard to maintain under continually changing environments and task deployment settings.

#Review #Agent Skill #Continual Learning #Persistent Decision History #Skill Evolution #LLM Agent #Deep Research #Role-bounded Subagent

June 30, 2026

[Paper Review] Scenes as Objects, Not Primitives: Instance-Structured 3D Tokenization from Unposed Views

This paper addresses the problem that existing feed-forward 3D reconstruction methods represent a scene as a set of dense geometric primitives (points, Gaussians) rather than as objects, which makes object-level reasoning and manipulation difficult.

#Review #3D Reconstruction #Instance Segmentation #Gaussian Splatting #Feed-forward #Tokenization #Object-centric

June 30, 2026

[Paper Review] Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs

This study addresses systematic metacognition deficits in LLMs, such as generating hallucinations with high confidence or failing to identify the boundaries of their knowledge. Existing models either fail to recognize their internal uncertainty or fail to express it honestly in language.

#Review #LLM #Metacognition #Reinforcement Learning #Faithful Calibration #Uncertainty #Preference Optimization #Metacognitive Feedback

June 30, 2026

[Paper Review] RedVox: Safety and Fairness Gaps in Speech Models Across Languages

This paper points out that safety and fairness evaluations of recent speech recognition models are overly English-centric and skewed toward synthetic data rather than natural real-world usage.

#Review #Speech Models #Safety #Fairness #Multilingual #Benchmark #Red Teaming #Multimodal

June 30, 2026

[Paper Review] QVal: Cheaply Evaluating Dense Supervision Signals for Long-Horizon LLM Agents

This paper aims to efficiently evaluate dense supervision methods designed to address the sparse reward problem that hinders the training of long-horizon LLM agents.

#Review #LLM Agents #Dense Supervision #Reinforcement Learning #Q-alignment #Evaluation Benchmark #Long-Horizon #Training-Free

June 30, 2026

[Paper Review] PolyFlow: Continuous Topology Embedding Flow Matching for Artist-style Mesh Generation

This paper proposes PolyFlow to address the severe inference latency and error accumulation faced by existing autoregressive (AR) mesh generation models. Existing AR approaches serialize a mesh into a fixed sequence and predict tokens one at a time, so generation is very slow and errors tend to accumulate on complex shapes.

#Review #Mesh Generation #Flow Matching #Topology Embedding #Retopology #Transformer #Parallel Generation #3D-Native

June 30, 2026

[Paper Review] PhotoQuilt: Training-Free Arbitrary-Resolution Photomosaics via Bootstrapped Tiled Denoising

This paper addresses the trade-off that existing generative models face when producing high-resolution photomosaics: preserving global structure versus rendering tile-level detail.

#Review #Photomosaics #Diffusion Models #Bootstrapped Tiled Denoising #Training-Free #Arbitrary-Resolution #Global Coherence #Tile Autonomy

June 30, 2026

[Paper Review] Orca: The World is in Your Mind

Aiming at general intelligence, this paper proposes Orca, a General World Foundation Model that goes beyond simple prediction models to understand and act in the world.

#Review #World Foundation Model #Next-State-Prediction #Latent World Space #Unconscious Learning #Conscious Learning #Multimodal Readout

June 30, 2026

[Paper Review] Multi-Block Diffusion Language Models

This paper addresses the inefficiency (storing bubbles) that arises in existing BD-LMs from decoding sequentially, one block at a time.

#Review #Diffusion Language Models #Multi-Block Diffusion #Multi-block Teacher Forcing #Block Buffer #KV Caching #Parallel Decoding #Train-Inference Alignment

June 30, 2026

[Paper Review] MuSViT: A Foundation Vision Model for Sheet Music Representation

This study addresses the lack of a strong domain-specific backbone model for converting visual sheet music data into structured digital formats.

#Review #Foundation Model #Vision Transformer #Sheet Music Recognition #Masked Autoencoders #Self-supervised Learning #Optical Music Recognition

June 30, 2026

[Paper Review] MemLearner: Learning to Query Context memory for Video World Models

This paper addresses the memory shortfall that prevents video world models from maintaining scene consistency over long generation horizons.

#Review #Video World Models #Context Memory #Adaptive Query Tokens #Diffusion Transformer #Learnable Memory

June 30, 2026

[Paper Review] Managing Procedural Memory in LLM Agents: Control, Adaptation, and Evaluation

This study addresses the reusability of procedural memory when LLM-based agents carry out repetitive procedures in real-world work. Prior work has focused on short-term performance gains in local environments and has not sufficiently evaluated practical transfer across different tasks, roles, and model backbones.

#Review #LLM Agents #Procedural Memory #Skill Transfer #Benchmark #Agent Evolution #Task Generalization

June 30, 2026

[Paper Review] Little Brains, Big Feats: Exploring Compact Language Models

This paper explores whether SLMs, which have comparatively low compute cost, can address the high resource requirements of LLMs in the generation stage of RAG systems.

#Review #Small Language Models (SLMs) #Retrieval-Augmented Generation (RAG) #On-device AI #LLM-as-a-Judge #Russian-language Benchmark

June 30, 2026

[Paper Review] LUMOS: A Semantic Operating-System Layer for Accessibility-Grounded AI Agents

This paper addresses the problem that existing operating systems are optimized for human users, which hinders efficient control by AI agents.

#Review #AI Agents #Operating Systems #Accessibility #Semantic Blueprint #UI Automation #Computer Use #LLM

June 30, 2026

[Paper Review] GEAR: Guided End-to-End AutoRegression for Image Synthesis

This paper addresses the inefficiency that arises because modern visual generative models train the tokenizer and the generator separately in two stages.

#Review #GEAR #Autoregressive #Tokenizer #End-to-End #Representation Alignment #Vector Quantization #Image Synthesis

June 30, 2026