Latest Posts

[Paper Review] HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering

Long-form video question answering (VideoQA) requires reasoning over extended temporal context, but the finite context windows of current <strong>Large Vision-Language Models (LVLMs)</strong> make it impossible to process an entire video at its native frame rate.

#Review #Video Question Answering #Frame Selection #Neuro-Symbolic Reasoning #Multimodal Understanding #Long Video

March 22, 2026

[Paper Review] FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow

This paper proposes FlowScene, a tri-branch framework that jointly generates layout, shape, and texture with a single rectified flow. It targets a limitation of existing indoor scene generation methods, which struggle to achieve fine-grained object-level control and scene-wide style consistency at the same time.

#Review #Scene Generation #Rectified Flow #Multimodal Graph #3D Indoor Synthesis #Style Consistency #Generative Models

March 22, 2026

[Paper Review] EgoForge: Goal-Directed Egocentric World Simulator

Generative world models have made significant progress in simulating and reasoning about dynamic environments, but they struggle in egocentric vision because of rapid viewpoint changes, frequent hand-object interactions, and complex goal-directed behavior that depends on latent human intent.

March 22, 2026

[Paper Review] Deep Tabular Research via Continual Experience-Driven Execution

Large language models (LLMs) have shown considerable ability in reasoning over structured data, but they struggle with complex long-horizon analytical tasks on unstructured tables that contain hierarchical and bidirectional headers, merged cells, and non-canonical layouts.

#Review #Deep Tabular Research #LLM Agents #Tabular Reasoning #Continual Learning #Experience-Driven Execution #Multi-hop Reasoning #Unstructured Tables

March 22, 2026

[Paper Review] CurveStream: Boosting Streaming Video Understanding in MLLMs via Curvature-Aware Hierarchical Visual Memory Management

Multimodal Large Language Models (MLLMs) perform very well on offline video understanding, but they face an inherent bottleneck in streaming video scenarios.

#Review #Streaming Video Understanding #MLLMs #Memory Management #Curvature Score #Hierarchical Visual Memory #Catastrophic Forgetting

March 22, 2026

[Paper Review] Cooperation and Exploitation in LLM Policy Synthesis for Sequential Social Dilemmas

Conventional multi-agent reinforcement learning (MARL) is limited in learning effective policies in Sequential Social Dilemmas (SSDs) because of difficult credit assignment, non-stationarity, and a vast joint action space.

#Review #LLM Policy Synthesis #Sequential Social Dilemmas (SSDs) #Feedback Engineering #Multi-agent Environments #Cooperation #Reward Hacking #Programmatic Policies

March 22, 2026

[Paper Review] Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD

Discrete diffusion models can generate high-quality data, but they typically need many sampling steps, which leads to high computational cost and FLOPs.

#Review #Discrete Diffusion Models #Distillation #Moment Matching Distillation #D-MMD #GPT-2 Gradient Moment #Few-step Generators #CIFAR-10 #Open Web Text

March 22, 2026

[Paper Review] BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

The context windows of recent LLMs have expanded exponentially, increasing the potential for long-document understanding, but this has created severe bottlenecks in inference latency and information utilization.

#Review #Prompt Compression #Long-Context LLMs #Training-Free #Hierarchical Selection #Structure-Aware #Inference Latency #Information Utilization

March 22, 2026

[Paper Review] Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

Distilled autoregressive (AR) video models enable efficient streaming generation, but they are often misaligned with human visual preferences and exhibit artifacts or unnatural motion dynamics.

#Review #Video Generation #Distilled Autoregressive Models #Reinforcement Learning (RL) #Human Preferences #Streaming Generation #Forward-Process RL #Reward Hacking #Temporal Consistency

March 22, 2026

[Paper Review] AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

This paper proposes AgentDS, a benchmark for evaluating how far AI agents can substitute for human expert performance on domain-specific data science tasks, and in which areas human expertise still holds an advantage.

#Review #AI Agents #Human-AI Collaboration #Data Science Benchmark #Large Language Models #Domain-Specific Reasoning #Multi-Industry Evaluation

March 22, 2026

[Paper Review] A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

Large language model (LLM)-based agents have emerged as powerful autonomous controllers in digital environments, but they show weaknesses in long-horizon planning, especially on complex tasks such as web navigation that involve dynamic content and long action sequences.

#Review #LLM Agents #Subgoals #Reinforcement Learning #Web Navigation #Long-Horizon Planning #Reward Shaping #Curriculum Learning

March 22, 2026

[sglang] SM120 FP8 Blockwise GEMM performance optimization in SGLang: introducing the Pingpong schedule

Introducing the Pingpong schedule for FP8 Blockwise GEMM on the SM120 architecture improved performance by about 2x at small M sizes.

#CUDA #CUTLASS #GEMM #FP8 #SGLang #SM120

March 22, 2026

[Axolotl] Adding bias, dropout, DoRA, and embedding support to the LoRA kernels

An analysis of extending Axolotl's Triton LoRA kernels to support bias parameters, dropout, DoRA (Weight-Decomposed LoRA), and embedding layers.

#Axolotl #LoRA #DoRA #Triton #LLM Training #Performance #PEFT

March 22, 2026

[Axolotl] Liger kernel support for Qwen 3.5 models and a new fused RMSNorm+Gated kernel

An analysis of adding Liger FLCE kernel support for Qwen 3.5 / Qwen 3.5 MoE models and a fused RMSNorm+SiLU gate Triton kernel to Axolotl.

#Axolotl #Liger Kernel #Qwen 3.5 #RMSNorm #Triton #LLM Training #Performance

March 22, 2026

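[Open WebUI] Adding a confirmation dialog when deleting a memory item

Adds a confirmation dialog for deleting individual memories, a UX improvement that prevents accidental deletion.

#Open WebUI #Svelte #UX #Performance

March 21, 2026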

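[Axolotl] Narrowing the autotune search space of the ScatterMoE LoRA Triton kernels

An analysis of removing unnecessarily large block sizes from the autotune configs of the ScatterMoE LoRA Triton kernels, which shortens compile time and avoids exceeding shared memory limits.

#Axolotl #Triton #ScatterMoE #LoRA #Autotune #Performance #GPU

March 21, 2026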

[ray] Ray Data's next-generation data source API: DataSourceV2 design and optimization strategy

An analysis of how Ray Data's new DataSourceV2 architecture achieves per-data-source optimization and extensibility.

#Ray #DataEngineering #DistributedSystems #Python #PyArrow

March 21, 2026

[Triton] Propagating buffer cache modifiers to LLVM IR on AMD RDNA3

Fixes an issue where the .cg/.cs/.cv/.wt cache modifiers were ignored on RDNA3 targets, enabling non-temporal memory access.

#Triton #AMD #RDNA3 #Cache Optimization #LLVM IR

March 21, 2026

[triton] Adding partial support for TMA and cp.async operations to the Global Sanitizer

An analysis of the PR that adds tensor descriptor decoding and memory access tracking for TMA/cp.async operations to Triton's Global Sanitizer.

#Triton #GSan #Sanitizer #TMA #AsyncCopy #Debugging

March 20, 2026

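[axolotl] Fixing the Context Parallel double sequence split bug: preventing duplicate application with a noop context manager

An analysis of a case where accelerate and axolotl each split the sequence during Context Parallel training, producing a double split, and how a noop context manager patch resolves it.

#Axolotl #Context Parallel #Distributed Training #Bug Fix

March 20, 2026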