Latest Posts

[Paper Review] SafeDiffusion-R1: Online Reward Steering for Safe Diffusion Post-Training

This paper aims to address the data dependence and model performance degradation of existing approaches to securing T2I model safety.

#Review #Diffusion Models #Safety Alignment #Online Reinforcement Learning #GRPO #CLIP #Concept Erasure

May 18, 2026

[Paper Review] SNLP: Layer-Parallel Inference via Structured Newton Corrections

This paper aims to address the inference latency caused by layer-wise dependency, a chronic problem in Transformer models.

#Review #Layer-Parallel Inference #Structured Newton Corrections #Transformer #Autoregressive #Solver-induced Inference Bias #Identity Newton #HC Newton

May 18, 2026

[Paper Review] Post-Trained MoE Can Skip Half Experts via Self-Distillation

Existing Dynamic MoE research has mostly retrained models from scratch or adopted adaptation schemes limited to specific tasks. In practice, however, most deployments use Post-Trained MoE models that have already completed pre-training and post-training (SFT, RL, etc.).

#Review #Mixture-of-Experts #Dynamic Inference #Self-Distillation #Zero-Expert Injection #Large Language Models #Model Adaptation

May 18, 2026

[Paper Review] OProver: A Unified Framework for Agentic Formal Theorem Proving

This paper aims to address the training-inference mismatch that arises because existing formal theorem proving systems use proof-failure feedback and retrieval only as inference-time heuristics.

#Review #Formal Theorem Proving #Lean 4 #Agentic Proving #Compiler Feedback #Test-Time Refinement #Reinforcement Learning

May 18, 2026

[Paper Review] NGM: A Plug-and-Play Training-Free Memory Module for LLMs

This paper aims to address the excessive compute that LLMs spend at inference time reconstructing distinctive local patterns (identifiers, technical terms, phrases, etc.). Existing Conditional Memory approaches require trainable memory tables or separate storage infrastructure, which limits flexibility and efficiency.

#Review #Large Language Models #Memory Module #N-gram #Training-Free #Plug-and-Play #Cosine Similarity

May 18, 2026

[Paper Review] Monitoring the Internal Monologue: Probe Trajectories Reveal Reasoning Dynamics

This paper aims to address the 'Unfaithfulness' problem: the Chain of Thought (CoT) generated by LRMs does not always agree with the model's final output.

#Review #Large Reasoning Models #Chain of Thought #Probe Trajectories #Representation Engineering #AI Safety #Max-pooling #Interpretability

May 18, 2026

[Paper Review] Model-Adaptive Tool Necessity Reveals the Knowing-Doing Gap in LLM Tool Use

To address the performance degradation and opacity that arise in Adaptive Tool Use by LLM agents, this paper proposes the Model-Adaptive Tool Necessity framework, which is grounded in each model's own capability.

#Review #LLM #Tool Use #Meta-cognition #Knowing-Doing Gap #Representation Engineering #Model-Adaptive

May 18, 2026

[Paper Review] MixSD: Mixed Contextual Self-Distillation for Knowledge Injection

This paper aims to address the Catastrophic Forgetting that occurs when new knowledge is injected into an LLM.

#Review #Knowledge Injection #Self-Distillation #Catastrophic Forgetting #Language Models #Distribution Alignment #Fine-tuning

May 18, 2026

[Paper Review] MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents

This paper identifies a limitation of current GUI agents: in long-horizon tasks, they struggle to maintain task state as the interface changes.

#Review #GUI Agents #Multimodal Memory #Long-Horizon #Memory Control #MLLM #Working Memory #Episodic Memory

May 18, 2026

[Paper Review] Measuring Maximum Activations in Open Large Language Models

This paper re-examines the conventional belief, in the recent open LLM ecosystem, that the dynamic range of activations simply scales with parameter count. It systematically measures the per-model Maximum Activation Magnitude (MM) to identify deployment risks.

#Review #Large Language Models #Activation Range #Quantization #Maximum Activation #LLM Inference #Residual Stream #Model Scaling

May 18, 2026

[Paper Review] LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

To address the memory bottleneck and low compute efficiency of long video generation, this paper proposes LongLive-2.0, an infrastructure that integrates system and algorithm.

#Review #Long Video Generation #NVFP4 #Sequence Parallelism #Autoregressive Diffusion #KV Cache Quantization #Balanced SP

May 18, 2026

[Paper Review] LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs

This study focuses on the chronic computational complexity and efficiency bottlenecks that arise when scaling Video LLMs for long-form video understanding.

#Review #Video LLMs #Vision Encoder #Token Compression #Compressed Token Distillation #Long-form Video Understanding #Spatio-temporal Modeling

May 18, 2026

[Paper Review] Lance: Unified Multimodal Modeling by Multi-Task Synergy

This paper addresses the performance degradation and limited task coverage that arise when existing multimodal models unify the two heterogeneous objectives of understanding and generation.

#Review #Unified Multimodal Modeling #Multi-Task Synergy #Dual-Stream Architecture #Modality-Aware Rotary Positional Encoding #Autoregressive Modeling #Flow Matching

May 18, 2026

[Paper Review] KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration

Existing alignment methods for video generation models mostly rely on noise-based exploration or SDE-based surrogate policies, which conflicts with the nature of distilled AR models that operate under deterministic ODE dynamics.

#Review #Autoregressive Video Generation #Reinforcement Learning #Policy Optimization #Flow Matching #KV Caching #Causal-Semantic Exploration #Trajectory Velocity Energy

May 18, 2026

[Paper Review] Incantation: Natural Language as the Action Interface for Multi-Entity Video World Models

This paper addresses a structural limitation of modern interactive video world models: a rigidly fixed Action Interface.

#Review #Interactive Video World Model #Natural Language Action Interface #Multi-Entity Control #Cross-Entity Transfer #Streaming Inference #Self-Forcing Distillation

May 18, 2026

[Paper Review] Geometric Phase Transition Enables Extreme Hippocampal Memory Capacity

This study seeks to explain how biological memory systems dramatically expand their information capacity without physically adding neurons.

#Review #Hippocampal Memory #Geometric Stability #Neural Manifold #Population Code #Excitatory-Inhibitory Dynamics #Crystalline Code

May 18, 2026

[Paper Review] GRASP: Learning to Ground Social Reasoning in Multi-Person Non-Verbal Interactions

This paper addresses the difficulty current MLLMs have with social reasoning grounded in subtle non-verbal cues in multi-person videos.

May 18, 2026

[Paper Review] From Runnable to Shippable: Multi-Agent Test-Driven Development for Generating Full-Stack Web Applications from Requirements

This paper aims to address the failure of current coding agents to meet more than 70% of functional requirements when generating web applications. Existing agents validate only against code files or terminal output, but the correctness of a web application can only be assessed through dynamic interaction in a browser environment.

#Review #Multi-Agent System #Test-Driven Development #Web Development #Code Generation #Closed-Loop Validation #Large Language Model

May 18, 2026

[Paper Review] FINESSE-Bench: A Hierarchical Benchmark Suite for Financial Domain Knowledge and Technical Analysis in Large Language Models

This paper proposes FINESSE-Bench to overcome the limitations of existing financial benchmarks and to precisely diagnose the practical financial expertise of LLMs.

#Review #Large Language Models #Financial Benchmarking #Difficulty Hierarchy #Technical Analysis #LLM-as-Judge #Professional Competence #Financial Reasoning

May 18, 2026

[Paper Review] Evaluating Cognitive Age Alignment in Interactive AI Agents

This paper aims to address the problem that state-of-the-art MLLM agents, despite high task accuracy, give explanations mismatched to the child's cognitive level or attempt overly complex reasoning when interacting with real children.

#Review #Cognitive Age Alignment #MLLM Agents #ChildAgentEval #Developmental Psychology #Skill-Guided Distillation #WISC #Interactive Evaluation

May 18, 2026