Review

[Paper Review] Elucidating the SNR-t Bias of Diffusion Probabilistic Models

The authors propose DCW (Differential Correction in Wavelet domain) to mitigate SNR-t bias. It is a training-free, plug-and-play method that applies differential correction at inference time.

#Review #Diffusion Probabilistic Models #SNR-t Bias #Differential Correction #Wavelet Domain #Generation Quality #Training-free

April 19, 2026

[Paper Review] EdgeDetect: Importance-Aware Gradient Compression with Homomorphic Aggregation for Federated Intrusion Detection

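This paper proposes Gradient Smartification, which compresses local gradients into a binary representation and reduces communication payload size by up to 32x. A median-based adaptive threshold is applied in the process, addressing the fixed-threshold problem of conventional signSGD (noise and instability).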
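
The 32x figure follows from replacing each 32-bit float with a single bit. The sketch below is only an illustration of that arithmetic, not the paper's implementation: `compress_gradient` is a hypothetical helper, and thresholding each entry against the median of the gradient is an assumed reading of "median-based adaptive threshold".

```python
import numpy as np

def compress_gradient(grad):
    """Hypothetical helper: binarize a gradient with a median threshold.

    Assumption: one bit per entry, set when the entry exceeds the median
    of the gradient. The paper's exact rule may differ.
    """
    flat = np.asarray(grad, dtype=np.float32).ravel()
    threshold = np.median(flat)          # adaptive: recomputed per gradient
    bits = flat > threshold              # boolean mask, one flag per entry
    packed = np.packbits(bits)           # 8 flags per uint8 byte
    return packed, float(threshold)

rng = np.random.default_rng(0)
grad = rng.standard_normal(1024).astype(np.float32)

packed, threshold = compress_gradient(grad)
ratio = grad.nbytes / packed.nbytes      # 4096 bytes vs 128 bytes
print(f"payload: {grad.nbytes} B -> {packed.nbytes} B ({ratio:.0f}x)")
```

The ratio reaches exactly 32 only when the gradient length is a multiple of 8, since `np.packbits` pads the final byte, which is consistent with the "up to 32x" wording.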

#Review #Federated Learning #Intrusion Detection #Gradient Compression #Homomorphic Encryption #6G-IoT #Median-Thresholding

April 19, 2026

[Paper Review] DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off

This paper targets the imbalance between exploration and exploitation on extreme samples (extremely hard or easy samples) that arises during GRPO-based RL training of LLMs.

#Review #Large Language Models #Reinforcement Learning #Exploration-Exploitation Trade-Off #Perplexity #Reward Shaping

April 19, 2026

[Paper Review] Can Large Language Models Reinvent Foundational Algorithms?

This work combines GRPO-based on-policy unlearning with a cold-start stage to remove knowledge of the target algorithm from the model. In the reinvention stage, the model interacts with a Python interpreter and, on failure, revises its solution using diagnostic feedback from a Generative Verifier.

#Review #Large Language Models #LLM Unlearning #Algorithmic Invention #GRPO #Test-time Reinforcement Learning

April 19, 2026

[Paper Review] ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics

This paper proposes ArtifactNet, a framework that recasts AI-generated music detection as the analysis of physical forensic residuals. The system has three stages: forensic residual extraction with ArtifactUNet, 7-channel feature generation with HPSS, and a lightweight CNN classifier for the final decision.

#Review #AI-generated music #Forensic physics #Residual Vector Quantization #ArtifactNet #ArtifactBench #Codec-aware training #HPSS

April 19, 2026

[Paper Review] AccelOpt: A Self-Improving LLM Agentic System for AI Accelerator Kernel Optimization

This paper addresses the problem that developing high-performance kernels for recent AI accelerators (e.g., Amazon Trainium) is extremely difficult and costly.

#Review #LLM Agent #Kernel Optimization #AI Accelerator #Amazon Trainium #Beam Search #Optimization Memory

April 19, 2026

[Paper Review] (1D) Ordered Tokens Enable Efficient Test-Time Search

This paper proposes the SoTo framework and systematically analyzes the interplay among tokenizer structures, search algorithms, verifiers, and AR priors. The proposed method uses a 1D ordered tokenizer such as FlexTok, trained so that intermediate token sequences capture the global semantics of the whole image.

#Review #tokenization #test-time scaling #autoregressive model #search #coarse-to-fine

April 19, 2026

[Paper Review] Towards Autonomous Mechanistic Reasoning in Virtual Cells

This paper formalizes biological reasoning as a Directed Acyclic Graph (DAG), making the reasoning process explicitly defined and verifiable. The proposed VCR-Agent is a two-stage pipeline consisting of a Report Generator and an Explanation Constructor.

#Review #Virtual Cells #Large Language Models #Mechanistic Reasoning #Structured Explanation #Knowledge Retrieval #Verifier-based Filtering

April 16, 2026

[Paper Review] SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems

This paper introduces FRQAD and Local TurboQuant, both grounded in information geometry, to achieve memory storage efficiency and retrieval precision at the same time. The authors use Fokker-Planck dynamics to manage the memory lifecycle mathematically, which yields staged memory compression from high to low precision (32-bit down to 2-bit).

#Review #Agent Memory #Information Geometry #Vector Quantization #Ebbinghaus Forgetting #Cognitive Architecture #Soft Prompts #Fisher-Rao

April 16, 2026

[Paper Review] RadAgent: A tool-using AI agent for stepwise interpretation of chest computed tomography

This paper proposes RadAgent, a framework that automatically learns an optimal tool-use strategy through Reinforcement Learning. RadAgent first drafts an initial report, then runs a stepwise agent loop guided by a clinical diagnostic checklist, calling the tools it needs and updating the results.

#Review #RadAgent #Reinforcement Learning #Vision-Language Models #Chest CT #Medical Report Generation #Tool-using AI Agent #Faithfulness #Robustness

April 16, 2026

[Paper Review] OneHOI: Unifying Human-Object Interaction Generation and Editing

This paper proposes OneHOI, a unified framework that addresses the inefficiency of HOI (Human-Object Interaction) generation and editing having developed as separate research tracks.

#Review #Human-Object Interaction #Diffusion Transformer #Image Editing #Unified Framework #Relational Modeling #Spatial Control

April 16, 2026

[Paper Review] Model Capability Dominates: Inference-Time Optimization Lessons from AIMO 3

This paper examines whether Inference-Time Optimization techniques for improving the mathematical reasoning of LLMs deliver real gains.

#Review #LLM #Mathematical Reasoning #Inference-Time Optimization #Majority Voting #Self-Consistency #Diverse Prompting

April 16, 2026

[Paper Review] MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

This paper proposes MM-WebAgent to address the problems of global consistency and visual-element integration in existing automatic webpage generation approaches.

#Review #Multimodal Web Agent #Hierarchical Planning #Self-Reflection #Webpage Generation #AIGC

April 16, 2026

[Paper Review] LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning

This paper addresses the underuse of the model's Intrinsic Representation during RL aimed at strengthening the long-context reasoning of LLMs.

#Review #Reinforcement Learning #Large Language Models #Long-context #Sparsity #Activation Patterns #Saliency-guided

April 16, 2026

[Paper Review] LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

This paper addresses the high memory cost and gradient explosion of existing Direct-Gradient methods when aligning Flow Matching models with human preferences.

#Review #Flow Matching #Preference Alignment #Direct-Gradient Method #Leap Trajectory #Trajectory-Similarity Weighting #Gradient Discounting

April 16, 2026

[Paper Review] KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs

This paper aims to resolve the context dependence of the KV cache, and the resulting inference latency, that frequently arises in RAG (Retrieval-Augmented Generation) settings.

#Review #LLM #KV Cache #RAG #Recomputation-Free #Soft-token Adapter #Self-Supervised Distillation #Attention Dynamics

April 16, 2026

[Paper Review] Cross-Tokenizer LLM Distillation through a Byte-Level Interface

This paper proposes a general-purpose Cross-Tokenizer Distillation (CTD) technique to resolve tokenizer mismatch, a key constraint of LLMs.

#Review #Cross-Tokenizer Distillation #Byte-Level Interface #Knowledge Distillation #LLM #Vocabulary Mismatch

April 16, 2026

[Paper Review] C2: Scalable Rubric-Augmented Reward Modeling from Binary Preferences

This paper recasts rubric generation and rubric-based verification as a cooperative yet critical communication process. The proposed method, C2, first synthesizes rubrics into Helpful and Misleading ones based on the Verifier's confidence, then uses these pairs to train the Generator with DPO and the Verifier with GRPO.

#Review #Reward Modeling #Reinforcement Learning from Human Feedback (RLHF) #Rubric-Augmented Verification #Binary Preferences #Cooperative Communication

April 16, 2026

[Paper Review] Target Policy Optimization

This paper addresses the problem that existing Policy-Gradient methods train very unstably and ineffectively in sparse-reward environments.

#Review #Target Policy Optimization #Sparse Reward #Policy Gradient #Cross-Entropy #RLVR #Grouped RL

April 15, 2026

[Paper Review] SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

This paper addresses the cost of data annotation and the limits of model-consensus-based training in learning 3D spatial reasoning.

#Review #Spatial Reasoning #Self-Evolution #Vision-Language Models #Deterministic Geometric Environment #Reinforcement Learning

April 15, 2026