Review

[Paper Review] Flow-OPD: On-Policy Distillation for Flow Matching Models

This paper addresses the reward sparsity and gradient interference that arise during multi-task alignment of Flow Matching models.

#Review #Flow Matching #On-Policy Distillation #Reinforcement Learning #Multi-task Alignment #Manifold Anchor Regularization #Text-to-Image

May 10, 2026

[Paper Review] Fast Byte Latent Transformer

This paper aims to solve the chronic inference-speed problem of byte-level language models. Unlike subword models, byte-level models operate on much longer input sequences, so naive autoregressive decoding makes inference very slow.

#Review #Byte-level Language Model #BLT #Diffusion #Inference Acceleration #Speculative Decoding #Latent Tokenization

May 10, 2026

[Paper Review] Empirical Evidence for Simply Connected Decision Regions in Image Classifiers

This paper investigates whether the decision regions learned by modern deep neural networks are not merely path connected but also satisfy the stronger topological property of being simply connected.

#Review #Deep Neural Networks #Decision Regions #Topology #Simply Connected #Coons Patches #Adversarial Robustness

May 10, 2026

[Paper Review] DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

This paper addresses the lack of a standardized platform and benchmark for systematically evaluating the security threats facing AI agents that automate complex workflows.

#Review #AI Agents #Red-Teaming #Safety Evaluation #Agentic Systems #Security Risk Assessment

May 10, 2026

[Paper Review] CPCANet: Deep Unfolding Common Principal Component Analysis for Domain Generalization

This paper addresses a limitation of existing domain generalization (DG) methods: they align statistical distances between datasets or rely on the representational power of large models, and so fail to directly extract the invariant structure shared across domains.

#Review #Domain Generalization #Common Principal Component Analysis #Deep Unfolding Networks #Riemannian Optimization #Stiefel Manifold

May 10, 2026

[Paper Review] CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment

The current LLM lifecycle is fixed to two stages, large-scale pretraining and finetuning, so learning stops entirely once a model is deployed.

#Review #Large Language Models #Deployment-Time Learning #Case-Based Reasoning #Contextual Bandit #No-Regret Learning #Experiential Learning

May 10, 2026

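[Paper Review] Beyond Retrieval: A Multitask Benchmark and Model for Code Search

In code search benchmarking, data contamination, reliance on a single evaluation metric, and evaluation setups that diverge from real deployment conditions make it difficult to measure model performance precisely.

#Review #Code Search #Benchmark #Reranker #Data Contamination #Retrieval-Augmented Generation #Code LLM

May 10, 2026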

[Paper Review] Anisotropic Modality Align

Training MLLMs faces a chronic shortage of high-quality paired multimodal data, and aligning unimodal data in a shared embedding space has drawn attention as a way to address it.

#Review #Multimodal Large Language Models #Modality Gap #Unpaired Alignment #Anisotropic Geometric Correction #Representation Learning

May 10, 2026

[Paper Review] AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning

This paper proposes a fine-grained, response-level credit assignment framework to address the sparse, outcome-level reward problem in agentic RL.

#Review #Agentic Reinforcement Learning #Credit Assignment #Adaptive Entropy Modulation #Large Language Models #Exploration-Exploitation Trade-off #Surprisal #Policy Optimization

May 10, 2026

[Paper Review] 4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding

This paper proposes 4DThinker to address the opacity and limited performance of existing VLMs in dynamic spatial reasoning. Prior work either describes the reasoning process in text only or depends on external geometric modules, which increases reasoning complexity and restricts the model's own intrinsic capabilities.

#Review #Vision-Language Models #Dynamic Spatial Reasoning #Latent Mental Imagery #Dynamic-Imagery Fine-Tuning (DIFT) #4D Reinforcement Learning (4DRL) #Chain-of-Thought (CoT)

May 10, 2026

[Paper Review] The Scaling Properties of Implicit Deductive Reasoning in Transformers

This paper characterizes the scaling limits of the deductive reasoning that depth-bounded Transformers perform implicitly.

#Review #Transformers #Implicit Deductive Reasoning #Horn Clauses #Chain-of-Thought #Scaling Properties #Shortcut Learning #Algorithmic Alignment

May 7, 2026

[Paper Review] TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

This paper addresses the absence of a unified representation paradigm for tabular data, in contrast to the success of LLMs in natural language processing.

#Review #Tabular Embedding #Contrastive Learning #Tabular Understanding #Foundation Models #Representation Learning #Tabular Retrieval

May 7, 2026

[Paper Review] SwiftI2V: Efficient High-Resolution Image-to-Video Generation via Conditional Segment-wise Generation

This paper addresses the severe trade-off between computational efficiency and fidelity to the input image in 2K high-resolution I2V generation.

#Review #Image-to-Video #High-Resolution Generation #Diffusion Transformer #Conditional Segment-wise Generation #Efficiency #Streaming Inference

May 7, 2026

[Paper Review] MARBLE: Multi-Aspect Reward Balance for Diffusion RL

This paper addresses the performance degradation that occurs when several rewards are optimized simultaneously while fine-tuning a diffusion model to human preferences.

#Review #Diffusion Models #Reinforcement Learning #Multi-Reward Optimization #Gradient Harmonization #Reward Balancing #Alignment

May 7, 2026

[Paper Review] Continuous-Time Distribution Matching for Few-Step Diffusion Distillation

This paper addresses the performance degradation caused by existing diffusion distillation methods relying too heavily on fixed discrete timesteps (discrete anchors) during training and inference.

#Review #Diffusion Models #Distillation #Continuous-Time Optimization #Distribution Matching #Few-Step Generation #Flow Matching

May 7, 2026

[Paper Review] Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes

This paper aims to automate the propose-measure-revise loop of machine learning research with language model agents, without human intervention. Whereas prior automation work has mostly been limited to generating single model outputs or to narrow hyperparameter search, this study targets substantive changes to code structure across the actual training pipeline.

#Review #Auto Research #Language Agents #Closed-Loop #Training Recipes #Specialist Agents #Compute-Budgeted #Lineage Feedback

May 7, 2026

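[Paper Review] Audio-Visual Intelligence in Large Foundation Models

Although multimodal learning is essential in the era of large foundation models, systematic research remains difficult because of the challenge of aligning audio and visual data, inconsistent taxonomies, and fragmented evaluation methodologies. This paper sets out to address that problem.

#Review #Audio-Visual Intelligence #Foundation Models #Multimodal Fusion #Embodied AI #Cross-modal Generation

May 7, 2026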

[Paper Review] AI Co-Mathematician: Accelerating Mathematicians with Agentic AI

This paper proposes an AI co-mathematician that provides stateful workflows to support the complex, iterative process of real mathematical research.

#Review #Agentic AI #Mathematical Research #Interactive Workspace #Workstream #Stateful Workflow #Uncertainty Management #FrontierMath

May 7, 2026

[Paper Review] X2SAM: Any Segmentation in Images and Videos

This paper aims to build a unified segmentation framework covering both static images and dynamic videos by combining the strong reasoning ability of MLLMs with the precise pixel-level perception of foundation segmentation models.

#Review #MLLM #Segmentation #Video-Understanding #Mask-Memory #Visual-Prompting #Spatio-Temporal-Consistency

May 5, 2026

[Paper Review] Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

This benchmark was proposed to address the failure of existing agent benchmarks to adequately reflect the complex, large-scale file dependencies of real work environments.

#Review #AI Agents #Workspace Learning #Benchmark #File Dependency #Large-Scale #Autonomous Agent #Task-File-Driven

May 5, 2026