Review

[Paper Review] Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection

Large language models (LLMs) remain vulnerable to harmful behavior despite alignment efforts, and existing activation steering techniques suffer from generation collapse caused by failing to preserve norms, from the need for careful coefficient tuning, or from being limited to binary control. This paper sets out to address these limitations.

#Review #Activation Steering #Large Language Models (LLMs) #Norm Preservation #Discriminative Layer Selection #Behavior Control #Inference-time Intervention #Angular Steering

January 27, 2026

[Paper Review] Revisiting Parameter Server in LLM Post-Training

The goal is to resolve the workload imbalance that arises during large language model (LLM) post-training because of high variance in sequence length.

#Review #LLM Post-Training #Parameter Server #Distributed Training #FSDP #On-Demand Communication #Workload Imbalance #Communication Optimization #Deep Learning

January 27, 2026

[Paper Review] Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Scaling of large language models (LLMs) has hit a limit. Depth scaling in particular offers superior expressivity in theory, but existing Transformer architectures are difficult to train stably at extreme depth.

#Review #Transformer Architecture #Layer Normalization #Depth Scaling #Training Stability #Large Language Models #Gradient Flow #Highway Networks #Post-LayerNorm

January 27, 2026

[Paper Review] HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences

This paper aims to systematically investigate the growing spread of hallucinated citations (HalluCitation) in academic papers, particularly in AI/ML, and the impact they have.

#Review #Hallucinated Citations #NLP Conferences #Citation Detection #Academic Integrity #Peer Review #Large Language Models (LLMs) #Bibliometrics

January 27, 2026

[Paper Review] GPCR-Filter: a deep learning framework for efficient and precise GPCR modulator discovery

The paper aims to address the complexity of discovering GPCR (G protein-coupled receptor) modulators and the limitations of existing screening methods, which are slow, costly, and fail to capture complex dynamic interactions.

#Review #GPCR #Drug Discovery #Deep Learning #Protein Language Model #Graph Neural Network #Attention Mechanism #Drug Target Interaction #Virtual Screening

January 27, 2026

[Paper Review] FABLE: Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval for Multi-Document Reasoning

This paper aims to address the 'lost-in-the-middle' phenomenon, high computational cost, and limited scalability to multi-document reasoning in long-context LLMs, and to overcome the semantic noise and limited structured cross-document synthesis of existing RAG systems.

#Review #RAG #LLM-Enhanced Retrieval #Multi-Document Reasoning #Hierarchical Indexing #Bi-Path Retrieval #Adaptive Retrieval #Knowledge Organization #Context Window Optimization

January 27, 2026

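[Paper Review] AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

The paper seeks to address the complex safety and security problems that arise from AI agents' autonomous tool use and interaction with their environment. The goal is to overcome two limitations of existing guardrail models, their lack of awareness of agent-specific risks and their lack of diagnostic transparency, and to present AgentDoG, a diagnostic guardrail framework that covers complex and diverse risky behaviors.

#Review #AI Agents #Safety Guardrails #Explainable AI (XAI) #Risk Taxonomy #Benchmarking #LLM Safety #Tool Use #Agent Alignment

January 27, 2026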

[Paper Review] AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning

This paper aims to develop adaptive, multi-step tool-use capability in order to improve the visual reasoning of multimodal large language models (MLLMs). It seeks to address the difficulty existing MLLMs have in flexibly using and coordinating tools when they encounter new tools or tasks.

#Review #Multimodal LLMs #Tool Orchestration #Visual Reasoning #Reinforcement Learning #Adaptive Learning #Generalization #Tool Use

January 27, 2026

[Paper Review] AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking

To evaluate understanding of the human cultural context of time-varying audio-visual signals, which existing benchmarks have not covered, this paper presents AVMeme Exam, a new benchmark that diagnoses the contextual and cultural knowledge and thinking abilities of multimodal large language models (MLLMs).

#Review #Multimodal LLMs #Benchmark #Cultural Understanding #Contextual Inference #Audio-Visual Memes #Multilingual #Q&A Evaluation

January 27, 2026

[Paper Review] A Pragmatic VLA Foundation Model

This paper addresses the problem of getting Vision-Language-Action (VLA) foundation models for robot manipulation to generalize cost-effectively across diverse tasks and platforms.

#Review #Vision-Language-Action Model #Robotics #Foundation Models #Multi-Embodiment Learning #Data Scaling #Computational Efficiency #Real-world Deployment

January 27, 2026

[Paper Review] iFSQ: Improving FSQ for Image Generation with 1 Line of Code

The paper aims to bridge the gap between autoregressive (AR) models and diffusion models in image generation and to build a unified tokenizer for both.

#Review #Finite Scalar Quantization (FSQ) #Image Generation #Autoregressive Models #Diffusion Models #Quantization #Tokenization #Representation Alignment (REPA) #Latent Space

January 26, 2026

[Paper Review] daVinci-Dev: Agent-native Mid-training for Software Engineering

This paper seeks to overcome resource constraints and data mismatch, the limitations of existing post-training approaches (SFT, RL) in developing LLM-based code agents.

#Review #Agentic Software Engineering #Mid-training #Large Language Models #Agent-native Data #Contextual Trajectories #Environmental Trajectories #SWE-Bench Verified #Code Generation

January 26, 2026

[Paper Review] VIBEVOICE-ASR Technical Report

Despite progress in short-form speech recognition, understanding long-form audio (e.g., meetings, podcasts) remains difficult because of context fragmentation and multi-speaker complexity. This paper seeks to address that problem.

#Review #Automatic Speech Recognition #Speaker Diarization #Long-form Audio #Large Language Models #End-to-end Speech Processing #Multilingual #Context-aware ASR

January 26, 2026

[Paper Review] The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation

The paper aims to address the failure of computer vision models to maintain coherence over long narratives, and the 'semantic gap' between high-level ideas such as dialogue and their cinematic execution. In particular, it seeks to develop an end-to-end framework that automatically generates long-horizon, coherent cinematic video from dialogue input alone.

#Review #Dialogue-to-Video Generation #Agentic AI #Cinematic Scripting #Long-Horizon Video Synthesis #Visual Coherence #Reinforcement Learning #Multimodal LLM

January 26, 2026

[Paper Review] Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

This paper aims to help large language models (LLMs) escape learning plateaus on hard reasoning problems, where the initial success rate is low and the training signal is therefore sparse.

#Review #Meta-RL #Curriculum Learning #Self-Play #LLM Reasoning #Sparse Rewards #Question Generation #Bilevel Optimization

January 26, 2026

[Paper Review] SkyReels-V3 Technique Report

This paper presents SkyReels-V3, a unified multimodal conditional video generation framework that integrates visual reference, video, audio, and text inputs to enable flexible and controllable video generation.

#Review #Video Generation #Multimodal AI #Diffusion Models #Transformer Architecture #Reference-guided Generation #Video-to-Video #Audio-driven Animation #Temporal Consistency

January 26, 2026

[Paper Review] Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

The paper seeks to address the scarcity of multimodal data for scientific reasoning and the tendency of existing Text-to-Image (T2I) models to generate images that look plausible but are scientifically inaccurate.

#Review #Scientific Image Synthesis #Multimodal Reasoning #Text-to-Image #Benchmarking #Programmatic Synthesis #Large Multimodal Models #Synthetic Data

January 26, 2026

[Paper Review] STAR: Semantic Table Representation with Header-Aware Clustering and Adaptive Weighted Fusion

This paper aims to address two problems in table retrieval for natural-language queries: the deep semantic mismatch between unstructured queries and structured tables, and the token length limit encountered when processing long tables.

#Review #Table Retrieval #Semantic Representation #K-means Clustering #Weighted Fusion #Large Language Models #Query Generation #Information Retrieval

January 26, 2026

[Paper Review] SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback

This paper addresses the problem of efficiently generating deep search question-answer (QA) pairs that require complex multi-document reasoning.

#Review #Deep Search #Agentic Data Generation #LLMs #Execution Feedback #Reinforcement Learning #Question Answering #Synthetic Data

January 26, 2026

[Paper Review] Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents

This study aims to address the generalization problem that arises when large language model (LLM) agents are post-trained in a narrow range of environments and then deployed to broad, previously unseen domains.

#Review #LLM Agents #Reinforcement Learning #Cross-Domain Generalization #State Information Richness #Planning Complexity #State Augmentation #Step-by-Step Reasoning #Mid-Training

January 26, 2026