Review

[논문리뷰] False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize

본 연구는 대규모 언어 모델(LLM)의 악성 입력 감지를 위해 제안된 프루빙 기반(probing-based) 방법론 의 신뢰성을 재평가하는 것을 목표로 합니다.

#Review #LLM Safety #Malicious Input Detection #Probing Classifiers #Out-of-Distribution Generalization #Superficial Patterns #Instructional Patterns #Trigger Words #AI Safety

2025년 9월 5일

[논문리뷰] Durian: Dual Reference-guided Portrait Animation with Attribute Transfer

본 논문은 주어진 참조 이미지로부터 대상 인물의 얼굴 속성(예: 헤어스타일, 안경)을 전이하여 동적인 초상화 애니메이션 비디오를 제로샷(zero-shot) 방식으로 생성하는 것을 목표로 합니다.

#Review #Portrait Animation #Attribute Transfer #Diffusion Models #Dual Reference Networks #Zero-shot Learning #Self-Reconstruction #Facial Editing

2025년 9월 5일

[논문리뷰] Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

본 연구는 LLM(Large Language Models)이 겉으로는 논리적이지만 심층적인 역설적 의미를 담고 있는 'Drivelology(심오한 헛소리)'를 얼마나 깊이 이해하는지 평가하는 것을 목표로 합니다. 통계적 유창성을 넘어선 LLM의 진정한 인지적 이해, 특히 실용적 이해 의 근본적인 한계를 밝히고자 합니다.

#Review #Large Language Models #Pragmatic Understanding #Drivelology #Benchmark Dataset #Multilingual NLP #Semantic Reasoning #Contextual Inference

2025년 9월 5일

[논문리뷰] Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vector Drawings

본 연구는 2D 벡터 엔지니어링 도면(SVG 형식)으로부터 파라메트릭 CAD 모델을 자동으로 생성 하는 문제를 해결하는 것을 목표로 합니다.

#Review #CAD Generation #Vector Graphics #Sequence-to-Sequence Learning #Transformer Architecture #Engineering Drawings #Multi-modal Learning #Soft Target Loss #Dual Decoder

2025년 9월 5일

[논문리뷰] Delta Activations: A Representation for Finetuned Large Language Models

다양하게 미세 조정된 대규모 언어 모델(LLM)의 방대한 생태계에서 모델 간의 유사점과 차이점을 효율적으로 파악하고, 모델을 검색, 비교 및 클러스터링할 수 있는 표준화된 표현 방식 이 부족한 문제를 해결하는 것이 목표입니다. 이는 기존의 메타데이터 부족 문제를 극복하고 모델 재사용을 촉진하기 위함입니다.

#Review #LLM Embedding #Delta Activations #Finetuned Models #Model Representation #Model Clustering #Additive Property #Task Embedding #Model Merging

2025년 9월 5일

[논문리뷰] DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks

본 논문은 기존 벤치마크의 데이터 누출 위험과 비현실적인 평가 방식의 한계를 극복하기 위해, 대규모 언어 모델(LLM) 기반 연구 에이전트 의 실제 연구 능력을 평가하기 위한 새로운 벤치마크인 DeepResearch Arena 를 제안합니다.

#Review #LLM Evaluation #Research Agents #Benchmark #Multi-Agent System #Seminar-Grounded Tasks #Data Leakage Prevention #Ill-Structured Problems

2025년 9월 5일

[논문리뷰] Robix: A Unified Model for Robot Interaction, Reasoning and Planning

본 논문은 일반ist 로봇이 복잡한 장기 작업을 추론하고 자연스러운 인간 상호작용에 참여할 수 있도록 단일 비전-언어 아키텍처 내에서 로봇 추론, 태스크 플래닝, 자연어 상호작용을 통합하는 Robix 모델을 제안합니다.

#Review #Robot Learning #Vision-Language Models (VLMs)#Embodied AI #Human-Robot Interaction (HRI)#Task Planning #Reinforcement Learning (RL)#Chain-of-Thought (CoT) Reasoning #Robotics

2025년 9월 4일

[논문리뷰] Open Data Synthesis For Deep Research

본 논문은 기존 벤치마크들이 '심층 연구(Deep Research)' 작업을 위한 충분한 구조적 깊이를 제공하지 못하는 한계를 해결하고자 합니다. 특히, 복잡한 질문을 하위 문제로 분해하고, 다단계 추론을 조율하며, 다양한 출처에서 증거를 합성해야 하는 작업에 초점을 맞춥니다.

#Review #Data Synthesis #Deep Research #Hierarchical Constraint Satisfaction Problems #Large Language Models #Agentic AI #Reinforcement Learning #Question Answering

2025년 9월 4일

[논문리뷰] Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

논문은 기존 생성 모델이 의미론적 제어와 사진 같은 사실성 사이의 섬세한 균형을 맞추는 데 어려움을 겪고, 특히 Diffusion Transformer (DiT) 가 복잡한 다중 모드 조건부 설정에서 충분히 탐색되지 않았다는 문제를 해결하고자 합니다.

#Review #Diffusion Transformer #Mixture of Experts #Controllable Generation #Face Generation #Multimodal Synthesis #Semantic Control #Image Generation

2025년 9월 4일

[논문리뷰] MOSAIC: Multi-Subject Personalized Generation via Correspondence-Aware Alignment and Disentanglement

이 논문은 다중 피사체 개인화 이미지 생성 시 발생하는 정체성 혼합(identity blending) 및 속성 유출(attribute leakage) 문제를 해결하는 것을 목표로 합니다.

#Review #Multi-Subject Generation #Personalized Image Synthesis #Semantic Correspondence #Attention Disentanglement #Diffusion Models #Identity Preservation #Dataset

2025년 9월 4일

[논문리뷰] LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

언어 모델(LMs)이 사전 훈련 과정에서 지식 표현을 어떻게 형성하고 발전시키는지에 대한 내부 프로세스를 분석하는 것입니다.

#Review #Language Models #Knowledge Acquisition #Pretraining Data #Entity Linking #Coreference Resolution #Information Retrieval #Model Analysis #Checkpoints

2025년 9월 4일

[논문리뷰] ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association

본 연구는 기존 모노큘러 덴스 SLAM 시스템의 주요 한계점인 카메라 인트린직스(intrinsics) 필요성, 높은 계산 복잡성, 그리고 장기적인 시퀀스에서의 드리프트 축적 문제를 해결하는 것을 목표로 합니다.

#Review #Monocular SLAM #Dense Reconstruction #Neural Networks #Pose Graph Optimization #Intrinsics-free #Real-time #Two-view Association

2025년 9월 3일

[논문리뷰] VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

논문은 LLM의 독립적인 추론과 상호작용적 에이전트 지능 사이의 격차를 해소하고자 합니다.

#Review #Agentic Reinforcement Learning #Tool Use #Large Language Models #Reinforcement Learning from Verifiable Rewards (RLVR)#Asynchronous Execution #Multi-modal AI #Framework

2025년 9월 3일

[논문리뷰] Universal Deep Research: Bring Your Own Model and Strategy

이 논문은 기존의 심층 연구 도구(DRT)들이 고정된 연구 전략과 제한적인 모델 선택으로 인해 사용자 정의가 어렵고 특정 산업에 특화된 연구 전략을 구축하기 어렵다는 문제를 제기합니다.

#Review #Agentic Systems #Language Models (LLMs)#Research Automation #Customizable Strategies #Code Generation #Deep Research #User-Defined Agents #Sandboxed Execution

2025년 9월 3일

[논문리뷰] UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

본 연구는 데이터 희소성, 확장 가능한 멀티-턴 강화 학습(RL), GUI 전용 작동의 한계, 환경 확장성 및 안정성 과 같은 자율 GUI 에이전트 개발의 주요 과제를 해결하는 것을 목표로 합니다.

#Review #GUI Agent #Multi-Turn RL #Reinforcement Learning #Data Flywheel #Agent Framework #Hybrid Environments #Parameter Interpolation

2025년 9월 3일

[논문리뷰] Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

본 논문은 3D 포인트 클라우드 학습에서 기존 단일 뷰(single-view) 기반 마스킹 재구성(masked reconstruction) 방식의 한계를 극복하고, 더 다양하고 도전적인 두 뷰(two-view) 기반 사전 학습 패러다임 을 탐구하는 것을 목표로 합니다.

#Review #Point Cloud Learning #Self-Supervised Learning #Cross Reconstruction #Decoupled Views #Generative Models #Positional Encoding #3D Vision

2025년 9월 3일

[논문리뷰] The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

본 설문조사는 LLM(Large Language Models)을 수동적인 시퀀스 생성기에서 자율적인 의사 결정 에이전트로 전환하는 Agentic RL(Agentic Reinforcement Learning) 패러다임의 등장을 탐구합니다.

#Review #Agentic Reinforcement Learning #Large Language Models #LLM Agents #Sequential Decision Making #Policy Optimization #Tool Use #Dynamic Environments #Autonomous AI

2025년 9월 3일

[논문리뷰] The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang

이 논문은 대규모 언어 모델(LLMs)이 언어 학습에서 인간과 유사한 메타언어적 추론 능력 을 진정으로 갖추고 있는지 평가하는 것을 목표로 합니다. LLM의 성공이 단순한 패턴 매칭이 아닌, 명시적인 문법 규칙과 어휘를 통해 낯선 언어를 학습하고 적용 하는 능력에서 비롯되는지 진단하고자 합니다.

#Review #LLMs #Metalinguistic Reasoning #Constructed Language #Camlang #Second Language Acquisition #Zero-shot Learning #Natural Language Understanding #Commonsense Reasoning

2025년 9월 3일

[논문리뷰] SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

본 논문은 Reinforcement Learning (RL)을 사용하여 Multi-turn Tool-Integrated Reasoning (TIR)을 수행하는 Large Language Models (LLMs)의 훈련 시 발생하는 불안정성, 특히 그래디언트 폭발 과 성능 저하 문제를 해결하는 것을 목표로 합니다.

#Review #Reinforcement Learning #Large Language Models #Tool-Integrated Reasoning #Multi-turn Reasoning #Gradient Explosion #Training Stability #Trajectory Filtering #Zero RL

2025년 9월 3일

[논문리뷰] SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction

본 논문은 자연어 질의를 SQL 쿼리로 변환하는 Text-to-SQL (NL2SQL) 시스템의 견고성과 신뢰성을 향상시키는 것을 목표로 합니다. 특히, 기존 시스템들이 실행 기반 피드백에만 의존하여 논리적으로 부정확하지만 문법적으로 유효한 SQL 쿼리 오류를 수정하지 못하는 한계를 극복하고자 합니다.

#Review #Text-to-SQL #Multi-agent Systems #Chain-of-Thought #Error Correction #Large Language Models #Query Planning #Database Interaction

2025년 9월 3일