Review

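[Paper Review] Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

This paper addresses the reproducibility problem caused by incompletely disclosed training details of high-performance reasoning models. It also aims to maximize the reasoning ability of language models by overcoming a limitation of the clipping mechanism in existing RL (reinforcement learning), which suppresses exploration signals and ignores suboptimal trajectories.

#Review #Reasoning LLMs #Reinforcement Learning #PPO #Gradient Clipping #Supervised Fine-tuning #Math Reasoning #Code Generation #Policy Optimization

August 12, 2025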

[Paper Review] Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts

This paper aims to address a limitation of existing MoE (Mixture of Experts) LLMs: fixed parameter activation and the inefficient computation that results from it.

#Review #Mixture of Experts #LLMs #MoE Architecture #Dynamic Activation #Adjugate Experts #Upcycling Strategy #Load Balancing

August 12, 2025

[Paper Review] GLiClass: Generalist Lightweight Model for Sequence Classification Tasks

This study seeks to address the limitations of existing zero-shot text classification models (generative LLMs, cross-encoders, and embedding-based models), namely computational inefficiency, instruction inconsistency, and limited scalability.

#Review #Sequence Classification #Zero-shot Learning #Few-shot Learning #Transformer #Multi-label Classification #PPO #GLiNER #Computational Efficiency

August 12, 2025

[Paper Review] Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control

This paper aims to solve the problem that existing flow-based image editing models, when performing large-scale shape transformations, either fail to achieve the target shape change or unintentionally alter non-target regions.

#Review #Image Editing #Shape Transformation #Rectified Flow #Trajectory Divergence Map #Region Control #Generative Models #Diffusion Models

August 12, 2025

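[Paper Review] Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System

This study points out that state-of-the-art LLM-based agentic fact-checking systems are vulnerable to poisoning attacks that can spread misinformation or undermine the truth. Existing attack methods are ineffective against the claim decomposition and cross-verification mechanisms of these sophisticated systems.

#Review #Adversarial Attack #Poisoning Attack #Fact-checking #LLM Agent #Retrieval Augmented Generation #Misinformation #System Security

August 12, 2025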

[Paper Review] Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs

This paper proposes a new method that effectively prevents open-weight large language models (LLMs) from learning dual-use knowledge (e.g., biothreat proxy knowledge) and increases their tamper resistance against adversarial fine-tuning attacks.

#Review #LLMs #Data Filtering #Pretraining #Tamper Resistance #Biothreat #AI Safety #Circuit Breaking #Machine Unlearning

August 12, 2025

[Paper Review] Compressing Chain-of-Thought in LLMs via Step Entropy

The main goal is to address the high inference cost and inefficiency caused by excessive verbosity and redundancy in the Chain-of-Thought (CoT) reasoning of Large Language Models (LLMs).

#Review #LLM #Chain-of-Thought #CoT Compression #Step Entropy #Reinforcement Learning #SFT #GRPO

August 12, 2025

[Paper Review] BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent

Current evaluation benchmarks for Deep-Research agents (e.g., BrowseComp) depend on live web search APIs, which imposes serious limitations on fairness, reproducibility, and transparency.

#Review #Benchmarking #Deep-Research Agents #LLMs #Retrieval #Curated Corpus #Evaluation #Fairness #Transparency #Reproducibility

August 12, 2025

[Paper Review] Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

This study aims to integrate high-quality visual synthesis into LLMs while preserving their strong reasoning ability. In particular, it seeks to achieve high-fidelity, controllable image generation efficiently by addressing two problems of existing approaches: high training cost and the backbone LLM's insufficient learning of image representations.

#Review #Multimodal LLM #Diffusion Model #CLIP Latent #Image Generation #Multimodal Understanding #ControlNet #Training Efficiency

August 12, 2025

[Paper Review] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

This paper aims to provide a comprehensive overview of the paradigm of self-evolving and lifelong-learning agent systems, which overcome the static-configuration limitation of AI agents based on large language models (LLMs) and can adapt to dynamic, evolving environments.

#Review #Self-Evolving AI Agents #Lifelong Learning #Foundation Models #Multi-Agent Systems #Agent Optimization #Prompt Engineering #Tool Use #AI Safety #Survey

August 12, 2025

[Paper Review] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

The goal is to solve two persistent problems in virtual try-on and try-off: modeling garment-body correspondence under changes in a person's pose and appearance, and preserving the accuracy of fine details.

#Review #Virtual Try-On #Virtual Try-Off #Diffusion Transformer #Bidirectional Learning #Generative AI #Fashion Synthesis #Attention Mechanism #Self-Correction

August 11, 2025

[Paper Review] UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding

This paper seeks to address three limitations of existing GUI agent training and inference methods: the reasoning design dilemma (P1), ineffective rewards (P2), and visual noise on high-resolution displays (P3).

#Review #GUI Agents #Reinforcement Learning #Grounding #MLLMs #Reward Function #Resampling #Visual Noise Reduction

August 11, 2025

[Paper Review] Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal

This paper aims to solve the problem of excessively long reasoning traces in the Chain-of-Thought (CoT) reasoning of large reasoning models (LRMs), reducing training cost and inference latency while maintaining or improving code reasoning performance.

#Review #Code Reasoning #CoT Compression #LLMs #Efficiency #Surprisal #Pruning #Fine-tuning #Large Reasoning Models

August 11, 2025

[Paper Review] MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

This study aims to help LLMs understand and generate text-serialized 3D meshes more effectively by addressing two problems of existing large language model (LLM) based 3D mesh processing: limited dataset scale and the loss of 3D structural information during text serialization.

#Review #3D Mesh Generation #LLMs #Mesh Understanding #Text-to-3D #Primitive-Mesh Decomposition #Progressive Training #Multimodal AI

August 11, 2025

[Paper Review] Memp: Exploring Agent Procedural Memory

This paper aims to address the brittle procedural memory of large language model (LLM) based agents and to give agents a learnable, updatable, lifelong procedural memory. In doing so, it seeks to raise agents' success rates and improve their execution efficiency on similar tasks.

#Review #Procedural Memory #LLM Agents #Memory Management #Task Automation #Lifelong Learning #Experience Replay #Agent Learning

August 11, 2025

[Paper Review] MELLA: Bridging Linguistic Capability and Cultural Groundedness for Low-Resource Language MLLMs

This paper seeks to address a limitation of existing multimodal large language models (MLLMs): they focus on high-resource languages and therefore perform worse on low-resource languages.

#Review #Multimodal Large Language Models #Low-Resource Languages #Cultural Groundedness #Linguistic Capability #Dataset Creation #Multilingual AI

August 11, 2025

[Paper Review] LightSwitch: Multi-view Relighting with Material-guided Diffusion

This paper seeks to solve the problem that existing 2D image relighting generative models either fail to exploit the intrinsic properties of the subject or cannot handle multi-view data in a scalable way, which leads to inadequate relighting results.

#Review #Multi-view Relighting #Diffusion Models #Material-guided #Inverse Rendering #3D Scene Reconstruction #Image Synthesis #Consistent Relighting

August 11, 2025

[Paper Review] InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization

This paper aims to solve the inefficient exploration problem in semantic alignment for natural-language-instruction GUI grounding, a core challenge for MLLM (Multimodal Large Language Model) based GUI agents.

#Review #GUI Grounding #MLLMs #Reinforcement Learning #Policy Optimization #Exploration Strategy #Semantic Alignment #Adaptive Exploration Reward #Human-Computer Interaction

August 11, 2025

[Paper Review] GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

This paper introduces GLM-4.5, an open-source MoE (Mixture-of-Experts) large language model. Its core goals are to achieve strong performance across agentic, reasoning, and coding (ARC) tasks and to maximize computational efficiency through a hybrid reasoning approach that supports both thinking and direct-response modes.

#Review #Large Language Model #Mixture-of-Experts #Agentic AI #Reasoning #Code Generation #Reinforcement Learning #Foundation Model

August 11, 2025

[Paper Review] GENIE: Gaussian Encoding for Neural Radiance Fields Interactive Editing

This paper aims to develop an interactive 3D scene editing system capable of physics-based interaction by combining the photorealistic rendering quality of NeRF with the editability and structured representation of Gaussian Splatting (GS). It seeks to overcome the editing difficulty of existing NeRF and some of the visual limitations of GS.

#Review #Neural Radiance Fields (NeRF) #Gaussian Splatting (GS) #Interactive Editing #3D Scene Representation #Physics Simulation #Hybrid Model #Real-time Rendering #Ray Tracing

August 11, 2025