Review

[Paper Review] QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models

This paper addresses the uncertainty about how well state-of-the-art Vision-Language Models (VLMs) can reason quantitatively about physical properties.

#Review #Vision-Language Models #Physical Reasoning #Quantitative Benchmark #Kinematics #Mean Relative Accuracy #Video-Text #Embodied AI

December 23, 2025

[Paper Review] Multi-LLM Thematic Analysis with Dual Reliability Metrics: Combining Cohen's Kappa and Semantic Similarity for Qualitative Research Validation

This study aims to address the reliability problem of LLM-based thematic analysis in qualitative research and to overcome the limitations of conventional human-coder-based approaches, which are time-consuming and costly. In particular, it presents a multi-perspective validation framework that quantitatively evaluates the reliability of LLM outputs and verifies them transparently.

#Review #Thematic Analysis #Large Language Models #Qualitative Research #Cohen's Kappa #Semantic Similarity #Reliability Metrics #Ensemble Validation #Prompt Engineering

December 23, 2025

[Paper Review] MemEvolve: Meta-Evolution of Agent Memory Systems

This paper addresses a fundamental limitation of LLM-based agents: their fixed memory system architectures cannot meta-adapt to diverse task contexts.

#Review #LLM Agents #Memory Systems #Meta-Evolution #Self-Evolving AI #Memory Architecture #EvolveLab #Generalization

December 23, 2025

[Paper Review] LongVideoAgent: Multi-Agent Reasoning with Long Videos

This paper aims to address the problems that existing MLLMs (Multimodal Large Language Models) face with long videos: information loss from compression, limited tool sets, and insufficient fine-grained temporal reasoning.

#Review #Multi-Agent System #Long Video Understanding #Video Question Answering #Reinforcement Learning #Large Language Models #Temporal Grounding #Multimodal Reasoning #Tool-Augmented AI

December 23, 2025

[Paper Review] INTELLECT-3: Technical Report

This paper aims to address the complexity and scalability limits of existing open-source RL infrastructure for LLMs, and to achieve state-of-the-art performance with INTELLECT-3, a 106B-parameter Mixture-of-Experts (MoE) model.

#Review #Reinforcement Learning #Large Language Models #Mixture-of-Experts #Asynchronous Training #Distributed Systems #Agentic AI #Code Execution #Model Evaluation

December 23, 2025

[Paper Review] FaithLens: Detecting and Explaining Faithfulness Hallucination

This paper proposes FaithLens, a cost-efficient and effective model that detects faithfulness hallucination in large language model (LLM) outputs and provides an explanation for each decision, thereby improving LLM trustworthiness.

#Review #LLM Hallucination Detection #Explainable AI #Faithfulness Evaluation #Data Augmentation #Reinforcement Learning #Fact-Checking

December 23, 2025

[Paper Review] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

This paper seeks to overcome a limitation of existing RL approaches, which treat the LLM as a single black-box policy.

#Review #Reinforcement Learning #Large Language Models #Policy Optimization #Interpretability #Transformer #Internal Policy #Entropy Analysis

December 23, 2025

[Paper Review] Active Intelligence in Video Avatars via Closed-loop World Modeling

The goal is to address a limitation of existing video avatar generation methods: they do not go beyond simple animation, so the avatars lack autonomous agency and cannot achieve long-term goals.

#Review #Video Avatars #Active Intelligence #World Models #Closed-loop Reasoning #POMDP #Generative AI #Hierarchical Planning #Cognitive Architecture

December 23, 2025

[Paper Review] WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

This paper addresses the fundamental problem of generating long-range, geometrically consistent novel-view video from a single image.

#Review #Novel View Synthesis #3D Geometry Propagation #Video Diffusion Models #Gaussian Splatting #Autoregressive Generation #Spatio-Temporal Noise #Geometric Consistency

December 22, 2025

[Paper Review] Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives

This study aims to gain a deeper understanding of the deductive reasoning abilities of LLMs from both logical (formal) and intuitive (natural language) perspectives.

#Review #Syllogistic Reasoning #Large Language Models (LLMs) #Belief Bias #Natural Language Understanding (NLU) #Formal Logic #Prompt Engineering #Self-Consistency #Cognitive Psychology

December 22, 2025

[Paper Review] UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models

This study addresses the heavy dependence of large language models' (LLMs') code generation ability on expensive supervised training data. The goal is to develop an unsupervised learning framework that autonomously improves an LLM's code generation ability using only its pre-trained knowledge, without external corpora or manually annotated data.

#Review #Unsupervised Learning #Code Generation #Large Language Models (LLMs) #Internal Probing #Self-Bootstrapping #Consensus Clustering #Code Intelligence

December 22, 2025

[Paper Review] The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

This paper aims to resolve the fundamental mismatch between semantic abstraction and pixel-level fidelity in state-of-the-art foundation models.

#Review #Unified Autoencoding #Prism Hypothesis #Semantic Representations #Pixel Representations #Frequency-Band Modulator #Foundation Models #Spectral Bias #Generative Models

December 22, 2025

[Paper Review] StoryMem: Multi-shot Long Video Storytelling with Memory

This paper aims to address the problem of generating multi-shot, long-form video storytelling with cinematic quality and long-range consistency.

#Review #Video Storytelling #Multi-shot Video Generation #Memory Mechanism #Diffusion Models #Cross-shot Consistency #Latent Video Diffusion #ROPE Shift #Keyframe Selection

December 22, 2025

[Paper Review] Region-Constraint In-Context Generation for Instructional Video Editing

This paper addresses problems that arise in in-context video editing, where video content is precisely modified using only text instructions. Specifically, it aims to overcome inaccurate editing regions and token interference between edited and non-edited regions during denoising.

#Review #Video Editing #In-Context Learning #Diffusion Models #Region-Constraint #Instruction-based Editing #Latent Space Regularization #Attention Space Regularization #Large-scale Dataset

December 22, 2025

[Paper Review] Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

This paper aims to address inefficient exploration during inference and reinforcement learning (RL) training of large (vision-)language models (LLMs/VLMs).

#Review #Latent Variable Models #Variational Autoencoder (VAE) #Reinforcement Learning (RL) #Exploration #Large Language Models (LLMs) #Vision-Language Models (VLMs) #Controllable Generation #Reasoning Strategies

December 22, 2025

[Paper Review] Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface

This study addresses the high cost of collecting diverse robot demonstration data, which limits spatial generalization and policy robustness in robot learning. In particular, it proposes a framework that efficiently generates realistic and diverse new robot demonstrations from a limited number of real demonstrations, substantially improving data efficiency.

#Review #Robotics #Demonstration Generation #3D Control Interface #Data Efficiency #Visuomotor Policy Learning #Spatial Generalization #Depth Map #Video Generation

December 22, 2025

[Paper Review] QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

This paper addresses the unreliability of a large language model's (LLM's) internal signals (e.g., logits, entropy), which often show high confidence in incorrect predictions.

#Review #Dynamic RAG #Hallucination Detection #Corpus Statistics #Uncertainty Quantification #Pre-training Data #LLM Calibration #Infini-gram #Multi-hop QA

December 22, 2025

[Paper Review] Name That Part: 3D Part Segmentation and Naming

This paper aims to address semantic 3D part segmentation, the problem of decomposing a 3D object into semantically named parts.

#Review #3D Semantic Segmentation #Part Naming #Open-Vocabulary #LLM #Set Alignment #Geometric Deep Learning #Annotation Engine #Affordance Description

December 22, 2025

[Paper Review] MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments

The goal is to overcome the limitations of AndroidWorld, the existing mobile GUI agent benchmark: saturation (success rates above 90%) and unrealistic task complexity.

#Review #Mobile Agents #GUI Benchmarking #Agent-User Interaction #Tool-Augmented Agents #Model Context Protocol (MCP) #Long-Horizon Tasks #Reproducible Evaluation #Android Environment

December 22, 2025

[Paper Review] MatSpray: Fusing 2D Material World Knowledge on 3D Geometry

This paper aims to use 2D image-based material prediction models to assign physically based rendering (PBR) properties to 3D geometry, reconstructing relightable 3D objects that remain consistent across multiple views.

#Review #3D Reconstruction #Material Estimation #Diffusion Models #Gaussian Splatting #Inverse Rendering #PBR #Relighting #Neural Merger

December 22, 2025