#Out-of-Distribution Generalization

7 posts

[Paper Review] Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning

This paper argues that skills need differentiated treatment according to their type if an agent is to acquire skills efficiently and generalize to OOD environments.

#Review #Agentic Reinforcement Learning #Skill Internalization #Out-of-Distribution Generalization #Difficulty-Aware Routing #Privileged Distillation #Shortcut Learning

May 28, 2026

[Paper Review] Agentic Critical Training

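This paper aims to train LLM agents to go beyond simple imitation and develop autonomous critical reasoning about action quality, together with genuine self-reflection. It addresses a limitation of conventional imitation learning (IL), which teaches only what to do and leaves the agent without an understanding of why one action is better than another.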

#Review #LLM Agents #Reinforcement Learning #Imitation Learning #Self-Reflection #Action Quality #Out-of-Distribution Generalization #Critical Reasoning #GRPO

March 9, 2026

[Paper Review] VLS: Steering Pretrained Robot Policies via Vision-Language Models

This paper addresses the failure of pretrained robot policies in out-of-distribution (OOD) scenarios such as new objects, new scenes, or changed instructions.

#Review #Robot Learning #Vision-Language Models #Policy Steering #Inference-Time Adaptation #Out-of-Distribution Generalization #Diffusion Models #Generative Policies

February 4, 2026

[Paper Review] False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize

This study aims to re-evaluate the reliability of the probing-based methods proposed for detecting malicious inputs to large language models (LLMs).

#Review #LLM Safety #Malicious Input Detection #Probing Classifiers #Out-of-Distribution Generalization #Superficial Patterns #Instructional Patterns #Trigger Words #AI Safety

September 5, 2025

[Paper Review] End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning

This paper addresses the limitations that existing RAG (Retrieval-Augmented Generation) systems face in medical diagnosis: manual prompt engineering, limited adaptation to feedback, and a lack of trustworthiness caused by opaque reasoning.

#Review #Agentic RAG #Medical Diagnosis #Reinforcement Learning #Traceable AI #Large Language Models #Clinical Decision Support #Out-of-Distribution Generalization #Reward Design

August 25, 2025

[Paper Review] AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library

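This paper focuses on the difficulty of automating optimization modeling, namely translating informal language into precise mathematical formulations and executable solver code.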

#Review #Optimization Modeling #Large Language Models (LLMs) #Experience Library #Self-Improving Systems #Continual Learning #Out-of-Distribution Generalization #Operations Research #Knowledge Representation

October 23, 2025

[Paper Review] VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation

This paper aims to address the sharp performance drop that existing VLA models suffer when they encounter unseen object concepts outside their training data, that is, a failure of OOD (out-of-distribution) generalization.

#Review #Vision-Language-Action Models #Agentic Framework #Unseen Concept Manipulation #Out-of-Distribution Generalization #Tool Use #Web Retrieval #Object Detection #LIBERO Simulation

October 17, 2025