#Adaptive Control

3 posts

[Paper Review] Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning

This paper identifies and aims to resolve an overlooked bottleneck in On-Policy Self-Distillation (OPSD) for LLM reasoning: teacher-side exposure mismatch.

#Review #Self-Distillation #LLM Reasoning #Teacher Exposure #On-Policy #Adaptive Control #Reinforcement Learning #Beta-policy

May 14, 2026

[Paper Review] See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation

This paper aims to address the limitations that arise when existing drone navigation approaches based on Vision-Language Models (VLMs) treat action prediction as text generation.

#Review #Vision-Language Models #UAV Navigation #Zero-shot #Spatial Grounding #Waypoint Prompting #Autonomous Navigation #Adaptive Control

September 29, 2025

[Paper Review] AMFT: Aligning LLM Reasoners by Meta-Learning the Optimal Imitation-Exploration Balance

The goal is to address catastrophic forgetting in large language models (LLMs) on reasoning tasks, along with the suboptimal trade-off between imitation and exploration.

#Review #Large Language Models #Fine-tuning #Reinforcement Learning #Meta-learning #Adaptive Control #Imitation Learning #Exploration #Reasoning

August 14, 2025