#Self-Optimization

2 posts

[Paper Review] WoW: Towards a World omniscient World model Through Embodied Interaction

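This paper aims to overcome a limitation of existing video generation models that rely on passive observation (a weak grasp of physical causality) and to develop a world model through which robots can acquire physical intuition from large-scale, causally rich, real-world interaction data.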

#Review #World Model #Embodied AI #Robotics #Diffusion Models #Physical Reasoning #Vision Language Models #Interaction Data #Self-Optimization

September 29, 2025

[Paper Review] Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment

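This paper seeks to address the limitations of alignment methodologies for large language models (LLMs). Existing methods (SFT, DPO, PPO, GRPO) are either tied to a specific alignment approach or optimize only quantitative metrics, and as a result fall short in generalization and robustness.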

#Review #LLM Alignment #Reinforcement Learning from Human Feedback #Preference Learning #Group Relative Alignment Optimization #Self-Optimization #Mixture-of-Experts #Imitation Learning

August 14, 2025