[논문리뷰] The Alignment Waltz: Jointly Training Agents to Collaborate for SafetyarXiv에 게시된 'The Alignment Waltz: Jointly Training Agents to Collaborate for Safety' 논문에 대한 자세한 리뷰입니다.#Review#LLM Safety#Multi-agent Reinforcement Learning#Safety Alignment#Overrefusal#Adversarial Attacks#Feedback Agent#Conversation Agent#Dynamic Improvement Reward2025년 10월 10일댓글 수 로딩 중