#Value Alignment

2 posts

[Paper Review] Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

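This study starts from the observation that the value systems of autonomous agents differ fundamentally from those of their underlying LLMs, and that no tool exists to evaluate them systematically. Prior work such as ValueBench and ValueCompass is largely limited to evaluating the values of static text-generation models.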

#Review #Autonomous Agents #Value Alignment #Benchmark #Agentic Modality #Harness Alignment #Skill Steering

May 12, 2026

[Paper Review] Context-Value-Action Architecture for Value-Driven Large Language Model Agents

This paper aims to address the Behavioral Rigidity and polarization problems that arise when LLM-based agents simulate human behavior.

#Review #LLM Agents #Value Alignment #Behavioral Fidelity #S-O-R Model #Value-Driven Reasoning #CVABench

April 7, 2026