#Factual Associations

1 post

[Paper Review] Large Language Models Do NOT Really Know What They Don't Know

This paper mechanistically analyzes how large language models (LLMs) internally process the factual errors they generate, with the aim of determining whether LLMs truly "know what they don't know."

#Review #LLMs #Hallucination Detection #Mechanistic Interpretability #Internal States #Knowledge Recall #Refusal Tuning #Factual Associations #Associated Hallucinations

October 17, 2025