#Attribution Graph

1 post

[Paper Review] Contrastive Attribution in the Wild: An Interpretability Analysis of LLM Failures on Realistic Benchmarks

This paper points out that existing interpretability tools are limited in analyzing LLM failures on realistic benchmarks, and proposes a practical analysis framework to address this.

#Review #LLM Interpretability #Contrastive Attribution #Layer-wise Relevance Propagation #Attribution Graph #Failure Analysis #Transformer

April 21, 2026