[논문리뷰] Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMsLecheng Yan이 arXiv에 게시한 'Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs' 논문에 대한 자세한 리뷰입니다.#Review#RLVR#LLMs#Mechanistic Interpretability#Memorization Shortcuts#Data Contamination#Anchor-Adapter Circuit#Path Patching#Logit Lens2026년 1월 19일댓글 수 로딩 중