[Paper Review] Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities. A detailed review of the paper 'Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities', posted on arXiv. #Review #Retrieval-Augmented Generation #Large Language Models #Reasoning #Benchmark #Deep Search #Error Analysis #Scientific Problem Solving #Context Understanding. February 5, 2026
[Paper Review] C3: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations. A detailed review of the paper 'C3: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations', posted on arXiv by Yiwen Guo. #Review #Spoken Dialogue Models #Bilingual Benchmark #Complex Conversations #Ambiguity Resolution #Context Understanding #LLM Evaluation #Human-Computer Interaction. August 2, 2025