[Paper Review] Dynamic Model Routing and Cascading for Efficient LLM Inference: A Survey
A detailed review of the paper 'Dynamic Model Routing and Cascading for Efficient LLM Inference: A Survey', posted on arXiv by John D. Kelleher.
Tags: #Review #LLM Inference #Model Routing #Model Cascading #Efficiency Optimization #Dynamic Model Selection #Multi-LLM Systems #Cost-Performance Trade-off #Adaptive AI Systems
March 8, 2026
[Paper Review] Cache-to-Cache: Direct Semantic Communication Between Large Language Models
A detailed review of the paper 'Cache-to-Cache: Direct Semantic Communication Between Large Language Models', posted on arXiv.
Tags: #Review #Large Language Models (LLMs) #Inter-model Communication #KV-Cache #Semantic Transfer #Multi-LLM Systems #Cache Fusion #Latency Reduction #Knowledge Sharing
October 9, 2025