[Paper Review] Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference. A detailed review of the paper 'Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference', posted on arXiv. #Review #Large Language Models #Long-context Inference #Hybrid Attention #Dynamic Routing #Layer-level Sparsity #Context-aware. April 9, 2026
[Paper Review] GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts. A detailed review of the paper 'GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts', posted on arXiv. #Review #Collaborative Inference #Large Reasoning Models (LRMs) #Inference Latency #Step-wise Routing #Initial Token Entropy #Dynamic Routing #Computational Efficiency. January 12, 2026
[Paper Review] UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE. A detailed review of the paper 'UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE', posted on arXiv. #Review #Mixture of Experts #Speech Generation #Music Generation #Multimodal AI #Dynamic Routing #Training Curriculum #Data Imbalance #Audio Synthesis. October 16, 2025
[Paper Review] Dr.LLM: Dynamic Layer Routing in LLMs. A detailed review of the paper 'Dr.LLM: Dynamic Layer Routing in LLMs', posted on arXiv. #Review #Dynamic Routing #LLMs #Adaptive Depth #Computational Efficiency #Monte Carlo Tree Search (MCTS) #Retrofittable Framework #Supervised Learning #Accuracy Improvement. October 15, 2025