#Speculative Sparsity

1 post

[Paper Review] SpeContext: Enabling Efficient Long-context Reasoning with Speculative Context Sparsity in LLMs

This paper aims to address the Key-Value (KV) cache problems that arise during long-context reasoning in large language models (LLMs).

#Review #LLMs #Long-context Reasoning #KV Cache Optimization #Speculative Sparsity #Knowledge Distillation #Adaptive Memory Management #Throughput

December 1, 2025