#Latent Context Language Models

1 post

[Paper Review] End-to-End Context Compression at Scale

Long-context processing is a core capability of LLMs, yet the KV cache's memory footprint grows with context length and slows inference as a result. This study sets out to address that problem.

#Review #Context Compression #KV Cache #Latent Context Language Models #Encoder-Decoder #End-to-End Training #Model Efficiency

June 8, 2026