#Long-Context LLMs

8개의 포스트

[논문리뷰] MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

Large Language Models (LLMs)는 다양한 분야에서 뛰어난 능력을 보였지만, 수백만 토큰 규모의 장기적이고 세밀한 기억(long-term, fine-grained memory retention)을 처리하는 데에는 여전히 큰 어려움에 직면해 있습니다.

#Review #Memory Sparse Attention #Long-Context LLMs #Efficient Memory #End-to-End Trainable #KV Cache Compression #Rotary Positional Embedding #Multi-hop Reasoning #Scalability

2026년 3월 26일

[논문리뷰] BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

최근 LLMs의 context window가 기하급수적으로 확장되면서 long-document understanding의 잠재력이 커졌지만, 이는 심각한 inference latency와 정보 utilization 병목 현상을 야기했습니다.

#Review #Prompt Compression #Long-Context LLMs #Training-Free #Hierarchical Selection #Structure-Aware #Inference Latency #Information Utilization

2026년 3월 22일

[논문리뷰] FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

Large Language Models (LLMs)의 장문 컨텍스트 처리 시 자기회귀(self-attention)의 2차 복잡도로 인한 성능 병목현상 , 특히 계산 집약적인 프리필(prefilling) 단계의 높은 오버헤드 를 해결하는 것이 목표입니다.

#Review #Long-Context LLMs #Prefilling #Sparse Attention #Pattern Discovery #Dynamic Thresholding #Attention Speedup #Transformer Optimization

2026년 3월 8일

[논문리뷰] HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing

본 논문은 기존 희소 어텐션(sparse attention) 방법론의 두 가지 근본적인 한계를 해결하고자 합니다. 첫째, 토큰 중요도 예측에 추가적인 프록시(proxy)를 사용하는 복잡성과 성능 저하 문제.

#Review #Sparse Attention #KV Cache Sharing #Hybrid Attention #Long-Context LLMs #Memory Optimization #Token Selection #Transformer Architecture

2026년 2월 4일

[논문리뷰] AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts

이 논문은 동적으로 변화하는 컨텍스트 내에서 장문 컨텍스트 LLM (Large Language Model) 기반 에이전트의 오랜 기간에 걸친 일관성(long-horizon consistency) 및 계획(planning) 능력을 평가하기 위한 표준화된 벤치마크의 부재를 해결합니다.

#Review #Long-Context LLMs #Autonomous Agents #Benchmark #Environment Rollouts #State Tracking #Tool Use #Memory Evaluation #Lateral Thinking Puzzles

2026년 1월 29일

[논문리뷰] Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs

현재 RoPE(Rotary Position Embeddings) 구현이 어텐션 스코어 계산 시 복소수 값의 내적에서 실수부만 사용 하고 허수부를 버려, 장문맥 의존성 모델링에 중요한 관계형 정보 손실 이 발생하는 문제를 해결하고자 합니다.

#Review #Rotary Position Embedding #Long-Context LLMs #Complex-Valued Neural Networks #Self-Attention #Positional Encoding #Information Loss #Length Extrapolation

2025년 12월 8일

[논문리뷰] LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

본 논문은 기존 코드 평가 벤치마크의 한계를 극복하고, 수백만 토큰으로 확장된 컨텍스트 윈도우 를 가진 LLM이 현실적이고 복잡한 소프트웨어 개발 시나리오에서 긴 컨텍스트를 얼마나 잘 이해하고 활용하는지를 종합적으로 평가하는 것을 목표로 합니다.

#Review #Long-Context LLMs #Software Engineering #Code Evaluation #Benchmark #Multi-file Reasoning #Architectural Understanding #Context Length #Software Development Lifecycle #Metrics

2025년 9월 12일

[논문리뷰] The Markovian Thinker

본 논문은 추론 LLM 훈련 시 발생하는 무한한 상태 크기 와 추론 길이 증가에 따른 2차 계산 복잡도 문제를 해결하고자 합니다.

#Review #Reinforcement Learning #Large Language Models #Chain-of-Thought #Markovian Thinking #Context Management #Computational Efficiency #Long-Context LLMs #Transformer Optimization

2025년 10월 9일