#State-Space Models

3 posts

[Paper Review] WriteSAE: Sparse Autoencoders for Recurrent State

This paper addresses the matrix cache write problem in state-space and hybrid recurrent language models, which existing Residual SAEs could not solve.

#Review #Sparse Autoencoders #State-Space Models #Recurrent Neural Networks #Mechanistic Interpretability #Cache-Patching #WriteSAE

May 13, 2026

[Paper Review] Rethinking State Tracking in Recurrent Models Through Error Control Dynamics

This paper shows that the state-tracking ability of recurrent architectures is not determined by theoretical expressivity alone, but is governed by the error control dynamics that limit hidden-state drift.

#Review #State Tracking #Recurrent Models #Error Control #Affine Recurrences #State-Space Models #Symbolic Dynamics

May 10, 2026

[Paper Review] Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers

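This paper aims to reliably evaluate and understand performance differences between language model architectures, without the high noise and cost that arise particularly in academic-scale pretraining.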

#Review #Language Models #Transformer Architecture #Canon Layers #Synthetic Pretraining #Reasoning Depth #Linear Attention #State-Space Models #NoPE

December 21, 2025