#Low-Rank Correction

1 post

[Paper Review] Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding

This paper aims to resolve the trade-off between draft quality and computational cost in speculative decoding.

#Review #Speculative Decoding #LLM Inference #Autoregressive Drafting #Parallel Drafting #Causal Modeling #Low-Rank Correction

June 1, 2026