Summary of The N-Grammys: Accelerating Autoregressive Inference with Learning-Free Batched Speculation, by Lawrence Stewart (sierra) et al.
The N-Grammys: Accelerating Autoregressive Inference with Learning-Free Batched Speculation by Lawrence Stewart, Matthew Trager, Sujan Kumar…