Summary of ParallelSpec: Parallel Drafter for Efficient Speculative Decoding, by Zilin Xiao et al.
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding by Zilin Xiao, Hongming Zhang, Tao Ge, Siru Ouyang,…
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition by Zheyang Xiong,…
Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series by Byoungwoo Park, Hyungi…
Score-Based Variational Inference for Inverse Problems by Zhipeng Xue, Penghao Cai, Xiaojun Yuan, Xiqi Gao. First submitted…
Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search by Jakob Hartmann, Guoliang…
ESPACE: Dimensionality Reduction of Activations for Model Compression by Charbel Sakr, Brucek Khailany. First submitted to arxiv…
From Incomplete Coarse-Grained to Complete Fine-Grained: A Two-Stage Framework for Spatiotemporal Data Reconstruction by Ziyu Sun,…
Distributed Inference on Mobile Edge and Cloud: An Early Exit based Clustering Approach by Divya Jyoti…
Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation by…
Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering by Kazumoto Nakamura, Yuji Nozawa, Yu-Chieh…