Summary of Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference, by Zongyue Qin et al.
Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference by Zongyue Qin, Ziniu Hu, Zifan…
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation by Liqun Ma, Mingjie Sun,…
B’MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory by Luca Zancato,…
Just read twice: closing the recall gap for recurrent language models by Simran Arora, Aman Timalsina,…
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking by Xingrun Xing,…
Learning to (Learn at Test Time): RNNs with Expressive Hidden States by Yu Sun, Xinhao Li,…
GPTQT: Quantize Large Language Models Twice to Push the Efficiency by Yipin Guo, Yilin Lang, Qinyuan…
LLMs Plagiarize: Ensuring Responsible Sourcing of Large Language Model Training Data Through Knowledge Graph Comparison by…
Are Data Augmentation Methods in Named Entity Recognition Applicable for Uncertainty Estimation? by Wataru Hashimoto, Hidetaka…
Deep Image-to-Recipe Translation by Jiangqin Ma, Bilal Mawji, Franz Williams. First submitted to arXiv on: 1 Jul…