Summary of UniMem: Towards a Unified View of Long-Context Large Language Models, by Junjie Fang et al.
UniMem: Towards a Unified View of Long-Context Large Language Models by Junjie Fang, Likai Tang, Hongzhe…
Fractal Patterns May Illuminate the Success of Next-Token Prediction by Ibrahim Alabdulmohsin, Vinh Q. Tran, Mostafa…
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens by Jiacheng Liu, Sewon Min, Luke…
Pushing the Envelope of Low-Bit LLM via Dynamic Error Compensation by Yeonhong Park, Jake Hyun, Hojoon…
Deliberation in Latent Space via Differentiable Cache Augmentation by Luyang Liu, Jonas Pfeiffer, Jiaxing Wu, Jun…
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design by Zhen Zheng, Xiaonan…
ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals by Utkarsh Saxena, Sayeh Sharify, Kaushik…
SWAN: SGD with Normalization and Whitening Enables Stateless LLM Training by Chao Ma, Wenbo Gong, Meyer…
Model-diff: A Tool for Comparative Study of Language Models in the Input Space by Weitang Liu,…
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture by Jingze Shi, Bingheng…