Summary of Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models, by Hui-Po Wang et al.
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models by Hui-Po Wang,…