Summary of "Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models," by Zeman Li et al.