Summary of Non-convergence to Global Minimizers in Data Driven Supervised Deep Learning: Adam and Stochastic Gradient Descent Optimization Provably Fail to Converge to Global Minimizers in the Training Of Deep Neural Networks with Relu Activation, by Thang Do and Sonja Hannibal and Arnulf Jentzen
Non-convergence to global minimizers in data driven supervised deep learning: Adam and stochastic gradient descent…