Summary of OmniPred: Language Models as Universal Regressors, by Xingyou Song et al.
OmniPred: Language Models as Universal Regressors, by Xingyou Song, Oscar Li, Chansoo Lee, Bangding Yang, Daiyi…