Summary of Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs, by Kazuki Fujii et al.
Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs by Kazuki Fujii, Taishi…
Flow reconstruction in time-varying geometries using graph neural networks by Bogdan A. Danciu, Vito A. Pagone,…
Deceiving Question-Answering Models: A Hybrid Word-Level Adversarial Approach by Jiyao Li, Mingze Ni, Yongshun Gong, Wei…
RESOLVE: Relational Reasoning with Symbolic and Object-Level Features Using Vector Symbolic Processing by Mohamed Mejri, Chandramouli…
Spatially Regularized Graph Attention Autoencoder Framework for Detecting Rainfall Extremes by Mihir Agarwal, Progyan Das, Udit…
Interaction Asymmetry: A General Principle for Learning Composable Abstractions by Jack Brady, Julius von Kügelgen, Sébastien…
Enhancing Link Prediction with Fuzzy Graph Attention Networks and Dynamic Negative Sampling by Jinming Xing, Ruilin…
FlowTS: Time Series Generation via Rectified Flow by Yang Hu, Xiao Wang, Zezhen Ding, Lirong Wu,…
Unraveling the Gradient Descent Dynamics of Transformers by Bingqing Song, Boran Han, Shuai Zhang, Jie Ding,…
Circuit Complexity Bounds for RoPE-based Transformer Architecture by Bo Chen, Xiaoyu Li, Yingyu Liang, Jiangxuan Long,…