Summary of Improving Transformers with Dynamically Composable Multi-Head Attention, by Da Xiao et al.
Improving Transformers with Dynamically Composable Multi-Head Attention, by Da Xiao, Qingye Meng, Shengping Li, Xingyuan Yuan. First…
Airport Delay Prediction with Temporal Fusion Transformers, by Ke Liu, Kaijing Ding, Xi Cheng, Guanhao Xu,…
ExplainableDetector: Exploring Transformer-based Language Modeling Approach for SMS Spam Detection with Explainability Analysis, by Mohammad Amaz…
HGTDR: Advancing Drug Repurposing with Heterogeneous Graph Transformers, by Ali Gharizadeh, Karim Abbasi, Amin Ghareyazi, Mohammad…
Fighter flight trajectory prediction based on spatio-temporal graphical attention network, by Yao Sun, Tengyu Jing, Jiapeng…
CaFA: Global Weather Forecasting with Factorized Attention on Sphere, by Zijie Li, Anthony Zhou, Saurabh Patil,…
RESTAD: REconstruction and Similarity based Transformer for time series Anomaly Detection, by Ramin Ghorbani, Marcel J.T.…
USP: A Unified Sequence Parallelism Approach for Long Context Generative AI, by Jiarui Fang, Shangchun Zhao. First…
Unified Video-Language Pre-training with Synchronized Audio, by Shentong Mo, Haofan Wang, Huaxia Li, Xu Tang. First submitted…
Predictive Modeling in the Reservoir Kernel Motif Space, by Peter Tino, Robert Simon Fong, Roberto Fabio…