Summary of Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis, by Hongkang Li et al.
Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis, by Hongkang Li, Meng Wang,…
XAMI – A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images, by Elisabeta-Iulia Dima, Pablo…
Make Graph Neural Networks Great Again: A Generic Integration Paradigm of Topology-Free Patterns for Traffic…
MetaGreen: Meta-Learning Inspired Transformer Selection for Green Semantic Communication, by Shubhabrata Mukherjee, Cory Beard, Sejun Song. First…
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning, by Tianlang Chen, Shengjie Luo, Di He,…
CausalFormer: An Interpretable Transformer for Temporal Causal Discovery, by Lingbai Kong, Wengen Li, Hanchen Yang, Yichao…
METRIK: Measurement-Efficient Randomized Controlled Trials using Transformers with Input Masking, by Sayeri Lala, Niraj K. Jha. First…
An All-MLP Sequence Modeling Architecture That Excels at Copying, by Chenwei Cui, Zehao Yan, Gedeon Muhawenayo,…
Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models, by Yang Zhang, Chenjia Bai, Bin…
Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models, by…