Summary of A Dynamical Model Of Neural Scaling Laws, by Blake Bordelon et al.
A Dynamical Model of Neural Scaling Laws, by Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan. First submitted to…
How many views does your deep neural network use for prediction? by Keisuke Kawano, Takuro Kutsuna,…
Compositional Generative Modeling: A Single Model is Not All You Need, by Yilun Du, Leslie Kaelbling. First…
MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts, by Zhitian Xie, Yinger Zhang, Chenyi…
Repeat After Me: Transformers are Better than State Space Models at Copying, by Samy Jelassi, David…
Continuous Unsupervised Domain Adaptation Using Stabilized Representations and Experience Replay, by Mohammad Rostami. First submitted to arXiv…
Are Synthetic Time-series Data Really not as Good as Real Data? by Fanzhe Fu, Junru Chen,…
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts, by Anke Tang, Li Shen, Yong Luo, Nan…
Multi-group Learning for Hierarchical Groups, by Samuel Deng, Daniel Hsu. First submitted to arXiv on: 1 Feb…
Introducing PetriRL: An Innovative Framework for JSSP Resolution Integrating Petri nets and Event-based Reinforcement Learning, by…