Summary of Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large Language Models, by Siqi Wang et al.
Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large Language Models
Aiding Global Convergence in Federated Learning via Local Perturbation and Mutual Similarity Information by Emanuel Buttaci, …
On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory by Guhan Chen, …
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe by Yuxin Xiao, Shujian Zhang, Wenxuan Zhou, …
SePPO: Semi-Policy Preference Optimization for Diffusion Alignment by Daoan Zhang, Guangchen Lan, Dong-Jun Han, Wenlin Yao, …
Next state prediction gives rise to entangled, yet compositional representations of objects by Tankred Saanum, Luca…
Failure-Proof Non-Contrastive Self-Supervised Learning by Emanuele Sansone, Tim Lebailly, Tinne Tuytelaars
Collaboration! Towards Robust Neural Methods for Routing Problems by Jianan Zhou, Yaoxin Wu, Zhiguang Cao, Wen…
DEPT: Decoupled Embeddings for Pre-training Language Models by Alex Iacob, Lorenzo Sani, Meghdad Kurmanji, William F.…
DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects by Nidhi Mathihalli, …