
Summary of Towards Building Efficient Sentence BERT Models using Layer Pruning, by Anushka Shelke et al.


Towards Building Efficient Sentence BERT Models using Layer Pruning

by Anushka Shelke, Riya Savant, Raviraj Joshi

First submitted to arXiv on: 21 Sep 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)
A recent study investigates the efficiency of Sentence BERT (SBERT) models by pruning layers, aiming to create smaller yet effective sentence embedding models. The researchers compare pruned versions of BERT models such as MuRIL and MahaBERT-v2 against models of similar size trained from scratch, MahaBERT-Small and MahaBERT-Smaller. Through a two-phase SBERT fine-tuning process for Natural Language Inference (NLI) and Semantic Textual Similarity (STS), the study evaluates how layer reduction affects embedding quality. The findings show that pruned models, despite having fewer layers, perform competitively with their fully layered counterparts. Moreover, pruned models consistently outperform scratch-trained models of similar size, demonstrating that layer pruning is an effective strategy for reducing computational demand while preserving high-quality embeddings. This approach makes SBERT models more accessible for languages with limited technological resources.
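To make the idea of layer pruning concrete, the sketch below removes encoder layers from a pretrained BERT-style model using the Hugging Face transformers library. This is an illustration only, not the authors' code: the checkpoint identifier, the choice of which layers to keep, and the output directory are assumptions.

```python
# Minimal sketch of layer pruning for a BERT-style encoder (illustrative, not the
# authors' implementation). The checkpoint id and the set of retained layers
# below are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "google/muril-base-cased"   # MuRIL; any BERT-style checkpoint works
keep_layers = [0, 2, 4, 6, 8, 10]        # hypothetical choice: keep every other layer

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Keep only the selected transformer layers and update the config so the
# pruned model can be saved and reloaded like an ordinary checkpoint.
model.encoder.layer = torch.nn.ModuleList(
    [layer for i, layer in enumerate(model.encoder.layer) if i in keep_layers]
)
model.config.num_hidden_layers = len(keep_layers)

model.save_pretrained("muril-pruned")
tokenizer.save_pretrained("muril-pruned")
```

The pruned checkpoint can then be fine-tuned like any other encoder; the paper's point is that starting from a pruned pretrained model tends to beat training an equally small model from scratch.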
Low Difficulty Summary (written by GrooveSquid.com, original content)
This study looks at a way to make Sentence BERT (SBERT) models smaller and faster without losing their ability to understand sentences well. The researchers take big language models like MuRIL and MahaBERT-v2, remove some of their layers, and compare them to new, similarly small models trained from scratch. They use two tests: one for understanding relationships between sentences (Natural Language Inference) and one for finding similar sentences (Semantic Textual Similarity). The results show that the pruned models do nearly as well as the big ones, even with fewer layers, and they do better than the small models trained from scratch. This makes it easier to use SBERT models in places where computers aren't very powerful.
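For readers curious what the two-step training recipe might look like in practice, here is a rough sketch of two-phase SBERT fine-tuning (first NLI with a classification loss, then STS with a cosine-similarity loss), assuming the sentence-transformers library. The pruned checkpoint path, the toy training examples, and the hyperparameters are placeholders, not the paper's actual setup.

```python
# Rough sketch of two-phase SBERT fine-tuning (NLI, then STS). Data and
# hyperparameters are toy placeholders for illustration only.
from sentence_transformers import SentenceTransformer, InputExample, losses, models
from torch.utils.data import DataLoader

# Wrap the pruned encoder (e.g. the "muril-pruned" directory from the earlier
# sketch) as a SentenceTransformer with mean pooling over token embeddings.
word_embedding = models.Transformer("muril-pruned", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension())
sbert = SentenceTransformer(modules=[word_embedding, pooling])

# Phase 1: NLI fine-tuning with a softmax classification loss over sentence
# pairs (labels 0/1/2 = entailment/neutral/contradiction).
nli_data = [InputExample(texts=["A man is eating.", "A person eats food."], label=0)]
nli_loader = DataLoader(nli_data, shuffle=True, batch_size=16)
nli_loss = losses.SoftmaxLoss(
    model=sbert,
    sentence_embedding_dimension=sbert.get_sentence_embedding_dimension(),
    num_labels=3,
)
sbert.fit(train_objectives=[(nli_loader, nli_loss)], epochs=1)

# Phase 2: STS fine-tuning with a cosine-similarity regression loss on
# similarity-scored sentence pairs.
sts_data = [InputExample(texts=["A man is eating.", "A man is eating food."], label=0.9)]
sts_loader = DataLoader(sts_data, shuffle=True, batch_size=16)
sts_loss = losses.CosineSimilarityLoss(model=sbert)
sbert.fit(train_objectives=[(sts_loader, sts_loss)], epochs=1)
```

After the second phase, the resulting model produces sentence embeddings that can be compared with cosine similarity for the STS evaluation described in the paper.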

Keywords

» Artificial intelligence  » BERT  » Embedding  » Fine-tuning  » Inference  » Pruning