
Summary of Towards Building Efficient Sentence BERT Models using Layer Pruning, by Anushka Shelke et al.


Towards Building Efficient Sentence BERT Models using Layer Pruning

by Anushka Shelke, Riya Savant, Raviraj Joshi

First submitted to arXiv on: 21 Sep 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here
Medium Difficulty Summary (written by GrooveSquid.com, original content)
A recent study investigates the efficiency of Sentence BERT (SBERT) models by pruning layers, aiming to create smaller yet effective sentence embedding models. The researchers compare pruned versions of BERT models such as MuRIL and MahaBERT-v2 against models of similar size trained from scratch, MahaBERT-Small and MahaBERT-Smaller. Through a two-phase SBERT fine-tuning process for Natural Language Inference (NLI) and Semantic Textual Similarity (STS), the study evaluates how layer reduction affects embedding quality. The findings show that pruned models, despite having fewer layers, perform competitively with their fully layered counterparts. Moreover, pruned models consistently outperform scratch-trained models of similar size, demonstrating that layer pruning is an effective strategy for reducing computational demand while preserving high-quality embeddings. This approach makes SBERT models more accessible for languages with limited technological resources.
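To make the idea of layer pruning concrete, the sketch below removes encoder layers from a pretrained BERT-style model using the Hugging Face transformers library. This is an illustration only, not the authors' code: the checkpoint identifier, the choice of which layers to keep, and the output directory are assumptions.

```python
# Minimal sketch of layer pruning for a BERT-style encoder (illustrative, not the
# authors' implementation). The checkpoint id and the set of retained layers
# below are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "google/muril-base-cased"   # MuRIL; any BERT-style checkpoint works
keep_layers = [0, 2, 4, 6, 8, 10]        # hypothetical choice: keep every other layer

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Keep only the selected transformer layers and update the config so the
# pruned model can be saved and reloaded like an ordinary checkpoint.
model.encoder.layer = torch.nn.ModuleList(
    [layer for i, layer in enumerate(model.encoder.layer) if i in keep_layers]
)
model.config.num_hidden_layers = len(keep_layers)

model.save_pretrained("muril-pruned")
tokenizer.save_pretrained("muril-pruned")
```

The pruned checkpoint can then be fine-tuned like any other encoder; the paper's point is that starting from a pruned pretrained model tends to beat training an equally small model from scratch.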
Low Difficulty Summary (written by GrooveSquid.com, original content)
This study looks at a way to make Sentence BERT (SBERT) models smaller and faster without losing their ability to understand sentences well. The researchers take big language models like MuRIL and MahaBERT-v2, remove some of their layers, and compare them to new, similarly small models trained from scratch. They use two tests: one for understanding relationships between sentences (Natural Language Inference) and one for finding similar sentences (Semantic Textual Similarity). The results show that the pruned models do nearly as well as the big ones, even with fewer layers, and they do better than the small models trained from scratch. This makes it easier to use SBERT models in places where computers aren't very powerful.
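For readers curious what the two-step training recipe might look like in practice, here is a rough sketch of two-phase SBERT fine-tuning (first NLI with a classification loss, then STS with a cosine-similarity loss), assuming the sentence-transformers library. The pruned checkpoint path, the toy training examples, and the hyperparameters are placeholders, not the paper's actual setup.

```python
# Rough sketch of two-phase SBERT fine-tuning (NLI, then STS). Data and
# hyperparameters are toy placeholders for illustration only.
from sentence_transformers import SentenceTransformer, InputExample, losses, models
from torch.utils.data import DataLoader

# Wrap the pruned encoder (e.g. the "muril-pruned" directory from the earlier
# sketch) as a SentenceTransformer with mean pooling over token embeddings.
word_embedding = models.Transformer("muril-pruned", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension())
sbert = SentenceTransformer(modules=[word_embedding, pooling])

# Phase 1: NLI fine-tuning with a softmax classification loss over sentence
# pairs (labels 0/1/2 = entailment/neutral/contradiction).
nli_data = [InputExample(texts=["A man is eating.", "A person eats food."], label=0)]
nli_loader = DataLoader(nli_data, shuffle=True, batch_size=16)
nli_loss = losses.SoftmaxLoss(
    model=sbert,
    sentence_embedding_dimension=sbert.get_sentence_embedding_dimension(),
    num_labels=3,
)
sbert.fit(train_objectives=[(nli_loader, nli_loss)], epochs=1)

# Phase 2: STS fine-tuning with a cosine-similarity regression loss on
# similarity-scored sentence pairs.
sts_data = [InputExample(texts=["A man is eating.", "A man is eating food."], label=0.9)]
sts_loader = DataLoader(sts_data, shuffle=True, batch_size=16)
sts_loss = losses.CosineSimilarityLoss(model=sbert)
sbert.fit(train_objectives=[(sts_loader, sts_loss)], epochs=1)
```

After the second phase, the resulting model produces sentence embeddings that can be compared with cosine similarity for the STS evaluation described in the paper.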

Keywords

» Artificial intelligence  » BERT  » Embedding  » Fine-tuning  » Inference  » Pruning