Summary of Lightweight Conceptual Dictionary Learning For Text Classification Using Information Compression, by Li Wan et al.


Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression

by Li Wan, Tansu Alpcan, Margreta Kuijper, Emanuele Viterbo

First submitted to arxiv on: 28 Apr 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG); Signal Processing (eess.SP)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper proposes a lightweight supervised dictionary learning framework for text classification that combines data compression and representation techniques. The algorithm has two phases. First, it uses the Lempel-Ziv-Welch (LZW) algorithm to construct a dictionary from the text dataset, focusing on conceptual significance. Second, it refines the dictionary using label data, optimizing dictionary atoms for discriminative power based on mutual information and class distribution. The result is a set of numerical representations on which simple classifiers such as SVMs and neural networks can be trained. The algorithm’s performance is evaluated using information bottleneck principles and a novel metric, the information plane area rank (IPAR).

Low Difficulty Summary (written by GrooveSquid.com, original content)
This approach is a dictionary-based method for text classification that is easy to follow even for readers who are not experts in machine learning or text analysis. The algorithm first creates a dictionary of words from a set of texts, then refines this dictionary based on the labels of the texts. The result is a representation of the texts that can be used to train simple classifiers such as support vector machines (SVMs) and neural networks.
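
To make the two phases in the summaries above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it runs LZW over characters, scores each dictionary atom by the empirical mutual information between "atom appears in the text" and the class label, and keeps the top-k atoms. The function names (`lzw_dictionary`, `mutual_information`, `refine`, `featurize`) and the top-k selection rule are hypothetical simplifications; the paper's refinement also uses class distribution, which is not modeled here.

```python
import math
from collections import Counter


def lzw_dictionary(text):
    """Return the multi-character phrases LZW registers while scanning `text`."""
    known = set(text)      # start from the single characters present in the text
    phrases = []           # new entries, in the order LZW creates them
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in known:
            current = candidate      # keep extending an already-known phrase
        else:
            known.add(candidate)     # standard LZW step: register the new phrase
            phrases.append(candidate)
            current = ch             # restart from the current character
    return phrases


def mutual_information(docs, labels, atom):
    """Empirical I(atom present; label) in bits."""
    n = len(docs)
    joint = Counter((atom in d, y) for d, y in zip(docs, labels))
    p_atom = Counter(atom in d for d in docs)
    p_label = Counter(labels)
    return sum(
        (c / n) * math.log2(c * n / (p_atom[x] * p_label[y]))
        for (x, y), c in joint.items()
    )


def refine(docs, labels, atoms, k):
    """Keep the k atoms with the highest score (assumed selection rule)."""
    ranked = sorted(set(atoms), key=lambda a: (-mutual_information(docs, labels, a), a))
    return ranked[:k]


def featurize(doc, atoms):
    """Represent a text as non-overlapping occurrence counts of each atom."""
    return [doc.count(a) for a in atoms]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    docs = ["good film", "good plot", "bad film", "bad plot"]
    labels = [1, 1, 0, 0]
    atoms = lzw_dictionary(" ".join(docs))
    kept = refine(docs, labels, atoms, k=5)
    features = [featurize(d, kept) for d in docs]
    print(kept)
    print(features)
```

The resulting count vectors could then be passed to any simple classifier, such as an SVM. A closer reproduction would need the paper's actual tokenization and refinement criteria.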

Keywords

» Artificial intelligence  » Machine learning  » Supervised  » Text classification