
Summary of Capturing Style in Author and Document Representation, by Enzo Terreau, Antoine Gourru, and Julien Velcin


Capturing Style in Author and Document Representation

by Enzo Terreau, Antoine Gourru, Julien Velcin

First submitted to arXiv on: 18 Jul 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
The paper proposes an architecture based on the Variational Information Bottleneck (VIB) that learns embeddings for both authors and documents under a stylistic constraint, addressing a limitation of existing work, which does not capture writing style. The model fine-tunes a pre-trained document encoder and incorporates predefined stylistic features to stimulate the detection of writing style, which makes the representation axes interpretable with respect to writing style indicators. The method is evaluated on three datasets: a literary corpus extracted from Project Gutenberg, the Blog Authorship Corpus, and IMDb62. On these, it matches or outperforms strong recent baselines in authorship attribution while capturing authors’ stylistic aspects more accurately.
Low GrooveSquid.com (original content) Low Difficulty Summary
This paper creates a new way for computers to understand the writing styles of different authors. It’s like a special kind of fingerprint for how people write. The model looks at lots of documents and tries to learn what makes each author’s writing unique. This can be used in all sorts of applications, from identifying who wrote a piece of literature to recommending books based on an individual’s reading style.
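
To make the idea in the medium difficulty summary more concrete, here is a rough sketch of what a VIB-style training loss with a stylistic constraint could look like. It is not the authors’ implementation. The function names, the linear style projection `W_style`, the dot-product scoring against author embeddings, and the weights `beta` and `alpha` are all assumptions made for illustration, and the encoder outputs `mu` and `log_var` are replaced by random arrays instead of a fine-tuned document encoder.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), one value per document."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=1)

def sample_latent(mu, log_var, rng):
    """Reparameterisation trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def vib_style_loss(mu, log_var, author_emb, labels, style_feats,
                   W_style, beta, alpha, rng):
    """Hypothetical objective: attribution + beta * KL + alpha * style error."""
    z = sample_latent(mu, log_var, rng)            # stochastic document embedding
    logits = z @ author_emb.T                      # assumed: score documents against authors
    attribution = cross_entropy(logits, labels)    # authorship attribution term
    compression = kl_to_standard_normal(mu, log_var).mean()  # bottleneck term
    # Assumed stylistic constraint: a linear map of z should predict
    # predefined style indicators for each document.
    style = np.mean((z @ W_style - style_feats) ** 2)
    return attribution + beta * compression + alpha * style

# Toy usage with random stand-ins for encoder outputs and style features.
rng = np.random.default_rng(0)
n_docs, dim, n_authors, n_style = 4, 8, 3, 5
mu = rng.standard_normal((n_docs, dim))
log_var = 0.1 * rng.standard_normal((n_docs, dim))
author_emb = rng.standard_normal((n_authors, dim))
labels = np.array([0, 1, 2, 0])
style_feats = rng.standard_normal((n_docs, n_style))
W_style = rng.standard_normal((dim, n_style))
loss = vib_style_loss(mu, log_var, author_emb, labels, style_feats,
                      W_style, beta=0.01, alpha=1.0, rng=rng)
print(float(loss))
```

In this sketch, the style term is what ties embedding directions to measurable style indicators, while the KL term plays the role of the information bottleneck. How the paper actually weights and combines these terms is not stated in the summary above.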

Keywords

  • Artificial intelligence
  • Encoder