Summary of Small Language Models Can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs, by Guillermo Marco et al.
Small Language Models can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs
by Guillermo Marco, Luz Rello, Julio Gonzalo
First submitted to arXiv on: 17 Sep 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper evaluates a fine-tuned small language model (SLM), BART-large, comparing its creative fiction writing with that of human writers and two large language models (LLMs): GPT-3.5 and GPT-4o. The study consists of two experiments: a human study in which participants rated short stories written by humans and by the SLM on several aspects, and a qualitative linguistic analysis of the textual characteristics of the stories produced by each model. BART-large outperformed average human writers overall, with a 14% relative improvement in grammaticality and relevance. Human writers kept a slight advantage in creativity, but the difference was not statistically significant. The linguistic analysis found that GPT-4o demonstrated near-perfect coherence but tended to produce more predictable language: only 3% of its synopses featured surprising associations, compared to 15% for BART-large. These findings highlight how model size and fine-tuning influence the balance between creativity, fluency, and coherence in creative writing tasks. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary The paper compares a small language model (BART-large) with human writers and two large language models (GPT-3.5 and GPT-4o) to see which writes the best short stories. The researchers had people rate the stories, and then examined the language of each story to see what set it apart. The small model did very well overall, though humans kept a small edge in creativity. A bigger model (GPT-4o) was very good at making sense but used more predictable language, so its writing was less creative. |
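For readers unfamiliar with the term, a "relative improvement" such as the 14% quoted in the medium-difficulty summary is usually the gain over a baseline divided by that baseline. The sketch below shows that arithmetic in Python. The function name `relative_improvement` and the two mean ratings are hypothetical, chosen only so the result comes out at 14%; they are not taken from the paper. The 3% and 15% shares are the figures quoted in the summary.

```python
def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Return the fractional gain of new_score over baseline_score."""
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    return (new_score - baseline_score) / baseline_score


# Hypothetical mean ratings, used only to illustrate the arithmetic.
human_mean = 2.00
model_mean = 2.28

gain = relative_improvement(model_mean, human_mean)
print(f"Relative improvement: {gain:.0%}")

# Shares of synopses with surprising associations, as quoted in the summary.
gpt4o_share = 0.03
bart_share = 0.15
print(f"BART-to-GPT-4o ratio: {bart_share / gpt4o_share:.1f}x")
```

Note that a relative improvement depends on the rating scale and the baseline, so the same percentage can correspond to quite different absolute gaps.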
Keywords
* Artificial intelligence
* Fine-tuning
* GPT
* Language model