Summary of Improve Machine Learning Carbon Footprint Using Parquet Dataset Format and Mixed Precision Training For Regression Models – Part II, by Andrew Antonopoulos


Improve Machine Learning carbon footprint using Parquet dataset format and Mixed Precision training for regression models – Part II

by Andrew Antonopoulos

First submitted to arXiv on: 17 Sep 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
A study compares the power consumption of regression machine learning (ML) models trained on datasets in Comma-Separated Values (CSV) and Parquet formats, using different floating-point precision settings. The experiments run on a custom-built PC and train Deep Neural Networks (DNNs) with varying ML hyperparameters, including batch size, number of neurons, and number of epochs. A benchmark test serves as the reference, while the experiments use different combinations of these settings. The results indicate that mixed-precision training combined with specific hyperparameters reduces power consumption, with a mean reduction of 7 to 11 watts relative to the benchmark. Hyperparameters must be chosen carefully, however, because large batch sizes and high neuron counts increase power consumption. The study also applies inferential statistics (ANOVA and t-test) to compare the means and finds no statistically significant difference between the regression test means, so the null hypothesis is not rejected.

Low Difficulty Summary (written by GrooveSquid.com, original content)
A researcher compared how much power machine learning models draw when they are trained on data stored in two formats: CSV and Parquet. The study also tried different levels of precision for calculations, such as 32-bit or 16-bit, to see whether that would make a difference. It used a custom-built computer and many different settings to train the models. The results showed that certain settings can reduce power draw by up to 11 watts compared with the reference setup. However, the researcher found that some settings actually increase power draw, so they need to be chosen carefully. Statistical tests comparing the results did not find any significant differences.
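
The summaries above mention that the group means were compared with ANOVA and a t-test. As a rough illustration of what such a comparison computes, here is a minimal Python sketch using only the standard library. It is not the author's code: the function names `welch_t` and `one_way_anova_f`, the group labels, and all wattage values are hypothetical and were invented for this example, and the choice of Welch's unequal-variance t-test is an assumption, since the summaries do not say which t-test variant was used.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def one_way_anova_f(*groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual runs around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical mean power draw in watts per training run.
# These numbers are made up for illustration and are NOT taken from the paper.
benchmark = [212.0, 215.5, 209.8, 214.1]
csv_mixed = [205.3, 208.9, 203.7, 207.2]
parquet_mixed = [204.8, 209.4, 202.9, 206.5]

f_stat, df_b, df_w = one_way_anova_f(benchmark, csv_mixed, parquet_mixed)
t_stat, t_df = welch_t(csv_mixed, parquet_mixed)
print(f"ANOVA: F({df_b}, {df_w}) = {f_stat:.3f}")
print(f"Welch t-test, CSV vs Parquet: t = {t_stat:.3f}, df = {t_df:.1f}")
```

The sketch stops at the test statistics. Turning them into p-values requires the F and t distributions, which the standard library does not provide; if SciPy is available, `scipy.stats.f_oneway` and `scipy.stats.ttest_ind` (with `equal_var=False` for Welch's variant) return the statistic and the p-value together. A p-value above the chosen significance level corresponds to the outcome the summaries report, where the difference between the means is not statistically significant.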

Keywords

  • Artificial intelligence
  • Machine learning
  • Precision
  • Regression