Loading Now

Summary of Computational-statistical Trade-off in Kernel Two-sample Testing with Random Fourier Features, by Ikjun Choi and Ilmun Kim


Computational-Statistical Trade-off in Kernel Two-Sample Testing with Random Fourier Features

by Ikjun Choi, Ilmun Kim

First submitted to arxiv on: 12 Jul 2024

Categories

  • Main: Machine Learning (stat.ML)
  • Secondary: Machine Learning (cs.LG); Statistics Theory (math.ST)

     Abstract of paper      PDF of paper


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
The paper proposes an approach to improve the computational efficiency of the Maximum Mean Discrepancy (MMD) test, a popular method for two-sample testing in high-dimensional data. The MMD test has quadratic-time complexity, which can be challenging for large-scale analysis. To address this limitation, the authors revisit the approximated MMD test using random Fourier features and investigate its computational-statistical trade-off. They show that by carefully choosing the number of random features, it is possible to attain the same minimax separation rates as the MMD test within sub-quadratic time, while maintaining pointwise consistency in power.
Low GrooveSquid.com (original content) Low Difficulty Summary
The paper helps make a popular testing method faster and more efficient for big data. The Maximum Mean Discrepancy (MMD) test is used to compare two groups of data points. While it’s been very effective, it has one major problem: it takes a long time to do many calculations. To fix this, the authors look at a simpler version of the MMD test that uses random numbers and mathematical techniques called Fourier features. They find that by using the right number of these features, they can make the test faster without losing its power.

Keywords

* Artificial intelligence