Summary of MindEye2: Shared-Subject Models Enable fMRI-to-Image With 1 Hour of Data, by Paul S. Scotti et al.
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data
by Paul S. Scotti, Mihir Tripathy, Cesar Kadir Torrico Villanueva, Reese Kneeland, Tong Chen, Ashutosh Narang, Charan Santhirasegaran, Jonathan Xu, Thomas Naselaris, Kenneth A. Norman, Tanishq Mathew Abraham
First submitted to arXiv on: 17 Mar 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper presents a breakthrough in reconstructing visual perceptions from brain activity, using just 1 hour of fMRI training data. The authors pretrain their model across 7 subjects and fine-tune it on minimal data from a new subject. A novel functional alignment procedure maps all brain data into a shared latent space, followed by a non-linear mapping to CLIP image space. The Stable Diffusion XL model is then fine-tuned to accept CLIP latents as inputs, achieving state-of-the-art image retrieval and reconstruction metrics compared to single-subject approaches. This work demonstrates the potential for accurate perception reconstructions from a single MRI facility visit. |
| Low | GrooveSquid.com (original content) | This paper makes it possible to reconstruct what people see just by using their brain activity, even with very little training data. The authors developed a new way to align brain signals across different people and then use those signals to create images of what someone sees. This is useful because it means a new person only needs a short scanning session, rather than many hours in the scanner. The results are really good, which is important for things like helping people with vision problems. |
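The pipeline described in the medium-difficulty summary can be sketched in miniature: each subject gets its own linear map into a shared latent space, and a single shared non-linear head then maps shared latents toward CLIP image space. This is a hedged toy illustration with random stand-in data, not the authors' implementation; all sizes, names (`fit_ridge`, `to_clip`), and the use of closed-form ridge regression for the alignment step are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- real voxel counts and latent widths differ per subject.
n_samples, shared_dim, clip_dim = 200, 64, 128
voxel_counts = {"subj01": 500, "subj02": 450}

# Toy targets standing in for the shared latent space.
Z = rng.normal(size=(n_samples, shared_dim))

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam*I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Subject-specific linear alignment: each subject's voxels -> shared space.
subject_maps = {}
for subj, n_vox in voxel_counts.items():
    X = rng.normal(size=(n_samples, n_vox))  # stand-in fMRI responses
    subject_maps[subj] = fit_ridge(X, Z)

# A shared non-linear head (one hidden ReLU layer) maps shared latents
# toward CLIP image space; weights here are random placeholders, not trained.
W1 = rng.normal(size=(shared_dim, 256)) * 0.1
W2 = rng.normal(size=(256, clip_dim)) * 0.1

def to_clip(voxels, subj):
    z = voxels @ subject_maps[subj]        # subject-specific alignment
    return np.maximum(z @ W1, 0.0) @ W2    # shared non-linear mapping

# A new scan from subj02 passes through its own alignment, then the shared head.
new_scan = rng.normal(size=(1, voxel_counts["subj02"]))
clip_latent = to_clip(new_scan, "subj02")
print(clip_latent.shape)  # (1, 128)
```

In the full system, the predicted CLIP latents would condition a fine-tuned Stable Diffusion XL model to generate the reconstructed image; only the small subject-specific alignment needs to be learned for each new person, which is what enables the 1-hour data regime.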
Keywords
» Artificial intelligence » Alignment » Diffusion » Latent space