Summary of Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders, by Luke Marks et al.
Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders, by Luke Marks, Alasdair Paren, David Krueger, Fazl…