Making Neural Networks Insensitive Towards Spurious Concepts

Series

Research Master Defense

Speakers

Floris Holstege

Location

Tinbergen Institute Amsterdam, room 1.01
Amsterdam

Date and time

August 31, 2022
14:00 - 15:00

Neural networks are widely used for image recognition. A major shortcoming, however, is that they often rely on spurious correlations. Concept activation vectors (CAVs; Kim et al., 2018) can be used to quantify whether a neural network is sensitive to a concept (for example, does it use the sea to classify a seagull?). The contribution of this thesis is a method, called CAV-penalized training, that trains a neural network to be insensitive to a concept. Users select a set of pictures that capture the spurious concept, and the neural network is trained not to use it for classification. This allows domain knowledge to be incorporated to deal with a range of spurious correlations. The effectiveness of CAV-penalized training is illustrated across benchmark datasets (MNIST, Waterbirds, CelebA), for both a convolutional neural network (CNN) and a fine-tuned ResNet-50 architecture. Our results indicate that CAV-penalized training performs similarly to, or is competitive with, a model trained on a dataset without the spurious correlation. Compared to other methods, CAV-penalized training requires little data annotation (100-250 images of a concept), yet achieves competitive or better performance.
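
The abstract does not spell out the exact form of the penalty, so the following is only a rough sketch of the general idea, not the method from the thesis. It assumes a linear classification head on top of a hidden layer, approximates the CAV by the normalized difference between mean concept activations and mean random activations (a simplification; Kim et al. fit a linear classifier instead), and penalizes the squared directional derivative of the logit along that vector. All function names and the `strength` parameter are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(rows):
    """Column-wise mean of a list of activation vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def concept_activation_vector(concept_acts, random_acts):
    """Unit vector pointing from random activations toward concept activations.

    Assumption: difference of means stands in for the linear classifier
    that Kim et al. (2018) use to obtain the CAV.
    """
    diff = [c - r for c, r in zip(mean_vector(concept_acts),
                                  mean_vector(random_acts))]
    norm = math.sqrt(dot(diff, diff))
    return [d / norm for d in diff]


def concept_sensitivity(head_weights, cav):
    """Directional derivative of a linear head's logit along the CAV.

    For logit = w . h, the gradient with respect to h is w,
    so the sensitivity reduces to w . cav.
    """
    return dot(head_weights, cav)


def penalized_loss(task_loss, head_weights, cav, strength):
    """Task loss plus a (hypothetical) squared-sensitivity penalty."""
    return task_loss + strength * concept_sensitivity(head_weights, cav) ** 2


# Toy hidden-layer activations: the concept only moves the first coordinate.
concept_acts = [[2.0, 0.0], [4.0, 0.0]]
random_acts = [[0.0, 0.0], [0.0, 0.0]]
cav = concept_activation_vector(concept_acts, random_acts)

uses_concept = [0.5, 2.0]     # head that reads the concept direction
ignores_concept = [0.0, 2.0]  # head that is orthogonal to the CAV

loss_a = penalized_loss(1.0, uses_concept, cav, strength=4.0)
loss_b = penalized_loss(1.0, ignores_concept, cav, strength=4.0)
```

In this toy setup the head that reads the concept direction incurs a larger loss than the head that ignores it, which is the pressure a sensitivity penalty is meant to create during training. In a real network the gradient would be obtained by automatic differentiation rather than read off the weights.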