Spurious Correlations

Robust training against spurious correlations

Neural networks are known to exploit spurious correlations in the training data: attributes that correlate with certain categories during training but are not predictive of those categories in general. For example, if most images of lighters also contain a flame, the model may learn to associate the flame with the lighter category instead of relying on the lighter itself to make the prediction. Similarly, a toxicity classifier may learn to spuriously associate toxicity with the mention of certain demographics in the text. Such biases degrade a model’s worst-group test performance, that is, its accuracy on minority groups that do not exhibit the spurious correlation.
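To make the notion of worst-group performance concrete, here is a minimal sketch in plain Python. It assumes each example carries a class label and a binary spurious attribute, and that a group is a (label, attribute) pair; the function names, the toy data, and the "predict lighter whenever a flame is present" model are all hypothetical illustrations, not taken from the papers listed on this page.

```python
from collections import defaultdict


def group_accuracies(labels, spurious, preds):
    """Return accuracy for each (label, spurious attribute) group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, a, p in zip(labels, spurious, preds):
        total[(y, a)] += 1
        correct[(y, a)] += int(p == y)
    return {group: correct[group] / total[group] for group in total}


def worst_group_accuracy(labels, spurious, preds):
    """Return the lowest accuracy over all groups."""
    return min(group_accuracies(labels, spurious, preds).values())


# Toy data (hypothetical): label 1 = "lighter", attribute 1 = "flame present".
# Most lighters appear with a flame; most non-lighters appear without one.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
spurious = [1, 1, 1, 0, 0, 0, 0, 1]

# A hypothetical model that relies only on the spurious attribute:
# it predicts "lighter" exactly when a flame is present.
preds = list(spurious)

average_accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
per_group = group_accuracies(labels, spurious, preds)
worst = worst_group_accuracy(labels, spurious, preds)

print("average accuracy:", average_accuracy)
print("per-group accuracy:", per_group)
print("worst-group accuracy:", worst)
```

On this toy data the shortcut model looks acceptable on average but fails on both minority groups (a lighter without a flame, and a flame without a lighter), which is the gap that worst-group evaluation is meant to expose.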

We develop methods to mitigate the effect of spurious correlations when training neural networks. We consider robust training in the supervised setting, as well as mitigating spurious correlations in supervised or multimodal pretrained models during fine-tuning.

Check out the following papers to learn more:

  1. Towards Mitigating Spurious Correlations in the Wild: A Benchmark & a more Realistic Dataset
    arXiv preprint arXiv:2306.11957
  2. Identifying Spurious Biases Early in Training through the Lens of Simplicity Bias
    Yu Yang, Eric Gan, Gintare Karolina Dziugaite, and Baharan Mirzasoleiman
    arXiv preprint arXiv:2305.18761
  3. Eliminating Spurious Correlations from Pre-trained Models via Data Mixing
    Yihao Xue, Ali Payani, Yu Yang, and Baharan Mirzasoleiman
    arXiv preprint arXiv:2305.14521
  4. Robust Learning with Progressive Data Expansion Against Spurious Correlation
    Yihe Deng*, Yu Yang*, Baharan Mirzasoleiman, and Quanquan Gu
    Advances in Neural Information Processing Systems (NeurIPS), 2023
  5. Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning
    Yu Yang, Besmira Nushi, Hamid Palangi, and Baharan Mirzasoleiman
    International Conference on Machine Learning (ICML), 2023