Large datasets empower modern over-parameterized deep learning systems. Such datasets are often scraped from the internet or other public and user-provided sources, so an adversary can easily insert a subset of malicious examples into the collected data to harm the model's behavior at test time. As a result, deep learning systems trained on public data are highly vulnerable to data poisoning attacks. Such attacks modify a subset of training examples with small (and potentially imperceptible) adversarial perturbations, with the aim of changing the model's prediction on specific test-time examples. Powerful attacks generate poisons that look visually innocent and appear correctly labeled, which makes them hard to detect even for expert observers. Hence, data poisoning attacks are arguably one of the most concerning threats to modern deep learning systems.
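As a rough illustration of how such a poison can be crafted, the sketch below runs a feature-collision-style attack on a toy linear "feature extractor": a clean base example is nudged, within a small l-infinity budget, so that its features move toward those of a chosen target example. The linear extractor, dimensions, step size, and budget here are hypothetical stand-ins chosen for a minimal runnable demo, not the setup from any specific paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 32, 8                          # toy input and feature dimensions
W = rng.normal(size=(k, d))           # stand-in linear "feature extractor"

base = rng.uniform(0.1, 0.9, size=d)    # clean, correctly labeled base example
target = rng.uniform(0.1, 0.9, size=d)  # test example the attacker wants to affect
eps = 8 / 255                           # small budget keeping the poison innocent-looking

def feats(x):
    return W @ x

# Projected gradient descent: pull the poison's features toward the
# target's features while keeping the input-space change within eps.
delta = np.zeros(d)
lr = 0.01
for _ in range(200):
    diff = feats(base + delta) - feats(target)
    delta -= lr * (W.T @ diff)           # analytic gradient of the squared feature loss
    delta = np.clip(delta, -eps, eps)    # project back into the perturbation budget

poison = np.clip(base + delta, 0.0, 1.0)
before = np.linalg.norm(feats(base) - feats(target))
after = np.linalg.norm(feats(poison) - feats(target))
# after < before: the poison still looks like `base` but embeds closer to `target`
```

The key point the sketch makes concrete is the constraint: the attacker never changes the label, only the pixels, and only within a budget small enough to evade visual inspection.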
We develop powerful defense methods to train neural networks that are robust against data poisoning attacks. We study this problem in the supervised setting as well as in multimodal language-image pretraining.
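One simple ingredient used by noise-based defenses of this kind is to perturb training inputs with random noise whose magnitude exceeds typical poisoning budgets, disrupting the adversary's carefully crafted signal while leaving the image content recognizable. A minimal sketch (the batch shape, noise scale, and function name are illustrative assumptions, not the exact recipe from the papers):

```python
import numpy as np

rng = np.random.default_rng(1)

def perturb_batch(x, scale=16 / 255, rng=rng):
    """Add uniform random noise to a training batch and keep pixels in [0, 1].

    The noise scale is chosen larger than common poisoning budgets
    (e.g., 8/255) so the adversarial perturbation is drowned out,
    while the underlying image remains learnable.
    """
    noise = rng.uniform(-scale, scale, size=x.shape)
    return np.clip(x + noise, 0.0, 1.0)

batch = rng.uniform(0, 1, size=(4, 3, 8, 8))  # toy batch of 8x8 RGB "images"
noisy = perturb_batch(batch)
```

In practice such random noise is only one component; stronger defenses combine it with optimized or data-dependent mechanisms, as described in the papers below.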
Check out the following papers to learn more:
- Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks. arXiv preprint arXiv:2310.05862.
- Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks. Advances in Neural Information Processing Systems (NeurIPS), 2023.
- Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attack. Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Not all poisons are created equal: Robust training against data poisoning. International Conference on Machine Learning (ICML), 2022. Oral presentation (top 2%).