andyisokay/delta-influence

delta-influence

This repository contains the code for the paper:

Delta-Influence: Unlearning Poisons via Influence Functions

Follow the step-by-step guide below to replicate our experiments on "cifar10+badnet". It includes all the code for the attack, detection, unlearning, and evaluation.

Notebooks for other "{dataset}+{attack}" combinations will be added in the future. We currently provide "cifar10+badnet", "cifar100+frequency attack", and "imagenette+witches' brew". The pipelines are essentially similar, so feel free to try different datasets, attack methods, and unlearning algorithms :)

Example

Setup

conda create -n delta-influence-env python=3.12  
conda activate delta-influence-env  
pip install -e .

Credits: We use Kronfluence to compute the influence matrix and Corrective-Unlearning-Bench for unlearning, so please make sure both are installed before moving on.

Prepare the poisoned dataset

"poison_dataset.ipynb" shows how to inject the BadNet poison into the CIFAR-10 dataset and also provides training scripts to obtain the victim model.
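As a rough illustration of what a BadNet-style poison looks like, the sketch below stamps a small white patch into the corner of a random subset of images and relabels them to a target class. It is not the notebook's code: the function name `add_badnet_trigger`, the patch position, the patch size, and the default poison fraction are all assumptions chosen for the example.

```python
import numpy as np

def add_badnet_trigger(images, labels, poison_frac=0.1, target_label=0,
                       patch_size=3, seed=0):
    """Return poisoned copies of (images, labels) plus the poisoned indices.

    images: uint8 array of shape (N, H, W, C), e.g. CIFAR-10 style data.
    All parameter names and defaults here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    images = images.copy()
    labels = labels.copy()

    # Pick which samples to poison (without replacement).
    n_poison = int(round(poison_frac * len(images)))
    poison_idx = rng.choice(len(images), size=n_poison, replace=False)

    # Stamp a white square into the bottom-right corner (assumed trigger).
    images[poison_idx, -patch_size:, -patch_size:, :] = 255
    # Flip the labels of the poisoned samples to the attacker's target class.
    labels[poison_idx] = target_label
    return images, labels, np.sort(poison_idx)

# Toy stand-in for a dataset: 100 black 32x32 RGB images, 10 classes.
images = np.zeros((100, 32, 32, 3), dtype=np.uint8)
labels = np.arange(100) % 10
p_images, p_labels, idx = add_badnet_trigger(
    images, labels, poison_frac=0.05, target_label=7
)
```

Keeping the returned indices around is useful later, since they serve as the ground truth against which a detection method can be scored.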

Detect poisons

"delta_influence.ipynb" implements the Delta-Influence algorithm, which returns the training examples most responsible for the poisoning behavior.
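The idea can be sketched roughly as follows, assuming influence scores have already been computed (for example with Kronfluence) for a poisoned test point and for several transformed versions of it. The sketch flags training points whose influence drops sharply under many transformations. This is an illustrative simplification, not the notebook's implementation: the function name, the voting rule, and the parameters `drop_threshold` and `min_votes` are assumptions.

```python
import numpy as np

def delta_influence_detect(base_scores, transformed_scores,
                           drop_threshold, min_votes):
    """Flag training points whose influence collapses under transformations.

    base_scores: shape (n_train,), influence of each training point on the
        original affected test point.
    transformed_scores: shape (n_transforms, n_train), the same scores after
        each transformation of the test point.
    drop_threshold, min_votes: hypothetical hyperparameters.
    """
    base = np.asarray(base_scores, dtype=float)
    trans = np.asarray(transformed_scores, dtype=float)

    # Change in influence for every (transformation, training point) pair.
    deltas = trans - base[None, :]
    # A transformation "votes" for a point if its influence fell by more
    # than drop_threshold.
    votes = (deltas < -drop_threshold).sum(axis=0)
    # Keep points that collected enough votes.
    return np.flatnonzero(votes >= min_votes)

# Toy scores: training point 1 is highly influential, and that influence
# vanishes under all three transformations of the test point.
base = [0.1, 5.0, 0.2, 4.0]
transformed = [
    [0.1, 0.5, 0.3, 3.9],
    [0.0, 0.2, 0.2, 0.5],
    [0.2, 0.1, 0.1, 4.1],
]
flagged = delta_influence_detect(base, transformed,
                                 drop_threshold=2.0, min_votes=2)
```

In this toy example only index 1 is flagged. Index 3 loses influence under a single transformation, which falls short of the assumed vote count.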

We also provide implementations of other popular detection methods, as well as the threshold baseline mentioned in the paper.

For the ablation studies, see the notebooks "modify_images.ipynb" and "modify_labels.ipynb".

Unlearn

For each combination of "{dataset}+{attack}+{detection}", we compare the unlearning effectiveness of 5 different corrective unlearning methods.
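One common way to compare unlearning methods on a backdoor attack is to report clean test accuracy alongside the attack success rate on triggered inputs. The sketch below assumes that setup. The function name `unlearning_metrics` and the choice of metrics are assumptions for illustration and may differ from what the notebooks report.

```python
import numpy as np

def unlearning_metrics(clean_preds, clean_labels,
                       triggered_preds, triggered_true_labels, target_label):
    """Return (clean_accuracy, attack_success_rate) for one model.

    Hypothetical helper: all inputs are 1-D arrays of class indices.
    """
    clean_preds = np.asarray(clean_preds)
    clean_labels = np.asarray(clean_labels)
    triggered_preds = np.asarray(triggered_preds)
    triggered_true_labels = np.asarray(triggered_true_labels)

    # Fraction of clean test samples classified correctly.
    clean_acc = float(np.mean(clean_preds == clean_labels))

    # Skip samples that already belong to the target class, since they
    # cannot show whether the trigger changed the prediction.
    mask = triggered_true_labels != target_label
    asr = float(np.mean(triggered_preds[mask] == target_label))
    return clean_acc, asr

# Toy predictions from a model after unlearning.
clean_acc, asr = unlearning_metrics(
    clean_preds=[0, 1, 2, 3],
    clean_labels=[0, 1, 2, 0],
    triggered_preds=[7, 7, 1, 7],
    triggered_true_labels=[1, 2, 1, 7],
    target_label=7,
)
```

Under these assumptions, a successful unlearning method keeps `clean_acc` high while driving `asr` toward zero.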
