Are Classification Robustness and Explanation Robustness Really Strongly Correlated? An Analysis Through Input Loss Landscape
This project is tested under the following environment settings:
- OS: Ubuntu
- Python: 3.8
- CUDA: 12.1
- PyTorch: >= 1.11.0
- Torchvision: >= 0.12.0
- captum: 0.6.0
- scikit-learn: 1.2.2
- kornia: 0.7.0
First, create the data and model folders in the same directory as the src folder. Then, under the model folder, create one subfolder per dataset name to store the different types of models trained on that dataset.
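The folder setup above can be sketched as follows; the dataset names are taken from the list later in this README, and the exact layout is otherwise assumed:

```python
# Hypothetical sketch: create the expected data/model folder layout.
import os

datasets = ["CIFAR10", "CIFAR100", "MNIST", "FMNIST", "TinyImageNet"]
for top in ("data", "model"):
    os.makedirs(top, exist_ok=True)
for name in datasets:
    # one subfolder per dataset under model/ to hold its trained models
    os.makedirs(os.path.join("model", name), exist_ok=True)
```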
In this paper, we use the explanation methods provided by the captum-0.6.0 package. Before training with our method, download the library and modify one parameter in the captum-0.6.0/captum/utils/models/gradient.py file: add "create_graph=True" to the torch.autograd.grad() call. This change lets gradients flow through the explanation method during training, so backpropagation works normally. The image below illustrates the required modification.
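The snippet below is not the captum source; it is a minimal illustration of why create_graph=True is needed when an explanation is part of the training loss (the toy tensors and the grad-times-input loss are only for demonstration):

```python
# Why create_graph=True matters: the input gradient must itself stay
# differentiable w.r.t. the model parameters.
import torch

x = torch.randn(4, 3, requires_grad=True)
w = torch.randn(3, 1, requires_grad=True)
out = (x @ w).sum()

# create_graph=True keeps the backward graph, so grad_x depends on w
# in a differentiable way.
grad_x, = torch.autograd.grad(out, x, create_graph=True)

expl_loss = (grad_x * x).abs().sum()  # e.g. a grad-times-input explanation loss
expl_loss.backward()                  # second backward pass now works
```

Without create_graph=True, grad_x would be detached and the final backward() could not propagate the explanation loss into the model parameters.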
The "sampleWays" folder contains the code used to extract image pairs for testing the robustness of model explanations. "Way3" is our proposed method, which applies a clustering idea to select image pairs; compared with "Way1" and "Way2", which were proposed by others, Way3 gives a more stable and reliable measurement.
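A hedged sketch of the clustering idea behind Way3 (the exact procedure lives in the sampleWays folder; the pairing scheme below is an assumption): cluster flattened images, then pair images within the same cluster.

```python
# Sketch: select image-pair indices by clustering (illustrative only).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
images = rng.random((100, 32 * 32))  # stand-in for flattened images

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(images)
pairs = []
for c in range(5):
    idx = np.where(kmeans.labels_ == c)[0]
    # pair consecutive members of the same cluster (one possible scheme)
    pairs.extend(zip(idx[0::2], idx[1::2]))
```

Each resulting pair comes from the same cluster, so the two images are similar, which makes explanation-robustness comparisons between them more stable.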
The "pic_index" folder stores the indices of the image pairs for each dataset. This ensures that the same image pairs are used when testing different models on the same dataset.
The "show_explAttack" folder contains code for attacking model explanation maps.
Within the "src" folder, you'll find the following files:
- "SEP_train.py": Implements our proposed training method.
- "explAttack.py": Runs the explanation map attack.
- "explRobust_way3.py": Tests the robustness of model explanations using our proposed testing method (Way3).
- "adv_acc.py": Tests the classification robustness of the model.
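The following is a generic sketch of how classification robustness is typically measured (it is not the repo's adv_acc.py): accuracy on adversarially perturbed inputs, here with a single FGSM step and an assumed budget eps.

```python
# Sketch: adversarial accuracy under a one-step FGSM attack (illustrative).
import torch
import torch.nn.functional as F

def adv_accuracy(model, x, y, eps=8 / 255):
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    # one FGSM step within the eps budget, clipped to the valid pixel range
    x_adv = (x_adv + eps * grad.sign()).clamp(0, 1).detach()
    with torch.no_grad():
        pred = model(x_adv).argmax(dim=1)
    return (pred == y).float().mean().item()

# usage with a toy linear "model" on random data
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 10))
x = torch.rand(16, 3, 8, 8)
y = torch.randint(0, 10, (16,))
acc = adv_accuracy(model, x, y)
```

A stronger multi-step attack (e.g. PGD) follows the same pattern with the gradient step repeated and projected back into the eps ball.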
Key training arguments:
- dataset: One of CIFAR10, CIFAR100, MNIST, FMNIST, or TinyImageNet.
- reg_weight: Controls the ratio between the explanation loss and the classification loss.
- method: The explanation method used during training.
- model: The network architecture used during training.
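As a rough illustration of how reg_weight combines the two losses (the actual objective is defined in SEP_train.py; the explanation term below is only an assumed grad-times-input example):

```python
# Sketch: total loss = classification loss + reg_weight * explanation loss.
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28, requires_grad=True)
y = torch.randint(0, 10, (8,))
reg_weight = 5e4

cls_loss = F.cross_entropy(model(x), y)

# grad-times-input explanation; create_graph=True keeps it differentiable
grad, = torch.autograd.grad(cls_loss, x, create_graph=True)
expl_loss = (grad * x).abs().mean()

total_loss = cls_loss + reg_weight * expl_loss
total_loss.backward()  # updates flow through both loss terms
```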
python SEP_train.py --epoch 25 --dataset 'CIFAR10' \
--reg_weight 5e4 --batch 128 \
--lr 0.01 --model 'ConvNet' \
--method 'grad_times_input'