Attack the gradients with knowledge over used transformations #3
This has a few components. First, one should analyze the distribution of images after certain (sequences of) transforms. The easiest way to do this is to generate random images, transform them, and then compute the first n moments of the pixel distribution of the transformed outputs (the mean and so on). Instead of random images, one could also use real images from the dataset or a similar dataset, but we should not assume the attacker has access to that kind of information. Either way, we can build a loss term through moment matching (the squared distance between the expected and actual moments), which is added to the gradient similarity. A more complex but cooler way is to train a neural network to recognize which augmentations have been applied to an image, so that it gives a low score if an image is likely to have been augmented by the known augmentations. Simply add this network's output for the image reconstruction attempt to the gradient similarity.
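The moment-matching part could be sketched roughly like this (a minimal NumPy sketch; the example brightness transform, the function names, and the choice of raw moments are illustrative assumptions, not something specified in this issue):

```python
import numpy as np

def pixel_moments(img, n_moments=4):
    """First n raw moments of the flattened pixel distribution."""
    x = img.ravel()
    return np.array([np.mean(x ** k) for k in range(1, n_moments + 1)])

def expected_moments_after_transform(transform, n_samples=200, shape=(8, 8),
                                     n_moments=4, seed=0):
    """Estimate expected moments by transforming random images,
    since the attacker is assumed to have no access to real data."""
    rng = np.random.default_rng(seed)
    moms = [pixel_moments(transform(rng.random(shape)), n_moments)
            for _ in range(n_samples)]
    return np.mean(moms, axis=0)

def moment_matching_loss(candidate, expected_moments, n_moments=4):
    """Squared distance between the candidate image's moments and the
    expected moments; to be added to the gradient-similarity objective."""
    diff = pixel_moments(candidate, n_moments) - expected_moments
    return float(np.sum(diff ** 2))

# Hypothetical known augmentation: a brightness/contrast squeeze.
transform = lambda x: np.clip(x * 0.5 + 0.25, 0.0, 1.0)
expected = expected_moments_after_transform(transform)
```

In the full attack, `moment_matching_loss` (times some weight) would simply be added to the gradient-similarity term when optimizing the reconstruction candidate.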
The notion that augmentations are fundamentally a way to make the model invariant to some aspect (e.g. rotation) provides some insight into why these augmentations make reconstruction harder. Namely, if a model is invariant to x, it will produce the same output regardless of which variation of x is applied to an input image. Thus it would be impossible to know which x variation was applied to the input image. However, note that what is used for the reconstruction is not the output of the network but its gradients, which will not be x-invariant. Still, they might become increasingly invariant as the layers of the network progress. If this hypothesis about increasing invariance is true, it might mean that the gradients of the first layers are more useful for reconstructing the augmented image.
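If the increasing-invariance hypothesis holds, one simple way to exploit it would be to upweight early-layer gradients in the matching objective. A minimal sketch (the decay schedule and function names are illustrative assumptions, not part of the issue):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two flattened gradient tensors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def weighted_gradient_similarity(grads_true, grads_candidate, layer_weights):
    """Per-layer cosine similarity, weighted so that earlier layers
    (hypothesized to be less augmentation-invariant) count more."""
    sims = [w * cosine_sim(gt.ravel(), gc.ravel())
            for w, gt, gc in zip(layer_weights, grads_true, grads_candidate)]
    return sum(sims) / sum(layer_weights)

# Hypothetical schedule: geometric decay with depth.
def early_layer_weights(n_layers, decay=0.7):
    return [decay ** i for i in range(n_layers)]
```

A uniform schedule (`decay=1.0`) recovers the usual unweighted gradient matching, so the hypothesis could be tested by comparing reconstruction quality across decay values.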