You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Fig. 1: The central hypothesis of our paper: within a data collection, there are correlations between the highfrequency components and the “semantic” component of the images. As a result, the model will perceive both high-frequency components as well as the “semantic” ones, leading to generalization behaviors counterintuitive to human (e.g., adversarial examples).
Fig. 2: Eight testing samples selected from CIFAR10 that help explain that CNN can capture the high-frequency image: the model (ResNet18) correctly predicts the original image (1st column in each panel) and the high frequency reconstructed image (3rd column in each panel), but incorrectly predict the low-frequency reconstructed image (2nd column in each panel). The prediction confidences are also shown. Details are in the paper.
Other Discussions in Paper (click to expand)
Trade-off between accuracy and robustness (Section 3)
Rethinking data before rethinking generalization (Section 4)
Re-evaluate the heuristics (BatchNorm seems to promote high-frequency information) (Section 5)
Adversarially robust models tend to filter out high-frequency components (Section 6)
Similar phenomena are observed beyond image classification (Section 7)
The results were generated by the TensorFlow code as shared here, but for the friends who prefer PyTorch, Xindi has nicely created some codes to help you start: PyTorch Implementation