
Fixed evaluation of models with random defenses #105

Open · wants to merge 6 commits into master

Conversation

Buntender

Thank you for your outstanding contributions.

@LYMDLUT and I put forward this PR to improve the evaluation of models with random defenses.

We've noticed that AutoAttack currently selects the final output (clean, APGD, etc.) based on a single evaluation, regardless of whether the target model implements a random defense. This overlooks the variability of outputs in models with random defenses.

Relying on a single evaluation to filter samples for subsequent attacks leads to an inflated success rate and hinders the exploration of attack methods that could yield superior outcomes.

To address this, we propose performing multiple evaluations for models with random defenses and choosing as the final output the adversarial example that fools the model most consistently across those evaluations.
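
As a rough sketch of this idea (the helper names `attack_success_rate` and `select_best` and the parameter `n_eval` are illustrative, not the actual code in this PR), one could score every candidate adversarial example over several stochastic forward passes and keep, per sample, the candidate with the highest success rate:

```python
import torch

@torch.no_grad()
def attack_success_rate(model, x_adv, y, n_eval=10):
    # Fraction of stochastic forward passes on which each candidate
    # adversarial example is misclassified; for a deterministic model
    # this is 0 or 1 and reduces to the usual single check.
    success = torch.zeros(x_adv.shape[0], device=x_adv.device)
    for _ in range(n_eval):
        success += model(x_adv).argmax(dim=1).ne(y).float()
    return success / n_eval

@torch.no_grad()
def select_best(model, candidates, y, n_eval=10):
    # Among the candidates produced by the individual attacks
    # (APGD-CE, APGD-T, FAB, Square, ...), keep per sample the one
    # that fools the randomized model most consistently.
    rates = torch.stack([attack_success_rate(model, x, y, n_eval)
                         for x in candidates])   # [n_attacks, batch]
    best = rates.argmax(dim=0)                   # best attack per sample
    x_all = torch.stack(candidates)              # [n_attacks, batch, ...]
    batch = torch.arange(best.shape[0], device=best.device)
    return x_all[best, batch]
```

The same multi-pass check could also replace the single forward pass used to decide which samples are already broken before the next attack runs, which is exactly the filtering step criticized above.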

@LYMDLUT

LYMDLUT commented Mar 10, 2024

@fra31 Could you please review this PR?

@LYMDLUT

LYMDLUT commented Oct 14, 2024

In Appendix L of our paper, we provide a detailed report on our fix for AutoAttack and its impact. We encourage future research to adopt this updated version when evaluating models with randomness, as it effectively reduces the risk of overestimating robustness.

If you find our work useful for your research, please consider citing it:

@Article{liu2024towards,
  title   = {Towards Better Adversarial Purification via Adversarial Denoising Diffusion Training},
  author  = {Liu, Yiming and Liu, Kezhao and Xiao, Yao and Dong, Ziyi and Xu, Xiaogang and Wei, Pengxu and Lin, Liang},
  journal = {arXiv preprint arXiv:2404.14309},
  year    = {2024}
}

