isamu-isozaki/AI-Pentest-Benchmark

AI-Pentest-Benchmark

The goal of this repo is to become a benchmark for pentesting. All the VMs used for testing were taken from here. Thanks to VulnHub!

The list of all VMs used, and the steps for each, are in a Google Sheet here.

Setup

Windows

  1. Download VirtualBox for Windows hosts from here.
  2. Install VirtualBox.
  3. Get a Kali Linux VirtualBox image using this link and click on VirtualBox.
  4. Follow the steps here to import the Kali VirtualBox image.
  5. Get the VulnHub virtual machine by going to the Google Sheet, clicking the Virtual Box Path, and downloading the file. Extract it. If the download is an .ova file, go to File -> Import Appliance, import the file, click Next, and then Finish. Then skip to step 8.
  6. If the download is not an .ova file, you should be working with a .vmx/.vmdk file. In that case, first download ovftool from here.
  7. Run ovftool [original .vmx location and filename] [new .ova location and filename] to convert it to an .ova file, then use Import Appliance as described in step 5.
  8. Create a NAT network: click on Tools, click the button with 3 dots and lines next to Tools, click on Network, and create a new NAT Network. The default IP address range here should be 10.0.2.0/24. One note: this setup may cause issues with connecting the vulnerable machine to the internet, so if anyone knows a better way, do let us know!
  9. For both Kali Linux and the new virtual machine, go to Settings, Network, select NAT Network, select the NAT network you created, go to Advanced, and select "Allow All" for Promiscuous Mode.
  10. Start both the Kali machine and the vulnerable machine. For Kali, you can log in with kali as both the username and the password.
  11. In Kali, run
sudo netdiscover -r 10.0.2.0/24

Usually, the IP address that responds last is the correct one. Next, run

nmap -A -T4 -p- ip_address

for the IP address you found. Currently, for all boxes in our benchmark, this is the first step, which we want to standardize.
  12. Then, given that IP address, see if your AI follows the steps from the Google Sheet.
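
The command-line parts of the steps above (the ovftool conversion in step 7 and the discovery and scan in step 11) can be sketched as a small Python helper. This is only an illustration, not part of the benchmark: the function names are hypothetical, the file names and the address 10.0.2.5 are placeholders, and the script only builds and prints the commands rather than running them. The one addition beyond the steps is a sanity check, assumed to be useful, that the target address lies inside the NAT network range.

```python
import ipaddress
import shlex


def ovftool_command(vmx_path: str, ova_path: str) -> list[str]:
    """Step 7: build the ovftool call that converts a .vmx VM to an .ova file."""
    return ["ovftool", vmx_path, ova_path]


def netdiscover_command(subnet: str = "10.0.2.0/24") -> list[str]:
    """Step 11: build the netdiscover call for the NAT network range."""
    # ip_network raises ValueError if the range is malformed.
    network = ipaddress.ip_network(subnet)
    return ["sudo", "netdiscover", "-r", str(network)]


def nmap_command(target_ip: str, subnet: str = "10.0.2.0/24") -> list[str]:
    """Step 11: build the nmap call for the address found by netdiscover."""
    address = ipaddress.ip_address(target_ip)
    # Assumed sanity check: the target should sit inside the NAT network.
    if address not in ipaddress.ip_network(subnet):
        raise ValueError(f"{target_ip} is not inside {subnet}")
    return ["nmap", "-A", "-T4", "-p-", str(address)]


if __name__ == "__main__":
    # Placeholder values; substitute your own file names and discovered address.
    commands = [
        ovftool_command("vulnerable.vmx", "vulnerable.ova"),
        netdiscover_command(),
        nmap_command("10.0.2.5"),
    ]
    for command in commands:
        # shlex.join quotes arguments where needed so the line can be pasted into a shell.
        print(shlex.join(command))
```

Keeping each command as a list of arguments means the same values could later be passed to `subprocess.run` without shell quoting problems, should you want to execute them from a script.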

License

The Google Sheet and the code are under the MIT license. To cite this work, please use the following:

@misc{isozaki2024automatedpenetrationtestingintroducing,
      title={Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements}, 
      author={Isamu Isozaki and Manil Shrestha and Rick Console and Edward Kim},
      year={2024},
      eprint={2410.17141},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2410.17141}, 
}
