This project contains the materials used for the following blog post about the usage of the security.txt file:
https://excellium-services.com/2021/05/18/security-txt/
Script call chain: generate-source-(ct|majestic).sh > generate-stats.py.
Requirements for the Python (>= 3.7) script are installed via the command:
pip install -r requirements.txt
Extract a list of LU domains from the Certificate Transparency logs using the crt.sh data provider.
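As an illustration of this extraction step, here is a minimal Python sketch (the project itself uses a shell script for this). It assumes that crt.sh returns JSON when output=json is passed, that % acts as a wildcard in the q parameter, and that each record carries a name_value field which may hold several names separated by newlines. The helper names build_crtsh_url and extract_domains are hypothetical, and the sample records are made up so that the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

# Assumption: crt.sh serves JSON when "output=json" is passed and "%" is a wildcard.
CRT_SH_URL = "https://crt.sh/"


def build_crtsh_url(tld="lu"):
    """Build the search URL for every certificate name ending with the TLD."""
    return CRT_SH_URL + "?" + urlencode({"q": f"%.{tld}", "output": "json"})


def extract_domains(records, tld="lu"):
    """Return the sorted, de-duplicated names ending with the TLD."""
    suffix = "." + tld
    domains = set()
    for record in records:
        # Assumption: "name_value" may hold several names, one per line.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name.endswith(suffix):
                domains.add(name)
    return sorted(domains)


# Made-up sample records standing in for a real crt.sh response.
sample = json.loads(
    '[{"name_value": "www.excellium.lu\\n*.excellium.lu"},'
    ' {"name_value": "sip.excellium.lu"},'
    ' {"name_value": "example.com"}]'
)
print(build_crtsh_url())
print(extract_domains(sample))
```

With real data, the JSON body fetched from the URL printed above would replace the sample list.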
Extract a list of LU domains from the Majestic Top 1 million most visited sites data provider.
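The filtering of the Majestic list could look like the following Python sketch (the project uses a shell script for this step). It assumes the CSV has a header row with Domain and TLD columns; the function name extract_tld_domains and the sample rows are hypothetical and only serve to make the sketch runnable offline.

```python
import csv
import io


def extract_tld_domains(csv_text, tld="lu"):
    """Return the domains of the CSV whose TLD column matches the given TLD."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumption: the header row names the columns "Domain" and "TLD".
    return [row["Domain"].lower() for row in reader if row["TLD"].lower() == tld]


# Made-up rows, not real ranking data.
sample_csv = (
    "GlobalRank,TldRank,Domain,TLD\n"
    "1,1,example.com,com\n"
    "2,1,example.lu,lu\n"
    "3,2,demo.lu,lu\n"
)
print(extract_tld_domains(sample_csv))
```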
Same goal as generate-source-ct.sh, but using direct database access in order to extract more records. This script deals with the limits on the execution time allowed for an SQL query.
💬 However, after several attempts, it proved more efficient to use the web API via the advanced search, because the query execution time limits were too restrictive.
Check for the presence of the security.txt file on the different domains.
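This README does not detail how the presence check is performed, so the following Python sketch is only one plausible way to do it, not the project's actual implementation. It assumes the locations described in the security.txt draft (/.well-known/security.txt, with /security.txt as a legacy fallback) and treats a response as valid only if it contains a Contact: field. The function names, the User-Agent value and the size cap are hypothetical.

```python
import http.client
from urllib.error import URLError
from urllib.request import Request, urlopen


def candidate_urls(domain):
    """Locations to probe, preferred one first (assumption based on the draft)."""
    return [
        f"https://{domain}/.well-known/security.txt",
        f"https://{domain}/security.txt",
    ]


def looks_like_security_txt(body):
    """Heuristic: a security.txt file must contain at least one Contact field."""
    return any(
        line.strip().lower().startswith("contact:") for line in body.splitlines()
    )


def has_security_txt(domain, timeout=10.0):
    """Return True if one of the candidate URLs serves a plausible file."""
    for url in candidate_urls(domain):
        request = Request(url, headers={"User-Agent": "security-txt-check"})
        try:
            with urlopen(request, timeout=timeout) as response:
                status = response.status
                # Read at most 64 KiB; the cap is an arbitrary choice.
                body = response.read(65536).decode("utf-8", errors="replace")
        except (URLError, OSError, ValueError, http.client.HTTPException):
            continue  # unreachable host, TLS error, HTTP error status, etc.
        if status == 200 and looks_like_security_txt(body):
            return True
    return False
```

Checking for the Contact: field avoids counting sites that answer every path with an HTML page and a 200 status.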
💬 The approach, regarding the LU domains obtained, is the following:
- If the domain is a non-web one (pop, ftp, lync, etc.), then the subdomain is replaced by www: sip.excellium.lu becomes www.excellium.lu
- If the domain is a mail address, then the domain is extracted and the www subdomain is used as prefix: [email protected] becomes www.excellium.lu
- Duplicate domains are handled so that each domain is tested only once.
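The three rules above can be sketched in Python as follows. This is a minimal illustration, not the code of generate-stats.py: the set of non-web labels only contains the ones named in this README (the "etc." is left open), the function names are hypothetical, and the mail address local part used in the example is a placeholder.

```python
# Labels named in this README; the real list may be longer (assumption).
NON_WEB_LABELS = {"pop", "ftp", "lync", "sip"}


def normalize(entry):
    """Map a raw entry (domain or mail address) to the web domain to test."""
    entry = entry.strip().lower()
    if not entry:
        return None
    if "@" in entry:
        # Mail address: keep the domain part and prefix it with www.
        return "www." + entry.rsplit("@", 1)[1]
    first_label, _, rest = entry.partition(".")
    if first_label in NON_WEB_LABELS and "." in rest:
        # Non-web subdomain: replace it with www.
        return "www." + rest
    return entry


def unique_targets(entries):
    """Normalize all entries and drop duplicates while keeping the order."""
    seen = set()
    targets = []
    for entry in entries:
        domain = normalize(entry)
        if domain and domain not in seen:
            seen.add(domain)
            targets.append(domain)
    return targets


print(unique_targets([
    "sip.excellium.lu",
    "someone@excellium.lu",  # placeholder mail address
    "www.excellium.lu",
    "ftp.example.lu",
]))
```

The first three entries all collapse to www.excellium.lu, so only two domains remain to be tested.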
Used by the GitHub action workflow to ensure that the Python script performs its duty correctly.
The file test-source.txt is the same kind of file as the source-(ct|majestic).txt files. However, it contains only a subset of the domains because it is only used for the GitHub action workflow. This workflow allows GitHub's dependency checker to verify that upgrading a dependency does not break the Python script.
Contains the list of all LU domains gathered from the Certificate Transparency logs.
Contains the list of all LU domains gathered from Majestic Top 1 million most visited sites.
Files named IMG*.png are just images used for the blog post.
Visual Studio Code is used for all the scripts.
A workspace file as well as a Python debug configuration file are provided.
- https://tools.ietf.org/html/draft-foudil-securitytxt
- https://securitytxt.org/
- https://community.turgensec.com/security-txt-progress-in-ethical-security-research/
- http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
- https://blog.majestic.com/development/majestic-million-csv-daily/
- https://certificate.transparency.dev/
- https://www.randori.com/enumerating-subdomains-with-crt-sh/
- https://github.com/crtsh/certwatch_db/blob/master/sql/create_schema.sql
- https://hub.docker.com/_/postgres
- https://twitter.com/search?q=vulnerability%20contact%20me%20&src=typed_query