Feat/security legacy urls #2556

Open: wants to merge 77 commits into base: main
d5941ac
alert on legacy security.txt locations.
underdarknl Feb 22, 2024
bedbbe5
Update normalize.py
underdarknl Feb 26, 2024
166e62e
fix host header not being set for non https urls in main.py
underdarknl Feb 26, 2024
2bcc07b
add useragent to boefje.json
underdarknl Feb 26, 2024
e6f969f
Update kat_finding_types.json, add legacy security_txt location
underdarknl Feb 26, 2024
acbe491
linting normalize.py
underdarknl Feb 26, 2024
1701d4b
Update kat_finding_types.json
underdarknl Feb 26, 2024
83fbb71
Update normalize.py
underdarknl Feb 26, 2024
27265f4
Merge branch 'main' into feat/security-legacy-urls
underdarknl Feb 26, 2024
3f17ff6
fix legacy url check in normalize.py
underdarknl Mar 12, 2024
be3674f
Merge branch 'main' into feat/security-legacy-urls
underdarknl Mar 13, 2024
a97c534
Create missing security_txt bit.py
underdarknl Mar 13, 2024
65db495
Create missing_security_txt.py
underdarknl Mar 13, 2024
a60532d
Update report.py to handle missing security.txt finding and legacy se…
underdarknl Mar 13, 2024
847e070
Create __init__.py
underdarknl Mar 13, 2024
fa47316
Update bit.py linting
underdarknl Mar 13, 2024
9471edb
Update missing_security_txt.py linting
underdarknl Mar 13, 2024
6101e32
Update missing_security_txt.py
underdarknl Mar 13, 2024
8986326
Update bit.py
underdarknl Mar 13, 2024
3369ea4
Update missing_security_txt.py
underdarknl Mar 13, 2024
d7dceea
Update report.py
underdarknl Mar 13, 2024
c569e45
Update report.py
underdarknl Mar 13, 2024
1c4acc3
Update test_web_systems_report.py
underdarknl Mar 13, 2024
5886f2e
Update test_reports.py, security_Txt is located on website, as are it…
underdarknl Mar 13, 2024
68fb7fc
Update test_reports.py
underdarknl Mar 13, 2024
bd6a518
Merge branch 'main' into feat/security-legacy-urls
ammar92 Mar 20, 2024
d4c623d
Fixes
ammar92 Mar 20, 2024
93b5156
Merge branch 'main' into feat/security-legacy-urls
stephanie0x00 Mar 21, 2024
c9042cc
Fix capitalization
underdarknl Apr 12, 2024
bfa4316
Update missing_security_txt.py
underdarknl Apr 12, 2024
b39216e
Merge branch 'main' into feat/security-legacy-urls
underdarknl Apr 12, 2024
c872d20
Merge branch 'main' into feat/security-legacy-urls
underdarknl Apr 30, 2024
c29b191
Update normalize.py, handle 404's with content
underdarknl May 2, 2024
0d65dcb
Update main.py, add timeout, and make timeout configurable, add origi…
underdarknl May 2, 2024
437ecc2
Update boefje.json
underdarknl May 2, 2024
3f40ad1
Create schema.json
underdarknl May 2, 2024
0ff8af4
Update security_txt_result_different_website.json, add status code
underdarknl May 3, 2024
4ad34f2
Update security_txt_result_same_website.json
underdarknl May 3, 2024
2f63581
Update normalize.py, add fallback for old rawfiles that dont have sta…
underdarknl May 3, 2024
ab28a3d
Merge branch 'main' into feat/security-legacy-urls
underdarknl May 3, 2024
6692d94
linting
underdarknl May 3, 2024
0bdcefd
Update kat_finding_types.json
underdarknl May 3, 2024
eb8dac2
refactor boefje, single output
underdarknl May 3, 2024
9be85eb
Update normalize.py
underdarknl May 3, 2024
bfb27ea
Create security_txt_result_no_file.json
underdarknl May 3, 2024
6c1b57b
Create security_txt_legacy-only.json
underdarknl May 3, 2024
c43f7a6
add legacy only test
underdarknl May 3, 2024
d1ea058
linting security_txt_legacy-only.json
underdarknl May 3, 2024
77c2661
linting security_txt_result_no_file.json
underdarknl May 3, 2024
96ec478
linting main.py
underdarknl May 3, 2024
63c5f0f
linting normalize.py
underdarknl May 3, 2024
08d720c
Update test_sucurity_txt.py
underdarknl May 3, 2024
8645670
Update and rename test_sucurity_txt.py to test_security_txt.py
underdarknl May 3, 2024
d141857
Rename security_txt_legacy-only.json to security_txt_results_legacy_o…
underdarknl May 3, 2024
55ea05b
Update security_txt_results_legacy_only.json
underdarknl May 3, 2024
2501a68
Update security_txt_result_no_file.json
underdarknl May 3, 2024
b970158
Update main.py
underdarknl May 3, 2024
f638b07
Update normalize.py
underdarknl May 3, 2024
c40b024
Update test_security_txt.py
underdarknl May 3, 2024
1a76913
lint test_security_txt.py
underdarknl May 3, 2024
c0d6bc4
Update test_security_txt.py
underdarknl May 3, 2024
66bc846
Update main.py
underdarknl May 3, 2024
50028d1
Update normalize.py
underdarknl May 3, 2024
e77541a
Update security_txt_result_no_file.json
underdarknl May 3, 2024
a8dd860
Update security_txt_results_legacy_only.json
underdarknl May 3, 2024
c3725ff
Update test_security_txt.py
underdarknl May 3, 2024
1e94dcc
Update security_txt_results_legacy_only.json
underdarknl May 3, 2024
c55ce43
Update test_security_txt.py
underdarknl May 3, 2024
0a78b16
Merge branch 'main' into feat/security-legacy-urls
underdarknl May 21, 2024
d4aab32
Update test_security_txt.py
underdarknl May 21, 2024
9590a46
Update test_security_txt.py
underdarknl May 21, 2024
04ee3e2
Update test_security_txt.py
underdarknl May 21, 2024
b2d466c
Update test_security_txt.py
underdarknl May 21, 2024
e4b06ca
Merge branch 'main' into feat/security-legacy-urls
underdarknl Jun 3, 2024
d87c1c8
Merge branch 'main' into feat/security-legacy-urls
underdarknl Sep 10, 2024
df082be
precommit
underdarknl Dec 30, 2024
79b806a
linting
underdarknl Dec 30, 2024
@@ -489,6 +489,13 @@
     "impact": "Disallowed domains are domains that are for example 'world writable', this opens up the possibility for an atacker to host malicious files on a csp whitelisted domain.",
     "recommendation": "Remove the offending hostname from the CSP header."
   },
+  "KAT-LEGACY-SECURITY-LOCATION": {
+    "description": "This website only has a legacy location security.txt file.",
+    "source": "https://www.rfc-editor.org/rfc/rfc9116#section-3-1",
+    "risk": "info",
+    "impact": "Only providing the legacy url will mean, as time goes on, more and more tools and researchers will not find your Security disclosure policy possibly leading to less than ideal disclosure.",
+    "recommendation": "Add a security.txt file location in the /.well-known folder."
+  },
   "KAT-NONSTANDARD-HEADERS": {
     "description": "Headers are used that are nonstandard and should not be used anymore.",
     "risk": "low",
@@ -2,6 +2,10 @@
   "id": "security_txt_downloader",
   "name": "Security.txt downloader",
   "description": "Downloads the security.txt file from the target website to check if it contains all the required elements.",
+  "environment_keys": [
+    "USERAGENT",
+    "TIMEOUT"
+  ],
   "consumes": [
     "Website"
   ],
65 changes: 33 additions & 32 deletions boefjes/boefjes/plugins/kat_security_txt_downloader/main.py
@@ -5,61 +5,62 @@
 import requests
 from forcediphttpsadapter.adapters import ForcedIPHTTPSAdapter
 from requests import Session
-from requests.models import Response

 from boefjes.job_models import BoefjeMeta

+DEFAULT_TIMEOUT = 30
+DEFAULT_USERAGENT = "OpenKAT"
+

 def run(boefje_meta: BoefjeMeta) -> list[tuple[set, bytes | str]]:
     input_ = boefje_meta.arguments["input"]
     netloc = input_["hostname"]["name"]
     scheme = input_["ip_service"]["service"]["name"]
     ip = input_["ip_service"]["ip_port"]["address"]["address"]

-    useragent = getenv("USERAGENT", default="OpenKAT")
+    useragent = getenv("USERAGENT", default=DEFAULT_USERAGENT)
+
+    try:
+        timeout = int(getenv("TIMEOUT", default=DEFAULT_TIMEOUT))
+    except ValueError:
+        timeout = DEFAULT_TIMEOUT

     session = requests.Session()

     results = {}

     for path in [".well-known/security.txt", "security.txt"]:
-        uri = f"{scheme}://{netloc}/{path}"
+        request_url = f"{scheme}://{netloc}/{path}"

         if scheme == "https":
-            session.mount(uri, ForcedIPHTTPSAdapter(dest_ip=ip))
+            session.mount(request_url, ForcedIPHTTPSAdapter(dest_ip=ip))
         else:
             addr = ipaddress.ip_address(ip)
-            netloc = f"[{ip}]" if addr.version == 6 else ip
-
-            uri = f"{scheme}://{netloc}/{path}"
-
-        response = do_request(netloc, session, uri, useragent)
-
-        # if the response is 200, return the content
-        if response.status_code == 200:
-            results[path] = {"content": response.content.decode(), "url": response.url, "ip": ip, "status": 200}
-        # if the response is 301, we need to follow the location header to the correct security txt,
-        # we can not force the ip anymore
-        elif response.status_code in [301, 302, 307, 308]:
-            uri = response.headers["Location"]
-            response = requests.get(uri, stream=True, timeout=30, verify=False)  # noqa: S501
-            if response.raw._connection:
-                ip = response.raw._connection.sock.getpeername()[0]
-            else:
-                ip = ""
-            results[path] = {
-                "content": response.content.decode(),
-                "url": response.url,
-                "ip": str(ip),
-                "status": response.status_code,
-            }
-        else:
-            results[path] = {"content": None, "url": None, "ip": None, "status": response.status_code}
+            iploc = f"[{ip}]" if addr.version == 6 else ip
+            request_url = f"{scheme}://{iploc}/{path}"
+
+        response = do_request(netloc, session, request_url, useragent, timeout)
+
+        # we can not force the ip anymore because we dont know it yet.
+        # TODO return a redirected URL and have OpenKAT figure out if we want to follow this.
+        if response.status_code in [301, 302, 307, 308]:
+            request_url = response.headers["Location"]
+            response = requests.get(request_url, stream=True, timeout=timeout, verify=False)  # noqa: S501
Check failure (Code scanning / SonarCloud, High): Server certificates should be verified during SSL/TLS connections. Enable server certificate validation on this SSL/TLS connection.
+            ip = str(response.raw._connection.sock.getpeername()[0])
+
+        results[path] = {
+            "content": response.content.decode(),
+            "url": response.url,
+            "request_url": request_url,
+            "ip": ip,
+            "status": response.status_code,
+        }
     return [(set(), json.dumps(results))]


-def do_request(hostname: str, session: Session, uri: str, useragent: str) -> Response:
+def do_request(hostname: str, session: Session, uri: str, useragent: str, timeout: int):
     response = session.get(
-        uri, headers={"Host": hostname, "User-Agent": useragent}, verify=False, allow_redirects=False
+        uri, headers={"Host": hostname, "User-Agent": useragent}, timeout=timeout, verify=False, allow_redirects=False
     )

     return response
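The `TIMEOUT` handling in this file can be exercised in isolation. The following is a minimal sketch, not code from the PR: `read_timeout` is a hypothetical helper that takes the environment as a mapping so it can be tested without touching `os.environ`. It mirrors the `int(...)` parse with a `ValueError` fallback that the diff inlines in `run()`.

```python
import os
from collections.abc import Mapping

DEFAULT_TIMEOUT = 30  # same default as in the diff


def read_timeout(env: Mapping[str, str] = os.environ) -> int:
    """Parse TIMEOUT from env, falling back to DEFAULT_TIMEOUT on bad input.

    Hypothetical helper for illustration; the PR inlines this logic in run().
    """
    try:
        # When the key is missing, the integer default is passed to int() unchanged.
        return int(env.get("TIMEOUT", DEFAULT_TIMEOUT))
    except ValueError:
        # Non-numeric or empty values such as "abc" or "" end up here.
        return DEFAULT_TIMEOUT
```

With this parse alone, a value such as `"-1"` is still accepted; the `minimum` and `maximum` bounds declared in `schema.json` are not applied at this point.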
@@ -9,22 +9,28 @@
 from octopoes.models.ooi.network import IPAddressV4, IPAddressV6, IPPort, Network
 from octopoes.models.ooi.service import IPService, Service
 from octopoes.models.ooi.web import URL, SecurityTXT, Website
+from octopoes.models.types import Finding, KATFindingType


 def run(input_ooi: dict, raw: bytes) -> Iterable[NormalizerOutput]:
     results = json.loads(raw)
     website_original = Reference.from_str(input_ooi["primary_key"])
+    valid_results = {}
+
     for path, details in results.items():
-        if details["content"] is None:
+        # remove any nonsense locations from our validresults.
+        if details["content"] is None or details.get("status", 200) != 200:
             continue
+        valid_results[path] = details
+
         url_original = URL(
             raw=f'{input_ooi["ip_service"]["service"]["name"]}://{input_ooi["hostname"]["name"]}/{path}',
             network=Network(name=input_ooi["hostname"]["network"]["name"]).reference,
         )
         yield url_original
         url = URL(raw=details["url"], network=Network(name=input_ooi["hostname"]["network"]["name"]).reference)
         yield url

         url_parts = urlparse(details["url"])
         # we need to check if the website of the response is the same as the input website
         if (
@@ -82,3 +88,11 @@ def run(input_ooi: dict, raw: bytes) -> Iterable[NormalizerOutput]:
                 security_txt=None,
             )
             yield security_txt_original
+
+    # Check for legacy url https://www.rfc-editor.org/rfc/rfc9116#section-3-1
+    if "security.txt" in valid_results and ".well-known/security.txt" not in valid_results:
+        ft = KATFindingType(id="KAT-LEGACY-SECURITY-LOCATION")
+        yield ft
+        yield Finding(
+            description="Only legacy /security.txt location found.", finding_type=ft.reference, ooi=website_original
+        )
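The legacy-location decision in this normalizer reduces to a filter plus a membership test. The sketch below restates it as two pure functions over the raw result dictionary; `usable_results` and `is_legacy_only` are hypothetical names for illustration and do not appear in the PR, and the Octopoes objects are left out entirely.

```python
LEGACY_PATH = "security.txt"
WELL_KNOWN_PATH = ".well-known/security.txt"


def usable_results(results: dict[str, dict]) -> dict[str, dict]:
    """Keep only entries that have content and an HTTP 200 status.

    Entries from older raw files without a "status" key are treated as 200,
    matching the details.get("status", 200) fallback in the diff.
    """
    return {
        path: details
        for path, details in results.items()
        if details["content"] is not None and details.get("status", 200) == 200
    }


def is_legacy_only(results: dict[str, dict]) -> bool:
    """True when only the legacy /security.txt location returned a usable file."""
    valid = usable_results(results)
    return LEGACY_PATH in valid and WELL_KNOWN_PATH not in valid
```

Applied to the test fixtures in this PR, the legacy-only fixture (404 at `.well-known/security.txt`, 200 at `security.txt`) would yield `True`, while the no-file fixture (404 at both paths) would yield `False`.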
21 changes: 21 additions & 0 deletions boefjes/boefjes/plugins/kat_security_txt_downloader/schema.json
@@ -0,0 +1,21 @@
+{
+  "title": "Arguments",
+  "type": "object",
+  "properties": {
+    "USERAGENT": {
+      "title": "USERAGENT",
+      "maxLength": 128,
+      "type": "string",
+      "description": "The Useragent used by the downloader.",
+      "default": "OpenKat"
+    },
+    "TIMEOUT": {
+      "title": "TIMEOUT",
+      "maximum": 9999,
+      "minimum": 0,
+      "type": "integer",
+      "description": "The timeout used by the downloader before it fails a url.",
+      "default": 30
+    }
+  }
+}
@@ -2,6 +2,7 @@
   ".well-known/security.txt": {
     "content": "This is the content",
     "url": "https://www.example.com/.well-known/security.txt",
-    "ip": "192.0.2.1"
+    "ip": "192.0.2.1",
+    "status": 200
   }
 }
14 changes: 14 additions & 0 deletions boefjes/tests/examples/inputs/security_txt_result_no_file.json
@@ -0,0 +1,14 @@
+{
+  ".well-known/security.txt": {
+    "content": "<!DOCTYPE html><html><head><title>404 Not Found</title></head><body><h1>Not Found</h1><p>The requested URL \"https://www.example.com/.well-known/security.txt\" was not found on this server.</p></body></html>",
+    "url": "https://www.example.com/.well-known/security.txt",
+    "ip": "192.0.2.0",
+    "status": 404
+  },
+  "security.txt": {
+    "content": "<!DOCTYPE html><html><head><title>404 Not Found</title></head><body><h1>Not Found</h1><p>The requested URL \"https://www.example.com/security.txt\" was not found on this server.</p></body></html>",
+    "url": "https://www.example.com/security.txt",
+    "ip": "192.0.2.0",
+    "status": 404
+  }
+}
@@ -2,6 +2,7 @@
   ".well-known/security.txt": {
     "content": "This is the content",
     "url": "https://example.com/.well-known/security.txt",
-    "ip": "192.0.2.0"
+    "ip": "192.0.2.0",
+    "status": 200
   }
 }
@@ -0,0 +1,14 @@
+{
+  ".well-known/security.txt": {
+    "content": "<!DOCTYPE html><html><head><title>404 Not Found</title></head><body><h1>Not Found</h1><p>The requested URL \"https://www.example.com/.well-known/security.txt\" was not found on this server.</p></body></html>",
+    "url": "https://example.com/.well-known/security.txt",
+    "ip": "192.0.2.0",
+    "status": 404
+  },
+  "security.txt": {
+    "content": "Contact: mailto:[email protected]\nPreferred-Languages: nl, en\nExpires: 2030-01-01T00:00:00.000Z",
+    "url": "https://example.com/security.txt",
+    "ip": "192.0.2.0",
+    "status": 200
+  }
+}
Empty file.
9 changes: 9 additions & 0 deletions octopoes/bits/missing_security_txt/bit.py
@@ -0,0 +1,9 @@
+from bits.definitions import BitDefinition, BitParameterDefinition
+from octopoes.models.ooi.web import SecurityTXT, Website
+
+BIT = BitDefinition(
+    id="missing_security_txt",
+    consumes=Website,
+    parameters=[BitParameterDefinition(ooi_type=SecurityTXT, relation_path="website")],
+    module="bits.missing_security_txt.missing_security_txt",
+)
16 changes: 16 additions & 0 deletions octopoes/bits/missing_security_txt/missing_security_txt.py
@@ -0,0 +1,16 @@
+from collections.abc import Iterator
+
+from octopoes.models import OOI
+from octopoes.models.ooi.findings import Finding, KATFindingType
+from octopoes.models.ooi.web import SecurityTXT, Website
+
+
+def run(input_ooi: Website, additional_oois: list[SecurityTXT], config: dict[str, str]) -> Iterator[OOI]:
+    if not additional_oois:
+        ft = KATFindingType(id="KAT-NO-SECURITY-TXT")
+        yield ft
+        yield Finding(
+            ooi=input_ooi.reference,
+            finding_type=ft.reference,
+            description="This website does not have a security.txt file",
+        )
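The bit above can be read as "yield a finding type and a finding only when no SecurityTXT is attached to the Website". The sketch below reproduces that control flow with plain dataclasses so it runs without Octopoes; `FakeFindingType` and `FakeFinding` are hypothetical stand-ins for `KATFindingType` and `Finding`, and references are simplified to strings.

```python
from collections.abc import Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeFindingType:
    """Stand-in for KATFindingType (assumption, not the real model)."""

    id: str


@dataclass(frozen=True)
class FakeFinding:
    """Stand-in for Finding (assumption, not the real model)."""

    ooi: str
    finding_type: str
    description: str


def run(input_ooi: str, additional_oois: list, config: dict[str, str]) -> Iterator[object]:
    # Same shape as the bit: emit objects only when no SecurityTXT is linked.
    if not additional_oois:
        ft = FakeFindingType(id="KAT-NO-SECURITY-TXT")
        yield ft
        yield FakeFinding(
            ooi=input_ooi,
            finding_type=ft.id,
            description="This website does not have a security.txt file",
        )
```

Calling `list(run("Website|example", [], {}))` produces two objects, the finding type followed by the finding; passing any non-empty list of related objects produces nothing.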
22 changes: 7 additions & 15 deletions rocky/reports/report_types/web_system_report/report.py
@@ -8,7 +8,7 @@
 from django.utils.translation import gettext_lazy as _

 from octopoes.models.ooi.dns.zone import Hostname
-from octopoes.models.ooi.findings import KATFindingType, RiskLevelSeverity
+from octopoes.models.ooi.findings import RiskLevelSeverity
 from octopoes.models.ooi.network import IPAddressV4, IPAddressV6
 from reports.report_types.definitions import Report

@@ -132,9 +132,10 @@ def collect_data(self, input_oois: Iterable[str], valid_time: datetime) -> dict[
         no_certificate_finding_types = self.group_finding_types_by_source(
             self.octopoes_api_connector.query_many(query, valid_time, all_hostnames), ["KAT-NO-CERTIFICATE"]
         )
-        query = "Hostname.<hostname[is Website].<website[is SecurityTXT]"
-        has_security_txt_finding_types = self.group_finding_types_by_source(
-            self.octopoes_api_connector.query_many(query, valid_time, all_hostnames)
+        query = "Hostname.<hostname[is Website].<ooi[is Finding].finding_type"
+        security_txt_finding_types = self.group_finding_types_by_source(
+            self.octopoes_api_connector.query_many(query, valid_time, all_hostnames),
+            ["KAT-NO-SECURITY-TXT", "KAT-LEGACY-SECURITY-LOCATION"],
         )
         query = "Hostname.<hostname[is ResolvedHostname].address.<address[is IPPort].<ooi[is Finding].finding_type"
         port_finding_types = self.group_finding_types_by_source(
@@ -161,16 +162,7 @@ def collect_data(self, input_oois: Iterable[str], valid_time: datetime) -> dict[
                 )
                 check.redirects_http_https = not any(url_finding_types.get(hostname, []))
                 check.offers_https = not any(no_certificate_finding_types.get(hostname, []))
-                check.has_security_txt = bool(has_security_txt_finding_types.get(hostname, []))
-                security_txt_finding_types = [
-                    KATFindingType(
-                        id="KAT-NO-SECURITY-TXT",
-                        description="This hostname does not have a Security.txt file.",
-                        risk_severity=RiskLevelSeverity.RECOMMENDATION,
-                        recommendation="Make sure there is a security.txt available.",
-                    )
-                ]
-
+                check.has_security_txt = not any(security_txt_finding_types.get(hostname, []))
                 check.no_uncommon_ports = not any(port_finding_types.get(hostname, []))
                 check.has_certificates = check.offers_https
                 check.certificates_not_expired = check.has_certificates and "KAT-CERTIFICATE-EXPIRED" not in [
@@ -190,7 +182,7 @@ def collect_data(self, input_oois: Iterable[str], valid_time: datetime) -> dict[
                     + no_certificate_finding_types.get(hostname, [])
                     + port_finding_types.get(hostname, [])
                     + certificate_finding_types.get(hostname, [])
-                    + security_txt_finding_types
+                    + security_txt_finding_types.get(hostname, [])
                 )

                 for finding_type in new_types:
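After this change, `check.has_security_txt` is derived from findings rather than from the presence of a `SecurityTXT` object. The sketch below isolates that expression; `has_security_txt` is a hypothetical function name, and finding types are represented as plain ID strings instead of `KATFindingType` objects.

```python
def has_security_txt(finding_types_by_hostname: dict[str, list[str]], hostname: str) -> bool:
    """Mirror of `not any(security_txt_finding_types.get(hostname, []))`.

    The check passes only when no KAT-NO-SECURITY-TXT or
    KAT-LEGACY-SECURITY-LOCATION finding type is grouped under the hostname.
    """
    return not any(finding_types_by_hostname.get(hostname, []))


# Hypothetical grouped query result, keyed by hostname reference.
grouped = {
    "Hostname|test|legacy.example.com": ["KAT-LEGACY-SECURITY-LOCATION"],
    "Hostname|test|missing.example.com": ["KAT-NO-SECURITY-TXT"],
    "Hostname|test|ok.example.com": [],
}
```

Two consequences follow from the expression itself: a legacy-only website fails the check just like a website without any security.txt, and a hostname that is absent from the grouped result counts as passing.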
16 changes: 3 additions & 13 deletions rocky/tests/integration/test_reports.py
@@ -9,7 +9,7 @@
 from octopoes.api.models import Declaration
 from octopoes.connector.octopoes import OctopoesAPIConnector
 from octopoes.models import Reference
-from octopoes.models.ooi.findings import Finding, KATFindingType, RiskLevelSeverity
+from octopoes.models.ooi.findings import Finding
 from octopoes.models.ooi.reports import ReportData
 from tests.integration.conftest import seed_system

@@ -22,7 +22,7 @@ def test_web_report(octopoes_api_connector: OctopoesAPIConnector, valid_time):
     data = report.collect_data([input_ooi], valid_time)[input_ooi]

     assert data["input_ooi"] == input_ooi
-    assert len(data["finding_types"]) == 1
+    assert len(data["finding_types"]) == 0
     assert len(data["web_checks"]) == 1

     assert asdict(data["web_checks"].checks[0]) == {
@@ -188,12 +188,6 @@ def test_aggregate_report(octopoes_api_connector: OctopoesAPIConnector, valid_ti
         },
         "safe_connections": {"number_of_compliant": 1, "total": 1},
     }
-    security_txt_finding_type = KATFindingType(
-        id="KAT-NO-SECURITY-TXT",
-        description="This hostname does not have a Security.txt file.",
-        recommendation="Make sure there is a security.txt available.",
-        risk_severity=RiskLevelSeverity.RECOMMENDATION,
-    )
     assert data["basic_security"]["summary"]["Web"] == {
         "rpki": {"number_of_compliant": 2, "total": 2},
         "system_specific": {
@@ -210,10 +204,7 @@ def test_aggregate_report(octopoes_api_connector: OctopoesAPIConnector, valid_ti
                 "Certificate is not expired": 2,
                 "Certificate is not expiring soon": 2,
             },
-            "ips": {
-                "IPAddressV4|test|192.0.2.3": [security_txt_finding_type],
-                "IPAddressV6|test|3e4d:64a2:cb49:bd48:a1ba:def3:d15d:9230": [security_txt_finding_type],
-            },
+            "ips": {"IPAddressV4|test|192.0.2.3": [], "IPAddressV6|test|3e4d:64a2:cb49:bd48:a1ba:def3:d15d:9230": []},
         },
         "safe_connections": {"number_of_compliant": 2, "total": 2},
     }
@@ -385,4 +376,3 @@ def test_multi_report(
         "Other": {"total": 2, "enabled": 2},
         "Web": {"total": 2, "enabled": 2},
     }
-    assert multi_data["recommendation_counts"] == {"Make sure there is a security.txt available.": 2}
Review comment (Contributor): I think it would be good to still test that KAT-NO-SECURITY-TXT works in the report.

3 changes: 1 addition & 2 deletions rocky/tests/reports/test_web_systems_report.py
@@ -1,7 +1,7 @@
 from reports.report_types.web_system_report.report import WebSystemReport


-def test_web_report_no_findings(mock_octopoes_api_connector, valid_time, hostname, security_txt):
+def test_web_report_no_findings(mock_octopoes_api_connector, valid_time, hostname):
     mock_octopoes_api_connector.oois = {hostname.reference: hostname}
     mock_octopoes_api_connector.queries = {
         "Hostname.<hostname[is ResolvedHostname].address": {hostname.reference: []},
@@ -12,7 +12,6 @@ def test_web_report_no_findings(mock_octopoes_api_connector, valid_time, hostnam
         "<ooi[is Finding].finding_type": {hostname.reference: []},
         "Hostname.<netloc[is HostnameHTTPURL].<ooi[is Finding].finding_type": {hostname.reference: []},
         "Hostname.<hostname[is Website].<ooi[is Finding].finding_type": {hostname.reference: []},
-        "Hostname.<hostname[is Website].<website[is SecurityTXT]": {hostname.reference: [security_txt]},
         "Hostname.<hostname[is ResolvedHostname].address.<address[is IPPort].<ooi[is Finding].finding_type": {
             hostname.reference: []
         },