Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Exclude certificate classification for sensitive data queries #17314

Merged
merged 7 commits into from
Sep 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,9 @@ module CleartextLogging {
*/
class SensitiveDataSourceAsSource extends Source, SensitiveDataSource {
SensitiveDataSourceAsSource() {
not SensitiveDataSource.super.getClassification() = SensitiveDataClassification::id()
not SensitiveDataSource.super.getClassification() in [
SensitiveDataClassification::id(), SensitiveDataClassification::certificate()
]
}

override SensitiveDataClassification getClassification() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,9 @@ module CleartextStorage {
*/
class SensitiveDataSourceAsSource extends Source, SensitiveDataSource {
SensitiveDataSourceAsSource() {
not SensitiveDataSource.super.getClassification() = SensitiveDataClassification::id()
not SensitiveDataSource.super.getClassification() in [
SensitiveDataClassification::id(), SensitiveDataClassification::certificate()
]
}

override SensitiveDataClassification getClassification() {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* The `py/clear-text-logging-sensitive-data` and `py/clear-text-storage-sensitive-data` queries have been updated to exclude the `certificate` classification of sensitive sources, which often do not contain sensitive data.
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,6 @@ nodes
| test.py:23:58:23:65 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:27:40:27:47 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:30:58:30:65 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:34:30:34:39 | ControlFlowNode for get_cert() | semmle.label | ControlFlowNode for get_cert() |
| test.py:37:11:37:24 | ControlFlowNode for get_password() | semmle.label | ControlFlowNode for get_password() |
| test.py:39:22:39:35 | ControlFlowNode for get_password() | semmle.label | ControlFlowNode for get_password() |
| test.py:40:22:40:35 | ControlFlowNode for get_password() | semmle.label | ControlFlowNode for get_password() |
Expand Down Expand Up @@ -73,7 +72,6 @@ subpaths
| test.py:23:58:23:65 | ControlFlowNode for password | test.py:19:16:19:29 | ControlFlowNode for get_password() | test.py:23:58:23:65 | ControlFlowNode for password | This expression logs $@ as clear text. | test.py:19:16:19:29 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:27:40:27:47 | ControlFlowNode for password | test.py:19:16:19:29 | ControlFlowNode for get_password() | test.py:27:40:27:47 | ControlFlowNode for password | This expression logs $@ as clear text. | test.py:19:16:19:29 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:30:58:30:65 | ControlFlowNode for password | test.py:19:16:19:29 | ControlFlowNode for get_password() | test.py:30:58:30:65 | ControlFlowNode for password | This expression logs $@ as clear text. | test.py:19:16:19:29 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:34:30:34:39 | ControlFlowNode for get_cert() | test.py:34:30:34:39 | ControlFlowNode for get_cert() | test.py:34:30:34:39 | ControlFlowNode for get_cert() | This expression logs $@ as clear text. | test.py:34:30:34:39 | ControlFlowNode for get_cert() | sensitive data (certificate) |
| test.py:37:11:37:24 | ControlFlowNode for get_password() | test.py:37:11:37:24 | ControlFlowNode for get_password() | test.py:37:11:37:24 | ControlFlowNode for get_password() | This expression logs $@ as clear text. | test.py:37:11:37:24 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:39:22:39:35 | ControlFlowNode for get_password() | test.py:39:22:39:35 | ControlFlowNode for get_password() | test.py:39:22:39:35 | ControlFlowNode for get_password() | This expression logs $@ as clear text. | test.py:39:22:39:35 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:40:22:40:35 | ControlFlowNode for get_password() | test.py:40:22:40:35 | ControlFlowNode for get_password() | test.py:40:22:40:35 | ControlFlowNode for get_password() | This expression logs $@ as clear text. | test.py:40:22:40:35 | ControlFlowNode for get_password() | sensitive data (password) |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ def log_password():


def log_cert():
logging.debug("Cert=%s", get_cert()) # NOT OK
logging.debug("Cert=%s", get_cert()) # OK

def print_password():
print(get_password()) # NOT OK
Expand All @@ -52,8 +52,8 @@ def log1(social_security_number, ssn, className, passportNo):
print(passportNo) # NOT OK

def log2(post_code, zipCode, home_address):
print(post_code) # NOT OK, but NOT FOUND - "code" is treated as enxrypted and thus not sensitive
print(zipCode) # NOT OK, but NOT FOUND - "code" is treated as enxrypted and thus not sensitive
print(post_code) # NOT OK, but NOT FOUND - "code" is treated as encrypted and thus not sensitive
print(zipCode) # NOT OK, but NOT FOUND - "code" is treated as encrypted and thus not sensitive
print(home_address) # NOT OK

def log3(user_latitude, user_longitude):
Expand Down
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
edges
| test.py:9:5:9:8 | ControlFlowNode for cert | test.py:12:21:12:24 | ControlFlowNode for cert | provenance | |
| test.py:9:5:9:8 | ControlFlowNode for cert | test.py:13:22:13:41 | ControlFlowNode for Attribute() | provenance | |
| test.py:9:5:9:8 | ControlFlowNode for cert | test.py:15:26:15:29 | ControlFlowNode for cert | provenance | |
| test.py:9:12:9:21 | ControlFlowNode for get_cert() | test.py:9:5:9:8 | ControlFlowNode for cert | provenance | |
| test.py:9:5:9:12 | ControlFlowNode for password | test.py:12:21:12:28 | ControlFlowNode for password | provenance | |
| test.py:9:5:9:12 | ControlFlowNode for password | test.py:13:22:13:45 | ControlFlowNode for Attribute() | provenance | |
| test.py:9:5:9:12 | ControlFlowNode for password | test.py:15:26:15:33 | ControlFlowNode for password | provenance | |
| test.py:9:16:9:29 | ControlFlowNode for get_password() | test.py:9:5:9:12 | ControlFlowNode for password | provenance | |
nodes
| test.py:9:5:9:8 | ControlFlowNode for cert | semmle.label | ControlFlowNode for cert |
| test.py:9:12:9:21 | ControlFlowNode for get_cert() | semmle.label | ControlFlowNode for get_cert() |
| test.py:12:21:12:24 | ControlFlowNode for cert | semmle.label | ControlFlowNode for cert |
| test.py:13:22:13:41 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| test.py:15:26:15:29 | ControlFlowNode for cert | semmle.label | ControlFlowNode for cert |
| test.py:9:5:9:12 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:9:16:9:29 | ControlFlowNode for get_password() | semmle.label | ControlFlowNode for get_password() |
| test.py:12:21:12:28 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:13:22:13:45 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| test.py:15:26:15:33 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
subpaths
#select
| test.py:12:21:12:24 | ControlFlowNode for cert | test.py:9:12:9:21 | ControlFlowNode for get_cert() | test.py:12:21:12:24 | ControlFlowNode for cert | This expression stores $@ as clear text. | test.py:9:12:9:21 | ControlFlowNode for get_cert() | sensitive data (certificate) |
| test.py:13:22:13:41 | ControlFlowNode for Attribute() | test.py:9:12:9:21 | ControlFlowNode for get_cert() | test.py:13:22:13:41 | ControlFlowNode for Attribute() | This expression stores $@ as clear text. | test.py:9:12:9:21 | ControlFlowNode for get_cert() | sensitive data (certificate) |
| test.py:15:26:15:29 | ControlFlowNode for cert | test.py:9:12:9:21 | ControlFlowNode for get_cert() | test.py:15:26:15:29 | ControlFlowNode for cert | This expression stores $@ as clear text. | test.py:9:12:9:21 | ControlFlowNode for get_cert() | sensitive data (certificate) |
| test.py:12:21:12:28 | ControlFlowNode for password | test.py:9:16:9:29 | ControlFlowNode for get_password() | test.py:12:21:12:28 | ControlFlowNode for password | This expression stores $@ as clear text. | test.py:9:16:9:29 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:13:22:13:45 | ControlFlowNode for Attribute() | test.py:9:16:9:29 | ControlFlowNode for get_password() | test.py:13:22:13:45 | ControlFlowNode for Attribute() | This expression stores $@ as clear text. | test.py:9:16:9:29 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:15:26:15:33 | ControlFlowNode for password | test.py:9:16:9:29 | ControlFlowNode for get_password() | test.py:15:26:15:33 | ControlFlowNode for password | This expression stores $@ as clear text. | test.py:9:16:9:29 | ControlFlowNode for get_password() | sensitive data (password) |
Original file line number Diff line number Diff line change
@@ -1,15 +1,15 @@
import pathlib


def get_cert():
return "<CERT>"
def get_password():
return "password"
Comment on lines +4 to +5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joefarebrother Could you elaborate on the rationale behind this change? The non-Python3 version tests certificate and password writes. Whereas the Python3 test now seems to have switched from testing certificate writes to testing password writes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No significant reason. Would it be best to include tests for both?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to @RasmusWL and @tausbn's judgement on that. 🙏🏼

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My interpretation is that this -py3 test is here to ensure we properly support writing through pathlib.Path methods for this query, so LGTM 👍



def write_password(filename):
cert = get_cert()
password = get_password()

path = pathlib.Path(filename)
path.write_text(cert) # NOT OK
path.write_bytes(cert.encode("utf-8")) # NOT OK
path.write_text(password) # NOT OK
path.write_bytes(password.encode("utf-8")) # NOT OK

path.open("w").write(cert) # NOT OK
path.open("w").write(password) # NOT OK
Original file line number Diff line number Diff line change
Expand Up @@ -3,25 +3,25 @@ edges
| password_in_cookie.py:7:16:7:43 | ControlFlowNode for Attribute() | password_in_cookie.py:7:5:7:12 | ControlFlowNode for password | provenance | |
| password_in_cookie.py:14:5:14:12 | ControlFlowNode for password | password_in_cookie.py:16:33:16:40 | ControlFlowNode for password | provenance | |
| password_in_cookie.py:14:16:14:43 | ControlFlowNode for Attribute() | password_in_cookie.py:14:5:14:12 | ControlFlowNode for password | provenance | |
| test.py:6:5:6:8 | ControlFlowNode for cert | test.py:8:20:8:23 | ControlFlowNode for cert | provenance | |
| test.py:6:5:6:8 | ControlFlowNode for cert | test.py:9:9:9:13 | ControlFlowNode for lines | provenance | |
| test.py:6:12:6:21 | ControlFlowNode for get_cert() | test.py:6:5:6:8 | ControlFlowNode for cert | provenance | |
| test.py:9:9:9:13 | ControlFlowNode for lines | test.py:10:25:10:29 | ControlFlowNode for lines | provenance | |
| test.py:15:5:15:12 | ControlFlowNode for password | test.py:17:20:17:27 | ControlFlowNode for password | provenance | |
| test.py:15:5:15:12 | ControlFlowNode for password | test.py:18:9:18:13 | ControlFlowNode for lines | provenance | |
| test.py:15:16:15:29 | ControlFlowNode for get_password() | test.py:15:5:15:12 | ControlFlowNode for password | provenance | |
| test.py:18:9:18:13 | ControlFlowNode for lines | test.py:19:25:19:29 | ControlFlowNode for lines | provenance | |
nodes
| password_in_cookie.py:7:5:7:12 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| password_in_cookie.py:7:16:7:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| password_in_cookie.py:9:33:9:40 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| password_in_cookie.py:14:5:14:12 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| password_in_cookie.py:14:16:14:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| password_in_cookie.py:16:33:16:40 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:6:5:6:8 | ControlFlowNode for cert | semmle.label | ControlFlowNode for cert |
| test.py:6:12:6:21 | ControlFlowNode for get_cert() | semmle.label | ControlFlowNode for get_cert() |
| test.py:8:20:8:23 | ControlFlowNode for cert | semmle.label | ControlFlowNode for cert |
| test.py:9:9:9:13 | ControlFlowNode for lines | semmle.label | ControlFlowNode for lines |
| test.py:10:25:10:29 | ControlFlowNode for lines | semmle.label | ControlFlowNode for lines |
| test.py:15:5:15:12 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:15:16:15:29 | ControlFlowNode for get_password() | semmle.label | ControlFlowNode for get_password() |
| test.py:17:20:17:27 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:18:9:18:13 | ControlFlowNode for lines | semmle.label | ControlFlowNode for lines |
| test.py:19:25:19:29 | ControlFlowNode for lines | semmle.label | ControlFlowNode for lines |
subpaths
#select
| password_in_cookie.py:9:33:9:40 | ControlFlowNode for password | password_in_cookie.py:7:16:7:43 | ControlFlowNode for Attribute() | password_in_cookie.py:9:33:9:40 | ControlFlowNode for password | This expression stores $@ as clear text. | password_in_cookie.py:7:16:7:43 | ControlFlowNode for Attribute() | sensitive data (password) |
| password_in_cookie.py:16:33:16:40 | ControlFlowNode for password | password_in_cookie.py:14:16:14:43 | ControlFlowNode for Attribute() | password_in_cookie.py:16:33:16:40 | ControlFlowNode for password | This expression stores $@ as clear text. | password_in_cookie.py:14:16:14:43 | ControlFlowNode for Attribute() | sensitive data (password) |
| test.py:8:20:8:23 | ControlFlowNode for cert | test.py:6:12:6:21 | ControlFlowNode for get_cert() | test.py:8:20:8:23 | ControlFlowNode for cert | This expression stores $@ as clear text. | test.py:6:12:6:21 | ControlFlowNode for get_cert() | sensitive data (certificate) |
| test.py:10:25:10:29 | ControlFlowNode for lines | test.py:6:12:6:21 | ControlFlowNode for get_cert() | test.py:10:25:10:29 | ControlFlowNode for lines | This expression stores $@ as clear text. | test.py:6:12:6:21 | ControlFlowNode for get_cert() | sensitive data (certificate) |
| test.py:17:20:17:27 | ControlFlowNode for password | test.py:15:16:15:29 | ControlFlowNode for get_password() | test.py:17:20:17:27 | ControlFlowNode for password | This expression stores $@ as clear text. | test.py:15:16:15:29 | ControlFlowNode for get_password() | sensitive data (password) |
| test.py:19:25:19:29 | ControlFlowNode for lines | test.py:15:16:15:29 | ControlFlowNode for get_password() | test.py:19:25:19:29 | ControlFlowNode for lines | This expression stores $@ as clear text. | test.py:15:16:15:29 | ControlFlowNode for get_password() | sensitive data (password) |
Original file line number Diff line number Diff line change
@@ -1,12 +1,21 @@
def get_cert():
return "<CERT>"

def get_password():
return "password"

def write_cert(filename):
cert = get_cert()
with open(filename, "w") as file:
file.write(cert) # NOT OK
file.write(cert) # OK
lines = [cert + "\n"]
file.writelines(lines) # OK

def write_password(filename):
password = get_password()
with open(filename, "w") as file:
file.write(password) # NOT OK
lines = [password + "\n"]
file.writelines(lines) # NOT OK

def FPs():
Expand Down
Loading