Python: Add support for threat models #17203

RasmusWL · 2024-08-12T13:58:38Z

No description provided.

python/ql/lib/semmle/python/frameworks/data/ModelsAsData.qll

Naming in other languages:

- `SourceNode` (for QL only modeling)
- `ThreatModelFlowSource` (for active sources from QL or data-extensions)

However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), following the same naming convention as other languages seemed a bit confusing, so I came up with new names instead.

Without this, it's impossible to write a test showing which threat-models are active by default... unless I provide a hardcoded list in the test itself, which is not any fun.

I didn't want to put the configuration file in `semmle/python/frameworks/**/*.model.yml`, so I created `ext/`, as in other languages.

asgerf

Looks great! I mainly have a comment about the class names; otherwise the structure LGTM.

asgerf · 2024-08-20T09:50:56Z

python/ql/lib/semmle/python/Concepts.qll

+ * Extend this class to refine existing API models. If you want to model new APIs,
+ * extend `ThreatModelSource::Range` instead.
+ */
+class ThreatModelSource extends DataFlow::Node instanceof ThreatModelSource::Range {


Might I suggest the name FlowSource for this class? It seems consistent with C++ and Swift at least, and it works nicely with RemoteFlowSource being a special case of it.

Then we could rename ActiveThreatModelSource to ThreatModelFlowSource to be consistent with other languages. I do agree that the "active" prefix makes sense, but given that this will be the new go-to thing to use in isSource() predicates it seems that we really do want consistency for that class name.

I've persuaded @michaelnebel to be in favor of ActiveThreatModelSource, so I've just filed a PR to make the existing languages use that as well 👍 #17424

FlowSource

Regarding renaming to FlowSource, I've tried to do that here: RasmusWL@dec5daa

I'm slightly hesitant to accept it. I can't quite put my finger on it, but I think it's because the name is so generic that it could be used to capture the set of sources for any data-flow/taint-tracking configuration, no matter the logic, whereas ThreatModelSource seems to convey a more specific meaning to me.

I realize that your suggestion probably fits pretty well with current naming in C#/Java/Go with SourceNode and C++/Swift with FlowSource 🤔 Maybe I'll see if anyone comes up with a convincing argument during the next round of review; otherwise it looks like I should just disagree-and-commit.

Whatever we do we ought to treat the naming of the two classes as a single decision; not something where we try to make a decision for each class in isolation.

The proposed rename to FlowSource made sense when the other class would be called ThreatModelFlowSource, but it doesn't work so nicely with ActiveThreatModelSource. I'd prefer the combination ThreatModelSource/ActiveThreatModelSource over FlowSource/ActiveThreatModelSource.

Alright, I'm feeling strongly in favor of ActiveThreatModelSource, so it seems like I won't add the FlowSource commit 👍

python/ql/lib/semmle/python/security/dataflow/CodeInjectionCustomizations.qll

python/ql/test/library-tests/frameworks/stdlib/threat_models.py

Since using `.DictionaryElementAny` doesn't actually do a store on the source (so that we can later follow any dict read-steps), I added the `ensure_tainted` steps to highlight that the result of the WHOLE expression ends up "tainted", and that we don't just mark `os.environ` as the source without further flow.
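As a rough illustration of the test pattern described in this comment, the sketch below wraps whole expressions in `ensure_tainted`. The stub definition of `ensure_tainted` and the `DEMO_VAR` key are assumptions added here so the snippet runs on its own; in the CodeQL test suite the call only serves as a marker that the test query inspects, and the actual contents of `threat_models.py` may differ.

```python
import os


def ensure_tainted(*args):
    """Hypothetical stand-in for the marker call used in the test file.

    At runtime it does nothing interesting: it returns its arguments so
    the snippet can be executed and inspected.
    """
    return args


# Assumption: a demo variable so the subscript lookup below cannot fail.
os.environ["DEMO_VAR"] = "demo"

# Each argument is a WHOLE expression whose result is expected to be tainted,
# not just the `os.environ` object that the reads start from.
checked = ensure_tainted(
    os.environ["DEMO_VAR"],      # subscript read on os.environ
    os.environ.get("DEMO_VAR"),  # .get() read on os.environ
    os.getenv("DEMO_VAR"),       # helper that reads the same mapping
)
```

The point of wrapping the full expression is that the analysis has to follow the read step from `os.environ` to the looked-up value; marking only `os.environ` itself would not demonstrate that.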

@michaelnebel

As part of adding support for threat-models to Python/JS (see github#17203), we ran into some trouble with name clashes.

Naming in existing languages supporting threat-models:

- `SourceNode` (for QL only modeling)
- `ThreatModelFlowSource` (for active sources from QL or data-extensions)

However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), it seems a bit confusing to follow the same naming convention as other languages, and we had to come up with new names.

Initially I used `ThreatModelSource` for the "QL only modeling", but that meant that we needed a new name to represent the active sources coming from either QL or data-extensions... for this I came up with `ActiveThreatModelSource`, and I really liked it. To me, it's much clearer that this class only contains the currently active threat model sources.

So to align languages, I got approval from @michaelnebel to rename the existing classes.
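The distinction between the two class names can be pictured with a small toy model. This is only a conceptual sketch in Python, not the QL implementation: `ModeledSource`, `active_sources`, and the example entries are hypothetical names invented for illustration, and the sketch ignores how CodeQL groups threat models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeledSource:
    """Toy stand-in for a modeled source: a description plus its threat model."""

    description: str
    threat_model: str


def active_sources(all_sources, active_threat_models):
    """Return only the sources whose threat model is currently enabled."""
    return [s for s in all_sources if s.threat_model in active_threat_models]


# Every modeled source, regardless of configuration
# (roughly the role `ThreatModelSource` plays in the discussion).
ALL_SOURCES = [
    ModeledSource("HTTP request parameter", "remote"),
    ModeledSource("os.environ lookup", "environment"),
    ModeledSource("sys.argv element", "commandargs"),
]

# Hypothetical configuration in which only "remote" is enabled; the filtered
# list plays the role of `ActiveThreatModelSource`.
active = active_sources(ALL_SOURCES, {"remote"})
```

Enabling an additional threat model, for example `active_sources(ALL_SOURCES, {"remote", "environment"})`, widens the active set without changing which sources are modeled.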

tausbn

Just one minor comment, otherwise I think this looks really good! (I'm approving it now in case the minor comment is irrelevant.)

python/ql/lib/semmle/python/Concepts.qll

Co-authored-by: Taus <[email protected]>

tausbn

github-actions bot added the Python label Aug 12, 2024

github-advanced-security bot found potential problems Aug 12, 2024

RasmusWL force-pushed the threat-models branch from 1e35fd4 to 8747fa4 August 16, 2024 09:11

github-actions bot added the documentation label Aug 16, 2024

RasmusWL force-pushed the threat-models branch 2 times, most recently from b145618 to 2117d1f August 16, 2024 11:42

RasmusWL added 4 commits August 19, 2024 10:54

ThreatModels: Expose knownThreatModel

766dcc4

Without this, it's impossible to write a test showing which threat-models are active by default... unless I provide a hardcoded list in the test itself, which is not any fun.

Python: Add test showing default active threat-models

617ab27

Python: Remove 'response' from default threat-models

8f7dec0

I didn't want to put the configuration file in `semmle/python/frameworks/**/*.model.yml`, so I created `ext/`, as in other languages.

RasmusWL force-pushed the threat-models branch 2 times, most recently from e1b2ae4 to 93b5060 August 19, 2024 12:51

RasmusWL mentioned this pull request Aug 19, 2024

JS: Add support for threat models #17256

Merged

RasmusWL marked this pull request as ready for review August 20, 2024 08:57

RasmusWL requested a review from a team as a code owner August 20, 2024 08:57

asgerf reviewed Aug 20, 2024

RasmusWL added 14 commits September 10, 2024 14:32

Python: Make queries use ActiveThreatModelSource

528f08f

Python: Add basic support for environment/commandargs threat-models

b9239d7

Python: Proper threat-model handling for argparse

e1801f3

Python: Model stdin thread-model

66f389a

Python: Model file threat-model

d245db5

Python: Fixup modeling of os.open

7483075

Python: Add basic support for database threat-model

8d8cd05

Python: Add e2e threat-model test

a0b24d6

Python: Add change-note

0ccb5b1

Docs: Update threat-model list to include Python

7d3793e

Python: Add threat-modeling of raw_input

333367c

Python: Additional threatModelSource annotations

cbebf7b

Python: Add links to threat-model docs

5ff7b65

RasmusWL mentioned this pull request Sep 10, 2024

Go/Java/C#: Rename ThreatModelFlowSource to ActiveThreatModelSource #17424

Merged

Docs: Include 'Threat models' for Python

e35c2b2

RasmusWL force-pushed the threat-models branch from 93b5060 to e35c2b2 September 10, 2024 14:45

Docs: Fix link

e11bfc2

tausbn previously approved these changes Sep 17, 2024

RasmusWL and others added 2 commits September 23, 2024 11:19

Merge branch 'main' into threat-models

4a21a85

Python: Minor simplification of ActiveThreatModelSource

535db98

Co-authored-by: Taus <[email protected]>

RasmusWL dismissed tausbn’s stale review via 535db98 September 23, 2024 09:22

tausbn previously approved these changes Sep 24, 2024

Merge branch 'main' into threat-models

431a1af

RasmusWL dismissed tausbn’s stale review via 431a1af September 26, 2024 09:44

RasmusWL requested a review from tausbn September 26, 2024 10:25

tausbn approved these changes Sep 26, 2024

RasmusWL merged commit 7c32efc into github:main Sep 26, 2024
37 of 38 checks passed

RasmusWL deleted the threat-models branch September 26, 2024 11:15

felickz mentioned this pull request Oct 24, 2024

End of life the local packs with new configurable threat-models setting in CodeQL GitHubSecurityLab/CodeQL-Community-Packs#69

Open

RasmusWL commented Aug 12, 2024

asgerf left a comment

asgerf Aug 20, 2024

RasmusWL Sep 10, 2024

asgerf Sep 11, 2024

RasmusWL Sep 16, 2024

tausbn left a comment

tausbn left a comment
