JS: Add support for threat models #17256

RasmusWL · 2024-08-19T13:02:14Z

This PR mostly follows the structure of the Python PR (#17203)

The most interesting aspect of this PR is that in f733ac1 I was only able to meaningfully add ActiveThreatModelSource as the default source for some queries, since some of them rely on this pattern:

class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource {
  RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
}

and it's not quite clear how these should be migrated 🤔 This pattern is used in these 7 cases:

javascript/ql/lib/semmle/javascript/security/dataflow/ClientSideUrlRedirectCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CommandInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CorsMisconfigurationForCredentialsCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RegExpInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RequestForgeryCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/ResourceExhaustionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/TaintedPathCustomizations.qll

I'll also note that while I think the support for the environment, commandargs, and database threat-models is OK, the ones for stdin and file are a little simple right now (see the last 3 commit messages for more context).

javascript/ql/lib/semmle/javascript/Concepts.qll

+   * Extend this class to model new APIs. If you want to refine existing API models,
+   * extend `ThreatModelSource` instead.
+   */
+  abstract class Range extends DataFlow::Node {
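
For illustration, a minimal sketch of what extending this `Range` class to model a new API might look like. The npm package `my-config-lib` and its `readSetting` function are made up, and the two overridden member predicates (`getThreatModel` and `getSourceType`) are assumed to mirror the Python PR rather than confirmed by the snippet above:

```ql
import javascript

/**
 * A hypothetical threat-model source: calls to `readSetting` from the
 * made-up npm package `my-config-lib`, treated as environment input.
 */
class MyConfigLibSource extends ThreatModelSource::Range {
  MyConfigLibSource() {
    // `DataFlow::moduleMember` finds references to a member of an imported module.
    this = DataFlow::moduleMember("my-config-lib", "readSetting").getACall()
  }

  // Assumed member predicate: the threat-model category this source belongs to.
  override string getThreatModel() { result = "environment" }

  // Assumed member predicate: a human-readable description used in alert messages.
  override string getSourceType() { result = "my-config-lib setting" }
}
```

A source modeled this way would only be picked up by queries when its threat model (here `environment`) is enabled in the threat-model configuration.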


javascript/ql/lib/semmle/javascript/security/dataflow/RemoteFlowSources.qll

@@ -11,10 +11,9 @@
 private module Cached {
  /** A data flow source of remote user input. */
  cached
-  abstract class RemoteFlowSource extends DataFlow::Node {
-    /** Gets a human-readable string that describes the type of this remote flow source. */
+  abstract class RemoteFlowSource extends ThreatModelSource::Range {


asgerf

The changes to JS look good to me so far 👍

javascript/ql/lib/ext/default-threat-models-fixup.model.yml

Integration with RemoteFlowSource is not straightforward, so postponing that for later.

Naming in other languages:
- `SourceNode` (for QL only modeling)
- `ThreatModelFlowSource` (for active sources from QL or data-extensions)

However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), it seems a bit confusing to follow the same naming convention as other languages, and instead I came up with new names.

I didn't want to put the configuration file in `semmle/javascript/frameworks/**/*.model.yml`, so created `ext/` as in other languages

7 cases look something like this:

```
class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource {
  RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
}
```

(some have variations like `not this.(ClientSideRemoteFlowSource).getKind().isPathOrUrl()`)

javascript/ql/lib/semmle/javascript/security/dataflow/ClientSideUrlRedirectCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CommandInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CorsMisconfigurationForCredentialsCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RegExpInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RequestForgeryCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/ResourceExhaustionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/TaintedPathCustomizations.qll

So that we can reuse the existing modeling, but have it globally applied as a threat-model as well. I basically just moved the modeling. One important aspect of this change is that the previously query-specific `argsParseStep` is now a globally applied taint-step. This seems reasonable: if someone applies the argument parsing to any user-controlled string, it seems correct to propagate that taint for _any_ query.
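
A minimal sketch of what globally applying such a step could look like, assuming the JS library's `TaintTracking::SharedTaintStep` extension point. The `argsParseStep` body below is a hypothetical stand-in (a call to the `minimist` package), not the predicate from this PR:

```ql
import javascript

/**
 * Hypothetical stand-in for the real `argsParseStep`: holds if `succ` is the
 * result of parsing `pred` with the `minimist` argument parser.
 */
private predicate argsParseStep(DataFlow::Node pred, DataFlow::Node succ) {
  exists(DataFlow::CallNode call |
    call = DataFlow::moduleImport("minimist").getACall() and
    pred = call.getArgument(0) and
    succ = call
  )
}

/**
 * Registers the step as a shared taint step, so every taint-tracking
 * query sees it rather than only the one query that used to define it.
 */
private class ArgsParseTaintStep extends TaintTracking::SharedTaintStep {
  override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
    argsParseStep(pred, succ)
  }
}
```

With the step registered this way, a tainted string passed to the parser taints the parsed result regardless of which threat model produced the taint.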

Mirroring what was done for Python

Capturing:
- github@7d3793e
- github@e35c2b2
- github@e11bfc2

However, as indicated by the `MISSING` annotations, we could do better.

Makes a difference due to the modeling of NodeJSFileSystemAccessRead depending on these, see https://github.com/github/codeql/blob/412e841d6929c7a4cf6508a01e721db01df7ac49/javascript/ql/lib/semmle/javascript/frameworks/NodeJSLib.qll#L479-L488

File copied from https://github.com/github/codeql/blob/7cef4322e70e10c0c62e9ce933ca9f6db44b6ec1/javascript/externs/nodejs/fs.js

You could argue that proper modeling should be done in the same way as `NodeJSFileSystemAccessRead` is done for the callback-based `fs` API (in NodeJSLib.qll). However, that work is straying from the core goals I'm working towards right now, so I'll argue that "perfect is the enemy of good", and leave this as is for now.

Technically not always true, but my assumption is that 90%+ of the time that's what it will be used for, so while we could be more precise by adding a taint-step from the `input` part of the construction, I'm not sure it's worth it in this case. Furthermore, doing so would break with the current way we model threat-model sources, and how sources are generally modeled in JS. So for a very pretty setup it would require changing all the other `file` threat-model sources to start at the constructors such as `fs.createReadStream()` and have taint-propagation steps towards the actual use (like we do in Python). I couldn't see an easy path forward for doing this while keeping the Concepts integration, so I opted for the simpler solution here.

RasmusWL · 2024-10-31T13:38:52Z

Have added proper docs integration, and have provided some modeling for database/stdin/file (although the latter two could be improved more). Ready for review now, will do DCA soon.

asgerf · 2024-10-31T14:17:33Z

Looks good so far, but will take another look tomorrow when I have more time.

The most interesting aspect of this PR is that in f733ac1 I was only able to meaningfully add ActiveThreatModelSource as the default source for some queries, since some of them rely on this pattern:

class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource {
  RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
}

Just change RemoteFlowSource to ActiveThreatModelSource while keeping the not in the body.
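
Spelled out, the suggestion might look like the sketch below. The abstract `Source` class is a stand-in for the query-specific one defined in each `*Customizations.qll` file, and the class name follows the later commit title "JS: Convert remaining queries to use ActiveThreatModelSourceAsSource"; the actual merged code may differ in details:

```ql
import javascript

/** A stand-in for the query-specific abstract `Source` class. */
abstract class Source extends DataFlow::Node { }

/**
 * An active threat-model source, considered as a flow source,
 * still excluding client-side remote flow sources as before.
 */
class ActiveThreatModelSourceAsSource extends Source instanceof ActiveThreatModelSource {
  ActiveThreatModelSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
}
```

Since the diff in this PR makes `RemoteFlowSource` extend `ThreatModelSource::Range`, the `not this instanceof ClientSideRemoteFlowSource` check keeps filtering the same nodes after the switch.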

javascript/ql/lib/semmle/javascript/frameworks/NodeJSLib.model.yml

Co-authored-by: Asger F <[email protected]>

RasmusWL · 2024-11-01T09:24:49Z

Just change RemoteFlowSource to ActiveThreatModelSource while keeping the not in the body.

🤯 so obvious after you've seen it 😂

asgerf

👍

github-actions bot added JS documentation Python labels Aug 19, 2024

github-advanced-security bot found potential problems Aug 19, 2024

asgerf reviewed Aug 20, 2024

javascript/ql/lib/ext/default-threat-models-fixup.model.yml (outdated)

RasmusWL added 9 commits October 25, 2024 14:50

JS: Add test showing default active threat-models

05dce8a

JS: Remove 'response' from default threat-models

dbfbd2c

I didn't want to put the configuration file in `semmle/javascript/frameworks/**/*.model.yml`, so created `ext/` as in other languages

JS: Integrate RemoteFlowSource with ThreatModelSource

4b1c027

JS: Add environment threat-model source

412e841

JS: Model newer yargs command-line parsing pattern

d3ae4c9

JS: Add e2e threat-model test

1726287

RasmusWL force-pushed the js-threat-models branch from a465a1b to 1726287 Compare October 25, 2024 13:04

github-actions bot removed documentation Python labels Oct 25, 2024

RasmusWL added 10 commits October 29, 2024 11:29

JS: Minor improvements to threat-model Concepts

84f6b89

Mirroring what was done for Python

Docs: Threat-models supported in JS

07bc1fe

Capturing:
- github@7d3793e
- github@e35c2b2
- github@e11bfc2

JS: Add change-note

7c7420a

JS: Add database threat-model source modeling

3656864

JS: Add initial file threat-model support

2b6c27e

However, as indicated by the `MISSING` annotations, we could do better.

JS: Add tests for stdin threat-model sources

b47fa77

JS: Do simple modeling of process.stdin as threat-model source

eca8bf5

github-actions bot added the documentation label Oct 31, 2024

RasmusWL marked this pull request as ready for review October 31, 2024 13:38

RasmusWL requested a review from a team as a code owner October 31, 2024 13:38

asgerf reviewed Oct 31, 2024

javascript/ql/lib/semmle/javascript/frameworks/NodeJSLib.model.yml (outdated)

JS: Remove dummy comment

19fae76

Co-authored-by: Asger F <[email protected]>

RasmusWL added 2 commits November 1, 2024 10:47

JS: Convert remaining queries to use ActiveThreatModelSourceAsSource

dc8e645

Merge branch 'main' into js-threat-models

c0ad9ba

RasmusWL requested a review from asgerf November 1, 2024 10:02

asgerf approved these changes Nov 4, 2024

RasmusWL merged commit 8f80c24 into github:main Nov 4, 2024
14 of 15 checks passed

RasmusWL deleted the js-threat-models branch November 4, 2024 13:04
