Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JS: Add support for threat models #17256

Merged
merged 22 commits into from
Nov 4, 2024
Merged

Conversation

RasmusWL
Copy link
Member

@RasmusWL RasmusWL commented Aug 19, 2024

This PR mostly follows the structure of the Python PR (#17203)

The most interesting aspect of this PR is that in f733ac1 I was only able to meaningfully add ActiveThreatModelSource as the default source for some queries, since some of them rely on this pattern:

class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource {
  RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
}

and it's not quite clear how these should be migrated 🤔 This pattern is used in these 7 cases:

javascript/ql/lib/semmle/javascript/security/dataflow/ClientSideUrlRedirectCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CommandInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CorsMisconfigurationForCredentialsCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RegExpInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RequestForgeryCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/ResourceExhaustionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/TaintedPathCustomizations.qll

I'll also note that while I think the support for environment, commandargs, database threat-models are OK, the ones for stdin and file are a little simple right now. (see 3 last commit messages for more context)

* Extend this class to model new APIs. If you want to refine existing API models,
* extend `ThreatModelSource` instead.
*/
abstract class Range extends DataFlow::Node {

Check failure

Code scanning / CodeQL

Bidirectional imports for abstract classes Error

This abstract class doesn't import its subclass
RemoteFlowPassword
but imports 56 other subclasses, such as
GitHubActionsContextSource
.
This abstract class doesn't import its subclass
RemoteServerResponse
but imports 56 other subclasses, such as
GitHubActionsContextSource
.
This abstract class doesn't import its subclass
RemoteFlowSourceFromDBAccess
but imports 56 other subclasses, such as
GitHubActionsContextSource
.
@@ -11,10 +11,9 @@
private module Cached {
/** A data flow source of remote user input. */
cached
abstract class RemoteFlowSource extends DataFlow::Node {
/** Gets a human-readable string that describes the type of this remote flow source. */
abstract class RemoteFlowSource extends ThreatModelSource::Range {

Check failure

Code scanning / CodeQL

Bidirectional imports for abstract classes Error

This abstract class doesn't import its subclass
RemoteFlowPassword
but imports 50 other subclasses, such as
GitHubActionsContextSource
.
This abstract class doesn't import its subclass
RemoteServerResponse
but imports 50 other subclasses, such as
GitHubActionsContextSource
.
This abstract class doesn't import its subclass
RemoteFlowSourceFromDBAccess
but imports 50 other subclasses, such as
GitHubActionsContextSource
.
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to JS look good to me so far 👍

Integration with RemoteFlowSource is not straightforward, so postponing
that for later

Naming in other languages:
- `SourceNode` (for QL only modeling)
- `ThreatModelFlowSource` (for active sources from QL or data-extensions)

However, since we use `LocalSourceNode` in Python, and `SourceNode` in
JS (for local source nodes), it seems a bit confusing to follow the same
naming convention as other languages, and instead I came up with new names.
I didn't want to put the configuration file in
`semmle/javascript/frameworks/**/*.model.yml`, so created `ext/` as in other
languages
7 cases looks something like this:

```
class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource {
  RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
}
```

(some have variations like `not this.(ClientSideRemoteFlowSource).getKind().isPathOrUrl()`)

javascript/ql/lib/semmle/javascript/security/dataflow/ClientSideUrlRedirectCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CommandInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/CorsMisconfigurationForCredentialsCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RegExpInjectionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/RequestForgeryCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/ResourceExhaustionCustomizations.qll
javascript/ql/lib/semmle/javascript/security/dataflow/TaintedPathCustomizations.qll
Such that we can reuse the existing modeling, but have it globally
applied as a threat-model as well.

I Basically just moved the modeling. One important aspect is that this
changes is that the previously query-specific `argsParseStep` is now a
globally applied taint-step. This seems reasonable, if someone applied
the argument parsing to any user-controlled string, it seems correct to
propagate that taint for _any_ query.
Mirroring what was done for Python
However, as indicated by the `MISSING` annotations, we could do better.
You could argue that proper modeling be done in the same way as
`NodeJSFileSystemAccessRead` is done for the callback based `fs` API (in
NodeJSLib.qll). However, that work is straying from the core goals I'm
working towards right now, so I'll argue that "perfect is the enemy of
good", and leave this as is for now.
Technically not always true, but my assumption is that +90% of the time
that's what it will be used for, so while we could be more precise by
adding a taint-step from the `input` part of the construction, I'm not
sure it's worth it in this case.

Furthermore, doing so would break with the current way we model
threat-model sources, and how sources are generally modeled in JS... so
for a very pretty setup it would require changing all the other `file`
threat-model sources to start at the constructors such as
`fs.createReadStream()` and have taint-propagation steps towards the
actual use (like we do in Python)...

I couldn't see an easy path forwards for doing this while keeping the
Concepts integration, so I opted for the simpler solution here.
@RasmusWL
Copy link
Member Author

Have added proper docs integration, and have provided some modeling for database/stdin/file (although the latter two could be improved more). Ready for review now, will do DCA soon.

@RasmusWL RasmusWL marked this pull request as ready for review October 31, 2024 13:38
@RasmusWL RasmusWL requested a review from a team as a code owner October 31, 2024 13:38
@asgerf
Copy link
Contributor

asgerf commented Oct 31, 2024

Looks good so far, but will take another look tomorrow when I have more time.

The most interesting aspect of this PR is that in f733ac1 I was only able to meaningfully add ActiveThreatModelSource as the default source for some queries, since some of them rely on this pattern:

class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource {
  RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
}

Just change RemoteFlowSource to ActiveThreatModelSource while keeping the not in the body.

@RasmusWL
Copy link
Member Author

RasmusWL commented Nov 1, 2024

Just change RemoteFlowSource to ActiveThreatModelSource while keeping the not in the body.

🤯 so obvious after you've seen it 😂

@RasmusWL RasmusWL requested a review from asgerf November 1, 2024 10:02
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@RasmusWL RasmusWL merged commit 8f80c24 into github:main Nov 4, 2024
14 of 15 checks passed
@RasmusWL RasmusWL deleted the js-threat-models branch November 4, 2024 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants