-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JS: Add support for threat models #17256
Conversation
* Extend this class to model new APIs. If you want to refine existing API models, | ||
* extend `ThreatModelSource` instead. | ||
*/ | ||
abstract class Range extends DataFlow::Node { |
Check failure
Code scanning / CodeQL
Bidirectional imports for abstract classes Error
RemoteFlowPassword
GitHubActionsContextSource
This abstract class doesn't import its subclass
RemoteServerResponse
GitHubActionsContextSource
This abstract class doesn't import its subclass
RemoteFlowSourceFromDBAccess
GitHubActionsContextSource
@@ -11,10 +11,9 @@ | |||
private module Cached { | |||
/** A data flow source of remote user input. */ | |||
cached | |||
abstract class RemoteFlowSource extends DataFlow::Node { | |||
/** Gets a human-readable string that describes the type of this remote flow source. */ | |||
abstract class RemoteFlowSource extends ThreatModelSource::Range { |
Check failure
Code scanning / CodeQL
Bidirectional imports for abstract classes Error
RemoteFlowPassword
GitHubActionsContextSource
This abstract class doesn't import its subclass
RemoteServerResponse
GitHubActionsContextSource
This abstract class doesn't import its subclass
RemoteFlowSourceFromDBAccess
GitHubActionsContextSource
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to JS look good to me so far 👍
Integration with RemoteFlowSource is not straightforward, so postponing that for later Naming in other languages: - `SourceNode` (for QL only modeling) - `ThreatModelFlowSource` (for active sources from QL or data-extensions) However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), it seems a bit confusing to follow the same naming convention as other languages, and instead I came up with new names.
I didn't want to put the configuration file in `semmle/javascript/frameworks/**/*.model.yml`, so created `ext/` as in other languages
7 cases looks something like this: ``` class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource { RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource } } ``` (some have variations like `not this.(ClientSideRemoteFlowSource).getKind().isPathOrUrl()`) javascript/ql/lib/semmle/javascript/security/dataflow/ClientSideUrlRedirectCustomizations.qll javascript/ql/lib/semmle/javascript/security/dataflow/CommandInjectionCustomizations.qll javascript/ql/lib/semmle/javascript/security/dataflow/CorsMisconfigurationForCredentialsCustomizations.qll javascript/ql/lib/semmle/javascript/security/dataflow/RegExpInjectionCustomizations.qll javascript/ql/lib/semmle/javascript/security/dataflow/RequestForgeryCustomizations.qll javascript/ql/lib/semmle/javascript/security/dataflow/ResourceExhaustionCustomizations.qll javascript/ql/lib/semmle/javascript/security/dataflow/TaintedPathCustomizations.qll
Such that we can reuse the existing modeling, but have it globally applied as a threat-model as well. I Basically just moved the modeling. One important aspect is that this changes is that the previously query-specific `argsParseStep` is now a globally applied taint-step. This seems reasonable, if someone applied the argument parsing to any user-controlled string, it seems correct to propagate that taint for _any_ query.
a465a1b
to
1726287
Compare
Mirroring what was done for Python
However, as indicated by the `MISSING` annotations, we could do better.
Makes a difference due to the modeling of NodeJSFileSystemAccessRead depending on these, see https://github.com/github/codeql/blob/412e841d6929c7a4cf6508a01e721db01df7ac49/javascript/ql/lib/semmle/javascript/frameworks/NodeJSLib.qll#L479-L488 File copied from https://github.com/github/codeql/blob/7cef4322e70e10c0c62e9ce933ca9f6db44b6ec1/javascript/externs/nodejs/fs.js
You could argue that proper modeling be done in the same way as `NodeJSFileSystemAccessRead` is done for the callback based `fs` API (in NodeJSLib.qll). However, that work is straying from the core goals I'm working towards right now, so I'll argue that "perfect is the enemy of good", and leave this as is for now.
Technically not always true, but my assumption is that +90% of the time that's what it will be used for, so while we could be more precise by adding a taint-step from the `input` part of the construction, I'm not sure it's worth it in this case. Furthermore, doing so would break with the current way we model threat-model sources, and how sources are generally modeled in JS... so for a very pretty setup it would require changing all the other `file` threat-model sources to start at the constructors such as `fs.createReadStream()` and have taint-propagation steps towards the actual use (like we do in Python)... I couldn't see an easy path forwards for doing this while keeping the Concepts integration, so I opted for the simpler solution here.
Have added proper docs integration, and have provided some modeling for database/stdin/file (although the latter two could be improved more). Ready for review now, will do DCA soon. |
Looks good so far, but will take another look tomorrow when I have more time.
class RemoteFlowSourceAsSource extends Source instanceof RemoteFlowSource {
RemoteFlowSourceAsSource() { not this instanceof ClientSideRemoteFlowSource }
} Just change |
javascript/ql/lib/semmle/javascript/frameworks/NodeJSLib.model.yml
Outdated
Show resolved
Hide resolved
Co-authored-by: Asger F <[email protected]>
🤯 so obvious after you've seen it 😂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This PR mostly follows the structure of the Python PR (#17203)
The most interesting aspect of this PR is that in f733ac1 I was only able to meaningfully add ActiveThreatModelSource as the default source for some queries, since some of them rely on this pattern:
and it's not quite clear how these should be migrated 🤔 This pattern is used in these 7 cases:
I'll also note that while I think the support for
environment
,commandargs
,database
threat-models are OK, the ones forstdin
andfile
are a little simple right now. (see 3 last commit messages for more context)