-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: Add support for threat models #17203
Conversation
1e35fd4
to
8747fa4
Compare
b145618
to
2117d1f
Compare
Naming in other languages: - `SourceNode` (for QL only modeling) - `ThreatModelFlowSource` (for active sources from QL or data-extensions) However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), it seems a bit confusing to follow the same naming convention as other languages, and instead I came up with new names.
Without, it's impossible to write test showing what threat-models are active by default... unless I provide a hardcoded list in the test itself, which is not any fun.
I didn't want to put the configuration file in `semmle/python/frameworks/**/*.model.yml`, so created `ext/` as in other languages
e1b2ae4
to
93b5060
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I mainly a comment about the class names otherwise the structure LGTM
* Extend this class to refine existing API models. If you want to model new APIs, | ||
* extend `ThreatModelSource::Range` instead. | ||
*/ | ||
class ThreatModelSource extends DataFlow::Node instanceof ThreatModelSource::Range { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might I suggest the name FlowSource
for this class? It seems consistent with C++ and Swift at least, and it works nicely with RemoteFlowSource
being a special case of it.
Then we could rename ActiveThreatModelSource
to ThreatModelFlowSource
to be consistent with other languages. I do agree that the "active" prefix makes sense, but given that this will be the new go-to thing to use in isSource()
predicates it seems that we really do want consistency for that class name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've persuaded @michaelnebel
to be in favor of ActiveThreatModelSource
, so I've just filed a PR to make the existing languages use that as well 👍 #17424
FlowSource
Regarding renaming to FlowSource
, I've tried to do that here: RasmusWL@dec5daa
I'm slightly hesitant to accept it, I can't quite put my finger on it, but I think it's because it's so generic that name could be used to capture the set of sources for any data-flow/taint-tracking configuration, no matter the logic, whereas ThreatModelSource
seem to convey a more specific meaning to me.
I realize that your suggestion probably fits pretty well with current naming in C#/Java/Go with SourceNode
and C++/Swift with FlowSource
🤔 Maybe I'll see if anyone comes with a convincing argument during next round of review, otherwise it looks like I should just disagree-and-commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever we do we ought to treat the naming of the two classes as a single decision; not something where we try to make a decision for each class in isolation.
The proposed rename to FlowSource
made sense when the other class would be called ThreatModelFlowSource
; but it doesn't work so so nicely with ActiveThreatModelSource
. I'd prefer the combination ThreatModelSource/ActiveThreatModelSource
over FlowSource/ActiveThreatModelSource
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I'm feeling strongly in favor of ActiveThreatModelSource
, so it seems like I won't add the FlowSource
commit 👍
python/ql/lib/semmle/python/security/dataflow/CodeInjectionCustomizations.qll
Outdated
Show resolved
Hide resolved
Since using `.DictionaryElementAny` doesn't actually do a store on the source, (so we can later follow any dict read-steps). I added the ensure_tainted steps to highlight that the result of the WHOLE expression ends up "tainted", and that we don't just mark `os.environ` as the source without further flow.
As part of adding support for threat-models to Python/JS (see github#17203), we ran into some trouble with name clashes. Naming in existing languages supporting threat-models: - `SourceNode` (for QL only modeling) - `ThreatModelFlowSource` (for active sources from QL or data-extensions) However, since we use `LocalSourceNode` in Python, and `SourceNode` in JS (for local source nodes), it seems a bit confusing to follow the same naming convention as other languages, and we had to come up with new names. Initially I used `ThreatModelSource` for the "QL only modeling", but that meant that we needed a new name to represent the active sources coming from either QL or data-extensions... for this I came up with `ActiveThreatModelSource`, and I really liked it. To me, it's much clearer that this class only contains the currently active threat model sources. So to align languages, I got approval from @michaelnebel to rename the existing classes.
93b5060
to
e35c2b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor comment, otherwise I think this looks really good! (I'm approving it now in case the minor comment is irrelevant.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No description provided.