Skip to content

Commit

Permalink
Merge pull request #15802 from hvitved/dataflow/variable-capture-over…
Browse files Browse the repository at this point in the history
…lapping-paths

Variable capture: Avoid overlapping and false-positive data flow paths
  • Loading branch information
hvitved authored Mar 18, 2024
2 parents daee22d + 24e35f6 commit a13391b
Show file tree
Hide file tree
Showing 6 changed files with 89 additions and 214 deletions.
86 changes: 0 additions & 86 deletions csharp/ql/test/library-tests/dataflow/global/DataFlowPath.expected

Large diffs are not rendered by default.

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ def by_value1():
a = SOURCE
def inner(a_val=a):
SINK(a_val) #$ captured
SINK_F(a) #$ SPURIOUS: captured
SINK_F(a)
a = NONSOURCE
inner()

Expand Down
35 changes: 0 additions & 35 deletions ruby/ql/test/library-tests/dataflow/global/Flow.expected
Original file line number Diff line number Diff line change
@@ -1,10 +1,7 @@
testFailures
edges
| blocks.rb:14:12:14:20 | call to source | blocks.rb:8:10:8:14 | yield ... | provenance | |
| captured_variables.rb:9:24:9:24 | x | captured_variables.rb:10:10:10:23 | -> { ... } [captured x] | provenance | |
| captured_variables.rb:9:24:9:24 | x | captured_variables.rb:11:5:11:6 | fn [captured x] | provenance | |
| captured_variables.rb:10:5:10:6 | fn [captured x] | captured_variables.rb:11:5:11:6 | fn [captured x] | provenance | |
| captured_variables.rb:10:10:10:23 | -> { ... } [captured x] | captured_variables.rb:10:5:10:6 | fn [captured x] | provenance | |
| captured_variables.rb:11:5:11:6 | fn [captured x] | captured_variables.rb:10:20:10:20 | x | provenance | |
| captured_variables.rb:13:20:13:29 | call to taint | captured_variables.rb:9:24:9:24 | x | provenance | |
| captured_variables.rb:15:28:15:28 | x | captured_variables.rb:16:5:18:5 | -> { ... } [captured x] | provenance | |
Expand All @@ -16,18 +13,12 @@ edges
| captured_variables.rb:27:25:27:57 | call to capture_escape_return2 [captured x] | captured_variables.rb:24:14:24:14 | x | provenance | |
| captured_variables.rb:27:48:27:57 | call to taint | captured_variables.rb:22:28:22:28 | x | provenance | |
| captured_variables.rb:27:48:27:57 | call to taint | captured_variables.rb:27:25:27:57 | call to capture_escape_return2 [captured x] | provenance | |
| captured_variables.rb:29:33:29:33 | x | captured_variables.rb:30:10:32:5 | -> { ... } [captured x] | provenance | |
| captured_variables.rb:29:33:29:33 | x | captured_variables.rb:33:29:33:30 | fn [captured x] | provenance | |
| captured_variables.rb:30:5:30:6 | fn [captured x] | captured_variables.rb:33:29:33:30 | fn [captured x] | provenance | |
| captured_variables.rb:30:10:32:5 | -> { ... } [captured x] | captured_variables.rb:30:5:30:6 | fn [captured x] | provenance | |
| captured_variables.rb:33:29:33:30 | fn [captured x] | captured_variables.rb:31:14:31:14 | x | provenance | |
| captured_variables.rb:35:29:35:38 | call to taint | captured_variables.rb:29:33:29:33 | x | provenance | |
| captured_variables.rb:37:13:37:14 | fn [captured x] | captured_variables.rb:38:5:38:6 | fn [captured x] | provenance | |
| captured_variables.rb:38:5:38:6 | fn [captured x] | captured_variables.rb:42:14:42:14 | x | provenance | |
| captured_variables.rb:40:31:40:31 | x | captured_variables.rb:41:10:43:5 | -> { ... } [captured x] | provenance | |
| captured_variables.rb:40:31:40:31 | x | captured_variables.rb:44:13:44:14 | fn [captured x] | provenance | |
| captured_variables.rb:41:5:41:6 | fn [captured x] | captured_variables.rb:44:13:44:14 | fn [captured x] | provenance | |
| captured_variables.rb:41:10:43:5 | -> { ... } [captured x] | captured_variables.rb:41:5:41:6 | fn [captured x] | provenance | |
| captured_variables.rb:44:13:44:14 | fn [captured x] | captured_variables.rb:37:13:37:14 | fn [captured x] | provenance | |
| captured_variables.rb:46:27:46:36 | call to taint | captured_variables.rb:40:31:40:31 | x | provenance | |
| captured_variables.rb:48:5:48:12 | call to taint | captured_variables.rb:49:16:52:3 | do ... end [captured x] | provenance | |
Expand Down Expand Up @@ -65,11 +56,8 @@ edges
| captured_variables.rb:83:6:83:8 | foo [@field] | captured_variables.rb:60:5:62:7 | self in get_field [@field] | provenance | |
| captured_variables.rb:83:6:83:8 | foo [@field] | captured_variables.rb:83:6:83:18 | call to get_field | provenance | |
| captured_variables.rb:83:6:83:8 | foo [@field] | instance_variables.rb:13:5:15:7 | self in get_field [@field] | provenance | |
| captured_variables.rb:85:5:85:12 | call to taint | captured_variables.rb:86:6:89:1 | -> { ... } [captured y] | provenance | |
| captured_variables.rb:85:5:85:12 | call to taint | captured_variables.rb:90:1:90:2 | fn [captured y] | provenance | |
| captured_variables.rb:85:5:85:12 | call to taint | captured_variables.rb:91:6:91:6 | y | provenance | |
| captured_variables.rb:86:1:86:2 | fn [captured y] | captured_variables.rb:90:1:90:2 | fn [captured y] | provenance | |
| captured_variables.rb:86:6:89:1 | -> { ... } [captured y] | captured_variables.rb:86:1:86:2 | fn [captured y] | provenance | |
| captured_variables.rb:88:9:88:16 | call to taint | captured_variables.rb:90:1:90:2 | [post] fn [captured y] | provenance | |
| captured_variables.rb:90:1:90:2 | [post] fn [captured y] | captured_variables.rb:91:6:91:6 | y | provenance | |
| captured_variables.rb:90:1:90:2 | fn [captured y] | captured_variables.rb:87:10:87:10 | y | provenance | |
Expand All @@ -81,18 +69,12 @@ edges
| captured_variables.rb:101:11:101:11 | x | captured_variables.rb:104:31:104:31 | x | provenance | |
| captured_variables.rb:104:17:104:24 | call to taint | captured_variables.rb:100:21:100:21 | x | provenance | |
| captured_variables.rb:104:31:104:31 | x | captured_variables.rb:105:10:105:10 | x | provenance | |
| captured_variables.rb:109:9:109:17 | call to taint | captured_variables.rb:110:14:116:5 | -> { ... } [captured x] | provenance | |
| captured_variables.rb:109:9:109:17 | call to taint | captured_variables.rb:117:5:117:10 | middle [captured x] | provenance | |
| captured_variables.rb:109:9:109:17 | call to taint | captured_variables.rb:118:10:118:10 | x | provenance | |
| captured_variables.rb:110:5:110:10 | middle [captured x] | captured_variables.rb:117:5:117:10 | middle [captured x] | provenance | |
| captured_variables.rb:110:14:116:5 | -> { ... } [captured x] | captured_variables.rb:110:5:110:10 | middle [captured x] | provenance | |
| captured_variables.rb:111:9:111:13 | inner [captured x] | captured_variables.rb:115:9:115:13 | inner [captured x] | provenance | |
| captured_variables.rb:111:17:114:9 | -> { ... } [captured x] | captured_variables.rb:111:9:111:13 | inner [captured x] | provenance | |
| captured_variables.rb:113:17:113:25 | call to taint | captured_variables.rb:115:9:115:13 | [post] inner [captured x] | provenance | |
| captured_variables.rb:115:9:115:13 | [post] inner [captured x] | captured_variables.rb:117:5:117:10 | [post] middle [captured x] | provenance | |
| captured_variables.rb:115:9:115:13 | inner [captured x] | captured_variables.rb:112:18:112:18 | x | provenance | |
| captured_variables.rb:117:5:117:10 | [post] middle [captured x] | captured_variables.rb:118:10:118:10 | x | provenance | |
| captured_variables.rb:117:5:117:10 | middle [captured x] | captured_variables.rb:111:17:114:9 | -> { ... } [captured x] | provenance | |
| captured_variables.rb:117:5:117:10 | middle [captured x] | captured_variables.rb:115:9:115:13 | inner [captured x] | provenance | |
| captured_variables.rb:147:5:147:6 | [post] self [@x] | captured_variables.rb:153:14:155:7 | do ... end [captured self, @x] | provenance | |
| captured_variables.rb:147:10:147:18 | call to taint | captured_variables.rb:147:5:147:6 | [post] self [@x] | provenance | |
Expand All @@ -116,10 +98,7 @@ edges
| captured_variables.rb:194:1:194:1 | c [@x] | captured_variables.rb:185:5:189:7 | self in baz [@x] | provenance | |
| captured_variables.rb:197:9:197:17 | call to taint | captured_variables.rb:199:10:199:10 | x | provenance | |
| captured_variables.rb:206:13:206:21 | call to taint | captured_variables.rb:208:14:208:14 | x | provenance | |
| captured_variables.rb:219:9:219:17 | call to taint | captured_variables.rb:222:11:224:5 | -> { ... } [captured x] | provenance | |
| captured_variables.rb:219:9:219:17 | call to taint | captured_variables.rb:226:5:226:7 | fn1 [captured x] | provenance | |
| captured_variables.rb:222:5:222:7 | fn1 [captured x] | captured_variables.rb:226:5:226:7 | fn1 [captured x] | provenance | |
| captured_variables.rb:222:11:224:5 | -> { ... } [captured x] | captured_variables.rb:222:5:222:7 | fn1 [captured x] | provenance | |
| captured_variables.rb:226:5:226:7 | [post] fn1 [captured y] | captured_variables.rb:227:10:227:10 | y | provenance | |
| captured_variables.rb:226:5:226:7 | fn1 [captured x] | captured_variables.rb:226:5:226:7 | [post] fn1 [captured y] | provenance | |
| instance_variables.rb:10:19:10:19 | x | instance_variables.rb:11:18:11:18 | x | provenance | |
Expand Down Expand Up @@ -264,8 +243,6 @@ nodes
| blocks.rb:8:10:8:14 | yield ... | semmle.label | yield ... |
| blocks.rb:14:12:14:20 | call to source | semmle.label | call to source |
| captured_variables.rb:9:24:9:24 | x | semmle.label | x |
| captured_variables.rb:10:5:10:6 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:10:10:10:23 | -> { ... } [captured x] | semmle.label | -> { ... } [captured x] |
| captured_variables.rb:10:20:10:20 | x | semmle.label | x |
| captured_variables.rb:11:5:11:6 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:13:20:13:29 | call to taint | semmle.label | call to taint |
Expand All @@ -281,16 +258,12 @@ nodes
| captured_variables.rb:27:25:27:57 | call to capture_escape_return2 [captured x] | semmle.label | call to capture_escape_return2 [captured x] |
| captured_variables.rb:27:48:27:57 | call to taint | semmle.label | call to taint |
| captured_variables.rb:29:33:29:33 | x | semmle.label | x |
| captured_variables.rb:30:5:30:6 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:30:10:32:5 | -> { ... } [captured x] | semmle.label | -> { ... } [captured x] |
| captured_variables.rb:31:14:31:14 | x | semmle.label | x |
| captured_variables.rb:33:29:33:30 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:35:29:35:38 | call to taint | semmle.label | call to taint |
| captured_variables.rb:37:13:37:14 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:38:5:38:6 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:40:31:40:31 | x | semmle.label | x |
| captured_variables.rb:41:5:41:6 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:41:10:43:5 | -> { ... } [captured x] | semmle.label | -> { ... } [captured x] |
| captured_variables.rb:42:14:42:14 | x | semmle.label | x |
| captured_variables.rb:44:13:44:14 | fn [captured x] | semmle.label | fn [captured x] |
| captured_variables.rb:46:27:46:36 | call to taint | semmle.label | call to taint |
Expand Down Expand Up @@ -323,8 +296,6 @@ nodes
| captured_variables.rb:83:6:83:8 | foo [@field] | semmle.label | foo [@field] |
| captured_variables.rb:83:6:83:18 | call to get_field | semmle.label | call to get_field |
| captured_variables.rb:85:5:85:12 | call to taint | semmle.label | call to taint |
| captured_variables.rb:86:1:86:2 | fn [captured y] | semmle.label | fn [captured y] |
| captured_variables.rb:86:6:89:1 | -> { ... } [captured y] | semmle.label | -> { ... } [captured y] |
| captured_variables.rb:87:10:87:10 | y | semmle.label | y |
| captured_variables.rb:88:9:88:16 | call to taint | semmle.label | call to taint |
| captured_variables.rb:90:1:90:2 | [post] fn [captured y] | semmle.label | [post] fn [captured y] |
Expand All @@ -341,10 +312,6 @@ nodes
| captured_variables.rb:104:31:104:31 | x | semmle.label | x |
| captured_variables.rb:105:10:105:10 | x | semmle.label | x |
| captured_variables.rb:109:9:109:17 | call to taint | semmle.label | call to taint |
| captured_variables.rb:110:5:110:10 | middle [captured x] | semmle.label | middle [captured x] |
| captured_variables.rb:110:14:116:5 | -> { ... } [captured x] | semmle.label | -> { ... } [captured x] |
| captured_variables.rb:111:9:111:13 | inner [captured x] | semmle.label | inner [captured x] |
| captured_variables.rb:111:17:114:9 | -> { ... } [captured x] | semmle.label | -> { ... } [captured x] |
| captured_variables.rb:112:18:112:18 | x | semmle.label | x |
| captured_variables.rb:113:17:113:25 | call to taint | semmle.label | call to taint |
| captured_variables.rb:115:9:115:13 | [post] inner [captured x] | semmle.label | [post] inner [captured x] |
Expand Down Expand Up @@ -380,8 +347,6 @@ nodes
| captured_variables.rb:206:13:206:21 | call to taint | semmle.label | call to taint |
| captured_variables.rb:208:14:208:14 | x | semmle.label | x |
| captured_variables.rb:219:9:219:17 | call to taint | semmle.label | call to taint |
| captured_variables.rb:222:5:222:7 | fn1 [captured x] | semmle.label | fn1 [captured x] |
| captured_variables.rb:222:11:224:5 | -> { ... } [captured x] | semmle.label | -> { ... } [captured x] |
| captured_variables.rb:226:5:226:7 | [post] fn1 [captured y] | semmle.label | [post] fn1 [captured y] |
| captured_variables.rb:226:5:226:7 | fn1 [captured x] | semmle.label | fn1 [captured x] |
| captured_variables.rb:227:10:227:10 | y | semmle.label | y |
Expand Down
14 changes: 14 additions & 0 deletions ruby/ql/test/library-tests/dataflow/global/captured_variables.rb
Original file line number Diff line number Diff line change
Expand Up @@ -228,3 +228,17 @@ def multi_capture
end

multi_capture

def m1
x = taint(19)

fn1 = -> {
sink x
}

x = nil

fn1.call()
end

m1
80 changes: 74 additions & 6 deletions shared/dataflow/codeql/dataflow/VariableCapture.qll
Original file line number Diff line number Diff line change
Expand Up @@ -601,16 +601,22 @@ module Flow<LocationSig Location, InputSig<Location> Input> implements OutputSig
* observed in a similarly synthesized post-update node for this read of `v`.
*/
private predicate synthRead(
CapturedVariable v, BasicBlock bb, int i, boolean topScope, Expr closure
CapturedVariable v, BasicBlock bb, int i, boolean topScope, Expr closure, boolean alias
) {
exists(ClosureExpr ce | closureCaptures(ce, v) |
ce.hasCfgNode(bb, i) and ce = closure
ce.hasCfgNode(bb, i) and ce = closure and alias = false
or
localOrNestedClosureAccess(ce, closure, bb, i)
localOrNestedClosureAccess(ce, closure, bb, i) and alias = true
) and
if v.getCallable() != bb.getEnclosingCallable() then topScope = false else topScope = true
}

private predicate synthRead(
CapturedVariable v, BasicBlock bb, int i, boolean topScope, Expr closure
) {
synthRead(v, bb, i, topScope, closure, _)
}

/**
* Holds if there is an access of a captured variable inside a closure in the
* `i`th node of `bb`, such that we need to synthesize a `this.` qualifier.
Expand Down Expand Up @@ -919,16 +925,22 @@ module Flow<LocationSig Location, InputSig<Location> Input> implements OutputSig
)
}

predicate storeStep(ClosureNode node1, CapturedVariable v, ClosureNode node2) {
// store v in the closure or in the malloc in case of a relevant constructor call
private predicate storeStepClosure(
ClosureNode node1, CapturedVariable v, ClosureNode node2, boolean alias
) {
exists(BasicBlock bb, int i, Expr closure |
synthRead(v, bb, i, _, closure) and
synthRead(v, bb, i, _, closure, alias) and
node1 = TSynthRead(v, bb, i, false)
|
node2 = TExprNode(closure, false)
or
node2 = TMallocNode(closure) and hasConstructorCapture(closure, v)
)
}

predicate storeStep(ClosureNode node1, CapturedVariable v, ClosureNode node2) {
// store v in the closure or in the malloc in case of a relevant constructor call
storeStepClosure(node1, v, node2, _)
or
// write to v inside the closure body
exists(BasicBlock bb, int i, VariableWrite vw |
Expand Down Expand Up @@ -964,6 +976,62 @@ module Flow<LocationSig Location, InputSig<Location> Input> implements OutputSig
}

predicate clearsContent(ClosureNode node, CapturedVariable v) {
/*
* Stores into closure aliases block flow from previous stores, both to
* avoid overlapping data flow paths, but also to avoid false positive
* flow.
*
* Example 1 (overlapping paths):
*
* ```rb
* def m
* x = taint
*
* fn = -> { # (1)
* sink x
* }
*
* fn.call # (2)
* ```
*
* If we don't clear `x` at `fn` (2), we will have two overlapping paths:
*
* ```
* taint -> fn (2) [captured x]
* taint -> fn (1) [captured x] -> fn (2) [captured x]
* ```
*
* where the step `fn (1) [captured x] -> fn [captured x]` arises from normal
* use-use flow for `fn`. Clearing `x` at `fn` (2) removes the second path above.
*
* Example 2 (false positive flow):
*
* ```rb
* def m
* x = taint
*
* fn = -> { # (1)
* sink x
* }
*
* x = nil # (2)
*
* fn.call # (3)
* end
* ```
*
* If we don't clear `x` at `fn` (3), we will have the following false positive
* flow path:
*
* ```
* taint -> fn (1) [captured x] -> fn (3) [captured x]
* ```
*
* since normal use-use flow for `fn` does not take the overwrite at (2) into account.
*/

storeStepClosure(_, v, node, true)
or
exists(BasicBlock bb, int i |
captureWrite(v, bb, i, false, _) and
node = TSynthThisQualifier(bb, i, false)
Expand Down

0 comments on commit a13391b

Please sign in to comment.