Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Modernize cpp/unclear-array-index-validation #17678

Merged

Conversation

MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Oct 7, 2024

Pull Request checklist

This PR modernizes the old low precision cpp/unclear-array-index-validation by getting rid of the old DefaultTaintTracking inherited barriers and replacing them with a more precise BarrierGuard instantiation, and uses of SimpleRangeAnalysis.

I've started two DCA runs:

  • A nightly one to test the impact of the changes to SimpleRangeAnalysis
  • A run with just cpp/unclear-array-index-validation to test the changes to that query.

Since this query isn't included in any security suite I'm not sure how much you care about the result changes on cpp/unclear-array-index-validation?

I did a diff MRVA run to test the results, and they look reasonable.

Commit-by-commit review recommended.

The new results are TPs. They essentially look like:

c = fgetc(fin);
while(c != EOF) {
  if(... && (c = ... || c == EOF) {
    // ...
  }
}

and we now see that c == EOF is always false.

All query authors

Internal query authors only

  • Autofixes generated based on these changes are valid, only needed if this PR makes significant changes to .ql, .qll, or .qhelp files. See the documentation (internal access required).
  • Changes are validated at scale (internal access required).
  • Adding a new query? Consider also adding the query to autofix.

@MathiasVP MathiasVP force-pushed the modernize-unclear-array-index-validation branch from 4f2d029 to df6d34b Compare October 8, 2024 08:45
@MathiasVP MathiasVP force-pushed the modernize-unclear-array-index-validation branch from df6d34b to b00c545 Compare October 8, 2024 09:08
@MathiasVP MathiasVP marked this pull request as ready for review October 9, 2024 11:48
@MathiasVP MathiasVP requested a review from a team as a code owner October 9, 2024 11:48
Copy link
Contributor

@jketema jketema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of comments. I still need to have a look at the DCA results, and have another look at the new barriers.

Comment on lines -47 to -48
// Comparing to 0 is not an upper bound check
not oper.getAnOperand().getValue() = "0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't dropping this lead to FNs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, true. I should add this back in. I don't think this really removed anything in practice, but we may as well not regress here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 61a012f

extern int myMaxCharTable[256];

void test7(FILE* fp) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're passing in a character pointer, not a file pointer?

Copy link
Contributor Author

@MathiasVP MathiasVP Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know. I tried to do a proper test with fopen, but it turns out I couldn't get taint through that. So I was planning on opening another PR that adds flow through fopen.

(I'm actually quite surprised we don't have taint through fopen, but I can't find any taint model for it so I guess it makes sense)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see we have an fopen model but that doesn't extend TaintFunction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Let me quickly whip up a PR to try that out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR for this: #17715

int ch;
while ((ch = getc(fp)) != EOF) {
myMaxCharTable[ch] = 0; // GOOD [FALSE POSITIVE]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good, because getc either returns a value which fits in an unsigned char or -1? Correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. See the Return value section at https://en.cppreference.com/w/c/io/fgetc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the alert changes on the nightly suite seem to suggest that this might be violated on some platforms?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry I should have read your top comment more closely. That clarifies things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the Kamailio alerts? I think they're TPs. They come from the EOF check here: https://github.com/kamailio/kamailio/blob/b02c247023ea3ea0ef9753efdb04aff7d5d3bbb4/src/modules/db_text/dbt_file.c#L143, I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the bound for fgetc is initially [-1, 255], and then the != check refines it to [0, 255]. So the == EOF check never holds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I see that now.

Copy link
Contributor

@jketema jketema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. I'm not exactly sure how to judge the DCA result changes for the query. The query was low precision, and it still is, but we get somewhat different results now. We might just be better prepared now if we want to improve the precision of the query in the future.

@jketema jketema merged commit b087fde into github:main Oct 9, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants