Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Reuse bounded predicate in TaintedAllocationSize query #17220

Merged

Conversation

paldepind
Copy link
Contributor

Reuse the bounded predicate in the TaintedAllocationSize query.

@github-actions github-actions bot added the C++ label Aug 14, 2024
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

We looked at some MRVA results and were happy with the differences. We should review the DCA results as well before merging.

And since this change does affect results, add a change note.

// There can be two separate reasons for `convertedExprMightOverflow` not holding:
// 1. `e` really cannot overflow.
// 2. `e` isn't analyzable.
// If we didn't rule out case 2 we would declare anything that isn't analyzable as bounded.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it's a pretty common pattern to have three cases to think about (1) a thing holds (2) the thing doesn't hold and (3) we weren't able to determine either. If we had another predicate, convertedExprNeverOverflows, this would not be the inverse of convertedExprMightOverflow (due to case 3) and the choice between using convertedExprNeverOverflows or not convertedExprMightOverflow as we do here would be meaningful.

@paldepind paldepind force-pushed the reuse-unbounded-in-tainted-allocation-size branch from 668a02e to c7d76da Compare August 15, 2024 10:44
geoffw0
geoffw0 previously approved these changes Aug 16, 2024
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with this, when you've finished reviewing a sample of those DCA differences.

@paldepind
Copy link
Contributor Author

I've taken a look at the results in the DCA report. With this PR there are 11 less query results.

One of the disappearing results is in git. The added barrier is c & 15 which is correct as that certainly bounds the input.

8 of the disappearing results are in vim. After looking at them, they all go through the same function with a bitwise & barrier. Here the mask, CT_CELL_MASK, is 7, so this also effectively bounds the value (and even before that the paths are very long (60-120) and kinda questionable).

The last 2 results are in openjdk. Here the paths are short (10 steps). The source reads a file which ends up tainting an allocation. In between some bit manipulation goes on, apparently to extract the right bits from a header. I think this could be a true positive for the purpose of the query, but it's clearly deliberate that those bits in the header should end up affecting the allocation.

I think the results seem fine. I think one future approach that could increase precision would be to consider that actual values used in the arithmetic operations (&, /, %, etc.) and only accept barriers where the operation actually reduces the size significantly (such as % sizeof(char) but not / sizeof(char)).

@paldepind paldepind marked this pull request as ready for review August 16, 2024 10:59
@paldepind paldepind requested a review from a team as a code owner August 16, 2024 10:59
@paldepind paldepind force-pushed the reuse-unbounded-in-tainted-allocation-size branch from b79e550 to 0eaa984 Compare August 16, 2024 11:03
geoffw0
geoffw0 previously approved these changes Aug 16, 2024
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for addressing this!

@paldepind paldepind force-pushed the reuse-unbounded-in-tainted-allocation-size branch from 0eaa984 to 1665bad Compare August 19, 2024 06:24
@paldepind
Copy link
Contributor Author

Fixed some conflicts after merging the other PR related to TaintedAllocationSize. Should be ready for final review now :)

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@geoffw0 geoffw0 merged commit a25d9c7 into github:main Aug 19, 2024
15 checks passed

if (size % 100)
{
malloc(size * sizeof(int)); // GOOD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this example "GOOD", but the one from lines 132-136 isn't?
That is, I can't see a real difference here:
https://godbolt.org/z/bPWaoPMYY

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my actual question is: how is modulo supposed to bound the input here?

Copy link
Contributor Author

@paldepind paldepind Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. It does not 😅. As it stands it's actually an example of a false negative. What I had in my mind was an example like this:

int size = atoi(getenv("USER"));
int size2 = size % 100;
malloc(size2 * sizeof(int));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out @intrigus-lgtm. It's fixed now by #17269.

@paldepind paldepind deleted the reuse-unbounded-in-tainted-allocation-size branch September 3, 2024 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants