Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codemod to simplify checks against empty sequences #212

Merged
merged 10 commits into from
Jan 18, 2024
Merged

Conversation

clavedeluna
Copy link
Contributor

@clavedeluna clavedeluna commented Jan 15, 2024

Overview

Codemod that changes expressions like x =! [] or x == () in if, assert, assignment, or standalone statements

Description

  • This codemod is inspired by a similar pylint rule
  • I added as many tests as I could come up with and inspired some by pylint. For now I left the test cases for multiple comparisons in one statement as xfailed because they're a little bit tricker than it looks and I wanted to go ahead and get this out.

Copy link

codecov bot commented Jan 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d078cda) 96.43% compared to head (bffa115) 96.45%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #212      +/-   ##
==========================================
+ Coverage   96.43%   96.45%   +0.02%     
==========================================
  Files          95       96       +1     
  Lines        4065     4095      +30     
==========================================
+ Hits         3920     3950      +30     
  Misses        145      145              
Files Coverage Δ
src/codemodder/registry.py 98.03% <ø> (ø)
src/core_codemods/__init__.py 100.00% <100.00%> (ø)
src/core_codemods/fix_empty_sequence_comparison.py 100.00% <100.00%> (ø)

@clavedeluna clavedeluna marked this pull request as ready for review January 15, 2024 18:41
Copy link
Member

@drdavella drdavella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this codemod and the implementation is nice. But we need to be a bit careful for the reasons described in the pylint rule:

Following this check blindly in weakly typed code base can create hard to debug issues. If the value can be something else that is falsey but not a sequence (for example None, an empty string, or 0) the code will not be equivalent.

It would be safer if we could somehow verify that the object was the same type but that's probably not going to be possible in general.

@@ -0,0 +1,11 @@
Empty sequences in a boolean comparison expression are considered falsy so you can use implicit boolean comparisons instead of
comparing against empty sequences directly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick but there shouldn't be a line break here.

@@ -0,0 +1,11 @@
Empty sequences in a boolean comparison expression are considered falsy so you can use implicit boolean comparisons instead of
Copy link
Member

@drdavella drdavella Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Empty sequences in a boolean comparison expression are considered falsy so you can use implicit boolean comparisons instead of
Empty sequences in Python always evaluate to `False`. This means that comparison expressions that use empty sequences can sometimes be simplified. In these cases no explicit comparison is required: instead we can rely on the [truth value](https://docs.python.org/3/library/stdtypes.html#truth-value-testing) of the object under comparison. This is sometimes referred to as "implicit" comparison. Using implicit boolean comparison expressions is considered best practice and can lead to better code.

DESCRIPTION = (
"Replace comparisons to empty sequence with implicit boolean comparison."
)
REFERENCES: list = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

):
NAME = "fix-empty-sequence-comparison"
SUMMARY = "Replace Comparisons to Empty Sequence with Implicit Boolean Comparison"
REVIEW_GUIDANCE = ReviewGuidance.MERGE_WITHOUT_REVIEW
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be MERGE_AFTER_REVIEW since this has the potential to break existing code.

@clavedeluna
Copy link
Contributor Author

I like this codemod and the implementation is nice. But we need to be a bit careful for the reasons described in the pylint rule:

Following this check blindly in weakly typed code base can create hard to debug issues. If the value can be something else that is falsey but not a sequence (for example None, an empty string, or 0) the code will not be equivalent.

It would be safer if we could somehow verify that the object was the same type but that's probably not going to be possible in general.

You're right. Ideally we would be able to limit this check to cases of value we're checking being sets themselves, but that's difficult. Unless we've already somehow solved the problem of "what is the underlying value of x?" Maybe we have it and I can rely on it @andrecsilva ?

x = [1,2]
if x != []:
    pass    

and not cases:

x = 'hi' # or None, or int
if x != []:
    pass    

@andrecsilva
Copy link
Contributor

I like this codemod and the implementation is nice. But we need to be a bit careful for the reasons described in the pylint rule:

Following this check blindly in weakly typed code base can create hard to debug issues. If the value can be something else that is falsey but not a sequence (for example None, an empty string, or 0) the code will not be equivalent.

It would be safer if we could somehow verify that the object was the same type but that's probably not going to be possible in general.

You're right. Ideally we would be able to limit this check to cases of value we're checking being sets themselves, but that's difficult. Unless we've already somehow solved the problem of "what is the underlying value of x?" Maybe we have it and I can rely on it @andrecsilva ?

x = [1,2]
if x != []:
    pass    

and not cases:

x = 'hi' # or None, or int
if x != []:
    pass    

We have resolve_expression in utils_mixin.py for tracking a variable through assignments. Still, it cannot guarantee its value unless it has only a single assignment to it. However, if we already know the value of the variable, what would be the point of the comparison in the first place?

@drdavella
Copy link
Member

This is certainly tricky. It's a good example of a codemod where an LLM might be able to provide more context/judgment. I'll take the blame for not evaluating the pylint rule carefully enough before giving the go-ahead for this work. I'm okay with accepting this codemod as-is (modulo comments above) under the condition that it is disabled by default.

@clavedeluna
Copy link
Contributor Author

This is certainly tricky. It's a good example of a codemod where an LLM might be able to provide more context/judgment. I'll take the blame for not evaluating the pylint rule carefully enough before giving the go-ahead for this work. I'm okay with accepting this codemod as-is (modulo comments above) under the condition that it is disabled by default.

I'm not attached to this codemod and I always go for whatever is best for the product, so @drdavella let me know if you'd prefer:

  1. Close PR, document come back later in life
  2. make merge after review as stated above + not register the codemod (disabled by default, we don't have another way to enable/disable)

@drdavella
Copy link
Member

@clavedeluna there is actually one other option: we define a variable called DEFAULT_EXCLUDED_CODEMODS in registry.py. You could add this one there and I think we'd be good.

Copy link

Quality Gate Passed Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

3 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@clavedeluna clavedeluna requested a review from drdavella January 18, 2024 11:53
@clavedeluna clavedeluna added this pull request to the merge queue Jan 18, 2024
Merged via the queue into main with commit 04b8586 Jan 18, 2024
14 checks passed
@clavedeluna clavedeluna deleted the empty-seq-comp branch January 18, 2024 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants