Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codemod Lxml parser defaults #61

Merged
merged 6 commits into from
Oct 4, 2023
Merged

Codemod Lxml parser defaults #61

merged 6 commits into from
Oct 4, 2023

Conversation

clavedeluna
Copy link
Contributor

@clavedeluna clavedeluna commented Oct 3, 2023

Overview

New codemod for lxml classes that have security-related parameters

Description

  • I generalized replace_arg to replace_args so now we can replace a list of args. The method api accepts args_info, which we expect to be a list of the namedtuple NewArg.

@@ -152,7 +152,7 @@ def match_codemods(
def load_registered_codemods() -> CodemodRegistry:
registry = CodemodRegistry()
logger.info("Loading registered codemod collections")
for entry_point in entry_points()["codemods"]:
for entry_point in entry_points().select(group="codemods"):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether you ran into this problem with the change but I think the right way to fix this is probably by adding some conditional logic based on Python version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up removing this change (sorry forgot to remove my comment) bc yeah py versions. We can leave as is for now.

@clavedeluna clavedeluna force-pushed the lxml-parser-defaults branch 2 times, most recently from 7c189c9 to 20af62a Compare October 3, 2023 11:51
@clavedeluna clavedeluna changed the title Lxml parser defaults Codemod Lxml parser defaults Oct 3, 2023
@clavedeluna clavedeluna marked this pull request as ready for review October 3, 2023 12:35
Copy link
Member

@drdavella drdavella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Primary request is about DESCRIPTION but it also might be nice to consider something like namedtuple for the new parameter arguments.

src/codemodder/codemods/api/helpers.py Outdated Show resolved Hide resolved
NAME = "safe-lxml-parser-defaults"
REVIEW_GUIDANCE = ReviewGuidance.MERGE_AFTER_CURSORY_REVIEW
SUMMARY = "Enable all security checks in `lxml.etree.XMLParser` call."
DESCRIPTION = "...........TODO"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this actually gets used as the CHANGE_DESCRIPTION for the simplified API (we should fix that). But we do need to make sure it's present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops total oversight here that's why the TODO. How verbose can I be, should I be? Is this used as the PR description when pixeebot makes the PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now (and again we need to fix this) DESCRIPTION -> CHANGE_DESCRIPTION for the simple API. CHANGE_DESCRIPTION should be a very simple description of the individual change itself. So in this case probably something like "Replace lxml parser parameters with safe defaults". In contrast the SUMMARY becomes the title of the PR that gets opened so it should be very brief and descriptive.

src/core_codemods/lxml_safe_parser_defaults.py Outdated Show resolved Hide resolved
class LxmlSafeParserDefaults(SemgrepCodemod):
NAME = "safe-lxml-parser-defaults"
REVIEW_GUIDANCE = ReviewGuidance.MERGE_AFTER_CURSORY_REVIEW
SUMMARY = "Enable all security checks in `lxml.etree.XMLParser` call."
Copy link
Member

@drdavella drdavella Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry should have mentioned this before, but this becomes the PR title. I would suggest this could be even briefer, like "Use safe defaults for lxml parsers" (and without a trailing period).

@clavedeluna clavedeluna force-pushed the lxml-parser-defaults branch from c294b1c to 6ac3925 Compare October 4, 2023 11:37
@clavedeluna clavedeluna requested a review from drdavella October 4, 2023 11:38
new_args = self.replace_args(
original_node,
[
NewArg(name="resolve_entities", value="False", add_if_missing=True),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, I think it makes sense.

@clavedeluna clavedeluna merged commit 0790cc2 into main Oct 4, 2023
6 checks passed
@clavedeluna clavedeluna deleted the lxml-parser-defaults branch October 4, 2023 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants