Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refinement: Removed Redundancy of Scan Config Passing #304

Merged
merged 3 commits into from
Nov 29, 2023

Conversation

aryanagrawal22
Copy link
Contributor

Description

  • Issue: Refinement: Improve sending scan config to every crawler. #241. Hence I created a property crawler_config in base class ICrawler which is returns bool value, where default=False
  • If there is a need or optionality of passing the config to the crawler then this property crawler_config in the base class needs to be overridden in the sub class to return True.

Note : Currently as I was checking all the crawlers, I have just seen that config is being used only inside the StorageBucketsCrawler crawler. So only its value is set as True in its base class.

Changes Made

  • Base class ICrawler has a new property called as crawler_config which returns the abstracted private variable _config_depndency which is set to False by default.
  • StorageBucketsCrawler class has overridden crawler_config to return True.
  • get_crawl function in scanner.py has a check now calling the crawler_config to decide to send config or not.

Testing

Local testing done with logs denoting that where config file is sent and where it is not. (Removed the logs after the testing)

Screenshot 2023-11-23 at 6 01 29 PM

This check denotes that only StorageBucketsCrawler is being sent the config file

Copy link
Collaborator

@mshudrak mshudrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR. I left some suggestions. PTAL.

src/gcp_scanner/crawler/interface_crawler.py Outdated Show resolved Hide resolved
src/gcp_scanner/crawler/storage_buckets_crawler.py Outdated Show resolved Hide resolved
@@ -180,8 +180,10 @@ def get_crawl(
Returns:
scan_result: a dictionary with scanning results
"""

res = crawler.crawl(project_id, client, crawler_config)
if crawler.crawler_config:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not actually calling the function here. This statement will always return true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are referencing about the crawler_config not having parentheses denoting a function, it is because the crawler_config method is a property, and properties in Python are typically accessed without parentheses.

Screenshot 2023-11-23 at 6 01 29 PM

The above testing image shows that, when the scanner is run, it is not always giving true for crawler_config. False for every class except of StorageBucketsCrawler (As for now only that class needs the config file).

Please let me know, if I have misunderstood your comment or something else was wrong.

Copy link
Collaborator

@mshudrak mshudrak Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my bad, right. NVM. I'd rename the function to make it more clear of what is this property about. Right now we have too many crawler_configs in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fair point. I have made the changes, changed the name to has_config_dependency from crawler_config

@aryanagrawal22
Copy link
Contributor Author

Thanks for your PR. I left some suggestions. PTAL.

Thank you for going through my PR, and reviewing it. I think, I have done the relevant changes as asked. Please let me know in case any others are needed.

Copy link
Collaborator

@mshudrak mshudrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small suggestion and we are good to go.

@@ -180,8 +180,10 @@ def get_crawl(
Returns:
scan_result: a dictionary with scanning results
"""

res = crawler.crawl(project_id, client, crawler_config)
if crawler.crawler_config:
Copy link
Collaborator

@mshudrak mshudrak Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my bad, right. NVM. I'd rename the function to make it more clear of what is this property about. Right now we have too many crawler_configs in the code.

@aryanagrawal22
Copy link
Contributor Author

Yes, fair point. I have made the changes, changed the name to has_config_dependency from crawler_config

Pushed with the changes 😁

Copy link
Collaborator

@mshudrak mshudrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

Fixes #241

@mshudrak mshudrak merged commit 3ac30b6 into google:main Nov 29, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants