-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: include inspector package urls as part of the malicious metadata facts for pypi packages #935
base: staging
Are you sure you want to change the base?
Conversation
…g and inspector links
One change I have made that I am unsure about is, in The reason the check for a 200 OK response exists is due to the way I have constructed the pypi inspector link. I observe that the format of the pypi inspector link was the same as |
if send_head_http_raw(inspector_link) is None: | ||
inspector_link = "" | ||
|
||
release_files.append(release_metadata["url"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add separate fields in a dict
for the inspector and file_server instead of adding all to a list
?
Also, we could add a separate field for version instead of using the version as the key for the return value of this heuristic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done three fields, version
, file_server_links
, and inspector_links
. I agree that makes much more sense, and allows me to resolve an invalid pypi inspector URL much easier.
I have added a suggestion in this comment to improve the return value. If the inspector HEAD request fails, the value corresponding to the inspector could be |
# include the pypi inspector link, which uses the same suffix of | ||
# packages/{blake2b_256}/file_name | ||
inspector_prefix = f"{self.INSPECTOR_PREFIX}{name.lower()}/{version}/" | ||
inspector_link = release_metadata["url"].replace(self.PYPI_PREFIX, inspector_prefix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the code correctly, if a project doesn't publish the wheel or tarball, we won't send a corresponding HEAD request to check the inspector link, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since we iterate over the release data for that version, each item in that is for a separate way of distributing it (.tar.gz
, .whl
) so if one of them doesn't exist it won't exist to iterate over, so the HEAD request won't get sent.
To allow for retention of packages when they are taken off PyPI, this new feature includes the inspector.pypi.io URL for the distribution files as part of the
MaliciousMetadataFacts
detail_information
field. This has been done by modifying thewheel_absence.py
heuristic to instead of returning filenames, return the python hosted URL and corresponding pypi inspector URL. The unit test for this,test_wheel_absence.py
has been updated to reflect these changes accordingly.