Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V0.0.1 packaging #30

Merged
merged 77 commits into from
Aug 1, 2023
Merged

V0.0.1 packaging #30

merged 77 commits into from
Aug 1, 2023

Conversation

zbloss
Copy link
Collaborator

@zbloss zbloss commented Jul 28, 2023

Resolves Issue #24

Adds:

  • Compressor classes - gzip, bz2, & lzma.
  • Distance class to compute distance metrics.
  • Several Exceptions.
  • KnnCompressor class to do predictioning.
  • Github Actions pipelines to:
    • Run unit tests.
    • Deploy package to pypi.
  • Unit tests

Changes:

  • Moves the original codebase to original_codebase

npc_gzip/distance.py Outdated Show resolved Hide resolved
npc_gzip/utils.py Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@EliahKagan
Copy link
Collaborator

Expanding the CI test matrix to include other operating systems could be deferred until some point after this PR is merged. Is it okay if I mark all my comments about that as resolved?

@zbloss
Copy link
Collaborator Author

zbloss commented Jul 31, 2023

Expanding the CI test matrix to include other operating systems could be deferred until some point after this PR is merged. Is it okay if I mark all my comments about that as resolved?

Yep go ahead, we'll tackle that in the future

EliahKagan

This comment was marked as outdated.

zbloss added 2 commits July 31, 2023 14:35
update tests to reflect changes in not handling type conversions
pyproject.toml Outdated Show resolved Hide resolved
update poetry lock
@zbloss zbloss requested a review from bazingagin July 31, 2023 22:19
@EliahKagan

This comment was marked as duplicate.

Copy link
Collaborator

@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To speed things up, I've done a second pass through old unresolved comments. I marked things I felt I don't need to ask you about in this PR as resolved--they can be considered later and separately. For my remaining comments, I marked them resolved but made new ones here (improving and shortening them where feasible). Everything from me above this point can therefore be disregarded.

This review also has one completely new comment, in publish-package.yml, about PyPI authentication.

npc_gzip/aggregations.py Outdated Show resolved Hide resolved
npc_gzip/aggregations.py Outdated Show resolved Hide resolved
npc_gzip/exceptions.py Outdated Show resolved Hide resolved
npc_gzip/knn_classifier.py Outdated Show resolved Hide resolved
npc_gzip/knn_classifier.py Outdated Show resolved Hide resolved
npc_gzip/knn_classifier.py Outdated Show resolved Hide resolved
npc_gzip/knn_classifier.py Outdated Show resolved Hide resolved
npc_gzip/knn_classifier.py Outdated Show resolved Hide resolved
Comment on lines +41 to +44
- name: Poetry Build & Publish
run: |
poetry build
poetry publish
Copy link
Collaborator

@EliahKagan EliahKagan Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to access GitHub Actions secrets to work, because poetry publish has to authenticate with PyPI. However that is done, this code will probably have to be expanded slightly to facilitate it. In particular, if environment variables are used, the relevant secrets will have to be placed in them using an env: key. I don't know if this needs to be taken care of before the PR is merged or not. In particular, if this CI job is only going to be used in the future, and not for initially publishing the package, then there is no need to make any further changes to this file before merging the PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so this is a new one for me. Usually I set a pypi api token as an env var. However, when I logged into pypi I saw a new feature for "trusted publishers". I set up this repo as a trusted publisher which I think means this github action will be able to publish this package without storing an api token.

Let's merge with this as is, and if it doesn't work we know how to quickly fix it.

Copy link
Collaborator

@EliahKagan EliahKagan Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Based on Using trusted publishing with GitHub Actions, I think using trusted publishing in publish-package.yml may require either doing the publish step with the pypi-publish action instead of by running poetry publish, or taking explicit steps to provide poetry with the OIDC-based token. (However publishing is achieved, the build could definitely still be done with poetry build.) Either way, as you say, it can be fixed pretty quickly if it doesn't work.

It occurs to me that fixing it may be even easier if workflow_dispatch is added as a second event trigger. Then, if a release is created on GitHub and publishing fails in a way requires the workflow to be modified to fix, then publishing can be manually reattempted from the Actions tab after fixing it, without having to make a second release on GitHub, or delete and remake the tag, etc. (Jobs that have run already can be re-run, but that would use the workflow as it was for the job, rather than the updated workflow. In contrast, triggering the workflow_dispatch event from the Actions tab uses the tip of whatever branch it is run for.) If you choose to do this, the change I'm suggesting is:

 on:
   release:
     types: [published]
+  workflow_dispatch:

npc_gzip/aggregations.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Owner

@bazingagin bazingagin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! The accuracy also seems reasonable. Minor comment on adding @EliahKagan to maintainer : ) Thanks for all the work @zbloss!

maintainers = [
"Zach Bloss <[email protected]>",
"Gin Jiang <[email protected]>",
]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest adding @EliahKagan too for all the effort and time in reviewing and improving!

@bazingagin bazingagin merged commit 9b1a8a7 into main Aug 1, 2023
EliahKagan added a commit to EliahKagan/npc_gzip that referenced this pull request Aug 2, 2023
Code style in the original codebase, including wrapping function
definitions and calls and regrouping/reordering imports, was
reglarized in bf86554 (as part of bazingagin#30). But line lengths in
docstrings were not changed (because black does not automate this
and it is unrelated to isort).

This tweaks docstrings in modules in original_codebase/ so they're
at most 89 columns. In one case, I removed a small amount of extra
wording from a docstring's description of function parameters. In
all other cases, I wrapped parts of docstrings that were over 89
columns to 79 columns, while leaving other parts (even in the same
docstrings and between 79 and 89 columns) as they were before. When
wrapping descriptions of function parameters, I used hanging
indents, for readability and because that style is the most
consistent with nearby code (and code outside original_codebase/).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants