Don't delete *.pyc files from the image #426

Merged
merged 2 commits into from
Jan 13, 2023

yuvipanda
Member

Panel and Xarray-Leaflet are heavy enough imports that, without .pyc files, the two together sometimes take as much as 15s to import! This causes JupyterHub startup to fail in some cases.
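A rough way to reproduce the effect locally is to time an import in a fresh interpreter, once normally and once with the bytecode cache bypassed. This is only a sketch, not what was used in the linked issue: the helper name `time_import` is hypothetical, it relies on `PYTHONPYCACHEPREFIX` (Python 3.8+) pointing at an empty directory so existing `__pycache__` files are ignored, and the measured time includes interpreter startup.

```python
import os
import subprocess
import sys
import tempfile
import time


def time_import(module: str, use_bytecode_cache: bool = True) -> float:
    """Return wall-clock seconds for a fresh interpreter to import `module`.

    Hypothetical helper. With use_bytecode_cache=False, the child process
    looks for .pyc files in an empty directory and (-B) writes none, which
    approximates an image whose .pyc files were deleted.
    """
    cmd = [sys.executable]
    env = dict(os.environ)
    with tempfile.TemporaryDirectory() as empty_cache:
        if not use_bytecode_cache:
            cmd.append("-B")  # do not write new .pyc files
            env["PYTHONPYCACHEPREFIX"] = empty_cache  # ignore existing caches
        cmd += ["-c", f"import {module}"]
        start = time.perf_counter()
        subprocess.run(cmd, check=True, env=env)
        return time.perf_counter() - start
```

For example, comparing `time_import("panel")` with `time_import("panel", use_bytecode_cache=False)` in an environment where panel is installed should give a feel for how much of the import time is spent compiling source.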

The longer-term fix is in panel and xarray-leaflet (see holoviz/panel#4271 and
xarray-contrib/xarray_leaflet#79).

In the meantime, leaving the .pyc files in place doesn't increase the image size by much, but it makes startup much faster!
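To check how much space the .pyc files actually take, something like the following could be run against the environment directory inside the image. It is a minimal sketch: the function name `pyc_footprint` is hypothetical, and it counts apparent file sizes, not the compressed size of the image layers.

```python
from pathlib import Path


def pyc_footprint(root: str) -> tuple[int, int]:
    """Return (bytes in .pyc files, bytes in all regular files) under `root`.

    Hypothetical helper; symlinks are skipped so files are not counted twice.
    """
    pyc_bytes = 0
    total_bytes = 0
    for path in Path(root).rglob("*"):
        if path.is_symlink() or not path.is_file():
            continue
        size = path.stat().st_size
        total_bytes += size
        if path.suffix == ".pyc":
            pyc_bytes += size
    return pyc_bytes, total_bytes
```

Calling `pyc_footprint(sys.prefix)` (after `import sys`) in the running container gives the share of the environment taken up by bytecode files.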

Ref 2i2c-org/infrastructure#2047

@yuvipanda yuvipanda requested a review from scottyhq January 12, 2023 04:50
Member

@scottyhq scottyhq left a comment

Amazing analysis in the linked issue, @yuvipanda; this seems like it was a tricky one to track down! I've always been a bit reluctant about removing files to save space in the Docker image (if they aren't necessary, why not just remove them from the package?).

This illustrates that there are two related performance optimizations to track:

  1. container size, which we reduce to cut the time spent pulling and storing images, but which impacts...
  2. the container start-up time in JupyterHub!

@yuvipanda yuvipanda merged commit ec34b79 into pangeo-data:master Jan 13, 2023
@yuvipanda
Member Author

Thanks @scottyhq! Yes, the performance of the pull is the trade-off here! I think .pyc files are small enough, though, so it is probably fine. Let's see how big the image gets once it is built and pushed!
