Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isolate pip with venv semaphores #287

Merged
merged 3 commits into from
Feb 22, 2022
Merged

isolate pip with venv semaphores #287

merged 3 commits into from
Feb 22, 2022

Conversation

bertsky
Copy link
Collaborator

@bertsky bertsky commented Feb 6, 2022

Pursuant to #286, try using the same mechanism we already use for git to bring about isolation for pip as well – but allow different venvs to be modified concurrently.

Also contains an attempt at speeding up build by replacing the costly re-pip trick (forcing reinstallation to ensure up-to-date timestamps) with just direct timestamp update (but for the targets only).

@bertsky
Copy link
Collaborator Author

bertsky commented Feb 6, 2022

Remaining CI failure seems to be only the stdout timeout for docker save. I have increased it on master.

@bertsky
Copy link
Collaborator Author

bertsky commented Feb 7, 2022

Remaining CI failure seems to be only the stdout timeout for docker save. I have increased it on master.

No idea why that update does not take effect in PRs immediately. (Anyway – should work, PR is ready to be merged AFAICS.)

@bertsky
Copy link
Collaborator Author

bertsky commented Feb 22, 2022

This does not update the modules themselves, so we need not run release.sh – can I merge this, @kba?
(We are still waiting for ocrd/all/maximum-cuda to finish...)

@kba kba merged commit 2536d99 into OCR-D:master Feb 22, 2022
@bertsky
Copy link
Collaborator Author

bertsky commented Feb 22, 2022

CI failure: bad news – new upstream inconsistencies …

numpy>=1.20.0 (from imageio==2.16.0->scikit-image==0.17.2->ocr4all-pixel-classifier==0.6.5->-r requirements.in
numpy~=1.19.2 (from tensorflow==2.6.2->ocr4all-pixel-classifier==0.6.5->-r requirements.in

This came from ocrd_pc_segmentation – I suggest we try to fix that upstream.

@bertsky
Copy link
Collaborator Author

bertsky commented Feb 22, 2022

This came from ocrd_pc_segmentation – I suggest we try to fix that upstream.

Strange. ocr4all-pixel-classifier requirements does not even allow TF 2.6 – it requries <=2.5. And no restriction on scikit-image. But somehow that ends up in the above combination. It has to do with the mysterious pip-compile mechanism.

@chreul @maxnth @chaddy314 could you please assist?

@maxnth
Copy link

maxnth commented Feb 23, 2022

@bertsky I just contacted the original author of the pixel-classifier and he's looking into fixing it upstream.

Edit: Just for the sake of clarification: the ocr4all-pixel-classifier isn't written or maintained by the OCR4all team.

@crater2150
Copy link

The problem seems to be, that scikit-image depends on imageio>=2.4.1. Note the missing upper bound on the version. With python packages, the only option to have a build that will still work in the future seems to be to hardcode all dependencies including transitives to a fixed version :(

I've added imageio ~= 2.4.1 to ocr4all-pixel-classifier's dependencies as a workaround, and also re-added the full version-pinned requirements.txt. I also published a release 0.6.6 on PyPI.

@bertsky
Copy link
Collaborator Author

bertsky commented Feb 23, 2022

The problem seems to be, that scikit-image depends on imageio>=2.4.1. Note the missing upper bound on the version. With python packages, the only option to have a build that will still work in the future seems to be to hardcode all dependencies including transitives to a fixed version :(

Indeed. Which will likely cause further conflicts as more packages are needed in the same venv.

Could anything be done on the TF version front? What was the reason for TF<=2.5 in the first place? (I have recently seen that the HDF5 model format quickly becomes incompatible with other minor releases of Python, because the custom layer code is serialised via marshal format. The better alternative is TF's SavedModel format, which is basically a directory with Protobuf file and some resource files. Conversion is as simple as an interactive load+save. Would that help to make ocr4all-pixel-classifier models run on the newer versions?

I've added imageio ~= 2.4.1 to ocr4all-pixel-classifier's dependencies as a workaround, and also re-added the full version-pinned requirements.txt. I also published a release 0.6.6 on PyPI.

Thanks @crater2150 – that helped!

@crater2150
Copy link

crater2150 commented Feb 25, 2022

Could anything be done on the TF version front? What was the reason for TF<=2.5 in the first place?

I think, 2.5 was the most recent TF version, when that line was added, it simply hasn't been tested with newer versions.

I haven't had any problems with the HDF5 models between TF 2.0 and 2.5, so I'd say chances are good a newer TF version will just work. There isn't anyone actively developing the pixel classifier anymore, so larger changes to the codebase probably won't happen from our side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants