Mets cache key error when removing a page from the cache #1297

Open
MehmedGIT opened this issue Dec 3, 2024 · 0 comments

MehmedGIT commented Dec 3, 2024

The following block of code still raises an error when METS caching is enabled and METS files are being merged. The issue does not occur when the METS server is used, but I would prefer to have both options working and available:
https://github.com/OCR-D/core/blob/master/src/ocrd_models/ocrd_mets.py#L564-L567

I remember reporting this internally, and until now I simply disabled METS caching to avoid the issue, but that workaround is becoming a nuisance. Could we silently ignore that error, please? I don't see any problem if the key to be removed is not in the cache when removing a page attribute (it may have been removed earlier). Are there any? This also happens only on specific workspaces, and I am not sure what the cause is. Here is an example OCRD-ZIP: https://easyupload.io/szq5oa (expires in 30 days). The processor used is ocrd-cis-ocropy-binarize without any extra parameters specified.

  apptainer exec --bind /mnt/lustre-emmy-hdd/projects/project_pwieder_ocr_nhr/operandi_test_local/slurm_workspaces/test_wf_job_20241203_102643621751/test_ws_20241203_102643621751:/ws_data --bind /local/3673970/ocrd_models/ocrd-resources:/usr/local/share/ocrd-resources --env OCRD_METS_CACHING=true /local/3673970/ocrd_processor_sifs/ocrd_all_maximum_image.sif ocrd workspace -d /ws_data merge --force --no-copy-files /ws_data/mets_1.xml --page-id PHYS_0005,PHYS_0006,PHYS_0007,PHYS_0008
  apptainer exec --bind /mnt/lustre-emmy-hdd/projects/project_pwieder_ocr_nhr/operandi_test_local/slurm_workspaces/test_wf_job_20241203_102643621751/test_ws_20241203_102643621751:/ws_data --bind /local/3673970/ocrd_models/ocrd-resources:/usr/local/share/ocrd-resources --env OCRD_METS_CACHING=true /local/3673970/ocrd_processor_sifs/ocrd_all_maximum_image.sif rm /ws_data/mets_1.xml

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/usr/local/bin/ocrd", line 8, in <module>
      sys.exit(cli())
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
      return self.main(*args, **kwargs)
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1078, in main
      rv = self.invoke(ctx)
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
      return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
      return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
      return ctx.invoke(self.callback, **ctx.params)
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 783, in invoke
      return __callback(*args, **kwargs)
    File "/usr/local/lib/python3.8/site-packages/click/decorators.py", line 92, in new_func
      return ctx.invoke(f, obj, *args, **kwargs)
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 783, in invoke
      return __callback(*args, **kwargs)
    File "/build/core/src/ocrd/cli/workspace.py", line 815, in merge
      workspace.merge(
    File "/build/core/src/ocrd_utils/deprecate.py", line 15, in wrapper
      return f(*args, **kwargs)
    File "/build/core/src/ocrd_utils/deprecate.py", line 15, in wrapper
      return f(*args, **kwargs)
    File "/build/core/src/ocrd_utils/deprecate.py", line 15, in wrapper
      return f(*args, **kwargs)
    [Previous line repeated 1 more time]
    File "/build/core/src/ocrd/workspace.py", line 173, in merge
      self.mets.merge(other_workspace.mets, after_add_cb=after_add_cb, **kwargs)
    File "/build/core/src/ocrd_models/ocrd_mets.py", line 919, in merge
      f_dest = self.add_file(
    File "/build/core/src/ocrd_models/ocrd_mets.py", line 485, in add_file
      self.remove_file(ID=ID, fileGrp=fileGrp)
    File "/build/core/src/ocrd_models/ocrd_mets.py", line 511, in remove_file
      self.remove_one_file(f)
    File "/build/core/src/ocrd_models/ocrd_mets.py", line 567, in remove_one_file
      del self._page_cache[attr][page_div.attrib[attr.name]]
  KeyError: ' - '
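
The request to silently ignore a missing key can be illustrated with a minimal sketch. It assumes the page cache behaves like a nested dict (attribute, then attribute value, then page element), which is what the failing `del` statement in the traceback suggests. `drop_cached_page` is a hypothetical helper, not part of `ocrd_models`, and plain strings stand in for the real attribute keys and page elements.

```python
def drop_cached_page(page_cache, attr, key):
    """Remove `key` from the per-attribute cache, tolerating a missing entry.

    Hypothetical stand-in for the `del self._page_cache[attr][...]` line:
    dict.pop with a default returns None instead of raising KeyError.
    """
    return page_cache.get(attr, {}).pop(key, None)


# Assumed cache layout: attribute name -> attribute value -> cached page element
cache = {"ORDERLABEL": {"1": "page-div-1"}}

removed = drop_cached_page(cache, "ORDERLABEL", "1")      # entry exists
missing = drop_cached_page(cache, "ORDERLABEL", " - ")    # key from the KeyError above
```

With this approach a key that was already removed (or never cached) is a no-op. The trade-off is that a genuinely inconsistent cache would no longer surface as an exception, so logging the miss at debug level may be worth considering.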