file metadata: records can be published before file metadata extracted #1841

Open
dfdan opened this issue Oct 11, 2024 · 1 comment
Labels
bug Something isn't working stale

dfdan commented Oct 11, 2024

Package version (if known): 10.8.6 and probably earlier

Describe the bug

When uploading and publishing records programmatically, I noticed that some records were published before extract_file_metadata had completed. The published records then contain files whose metadata is null, which prevents IIIF manifests (and possibly other things) from working.

Steps to Reproduce

  1. Disable celery workers
  2. Create new record with image, and publish
  3. Note that the image metadata is null.

This can happen if Celery is slow or disabled, or even under normal conditions when uploading programmatically and the API call to publish happens immediately after the upload. The Celery log below shows the job failing because the file is no longer in rdm_drafts_files.
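Until this is fixed, a client-side workaround might be to poll the draft's file listing and only publish once the expected metadata has appeared. The following is a minimal sketch under assumptions, not part of InvenioRDM: `wait_for_file_metadata` and `list_entries` are hypothetical names, and `list_entries` stands in for whatever request returns the draft's file entries, assumed here to be dicts with `key` and `metadata` fields.

```python
import time


def wait_for_file_metadata(list_entries, keys, timeout=30.0, interval=0.5,
                           sleep=time.sleep, clock=time.monotonic):
    """Block until every file in `keys` reports non-null metadata.

    `list_entries` is a hypothetical callable that returns the draft's file
    entries as dicts with "key" and "metadata" (assumed shape).
    Raises TimeoutError if some metadata is still missing after `timeout`.
    """
    wanted = set(keys)
    deadline = clock() + timeout
    while True:
        entries = {entry["key"]: entry for entry in list_entries()}
        # A file counts as pending if it is missing or its metadata is null.
        pending = sorted(
            key for key in wanted
            if entries.get(key, {}).get("metadata") is None
        )
        if not pending:
            return
        if clock() >= deadline:
            raise TimeoutError(f"metadata still missing for: {pending}")
        sleep(interval)
```

Only pass the keys of files that are expected to get metadata (images, in this report). Files with nothing to extract may keep null metadata indefinitely, so waiting on them would always time out.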

Expected behavior

Publishing should not be allowed until all files have had their metadata processing run.
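One way to express that rule is a publish-time guard that tracks whether extraction has run for each file, since null metadata alone cannot distinguish "not yet processed" from "nothing to extract". This is only an illustrative sketch: `Extraction`, `DraftFile`, `FilesNotProcessedError`, and `check_publishable` are hypothetical names and do not exist in invenio-records-resources.

```python
from dataclasses import dataclass
from enum import Enum


class Extraction(Enum):
    """Hypothetical per-file state of the metadata extraction task."""
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


@dataclass
class DraftFile:
    key: str
    extraction: Extraction = Extraction.PENDING


class FilesNotProcessedError(Exception):
    """Raised when publish is attempted before extraction has run."""

    def __init__(self, keys):
        self.keys = keys
        super().__init__(
            "metadata extraction not finished for: " + ", ".join(keys)
        )


def check_publishable(files):
    """Refuse to publish while any file's extraction is still pending."""
    unfinished = [f.key for f in files if f.extraction is Extraction.PENDING]
    if unfinished:
        raise FilesNotProcessedError(unfinished)
```

In this sketch a `FAILED` extraction does not block publishing, because the requirement is that processing has run, not that it succeeded. Whether failures should also block is a separate design decision.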

Screenshots (if applicable)

Celery log -

[2024-10-11 13:43:45,616: WARNING/ForkPoolWorker-17] [2024-10-11 13:43:45,532] ERROR in tasks: Failed to extract file metadata.
Traceback (most recent call last):
File "/srv/venv/rdm12/lib/python3.10/site-packages/invenio_records_resources/tasks.py", line 27, in extract_file_metadata
service.extract_file_metadata(system_identity, record_id, file_key)
File "/srv/venv/rdm12/lib/python3.10/site-packages/invenio_records_resources/services/uow.py", line 376, in inner
res = f(self, *args, **kwargs)
File "/srv/venv/rdm12/lib/python3.10/site-packages/invenio_records_resources/services/files/service.py", line 160, in extract_file_metadata
uow.register(RecordCommitOp(file_record))
File "/srv/venv/rdm12/lib/python3.10/site-packages/invenio_records_resources/services/uow.py", line 349, in register
op.on_register(self)
File "/srv/venv/rdm12/lib/python3.10/site-packages/invenio_records_resources/services/uow.py", line 176, in on_register
self._record.commit()
File "/srv/venv/rdm12/lib/python3.10/site-packages/invenio_records/api.py", line 460, in commit
db.session.merge(self.model)
File "<string>", line 2, in merge
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 3050, in merge
self._autoflush()
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2253, in _autoflush
self.flush()
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 3449, in flush
self._flush(objects)
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 3588, in _flush
with util.safe_reraise():
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
compat.raise_(
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 3549, in _flush
flush_context.execute()
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
rec.execute(self)
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
util.preloaded.orm_persistence.save_obj(
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py", line 237, in save_obj
_emit_update_statements(
File "/srv/venv/rdm12/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py", line 1035, in _emit_update_statements
raise orm_exc.StaleDataError(
sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'rdm_drafts_files' expected to update 1 row(s); 0 were matched.
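The final error line can be read as "the UPDATE matched no row". A minimal stand-in for that mechanism, using only the standard library `sqlite3` module rather than SQLAlchemy, is sketched below. The table layout is a simplified assumption, `update_metadata` is a hypothetical helper, and the `DELETE` simulates the draft file row disappearing when publish wins the race.

```python
import sqlite3


def update_metadata(conn, file_id, metadata):
    """Update one row and fail if it is gone, like the ORM's row-count check."""
    cur = conn.execute(
        "UPDATE rdm_drafts_files SET metadata = ? WHERE id = ?",
        (metadata, file_id),
    )
    if cur.rowcount != 1:
        raise RuntimeError(
            f"expected to update 1 row(s); {cur.rowcount} were matched"
        )


conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: the real table has more columns.
conn.execute(
    "CREATE TABLE rdm_drafts_files "
    "(id INTEGER PRIMARY KEY, file_key TEXT, metadata TEXT)"
)
conn.execute("INSERT INTO rdm_drafts_files (id, file_key) VALUES (1, 'image.png')")

# Publish wins the race: the draft file row is removed before the task runs.
conn.execute("DELETE FROM rdm_drafts_files WHERE id = 1")

error = None
try:
    # The extraction task now tries to write metadata to the vanished row.
    update_metadata(conn, 1, '{"width": 100}')
except RuntimeError as exc:
    error = str(exc)
```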

Additional context

dfdan added the bug (Something isn't working) label Oct 11, 2024

This issue was automatically marked as stale.

github-actions bot added the stale label Dec 11, 2024