Expected behaviour
Full reingest should work irrespective of the system configuration (provided the configuration is correct)
Current behaviour
Attempting a full reingest fails under the following configuration:
Storage Service and the pipeline (Dashboard) are on separate servers (i.e., connected together via a pipeline FS space)
The pipeline FS staging path value is different from the Storage Service internal processing location
Steps to reproduce
On a system where the Storage Service and the pipeline are on separate servers (i.e., connected via a pipeline FS space), configure the pipeline local filesystem staging path to differ from the Storage Service internal processing location. For example, if the Storage Service internal processing location path is "/var/archivematica/storage_service", set the pipeline local filesystem staging path to "/mnt/pfs_staging".
Run an ingest and store an AIP (the ingest should complete without failures)
Trigger a full reingest of the ingested AIP (e.g., from the Archival Storage tab), and approve the transfer from the dashboard
During transfer processing, the job "Verify transfer compliance" (Microservice: Verify transfer compliance) fails
Your environment (version of Archivematica, operating system, other relevant details)
AM version: stable/1.15.x, SS version: stable/0.21.x (the same problem may also have occurred in previous versions; see issue #1456)
OS: Rocky 9/RHEL 9 (the issue seems to be unrelated to the OS)
Additional Information/Analysis
After triggering a reingest, and before approving the transfer, it can be noted that the transfer files are "enclosed" under a duplicated directory structure in the watched directory, for example:
Note there are two "oneimage2-xxxxx" subdirectory levels (one with the original AIP uuid, and one with the uuid assigned for reingest). The double directory level is what will cause the reingest process to fail.
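The doubled level described above can be checked for mechanically. The following is a minimal sketch, not Archivematica code: it assumes directory names of the form "<name>-<uuid>", and the helper names and the first UUID in the usage example are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Matches a directory name of the form "<name>-<uuid>" (lowercase hex UUID).
UUID_SUFFIX = re.compile(
    r"^(?P<name>.+)-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def name_stem(component):
    """Return the name without its trailing UUID, or None if there is no UUID."""
    match = UUID_SUFFIX.match(component)
    return match.group("name") if match else None


def has_doubled_level(path):
    """True if two consecutive path components share the same "<name>" stem."""
    parts = PurePosixPath(path).parts
    for parent, child in zip(parts, parts[1:]):
        stem = name_stem(parent)
        if stem is not None and stem == name_stem(child):
            return True
    return False


# Hypothetical paths: the first UUID is invented for illustration only.
nested = (
    "watched/oneimage2-11111111-2222-3333-4444-555555555555/"
    "oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387/data"
)
flat = "watched/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387/data"
print(has_doubled_level(nested), has_doubled_level(flat))
```

The check compares only adjacent components, so it flags the structure regardless of which UUID sits at the outer level.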
When a full reingest is triggered, the AIP files are copied or moved between locations roughly as follows (assuming a compressed AIP; see the SS code for locations.models.package:package:start_reingest() and the AM code for archivematica.dashboard:views:reingest()):
1) The SS copies the AIP from the aipstore and extracts it to the Storage Service internal processing location
2) The SS moves the extracted AIP files to the staging path of the respective pipeline FS
3) The SS moves the extracted AIP files to the "currently processing" location of the corresponding pipeline
4) A reingest API call is made to the pipeline. The pipeline moves the extracted AIP files from the currently processing location to the pipeline watched directory
The problematic step appears to be 2) above. According to the logs, the move is implemented via rsync, which seems to create the problematic double directory level (note that the rsync source parameter has no trailing "/", so rsync copies the enclosing directory itself rather than only its contents):
DEBUG 2024-02-09 11:23:48 locations.models.package:package:start_reingest:2257: Reingest: extracted to /var/archivematica/storage_service/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387
INFO 2024-02-09 11:23:48 locations.models.package:package:start_reingest:2331: Reingest: files: ['tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387']
DEBUG 2024-02-09 11:23:48 locations.models.package:package:start_reingest:2347: Reingest: Current location: 413d6e6f-dcad-4937-8179-544a2020c28d: var/archivematica/storage_service (Storage Service Internal Processing)
DEBUG 2024-02-09 11:23:48 locations.models.space:space:move_to_storage_service:341: TO: src: var/archivematica/storage_service/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387
DEBUG 2024-02-09 11:23:48 locations.models.space:space:move_to_storage_service:342: TO: dst: tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387
DEBUG 2024-02-09 11:23:48 locations.models.space:space:move_to_storage_service:343: TO: staging: /mnt/pfs_staging
INFO 2024-02-09 11:23:48 locations.models.space:space:move_rsync:545: Moving from /var/archivematica/storage_service/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387 to /mnt/pfs_staging/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387
INFO 2024-02-09 11:23:48 locations.models.space:space:move_rsync:587: rsync command: ['rsync', '-t', '-O', '--protect-args', '-vv', '--chmod=Fug+rw,o-rwx,Dug+rwx,o-rwx', '-r', '/var/archivematica/storage_service/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387', '/mnt/pfs_staging/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387']
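To make the trailing-slash rule concrete, here is a small sketch that models where a recursive rsync places the contents of a source directory. It does not call rsync and is not Storage Service code; the function name is hypothetical, and it only encodes rsync's documented rule (a source without a trailing "/" is copied as a directory into the destination, a source with one contributes only its contents). Whether this is exactly the nesting later seen in the watched directory is not confirmed here.

```python
import posixpath


def rsync_top_level_target(src, dst):
    """Model the directory that receives the contents of `src` for `rsync -r src dst`."""
    if src.endswith("/"):
        # Trailing slash: the contents of src land directly in dst.
        return dst.rstrip("/") or "/"
    # No trailing slash: src itself is created inside dst.
    return posixpath.join(dst, posixpath.basename(src))


# Paths taken from the rsync command in the log above.
src = "/var/archivematica/storage_service/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387"
dst = "/mnt/pfs_staging/tmpztdld58u/oneimage2-74cad4f0-d1d4-467c-88f8-f29f008d6387"

print(rsync_top_level_target(src, dst))        # as logged: one extra directory level
print(rsync_top_level_target(src + "/", dst))  # with a trailing slash on the source
```

Under this model, the logged command places the files one level deeper than the intended destination, while a trailing "/" on the source would place them directly in it.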
In some installations, the SS internal processing location matches the pipeline FS staging path (e.g., both are set to /var/archivematica/storage_service). In this case, step 2) above does not trigger a move (i.e., rsync is not run) and the problem is avoided. Setting the pipeline FS staging path to the same value as the SS internal processing location could therefore be the simplest workaround until the bug is fixed.
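A quick way to tell whether an installation falls under the affected configuration is to compare the two paths. The helper below is a hypothetical sketch for that check only; it is an assumption that a simple normalized path comparison reflects when the Storage Service skips the move, since the actual condition in the SS code is not shown in this issue.

```python
import posixpath


def staging_move_needed(internal_processing_path, pipeline_staging_path):
    """Return True if the two configured paths differ after normalization."""
    return posixpath.normpath(internal_processing_path) != posixpath.normpath(
        pipeline_staging_path
    )


# Affected configuration from the steps to reproduce.
print(staging_move_needed("/var/archivematica/storage_service", "/mnt/pfs_staging"))
# Workaround configuration: both paths set to the same value.
print(
    staging_move_needed(
        "/var/archivematica/storage_service", "/var/archivematica/storage_service/"
    )
)
```

Normalizing first means a stray trailing slash in one of the settings is not mistaken for a different location.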
For Artefactual use:
Before you close this issue, you must check off the following:
All pull requests related to this issue are properly linked
All pull requests related to this issue have been merged
A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
Documentation regarding this issue has been written and merged (if applicable)
Details about this issue have been added to the release notes (if applicable)