Problem: API "package" endpoint doesn't work with some absolute paths #1709

Open
5 tasks
djjuhasz opened this issue Aug 15, 2024 · 9 comments

djjuhasz commented Aug 15, 2024

Expected behaviour

Submitting a POST /api/v2beta/package request with a path parameter value that is an absolute path should start processing the transfer at the given path, as per https://www.archivematica.org/en/docs/archivematica-1.16/dev-manual/api/api-reference-archivematica/#package.

Current behaviour

Submitting a POST /api/v2beta/package request with an absolute path works for some paths and not for others, and I'm not clear why. I am testing with an Archivematica deployment that has two transfer source locations configured: "/home" and "/transfer_source". A package request with the path /transfer_source/small_bag.zip starts a transfer successfully, while a request with the path /home/small.zip does not.

In both cases the HTTP response status is 202 Accepted and a transfer_id is returned in the body, but in the second case the transfer never actually starts.

Both paths exist on the Storage Service server:

Works:

artefactual@amss:/home$ ls -l /transfer_source/small_bag.zip
-rwxr-x--- 1 enduro archivematica 1683 Mar  6 19:12 /transfer_source/small_bag.zip

Doesn't work:

artefactual@amss:/home$ ls -l /home/small.zip
-rw-r--r-- 1 enduro archivematica 1276 Aug 15 22:23 /home/small.zip

Steps to reproduce

Here are the two curl requests I used for testing:

Works:

curl -i -X POST \
  -H 'Accept: */*' \
  -H 'Authorization: ApiKey REDACTED:REDACTED' \
  -H 'Content-Type: application/json' \
  --data "{\
       \"path\": \"$(echo -n '/transfer_source/small_bag.zip' | base64 -w 0)\", \
       \"name\": \"small_bag.zip\", \
       \"processing_config\": \"automated\", \
       \"type\": \"zipped bag\" \
      }" \
  https://REDACTED.archivematica.net/api/v2beta/package

Does not work:

curl -i -X POST \
  -H 'Accept: */*' \
  -H 'Authorization: ApiKey REDACTED:REDACTED' \
  -H 'Content-Type: application/json' \
  --data "{\
       \"path\": \"$(echo -n '/home/small.zip' | base64 -w 0)\", \
       \"name\": \"small.zip\", \
       \"processing_config\": \"automated\", \
       \"type\": \"zipfile\" \
      }" \
  https://REDACTED.archivematica.net/api/v2beta/package
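
For scripted testing, the same request body can be built in Python. This is a minimal sketch, not part of Archivematica: `build_package_payload` is a hypothetical helper, and the field names are simply the ones used in the curl commands above. It only shows that `path` is base64-encoded before it goes into the JSON body.

```python
import base64
import json


def build_package_payload(path, name, transfer_type, processing_config="automated"):
    """Return the JSON body for POST /api/v2beta/package as a string.

    Hypothetical helper: it mirrors the fields of the curl commands above.
    """
    # Same encoding the curl commands produce with `base64 -w 0`.
    encoded_path = base64.b64encode(path.encode("utf-8")).decode("ascii")
    return json.dumps(
        {
            "path": encoded_path,
            "name": name,
            "processing_config": processing_config,
            "type": transfer_type,
        }
    )


# The two request bodies from this report.
working = build_package_payload(
    "/transfer_source/small_bag.zip", "small_bag.zip", "zipped bag"
)
failing = build_package_payload("/home/small.zip", "small.zip", "zipfile")
```

Sending either body with any HTTP client, plus the `Authorization: ApiKey ...` header shown above, should be equivalent to the curl calls.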

Here's a screenshot of the storage location configuration in the Storage Service:
[screenshot: Storage Service transfer source location configuration]

Archivematica MCPServer Debug log for failed transfer:
Archivematica.debug.log

Your environment (version of Archivematica, operating system, other relevant details)

Archivematica version: 1.16.0

OS:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.6 LTS
Release:	20.04
Codename:	focal

For Artefactual use:

Before you close this issue, you must check off the following:

  • All pull requests related to this issue are properly linked
  • All pull requests related to this issue have been merged
  • A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
  • Documentation regarding this issue has been written and merged (if applicable)
  • Details about this issue have been added to the release notes (if applicable)
@djjuhasz
Author

P.S. I can successfully start processing the /home/small.zip transfer using the Archivematica Dashboard.

@djjuhasz
Author

Similar to #1436

@djjuhasz
Author

If I prefix the path with the "/home" transfer source UUID in the package request, the transfer starts successfully:

curl -i -X POST \
  -H 'Accept: */*' \
  -H 'Authorization: ApiKey REDACTED:REDACTED' \
  -H 'Content-Type: application/json' \
  --data "{\
       \"path\": \"$(echo -n '749ef452-fbed-4d50-9072-5f98bc01e52e:small.zip' | base64 -w 0)\", \
       \"name\": \"small.zip\", \
       \"processing_config\": \"automated\", \
       \"type\": \"zipfile\" \
      }" \
  https://REDACTED.archivematica.net/api/v2beta/package
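
The UUID-prefixed form can be derived from an absolute path when the transfer source directories and their UUIDs are known. A minimal sketch, assuming a hand-maintained mapping: `to_location_path` and the `locations` dictionary are hypothetical, and only the "/home" UUID comes from this thread.

```python
import posixpath


def to_location_path(absolute_path, locations):
    """Rewrite an absolute path as '<location UUID>:<relative path>'.

    `locations` maps a transfer source directory to its location UUID.
    It is a hypothetical input that has to be filled in by hand from the
    Storage Service configuration.
    """
    matches = [
        root
        for root in locations
        if absolute_path == root
        or absolute_path.startswith(root.rstrip("/") + "/")
    ]
    if not matches:
        raise ValueError(f"No transfer source location contains {absolute_path}")
    # Prefer the most specific (longest) matching directory.
    root = max(matches, key=len)
    relative = posixpath.relpath(absolute_path, root)
    return f"{locations[root]}:{relative}"


locations = {"/home": "749ef452-fbed-4d50-9072-5f98bc01e52e"}
print(to_location_path("/home/small.zip", locations))
# 749ef452-fbed-4d50-9072-5f98bc01e52e:small.zip
```

The result still has to be base64-encoded before it is sent as `path`, as in the curl command above.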

@replaceafill
Member

I see this in your attached log:

WARNING   2024-08-15 22:33:48  archivematica.common:storageService:copy_files:316:  Unable to move files with {'origin_location': '/api/v2/location/32634513-bdfc-47e5-8cba-ee2e73dd9811/', 'files': [{'source': 'home/small.zip', 'destination': '/var/archivematica/sharedDirectory/tmp/tmpyfhgjjiu/small.zip'}], 'pipeline': '/api/v2/pipeline/f8c0d75f-15c3-4152-ac1c-abcf5d8c4b36/'} because 500 Server Error: Internal Server Error for url: https://REDACTED.archivematica.net:8000/api/v2/location/8f8af017-3b89-4ce9-a90b-42d4745a3d0d/

Port 8000 runs the Storage Service. See if you have a Traceback or ERROR in that same time span in your /var/log/archivematica/storage-service/storage_service_debug.log file.
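
If the log is large, a short script can narrow it down to the relevant time span. This is only a sketch: `errors_near` is a hypothetical helper, and it assumes the Storage Service log uses the same `LEVEL  YYYY-MM-DD HH:MM:SS  ...` line layout as the MCPServer line quoted above, which may not hold for every logging configuration.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, modelled on the MCPServer line quoted above:
# LEVEL  YYYY-MM-DD HH:MM:SS  logger:module:function:line:  message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s"
)


def errors_near(lines, when, window_seconds=60):
    """Yield ERROR entries and Traceback headers logged close to `when`."""
    window = timedelta(seconds=window_seconds)
    in_window = False
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            in_window = abs(ts - when) <= window
            if in_window and match.group("level") == "ERROR":
                yield line.rstrip("\n")
        elif in_window and line.startswith("Traceback"):
            # Tracebacks carry no timestamp of their own, so they are
            # attributed to the most recent timestamped entry.
            yield line.rstrip("\n")
```

It could be called with an open file handle on `/var/log/archivematica/storage-service/storage_service_debug.log` and `datetime(2024, 8, 15, 22, 33, 48)`, the time of the warning above.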

@djjuhasz
Author

@replaceafill I just tried the failing package again and here is the error in the Storage Service debug log: amss_debug.log

@replaceafill
Member

@djjuhasz The relevant bit here:

locations.models.StorageException: Rsync failed with status 23: b'sending incremental file list\nrsync: change_dir "/transfer_source/home" failed: No such file or directory (2)\ndelta-transmission disabled for local transfer or --whole-file\ntotal: matches=0  hash_hits=0  false_alarms=0 data=0\n\nsent 20 bytes  received 79 bytes  198.00 bytes/sec\ntotal size is 0  speedup is 0.00\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1205) [sender=3.1.3]\n'

Could it be a problem with your spaces setup? The /transfer_source/home path that the Storage Service is looking for here doesn't seem to match the curl requests you're showing above.

@djjuhasz
Author

djjuhasz commented Aug 16, 2024

@replaceafill it looks to me like the AM or the AMSS are prepending "/transfer_source" to the "/home/small.zip" from the "path" parameter in the request.

How would I set up the spaces incorrectly to cause this problem? I included a screenshot of the transfer source directories config from the AMSS dashboard in the initial bug report. The "/transfer_source" directory was initially the "default" transfer source, and I changed the default to "/home" after I first encountered this problem. If that change is what causes the problem, it still seems like an AMSS bug.
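
The prepending described above can be modelled in a few lines. This is an illustration of one reading that is consistent with the two logs, not the actual Archivematica or Storage Service code; `resolve_source` is a hypothetical name, and "/transfer_source" is assumed to be the default location that was in effect.

```python
import posixpath


def resolve_source(default_root, request_path):
    """Model how an absolute request path might end up under the default root.

    Illustration only. The MCPServer log shows the source sent as
    'home/small.zip', and rsync then looks in '/transfer_source/home'.
    """
    root = default_root.rstrip("/")
    if request_path.startswith(root + "/"):
        # Path already lies inside the default location: keep the remainder.
        relative = request_path[len(root) + 1:]
    else:
        # Otherwise only the leading slash is dropped.
        relative = request_path.lstrip("/")
    return posixpath.join(root, relative)


print(resolve_source("/transfer_source", "/transfer_source/small_bag.zip"))
# /transfer_source/small_bag.zip
print(resolve_source("/transfer_source", "/home/small.zip"))
# /transfer_source/home/small.zip
```

Under this model the first path resolves to a file that exists, while the second resolves to a directory that does not, which matches the rsync `change_dir "/transfer_source/home" failed` error.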

@djjuhasz
Author

@replaceafill also, I can start the /home/small.zip transfer fine from the AM Dashboard, so the spaces setup works fine in that case.

@replaceafill
Member

replaceafill commented Aug 16, 2024

> @replaceafill it looks to me like the AM or the AMSS are prepending "/transfer_source" to the "/home/small.zip" from the "path" parameter in the request.

That is correct! We investigated this today and found a couple of problems:

  1. The documentation of the endpoint states:

    A fundamental difference between the package endpoint and others from which a transfer can be initiated is that a storage service transfer location UUID isn't always required. In some cases that might still be ideal.

    I think this should specify that if you pass a path with no transfer source location UUID prepended, the MCPServer fetches the default transfer source location. If you have multiple transfer source locations and want to start a transfer from one that is not the default, you have to pass the UUID.

  2. If the transfer source location UUID is not specified, the MCPServer caches the initial fetch from the Storage Service in a global variable. This is problematic if the user changes the default transfer source location in the Storage Service after it has been cached: the MCPServer process has to be restarted to reset the global variable.
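
The caching problem in point 2 can be reproduced in isolation. The following is a sketch of the pattern only, not the MCPServer source: all names are hypothetical, `fetch_default` stands in for the Storage Service request, and the UUID strings are placeholders.

```python
_default_location_uuid = None  # module-level cache, filled on first use


def default_transfer_source_uuid(fetch_default):
    """Return the default transfer source UUID, fetching it only once.

    `fetch_default` is a stand-in for the Storage Service call.
    """
    global _default_location_uuid
    if _default_location_uuid is None:
        _default_location_uuid = fetch_default()
    return _default_location_uuid


def reset_cache():
    """Forget the cached value, which is what a process restart does."""
    global _default_location_uuid
    _default_location_uuid = None


# Placeholder standing in for the Storage Service's current default.
storage_service = {"default": "uuid-of-transfer-source"}


def fetch():
    return storage_service["default"]


first = default_transfer_source_uuid(fetch)

# The user changes the default location in the Storage Service.
storage_service["default"] = "uuid-of-home"
stale = default_transfer_source_uuid(fetch)  # still the first value

reset_cache()
fresh = default_transfer_source_uuid(fetch)  # now sees the new default
```

Caching with an expiry time, or re-fetching when a lookup against the cached location fails, would be possible ways to avoid the restart, at the cost of extra Storage Service requests.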
