This script is intended to run alongside Hercules. It continuously monitors a directory on a local data storage node, copies new or updated files, and then starts a Hercules job to send each file to a remote destination. This is meant to provide a drag-and-drop-like experience for Hercules file transfers.
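The "new or updated files" check can be illustrated with a minimal polling sketch. It is not the script's actual implementation: it assumes the observed directory is on the local filesystem (the real script reaches the data host over SFTP), and the names `take_snapshot` and `changed_files` are hypothetical.

```python
import os
from typing import Dict, List, Tuple

# Relative path -> (modification time, size in bytes)
Snapshot = Dict[str, Tuple[float, int]]


def take_snapshot(root: str) -> Snapshot:
    """Walk `root` and record modification time and size of every file."""
    snapshot: Snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            snapshot[os.path.relpath(full, root)] = (st.st_mtime, st.st_size)
    return snapshot


def changed_files(previous: Snapshot, current: Snapshot) -> List[str]:
    """Return paths that are new or whose (mtime, size) differs from before."""
    return sorted(p for p, meta in current.items() if previous.get(p) != meta)
```

A monitoring loop would call `take_snapshot` periodically, pass the previous and current snapshots to `changed_files`, and queue each returned path for copying and sending.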
The main goal is to send files from a local to a remote destination. Four machines are involved in this process:
- local data host: Data storage, accessible by local users, accessible by local transfer host
- local transfer host: NOT accessible by users, runs Hercules Server, runs THIS script
- remote transfer host: NOT accessible by users, runs Hercules Server, runs THIS script
- remote data host: Data storage, accessible by remote users, accessible by remote transfer host
This script is configured via a JSON config file. The path to the config file is given as a command line argument: python3 main.py -c "config.json"
The config file shall have the following format and fields:
{
  "ldh_ip": "",
  "ldh_username": "",
  "ldh_ssh_key_file": "",
  "ldh_observe_dir": "",
  "ldh_write_dir": "",
  "lth_state_file_dir": "",
  "lth_hercules_rcv_dir": "./temp/incoming",
  "lth_out_temp_dir": "./temp/outgoing",
  "hercules_monitor_address": "",
  "rth_address": "",
  "rth_target_dir": ""
}
- ldh_ip: local data host IP address
- ldh_username: local data host username for SFTP login
- ldh_ssh_key_file: local data host SSH key file for SFTP login
- ldh_observe_dir: local data host directory to observe for outgoing files
- ldh_write_dir: local data host directory to write incoming files to
- lth_state_file_dir: local transfer host directory where incoming_state.json and outgoing_state.json will be stored and loaded from
- lth_hercules_rcv_dir: local transfer host directory to observe for incoming Hercules transfers
- lth_out_temp_dir: local transfer host directory to temporarily store files for outgoing Hercules transfers
- hercules_monitor_address: address used to interact with Hercules via its HTTP API (default: localhost:8000)
- rth_address: remote transfer host address, i.e. the destination address for Hercules transfers (SCION address)
- rth_target_dir: remote transfer host target directory, i.e. the destination directory for Hercules transfers
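As an illustration of how such a config file could be read, here is a minimal sketch. It is not taken from main.py. The helper names `load_config` and `parse_args`, the `--config` long option, the choice of which fields are required, and the treatment of empty strings as "not set" are all assumptions. The default values are the ones shown in the format above.

```python
import argparse
import json

# Assumption: these fields have no sensible default and must be set.
REQUIRED_FIELDS = [
    "ldh_ip", "ldh_username", "ldh_ssh_key_file", "ldh_observe_dir",
    "ldh_write_dir", "lth_state_file_dir", "rth_address", "rth_target_dir",
]

# Defaults as shown in the config format and field descriptions.
DEFAULTS = {
    "lth_hercules_rcv_dir": "./temp/incoming",
    "lth_out_temp_dir": "./temp/outgoing",
    "hercules_monitor_address": "localhost:8000",
}


def load_config(path: str) -> dict:
    """Load the JSON config, apply defaults, and check required fields."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    config = dict(DEFAULTS)
    # Assumption: an empty string means "not set", so the default is kept.
    config.update({k: v for k, v in raw.items() if v != ""})
    missing = [k for k in REQUIRED_FIELDS if not config.get(k)]
    if missing:
        raise ValueError("missing config fields: " + ", ".join(missing))
    return config


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the -c option that carries the config file path."""
    parser = argparse.ArgumentParser(description="Hercules file transfer helper")
    parser.add_argument("-c", "--config", required=True, help="path to JSON config")
    return parser.parse_args(argv)
```

With these helpers, `load_config(parse_args().config)` would return a dictionary in which every field listed above is present.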
This script stores file metadata as a File object. This includes a FileState. This state determines the current and next steps that must be performed on that file (copy, send, delete, ...).
The states and the state transitions are explained in the chart below.
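To make the idea of a per-file state concrete, here is a minimal sketch of such a state machine. The class names File and FileState come from the description above, but the individual states, the transition table, and the `advance` method are hypothetical and only loosely follow the steps mentioned (copy, send, delete). The script's real states may differ.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FileState(Enum):
    # Hypothetical states for an outgoing file; not the script's actual set.
    NEW = auto()      # detected in the observed directory
    COPIED = auto()   # copied to the transfer host
    SENT = auto()     # Hercules transfer finished
    DELETED = auto()  # temporary copy removed


# Hypothetical linear transition table: current state -> next state.
NEXT_STATE = {
    FileState.NEW: FileState.COPIED,
    FileState.COPIED: FileState.SENT,
    FileState.SENT: FileState.DELETED,
}


@dataclass
class File:
    path: str
    state: FileState = FileState.NEW

    def advance(self) -> FileState:
        """Move to the next state, or raise if the file is in a final state."""
        if self.state not in NEXT_STATE:
            raise ValueError(f"{self.path} is already in final state {self.state.name}")
        self.state = NEXT_STATE[self.state]
        return self.state
```

Keeping the transitions in a single table makes it easy to check which step comes next for a file after the state is reloaded from disk.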
- Does Hercules handle small files? (There were issues with test files that are only a few bytes in size.)