CCDIDC-1504 #157

Open
wants to merge 78 commits into
base: main
78 commits
03a58ad
init gdc_file_upload
bullenc Dec 20, 2024
4e579c4
Update gdc_file_upload.py
bullenc Dec 20, 2024
8cf6298
Update gdc_file_upload.py
bullenc Dec 20, 2024
3f8ba02
Update gdc_file_upload.py
bullenc Dec 20, 2024
5f95332
Update gdc_file_upload.py
bullenc Dec 20, 2024
307d6a1
remove testing block
bullenc Dec 23, 2024
2070298
Update gdc_file_upload.py
bullenc Dec 23, 2024
a02d0c2
Adding token sanitization from error messages
bullenc Dec 23, 2024
678c51d
Update gdc_file_upload.py
bullenc Dec 23, 2024
23e13e5
Update gdc_file_upload.py
bullenc Dec 26, 2024
da4b5ee
Update gdc_file_upload.py
bullenc Dec 26, 2024
e3fe7e8
Update gdc_file_upload.py
bullenc Dec 26, 2024
bf0c400
Update gdc_file_upload.py
bullenc Dec 26, 2024
0fb276f
Update gdc_file_upload.py
bullenc Dec 26, 2024
3c3152c
Update gdc_file_upload.py
bullenc Dec 26, 2024
7c588c2
Update gdc_file_upload.py
bullenc Dec 26, 2024
e7b0ef3
Update gdc_file_upload.py
bullenc Dec 26, 2024
384da99
Update gdc_file_upload.py
bullenc Dec 26, 2024
f3be9b3
Update gdc_file_upload.py
bullenc Dec 26, 2024
8448c42
Update gdc_file_upload.py
bullenc Dec 26, 2024
c1cb918
Update gdc_file_upload.py
bullenc Dec 26, 2024
4b789f3
Update gdc_file_upload.py
bullenc Dec 26, 2024
d538d65
Update gdc_file_upload.py
bullenc Dec 26, 2024
e59ac7b
Update gdc_file_upload.py
bullenc Dec 30, 2024
9dd4187
Update gdc_file_upload.py
bullenc Dec 30, 2024
5cc3909
Update gdc_file_upload.py
bullenc Dec 30, 2024
7ba2b8e
Update gdc_file_upload.py
bullenc Jan 2, 2025
ef92173
up worker, implement parallel chunks
bullenc Jan 2, 2025
163d557
Update gdc_file_upload.py
bullenc Jan 2, 2025
ef7156a
Update gdc_file_upload.py
bullenc Jan 2, 2025
ae24240
Update gdc_file_upload.py
bullenc Jan 2, 2025
b59f5ff
Update gdc_file_upload.py
bullenc Jan 2, 2025
775387e
Update gdc_file_upload.py
bullenc Jan 2, 2025
f5f082b
Update gdc_file_upload.py
bullenc Jan 2, 2025
8d9197d
Update gdc_file_upload.py
bullenc Jan 2, 2025
d8b45f7
Update gdc_file_upload.py
bullenc Jan 2, 2025
af5d980
Update gdc_file_upload.py
bullenc Jan 2, 2025
42e38de
Update gdc_file_upload.py
bullenc Jan 2, 2025
3cab38b
Update gdc_file_upload.py
bullenc Jan 2, 2025
23ef0b9
Update gdc_file_upload.py
bullenc Jan 2, 2025
b766272
Update gdc_file_upload.py
bullenc Jan 2, 2025
29a0170
Update gdc_file_upload.py
bullenc Jan 2, 2025
4a2c472
Update prefect.yaml
bullenc Jan 2, 2025
51c46cf
Update gdc_file_upload.py
bullenc Jan 2, 2025
e28b592
Update gdc_file_upload.py
bullenc Jan 2, 2025
0c417e9
Update gdc_file_upload.py
bullenc Jan 2, 2025
7e716c6
Update gdc_file_upload.py
bullenc Jan 2, 2025
9b12b00
Update gdc_file_upload.py
bullenc Jan 2, 2025
695b48b
Update gdc_file_upload.py
bullenc Jan 2, 2025
360d7e6
Update gdc_file_upload.py
bullenc Jan 2, 2025
d7166aa
Update gdc_file_upload.py
bullenc Jan 2, 2025
b5042c5
Update gdc_file_upload.py
bullenc Jan 2, 2025
9cc6823
Update gdc_file_upload.py
bullenc Jan 2, 2025
dd2e9e9
Update gdc_file_upload.py
bullenc Jan 2, 2025
a3de205
Create gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
b1c8138
gdc-client method
bullenc Jan 3, 2025
f3344fd
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
c304f7a
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
a0695f1
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
9ebfcde
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
ac1ed3f
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
7155720
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
4c26d82
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
d919a28
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
a64bc69
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
9f5ce80
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
46c8f0f
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
66f2c3a
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
1e3a080
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
9a96df8
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
3aed06b
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
a51a331
Update gdc_file_upload_gdc_client.py
bullenc Jan 3, 2025
a417eba
Update gdc_file_upload_gdc_client.py
bullenc Jan 7, 2025
ae0484e
Update gdc_file_upload_gdc_client.py
bullenc Jan 7, 2025
ae5c533
enforce n process limit for >= 7GB file uploads
bullenc Jan 8, 2025
cd26dd6
Update gdc_file_upload_gdc_client.py
bullenc Jan 8, 2025
e0ce19e
Update gdc_file_upload_gdc_client.py
bullenc Jan 8, 2025
389da9e
Update gdc_file_upload_gdc_client.py
bullenc Jan 10, 2025
38 changes: 37 additions & 1 deletion prefect.yaml
@@ -879,4 +879,40 @@ deployments:
     work_pool:
       name: ccdi-curation-8gb
       work_queue_name: null
-      job_variables: {}
+      job_variables: {}
+
+  - name: ccdi-gdc-file-upload
+    version: null
+    tags: [ "TEST" ]
+    description: null
+    entrypoint: workflows/gdc_file_upload_gdc_client.py:runner
+    schedule: null
+
+    # flow-specific parameters
+    parameters:
+      bucket: "ccdi-validation"
+      project_id: "CCDI-MCI"
+      manifest_path: ""
+      gdc_client_path: ""
+      runner: ""
+      secret_key_name: "gdc-token"
+      upload_part_size_mb: 1
+      max_n_processes: 4
+
+
+    # pull action to overwrite from file pull action
+    pull:
+      - prefect.projects.steps.git_clone_project:
+          id: clone-step
+          repository: https://github.com/CBIIT/ChildhoodCancerDataInitiative-Prefect_Pipeline.git
+          branch: CCDIDC-1504_GDC_File_Upload
+      - prefect.projects.steps.pip_install_requirements:
+          requirements_file: requirements.txt
+          directory: "{{ clone-step.directory }}"
+          stream_output: False
+
+    # infra-specific fields
+    work_pool:
+      name: ccdi-curation-32gb
+      work_queue_name: null
+      job_variables: {}
14 changes: 14 additions & 0 deletions src/utils.py
@@ -1382,3 +1382,17 @@ def ccdi_to_dcf_index(ccdi_manifest: str) -> tuple:
     del combined_df

     return output_filename, log_name
+
+def sanitize_return(err_string: str, remove_value_list: list):
+    """Sanitize a string of provided substrings to remove.
+
+    Args:
+        err_string (str): String to remove any secret or sensitive substrings from
+        remove_value_list (list): List of substrings to remove
+
+    Returns:
+        str: Sanitized string
+    """
+    for remove_value in remove_value_list:
+        err_string = err_string.replace(remove_value, "")
+    return err_string
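
A minimal usage sketch for `sanitize_return`, assuming it is called on an error message that may echo a token back to the logs. The function body is copied from the hunk above; the token value and error text are hypothetical and do not come from this PR.

```python
def sanitize_return(err_string: str, remove_value_list: list) -> str:
    """Copy of the helper added to src/utils.py in this PR."""
    for remove_value in remove_value_list:
        err_string = err_string.replace(remove_value, "")
    return err_string


# Hypothetical values for illustration only; not taken from the PR.
fake_token = "abc123-not-a-real-token"
raw_error = f"401 Unauthorized: header X-Auth-Token={fake_token} was rejected"

# Strip every occurrence of the token before the message is logged or returned.
cleaned = sanitize_return(raw_error, [fake_token])
print(cleaned)
```

Because the helper uses `str.replace`, every occurrence of each listed substring is removed, and an empty string in the list leaves the message unchanged.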