-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Replace unarchiving function #6959
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6959 +/- ##
===========================================
- Coverage 88.11% 68.13% -19.99%
===========================================
Files 1592 633 -959
Lines 62337 30659 -31678
Branches 2012 262 -1750
===========================================
- Hits 54928 20888 -34040
- Misses 7074 9711 +2637
+ Partials 335 60 -275
Continue to review full report in Codecov by Sentry.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the idea behing this change?
Also I was wondering:
- the dask-sidecar checks the input file mime type. if it is set to zip, then it does not uncompress
services/dynamic-sidecar/src/simcore_service_dynamic_sidecar/modules/nodeports.py
Outdated
Show resolved
Hide resolved
services/dynamic-sidecar/src/simcore_service_dynamic_sidecar/modules/nodeports.py
Outdated
Show resolved
Hide resolved
services/dynamic-sidecar/src/simcore_service_dynamic_sidecar/modules/nodeports.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please read my additional comments. I don't undersatnd how you wanna present this error to the user.
services/dynamic-sidecar/src/simcore_service_dynamic_sidecar/modules/nodeports.py
Outdated
Show resolved
Hide resolved
services/dynamic-sidecar/src/simcore_service_dynamic_sidecar/modules/nodeports.py
Outdated
Show resolved
Hide resolved
services/dynamic-sidecar/src/simcore_service_dynamic_sidecar/modules/nodeports.py
Outdated
Show resolved
Hide resolved
Quality Gate failedFailed conditions |
mkdir -p /tmp/7zip | ||
cd /tmp/7zip | ||
|
||
curl -LO https://www.7-zip.org/a/7z${SEVEN_ZIP_VERSION}-linux-x64.tar.xz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: why not to have our own image with this already in place instead of downloading and installing it every time?
|
||
file_count = await _get_file_count(zip_path) | ||
|
||
command = f"7z x {zip_path} -o{output_dir} -bb1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be careful not to end up with segmented paths in the CLI ...
Normally subprocess
api allows you to pass the command as a list of arguments. Here I read that it has to be a str
. So perhaps the way to go is to add quotes around the paths? (double check)
Something like
command = f"7z x \"{zip_path}\" -o \"{output_dir}\" -bb1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and long option names? 'x' '-o' --> '--output' ?
and it is indeed usually better to use the list of strings if possible
assert process.stderr # nosec | ||
|
||
async with progress_bar.sub_progress( | ||
steps=file_count, description=IDStr("...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the first time I see using IDStr
for descriptions ...
NOTE that:
- Calling
IDStr("...")
is equivalent to callingstr("..."). Therefore in this context: it basically does nothing. To perform a validation you need to create a
TypeAdapter(IDStgr).validate_python("...")` etc - If you want to annotate a field . Then note that
IDStr
is intended for annotation of String IDentifiers (e.g. aprj_123456
) and not descriptions! There are other constraints for e.g.ShortTruncatedStr
orLongTruncatedStr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pcrespov I think this is a flaw in the ProgressBar class. I guess I did it wrong at the time. but indeed I would just remove that IDStr from there it makes no sense. But probably should go in a separate PR, maybe an issue for this would be nice.
|
||
|
||
@pytest.fixture | ||
def to_archive_dir(tmpdir: LocalPath) -> Path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is like a deja vu from to year one's reviews :-)
We use everywhere the tmp_path
version of this fixture which produces pathlib.Path
instead of tmpdir
which produces a custom version of pytest LocalPath
Please read https://docs.pytest.org/en/6.2.x/tmpdir.html#the-tmp-path-fixture and use tmp_path: Path
instead of tmpdir: LocalPath
try: | ||
subprocess.check_output(["7z", "--help"]) # noqa: S607, S603 | ||
except Exception: # pylint: disable=broad-except | ||
pytest.skip("7z is not installed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to add in the dynamic-sidecar Makefile
some recipe to install this. Perhaps add it in the skip message or simply make a reference to ./ci/github/helpers/install_7zip.bash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in the end we replace the zip library just for unzipping correct?
Why not replacing the zip library for unzipping completely instead?
How are they comparable performance wise?
@@ -1304,6 +1304,8 @@ jobs: | |||
cache-dependency-glob: "**/dynamic-sidecar/requirements/ci.txt" | |||
- name: show system version | |||
run: ./ci/helpers/show_system_versions.bash | |||
- name: install 7zip | |||
run: sudo ./ci/github/helpers/install_7zip.bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this done here and not in the dynamic-sidecar.bash script?
|
||
|
||
class SevenZipError(BaseDynamicSidecarError): | ||
msg_template = "Could not finish command: '{command}'\nReason: {command_result}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
msg_template = "Could not finish command: '{command}'\nReason: {command_result}" | |
msg_template = "Could not complete 7zip command: '{command}'\nReason: {command_result}" |
@@ -36,6 +36,7 @@ | |||
|
|||
from ..core.settings import ApplicationSettings, get_settings | |||
from ..modules.notifications import PortNotifier | |||
from .seven_zip_wrapper import unarchive_zip_to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it not make sense to put that one also in servicelib?
Also if you ensure compatible interfaces, that would even call for a common interface?
@@ -95,7 +96,7 @@ async def upload_outputs( # pylint:disable=too-many-statements # noqa: PLR0915 | |||
port_notifier: PortNotifier, | |||
) -> None: | |||
# pylint: disable=too-many-branches | |||
logger.debug("uploading data to simcore...") | |||
_logger.debug("uploading data to simcore...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for this kind of logs, I guess you could use the log decorator and spare 1 line of code
async def _move_file_to_input_port( | ||
final_path: Path, downloaded_file: Path, dest_folder: PrunableFolder | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function named _move_file_to_input_port
takes arguments that do not have any port in their name. even the function does not check anything port related...
actually I don't even really understand what dest_folder
is? was that the original folder where the downloaded_file is living?
so this funciton
- moves downloaded_file into final_path
- then prunes dest_folder exluding final_path
maybe a bit of refactoring is in order? at least with naming the arguments?
if line_decoded.startswith("- "): # check file entry | ||
await sub_prog.update(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohla.. hope that is tested
|
||
|
||
@pytest.fixture | ||
def internal_tools_unarchived_tools(tmpdir: LocalPath) -> Path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a very weird name...
def generate_content( | ||
to_archive_dir: Path, sub_dirs: int, files_in_subdirs: int | ||
) -> None: | ||
for i in range(sub_dirs): | ||
(to_archive_dir / f"s{i}").mkdir(parents=True, exist_ok=True) | ||
for k in range(files_in_subdirs): | ||
(to_archive_dir / f"s{i}" / f"{k}.txt").write_text("a" * k) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we have already such functions in /home/sanderegg/dev/github/osparc-simcore/packages/pytest-simcore/src/pytest_simcore/file_extra.py ?
print(f"{entry}") | ||
|
||
|
||
def _strip_folder_from_path(paths: set[Path], *, to_strip: Path) -> set[Path]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fail to really see the added value of a function that calls a one liner ;) but ok
pytest.param(5, 4, id="few_items"), | ||
], | ||
) | ||
async def test_ensure_same_interface_as_unarchive_dir( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just use a common interface and not have to test that? but just run the usual archive/unarchive tests with standard zip and 7zip?
What do these changes do?
When a new style dynamic service has to fetch an input port containing a zip archive, it will try to unzip it.
If it's not able to open it (see issue here), the service would fail to start.
To fix the issue the unarchiving method is replaced with
7zip
. The function signature was not changed for the purpose of making it compatible with the previously used version.Related issue/s
How to test
Dev-ops checklist