Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backup tools: rsync progress; logging change #166

Merged
merged 9 commits into from
Feb 22, 2024

Conversation

eimrek
Copy link
Member

@eimrek eimrek commented Feb 14, 2024

Hi,

this PR contains two changes:

1) remove the logger argument from BackupManager, and instead only use the logger object of this library.

This change is motivated by the fact that in aiida-core, the default logging level is REPORT (23) and the logger.info messages of this library are not outputted, which give information about the progress and should probably be shown by default. After these changes, the logger instead should be controlled in the aiida-core via something like

    import logging
    from aiida.common.log import LOG_LEVEL_REPORT
    DOS_BACKUP_LOGGER = logging.getLogger("disk_objectstore.backup_utils")
    # As the default logging level in AiiDA is a custom level REPORT (23), adapt
    # the disk-objectstore logger to output the INFO (20) level by default as well
    storage_logger_level = STORAGE_LOGGER.getEffectiveLevel()
    if storage_logger_level  >= logging.INFO and storage_logger_level <= LOG_LEVEL_REPORT:
        DOS_BACKUP_LOGGER.setLevel(logging.INFO)
    else:
        DOS_BACKUP_LOGGER.setLevel(storage_logger_level)
    
    handler = logging.StreamHandler()
    formatter = logging.Formatter('%(name)s: [%(levelname)s] %(message)s')
    handler.setFormatter(formatter)

    DOS_BACKUP_LOGGER.addHandler(handler)

Although, perhaps there is a better way to do it, similarly to the other loggers (as is done in aiida.common.log.py).

In principle, I can also revert to using the logger argument, but then i probably should also create a custom log level called REPORT in this library, if i want to log consistently with aiida-core.

2) Enable rsync progress indication (with --info=progress2) and output the stdout directly to the user, without capturing it

If i call subprocess.run(rsync_args, capture_output=False), the stdout of rsync is passed directly from the subprocess to the user, bypassing any logging system. Although this has drawbacks (e.g. it won't show up in logs, e.g. when redirecting to a file), it allows to show the updating progress indicator to the user running the CLI sees. The output is captured and not shown, however, if the logging level is above INFO. The motivation behind why I just bypass capturing the output is that if the stdout is captured, it's not possible (i think; or at least not without writing a lot of code) to output a in-place updating progress indicator via the logging system.

As this has the drawback, I think it would be good to discuss and get opinions (@giovannipizzi @sphuber).

Two possible alternatives would be

  • capturing the progress output and logging each line of the progress updates. the user will get a lot of output, like
disk_objectstore.backup_utils: [INFO] 32.77K  96%   15.29MB/s    0:00:00 (xfr#2, to-chk=0/14
disk_objectstore.backup_utils: [INFO] 32.77K  97%   15.29MB/s    0:00:00 (xfr#2, to-chk=0/14
disk_objectstore.backup_utils: [INFO] 32.77K  98%   15.29MB/s    0:00:00 (xfr#2, to-chk=0/14
...
  • capturing the output of rsync, parsing it and getting the progress percentage, and implementing some progressbar in python (e.g. tqdm), similar to what is done in aiida-core. But perhaps this is a lot of code and would bloat the library a bit too much.

Copy link

codecov bot commented Feb 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (23c784a) 99.90% compared to head (34bb0de) 99.90%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #166   +/-   ##
=======================================
  Coverage   99.90%   99.90%           
=======================================
  Files          10       10           
  Lines        2066     2085   +19     
=======================================
+ Hits         2064     2083   +19     
  Misses          2        2           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@giovannipizzi
Copy link
Member

I think that not capturing the output of rsync is OK as the best solution among those you describe, but happy to hear if @sphuber also agrees or not

@eimrek eimrek requested a review from sphuber February 22, 2024 14:59
@eimrek
Copy link
Member Author

eimrek commented Feb 22, 2024

@sphuber This should be good now, give it a quick check if you have time.

@sphuber
Copy link
Contributor

sphuber commented Feb 22, 2024

To summarize the discussion @eimrek and myself had: ideally the progress of rsync also goes through the logging system as then it can be fully controlled by the caller, e.g. aiida-core. However, the current progress reporting mechanism in aiida-core doesn't even go through the logging, so it doens't make sense to invest the time now to do it here. So for now, we will go simply with letting rsync's progress go uncaptured straight to stdout

tests/test_backup.py Outdated Show resolved Hide resolved
disk_objectstore/backup_utils.py Outdated Show resolved Hide resolved
@eimrek eimrek requested a review from sphuber February 22, 2024 15:40
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eimrek , looks good to me

@sphuber sphuber merged commit d23dac4 into aiidateam:main Feb 22, 2024
17 checks passed
@eimrek eimrek deleted the backup-updates branch February 23, 2024 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants