Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fractal/NGIO logging #867

Open
tcompa opened this issue Nov 19, 2024 · 1 comment
Open

Fractal/NGIO logging #867

tcompa opened this issue Nov 19, 2024 · 1 comment
Labels

Comments

@tcompa
Copy link
Collaborator

tcompa commented Nov 19, 2024

Re: logging (extracting the discussion from #866, cc @lorenzocerrone @jluethi )


  1. The place where we want to be careful about logging is ngio, because it's a library that will be re-used in other Python modules. The same applies to fractal-tasks-core core library, but loggers defined in fractal-tasks-core tasks are not re-used by anyone outside that task - thus their only requirement is that their format includes a timestamp (unrelated: possibly in a custom timezone).

  2. Logger names are used as logger identifier by the logging library, and calling logging.getLogger("some.logger") twice returns the same object - even from two different modules (see e.g. here).

  3. Right now, fractal-tasks-core logging is configured in a way that does not display the logger name (details in "Where does the logger name appear?" below). This was less relevant in the past, because most/all logs came from the same logger (fractal-tasks-core). Now that we are including two different loggers, I suggest we pick reasonable/brief names and display them.

  4. Loggers should not be passed around (across functions or modules) as Python objects. That is, I think we should move from

from ngio.utils import ngio_logger

to

from ngio.utils import ngio_logger_name
ngio_loggger = logging.getLogger(ngio_logger_name)

This is my understanding of https://docs.python.org/3/howto/logging-cookbook.html#using-logging-in-multiple-modules and https://docs.python.org/3/howto/logging-cookbook.html#using-loggers-as-attributes-in-a-class-or-passing-them-as-parameters, and that's also how we are typically setting up things in fractal repositories.

  1. In general, I don't think we should ever import/use the ngio logger in fractal-tasks-core tasks.
    The reason becomes more evident in a case where we do show logger names:
# This seems correct, with different loggers for the current task and for ngio operations
2024-10-10 10:10:10 - fractal_tasks_core.tasks.projection - INFO - start task
2024-10-10 10:10:11 - fractal_tasks_core.tasks.projection - INFO - zarr_url=/some/ome.zarr/B/03/0
2024-10-10 10:10:12 - fractal_tasks_core.tasks.projection - INFO - preliminary checks are OK
2024-10-10 10:10:13 - ngio.loader - WARNING - I found the wrong dtype in some zarr array, continue anyway
2024-10-10 10:10:14 - fractal_tasks_core.tasks.projection - INFO - end task

# This seems wrong: the "start task" does not belong to ngio
2024-10-10 10:10:10 - ngio.something - INFO - start task
2024-10-10 10:10:11 - ngio.something - INFO - zarr_url=/some/ome.zarr/B/03/0
2024-10-10 10:10:12 - ngio.something - INFO - preliminary checks are OK
2024-10-10 10:10:13 - ngio.loader - WARNING - I found the wrong dtype in some zarr array, continue anyway
2024-10-10 10:10:14 - ngio.something - INFO - end task

(note that I showed the most complete example where logger names also include the specific module, but they could also simply be projection_task and ngio).

Where does the logger name appear?

When does run_fractal_task use the logger name? (by @jluethi)

The logger name determines which logger will be used in this block

    # Run task
    logger.info(f"START {task_function.__name__} task")
    metadata_update = task_function(**pars)
    logger.info(f"END {task_function.__name__} task")

However, this logger_name is currently not displayed in any log, due to our logging configuration (defined in __init__.py).
This snippet clarifies how we are configuring logging now, and how we could change it if we wanted to display the name:

import logging

# Create two loggers
logger1 = logging.getLogger("logger_name_1")
logger2 = logging.getLogger("logger_name_2")

# Use the configuration which is currently in use for fractal-tasks-core loggers
logging.basicConfig(level=logging.INFO, format="%(asctime)s; %(levelname)s; %(message)s")

logger1.error("This is an error from logger 1")
# 2024-11-19 09:00:55,532; ERROR; This is an error from logger 1

logger2.error("This is an error from logger 2")
# 2024-11-19 09:00:55,532; ERROR; This is an error from logger 2

# Replace configuration with a new one, which includes logger name
logging.basicConfig(level=logging.INFO, format="%(asctime)s; %(name)s; %(levelname)s; %(message)s", force=True)

logger1.error("This is an error from logger 1")
# 2024-11-19 09:00:55,532; logger_name_1; ERROR; This is an error from logger 1

logger2.error("This is an error from logger 2")
# 2024-11-19 09:00:55,532; logger_name_2; ERROR; This is an error from logger 2

This is also clear in the following task logs, where the logger name does not appear anywhere:

2024-11-18 12:53:24,431; INFO; START cellvoyager_to_ome_zarr_init task
2024-11-18 12:53:24,431; INFO; [glob_with_multiple_patterns] include_patterns=['*.png']
2024-11-18 12:53:24,438; INFO; [glob_with_multiple_patterns] Found 4 items
2024-11-18 12:53:24,438; INFO; Creating 20200812-CardiomyocyteDifferentiation14-Cycle1.zarr
2024-11-18 12:53:24,438; INFO; Start open_zarr_group_with_overwrite (overwrite=True).
2024-11-18 12:53:24,452; INFO; Zarr group /data/zarrs/1731934120-1/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr already exists, with keys=['B']
2024-11-18 12:53:24,477; INFO; Read /data/images/10.5281_zenodo.8287221/MeasurementData.mlf, apply following include patterns to image filenames: None apply the following exlcude patterns to image filenames: None
2024-11-18 12:53:24,499; INFO; Expected number of images for well B03: 4
2024-11-18 12:53:24,500; INFO; removing FOV overlaps for well='B03'
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] include_patterns=['20200812-CardiomyocyteDifferentiation14-Cycle1_*.png']
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] Found 4 items
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] include_patterns=['20200812-CardiomyocyteDifferentiation14-Cycle1_B03_*.png']
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] Found 4 items
/data/remote/tasks/1/fractal-tasks-core/1.3.2/venv/lib/python3.10/site-packages/pydantic/main.py:364: UserWarning: Pydantic serializer warnings:
  Expected `int` but got `int64` - serialized value may not be as expected
  return self.__pydantic_serializer__.to_python(
2024-11-18 12:53:24,531; INFO; END cellvoyager_to_ome_zarr_init task
@jluethi
Copy link
Collaborator

jluethi commented Nov 25, 2024

Conclusion:

  • Tasks will use their own loggers [template can set those defaults]
  • Logger level warning for ngio is good => we can overall reduce the number of logs
  • Logger level setting via env variable allows debugging even in Fractal server
  • We can unify/update logger config (e.g. import from ngio)

In the future, we may share more logging configs (e.g. timezone handling) through this (e.g. having them defined in ngio and tasks importing from there)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

No branches or pull requests

2 participants