Catalog to config #4323
base: main
Signed-off-by: Elena Khaustova <[email protected]>
…catalog-from-to-prototype
Thanks @ElenaKhaustova, really great work! 🌟
@@ -28,6 +28,7 @@ def _create_session(package_name: str, **kwargs: Any) -> KedroSession:


def is_parameter(dataset_name: str) -> bool:
    # TODO: when breaking change move it to kedro/io/core.py
Is this TODO still needed?
"""Converts the `KedroDataCatalog` instance into a configuration format suitable for
serialization. This includes datasets, credentials, and versioning information.
This method is only applicabe to catalogs that contain datasets initialized with static, primitive
Suggested change:
- This method is only applicabe to catalogs that contain datasets initialized with static, primitive
+ This method is only applicable to catalogs that contain datasets initialized with static, primitive
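To illustrate the restriction the docstring describes, here is a minimal sketch of why only static, primitive initialization arguments can be written out as config. The helper `is_primitive_config` and both argument dictionaries are hypothetical and are not part of Kedro; JSON is used only as a convenient stand-in for a serializable config format.

```python
import json
from typing import Any


def is_primitive_config(value: Any) -> bool:
    """Return True if `value` can be serialized to JSON as-is (hypothetical helper)."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


# Arguments built only from strings, numbers, lists and dicts serialize cleanly.
static_args = {"filepath": "data/01_raw/cars.csv", "load_args": {"sep": ","}}

# An argument holding a live object (here a bare `object()`) cannot be serialized.
dynamic_args = {"filepath": "data/01_raw/cars.csv", "client": object()}

print(is_primitive_config(static_args))   # True
print(is_primitive_config(dynamic_args))  # False
```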
kedro/io/catalog_config_resolver.py
Outdated
@@ -237,8 +237,9 @@ def _extract_patterns(

        return sorted_patterns, user_default

    @classmethod
    def _resolve_config_credentials(
Can we drop the word `config` from the name of the method, as well as the method below? What other credentials can they be except config credentials in the context of this class?
# Declares a class-level attribute that will store the initialization
# arguments of an instance. Initially, it is set to None, but it will
# hold a dictionary of arguments after initialization.
_init_args: dict[str, Any] | None = None
Isn't this shared across all instances? How do we make sure we're not overwriting everything with the latest instance's args?
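On the question above, a minimal sketch of how a class-level default behaves when each instance assigns to the same name through `self`. The `Dataset` class here is a hypothetical stand-in, not Kedro's code. Assigning via `self` creates a per-instance attribute that shadows the class default, so instances do not overwrite each other; sharing would only happen if the value were assigned on the class itself or a shared mutable object were modified in place.

```python
from __future__ import annotations

from typing import Any


class Dataset:
    # Class-level default, visible through every instance until shadowed.
    _init_args: dict[str, Any] | None = None

    def __init__(self, **kwargs: Any) -> None:
        # Assignment through `self` creates an instance attribute;
        # the class-level default is left untouched.
        self._init_args = kwargs


a = Dataset(filepath="a.csv")
b = Dataset(filepath="b.csv")

print(a._init_args)        # {'filepath': 'a.csv'}
print(b._init_args)        # {'filepath': 'b.csv'}
print(Dataset._init_args)  # None
```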
method of the instance is called with the arguments used to initialize
the object.
"""

`@wraps(init_func)`, and you need to import it at the top as `from functools import wraps`.
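A short sketch of what `functools.wraps` changes when an `__init__` is replaced by a wrapper. The class `Example` and the two helper functions are hypothetical names used only for this illustration. Without `wraps`, the replacement function reports its own name and has no docstring; with it, the original metadata is copied over and the original function stays reachable through `__wrapped__`.

```python
from functools import wraps
from typing import Any, Callable


class Example:
    def __init__(self, filepath: str) -> None:
        """Store the file path."""
        self.filepath = filepath


def wrap_plain(func: Callable) -> Callable:
    # Wrapper without functools.wraps: metadata of `func` is lost.
    def new_init(self: Any, *args: Any, **kwargs: Any) -> None:
        func(self, *args, **kwargs)

    return new_init


def wrap_with_wraps(func: Callable) -> Callable:
    # Wrapper with functools.wraps: name, docstring, etc. are copied from `func`.
    @wraps(func)
    def new_init(self: Any, *args: Any, **kwargs: Any) -> None:
        func(self, *args, **kwargs)

    return new_init


plain = wrap_plain(Example.__init__)
wrapped = wrap_with_wraps(Example.__init__)

print(plain.__name__, plain.__doc__)      # new_init None
print(wrapped.__name__, wrapped.__doc__)  # __init__ Store the file path.
```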
# Save the original __init__ method of the subclass
init_func: Callable = cls.__init__

def init_decorator(previous_init: Callable) -> Callable:
I believe you can drop the extra wrapper function. This decorator is never applied as a decorator in the code; you just assign the function it returns to `cls.__init__`, so only the `new_init` function is required.
"""

# Call the original __init__ method
previous_init(self, *args, **kwargs)
Suggested change:
- previous_init(self, *args, **kwargs)
+ init_func(self, *args, **kwargs)
if type(self) is cls:
    # Capture and process the arguments passed to the original __init__
    call_args = getcallargs(init_func, self, *args, **kwargs)
    # Call the custom post-initialization method to save captured arguments
I believe this might not be needed if you set up the function as suggested.
# hold a dictionary of arguments after initialization.
_init_args: dict[str, Any] | None = None

def __post_init__(self, call_args: dict[str, Any]) -> None:
Suggested change:
- def __post_init__(self, call_args: dict[str, Any]) -> None:
+ def __post_init__(self, *args, **kwargs) -> None:
Btw, do we even need this method at all? Can't all of this be done by the decorator itself instead of delegating it to a separate method?
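Putting the suggestions in this thread together, one possible shape is sketched below. It is an illustration under assumptions, not the PR's final code: `Base` and `Child` are hypothetical stand-ins for the dataset classes, the single `new_init` wrapper is assigned straight to `cls.__init__`, and the captured arguments are stored by the wrapper itself rather than through a separate `__post_init__` method.

```python
from functools import wraps
from inspect import getcallargs
from typing import Any, Callable


class Base:
    """Hypothetical stand-in for the dataset base class under review."""

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Keep a reference to the subclass's original __init__.
        init_func: Callable = cls.__init__

        @wraps(init_func)
        def new_init(self: Any, *args: Any, **kwargs: Any) -> None:
            # Run the original initializer first.
            init_func(self, *args, **kwargs)
            # Map positional and keyword arguments to parameter names.
            call_args = getcallargs(init_func, self, *args, **kwargs)
            # Dropping `self` is a choice made for this sketch only.
            call_args.pop("self", None)
            # Store the result on the instance, not on the class.
            self._init_args = call_args

        # No outer decorator function: the wrapper replaces __init__ directly.
        cls.__init__ = new_init


class Child(Base):
    def __init__(self, filepath: str, sep: str = ",") -> None:
        self.filepath = filepath
        self.sep = sep


ds = Child("data.csv", sep=";")
print(ds._init_args)  # {'filepath': 'data.csv', 'sep': ';'}
```

Note that `getcallargs` is given the original `init_func`, not the wrapper, so the recorded names are those of the subclass's own signature.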
@@ -484,14 +602,14 @@ def parse_dataset_definition(
     config = copy.deepcopy(config)

     # TODO: remove when removing old catalog as moved to KedroDataCatalog
-    if "type" not in config:
+    if TYPE_KEY not in config:
👍
load_versions: dict[str, str | None] = {}

for ds_name, ds in self._lazy_datasets.items():
    if _is_memory_dataset(ds.config.get("type", "")):
Suggested change:
- if _is_memory_dataset(ds.config.get("type", "")):
+ if _is_memory_dataset(ds.config.get(TYPE_KEY, "")):
...maybe?

def _is_memory_dataset(ds_or_type: AbstractDataset | str) -> bool:
    """Check if dataset or str type provided is a MemoryDataset."""
    if isinstance(ds_or_type, AbstractDataset):
Can't we check directly for a `MemoryDataset` instance here, or is this to avoid circular dependencies?
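For reference, a self-contained sketch of the direct instance check asked about above, combined with the `TYPE_KEY` lookup suggested earlier. Everything here is an assumption made for illustration: the three classes are empty stand-ins rather than Kedro's real ones, `_MEMORY_TYPE_NAMES` is a guessed set of type strings, and the function is named `is_memory_dataset` to mark it as distinct from the PR's helper. Whether the real module can import `MemoryDataset` without a circular dependency is not something this sketch can settle.

```python
from __future__ import annotations

TYPE_KEY = "type"  # same literal the diff above replaces with a constant


class AbstractDataset:
    """Hypothetical stand-in for the dataset base class."""


class MemoryDataset(AbstractDataset):
    """Hypothetical stand-in for the in-memory dataset."""


class CSVDataset(AbstractDataset):
    """Hypothetical stand-in for any non-memory dataset."""


# Assumed spellings of the type string; the real set may differ.
_MEMORY_TYPE_NAMES = {"MemoryDataset", "kedro.io.MemoryDataset"}


def is_memory_dataset(ds_or_type: AbstractDataset | str) -> bool:
    """Check whether a dataset instance or a type string denotes a memory dataset."""
    if isinstance(ds_or_type, AbstractDataset):
        # Direct instance check, as proposed in the review comment.
        return isinstance(ds_or_type, MemoryDataset)
    # Otherwise treat the argument as a type string taken from config.
    return ds_or_type in _MEMORY_TYPE_NAMES


config = {TYPE_KEY: "MemoryDataset"}
print(is_memory_dataset(config.get(TYPE_KEY, "")))  # True
print(is_memory_dataset(MemoryDataset()))           # True
print(is_memory_dataset(CSVDataset()))              # False
print(is_memory_dataset({}.get(TYPE_KEY, "")))      # False
```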
Description
Made on top of #4347 (review it first)
Implementation of #4329
Full context: #3932 (comment)
TODO:
- `save_version` via catalog constructor and when passing datasets #4327, PR: Validate datasets versions #4347
- `versioned` flag and dataset parameter #4326
- `AbstractDataset.to_config()`
- `kedro-datasets`: Datasets accept non-primitive parameters in the `__init__` kedro-plugins#950

Development notes
To run: `pytest tests/io/test_kedro_data_catalog.py::TestKedroDataCatalog::TestKedroDataCatalogFromConfig`, or see an example from the "How to test" section in #4329.

Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a `Signed-off-by` line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist
- `RELEASE.md` file