Releases: JDASoftwareGroup/kartothek
Kartothek v5.3.0
Version 5.3.0 (2021-12-10)
- Add deprecation warnings and migration helpers to facilitate the
  migration to Kartothek version 6.0.0.
- Removed warning for distinct categoricals (#501)
Kartothek v5.2.0
Version 5.2.0 (2021-11-22)
- Remove support for Python 3.6
- Allow `pyarrow<7` as a dependency.
Kartothek v5.1.0
Version 5.1.0 (2021-07-05)
- Add `kartothek.io.eager.copy_dataset` to copy and optionally rename
  datasets within one store or between stores (eager only)
- Add renaming option to `kartothek.io.eager_cube.copy_cube`
- Add predicates to cube condition converter to
  `kartothek.utils.predicate_converter`
Kartothek v5.0.0
Version 5.0.0 (2021-06-23)
This release rolls all the changes introduced with 4.x back to 3.20.0.
As the incompatibility between 4.0 and 5.0 will be an issue for some
customers, we encourage you to use the very stable kartothek 3.20.0 and
not version 4.x.
Please refer to issue #471 for further information.
Kartothek v5.0.0rc1
Version 5.0.0 (2021-05-xx)
This release rolls all the changes introduced with 4.x back to 3.20.0.
As the incompatibility between 4.0 and 5.0 will be an issue for some
customers, we encourage you to use the very stable kartothek 3.20.0 and
not version 4.x.
Please refer to issue #471 for further information.
Kartothek v4.0.3
Kartothek 4.0.3 (2021-06-10)
- Pin dask to not use 2021.5.1 and 2020.6.0 (#475)
Kartothek v4.0.1
Kartothek 4.0.1 (2021-04-13)
- Fixed dataset corruption after updates when table names other than
"table" are used (#445).
Kartothek v4.0.0
Kartothek 4.0.0 (2021-03-17)
This is a major release of kartothek with breaking API changes.
- Removal of complex user input (see gh427)
- Removal of multi table feature
- Removal of `kartothek.io.merge` module
- The class `kartothek.core.dataset.DatasetMetadata` now has an
  attribute called `schema`, which replaces the previous attribute
  `table_meta` and returns only a single schema
- All outputs which previously returned a sequence of dictionaries,
  where each key-value pair would correspond to a table-data pair, now
  return only one `pandas.DataFrame`
- All read pipelines will now automatically infer the table to read
  such that it is no longer necessary to provide `table` or
  `table_name` as an input argument
- All writing pipelines which previously supported a complex user
  input type now expose an argument `table_name` which can be used to
  continue usage of legacy datasets (i.e. datasets with an intrinsic,
  non-trivial table name). This usage is discouraged and we recommend
  that users migrate to a default table name (i.e. leave it `None` /
  `table`)
- All pipelines which previously accepted an argument `tables` to
  select the subset of tables to load no longer accept this keyword.
  Instead the to-be-loaded table will be inferred
- Trying to read a multi-tabled dataset will now cause an exception
  telling users that this is no longer supported with kartothek 4.0
- The dict schema for
  `kartothek.core.dataset.DatasetMetadataBase.to_dict` and
  `kartothek.core.dataset.DatasetMetadata.from_dict` changed, replacing
  a dictionary in `table_meta` with the simple `schema`
- All pipeline arguments which previously accepted a dictionary of
  sequences to describe a table specific subset of columns now accept
  plain sequences (e.g. `columns`, `categoricals`)
- Remove the following list of deprecated arguments for io pipelines
  - `label_filter`
  - `central_partition_metadata`
  - `load_dynamic_metadata`
  - `load_dataset_metadata`
  - `concat_partitions_on_primary_index`
- Remove `output_dataset_uuid` and `df_serializer` from
  `kartothek.io.eager.commit_dataset` since these arguments didn't have
  any effect
- Remove `metadata`, `df_serializer`, `overwrite`, `metadata_merger`
  from `kartothek.io.eager.write_single_partition`
- `kartothek.io.eager.store_dataframes_as_dataset` now requires a list
  as an input
- Default value for argument `date_as_object` is now universally set
  to `True`. The behaviour for `False` will be deprecated and removed
  in the next major release
- No longer allow passing `delete_scope` as a delayed object to
  `kartothek.io.dask.dataframe.update_dataset_from_ddf`
- `kartothek.io.dask.dataframe.update_dataset_from_ddf` and
  `kartothek.io.dask.dataframe.store_dataset_from_ddf` now return a
  `dd.core.Scalar` object. This enables all `dask.DataFrame` graph
  optimizations by default.
- Remove argument `table_name` from
  `kartothek.io.dask.dataframe.collect_dataset_metadata`
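
Callers that still hold table-keyed dictionaries for arguments such as `columns` or `categoricals` may want a small adapter while migrating to 4.0. The following is a minimal sketch and not part of kartothek: the helper name `to_plain_sequence` is hypothetical, and it assumes a legacy value was either a plain sequence or a dictionary keyed by table name.

```python
from collections.abc import Mapping


def to_plain_sequence(value):
    """Convert a legacy table-keyed dictionary into a plain list.

    Hypothetical migration helper, not a kartothek API.
    """
    if value is None:
        return None
    if isinstance(value, Mapping):
        if len(value) > 1:
            # Multi-table input has no single-table equivalent.
            raise ValueError("multi-table input cannot be converted")
        if not value:
            return None
        # Exactly one table: keep its sequence, drop the table name.
        (only_entry,) = value.values()
        return list(only_entry)
    # Already a plain sequence.
    return list(value)
```

For example, `to_plain_sequence({"table": ["a", "b"]})` yields `["a", "b"]`, while a plain list is passed through unchanged.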
Kartothek v3.20.0
Version 3.20.0 (2021-03-15)
This will be the final release in the 3.X series. Please ensure your
existing codebase does not raise any DeprecationWarning from kartothek
and migrate your import paths ahead of time to the new `kartothek.api`
modules to ensure a smooth migration to 4.X.
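
One way to check a codebase for such warnings is to escalate `DeprecationWarning` to an error while exercising the relevant calls. The sketch below uses only the standard library; `legacy_call` is a hypothetical stand-in for code that triggers a deprecated path and is not a kartothek function.

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for a call into a deprecated code path.
    warnings.warn("table_meta is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"


def emits_deprecation_warning(func):
    """Return True if calling func emits a DeprecationWarning."""
    with warnings.catch_warnings():
        # Inside this block, DeprecationWarning is raised as an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False
```

When running a test suite with pytest, a similar effect is available through the `-W error::DeprecationWarning` command-line option.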
- Introduce `kartothek.api` as the public definition of the API. See
  also the versioning documentation.
- Introduce `DatasetMetadataBase.schema` to prepare deprecation of
  `table_meta`
- `kartothek.io.eager.read_dataset_as_dataframes` and
  `kartothek.io.iter.read_dataset_as_dataframes__iterator` now
  correctly return categoricals as requested for misaligned categories.
Kartothek v3.19.1
Version 3.19.1 (2021-02-24)
- Allow `pyarrow==3` as a dependency.
- Fix a bug in `kartothek.io_components.utils.align_categories` for
  dataframes with missing values and of non-categorical dtype.
- Fix an issue with the cube index validation introduced in v3.19.0
  (#413).