Release v0.19.0 · pola-rs/r-polars

Breaking changes

Updated rust-polars to unreleased 2024-08-20, after 0.42.0 (#1183).
$describe_plan() and $describe_optimized_plan() are removed. Use
$explain(optimized = FALSE) and $explain(), respectively, instead (#1182).
The parameter inherit_optimization is removed from all functions that had it
(#1183).
In $write_parquet() and $sink_parquet(), the parameter data_pagesize_limit
is renamed data_page_size (#1183).
The LazyFrame method $get_optimization_toggle() is removed, and
$set_optimization_toggle() is renamed $optimization_toggle() (#1183).
In $unpivot(), the parameter streamable is removed (#1183).
Some functions have a parameter future that determines the compatibility level
when exporting Polars' internal data structures. This parameter is renamed
compat_level, which takes FALSE for the oldest flavor (more compatible)
and TRUE for the newest one (less compatible). It can also take an integer
determining a specific compatibility level when more are added in the future.
For now, future = FALSE can be replaced by compat_level = FALSE (#1183).
In $scan_parquet() and $read_parquet(), the default value of
hive_partitioning is now NULL (#1189).
In $dt$epoch(), the argument tu is renamed to time_unit (#1196).
In $fill_nan() for DataFrame, LazyFrame and Expr, the argument is
renamed value (#1198).
$shift_and_fill() is removed and replaced by a new argument fill_value in
$shift(). $shift_and_fill(fill_value, periods) can be replaced by
$shift(n, fill_value) (#1201).
In $shift() for various Expr, the argument periods is renamed n (#1201).
In $clip(), arguments min and max are renamed lower_bound and
upper_bound (#1203).
$clip_min() and $clip_max() are removed. Use $clip() with only
lower_bound or upper_bound instead (#1203).
In $write_csv() and $sink_csv(), the argument quote is renamed
quote_char (#1206).
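
As a rough migration sketch for some of the renames and removals listed above, the snippet below shows the old calls in comments and their 0.19.0 replacements. The data frame and column names are made up for illustration, and the sketch assumes r-polars 0.19.0 is installed.

```r
library(polars)

# Hypothetical data used only for illustration.
df <- pl$DataFrame(a = c(1, 5, 10), b = c(2.5, NaN, 7))

# Before: pl$col("a")$shift_and_fill(0, 1)
shifted <- df$select(pl$col("a")$shift(1, fill_value = 0))

# Before: pl$col("a")$clip_min(2) and pl$col("a")$clip_max(8)
clipped <- df$select(
  pl$col("a")$clip(lower_bound = 2)$alias("a_min"),
  pl$col("a")$clip(upper_bound = 8)$alias("a_max")
)

# The $fill_nan() argument is now called `value`.
filled <- df$fill_nan(value = 0)

# Before: $describe_plan() and $describe_optimized_plan()
lf <- df$lazy()$filter(pl$col("a") > 1)
plan_raw <- lf$explain(optimized = FALSE)
plan_opt <- lf$explain()
```

Passing the new argument names explicitly, as done here, makes it easier to spot any call sites that still use the old names.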

New features

New method $str$extract_many() (#1163).
Converting a nanoarrow_array with zero rows to an RPolarsDataFrame via
as_polars_df() now keeps the original schema (#1177).
$write_parquet() has two new arguments partition_by and
partition_chunk_size_bytes to write a DataFrame to a hive-partitioned
directory (#1183).
New method $bin$size() (#1183).
In $scan_parquet() and $read_parquet(), the parallel argument can take
the new value "prefiltered" (#1183).
$scan_parquet(), $scan_ipc() and $read_parquet() have a new argument
include_file_paths to automatically add a column containing the path to the
source file(s) (#1183).
$scan_ipc() can read a hive-partitioned directory with its new arguments
hive_partitioning, hive_schema, and try_parse_hive_dates (#1183).
$scan_parquet() and $read_parquet() gain two new arguments for more control
over importing hive partitions: hive_schema and try_parse_hive_dates (#1189).
New method $gather_every() for LazyFrame and DataFrame (#1199).
$glimpse() for DataFrame has two new arguments max_items_per_column and
max_colname_length (#1200).
New method $list$sample() (#1204).
New argument coalesce in $join_asof() (#1205).
New argument maintain_order in $list$unique() (#1207).
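
A few of the additions above can be tried on small in-memory data. This is a minimal sketch with made-up data, assuming r-polars 0.19.0. It only uses the methods and arguments named in the list and is not a complete tour of the new features.

```r
library(polars)

# Hypothetical data used only for illustration.
df <- pl$DataFrame(id = 1:6, x = c(10, 20, 30, 40, 50, 60))

# $gather_every(): keep every second row, starting from the first one.
every_other <- df$gather_every(2)

# $list$unique(maintain_order = TRUE): drop duplicates inside each list
# while keeping the order in which elements first appear.
lists <- pl$DataFrame(v = list(c(3, 1, 3, 2, 1)))
uniq <- lists$select(pl$col("v")$list$unique(maintain_order = TRUE))

# $glimpse(): the two new arguments limit the printed items per column
# and the printed width of column names.
df$glimpse(max_items_per_column = 3, max_colname_length = 10)
```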

Other changes

In $unnest() for DataFrame and LazyFrame, the names argument is removed
and replaced by .... This doesn't change the previous behavior, e.g.
df$unnest(names = c("a", "b")) still works (#1170).
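
The two calling styles for $unnest() can be compared side by side. The sketch below uses a made-up frame with two struct columns and assumes r-polars 0.19.0. It relies on the statement above that the names = form keeps working.

```r
library(polars)

# Hypothetical frame with two struct columns, "b" and "a_and_c".
df <- pl$DataFrame(
  a = 1:3,
  b = c("one", "two", "three"),
  c = 4:6
)$select(
  pl$struct("b"),
  pl$struct(c("a", "c"))$alias("a_and_c")
)

# New style: pass the struct columns through `...`.
via_dots <- df$unnest("b", "a_and_c")

# Old style: per the note above, `names =` is still accepted.
via_names <- df$unnest(names = c("b", "a_and_c"))
```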

Full Changelog: v0.18.0...v0.19.0
