Releases: pola-rs/r-polars
v0.21.0
Breaking changes
- Updated Rust Polars to 0.44.2 (#1271).
- Minimum supported Rust version (MSRV) is now 1.82.0.
- `$reshape()`'s `nested_type` argument is removed.
- `$approx_n_unique()` no longer works on Categorical type.
- `<Series>$compare()` is removed. (#1272)
Deprecations
- Passing a single data.frame to `pl$DataFrame()` or `pl$LazyFrame()` to convert a
  data.frame to a polars DataFrame or LazyFrame is deprecated and a warning will
  be shown. Use `as_polars_df()` or `as_polars_lf()` instead (#1275).
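A minimal migration sketch for the deprecation above, assuming the polars R package at this version is installed. `mtcars` is only a stand-in data.frame chosen for illustration.

```r
library(polars)

# Deprecated: pl$DataFrame(mtcars) or pl$LazyFrame(mtcars) now emit a warning.
# Preferred: convert an existing data.frame with the as_polars_*() functions.
df <- as_polars_df(mtcars)
lf <- as_polars_lf(mtcars)

# Building a DataFrame from individual columns is not affected.
small <- pl$DataFrame(a = 1:3, b = c("x", "y", "z"))
```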
Bug fixes
- Maintain level order when converting Enums to factors (#1252, @andyquinterom).
Full Changelog: v0.19.0...v0.21.0
lib-v0.44.0
fix: fix csv download issue on windows (#1292)
v0.20.0
Breaking changes
- Updated rust-polars to 0.43.1 (#1230).
- In `pl$scan_ipc()` and `pl$read_ipc()`, the argument `memory_map` is removed
  (#1230).
- In `$serialize()`, in the field `schema`, the field `inner` is renamed `fields`,
  and the fields `output_schema` and `filter` are removed (#1230).
New features
- New method `$cast()` for `DataFrame` and `LazyFrame` (#1219).
- New argument `strict` in `$drop()` to determine whether unknown column names
  should trigger an error (#1220).
- New method `$to_dummies()` for `DataFrame` (#1225).
- New argument `include_file_paths` in `pl$scan_csv()` and `pl$read_csv()` (#1235).
- New method `$join_where()` for `DataFrame` and `LazyFrame` to perform
  inequality joins (#1237).
Bug fixes
- Converting data of datatype `Null` to R doesn't error anymore. It now creates
  a column filled with `NA` (#1217).
Full Changelog: v0.19.0...v0.20.0
lib-v0.43.0
test: the latest nanoarrow supports utf8view type (#1257)
v0.19.1
lib-v0.42.1
docs: fix some typos in DEVELOPMENT.md (#1211)
v0.19.0
Breaking changes
- Updated rust-polars to unreleased 2024-08-20, after 0.42.0 (#1183).
- `$describe_plan()` and `$describe_optimized_plan()` are removed. Use
  `$explain(optimized = FALSE)` and `$explain()` respectively instead (#1182).
- The parameter `inherit_optimization` is removed from all functions that had it
  (#1183).
- In `$write_parquet()` and `$sink_parquet()`, the parameter `data_pagesize_limit`
  is renamed `data_page_size` (#1183).
- The LazyFrame method `$get_optimization_toggle()` is removed, and
  `$set_optimization_toggle()` is renamed `$optimization_toggle()` (#1183).
- In `$unpivot()`, the parameter `streamable` is removed (#1183).
- Some functions have a parameter `future` that determines the compatibility level
  when exporting Polars' internal data structures. This parameter is renamed
  `compat_level`, which takes `FALSE` for the oldest flavor (more compatible)
  and `TRUE` for the newest one (less compatible). It can also take an integer
  determining a specific compatibility level when more are added in the future.
  For now, `future = FALSE` can be replaced by `compat_level = FALSE` (#1183).
- In `$scan_parquet()` and `$read_parquet()`, the default value of
  `hive_partitioning` is now `NULL` (#1189).
- In `$dt$epoch()`, the argument `tu` is renamed to `time_unit` (#1196).
- In `$fill_nan()` for `DataFrame`, `LazyFrame` and `Expr`, the argument is
  renamed `value` (#1198).
- `$shift_and_fill()` is removed and replaced by a new argument `fill_value` in
  `$shift()`. `$shift_and_fill(fill_value, periods)` can be replaced by
  `$shift(n, fill_value)` (#1201).
- In `$shift()` for various `Expr`, the argument `periods` is renamed `n` (#1201).
- In `$clip()`, arguments `min` and `max` are renamed `lower_bound` and
  `upper_bound` (#1203).
- `$clip_min()` and `$clip_max()` are removed. Use `$clip()` with only
  `lower_bound` or `upper_bound` instead (#1203).
- In `$write_csv()` and `$sink_csv()`, the argument `quote` is renamed
  `quote_char` (#1206).
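A short sketch of how the `$shift()` and `$clip()` changes above might look in user code, assuming the polars R package at this version is installed. The column names and values are made up for illustration.

```r
library(polars)

df <- pl$DataFrame(x = c(1, 5, 10))

out <- df$with_columns(
  # Previously: pl$col("x")$shift_and_fill(0, 1)
  shifted = pl$col("x")$shift(n = 1, fill_value = 0),
  # Previously: pl$col("x")$clip(min = 2, max = 8)
  clipped = pl$col("x")$clip(lower_bound = 2, upper_bound = 8),
  # Previously: pl$col("x")$clip_min(2)
  floor_only = pl$col("x")$clip(lower_bound = 2)
)

print(out)
```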
New features
- New method `$str$extract_many()` (#1163).
- Converting a `nanoarrow_array` with zero rows to an `RPolarsDataFrame` via
  `as_polars_df()` now keeps the original schema (#1177).
- `$write_parquet()` has two new arguments `partition_by` and
  `partition_chunk_size_bytes` to write a `DataFrame` to a hive-partitioned
  directory (#1183).
- New method `$bin$size()` (#1183).
- In `$scan_parquet()` and `$read_parquet()`, the `parallel` argument can take
  the new value `"prefiltered"` (#1183).
- `$scan_parquet()`, `$scan_ipc()` and `$read_parquet()` have a new argument
  `include_file_paths` to automatically add a column containing the path to the
  source file(s) (#1183).
- `$scan_ipc()` can read a hive-partitioned directory with its new arguments
  `hive_partitioning`, `hive_schema`, and `try_parse_hive_dates` (#1183).
- `$scan_parquet()` and `$read_parquet()` gain two new arguments for more control
  on importing hive partitions: `hive_schema` and `try_parse_hive_dates` (#1189).
- New method `$gather_every()` for `LazyFrame` and `DataFrame` (#1199).
- `$glimpse()` for `DataFrame` has two new arguments `max_items_per_column` and
  `max_colname_length` (#1200).
- New method `$list$sample()` (#1204).
- New argument `coalesce` in `$join_asof()` (#1205).
- New argument `maintain_order` in `$list$unique()` (#1207).
Other changes
- In `$unnest()` for `DataFrame` and `LazyFrame`, the `names` argument is removed
  and replaced by `...`. This doesn't change the previous behavior, e.g.
  `df$unnest(names = c("a", "b"))` still works (#1170).
Full Changelog: v0.18.0...v0.19.0
lib-v0.42.0
chore: bump serde_json from 1.0.125 to 1.0.127 in /src/rust (#1209)
v0.18.0
Breaking changes
- Updated rust-polars to 0.41.3 (#1147, #1156).
- In `$n_chunks()`, the default value of `strategy` now is `"first"` (#1137).
- `$sample()` for Expr and DataFrame (#1136):
  - the argument `frac` is renamed `fraction`;
  - all the arguments except `n` must be named;
  - for the Expr method only, the first argument is now `n` (it was already the
    case for the DataFrame method);
  - for the Expr method only, the default value for `with_replacement` is now
    `FALSE` (it was already the case for the DataFrame method).
- `$melt()` had several changes (#1147):
  - `melt()` is renamed `$unpivot()`.
  - Some arguments were renamed: `id_vars` is now `index`, `value_vars` is now
    `on`.
  - The order of arguments has changed: `on` is now first, then `index`. The
    order of the other arguments hasn't changed. Note that `on` can be unnamed
    but all the other arguments must be named.
- `pivot()` had several changes (#1147):
  - The argument `columns` is renamed `on`.
  - The order of arguments has changed: `on` is now first, then `index` and
    `values`. The order of the other arguments hasn't changed. Note that `on`
    can be unnamed but all the other arguments must be named.
- In `$write_parquet()` and `$sink_parquet()`, the default value of argument
  `statistics` is now `TRUE` and can take other values than `TRUE/FALSE` (#1147).
- In `$dt$truncate()` and `$dt$round()`, the argument `offset` has been removed.
  Use `$dt$offset_by()` after those functions instead (#1147).
- In `$top_k()` and `$bottom_k()` for `Expr`, the arguments `nulls_last`,
  `maintain_order` and `multithreaded` have been removed. If any `null` values
  are in the top/bottom `k` values, they will always be positioned last (#1147).
- `$replace()` has been split in two functions depending on the desired
  behaviour (#1147):
  - `$replace()` recodes some values in the column, leaving all other values
    unchanged. Compared to the previous version, it doesn't use the arguments
    `default` and `return_dtype` anymore.
  - `$replace_strict()` replaces all values by different values. If a value
    doesn't have a specific mapping, it is replaced by the `default` value.
- `$str$concat()` is deprecated, use `$str$join()` (with the same arguments)
  instead (#1147).
- In `pl$date_range()` and `pl$date_ranges()`, the arguments `time_unit` and
  `time_zone` have been removed. They were deprecated in previous versions
  (#1147).
- In `$join()`, when `how = "cross"`, `on`, `left_on` and `right_on` must be
  `NULL` (#1147).
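A hedged sketch of the `$melt()` to `$unpivot()` migration described above, assuming the polars R package at this version is installed. The data and column names are hypothetical.

```r
library(polars)

df <- pl$DataFrame(
  id = c("a", "b"),
  x = c(1, 2),
  y = c(3, 4)
)

# Previously: df$melt(id_vars = "id", value_vars = c("x", "y"))
# `on` now comes first and may be unnamed; all other arguments must be named.
long <- df$unpivot(on = c("x", "y"), index = "id")

print(long)
```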
New features
- New method `$has_nulls()` (#1133).
- New method `$list$explode()` (#1139).
- `$over()` gains a new argument `order_by` to specify the order of values
  within each group. This is useful when the operation depends on the order of
  values, such as `$shift()` (#1147).
- `$value_counts()` gains an argument `normalize` to give relative frequencies
  of unique values instead of their count (#1147).
New Contributors
- @ju6ge made their first contribution in #1135
- @shikokuchuo made their first contribution in #1160
Full Changelog: v0.17.0...v0.18.0
lib-v0.41.0
test: tempolary disable the test of `pl$mem_address` (#1161)