Releases: pola-rs/r-polars
v0.18.0
Breaking changes
- Updated rust-polars to 0.41.3 (#1147, #1156).
- In `$n_chunks()`, the default value of `strategy` now is `"first"` (#1137).
- `$sample()` for Expr and DataFrame (#1136):
  - the argument `frac` is renamed `fraction`;
  - all the arguments except `n` must be named;
  - for the Expr method only, the first argument is now `n` (it was already the case for the DataFrame method);
  - for the Expr method only, the default value for `with_replacement` is now `FALSE` (it was already the case for the DataFrame method).
- `$melt()` had several changes (#1147):
  - `melt()` is renamed `$unpivot()`.
  - Some arguments were renamed: `id_vars` is now `index`, `value_vars` is now `on`.
  - The order of arguments has changed: `on` is now first, then `index`. The order of the other arguments hasn't changed. Note that `on` can be unnamed but all the other arguments must be named.
- `pivot()` had several changes (#1147):
  - The argument `columns` is renamed `on`.
  - The order of arguments has changed: `on` is now first, then `index` and `values`. The order of the other arguments hasn't changed. Note that `on` can be unnamed but all the other arguments must be named.
- In `$write_parquet()` and `$sink_parquet()`, the default value of argument `statistics` is now `TRUE` and can take other values than `TRUE/FALSE` (#1147).
- In `$dt$truncate()` and `$dt$round()`, the argument `offset` has been removed. Use `$dt$offset_by()` after those functions instead (#1147).
- In `$top_k()` and `$bottom_k()` for `Expr`, the arguments `nulls_last`, `maintain_order` and `multithreaded` have been removed. If any `null` values are in the top/bottom `k` values, they will always be positioned last (#1147).
- `$replace()` has been split in two functions depending on the desired behaviour (#1147):
  - `$replace()` recodes some values in the column, leaving all other values unchanged. Compared to the previous version, it doesn't use the arguments `default` and `return_dtype` anymore.
  - `$replace_strict()` replaces all values by different values. If a value doesn't have a specific mapping, it is replaced by the `default` value.
- `$str$concat()` is deprecated, use `$str$join()` (with the same arguments) instead (#1147).
- In `pl$date_range()` and `pl$date_ranges()`, the arguments `time_unit` and `time_zone` have been removed. They were deprecated in previous versions (#1147).
- In `$join()`, when `how = "cross"`, `on`, `left_on` and `right_on` must be `NULL` (#1147).
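As a rough illustration of two of the changes above, the sketch below reshapes a small data frame with `$unpivot()` and contrasts `$replace()` with `$replace_strict()`. The data and column names are made up for the example, and the calls assume polars 0.18.0 is installed.

```r
library(polars)

# Hypothetical data, for illustration only.
df <- pl$DataFrame(
  id = c("a", "b"),
  x = c(1, 2),
  y = c(3, 4)
)

# `on` comes first and may be unnamed; `index` must be named.
long <- df$unpivot(c("x", "y"), index = "id")

# `$replace()` recodes the listed values and leaves the others unchanged;
# `$replace_strict()` maps every value, falling back to `default`
# for values that have no specific mapping.
recoded <- df$select(
  pl$col("x")$replace(2, 100)$alias("kept"),
  pl$col("x")$replace_strict(2, 100, default = -1)$alias("strict")
)
```

Here `long` has one row per combination of `id` and unpivoted column, while `recoded` shows how the value `1` is kept by `$replace()` but replaced by the default in `$replace_strict()`.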
New features
- New method `$has_nulls()` (#1133).
- New method `$list$explode()` (#1139).
- `$over()` gains a new argument `order_by` to specify the order of values within each group. This is useful when the operation depends on the order of values, such as `$shift()` (#1147).
- `$value_counts()` gains an argument `normalize` to give relative frequencies of unique values instead of their count (#1147).
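A minimal sketch of the new `order_by` argument of `$over()`, assuming polars 0.18.0. The data frame is hypothetical: `g` is the grouping column, `t` gives the intended order within each group, and the rows of the second group are deliberately stored out of order.

```r
library(polars)

# Hypothetical data: group 2 is not sorted by `t`.
df <- pl$DataFrame(
  g = c(1, 1, 1, 1, 2, 2, 2, 2),
  t = c(1, 2, 3, 4, 4, 1, 2, 3),
  x = c(10, 20, 30, 40, 10, 20, 30, 40)
)

# Lag `x` within each group `g`, taking rows in the order given by `t`
# rather than in the order they appear in the data frame.
out <- df$with_columns(
  pl$col("x")$shift(1)$over("g", order_by = "t")$alias("x_lag")
)
```

Without `order_by`, the lag would follow the physical row order inside each group, which is usually not what is wanted when the data is unsorted.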
New Contributors
- @ju6ge made their first contribution in #1135
- @shikokuchuo made their first contribution in #1160
Full Changelog: v0.17.0...v0.18.0
lib-v0.41.0
test: temporarily disable the test of `pl$mem_address` (#1161)
v0.17.0
Breaking changes
- Updated rust-polars to unreleased version (> 0.40.0) (#1104, #1110, #1117, #1124):
  - In `$join()`, there is a new argument `coalesce` and the `how` options now accept `"full"` instead of `"outer"` and `"outer_coalesce"`.
  - `$top_k()` and `$bottom_k()` gain three arguments `nulls_last`, `maintain_order` and `multithreaded`.
  - All `$rolling_*()` functions lose the arguments `by`, `closed` and `warn_if_unsorted`. Rolling computations based on `by` must be made via the corresponding `rolling_*_by()`, e.g. `rolling_mean_by()` instead of `rolling_mean(by =)` (#1115).
  - `pl$scan_parquet()` and `pl$read_parquet()` gain an argument `glob` which defaults to `TRUE`. Set it to `FALSE` to avoid considering `*` as a globbing pattern.
  - `$is_not_nan()` on a `null` value (`NA` in R) now returns `null`. Previously, it returned `TRUE`.
  - In `$reshape()`, argument `dims` is renamed `dimensions` and there is a new argument `nested_type` specifying if the output should be of type List or Array.
  - In `$value_counts()`, all arguments must be named and there is a new argument `name` to specify the name of the output.
  - In all functions accepting optimization parameters (such as `projection_pushdown`), there is a new parameter `cluster_with_columns` to combine sequential independent calls to `$with_columns()`.
  - `$str$explode()` is removed.
  - The `check_sorted` argument is removed from `$rolling()` and `$group_by_dynamic()`. Sortedness is now verified in a quick manner, so this argument is no longer needed (pola-rs/polars#16494).
  - `$name$map()` stacks on Linux, so this method is deprecated and its documentation is removed. Please use other methods like `<LazyFrame>$rename(<function>)` instead (#1123).
- As warned in v0.16.0, the order of arguments in `pl$Series` is changed (#1071). The first argument is now `name`, and the second argument is `values`.
- `$to_struct()` on an Expr is removed. This method is now only available for `Series`, `DataFrame`, and in the `$list` and `$arr` subnamespaces. For example, `pl$col("a", "b", "c")$to_struct()` should be replaced with `pl$struct(c("a", "b", "c"))` (#1092).
- `pl$Struct()` now only accepts named inputs and objects of class `RPolarsField`. For example, `pl$Struct(pl$Boolean)` doesn't work anymore and should be named like `pl$Struct(a = pl$Boolean)` (#1053).
- In `$all()` and `$any()`, the argument `drop_nulls` is renamed `ignore_nulls`, and this argument must be named (#1050).
- New method `$struct$with_fields()` (#1109) and new function `pl$field()` to be used in expressions in `$struct$with_fields()` (#1113).
- New methods for `RPolarsDataType`: `$is_enum()`, `$is_categorical()`, `$is_known()`, `$is_string()`, `$contains_views()`, `$contains_categorical()` (#1112).
- In `$dt$combine()`, the arguments `tm` and `tu` are renamed `time` and `time_unit` (#1116).
- The default value of the `rechunk` argument of `pl$concat()` is changed from `TRUE` to `FALSE` (#1125).
- In `$rename()` for LazyFrame and DataFrame, key-value pairs of names are changed to `old_name = "new_name"` instead of `new_name = "old_name"` (#1129).
- In `$rename()` for LazyFrame and DataFrame, calling it with no arguments is not allowed (#1129).
- In all `$rolling_*()` functions, the arguments `center` and `ddof` must be named (#1115).
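The sketch below illustrates two of the changes above: the new argument order of `pl$Series()` and the new direction of the name pairs in `$rename()`. It assumes polars 0.17.0 or later; the data and names are hypothetical.

```r
library(polars)

# `name` is now the first argument of pl$Series() and `values` the second.
# Both are passed by name here so the call does not depend on position.
s <- pl$Series(name = "n", values = 1:3)

# Hypothetical data frame for the rename example.
df <- pl$DataFrame(a = 1:3, b = c("x", "y", "z"))

# Pairs are now written old_name = "new_name":
# the existing column `a` becomes `id`.
renamed <- df$rename(a = "id")
```

Code written for earlier versions with `new_name = "old_name"` pairs needs its pairs swapped.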
New features
- Allow specifying a function in `$rename()` for LazyFrame and DataFrame. This is equivalent to `polars.LazyFrame.rename(mapping: Callable[[str], str])` or `polars.DataFrame.rename(mapping: Callable[[str], str])` in Python Polars (#1122, #1129).
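A short sketch of passing a function to `$rename()`, assuming polars 0.17.0 or later. The data frame is hypothetical, and the function is assumed to be called with each column name and to return the new name as a string.

```r
library(polars)

# Hypothetical data frame.
df <- pl$DataFrame(a = 1:3, b = c("x", "y", "z"))

# The function receives a column name and returns the new name.
upper <- df$rename(function(column_name) toupper(column_name))
```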
Full Changelog: v0.16.4...v0.17.0
lib-v0.40.0
Add `$rolling_*_by()` expressions (#1115). Co-authored-by: eitsupi
v0.16.4
New features
- `pl$read_ipc()` can read a raw vector of Apache Arrow IPC file (#1072).
- New method `<DataFrame>$to_raw_ipc()` to serialize a DataFrame to a raw vector of Apache Arrow IPC file format (#1072).
- New method `<LazyFrame>$serialize()` to serialize a LazyFrame to a character vector of JSON representation (#1073).
- New function `pl$deserialize_lf()` to deserialize a LazyFrame from a character vector of JSON representation (#1073).
- New methods `$str$head()` and `$str$tail()` (#1074).
- New S3 methods `nanoarrow::as_nanoarrow_array_stream()` and `nanoarrow::infer_nanoarrow_schema()` for `RPolarsSeries` (#1076).
- New method `$dt$is_leap_year()` (#1077).
- `as_polars_df()` and `as_polars_series()` support `arrow::RecordBatchReader` (#1078).
- New `experimental` argument for `as_polars_df(<ArrowTabular>)`, `as_polars_df(<RecordBatchReader>)`, `as_polars_series(<nanoarrow_array_stream>)`, and `as_polars_df(<nanoarrow_array_stream>)` (#1078).
  If `experimental = TRUE`, these functions switch to use the Arrow C stream interface internally.
  At this point, the performance is degraded under the expected use cases, so the default is set to `experimental = FALSE`.
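As a minimal sketch of the raw-vector IPC support, the example below serializes a small hypothetical data frame with `$to_raw_ipc()` and reads it back with `pl$read_ipc()`. It assumes polars 0.16.4 or later and uses the default arguments of both functions.

```r
library(polars)

# Hypothetical data frame.
df <- pl$DataFrame(a = 1:3, b = c("x", "y", "z"))

# Serialize to a raw vector in Apache Arrow IPC file format.
ipc_bytes <- df$to_raw_ipc()

# Read the raw vector back into a DataFrame without writing a file.
restored <- pl$read_ipc(ipc_bytes)
```

This kind of round trip can be handy for passing a DataFrame between R processes, since the raw vector is an ordinary R object.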
Full Changelog: v0.16.3...v0.16.4
lib-v0.39.3
feat: import_stream internal method for Series to support Arrow C str…
v0.16.3
New features
- New method `<SQLContext>$register_globals()` (#1064).
- New experimental method `$sql()` for DataFrame and LazyFrame (#1065).
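A small sketch of the experimental `$sql()` method on a DataFrame, assuming polars 0.16.3 or later built with SQL support. The data is hypothetical, and the query assumes that the frame is exposed under the table name `self` by default.

```r
library(polars)

# Hypothetical data frame.
df <- pl$DataFrame(a = 1:3, b = c("x", "y", "z"))

# Assumption: the frame itself is referred to as `self` inside the query.
out <- df$sql("SELECT a, b FROM self WHERE a > 1")
```

Since the method is marked experimental, its behaviour may change in later releases.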
Miscellaneous
- Moved the API documentation website to a new location (#1067, #1068).
  Access to the old website is set to redirect to the top page of the new website.
  - Old URL: https://rpolars.github.io/
  - New URL: https://pola-rs.github.io/r-polars/
Full Changelog: v0.16.2...v0.16.3
v0.16.2
New features
- `$cut()` and `$qcut()` to bin continuous values into discrete categories (#1057).
- `pl$scan_parquet()` and `pl$read_parquet()` can read data from the internet by passing a URL as the first argument (#1056, @andyquinterom).
- `pl$scan_parquet()` and `pl$read_parquet()` gain an argument `storage_options` to scan/read data via cloud storage providers (GCP, AWS, Azure). Note that this support is experimental (#1056, @andyquinterom).
- Add support for the `Enum` datatype via `pl$Enum()` (#1061).
Bug fixes
- In some read/scan functions, downloading files could fail if the URL was too long. This is now fixed (#1049, @DyfanJones).
New Contributors
- @DyfanJones made their first contribution in #1049
- @andyquinterom made their first contribution in #1056
Full Changelog: v0.16.1...v0.16.2
lib-v0.39.2
ci: exclude R devel on windows from binary library check step (#1062)
v0.16.1
This is a small hot-fix release to update dependent Rust polars to 0.39.1 (#1042).
It also includes the following updates.
Bug fixes
- `$len()` now correctly includes `null` values in the count (#1044).
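To make the fix concrete, the sketch below compares `$len()` with `$count()` on a hypothetical column containing one missing value. It assumes polars 0.16.1 or later, where `$count()` is expected to skip `null` values.

```r
library(polars)

# Hypothetical column with one missing value (NA becomes null in polars).
df <- pl$DataFrame(x = c(1, NA, 3))

out <- df$select(
  pl$col("x")$len()$alias("len"),     # counts all rows, including null
  pl$col("x")$count()$alias("count")  # counts non-null values only
)
```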
Other improvements
- `$arr$max()` and `$arr$min()` work without the `nightly` feature (#1042).
Full Changelog: v0.16.0...v0.16.1