v0.15.0
Breaking changes due to Rust-polars update
- rust-polars is updated to 0.38.1 (#865, #872).
  - In `$pivot()`, the arguments `aggregate_function`, `maintain_order`, `sort_columns` and `separator` must be named. Values passed by position are ignored.
  - In `$describe()`, the name of the first column changed from `"describe"` to `"statistic"`.
  - The `$mod()` methods and `%%` now work correctly and guarantee `x == (x %% y) + y * (x %/% y)`.
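
The `$pivot()` and modulo points above can be sketched as follows. This is only an illustration: the data frame and its column names are made up, the polars calls assume version 0.15.0 is installed, and the identity is checked with base R operators, which follow the same floored-division convention.

```r
library(polars)

# Hypothetical example data; the column names are illustrative only.
df = pl$DataFrame(
  id = c("a", "a", "b", "b"),
  key = c("x", "y", "x", "y"),
  val = c(1, 2, 3, 4)
)

# Optional arguments are passed by name; per the note above,
# values passed by position would be ignored.
wide = df$pivot(
  values = "val",
  index = "id",
  columns = "key",
  aggregate_function = "sum",
  maintain_order = TRUE,
  sort_columns = TRUE,
  separator = "_"
)

# The identity stated above, checked with base R operators
# on positive and negative inputs.
x = c(-7, -1, 0, 5, 7)
y = 3
identity_holds = all(x == (x %% y) + y * (x %/% y))
```

Naming every optional argument also keeps the call readable if the argument order changes again in a later release.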
Other breaking changes
- Removed `as.list()` for class `RPolarsExpr`, as it is a simple wrapper around `list()` (#843).
- Several functions have been rewritten to match the behavior of Python Polars.
  - `pl$col(...)` requires at least one argument (#852).
  - `pl$head()`, `pl$tail()`, `pl$count()`, `pl$first()`, `pl$last()`, `pl$max()`, `pl$min()`, `pl$mean()`, `pl$median()`, `pl$std()`, `pl$sum()`, `pl$var()`, `pl$n_unique()`, and `pl$approx_n_unique()` are syntactic sugar for `pl$col(...)$<method>()`. The argument `...` now only accepts character values that are either column names or regular expressions (#852).
  - `pl$len()` takes no argument. To measure the length of specific columns, use `pl$count(...)` (#852).
  - The default value of the `delimiter` argument of `<Expr>$str$concat()` changed from `"-"` to `""` (#853).
  - The `ignore_nulls` argument of `<Expr>$str$concat()` must be named (#853).
  - The arguments of `pl$Datetime()` are renamed: `tu` to `time_unit`, and `tz` to `time_zone` (#887).
- `pl$Categorical()` has been improved to allow specifying the `ordering` type (either lexical or physical). This also means that `pl$Categorical` without parentheses no longer creates a `DataType`. All uses of `pl$Categorical` must be replaced by `pl$Categorical()` (#860).
- `<Series>$rem()` is removed. Use `<Series>$mod()` instead (#886).
- The conversion strategy between the POSIXct type without a time zone attribute and Polars datetime has been changed (#878).
  `POSIXct` class vectors without a time zone attribute store UTC time internally and are displayed according to the system's time zone. Previous versions of `polars` only considered the internal value and interpreted it as UTC time, so the time displayed as `POSIXct` and the time displayed in Polars were different.

      # polars 0.14.1
      Sys.setenv(TZ = "Europe/Paris")
      datetime = as.POSIXct("1900-01-01")
      datetime
      #> [1] "1900-01-01 PMT"

      s = polars::as_polars_series(datetime)
      s
      #> polars Series: shape: (1,)
      #> Series: '' [datetime[ms]]
      #> [
      #>    1899-12-31 23:50:39
      #> ]

      as.vector(s)
      #> [1] "1900-01-01 PMT"

  Now the internal value is updated to match the displayed value.

      # polars 0.15.0
      Sys.setenv(TZ = "Europe/Paris")
      datetime = as.POSIXct("1900-01-01")
      datetime
      #> [1] "1900-01-01 PMT"

      s = polars::as_polars_series(datetime)
      s
      #> polars Series: shape: (1,)
      #> Series: '' [datetime[ms]]
      #> [
      #>    1900-01-01 00:00:00
      #> ]

      as.vector(s)
      #> [1] "1900-01-01 PMT"

  This update may cause errors when converting from Polars to `POSIXct` for non-existent or ambiguous times. It is recommended to explicitly add a time zone before converting from Polars to R.

      Sys.setenv(TZ = "America/New_York")
      ambiguous_time = as.POSIXct("2020-11-01 01:00:00")
      ambiguous_time
      #> [1] "2020-11-01 01:00:00 EDT"

      pls = polars::as_polars_series(ambiguous_time)
      pls
      #> polars Series: shape: (1,)
      #> Series: '' [datetime[ms]]
      #> [
      #>    2020-11-01 01:00:00
      #> ]

      ## This would raise an error!
      # pls |> as.vector()

      pls$dt$replace_time_zone("UTC") |> as.vector()
      #> [1] "2020-11-01 01:00:00 UTC"
- Removed the argument `eager` in `pl$date_range()` and `pl$struct()` to make the output more consistent. `eager = TRUE` can be replaced by calling `$to_series()` (#882).
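
A few of the migrations listed above might look like the following sketch. The data frame, its column names, and the chosen argument values are assumptions made for illustration; the calls are written against the behavior described in these notes and assume polars 0.15.0 is installed.

```r
library(polars)

# Hypothetical example data.
df = pl$DataFrame(a = c(1, 2, 3), b = c("x", "y", "z"))

# pl$sum("a") is described above as sugar for pl$col("a")$sum().
sums = df$select(
  pl$sum("a")$alias("sugar"),
  pl$col("a")$sum()$alias("explicit")
)

# pl$len() takes no argument; pl$count() takes column names.
counts = df$select(
  pl$len()$alias("n_rows"),
  pl$count("a")$alias("n_a")
)

# Pass the delimiter explicitly now that the default is "".
joined = df$select(pl$col("b")$str$concat(delimiter = "-"))

# Renamed arguments: time_unit and time_zone (formerly tu and tz).
dt_type = pl$Datetime(time_unit = "ms", time_zone = "UTC")

# pl$Categorical must now be called; "lexical" is one of the two
# orderings named above.
cat_type = pl$Categorical(ordering = "lexical")

# <Series>$mod() replaces the removed <Series>$rem().
remainders = as_polars_series(c(5, 7, 9))$mod(4)
```

Code that relied on the old `"-"` default of `$str$concat()` keeps its previous output only if the delimiter is passed explicitly, as in `joined`.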
New features
- In when-then-otherwise expressions, the last `$otherwise()` is now optional, as in Python Polars. If `$otherwise()` is not specified, rows that don't satisfy the condition set in `$when()` are filled with `null` (#836).
- `<DataFrame>$head()` and `<DataFrame>$tail()` methods now support negative row numbers (#840).
- `$group_by()` now works with named expressions (#846).
- New methods for the `arr` subnamespace: `$median()`, `$var()`, `$std()`, `$shift()`, `$to_struct()` (#867).
- `$min()` and `$max()` now work on categorical variables (#868).
- New methods for the `list` subnamespace: `$n_unique()`, `$gather_every()` (#869).
- `clock_time_point` and `clock_zoned_time` objects from the `{clock}` package can now be converted to the Polars datetime type (#861).
- New methods for the `name` subnamespace: `$prefix_fields()` and `$suffix_fields()` (#873).
- The `time_zone` argument of `pl$Datetime()` now accepts `"*"` to match any time zone (#887).
Bug fixes
- R no longer crashes when calling an invalid Polars object that points to a null pointer (#874). This occurred, for example, when a Polars object was saved to an RDS file and loaded in another session.
New Contributors
- @detroyejr made their first contribution in #830
Full Changelog: v0.14.1...v0.15.0