
Releases: pola-rs/r-polars

lib-v0.36.2

03 Feb 10:53
5cacaf1
Pre-release

Rust library for R polars package 0.13.1

v0.13.0

28 Jan 03:22

Breaking changes

  • <Expr>$where() is removed. Use <Expr>$filter() instead (#718).
  • Deprecated functions from 0.12.x are removed (#714).
    • <Expr>$apply() and <Expr>$map(), use $map_elements() and $map_batches() instead.
    • pl$polars_info(), use polars_info() instead.
  • The environment variables used when building the library have been changed (#693). This only affects selecting the feature flag and selecting profiles during source installation.
    • RPOLARS_PROFILE is renamed to LIBR_POLARS_PROFILE
    • RPOLARS_FULL_FEATURES is removed and LIBR_POLARS_FEATURES is added. To select the full_features, set LIBR_POLARS_FEATURES="full_features".
    • RPOLARS_RUST_SOURCE, which was used for development, has been removed. If you want to use library binaries located elsewhere, use LIBR_POLARS_PATH instead.
  • The eager argument of <SQLContext>$execute() is removed. Use the $collect() method after $execute(), or as_polars_df(), to get the result as a DataFrame (#719).
  • The argument name_generator of $list$to_struct() is renamed to fields (#724).
  • The S3 method [ for the $list subnamespace is removed (#724).
  • The option polars.df_print has been renamed polars.df_knitr_print (#726).
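
The replacements listed above can be sketched as follows. This is a minimal illustration that assumes r-polars 0.13.x is installed; the data frame and the variable names (df, kept, doubled, info) are made up for the example, and the Sys.setenv() call only matters for a later installation from source.

```r
library(polars)

df <- pl$DataFrame(x = c(1, 2, 3, 4))

# <Expr>$filter() takes over from the removed <Expr>$where().
kept <- df$select(pl$col("x")$filter(pl$col("x") > 2))

# $map_batches() takes over from the removed $map(); the function
# receives the whole column as a Series.
doubled <- df$select(pl$col("x")$map_batches(function(s) s * 2))

# polars_info() is now a plain function instead of pl$polars_info().
info <- polars_info()

# Source builds: LIBR_POLARS_FEATURES replaces RPOLARS_FULL_FEATURES.
# It has to be set before the package is installed from source.
Sys.setenv(LIBR_POLARS_FEATURES = "full_features")
```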

Deprecations

  • $list$lengths() is deprecated and will be removed in 0.14.0. Use $list$len() instead (#724).
  • pl$from_arrow() is deprecated and will be removed in 0.14.0. Use as_polars_df() or as_polars_series() instead (#728).
  • pl$set_options() and pl$reset_options() are deprecated and will be removed in 0.14.0. See ?polars_options for details (#726).
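
As a small sketch of the first deprecation, assuming r-polars 0.13.x: $list$len() is called exactly where $list$lengths() used to be. The columns id and x are invented for the example.

```r
library(polars)

# A made-up DataFrame with a list column.
df <- pl$DataFrame(id = 1:2, x = list(1:3, 1:2))

# $list$len() is the replacement for the deprecated $list$lengths().
lens <- df$select(pl$col("x")$list$len())
```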

New features

  • For compatibility with CRAN, the number of threads used by Polars is automatically set to 2 if the environment variable POLARS_MAX_THREADS is not set (#720). To disable this behavior and let Polars use the maximum number of threads, do one of the following:
    • Build the Rust library with the disable_limit_max_threads feature.
    • Set the polars.limit_max_threads option to FALSE with the options() function before loading the package.
  • New method $rolling() for DataFrame and LazyFrame. When this is applied, it creates an object of class RPolarsRollingGroupBy (#682, #694).
  • New method $group_by_dynamic() for DataFrame and LazyFrame. When this is applied, it creates an object of class RPolarsDynamicGroupBy (#691).
  • New method $sink_ndjson() for LazyFrame (#681).
  • New function pl$duration() to create a duration by components (week, day, hour, etc.), and use them with date(time) variables (#692).
  • New methods $list$any() and $list$all() (#709).
  • New function pl$from_epoch() to convert a Unix timestamp to a date(time) variable (#708).
  • New methods for the list subnamespace: $set_union(), $set_intersection(), $set_difference(), $set_symmetric_difference() (#712).
  • New option int64_conversion to specify how Int64 columns (which have no equivalent in base R) should be converted. This option can be set either globally with pl$set_options() or on a case-by-case basis, e.g. with $to_data_frame(int64_conversion =) (#706).
  • Several changes in $join() for DataFrame and LazyFrame (#716):
    • <LazyFrame>$join() now errors if other is not a LazyFrame and <DataFrame>$join() errors if other is not a DataFrame.
    • Some arguments have been reordered (e.g. how now comes before left_on). This can lead to bugs if arguments were passed by position rather than by name.
    • Argument how now accepts "outer_coalesce" to coalesce the join keys automatically after joining.
    • New argument validate to perform some checks on join keys (e.g. ensure that there is a one-to-one matching between join keys).
    • New argument join_nulls to consider null values as a valid key.
  • <DataFrame>$describe() now works with all datatypes. It also gains an interpolation argument that is used for quantiles computation (#717).
  • as_polars_df() and as_polars_series() for the arrow package classes have been rewritten and work better (#727).
  • Options handling has been rewritten to match the standard option handling in R (#726):
    • Options are now passed via options(). The option names don't change but they must be prefixed with "polars.". For example, we can now pass options(polars.strictly_immutable = FALSE).
    • Options can be accessed with polars_options(), which returns a named list (this is the replacement of pl$options).
    • Options can be reset with polars_options_reset() (this is the replacement of pl$reset_options()).
  • New function polars_envvars() to print the list of environment variables related to polars (#735).
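
The new options workflow might look like this. It is a sketch under the assumption that r-polars 0.13.x is installed; the variable names (current, changed, restored) are illustrative only.

```r
# Opt out of the two-thread limit; per the notes above, this must be set
# before the package is loaded.
options(polars.limit_max_threads = FALSE)

library(polars)

# Options are set through base R's options(), prefixed with "polars.".
options(polars.strictly_immutable = FALSE)

# polars_options() returns the current polars options as a named list.
current <- polars_options()
changed <- getOption("polars.strictly_immutable")

# polars_options_reset() restores the defaults.
polars_options_reset()
restored <- getOption("polars.strictly_immutable")
```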

lib-v0.36.1

27 Jan 16:54
a60e134
Pre-release

Rust library for R polars package 0.13.0

v0.12.2

09 Jan 14:36

This is a small release including a few documentation improvements and internal updates.

v0.12.1

04 Jan 07:17

This version includes a few additional features and a large amount of documentation improvements.

Deprecations

  • pl$polars_info() is moved to polars_info(). pl$polars_info() is deprecated and will be removed in 0.13.0 (#662).

Rust-polars update

  • rust-polars is updated to 0.36.2 (#659). Most of the changes from 0.35.x to 0.36.2 were covered in R polars 0.12.0.
    The main change is that pl$Utf8 is replaced by pl$String. pl$Utf8 is an alias and will keep working, but pl$String is now preferred in the documentation and in new code.
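
For instance, a cast to the string type can now be written with pl$String. This is a minimal sketch assuming r-polars 0.12.1 or later; the column name a is made up.

```r
library(polars)

df <- pl$DataFrame(a = 1:3)

# pl$String is the preferred name; pl$Utf8 remains available as an alias.
as_string <- df$with_columns(pl$col("a")$cast(pl$String))
```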

What's changed

  • New methods $str$reverse(), $str$contains_any(), and $str$replace_many() (#641).
  • New methods $rle() and $rle_id() (#648).
  • New functions is_polars_df(), is_polars_lf(), is_polars_series() (#658).
  • $gather() now accepts negative indexing (#659).
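
A short sketch of two of these additions, assuming r-polars 0.12.1 or later; the example strings and variable names are invented.

```r
library(polars)

df <- pl$DataFrame(s = c("abc", "hello"))

# $str$reverse() reverses each string in the column.
reversed <- df$select(pl$col("s")$str$reverse())

# The is_polars_*() helpers test the class of an object.
df_check <- is_polars_df(df)
lf_check <- is_polars_lf(df$lazy())
```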

Miscellaneous

  • Remove the Makefile in favor of Taskfile.yml. Please use task instead of make as the task runner for development (#654).

lib-v0.36.0

04 Jan 01:05
2f631e4
Pre-release
feat: Bump rust-polars to 0.36.2 (#659)

Co-authored-by: eitsupi <[email protected]>

v0.12.0

28 Dec 13:50

BREAKING CHANGES DUE TO RUST-POLARS UPDATE

  • rust-polars is updated to the unreleased version of 2023-12-25 (#601, #622).
    This is the same version as the one used by the Python Polars package 0.20.2,
    so please also check the upgrade guide for details.
    • pl$scan_csv() and pl$read_csv()'s comment_char argument is renamed comment_prefix.
    • <DataFrame>$frame_equal() and <Series>$series_equal() are renamed
      to <DataFrame>$equals() and <Series>$equals().
    • <Expr>$rolling_* functions gained an argument warn_if_unsorted.
    • <Expr>$str$json_extract() is renamed to <Expr>$str$json_decode().
    • Change default join behavior with regard to null values.
    • Preserve left and right join keys in outer joins.
    • count now ignores null values.
    • NaN values are now considered equal.
    • $gather_every() gained an argument offset.
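
Two of the items above, sketched under the assumption that r-polars 0.12.0 or later is installed (the data is made up): the renamed $equals() method and the new offset argument of $gather_every().

```r
library(polars)

df1 <- pl$DataFrame(a = 1:6)
df2 <- pl$DataFrame(a = 1:6)

# <DataFrame>$equals() is the new name of <DataFrame>$frame_equal().
same <- df1$equals(df2)

# $gather_every() takes every n-th value, starting at position `offset`
# (zero-based), so this keeps the 2nd, 4th and 6th values.
every_other <- df1$select(pl$col("a")$gather_every(2, offset = 1))
```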

Breaking changes and deprecations

  • $apply() on an Expr or a Series is renamed $map_elements(), and $map()
    is renamed $map_batches(). $map() and $apply() will be removed in 0.13.0 (#534).
  • Removed $days(), $hours(), $minutes(), $seconds(), $milliseconds(),
    $microseconds(), $nanoseconds(). Those were deprecated in 0.11.0 (#550).
  • pl$concat_list(): string elements are now interpreted as column names.
    Use pl$lit() to concatenate with a literal string.
  • <RPolarsExpr>$lit_to_s() is renamed to <RPolarsExpr>$to_series() (#582).
  • <RPolarsExpr>$lit_to_df() is removed (#582).
  • Change class names and function names associated with class names.
    • The class name of all objects created by polars (DataFrame, LazyFrame,
      Expr, Series, etc.) has changed. They now start with RPolars, for example
      RPolarsDataFrame. This will only break your code if you directly use those
      class names, such as in S3 methods (#554, #585).
    • Private methods have been unified so that they do not have the RPolars prefix (#584).
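
The class rename and the new method name can be checked with a sketch like the one below. It assumes r-polars 0.12.0 or later; the column a and the variable names are illustrative.

```r
library(polars)

df <- pl$DataFrame(a = c(1, 2, 3))

# Class names now carry the RPolars prefix; code that dispatches on them
# (for example S3 methods) needs the new names.
is_df <- inherits(df, "RPolarsDataFrame")
is_lf <- inherits(df$lazy(), "RPolarsLazyFrame")

# $map_elements() is the new name of $apply(): the R function is called
# on each element of the column.
plus_one <- df$select(pl$col("a")$map_elements(function(x) x + 1))
```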

What's changed

  • The Extract function ([) for DataFrame can use columns not included in the
    result for filtering (#547).
  • The Extract function ([) for LazyFrame can filter rows with Expressions (#547).
  • as_polars_df() for data.frame has a new argument rownames to convert
    the row.names attribute to a column.
    This option is inspired by the tibble::as_tibble() function (#561).
  • as_polars_df() for data.frame has a new argument make_names_unique (#561).
  • New methods $str$to_date(), $str$to_time(), $str$to_datetime() as
    alternatives to $str$strptime() (#558).
  • The dim() function for DataFrame and LazyFrame correctly returns integer instead of
    double (#577).
  • The conversion of R's POSIXct class to Polars datetime now works correctly with millisecond
    precision (#589).
  • <LazyFrame>$filter(), <DataFrame>$filter(), and pl$when() now allow multiple conditions
    to be separated by commas, like lf$filter(pl$col("foo") == 1, pl$col("bar") != 2) (#598).
  • New method $replace() for expressions (#601).
  • Better error messages for trailing argument commas such as pl$DataFrame()$select("a",) (#607).
  • New function pl$threadpool_size() to get the number of threads used by Polars (#620).
    Thread pool size is also included in the output of pl$polars_info().
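
Some of these changes in use, as a sketch that assumes r-polars 0.12.0 or later; the columns foo, bar and d are made up for the example.

```r
library(polars)

df <- pl$DataFrame(foo = c(1, 1, 2), bar = c(2, 3, 2))

# Several conditions separated by commas are combined: a row is kept
# only if all of them hold.
kept <- df$filter(pl$col("foo") == 1, pl$col("bar") != 2)

# dim() returns an integer vector (rows, columns).
dims <- dim(df)

# $str$to_date() parses strings into dates without going through
# $str$strptime().
dates <- pl$DataFrame(d = c("2023-12-28", "2024-01-04"))$select(
  pl$col("d")$str$to_date()
)
```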

lib-v0.35.1

27 Dec 12:58
d528cdf
Pre-release
docs(website): better display of "Usage" section (#626)

Co-authored-by: eitsupi <[email protected]>

v0.11.0

26 Nov 13:22

BREAKING CHANGES DUE TO RUST-POLARS UPDATE

  • rust-polars is updated to 0.35.0 (2023-11-17) (#515)
    • Changes in $write_csv() and $sink_csv(): has_header is renamed to
      include_header and there's a new argument include_bom.
    • pl$cov() gains a ddof argument.
    • $cumsum(), $cumprod(), $cummin(), $cummax(), $cumcount() are
      renamed $cum_sum(), $cum_prod(), $cum_min(), $cum_max(),
      $cum_count().
    • $take() and $take_every() are renamed $gather() and $gather_every().
    • $shift() and $shift_and_fill() now accept Expr as input.
    • When reverse = TRUE, $arg_sort() now places null values in the first
      positions.
    • Removed argument ambiguous in $dt$truncate() and $dt$round().
    • $str$concat() gains an argument ignore_nulls.
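
A sketch of a few of the renamed methods and arguments, assuming r-polars 0.11.0 or later; the data, the temporary file and the variable names are illustrative.

```r
library(polars)

df <- pl$DataFrame(a = c(1, 2, 3, 4))

# $cum_sum() is the new name of $cumsum().
running <- df$select(pl$col("a")$cum_sum()$alias("running_total"))

# include_header is the new name of the has_header argument.
csv_path <- tempfile(fileext = ".csv")
df$write_csv(csv_path, include_header = TRUE)
header <- readLines(csv_path, n = 1)
```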

Breaking changes and deprecations

  • The rowwise computation when several columns are passed to pl$min(), pl$max(),
    and pl$sum() is deprecated and will be removed in 0.12.0. Passing several
    columns to these functions will now compute the min/max/sum in each column
    separately. Use pl$min_horizontal(), pl$max_horizontal(), and
    pl$sum_horizontal() instead for rowwise computation (#508).
  • $is_not() is deprecated and will be removed in 0.12.0. Use $not() instead
    (#511, #531).
  • $is_first() is deprecated and will be removed in 0.12.0. Use $is_first_distinct()
    instead (#531).
  • In pl$concat(), the argument to_supertypes is removed. Use the suffix
    "_relaxed" in the how argument to cast columns to their shared supertypes
    (#523).
  • All duration methods (days(), hours(), minutes(), seconds(),
    milliseconds(), microseconds(), nanoseconds()) are renamed, for example
    from $dt$days() to $dt$total_days(). The old usage is deprecated and will
    be removed in 0.12.0.
  • The DataFrame method $as_data_frame() is removed in favor of $to_data_frame() (#533).
  • GroupBy methods $as_data_frame() and $to_data_frame() which were used to
    convert GroupBy objects to R data frames are removed.
    Use $ungroup() method and the as.data.frame() function instead (#533).
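
The rowwise replacements and $not() might be used as follows. This is a minimal sketch assuming r-polars 0.11.0 or later; the columns a and b and the aliases are invented.

```r
library(polars)

df <- pl$DataFrame(a = c(1, 5), b = c(3, 2))

out <- df$select(
  # Rowwise sum and maximum across columns a and b.
  pl$sum_horizontal(pl$col("a"), pl$col("b"))$alias("row_sum"),
  pl$max_horizontal(pl$col("a"), pl$col("b"))$alias("row_max"),
  # $not() replaces the deprecated $is_not().
  (pl$col("a") > 2)$not()$alias("a_not_gt_2")
)
```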

What's changed

  • Fix the installation issue on Ubuntu 20.04 (#528, thanks @brownag).
  • New methods $write_json() and $write_ndjson() for DataFrame (#502).
  • Removed argument name in pl$date_range(), which was deprecated for a while
    (#503).
  • New private method .pr$DataFrame$drop_all_in_place(df) to drop DataFrame
    in-place, to release memory without invoking gc(). However, if there are other
    strong references to any of the underlying Series or arrow arrays, that memory
    will specifically not be released. This method is aimed at r-polars extensions,
    and will be kept as stable as possible (#504).
  • New functions pl$min_horizontal(), pl$max_horizontal(), pl$sum_horizontal(),
    pl$all_horizontal(), pl$any_horizontal() (#508).
  • New generic functions as_polars_df() and as_polars_lf() to create polars
    DataFrames and LazyFrames (#519).
  • New method $ungroup() for GroupBy and LazyGroupBy (#522).
  • New method $rolling() to apply an Expr over a rolling window based on
    date/datetime/numeric indices (#470).
  • New methods $name$to_lowercase() and $name$to_uppercase() to transform
    variable names (#529).
  • New method $is_last_distinct() (#531).
  • New methods of the Expressions class, $floor_div(), $mod(), $eq_missing()
    and $neq_missing(). The base R operators %/% and %% for Expressions are
    now translated to $floor_div() and $mod() (#523).
    • Note that $mod() in Polars differs from the R operator %%: with Polars,
      x == (x %% y) + y * (x %/% y) is not guaranteed to hold.
      Please check the upstream issue pola-rs/polars#10570.
  • The extract function ([) for polars objects now behaves more like for base R objects (#543).
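
A few of these additions, sketched under the assumption that r-polars 0.11.0 or later is installed. The data is made up and deliberately uses positive values only, since the results for negative operands are the subject of the upstream issue mentioned above.

```r
library(polars)

# as_polars_df() converts an R data.frame to a polars DataFrame.
df <- as_polars_df(data.frame(x = c(7, 9), y = c(2, 4)))

# %/% and %% on expressions are translated to $floor_div() and $mod().
out <- df$select(
  (pl$col("x") %/% pl$col("y"))$alias("floor_div"),
  (pl$col("x") %% pl$col("y"))$alias("mod")
)

# $name$to_uppercase() transforms the column names.
upper <- df$select(pl$all()$name$to_uppercase())
```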

lib-v0.35.0

26 Nov 07:21
c6a337f
Pre-release
docs(news): tweak news (#539)