DOC FIX: Move unreleased changelog entries to the correct section #1884

Merged on Jul 16, 2024 (13 commits; this view shows changes from 9 commits)

76 changes: 45 additions & 31 deletions CHANGELOG.md
@@ -8,16 +8,58 @@

- Added distributed tracing using OpenTelemetry APIs for table stored procedure function in `DataFrame`:
  - `_execute_and_get_query_id`
- Allow `df.plot()` and `series.plot()` to be called, materializing the data into the local client

#### Bug Fixes
- Fixed a bug regarding precision loss when converting to Snowpark pandas `DataFrame` or `Series` with `dtype=np.uint64`.
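
One common source of this kind of precision loss is routing unsigned 64-bit integers through a 64-bit float, which has only a 53-bit significand. The changelog does not say that this was the cause here, so the following is only a sketch of the general failure mode, in plain Python with no Snowpark calls:

```python
# Largest value representable as an unsigned 64-bit integer.
UINT64_MAX = 2**64 - 1

# A float64 cannot represent every integer above 2**53,
# so converting to float and back may change the value.
as_float = float(UINT64_MAX)
round_tripped = int(as_float)

print(UINT64_MAX)                   # 18446744073709551615
print(round_tripped)                # 18446744073709551616
print(round_tripped == UINT64_MAX)  # False
```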


### Snowpark Local Testing Updates

#### New Features

- Added support for the following APIs:
  - snowflake.snowpark.functions
    - random
- Added new parameters to `patch` function when registering a mocked function:
  - `distinct` allows an alternate function to be specified for when a SQL function should be distinct.
  - `pass_column_index` passes a named parameter `column_index` to the mocked function that contains the `pandas.Index` for the input data.
  - `pass_row_index` passes a named parameter `row_index` to the mocked function that is the 0-indexed row number the function is currently operating on.
  - `pass_input_data` passes a named parameter `input_data` to the mocked function that contains the entire input dataframe for the current expression.

#### Bug Fixes
- Fixed a bug that caused DecimalType columns to be incorrectly truncated to integer precision when used in BinaryExpressions.
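
To make the symptom concrete, the sketch below contrasts exact decimal arithmetic with the result obtained when an operand is first truncated to an integer. It uses Python's standard `decimal` module and hypothetical values; it is not the Snowpark local testing code path.

```python
from decimal import Decimal

# Hypothetical operands for a binary expression (multiplication).
price = Decimal("19.99")
quantity = Decimal("3")

# Expected behaviour: the fractional digits survive the operation.
exact = price * quantity

# Symptom of the bug class: an operand is cut to integer precision first.
truncated = Decimal(int(price)) * quantity

print(exact)      # 59.97
print(truncated)  # 57
```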

### Snowpark pandas API Updates

#### New Features
- Added support for `DataFrameGroupBy.all`, `SeriesGroupBy.all`, `DataFrameGroupBy.any`, and `SeriesGroupBy.any`.
- Added support for `DataFrame.nlargest`, `DataFrame.nsmallest`, `Series.nlargest` and `Series.nsmallest`.
- Added support for `replace` and `frac > 1` in `DataFrame.sample` and `Series.sample`.
- Added support for `read_excel` (Uses local pandas for processing)
- Added support for `Series.at`, `Series.iat`, `DataFrame.at`, and `DataFrame.iat`.
- Added support for `Series.dt.isocalendar`.
- Added support for `Series.case_when` except when condition or replacement is callable.
- Added documentation pages for `Index` and its APIs.
- Added support for `DataFrame.assign`.
- Added support for `DataFrame.stack`.
- Added support for `DataFrame.pivot` and `pd.pivot`.
- Added support for `DataFrame.to_csv` and `Series.to_csv`.
- Added partial support for `Series.str.translate` where the values in the `table` are single-codepoint strings.
- Added support for `DataFrame.corr`.
- Allow `df.plot()` and `series.plot()` to be called, materializing the data into the local client.
- Added support for `DataFrameGroupBy` and `SeriesGroupBy` aggregations `first` and `last`.
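
A few of the APIs listed above, sketched with native pandas so the example runs without a Snowflake session. That Snowpark pandas returns the same results for this data is an assumption based on its goal of pandas compatibility; the frame and column names are made up for illustration.

```python
import pandas as pd  # native pandas stands in for Snowpark pandas here

df = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "sales": [10, 40, 30, 20],
    }
)

# DataFrame.nlargest: the two rows with the highest sales, largest first.
top_two = df.nlargest(2, "sales")

# DataFrame.at / DataFrame.iat: scalar access by label and by position.
first_sale = df.at[0, "sales"]
last_sale = df.iat[3, 2]

# DataFrame.assign: add a derived column without mutating df.
doubled = df.assign(double_sales=df["sales"] * 2)

# DataFrame.pivot: one row per region, one column per quarter.
wide = df.pivot(index="region", columns="quarter", values="sales")

# GroupBy first / any: first value per group, and whether any value exceeds 35.
firsts = df.groupby("region")["sales"].first()
any_over_35 = (df["sales"] > 35).groupby(df["region"]).any()
```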

#### Bug Fixes
- Fixed an issue when using `np.where` and `df.where` when the scalar `other` is the literal 0.
- Fixed a bug in `DataFrame` and `Series` with `dtype=np.uint64` resulting in precision errors.
- Fixed bug where `values` is set to `index` when `index` and `columns` contain all columns in DataFrame during `pivot_table`.

#### Improvements
- Added support for `Index.copy()`
- Added support for Index APIs: `dtype`, `values`, `item()`, `tolist()`, `to_series()` and `to_frame()`
- Expand support for DataFrames with no rows in `pd.pivot_table` and `DataFrame.pivot_table`.
- Added support for `inplace` parameter in `DataFrame.sort_index` and `Series.sort_index`.


## 1.19.0 (2024-06-25)

@@ -31,27 +73,20 @@
#### New Features

- Added support for `to_boolean` function.
- Added documentation pages for Index and its APIs.

#### Bug Fixes

- Fixed a bug where a Python stored procedure with table return type fails when run in a task.
- Fixed a bug where `df.dropna` fails due to `RecursionError: maximum recursion depth exceeded` when the DataFrame has more than 500 columns.
- Fixed a bug where `AsyncJob.result("no_result")` doesn't wait for the query to finish execution.
- Fixed a bug regarding precision loss when converting to Snowpark pandas `DataFrame` or `Series` with `dtype=np.uint64`.


### Snowpark Local Testing Updates

#### New Features

- Added support for the `strict` parameter when registering UDFs and Stored Procedures.
- Added support for the following APIs:
  - snowflake.snowpark.functions
    - random
- Added new parameters to `patch` function when registering a mocked function:
  - `distinct` allows an alternate function to be specified for when a SQL function should be distinct.
  - `pass_column_index` passes a named parameter `column_index` to the mocked function that contains the `pandas.Index` for the input data.
  - `pass_row_index` passes a named parameter `row_index` to the mocked function that is the 0-indexed row number the function is currently operating on.
  - `pass_input_data` passes a named parameter `input_data` to the mocked function that contains the entire input dataframe for the current expression.

#### Bug Fixes

@@ -61,7 +96,6 @@
- Fixed a bug in mock implementation of `to_char` that raises `IndexError` when incoming column has nonconsecutive row index.
- Fixed a bug in handling of `CaseExpr` expressions that raises `IndexError` when incoming column has nonconsecutive row index.
- Fixed a bug in implementation of `Column.like` that raises `IndexError` when incoming column has nonconsecutive row index.
- Fixed a bug that caused DecimalType columns to be incorrectly truncated to integer precision when used in BinaryExpressions.
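
The `IndexError` entries above share one pattern: a column whose row index is not `0..n-1`. The sketch below reproduces that pattern with native pandas and hypothetical data. It shows why treating labels as positions fails; it is not the mock implementation itself.

```python
import pandas as pd

# A column whose row labels are nonconsecutive, e.g. after filtering rows.
col = pd.Series(["a", "b", "c"], index=[0, 2, 5])

# Label-based access works for every label that exists.
by_label = [col.loc[label] for label in col.index]

# Using a label as a position fails: position 5 is past the end
# of a three-element array.
try:
    col.to_numpy()[5]
    raised = False
except IndexError:
    raised = True

# Iterating over positions rather than labels avoids the error.
by_position = [col.iloc[pos] for pos in range(len(col))]
```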

#### Improvements

@@ -80,42 +114,22 @@
- Added support for `DataFrame.expanding` and `Series.expanding` for aggregations `count`, `sum`, `min`, `max`, `mean`, `std`, `var`, and `sem` with `axis=0`.
- Added support for `DataFrame.rolling` and `Series.rolling` for aggregation `count` with `axis=0`.
- Added support for `DataFrameGroupBy.get_group`.
- Added support for `DataFrameGroupBy` and `SeriesGroupBy` aggregations `first` and `last`
- Added support for `Series.str.match`.
- Added support for `DataFrame.resample` and `Series.resample` for aggregations `size`, `first`, and `last`.
- Added support for `DataFrameGroupBy.all`, `SeriesGroupBy.all`, `DataFrameGroupBy.any`, and `SeriesGroupBy.any`.
- Added support for `DataFrame.nlargest`, `DataFrame.nsmallest`, `Series.nlargest` and `Series.nsmallest`.
- Added support for `replace` and `frac > 1` in `DataFrame.sample` and `Series.sample`.
- Added support for `read_excel` (Uses local pandas for processing)
- Added support for `Series.at`, `Series.iat`, `DataFrame.at`, and `DataFrame.iat`.
- Added support for `Series.dt.isocalendar`.
- Added support for `Series.case_when` except when condition or replacement is callable.
- Added documentation pages for `Index` and its APIs.
- Added support for `DataFrame.assign`.
- Added support for `DataFrame.stack`.
- Added support for `DataFrame.pivot` and `pd.pivot`.
- Added support for `DataFrame.to_csv` and `Series.to_csv`.
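
A short sketch of several entries from this list, again using native pandas with made-up data so that it runs locally. Treat the claim that Snowpark pandas behaves identically as an assumption.

```python
import pandas as pd  # native pandas stands in for Snowpark pandas here

values = pd.Series([1.0, 2.0, 3.0, 4.0])

# Series.expanding with the `sum` aggregation: a running total.
running_total = values.expanding().sum()

# Series.str.match: True where the pattern matches at the start of the string.
names = pd.Series(["alpha", "beta", "alps"])
starts_with_al = names.str.match(r"al")

# Series.sample with replace=True and frac > 1: more rows out than in.
oversampled = values.sample(frac=1.5, replace=True, random_state=0)

# DataFrameGroupBy.get_group: the rows belonging to one group.
df = pd.DataFrame({"region": ["east", "west", "east"], "sales": [10, 30, 40]})
east = df.groupby("region").get_group("east")
```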

#### Bug Fixes

- Fixed a bug that caused the output columns of `GroupBy.aggregate` to be ordered incorrectly.
- Fixed a bug where `DataFrame.describe` on a frame with duplicate columns of differing dtypes could cause an error or incorrect results.
- Fixed a bug in `DataFrame.rolling` and `Series.rolling` so `window=0` now throws `NotImplementedError` instead of `ValueError`.
- Fixed a bug in `DataFrame` and `Series` with `dtype=np.uint64` resulting in precision errors
- Fixed bug where `values` is set to `index` when `index` and `columns` contain all columns in DataFrame during `pivot_table`.

#### Improvements

- Added support for named aggregations in `DataFrame.aggregate` and `Series.aggregate` with `axis=0`.
- `pd.read_csv` reads using the native pandas CSV parser, then uploads data to Snowflake using parquet. This enables most of the parameters supported by `read_csv` including date parsing and numeric conversions. Uploading via parquet is roughly twice as fast as uploading via CSV.
- Initial work to support a `pd.Index` directly in Snowpark pandas. Support for `pd.Index` as a first-class component of Snowpark pandas is coming soon.
  - Added a lazy index constructor and support for `len`, `shape`, `size`, `empty`, `to_pandas()` and `names`.
  - For `df.index`, Snowpark pandas creates a lazy index object.
  - For `df.columns`, Snowpark pandas supports a non-lazy version of an `Index` since the data is already stored locally.
- Added support for `Index.copy()`
- Added support for Index APIs: `dtype`, `values`, `item()`, `tolist()`, `to_series()` and `to_frame()`
- Expand support for DataFrames with no rows in `pd.pivot_table` and `DataFrame.pivot_table`.
- Added support for `inplace` parameter in `DataFrame.sort_index` and `Series.sort_index`.
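
Two of the improvements above, named aggregations and the `inplace` parameter of `sort_index`, can be sketched as follows. The example uses native pandas and hypothetical column names so it runs without a Snowflake session; matching behaviour in Snowpark pandas is assumed, not verified here.

```python
import pandas as pd  # native pandas stands in for Snowpark pandas here

df = pd.DataFrame({"sales": [10, 40, 30, 20]})

# Named aggregations: each keyword names an output row,
# and each value is a (column, aggregation) pair.
summary = df.aggregate(total=("sales", "sum"), biggest=("sales", "max"))

# sort_index with inplace=True reorders the frame itself and returns None.
shuffled = pd.DataFrame({"v": [1, 2, 3]}, index=[2, 0, 1])
returned = shuffled.sort_index(inplace=True)
```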

## 1.18.0 (2024-05-28)
