Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SNOW-1649147]: Fix inplace argument for Series objects derived from DataFrames columns #2209

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,7 @@

- Stopped ignoring nanoseconds in `pd.Timedelta` scalars.
- Fixed AssertionError in tree of binary operations.
- Fixed `inplace` argument for Series objects derived from DataFrame columns.

#### Behavior Change

Expand Down
2 changes: 0 additions & 2 deletions src/snowflake/snowpark/modin/pandas/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -2752,8 +2752,6 @@ def _update_inplace(self, new_query_compiler):
# Propagate changes back to parent so that column in dataframe had the same contents
if self._parent is not None:
if self._parent_axis == 0:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the else, because it doesn't seem like self._parent_axis can be anything other than 1?

In [2]: df = pd.DataFrame([[1, 2, 3], [4, None, 6]], columns=list("ABC"))

In [3]: df['B']._parent_axis
Out[3]: 0

In [4]: df.loc[1]._parent_axis
Out[4]: 0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sfc-gh-joshi should we override this code? BTW, in indexing, we never use _parent_axis before.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our code base, we never set _parent_axis = 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should leave it as-is. We don't use it, but upstream modin uses it in DataFrame._getitem_column (which we don't use since we use a different code path for indexing).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we now set _parent_axis=1 correctly: #2210

self._parent.loc[self.name] = self
sfc-gh-rdurrani marked this conversation as resolved.
Show resolved Hide resolved
else:
self._parent[self.name] = self

def _create_or_update_from_compiler(self, new_query_compiler, inplace=False):
Expand Down
13 changes: 13 additions & 0 deletions tests/integ/modin/series/test_fillna.py
Original file line number Diff line number Diff line change
Expand Up @@ -188,3 +188,16 @@ def test_argument_negative(test_fillna_series, test_fillna_df):
expect_exception_type=TypeError,
assert_exception_equal=False,
)


@sql_count_checker(query_count=1)
def test_inplace_fillna_from_df():
def inplace_fillna(df):
df["B"].fillna(method="ffill", inplace=True)
return df

eval_snowpark_pandas_result(
pd.DataFrame([[1, 2, 3], [4, None, 6]], columns=list("ABC")),
native_pd.DataFrame([[1, 2, 3], [4, None, 6]], columns=list("ABC")),
inplace_fillna,
)
Loading