SNOW-1625830: Fix circular import when calling to_snowpark_pandas without initializing Snowpark pandas #2097

Merged: 7 commits, Aug 27, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -26,6 +26,7 @@
 - Fixed a bug in `session.read.csv` that caused an error when setting `PARSE_HEADER = True` in an externally defined file format.
 - Fixed a bug in query generation from set operations that allowed generation of duplicate queries when children have common subqueries.
 - Fixed a bug in `session.get_session_stage` that referenced a non-existing stage after switching database or schema.
+- Fixed a bug where calling `DataFrame.to_snowpark_pandas_dataframe` without explicitly initializing the Snowpark pandas plugin caused an error.

 ### Snowpark Local Testing Updates

8 changes: 6 additions & 2 deletions src/snowflake/snowpark/dataframe.py
@@ -1007,8 +1007,12 @@ def to_snowpark_pandas(
B A
2 1 1 3 1
         """
-        import snowflake.snowpark.modin.pandas as pd  # pragma: no cover
-
+        # black and isort disagree on how to format this section with isort: skip
+        # fmt: off
+        import snowflake.snowpark.modin.plugin  # isort: skip  # noqa: F401
+        # If snowflake.snowpark.modin.plugin was successfully imported, then modin.pandas is available
+        import modin.pandas as pd  # isort: skip
+        # fmt: on
         # create a temporary table out of the current snowpark dataframe
         temporary_table_name = random_name_for_temp_object(
             TempObjectType.TABLE
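
The hunk above replaces a single import with two ordered imports inside the method body: the plugin module first, then `modin.pandas`. A minimal, self-contained sketch of that ordering pattern follows. The helper names `import_after_prerequisite` and `load_json_decoder` are hypothetical and not part of Snowpark, and standard-library modules stand in for the real packages.

```python
import importlib
from types import ModuleType


def import_after_prerequisite(prerequisite: str, target: str) -> ModuleType:
    """Import `prerequisite` for its side effects, then import and return `target`.

    Hypothetical helper for illustration only; it is not part of Snowpark.
    """
    # Import the prerequisite first. If it is missing or fails its own checks,
    # the exception propagates before `target` is touched.
    importlib.import_module(prerequisite)
    # Only reached when the prerequisite imported cleanly.
    return importlib.import_module(target)


def load_json_decoder() -> ModuleType:
    # Deferring the imports to call time, as the diff does inside the method,
    # keeps the dependency out of module import time.
    return import_after_prerequisite("json", "json.decoder")
```

Calling `load_json_decoder()` returns the `json.decoder` module, while a missing prerequisite raises `ModuleNotFoundError` before the target import is attempted.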
56 changes: 56 additions & 0 deletions tests/integ/test_df_to_snowpark_pandas.py
@@ -0,0 +1,56 @@
#!/usr/bin/env python3
#
# Copyright (c) 2012-2024 Snowflake Computing Inc. All rights reserved.
#

# Tests behavior of to_snowpark_pandas() without explicitly initializing Snowpark pandas.

import pytest

from snowflake.snowpark._internal.utils import TempObjectType
from tests.utils import Utils

pytestmark = [
    pytest.mark.xfail(
        "config.getoption('local_testing_mode', default=False)",
        reason="This is testing Snowpark pandas installation",
        run=False,
    )
]


@pytest.fixture(scope="module")
def tmp_table_basic(session):
    table_name = Utils.random_name_for_temp_object(TempObjectType.TABLE)
    Utils.create_table(
        session, table_name, "id integer, foot_size float, shoe_model varchar"
    )
    session.sql(f"insert into {table_name} values (1, 32.0, 'medium')").collect()
    session.sql(f"insert into {table_name} values (2, 27.0, 'small')").collect()
    session.sql(f"insert into {table_name} values (3, 40.0, 'large')").collect()

    try:
        yield table_name
    finally:
        Utils.drop_table(session, table_name)


def test_to_snowpark_pandas_no_modin(session, tmp_table_basic):
    snowpark_df = session.table(tmp_table_basic)
    # Check if modin is installed (if so, we're running in Snowpark pandas; if not, we're just in Snowpark Python)
    try:
        import modin  # noqa: F401
    except ModuleNotFoundError:
        # Current Snowpark Python installs pandas==2.2.2, but Snowpark pandas depends on modin
        # 0.28.1, which needs pandas==2.2.1. The pandas version check is currently performed
        # before Snowpark pandas checks whether modin is installed.
        # TODO: SNOW-1552497: after upgrading to modin 0.30.1, Snowpark pandas will support
        # all pandas 2.2.x, and this function call will raise a ModuleNotFoundError since
        # modin is not installed.
        with pytest.raises(
            RuntimeError,
            match="does not match the supported pandas version in Snowpark pandas",
        ):
            snowpark_df.to_snowpark_pandas()
    else:
        snowpark_df.to_snowpark_pandas()  # should have no errors
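
The test above decides which branch to exercise by attempting `import modin` and catching `ModuleNotFoundError`. As a hedged alternative sketch (not what the PR does), the same availability check can be made without importing the package by using `importlib.util.find_spec`. The helper name `is_installed` is hypothetical, and the sketch assumes a top-level module name, since dotted names cause parent packages to be imported.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it.

    Hypothetical helper for illustration; intended for names without dots.
    """
    # find_spec returns None when no finder can locate the module.
    return importlib.util.find_spec(module_name) is not None
```

A test could then branch on `is_installed("modin")` instead of a `try`/`except` around the import, at the cost of not detecting a package that is present but fails during import.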
4 changes: 2 additions & 2 deletions tox.ini
@@ -96,11 +96,11 @@ commands =
     local: {env:SNOWFLAKE_PYTEST_CMD} --local_testing_mode -m "integ or unit or mock" {posargs:} tests
     dailynotdoctest: {env:SNOWFLAKE_PYTEST_DAILY_CMD} -m "{env:SNOWFLAKE_TEST_TYPE} or udf" {posargs:} tests
     # Snowpark pandas commands:
-    snowparkpandasnotdoctest: {env:MODIN_PYTEST_CMD} --durations=20 -m "{env:SNOWFLAKE_TEST_TYPE}" {posargs:} {env:SNOW_1314507_WORKAROUND_RERUN_FLAGS} tests/unit/modin tests/integ/modin
+    snowparkpandasnotdoctest: {env:MODIN_PYTEST_CMD} --durations=20 -m "{env:SNOWFLAKE_TEST_TYPE}" {posargs:} {env:SNOW_1314507_WORKAROUND_RERUN_FLAGS} tests/unit/modin tests/integ/modin tests/integ/test_df_to_snowpark_pandas.py
     # This one only run doctest but we still need to include the tests folder to let tests/conftest.py to mark the doctest files for us
     snowparkpandasdoctest: {env:MODIN_PYTEST_CMD} --durations=20 -m "{env:SNOWFLAKE_TEST_TYPE}" {posargs:} src/snowflake/snowpark/modin/ tests/unit/modin
     # This one is used by daily_modin_precommit.yml
-    snowparkpandasdailynotdoctest: {env:MODIN_PYTEST_DAILY_CMD} --durations=20 -m "{env:SNOWFLAKE_TEST_TYPE}" {posargs:} {env:SNOW_1314507_WORKAROUND_RERUN_FLAGS} tests/unit/modin tests/integ/modin
+    snowparkpandasdailynotdoctest: {env:MODIN_PYTEST_DAILY_CMD} --durations=20 -m "{env:SNOWFLAKE_TEST_TYPE}" {posargs:} {env:SNOW_1314507_WORKAROUND_RERUN_FLAGS} tests/unit/modin tests/integ/modin tests/integ/test_df_to_snowpark_pandas.py
     # This one is only called by jenkins job and the only difference from `snowparkpandasnotdoctest` is that it uses
     # MODIN_PYTEST_NO_COV_CMD instead of MODIN_PYTEST_CMD
     snowparkpandasjenkins: {env:MODIN_PYTEST_NO_COV_CMD} --durations=20 -m "{env:SNOWFLAKE_TEST_TYPE}" {posargs:} {env:SNOW_1314507_WORKAROUND_RERUN_FLAGS} tests/unit/modin tests/integ/modin