SNOW-1658008: Clean up Snowpark pandas documentation references (#2255)

1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

   Fixes SNOW-1658008

2. Fill out the following pre-review checklist:

   - [ ] I am adding a new automated test(s) to verify correctness of my new code
   - [ ] If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [ ] I am adding a new dependency
   - [ ] If this is a new feature/behavior, I'm adding the Local Testing parity changes.

3. Please describe how your code solves the related issue.

This PR updates type annotations and documentation paths that reference Snowpark pandas objects. Notably, `snowflake.snowpark.modin.pandas.Series`/`DataFrame` are now referenced as `modin.pandas.Series`/`DataFrame`.

In the Snowpark Session and DataFrame objects, functions that reference Snowpark pandas objects now refer to `modin.pandas`, with the import guarded by `typing.TYPE_CHECKING`. This should not affect Snowpark users who do not use pandas.
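
For context, a minimal sketch of the guarded-import pattern as applied to `DataFrame.to_snowpark_pandas` (the signature mirrors the `dataframe.py` hunk below; the method body is elided here):

```python
from typing import TYPE_CHECKING, List, Optional, Union

if TYPE_CHECKING:
    # Evaluated only by type checkers and documentation tooling, so a
    # plain Snowpark install never imports modin at runtime.
    import modin.pandas  # pragma: no cover


class DataFrame:
    def to_snowpark_pandas(
        self,
        index_col: Optional[Union[str, List[str]]] = None,
        columns: Optional[List[str]] = None,
    ) -> "modin.pandas.DataFrame":
        # The string (forward-reference) annotation is resolved lazily,
        # keeping modin an optional dependency for non-pandas users.
        ...
```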

In Snowpark pandas, cross-references to `snowflake.snowpark.modin.pandas.io.*` were not rendering correctly in the generated documentation; these have been changed to `snowflake.snowpark.modin.pandas.*`. Removing the `snowflake.snowpark` prefix from these does not currently work; I will investigate why after GA.
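
Taken together, a condensed sketch of what the rewritten cross-references look like in source, abbreviated from the `merge` docstring in `general.py` (shown in the hunks below; parameter list and body trimmed):

```python
def merge(left, right, how: str = "inner"):
    """
    Merge DataFrame or named Series objects with a database-style join.

    Parameters
    ----------
    left : :class:`~modin.pandas.DataFrame` or named Series
    right : :class:`~modin.pandas.DataFrame` or named Series
        Object to merge with.

    Returns
    -------
    :class:`~modin.pandas.DataFrame`
        A DataFrame of the two merged objects.
    """
    # These roles previously read :class:`~snowflake.snowpark.modin.pandas.DataFrame`;
    # the shorter modin.pandas path is what the rendered API docs now resolve.
    ...
```
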
sfc-gh-joshi authored Sep 10, 2024
1 parent 5358ee3 commit 2eb2a2a
Showing 17 changed files with 171 additions and 158 deletions.
1 change: 1 addition & 0 deletions docs/source/modin/io.rst
@@ -20,6 +20,7 @@ Input/Output
:toctree: pandas_api/

read_snowflake
to_snowflake
to_snowpark

.. rubric:: pandas
15 changes: 8 additions & 7 deletions src/snowflake/snowpark/dataframe.py
@@ -168,6 +168,7 @@
from collections.abc import Iterable

if TYPE_CHECKING:
import modin.pandas # pragma: no cover
from table import Table # pragma: no cover

_logger = getLogger(__name__)
@@ -938,7 +939,7 @@ def to_snowpark_pandas(
self,
index_col: Optional[Union[str, List[str]]] = None,
columns: Optional[List[str]] = None,
) -> "snowflake.snowpark.modin.pandas.DataFrame":
) -> "modin.pandas.DataFrame":
"""
Convert the Snowpark DataFrame to Snowpark pandas DataFrame.
@@ -948,7 +949,7 @@ def to_snowpark_pandas(
all columns except ones configured in index_col.
Returns:
:class:`~snowflake.snowpark.modin.pandas.DataFrame`
:class:`~modin.pandas.DataFrame`
A Snowpark pandas DataFrame contains index and data columns based on the snapshot of the current
Snowpark DataFrame, which triggers an eager evaluation.
@@ -964,12 +965,12 @@
Note:
Transformations performed on the returned Snowpark pandas Dataframe do not affect the Snowpark DataFrame
from which it was created. Call
- :func:`snowflake.snowpark.modin.pandas.to_snowpark <snowflake.snowpark.modin.pandas.to_snowpark>`
- :func:`modin.pandas.to_snowpark <modin.pandas.to_snowpark>`
to transform a Snowpark pandas DataFrame back to a Snowpark DataFrame.
The column names used for columns or index_cols must be Normalized Snowflake Identifiers, and the
Normalized Snowflake Identifiers of a Snowpark DataFrame can be displayed by calling df.show().
For details about Normalized Snowflake Identifiers, please refer to the Note in :func:`~snowflake.snowpark.modin.pandas.read_snowflake`
For details about Normalized Snowflake Identifiers, please refer to the Note in :func:`~modin.pandas.read_snowflake`
`to_snowpark_pandas` works only when the environment is set up correctly for Snowpark pandas. This environment
may require a version of Python and pandas different from what Snowpark Python uses. If the environment is setup
@@ -980,9 +981,9 @@
- the installation section https://docs.snowflake.com/en/developer-guide/snowpark/python/snowpark-pandas#installing-the-snowpark-pandas-api
See also:
- :func:`snowflake.snowpark.modin.pandas.to_snowpark <snowflake.snowpark.modin.pandas.to_snowpark>`
- :func:`snowflake.snowpark.modin.pandas.DataFrame.to_snowpark <snowflake.snowpark.modin.pandas.DataFrame.to_snowpark>`
- :func:`snowflake.snowpark.modin.pandas.Series.to_snowpark <snowflake.snowpark.modin.pandas.Series.to_snowpark>`
- :func:`modin.pandas.to_snowpark <modin.pandas.to_snowpark>`
- :func:`modin.pandas.DataFrame.to_snowpark <modin.pandas.DataFrame.to_snowpark>`
- :func:`modin.pandas.Series.to_snowpark <modin.pandas.Series.to_snowpark>`
Example::
>>> df = session.create_dataframe([[1, 2, 3]], schema=["a", "b", "c"])
78 changes: 39 additions & 39 deletions src/snowflake/snowpark/modin/pandas/general.py
@@ -90,7 +90,7 @@
if TYPE_CHECKING:
# To prevent cross-reference warnings when building documentation and prevent erroneously
# linking to `snowflake.snowpark.DataFrame`, we need to explicitly
# qualify return types in this file with `snowflake.snowpark.modin.pandas.DataFrame`.
# qualify return types in this file with `modin.pandas.DataFrame`.
# SNOW-1233342: investigate how to fix these links without using absolute paths
from modin.core.storage_formats import BaseQueryCompiler # pragma: no cover

@@ -172,8 +172,8 @@ def merge(
Parameters
----------
left : :class:`~snowflake.snowpark.modin.pandas.DataFrame` or named Series
right : :class:`~snowflake.snowpark.modin.pandas.DataFrame` or named Series
left : :class:`~modin.pandas.DataFrame` or named Series
right : :class:`~modin.pandas.DataFrame` or named Series
Object to merge with.
how : {'left', 'right', 'outer', 'inner', 'cross'}, default 'inner'
Type of merge to be performed.
@@ -234,7 +234,7 @@ def merge(
Returns
-------
:class:`~snowflake.snowpark.modin.pandas.DataFrame`
:class:`~modin.pandas.DataFrame`
A DataFrame of the two merged objects.
See Also
@@ -429,8 +429,8 @@ def merge_asof(
Parameters
----------
left : :class:`~snowflake.snowpark.modin.pandas.DataFrame` or named :class:`~snowflake.snowpark.modin.pandas.Series`.
right : :class:`~snowflake.snowpark.modin.pandas.DataFrame` or named :class:`~snowflake.snowpark.modin.pandas.Series`.
left : :class:`~modin.pandas.DataFrame` or named :class:`~modin.pandas.Series`.
right : :class:`~modin.pandas.DataFrame` or named :class:`~modin.pandas.Series`.
on : label
Field name to join on. Must be found in both DataFrames. The data MUST be ordered.
Furthermore, this must be a numeric column such as datetimelike, integer, or float.
@@ -461,7 +461,7 @@ def merge_asof(
Returns
-------
Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame`
Snowpark pandas :class:`~modin.pandas.DataFrame`
Examples
--------
@@ -678,7 +678,7 @@ def pivot_table(
Returns
-------
Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame`
Snowpark pandas :class:`~modin.pandas.DataFrame`
An Excel style pivot table.
Notes
@@ -808,7 +808,7 @@ def pivot(data, index=None, columns=None, values=None): # noqa: PR01, RT01, D20
Parameters
----------
data : :class:`~snowflake.snowpark.modin.pandas.DataFrame`
data : :class:`~modin.pandas.DataFrame`
columns : str or object or a list of str
Column to use to make new frame’s columns.
index : str or object or a list of str, optional
@@ -819,7 +819,7 @@ def pivot(data, index=None, columns=None, values=None): # noqa: PR01, RT01, D20
Returns
-------
:class:`~snowflake.snowpark.modin.pandas.DataFrame`
:class:`~modin.pandas.DataFrame`
Notes
-----
@@ -1166,11 +1166,11 @@ def concat(
Returns
-------
object, type of objs
When concatenating all Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.Series` along the index (axis=0),
a Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.Series` is returned. When ``objs`` contains at least
one Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame`,
a Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame` is returned. When concatenating along
the columns (axis=1), a Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame` is returned.
When concatenating all Snowpark pandas :class:`~modin.pandas.Series` along the index (axis=0),
a Snowpark pandas :class:`~modin.pandas.Series` is returned. When ``objs`` contains at least
one Snowpark pandas :class:`~modin.pandas.DataFrame`,
a Snowpark pandas :class:`~modin.pandas.DataFrame` is returned. When concatenating along
the columns (axis=1), a Snowpark pandas :class:`~modin.pandas.DataFrame` is returned.
See Also
--------
@@ -1506,13 +1506,13 @@ def to_datetime(
"""
Convert argument to datetime.
This function converts a scalar, array-like, :class:`~snowflake.snowpark.modin.pandas.Series` or
:class:`~snowflake.snowpark.modin.pandas.DataFrame`/dict-like to a pandas datetime object.
This function converts a scalar, array-like, :class:`~modin.pandas.Series` or
:class:`~modin.pandas.DataFrame`/dict-like to a pandas datetime object.
Parameters
----------
arg : int, float, str, datetime, list, tuple, 1-d array, Series, :class:`~snowflake.snowpark.modin.pandas.DataFrame`/dict-like
The object to convert to a datetime. If a :class:`~snowflake.snowpark.modin.pandas.DataFrame` is provided, the
arg : int, float, str, datetime, list, tuple, 1-d array, Series, :class:`~modin.pandas.DataFrame`/dict-like
The object to convert to a datetime. If a :class:`~modin.pandas.DataFrame` is provided, the
method expects minimally the following columns: :const:`"year"`,
:const:`"month"`, :const:`"day"`.
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
@@ -1548,7 +1548,7 @@ def to_datetime(
Control timezone-related parsing, localization and conversion.
- If :const:`True`, the function *always* returns a timezone-aware
UTC-localized :class:`Timestamp`, :class:`~snowflake.snowpark.modin.pandas.Series` or
UTC-localized :class:`Timestamp`, :class:`~modin.pandas.Series` or
:class:`DatetimeIndex`. To do this, timezone-naive inputs are
*localized* as UTC, while timezone-aware inputs are *converted* to UTC.
@@ -1609,14 +1609,14 @@ def to_datetime(
parsing):
- scalar: :class:`Timestamp` (or :class:`datetime.datetime`)
- array-like: :class:`~snowflake.snowpark.modin.pandas.DatetimeIndex` (or
:class: :class:`~snowflake.snowpark.modin.pandas.Series` of :class:`object` dtype containing
- array-like: :class:`~modin.pandas.DatetimeIndex` (or
:class: :class:`~modin.pandas.Series` of :class:`object` dtype containing
:class:`datetime.datetime`)
- Series: :class:`~snowflake.snowpark.modin.pandas.Series` of :class:`datetime64` dtype (or
:class: :class:`~snowflake.snowpark.modin.pandas.Series` of :class:`object` dtype containing
- Series: :class:`~modin.pandas.Series` of :class:`datetime64` dtype (or
:class: :class:`~modin.pandas.Series` of :class:`object` dtype containing
:class:`datetime.datetime`)
- DataFrame: :class:`~snowflake.snowpark.modin.pandas.Series` of :class:`datetime64` dtype (or
:class:`~snowflake.snowpark.modin.pandas.Series` of :class:`object` dtype containing
- DataFrame: :class:`~modin.pandas.Series` of :class:`datetime64` dtype (or
:class:`~modin.pandas.Series` of :class:`object` dtype containing
:class:`datetime.datetime`)
Raises
Expand All @@ -1625,7 +1625,7 @@ def to_datetime(
When parsing a date from string fails.
ValueError
When another datetime conversion error happens. For example when one
of 'year', 'month', 'day' columns is missing in a :class:`~modin.pandas.DataFrame`, or
of 'year', 'month', day' columns is missing in a :class:`~modin.pandas.DataFrame`, or
when a Timezone-aware :class:`datetime.datetime` is found in an array-like
of mixed time offsets, and ``utc=False``.
@@ -1651,29 +1651,29 @@ def to_datetime(
:class:`datetime.datetime`. None/NaN/null entries are converted to
:const:`NaT` in both cases.
- **Series** are converted to :class:`~snowflake.snowpark.modin.pandas.Series` with :class:`datetime64`
dtype when possible, otherwise they are converted to :class:`~snowflake.snowpark.modin.pandas.Series` with
- **Series** are converted to :class:`~modin.pandas.Series` with :class:`datetime64`
dtype when possible, otherwise they are converted to :class:`~modin.pandas.Series` with
:class:`object` dtype, containing :class:`datetime.datetime`. None/NaN/null
entries are converted to :const:`NaT` in both cases.
- **DataFrame/dict-like** are converted to :class:`~snowflake.snowpark.modin.pandas.Series` with
- **DataFrame/dict-like** are converted to :class:`~modin.pandas.Series` with
:class:`datetime64` dtype. For each row a datetime is created from assembling
the various dataframe columns. Column keys can be common abbreviations
like [‘year’, ‘month’, ‘day’, ‘minute’, ‘second’, ‘ms’, ‘us’, ‘ns’]) or
plurals of the same.
The following causes are responsible for :class:`datetime.datetime` objects
being returned (possibly inside an :class:`Index` or a :class:`~snowflake.snowpark.modin.pandas.Series` with
being returned (possibly inside an :class:`Index` or a :class:`~modin.pandas.Series` with
:class:`object` dtype) instead of a proper pandas designated type
(:class:`Timestamp` or :class:`~snowflake.snowpark.modin.pandas.Series` with :class:`datetime64` dtype):
(:class:`Timestamp` or :class:`~modin.pandas.Series` with :class:`datetime64` dtype):
- when any input element is before :const:`Timestamp.min` or after
:const:`Timestamp.max`, see `timestamp limitations
<https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
#timeseries-timestamp-limits>`_.
- when ``utc=False`` (default) and the input is an array-like or
:class:`~snowflake.snowpark.modin.pandas.Series` containing mixed naive/aware datetime, or aware with mixed
:class:`~modin.pandas.Series` containing mixed naive/aware datetime, or aware with mixed
time offsets. Note that this happens in the (quite frequent) situation when
the timezone has a daylight savings policy. In that case you may wish to
use ``utc=True``.
@@ -1683,7 +1683,7 @@ def to_datetime(
**Handling various input formats**
Assembling a datetime from multiple columns of a :class:`~snowflake.snowpark.modin.pandas.DataFrame`. The keys
Assembling a datetime from multiple columns of a :class:`~modin.pandas.DataFrame`. The keys
can be common abbreviations like ['year', 'month', 'day', 'minute', 'second',
'ms', 'us', 'ns']) or plurals of the same
@@ -1744,7 +1744,7 @@ def to_datetime(
The default behaviour (``utc=False``) is as follows:
- Timezone-naive inputs are kept as timezone-naive :class:`~snowflake.snowpark.modin.pandas.DatetimeIndex`:
- Timezone-naive inputs are kept as timezone-naive :class:`~modin.pandas.DatetimeIndex`:
>>> pd.to_datetime(['2018-10-26 12:00:00', '2018-10-26 13:00:15'])
DatetimeIndex(['2018-10-26 12:00:00', '2018-10-26 13:00:15'], dtype='datetime64[ns]', freq=None)
@@ -1844,7 +1844,7 @@ def get_dummies(
Parameters
----------
data : array-like, Series, or :class:`~snowflake.snowpark.modin.pandas.DataFrame`
data : array-like, Series, or :class:`~modin.pandas.DataFrame`
Data of which to get dummy indicators.
prefix : str, list of str, or dict of str, default None
String to append DataFrame column names.
@@ -1873,7 +1873,7 @@ def get_dummies(
Returns
-------
:class:`~snowflake.snowpark.modin.pandas.DataFrame`
:class:`~modin.pandas.DataFrame`
Dummy-coded data.
Examples
@@ -1942,7 +1942,7 @@ def melt(
Returns
-------
:class:`~snowflake.snowpark.modin.pandas.DataFrame`
:class:`~modin.pandas.DataFrame`
unpivoted on the value columns
Examples
@@ -2034,7 +2034,7 @@ def crosstab(
Returns
-------
Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame`
Snowpark pandas :class:`~modin.pandas.DataFrame`
Cross tabulation of the data.
Notes
10 changes: 4 additions & 6 deletions src/snowflake/snowpark/modin/plugin/docstrings/base.py
@@ -517,7 +517,7 @@ def asfreq():
Returns
-------
Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame` or Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.Series`
Snowpark pandas :class:`~modin.pandas.DataFrame` or Snowpark pandas :class:`~modin.pandas.Series`
Notes
-----
@@ -585,7 +585,7 @@ def astype():
Returns
-------
same type as caller (Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame` or Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.Series`)
same type as caller (Snowpark pandas :class:`~modin.pandas.DataFrame` or Snowpark pandas :class:`~modin.pandas.Series`)
Examples
--------
@@ -695,7 +695,7 @@ def copy():
Returns
-------
copy : Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.Series` or Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.DataFrame`
copy : Snowpark pandas :class:`~modin.pandas.Series` or Snowpark pandas :class:`~modin.pandas.DataFrame`
Object type matches caller.
Examples
@@ -753,7 +753,7 @@ def count():
Returns
-------
Snowpark pandas :class:`~snowflake.snowpark.modin.pandas.Series`
Snowpark pandas :class:`~modin.pandas.Series`
For each column/row the number of non-NA/null entries.
See Also
@@ -1511,8 +1511,6 @@ def iloc():
>>> df.iloc[[0]]
a b c d
0 1 2 3 4
>>> type(df.iloc[[0]])
<class 'snowflake.snowpark.modin.pandas.dataframe.DataFrame'>
>>> df.iloc[[0, 1]]
a b c d
14 changes: 7 additions & 7 deletions src/snowflake/snowpark/modin/plugin/docstrings/dataframe.py
@@ -626,9 +626,9 @@ def applymap():
See Also
--------
:func:`Series.apply <snowflake.snowpark.modin.pandas.Series.apply>` : For applying more complex functions on a Series.
:func:`Series.apply <modin.pandas.Series.apply>` : For applying more complex functions on a Series.
:func:`DataFrame.apply <snowflake.snowpark.modin.pandas.DataFrame.apply>` : Apply a function row-/column-wise.
:func:`DataFrame.apply <modin.pandas.DataFrame.apply>` : Apply a function row-/column-wise.
Examples
--------
@@ -775,9 +775,9 @@ def apply():
See Also
--------
:func:`Series.apply <snowflake.snowpark.modin.pandas.Series.apply>` : For applying more complex functions on a Series.
:func:`Series.apply <modin.pandas.Series.apply>` : For applying more complex functions on a Series.
:func:`DataFrame.applymap <snowflake.snowpark.modin.pandas.DataFrame.applymap>` : Apply a function elementwise on a whole DataFrame.
:func:`DataFrame.applymap <modin.pandas.DataFrame.applymap>` : Apply a function elementwise on a whole DataFrame.
Notes
-----
@@ -1307,7 +1307,7 @@ def compare():
Returns
-------
:class:`~snowflake.snowpark.modin.pandas.DataFrame`
:class:`~modin.pandas.DataFrame`
The result of the comparison.
@@ -2679,7 +2679,7 @@ def pivot():
Returns
-------
:class:`~snowflake.snowpark.modin.pandas.DataFrame`
:class:`~modin.pandas.DataFrame`
Notes
-----
@@ -4458,7 +4458,7 @@ def value_counts():
See Also
--------
:func:`Series.value_counts <snowflake.snowpark.modin.pandas.Series.value_counts>` : Equivalent method on Series.
:func:`Series.value_counts <modin.pandas.Series.value_counts>` : Equivalent method on Series.
Notes
-----