Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1462359/SNOW-1462481: Add Series.str and Series.dt to docs #1784

Merged
merged 6 commits into from
Jun 18, 2024

Conversation

sfc-gh-joshi
Copy link
Contributor

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-1462359 and SNOW-1462481

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
  3. Please describe how your code solves the related issue.

This PR adds documentation for Series.str.* and Series.dt.* to Snowpark pandas. This is significant for two reasons:

  1. This is the first time we are documenting frontend classes that are defined exclusively in upstream OSS modin, as we no longer have vendored copies of StringMethods and DatetimeProperties.
  2. Series.str and Series.dt are technically separate classes, and a large amount of customization is necessary to render their methods under the Series.str/dt name rather than their respective accessor objects.

The changes here are based loosely off those made by native pandas for the same purpose: https://github.com/pandas-dev/pandas/blob/bbe0e531383358b44e94131482e122bda43b33d7/doc/source/conf.py#L484-L485

I could not get the changes of upstream pandas to work directly out of the box, though I am not entirely sure as to why. It could be a consequence of these classes being external to Snowpark, or just the fact that vanilla pandas has an Accessor class wrapping the Series.str and Series.dt properties that modin lacks.

See comments in docs/source/conf.py for a detailed explanation of what each class does.

@sfc-gh-joshi sfc-gh-joshi requested review from a team as code owners June 15, 2024 00:25
@sfc-gh-joshi sfc-gh-joshi added documentation Improvements or additions to documentation NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md labels Jun 15, 2024
@@ -117,9 +117,6 @@ the method in the left column.
+-----------------------------+---------------------------------+----------------------------------------------------+
| ``slice_replace`` | N | |
+-----------------------------+---------------------------------+----------------------------------------------------+
| ``replace`` | P | ``N`` if `pat` is non-string, `repl` is |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entry was redundant (it appears also on line 101, in the correct alphabetical location).

@@ -38,7 +38,7 @@ def split():

Returns
-------
Series, Index, DataFrame or MultiIndex
:class:`~snowflake.snowpark.modin.pandas.Series`, Index, :class:`~snowflake.snowpark.modin.pandas.DataFrame` or MultiIndex
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes a warning about DataFrame being ambiguous.

@sfc-gh-joshi
Copy link
Contributor Author

FYI @sfc-gh-vbudati we may be able to stop replacing "snowflake.snowpark.pandas" in jinja templates moving forward. Setting currentmodule to modin.pandas within the rst files should be sufficient, as this PR demonstrates that the docstring plugin is correctly applied to upstream modin classes.

@sfc-gh-stan sfc-gh-stan removed their request for review June 17, 2024 17:43
@sfc-gh-joshi sfc-gh-joshi added local testing Local Testing issues/PRs and removed local testing Local Testing issues/PRs labels Jun 17, 2024
Copy link
Contributor

@sfc-gh-helmeleegy sfc-gh-helmeleegy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Contributor

@sfc-gh-vbudati sfc-gh-vbudati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! Thanks for figuring this out, Jonathan!

@sfc-gh-joshi sfc-gh-joshi enabled auto-merge (squash) June 18, 2024 17:54
@sfc-gh-joshi sfc-gh-joshi force-pushed the joshi/gen-str-dt-docs branch from 1695e8c to 12e1def Compare June 18, 2024 17:59
@sfc-gh-joshi sfc-gh-joshi merged commit fcf8455 into main Jun 18, 2024
36 of 37 checks passed
@sfc-gh-joshi sfc-gh-joshi deleted the joshi/gen-str-dt-docs branch June 18, 2024 22:22
@github-actions github-actions bot locked and limited conversation to collaborators Jun 18, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation Improvements or additions to documentation NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md snowpark-pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants