SNOW-1320543 Merge modin docs into snowpark-python repo (#1461)
Please answer these questions before submitting your pull requests.
Thanks!

1. What GitHub issue is this PR addressing? Make sure that there is an
accompanying issue to your PR.

   Fixes SNOW-1320543

2. Fill out the following pre-review checklist:

   - [ ] I am adding new automated test(s) to verify correctness of my new code
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [ ] I am adding a new dependency

3. Please describe how your code solves the related issue.

Adds the modin/Snowpark pandas docs under `docs/source/modin`. The
autogenerated files reside under `docs/source/modin/pandas_api`.
Also adds a section for Snowpark pandas to `docs/source/index.rst`.

---------

Co-authored-by: Naren Krishna <[email protected]>
sfc-gh-vbudati and sfc-gh-nkrishna authored Apr 30, 2024
1 parent 9ccac9e commit 9e8e764
Showing 26 changed files with 2,751 additions and 22 deletions.
3 changes: 2 additions & 1 deletion .github/CODEOWNERS
@@ -2,8 +2,9 @@
/src/snowflake/snowpark/modin/ @snowflakedb/snowpandas
/tests/integ/modin/ @snowflakedb/snowpandas
/tests/unit/modin/ @snowflakedb/snowpandas
/docs/modin_api_coverage/ @snowflakedb/snowpandas
/docs/source/modin/ @snowflakedb/snowpandas
/.github/ @snowflakedb/snowpandas @snowflakedb/snowpark-python-api-reviewers
/docs/ @snowflakedb/snowpandas @snowflakedb/snowpark-python-api-reviewers
/scripts/ @snowflakedb/snowpandas @snowflakedb/snowpark-python-api-reviewers
setup.py @snowflakedb/snowpandas @snowflakedb/snowpark-python-api-reviewers
tox.ini @snowflakedb/snowpandas @snowflakedb/snowpark-python-api-reviewers
1 change: 1 addition & 0 deletions .gitignore
@@ -136,6 +136,7 @@ whitesource/
 docs/_build/
 # Ignore generated autosummary files created by Sphinx docs when you run make html in the docs directory.
 docs/source/snowpark/api/
+docs/source/modin/pandas_api/

 # Editor specific
 .idea/
5 changes: 5 additions & 0 deletions docs/Makefile
@@ -14,6 +14,11 @@ help:

 .PHONY: help Makefile

+view:
+	@$(SPHINXBUILD) -M clean "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+	@$(SPHINXBUILD) -M html "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+	open build/html/index.html
+
 # Catch-all target: route all unknown targets to Sphinx using the new
 # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
 %: Makefile
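
The `view` target shells out to `open`, which exists on macOS but is usually missing on Linux and Windows. As a rough sketch only (not part of this commit), the same three steps could be driven from Python with the standard-library `webbrowser` module. The function names `build_commands` and `view_docs`, and the `source` and `build` directory defaults, are assumptions made for illustration.

```python
import subprocess
import webbrowser
from pathlib import Path


def build_commands(source_dir: str = "source", build_dir: str = "build") -> list:
    """Return the two sphinx-build invocations that `make view` runs, in order."""
    return [
        ["sphinx-build", "-M", target, source_dir, build_dir]
        for target in ("clean", "html")
    ]


def view_docs(source_dir: str = "source", build_dir: str = "build") -> None:
    """Clean, rebuild, then open the landing page in the default browser."""
    for cmd in build_commands(source_dir, build_dir):
        # check=True stops at the first failing step instead of opening stale docs.
        subprocess.run(cmd, check=True)
    index = Path(build_dir, "html", "index.html").resolve()
    webbrowser.open(index.as_uri())
```

Calling `view_docs()` from the `docs` directory would mirror `make view`, assuming `sphinx-build` is on the `PATH` of the active virtual environment.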
3 changes: 3 additions & 0 deletions docs/README.md
@@ -23,6 +23,9 @@ python -m pip install sphinx

 Open the documentation: `open -a "Google Chrome" build/html/index.html`

+As a convenience, you can also use `make view` after activating your virtual environment, which runs `make clean`, `make html`, and opens the documentation with
+either your default browser, or the application you set as default for opening HTML files.
+
 Important files and directories:

 `docs/source/index.rst`: Specify which rst to include in the `index.html` landing page.
7 changes: 3 additions & 4 deletions docs/source/conf.py
@@ -40,7 +40,7 @@
     "sphinx.ext.autosummary",
     "sphinx.ext.napoleon",
     "sphinx.ext.coverage",
-    "sphinx.ext.linkcode"
+    "sphinx.ext.linkcode",
 ]

 # -- Options for autodoc --------------------------------------------------
@@ -95,6 +95,7 @@
 def linkcode_resolve(domain, info):
     import warnings, inspect, pkg_resources
     import snowflake.snowpark
+
     if domain != "py":
         return None

@@ -126,10 +127,8 @@ def linkcode_resolve(domain, info):
         source, lineno = inspect.getsourcelines(obj)
         linespec = f"#L{lineno}-L{lineno + len(source) - 1}"
     except TypeError:
-        linespec = ""
+        linespec = ""
     return (
         f"https://github.com/snowflakedb/snowpark-python/blob/"
         f"v{release}/{os.path.relpath(fn, start=os.pardir)}{linespec}"
     )
-
-
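
The `linespec` logic in this hunk can be tried in isolation. The sketch below uses `inspect.getsourcelines` the same way to build a GitHub-style `#L<start>-L<end>` fragment. The helper name `line_anchor` is hypothetical, and catching `OSError` as well as `TypeError` is an addition of this sketch (the hunk catches only `TypeError`).

```python
import inspect


def line_anchor(obj) -> str:
    """Return a '#L<start>-L<end>' fragment for obj, or '' if it has no source."""
    try:
        source, lineno = inspect.getsourcelines(obj)
    except (TypeError, OSError):
        # TypeError: built-ins and non-code objects; OSError: source file not found.
        return ""
    return f"#L{lineno}-L{lineno + len(source) - 1}"
```

For a built-in such as `len` the helper returns an empty string, so the generated link would point at the file without a line range.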


39 changes: 22 additions & 17 deletions docs/source/doc_gen.py
@@ -12,10 +12,7 @@
 import tempfile
 import itertools

-Class = namedtuple(
-    "Class",
-    ["module", "methods", "attributes"]
-)
+Class = namedtuple("Class", ["module", "methods", "attributes"])
 Module = namedtuple(
     "Module", ["name", "attributes", "functions", "classes", "exceptions"]
 )
@@ -33,10 +30,12 @@
 TAB = " "
 NEWLINE_TAB = f"\n{TAB}"
 RUBRIC_HEADER = ".. rubric::"
-AUTOSUMMARY_HEADER=".. autosummary::"
+AUTOSUMMARY_HEADER = ".. autosummary::"


-def autogen_and_parse_for_info(module_name: str, class_name: Optional[str] = None) -> Union[Module, Class]:
+def autogen_and_parse_for_info(
+    module_name: str, class_name: Optional[str] = None
+) -> Union[Module, Class]:
     if class_name:
         res = Class(module_name, [], [])
         name = f"{module_name}.{class_name}"
@@ -45,7 +44,6 @@ def autogen_and_parse_for_info(module_name: str, class_name: Optional[str] = Non
         name = module_name

     with tempfile.TemporaryDirectory() as tmpdir:
-
         rst_content = f"""
 .. currentmodule:: snowflake.snowpark
@@ -62,12 +60,11 @@ def autogen_and_parse_for_info(module_name: str, class_name: Optional[str] = Non
         with open(fname, "w") as fp:
             fp.write(rst_content)

-        output_dir = os.path.join(tmpdir, 'output')
+        output_dir = os.path.join(tmpdir, "output")
         subprocess.run(["sphinx-autogen", fname, "-o", output_dir, "-t", "_templates"])

         section = ""

-
         with open(f"{output_dir}/{name}.rst") as fp:
             for line in fp:
                 line = line.strip()
@@ -102,11 +99,15 @@ def generate_autosummary_section(section: str, content: str) -> str:
     return ""


-def generate_module_header(title:str, module:str) -> str:
-    automodule_text = "" if module=="snowflake.snowpark" else f"""
+def generate_module_header(title: str, module: str) -> str:
+    automodule_text = (
+        ""
+        if module == "snowflake.snowpark"
+        else f"""
 .. automodule:: {module}
     :noindex:
 """
+    )
     return f"""
 {'='*(len(title)+5)}
 {title}
"""


def generate_classes(title:str, module:str, classes: Iterable[str]) -> str:
def generate_classes(title: str, module: str, classes: Iterable[str]) -> str:
results = [autogen_and_parse_for_info(module, c) for c in classes]
names = NEWLINE_TAB.join(classes)
methods = NEWLINE_TAB.join(itertools.chain.from_iterable(c.methods for c in results))
methods = NEWLINE_TAB.join(
itertools.chain.from_iterable(c.methods for c in results)
)
attributes = NEWLINE_TAB.join(
itertools.chain.from_iterable(c.attributes for c in results)
)
@@ -135,7 +138,7 @@ def generate_classes(title:str, module:str, classes: Iterable[str]) -> str:
 """


-def generate_module(title:str, module: str) -> str:
+def generate_module(title: str, module: str) -> str:
     mod = autogen_and_parse_for_info(module)
     attributes = NEWLINE_TAB.join(mod.attributes)
     functions = NEWLINE_TAB.join(mod.functions)
@@ -161,9 +164,12 @@ def generate_module(title:str, module: str) -> str:
         "module", help="The module or the parent module of the classes to be documented"
     )
     parser.add_argument("-c", "--classes", nargs="*", help="Classes to be documented")
-    parser.add_argument("-t", "--title", help="Title of the rst file generated", default="PLACEHOLDER")
     parser.add_argument(
-        "-f", "--filename", help="File to write the generated content to")
+        "-t", "--title", help="Title of the rst file generated", default="PLACEHOLDER"
+    )
+    parser.add_argument(
+        "-f", "--filename", help="File to write the generated content to"
+    )

     args = parser.parse_args()
     if args.classes:
@@ -176,4 +182,3 @@ def generate_module(title:str, module: str) -> str:
             fp.write(content)
     else:
         print(content)
-
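
The hunks above show the constants and the `NEWLINE_TAB.join(...)` pattern, but not the body of `generate_autosummary_section`. As a guess at the general shape only, the sketch below renders a list of names into a rubric plus an autosummary block. The function `render_autosummary`, its `toctree` parameter, and the four-space `TAB` are assumptions; the real script may differ.

```python
from typing import Iterable

TAB = "    "  # assumed four-space indent; the scraped diff shows a collapsed literal
NEWLINE_TAB = f"\n{TAB}"
RUBRIC_HEADER = ".. rubric::"
AUTOSUMMARY_HEADER = ".. autosummary::"


def render_autosummary(section: str, names: Iterable[str], toctree: str = "api/") -> str:
    """Render names as an indented autosummary block under a rubric heading."""
    names = list(names)
    if not names:
        # Mirror the `return ""` visible in the diff: empty sections emit nothing.
        return ""
    entries = NEWLINE_TAB.join(names)
    return (
        f"{RUBRIC_HEADER} {section}\n\n"
        f"{AUTOSUMMARY_HEADER}\n"
        f"{TAB}:toctree: {toctree}\n\n"
        f"{TAB}{entries}\n"
    )
```

Joining with `NEWLINE_TAB` indents every entry after the first, so only the first entry needs an explicit `TAB` prefix.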

1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -11,5 +11,6 @@ information, see the `Snowpark Developer Guide for Python <https://docs.snowflak

    snowpark/session
    snowpark/index
+   modin/index

 :ref:`genindex`
209 changes: 209 additions & 0 deletions docs/source/modin/dataframe.rst
@@ -0,0 +1,209 @@
+=============================
+DataFrame
+=============================
+
+.. currentmodule:: snowflake.snowpark.modin.pandas
+.. rubric:: :doc:`All supported DataFrame APIs <supported/dataframe_supported>`
+
+.. rubric:: Constructor
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame
+
+.. rubric:: Attributes
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.index
+    DataFrame.columns
+    DataFrame.dtypes
+    DataFrame.info
+    DataFrame.select_dtypes
+    DataFrame.values
+    DataFrame.axes
+    DataFrame.ndim
+    DataFrame.size
+    DataFrame.shape
+    DataFrame.empty
+
+.. rubric:: Conversion
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.astype
+    DataFrame.convert_dtypes
+    DataFrame.copy
+
+.. rubric:: Indexing, iteration
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.head
+    DataFrame.loc
+    DataFrame.iloc
+    DataFrame.insert
+    DataFrame.__iter__
+    DataFrame.keys
+    DataFrame.iterrows
+    DataFrame.itertuples
+    DataFrame.tail
+    DataFrame.isin
+    DataFrame.where
+    DataFrame.mask
+
+.. rubric:: Binary operator functions
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.add
+    DataFrame.sub
+    DataFrame.mul
+    DataFrame.div
+    DataFrame.truediv
+    DataFrame.floordiv
+    DataFrame.mod
+    DataFrame.pow
+    DataFrame.radd
+    DataFrame.rsub
+    DataFrame.rmul
+    DataFrame.rdiv
+    DataFrame.rtruediv
+    DataFrame.rfloordiv
+    DataFrame.rmod
+    DataFrame.rpow
+    DataFrame.round
+    DataFrame.lt
+    DataFrame.gt
+    DataFrame.le
+    DataFrame.ge
+    DataFrame.ne
+    DataFrame.eq
+
+.. rubric:: Function application, GroupBy & window
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.apply
+    DataFrame.applymap
+    DataFrame.agg
+    DataFrame.aggregate
+    DataFrame.transform
+    DataFrame.groupby
+    DataFrame.rolling
+
+.. rubric:: Computations / descriptive stats
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.abs
+    DataFrame.all
+    DataFrame.any
+    DataFrame.count
+    DataFrame.cummax
+    DataFrame.cummin
+    DataFrame.cumsum
+    DataFrame.describe
+    DataFrame.diff
+    DataFrame.max
+    DataFrame.mean
+    DataFrame.median
+    DataFrame.min
+    DataFrame.quantile
+    DataFrame.rank
+    DataFrame.round
+    DataFrame.skew
+    DataFrame.sum
+    DataFrame.std
+    DataFrame.var
+    DataFrame.nunique
+    DataFrame.value_counts
+
+
+.. rubric:: Reindexing / selection / label manipulation
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.add_prefix
+    DataFrame.add_suffix
+    DataFrame.drop
+    DataFrame.drop_duplicates
+    DataFrame.duplicated
+    DataFrame.first
+    DataFrame.get
+    DataFrame.head
+    DataFrame.idxmax
+    DataFrame.idxmin
+    DataFrame.last
+    DataFrame.rename
+    DataFrame.rename_axis
+    DataFrame.reset_index
+    DataFrame.sample
+    DataFrame.set_axis
+    DataFrame.set_index
+    DataFrame.tail
+    DataFrame.take
+
+.. rubric:: Missing data handling
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.dropna
+    DataFrame.ffill
+    DataFrame.fillna
+    DataFrame.isna
+    DataFrame.isnull
+    DataFrame.notna
+    DataFrame.notnull
+    DataFrame.pad
+    DataFrame.replace
+
+.. rubric:: Reshaping, sorting, transposing
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.pivot_table
+    DataFrame.sort_values
+    DataFrame.sort_index
+    DataFrame.melt
+    DataFrame.squeeze
+    DataFrame.T
+    DataFrame.transpose
+
+.. rubric:: Combining / comparing / joining / merging
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.join
+    DataFrame.merge
+    DataFrame.update
+
+.. rubric:: Time Series-related
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.shift
+    DataFrame.first_valid_index
+    DataFrame.last_valid_index
+    DataFrame.resample
+
+.. rubric:: Serialization / IO / conversion
+
+.. autosummary::
+    :toctree: pandas_api/
+
+    DataFrame.to_pandas
+    DataFrame.to_snowflake
+    DataFrame.to_snowpark
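
A few names in this listing appear under more than one rubric (`DataFrame.round`, `DataFrame.head`, `DataFrame.tail`), which may well be intentional. For reviewers who want to check such cross-listings, here is a minimal sketch that collects autosummary entries from an `.rst` file's text and reports repeats. The helper names are hypothetical, and the parser is deliberately naive: it treats every non-option line after `.. autosummary::` as an entry until the next directive.

```python
from collections import Counter


def autosummary_entries(rst_text: str) -> list:
    """Collect the names listed under every ``.. autosummary::`` directive."""
    entries = []
    in_block = False
    for raw in rst_text.splitlines():
        line = raw.strip()
        if line.startswith(".. autosummary::"):
            in_block = True
        elif line.startswith(".. "):
            in_block = False  # any other directive ends the current block
        elif in_block and line and not line.startswith(":"):
            entries.append(line)  # skip blanks and options such as :toctree:
    return entries


def repeated_entries(rst_text: str) -> list:
    """Return, sorted, the entries that are listed more than once."""
    counts = Counter(autosummary_entries(rst_text))
    return sorted(name for name, n in counts.items() if n > 1)
```

It expects the plain file contents rather than diff output, so any leading `+` markers would need to be stripped first.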