[SNOW-1826257]: Refactor docs to provide one place for supported aggregation functions #2680

Merged 8 commits on Dec 16, 2024
62 changes: 62 additions & 0 deletions docs/source/modin/supported/agg_supp.rst
@@ -0,0 +1,62 @@
:orphan:

Supported Aggregation Functions
====================================

This page lists the aggregation functions supported by ``DataFrame.agg``,
``Series.agg``, ``DataFrameGroupBy.agg``, and ``SeriesGroupBy.agg``.
In the table below, the first column gives the aggregation function's name. The remaining four
columns indicate whether that function is supported by ``DataFrame.agg``, ``Series.agg``,
``DataFrameGroupBy.agg``, and ``SeriesGroupBy.agg``, in that order.

.. note::
    ``Y`` stands for yes (supports distributed implementation), ``N`` stands for no (API simply errors out),
    and ``P`` stands for partial (meaning some parameters may not be supported yet).

Both Python builtin and NumPy functions are supported for ``DataFrameGroupBy.agg`` and ``SeriesGroupBy.agg``.
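
To illustrate the sentence above, here is a minimal sketch that passes Python builtins (``len``, ``max``) and a NumPy function (``np.sum``) to a group-by ``agg``. It uses plain pandas as a local stand-in so that it runs without a Snowflake session. That substitution, the column names, and the data are assumptions made for illustration, and the sketch does not by itself demonstrate Snowpark pandas behavior.

```python
import numpy as np
import pandas as pd  # plain pandas stand-in (assumption); Snowpark pandas needs a live session

# Hypothetical data for illustration only.
df = pd.DataFrame({"team": ["a", "a", "a", "b", "b"], "score": [1, 2, 3, 10, 20]})
scores_by_team = df.groupby("team")["score"]  # a SeriesGroupBy

group_sizes = scores_by_team.agg(len)      # Python builtin
group_maxima = scores_by_team.agg(max)     # Python builtin
group_totals = scores_by_team.agg(np.sum)  # NumPy function

print(group_sizes.to_dict())
print(group_maxima.to_dict())
print(group_totals.to_dict())
```

Recent pandas releases may emit a ``FutureWarning`` when a builtin or NumPy callable is passed this way; the aggregated values are unaffected.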

+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| Aggregation Function        | ``DataFrame.agg`` supports? (Y/N/P) | ``Series.agg`` supports? (Y/N/P) | ``DataFrameGroupBy.agg`` supports? (Y/N/P) | ``SeriesGroupBy.agg`` supports? (Y/N/P) |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``count``                   | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   |
|                             | For ``axis=1``, ``Y`` if index is   |                                  |                                            |                                         |
|                             | not a MultiIndex.                   |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``mean``                    | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   |
|                             | ``N`` for ``axis=1``.               |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``min``                     | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   |
|                             | For ``axis=1``, ``Y`` if index is   |                                  |                                            |                                         |
|                             | not a MultiIndex.                   |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``max``                     | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   |
|                             | For ``axis=1``, ``Y`` if index is   |                                  |                                            |                                         |
|                             | not a MultiIndex.                   |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``sum``                     | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   |
|                             | For ``axis=1``, ``Y`` if index is   |                                  |                                            |                                         |
|                             | not a MultiIndex.                   |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``median``                  | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   |
|                             | ``N`` for ``axis=1``.               |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``size``                    | ``Y`` for ``axis=0``.               | ``Y``                            | ``Y``                                      | ``Y``                                   |
|                             | ``N`` for ``axis=1``.               |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``std``                     | ``P`` for ``axis=0`` - only when    | ``P`` - only when ``ddof=0``     | ``P`` - only when ``ddof=0``               | ``P`` - only when ``ddof=0``            |
|                             | ``ddof=0`` or ``ddof=1``.           | or ``ddof=1``.                   | or ``ddof=1``.                             | or ``ddof=1``.                          |
|                             | ``N`` for ``axis=1``.               |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``var``                     | ``P`` for ``axis=0`` - only when    | ``P`` - only when ``ddof=0``     | ``P`` - only when ``ddof=0``               | ``P`` - only when ``ddof=0``            |
|                             | ``ddof=0`` or ``ddof=1``.           | or ``ddof=1``.                   | or ``ddof=1``.                             | or ``ddof=1``.                          |
|                             | ``N`` for ``axis=1``.               |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``quantile``                | ``P`` for ``axis=0`` - only when    | ``P`` - only when ``q`` is the   | ``P`` - only when ``q`` is the             | ``P`` - only when ``q`` is the          |
|                             | ``q`` is the default value or       | default value or a scalar.       | default value or a scalar.                 | default value or a scalar.              |
|                             | a scalar.                           |                                  |                                            |                                         |
|                             | ``N`` for ``axis=1``.               |                                  |                                            |                                         |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
| ``len``                     | ``N``                               | ``N``                            | ``Y``                                      | ``Y``                                   |
+-----------------------------+-------------------------------------+----------------------------------+--------------------------------------------+-----------------------------------------+
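
The table above can also be read as a small lookup. The sketch below encodes its flags in plain Python so they can be queried programmatically. The helper ``support_flag`` and its parameters ``axis`` and ``multiindex`` are hypothetical names introduced here for illustration; they are not part of Snowpark pandas, and the helper only mirrors what the table states.

```python
# Illustrative encoding of the table above; not part of Snowpark pandas.
GROUPBY_APIS = {"DataFrameGroupBy.agg", "SeriesGroupBy.agg"}
ALL_APIS = GROUPBY_APIS | {"DataFrame.agg", "Series.agg"}

FULLY_SUPPORTED = {"count", "mean", "min", "max", "sum", "median", "size"}
PARTIALLY_SUPPORTED = {"std", "var", "quantile"}  # restricted ddof / q values
AXIS1_SUPPORTED = {"count", "min", "max", "sum"}  # DataFrame.agg with axis=1


def support_flag(func: str, api: str, axis: int = 0, multiindex: bool = False) -> str:
    """Return the table's flag ("Y", "N" or "P") for one function/API pair.

    ``axis`` and ``multiindex`` only matter for ``DataFrame.agg``.
    """
    if api not in ALL_APIS:
        raise ValueError(f"unknown API: {api!r}")
    if func == "len":
        # ``len`` is listed as supported only by the group-by variants.
        return "Y" if api in GROUPBY_APIS else "N"
    if func not in FULLY_SUPPORTED | PARTIALLY_SUPPORTED:
        raise KeyError(f"{func!r} is not listed in the table")
    if api == "DataFrame.agg" and axis == 1:
        # Row-wise aggregation: only four functions, and no MultiIndex.
        return "Y" if func in AXIS1_SUPPORTED and not multiindex else "N"
    return "Y" if func in FULLY_SUPPORTED else "P"


print(support_flag("sum", "DataFrame.agg", axis=1))
print(support_flag("std", "SeriesGroupBy.agg"))
```

Functions that the table does not list raise ``KeyError`` rather than returning ``N``, since the table makes no statement about them.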
12 changes: 3 additions & 9 deletions docs/source/modin/supported/dataframe_supported.rst
@@ -65,15 +65,9 @@ Methods
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``add_suffix``              | Y                               |                                  |                                                    |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
-| ``agg``                     | P                               | ``margins``, ``observed``,       | If ``axis == 0``: ``Y`` when function is one of    |
-|                             |                                 | ``sort``                         | ``count``, ``mean``, ``min``, ``max``, ``sum``,    |
-|                             |                                 |                                  | ``median``, ``size``; ``std`` and ``var``          |
-|                             |                                 |                                  | supported with ``ddof=0`` or ``ddof=1``;           |
-|                             |                                 |                                  | ``quantile`` is supported when ``q`` is the        |
-|                             |                                 |                                  | default value or a scalar.                         |
-|                             |                                 |                                  | If ``axis == 1``: ``Y`` when function is           |
-|                             |                                 |                                  | ``count``, ``min``, ``max``, or ``sum`` and the    |
-|                             |                                 |                                  | index is not a MultiIndex.                         |
+| ``agg``                     | P                               | ``margins``, ``observed``,       | Check                                              |
+|                             |                                 | ``sort``                         | `Supported Aggregation Functions <agg_supp.html>`_ |
+|                             |                                 |                                  | for a list of supported functions.                 |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``aggregate``               | P                               | ``margins``, ``observed``,       | See ``agg``                                        |
 |                             |                                 | ``sort``                         |                                                    |
7 changes: 3 additions & 4 deletions docs/source/modin/supported/groupby_supported.rst
@@ -30,10 +30,9 @@ Function application
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | GroupBy method              | Snowpark implemented? (Y/N/P/D) | Missing parameters               | Notes for current implementation                   |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
-| ``agg``                     | P                               | ``axis`` other than 0 is not     | ``Y``, support functions are count, mean, min, max,|
-|                             |                                 | implemented.                     | sum, median, std, size, len, and var               |
-|                             |                                 |                                  | (including both Python and NumPy functions)        |
-|                             |                                 |                                  | otherwise ``N``.                                   |
+| ``agg``                     | P                               | ``axis`` other than 0 is not     | Check                                              |
+|                             |                                 | implemented.                     | `Supported Aggregation Functions <agg_supp.html>`_ |
+|                             |                                 |                                  | for a list of supported functions.                 |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``aggregate``               | P                               | ``axis`` other than 0 is not     | See ``agg``                                        |
 |                             |                                 | implemented.                     |                                                    |
9 changes: 3 additions & 6 deletions docs/source/modin/supported/series_supported.rst
@@ -76,12 +76,9 @@ Methods
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``add_suffix``              | Y                               |                                  |                                                    |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
-| ``agg``                     | P                               |                                  | ``Y`` when function is one of ``count``,           |
-|                             |                                 |                                  | ``mean``, ``min``, ``max``, ``sum``, ``median``,   |
-|                             |                                 |                                  | ``size``; ``std`` and ``var`` supported with       |
-|                             |                                 |                                  | ``ddof=0`` or ``ddof=1``; ``quantile`` is          |
-|                             |                                 |                                  | supported when ``q`` is the default value          |
-|                             |                                 |                                  | or a scalar.                                       |
+| ``agg``                     | P                               |                                  | Check                                              |
+|                             |                                 |                                  | `Supported Aggregation Functions <agg_supp.html>`_ |
+|                             |                                 |                                  | for a list of supported functions.                 |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
 | ``aggregate``               | P                               |                                  | See ``agg``                                        |
 +-----------------------------+---------------------------------+----------------------------------+----------------------------------------------------+
2 changes: 2 additions & 0 deletions tests/integ/modin/groupby/conftest.py
@@ -27,6 +27,8 @@
     lambda gr: gr.median(),
     lambda gr: gr.var(),
     lambda gr: gr.var(ddof=0),
+    lambda gr: gr.quantile(),
+    lambda gr: gr.quantile(q=0.3),
 ]
 all_agg_methods = result_compatible_agg_methods + int_to_decimal_float_agg_methods
