SNOW-1830652 - support np.sqrt and other easy unary functions (#2695)
End-user request for np.sqrt support. Opportunistically added a number
of other easy unary functions, such as the trig functions (sin, cos, etc.),
as well as floor, ceil, trunc, exp, abs, negative, and positive.
2. Fill out the following pre-review checklist:

- [x] I am adding a new automated test(s) to verify correctness of my
new code
- [ ] If this test skips Local Testing mode, I'm requesting review from
@snowflakedb/local-testing
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [ ] I am adding a new dependency
- [ ] If this is a new feature/behavior, I'm adding the Local Testing
parity changes.
- [x] I acknowledge that I have ensured my changes to be thread-safe.
Follow the link for more information: [Thread-safe Developer
Guidelines](https://github.com/snowflakedb/snowpark-python/blob/main/CONTRIBUTING.md#thread-safe-development)
sfc-gh-jkew authored Dec 2, 2024
1 parent ce1d22b commit 09ccd2e
Showing 6 changed files with 92 additions and 24 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -62,9 +62,10 @@
 `collections.abc.Mapping`. No support for instances of `dict` that implement
 `__missing__` but are not instances of `collections.defaultdict`.
 - Added support for `DataFrame.align` and `Series.align` for `axis=1` and `axis=None`.
-- Added support fot `pd.json_normalize`.
+- Added support for `pd.json_normalize`.
 - Added support for `GroupBy.pct_change` with `axis=0`, `freq=None`, and `limit=None`.
 - Added support for `DataFrameGroupBy.__iter__` and `SeriesGroupBy.__iter__`.
+- Added support for `np.sqrt`, `np.trunc`, `np.floor`, numpy trig functions, `np.exp`, `np.abs`, `np.positive` and `np.negative`.

 #### Dependency Updates

30 changes: 30 additions & 0 deletions docs/source/modin/numpy.rst
@@ -30,6 +30,10 @@ NumPy ufuncs called with Snowpark pandas arguments will ignore kwargs.
 +-----------------------------+----------------------------------------------------+
 | ``np.may_share_memory`` | Returns False |
 +-----------------------------+----------------------------------------------------+
+| ``np.abs`` | Mapped to df.abs() |
++-----------------------------+----------------------------------------------------+
+| ``np.absolute`` | Mapped to df.abs() |
++-----------------------------+----------------------------------------------------+
 | ``np.add`` | Mapped to df.__add__(df2) |
 +-----------------------------+----------------------------------------------------+
 | ``np.subtract`` | Mapped to df.__sub__(df2) |
@@ -38,6 +42,8 @@ NumPy ufuncs called with Snowpark pandas arguments will ignore kwargs.
 +-----------------------------+----------------------------------------------------+
 | ``np.divide`` | Mapped to df.__truediv__(df2) |
 +-----------------------------+----------------------------------------------------+
+| ``np.exp`` | Mapped to df.apply(snowpark.functions.exp) |
++-----------------------------+----------------------------------------------------+
 | ``np.true_divide`` | Mapped to df.__truediv__(df2) |
 +-----------------------------+----------------------------------------------------+
 | ``np.float_power`` | Mapped to df.__pow__(df2) |
@@ -50,6 +56,18 @@ NumPy ufuncs called with Snowpark pandas arguments will ignore kwargs.
 +-----------------------------+----------------------------------------------------+
 | ``np.mod`` | Mapped to df.__mod__(df2) |
 +-----------------------------+----------------------------------------------------+
+| ``np.negative`` | Mapped to -df |
++-----------------------------+----------------------------------------------------+
+| ``np.positive`` | Mapped to df |
++-----------------------------+----------------------------------------------------+
+| ``np.trunc`` | Mapped to df.apply(snowpark.functions.trunc) |
++-----------------------------+----------------------------------------------------+
+| ``np.sqrt`` | Mapped to df.apply(snowpark.functions.sqrt) |
++-----------------------------+----------------------------------------------------+
+| ``np.ceil`` | Mapped to df.apply(snowpark.functions.ceil) |
++-----------------------------+----------------------------------------------------+
+| ``np.floor`` | Mapped to df.apply(snowpark.functions.floor) |
++-----------------------------+----------------------------------------------------+
 | ``np.remainder`` | Mapped to df.__mod__(df2) |
 +-----------------------------+----------------------------------------------------+
 | ``np.greater`` | Mapped to df > df2 |
@@ -72,6 +90,18 @@ NumPy ufuncs called with Snowpark pandas arguments will ignore kwargs.
 +-----------------------------+----------------------------------------------------+
 | ``np.logical_not`` | Mapped to ~df.astype(bool) |
 +-----------------------------+----------------------------------------------------+
+| ``np.sin`` | Mapped to df.apply(snowpark.functions.sin) |
++-----------------------------+----------------------------------------------------+
+| ``np.cos`` | Mapped to df.apply(snowpark.functions.cos) |
++-----------------------------+----------------------------------------------------+
+| ``np.tan`` | Mapped to df.apply(snowpark.functions.tan) |
++-----------------------------+----------------------------------------------------+
+| ``np.sinh`` | Mapped to df.apply(snowpark.functions.sinh) |
++-----------------------------+----------------------------------------------------+
+| ``np.cosh`` | Mapped to df.apply(snowpark.functions.cosh) |
++-----------------------------+----------------------------------------------------+
+| ``np.tanh`` | Mapped to df.apply(snowpark.functions.tanh) |
++-----------------------------+----------------------------------------------------+

 NEP18 Implementation Details
 ----------------------------
20 changes: 20 additions & 0 deletions src/snowflake/snowpark/modin/plugin/_internal/apply_utils.py
@@ -40,6 +40,16 @@
     to_variant,
     when,
     udtf,
+    exp,
+    cos,
+    tan,
+    sinh,
+    cosh,
+    tanh,
+    ceil,
+    floor,
+    trunc,
+    sqrt,
 )
 from snowflake.snowpark.modin.plugin._internal.frame import InternalFrame
 from snowflake.snowpark.modin.plugin._internal.ordered_dataframe import (
@@ -85,11 +95,21 @@
 cloudpickle.register_pickle_by_value(sys.modules[__name__])

 SUPPORTED_SNOWPARK_PYTHON_FUNCTIONS_IN_APPLY = {
+    exp,
     ln,
     log,
     _log2,
     _log10,
     sin,
+    cos,
+    tan,
+    sinh,
+    cosh,
+    tanh,
+    ceil,
+    floor,
+    trunc,
+    sqrt,
     snowflake_cortex_summarize,
 }
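
Note: the set above appears to act as an allow-list for `apply`. Callables outside it are rejected, which is what the `NotImplementedError` test further down exercises. A minimal sketch of that pattern follows, assuming `math` functions as stand-ins for the Snowpark functions and a hypothetical `apply_supported` helper. Neither is the library's actual implementation.

```python
import math

# Stand-ins for Snowpark functions (assumption for illustration); the real set
# holds snowflake.snowpark.functions callables such as sqrt and floor.
SUPPORTED_FUNCTIONS_IN_APPLY = {
    math.exp,
    math.sqrt,
    math.floor,
    math.ceil,
    math.trunc,
}


def apply_supported(values, func):
    """Apply func element-wise if it is allow-listed (hypothetical helper)."""
    if func not in SUPPORTED_FUNCTIONS_IN_APPLY:
        raise NotImplementedError(f"{func.__name__} is not supported in apply")
    return [func(v) for v in values]


# sqrt is in the allow-list, so this succeeds.
roots = apply_supported([1, 4, 9], math.sqrt)
```

Under this sketch, adding a function to the set is all it takes to make `apply` accept it, which matches the shape of the change in this hunk.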

32 changes: 17 additions & 15 deletions src/snowflake/snowpark/modin/plugin/utils/numpy_to_pandas.py
@@ -195,46 +195,46 @@ def map_to_bools(inputs: Any) -> Any:
     "logaddexp2": NotImplemented,
     "true_divide": lambda obj, inputs: obj.__truediv__(*inputs),
     "floor_divide": lambda obj, inputs: obj.__floordiv__(*inputs),
-    "negative": NotImplemented,
-    "positive": NotImplemented,
+    "negative": lambda obj, inputs: -obj,
+    "positive": lambda obj, inputs: obj,
     "power": NotImplemented, # cannot use obj.__pow__
     "float_power": lambda obj, inputs: obj.__pow__(*inputs),
     "remainder": lambda obj, inputs: obj.__mod__(*inputs),
     "mod": lambda obj, inputs: obj.__mod__(*inputs),
     "fmod": NotImplemented,
     "divmod": NotImplemented,
-    "absolute": NotImplemented,
-    "fabs": NotImplemented,
+    "absolute": lambda obj, inputs: obj.abs(),
+    "abs": lambda obj, inputs: obj.abs(),
     "rint": NotImplemented,
     "sign": NotImplemented,
     "heaviside": NotImplemented, # heaviside step function
     "conj": NotImplemented, # same as conjugate
     "conjugate": NotImplemented,
-    "exp": NotImplemented,
+    "exp": lambda obj, inputs: obj.apply(sp_func.exp),
     "exp2": NotImplemented,
     "log": lambda obj, inputs: obj.apply(sp_func.ln), # use built-in function
     "log2": lambda obj, inputs: obj.apply(sp_func._log2),
     "log10": lambda obj, inputs: obj.apply(sp_func._log10),
     "expm1": NotImplemented,
     "log1p": NotImplemented,
-    "sqrt": NotImplemented,
+    "sqrt": lambda obj, inputs: obj.apply(sp_func.sqrt),
     "square": NotImplemented,
     "cbrt": NotImplemented, # Cube root
     "reciprocal": NotImplemented,
     "gcd": NotImplemented,
     "lcm": NotImplemented,
     # trigonometric functions
-    "sin": NotImplemented,
-    "cos": NotImplemented,
-    "tan": NotImplemented,
+    "sin": lambda obj, inputs: obj.apply(sp_func.sin),
+    "cos": lambda obj, inputs: obj.apply(sp_func.cos),
+    "tan": lambda obj, inputs: obj.apply(sp_func.tan),
     "arcsin": NotImplemented,
     "arccos": NotImplemented,
     "arctan": NotImplemented,
     "arctan2": NotImplemented,
     "hypot": NotImplemented,
-    "sinh": NotImplemented,
-    "cosh": NotImplemented,
-    "tanh": NotImplemented,
+    "sinh": lambda obj, inputs: obj.apply(sp_func.sinh),
+    "cosh": lambda obj, inputs: obj.apply(sp_func.cosh),
+    "tanh": lambda obj, inputs: obj.apply(sp_func.tanh),
     "arcsinh": NotImplemented,
     "arccosh": NotImplemented,
     "arctanh": NotImplemented,
@@ -282,7 +282,9 @@ def map_to_bools(inputs: Any) -> Any:
     "ldexp": NotImplemented,
     "frexp": NotImplemented,
     "fmod": NotImplemented,
-    "floor": NotImplemented,
-    "ceil": NotImplemented,
-    "trunc": NotImplemented,
+    "floor": lambda obj, inputs: obj.apply(sp_func.floor),
+    "ceil": lambda obj, inputs: obj.apply(sp_func.ceil),
+    "trunc": lambda obj, inputs: obj.apply(
+        sp_func.trunc
+    ), # df.truncate not supported in snowpandas yet
 }
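
Note: the table above maps a ufunc name to a callable taking `(obj, inputs)`, with `NotImplemented` marking unsupported ufuncs. A minimal sketch of how such a table could be consulted follows, assuming dispatch goes through NumPy's `__array_ufunc__` protocol. `ToySeries` and `UFUNC_MAP` are hypothetical stand-ins, and `np.sqrt` applied per element replaces the Snowpark function, so this is an illustration rather than the plugin's actual code.

```python
import numpy as np

# Hypothetical, trimmed-down table in the style of the mapping above:
# ufunc name -> callable(obj, inputs), or NotImplemented when unsupported.
UFUNC_MAP = {
    "negative": lambda obj, inputs: -obj,
    "positive": lambda obj, inputs: obj,
    "absolute": lambda obj, inputs: obj.abs(),
    "sqrt": lambda obj, inputs: obj.apply(np.sqrt),
    "exp2": NotImplemented,
}


class ToySeries:
    """Toy stand-in for a Snowpark pandas Series (illustration only)."""

    def __init__(self, values):
        self.values = list(values)

    def apply(self, func):
        return ToySeries(func(v) for v in self.values)

    def abs(self):
        return ToySeries(abs(v) for v in self.values)

    def __neg__(self):
        return ToySeries(-v for v in self.values)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        handler = UFUNC_MAP.get(ufunc.__name__, NotImplemented)
        if method != "__call__" or handler is NotImplemented:
            # NumPy turns this into a TypeError for the caller.
            return NotImplemented
        # inputs[0] is self; pass any remaining operands and ignore kwargs.
        return handler(self, inputs[1:])


s = ToySeries([1.0, 4.0, 9.0])
roots = np.sqrt(s)         # looked up under "sqrt"
negated = np.negative(s)   # looked up under "negative"
```

In NumPy, `np.abs` is an alias of `np.absolute`, so in this sketch both calls arrive with `ufunc.__name__ == "absolute"`.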
16 changes: 8 additions & 8 deletions tests/integ/modin/test_apply_snowpark_python_functions.py
@@ -53,22 +53,22 @@ def test_apply_log10():

 @sql_count_checker(query_count=0)
 def test_apply_snowpark_python_function_not_implemented():
-    from snowflake.snowpark.functions import cos, sin
+    from snowflake.snowpark.functions import desc, asc

     with pytest.raises(NotImplementedError):
-        pd.Series([1, 2, 3]).apply(cos)
+        pd.Series([1, 2, 3]).apply(desc)
     with pytest.raises(NotImplementedError):
-        pd.Series([1, 2, 3]).to_frame().applymap(sin, na_action="ignore")
+        pd.Series([1, 2, 3]).to_frame().apply(asc, na_action="ignore")
     with pytest.raises(NotImplementedError):
-        pd.Series([1, 2, 3]).to_frame().applymap(sin, args=[1, 2])
+        pd.Series([1, 2, 3]).to_frame().applymap(asc, args=[1, 2])
     with pytest.raises(NotImplementedError):
-        pd.DataFrame({"a": [1, 2, 3]}).apply(cos)
+        pd.DataFrame({"a": [1, 2, 3]}).apply(desc)
     with pytest.raises(NotImplementedError):
-        pd.DataFrame({"a": [1, 2, 3]}).apply(sin, raw=True)
+        pd.DataFrame({"a": [1, 2, 3]}).apply(asc, raw=True)
     with pytest.raises(NotImplementedError):
-        pd.DataFrame({"a": [1, 2, 3]}).apply(sin, axis=1)
+        pd.DataFrame({"a": [1, 2, 3]}).apply(asc, axis=1)
     with pytest.raises(NotImplementedError):
-        pd.DataFrame({"a": [1, 2, 3]}).apply(sin, args=(1, 2))
+        pd.DataFrame({"a": [1, 2, 3]}).apply(asc, args=(1, 2))


 @sql_count_checker(query_count=1)
15 changes: 15 additions & 0 deletions tests/integ/modin/test_numpy.py
@@ -199,6 +199,21 @@ def test_np_ufunc_binop_operators(np_ufunc):
         np.log,
         np.log2,
         np.log10,
+        np.trunc,
+        np.ceil,
+        np.floor,
+        np.sin,
+        np.cos,
+        np.tan,
+        np.sinh,
+        np.cosh,
+        np.tanh,
+        np.sqrt,
+        np.exp,
+        np.abs,
+        np.absolute,
+        np.positive,
+        np.negative,
     ],
 )
 def test_np_ufunc_unary_operators(np_ufunc):
