
Snowflake Snowpark Python and Snowpark pandas APIs

The Snowpark library provides intuitive APIs for querying and processing data in a data pipeline. Using this library, you can build applications that process data in Snowflake without having to move data to the system where your application code runs.

Getting started

Have your Snowflake account ready

If you don't have a Snowflake account yet, you can sign up for a 30-day free trial account.

Create a Python virtual environment

You can use miniconda, anaconda, or virtualenv to create a Python 3.8, 3.9, 3.10 or 3.11 virtual environment.

For Snowpark pandas, only Python 3.9, 3.10, or 3.11 is supported.

For the best experience when using Snowpark with UDFs, we recommend creating a local conda environment with the Snowflake channel.
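Before installing, it can help to confirm that the interpreter in your environment falls within the supported versions listed above. The following is a minimal sketch using only the standard library; the helper name `is_supported` and the two constants are hypothetical and not part of Snowpark.

```python
import sys

# Supported (major, minor) versions, taken from the text above.
SNOWPARK_PYTHON_VERSIONS = {(3, 8), (3, 9), (3, 10), (3, 11)}
SNOWPARK_PANDAS_VERSIONS = {(3, 9), (3, 10), (3, 11)}


def is_supported(version=None, use_snowpark_pandas=False):
    """Return True if the given (major, minor) version is in the supported set.

    Hypothetical helper: defaults to the running interpreter's version.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    if use_snowpark_pandas:
        return major_minor in SNOWPARK_PANDAS_VERSIONS
    return major_minor in SNOWPARK_PYTHON_VERSIONS


# Check the running interpreter for both APIs.
print("Snowpark Python:", is_supported())
print("Snowpark pandas:", is_supported(use_snowpark_pandas=True))
```

The version sets simply mirror this README and would need updating whenever the supported range changes.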

Install the library to the Python virtual environment

pip install snowflake-snowpark-python

To use the Snowpark pandas API, install the library with the optional modin extra, which installs modin in the same environment. The Snowpark pandas API provides a familiar interface for pandas users to query and process data directly in Snowflake.

pip install "snowflake-snowpark-python[modin]"

Create a session and use the Snowpark Python API

from snowflake.snowpark import Session

connection_parameters = {
  "account": "<your snowflake account>",
  "user": "<your snowflake user>",
  "password": "<your snowflake password>",
  "role": "<snowflake user role>",
  "warehouse": "<snowflake warehouse>",
  "database": "<snowflake database>",
  "schema": "<snowflake schema>"
}

session = Session.builder.configs(connection_parameters).create()
# Create a Snowpark dataframe from input data
df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"]) 
df = df.filter(df.a > 1)
result = df.collect()
df.show()

# -------------
# |"A"  |"B"  |
# -------------
# |3    |4    |
# -------------
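The example above writes credentials directly into the script. One possible alternative is to read them from environment variables, so that passwords stay out of source control. This is only a sketch: the environment variable names (`SNOWFLAKE_ACCOUNT` and so on) and the helper `connection_parameters_from_env` are assumptions chosen for illustration, not names defined by Snowpark.

```python
import os

# Hypothetical environment variable names mapped to the connection
# parameter keys used in the example above.
ENV_TO_PARAMETER = {
    "SNOWFLAKE_ACCOUNT": "account",
    "SNOWFLAKE_USER": "user",
    "SNOWFLAKE_PASSWORD": "password",
    "SNOWFLAKE_ROLE": "role",
    "SNOWFLAKE_WAREHOUSE": "warehouse",
    "SNOWFLAKE_DATABASE": "database",
    "SNOWFLAKE_SCHEMA": "schema",
}


def connection_parameters_from_env(environ=None):
    """Build the connection_parameters dict from environment variables."""
    if environ is None:
        environ = os.environ
    # Fail early with a clear message if anything is unset or empty.
    missing = [name for name in ENV_TO_PARAMETER if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(sorted(missing)))
    return {param: environ[name] for name, param in ENV_TO_PARAMETER.items()}


# Usage, with the Session import from the example above:
# session = Session.builder.configs(connection_parameters_from_env()).create()
```

The resulting dictionary has the same keys as `connection_parameters` above, so it can be passed to `Session.builder.configs` unchanged.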

Create a session and use the Snowpark pandas API

import modin.pandas as pd
import snowflake.snowpark.modin.plugin
from snowflake.snowpark import Session

CONNECTION_PARAMETERS = {
    'account': '<myaccount>',
    'user': '<myuser>',
    'password': '<mypassword>',
    'role': '<myrole>',
    'database': '<mydatabase>',
    'schema': '<myschema>',
    'warehouse': '<mywarehouse>',
}
session = Session.builder.configs(CONNECTION_PARAMETERS).create()

# Create a Snowpark pandas dataframe from input data
df = pd.DataFrame([['a', 2.0, 1],['b', 4.0, 2],['c', 6.0, None]], columns=["COL_STR", "COL_FLOAT", "COL_INT"])
df
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0      1.0
# 1       b        4.0      2.0
# 2       c        6.0      NaN

df.shape
# (3, 3)

df.head(2)
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0        1
# 1       b        4.0        2

df.dropna(subset=["COL_INT"], inplace=True)

df
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0        1
# 1       b        4.0        2

df.shape
# (2, 3)

df.head(2)
#   COL_STR  COL_FLOAT  COL_INT
# 0       a        2.0        1
# 1       b        4.0        2

# Save the result back to Snowflake with a row_pos column.
df.reset_index(drop=True).to_snowflake('pandas_test2', index=True, index_label=['row_pos'])

Samples

The Snowpark Python developer guide, Snowpark Python API references, Snowpark pandas developer guide, and Snowpark pandas API references have basic sample code. Snowflake-Labs has more curated demos.

Logging

To see Snowpark Python API logs, configure the logging level for snowflake.snowpark. Snowpark uses the Snowflake Python Connector, so you may also want to configure the logging level for snowflake.connector when the error originates in the connector. For instance:

import logging
for logger_name in ('snowflake.snowpark', 'snowflake.connector'):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
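Running the loop above more than once (for example, by re-executing a notebook cell) attaches an additional handler each time, so every message is printed repeatedly. A minimal sketch of one way to avoid this is shown below; the helper name `enable_debug_logging` and the `_added_by_helper` marker attribute are hypothetical, and only the standard `logging` module is used.

```python
import logging

LOG_FORMAT = (
    "%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - "
    "%(funcName)s() - %(levelname)s - %(message)s"
)


def enable_debug_logging(logger_names=("snowflake.snowpark", "snowflake.connector")):
    """Attach one DEBUG stream handler per logger, even if called repeatedly."""
    for logger_name in logger_names:
        logger = logging.getLogger(logger_name)
        logger.setLevel(logging.DEBUG)
        # Skip loggers that already carry a handler installed by this helper.
        if any(getattr(h, "_added_by_helper", False) for h in logger.handlers):
            continue
        handler = logging.StreamHandler()
        handler.setLevel(logging.DEBUG)
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        handler._added_by_helper = True  # marker used only by this sketch
        logger.addHandler(handler)


enable_debug_logging()
enable_debug_logging()  # second call adds no further handlers
```

Handlers that were attached by other code are left untouched; the marker only tracks the handler this helper installed.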

Reading and writing to pandas DataFrame

The Snowpark Python API supports reading from and writing to a pandas DataFrame via the to_pandas and write_pandas methods.

To use these operations, ensure that pandas is installed in the same environment. You can install pandas alongside Snowpark Python by executing the following command:

pip install "snowflake-snowpark-python[pandas]"

Once pandas is installed, you can convert between a Snowpark DataFrame and pandas DataFrame as follows:

df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
# Convert Snowpark DataFrame to pandas DataFrame
pandas_df = df.to_pandas() 
# Write pandas DataFrame to a Snowflake table and return Snowpark DataFrame
snowpark_df = session.write_pandas(pandas_df, "new_table", auto_create_table=True)

The Snowpark pandas API also supports converting to a pandas DataFrame:

import modin.pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
# Convert Snowpark pandas DataFrame to pandas DataFrame
pandas_df = df.to_pandas()

Note that the above Snowpark pandas commands will work if Snowpark is installed with the [modin] option; the additional [pandas] installation is not required.

Contributing

Please refer to CONTRIBUTING.md.
