[DO NOT MERGE] Server side snowpark #1538
Closed
…: table(), filter() (#1468) A very basic initial attempt at serializing the AST. I'm trying to maintain a parallel codebase for phases 0 and 1 for now, since it would be a shame to do this work twice. Once we complete and ship phase 0, we'll be able to drastically simplify the phase 1 client. Unlike what I mentioned before, this implementation doesn't flush dependencies of eagerly evaluated expressions. Instead, any client-side value is appended to the pending batch. This is simpler to implement and will likely work well, although we may need to do some dependency analysis on the server to ensure we don't issue unnecessary queries.
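The batching idea described above can be illustrated with a minimal sketch. Everything here is hypothetical: `PendingBatch`, its methods, and the dictionary-shaped statements are invented for illustration and are not the branch's actual protobuf-based implementation.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class PendingBatch:
    """Hypothetical container for AST statements not yet sent to the server."""

    statements: List[dict] = field(default_factory=list)

    def append(self, kind: str, **payload: Any) -> int:
        """Record one statement and return its index within the batch."""
        self.statements.append({"kind": kind, **payload})
        return len(self.statements) - 1

    def flush(self) -> List[dict]:
        """Return everything recorded so far and start a new, empty batch."""
        sent, self.statements = self.statements, []
        return sent


batch = PendingBatch()
table_id = batch.append("table", name="orders")
batch.append("filter", source=table_id, condition="amount > 10")
# A client-side value is simply appended too; its dependencies are not flushed first.
batch.append("client_value", value=42)
request_payload = batch.flush()
```

Because nothing is flushed eagerly, the client stays simple; any pruning of statements that do not need a query would have to happen on the server, as the commit message notes.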
…/snowpark-python into server-side-snowpark
Updates our server branch with recent snowpark changes.
1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

   Fixes SNOW-0

2. Fill out the following pre-review checklist:

   - [ ] I am adding a new automated test(s) to verify correctness of my new code
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [x] I am adding a new dependency

3. Please describe how your code solves the related issue.

   Update `ast_pb2.py` (already present in the repository). Add the `setuptools` dependencies required for development. Include the module path for `ast_pb2.py` in the manifest, so that the file makes it into the Snowpark wheel.
…es until we send a response on TCM creation (#1617)
Run `update-from-devvm.sh` from within `src/snowflake/snowpark/_internal/proto/` with a running local devvm to update the proto file on the thin client.
….py (#1766) Modifies `setup.py` to use the latest HEAD of https://github.com/snowflakedb/snowflake-connector-python/tree/server-side-snowpark, which includes connector changes (most notably adding the `_dataframe_ast` field for phase 0). To update your local dev environment, run

```
pip uninstall snowflake-connector-python -y
python -m pip install --no-cache -e ".[development,pandas]"
```

Running the pip command should show `git clone` in the logs.
This is the thin-client PR complementary to https://github.com/snowflakedb/snowflake/pull/183143/files.
…frameAST field. (#1794) Vendors snowflake vcrpy from https://github.com/Snowflake-Labs/snowflake-vcrpy (installing it did not work, hence the vendoring) with custom Snowflake changes to track requests in the vendored urllib3 within the Snowflake Python connector. Adds the decorator `check_ast_encode_invoked` (applied with `autouse=True` to all tests), which checks that every query sent contains the `dataframeAst` property for phase 0. It raises an error with traceback information whenever a test needs to be fixed or an API is not yet encoded within the AST.
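The core of such a check can be sketched as follows. This is not the code of `check_ast_encode_invoked`; the helper name `find_requests_missing_ast`, the JSON body shape, and the `sqlText` key are assumptions made for illustration. Only the `dataframeAst` property name comes from the description above.

```python
import json
from typing import Iterable, List


def find_requests_missing_ast(request_bodies: Iterable[str]) -> List[int]:
    """Return indices of JSON request bodies lacking a non-empty `dataframeAst`."""
    missing = []
    for index, body in enumerate(request_bodies):
        payload = json.loads(body)
        if not payload.get("dataframeAst"):
            missing.append(index)
    return missing


# Two recorded query requests (assumed shape); the second one carries no AST.
recorded = [
    json.dumps({"sqlText": "SELECT 1", "dataframeAst": "CgQIARAB"}),
    json.dumps({"sqlText": "SELECT 2"}),
]
missing = find_requests_missing_ast(recorded)
```

A test fixture built on this idea would fail the test whenever `missing` is non-empty, pointing at the API call whose AST encoding is still absent.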
… with Python 3.8 (#1796) Temporarily remove Modin tests, as Modin is incompatible with Python 3.8.
…avoid negated logic (#1970)
… to pandas.DataFrame (#1973) Support all data/schema cases for `session.create_dataframe`, except for the data being a pandas.DataFrame.
…1994) Adds support for `pandas.DataFrame` in `session.create_dataframe`. Effectively, this encodes in the IR information about the temporary table that stores the data of the `pandas.DataFrame` server-side.
Support `DataFrame.agg` in Snowpark IR.
…lar to update-from-devvm.sh (#2017)
…2009) Support `DataFrame.{collect,collect_nowait,count}` for Snowpark IR. Furthermore, modifies the test infrastructure so that multiple ASTs can be passed to the unparser in order, which allows multiple evals to be unparsed. Adds a new AstListener class for both the server connection and the mock server connection to capture ASTs easily. Modifies steel-thread to work with AstListener and print out the AST for anyone interested.
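A listener of this kind could look roughly like the sketch below. The names `FakeConnection`, `AstCapture`, `add_listener`, and `remove_listener` are hypothetical and stand in for the real connection classes and the actual AstListener, whose interface is not shown in this conversation.

```python
from typing import Callable, List


class FakeConnection:
    """Hypothetical stand-in for a server or mock server connection."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str], None]] = []

    def add_listener(self, callback: Callable[[str], None]) -> None:
        self._listeners.append(callback)

    def remove_listener(self, callback: Callable[[str], None]) -> None:
        self._listeners.remove(callback)

    def execute(self, ast: str) -> None:
        # Notify every registered listener about the AST sent with this request.
        for callback in list(self._listeners):
            callback(ast)


class AstCapture:
    """Context manager that collects, in order, every AST sent while it is active."""

    def __init__(self, connection: FakeConnection) -> None:
        self.connection = connection
        self.asts: List[str] = []
        self._callback = self.asts.append

    def __enter__(self) -> "AstCapture":
        self.connection.add_listener(self._callback)
        return self

    def __exit__(self, *exc_info) -> bool:
        self.connection.remove_listener(self._callback)
        return False  # never swallow exceptions raised inside the block


conn = FakeConnection()
with AstCapture(conn) as capture:
    conn.execute("ast-for-collect")
    conn.execute("ast-for-count")
conn.execute("ast-after-capture")  # not recorded: the listener was removed
```

Keeping the captured ASTs in request order is what lets a test hand them to an unparser one after another.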
Support `DataFrame.describe` for Snowpark IR. Adds preliminary `stddev` implementation to local testing API given this feature is missing at the moment.
…x existing issues (#2048)
…ra parens are now gone (#2073)
Client-side support for DataFrame write APIs (as Eval), `DataFrame.write.{copy_into_location, csv, json, parquet, save_as_table}`.
Support `session.write_pandas` for pandas DataFrames. Snowpark pandas DataFrames are out of scope for phase 0 and therefore raise an error. Fix the local testing REST mock object so that, whenever it is called, it returns a mock request with an error message that is propagated to the user. This makes it easier to trace missing features in the mock/local testing implementation.
…unctions. (#2096) Adds AST generation for `DataFrame.stat.{approx_quantile,corr,cov,crosstab,sample_by}`. Other changes: The existing functions `DataFrame.{col,sample,union_all,pivot}` get a new `_emit_ast` parameter that allows AST generation to be disabled on demand. The Snowpark `Column` class now also accepts `_emit_ast` to disable AST generation. Adds the local testing mock functions `approx_percentile_accumulate,approx_percentile_estimate,covar_samp,corr_samp`, which return dummy values until properly implemented, so that the mock server connection can run the stats tests successfully.
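The purpose of an `_emit_ast` switch can be shown with a toy example. `TinyFrame`, `select`, and `first_two` are hypothetical and exist only for this sketch; only the `_emit_ast` parameter name is taken from the commit message.

```python
from typing import List, Optional


class TinyFrame:
    """Hypothetical DataFrame-like object that logs the API calls made on it."""

    def __init__(self, columns, ast_log: Optional[List[str]] = None) -> None:
        self.columns = list(columns)
        # The log is shared between derived frames, like a session-wide AST batch.
        self.ast_log = ast_log if ast_log is not None else []

    def select(self, *names: str, _emit_ast: bool = True) -> "TinyFrame":
        if _emit_ast:
            self.ast_log.append(f"select({', '.join(names)})")
        return TinyFrame(names, self.ast_log)

    def first_two(self) -> "TinyFrame":
        """Public API built on select(); records one entry for itself only."""
        self.ast_log.append("first_two()")
        # The internal helper call must not add a second, redundant entry.
        return self.select(*self.columns[:2], _emit_ast=False)


df = TinyFrame(["a", "b", "c"])
result = df.first_two()
```

Without the flag, the log would contain both `first_two()` and the internal `select(a, b)`, so a server replaying the AST would apply the operation twice.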
…less object <df>.stat (#2104)
…r better performance (#2074) [SNOW-1491175](https://snowflakecomputing.atlassian.net/browse/SNOW-1491175) Remove all uses of `set_src_position` and `get_first_non_snowpark_stack_frame`, update the necessary test cases, and use the `inspect` module with better practices, including:

- Deleting the retrieved frame to prevent reference cycles and memory leaks
- Avoiding capturing code context for every file along the stack
- Skipping two frames out of the Snowpark library code to retrieve the relevant user code immediately when possible
- Adding comments that discuss the functionality in detail for future improvements

[SNOW-1491175]: https://snowflakecomputing.atlassian.net/browse/SNOW-1491175?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

Co-authored-by: Leonhard Spiegelberg <[email protected]>
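The `inspect` practices listed above can be sketched with the standard library alone. The function names `caller_location` and `library_entry_point` are hypothetical and this is not the code from the commit; it only illustrates walking `f_back` instead of calling `inspect.stack()` (which gathers source context for every frame) and deleting the frame reference afterwards.

```python
import inspect
from typing import Optional, Tuple


def caller_location(skip: int = 1) -> Optional[Tuple[str, int]]:
    """Return (filename, line number) of the frame `skip` levels above this one."""
    frame = inspect.currentframe()
    try:
        # Walk up frame by frame; no source lines are read along the way.
        for _ in range(skip):
            if frame is None:
                return None
            frame = frame.f_back
        if frame is None:
            return None
        return frame.f_code.co_filename, frame.f_lineno
    finally:
        # Drop the frame reference to avoid reference cycles.
        del frame


def library_entry_point() -> Optional[Tuple[str, int]]:
    # skip=2 steps over caller_location itself and this "library" function,
    # landing directly on the user code that called the library.
    return caller_location(skip=2)


location = library_entry_point()
```

Passing a fixed skip count is cheap but assumes a known call depth inside the library; a deeper internal call chain would need a different count or a filename-based check.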
…elease (#2122) Updates the branch with the recent release to minimize merge conflicts. Snowpark IR supports new arguments, i.e. Table got a new (optional) boolean parameter `is_temp_table_for_cleanup` and `regexp` got a new optional parameter `parameters`.

Other:

- Fixes CI for linting, deactivating Modin for Python 3.8, vcrpy deps.
- Removes steel-thread.py to avoid merge conflicts.
- Fixes to_date tests to be closer to the original calls.
- Changes the SQLCounter interface to match the desired QueryListener interface.
Refresh using git commits since 1.21 release, reduces further merge-conflicts with main.
Due to force pushes, the git history here is compromised; closing this branch for now. Use https://github.com/snowflakedb/snowpark-python/tree/ls-SNOW-1491199-merge-phase0-server-side instead and DO NOT FORCE PUSH.
DO NOT MERGE.
This branch helps us compare the Snowpark server-side changes against the current main/HEAD.