Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1018951: save_as_table fails when writing DataFrame returned from stored procedure #1215

Closed
TedCha opened this issue Jan 25, 2024 · 2 comments
Labels
bug Something isn't working needs triage Initial RCA is required

Comments

@TedCha
Copy link
Contributor

TedCha commented Jan 25, 2024

1. What version of Python are you using?

Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]

2. What operating system and processor architecture are you using?

Windows-10-10.0.19045-SP0

3. What are the component versions in the environment (pip freeze)?

Not applicable; using Snowpark Python Worksheet. Only package is snowflake-snowpark-python==1.11.1

4. What did you do?

Create procedure that returns a DataFrame:

CREATE OR REPLACE PROCEDURE TEST_PROCEDURE()
RETURNS TABLE ()
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python==1.11.1')
HANDLER = 'main'
EXECUTE AS OWNER
AS 'from snowflake.snowpark import Session, DataFrame

def main(session: Session) -> DataFrame:
    return session.sql("SELECT * FROM INFORMATION_SCHEMA.TABLES")
';

Call the previous procedure and attempt to save the returned DataFrame:

from snowflake.snowpark import Session
import logging

for logger_name in ('snowflake.snowpark', 'snowflake.connector'):
   logger = logging.getLogger(logger_name)
   logger.setLevel(logging.DEBUG)
   ch = logging.StreamHandler()
   ch.setLevel(logging.DEBUG)
   ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
   logger.addHandler(ch)

def main(session: Session) -> str:

    df = session.call("TEST_PROCEDURE")

    df.write.mode("overwrite").save_as_table(
        table_name="TEST_TEMP_TABLE",
        table_type="temporary",
        clustering_keys=[],
        block=True
    )

    return "SUCCESS"

Returns the following error:

Traceback (most recent call last):
  Worksheet, line 16, in main
  File "snowflake/snowpark/_internal/telemetry.py", line 162, in wrap
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "snowflake/snowpark/dataframe_writer.py", line 217, in save_as_table
    result = session._conn.execute(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "snowflake/snowpark/_internal/server_connection.py", line 472, in execute
    result_set, result_meta = self.get_result_set(
                              ^^^^^^^^^^^^^^^^^^^^
  File "snowflake/snowpark/_internal/analyzer/snowflake_plan.py", line 125, in wrap
    raise ne.with_traceback(tb) from None
  File "snowflake/snowpark/_internal/analyzer/snowflake_plan.py", line 116, in wrap
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "snowflake/snowpark/_internal/server_connection.py", line 573, in get_result_set
    result = self.run_query(
             ^^^^^^^^^^^^^^^
  File "snowflake/snowpark/_internal/server_connection.py", line 123, in wrap
    raise ex
  File "snowflake/snowpark/_internal/server_connection.py", line 117, in wrap
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "snowflake/snowpark/_internal/server_connection.py", line 387, in run_query
    raise ex
  File "snowflake/snowpark/_internal/server_connection.py", line 368, in run_query
    results_cursor = self._cursor.execute(query, params=params, **kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "snowflake/connector/cursor.py", line 865, in execute
    Error.errorhandler_wrapper(
  File "snowflake/connector/errors.py", line 232, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "snowflake/connector/errors.py", line 287, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "snowflake/connector/errors.py", line 165, in default_errorhandler
    raise error_class(
snowflake.snowpark.exceptions.SnowparkSQLUnexpectedAliasException: (1301): You can only define aliases for the root Columns in a DataFrame returned by select() and agg(). You cannot use aliases for Columns in expressions.

5. What did you expect to see?

The table should have been created instead of throwing an error.

6. Can you set logging to DEBUG and collect the logs?

Note: Logs say Python 3.11 due to executing the code in a Snowpark Python Worksheet. Issue happens at 3.8 and 3.11.

2024-01-25 08:29:05,870 - Dummy-1 connection.py:202 - __init__() - INFO - Snowflake Connector for Python Version: 0.24.0, Python Version: 3.11.7, Platform: XP
2024-01-25 08:29:05,870 - Dummy-1 connection.py:511 - _next_sequence_counter() - DEBUG - sequence counter: 1
2024-01-25 08:29:05,919 - Dummy-1 converter.py:142 - __init__() - DEBUG - use_numpy: False
2024-01-25 08:29:05,919 - Dummy-1 connection.py:493 - cursor() - DEBUG - cursor
2024-01-25 08:29:05,919 - Dummy-1 session.py:427 - __init__() - INFO - Snowpark Session information: 
"version" : 1.11.1,
"python.version" : 3.11.7,
"python.connector.version" : 0.24.0,
"python.connector.session.id" : None,
"os.name" : Linux
2024-01-25 08:29:05,920 - Dummy-1 cursor.py:729 - execute() - DEBUG - executing SQL/command
2024-01-25 08:29:05,920 - Dummy-1 cursor.py:802 - execute() - INFO - query: [describe procedure TEST_PROCEDURE()]
2024-01-25 08:29:05,920 - Dummy-1 connection.py:511 - _next_sequence_counter() - DEBUG - sequence counter: 2
2024-01-25 08:29:05,920 - Dummy-1 cursor.py:435 - _execute_helper() - DEBUG - Request id: 12457c94-2bf8-4d92-ae5f-4f7102f66338
2024-01-25 08:29:05,920 - Dummy-1 cursor.py:438 - _execute_helper() - DEBUG - running query [describe procedure TEST_PROCEDURE()]
2024-01-25 08:29:05,920 - Dummy-1 cursor.py:447 - _execute_helper() - DEBUG - is_file_transfer: False
2024-01-25 08:29:05,920 - Dummy-1 cursor.py:487 - _execute_helper() - DEBUG - Failed to set SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:05,965 - Dummy-1 cursor.py:509 - _execute_helper() - DEBUG - Failed to reset SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:05,965 - Dummy-1 cursor.py:809 - execute() - DEBUG - sfqid: 01b1e809-0001-9d98-0003-e70600fed67e
2024-01-25 08:29:05,965 - Dummy-1 cursor.py:815 - execute() - INFO - query execution done
2024-01-25 08:29:05,965 - Dummy-1 cursor.py:820 - execute() - DEBUG - SUCCESS
2024-01-25 08:29:05,965 - Dummy-1 cursor.py:904 - _init_result_and_meta() - DEBUG - Query result format: json
2024-01-25 08:29:05,965 - Dummy-1 result_batch.py:441 - _parse() - DEBUG - parsing for result batch id: 12
2024-01-25 08:29:05,966 - Dummy-1 server_connection.py:372 - run_query() - DEBUG - Execute query [queryID: 01b1e809-0001-9d98-0003-e70600fed67e] describe procedure TEST_PROCEDURE()
2024-01-25 08:29:05,966 - Dummy-1 result_set.py:65 - result_set_iterator() - DEBUG - beginning to schedule result batch downloads
2024-01-25 08:29:05,966 - Dummy-1 result_set.py:128 - _report_metrics() - DEBUG - Reporting metrics not supported
2024-01-25 08:29:05,966 - Dummy-1 cursor.py:729 - execute() - DEBUG - executing SQL/command
2024-01-25 08:29:05,966 - Dummy-1 cursor.py:802 - execute() - INFO - query: [CALL TEST_PROCEDURE()]
2024-01-25 08:29:05,966 - Dummy-1 connection.py:511 - _next_sequence_counter() - DEBUG - sequence counter: 3
2024-01-25 08:29:05,966 - Dummy-1 cursor.py:435 - _execute_helper() - DEBUG - Request id: 63a0e805-4d5b-4b2b-9c28-72d9d86c3c95
2024-01-25 08:29:05,966 - Dummy-1 cursor.py:438 - _execute_helper() - DEBUG - running query [CALL TEST_PROCEDURE()]
2024-01-25 08:29:05,966 - Dummy-1 cursor.py:447 - _execute_helper() - DEBUG - is_file_transfer: False
2024-01-25 08:29:05,966 - Dummy-1 cursor.py:487 - _execute_helper() - DEBUG - Failed to set SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:06,045 - Dummy-1 cursor.py:509 - _execute_helper() - DEBUG - Failed to reset SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:06,045 - Dummy-1 cursor.py:809 - execute() - DEBUG - sfqid: 01b1e809-0001-9d98-0003-e70600fed682
2024-01-25 08:29:06,045 - Dummy-1 cursor.py:815 - execute() - INFO - query execution done
2024-01-25 08:29:06,045 - Dummy-1 cursor.py:820 - execute() - DEBUG - SUCCESS
2024-01-25 08:29:06,045 - Dummy-1 cursor.py:904 - _init_result_and_meta() - DEBUG - Query result format: json
2024-01-25 08:29:06,045 - Dummy-1 result_batch.py:441 - _parse() - DEBUG - parsing for result batch id: 0
2024-01-25 08:29:06,045 - Dummy-1 cursor.py:729 - execute() - DEBUG - executing SQL/command
2024-01-25 08:29:06,046 - Dummy-1 cursor.py:802 - execute() - INFO - query: [CALL TEST_PROCEDURE()]
2024-01-25 08:29:06,046 - Dummy-1 connection.py:511 - _next_sequence_counter() - DEBUG - sequence counter: 4
2024-01-25 08:29:06,046 - Dummy-1 cursor.py:435 - _execute_helper() - DEBUG - Request id: da9e5cf4-be6e-479c-9ce2-175c7a84b87b
2024-01-25 08:29:06,046 - Dummy-1 cursor.py:438 - _execute_helper() - DEBUG - running query [CALL TEST_PROCEDURE()]
2024-01-25 08:29:06,046 - Dummy-1 cursor.py:447 - _execute_helper() - DEBUG - is_file_transfer: False
2024-01-25 08:29:06,046 - Dummy-1 cursor.py:487 - _execute_helper() - DEBUG - Failed to set SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:08,705 - Dummy-1 cursor.py:509 - _execute_helper() - DEBUG - Failed to reset SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:08,705 - Dummy-1 cursor.py:809 - execute() - DEBUG - sfqid: 01b1e809-0001-9d98-0003-e70600fed686
2024-01-25 08:29:08,706 - Dummy-1 cursor.py:815 - execute() - INFO - query execution done
2024-01-25 08:29:08,706 - Dummy-1 cursor.py:820 - execute() - DEBUG - SUCCESS
2024-01-25 08:29:08,706 - Dummy-1 cursor.py:904 - _init_result_and_meta() - DEBUG - Query result format: json
2024-01-25 08:29:08,727 - Dummy-1 result_batch.py:142 - create_batches_from_response() - DEBUG - chunk size=2
2024-01-25 08:29:08,727 - Dummy-1 result_batch.py:441 - _parse() - DEBUG - parsing for result batch id: 0
2024-01-25 08:29:08,727 - Dummy-1 server_connection.py:372 - run_query() - DEBUG - Execute query [queryID: 01b1e809-0001-9d98-0003-e70600fed686] CALL TEST_PROCEDURE()
2024-01-25 08:29:08,727 - Dummy-1 result_set.py:223 - _create_iter() - DEBUG - result batch 1 has id: data_0_0_0
2024-01-25 08:29:08,727 - Dummy-1 result_set.py:223 - _create_iter() - DEBUG - result batch 2 has id: data_0_0_1
2024-01-25 08:29:08,727 - Dummy-1 result_set.py:65 - result_set_iterator() - DEBUG - beginning to schedule result batch downloads
2024-01-25 08:29:08,727 - Dummy-1 result_set.py:68 - result_set_iterator() - DEBUG - queuing download of result batch id: data_0_0_0
2024-01-25 08:29:08,810 - Dummy-1 result_set.py:68 - result_set_iterator() - DEBUG - queuing download of result batch id: data_0_0_1
2024-01-25 08:29:08,810 - ThreadPoolExecutor-1_0 result_batch.py:503 - create_iter() - DEBUG - started loading result batch id: data_0_0_0
2024-01-25 08:29:08,842 - ThreadPoolExecutor-1_1 result_batch.py:503 - create_iter() - DEBUG - started loading result batch id: data_0_0_1
2024-01-25 08:29:08,842 - Dummy-1 result_set.py:78 - result_set_iterator() - DEBUG - user requesting to consume result batch 1
2024-01-25 08:29:08,845 - ThreadPoolExecutor-1_0 result_batch.py:506 - create_iter() - DEBUG - finished loading result batch id: data_0_0_0
2024-01-25 08:29:08,845 - ThreadPoolExecutor-1_0 result_batch.py:441 - _parse() - DEBUG - parsing for result batch id: data_0_0_0
2024-01-25 08:29:08,852 - ThreadPoolExecutor-1_1 result_batch.py:506 - create_iter() - DEBUG - finished loading result batch id: data_0_0_1
2024-01-25 08:29:08,852 - Dummy-1 result_set.py:90 - result_set_iterator() - DEBUG - user began consuming result batch 1
2024-01-25 08:29:08,852 - ThreadPoolExecutor-1_1 result_batch.py:441 - _parse() - DEBUG - parsing for result batch id: data_0_0_1
2024-01-25 08:29:08,852 - Dummy-1 result_set.py:92 - result_set_iterator() - DEBUG - user finished consuming result batch 1
2024-01-25 08:29:08,863 - Dummy-1 result_set.py:78 - result_set_iterator() - DEBUG - user requesting to consume result batch 2
2024-01-25 08:29:08,864 - Dummy-1 result_set.py:90 - result_set_iterator() - DEBUG - user began consuming result batch 2
2024-01-25 08:29:08,864 - Dummy-1 result_set.py:92 - result_set_iterator() - DEBUG - user finished consuming result batch 2
2024-01-25 08:29:08,864 - Dummy-1 result_set.py:128 - _report_metrics() - DEBUG - Reporting metrics not supported
2024-01-25 08:29:08,865 - Dummy-1 cursor.py:729 - execute() - DEBUG - executing SQL/command
2024-01-25 08:29:08,865 - Dummy-1 cursor.py:802 - execute() - INFO - query: [CREATE  OR  REPLACE  TEMPORARY  TABLE  TEST_TEMP_TABLE()  AS  SELECT  *  FROM ( ...]
2024-01-25 08:29:08,865 - Dummy-1 connection.py:511 - _next_sequence_counter() - DEBUG - sequence counter: 5
2024-01-25 08:29:08,865 - Dummy-1 cursor.py:435 - _execute_helper() - DEBUG - Request id: 1eaeb3d1-377c-4119-90b2-b933ae720fdc
2024-01-25 08:29:08,865 - Dummy-1 cursor.py:438 - _execute_helper() - DEBUG - running query [CREATE  OR  REPLACE  TEMPORARY  TABLE  TEST_TEMP_TABLE()  AS  SELECT  *  FROM ( ...]
2024-01-25 08:29:08,865 - Dummy-1 cursor.py:447 - _execute_helper() - DEBUG - is_file_transfer: False
2024-01-25 08:29:08,865 - Dummy-1 cursor.py:487 - _execute_helper() - DEBUG - Failed to set SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:08,922 - Dummy-1 cursor.py:509 - _execute_helper() - DEBUG - Failed to reset SIGINT handler. Not in main thread. Ignored...
2024-01-25 08:29:08,922 - Dummy-1 cursor.py:809 - execute() - DEBUG - sfqid: 01b1e809-0001-9d98-0003-e70600fed692
2024-01-25 08:29:08,922 - Dummy-1 cursor.py:815 - execute() - INFO - query execution done
2024-01-25 08:29:08,922 - Dummy-1 cursor.py:853 - execute() - DEBUG - {'data': {'internalError': False, 'errorCode': '001003', 'age': 0, 'sqlState': '42000', 'queryId': '01b1e809-0001-9d98-0003-e70600fed692', 'line': -1, 'pos': -1, 'type': 'COMPILATION'}, 'code': '001003', 'message': "SQL compilation error:\nsyntax error line 1 at position 55 unexpected ')'.\nsyntax error line 1 at position 58 unexpected 'AS'.", 'success': False, 'headers': None}
2024-01-25 08:29:08,925 - Dummy-1 errors.py:145 - exception_telemetry() - DEBUG - Sending exception telemetry failed
CRITICAL:root:exception
Traceback (most recent call last):
  File "_udf_code.py", line 16, in main
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/telemetry.py", line 162, in wrap
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/dataframe_writer.py", line 217, in save_as_table
    result = session._conn.execute(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 472, in execute
    result_set, result_meta = self.get_result_set(
                              ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/analyzer/snowflake_plan.py", line 125, in wrap
    raise ne.with_traceback(tb) from None
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/analyzer/snowflake_plan.py", line 116, in wrap
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 573, in get_result_set
    result = self.run_query(
             ^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 123, in wrap
    raise ex
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 117, in wrap
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 387, in run_query
    raise ex
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 368, in run_query
    results_cursor = self._cursor.execute(query, params=params, **kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/connector/cursor.py", line 865, in execute
    Error.errorhandler_wrapper(
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/connector/errors.py", line 232, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/connector/errors.py", line 287, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/usr/lib/python_udf/c5779d552c22877330735f35263cf9a764c85b621600d4ae0e1e7054ab9170f5/lib/python3.11/site-packages/snowflake/connector/errors.py", line 165, in default_errorhandler
    raise error_class(
snowflake.snowpark.exceptions.SnowparkSQLUnexpectedAliasException: (1301): You can only define aliases for the root Columns in a DataFrame returned by select() and agg(). You cannot use aliases for Columns in expressions.

I believe the issue is this line:

2024-01-25 08:29:08,865 - Dummy-1 cursor.py:802 - execute() - INFO - query: [CREATE  OR  REPLACE  TEMPORARY  TABLE  TEST_TEMP_TABLE()  AS  SELECT  *  FROM ( ...]

The CREATE TABLE query expects to have fields and types within the table definition but something is happening where those fields aren't being provided. I think the reported error is unrelated and just gets flagged as that error due to looking for a literal AS keyword.

@TedCha TedCha added bug Something isn't working needs triage Initial RCA is required labels Jan 25, 2024
@github-actions github-actions bot changed the title save_as_table fails when writing DataFrame returned from stored procedure SNOW-1018951: save_as_table fails when writing DataFrame returned from stored procedure Jan 25, 2024
@sfc-gh-jdu
Copy link
Collaborator

Have you tried to define your schema of returned dataframe when creating the stored procedure, instead of using RETURNS TABLE ()? Then I guess it will work.

@TedCha
Copy link
Contributor Author

TedCha commented Jan 29, 2024

Ahh yes that worked, thanks! I think I was expecting for the structure to be auto defined since I can create the table from the DataFrame within TEST_PROCEDURE with no schema definition but then returning it without a return definition causes an issue when creating it from another procedure..

@TedCha TedCha closed this as completed Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage Initial RCA is required
Projects
None yet
Development

No branches or pull requests

2 participants