Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IR] Steel-thread example for pandas iloc get #1727

Conversation

sfc-gh-vbudati
Copy link
Contributor

@sfc-gh-vbudati sfc-gh-vbudati commented Jun 3, 2024

This is the steel-thread for Phase 0 pandas. The goal is to have iloc get working. I tried to cover all the basic row + columns cases but only test for scalar rows and columns.

src/snowflake/snowpark/_internal/ast.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/_internal/ast.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/_internal/ast.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/_internal/ast.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/_internal/ast.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/_internal/ast.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/_internal/ast.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/_internal/ast.py Show resolved Hide resolved
@sfc-gh-vbudati sfc-gh-vbudati force-pushed the vbudati/server-side-pandas-iloc-get-steel-thread branch from 7ed61e9 to 7445b0d Compare June 11, 2024 21:17
@sfc-gh-azwiegincew
Copy link
Collaborator

This now looks complete – just needs a test in tests/ast.

@sfc-gh-vbudati sfc-gh-vbudati changed the title [IR] [WIP] Steel-thread example for pandas iloc get [IR] Steel-thread example for pandas iloc get Jun 12, 2024
@sfc-gh-vbudati sfc-gh-vbudati marked this pull request as ready for review June 12, 2024 02:19
@sfc-gh-vbudati sfc-gh-vbudati requested review from a team as code owners June 12, 2024 02:19
@sfc-gh-vbudati sfc-gh-vbudati requested review from sfc-gh-jdu, sfc-gh-yuwang and sfc-gh-jrose and removed request for a team, sfc-gh-jdu, sfc-gh-yuwang and sfc-gh-jrose June 12, 2024 02:19
columns=["A", "B", "C", "D", "E", "F", "G"],
)
result = df.iloc[2, 2]
print(result)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails right now - I am not sure how to display result. It does not produce any other output.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work with the main branch? I assume you're running without Phase 1 enabled?

Copy link
Collaborator

@sfc-gh-azwiegincew sfc-gh-azwiegincew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG; some minor nits. The change request is about the AST test.

@@ -158,6 +159,11 @@ def __init__(

self._siblings = []

if not ast_stmt:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In column.py we're using if ... is None; we should use a consistent pattern for this check – whatever is the canonical approach in Python. It seems that is None makes more sense?

@@ -158,6 +159,11 @@ def __init__(

self._siblings = []

if not ast_stmt:
ast_stmt = pd.session._ast_batch.assign()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the point of creating an assign node like this. If a statement isn't actually supplied, how a "default" one won't really work. Is this because of the way that Pandas interacts with this code?


## EXPECTED OUTPUT

17
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? The test driver here runs the unparser, so we should be seeing something that looks like the original source.

columns=["A", "B", "C", "D", "E", "F", "G"],
)
result = df.iloc[2, 2]
print(result)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work with the main branch? I assume you're running without Phase 1 enabled?

@github-actions github-actions bot locked and limited conversation to collaborators Nov 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants