Support result reuse in Athena data sources #7202

Open
wants to merge 3 commits into
base: master

dtaniwaki
Contributor

@dtaniwaki dtaniwaki commented Oct 22, 2024

What type of PR is this?

  • Refactor
  • Feature
  • Bug Fix
  • New Query Runner (Data Source)
  • New Alert Destination
  • Other

Description

Currently, Redash runs the query every time the query page opens, which is costly when the query is heavy.

Athena, on the other hand, has a feature for reusing the results of previous queries.
https://docs.aws.amazon.com/athena/latest/ug/reusing-query-results.html

pyathena (>=2.18.0) supports the option.
https://github.com/laughingman7743/PyAthena/blob/6b8f0e94abd09115ba1277616ca84372e2f15d56/pyathena/common.py#L112-L113
The major version bump appears to be due to dropping support for older Python versions, and as far as I have tested, queries work fine.

We may want to turn the option on or off per query, but that would break compatibility with the other data sources, so I made it a data source option. Could you consider adding this feature?
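
A minimal sketch of how a data source option could be translated into the keyword arguments PyAthena accepts for result reuse (`result_reuse_enable`, `result_reuse_minutes`). The configuration keys, the helper name, and the 60-minute fallback are assumptions for illustration and are not necessarily what this PR implements.

```python
from typing import Any, Dict

# Assumed fallback for this sketch when the data source does not set a value.
DEFAULT_REUSE_MINUTES = 60


def result_reuse_kwargs(configuration: Dict[str, Any]) -> Dict[str, Any]:
    """Build PyAthena result-reuse keyword arguments from a data source configuration.

    The configuration keys used here are hypothetical.
    """
    if not configuration.get("result_reuse_enable", False):
        # Option off: pass nothing, so behaviour matches the existing data source.
        return {}
    minutes = int(configuration.get("result_reuse_minutes", DEFAULT_REUSE_MINUTES))
    if minutes < 0:
        raise ValueError("result_reuse_minutes must be non-negative")
    return {"result_reuse_enable": True, "result_reuse_minutes": minutes}


if __name__ == "__main__":
    print(result_reuse_kwargs({"result_reuse_enable": True, "result_reuse_minutes": 30}))
```

The returned dictionary could then be expanded into the PyAthena call, for example `cursor.execute(query, **result_reuse_kwargs(configuration))`. Returning an empty dictionary when the option is off keeps existing data sources unaffected.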

How is this tested?

  • Unit tests (pytest, jest)
  • E2E Tests (Cypress)
  • Manually
  • N/A

Ran an Athena query with this option enabled, using a manually built container image.

Related Tickets & Documents

N/A

Mobile & Desktop Screenshots/Recordings (if there are UI changes)

Screenshot_2024-10-23_at_1_49_15

@@ -116,7 +116,7 @@ pandas = "1.3.4"
 phoenixdb = "0.7"
 pinotdb = ">=0.4.5"
 protobuf = "3.20.2"
-pyathena = ">=1.5.0,<=1.11.5"
+pyathena = "2.25.2"
Contributor Author

@dtaniwaki dtaniwaki Oct 23, 2024

pyathena 3.x requires python >= 3.8.1, which conflicts with python >= 3.8 of the redash requirements, so I decided to use 2.x.
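
To illustrate the conflict: Python 3.8.0 satisfies `>=3.8` but not `>=3.8.1`, so a dependency with the stricter minimum cannot be installed on every interpreter the project declares support for. The sketch below checks this with plain tuple comparison; the helper names are hypothetical, and a real resolver handles far more specifier syntax than a simple minimum.

```python
from typing import Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Parse a dotted release string such as "3.8.1" into a tuple, padded to three parts."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def satisfies_min(version: str, minimum: str) -> bool:
    """Return True if `version` is at least `minimum`."""
    return parse(version) >= parse(minimum)


if __name__ == "__main__":
    # 3.8.0 is allowed by the project's own floor but rejected by the stricter one.
    print(satisfies_min("3.8.0", "3.8"), satisfies_min("3.8.0", "3.8.1"))
```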

@dtaniwaki dtaniwaki changed the title Support result reuse Support result reuse in Athena data sources Oct 23, 2024
@dtaniwaki
Contributor Author

@lucydodo @justinclift Could you consider merging this feature?

@lucydodo
Member

It looks good, but since the version of the pyathena package goes up quite a bit,
we should probably look into possible side effects before merging.
