Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-5615] Program to run SQL Logic Tests for Calcite #3145

Closed
wants to merge 23 commits into from

Conversation

mihaibudiu
Copy link
Contributor

This PR essentially adds more than 7 million SQL tests to Calcite, from the SQL Logic Test Suite built by sqlite.

This addresses https://issues.apache.org/jira/browse/CALCITE-5615

This is a larger PR but it is for a completely self-contained program with a main entry point.
The README.md contained in this PR documents the program structure and its intended usage.
Using this program will require users to install separately the tests from the SQL Logic Test Suite: https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki

The testing program finds multiple bugs in the default configuration of the Calcite compiler, both crashes and incorrect outputs. The plan is to file separate issues for each of these bugs as they are found.
However, this program is also intended to be extended to test other Calcite-based compilers (or it could be used for other SQL query engines; there is nothing special about Calcite in the implementation.)

@zabetak zabetak self-assigned this Apr 12, 2023
@zabetak zabetak changed the title Program to run SQL Logic Tests for Calcite [CALCITE-5615] Program to run SQL Logic Tests for Calcite Apr 12, 2023
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 36 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick pass leaving some comments along the way while trying to grasp what the contribution is about. I plan to leave also some questions comments under the JIRA ticket.

slt/src/main/java/org/apache/calcite/slt/Main.java Outdated Show resolved Hide resolved
slt/src/main/java/org/apache/calcite/slt/Main.java Outdated Show resolved Hide resolved
slt/src/main/java/org/apache/calcite/slt/Main.java Outdated Show resolved Hide resolved
slt/src/main/java/org/apache/calcite/slt/Utilities.java Outdated Show resolved Hide resolved
* Represents the data from a .test file from the
* SqlLogicTest test framework.
*/
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing to have javadoc and non-javadoc comment coming one after the other. Please combine them together and use proper javadoc annotations to ensure it displays properly.

slt/src/main/java/org/apache/calcite/slt/SLTTestFile.java Outdated Show resolved Hide resolved
@mihaibudiu
Copy link
Contributor Author

I will only address @zabetak 's comments once we understand how this should be packaged; I have sent an email to the dev mailing list asking for suggestions. One thing I would encourage people to do in the meantime is to actually run the code (first on one or two test files, these are specified on the command-line, e.g., select1.test) and look at the bugs found.

@zabetak
Copy link
Member

zabetak commented Apr 14, 2023

@mbudiu-vmw Sure it makes sense to wait a bit. I will actually run the code during the weekend; I am already convinced that it is a very useful contribution :) I am just not sure yet where/how it should land. Thanks again for taking the time to contribute this to the community.

julianhyde added a commit to hydromatic/sql-logic-test that referenced this pull request Apr 19, 2023
Imported from apache/calcite#3145
as of 2022/04/13. Omit build.gradle.kts and gradle.properties
because we are using Maven, not Gradle.

Disable Checkstyle

Add a basic test; add an option so that Main does not call System.exit
julianhyde pushed a commit to hydromatic/sql-logic-test that referenced this pull request Apr 20, 2023
Imported from apache/calcite#3145
as of 2022/04/13. Omit build.gradle.kts and gradle.properties
because we are using Maven, not Gradle.

Disable Checkstyle

Add a basic test; add an option so that Main does not call System.exit
julianhyde pushed a commit to hydromatic/sql-logic-test that referenced this pull request Apr 27, 2023
Imported from apache/calcite#3145
as of 2022/04/13. Omit build.gradle.kts and gradle.properties
because we are using Maven, not Gradle.

Disable Checkstyle

Add a basic test; add an option so that Main does not call System.exit
@julianhyde
Copy link
Contributor

I think this code could go into the existing plus module, not create a new slt module. Calcite has too many modules already.

slt/src/main/java/org/apache/calcite/slt/Main1.java Outdated Show resolved Hide resolved
slt/src/main/java/org/apache/calcite/slt/Main1.java Outdated Show resolved Hide resolved
slt/README.md Outdated Show resolved Hide resolved
settings.gradle.kts Outdated Show resolved Hide resolved
slt/src/test/java/org/apache/calcite/slt/TestCalcite.java Outdated Show resolved Hide resolved
slt/src/test/java/org/apache/calcite/slt/TestCalcite.java Outdated Show resolved Hide resolved
@mihaibudiu
Copy link
Contributor Author

@zabetak thank you for the comments, I will address them once sql-logic-test 2.0 is available in my next PR.
But I would appreciate help to decide how to package this tool.

@zabetak
Copy link
Member

zabetak commented May 6, 2023

But I would appreciate help to decide how to package this tool.

@mihaibudiu From my perspective (and Julian's I think) this PR should contain something equivalent to what is now TestCalcite inside the plus module and allow all SQL logic tests that pass to run. It should probably contain CalciteExecutor but nothing more. We just want to run to be able to run the tests as part of the Calcite suite but nothing more for now. If you want I can push the relevant commits following this idea in this PR.

@mihaibudiu
Copy link
Contributor Author

With the latest commit this is testing the CalciteExecutor against the candidate 0.2 release of sql-logic-test.

@mihaibudiu mihaibudiu force-pushed the slt branch 2 times, most recently from f48d5ab to e4b82c4 Compare May 21, 2023 23:41
@mihaibudiu mihaibudiu requested a review from zabetak May 22, 2023 16:02
plus/build.gradle.kts Outdated Show resolved Hide resolved
plus/build.gradle.kts Outdated Show resolved Hide resolved
plus/src/test/java/org/apache/calcite/slt/TestCalcite.java Outdated Show resolved Hide resolved
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@julianhyde
Copy link
Contributor

Rather than checking that the number of failures doesn't change, it's better to check that there are no failures other than the list of 'expected failures'

You don't want the situation where you fix one test and introduce a new failure. It's much better to have a list of known failures rather than just a count.

@mihaibudiu
Copy link
Contributor Author

Currently the "execute" method from Main only returns an integer and writes all diagnostic to the supplied streams.
We have to make it return the actual TestStatistics object.
So this will require another release.

@mihaibudiu
Copy link
Contributor Author

It took about 6 minutes to run 10K tests using CalciteExecutor, so by extrapolation it would take about 2 days to run all tests.
Memory consumption increases slightly with time. The failures are saved in a list which keeps growing. I am not sure whether there could be leaks somewhere else.

@zabetak
Copy link
Member

zabetak commented Jun 7, 2023

It took about 6 minutes to run 10K tests using CalciteExecutor, so by extrapolation it would take about 2 days to run all tests.

Here are some thoughts regarding duration and frequency:

  • Tests runtime ~5 minutes -> could run on every PR;
  • Tests runtime between 1h and 3h -> could possibly run once daily;
  • Tests runtime under 6h -> could run once per week;
  • Tests runtime spanning days -> per request/ not automated (maybe before release)
    It's up to us I guess to divide them into meaningful buckets (e..g, those that are likely to break more often should run on every PR) and setup the relevant CI infra.

Obviously we don't have to setup everything as part of this PR but it would be nice if we could at least put the 5minute bucket in place.

Memory consumption increases slightly with time.

I guess we can find ways to address memory pressure; some quick ideas:

  • Run each test file separately; this in principle should reset the memory after each run.
  • Provide options/APIs to save failures in files (instead of keeping ever growing in memory structures)

@mihaibudiu
Copy link
Contributor Author

I think having many small tests is probably better. There are 622 test files in the project.
We could write a small script to generate a test for each one (or a group of files).
This could also be beneficial if the tests can run in parallel.

However, JUnit5 has a capability called DynamicTests using the @TestFactory annotation, but Calcite does not currently use it: https://www.baeldung.com/junit5-dynamic-tests.
I will try to prototype this solution to see what happens. With DynamicTests we could even choose how many tests to run by sampling the test file list.

@mihaibudiu mihaibudiu force-pushed the slt branch 3 times, most recently from ab681c2 to b76a4a8 Compare July 18, 2023 02:13
Signed-off-by: Mihai Budiu <[email protected]>
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I am doing some final touches locally and will merge this to master in the next few days.

@zabetak zabetak added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Aug 17, 2023
@mihaibudiu
Copy link
Contributor Author

We could also regenerate the golden file to account for the fixes made in the meantime. But I am on vacation for the next week so I can't do it right away.

zabetak pushed a commit to zabetak/calcite that referenced this pull request Aug 24, 2023
@zabetak
Copy link
Member

zabetak commented Aug 24, 2023

@mihaibudiu I am now testing everything along with my changes here: https://github.com/zabetak/calcite/tree/slt

If run comes back green I will merge it and we can work on follow-ups after that.

@zabetak zabetak closed this in 996692e Aug 25, 2023
Anthrino pushed a commit to Anthrino/calcite that referenced this pull request Aug 29, 2023
@mihaibudiu mihaibudiu deleted the slt branch December 6, 2024 22:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left. slow-tests-needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants