-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PPL command implementation for appendCol
#990
base: main
Are you sure you want to change the base?
PPL command implementation for appendCol
#990
Conversation
appendCol
ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/utils/AppendColCatalystUtils.java
Show resolved
Hide resolved
ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/CatalystQueryPlanVisitor.java
Outdated
Show resolved
Hide resolved
ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/utils/AppendColCatalystUtils.java
Outdated
Show resolved
Hide resolved
ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/utils/AppendColCatalystUtils.java
Outdated
Show resolved
Hide resolved
ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/utils/AppendColCatalystUtils.java
Outdated
Show resolved
Hide resolved
Two high level questions:
And the sub-search syntax is opensearch-spark/ppl-spark-integration/src/main/antlr4/OpenSearchPPLParser.g4 Lines 27 to 29 in 957de4e
Seems this PR doesn't follow the sub-search syntax. I prefer to follow the current sub-search syntax, in case we could combine columns from different tables. for examples: source=employees | FIELDS name, dept, salary | APPENDCOL [ search source = company | stats count() as event_count ] But if this is intentional (appendcol only works for one table), I am okey for current syntax.
instead of
PS, what is the expected result of query |
ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/CatalystQueryPlanVisitor.java
Outdated
Show resolved
Hide resolved
Yep, the sub-search under |
ppl-spark-integration/src/main/java/org/opensearch/sql/ast/tree/AppendCol.java
Outdated
Show resolved
Hide resolved
ppl-spark-integration/src/main/java/org/opensearch/sql/ppl/CatalystQueryPlanVisitor.java
Show resolved
Hide resolved
@andy-k-improving I would go with the same behaviour to match with Splunk ... |
@andy-k-improving LGTM - please validate @LantaoJin question and respond here |
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Co-authored-by: Taylor Curran <[email protected]> Signed-off-by: Andy Kwok <[email protected]>
Co-authored-by: Taylor Curran <[email protected]> Signed-off-by: Andy Kwok <[email protected]>
Co-authored-by: Taylor Curran <[email protected]> Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
c44d1b6
to
d6a29fb
Compare
* sub-search: Executes PPL commands as a secondary search. The sub-search uses the same data specified in the source clause of the main search results as its input. | ||
|
||
|
||
#### Example 1: To append the result of `stats avg(age) as AVG_AGE` into existing search result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add an additional example here and a test in IT, which is the most commonly used scenario for the AppenCol command:
source=employees | stats avg(age) as avg_age1 by dept | APPENDCOL [ stats avg(age) as avg_age2 by dept ]
The result should look like
dept avg_age1 avg_age2
Sales 35.3333 35.3333
Engineering 27.25 27.25
Marketing 33 33
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if above comment addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides, please update https://github.com/opensearch-project/opensearch-spark/blob/main/docs/ppl-lang/PPL-Example-Commands.md as well.
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
Signed-off-by: Andy Kwok <[email protected]>
|
||
PPL query: | ||
|
||
os> source=employees | stats avg(age) as avg_age1 by dept | fields dept, avg_age1 | APPENDCOL [ stats avg(age) as avg_age2 by dept | fields avg_age2 ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens without | fields avg_age2
? @andy-k-improving
In Splunk, command source=employees | stats avg(age) as avg_age1 by dept | APPENDCOLS [ stats avg(age) as avg_age2 by dept ]
could display the same outputs. Does current implementation have to add those fields
to get the same outputs?
Description
Introduce the new PPL command
appendCol
which aim to aggregate result from multiple searches into a single comprehensive table for user to view.This is accomplished by reading both main-search and sub-search in the form of
node
then transform it into the following of SQL with by adding_row_number_
for the dataset's natural order, then join both main and sub search with the_row_number_
column.Related Issues
Resolves: #956
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Test plan: