Sub-query example query #15

Open
jbguerraz opened this issue Jan 25, 2021 · 7 comments

Comments

@jbguerraz
Contributor

Error 2/3

{
"batchSize":20480,
"columns":["__time","channel","cityName","comment","count","countryIsoCode","diffUrl","flags","isAnonymous","isMinor","isNew","isRobot","isUnpatrolled","metroCode","namespace","page","regionIsoCode","regionName","sum_added","sum_commentLength","sum_deleted","sum_delta","sum_deltaBucket","user"],
"dataSource":{"type":"query","query":{"queryType":"scan","dataSource":{"type":"table","name":"A"},"columns":["AT"],"intervals":{"type":"intervals","intervals":["1980-06-12T22:30:00.000Z/2020-01-26T23:00:00.000Z"]}}},
"filter":{"dimension":"countryName","extractionFn":{"locale":"","type":"lower"},"type":"selector","value":"france"},
"intervals":{"type":"intervals","intervals":["1980-06-12T22:30:00.000Z/2020-01-26T23:00:00.000Z"]},
"limit":10,
"order":"descending",
"queryType":"scan"
}

The above query fails when executed directly on Apache Druid, with the error below. I think this is a query error and not a bug in the code. Perhaps we just need to fix this query.

Time-ordering on scan queries is only supported for queries with segment specs of type MultipleSpecificSegmentSpec or SpecificSegmentSpec...a [MultipleIntervalSegmentSpec] was received instead.

Indeed, this query can never have worked. It was introduced here: 4995687#diff-7c1b8c5172fe7687c2af90f55f18e7e5eacf13af953ccc2bbeb5de3fa56e2688R25

It was probably used only to debug the object model of the query (specific to the circular dependency) rather than to get results. We should fix it so that it is valid and returns results, while still testing the sub-query case.

Originally posted by @jbguerraz in #13 (comment)

@saketbairoliya2
Contributor

Do we have any updates on this? @jbguerraz

@saketbairoliya2
Contributor

As I understand it, the Load() function returns the correct request body, but calling the Execute() function fails with an error about a null value:

ErrorMessage:Cannot construct instance of org.apache.druid.query.QueryDataSource, problem: 'query' must be nonnull .....
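
The error above means Druid received a dataSource of type "query" without a "query" field. A minimal sketch of a pre-flight check on the serialized request body follows; checkQueryDataSource is a hypothetical helper, not part of this library, and it assumes dataSource is encoded as a JSON object.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// checkQueryDataSource is a hypothetical helper. It reports an error when a
// request body declares a dataSource of type "query" but carries no nested
// "query" object. It assumes dataSource is a JSON object, not a bare string.
func checkQueryDataSource(body []byte) error {
	var req struct {
		DataSource struct {
			Type  string          `json:"type"`
			Query json.RawMessage `json:"query"`
		} `json:"dataSource"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return err
	}
	ds := req.DataSource
	if ds.Type == "query" && (len(ds.Query) == 0 || string(ds.Query) == "null") {
		return errors.New(`dataSource of type "query" has no "query" field`)
	}
	return nil
}

func main() {
	// Nested query stored under "-": Druid would see no "query" field.
	bad := []byte(`{"queryType":"timeseries","dataSource":{"type":"query","-":{"queryType":"topN"}}}`)
	// Nested query stored under "query": the shape Druid expects.
	good := []byte(`{"queryType":"timeseries","dataSource":{"type":"query","query":{"queryType":"topN"}}}`)

	fmt.Println(checkQueryDataSource(bad))
	fmt.Println(checkQueryDataSource(good))
}
```

Running such a check on the marshalled body before sending it would surface the missing field on the client side instead of as a Druid deserialization error.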

@saketbairoliya2
Contributor

For context - I'm trying this query:

{
	"dataSource": {
		"query": {
			"aggregations": [{
				"fieldName": "count",
				"name": "count",
				"type": "longSum"
			}],
			"dataSource": {
				"name": "dc_94b4f5fdfde940979b79c50539d8322a_b42fde98efed4e638a0016b34b3c10cf_dataset_pre",
				"type": "table"
			},
			"dimension": {
				"dimension": "string_value",
				"type": "default"
			},
			"filter": {
				"fields": [{
						"dimension": "_split_name_",
						"type": "selector",
						"value": "train"
					},
					{
						"dimension": "column_name",
						"type": "selector",
						"value": "addressState"
					}
				],
				"type": "and"
			},
			"granularity": "day",
			"intervals": {
				"intervals": [
					"2022-10-07T00:00:00Z/P1W"
				],
				"type": "intervals"
			},
			"metric": {
				"metric": "count",
				"type": "numeric"
			},
			"queryType": "topN",
			"threshold": 100
		},
		"type": "query"
	},
	"granularity": "day",
	"intervals": {
		"intervals": [
			"2022-10-07T00:00:00Z/P1W"
		],
		"type": "intervals"
	},
	"queryType": "timeseries"
}

@saketbairoliya2
Contributor

saketbairoliya2 commented Nov 17, 2022

This was the query sent to Druid. Why do we have a - here? To avoid a circular dependency?

{
	"queryType": "timeseries",
	"dataSource": {
		"type": "query",
		"-": {
			"queryType": "topN",
			"dataSource": {
				"type": "table",
				"name": "dc_94b4f5fdfde940979b79c50539d8322a_b42fde98efed4e638a0016b34b3c10cf_dataset_pre"
			},
			"intervals": {
				"type": "intervals",
				"intervals": ["2022-10-07T00:00:00Z/P1W"]
			},
			"dimension": {
				"type": "default",
				"dimension": "string_value"
			},
			"metric": {
				"type": "numeric",
				"metric": "count"
			},
			"threshold": 100,
			"filter": {
				"type": "and",
				"fields": [{
					"type": "selector",
					"dimension": "__split_name__",
					"value": "train"
				}, {
					"type": "selector",
					"dimension": "column_name",
					"value": "addressState"
				}]
			},
			"granularity": "day",
			"aggregations": [{
				"type": "longSum",
				"name": "count",
				"fieldName": "count"
			}]
		}
	},
	"intervals": {
		"type": "intervals",
		"intervals": ["2022-10-07T00:00:00Z/P1W"]
	},
	"granularity": "day"
}

@vigith
Collaborator

vigith commented Nov 17, 2022

We haven't tested sub-querying.

@jbguerraz
Contributor Author

As discussed on Slack, this comes from this change: b1a5a24#diff-8086ca5b4c31b5fcab1a5f0afc76bea40123e6a3852ecc1a19e1bba87e0a4b68R11

We have to restore it as it was in 4995687#diff-79e56959c31d638b4bdc3513e79921e69eefeb66836c1e985dbaa9585f8c968aR12

It could be related to #78.

@bourbonkk
Contributor

@saketbairoliya2 @jbguerraz @vigith
I wrote test code for the sub-query case in PR #82.

It includes one code change: the JSON tag on the Query field of the Query struct goes from "-,omitempty" to "query,omitempty":

type Query struct {
	Base
	Query builder.Query `json:"query,omitempty"`
}

This change is necessary for the sub-query to be serialized under the "query" key.

Please check it out.
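
To illustrate why the tag matters: in Go's encoding/json, the tag "-" omits a field, but "-," (here "-,omitempty") names the field "-" in the output. The sketch below uses stand-in types (inner, dashTagged, queryTagged are hypothetical and not the library's own structs) to show both encodings side by side.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inner stands in for a nested Druid query. Hypothetical type for illustration.
type inner struct {
	QueryType string `json:"queryType"`
}

// dashTagged uses the tag that produced the "-" key in the request body.
type dashTagged struct {
	Type  string `json:"type"`
	Query *inner `json:"-,omitempty"`
}

// queryTagged uses the corrected tag.
type queryTagged struct {
	Type  string `json:"type"`
	Query *inner `json:"query,omitempty"`
}

// encode marshals v to a JSON string and panics on failure, which keeps the
// example short.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	sub := &inner{QueryType: "topN"}

	// Prints {"type":"query","-":{"queryType":"topN"}}
	fmt.Println(encode(dashTagged{Type: "query", Query: sub}))

	// Prints {"type":"query","query":{"queryType":"topN"}}
	fmt.Println(encode(queryTagged{Type: "query", Query: sub}))
}
```

The first output matches the shape of the request that Druid rejected, with the nested query under "-"; the second puts it under "query".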


4 participants