Outline Query Hints #33

dwhitney · 2019-09-11T14:06:13Z

Hi, I'm doing some performance optimization over the next couple of weeks in our app, and I'm wondering if you could outline some of the existing query hints - how they work and what they do, etc.

Also, I see a query hint called SORTED_TRIPLES, but I don't think there is an implementation for it. In my custom graph my triples are indexed and sorted, and I think I could see some pretty good benefits from this particular query hint. Could you describe how it's supposed to work? If I get a chance, I will attempt to implement it and make a PR.

Thanks!


Callidon · 2019-09-12T06:54:40Z

Hello

Query hints are designed to work similarly to those implemented by Blazegraph <https://wiki.blazegraph.com/wiki/index.php/QueryHints>.
The idea is that you write specific RDF triples into your SPARQL query, and they tell the query optimizer how to do its job. Of course, these "query hint triples" are not evaluated as regular triple patterns by the query engine; they are just a convenient way of embedding query execution logic into the query. I really like this idea because, in my opinion, it's pretty elegant and very portable.

For example, in sparql-engine, the following query forces the optimizer to use the symmetric hash join operator to resolve all joins in the Basic Graph Pattern:

    PREFIX dblp-pers: <https://dblp.org/pers/m/>
    PREFIX dblp-rdf: <https://dblp.uni-trier.de/rdf/schema-2017-04-18#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX hint: <http://callidon.github.io/sparql-engine/hints#>
    SELECT ?name ?article WHERE {
      hint:Group hint:SymmetricHashJoin true.
      ?s rdf:type dblp-rdf:Person .
      ?s dblp-rdf:primaryFullPersonName ?name .
      ?s dblp-rdf:authorOf ?article .
    }

Here, the hint is the triple `hint:Group hint:SymmetricHashJoin true`. It does not require any more configuration than a classic query: you just put the hint into the query and execute it!
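To make the "hint triples are not ordinary patterns" idea concrete, here is a minimal sketch of how a Basic Graph Pattern could be split into hint triples and regular triple patterns before evaluation. The `Triple` shape, `splitHints`, and `hasHint` are hypothetical names introduced for illustration, and the boolean literal is simplified to the string `"true"`; this is not the actual sparql-engine code, whose internals are not shown in this thread.

```typescript
// Hypothetical, simplified triple representation (not the sparql-engine API).
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Namespace of the hint vocabulary, taken from the query above.
const HINT_NS = "http://callidon.github.io/sparql-engine/hints#";

interface SplitResult {
  hints: Triple[];    // triples addressed to the optimizer
  patterns: Triple[]; // triples that are evaluated against the data
}

// Split a Basic Graph Pattern into hint triples and regular triple patterns.
function splitHints(bgp: Triple[]): SplitResult {
  const hints: Triple[] = [];
  const patterns: Triple[] = [];
  for (const triple of bgp) {
    if (triple.predicate.startsWith(HINT_NS)) {
      hints.push(triple);
    } else {
      patterns.push(triple);
    }
  }
  return { hints, patterns };
}

// Check whether a boolean hint (e.g. "SymmetricHashJoin") is switched on.
function hasHint(hints: Triple[], name: string): boolean {
  return hints.some((t) => t.predicate === HINT_NS + name && t.object === "true");
}

const DBLP = "https://dblp.uni-trier.de/rdf/schema-2017-04-18#";
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

// The Basic Graph Pattern of the example query, with prefixes expanded.
const bgp: Triple[] = [
  { subject: HINT_NS + "Group", predicate: HINT_NS + "SymmetricHashJoin", object: "true" },
  { subject: "?s", predicate: RDF_TYPE, object: DBLP + "Person" },
  { subject: "?s", predicate: DBLP + "primaryFullPersonName", object: "?name" },
  { subject: "?s", predicate: DBLP + "authorOf", object: "?article" },
];

const split = splitHints(bgp);
const useSymmetricHashJoin = hasHint(split.hints, "SymmetricHashJoin");
```

In this sketch only `split.patterns` would be matched against the graph, while `useSymmetricHashJoin` would steer the choice of join operator.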

Initially, I planned to add various hints to the engine, including one that leverages sorted indexes for query processing (SORTED_TRIPLES). However, due to schedule constraints, I've only implemented the hint that enables symmetric hash joins.
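For readers unfamiliar with the operator, the following is a minimal sketch of the general symmetric hash join technique, not the sparql-engine implementation. The `Bindings` and `Arrival` types, the function name, and the sample rows are assumptions made for illustration; the interleaved arrival of rows from both inputs is modelled as a plain array.

```typescript
// A solution mapping: variable name -> value (hypothetical, simplified).
type Bindings = Record<string, string>;

// One row arriving from either join input.
interface Arrival {
  side: "left" | "right";
  row: Bindings;
}

// Symmetric hash join: every arriving row is stored in the hash table of its
// own side and immediately probed against the table of the other side, so
// results can be produced before either input has been fully read.
function symmetricHashJoin(arrivals: Arrival[], joinVar: string): Bindings[] {
  const tables = {
    left: new Map<string, Bindings[]>(),
    right: new Map<string, Bindings[]>(),
  };
  const results: Bindings[] = [];

  for (const { side, row } of arrivals) {
    const key = row[joinVar];
    if (key === undefined) {
      continue; // this sketch skips rows that do not bind the join variable
    }

    // 1. Insert the row into the hash table of its own side.
    const own = tables[side];
    const bucket = own.get(key);
    if (bucket === undefined) {
      own.set(key, [row]);
    } else {
      bucket.push(row);
    }

    // 2. Probe the other side's table and emit every match.
    const other = side === "left" ? tables.right : tables.left;
    for (const match of other.get(key) ?? []) {
      results.push(side === "left" ? { ...row, ...match } : { ...match, ...row });
    }
  }
  return results;
}

// Illustrative data only: rows from both inputs arrive interleaved.
const arrivals: Arrival[] = [
  { side: "left", row: { s: "p1", name: "Alice" } },
  { side: "right", row: { s: "p1", article: "a1" } },
  { side: "right", row: { s: "p2", article: "a2" } },
  { side: "left", row: { s: "p2", name: "Bob" } },
];

const joined = symmetricHashJoin(arrivals, "s");
```

The first result is emitted as soon as the second row arrives, which is the property that makes this operator attractive when both inputs are streamed.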

If you want to implement new query hints, feel free to! However, for the SORTED_TRIPLES hint, I already have the code that implements it; I'm just too busy to test it properly. If that works for you, I should be able to put it in production at the end of September.
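This thread does not describe how SORTED_TRIPLES is meant to work internally. As background only, one common way to exploit inputs that are already sorted on the join variable is a merge join, sketched below. This is an assumption about a possible technique, not the code mentioned above; the function name, the `Bindings` type, and the sample rows are hypothetical.

```typescript
// A solution mapping: variable name -> value (hypothetical, simplified).
type Bindings = Record<string, string>;

// Merge join of two inputs that are BOTH already sorted on joinVar
// (ascending, plain string comparison). Assumes every row binds joinVar.
function mergeJoin(left: Bindings[], right: Bindings[], joinVar: string): Bindings[] {
  const results: Bindings[] = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    const l = left[i][joinVar];
    const r = right[j][joinVar];
    if (l < r) {
      i++; // left key is smaller: advance the left cursor
    } else if (l > r) {
      j++; // right key is smaller: advance the right cursor
    } else {
      // Find the end of the run of equal keys on the right side.
      let jEnd = j;
      while (jEnd < right.length && right[jEnd][joinVar] === l) {
        jEnd++;
      }
      // Pair every left row carrying this key with every right row in the run.
      while (i < left.length && left[i][joinVar] === l) {
        for (let k = j; k < jEnd; k++) {
          results.push({ ...left[i], ...right[k] });
        }
        i++;
      }
      j = jEnd;
    }
  }
  return results;
}

// Illustrative data only, sorted on "s".
const names: Bindings[] = [
  { s: "p1", name: "Alice" },
  { s: "p2", name: "Bob" },
  { s: "p4", name: "Dana" },
];
const articles: Bindings[] = [
  { s: "p1", article: "a1" },
  { s: "p1", article: "a2" },
  { s: "p3", article: "a3" },
  { s: "p4", article: "a4" },
];

const merged = mergeJoin(names, articles, "s");
```

Each input is read once, front to back, and no hash table is built, which is why sorted indexes can pay off for joins.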

If you have any more questions, feel free to ask!

dwhitney · 2019-09-12T14:43:15Z

Sounds great! If you'd like, you can paste the code for `SORTED_TRIPLES` into a gist and I can try to implement and test it, but I can wait until the end of September too. I will let you know if I have any questions, and again, thanks so much for this awesome library!


dwhitney · 2019-09-24T13:50:12Z

Having some thoughts on this...

I'm not sure if you are familiar with Datomic. It's a database created by Rich Hickey, who created Clojure, and, among other things, it uses Datalog as a query language. One of the things you must do when defining a schema is specify a cardinality for each attribute that you define. In the example I linked to, a person has one name. I would imagine this is required because it would dramatically speed up a join pipeline if you knew there was only one value to find as opposed to multiple. I'd imagine adding a query hint for cardinality in sparql-engine would likely have the same effect. Perhaps that would be worth adding? Do you think it would be difficult? I could take a crack at it.
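A rough sketch of the intuition, under the assumption that a cardinality hint would let a lookup stop at the first match: everything here (`Cardinality`, `lookupObjects`, the sample triples) is hypothetical and exists neither in sparql-engine nor in Datomic. A linear scan is used only to make the saved work visible; a real engine would use its indexes.

```typescript
// Hypothetical, simplified triple representation.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

type Cardinality = "one" | "many";

interface LookupResult {
  objects: string[]; // matching object values
  scanned: number;   // how many triples were inspected
}

// Find the objects of (subject, predicate) by scanning the triples.
// If the predicate is declared to have cardinality "one", the scan stops
// at the first match instead of inspecting the remaining triples.
function lookupObjects(
  triples: Triple[],
  subject: string,
  predicate: string,
  cardinality: Cardinality
): LookupResult {
  const objects: string[] = [];
  let scanned = 0;
  for (const t of triples) {
    scanned++;
    if (t.subject === subject && t.predicate === predicate) {
      objects.push(t.object);
      if (cardinality === "one") {
        break; // no other value can exist, so stop early
      }
    }
  }
  return { objects, scanned };
}

// Illustrative data only.
const triples: Triple[] = [
  { subject: "p1", predicate: "name", object: "Alice" },
  { subject: "p1", predicate: "authorOf", object: "a1" },
  { subject: "p1", predicate: "authorOf", object: "a2" },
  { subject: "p2", predicate: "name", object: "Bob" },
];

const withHint = lookupObjects(triples, "p1", "name", "one");
const withoutHint = lookupObjects(triples, "p1", "name", "many");
const allArticles = lookupObjects(triples, "p1", "authorOf", "many");
```

Both name lookups return the same value, but the hinted one inspects fewer triples. Whether this translates into a measurable gain in a real join pipeline would need to be benchmarked.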

Callidon added the question label (Further information is requested) · Sep 12, 2019
