Outline Query Hints #33
Comments
Sounds great! If you'd like, you can paste the code for `SORTED_TRIPLES`
into a gist and I can try to implement and test it but I can wait until the
end of September too.
I will let you know if I have any questions, and again, thanks so much for
this awesome library!
On Thu, Sep 12, 2019, 2:54 AM Thomas Minier wrote:
Hello
Query Hints are designed to work similarly to those implemented by
Blazegraph <https://wiki.blazegraph.com/wiki/index.php/QueryHints>.
The idea is that you write these very specific RDF triples into your
SPARQL query, and they give the query optimizer hints about how to do
its job. Of course, these "query hint triples" are not evaluated by the
query engine as part of the query; they are just a convenient way of
embedding query execution logic into the query. I really like this idea
because, in my opinion, it's elegant and very portable.
For example, in sparql-engine, the following query forces the optimizer
to use the symmetric hash join operator to resolve all joins in the Basic
Graph Pattern.
PREFIX dblp-pers: <https://dblp.org/pers/m/>
PREFIX dblp-rdf: <https://dblp.uni-trier.de/rdf/schema-2017-04-18#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX hint: <http://callidon.github.io/sparql-engine/hints#>
SELECT ?name ?article WHERE {
  hint:Group hint:SymmetricHashJoin true .
  ?s rdf:type dblp-rdf:Person .
  ?s dblp-rdf:primaryFullPersonName ?name .
  ?s dblp-rdf:authorOf ?article .
}
Here, the hint is the triple `hint:Group hint:SymmetricHashJoin true`. It
does not require any more configuration than an ordinary query: you just put
the hint into the query and execute it!
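To make the mechanism concrete, here is a minimal sketch of how an optimizer could pull hint triples out of a basic graph pattern before evaluating it. It is written in Python for brevity and is not sparql-engine's actual code; the `split_hints` function and the tuple representation of triples are assumptions made for illustration. Only the hint namespace and the `hint:Group hint:SymmetricHashJoin true` triple come from the query above.

```python
# Hint namespace taken from the PREFIX declaration in the query above.
HINT_NS = "http://callidon.github.io/sparql-engine/hints#"
DBLP = "https://dblp.uni-trier.de/rdf/schema-2017-04-18#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def split_hints(bgp):
    """Separate hint triples from ordinary triple patterns (hypothetical helper).

    `bgp` is a list of (subject, predicate, object) string tuples.
    Returns (hints, patterns), where hints maps (scope, name) to a value.
    """
    hints = {}
    patterns = []
    for s, p, o in bgp:
        if s.startswith(HINT_NS) and p.startswith(HINT_NS):
            # A hint triple: record it and keep it away from the data source.
            hints[(s[len(HINT_NS):], p[len(HINT_NS):])] = o
        else:
            patterns.append((s, p, o))
    return hints, patterns


# The basic graph pattern of the example query, with prefixes expanded.
bgp = [
    (HINT_NS + "Group", HINT_NS + "SymmetricHashJoin", "true"),
    ("?s", RDF_TYPE, DBLP + "Person"),
    ("?s", DBLP + "primaryFullPersonName", "?name"),
    ("?s", DBLP + "authorOf", "?article"),
]

hints, patterns = split_hints(bgp)
use_symmetric_hash_join = hints.get(("Group", "SymmetricHashJoin")) == "true"
```

After the split, only the three ordinary patterns would be matched against the graph, and the recorded hint would steer the choice of join operator.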
Initially, I planned to add various hints to the engine, including one
that leverages sorted indexes for query processing (SORTED_TRIPLES).
However, due to schedule constraints, I've only implemented the hint that
enables symmetric hash joins.
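For readers unfamiliar with the operator, a symmetric hash join keeps one hash table per input and probes the opposite table as each row arrives, so results can be emitted before either input is fully read. The following is a minimal Python sketch of that general technique, not the operator shipped in sparql-engine; the function name, the dict-based rows, and the sample data are assumptions for illustration.

```python
def symmetric_hash_join(left, right, key):
    """Join two iterables of dict rows on `key`, reading them alternately."""
    left_table, right_table = {}, {}
    # Each entry: (input iterator, its own hash table, the opposite table).
    active = [
        (iter(left), left_table, right_table),
        (iter(right), right_table, left_table),
    ]
    while active:
        still_active = []
        for rows, own, other in active:
            row = next(rows, None)
            if row is None:
                continue  # this input is exhausted; stop polling it
            # Insert the row into this side's table ...
            own.setdefault(row[key], []).append(row)
            # ... then probe the other side's table for matches seen so far.
            for match in other.get(row[key], ()):
                yield {**match, **row}
            still_active.append((rows, own, other))
        active = still_active


names = [{"s": "a", "name": "Alice"}, {"s": "b", "name": "Bob"}]
articles = [
    {"s": "a", "article": "p1"},
    {"s": "a", "article": "p2"},
    {"s": "c", "article": "p3"},
]
joined = list(symmetric_hash_join(names, articles, "s"))
```

Because each row is inserted before the opposite table is probed, every matching pair is produced exactly once, whichever side arrives first.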
If you want to implement new query hints, feel free to! As for the
SORTED_TRIPLES hint, though, I already have the code that implements it;
I'm just too busy to test it properly. If that works for you, I should be
able to put it into production at the end of September.
If you have any more questions, feel free to ask!
I've had some thoughts on this. I'm not sure if you are familiar with Datomic: it's a database created by Rich Hickey, the creator of Clojure, and, among other things, it uses Datalog as its query language. When defining a schema, you must specify a cardinality for each attribute. In the example I linked to, a person has one name. I would imagine this is required because it can dramatically speed up a join pipeline if you know there is only one value to find rather than several. I'd imagine that adding a cardinality query hint to sparql-engine would have a similar effect. Perhaps that would be worth adding? Do you think it would be difficult? I could take a crack at it.
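As a rough illustration of the intuition, the sketch below shows a lookup that stops scanning once it finds a value for an attribute declared to have cardinality one. This is purely hypothetical: no such hint exists in sparql-engine according to this thread, it is not how Datomic is implemented, and the `lookup` function and its `cardinality_one` flag are made up for the example.

```python
def lookup(triples, subject, predicate, cardinality_one=False):
    """Return the objects matching (subject, predicate) in a list of triples.

    With cardinality_one=True, the scan stops at the first match, because
    the caller has promised that at most one value exists.
    """
    values = []
    for s, p, o in triples:
        if s == subject and p == predicate:
            values.append(o)
            if cardinality_one:
                break  # no further match is possible, so skip the rest
    return values


triples = [
    ("person1", "name", "Alice"),
    ("person1", "authorOf", "p1"),
    ("person1", "authorOf", "p2"),
]
name = lookup(triples, "person1", "name", cardinality_one=True)
papers = lookup(triples, "person1", "authorOf")
```

The saving comes only from the early exit; a wrong cardinality declaration would silently drop results, so such a hint would have to be trusted or validated.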
Hi, I'm doing some performance optimization over the next couple of weeks in our app, and I'm wondering if you could outline some of the existing query hints: how they work, what they do, etc.
Also, I see a query hint called `SORTED_TRIPLES`, but I don't think there is an implementation for it. In my custom graph my triples are indexed and sorted, and I think I could see some pretty good benefits from this particular query hint. Could you describe how it's supposed to work? If I get a chance, I will attempt to implement it and make a PR.
Thanks!
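The thread does not say how `SORTED_TRIPLES` is meant to work. One common way to exploit inputs that are already sorted on the join key is a merge join, which walks both inputs in a single pass without building hash tables. The Python sketch below shows that general technique under the assumption that each input is a list of (key, value) pairs sorted by key; it is not the unreleased sparql-engine code.

```python
def merge_join(left, right):
    """Join two lists of (key, value) pairs that are each sorted by key."""
    results = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1  # left side is behind; advance it
        elif left_key > right_key:
            j += 1  # right side is behind; advance it
        else:
            # Find the run of rows on the right that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == left_key:
                for _, right_value in right[j:j_end]:
                    results.append((left_key, left[i][1], right_value))
                i += 1
            j = j_end
    return results


names = [("a", "Alice"), ("b", "Bob")]
articles = [("a", "p1"), ("a", "p2"), ("c", "p3")]
joined = merge_join(names, articles)
```

The join itself is linear in the size of the inputs plus the output, which is where a graph with sorted indexes could benefit.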