Optimizations #114

nsbgn · 2023-07-15T14:47:59Z

SPARQL can only take us so far, so we will probably need a smarter approach to make this properly scalable. That said, for https://github.com/quangis/quangis-workflow we need to get rid of the memory errors as soon as possible, so:

Be smarter about selecting on the bag-of-types by only keeping the most specific types and not having SPARQL unions unless absolutely necessary (implemented as of e0a1a7d, 30bc861, 499d255, but poorly thought through, poorly implemented and poorly tested). In essence, we're now removing some constraints that are already guaranteed to hold in the presence of other constraints.
Also eliminate pointless UNIONs in ordered data (671de1f, a258e93)
Order the bag-of-types such that the most specific types come first
Use subqueries to limit ordered queries
Separate :contains predicates for types and operators, so that the search is more directed.
Annotate the transformation graphs as directly as possible by saving all supertypes of each conceptual step on the step itself, so that we can write `?workflow :containsType <A>` and `?step :subtypeOf <B>` instead of, respectively, `?workflow :containsType ?A. ?A rdfs:subClassOf* <A>` and `?step :type ?B. ?B rdfs:subClassOf* <B>`. This is the biggest improvement.
Every step in the transformation graph should record from which steps it is reachable/which steps it depends on. Then we don't need property paths and can just select on type, select on reachability, done. Should make for another huge improvement. (Note: if we were using trees, we could also record the "path" on each step, but that gets exponential for DAGs)
Given the above, we can drop steps that themselves depend on other steps that match.
Record the distance from the output on every step. That way, we can force breadth-first search even in SPARQL.
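
To make the annotation ideas above concrete, here is a minimal sketch in plain Python, not the actual implementation. The type hierarchy, the step names and the helpers `supertypes`, `most_specific` and `annotate` are all hypothetical and exist only for illustration. It assumes the transformation graph is a DAG with a single output step.

```python
from collections import deque

# Hypothetical toy type hierarchy: type -> its direct supertypes.
SUBCLASS_OF = {
    "Ratio": {"Interval"},
    "Interval": {"Ordinal"},
    "Ordinal": {"Nominal"},
    "Nominal": set(),
}

# Hypothetical transformation graph: step -> (type, steps it directly depends on).
STEPS = {
    "out": ("Ratio", ["s1", "s2"]),
    "s1": ("Interval", ["s3"]),
    "s2": ("Ordinal", ["s3"]),
    "s3": ("Nominal", []),
}


def supertypes(t, hierarchy):
    """Return t together with all of its (transitive) supertypes."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in hierarchy.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific(types, hierarchy):
    """Drop every type that is a strict supertype of another type in the bag.

    Such a constraint is redundant once all supertypes are stored on each step.
    """
    types = set(types)
    return {
        t for t in types
        if not any(t != u and t in supertypes(u, hierarchy) for u in types)
    }


def annotate(steps, hierarchy, output):
    """Precompute supertypes, transitive dependencies and distance per step."""
    depends_on = {}

    def deps(step):
        # Memoised transitive dependencies; assumes the graph has no cycles.
        if step not in depends_on:
            acc = set()
            for d in steps[step][1]:
                acc.add(d)
                acc |= deps(d)
            depends_on[step] = acc
        return depends_on[step]

    # Breadth-first traversal from the output gives the shortest distance.
    distance = {output: 0}
    queue = deque([output])
    while queue:
        s = queue.popleft()
        for d in steps[s][1]:
            if d not in distance:
                distance[d] = distance[s] + 1
                queue.append(d)

    return {
        s: {
            "subtype_of": supertypes(steps[s][0], hierarchy),
            "depends_on": deps(s),
            "distance": distance.get(s),  # None if unreachable from output
        }
        for s in steps
    }


annotated = annotate(STEPS, SUBCLASS_OF, "out")
```

With annotations like these, a match becomes a set-membership test (`"Ordinal" in annotated[s]["subtype_of"]`, `"s3" in annotated["out"]["depends_on"]`) rather than a property-path traversal, and sorting candidate steps by `distance` approximates a breadth-first order. The trade-off is a larger stored graph, since every step carries its full supertype and dependency sets.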


nsbgn added a commit that referenced this issue Jul 22, 2023

Remove more pointless UNIONs (#114)

671de1f

Building off of a258e93

nsbgn added a commit that referenced this issue Jul 23, 2023

Skip matches on the same branch.

49a56af

Attempt at optimization, #114
