-
Notifications
You must be signed in to change notification settings - Fork 195
Example: Serving out an RDF version of a VG graph with Apache Fuseki
Eric T Dawson edited this page Sep 15, 2016
·
1 revision
in VG root directory, make a graph from a multi-sequence FASTA:
vg msga -f test/GRCh38_alts/FASTA/HLA/B-3106.fa -D > b3106.vg
then convert it to a Turtle file:
vg view -p -r http://github.com/vgteam/vg -t b3106.vg > b3106.ttl
There are pipe symbols in the gi names. This is a problem at the moment see vg#245.
sed -ie 's/\|/\%7C/' b3106.ttl
I'm on MacOS X, using the homebrew package manager:
brew install fuseki
Start it up in update mode, so I can interactively feed it the turtle file:
fuseki-server --update --mem /b3106
Feed the turtle to Fuseki:
s-put -v http://localhost:3030/b3106/data default b3106.ttl
http://localhost:3030/control-panel.tpl
then select the b3106 dataset, to get to this page:
http://localhost:3030/sparql.tpl
PREFIX vg:<http://example.org/vg/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?path (group_concat(?sequence; separator="") as ?pathSeq)
WHERE {
BIND(<http://example.org/vg/path/gi%7C568815592:31353871-31357211> as ?path)
?step vg:path ?path;
vg:node ?node;
vg:rank ?rank.
?node rdf:value ?sequence
}
GROUP BY ?path
ORDER BY ?rank
Second query: Return all paths passing through a given node together with the rank this node has in each path
PREFIX vg:<http://example.org/vg/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?path ?rank
WHERE {
?step vg:node <http://example.org/vg/node/485>;
vg:rank ?rank;
vg:path ?path.
}
PREFIX vg:<http://example.org/vg/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vgnode:<http://example.org/vg/node/>
SELECT ?nearbynode ?hopcount
WHERE {
SELECT ?nearbynode (count(?hop) as ?hopcount)
WHERE {
vgnode:261 vg:linksForwardToForward* ?hop.
?hop vg:linksForwardToForward+ ?nearbynode .
}
GROUP BY ?nearbynode
HAVING (?hopcount < 10)
ORDER BY ?hopcount
}
Problem is, it doesn't exactly return what you may think: It returns all nodes such that there are at most 10 nodes in any paths between the returned node and the selected node.
I'm returning pairs of nodes connected by an edge in the graph, to get at the full graph topology.
PREFIX vg:<http://example.org/vg/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vgnode:<http://example.org/vg/node/>
SELECT ?subgraphnode_l ?subgraphnode_r
WHERE {
vgnode:261 vg:linksForwardToForward* ?subgraphnode_l .
?subgraphnode_l vg:linksForwardToForward ?subgraphnode_r .
?subgraphnode_r vg:linksForwardToForward* vgnode:280 .
}
ORDER BY ?subgraphnode_l ?subgraphnode_r