The Stardog com.complexible.stardog.plan.filter.functions.Function interface is the extension point for section 17.6 (Extensible Value Testing) of the SPARQL spec.
Function
corresponds to built-in expressions used in FILTER
, BIND
and SELECT
expressions, as well as
aggregate operators in a SPARQL query. Examples include &&
and ||
and functions defined in the
SPARQL spec like sameTerm
, str
, and now
.
The starting point for implementing your own custom function is to extend AbstractFunction.
This class provides much of the basic scaffolding for implementing a new Function
from scratch.
If your new function falls into one of the existing categories, it should implement the appropriate marker interface:
com.complexible.stardog.plan.filter.functions.cast.CastFunction
com.complexible.stardog.plan.filter.functions.datetime.DateTimeFunction
com.complexible.stardog.plan.filter.functions.hash.HashFunction
com.complexible.stardog.plan.filter.functions.numeric.MathFunction
com.complexible.stardog.plan.filter.functions.rdfterm.RDFTermFunction
com.complexible.stardog.plan.filter.functions.string.StringFunction
If not, then it must implement com.complexible.stardog.plan.filter.functions.UserDefinedFunction
. Extending one
of these marker interfaces is required for the Function
to be traverseable via the visitor pattern.
A zero-argument constructor must be provided which delegates some initialization to super
, providing first the int
number of required arguments followed by one or more URIs which identify the function. Any these URIs can be used to
identify the function in a SPARQL query. The URIs are typed as String
but should be valid URIs.
For functions which take take a range of arguments, for example a minimum of 2, but no more than 4 values, a
Range
can be used as the first parameter passed to super
rather than an int
.
Function
extends from Copyable
, therefore implementations should also provide a "copy constructor" which can be
called from the copy
method:
private MyFunc(final MyFunc theFunc) {
super(myFunc);
// make copies of any local data structures
}
@Override
public MyFunc copy() {
return new MyFunc(this);
}
Evaluating the function is handled by Value internalEvaluate(final Value...)
The parameters of this method correspond
to the arguments passed into the function; it's the values of the variables for each solution of the query. Here we can
perform whatever actions are required for our function. AbstractFunction
will have already taken care of validating
that we're getting the correct number of arguments to the function, but we still have to validate the input.
AbstractFunction
provides some convenience methods to this end, for example assertURI
and assertNumericLiteral
for requiring that inputs are either a valid URI, or a literal with a numeric datatype respectively.
Errors that occur in the evaluation of the function should throw a
com.complexible.stardog.plan.filter.ExpressionEvaluationException
; this corresponds to the ValueError
concept
defined in the SPARQL specification.
Create a file called com.complexible.stardog.plan.filter.functions.Function
in the META-INF/services
directory.
The contents of this file should be all of the fully-qualified class names for your custom Function(s). The
jar containing this META-INF/services
directory as well as the implementations for the Function(s) it references is
added on the classpath. Stardog will pick up the implementations on startup by using the JDK ServiceLoader
framework.
Functions are identified by their URI; you can reference them in a query using their fully-qualified URI, or specify
prefixes for the namespaces and utilize only the qname. For this example, if the namespace tag:stardog:api:
is
associated with the prefix stardog
and within that namespace we have our function myFunc
we can invoke it
from a SPARQL query as: bind(stardog:myFunc(?var) as ?tc)
While the SPARQL specification has an extension point for value testing and allows for custom functions in
FILTER
/BIND
/SELECT
expressions, there is no similar mechanism for aggregates. The space of aggregates is closed
by definition, all legal aggregates are enumerated in the spec itself.
However like with custom functions, there are many use cases for creating and using custom aggregate functions. Stardog provides a mechanism for creating and using custom aggregates without requiring custom SPARQL syntax.
To implement a custom aggregate, you should extend AbstractAggregate.
The rules regarding constructor, "copy constructor" and the copy
method for Function
apply to Aggregate
as well.
Two methods must be implemented for custom aggregates, Value _getValue() throws ExpressionEvaluationException
and
void aggregate(final Value theValue, final long theMultiplicity) throws ExpressionEvaluationException
. _getValue
returns the computed aggregate value while aggregate
adds a Value to the current running aggregation. In terms of
the COUNT
aggregate, aggregate
would increment the counter and _getValue
would return the final count.
The multiplicity argument to aggregate
corresponds to the fact that intermediate solution sets have a
multiplicity associated with them. It's most often 1, but joins and choice of the indexes used for the scans
internally can affect this. Rather than repeating the solution N times, we associate a multiplicity of N with the
solution. Again, in terms of COUNT
, this would mean that rather than incrementing the count by 1
, it would be
incremented by the multiplicity.
Aggregates such as COUNT
or SAMPLE
are implementations of Function
in the same way sameTerm
or str
are and
are registered with Stardog in the exact same manner.
You can use your custom aggregates just like any other aggregate function. Assuming we have
a custom aggregate gmean
defined in the tag:stardog:api:
namespace, we can refer to it within a query as such:
PREFIX : <http://www.example.org>
PREFIX stardog: <tag:stardog:api:>
SELECT (stardog:gmean(?O) AS ?C)
WHERE { ?S ?P ?O }