Extension of the metric reported by the benchmark runner #12

Merged
Changes from 8 commits
15 commits
d933d2b
Standard deviation added to the result measurements and the unit test…
constraintAutomaton Sep 25, 2024
055f924
New IAggregateResult created where the raw time measurements are added.
constraintAutomaton Sep 25, 2024
1a4af06
Fix the faulty documentation of the outputRawResults field.
constraintAutomaton Sep 25, 2024
0bc65da
outputRawResults option added to the CLI interface.
constraintAutomaton Sep 25, 2024
7bb3dd0
Readme updated to include the outputRawResults field changes.
constraintAutomaton Sep 25, 2024
e730d22
Readme updated to present better the example input.
constraintAutomaton Sep 25, 2024
a70e1cf
Fix a small typo in the readme.
constraintAutomaton Sep 25, 2024
f87eec2
Example queries for the docker example added.
constraintAutomaton Sep 26, 2024
0601e4f
Comment about the IResult in the SparqlBenchmarkRunner test deleted.
constraintAutomaton Sep 26, 2024
94ab4ec
Example queries for the docker documentation deleted.
constraintAutomaton Sep 26, 2024
1a5b00a
raw results rename for iteration result
constraintAutomaton Sep 26, 2024
6af7ce6
Update the csv example output with an explanation of the timeAggregat…
constraintAutomaton Sep 26, 2024
867f984
Small rewording of in the readme for the explanation of the CLI program.
constraintAutomaton Sep 26, 2024
1197d14
Renamed interface for iteration results deleted and merged with IAggr…
constraintAutomaton Sep 27, 2024
90b7af1
Merge branch 'feature/extended_results' of github.com:constraintAutom…
constraintAutomaton Sep 27, 2024
53 changes: 49 additions & 4 deletions README.md
Expand Up @@ -55,6 +55,24 @@ npm install sparql-benchmark-runner
```

## Usage
`sparql-benchmark-runner` can be used as a CLI program with the following options.
```
Options:
--version Show version number [boolean]
--endpoint URL of the SPARQL endpoint to send queries to
[string] [required]
--queries Directory of the queries [string] [required]
--replication Number of replication runs [number] [default: 5]
--warmup Number of warmup runs [number] [default: 1]
--output Destination for the output CSV file
[string] [default: "./output.csv"]
--timeout Timeout value in seconds to use for individual queries
[number]
--outputRawResults A flag indicating if SPARQL Benchmark Runner should also
output the raw results [boolean] [default: false]
--help Show help [boolean]
```
An example input is the following.

```bash
sparql-benchmark-runner \
Expand Down Expand Up @@ -95,6 +113,7 @@ async function executeQueries(pathToQueries, pathToOutputCsv) {
availabilityCheckTimeout: 1_000,
logger: (message) => console.log(message),
resultAggregator,
outputRawResults: false // Defaults to false; when true, the raw execution times (timeAggregate field) are added to the aggregated results
});

const results = await runner.run();
Expand All @@ -113,16 +132,42 @@ docker run \
--rm \
--interactive \
--tty \
--volume $(pwd)/output.csv:/tmp/output.csv \
--volume $(pwd)/queries:/tmp/queries \
--volume $(pwd)/output.csv:/output.csv \
--volume $(pwd)/queries:/queries \
comunica/sparql-benchmark-runner \
--endpoint https://dbpedia.org/sparql \
--queries /tmp/queries \
--output /tmp/output.csv \
--queries /queries \
--output /output.csv \
--replication 5 \
--warmup 1
```
[Example queries](https://gist.github.com/davidsbatista/cdce57196bf84e3a988427b4d9ef9035) in `./queries/C1.text`:

```
SELECT ?property ?hasValue ?isValueOf
WHERE {
{ <http://dbpedia.org/resource/Broccoli> ?property ?hasValue }
UNION
{ ?isValueOf ?property <http://dbpedia.org/resource/Broccoli> }
}

SELECT DISTINCT(?isValueOf)
WHERE {
?isValueOf <http://purl.org/dc/terms/subject> ?value .
{
SELECT DISTINCT(?value)
WHERE {
?resource <http://purl.org/dc/terms/subject> ?value .
FILTER regex(?value, "Category:American_.*_descent", "i")
}
}
}
ORDER BY ?isValueOf

SELECT DISTINCT ?ingredient_name
WHERE { ?food_recipe <http://dbpedia.org/ontology/ingredient> ?ingredient_name }
ORDER BY ?ingredient_name
```
## License

This code is copyrighted by [Ghent University – imec](http://idlab.ugent.be/)
Expand Down
6 changes: 6 additions & 0 deletions bin/sparql-benchmark-runner.ts
Expand Up @@ -61,6 +61,11 @@ async function main(): Promise<void> {
coerce: (arg: number) => arg * 1_000,
number: true,
},
outputRawResults: {
type: 'boolean',
description: 'A flag indicating if SPARQL Benchmark Runner should also output the raw results',
default: false,
},
})
.help()
.parse();
Expand All @@ -72,6 +77,7 @@ async function main(): Promise<void> {
warmup: args.warmup,
timeout: args.timeout,
availabilityCheckTimeout: 1_000,
outputRawResults: args.outputRawResults,
logger,
});
const results: IAggregateResult[] = await runner.run();
Expand Down
12 changes: 12 additions & 0 deletions lib/Result.ts
Expand Up @@ -18,14 +18,26 @@ export interface IResult extends IResultMetadata {

/**
* Aggregate result for multiple executions of a query.
* The timestamps inherited from IResult hold the average timestamps.
* Timestamps record the arrival of each result of the query, whereas the times
* are the query execution times.
*/
export interface IAggregateResult extends IResult {
resultsMin: number;
resultsMax: number;
timeMin: number;
timeMax: number;
timeStd: number;
timestampsMin: number[];
timestampsMax: number[];
timestampsStd: number[];
replication: number;
failures: number;
}

/**
* Aggregate result that also carries the raw execution times from multiple executions of a query.
*/
export interface IRawAggregateResult extends IAggregateResult {
timeAggregate: number[];
}
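A minimal sketch of how a consumer might use the `timeAggregate` field, assuming (as in `aggregateRawGroupedResults` in this diff) that failed executions are recorded as `NaN`. The helper `medianOfSuccessfulTimes` and the example values are hypothetical and not part of the library.

```typescript
// Hypothetical helper, not part of sparql-benchmark-runner.
// Computes the median of the raw execution times, skipping the NaN
// placeholders that mark failed executions.
function medianOfSuccessfulTimes(timeAggregate: number[]): number {
  // The global isNaN is safe here because every element is typed as a number.
  const times = timeAggregate.filter(time => !isNaN(time)).sort((a, b) => a - b);
  if (times.length === 0) {
    // No successful execution: there is no meaningful median.
    return Number.NaN;
  }
  const middle = Math.floor(times.length / 2);
  return times.length % 2 === 0 ? (times[middle - 1] + times[middle]) / 2 : times[middle];
}

// Made-up raw times for one query: three successes and one failure.
const exampleTimes: number[] = [ 30, Number.NaN, 10, 20 ];
console.log(medianOfSuccessfulTimes(exampleTimes));
```

Keeping the `NaN` entries in the array preserves the position of each replication, so a consumer can still tell which run failed.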
78 changes: 72 additions & 6 deletions lib/ResultAggregator.ts
@@ -1,4 +1,4 @@
import type { IResult, IAggregateResult } from './Result';
import type { IResult, IAggregateResult, IRawAggregateResult } from './Result';

export class ResultAggregator implements IResultAggregator {
/**
Expand Down Expand Up @@ -33,6 +33,7 @@ export class ResultAggregator implements IResultAggregator {
time: 0,
timeMax: Number.NEGATIVE_INFINITY,
timeMin: Number.POSITIVE_INFINITY,
timeStd: 0,
failures: 0,
replication: resultGroup.length,
results: 0,
Expand All @@ -42,6 +43,7 @@ export class ResultAggregator implements IResultAggregator {
timestamps: [],
timestampsMax: [],
timestampsMin: [],
timestampsStd: [],
};
let inconsistentResults = false;
let successfulExecutions = 0;
Expand Down Expand Up @@ -92,8 +94,16 @@ export class ResultAggregator implements IResultAggregator {
aggregate.timestamps = timestampsProcessed.timestampsAverage;
aggregate.timestampsMin = timestampsProcessed.timestampsMin;
aggregate.timestampsMax = timestampsProcessed.timestampsMax;
aggregate.timestampsStd = timestampsProcessed.timestampsStd;
}

for (const { time, error } of resultGroup) {
if (!error) {
aggregate.timeStd += (time - aggregate.time) ** 2;
}
}
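// Population standard deviation; stays 0 when no execution succeeded (avoids 0 / 0 = NaN)
aggregate.timeStd = successfulExecutions > 0 ? Math.sqrt(aggregate.timeStd / successfulExecutions) : 0;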

// Convert all possible leftover infinity / -infinity back to 0 for backward compatibility
aggregate.resultsMin = Number.isFinite(aggregate.resultsMin) ? aggregate.resultsMin : 0;
aggregate.resultsMax = Number.isFinite(aggregate.resultsMax) ? aggregate.resultsMax : 0;
Expand All @@ -106,10 +116,11 @@ export class ResultAggregator implements IResultAggregator {
}

public averageTimeStamps(timestampsAll: number[][], maxNumTimestamps: number): IProcessedTimestamps {
const timestampsSum: number[] = <number[]> Array.from({ length: maxNumTimestamps }).fill(0);
const timestampsMax: number[] = <number[]> Array.from({ length: maxNumTimestamps }).fill(Number.NEGATIVE_INFINITY);
const timestampsMin: number[] = <number[]> Array.from({ length: maxNumTimestamps }).fill(Number.POSITIVE_INFINITY);
const nObsTimestamp: number[] = <number[]> Array.from({ length: maxNumTimestamps }).fill(0);
const timestampsSum: number[] = <number[]>Array.from({ length: maxNumTimestamps }).fill(0);
const timestampsMax: number[] = <number[]>Array.from({ length: maxNumTimestamps }).fill(Number.NEGATIVE_INFINITY);
const timestampsMin: number[] = <number[]>Array.from({ length: maxNumTimestamps }).fill(Number.POSITIVE_INFINITY);
const nObsTimestamp: number[] = <number[]>Array.from({ length: maxNumTimestamps }).fill(0);
const timestampsStd: number[] = <number[]>Array.from({ length: maxNumTimestamps }).fill(0);

for (const timestamps of timestampsAll) {
for (const [ j, ts ] of timestamps.entries()) {
Expand All @@ -119,13 +130,54 @@ export class ResultAggregator implements IResultAggregator {
nObsTimestamp[j]++;
}
}

const timestampsAverage = timestampsSum.map((ts, i) => ts / nObsTimestamp[i]);

for (const timestamps of timestampsAll) {
for (const [ j, ts ] of timestamps.entries()) {
timestampsStd[j] += (ts - timestampsAverage[j]) ** 2;
}
}

for (let i = 0; i < timestampsStd.length; i++) {
timestampsStd[i] = Math.sqrt(timestampsStd[i] / nObsTimestamp[i]);
}

return {
timestampsMax,
timestampsMin,
timestampsAverage: timestampsSum.map((ts, i) => ts / nObsTimestamp[i]),
timestampsAverage,
timestampsStd,
};
}

public aggregateRawGroupedResults(
groupedResults: Record<string, IResult[]>,
aggregateResults: IAggregateResult[],
): IRawAggregateResult[] {
const rawAggregateResults: IRawAggregateResult[] = [];
const aggregateResultsMap: Map<string, IAggregateResult> = new Map(
aggregateResults.map(result => [ `${result.name}:${result.id}`, result ]),
);

for (const [ id, resultsSet ] of Object.entries(groupedResults)) {
// An aggregate result always exists here because it was built from the same grouped results
const currentAggregateResults = aggregateResultsMap.get(id)!;
const currentRawAggregateResult: IRawAggregateResult = {
...currentAggregateResults,
timeAggregate: [],
};
if (currentAggregateResults.error) {
currentRawAggregateResult.error = currentAggregateResults.error;
}
for (const { time, error } of resultsSet) {
currentRawAggregateResult.timeAggregate.push(error ? Number.NaN : time);
}
rawAggregateResults.push(currentRawAggregateResult);
}
return rawAggregateResults;
}

/**
* Produce aggregated query results from a set of single execution results.
* @param results Individual query execution results.
Expand All @@ -136,14 +188,28 @@ export class ResultAggregator implements IResultAggregator {
const aggregateResults = this.aggregateGroupedResults(groupedResults);
return aggregateResults;
}

/**
* Produce raw aggregated query results from a set of single execution results.
* @param results Individual query execution results.
* @returns Raw aggregated results per individual query.
*/
public aggregateRawResults(results: IResult[]): IRawAggregateResult[] {
const groupedResults = this.groupResults(results);
const aggregateResults = this.aggregateGroupedResults(groupedResults);
const aggregateRawResults = this.aggregateRawGroupedResults(groupedResults, aggregateResults);
return aggregateRawResults;
}
}

export interface IResultAggregator {
aggregateResults: (results: IResult[]) => IAggregateResult[];
aggregateRawResults: (results: IResult[]) => IRawAggregateResult[];
}

export interface IProcessedTimestamps {
timestampsMax: number[];
timestampsMin: number[];
timestampsAverage: number[];
timestampsStd: number[];
}
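The `timeStd` and `timestampsStd` values in this file are computed in two passes (mean first, then squared deviations) and divide by the number of observations, which makes them population standard deviations rather than sample ones. The sketch below restates that computation in isolation. `populationStd` and `perIndexStd` are hypothetical names used for illustration and do not exist in the library.

```typescript
// Hypothetical standalone helpers mirroring the two-pass computation in the diff.

// Population standard deviation: divides by n, not n - 1.
function populationStd(values: number[]): number {
  if (values.length === 0) {
    return 0;
  }
  const mean = values.reduce((sum, value) => sum + value, 0) / values.length;
  const squaredDeviations = values.reduce((sum, value) => sum + (value - mean) ** 2, 0);
  return Math.sqrt(squaredDeviations / values.length);
}

// Standard deviation per result index across executions. Executions may
// return different numbers of results, so each index only uses the
// executions that actually produced a timestamp at that position.
function perIndexStd(timestampsAll: number[][]): number[] {
  const maxLength = timestampsAll.reduce((max, timestamps) => Math.max(max, timestamps.length), 0);
  const std: number[] = [];
  for (let j = 0; j < maxLength; j++) {
    const column = timestampsAll.filter(timestamps => j < timestamps.length).map(timestamps => timestamps[j]);
    std.push(populationStd(column));
  }
  return std;
}

console.log(populationStd([ 2, 4, 4, 4, 5, 5, 7, 9 ])); // 2
console.log(perIndexStd([[ 1, 10 ], [ 3 ]])); // [ 1, 0 ]
```

With only a handful of replications (the CLI default is 5), the population form slightly underestimates the spread compared with the sample form that divides by n - 1; which one is preferable depends on how the results are interpreted downstream.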
9 changes: 9 additions & 0 deletions lib/ResultAggregatorComunica.ts
Expand Up @@ -30,6 +30,15 @@ export class ResultAggregatorComunica extends ResultAggregator {
groupedAggregates[key][0].httpRequests = requestsSum / successfulExecutions;
groupedAggregates[key][0].httpRequestsMax = requestsMax;
groupedAggregates[key][0].httpRequestsMin = requestsMin;
groupedAggregates[key][0].httpRequestsStd = 0;

for (const { httpRequests, error } of resultGroup) {
if (!error) {
groupedAggregates[key][0].httpRequestsStd += (httpRequests - groupedAggregates[key][0].httpRequests) ** 2;
}
}
groupedAggregates[key][0].httpRequestsStd =
Math.sqrt(groupedAggregates[key][0].httpRequestsStd / successfulExecutions);
}
}
return aggregateResults;
Expand Down
10 changes: 9 additions & 1 deletion lib/SparqlBenchmarkRunner.ts
Expand Up @@ -21,6 +21,7 @@ export class SparqlBenchmarkRunner {
protected readonly resultAggregator: IResultAggregator;
protected readonly availabilityCheckTimeout: number;
public readonly endpointFetcher: SparqlEndpointFetcher;
public readonly outputRawResults: boolean;

public constructor(options: ISparqlBenchmarkRunnerArgs) {
this.logger = options.logger;
Expand All @@ -37,6 +38,7 @@ export class SparqlBenchmarkRunner {
additionalUrlParams: options.additionalUrlParams,
timeout: options.timeout,
});
this.outputRawResults = options.outputRawResults ?? false;
}

/**
Expand All @@ -61,7 +63,9 @@ export class SparqlBenchmarkRunner {
await options.onStop();
}

const aggregateResults = this.resultAggregator.aggregateResults(results);
const aggregateResults = this.outputRawResults ?
this.resultAggregator.aggregateRawResults(results) :
this.resultAggregator.aggregateResults(results);

return aggregateResults;
}
Expand Down Expand Up @@ -303,6 +307,10 @@ export interface ISparqlBenchmarkRunnerArgs {
* The delay between subsequent requests sent to the server.
*/
requestDelay?: number;
/**
* Output the raw results alongside the aggregate results.
*/
outputRawResults?: boolean;
}

export interface IRunOptions {
Expand Down
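Because `run()` returns either plain or raw aggregate results depending on `outputRawResults`, calling code may want to check which shape it received. The following is a rough sketch under stated assumptions: `AggregateLike` and `RawAggregateLike` are trimmed, hypothetical stand-ins for `IAggregateResult` and `IRawAggregateResult`, and `hasRawTimes` and `describeResult` are illustrative helpers that are not part of the library.

```typescript
// Simplified, hypothetical stand-ins for the interfaces in lib/Result.ts
// (most fields omitted for brevity).
interface AggregateLike {
  name: string;
  id: string;
  time: number;
  timeStd: number;
}

interface RawAggregateLike extends AggregateLike {
  timeAggregate: number[];
}

// Type guard: raw results are the ones that carry the timeAggregate array.
function hasRawTimes(result: AggregateLike): result is RawAggregateLike {
  return Array.isArray((result as RawAggregateLike).timeAggregate);
}

// Formats one result and appends the raw times only when they are present.
function describeResult(result: AggregateLike): string {
  const summary = `${result.name}:${result.id} mean=${result.time} std=${result.timeStd}`;
  return hasRawTimes(result) ? `${summary} raw=[${result.timeAggregate.join(', ')}]` : summary;
}

// Made-up values for illustration.
const plainExample: AggregateLike = { name: 'C1', id: '0', time: 15, timeStd: 5 };
const rawExample: RawAggregateLike = { ...plainExample, timeAggregate: [ 10, 20 ]};
console.log(describeResult(plainExample));
console.log(describeResult(rawExample));
```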