Stop nesting blank nodes in internal structure #2368

Merged
merged 5 commits into from
Apr 9, 2024
24 changes: 24 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,30 @@ This project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html

The following changes are pending, and will be applied on the next major release:

## Unreleased changes

### Patch changes

- `getThingAll(dataset, { acceptBlankNodes: true })` now returns all Blank Node
subjects in the Dataset, including those that are part of a single chain of
predicates. For instance, given the following dataset:

```
@prefix ex: <https://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
ex:camille
  foaf:knows [
    foaf:name "Dominique"@en ;
  ] .
```

`getThingAll(dataset, { acceptBlankNodes: true })` would previously have returned
a single element for the Named Node (`ex:camille`); it now also includes a second
element for the Blank Node. Blank Node identifiers are by definition unstable and shouldn't
be relied upon beyond local resolution.
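
As a rough illustration of the behaviour described in this entry, the sketch below models a dataset as a flat map of subjects in which Blank Nodes are top-level entries keyed by a local `_:` identifier. The `SketchSubject` and `SketchGraph` types, the `listSubjects` helper, and the `_:b0` identifier are hypothetical stand-ins; this is not the library's internal structure or its `getThingAll` implementation.

```typescript
// Hypothetical, simplified model; NOT the library's actual internal types.
type SketchSubject = {
  id: string; // an IRI, or a local "_:" identifier for a Blank Node
  predicates: Record<string, { literals?: string[]; blankNodes?: string[] }>;
};
type SketchGraph = Record<string, SketchSubject>;

// Assumption for this sketch: Blank Node identifiers start with "_:".
const isBlankNodeIdSketch = (id: string): boolean => id.startsWith("_:");

// Returns every subject, dropping Blank Node subjects unless they are requested.
function listSubjects(
  graph: SketchGraph,
  options: { acceptBlankNodes?: boolean } = {},
): SketchSubject[] {
  const all = Object.values(graph);
  return options.acceptBlankNodes
    ? all
    : all.filter((subject) => !isBlankNodeIdSketch(subject.id));
}

// The Turtle example from this entry, flattened: the Blank Node is its own subject,
// and ex:camille only refers to it by identifier.
const graph: SketchGraph = {
  "https://example.org/camille": {
    id: "https://example.org/camille",
    predicates: { "http://xmlns.com/foaf/0.1/knows": { blankNodes: ["_:b0"] } },
  },
  "_:b0": {
    id: "_:b0",
    predicates: { "http://xmlns.com/foaf/0.1/name": { literals: ["Dominique"] } },
  },
};

console.log(listSubjects(graph).length); // 1
console.log(listSubjects(graph, { acceptBlankNodes: true }).length); // 2
```

Because the `_:b0` label is only meaningful locally, callers of such a helper would follow the reference from the Named Node rather than store the identifier.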

## [2.0.1]

The following changes have been implemented but not released yet:
9 changes: 5 additions & 4 deletions src/formats/solidDatasetAsTurtle.test.ts
@@ -33,10 +33,11 @@ async function getDataset(ttl: string): Promise<SolidDataset> {
}

const ttl = `
prefix : <#>
prefix ex: <https://example.org/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix vcard: <http://www.w3.org/2006/vcard/ns#>
@prefix : <#> .
@prefix ex: <https://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@base <https://example.org/> .
Contributor Author

This change introduces the `@base` directive, because the URL resolution was incorrect.
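
To make the role of the base concrete, here is a minimal sketch of how relative references such as `<>` or `<#>` depend on it. It uses the WHATWG `URL` constructor as a stand-in for a Turtle parser's IRI resolution, which is an assumption made for illustration; the `#me` fragment is a hypothetical example and does not come from the test file.

```typescript
// Assumption: the WHATWG URL constructor stands in for a Turtle parser's IRI resolution.
const base = "https://example.org/";

// `<>` (the empty relative reference) resolves to the base document itself.
const documentIri = new URL("", base).href;

// A fragment-only reference resolves to a fragment on the base document.
const fragmentIri = new URL("#me", base).href;

// Without a base, a relative reference cannot be resolved at all.
let resolvedWithoutBase = true;
try {
  new URL("#me");
} catch {
  resolvedWithoutBase = false;
}

console.log(documentIri); // https://example.org/
console.log(fragmentIri); // https://example.org/#me
console.log(resolvedWithoutBase); // false
```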


<>
a foaf:PersonalProfileDocument ;
265 changes: 84 additions & 181 deletions src/rdf.test.ts
@@ -38,9 +38,11 @@ import {
serializeInteger,
xmlSchemaTypes,
} from "./datatypes";
import type { ImmutableDataset } from "./rdf.internal";
import { isBlankNodeId, type ImmutableDataset } from "./rdf.internal";
import { addRdfJsQuadToDataset } from "./rdfjs.internal";
import { fromRdfJsDataset, toRdfJsDataset } from "./rdfjs";
import { asUrl, getThing, getThingAll } from "./thing/thing";
import { getTermAll } from "./thing/get";

describe("fromRdfJsDataset", () => {
const fcNamedNode = fc
@@ -212,7 +214,7 @@ describe("fromRdfJsDataset", () => {
expect(fromRdfJsDataset(rdfJsDataset)).toStrictEqual({
type: "Dataset",
graphs: {
default: {
default: expect.objectContaining({
Contributor Author

We no longer do exact matches on the internal representation, because Blank Node identifiers are unstable.
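
A small sketch of why exact matching breaks: two parses of the same data may label the same Blank Nodes differently, so the comparison has to check shape rather than labels. This is what `expect.stringMatching(/_:/)` does in the test in this diff. The labels and the `firstParse` / `secondParse` objects below are hypothetical and only illustrate the idea.

```typescript
// Hypothetical labels: two parses of the same data may name Blank Nodes differently.
const firstParse = { blankNodes: ["_:b0", "_:b1"] };
const secondParse = { blankNodes: ["_:n3-12", "_:n3-13"] };

// Exact comparison fails although both describe the same structure.
const exactlyEqual =
  JSON.stringify(firstParse) === JSON.stringify(secondParse);

// Comparing shape instead: same number of entries, each one a Blank Node identifier.
const blankNodeIdPattern = /^_:/;
const sameShape =
  firstParse.blankNodes.length === secondParse.blankNodes.length &&
  [...firstParse.blankNodes, ...secondParse.blankNodes].every((id) =>
    blankNodeIdPattern.test(id),
  );

console.log(exactlyEqual); // false
console.log(sameShape); // true
```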

[subject1IriString]: {
url: subject1IriString,
type: "Subject",
@@ -231,41 +233,35 @@ describe("fromRdfJsDataset", () => {
},
},
},
},
[acrGraphIriString]: {
}),
[acrGraphIriString]: expect.objectContaining({
[subject2IriString]: {
url: subject2IriString,
type: "Subject",
predicates: {
[predicate1IriString]: {
blankNodes: [
{
[predicate1IriString]: {
literals: {
[xmlSchemaTypes.string]: [literalStringValue],
},
},
},
{
[predicate1IriString]: {
literals: {
[xmlSchemaTypes.string]: [literalStringValue],
[xmlSchemaTypes.integer]: [literalIntegerValue],
},
},
[predicate2IriString]: {
literals: {
[xmlSchemaTypes.integer]: [literalIntegerValue],
},
},
},
expect.stringMatching(/_:/),
expect.stringMatching(/_:/),
],
},
},
},
},
}),
},
});
const subjectsExcludingBlankNodes = getThingAll(
fromRdfJsDataset(rdfJsDataset),
{ scope: acrGraphIriString },
);
const subjectsIncludingBlankNodes = getThingAll(
fromRdfJsDataset(rdfJsDataset),
{ scope: acrGraphIriString, acceptBlankNodes: true },
);
// There should be two blank nodes in the resulting dataset.
expect(
subjectsIncludingBlankNodes.length - subjectsExcludingBlankNodes.length,
).toBe(2);
Comment on lines +253 to +264
Contributor Author

This replaces the previous exact match of the internal structure.

});

it("can represent lists", () => {
@@ -453,104 +449,6 @@ describe("fromRdfJsDataset", () => {
);
});

it("throws an error when passed unknown Predicate types with chain Blank Node Subjects", () => {
const mockDataset: ImmutableDataset = {
type: "Dataset",
graphs: { default: {} },
};
const chainBlankNode = DF.blankNode();
const otherQuad = DF.quad(
DF.namedNode("https://arbitrary.subject"),
DF.namedNode("https://arbitrary.predicate"),
chainBlankNode,
DF.defaultGraph(),
);
const mockQuad = DF.quad(
chainBlankNode,
{ termType: "Unknown term type" } as any,
DF.namedNode("https://arbitrary.object"),
DF.defaultGraph(),
);
expect(() =>
addRdfJsQuadToDataset(mockDataset, otherQuad, {
chainBlankNodes: [chainBlankNode],
otherQuads: [mockQuad],
}),
).toThrow(
"Cannot parse Quads with nodes of type [Unknown term type] as their Predicate node.",
);
});

it("throws an error when passed unknown Predicate types in connecting Quads for chain Blank Node Objects", () => {
const mockDataset: ImmutableDataset = {
type: "Dataset",
graphs: { default: {} },
};
const chainBlankNode1 = DF.blankNode();
const chainBlankNode2 = DF.blankNode();
const otherQuad = DF.quad(
DF.namedNode("https://arbitrary.subject"),
DF.namedNode("https://arbitrary.predicate"),
chainBlankNode1,
DF.defaultGraph(),
);
const inBetweenQuad = DF.quad(
chainBlankNode1,
{ termType: "Unknown term type" } as any,
chainBlankNode2,
DF.defaultGraph(),
);
const mockQuad = DF.quad(
chainBlankNode2,
DF.namedNode("https://arbitrary.predicate"),
DF.namedNode("https://arbitrary.object"),
DF.defaultGraph(),
);
expect(() =>
addRdfJsQuadToDataset(mockDataset, otherQuad, {
chainBlankNodes: [chainBlankNode1, chainBlankNode2],
otherQuads: [mockQuad, inBetweenQuad],
}),
).toThrow(
"Cannot parse Quads with nodes of type [Unknown term type] as their Predicate node.",
);
});

it("throws an error when passed unknown Predicate types in the terminating Quads for chain Blank Node Objects", () => {
const mockDataset: ImmutableDataset = {
type: "Dataset",
graphs: { default: {} },
};
const chainBlankNode1 = DF.blankNode();
const chainBlankNode2 = DF.blankNode();
const otherQuad = DF.quad(
DF.namedNode("https://arbitrary.subject"),
DF.namedNode("https://arbitrary.predicate"),
chainBlankNode1,
DF.defaultGraph(),
);
const inBetweenQuad = DF.quad(
chainBlankNode1,
DF.namedNode("https://arbitrary.predicate"),
chainBlankNode2,
DF.defaultGraph(),
);
const mockQuad = DF.quad(
chainBlankNode2,
{ termType: "Unknown term type" } as any,
DF.namedNode("https://arbitrary.object"),
DF.defaultGraph(),
);
expect(() =>
addRdfJsQuadToDataset(mockDataset, otherQuad, {
chainBlankNodes: [chainBlankNode1, chainBlankNode2],
otherQuads: [mockQuad, inBetweenQuad],
}),
).toThrow(
"Cannot parse Quads with nodes of type [Unknown term type] as their Predicate node.",
);
});

it("throws an error when passed unknown Object types", () => {
const mockDataset: ImmutableDataset = {
type: "Dataset",
@@ -586,33 +484,36 @@ describe("fromRdfJsDataset", () => {
DF.defaultGraph(),
);

const updatedDataset = addRdfJsQuadToDataset(mockDataset, otherQuad, {
chainBlankNodes: [chainBlankNode1],
otherQuads: [mockQuad],
});
const updatedDataset = [mockQuad, otherQuad].reduce(
addRdfJsQuadToDataset,
mockDataset,
);

expect(updatedDataset).toStrictEqual({
graphs: {
default: {
"https://some.subject": {
predicates: {
"https://some.predicate/1": {
blankNodes: [
{
"https://some.predicate/2": {
blankNodes: ["_:some-blank-node"],
},
},
],
},
},
type: "Subject",
url: "https://some.subject",
},
},
},
type: "Dataset",
// There should be one blank node subject.
expect(
getThingAll(updatedDataset, { acceptBlankNodes: false }),
).toHaveLength(1);
expect(
getThingAll(updatedDataset, { acceptBlankNodes: true }),
).toHaveLength(2);

// The blank nodes should be linked
const blankNodes = getThingAll(updatedDataset, {
acceptBlankNodes: true,
}).filter((thing) => isBlankNodeId(asUrl(thing)));
let bnAreLinked = false;
blankNodes.forEach((bn) => {
const candidateObjects = getTermAll(bn, "https://some.predicate/2");
bnAreLinked ||=
candidateObjects.length > 0 &&
candidateObjects.some((obj) => obj.termType === "BlankNode");
});

// The named node should be linked to a blank node
getTermAll(
getThing(updatedDataset, "https://some.subject")!,
"https://some.predicate/1",
).some((term) => term.termType === "BlankNode");
});

it("can parse chained Blank Nodes that end in a dangling Blank Node", () => {
@@ -640,40 +541,42 @@ describe("fromRdfJsDataset", () => {
DF.blankNode("some-blank-node"),
DF.defaultGraph(),
);
const updatedDataset = [mockQuad, inBetweenQuad, otherQuad].reduce(
addRdfJsQuadToDataset,
mockDataset,
);

const updatedDataset = addRdfJsQuadToDataset(mockDataset, otherQuad, {
chainBlankNodes: [chainBlankNode1, chainBlankNode2],
otherQuads: [mockQuad, inBetweenQuad],
});

expect(updatedDataset).toStrictEqual({
graphs: {
default: {
"https://some.subject": {
predicates: {
"https://some.predicate/1": {
blankNodes: [
{
"https://some.predicate/2": {
blankNodes: [
{
"https://some.predicate/3": {
blankNodes: ["_:some-blank-node"],
},
},
],
},
},
],
},
},
type: "Subject",
url: "https://some.subject",
},
},
},
type: "Dataset",
});
// There should be 2 blank node subjects
expect(
getThingAll(updatedDataset, { acceptBlankNodes: false }),
).toHaveLength(1);
expect(
getThingAll(updatedDataset, { acceptBlankNodes: true }),
).toHaveLength(3);

// The blank nodes subjects and the blank node object should be linked.
const blankNodes = getThingAll(updatedDataset, {
acceptBlankNodes: true,
}).filter((thing) => isBlankNodeId(asUrl(thing)));
// Count the number of links between blank nodes,
// based on known predicates.
const bnLinks = blankNodes.reduce(
(prev, cur) =>
prev +
[
...getTermAll(cur, "https://some.predicate/2"),
...getTermAll(cur, "https://some.predicate/3"),
].filter((obj) => obj.termType === "BlankNode").length,
0,
);
// There should be a chain of links between blank nodes.
expect(bnLinks).toBe(2);

// The named node should be linked to a blank node.
getTermAll(
getThing(updatedDataset, "https://some.subject")!,
"https://some.predicate/1",
).some((term) => term.termType === "BlankNode");
});
});
});