Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix set semantics bug #9

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Fix set semantics bug #9

wants to merge 8 commits into from

Conversation

maartyman
Copy link
Collaborator

@maartyman maartyman commented Jan 18, 2024

Fixes #6

@maartyman
Copy link
Collaborator Author

I opened this PR for conversation on what to do.

I'm starting to doubt that this is actually possible with the rdfjs Store interface. In some way, we need to await the import of each quad into the store to continue to match a new quad to the store. I made a new test that amplifies this problem.

@maartyman
Copy link
Collaborator Author

One solution could be to have a temporary "import store" where we can syncrounously add quads to. When checking if this quad already exists (with the match function) we check both the temporary "import store" and the actual store. If it doesn't exist yet we can add it to both stores. At the end of the stream we can delete the temporary "import store".

@maartyman
Copy link
Collaborator Author

nvm, this would break if there are multiple import streams. If, for example, two simultaneous imports are done with identical quads you would get duplicate results.

@rubensworks
Copy link
Member

nvm, this would break if there are multiple import streams. If, for example, two simultaneous imports are done with identical quads you would get duplicate results.

You could work around this, by halting concurrent imports until the other import is done.
But this would complicate things of course, so I'd only do this if we're certain there's no easier way :-)

@rubensworks
Copy link
Member

Actually, what you suggest would make the logic quite clean:

  1. Wait until other concurrent imports are done
  2. Determine unique triples synchronously
  3. Import unique triples to store and listeners.

@maartyman
Copy link
Collaborator Author

This does mean that a non ending stream could block imports into the store.

@rubensworks
Copy link
Member

This does mean that a non ending stream could block imports into the store.

True, that would be problematic indeed...
So a more fine-grained quad-per-quad approach may be needed.

@maartyman
Copy link
Collaborator Author

I was implementing this, but I realize now that there is no way of knowing when we should free the synchronous store.

  1. Add quad in 'synchronous store'
  2. Import unique quads to 'actual store' and listeners
  3. After import is done delete 'synchronous store'

When do we delete the 'synchronous store'? We don't know when the import to the 'actual store' has finished. The only way I know is continuously match the quads to the 'actual store' until the match function returns an element, but this is really inefficient.

@rubensworks
Copy link
Member

This may be getting too complicated for what it's worth.

Maybe we should concede and just assume a synchronous quad-containment checker in the rdf store.
This would come at the cost of interoperability, but perhaps this is fine given the complexity.

As long as it's works with n3 and rdf-stores, it's fine by me :-)

@maartyman
Copy link
Collaborator Author

So I switch from a rdfjs interface to a n3 Store interface?

@maartyman
Copy link
Collaborator Author

Or I switch to a rdfjs dataset interface?

@maartyman
Copy link
Collaborator Author

@rubensworks
Copy link
Member

@maartyman This approach won't work with rdf-stores unfortunately, as that lib doesn't implement DatasetCore directly on the store, but through a separate asDataset() method.

What we could do instead, is make use of the sync countQuads method that both N3 and rdf-stores implement. We could just create a custom local interface with a copy of this countQuads method signature, so that we can pass both N3 stores and rdf-stores.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PendingStream and internal store get out of sync due to set-semantics
2 participants