Error in attempting to load RDFa #34

Open
josephguillaume opened this issue Aug 24, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@josephguillaume

This is a rather niche error, but I thought I'd document it anyway.
I saved an index.html file in my movies folder and Media Kraken failed to load, with the error below.

Stack trace:

Error: Found illegal @id 'null'
    at o.handle (https://noeldemartin.github.io/media-kraken/js/0.0c3c3aec.worker.js?__WB_REVISION__=cc5697750ce9b8580d94d3a24979b00e:23:36647)
    at _.newOnValueJob (https://noeldemartin.github.io/media-kraken/js/0.0c3c3aec.worker.js?__WB_REVISION__=cc5697750ce9b8580d94d3a24979b00e:1:9452)
    at async _.executeBufferedJobs (https://noeldemartin.github.io/media-kraken/js/0.0c3c3aec.worker.js?__WB_REVISION__=cc5697750ce9b8580d94d3a24979b00e:1:13363)

I tracked the error down to this minimal example:

<!doctype html>                                             
<html>                                                       
<body>                                                             
<span 
  typeof="https://schema.org/Movie" 
  property="https://schema.org/name">
Movie
</span>                        
 </body>                                                    
</html>

It turns out I had defined a blank node of type schema:Movie, and Media Kraken was not able to cope with it.

I think it makes sense for Media Kraken to not support blank node movies.

However, I don't think it's the intended behaviour that Media Kraken tried to parse RDFa in index.html in the first place?
While it is correct that an HTML file in the movies folder could contain valid data, I don't think Media Kraken is set up to write to it.

I suspect that the intended behaviour here would be for Media Kraken to ignore invalid data, perhaps with a warning.

@NoelDeMartin
Owner

Hey, thanks for reporting this and getting to a minimal reproduction.

I am aware that blank nodes are not supported; that's a known limitation tracked here: NoelDeMartin/soukai-solid#19

However, it's very weird that it's parsing the html because I haven't done any of that :/. The fetch request includes an Accept: text/turtle header, so maybe it's the Pod parsing the html and returning turtle? Which Pod server are you using?

In any case, I agree that this type of error should probably be handled explicitly, so I'll leave this issue open at least until I handle that. I'll leave it as an enhancement, though.

@NoelDeMartin added the enhancement (New feature or request) label on Aug 25, 2024
@josephguillaume
Author

That's right. I'm using Community Solid Server. I've now checked and can confirm that the minimal example is returned as:

_:df_10585_0 a <https://schema.org/Movie>.
<https://server/movies/test.html> <https://schema.org/name> _:df_10585_0.

This is possibly an argument for avoiding a data model that overloads ldp:contains - the assumption is made that all the resources referenced by ldp:contains should be parsed as movies, and this is not necessarily the case.
It can be enforced with a shape tree, but at the expense of imposing a closed world assumption.
An alternative would be to use an additional predicate to ldp:contains, e.g. schema:itemListElement, as you have done elsewhere.
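To make the difference between the two models concrete, here is a minimal sketch in TypeScript. It is not Media Kraken's code: the `Statement` type, the `objectsOf` helper, the `#list` resource and the `some-movie` IRIs are all hypothetical, and only `https://server/movies/test.html` comes from the example above.

```typescript
// Hypothetical in-memory model for illustration only; not Media Kraken's API.
type Statement = { s: string; p: string; o: string };

const LDP_CONTAINS = 'http://www.w3.org/ns/ldp#contains';
const ITEM_LIST_ELEMENT = 'https://schema.org/itemListElement';

// Collect the objects of every statement matching a subject and predicate.
function objectsOf(statements: Statement[], subject: string, predicate: string): string[] {
  return statements
    .filter((st) => st.s === subject && st.p === predicate)
    .map((st) => st.o);
}

const container = 'https://server/movies/';
const list = 'https://server/movies/#list'; // assumed list resource

const statements: Statement[] = [
  // Containment: every document in the folder shows up here.
  { s: container, p: LDP_CONTAINS, o: 'https://server/movies/some-movie' },
  { s: container, p: LDP_CONTAINS, o: 'https://server/movies/test.html' },
  // Explicit listing: only resources deliberately added as movies.
  { s: list, p: ITEM_LIST_ELEMENT, o: 'https://server/movies/some-movie#it' },
];

// Treating containment as "is a movie" picks up test.html as well.
const containedDocuments = objectsOf(statements, container, LDP_CONTAINS);

// An explicit predicate enumerates only what was listed on purpose.
const listedMovies = objectsOf(statements, list, ITEM_LIST_ELEMENT);
```

With containment alone, `test.html` is indistinguishable from a movie document until it is fetched and parsed; with the explicit predicate it never appears in the enumeration.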

@NoelDeMartin
Owner

I see. The thing with ldp:contains is that that's how it works with the type index. I guess I could register an instance of a "movies list" or something instead, but the problem with that is that I have to make up a new class (as I mentioned for Umai). Also, most people using the type index are using it in this way (registering containers, not instances of lists). So changing that could harm interoperability.

All in all, seeing how things stand right now, I think the best solution is to handle these malformed document errors. What I'm unsure about is whether to bother users with a warning or something, or to silently ignore the problem. I think I'll end up doing the latter, but showing a warning in the console for developers trying to debug what's going on.
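As a rough illustration of that approach (not the actual fix), the sketch below skips blank node movies with a console warning instead of throwing. The `Term` and `Triple` types and the `findNamedMovies` function are hypothetical and are not part of Media Kraken or soukai-solid. The sample data mirrors the two triples quoted earlier in this thread.

```typescript
// Hypothetical types and names for illustration; not Media Kraken's or soukai-solid's API.
type Term =
  | { kind: 'iri'; value: string }
  | { kind: 'blank'; value: string }
  | { kind: 'literal'; value: string };

interface Triple {
  subject: Term;
  predicate: string;
  object: Term;
}

const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
const SCHEMA_MOVIE = 'https://schema.org/Movie';

// Returns the IRIs of named movie resources found in one document.
// Blank node movies are skipped with a console warning instead of an error.
function findNamedMovies(documentUrl: string, triples: Triple[]): string[] {
  const movies: string[] = [];
  for (const t of triples) {
    if (t.predicate !== RDF_TYPE) continue;
    if (t.object.kind !== 'iri' || t.object.value !== SCHEMA_MOVIE) continue;
    if (t.subject.kind !== 'iri') {
      console.warn(`Ignoring blank node movie in ${documentUrl}`);
      continue;
    }
    movies.push(t.subject.value);
  }
  return movies;
}

// The triples reported for the minimal RDFa example in this thread.
const testHtmlTriples: Triple[] = [
  {
    subject: { kind: 'blank', value: 'df_10585_0' },
    predicate: RDF_TYPE,
    object: { kind: 'iri', value: SCHEMA_MOVIE },
  },
  {
    subject: { kind: 'iri', value: 'https://server/movies/test.html' },
    predicate: 'https://schema.org/name',
    object: { kind: 'blank', value: 'df_10585_0' },
  },
];

const namedMovies = findNamedMovies('https://server/movies/test.html', testHtmlTriples);
```

For this input `namedMovies` is empty and a single warning is logged, so the rest of the collection could still load.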

@josephguillaume
Author

The type index spec is a little unclear about what solid:instanceContainer actually contains, but I agree that your interpretation seems to be the one in use, and I have added an issue to document that view (in the process providing my opinion on a query Angelo had raised about this).

I also quoted the position that "Linked data is a set of documents", which came up in a disagreement as to whether the document (in that case a type index) should use predicates to link to its contents (in that case a type registration), or whether it is sufficient for the contents to be in the document.

Personally I have used both patterns.
I use contents in a document if I want a store of many instances, i.e. the document is just a vessel.
I link from a list to instances if I want to be able to enumerate them - even without dereferencing the URIs of the instances.

It seems both approaches can easily coexist, so I don't have an issue with Media Kraken sticking to this approach.
