Make serialize() on a CONSTRUCT result act like normal g.serialize() #1612

nicholascar · 2021-06-11T00:39:14Z

nicholascar
Jun 11, 2021
Maintainer

Currently (RDFLib 6.0.0a0, i.e. master), g.serialize() defaults to Turtle format and returns a decoded string. However, calling serialize() on a SPARQL CONSTRUCT result (i.e. on a Result object) still returns the RDFLib 5.0.0 form: an RDF/XML, UTF-8-encoded result.

The old-style serialize() in Result should be updated to match the main Graph.serialize().

aucampia · 2021-09-17T21:10:12Z

aucampia
Sep 17, 2021
Maintainer

One slight conundrum here is that the default format was "xml", and this was valid for both tabular and graph results. I'm working on this now, and I think the best solution is to default format to None instead; then, if format is None and the result is a graph, use turtle, otherwise use txt.

Feedback on the idea will be welcome, but hopefully I will have a patch soon.
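
A minimal sketch of the proposed default, assuming the result kind is available as a string such as "CONSTRUCT" or "SELECT". The helper name resolve_format and the GRAPH_RESULT_TYPES set are hypothetical and not part of RDFLib:

```python
from typing import Optional

# Assumption: these result kinds carry a graph rather than tabular rows.
GRAPH_RESULT_TYPES = {"CONSTRUCT", "DESCRIBE"}


def resolve_format(result_type: str, format: Optional[str] = None) -> str:
    """Pick a serialization format, honouring an explicit choice first."""
    if format is not None:
        # The caller asked for something specific; do not second-guess it.
        return format
    # No format given: graphs default to turtle, everything else to txt.
    return "turtle" if result_type in GRAPH_RESULT_TYPES else "txt"
```

An explicit format="xml" would still behave as before; only the implicit default changes.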


nicholascar · 2021-09-17T22:13:28Z

nicholascar
Sep 17, 2021
Maintainer Author

the best solution is to default format to None instead, and then if None, and result is graph, use turtle, otherwise use txt.

Sounds about right!

And I hope to introduce optional output to pandas DataFrames across all serialisation options soon, as people often want to export results to pandas. However, such an option will require pandas to be imported, so it has to be one of those optional, additional dependencies that I don’t quite know how to set up but that I’ve seen elsewhere: pip install rdflib[pandas] or similar.


aucampia · 2021-09-21T22:13:12Z

aucampia
Sep 21, 2021
Maintainer

I have been working on this, but I'm at somewhat of an impasse here, because the serializers are actually quite inconsistent.

Currently the RDF serializers (jsonld, n3, nquads, nt, rdfxml) treat stream destinations as IO[str], while the Result serializers seem to work for both IO[str] and IO[bytes], except for txtresults, which only works for IO[str]. I think the best solution is to make ResultSerializers work for both IO[str] and IO[bytes]. I will use type hints to prohibit the use of encoding when IO[str] is supplied, and mandate encoding when IO[bytes] is supplied.
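
The type-hint idea could look roughly like the following, using typing.overload so a type checker rejects the wrong stream/encoding pairing. The function write_rows and its CSV-like output are hypothetical stand-ins, not the real ResultSerializer interface:

```python
import io
from typing import IO, Iterable, Optional, Sequence, Union, overload

Row = Sequence[str]


@overload
def write_rows(rows: Iterable[Row], stream: IO[str], encoding: None = None) -> None: ...
@overload
def write_rows(rows: Iterable[Row], stream: IO[bytes], encoding: str) -> None: ...


def write_rows(
    rows: Iterable[Row],
    stream: Union[IO[str], IO[bytes]],
    encoding: Optional[str] = None,
) -> None:
    """Write rows to a text stream (no encoding) or a binary stream (encoding required)."""
    text = "".join(",".join(row) + "\n" for row in rows)
    if isinstance(stream, io.TextIOBase):
        # Text streams already own their encoding, so passing one is an error.
        if encoding is not None:
            raise TypeError("encoding must not be given for a text stream")
        stream.write(text)
    else:
        # Binary streams need to be told how to encode the text.
        if encoding is None:
            raise TypeError("encoding is required for a binary stream")
        stream.write(text.encode(encoding))
```

The overloads only constrain static checking; the runtime checks are there for callers who do not run a type checker.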

Any feedback on the matter will be appreciated.


nicholascar · 2021-09-22T00:15:04Z

nicholascar
Sep 22, 2021
Maintainer Author

I think we are seeing the result of a poor transition from Py2 to Py3 here. Presumably I and others missed serialization options for things other than the RDF serializers, i.e. ResultSerializers, and there may be others.

If any consistency can be achieved here, that would be great. I guess we'll then just have to look for other serializers too, perhaps the extra CSV ones, TriG, TriX etc.


aucampia · 2021-09-26T20:19:16Z

aucampia
Sep 26, 2021
Maintainer

I have done some digging to make sense of the IO types in Python, and thought a bit about how to actually deal with this situation.

Some related inquiries and issues, mostly related to typing:

I think if there were a clean slate, the best option would have been to only accept BinaryIO-like buffers (i.e. io.RawIOBase or io.BufferedIOBase), with optional encoding, and if no encoding is supplied, and if the serializer supports multiple encodings, default to the system preferred encoding (similar to what TextIOWrapper does).

However, given that some ResultSerializers work with BinaryIO, and some with TextIO, this will break compatibility, so probably the best compromise to maintain interface is for ResultSerializer.serialize() to accept both BinaryIO with optional encoding, and TextIO without any encoding. And then when ResultSerializer.serialize() defers to Graph.serialize(), just use TextIO.buffer and TextIO.encoding and pass that to Graph.serialize().
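
A rough sketch of that compromise, assuming the underlying serializer is represented by a callback that writes bytes. The names write_via_binary and write_bytes are hypothetical, and the fallback for text streams without a .buffer (such as io.StringIO) is an addition not discussed above:

```python
import io
from typing import IO, Callable, Optional, Union


def write_via_binary(
    stream: Union[IO[str], IO[bytes]],
    write_bytes: Callable[[IO[bytes], str], None],
    encoding: Optional[str] = None,
) -> None:
    """Route a text or binary destination to a bytes-only writer."""
    if isinstance(stream, io.TextIOBase):
        if encoding is not None:
            raise ValueError("encoding must not be given for a text stream")
        buffer = getattr(stream, "buffer", None)
        if buffer is not None:
            # Flush pending text so the bytes land after earlier writes.
            stream.flush()
            write_bytes(buffer, stream.encoding or "utf-8")
        else:
            # No underlying binary buffer: encode to memory, then decode.
            tmp = io.BytesIO()
            write_bytes(tmp, "utf-8")
            stream.write(tmp.getvalue().decode("utf-8"))
    else:
        # Binary destination: encoding is optional and defaults to utf-8.
        write_bytes(stream, encoding or "utf-8")
```

Here write_bytes stands in for whatever eventually calls Graph.serialize() with a binary destination and an encoding.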

Further, I will also try to ensure the default encoding is utf-8 across all serializers.

For formats that only allow one encoding (turtle), I think we should reconsider the behaviour when an unsupported encoding is requested. Either we should always support arbitrary encodings, or always raise an exception if an unsupported encoding is requested. The current approach is to warn if an unsupported encoding is requested, and this may result in unexpected behaviour. I will however defer changing this for now, as it can be dealt with in another PR.
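
To make the two options concrete, here is a small sketch contrasting warn-and-fall-back with raising. The function resolve_encoding and its strict flag are hypothetical, and the warning text is illustrative rather than what any serializer actually emits:

```python
import codecs
import warnings
from typing import Optional, Sequence


def resolve_encoding(
    requested: Optional[str],
    supported: Sequence[str] = ("utf-8",),
    strict: bool = False,
) -> str:
    """Return the encoding to use, warning or raising on unsupported requests."""
    # codecs.lookup() normalizes aliases such as "UTF8" to "utf-8".
    default = codecs.lookup(supported[0]).name
    if requested is None:
        return default
    name = codecs.lookup(requested).name
    if name in {codecs.lookup(s).name for s in supported}:
        return name
    message = f"encoding {requested!r} is not supported; supported: {list(supported)}"
    if strict:
        # Option discussed above: always raise on an unsupported encoding.
        raise ValueError(message)
    # Warn-and-fall-back behaviour: the output silently differs from the request.
    warnings.warn(message, stacklevel=2)
    return default
```

With strict=False a caller who asked for latin-1 gets utf-8 output plus a warning, which is the kind of surprise the strict variant would avoid.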


aucampia · 2022-04-16T12:12:11Z

aucampia
Apr 16, 2022
Maintainer

Converted to an issue as this is something I am and was working on.

Make serialize() on a CONSTRUCT result act like normal g.serialize() #1834

All further discussion should be there.
