Make serialize() on a CONSTRUCT result act like normal g.serialize() #1612
Replies: 6 comments
-
One slight conundrum here is that the default format was "xml", and this was valid for both tabular and graph results. I'm working on this now and I think the best solution is to default format to None instead, and then if None, and result is graph, use turtle, otherwise use txt. Feedback on the idea will be welcome, but hopefully I will have a patch soon. |
Beta Was this translation helpful? Give feedback.
-
Sounds about right! And I hope to introduce optional outputs to PANDAS Dataframes across all serialisation options soon as people often want to export results to PANDAS. However such an option will require PANDAS to be imported so has to be one of those optional, additional imports that I don’t quite know how to do but that I’ve seen elsewhere: |
Beta Was this translation helpful? Give feedback.
-
I have been working on this, but I'm at somewhat of an impasse here, because the serializers are actually quite inconsistent. Currently the rdf serializers treat stream destinations as IO[str] (jsonld, n3, nquads, nt, rdfxml) while result Result serializers seems like they will work for both IO[str] and IO[bytes], except for txtresults, which will only work for IO[str]. I think the best solution is to make ResultSerializers work for both IO[str] and IO[bytes]. I will use type hints to prohibit the use of encoding when IO[str] is supplied, and mandate encoding when IO[bytes] is supplied. Any feedback on the matter will be appreciated. |
Beta Was this translation helpful? Give feedback.
-
I think we are seeing the result of a poor transition from Py2 to Py3 here. Presumably I and others missed serialization options for things other than the RDF serializers, i.e. ResultSerializers, and there may be others. If any consistency can be achieved here, that would be great. I guess we'll then just have to look for other serializers too, perhaps the extra CVS ones, Trig, TriX etc. |
Beta Was this translation helpful? Give feedback.
-
I have done some digging to make sense of the IO types in python, and thought a bit how to actually deal with this situation. Some related inquiries and issues, mostly related to typing:
I think if there was a clean slate, the best option would have been to only accept BinaryIO like buffers (i.e. io.RawIOBase or io.BufferedIOBase), with optional encoding, and if no encoding is supplied, and if the serializer supports multiple encodings, default to system preferred encoding (similar to what TextIOWrapper does). However, given that some ResultSerializers work with BinaryIO, and some with TextIO, this will break compatibility, so probably the best compromise to maintain interface is for Further I will also try and ensure the default encoding is For formats that only allow one encoding (turtle) I think we should reconsider the behaviour if unsupported encodings are requested. Either we should always support arbitrary encodings, or always raise an exception if an unsupported encoding is supported. The current approach is to warn if an unsupported encoding is requested, and this may result in unexpected behaviour. I will however defer changing this for now, as it can be dealt with in another PR. |
Beta Was this translation helpful? Give feedback.
-
Converted to an issue as this is something I am and was working on. All further discussion should be there. |
Beta Was this translation helpful? Give feedback.
-
Currently (RDFlib 6.0.0a0 AKA
master
),g.serialize()
defaults to turtle format and is decoded, however callingserialize()
on a SPARQLCONSTRUCT
result (i.e. on aResult
object) returns the RDFlib 5.0.0 form of an RDF/XML, UTF-8-encoded result.The old style
serialize()
inResult
should be updated to match the mainGraph()
serialize()
.Beta Was this translation helpful? Give feedback.
All reactions