Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the MessagePackParser#type field public #800

Closed
wants to merge 1 commit into from

Conversation

ArtDu
Copy link

@ArtDu ArtDu commented Feb 14, 2024

It's useful when you need to distinguishe bytes array from extension type

Closes #799

It's useful when you need to distinguishe bytes array from extension type
@komamitsu
Copy link
Member

@ArtDu Can you elaborate your use case?

@ArtDu
Copy link
Author

ArtDu commented Feb 24, 2024

@komamitsu

@ArtDu Can you elaborate your use case?

We use extensions values and byte arrays as well to store data in Tarantool(https://en.wikipedia.org/wiki/Tarantool).
And we can distinguishe those two types in jackson embedded object only by class type and it's no elegant way

        private boolean isExtensionValue(Object object) {
            return object instanceof MessagePackExtensionType;
        }


        @Override
        public Object deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
            if (p.currentTokenId() != JsonTokenId.ID_EMBEDDED_OBJECT || !isExtensionValue(p.getEmbeddedObject())) {
                return super.deserialize(p, ctxt);
            }
            MessagePackExtensionType ext = p.readValueAs(MessagePackExtensionType.class);
            byte type = ext.getType();
            ...
        }

Instead of it we can use type variable that is stored in MessagePackParser

 @Override
        public Object deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
            MessagePackParser mpParser = (MessagePackParser)p;
            if (mpParser.getType().equals(Type.EXT)) {
                return super.deserialize(p, ctxt);
            }
            ...
        }

@komamitsu
Copy link
Member

@ArtDu Thanks for sharing the details.

jackson-dataformat-msgpack has a function to deserialize extention type values by registering the target ext-type ID and callback lambda using org.msgpack.jackson.dataformat.ExtensionTypeCustomDeserializers#addCustomDeser. Does it work for your use case?

BTW, as you use jackson-databind interface, I guess you serialize Java objects to MessagePack format data and/or deserialize MessagePack format data to Java object vise verse. So, I think you know all the type of values when deserializing MessagePack format data since you know how the data is serialized. For instance, let's say you serialize an instance of this class

public static class Foo {
  String name;
  byte[] data;
}

and then, you know the type of data is stored as a byte array and don't need to consider extension types in this case. What do you think?

@ArtDu
Copy link
Author

ArtDu commented Mar 15, 2024

@komamitsu

jackson-dataformat-msgpack has a function to deserialize extention type values by registering the target ext-type ID and callback lambda using org.msgpack.jackson.dataformat.ExtensionTypeCustomDeserializers#addCustomDeser. Does it work for your use case?

In some cases it works for me, but I also need a serializer. That's why I use ObjectMapper(...).registerModule() method.

Also addCustomDeser way can't set targetType.
Sometimes I need to use a mapper in this way mapper.readValue(bytes, SpecialClass.class).
For example, we can interpret ExtensionValue (Datetime) as ZonedDateTime, as Instant, or as LocalTime. In this way, we determine how to get the needed type.

BTW, as you use jackson-databind interface, I guess you serialize Java objects to MessagePack format data and/or deserialize MessagePack format data to Java object vise verse. So, I think you know all the type of values when deserializing MessagePack format data since you know how the data is serialized. For instance, let's say you serialize an instance of this class

public static class Foo {
  String name;
  byte[] data;
}

and then, you know the type of data is stored as a byte array and don't need to consider extension types in this case. What do you think?

We have two types of interaction:

  1. When we expect an explicit class that we'll read from the database (msgpack bytes). It's like I described above
  2. When we don't know what type we'll come and we need to read bytes by its msgpack own type. The type could be an extension or raw bytes (And at this point I want to determine by the Type)

@komamitsu
Copy link
Member

@ArtDu I think understand your use case. Honestly speaking, the usage seems a bit too dynamic in terms of type-safety and to not mach jackson-databind interface. I don't think it's a common use case, and I'm hesitating to approve the change.

As for org.msgpack.jackson.dataformat.MessagePackParser#type, it's not supposed to be externally exposed and I might change it's behavior in the future. If it's exposed by this PR, it would be a possible maintenance concern.

Regarding the sample code you shared, I think you know the value is an embedded object at least in deserialize() and you can simplify it w/o checking currentTokenId like this.

        @Override
        public Object deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
            // At this point, the current value must be an embedded object, right?
            Object embedded = p.getEmbeddedObject();
            if (!(embedded instanceof MessagePackExtensionType)) {
                // This should be `byte[]`.
                handleByteArrayAsXxxxx(embedded);
                return;
            }
            MessagePackExtensionType ext = (MessagePackExtensionType) embedded;
            byte type = ext.getType();
            ...
        }

@ArtDu
Copy link
Author

ArtDu commented Sep 17, 2024

This w/a works fine from experience. Since that I close this PR and ticket

We use extensions values and byte arrays as well to store data in Tarantool(https://en.wikipedia.org/wiki/Tarantool).
And we can distinguishe those two types in jackson embedded object only by class type and it's no elegant way

        private boolean isExtensionValue(Object object) {
            return object instanceof MessagePackExtensionType;
        }


        @Override
        public Object deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {
            if (p.currentTokenId() != JsonTokenId.ID_EMBEDDED_OBJECT || !isExtensionValue(p.getEmbeddedObject())) {
                return super.deserialize(p, ctxt);
            }
            MessagePackExtensionType ext = p.readValueAs(MessagePackExtensionType.class);
            byte type = ext.getType();
            ...
        }

@ArtDu ArtDu closed this Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make the MessagePackParser#type field public
2 participants