-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #776 by reifying avro fixed types #1
base: 4.1.x
Are you sure you want to change the base?
Conversation
@scottcarey , could you please review this here before I submit it back to Confluent? |
52de93a
to
3f6cd6c
Compare
(ByteBuffer) value; | ||
if (schema != null && schema.parameters() != null | ||
&& schema.parameters().containsKey(CONNECT_AVRO_FIXED_SIZE)) { | ||
int size = Integer.parseInt(schema.parameters().get(CONNECT_AVRO_FIXED_SIZE)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be name, not size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make it name & size, for safety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, an avro fixed is name+size. The size is implicit in the name if you have access to the avro schema, but if you don't, you'll need to include the size.
if (avroSchema.getType() == org.apache.avro.Schema.Type.UNION) { | ||
for (org.apache.avro.Schema memberSchema : avroSchema.getTypes()) { | ||
if (memberSchema.getType() == org.apache.avro.Schema.Type.FIXED | ||
&& memberSchema.getFixedSize() == size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, actually, is the name vs size vs both test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, there is the possible case of two different fixed schemas of the same size but different name in the same union.
&& fieldSchema.parameters() != null | ||
&& fieldSchema.parameters().containsKey(CONNECT_AVRO_FIXED_SIZE)) { | ||
if (fixedValueSizeOf(value) | ||
== Integer.parseInt(fieldSchema.parameters().get(CONNECT_AVRO_FIXED_SIZE))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to change too, but it is trickier. AvroData accepts data in byte[]
and ByteBuffer
formats, in addition to GenericFixed
. The latter can be matched by name, but the former two must be matched by size.
/** | ||
* Get size of bytes value tagged as fixed | ||
*/ | ||
private static int fixedValueSizeOf(Object value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe have this return a boolean, so it can apply a different test to GenericFixed
?
@Test | ||
public void testFromConnectFixedUnion() { | ||
org.apache.avro.Schema unionSchema = org.apache.avro.SchemaBuilder.builder().unionOf() | ||
.type(avroFixed("sample", 4)).and() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a third branch of the union, another fixed with a different name but same size as one of the above. It should be able to match the right one even if the size is the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@Test | ||
public void testToConnectFixedUnion() { | ||
org.apache.avro.Schema sampleSchema = avroFixed("sample", 4); | ||
org.apache.avro.Schema otherSchema = avroFixed("other", 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should any of these other tests have 3 branches in the union, like the one above? Perhaps it is safest to do that for all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
Because fixed can appear multiple times in an enum, being a named type, enough information must be preserved to correctly identify the union index, which is mostly easily managed by simply reifying fixed types with their name and size.
Because fixed can appear multiple times in an enum, being a named type,
enough information must be preserved to correctly identify the union
index, which is mostly easily managed by simply reifying fixed types
with their name and size.