Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #776 by reifying avro fixed types #1

Open
wants to merge 1 commit into
base: 4.1.x
Choose a base branch
from
Open

Fix #776 by reifying avro fixed types #1

wants to merge 1 commit into from

Conversation

dcsobral
Copy link

Because fixed can appear multiple times in an enum, being a named type,
enough information must be preserved to correctly identify the union
index, which is mostly easily managed by simply reifying fixed types
with their name and size.

@dcsobral dcsobral requested a review from scottcarey April 28, 2018 05:22
@dcsobral
Copy link
Author

@scottcarey , could you please review this here before I submit it back to Confluent?

@dcsobral dcsobral force-pushed the bug/776 branch 2 times, most recently from 52de93a to 3f6cd6c Compare April 28, 2018 18:43
(ByteBuffer) value;
if (schema != null && schema.parameters() != null
&& schema.parameters().containsKey(CONNECT_AVRO_FIXED_SIZE)) {
int size = Integer.parseInt(schema.parameters().get(CONNECT_AVRO_FIXED_SIZE));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be name, not size.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it name & size, for safety?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, an avro fixed is name+size. The size is implicit in the name if you have access to the avro schema, but if you don't, you'll need to include the size.

if (avroSchema.getType() == org.apache.avro.Schema.Type.UNION) {
for (org.apache.avro.Schema memberSchema : avroSchema.getTypes()) {
if (memberSchema.getType() == org.apache.avro.Schema.Type.FIXED
&& memberSchema.getFixedSize() == size) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, actually, is the name vs size vs both test.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, there is the possible case of two different fixed schemas of the same size but different name in the same union.

&& fieldSchema.parameters() != null
&& fieldSchema.parameters().containsKey(CONNECT_AVRO_FIXED_SIZE)) {
if (fixedValueSizeOf(value)
== Integer.parseInt(fieldSchema.parameters().get(CONNECT_AVRO_FIXED_SIZE))) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to change too, but it is trickier. AvroData accepts data in byte[] and ByteBuffer formats, in addition to GenericFixed. The latter can be matched by name, but the former two must be matched by size.

/**
* Get size of bytes value tagged as fixed
*/
private static int fixedValueSizeOf(Object value) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe have this return a boolean, so it can apply a different test to GenericFixed?

@Test
public void testFromConnectFixedUnion() {
org.apache.avro.Schema unionSchema = org.apache.avro.SchemaBuilder.builder().unionOf()
.type(avroFixed("sample", 4)).and()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a third branch of the union, another fixed with a different name but same size as one of the above. It should be able to match the right one even if the size is the same.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@Test
public void testToConnectFixedUnion() {
org.apache.avro.Schema sampleSchema = avroFixed("sample", 4);
org.apache.avro.Schema otherSchema = avroFixed("other", 6);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should any of these other tests have 3 branches in the union, like the one above? Perhaps it is safest to do that for all of them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Because fixed can appear multiple times in an enum, being a named type,
enough information must be preserved to correctly identify the union
index, which is mostly easily managed by simply reifying fixed types
with their name and size.
@dcsobral dcsobral changed the base branch from master to 4.1.x May 1, 2018 04:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants