Fix bugs and add features for hive integration #1565

Merged
sighingnow merged 14 commits into v6d-io:main from the hive-integration branch on Nov 14, 2023


vegetableysm
Collaborator

@vegetableysm vegetableysm commented Sep 15, 2023

What do these changes do?

Fix bugs and add features based on test results

  • Support varchar(int)
  • Support char(int)
  • Support tinyInt, smallInt
  • Support binary
  • Support date
  • Support timestamp
  • Support decimal
  • Support list, struct and map
  • Fix a bug where the vineyard file system occasionally returns an incorrect path when listStatus is called.
  • Fix table renaming for both normal and partitioned tables. Replace jimfs with RawLocalFileSystem.
  • Fix a bug where compound queries do not work.
  • Fix a bug where the vineyard fs does not delete the file when Hive drops a vineyard table.
  • Fix a decimal bug.
  • Refactor the code for nested data structures.

Related issue number

Fixes #issue number

@vegetableysm vegetableysm changed the title from "Fix bugs and add features for hive integration(WIP)" to "(WIP)Fix bugs and add features for hive integration" on Sep 15, 2023
@vegetableysm

This comment was marked as resolved.

@vegetableysm

This comment was marked as resolved.

@vegetableysm vegetableysm force-pushed the hive-integration branch 3 times, most recently from 3c83f7a to 9c12349, on November 13, 2023 02:58
… process will overwrite the content of table instead of appending.

Fix bug: Spark renames an empty file, which triggers an exception because the objectID is invalid.
Support char and varchar.
Fix bug that getSplits will try to open a dir.

Signed-off-by: vegetableysm <[email protected]>
Fix the table-renaming bug (including partitioned tables).
Fix a bug where compound queries don't work.

Signed-off-by: vegetableysm <[email protected]>
… when the file is deleted.

Signed-off-by: vegetableysm <[email protected]>
Refactor setValue and getValue.
Refactor string, varchar, char and binary type.

Signed-off-by: vegetableysm <[email protected]>
Signed-off-by: vegetableysm <[email protected]>
Signed-off-by: vegetableysm <[email protected]>
Signed-off-by: vegetableysm <[email protected]>
+ tableFilePath
+ ", content: "
+ new String(buffer, StandardCharsets.UTF_8));
break;
Member

Don't use instanceof to check the exception type; use multiple-exception catches instead.
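
As a rough illustration of this suggestion (not the code from this PR), the sketch below contrasts an instanceof check inside a broad catch with typed catch clauses, including a multi-catch. ExceptionDispatchSketch, open, describeWithInstanceof and describeWithTypedCatch are hypothetical names; only the java.io exception classes are real.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical demo class; not part of the vineyard code base.
final class ExceptionDispatchSketch {
    // Stand-in for an operation that can fail in different ways.
    static void open(String kind) throws IOException {
        switch (kind) {
            case "missing":
                throw new FileNotFoundException("no such file");
            case "io":
                throw new IOException("read failed");
            case "state":
                throw new IllegalStateException("bad state");
            default:
                break; // succeeds
        }
    }

    // Discouraged: one broad catch, then instanceof to find out what happened.
    static String describeWithInstanceof(String kind) {
        try {
            open(kind);
            return "ok";
        } catch (Exception e) {
            if (e instanceof FileNotFoundException) {
                return "missing";
            }
            return "failed";
        }
    }

    // Preferred: the catch clauses themselves carry the type dispatch.
    static String describeWithTypedCatch(String kind) {
        try {
            open(kind);
            return "ok";
        } catch (FileNotFoundException e) {
            return "missing";
        } catch (IOException | IllegalStateException e) {
            return "failed";
        }
    }
}
```

Both methods return the same strings for the same inputs; the second version lets the compiler check the exception types and keeps each handler's scope explicit.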

Collaborator Author

Done

+ new String(buffer, StandardCharsets.UTF_8));
break;
}
throw e;
Member

See comment above.

Collaborator Author

Done

  case CHAR:
  case VARCHAR:
- return Types.MinorType.LARGEVARCHAR.getType();
+ return Types.MinorType.VARCHAR.getType();
Member

Use large var char for string types to avoid exceeding the size limit of arrow buffers in VarCharArray.
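
For context, Arrow's regular variable-length string layout uses 32-bit signed offsets, while the large variant uses 64-bit offsets, so the regular layout can address at most Integer.MAX_VALUE bytes of value data per array. The sketch below only does that arithmetic and does not call Arrow; OffsetWidthSketch and fitsInt32Offsets are hypothetical names, and the row counts in the usage note are made-up figures.

```java
// Hypothetical helper; it only does arithmetic and does not touch Arrow.
final class OffsetWidthSketch {
    // Largest end offset that a 32-bit signed offset buffer can hold.
    static final long MAX_INT32_OFFSET = Integer.MAX_VALUE;

    // True if the concatenated value bytes of one array can be addressed
    // with 32-bit offsets. Throws ArithmeticException on long overflow.
    static boolean fitsInt32Offsets(long rowCount, long bytesPerValue) {
        long totalBytes = Math.multiplyExact(rowCount, bytesPerValue);
        return totalBytes <= MAX_INT32_OFFSET;
    }
}
```

With these assumed figures, 10 million rows of 100-byte strings (about 1 GB) fit within 32-bit offsets, while 30 million such rows (about 3 GB) do not, which is the situation the large type avoids.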

Collaborator Author

Done

- meta.addMember("buffer_", buffer.seal(client));
- meta.addMember("null_bitmap_", BufferBuilder.empty(client));
+ meta.addMember("data_buffer_", dataBufferBuilder.seal(client));
+ meta.addMember("validity_buffer_", validityBufferBuilder.seal(client));
Member

Don't rename these key names. They MUST be kept consistent with the builders/resolvers in existing C++/Python/Go/Rust SDKs.
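
A toy sketch of why the names matter, under loud assumptions: a plain map stands in for the object metadata, and MemberKeySketch, build and canResolve are hypothetical. The key strings are copied from the snippet above; treating buffer_ and null_bitmap_ as the pre-existing names is an inference from this comment, not something verified against the other SDKs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: object metadata modelled as a map from member key to payload.
final class MemberKeySketch {
    // Key names as they appear in the snippet under review.
    static final String BUFFER_KEY = "buffer_";
    static final String NULL_BITMAP_KEY = "null_bitmap_";

    // A "builder" that stores its members under the given key names.
    static Map<String, byte[]> build(String dataKey, String bitmapKey) {
        Map<String, byte[]> meta = new HashMap<>();
        meta.put(dataKey, new byte[] {1, 2, 3});
        meta.put(bitmapKey, new byte[0]);
        return meta;
    }

    // A "resolver" (imagine it lives in another SDK) that only knows the original keys.
    static boolean canResolve(Map<String, byte[]> meta) {
        return meta.containsKey(BUFFER_KEY) && meta.containsKey(NULL_BITMAP_KEY);
    }
}
```

In this model, an object built with buffer_ and null_bitmap_ resolves, while one built with data_buffer_ and validity_buffer_ does not, even though the payloads are identical.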

Collaborator Author

Done

.resolve(
meta.getMemberMeta(
"buffer_" + String.valueOf(i) + "_")))
.getBuffer());
Member

Can you refactor this part (e.g., use some temporary variables) to make it look better?
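
One possible shape of that refactor, as a sketch only: every type below (Meta, BufferHolder, the static resolve helper) is a hypothetical stand-in rather than the vineyard Java SDK, and only the "buffer_" + i + "_" key pattern is taken from the snippet above. Each step of the chained call gets its own named temporary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All types here are hypothetical stand-ins, not the vineyard Java SDK.
final class TempVariableSketch {
    static final class Meta {
        private final Map<String, Meta> members = new HashMap<>();
        private final String payload;

        Meta(String payload) { this.payload = payload; }

        void addMember(String key, Meta member) { members.put(key, member); }

        Meta getMemberMeta(String key) { return members.get(key); }
    }

    static final class BufferHolder {
        private final String buffer;

        BufferHolder(String buffer) { this.buffer = buffer; }

        String getBuffer() { return buffer; }
    }

    // Stand-in for a resolver that turns member metadata into an object.
    static BufferHolder resolve(Meta memberMeta) {
        return new BufferHolder(memberMeta.payload);
    }

    // Each step of the former one-line chain gets a named temporary.
    static List<String> collectBuffers(Meta meta, int count) {
        List<String> buffers = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String memberKey = "buffer_" + i + "_";
            Meta memberMeta = meta.getMemberMeta(memberKey);
            BufferHolder holder = resolve(memberMeta);
            buffers.add(holder.getBuffer());
        }
        return buffers;
    }
}
```

String concatenation already converts the int index, so the String.valueOf call can go as well.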

public ObjectTransformer() {}

public Object defaultTransform(Object object) {
return object;
Member

defaultTransform -> transform.

Collaborator Author

Done


public int intTransform(Object object) {
return (int) object;
}
Member

  • intTransform -> transformInt
  • longTransform -> transformLong
  • ...
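
Applied to the snippets quoted in this thread, the renames might look like the sketch below. ObjectTransformerSketch is a hypothetical class name, and the body of transformLong is assumed by analogy with the int version since the original is not shown here.

```java
// Sketch of the suggested naming; not the merged implementation.
class ObjectTransformerSketch {
    // Formerly defaultTransform.
    public Object transform(Object object) {
        return object;
    }

    // Formerly intTransform; the cast unboxes an Integer.
    public int transformInt(Object object) {
        return (int) object;
    }

    // Formerly longTransform; body assumed by analogy with transformInt.
    public long transformLong(Object object) {
        return (long) object;
    }
}
```

Putting the verb first keeps the whole family grouped together in completion lists and matches the usual Java method-naming style.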

Collaborator Author

Done

@sighingnow
Member

Once ready for review, remember to remove the "WIP" from the PR's title.

@vegetableysm vegetableysm changed the title from "(WIP)Fix bugs and add features for hive integration" to "Fix bugs and add features for hive integration" on Nov 13, 2023
Signed-off-by: vegetableysm <[email protected]>
@sighingnow sighingnow merged commit fca0b35 into v6d-io:main Nov 14, 2023
4 of 5 checks passed
@sighingnow sighingnow deleted the hive-integration branch November 14, 2023 06:32
2 participants