[core] Improve the performance of show tables with hive metastore #4605

Merged
merged 1 commit into from
Nov 28, 2024

Member

@xleoken xleoken commented Nov 27, 2024

Purpose

When the catalog is stored in an S3 object store, the performance of show tables deteriorates as the number of tables increases.

HiveCatalog#listTablesImpl checks each table in the filesystem, which sends many list requests to the S3 server.

The reason is the same as in #4592.
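
To make the cost concrete, here is a minimal sketch of the per-table check described above. It is not Paimon code: `ListRequestCost`, the `warehouse/db/...` key layout, and the table names are assumptions for illustration, and the in-memory store only counts how many LIST requests a real object store would receive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for an object store that counts LIST requests. */
class ListRequestCost {
    private final TreeSet<String> keys = new TreeSet<>();
    private int listCount = 0;

    void put(String key) {
        keys.add(key);
    }

    /** Simulates one LIST request: returns every key that starts with the prefix. */
    List<String> list(String prefix) {
        listCount++;
        List<String> result = new ArrayList<>();
        // Keys are sorted, so all keys sharing the prefix are contiguous.
        for (String key : keys.tailSet(prefix)) {
            if (!key.startsWith(prefix)) {
                break;
            }
            result.add(key);
        }
        return result;
    }

    int listRequests() {
        return listCount;
    }

    /** Keeps only tables whose schema directory is non-empty: one LIST per table. */
    static List<String> showTables(ListRequestCost store, List<String> tableNames) {
        List<String> visible = new ArrayList<>();
        for (String table : tableNames) {
            // Assumed key layout, for illustration only.
            if (!store.list("warehouse/db/" + table + "/schema/").isEmpty()) {
                visible.add(table);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        ListRequestCost store = new ListRequestCost();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 102; i++) {
            String table = String.format("abc%03d", i); // assumed naming
            names.add(table);
            store.put("warehouse/db/" + table + "/schema/schema-0");
        }
        List<String> visible = showTables(store, names);
        System.out.println("tables: " + visible.size()
                + ", LIST requests: " + store.listRequests());
    }
}
```

In this model the number of LIST requests grows linearly with the number of tables, which is the pattern the patch is meant to avoid.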

Tests

List 102 tables

Before patch

abc098
abc099
Time taken: 25.643 seconds, Fetched 102 row(s)

After patch

abc098
abc099
Time taken: 1.711 seconds, Fetched 102 row(s)

API and Format

Documentation

@wwj6591812
Contributor

wwj6591812 commented Nov 28, 2024

Hi @xleoken, thanks for preparing this PR.
I have a question:
Why does changing AbstractCatalog#tableSchemaInFileSystem to AbstractCatalog#tableExistsInFileSystem improve performance? Both need to call "listVersionedFiles(fileIO, schemaDirectory(), SCHEMA_PREFIX)" only once.

@xleoken
Member Author

xleoken commented Nov 28, 2024

Hi @wwj6591812, AbstractCatalog#tableExistsInFileSystem checks schema-0 first, so it reads the S3 key directly instead of listing the schema directory.
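
The idea can be sketched roughly as follows. This is an illustrative model, not the actual AbstractCatalog implementation: `SchemaZeroFastPath`, its methods, and the `/schema/schema-N` key layout are assumptions, and the fallback to a listing when schema-0 is absent is my reading of "check schema-0 first".

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory object store that counts direct lookups and LIST requests. */
class SchemaZeroFastPath {
    private final Set<String> keys = new HashSet<>();
    int existsRequests = 0;
    int listRequests = 0;

    void put(String key) {
        keys.add(key);
    }

    /** Simulates a direct lookup of a single key (no listing involved). */
    boolean exists(String key) {
        existsRequests++;
        return keys.contains(key);
    }

    /** Simulates one LIST request: is there any key under the prefix? */
    boolean anyWithPrefix(String prefix) {
        listRequests++;
        for (String key : keys) {
            if (key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Checks schema-0 by direct lookup first; lists only if that lookup misses. */
    boolean tableExists(String tablePath) {
        String schemaDir = tablePath + "/schema/"; // assumed layout
        if (exists(schemaDir + "schema-0")) {
            return true; // fast path: no LIST request issued
        }
        return anyWithPrefix(schemaDir + "schema-");
    }

    public static void main(String[] args) {
        SchemaZeroFastPath store = new SchemaZeroFastPath();
        store.put("warehouse/db/t1/schema/schema-0");
        System.out.println("exists: " + store.tableExists("warehouse/db/t1")
                + ", LIST requests: " + store.listRequests);
    }
}
```

For a table that still has its schema-0 file, the check in this sketch costs a single key lookup and no LIST request; only tables without schema-0 pay for the listing.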


@wwj6591812
Contributor

OK, after I pulled the new code, I found it.
I agree with you.

@wwj6591812
Contributor

+1

@xleoken
Member Author

xleoken commented Nov 28, 2024

+1

thanks

@JingsongLi
Contributor

+1

@JingsongLi JingsongLi merged commit cdd4061 into apache:master Nov 28, 2024
12 checks passed