Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[parquet] Support using file index result to filter row ranges #4780

Merged
merged 5 commits into from
Dec 30, 2024

Conversation

Tan-JiaLiang
Copy link
Contributor

Purpose

Support using file index result to filter row ranges.

Tests

org.apache.paimon.table.AppendOnlyFileStoreTableTest#testBitmapIndexResultFilterParquetRowRanges

API and Format

No.

Documentation

No.

@JingsongLi
Copy link
Contributor

Hi @Tan-JiaLiang , can you show some test results after index pushing down?

@Tan-JiaLiang
Copy link
Contributor Author

I test in a 20 fields table, generate 1000000 rows, using the BSI index to test the EQ predicate.

generate random values bound in [0, 1000) for BSI index field, and test predicate EQ with a random value(index result was no filter any ranges)

read Best/Avg Time(ms) Row Rate(K/s) Per Row(ns) Relative
without-bsi-index-1000-744 3387 / 3565 885.7 1129.1 1.0X
without-bsi-index-1000-441 3521 / 3599 852.1 1173.5 1.0X
without-bsi-index-1000-778 3389 / 3485 885.3 1129.6 1.0X
with-bsi-index-1000-744 3425 / 3604 875.9 1141.7 1.0X
with-bsi-index-1000-441 3305 / 3449 907.7 1101.7 1.0X
with-bsi-index-1000-778 3330 / 3459 901.0 1109.9 1.0X

generate random values bound in [0, 10000) for BSI index field, and test predicate EQ with a random value(index result maybe more filter 4-5 ranges)

read Best/Avg Time(ms) Row Rate(K/s) Per Row(ns) Relative
without-bsi-index-10000-7069 3471 / 3608 864.4 1156.8 1.0X
without-bsi-index-10000-9186 3329 / 3467 901.2 1109.6 1.0X
without-bsi-index-10000-9679 3369 / 3452 890.5 1122.9 1.0X
with-bsi-index-10000-7069 2992 / 3159 1002.8 997.2 1.2X
with-bsi-index-10000-9186 3074 / 3174 976.1 1024.5 1.1X
with-bsi-index-10000-9679 3123 / 3262 960.5 1041.1 1.1X

generate random values bound in [0, 20000) for BSI index field, and test predicate EQ with a random value(index result maybe more filter 10-20 ranges)

read Best/Avg Time(ms) Row Rate(K/s) Per Row(ns) Relative
read-without-bsi-index-20000-18921 3212 / 3421 933.9 1070.8 1.0X
read-without-bsi-index-20000-4309 3246 / 3403 924.3 1081.9 1.0X
read-without-bsi-index-20000-14889 3421 / 3439 877.1 1140.2 0.9X
read-with-bsi-index-20000-18921 2054 / 2129 1460.8 684.6 1.6X
read-with-bsi-index-20000-4309 2310 / 2494 1298.7 770.0 1.4X
read-with-bsi-index-20000-14889 2920 / 3072 1027.5 973.2 1.1X

values bound in [0, 50000), and test predicate EQ with a random value(index result maybe more filter 30-35 ranges)

read Best/Avg Time(ms) Row Rate(K/s) Per Row(ns) Relative
read-without-bsi-index-50000-469 3324 / 3398 902.4 1108.1 1.0X
read-without-bsi-index-50000-7187 3437 / 3493 872.9 1145.7 1.0X
read-without-bsi-index-50000-1208 3404 / 3470 881.3 1134.6 1.0X
read-with-bsi-index-50000-469 1451 / 1518 2068.2 483.5 2.3X
read-with-bsi-index-50000-7187 1206 / 1259 2487.3 402.0 2.8X
read-with-bsi-index-50000-1208 917 / 1044 3270.8 305.7 3.6X

generate random values bound in [0, 100000) for BSI index field, and test predicate EQ with a random value(index result maybe more filter 35-45 ranges)

read Best/Avg Time(ms) Row Rate(K/s) Per Row(ns) Relative
read-without-bsi-index-100000-99543 3414 / 3497 878.7 1138.0 1.0X
read-without-bsi-index-100000-78972 3390 / 3420 885.0 1130.0 1.0X
read-without-bsi-index-100000-82689 3316 / 3406 904.6 1105.5 1.0X
read-with-bsi-index-100000-99543 797 / 867 3763.9 265.7 4.3X
read-with-bsi-index-100000-78972 544 / 613 5516.6 181.3 6.3X
read-with-bsi-index-100000-82689 733 / 822 4091.1 244.4 4.7X

generate random values bound in [0, 200000) for BSI index field, and test predicate EQ with a random value(index result maybe more filter 35-45 ranges)

read Best/Avg Time(ms) Row Rate(K/s) Per Row(ns) Relative
read-without-bsi-index-200000-113323 3393 / 3419 884.2 1130.9 1.0X
read-without-bsi-index-200000-124857 3409 / 3440 880.1 1136.2 1.0X
read-without-bsi-index-200000-13588 3341 / 3405 898.0 1113.5 1.0X
read-with-bsi-index-200000-113323 164 / 173 18273.7 54.7 20.7X
read-with-bsi-index-200000-124857 264 / 295 11347.3 88.1 12.8X
read-with-bsi-index-200000-13588 260 / 279 11540.2 86.7 13.1X

generate random values bound in [0, 500000) for BSI index field, and test predicate EQ with a random value(index result maybe more filter 35-45 ranges)

read Best/Avg Time(ms) Row Rate(K/s) Per Row(ns) Relative
read-without-bsi-index-500000-222300 3260 / 3507 920.2 1086.8 1.0X
read-without-bsi-index-500000-211083 3427 / 3640 875.3 1142.5 1.0X
read-without-bsi-index-500000-172343 3530 / 3545 850.0 1176.5 0.9X
read-with-bsi-index-500000-222300 188 / 208 15969.4 62.6 17.4X
read-with-bsi-index-500000-211083 226 / 244 13271.6 75.3 14.4X
read-with-bsi-index-500000-172343 356 / 379 8426.1 118.7 9.2X

@Tan-JiaLiang
Copy link
Contributor Author

This result was testing after #4765 fixed.

@Tan-JiaLiang
Copy link
Contributor Author

It occurred to me that we could use the bitmap result to narrow down the row ranges, which might skip some deserialisation overhead. Be aware that this may cause fragmentation of the row ranges, but I think the row ranges themselves are limited, it doesn't take up much memory.

@Tan-JiaLiang
Copy link
Contributor Author

@JingsongLi Can you help to trigger the build again? The Container was startup timeout in E2E test.

@JingsongLi
Copy link
Contributor

Hi @Tan-JiaLiang thanks for your benchmark, what did you compare in the benchmark? Before and after this PR? Is that so, with or without an index?

Copy link
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JingsongLi JingsongLi merged commit 43dff3a into apache:master Dec 30, 2024
11 of 12 checks passed
@Tan-JiaLiang
Copy link
Contributor Author

Yes, I compare with the EQ predicate in enable or disable the option file-index.read.enabled.

@Tan-JiaLiang
Copy link
Contributor Author

This is the Benchmark code

/** Benchmark for table read. */
public class BitmapIndexPushDownBenchmark {

    private static final int VALUE_COUNT = 20;

    private final int rowCount = 1000000;
    @TempDir java.nio.file.Path tempFile;

    private final RandomDataGenerator random = new RandomDataGenerator();

    @Test
    public void testParquetRead() throws Exception {
        System.out.println(tempFile);
        int[] bounds = new int[] {1000, 10000, 50000, 100000};
        for (int bound : bounds) {
            Table table = prepareData(bound, parquet(), "parquet_" + bound);
            Map<String, Table> tables = new LinkedHashMap<>();
            tables.put("without-bsi-index", table.copy(Collections.singletonMap("file-index.read.enabled", "false")));
            tables.put("with-bsi-index", table.copy(Collections.singletonMap("file-index.read.enabled", "true")));

            int[] values = new int[5];
            for (int i = 0; i < values.length; i++) {
                values[i] = random.nextInt(0, bound);
            }
            innerTest(tables, bound, values);
        }
    }

    private Options parquet() {
        Options options = new Options();
        options.set(CoreOptions.FILE_FORMAT, CoreOptions.FILE_FORMAT_PARQUET);
        options.set("file-index.bsi.columns", "k");
        return options;
    }

    private void innerTest(Map<String, Table> tables, int bound, int[] values) {
        int readTime = 3;
        Benchmark benchmark =
                new Benchmark("read", readTime * rowCount)
                        .setNumWarmupIters(1)
                        .setOutputPerIteration(true);

        for (String name : tables.keySet()) {
            for (int value : values) {
                benchmark.addCase(
                        "read-" + name + "-" + bound + "-" + value,
                        3,
                        () -> {
                            Table table = tables.get(name);
                            Predicate predicate = new PredicateBuilder(table.rowType()).equal(0, value);
                            for (int i = 0; i < readTime; i++) {
                                List<Split> splits = table.newReadBuilder().newScan().plan().splits();
                                AtomicLong readCount = new AtomicLong(0);
                                try {
                                    for (Split split : splits) {
                                        RecordReader<InternalRow> reader =
                                                table.newReadBuilder()
                                                        .withFilter(predicate)
                                                        .newRead()
                                                        .createReader(split);
                                        reader.forEachRemaining(row -> readCount.incrementAndGet());
                                    }
                                    System.out.printf("Finish read %d rows.\n", readCount.get());
                                } catch (Exception e) {
                                    throw new RuntimeException(e);
                                }
                            }
                        });
            }
        }
        benchmark.run();
    }

    private Table prepareData(int bound, Options options, String tableName) throws Exception {
        Table table = createTable(options, tableName);
        StreamWriteBuilder writeBuilder = table.newStreamWriteBuilder();
        StreamTableWrite write = writeBuilder.newWrite();
        StreamTableCommit commit = writeBuilder.newCommit();
        AtomicInteger writeCount = new AtomicInteger(0);
        for (int i = 0; i < rowCount; i++) {
            try {
                write.write(newRandomRow(bound));
                writeCount.incrementAndGet();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
        List<CommitMessage> commitMessages = write.prepareCommit(true, 1);
        commit.commit(1, commitMessages);

        write.close();
        return table;
    }

    protected Table createTable(Options tableOptions, String tableName)
            throws Exception {
        Options catalogOptions = new Options();
        catalogOptions.set(CatalogOptions.WAREHOUSE, tempFile.toUri().toString());
        Catalog catalog = CatalogFactory.createCatalog(CatalogContext.create(catalogOptions));
        String database = "default";
        catalog.createDatabase(database, true);

        List<DataField> fields = new ArrayList<>();
        fields.add(new DataField(0, "k", new IntType()));
        for (int i = 1; i <= VALUE_COUNT; i++) {
            fields.add(new DataField(i, "f" + i, DataTypes.STRING()));
        }
        Schema schema =
                new Schema(fields, Collections.emptyList(), Collections.emptyList(), tableOptions.toMap(), "");
        Identifier identifier = Identifier.create(database, tableName);
        catalog.createTable(identifier, schema, false);
        return catalog.getTable(identifier);
    }

    protected InternalRow newRandomRow(int bound) {
        GenericRow row = new GenericRow(1 + VALUE_COUNT);
        row.setField(0, random.nextInt(0, bound));
        for (int i = 1; i <= VALUE_COUNT; i++) {
            row.setField(i, BinaryString.fromString(random.nextHexString(32)));
        }
        return row;
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants