[core] Support bsi file index (#4464)
Tan-JiaLiang authored Nov 8, 2024
1 parent 2eca3a7 commit ddf10d4
Showing 12 changed files with 1,343 additions and 8 deletions.
3 changes: 3 additions & 0 deletions docs/content/append-table/query-performance.md
@@ -65,6 +65,9 @@ scenario. Using a bitmap may consume more space but can result in greater accura
`Bitmap`:
* `file-index.bitmap.columns`: specify the columns that need bitmap index.

`Bit-Slice Index Bitmap`:
* `file-index.bsi.columns`: specify the columns that need a BSI index.

More filter types will be supported...

If you want to add a file index to an existing table without any rewrite, you can use the `rewrite_file_index` procedure. Before
102 changes: 102 additions & 0 deletions docs/content/concepts/spec/fileindex.md
@@ -136,3 +136,105 @@ offset: 4 bytes int (when it is negative, it represents t
</pre>

All integers are BIG_ENDIAN.

## Column Index Bytes: Bit-Slice Index Bitmap

The BSI file index is a numeric range index used to accelerate range queries. It can be used together with the bitmap index.

Define `'file-index.bsi.columns'` to specify the columns that need a BSI index.
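
To illustrate why a bit-slice index suits range predicates, the following is a minimal sketch of the general technique, not Paimon's implementation. It assumes non-negative integer values and uses plain Python integers as bitmaps (bit `i` stands for row `i`); the class and method names are hypothetical.

```python
class BitSliceIndex:
    """Toy bit-slice index over non-negative ints (illustrative only)."""

    def __init__(self, values):
        self.exists = 0   # bitmap of rows that hold a non-null value
        self.slices = []  # slices[j] = bitmap of rows whose value has bit j set
        for row, value in enumerate(values):
            if value is None:
                continue
            if value < 0:
                raise ValueError("this sketch only handles non-negative values")
            self.exists |= 1 << row
            bit = 0
            while value >> bit:
                if bit == len(self.slices):
                    self.slices.append(0)
                if (value >> bit) & 1:
                    self.slices[bit] |= 1 << row
                bit += 1

    def less_or_equal(self, k):
        """Return the bitmap of rows whose value is <= k."""
        if k < 0:
            return 0
        if k.bit_length() > len(self.slices):
            return self.exists  # k exceeds every value the slices can encode
        lt, eq = 0, self.exists
        # Walk from the most significant slice down, tracking rows that are
        # already strictly smaller (lt) and rows still equal on the prefix (eq).
        for j in reversed(range(len(self.slices))):
            s = self.slices[j]
            if (k >> j) & 1:
                lt |= eq & ~s
                eq &= s
            else:
                eq &= ~s
        return lt | eq

    def between(self, low, high):
        """Return the bitmap of rows with low <= value <= high."""
        return self.less_or_equal(high) & ~self.less_or_equal(low - 1)


def rows(bitmap):
    """Expand a bitmap into a sorted list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


index = BitSliceIndex([5, 0, 12, None, 7, 3])
matching = rows(index.between(3, 7))  # rows holding 5, 7 and 3
```

A range predicate is answered with one pass over the slices, so the cost depends on the number of bits in the largest value rather than on the number of distinct values. A real implementation would use compressed bitmaps instead of Python integers.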

BSI file index format (V1):

<pre>
BSI file index format (V1)
+-------------------------------------------------+
| version (1 byte)                                |
+-------------------------------------------------+
| row count (4 bytes int)                         |
+-------------------------------------------------+
| has positive value (1 byte)                     |
+-------------------------------------------------+
| positive bsi serialized (if has positive value) |
+-------------------------------------------------+
| has negative value (1 byte)                     |
+-------------------------------------------------+
| negative bsi serialized (if has negative value) |
+-------------------------------------------------+
</pre>
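
As a rough illustration of the layout above, the sketch below reads the fixed-size leading fields with Python's `struct` module. It assumes big-endian integers and that a non-zero flag byte means "true"; both are assumptions, as is the helper name `parse_bsi_header`. The layout of the serialized BSI sections is not described here, so the sketch does not decode them and only locates the negative flag when no positive section is present.

```python
import struct

# version (1 byte), row count (4-byte int), has-positive flag (1 byte);
# ">" selects big-endian with no padding, which is an assumption here.
HEADER = struct.Struct(">BiB")


def parse_bsi_header(data: bytes) -> dict:
    """Read the leading fixed-size fields of a V1 BSI index (illustrative)."""
    version, row_count, positive_flag = HEADER.unpack_from(data, 0)
    has_positive = positive_flag != 0
    has_negative = None
    if not has_positive:
        # With no positive section, the negative flag follows immediately.
        has_negative = data[HEADER.size] != 0
    return {
        "version": version,
        "row_count": row_count,
        "has_positive": has_positive,
        "has_negative": has_negative,  # None: unknown without decoding the positive section
        "next_offset": HEADER.size,    # where the positive section or negative flag starts
    }


# Hypothetical bytes for an index over 1000 rows with no values of either sign.
sample = HEADER.pack(1, 1000, 0) + bytes([0])
header = parse_bsi_header(sample)
```

Keeping positive and negative values in separate sections means each serialized BSI only has to encode magnitudes of one sign.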

BSI supports only the following data types:

<table class="table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 10%">Paimon Data Type</th>
<th class="text-left" style="width: 5%">Supported</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>TinyIntType</code></td>
<td>true</td>
</tr>
<tr>
<td><code>SmallIntType</code></td>
<td>true</td>
</tr>
<tr>
<td><code>IntType</code></td>
<td>true</td>
</tr>
<tr>
<td><code>BigIntType</code></td>
<td>true</td>
</tr>
<tr>
<td><code>DateType</code></td>
<td>true</td>
</tr>
<tr>
<td><code>LocalZonedTimestampType</code></td>
<td>true</td>
</tr>
<tr>
<td><code>TimestampType</code></td>
<td>true</td>
</tr>
<tr>
<td><code>DecimalType(precision, scale)</code></td>
<td>true</td>
</tr>
<tr>
<td><code>FloatType</code></td>
<td>false</td>
</tr>
<tr>
<td><code>DoubleType</code></td>
<td>false</td>
</tr>
<tr>
<td><code>String</code></td>
<td>false</td>
</tr>
<tr>
<td><code>VarBinaryType</code>, <code>BinaryType</code></td>
<td>false</td>
</tr>
<tr>
<td><code>RowType</code></td>
<td>false</td>
</tr>
<tr>
<td><code>MapType</code></td>
<td>false</td>
</tr>
<tr>
<td><code>ArrayType</code></td>
<td>false</td>
</tr>
<tr>
<td><code>BooleanType</code></td>
<td>false</td>
</tr>
</tbody>
</table>
3 changes: 3 additions & 0 deletions docs/content/primary-key-table/query-performance.md
@@ -63,6 +63,9 @@ Supported filter types:
`Bitmap`:
* `file-index.bitmap.columns`: specify the columns that need bitmap index.

`Bit-Slice Index Bitmap`:
* `file-index.bsi.columns`: specify the columns that need a BSI index.

More filter types will be supported...

If you want to add a file index to an existing table without any rewrite, you can use the `rewrite_file_index` procedure. Before