forked from pingcap/docs
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Refactor data migration docs (pingcap#7202)
- Loading branch information
Showing
23 changed files
with
2,269 additions
and
263 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,124 @@ | ||
--- | ||
title: Filter Binlog Events | ||
summary: Learn how to filter binlog events when migrating data. | ||
--- | ||
|
||
# Filter Binlog Events | ||
|
||
This document describes how to filter binlog events when you use DM to perform continuous incremental data replication. For the detailed replication instructions, refer to the following documents by scenarios: | ||
|
||
- [Migrate MySQL of Small Datasets to TiDB](/migrate-small-mysql-to-tidb.md) | ||
- [Migrate MySQL of Large Datasets to TiDB](/migrate-large-mysql-to-tidb.md) | ||
- [Migrate and Merge MySQL Shards of Small Datasets to TiDB](/migrate-small-mysql-shards-to-tidb.md) | ||
- [Migrate and Merge MySQL Shards of Large Datasets to TiDB](/migrate-large-mysql-shards-to-tidb.md) | ||
|
||
## Configuration | ||
|
||
To use binlog event filter, add a `filter` to the task configuration file of DM, as shown below: | ||
|
||
```yaml | ||
filters: | ||
rule-1: | ||
schema-pattern: "test_*" | ||
table-pattern: "t_*" | ||
events: ["truncate table", "drop table"] | ||
sql-pattern: ["^DROP\\s+PROCEDURE", "^CREATE\\s+PROCEDURE"] | ||
action: Ignore | ||
``` | ||
- `schema-pattern`/`table-pattern`: Filters matching schemas or tables | ||
- `events`: Filters binlog events. Supported events are listed in the table below: | ||
|
||
| Event | Category | Description | | ||
| --------------- | ---- | --------------------------| | ||
| all | | Includes all events | | ||
| all dml | | Includes all DML events | | ||
| all ddl | | Includes all DDL events | | ||
| none | | Includes no event | | ||
| none ddl | | Excludes all DDL events | | ||
| none dml | | Excludes all DML events | | ||
| insert | DML | Insert DML event | | ||
| update | DML | Update DML event | | ||
| delete | DML | Delete DML event | | ||
| create database | DDL | Create database event | | ||
| drop database | DDL | Drop database event | | ||
| create table | DDL | Create table event | | ||
| create index | DDL | Create index event | | ||
| drop table | DDL | Drop table event | | ||
| truncate table | DDL | Truncate table event | | ||
| rename table | DDL | Rename table event | | ||
| drop index | DDL | Drop index event | | ||
| alter table | DDL | Alter table event | | ||
|
||
- `sql-pattern`:Filters specified DDL SQL statements. The matching rule supports using a regular expression. | ||
- `action`: `Do` or `Ignore` | ||
|
||
- `Do`: the allow list. A binlog event is replicated if meeting either of the following two conditions: | ||
|
||
- The event matches the rule setting. | ||
- sql-pattern has been specified and the SQL statement of the event matches any of the sql-pattern options. | ||
|
||
- `Ignore`: the block list. A binlog event is filtered out if meeting either of the following two conditions: | ||
|
||
- The event matches the rule setting. | ||
- sql-pattern has been specified and the SQL statement of the event matches any of the sql-pattern options. | ||
|
||
If both `Do` and `Ignore` are configured, `Ignore` has higher priority over `Do`. That is, an event satisfying both `Ignore` and `Do` conditions will be filtered out. | ||
|
||
## Application scenarios | ||
|
||
This section describes the application scenarios of binlog event filter. | ||
|
||
### Filter out all sharding deletion operations | ||
|
||
To filter out all deletion operations, configure a `filter-table-rule` and a `filter-schema-rule`, as shown below: | ||
|
||
``` | ||
filters: | ||
filter-table-rule: | ||
schema-pattern: "test_*" | ||
table-pattern: "t_*" | ||
events: ["truncate table", "drop table", "delete"] | ||
action: Ignore | ||
filter-schema-rule: | ||
schema-pattern: "test_*" | ||
events: ["drop database"] | ||
action: Ignore | ||
``` | ||
|
||
### Migrate only DML operations of sharded schemas and tables | ||
|
||
To replicate only DML statements, configure two `Binlog event filter rule`, as shown below: | ||
|
||
``` | ||
filters: | ||
do-table-rule: | ||
schema-pattern: "test_*" | ||
table-pattern: "t_*" | ||
events: ["create table", "all dml"] | ||
action: Do | ||
do-schema-rule: | ||
schema-pattern: "test_*" | ||
events: ["create database"] | ||
action: Do | ||
``` | ||
|
||
### Filter out SQL statements not supported by TiDB | ||
|
||
To filter out SQL statements not supported by TiDB, configure a `filter-procedure-rule`, as shown below: | ||
|
||
``` | ||
filters: | ||
filter-procedure-rule: | ||
schema-pattern: "*" | ||
sql-pattern: [".*\\s+DROP\\s+PROCEDURE", ".*\\s+CREATE\\s+PROCEDURE", "ALTER\\s+TABLE[\\s\\S]*ADD\\s+PARTITION", "ALTER\\s+TABLE[\\s\\S]*DROP\\s+PARTITION"] | ||
action: Ignore | ||
``` | ||
|
||
> **Warning:** | ||
> | ||
> To avoid filtering out data that needs to be migrated, configure the global filtering rule as strictly as possible. | ||
|
||
## See also | ||
|
||
[Filter Binlog Events Using SQL Expressions](/filter-dml-event.md) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
--- | ||
title: Filter DML Events Using SQL Expressions | ||
summary: Learn how to filter DML events using SQL expressions. | ||
--- | ||
|
||
# Filter DML Events Using SQL Expressions | ||
|
||
This document introduces how to filter binlog events using SQL expressions when you use DM to perform continuous incremental data replication. For the detailed replication instruction, refer to the following documents: | ||
|
||
- [Migrate MySQL of Small Datasets to TiDB](/migrate-small-mysql-to-tidb.md) | ||
- [Migrate MySQL of Large Datasets to TiDB](/migrate-large-mysql-to-tidb.md) | ||
- [Migrate and Merge MySQL Shards of Small Datasets to TiDB](/migrate-small-mysql-shards-to-tidb.md) | ||
- [Migrate and Merge MySQL Shards of Large Datasets to TiDB](/migrate-large-mysql-shards-to-tidb.md) | ||
|
||
When performing incremental data replication, you can use the [Binlog Event Filter](/filter-binlog-event.md) to filter certain types of binlog events. For example, you can choose not to replicate `DELETE` events to the downstream for the purposes like archiving and auditing. However, the Binlog Event Filter cannot determine whether to filter the `DELETE` event of a row that requires finer granularity. | ||
|
||
To address the issue, since v2.0.5, DM supports using `binlog value filter` in incremental data replication to filter data. Among the DM-supported and `ROW`-formatted binlog, the binlog events carry values of all columns, and you can configure SQL expressions based on these values. If the expression calculates a row change as `TRUE`, DM does not replicate this row change to the downstream. | ||
|
||
Similar to [Binlog Event Filter](/filter-binlog-event.md), you need to configure `binlog value filter` in the task configuration file. For details, see the following configuration example. For the advanced task configuration and the description, refer to [DM advanced task configuration file](https://docs.pingcap.com/tidb-data-migration/stable/task-configuration-file-full#task-configuration-file-template-advanced). | ||
|
||
```yaml | ||
name: test | ||
task-mode: all | ||
|
||
mysql-instances: | ||
- source-id: "mysql-replica-01" | ||
expression-filters: ["even_c"] | ||
|
||
expression-filter: | ||
even_c: | ||
schema: "expr_filter" | ||
table: "tbl" | ||
insert-value-expr: "c % 2 = 0" | ||
``` | ||
In the above configuration example, the `even_c` rule is configured and referenced by the data source `mysql-replica-01`. According to this rule, for the `tb1` table in the `expr_filter` schema, when an even number is inserted into the `c` column (`c % 2 = 0`), this `insert` statement is not replicated to the downstream. The following example shows the effect of this rule. | ||
|
||
Incrementally insert the following data in the upstream data source: | ||
|
||
```sql | ||
INSERT INTO tbl(id, c) VALUES (1, 1), (2, 2), (3, 3), (4, 4); | ||
``` | ||
|
||
Then query the `tb1` table on downstream. You can see that only the rows with odd numbers on `c` are replicated. | ||
|
||
```sql | ||
MySQL [test]> select * from tbl; | ||
+------+------+ | ||
| id | c | | ||
+------+------+ | ||
| 1 | 1 | | ||
| 3 | 3 | | ||
+------+------+ | ||
2 rows in set (0.001 sec) | ||
``` | ||
|
||
## Configuration parameters and description | ||
|
||
- `schema`: The name of the upstream schema to match. Wildcard matching or regular matching is not supported. | ||
- `table`: The name of the upstream table to match. Wildcard matching or regular matching is not supported. | ||
- `insert-value-expr`: Configures an expression that takes effect on values carried by the `INSERT` type binlog events (WRITE_ROWS_EVENT). You cannot use this expression together with `update-old-value-expr`, `update-new-value-expr` or `delete-value-expr` in the same configuration item. | ||
- `update-old-value-expr`: Configures an expression that takes effect on the old values carried by the `UPDATE` type binlog events (UPDATE_ROWS_EVENT). You cannot use this expression together with `insert-value-expr` or `delete-value-expr` in the same configuration item. | ||
- `update-new-value-expr`: Configures an expression that takes effect on the new values carried by the `UPDATE` type binlog events (UPDATE_ROWS_EVENT). You cannot use this expression together with `insert-value-expr` or `delete-value-expr` in the same configuration item. | ||
- `delete-value-expr`: Configures an expression that takes effect on values carried by the `DELETE` type binlog events (DELETE_ROWS_EVENT). You cannot use this expression together with `insert-value-expr`, `update-old-value-expr` or `update-new-value-expr`. | ||
|
||
> **Note:** | ||
> | ||
> - You can configure `update-old-value-expr` and `update-new-value-expr` together. | ||
> - When `update-old-value-expr` and `update-new-value-expr` are configured together, the rows whose "update + old values" meet `update-old-value-expr` **and** whose "update + new values" meet `update-new-value-expr` are filtered. | ||
> - When one of `update-old-value-expr` and `update-new-value-expr` is configured, the configured expression determines whether to filter the **entire row change**, which means that the deletion of old values and the insertion of new values are filtered as a whole. | ||
|
||
You can use the SQL expression on one column or on multiple columns. You can also use the SQL functions supported by TiDB, such as `c % 2 = 0`, `a*a + b*b = c*c`, and `ts > NOW()`. | ||
|
||
The `TIMESTAMP` default time zone is the time zone specified in the task configuration file. The default value is the time zone of the downstream. You can explicitly specify the time zone in a way like `c_timestamp = '2021-01-01 12:34:56.5678+08:00'`. | ||
|
||
You can configure multiple filtering rules under the `expression-filter` configuration item. The upstream data source references the required rule in `expression-filters` to make it effective. When multiple rules are used, if **any** one of the rules are matched, the entire row change is filtered. | ||
|
||
> **Note:** | ||
> | ||
> Configuring too many expression filtering rules increases the calculation overhead of DM and slows down the data replication. |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Oops, something went wrong.