Docs sync 20241028 (#459)
Co-authored-by: shibd <[email protected]>
streamnativebot and shibd authored Oct 28, 2024
1 parent 8ccbc88 commit 1a6ac69
Showing 34 changed files with 2,121 additions and 34 deletions.
@@ -4,14 +4,14 @@ author: ["ASF"]
contributors: ["ASF"]
language: Java
document:
- source: "https://github.com/apache/pulsar/tree/v3.0.6/pulsar-io/canal"
+ source: "https://github.com/apache/pulsar/tree/v3.0.7/pulsar-io/canal"
license: Apache License 2.0
tags: ["Pulsar IO", "Canal", "Source", "MySQL"]
alias: Canal Source
features: ["Use Canal source connector to sync data to Pulsar"]
license_link: "https://www.apache.org/licenses/LICENSE-2.0"
icon: "/images/connectors/canal-logo.png"
- download: "https://archive.apache.org/dist/pulsar/pulsar-3.0.6/connectors/pulsar-io-canal-3.0.6.nar"
+ download: "https://archive.apache.org/dist/pulsar/pulsar-3.0.7/connectors/pulsar-io-canal-3.0.7.nar"
support: StreamNative
support_link: https://streamnative.io
support_img: "/images/streamnative.png"
218 changes: 218 additions & 0 deletions connectors/canal-source/v4.0.0/canal-source.md
@@ -0,0 +1,218 @@
---
description: The Canal source connector pulls messages from MySQL to Pulsar topics.
author: ["ASF"]
contributors: ["ASF"]
language: Java
document:
source: "https://github.com/apache/pulsar/tree/v4.0.0/pulsar-io/canal"
license: Apache License 2.0
tags: ["Pulsar IO", "Canal", "Source", "MySQL"]
alias: Canal Source
features: ["Use Canal source connector to sync data to Pulsar"]
license_link: "https://www.apache.org/licenses/LICENSE-2.0"
icon: "/images/connectors/canal-logo.png"
download: "https://archive.apache.org/dist/pulsar/pulsar-4.0.0/connectors/pulsar-io-canal-4.0.0.nar"
support: StreamNative
support_link: https://streamnative.io
support_img: "/images/streamnative.png"
owner_name: ""
owner_img: ""
dockerfile:
id: "canal-source"
---

The Canal source connector pulls messages from MySQL to Pulsar topics.

# Configuration

The configuration of the Canal source connector has the following properties.

## Property

| Name | Required | Sensitive | Default | Description |
|------------------|----------|-----------|---------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `username` | true | true | None | Canal server account (not MySQL). |
| `password` | true | true | None | Canal server password (not MySQL). |
| `destination` | true | false | None | Source destination that Canal source connector connects to. |
| `singleHostname` | false | false | None | Canal server address. |
| `singlePort` | false | false | None | Canal server port. |
| `cluster` | true | false | false | Whether to enable cluster mode based on Canal server configuration or not.<br/><br/><li>true: **cluster** mode.<br/>If set to true, it talks to `zkServers` to figure out the actual database host.<br/><br/><li>false: **standalone** mode.<br/>If set to false, it connects to the database specified by `singleHostname` and `singlePort`. |
| `zkServers` | true | false | None | Address and port of the ZooKeeper server that the Canal source connector talks to in order to figure out the actual database host. |
| `batchSize` | false | false | 1000 | Batch size to fetch from Canal. |
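
The table implies that the two modes need different properties. Below is a minimal sketch of a pre-flight check, assuming that `zkServers` only matters when `cluster` is true and that `singleHostname` and `singlePort` only matter when it is false. The helper name `validate_canal_config` is hypothetical and is not part of Pulsar or Canal.

```python
def validate_canal_config(config):
    """Return a list of problems found in a Canal source config dict.

    Hypothetical helper: it only mirrors the property table above.
    """
    problems = []

    # `destination` is listed as required.
    if not config.get("destination"):
        problems.append("destination is required")

    # `username` and `password` must be present; empty strings are accepted
    # here because the examples in this document leave both empty.
    for key in ("username", "password"):
        if key not in config:
            problems.append(f"{key} is required")

    if config.get("cluster", False):
        # Cluster mode: the connector talks to ZooKeeper (assumption: only
        # needed in this mode).
        if not config.get("zkServers"):
            problems.append("zkServers is required when cluster is true")
    else:
        # Standalone mode: the connector uses singleHostname and singlePort.
        for key in ("singleHostname", "singlePort"):
            if not config.get(key):
                problems.append(f"{key} is required when cluster is false")

    # batchSize may be given as a number or a numeric string.
    try:
        if int(config.get("batchSize", 1000)) <= 0:
            problems.append("batchSize must be positive")
    except (TypeError, ValueError):
        problems.append("batchSize must be an integer")

    return problems


standalone = {
    "destination": "example", "username": "", "password": "",
    "cluster": False, "singleHostname": "127.0.0.1", "singlePort": 11111,
}
print(validate_canal_config(standalone))  # → []
```

Returning a list rather than raising on the first error lets a caller report every missing property in one pass.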

## Example

Before using the Canal connector, you can create a configuration file through one of the following methods.

* JSON

```json
{
    "zkServers": "127.0.0.1:2181",
    "batchSize": "5120",
    "destination": "example",
    "username": "",
    "password": "",
    "cluster": false,
    "singleHostname": "127.0.0.1",
    "singlePort": "11111"
}
```

* YAML

You can create a YAML file and copy the [contents](https://github.com/apache/pulsar/blob/master/pulsar-io/canal/src/main/resources/canal-mysql-source-config.yaml) below to your YAML file.

```yaml
configs:
    zkServers: "127.0.0.1:2181"
    batchSize: 5120
    destination: "example"
    username: ""
    password: ""
    cluster: false
    singleHostname: "127.0.0.1"
    singlePort: 11111
```
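
Hand-edited JSON is easy to break, for example with a stray trailing comma, so one option is to generate the file from a dictionary. The sketch below uses only the Python standard library. The function name `render_canal_config` and its parameters are hypothetical, and it assumes standalone mode (`cluster` set to false).

```python
import json


def render_canal_config(single_hostname, single_port, destination,
                        username="", password="", batch_size=5120):
    """Build a standalone-mode Canal source config and return it as JSON text."""
    config = {
        "zkServers": "",            # left empty: assumed unused in standalone mode
        "batchSize": batch_size,
        "destination": destination,
        "username": username,
        "password": password,
        "cluster": False,
        "singleHostname": single_hostname,
        "singlePort": single_port,
    }
    # json.dumps always emits syntactically valid JSON.
    return json.dumps(config, indent=2)


text = render_canal_config("127.0.0.1", 11111, "example")
print(text)
```

The returned text can be written to a file of your choice and passed to the connector in place of a hand-written configuration.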

# Usage

The following example streams MySQL changes into a Pulsar topic using a configuration file like the one above.

1. Start a MySQL server.

```bash
$ docker pull mysql:5.7
$ docker run -d -it --rm --name pulsar-mysql -p 3306:3306 -e MYSQL_ROOT_PASSWORD=canal -e MYSQL_USER=mysqluser -e MYSQL_PASSWORD=mysqlpw mysql:5.7
```

2. Create a configuration file `mysqld.cnf`.

```ini
[mysqld]
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
datadir = /var/lib/mysql
#log-error = /var/log/mysql/error.log
# By default we only accept connections from localhost
#bind-address = 127.0.0.1
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
log-bin=mysql-bin
binlog-format=ROW
server_id=1
```

3. Copy the configuration file `mysqld.cnf` to the MySQL server.

```bash
$ docker cp mysqld.cnf pulsar-mysql:/etc/mysql/mysql.conf.d/
```

4. Restart the MySQL server.

```bash
$ docker restart pulsar-mysql
```

5. Create a test database in the MySQL server.

```bash
$ docker exec -it pulsar-mysql /bin/bash
$ mysql -h 127.0.0.1 -uroot -pcanal -e 'create database test;'
```

6. Start a Canal server and connect it to the MySQL server.

```bash
$ docker pull canal/canal-server:v1.1.2
$ docker run -d -it --link pulsar-mysql -e canal.auto.scan=false -e canal.destinations=test -e canal.instance.master.address=pulsar-mysql:3306 -e canal.instance.dbUsername=root -e canal.instance.dbPassword=canal -e canal.instance.connectionCharset=UTF-8 -e canal.instance.tsdb.enable=true -e canal.instance.gtidon=false --name=pulsar-canal-server -p 8000:8000 -p 2222:2222 -p 11111:11111 -p 11112:11112 -m 4096m canal/canal-server:v1.1.2
```

7. Start Pulsar standalone.

```bash
$ docker pull apachepulsar/pulsar:2.3.0
$ docker run -d -it --link pulsar-canal-server -p 6650:6650 -p 8080:8080 -v $PWD/data:/pulsar/data --name pulsar-standalone apachepulsar/pulsar:2.3.0 bin/pulsar standalone
```

8. Modify the configuration file `canal-mysql-source-config.yaml`.

```yaml
configs:
    zkServers: ""
    batchSize: "5120"
    destination: "test"
    username: ""
    password: ""
    cluster: false
    singleHostname: "pulsar-canal-server"
    singlePort: "11111"
```

9. Create a consumer file `pulsar-client.py`.

```python
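import pulsar

# Connect to the standalone broker and subscribe to the connector's topic.
client = pulsar.Client('pulsar://localhost:6650')
consumer = client.subscribe('my-topic',
                            subscription_name='my-sub')

try:
    # Print and acknowledge messages until interrupted with Ctrl+C.
    while True:
        msg = consumer.receive()
        print("Received message: '%s'" % msg.data())
        consumer.acknowledge(msg)
except KeyboardInterrupt:
    pass
finally:
    client.close()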
```

10. Copy the configuration file `canal-mysql-source-config.yaml` and the consumer file `pulsar-client.py` to the Pulsar server.

```bash
$ docker cp canal-mysql-source-config.yaml pulsar-standalone:/pulsar/conf/
$ docker cp pulsar-client.py pulsar-standalone:/pulsar/
```

11. Download a Canal connector and start it.

```bash
$ docker exec -it pulsar-standalone /bin/bash
$ wget https://archive.apache.org/dist/pulsar/pulsar-2.3.0/connectors/pulsar-io-canal-2.3.0.nar -P connectors
$ ./bin/pulsar-admin source localrun \
--archive ./connectors/pulsar-io-canal-2.3.0.nar \
--classname org.apache.pulsar.io.canal.CanalStringSource \
--tenant public \
--namespace default \
--name canal \
--destination-topic-name my-topic \
--source-config-file /pulsar/conf/canal-mysql-source-config.yaml \
--parallelism 1
```

12. Consume data from MySQL.

```bash
$ docker exec -it pulsar-standalone /bin/bash
$ python pulsar-client.py
```

13. Open another terminal window and log in to the MySQL server.

```bash
$ docker exec -it pulsar-mysql /bin/bash
$ mysql -h 127.0.0.1 -uroot -pcanal
```

14. Create a table, then insert, update, and delete data in the MySQL server.

```bash
mysql> use test;
mysql> show tables;
mysql> CREATE TABLE IF NOT EXISTS `test_table`(`test_id` INT UNSIGNED AUTO_INCREMENT,`test_title` VARCHAR(100) NOT NULL,
`test_author` VARCHAR(40) NOT NULL,
`test_date` DATE,PRIMARY KEY ( `test_id` ))ENGINE=InnoDB DEFAULT CHARSET=utf8;
mysql> INSERT INTO test_table (test_title, test_author, test_date) VALUES("a", "b", NOW());
mysql> UPDATE test_table SET test_title='c' WHERE test_title='a';
mysql> DELETE FROM test_table WHERE test_title='c';
```
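
The consumer in step 9 prints each message as raw bytes. The sketch below shows one way to make that output easier to read, assuming the connector emits UTF-8 text that may be JSON. The source does not state the payload format, so the helper falls back to plain text. The name `decode_payload` and the sample payload are hypothetical.

```python
import json


def decode_payload(data):
    """Decode raw message bytes into text and, when possible, parsed JSON."""
    text = data.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except ValueError:
        # Not JSON: return the decoded text unchanged.
        return text


# Made-up payload for illustration only; real messages may look different.
sample = b'{"table": "test_table"}'
print(decode_payload(sample))
```

In the consumer loop, `print(decode_payload(msg.data()))` could replace the existing `print` call.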
@@ -4,14 +4,14 @@ author: ["ASF"]
contributors: ["ASF"]
language: Java
document:
- source: "https://github.com/apache/pulsar/tree/v3.0.6/pulsar-io/cassandra"
+ source: "https://github.com/apache/pulsar/tree/v3.0.7/pulsar-io/cassandra"
license: Apache License 2.0
tags: ["Pulsar IO", "Cassandra", "Sink"]
alias: Cassandra Sink
features: ["Use Cassandra sink connector to sync data from Pulsar"]
license_link: "https://www.apache.org/licenses/LICENSE-2.0"
icon: "/images/connectors/cassandra-sink.png"
- download: "https://archive.apache.org/dist/pulsar/pulsar-3.0.6/connectors/pulsar-io-cassandra-3.0.6.nar"
+ download: "https://archive.apache.org/dist/pulsar/pulsar-3.0.7/connectors/pulsar-io-cassandra-3.0.7.nar"
support: StreamNative
support_link: https://streamnative.io
support_img: "/images/streamnative.png"
70 changes: 70 additions & 0 deletions connectors/cassandra-sink/v4.0.0/cassandra-sink.md
@@ -0,0 +1,70 @@
---
description: The Cassandra sink connector pulls messages from Pulsar topics to Cassandra clusters
author: ["ASF"]
contributors: ["ASF"]
language: Java
document:
source: "https://github.com/apache/pulsar/tree/v4.0.0/pulsar-io/cassandra"
license: Apache License 2.0
tags: ["Pulsar IO", "Cassandra", "Sink"]
alias: Cassandra Sink
features: ["Use Cassandra sink connector to sync data from Pulsar"]
license_link: "https://www.apache.org/licenses/LICENSE-2.0"
icon: "/images/connectors/cassandra-sink.png"
download: "https://archive.apache.org/dist/pulsar/pulsar-4.0.0/connectors/pulsar-io-cassandra-4.0.0.nar"
support: StreamNative
support_link: https://streamnative.io
support_img: "/images/streamnative.png"
owner_name: ""
owner_img: ""
dockerfile:
id: "cassandra-sink"
---

The Cassandra sink connector pulls messages from Pulsar topics to Cassandra clusters.

# Configuration

The configuration of the Cassandra sink connector has the following properties.

## Property

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| `roots` | String | true | "" (empty string) | A comma-separated list of Cassandra hosts to connect to. |
| `keyspace` | String | true | "" (empty string) | The keyspace used for writing Pulsar messages. <br><br>**Note: `keyspace` must be created before the Cassandra sink is started.** |
| `keyname` | String | true | "" (empty string) | The key name of the Cassandra column family. <br><br>The column is used for storing Pulsar message keys. <br><br>If a Pulsar message has no key, the message value is used as the key. |
| `columnFamily` | String | true | "" (empty string) | The Cassandra column family name.<br><br>**Note: `columnFamily` must be created before the Cassandra sink is started.** |
| `columnName` | String | true | "" (empty string) | The column name of the Cassandra column family.<br><br>The column is used for storing Pulsar message values. |

## Example

Before using the Cassandra sink connector, you need to create a configuration file through one of the following methods.

* JSON

```json
{
    "roots": "localhost:9042",
    "keyspace": "pulsar_test_keyspace",
    "columnFamily": "pulsar_test_table",
    "keyname": "key",
    "columnName": "col"
}
```

* YAML

```yaml
configs:
    roots: "localhost:9042"
    keyspace: "pulsar_test_keyspace"
    columnFamily: "pulsar_test_table"
    keyname: "key"
    columnName: "col"
```
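
Because the keyspace and column family must exist before the sink starts, it can help to derive the setup statements from the same values used in the configuration. The sketch below builds CQL strings only and does not connect to Cassandra. The helper name `cassandra_setup_statements`, the `text` column types, and the `SimpleStrategy` replication settings are assumptions suited to a single-node test setup.

```python
def cassandra_setup_statements(keyspace, column_family, keyname, column_name,
                               replication_factor=1):
    """Return CQL that creates the keyspace and table the sink writes to."""
    create_keyspace = (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = "
        f"{{'class': 'SimpleStrategy', "
        f"'replication_factor': {replication_factor}}};"
    )
    # Assumption: both the key column and the value column are stored as text.
    create_table = (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{column_family} "
        f"({keyname} text PRIMARY KEY, {column_name} text);"
    )
    return [create_keyspace, create_table]


statements = cassandra_setup_statements(
    "pulsar_test_keyspace", "pulsar_test_table", "key", "col")
for statement in statements:
    print(statement)
```

The printed statements can be run in `cqlsh` before the sink is created.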


# Usage

For more information about **how to connect Pulsar with Cassandra**, see the [Pulsar IO quickstart](https://pulsar.apache.org/docs/en/next/io-quickstart/#connect-pulsar-to-cassandra).
@@ -4,14 +4,14 @@ author: ["ASF"]
contributors: ["ASF"]
language: Java
document:
- source: "https://github.com/apache/pulsar/tree/v3.0.6/pulsar-io/debezium/mongodb"
+ source: "https://github.com/apache/pulsar/tree/v3.0.7/pulsar-io/debezium/mongodb"
license: Apache License 2.0
tags: ["Pulsar IO", "Debezium", "Source"]
alias: Debezium MongoDB Source
features: ["Use Debezium MongoDB source connector to sync data to Pulsar"]
license_link: "https://www.apache.org/licenses/LICENSE-2.0"
icon: "/images/connectors/debezium.png"
- download: "https://github.com/streamnative/pulsar/releases/download/v3.0.6.1/pulsar-io-debezium-mongodb-3.0.6.1.nar"
+ download: "https://github.com/streamnative/pulsar/releases/download/v3.0.7.1/pulsar-io-debezium-mongodb-3.0.7.1.nar"
support: StreamNative
support_link: https://streamnative.io
support_img: "/images/streamnative.png"