This repository has been archived by the owner on Feb 16, 2024. It is now read-only.

[BAHIR-234] Add ClickHouse Connector for Flink #85

Open
wants to merge 2 commits into
base: master

@pyscala pyscala commented May 28, 2020

Implement a streaming ClickHouseSink and support the Flink Table API and Flink SQL
for the ClickHouse connector.

Author

pyscala commented May 28, 2020

@lresende @mbalassi I would appreciate your time reviewing this PR. Thanks!

Member

@eskabetxe eskabetxe left a comment

Thanks @pyscala for the contribution :)

I have added a few comments.

flink-connector-clickhouse/pom.xml: two outdated, resolved review threads
</dependency>
</dependencies>

<build>
Member

These are already in the parent pom, no?

/**
* Created by liufangliang on 2020/4/16.
*/
public class ClickHouseTableSinkTest {
Member

all methods are empty

Contributor

t0t07 commented Mar 18, 2021

@eskabetxe Are there any remaining concerns about the PR? I'd love to see it merged so that I can use it at work :)

Member

@eskabetxe eskabetxe left a comment

You should revise your tests; they appear to run against a local ClickHouse instance.

<dependency>
<groupId>ru.yandex.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
<version>0.1.50</version>
Member

@eskabetxe eskabetxe Mar 18, 2021

This version should be defined in properties so that users can change it.

Author

Thanks for your reply, I will fix it later.


public class ClickHouseAppendSinkFunction extends RichSinkFunction<Row> implements CheckpointedFunction {
private static final String USERNAME = "user";
private static final String PASSWORD = "password";
Member

indentation

try {
Thread.sleep(retryInterval);
} catch (InterruptedException e) {
e.printStackTrace();
Member

If needed, log the exception instead of printing the stack trace.
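
A minimal sketch of what this comment could look like in practice, assuming a small helper around the retry sleep. The class and method names are hypothetical, and `java.util.logging` is used only to keep the example self-contained; the connector would more likely use the project's own logger.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper; the names are illustrative and not taken from the PR. */
final class RetryPause {
    private static final Logger LOG = Logger.getLogger(RetryPause.class.getName());

    /**
     * Sleeps between retries. Returns false if the thread was interrupted,
     * so the caller can stop retrying.
     */
    static boolean pause(long retryIntervalMillis) {
        try {
            Thread.sleep(retryIntervalMillis);
            return true;
        } catch (InterruptedException e) {
            // Log through a logger rather than e.printStackTrace().
            LOG.log(Level.WARNING, "Interrupted while waiting to retry the insert", e);
            // Restore the interrupt flag so callers further up can react to it.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Restoring the interrupt flag matters here because swallowing the interruption would hide a cancellation request from the surrounding task.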

try {
copy = new ClickHouseTableSink(address, username, password, database, table, schema, batchSize, commitPadding, retries, retryInterval, ignoreInsertError);
} catch (Exception e) {
throw new RuntimeException(e);
Member

Add an error message.
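
One way to address this, sketched under assumptions: wrap the failure with a message that names what was being created, and keep the original exception as the cause. `SinkFactory` and `createOrFail` are hypothetical stand-ins for the `ClickHouseTableSink` constructor call, and the message wording is only a suggestion.

```java
/** Hypothetical sketch; not the PR's code. */
final class SinkCopySketch {

    /** Stand-in for a constructor call that may throw a checked exception. */
    interface SinkFactory<T> {
        T create() throws Exception;
    }

    /** Creates the sink or fails with a descriptive message and the original cause. */
    static <T> T createOrFail(SinkFactory<T> factory, String database, String table) {
        try {
            return factory.create();
        } catch (Exception e) {
            throw new RuntimeException(
                    String.format("Failed to create ClickHouse table sink for %s.%s", database, table), e);
        }
    }
}
```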

String[] fieldNames = tableSchema.getFieldNames();
String columns = String.join(",", fieldNames);
String[] questionMark = new String[fieldNames.length];
for (int i = 0; i < questionMark.length; i++) {
Member

This could be changed to a stream map, no?

String questionMark = Arrays.stream(fieldNames)
    .map(field -> "?")
    .reduce((left, right) -> left + "," + right)
    .get();
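
A variant of the suggestion above, as a sketch: `Collectors.joining` produces the same comma-separated placeholders and returns an empty string for an empty field list, where `reduce(...).get()` would throw `NoSuchElementException`. The class name, method names, and the exact shape of the INSERT statement are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

/** Illustrative helper; the names are hypothetical and not taken from the PR. */
final class InsertSqlSketch {

    /** Builds one "?" placeholder per column, joined with commas. */
    static String placeholders(String[] fieldNames) {
        return Arrays.stream(fieldNames)
                .map(field -> "?")
                .collect(Collectors.joining(","));
    }

    /** Assumed statement shape: INSERT INTO table (columns) VALUES (placeholders). */
    static String insertSql(String table, String[] fieldNames) {
        String columns = String.join(",", fieldNames);
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + placeholders(fieldNames) + ")";
    }
}
```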

private PreparedStatement pstat;

@Test
public void open() throws Exception {
Member

empty test

for (int i = 0; i < value.getArity(); i++) {
pstat.setObject(i + 1, value.getField(i));
}

Member

what is the test?

.retryInterval(3000L);
Map<String, String> connectorProperties = clickhouse.toConnectorProperties();
for (Map.Entry<String, String> entry : connectorProperties.entrySet()) {
System.out.println(entry.getKey() + ":" + entry.getValue());
Member

We should assert that the map contains what you expect rather than print it to the console.
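
A rough sketch of the kind of check meant here, in plain Java so it stays self-contained. The `toConnectorProperties` method below is a hypothetical stand-in for the descriptor's method, and the key names are invented for illustration; in the real test, JUnit's `assertEquals(expected, actual)` would replace the hand-written comparison.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; the property keys are made up and not the connector's real keys. */
final class ConnectorPropertiesSketch {

    /** Stand-in for clickhouse.toConnectorProperties(). */
    static Map<String, String> toConnectorProperties(String table, long retryInterval) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("connector.table", table);
        props.put("connector.retry-interval", Long.toString(retryInterval));
        return props;
    }

    /** Plain-Java equivalent of assertEquals(expected, actual) for maps. */
    static void assertSameProperties(Map<String, String> expected, Map<String, String> actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}
```

Comparing the whole map at once also catches unexpected extra keys, which a loop that prints entries cannot do.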

Author

@eskabetxe Thanks for your reply. The related changes have been completed; please review again.


@Test
public void createStreamTableSink() throws Exception {

Member

Are you testing against a local ClickHouse?
We should use a Testcontainers container that creates a ClickHouse instance for the tests.

liufangliang and others added 2 commits March 26, 2021 10:09
Implement Streaming ClickHouseSink,support Flink Table API & Flink SQL
for ClickHouse connector
Author

pyscala commented Mar 26, 2021

@eskabetxe Optimization complete, looking forward to your reply.

@pyscala pyscala requested a review from eskabetxe March 26, 2021 03:05

gj-zhang commented Jul 2, 2021

How is it going?

How to contribute code?


bakey commented Jan 12, 2022

Is this PR still alive?

5 participants