[Bug] mongo cdc when type is boolean , use starrocks(paimon catalog) query with wrong value #4683

Open
1 of 2 tasks
bulolo opened this issue Dec 10, 2024 · 0 comments
Labels
bug Something isn't working

bulolo commented Dec 10, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

paimon 0.9
starrocks 3.3.7
flink 1.20

Compute Engine

paimon mongo cdc
starrocks

Minimal reproduce step

  • mongodb
db.getCollection("manufacturers").find({}, {
    _id: 0,
    slug: 1,
    isOverseas: 1
}).sort({
    slug: 1
}).limit(3)
slug	isOverseas
AA	true
AADY	
AAL	true

  • paimon mongo cdc
./bin/flink run -d \
    ./lib/paimon-flink-action-0.9.0.jar \
    mongodb_sync_database \
    --warehouse s3://lakehouse-1253767413/paimon \
    --database biocitydb \
    --mongodb_conf hosts=172.16.0.11:27017 \
    --mongodb_conf username=mongouser \
    --mongodb_conf password=XXX \
    --mongodb_conf database=biocitydb \
    --catalog_conf s3.endpoint=cos.ap-guangzhou.myqcloud.com \
    --catalog_conf s3.access-key=XXX \
    --catalog_conf s3.secret-key=XXX \
    --table-conf bucket=1 \
    --including_tables 'manufacturers' \
    --table_prefix ods_
  • starrocks
CREATE EXTERNAL CATALOG paimon
PROPERTIES
(
    "type" = "paimon",
    "paimon.catalog.type" = "filesystem",
    "paimon.catalog.warehouse" = "s3://lakehouse-1253767413/paimon",
    "aws.s3.endpoint" = "cos.ap-guangzhou.myqcloud.com",
    "aws.s3.access_key" = "XXX",
    "aws.s3.secret_key" = "XXX"
);
SELECT
	slug,isOverseas
FROM
	paimon.biocitydb.ods_manufacturers
	order by slug asc
	limit 3
slug	isOverseas
AA	
AADY	
AAL	

  • query from paimon parquet file (screenshot not preserved)

What doesn't meet your expectations?

The StarRocks query should return the same values as MongoDB, as strings (for example `true`).

Anything else?

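Observed pattern (not fully confirmed): whenever a MongoDB field of bool type contains N/A values, then after syncing to Paimon and querying in StarRocks, every value returned for that field is NULL.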

Example: in MongoDB, isOverseas for AADY is N/A, so the data contains N/A values:

slug	isOverseas
AA	true
AADY	
AAL	true

When queried through the StarRocks Paimon catalog, all of them come back as NULL:

slug	isOverseas
AA	
AADY	
AAL	

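Only when every document has a value for the MongoDB boolean field does the Paimon catalog query return the boolean strings correctly (and after a while the results become wrong again).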
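
To help confirm the pattern above, here is a minimal sketch that tallies how the field is populated across documents. `summarizeFieldTypes` is a hypothetical helper, not part of MongoDB, Paimon, or StarRocks. The sample data mirrors the three rows from the reproduction, and it assumes that "N/A" means the field is absent from the AADY document rather than set to null.

```javascript
// Hypothetical helper: count how a field is populated across documents,
// distinguishing absent fields, explicit nulls, and typed values.
function summarizeFieldTypes(docs, field) {
  const counts = {};
  for (const doc of docs) {
    let kind;
    if (!Object.prototype.hasOwnProperty.call(doc, field)) {
      kind = "missing"; // field not present in the document
    } else if (doc[field] === null) {
      kind = "null"; // field present but explicitly null
    } else {
      kind = typeof doc[field]; // e.g. "boolean", "string"
    }
    counts[kind] = (counts[kind] || 0) + 1;
  }
  return counts;
}

// The three sample rows from the reproduction step.
// Assumption: AADY has no isOverseas field at all.
const sample = [
  { slug: "AA", isOverseas: true },
  { slug: "AADY" },
  { slug: "AAL", isOverseas: true },
];

console.log(summarizeFieldTypes(sample, "isOverseas"));
// → { boolean: 2, missing: 1 }
```

In mongosh the same function could be run against the real collection, for example `summarizeFieldTypes(db.getCollection("manufacturers").find({}, { isOverseas: 1 }).toArray(), "isOverseas")`. A non-zero `missing` or `null` count alongside `boolean` would match the condition described above.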

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@bulolo bulolo added the bug Something isn't working label Dec 10, 2024
@bulolo bulolo changed the title [Bug] mongo cdc when type is boolean , use paimon catalog query wrong value [Bug] mongo cdc when type is boolean , use starrocks(paimon catalog) query with wrong value Dec 10, 2024