doris拆分查询反而能获得三倍的提升 #10535
howardyan93
started this conversation in
General
doris拆分查询反而能获得三倍的提升
#10535
Replies: 1 comment
-
doris这种类似es的全文检索,我们正在开发,现在你这个case,第一种查询是全量的扫描,然后模糊匹配,效率肯定很低,而且你是读取了所有字段 |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
如下一个表
create table test(
ctime DATETIME,
field1 INT,
field2 STRING,
....
field9 STRING
)
ENGINE=OLAP
DUPLICATE KEY(
ctime
,fields1') PARTITION BY RANGE(
ctime)()DISTRIBUTED BY HASH(field1)
PROPERTIES
(
....
);
里面保存了20亿数据。
其中field9字段包含某个关键词keyword有个十几条。
如果使用like查询,需要将近6分钟
select * from test where field9 like '%keyword%'。
但是我发现,如果将查询拆分一下,速度反而能加快:
select ctime from test where field9 like '%keyword%'。
结果为t1, t2, t3... tN. 再使用In查询:
select * from test where ctime in (t1, t2, .... tN) and field9 like '%keyword%';
速度会缩短为2分钟。
但是如果使用如下查询:
select * from test where ctime in (
select ctime from test where field9 like '%keyword%'
) and field9 like '%keyword%';
时间还是6分钟左右。
虽然doris是列式存储,查询字段越少速度越快。但是3倍的查询差距仍让人迷惑。
ps: 开启了sql cache和partition cache。
Beta Was this translation helpful? Give feedback.
All reactions