I ran a TPC-DS performance test on Trino with the data stored in two formats, Parquet and Paimon. The query takes 3 seconds against the Parquet data and 60 seconds against the Paimon data. The query SQL:
select
i_item_id
,i_item_desc
,s_store_id
,s_store_name
,sum(ss_net_profit) as store_sales_profit
,sum(sr_net_loss) as store_returns_loss
,sum(cs_net_profit) as catalog_sales_profit
from
store_sales
,store_returns
,catalog_sales
,date_dim d1
,date_dim d2
,date_dim d3
,store
,item
where
d1.d_moy = 4
and d1.d_year = 1998
and d1.d_date_sk = ss_sold_date_sk
and i_item_sk = ss_item_sk
and s_store_sk = ss_store_sk
and ss_customer_sk = sr_customer_sk
and ss_item_sk = sr_item_sk
and ss_ticket_number = sr_ticket_number
and sr_returned_date_sk = d2.d_date_sk
and d2.d_moy between 4 and 10
and d2.d_year = 1998
and sr_customer_sk = cs_bill_customer_sk
and sr_item_sk = cs_item_sk
and cs_sold_date_sk = d3.d_date_sk
and d3.d_moy between 4 and 10
and d3.d_year = 1998
group by
i_item_id
,i_item_desc
,s_store_id
,s_store_name
order by
i_item_id
,i_item_desc
,s_store_id
,s_store_name
limit 100;
The store_sales table contains 100 GB of data. With the Parquet format, the query reads only the data that meets the partition conditions; with the Paimon format, it reads all of the data.
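One way to confirm the difference in scanned data is to run a reduced version of the query under `EXPLAIN ANALYZE` on each catalog and compare the input rows and bytes reported for the store_sales scan. The sketch below makes two assumptions that the report does not state: the catalog and schema names (`paimon.tpcds`) are hypothetical placeholders, and store_sales is assumed to be partitioned by ss_sold_date_sk. Because the date filter reaches store_sales only through the join with date_dim, pruning in this query would depend on the filter being pushed to the scan at run time.

```sql
-- Hypothetical names: replace paimon.tpcds with your own catalog and schema,
-- then repeat against the catalog that holds the Parquet tables.
-- Assumption: store_sales is partitioned by ss_sold_date_sk.
explain analyze
select
sum(ss_net_profit) as store_sales_profit
from
paimon.tpcds.store_sales
,paimon.tpcds.date_dim d1
where
d1.d_moy = 4
and d1.d_year = 1998
and d1.d_date_sk = ss_sold_date_sk;
```

If the store_sales scan in the Paimon plan reports roughly the full table size as input while the Parquet plan reports only a fraction, that would be consistent with the behaviour described above.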