Rubix caching: frequent cache evictions result in failed queries #3580

losipiuk · 2020-04-29T11:05:45Z

The test setup is the same as in #3494, except that rubix.cache.usage.percentage was set to 2. This translates to a cache size of ~10GB.
The cluster was exercised with the same query as in https://github.com/prestosql/presto/issues/3494, which reads the whole lineitem table (171GB).
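
For a rough sense of scale, the numbers above can be worked through as below. This is only a back-of-the-envelope sketch: it assumes the percentage is applied to the total capacity of the cache disk, which this issue does not state, and the helper names are hypothetical. The issue also does not say how many workers share the scan, so the ratio is for the whole table against a single ~10GB cache.

```python
def implied_disk_capacity_gb(cache_size_gb, usage_percentage):
    """Disk capacity implied by a cache size, ASSUMING the percentage
    is taken of the total disk capacity (not confirmed by the issue)."""
    return cache_size_gb * 100 / usage_percentage


def scan_to_cache_ratio(table_size_gb, cache_size_gb):
    """How many times larger the scanned data is than the cache."""
    return table_size_gb / cache_size_gb


if __name__ == "__main__":
    # Figures quoted in the issue: ~10GB cache at 2%, 171GB lineitem table.
    print(implied_disk_capacity_gb(10, 2))
    print(round(scan_to_cache_ratio(171, 10), 1))
```

Under that assumption, the scanned table is many times larger than the cache, so continuous eviction during a full scan is to be expected.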

With this configuration, many cache evictions happen throughout query execution. This in itself is fine and expected. What is unexpected is that queries fail as a result:

[ip-192-168-20-199:~/workspace/tmp] [] ⌘ for i in `seq 1 50`; do /Users/lukaszos/workspace/repos/prestosql/presto/presto-cli/target/presto-cli-*-executable.jar -f count_lineitem.sql; done
Query 20200429_103854_00000_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_103913_00001_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_103925_00002_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_103931_00003_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_103941_00004_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
Query 20200429_103945_00005_t8rib failed: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3]
Query 20200429_103952_00006_t8rib failed: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_df899620-c77b-4012-a32c-1eb8cd1119ef]
Query 20200429_103956_00007_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
Query 20200429_103959_00008_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104002_00009_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104005_00010_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104015_00011_t8rib failed: Malformed ORC file. Invalid stripe row index [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0]
Query 20200429_104016_00012_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104021_00013_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104023_00014_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104026_00015_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104030_00016_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
Query 20200429_104032_00017_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104037_00018_t8rib failed: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3]
Query 20200429_104051_00019_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104053_00020_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104058_00021_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104108_00022_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104120_00023_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104133_00024_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104134_00025_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3

Coordinator and worker logs are attached:
coordinator.log.gz
worker.log.gz

Note: Presto was built using Rubix 0.3.6 with qubole/rubix#363 applied.

Eventually the system ended up in the same state as in #3524.

cc: @stagraqubole

shubhamtagra · 2020-05-05T04:21:26Z

We have found two problems with eviction:

1. Query cancellation triggers evictions if the file is being read from the cache. This will be fixed by the first commit in "Fix parallel warmup" (qubole/rubix#368), which removes the unnecessary evictions.
2. Invalidations can cause errors in reads that are happening in parallel. This usually happens when the disk available to Rubix is small: evictions are LRU, and on a decently sized disk a file that is being read is rarely also an LRU candidate. We have identified a solution for this too, and work is underway in "Invalidations can cause corrupted reads" (qubole/rubix#372).
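
The second problem, a file being evicted while a parallel read is still using it, can be illustrated with a toy model. The sketch below is not Rubix code and is not necessarily the approach taken in qubole/rubix#372. It only shows one common way to avoid this class of race: counting active readers per entry and letting LRU eviction skip entries that are in use. The class and method names are hypothetical, and locking is omitted for brevity.

```python
from collections import OrderedDict


class PinnedLruCache:
    """Toy LRU cache that never evicts an entry with active readers."""

    def __init__(self, capacity):
        self.capacity = capacity          # maximum number of entries
        self.entries = OrderedDict()      # key -> data, least recently used first
        self.readers = {}                 # key -> number of active readers

    def put(self, key, data):
        self.entries[key] = data
        self.entries.move_to_end(key)     # mark as most recently used
        self.readers.setdefault(key, 0)
        self._evict()

    def open(self, key):
        """Start reading an entry; it is pinned until close() is called."""
        data = self.entries[key]
        self.entries.move_to_end(key)
        self.readers[key] += 1
        return data

    def close(self, key):
        """Finish reading an entry, making it evictable again."""
        self.readers[key] -= 1
        self._evict()

    def _evict(self):
        # Walk from least to most recently used, skipping pinned entries.
        for key in list(self.entries):
            if len(self.entries) <= self.capacity:
                break
            if self.readers[key] == 0:
                del self.entries[key]
                del self.readers[key]


if __name__ == "__main__":
    cache = PinnedLruCache(capacity=2)
    cache.put("a", b"1")
    cache.open("a")                       # "a" is now being read
    cache.put("b", b"2")
    cache.put("c", b"3")                  # over capacity: "a" is oldest but pinned
    print(list(cache.entries))
    cache.close("a")
    cache.put("d", b"4")                  # "a" is unpinned and can be evicted
    print(list(cache.entries))
```

With a small cache, the least recently used entry is often one that is still being read, which is exactly the situation where skipping pinned entries matters. With a large cache, the oldest entry is rarely in use, which matches the observation that the problem mostly shows up on small disks.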

sopel39 · 2020-05-12T11:30:48Z

@stagraqubole What are the recommended workarounds for this? Is it sufficient to have a large enough disk?

shubhamtagra · 2020-05-13T04:07:07Z

Yes, even a modestly sized disk will minimise the chances of hitting this, because the LRU files would not be in the middle of a read at the time of eviction.

losipiuk · 2020-08-18T15:50:12Z

Fixed with #4551

losipiuk added a commit to losipiuk/trino that referenced this issue Jul 15, 2020

Update Rubix version to 0.3.13

e345bdc

New version is a bugfix release. Among other it includes fix for trinodb#3580

losipiuk mentioned this issue Jul 15, 2020

Bump Rubix version to 0.3.15 #4445

Closed

losipiuk mentioned this issue Jul 23, 2020

Update Rubix version to 0.3.16 #4551

Merged

losipiuk added a commit to losipiuk/trino that referenced this issue Jul 23, 2020

Update Rubix version to 0.3.15

47e24d1

New version is a bugfix release. Among other it includes fix for trinodb#3580

losipiuk added a commit to losipiuk/trino that referenced this issue Aug 14, 2020

Update Rubix version to 0.3.16

8365af1

New version is a bugfix release. Among other it includes fix for trinodb#3580

losipiuk added a commit that referenced this issue Aug 14, 2020

Update Rubix version to 0.3.16

9bc0116

New version is a bugfix release. Among other it includes fix for #3580

JamesRTaylor pushed a commit to lyft/presto that referenced this issue Aug 17, 2020

Update Rubix version to 0.3.16

d7729f2

New version is a bugfix release. Among other it includes fix for trinodb#3580

JamesRTaylor mentioned this issue Aug 17, 2020

Update Rubix version to 0.3.16 lyft/presto#55

Merged

JamesRTaylor pushed a commit to lyft/presto that referenced this issue Aug 17, 2020

Update Rubix version to 0.3.16 (#55)

549da69

New version is a bugfix release. Among other it includes fix for trinodb#3580 Co-authored-by: Łukasz Osipiuk <[email protected]>

losipiuk closed this as completed Aug 18, 2020

losipiuk mentioned this issue Sep 3, 2020

Release notes for 341 #4755

Closed

9 tasks

martint added this to the 341 milestone Sep 9, 2020
