Rubix caching: frequent cache evictions result in failed queries #3580

Closed
losipiuk opened this issue Apr 29, 2020 · 4 comments
@losipiuk
Member

The test setup is the same as in #3494, but rubix.cache.usage.percentage was set to 2, which translates to a cache size of ~10GB.
The cluster was exercised with the same query as in https://github.com/prestosql/presto/issues/3494, which reads the whole lineitem table (171GB).
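
For a rough sense of scale, the figures above can be turned into a back-of-the-envelope calculation. This is only a sketch: it assumes that ~10GB is exactly 2% of the disk capacity visible to Rubix and that the cache size and table size refer to the same scope (the issue does not say whether the cache figure is per node or cluster-wide). The function names are made up for illustration.

```python
# Back-of-the-envelope sizing from the numbers quoted in this issue.
# Assumption: ~10GB is exactly 2% of the disk capacity visible to Rubix.
CACHE_USAGE_PERCENTAGE = 2   # rubix.cache.usage.percentage
CACHE_SIZE_GB = 10           # approximate cache size reported above
TABLE_SIZE_GB = 171          # size of lineitem read by the query


def implied_disk_capacity_gb(cache_size_gb: float, usage_percentage: float) -> float:
    """Disk capacity implied by a cache size and a usage percentage."""
    return cache_size_gb * 100 / usage_percentage


def cache_turnovers(table_size_gb: float, cache_size_gb: float) -> float:
    """How many times one full table scan could replace the cache content."""
    return table_size_gb / cache_size_gb


if __name__ == "__main__":
    disk = implied_disk_capacity_gb(CACHE_SIZE_GB, CACHE_USAGE_PERCENTAGE)
    turnovers = cache_turnovers(TABLE_SIZE_GB, CACHE_SIZE_GB)
    print(f"implied disk capacity: ~{disk:.0f} GB")
    print(f"cache turnovers per full scan: ~{turnovers:.1f}")
```

Under these assumptions a single scan of lineitem is many times larger than the cache, which is consistent with evictions happening continuously during the query.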

With this configuration, many cache evictions happen throughout query execution. That in itself is fine and expected. What is unexpected is that queries fail as a result:

[ip-192-168-20-199:~/workspace/tmp] [] ⌘ for i in `seq 1 50`; do /Users/lukaszos/workspace/repos/prestosql/presto/presto-cli/target/presto-cli-*-executable.jar -f count_lineitem.sql; done
Query 20200429_103854_00000_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_103913_00001_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_103925_00002_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_103931_00003_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_103941_00004_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
Query 20200429_103945_00005_t8rib failed: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3]
Query 20200429_103952_00006_t8rib failed: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_df899620-c77b-4012-a32c-1eb8cd1119ef]
Query 20200429_103956_00007_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
Query 20200429_103959_00008_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104002_00009_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104005_00010_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104015_00011_t8rib failed: Malformed ORC file. Invalid stripe row index [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0]
Query 20200429_104016_00012_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104021_00013_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104023_00014_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104026_00015_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104030_00016_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_19e8c24a-7fc9-4512-a296-555ab4efc0c2
Query 20200429_104032_00017_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104037_00018_t8rib failed: Malformed ORC file. Could not decompress all input (output buffer too small?) [s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3]
Query 20200429_104051_00019_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104053_00020_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_0dbfd7b1-920d-43b6-8179-8c53dab105d0
Query 20200429_104058_00021_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104108_00022_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104120_00023_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104133_00024_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
Query 20200429_104134_00025_t8rib failed: Failed to read ORC file: s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_250fa5e5-8d62-47e0-98fa-98ce39a938b3
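
To see which error kinds and which ORC files dominate output like the above, the failure lines can be tallied with a small script. This is a hypothetical helper, not part of Presto or Rubix; it assumes the `Query <id> failed: <message>` line format shown above, and `SAMPLE` holds three lines copied from that output.

```python
import re
from collections import Counter

# Matches "Query <id> failed: <message>" lines as printed in the output above.
FAILURE_RE = re.compile(r"^Query (\S+) failed: (.*)$")
# The ORC file path appears either after "ORC file: " or inside [...].
S3_PATH_RE = re.compile(r"s3://[^\s\]]+")

_PREFIX = "s3://starburstdata-datasets/tpch-sf1000-ORC/lineitem/20180106_235637_00251_ecxdi_"
SAMPLE = [
    "Query 20200429_103854_00000_t8rib failed: Failed to read ORC file: "
    + _PREFIX + "0dbfd7b1-920d-43b6-8179-8c53dab105d0",
    "Query 20200429_103945_00005_t8rib failed: Malformed ORC file. Could not decompress "
    "all input (output buffer too small?) [" + _PREFIX + "250fa5e5-8d62-47e0-98fa-98ce39a938b3]",
    "Query 20200429_104015_00011_t8rib failed: Malformed ORC file. Invalid stripe row index ["
    + _PREFIX + "0dbfd7b1-920d-43b6-8179-8c53dab105d0]",
]


def tally_failures(lines):
    """Return (failures per error kind, failures per file) as Counters."""
    by_kind, by_file = Counter(), Counter()
    for line in lines:
        match = FAILURE_RE.match(line.strip())
        if not match:
            continue  # not a failure line, e.g. the shell prompt
        message = match.group(2)
        path = S3_PATH_RE.search(message)
        # Error kind = message with the file path and empty brackets removed.
        kind = S3_PATH_RE.sub("", message).replace("[]", "").strip(" :")
        by_kind[kind] += 1
        if path:
            by_file[path.group(0)] += 1
    return by_kind, by_file


if __name__ == "__main__":
    kinds, files = tally_failures(SAMPLE)
    for kind, count in kinds.most_common():
        print(count, kind)
    for path, count in files.most_common():
        print(count, path)
```

Feeding the full CLI output to `tally_failures` (for example via `sys.stdin`) gives a quick view of whether failures cluster on a few files, as they appear to here.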

Logs of coordinator and worker attached:
coordinator.log.gz
worker.log.gz

Note: Presto was built using Rubix 0.3.6 with qubole/rubix#363 applied.

Eventually the system ended up in the same state as in #3524.

cc: @stagraqubole

@shubhamtagra
Member

shubhamtagra commented May 5, 2020

There are two problems with eviction that we have found:

  1. Query cancellation triggers evictions if the file is being read from the cache. This will be fixed by the first commit in "Fix parallel warmup" (qubole/rubix#368), which removes the unnecessary evictions.

  2. Invalidations can cause errors in reads happening in parallel. This usually happens when only a small disk is available to Rubix: evictions are LRU, and on a decently sized disk a file is rarely both being read and an LRU eviction candidate at the same time. We have identified a solution for this too, and work is underway in "Invalidations can cause corrupted reads" (qubole/rubix#372).
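
The second problem can be illustrated with a toy model. The sketch below is not Rubix's implementation and does not claim to be the fix in qubole/rubix#372; it only shows one common way to keep an LRU cache from evicting an entry that is in the middle of a read, namely by pinning it with a reader count. The class and method names are hypothetical.

```python
from collections import OrderedDict


class PinnedLruCache:
    """Toy LRU cache keyed by file path; NOT Rubix's actual data structure."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._entries = OrderedDict()  # path -> data, least recently used first
        self._pins = {}                # path -> number of active readers

    def put(self, path, data):
        self._entries[path] = data
        self._entries.move_to_end(path)
        self._evict_if_needed()

    def open_for_read(self, path):
        """Pin the entry so it cannot be evicted while it is being read."""
        data = self._entries[path]
        self._entries.move_to_end(path)
        self._pins[path] = self._pins.get(path, 0) + 1
        return data

    def close(self, path):
        """Unpin the entry; it becomes an eviction candidate again."""
        self._pins[path] -= 1
        if self._pins[path] == 0:
            del self._pins[path]
        self._evict_if_needed()

    def _evict_if_needed(self):
        # Walk from least to most recently used and skip pinned entries.
        # If every entry is pinned, the cache temporarily exceeds capacity.
        for path in list(self._entries):
            if len(self._entries) <= self.capacity:
                break
            if path not in self._pins:
                del self._entries[path]

    def __contains__(self, path):
        return path in self._entries


if __name__ == "__main__":
    cache = PinnedLruCache(capacity=2)
    cache.put("a", b"A")
    cache.put("b", b"B")
    cache.open_for_read("a")   # a reader starts on "a"
    cache.put("c", b"C")       # evicts "b"; "a" is now the LRU entry
    cache.put("d", b"D")       # "a" is LRU but pinned, so "c" goes instead
    print("a" in cache, "c" in cache, "d" in cache)
    cache.close("a")           # the reader is done
    cache.put("e", b"E")       # now "a" can be evicted
    print("a" in cache)
```

Without the pin check, the `put("d", ...)` step would evict "a" while it is still being read, which is the kind of situation that a small cache makes likely and a larger one makes rare.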

@sopel39
Member

sopel39 commented May 12, 2020

@stagraqubole What are the recommended workarounds for this? Is it sufficient to have a large enough disk?

@shubhamtagra
Member

Yes, even a modestly sized disk will minimise the chances of hitting this, because the least recently used files are unlikely to be in the middle of a read at the time they are evicted.

losipiuk added a commit to losipiuk/trino that referenced this issue Jul 15, 2020
New version is a bugfix release. Among other it includes fix for
trinodb#3580
losipiuk added a commit to losipiuk/trino that referenced this issue Jul 23, 2020
New version is a bugfix release. Among other it includes fix for
trinodb#3580
losipiuk added a commit to losipiuk/trino that referenced this issue Aug 14, 2020
New version is a bugfix release. Among other it includes fix for
trinodb#3580
losipiuk added a commit that referenced this issue Aug 14, 2020
New version is a bugfix release. Among other it includes fix for
#3580
JamesRTaylor pushed a commit to lyft/presto that referenced this issue Aug 17, 2020
New version is a bugfix release. Among other it includes fix for
trinodb#3580

Co-authored-by: Łukasz Osipiuk <[email protected]>
@losipiuk
Member Author

Fixed with #4551

@losipiuk losipiuk mentioned this issue Sep 3, 2020
9 tasks
@martint martint added this to the 341 milestone Sep 9, 2020