This repository has been archived by the owner on Mar 20, 2021. It is now read-only.
Bohan Zhang edited this page Jan 17, 2019 · 8 revisions

Frequently Asked Questions

1. AssertionError: wrong metric type

File "/home/ubuntu/ottertune/server/website/website/parser/base.py", line 302, in calculate_change_in_metrics
assert adj_val >= 0, '{} wrong metric type '.format(met_name)
AssertionError: pg_stat_user_tables.n_mod_since_analyze wrong metric type

This error occurs when the value in metric_after.json is smaller than the value in metric_before.json, so the computed change is negative. If the metric type is MetricType.COUNTER, OtterTune computes the change between the two files and requires it to be non-negative. For any other metric type, it uses the value in metric_after.json directly. You can therefore fix the error by changing the metric's type to 2 (MetricType.INFO) or 3 (MetricType.STATISTICS) in the metric fixture.
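The rule described above can be sketched as follows. This is a minimal illustration, not the code in parser/base.py: the function name change_in_metric is hypothetical, and the value 1 for MetricType.COUNTER is an assumption (this page only confirms 2 and 3).

```python
from enum import IntEnum


class MetricType(IntEnum):
    COUNTER = 1     # assumed value; this FAQ only confirms 2 and 3
    INFO = 2
    STATISTICS = 3


def change_in_metric(name, metric_type, before, after):
    """Return the value recorded for one metric over an observation window.

    Hypothetical helper that mirrors the rule described in the FAQ.
    """
    if metric_type == MetricType.COUNTER:
        delta = after - before
        # A counter is expected to only grow, so a negative change is rejected.
        if delta < 0:
            raise AssertionError('{} wrong metric type'.format(name))
        return delta
    # Non-counter metrics take the value observed at the end of the window.
    return after
```

With this sketch, a metric whose value drops between the two snapshots raises the error as a COUNTER but is accepted as INFO or STATISTICS, which is why changing the type in the fixture removes the assertion failure.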

2. How do I tune a new database?

In brief, add a collector on the client side and a parser on the server side. OtterTune also needs knob and metric fixtures for the target database. You do not need to collect information on every knob and metric; start with a few important ones. This PR, which adds support for SAP HANA, is an example: https://github.com/cmu-db/ottertune/pull/194

3. How do I define a new target objective besides throughput and latency?

You can define your own function as the target objective, as long as you collect the information it needs on the client side. This PR has some description: https://github.com/cmu-db/ottertune/pull/182

4. Why does the throughput decrease over time when running the TPC-C benchmark? / When should I reload the database?

When running the TPC-C benchmark with OLTP-Bench, the database grows because the workload inserts new tuples. As a result, the throughput may decrease over time even with the same database configuration, so we reload the database every several loops. The right frequency is hard to state in general, since it depends on your hardware, database size, and benchmark configuration. For example, when the scale factor, number of terminals, or observation time in the benchmark is large, you should reload the database more frequently. You can set the frequency with RELOAD_INTERVAL in the driver's fabfile.py. In contrast, when running analytical queries in TPC-H, the database size does not change and you do not need to reload the database; in that case, set RELOAD_INTERVAL to 0.
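As a rough illustration of what a reload interval means, the check below reloads once every RELOAD_INTERVAL loops and never reloads when the interval is 0. How fabfile.py actually consumes RELOAD_INTERVAL is not shown on this page, so the function should_reload and the value 10 are assumptions made for the sketch.

```python
RELOAD_INTERVAL = 10  # hypothetical value; 0 means "never reload"


def should_reload(iteration, reload_interval=RELOAD_INTERVAL):
    """Return True if the database should be reloaded before this loop.

    Hypothetical helper, not taken from the OtterTune driver.
    """
    if reload_interval <= 0:
        # TPC-H style workloads: the database does not grow, so skip reloads.
        return False
    # Reload every `reload_interval` loops, but not before the first loop.
    return iteration > 0 and iteration % reload_interval == 0
```

A smaller interval trades tuning time for more comparable measurements, since each reload resets the database to its initial size.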

5. Exploration and Exploitation

Exploration means searching an unknown region where there is little or no data. Exploitation means using existing data to predict performance and selecting a configuration near the best one found so far. OtterTune balances exploration and exploitation with UCB (Upper Confidence Bound). When the sigma_multiplier parameter (see here) is large, OtterTune is more likely to explore than to exploit. If OtterTune converges to a configuration that you are not satisfied with, you can increase this parameter to encourage more exploration of regions with little data. The results may become more volatile, but OtterTune gets a chance to try new configurations with better performance.
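The effect of sigma_multiplier can be seen in a small UCB sketch. It assumes a higher score is better (for example, throughput) and that each candidate comes with a predicted mean and standard deviation; the function names and the numbers are hypothetical, and OtterTune's own implementation may use a different sign convention.

```python
def ucb_score(mean, std, sigma_multiplier):
    """Upper confidence bound: predicted mean plus a bonus for uncertainty."""
    return mean + sigma_multiplier * std


def pick_config(candidates, sigma_multiplier):
    """Pick the candidate name with the highest UCB score.

    candidates: list of (name, predicted_mean, predicted_std) tuples.
    """
    best = max(candidates, key=lambda c: ucb_score(c[1], c[2], sigma_multiplier))
    return best[0]


# Hypothetical predictions: one region with lots of data, one with little.
candidates = [
    ("well_explored", 100.0, 2.0),   # good mean, low uncertainty
    ("little_data", 90.0, 15.0),     # lower mean, high uncertainty
]

exploit_choice = pick_config(candidates, sigma_multiplier=0.5)
explore_choice = pick_config(candidates, sigma_multiplier=3.0)
```

With the small multiplier the well-explored candidate wins (101.0 against 97.5), while the large multiplier favors the uncertain candidate (135.0 against 106.0), which is the exploration behavior described above.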