-
Notifications
You must be signed in to change notification settings - Fork 20
RandomQueryGeneratorCaveats
This page lists known RQG caveats and some workarounds for them
The RQG is capable of generating queries containing variable numbers of columns and tables. This variability is a problem because the items generated from a FROM clause would then be used in a WHERE clause. Using some simple embedded Perl code, it is possible to keep track of things generated for one part of the query in order to properly reuse them in another part.
Knowing what tables participate in the FROM clause is important when generating a comprehensive and meaningful WHERE clause. So, we give each table a unique alias and record the total number of aliases that have been assigned:
table_list: new_table_item | table_list , new_table_item ; where_list: (where_item) | (where_list AND where_item); where_item: existing_table_item . _field = _digit ; new_table_item: _table AS { "table".++$tables }; existing_table_item: { "table".$prng->int(1, $tables) };
The new_table_item rule picks a random table, gives it an alias, and increments the number of aliases given so far. The existing_table_item rule picks one such alias. This way, the WHERE clause will only contain table aliases that have actually been generated for the FROM clause.
We can use the same trick to keep track of the SELECT items we generate - we will give each one a unique alias:
select_list: new_select_item | select_list , new_select_item ; having_list: (having_item) | (having_item AND having_list) ; having_item: existing_select_item = _digit ; new_select_item: _field AS { "field".++$fields }; existing_select_item: { "field".$prng->int(1, $fields) };
Certain queries can legitimately return different results between different server versions or storage engines. Such false positives can be avoided by careful grammar constructions.
Even though the ResultsetComparator Validator attempts to account for different ordering of result sets, if a LIMIT clause is specified without a full ORDER BY, two servers may return a different result set altogether. There are two ways to avoid this:
- ORDER BY using all columns from the SELECT
total_order_by: { join(', ', map { "field".$_ } (1..$fields) ) };
- ORDER BY on the PRIMARY KEY of each participating table (not compatible with LIMIT)
total_order_by: { join(', ', map { "table".$_." . `pk`" } (1..$tables) ) };
If the SELECT contains non-aggregate fields that are different from the ones used in the GROUP BY, then the result will be undefined. To avoid this:
- Use the ONLY_FULL_GROUP_BY SQL mode
- GROUP BY on all non-aggregate SELECT columns.
select_list: select_item | select_list , select_item ; select_item: aggregate_select_item | nonaggregate_select_item ; aggregate_select_item: COUNT( _field ) | SUM( _field ) | MIN( _field ) | MAX( _field ); nonaggregate_select_item: _field AS { $field_name = "field".++$fields ; push @nonaggregates , $field_name ; $field_name }; group_by: { scalar(@nonaggregates) > 0 ? " GROUP BY ".join (', ' , @nonaggregates ) : "" };
If you use GROUP BY and ORDER BY on a column that is not listed in the SELECT, the result may vary, depending on whether sorting happened before the grouping or the other way around. To avoid this, always use a column alias in the ORDER BY clause.
If you run a lot of DDL statements, your DML queries may frequently try to access a database object that does not exist. The following solutions may help to partially alleviate the problem:
- have many CREATE statements in your query_init rule. They will be run first and will hopefully populate your database sufficiently:
query_init: create ; create ; create ; create ; create ;
- have way more CREATE than DROP statements:
dml: create | create | create | create | drop ;
- keep track of the objects just created and dropped and only CREATE or DROP objects that do not (or do) exist:
create: CREATE TABLE IF NOT EXISTS pick_create_table (F1 INTEGER) ; drop: DROP TABLE IF EXISTS pick_drop_table ; pick_create_table: { if (scalar(@dropped_tables) > 0) { $created_table = shift @dropped_tables } else { $created_table = $prng->letter() } ; push @created_tables, $created_table ; $created_table } ; pick_drop_table: { if (scalar(@created_tables) > 0) { $dropped_table = pop @created_tables } else { $dropped_table = $prng->letter() } ; push @dropped_tables, $dropped_table ; $dropped_table } ; pick_existing_table: { if (scalar(@created_tables) > 0) { $prng->arrayElement(\@created_tables) } else { $prng->letter() } } ;