Skip to content

RandomQueryGeneratorValidators

philip-stoev edited this page Jul 17, 2012 · 3 revisions

Category:RandomQueryGenerator

Table of Contents

Validators Overview

The Validators are individual modules that perform extra checking on the queries being executed. They are enabled via the --validators=Validator1,Validator2,ValidatorN option. By default, ResultsetComparator is enabled when running the same queries against two servers and ResultsetComparator is enabled when replication is being tested.

Available Validators

Optimizer, Comparison Testing and Data Validation

  • ResultsetComparator and ResultsetComparator3 - compares the result sets returned from two or three different servers, accounting for sort order differences. This is automatically enabled when you use runall.pl to start two independent servers.
Comparing between two different binaries:
 --basedir1=/path/to/server1/
 --basedir2=/path/to/server2/
 --validator=ResultsetComparator

Comparing between two different configurations:

 --basedit=/path/to/server/
 --mysqld1=--option1=value1 --mysqld1=--option2=value2
 --mysqld2=--option3=value3 --mysqld1=--option4=value4

Comparisons involving two or more different products -- see RandomQueryGenerator3WayDBComparision

  • ResultsetComparatorSimplify and ResultsetComparatorSimplify3 - identical to ResultsetComparator but initiates an automatic test simplification procedure for the query where the two servers differ in their result sets, and produces an MTR test case. Note that the simplification process may issue numerous queries that can take a while and can interfere with the rest of the test (such as by crashing the server). See RandomQueryGeneratorSimplification#Simplifying_individual_queries for more information.
  • SelectStability - each SELECT query in the grammar will be re-executed verbatim several times, after delays of 0, 0.01 and 0.1 seconds respectively in order to detect situations where the result set has changed between invocations. This is mostly useful for catching Optimizer bugs where the result set depends on uninitialized memory or some undesirable thread interaction. This Validator requires either --threads=1 or a transactional grammar with REPEATABLE_READ or higher in order to operate properly.
  • QueryProperties - if the grammar contains embedded comments which describe the desired outome from the execution of queries, the QueryProperties validator will be activated automatically. For more information, please see RandomQueryGeneratorQueryProperties.
  • OptimizerTraceParser - if optimizer tracing is available and enabled in the tested MySQL server, this validator grabs the JSON-formatted trace text generated by the server for each statement execution, and uses a JSON parser to validate syntax and build a Perl datastructure from the trace text. If the trace does not have valid syntax, and this is not due to the trace exceeding the maximum memory threshold, the validation will fail (non-fatally). Requires that the Perl module JSON is available.

Database Corruption

  • ErrorMessageCorruption - checks any errors returned from the server for characters outside the printable ASCII range. Having such characters indicates memory corruption on the server. This is only useful if you have no database objects with non-ASCII names.
  • ResultsetCorruption - checks the result set from each query for characters outside of the printable ASCII range. Having such characters indicates database corruption. This is only useful if you have no blobs loaded with binary data, that is, no blobs => ['data'] clause in your Random Data Generator configuration file.

Replication

  • ReplicationSlaveStatus - issues a SHOW SLAVE STATUS after each query and signals failure if the slave thread has stopped. It is enabled by default if you use runall.pl --rpl_mode= to start a replication test;
  • ReplicationWaitForSlave - forces the slave to wait for the master using MASTER_POS_WAIT() and returns failure if this function fails (that is, the slave thread is not running). This is useful to force the slave to wait for the master, resulting in an error closer to the actual offending query.
  • DatabaseComparator - uses mysqldump to compare your two independent servers or the master and the slave after each query and signals failure if the data is different. This is mostly useful with --threads=1 and causes a considerable slow-down. May allow you to catch a replication error immediately as it forms, even before the slave thread has stopped.

Transactional Consistency

  • RepeatableRead - each SELECT query in the grammar will be re-executed using various table access methods and the results will be compared. This Validator is useful for catching transactional inconsistencies, including those that derive from different data being visible depending on how the table is being read. Forcing a different query plan is achieved crudely, by tacking on extra predicates to the WHERE clause of the SELECT the faint hope that the query plan will be affected. A more advanced solution would parse and understand the query and use FORCE KEY. This Validator requires a transactional grammar, REPEATABLE_READ or higher isolation level and column names pk and int_key.
  • DatabaseConsistency - after each query, a SELECT is issued within the same connection in order to obtain the average of the col_int_key and col_int fields from each table. The test fails if the number is different from 200.0000. When combined with a purpose-built grammar that uses transactional DML that preserves the AVG() property, the test will check for transactional consistency, isolation and atomicity. A sample such grammar (and the related .ZZ file) comes bundled with the RQG, see RandomQueryGeneratorTests#Transactional_Integrity. Note that this Validator may issue implicit ROLLBACKs in order to prevent transactional errors from being reported as false test failures.

Performance

  • ExecutionTimeComparator compares the execution time of the query on two servers and reports the queries that were more than R times faster or slower on one or the other server (R is configurable in the validator code, default is R=1.5). Only queries that took more than D seconds to execute are considered (D is configurable, default is D=0.02). Statistical analysis and qualification of results is possible by configuring the validator. A summary is printed at the end of the test, and a tab-separated text file is generated with results that can be easily imported into a spreadsheet. See RandomQueryGeneratorPerformanceTesting for more information and caveats.

Grammar and Test Debugging

  • AbortOnSyntaxError - aborts the test immediately when an SQL syntax error is encountered. This is useful for debugging and cleaning up grammars one-error-at-a-time, rather than seeing dozens of instances of the same error scroll in the output. Note that syntactically valid, but otherwise wrong queries are not covered.
  • MarkErrorLog - locates the server's error log file and writes (appends) each query string to it. To be used for valgrind testing (default validator if --valgrind option is used). This way there is a good chance of finding the query triggering a valgrind warning (or error) by reading the error log after the test. Usually the offending query is printed directly after the valgrind warning, if any. Note that all queries are written to the log, not just those related to valgrind warnings.
  • ExplainMatch - aborts the test when the EXPLAIN of a generated SELECT query matches a regular expression (currently hard-coded within lib/GenTest/Validator/ExplainMatch.pm). This is useful in cases where you are fishing for individual queries that implement a particular rare query optimization. The Validator also provides a complete mysqltest test case that runs the matched query.

Validator API

Each Validator is a module residing in lib/GenTest/Validator , having the following header:

 require Exporter;
 @ISA = qw(GenTest::Validator GenTest);
 
 use strict;
 use GenTest::Constants;
 use GenTest::Result;
 use GenTest::Validator;
 use GenTest::Executor;

and implementing the validate() method:

   if ($results->[0]->query() ~= m{i_like_this_query}sio ) {
     return STATUS_OK;
   } else {
     return STATUS_DATABASE_CORRUPTION;
   }
 }

$validator is a reference to self, $executors is a reference to an array of Executor objects, one per server, and $results is a reference to an array of Result, containing all information about the query being validated and its result set, as returned by each server.

The validate() method can return any of the STATUS_* constants listed in lib/GenTest/Constant.pm. STATUS_OK is used to report that the Validator is OK with the particular query (which is then passed to the next Validator), STATUS_WONT_HANDLE is used to report that the Validator is not interested in providing an opinion about this particular query, and STATUS_SKIP is used to signal that the query should not be processed further by any other Validator. You can also return STATUS_EOF to cause the test to terminate immediately without prejudice.

The rest of the error codes are either non-fatal and will allow the test to continue or fatal, aborting the test immediately. Use a non-fatal error code if it will be useful to continue the test and collect more test failures for examination, but where the consistency of the database has not been compromised and future test failures will not be caused by the initial failure. For example, if you are running queries against two servers and they report different error codes, that would be appropriately a fatal error, since the databases may have diverged, making further comparisons with more queries questionable.

Note that RQG objects, including Validators, are not classical Perl hashes. They are arrays, and the constructor is provided by lib/GenTest.pm.