Replies: 1 comment 1 reply
-
Thank you for this explanation. I recently am trying CrossHair to do symbolic execution. The above discussion seems not very severe for converting CrossHair to a pure symbolic executor except aliases and the case that functions refuse variable My biggest concern about CrossHair is that it does not directly execute the program in a way of other symbolic executors like Klee. For example, when there is a branch (e.g., Thank you again! |
Beta Was this translation helpful? Give feedback.
-
Generally speaking, CrossHair does not perform software verification. (i.e. it does not prove that your contracts always hold)
Under some very specific assumptions, however, it can fulfill this purpose. The following post describes how this can be done and what assumptions are required. Further discussion may happen below, but this content will be maintained over time in an attempt to remain up-to-date. (last edit: 2022-07-01; many thanks to lmontand for helping to catalog these!)
CrossHair explores different "paths" through the code and uses the SMT solver to reason about all possible inputs that would follow this path. Sometimes, CrossHair runs out of paths to explore. If you supply the
--report_all
option tocrosshair check
, CrossHair will reportConfirmed over all paths
if it exhausted all paths. Apart from the cases detailed below, "Confirmed over all paths." may be interpreted as a verification of the property.Cases where CrossHair might incorrectly return
CONFIRMED
instead ofREFUTED
The following is a list of cases were CrossHair might report a wrong analysis result because it made some assumptions which your code does not satisfy.
Non-determinism: All functions, directly or indirectly invoked by the function (or its pre/post conditions) should exhibit the same behavior when given the same arguments. Here are some examples of non-determinism. For cases raising
NotDeterministic
, see section below. Examples of non-determinism are:random
library), functions depending on the system's state (example:time
anddatetime
libraries), or functions whose output depends on previous queries.functools
cache operations like lru_cache are safe to use - CrossHair detects these and disables their usual caching behaviors)If you still wish to use such functions in your code, you might want to register contract for those to help CrossHair understand them. See plugins and contract registration.
Code termination: Verification requires that the code terminates on all inputs. (why? Think about it like an inductive proof - the inductive step only works if you also have a "base case"). A real verification tool will have you write a termination proof before reasoning about the property to be verified; CrossHair simply assumes termination. If your code does not terminate, CrossHair may declare a contract confirmed over all paths when it isn't. Here is an example.
All arguments have states that are directly constructable: Problematic examples: classes with mutable private members like this. A list containing itself as a member (this is also an example of the aliasing problem below). To ensure CrossHair correctly understand your classes, please follow these guidelines.
Aliases: Aliasing problems might happen when you are using containers/collections holding non-trivial objects. CrossHair remembers instances created before and is able to detect some cases. However, CrossHair is not able to correctly reason about all cases: see this issue which contains two interesting examples. More to this topic, you should avoid comparing objects with the
is
keyword. Here is an example of what could go wrong. However, you may safely makeis
comparisons withNone
and values of anEnum
.Closed type hierarchy: CrossHair will detect and attempt to use subclasses that have already been defined in the interpreter (warning, see exceptions below under "IMPORTANT"), but will not consider the possibility of additional subclasses. Path exhaustion will happen after all known subclasses have been attempted. As an example, suppose you run CrossHair on some module
module1
containing the classclass Class1
. Suppose another modulemodule2
containsclass Class2(Class1)
(i.e. a subclass ofClass1
). Ifmodule2
is not loaded by the interpreter, CrossHair will have no way of knowing thatClass2
exists. So, the analysis result might be "confirmed over all paths" even ifClass2
breaks some postconditions.IMPORTANT: classes defining either
__copy__
,__deepcopy__
,__reduce__
or__reduce_ex__
are not considered in the possible subclasses, because they have a non-trivial implementation of copying. Here is an example.Another similar and important note is that CrossHair won't detect subclassing that is revealed via dynamic isinstance hooks, e.g.
__subclasscheck__
and__subclasshook__
.Catching exceptions: If you do some "advanced" exception handling - not only based on the exception type itself, but on other information carried with the exception - your code handling the exception might not behave the correct way when CrossHair runs it. The reason for this is that CrossHair simplifies information of some exceptions of the standard library, to avoid unnecessary realization of symbolic variables. This only concerns exceptions thrown by the standard library, if your code throws some exceptions, they will remain untouched. Here is an example where handling the exception depends on the exception's message, which is missing when executing CrossHair.
Callables with TypeVar: If your function takes as input a
Callable[X, Y]
, whereX
and/orY
is/contains aTypeVar
, CrossHair might return "confirmed over all paths", even when the postcondition is wrong. See this issue as an example.Parallelism: The code is assumed to be single-threaded and to not have any other kind of parallelism. Therefore, CrossHair provides no detection for deadlocks, concurrent writes, racing conditions, etc. Here is a sample example which works perfectly fine in a single-threaded setting, but could be wrong when adding parallelism.
Pointers: Using the
ctypes
library and modifying pointers, it's easy to have CrossHair return incorrect results. To demonstrate the power ofctypes
, one could even redefine the value of 1, thus totally breaking the behavior of Python.Python environment: Similarly, do not expect any guarantees if you try to modify part of the standard library. Results only apply to a fresh and unmodified python interpreter and library.
Additionally note that if you supply incorrect pre- or postconditions or that you write incorrect type annotations to your functions, analysis results have high chances to be incorrect as well.
Cases where the result is caped to
UNKNOWN
Below is a list of more challenging cases where CrossHair's result is often
UNKNOWN
. You can find an example for most of them in this gist. Unlike the above list of cases, theUNKNOWN
result here is "correct." We present these cases to help you understand why CrossHair may not be able to prove your property.object
orAny
(both are treated the same way), it will only try the subtypesint
andstr
and it will cap the result toUNKNOWN
. This is because trying all subclasses ofobject
is clearly not feasible in a reasonable amount of time. Note that in some rare circumstances where only some trivial operations are performed on symbolic objects, CrossHair does not need to realize the object and can still prove your result. Here is an example.TypeVar
with constraints (for exampleTypeVar("T", int, str)
) is currently not supported and the result is caped toUNKNOWN
. Note that this is different from usingTypeVar("T", bound=Union[int, str])
, which is supported. For more information about the difference, see this post.TypeError
when receiving aSymbolicInt
instead of a regularint
, for example. If CrossHair detects such a case, the result is caped toUNKNOWN
. If this was not detected, the result will beREFUTED
and it will directly report you theTypeError
.UNKNOWN
. This can happen for some of the builtins and for some external C-based modules.Callable[..., <some_return_type>]
) or withtyping.Paramspec
ortyping.Concatenate
. In such cases, the result is caped toUNKNOWN
.1
are currently not supported and the result is caped toUNKNOWN
.UNKNOWN
.UNKNOWN
, as some paths are not explored.Note: In terms of design and evolutionary goals, CrossHair targets rapid counterexample discovery in practical use cases, rather than tactics for path exhaustion (verification). Often these two objectives are complementary, but not always. Therefore, issues that principally benefit verification use cases will not be prioritized as aggressively as other issues.
Beta Was this translation helpful? Give feedback.
All reactions