Simplification pass #256

Open
wants to merge 138 commits into
base: main
Commits
182c9e8
refactor call to a statement & add unreachable and return jumps
ailrst Aug 19, 2024
3711e4d
move transforms out of RunUtils.scala
ailrst Aug 19, 2024
fd56c9a
remove old cfg
ailrst Aug 9, 2024
bd6b2ad
update docs
ailrst Aug 19, 2024
96b9028
pe based expr evaluator
ailrst Aug 19, 2024
d264b97
interpreter test
ailrst Aug 19, 2024
a3bf83b
rewrite interpreter in functional style
ailrst Aug 22, 2024
74c4bc2
cleanup call/return
ailrst Aug 22, 2024
9e23558
cleanup memory ops to enter effects
ailrst Aug 22, 2024
7571aa3
tracing interpreter
ailrst Aug 26, 2024
f36c13e
compile with state monad
ailrst Aug 27, 2024
2c16c83
fix state monad interp
ailrst Aug 28, 2024
dd11175
indirect calls
ailrst Aug 28, 2024
c7b4a56
tracing interpreter
ailrst Aug 28, 2024
f64f904
breakpoints
ailrst Aug 28, 2024
897c565
improve breakpoints
ailrst Aug 29, 2024
cc97052
reorg
ailrst Aug 29, 2024
695b2d2
refactor with statemonad[s, either[v]]
ailrst Aug 29, 2024
00b482a
redo error handling
ailrst Aug 29, 2024
1a139d2
refactor call to a statement & add unreachable and return jumps
ailrst Aug 19, 2024
7e94536
move transforms out of RunUtils.scala
ailrst Aug 19, 2024
aae0064
remove old cfg
ailrst Aug 9, 2024
e2c0cd4
update docs
ailrst Aug 19, 2024
a3adee3
fix
ailrst Aug 30, 2024
4a92c99
fix externals
ailrst Aug 30, 2024
b994c99
disable IDE analyses if mainproc is external
ailrst Aug 30, 2024
b83c9ff
Merge branch 'call-statement' into interpreter
ailrst Aug 30, 2024
d6c5767
work on differential testing
ailrst Aug 30, 2024
2a03a23
hook for dynlinking
ailrst Sep 2, 2024
8b694c0
load full symtab
ailrst Sep 2, 2024
57ca6b3
update relf grammar
ailrst Sep 3, 2024
b276981
init bss
ailrst Sep 3, 2024
888d360
cleanup
ailrst Sep 3, 2024
0abf654
cleanup init trace
ailrst Sep 3, 2024
9192ce3
intrinsic stub and cleanup errors
ailrst Sep 3, 2024
5342358
init relocation table
ailrst Sep 3, 2024
87b0f86
pull stepper outside effects to fix interpreter composition again
ailrst Sep 4, 2024
f5a0590
cleanup
ailrst Sep 4, 2024
a6b58e8
cleanup
ailrst Sep 4, 2024
dd64c14
constprop test with interpreter
ailrst Sep 4, 2024
8be5cb6
interpreter docs
ailrst Sep 4, 2024
d2a2486
fix list
ailrst Sep 4, 2024
d06e268
paragraph
ailrst Sep 4, 2024
4ae3997
trap eval exceptions to monadic errors
ailrst Sep 4, 2024
4fc8ed2
improve interpretOne
ailrst Sep 4, 2024
8f966ce
notes on initialisation
ailrst Sep 4, 2024
f00b44b
simplify invoc funcs
ailrst Sep 4, 2024
ceef233
note missing features
ailrst Sep 5, 2024
cff6c97
run through all system tests
ailrst Sep 5, 2024
822c127
add resource limit
ailrst Sep 5, 2024
9b8f715
doc resource limit
ailrst Sep 5, 2024
648a41d
tweak doc
ailrst Sep 5, 2024
dfa7209
fix
ailrst Sep 5, 2024
38f871c
tweak interpretrlimit
ailrst Sep 5, 2024
6559180
basic malloc implementation
ailrst Sep 9, 2024
c6e938d
implement printf
ailrst Sep 10, 2024
285099b
cleanup intrins
ailrst Sep 10, 2024
aac7f5c
cleanup
ailrst Sep 23, 2024
1c2a5d5
Merge remote-tracking branch 'upstream/main' into interpreter
ailrst Sep 23, 2024
46ff66a
cleanup
ailrst Sep 23, 2024
f45015b
initial attempt at IR simplification
ailrst Sep 25, 2024
7769357
fixup analysis
ailrst Sep 25, 2024
533ccce
wl optim
ailrst Sep 25, 2024
febadbd
rename w ssa
ailrst Sep 25, 2024
4465cfb
tweak ssa
ailrst Sep 20, 2024
a6cc33f
change param to localvar
ailrst Sep 30, 2024
750d949
distinguish lvars and rvars in cilvisitor
ailrst Sep 30, 2024
c9c4181
transform ir and spec and loader to have params
ailrst Oct 1, 2024
f83fd74
cleanup spec param handling
ailrst Oct 2, 2024
3fce1c9
update visitor
ailrst Oct 2, 2024
eef39b9
add invariant check
ailrst Oct 2, 2024
9b87104
Merge branch 'procedure-call-abstraction' into xf
ailrst Oct 2, 2024
a21015c
fix test issues in merge
ailrst Oct 2, 2024
409c455
make simplify work with params
ailrst Oct 2, 2024
81fd5e4
improve dsa and params w liveness
ailrst Oct 4, 2024
8e99918
fix dsa, copyprop, cleanup
ailrst Oct 8, 2024
495e4f2
small fix
ailrst Oct 10, 2024
e567f77
rewrites
ailrst Oct 15, 2024
5c42e92
remove slices
ailrst Oct 16, 2024
41aa3d7
disable bad
ailrst Oct 16, 2024
24a12e3
attempt at condition lifting
ailrst Oct 17, 2024
8bf3e4d
cleanup
ailrst Oct 17, 2024
f75086e
add example
ailrst Oct 17, 2024
c26028e
add expr2smt
ailrst Oct 18, 2024
09eb1af
validate and improve expr lifting
ailrst Oct 18, 2024
8b4fd6f
transform order
ailrst Oct 18, 2024
bc48e00
cleanup and perf
ailrst Oct 19, 2024
f2adf72
add early proc trim and fix remove unreachable cfg maintenance
ailrst Oct 22, 2024
a6d302f
flow insensitive copyprop and single pass dsa
ailrst Oct 23, 2024
f313ed3
block coalescing
ailrst Oct 31, 2024
95744f7
booltobv1 and copyprop heuristic
ailrst Nov 1, 2024
70a22c1
WIP bitvector size type inference
ailrst Nov 1, 2024
760bc0a
copyprop completeness improvements
ailrst Nov 5, 2024
2b2a0a5
improvements to cond identification
ailrst Nov 6, 2024
6ea3b1a
improve shift/extend/extract removal
ailrst Nov 7, 2024
3beb84d
improve cond detection
ailrst Nov 7, 2024
7c0c98b
revert to old slice removal
ailrst Nov 7, 2024
a7d0754
cleanup
ailrst Nov 7, 2024
80491cb
validate and improve condition reduction
ailrst Nov 8, 2024
983597c
new prettyprinter
ailrst Nov 11, 2024
08af7a3
docs and cleanup
ailrst Nov 12, 2024
d3b8176
ccmp and rpo sort
ailrst Nov 13, 2024
0f9a756
Merge branch 'main' of github.com:UQ-PAC/bil-to-boogie-translator int…
ailrst Nov 13, 2024
a452165
work on adding param support to interpreter
ailrst Nov 29, 2024
b9c549c
multiple loggers
ailrst Dec 2, 2024
8dd5f0e
fix externalremover
ailrst Dec 3, 2024
21efc16
robustness and perf fixes (monad overflow)
ailrst Dec 3, 2024
797dc2b
Merge branch 'main' into simp-pass-main-merge
b-paul Dec 10, 2024
a824c07
prettyprinter
ailrst Dec 5, 2024
015edd3
Merge branch 'ilparser-serialiser' of github.com:ailrst/basil into si…
ailrst Dec 10, 2024
38f927b
run simplify after analysis
ailrst Dec 10, 2024
8329c11
Merge pull request #285 from UQ-PAC/simp-pass-main-merge
ailrst Dec 10, 2024
a082b35
dont try to <8bit initial memory regions
ailrst Dec 19, 2024
589df48
fix test load/store flag
ailrst Jan 6, 2025
cbc3cac
fix fmt
ailrst Jan 6, 2025
79b8d85
Merge remote-tracking branch 'origin' into simplification-pass
ailrst Jan 6, 2025
3a441cc
remove bvsaddo as not supported by z3 smtlib
ailrst Jan 6, 2025
92a10df
cleanup and autoformat new files
ailrst Jan 6, 2025
e71c1fc
add back initialMemory
ailrst Jan 6, 2025
d5a744b
tweak prettyprinter
ailrst Jan 6, 2025
ef7330a
fix args
ailrst Jan 6, 2025
fa73de2
fix
ailrst Jan 14, 2025
827946c
basic exit call identification
ailrst Jan 14, 2025
d72b987
allow custom initial state in AbstractDomain
sadrabt Jan 15, 2025
f272284
Merge pull request #298 from UQ-PAC/simplification-pass-init
ailrst Jan 15, 2025
589a9ee
load dir update readme
ailrst Jan 15, 2025
f6e1a16
tweak messages and readme
ailrst Jan 16, 2025
c8a208e
disable spec param conversion on procedures with no spec
ailrst Jan 16, 2025
fb808f6
add interproc solver to absint, analysis result printer
ailrst Jan 21, 2025
059807c
fixes
ailrst Jan 22, 2025
8e2c420
Merge pull request #301 from UQ-PAC/simplify-interproc-fixedpoint
ailrst Jan 22, 2025
7544b65
Simplify fixes (#302)
ailrst Jan 23, 2025
95f7b8a
disable debug print
ailrst Jan 23, 2025
8584daa
fix interlivevars return function
ailrst Jan 23, 2025
2e80fcc
wip copyprop throuh memory
ailrst Jan 23, 2025
d51dc93
minor simplifyexpr improvement
ailrst Jan 24, 2025
276af66
disable load prop
ailrst Jan 24, 2025
647e884
Merge pull request #303 from UQ-PAC/memory-copyprop
ailrst Jan 24, 2025
5 changes: 4 additions & 1 deletion .scalafmt.conf
@@ -1,3 +1,6 @@
version = "3.0.5"
runner.dialect = scala3
maxColumn = 120
preset = default
maxColumn = 120
indent.defnSite = 2
optIn.configStyleArguments = false
54 changes: 41 additions & 13 deletions docs/basil-ir.md
@@ -25,8 +25,9 @@ BlockID ::=&~ String \\
\\
Jump ::=&~ GoTo ~|~ Unreachable ~|~ Return \\
GoTo ::=&~ \text{goto } BlockID* \\
Return ::=&~ \text{return } (outparams) \\
Call ::=&~ DirectCall ~|~ IndirectCall \\
DirectCall ::=&~ \text{call } ProcID \\
DirectCall ::=&~ (outparams) := \text{ call } ProcID \; (inparams) \\
IndirectCall ::=&~ \text{call } Expr \\
\\
&~ loads(e: Expr) = \{x | x:MemoryLoad, x \in e \} \\
@@ -55,8 +56,17 @@ Endian ::=&~ BigEndian ~|~ LittleEndian \\
- The `Unreachable` jump is used to signify the absence of successors; it has the semantics of `assume false`.
- The `Return` jump passes control to the calling function; this is often over-approximated as passing control to all functions which call the statement's parent procedure.

### Indirect Calls

An indirect call is a dynamic jump, to either a procedure or a block.

## Translation Phases

We have invariant checkers to validate that the structure of the IR's bidirectional CFG is correct; see `src/main/scala/ir/invariant`. This includes:

- blocks belong to exactly one procedure: `invariant/BlocksUniqueToProcedure.scala`
- forwards block CFG links match backwards block CFG links: `invariant/CFGCorrect.scala`
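
As a rough illustration of what these two checks amount to, the following is a minimal sketch over a toy block graph. `ToyBlock`, `ToyProc` and both function names are hypothetical stand-ins introduced here; they are not the types or checkers defined in `src/main/scala/ir/invariant`.

```scala
object InvariantSketch {
  // Hypothetical toy model: a block records its successors and predecessors by label.
  final case class ToyBlock(label: String, next: Set[String], prev: Set[String])
  final case class ToyProc(name: String, blocks: List[ToyBlock])

  // Every block label occurs in exactly one procedure.
  def blocksUniqueToProcedure(procs: List[ToyProc]): Boolean = {
    val labels = procs.flatMap(_.blocks.map(_.label))
    labels.distinct.size == labels.size
  }

  // b is a successor of a exactly when a is recorded as a predecessor of b.
  def cfgCorrect(procs: List[ToyProc]): Boolean = {
    val byLabel = procs.flatMap(_.blocks).map(b => b.label -> b).toMap
    val forwardOk = byLabel.values.forall { a =>
      a.next.forall(n => byLabel.get(n).exists(_.prev.contains(a.label)))
    }
    val backwardOk = byLabel.values.forall { b =>
      b.prev.forall(p => byLabel.get(p).exists(_.next.contains(b.label)))
    }
    forwardOk && backwardOk
  }
}
```

Keeping both link directions explicit is what makes the second check necessary: a transform that updates only one side leaves the graph inconsistent.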

#### IR With Returns

- Immediately after loading the IR, return statements may appear in any block, or may be represented by indirect calls.
@@ -73,10 +83,21 @@ This ensures that all returning, non-stub procedures have exactly one return sta

#### Calls appear only as the last statement in a block

- Checked by `invariant/SingleCallBlockEnd.scala`
- The structure of the IR allows a call to appear anywhere in a block, but for all the analysis passes we hold the invariant that it
only appears as the last statement. This is checked with the function `singleCallBlockEnd(p: Program)`,
and it means that for any call statement `c` we may `assert(c.parent.statements.lastOption.contains(c))`.
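
The shape of this invariant can be sketched on a toy statement type. This is an illustrative assumption only: `ToyStmt`, `ToyBlock` and `callsOnlyAtBlockEnd` are hypothetical names, not the implementation in `invariant/SingleCallBlockEnd.scala`.

```scala
object CallEndSketch {
  // Hypothetical toy statements: either a plain assignment or a call.
  sealed trait ToyStmt
  case object ToyAssign extends ToyStmt
  final case class ToyCall(target: String) extends ToyStmt
  final case class ToyBlock(label: String, statements: List[ToyStmt])

  // A block satisfies the invariant if no statement other than the last one is a call.
  def callsOnlyAtBlockEnd(blocks: List[ToyBlock]): Boolean =
    blocks.forall(b => b.statements.dropRight(1).forall(s => !s.isInstanceOf[ToyCall]))
}
```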

## IR With Parameters

The higher level IR containing parameters is established by `ir.transforms.liftProcedureCallAbstraction(ctx)`.
This makes registers local variables, which are passed into procedures through parameters, and then returned from
procedures. Calls to these procedures must provide as input parameters the local variables corresponding to the
values passed, and must also assign the output parameters to local variables. Note that we must now consider indirect calls
as possibly assigning to everything, even though this is not explicitly represented syntactically.

- That the actual parameters to calls and returns match the formal parameters is checked by `invariant/CorrectCallParameters.scala`
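
A minimal sketch of the kind of check this describes, assuming formal parameters are identified by name. `ToyProc`, `ToyCall` and `correctCall` are hypothetical and only illustrate the idea; the real check lives in `invariant/CorrectCallParameters.scala` and may differ.

```scala
object ParamCheckSketch {
  // Hypothetical procedure signature: names of the formal input and output parameters.
  final case class ToyProc(name: String, in: List[String], out: List[String])

  // Hypothetical call: formal input name -> actual argument, formal output name -> assigned local.
  final case class ToyCall(target: String, actualIn: Map[String, String], outAssigned: Map[String, String])

  // A call is well formed if it binds exactly the formal inputs and outputs of its target.
  def correctCall(procs: Map[String, ToyProc], c: ToyCall): Boolean =
    procs.get(c.target).exists { p =>
      c.actualIn.keySet == p.in.toSet && c.outAssigned.keySet == p.out.toSet
    }
}
```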

## Interaction with BASIL IR

### Constructing Programs in Code
@@ -127,20 +148,27 @@ label, the dsl constructor will likely throw a match error.

Some additional constants are defined for convenience, e.g. `R0 = Register("R0", 64)`; see [the source file](../src/main/scala/ir/dsl/DSL.scala) for the full list.

### Static Analysis / Abstract Interpretation

- For static analysis the Il-CFG-Iterator is the current well-supported way to iterate the IR.
This currently uses the TIP framework, so you do not need to interact with the IR visitor directly.
See [BasicIRConstProp.scala](../src/main/scala/analysis/BasicIRConstProp.scala) for an example of its usage.
- This visits all procedures, blocks and statements in the IR program.
### Pretty printing

### Modifying and Visiting the IR with Visitor Pattern
The IR can be printed with the overloaded function below, which takes a procedure, block, or statement and returns a string.

[src/main/scala/ir/Visitor.scala](../src/main/scala/ir/Visitor.scala) defines visitors which can be used
for extracting specific features from an IR program. This is useful if you want to modify all instances of a specific
IR construct.

### CFG
```scala
translating.BasilIRPrettyPrinter()(b)
```

It is also possible to dump a `dot/graphviz` digraph containing just the blocks in the program
using the functions:

```scala
ir.dotBlockGraph(prog: Program) : String
ir.dotBlockGraph(proc: Procedure) : String
```
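
To show what a block-only digraph looks like, here is a small sketch that renders a toy block list as `dot` text. `ToyBlock` and `toDot` are hypothetical; the output format of the real `ir.dotBlockGraph` is not reproduced here.

```scala
object DotSketch {
  // Hypothetical toy block: a label and the labels of its successors.
  final case class ToyBlock(label: String, successors: List[String])

  private def quote(s: String): String = "\"" + s + "\""

  // Emit one node line per block and one edge line per successor link.
  def toDot(blocks: List[ToyBlock]): String = {
    val nodes = blocks.map(b => "  " + quote(b.label) + ";")
    val edges = for { b <- blocks; s <- b.successors } yield "  " + quote(b.label) + " -> " + quote(s) + ";"
    (List("digraph G {") ++ nodes ++ edges ++ List("}")).mkString("\n")
  }
}
```

The resulting string can be written to a file and rendered with the Graphviz `dot` tool.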

The cfg is a control-flow graph constructed from the IR, it wraps each statement in a `Node`.
### Static Analysis / Abstract Interpretation / IR Rewriting and modification

- See [development/simplification-solvers.md](development/simplification-solvers.md)
- For static analysis the Il-CFG-Iterator is the current well-supported way to iterate the IR.
This currently uses the TIP framework, so you do not need to interact with the IR visitor directly.
See [BasicIRConstProp.scala](../src/main/scala/analysis/BasicIRConstProp.scala) for an example of its usage.
- This visits all procedures, blocks and statements in the IR program.
203 changes: 203 additions & 0 deletions docs/development/interpreter.md
@@ -0,0 +1,203 @@
# BASIL IR Interpreter

The interpreter is designed for testing, debugging, and validation of static analyses and code transforms.
This page describes first how it can be used for this purpose, and secondly its design.

## Basic Usage

The interpreter can be invoked from the command line via the `--interpret` flag. By default this prints a trace and checks that the interpreter
exited in a non-error stop state.

```shell
./mill run -i src/test/correct/indirect_call/gcc/indirect_call.adt -r src/test/correct/indirect_call/gcc/indirect_call.relf --interpret
[INFO] Interpreter Trace:
StoreVar(#5,Local,0xfd0:bv64)
StoreMem(mem,HashMap(0xfd0:bv64 -> 0xf0:bv8, 0xfd6:bv64 -> 0x0:bv8, 0xfd2:bv64 -> 0x0:bv8, 0xfd3:bv64 -> 0x0:bv8, 0xfd4:bv64 -> 0x0:bv8, 0xfd7:bv64 -> 0x0:bv8, 0xfd5:bv64 -> 0x0:bv8, 0xfd1
...
[INFO] Interpreter stopped normally.
```

The `--verbose` flag can also be used, which may print interpreter trace events as they are executed, but note that this may not correspond to the actual
execution trace, and may contain additional events not corresponding to the program.
E.g. this shows the memory initialisation events that precede the program execution. This is mainly useful for debugging the interpreter.

### Testing with Interpreter

The interpreter is invoked with `interpret(p: IRContext)` to interpret normally and return an `InterpreterState` object
containing the final state.

#### Traces

There is also `interpretTrace(p: IRContext)`, which returns a tuple of `(InterpreterState, Trace(t: List[ExecEffect]))`,
where the second element contains a list of all the events generated by the interpreter, in order.
This is useful for asserting a stronger equivalence between program executions, but in most cases events describing "unobservable"
behaviour, such as register accesses, should be filtered out of this list before comparison.

To see an example of this used to validate the constant prop analysis see [/src/test/scala/DifferentialAnalysis.scala](../../src/test/scala/DifferentialAnalysis.scala).
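
The filtering step described above can be sketched as follows. The event types here are hypothetical simplifications (`ToyEvent` and its cases), not the real `ExecEffect` hierarchy, and treating only variable stores as unobservable is an assumption made for the example.

```scala
object TraceFilterSketch {
  // Hypothetical trace events, loosely modelled on the trace output shown earlier.
  sealed trait ToyEvent
  final case class ToyStoreVar(name: String, value: BigInt) extends ToyEvent
  final case class ToyStoreMem(address: BigInt, value: BigInt) extends ToyEvent
  final case class ToyCall(target: String) extends ToyEvent

  // Drop register/variable stores; keep memory stores and calls.
  def observable(trace: List[ToyEvent]): List[ToyEvent] = trace.filter {
    case _: ToyStoreVar => false
    case _ => true
  }

  // Two executions are considered equivalent if their observable events agree in order.
  def equivalent(a: List[ToyEvent], b: List[ToyEvent]): Boolean =
    observable(a) == observable(b)
}
```

Comparing filtered traces lets a transform such as copy propagation change register traffic freely while still being held to the same memory writes and calls.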

#### BreakPoints

Finally, `interpretBreakPoints(p: IRContext, breakpoints: List[BreakPoint])` is used to
run an interpreter and perform additional actions at specified code points. For example, it may be invoked as follows:

```scala
val watch = IRWalk.firstInProc((program.procedures.find(_.name == "main")).get).get
val bp = BreakPoint("entrypoint", BreakPointLoc.CMD(watch), BreakPointAction(saveState=true, stop=true, evalExprs=List(("R0", Register("R0", 64))), log=true))
val res = interpretBreakPoints(program, List(bp))
```

The structure of a breakpoint is as follows:

```scala
case class BreakPoint(name: String = "", location: BreakPointLoc, action: BreakPointAction)

// the place to perform the breakpoint action
enum BreakPointLoc:
  case CMD(c: Command) // at a command c
  case CMDCond(c: Command, condition: Expr) // at a command c, when condition evaluates to TrueLiteral

// describes what to do when the breakpoint is triggered
case class BreakPointAction(
  saveState: Boolean = true, // stash the state of the interpreter
  stop: Boolean = false, // stop the interpreter with an error state
  evalExprs: List[(String, Expr)] = List(), // Evaluate the rhs of the list of expressions, and stash them (lhs is an arbitrary human-readable name)
  log: Boolean = false // Print a log message about passing the breakpoint describing the results of this action
)
```

To see an example of this used to validate the constant prop analysis see [/src/test/scala/InterpretTestConstProp.scala](../../src/test/scala/InterpretTestConstProp.scala).

### Resource Limit

This kills the interpreter in an error state once a specified instruction count is reached, to avoid the interpreter running forever on infinite loops.

It can be used simply with the function `interpretRLimit`, which automatically ignores the initialisation instructions.

```scala
def interpretRLimit(p: IRContext, instructionLimit: Int) : InterpreterState
```

It can also be combined with other interpreters as shown:

```scala
def interp(p: IRContext, instructionLimit: Int) : (InterpreterState, Trace) = {
  val interpreter = LayerInterpreter(tracingInterpreter(NormalInterpreter), EffectsRLimit(instructionLimit))
  val initialState = InterpFuns.initProgState(NormalInterpreter)(p, InterpreterState())
  BASILInterpreter(interpreter).run((initialState, Trace(List())), 0)._1
}
```
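
Independently of the BASIL types, the idea of a resource limit can be sketched as a step loop that counts executed steps. This is an assumption-laden illustration: `runWithLimit` and its `step` function are hypothetical and do not reflect how `EffectsRLimit` is implemented.

```scala
object RLimitSketch {
  // Hypothetical step function: Some(next) to continue, None once the program has halted.
  def runWithLimit[S](step: S => Option[S], init: S, limit: Int): Either[String, S] = {
    @annotation.tailrec
    def go(s: S, executed: Int): Either[String, S] =
      if (executed >= limit) Left(s"resource limit of $limit steps reached")
      else step(s) match {
        case Some(next) => go(next, executed + 1)
        case None => Right(s)
      }
    go(init, 0)
  }
}
```

A program that loops forever ends in `Left`, mirroring the error state mentioned above, while a terminating program returns its final state in `Right`.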

## Implementation / Code Structure

### Summary

- [Bitvector.scala](../../src/main/scala/ir/eval/Bitvector.scala)
  - Evaluation of bitvector operations, throws `IllegalArgumentException` on violation of contract
    (e.g. negative divisor, type mismatch)
- [ExprEval.scala](../../src/main/scala/ir/eval/ExprEval.scala)
  - Evaluation of expressions, defined in terms of partial evaluation down to a Literal
  - This can also be used to evaluate expressions in static analyses, by passing a function to query variable assignments and memory state from the value domain.
- [Interpreter.scala](../../src/main/scala/ir/eval/Interpreter.scala)
  - Definition of core `Effects[S, E]` and `Interpreter[S, E]` types describing state transitions in
    the interpreter
  - Instantiation/definition of `Effects` for concrete state `InterpreterState`
- [InterpreterProduct.scala](../../src/main/scala/ir/eval/InterpreterProduct.scala)
  - Definition of product and layering composition of generic `Effects[S, E]` interpreters
- [InterpretBasilIR.scala](../../src/main/scala/ir/eval/InterpretBasilIR.scala)
  - Definition of `Eval` object defining expression evaluation in terms of `Effects[S, InterpreterError]`
  - Definition of `Interpreter` instance for BASIL IR, using a generic `Effects` instance and concrete state.
  - Definition of ELF initialisation in terms of generic `Effects[S, InterpreterError]`
- [InterpretBreakpoints.scala](../../src/main/scala/ir/eval/InterpretBreakpoints.scala)
  - Definition of a generic interpreter with a breakpoint checker layered on top
- [InterpretRLimit.scala](../../src/main/scala/ir/eval/InterpretRLimit.scala)
  - Definition of layered interpreter which terminates after a specified cycle count
- [InterpretTrace.scala](../../src/main/scala/ir/eval/InterpretTrace.scala)
  - Definition of a generic interpreter which records a trace of calls to the `Effects[]` instance.

### Explanation

The interpreter is structured for compositionality. At its core is the `Effects[S, E]` type, defined in [Interpreter.scala](../../src/main/scala/ir/eval/Interpreter.scala).
This type defines a small set of functions which describe all the possible state transformations over a concrete state `S` and error type `E` (always `InterpreterError` in practice).

This is implemented using the state Monad, `State[S,V,E]` where `S` is the state, `V` the value, and `E` the error type.
This is a flattened `State[S, Either[E]]`, defined in [util/functional.scala](../../src/main/scala/util/functional.scala).
`Effects` methods return delayed computations, functions from an input state (`S`) to a resulting state and a value (`(S, Either[E, V])`).
These are sequenced using `flatMap` (monad bind), or the `for{} yield()` syntax sugar for flatMap.
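
For readers unfamiliar with this pattern, the following is a minimal sketch of a state monad with an error channel. `ToyState` and `tick` are hypothetical names written for this example; the real `State[S,V,E]` in `util/functional.scala` may differ in both API and details.

```scala
object StateSketch {
  // A delayed computation from a state S to a new state and either an error E or a value V.
  final case class ToyState[S, V, E](run: S => (S, Either[E, V])) {
    // Monad bind: run this computation, then feed its value to f; errors short-circuit.
    def flatMap[B](f: V => ToyState[S, B, E]): ToyState[S, B, E] = ToyState { s =>
      run(s) match {
        case (s1, Right(v)) => f(v).run(s1)
        case (s1, Left(e)) => (s1, Left(e))
      }
    }
    def map[B](f: V => B): ToyState[S, B, E] =
      flatMap[B](v => ToyState.pure[S, B, E](f(v)))
  }

  object ToyState {
    def pure[S, V, E](v: V): ToyState[S, V, E] = ToyState(s => (s, Right(v)))
    def get[S, E]: ToyState[S, S, E] = ToyState(s => (s, Right(s)))
    def put[S, E](s: S): ToyState[S, Unit, E] = ToyState(_ => (s, Right(())))
    def fail[S, V, E](e: E): ToyState[S, V, E] = ToyState(s => (s, Left(e)))
  }

  // Increment a counter held in the state, failing once it has reached max.
  def tick(max: Int): ToyState[Int, Int, String] =
    for {
      n <- ToyState.get[Int, String]
      _ <- if (n >= max) ToyState.fail[Int, Unit, String]("limit")
           else ToyState.put[Int, String](n + 1)
    } yield n + 1
}
```

Nothing runs until `run` is applied to an initial state, e.g. `StateSketch.tick(3).run(0)`; this is what "delayed computation" means here.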

This `Effects[S, E]` is instantiated for a given concrete state, the main example of which is `NormalInterpreter <: Effects[InterpreterState, InterpreterError]`,
also defined in `Interpreter.scala`. The memory component of the state is abstracted further into the `MemoryState` object.

The actual execution of code is defined on top of this, in the `Interpreter[S, E]` type, which takes an instance of `Effects` as a parameter,
and defines both the small step (`interpretOne`) over one instruction, and the fixed point to termination from some initial state in `run()`.
The fact that the stepping is defined outside the effects is important, as it allows concrete states, and the state transitions over them, to be
composed somewhat arbitrarily, with the interpretation of the language compiled down to calls to the resulting instance of `Effects`.

This is defined in [InterpretBasilIR.scala](../../src/main/scala/ir/eval/InterpretBasilIR.scala). `BASILInterpreter` defines an
`Interpreter` over an arbitrary instance of `Effects[S, InterpreterError]`, encoding BASIL IR commands as effects.
This file also contains definitions of the initial memory state setup of the interpreter, based on the ELF sections and symbol table.

### Composition of interpreters

There are two ways to compose `Effects`, product and layer. Both produce an instance of `Effects[(L, R), E]`,
where `L` and `R` are the concrete state types of the two Effects being composed.

Product runs the two effects, over two different concrete state types, simultaneously without interaction.

Layer runs the `before` effect first, and passes its state to the `inner` effect whose value is returned.

```scala
case class ProductInterpreter[L, T, E](val inner: Effects[L, E], val before: Effects[T, E]) extends Effects[(L, T), E] {
case class LayerInterpreter[L, T, E](val inner: Effects[L, E], val before: Effects[(L, T), E])
```

Examples of using these are in the `interpretTrace` and `interpretWithBreakPoints` interpreters respectively.

Note that this only works because of the aforementioned requirement that all effect calls come from outside the `Effects[]`
instance itself. In the simple case, the `Interpreter` instance is the only object calling `Effects`.
This means that effects triggered by an inner `Effects[]` instance do not flow back to the `ProductInterpreter`;
they only appear when the `Interpreter` above the `ProductInterpreter` interprets the program via effect calls.
For this reason if, for example, `NormalInterpreter` makes effect calls they will not appear in a trace emitted by `interpretTrace`.
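
The two composition styles can be sketched on a toy effect interface with a single operation. Everything here (`ToyEffects`, `EnvEffects`, `LogEffects`, `ToyProduct`, `ToyLayer`) is hypothetical, omits the error type and the state monad, and reflects only one reading of the description above, not the code in `InterpreterProduct.scala`.

```scala
object ComposeSketch {
  // Hypothetical effect interface: a single state transition for storing a variable.
  trait ToyEffects[S] {
    def storeVar(name: String, value: Int)(s: S): S
  }

  // Concrete state: a variable environment.
  object EnvEffects extends ToyEffects[Map[String, Int]] {
    def storeVar(name: String, value: Int)(s: Map[String, Int]): Map[String, Int] =
      s.updated(name, value)
  }

  // A second state that only records which effect calls were made.
  object LogEffects extends ToyEffects[List[String]] {
    def storeVar(name: String, value: Int)(s: List[String]): List[String] =
      s :+ s"StoreVar($name,$value)"
  }

  // Product: run both effects side by side on a pair of states, without interaction.
  final case class ToyProduct[L, R](left: ToyEffects[L], right: ToyEffects[R]) extends ToyEffects[(L, R)] {
    def storeVar(name: String, value: Int)(s: (L, R)): (L, R) =
      (left.storeVar(name, value)(s._1), right.storeVar(name, value)(s._2))
  }

  // Layer: `before` sees the whole pair first, then `inner` updates its own component.
  final case class ToyLayer[L, T](inner: ToyEffects[L], before: ToyEffects[(L, T)]) extends ToyEffects[(L, T)] {
    def storeVar(name: String, value: Int)(s: (L, T)): (L, T) = {
      val (l, t) = before.storeVar(name, value)(s)
      (inner.storeVar(name, value)(l), t)
    }
  }
}
```

Because the caller only ever invokes the outermost instance, the log in the product case records exactly the calls made from outside, which matches the observation about traces above.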

### Note on memory space initialisation

Most of the interpret functions are overloaded such that there is a version taking a program `interpret(p: Program)`,
and a version taking `IRContext`. The variant taking IRContext uses the ELF symbol information to initialise the
memory before interpretation. If you are interpreting a real program (i.e. not a synthetic example created through
the DSL), this is most likely required.

We initialise:

- The general interpreter state, stack and memory regions, stack pointer, a symbolic mapping from addresses to functions
- The initial and readonly memory sections stored in Program
- The `.bss` section to zero
- The relocation table. At each listed offset we store the address of either a real procedure in the program, or a
location storing a symbolic function pointer to an intrinsic function.

`.bss` is generally the top of the initialised data, the ELF symbol `__bss_end__` being equal to the symbol `__end__`.
Above this we can choose somewhat arbitrarily where to put things; usually the heap is above, followed by
dynamically linked symbols, then the stack. There is currently no stack overflow checking, or heap implemented in the
interpreter.

Unfortunately these details are defined by the load-time linker and the system's linker script, and it is hard to find a good description
of their behaviour. Some details are described here https://refspecs.linuxfoundation.org/elf/elf.pdf, and here
https://dl.acm.org/doi/abs/10.1145/2983990.2983996.

### Missing features

- There is functionality to implement external function calls via intrinsics written in Scala code, but currently only
basic printf style functions are implemented as no-ops. These can be extended to use a file IO abstraction, where
a memory region is created for each file (e.g. stdout), with a variable to keep track of the current write-point
such that a file write operation stores to the write-point address, and increments it by the size of the store.
Importantly, an implementation of malloc() and free() is needed, which can implement a simple greedy allocation
algorithm.
- Despite the presence of procedure parameters in the current IR, they are not used by the Boogie translation and
are hence similarly ignored in the interpreter.
- The interpreter's immutable state representation is motivated by the ability to easily implement a sound approach
to non-determinism, e.g. to implement GoTos with guessing and rollback rather than look-ahead. This is more
useful for checking specification constructs than executing real programs, so is not yet implemented.
- The trace does not clearly distinguish internal vs external calls, or observable
and non-observable behaviour.
- While the interpreter semantics supports memory regions, we do not initialise the memory regions (or the initial memory state)
based on those present in the program, we simply assume a flat `mem` and `stack` memory partitioning.
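
As an illustration of the simple greedy allocation mentioned in the first item of this list, here is a sketch of a bump allocator over an immutable heap description. `ToyHeap`, `malloc` and `free` are hypothetical and are not the interpreter's intrinsics; alignment and reuse of freed space are deliberately ignored.

```scala
object AllocSketch {
  // Hypothetical allocator state: the next free address and the live allocations (base -> size).
  final case class ToyHeap(next: BigInt, live: Map[BigInt, BigInt])

  // Greedy (bump) allocation: hand out the next free address and never reuse freed space.
  def malloc(h: ToyHeap, size: BigInt): (ToyHeap, BigInt) = {
    val base = h.next
    (ToyHeap(base + size, h.live.updated(base, size)), base)
  }

  // free only forgets the allocation; it returns None for an address that is not live.
  def free(h: ToyHeap, base: BigInt): Option[ToyHeap] =
    if (h.live.contains(base)) Some(h.copy(live = h.live - base)) else None
}
```

Keeping the allocator state immutable fits the interpreter's immutable state representation, since each call returns a new heap value rather than mutating one in place.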


1 change: 1 addition & 0 deletions docs/development/readme.md
@@ -5,6 +5,7 @@
- [tool-installation](tool-installation.md) Guide to lifter, etc. tool installation
- [scala](scala.md) Advice on Scala programming.
- [cfg](cfg.md) Explanation of the old CFG datastructure
- [interpreter](interpreter.md) Explanation of IR interpreter


## Scala