PerfSim
- We want to simulate the target processor quickly
- We also want to construct the simulator quickly
Partitioned simulators are a known technique:
- They simplify the timing model
- They amortize functional model design effort over many models
Within partitioned simulation, there are many potential ways for these functional/timing partitions to interact. Mauer, Hill and Wood (Mauer 2002, ACM SIGMETRICS) categorized such simulators as functional-first (traditionally called trace-driven), timing-directed, and timing-first.
In the functional-first scheme a functional model is used to generate an execution trace that is fed into a timing model, which adds microprocessor-specific timing information to the trace.
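As a minimal sketch of the functional-first scheme (not how any particular simulator implements it), the functional model below produces an execution trace and a separate timing model annotates it with cycle counts. The opcode names, the `LATENCY` table, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TraceRecord:
    pc: int
    opcode: str  # hypothetical opcode classes: "alu", "load", "store"


def functional_model(program):
    """Execute the program functionally, yielding one trace record per instruction."""
    for index, opcode in enumerate(program):
        yield TraceRecord(pc=index * 4, opcode=opcode)


# Hypothetical per-opcode latencies; a real timing model is far more detailed.
LATENCY = {"alu": 1, "load": 3, "store": 2}


def timing_model(trace):
    """Add timing information to the trace: the cycle at which each record completes."""
    cycle = 0
    annotated = []
    for record in trace:
        cycle += LATENCY[record.opcode]
        annotated.append((record, cycle))
    return annotated


timed = timing_model(functional_model(["alu", "load", "alu", "store"]))
total_cycles = timed[-1][1]
```

Note that the timing model never calls back into the functional model here: the trace flows in one direction only, which is what distinguishes this scheme from the execution-driven ones.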
A timing-directed simulator, in contrast, is an execution-driven simulator where the timing model invokes operations on the functional model at the right time.
In the timing-first style, timing is calculated first, and then a functional model is invoked to verify the results. Contemporary partitioned software simulators include Asim and MASE.
We believe that a timing-directed solution will ultimately lead to the best performance.
Generally, the functional partition should provide the following interface operations:
Operation | Parameters | Return value | Effect |
---|---|---|---|
Get Instruction | Addr | Instr | Fetch the instruction at this address and place it in flight. |
Get Dependencies | Instr | Deps | Get the dependencies of this instruction relative to other in-flight instructions. |
Get Operands | Instr | Srcs | Read the register file and prepare the instruction for execution. |
Get Results | Instr | Result | Execute the instruction and return the result, including branch information. For loads and stores, do effective address calculation. |
Read Memory (Do Loads) | Instr | Value | Perform any memory reads associated with the instruction. |
Speculatively Write Memory | Instr | -- | Make any memory writes visible to local loads. |
Commit | Instr | -- | Commit the instruction’s local changes and remove it from being in-flight. |
Abort | Instr | -- | Abort the instruction’s local changes and remove it from being in-flight. |
Write Memory | Instr | -- | Make any memory writes globally visible. |
These operations roughly correspond to stages in a traditional microprocessor pipeline, with additional support for controlling the precise timing of store operations.
The order in which the timing partition invokes these operations determines the state of the machine at any given moment.
According to Joel Emer, all data dependencies can be represented via these phases.
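The operations in the table above could be written down as an abstract interface. The following is only a sketch: the class name and the snake_case method names are assumptions chosen to follow Python conventions, and the signatures simply mirror the Parameters column of the table.

```python
from abc import ABC, abstractmethod


class FunctionalPartition(ABC):
    """Hypothetical interface: one abstract method per row of the table."""

    @abstractmethod
    def get_instruction(self, addr):
        """Fetch the instruction at addr and place it in flight."""

    @abstractmethod
    def get_dependencies(self, instr):
        """Return dependencies relative to other in-flight instructions."""

    @abstractmethod
    def get_operands(self, instr):
        """Read the register file and prepare the instruction for execution."""

    @abstractmethod
    def get_results(self, instr):
        """Execute the instruction; for loads and stores, compute the address."""

    @abstractmethod
    def read_memory(self, instr):
        """Perform any memory reads associated with the instruction."""

    @abstractmethod
    def speculatively_write_memory(self, instr):
        """Make memory writes visible to local loads."""

    @abstractmethod
    def commit(self, instr):
        """Make local changes permanent; the instruction leaves flight."""

    @abstractmethod
    def abort(self, instr):
        """Undo local changes; the instruction leaves flight."""

    @abstractmethod
    def write_memory(self, instr):
        """Make memory writes globally visible."""
```

A timing model written against such an interface would depend only on these nine calls, so one functional partition could in principle serve many timing models.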
For a single in-flight instruction, the operations are typically invoked in order (operations which do not apply may be skipped). This corresponds to instructions flowing through pipeline stages in a real computer:
- the instruction is fetched (`getInstruction`) before
- it is decoded (`getDependencies`), which takes place before
- register read (`getOperands`), and so on.
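The per-instruction ordering described above, including the skipping of operations that do not apply, can be sketched as a simple filter over the operation list. The instruction kinds and the `SKIPPED` table are illustrative assumptions, not part of the interface definition.

```python
# Operation order for one in-flight instruction that is not aborted,
# following the order of the rows in the interface table.
ORDER = [
    "get_instruction",
    "get_dependencies",
    "get_operands",
    "get_results",
    "read_memory",
    "speculatively_write_memory",
    "commit",
    "write_memory",
]

# Hypothetical instruction kinds and the operations that do not apply to them.
SKIPPED = {
    "alu": {"read_memory", "speculatively_write_memory", "write_memory"},
    "load": {"speculatively_write_memory", "write_memory"},
    "store": {"read_memory"},
}


def operations_for(kind):
    """Return the operations a timing model would invoke, in order, for one instruction."""
    return [op for op in ORDER if op not in SKIPPED[kind]]


alu_ops = operations_for("alu")
store_ops = operations_for("store")
```

For an ALU instruction this leaves fetch, decode, register read, execute, and commit; a store additionally passes through both the local and the global memory write.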
The order in which the timing model invokes these operations on separate in-flight instructions determines the state of the machine. We can conceive of a timing model which fetches ten instructions before decoding one, for example.
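The example of fetching ten instructions before decoding any of them might look like the sketch below. `RecordingFunctionalModel` is a toy stand-in that only records which operations were invoked, and its dependency rule (every older in-flight instruction) is deliberately crude; all names are hypothetical.

```python
class RecordingFunctionalModel:
    """Toy functional partition: it tracks in-flight instructions and logs calls."""

    def __init__(self):
        self.in_flight = []
        self.log = []

    def get_instruction(self, addr):
        instr = f"instr@{addr:#x}"
        self.in_flight.append(instr)
        self.log.append(("get_instruction", instr))
        return instr

    def get_dependencies(self, instr):
        # Crude assumption: depend on every older instruction still in flight.
        self.log.append(("get_dependencies", instr))
        return self.in_flight[: self.in_flight.index(instr)]


def wide_fetch_timing_model(fm, start_addr, fetch_width=10):
    """Fetch fetch_width instructions first, and only then decode them in order."""
    fetched = [fm.get_instruction(start_addr + 4 * i) for i in range(fetch_width)]
    return {instr: fm.get_dependencies(instr) for instr in fetched}


fm = RecordingFunctionalModel()
deps = wide_fetch_timing_model(fm, 0x1000)
```

Because all ten fetches happen before the first decode, the last instruction already sees nine older instructions in flight when its dependencies are computed. A timing model that interleaved fetch and decode would issue the same calls in a different order.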
The distinction between local writes and global writes allows for accurate control of inter-thread communication. Because the timing model uses these operations to decide exactly when, in modeled time, data becomes visible, it also controls the precise timing of that communication. This is a key attribute of a closely-coupled functional partition.
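One way to picture the local/global distinction is a per-thread store buffer in front of shared memory. This is a simplified sketch under stated assumptions: the class names are hypothetical, the methods take an address and value directly rather than an in-flight instruction, and unwritten memory reads as 0.

```python
class SharedMemory:
    """Globally visible memory, shared by all thread contexts."""

    def __init__(self):
        self.cells = {}


class ThreadContext:
    """Toy per-thread view: speculative stores wait in a local buffer."""

    def __init__(self, memory):
        self.memory = memory
        self.local_stores = {}  # addr -> value, visible only to this thread

    def speculatively_write_memory(self, addr, value):
        # The write becomes visible to local loads only.
        self.local_stores[addr] = value

    def read_memory(self, addr):
        # Local stores take priority over globally visible memory.
        if addr in self.local_stores:
            return self.local_stores[addr]
        return self.memory.cells.get(addr, 0)

    def write_memory(self, addr):
        # The write becomes globally visible and leaves the local buffer.
        self.memory.cells[addr] = self.local_stores.pop(addr)


memory = SharedMemory()
t0, t1 = ThreadContext(memory), ThreadContext(memory)

t0.speculatively_write_memory(0x10, 42)
local_view = t0.read_memory(0x10)
other_view_before = t1.read_memory(0x10)

t0.write_memory(0x10)
other_view_after = t1.read_memory(0x10)
```

The modeled time at which the timing model calls `write_memory` is then exactly the moment the second thread can observe the store.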
As the functional partition executes each operation, it changes the architectural state of the simulator, and thus the result of subsequent operations. For example, executing `getResult()` on an instruction which writes a register means that a subsequent `getOperands()` call will see the new value of that register. If an instruction is executed in some way which is not consistent with program order, the `abort()` operation undoes its effects and allows it to be retried. All operations are speculative and may be aborted until the `commit()` operation is called, at which point they become permanent.
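The speculative behaviour described above can be illustrated with an undo log over a register file. This is a minimal sketch, not the actual implementation: the class and its fields are hypothetical, the result value is passed in rather than computed, and aborts are assumed to happen youngest-first so that restoring the previous value is sufficient.

```python
class ToyFunctionalState:
    """Register file whose writes stay undoable until the instruction commits."""

    def __init__(self, num_regs=32):
        self.regs = [0] * num_regs
        self.undo_log = {}  # instr id -> (dest register, previous value)

    def get_operands(self, srcs):
        return [self.regs[r] for r in srcs]

    def get_result(self, instr_id, dest, value):
        # Remember the old value so the write can be rolled back on abort.
        self.undo_log[instr_id] = (dest, self.regs[dest])
        self.regs[dest] = value
        return value

    def abort(self, instr_id):
        # Undo the instruction's effect; it may then be retried.
        dest, previous = self.undo_log.pop(instr_id)
        self.regs[dest] = previous

    def commit(self, instr_id):
        # Discard the undo record: the write is now permanent.
        del self.undo_log[instr_id]


state = ToyFunctionalState()
state.get_result(1, dest=5, value=7)
seen_speculatively = state.get_operands([5])

state.abort(1)
seen_after_abort = state.get_operands([5])

state.get_result(1, dest=5, value=9)  # retry the same instruction
state.commit(1)
seen_after_commit = state.get_operands([5])
```

After `commit`, no undo record remains for the instruction, so a later `abort` of it would fail with a `KeyError` in this sketch, which mirrors the rule that committed operations are permanent.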
Figure: three different timing models executing the same instruction sequence.