CGEN LLVM-IR is a generator of binary-to-LLVM-IR translators. Just provide the CPU architecture.

leonardoarcari/cgen-llvm-ir-generator

CGEN LLVM IR generator

CGEN LLVM IR generator is an extension of the CGEN framework that attempts to generate C++ translators, which emit LLVM IR code semantically equivalent to an input binary file.

Ideally, given an RTL CPU architecture description, CGEN LLVM IR generator produces a C++ program that accepts a stream containing a binary program compiled for that architecture. The generated program disassembles the binary input and translates it into a semantically equivalent LLVM IR program.

Roadmap

Here is a brief list of the tasks to complete for a working prototype of the generator.

  • CPU registers allocation
    • Global variables allocation
    • Test correctness against .cpu files
  • Disassembler
    • Read an instruction word from a byte stream
    • Decode instruction opcode
    • Decode instruction fields into in-memory objects
    • Provide dump() facilities
    • Test against available .cpu files
  • Semantic translator
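
To make the disassembler steps in the roadmap more concrete, here is a minimal hand-written sketch of what they amount to. It is not output of the generator. It assumes a hypothetical fixed-width, 32-bit, big-endian ISA with a 6-bit opcode, two 5-bit register fields, and a 16-bit immediate. The names `Insn`, `readWord`, and `decode` are illustrative only and do not come from the generated sources.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical in-memory form of one decoded instruction.
// The field layout is an assumption made for this sketch only.
struct Insn {
  uint32_t word;   // raw instruction word
  uint8_t opcode;  // bits 31..26
  uint8_t rd;      // bits 25..21
  uint8_t rs;      // bits 20..16
  uint16_t imm;    // bits 15..0

  // Human-readable rendering, in the spirit of a dump() facility.
  std::string dump() const {
    std::ostringstream os;
    os << "op=" << int(opcode) << " rd=r" << int(rd)
       << " rs=r" << int(rs) << " imm=" << imm;
    return os.str();
  }
};

// Read one 32-bit big-endian instruction word from a byte stream.
// Returns false if fewer than four bytes are available.
bool readWord(std::istream &in, uint32_t &word) {
  unsigned char b[4];
  if (!in.read(reinterpret_cast<char *>(b), 4))
    return false;
  word = (uint32_t(b[0]) << 24) | (uint32_t(b[1]) << 16) |
         (uint32_t(b[2]) << 8) | uint32_t(b[3]);
  return true;
}

// Split an instruction word into its opcode and operand fields.
Insn decode(uint32_t word) {
  Insn i;
  i.word = word;
  i.opcode = (word >> 26) & 0x3F;
  i.rd = (word >> 21) & 0x1F;
  i.rs = (word >> 16) & 0x1F;
  i.imm = word & 0xFFFF;
  return i;
}
```

A driver would typically open the binary with `std::ifstream` in `std::ios::binary` mode, call `readWord` in a loop until it returns false, and pass each word to `decode`. A real CGEN-described architecture may have variable-length instructions and many instruction formats, so the single fixed layout above is a simplification.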

Hands-on

To run CGEN LLVM IR generator, a convenient Python script is provided that hides the quirks of Scheme and its Guile implementation.

Prerequisites

We assume you have a working Guile 1.8 environment set up on your machine, with the guile executable available on your system PATH. A guide to compiling and installing it is available here.

You also need LLVM 3.8.0 installed, including its development headers, and CMake must be able to find the LLVM CMake find script. Optionally, install clang-format to have the generated C++ source files formatted.

Running CGEN-IR:

$ ./cgen-ir.py --help
usage: cgen-ir.py [-h] -a ARCH -m MACHINE [-i ISA] [-t DEC_H] [-d DEC_CPP]
                  [-r REG_H]
                  dstPath

A generator of LLVM-IR generators. Yes.

positional arguments:
  dstPath               Destination path

optional arguments:
  -h, --help            show this help message and exit
  -a ARCH, --arch ARCH  .cpu description file
  -m MACHINE, --machine MACHINE
                        Variant of the architecture
  -i ISA, --isa ISA     ISA name of the architecture
  -t DEC_H, --decoder-header DEC_H
                        Decoder header filename
  -d DEC_CPP, --decoder-src DEC_CPP
                        Decoder source filename
  -r REG_H, --registers-header REG_H
                        Registers allocation source filename

You are required to provide at least:

-a ARCH     .cpu description file path
-m MACHINE    the machine you want to generate translators for (e.g.: arch700)
dstPath     destination directory in which to generate the sources

To specify the names of the generated source files manually, use the -t, -d, and -r arguments.

Note: If your target .cpu file defines multiple ISAs, you must provide a -i argument declaring which one you want to generate a translator for.

Compile generated translator

The cgen-ir.py script generates the source files along with a CMakeLists.txt file and a driver (i.e. main.cpp) that is not guaranteed to work as-is. You can compile the generated translator with the usual CMake process:

$ cd dstPath
$ mkdir build
$ cd build
$ cmake ..
$ make

Examples

An example of CGEN LLVM IR generator usage for the ARC700 architecture is available here.
