Chaos Compiler Collection

A library and set of command line tools for parsing debugging symbols from PS2 games. The 1.x series of releases focused on STABS symbols in .mdebug sections, while the 2.x series can also parse standard ELF symbols and SNDLL linker symbols. DWARF support is in the works.

Tools

demangle

C++ symbol demangler with support for both the new Itanium C++ ABI (GCC 3+) mangling scheme and the old GCC 2 scheme.
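To illustrate the difference between the two mangling schemes, here is a rough sketch in Python that guesses which scheme a symbol uses. It is not how the demangle tool works (that relies on the GNU demangler). The function name `guess_mangling_scheme` is hypothetical, and the GCC 2 check is only a heuristic that deliberately ignores special cases such as constructors and operators.

```python
def guess_mangling_scheme(symbol: str) -> str:
    """Guess the C++ mangling scheme of a symbol name (hypothetical helper).

    Itanium ABI names begin with "_Z", e.g. "_Z3foov" for foo().
    GCC 2 names usually put "__" between the function name and its
    encoded signature, e.g. "foo__Fv" for foo(void).
    """
    if symbol.startswith("_Z"):
        return "itanium"
    # Search from index 1 so that a leading "__" (as in "__start") is not
    # mistaken for the name/signature separator.
    separator = symbol.find("__", 1)
    if separator != -1 and separator + 2 < len(symbol):
        return "gcc2-candidate"
    return "unmangled"
```

A plain C identifier that happens to contain a double underscore would be reported as a candidate, so the result is best treated as a hint rather than a verdict.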

objdump

Half-working EE core MIPS disassembler. Probably not too interesting.

stdump

Symbol table parser and dumper. It can extract the following information:

  • Data types (structs, unions, enums, etc)
  • Functions (name, return type, parameters and local variables)
  • Global variables

The following output formats are supported:

  • C++
  • JSON

The output is intended to be used with ghidra-emotionengine-reloaded (>= 2.1.0 or one of the unstable builds) to import all of this information into Ghidra. Note that, despite the name, the STABS analyzer should also work for the R3000 (IOP) and possibly other MIPS processors.

uncc

This is similar to stdump, except that it organizes its output into separate source files and has a number of extra features designed to bring that output closer to valid source code. A SOURCES.txt file must be provided in the output directory. It can be generated using the stdump files command, after which you should fix up the paths manually so that they are relative to the output directory and remove the addresses. Additionally, non-empty files that do not start with // STATUS: NOT STARTED will not be overwritten.
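The overwrite rule can be sketched as a small predicate. This is a minimal illustration in Python, not uncc's actual implementation. The name `may_overwrite` is hypothetical, and the sketch assumes that the marker must match exactly at the very start of the file and that a missing file may simply be created.

```python
from pathlib import Path

# Marker text taken from the rule described above.
STATUS_MARKER = b"// STATUS: NOT STARTED"


def may_overwrite(path: Path) -> bool:
    """Return True if the file at `path` may be (re)generated.

    Hypothetical helper: assumes an exact prefix match at byte 0.
    """
    if not path.exists():
        return True  # assumption: nothing to protect, so the file is created
    data = path.read_bytes()
    if not data:
        return True  # empty files are not protected
    return data.startswith(STATUS_MARKER)
```

In practice this means that changing or removing the status line at the top of a file you have edited by hand is what keeps it from being regenerated.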

If a FUNCTIONS.txt file is provided in the output directory, which can be generated using the included CCCDecompileAllFunctions.java script for Ghidra, the code from that file will be used to populate the function bodies in the output. In this case, the first group of local variable declarations emitted will be those recovered from the symbols, and the second group will be those from the code provided in the functions file. Function names are demangled.

Global variable data will be printed in a structured way based on its data type.

Data types will be sorted into their corresponding files. Since this information is not stored in the symbol table, uncc uses heuristics to map types to files. Types are placed in .c or .cpp files when they appear in only a single translation unit, and in .h files when they appear in multiple translation units, which is the case where heuristics are needed to decide where to put them.
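The single-versus-multiple translation unit rule can be sketched as follows. This is an illustrative Python sketch only: the function names and the input shape (a mapping from type name to the set of translation units it appears in) are assumptions, and the heuristics uncc uses to pick a specific header are not modelled here.

```python
from typing import Dict, Set


def destination_kind(translation_units: Set[str]) -> str:
    """Classify a type as belonging in a source file or a header.

    Hypothetical helper: "source" means a .c or .cpp file, "header" a .h file.
    """
    if not translation_units:
        raise ValueError("a type must appear in at least one translation unit")
    # One translation unit: the type can live next to its only user.
    # Several: it has to go in a header, and choosing which header is
    # where heuristics would come in (not modelled in this sketch).
    return "source" if len(translation_units) == 1 else "header"


def classify_types(type_to_units: Dict[str, Set[str]]) -> Dict[str, str]:
    """Apply destination_kind to every type in the mapping."""
    return {name: destination_kind(units) for name, units in type_to_units.items()}
```

For example, a type seen only in one translation unit would be classified as "source", while a type seen in two or more would be classified as "header".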

Use of a code formatter such as clang-format or astyle on the output is recommended.

Building

cmake -B bin/
cmake --build bin/

Documentation

Chaos Compiler Collection

DWARF (.debug) Section

MIPS ABI

MIPS Debug (.mdebug) Section

STABS

License

The source code for the CCC library and associated command line tools is released under the MIT license.

The GNU demangler is used, which contains source files licensed under the GPL and the LGPL. RapidJSON is used under the MIT license. The GoogleTest library is used by the test suite under the 3-Clause BSD license.