The symbol database provides a unified in-memory and on-disk representation for debugging symbols, including those imported from several types of symbol tables and those defined manually by the user.
Integer handles are used to represent references to symbols. Symbols are stored in lists with other symbols of the same type, sorted by their symbol handles. To look up a symbol by its handle, a binary search is performed on the list.
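As a rough illustration of the handle lookup described above, here is a minimal sketch. The `Symbol` and `SymbolList` types, their member names, and the `u32` alias are simplified stand-ins invented for this example, not the project's real classes. It assumes handles are handed out in increasing order, so appending a new symbol keeps the list sorted.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using u32 = std::uint32_t; // stand-in for the project's integer typedefs

// Hypothetical, heavily simplified symbol.
struct Symbol {
	u32 handle;
	std::string name;
};

// Hypothetical list that keeps its symbols sorted by handle.
struct SymbolList {
	std::vector<Symbol> symbols;
	u32 next_handle = 0;

	// Handles only ever increase, so appending preserves the sort order.
	u32 create_symbol(std::string name) {
		u32 handle = next_handle++;
		symbols.push_back(Symbol{handle, std::move(name)});
		return handle;
	}

	// Binary search for a handle. Returns nullptr if no such symbol exists,
	// for example because it has been deleted.
	Symbol* symbol_from_handle(u32 handle) {
		auto it = std::lower_bound(symbols.begin(), symbols.end(), handle,
			[](const Symbol& symbol, u32 h) { return symbol.handle < h; });
		if (it == symbols.end() || it->handle != handle) {
			return nullptr;
		}
		return &*it;
	}

	// Erasing an element leaves the remaining symbols in sorted order.
	bool destroy_symbol(u32 handle) {
		Symbol* symbol = symbol_from_handle(handle);
		if (symbol == nullptr) {
			return false;
		}
		symbols.erase(symbols.begin() + (symbol - symbols.data()));
		return true;
	}
};
```

Because a handle is never reused in this sketch, a stale handle simply fails the lookup instead of silently resolving to a different symbol.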
The symbol list classes maintain maps that can be used to look up a symbol by its address or name where that makes sense.
In addition, you can look up a symbol from an address that it overlaps. For example, it is possible to find the function that an instruction belongs to by looking up its address.
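One way an overlapping-address lookup like this could work is sketched below. It assumes symbols are kept in an ordered map keyed by start address and that their ranges do not overlap each other. `FunctionRange`, `FunctionMap`, and `handle_from_contained_address` are hypothetical names chosen for this example.

```cpp
#include <cstdint>
#include <map>
#include <optional>

using u32 = std::uint32_t; // stand-in for the project's integer typedefs

// Hypothetical record of the address range covered by a function.
struct FunctionRange {
	u32 handle;
	u32 address;
	u32 size;
};

struct FunctionMap {
	// Ordered by start address so range queries are logarithmic.
	std::map<u32, FunctionRange> by_address;

	void add(u32 handle, u32 address, u32 size) {
		by_address.insert_or_assign(address, FunctionRange{handle, address, size});
	}

	// Find the function whose [address, address + size) range contains the
	// given address, if there is one.
	std::optional<u32> handle_from_contained_address(u32 address) const {
		// First function that starts strictly after the address.
		auto it = by_address.upper_bound(address);
		if (it == by_address.begin()) {
			return std::nullopt; // Every function starts after the address.
		}
		--it; // Last function that starts at or before the address.
		const FunctionRange& function = it->second;
		if (address - function.address >= function.size) {
			return std::nullopt; // The address falls in a gap after the function.
		}
		return function.handle;
	}
};
```

For instance, with a function registered at 0x1000 with size 0x20, looking up the instruction address 0x101c would yield that function's handle, while 0x1020 would yield nothing.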
Symbols keep track of which module they are associated with. This can be used, for example, to delete all the symbols for a given module when it is unloaded.
Data types are represented as trees of C/C++ AST nodes. They can contain type names that reference other data types by their symbol handle.
Specific nodes can be referenced using node handles, which bundle together an enum representing the type of symbol to be looked up, a symbol handle for said symbol, a pointer to the AST node, and a generation count used for detecting when the node handle has been invalidated.
Whenever an AST node is created, deleted, or moved, the invalidate_node_handles function must be called on the associated symbol. Otherwise, looking up a node handle pointing into that symbol may return an invalid pointer. The set_type function does this automatically.
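The generation count check can be sketched roughly as follows. This is a simplified illustration, not the real implementation: the `SymbolKind` enum, the struct layouts, `make_node_handle`, and `lookup_node` are assumptions made for the example. Only `invalidate_node_handles` is a name taken from the text above, and here the caller resolves the symbol itself rather than going through the database.

```cpp
#include <cstdint>

using u32 = std::uint32_t; // stand-in for the project's integer typedefs

// Hypothetical enum naming the type of symbol a node handle points into.
enum class SymbolKind { DataType, Function };

// Hypothetical AST node.
struct Node {
	int payload = 0;
};

// Hypothetical symbol that owns some AST nodes.
struct Symbol {
	u32 handle = 0;
	u32 generation = 0;

	// Must be called whenever a node owned by this symbol is created,
	// deleted, or moved. Bumping the counter makes old node handles stale.
	void invalidate_node_handles() { generation++; }
};

struct NodeHandle {
	SymbolKind kind;
	u32 symbol_handle;
	const Node* node;
	u32 generation; // Generation of the symbol when the handle was made.

	// The caller looks up the symbol by kind and handle first. The node
	// pointer is only returned if the generation still matches.
	const Node* lookup_node(const Symbol* symbol) const {
		if (symbol == nullptr) return nullptr;
		if (symbol->handle != symbol_handle) return nullptr;
		if (symbol->generation != generation) return nullptr;
		return node;
	}
};

NodeHandle make_node_handle(SymbolKind kind, const Symbol& symbol, const Node* node) {
	return NodeHandle{kind, symbol.handle, node, symbol.generation};
}
```

The stored pointer is never dereferenced unless the generations agree, which is why forgetting to bump the counter after moving a node would let a dangling pointer through.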
Symbols keep track of how they were created. Each part of the code that creates symbols calls get_symbol_source on the symbol database object to obtain a handle for a symbol source.
If a symbol source with a given name already exists, its handle will be reused; otherwise, a new symbol source object will be created.
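The get-or-create behaviour could look something like the sketch below. The `SymbolSource` and `SymbolDatabase` structs here are stripped-down stand-ins, and the real get_symbol_source may well have a different signature and error handling. Only the reuse-by-name logic is the point.

```cpp
#include <cstdint>
#include <string>
#include <vector>

using u32 = std::uint32_t; // stand-in for the project's integer typedefs

// Hypothetical symbol source record.
struct SymbolSource {
	u32 handle;
	std::string name;
};

// Hypothetical, minimal stand-in for the symbol database object.
struct SymbolDatabase {
	std::vector<SymbolSource> symbol_sources;
	u32 next_source_handle = 0;

	// Return the handle of the source with the given name, creating the
	// source first if it doesn't exist yet.
	u32 get_symbol_source(const std::string& name) {
		for (const SymbolSource& source : symbol_sources) {
			if (source.name == name) {
				return source.handle;
			}
		}
		u32 handle = next_source_handle++;
		symbol_sources.push_back(SymbolSource{handle, name});
		return handle;
	}
};
```

Calling it twice with the same name returns the same handle, so different parts of the code can share a source without coordinating beyond agreeing on the name.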
There is no proper support for transactions. However, because the symbol table importers only create new symbols, and each of those symbols is assigned a specific module handle, a failed import can be rolled back by deleting all symbols belonging to the new module, which restores the symbol database to its previous state.
To support this design, we never deduplicate data types that come from different modules or sources.
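The rollback strategy can be illustrated with the sketch below. Everything in it is hypothetical: the types, the `import_module` function, and the use of an empty name as a stand-in for a parse error are invented for the example. It only shows that deleting by module handle undoes a partial import without touching other modules.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

using u32 = std::uint32_t; // stand-in for the project's integer typedefs

// Hypothetical symbol that remembers which module it belongs to.
struct Symbol {
	u32 handle;
	u32 module_handle;
	std::string name;
};

struct SymbolList {
	std::vector<Symbol> symbols;
	u32 next_handle = 0;

	void create_symbol(u32 module_handle, const std::string& name) {
		symbols.push_back(Symbol{next_handle++, module_handle, name});
	}

	// Remove every symbol belonging to the given module. std::remove_if
	// keeps the relative order of the survivors, so the list stays sorted.
	void destroy_symbols_from_module(u32 module_handle) {
		symbols.erase(
			std::remove_if(symbols.begin(), symbols.end(),
				[&](const Symbol& symbol) { return symbol.module_handle == module_handle; }),
			symbols.end());
	}
};

// Hypothetical importer. An empty name stands in for a parse error.
bool import_module(SymbolList& list, u32 module_handle, const std::vector<std::string>& names) {
	for (const std::string& name : names) {
		if (name.empty()) {
			// Roll back: only symbols from the new module are deleted.
			list.destroy_symbols_from_module(module_handle);
			return false;
		}
		list.create_symbol(module_handle, name);
	}
	return true;
}
```

This only works because the importer never modifies or shares existing symbols. If data types were deduplicated across modules, deleting one module's symbols could leave another module's types pointing at nothing.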
To reduce code duplication, X macros are used wherever the same operation has to be performed for each type of symbol.
The following macro is defined in symbol_database.h:
#define CCC_FOR_EACH_SYMBOL_TYPE_DO_X \
CCC_X(DataType, data_types) \
CCC_X(Function, functions) \
...
To use this macro, you would first define CCC_X as the code you want to run for each type of symbol, then you would use CCC_FOR_EACH_SYMBOL_TYPE_DO_X.
Take the following example:
s32 sum = 0;
#define CCC_X(SymbolType, symbol_list) sum += symbol_list.size();
CCC_FOR_EACH_SYMBOL_TYPE_DO_X
#undef CCC_X
This would result in the following code being generated by the C++ preprocessor:
s32 sum = 0;
sum += data_types.size();
sum += functions.size();
...
This way, if we wanted to add another type of symbol, we wouldn't have to modify the symbol count code above.
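For a version that compiles on its own, the sketch below uses a cut-down macro containing only the two entries shown above. The `EXAMPLE_` names, the `ExampleDatabase` struct, and the placeholder symbol structs are made up so as not to be confused with the real definitions. It also shows the same list driving more than one operation: declaring the lists, counting symbols, and clearing them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical cut-down list with only two symbol types.
#define EXAMPLE_FOR_EACH_SYMBOL_TYPE_DO_X \
	EXAMPLE_X(DataType, data_types) \
	EXAMPLE_X(Function, functions)

// Placeholder symbol types.
struct DataType { std::string name; };
struct Function { std::string name; };

struct ExampleDatabase {
	// Declare one list member per symbol type.
	#define EXAMPLE_X(SymbolType, symbol_list) std::vector<SymbolType> symbol_list;
	EXAMPLE_FOR_EACH_SYMBOL_TYPE_DO_X
	#undef EXAMPLE_X

	// Sum the sizes of all the lists.
	std::size_t symbol_count() const {
		std::size_t sum = 0;
		#define EXAMPLE_X(SymbolType, symbol_list) sum += symbol_list.size();
		EXAMPLE_FOR_EACH_SYMBOL_TYPE_DO_X
		#undef EXAMPLE_X
		return sum;
	}

	// Empty all the lists.
	void clear() {
		#define EXAMPLE_X(SymbolType, symbol_list) symbol_list.clear();
		EXAMPLE_FOR_EACH_SYMBOL_TYPE_DO_X
		#undef EXAMPLE_X
	}
};
```

Adding a third symbol type to the macro would extend the member declarations, the count, and the clear operation in one step.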