Symbol Database

The symbol database provides a unified in-memory and on-disk representation for debugging symbols, including symbols imported from several different types of symbol tables and symbols defined manually by the user.

Handles

References to symbols are represented by integer handles. Symbols are stored in lists with other symbols of the same type, and each list is sorted by symbol handle. To look up a symbol by its handle, a binary search is performed on the list.
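As a rough illustration of the scheme described above, here is a minimal sketch of a handle-sorted list with a binary search lookup. The names SymbolList, create, destroy and symbol_from_handle are hypothetical and chosen for this example; the real classes in symbol_database.h may be organised differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
using SymbolHandle = std::uint32_t;

struct Symbol {
	SymbolHandle handle;
	std::string name;
};

class SymbolList {
public:
	// Handles are handed out in increasing order, so appending a new symbol
	// keeps the vector sorted by handle.
	SymbolHandle create(std::string name) {
		SymbolHandle handle = m_next_handle++;
		m_symbols.push_back(Symbol{handle, std::move(name)});
		return handle;
	}

	// Binary search by handle. Returns nullptr if no such symbol exists,
	// for example because it has been destroyed.
	Symbol* symbol_from_handle(SymbolHandle handle) {
		auto iter = std::lower_bound(m_symbols.begin(), m_symbols.end(), handle,
			[](const Symbol& symbol, SymbolHandle value) { return symbol.handle < value; });
		if (iter == m_symbols.end() || iter->handle != handle) {
			return nullptr;
		}
		return &(*iter);
	}

	// Erasing an element leaves the remaining elements in sorted order.
	bool destroy(SymbolHandle handle) {
		Symbol* symbol = symbol_from_handle(handle);
		if (!symbol) {
			return false;
		}
		m_symbols.erase(m_symbols.begin() + (symbol - m_symbols.data()));
		return true;
	}

private:
	std::vector<Symbol> m_symbols;
	SymbolHandle m_next_handle = 0;
};
```

Because a handle is never reused in this sketch, a stale handle simply fails the lookup instead of silently resolving to a different symbol.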

Maps

The symbol list classes maintain maps that can be used to look up a symbol by its address or name where that makes sense.

In addition, you can look up a symbol from an address that it overlaps. For example, it is possible to find the function that an instruction belongs to by looking up the instruction's address.
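One common way to implement an overlap lookup like this is an ordered map keyed by starting address. The sketch below assumes that functions do not overlap each other and that each one records its size. FunctionMap, add and from_contained_address are hypothetical names for this example, not the library's API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical function symbol with a starting address and a size in bytes.
struct Function {
	std::string name;
	std::uint32_t address;
	std::uint32_t size;
};

class FunctionMap {
public:
	void add(Function function) {
		std::uint32_t address = function.address;
		m_by_address.emplace(address, std::move(function));
	}

	// Find the function whose address range contains the given address.
	const Function* from_contained_address(std::uint32_t address) const {
		// upper_bound returns the first function that starts strictly after
		// the address, so the entry before it is the only possible match.
		auto iter = m_by_address.upper_bound(address);
		if (iter == m_by_address.begin()) {
			return nullptr;
		}
		--iter;
		const Function& function = iter->second;
		if (address - function.address >= function.size) {
			return nullptr; // The address falls in a gap after this function.
		}
		return &function;
	}

private:
	std::map<std::uint32_t, Function> m_by_address;
};
```

With functions registered at 0x1000 (size 0x40) and 0x1040 (size 0x10), looking up 0x103c would return the first function, while 0x1050 would return nullptr.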

Modules

Symbols keep track of which module they are associated with. This can be used, for example, to delete all the symbols for a given module when it is unloaded.

Nodes

Data types are represented as trees of C/C++ AST nodes. They can contain type names that reference other data types by their symbol handle.

Specific nodes can be referenced using node handles, which bundle together an enum representing the type of symbol to be looked up, a symbol handle for said symbol, a pointer to the AST node, and a generation count used for detecting when the node handle has been invalidated.

Whenever an AST node is created, deleted or moved, the invalidate_node_handles function must be called on the associated symbol. Otherwise, trying to look up a node handle pointing into that symbol may result in an invalid pointer being returned. The set_type function will do this automatically.
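The generation count mechanism can be sketched roughly as follows. Only invalidate_node_handles and set_type are names taken from the text above; the struct layouts, the SymbolDescriptor enum, make_node_handle and lookup_node are assumptions made for this example. The step that finds the symbol from its descriptor and handle is omitted, so the symbol is passed in directly.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the real types.
enum class SymbolDescriptor { DATA_TYPE, FUNCTION };
using SymbolHandle = std::uint32_t;

struct Node {
	std::string name;
};

struct Symbol {
	SymbolHandle handle = 0;
	std::unique_ptr<Node> type;
	std::uint32_t generation = 0;

	// Bumping the generation makes every existing node handle stale.
	void invalidate_node_handles() { generation++; }

	// Replacing the tree frees the old nodes, so invalidate automatically.
	void set_type(std::unique_ptr<Node> new_type) {
		type = std::move(new_type);
		invalidate_node_handles();
	}
};

struct NodeHandle {
	SymbolDescriptor descriptor; // Which symbol list to search.
	SymbolHandle symbol_handle;  // Which symbol owns the node.
	const Node* node;            // Only valid while the generations match.
	std::uint32_t generation;    // Symbol's generation at creation time.
};

NodeHandle make_node_handle(SymbolDescriptor descriptor, const Symbol& symbol, const Node* node) {
	return NodeHandle{descriptor, symbol.handle, node, symbol.generation};
}

// Returns the node only if the symbol has not changed since the handle was made.
const Node* lookup_node(const NodeHandle& handle, const Symbol& symbol) {
	if (symbol.handle != handle.symbol_handle || symbol.generation != handle.generation) {
		return nullptr;
	}
	return handle.node;
}
```

In this sketch a stale handle yields nullptr instead of a dangling pointer, which is the failure the generation count is meant to detect.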

Symbol Sources

Symbols keep track of how they were created. Each part of the code that creates symbols calls get_symbol_source on the symbol database object to obtain a handle for a symbol source.

If a symbol source with the given name already exists, its handle will be reused; otherwise, a new symbol source object will be created.
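The get-or-create behaviour could look something like the sketch below. The get_symbol_source name comes from the text; the SymbolDatabaseSketch class, its members and the linear search are assumptions for illustration, and the real signature may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
using SymbolSourceHandle = std::uint32_t;

struct SymbolSource {
	SymbolSourceHandle handle;
	std::string name;
};

class SymbolDatabaseSketch {
public:
	// Reuse the handle of an existing source with the same name, or create
	// a new source if there isn't one yet.
	SymbolSourceHandle get_symbol_source(const std::string& name) {
		for (const SymbolSource& source : m_symbol_sources) {
			if (source.name == name) {
				return source.handle;
			}
		}
		SymbolSourceHandle handle = m_next_handle++;
		m_symbol_sources.push_back(SymbolSource{handle, name});
		return handle;
	}

	std::size_t symbol_source_count() const { return m_symbol_sources.size(); }

private:
	std::vector<SymbolSource> m_symbol_sources;
	SymbolSourceHandle m_next_handle = 0;
};
```

Calling get_symbol_source twice with the same name returns the same handle, so independent parts of the code can share a source without coordinating.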

Transactions

There isn't proper support for transactions. However, since the symbol table importers only create new symbols, and each of those symbols is assigned a specific module handle, a failed import can be undone by deleting all symbols from the new module, which restores the symbol database to the state it was in previously.

To support this design we never deduplicate data types that come from different modules or sources.
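The rollback strategy can be sketched as follows. Everything here is hypothetical: the Database struct, destroy_symbols_from_module and import_symbol_table are names invented for this example, and an empty name stands in for whatever would make a real import fail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
using ModuleHandle = std::uint32_t;

struct Symbol {
	std::string name;
	ModuleHandle module_handle;
};

struct Database {
	std::vector<Symbol> symbols;
	ModuleHandle next_module_handle = 0;

	// Remove every symbol associated with the given module.
	void destroy_symbols_from_module(ModuleHandle module_handle) {
		symbols.erase(
			std::remove_if(symbols.begin(), symbols.end(),
				[module_handle](const Symbol& symbol) {
					return symbol.module_handle == module_handle;
				}),
			symbols.end());
	}
};

// Import a list of names as symbols belonging to a new module. An empty name
// simulates a malformed entry that aborts the import.
bool import_symbol_table(Database& database, const std::vector<std::string>& names) {
	ModuleHandle module_handle = database.next_module_handle++;
	for (const std::string& name : names) {
		if (name.empty()) {
			// Roll back: the importer only created symbols tagged with this
			// module, so deleting them restores the previous state.
			database.destroy_symbols_from_module(module_handle);
			return false;
		}
		database.symbols.push_back(Symbol{name, module_handle});
	}
	return true;
}
```

This only works because the importer never modifies or shares existing symbols, which is why deduplicating data types across modules would break the approach.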

X Macros

X macros are used to reduce code duplication in places where the same operation has to be performed for each type of symbol.

The following macro is defined in symbol_database.h:

#define CCC_FOR_EACH_SYMBOL_TYPE_DO_X \
	CCC_X(DataType, data_types) \
	CCC_X(Function, functions) \
	...

To use this macro, first define CCC_X as the code you want to run for each type of symbol, then use CCC_FOR_EACH_SYMBOL_TYPE_DO_X, and finally undefine CCC_X so that it can be redefined for the next use.

Take the following example:

s32 sum = 0;
#define CCC_X(SymbolType, symbol_list) sum += symbol_list.size();
CCC_FOR_EACH_SYMBOL_TYPE_DO_X
#undef CCC_X

This would result in the following code being generated by the C++ preprocessor:

s32 sum = 0;
sum += data_types.size();
sum += functions.size();
...

This way, if we wanted to add another type of symbol, we wouldn't have to modify the symbol count code above.
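For a self-contained version that can be compiled on its own, the sketch below uses the same technique twice: once to declare a list member for each symbol type and once to count the symbols. The SKETCH_ prefixed macros, the empty DataType and Function structs and the Database struct are stand-ins invented for this example, and the real macro lists more symbol types than the two shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Empty placeholder symbol types.
struct DataType {};
struct Function {};

// A two-entry stand-in for CCC_FOR_EACH_SYMBOL_TYPE_DO_X.
#define SKETCH_FOR_EACH_SYMBOL_TYPE_DO_X \
	SKETCH_X(DataType, data_types) \
	SKETCH_X(Function, functions)

struct Database {
	// Expands to one std::vector member per symbol type.
	#define SKETCH_X(SymbolType, symbol_list) std::vector<SymbolType> symbol_list;
	SKETCH_FOR_EACH_SYMBOL_TYPE_DO_X
	#undef SKETCH_X

	// Expands to one "sum += ...;" statement per symbol type.
	std::size_t symbol_count() const {
		std::size_t sum = 0;
		#define SKETCH_X(SymbolType, symbol_list) sum += symbol_list.size();
		SKETCH_FOR_EACH_SYMBOL_TYPE_DO_X
		#undef SKETCH_X
		return sum;
	}
};
```

Adding a third entry to SKETCH_FOR_EACH_SYMBOL_TYPE_DO_X would give Database a new list member and include it in symbol_count without touching either use site.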