Magic Bitboard Table is slow due to pointer indirection and cache misses #1
Labels: help wanted, performance
The current implementation of MagicTable is as follows:
and MagicBB is like:
To find the blockers of a rook or bishop, the move generator has to hold a pointer to a `MagicTable` instance (in `MoveGenerator` it is `std::shared_ptr<MagicTable> mt`) and do something like `mt->bishop_magics[i].compute()`. This dereferences the pointer to the magic table to find the `MagicBB` object for the piece-square we are looking for, and then dereferences the pointer to the `table` entry in `MagicBB` to find the result. Since `MagicBB::compute` is called very often, about 12% of the time is spent in the `compute()` function, and the multiple levels of indirection very often lead to cache misses.

Running `perf stat` shows a 9.2% backend bound on my machine, and the magic table and transposition table are the only large sections of memory where cache misses are likely to occur. Since the transposition table is not accessed during most of the search (in the quiescence nodes, for example), almost all the cache misses are likely in `MagicBB::compute()`. That would mean stalling due to cache misses accounts for about 75% of the time spent in `MagicBB::compute` (roughly 9.2% out of 12%).

The implementation of the magic table should be improved to reduce the cache miss rate. One way to do this would be to ensure that all the entries in all the magic tables are in one contiguous region of memory. Finding better magic numbers could potentially also reduce the total size of the magic table. It may also improve performance to make the magic table meta-information a global constant, either hard-coded or computed at compile time using some TMP. The move generator and any other module using the magic bitboards would then not have to dereference a pointer to a magic table object, but could access the data directly.
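As a rough illustration of the proposal, here is a minimal sketch of a flat layout for the rook table only. It is not the project's code: `SquareEntry`, `kRookEntries`, `g_rook_attacks`, `rook_rays` and `extract_bits` are hypothetical names. It makes two simplifying assumptions. First, the per-square metadata is computed with plain `constexpr` functions (C++17) rather than TMP. Second, the index is computed by software bit extraction (the same result as the BMI2 `PEXT` instruction) as a stand-in for the magic multiply-and-shift, so that no magic numbers have to be invented here; a real version would keep the existing magics and only change where the data lives.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bitboard = std::uint64_t;

// Hypothetical per-square metadata, stored by value (no pointers).
struct SquareEntry {
    Bitboard mask = 0;         // relevant blocker squares for this square
    std::uint32_t offset = 0;  // start of this square's slice in the flat table
};

constexpr int kDr[4] = {1, -1, 0, 0};
constexpr int kDf[4] = {0, 0, 1, -1};

constexpr bool on_board(int r, int f) { return r >= 0 && r < 8 && f >= 0 && f < 8; }

// Rook rays by walking each direction until a blocker or the board edge.
// With stop_before_edge set, the last square of each ray is dropped, which
// gives the relevant-blocker mask when called with an empty occupancy.
constexpr Bitboard rook_rays(int sq, Bitboard occ, bool stop_before_edge) {
    Bitboard result = 0;
    for (int d = 0; d < 4; ++d) {
        int r = sq / 8 + kDr[d];
        int f = sq % 8 + kDf[d];
        while (on_board(r, f)) {
            if (stop_before_edge && !on_board(r + kDr[d], f + kDf[d])) break;
            const Bitboard b = Bitboard{1} << (r * 8 + f);
            result |= b;
            if (occ & b) break;  // the blocker square itself is attacked
            r += kDr[d];
            f += kDf[d];
        }
    }
    return result;
}

constexpr int popcount(Bitboard b) {
    int n = 0;
    for (; b != 0; b &= b - 1) ++n;
    return n;
}

// Packs the occupancy bits selected by mask into a dense index.
// Stand-in for the magic hash ((occ & mask) * magic) >> shift.
constexpr std::uint32_t extract_bits(Bitboard occ, Bitboard mask) {
    std::uint32_t index = 0;
    std::uint32_t bit = 1;
    for (Bitboard m = mask; m != 0; m &= m - 1, bit <<= 1) {
        if (occ & m & (~m + 1)) index |= bit;  // m & (~m + 1) is the lowest set bit
    }
    return index;
}

// Metadata for all 64 squares, computed at compile time.
constexpr std::array<SquareEntry, 64> build_rook_entries() {
    std::array<SquareEntry, 64> entries{};
    std::uint32_t offset = 0;
    for (int sq = 0; sq < 64; ++sq) {
        entries[sq].mask = rook_rays(sq, 0, true);
        entries[sq].offset = offset;
        offset += std::uint32_t{1} << popcount(entries[sq].mask);
    }
    return entries;
}

constexpr auto kRookEntries = build_rook_entries();
constexpr std::size_t kRookTableSize =
    kRookEntries[63].offset + (std::size_t{1} << popcount(kRookEntries[63].mask));
static_assert(kRookTableSize == 102400, "unexpected rook table size");

// One contiguous block in static storage holding every square's slice.
Bitboard g_rook_attacks[kRookTableSize];

// Fills the flat table once at startup.
void init_rook_table() {
    for (int sq = 0; sq < 64; ++sq) {
        const SquareEntry& e = kRookEntries[sq];
        Bitboard subset = 0;
        do {  // enumerate every subset of the mask
            g_rook_attacks[e.offset + extract_bits(subset, e.mask)] =
                rook_rays(sq, subset, false);
            subset = (subset - e.mask) & e.mask;
        } while (subset != 0);
    }
}

// Lookup: one read from constant metadata, one read from the flat table.
inline Bitboard rook_attacks(int sq, Bitboard occupancy) {
    assert(sq >= 0 && sq < 64);
    const SquareEntry& e = kRookEntries[sq];
    return g_rook_attacks[e.offset + extract_bits(occupancy, e.mask)];
}
```

After calling `init_rook_table()` once, any module can call `rook_attacks(sq, occupancy)` without holding a `MagicTable` pointer. Whether this actually reduces the backend-bound figure would need to be measured again with `perf stat`, since the rook table alone is still about 800 KB and will not fit in L1 or L2 on most machines.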