Add NUMA support and optimizations #11871
base: master
- Enable NUMA optimizations via CMake options
- Introduce NUMA-aware thread assignment and memory allocation
- Add NUMA debugging utilities
- Enhance cache management for NUMA systems
Nice! The PR uses the old Debug() statements, which need to be changed to the new Dbg / Ctl features.
Thanks so much for this PR! It looks really interesting. Just a few comments to get CI happy, mainly migrating Debug -> Dbg. Looking forward to trying this out!
src/cripts/Lulu.cc
Outdated
{
  return details::splitter<Cript::string_view>(input, delim);
}
You can remove this chunk; it is perhaps a merge conflict issue.
src/iocore/net/Server.cc
Outdated
int my_thread_id = this_ethread()->id;
int my_numa_node = this_ethread()->get_numa_node();

Debug("numa_sequencer", "[NUMASequencer] Thread %d (NUMA node %d) entered run_sequential.", my_thread_id, my_numa_node);
Debug has been removed, please use the Dbg macro instead.
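A minimal sketch of what that migration could look like for the line above. The `DbgCtl` struct, the `format_dbg` helper, and the `Dbg` macro defined here are stand-ins written only so the snippet compiles outside the Traffic Server tree; they mimic the `Dbg(ctl, fmt, ...)` call shape visible elsewhere in this diff and are not the project's real definitions. The control name `dbg_ctl_numa_sequencer` and the function `enter_run_sequential` are assumed for illustration.

```cpp
#include <cstdio>
#include <string>

// Stand-in for the real debug control type (assumption: one object per tag).
struct DbgCtl {
  explicit DbgCtl(const char *t) : tag(t) {}
  const char *tag;
  bool enabled = true; // in this sketch the tag is always on
};

// Stand-in formatter: builds "<tag>: <message>" so the result can be inspected.
template <typename... Args>
std::string
format_dbg(const DbgCtl &ctl, const char *fmt, Args... args)
{
  if (!ctl.enabled) {
    return {};
  }
  char buf[256];
  std::snprintf(buf, sizeof(buf), fmt, args...);
  return std::string(ctl.tag) + ": " + buf;
}

// Stand-in macro with the same call shape as the Dbg() call in the diff.
#define Dbg(ctl, ...) std::fputs((format_dbg((ctl), __VA_ARGS__) + "\n").c_str(), stderr)

// The tag string moves out of every call site into one control object.
static DbgCtl dbg_ctl_numa_sequencer{"numa_sequencer"};

void
enter_run_sequential(int my_thread_id, int my_numa_node)
{
  // Before: Debug("numa_sequencer", "[NUMASequencer] Thread %d ...", ...);
  Dbg(dbg_ctl_numa_sequencer, "[NUMASequencer] Thread %d (NUMA node %d) entered run_sequential.", my_thread_id, my_numa_node);
}
```

The format string and arguments stay the same; only the first argument changes from a string tag to a control object.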
src/iocore/net/Server.cc
Outdated
if (opt.f_mptcp) {
  Dbg(dbg_ctl_connection, "Define socket with MPTCP");
  prot = IPPROTO_MPTCP;
}

// Create the socket
Debug("numa", "[Server::listen] Attempting to create socket with family: %d, type: %d, protocol: %d", addr.sa.sa_family,
Were these Debug lines perhaps only for your own debugging? Could they be removed?
}

RamCache *
RamCacheContainer::get_cache(unsigned int my_node, unsigned int node)
Suggested change:
- RamCacheContainer::get_cache(unsigned int my_node, unsigned int node)
+ RamCacheContainer::get_cache(unsigned int /* my_node ATS_UNUSED */, unsigned int node)
// returns true if consistent
static bool
check_pages_consistency(void *data, size_t size, const char *name = "")
This function is unused unless NUMA_CONSISTENCY_CHECK is defined. Please add a preprocessor check here.
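One way to read that request, as a rough sketch: wrap both the definition and its call site in the same `#ifdef`, so a build without `NUMA_CONSISTENCY_CHECK` never sees an unused static function. The function body below is a placeholder (the real body is not shown in this thread), and `verify_buffer` is a hypothetical caller added only for illustration.

```cpp
#include <cstddef>

#ifdef NUMA_CONSISTENCY_CHECK
// Only compiled when the check is requested, so no unused-function warning otherwise.
// returns true if consistent
static bool
check_pages_consistency(void *data, size_t size, const char *name = "")
{
  (void)name;
  return data != nullptr && size > 0; // placeholder logic, not the PR's real check
}
#endif

// Hypothetical caller: the check disappears entirely when the macro is not defined.
bool
verify_buffer(void *data, size_t size)
{
#ifdef NUMA_CONSISTENCY_CHECK
  return check_pages_consistency(data, size, "buffer");
#else
  (void)data;
  (void)size;
  return true; // nothing to verify in a build without the check
#endif
}
```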
}

static void
move_pages_to_current_numa_zone(void *data, size_t size)
This function is unused.
src/iocore/net/Server.cc
Outdated
// If all threads have been added (assuming their number is equal to eventProcessor.net_threads), sort the thread IDs and set
// ready_to_run to true
if (thread_ids.size() == eventProcessor.net_threads) {
I get a signed/unsigned compare error for this line.
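A possible fix, sketched under the assumption that the thread count is a signed `int` while `size()` returns an unsigned type: reject negative counts first, then cast explicitly. The helper `all_threads_registered` and its parameters are hypothetical stand-ins for the expression in the diff.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper; net_threads stands in for eventProcessor.net_threads,
// which is assumed here to be a signed int.
bool
all_threads_registered(const std::vector<int> &thread_ids, int net_threads)
{
  // A negative count can never equal a container size, so handle it before casting.
  if (net_threads < 0) {
    return false;
  }
  // Both sides are now std::size_t, so the comparison no longer mixes signedness.
  return thread_ids.size() == static_cast<std::size_t>(net_threads);
}
```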
The build is failing for systems without numa.h. I noted one place, but there could be other similar issues.
  getcpu(&cpu, &node);
  this->numa_node = node;
}
return this->numa_node;
This should handle the case where numa.h is not available.
The getcpu() function is included in the numactl package. Adding a conditional directive should solve this issue.
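A rough sketch of such a conditional directive. The macro `SKETCH_HAS_NUMA` is hypothetical and assumed to be defined by the build system only when NUMA support was detected; `ThreadSketch` is a stand-in for the thread class in the diff. Which header declares `getcpu()` depends on the platform, so the `<sched.h>` include is an assumption as well.

```cpp
#ifdef SKETCH_HAS_NUMA // hypothetical macro, assumed to be set by the build
#include <sched.h>     // assumption: getcpu() is declared here on this platform
#endif

// Hypothetical stand-in for the thread class that owns numa_node.
class ThreadSketch
{
public:
  int
  get_numa_node()
  {
    if (numa_node < 0) { // look the node up only once
#ifdef SKETCH_HAS_NUMA
      unsigned int cpu  = 0;
      unsigned int node = 0;
      // Fall back to node 0 if the call fails.
      numa_node = (getcpu(&cpu, &node) == 0) ? static_cast<int>(node) : 0;
#else
      numa_node = 0; // no NUMA support: treat the machine as a single node
#endif
    }
    return numa_node;
  }

private:
  int numa_node = -1; // -1 means "not looked up yet"
};
```

Without the macro, the class still compiles and reports node 0, so callers do not need their own guards.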
Summary:
This pull request adds support for Non-Uniform Memory Access (NUMA) to Apache Traffic Server, enhancing performance on NUMA systems.
Key Changes:
Performance testing has shown increased throughput, reduced CPU usage, improved latency, and decreased UPI bus load.
These changes aim to optimize memory access patterns and reduce latency on NUMA systems.
To build the application with NUMA support, use the following CMake command:
cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_MIMALLOC=ON -DENABLE_NUMA=ON -DENABLE_HWLOC=ON -Dmimalloc_DIR=""