Try out further ibverbs optimisations #44

bmerry · 2016-06-13T10:33:16Z

Things that documentation or common sense suggests might improve performance with MLNX_OFED:

using contiguous pages for MRs
batching up acknowledgement of events
setting environment variables to make QPs and CQs use contiguous pages
posting receives in batches (via a linked list) instead of one at a time

bmerry · 2016-07-14T14:24:06Z

There is also some work that can be done on sending. The biggest change here is simply to batch more packets together, to amortize the various overheads. Asking for a completion only for the last packet in each batch, rather than for every packet, may also help.

bmerry · 2020-10-19T13:01:46Z

The send batching has long since been implemented, and 3.0 will implement the single completion event per batch. Newer versions of rdma-core offer some other opportunities for optimisation:

Thread domains
APIs for creating CQs, which allow them to be declared single-threaded
APIs for CQ polling that pull attributes of completions on demand
APIs for submitting send work requests

bmerry added the enhancement label Jun 13, 2016

bmerry added performance and removed enhancement labels Oct 19, 2020
