Try out further ibverbs optimisations #44

Open
bmerry opened this issue Jun 13, 2016 · 2 comments

Comments

@bmerry
Contributor

bmerry commented Jun 13, 2016

Things that documentation or common sense suggests might improve performance with MLNX_OFED:

  • using contiguous pages for MRs
  • batching up acknowledgement of events
  • setting environment variables to make QPs and CQs use contiguous pages
  • posting receives in batches (via a linked list) instead of one-at-a-time
@bmerry
Contributor Author

bmerry commented Jul 14, 2016

There is also some work that can be done on sending. The biggest change here is simply to batch up more packets together, to amortize the various overheads. But requesting a completion for only the last packet in each batch, rather than for every packet, may also help.

@bmerry
Contributor Author

bmerry commented Oct 19, 2020

The send batching has long since been implemented, and 3.0 will implement the single completion event per batch. Newer versions of rdma-core offer some other opportunities for optimisation:

  • Thread domains
  • APIs for creating CQs, which allow them to be declared single-threaded
  • APIs for CQ polling that pull attributes of completions on demand
  • APIs for submitting send work requests
