README example #256
Good point. Then again many people use ForwardDiff without even knowing about config and chunks, and it's still decently fast. Preparation features prominently in the tutorial and overview pages of the docs, so I am torn as to whether it belongs in the README. @adrhill thoughts?
The point of the current README is to demonstrate to the average, time-deprived developer that DI is an easy-to-use, unified interface for Julia's many AD backends. I personally wouldn't want to complicate this example too much. We follow it up with the disclaimer "For more performance, take a look at the DifferentiationInterface tutorial", which implies sub-optimal performance in the code example. I guess we could further emphasize this disclaimer and possibly add a short second code block demonstrating preparation?
PR #255 introduces additional preparation functions for static points of linearization, so the optimal operator preparation is about to become problem-dependent.
I should stress that my preferred approach would be to direct users to the documentation, which they should definitely read.
I really like this. Perhaps you could strengthen the statement even more though? In the case of Tapir, not preparing can yield many orders of magnitude worse performance.
By the way, I also really liked it when the README was tested, @adrhill; we could reintroduce that as well.
For the record -- I think basically any of the solutions proposed here are great, so I will not be offended if you close this issue without further consulting me!
At present the main example in the README is:
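(Roughly speaking, a minimal unprepared example along these lines; `f`, `x`, and the ForwardDiff backend are illustrative placeholders here, not necessarily the exact snippet currently in the README:)

```julia
using DifferentiationInterface
import ForwardDiff  # the chosen AD backend package must be loaded alongside DI

f(x) = sum(abs2, x)          # toy function: squared Euclidean norm
backend = AutoForwardDiff()  # backend selector (ADTypes-style, re-exported by DI)
x = [1.0, 2.0, 3.0]

grad = gradient(f, backend, x)  # one-shot gradient, no preparation step
```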
Would it be better to make this example one which includes the `extras = prepare_gradient(f, backend, x)` etc. stuff? I'm just thinking about the issue we had on Tapir.jl the other day, where a user didn't realise that you should really be using `prepare_gradient` in general. My general point is: if a user sees one example, maybe they should see one which we think is likely to give the best performance across all backends?