gpu(::DataLoader), take III #2245

Conversation
Once the build has completed, you can preview any updated documentation at this URL: https://fluxml.ai/Flux.jl/previews/PR2245/ in ~20 minutes
@@ -171,7 +163,7 @@ In order to train the model using the GPU both model and the training data have
```
Here `(xtrain, ytrain) |> gpu` applies [`gpu`](@ref) to both arrays -- it recurses into not just tuples, as here, but also whole Flux models.

-### Saving GPU-Trained Models
+## Saving GPU-Trained Models
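A minimal sketch of the recursion described above (hedged: assumes Flux is loaded; without a functional GPU backend, `gpu` is simply the identity, so the same code runs unchanged on the CPU):

```julia
using Flux

# Hypothetical small dataset; not part of the diff.
xtrain = rand(Float32, 4, 10)
ytrain = rand(Float32, 2, 10)

# `gpu` recurses into the tuple and moves both arrays.
xg, yg = (xtrain, ytrain) |> gpu

# It also recurses into whole models, moving every parameter array.
model = Chain(Dense(4 => 2)) |> gpu
```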
This change is just me reducing the number of levels of heading from 3 to 2. The file is a bit of a mess, but there's no need for a deep hierarchy.
for (x_cpu, y_cpu) in train_loader
    x = gpu(x_cpu)
    y = gpu(y_cpu)
    grads = gradient(m -> loss(m, x, y), model)
I've changed this example to use explicit gradient, and to be less verbose.
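For context, the per-batch transfer pattern from the hunk above, filled out into a complete loop (a hedged sketch: `loss`, `train_loader`, and the `Adam` optimiser setup are assumptions for illustration, not part of the diff):

```julia
using Flux

opt_state = Flux.setup(Adam(), model)  # assumed optimiser setup

for (x_cpu, y_cpu) in train_loader
    x = gpu(x_cpu)   # move each batch to the GPU as it is needed
    y = gpu(y_cpu)
    # Explicit-gradient style: differentiate w.r.t. the model itself.
    grads = gradient(m -> loss(m, x, y), model)
    Flux.update!(opt_state, model, grads[1])
end
```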
@@ -122,61 +122,48 @@ julia> x |> cpu
0.7766742
```

```@docs
cpu
gpu
These docstrings moved from a "guide" section to a "reference" section.
""" | ||
function gpu(d::MLUtils.DataLoader) | ||
MLUtils.DataLoader(MLUtils.mapobs(gpu, d.data), | ||
d.batchsize, |
Instead of writing this out here, we could move it upstream: JuliaML/MLUtils.jl#153
    end
end
```
This is equivalent to `DataLoader(MLUtils.mapobs(gpu, (X, Y)); keywords...)`.
Something similar can also be done with [`CUDA.CuIterator`](https://cuda.juliagpu.org/stable/usage/memory/#Batching-iterator): `gpu_train_loader = CUDA.CuIterator(train_loader)`. However, this only works with a limited number of data types: `first(train_loader)` should be a tuple (or `NamedTuple`) of arrays.
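A sketch contrasting the two approaches (hedged: assumes CUDA.jl and a working GPU; `X` and `Y` are hypothetical arrays, not from the PR):

```julia
using Flux, MLUtils, CUDA

X = rand(Float32, 4, 100)
Y = rand(Float32, 2, 100)
train_loader = DataLoader((X, Y); batchsize = 10)

# 1. The method from this PR: wraps the data with `mapobs(gpu, ...)`,
#    so each observation is moved to the GPU lazily, on access.
gpu_loader = gpu(train_loader)

# 2. CuIterator: requires each batch to be a (Named)Tuple of arrays,
#    but frees the previous batch's GPU memory as it advances.
cu_loader = CUDA.CuIterator(train_loader)
```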
Here we could hint at using `mapobs` to transform the dataset into something `CuIterator`-compatible. Could be a short example like

```julia
train_loader = mapobs(preprocess_transform, train_loader)
gpu_train_loader = CUDA.CuIterator(train_loader)
```

Also, mention when `CuIterator` should be preferred over `gpu`?
Is it preferred? This PR takes the line that it's not... it's picky about what types it accepts, and `finalize` doesn't matter. So it's mentioned here in case people are already using it.
If `finalize` does matter, then we should do #2240 instead.
Simpler variant that just calls `DataLoader(mapobs(gpu, data), ...)`. What this misses compared to more complex CuIterator-like things is that it does not call `finalize` afterwards. But perhaps that doesn't matter, since each call of the model will allocate so much more, and that is also not `finalize`d explicitly.

Closes #2240, closes #2186
PR Checklist