The goal is to optimize this trainable's accuracy. The accuracy increases fastest at the optimal lr, which is a function of the current accuracy. The optimal lr schedule for this problem is the triangle wave as follows. Note that many lr schedules for real models also follow this shape:
best lr
^
| /\
| / \
| / \
| / \
------------> accuracy
In this problem, using PBT with a population of 2-4 is sufficient to roughly approximate this lr schedule. Higher population sizes will yield faster convergence. Training will not converge without PBT.
If you want to read more about this example, vist the ray documentation.
Katib uses this training container in some Experiments, for instance in the PBT example.