v0.3 Release: Speed & Memory Usage improvements + PyTorch 0.5 updates
We've switched our mLSTM model to internally use PyTorch's fused LSTM cell. Compared to the unfused version included in earlier releases, this significantly reduces GPU memory usage (allowing training with larger batch sizes) and slightly improves speed.
To convert models trained with earlier versions so they can be used with this release, please see this issue.
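For readers unfamiliar with the idea, the mLSTM change described above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in this release: the class name `MLSTMCellSketch` and the layer names `wmx` and `wmh` are hypothetical, and it assumes the usual multiplicative LSTM formulation, where an intermediate state `m = (W_mx x) * (W_mh h_prev)` takes the place of the previous hidden state. The gate computation is then delegated to PyTorch's built-in `nn.LSTMCell` rather than hand-written elementwise operations.

```python
import torch
import torch.nn as nn


class MLSTMCellSketch(nn.Module):
    """Hypothetical multiplicative LSTM cell built on top of nn.LSTMCell."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Projections that form the multiplicative intermediate state m.
        self.wmx = nn.Linear(input_size, hidden_size, bias=False)
        self.wmh = nn.Linear(hidden_size, hidden_size, bias=False)
        # Built-in LSTM cell that computes the gates and the state update.
        self.lstm_cell = nn.LSTMCell(input_size, hidden_size)

    def forward(self, x, state):
        h_prev, c_prev = state
        # m replaces h_prev as the hidden input to the LSTM cell.
        m = self.wmx(x) * self.wmh(h_prev)
        return self.lstm_cell(x, (m, c_prev))


# One step on a batch of 4 inputs with 8 features and 16 hidden units.
cell = MLSTMCellSketch(input_size=8, hidden_size=16)
x = torch.randn(4, 8)
h = torch.zeros(4, 16)
c = torch.zeros(4, 16)
h, c = cell(x, (h, c))
```

Because the built-in cell stores its gate weights in its own parameter tensors (`weight_ih`, `weight_hh`, `bias_ih`, `bias_hh`), a checkpoint saved from a hand-written cell would generally not load into a module like this without remapping, which is presumably why a conversion step is needed for older models.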
We've also updated our distributed code to account for the April 3rd changes to PyTorch's Tensors and Variables.