Hi, a few others have had this error. It is typically either an out-of-memory issue or a mismatch between the CUDA version inside and outside the container. For the former, can you try running one of the smaller models? If that also doesn't work, consider either building from source or upgrading the PyTorch version, which worked in this similar issue.
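As a rough diagnostic (a sketch, not part of the original report, and assuming nvidia-smi and PyTorch are available in both environments), you can compare the driver-reported CUDA version and free GPU memory on the host and inside the container with the build version of the installed PyTorch:

# Run on the host, then again inside the container:
nvidia-smi                       # driver CUDA version and per-GPU memory usage
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"   # CUDA version PyTorch was built against

If the two CUDA versions differ significantly, or nvidia-smi shows the P100s nearly full before the model loads, that points to the mismatch or OOM cause described above.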
I am trying to set up PolyCoder inference on my machine with 2x P100 GPUs, using the Docker command from the README:
nvidia-docker run --rm -it -e NVIDIA_VISIBLE_DEVICES=0,1 --shm-size=1g --ulimit memlock=-1 --mount type=bind,src=$PWD/Downloads/checkpoints/checkpoints-2-7B,dst=/gpt-neox/checkpoints vhellendoorn/code-lms-neox:base
And then within the container:
The following is the output (stdout+stderr):