ANN: Compute Server - new feature: Automatic Restart #7322
williamstein
announced in
Announcements
I just implemented a new feature for CoCalc compute servers. A compute server in CoCalc is a remote computer whose resources (GPUs, CPUs, RAM, disks) you can use via CoCalc’s collaborative interface in Jupyter notebooks and terminals. Compute servers can provide hundreds of CPUs, thousands of GB of RAM, full root privileges, the ability to run Docker containers, and much more.
You can now configure a compute server to restart automatically whenever it stops responding for about a minute, for any reason. That includes crashing after running out of RAM, or being a spot instance that is killed due to a surge in demand. Just check "Automatically Restart" in the compute server configuration dialog.
This is especially useful for the "Spot" provisioning type, since spot instances are up to 91% cheaper but tend to be killed at random, somewhere between 12 hours and 1 week after you start them. Spot compute servers with "Automatic Restart" enabled are ideal for hosting a powerful but affordable web service, or for running a computation that you update periodically (e.g., using crontab) or that checkpoints its state and can resume automatically.
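To make "checkpoint and resume" concrete, here is a minimal Python sketch of a job that survives being killed and restarted. It is not part of CoCalc: the file name `checkpoint.json`, the functions `load_checkpoint`, `save_checkpoint`, and `run`, and the toy sum-of-squares workload are all hypothetical and chosen only for illustration. The idea is to save progress periodically with an atomic write, and to load the last saved state on startup.

```python
import json
import os
import tempfile

CHECKPOINT = "checkpoint.json"  # hypothetical path; put it on a disk that survives restarts


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_i": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temp file, then rename, so a kill mid-write cannot corrupt the checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic replacement of the old checkpoint


def run(path, n_steps, every=100):
    """Toy workload: sum of i*i for i < n_steps, resuming from the last checkpoint."""
    state = load_checkpoint(path)
    while state["next_i"] < n_steps:
        i = state["next_i"]
        state["total"] += i * i
        state["next_i"] = i + 1
        if state["next_i"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]


if __name__ == "__main__":
    print(run(CHECKPOINT, 1_000_000, every=10_000))
```

If the server is killed partway through, at most the work since the last checkpoint is repeated after the restart. You would still need something to launch the script when the server comes back up; one option, assuming your image's cron supports it, is a crontab `@reboot` entry.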