shelf combines the pytree registry from JAX with the fsspec project.
As in JAX, you register a pair of serialization and deserialization callbacks for a type. Once registered, your custom Python types can be saved as files anywhere fsspec can reach.
Here's how to register a custom neural network type that uses pickle to store trained models on disk:
# my_model.py
import pickle

import numpy as np

import shelf


class MyModel:
    def __call__(self):
        return 42

    def train(self, data: np.ndarray):
        pass

    def score(self, data: np.ndarray):
        return 1.


def save_to_disk(model: MyModel, ctx: shelf.Context) -> None:
    """Dumps the model to ``my-model.pkl`` using `pickle`."""
    fp = ctx.file("my-model.pkl", mode="wb")
    pickle.dump(model, fp)


def load_from_disk(ctx: shelf.Context) -> MyModel:
    """Reloads the previously pickled model."""
    # Exactly one file is expected in the context.
    fname, = ctx.filenames
    fp = ctx.file(fname, mode="rb")
    model: MyModel = pickle.load(fp)
    return model


# Tell shelf how to serialize and deserialize `MyModel` instances.
shelf.register_type(MyModel, save_to_disk, load_from_disk)
Now, for example in your training loop, you can save the model anywhere using a Shelf:
import numpy as np
from shelf import Shelf

from my_model import MyModel


def train():
    # Initialize a `Shelf` to handle remote I/O.
    shelf = Shelf()

    model = MyModel()
    data = np.random.randn(100)

    # Train your model...
    for epoch in range(10):
        model.train(data)

    # and save it to S3...
    shelf.put(model, "s3://my-bucket/my-model.pkl")
    # ... or GCS if you prefer...
    shelf.put(model, "gs://my-bucket/my-model.pkl")
    # ... or Azure!
    shelf.put(model, "az://my-blob/my-model.pkl")
Conversely, if you want to reinstantiate a remotely stored model:
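def score():
    # A `Shelf` is needed here as well; the one in `train()` is local to that function.
    shelf = Shelf()
    model = shelf.get("s3://my-bucket/my-model.pkl", MyModel)
    accuracy = model.score(np.random.randn(100))
    print(f"And here's how accurately it predicts: {accuracy:.2%}")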
And just like that, you can push and pull your custom models and data artifacts anywhere you like. Your service of choice just needs to have a supporting fsspec filesystem implementation available.
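The target service is identified by the protocol prefix of the location string (`s3://`, `gs://`, `az://` in the examples above). The sketch below uses only the standard library to show how such a prefix can be separated from the rest of the path. It is an illustration under stated assumptions, not fsspec's own parsing logic: `split_location` is a hypothetical helper, and treating a missing prefix as a local `file` path is an assumption of this example.

```python
from urllib.parse import urlsplit


def split_location(url: str):
    """Splits a location into (protocol, path); hypothetical helper."""
    parts = urlsplit(url)
    # Assumption for this sketch: no protocol prefix means a local file.
    protocol = parts.scheme or "file"
    path = parts.netloc + parts.path
    return protocol, path


for url in (
    "s3://my-bucket/my-model.pkl",
    "gs://my-bucket/my-model.pkl",
    "az://my-blob/my-model.pkl",
    "/tmp/my-model.pkl",
):
    print(split_location(url))
```

A filesystem implementation for the resulting protocol must be installed for the corresponding location to be usable.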
shelf is an experimental project; expect bugs and sharp edges.
Install it directly from source, for example using either pip or poetry:
pip install git+https://github.com/nicholasjng/shelf.git
# or
poetry add git+https://github.com/nicholasjng/shelf.git
A PyPI package release is planned for the future.