It would seem like a nice idea to abstract asset I/O away from the inference, but currently it's not possible: both llama.cpp and whisper.cpp only provide interfaces for loading models by filename.
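For reference, the relevant loading entry points look roughly like this (sketched from the public headers; exact signatures drift between versions, so treat this as a snapshot, not a stable contract):

```cpp
// llama.h -- the model loader takes a filesystem path
LLAMA_API struct llama_model * llama_load_model_from_file(
        const char * path_model,
        struct llama_model_params params);

// whisper.h -- same story: a path and nothing else
WHISPER_API struct whisper_context * whisper_init_from_file(const char * path_model);
```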
Here's what we can do:

1. Don't abstract the I/O away. Require the asset manager to provide filenames and leave the I/O to the inference implementation. This, however, prevents us from using more exotic storage media: APK-baked assets on Android, file system assets in the browser, completely in-memory asset storage. (A sketch of what this implies for the asset manager follows after this list.)
2. Make PRs to whisper.cpp and llama.cpp adding alternative ways to load models: preferably via `iostream`, but something custom which provides the necessary runtime polymorphism will also do. (Also sketched below.)
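Here's a minimal sketch of what option 1 means for the asset manager. The `AssetManager` name and its method are hypothetical, nothing like this exists yet; the point is that the manager's only job becomes materializing an asset as a real file and handing back its path:

```cpp
#include <string>

// Hypothetical interface -- name and signature are illustrative only.
// Under option 1 the asset manager must resolve every asset to an
// actual file on disk, because the inference libs only accept paths.
class AssetManager {
public:
    virtual ~AssetManager() = default;

    // Return a filesystem path for the given asset id, copying or
    // extracting the asset to a temporary file first if it doesn't
    // already live on a regular filesystem.
    virtual std::string getAssetPath(const std::string& assetId) = 0;
};
```

And the "something custom" in option 2 would be an abstract reader along these lines (equally hypothetical; it's the kind of interface such a PR would have to thread through the loaders in place of `FILE*`/`std::ifstream`):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reader a loading PR could accept instead of a filename,
// giving us the runtime polymorphism to back models with APK assets,
// memory buffers, etc.
struct ModelReader {
    virtual ~ModelReader() = default;
    virtual size_t read(void* dst, size_t size) = 0; // read up to `size` bytes, return bytes read
    virtual bool seek(int64_t offset) = 0;           // absolute seek from the start
    virtual int64_t size() const = 0;                // total size in bytes
};
```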
Since we kinda-sorta plan on rewriting llama and whisper anyway, we can take option 1 for now and leave that problem to future us.