It would seem like a nice idea to abstract asset I/O away from the inference, but currently it's not possible: both llama.cpp and whisper.cpp only provide interfaces for loading models by filename.
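For reference, the relevant loading entry points look roughly like this (sketched from the public headers; exact signatures drift between versions, so treat this as a snapshot, not a stable contract):

```cpp
// llama.h -- the model loader takes a filesystem path
LLAMA_API struct llama_model * llama_load_model_from_file(
        const char * path_model,
        struct llama_model_params params);

// whisper.h -- same story: a path and nothing else
WHISPER_API struct whisper_context * whisper_init_from_file(const char * path_model);
```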
Here's what we can do:

1. Don't abstract the I/O away. Require the asset manager to provide filenames and leave the I/O to the inference implementation. This, however, prevents us from using more exotic storage media: APK-baked assets on Android, file system assets in the browser, completely in-memory asset storage. (A sketch of what this implies for the asset manager follows after this list.)
2. Make PRs to whisper.cpp and llama.cpp adding alternative ways to load models: preferably via `iostream`, but something custom which provides the necessary runtime polymorphism will also do. (Also sketched below.)
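Here's a minimal sketch of what option 1 means for the asset manager. The `AssetManager` name and its method are hypothetical, nothing like this exists yet; the point is that the manager's only job becomes materializing an asset as a real file and handing back its path:

```cpp
#include <string>

// Hypothetical interface -- name and signature are illustrative only.
// Under option 1 the asset manager must resolve every asset to an
// actual file on disk, because the inference libs only accept paths.
class AssetManager {
public:
    virtual ~AssetManager() = default;

    // Return a filesystem path for the given asset id, copying or
    // extracting the asset to a temporary file first if it doesn't
    // already live on a regular filesystem.
    virtual std::string getAssetPath(const std::string& assetId) = 0;
};
```

And the "something custom" in option 2 would be an abstract reader along these lines (equally hypothetical; it's the kind of interface such a PR would have to thread through the loaders in place of `FILE*`/`std::ifstream`):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reader a loading PR could accept instead of a filename,
// giving us the runtime polymorphism to back models with APK assets,
// memory buffers, etc.
struct ModelReader {
    virtual ~ModelReader() = default;
    virtual size_t read(void* dst, size_t size) = 0; // read up to `size` bytes, return bytes read
    virtual bool seek(int64_t offset) = 0;           // absolute seek from the start
    virtual int64_t size() const = 0;                // total size in bytes
};
```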
Since we kinda-sorta plan on rewriting llama and whisper anyway, we can take option 1 for now and leave that problem to future us.