Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support model-loading from .npz files #341

Closed
jerinphilip opened this issue Feb 8, 2022 · 3 comments · Fixed by #342
Closed

Support model-loading from .npz files #341

jerinphilip opened this issue Feb 8, 2022 · 3 comments · Fixed by #342
Labels
mod: bergamot Changes affecting bergamot-library

Comments

@jerinphilip
Copy link
Contributor

jerinphilip commented Feb 8, 2022

XapaJIaMnu/translateLocally#89

Marian supports this path. It may be possible for us to support this - but this shouldn't be at the cost of making TranslationModelcomplicated. Current path is:

// (1) For WebAssembly: string config, legal MemoryBundle, missing entires corresponding to bundle. Thin.
TranslationModel(const std::string& config, MemoryBundle&& memory, size_t replicas = 1)
  : TranslationModel(parseOptionsFromString(config, /*validate=*/false), std::move(memory), replicas){};
  
// (2) Constructor converting options based load into a fast-load by loading MemoryBundle from Options
TranslationModel(const Config& options, size_t replicas = 1)
  : TranslationModel(options, getMemoryBundleFromConfig(options), replicas) {}

// (3) The constructor which does the heavylifting. Chooses mmap based fastpath if memory is non-empty. 
// Falls back to options based FS loads one otherwise.
TranslationModel(const Config& options, MemoryBundle&& memory, size_t replicas = 1);

We have already expressed intention to clean a bit of MemoryBundle (affecting getMemoryBundleFromConfig) in #338. If we can find a way to (a) not load into MemoryBundle from .npz, or alternatively, (b) load into MemoryBundle from .npz we may be able to achieve what we want.

We'd prefer (b) for when #257 is solved?

Marian checks based on extension https://github.com/marian-nmt/marian-dev/blob/b29cc07a95f49df7825f3a92e860bd642db0e812/src/common/io.cpp#L13-L21), which we can use.

@XapaJIaMnu sounds like the sketch of a plan?

@jerinphilip jerinphilip added the mod: bergamot Changes affecting bergamot-library label Feb 8, 2022
@XapaJIaMnu
Copy link
Collaborator

Yes sounds good!

@jerinphilip
Copy link
Contributor Author

@XapaJIaMnu: I had a brief conversation with @graemenail regarding the mmap path. The following is what I understood.

  1. The mmap path places loaded content in contiguous storage and provides a pointer to the graph.
  2. The NumPy file loads into vector<io::Item>, which are not consumable by the fast loading path.

Because 2, (b) mentioned in #341 (comment) does not appear straightforward. #342 has a working prototype with (a), where we return an empty AlignedMemory and loading falls back to the non-fast path. Is it possible to convert .npz to .bin on the fly and get closer towards (b)? Are there other approaches I am missing? I have limited knowledge of the source doing these things and could use inputs from expertise in the area.

Also, please bear in mind .npz loads are suggested for quick debugging use-cases, so we may defer spending more time here to quickly get the final functionality if (b) takes longer and get (a). I intend to check in a test in BRT to ensure this stays stable.

I appreciate any help you can provide.

@XapaJIaMnu
Copy link
Collaborator

ugh, this is a nightmare.
Both the fast and the slow coding path eventually go through this function: https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/common/binary.cpp#L30
As there is reordering memory on the fly based on the architecture, we never have a direct a direct way of feeding bytes into the graph.. There shouldn't be anything preventing (b) from working as we do eventually go through the function that I mentioned, therefore we do go through the intermediate vectorio::Item representation.

jerinphilip added a commit that referenced this issue Feb 9, 2022
Changes `ABORT` on non `.bin` model to an additional check for a `.npz` 
extension. If `.bin`, the fast load path is activated by returning `AlignedMemory`. 
Otherwise, the return of empty `AlignedMemory` causes fallback to
filesystem-based loads.

BRT: A test that checks if translation using `.npz` is approximately similar to 
that of default CLI translation is checked in to ensure stability going ahead.

Previously, we only supported `.bin` models' loading via a fast mmap 
path. While we had the underlying capability to load non `.bin` models, this 
was not exposed, encouraging fast loads. Loading `.npz` models are helpful 
for quick debugging and broader coverage of models available, which will 
enhance user experience at translateLocally and python bindings. 


Fixes #341.
See also: XapaJIaMnu/translateLocally#89
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mod: bergamot Changes affecting bergamot-library
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants