[Chatllama] Support Inference for trained models. #320

Open
4 tasks
PierpaoloSorbellini opened this issue Mar 31, 2023 · 1 comment
Labels
chatllama Issue related to the ChatLLaMA module good first issue Good for newcomers

Comments

@PierpaoloSorbellini
Collaborator

PierpaoloSorbellini commented Mar 31, 2023

Description

Currently, to run inference with a trained model, the user has to write a small Python script tailored to how the library saves the model, loading the checkpoint or final model produced by training.

Moreover, several optimizations could be integrated to speed up inference, such as:

  • CPU Offloading.
  • llama.cpp implementation
  • accelerate / deepspeed distributed inference.
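
As a rough illustration of the CPU offloading option, here is a minimal sketch. It assumes the trained model has been exported in Hugging Face format, which may not match how chatllama saves its checkpoints today, and that `transformers` and `accelerate` are installed. `build_load_kwargs` and `generate` are hypothetical helper names, not part of chatllama.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(
    offload_dir: Optional[str] = None,
    max_memory: Optional[Dict[Any, str]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for `from_pretrained` that enable offloading."""
    # device_map="auto" lets accelerate fill the GPU first, then CPU RAM, then disk.
    kwargs: Dict[str, Any] = {"device_map": "auto"}
    if offload_dir is not None:
        # Weights that fit neither on GPU nor in CPU RAM are written to this folder.
        kwargs["offload_folder"] = offload_dir
    if max_memory is not None:
        # Example: {0: "10GiB", "cpu": "30GiB"} caps memory use per device.
        kwargs["max_memory"] = max_memory
    return kwargs


def generate(model_dir: str, prompt: str, max_new_tokens: int = 64, **load_opts: Any) -> str:
    """Load a Hugging Face format model from `model_dir` and complete `prompt`."""
    # Imported lazily so build_load_kwargs works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir, **build_load_kwargs(**load_opts))
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate(model_dir, "Hello", offload_dir="offload")` with a valid model directory would then run generation with layers spilled to CPU or disk as needed.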

TODO

  • Implement an Inference class that makes inference easy and usable from the CLI.
  • Implement inference with the optimisations available from deepspeed.
  • Implement inference with the optimisations available from accelerate.
  • Implement fast LLaMA inference using the existing llama.cpp library.
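
One possible shape for the Inference class and its CLI entry point is sketched below. Everything here is an assumption for discussion: `Inference`, `Backend`, `echo_backend`, `build_parser` and `main` are hypothetical names, and `echo_backend` is only a stand-in that does no real inference. The idea is that deepspeed, accelerate or llama.cpp backends could each be plugged in behind the same callable signature.

```python
import argparse
from typing import Callable, List, Optional

# A backend is any callable mapping (prompt, max_new_tokens) to generated text.
Backend = Callable[[str, int], str]


def echo_backend(prompt: str, max_new_tokens: int) -> str:
    """Placeholder backend so the sketch runs without a model; not real inference."""
    return " ".join(prompt.split()[:max_new_tokens])


class Inference:
    """Hypothetical wrapper giving one entry point regardless of the backend."""

    def __init__(self, backend: Backend, max_new_tokens: int = 64) -> None:
        self.backend = backend
        self.max_new_tokens = max_new_tokens

    def __call__(self, prompt: str) -> str:
        if not prompt.strip():
            raise ValueError("prompt must not be empty")
        return self.backend(prompt, self.max_new_tokens)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run inference on a trained model.")
    parser.add_argument("prompt", help="Text to send to the model.")
    parser.add_argument("--max-new-tokens", type=int, default=64)
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    # Swap echo_backend for a real loader once one exists.
    result = Inference(echo_backend, args.max_new_tokens)(args.prompt)
    print(result)
    return result
```

Passing `argv` explicitly keeps `main` easy to test; a console script entry point could call `main()` with no arguments to read from the command line.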
@PierpaoloSorbellini PierpaoloSorbellini added good first issue Good for newcomers chatllama Issue related to the ChatLLaMA module labels Mar 31, 2023
@shrinath-suresh
Contributor

@PierpaoloSorbellini The inference section is tagged as WIP. Is there any basic inference code available in chatllama to load the actor_rl model and run a few queries?
