This repository has been archived by the owner on Jun 26, 2024. It is now read-only.
Releases: aniketmaurya/llm-inference
v0.0.6
What's Changed
- refactor package by @aniketmaurya in #13
- refactor apis by @aniketmaurya in #14
Full Changelog: v0.0.5...v0.0.6
Chatbot with Lit-GPT x LangChain
What's Changed
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #8
- fix build ci by @aniketmaurya in #9
- Refactor packaging by @aniketmaurya in #10
- refactor Chatbot by @aniketmaurya in #11
- longchat chatbot by @aniketmaurya in #12
New Contributors
- @pre-commit-ci made their first contribution in #8
Full Changelog: v0.0.4...v0.0.5
Chatbot support & bug fixes
What's Changed
- Chatbot by @aniketmaurya in #4
- Refactor bot by @aniketmaurya in #5
How to use Chatbot
from chatbot import LLaMAChatBot

# Paths to the model weights and the tokenizer file.
checkpoint_path = "state_dict.pth"
tokenizer_path = "tokenizer.model"

bot = LLaMAChatBot(
    checkpoint_path=checkpoint_path, tokenizer_path=tokenizer_path
)
print(bot.send("hi, what is the capital of France?"))
Full Changelog: v0.0.2...v0.0.3
v0.0.2
What's Changed
- Load finetuned weights by @aniketmaurya in #2
- Refactor serve by @aniketmaurya in #3
For inference
from llama_inference import LLaMAInference
import os
WEIGHTS_PATH = os.environ["WEIGHTS"]
checkpoint_path = f"{WEIGHTS_PATH}/lit-llama/7B/state_dict.pth"
tokenizer_path = f"{WEIGHTS_PATH}/lit-llama/tokenizer.model"
model = LLaMAInference(checkpoint_path=checkpoint_path, tokenizer_path=tokenizer_path, dtype="bfloat16")
print(model("New York is located in"))
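The inference example above reads the weights directory from the `WEIGHTS` environment variable and fails with a bare `KeyError` if it is unset. As a minimal sketch, the paths could be resolved and checked before the model is loaded. The helper names `resolve_weight_paths` and `missing_files` are hypothetical and not part of `llama_inference`; the directory layout is taken from the example above.

```python
import os
from pathlib import Path


def resolve_weight_paths(weights_root=None, model_size="7B"):
    """Return (checkpoint_path, tokenizer_path) for the lit-llama layout.

    Hypothetical helper: not part of llama_inference.
    """
    # Fall back to the WEIGHTS environment variable used in the release example.
    root = weights_root or os.environ.get("WEIGHTS")
    if not root:
        raise RuntimeError(
            "Set the WEIGHTS environment variable or pass weights_root"
        )
    base = Path(root) / "lit-llama"
    return base / model_size / "state_dict.pth", base / "tokenizer.model"


def missing_files(*paths):
    """Return the paths that are not existing files, in the order given."""
    return [p for p in paths if not Path(p).is_file()]
```

The resolved paths can then be passed as `checkpoint_path` and `tokenizer_path`, after confirming that `missing_files` returns an empty list.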
For serving a REST API
# app.py
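# PromptRequest is assumed to be exported by llama_inference.serve alongside Response.
from llama_inference.serve import PromptRequest, Response, ServeLLaMA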
import lightning as L
component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)
Full Changelog: v0.0.1...v0.0.2
v0.0.1
What's Changed
- Deploy LLaMA with Lightning App by @aniketmaurya in #1
Full Changelog: https://github.com/aniketmaurya/LLaMA-Inference-API/commits/v0.0.1