[WIP] Add first prototype of client API #1

Open · wants to merge 1 commit into base: `main`
41 changes: 41 additions & 0 deletions .github/workflows/pylint.yml
@@ -0,0 +1,41 @@
```yaml
name: Python Lint

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.7, 3.8]

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install flake8 black mypy types-requests
      - name: Lint with Black
        run: |
          # fail if Black would reformat anything
          black lti_llm_client/ --check
      - name: Lint with flake8
        run: |
          # stop the build if there are Python syntax errors or undefined names
          flake8 lti_llm_client/ --count --select=C,E,F,W,B,B950 --ignore=E203,E501,E731,W503 --show-source --statistics
          # exit-zero treats all remaining errors as warnings; max-line-length matches Black's default of 88
          flake8 lti_llm_client --count --exit-zero --max-complexity=10 --max-line-length=88 --statistics
      - name: Type Checking with MyPy
        run: |
          # stop the build if there are type errors
          mypy --strict lti_llm_client/
```
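The `mypy --strict` step rejects, among other things, any function with missing annotations. A minimal sketch of the fully-annotated style that passes strict mode (the module and function here are hypothetical illustrations, not code from this PR; `typing.Dict` is used because the matrix still tests Python 3.7/3.8):

```python
# Hypothetical example of the style `mypy --strict` enforces: every
# parameter, return type, and local container is explicitly typed.
from typing import Dict, Optional


def build_prompt(text: str, max_tokens: Optional[int] = None) -> Dict[str, object]:
    """Assemble a request payload dictionary from a prompt string."""
    payload: Dict[str, object] = {"prompt": text}
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```

Under strict mode, dropping any of these annotations (or returning an untyped `dict`) fails the build, which is what makes the CI gate above meaningful.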
11 changes: 0 additions & 11 deletions .pre-commit-config.yaml

This file was deleted.

39 changes: 13 additions & 26 deletions README.md
```diff
@@ -1,33 +1,20 @@
-# Fast Inference Solutions for BLOOM
+# LTI's Large Language Model Deployment
 
-This repo provides demos and packages to perform fast inference solutions for BLOOM. Some of the solutions have their own repos in which case a link to the corresponding repos is provided instead.
+**TODO**: Add a description of the project.
 
-Some of the solutions provide both half-precision and int8-quantized solutions.
+This repo is a fork of [huggingface](https://huggingface.co/)'s [BLOOM inference demos](https://github.com/huggingface/transformers-bloom-inference).
 
-## Client-side solutions
+## Installation
 
-Solutions developed to perform large batch inference locally:
+```bash
+pip install -e .
+```
 
-Pytorch:
+## Example API Usage
 
-* [Accelerate, DeepSpeed-Inference and DeepSpeed-ZeRO](./bloom-inference-scripts)
+```python
+import lti_llm_client
 
-* Thomas Wang is working on a Custom Fused Kernel solution - will link once it's ready for general use.
-
-JAX:
-
-* [BLOOM Inference in JAX](https://github.com/huggingface/bloom-jax-inference)
-
-
-
-## Server solutions
-
-Solutions developed to be used in a server mode (i.e. varied batch size, varied request rate):
-
-Pytorch:
-
-* [Accelerate and DeepSpeed-Inference based solutions](./bloom-inference-server)
-
-Rust:
-
-* [Bloom-server](https://github.com/Narsil/bloomserver)
+client = lti_llm_client.Client()
+client.prompt("CMU's PhD students are")
+```
```
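The new README shows the intended call pattern (`Client()` plus `prompt()`) without showing the class itself. One way such a client could be implemented is as a thin HTTP wrapper; the sketch below uses only the standard library, and the default address, port, endpoint path, and response shape are all assumptions for illustration, not the prototype's actual API:

```python
import json
import urllib.request


class Client:
    """Minimal sketch of an HTTP client for a text-generation server.

    The default address/port, the `/generate` path, and the `{"text": ...}`
    response shape are hypothetical; the real prototype may differ.
    """

    def __init__(self, address: str = "localhost", port: int = 5000) -> None:
        self.url = f"http://{address}:{port}/generate"

    def prompt(self, text: str, max_new_tokens: int = 64) -> str:
        # POST the prompt as JSON and return the generated continuation.
        data = json.dumps({"prompt": text, "max_new_tokens": max_new_tokens}).encode()
        request = urllib.request.Request(
            self.url, data=data, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())["text"]
```

Keeping the client this thin leaves batching, model choice, and generation parameters on the server side, which matches the README's one-line usage example.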
174 changes: 0 additions & 174 deletions bloom-inference-scripts/README.md

This file was deleted.
