[WIP] Add first prototype of client API #1

Open · wants to merge 1 commit into base: `main`
41 changes: 41 additions & 0 deletions .github/workflows/pylint.yml
@@ -0,0 +1,41 @@
```yaml
name: Python Lint

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.7, 3.8]

    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install flake8 black mypy types-requests
      - name: Lint with Black
        run: |
          # fail if Black would reformat anything
          black lti_llm_client/ --check
      - name: Lint with flake8
        run: |
          # stop the build if there are Python syntax errors or undefined names
          flake8 lti_llm_client/ --count --select=C,E,F,W,B,B950 --ignore=E203,E501,E731,W503 --show-source --statistics
          # exit-zero treats all remaining errors as warnings; max-line-length matches Black's default of 88
          flake8 lti_llm_client --count --exit-zero --max-complexity=10 --max-line-length=88 --statistics
      - name: Type Checking with MyPy
        run: |
          # stop the build if there are type errors
          mypy --strict lti_llm_client/
```
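The `mypy --strict` step rejects, among other things, any function with missing annotations. A minimal sketch of the fully-annotated style that passes strict mode (the module and function here are hypothetical illustrations, not code from this PR; `typing.Dict` is used because the matrix still tests Python 3.7/3.8):

```python
# Hypothetical example of the style `mypy --strict` enforces: every
# parameter, return type, and local container is explicitly typed.
from typing import Dict, Optional


def build_prompt(text: str, max_tokens: Optional[int] = None) -> Dict[str, object]:
    """Assemble a request payload dictionary from a prompt string."""
    payload: Dict[str, object] = {"prompt": text}
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```

Under strict mode, dropping any of these annotations (or returning an untyped `dict`) fails the build, which is what makes the CI gate above meaningful.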
11 changes: 0 additions & 11 deletions .pre-commit-config.yaml

This file was deleted.

39 changes: 13 additions & 26 deletions README.md
```diff
@@ -1,33 +1,20 @@
-# Fast Inference Solutions for BLOOM
+# LTI's Large Language Model Deployment
 
-This repo provides demos and packages to perform fast inference solutions for BLOOM. Some of the solutions have their own repos in which case a link to the corresponding repos is provided instead.
+**TODO**: Add a description of the project.
 
-Some of the solutions provide both half-precision and int8-quantized solutions.
+This repo is a fork of [huggingface](https://huggingface.co/)'s [BLOOM inference demos](https://github.com/huggingface/transformers-bloom-inference).
 
-## Client-side solutions
+## Installation
 
-Solutions developed to perform large batch inference locally:
+```bash
+pip install -e .
+```
 
-Pytorch:
+## Example API Usage
 
-* [Accelerate, DeepSpeed-Inference and DeepSpeed-ZeRO](./bloom-inference-scripts)
+```python
+import lti_llm_client
 
-* Thomas Wang is working on a Custom Fused Kernel solution - will link once it's ready for general use.
-
-JAX:
-
-* [BLOOM Inference in JAX](https://github.com/huggingface/bloom-jax-inference)
-
-
-
-## Server solutions
-
-Solutions developed to be used in a server mode (i.e. varied batch size, varied request rate):
-
-Pytorch:
-
-* [Accelerate and DeepSpeed-Inference based solutions](./bloom-inference-server)
-
-Rust:
-
-* [Bloom-server](https://github.com/Narsil/bloomserver)
+client = lti_llm_client.Client()
+client.prompt("CMU's PhD students are")
+```
```
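The new README shows the intended call pattern (`Client()` plus `prompt()`) without showing the class itself. One way such a client could be implemented is as a thin HTTP wrapper; the sketch below uses only the standard library, and the default address, port, endpoint path, and response shape are all assumptions for illustration, not the prototype's actual API:

```python
import json
import urllib.request


class Client:
    """Minimal sketch of an HTTP client for a text-generation server.

    The default address/port, the `/generate` path, and the `{"text": ...}`
    response shape are hypothetical; the real prototype may differ.
    """

    def __init__(self, address: str = "localhost", port: int = 5000) -> None:
        self.url = f"http://{address}:{port}/generate"

    def prompt(self, text: str, max_new_tokens: int = 64) -> str:
        # POST the prompt as JSON and return the generated continuation.
        data = json.dumps({"prompt": text, "max_new_tokens": max_new_tokens}).encode()
        request = urllib.request.Request(
            self.url, data=data, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())["text"]
```

Keeping the client this thin leaves batching, model choice, and generation parameters on the server side, which matches the README's one-line usage example.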
174 changes: 0 additions & 174 deletions bloom-inference-scripts/README.md

This file was deleted.
