# llava-container
This container provides a convenient way to run [LLaVA](https://github.com/haotian-liu/LLaVA) on Hyak.
## Running LLaVA on Hyak
First, you'll need to log in to Hyak. If you've never set this up, go [here](https://uw-psych.github.io/compute_docs).
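If you already have SSH access set up, logging in from your local machine typically looks like the following sketch (this assumes the `klone` login node and uses `<uw-netid>` as a placeholder for your UW NetID):
```bash
# Log in to the Hyak login node (replace <uw-netid> with your UW NetID):
ssh <uw-netid>@klone.hyak.uw.edu
```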
Then, you'll need to request a compute node. You can do this with the `salloc` command:
```bash
# Request a GPU node with 8 CPUs, 2 GPUs, 64GB of RAM, and 1 hour of runtime:
# (Note: you may need to change the account and partition)
salloc --account escience --partition gpu-a40 --mem 64G -c 8 --time 1:00:00 --gpus 2
```
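Once the allocation is granted, Slurm drops you into a shell on the compute node. If you want to double-check what you received, the standard Slurm and NVIDIA tools can confirm it (optional; shown here only as a sanity check):
```bash
# Optional sanity checks on the compute node:
squeue -u "$USER"   # your interactive job should be listed as running
nvidia-smi          # should show the two allocated GPUs
```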
Once you're logged in to the compute node, you should set up your cache directories and Apptainer settings.
*If you're following this tutorial, **you should do this every time you run LLaVA on Hyak!** By default, Apptainer caches to your home directory, which will quickly fill it up and cause your jobs to fail.*
```bash
# Do this in every session where you're running LLaVA on Hyak!
# Set up cache directories:
export APPTAINER_CACHEDIR="/gscratch/scrubbed/${USER}/.cache/apptainer"
export HUGGINGFACE_HUB_CACHE="/gscratch/scrubbed/${USER}/.cache/huggingface"
mkdir -p "${APPTAINER_CACHEDIR}" "${HUGGINGFACE_HUB_CACHE}"
# Set up Apptainer:
export APPTAINER_BIND=/gscratch APPTAINER_WRITABLE_TMPFS=1 APPTAINER_NV=1
```
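Since these exports have to be repeated in every session, one option is to keep them in a small script on `/gscratch` and `source` it each time. A minimal sketch (the file name and location below are just an example):
```bash
# Write the settings to a file once (example path; pick whatever you like):
mkdir -p "/gscratch/scrubbed/${USER}"
cat > "/gscratch/scrubbed/${USER}/llava-env.sh" <<'EOF'
export APPTAINER_CACHEDIR="/gscratch/scrubbed/${USER}/.cache/apptainer"
export HUGGINGFACE_HUB_CACHE="/gscratch/scrubbed/${USER}/.cache/huggingface"
mkdir -p "${APPTAINER_CACHEDIR}" "${HUGGINGFACE_HUB_CACHE}"
export APPTAINER_BIND=/gscratch APPTAINER_WRITABLE_TMPFS=1 APPTAINER_NV=1
EOF

# Then, in each new session on a compute node:
source "/gscratch/scrubbed/${USER}/llava-env.sh"
```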
Then, you can run LLaVA. Let's try it with the sample image from LLaVA's repository:
![Sample image](https://llava-vl.github.io/static/images/view.jpg)
```bash
# Run LLaVA:
apptainer run \
oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
llava-run \
--model-path liuhaotian/llava-v1.5-7b \
--image-file "https://llava-vl.github.io/static/images/view.jpg" \
--query "What's going on here?"
# Description of the arguments:
# llava-run: the command to run in the container
# --model-path: the name of the model to use
# --image-file: the URL of the image to use
# --query: what to ask the model
```
If it's working, you should see output that looks something like this:
> The image features a pier extending out into a large body of water, possibly a lake or a river. The pier is made of wood and has a few benches placed on it, providing a place for people to sit and enjoy the view. The water appears calm and serene, making it an ideal spot for relaxation and contemplation.
>
> In the background, there are mountains visible, adding to the picturesque scenery. The pier is situated in front of a forest, creating a peaceful and natural atmosphere.
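If GPU memory is tight, `llava-run` can also load the model in 4- or 8-bit precision (see the `llava-run` section below). Assuming the flag names match the upstream LLaVA CLI, that would look like this:
```bash
# Same query, but loading the model weights in 4-bit precision.
# (--load-4bit is the upstream LLaVA flag name; assumed unchanged here.)
apptainer run \
    oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
    llava-run \
    --model-path liuhaotian/llava-v1.5-7b \
    --load-4bit \
    --image-file "https://llava-vl.github.io/static/images/view.jpg" \
    --query "What's going on here?"
```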
When you're done, you can exit the compute node with the command `exit` or `Ctrl-D`.
### Chat mode
For chat, just pass `--chat` instead of `--query`:
```bash
apptainer run \
oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
llava-run \
--model-path liuhaotian/llava-v1.5-7b \
--image-file "https://llava-vl.github.io/static/images/view.jpg" \
--chat
```
### Running other commands
If you want to run a different command, such as one of the commands that ships with LLaVA, you can pass it after the image name:
```bash
apptainer run \
oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
python -m llava.serve.cli
```
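For example, you could start the upstream LLaVA CLI against the same sample image. The arguments below come from upstream LLaVA's `llava.serve.cli` and are assumed to work unchanged inside the container:
```bash
# Run the upstream LLaVA CLI (arguments are upstream LLaVA's, assumed unchanged):
apptainer run \
    oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
    python -m llava.serve.cli \
    --model-path liuhaotian/llava-v1.5-7b \
    --image-file "https://llava-vl.github.io/static/images/view.jpg"
```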
### Improving startup time
If you notice slowness when launching the container, you can try extracting the container image to a sandbox directory:
```bash
# Set up a sandbox directory:
SANDBOX="/tmp/${USER}/sandbox/llava" && mkdir -p "$(dirname "${SANDBOX}")"
# Extract the container image to the sandbox:
apptainer build --sandbox "${SANDBOX}" oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest
# Run LLaVA by passing the sandbox directory instead of the image URL:
apptainer run \
"${SANDBOX}" \
llava-run \
--model-path liuhaotian/llava-v1.5-7b \
--image-file "https://llava-vl.github.io/static/images/view.jpg" \
--query "What's going on here?"
```
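Keep in mind that `/tmp` is local to the compute node and is typically cleaned up when your job ends, so the sandbox may need to be rebuilt in each new session. If you want to reclaim the space yourself, removing it is enough:
```bash
# Remove the sandbox when you're finished with it:
rm -rf "${SANDBOX}"
```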
### Running the web interface 🕸️
Included in the container is a wrapper script for the LLaVA web interface. To run it, you can use the following command:
```bash
apptainer run \
oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
hyak-llava-web
```
This script prints a command for setting up an SSH tunnel to the web interface. The output should look something like this:
```bash
# To access the gradio web server, run the following command on your local machine:
ssh -o StrictHostKeyChecking=no -N -L 8000:localhost:53641 -J [email protected] altan@g3021
```
You should be able to copy and paste this command into your terminal to set up the SSH tunnel. Then, you can open `http://localhost:8000` in your browser to access the web interface.
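In general terms, the printed command follows the shape below; the exact remote port and compute node come from the script's output, and the jump host shown here is only an assumed example of a Hyak login node:
```bash
# General shape of the tunnel command (placeholders, not literal values):
#   -L <LOCAL_HTTP_PORT>:localhost:<gradio-port>  forwards your local port to the web UI
#   -J <uw-netid>@klone.hyak.uw.edu               jumps through the login node (assumed)
ssh -N -L <LOCAL_HTTP_PORT>:localhost:<gradio-port> -J <uw-netid>@klone.hyak.uw.edu <uw-netid>@<compute-node>
```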
To configure the web interface, you can set the following environment variables:
- `MODEL_PATHS`: a list of model paths, quoted and separated by spaces (default: `"liuhaotian/llava-v1.5-7b"`). Available models include, but are not limited to:
  - `liuhaotian/llava-v1.5-7b`
  - `liuhaotian/llava-v1.5-13b`
  - `liuhaotian/llava-v1.5-7b-lora`
  - `liuhaotian/llava-v1.5-13b-lora`

  See https://github.com/haotian-liu/LLaVA/blob/main/docs/MODEL_ZOO.md for more details.
- `GRADIO_CONTROLLER_PORT`: the port number for the gradio controller (or leave it empty to use a random port)
- `LOCAL_HTTP_PORT`: the port number to print for the local HTTP server SSH tunnel command (default: `8000`)
For example:
```bash
export MODEL_PATHS='liuhaotian/llava-v1.5-13b' # Use the 13b model instead of the 7b model
export LOCAL_HTTP_PORT=9000 # Use port 9000 instead of 8000
apptainer run \
oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
hyak-llava-web
```
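To serve more than one model from the same interface, `MODEL_PATHS` takes several space-separated entries in a single quoted string (assuming the requested GPUs have enough memory for all of them), for example:
```bash
# Serve both the 7B and 13B checkpoints from one web interface:
export MODEL_PATHS='liuhaotian/llava-v1.5-7b liuhaotian/llava-v1.5-13b'
apptainer run \
    oras://ghcr.io/<%= "${GITHUB_REPOSITORY:-uw-psych/llava-container}" %>/llava-container:latest \
    hyak-llava-web
```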
*You need to select the model from the dropdown to start. If the model doesn't appear in the dropdown, wait a few seconds and refresh the page.*
## `llava-run`
The `llava-run.py` script is a modification of [`LLaVA/llava/eval/run_llava.py`](https://github.com/haotian-liu/LLaVA/blob/main/llava/eval/run_llava.py) that adds support for loading 4- and 8-bit models, as found in [`LLaVA/llava/serve/cli.py`](https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/cli.py), as well as a chat mode that lets you hold a conversation with the model.
The following describes the usage of `llava-run`:
```plain
<%+ runscript.help.esh %>
```
See the [documentation](https://github.com/haotian-liu/LLaVA/blob/main/README.md) for LLaVA or the source code for [`llava/eval/run_llava.py`](https://github.com/haotian-liu/LLaVA/blob/main/llava/eval/run_llava.py) and [`llava/serve/cli.py`](https://github.com/haotian-liu/LLaVA/blob/main/llava/serve/cli.py) for more information on the arguments.