runscript.help
This container provides a convenient way to run LLaVA. In addition to the LLaVA
module, it includes the following commands:
- `llava-run`, a command-line wrapper for LLaVA inference
- `hyak-llava-web`, a wrapper that launches the Gradio web interface and prints
  an SSH connection string you can copy to open a tunnel from your own computer
  (see the example below)
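
To launch the web interface, you can invoke `hyak-llava-web` the same way
`llava-run` is invoked below. This is a sketch; the exact SSH connection string
to use is printed by the tool itself when the interface starts:

apptainer run --nv --writable-tmpfs \
oras://ghcr.io/uw-psych/llava-container/llava-container:latest \
hyak-llava-web

Run the printed SSH command on your own computer to open the tunnel, then visit
the forwarded local address in a browser (Gradio serves on port 7860 by
default).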
To run LLaVA with the `llava-run` script, use the following command:
apptainer run --nv --writable-tmpfs \
oras://ghcr.io/uw-psych/llava-container/llava-container:latest \
llava-run [llava-run arguments]
You must pass the `--nv` flag to enable GPU support. Depending on your intended
use, you may also want to pass the `--bind` flag to mount a directory from the
host system into the container.
To specify a directory to use for the HuggingFace model cache and enable access
to /gscratch, use the following command:
apptainer run --nv --writable-tmpfs \
--env HUGGINGFACE_HUB_CACHE=/path/to/cache \
--bind /gscratch \
oras://ghcr.io/uw-psych/llava-container/llava-container:latest \
llava-run [llava-run arguments]
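
As an illustration, a complete invocation might look like the following (the
image path and query here are placeholders, not part of the container):

apptainer run --nv --writable-tmpfs \
oras://ghcr.io/uw-psych/llava-container/llava-container:latest \
llava-run --image-file /path/to/image.jpg \
--query "What is shown in this image?"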
The following describes the usage of this script:
llava-run [-h] [--model-path PATH] [--model-base PATH] --image-file
          IMAGE [IMAGE ...] (--query QUERY [QUERY ...] | --chat)
          [--json]
          [--conv-mode {v0,v1,vicuna_v1,llama_2,plain,v0_plain,llava_v0,v0_mmtag,llava_v1,v1_mmtag,llava_llama_2,mpt}]
          [--stack-sep SEP] [--temperature FLOAT] [--top_p FLOAT]
          [--num_beams N] [--max_new_tokens N]
          [--load-8bit | --load-4bit] [--device {cuda,cpu}]
          [--hf-cache-dir DIR]
options:
  -h, --help            show this help message and exit
  --model-path PATH     Model path
  --model-base PATH     Model base (required for 'lora' models)
  --image-file IMAGE [IMAGE ...]
                        Path or URL to image (provide multiple to process in
                        batch; use the --stack-sep delimiter within a path to
                        stack image inputs)
  --query QUERY [QUERY ...]
                        Query (can be specified multiple times, e.g.
                        --query a --query b)
  --chat                Use interactive chat instead of a one-shot query
  --json                Produce JSON output
  --conv-mode {v0,v1,vicuna_v1,llama_2,plain,v0_plain,llava_v0,v0_mmtag,llava_v1,v1_mmtag,llava_llama_2,mpt}
                        Conversation mode
  --stack-sep SEP       Internal separator for stacked image files (default:
                        ",")
  --temperature FLOAT   Temperature (default: 0.2)
  --top_p FLOAT         Top p (default: 1.0)
  --num_beams N         Number of beams (default: 1)
  --max_new_tokens N    Max new tokens (default: 512)
  --load-8bit           Load the model in 8-bit precision
  --load-4bit           Load the model in 4-bit precision
  --device {cuda,cpu}   Device to use
  --hf-cache-dir DIR    HuggingFace cache directory
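
For example, a hypothetical invocation (the file names and query are
placeholders) that stacks two images into a single input and requests JSON
output:

llava-run --image-file left.jpg,right.jpg \
--query "Compare the two stacked images" --json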
For details on the arguments, see the LLaVA documentation and the usage
information for llava.eval.run_llava and llava.serve.cli.