To set up the environments, follow the instructions in the existing repositories and download the necessary checkpoints. The notes below supplement those instructions and cover common pitfalls such as package version conflicts and system-related problems.
- (optional) Checkpoints can be downloaded manually; otherwise, each model downloads them automatically as long as the Internet connection works.
- Use `export DECORD_EOF_RETRY_MAX=20480` to prevent possible issues with decord.
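The variable must be visible to the process that launches evaluation; a minimal sketch of setting and sanity-checking it (the `echo` line is only for illustration — for a permanent fix, add the `export` line to your shell profile, e.g. `~/.bashrc`):

```shell
# Raise decord's EOF retry limit so long or slightly corrupt videos
# do not abort decoding with an EOF error.
export DECORD_EOF_RETRY_MAX=20480

# Sanity check: confirm the variable is set in the current shell.
echo "DECORD_EOF_RETRY_MAX=${DECORD_EOF_RETRY_MAX}"
```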
- Video-ChatGPT
- Installation: Instruction
- Checkpoints:
- Source: Video-ChatGPT-7B, LLaVA-Lightening-7B-v1-1, clip-vit (optional)
- Structure:
├── checkpoints/Video-ChatGPT-7B
│   ├── LLaVA-7B-Lightening-v1-1
│   ├── Video-ChatGPT-7B
│   └── clip-vit-large-patch14 (optional)
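Before running evaluation it can help to confirm the layout above is in place. A minimal sketch (the `mkdir -p` line only creates an empty demo skeleton for illustration — the real weights must come from the download sources listed above):

```shell
# Expected Video-ChatGPT checkpoint layout (names taken from the tree above).
base=checkpoints/Video-ChatGPT-7B
mkdir -p "$base/LLaVA-7B-Lightening-v1-1" "$base/Video-ChatGPT-7B"  # demo skeleton only

# Report which required sub-directories are present.
for d in LLaVA-7B-Lightening-v1-1 Video-ChatGPT-7B; do
  if [ -d "$base/$d" ]; then echo "ok: $d"; else echo "missing: $d"; fi
done
```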
- Valley
- Installation: Instruction
- Checkpoints:
- Source: Valley2-7b
- Structure:
├── checkpoints/Valley2-7b
- Video-LLaMA-2
- Installation: Instruction
- Possible Issues:
torchaudio error: `OSError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory`
--> Solution: reinstall torchaudio
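When reinstalling, torchaudio must match the installed torch build. A hedged sketch that only derives and prints the reinstall command (assumption: for torch 1.x, the matching torchaudio version is `0.` followed by torch's minor and patch numbers, e.g. torch 1.13.1 pairs with torchaudio 0.13.1):

```shell
# Derive the torchaudio pin matching the torch version used in this guide.
torch_ver="1.13.1"              # torch version pinned elsewhere in this guide
audio_ver="0.${torch_ver#1.}"   # strip the leading "1." -> 0.13.1
echo "pip install --force-reinstall torchaudio==${audio_ver}"
```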
- Checkpoints:
- Source: Video-LLaMA-2-7B-Finetuned/Video-LLaMA-2-13B-Finetuned, VIT (optional), qformer (optional)
- Structure:
├── checkpoints/Video-LLaMA-2-7B-Finetuned
│   ├── AL_LLaMA_2_7B_Finetuned.pth
│   ├── imagebind_huge.pth
│   ├── llama-2-7b-chat-hf
│   ├── VL_LLaMA_2_7B_Finetuned.pth
│   ├── blip2_pretrained_flant5xxl.pth (optional)
│   └── eva_vit_g.pth (optional)
├── checkpoints/Video-LLaMA-2-13B-Finetuned
│   ├── AL_LLaMA_2_13B_Finetuned.pth
│   ├── imagebind_huge.pth
│   ├── llama-2-13b-chat-hf
│   ├── VL_LLaMA_2_13B_Finetuned.pth
│   ├── blip2_pretrained_flant5xxl.pth (optional)
│   └── eva_vit_g.pth (optional)
- VideoChat2
- Installation: Instruction
- Possible Issue 1:
`ERROR: Could not find a version that satisfies the requirement torch==1.13.1+cu117`
--> Solution: `pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117`
- Possible Issue 2:
flash-attention installation error
--> Solution: flash-attention is not required for inference and can be skipped
- Checkpoints:
- Source: llama-7b, UMT-L-Qformer, VideoChat2_7B_stage2, VideoChat2_7B_stage3, Vicuna-7B-delta + script
- Structure:
├── checkpoints/VideoChat2
│   ├── umt_l16_qformer.pth
│   ├── videochat2_7b_stage2.pth
│   ├── videochat2_7b_stage3.pth
│   └── vicuna-7b-v0
- Video-LLaVA
- Installation: Instruction
- Checkpoints:
- Source: Video-LLaVA-7B, LanguageBind_Video (optional), LanguageBind_Image (optional)
- Structure:
├── checkpoints/VideoLLaVA
│   ├── Video-LLaVA-7B
│   ├── LanguageBind_Video_merge (optional)
│   └── LanguageBind_Image (optional)
- Video-LaVIT
- Installation: Instruction
- Possible Issue:
the package `motion-vector-extractor` is not supported on CentOS
--> No alternative on CentOS
- Possible Issue:
missing `accelerate` and `apex`
--> Solution: `pip install accelerate`; build apex from the official NVIDIA/apex source
- Checkpoints:
- Source: Video-LaVIT-v1
- Structure:
├── checkpoints/Video-LaVIT-v1/language_model_sft
- LLaMA-VID
- Installation: Instruction
- Checkpoints:
- Source: llama-vid-7b-full-224-video-fps-1, llama-vid-13b-full-224-video-fps-1, eva_vit_g, bert (optional)
- Structure:
├── checkpoints/LLaMA-VID-7B
│   ├── llama-vid-7b-full-224-video-fps-1
│   ├── LAVIS/eva_vit_g.pth
│   └── bert-base-uncased (optional)
├── checkpoints/LLaMA-VID-13B
│   ├── llama-vid-13b-full-224-video-fps-1
│   ├── LAVIS/eva_vit_g.pth
│   └── bert-base-uncased (optional)
- MiniGPT4-Video
- Installation: Instruction
- Checkpoints:
- Source: video_mistral_checkpoint_last, Mistral-7B-Instruct-v0.2, vit (optional)
- Structure:
├── checkpoints/MiniGPT4-Video
│   ├── checkpoints/video_mistral_checkpoint_last.pth
│   ├── Mistral-7B-Instruct-v0.2
│   └── eva_vit_g.pth (optional)
- PLLaVA
- Installation: Instruction
- Checkpoints:
- Source: pllava-7b, pllava-13b, pllava-34b
- Structure:
├── checkpoints/PLLaVA
│   ├── pllava-7b
│   ├── pllava-13b
│   └── pllava-34b
- LLaVA-NeXT-Video
- Installation: Instruction
- Checkpoints:
- Structure:
├── checkpoints/PLLaVA
│   ├── LLaVA-NeXT-Video-7B-DPO
│   └── LLaVA-NeXT-Video-34B-DPO
- ShareGPT4Video
- Installation: Instruction
- Checkpoints:
- Source: sharegpt4video-8b
- Structure:
├── checkpoints/ShareGPT4Video
│   └── sharegpt4video-8b
- GPT-4V
- Installation: create a `.env` file under `baselines/gpt4v` and set `API_BASE` and `API_KEY`
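The `.env` setup above can be sketched as follows (the values are placeholders — `API_BASE` here assumes an OpenAI-compatible endpoint; substitute your own endpoint and key):

```shell
# Create the .env file the GPT-4V baseline reads its credentials from.
mkdir -p baselines/gpt4v
cat > baselines/gpt4v/.env <<'EOF'
API_BASE=https://api.openai.com/v1
API_KEY=your-key-here
EOF

# Confirm both required keys are present.
grep -c '^API_' baselines/gpt4v/.env   # prints 2
```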