By default, every service exposes a port on the host. You can change the ports in the .env file.
Service | Port | URL |
---|---|---|
Frontend | 9500 | http://localhost:9500 |
Backend | 9501 | http://localhost:9501/docs |
Attu | 9502 | http://localhost:9502 |
Milvus | 9503 | http://localhost:9503 |
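After editing .env, it can help to confirm that each service answers on its port. The following is a minimal sketch, not part of the project: the variable names FRONTEND_PORT, BACKEND_PORT, ATTU_PORT and MILVUS_PORT are assumptions made for illustration, so substitute whatever names your .env file actually defines. The defaults mirror the table above.

```shell
#!/usr/bin/env bash
# Hypothetical variable names: replace them with the ones in your .env file.
# The fallback values are the default ports from the table.
FRONTEND_PORT="${FRONTEND_PORT:-9500}"
BACKEND_PORT="${BACKEND_PORT:-9501}"
ATTU_PORT="${ATTU_PORT:-9502}"
MILVUS_PORT="${MILVUS_PORT:-9503}"

# Build a local URL from a port and an optional path.
service_url() {
  printf 'http://localhost:%s%s\n' "$1" "${2:-}"
}

# Probe a URL with curl and report whether the request succeeded.
check_service() {
  if curl --silent --output /dev/null --max-time 5 "$1"; then
    echo "up   $1"
  else
    echo "down $1"
  fi
}
```

For example, `check_service "$(service_url "$BACKEND_PORT" /docs)"` probes the backend docs page. The Milvus port may not answer plain HTTP requests, so a "down" result there is not conclusive.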
We use the backend environment to run scripts. Start the Docker container with the following commands, which also bring up Milvus and its dependent services.
make pull
make scripts
The scripts are run with poetry run commands.
poetry run compute_embeddings --help
During the first run, the scripts download the required models.
The models are downloaded to the HF_HOME directory, which is mounted at /hf inside the container.
For testing purposes, you can compute embeddings for the example videos:
poetry run compute_embeddings --path /data/ExampleDataset/videos --name VideoDataset --version v1 --mode video+audio --model LanguageBind
Similarly, for images:
poetry run compute_embeddings --path /data/ExampleDataset/images --name ImageDataset --version v1 --mode image --model LanguageBind
The index creation script is also run with poetry run commands.
poetry run create_index --help
poetry run create_index --name VideoDataset --version v1 --model LanguageBind
poetry run create_index --name ImageDataset --version v1 --model LanguageBind
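Because each dataset goes through the same two steps (compute embeddings, then create the index), the commands above can be wrapped in a small helper. This is a sketch only: the run and process_dataset functions and the DRY_RUN variable are illustrative names, not part of the project. It uses only the flags already shown above and is meant to be run inside the container started by make scripts.

```shell
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Compute embeddings, then build the index, for one dataset.
# Arguments: <path> <name> <version> <mode> <model>
process_dataset() {
  run poetry run compute_embeddings --path "$1" --name "$2" --version "$3" --mode "$4" --model "$5" &&
  run poetry run create_index --name "$2" --version "$3" --model "$5"
}
```

Setting DRY_RUN=1 and calling `process_dataset /data/ExampleDataset/videos VideoDataset v1 video+audio LanguageBind` prints both commands without executing them, which is a cheap way to check the arguments before a long embedding run.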
You can check the created collections in the Attu web interface: http://localhost:9502/#/databases/ClipSeek
After a collection is created, it needs to be added to the configuration file config.yaml. You can copy and paste the dataset definition directly from meta.yaml.
For our example, it will look like this:
# Datasets Configuration
DATASETS:
  - data_path: /data/ExampleDataset/images
    dataset: ImageDataset
    version: v1
    modalities:
      - image
  - data_path: /data/ExampleDataset/videos
    dataset: VideoDataset
    version: v1
    modalities:
      - video
      - audio
      - hybrid
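
Copying the definition from meta.yaml remains the reliable route. If that file is not at hand, an entry with the same keys as the example above can be printed by a short helper. This is a sketch under assumptions: dataset_entry is a hypothetical function, and the two-space nesting under DATASETS is standard YAML indentation rather than something the project is confirmed to require.

```shell
#!/usr/bin/env bash
# Print one DATASETS entry using the keys from the example configuration.
# Arguments: <data_path> <dataset> <version> <modality>...
dataset_entry() {
  printf '  - data_path: %s\n' "$1"
  printf '    dataset: %s\n' "$2"
  printf '    version: %s\n' "$3"
  printf '    modalities:\n'
  shift 3
  # Every remaining argument becomes one list item under modalities.
  for modality in "$@"; do
    printf '      - %s\n' "$modality"
  done
}
```

Calling `dataset_entry /data/ExampleDataset/videos VideoDataset v1 video audio hybrid` prints an entry that can be pasted under the DATASETS key in config.yaml.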
Now we are ready to start the whole application.
make up logs