e1four15f/ClipSeek

ClipSeek: A Text-to-Clip Retrieval System

ClipSeek is a text-to-clip retrieval system that lets users search for specific moments in videos using text queries. The system segments videos into clips and matches them against the text input using a multimodal deep learning model. It provides a web-based interface that visualizes the search results.

Demo

Specific docs:

Configuration

By default, every service exposes a port on the host. You can change the ports in the .env file.

Service   Port  URL
Frontend  9500  http://localhost:9500
Backend   9501  http://localhost:9501/docs
Attu      9502  http://localhost:9502
Milvus    9503  http://localhost:9503
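
To see quickly which of the services listed above are reachable, a small TCP check can help. The following is a minimal sketch, not part of ClipSeek: the names `DEFAULT_PORTS`, `is_port_open` and `report` are made up for illustration, and the port numbers assume you have not changed the defaults in .env.

```python
import socket

# Default host ports from the table above; adjust them if you changed .env.
DEFAULT_PORTS = {
    "Frontend": 9500,
    "Backend": 9501,
    "Attu": 9502,
    "Milvus": 9503,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def report(host: str = "localhost") -> dict:
    """Map each service name to whether its default port accepts connections."""
    return {name: is_port_open(host, port) for name, port in DEFAULT_PORTS.items()}


if __name__ == "__main__":
    for name, is_up in report().items():
        state = "open" if is_up else "closed"
        print(f"{name:<8} {DEFAULT_PORTS[name]}  {state}")
```

An open port only shows that something is listening; it does not prove that the service behind it is healthy.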

Example

Embeddings

We use the backend environment to run the scripts. Start the Docker container with the following commands, which also bring up Milvus and its dependent services.

make pull
make scripts

The scripts are run with poetry run commands.

poetry run compute_embeddings --help

On the first run, the scripts download the models. The models are downloaded to the HF_HOME directory, which is mounted at /hf inside the container.
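
If you are unsure where the models end up, the following sketch mirrors how Hugging Face libraries resolve their cache root: HF_HOME takes precedence, then XDG_CACHE_HOME, then ~/.cache. The helper name `hf_cache_root` is hypothetical and is not part of ClipSeek or of any Hugging Face package.

```python
import os
from pathlib import Path


def hf_cache_root(env=None) -> Path:
    """Return the cache root implied by HF_HOME or XDG_CACHE_HOME.

    Falls back to ~/.cache/huggingface when neither variable is set.
    Pass a dict as `env` to inspect a configuration other than os.environ.
    """
    env = os.environ if env is None else env
    xdg_cache = env.get("XDG_CACHE_HOME", str(Path.home() / ".cache"))
    default_root = str(Path(xdg_cache) / "huggingface")
    return Path(env.get("HF_HOME", default_root)).expanduser()


if __name__ == "__main__":
    print(hf_cache_root())
```

Run inside the container, this should print /hf, assuming HF_HOME is set there to the mount point described above.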

For testing purposes, you can compute embeddings for the example video dataset:

poetry run compute_embeddings --path /data/ExampleDataset/videos --name VideoDataset --version v1 --mode video+audio --model LanguageBind

Similarly, for images:

poetry run compute_embeddings --path /data/ExampleDataset/images --name ImageDataset --version v1 --mode image --model LanguageBind

Indexing

The scripts are run with poetry run commands.

poetry run create_index --help
poetry run create_index --name VideoDataset --version v1 --model LanguageBind
poetry run create_index --name ImageDataset --version v1 --model LanguageBind

You can check the created collections in the Attu web interface: http://localhost:9502/#/databases/ClipSeek
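
In the examples above, the embedding and indexing commands use the same --name, --version and --model values. The sketch below keeps those values in one place and prints both commands for each dataset. It uses only the flags shown in this README; the `DatasetRun` class and the `RUNS` list are hypothetical helpers, not part of ClipSeek.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRun:
    """One dataset, described with the flags used in the commands above."""

    path: str
    name: str
    version: str
    mode: str
    model: str = "LanguageBind"

    def embed_cmd(self) -> list:
        """Argument list for the compute_embeddings script."""
        return [
            "poetry", "run", "compute_embeddings",
            "--path", self.path, "--name", self.name,
            "--version", self.version, "--mode", self.mode,
            "--model", self.model,
        ]

    def index_cmd(self) -> list:
        """Argument list for the create_index script, reusing name, version and model."""
        return [
            "poetry", "run", "create_index",
            "--name", self.name, "--version", self.version,
            "--model", self.model,
        ]


RUNS = [
    DatasetRun("/data/ExampleDataset/videos", "VideoDataset", "v1", "video+audio"),
    DatasetRun("/data/ExampleDataset/images", "ImageDataset", "v1", "image"),
]

if __name__ == "__main__":
    for run in RUNS:
        for cmd in (run.embed_cmd(), run.index_cmd()):
            # Only prints the command; nothing is executed here.
            print(shlex.join(cmd))
```

To execute the commands instead of printing them, you could pass each list to `subprocess.run(cmd, check=True)` inside the container.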

Configuration

After creating a collection, you need to add it to the configuration file config.yaml. You can copy and paste the dataset definition directly from meta.yaml.

For our example, it looks like this:

# Datasets Configuration
DATASETS:
-   data_path: /data/ExampleDataset/images
    dataset: ImageDataset
    version: v1
    modalities:
    - image
-   data_path: /data/ExampleDataset/videos
    dataset: VideoDataset
    version: v1
    modalities:
    - video
    - audio
    - hybrid
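
Before starting the application, it can be useful to sanity-check the DATASETS entries. The sketch below works on the list of dictionaries that a YAML parser would return for the configuration above. The required keys are inferred from this example only, not from ClipSeek's actual schema, and `check_datasets` is a hypothetical helper.

```python
# Keys that every entry in the example configuration carries (an assumption).
REQUIRED_KEYS = {"data_path", "dataset", "version", "modalities"}


def check_datasets(entries: list) -> list:
    """Return readable problem descriptions; an empty list means none were found."""
    problems = []
    seen = set()
    for i, entry in enumerate(entries):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
            continue
        key = (entry["dataset"], entry["version"])
        if key in seen:
            problems.append(f"entry {i}: duplicate dataset/version {key}")
        seen.add(key)
        if not entry["modalities"]:
            problems.append(f"entry {i}: modalities is empty")
    return problems


# The example configuration, written as parsed Python data.
EXAMPLE = [
    {"data_path": "/data/ExampleDataset/images", "dataset": "ImageDataset",
     "version": "v1", "modalities": ["image"]},
    {"data_path": "/data/ExampleDataset/videos", "dataset": "VideoDataset",
     "version": "v1", "modalities": ["video", "audio", "hybrid"]},
]

if __name__ == "__main__":
    for problem in check_datasets(EXAMPLE):
        print(problem)
```

For the example configuration the check reports nothing; an entry that lacks a key or repeats a dataset and version pair produces one message per problem.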

Running

Now we are ready to start the whole application.

make up logs