Preparing Something-Something V1

For basic dataset information, refer to the dataset website. Before starting, make sure that your current working directory is $MMACTION2/tools/data/sthv1/.

Step 1. Prepare Annotations

First, sign in on the official website and download the annotations to $MMACTION2/data/sthv1/annotations.

Step 2. Prepare RGB Frames

The sthv1 website does not provide the original videos; only extracted RGB frames are available, so you have to download the RGB frames directly from the sthv1 website.

Download all RGB frame parts from the sthv1 website to $MMACTION2/data/sthv1/ and use the following commands to extract them.

cd $MMACTION2/data/sthv1/
cat 20bn-something-something-v1-?? | tar zx
cd $MMACTION2/tools/data/sthv1/
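
Before extracting, it can help to confirm that every archive part has been downloaded, since `cat` concatenates whatever matches the `??` pattern and a missing part yields a corrupt or truncated archive stream. The following is a minimal Python sketch; the helper name `find_archive_parts` is hypothetical and not part of MMAction2, and it assumes only the `20bn-something-something-v1-??` naming used in the command above.

```python
import glob
import os


def find_archive_parts(directory):
    """Return the sorted archive parts matching 20bn-something-something-v1-??.

    `?` matches exactly one character, so only names with a two-character
    suffix are returned, mirroring the shell pattern passed to `cat`.
    """
    pattern = os.path.join(directory, "20bn-something-something-v1-??")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Hypothetical location; adjust to your own $MMACTION2 checkout.
    root = os.path.join(os.environ.get("MMACTION2", "."), "data", "sthv1")
    for part in find_archive_parts(root):
        print(part, os.path.getsize(part), "bytes")
```

Comparing the printed list against the parts offered on the download page shows whether anything is missing before the extraction starts.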

If you only want to use RGB frames, you can skip to Step 4 to generate file lists in the rawframes format. Because the official JPG files are named with the template "%05d.jpg" (e.g., "00001.jpg"), you have to add "filename_tmpl='{:05}.jpg'" to the data.train, data.val and data.test dicts in the config files related to sthv1, like this:

data = dict(
    videos_per_gpu=6,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        data_prefix=data_root,
        filename_tmpl='{:05}.jpg',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=ann_file_val,
        data_prefix=data_root_val,
        filename_tmpl='{:05}.jpg',
        pipeline=val_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=ann_file_test,
        data_prefix=data_root_val,
        filename_tmpl='{:05}.jpg',
        pipeline=test_pipeline))
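
To illustrate what the template does, `filename_tmpl` is a Python format string that is filled with the frame index to produce each frame's filename. The sketch below uses a hypothetical helper, `frame_filename`, purely for demonstration; the actual formatting happens inside MMAction2's data loading code. The `img_`-prefixed template is shown only for contrast.

```python
def frame_filename(filename_tmpl, frame_idx):
    """Fill a frame-name template with a 1-based frame index."""
    return filename_tmpl.format(frame_idx)


sthv1_tmpl = "{:05}.jpg"         # template from the config above
prefixed_tmpl = "img_{:05}.jpg"  # an img_-prefixed template, for contrast

print(frame_filename(sthv1_tmpl, 1))     # 00001.jpg
print(frame_filename(sthv1_tmpl, 42))    # 00042.jpg
print(frame_filename(prefixed_tmpl, 1))  # img_00001.jpg
```

If the template and the filenames on disk disagree, frame loading fails with missing-file errors, which is why the override is needed for the sthv1 frames.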

Step 3. Extract Flow

This part is optional if you only want to use RGB frames.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance.

You can run the following commands to create a soft link to the SSD.

# Execute these two lines (assuming the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/sthv1_extracted/
ln -s /mnt/SSD/sthv1_extracted/ ../../../data/sthv1/rawframes

Then, you can run the following script to extract optical flow based on RGB frames.

cd $MMACTION2/tools/data/sthv1/
bash extract_flow.sh

Step 4. Generate File List

You can run the following script to generate file lists in the rawframes format.

cd $MMACTION2/tools/data/sthv1/
bash generate_rawframes_filelist.sh
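
To inspect the generated lists, a small parser can be handy. The sketch below assumes each line has the form `frame_directory total_frames label`, separated by whitespace, which is the layout commonly used for rawframe file lists in MMAction2; check this against your generated files before relying on it. The names `RawframeRecord` and `parse_rawframes_line`, and the example values, are made up for illustration.

```python
from typing import NamedTuple


class RawframeRecord(NamedTuple):
    frame_dir: str      # directory of one video, relative to the data prefix
    total_frames: int   # number of frames in that directory
    label: int          # integer class index


def parse_rawframes_line(line):
    """Parse one 'frame_dir total_frames label' line (assumed layout)."""
    # Split from the right so the last two fields are always the integers.
    frame_dir, total_frames, label = line.strip().rsplit(maxsplit=2)
    return RawframeRecord(frame_dir, int(total_frames), int(label))


# Made-up example values, not taken from the real annotation files.
record = parse_rawframes_line("100000 36 12\n")
print(record.frame_dir, record.total_frames, record.label)
```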

Step 5. Check Directory Structure

After completing the data preparation for Something-Something V1, you will have the rawframes (RGB + Flow) and the annotation files.

In the context of the whole project (for Something-Something V1 only), the folder structure will look like:

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── sthv1
│   │   ├── sthv1_{train,val}_list_rawframes.txt
│   │   ├── annotations
│   │   ├── rawframes
│   │   │   ├── 100000
│   │   │   │   ├── 00001.jpg
│   │   │   │   ├── 00002.jpg
│   │   │   │   ├── ...
│   │   │   │   ├── flow_x_00001.jpg
│   │   │   │   ├── flow_x_00002.jpg
│   │   │   │   ├── ...
│   │   │   │   ├── flow_y_00001.jpg
│   │   │   │   ├── flow_y_00002.jpg
│   │   │   │   ├── ...
│   │   │   ├── 100001
│   │   │   ├── ...
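
The layout above can be checked with a short script. This is a minimal sketch rather than an official MMAction2 tool: the function names are hypothetical, the expected entries are taken from the tree above, and RGB frames are matched by a five-digit index with an optional `img_` prefix.

```python
import os
import re

# Five-digit frame index, with an optional "img_" prefix for RGB frames.
RGB_FRAME = re.compile(r"^(img_)?\d{5}\.jpg$")
FLOW_FRAME = re.compile(r"^flow_[xy]_\d{5}\.jpg$")


def summarize_video_dir(video_dir):
    """Count RGB and flow frames in one rawframes/<video_id> directory."""
    names = os.listdir(video_dir)
    return {
        "rgb": sum(1 for n in names if RGB_FRAME.match(n)),
        "flow": sum(1 for n in names if FLOW_FRAME.match(n)),
    }


def check_sthv1_layout(data_root):
    """Return the expected entries that are missing under data/sthv1."""
    expected = [
        "annotations",
        "rawframes",
        "sthv1_train_list_rawframes.txt",
        "sthv1_val_list_rawframes.txt",
    ]
    return [e for e in expected
            if not os.path.exists(os.path.join(data_root, e))]


if __name__ == "__main__":
    # Hypothetical location; adjust to your own $MMACTION2 checkout.
    root = os.path.join(os.environ.get("MMACTION2", "."), "data", "sthv1")
    print("missing entries:", check_sthv1_layout(root))
```

An empty list from `check_sthv1_layout` means the top-level entries are in place; `summarize_video_dir` can then be run on a few video directories to confirm that flow frames were written when Step 3 was executed.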

For training and evaluating on Something-Something V1, please refer to getting_started.md.