Object Of Interest Detection


Environment setup

  • cd to the root folder (where this README is located) and install the packages listed in requirements.txt

    pip install -r requirements.txt
  • Create the folder /mat_data in the root path (as defined in the default args) and paste ONE .mat recording into it.

    Place ONLY ONE .mat recording in the /mat_data folder at a time.

Create folders

in root path

  • /labels:
    receives the generated .json label data.
  • /cache_display:
    receives the generated .pkl display data.
  • /snapshots:
    receives the .png display snapshots.
  • /cache_dataset:
    receives the .pkl dataset processed by json_loader.
  • /saved_models/tf:
    receives the .h5 MLP models.
  • /debug:
    contains misc/temp files for debugging.
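The folders listed above can also be created in one step. The following is a minimal sketch, assuming the folder names from the list; the helper name `create_project_folders` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Output folders from the list above, relative to the root path.
FOLDERS = [
    "labels",
    "cache_display",
    "snapshots",
    "cache_dataset",
    "saved_models/tf",
    "debug",
]


def create_project_folders(root="."):
    """Create every output folder under `root`; existing folders are left untouched."""
    created = []
    for name in FOLDERS:
        path = Path(root) / name
        # parents=True also creates /saved_models before /saved_models/tf.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for folder in create_project_folders("."):
        print(folder)
```

Running it a second time is harmless, because `exist_ok=True` skips folders that already exist.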

Run -- [in dev]

stay in root path, run the script


This script contains the whole workflow: dataset generation, training, and display.

optional args

  • --skip_dataset_gen: workflow control. Default = False
    • False: The labeling tool is executed. It generates a .json label file and a .pkl display file.
    • True: Labeling is skipped. The .json label file and the .pkl display file are loaded using the given record_name.
  • --record_name: MUST be given if dataset generation is skipped.
  • --skip_model_training: workflow control. Default = False
    • False: Training is executed. It generates a .h5 model.
    • True: Training is skipped. The .h5 model is loaded from the given model_path.
  • --model_path: MUST be given if model training is skipped.
  • --mat_folder: redirects the data path from which the .mat recordings are loaded.
  • --label_folder: redirects the label path where the .json results are saved.
  • --display_folder: redirects the output path for the display data.
  • --range: the length of the future ego trajectory used as the reference in the coordinate transformation. It should cover the actor trajectory.
  • --sample_rate: downsamples the actor trajectory. For testing it may be set to a larger number to speed up the process.
  • --gen_start_frame: Default = 0. The start frame of the recording when generating the dataset.
  • --is_dataset_from_pkl: Default = False. Save/load the data of json_loader as a .pkl file.
  • --load_start_frame: the start frame for display.
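The dependency between the workflow flags and their companion arguments can be illustrated with `argparse`. This is only a sketch: it covers four of the options above, and modelling the two switches as `store_true` flags is an assumption; the actual script may declare them differently.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the workflow-control options."""
    parser = argparse.ArgumentParser(description="Workflow control sketch")
    # Assumption: both switches are plain flags that default to False.
    parser.add_argument("--skip_dataset_gen", action="store_true")
    parser.add_argument("--record_name", default=None)
    parser.add_argument("--skip_model_training", action="store_true")
    parser.add_argument("--model_path", default=None)
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Skipping a stage means its artifacts must be located by name/path.
    if args.skip_dataset_gen and args.record_name is None:
        parser.error("--record_name is required with --skip_dataset_gen")
    if args.skip_model_training and args.model_path is None:
        parser.error("--model_path is required with --skip_model_training")
    return args


if __name__ == "__main__":
    print(parse_args())
```

`parser.error` prints the message and exits with a non-zero status, so a missing `record_name` or `model_path` is reported before any stage starts.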

To debug/test one single module:

  • Load .mat recording via MatLoader() in
  • Discriminate actors according to the RO rules, generate label data and display data
    • label data saved as .json file in /labels folder
    • display data saved as .pkl file in /cache_display folder
  • Data structure & Concept: Pipeline concept - 1. Labeling

  • mlp_mode = "train":
    • Train the MLP model with train_set
    • MLP model saved as .h5 file in the folder /saved_models/tf
  • mlp_mode = "pred":
    • Load .h5 model, evaluate the model with test_set
    • predict samples in x_test, get pred result pred_ls
    • compare pred_ls with y_test
    • plot evaluation metrics in terminal
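The comparison of pred_ls with y_test in "pred" mode can be sketched as follows. Which metrics the repository actually reports is not stated here, so accuracy, precision and recall are an assumption, as is the encoding RO = 1 and NRO = 0. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Compare predictions with ground truth, treating 1 (RO) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Guard against division by zero when a class never occurs.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    y_test = [1, 0, 1, 0, 1]
    pred_ls = [1, 0, 0, 0, 1]
    for name, value in binary_metrics(y_test, pred_ls).items():
        print(f"{name}: {value:.3f}")
```

Because RO actors are typically much rarer than NRO actors, recall on the RO class is usually more informative than plain accuracy.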

  • Display labeling result

    display MLP prediction result, compare with ground truth of RO-rules

Display with matplotlib
  • press SPACE to pause the plotting
  • press c to capture a snapshot, save to /snapshots
  • press ESC to exit
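The key bindings above map naturally onto a Matplotlib key-press handler. The sketch below keeps the state handling separate from Matplotlib so it can be read on its own; the class name `DisplayControls` is hypothetical, and treating SPACE as a pause/resume toggle is an assumption.

```python
class DisplayControls:
    """Tracks the display state changed by key presses."""

    def __init__(self):
        self.paused = False
        self.running = True
        self.snapshot_requests = 0

    def on_key(self, event):
        # Matplotlib reports the space bar as " " and Escape as "escape".
        if event.key == " ":
            self.paused = not self.paused
        elif event.key == "c":
            self.snapshot_requests += 1
        elif event.key == "escape":
            self.running = False
```

With a Matplotlib figure `fig`, the handler would be registered via `fig.canvas.mpl_connect("key_press_event", controls.on_key)`, and a pending snapshot request could be served with `fig.savefig(...)` into /snapshots.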


Marker               Stands for
Black line           Lane and road structure
Green line           Ego traj
Orange box           Ego car
Blue box             Actor vehicles & trajs detected from Camera (BV2)
Cyan box             Actor vehicles & trajs detected from Long-range Radar (LRR1)
Red edge             Actor, labeled as RO (ground truth)
White 'x' on actor   Actor, pred as NRO
Red 'x' on actor     Actor, pred as RO
Circle               Actor history poses

Pipeline Concept

0. Recording vs. Real Life

Labeling in recordings. Predicting in real life


In a recording, all points of the ego and actor trajectories are known, so their trajectories can be extracted for any slot within the whole time span. The ego and actor states are paired by their "global time". Taking the time stamp of the 0th trajectory point as the current time, the labeling tool classifies each actor as RO (Related Object) or NRO (Not-Related Object) directly from its maneuver in the following frames.
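The idea of cutting a trajectory out of a recording and pairing states by global time can be sketched as below. It assumes each recorded state is a dict with a "global" time stamp, as in the label data; the function names are hypothetical and the labeling tool may implement this differently.

```python
def future_window(states, start, length):
    """Return `length` states from index `start`, adding a "time" relative to point 0."""
    window = states[start:start + length]
    if not window:
        return []
    t0 = window[0]["global"]
    # Copy each state so the recording itself is not modified;
    # rounding only suppresses floating-point noise in the difference.
    return [dict(s, time=round(s["global"] - t0, 6)) for s in window]


def index_at_global_time(states, global_time, tol=1e-6):
    """Find the state whose global time stamp matches, or None if there is none."""
    for i, state in enumerate(states):
        if abs(state["global"] - global_time) <= tol:
            return i
    return None


if __name__ == "__main__":
    ego = [{"global": 0.04 * k} for k in range(1, 6)]
    start = index_at_global_time(ego, 0.08)
    print(future_window(ego, start, 3))
```

Point 0 of the returned window always has `time == 0.0`, which matches the convention that the 0th trajectory point is the current time.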


1. Labeling

How was the label data organized

  • The data structure of the .json label file:
    # data[0]: The origin ego EML pose of the recording
    {
        "ego_recording_start": [ ... ]
    },
    # data[1:]: List of ACTOR-EGO pairs
    {   # ACTOR 0 & its corresponding EGO traj
        "RO": false,  # RO-Label for this pair

        "actor_traj": [ # points of ACTOR trajectory
            { # states of point 0 in ACTOR trajectory
                "time": 0.0, # rel. to actor[0]'s global time
                "id": 242.0,
                "type": 7.0,
                "ref_point": 6.0,
                "width": 2.0,
                "length": 5.0,
                "height": 1.5,
                "vel_x": 38.30303851718452,
                "vel_y": 3.612994544048159,
                "yaw": -6.309817113248073e-10,
                "pos_x": 117.108, # rel. to ego[0]'s pos
                "pos_y": -0.6645, # rel. to ego[0]'s pos
                "sensor": "camera",
                "global": 0.04,   # Global time stamp -> pairs the ACTOR and EGO trajs
                "pos_s": 116.22880609109846,
                "pos_d": 0.47164890690169525,
                "vel_s": 40.196145608552314,
                "vel_d": -1.2634057893602224
            }, # END states of point 0
            # states of point 1 in ACTOR trajectory
            ... # 100 ACTOR points in total
        ],# END ACTOR point list

        "ego_traj": [ # points of EGO trajectory
            { # states of point 0 in EGO trajectory
                "global": 0.04,
                "time": 0.0,    # rel. to ego[0]'s global time
                "pos_x": -0.0,  # rel. to ego[0]'s pos
                "pos_y": 0.0,   # rel. to ego[0]'s pos
                "yaw": 0.0,
                "curv": 0,
                "vel_t": 33.875,
                "acc_t": 0.08999999999999986,
                "distance": 0.0,
                "world_x": 3.0299999999999994,  # 'EML_PositionX'
                "world_y": 5.893000000000001,   # 'EML_PositionY'
                "world_yaw": -1.8017986337985   # 'EML_YawAngle' in [rad]
            }, # END states of point 0
            # states of point 1 in EGO trajectory
            ... # 300 EGO points in total
        ]# END EGO point list
    },# END Ego-Actor pair

    { # ACTOR 1 & its corresponding EGO traj
        "RO": false, # RO-Label for this pair
        "actor_traj": [ # 100 actor points
            { actor state 0 },
            ...
            { actor state 99 }
        ],
        "ego_traj": [ # 300 ego points
            { ego state 0 },
            ...
            { ego state 299 }
        ]
    },
    ... # other actor-ego pairs
    { # ACTOR 11216 & its corresponding EGO traj
        "RO": false, # RO-Label for this pair
        "actor_traj": [],
        "ego_traj": []
    }
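A label file with this layout can be read with the standard `json` module. The sketch below assumes that the file is a top-level list whose first element is the recording origin and whose remaining elements are the actor-ego pairs; the `#` comments in the listing are annotations only and would not appear in a real .json file. The function names are hypothetical and independent of json_loader.

```python
import json


def load_label_file(path):
    """Split a label file into the recording origin (data[0]) and the pairs (data[1:])."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    origin = data[0]
    pairs = data[1:]
    return origin, pairs


def count_labels(pairs):
    """Count how many actor-ego pairs are labeled RO and NRO."""
    ro = sum(1 for pair in pairs if pair["RO"])
    return {"RO": ro, "NRO": len(pairs) - ro}
```

A quick `count_labels` call after loading is a cheap way to check how imbalanced a recording is before training.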

2. Process data for training

  • The 0th point of a trajectory (ego_traj[0], actor_traj[0]) represents the object's current state at the corresponding frame time stamp

traj[0] contains the information about object's current state

  • To imitate what a driver does in real life, i.e. judging from the observations of the previous seconds whether the surrounding traffic (actor) would interfere with the ego trajectory and potentially cause a collision, the actor's history sequence is extracted and fed to the MLP.

Extract actor's history sequence to feed to MLP
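One possible way to build such a history is to collect, per actor id, the current states (actor_traj[0]) of the preceding pairs. This is only a sketch under assumptions this README does not confirm: that the pairs are ordered by frame, and that the same "id" denotes the same actor across frames. The function name `history_sequences` is hypothetical.

```python
from collections import defaultdict, deque


def history_sequences(pairs, history_len):
    """For each pair, return up to `history_len` current states of the same actor.

    Each sequence is ordered oldest first and ends with the pair's own
    actor_traj[0]. Pairs without actor points get an empty sequence.
    """
    # One bounded buffer per actor id; old states drop out automatically.
    buffers = defaultdict(lambda: deque(maxlen=history_len))
    sequences = []
    for pair in pairs:
        if not pair["actor_traj"]:
            sequences.append([])
            continue
        current = pair["actor_traj"][0]
        buffer = buffers[current["id"]]
        buffer.append(current)
        sequences.append(list(buffer))
    return sequences
```

Early frames of an actor yield sequences shorter than `history_len`, so a later step has to pad or drop them.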

  • Details about coordinate conversion:

Details about coordinate conversion
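The positions in the label data are given relative to ego[0], while the ego points also carry world coordinates (world_x, world_y, world_yaw). A standard 2D rigid transform between the two is sketched below. It assumes the relative frame is centred at ego[0] and rotated so that its x-axis points along ego[0]'s heading; the exact formulas used by the tool are not given in this README, and the conversion to pos_s/pos_d is not covered here.

```python
import math


def world_to_ego_frame(x, y, origin_x, origin_y, origin_yaw):
    """Express a world point in the frame of an origin pose (yaw in rad)."""
    # Translate so the origin pose sits at (0, 0).
    dx = x - origin_x
    dy = y - origin_y
    cos_yaw = math.cos(origin_yaw)
    sin_yaw = math.sin(origin_yaw)
    # Rotate by -origin_yaw so the origin heading becomes the +x axis.
    rel_x = cos_yaw * dx + sin_yaw * dy
    rel_y = -sin_yaw * dx + cos_yaw * dy
    return rel_x, rel_y


if __name__ == "__main__":
    # A point 2 m ahead of an origin that faces the world +y direction.
    print(world_to_ego_frame(1.0, 3.0, 1.0, 1.0, math.pi / 2))
```

Under this convention the origin itself maps to (0, 0), consistent with ego[0] having pos_x and pos_y of zero in the label data.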

3. Training

  • Feed input feature and label to MLP
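How a history sequence and its RO label might be turned into one training sample is sketched below. The selection of features (pos_s, pos_d, vel_s, vel_d) and the zero-padding of short histories are assumptions made for illustration; the features actually used by the MLP are not listed in this README, and `build_sample` is a hypothetical name.

```python
# Assumed feature selection; the keys exist in the actor states of the label data.
FEATURE_KEYS = ("pos_s", "pos_d", "vel_s", "vel_d")


def build_sample(history, ro_label, history_len, keys=FEATURE_KEYS):
    """Flatten a history (oldest first) into a fixed-length feature vector.

    Histories shorter than `history_len` are zero-padded on the old side;
    longer ones keep only the most recent `history_len` states.
    """
    recent = list(history[-history_len:])
    padded = [None] * (history_len - len(recent)) + recent
    features = []
    for state in padded:
        for key in keys:
            features.append(0.0 if state is None else float(state[key]))
    # RO -> 1, NRO -> 0
    return features, int(bool(ro_label))
```

Every sample then has `history_len * len(keys)` inputs, so the resulting lists can be stacked into arrays and passed to the model's training call, for instance Keras `model.fit(x, y)`.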

Details about coordinate conversion