Implementation and Dataset for the paper "Compositional Substitutivity of Visual Reasoning for Visual Question Answering" (ECCV 2024)

NeverMoreLCH/CG-SPS

Compositional Substitutivity of Visual Reasoning for Visual Question Answering

Implementation for the ECCV 2024 paper "Compositional Substitutivity of Visual Reasoning for Visual Question Answering" [paper link]

Example Image



Dataset Download

Example Image

GQA-SPS Dataset

Download link: [Google Drive] [Baidu NetDisk (password: DSPS)]

Format:

  1. "gqa-sps-balanced-X-val-Y.json" is the question JSON file for the val-Y split of the X SPS, where X $\in$ {word, visual entity, referent} and Y $\in$ {A, B}.
  2. "images_for_visual_enity_sps.zip" contains the images for "gqa-sps-balanced-visual-entity-val-A&B.json" (both the val-A and val-B splits). For each image, "image_id.jpg" is the model input, and "image_id_hl.jpg" highlights the substituted objects.
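
The naming scheme above can be expanded programmatically. The following is a minimal sketch, not part of the released code: the helper names `gqa_sps_filename` and `load_gqa_sps` and the `data_dir` argument are hypothetical, and the sketch assumes that a space in the SPS type becomes a hyphen in the file name (as in "gqa-sps-balanced-visual-entity-val-A&B.json"). The internal structure of each JSON file is not described here, so the loader returns the parsed object unchanged.

```python
import json
from pathlib import Path

# SPS types and splits as listed in the format description.
SPS_TYPES = ("word", "visual entity", "referent")
SPLITS = ("A", "B")


def gqa_sps_filename(sps_type: str, split: str) -> str:
    """Build the question file name for one SPS type and one val split.

    Hypothetical helper; assumes spaces in the SPS type map to hyphens.
    """
    if sps_type not in SPS_TYPES:
        raise ValueError(f"unknown SPS type: {sps_type!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    slug = sps_type.replace(" ", "-")
    return f"gqa-sps-balanced-{slug}-val-{split}.json"


def load_gqa_sps(data_dir: str, sps_type: str, split: str):
    """Parse one question file from data_dir (assumed download location).

    The JSON schema is not assumed; the parsed object is returned as-is.
    """
    path = Path(data_dir) / gqa_sps_filename(sps_type, split)
    with path.open(encoding="utf-8") as f:
        return json.load(f)
```

For example, `gqa_sps_filename("referent", "B")` yields "gqa-sps-balanced-referent-val-B.json", and iterating over `SPS_TYPES` and `SPLITS` enumerates all six question files.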

VQA-SPS v2 Dataset

Download link: [Google Drive] [Baidu NetDisk (password: DSPS)]

Format:

  1. "vqav2-sps-questions-X-val-Y.json" is the question JSON file for the val-Y split of the X SPS, where X $\in$ {word, visual entity, referent} and Y $\in$ {A, B}.
  2. "vqav2-sps-annotations-X-val-Y.json" is the annotation JSON file for the val-Y split of the X SPS, with X and Y as above.
  3. "images_for_visual_enity_sps.zip" contains the images for "vqav2-sps-questions-visual-entity-val-A&B.json" (both the val-A and val-B splits). For each image, "image_id.jpg" is the model input, and "image_id_hl.jpg" highlights the substituted objects.
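
Since each VQA-SPS v2 split comes as a question file plus an annotation file, it can help to resolve both names together, along with the two image variants for the visual entity SPS. The sketch below is illustrative only: `vqav2_sps_files` and `image_pair` are hypothetical helpers, the hyphenated "visual-entity" slug is inferred from the file name quoted in item 3, and `image_id` stands for whatever identifier the images in the archive actually use.

```python
from typing import NamedTuple, Tuple

# SPS types and splits as listed in the format description.
VQA_SPS_TYPES = ("word", "visual entity", "referent")
VQA_SPLITS = ("A", "B")


class VqaSpsFiles(NamedTuple):
    """File names for one SPS type and split (hypothetical container)."""
    questions: str
    annotations: str


def vqav2_sps_files(sps_type: str, split: str) -> VqaSpsFiles:
    """Return the matching question and annotation file names."""
    if sps_type not in VQA_SPS_TYPES:
        raise ValueError(f"unknown SPS type: {sps_type!r}")
    if split not in VQA_SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    # Assumption: spaces in the SPS type become hyphens in file names.
    slug = sps_type.replace(" ", "-")
    return VqaSpsFiles(
        questions=f"vqav2-sps-questions-{slug}-val-{split}.json",
        annotations=f"vqav2-sps-annotations-{slug}-val-{split}.json",
    )


def image_pair(image_id: str) -> Tuple[str, str]:
    """Return (model input image, highlighted image) names for one image.

    The first is fed to the model; the second only marks the substituted
    objects for inspection.
    """
    return f"{image_id}.jpg", f"{image_id}_hl.jpg"
```

Keeping the highlighted "_hl" image separate from the input image makes it harder to feed the annotated version to a model by accident.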
