- llava-150k-tool-aug.json augment the llava-insttrution-150 with extrac
"thoughts"
and"actions"
to ensure the data format as llava-plus required. - llava-plus-v1-117k-tool-merge.json is tool learning visual instruction data by prompting ChatGPT/GPT-4.
We provide an example to constuct grounding data here.