
Updated handling of instructions #52

Closed
wants to merge 0 commits into from
Conversation

wongjingping (Collaborator) commented Nov 30, 2023

The openai and vllm runners now pass the instructions into the prompt if they are provided.
We also provide a separate prompt file for instructions as a drop-in replacement for prompt.md when dealing with instruction-based datasets.
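As a rough sketch of the intended behavior (the function name generate_prompt and the {user_question}/{instructions} placeholders below are illustrative assumptions, not the exact runner code):

# Minimal sketch, assuming the prompt template exposes {user_question} and
# {instructions} placeholders; the actual runner implementation may differ.
def generate_prompt(prompt_file: str, question: str, instructions: str = "") -> str:
    with open(prompt_file) as f:
        template = f.read()
    # When a question has no instructions, the placeholder (if present) collapses
    # to an empty string, so instruction-free templates behave exactly as before.
    return template.format(user_question=question, instructions=instructions)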

The openai runner works as before (gpt-3.5 without instructions):

$ python main.py \
  -q data/questions_gen.csv \
  -o results/my_query_generator.csv \
  -g oa \
  -f prompts/prompt_openai.md \
  -m gpt-3.5-turbo-0613 \
  -p 5
preparing questions...
Correct so far: 136/200 (68.00%): 100%|█████████████████████████| 200/200 [00:48<00:00,  4.09it/s]
                exact_match   correct
query_category                       
date_functions     0.760000  0.760000
group_by           0.800000  0.800000
order_by           0.657143  0.742857
ratio              0.257143  0.342857
table_join         0.714286  0.742857
where              0.714286  0.714286
Average correct rate: 0.68
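
For reference, a per-category summary like the one above could be reproduced from the results CSV with pandas; the column names below are taken from the printed table, and this is only a sketch, not the project's evaluation code:

import pandas as pd

# Sketch only: assumes the results CSV has a "query_category" column plus
# 0/1 "exact_match" and "correct" columns, as in the table above.
df = pd.read_csv("results/my_query_generator.csv")
summary = df.groupby("query_category")[["exact_match", "correct"]].mean()
print(summary)
print(f"Average correct rate: {df['correct'].mean():.2f}")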

gpt-3.5, now with instructions:

$ python main.py \
  -q data/questions_instruct.csv \
  -o results/my_query_generator.csv \
  -g oa \
  -f prompts/prompt_openai.md \
  -m gpt-3.5-turbo-0613 \
  -p 5
preparing questions...
Correct so far: 148/240 (61.67%): 100%|█████████████████████████| 240/240 [00:59<00:00,  4.06it/s]
                           exact_match   correct
query_category                                  
abbreviation_instructions     0.400000  0.466667
date_functions                0.760000  0.760000
date_instructions             0.200000  0.200000
group_by                      0.800000  0.800000
order_by                      0.657143  0.742857
ratio                         0.257143  0.342857
table_join                    0.714286  0.742857
where                         0.714286  0.714286
Average correct rate: 0.62

gpt-4 turbo with instructions:

$ python main.py \
  -q data/questions_instruct.csv \
  -o results/my_query_generator.csv \
  -g oa \
  -f prompts/prompt_openai.md \
  -m gpt-4-1106-preview \
  -p 5
preparing questions...
Correct so far: 180/240 (75.00%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 240/240 [09:51<00:00,  2.47s/it]
                           exact_match   correct
query_category                                  
abbreviation_instructions     0.333333  0.400000
date_functions                0.800000  0.800000
date_instructions             0.360000  0.360000
group_by                      0.914286  0.942857
order_by                      0.828571  0.885714
ratio                         0.400000  0.685714
table_join                    0.800000  0.800000
where                         0.828571  0.828571
Average correct rate: 0.75

vllm with instructions:

$ python3 -W ignore main.py \
  -q data/questions_instruct.csv \
  -o "results/${model_name}_c${checkpoint_num}.csv" \
  -g vllm \
  -f "prompts/prompt_instructions.md" \
  -m "$model_path"
Preparing /models/combined/sqlcoder_7b_bf16_b16_ld005_r128_a128_ts/checkpoint-600
2023-11-30 12:55:04,457 INFO worker.py:1673 -- Started a local Ray instance.
INFO 11-30 12:55:05 llm_engine.py:72] Initializing an LLM engine with config: model='/models/combined/sqlcoder_7b_bf16_b16_ld005_r128_a128_ts/checkpoint-600', tokenizer='/models/combined/sqlcoder_7b_bf16_b16_ld005_r128_a128_ts/checkpoint-600', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=16384, download_dir=None, load_format=auto, tensor_parallel_size=4, quantization=None, seed=0)
INFO 11-30 12:55:15 llm_engine.py:207] # GPU blocks: 8141, # CPU blocks: 2048
Using prompt file prompts/prompt_instructions.md
Prepared 240 questions from data/questions_instruct.csv
Generating completions
Processed prompts: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 240/240 [02:39<00:00,  1.51it/s]
Time taken: 159.4s
Correct so far: 174/240 (72.50%): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 240/240 [00:04<00:00, 51.86it/s]
                           exact_match   correct
query_category                                  
abbreviation_instructions     0.333333  0.333333
date_functions                0.800000  0.800000
date_instructions             0.160000  0.160000
group_by                      0.857143  0.857143
order_by                      0.800000  0.914286
ratio                         0.771429  0.828571
table_join                    0.828571  0.828571
where                         0.714286  0.714286
Average tokens generated: 55.6
