From cdcb07f26142472fe2567ab99d57657863d45b3d Mon Sep 17 00:00:00 2001
From: Andrei Fajardo
+
+## Fine-tuning (with `arc-finetuning-cli`)
+
+After you've created your fine-tuning examples (you'll need at least 10 of them),
+you can submit a job to OpenAI to fine-tune an LLM on them. To do so, we provide
+a convenient command-line tool powered by LlamaIndex plugins such as
+`llama-index-finetuning`.
+
+```sh
+arc finetuning cli tool.
+
+options:
+  -h, --help            show this help message and exit
+
+commands:
+  {evaluate,finetune,job-status}
+    evaluate            Evaluation of ARC Task predictions with LLM and ARCTaskSolverWorkflow.
+    finetune            Finetune OpenAI LLM on ARC Task Solver examples.
+    job-status          Check the status of finetuning job.
+```
+
+### Submitting a fine-tuning job
+
+To submit a fine-tuning job, use any of the following three `finetune` commands:
+
+```sh
+# submit a new finetune job using the specified llm
+arc-finetuning-cli finetune --llm gpt-4o-2024-08-06
+
+# submit a new finetune job that continues from previously finetuned model
+arc-finetuning-cli finetune --llm gpt-4o-2024-08-06 --start-job-id ftjob-TqJd5Nfe3GIiScyTTJH56l61
+
+# submit a new finetune job that continues from the most recent finetuned model
+arc-finetuning-cli finetune --continue-latest
+```
+
+The commands above compile all of the individual fine-tuning JSON examples
+(stored in `finetuning_examples/`) into a single `jsonl` file that is then
+passed to the OpenAI fine-tuning API.
+
+### Checking the status of a fine-tuning job
+
+After submitting a job, you can check its status using the CLI commands below:
+
+```sh
+arc-finetuning-cli job-status -j ftjob-WYySY3iGYpfiTbSDeKDZO0YL -m gpt-4o-2024-08-06
+
+# or check status of the latest job submission
+arc-finetuning-cli job-status --latest
+```
+
+## Evaluation
+
+You can evaluate the `ARCTaskSolverWorkflow` and a specified LLM on the ARC test
+dataset. You can even supply a fine-tuned LLM here.
+
+```sh
+# evaluate ARCTaskSolverWorkflow single attempt with gpt-4o
+arc-finetuning-cli evaluate --llm gpt-4o-2024-08-06
+
+# evaluate ARCTaskSolverWorkflow single attempt with a previously fine-tuned gpt-4o
+arc-finetuning-cli evaluate --llm gpt-4o-2024-08-06 --start-job-id ftjob-TqJd5Nfe3GIiScyTTJH56l61
+```
+
+You can also set parameters that throttle execution so as not to run into
+`RateLimitError`s from OpenAI.
+
+```sh
+arc-finetuning-cli evaluate --llm gpt-4o-2024-08-06 --batch-size 5 --num-workers 3 --sleep 10
+```
+
+In the above command, `batch-size` is the number of test cases handled in a
+single batch. In total, there are 400 test cases. `num-workers` is the
+maximum number of async calls allowed to be made to the OpenAI API at any given
+moment. Finally, `sleep` is the number of seconds execution halts before moving
+on to the next batch of test cases.
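
To make the interaction between `batch-size`, `num-workers`, and `sleep` concrete, here is a minimal Python sketch of that throttling pattern. It is not the CLI's actual implementation: `run_in_batches` and `fake_solve` are hypothetical names introduced only for illustration, and the sketch assumes no pause is needed after the final batch.

```python
import asyncio


async def run_in_batches(cases, handler, batch_size=5, num_workers=3, sleep=10.0):
    """Hypothetical helper: process `cases` batch by batch with capped concurrency."""
    # At most `num_workers` handler calls may be in flight at any moment.
    semaphore = asyncio.Semaphore(num_workers)

    async def guarded(case):
        async with semaphore:
            return await handler(case)

    results = []
    for start in range(0, len(cases), batch_size):
        batch = cases[start:start + batch_size]
        # gather() returns results in the same order as the inputs.
        results.extend(await asyncio.gather(*(guarded(c) for c in batch)))
        # Assumption: pause only between batches, not after the last one.
        if start + batch_size < len(cases):
            await asyncio.sleep(sleep)
    return results


async def fake_solve(case):
    """Hypothetical stand-in for one async call to the OpenAI API."""
    await asyncio.sleep(0)
    return case * 2


if __name__ == "__main__":
    solved = asyncio.run(run_in_batches(list(range(12)), fake_solve, sleep=0.01))
    print(f"solved {len(solved)} cases")
```

Under these assumptions, the values from the command above (batches of 5 over 400 test cases, 10 seconds of sleep) would give 80 batches and 79 pauses, so roughly 790 seconds are spent sleeping regardless of how fast the individual calls return.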