Commit bf7fc4a (0 parents)
Showing 71 changed files with 9,735 additions and 0 deletions.
@@ -0,0 +1,39 @@
FROM ubuntu:22.04

# Set the timezone non-interactively
ENV TZ=UTC
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

ENV DEBIAN_FRONTEND=noninteractive

# Build tools and libraries required by the project
RUN apt-get update && apt-get install -y \
    clang \
    curl \
    git \
    vim \
    build-essential \
    libffi-dev \
    libssl-dev \
    zlib1g-dev \
    libbz2-dev \
    libreadline-dev \
    libsqlite3-dev \
    software-properties-common

# Install Python 3.9 from the deadsnakes PPA
RUN add-apt-repository ppa:deadsnakes/ppa -y

RUN apt-get install -y python3.9-dev

RUN apt-get install -y python3-pip \
    python3.9-distutils

RUN python3.9 -m pip install --upgrade pip

RUN python3.9 --version

# Make python3.9 the default "python"
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.9 1
RUN update-alternatives --set python /usr/bin/python3.9
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 CodiumAI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1,129 @@
# Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering

[Paper](https://arxiv.org/abs/2401.08500) |
[Dataset](https://huggingface.co/datasets/talrid/CodeContests_valid_and_test_AlphaCodium/blob/main/codecontests_valid_and_test_processed_alpha_codium.zip)

Official Implementation
> Tal Ridnik, Dedy Kredo, Itamar Friedman <br/> CodiumAI

**Abstract**

Code generation problems differ from common natural language problems: they require matching the exact syntax of the target language, identifying happy paths and edge cases, paying attention to numerous small details in the problem spec, and addressing other code-specific issues and requirements. Hence, many of the optimizations and tricks that have been successful in natural language generation may not be effective for code tasks.

In this work, we propose a new approach to code generation by LLMs, which we call AlphaCodium: a test-based, multi-stage, code-oriented iterative flow that improves the performance of LLMs on code problems.

We tested AlphaCodium on a challenging code generation dataset called CodeContests, which includes competitive programming problems from platforms such as Codeforces. The proposed flow consistently and significantly improves results.
On the validation set, for example, GPT-4 accuracy (pass@5) increased from 19% with a single well-designed direct prompt to 44% with the AlphaCodium flow.

We believe many of the principles and best practices acquired in this work are broadly applicable to general code generation tasks.

<p align="center">
<table class="tg">
  <tr>
    <td class="tg-c3ow"><img src="./pics/proposed_flow.png" align="center" width="600"></td>
  </tr>
  <tr>
    <td class="tg-c3ow"><img src="./pics/iterations.png" align="center" width="600"></td>
  </tr>
</table>
</p>
## Installation

(1) Set up a virtual environment and run: `pip install -r requirements.txt`

(2) Duplicate the file `alpha_codium/settings/.secrets_template.toml`, rename it to `.secrets.toml`, and fill in your OpenAI API key:
```
[openai]
key = "..."
```

(3) Download the processed CodeContests validation and test dataset from [Hugging Face](https://huggingface.co/datasets/talrid/CodeContests_valid_and_test_AlphaCodium/blob/main/codecontests_valid_and_test_processed_alpha_codium.zip), extract the zip file, and place the extracted folder in the root of the project.

## How to run

### Configuration
The file `alpha_codium/settings/configuration.toml` contains the configuration for the project.
In the `config` section you can choose the model you want to use ("gpt-4", "gpt-3.5-turbo-16k", or others).
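For illustration only, a minimal sketch of such a `config` section might look like the following (the key names here are assumptions; the actual keys and defaults are defined in `configuration.toml` itself):
```
# hypothetical sketch -- consult alpha_codium/settings/configuration.toml for the real keys
[config]
model = "gpt-4"
```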
### Solving a specific problem
To solve a specific problem with AlphaCodium, from the root folder run:
```
python -m alpha_codium.solve_problem \
--dataset_name /path/to/dataset \
--split_name test \
--problem_number 0
```
- The `dataset_name` is the path to the dataset folder you downloaded in the installation step.
- Note that the validation set contains 117 problems and the test set contains 165 problems, so the `problem_number` parameter should be set accordingly (zero-based).
- The `split_name` can be either `valid` or `test`.
- The following sections in the configuration file:
`solve`, `self_reflection`, `possible_solutions`, `generate_ai_tests`, `initial_code_generation`, `public_tests`, `ai_tests`
let you adjust the configuration of the different stages of the flow.
- Each run logs the results to a file named `alpha_codium/example.log`. Reviewing the log file is a good way to understand what is going on in each stage of the flow.

Example problem (test set, problem number 12):
<p align="center">
<table class="tg">
  <tr>
    <td class="tg-c3ow"><img src="./pics/example_problem.png" align="center" width="600"></td>
  </tr>
</table>
</p>

### Solving the entire dataset
To solve the entire dataset with AlphaCodium, from the root folder run:
```
python -m alpha_codium.solve_dataset \
--dataset_name /path/to/dataset \
--split_name test \
--database_solution_path /path/to/output/dir/dataset_output.json
```

- The `split_name` can be either `valid` or `test`.
- `database_solution_path` is the path to the output file where the solutions will be saved.
- The `dataset` section in the configuration file contains the configuration for running and evaluating a dataset.
- Note that this is a long process; it may take a few days to complete with large models (e.g. GPT-4) and several iterations per problem.
- `dataset.num_iterations` defines the number of iterations for each problem (pass@K). For a large number of iterations, it is recommended to introduce some randomness and different options for each iteration to achieve top results (a minimal configuration sketch follows this list).
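As a hedged illustration (only `dataset.num_iterations` is named above; the value and any surrounding keys are assumptions), the iteration count might be set like this in the configuration file:
```
# hypothetical sketch of the dataset section; 5 is an arbitrary example value
[dataset]
num_iterations = 5
```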
### Running the evaluation

Once you have generated a solution for the entire dataset (valid or test), you can evaluate it by running:
```
python -m alpha_codium.evaluate_dataset \
--dataset_name /path/to/dataset \
--split_name test \
--database_solution_path /path/to/output/dir/dataset_output.json
```

## Broader Applicability
While this work presents results on the CodeContests dataset, we believe it has broader applicability.

First and foremost, we feel that the proposed AlphaCodium [flow](./pics/proposed_flow.png), with reasonable adjustments, can be used as a more general framework for other code generation tasks.

Secondly, many of the design concepts, principles, and tricks we acquired in this work are broadly applicable as-is to general code generation tasks. For example:
- **YAML Structured output**: asking the model to generate output in YAML format, equivalent to a given Pydantic class (a minimal sketch appears after this list).
- **Semantic reasoning via bullet-point analysis**: bullet-point analysis encourages an in-depth understanding of the problem, and forces the model to divide the output into logical semantic sections, leading to improved results.
- **LLMs do better when generating modular code**: when clearly asking the model to `divide the generated code into small sub-functions, with meaningful names and functionality`, we observe better code, with fewer bugs and higher success rates in the iterative fixing stages.
- **Soft decisions with double validation**: in the double validation process, we add an extra step where, given the generated output, the model is asked to re-generate the same output, but correct it if needed.
- **Leave room for exploration**: since the model can be wrong, it’s better to avoid irreversible decisions and leave room for exploration and code iterations with different possible solutions.
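As a minimal sketch of the YAML structured-output idea (the Pydantic class, prompt, and field names below are hypothetical illustrations, not the schemas used in this repository): the model is asked for YAML that mirrors a Pydantic class, and its reply is parsed and validated against that class.
```
# Minimal sketch, assuming PyYAML and Pydantic are installed.
# The schema and prompt are hypothetical, for illustration only.
from typing import List

import yaml
from pydantic import BaseModel


class ProblemReflection(BaseModel):
    self_reflection: str
    edge_cases: List[str]


PROMPT = """Reflect on the problem and answer in YAML with exactly these fields:
self_reflection: |
  <the problem restated in your own words>
edge_cases:
- <edge case 1>
- <edge case 2>
"""


def parse_reply(reply: str) -> ProblemReflection:
    # Parse the model's YAML answer and validate it against the schema;
    # a failure here can trigger a retry or a follow-up "fix your YAML" prompt.
    return ProblemReflection(**yaml.safe_load(reply))
```
Validating the reply into a typed class makes malformed output easy to detect, so a flow can re-prompt instead of silently accepting a broken answer.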
The list above is partial. See the [paper](https://arxiv.org/abs/2401.08500) for more details. The code provided [in this repo](./settings) can be used as a reference for better understanding the proposed concepts, and for applying them to other code generation tasks.

## Acknowledgments
Our processed CodeContests dataset is based on the original [CodeContests](https://huggingface.co/datasets/deepmind/code_contests) dataset.
We removed the train set (which is not relevant to our work) and applied some post-processing and cleaning to the validation and test sets.
## Citation
```
@misc{ridnik2024code,
      title={Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering},
      author={Tal Ridnik and Dedy Kredo and Itamar Friedman},
      year={2024},
      eprint={2401.08500},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```
@@ -0,0 +1,27 @@
@@ -0,0 +1,27 @@ | ||
import os | ||
import random | ||
|
||
import numpy as np | ||
|
||
|
||
def set_all_seeds(seed): | ||
random.seed(seed) | ||
np.random.seed(seed) | ||
os.environ["PYTHONHASHSEED"] = str(seed) | ||
|
||
try: | ||
import tensorflow as tf | ||
tf.random.set_seed(seed) | ||
except ImportError: | ||
pass | ||
|
||
try: | ||
import torch | ||
torch.manual_seed(seed) | ||
if torch.cuda.is_available(): | ||
torch.cuda.manual_seed(seed) | ||
torch.cuda.manual_seed_all(seed) # if you are using multi-GPU. | ||
except ImportError: | ||
pass | ||
|
||
set_all_seeds(1337) |
Empty file.
Empty file.