This is an experimental framework for comparing version control merging algorithms. It runs each merge tool on 6045 real-world merges. It classifies each merge as a conflict, a correct merge, or an incorrect merge. It uses each project's test suite to determine whether the merge was correct, and it penalizes merge tools for creating incorrect merges.
The paper Evaluation of Version Control Merge Tools evaluates 16 merge algorithms, including Hires-Merge, IntelliMerge, Plume-lib Merging (which was best), and Spork. Since then, the framework has been expanded to evaluate newer algorithms, such as Mergiraf.
Download the compressed cached data here and put it in the root directory of the project. Be aware that the uncompressed cache size is 84 GB as of 2024-09-23.
To install all the Python requirements, create a conda or mamba environment:
With conda:
conda env create -f environment.yml
conda activate AST
With mamba (a faster alternative: https://github.com/mamba-org/mamba):
mamba env create -f environment.yml
mamba activate AST
You must use Maven version 3.9.*.
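To confirm the Maven requirement is met (assuming `mvn` is already on your PATH), a quick check is:

```bash
# Print the installed Maven version; it must start with 3.9.
mvn -version | head -n 1
```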
On Ubuntu (or another Debian-based Linux), install the prerequisites:
sudo apt-get install -y jq
command -v curl >/dev/null || sudo apt install curl -y
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
&& sudo chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
&& sudo apt update \
&& sudo apt install gh -y
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
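The packagecloud script configures the Git LFS package repository; if git-lfs itself is not installed afterwards, the follow-up below (an assumption about your setup, not a step from the original instructions) completes it:

```bash
# Install git-lfs from the repository added above and enable its Git hooks.
sudo apt-get install -y git-lfs
git lfs install
```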
On macOS, install the same tools with Homebrew:
brew install jq
brew install gh
You must install Java 8, 11, and 17. You must set the `JAVA8_HOME`, `JAVA11_HOME`, and `JAVA17_HOME` environment variables to the respective Java installations.
By default, you should install GraalVM (version 21 or later) and set a `GRAALVM_HOME` environment variable to the home of the GraalVM JDK installation.
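For example, on a typical Ubuntu machine these might be set as follows (the paths are assumptions; substitute the locations of your own installations):

```bash
# Adjust these paths to wherever your JDKs are actually installed.
export JAVA8_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export JAVA11_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export JAVA17_HOME=/usr/lib/jvm/java-17-openjdk-amd64
export GRAALVM_HOME=/usr/lib/jvm/graalvm-jdk-21   # hypothetical GraalVM install location
```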
(TODO: the pipeline should be made to work with a regular JDK as well. The correctness outputs would still be reliable, but no timing information should be output.)
To test the stack, execute:
make small-test
This runs the entire code on two small repos.
The output data appears in `results/small/`.
- `results/small/result_adjusted.csv`: the final result.
- `results/small/merges/` contains all the merges.
- `results/small/merges_compared/` contains all merges and indicates whether the merge results are different and thus need to be analyzed.
- `results/small/merges_tested/` contains all merges that have been tested.
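A quick way to eyeball the final table (assuming standard command-line tools; this command is not part of the pipeline itself):

```bash
# Show the header and first few rows of the small-test results.
head -n 5 results/small/result_adjusted.csv | column -s, -t
```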
To run the stack on all repos:
./run_combined.sh
To run the stack on all repos and also diff the merges' outputs:
./run_combined.sh -d
This will run the entire code on all the repos and automatically decompress the cache if `cache/` does not exist.
All the output data can be found in `results/`. The final result is found in `results/result_adjusted.csv`.
Directory `results/merges` contains all the merges for each repo.
Directory `results/merges_tested` contains all the merges that have been tested.
If `make small-test` fails in a branch that you wish to merge into the `main` branch, run `make small-test` in the main branch (which should succeed) and also in your branch, and investigate the differences.
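A minimal sketch of that comparison (not part of the Makefile; `my-branch` is a placeholder for your branch name):

```bash
# Run the small test on main, stash its outputs, then rerun on your branch and diff.
git checkout main
make small-test
cp -r results/small /tmp/small-main
git checkout my-branch   # placeholder branch name
make small-test
diff -r /tmp/small-main results/small
```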
If you make a change to the mergers that changes merge results, you need to
update the goal files or else reproducibility checks will fail.
Copy the relevant files from `results/small/` to `test/small-goal-files/`; a sketch follows below.
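A hedged sketch of the copy step, assuming the goal files are exactly the files already present in `test/small-goal-files/`:

```bash
# Refresh each existing goal file from the freshly generated small-test output,
# then review the changes before committing.
for f in test/small-goal-files/*; do
  name=$(basename "$f")
  [ -f "results/small/$name" ] && cp "results/small/$name" "$f"
done
git diff test/small-goal-files/
```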
To update the reproducibility tests, run `make run-all` (this takes a long time!) and commit the results.
This will run merges in parallel. If the load on your machine becomes very low (as if no parallelism is happening), terminate the process and restart it.
To decompress the cache, run `make decompress-cache`. This is done automatically in `run_combined.sh` if `cache/` does not exist.
To compress the cache for storage, run `make compress-cache`.
To clean the cache, run `make clean-cache`.
To clean up the workspace, run `make clean`.
To run style checking, run `make style`.
- `run.sh` -> This file executes each step of the stack.
- `run_small.sh` -> This file executes the stack on two repositories.
- `run_combined.sh` -> This file executes the stack on all the repositories.
- `run_greatest_hits.sh` -> This file executes the stack on the greatest hits repositories.
- `run_reaper.sh` -> This file executes the stack on the reaper repositories.
- `run_1000.sh` -> This file executes the stack on the 1000 repositories.
- `src/` -> contains the following:
  - `python/` -> contains the following scripts:
    - `merge_tester.py` -> Main file, which performs merges and evaluates all the results across all projects.
    - `test_repo_heads.py` -> Checks out all repos and removes all repos that fail their tests on the main branch.
    - `latex_output.py` -> Outputs LaTeX code for the resulting plots and table.
    - `merge_analyzer.py` -> Analyzes a merge to determine if it should be tested.
    - `merges_sampler.py` -> Samples merges to be tested.
    - `get_repos.py` -> Downloads the repos list.
    - `cache_utils.py` -> Contains functions to store and load the cache.
    - `clean_cache_placeholders.py` -> Removes all the cache placeholders.
    - `repo.py` -> Contains the Repo class, which represents a repo.
    - `write_head_hashes.py` -> Writes the head hashes of all repos to a file.
    - `add_jacoco_gradle.py` -> Adds JaCoCo to Gradle projects.
    - `add_jacoco_maven.py` -> Adds JaCoCo to Maven projects.
  - `scripts/` -> contains the following scripts:
    - `run_repo_tests.sh` -> Runs a repo's programmer-provided tests.
    - `merge_tools/` -> Contains all the merge tool scripts.
  - `src/main/java/astmergeevaluation/FindMergeCommits.java` -> Finds all merge commits in a repo.
- `input_data/` -> Input data, which is a list of repositories; see its README.md.
- `cache/` -> This folder is a cache for each computation. It contains:
  - `test_result/` -> Caches the test results for a specific commit. Used for parent testing and repo validation.
  - `merge_test_results/` -> Caches the test results for specific merges. Used for merge testing. The first line indicates the merge result, and the second line indicates the run time.
  - `merge_diff_results/` -> Caches the diff results for specific merges.
- `cache-small/` -> This folder is a cache for each test computation. It contains:
  - `test_result/` -> Caches the test results for a specific commit. Used for parent testing and repo validation.
  - `merge_test_results/` -> Caches the test results for specific merges. Used for merge testing. The first line indicates the merge result, and the second line indicates the run time.
- `.workdir/` -> This folder is used for the local computations of each process; its contents are named by Unix process id (using "$$"). If `DELETE_WORKDIRS` is set to `false` in `src/python/variables.py`, this folder is not deleted after the computation and can be inspected.
- `repos/` -> Each repo is cloned into this folder.
- `results/` -> Contains all the results for the full analysis.
- `results/small/` -> Contains all the results for the small analysis.
- `jars/` -> Location for the IntelliMerge and Spork jars.
To investigate differences between two mergers:
- Edit `src/python/utils/select_from_results.py` to reflect the differences you are interested in.
- Run `src/python/utils/select_from_results.py` to create a .csv database containing only the differences.
- Set `DELETE_WORKDIRS` to `false` in `src/python/variables.py`.
- Run `src/python/replay_merge.py --idx INDEX` (maybe add `-test`) for the index of the merge you are interested in; see the example after this list. If the merge is in the small test, you may need to add `--merges_csv ./test/small-goal-files/result.csv`.
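For example, an invocation might look like this (the index 42 is a placeholder; running the script with `python3` from the repository root is an assumption):

```bash
# Replay merge number 42 from the small-test results and rerun its tests.
python3 src/python/replay_merge.py --idx 42 -test \
  --merges_csv ./test/small-goal-files/result.csv
```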
In some cases it might be worthwhile to override the computed results. To do that, modify the `results/manual_override.csv` file. For the merge whose result you want to override, include at least the columns `repository,merge,left,right`, plus a new column for the result you want to override. You can override anything you want, but if there is a column you don't want to override, either do not include that column or leave the entry blank (i.e., `,,`). See the file for an example.