When you need to merge an Merge Request(or Pull Request), if the target branch has changed, you need to rebase the source branch of the MR to keep in sync with the latest target branch. This process will cause changes to the MR source branch, and these changes are not verified by CI, so CI needs to be retriggered for verification. The Merge Queue provided by gitlab hosts the "Rebase + rerun CI" process in an automated way.
As shown in the figure above, the merge base of a Merge Request is B1. The orange area is the changes of the MR and has been verified by CI. When the MR is to be merged into the master, the master branch has already had new changes. Before the MR is merged, the green and orange areas need to be merged into a whole through rebase to re-trigger CI. Only after passing the new CI verification can MR be merged into the master.
In Monorepo, when we modify the code of a certain projectA in MR and pass the CI verification, the scope of this CI verification is projectA itself and its impact area, where the impact area represents other apps/libs that directly or indirectly reference projectA. From this, we can make a hypothesis, as shown in the figure below, if there is no intersection between the "green impact area" and the "orange impact area", as long as MR passes the CI verification, there is no need to retrigger CI after rebase master, which can improve merging efficiency and reduce resource costs.
So how to identify whether there is an intersection between the impact areas of different changes?
- This document aims to define a file format to represent dependencies between projects in monorepo.
- The file format can be used to determine whether there is an intersection between two sets of change files.
(Note: project generally refers to an independent app or lib. In the front-end field, it usually consists of a package.json file describing meta information, and code functions. There may be differences in other fields)
Principle | Description |
---|---|
Avoid git conflicts as much as possible | In Monorepo, every project dependency change will lead to changes in this file. When multiple MRs have modified the file, conflicts should be avoided as much as possible. |
Common to multiple technical fields | Code integration occurs in the R&D process of each technical field, so file design should avoid being strongly related to the technical solution in a certain field. |
Simple and easy to use | Code integration is a frequent event. It is necessary to pay attention to the parsing efficiency and usage complexity of the file. It should be as easy to use as possible. |
- File Name:project-impact-graph.yaml
- Description: only describe project dependencies
- Fields:
- projectName: the name of a project or module
- includedGlobs: a glob list, representing a project scope, used to match each file path in MR change
- dependentProjects: A set of project names that directly depend on the key project
- projectName: 'A'
includedGlobs:
- 'projects/A/**'
dependentProjects:
- 'A'
- 'B'
- 'C'
- projectName: 'D'
includedGlobs:
- projects/D/**
dependentProjects:
- 'D'
- File Name:project-impact-graph.yaml
- Description: based on Option 1, some business information is added to support filtering specified file paths.
- Fields:
- globalExcludedGlobs: used to identify paths that should be ignored before matching
- projects:
- Key: project name
- Value:
- includedGlobs: a glob list, representing a project scope, used to match each file path in MR change
- excludedGlobs: a project-level glob list, used to identify paths to ignore before matching
- dependentProjects: A set of project names that directly depend on the key project
globalExcludedGlobs:
- '**/OWNERS'
- '**/build.sh'
- '**/bootstrap.sh'
- 'common/autoinstallers/**'
projects:
'A':
includedGlobs:
- 'projects/A/**'
excludedGlobs:
- 'projects/A/README.md'
dependentProjects:
- 'A'
- 'B'
- 'C'
'D':
includedGlobs:
- 'projects/D/**'
excludedGlobs:
- 'projects/D/README.md'
dependentProjects:
- 'D'
- globalExcludedGlobs and excludedGlobs are defined in <monorepo_root>/.mergequeueignore and <project_root>/.mergequeueignore respectively
- <monorepo_root>/.mergequeueignore
**/OWNERS **/build.sh **/bootstrap.sh common/autoinstallers/**
- <project_root>/.mergequeueignore
projects/A/README.md
- <monorepo_root>/.mergequeueignore
Taking a monorepo managed by rush.js as an example, each project is in a unique folder. When the project dependencies in Monorepo change, the files need to be automatically regenerated The steps to generate the yaml file are as follows:
- Get globalExcludedGlobs from
<monorepo_root>/.mergequeueignore
- Use the rush tool to get the <project_root> of each project, and then generate includedGlobs based on the <project_root>
- Get excludedGlobs from
<project_root>/.mergequeueignore
- Use the rush tool to get the dependents of each project
- Generate project-impact-graph.yaml
When an MR is submitted to the Merge Queue, the Merge Queue will perform the following steps:
- Request and parse the latest
project-impact-graph.yaml
- Request the change file list
mr_change_path_list
,master_change_path_list
of MR and master - Filter paths in two lists based on globalExcludedGlobs and excludedGlobs defined in
project-impact-graph.yaml
- Match the projects defined in
project-impact-graph.yaml
one by one according tomr_change_path_list
andproject.includedGlobs
- Recursively find all dependent projects of each matched project in real time (directly and indirectly), and merge all the values of dependentProjects fields to get
mr_change_impact
- Run step4 and step5 again for
master_change_path_list
to getmaster_change_impact
- Calculate whether there is intersection between
mr_change_impact
andmaster_change_impact
- If there is an intersection, MR needs to retrigger CI check after rebase master, otherwise skip CI check
- Why use glob to represent folder/project?
Usage: Glob patterns are primarily used in file systems to filter filenames or directories. They are used in Unix shell commands and many programming languages like Python, JavaScript, etc.
Complexity: Glob patterns are simpler to use and understand. They typically include wildcard characters like * (matches any number of any characters). In monorepo, we can use project/** to represent all file paths under a project
- Why use includedGlobs to represent one project instead of includedGlob?
In a monorepo managed by rush, a project can be represented by a folder, but there is no guarantee that all package management tools adopt the same strategy. A project may have multiple different folders. In order to ensure file scalability, project-impact-graph.yaml
uses a glob list to represent the project
- The relation of includedGlobs and excludedGlobs
Project level excludeGlobs should not be allowed to affect public files by default, so excludedGlobs needs to be a subset of includedGlobs in a certain project. The maintainer of each project can only configure the file paths within the project that need to be ignored. In addition, public files can only be configured by the maintainer of the Monorepo repository in globalExcludedGlobs
- Priority of .gitignore, globalExcludedGlobs and excludedGlobs
If a file is in the scope of .gitignore, it can not appear in Merge Request and will not appear in project-impact-graph.yaml
If a file is filtered by globalExcludedGlobs, then it will never have a chance to match excludedGlobs
For detailed usage logic, please refer to the following two parts: How to generate the project-impact-graph.yaml file and How to use the project-impact-graph.yaml file
- How to determine the latest project-impact-graph.yaml
Users may have modified dependencies in MR, and project-impact-graph.yaml has changed. Merge Queue sticks to using the latest file. There are several scenarios:
- Neither MR nor master has modified project-impact-graph.yaml: Merge Queue requests the file from merge base commit
- Only MR modified project-impact-graph.yaml: Merge Queue requests the file from MR change
- Only Master modified project-impact-graph.yaml: Merge Queue requests the file from master change
- MR and Master modified project-impact-graph.yaml at the same time: Merge Queue uses two project-impact-graph.yaml files from MR and master to calculate dependent relations separately
Assume that Monorepo has projects A, B, B_subProject, C,..., M. Their dependency relations are as follows:
There are some cases defined in the src/test/index.test.ts
file. After cloning the repository, you can execute the following command to get a deeper understanding
pnpm install
pnpm test