ci: update MAINTAINERS.yaml for each CODEOWNERS file change #1315

Merged
merged 4 commits into from
Aug 21, 2024
1 change: 1 addition & 0 deletions .github/scripts/maintainers/.gitignore
@@ -0,0 +1 @@
github.api.cache.json
58 changes: 58 additions & 0 deletions .github/scripts/maintainers/README.md
@@ -0,0 +1,58 @@
# Maintainers

The ["Update MAINTAINERS.yaml file"](../../workflows/update-maintainers.yaml) workflow, defined in the `community` repository, performs a complete refresh of `MAINTAINERS.yaml` by fetching all public repositories under AsyncAPI and their respective `CODEOWNERS` files.

## Workflow Execution

The "Update MAINTAINERS.yaml file" workflow is executed in the following scenarios:

1. **Weekly Schedule**: The workflow runs automatically every week. This catches changes that do not touch a `CODEOWNERS` file, for example when a repository is archived or renamed, or when a GitHub user account is removed.
2. **On Change**: When a `CODEOWNERS` file is changed in any repository under the AsyncAPI organization, the related repository triggers the workflow by emitting the `trigger-maintainers-update` event.
3. **Manual Trigger**: Users can manually trigger the workflow as needed.
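
As an illustration of the "On Change" trigger, the sketch below shows how a repository could emit the `trigger-maintainers-update` event. It assumes the event is delivered as a `repository_dispatch` event to the `community` repository and that `github` is an authenticated Octokit client. The helper name `notifyCodeownersChange` and the `asyncapi` organization login are assumptions made for this example, not code from this pull request.

```javascript
// Hypothetical helper, not part of this PR.
// Assumption: the event is sent as a `repository_dispatch` event
// to the repository that hosts the "Update MAINTAINERS.yaml file" workflow.
async function notifyCodeownersChange(github, owner = "asyncapi", repo = "community") {
  await github.rest.repos.createDispatchEvent({
    owner,
    repo,
    // Event name taken from the README text above.
    event_type: "trigger-maintainers-update",
  });
}
```

A workflow in each repository could call such a helper whenever its `CODEOWNERS` file changes; the token used would need permission to dispatch events to the target repository.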

### Workflow Steps

1. **Load Cache**: Attempt to read previously cached data from `github.api.cache.json` to optimize API calls.
2. **List All Repositories**: Retrieve a list of all public repositories under the AsyncAPI organization, skipping any repositories specified in the `IGNORED_REPOSITORIES` environment variable.
3. **Fetch `CODEOWNERS` Files**: For each repository:
   - Detect the default branch (e.g., `main`, `master`, or a custom branch).
   - Check for `CODEOWNERS` files in all valid locations as specified in the [GitHub documentation](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners#codeowners-file-location).
4. **Process `CODEOWNERS` Files**:
   1. Extract GitHub usernames from each `CODEOWNERS` file, excluding emails, team names, and users specified by the `IGNORED_USERS` environment variable.
   2. Retrieve profile information for each unique GitHub username.
   3. Collect a fresh list of repositories currently owned by each GitHub user.
5. **Refresh Maintainers List**: Iterate through the existing maintainers list:
   - Delete the entry if it:
     - Refers to a deleted GitHub account.
     - Was not found in any `CODEOWNERS` file across all repositories in the AsyncAPI organization.
   - Otherwise, update **only** the `repos` property.
6. **Add New Maintainers**: Append any new maintainers not present in the previous list.
7. **Changes Summary**: Report why each maintainer was removed or changed. The details are written directly to the GitHub Actions [job summary page](https://github.blog/2022-05-09-supercharging-github-actions-with-job-summaries/).
8. **Save Cache**: Save retrieved data in `github.api.cache.json`.
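
Steps 4.1, 5, and 6 can be sketched as two small pure functions. This is a minimal illustration under stated assumptions, not the script from this pull request: the names `extractGitHubUsernames` and `refreshMaintainers` are hypothetical, comment lines are ignored, logins are compared case-sensitively, and a deleted account is represented by a `null` profile.

```javascript
// Hypothetical helpers; names and data shapes are assumptions for illustration.

// Step 4.1: return unique GitHub usernames mentioned in a CODEOWNERS file.
function extractGitHubUsernames(codeownersContent, ignoredUsers = []) {
  const usernames = new Set();
  for (const rawLine of codeownersContent.split("\n")) {
    // Drop trailing comments and surrounding whitespace.
    const line = rawLine.split("#")[0].trim();
    if (!line) continue;
    // The first token is the file pattern; the remaining tokens are owners.
    const [, ...owners] = line.split(/\s+/);
    for (const owner of owners) {
      // Keep "@user"; skip "@org/team" entries and e-mail addresses.
      if (!owner.startsWith("@") || owner.includes("/")) continue;
      const login = owner.slice(1);
      if (!ignoredUsers.includes(login)) usernames.add(login);
    }
  }
  return [...usernames];
}

// Steps 5 and 6: drop stale entries, update only `repos`, append new maintainers.
// `reposByLogin` and `profilesByLogin` are Maps keyed by GitHub login.
function refreshMaintainers(existing, reposByLogin, profilesByLogin) {
  const refreshed = [];
  for (const maintainer of existing) {
    const repos = reposByLogin.get(maintainer.github);
    const accountExists = profilesByLogin.get(maintainer.github) != null;
    // Removed: not listed in any CODEOWNERS file, or the account was deleted.
    if (!repos || !accountExists) continue;
    refreshed.push({ ...maintainer, repos });
  }
  const known = new Set(existing.map((maintainer) => maintainer.github));
  for (const [login, repos] of reposByLogin) {
    const profile = profilesByLogin.get(login);
    if (known.has(login) || profile == null) continue;
    refreshed.push({ ...profile, repos });
  }
  return refreshed;
}
```

Because existing entries are spread first and only `repos` is overwritten, manually curated fields such as `isTscMember` survive a refresh in this sketch.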

## Job Details

- **Concurrency**: Ensures the workflow does not run multiple times concurrently to avoid conflicts.
- **Wait for PRs to be Merged**: Before running, the workflow waits for pending pull requests to be merged. If a merged pull request already contains all necessary fixes, this avoids an unnecessary execution.

## Handling Conflicts

Since the job performs a full refresh each time, resolving conflicts is straightforward:

1. Close the pull request with conflicts.
2. Navigate to the "Update MAINTAINERS.yaml file" workflow.
3. Trigger it manually by clicking "Run workflow".

## Caching Mechanism

Each execution of this action performs a full refresh through the following API calls:

```
ListRepos(AsyncAPI)              # 1 call using GraphQL - not cached.
  for each Repo
    GetCodeownersFile(Repo)      # N calls using REST API - all are cached. N refers to the number of public repositories under AsyncAPI.
      for each codeowner
        GetGitHubProfile(owner)  # Y calls using REST API - all are cached. Y refers to unique GitHub users found across all CODEOWNERS files.
```

To avoid hitting the GitHub API rate limits, [conditional requests](https://docs.github.com/en/rest/using-the-rest-api/best-practices-for-using-the-rest-api?apiVersion=2022-11-28#use-conditional-requests-if-appropriate) are made with the `if-modified-since` header. The API responses are saved into a `github.api.cache.json` file, which is later uploaded as a GitHub Actions cache item.
64 changes: 64 additions & 0 deletions .github/scripts/maintainers/cache.js
@@ -0,0 +1,64 @@
const fs = require("fs");

module.exports = {
  fetchWithCache,
  saveCache,
  loadCache,
  printAPICallsStats,
};

const CODEOWNERS_CACHE_PATH = "./.github/scripts/maintainers/github.api.cache.json";

let cacheEntries = {};

let numberOfFullFetches = 0;
let numberOfCacheHits = 0;

function loadCache(core) {
  try {
    cacheEntries = JSON.parse(fs.readFileSync(CODEOWNERS_CACHE_PATH, "utf8"));
  } catch (error) {
    core.warning(`Cache was not restored: ${error}`);
  }
}

function saveCache() {
  fs.writeFileSync(CODEOWNERS_CACHE_PATH, JSON.stringify(cacheEntries));
}

async function fetchWithCache(cacheKey, fetchFn, core) {
  const cachedResp = cacheEntries[cacheKey];

  try {
    const { data, headers } = await fetchFn({
      headers: {
        "if-modified-since": cachedResp?.lastModified ?? "",
      },
    });

    cacheEntries[cacheKey] = {
      // last modified header is more reliable than etag while executing calls on GitHub Action
      lastModified: headers["last-modified"],
      data,
    };

    numberOfFullFetches++;
    return data;
  } catch (error) {
    if (error.status === 304) {
      numberOfCacheHits++;
      core.debug(`Returning cached data for ${cacheKey}`);
      return cachedResp.data;
    }
    throw error;
  }
}

function printAPICallsStats(core) {
  core.startGroup("API calls statistics");
  core.info(
    `Number of API calls counted against the rate limit: ${numberOfFullFetches}`,
  );
  core.info(`Number of cache hits: ${numberOfCacheHits}`);
  core.endGroup();
}
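
The conditional-request pattern used by `fetchWithCache` can be tried without network access. The sketch below is a simplified in-memory variant written for illustration only: `fetchWithMemoryCache`, `fakeFetch`, and `demo` are hypothetical names, and the fake endpoint merely imitates an API that answers `304` when the client already holds the current version. Unlike the code in this pull request, the variant omits the header when nothing is cached and reports whether the result came from the cache.

```javascript
// Illustrative only: an in-memory stand-in for the file-backed cache.
const cache = {};

async function fetchWithMemoryCache(cacheKey, fetchFn) {
  const cached = cache[cacheKey];
  try {
    const { data, headers } = await fetchFn({
      // Send the validator only when a previous response is cached.
      headers: cached ? { "if-modified-since": cached.lastModified } : {},
    });
    cache[cacheKey] = { lastModified: headers["last-modified"], data };
    return { data, fromCache: false };
  } catch (error) {
    // 304 Not Modified: the cached copy is still current.
    if (error.status === 304 && cached) {
      return { data: cached.data, fromCache: true };
    }
    throw error;
  }
}

// Fake endpoint (assumption): throws a 304-style error like an HTTP client would.
const LAST_MODIFIED = "Wed, 21 Aug 2024 10:00:00 GMT";
async function fakeFetch({ headers }) {
  if (headers["if-modified-since"] === LAST_MODIFIED) {
    throw Object.assign(new Error("Not Modified"), { status: 304 });
  }
  return { data: "* @alice", headers: { "last-modified": LAST_MODIFIED } };
}

// The first call performs a full fetch; the second is served from the cache.
async function demo() {
  const first = await fetchWithMemoryCache("owners:asyncapi/spec", fakeFetch);
  const second = await fetchWithMemoryCache("owners:asyncapi/spec", fakeFetch);
  return [first.fromCache, second.fromCache];
}
```

Calling `demo()` resolves to one flag per call, which makes it easy to see which requests would have counted as full fetches.
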
131 changes: 131 additions & 0 deletions .github/scripts/maintainers/gh_calls.js
@@ -0,0 +1,131 @@
const { fetchWithCache } = require("./cache");

module.exports = { getGitHubProfile, getAllCodeownersFiles, getRepositories };

async function getRepositories(github, owner, ignoredRepos, core) {
  core.startGroup(
    `Getting list of all public, non-archived repositories owned by ${owner}`,
  );

  const query = `
    query repos($cursor: String, $owner: String!) {
      organization(login: $owner) {
        repositories(first: 100 after: $cursor visibility: PUBLIC isArchived: false orderBy: {field: CREATED_AT, direction: ASC} ) {
          nodes {
            name
          }
          pageInfo {
            hasNextPage
            endCursor
          }
        }
      }
    }`;

  const repos = [];
  let cursor = null;

  do {
    const result = await github.graphql(query, { owner, cursor });
    const { nodes, pageInfo } = result.organization.repositories;
    repos.push(...nodes);

    cursor = pageInfo.hasNextPage ? pageInfo.endCursor : null;
  } while (cursor);

  core.debug(`List of repositories for ${owner}:`);
  core.debug(JSON.stringify(repos, null, 2));
  core.endGroup();

  return repos.filter((repo) => !ignoredRepos.includes(repo.name));
}

async function getGitHubProfile(github, login, core) {
  try {
    const profile = await fetchWithCache(
      `profile:${login}`,
      async ({ headers }) => {
        return github.rest.users.getByUsername({
          username: login,
          headers,
        });
      },
      core,
    );
    return removeNulls({
      name: profile.name ?? login,
      github: login,
      twitter: profile.twitter_username,
      availableForHire: profile.hireable,
      isTscMember: false,
      repos: [],
      githubID: profile.id,
    });
  } catch (error) {
    if (error.status === 404) {
      return null;
    }
    throw error;
  }
}

// Checks for all valid locations according to:
// https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners#codeowners-file-location
//
// No `ref` is passed, so the API reads from the repository's default branch.
async function getCodeownersFile(github, owner, repo, core) {
  const paths = ["CODEOWNERS", "docs/CODEOWNERS", ".github/CODEOWNERS"];

  for (const path of paths) {
    try {
      core.debug(
        `[repo: ${owner}/${repo}]: Fetching CODEOWNERS file at ${path}`,
      );
      return await fetchWithCache(
        `owners:${owner}/${repo}`,
        async ({ headers }) => {
          return github.rest.repos.getContent({
            owner,
            repo,
            path,
            headers: {
              Accept: "application/vnd.github.raw+json",
              ...headers,
            },
          });
        },
        core,
      );
    } catch (error) {
      // Any failure (including 404) is logged, then the next location is tried.
      core.warning(
        `[repo: ${owner}/${repo}]: Failed to fetch CODEOWNERS file at ${path}: ${error.message}`,
      );
    }
  }

  core.error(
    `[repo: ${owner}/${repo}]: CODEOWNERS file not found in any of the expected locations.`,
  );
  return null;
}

async function getAllCodeownersFiles(github, owner, repos, core) {
  core.startGroup(`Fetching CODEOWNERS files for ${repos.length} repositories`);
  const files = [];
  for (const repo of repos) {
    const data = await getCodeownersFile(github, owner, repo.name, core);
    if (!data) {
      continue;
    }
    files.push({
      repo: repo.name,
      content: data,
    });
  }
  core.endGroup();
  return files;
}

function removeNulls(obj) {
  return Object.fromEntries(Object.entries(obj).filter(([_, v]) => v != null));
}
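
The cursor-based pagination in `getRepositories` can be exercised against a stub client. The sketch below is illustrative only: `createFakeGithub` and `listRepoNames` are hypothetical names, the stub uses page indexes as cursors (real GitHub cursors are opaque strings), and the query text is a placeholder rather than the query from this pull request.

```javascript
// Stub GraphQL client (assumption): serves the given pages in order and
// mirrors the response shape of the `repositories` connection.
function createFakeGithub(pages) {
  return {
    graphql: async (_query, { cursor }) => {
      const index = cursor === null ? 0 : Number(cursor);
      const hasNextPage = index < pages.length - 1;
      return {
        organization: {
          repositories: {
            nodes: pages[index].map((name) => ({ name })),
            pageInfo: {
              hasNextPage,
              endCursor: hasNextPage ? String(index + 1) : null,
            },
          },
        },
      };
    },
  };
}

// Same loop structure as `getRepositories`: keep requesting pages until
// `hasNextPage` is false, then drop the ignored repositories.
async function listRepoNames(github, owner, ignoredRepos) {
  const names = [];
  let cursor = null;
  do {
    const result = await github.graphql("query placeholder", { owner, cursor });
    const { nodes, pageInfo } = result.organization.repositories;
    names.push(...nodes.map((node) => node.name));
    cursor = pageInfo.hasNextPage ? pageInfo.endCursor : null;
  } while (cursor);
  return names.filter((name) => !ignoredRepos.includes(name));
}
```

For example, `listRepoNames(createFakeGithub([["spec", "cli"], ["community", ".github"]]), "asyncapi", [".github"])` walks both pages and filters out the ignored repository.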