feat: add workflow and script to check edit links on docs #3557

Open
wants to merge 14 commits into
base: master
49 changes: 49 additions & 0 deletions .github/workflows/check-edit-links.yml
@@ -0,0 +1,49 @@
name: Weekly Docs Link Checker

on:
  schedule:
    - cron: '0 0 * * 0' # Runs every week at midnight on Sunday
  workflow_dispatch:

jobs:
  check-links:
    name: Run Link Checker and Notify Slack
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version-file: '.nvmrc'

      - name: Install dependencies
        run: npm install

      - name: Run link checker
        id: linkcheck
        run: |
          npm run test:editlinks | tee output.log

      - name: Extract 404 URLs from output
        id: extract-404
        run: |
          ERRORS=$(sed -n '/URLs returning 404:/,$p' output.log)
          echo "errors<<EOF" >> $GITHUB_OUTPUT
          echo "$ERRORS" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: Notify Slack
        if: ${{ steps.extract-404.outputs.errors != '' }}
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.WEBSITE_SLACK_WEBHOOK }}
          SLACK_TITLE: 'Docs Edit Link Checker Errors Report'
          SLACK_MESSAGE: |
            🚨 The following URLs returned 404 during the link check:
            ```
            ${{ steps.extract-404.outputs.errors }}
            ```
          MSG_MINIMAL: true
Member

What does this parameter do for the Slack message?

Member Author

It removes a bunch of useless information from the message.
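
For readers less familiar with `sed`, the `Extract 404 URLs from output` step in the workflow keeps everything from the first line matching `URLs returning 404:` through the end of the log, and yields an empty string when the marker never appears (which is what lets the `if:` condition skip the Slack step). A rough JavaScript sketch of that same logic follows; `extractErrors` and the sample log are hypothetical and exist only for illustration, they are not part of this PR.

```javascript
// Hypothetical helper: mimic `sed -n '/URLs returning 404:/,$p' output.log`.
// Keeps every line from the first one containing the marker to the end of the log.
function extractErrors(log, marker = 'URLs returning 404:') {
  const lines = log.split('\n');
  const start = lines.findIndex((line) => line.includes(marker));
  // No marker means the checker found nothing to report.
  return start === -1 ? '' : lines.slice(start).join('\n');
}

// Made-up log content shaped like the script's console output.
const sampleLog = [
  'Starting URL checks...',
  '',
  'URLs returning 404:',
  '',
  '- https://example.com/docs/gone.md generated from /markdown/docs/gone.md'
].join('\n');

console.log(extractErrors(sampleLog));
console.log(JSON.stringify(extractErrors('All URLs are valid.')));
```
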

4 changes: 4 additions & 0 deletions components/layout/DocsLayout.tsx
@@ -33,6 +33,10 @@ interface IDocsLayoutProps {
*/
function generateEditLink(post: IPost) {
let last = post.id.substring(post.id.lastIndexOf('/') + 1);

if (last.endsWith('.mdx')) {
// Anchor the pattern so only the trailing extension is rewritten
last = last.replace(/\.mdx$/, '.md');
}
const target = editOptions.find((edit) => {
return post.slug.includes(edit.value);
});
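
The hunk above maps a trailing `.mdx` to `.md` before the edit link is built. A minimal standalone sketch of that filename step is shown here, assuming only the extension should change; `editFileName` is a hypothetical name used for illustration and does not exist in `DocsLayout.tsx`.

```javascript
// Hypothetical helper mirroring the filename step of generateEditLink.
function editFileName(postId) {
  // Take the last path segment of the post id.
  const last = postId.substring(postId.lastIndexOf('/') + 1);
  // Rewrite only a trailing ".mdx"; every other name passes through unchanged.
  return last.replace(/\.mdx$/, '.md');
}

console.log(editFileName('/docs/tools/generator/usage.mdx'));
console.log(editFileName('/docs/concepts/application.md'));
console.log(editFileName('/docs/guides/notes.mdx.backup.mdx'));
```

The third call shows why the pattern is anchored: a plain string replace would rewrite the first `.mdx` in the name rather than the extension.
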
4 changes: 2 additions & 2 deletions config/edit-page-config.json
@@ -1,7 +1,7 @@
[
{
"value": "/tools/generator",
- "href": "https://github.com/asyncapi/generator/tree/master/docs"
+ "href": "https://github.com/asyncapi/generator/tree/master/apps/generator/docs"
},
{
"value": "reference/specification/",
@@ -19,4 +19,4 @@
"value": "reference/extensions/",
"href": "https://github.com/asyncapi/extensions-catalog/tree/master/extensions"
}
- ]
+ ]
4 changes: 2 additions & 2 deletions jest.config.js
@@ -6,5 +6,5 @@ module.exports = {
collectCoverageFrom: ['scripts/**/*.js'],
coveragePathIgnorePatterns: ['scripts/compose.js', 'scripts/tools/categorylist.js', 'scripts/tools/tags-color.js'],
// To disallow netlify edge function tests from running
- testMatch: ['**/tests/**/*.test.*', '!**/netlify/**/*.test.*'],
- };
+ testMatch: ['**/tests/**/*.test.*', '!**/netlify/**/*.test.*']
+ };
1 change: 1 addition & 0 deletions package.json
@@ -24,6 +24,7 @@
"generate:tools": "node scripts/build-tools.js",
"test:netlify": "deno test --allow-env --trace-ops netlify/**/*.test.ts",
"test:md": "node scripts/markdown/check-markdown.js",
"test:editlinks": "node scripts/markdown/check-edit-links.js",
"dev:storybook": "storybook dev -p 6006",
"build:storybook": "storybook build"
},
2 changes: 1 addition & 1 deletion scripts/dashboard/build-dashboard.js
@@ -181,4 +181,4 @@ if (require.main === module) {
start(resolve(__dirname, '..', '..', 'dashboard.json'));
}

- module.exports = { getLabel, monthsSince, mapGoodFirstIssues, getHotDiscussions, getDiscussionByID, getDiscussions, writeToFile, start, processHotDiscussions };
+ module.exports = { getLabel, monthsSince, mapGoodFirstIssues, getHotDiscussions, getDiscussionByID, getDiscussions, writeToFile, start, processHotDiscussions, pause };
163 changes: 163 additions & 0 deletions scripts/markdown/check-edit-links.js
@@ -0,0 +1,163 @@
const fs = require('fs').promises;
const path = require('path');
const fetch = require('node-fetch-2');
Member

Why is node-fetch-2 used here?

Member Author

We already use it the same way in the other "js" scripts:

const fetch = require('node-fetch-2');

const editUrls = require('../../config/edit-page-config.json');
const { pause } = require('../dashboard/build-dashboard');

const ignoreFiles = [
'reference/specification/v2.x.md',
'reference/specification/v3.0.0-explorer.md',
'reference/specification/v3.0.0.md'
];

/**
 * Process a batch of URLs to check for 404s
 * @param {object[]} batch - Array of path objects to check
 * @returns {Promise<(object|null)[]>} One entry per path: the path object if its edit link returned 404, otherwise null
 */
async function processBatch(batch) {
  // Environment variables are strings, so convert before use
  const TIMEOUT_MS = Number(process.env.TIMEOUT_MS) || 5000;
  return Promise.all(
    batch.map(async ({ filePath, urlPath, editLink }) => {
      let timeout;
      try {
        if (!editLink || ignoreFiles.some((ignorePath) => filePath.endsWith(ignorePath))) return null;

        const controller = new AbortController();
        /* istanbul ignore next */
        timeout = setTimeout(() => controller.abort(), TIMEOUT_MS);
        const response = await fetch(editLink, {
          method: 'HEAD',
          signal: controller.signal
        });
        if (response.status === 404) {
          return { filePath, urlPath, editLink };
        }
        return null;
      } catch (error) {
        throw new Error(`Error checking ${editLink}: ${error.message}`);
      } finally {
        // Clear the timer on both success and failure
        clearTimeout(timeout);
      }
    })
  );
}
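
To make the per-URL check easier to follow in isolation, here is a small sketch of a `HEAD` request guarded by an `AbortController` timeout. It is not the PR's code: `isMissing`, `stubFetch`, and the `example.com` URLs are hypothetical, and the fetch function is passed in as a parameter purely so the example runs without network access.

```javascript
// Sketch: check one URL with a HEAD request and an abort-based timeout.
// `fetchImpl` is a hypothetical injection point so the example runs offline.
async function isMissing(url, fetchImpl, timeoutMs = 5000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { method: 'HEAD', signal: controller.signal });
    return response.status === 404;
  } finally {
    // Always clear the timer, even when the request throws.
    clearTimeout(timer);
  }
}

// Stub standing in for a real HTTP client; it reports 404 for one URL only.
const stubFetch = async (url) => ({ status: url.endsWith('/gone.md') ? 404 : 200 });

async function demo() {
  const urls = ['https://example.com/docs/ok.md', 'https://example.com/docs/gone.md'];
  const flags = await Promise.all(urls.map((url) => isMissing(url, stubFetch)));
  // Keep only the URLs flagged as missing.
  return urls.filter((_, index) => flags[index]);
}

demo().then((missing) => console.log(missing));
```

Clearing the timer in `finally` keeps a failed request from leaving a pending timeout behind.
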

/**
 * Check all URLs in batches
 * @param {object[]} paths - Array of all path objects to check
 * @returns {Promise<object[]>} Array of path objects whose edit links returned 404
 */
async function checkUrls(paths) {
  const result = [];
  // Parse to a positive integer: with a string value, `i += batchSize` would concatenate
  const parsedBatchSize = Number.parseInt(process.env.BATCH_SIZE, 10);
  const batchSize = parsedBatchSize > 0 ? parsedBatchSize : 5;

  const batches = [];
  for (let i = 0; i < paths.length; i += batchSize) {
    const batch = paths.slice(i, i + batchSize);
    batches.push(batch);
  }

  console.log(`Processing ${batches.length} batches concurrently...`);
  const batchResultsArray = await Promise.all(
    batches.map(async (batch) => {
      const batchResults = await processBatch(batch);
      await pause(1000);
      return batchResults.filter((url) => url !== null);
    })
  );

  result.push(...batchResultsArray.flat());
  return result;
}
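
The batching step is easy to get subtly wrong when the size comes from an environment variable, because `process.env` values are strings and `i += '3'` concatenates instead of adding. The sketch below isolates the chunking and the parsing; `toBatches` and `parseBatchSize` are hypothetical helper names, not functions from this PR.

```javascript
// Split a list into consecutive batches of at most `size` items.
function toBatches(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// `rawBatchSize` stands in for process.env.BATCH_SIZE in this sketch.
// Anything that is not a positive integer falls back to the default.
function parseBatchSize(rawBatchSize, fallback = 5) {
  const parsed = Number.parseInt(rawBatchSize, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

const items = [1, 2, 3, 4, 5, 6, 7];
console.log(toBatches(items, parseBatchSize('3')));
console.log(toBatches(items, parseBatchSize(undefined)));
```

With the parsed size of 3 the seven items split into batches of 3, 3, and 1; with no value set, the default of 5 gives batches of 5 and 2.
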

/**
 * Determines the appropriate edit link based on the URL path and file path
 * @param {string} urlPath - The URL path to generate an edit link for
 * @param {string} filePath - The actual file path
 * @param {object[]} editOptions - Array of edit link options
 * @returns {string|null} The generated edit link or null if no match
 */
function determineEditLink(urlPath, filePath, editOptions) {
  // Remove leading 'docs/' if present for matching
  const pathForMatching = urlPath.startsWith('docs/') ? urlPath.slice(5) : urlPath;

  const target = editOptions.find((edit) => pathForMatching.includes(edit.value));

  // No matching edit option: return null, as documented above
  if (!target) return null;

  // Handle the empty value case (fallback)
  if (target.value === '') {
    return `${target.href}/docs/${urlPath}.md`;
  }

  // For other cases with specific targets
  return `${target.href}/${path.basename(filePath)}`;
}
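
A short usage sketch may help show how the matching behaves. It re-declares the function so it runs on its own, includes a guard for the no-match case as an assumption, and uses `sampleEditOptions`, an assumed two-entry subset of `config/edit-page-config.json` whose hrefs are inferred from the test fixtures in this PR rather than copied from the config file.

```javascript
const path = require('path');

// Assumed subset of the edit config; hrefs are inferred from the PR's fixtures.
const sampleEditOptions = [
  { value: '/tools/cli', href: 'https://github.com/asyncapi/cli/tree/master/docs' },
  { value: '', href: 'https://github.com/asyncapi/website/blob/master/markdown' }
];

function determineEditLink(urlPath, filePath, editOptions) {
  const pathForMatching = urlPath.startsWith('docs/') ? urlPath.slice(5) : urlPath;
  // The first option whose value occurs in the path wins; '' matches everything.
  const target = editOptions.find((edit) => pathForMatching.includes(edit.value));
  if (!target) return null; // assumption: no match yields null
  if (target.value === '') return `${target.href}/docs/${urlPath}.md`;
  return `${target.href}/${path.basename(filePath)}`;
}

// Specific target: only the file's base name is appended to the href.
console.log(determineEditLink('/tools/cli', 'markdown/docs/tools/cli/index.md', sampleEditOptions));
// Fallback target: the whole URL path is appended under /docs/.
console.log(determineEditLink('concepts/application', 'markdown/docs/concepts/application.md', sampleEditOptions));
// No options at all: null.
console.log(determineEditLink('concepts/application', 'markdown/docs/concepts/application.md', []));
```

In this sketch a `urlPath` without a leading slash, such as `tools/cli/index`, does not contain `/tools/cli` and therefore falls through to the fallback entry.
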

/**
 * Recursively processes markdown files in a directory to generate paths and edit links
 * @param {string} folderPath - The path to the folder to process
 * @param {object[]} editOptions - Array of edit link options
 * @param {string} [relativePath=''] - The relative path for URL generation
 * @param {object[]} [result=[]] - Accumulator for results
 * @returns {Promise<object[]>} Array of objects containing file paths and edit links
 */
async function generatePaths(folderPath, editOptions, relativePath = '', result = []) {
  try {
    const files = await fs.readdir(folderPath);

    await Promise.all(
      files.map(async (file) => {
        const filePath = path.join(folderPath, file);
        const relativeFilePath = path.join(relativePath, file);

        // Skip _section.md files
        if (file === '_section.md') {
          return;
        }

        const stats = await fs.stat(filePath);

        if (stats.isDirectory()) {
          await generatePaths(filePath, editOptions, relativeFilePath, result);
        } else if (stats.isFile() && file.endsWith('.md')) {
          // Strip only the trailing extension, not the first '.md' found in the path
          const urlPath = relativeFilePath.split(path.sep).join('/').replace(/\.md$/, '');
          result.push({
            filePath,
            urlPath,
            editLink: determineEditLink(urlPath, filePath, editOptions)
          });
        }
      })
    );

    return result;
  } catch (err) {
    throw new Error(`Error processing directory ${folderPath}: ${err.message}`);
  }
}
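
The conversion from a file path relative to the docs folder into a URL path is the one platform-sensitive step in the directory walk, since `path.join` uses backslashes on Windows. A tiny sketch of that step follows; `toUrlPath` is a hypothetical helper name used only for this illustration.

```javascript
const path = require('path');

// Hypothetical helper: normalise separators to '/' and drop a trailing '.md'.
function toUrlPath(relativeFilePath) {
  return relativeFilePath.split(path.sep).join('/').replace(/\.md$/, '');
}

// path.join builds the input with the platform separator, as readdir-based code would.
console.log(toUrlPath(path.join('tutorials', 'generate-code.md')));
console.log(toUrlPath(path.join('guides', 'index.md')));
console.log(toUrlPath(path.join('reference', 'v3.md-notes', 'intro.md')));
```
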

async function main() {
  const editOptions = editUrls;

  try {
    const docsFolderPath = path.resolve(__dirname, '../../markdown/docs');
    const paths = await generatePaths(docsFolderPath, editOptions);
    console.log('Starting URL checks...');
    const invalidUrls = await checkUrls(paths);

    if (invalidUrls.length > 0) {
      console.log('\nURLs returning 404:\n');
      invalidUrls.forEach((url) => console.log(`- ${url.editLink} generated from ${url.filePath}\n`));
      console.log(`\nTotal invalid URLs found: ${invalidUrls.length}`);
    } else {
      console.log('All URLs are valid.');
    }
  } catch (error) {
    throw new Error(`Failed to check edit links: ${error.message}`);
  }
}

/* istanbul ignore next */
if (require.main === module) {
  main();
}

module.exports = { generatePaths, processBatch, checkUrls, determineEditLink, main };
85 changes: 85 additions & 0 deletions tests/fixtures/markdown/check-editlinks-data.js
@@ -0,0 +1,85 @@
const determineEditLinkData = [
{
urlPath: 'docs/concepts/application',
filePath: 'markdown/docs/concepts/application.md',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/docs/concepts/application.md'
},
{
urlPath: 'concepts/application',
filePath: 'markdown/docs/concepts/application.md',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/concepts/application.md'
},
{
urlPath: '/tools/cli',
filePath: 'markdown/docs/tools/cli/index.md',
editLink: 'https://github.com/asyncapi/cli/tree/master/docs/index.md'
}
];

const processBatchData = [
{
filePath: '/markdown/docs/tutorials/generate-code.md',
urlPath: 'tutorials/generate-code',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/tutorials/generate-code.md'
},
{
filePath: '/markdown/docs/tutorials/index.md',
urlPath: 'tutorials/index',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/tutorials/index.md'
}
];

const testPaths = [
{
filePath: '/markdown/docs/guides/index.md',
urlPath: 'guides/index',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/guides/index.md'
},
{
filePath: '/markdown/docs/guides/message-validation.md',
urlPath: 'guides/message-validation',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/guides/message-validation.md'
},
{
filePath: '/markdown/docs/guides/validate.md',
urlPath: 'guides/validate',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/guides/validate.md'
},
{
filePath: '/markdown/docs/reference/index.md',
urlPath: 'reference/index',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/reference/index.md'
},
{
filePath: '/markdown/docs/tools/index.md',
urlPath: 'tools/index',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/tools/index.md'
},
{
filePath: '/markdown/docs/migration/index.md',
urlPath: 'migration/index',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/migration/index.md'
},
{
filePath: '/markdown/docs/migration/migrating-to-v3.md',
urlPath: 'migration/migrating-to-v3',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/migration/migrating-to-v3.md'
},
{
filePath: '/markdown/docs/tutorials/create-asyncapi-document.md',
urlPath: 'tutorials/create-asyncapi-document',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/tutorials/create-asyncapi-document.md'
},
{
filePath: '/markdown/docs/tutorials/generate-code.md',
urlPath: 'tutorials/generate-code',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/tutorials/generate-code.md'
},
{
filePath: '/markdown/docs/tutorials/index.md',
urlPath: 'tutorials/index',
editLink: 'https://github.com/asyncapi/website/blob/master/markdown/docs/tutorials/index.md'
}
];

module.exports = { determineEditLinkData, processBatchData, testPaths };