HDDS-11072. Publish user-facing configs to the doc site #6916

Open: wants to merge 17 commits into base: master
123 changes: 123 additions & 0 deletions .github/workflows/ci.yml
@@ -166,6 +166,129 @@ jobs:
          path: |
            ~/.m2/repository/org/apache/ozone
          retention-days: 1
  build-config-doc:
    needs:
      - build
    if: ${{ github.repository == 'apache/ozone' && github.event_name == 'push' && github.ref_name == 'master' }}
    runs-on: ubuntu-latest
    steps:
      - name: Checkout project
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'
      - name: Download the source artifact
        uses: actions/download-artifact@v4
        with:
          name: ozone-bin
          path: ozone-bin
      - name: Extract the source tarball
        run: |
          mkdir -p ozone-bin/extracted
          tar -xzf ozone-bin/ozone-*.tar.gz -C ozone-bin/extracted
      - name: Run the Python script to convert XML properties into Markdown
        run: python3 dev-support/ci/xml_to_md.py ozone-bin/extracted > hadoop-hdds/docs/content/tools/Configurations.md
      - name: Check if Configurations.md file has changed
        id: hash-check
        run: |
          # hashFiles() only hashes files in the current workspace and cannot
          # address a previous commit, so compare against the checked-out tree
          # with git instead. This also catches a not-yet-tracked file.
          if [ -n "$(git status --porcelain -- hadoop-hdds/docs/content/tools/Configurations.md)" ]; then
            echo "Configurations.md has changed, proceeding to commit and create PRs."
            echo "hashes_differ=true" >> "$GITHUB_ENV"
          else
            echo "Configurations.md is unchanged, skipping commit and PR creation."
            echo "hashes_differ=false" >> "$GITHUB_ENV"
          fi
      - name: Commit and push changes to apache/ozone
        if: env.hashes_differ == 'true'
        env:
          GH_TOKEN: ${{ secrets.OZONE_WEBSITE_BUILD }}
        run: |
          git config --global user.name 'Github Actions'
          git config --global user.email '[email protected]'
          git checkout -b config-doc-update-from-$GITHUB_SHA
          git add hadoop-hdds/docs/content/tools/Configurations.md
          git commit -m "[Auto] Update Configurations.md from $GITHUB_SHA"
          git push origin config-doc-update-from-$GITHUB_SHA
      - name: Extract JIRA ID from the first commit of the branch
        id: extract-jira
        env:
          BRANCH_NAME: ${{ github.ref_name }}
        run: |
          REMOTE=$(git remote -v | grep -E 'github.com[:/]apache/ozone' | awk '{print $1}' | head -1)
          git fetch $REMOTE
          JIRA_ID=$(git log ${REMOTE}/master..${REMOTE}/$(git rev-parse --abbrev-ref HEAD) --oneline | tail -1 | sed -E 's/.*(HDDS-[[:digit:]]+).*/\1/')
          echo "jira_id=$JIRA_ID" >> $GITHUB_ENV
      - name: Create Pull Request in apache/ozone
        if: env.hashes_differ == 'true'
        env:
          GH_TOKEN: ${{ secrets.OZONE_WEBSITE_BUILD }}
          WORKFLOW_NAME: ${{ github.workflow }}
          WORKFLOW_RUN_ID: ${{ github.run_id }}
          BRANCH_NAME: ${{ github.ref_name }}
          JIRA_ID: ${{ env.jira_id }}
        run: |
          echo "## What changes were proposed in this pull request?" > pr_body.txt
          echo "This is an automated pull request triggered by the [$WORKFLOW_NAME](https://github.com/apache/ozone/actions/runs/$WORKFLOW_RUN_ID) workflow run from [$BRANCH_NAME](https://github.com/apache/ozone/tree/$BRANCH_NAME). Please delete the [config-doc-update-from-$GITHUB_SHA](https://github.com/apache/ozone/tree/config-doc-update-from-$GITHUB_SHA) branch after this PR is merged or closed." >> pr_body.txt
          echo "" >> pr_body.txt
          echo "## What is the link to the Apache JIRA?" >> pr_body.txt
          echo "[$JIRA_ID](https://issues.apache.org/jira/browse/$JIRA_ID)" >> pr_body.txt
          echo "" >> pr_body.txt
          echo "## How was this patch tested?" >> pr_body.txt
          echo "Reviewers should manually verify the correctness of this change." >> pr_body.txt

          gh pr create --base master --head config-doc-update-from-$GITHUB_SHA \
            --title "$JIRA_ID. Update Configurations.md page with changes from $JIRA_ID" \
            --body-file pr_body.txt
      - name: Checkout ozone-site repository
        if: env.hashes_differ == 'true'
        uses: actions/checkout@v4
        with:
          token: ${{ secrets.OZONE_WEBSITE_BUILD }}
          repository: apache/ozone-site
          ref: 'HDDS-9225-website-v2'
          path: ozone-site
      - name: Copy MD file to ozone-site repository
        if: env.hashes_differ == 'true'
        run: |
          TARGET_DIR=$(ls -d ozone-site/docs/*-administrator-guide/*-configuration)
          echo "TARGET_DIR=${TARGET_DIR#ozone-site/}" >> $GITHUB_ENV
          cp hadoop-hdds/docs/content/tools/Configurations.md "$TARGET_DIR/99-appendix.md"
      - name: Commit and push changes to apache/ozone-site
        if: env.hashes_differ == 'true'
        env:
          GH_TOKEN: ${{ secrets.OZONE_WEBSITE_BUILD }}
        run: |
          cd ozone-site
          git config --global user.name 'Github Actions'
          git config --global user.email '[email protected]'
          # The branch must exist in the ozone-site clone before it can be pushed.
          git checkout -b config-doc-update-from-$GITHUB_SHA
          git add "$TARGET_DIR/99-appendix.md"
          git commit -m "[Auto] Update configurations.md page from ozone $GITHUB_SHA"
          git push origin config-doc-update-from-$GITHUB_SHA
      - name: Create Pull Request in apache/ozone-site
        if: env.hashes_differ == 'true'
        env:
          GH_TOKEN: ${{ secrets.OZONE_WEBSITE_BUILD }}
          WORKFLOW_NAME: ${{ github.workflow }}
          WORKFLOW_RUN_ID: ${{ github.run_id }}
          BRANCH_NAME: ${{ github.ref_name }}
          JIRA_ID: ${{ env.jira_id }}
        run: |
          cd ozone-site
          echo "## What changes were proposed in this pull request?" > pr_body1.txt
          echo "This is an automated pull request triggered by the [$WORKFLOW_NAME](https://github.com/apache/ozone/actions/runs/$WORKFLOW_RUN_ID) workflow run from [$BRANCH_NAME](https://github.com/apache/ozone/tree/$BRANCH_NAME). Please delete the [config-doc-update-from-$GITHUB_SHA](https://github.com/apache/ozone-site/tree/config-doc-update-from-$GITHUB_SHA) branch after this PR is merged or closed." >> pr_body1.txt
          echo "" >> pr_body1.txt
          echo "## What is the link to the Apache JIRA?" >> pr_body1.txt
          echo "[$JIRA_ID](https://issues.apache.org/jira/browse/$JIRA_ID)" >> pr_body1.txt
          echo "" >> pr_body1.txt
          echo "## How was this patch tested?" >> pr_body1.txt
          echo "Reviewers should manually verify the correctness of this change." >> pr_body1.txt

          gh pr create --base HDDS-9225-website-v2 --head config-doc-update-from-$GITHUB_SHA \
            --title "$JIRA_ID. Update Configurations.md page with changes from $JIRA_ID" \
            --body-file pr_body1.txt
  compile:
    needs:
      - build-info
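
Review note on the "Extract JIRA ID" step: `sed -E 's/.*(HDDS-[[:digit:]]+).*/\1/'` prints the line unchanged when nothing matches, so `JIRA_ID` could end up holding a whole commit subject (or be empty when the log range is empty). The sketch below shows a stricter variant of the same extraction in Python. It is only an illustration: `extract_jira_id` is a hypothetical helper that is not part of this PR, and it assumes the input is the text printed by `git log --oneline`.

```python
import re
from typing import Optional

# Same shape as the sed pattern in the workflow: "HDDS-" followed by digits.
JIRA_PATTERN = re.compile(r"HDDS-[0-9]+")


def extract_jira_id(oneline_log: str) -> Optional[str]:
    """Return the JIRA ID from the oldest commit line, or None if absent.

    `git log --oneline` prints newest commits first, so the last non-empty
    line is the first commit of the branch (what `tail -1` selects).
    """
    lines = [line for line in oneline_log.splitlines() if line.strip()]
    if not lines:
        return None
    matches = JIRA_PATTERN.findall(lines[-1])
    # The greedy leading `.*` in the sed expression keeps the last match
    # on the line, so mirror that here.
    return matches[-1] if matches else None


if __name__ == "__main__":
    sample = "abc1234 Fix typo\ndef5678 HDDS-11072. Publish user-facing configs"
    print(extract_jira_id(sample))
```

Returning `None` instead of echoing the unmatched line lets a caller skip PR creation, or fall back to a fixed title, rather than embed an arbitrary commit subject in the PR title and JIRA link.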
131 changes: 131 additions & 0 deletions dev-support/ci/xml_to_md.py
@@ -0,0 +1,131 @@
#!/usr/bin/env python3
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Python file to convert XML properties into Markdown
import os
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import namedtuple
from pathlib import Path
import sys

Property = namedtuple('Property', ['name', 'value', 'tag', 'description'])

def extract_xml_from_jar(jar_path, xml_filename):
    xml_files = []
    with zipfile.ZipFile(jar_path, 'r') as jar:
        for file_info in jar.infolist():
            if file_info.filename.endswith(xml_filename):
                with jar.open(file_info.filename) as xml_file:
                    xml_files.append(xml_file.read())
    return xml_files

def wrap_config_keys_in_description(description, properties):
    words = description.split()
    wrapped_words = []
    for word in words:
        clean_word = word.strip('.,()')
        if clean_word in properties:
            # Wrap only the key itself so surrounding punctuation stays outside the backticks.
            word = word.replace(clean_word, f'`{clean_word}`', 1)
        wrapped_words.append(word)
    return ' '.join(wrapped_words)

def parse_xml_file(xml_content, properties):
    root = ET.fromstring(xml_content)
    for prop in root.findall('property'):
        name = prop.findtext('name')
        if not name:
            raise ValueError("Property 'name' is required but missing in XML.")
        description = prop.findtext('description', '')
        if not description:
            raise ValueError(f"Property '{name}' is missing a description.")
        # <tag> is optional; default to '' so that split() is never called on None.
        tag = prop.findtext('tag', '')
        properties[name] = Property(
            name = name,
            value = prop.findtext('value', ''),
            # Split on commas and strip, so "A,B" and "A, B" are treated alike.
            tag = '<br/>'.join(f'`{t.strip()}`' for t in tag.split(',') if t.strip()),
            description = wrap_config_keys_in_description(
                ' '.join(description.split()).strip(),
                properties
            )
        )


def generate_markdown(properties):
    markdown = f"""---

sidebar_label: Appendix
---

Configuration Key Appendix
==========================

This page provides a comprehensive overview of all configuration keys available in Ozone.

Configuration Keys
------------------

| **Name** | **Value** | **Tags** | **Description** |
|-|-|-|-|
"""

    for prop in sorted(properties.values(), key=lambda p: p.name):
        markdown += f"| `{prop.name}` | {prop.value} | {prop.tag} | {prop.description} |\n"
    return markdown

def main():
    if len(sys.argv) < 2 or len(sys.argv) > 3:
        print("Usage: python3 xml_to_md.py <base_path> [<output_path>]")
        sys.exit(1)

    base_path = sys.argv[1]
    output_path = sys.argv[2] if len(sys.argv) == 3 else None

    # Find ozone SNAPSHOT directory dynamically using regex
    snapshot_dir = next(
        (os.path.join(base_path, d) for d in os.listdir(base_path) if re.match(r'ozone-[\d.]+\d-SNAPSHOT', d)),
        None
    )

    if not snapshot_dir:
        raise ValueError("Snapshot directory not found in the specified base path.")

    extract_path = os.path.join(snapshot_dir, 'share', 'ozone', 'lib')
    xml_filename = 'ozone-default-generated.xml'

    property_map = {}
    for file_name in os.listdir(extract_path):
        if file_name.endswith('.jar'):
            jar_path = os.path.join(extract_path, file_name)
            xml_contents = extract_xml_from_jar(jar_path, xml_filename)
            for xml_content in xml_contents:
                parse_xml_file(xml_content, property_map)

    markdown_content = generate_markdown(property_map)

    if output_path:
        output_path = Path(output_path)
        output_path.parent.mkdir(parents=True, exist_ok=True)
        with output_path.open('w', encoding='utf-8') as file:
            file.write(markdown_content)
    else:
        print(markdown_content)

if __name__ == '__main__':
    main()
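
Review note on the table output: `generate_markdown` interpolates values and descriptions into the table unescaped, so a literal `|` in a default value or description would split the row into extra cells. Below is a minimal sketch of one way to guard against that. It is an assumption-laden illustration, not the script's behavior: `escape_cell` and `rows_from_xml` are hypothetical helpers, and `ozone.example.key` is a made-up property used only as sample input.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; the property below does not exist in Ozone.
SAMPLE_XML = """<configuration>
  <property>
    <name>ozone.example.key</name>
    <value>a|b</value>
    <tag>OZONE, EXAMPLE</tag>
    <description>Hypothetical key used only
      for this sketch.</description>
  </property>
</configuration>"""


def escape_cell(text):
    """Escape characters that would end a Markdown table cell early."""
    return text.replace("|", "\\|").replace("\n", " ")


def rows_from_xml(xml_content):
    """Build one Markdown table row per <property> element."""
    root = ET.fromstring(xml_content)
    rows = []
    for prop in root.findall("property"):
        name = prop.findtext("name", "")
        value = escape_cell(prop.findtext("value", ""))
        # A missing <tag> yields an empty cell instead of raising.
        tags = [t.strip() for t in prop.findtext("tag", "").split(",") if t.strip()]
        tag_cell = "<br/>".join(f"`{t}`" for t in tags)
        # Collapse runs of whitespace in the description before escaping.
        description = escape_cell(" ".join(prop.findtext("description", "").split()))
        rows.append(f"| `{name}` | {value} | {tag_cell} | {description} |")
    return rows


if __name__ == "__main__":
    for row in rows_from_xml(SAMPLE_XML):
        print(row)
```

Escaping at row-building time keeps the stored property values untouched, so the same parsed data could still feed another output format later.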
1 change: 1 addition & 0 deletions dev-support/rat/rat-exclusions.txt
@@ -51,6 +51,7 @@ src/test/resources/test.db.ini
# hadoop-hdds/docs
**/themes/ozonedoc/**
static/slides/*
content/tools/Configurations.md

# hadoop-ozone/dist
**/.ssh/id_rsa*