clp-package: Package building platform's /etc/os-release and determine execution_container from the file. #322

Merged
merged 7 commits into from
Mar 12, 2024

Conversation

junhaoliao
Member

@junhaoliao junhaoliao commented Mar 10, 2024

References

Internally, we found that when the CLP package is built in the container clp-core-dependencies-x86-ubuntu-jammy:main, running the package start script (entry point <clp-package>/sbin/start-clp.sh) fails with missing Python dependencies. Further investigation showed that the failures stem from a Python version mismatch between the virtual environment in the built package and the execution container. Previously, users could work around such a mismatch between the build environment and the execution container by specifying a matching execution_container in clp-config.yml. However, that configuration field is not publicly documented, nor is its use encouraged. Instead, the CLP package should include information about the build platform and use it to determine which Docker image to use as the execution_container.

This PR depends on #321, which adds a GitHub workflow to build and push the Ubuntu Jammy execution container to the GitHub Docker Registry.

Description

  1. Package the build platform's /etc/os-release.
  2. In the package start script, determine the execution_container name from the <clp-package>/etc/os-release file.
  3. Add /etc/os-release to the Taskfile's build-platform-dependent tasks to trigger a rebuild when the build container changes.
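
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: the function names parse_os_release and derive_execution_container are hypothetical, and the image naming pattern (clp-execution-x86-ubuntu-<codename>:main) is an assumption inferred from the image names that appear in the validation output in this PR.

```python
def parse_os_release(text: str) -> dict:
    """Parse KEY=VALUE lines from os-release content into a dict."""
    parsed = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        var, _, val = line.partition("=")
        parsed[var] = val.strip('"')  # values may be double-quoted
    return parsed


def derive_execution_container(parsed: dict) -> str:
    """Map the build platform to an execution container image (assumed naming)."""
    name = "ghcr.io/y-scope/clp/"
    if "ubuntu" == parsed["ID"]:
        name += f"clp-execution-x86-ubuntu-{parsed['VERSION_CODENAME']}:main"
    else:
        # Only Ubuntu is handled in this sketch.
        raise NotImplementedError(f"Unsupported OS ID in os-release: {parsed['ID']}")
    return name


# Sample content shaped like an Ubuntu Focal /etc/os-release (abbreviated).
sample = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="20.04"\nVERSION_CODENAME=focal\n'
print(derive_execution_container(parse_os_release(sample)))
```

Raising NotImplementedError for any other ID keeps the unsupported case loud instead of silently picking a mismatched image.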

Validation performed

Sanity test with Ubuntu Focal Built Package

  1. Loaded container clp-core-dependencies-x86-ubuntu-focal:main and ran task from the project root to build the package.
  2. Started the package via <clp-package>/sbin/start-clp.sh.
  3. Ran docker ps and observed that all CLP component containers were created from the image ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-focal:main.
    CONTAINER ID   IMAGE                                                     COMMAND                  CREATED          STATUS          PORTS                                                                      NAMES
    9724137d38bd   ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-focal:main   "/opt/clp/bin/node /…"   47 seconds ago   Up 46 seconds                                                                              clp-webui-2d7d
    68a64a3bcc19   ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-focal:main   "python3 /opt/clp/li…"   47 seconds ago   Up 46 seconds                                                                              clp-search_worker-2d7d
    13ef53be52ba   ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-focal:main   "python3 /opt/clp/li…"   47 seconds ago   Up 46 seconds                                                                              clp-compression_worker-2d7d
    613fc5982a70   ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-focal:main   "python3 -u -m job_o…"   48 seconds ago   Up 47 seconds                                                                              clp-search_scheduler-2d7d
    9ff6bf1c2e70   ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-focal:main   "python3 -u -m job_o…"   48 seconds ago   Up 47 seconds                                                                              clp-compression_scheduler-2d7d
    ...
    

Sanity test with Custom execution_container

  1. Loaded container clp-core-dependencies-x86-ubuntu-focal:main and ran task from the project root to build the package.
  2. Configured <clp-package>/etc/clp-config.yml to include a string field execution_container with the value clp-execution-x86-ubuntu-focal:dev, e.g.:
    execution_container: "clp-execution-x86-ubuntu-focal:dev"
    
  3. Started the package via <clp-package>/sbin/start-clp.sh.
  4. Ran docker ps and observed that all CLP component containers were created from the image clp-execution-x86-ubuntu-focal:dev.
    CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS          PORTS                                                                      NAMES
    97df417c1b10   clp-execution-x86-ubuntu-focal:dev   "/opt/clp/bin/node /…"   7 seconds ago    Up 6 seconds                                                                               clp-webui-341d
    98fe1d416b7b   clp-execution-x86-ubuntu-focal:dev   "python3 /opt/clp/li…"   7 seconds ago    Up 7 seconds                                                                               clp-search_worker-341d
    e48ccca3203f   clp-execution-x86-ubuntu-focal:dev   "python3 /opt/clp/li…"   7 seconds ago    Up 7 seconds                                                                               clp-compression_worker-341d
    944fbeddf22a   clp-execution-x86-ubuntu-focal:dev   "python3 -u -m job_o…"   7 seconds ago    Up 7 seconds                                                                               clp-search_scheduler-341d
    f6f86b150e8c   clp-execution-x86-ubuntu-focal:dev   "python3 -u -m job_o…"   8 seconds ago    Up 7 seconds                                                                               clp-compression_scheduler-341d
    

Test with Ubuntu Jammy Built Package

  1. Loaded container clp-core-dependencies-x86-ubuntu-jammy:main and ran task from the project root to build the package.
  2. Started the package via <clp-package>/sbin/start-clp.sh.
  3. (TODO: re-validate once PR #321, "gh-actions: Refactor execution container generation GH workflow; Add workflow for Ubuntu Jammy.", is merged.) Run docker ps and confirm that all CLP component containers were created from the image ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-jammy:main.

Contributor

@haiqi96 haiqi96 left a comment

In general, the change makes sense to me. It's not entirely clear to me how we will support custom container names and container tags, but I assume you have a plan for it?

            parsed[var] = val

    self.execution_container = "ghcr.io/y-scope/clp/"
    if "ubuntu" == parsed["ID"]:
Contributor

Can we store parsed["ID"] in a local variable?

Member Author

When I drafted the code, I considered extracting those into variables parsed_id and parsed_version_id / parsed_version_codename, and, even better, breaking out of the for-loop once all the info we need has been parsed. However, that might be less clean and less extensible if we (ever) plan to support CentOS / RHEL in the future, because VERSION_ID is common across all Linux distributions, but VERSION_CODENAME is specific to Ubuntu / Debian distributions.

Member

Should we add Debian into the if condition?

Member Author

I thought about adding Debian and RHEL distributions here, but that does not seem meaningful until we add build support and, most importantly, execution container support in tools/docker-images. For now, I think raising a NotImplementedError here is a good way to notify users.

    with open(self.os_release_file_path) as os_release_file:
        parsed = {}
        for line in os_release_file:
            var, val = line.strip().split("=")
Contributor

Technically, we could add a guard to ensure split returns exactly two items, but since the os-release file should have a relatively fixed format, this code should be fine.

Member Author

Right, I was thinking about adding try ... except blocks to handle such errors and FileNotFoundError, but since users are not expected to modify the <clp-package>/etc/os-release file, such handling could be redundant. In any case, users will be notified by the exception's stack trace.
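
For reference, the guard discussed in this thread could look roughly like the sketch below. It is not part of the PR; parse_os_release_strict is a hypothetical name. It tolerates blank lines and comments, uses maxsplit=1 so that an "=" inside a value does not break unpacking, and raises a descriptive ValueError for anything that is not KEY=VALUE.

```python
def parse_os_release_strict(lines):
    """Parse os-release lines, failing loudly on malformed entries."""
    parsed = {}
    for line_num, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no assignment
        parts = line.split("=", 1)  # maxsplit=1 keeps any '=' inside the value
        if len(parts) != 2:
            raise ValueError(f"os-release line {line_num} is not KEY=VALUE: {line!r}")
        var, val = parts
        parsed[var] = val.strip('"')
    return parsed


# Illustrative input only; the URL is a placeholder.
example = ["ID=ubuntu", "", "# a comment", 'HOME_URL="https://example.com/?a=b"']
print(parse_os_release_strict(example))
```

Without maxsplit=1, the HOME_URL line in this example would split into three parts and the two-name unpacking in the snippet under review would raise ValueError.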

@@ -355,6 +359,27 @@ def validate_logs_dir(self):
        except ValueError as ex:
            raise ValueError(f"logs_directory is invalid: {ex}")

    def load_execution_container_name(self):
        if self.execution_container is not None:
Contributor

Just as a sanity check, have you tested and made sure that the container specified in clp-config.yml is properly picked up?

Member Author

Yes. I have updated the "Validation" section in the PR description to include the related instructions.

@junhaoliao
Member Author

junhaoliao commented Mar 11, 2024

@haiqi96

It's not entirely clear to me how we will support custom container names and container tags

Users can configure a string field execution_container in clp-config.yml to specify a custom container image, just as they could previously.
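
A rough sketch of that precedence, assuming the early-return shape suggested by the load_execution_container_name snippet reviewed above: ConfigSketch is a hypothetical stand-in for the real config class, and the os_id and codename parameters replace the os-release parsing for brevity.

```python
from typing import Optional


class ConfigSketch:
    """Hypothetical stand-in for the package config; not the real class."""

    def __init__(self, execution_container: Optional[str] = None):
        # None means the user did not set execution_container in clp-config.yml.
        self.execution_container = execution_container

    def load_execution_container_name(self, os_id: str, codename: str) -> None:
        if self.execution_container is not None:
            return  # a user-configured image takes precedence
        if "ubuntu" != os_id:
            raise NotImplementedError(f"Unsupported OS ID: {os_id}")
        # Assumed naming pattern, inferred from the image names in this PR.
        self.execution_container = (
            f"ghcr.io/y-scope/clp/clp-execution-x86-ubuntu-{codename}:main"
        )


custom = ConfigSketch("clp-execution-x86-ubuntu-focal:dev")
custom.load_execution_container_name("ubuntu", "focal")
print(custom.execution_container)

default = ConfigSketch()
default.load_execution_container_name("ubuntu", "focal")
print(default.execution_container)
```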

Member

@kirkrodrigues kirkrodrigues left a comment

How about:

clp-package: Include build-platform's /etc/os-release and determine execution_container from it. (#322)

@junhaoliao junhaoliao merged commit 8d5f1ae into y-scope:main Mar 12, 2024
1 check passed
@junhaoliao junhaoliao deleted the os-release branch March 12, 2024 16:09