
Coordinate and/or create Spark operator example use case #1867

Closed

chuckbelisle opened this issue Oct 18, 2023 · 0 comments
chuckbelisle commented Oct 18, 2023

Let's have at least one example use case for the new Spark operator feature, or even have an active project try it out in one of our demo/sandbox namespaces.

Use Strategy

The following would be the suggested workflow:

  • Users will develop their Spark applications, likely using PySpark in Jupyter notebooks (a minimal sketch of such an application follows this list).
  • These applications will be containerized and pushed to either AAW-contrib-containers or Arti -> note that we do not have Docker available on notebooks, so I'm not entirely sure what this step would look like.
  • The users will also create a SparkApplication manifest, including their image, and submit it to the cluster (see the submission sketch after the side note below).
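To illustrate the first step, here is a minimal, hypothetical PySpark application of the kind a user might develop in a notebook and later package into a container image. The application name and input path are placeholders, not part of any existing project.

```python
# example_spark_app.py - a minimal, hypothetical PySpark job that a user
# might develop in a Jupyter notebook and later containerize.
from pyspark.sql import SparkSession


def main():
    # When run through the Spark operator, spark-submit supplies the master
    # and most configuration, so the application only needs a session.
    spark = SparkSession.builder.appName("example-word-count").getOrCreate()

    # Placeholder input path; in practice this would point at object storage
    # reachable from the cluster.
    lines = spark.read.text("s3a://example-bucket/input.txt")

    # Simple word count to keep the example self-contained.
    counts = (
        lines.rdd.flatMap(lambda row: row.value.split())
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
    )
    for word, count in counts.take(10):
        print(word, count)

    spark.stop()


if __name__ == "__main__":
    main()
```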

To enable this workflow:

Side Note

Confining Pods to the Appropriate Classification

Currently our implementation is very primitive, so there is no support for pro-b or GPU nodes. The to-do is to enable the mutating webhook on the Spark operator so that tolerations can be injected onto the Spark driver pod when the operator spawns it. This would allow Spark applications to run on pro-b or GPU nodes.
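To make the toleration idea concrete, here is a sketch of submitting a SparkApplication custom resource whose driver carries a toleration for a dedicated node pool, using the Kubernetes Python client. The namespace, image, taint key, and service account are hypothetical placeholders, and the driver `tolerations` field only takes effect once the operator's mutating webhook is enabled.

```python
# Hypothetical sketch: submit a SparkApplication whose driver tolerates a
# taint on a dedicated (e.g. pro-b or GPU) node pool. Assumes the
# spark-on-k8s-operator CRD (sparkoperator.k8s.io/v1beta2) is installed and
# kubeconfig access is set up; all names and the taint key are placeholders.
from kubernetes import client, config

spark_app = {
    "apiVersion": "sparkoperator.k8s.io/v1beta2",
    "kind": "SparkApplication",
    "metadata": {"name": "example-word-count", "namespace": "sandbox-example"},
    "spec": {
        "type": "Python",
        "mode": "cluster",
        "image": "example-registry/example-spark-app:latest",  # user's image
        "mainApplicationFile": "local:///opt/spark/app/example_spark_app.py",
        "sparkVersion": "3.3.0",
        "driver": {
            "cores": 1,
            "memory": "1g",
            "serviceAccount": "spark",  # placeholder service account
            # Only honoured by the operator when its mutating webhook is on.
            "tolerations": [
                {
                    "key": "node.example.io/purpose",  # placeholder taint key
                    "operator": "Equal",
                    "value": "pro-b",
                    "effect": "NoSchedule",
                }
            ],
        },
        "executor": {"instances": 2, "cores": 1, "memory": "1g"},
    },
}

config.load_kube_config()  # or load_incluster_config() inside the cluster
client.CustomObjectsApi().create_namespaced_custom_object(
    group="sparkoperator.k8s.io",
    version="v1beta2",
    namespace="sandbox-example",
    plural="sparkapplications",
    body=spark_app,
)
```

If the webhook is left disabled, the operator silently drops pod-level fields like tolerations, which is why enabling it is listed as the to-do above.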
