adding support for large yaml files as input #409

Closed

ElaiShalevRH
Collaborator

This change adds a build-time parameter to the serverless workflow Dockerfile that raises the size limit on YAML input files passed to a workflow. It comes in handy when we use large OpenAPI spec files.
The change has been tested with a custom image built from this suggested change, and we validated that the workflow does accept the large input.

A reproducer for this fix can be seen here.

For full disclosure, using large input files with this parameter may slow down workflow execution.

@ElaiShalevRH
Collaborator Author

@masayag

@@ -18,7 +18,7 @@ ARG QUARKUS_EXTENSIONS=org.kie:kogito-addons-quarkus-jobs-knative-eventing:9.100

# Additional java/mvn arguments to pass to the builder.
# This are is conventient to pass sonataflow and quarkus build time properties.
-ARG MAVEN_ARGS_APPEND="-Dkogito.persistence.type=jdbc -Dquarkus.datasource.db-kind=postgresql -Dkogito.persistence.proto.marshaller=false"
+ARG MAVEN_ARGS_APPEND="-DmaxYamlCodePoints=99999999 -Dkogito.persistence.type=jdbc -Dquarkus.datasource.db-kind=postgresql -Dkogito.persistence.proto.marshaller=false"
Collaborator

What size does this value represent? Could there be a need for a higher value?

Please also make sure to update https://www.parodos.dev/blog/extracting-openapi-documents/ to include this option.

Collaborator Author

The codePoints param refers to Unicode code points, which is basically the number of characters in the input file (as we're not using special chars like emojis in the YAML input files).
The maximum code point limit is the integer max, 2^31-1.
The value 99999999 can represent an input file of around 190MB.
I can bump it to 99999999, which is closer to the max, or to 2^31-1, of course.
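
To see how close a given input file comes to the limit, its code points can be counted directly. The following is a minimal sketch in Python, assuming the limit is compared against the number of Unicode code points in the decoded file and that the file is UTF-8 encoded. The helper name `count_code_points` is hypothetical and not part of this PR.

```python
from pathlib import Path


def count_code_points(path, encoding="utf-8"):
    """Return (code_points, size_in_bytes) for a text file."""
    raw = Path(path).read_bytes()
    text = raw.decode(encoding)
    # len() of a Python str counts Unicode code points, not bytes.
    return len(text), len(raw)
```

Under these assumptions, an ASCII-only file has exactly as many bytes on disk as code points, while non-ASCII characters take more than one byte each in UTF-8, so the byte count can exceed the code point count.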

Collaborator

No need for such a high value, but please include a comment in the Dockerfile for this property and the size its current value represents.

Collaborator Author

Sure. Is there a preferred benchmark for input size? 100MB?
For reference, the full github-openapi spec is 10MB.

Collaborator

OCP openapi-v2-4.14.yaml is 23M, let's go with 30M.
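
Turning the agreed size into a `maxYamlCodePoints` value is simple arithmetic if the input files are ASCII-only UTF-8 (one byte per code point), which is an assumption here. The thread does not say whether "30M" means 30 MiB or 30 MB, so this sketch shows both. `code_points_for_size` is a hypothetical helper, not something defined in the repository.

```python
def code_points_for_size(megabytes, binary=True):
    """Code points needed for an ASCII-only file of the given size.

    binary=True treats the unit as MiB (1024 * 1024 bytes),
    binary=False treats it as MB (1,000,000 bytes).
    """
    unit = 1024 * 1024 if binary else 1_000_000
    return megabytes * unit


print(code_points_for_size(30))         # 31457280
print(code_points_for_size(30, False))  # 30000000
```

Under the MiB reading, the Dockerfile argument would become `-DmaxYamlCodePoints=31457280`, which leaves headroom above the 23M OCP spec mentioned above.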
