Long (ever?) running process when combining FETCH_LICENSE=true and PREFER_MAVEN_DEPS_TREE=true #1264

Closed
heubeck opened this issue Jul 23, 2024 · 13 comments · Fixed by #1266

Comments

@heubeck
Contributor

heubeck commented Jul 23, 2024

Running cdxgen against my simple test application https://github.com/heubeck/examiner makes it run forever when both FETCH_LICENSE=true and PREFER_MAVEN_DEPS_TREE=true are set.

I have seen this on other projects as well, even before using PREFER_MAVEN_DEPS_TREE=true, but now it's reproducible.

cdxgen version: 10.8.1

As you can see in the screenshot, I aborted the last command after 27 minutes. During the whole runtime, the node process consumed 100% of a single CPU core.

image

@setchy
Member

setchy commented Jul 23, 2024

How interesting... I've been having issues at my org with Maven repositories for the past few months. We use cdxgen in server mode, and the BOM processes that used to work flawlessly are now timing out and/or hanging. I've been busy with other deliverables, so I haven't had time to triage what could be contributing to the stalling.

@phoenix-aditya

Hey,
I'm getting the same issue, and I think this was discussed here as well: #1107 (comment).

@prabhu any insights into this?

@heubeck
Contributor Author

heubeck commented Jul 23, 2024

Some debug information.

pstree:
image

debug log:
debug.log

@prabhu
Collaborator

prabhu commented Jul 23, 2024

How does it look with the latest version and the master? Lots of fixes have gone in since 10.8.1

@heubeck
Contributor Author

heubeck commented Jul 23, 2024

> How does it look with the latest version and the master? Lots of fixes have gone in since 10.8.1

Pardon me, I detected it in the latest cdxgen but did not update locally before reproducing.

Exact same behavior with cdxgen 10.8.7.

@prabhu
Collaborator

prabhu commented Jul 23, 2024

Found the bug. The call const validateIriResult = validateIri(iri, IriValidationStrategy.Strict); never finishes for some URLs. There must be a ReDoS bug in this line: https://github.com/comunica/validate-iri.js/blob/b4086cd7a2905a3a622dc2071ff6b46de3a40ea2/lib/Validate.ts#L61

Example:

https://github.com/apache/maven-resolver/tree/${project.scm.tag}

Using IriValidationStrategy.Pragmatic makes it finish sooner, so it must be the presence of ${.
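
One way to picture a guard against this kind of hang is to check for unresolved Maven property placeholders before handing the URL to a strict validator. The following is only a minimal sketch and not the change made in #1266: `hasUnresolvedPlaceholder` and `validateUrlSafely` are hypothetical names, and the validator is passed in as a plain function (here a stand-in built on Node's `URL` class) so the example does not assume anything about the validate-iri API.

```javascript
// Hypothetical helper: detects Maven-style property placeholders such as
// ${project.scm.tag} that were never interpolated in a POM.
function hasUnresolvedPlaceholder(url) {
  return typeof url === "string" && url.includes("${");
}

// Hypothetical wrapper: only calls the (possibly expensive) strict validator
// when the URL contains no unresolved placeholder.
// `strictValidate` is any function that returns true for a valid IRI.
function validateUrlSafely(url, strictValidate) {
  if (typeof url !== "string" || hasUnresolvedPlaceholder(url)) {
    return false; // treat as invalid without running the validator
  }
  return strictValidate(url);
}

// Stand-in validator based on the built-in URL parser (an assumption for
// this sketch, not what cdxgen uses).
const standInValidator = (u) => {
  try {
    new URL(u);
    return true;
  } catch {
    return false;
  }
};

const results = [
  "https://github.com/apache/maven-resolver/tree/${project.scm.tag}",
  "https://github.com/apache/maven-resolver",
].map((u) => validateUrlSafely(u, standInValidator));
console.log(results);
```

The cheap `includes` check runs in linear time, so a URL with a leftover `${` never reaches the regular expression that appears to backtrack.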

@prabhu prabhu mentioned this issue Jul 23, 2024
@prabhu
Collaborator

prabhu commented Jul 23, 2024

Please test with the branch from #1266.

@heubeck
Contributor Author

heubeck commented Jul 23, 2024

awesome! works for me now, thx @prabhu.

There are quite a few errors in the debug.log, such as:

Querying https://repo1.maven.org/maven2/ for 'io.github.heubeck/examiner@1.12.19' https://repo1.maven.org/maven2/io/github/heubeck/examiner/1.12.19/examiner-1.12.19.pom
An error occurred when trying to fetch metadata [object Object] TypeError: Cannot read properties of undefined (reading 'length')
    at charAt (/var/home/heubeck/w/cdxgen/node_modules/sax/lib/sax.js:960:19)
    at SAXParser.write (/var/home/heubeck/w/cdxgen/node_modules/sax/lib/sax.js:984:11)
    at module.exports (/var/home/heubeck/w/cdxgen/node_modules/xml-js/lib/xml2js.js:346:12)
    at fetchPomXmlAsJson (file:///var/home/heubeck/w/cdxgen/utils.js:3253:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async getMvnMetadata (file:///var/home/heubeck/w/cdxgen/utils.js:3190:24)
    at async createJarBom (file:///var/home/heubeck/w/cdxgen/index.js:1165:17)

just in case that's relevant ;)
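
The trace suggests the XML parser was handed an undefined value (the `charAt` frame fails reading `length`). As a rough sketch of how such a call could be guarded, assuming the fetched body may be missing: `parsePomBody` is a hypothetical helper, and `parseXml` stands in for whatever XML-to-JSON function is in use. This is not taken from the cdxgen source.

```javascript
// Hypothetical helper: parse a fetched POM body only when there is one.
// `parseXml` stands in for the real XML-to-JSON function.
function parsePomBody(body, parseXml) {
  if (typeof body !== "string" || body.trim() === "") {
    return undefined; // nothing to parse, e.g. a failed or empty response
  }
  try {
    return parseXml(body);
  } catch (err) {
    // Malformed XML should not abort the whole BOM generation.
    return undefined;
  }
}

// Stand-in parser that pulls <version> out of the XML; illustration only.
const standInParser = (xml) => {
  const match = /<version>([^<]*)<\/version>/.exec(xml);
  if (!match) throw new Error("no version element");
  return { version: match[1] };
};

const parsed = parsePomBody(
  "<project><version>1.12.19</version></project>",
  standInParser,
);
const skipped = parsePomBody(undefined, standInParser);
console.log(parsed, skipped);
```

Returning `undefined` for a missing body lets the caller skip that component's metadata instead of logging a TypeError from deep inside the parser.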

@prabhu
Collaborator

prabhu commented Jul 23, 2024

@heubeck could you retest with the latest?

@heubeck
Contributor Author

heubeck commented Jul 23, 2024

no more errors in the debug output. thx for the quick resolution... as always ;)

@setchy
Member

setchy commented Jul 23, 2024

Happy to report our maven workloads are working better than they have in recent months after upgrading to v10.8.8. Thank you @prabhu 🙇

@prabhu
Collaborator

prabhu commented Jul 23, 2024

Thank you for your confirmation and support!

@prabhu
Collaborator

prabhu commented Jul 24, 2024

Related: comunica/validate-iri.js#19
