Automatic PURL generation from existing CPE or CSAF #331

Closed
adulau opened this issue Oct 4, 2024 · 8 comments

@adulau

adulau commented Oct 4, 2024

We wanted to add an automatic PURL generation from existing CPE or CSAF in vulnerability-lookup. Is there a library for doing this?

@bureado

bureado commented Oct 5, 2024

@adulau
Author

adulau commented Oct 8, 2024

Thanks a lot @bureado for the exhaustive list. Just wondering about the direction: the mappings mainly go from purl to CPE, not the reverse. Is there a specific reason for that?

@bureado

bureado commented Oct 14, 2024

@adulau for both projects I referenced above, it should be possible to get an array of purls given a CPE. But the question you bring up is, I think, very central to the broader problem.

Here's an example: the software entity that humans know as "nginx", more precisely the web server typically called "nginx", is recognized in scanoss/purl2cpe as cpe:2.3:a:nginx:nginx:*:*:*:*:*:*:*:* which has a purls[].sizeof == 6, including pkg:deb/debian/nginx, pkg:github/nginx/nginx, and more.
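
To make the one-to-many shape concrete, here is a minimal Python sketch of such a lookup. The `CPE_TO_PURLS` table and the `purls_for_cpe` helper are hypothetical names used only for illustration, not the scanoss/purl2cpe data format or API, and the table lists only the two purls named above rather than all six.

```python
# Hypothetical, hand-written excerpt of a CPE -> purl mapping.
# Only the two purls named in the comment above are listed here;
# the real entry is said to contain six.
CPE_TO_PURLS = {
    "cpe:2.3:a:nginx:nginx:*:*:*:*:*:*:*:*": [
        "pkg:deb/debian/nginx",
        "pkg:github/nginx/nginx",
    ],
}


def purls_for_cpe(cpe: str) -> list[str]:
    """Return every purl recorded for a CPE, or an empty list if unknown."""
    return list(CPE_TO_PURLS.get(cpe, []))


# One CPE yields several candidate purls; picking the right one still
# depends on context (distro, source vs. binary, and so on).
candidates = purls_for_cpe("cpe:2.3:a:nginx:nginx:*:*:*:*:*:*:*:*")
print(candidates)
```

The lookup returns a list rather than a single value on purpose: the caller still has to decide which ecosystem it cares about.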

We also know that binary nginx can be any of https://repology.org/project/nginx/packages in various *nixes and package managers, and that source nginx also exists, for which pkg:github/nginx/nginx is a good first hint but lacks specificity: Which branch/tag? Which commit reference? Is GitHub truly the remote that distros use when they build their binaries?

Note that once you've bridged to the source world, searches like https://whatsrc.org/search?q=nginx show how the problem gets a bit more complicated, but they also give a more complete picture. The picture gets even more complete with things like SWHIDs, and a couple more projects that I'd like to add at this time:

  1. https://github.com/spice-labs-inc/goatrodeo
  2. https://github.com/kpcyrd/what-the-src

Plus an approach that uses WikiData's ontology:

  1. Some possible data sources to identify package managers, build systems and compilers (build toolchain) ossf/wg-securing-critical-projects#41
  2. Aggregation of distro and pkg data sets to create a searchable DB ossf/package-feeds#203 (comment)

I'm eager to collaborate in this space; sorry my comments have distracted from your original question, but I truly think there are a few efforts that could potentially cross-pollinate towards this goal. /cc @kpcyrd @dpp and others.

@andrewpollock
Contributor

Greetings! From my adventures in this space, particularly across project boundaries, I've found it's rather important to level-set on the language of CPE, because it means different things to different people and some of it's dependent on what you're actually trying to achieve.

From CVEProject/quality-workgroup#12 (comment), there is a good enumeration (no pun intended) of the various different "types" of CPEs and how they're able to be used.

So, @adulau when you say:

> We wanted to add an automatic PURL generation from existing CPE or CSAF in vulnerability-lookup

stepping back for a moment, could you drop in a few worked examples of the desired inputs and outputs?

@pombredanne
Member

@adulau 👋 re:

> We wanted to add an automatic PURL generation from existing CPE or CSAF in vulnerability-lookup. Is there a library for doing this?

  • @andrewpollock's point makes sense. IMHO, the only things that could be sanely mapped to PURL are CPE names. Configurations and wildcards are problematic. And the version ranges could be expressed as VERS too, in a more straightforward way.

  • also @bureado pointed to https://github.com/aboutcode-org/vulnerablecode-purl2cpe : this has been kept current; this is based on automatically traversing the graph of relationships between PURLs and CVEs in vulnerablecode and has been highly unreliable

  • since CPE names are given names, anything but a hand-made mapping will be unreliable, with the caveat that it is not one to one: as @bureado pointed out, one CPE can point to multiple PURLs.
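
As a rough illustration of the "names versus wildcards" distinction, the Python sketch below splits a CPE 2.3 formatted string into its attributes and checks whether the part, vendor and product are concrete values. The helper names and the "mappable" rule itself are assumptions made for this example, one possible reading of the point above and not an agreed definition. Escaping is handled only for the simple `\:` case.

```python
import re

# Attribute order of a CPE 2.3 formatted string, after the "cpe:2.3:" prefix.
CPE23_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(cpe: str) -> dict[str, str]:
    """Split a CPE 2.3 formatted string into its 11 named attributes."""
    # Split on colons that are not preceded by a backslash escape.
    pieces = re.split(r"(?<!\\):", cpe)
    if len(pieces) != 13 or pieces[0] != "cpe" or pieces[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE23_FIELDS, pieces[2:]))


def _is_concrete(value: str) -> bool:
    """True if the value is not '-' and holds no unescaped '*' or '?'."""
    return value != "-" and re.search(r"(?<!\\)[*?]", value) is None


def is_mappable(cpe: str) -> bool:
    """Assumed heuristic: part, vendor and product must all be concrete."""
    attrs = parse_cpe23(cpe)
    return all(_is_concrete(attrs[key]) for key in ("part", "vendor", "product"))


print(is_mappable("cpe:2.3:a:nginx:nginx:*:*:*:*:*:*:*:*"))
print(is_mappable("cpe:2.3:a:*:nginx:*:*:*:*:*:*:*:*"))
```

A check like this only filters out inputs that cannot be looked up by name; it does not produce a purl, which still requires a curated mapping.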

Any automatic PURL generation from existing CPE is IMHO a lost cause (unless we add a cpe PURL type, but that would just move the problem without resolving it). It needs a proper mapping. And a library could then use the mapping.

It makes sense to maintain here a reference, open PURL <-> CPE mapping, and I see it being a useful community asset. But it needs to be curated by humans to have any value.

Or quoting @andrewpollock we need this:

https://www.first.org/resources/papers/vulncon2024/VulnCon-The-Trials-and-Tribulations-of-Bulk-Converting-CVEs-to-OSV-Pollock.pdf#page=38
Andrew’s Data Quality Wishlist
....
Comprehensive, open and free mapping between CPEs, Purls and canonical Git repositories

( @andrewpollock side note for the "Purls and canonical Git repositories" part, I have some WIP with PurlDB towards publishing and unlocking all that data)

@andrewpollock
Contributor

> ( @andrewpollock side note for the "Purls and canonical Git repositories" part, I have some WIP with PurlDB towards publishing and unlocking all that data)

Ooh please keep me posted!

@adulau
Author

adulau commented Oct 21, 2024

Thanks to all for the detailed feedback.

Quick question following @pombredanne's feedback.

> Any automatic PURL generation from existing CPE is IMHO a lost cause (unless we add a cpe PURL type, but that would just move the problem without resolving it). It needs a proper mapping. And a library could then use the mapping.

What is the exact definition of "proper mapping"? We discussed internally going further by providing an improved directory of combined CPEs, in particular one that allows vendors and products to have aliases and that follows names over time, since software and vendor names change regularly. It also seems that purl cannot reference software that has no package.
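
A minimal Python sketch of the alias idea, to show what such a directory lookup might look like. The `ALIASES` table, the `canonical_name` helper and the vendor and product names in it are all made up for illustration; none of this exists in vulnerability-lookup today.

```python
# Hypothetical alias table: each known (vendor, product) spelling maps to
# one canonical pair. The names below are invented for this example.
ALIASES = {
    ("example_corp", "widget_server"): ("example_inc", "widget"),
    ("example_inc", "widget_server"): ("example_inc", "widget"),
}


def canonical_name(vendor: str, product: str) -> tuple[str, str]:
    """Resolve a (vendor, product) pair to its canonical form.

    Unknown pairs are returned lowercased and otherwise unchanged.
    """
    key = (vendor.lower(), product.lower())
    return ALIASES.get(key, key)


# An old vendor name and the current one resolve to the same entry.
print(canonical_name("Example_Corp", "widget_server"))
print(canonical_name("example_inc", "widget"))
```

Resolving aliases before any CPE to PURL lookup keeps renamed vendors and products pointing at a single directory entry.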

We will work on improving the actual CPE directory and see how we link the associated packages to PURL in vulnerability-lookup.

I'll close the issue for the time being until we implement the PURL part in vulnerability-lookup.

@pombredanne
Member

@adulau FYI, I have been invited by a CVE.org quality working group to present PURL sometime in January 2025, as they would be interested in having it as a main id in the next CVE schema. To be continued...
