Add bundle and feature package type for Eclipse p2 artifacts #272
I just wanted to note that when we talk about p2 units(!), they usually use the symbolic name as the unit id, but this is not mandatory! For features they even use unit id = featureid+ So I would suggest not talking about bundles and features, p2 has:
so the most general one would be to define it as |
That's a good point. Trying to be specific is something that'll most likely backfire sooner rather than later.
I'd still like to bring in a classifier, somehow. Just to be able to tell at first glance, whether a unit is e.g. a binary or plug-in. But that is not a required attribute and should instead be provided as an additional, optional qualifier. |
We were discussing this with @waynebeaton in https://gitlab.eclipse.org/eclipsefdn/emo-team/sbom/-/issues/4. You'll see that our proposals are very close to what you have here. However, I would like to note that PURLs are about referencing (physical) artifacts. As such, the PURL should be about units in p2 artifacts repositories i.e., artifacts, and not (installable) units from metadata repositories. |
@mbarbero thanks for the hint, would you mind continuing the discussion here (CC @waynebeaton @merks)?
If we want to go for artifacts (== p2 artifact repositories, not units; a unit can only reference artifacts), then an artifact has three mandatory attributes
Regarding repository and defaults, I'd like to add the following:
|
In the end, I think making the repository URL mandatory is quite dangerous and can lead to a whole lot of problems which I'd like to avoid. @laeubi already said in the CycloneDX discussion that it might be difficult to detect which repository an artifact originally comes from.
But in addition, this would also be problematic for proprietary software... When creating an SBOM, all artifacts would originate from e.g.
In the end, I don't think that's generally possible when working on the artifacts alone. So if I want to derive the physical location of an artifact, I need to process the p2 metadata (at least the content.xml), which works with units... |
Just to prevent confusion, the final URI has to be derived from the As an alternative one might use the derived url that would be something like Instead I think a security scanner that needs to fetch the final artifact (why?) needs to be configured with a set of artifact repositories it should use to search for the artifact key. |
You're right... a
Wouldn't the task of calculating the full url be part of the scanner, anyway? Looking at the Maven specification, it only requires the GAV, but doesn't say anything about how it's stored in a m2 repository. Meaning the scanner needs to know that they are stored under In that fashion, I would also put the burden on the tool, to figure out whether the uri points to a compound repository, a "plain" p2 repository, a Target file hosted on a Maven repository or even something completely different, rather than adding all that complexity to the specification. |
👍 |
If you use a Maven caching proxy (e.g. Sonatype Nexus), you have the same issue: the artifacts come from your proxy rather than Maven Central. Whether you put the internal reference or the public one is an SBOM tooling problem, not one to consider for the p2 PURL.
I have the exact reverse reasoning :) : given there is no central authority, the repository URL is a mandatory hint, otherwise there is no way to find where the artifacts come from. |
You cannot even know this for Maven artifacts, so why is the repository URL not mandatory there? |
For Maven artifacts, the repository URL is not mandatory because the Maven scheme's specification states that the default repository is https://repo.maven.apache.org/maven2.
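To illustrate the Maven case raised in this thread: the purl carries only the coordinates, and the consuming tool maps them onto the repository layout itself. The following is a minimal sketch, assuming the standard Maven 2 repository layout and the default repository quoted above; the function name `maven_artifact_url` and its parameters are hypothetical and not part of any specification.

```python
# Default repository named by the Maven purl type definition.
DEFAULT_MAVEN_REPO = "https://repo.maven.apache.org/maven2"


def maven_artifact_url(group_id, artifact_id, version,
                       repository_url=DEFAULT_MAVEN_REPO, extension="jar"):
    """Map Maven coordinates (GAV) to a download URL.

    Assumes the standard m2 layout:
    <repo>/<group with dots as slashes>/<artifact>/<version>/<artifact>-<version>.<ext>
    """
    group_path = group_id.replace(".", "/")
    file_name = f"{artifact_id}-{version}.{extension}"
    return "/".join(
        [repository_url.rstrip("/"), group_path, artifact_id, version, file_name]
    )


url = maven_artifact_url("org.apache.commons", "commons-lang3", "3.12.0")
print(url)
```

Passing a different `repository_url` (for example an internal proxy) changes only the prefix, which is why the choice between the internal and the public location can be left to the tooling.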
According to the specification, a purl or package URL is an attempt to standardize existing approaches to reliably identify and locate software packages. 'Locating' does not imply universal accessibility, but rather that the location should be precisely defined. As such, tools responsible for creating these PURLs should make their best effort to achieve this. When the mirrorsUrl mechanism is used for downloading an artifact, it's advisable that the repository URI in the PURL be the "original" or "source" URI, rather than the mirror URI. I guess that tools (Tycho and others) should be able to provide this information. In scenarios involving a private copy (e.g., a mirror created by p2.mirror tasks), a satisfactory solution may not currently exist. However, it's conceivable that such a "copy" could retain metadata regarding its origin, which tools could then use to generate an accurate PURL. Thoughts? |
Some considerations:
|
Well the main problem is how p2 works (and this is similar to maven where you can define multiple repositories), that you give it a set of artifact repositories, and then you can query for an artifact key (that is type, id, version) and then you get back an artifact, but you can never know where it comes from because:
So maybe p2 can record where it fetched the data from once, but that doesn't mean it is the only "real" source. If you look at eclipse-sdk-prereqs.target, which is the input for the Eclipse Platform build, we have EMF, ORBIT, ECF, ... there. Now we publish the eclipse.download/releaseXY site... what is the source of artifact emf/ecf/.. at version X... is it download.eclipsereleaseXY? Or is it not download.eclipse.org/emf/ecf/orbit... what if the artifact key can be found in multiple locations? And even with Maven you configure a set of Maven repos, and still the GAV doesn't guarantee which server it is downloaded from... even if in an Eclipse build all artifacts are downloaded from an Eclipse mirror, should we really claim they are from that mirror? How could an automated tool ever know? The only possible option (for me) would be to feed the tool (Tycho) with a list of repositories it should query and use in the PURL, but this of course puts the burden on the producer side to manage the URLs, and there is no guarantee that a user configures the "right" ones. This also does not answer what a PURL should look like for something that is not (yet) deployed anywhere but probably will be, e.g. in most cases I want to deploy the BOM together with my release, but without a release the URL will not exist... |
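The ambiguity described in this comment can be sketched with a toy lookup: a tool is handed a list of repositories and asks each one whether it holds a given artifact key. Everything here is an assumption made for illustration (the `ArtifactKey` tuple, the `find_repositories` helper, the in-memory index, and the example.org URLs); a real implementation would query p2 artifact repositories through the p2 API.

```python
from typing import NamedTuple


class ArtifactKey(NamedTuple):
    """The three parts of a p2 artifact key mentioned in the discussion."""
    classifier: str
    id: str
    version: str


def find_repositories(key, repositories):
    """Return every configured repository URL that contains the key.

    `repositories` maps a repository URL to the set of keys it hosts
    (a stand-in for a real repository query).
    """
    return [url for url, keys in repositories.items() if key in keys]


# Hypothetical data: the same artifact is published on a project site
# and again on an aggregated release site.
key = ArtifactKey("osgi.bundle", "org.example.emf", "1.0.0")
repositories = {
    "https://example.org/emf/updates": {key},
    "https://example.org/releases/aggregate": {key},
    "https://example.org/unrelated": set(),
}

print(find_repositories(key, repositories))
```

Both matching URLs are equally valid answers, so the lookup alone cannot tell which one is the "real" source; that choice has to come from the producer's configuration.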
I understand that the implementation presents challenges. However, our current focus is on defining the method to identify and locate a p2 artifact, which is the primary function of a PURL. This involves four key components:
A p2 artifact cannot be fully identified or located without the repository URL. Your insights are indeed valuable and appreciated. They pertain more to the implementation aspects, which would be best addressed in the p2/tycho discussions. |
PURL-TYPES.rst
Outdated
@@ -397,6 +397,59 @@ nuget

pkg:nuget/[email protected]

p2
----
``p2`` for Eclipse p2 units:
``p2`` for Eclipse p2 units:
``p2`` for Eclipse p2 artifacts:
I'd use the word artifacts rather than units. This is about identifying and locating artifacts from artifact repositories, not resolving units from metadata repositories. The same change should be made throughout the rest of the document.
That's true. My initial idea was to use units but from what it looks like, artifacts are the better approach. I also have to check whether the individual bullet points still make sense.
PURL-TYPES.rst
Outdated
3.5.500.v20220812-1420
2.0.0.202304281106

- The software artifact are accessed from a p2 repository. Given that each
- The software artifact are accessed from a p2 repository. Given that each
- The software artifact are accessed from a p2 artifact repository. Given that each
While it's common to have both the artifact and the metadata repository at the same location, the two can be split, and the metadata repository is not relevant here.
PURL-TYPES.rst
Outdated
https://download.itemis.com/updates/releases/2.1.1

- A p2 repository can host a multitude of artifacts. The type of artifact is
  provided by the ``classifier`` qualifier key and is optional.
If it's optional, a default shall be defined. I guess that osgi.bundle is a good one. WDYT?
For artifacts, the type/classifier is never optional and there is no default; see IArtifactKey
Keep in mind that my initial idea was to use units, not artifacts. So the only reason I introduced the classifier was not a technical one, but rather as a hint to immediately see what type of element is described by the PURL.
With artifacts, having a classifier is now mandatory. Maven does something similar, with jar being the default if no classifier is specified. In general, I like this idea, because it would reduce the length of a lot of PURLs.
But if we now put into the specification that osgi.bundle is assumed to be the default value, we impose on p2 that this string must always be used to identify bundles. That's not something I can decide...
Note that there was a similar discussion regarding Tycho recently and how it uses p2.eclipse-plugin as an artificial group-id for bundles that don't have proper Maven coordinates. External tools shouldn't rely on this string to always stay like this, because it's an implementation detail rather than a formal specification. To me, this sounds like a similar situation.
A Maven classifier is something completely different! jar is the type / extension of an artifact (and the default is jar).
For p2 there is no default, but the classifier can be empty (which is something different from classifier=osgi.bundle!), so one would need to differentiate between not specified and empty, which might be confusing.
Thanks for the clarification. Then the classifier must remain an optional hint, with no default value.
I've updated the spec to include the feedback so far. In short, the PURL contains a:
- namespace (artifact id)
- version (artifact version)
- qualifier (classifier (optional), location (mandatory))
@mbarbero from a p2 point of view you can only locate an artifact inside an artifact repository, that's correct. Care must be taken to read the mapping from the repository to resolve the final artifact, though. An artifact repository could be located at any URI and might require special Java code to be accessed. |
So to briefly summarize: The PURL should be calculated based on p2 artifacts, rather than units. This necessitates the following components:
Because there is no central authority for hosting p2 artifacts, it should also contain a means to find its physical location, in order to satisfy the locator property:
Problems with this requirement are, among others, that:
As a side-note, PURL already defines a
Given that the URL only needs to provide a "primary access mechanism", I don't see the precise definition as a requirement. To pick up the example of eclipse-sdk-prereqs.target, all p2 repositories containing a given artifact would be valid locators. In total, we need a locator in addition to the remaining three components. Whether this locator is a repository URL, a Maven GAV, a relative path on the file system or something else entirely is then implementation specific and shouldn't be discussed as part of the specification. Did I miss anything? Are there any objections or concerns that I haven't addressed? |
This specification describes how the PURL for a given Eclipse artifact can be constructed. The locator includes both the information from the (unique) artifact key, as well as the base URI of the artifact repository.
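As a rough illustration of how such a PURL could be assembled from the artifact key and the repository base URI, here is a minimal sketch. It is not the format adopted by the specification or by Tycho: the function `p2_purl`, the qualifier names `classifier` and `repository_url`, and the artifact id `org.example.bundle` are assumptions made for this example. The sketch follows two general purl rules: qualifiers are sorted by key, and a qualifier with an empty value is treated as absent.

```python
from urllib.parse import quote


def p2_purl(artifact_id, version, repository_url, classifier=None):
    """Build a p2 package URL from an artifact key and a repository URI.

    Hypothetical layout: pkg:p2/<id>@<version>?<qualifiers>
    """
    qualifiers = {"repository_url": repository_url}
    # An empty qualifier value is the same as no qualifier in purl,
    # so an empty classifier is omitted just like a missing one.
    if classifier:
        qualifiers["classifier"] = classifier
    # Qualifiers are emitted in lexicographic key order, values percent-encoded.
    query = "&".join(
        f"{name}={quote(value, safe='')}"
        for name, value in sorted(qualifiers.items())
    )
    return f"pkg:p2/{quote(artifact_id, safe='')}@{quote(version, safe='')}?{query}"


print(p2_purl(
    "org.example.bundle",
    "2.0.0.202304281106",
    "https://download.itemis.com/updates/releases/2.1.1",
    classifier="osgi.bundle",
))
```

Because an empty classifier cannot be told apart from a missing one in this encoding, the sketch cannot represent the "empty but specified" case raised in the review comments; a real specification would have to address that separately.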
A proof-of-concept has recently been merged to Tycho via eclipse-tycho/tycho#3258, based on this proposal. The only noteworthy change is that the Example: |
An initial draft for bundles and features. There are more p2 types, but I can't see any use case where it would be necessary to explicitly specify them.
#271