New attributes for license tag #347

Open: wants to merge 10 commits into base: master
68 changes: 42 additions & 26 deletions rep-0149.rst
@@ -5,7 +5,7 @@ Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Oct-2017
-Post-History: 02-Jan-2018, 31-Aug-2020
+Post-History: 02-Jan-2018, 31-Aug-2020, 14-Apr-2022

Outline
=======
@@ -346,8 +346,8 @@ Example
</description>
<maintainer email="[email protected]">Someone</maintainer>

-<license>BSD</license>
-<license file="LICENSE">LGPL</license>
+<license file="LICENSE">BSD-3-Clause</license>
Contributor

One suggestion I have here is to add a type attribute to the license tag. The type attribute would describe what kind of license tag is being used, either an SPDX identifier, or a freeform identifier. If no type attribute is specified, we would assume freeform.

Thus you could have:

  • <license>BSD</license> - a common license tag today, which can be assumed to be a "freeform" license tag (and thus nothing can mechanically be determined about it)
  • <license type='freeform'>BSD</license> - which is the same as above but more explicit
  • <license type='spdx'>BSD-3-Clause</license> - which is very explicitly using the SPDX identifier, and thus can be mechanically verified

With this in place, we could put in linters (for instance) that would complain when the "type" is SPDX, but the actual text doesn't match a known SPDX identifier. We could also have linters that would complain if any packages in the workspace don't use the "spdx" type (though we'd have to do a lot of work to enable that by default).
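
The linter idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `lint_license_tags` and the tiny `KNOWN_SPDX_IDS` set are hypothetical, and a real linter would load the complete SPDX License List rather than hard-code a few identifiers.

```python
import xml.etree.ElementTree as ET

# Tiny illustrative subset of the SPDX License List (assumption: a real
# linter would load the complete, current list instead).
KNOWN_SPDX_IDS = {"Apache-2.0", "BSD-3-Clause", "GPL-3.0-or-later", "MIT", "Zlib"}


def lint_license_tags(package_xml: str) -> list[str]:
    """Return warnings for license tags that contradict their declared type."""
    warnings = []
    root = ET.fromstring(package_xml)
    for tag in root.iter("license"):
        text = (tag.text or "").strip()
        # A missing type attribute is treated as "freeform", as proposed above.
        license_type = tag.get("type", "freeform")
        if license_type == "spdx":
            if text not in KNOWN_SPDX_IDS:
                warnings.append(f"'{text}' is not a known SPDX identifier")
        elif license_type != "freeform":
            warnings.append(f"unknown license type '{license_type}'")
    return warnings
```

Freeform tags are deliberately never flagged here; a stricter mode that warns about every tag without `type='spdx'` would be a one-line addition.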

Does that make sense?

Contributor

Would it not be simpler to require the use of SPDX identifiers?

Other efforts to standardise licensing information (like REUSE) have gone that way and it makes parsing and validation much more straightforward.

Contributor

Ideally, yes, we would require SPDX identifiers.

The issue is backwards compatibility. There are thousands of ROS (1 & 2) packages out there in the wild, and almost none of the package.xml files in them follow SPDX. Explicitly putting a type field in will make it easier for tools (and humans) to determine whether the field is expected to be SPDX or not.

Contributor Author

I like the idea of introducing a type attribute. There are many ROS 1 packages that simply specify BSD as the license identifier, which is not enough information to infer the exact license.

> which can be assumed to be a "freeform" license tag (and thus nothing can mechanically be determined about it)

I would be a bit more optimistic for the linter and in particular for generating a debian/copyright file:

  1. <license file="path_to_license.txt">BSD</license>: This is enough information to generate a Debian copyright file. The generator (and the linter) should check whether path_to_license.txt really contains a BSD license text.
  2. <license>BSD</license> (or <license source-files="*">BSD</license>) with a file LICENSE in the root of the repo: Again, this is enough information to generate a Debian copyright file in my opinion.
  3. <license>Some SPDX full name or SPDX identifier</license>: Again enough information to generate a Debian copyright file (and to validate it against a potential LICENSE file in the package or repo root).
  4. <license file="custom_license_text.txt">MyCustomLicenseIdentifier</license>: Also this would allow the generation of a debian/copyright file according to https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/. ("If there are licenses present in the package without a standard short name, an arbitrary short name may be assigned for these licenses. These arbitrary names are only guaranteed to be unique within a single copyright file.")

I would not even consider it dramatic if a generated debian/copyright file contains a license short name that does not match the license text: the file simply repeats the inconsistent license names and texts from the source code, and relevant tools applied to the debian/copyright file should discover the mismatch.

As a consequence, the type="spdx" attribute is not strictly necessary, but it would add a safety layer that prevents a typo in an SPDX license identifier from being interpreted as a custom license identifier (i.e., as case 4).

What we must prevent in any case is that the future debian/copyright file generator refines a license specification without evidence. That is the situation for some ROS 1 packages that specify <license>BSD</license> without any license headers or LICENSE file.
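
The four cases and the "no evidence" fallback can be summarized in a small decision function. This is a sketch only: `LicenseTag`, `license_text_source`, the returned labels, and the abbreviated `KNOWN_SPDX_IDS` set are hypothetical names chosen for illustration, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset of SPDX identifiers (assumption, not the full list).
KNOWN_SPDX_IDS = {"Apache-2.0", "BSD-3-Clause", "MIT", "Zlib"}


@dataclass
class LicenseTag:
    text: str                     # content of the <license> tag
    file: Optional[str] = None    # value of the file attribute, if given
    type: str = "freeform"        # proposed type attribute


def license_text_source(tag: LicenseTag, root_license_exists: bool) -> str:
    """Decide where a copyright-file generator would take the license text from."""
    # With an explicit type="spdx", a typo is an error instead of case 4.
    if tag.type == "spdx" and tag.text not in KNOWN_SPDX_IDS:
        return "error-unknown-spdx-id"
    if tag.file is not None:
        return "file-attribute"        # cases 1 and 4
    if tag.text in KNOWN_SPDX_IDS:
        return "spdx-license-list"     # case 3
    if root_license_exists:
        return "root-license-file"     # case 2
    # Nothing backs up the name: do not refine e.g. "BSD" any further.
    return "insufficient-evidence"
```

A generator built this way would refuse to guess for a bare `<license>BSD</license>` with no LICENSE file, which is the situation the last paragraph warns about.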

Contributor (@gavanderhoorn, Apr 22, 2022)

> Ideally, yes, we would require SPDX identifiers.
>
> The issue is backwards compatibility. There are thousands of ROS (1 & 2) packages out there in the wild, and almost none of the package.xml files in them follow SPDX. Explicitly putting a type field in will make it easier for tools (and humans) to determine whether the field is expected to be SPDX or not.

But don't we have the same problem with the attributes and elements proposed here? All the packages in existence today would also not have those.

If the package format version were increased, tools could just assume "v4 manifest->spdx".

Anything below that does not (have to) conform to that.

If the concern is "there may be licenses in use for which there are no SPDX identifiers", then SPDX has a standard approach for that in place IIRC.


Edit: absence of the type attribute could of course also be used, but it seems much cleaner to me to be able to say "format 4 manifests must use SPDX identifiers" than "if the license tag doesn't have the type attribute then it's OK to not use SPDX identifiers".
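
A minimal sketch of the format-version rule suggested in this comment, assuming a hypothetical format 4 that mandates SPDX identifiers (no such format is defined in the diff shown here) and assuming that a manifest without a `format` attribute counts as format 1:

```python
import xml.etree.ElementTree as ET

# Hypothetical: the first package format that would require SPDX identifiers.
FIRST_SPDX_FORMAT = 4


def expects_spdx(package_xml: str) -> bool:
    """Return True if license tags in this manifest must be SPDX identifiers."""
    root = ET.fromstring(package_xml)
    # Assumption: a missing format attribute means format 1.
    manifest_format = int(root.get("format", "1"))
    return manifest_format >= FIRST_SPDX_FORMAT
```

With this rule no per-tag attribute is needed; the trade-off is that a package cannot adopt SPDX identifiers in a verifiable way without also moving to the new format.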

Member

+1 for the type attribute.

Contributor Author

Just for completeness: The SPDX Specification also describes a solution for naming custom licenses (cf. https://spdx.dev/spdx-specification-21-web-version/#h.1v1yuxt), but I still prefer the explicit type attribute to distinguish between licenses from the SPDX License List and custom ones.

Contributor

A type attribute with a default looks good to me.

Contributor Author

Introduced the type attribute as proposed by @clalancette. (Sorry, I should have updated the PR before yesterday's discussion in the TSC meeting.)

Is the description of the new attribute sufficient or should I give explicit explanations of the two possible values freeform and spdx?

Contributor Author

@clalancette, are you fine if I mark the discussion on the type attribute as resolved, cf. my last question on the explanation of the attribute in the REP?

<license source-files="include/my_package/linear_math/*">Zlib</license>
ralph-lange marked this conversation as resolved.

<url type="website">http://wiki.ros.org/my_package</url>
<url type="repository">http://www.github.com/my_org/my_package</url>
@@ -465,32 +465,18 @@ Example
<license> (multiple, but at least one)
--------------------------------------

-Name of license for this package, e.g. BSD, GPL, LGPL. In order to
-assist machine readability, only include the license name in this tag.
-For multiple licenses multiple separate tags must be used. A package
-will have multiple licenses if different source files have different
-licenses. Every license occurring in the source files should have
-a corresponding ``<license>`` tag. For any explanatory text about
-licensing caveats, please use the ``<description>`` tag.
-
-Most common open-source licenses are described on the
-`OSI website <http://www.opensource.org/licenses/alphabetical>`_.
-
-Commonly used license strings:
-
-- Apache-2.0
-- BSD
-- Boost Software License
-- GPLv2
-- GPLv3
-- LGPLv2.1
-- LGPLv3
-- MIT
-- Mozilla Public License Version 1.1
+Name of license for this package or selected files of this package,
+e.g. BSD-3-Clause, GPL-3.0-or-later, Apache-2.0. In order to assist
+machine readability, only include the `SPDX license identifier
+<https://spdx.org/licenses/>`_ in this tag. In the rare case that
+a package (or selected source files of the package) are licensed under
+multiple alternative licenses, the identifiers can be combined by
+``or`` as described in Section 7.2 of the `Machine-readable
+debian/copyright file specification V1.0
+<https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_.

Attributes
''''''''''

.. raw:: html

<font color="blue">
@@ -505,10 +491,40 @@ Attributes

"You must give any other recipients of the Work or Derivative Works a copy of this License"

``source-files="FILENAME-PATTERN"`` *(optional)*

A filename pattern using the simplified shell glob syntax specified in Section 6.9 of the `Machine-readable
debian/copyright file specification V1.0 <https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_
and relative to the ``package.xml`` file.

The filename pattern specifies the source files this license information refers to. The value
``source-files="*"`` refers to all source files of the package, including source files that are downloaded automatically
during the build process - for example in the case of so-called *vendor packages*. If the attribute is not specified,
the tag again refers to all source files of the package, including downloaded source files.

If the filename patterns of multiple license tags match a particular file, the last tag applies to it - following
the logic described in Section 6.9 of the `Machine-readable
debian/copyright file specification V1.0 <https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_.
Consequently, more general tags should be given first.

.. raw:: html

</font>
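
The last-match-wins rule for ``source-files`` described above can be illustrated with a short sketch. The function name `license_for` and the tuple layout are hypothetical, and Python's `fnmatchcase` only approximates the Debian glob syntax: its `*` also matches slashes, as Section 6.9 requires, but it additionally accepts `[...]` character sets and does not handle the backslash escapes of that specification.

```python
from fnmatch import fnmatchcase
from typing import Optional


def license_for(path: str, tags: list[tuple[Optional[str], str]]) -> Optional[str]:
    """Return the license that applies to path.

    tags holds (source-files pattern, license) pairs in package.xml order;
    a pattern of None stands for a tag without the source-files attribute.
    """
    result = None
    for pattern, license_id in tags:
        # A missing attribute refers to all source files, like "*".
        if fnmatchcase(path, pattern or "*"):
            result = license_id  # later tags override earlier ones
    return result
```

For the example manifest in this diff, a general BSD-3-Clause tag followed by the Zlib tag for `include/my_package/linear_math/*` assigns Zlib to files in that directory and BSD-3-Clause to everything else, which is why more general tags should come first.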

Notes
'''''

The license information given in the license tags has to be consistent
with the information given in the license headers of the source files.
This may be checked by suitable linting tools.

Furthermore, by the license tags in the ``package.xml`` file and the
copyright information obtained from the license headers of the source files
(e.g., using ``licensecheck --copyright``)
a copyright file according to the `Machine-readable debian/copyright file
specification V1.0 <https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_
for binary versions of this package can be created automatically.
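
How such a copyright file might be assembled is sketched below. The function `render_copyright` and its input layout are assumptions for illustration; the copyright holders would come from a tool such as ``licensecheck --copyright``, and the sketch omits the stand-alone ``License:`` paragraphs with the full license texts that a complete file also needs.

```python
DEP5_FORMAT = "https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/"


def render_copyright(entries: list[tuple[str, list[str], str]]) -> str:
    """Render Files paragraphs from (pattern, copyright holders, license) entries.

    Each entry needs at least one copyright holder.
    """
    paragraphs = [f"Format: {DEP5_FORMAT}"]
    for pattern, holders, license_name in entries:
        first, *rest = holders
        lines = [f"Files: {pattern}", f"Copyright: {first}"]
        # Continuation lines of a multi-line field start with whitespace.
        lines.extend(f"           {holder}" for holder in rest)
        lines.append(f"License: {license_name}")
        paragraphs.append("\n".join(lines))
    # Paragraphs are separated by one blank line.
    return "\n\n".join(paragraphs) + "\n"
```

Entries should be passed in the same order as the license tags, so that the last-match-wins logic of the copyright format mirrors that of the ``source-files`` attribute.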

<url> (multiple)
----------------
