feat: support evaluating spdx license expressions #1329

G-Rath · 2024-10-18T03:32:20Z

This implements a parser for SPDX license expressions in accordance with annex D of the v2 spec, to allow the scanner to properly determine if packages with such expressive licenses are permitted based on the licenses allowed by the user.

To do this I've implemented a two-phase parser which starts by tokenizing the license string and then turns it into an AST of nodes that can be walked to determine if the full expression is satisfied; for no particularly good reason I have used a string for the base token type rather than a struct, meaning the tokens value is also it's type - the tradeoff with this is while it means we don't have to do as much referencing or work in the tokenizer, we do have to do some extra work when walking the tree to resolve the "simple expression" tokens.

I'm proposing landing the current implementation as I don't think using a struct would be strictly better, though in hindsight it probably would have been a bit quicker to implement and so I plan to explore how much simpler (or complex) it might be as a follow up.

Currently this is focused on AND and OR support, as I believe those are the two primary operators that are relevant to the scanner, though we still might want to have richer handling for the WITH and + operators; currently both of those just get treated as being part of the license expression (though it's not actually possible right now to allow a license with WITH as the CLI expects license values to not have any spaces - this too will be a follow-up for me).

Finally, I've purposely not put any caching in place even though that should be easy, due to wanting to get this landed and as I don't expect it to actually have a significant impact on the scanner performance (ultimately most complex expressions in the real-world will be made up of a single operator, and chopping+looping over strings in memory is extremely fast)

Resolves #1299

codecov-commenter · 2024-10-18T03:39:34Z

Codecov Report

Attention: Patch coverage is 99.35065% with 1 line in your changes missing coverage. Please review.

Project coverage is 68.99%. Comparing base (9fcf53f) to head (22a69e7).

Files with missing lines	Patch %	Lines
internal/spdx/satisfies.go	99.32%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1329      +/-   ##
==========================================
+ Coverage   68.75%   68.99%   +0.23%     
==========================================
  Files         184      185       +1     
  Lines       17714    17864     +150     
==========================================
+ Hits        12180    12326     +146     
- Misses       4875     4878       +3     
- Partials      659      660       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

jayvdb · 2024-11-05T08:37:53Z

Hi , I'd love to see this progress. We are already using osv-scanner in one small private repo replacing a fork of https://github.com/jslicense/licensee.js that adds pnpm support, and would like to continue the transition but this PR is needed to avoid adding a ridiculous number of ignores in our osv-scanner.yaml files.

another-rex · 2024-11-06T05:37:08Z

Thanks for the interest! We are in the process of getting this feature in, but are a bit blocked on some internal issues. We'll try to get this feature merged in a week or so.

another-rex

Since this works already let's merge this in first, and then use the deps.dev library later.

#1329 removed the slices import in `vulnerability_result.go`, because it was no longer using it. At the same time, #1385 added a new usage of the module. Add the import back to make osv-scanner build again.

G-Rath changed the title ~~feat: support parsing spdx license expressions~~ feat: support evaluating spdx license expressions Oct 18, 2024

G-Rath requested review from another-rex, cuixq and hogo6002 October 18, 2024 03:32

G-Rath added the enhancement New feature or request label Oct 18, 2024

G-Rath force-pushed the fix/support-or-and2 branch 2 times, most recently from 30babc5 to 9bebbe3 Compare October 22, 2024 00:13

another-rex approved these changes Nov 7, 2024

View reviewed changes

G-Rath added 11 commits November 8, 2024 07:53

test: add cases

6d42e4f

feat: introduce spdx.Satisfies

ac9852d

feat: support properly parsing license expressions

093fbd0

test: update cases

6e90837

refactor: move spdx#Satisfies into internal package

d642485

test: cleanup

0b028c0

refactor: address linting

8e2f4bc

refactor: remove unneeded code and add comments

27b10d8

fix: add newline in output message

4c18a83

chore: remove unused property

501ac8e

test: remove unneeded case

0ec86a4

G-Rath force-pushed the fix/support-or-and2 branch from a16ee28 to 0ec86a4 Compare November 7, 2024 18:53

refactor: use a clearer name for nextIfValid

22a69e7

G-Rath force-pushed the fix/support-or-and2 branch from fbafc02 to 22a69e7 Compare November 7, 2024 20:06

another-rex merged commit 73fe113 into google:main Nov 11, 2024
13 checks passed

another-rex deleted the fix/support-or-and2 branch November 11, 2024 19:37

michaelkedar mentioned this pull request Nov 12, 2024

fix: add "slices" import back in #1395

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: support evaluating spdx license expressions #1329

feat: support evaluating spdx license expressions #1329

G-Rath commented Oct 18, 2024

codecov-commenter commented Oct 18, 2024 •

edited

Loading

jayvdb commented Nov 5, 2024

another-rex commented Nov 6, 2024

another-rex left a comment

feat: support evaluating spdx license expressions #1329

feat: support evaluating spdx license expressions #1329

Conversation

G-Rath commented Oct 18, 2024

codecov-commenter commented Oct 18, 2024 • edited Loading

Codecov Report

jayvdb commented Nov 5, 2024

another-rex commented Nov 6, 2024

another-rex left a comment

Choose a reason for hiding this comment

codecov-commenter commented Oct 18, 2024 •

edited

Loading