Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial implementation of scan, parse, and satisfies #1

Merged
merged 2 commits into from
Sep 5, 2022

Conversation

elrayle
Copy link
Collaborator

@elrayle elrayle commented Aug 29, 2022

Major Components

  • license - defines active, deprecated, and exception licenses; defines license ranges
    NOTE: Currently hardcoded. May want to get from external canonical source in the future to avoid longterm maintenance to keep the lists up to date.
  • scan - breaks expression string into tokens
  • Parse - validates and creates nodes for expected parts
  • node - defines node structures and holds basic comparison/validation code for two license nodes
  • compare - compares to license ids are equal or within compatible ranges
  • Satisfies - checks if two license expressions are compatible

Remaining Work

Satisfies needs to be extended to support simple conjunctive expressions (e.g. "MIT AND Apache-2.0", "MIT OR Apache-2.0", etc.) and complex deep tree expressions (e.g. " (MIT AND Apache-1.0+) OR DocumentRef-spdx-tool-1.2:LicenseRef-MIT-Style-2 OR (GPL-2.0 WITH Bison-exception-2.2)").

includes start of comparison methods needed for Satisfy
* start of satisfies translation from js to go
* add comparator code to determine >, <, = for two licenses
* add ranges for licenses
* normalize -or-later, -only, + in license identifier

Known direct license comparisons that fail in go, but pass in js

```
satisfies(“Apache-3.0”, “Apache-2.0+”) // Apache-3.0 doesn’t exist
satisfies(“Apache-1.0+”, “Apache-2.0+”) // not in same range
satisfies(“GPL-2.0”, “GPL-2.0+”) // not in same range
satisfies(“GPL-2.0”, “GPL-2.0-or-later”) // not in same range
```
Copy link
Contributor

@dangoor dangoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, and definitely looks like some tricky porting from JavaScript!

There's no need to make a change like this, but I thought I'd mention it in case it makes Satisfies clearer/easier to implement. This is fresh in my mind having just worked on validation code that has to work across types… What I'm about to say only makes sense if Satisfies only takes a particular license or GPL-2.0-or-later kind of thing and not full expressions.

If you wanted Node to be something polymorphic, you could think in terms of what operations need to be performed. Satisfies is the big one, but maybe there are others that could be performed (greater than, less than). You then make an interface (e.g. Node) that's like Satisifes(string) bool. Then you can have a License struct that implements that interface and contains info about a single license (or a license range, whatever makes sense for that). Then you can have an And struct that only passes Satisfies if the two Nodes it contains pass Satisfies.

Not sure if that makes any sense or seems helpful at all.


func (t *tokenStream) parseAtom() *Node {
parenNode := t.parseParenthesizedExpression()
if t.err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reading this and the previous function as (MIT (no closing paren) will just silently disappear from the expression rather than bubbling up an error. Is that right? It seems like that could cause trouble.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a follow-on PR to add a test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

danielramn18gmail.com

// Advances the index if the operator is found.
func (t *tokenStream) parseOperator(operator string) *string {
token := t.peek()
if token.role == OPERATOR_TOKEN && token.value == operator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these operators case sensitive? Is MIT or GPL-2.0 allowed?

Copy link
Collaborator Author

@elrayle elrayle Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABNF in the spec defines them as all cap. The javascript parser expects all caps. In the javascript, the results of parsing converted them to all lower case. Not sure why they chose to do that. I did the same, expecting that as I moved farther into the code, it would become clear why this was done. But now that a significant junk of the satisfies code is done, I don't see a need to convert them. For now, I will leave the conversion to lower case in, but this would be a very simple refactor to remove an unnecessary conversion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,293 @@
package spdxexp

/* Translation to GO from javascript code: https://github.com/clearlydefined/spdx-expression-parse.js/blob/master/scan.js */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The O in Go is lower case 😃

Copy link
Collaborator Author

@elrayle elrayle Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to give each Go file better documentation. I'll update this when I do that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{role: LICENSE_TOKEN, value: "Apache-2.0"},
{role: OPERATOR_TOKEN, value: ")"},
}, nil},
{"kitchen sink", " (MIT AND Apache-1.0+) OR DocumentRef-spdx-tool-1.2:LicenseRef-MIT-Style-2 OR (GPL-2.0 WITH Bison-exception-2.2)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙀

@elrayle elrayle merged commit b472351 into main Sep 5, 2022
@elrayle elrayle deleted the elr/scan-and-parse branch September 5, 2022 12:22
@elrayle
Copy link
Collaborator Author

elrayle commented Sep 5, 2022

Thanks for the suggestions about another approach for the structuring of nodes. Once the translation is complete, I can take a deeper look at potential refactors that can simplify the code and make it more closely follow typical Go code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants