Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intersection types #79

Open
nsbgn opened this issue Mar 4, 2022 · 7 comments
Open

Intersection types #79

nsbgn opened this issue Mar 4, 2022 · 7 comments

Comments

@nsbgn
Copy link
Contributor

nsbgn commented Mar 4, 2022

Now that #74 has freed up the pipe operator |, it will now be possible to implement operator overloading with yet another mechanism, on top of subtype polymorphism and parametric polymorphism. That is, we could consider adding the following, with an intuitive reading:

f = Operator(type= A ** B | C ** D)

This is somewhat possible already using parametric polymorphism:

f = Operator(type=lambda x, y: {T(x, y)[T(A,B), T(C, D))} >> x ** y)

However, applying either an A or a C to this function will not result in, respectively, B and D, but rather in a variable y[B] or y[D]. This means that subsequent uses of the result will correctly fail where it should, but we obtain no fixed type to reason with. The above solution would address this.

In mild defiance of the YAGNI principle, I'm pre-emptively publishing this, as it's likely that this will pop up if the library is ever used by others, but it is not necessary for the CCT algebra and so I will not work on it right now.

@nsbgn
Copy link
Contributor Author

nsbgn commented Mar 4, 2022

The + notation might be preferable over | as its precedence is higher than >>'s.

Although this doesn't matter as >> only returns its right operand anyway, since it relies on side effects.

@nsbgn
Copy link
Contributor Author

nsbgn commented Mar 7, 2022

An example use case is for the apply operator: instead of having apply and apply2 you could just define apply = Operator(type=lambda x, y, z, c: (x ** y ** z) ** R(c, x) ** R(c, y) ** R(c, z) | (y ** z) ** R(c, y) ** R(c, z)).

@nsbgn
Copy link
Contributor Author

nsbgn commented Mar 11, 2022

This would also help in defining an unordered tuple type (see also #78 for the ordered variant). That is, a type that is a supertype of both A * B and B * A. This would be useful since we often won't care about the order in which multiple attributes of a relation will be presented - just that there are multiple.

There are probably a lot of problems with this (for one, the type taxonomy would be cyclic).

The + operator might be natural for this.

@nsbgn nsbgn changed the title Explicit operator overloading Explicit overloading & union types Mar 11, 2022
@nsbgn nsbgn changed the title Explicit overloading & union types Explicit overloading; union, intersection, top, bottom types Apr 22, 2022
@nsbgn
Copy link
Contributor Author

nsbgn commented Apr 22, 2022

I gave it some more thought.

What should probably happen is the following:

We should have an Intersection type. We should have a Top and Bottom type; top representing a value of any type (thus a supertype of all types) and bottom representing a type which has no values (thus a subtype of all types). Then, in the CCT algebra, R(A * B, C & D) would make sense: it is the type of relations for which the key attributes have, respectively, type A and B, and for which the dependent attributes have at least types C and D. This is a subtype of R(A * B, C) and R(A * B, D), and also of R(A * B, Bottom). It is also a supertype of R(A * B, C & D & E). This is exactly the behaviour we want: we often don't care about the types of other independent attributes, so we don't want to define the relationships so rigidly using Product types.

Unfortunately, I think this will complicate the type inference a lot, and I need to give it some thought. ...And read papers, probably.

For consistency, Tuple types should be renamed Product types.

@nsbgn
Copy link
Contributor Author

nsbgn commented Jun 15, 2022

Top and bottom types should get a separate issue (EDIT: #94). We need them more urgently because it will help us in using types of the form F(_) in queries (since we can just search for F(Bottom) instead).

@nsbgn nsbgn changed the title Explicit overloading; union, intersection, top, bottom types Explicit overloading; union, intersection types Jun 15, 2022
@nsbgn
Copy link
Contributor Author

nsbgn commented Aug 10, 2022

This is now leading to an issue: the question parser often returns R(Obj, x) where the manually constructed queries ("gold standard") would use R(Obj, Reg * x).

As indicated in #79 (comment) I think the correct solution is to have intersection types in this library. However, as a temporary solution, in queries, we can convert R(Obj, Reg * x) to the union of R(Obj, Reg * x) and R(Obj, x). For this to work, however, we need to implement choice queries --- see issue #77.

nsbgn added a commit that referenced this issue Aug 15, 2022
nsbgn added a commit to quangis/quangis-web that referenced this issue Aug 15, 2022
Later, intersection types should be introduced so that `R(x, y)` is a
supertype of `R(x, y & z)` and `R(x, z & y)`.

cf. quangis/transforge#79 (comment)
nsbgn added a commit that referenced this issue Aug 22, 2022
Until #79 is implemented, as of 11bc3fc
we work around the issue of intersection types by using choice, as
described in #77. That is, if we want to query for `R(x, y)`, we also
accept `R(x, y & z)` (or, temporarily using product types for
intersection types, `R(x, y * z)`. However, this means that when we are
querying for workflows that contain an intersection type, we can't just
ignore it: we must query for `{... R(x, y * z)} UNION { ... R(x, y)}`.
This is implemented with this commit, but it should surely not be the
solution we end up with.
@nsbgn nsbgn changed the title Explicit overloading; union, intersection types Intersection types Sep 29, 2022
@nsbgn nsbgn removed this from the Version 0.2: RDF support milestone Nov 2, 2022
@nsbgn
Copy link
Contributor Author

nsbgn commented Nov 30, 2022

The product type A * B should perhaps be a supertype of the intersection type A & B.

nsbgn added a commit to quangis/quangis-workflow that referenced this issue Jul 5, 2023
In the meeting, we figured out that a workflow was generated that feeds
a source of type  R(Obj, Reg * Nom) into the second input of a
ZonalStatisticsMeanInterval, the output which in turn gets passed into a
LoadCountAmounts along with a second use of the same source, now
interpreted as a R(Obj, Count). This is another instance of the issue
described in quangis/transforge#79: we really
want to sometimes describe multiple attributes of the same thing.

See also
quangis/transforge#79 (comment)

We now work around it by being consistent with the annotation.
nsbgn added a commit that referenced this issue Jul 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant