From 85b4d9a9d9c7541fe0df8d6156f6b0cdb5401816 Mon Sep 17 00:00:00 2001
From: Alan Jeffrey <403333+asajeffrey@users.noreply.github.com>
Date: Tue, 23 Jan 2024 11:47:08 -0600
Subject: [PATCH] Add RFC for shared self (#5)
---
docs/shared-self-types.md | 367 ++++++++++++++++++++++++++++++++++++++
1 file changed, 367 insertions(+)
create mode 100644 docs/shared-self-types.md
diff --git a/docs/shared-self-types.md b/docs/shared-self-types.md
new file mode 100644
index 00000000..56349643
--- /dev/null
+++ b/docs/shared-self-types.md
@@ -0,0 +1,367 @@
+# Shared self types
+
+## Summary
+
+This RFC proposes sharing `self` types between method definitions which share a metatable.
+
+## Motivation
+
+Currently, metamethods are type-inferred independently, and so give
+completely separate types to `self`. This has poor ergonomics, as type
+errors for inconsistent methods are produced on method calls rather
+than method definitions. It also has poor performance, as the independent
+self types have separate memory footprints, and there is idiomatic code with
+exponential blowup in the size of the type graph.
+
+For example, a `Point` class can be simulated using metatables:
+
+```lua
+  --!strict
+
+  local Point = {}
+  Point.__index = Point
+
+  function Point.new()
+    local result = {}
+    setmetatable(result, Point)
+    result.x = 0
+    result.y = 0
+    return result
+  end
+
+  function Point:getX()
+    return self.x
+  end
+
+  function Point:getY()
+    return self.y
+  end
+
+  function Point:abs()
+    return math.sqrt(self:getX() * self:getX() + self:getY() * self:getY())
+  end
+```
+
+Currently, this code is problematic, since there is no connection between the types
+of the `Point` metamethods. For example, the inferred type for `Point.getX`
+is `<a>({ x : a }) -> a`, rather than the expected `(Point) -> number`.
+
+Even worse, the method `Point.abs` does not type-check, since the type
+of `self:getX() * self:getX()` is unknown. If Luau had subtyping constraints and type families
+for overloaded operators, the inferred type would be something like:
+
+```lua
+  type PointMT = {
+    new : () -> Point,
+    getX : <a>({ x : a }) -> a,
+    getY : <a>({ y : a }) -> a,
+    abs : <a, b, c>(a) -> number where
+      a <: { getX : (a) -> b, getY : (a) -> c }
+      Add<Mul<b, b>, Mul<c, c>> <: number,
+  }
+  type Point = {
+    x : number,
+    y : number,
+    @metatable PointMT
+  }
+```
+
+but this type is not great ergonomically, since this type may be presented to
+users in type hover or type error messages, and will surprise users
+expecting a simpler type such as:
+
+```lua
+  type PointMT = {
+    new : () -> Point,
+    getX : (Point) -> number,
+    getY : (Point) -> number,
+    abs : (Point) -> number
+  }
+  type Point = {
+    x : number,
+    y : number,
+    @metatable PointMT
+  }
+```
+
+This is the type inferred by *shared self types*. Rather than
+inferring the `self` type separately for each metamethod declared on a
+table and for each use of `setmetatable`, the same type is used.
+
+Unfortunately, while this change is fairly straightforward for
+monomorphic types like `Point`, it is problematic for generic classes
+such as containers. For example:
+
+```lua
+local Set = {}
+Set.__index = Set
+
+function Set.new()
+  return setmetatable({
+    elements = {}
+  }, Set)
+end
+
+function Set:add(el)
+  self.elements[el] = true
+end
+
+function Set:contains(el)
+  return self.elements[el] ~= nil
+end
+```
+
+In this case, the expected type would be something like:
+
+
+```lua
+  type SetMT = {
+    new : <E>() -> Set<E>,
+    add : <E>(Set<E>, E) -> (),
+    contains : <E>(Set<E>, E) -> boolean
+  }
+  type Set<E> = {
+    elements : { [E] : boolean },
+    @metatable SetMT
+  }
+```
+
+Inferring this type is beyond the scope of this RFC, though. Initially, we propose only inferring `self` monotypes,
+in this case:
+
+
+```lua
+  type SetMT = {
+    new : () -> Set,
+    add : (Set, unknown) -> (),
+    contains : (Set, unknown) -> boolean
+  }
+  type Set = {
+    elements : { [unknown] : boolean },
+    @metatable SetMT
+  }
+```
+
+and propose allowing explicit declaration of the shared self type,
+following the common practice of naming the self type after the metatable:
+
+```lua
+  type Set<E> = { elements : { [E] : boolean } }
+```
+
+This type, together with its generic type parameters, is used to derive
+the type of `self` in methods declared with the `function Set:m()`
+syntax:
+
+```lua
+  type SetSelf<E> = {
+    elements : { [E] : boolean },
+    @metatable SetMT
+  }
+```
+
+In cases where shared self types are just getting in the way, there
+are two workarounds. Firstly, the shared self type can be declared to
+be `any`, which will silence type errors:
+
+```lua
+ type Foo = any
+```
+
+Secondly, the self type can be declared explicitly:
+
+```lua
+ function Foo.m(self : Bar) ... end
+```
+
+## Design
+
+### Self types
+
+For each table `t`, introduce:
+
+* the self type parameters of `t`, a sequence of generic type and typepack variables,
+* the self type definition of `t`, a type which can use the type parameters of `t`, and
+* the self type of `t`, a type which can use the type parameters of `t`.
+
+These can be declared explicitly:
+
+```lua
+  type t<As> = U
+```
+
+which defines, when `t` has type `T`:
+
+* the self type parameters of `t` to be `As`,
+* the self type definition of `t` to be `U`, and
+* the self type of `t` to be `U` extended with `@metatable T`.
+
+For example,
+
+```lua
+  type Set<E> = { [E] : boolean }
+```
+
+declares, when `Set` has type `SetMT`:
+
+* the self type parameters of `Set` to be `E`,
+* the self type definition of `Set` to be `{ [E] : boolean }`, and
+* the self type of `Set` to be `{ [E] : boolean, @metatable SetMT }`.
+
+If there is no explicit declaration of the self type of `t`, then, when `t` has type `T`:
+
+* the self type parameters of `t` are empty,
+* the self type definition of `t` is a free table type `U`,
+* if `t` has an `__index` property of type `MT`, then the self type of `t` is
+ a metatable type, whose metatable is `MT`, and whose table is `U`, and
+* if `t` does not have an `__index` property, then the self type of `t` is `U`.
+
+The free table type is unified in the usual fashion.
+
+Self types are used in two ways: in calls to `setmetatable`, and in metamethod declarations.
+
+### `setmetatable`
+
+In calls to `setmetatable(t, mt)`:
+
+* if `mt` has self type parameters `As`, self type definition `T` and self type `S`,
+* and `t` has (final) type `T [ Ts/As ]`
+* then `setmetatable(t, mt)` has type `S [ Ts/As ]`.
+
+As is currently the case, this has the side effect of updating the type state of `t` from
+`T [ Ts/As ]` to `S [ Ts/As ]`.
+
+For example, in `setmetatable({ elements = {} }, Set)`, we have:
+
+* `Set` has type `SetMT`, self type parameter `E`, self type definition `{ elements : { [E] : boolean } }`, and self type `{ elements : { [E] : boolean }, @metatable SetMT }`,
+* and `{ elements = {} }` has type `{ elements : { [X] : boolean } }` for a free `X`,
+* so `setmetatable({ elements = {} }, Set)` has type `{ elements : { [X] : boolean }, @metatable SetMT }`.
+
+As a result, `Set.new` has type `<a>() -> { elements : { [a] : boolean }, @metatable SetMT }`.
+
+As another example, in `setmetatable(result, Point)`:
+
+* `Point` has type `PointMT`, no self type parameters, a free self type definition (call it `T`), and a self type whose table type is `T` and whose metatable is `PointMT`,
+* and `result` has final type `{ x : number, y : number }`,
+* so (by unifying `T` with `{ x : number, y : number }`) `setmetatable(result, Point)` has type `{ x : number, y : number, @metatable PointMT }`.
+
+As a side-effect, the type state of `result` is updated to be `{ x : number, y : number, @metatable PointMT }`,
+and unification causes the self type of `Point` to be `{ x : number, y : number, @metatable PointMT }`.
+
+Note that this relies on the type of `result` being `{ x : number, y : number }`, which is why we use the final
+long-lived type of `t` rather than its current type state.
+
+### Method declarations
+
+In method declarations `function mt:m`:
+
+* if `mt` has self type parameters `As` and self type `S`,
+* then give the `self` parameter of `mt:m` the type `S [ Xs/As ]` for fresh free types `Xs` (these are quantified in the same way as all other free types).
+
+For example, in the method `Set:add(el)`:
+
+* `Set` has self type parameter `E`, and self type `{ elements : { [E] : boolean }, @metatable SetMT }`,
+* so `self` has type `{ elements : { [X] : boolean }, @metatable SetMT }` when type checking the body of `Set:add(el)`.
+
+At this point, type inference proceeds as usual:
+
+* `el` is given fresh free type `Y`,
+* the statement `self.elements[el] = true` unifies `X` and `Y`, and
+* quantifying results in type `<a>({ elements : { [a] : boolean }, @metatable SetMT }, a) -> ()`.
+
+## Drawbacks
+
+### Partially-constructed objects
+
+Shared self types capture an idiom where tables with metatables have
+all of their fields initialized *before* any methods are called. In
+cases where methods are called before all the fields are initialized,
+this will result in optional types being inferred. For example:
+
+```lua
+  function Point.new()
+    local result = setmetatable({}, Point)
+    result.x = 0
+    print(result:getX())
+    result.y = 0
+    print(result:getY())
+    return result
+  end
+```
+
+Here, the call to `result:getX()` uses the *current* type state of `result`, which is `{ x : number, @metatable PointMT }`.
+Unification will then cause `Point` to consider `y` to be optional:
+
+```lua
+  type PointMT = {
+    new : () -> Point,
+    getX : (Point) -> number,
+    getY : (Point) -> number?,
+    abs : (Point) -> number -- with a type error!
+  }
+  type Point = {
+    x : number,
+    y : number?,
+    @metatable PointMT
+  }
+```
+
+Since `y` has type `number?` rather than `number`, the `abs` method will fail to type check.
+
+As a workaround, developers can declare different self types for different methods:
+
+```lua
+  function Point.getX(self : { x : number }) : number
+    return self.x
+  end
+  function Point.getY(self : { y : number }) : number
+    return self.y
+  end
+```
+
+resulting in:
+
+```lua
+  type PointMT = {
+    new : () -> Point,
+    getX : ({ x : number }) -> number,
+    getY : ({ y : number }) -> number,
+    abs : (Point) -> number -- without a type error!
+  }
+  type Point = {
+    x : number,
+    y : number,
+    @metatable PointMT
+  }
+```
+
+or can switch off type checking `self` by declaring `type Point = any`.
+
+### Methods called on both tables and metatables
+
+This is a similar problem, caused by calling methods directly on
+metatables as well as tables. For example, calling `Point:abs()` will
+result in inferring that both `x` and `y` are optional.
+
+### Classes with multiple constructors
+
+With the current greedy unifier, classes with constructors of different types will fail:
+
+```lua
+ local Foo = {}
+ Foo.__index = Foo
+ function Foo.from(x) return setmetatable({ msg = tostring(x) }, Foo) end
+ function Foo.new() return setmetatable({}, Foo) end
+```
+
+Rather than inferring an optional field, this will report a type error.
+
+## Alternatives
+
+We could use new syntax for declaring self types, rather than using
+the convention that they have the same name as the metatable.
+
+We could do nothing, but at a performance and ergonomics cost.
+
+We could introduce special syntax for classes or records, though this
+doesn't address type checking existing code.