Named function type returns #67

Open
wants to merge 8 commits into
base: master

@dphblox dphblox commented Nov 4, 2024

In alignment with named function type arguments, introduce syntax to describe names of returned values for function types.

Rendered

@hgoldstein hgoldstein left a comment

I'm not certain on the language, but I wanted to call out that these aren't really tuples, they're multiple return values like golang's.

We talked about this in a different channel and my initial impression was that annotating parameter types in functions seemed like the wrong direction. I've softened a bit here, and I am amenable to the idea that this is a reasonable thing to provide for tooling (linting, hover types, inlay hints, etc.).

docs/syntax-named-function-type-returns.md

So, this proposal posits that there is already established precedent for such features, and that users understand how comprehension aids function in Luau today.

A common concern is whether these comprehension aids would mislead people into believing that names are significant when considering type compatibility.

@hgoldstein hgoldstein Nov 8, 2024

IME: the way programmers interact with type systems is generally "I'm going to write code how I want until there are no errors." This is especially true for gradual type systems: they're being applied to code written without static typing in mind (trying to run mypy or Pyre on a legacy codebase can be an exercise in frustration because of this).

If I had to guess: I would expect that someone writes code like:

local function maybeMap<T, U>(elem: T?, f: (item: T) -> (result: U)): U?
  -- ...
end

local function blah(foo: number): (bar: string)
  -- ...
end

local function maybeStringify(x: number?): string?
  return maybeMap(x, blah)
end

... and then realizes later there was a mismatch. Or at least the vast majority of folks will do so.

Author

I'm not sure what mismatch you mean here. I noticed the example wouldn't typecheck because maybeMap should prob return U? rather than U, but that's not got anything to do with return names.

@hgoldstein

> I'm not sure what mismatch you mean here.

The signature of maybeMap states it takes a function f of type (item: T) -> (result: U), but when we invoke it, we do so with blah, whose type is (foo: number) -> (bar: string). The mismatch is in the names of the params / return types.

> I noticed the example wouldn't typecheck because maybeMap should prob return U? rather than U, but that's not got anything to do with return names.

Yeah that has to do with me making a small mistake 😅

Author

@dphblox dphblox Nov 8, 2024

Ah, I see. Sometimes I do this intentionally, where e.g. a generic maybeMap function uses generic naming, but when I define a mapping callback inline, it might use more descriptive names based on local information, e.g.:

local maybeChar = tryGetCharacter()

local maybeHumanoid = maybeMap(maybeChar, function(char)
   return char:FindFirstChildWhichIsA("Humanoid")
end)

So I wouldn't think a discrepancy like that is always a problem.

@hgoldstein

That's my point: unless you have named parameters, people don't try to match the naming.

OCaml has labeled (named) parameters, written with a leading tilde at definition and call sites; the labels are part of the function's type:

(* math.mli *)
(* in a type signature, labels are written without the tilde *)
val divide : dividend:int -> divisor:int -> int
(* math.ml *)
let divide ~dividend ~divisor = dividend / divisor
let halve number = divide ~dividend:number ~divisor:2

There's support for punning, so one can write code like:

let f ast_node = (* do something with an ast_node here *) in
(* `~f` is both the named parameter _and_ the local variable *)
(* assumes a List.map that takes a labeled ~f, e.g. Base's List or the standard ListLabels *)
List.map ~f list_of_ast_nodes

Author

@dphblox dphblox Nov 8, 2024

Right. So I think that starts to drift into somewhat adjacent territory around things like named parameter support (as opposed to today's positional parameters), which is out of scope since that's a much heavier change.

I would think of both argument names and return names as "sensible defaults" which document the type generically, without restricting how they're used downstream - it manifests mostly in LSP features.

It's OK for people to not match the names exactly (though it's certainly curious if they rearrange them, because that's probably a mistake, hence the lints).
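
As a rough illustration of the "sensible defaults" point, here is a minimal sketch using the named function type arguments that Luau already supports. `Mapper`, `double`, and `mapper` are hypothetical names chosen for this example; the sketch assumes strict mode and only shows that argument names in a function type do not affect compatibility.

```luau
--!strict
-- `Mapper` names its argument `item`; the name is documentation only.
type Mapper<T, U> = (item: T) -> U

-- The parameter here is called `n`, not `item`.
local function double(n: number): number
	return n * 2
end

-- Accepted: compatibility depends on the types, not on the names.
local mapper: Mapper<number, number> = double

print(mapper(21)) -- 42
```

Tooling such as hover types can surface `item` as a hint, while the callback is free to pick a more specific name.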

@andyfriesen
Collaborator

So first, I find these examples (especially higher order functions) confusing to read. No other programming language I use names returns in this way and the syntax presented here looks concerningly similar to a table type. To be quite honest, I think a great many of these multi-return functions would be better written as functions that return tables anyway.

Secondly, new syntax is something we can never take back and the language grammar is a fiddly thing, so the bar has to be much higher than "we've done things kinda like this in the past." I'd need to see a much more concrete UX that justifies the extra complexity.

@dphblox
Author

dphblox commented Nov 18, 2024

The original devs' intention was to avoid table types since constructing tables has an allocation cost and this code was being used in very hot paths. If Luau had an optimisation for returning table types, perhaps this would not be an issue, but that also leads down a rabbit hole of edge cases and corner cases.

The value prop of named returns is the same as the value prop for named arguments: it's an incremental syntax change with no grammar incompatibilities or special new inventions, which allows for easier comprehension of non-trivial function types and provides more information for LSP features such as autofill and inlay types. I personally believe there is obvious benefit to this, and many other developers I've shown this to for feedback feel the same, though these are our own opinions and this is clearly not a cut-and-dried issue.

But beyond the UX benefit, there is a UX cost to not doing this: I've seen multiple Luau developers get confused by the asymmetry here. In particular, the developer I was talking to (the conversation that sparked this RFC) was explicitly confused by the fact that arguments could be named on one side, but returns could not be named on the other side. This just became another special case they had to learn and remember.
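
To make the asymmetry concrete, here is a small sketch. The `Divide` type and the names `quotient` and `remainder` are hypothetical, and the commented-out line assumes the multi-value form would follow the `(item: T) -> (result: U)` shape quoted earlier in this thread; it is not valid Luau today.

```luau
--!strict
-- Valid today: arguments in a function type may be named, returns may not.
type Divide = (dividend: number, divisor: number) -> (number, number)

-- Shape this RFC is asking for (assumed form, shown only as a comment):
-- type Divide = (dividend: number, divisor: number) -> (quotient: number, remainder: number)

local divide: Divide = function(dividend, divisor)
	return math.floor(dividend / divisor), dividend % divisor
end

-- The caller has to know from documentation which value is which.
local quotient, remainder = divide(7, 2)
print(quotient, remainder)
```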

I think it's important to think about whether we are viewing Luau through the lens of these people. A common thing that's come up is that "the type system is separate from runtime Luau" but I haven't actually seen people think that way in the wild. To them, Luau is Luau, and the distinction between runtime and analysis time isn't really material.

@Quenty

Quenty commented Nov 19, 2024

Here's my hot take:

To be quite frank, the issue is that tuples are not first-class citizens within Luau. Adding this feature, while elevating them towards first-class status, is mostly about the fact that tuples are allocated on the stack and not the heap, and thus have performance gains.

We have two problems then:

  1. Ergonomics
  2. Performance

Performance is getting in the way of ergonomics, and this proposal is trying to solve for that.

In my opinion, we should try to make the ergonomic way to write code (table returns) performant, especially if the tables are used in such a way that they're only consumed within the stack-frame above or below the function call. We can then consider allocating these small table-equivalents as a tuple/registry instead of as a table.

The problem is this adds additional complexity to our compiler and we must pick a hard edge to do this on. Fortunately, Roblox has access to millions of games running Lua code and so my proposal is we consider a strategic way to study when this sort of stack allocation would be safe in the real world and then pick something simple to execute on.

@dphblox
Author

dphblox commented Nov 19, 2024

That's a good point, there's probably a good number of trivial cases that could deliver value there.

@AxisAngles

AxisAngles commented Nov 19, 2024

I'm no Roblox engineer, but I am an end user of Luau who cares about performance and ergonomics.

I make modules which are intended to be used by the rest of my team (and for myself years down the line). My goal with my physics modules, algebra libraries, spatial query code, etc., is to make something which delivers correct and performant behavior in a way that is reasonably ergonomic.

This code tends to be especially hot, being called hundreds to thousands of times each frame for ballistics, physics, cheat detection, etc.

Quenty's current proposal is that tables which analysis shows are constructed and immediately deconstructed should be optimized. Under that proposal, there is the following extraordinarily common case.

If I choose to return a table with many fields (for example, Sweep returns doesHit, time, dist, posA, normA, posB, normB) for the purpose of ergonomics, and rely on this optimization behavior, what if the caller of this code stores this table somewhere for the later use of 1 or 2 components? What if the caller tries to modify or add a component later? Does this deoptimize? At what level of complexity does this optimization fail? How does the user know? How do I guarantee the user will be calling my code correctly?

Right now, it is common knowledge among mid+ level programmers that tables are slow, and multiple returns are fast; "if they are returning a table, the cost is already paid, I should reuse it." But this assumption would be wrong. There are hidden ergonomics costs associated with this proposal. I would not choose to return a table knowing that it gives callers the opportunity to make the wrong choice unknowingly.

My take is that the distinction between fast-code and slow-code should be super clear. The true solution here I think is an open question.
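
For concreteness, the two return styles being compared might look like the sketch below. `sweepValues`, `sweepTable`, `SweepResult`, and the constant results are hypothetical stand-ins with only three of the fields, not the actual Sweep module.

```luau
--!strict
-- Hypothetical stand-ins for a Sweep-style query; the results are dummy values.
type SweepResult = { doesHit: boolean, time: number, dist: number }

-- Multiple returns: no table is constructed, but callers must remember the order.
local function sweepValues(): (boolean, number, number)
	return true, 0.25, 1.5
end

-- Table return: fields are self-describing, but each call constructs a new table.
local function sweepTable(): SweepResult
	return { doesHit = true, time = 0.25, dist = 1.5 }
end

local doesHit, hitTime, dist = sweepValues()
local result = sweepTable()

-- The case raised above: the caller keeps the table and mutates it later,
-- so the table outlives the call that produced it.
local stored = result
stored.dist = 2
```

Whether an optimizer could still avoid the allocation once `stored` escapes like this is exactly the open question.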

@AxisAngles

AxisAngles commented Nov 19, 2024

A while back, I had the opportunity to work with SlimeVR to make some Quaternion, Vector3, Matrix and Euler Angles libraries in Kotlin. Because there was a clear path to maximizing performance (which was a requirement by SlimeVR), my productivity was easily more than 3x. I was able to write code ergonomically with clear guarantees of how it would be optimized.

I have not had such a high velocity development experience since then.

This is a major problem with Luau in its current state. There is no performant way to make our own objects. There is no performant way to ergonomically return many values. We pay a huge price in productivity when we are constantly faced with the dilemma between paying a 2x complexity cost, or a 5x performance cost, and the associated cost of choosing incorrectly.

Maybe the correct solution is structs (which can have metatables with metamethods applied to them). These would seem to solve multiple issues. The struct syntax has the major benefit of making the user aware of performance guarantees and usage limitations without ambiguity.

@dphblox
Author

dphblox commented Nov 19, 2024

I wonder whether, if frozen tables were more first-class in Luau, we could build some of those same optimisations to avoid some of their expense. My only concern would be whether that's difficult to get right, and how reliably it'd work, but in the abstract, it could have potential. I will stew on this for a bit though, because I want to take on board some of the concerns from here before I rush out to propose another thing.

As much as I attempted to design this RFC to avoid such pits of complexity, if we can prove out that such optimisations are viable, maybe it's worth pursuing (and even if not, perhaps it's worth pursuing anyway, since user-space libraries like Fusion already optimise much better when frozen tables are in use).

@Quenty

Quenty commented Nov 19, 2024

Yeah, I would be ok with an analysis that is:

  1. The return type is frozen, allocated within frame and smaller than X bytes
  2. Therefore upon invocation of this function the return is allocated on the stack
  3. We can then optimize these scenarios

The key is to make this abstraction as invisible as possible, that is, to find the widest scope possible that we can cover with invariants so that reasoning about optimization isn't something we need to do as a user.
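
A minimal sketch of the kind of return value the first condition describes, using the existing `table.freeze` and `table.isfrozen` library functions. `sweepFrozen` and its values are hypothetical, and the stack-allocation optimisation itself is only a proposal in this thread; the sketch shows just the frozen precondition.

```luau
--!strict
-- Hypothetical function; the values are placeholders.
local function sweepFrozen()
	-- table.freeze makes the table read-only and returns it.
	return table.freeze({ doesHit = true, time = 0.25, dist = 1.5 })
end

local result = sweepFrozen()
print(table.isfrozen(result)) -- true

-- Writing to a frozen table raises a runtime error,
-- so callers cannot modify or add fields later.
local ok = pcall(function()
	local untyped: any = result
	untyped.dist = 2
end)
print(ok) -- false
```

Freezing rules out later mutation, but it does not by itself stop a caller from storing the table, so the "allocated within frame" condition would still need its own analysis.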
