Skip to content
This repository has been archived by the owner on Oct 8, 2024. It is now read-only.

Cleanup #164

Closed
bakkot opened this issue Nov 15, 2021 · 16 comments
Closed

Cleanup #164

bakkot opened this issue Nov 15, 2021 · 16 comments
Labels
good follow-on proposal this would be good but doesn't need to be in the first milestone

Comments

@bakkot
Copy link
Collaborator

bakkot commented Nov 15, 2021

I'm splitting this out from #162 because that issue is very focused on AbortController, which I think is an instance of a much more general issue, namely, the need to do cleanup when closing an iterator.

Generators have this functionality built in:

function* g() {
  try {
    let resource = getResource();
    yield* steps(resource);
  } finally {
    releaseResource(resource);
  }
}

for (let step of g()) {
  // ...
  if (condition) {
    break; // calls releaseResource
  }
}

but it's much harder to do this with iterator helpers, because there's no way to hook into the return/throw phase except by explicitly overriding those functions on your iterator instance.

I think we should consider supporting this. For example:

class IteratorHelper {
  #cleanup = null;
  cleanup(thunk) {
    let old = this.#cleanup;
    this.#cleanup = () => {
      old?.call(this);
      thunk.call(this);
    };
    return this;
  }
  
  return(arg) {
    this.#cleanup?.();
    /* call .return on the underlying iterator */
  }
  throw(e) {
    this.#cleanup?.();
    /* call .throw on the underlying iterator */
  }
}

Then you could do

let mapper = makeSomeExpensiveMappingFunction();
let items = iter().map(mapper).cleanup(() => mapper.release());

for (let item of items) {
  if (isTheNeedle(item)) {
    break; // calls `mapper.release()`
  }
}

(Open question: should .map and friends also call cleanup when the underlying iterator is exhausted? Probably yes? If so, this method could reasonably be renamed to finally.)

I'm not attached to this particular solution, but I do think the problem warrants some consideration.

@devsnek
Copy link
Member

devsnek commented Nov 15, 2021

this definitely seems worth discussing but i want to be careful we're not creating an ad-hoc duplicate of https://github.com/tc39/proposal-explicit-resource-management :)

@benjamingr
Copy link
Contributor

This is interesting! Note I may sound critical below but I'm not sure either way.

Note that generators (and async generators) are the only construct in the language using third-state semantics for cancellation/disposal and that it's likely that this is only because of the now defunct cancellable-promises and observable (the old dual one) proposals.

I would be cautious to build on this given the committee has previously shown reluctance to include these semantics in other places (and I recall the champion who specified this @jhusain regretting it).

Regarding the example: you also have a mini-kinda protocol here where you call .release on the mapper and that function now holds state.

With the signal proposal, the code would be similar except there will be no need to write .cleanup(() => mapper.release()) and it would likely "already work" since Node.js APIs typically pass a signal in an options object.

So for example with the signal-passing idea (I am using fetch but this works with any Node or browser new promise returning API):

async function fetcher(url, options) {
  const result = await fetch(url, options);
  return result.json(); // add whatever logic
}

for (let item of iter().map(fetcher)) {
  if (whatever(item)) break;
}

Vs, with the cleanup

async function makeFetcher() {
  const ac = new AbortController();
  async function fetcher(url, options) {
    const result = await fetch(url, { signal: ac.signal });
    return result.json(); // do processing
  }
  fetcher.cleanup = () => controller.abort();
  return fetcher;
}

for (let item of iter().map(makeFetcher())) {
  if (whatever(item)) break;
}

@devsnek

I am definitely in favour of that proposal! I think it's great but that cleanup and cancellation are substantially different and while there are some cases where it's obvious which to pick in some cases it's close.

@bakkot
Copy link
Collaborator Author

bakkot commented Nov 15, 2021

@devsnek

this definitely seems worth discussing but i want to be careful we're not creating an ad-hoc duplicate of https://github.com/tc39/proposal-explicit-resource-management :)

Agreed, but I think they actually complement each other: that proposal is about having syntax within a block for acquiring and releasing resources, relying on the resources to expose a protocol allowing to be released; what I'm talking about here is adding semantics to iterator helpers which would allow them to participate in the existing protocol for releasing iterators.

That proposal already makes iterators disposable by deferring to .return, so there's nothing further which would need to be done to make them work together.


@benjamingr

Note that generators (and async generators) are the only construct in the language using third-state semantics for cancellation/disposal

Sorry - in what sense do generators use third-state semantics for cancellation? If I recall correctly, the third-state proposal was specific to promises, and introduced a "third state" other than "fulfilled" and "rejected"; I'm not seeing the analogy here. Generators have only "done" and "not done", and the only way to observe this is with .next and inspecting the result; calling .return(x) behaves exactly as if the yield on which the generator is currently paused were replaced by a return x;. I'm not sure what the third state here is supposed to be.

that it's likely that this is only because of the now defunct cancellable-promises and observable (the old dual one) proposals

I had always understood it to be done like this to make the existing syntactic try/finally work correctly, which matches what other languages with generators do, like Python (although Python uses GC rather than an explicit return protocol).

With the signal proposal, the code would be similar except there will be no need to write .cleanup(() => mapper.release()) and it would likely "already work" since Node.js APIs typically pass a signal in an options object.

It would only work if mapper uses cancellation tokens. But cancellation tokens only make sense for very particular kinds of resources, i.e., those which represent some ongoing behavior. They aren't really sensible for things like holding on to file handles for a sequence of operations. So I would not expect mapper to make use of that protocol.

@bakkot bakkot mentioned this issue Nov 15, 2021
@benjamingr
Copy link
Contributor

Missed this, just to explain how generators have third state:

function* gen() {
  try {
    yield 1;
    console.log('in try');
    yield 2;
  } catch(e) {
    console.log('in catch!');
  }finally {
    console.log('in finally!');
  }
}

{ // first state
let g = gen();
console.log(g.next()); // value 1 done: false
console.log(g.next()); // "in try", value 2: done: false
console.log(g.next()); // "in finally!, value: undefined, done: true
}


{ // second state
let g = gen();
console.log(g.next()); // value 1 done: false
console.log(g.throw(new Error())); // "in catch!", "in finally!", value: undefined done: true
}


{ // third state
let g = gen();
console.log(g.next()); // value 1 done: false
console.log(g.return()); // "in finally!", value: undefined done: true
}

Note how it's exactly the third-state semantics of cancellable promises in that finally clauses run but not catch ones.

@benjamingr
Copy link
Contributor

Also you might be interested in prior art - for example .NET call the cancellation issue "deep cancellation" and there are interesting discussions about where they use IDisposable vs. CancellationToken (their AbortSignal)

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 29, 2022

Note how it's exactly the third-state semantics of cancellable promises in that finally clauses run but not catch ones.

I'm still not seeing the analogy to cancellable promises, sorry. From the perspective of a consumer of the generator, a generator which is handling return (or which has finished handling it) is not behaving any differently from a generator which has run past its final yield and is cleaning up that way (or which has finished cleaning up). In particular, the call to return does not bubble to any consumers, whereas cancelling a promise (in the third-state proposal) would cause derived promises to also be cancelled.

That was the problem with the third-state proposal: that it would introduce a new state for a promise to be in from the perspective of consumers of the promise, which entails also having some new way of handling that state (possibly a new catch-like syntax, even). None of that is relevant here: from the perspective of consumers, generators are only "done" or "not done".

The relevant difference in the design here is that iterators are "pull" rather than "push" (as Promises are), so even if multiple parties are iterating over the same iterator (which is a very unusual thing to do), they don't need to be immediately notified when it enters the "done" state, nor to be given information about how it entered that state. They just see the next time they pull that there are no items left in the iterator.

(You also don't run into the issue about how consumers of a Promise shouldn't be able to affect each other: because iterators are pull, multiple consumers are necessarily interacting already.)

Also you might be interested in prior art - for example .NET call the cancellation issue "deep cancellation" and there are interesting discussions about where they use IDisposable vs. CancellationToken (their AbortSignal)

Yeah, I've seen some of those discussions. Here, though, the choice seems fairly clear to me: iterators in the language are already using the "disposable" rather than the "cancellable" pattern, so that seems like the obvious pattern to follow with iterator helpers.

@michaelficarra michaelficarra added the good follow-on proposal this would be good but doesn't need to be in the first milestone label Jul 5, 2022
@hax
Copy link
Member

hax commented Jul 6, 2022

I come from #162 , it seems this issue is the right place to continue?

As @benjamingr

It is auto-generated by the iterator

So AsyncIterator.from(urls).map((url, { signal }) => fetch(url, { signal })) is just the shorthand of

const controller = new AbortController()
const {signal} = controller
AsyncIterator.from(urls)
.map((url, { signal }) => fetch(url, { signal }))
.cleanup(() => { controller.abort() }) 

If that, i guess promise should also have auto-generated signal like:

promise.then(v, {signal} => { ... })

because promise.then is very same to map.

@michaelficarra
Copy link
Member

michaelficarra commented Jul 22, 2022

Here's how I would add a provision for cleaning up held resources:

Object.defineProperties(Iterator.prototype, Object.getOwnPropertyDescriptors({
  onComplete: function (cleanup) {
    if (typeof cleanup !== 'function') {
      throw new TypeError;
    }
    let innerIter = this;
    return Object.setPrototypeOf({
      next(arg) {
        let val = innerIter.next(arg);
        if (val.done) {
          cleanup();
        }
        return val;
      },
      return(arg) {
        let val = innerIter.return(arg);
        cleanup();
        return val;
      }
    }, Iterator.prototype);
  },
}));

@bakkot
Copy link
Collaborator Author

bakkot commented Jul 22, 2022

I worry about having onComplete return a new iterator, rather than modifying the existing thing. Contrast onClick, Promise.prototype.finally, etc. Since this bug would manifest as cleanup not happening, it would be very easy to miss if one misused it.


Also, I think exceptions in the cleanup function should be thrown, not silently swallowed.

@michaelficarra
Copy link
Member

@bakkot they are thrown, there is no catch.

@bakkot
Copy link
Collaborator Author

bakkot commented Jul 22, 2022

There's a finally with a return, which will override any completion value from the try, including exceptions. (function(){ try { throw new Error; } finally { return 0; } })() returns 0.

That's the only thing the finally does, in fact. If you want to just propagate errors from cleaup, you don't need a try at all; just call cleanup and then return val.

@michaelficarra
Copy link
Member

Ah yes, you're right. Fix applied.

On the mutation question, I still don't think mutation is appropriate. Iterator.prototype methods should all return new wrapping iterators. Would a different name be sufficient?

@bakkot
Copy link
Collaborator Author

bakkot commented Jul 22, 2022

Looking again, I think cleanup should also happen if the underlying iterator's methods throw (including a throwy done getter) - that's how a syntactic generator with a try { yield* inner } finally { cleanup() } would work.


On the mutation question, I still don't think mutation is appropriate. Iterator.prototype methods should all return new wrapping iterators. Would a different name be sufficient?

A different name would certainly be an improvement, if we're committed to avoiding mutation. Unfortunately the obvious choice of finally is probably out because finally on a Promise is mutating (in addition to returning a new Promise). withCleanup, possibly?

I still think that's a mistake, though: someone who thinks it is mutating when it is not will have a silent and very hard to notice bug, whereas someone who thinks it is not mutating when it is will have a loud bug (because it will return undefined, not an iterator). What is the benefit of making it be not mutating? If the only benefit is consistency, the cost to the developer experience from silent errors is not even close to being worth that benefit.

@bakkot
Copy link
Collaborator Author

bakkot commented Jul 23, 2022

someone who thinks it is mutating when it is not will have a silent and very hard to notice bug, whereas someone who thinks it is not mutating when it is will have a loud bug

I'm reminded that I actually wrote a whole essay about this: https://writing.bakkot.com/least-frustration.

@michaelficarra
Copy link
Member

Closing since cleanup/finalisation doesn't need to be solved as part of this proposal. It will be a good follow-up proposal though.

@hax
Copy link
Member

hax commented Aug 28, 2022

I'm reminded that I actually wrote a whole essay about this: https://writing.bakkot.com/least-frustration.

@bakkot Good article! But I suggest u also add some bad design examples (for example, direct/indirect eval?.(), and recent #[NaN] === #[NaN]). 😂

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
good follow-on proposal this would be good but doesn't need to be in the first milestone
Projects
None yet
Development

No branches or pull requests

5 participants