
Suggested API change #237

Open
ColinDKelley opened this issue Jun 2, 2023 · 26 comments
@ColinDKelley
Contributor

ColinDKelley commented Jun 2, 2023

Problem Statement

Recently I've worked with a team of ~6 developers converting two services from EventMachine + Synchrony to Async. That has gone well, but one detail of the Async API has tripped us all up at first, and in some cases more than once.

The confusion arises from Async { ... } being overloaded:

  1. At the outer edge, Async { ... } starts up the reactor that must surround the concurrent code.
  2. Inside the reactor block, Async { ... } creates a concurrent Async::Task within the reactor.

When writing concurrent code (case 2), it is necessary that it be surrounded by a reactor (case 1). If that is missing, case 2 will be misconstrued as case 1, and it won't be concurrent at all. This is an easy mistake to make, especially when writing tests.

Mitigation

To address this risk, we like to follow Design by Contract, which in this case means that concurrent code (case 2) should ensure that there is a surrounding reactor before it proceeds. Here's how we've done that, crudely:

def process_jobs(jobs)
  Async::Task.current? or raise ArgumentError, "must start up reactor first with surrounding 'Async { ... }'"

  jobs
    .map { Async { _1.process! } }
    .map(&:wait)
end

The wrapper code that creates the reactor looks like:

Async do
  process_jobs(jobs)
end
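To make the contract concrete, here is a runnable toy model of the guard in plain Ruby (no async gem; the global flag and `with_reactor` are stand-ins for `Async::Task.current?` and the outer `Async { ... }`):

```ruby
# Toy model of the design-by-contract guard. A global flag stands in for
# Async::Task.current?; with_reactor stands in for the outer Async { ... }.
$reactor_running = false

def with_reactor
  $reactor_running = true
  yield
ensure
  $reactor_running = false
end

def process_jobs(jobs)
  $reactor_running or raise ArgumentError, "must start up reactor first"
  jobs.map { |job| job.upcase }  # stands in for the concurrent work
end

with_reactor { process_jobs(["a", "b"]) }  # => ["A", "B"]
# process_jobs(["a", "b"])                 # raises ArgumentError: no reactor
```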

Proposal

An ideal solution would separate the overloaded cases so that the contract can be implicitly enforced by the Async gem without clients needing to write manual code each time.

Since case 2 appears much more frequently in code than case 1, let's leave case 2 as-is, but change case 1 to Async.run { ... }. So the wrapper for the above code would read:

Async.run do
  process_jobs(jobs)
end

With this contract, case 2 could automatically raise an exception on this line if it is run without a surrounding reactor:

    .map { Async { _1.process! } }

=> Async::ReactorNotRunning: You must start a reactor first with `Async.run { ... }` surrounding this code.

Migration

We obviously would not want to introduce this proposal as a breaking change. Instead we could phase it in:

Phase 1

Introduce Async.run as a preferred but optional way to create the reactor at the outer level. Update all documentation to show it as the preferred way to create the reactor.

Phase 2

Deprecate Async { ... } being called without a surrounding reactor. In the deprecation warning, indicate that it should be written as Async.run { ... } if it is the outer level that is creating the reactor.

Phase 3

Complete the deprecation: If Async { ... } is called without a surrounding reactor, raise the Async::ReactorNotRunning exception as shown above.

@ioquatix
Member

ioquatix commented Jun 2, 2023

I will think about this, but just as a short term idea, the mitigation can be simplified like this:

def process_jobs(jobs, parent: Async::Task.current)
  jobs
    .map { parent.async { _1.process! } }
    .map(&:wait)
end

This has almost identical behaviour but is the correct way to establish "Run in a parent context".

https://github.com/socketry/async/blob/main/lib/async/task.rb#L236-L238

Maybe we can adjust the error message to be a bit more meaningful. Additionally, maybe we can document this more clearly.

The reason why this is a good idea is because it enables you to control the scope of the child tasks (e.g. using a semaphore, or some other control structure).
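The dependency-injection shape of that pattern can be sketched with plain threads (a toy stand-in; `ThreadParent` here is hypothetical, with `Thread#value` playing the role of `Task#wait`):

```ruby
# Toy stand-in for the parent: pattern using plain threads. In real Async
# code the caller would pass an Async::Task, an Async::Semaphore, or any
# object responding to #async, which is how the caller controls the scope
# of the child tasks.
class ThreadParent
  def async(&block)
    Thread.new(&block)
  end
end

def process_jobs(jobs, parent: ThreadParent.new)
  jobs
    .map { |job| parent.async { job * 10 } }  # job * 10 stands in for job.process!
    .map(&:value)                             # Thread#value ~ Task#wait
end

process_jobs([1, 2, 3])  # => [10, 20, 30]
```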

Regarding your proposal, how does this work:

Async.run do
  Async.run do # (1) Is this an error?
  end

  Async do # (2) This isn't an error?
  end
end

Async do # (3) This is an error?
end

Sync do # (4) What about this?
end

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 2, 2023

Regarding your proposal, how does this work

  Async.run do # (1) Is this an error?
  end

Haha, I was going to ask you the same thing! Is it supported today to start a reactor nested inside another one? Whatever the answer, we can support it.

The nice thing about not overloading is that we can be precise in either allowing it or disallowing it.

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 2, 2023

Async do # (3) This is an error?
end

Yes, that is the error we want to catch (after Phase 3). It represents code that wants to be concurrent--and assumes that its caller has spun up a surrounding reactor--but the caller forgot to, or didn't understand that they needed to. This is an easy mistake to make, especially in tests.

@ColinDKelley
Contributor Author

Sync do # (4) What about this?
end

I'm a bit fuzzy on Sync, but if it needs a surrounding reactor, this would raise the same exception since there isn't one.

@ioquatix
Member

ioquatix commented Jun 2, 2023

Sync do
  # It creates a reactor if required, otherwise it just executes the block.
end

The point of Sync is in places where you want a reactor, but don't need concurrency. In practice, most top level usage of Async is better written as Sync. It means synchronous execution with a reactor / event loop. I couldn't think of anything better and it seemed consistent with the intent.

@trevorturk
Contributor

Apologies because I think I'm misunderstanding the issue as initially described, but would using a top level Sync instead of Async produce the desired behavior? If not, would you mind trying to explain a bit more so I can try to understand more fully?

@ioquatix
Member

ioquatix commented Jun 2, 2023

I think a top level Sync solves the problem, assuming the user is aware that it's required.

If they don't know, then the problem is, some internal Async block might not execute as desired - instead in that case, the request here is that it fails with an exception.

As I outlined on the linked PR, I don't think it should be an error - if the user calls the top level function, probably it should have a top level Sync block... which avoids the error. But I don't know if this is an acceptable solution or not.

i.e. your proposal is, "Is this acceptable?":

def process_jobs(jobs)
  Sync do
    jobs
      .map { Async { _1.process! } }
      .map(&:wait)
  end
end

And I wonder if it is or not.

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 2, 2023

I just read up on Sync and found these two salient sentences:

This method will run your block in the current event loop if one exists, or create an event loop if not.
In other words, Sync{...} is very similar in behaviour to Async{...}.wait.

These tell me that Sync { ... } has the same overloading as Async { ... }... with the addition of the implied .wait (but only in case 2!).

So my proposal logically extends here. The quoted sentences above would simplify to:

This method will run your block in the current reactor (event loop) and .wait on the result.
If there is no current reactor, Async::ReactorNotRunning: You must start a reactor first with 'Async.run { ... }' surrounding this code. will be raised.

@ColinDKelley
Contributor Author

BTW I spotted a mistake in my initial message for the exception. I had omitted the .run. I corrected it to:

Async::ReactorNotRunning: You must start a reactor first with Async.run { ... } surrounding this code.

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 2, 2023

@ioquatix

The point of Sync is in places where you want a reactor, but don't need concurrency. In practice, most top level usage of Async is better written as Sync. It means synchronous execution with a reactor / event loop.

To test my understanding:
Currently, at the top level, Async { ... } and Sync { ... } are identical, aren't they? Both spin up a reactor and block until it is finished?

I find the "don't need concurrency" point confusing. Both offer concurrency inside their block, right? Concurrency with sibling blocks depends on what is enclosing this block and sibling blocks. Which gets us back to the importance, IMO, of using distinct APIs for these cases: spinning up a reactor vs. writing concurrent code inside a reactor.

@ColinDKelley
Contributor Author

@ioquatix Any thoughts on the above? ^^^

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 5, 2023

@trevorturk Now that I understand Sync semantics, I will try to restate the problem to include it. Does this help?

The confusion arises from Async/Sync { ... } each being overloaded:

  1. At the outer edge, Async/Sync { ... } starts up the reactor that must surround concurrent code. (There is no difference between Async and Sync here.)
  2. Inside the reactor block, Async { ... } creates a concurrent Async::Task within the reactor. Sync { ... } does the same, plus it adds an implied .wait at the end.

When writing concurrent code (case 2), it is necessary that it be surrounded by a reactor (case 1). If that is missing, case 2 will be misconstrued as case 1, and it won't be concurrent at all. This is an easy mistake to make, especially when writing tests.

To address this risk, we like to follow Design by Contract, which in this case means that concurrent code (case 2) should ensure that there is a surrounding reactor before it proceeds.
...

Or, here's another way to restate the problem at a high level:

  • To run concurrent code with async, it is necessary to create a surrounding reactor in advance of creating concurrent tasks.
  • If you try to do the latter without doing the former, your code will not run concurrently as you intended. (Instead, it will run sequentially.)
  • We can fix this by separating concerns: the API to create the surrounding reactor can be different from the API that creates concurrent tasks. This will simplify usage. Most importantly, it will allow us to follow Design by Contract: we can raise a helpful exception if someone tries to do the latter but didn't do the former.
  • I noticed that this exactly parallels how EventMachine.next_tick et al. will raise a helpful exception if a surrounding EventMachine.run has not been called at that point to start up the reactor. That was the inspiration to suggest Async.run as the API for starting up the reactor.

@trevorturk
Contributor

Thank you for the detailed explanation!

First, I think this helps clarify Sync for me, and makes me want to refactor my code to avoid Sync and instead use Async {}.wait simply because that might help my brain make better sense of things. (You're creating a reactor, and then waiting at the end of the block before continuing on... there's nothing much special about Sync... but then I see that Samuel is suggesting otherwise, so I'm not quite settled here myself...)

Anyway, putting aside Async vs Sync, I did a simple test which I believe illustrates the issue you're pointing out in a way that might be helpful to others... or at least it helps me see the problem more clearly:

Async { puts "starting 1"; sleep 1; puts "1" }
Async { puts "starting 2"; sleep 2; puts "2" }

...produces:

starting 1
1
starting 2
2

..whereas:

Async do
  Async { puts "starting 1"; sleep 1; puts "1" }
  Async { puts "starting 2"; sleep 2; puts "2" }
end

...produces:

starting 1
starting 2
1
2

So, the issue is that, in the first example, you're not running the tasks concurrently and there is no warning/error, so it's an easy mistake to make.

If I'm understanding this correctly, I wonder if logging a warning might be enough... or perhaps having a global "strict" setting which would turn those warnings into errors? I see you're discussing more in-depth and perhaps better solutions here, but maybe there's room for a less controversial incremental change?

(I don't have a strong opinion here, I'm just trying to follow along, and hopefully this might be helpful in your discussion, and/or for future people stumbling across this issue!)
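One possible shape for that opt-in "strict" config (entirely hypothetical; none of these names exist in the async gem):

```ruby
# Hypothetical sketch of an opt-in "strict" mode. All names here are
# invented for illustration; the real gem has no such config or hook.
module ToyAsync
  class ReactorNotRunning < StandardError; end

  class << self
    attr_accessor :strict  # false => log a warning; true => raise
  end
  self.strict = false

  def self.check_reactor!(reactor_running)
    return if reactor_running
    message = "no surrounding reactor; tasks will not run concurrently"
    raise ReactorNotRunning, message if strict
    warn "Async: #{message}"
  end
end

ToyAsync.check_reactor!(true)     # no-op inside a reactor
ToyAsync.strict = true
# ToyAsync.check_reactor!(false)  # would raise ToyAsync::ReactorNotRunning
```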

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 5, 2023

Thanks @trevorturk! That's a perfect illustration. It's exactly the trap that everyone on my team has stepped into at different times.

It's especially easy to have happen with specs. Let's suppose your code were in a method:

def sleep_1_and_2
  Async { puts "starting 1"; sleep 1; puts "1" }
  Async { puts "starting 2"; sleep 2; puts "2" }
end

The full program might correctly start the reactor around it:

Async do
  ...
  sleep_1_and_2
  ...
end

But the specs are often testing the pieces as independently as possible:

RSpec.describe 'sleep_1_and_2' do
  it 'calls sleep twice' do
    expect(Kernel).to receive(:sleep).with(1)
    expect(Kernel).to receive(:sleep).with(2)
    sleep_1_and_2
  end
end

The above spec will pass, but the code is not actually running concurrently as designed!

In my Proposal, the above mistake would cause a helpful exception to be raised:

Async::ReactorNotRunning: You must start a reactor first with `Async.run { ... }` surrounding this code.

and that would explain to us exactly what we needed to change to honor the contract:

RSpec.describe 'sleep_1_and_2' do
  it 'calls sleep twice' do
    expect(Kernel).to receive(:sleep).with(1)
    expect(Kernel).to receive(:sleep).with(2)
    Async.run do
      sleep_1_and_2
    end 
  end
end

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 5, 2023

Or, given that the contract of .run is "it starts up a reactor and runs it enclosing the given block", it might make a lot of sense to be more explicit:

Async.run_reactor do
  sleep_1_and_2
end

or

Async.with_reactor do
  sleep_1_and_2
end

or other variation.

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 5, 2023

and makes me want to refactor my code to avoid Sync and instead use Async {}.wait simply because that might help my brain make better sense of things.

Yes, that's where I am too. I don't need a shorthand for .wait. The .wait method call is plenty short enough and it's intention-revealing.

Also, I think it is best practice for any gem N to keep all of its methods inside the N namespace. The Kernel method Async is a bit surprising, but I believe there's precedent for pairing Kernel methods up with their namespace of the same name. I don't think it is best practice for the Async gem to have a Kernel method Sync, since that is outside its namespace. When I first saw Sync, I assumed there was a sync gem (or Ruby library) and went looking for it.

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 5, 2023

I realized while discussing with my team that this documentation for Async { ... } actually could be clarified to highlight the problem.

Currently it says

Run the given block of code in a task, asynchronously, creating a reactor if necessary.

To be more accurate, it could say

If there is a surrounding reactor, run the given block of code in a task, asynchronously with all other tasks surrounded by the reactor. If there is no surrounding reactor, create a reactor around the task and asynchronously run any sub-tasks that are created in the block, if any.

The difference in the scope of the reactor is why it's so easy to write concurrent code then accidentally run it non-concurrently.

The fix I'm proposing would simplify the semantics to

Run the given block of code in a task, asynchronously with all other tasks surrounded by a reactor. If there is no surrounding reactor, raise

Async::ReactorNotRunning: You must start a reactor first with `Async.run { ... }` surrounding this code.

which will then lead the developer to decide exactly the reactor boundary they want for concurrency.

@trevorturk
Contributor

trevorturk commented Jun 6, 2023

I'm happy to hear I'm understanding the issue correctly now, and I hope that example helps clarify for others!

I think we're discussing a few issues (which might be best converted into GitHub Discussions?) but I'll try to summarize anyway:

  1. Consider deprecating Sync in favor of Async{}.wait

FWIW I'd be in favor of this, just because Sync confused me a bit until this discussion. However, it would be a pretty major API change and I'm sure Samuel considered it when introducing both Kernel methods, so I don't know if it's worth spending much time on. Perhaps we could add some documentation to explain a bit more in depth and to emphasize the recommended pattern?

As an aside, I believe I read at some point that Samuel was invited to put the Async functionality into Ruby Core but decided to keep it in a gem in order to allow for other developers to potentially test different approaches? I can't remember where I heard that (or if it's true) but I'm bringing it up because I think it may explain Samuel favoring Kernel methods here.

  2. Consider adding a new required Async.run method (or some other name like .run_reactor) which would raise an Async::ReactorNotRunning error in an effort to warn developers if they're unknowingly "doing it wrong" and thinking they have concurrency when they actually don't. (This is the solution you're proposing in this PR, but there are some open questions, like if an error should be raised on nested .run calls etc.)

  3. Consider adding new warnings and/or opt-in exceptions which attempt to provide the same safety, but without introducing Async.run etc and leaving Async overloaded. (This is the idea I'm floating.)

To my mind, if we're considering adding Async.run, I might be tempted to consider adding Async.task or something along those lines. Then, you'd deprecate the un-decorated Async{} method, in an effort to be super clear about everything, and to stop "overloading". For example, perhaps Async.reactor { Async.task { sleep(1) } }. However, then you'd have more verbosity, and obviously a big API change.

At least for today, I think I understand what you're trying to accomplish, and I think it's a noble effort and a good idea. I'm not sure I think it's worth breaking the API and making things more verbose, though, and I still wonder if an optional warning/exception might serve the purpose, if we could set aside the "overloading" issue. (I think the "overloading" issue is valid and worth discussing, I just don't know how big of an issue it is.)

I don't think this will be appealing, but FWIW I wanted to point out that the Alba gem (which I'm a big fan of) uses global configs that you might put in an initializer, for example here: https://github.com/okuramasafumi/alba/#encoder-configuration -- this is the kind of thing I was imagining when thinking of an opt-in config, like Async.require_reactor! or something. You could imagine introducing a config option like this where it would be disabled until the next major version, for example, if you wanted to move cautiously.

Sorry for the long post, and I hope I'm not derailing things too badly. I don't have any strong feelings about this, honestly, I just wanted to air out some ideas and figured why not! 😄

@ioquatix
Member

ioquatix commented Jun 6, 2023

Sync doesn't do anything except call yield when already running in a block. It might be semantically very similar to Async{...}.wait but it's far more efficient. To be honest, the name might not be all that great, but it's like the opposite of Async.
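That behaviour can be modelled in a few lines of plain Ruby (a conceptual toy, not the real implementation; the global flag stands in for "an event loop is running"):

```ruby
# Conceptual model of Sync: if already inside a reactor, just yield;
# otherwise create the reactor around the block, run it, and wait.
$loop_running = false

def toy_sync
  if $loop_running
    yield                   # case 2: reuse the running event loop
  else
    $loop_running = true    # case 1: create the event loop around the block
    begin
      yield
    ensure
      $loop_running = false
    end
  end
end

# Nesting creates the loop exactly once, at the outermost call:
toy_sync { toy_sync { toy_sync { :done } } }  # => :done
```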

Regarding this example:

Async { puts "starting 1"; sleep 1; puts "1" }
Async { puts "starting 2"; sleep 2; puts "2" }

I agree, the way it's explained in this issue, this could be considered confusing. However, there is no expectation that two async blocks run concurrently together. Async{...} only relates to making the internal part of the block asynchronous. From the outside, it still works sequentially. This is actually a design goal of Async, i.e.

t1 = Async { puts "starting 1"; sleep 1; puts "1" }
t2 = Async { puts "starting 2"; sleep 2; puts "2" }

t1.wait
t2.wait

This code has the same result no matter whether it's run in an event loop or not. However, those tasks do have side effects and that's what's showing up differently.

If what you want is for those tasks to run asynchronously you should always mark the outer part, e.g.

def sleep_1_and_2
  Sync do
    Async { puts "starting 1"; sleep 1; puts "1" }
    Async { puts "starting 2"; sleep 2; puts "2" }
  end
end

or

def sleep_1_and_2(parent: Async::Task.current)
  parent.async { puts "starting 1"; sleep 1; puts "1" }
  parent.async { puts "starting 2"; sleep 2; puts "2" }
end

Regarding the proposed implementation:

Async.run do
  process_jobs(jobs)
end

To me this is a bit ugly - why should the user care about whether process_jobs is using Async, or Threads, or Ractors, etc. It's leaking the internal implementation. I would far prefer to see:

def process_jobs(jobs)
  jobs.map do |job|
    Thread.new{job.process}
  end.map(&:value)
end

values = process_jobs(jobs)

That way, the Async implementation can be retroactively applied to existing code without changing the caller. If the caller is also updated to understand Async, they can then take advantage of it. i.e.

def process_jobs(jobs)
  Sync do |task|
    jobs.map do |job|
      task.async{job.process}
    end.map(&:wait)
  end
end

values = process_jobs(jobs)

Later on, if someone wants to process several batches of jobs:

Sync do |task|
  batches.map do |jobs|
    task.async do
      values = process_jobs(jobs)
      puts "Finished batch: #{values}"
    end
  end
end

Your proposal makes this kind of retroactive application of Async very hard. I would suggest you think of Sync -> Async as a form of (Reduce) -> (Map): the outer Sync forms a loose contract around a synchronisation of internal asynchronous work. Although in practice it's up to you to enforce what is actually synchronised (i.e. calling wait on child tasks). Async::Barrier is an explicit form of a synchronisation step. We could probably expand on that interface a bit, e.g.

def process_jobs(jobs)
  Async::Barrier do |barrier|
    jobs.map do |job|
      barrier.async{job.process}
    end.map(&:wait)
  end # implicit stop / kill of jobs that leak here.
end

The fiber scheduler Fiber.schedule has a different design, Fiber.schedule{...} is an error if there is no fiber scheduler set. That is because there is no default scheduler and no way to know what a default scheduler might be (we could introduce this concept later, e.g. Fiber.default_scheduler = Async::Scheduler). I think it's okay, but it means when you have code that wants to use the fiber scheduler, you have to explicitly declare the scheduler. To me, this is a bit cumbersome and hard to compose, i.e. it's hard to write code that uses concurrency internally without exposing it. I encourage you to try and write something which uses Fiber.schedule which is capable of hiding the scheduler. You will need to have very clunky code, i.e. if the reactor does not exist, create it:

def fetch_urls(urls)
  unless Fiber.scheduler
    Fiber.set_scheduler(Scheduler.new)
    begin
      return fetch_urls(urls)
    ensure
      Fiber.set_scheduler(nil)
    end
  end

  urls.each do |url|
    Fiber.schedule{Net::HTTP.get(url)}
  end
end

Note that Async very early on did have something like the interface you suggested:

https://github.com/socketry/async/blob/v1.10.0/lib/async.rb#L25-L30

Async.run do |task|
  task.async # the only way to create child tasks
end

The initial design did not have Async or Sync and the ONLY way to create nested tasks was to use Async::Task#async. The fact we use Async so prolifically now could be seen as an anti-pattern (I personally try to avoid it).

I personally found Async.run clunky, so I switched it to Async. It was also a bit of a cheeky design to use a method in Kernel. I then asked myself, why can't I have Async inside Async. It seemed natural - so I added it. I then found that there was a very common pattern of Async{...}.wait which I wanted to optimise for, so that became Kernel::Sync which is also a bit cheeky. The fact that these methods are defined in separate files, e.g. lib/kernel/async.rb and lib/kernel/sync.rb is because I don't consider them to be part of the core interface. They are (incredibly) convenient helpers at best.

Bare Async is convenient, but as you found out, it has semantics that might not be appropriate for all cases. In the same way that unbounded creation of threads can be a problem, unbounded creation of fibers can be a problem. If anything, we could change Async so that it can't create child tasks, and force users to write Async::Task.current.async{...} - but that's also not compatible with existing code and pretty clunky - it's kind of the lowest effort interface to get something working, but honestly, in most cases, you want a more advanced form of concurrency (like a barrier, semaphore, etc). What would make sense to me is to provide safer high level constructs, like async_each, async_map and so on. In fact, those already exist but we could probably have a safer default implementation (i.e. a semaphore with reasonable defaults).

@ColinDKelley
Contributor Author

ColinDKelley commented Jun 6, 2023

This code has the same result no matter whether it's run in an event loop or not. However, those tasks do have side effects and that's what's showing up differently.

I'd rather not say it "has the same result" if it runs serially rather than concurrently. The Async client code I care about is code that needs to run concurrently. For example, we have a metrics monitoring service that polls every minute for hundreds of metrics, each served from an http endpoint that takes 2-10 seconds to return. If they run serially we do not "have the same result", because we won't be able to keep up. That's why we were using EventMachine + Synchrony, and now Async.

Regarding the proposed implementation:

Async.run do
  process_jobs(jobs)
end

To me this is a bit ugly - why should the user care about whether process_jobs is using Async, ...

This is a contract question.

  1. process_jobs needs to be concurrent. It will not have the same results if run serially, because it won't meet its latency requirements.
  2. process_jobs chose to use Async as its concurrency mechanism rather than Thread or Ractor etc.
  3. Methods that use Async need to have a Reactor run around them.

Let me stop there and see if I have (3) right. You show this example:

def process_jobs(jobs)
  Sync do |task|
    jobs.map do |job|
      task.async{job.process}
    end.map(&:wait)
  end
end

In the event that no Reactor is run around process_jobs, the Sync do will spin one up. But it will also then create a new Async::Task, which we don't need. I've been assuming we should skip this and hence have the Reactor created outside, by the caller. But would you argue that instead every method that is built on Async have such a wrapper just inside the method body, to handle the possibility that it is called with no enclosing Reactor, and don't worry about the wasted Async::Task?

@ioquatix
Member

ioquatix commented Jun 6, 2023

In the event that no Reactor is run around process_jobs, the Sync do will spin one up. But it will also then create a new Async::Task, which we don't need.

You can't avoid having at least one task, whether it's in this method or outside. However, if it's already in an event loop, it won't create a task.

But would you argue that instead every method that is built on Async have such a wrapper just inside the method body, to handle the possibility that it is called with no enclosing Reactor, and don't worry about the wasted Async::Task?

The Sync implementation does not create a task if it's already running in an event loop.

# Only one task is created, the root one, along with the event loop:
Sync{Sync{Sync{sleep 1}}}

# The same, only one root task created:
Async{Sync{Sync{sleep 1}}}

I'd rather not say it "has the same result" if it runs serially rather than concurrently.

Fair enough, but in my mind, concurrency is never guaranteed; some event loop design may run everything sequentially and it should, in theory, still work correctly, even if the actual wall clock performance is bad.

There are a multitude of different design decisions in schedulers, like optimistic (immediately start tasks) vs pessimistic (schedule tasks to start on the next iteration), throughput vs latency scheduling (prefer a single task running for a long time, or prioritise switching between tasks), etc. Over time, more and more blocking operations will be supported by the fiber scheduler, e.g. fallocate, accept, madvise, close to name a few - and these will increase the level of concurrency over what we have today. But not all systems will support them, so depending on underlying implementation details you will have more or less sequential vs concurrent execution.

In fact, some degenerate testing systems can deliberately limit fiber scheduling to detect race conditions and deadlocks, by invoking every possible task order and interleaving of every non-blocking operation. I believe the sqlite3 test suite does something like this for interrupting system calls.

For example, Async could have a configuration option to limit the number of total fibers - your computer doesn't have unlimited resources and eventually your ability to create new fibers will be exhausted. Code like this could be considered a ticking time bomb:

def process_jobs(jobs)
  Async::Task.current? or raise ArgumentError, "must start up reactor first with surrounding 'Async { ... }'"

  jobs
    .map { Async { _1.process! } }
    .map(&:wait)
end

# Process hung for 1 hour, now there are 1 million jobs:
process_jobs(job_server.all_pending_jobs)

It can be very tricky to write correct, general code that uses a semaphore without causing deadlocks, especially in systems with constrained resources. I suppose my point is, while the goal of Async is to provide concurrency, it won't always be possible and you might encounter system limitations or resource limitations.
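The bounded-concurrency idea can be sketched with plain threads, using a SizedQueue as a counting semaphore (a toy stand-in; in the async gem, Async::Semaphore plays the analogous role for fibers):

```ruby
# A bounded map: at most `limit` jobs in flight at once. SizedQueue#push
# blocks when the queue is full, so it acts as a counting semaphore and
# prevents the "1 million jobs => 1 million tasks" time bomb.
def bounded_map(items, limit:)
  permits = SizedQueue.new(limit)
  items.map { |item|
    permits.push(true)        # acquire a permit (blocks at the limit)
    Thread.new do
      begin
        yield item
      ensure
        permits.pop           # release the permit when the job finishes
      end
    end
  }.map(&:value)
end

bounded_map([1, 2, 3, 4], limit: 2) { |n| n + 1 }  # => [2, 3, 4, 5]
```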

process_jobs needs to be concurrent. It will not have the same results if run serially, because it won't meet its latency requirements.

Then I don't see why this should be the responsibility of the caller, because if that's true, invariably someone will eventually forget and the performance will be bad. In my proposed Sync usage, it always just works, but in your proposed implementation, it raises an exception. If there is a clear and obvious way to make it always work, I don't see why we wouldn't just do that, it seems easier to me for consumers of the interface.

We can make this really easily, i.e. in the async-await gem you can do this:

class MyJobProcessor
  include Async::Await

  sync def process_jobs(jobs)
    # ...
  end
end

The purpose of the async-await gem is partly as a joke, but also to expose a bunch of standard patterns that help us implement robust and reliable concurrent code.

@emiltin
Contributor

emiltin commented Jun 30, 2023

If there was always a reactor, Async {} could be used anywhere with consistent behaviour, without having to worry about whether it's inside a reactor or not. So the two examples:

Async { puts "starting 1"; sleep 1; puts "1" }
Async { puts "starting 2"; sleep 2; puts "2" }

Async do
  Async { puts "starting 1"; sleep 1; puts "1" }
  Async { puts "starting 2"; sleep 2; puts "2" }
end

would both output:

starting 1
starting 2
1
2

I agree that having to use something like Async.run to explicitly create the reactor would make it much harder to compose code. For example, code in a gem that uses Async would have to either create the reactor, in which case you cannot use it inside code that already has a reactor, or the caller would be responsible for creating the reactor, which seems odd.

Could there always be a reactor, whenever someone does require 'async'?

@ioquatix
Member

ioquatix commented Jun 30, 2023

Could there always be a reactor, whenever someone does require 'async'?

In short, unfortunately this is impossible. However, we could introduce something like aruby, e.g.

#!/usr/bin/env aruby

# already in a reactor

Such a design is feasible, however I'm not sure of its value.

I do use the following alias which is similar:

alias arb "ruby -rasync -e 'Async{|task| binding.irb; task.stop}'"

The async-await Ruby gem is more oriented towards exploration; it might be a candidate for inclusion.

@emiltin
Contributor

emiltin commented Jun 30, 2023

In short, unfortunately this is impossible. However, we could introduce something like aruby, e.g.

Hmm. Can Ruby be updated to make it possible...? Easy for me to ask of course :-)

A separate #!/usr/bin/env aruby is a workaround, but it's not so convenient I guess.

@ioquatix
Member

ioquatix commented Jun 30, 2023

Essentially what you are asking is for the Ruby interpreter's user code to be run inside an async block. It could change too much behaviour.

Rather than changing Ruby, it's better to write code that works correctly regardless of whether it's used in an event loop (fiber scheduler).

IMHO, this is best for everyone, because existing application servers like Sidekiq, Unicorn and Puma can continue to work correctly, while your application gains concurrency.

My goal has always been for asynchronous execution to be transparent to existing sequential code. It's true this can't be achieved for the top level Ruby script without some kind of aruby, i.e. Ruby by default is not event driven (yet). The fiber scheduler does not provide the same primitives as Async, and it's harder to compose. However, you can largely achieve what you want provided you require it, e.g.

# async/default_scheduler.rb
Fiber.set_scheduler(Async::Scheduler.new)

# user_code.rb
require 'async/default_scheduler'

Fiber.schedule do
  # ... etc
end

Something like that is possible, but you lose a lot of the nice scope that is provided by Async{...}. I think you'd find in practice it's much less convenient when implementing libraries that have to deal with concurrency in a way that's easy for users to depend on.

@emiltin
Contributor

emiltin commented Jul 6, 2023

Good points, thank you for taking the time to reply.
