
Track ecto queue time separately #318

Open · 2 tasks
tombruijn opened this issue Mar 21, 2018 · 23 comments

Comments

@tombruijn (Member) commented Mar 21, 2018

Currently we track the queue time for ecto queries by combining it with the actual query duration here:

total_time = (entry.queue_time || 0) + (entry.query_time || 0) + (entry.decode_time || 0)
duration = trunc(total_time / @nano_seconds)
Transaction.record_event(transaction, "query.ecto", "", entry.query, duration, 1)

This means we lose detail here and are not able to show the queue time separately from the query duration.

An idea might be to create a separate event for the ecto queue duration, e.g. queue.ecto.

- queue.ecto - 100ms (100ms total - 10ms query time = 90ms queue time)
   - query.ecto - 10ms

To do

  • Record Ecto events as separate events
  • Potentially update the extension to not nest Ecto events under the query event, because it's not a sub-event of the query(?)
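
For illustration, the nested shape from the example above could be produced by recording two events instead of one, roughly like this. This is only a sketch: it assumes Transaction.record_event/6 keeps its current signature, that @nano_seconds stays the conversion factor, and that the extension nests the events as described in the to-do; the empty title/body and body-format arguments are assumptions as well.

# Sketch only: the outer queue.ecto event carries the full duration (queue + query +
# decode); the inner query.ecto event carries just the query and decode portion, so
# the difference shows up as queue time.
total_time = (entry.queue_time || 0) + (entry.query_time || 0) + (entry.decode_time || 0)
query_time = (entry.query_time || 0) + (entry.decode_time || 0)

Transaction.record_event(transaction, "queue.ecto", "", "", trunc(total_time / @nano_seconds), 0)
Transaction.record_event(transaction, "query.ecto", "", entry.query, trunc(query_time / @nano_seconds), 1)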
tombruijn changed the title from "🚀 Track ecto queue time separately" to "Track ecto queue time separately" on May 30, 2018
@tombruijn (Member Author)

@jeffkreeftmeijer what do you think? Is adding a separate event here a good solution? Is it small enough for the issue week?

@jeffkreeftmeijer (Member) commented May 30, 2018

Not sure about the implementation (how do we do this in the Ruby integration?), but this should be small enough to do in the issue week, yes.

@tombruijn (Member Author)

Queue duration is not tracked separately in Ruby for ActiveRecord, as ActiveSupport::Notifications doesn't report it separately from what I can tell.

jeffkreeftmeijer self-assigned this on Jun 13, 2018
@jeffkreeftmeijer (Member)

We can split this up into queue, query and decode. I don't like nesting the query event in the queue event (as that's not what's happening); I'd rather do something like this:

  • query.ecto
    • queue.ecto
    • db.ecto
    • decode.ecto

Also, we'd need to nest these events in each other, which we currently don't support using Transaction.record_event/6.

@thijsc (Member) commented Oct 17, 2018

@tombruijn (Member Author)

Let's check if it's technically possible to do this in the agent.

@wesoudshoorn (Member)

@thijsc can you help research if this is possible (with someone involved in Elixir) and if so, implement this in the agent?

@tombruijn (Member Author)

Let's discuss with @jeffkreeftmeijer how we can track this in Elixir; Thijs says it's possible to send this to the extension.

@jeffkreeftmeijer (Member)

This should be a minor change on the Elixir side. Instead of adding all the values together, we'd record a separate event for each of them. We'd then wrap everything in one main query.ecto event (assuming that works on the agent side) to get something that looks like this:

query.ecto
- queue.ecto
- db.ecto
- decode.ecto
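
As a rough sketch of that shape (assuming the Ecto log entry still exposes queue_time, query_time and decode_time, that Transaction.record_event/6 keeps its current signature, and that the agent can group the events as shown; the empty title/body and body-format arguments are assumptions):

# Sketch: one event per Ecto phase, skipping timings the entry doesn't report.
for {name, time} <- [
      {"queue.ecto", entry.queue_time},
      {"db.ecto", entry.query_time},
      {"decode.ecto", entry.decode_time}
    ],
    time != nil do
  Transaction.record_event(transaction, name, "", "", trunc(time / @nano_seconds), 0)
end

# The wrapping query.ecto event keeps the query body and the total duration, as it does today.
total_time = (entry.queue_time || 0) + (entry.query_time || 0) + (entry.decode_time || 0)
Transaction.record_event(transaction, "query.ecto", "", entry.query, trunc(total_time / @nano_seconds), 1)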

@edds commented Jul 20, 2020

Just wondering if there is any progress on this? Not being able to see the database pool queue time really masks potential issues when diagnosing an application slowdown.

@aboxshall

We're also running into situations where this info would be really valuable. Using AppSignal we can see when queries are slow, but it doesn't go into detail about what makes a query slow, and it can also be misleading: is the query execution time actually slow, or is a particular query prone to being queued for a long time?

@shairyar (Member) commented Oct 2, 2020

@jay-meister

Definitely would find this useful

@shairyar (Member)

@andreragsilva

Also here to express how much we want/need this functionality 🙌

As @edds already mentioned, it would help with the diagnosis of our applications. We need to be able to identify bottlenecks with more precision, be it a long-running query or an event that leads to growing queue times.

@nbernardes

We have an app running right now that is suffering a lot from this, and we can't easily diagnose the problems. This issue would definitely help with that! 💪

@marinho10

this would be really useful 🙏

@edds commented May 14, 2021

If it helps, a workaround that we used for this was to manually track queue, idle and query time, which we then graphed on a custom dashboard. It wouldn't help when looking at an individual slow query, but it did let us spot trends when the database was getting clogged up. It looked something like this:

      # Attach a handler for Ecto's [:my_app, :repo, :query] telemetry event.
      :telemetry.attach(
        "my-app-ecto",
        [:my_app, :repo, :query],
        &MyApp.EctoTelemetry.handle_event/4,
        nil
      )

with something along the lines of:

defmodule MyApp.EctoTelemetry do
  # Reports Ecto's queue, idle and query timings to AppSignal as distribution
  # metrics, converting them from native time units to milliseconds.
  def handle_event([:my_app, :repo, :query], measurements, _metadata, _config) do
    case measurements do
      %{queue_time: queue} ->
        Appsignal.add_distribution_value("ecto.queue_time", System.convert_time_unit(queue, :native, :millisecond))

      _ ->
        nil
    end

    case measurements do
      %{idle_time: idle} ->
        Appsignal.add_distribution_value("ecto.idle_time", System.convert_time_unit(idle, :native, :millisecond))

      _ ->
        nil
    end

    case measurements do
      %{query_time: query} ->
        Appsignal.add_distribution_value("ecto.query_time", System.convert_time_unit(query, :native, :millisecond))

      _ ->
        nil
    end
  end
end
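
For what it's worth, the same handler could also be written more compactly by iterating over the measurement keys (a variant sketch, not part of the original workaround; behaviour should be equivalent, assuming the same measurement keys):

defmodule MyApp.EctoTelemetry do
  # Same idea as above: report each available Ecto timing as a distribution
  # metric in milliseconds, skipping measurements that aren't reported.
  def handle_event([:my_app, :repo, :query], measurements, _metadata, _config) do
    for {key, metric} <- [
          queue_time: "ecto.queue_time",
          idle_time: "ecto.idle_time",
          query_time: "ecto.query_time"
        ],
        time = measurements[key] do
      Appsignal.add_distribution_value(metric, System.convert_time_unit(time, :native, :millisecond))
    end

    :ok
  end
end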

@matthewlehner

This would also be extremely helpful for us. We're seeing some random outliers with Ecto queries and are not sure if it's a database resource thing or a connection pool thing (or both). Having this available to us in AppSignal would certainly help us diagnose these kinds of problems a whole lot more quickly.

@shairyar (Member) commented Sep 5, 2023

@thbar commented Sep 5, 2023

@edds many thanks! What you provided will be useful to help us "size" the Ecto pool more easily.
