
Documentation around 4 -> 5 upgrade; ArgumentError: Start sender first #274

Open
wjessop opened this issue May 2, 2023 · 10 comments

@wjessop

wjessop commented May 2, 2023

I'm unsure what I need to do to make the 4 to 5 upgrade work. I see the README file has some detail:

  • Using a singleton instance of the DogStatsD client instead of creating a new instance whenever you need one; this will let the buffering mechanism flush metrics regularly
  • Or properly disposing of the DogStatsD client instance when it is not needed anymore using the method Datadog::Statsd#close

but I thought we were already using a singleton instance, and I don't think we want to be sprinkling #close around the code. We currently create the instance in an initialiser:

require 'datadog/statsd'

Rails.configuration.statsd = Datadog::Statsd.new(
  tags: {
    env: Rails.env
  },
  disable_telemetry: Rails.env != 'production'
)

then we use it like so:

Rails.configuration.statsd.distribution('elasticsearch.client.response_time', duration, tags: tags)

Deployed on Heroku using Puma, the app gets an error #<ArgumentError: Start sender first> that crashes the Rails (6.1.7.3) process on boot. Is there something else we should be doing?
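
To illustrate what I mean by not wanting to sprinkle #close around, this is roughly the pattern we'd like to avoid (a sketch, not our actual code; duration and tags come from the surrounding instrumentation, as in the call above):

# The pattern we'd rather avoid: a throwaway client created, used and
# closed at every call site instead of one long-lived singleton.
statsd = Datadog::Statsd.new(tags: { env: Rails.env })
statsd.distribution('elasticsearch.client.response_time', duration, tags: tags)
statsd.close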

@wjessop

wjessop commented May 23, 2023

Is there anyone who could take a look at this?

@scottopell

Hi @wjessop

Sorry for the delay, I'm happy to help you figure this out.

What you're seeing appears to be some kind of forking-related error. Could you provide some more details on how your Puma server is configured? Particularly anything related to "Workers", as this appears to be how Puma creates new processes.

Here's my repro attempt if it's helpful to compare (this is just rails new plus a scaffold and a bit of datadog configuration):
https://github.com/scottopell/demo-app-rails6

To give you a bit more insight: when the statsd client is forked (I'm guessing this is the case, since the Rails initializer probably runs once at startup and then Puma forks the process), we should be starting a new sender thread in the new process. It appears that is not happening in your case, which is what leads to this error.
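
To make that concrete, one thing you could try (a rough sketch on my side, reusing the options from your initializer, not a confirmed fix) is re-creating the client in each worker after the fork, in your Puma config:

# Sketch only: rebuild the statsd client inside each forked Puma worker so
# the v5 sender thread is started in that process rather than inherited
# from the pre-fork master. Options are copied from the initializer above.
on_worker_boot do
  require 'datadog/statsd'

  Rails.configuration.statsd = Datadog::Statsd.new(
    tags: { env: Rails.env },
    disable_telemetry: Rails.env != 'production'
  )
end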

@wjessop

wjessop commented May 25, 2023

Here's our current datadog initialiser:

# frozen_string_literal: true

require 'datadog/statsd'

# set up statsd monitoring
unless Rails.configuration.respond_to?(:statsd)
  statsd = Datadog::Statsd.new(
    tags: {
      env: Rails.env
    },
    disable_telemetry: Rails.env != 'production'
  )

  Rails.configuration.statsd = statsd
end

The Puma config:

# frozen_string_literal: true

require 'puma_worker_killer'
require 'barnes'

workers Integer(ENV.fetch('WEB_CONCURRENCY', 1))
threads_count = Integer(ENV.fetch('PUMA_MAX_THREADS', 1))
threads(threads_count, threads_count)

preload_app!

rackup DefaultRackup
environment ENV.fetch('RACK_ENV', 'development')
worker_timeout 15
worker_shutdown_timeout 8

on_worker_boot do
  # Worker specific setup for Rails 4.1+
  # See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
  Makara::Context.release_all
  ActiveRecord::Base.establish_connection

  SemanticLogger.reopen
end

# Puma worker killer. Bandaid for memory issues.
before_fork do
  PumaWorkerKiller.enable_rolling_restart(3600) # Overriding 6 hour default to do it every hour to kill off memory-hungry workers that hang around.
  Barnes.start # see https://devcenter.heroku.com/articles/language-runtime-metrics-ruby
end

Env for the staging environment where we were trying this out:

WEB_CONCURRENCY: 2
PUMA_MAX_THREADS: 5

I figure the same as you: I'm probably missing some config, but I don't know what that would be.

@scottopell

Thanks for the extra details. I'm unfortunately still not able to reproduce this.

I have a few questions that can hopefully help us get to the bottom of this:

  1. Can you confirm that changing only the version of dogstatsd-ruby causes the issue you're seeing? Which version are you upgrading from, and which are you moving to?
  2. Could you also provide your Ruby version and any relevant gem versions (puma, barnes, puma_worker_killer, and anything else that might relate to threads or forking)?
  3. Do both workers emit this error? Or does one continue running okay?
  4. Does this error happen immediately on application startup? Or after some amount of time (maybe the same interval as puma_worker_killer?)
  5. Do you have any logs you can share with a stack trace and any relevant events that happen before this error?

Without a reproduction I'm stuck reading through the code and I'm unfortunately not having much luck figuring out how you'd get into this state.

As a workaround, you could try setting single_thread: true in the statsd constructor, which may help? If it does, that would also help us narrow down what's going on here.
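
For reference, a minimal sketch of that workaround applied to the initializer you shared (same options, just adding the flag) might look like:

# Sketch only: run the client in single-threaded mode so there is no
# background sender thread to restart after Puma forks. The other options
# are copied from the initializer shared earlier in this thread.
require 'datadog/statsd'

Rails.configuration.statsd = Datadog::Statsd.new(
  single_thread: true,
  tags: {
    env: Rails.env
  },
  disable_telemetry: Rails.env != 'production'
)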

@scottopell

ping @wjessop in case you missed the above

@wjessop

wjessop commented Jun 12, 2023

Sorry @scottopell, was on vacation! Will try to take a look at this this week.

@scottopell

@wjessop one more ping if you still need help

@wjessop

wjessop commented Jul 5, 2023

Sorry Scott, lots going on!

Can you confirm that changing only the version of dogstatsd-ruby causes the issue you're seeing? Which version are you upgrading from, and which are you moving to?

Yes, it's only dogstatsd-ruby changing.

Could you also provide your Ruby version and any relevant gem versions (puma, barnes, puma_worker_killer, and anything else that might relate to threads or forking)?

Ruby 2.7.5
barnes (0.0.9)
puma (5.6.4)
puma_worker_killer (0.3.1)
https://github.com/wjessop/oas_agent (master, starts a background reporting thread but doesn't interact with datadog in it)

Do both workers emit this error? Or does one continue running okay?

I don't think anything remains up, but I've only seen one instance of the error in the logs.

Do you have any logs you can share with a stack trace and any relevant events that happen before this error?

There's no stack trace, just the log. I don't have the log output to hand, but I recall it wasn't particularly useful.

As a workaround, you could try setting single_thread: true in the statsd constructor, which may help?

We could try this, but I think we'd still want to solve the underlying error.

I think what would help would be if you could provide a canonical example of what a datadog config looks like when using Puma in pre-forking mode (probably the most common Rails app server and configuration). Saying "Using a singleton instance of the DogStatsD client instead of creating a new instance whenever you need one; this will let the buffering mechanism flush metrics regularly" in the docs is somewhat abstract; a concrete example would be really helpful to compare to what we're currently doing.
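
For instance, something along these lines is the kind of example I mean (my guess at the intended pattern, not something I've verified):

# In config/puma.rb: give each forked worker its own client, and dispose
# of it when the worker shuts down. Unverified sketch, not a recommendation.
on_worker_boot do
  require 'datadog/statsd'
  Rails.configuration.statsd = Datadog::Statsd.new(tags: { env: Rails.env })
end

on_worker_shutdown do
  Rails.configuration.statsd&.close
end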

@wjessop

wjessop commented Sep 20, 2023

Hey Scott, if you've got one, a canonical example of what a datadog config looks like when using Puma in pre-forking mode would be really helpful here.

@wjessop

wjessop commented Nov 6, 2023

@scottopell Any chance you could give an example?
