feat: add some basic telemetry event metrics and support ping payloads #53
Hi! We've been troubleshooting latency issues with our push notifications, which led us to add telemetry to this library so we could turn on detailed logging and gather data to help track down these problems. I'm happy to add docs describing how to use these events if desired. I'm also happy to make `:telemetry` an optional dependency and emit events only when the `:telemetry` module is loaded, but I wasn't sure of the value of that.

I added telemetry events & metrics to the following modules:

`lib/connection.ex`
- Added the telemetry event `[:kadabra, :connection, :stop]`, which reports the duration of the connection. This has been helpful for understanding the lifecycle of both our FCM and APNS connections.

`lib/socket.ex`
- Added the telemetry events `[:kadabra, :socket, :recv_frame]`, `[:kadabra, :socket, :send]`, and `[:kadabra, :socket, :closed]`. I went ahead and included the frame received and the data being sent. I found these useful when trying to understand why our connections were being closed, for example by seeing the `GOAWAY` frames or the messages we sent leading up to them.

`lib/stream.ex`
- Added the telemetry events `[:kadabra, :stream, :start]` and `[:kadabra, :stream, :stop]`, with the duration of the stream's lifetime included. This was useful for gathering metrics on the lifetime of requests, as well as for seeing how many streams were being created.

This change also includes a breaking API change: updates to `ping`. We also wanted to gather metrics on the RTT of pings over these connections. By allowing an optional payload in the ping, we can calculate the RTT from the pong response without having to worry about out-of-order ping responses when pings are sent around the same time. You can still just call `Kadabra.ping(pid)`, but I couldn't think of a way to return the payload when we get the response from the server without always including the ping frame's data in the message sent to the calling process.
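
To illustrate the RTT idea, here is a minimal sketch of how a caller might use the ping payload. An HTTP/2 PING frame carries 8 bytes of opaque data that the peer echoes back, so a monotonic timestamp fits exactly. `PingRtt` and its functions are hypothetical helpers, not part of Kadabra, and the exact `ping` argument and reply message shape are not shown here because they are not spelled out in this description.

```elixir
defmodule PingRtt do
  # Hypothetical helper for illustration only; not part of Kadabra.

  # Encode a monotonic timestamp (microseconds) as an 8-byte ping payload.
  # Monotonic time may be negative, so the integer is encoded as signed.
  def new_payload(now \\ System.monotonic_time(:microsecond)) do
    <<now::signed-integer-size(64)>>
  end

  # Decode the payload echoed back in the pong and return the round-trip
  # time in microseconds. Each pong carries its own send time, so replies
  # that arrive out of order are still matched to the right ping.
  def rtt(<<sent::signed-integer-size(64)>>, now \\ System.monotonic_time(:microsecond)) do
    now - sent
  end
end
```

The caller would pass the result of `PingRtt.new_payload/0` as the ping payload and call `PingRtt.rtt/1` on the payload returned with the pong. No per-ping state needs to be kept in the calling process.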
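
For anyone wanting to try the telemetry events listed above, a handler could be attached roughly as follows. This is a sketch under assumptions: `KadabraTelemetryLogger` is a hypothetical module name, and because the measurement and metadata keys are not listed in this description, the handler simply logs whatever it receives. `:telemetry.attach_many/4` is the standard `:telemetry` call for attaching one handler to several events.

```elixir
defmodule KadabraTelemetryLogger do
  # Hypothetical example module; not part of Kadabra.
  require Logger

  # Event names as listed in this PR description.
  @events [
    [:kadabra, :connection, :stop],
    [:kadabra, :socket, :recv_frame],
    [:kadabra, :socket, :send],
    [:kadabra, :socket, :closed],
    [:kadabra, :stream, :start],
    [:kadabra, :stream, :stop]
  ]

  def events, do: @events

  # Attach a single handler to every event. Requires the :telemetry
  # application to be available at runtime.
  def attach do
    :telemetry.attach_many(
      "kadabra-telemetry-logger",
      @events,
      &__MODULE__.handle_event/4,
      nil
    )
  end

  # Standard :telemetry handler signature: event, measurements, metadata, config.
  def handle_event(event, measurements, metadata, _config) do
    Logger.debug(fn -> format(event, measurements, metadata) end)
  end

  # Build a log line without assuming any particular measurement keys.
  def format(event, measurements, metadata) do
    name = Enum.map_join(event, ".", &Atom.to_string/1)
    "#{name} measurements=#{inspect(measurements)} metadata=#{inspect(metadata)}"
  end
end
```

Calling `KadabraTelemetryLogger.attach()` once at application start would be enough. Note that the socket events include frame and payload data, so logging them at debug level only is probably the safer default.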