Reporting Real-time Performance of Channels
Go ships with great tools for runtime performance monitoring. However, for one of the projects I am working on I needed tailored metrics around three channels that do the primary communication between the larger components of the processing service.
I wanted to be able to print realtime stats on movement and blocking of these channels only each minute. So I created ChannelPerf
:
https://gist.github.com/elliotchance/7db9eb40a935f5526b62f0bcf77b7d2c
How does it work?
For the most part it is a drop in replacement-ish with how you would normally use a channel:
// ch := make(chan interface{}, 1)
ch := NewChannelPerf(1, "Some name")
// ch <- "foo"
ch.Send("foo")
// next, ok := <-ch
next, ok := ch.Receive()
// close(ch)
ch.Close()
// for next := range ch {
// fmt.Println(next)
// }
for {
next, ok := ch.Receive()
if !ok {
break
}
fmt.Println(next)
}
The name (second parameter of NewChannelPerf
) is used to label statistics with ch.String()
:
fmt.Println(ch)
Displays something like:
Some name (clock: 1.387381ms, sent: 3 in 1.745µs, received: 3 in 1.268469ms)
clock
is how long it has been tracking statistics. Statistics can be reset withch.Reset()
sent
is the number of messages successfully sent to the channel, and the total time waiting (blocking) to do so. If the channel is full and goroutine(s) are blocked trying to send to the channel, this time will increase.received
is the number of messages successfully received from the channel (unrelated to how many messages may be available), and how long it was waiting on receiving. If the channel is empty and goroutine(s) are blocked waiting then this time will increase.- One important thing to note is that the
sent
andreceived
are total times, not actual times. For example, if there are 3 goroutines blocked on sending, once sent the time will be the addition of all of these. If you only have one goroutine on either side of the channel that’s easy. However, you might have to make some guesstimates when you have multiple goroutines on either or both sides.
Logging Metrics for Multiple Goroutines
I did say at the top I have three channels I’d like to track. To make this easier you can use ChannelPerfGroup
:
channelPerfGroup := NewChannelPerfGroup()
ch1 := channelPerfGroup.NewChannelPerf(1, "CH1")
ch2 := channelPerfGroup.NewChannelPerf(1, "CH2")
ChannelPerfGroup
has an aggregate String()
that returns a single line, but it also provides something more useful… a ticker:
channelPerfGroup.StartTicker(time.Millsecond, true, func() {
fmt.Println(channelPerfGroup)
})
Every millisecond it will invoke that func — which I print the metrics. This is just for demonstration, you probably want it much less frequent than that. The true
will reset the stats (false
would allow infinitely accumulating stats).
Here is the first ~2 milliseconds:
CH1 (clock: 1.387755ms, sent: 3 in 9.705µs, received: 3 in 485.072µs); CH1 (clock: 1.387381ms, sent: 3 in 1.745µs, received: 3 in 1.268469ms)
CH1 (clock: 745.583µs, sent: 0 in 0s, received: 0 in 0s); CH2 (clock: 751.181µs, sent: 0 in 0s, received: 0 in 0s)
Caveats
- This is not performance free. However, any runtime performance monitoring will incur some overhead. I haven’t investigated the exact overhead, but it was designed with performance in mind.
- The channel uses an
interface{}
— you can customize this to whatever type you need. In my case I used it for several different types so I just perform the type cast after theReceive
. - The
Send
/Receive
counts and timers are not updated until that action completes. So one or more inflight blocking operations will not show in the stats when you print them. If you have aChannelPerf
that reportsreceive: 0 in 0s
it’s likely it was just waiting the whole period but has not yet updated the stats. - There may be some overlap with timings on operations before and after the
Reset
if they were inflight at the time of theReset
.