Reporting Real-time Performance of Channels

Elliot Chance
3 min readSep 27, 2019

--

Photo by Florian Steciuk on Unsplash

Go ships with great tools for runtime performance monitoring. However, for one of the projects I am working on I needed tailored metrics around three channels that do the primary communication between the larger components of the processing service.

I wanted to be able to print realtime stats on movement and blocking of these channels only each minute. So I created ChannelPerf :

https://gist.github.com/elliotchance/7db9eb40a935f5526b62f0bcf77b7d2c

How does it work?

For the most part it is a drop in replacement-ish with how you would normally use a channel:

// ch := make(chan interface{}, 1)
ch := NewChannelPerf(1, "Some name")

// ch <- "foo"
ch.Send("foo")

// next, ok := <-ch
next, ok := ch.Receive()

// close(ch)
ch.Close()

// for next := range ch {
// fmt.Println(next)
// }
for {
next, ok := ch.Receive()
if !ok {
break
}
fmt.Println(next)
}

The name (second parameter of NewChannelPerf) is used to label statistics with ch.String():

fmt.Println(ch)

Displays something like:

Some name (clock: 1.387381ms, sent: 3 in 1.745µs, received: 3 in 1.268469ms)
  • clock is how long it has been tracking statistics. Statistics can be reset with ch.Reset()
  • sent is the number of messages successfully sent to the channel, and the total time waiting (blocking) to do so. If the channel is full and goroutine(s) are blocked trying to send to the channel, this time will increase.
  • received is the number of messages successfully received from the channel (unrelated to how many messages may be available), and how long it was waiting on receiving. If the channel is empty and goroutine(s) are blocked waiting then this time will increase.
  • One important thing to note is that the sent and received are total times, not actual times. For example, if there are 3 goroutines blocked on sending, once sent the time will be the addition of all of these. If you only have one goroutine on either side of the channel that’s easy. However, you might have to make some guesstimates when you have multiple goroutines on either or both sides.

Logging Metrics for Multiple Goroutines

I did say at the top I have three channels I’d like to track. To make this easier you can use ChannelPerfGroup:

channelPerfGroup := NewChannelPerfGroup()
ch1 := channelPerfGroup.NewChannelPerf(1, "CH1")
ch2 := channelPerfGroup.NewChannelPerf(1, "CH2")

ChannelPerfGroup has an aggregate String() that returns a single line, but it also provides something more useful… a ticker:

channelPerfGroup.StartTicker(time.Millsecond, true, func() {
fmt.Println(channelPerfGroup)
})

Every millisecond it will invoke that func — which I print the metrics. This is just for demonstration, you probably want it much less frequent than that. The true will reset the stats (false would allow infinitely accumulating stats).

Here is the first ~2 milliseconds:

CH1 (clock: 1.387755ms, sent: 3 in 9.705µs, received: 3 in 485.072µs); CH1 (clock: 1.387381ms, sent: 3 in 1.745µs, received: 3 in 1.268469ms)
CH1 (clock: 745.583µs, sent: 0 in 0s, received: 0 in 0s); CH2 (clock: 751.181µs, sent: 0 in 0s, received: 0 in 0s)

Caveats

  1. This is not performance free. However, any runtime performance monitoring will incur some overhead. I haven’t investigated the exact overhead, but it was designed with performance in mind.
  2. The channel uses an interface{} — you can customize this to whatever type you need. In my case I used it for several different types so I just perform the type cast after the Receive.
  3. The Send /Receive counts and timers are not updated until that action completes. So one or more inflight blocking operations will not show in the stats when you print them. If you have a ChannelPerf that reports receive: 0 in 0s it’s likely it was just waiting the whole period but has not yet updated the stats.
  4. There may be some overlap with timings on operations before and after the Reset if they were inflight at the time of the Reset .

--

--

Elliot Chance

I’m a data nerd and TDD enthusiast originally from Sydney. Currently working for Uber in New York. My thoughts here are my own. 🤓 elliotchance@gmail.com