Thursday, October 13, 2016

Monitoring my Home Server with Prometheus

This is a guide to monitoring my home network (and various services) with Prometheus.

Startup Commands

$ DATADIR=/home/services/prometheus
$ docker run --restart=always --name=alertmanager -d -p 9093:9093 \
  -v ${DATADIR}/alertmanager:/alertmanager prom/alertmanager
$ docker run --restart=always --name=snmpexporter -d -p 9116:9116 \
  -v ${DATADIR}/snmp_exporter:/snmp-exporter prom/snmp-exporter
$ docker run --restart=always --name=prometheus -d -p 9090:9090 \
  --link=alertmanager --link=snmpexporter -v ${DATADIR}/prometheus:/prometheus-data \
  prom/prometheus -config.file=/prometheus-data/prometheus.yml
$ docker run --restart=always --name=grafana -d -p 3000:3000 --link=prometheus \
  -v ${DATADIR}/grafana:/var/lib/grafana grafana/grafana
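The prometheus container is linked to the alertmanager container, but nothing above actually tells Prometheus where to send alerts. With the Prometheus 1.x images of this era that was typically done with the -alertmanager.url flag; a hypothetical variant of the command above (adjust to your setup):

```shell
# Hypothetical: same prometheus command, plus the Prometheus 1.x flag
# pointing it at the linked Alertmanager container.
$ docker run --restart=always --name=prometheus -d -p 9090:9090 \
  --link=alertmanager --link=snmpexporter -v ${DATADIR}/prometheus:/prometheus-data \
  prom/prometheus -config.file=/prometheus-data/prometheus.yml \
  -alertmanager.url=http://alertmanager:9093
```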


Prometheus config (${DATADIR}/prometheus/prometheus.yml):

global:
  scrape_interval:     15s
  evaluation_interval: 15s

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'snmp'
    metrics_path: /snmp
    params:
      module: [default]
    static_configs:
      - targets:
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: snmpexporter:9116

Grafana Setup

  1. Go to http://localhost:3000/
  2. Click the Grafana logo on the top left
  3. Click Data Sources
  4. Click Add Data Source:
    Name: Prometheus
    Type: Prometheus
    Url: http://prometheus:9090/
    Access: proxy

Tuesday, March 29, 2016

Go Channels slow for bulk data

I've been writing some Go code recently for processing moderately large amounts of time-series data (millions of data points). The data processing involves aggregating multiple time series by combining similar time periods and then performing simple operations across sets.

The pipelines post on the Go blog demonstrates how channels can be used for some really neat-looking (and potentially parallel) code, with very little work.

Without worrying too much about the implementations of Interpolate(), RollingAverage() and GetAllData(), I ended up with some code that followed this format:

type Value struct {
  Timestamp time.Time
  Value     float64
}

// Create the processing pipeline
input := make(chan *Value)
steadyData := Interpolate(input, 300) // Fix the interval between points to 5 minutes
output := RollingAverage(steadyData)  // Calculate rolling average

// Write all data to the pipeline, closing the channel when done
go func() {
  for v := range GetAllData() {
    input <- v
  }
  close(input)
}()

// Fetch the results from the end
for v := range output {
  fmt.Println("Got an output value", v.Value)
}

Interpolate(), RollingAverage() and GetAllData() all create goroutines, so their processing can all be performed in parallel.

It seems relatively elegant and does make it very easy to insert other steps into the pipeline, or change order or functions. It's generally what I'd regard as pretty code.

Unfortunately, it's SLOW. Extremely slow. I ended up throwing away all the pipeline code and just passing around []*Value everywhere, taking the hit of creating and copying new slices, and the potential loss in performance from only using a single core.

Even when the number crunching in each step is relatively complex, the performance increase by using more cores is dwarfed by the loss of using channels.

To demonstrate the performance difference, this is the code I threw together, which you can run to see for yourself:

package main

import (
  "fmt"
  "math/rand"
  "time"
)

type Value struct {
  Timestamp time.Time
  Value     float64
}

func averageOfChan(in chan *Value) float64 {
  var sum float64
  var count int
  for v := range in {
    sum += v.Value
    count++
  }
  return sum / float64(count)
}

func averageOfSlice(in []*Value) float64 {
  var sum float64
  var count int
  for _, v := range in {
    sum += v.Value
    count++
  }
  return sum / float64(count)
}

func main() {
  // Create a large array of random numbers
  input := make([]*Value, 1e7)
  for i := 0; i < 1e7; i++ {
    input[i] = &Value{time.Unix(int64(i), 0), rand.Float64()}
  }

  func() {
    st := time.Now()
    in := make(chan *Value, 1e4)
    go func() {
      defer close(in)
      for _, v := range input {
        in <- v
      }
    }()
    averageOfChan(in)
    fmt.Println("Channel version took", time.Since(st))
  }()

  func() {
    st := time.Now()
    averageOfSlice(input)
    fmt.Println("Slice version took", time.Since(st))
  }()
}

Running this on my home PC, I get this:
Channel version took 1.14759465s
Slice version took 24.839719ms

Yes, it's 46x faster to pass around a slice in this contrived (but representative) example. I did attempt to optimise this by changing the buffer size of the input channel, and 1e4 is about the fastest buffer size I found.

In short: channels are neat. pipelines are neat. channels are slow.

I'd be happy to hear if I'm doing something wrong or there is a better (faster) way.