Routing Microservice Metrics over Tor
Ablative Hosting has several servers distributed around the globe, in data centers run by a variety of vendors, and it is very important that a potential adversary cannot identify these servers as part of our infrastructure.
There are three key points of correlation for these servers:
- Orchestration connections from the Ansible hosts
- Inter-host service communication polling (e.g. the cryptocurrency microservices connecting to our Bitcoin or Monero nodes)
- Metrics from the various servers
Protecting Ansible using Tor
Thankfully Ansible natively supports passing SSH options, so the host inventory can use .onion addresses and route connections through the Ansible host’s Tor SOCKS port.
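As a rough illustration (the onion hostname below is just a placeholder, and nc is assumed to be the OpenBSD netcat with SOCKS support), an inventory entry can route SSH through the local Tor daemon like this:
[webservers]
xxxxxxxxxxxxxxxx.onion

[webservers:vars]
ansible_ssh_common_args='-o ProxyCommand="nc -x 127.0.0.1:9050 -X 5 %h %p"'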
Protecting Inter-host service Communication
Any inter-host communication that is expected to cross data center boundaries is wrapped in Tor, and it’s this native Go integration that led us to write plumbago.
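As a minimal sketch of that integration (the .onion address and port are placeholders, not one of our real services), dialing another host through the local Tor SOCKS port from Go looks something like this:
package main

import (
	"log"

	"golang.org/x/net/proxy"
)

func main() {
	// Build a dialer that routes through the local Tor daemon's SOCKS5 port
	dialer, err := proxy.SOCKS5("tcp", "127.0.0.1:9050", nil, proxy.Direct)
	if err != nil {
		log.Fatalln("can't create SOCKS dialer:", err)
	}

	// Placeholder onion address and port for an internal service
	conn, err := dialer.Dial("tcp", "exampleexampleexample.onion:8332")
	if err != nil {
		log.Fatalln("can't reach the service over Tor:", err)
	}
	defer conn.Close()
	log.Println("connected over Tor")
}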
Protecting Graphite Line Protocol Metrics with Tor
The Graphite line protocol is really simple (<metric path> <metric value> <metric timestamp>) and operates over TCP; it’s simple enough that it happily works with a string piped into netcat:
PORT=2003
SERVER=graphite.your.org
echo "local.random.diceroll 4 `date +%s`" | nc ${SERVER} ${PORT}
The Graphite docs refer to this as the “plaintext” protocol, which presents us at Ablative with two problems. We have a policy where we encrypt everything, so nothing travels over the wire unencrypted, but more importantly any connections from our remote servers back to the metrics cluster would provide correlation for an adversary.
Plumbago
The application is pretty simple: we start a TCP listener on the given IP and port (defaulting to 2003, the standard Graphite line protocol port).
l, err := net.Listen(CONN_TYPE, CONN_HOST+":"+CONN_PORT)
if err != nil {
	log.Fatalln("Error listening:", err)
}
graphite, err := GraphiteFactory(CONN_TYPE, graphiteHost, graphitePort, "")
if err != nil {
	log.Fatalln("Error connecting to Graphite over Tor:", err)
}
for {
	conn, _ := l.Accept()
	go handleRequest(conn, graphite) // one goroutine per incoming metrics connection
}
handleRequest is equally simple:
func handleRequest(conn net.Conn, graphite *Graphite) {
	defer conn.Close()

	buf := make([]byte, 8000)
	reqLen, err := conn.Read(buf)
	if err != nil {
		fmt.Println("Error reading:", err.Error())
		return
	}
	fmt.Printf("Received %d bytes of data\n", reqLen)

	// Only forward the bytes that were actually read
	graphite.PassThru(buf[:reqLen])
}
The key work is done in the GraphiteFactory and PassThru functions:
func GraphiteFactory(protocol string, host string, port int, prefix string) (*Graphite, error) {
	// 127.0.0.1:9050 is the local Tor daemon's SOCKS5 port
	graphite := &Graphite{Host: host, Port: port, Protocol: "tcp", Prefix: prefix, Proxy: "127.0.0.1:9050"}

	err := graphite.Connect()
	if err != nil {
		return nil, err
	}

	return graphite, nil
}
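For reference, the Graphite struct itself isn’t shown here; based on the fields used above it looks roughly like the following (field types are inferred, so treat this as a sketch rather than the exact definition):
// Graphite describes the upstream carbon endpoint and the Tor SOCKS proxy used to reach it
type Graphite struct {
	Host     string
	Port     int
	Protocol string
	Prefix   string
	Proxy    string // address of the local Tor SOCKS5 port, e.g. 127.0.0.1:9050
	Timeout  time.Duration
	conn     net.Conn
}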
The call to Connect() is where the proxy configuration is done:
func (graphite *Graphite) Connect() error {
	if graphite.conn != nil {
		graphite.conn.Close()
	}

	address := fmt.Sprintf("%s:%d", graphite.Host, graphite.Port)

	if graphite.Timeout == 0 {
		graphite.Timeout = defaultTimeout * time.Second
	}

	// Build a SOCKS5 dialer pointed at the local Tor daemon
	dialer, socksErr := proxy.SOCKS5("tcp", graphite.Proxy, nil, proxy.Direct)
	if socksErr != nil {
		fmt.Fprintln(os.Stderr, "can't connect to the proxy:", socksErr)
		return socksErr
	}

	// Dial the Graphite host through Tor rather than directly
	conn, err := dialer.Dial(graphite.Protocol, address)
	if err != nil {
		return err
	}

	graphite.conn = conn
	return nil
}
With connectivity to the SOCKS proxy confirmed, the pass-through is a very simple affair:
func (graphite *Graphite) PassThru(buf []byte) error {
_, err := graphite.conn.Write(buf)
if err != nil {
return err
}
return nil
}
These basic building blocks are what allow us to run per-datacenter Graphite proxies for use with software such as collectd, or to bake Tor-routed Graphite functionality directly into our services.
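As a quick smoke test, assuming a plumbago instance is listening locally on the default port 2003 in front of a Tor-reachable Graphite host, the earlier netcat example can simply be pointed at the local proxy:
echo "local.random.diceroll 4 `date +%s`" | nc 127.0.0.1 2003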