Saturday, April 20, 2024

Quick and dynamic encoding of Protocol Buffers in Go


Protocol Buffers are a popular choice for serializing structured data due to their compact size, fast processing speed, language independence, and compatibility. Other alternatives exist, including Cap’n Proto, CBOR, and Avro.

Usually, data structures are described in a proto definition file (.proto). The protoc compiler and a language-specific plugin convert it into code:

$ head flow-4.proto
syntax = "proto3";
package decoder;
option go_package = "akvorado/inlet/flow/decoder";

message FlowMessagev4 {

  uint64 TimeReceived = 2;
  uint32 SequenceNum = 3;
  uint64 SamplingRate = 4;
  uint32 FlowDirection = 5;
$ protoc -I=. --plugin=protoc-gen-go --go_out=module=akvorado:. flow-4.proto
$ head inlet/flow/decoder/flow-4.pb.go
// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
//      protoc-gen-go v1.28.0
//      protoc        v3.21.12
// source: inlet/flow/data/schemas/flow-4.proto

package decoder

import (
        protoreflect "google.golang.org/protobuf/reflect/protoreflect"
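
To make the compact size more concrete, here is a minimal sketch of how integer fields like the ones in this message are laid out on the wire: a tag (field number and wire type) followed by the value, both as varints. The helper name appendVarintField is hypothetical and is not part of the generated code or of any Protocol Buffers library; it only relies on the standard encoding/binary package, whose varint format matches the one used by Protocol Buffers.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wireVarint is the wire type used for uint32 and uint64 fields.
const wireVarint = 0

// appendVarintField appends one varint-typed field to buf. Zero values are
// skipped, mirroring how proto3 omits fields left at their default value.
// This helper is an illustration, not an API from a protobuf library.
func appendVarintField(buf []byte, fieldNumber int, value uint64) []byte {
	if value == 0 {
		return buf
	}
	// The tag is the field number shifted left by 3 bits, ORed with the wire type.
	buf = binary.AppendUvarint(buf, uint64(fieldNumber)<<3|wireVarint)
	return binary.AppendUvarint(buf, value)
}

func main() {
	var buf []byte
	buf = appendVarintField(buf, 3, 1000) // SequenceNum
	buf = appendVarintField(buf, 4, 0)    // SamplingRate left unset: no bytes emitted
	buf = appendVarintField(buf, 5, 1)    // FlowDirection
	fmt.Printf("% x\n", buf)              // prints "18 e8 07 28 01"
}
```

Three fields are set here, one of them to zero, and the whole message takes five bytes.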

[1] While empty fields are not serialized to Protocol Buffers, empty columns in ClickHouse take some space, even if they compress well. Moreover, unused fields are still decoded and they may clutter the interface.

Akvorado collects network flows using IPFIX or sFlow, decodes them with GoFlow2, encodes them to Protocol Buffers, and sends them to Kafka to be stored in a ClickHouse database. Collecting a new field, such as source and destination MAC addresses, requires modifications in multiple places, including the proto definition file and the ClickHouse migration code. Moreover, the cost is paid by all users.[1] It would be nice to have an application-wide schema and let users enable or disable the fields they need.
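
As a rough sketch of what an application-wide schema could look like, the example below keeps a single list of columns and renders a proto definition containing only the enabled ones. The column type, the protoDefinition function, and the SrcMAC and DstMAC entries with their field numbers are assumptions made for illustration; they are not taken from Akvorado's code.

```go
package main

import (
	"fmt"
	"strings"
)

// column describes one field of a hypothetical application-wide schema.
type column struct {
	Name     string
	Type     string // Protocol Buffers scalar type
	Number   int    // field number, unchanged when other columns are disabled
	Disabled bool
}

// protoDefinition renders a message definition with only the enabled columns.
func protoDefinition(message string, columns []column) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "syntax = \"proto3\";\n\nmessage %s {\n", message)
	for _, c := range columns {
		if c.Disabled {
			continue
		}
		fmt.Fprintf(&sb, "  %s %s = %d;\n", c.Type, c.Name, c.Number)
	}
	sb.WriteString("}\n")
	return sb.String()
}

func main() {
	columns := []column{
		{Name: "TimeReceived", Type: "uint64", Number: 2},
		{Name: "SequenceNum", Type: "uint32", Number: 3},
		// Hypothetical optional columns, disabled by default.
		{Name: "SrcMAC", Type: "uint64", Number: 30, Disabled: true},
		{Name: "DstMAC", Type: "uint64", Number: 31, Disabled: true},
	}
	fmt.Print(protoDefinition("FlowMessage", columns))
}
```

With a single list like this, the same source of truth could also drive the database migrations, so that adding a field no longer means touching several places by hand.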

While the main goal is flexibility, we do not want to sacrifice performance. On this front, this is quite a success: when upgrading from 1.6.4 to 1.7.1, the decoding and encoding performance almost doubled! 🤗

goos: linux
goarch: amd64
pkg: akvorado/inlet/flow
cpu: AMD Ryzen 5 5600X 6-Core Processor
                            │ initial.txt  │              final.txt              │
                            │    sec/op    │   sec/op     vs base                │
Netflow/with_encoding-12      12.963µ ± 2%   7.836µ ± 1%  -39.55% (p=0.000 n=10)
Sflow/with_encoding-12         19.37µ ± 1%   10.15µ ± 2%  -47.63% (p=0.000 n=10)

There is a similar function using NetFlow. NetFlow and IPFIX protocols are less complex to decode than sFlow as they use a simpler TLV structure.

I use the following code to benchmark both the decoding and encoding process. Initially, the Decode() method is a thin layer above the GoFlow2 producer and stores the decoded data into the in-memory structure generated by protoc. Later, some of the data will be encoded directly during flow decoding. This is why we measure both the decoding and the encoding.
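
As a rough illustration of measuring decoding with and without the encoding step, here is a self-contained sketch. It is not Akvorado's benchmark: the flow type, the decode and encode functions, and the 12-byte payload layout are all made up for the example, and the standard testing.Benchmark function is used so that it runs as a plain program.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// flow is a hypothetical decoded record, standing in for the structure
// generated by protoc.
type flow struct {
	SequenceNum  uint32
	SamplingRate uint64
}

// decode parses a made-up fixed-size payload (not a real NetFlow or sFlow
// packet): 4 bytes of sequence number, then 8 bytes of sampling rate.
func decode(payload []byte) (flow, bool) {
	if len(payload) < 12 {
		return flow{}, false
	}
	return flow{
		SequenceNum:  binary.BigEndian.Uint32(payload[0:4]),
		SamplingRate: binary.BigEndian.Uint64(payload[4:12]),
	}, true
}

// encode serializes the record as two varint fields (numbers 3 and 4),
// reusing the storage of buf to avoid allocations.
func encode(buf []byte, f flow) []byte {
	buf = binary.AppendUvarint(buf[:0], 3<<3)
	buf = binary.AppendUvarint(buf, uint64(f.SequenceNum))
	buf = binary.AppendUvarint(buf, 4<<3)
	return binary.AppendUvarint(buf, f.SamplingRate)
}

// run executes the decoding step and, optionally, the encoding step, so
// that both variants can be compared.
func run(b *testing.B, payload []byte, withEncoding bool) {
	buf := make([]byte, 0, 32)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		f, ok := decode(payload)
		if !ok {
			b.Fatal("decode failed")
		}
		if withEncoding {
			buf = encode(buf, f)
		}
	}
}

func main() {
	payload := make([]byte, 12)
	binary.BigEndian.PutUint32(payload[0:4], 1000)
	binary.BigEndian.PutUint64(payload[4:12], 4096)
	for _, withEncoding := range []bool{false, true} {
		res := testing.Benchmark(func(b *testing.B) { run(b, payload, withEncoding) })
		fmt.Printf("with_encoding=%v: %s\n", withEncoding, res)
	}
}
```

Running both variants shows how much of the total time goes to encoding. This matters once part of the encoding moves into the decoding step, as only the combined figure stays comparable between versions.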
