BTrDB Explained

A brief description of the Berkeley Tree Database (BTrDB)

PingThings

December 12, 2019

BTrDB is a next-gen timeseries database for high-precision, dense telemetry.

The Berkeley Tree DataBase (BTrDB) is pronounced "Better DB".

Problem: Existing timeseries databases are poorly equipped for a new generation of ultra-fast sensor telemetry. Specifically, millions of high-precision power meters are to be deployed throughout the power grid to help analyze and prevent blackouts, and new software must be built to store and analyze their data.

Baseline: Supporting 1000 micro-synchrophasors per server node requires at least 1.4M inserts/s (each device produces roughly 1400 points per second across its streams) and 5x that in reads. No existing timeseries database can sustain this.

Summary

Goals: Develop a multi-resolution storage and query engine for many 100+ Hz streams at nanosecond precision, one that operates at the full line rate of the underlying network or storage infrastructure on affordable cluster sizes (fewer than six nodes).

Developed at the University of California, Berkeley, BTrDB supports these high-throughput demands and allows efficient querying over large time ranges.

Fast writes/reads

Measured on a four-node cluster of large EC2 instances:

  • 53 million inserted values per second
  • 119 million queried values per second

Fast analysis

In under 200 ms, it can query a year of nanosecond-precision data (2.1 trillion points) at any desired window size, returning statistical summary points (each containing a min, max, and mean) at any desired resolution.
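
As a back-of-the-envelope illustration (our own arithmetic, not a BTrDB benchmark), the Go sketch below estimates how many precomputed summary points a one-year query actually touches at a few coarse resolutions; the level widths come from the tree structure described later on this page.

package main

import "fmt"

func main() {
    // One year is ~3.15e16 ns, a bit under 2^55 ns.
    const yearNs = int64(365) * 24 * 60 * 60 * 1_000_000_000

    // A statistical point at level L summarizes 2^(62-6L) ns (see the
    // level table later on this page), so a one-year query reads about
    // yearNs / 2^(62-6L) precomputed summaries, regardless of how many
    // raw points (here, 2.1 trillion) sit underneath them.
    for level := 2; level <= 4; level++ {
        exp := uint(62 - 6*level)
        fmt.Printf("level %d: summaries of 2^%d ns -> ~%d points per year\n",
            level, exp, yearNs/(int64(1)<<exp))
    }
}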

High compression

Data is compressed at a ratio of 2.93x, a significant result for high-precision nanosecond streams. To achieve this, a modified form of run-length encoding was created that encodes the jitter of the delta values (how much each delta differs from the previous one) rather than the delta values themselves. Incidentally, this outperforms the popular audio codec FLAC, which was the original inspiration for the technique.
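
The sketch below (a minimal illustration of the idea, not BTrDB's actual codec) shows the jitter-of-deltas transform in Go: for a steady sampling rate the timestamp deltas are nearly constant, so their differences cluster around zero.

package main

import "fmt"

// deltaOfDelta illustrates the "jitter of deltas" idea: instead of
// encoding timestamps, or even the deltas between them, encode how much
// each delta differs from the previous one. For near-uniform sampling
// these values cluster tightly around zero.
func deltaOfDelta(ts []int64) []int64 {
    out := make([]int64, 0, len(ts))
    var prev, prevDelta int64
    for i, t := range ts {
        if i == 0 {
            out = append(out, t) // first value is stored verbatim
        } else {
            delta := t - prev
            out = append(out, delta-prevDelta) // the "jitter"
            prevDelta = delta
        }
        prev = t
    }
    return out
}

func main() {
    // 120 Hz samples: nominally ~8,333,333 ns apart, with slight jitter.
    ts := []int64{0, 8333333, 16666667, 25000001, 33333332}
    fmt.Println(deltaOfDelta(ts)) // [0 8333333 1 0 -3]
}

After this transform most values need only a few bits, which is what makes the modified run-length stage effective.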

Efficient Versioning

Data is version-annotated to allow queries of the data as it existed at a certain time. This keeps query results reproducible even as newer realtime data arrives. Versions share unchanged structure with one another to make this as efficient as possible.
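
As a toy sketch of how structural sharing can work (ours, not BTrDB's implementation): a write produces a new root, copies only the nodes on the path to the modified leaf, and shares every other subtree with the previous version.

package main

import "fmt"

type node struct {
    children [64]*node
    points   []int64 // leaf payload; nil for interior nodes
}

// insertAt returns a NEW root for a tree in which the leaf reached via
// path has v appended. Only nodes along the path are copied; all other
// subtrees are shared with the old version (copy-on-write).
func insertAt(old *node, path []int, v int64) *node {
    n := &node{}
    if old != nil {
        *n = *old // shallow copy: child pointers still alias old subtrees
    }
    if len(path) == 0 {
        n.points = append(append([]int64(nil), n.points...), v)
        return n
    }
    n.children[path[0]] = insertAt(n.children[path[0]], path[1:], v)
    return n
}

func main() {
    v1 := insertAt(nil, []int{3, 7}, 100) // version 1
    v2 := insertAt(v1, []int{3, 9}, 200)  // version 2 writes under a new path
    // The subtree under child 7 is shared between the versions, not copied.
    fmt.Println(v1.children[3].children[7] == v2.children[3].children[7]) // true
    fmt.Println(v1.children[3] == v2.children[3])                         // false: on the write path
}

Each version is then just a root pointer: reading an old version walks from its root and sees the tree exactly as it existed.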

The Tree Structure

BTrDB stores its data in a time-partitioned tree.

Each node represents a given time slot. A node can describe all points within its time slot at a resolution corresponding to its depth in the tree.

The root node covers ~146 years. With a branching factor of 64, the bottom nodes ten levels down cover 4 ns each.

level   node width
1       2^62 ns  (~146 years)
2       2^56 ns  (~2.28 years)
3       2^50 ns  (~13.03 days)
4       2^44 ns  (~4.88 hours)
5       2^38 ns  (~4.58 min)
6       2^32 ns  (~4.29 s)
7       2^26 ns  (~67.11 ms)
8       2^20 ns  (~1.05 ms)
9       2^14 ns  (~16.38 µs)
10      2^8 ns   (256 ns)
11      2^2 ns   (4 ns)
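
The widths follow directly from the branching factor: each level splits its parent slot into 64 = 2^6 children, so level L spans 2^(62-6(L-1)) ns. A short Go program reproduces the table above:

package main

import (
    "fmt"
    "time"
)

func main() {
    // Each of the 64 children covers 1/64 (= 2^-6) of its parent's
    // slot, so level L is 2^(62 - 6(L-1)) ns wide.
    for level := 1; level <= 11; level++ {
        exp := 62 - 6*(level-1)
        width := time.Duration(int64(1) << uint(exp))
        fmt.Printf("level %2d: 2^%-2d ns = %v\n", level, exp, width)
    }
}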

A node starts as a vector node, storing raw points in a vector of size 1024. This is considered a leaf node, since it does not point to any child nodes.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│                           VECTOR NODE                           │
│                     (holds 1024 raw points)                     │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . │ <- raw points
└─────────────────────────────────────────────────────────────────┘

Once this vector is full and more points need to be inserted into its time slot, the node is converted to a core node by time-partitioning itself into 64 “statistical” points.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│                            CORE NODE                            │
│                   (holds 64 statistical points)                 │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ │ <- stat points
└─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┘
  ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼  <- child node pointers

A statistical point represents a 1/64 slice of its parent’s time slot. It holds the min, max, mean, standard deviation, and count of all points inside that slice, and points to a child node holding the finer detail. When a vector node is first converted to a core node, its raw points are pushed down into new vector nodes referenced by the new statistical points.
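
In Go (the language BTrDB itself is written in), the two node kinds and the vector-to-core conversion might look roughly like this. The type and field names are illustrative assumptions, not BTrDB's actual definitions, and the standard deviation update is omitted for brevity:

// Illustrative only; not BTrDB's actual type definitions.
type RawPoint struct {
    Time  int64 // ns since the Unix epoch
    Value float64
}

type StatPoint struct {
    Min, Max, Mean, StdDev float64
    Count                  uint64
    Child                  *Node // node holding the detail for this slice
}

// A Node covers the half-open time slot [Start, Start+Width) ns. Exactly
// one of Points (vector/leaf node) or Stats (core node) is populated.
type Node struct {
    Start, Width int64
    Points       []RawPoint     // vector node: up to 1024 raw points
    Stats        [64]*StatPoint // core node: one summary per 1/64 sub-slot
}

// convert turns a full vector node into a core node: its raw points are
// pushed down into new child vector nodes, one per 1/64 sub-slot, and a
// summary is maintained for each occupied slot.
func convert(n *Node) {
    slot := n.Width / 64
    for _, p := range n.Points {
        i := int((p.Time - n.Start) / slot)
        sp := n.Stats[i]
        if sp == nil {
            sp = &StatPoint{
                Min:   p.Value,
                Max:   p.Value,
                Child: &Node{Start: n.Start + int64(i)*slot, Width: slot},
            }
            n.Stats[i] = sp
        }
        if p.Value < sp.Min {
            sp.Min = p.Value
        }
        if p.Value > sp.Max {
            sp.Max = p.Value
        }
        // Running mean: new mean = old mean + (x - old mean) / count.
        sp.Count++
        sp.Mean += (p.Value - sp.Mean) / float64(sp.Count)
        sp.Child.Points = append(sp.Child.Points, p)
    }
    n.Points = nil // this node is now a core node
}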

level   node width              stat point width        total nodes   total stat points
1       2^62 ns (~146 years)    2^56 ns (~2.28 years)   2^0           2^6
2       2^56 ns (~2.28 years)   2^50 ns (~13.03 days)   2^6           2^12
3       2^50 ns (~13.03 days)   2^44 ns (~4.88 hours)   2^12          2^18
4       2^44 ns (~4.88 hours)   2^38 ns (~4.58 min)     2^18          2^24
5       2^38 ns (~4.58 min)     2^32 ns (~4.29 s)       2^24          2^30
6       2^32 ns (~4.29 s)       2^26 ns (~67.11 ms)     2^30          2^36
7       2^26 ns (~67.11 ms)     2^20 ns (~1.05 ms)      2^36          2^42
8       2^20 ns (~1.05 ms)      2^14 ns (~16.38 µs)     2^42          2^48
9       2^14 ns (~16.38 µs)     2^8 ns (256 ns)         2^48          2^54
10      2^8 ns (256 ns)         2^2 ns (4 ns)           2^54          2^60
11      2^2 ns (4 ns)           (leaf level)            2^60          —

The sampling rate of the data at different moments determines how deep the tree grows during those slices of time. Regardless of the depth of the actual data, querying at a given higher level (lower resolution) takes a fixed, short amount of time, because parent nodes already hold the summaries.
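
Reusing the illustrative types from the sketch above (with the root passed at depth 1), a fixed-resolution query could look like the following: it collects precomputed summaries at the target depth and never descends past it, which is why its cost depends on the window and resolution rather than on how deep the raw data sits.

// overlaps reports whether the half-open ranges [a0,a1) and [b0,b1) intersect.
func overlaps(a0, a1, b0, b1 int64) bool { return a0 < b1 && b0 < a1 }

// queryStats (an illustration, not BTrDB's API) returns the statistical
// points at targetDepth that intersect [from, to). The walk stops at
// targetDepth, so deeper nodes and raw points are never touched.
func queryStats(n *Node, from, to int64, depth, targetDepth int) []StatPoint {
    if n == nil || n.Points != nil || !overlaps(n.Start, n.Start+n.Width, from, to) {
        return nil // missing subtree, raw leaf, or no overlap with the query
    }
    var out []StatPoint
    slot := n.Width / 64
    for i, sp := range n.Stats {
        if sp == nil {
            continue // empty sub-slot: no data in this slice of time
        }
        start := n.Start + int64(i)*slot
        if !overlaps(start, start+slot, from, to) {
            continue
        }
        if depth+1 == targetDepth {
            out = append(out, *sp) // the summary is already precomputed
        } else {
            out = append(out, queryStats(sp.Child, from, to, depth+1, targetDepth)...)
        }
    }
    return out
}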

Appendix

This page is based on the following source:

  • Andersen, Michael P., and David E. Culler. "BTrDB: Optimizing Storage System Design for Timeseries Processing." 14th USENIX Conference on File and Storage Technologies (FAST '16), 2016.

Author

PingThings

PingThings partners with forward-thinking utilities to take advantage of a structural shift toward artificial intelligence in the energy industry, driven by the grid's growing complexity. PingThings addresses these challenges with the PredictiveGrid™, a purpose-built platform for ingesting, storing, accessing, visualizing, and analyzing data from large numbers of sensors, and for training machine- and deep-learning models on that data. The platform is offered as an on-premise appliance or in a public or private cloud. Benchmarks indicate it is at least two orders of magnitude faster than competing platforms.