Building an embedded column store database with upscaledb - part 1 by Christoph Rupp

Posted on Jan 15, 20:11:03 CET 2016

Writing your own SQL database with upscaledb is not difficult. Start of a new series.

This post is the first in a longer series of posts describing how to implement SQL-ish scenarios in a key/value store. In the first article I'll explain the basics, then will continue to describe OLAP vs OLTP and how embedded databases (specifically upscaledb) can be optimized for both usages. The third post will show simple recipes for translating SQL databases to upscaledb and for running simple queries. Afterwards I will go in depth and describe analytical operations on complex real-live data.

What’s the point in using an embedded key/value store to implement SQL-style database functions? Wouldn’t it be much easier to just use an SQL database instead? Most likely it would indeed be “easier”, but there are some compelling reasons why an embedded key/value store is more attractive. One is performance; most key/value stores beat SQL databases hands down. Another one is the user experience. SQL databases often require the installation (and maintenance) of a database server, which is time-consuming and requires lots of support. If you follow this blog series then you will see that it’s not difficult to work with an embedded database. Let me show you how you can build your own column store database tailored for your application.

What is a column store database?

The blob title mentioned a "column store database" - one of the buzzwords in the DBMS market. But what is this exactly?

Traditionally, databases stored the data of a single row sequentially; they are "row oriented". The row's data is indexed by its primary key. Imagine a table with user information, here described in pseudo-SQL:


(Storing a password as plain text is a security problem - let’s pretend that we store password hashes instead.)

We fill this table with some dummy data:

1 tom s3cr3t
2 amy AS302.x
3 rick kcir

In order to store this data, it has to be serialized to disk. Each row information is indexed by its primary key (the user id). I am using the arrow (->) to describe the "is indexing" relation.

1 -> tom|s3cr3t| / 2 -> amy|AS302.x| / 3 -> rick|kcir|

(Note that databases could also assign their own internal id instead of using the primary key as an index.)

This schema is simple and works well. When looking up data, the USERID column serves as an index and returns a blob with a compact representation of the row's full data. Reading and writing of a single row is very fast.

This schema also has a disadvantage: Database indexes are usually implemented as B-tree structures. A B-tree consists of nodes, and each node will fill up quickly if the row data becomes large. The B-tree will have a high fanout, and performing full table scans requires lots of slow disk access.

In contrast, a column store database (or "column oriented database") stores each column as a separate index. In our example, the data could be serialized like this:

1 -> tom / 2 -> amy / 3 -> rick
amy -> 2 / tom -> 1 / rick -> 3 -> 2 / -> 1 / -> 3
AS302.x -> 2 / kcir -> 3 / s3cr3t -> 1

In real life, a column store DBMS might automatically choose which kind of indexes it wants to create. It might even create compound indexes which combine multiple rows.

In contrast to the row oriented serialization, a column oriented index is much smaller because it only stores attributes of the same type. In addition, these attributes are sorted and therefore can be compressed heavily. Running table scans therefore requires less I/O, and is much faster as long as only one index is scanned. These table scans are typical for analytical queries, i.e. to calculate AVERAGE or SUM over a full table, or MIN and MAX values.

On the other hand, inserting a new row requires several write operations, because each index must be updated. Therefore column store databases are often faster when reading data, while traditional, row oriented databases are faster when writing data.

How can upscaledb help?

As a key/value store, upscaledb specializes in providing a small subset of the functionality that a full DBMS offers. This subset enables you to store indexed data and create multiple indexes. upscaledb knows about the "type" of your data, i.e. whether it's numeric or a binary blob. Its representation is therefore very compact. In addition, upscaledb has built-in API functions for analytical calculations (i.e. COUNT, COUNT DISTINCT, SUM, AVERAGE).

As an application developer you have the full freedom (or responsibility?) to create those indexes that you require, and to optimize the database for the specific needs of your application. You can decide whether your database is optimized for short read/writes or for long analytical scans, or even have a mix of both worlds.

What's the meaning of "embedded"?

upscaledb is an embedded database, but the word "embedded" is a bit ambiguous. upscaledb will run fine on embedded platforms (smart phones, tablets, Raspberry PI etc). But more importantly, upscaledb is a library which runs embedded into your application. End users are not exposed to the database. They do not have to maintain it, and your application does not need to deploy or install a separate database. In addition, your application does not require expensive inter-process communication (IPC) when talking to the database. Not relying on IPC makes the whole interaction between application and database much faster.

In the next part of this series I will write more about row oriented vs column oriented database design, and show how upscaledb can be used to implement these concepts.

Latest Posts

May 20, 18:55:21 CET 2016
32bit integer compression algorithms - part 3

Apr 11, 19:12:27 CET 2016
32bit integer compression algorithms - part 2

Mar 22, 18:38:21 CET 2016
Fast range queries in upscaledb

Apr 03, 22:26:51 CET 2016
Release of upscaledb 2.2.0

Mar 08, 21:02:54 CET 2016
32bit integer compression algorithms

Feb 10, 20:12:32 CET 2016
Building an embedded column store database with upscaledb - part 3

Latest Comments