upscaledb comes with a handy tool, ups_bench, which is used for benchmarking and testing a myriad of configurations and data patterns. If you wonder which database configuration is the fastest for you, simply fire up ups_bench, describe your data and your operations through a few command line parameters, and look at the results (or the generated image files).
upscaledb’s performance depends on its configuration, the different APIs that you use, whether your application mainly concentrates on reading or writing (or deleting) data, and what kind of data you store. This document describes the various options that you can use to run the tests.
Unix and Windows users will find ups_bench in the tools/ups_bench directory; the executable file is called ups_bench (or ups_bench.exe).
Simply execute it and it will insert 1 million random keys, each key 16 bytes and each record 1024 bytes in size. A progress bar is shown, and at the end a few statistical results are printed to the screen:
upscaledb 2.1.11 - Copyright (C) 2005-2015 Christoph Rupp (chris@crupp.de).
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Configuration: --seed=1381063435 --keysize=16 --recsize=1024
--distribution=random
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
[OK]
total elapsed time (sec) 11.239185
upscaledb elapsed time (sec) 9.453202
upscaledb total_#ops 1000001
upscaledb insert_#ops 999999 (105784.155906/sec)
upscaledb insert_throughput 95224190.254546/sec
upscaledb insert_latency (min, avg, max) 0.000000, 0.000009, 0.255820
upscaledb filesize 1016430592
Depending on your input, the data is created with a Random Number Generator (RNG). To get reproducible results you need to specify a “seed” value for the RNG. If you do not specify a seed, then ups_bench picks one for you and prints it to stdout (see the “Configuration:” line above). When you tune the options, just specify this seed and ups_bench will use data identical to the previous run.
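For example, to repeat the run shown above with identical data, pass that seed back on the command line (the seed value below is the one from the sample output; substitute the seed your own run printed):

```shell
./ups_bench --seed=1381063435 --keysize=16 --recsize=1024 --distribution=random
```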
A word of warning: never use the Debug build of upscaledb if you want to run benchmarks. The Debug build is slower by orders of magnitude and scales terribly.
Try running ./ups_bench --help to get a full overview of the various options!
upscaledb 2.2.x supports basic schema information, and ups_bench exposes these parameters. The following options describe the key’s type:
--key=binary binary data (this is the default)
--key=string concatenated ascii strings from /usr/share/dict
--key=custom binary data, but uses a user-supplied callback function to compare keys
--key=uint8 unsigned integer, 8 bits (1 byte)
--key=uint16 unsigned integer, 16 bits (2 bytes)
--key=uint32 unsigned integer, 32 bits (4 bytes)
--key=uint64 unsigned integer, 64 bits (8 bytes)
For the custom, string and binary key types you can additionally specify the key size with the --keysize parameter.
The record size can be specified with --recsize. The default is 1024 bytes.
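As a sketch of how these schema options combine, the following run would use 32-byte binary keys with 256-byte records (both sizes are arbitrary choices for illustration):

```shell
./ups_bench --key=binary --keysize=32 --recsize=256
```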
The ups_bench tool supports several data patterns: random, ascending or descending data. The default data distribution is “random”.
--distribution=random random data is generated (the default)
--distribution=ascending data is ascending (0, 1, 2, ...)
--distribution=descending data is descending (999, 998, 997, ...)
Applications have different patterns when accessing data: some mostly write data but never delete any; others insert some data but frequently look up keys or run full table scans; still others have distinct stages where they first bulk load, then look up or scan exclusively.
By default, upscaledb only inserts data. Here are the options to control this behavior:
--erase-pct=<n> <n> % of all operations are deletes/erase operations
--find-pct=<n> <n> % of all operations are lookup operations
--table-scan-pct=<n> <n> % of all operations are full table scans
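The three percentages are drawn from the same operation budget, and whatever remains is used for inserts. A quick sanity check of the mix used in the examples further below (30% lookups, 10% deletes, no scans):

```shell
# The operation percentages must not exceed 100; the remainder are inserts.
find_pct=30
erase_pct=10
scan_pct=0
insert_pct=$((100 - find_pct - erase_pct - scan_pct))
echo "$insert_pct"   # -> 60
```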
If you want to simulate separate application runs with a bulk load phase and a lookup phase, then first generate the file by running ups_bench, then run it again and specify --reopen:
./ups_bench --keysize=32 # generates the file by bulk loading with data
./ups_bench --keysize=32 --find-pct=50 --erase-pct=10 --reopen # opens the generated file; 50% of all operations are lookups, 10% are deletes, the remaining 40% are additional inserts
By default, ups_bench stops after 1 million database operations. You can change this limit, or use a different stop criterion: the number of inserted bytes, or a time limit:
--stop-ops=<n> stops after <n> database operations (inserts, erase, lookups, scans)
--stop-seconds=<n> stops after <n> seconds
--stop-bytes=<n> stops after inserting <n> bytes (key data + record data)
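As a rough guide for choosing a --stop-bytes value: each insert contributes its key size plus its record size. With the default 16-byte keys and 1024-byte records, 1 million inserts account for this many bytes of key and record data:

```shell
keysize=16
recsize=1024
ops=1000000
stop_bytes=$(( (keysize + recsize) * ops ))
echo "$stop_bytes"   # -> 1040000000
```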
The following options further influence the benchmark configuration:
--duplicate=first enables duplicate keys and inserts duplicates at the *beginning* of the duplicate list. Requires `--use-cursors`!
--duplicate=last enables duplicate keys and inserts duplicates at the *end* of the duplicate list; this is the default duplicate behavior.
--overwrite overwrites existing keys
--no-mmap disables use of mmap
--pagesize=<n> sets the pagesize (in bytes)
--cache=<n> sets the cache size (in bytes)
--cache=unlimited uses an unlimited cache size
--direct-access uses the flag UPS_DIRECT_ACCESS. Requires `--inmemorydb`
--use-transactions=tmp enables transactions and runs each database operation in a temporary transaction
--use-transactions=all enables transactions and runs *all* database operations in one single transaction
--use-transactions=n enables transactions and groups <n> database operations in a transaction
--flush-txn-immediately does not buffer Transactions before they are committed (`UPS_FLUSH_WHEN_COMMITTED`)
--disable-recovery disables recovery; used in combination with Transactions
--inmemorydb runs upscaledb in memory only
--use-fsync uses fsync after transactions are committed
--use-recovery uses recovery, even if transactions are disabled
--use-remote enables the remote client/server
--use-cursors uses a database cursor for all operations
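Putting a few of these options together: a sketch of a run with duplicate keys inserted at the head of the duplicate list (which, as noted above, requires cursors), grouping five operations per transaction (the group size of 5 is an arbitrary choice for illustration):

```shell
./ups_bench --use-cursors --duplicate=first --use-transactions=5
```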
You can also specify output parameters which track performance and various other metrics. Internally, ups_bench measures the wall-clock time of each database operation and prints a summary about throughput and latency when done:
total elapsed time (sec) 0.747461
upscaledb elapsed time (sec) 0.099174
upscaledb total_#ops 10001
upscaledb insert_#ops 4046 (70337.495855/sec)
upscaledb insert_throughput 70224427.374380/sec
upscaledb insert_latency (min, avg, max) 0.000002, 0.000014, 0.000124
upscaledb find_#ops 2972 (140719.583701/sec)
upscaledb find_throughput 4121208.803951/sec
upscaledb find_latency (min, avg, max) 0.000002, 0.000007, 0.000093
upscaledb erase_#ops 2981 (145190.417647/sec)
upscaledb erase_latency (min, avg, max) 0.000002, 0.000007, 0.000049
upscaledb filesize 4358144
If you’re interested in even more details, then run with --metrics=all and you will receive a myriad of information, including cache hits and cache misses, and also two .png graphic files with a visualization of the latencies and the throughput (the images are only generated if gnuplot is installed).
Simulating an application with the following data access patterns: inserts = 60%, lookups = 30%, deletes = 10%; keys are 32 bytes of random binary data, records are 128 bytes:
ups_bench --find-pct=30 --erase-pct=10 --keysize=32 --recsize=128
Same as above, but runs the test in memory:
ups_bench --find-pct=30 --erase-pct=10 --keysize=32 --recsize=128 --inmemorydb
… or with a cache size of 512 MB:
ups_bench --find-pct=30 --erase-pct=10 --keysize=32 --recsize=128 --cache=536870912
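The --cache value is given in bytes; the 536870912 above is simply 512 MB expressed in bytes:

```shell
cache_bytes=$(( 512 * 1024 * 1024 ))
echo "$cache_bytes"   # -> 536870912
```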
Use 64-bit numeric keys and 8-byte records with an ascending distribution (this is more or less equivalent to a UPS_RECORD_NUMBER database):
ups_bench --key=uint64 --recsize=8 --distribution=ascending
Bulk load the data, grouping 10 inserts into a single transaction; duplicate keys are enabled:
ups_bench --use-transactions=10 --duplicate=last