Frequently Asked Questions

Every application is different, so it is difficult to give general rules for optimization. If you want to squeeze out the last few percent of performance, you need to write a benchmark that mimics the behaviour of your application and lets you test the various upscaledb settings. Alternatively, use ups_bench, a tool created exactly for this purpose (see Benchmarking). That said, you can choose from a variety of options to improve performance.

Cache size

Choose a cache size that is large enough for the working set of the index to fit into the cache. Again, a benchmark application helps you figure out which size works best for you.
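
For a first guess before benchmarking, the working set can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model, not upscaledb's actual page layout: it assumes fixed-size keys and records stored together in the pages, ignores page headers and internal nodes, and takes the fill level as an input. The function name and all parameters are hypothetical.

```c
#include <stdint.h>

/* Rough number of bytes needed to hold num_keys entries in pages of
 * page_size bytes. Assumptions (not taken from upscaledb): every entry
 * costs key_size + record_size bytes, pages are filled to fill_percent,
 * and page headers and internal nodes are ignored.
 * Returns 0 if the inputs fall outside this simple model. */
uint64_t estimate_working_set(uint64_t num_keys, uint32_t key_size,
                              uint32_t record_size, uint32_t page_size,
                              uint32_t fill_percent)
{
    uint64_t usable = (uint64_t)page_size * fill_percent / 100;
    uint64_t slot = (uint64_t)key_size + record_size;

    if (slot == 0 || usable < slot)
        return 0;

    uint64_t slots_per_page = usable / slot;
    /* Round up: a partially filled page still occupies a full page. */
    uint64_t pages = (num_keys + slots_per_page - 1) / slots_per_page;
    return pages * page_size;
}
```

With one million 8-byte keys, 8-byte records and an illustrative page size of 16384 bytes, this model yields about 16 MB for completely full pages and about 32 MB for half-full pages. Treat the result as a starting value for the benchmark, not as a replacement for it.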

Key size

Keep keys as small as possible. The more keys fit into a database page, the less I/O is required. You can set the key size with UPS_PARAM_KEY_SIZE when creating a new Database. Fixed-length keys are more efficient than variable-length keys.

Key type

Better than specifying a key size is to specify the actual key type: use UPS_PARAM_KEY_TYPE in ups_env_create_db. These key types allow a denser btree layout, saving I/O and optimizing for CPU caches. Also, the compare function is inlined and does not require a callback function, further improving performance. A word of warning, though: fixed-size keys are always stored in the btree node. If the key size is very large, the btree can only store a few keys per node, so the tree's fanout will be low and the tree will be deep. In such cases it might be better NOT to specify the key size; upscaledb will then move the key to a blob if it becomes too large. Use ups_bench to test the different configurations.
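
The effect of key size on the tree can be made concrete with a small calculation. The following sketch uses a simplified node model that is an assumption for illustration only: no node header, and every slot holds the key plus a fixed-size pointer. The function names are hypothetical and not part of the upscaledb API.

```c
#include <stdint.h>

/* Keys that fit into one node under the simplified model:
 * each slot stores key_size bytes plus a ptr_size-byte pointer. */
uint32_t keys_per_node(uint32_t page_size, uint32_t key_size,
                       uint32_t ptr_size)
{
    uint32_t slot = key_size + ptr_size;
    return slot == 0 ? 0 : page_size / slot;
}

/* Number of tree levels needed to address num_keys when every node
 * holds `fanout` keys. Returns 0 if fanout is too small for the model. */
uint32_t tree_height(uint64_t num_keys, uint32_t fanout)
{
    if (fanout < 2)
        return 0;

    uint32_t height = 1;
    uint64_t capacity = fanout;
    while (capacity < num_keys) {
        /* Stop before the multiplication could overflow 64 bits. */
        if (capacity > UINT64_MAX / fanout)
            return height + 1;
        capacity *= fanout;
        height++;
    }
    return height;
}
```

In this model, a 16384-byte page holds 1024 keys of 8 bytes but only 15 keys of 1024 bytes. For one billion keys that is the difference between 3 levels and 8 levels, and every additional level can mean one more page access per lookup.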

Record size

If all your records have the same size, then also specify UPS_PARAM_RECORD_SIZE when calling ups_env_create_db. Small records are packed into the Btree leaves and do not require allocation of external blob space, further increasing performance.

Page size

The default page size is always a good start. However, if your keys are larger, you might want to increase the page size; otherwise pages have to be split too often. Again, it helps to write a benchmark or to use ups_bench for testing.

Compiler Flags

The GNU Compiler Collection has a few switches that squeeze out extra performance:

-mfpmath=sse -Ofast -flto -march=native -funroll-loops
These switches can be enabled at build time:
./configure CFLAGS="-mfpmath=sse -Ofast -flto -march=native -funroll-loops"

Transactions

Transaction states are stored in memory and are consolidated with the B+Tree index at runtime. This consolidation is tricky when duplicate keys are involved, so performance will be a bit better if duplicate keys are disabled. Also, when inserting values it is VERY important to use the UPS_OVERWRITE flag whenever possible. An insert with UPS_OVERWRITE will not require any disk I/O.

Choose one of the following options:
  • UPS_ENABLE_TRANSACTIONS: Transactions will make sure that no data is lost and the database file is always in a consistent state.
  • UPS_ENABLE_RECOVERY: Writes all modified pages to a write-ahead log file. If the application crashes, upscaledb will read these log files and recover itself.
  • UPS_ENABLE_FSYNC: Calls fsync() and flushes modified buffers to the hard disk. This protects against system crashes (e.g. power failures), but costs a lot of performance.
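
To illustrate what the fsync() call in the last option does at the operating system level, here is a minimal POSIX sketch. It is not upscaledb's implementation; the function name write_durably and the append-only behaviour are assumptions chosen for the example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Appends len bytes to the file at path and asks the kernel to flush
 * them to the storage device before returning.
 * Returns 0 on success and -1 on any error. */
int write_durably(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    const char *p = data;
    while (len > 0) {
        /* write() may store fewer bytes than requested, so loop. */
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Without this call the data may still sit in the page cache
     * and would be lost on a power failure. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

The fsync() call blocks until the device reports that the data has been written, which is why enabling it for every modification is expensive.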

upscaledb is thread-safe and can be used from multiple threads without problems. However, it is not yet concurrent; it uses a big lock to make sure that only one thread can access the upscaledb environment at a time.

In addition, ups_db_find (and ups_cursor_find) return temporary pointers that can be overwritten by subsequent calls, including calls from other threads. Use UPS_RECORD_USER_ALLOC or Transactions if this is a problem. See the upscaledb.h documentation on ups_record_t for more information.
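
The hazard of temporary pointers can be shown without the library. In the sketch below, lookup_temporary is a hypothetical stand-in that returns a pointer into a shared buffer; it only imitates the behaviour described above and is not part of the upscaledb API. Copying the result into caller-owned memory before the next call follows the same idea as providing your own buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Buffer owned by the (simulated) library and reused by every call. */
static char shared_buffer[64];

/* Hypothetical lookup: stores the value in the shared buffer and
 * returns a pointer to it. The pointer stays valid, but its contents
 * change with the next call. */
const char *lookup_temporary(const char *value)
{
    strncpy(shared_buffer, value, sizeof(shared_buffer) - 1);
    shared_buffer[sizeof(shared_buffer) - 1] = '\0';
    return shared_buffer;
}

/* Copies a temporary result into memory owned by the caller.
 * The caller must free() the returned pointer. */
char *copy_result(const char *temporary)
{
    size_t len = strlen(temporary) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, temporary, len);
    return copy;
}

/* Returns 1 if the private copy survives a second lookup while the
 * original temporary pointer now shows the newer data. */
int copy_survives_next_lookup(void)
{
    const char *first = lookup_temporary("alpha");
    char *kept = copy_result(first);
    if (kept == NULL)
        return 0;

    lookup_temporary("beta"); /* overwrites the shared buffer */

    int ok = strcmp(first, "beta") == 0 && strcmp(kept, "alpha") == 0;
    free(kept);
    return ok;
}
```

Note that the copy only helps if it is made before any other call can reuse the buffer, which in a multi-threaded program is difficult to guarantee.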

upscaledb files are stored in host-endian byte order. If you open a big-endian file on a little-endian machine (or vice versa), you will get corrupt data. Also, some compression algorithms only work on little-endian machines. Use ups_export and ups_import to move the data from little-endian to big-endian machines and vice versa.
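
To see why host-endian data cannot simply be read on the other architecture, consider how a single 32-bit value is laid out. The helper functions below are hypothetical and only illustrate the byte order issue; they are not a way to convert database files.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host and 0 on a big-endian host by
 * inspecting the first byte of the value 1 in memory. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Reverses the byte order of a 32-bit value. This is what a reader on
 * the opposite architecture effectively sees when it loads the raw
 * bytes without conversion. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

A page number or size field stored as 0x11223344 would be read as 0x44332211 on the other architecture, so every internal offset in the file would be misinterpreted.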
The upscaledb library can be up to 12 MB in size (this varies from platform to platform). The reason is that all possible Btree configurations are mapped to C++ template classes, which causes the compiler to generate lots of debug information. To remove the debug information, simply strip the library.
Yes, many! Please see the file LICENSE.foss-exceptions which is also distributed in the source tarball.

Questions about Licensing

Yes. Please use the contact form or send a mail if you are interested in consulting.
A license for a single developer costs 525 €. VAT is not included. This license is valid for one full year and covers all versions that were released during this year.
Every developer who uses upscaledb (by linking to the library, by including the header files, etc.) will require one license.
Yes, you can always do that, but each license can only be transferred once every six months.
No, this is not possible.
If you require three or more licenses then please ask for a quote, either through the contact form or via mail.
Yes, we offer several license exceptions for FOSS (Free and Open Source Software) projects. The list of exceptions can be found in the file LICENSE.foss-exceptions, which is also distributed in the source tarball. If your project uses a different license then just tell us and we might add your license to the list.
Theoretically yes. The GPL allows you to create commercial applications. However, your commercial application will also be released under the GPL, which means that you have to give it away for free and open source.
If you use the GPL version of upscaledb, then your application also has to be released under the GPL. This means:
  • You will need to deliver the complete source code of upscaledb AND of your application to your users/customers.
  • Alternatively, you have to provide a written offer with instructions on how to get the source code.
  • Your GPL application can only use 3rd party libraries if they have a license that is GPL compatible.
If you buy a commercial license then you do not have to pay any royalties and you can use upscaledb for as many projects as you want. Only a few limitations have to be respected:
  • You are allowed to modify the source of upscaledb, but you are not allowed to release the modifications (unless you release them under the GPL).
  • You are not allowed to create a product which delivers the same, or substantially the same, functionality as upscaledb itself. In other words: you are not allowed to create an embedded database library which hides upscaledb behind a new header file and a new library.
Yes, an evaluation is allowed as long as you do not release a product.
Yes, as long as you do not release your software you can use the GPL version of upscaledb.
It is valid for one full year. Following years will require a license renewal.
Yes, but only those versions that were released during your first year, or while your subscription was active. You can keep using these versions of upscaledb, but for all newer versions the GPL applies.
Please get in touch using the contact form or via mail for a quote.
Please get in touch using the contact form or via mail and we will send you an invoice for your order.