Tutorial

Duplicate Keys

upscaledb supports duplicate keys. A key is a duplicate of another key if both keys are equal. Duplicate keys can be used to model 1:n relationships, e.g. one customer ID with multiple order IDs. The sample samples/env2.c demonstrates how to use duplicate keys to map orders to customers.

The following sections give an overview of how to use duplicate keys.

Enabling Duplicate Keys

Duplicate keys have to be enabled when the Database is created with ups_create_ex or ups_env_create_db. The flag is called UPS_ENABLE_DUPLICATE_KEYS.

/* create database 1 with duplicate keys enabled; no extra parameters */
if ((st = ups_env_create_db(env, &db, 1, UPS_ENABLE_DUPLICATE_KEYS, 0))
                != UPS_SUCCESS) {
  printf ("ups_env_create_db failed: %d (%s)\n", st, ups_strerror (st));
  exit (-1);
}

Inserting Duplicate Keys

To insert a duplicate key, call ups_db_insert or ups_cursor_insert with the flag UPS_DUPLICATE. If you do not specify this flag and the key already exists, the status UPS_DUPLICATE_KEY is returned. If the flag is specified, but the key does not yet exist, it is inserted just as if the flag was not specified.

Note that you can modify the order of the inserted records by specifying one of the flags UPS_DUPLICATE_INSERT_BEFORE, UPS_DUPLICATE_INSERT_AFTER, UPS_DUPLICATE_INSERT_FIRST or UPS_DUPLICATE_INSERT_LAST to ups_cursor_insert. The default is UPS_DUPLICATE_INSERT_LAST, which is also the behaviour of ups_db_insert.

The following snippet inserts five duplicate keys.

ups_key_t key = ups_make_key("numbers", strlen("numbers") + 1);
ups_record_t record = {0};

for (int i = 0; i < 5; i++) {
  record.data = &i;
  record.size = sizeof (i);
  if ((st = ups_db_insert (db, NULL, &key, &record, UPS_DUPLICATE))
                    != UPS_SUCCESS) {
    printf ("ups_db_insert failed: %d (%s)\n", st, ups_strerror (st));
    exit (-1);
  }
}

Traversing Duplicate Keys

Duplicate keys and their records can only be traversed with a Cursor; ups_db_find always returns the first duplicate record.

By default, ups_cursor_move also traverses all duplicate keys. Duplicate keys can be skipped by specifying the flag UPS_SKIP_DUPLICATES, and the Cursor can be restricted to the duplicates of the current key with the flag UPS_ONLY_DUPLICATES (in this case, ups_cursor_move returns UPS_KEY_NOT_FOUND when there are no further duplicates of the key).

The following snippet moves a Cursor to the first duplicate with the key “numbers”, and then traverses all duplicates of this key.

key.data = "numbers";
key.size = strlen (key.data) + 1; /* +1 for the terminating zero-byte */

/* position the Cursor on the first duplicate and fetch its record */
if ((st = ups_cursor_find (cursor, &key, &record, 0)) != UPS_SUCCESS)
  ; // handle error

do {
  // print the current record (an int, as inserted above)
  printf("%d\n", *(int *)record.data);
  // move to the next duplicate of the same key
  st = ups_cursor_move(cursor, 0, &record,
            UPS_CURSOR_NEXT | UPS_ONLY_DUPLICATES);
} while (st == UPS_SUCCESS);

Replacing Duplicate Keys

The records of a duplicate key can be overwritten with ups_cursor_overwrite. There is no difference between overwriting a record of a duplicate key and a non-duplicate key. If ups_db_insert is used with the flag UPS_OVERWRITE, only the first duplicate record is overwritten.

Get the Number of Duplicate Keys

You can check the number of duplicate keys with ups_cursor_get_duplicate_count. Move a Cursor to the key (either with ups_cursor_find or ups_cursor_move), then call ups_cursor_get_duplicate_count. If the key does not have duplicates, the count parameter is set to 1. Otherwise it is set to the number of duplicate keys.

ups_status_t
ups_cursor_get_duplicate_count (ups_cursor_t *cursor,
            uint32_t *count, uint32_t flags);

Here is an example which prints the number of duplicate keys of the item identified by “numbers”.

uint32_t count;
memset (&key, 0, sizeof (key));
key.data = "numbers";
key.size = strlen (key.data) + 1; /* +1 for the terminating zero-byte */

if ((st = ups_cursor_find (cursor, &key, &record, 0)) != UPS_SUCCESS)
  ; // handle error
if ((st = ups_cursor_get_duplicate_count (cursor, &count, 0)) != UPS_SUCCESS)
  ; // handle error
printf ("key 'numbers' has %d duplicate keys\n", (int)count);

Deleting Duplicate Keys

When a key has duplicates, ups_db_erase and ups_cursor_erase behave differently. ups_db_erase deletes the key and all of its duplicate records at once. ups_cursor_erase deletes only the duplicate to which the Cursor points.

if ((st = ups_cursor_erase (cursor, 0)) != UPS_SUCCESS)
  ; // handle error