The Vitess Docs | Lookup as Primary Vindex

Lookup as Primary Vindex

It is likely that a customer order goes through a life cycle of events. This would best be represented in a separate corder_event table that will contain a corder_id column as a foreign key into corder.corder_id. It would also be beneficial to co-locate the event rows with their associated order.

Just like we shared the hash vindex between customer and corder, we can share corder_keyspace_idx between corder and corder_event. We can also make it the Primary Vindex for corder_event. When an order is created, the lookup row for it is also created. Subsequently, an insert into corder_event will request the vindex to compute the keyspace_id for that corder_id, and that will succeed because the lookup entry for it already exists. This is where the significance of the owner table comes into play: The owner table creates the entries, whereas other tables only read those entries.

Inserting a corder_event row without creating a corresponding corder entry will result in an error. This behavior is in line with the traditional foreign key constraint enforced by relational databases.

Sharing the lookup vindex also has the additional benefit of saving space because we avoid creating separate entries for the new table.

We start with creating the sequence table in the product keyspace.

Schema:

create table corder_event_seq(id bigint, next_id bigint, cache bigint, primary key(id)) comment 'vitess_sequence';
insert into corder_event_seq(id, next_id, cache) values(0, 1, 3);

VSchema:

    "corder_event_seq": { "type": "sequence" }

We then create the corder_event table in customer:

create table corder_event(corder_event_id bigint, corder_id bigint, ename varchar(128), primary key(corder_id, corder_event_id));

In the VSchema, there is no need to create a vindex because we are going to reuse the existing one:

    "corder_event": {
      "column_vindexes": [{
        "column": "corder_id",
        "name": "corder_keyspace_idx"
      }],
      "auto_increment": {
        "column": "corder_event_id",
        "sequence": "product.corder_event_seq"
      }
    }

Alternate VSchema DDL:

alter vschema add sequence product.corder_event_seq;
alter vschema on customer.corder_event add vindex corder_keyspace_idx(corder_id);
alter vschema on customer.corder_event add auto_increment corder_event_id using product.corder_event_seq;

We can now insert rows in corder_event against rows in corder:

mysql> insert into corder(customer_id, product_id, oname) values (1,1,'gift'),(1,2,'gift'),(2,1,'work'),(3,2,'personal'),(4,1,'personal');
Query OK, 5 rows affected (0.04 sec)

mysql> insert into corder_event(corder_id, ename) values(1, 'paid'), (5, 'delivered');
Query OK, 2 rows affected (0.01 sec)

mysql> insert into corder_event(corder_id, ename) values(6, 'expect failure');
ERROR 1105 (HY000): vtgate: http://sougou-lap1:12345/: execInsertSharded: getInsertShardedRoute: could not map [INT64(6)] to a keyspace id

As expected, inserting a row for a non-existent order results in an error.

Reversible Vindexes #

In Vitess, it is insufficient for a table to only have a Lookup Vindex. This is because it is not practical to reshard such a table. The overhead of performing a lookup before redirecting every row event to a new shard would be prohibitively expensive.

To overcome this limitation, we must add a column with a non-lookup vindex, also known as Functional Vindex to the table. By rule, the Primary Vindex computes the keyspace id of the row. This means that the value of the column should also be such that it yields the same keyspace id.

A Reversible Vindex is one that can back-compute the column value from a given keyspace id. If such a vindex is used for this new column, then Vitess will automatically perform this work and fill the correct value for it. The list of vindex properties, like Functional, Reversible, etc. are listed in the Vindexes Reference.

In other words, adding a column with a vindex that is both Functional and Reversible allows Vitess to auto-fill the values, thereby avoiding any impact to the application logic.

The binary vindex is one that yields the input value itself as the keyspace_id, and is naturally reversible. Using this Vindex will generate the keyspace_id as the column value. The modified schema for the table will be as follows:

create table corder_event(corder_event_id bigint, corder_id bigint, ename varchar(128), keyspace_id varbinary(10), primary key(corder_id, corder_event_id));

We create a vindex instantiation for binary:

    "binary": {
      "type": "binary"
    }

Modify the table VSchema:

    "corder_event": {
      "column_vindexes": [{
        "column": "corder_id",
        "name": "corder_keyspace_idx"
      }, {
        "column": "keyspace_id",
        "name": "binary"
      }],
      "auto_increment": {
        "column": "corder_event_id",
        "sequence": "product.corder_event_seq"
      }
    }

Alternate VSchema DDL:

alter vschema on customer.corder_event add vindex `binary`(keyspace_id) using `binary`;

Note that binary needs to be backticked because it is a keyword.

After these modifications, we can now observe that the keyspace_id column is getting automatically populated:

mysql> insert into corder(customer_id, product_id, oname) values (1,1,'gift'),(1,2,'gift'),(2,1,'work'),(3,2,'personal'),(4,1,'personal');
Query OK, 5 rows affected (0.01 sec)

mysql> insert into corder_event(corder_id, ename) values(1, 'paid'), (5, 'delivered');
Query OK, 2 rows affected (0.01 sec)

mysql> select corder_event_id, corder_id, ename, hex(keyspace_id) from corder_event;
+-----------------+-----------+-----------+------------------+
| corder_event_id | corder_id | ename     | hex(keyspace_id) |
+-----------------+-----------+-----------+------------------+
|               1 |         1 | paid      | 166B40B44ABA4BD6 |
|               2 |         5 | delivered | D2FD8867D50D2DFE |
+-----------------+-----------+-----------+------------------+
2 rows in set (0.00 sec)

There is no support for backfilling the reversible vindex column yet. This will be added soon.

The original keyspace_id for all these rows came from customer_id. Since hash is also a reversible vindex, reversing the keyspace_id using hash will yield the customer_id. We could instead leverage this knowledge to replace keyspace_id+binary with customer_id+hash. Vitess will auto-populate the correct value. Using this approach may be more beneficial because customer_id is a value the application can understand and make use of.