VDiff
Compare the source and target in a workflow to ensure integrity
Command #
VDiff [-source_cell=<cell>] [-target_cell=<cell>] [-tablet_types=replica]
[-filtered_replication_wait_time=30s] <keyspace.workflow>
Description #
VDiff performs a row-by-row comparison of all tables associated with the workflow, diffing the source keyspace against the target keyspace and reporting counts of missing, extra, and unmatched rows.
It is highly recommended that you do this before you finalize a workflow with SwitchWrites.
Parameters #
-source_cell #
optional
default all
VDiff will choose the source tablet from this cell when diffing the source table(s) against the target tables.
-target_cell #
optional
default all
VDiff will choose the target tablet from this cell when diffing the source table(s) against the target tables.
-tablet_types #
optional
default replica
A comma-separated list of tablet types to consider when picking a tablet for sourcing data.
One or more of MASTER, REPLICA, RDONLY.
-filtered_replication_wait_time #
optional
default 30s
VDiff finds the current position of the source master and then waits up to
_filtered_replication_wait_time_ for target replication to reach that position. If the target is far behind
the source, or if the source has a high write QPS, this time will need to be increased.
keyspace.workflow #
mandatory
Name of target keyspace and the associated workflow to run VDiff on.
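As a rough illustration of how the parameters above combine, the sketch below assembles a VDiff invocation with a longer replication wait. The cell names `zone1` and `zone2` and the `2m` wait are placeholder assumptions, not values from this page, and the lowercase `replica,rdonly` spelling follows the command synopsis; adjust all of them to your deployment.

```shell
# Placeholder values (assumptions for illustration only).
SOURCE_CELL=zone1
TARGET_CELL=zone2
WAIT=2m   # raised from the 30s default for a lagging target

# Assemble the invocation using only the flags documented above.
VDIFF_CMD="vtctlclient VDiff -source_cell=$SOURCE_CELL -target_cell=$TARGET_CELL -tablet_types=replica,rdonly -filtered_replication_wait_time=$WAIT customer.commerce2customer"

# Print the command for review before running it.
echo "$VDIFF_CMD"
```

Printing the command first makes it easy to double-check the keyspace and workflow name before starting a potentially long-running diff.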
Example #
$ vtctlclient VDiff customer.commerce2customer
Summary for corder: {ProcessedRows:10 MatchingRows:10 MismatchedRows:0 ExtraRowsSource:0 ExtraRowsTarget:0}
Summary for customer: {ProcessedRows:11 MatchingRows:11 MismatchedRows:0 ExtraRowsSource:0 ExtraRowsTarget:0}
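Since VDiff is meant to be run before SwitchWrites, it can help to check the summary lines mechanically. The following is a minimal sketch, assuming the output format shown in the example above; the function name `vdiff_is_clean` and its return codes are hypothetical and not part of Vitess.

```shell
# vdiff_is_clean OUTPUT
# Returns 0 if every summary line reports zero mismatched/extra rows,
# 1 if any table reports a difference, 2 if no summary lines were found.
vdiff_is_clean() {
  # Keep only the per-table summary lines; bail out if there are none.
  summaries=$(printf '%s\n' "$1" | grep '^Summary for ') || return 2

  # A non-zero count starts with a digit 1-9 right after the colon.
  if printf '%s\n' "$summaries" |
      grep -Eq '(MismatchedRows|ExtraRowsSource|ExtraRowsTarget):[1-9]'; then
    return 1
  fi
  return 0
}
```

One possible use is `vdiff_is_clean "$(vtctlclient VDiff customer.commerce2customer)" && echo "safe to proceed"`, which only prints the message when every table matched.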
Notes #
- You can follow the progress of the command by tailing the vtctld logs.
- VDiff can take a very long time (hours or days) on huge tables, so plan for this. If VDiff takes more than an hour and you use vtctlclient, it will hit the default gRPC/HTTP timeout of 1 hour. In that case you can use vtctl (the bundled vtctlclient + vtctld) instead.
- There is no throttling, so you might see increased replication lag on the replica used as the source.
VReplication and VDiff performance improvements as well as freno-style throttling support are on the roadmap!