Databases with Automatic Rebalance Benchmark (TiDB vs YugabyteDB vs CockroachDB)

K Prayogo
5 min read · Jan 5, 2022


Automatic rebalance/repair/self-healing means we can remove or add a node and the cluster will redistribute the data and rebalance itself (data is replicated to more than 1 node). My previous benchmarks didn't really care about this awesome feature (no more cutover downtime where you kill the master instance, promote a slave to master, then switch every client to connect to the new master, at least when not using any proxy).

Some databases that I found support this feature; the ones benchmarked below are CockroachDB, TiDB, and YugabyteDB.

Reproducibility

The repository is here: https://github.com/kokizzu/hugedbbench in the 2021 folder. We're going to test local single-server (if possible) and multi-server deployments using Docker. Why Docker? Because I don't want to ruin my computer/server with the trash files they create in system directories (if any). Some databases are not included if they don't support SQL or if a license key is required to start. Why benchmark only 2 columns? Because it fits my projects' most common use case, where there's 1 PK (bigint or string) and 1 unique key (mostly string), and the rest are mostly indexed or non-indexed columns. Why are you even doing this? I just want to select the best tech stack for my next side project (and because the past companies I've worked with seem to love moving database server locations around a lot).

The specs of the server used in this benchmark: 32 cores, 128 GB RAM, 500 GB NVMe disk.
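To give an idea of the workload shape, here's a minimal sketch of the harness; this is not the exact hugedbbench code, and the table name, column names, and DSN are my own assumptions (the DSN here targets CockroachDB; the per-database variants are in each section below):

```go
// Minimal sketch of the benchmark shape (100 records x 1000 goroutines =
// 100,000 operations per phase); the real harness is in
// github.com/kokizzu/hugedbbench, table/column names here are made up.
package main

import (
	"database/sql"
	"fmt"
	"log"
	"sync"
	"time"

	_ "github.com/lib/pq" // PostgreSQL wire protocol (CockroachDB/YugabyteDB)
)

const (
	goroutines   = 1000
	perGoroutine = 100
)

// bench runs fn 100,000 times across 1000 goroutines and prints the
// total wall time, mimicking the "InsertOne ..." result lines below.
func bench(label string, fn func(id int64) error) {
	start := time.Now()
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			for r := 0; r < perGoroutine; r++ {
				if err := fn(int64(g*perGoroutine + r)); err != nil {
					log.Println(label, err)
				}
			}
		}(g)
	}
	wg.Wait()
	fmt.Println(label, time.Since(start))
}

func main() {
	// DSN targets CockroachDB here; see each section for the variants
	db, err := sql.Open("postgres",
		"postgres://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	db.SetMaxOpenConns(goroutines)

	// 2 columns as described above: 1 bigint PK + 1 unique string key
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS bench1
		(id BIGINT PRIMARY KEY, uniq VARCHAR(64) UNIQUE)`); err != nil {
		log.Fatal(err)
	}

	bench("InsertOne", func(id int64) error {
		_, err := db.Exec(`INSERT INTO bench1(id, uniq) VALUES($1, $2)`,
			id, fmt.Sprintf("u%d", id))
		return err
	})
	bench("UpdateOne", func(id int64) error {
		_, err := db.Exec(`UPDATE bench1 SET uniq = $1 WHERE id = $2`,
			fmt.Sprintf("v%d", id), id)
		return err
	})
	bench("SelectOne", func(id int64) error {
		var uniq string
		return db.QueryRow(`SELECT uniq FROM bench1 WHERE id = $1`, id).Scan(&uniq)
	})
}
```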

CockroachDB

CockroachDB is part of the NewSQL movement and supports PostgreSQL syntax; to deploy a single node we can use docker compose. The cluster-monitoring UI on port 8080 is quite OK :3 better than nothing.
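Connecting from Go looks something like this (a sketch; it assumes the single node was started in insecure mode with CockroachDB's default SQL port 26257):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // CockroachDB speaks the PostgreSQL wire protocol
)

func main() {
	// root user, insecure mode, default database; assumptions based on
	// CockroachDB's defaults, adjust to your compose file
	db, err := sql.Open("postgres",
		"postgres://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	if err = db.Ping(); err != nil {
		log.Fatal(err)
	}
	log.Println("connected to CockroachDB")
}
```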

Here’s the result for 100 inserts x 1000 goroutines:

CockroachDB InsertOne 10.034616078s

CockroachDB Count 42.326487ms

CockroachDB UpdateOne 12.804722812s

CockroachDB Count 78.221432ms

CockroachDB SelectOne 2.281355728s

TiDB

TiDB is part of the NewSQL movement and supports MySQL syntax. The recommended way to deploy is the tiup command, but we're going to use Docker so it's fair to the other database products. The official Docker setup uses 3 placement drivers (PD) and 3 KV servers, so I tried that first. The cluster monitor is on port 10080, but it got blocked by Chrome, so I moved it to 10081; it's very plain-texty compared to the other products.
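Connecting from Go uses a MySQL driver instead (a sketch assuming TiDB's default SQL port 4000 and passwordless root; note that query placeholders become ? instead of $1):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // TiDB speaks the MySQL wire protocol
)

func main() {
	// default SQL port 4000, passwordless root, "test" database;
	// assumptions based on TiDB's defaults, adjust to your setup
	db, err := sql.Open("mysql", "root@tcp(localhost:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	if err = db.Ping(); err != nil {
		log.Fatal(err)
	}
	log.Println("connected to TiDB")
}
```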

# reducing to single server mode (1 pd, 1 kv, 1 db), first run:

TiDB InsertOne 3.216365486s

TiDB UpdateOne 3.913131711s

TiDB SelectOne 1.991229179s

YugaByteDB

YugaByteDB is part of the NewSQL movement and supports PostgreSQL syntax; to deploy a single node we can use docker compose too. The cluster monitor on port :7000 is quite OK. The tmp directory is mounted because otherwise it would get stuck on the 2nd start unless the temporary files are deleted manually. limits.conf is applied.
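Since YSQL is PostgreSQL-compatible, the Go side only differs in the DSN (a sketch assuming YugaByteDB's default YSQL port 5433 and the default yugabyte user/database):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // YugaByteDB's YSQL layer is PostgreSQL-compatible
)

func main() {
	// default YSQL port 5433 and yugabyte user/database; assumptions
	// based on YugaByteDB's defaults, adjust to your compose file
	db, err := sql.Open("postgres",
		"postgres://yugabyte@localhost:5433/yugabyte?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	if err = db.Ping(); err != nil {
		log.Fatal(err)
	}
	log.Println("connected to YugaByteDB")
}
```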

YugaByteDB Count 159.357304ms

YugaByteDB Count 214.389496ms

YugaByteDB SelectOne 2.778803557s

YugaByteDB Total 33.834838111s

YugaByteDB InsertOne 38.614091068s

YugaByteDB Count 76.615212ms

YugaByteDB UpdateOne 56.796680169s

YugaByteDB Count 84.35411ms

YugaByteDB SelectOne 3.14747611s

Here's the recap of the 100 records x 1000 goroutines insert/update/select durations, only for single-instance deployments:

So, at best, it takes roughly 29 μs on average to insert, 39 μs to update, and 19 μs to select one record (total duration divided by the 100 x 1000 = 100,000 operations).

Comparing only multi-instance deployments (RF=2+):

So, at best, it takes roughly 31 μs on average to insert, 41 μs to update, and 21 μs to select one record.

Comparing only multi-instance deployments whose replication factor gives true HA:

It seems TiDB has the most balanced performance at the expense of needing pre-allocated disk space, while CockroachDB has the worst performance on the multi-instance update task, and YugabyteDB has the worst performance on the multi-instance insert task.

What happens if we do the benchmark once more: remove one storage node (docker stop), then redo the benchmark (only for RF=2+)?

The YugabyteDB test doesn't even enter the insert stage after 5 minutes ('__'), maybe because truncate is slow? So I changed the benchmark scenario (only for YugabyteDB) so that 1 node is killed 2 seconds into the insertion phase, but YugabyteDB still gives an error: "ERROR: Timed out: Write RPC (request call id 3873) to 172.21.0.5:9100 timed out after 60.000s (SQLSTATE XX000)"; it cannot complete. EDIT: Yugabyte staff on Slack suggested that it should be using RF=3 so it would still survive when one node dies.
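For reference, scheduling the node kill from the benchmark itself can be done with a timer; a sketch, where the container name is hypothetical:

```go
package main

import (
	"log"
	"os/exec"
	"time"
)

// killAfter stops a docker container after the given delay, simulating a
// storage node dying in the middle of the insertion phase.
func killAfter(container string, delay time.Duration) *time.Timer {
	return time.AfterFunc(delay, func() {
		out, err := exec.Command("docker", "stop", container).CombinedOutput()
		if err != nil {
			log.Println("docker stop:", err, string(out))
			return
		}
		log.Println("stopped", container)
	})
}

func main() {
	// "yb-tserver-1" is a hypothetical container name; with RF=2 the
	// benchmark timed out after this, with RF=3 it should survive
	killAfter("yb-tserver-1", 2*time.Second)
	// ... run the insertion phase here ...
	time.Sleep(5 * time.Second) // placeholder so the timer can fire
}
```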

TiDB seems to be the winner also for the case when a node dies, at the expense of needing 7 initial nodes (1 tidb [should be at least 2 for HA], 3 tipd, 3 tikv, but it can probably be squeezed down to 1 tidb, 1 tipd, 2 tikv, since apparently the default replication factor is 3), where CockroachDB only needs 3, and YugabyteDB needs 4 (1 ybmaster, 3 ybserver). Not sure, though, what would happen if 1 tidb/ybmaster instance died. The recap spreadsheet is here.

:3 myahaha! So this is probably the reason lots of companies are moving to TiDB. Next time we're gonna test how simple it is to add and remove a node (and securely, if possible, so that only a limited set of servers can join without having to set up a firewall/DMZ to keep unprivileged servers out), then re-benchmark with more complex common use cases (like UPSERT, range queries, WHERE-IN, JOIN, and secondary indexes). If automatic rebalance is not a requirement, I would still use Tarantool (since 2020.09) and ClickHouse (since 2021.04), but now I've found one more favorite automatic-rebalance database besides Aerospike (since 2016.11): TiDB.

Btw, do not comment on this blog (there's too much comment spam and no notification when a new comment is added); use a GitHub issue or Reddit instead.

UPDATE: I redid the benchmark for all databases after updating limits.conf; TiDB improved by a lot, while CockroachDB stayed the same except on the update benchmark.

Originally published at http://kokizzu.blogspot.com.
