Spotlight on Three Differentiating Features of Clustrix


Clustrix stands apart from other databases. Our scale-out SQL relational database differs in key ways that stem from our distributed architecture, which gives us the flexibility to create features that don’t exist, or aren’t possible, in MySQL or other database architectures. Here are a few of the lesser-known yet highly valuable Clustrix features.

1. Online Schema Changes

In Clustrix, you can alter a table online with no downtime. The ALTER works just like it does in MySQL, except that Clustrix continues to accept new transactions and regular workloads can run while the ALTER is taking place. As the ALTER is processed, Clustrix creates a copy of the parts of the table that are modified during the ALTER and applies changes to both sets of rows: one with the new version and one with the original (pre-ALTER) version. This ensures that all changes to the table are applied once the schema change has finished, and that changes to the table are preserved if the ALTER is cancelled or must be rolled back before it completes.

For example, here is what you would run if you wanted to change the character set to utf8mb4:

On the database level:

clustrix> ALTER DATABASE database_name CHARACTER SET utf8mb4;

On the table level:

clustrix> ALTER TABLE table_name CHARACTER SET utf8mb4;

On the column level:

clustrix> ALTER TABLE table_name MODIFY `column_name` column_type CHARACTER SET utf8mb4;
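
To confirm that a change like this took effect, one option is to inspect the table definition afterwards. This is a minimal sketch that assumes the standard MySQL SHOW CREATE TABLE statement is available at the Clustrix prompt; table_name is the same placeholder used above:

clustrix> SHOW CREATE TABLE table_name;

The output would be expected to show utf8mb4 as the character set for the table or for the modified column.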

2. The Rebalancer

With Clustrix, there is no need for you to balance and distribute the data across the cluster yourself, since the cluster handles this automatically. Clustrix features a Rebalancer that has four main operations: copy, move, rerank, and redistribute.

Copy comes into play during reprotect, after a disk or node failure. The cluster will make a copy of the under-protected replica(s) to protect against any future hardware failure.

The Rebalancer moves data so that it stays evenly distributed across disks and nodes, which keeps the cluster evenly utilized as it grows. Clustrix keeps multiple copies of data for fault tolerance, but uses only one copy for reads.

The rerank operation keeps the read load evenly distributed across the cluster. If there is an imbalance in reads for a particular replica, the Rebalancer ranks the paired replica as the new read replica to help keep load even across the cluster. Redistribute works in a similar way, evening out “lumpy” distribution in indexes and tables.

The most common question I hear about the Rebalancer is, “What user interaction is required in order to keep distribution even?”

In most cases, none. While the Rebalancer's behavior is tunable via global variables, there is no need, under most circumstances, to tune these values. The cluster handles these operations in the background, and the process is entirely transparent to the user. Our Web UI, called Insight, provides a nice view of the distribution and Rebalancer activity so you can get an idea of what the cluster is doing and how data is distributed.
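
If you are curious what those tunables look like, a quick way to list them is sketched below. It assumes the Rebalancer variables share a rebalancer prefix; that naming is an assumption here and may vary by release. SHOW GLOBAL VARIABLES itself is standard MySQL syntax:

clustrix> SHOW GLOBAL VARIABLES LIKE 'rebalancer%';

Listing the values changes nothing on the cluster, so it is a safe way to see what is available before deciding whether any tuning is warranted.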

3. Replication Enhancements

Because Clustrix stores binlogs in the database, we can offer several advantages over MySQL.

First, Clustrix replicates binlogs and spreads them out redundantly across the cluster just like other data, so drive or node failure does not result in lost binlog data.

One way that Clustrix has enhanced replication beyond MySQL is by adding the ability to create more than one binlog. This feature lets a Clustrix cluster serve as master to more than one MySQL or Clustrix slave, which gives you full flexibility and control over the replication setup.

This also makes it easy for Clustrix to fit into existing infrastructure. Imagine, for instance, replacing a MySQL master with Clustrix. You import the data into Clustrix as a slave of your existing MySQL server and then promote it to master, transferring any slaves to use Clustrix as their new replication source. You now have Clustrix in place of one of your old MySQL servers.

Next, let’s assume that at some point in the future you decide to add more nodes and replace another MySQL master using the same Clustrix cluster. You can create a new binlog for the new database on Clustrix without interfering with the existing replication setup. You can then feed replication traffic from each binlog to your slaves without having to use one large binlog.
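
As a rough sketch of that setup, the statements below create one binlog per database. The binlog names (binlog_a, binlog_b) and database names (db_a, db_b) are hypothetical, and the LOG (...) clause used to scope a binlog to a single database is assumed from the Clustrix CREATE BINLOG syntax, so check the documentation for your release before relying on it:

clustrix> CREATE BINLOG 'binlog_a' LOG (db_a);

clustrix> CREATE BINLOG 'binlog_b' LOG (db_b);

Each slave would then be pointed at the binlog that carries the database it needs, leaving the other binlog and its slaves untouched.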

I often help customers work Clustrix into their existing replication topology. Clustrix features are unique, and it’s always fun to hear companies get excited by the flexibility that Clustrix provides.