ClustrixDB in AWS – Scaling Writes and Reads

ClustrixDB in AWS

ClustrixDB is a shared-nothing clustered scale-out MySQL-compatible relational database that massively scales both writes and reads. Designed for the cloud, ClustrixDB offers built-in high availability and is largely self-managing. With ClustrixDB you simply add database servers (“flex up”) to grow capacity and throughput to handle seasonal workloads, and then you can “flex down” after the peak, only paying for database capacity that you need.

In the following, we’ll explore how ClustrixDB clusters work including operations, high-availability and replication. Then we’ll take a step deeper into ClustrixDB relational scale-out technology, describing just how writes scale-out in the ClustrixDB architecture. And then we’ll finish with a high-level feature comparison between ClustrixDB, and two of Amazon’s offerings – RDS MySQL and RDS Aurora.

Designed to help your site grow fast, ClustrixDB takes the pain out of scaling and automates all of the complicated database operations traditionally performed to increase throughput, capacity, and high availability. Correspondingly, ClustrixDB lets you handle growth easily, predictably, and at the low-cost increments of adding commodity hardware.

ClustrixDB keeps things simple; the application sees a single database that provides SQL with ACID guarantees. This is uniquely different than solutions like RDS MySQL or RDS Aurora, which provide the ability to deploy read-replicas easily, but require the application to be modified to direct read queries to the read replicas. To avoid stale data, important reads must still be pointed at the master. With ClustrixDB, your application sees a single database – not the individual servers in the cluster. It simply uses all the hardware resources in aggregate yielding dramatic performance benefits.

ClustrixDB does not require classic ‘sharding,’ a strategy for scaling used by single write-master databases like MySQL or RDS Aurora. Sharding splits tables across multiple database servers, and often requires significant application modification. It also affects consistency between the database shards which can impact ACID compliance.

ClustrixDB patented technology distributes and redistributes data so you never have to shard or worry about data distribution—you get all the benefits of scale, without the management or maintenance overhead of sharding. ClustrixDB sends complex queries to any node, and unlike sharding, there are no consistency, performance, or referential integrity limitations that have to be handled by the application layer.

Built from the ground up, ClustrixDB has extensive support for MySQL features, including complex queries involving joins on a dozen or more tables, aggregates, sorts, and subqueries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes.

What is a ClustrixDB Cluster?

A ClustrixDB “cluster” is a group of three or more servers, in a peer-to-peer configuration. Each of those server nodes handles full writes and reads, and each node is shared-nothing, having its own, local subset of the data. There is at minimum two copies of the entire dataset, automatically sliced, distributed and balanced across the available servers. There are no ‘master,’ ‘slave,’ ‘compute’ or ‘standby’ nodes—all nodes are fully transactional.

ClustrixDB is best deployed as a 3 node or larger cluster, each instance with a minimum of 8 vcpu, 16GB RAM, and local (ephemeral) SSDs, running CentOS or RHEL 6.x. For example on AWS EC2, the following instances are recommended for ClustrixDB:

  • c3.xlarge or larger: for smaller datasets with significant processing requirements
  • i2.xlarge or larger: for very large datasets
  • A Load Balancer is needed to distribute the load from the application(s), and thus it’s best to have 2 NICs per instance, with the public IPs facing the Load Balancer, and the private IPs reserved for intermodal communication.

Each of the instances has its own storage, ideally in RAID 0. This allows for maximized I/O per node, while leveraging redundant replicas for high availability.

ClustrixDB Operation

After setting-up your ClustrixDB cluster, the easiest way to get up and running is to import a MySQL backup. Importing in parallel, ClustrixDB will be ready to roll much faster than it took MySQL to create that dumpfile. Then, simply point your application to the database load-balancer, and ClustrixDB will start processing queries.

Each ClustrixDB cluster uses a patented distribution methodology to allocate all the data amongst the available instances within the cluster. By doing this, ClustrixDB can distribute the workload (inserts, updates, reads) across the cluster instead of having one instance be the ‘Master’ and other instances be ‘Replicas’ that only accommodate a read workload. For example, a 3-instance cluster containing 900GB of data will have 300GB of data slices hash-distributed to each node.The ClustrixDB Rebalancer slices and distributes logical database tables across multiple physical nodes. The intelligent query planner can distribute the query to the data at its location in the cluster without moving the data.

High Availability: ClustrixDB automatically makes at least one copy (replica) of all the data and collocates it on one of the instances as a back up. At any one time there is always a backup copy of data and should an instance fail, the cluster will keep running and will begin a ‘reprotect’ process to ensure the data is reprotected with a second copy. This approach provides significantly higher application availability (most Clustrix customers have over 5 9’s).

ClustrixDB is seen by the application(s) as a single database, and the data is stored across all the nodes in the cluster, utilizing local solid-state disk (SSD) drives. Because the data is automatically replicated across cluster instances, your data is highly durable with less possibility of data loss during an instance failure. This is called “local high-availability” and is unique to ClustrixDB. Other RDBMS do not have local high-availability and rely on separate replicated servers, such as MySQL’s read slaves or Aurora’s Replicas in other AZ’s, requiring manual operations and application workload distribution to provide a minimal level of availability. With no manual intervention or application changes required, ClustrixDB weathers node loss transparently.

For Availability Zone disaster recovery, Clustrix offers multiple alternatives based on the requirements of the business. Clustrix supports multiple replication solutions and can provide master-master, master-slave, master-slave-slave to DR (disaster recovery) cluster(s) across availability zones, data centers and continents. This replication ensures that your database is available immediately after a failover because the replicated data copies already exist in the other zones. You simply point your application at the DR cluster.

Replication: ClustrixDB supports MySQL’s replication protocol as both a Slave and a Master. It supports the creation of multiple Slave instances that can connect to distinct replication Masters to consolidate servers. ClustrixDB also supports the creation of multiple binary logs (binlogs), each of which can correspond to specific databases and can be accessed by distinct replication Slaves. When running multiple-Slave configurations, ClustrixDB takes advantage of each node by load-balancing replication connections in a round-robin fashion. For more information about ClustrixDB replication, including configuring replication failover, see here.

Security: ClustrixDB recommends securing all unused ports on all instances running ClustrixDB, and managing public and private IPs carefully. Passwords are saved in either SHA1 or SHA2, and most native MySQL encryption functions are supported. ClustrixDB supports the MySQL Client for command-line access to the database, and in the next Release we will be adding SSL access as well.

ClustrixDB Relational Scale-Out Technology

The challenge of a scale-out database is how to add more nodes without increasing query latency, yet keeping all copies of the data transactionally in sync without breaking ACID guarantees. NoSQL solutions scale-out because they selectively relax ACID guarantees, and therefore rely on the application layer to provide consistency and referential integrity. ClustrixDB is able to scale-out and provide referential integrity guarantees via a combination of:

  • Shared-Nothing Data Distribution
  • Automatic Rebalancer
  • Sierra (Cascades) Query Optimizer
  • Multi-Version Concurrency Control (MVCC) and 2 Phase Locking (2PL)
  • Relational Scale Out Tech Blog

ClustrixDB automatically slices each table and distributes it across the available nodes, as well as creating at least one ‘replica’ of each slice, saved on a different node. Having two copies of the data allows the cluster to continue operations despite the loss of a node, with no data-loss. Both primary and secondary indexes are sliced and distributed across the cluster, and each node has a local copy of the Data Map of all slices.

In addition, ClustrixDB recognizes new nodes and incorporates them into the cluster with a few clicks and a single brief interruption-of-service. Workloads and data are automatically balanced across all nodes in the cluster by the patented Rebalancer, which is tuned to have minimal performance effect on ongoing transactions.

The ClustrixDB Query Optimizer provides the ability to execute one query with maximum parallelism and many simultaneous queries with maximum concurrency. This is achieved via a distributed query planner and compiler and a distributed shared-nothing execution engine. The ClustrixDB Query Optimizer is known as Sierra. Sierra is modeled off of the Cascades Query optimization framework, proven to satisfy the requirements and demands of many commercial database systems, most notably: Tandem’s NonStop SQL and Microsoft’s SQL Server.

After Shared-Nothing Data Distribution and Sierra Query Optimization, the next fundamental methodology of ClustrixDB is its ability to ‘send the query to the data’. This is one of the fundamental principles that allows ClustrixDB to scale almost linearly as more nodes are added.

This is very different from other systems, which routinely move large amounts of data to the node that is processing the query, then eliminate all the data that doesn’t fit the query (typically lots of data). By only moving qualified data across the network to the requesting node, ClustrixDB significantly reduces the network traffic bottleneck. In addition, processors on multiple nodes can be brought to bear on the data selection process. By selecting data on multiple nodes in parallel, the system produces results more quickly than if all data was selected by a single node, which first has to collect all the required data from the other nodes in the system. Data movement is thereby minimized, and transaction processing is maximized.

ClustrixDB uses parallel query evaluation for simple queries and Massively Parallel Processing (MPP) for analytic queries (akin to columnar stores). In addition, the Fair Scheduler ensures that OLTP queries are prioritized ahead of OLAP queries. Data is read from the ranking replica, assigned by the Rebalancer. This can either reside on the same node or require at most a single hop. As data set size and the number of nodes increase, the number of hops that one query requires (0 or 1) does not change. This allows linear scalability of reads and writes.

Comparison of ClustrixDB, Amazon RDS for MySQL, and Amazon RDS for Aurora

The basic difference between ClustrixDB and AWS’ MySQL-compatible offerings, is ClustrixDB scales writes and reads and the others do not. All three are natively compatible with MySQL client applications, ClustrixDB has others advantages over MySQL and Aurora besides scaling writes. A review of the following benchmark can help you decide between ClustrixDB, Aurora, and MySQL on Amazon RDS for your solution on AWS.

(Click here for a more detailed comparison of Clustrix and Aurora.)

 

Summary

ClustrixDB is a MySQL compatible clustered RDBMS able to scale-out SQL write and read transactions, natively providing high availability and fault tolerance. It accomplishes this by leveraging a shared-nothing architecture, automatic data-slicing and distribution by the Rebalancer, query optimization via Sierra (leveraging Cascades), and an Evaluation Model which distributes pre-compiled query fragments to the node on which the data resides. In addition, MVCC and 2 Phase Locking (2PL) are leveraged. All of this was written from the ground-up, and does not leverage any MySQL code.