The primary motivation for choosing Clustrix was the scalability, fault tolerance and online schema changes it provides.
With more than five million members, TheLadders is the premier online job-matching service, committed to finding the right person for the right job since 2003. With a unique suite of personalized products and resume services, the company helps all career-driven professionals connect with employers and recruiters more effectively and efficiently. TheLadders is headquartered in New York City.
TheLadders had been searching for over a year for a database solution that offered both scalability and fault tolerance and that would leave them ready for unpredictable growth. They wanted their database to be simple, maintainable and scalable beyond one machine. They also wanted their developers to focus on adding value to their product, rather than spending time working on a database layer. The company recognized that sharding would be very time-consuming and expensive, and it did not want to shard data or give up data consistency and rich queries, since rewriting 200,000+ lines of code would have been highly inefficient.
They tried different MySQL plugins and extensions without success and analyzed eight different ways to architect their structured data infrastructure before concluding that Clustrix was the right solution.
TheLadders considered various sharding solutions, only to find that sharding would cost 2.5 man-years to re-write code and 1.5 man-years to support.
TheLadders chose Clustrix because of its:
Shared-nothing massively parallel architecture that eliminates the need to shard data
Full SQL support; the simplicity of SQL results in faster application development and lower code maintenance
High availability with automatic recovery, since the company expects non-stop operations
Online operations like schema changes, re-provisioning, and cluster software upgrades
Concurrency and transaction control through an efficient, database-side MVCC implementation with ACID guarantees
Administration simplicity to scale read/write throughput by simply adding more nodes
High node performance provided by a highly tuned software package and optimized hardware
Excellent support, considered a must-have feature
Testing Clustrix for Scalability and Fault Tolerance
TheLadders is growing fast and needed a scalable platform. Yet they could not sacrifice fault tolerance for scalability – TheLadders needed both. While testing, TheLadders pulled power cords from cluster nodes during load tests to prove Clustrix’s resilience. They tested scalability to determine how many concurrent users each node could sustain and whether the cluster could scale both predictably and linearly as additional nodes and users were added.
As early adopters, they had to make sure they tested everything, including MySQL network protocol compatibility and compatibility with their schema and SQL code base (200,000+ lines of SQL code). They also tested write performance by type (Figure 1): single- and multi-row inserts and updates, and bulk data loads. Read performance was tested by query type (Figure 2): primary-key (PK) lookups, short and long range scans, sub-queries, derived tables and joins.
TheLadders found setting up new clusters and adding nodes to be trivial. The Clustrix cluster supports online schema changes without locking tables, so data layout changes can be made live and on-the-fly. The Clustrix cluster enabled TheLadders to keep growing without making complex application and operational changes – the database simply scales in a fault-tolerant manner.
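The linear-scaling check described in the testing above can be sketched in a few lines. This is a minimal illustration, not TheLadders' actual test harness: the function names, the 10% tolerance, and the sample throughput figures are all hypothetical and chosen only to show the calculation.

```python
def scaling_efficiency(measurements):
    """Return scaling efficiency per cluster size, relative to the smallest cluster.

    `measurements` maps node count -> sustained throughput in any consistent
    unit (for example, concurrent users or transactions per second).
    An efficiency of 1.0 means perfectly linear scaling.
    """
    base_nodes = min(measurements)
    per_node_baseline = measurements[base_nodes] / base_nodes
    return {
        nodes: throughput / (nodes * per_node_baseline)
        for nodes, throughput in sorted(measurements.items())
    }


def scales_linearly(measurements, tolerance=0.10):
    """True if every cluster size is within `tolerance` of ideal linear scaling."""
    efficiencies = scaling_efficiency(measurements)
    return all(abs(eff - 1.0) <= tolerance for eff in efficiencies.values())


# Hypothetical numbers for illustration only; these are NOT TheLadders' results.
sample = {3: 3000.0, 6: 5900.0, 9: 8700.0}

for nodes, eff in scaling_efficiency(sample).items():
    print(f"{nodes} nodes: {eff:.1%} of ideal linear throughput")
print("Scales linearly within 10%:", scales_linearly(sample))
```

Normalizing against the per-node throughput of the smallest cluster keeps the check independent of the unit being measured, so the same calculation works for concurrent users per node or queries per second.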
Figure 1. Write Performance Test
Figure 2. Read Performance Test
Time and Cost Savings with Clustrix
By choosing Clustrix, TheLadders eliminated the need to shard data, the master database as a single point of failure, the single-box memory limit, and write bottlenecks (including single-threaded MySQL replication). Clustrix enabled flexible topologies with other MySQL wire-protocol-compatible databases, supporting multiple replication sources and targets. TheLadders was able to reduce overall development costs by spending more time implementing higher value-added end-user functionality rather than fixing database bottlenecks. By using Clustrix, TheLadders saved on capital expenditure (CAPEX) by replacing a half-million-dollar setup with a $150K cluster, and even made their database setup “greener.”
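As a rough worked example, the figures quoted in this story can be combined as follows. The variable names are illustrative. Treating "half-million" as exactly $500,000, and adding the two sharding estimates into a single total, are both simplifying assumptions.

```python
# Figures quoted in the story; "half-million" is taken as exactly 500,000 (assumption).
previous_setup_usd = 500_000
clustrix_cluster_usd = 150_000

savings_usd = previous_setup_usd - clustrix_cluster_usd
savings_pct = 100 * savings_usd / previous_setup_usd

# Sharding estimate: 2.5 man-years to rewrite code plus 1.5 man-years to support.
# Summing them into one total is an assumption about how the estimates combine.
sharding_effort_man_years = 2.5 + 1.5

print(f"Hardware savings: ${savings_usd:,} ({savings_pct:.0f}% of the previous setup)")
print(f"Sharding effort avoided: {sharding_effort_man_years} man-years")
```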