Is it Possible to Alter a 3 Billion Row SQL Table Online?
At Clustrix, our goal has always been to simplify everything about the database. We started off tackling the scale-out challenges, then performance, easy MySQL migration, real-time analytics, and so on as the product matured and customers expressed their desires for other features. But all along ClustrixDB has had support for online schema changes, also known as online DDL. ClustrixDB has this ability due to its underlying MVCC architecture. While there can be a cost to pay in copying the entire SQL table, the benefits are huge. Unlike other systems, there is no need to lock the table for writes or reads during the change. There are few if any caveats in that it works equally well for adding columns, changing data-types, or adding indexes. ClustrixDB does all the work in the background until the revised table is ready and then replaces the old table with the new copy in a quick transaction.
Our customers have voiced their appreciation for this capability repeatedly. Software development has always been an iterative process and allowing iterative database design isn’t some advanced feature, it’s a requirement.
Big Data Online Schema Changes are > 100GB
Recently a certain NoSQL vendor blogged about the challenges of growing beyond 100GB of data. At Clustrix, we were surprised that this was a challenge for some. Most of our customers are easily managing terabytes of data on ClustrixDB without drama. In fact, our customer operations (support) team regularly assists customers with questions and tuning advice involving many terabytes of data, billions of rows, and tens of thousands of transactions per second. All in a day’s work.
This summer, we had a happy convergence of all these things. A particular customer had a table that had far outgrown their initial design and now held 2.7 billion rows of data with an auto-increment column online casino using the INT data-type. The then-current maximum id value was north of 3.6 billion and we estimated they had about 15-20 days of keys remaining before hitting the 4.2 billion limit of the data-type. This single table alone was nearly 2TB in size, well beyond the 100GB threshold that challenges other solutions. It’s a nice problem to have when your growth outpaces your design. We find that many young companies with popular apps are in this very situation.
Big Problem, Simple Solution
root@server:/$ mysql database_name -e "ALTER TABLE table_name CHANGE column_name column_name BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, auto_increment=3800000000, container=btree; CALL system.task_run ("pdcache_update")"; clx cmd "mysql system -e "call qpc_flush()""
One command, [Okay, to be precise there are technically three] and a lot of waiting is all it took on ClustrixDB. Because this was a huge table under a significant amount of load we actual did five things simultaneously with this ALTER:
- Change the primary key column data-type to BIGINT
- Restructure the table container as b-tree instead of our default layer-tree to help performance
- Reset the auto-increment key value to 3.8 billion (estimated new max plus buffer)
- Reset the system pdcaches
- Flush each server’s query plan cache
We have a detailed guide for planning your SQL table ALTERs in our online documentation. Not all the above will apply to your schema changes, but they can all be accomplished with one command without locking your tables.
Results: Happy Customers
Our first attempt at the above ran out of space after 39 hours and had to be rolled-back. Disappointing, but successful in that the schema was unlocked and addressable by queries and DML the entire time. After freeing some space, the second attempt completed after 60 hours. As we knew both of these attempts would exceed any reasonable maintenance window, we scheduled them for the quiet part of the customer’s weekly cycle. Throughout both events, ClustrixDB was serving 25,000 to 55,000 queries per second, day and night.
That’s what we consider a real Big Data problem. How does your Big Data solution support your need for online schema changes?