Why Build an Appliance?

Clustrix sells appliances. We marry our software with industry-standard hardware to make plug-and-play devices that drop in and work on the network. This is the same model as my former company Isilon (now EMC), NetApp, and pretty much all successful storage vendors. Why do this instead of selling software by itself?

Qualifying SSDs is particularly tricky because of the huge variation in quality.

The number one reason for the appliance model is quality control. By reducing the supported hardware set, we drastically increase the QA time we get on the intended hardware. QA time spent looking for hardware interactions really matters in a product that stores customers’ data. The bar is simply set much higher there.

As an example, we do extensive testing of durability of data on our specific hardware. Depending on the hard drive controller, controller firmware, drives, and drive firmware, we have gotten very different results on exactly when a specific piece of data makes it to stable storage. Many drives and even controllers lie about when the data is safe. Some drives implement tagged queuing or FUA poorly. I’ve even seen drives return from a sync command without actually having synced the data. The only way to ensure data integrity on a storage system is to properly characterize the hardware with an extensive test regime and control every piece of that storage system. With that control, we can form relationships with the vendors to fix bugs that we expose in the hardware and firmware.

At Clustrix, we have a variety of different pieces of test software to exercise the disk subsystem and verify the data is safe every time. With that test software, we have rejected many pieces of hardware and countless versions of firmware that didn’t make the cut. The same rigorous process goes into qualifying networking, InfiniBand, NVRAM, processing, and memory components. This kind of focused testing and qualification is not possible on a software-only product.
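To make the durability idea concrete, here is a minimal sketch in Python of the kind of write-and-verify check described above. It is not the Clustrix test software; the record layout and the function names `write_records` and `verify_records` are assumptions made up for illustration. The writer syncs after every record and remembers the last one the system acknowledged. The verifier checks that every acknowledged record is still present and intact.

```python
import os
import struct
import zlib

# Hypothetical record layout: 8-byte sequence number + 4-byte CRC32.
RECORD = struct.Struct("<QI")


def write_records(path, count):
    """Write `count` records, calling fsync after each one.

    Returns the highest sequence number whose fsync call returned,
    i.e. the last record the system claimed was on stable storage.
    """
    acked = -1
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for seq in range(count):
            payload = struct.pack("<Q", seq)
            os.write(fd, RECORD.pack(seq, zlib.crc32(payload)))
            os.fsync(fd)  # the hardware may or may not honor this
            acked = seq
    finally:
        os.close(fd)
    return acked


def verify_records(path, acked):
    """Return True if every acknowledged record is present and intact."""
    with open(path, "rb") as f:
        data = f.read()
    for seq in range(acked + 1):
        chunk = data[seq * RECORD.size:(seq + 1) * RECORD.size]
        if len(chunk) < RECORD.size:
            return False  # an acknowledged record never reached the disk
        found_seq, crc = RECORD.unpack(chunk)
        if found_seq != seq or crc != zlib.crc32(struct.pack("<Q", seq)):
            return False  # record is present but corrupted
    return True
```

In a real test the interesting part happens between the two calls: power is cut while the writer is running, the acknowledged sequence number is recorded on a separate machine, and the verifier runs after reboot. A drive or controller that lies about syncs shows up as an acknowledged record that is missing or damaged.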

The second critical benefit of the appliance model is much tighter integration between hardware and software. At Clustrix, we are able to monitor the hardware in the box and present that data seamlessly in our “system” database. For example, “SELECT * FROM system.memory” will give you all the details for the memory installed on the system and tell you if there are any correctable or uncorrectable ECC errors. We have logic to send alerts on correctable errors and safely shut down the node on an uncorrectable error. On Clustrix, this hardware-specific data sits right alongside database-specific data like queries per second and disk full percentage, which allows exceptionally easy integration with tools like Nagios and Cacti. This sort of tight integration is only possible on an appliance.
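As a rough illustration of how that data could feed a monitoring tool, here is a small Python sketch of a Nagios-style check over rows returned from system.memory. The column names `correctable_errors` and `uncorrectable_errors` and the function `check_memory` are hypothetical; the actual schema of the table is not shown here. The return codes follow the usual Nagios plugin convention of 0 for OK, 1 for WARNING, and 2 for CRITICAL.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_memory(rows):
    """Map ECC error counts to a Nagios status and message.

    `rows` is a list of dicts, one per memory module, using the
    hypothetical keys `correctable_errors` and `uncorrectable_errors`.
    """
    correctable = sum(r["correctable_errors"] for r in rows)
    uncorrectable = sum(r["uncorrectable_errors"] for r in rows)
    if uncorrectable > 0:
        return CRITICAL, f"CRITICAL: {uncorrectable} uncorrectable ECC errors"
    if correctable > 0:
        return WARNING, f"WARNING: {correctable} correctable ECC errors"
    return OK, "OK: no ECC errors reported"
```

A plugin built this way would run the query, pass the rows to `check_memory`, print the message, and exit with the returned code. The mapping mirrors the behavior described above: correctable errors are worth an alert, uncorrectable ones are critical.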

Finally, being an appliance makes the Clustrix database much easier to install, manage, and use. Creating a high-performance, bullet-proof database is no longer a science project. You no longer have to put together the pieces, get all the right versions of firmware, get the right versions of the kernel and libraries, get the right version of the database software, and make sure all the parts are tuned to work together. The Clustrix appliance is an integrated and tuned database right out of the box.