Real-time reporting gives you valuable insight from the data in your live operational database, current to the moment. ClustrixDB helps you understand what's going on with your business as events occur and transactions complete, so you can adjust operations and ultimately improve performance.
You might be an e-commerce company that wants to know which offers are increasing your bottom line during the Black Friday sale – as the sale is ongoing.
Or you might be an ad company trying to tune your strategies for advertising, and the delayed results from data warehouses such as Hadoop put you at a competitive disadvantage.
High Performance, Real-Time Reporting: Reality or Myth?
ClustrixDB Excels at Fast Real-Time Reporting
ClustrixDB is built with two key features that allow you to run fast real-time reporting on the database while ingesting massive volumes of data:
1. Massively Parallel Processing (MPP)
ClustrixDB brings the massively parallel processing used in data warehouses to the primary database. It uses multiple cores on a single node and multiple nodes in parallel to make your queries and reports go faster. The more nodes you add, the faster your reporting gets. ClustrixDB also does distributed processing for joins and aggregates. When a report runs over tables with billions of rows, for example a 15-table join with a 6-way aggregate, pulling all the data to a single node is just not feasible. ClustrixDB evaluates joins on all nodes in parallel and does partial distributed aggregation of data on each node. This capability minimizes data movement and maximizes parallel processing, allowing ClustrixDB to get faster as you add nodes and to scale as you run more analytic queries.
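The idea of partial distributed aggregation can be sketched in a few lines of Python. This is only an illustration of the general technique, not ClustrixDB's implementation: the "nodes" are simulated with threads, and the names NODE_SLICES, partial_aggregate, merge_partials, and distributed_report are hypothetical. Each node reduces its own rows to a small per-group summary, and only those summaries are moved and merged.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each simulated "node" holds a slice of an orders
# table as (offer_id, amount) rows.
NODE_SLICES = [
    [("A", 10.0), ("B", 5.0), ("A", 20.0)],
    [("B", 15.0), ("C", 7.0)],
    [("A", 30.0), ("C", 3.0), ("C", 2.0)],
]


def partial_aggregate(rows):
    """Runs on one node: reduce local rows to per-group [sum, count]."""
    partial = defaultdict(lambda: [0.0, 0])
    for offer_id, amount in rows:
        partial[offer_id][0] += amount
        partial[offer_id][1] += 1
    return dict(partial)


def merge_partials(partials):
    """Combine the small per-node summaries into the final report."""
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for offer_id, (total, count) in partial.items():
            merged[offer_id][0] += total
            merged[offer_id][1] += count
    return {
        offer_id: {"sum": total, "count": count, "avg": total / count}
        for offer_id, (total, count) in merged.items()
    }


def distributed_report(node_slices):
    """Aggregate every slice in parallel, then merge only the summaries."""
    with ThreadPoolExecutor(max_workers=len(node_slices)) as pool:
        partials = list(pool.map(partial_aggregate, node_slices))
    return merge_partials(partials)


report = distributed_report(NODE_SLICES)
for offer_id in sorted(report):
    print(offer_id, report[offer_id])
```

Note that the average is carried as a sum and a count rather than as per-node averages, since averages of averages would be wrong when nodes hold different numbers of rows. The data shipped between nodes grows with the number of groups, not the number of rows.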
2. Distributed Multi-Version Concurrency Control (MVCC)
ClustrixDB uses distributed multi-version concurrency control so that your reads and writes do not interfere with each other. Your reads and analytics see a consistent snapshot of the database as of the moment they start. Writes create newer versions of the data instead of overwriting it, so they do not have to wait for reads and analytics to finish. This approach removes interference between reads and writes, which lets both of them scale. The data seen by analytic queries is always consistent, so your reports are always reliable.
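A toy, single-process sketch of the multi-version idea is shown below. It is not distributed and is not ClustrixDB's implementation; the class names VersionedStore and Snapshot, the logical clock, and the key sales_total are all assumptions made for illustration. Writers append a new version stamped with a commit timestamp, and a snapshot reads only versions committed at or before the time it was taken.

```python
class VersionedStore:
    """Keeps every committed version of each key (hypothetical toy store)."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical commit timestamp

    def write(self, key, value):
        """Append a new version; existing versions are never modified."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Return a read view frozen at the current commit timestamp."""
        return Snapshot(self, self._clock)

    def read_at(self, key, ts):
        """Return the newest value committed at or before ts, else None."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


class Snapshot:
    """A consistent view of the store as of one timestamp."""

    def __init__(self, store, ts):
        self._store = store
        self._ts = ts

    def get(self, key):
        return self._store.read_at(key, self._ts)


store = VersionedStore()
store.write("sales_total", 100)

report_view = store.snapshot()        # a long-running report starts here
store.write("sales_total", 250)       # a later write does not wait for it

print(report_view.get("sales_total"))       # the report still sees 100
print(store.snapshot().get("sales_total"))  # a new reader sees 250
```

A real system also has to discard versions that no snapshot can still see and coordinate timestamps across nodes; both are left out of this sketch.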
With the move from expensive and limited scale-up to scale-out, more resources such as processors, memory and storage can be added cheaply to a cluster, using commodity off-the-shelf hardware. Your primary operational database can now have the resources not just for simple reads, writes, and updates, but also for increasingly complex aggregates and reporting.