Sharding a database is a complex, risky, and expensive process.
As the application continues to evolve and the database grows in size, organizations must invest even more time and money to enhance the sharding code that is compensating for growing complexity and slower cross-shard performance.
Challenge #1: Sharding is expensive.
Is there a way to effectively split the data by the primary key, a mod function, or an index table? What happens once the shards outgrow the scalability of each database server or the distribution mix of the data changes? The cost to re-shard the data and update the sharding strategy is a significant investment, taking away precious development resources that should be focused on creating strategic functionality to grow the business.
Challenge #2: Sharding jeopardizes high availability, backups, and disaster recovery.
In case of a database failure, sharding adds increased complexity to backing up database shards and ensuring they are redundant across multiple servers. What is the time and effort required to design a high availability system and how is a redundant system brought online without losing critical transactions or having excessive downtime?
Challenge #3: Sharding means additional administrative complexity.
Operationally, sharding creates multiple database servers — all of which need to be managed, upgraded, and maintained. The more shards that are introduced, the more complex the environment becomes, thereby increasing backup, patching, and schema maintenance costs. Replication further complicates the architecture, effectively doubling the number of servers.
Challenge #4: Sharding strategies lack support.
There is no 24x7 help when a code change breaks the application and causes a sharding strategy fail. Whatever the time — day, night or weekend — developers and operations staff are on their own to assess what has changed and what code needs fixing, potentially requiring a rollback with data loss.
Challenge #5: Sharding sacrifices development agility.
Any changes to the schema to facilitate new features require downtime in order to make the updates across all shards. A traditional database is fundamentally not designed to handle real-time schema changes. This, plus the added complexity of multiple shards, means that sharding takes much longer to roll out changes, resulting in downtime and expensive support staff. Developers have to replicate custom code for each application so that it will interact with the database, introducing more risk and complexity for every code change. Additionally, they have to have the knowledge and experience to do it. The result? Sharding causes a barrier to frequent application updates, affecting the speed of innovation and overall competitiveness of an organization.
Challenge #6: Sharding demands custom application code that can fail ACID compliance.
With data residing across multiple shards in a shared-nothing configuration, the application needs to manage the transactions to ensure data consistency — a function typically built into the database. Organizations must make a significant investment to replicate and maintain this standard database functionality.