#📝 Note: Database Sharding
#Overview
Sharding is a database architecture pattern that involves breaking a large database into smaller, more manageable parts called shards. Each shard is a separate database, and collectively, they hold the entire dataset.
#Why Shard?
- Horizontal Scalability: Distribute load across multiple servers.
- Improved Performance: Each server holds a smaller dataset and smaller indexes, so queries that touch a single shard run faster.
- Availability: A failure in one shard affects only the data on that shard, not the entire system.
#Sharding Strategies
- Key-based (Hash) Sharding: Use a hash of a key (e.g., user_id) to determine the shard.
- Range-based Sharding: Assign shards based on ranges of a value (e.g., A-M, N-Z).
- Directory-based Sharding: Maintain a lookup table to map keys to shards.
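
The three strategies can be sketched as small routing functions. This is a minimal illustration, not a production router: the shard count, the `RANGE_UPPER_BOUNDS` list, the `DIRECTORY` contents, and all function names are assumptions made up for the example.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def hash_shard(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Key-based: hash the key, then reduce it modulo the shard count."""
    # Use a stable hash; Python's built-in hash() for str is salted per
    # process, so it would route the same key differently after a restart.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Range-based: shard 0 holds names starting with A-M, shard 1 holds N-Z.
# Each entry is the inclusive upper bound of that shard's range.
RANGE_UPPER_BOUNDS = ["M", "Z"]


def range_shard(name: str) -> int:
    """Range-based: find the first range whose upper bound covers the key."""
    first = name[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"no range covers key {name!r}")
    return bisect.bisect_left(RANGE_UPPER_BOUNDS, first)


# Directory-based: an explicit lookup table (hypothetical contents).
DIRECTORY = {"alice": 0, "bob": 2}


def directory_shard(key: str) -> int:
    """Directory-based: look the key up; unknown keys raise KeyError."""
    return DIRECTORY[key]
```

In this sketch the hash and range routers need no stored state beyond their configuration, while the directory router depends on a table that must itself be stored and kept available.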
#Trade-offs
- Complexity: Application logic must handle shard routing.
- Resharding: Adding or removing shards means moving data between them, which is difficult to do while the system stays online.
- Joins: Performing joins across shards is very expensive and often avoided.
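
Two of these trade-offs can be made concrete with a short sketch. It assumes simple hash-modulo routing and represents each shard as an in-memory list of row dictionaries. The helper names `shard_for`, `moved_fraction`, and `scatter_gather` are hypothetical, chosen only for this example.

```python
import hashlib
from typing import Callable, Iterable


def shard_for(key: str, num_shards: int) -> int:
    """Route a key with a stable hash reduced modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys: Iterable[str], old_shards: int, new_shards: int) -> float:
    """Resharding cost: fraction of keys whose shard changes with the count."""
    keys = list(keys)
    if not keys:
        return 0.0
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


def scatter_gather(
    shards: list[list[dict]], predicate: Callable[[dict], bool]
) -> list[dict]:
    """Cross-shard query: every shard is visited and the results are merged.

    A real system would issue one network request per shard, which is why
    queries that cannot be routed to a single shard are expensive.
    """
    results: list[dict] = []
    for shard in shards:
        results.extend(row for row in shard if predicate(row))
    return results
```

Under modulo routing, growing from 4 to 5 shards should relocate roughly four in five keys, because a key stays put only when its hash leaves the same remainder under both shard counts. Calling `moved_fraction` on a large sample of keys with `old_shards=4` and `new_shards=5` gives a quick way to check this.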