Database Scalability

From Single Server to Distributed Systems


Imagine this scenario: You've had a good run up to 10,000 users with everything you know about SQL Server. It has served you well. You've normalized, structured and re-structured the database. You've used profilers and optimized the indexes so that your cascaded queries run at an acceptable level—just enough to keep the boss off your back for a while. 😊

But now you're hitting limits. Your application is slowing down, and you're wondering: What's next?

SQL Server, like MySQL, PostgreSQL, and Oracle, is a Relational Database Management System (RDBMS). These systems are powerful, but they weren't designed for infinite scale out of the box. The good news? There are proven techniques to scale them effectively.

Understanding RDBMS and ACID

A relational database organizes data into tables with relationships between them. The foundation of RDBMS reliability is ACID—a set of properties that guarantee database transactions are processed reliably:

  • Atomicity — a transaction either completes entirely or not at all
  • Consistency — every transaction moves the database from one valid state to another
  • Isolation — concurrent transactions behave as if they were executed sequentially
  • Durability — once committed, a transaction survives crashes and power loss

These properties make RDBMS systems incredibly reliable for transactional workloads. However, maintaining ACID guarantees across distributed systems introduces complexity—something we'll explore in the scaling techniques below.
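As a small illustration of atomicity (using Python's built-in sqlite3 as a stand-in for SQL Server; the table and amounts are made up), a transaction that fails midway leaves no partial writes behind:

```python
import sqlite3

# In-memory database stands in for a real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()

try:
    with conn:  # opens a transaction: commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        raise RuntimeError("crash before the matching credit")  # simulated failure
except RuntimeError:
    pass

# Atomicity: the debit was rolled back, so no money disappeared.
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance)  # 100
```

Without the rollback, account 1 would have been debited with no corresponding credit, which is exactly the half-finished state ACID forbids.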

The Scalability Debate

I often hear two extreme positions:

  • "SQL Server is crap—it doesn't scale"
  • "You can solve EVERYTHING with RDBMS"

Both are wrong and reflect a dangerous dogmatism. Limiting your toolbox is never a good strategy, especially when there are numerous proven techniques to scale relational databases: master-slave replication, master-master replication, federation, sharding, denormalization, SQL tuning, and more.

Let's explore these techniques, understand their tradeoffs, and learn when to apply each one.

Master-Slave Replication

In master-slave replication, the master database handles all writes and reads, while replicating writes to one or more slave databases that serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion.

[Diagram: Master-Slave Architecture. The master handles reads and writes and replicates to Slave 1, Slave 2, and Slave 3, each serving reads only.]

Benefits

  • Improves read performance by distributing load across multiple slaves
  • System can continue in read-only mode if master fails
  • Backups can be performed on slaves without impacting master

Challenges

  • Additional logic needed to promote a slave to master during failover
  • Potential data loss if master fails before replication completes
  • Heavy write load can bog down read replicas with replication lag
  • More slaves means more replication lag and hardware complexity
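A minimal sketch of the application-side read/write splitting this topology requires. The connection names (db-master, db-slave-1, ...) and the naive SQL classification are illustrative, not a real driver API:

```python
import itertools

class ReplicatedPool:
    """Routes writes to the master and reads round-robin across slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # endless round-robin over slaves

    def connection_for(self, sql):
        # Naive classification: anything that isn't a SELECT goes to the master.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._slaves)
        return self.master

pool = ReplicatedPool(master="db-master", slaves=["db-slave-1", "db-slave-2"])
print(pool.connection_for("SELECT * FROM users"))   # db-slave-1
print(pool.connection_for("UPDATE users SET ..."))  # db-master
print(pool.connection_for("SELECT 1"))              # db-slave-2
```

Real deployments usually push this into a proxy or the driver layer, and must also handle replication lag (a read immediately after a write may not see it on a slave).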

Master-Master Replication

Both masters serve reads and writes, coordinating with each other on write operations. If either master fails, the system continues operating with full read-write capability.

[Diagram: Master-Master Architecture. Master 1 and Master 2 both serve reads and writes, continuously syncing changes in both directions.]
How It Works

  • Both masters accept reads and writes simultaneously
  • Changes are continuously synchronized bidirectionally
  • Conflict resolution is critical when both masters modify the same data
  • System remains fully operational if one master fails

Benefits

  • High availability—system remains fully operational if one master fails
  • Distributes both read and write load across multiple nodes

Challenges

  • Requires load balancer or application logic to route writes
  • Most systems are either loosely consistent (violating ACID) or have increased write latency due to synchronization
  • Conflict resolution becomes critical as write nodes and latency increase
  • Replication adds hardware and operational complexity
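One common (if lossy) conflict-resolution strategy is last-write-wins by timestamp. This toy sketch treats each conflicting row as a hypothetical (value, updated_at) pair; production systems need far more care (vector clocks, CRDTs, or application-level merges):

```python
def last_write_wins(row_a, row_b):
    """Resolve a row modified on two masters: keep the newer version.

    Each row is a (value, updated_at) tuple; timestamp ties break
    deterministically on the value so both masters converge.
    """
    if row_a[1] != row_b[1]:
        return row_a if row_a[1] > row_b[1] else row_b
    return min(row_a, row_b)  # deterministic tie-break

# Both masters updated user 42's email concurrently:
from_master1 = ("alice@old.example", 1700000001)
from_master2 = ("alice@new.example", 1700000005)
print(last_write_wins(from_master1, from_master2)[0])  # alice@new.example
```

Note the tradeoff: the losing write is silently discarded, which is precisely the "loosely consistent" behavior mentioned above.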

Federation (Functional Partitioning)

Federation splits databases by function. Instead of one monolithic database, you might have separate databases for forums, users, and products. This reduces read/write traffic to each database and minimizes replication lag.

[Diagram: Federation. The application layer routes each query to the appropriate functional database (Users DB, Forums DB, Products DB).]

Example query routing:

  • GET /api/users/123 → Users DB
  • GET /api/threads/456 → Forums DB
  • GET /api/products/789 → Products DB

Benefits

  • Smaller databases fit more data in memory, improving cache hit rates
  • No single master serializing writes—enables parallel writes and increased throughput
  • Reduces replication lag by distributing load

Challenges

  • Not effective if your schema requires huge functions or tables
  • Application logic must determine which database to query
  • Joining data across databases is complex and requires server links
  • Adds hardware and operational complexity
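The query routing described above can be sketched in application code. The database names and the URL scheme are illustrative:

```python
# Map each functional area to its own database (names are illustrative).
FEDERATED_DBS = {
    "users": "users-db",
    "threads": "forums-db",
    "products": "products-db",
}

def route(path):
    """Pick the federated database for an API path like /api/users/123."""
    resource = path.strip("/").split("/")[1]  # 'api/<resource>/<id>'
    return FEDERATED_DBS[resource]

print(route("/api/users/123"))     # users-db
print(route("/api/threads/456"))   # forums-db
print(route("/api/products/789"))  # products-db
```

The lookup itself is trivial; the hard part is what it rules out, since a query touching both users and products can no longer be a single SQL join.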

Sharding (Horizontal Partitioning)

Sharding distributes data across multiple databases such that each database manages only a subset of the data. For example, a users database might be sharded by the user's last name initial or geographic location.

[Diagram: Sharding. Users are distributed across three shards by last-name range (A-H, I-P, Q-Z). Range-based sharding is simple, but distribution can skew: here the Q-Z shard holds 10 users while A-H and I-P hold 8 each.]

Sharding Considerations

  • Range-based: simple to implement but prone to hotspots; good for range queries
  • Hash-based: more even distribution, but range queries become difficult
  • Consistent hashing: minimizes data movement when adding or removing shards
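A minimal sketch of hash-based shard selection. The shard count and the choice of MD5 are illustrative; production systems often prefer consistent hashing so that adding a shard doesn't remap most keys:

```python
import hashlib

NUM_SHARDS = 3

def shard_for(user_id):
    """Hash-based sharding: stable hash of the key, modulo shard count.

    MD5 is used only as a stable, well-mixed hash, not for security.
    """
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# The same key always lands on the same shard:
print(shard_for(12345) == shard_for(12345))  # True
# Keys spread across shards instead of clustering by name or range:
print(sorted({shard_for(i) for i in range(100)}))  # [0, 1, 2]
```

The modulo scheme makes the hotspot problem go away but creates a rebalancing one: changing NUM_SHARDS remaps almost every key, which is exactly what consistent hashing is designed to avoid.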

Benefits

  • Reduces read/write traffic and replication per shard
  • Improves cache hits with smaller, more focused datasets
  • Smaller index sizes improve query performance
  • If one shard fails, others remain operational
  • Parallel writes across shards increase throughput

Challenges

  • SQL Server doesn't natively support sharding (unlike MongoDB)—requires emulation with careful planning
  • Application logic must work with shards, potentially creating complex SQL queries
  • Data distribution can become lopsided (e.g., power users on one shard)
  • Rebalancing shards adds significant complexity
  • Joining data across shards is extremely complex
  • Adds substantial hardware and operational overhead

Denormalization

Denormalization improves read performance at the expense of some write performance. Redundant copies of data are written across multiple tables to avoid expensive joins.

Denormalization Comparison

Consider a normalized schema: a users table (id PK, name, email) and an orders table (id PK, user_id FK, total, date). Fetching an order together with the customer's details requires a join:

SELECT
  orders.id,
  orders.total,
  orders.date,
  users.name,
  users.email
FROM orders
JOIN users ON orders.user_id = users.id
WHERE orders.id = 12345;

The normalized design avoids data duplication, keeps user info easy to update, and maintains data integrity. The price is paid on every read: expensive JOIN operations, multiple table lookups, and extra disk seeks.
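As a sketch of the denormalized alternative (using Python's sqlite3 as a stand-in for a real RDBMS; the table and column names are illustrative), the user's name and email are copied onto each order so a read needs no join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Denormalized: user_name/user_email are redundant copies of users data,
# traded for join-free reads.
conn.execute("""CREATE TABLE orders_denorm (
    id INTEGER PRIMARY KEY,
    total REAL,
    date TEXT,
    user_name TEXT,
    user_email TEXT
)""")
conn.execute(
    "INSERT INTO orders_denorm VALUES "
    "(12345, 99.5, '2024-01-15', 'Alice', 'alice@example.com')"
)

# Single-table lookup: no JOIN, one index seek.
row = conn.execute(
    "SELECT total, user_name, user_email FROM orders_denorm WHERE id = 12345"
).fetchone()
print(row)  # (99.5, 'Alice', 'alice@example.com')

# The cost: changing Alice's email now means touching every one of her orders.
conn.execute(
    "UPDATE orders_denorm SET user_email = 'new@example.com' "
    "WHERE user_name = 'Alice'"
)
```

That final UPDATE is the write penalty in miniature: one logical change fans out to as many rows as the user has orders.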

When to Denormalize

  • Read-heavy workloads where reads outnumber writes 100:1 or more
  • Distributed systems where joins across data centers are expensive
  • Performance-critical queries that run frequently

Avoid it for write-heavy workloads where data changes frequently, since every write must update each redundant copy.

In most systems, reads heavily outnumber writes—often by ratios of 100:1 or even 1000:1. A read resulting in complex joins can be very expensive, spending significant time on disk operations. Once data becomes distributed through federation and sharding, joins across data centers become even more complex. Denormalization can circumvent the need for such joins.

Benefits

  • Dramatically improves read performance by eliminating expensive joins
  • Reduces query complexity for read-heavy workloads
  • Some RDBMS (PostgreSQL, Oracle) support materialized views to automate this

Challenges

  • Data is duplicated—classic space-for-speed tradeoff
  • Constraints needed to keep redundant copies in sync increase design complexity
  • Heavy write loads can make denormalized databases perform worse than normalized ones

I have extensive experience with denormalization in data analysis contexts. For a deeper dive, check out this 2006 study on data denormalization in decision support systems.

Choosing the Right Approach

Each scaling technique has distinct tradeoffs. Here's a comprehensive comparison to help guide your decision:

Technique                    Read Perf   Write Perf   Complexity   Cost     Scalability   Consistency
Master-Slave Replication     High        Medium       Low          Low      Medium        High
Master-Master Replication    High        Medium       Medium       Medium   Medium        Medium
Federation                   High        High         Medium       Medium   High          High
Sharding                     High        High         High         High     High          Medium
Denormalization              High        Low          Medium       Low      Medium        Medium

Master-Slave Replication
  • Best for: read-heavy workloads with occasional writes
  • Avoid when: workloads are write-heavy or read slaves can't keep up

Master-Master Replication
  • Best for: high-availability requirements, geographically distributed users
  • Avoid when: strict ACID guarantees are critical

Federation
  • Best for: clear functional boundaries, microservices architectures
  • Avoid when: you need frequent cross-database joins

Sharding
  • Best for: massive scale where data can be partitioned cleanly
  • Avoid when: scale is small to medium, or queries must span shards

Denormalization
  • Best for: read-heavy workloads (100:1 or higher read/write ratio)
  • Avoid when: workloads are write-heavy or data changes frequently

Reading the Matrix

Complexity: Implementation and operational difficulty
Cost: Hardware, infrastructure, and maintenance
Scalability: Ability to handle growing data and traffic
Consistency: Data consistency guarantees (ACID compliance)

Key Takeaways

  • Start simple: Master-slave replication for read-heavy workloads
  • Need high availability? Consider master-master replication
  • Clear functional boundaries? Federation works well
  • Massive scale needed? Sharding (but be prepared for complexity)
  • Read-heavy with complex joins? Denormalization can help
  • Often you'll combine multiple techniques as your system evolves

Closing Thoughts

Database scalability isn't about picking one "right" technique—it's about understanding the tradeoffs and choosing the approach (or combination of approaches) that fits your specific needs.

Whether you're dealing with SQL Server, PostgreSQL, MySQL, or any other RDBMS, these fundamental patterns apply. Don't fall into the trap of dogmatism. Keep your toolbox full and choose the right tool for the job.

This is part of my system design series. Stay tuned for more deep dives into distributed systems, caching strategies, and architectural patterns.

Until next time… Happy coding! 🚀