Key Topics Overview

Core concepts you need to know for system design interviews.

CAP Theorem

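The CAP theorem states that a distributed system can provide at most two of the following three guarantees at the same time. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency and availability: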

Consistency

Every read receives the most recent write
All nodes see the same data at the same time

Availability

Every request receives a response
System remains operational even with node failures

Partition Tolerance

System continues to operate despite network failures
Must handle network partitions between nodes

Trade-offs

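CA: Single-node RDBMS (MySQL, PostgreSQL), where no network partition between nodes can occur
CP: MongoDB, Redis (as commonly classified; actual behavior depends on configuration)
AP: Cassandra, DynamoDB (as commonly classified; both also offer stronger consistency options)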

Load Balancing

Load balancers distribute incoming traffic across multiple servers so that no single server becomes a bottleneck or a single point of failure.

Key Features

High Availability
Fault Tolerance
Scalability

Common Algorithms

Round Robin
Least Connections
Weighted Round Robin
IP Hash
Least Response Time
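
Three of the algorithms above can be sketched in a few lines. This is a minimal illustration, not a production balancer: the class and function names (`RoundRobinBalancer`, `LeastConnectionsBalancer`, `ip_hash_pick`) are hypothetical, servers are plain strings, and health checks are left out.

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Picks the server that currently has the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to the server closes.
        self.active[server] -= 1


def ip_hash_pick(servers, client_ip):
    """Maps a client IP to the same server while the server list is unchanged."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Round Robin ignores server load, Least Connections tracks it, and IP Hash trades even distribution for keeping a given client on one server.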

Caching

Caching improves system performance by keeping frequently accessed data in a faster storage layer, typically memory, so that repeated reads avoid the slower backing store.

Caching Strategies

Cache-Aside (Lazy Loading)
- Load data into cache only when needed
- Good for read-heavy workloads
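Write-Through
- Write to the cache and the DB in the same operation
- Keeps cache and DB consistent, at the cost of slower writes
Write-Behind
- Update cache first, then DB asynchronously
- Better write performance, but data can be lost if the cache fails before the DB is updated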
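
The difference between Cache-Aside and Write-Through shows up in where the cache gets filled. The sketch below uses plain dictionaries as stand-ins for the cache and the database; the class names and the `db_reads` counter are hypothetical and exist only for illustration.

```python
class CacheAsideStore:
    """Cache-aside: the cache is filled lazily, on the first read of a key."""

    def __init__(self, db):
        self.db = db        # stand-in for the database
        self.cache = {}     # stand-in for the cache
        self.db_reads = 0   # counts how often reads fall through to the DB

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.db[key] = value
        # Invalidate so the next read reloads the fresh value.
        self.cache.pop(key, None)


class WriteThroughStore:
    """Write-through: every write updates the DB and the cache together."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def put(self, key, value):
        self.db[key] = value
        self.cache[key] = value

    def get(self, key):
        return self.cache.get(key, self.db.get(key))
```

With cache-aside, only the first read of a key pays the database cost; with write-through, the cache is already warm after a write.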

Cache Eviction Policies

LRU (Least Recently Used)
LFU (Least Frequently Used)
FIFO (First In First Out)
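
As an illustration of the LRU policy, here is a small sketch built on `collections.OrderedDict` from the Python standard library. The `LRUCache` class name and its methods are hypothetical; real caches add concerns such as expiry and thread safety that are omitted here.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used key once capacity is exceeded."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest use first, newest use last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def __contains__(self, key):
        return key in self._items
```

For example, with a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `a` was used more recently.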

Content Delivery Networks (CDN)

CDNs cache content on geographically dispersed servers so that users are served from a location close to them.

Benefits

Reduce Latency
Decrease Server Load
Improve Availability
Handle Traffic Spikes

Use Cases

Static Content
Media Files
API Caching
Dynamic Content

Database Architecture

Master-Slave Replication

Master (Primary)

Handles write operations
Maintains authoritative copy
Replicates changes to slaves

Slaves (Replicas)

Handle read operations
Provide redundancy
Scale read capacity

When to Use

Read-heavy workloads
Need for data redundancy
Geographic distribution
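
The split between writes and reads can be sketched as a small router. This is a toy model under stated assumptions: dictionaries stand in for database nodes, `ReplicatedStore` and its methods are hypothetical names, and replication is triggered by hand to make the delay between a write and its visibility on replicas explicit.

```python
import itertools


class ReplicatedStore:
    """Sends writes to the master and spreads reads across the replicas."""

    def __init__(self, replica_count):
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Only the master accepts writes.
        self.master[key] = value

    def replicate(self):
        # Stand-in for asynchronous replication: copy the master's state.
        for replica in self.replicas:
            replica.update(self.master)

    def read(self, key):
        # Reads rotate across replicas, which may lag behind the master.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)
```

A read issued before `replicate()` runs returns nothing for a freshly written key, which is the stale-read behavior that asynchronous replication can show.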

Scaling Strategies

Vertical Scaling (Scale Up)

Add more power (CPU, RAM, storage) to existing machines
Limited by the maximum hardware capacity of a single machine
Simple to operate, but expensive at the high end

Horizontal Scaling (Scale Out)

Add more machines
Better fault tolerance
More complex architecture

Database Sharding

Horizontal Sharding

Split data across multiple databases
Based on partition key
Example: User IDs 1 to 1,000,000 on Shard 1, 1,000,001 to 2,000,000 on Shard 2

Vertical Sharding

Split different features into separate databases
Example: User profiles in one DB, user posts in another
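
Both kinds of sharding come down to a routing function. The sketch below assumes range-based horizontal sharding with one million user IDs per shard, matching the example above, and a fixed feature-to-database map for vertical sharding. The names `range_shard`, `database_for`, `profiles_db`, and `posts_db` are hypothetical.

```python
SHARD_SIZE = 1_000_000  # assumed number of user IDs per shard


def range_shard(user_id, shard_size=SHARD_SIZE):
    """Horizontal sharding: returns the 1-based shard number for a user ID."""
    if user_id < 1:
        raise ValueError("user_id must be positive")
    return (user_id - 1) // shard_size + 1


# Vertical sharding: each feature lives in its own database.
FEATURE_DATABASES = {
    "user_profiles": "profiles_db",
    "user_posts": "posts_db",
}


def database_for(feature):
    """Returns the database that owns the given feature's data."""
    return FEATURE_DATABASES[feature]
```

Here `range_shard(1_000_000)` is 1 and `range_shard(1_000_001)` is 2, so each ID belongs to exactly one shard.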

Database Types

SQL (Relational)

Structured data
ACID compliance
Complex queries
Examples: MySQL, PostgreSQL

NoSQL

Document (MongoDB)
- Flexible schema
- Nested data
- Good for content management
Key-Value (Redis)
- Simple structure
- High performance
- Caching
Column-Family (Cassandra)
- High scalability
- Good for time-series data
Graph (Neo4j)
- Relationship-focused
- Social networks
- Recommendation engines

API Design

REST Principles

Stateless
Resource-based
Standard HTTP methods
HATEOAS (Hypermedia as the Engine of Application State)

Best Practices

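Use HTTP methods according to their semantics (GET to read, POST to create, PUT to replace, DELETE to remove)
Version your APIs (for example, with a /v1/ path prefix)
Return accurate status codes (for example, 201 when a resource is created, 404 when it is not found)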
Implement pagination
Support filtering and sorting
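
Pagination, filtering, and sorting can be illustrated without any web framework. The function below is a hypothetical helper that operates on an in-memory list of dictionaries; the parameter names (`page`, `per_page`, `sort_by`) and the response shape are assumptions, not a standard.

```python
def list_items(items, page=1, per_page=20, sort_by=None, descending=False, **filters):
    """Filters by exact field match, sorts, then returns one page of results."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")

    # Filtering: keep items whose fields equal every requested value.
    matched = [
        item for item in items
        if all(item.get(field) == value for field, value in filters.items())
    ]

    # Sorting: optional, by a single field.
    if sort_by is not None:
        matched.sort(key=lambda item: item[sort_by], reverse=descending)

    # Pagination: slice out the requested page.
    start = (page - 1) * per_page
    return {
        "items": matched[start:start + per_page],
        "page": page,
        "per_page": per_page,
        "total": len(matched),
    }
```

Returning `total` alongside the page lets a client work out how many pages remain without fetching them.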

Synchronous vs Asynchronous

Synchronous

Blocking operations: the caller waits for the result
Response returned in the same request
Simpler to implement
Latency grows with the slowest step in the call chain

Asynchronous

Non-blocking
Better scalability
Message queues
Event-driven architecture

When to Use Each

Sync: CRUD operations, simple requests
Async: Long-running tasks, notifications
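
The contrast can be sketched with Python's standard `queue` and `threading` modules standing in for a message queue and a background worker. The function names (`start_worker`, `handle_sync`, `handle_async`) are hypothetical, and a real system would use a durable queue rather than an in-process one.

```python
import queue
import threading


def start_worker(tasks, results):
    """Starts a background thread that runs queued jobs until it sees None."""
    def run():
        while True:
            job = tasks.get()
            if job is None:          # sentinel: shut the worker down
                tasks.task_done()
                break
            results.append(job())
            tasks.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def handle_sync(job):
    """Synchronous: the caller blocks until the job has finished."""
    return job()


def handle_async(tasks, job):
    """Asynchronous: enqueue the job and acknowledge right away."""
    tasks.put(job)
    return "accepted"
```

The synchronous handler returns the job's result, while the asynchronous one returns only an acknowledgement; the result appears later in `results` once the worker has processed the job.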

Idempotency

Definition

Multiple identical requests have the same effect as a single request
Critical in distributed systems, where requests are often retried after timeouts or failures

Implementation

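Have clients send a unique idempotency key with each request
Store the status and result of each request under its key
Check for a duplicate key before processing, and return the stored result if one exists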
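
These steps can be sketched with an in-memory store. `PaymentService`, `charge`, and the response fields are hypothetical names chosen for illustration. The sketch is single-threaded; a real service would need an atomic check-and-store in a shared database so that concurrent retries cannot both pass the duplicate check.

```python
class PaymentService:
    """Applies each idempotency key at most once and replays the stored result."""

    def __init__(self):
        self._responses = {}   # idempotency key -> stored response
        self.charges = []      # side effects actually performed

    def charge(self, idempotency_key, amount):
        # Check for duplicates: a known key returns the stored response.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]

        # First time this key is seen: perform the side effect once.
        self.charges.append(amount)
        response = {
            "charge_id": len(self.charges),
            "amount": amount,
            "status": "succeeded",
        }

        # Store the request status under its key for future retries.
        self._responses[idempotency_key] = response
        return response
```

A client that retries with the same key after a timeout gets the original response back, and the charge is recorded only once.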

Idempotent HTTP Methods

GET
PUT
DELETE
HEAD

Non-Idempotent Methods

POST
PATCH (not guaranteed to be idempotent; it depends on how the patch is defined)
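
The difference between the two groups is easiest to see by repeating a call. The toy resource below is a hypothetical in-memory model, not an HTTP server; `put`, `post`, and `delete` only mimic the semantics of the corresponding methods.

```python
class NotesResource:
    """In-memory model of a collection of notes keyed by ID."""

    def __init__(self):
        self.notes = {}
        self._next_id = 1

    def put(self, note_id, text):
        # PUT replaces the note at a known ID; repeating it changes nothing.
        self.notes[note_id] = text
        return note_id

    def post(self, text):
        # POST creates a new note on every call, so repeats add duplicates.
        while self._next_id in self.notes:
            self._next_id += 1
        note_id = self._next_id
        self.notes[note_id] = text
        return note_id

    def delete(self, note_id):
        # DELETE leaves the same end state whether or not the note existed.
        self.notes.pop(note_id, None)
```

Calling `put(7, "x")` twice leaves one note, while calling `post("y")` twice leaves two, which is why retried POST requests usually need an idempotency key.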

Additional Resources