This document provides software engineering advice from Jeff Dean based on his experience building large-scale distributed systems at Google. Some key points include:
- Design systems to be simple, scalable, performant, reliable, and general while balancing different goals. Get advice on designs before coding.
- Distributed systems require careful data partitioning and high capacity, even within a single datacenter. Products are deployed across multiple datacenters worldwide.
- Real hardware is unreliable so systems must be designed to handle many types of failures gracefully.
- Prioritize low latency: consider data access patterns and encoding, and use caching and parallelism where possible.
- Favor eventual consistency over strong consistency for availability.
- Add monitoring, debugging hooks,
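The point about caching and parallelism can be illustrated with a minimal sketch. It is not taken from the talk: `fetch` is a hypothetical stand-in for a slow backend lookup, and `CALLS` and `fetch_all` are names invented here for illustration. The sketch puts an in-memory cache in front of the lookup and issues independent lookups in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Hypothetical record of real backend hits, used only to show cache behavior.
CALLS = []

@lru_cache(maxsize=1024)
def fetch(key: str) -> str:
    """Hypothetical slow lookup (e.g. a remote read); cached after the first call."""
    CALLS.append(key)      # runs only on a cache miss
    return key.upper()     # placeholder for the real result

def fetch_all(keys):
    """Look up many keys in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(fetch, keys))

first = fetch_all(["a", "b", "c"])   # three backend hits
second = fetch_all(["a", "b", "c"])  # served from the cache
```

Note that `lru_cache` keeps the cache consistent across threads but does not deduplicate concurrent misses for the same key, so a production system would likely need request coalescing as well.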