Mirror of https://github.com/binhnguyennus/awesome-scalability.git (synced 2025-01-20 19:11:28 -05:00)
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Awesome Scalability, Availability, and Stability Backend
A curated list of selected readings/case studies to illustrate Scalability, Availability, and Stability patterns in backend design.
What if your backend went slow?
Understand your problem first: is it a performance problem (slow even for a single user) or a scalability problem (fast for a single user but slow under heavy load)? Start from the basics.
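To make the distinction concrete, here is a minimal sketch that classifies a backend from a toy model. Everything in it is an assumption made for illustration: the names `worst_latency` and `diagnose`, the idea that all requests arrive at once, and the fixed pool of identical workers. Real systems need measurement, not a formula.

```python
import math


def worst_latency(requests: int, workers: int, service_time_s: float) -> float:
    """Worst-case latency when `requests` arrive together and `workers` serve in parallel.

    Toy model: requests are served in waves of `workers`, each wave taking
    `service_time_s` seconds.
    """
    waves = math.ceil(requests / workers)
    return waves * service_time_s


def diagnose(workers: int, service_time_s: float, load: int, budget_s: float) -> str:
    """Classify the backend against a latency budget (hypothetical helper)."""
    single_user = worst_latency(1, workers, service_time_s)
    under_load = worst_latency(load, workers, service_time_s)
    if single_user > budget_s:
        # Slow even with no contention: the request itself is too expensive.
        return "performance problem"
    if under_load > budget_s:
        # Fast alone, slow when many requests queue up behind each other.
        return "scalability problem"
    return "ok"


if __name__ == "__main__":
    print(diagnose(workers=4, service_time_s=0.5, load=100, budget_s=0.2))
    print(diagnose(workers=4, service_time_s=0.05, load=100, budget_s=0.2))
```

In this model a performance problem calls for making each request cheaper, while a scalability problem calls for more capacity or less contention.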
What if your backend went down?
"Even if you lose all one day, you can build all over again if you retain your calm!" - Thuan Pham, CTO at Uber Technologies Inc.
Contributing
Please take a look at the contribution guidelines first. Contributions are always welcome!
Contents
Basic
- CAP Theorem and the Trade-offs
- Scale up vs Scale out
- Latency vs Throughput: Striving for maximal throughput with acceptable latency
- ACID?
- Architecture Issues: Bottlenecks, Database, CPU, IO
- Immutability
Scalability
- Distributed Caching
- Distributed Logging & Tracing
- Distributed Messaging
- Storage
  - NoSQL
  - RDBMS
    - Why SQL is beating NoSQL, and what this means for the future of data
    - Sharding MySQL at Pinterest
    - How Airbnb Partitioned Main MySQL Database in Two Weeks
    - Replication is the Key for Scalability & High Availability
    - How Twitch uses PostgreSQL
    - Scaling MySQL-based financial reporting system at Airbnb
    - Scaling to 100M at Wix: MySQL is a Better NoSQL
    - Why Uber Engineering Switched from Postgres to MySQL
    - Handling Growth with Postgres at Instagram
- HTTP Caching
- Concurrency
- Event-Driven Architecture
- Load-balancing
- Parallel Computing
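Several of the storage readings above concern sharding and partitioning. As a rough illustration of the basic idea only, the sketch below routes a key to one of N shards with a stable hash. The function name `shard_for` and the key format are hypothetical, and this is not the scheme used by any of the companies listed.

```python
import zlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash.

    zlib.crc32 is used instead of the built-in hash() because hash() on
    strings is randomized per process, which would break routing.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_shards


if __name__ == "__main__":
    for user_id in ("user:1", "user:2", "user:3"):
        print(user_id, "-> shard", shard_for(user_id, 4))
```

One known limitation of plain modulo routing: changing `num_shards` remaps most keys, which is why production systems often add a lookup table or consistent hashing on top.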
Availability
Stability
- Circuit Breaker
- Always use timeouts (if possible)
- Let it crash/Supervisors: Embrace failure as a natural state in the life-cycle of the application
- Crash early: An error now is better than a response tomorrow
- Bulkheads: Partition and tolerate failure in one part
- Steady state: Always put logs on a separate disk
- Throttling: Maintain a steady pace
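As an illustration of the first pattern in this list, here is a minimal circuit breaker sketch. The class name, parameters (`max_failures`, `reset_after_s`), and the injectable `clock` are assumptions chosen for the example, not taken from any particular library. It is single-threaded and omits details a production breaker would need.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                # Open: fail fast instead of hitting the broken dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` is a hypothetical function, and treat `CircuitOpenError` as a signal to serve a fallback.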
Others
Special Thanks
- Jonas Bonér, CTO at Lightbend, for the original inspiration