
# High Scalability, High Availability, High Stability, High Performance, and High Intelligence System Design Patterns

An updated and curated list of selected readings to illustrate best practices in building high scalability, high availability, high stability, high performance, and high intelligence large-scale systems. Concepts are explained in articles by prominent engineers and in credible references. Case studies are taken from battle-tested systems that serve millions to billions of users.

#### If your system goes slow :traffic_light:
> Understand your problem first: is it a scalability problem (fast for a single user but slow under heavy load) or a performance problem (slow for a single user)? Review some [design principles](#principle) and check how [scalability](#scalability) and [performance](#performance) problems are solved at tech companies. The [intelligence](#intelligence) section is for those who work with data and machine learning at big (data) and deep (learning) scale.

#### If your system goes down :construction:
> "Even if you lose all one day, you can build all over again if you retain your calm!" - Thuan Pham, CTO of Uber. So, keep calm and mind [availability](#availability) and [stability](#stability)!

#### If you are having a system design interview :ocean:
> Look at some [interview notes](#interview) and [real-world architectures with complete diagrams](#architecture) to get a comprehensive view before designing your system on a whiteboard. You can check some [talks](#talk) by engineers from tech giants to learn how they build, scale, and optimize their systems. There are some selected [books](#book) for you (most of them are free)! Good luck :four_leaf_clover:

#### If you are building your dream team :ferris_wheel:
> The goal of scaling a team is not to grow its size but to increase its output and value. You can find out how tech companies reach that goal in various aspects (hiring, management, organization, culture, and communication) in the [organization](#organization) section.

#### Community power :mountain_cableway::aerial_tramway::mountain_cableway:

> Contributions are greatly welcome! You may want to take a look at the [contribution guidelines](CONTRIBUTING.md).

> If you find this project helpful, please share it in your chat groups, [on Twitter](https://ctt.ec/V8B2p), or [on Weibo](http://t.cn/RnjFLCB) so more people can be helped! Power is gained by sharing knowledge, not hoarding it. Thank you! :hibiscus:

## Contents
- [Principle](#principle)
- [Scalability](#scalability)
- [Availability](#availability)
- [Stability](#stability)
- [Performance](#performance)
- [Intelligence](#intelligence)
- [Architecture](#architecture)
- [Interview](#interview)
- [Organization](#organization)
- [Talk](#talk)
- [Book](#book)

## Principle
* [Designs, Lessons and Advice from Building Large Distributed Systems - Jeff Dean, Google](https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf)
* [How To Design A Good API and Why it Matters - Joshua Bloch, CMU & Google](https://www.infoq.com/presentations/effective-api-design)
* [On Efficiency, Reliability, Scaling - James Hamilton, VP at AWS](http://mvdirona.com/jrh/work/)
* [Things to Keep in Mind When Building a Platform for the Enterprise - Heidi Williams, VP Platform at Box](https://blog.box.com/blog/4-things-to-keep-in-mind-when-building-a-platform-for-the-enterprise/)
* [Principles of Chaos Engineering](https://www.usenix.org/conference/srecon17americas/program/presentation/rosenthal)
* [Finding the Order in Chaos](https://www.usenix.org/conference/srecon16/program/presentation/lueder)
* [The Twelve-Factor App](https://12factor.net/)
* [Clean Architecture](https://8thlight.com/blog/uncle-bob/2012/08/13/the-clean-architecture.html)
* [High Cohesion and Low Coupling](http://www.math-cs.gordon.edu/courses/cs211/lectures-2009/Cohesion,Coupling,MVC.pdf)
* [CAP Theorem and Trade-offs](http://robertgreiner.com/2014/08/cap-theorem-revisited/)
* [CP Databases and AP Databases](https://blog.andyet.com/2014/10/01/right-database)
* [Stateless vs Stateful Scalability](http://ithare.com/scaling-stateful-objects/)	
* [Scale Up vs Scale Out](https://www.brianjgraf.com/2013/05/17/scalability-scale-up-scale-out-care/)
* [Scale Up vs Scale Out: Hidden Costs](https://blog.codinghorror.com/scaling-up-vs-scaling-out-hidden-costs/)
* [Best Practices for Scaling Out](https://blog.openshift.com/best-practices-for-horizontal-application-scaling/)
* [Best Practices for Continuous Delivery](https://techblog.rakuten.co.jp/2018/02/06/cd-the-best-practice/)
* [ACID and BASE](https://neo4j.com/blog/acid-vs-base-consistency-models-explained/)
* [Blocking/Non-Blocking and Sync/Async](https://blogs.msdn.microsoft.com/csliu/2009/08/27/io-concept-blockingnon-blocking-vs-syncasync/)
* [Performance and Scalability of Databases](https://use-the-index-luke.com/sql/testing-scalability)
* [Database Isolation Levels and Effects on Performance and Scalability](http://highscalability.com/blog/2011/2/10/database-isolation-levels-and-their-effects-on-performance-a.html)
* [The Probability of Data Loss in Large Clusters](https://martin.kleppmann.com/2017/01/26/data-loss-in-large-clusters.html)
* [SQL vs NoSQL](https://www.upwork.com/hiring/data/sql-vs-nosql-databases-whats-the-difference/)
* [SQL vs NoSQL - Lesson Learned at Salesforce](https://engineering.salesforce.com/sql-or-nosql-9eaf1d92545b)
* [NoSQL Databases: Survey and Decision Guidance](https://medium.baqend.com/nosql-databases-a-survey-and-decision-guidance-ea7823a822d)
* [How Sharding Works](https://medium.com/@jeeyoungk/how-sharding-works-b4dec46b3f6)
* [Consistent Hashing](http://www.tom-e-white.com/2007/11/consistent-hashing.html)
* [Consistent Hashing: Algorithmic Tradeoffs](https://medium.com/@dgryski/consistent-hashing-algorithmic-tradeoffs-ef6b8e2fcae8)
* [Don’t be tricked by the Hashing Trick](https://booking.ai/dont-be-tricked-by-the-hashing-trick-192a6aae3087)
* [Uniform Consistent Hashing at Netflix](https://medium.com/netflix-techblog/distributing-content-to-open-connect-3e3e391d4dc9)
* [Eventually Consistent - Werner Vogels, CTO at Amazon](https://www.allthingsdistributed.com/2008/12/eventually_consistent.html)
* [Cache is King](https://www.stevesouders.com/blog/2012/10/11/cache-is-king/)
* [Anti-Caching](http://the-paper-trail.org/blog/paper-notes-anti-caching/)
* [Understand Latency](http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it)
* [Latency Numbers Every Programmer Should Know](http://norvig.com/21-days.html#answers)
* [The Calculus of Service Availability](https://queue.acm.org/detail.cfm?id=3096459&__s=dnkxuaws9pogqdnxmx8i)
* [Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO](http://highscalability.com/blog/2014/5/12/4-architecture-issues-when-scaling-web-applications-bottlene.html)	
* [Common Bottlenecks](http://highscalability.com/blog/2012/5/16/big-list-of-20-common-bottlenecks.html)
* [Life Beyond Distributed Transactions](https://queue.acm.org/detail.cfm?id=3025012)
* [Relying on Software to Redirect Traffic Reliably at Various Layers](https://www.usenix.org/conference/srecon15/program/presentation/taveira)
* [Breaking Things on Purpose](https://www.usenix.org/conference/srecon17americas/program/presentation/andrus)
* [Avoid Over Engineering](https://medium.com/@rdsubhas/10-modern-software-engineering-mistakes-bc67fbef4fc8)
* [Scalability Worst Practices](https://www.infoq.com/articles/scalability-worst-practices)
* [Use Solid Technologies - Don’t Re-invent the Wheel - Keep It Simple!](https://medium.com/@DataStax/instagram-engineerings-3-rules-to-a-scalable-cloud-application-architecture-c44afed31406)
* [Simplicity by Distributing Complexity](https://jobs.zalando.com/tech/blog/simplicity-by-distributing-complexity/)
* [Why Over-Reusing is Bad](http://tech.transferwise.com/why-over-reusing-is-bad/)
* [Performance is a Feature](https://blog.codinghorror.com/performance-is-a-feature/)
* [Make Performance Part of Your Workflow](https://codeascraft.com/2014/12/11/make-performance-part-of-your-workflow/)
* [The Benefits of Server Side Rendering over Client Side Rendering](https://medium.com/walmartlabs/the-benefits-of-server-side-rendering-over-client-side-rendering-5d07ff2cefe8)
* [Writing Code that Scales](https://blog.rackspace.com/writing-code-that-scales)
* [Automate and Abstract: Lessons at Facebook](https://architecht.io/lessons-from-facebook-on-engineering-for-scale-f5716f0afc7a)
* [AWS Do's and Don'ts](https://8thlight.com/blog/sarah-sunday/2017/09/15/aws-dos-and-donts.html)
* [(UI) Design Doesn’t Scale - Stanley Wood, Design Director at Spotify](https://medium.com/@hellostanley/design-doesnt-scale-4d81e12cbc3e)
* [Linux Performance](http://www.brendangregg.com/linuxperf.html)
* [Building Fast and Resilient Web Applications - Ilya Grigorik](https://www.igvita.com/2016/05/20/building-fast-and-resilient-web-applications/)
* [Accept Partial Failures, Minimize Service Loss](https://www.usenix.org/conference/srecon17asia/program/presentation/wang_daxin)
* [RACI (Responsible, Accountable, Consulted, Informed) at Etsy](https://codeascraft.com/2018/01/04/selecting-a-cloud-provider/)
* [Design for Loose-coupling](http://bulgerpartners.com/how-loosely-coupled-architectures-are-helping-the-modernization-of-legacy-software/)
* [Design for Resiliency](http://highscalability.com/blog/2012/12/31/designing-for-resiliency-will-be-so-2013.html)
* [Design for Self-healing](https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/self-healing)
* [Design for Scaling Out](https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/scale-out)	
* [Design for Evolution](https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/design-for-evolution)	
* [Mistakes to Avoid while Creating an Internal Product at Skyscanner](https://medium.com/@SkyscannerEng/9-mistakes-to-avoid-while-creating-an-internal-product-63d579b00b1a)
* [Learn from Mistakes at Reddit](http://highscalability.com/blog/2013/8/26/reddit-lessons-learned-from-mistakes-made-scaling-to-1-billi.html)
* [Code Review Best Practices at Palantir](https://medium.com/@palantir/code-review-best-practices-19e02780015f)

## Scalability
* [Microservices and Orchestration](https://hackernoon.com/microservices-are-hard-an-invaluable-guide-to-microservices-2d06bd7bcf5d)
	* [Microservices Resource Guide - Martin Fowler, Chief Scientist at ThoughtWorks](https://martinfowler.com/microservices/)
	* [Microservices Patterns](http://microservices.io/patterns/)
	* [Advantages and Drawbacks of Microservices](https://cloudacademy.com/blog/microservices-architecture-challenge-advantage-drawback/)
	* [Microservices Scale Cube](http://microservices.io/articles/scalecube.html)
	* [Thinking Inside the Container (8 parts) at Riot Games](https://engineering.riotgames.com/news/thinking-inside-container)
	* [Containerization at Pinterest](https://medium.com/@Pinterest_Engineering/containerization-at-pinterest-92295347f2f3)
	* [Techniques for Splitting Up a Codebase into Microservices and Artifacts at LinkedIn](https://engineering.linkedin.com/blog/2016/02/q-a-with-jim-brikman--splitting-up-a-codebase-into-microservices)
	* [The Evolution of Container Usage at Netflix](https://medium.com/netflix-techblog/the-evolution-of-container-usage-at-netflix-3abfc096781b)
	* [Dockerizing MySQL at Uber](https://eng.uber.com/dockerizing-mysql/)
	* [Testing of Microservices at Spotify](https://labs.spotify.com/2018/01/11/testing-of-microservices/)
	* [Organize Monolith Before Breaking it into Services at Weebly](https://medium.com/weebly-engineering/how-to-organize-your-monolith-before-breaking-it-into-services-69cbdb9248b0)
	* [Lessons learned running Docker in production at Treehouse](https://medium.com/treehouse-engineering/lessons-learned-running-docker-in-production-5dce99ece770)
	* [Inside a SoundCloud Microservice](https://developers.soundcloud.com/blog/inside-a-soundcloud-microservice)
	* [Operate Kubernetes Reliably at Stripe](https://stripe.com/blog/operating-kubernetes)
	* [Kubernetes Traffic Routing (2 parts) at Rakuten](https://techblog.rakuten.co.jp/2017/09/28/k8s-routing2/)
	* [Agrarian-Scale Kubernetes (3 parts) at New York Times](https://open.nytimes.com/agrarian-scale-kubernetes-part-3-ee459887ed7e)
	* [Nanoservices at BBC Online](https://medium.com/bbc-design-engineering/powering-bbc-online-with-nanoservices-727840ba015b)
	* [PowerfulSeal: Testing Tool for Kubernetes Clusters at Bloomberg](https://www.techatbloomberg.com/blog/powerfulseal-testing-tool-kubernetes-clusters/)
	* [Conductor: Microservices Orchestrator at Netflix](https://medium.com/netflix-techblog/netflix-conductor-a-microservices-orchestrator-2e8d4771bf40)
	* [Making 10x Improvement in Release Times with Docker and Amazon ECS at Nextdoor](https://engblog.nextdoor.com/how-nextdoor-made-a-10x-improvement-in-release-times-with-docker-and-amazon-ecs-35aab52b726f)
	* [K8Guard: Auditing System for Kubernetes Clusters at Target.com](http://target.github.io/infrastructure/k8guard-the-guardian-angel-for-kuberentes)
	* [Deconstructing Monolithic Applications into (Kafka-driven) Services at Heroku](https://blog.heroku.com/monolithic-applications-into-services)
	* [Docker Containers that Power Over 100,000 Online Shops at Shopify](https://shopifyengineering.myshopify.com/blogs/engineering/docker-at-shopify-how-we-built-containers-that-power-over-100-000-online-shops)
* [Distributed Caching](https://www.wix.engineering/single-post/scaling-to-100m-to-cache-or-not-to-cache)
	* [Read-Through, Write-Through, Write-Behind, and Refresh-Ahead Caching](https://docs.oracle.com/cd/E15357_01/coh.360/e15723/cache_rtwtwbra.htm#COHDG5177)
	* [Eviction Policy and Expiration Policy](http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html)
	* [EVCache: Caching for a Global Netflix](https://medium.com/netflix-techblog/caching-for-a-global-netflix-7bcc457012f1)
	* [Memsniff: Robust Memcache Traffic Analyzer at Box](https://blog.box.com/blog/introducing-memsniff-robust-memcache-traffic-analyzer/)
	* [Caching with Consistent Hashing and Cache Smearing at Etsy](https://codeascraft.com/2017/11/30/how-etsy-caches/)
	* [Analysis of Photo Caching at Facebook](https://code.facebook.com/posts/220956754772273/an-analysis-of-facebook-photo-caching/)
	* [Cache Efficiency Exercise at Facebook](https://code.facebook.com/posts/964122680272229/web-performance-cache-efficiency-exercise/)
	* [tCache: Scalable Data-aware Java Caching at Trivago](http://tech.trivago.com/2015/10/15/tcache/)
	* [Reduce Memcached Memory Usage by 50% at Trivago](http://tech.trivago.com/2017/12/19/how-trivago-reduced-memcached-memory-usage-by-50/)
	* [Caching Internal Service Calls at Yelp](https://engineeringblog.yelp.com/2018/03/caching-internal-service-calls-at-yelp.html)
	* [Scaling Live Streaming for Large Events (with Distributed Cache) at Hulu](https://medium.com/hulu-tech-blog/scaling-hulu-live-streaming-for-large-events-march-madness-and-beyond-bedd73874f2)
	* [Estimating the Cache Efficiency using Big Data at Allegro](https://allegro.tech/2017/01/estimating-the-cache-efficiency-using-big-data.html)
	* [Caching (with Hashing) at Zenefits](https://engineering.zenefits.com/2016/02/basic-infrastructure-patterns/)
	* [Distributed Cache (Akka, Kubernetes) at Zalando](https://jobs.zalando.com/tech/blog/distributed-cache-akka-kubernetes/)
	* [Application Data Caching from RAM to SSD at Netflix](https://medium.com/netflix-techblog/evolution-of-application-data-caching-from-ram-to-ssd-a33d6fa7a690)
* [Distributed Tracking and Tracing](https://www.oreilly.com/ideas/understanding-the-value-of-distributed-tracing)
	* [Tracking Service Infrastructure at Scale at Shopify](https://www.usenix.org/conference/srecon17americas/program/presentation/arthorne)
	* [Distributed Tracing with Pintrace at Pinterest](https://medium.com/@Pinterest_Engineering/distributed-tracing-at-pinterest-with-new-open-source-tools-a4f8a5562f6b)
	* [Distributed Tracing at HelloFresh](https://engineering.hellofresh.com/scaling-hellofresh-distributed-tracing-7b182928247d)
	* [Analyzing Distributed Trace Data at Pinterest](https://medium.com/@Pinterest_Engineering/analyzing-distributed-trace-data-6aae58919949)
	* [Distributed Tracing at Uber](https://eng.uber.com/distributed-tracing/)
	* [JVM Profiler: Tracing Distributed JVM Applications at Uber](https://eng.uber.com/jvm-profiler/)
	* [Data Checking at Dropbox](https://www.usenix.org/conference/srecon17asia/program/presentation/mah)
	* [Tracing Distributed Systems at Showmax](https://tech.showmax.com/2016/10/tracing-distributed-systems-at-showmax/)
	* [Real-time Distributed Tracing at LinkedIn](https://engineering.linkedin.com/distributed-service-call-graph/real-time-distributed-tracing-website-performance-and-efficiency)
	* [Zipkin: Distributed Systems Tracing at Twitter](https://blog.twitter.com/engineering/en_us/a/2012/distributed-systems-tracing-with-zipkin.html)
	* [osquery Across the Enterprise at Palantir](https://medium.com/@palantir/osquery-across-the-enterprise-3c3c9d13ec55)
* [Distributed Logging](https://blog.treasuredata.com/blog/2016/08/03/distributed-logging-architecture-in-the-container-era/)
	* [The Problem with Logging - Jeff Atwood](https://blog.codinghorror.com/the-problem-with-logging/)
	* [The Log: What Every Software Engineer Should Know](https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying)
	* [Using Logs to Build a Solid Data Infrastructure - Martin Kleppmann](https://www.confluent.io/blog/using-logs-to-build-a-solid-data-infrastructure-or-why-dual-writes-are-a-bad-idea/)
	* [Scalable and Reliable Log Ingestion at Pinterest](https://medium.com/@Pinterest_Engineering/scalable-and-reliable-data-ingestion-at-pinterest-b921c2ee8754)
	* [Building DistributedLog at Twitter: High-performance replicated log service](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2015/building-distributedlog-twitter-s-high-performance-replicated-log-servic.html)
	* [Logging Service with Spark at CERN Accelerator](https://databricks.com/blog/2017/12/14/the-architecture-of-the-next-cern-accelerator-logging-service.html)
	* [Logging and Aggregation at Quora](https://engineering.quora.com/Logging-and-Aggregation-at-Quora)
	* [BookKeeper: Distributed Log Storage at Yahoo](https://yahooeng.tumblr.com/post/109908973316/bookkeeper-yahoos-distributed-log-storage-is)
	* [LogDevice: Distributed Data Store for Logs at Facebook](https://code.facebook.com/posts/357056558062811/logdevice-a-distributed-data-store-for-logs/)
	* [LogFeeder: Log Collection System at Yelp](https://engineeringblog.yelp.com/2018/03/introducing-logfeeder.html)
	* [Collection and Analysis of Daemon Logs at Badoo](https://badoo.com/techblog/blog/2016/06/06/collection-and-analysis-of-daemon-logs-at-badoo/)
* [Distributed Security (Monitoring, Authentication, etc)](https://msdn.microsoft.com/en-us/library/cc767123.aspx)
	* [Approach to Security at Scale at Dropbox](https://blogs.dropbox.com/tech/2018/02/security-at-scale-the-dropbox-approach/)
	* [Aardvark and Repokid: AWS Least Privilege for Distributed, High-Velocity Development at Netflix](https://medium.com/netflix-techblog/introducing-aardvark-and-repokid-53b081bf3a7e)	
	* [LISA: Distributed Firewall at LinkedIn](https://www.slideshare.net/MikeSvoboda/2017-lisa-linkedins-distributed-firewall-dfw)
	* [Distributed Security Alerting at Slack](https://slack.engineering/distributed-security-alerting-c89414c992d6)
	* [Secure Infrastructure to Store Bitcoin in the Cloud at Coinbase](https://engineering.coinbase.com/how-coinbase-builds-secure-infrastructure-to-store-bitcoin-in-the-cloud-30a6504e40ba)
	* [BinaryAlert: Real-time Serverless Malware Detection at Airbnb](https://medium.com/airbnb-engineering/binaryalert-real-time-serverless-malware-detection-ca44370c1b90)
	* [Scalable IAM Architecture to Secure Access to 100 AWS Accounts at Segment](https://segment.com/blog/secure-access-to-100-aws-accounts/)
	* [OAuth Audit Toolbox at Indeed](http://engineering.indeedblog.com/blog/2018/04/oaudit-toolbox/)
	* [Active Directory Password Blacklisting at Yelp](https://engineeringblog.yelp.com/2018/04/ad-password-blacklisting.html)
	* [Syscall Auditing at Scale at Slack](https://slack.engineering/syscall-auditing-at-scale-e6a3ca8ac1b8)
	* [Athenz: Fine-Grained, Role-Based Access Control at Yahoo](https://yahooeng.tumblr.com/post/160481899076/open-sourcing-athenz-fine-grained-role-based)
	* [WebAuthn Support for Secure Sign In at Dropbox](https://blogs.dropbox.com/tech/2018/05/introducing-webauthn-support-for-secure-dropbox-sign-in/)
	* [Job-based Forecasting Workflow for Observability Anomaly Detection at Uber](https://eng.uber.com/observability-anomaly-detection/)
	* [Alibaba Monitoring System](https://www.usenix.org/conference/srecon18asia/presentation/xinchi)
	* [Smart Monitoring System for Anomaly Detection on Business Trends at Alibaba](https://www.usenix.org/conference/srecon17asia/program/presentation/wang)
	* [Security Development Lifecycle (SDL) at Slack](https://slack.engineering/moving-fast-and-securing-things-540e6c5ae58a)
	* [Unprivileged Container Builds at Kinvolk](https://kinvolk.io/blog/2018/04/towards-unprivileged-container-builds/)
	* [Diffy: Differencing Engine for Digital Forensics in the Cloud at Netflix](https://medium.com/netflix-techblog/netflix-sirt-releases-diffy-a-differencing-engine-for-digital-forensics-in-the-cloud-37b71abd2698)
* [Distributed Messaging, Queuing, and Event Streaming](https://arxiv.org/pdf/1704.00411.pdf)
	* [Samza: Stream Processing System for Latency Insights at LinkedIn](https://engineering.linkedin.com/blog/2018/04/samza-aeon--latency-insights-for-asynchronous-one-way-flows)
	* [Delaying Asynchronous Message Processing with RabbitMQ at Indeed](http://engineering.indeedblog.com/blog/2017/06/delaying-messages/)	
	* [Bullet: Forward-Looking Query Engine for Streaming Data at Yahoo](https://yahooeng.tumblr.com/post/161855616651/open-sourcing-bullet-yahoos-forward-looking)
	* [EventHorizon: Tool for Watching Events Streaming at Etsy](https://codeascraft.com/2018/05/29/the-eventhorizon-saga/)
	* [Benchmarking Streaming Computation Engines at Yahoo](https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at)
	* [Cherami: Message Queue System for Transporting Async Tasks at Uber](https://eng.uber.com/cherami/)
	* [Qyu: Distributed Task Execution System for Complex Workflows at FindHotel](http://blog.findhotel.net/2018/03/qyu-a-distributed-task-execution-system-for-complex-workflows/)
	* [Messaging Service at Riot Games](https://engineering.riotgames.com/news/riot-messaging-service)
	* [Event Stream Analytics with Druid (Search Engine meets Column DB) at Walmart](https://medium.com/walmartlabs/event-stream-analytics-at-walmart-with-druid-dcf1a37ceda7)
	* [Debugging Production with Event Logging at Zillow](https://www.zillow.com/engineering/debugging-production-event-logging/)
	* [Kafka the Message Broker](https://martin.kleppmann.com/papers/kafka-debull15.pdf)
		* [When to use RabbitMQ or Kafka](https://content.pivotal.io/blog/understanding-when-to-use-rabbitmq-or-apache-kafka)		
		* [Kafka at Scale at LinkedIn](https://engineering.linkedin.com/kafka/running-kafka-scale)
		* [Real-time Data Pipeline with Kafka at Yelp](https://engineeringblog.yelp.com/2016/07/billions-of-messages-a-day-yelps-real-time-data-pipeline.html)
		* [Building Reliable Reprocessing and Dead Letter Queues with Kafka at Uber](https://eng.uber.com/reliable-reprocessing/)
		* [Audit Kafka End-to-End at Uber](https://eng.uber.com/chaperone/)
		* [Kafka for PaaS at Rakuten](https://techblog.rakuten.co.jp/2016/01/28/rakuten-paas-kafka/)
		* [Publishing with Kafka at The New York Times](https://open.nytimes.com/publishing-with-apache-kafka-at-the-new-york-times-7f0e3b7d2077)
		* [Kafka Streams on Heroku](https://blog.heroku.com/kafka-streams-on-heroku)
		* [Kafka in Platform Events Architecture at Salesforce](https://engineering.salesforce.com/how-apache-kafka-inspired-our-platform-events-architecture-2f351fe4cf63)	
		* [Kafka in Socket Architecture (with a Comprehensive Comparison Table) at Trello](https://tech.trello.com/why-we-chose-kafka/)	
		* [Analytics Pipeline (Kafka, Dataflow, BigQuery) at Teads.tv](http://highscalability.com/blog/2018/4/9/give-meaning-to-100-billion-events-a-day-the-analytics-pipel.html)
	* [Data Deduplication Techniques](https://en.wikipedia.org/wiki/Data_deduplication)
		* [Exactly-once Semantics are Possible: How Kafka Does it](https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/)
		* [Real-time Deduping at Scale with Kafka-based Pipeline at Tapjoy](http://eng.tapjoy.com/blog-list/real-time-deduping-at-scale)
		* [Delivering Billions of Messages Exactly Once: Deduping at Segment](https://segment.com/blog/exactly-once-delivery/)
		* [Deduplication for Efficient Storage (from 50 PB to 32 PB) at Mail.Ru](https://medium.com/@andrewsumin/efficient-storage-how-we-went-down-from-50-pb-to-32-pb-99f9c61bf6b4)
* [Distributed Searching](http://nwds.cs.washington.edu/files/nwds/pdf/Distributed-WR.pdf)
	* [Search Architecture of Instagram](https://engineering.instagram.com/search-architecture-eeb34a936d3a)
	* [Search Architecture of eBay](http://www.cs.otago.ac.nz/homepages/andrew/papers/2017-8.pdf)
	* [Improving Search Engine Efficiency by over 25% at eBay](https://www.ebayinc.com/stories/blogs/tech/making-e-commerce-search-faster/)	
	* [Search Federation Architecture at LinkedIn (2018)](https://engineering.linkedin.com/blog/2018/03/search-federation-architecture-at-linkedin)
	* [Search at Slack](https://slack.engineering/search-at-slack-431f8c80619e)
	* [Search and Recommendations at DoorDash](https://blog.doordash.com/powering-search-recommendations-at-doordash-8310c5cfd88c)
	* [Search Service at Twitter (2014)](https://blog.twitter.com/engineering/en_us/a/2014/building-a-complete-tweet-index.html)
	* [Nautilus: Travel Search Engine of Expedia](http://blog.expedia.com/expedias-nautilus-travel-search-engine-overview-and-applications/)
	* [Galene: Search Architecture of LinkedIn](https://engineering.linkedin.com/search/did-you-mean-galene)
	* [Manas: High Performing Customized Search System at Pinterest](https://medium.com/@Pinterest_Engineering/manas-a-high-performing-customized-search-system-cf189f6ca40f)
	* [Sherlock: Near Real Time Search Indexing at Flipkart](https://tech.flipkart.com/sherlock-near-real-time-search-indexing-95519783859d)
	* [Nebula: Storage Platform to Build Search Backends at Airbnb](https://medium.com/airbnb-engineering/nebula-as-a-storage-platform-to-build-airbnbs-search-backends-ecc577b05f06)
	* [ELK (Elasticsearch, Logstash, Kibana) Stack](https://logz.io/blog/15-tech-companies-chose-elk-stack/)
		* [Predictions in Real Time with ELK at Uber](https://eng.uber.com/elk/)
		* [Scaling Elasticsearch Clusters at Uber](https://www.infoq.com/presentations/uber-elasticsearch-clusters?utm_source=presentations_about_Case_Study&utm_medium=link&utm_campaign=Case_Study)
		* [Elasticsearch Performance Tuning Practice at eBay](https://www.ebayinc.com/stories/blogs/tech/elasticsearch-performance-tuning-practice-at-ebay/)
		* [Elasticsearch at Kickstarter](https://kickstarter.engineering/elasticsearch-at-kickstarter-db3c487887fc)
		* [Distributed Troubleshooting Platform with ELK Stack at Target.com](http://target.github.io/infrastructure/distributed-troubleshooting)
		* [ELK at Robinhood](https://robinhood.engineering/taming-elk-4e1349f077c3)
		* [Log Parsing with Logstash and Google Protocol Buffers at Trivago](https://tech.trivago.com/2016/01/19/logstash_protobuf_codec/)
		* [Fast Order Search using Data Pipeline and Elasticsearch at Yelp](https://engineeringblog.yelp.com/2018/06/fast-order-search.html)
		* [Sharding out Elasticsearch at Vinted](http://engineering.vinted.com/2017/06/05/sharding-out-elasticsearch/)
* [Distributed Storage](http://highscalability.com/blog/2011/11/1/finding-the-right-data-solution-for-your-application-in-the.html)
	* [In-memory Storage](https://medium.com/@denisanikin/what-an-in-memory-database-is-and-how-it-persists-data-efficiently-f43868cff4c1)
		* [Introduction to In-memory Data - Viktor Gamov, Solutions Architect at Hazelcast](https://www.infoq.com/presentations/in-memory-data)
		* [MemSQL Architecture - The Fast (MVCC, InMem, LockFree, CodeGen) And Familiar (SQL)](http://highscalability.com/blog/2012/8/14/memsql-architecture-the-fast-mvcc-inmem-lockfree-codegen-and.html)
		* [Optimizing Memcached Efficiency at Quora](https://engineering.quora.com/Optimizing-Memcached-Efficiency)
		* [Real-Time Data Warehouse with MemSQL on Cisco UCS](https://blogs.cisco.com/datacenter/memsql)
		* [Moving to MemSQL (with Horizontally Scalable, ACID Compliant, MySQL Compatibility) at Tapjoy](http://eng.tapjoy.com/blog-list/moving-to-memsql)
		* [MemSQL and Kinesis for Real-time Insights at Disney-ABC TV](https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/68131)
	* [Durable Storage (S3, HDFS)](http://www.datacenterknowledge.com/archives/2013/10/04/object-storage-the-future-of-scale-out)
		* [Scaling HDFS at Uber](https://eng.uber.com/scaling-hdfs/)
		* [Reasons for Choosing S3 over HDFS at Databricks](https://databricks.com/blog/2017/05/31/top-5-reasons-for-choosing-s3-over-hdfs.html)
		* [Quantcast File System on Amazon S3](https://www.quantcast.com/blog/quantcast-file-system-on-amazon-s3/)
		* [Data Sink with S3 at Deliveroo](https://deliveroo.engineering/2017/06/15/data-sink.html)
		* [Using S3 in Netflix Chukwa](https://medium.com/netflix-techblog/evolution-of-the-netflix-data-pipeline-da246ca36905)	
		* [Yahoo Cloud Object Store - Object Storage at Exabyte Scale](https://yahooeng.tumblr.com/post/116391291701/yahoo-cloud-object-store-object-storage-at)
		* [Ambry: Distributed Immutable Object Store at LinkedIn](https://www.usenix.org/conference/srecon17americas/program/presentation/shenoy)
		* [Hammerspace: Persistent, Concurrent, Off-heap Storage at Airbnb](https://medium.com/airbnb-engineering/hammerspace-persistent-concurrent-off-heap-storage-3db39bb04472)	
* [Relational Databases (MySQL, MSSQL, PostgreSQL)](https://www.mysql.com/products/cluster/scalability.html)
	* [Stop Using Shiny New Things and Love MySQL - Lesson at Pinterest](https://medium.com/@Pinterest_Engineering/learn-to-stop-using-shiny-new-things-and-love-mysql-3e1613c2ce14)
	* [Microsoft SQL versus MySQL](https://www.upwork.com/hiring/data/sql-vs-mysql-which-relational-database-is-right-for-you/)
	* [SQL Database Performance Tuning](https://www.toptal.com/sql-server/sql-database-tuning-for-developers)
	* [Scaling PostgreSQL Using CUDA](http://highscalability.com/blog/2009/5/28/scaling-postgresql-using-cuda.html)
	* [Scaling Distributed Joins](http://blog.memsql.com/scaling-distributed-joins/)
	* [MySQL System Design at Booking.com](https://www.percona.com/live/mysql-conference-2015/sessions/bookingcom-evolution-mysql-system-design)
	* [PostgreSQL at Twitch](https://blog.twitch.tv/how-twitch-uses-postgresql-c34aa9e56f58)
	* [Scaling MySQL-based Financial Reporting System at Airbnb](https://medium.com/airbnb-engineering/tracking-the-money-scaling-financial-reporting-at-airbnb-6d742b80f040)
	* [Scaling MySQL at Wix](https://www.wix.engineering/single-post/scaling-to-100m-mysql-is-a-better-nosql)
	* [MaxScale (MySQL) Database Proxy at Airbnb](https://medium.com/airbnb-engineering/unlocking-horizontal-scalability-in-our-web-serving-tier-d907449cdbcf)
	* [Switching from Postgres to MySQL at Uber](https://eng.uber.com/mysql-migration/)
	* [Handling Growth with Postgres at Instagram](https://engineering.instagram.com/handling-growth-with-postgres-5-tips-from-instagram-d5d7e7ffdfcb)
	* [Scaling the Analytics Database (Postgres) at TransferWise](http://tech.transferwise.com/scaling-our-analytics-database/)
	* [Updating a 50 Terabyte PostgreSQL Database at Adyen](https://medium.com/adyen/updating-a-50-terabyte-postgresql-database-f64384b799e7)
	* [Replication](https://m.alphasights.com/a-primer-on-database-replication-381b319cd032)		
		* [MySQL Parallel Replication (4 parts) at Booking.com](https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-4-annex-under-the-hood-eb456cf8b2fb)
		* [Mitigating MySQL Replication Lag and Reducing Read Load at GitHub](https://githubengineering.com/mitigating-replication-lag-and-reducing-read-load-with-freno/)
		* [Black-Box Auditing: Verifying End-to-End Replication Integrity between MySQL and Redshift at Yelp](https://engineeringblog.yelp.com/2018/04/black-box-auditing.html)
		* [Monitoring MySQL Delayed Replication at IMVU](https://engineering.imvu.com/2013/01/09/monitoring-delayed-replication-with-a-focus-on-mysql/)
		* [Partitioning Main MySQL Database at Airbnb](https://medium.com/airbnb-engineering/how-we-partitioned-airbnb-s-main-database-in-two-weeks-55f7e006ff21)
		* [Herb: Multi-DC Replication Engine for Schemaless Datastore at Uber](https://eng.uber.com/herb-datacenter-replication/)
	* [Sharding (Horizontal Partitioning)](https://www.educative.io/collection/page/5668639101419520/5649050225344512/5146118144917504)
		* [Sharding MySQL at Pinterest](https://medium.com/@Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f)
		* [Sharding MySQL at MailChimp](https://devs.mailchimp.com/blog/using-shards-to-accommodate-millions-of-users/)
		* [Sharding MySQL at Twilio](https://www.twilio.com/engineering/2014/06/26/how-we-replaced-our-data-pipeline-with-zero-downtime)
		* [Sharding MySQL (3 parts) at Evernote](https://blog.evernote.com/tech/2015/10/08/the-great-shard-migration-part-ii/)
		* [Sharding Layer of Schemaless Datastore at Uber](https://eng.uber.com/schemaless-rewrite/)
		* [Sharding & IDs at Instagram](https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c)
		* [Solr: Improving Performance for Batch Indexing at Box](https://blog.box.com/blog/solr-improving-performance-batch-indexing/)					
* [NoSQL Databases](https://www.thoughtworks.com/insights/blog/nosql-databases-overview)
	* [Key-Value Databases (DynamoDB, Voldemort, Manhattan)](http://highscalability.com/anti-rdbms-list-distributed-key-value-stores)
		* [Scaling Mapbox infrastructure with DynamoDB Streams](https://blog.mapbox.com/scaling-mapbox-infrastructure-with-dynamodb-streams-d53eabc5e972)
		* [Manhattan: Twitter’s distributed key-value database](https://blog.twitter.com/engineering/en_us/a/2014/manhattan-our-real-time-multi-tenant-distributed-database-for-twitter-scale.html)
		* [Sherpa: Yahoo’s distributed NoSQL key-value store](https://yahooeng.tumblr.com/post/120730204806/sherpa-scales-new-heights)
		* [Riak inside Chat Service Architecture at Riot Games](https://engineering.riotgames.com/news/chat-service-architecture-persistence)
		* [MPH: Fast and Compact Immutable Key-Value Stores at Indeed](http://engineering.indeedblog.com/blog/2018/02/indeed-mph/)
		* [zBase: High Performance, Elastic, Distributed Key-Value Store at Zynga](https://www.zynga.com/blogs/engineering/zbase-high-performance-elastic-distributed-key-value-store-2)
		* [Venice: Distributed Key-Value Database at LinkedIn](https://engineering.linkedin.com/blog/2017/02/building-venice-with-apache-helix)
		* [DynamoDB Hot Shards at Segment](https://segment.com/blog/the-million-dollar-eng-problem/)
	* [Columnar Databases (Cassandra, HBase, Redshift)](https://aws.amazon.com/nosql/columnar/)
		* [Consistent Hashing in Cassandra](https://blog.imaginea.com/consistent-hashing-in-cassandra/)
		* [Understanding Gossip (Cassandra Internals)](https://www.youtube.com/watch?v=FuP1Fvrv6ZQ)
		* [When NOT to use Cassandra?](https://stackoverflow.com/questions/2634955/when-not-to-use-cassandra)
		* [Avoid Pitfalls in Scaling Cassandra Cluster at Walmart](https://medium.com/walmartlabs/avoid-pitfalls-in-scaling-your-cassandra-cluster-lessons-and-remedies-a71ca01f8c04)
		* [Storing Images in Cassandra at Walmart](https://medium.com/walmartlabs/building-object-store-storing-images-in-cassandra-walmart-scale-a6b9c02af593)
		* [Cassandra at Instagram](https://www.slideshare.net/DataStax/cassandra-at-instagram-2016)
		* [Scale Ad Analytics with Cassandra at Yelp](https://engineeringblog.yelp.com/2016/08/how-we-scaled-our-ad-analytics-with-cassandra.html)
		* [Store Billions of Messages with Cassandra at Discord](https://blog.discordapp.com/how-discord-stores-billions-of-messages-7fa6ec7ee4c7)
		* [Scale to 100+ Million Reads/Writes using Spark and Cassandra at Dream11](https://medium.com/dream11-tech-blog/leaderboard-dream11-4efc6f93c23e)		
		* [Moving Food Feed from Redis to Cassandra at Zomato](https://www.zomato.com/blog/how-we-moved-our-food-feed-from-redis-to-cassandra)
		* [Benchmarking Cassandra Scalability on AWS at Netflix](https://medium.com/netflix-techblog/benchmarking-cassandra-scalability-on-aws-over-a-million-writes-per-second-39f45f066c9e)
		* [Imgur Notification: From MySQL to HBase at Imgur](https://blog.imgur.com/2015/09/15/tech-tuesday-imgur-notifications-from-mysql-to-hbase/)
		* [Improving HBase Backup Efficiency at Pinterest](https://medium.com/@Pinterest_Engineering/improving-hbase-backup-efficiency-at-pinterest-86159da4b954)
		* [HBase Practice at Xiaomi](https://www.slideshare.net/HBaseCon/hbase-practice-at-xiaomi)
		* [ClickHouse - Open Source Distributed Column Database at Yandex](https://clickhouse.yandex/)
		* [Scaling Redshift without Scaling Costs at GIPHY](https://engineering.giphy.com/scaling-redshift-without-scaling-costs/)
		* [Service Decomposition at Scale (with Cassandra) at Intuit QuickBooks](https://quickbooks-engineering.intuit.com/service-decomposition-at-scale-70405ac2f637)
		* [Cassandra for Keeping Counts In Sync at SoundCloud](https://developers.soundcloud.com/blog/keeping-counts-in-sync)
	* [Document Databases (MongoDB, SimpleDB, CouchDB)](https://msdn.microsoft.com/en-us/magazine/hh547103.aspx)
		* [eBay: Building Mission-Critical Multi-Data Center Applications with MongoDB](https://www.mongodb.com/blog/post/ebay-building-mission-critical-multi-data-center-applications-with-mongodb)
		* [MongoDB at Baidu: Multi-Tenant Cluster Storing 200+ Billion Documents across 160 Shards](https://www.mongodb.com/blog/post/mongodb-at-baidu-powering-100-apps-across-600-nodes-at-pb-scale)
		* [The AWS and MongoDB Infrastructure of Parse (acquired by Facebook)](https://medium.baqend.com/parse-is-gone-a-few-secrets-about-their-infrastructure-91b3ab2fcf71)
		* [Migrating Mountains of Mongo Data at Addepar](https://medium.com/build-addepar/migrating-mountains-of-mongo-data-63e530539952)
		* [Couchbase Ecosystem at LinkedIn](https://engineering.linkedin.com/blog/2017/12/couchbase-ecosystem-at-linkedin)
		* [SimpleDB at Zendesk](https://medium.com/zendesk-engineering/resurrecting-amazon-simpledb-9404034ec506)
	* [Graph Databases](https://www.ibm.com/developerworks/library/cl-graph-database-1/index.html)
		* [Handling Billions of Edges in a Graph Database](https://www.infoq.com/presentations/graph-database-scalability)		
		* [Neo4j case studies with Walmart, eBay, Airbnb, NASA, etc.](https://neo4j.com/customers/)
		* [FlockDB: Distributed Graph Database for Storing Adjacency Lists at Twitter](https://blog.twitter.com/engineering/en_us/a/2010/introducing-flockdb.html)
		* [JanusGraph: Scalable Graph Database backed by Google, IBM and Hortonworks](https://architecht.io/google-ibm-back-new-open-source-graph-database-project-janusgraph-1d74fb78db6b)
		* [Amazon Neptune](https://aws.amazon.com/neptune/)
	* [Data Structure Databases (Redis, Hazelcast)](https://db-engines.com/en/system/Hazelcast%3BMemcached%3BRedis)
		* [Using Redis To Scale at Twitter](http://highscalability.com/blog/2014/9/8/how-twitter-uses-redis-to-scale-105tb-ram-39mm-qps-10000-ins.html)
		* [Scaling Job Queue with Redis at Slack](https://slack.engineering/scaling-slacks-job-queue-687222e9d100)
		* [Moving persistent data out of Redis at GitHub](https://githubengineering.com/moving-persistent-data-out-of-redis/)
		* [Storing Hundreds of Millions of Simple Key-Value Pairs in Redis at Instagram](https://engineering.instagram.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c)
		* [Redis in Chat Architecture of Twitch (from 27:22)](https://www.infoq.com/presentations/twitch-pokemon)
		* [Learn Redis the hard way (in production) at Trivago](http://tech.trivago.com/2017/01/25/learn-redis-the-hard-way-in-production/)
		* [Optimizing Session Key Storage in Redis at Deliveroo](https://deliveroo.engineering/2016/10/07/optimising-session-key-storage.html)
		* [Optimizing Redis Storage at Deliveroo](https://deliveroo.engineering/2017/01/19/optimising-membership-queries.html)		
		* [Memory Optimization in Redis at Wattpad](http://engineering.wattpad.com/post/23244724794/store-more-stuff-memory-optimization-in-redis)
		* [Sending an e-mail to millions of users (with Redis) at Drivy](https://drivy.engineering/sending-mass-emails/)
		* [Redis Fleet at Heroku](https://blog.heroku.com/rolling-redis-fleet)
* [Time Series Databases (TSDB)](https://www.influxdata.com/time-series-database/)
	* [What is Time-Series Data & Why We Need a Time-Series Database](https://blog.timescale.com/what-the-heck-is-time-series-data-and-why-do-i-need-a-time-series-database-dcf3b1b18563)
	* [Time Series Data: Why and How to Use a Relational Database instead of NoSQL](https://blog.timescale.com/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c)
	* [Practical Guide to Monitoring and Alerting with Time Series at Scale](https://www.usenix.org/conference/srecon17americas/program/presentation/wilkinson)
	* [Beringei: High-performance Time Series Storage Engine at Facebook](https://code.facebook.com/posts/952820474848503/beringei-a-high-performance-time-series-storage-engine/)	
	* [Atlas: In-memory Dimensional Time Series Database at Netflix](https://medium.com/netflix-techblog/introducing-atlas-netflixs-primary-telemetry-platform-bd31f4d8ed9a)
	* [Heroic: Time Series Database at Spotify](https://labs.spotify.com/2015/11/17/monitoring-at-spotify-introducing-heroic/)
	* [Roshi: Distributed Storage System for Time-Series Event at SoundCloud](https://developers.soundcloud.com/blog/roshi-a-crdt-system-for-timestamped-events)
	* [Building a Scalable Time Series Database on PostgreSQL](https://blog.timescale.com/when-boring-is-awesome-building-a-scalable-time-series-database-on-postgresql-2900ea453ee2)
	* [Scaling Time Series Data Storage at Netflix](https://medium.com/netflix-techblog/scaling-time-series-data-storage-part-i-ec2b6d44ba39)
* [HTTP Caching (Reverse Proxy, CDN)](https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching)
	* [Reverse Proxy (Nginx, Varnish, Squid, rack-cache)](https://www.mertech.com/overview-reverse-proxying/)
	* [Stop Worrying and Love the Proxy](https://blog.turbinelabs.io/how-we-learned-to-stop-worrying-and-love-the-proxy-89af98fabaf8)
	* [Playing HTTP Tricks with Nginx](https://www.elastic.co/blog/playing-http-tricks-nginx)
	* [Using CDN to Improve Site Performance at Coursera](https://building.coursera.org/blog/2015/07/09/improving-coursera-global-site-performance-a-head-to-head-cdn-battle-with-production-traffic/)
	* [Strategy: Caching 404s Saved 66% On Server Time at The Onion](http://highscalability.com/blog/2010/3/26/strategy-caching-404s-saved-the-onion-66-on-server-time.html)
	* [Increasing Application Performance with HTTP Cache Headers](https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers)
	* [Zynga Geo Proxy: Reducing Mobile Game Latency at Zynga](https://www.zynga.com/blogs/engineering/zynga-geo-proxy-reducing-mobile-game-latency)
	* [Google AMP at Condé Nast](https://technology.condenast.com/story/the-why-and-how-of-google-amp-at-conde-nast)
	* [Running A/B Tests on Hosting Infrastructure (CDNs) at Deliveroo](https://deliveroo.engineering/2016/09/19/ab-testing-cdns.html)
	* [HAProxy with Kubernetes for User-facing Traffic at SoundCloud](https://developers.soundcloud.com/blog/how-soundcloud-uses-haproxy-with-kubernetes-for-user-facing-traffic)
	* [Bandaid: Service Proxy at Dropbox](https://blogs.dropbox.com/tech/2018/03/meet-bandaid-the-dropbox-service-proxy/)
	* [CDN in LIVE's Encoder Layer at LINE](https://engineering.linecorp.com/en/blog/detail/230)
* [Load Balancing](https://blog.vivekpanyam.com/scaling-a-web-service-load-balancing/)
	* [Introduction to Modern Network Load Balancing and Proxying](https://blog.envoyproxy.io/introduction-to-modern-network-load-balancing-and-proxying-a57f6ff80236)
	* [Load Balancing infrastructure to support more than 1.3 billion users at Facebook](https://www.usenix.org/conference/srecon15europe/program/presentation/shuff)
	* [DHCPLB: DHCP Load Balancer at Facebook](https://code.facebook.com/posts/1734309626831603/dhcplb-an-open-source-load-balancer/)
	* [Katran: Scalable Network Load Balancer at Facebook](https://code.facebook.com/posts/1906146702752923/open-sourcing-katran-a-scalable-network-load-balancer/)
	* [Load Balancing with Eureka at Netflix](https://medium.com/netflix-techblog/netflix-shares-cloud-load-balancing-and-failover-tool-eureka-c10647ef95e5)
	* [Load Balancing at Yelp](https://engineeringblog.yelp.com/2017/05/taking-zero-downtime-load-balancing-even-further.html)
	* [Load Balancing at GitHub](https://githubengineering.com/introducing-glb/)
	* [Consistent Hashing to Improve Load Balancing at Vimeo](https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed)
	* [UDP Load Balancing at 500px](https://developers.500px.com/udp-load-balancing-with-keepalived-167382d7ad08)
	* [QALM: QoS Load Management Framework at Uber](https://eng.uber.com/qalm/)	
	* [Traffic Steering using Rum DNS at LinkedIn](https://www.usenix.org/conference/srecon17europe/program/presentation/rastogi)
* [Rate Limiting](https://www.keycdn.com/support/rate-limiting/)
	* [Rate Limiting for Scaling to Millions of Domains at Cloudflare](https://blog.cloudflare.com/counting-things-a-lot-of-different-things/)
	* [Cloud Bouncer: Distributed Rate Limiting at Yahoo](https://yahooeng.tumblr.com/post/111288877956/cloud-bouncer-distributed-rate-limiting-at-yahoo)
	* [Scaling API with Rate Limiters at Stripe](https://stripe.com/blog/rate-limiters)
	* [Rate Limiting at Etsy](https://www.sans.org/summit-archives/file/summit-archive-1509593697.pdf)
	* [Rate Limiter at BloomReach](http://engineering.bloomreach.com/qps-monitoring-throttling-system/)
	* [Distributed Rate Limiting at Allegro](https://allegro.tech/2017/04/hermes-max-rate.html)
	* [Ratequeue: Core Queueing-And-Rate-Limiting System at Twilio](https://www.twilio.com/blog/2017/11/chaos-engineering-ratequeue-ha.html)
* [Autoscaling](https://medium.com/@BotmetricHQ/top-11-hard-won-lessons-learned-about-aws-auto-scaling-5bfe56da755f)
	* [A Horror Movie Featuring Auto Scaling Groups, EBS Volumes, Terraform, and Bash](https://blog.gruntwork.io/yak-shaving-series-1-all-i-need-is-a-little-bit-of-disk-space-6e5ef1644f67)
	* [Autoscaling Pinterest](https://medium.com/@Pinterest_Engineering/auto-scaling-pinterest-df1d2beb4d64)
	* [Autoscaling Based on Request Queuing at Square](https://medium.com/square-corner-blog/autoscaling-based-on-request-queuing-c4c0f57f860f)
	* [Autoscaling Applications at PayPal](https://www.paypal-engineering.com/2017/08/16/autoscaling-applications-paypal/)
	* [Autoscaling Jenkins at Trivago](http://tech.trivago.com/2017/02/17/your-definite-guide-for-autoscaling-jenkins/)
	* [Scryer: Predictive Auto Scaling Engine at Netflix](https://medium.com/netflix-techblog/scryer-netflixs-predictive-auto-scaling-engine-a3f8fc922270)
* [Concurrency](http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/)
	* [Message-Passing Concurrency](https://link.springer.com/chapter/10.1007/978-3-642-35170-9_11)
	* [Software Transactional Memory](https://dl.acm.org/citation.cfm?id=3037750)
	* [Dataflow Concurrency](http://www.marketwired.com/press-release/java-concurrency-and-scalability-platform-akka-celebrates-fifth-anniversary-1928674.htm)
	* [Shared-State Concurrency](https://doc.rust-lang.org/book/second-edition/ch16-03-shared-state.html)
	* [Concurrency series by Larry Osterman (Principal SDE at Microsoft)](https://social.msdn.microsoft.com/Profile/Larry%2bOsterman%2b%5BMSFT%5D/activity)
		* [Part 8 – Concurrency for scalability](https://blogs.msdn.microsoft.com/larryosterman/2005/02/28/concurrency-part-8-concurrency-for-scalability/)
		* [Part 9 - APIs that enable scalable programming](https://blogs.msdn.microsoft.com/larryosterman/2005/03/02/concurrency-part-9-apis-that-enable-scalable-programming/)
		* [Part 10 - How do you know if you’ve got a scalability issue?](https://blogs.msdn.microsoft.com/larryosterman/2005/03/03/concurrency-part-10-how-do-you-know-if-youve-got-a-scalability-issue/)
		* [Part 11 – Hidden scalability issues](https://blogs.msdn.microsoft.com/larryosterman/2005/03/04/concurrency-part-11-hidden-scalability-issues/)
		* [Part 12 – Hidden scalability issues (cont)](https://blogs.msdn.microsoft.com/larryosterman/2005/03/07/concurrency-part-12-hidden-scalability-issues-part-2/)
	* [Concurrency with Erlang](http://learnyousomeerlang.com/the-hitchhikers-guide-to-concurrency)
		* [Erlang in WhatsApp](https://blog.whatsapp.com/196/1-million-is-so-2011)
		* [Erlang in Riot Chat Server](https://engineering.riotgames.com/news/chat-service-architecture-servers)
		* [How Discord Scaled Elixir to Five Million Concurrent Users](https://blog.discordapp.com/scaling-elixir-f9b8e1e7c29b)
		* [Mnesia: A Distributed DBMS Rooted in Concurrency](https://www.developer.com/db/article.php/3864331/Mnesia-A-Distributed-DBMS-Rooted-in-Concurrency.htm)
		* [Mnesia and CAP](https://medium.com/@jlouis666/mnesia-and-cap-d2673a92850)
	* [Running Concurrent Queries in GoSocial (Go and Neo4j) at Medium](https://medium.engineering/running-concurrent-queries-in-gosocial-28e5841b05b5)
	* [The Secret To 10 Million Concurrent Connections](http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html)
* [Parallel Computing](https://blogs.msdn.microsoft.com/ddperf/2009/05/02/are-we-taking-advantage-of-parallelism/)
	* [SPMD (Single Program Multiple Data): The Genetic Pattern](https://www2.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-186.html)
	* [Master/Worker Pattern](https://docs.gigaspaces.com/sbp/master-worker-pattern.html)
	* [Loop Parallelism Pattern: Extracting parallel tasks from loops](https://www.cs.umd.edu/class/fall2001/cmsc411/projects/unroll/main.htm)
	* [Fork/Join Pattern: Good for recursive data processing](http://highscalability.com/learn-how-exploit-multiple-cores-better-performance-and-scalability)
	* [Map-Reduce: Born for Simplified Data Processing on Large Clusters](http://static.googleusercontent.com/media/research.google.com/en/us/archive/mapreduce-osdi04.pdf)
	* [On the Death of Map-Reduce - Henry Robinson, Cloudera](http://the-paper-trail.org/blog/the-elephant-was-a-trojan-horse-on-the-death-of-map-reduce-at-google/)
	* [Server-side Optimization to Parallelize the Rendering of Web Pages at Yelp](https://engineeringblog.yelp.com/2017/07/generating-web-pages-in-parallel-with-pagelets.html)
	* [Accelerator: Data Processing Framework with Fast Data Access and Parallel Execution at eBay](https://www.ebayinc.com/stories/blogs/tech/announcing-the-accelerator-processing-1-000-000-000-lines-per-second-on-a-single-computer/)
* [Event-Driven Architecture](https://martinfowler.com/articles/201701-event-driven.html)
	* [Pub-Sub Messaging](https://aws.amazon.com/pub-sub-messaging/)
		* [Autoscaling Pub-Sub Consumers at Spotify](https://labs.spotify.com/2017/11/20/autoscaling-pub-sub-consumers/)
		* [Pulsar: Pub-Sub Messaging at Scale at Yahoo](https://yahooeng.tumblr.com/post/150078336821/open-sourcing-pulsar-pub-sub-messaging-at-scale)
		* [Wormhole: Pub-Sub system at Facebook (2013)](https://code.facebook.com/posts/188966771280871/wormhole-pub-sub-system-moving-data-through-space-and-time/)
		* [Pub-Sub in Chatting Architecture at LINE](https://engineering.linecorp.com/en/blog/detail/85)
	* [Domain Events](https://martinfowler.com/eaaDev/DomainEvent.html)
		* [Domain Events: Simple and Reliable Solution](http://enterprisecraftsmanship.com/2017/10/03/domain-events-simple-and-reliable-solution/)
		* [Domain-Driven Design in Organizing Monolith Before Breaking it into Services at Weebly](https://medium.com/weebly-engineering/how-to-organize-your-monolith-before-breaking-it-into-services-69cbdb9248b0)
		* [Modelling for Domain Driven Design at Moonpig](https://engineering.moonpig.com/development/modelling-for-domain-driven-design)
	* [Event Sourcing](https://martinfowler.com/eaaDev/EventSourcing.html)
		* [Event Sourced Architectures for High Availability](https://www.infoq.com/presentations/Event-Sourced-Architectures-for-High-Availability)
		* [Event Sourcing and Stream Processing at Scale](https://martin.kleppmann.com/2016/01/29/event-sourcing-stream-processing-at-ddd-europe.html)
		* [Scaling Event Sourcing for Netflix Downloads](https://www.infoq.com/presentations/netflix-scale-event-sourcing)
		* [Scaling Event-Sourcing at Jet.com](https://medium.com/@eulerfx/scaling-event-sourcing-at-jet-9c873cac33b8)
		* [Event Sourcing (2 parts) at eBay](https://www.ebayinc.com/stories/blogs/tech/event-sourcing-in-action-with-ebays-continuous-delivery-team/)
	* [Command & Query Responsibility Segregation (CQRS)](https://docs.microsoft.com/en-us/azure/architecture/patterns/cqrs)
		* [Exploring CQRS and Event Sourcing - MSDN (with free ebook)](https://msdn.microsoft.com/en-us/library/jj554200.aspx)
		* [CQRS Simple Architecture](https://www.future-processing.pl/blog/cqrs-simple-architecture/)
		* [Building Scalable Applications Using Event Sourcing and CQRS with Kafka](https://initiate.andela.com/event-sourcing-and-cqrs-a-look-at-kafka-e0c1b90d17d8)
	* [Stream Processing, Event Sourcing, Reactive, CEP, etc - Martin Kleppmann](https://www.confluent.io/blog/making-sense-of-stream-processing/)
		* [Point-To-Point and Its Differences from Pub-Sub](https://www.journaldev.com/9743/jms-messaging-models)
		* [Store-Forward](https://docs.oracle.com/cd/E13222_01/wls/docs91/saf_admin/overview.html)
		* [Request-Reply](https://docs.tibco.com/pub/ftl/4.3.0/doc/html/GUID-A64ABED1-682E-4E1D-A94A-5590CB91B9BB.html)
		* [Enterprise Service Bus](http://www.oracle.com/technetwork/articles/soa/ind-soa-esb-1967705.html)		
* [Distributed Source Code and Configuration Files Management](https://betterexplained.com/articles/intro-to-distributed-version-control-illustrated/)
	* [Distributed Version Control Systems: A Not-So-Quick Guide Through](https://www.infoq.com/articles/dvcs-guide)
	* [DGit: Distributed Git at GitHub](https://githubengineering.com/introducing-dgit/)
	* [Stemma: Distributed Git Server at Palantir](https://medium.com/@palantir/stemma-distributed-git-server-70afbca0fc29)
	* [Configuration Management for Distributed Systems at Flickr](https://code.flickr.net/2016/03/24/configuration-management-for-distributed-systems-using-github-and-cfg4j/)
	* [Git Repository at Microsoft](https://blogs.msdn.microsoft.com/bharry/2017/05/24/the-largest-git-repo-on-the-planet/)
	* [How Microsoft Solved Git’s Problem with Large Repositories](https://www.infoq.com/news/2017/02/GVFS)	
	* [Single Repository at Google](https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext)	
	* [Scaling Infrastructure and (Git) Workflow at Adyen](https://medium.com/adyen/from-0-100-billion-scaling-infrastructure-and-workflow-at-adyen-7b63b690dfb6)	
	* [Dotfiles Distribution at Booking.com](https://medium.com/booking-com-infrastructure/dotfiles-distribution-dedb69c66a75)

## Availability
* [Resilience Engineering: Learning to Embrace Failure](https://queue.acm.org/detail.cfm?id=2371297)	
	* [Resilience Engineering with Project Waterbear at LinkedIn](https://engineering.linkedin.com/blog/2017/11/resilience-engineering-at-linkedin-with-project-waterbear)
	* [Resiliency against Traffic Oversaturation at iHeartRadio](https://tech.iheart.com/resiliency-against-traffic-oversaturation-77c5ed92a5fb)
	* [Resiliency in Distributed Systems at GO-JEK](https://blog.gojekengineering.com/resiliency-in-distributed-systems-efd30f74baf4)
	* [Practical NoSQL Resilience Design Pattern for the Enterprise at eBay](https://www.ebayinc.com/stories/blogs/tech/practical-nosql-resilience-design-pattern-for-the-enterprise/)
	* [Ensuring Resilience to Disaster at Quora](https://engineering.quora.com/Ensuring-Quoras-Resilience-to-Disaster)
	* [Resilience at Shopify](https://scaleyourcode.com/blog/article/23)
	* [Site Resiliency at Expedia](https://www.infoq.com/presentations/expedia-website-resiliency?utm_source=presentations_about_Case_Study&utm_medium=link&utm_campaign=Case_Study)
* [Failover](http://cloudpatterns.org/mechanisms/failover_system)
	* [The Evolution of Global Traffic Routing and Failover](https://www.usenix.org/conference/srecon16/program/presentation/heady)
	* [Testing for Disaster Recovery Failover Testing](https://www.usenix.org/conference/srecon17asia/program/presentation/liu_zehua)
	* [Designing a Microservices Architecture for Failure](https://blog.risingstack.com/designing-microservices-architecture-for-failure/)
	* [ELB for Automatic Failover at GoSquared](https://engineering.gosquared.com/use-elb-automatic-failover)
	* [Eliminate the Database for Higher Availability at American Express](http://americanexpress.io/eliminate-the-database-for-higher-availability/)
	* [Failover with Redis Sentinel at Vinted](http://engineering.vinted.com/2015/09/03/failover-with-redis-sentinel/)
	* [High-availability SaaS Infrastructure at FreeAgent](http://engineering.freeagent.com/2017/02/06/ha-infrastructure-without-breaking-the-bank/)	
* [Availability in Globally Distributed Storage Systems](http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/36737.pdf)	
* [NodeJS High Availability at Yahoo](https://yahooeng.tumblr.com/post/68823943185/nodejs-high-availability)
* [Every Day is Monday in Operations (11 parts) at LinkedIn](https://www.linkedin.com/pulse/introduction-every-day-monday-operations-benjamin-purgason)
* [How Robust Monitoring Powers High Availability for LinkedIn Feed](https://www.usenix.org/conference/srecon17americas/program/presentation/barot)
* [Architectural Patterns for High Availability - Adrian Cockcroft, Director of Architecture at Netflix](https://www.infoq.com/presentations/Netflix-Architecture)
* [Supporting Global Events at Facebook](https://code.facebook.com/posts/166966743929963/how-production-engineers-support-global-events-on-facebook/)
* [Backends High Availability at BlaBlaCar](https://medium.com/blablacar-tech/the-expendables-backends-high-availability-at-blablacar-8cea3b95b26b)
* [Chubby: Lock Service for Loosely Coupled Distributed Systems at Google](https://blog.acolyer.org/2015/02/13/the-chubby-lock-service-for-loosely-coupled-distributed-systems/)
* [Tips for High Availability at Netflix](https://medium.com/@NetflixTechBlog/tips-for-high-availability-be0472f2599c)
* [Scaling High-Availability Infrastructure in the Cloud at Twilio](https://www.twilio.com/engineering/2011/12/12/scaling-high-availablity-infrastructure-in-cloud)

## Stability
* [Circuit Breaker](https://martinfowler.com/bliki/CircuitBreaker.html)
	* [Circuit Breaking in Distributed Systems](https://www.infoq.com/presentations/circuit-breaking-distributed-systems)
	* [Circuit Breakers for Distributed Services at LINE](https://engineering.linecorp.com/en/blog/detail/76)
	* [Applying Circuit Breaker to Channel Gateway at LINE](https://engineering.linecorp.com/en/blog/detail/78)
	* [Lessons in Resilience at SoundCloud](https://developers.soundcloud.com/blog/lessons-in-resilience-at-SoundCloud)
	* [Circuit Breaker for Scaling Containers](https://f5.com/about-us/blog/articles/the-art-of-scaling-containers-circuit-breakers-28919)
	* [Protector: Circuit Breaker for Time Series Databases at Trivago](http://tech.trivago.com/2016/02/23/protector/)
	* [Improved Production Stability with Circuit Breakers at Heroku](https://blog.heroku.com/improved-production-stability-with-circuit-breakers)
* [Always Use Timeouts If Possible](https://www.javaworld.com/article/2824163/application-performance/stability-patterns-applied-in-a-restful-architecture.html)
* [Crash Early: Better Error Now Than Response Tomorrow](http://odino.org/better-performance-the-case-for-timeouts/)
* [Fault Tolerance (Timeouts and Retries, Thread Separation, Semaphores, Circuit Breakers) at Netflix](https://medium.com/netflix-techblog/fault-tolerance-in-a-high-volume-distributed-system-91ab4faae74a)
* [Crash-safe Replication for MySQL at Booking.com](https://medium.com/booking-com-infrastructure/better-crash-safe-replication-for-mysql-a336a69b317f)
* [Bulkheads: Partition and Tolerate Failure in One Part](https://skife.org/architecture/fault-tolerance/2009/12/31/bulkheads.html)
* [Steady State: Always Put Logs on Separate Disk](https://docs.microsoft.com/en-us/sql/relational-databases/policy-based-management/place-data-and-log-files-on-separate-drives)
* [Throttling: Maintain a Steady Pace](http://www.sosp.org/2001/papers/welsh.pdf)
* [Multi-Clustering: Improving Resiliency and Stability of a Large-scale Monolithic API Service at LinkedIn](https://engineering.linkedin.com/blog/2017/11/improving-resiliency-and-stability-of-a-large-scale-api)
* [Determinism (4 parts) in League of Legends Server](https://engineering.riotgames.com/news/determinism-league-legends-fixing-divergences)

## Performance
* [Performance Optimization on OS, Storage, Database, Network](https://stackify.com/application-performance-metrics/)
	* [Improving Performance with Background Data Prefetching at Instagram](https://engineering.instagram.com/improving-performance-with-background-data-prefetching-b191acb39898)
	* [Compression Techniques to Solve Network I/O Bottlenecks at eBay](https://www.ebayinc.com/stories/blogs/tech/how-ebays-shopping-cart-used-compression-techniques-to-solve-network-io-bottlenecks/)
	* [Optimizing Web Servers for High Throughput and Low Latency at Dropbox](https://blogs.dropbox.com/tech/2017/09/optimizing-web-servers-for-high-throughput-and-low-latency/)
	* [Linux Performance Analysis in 60,000 Milliseconds at Netflix](https://medium.com/netflix-techblog/linux-performance-analysis-in-60-000-milliseconds-accc10403c55)
	* [Performance Testing with SSDs (2 parts) at MailChimp](https://devs.mailchimp.com/blog/performance-testing-with-ssds-pt-2/)
	* [Live Downsizing Google Cloud Persistent Disks (PD-SSD) at Mixpanel](https://engineering.mixpanel.com/2018/07/31/live-downsizing-google-cloud-pds-for-fun-and-profit/)
	* [Decreasing RAM Usage by 40% Using jemalloc with Python & Celery at Zapier](https://zapier.com/engineering/celery-python-jemalloc/)
	* [Reducing Memory Footprint at Slack](https://slack.engineering/reducing-slacks-memory-footprint-4480fec7e8eb)
	* [Performance Improvements at Pinterest](https://medium.com/@Pinterest_Engineering/driving-user-growth-with-performance-improvements-cfc50dafadd7)
	* [Server Side Rendering at Wix](https://www.youtube.com/watch?v=f9xI2jR71Ms)
	* [30x Performance Improvements on MySQLStreamer at Yelp](https://engineeringblog.yelp.com/2018/02/making-30x-performance-improvements-on-yelps-mysqlstreamer.html)
	* [Optimizing APIs through Dynamic Polyglot Runtime, Fully Asynchronous, and Reactive Programming at Netflix](https://medium.com/netflix-techblog/optimizing-the-netflix-api-5c9ac715cf19)
	* [Performance Monitoring with Riemann and Clojure at Walmart](https://medium.com/walmartlabs/performance-monitoring-with-riemann-and-clojure-eafc07fcd375)
	* [Performance Tracking Dashboard for Live Games at Zynga](https://www.zynga.com/blogs/engineering/live-games-have-evolving-performance)
	* [Optimizing CAL Report Hadoop MapReduce Jobs at eBay](https://www.ebayinc.com/stories/blogs/tech/optimization-of-cal-report-hadoop-mapreduce-job/)
	* [Performance Tuning on Quartz Scheduler at eBay](https://www.ebayinc.com/stories/blogs/tech/performance-tuning-on-quartz-scheduler/)
	* [Profiling C++ (Part 1: Optimization, Part 2: Measurement and Analysis) at Riot Games](https://engineering.riotgames.com/news/profiling-optimisation)
	* [Diagnosing Networking Issues in the Linux Kernel at Mixpanel](https://code.mixpanel.com/2015/03/26/diagnosing-networking-issues-in-the-linux-kernel/)
	* [Hardware-Assisted Video Transcoding at Dailymotion](https://medium.com/dailymotion-engineering/hardware-assisted-video-transcoding-at-dailymotion-66cd2db448ae)
* [Performance Optimization by Tuning Garbage Collection](https://confluence.atlassian.com/enterprise/garbage-collection-gc-tuning-guide-461504616.html)
	* [Garbage Collection Optimization for High-Throughput and Low-Latency Java Applications at LinkedIn](https://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications)
	* [Analyzing V8 Garbage Collection Logs at Alibaba](https://www.linux.com/blog/can-nodejs-scale-ask-team-alibaba)
	* [Python Garbage Collection for Dropping 50% Memory Growth Per Request at Instagram](https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf)
	* [Performance Impact of Removing Out of Band Garbage Collector (OOBGC) at GitHub](https://githubengineering.com/removing-oobgc/)
	* [Using Java Large Heap (110 GB) at Expedia](https://techblog.expedia.com/2015/09/25/solving-problems-with-very-large-java-heaps/)
	* [Debugging Java Memory Leaks at Allegro](https://allegro.tech/2018/05/a-comedy-of-errors-debugging-java-memory-leaks.html)
	* [Optimizing JVM at Alibaba](https://www.youtube.com/watch?v=X4tmr3nhZRg)
* [Performance Optimization on Video, Image, Page Load](https://developers.google.com/web/fundamentals/performance/why-performance-matters/)
	* [Optimizing 360 Photos at Scale at Facebook](https://code.facebook.com/posts/129055711052260/optimizing-360-photos-at-scale/)
	* [Reducing Image File Size in the Photos Infrastructure at Etsy](https://codeascraft.com/2017/05/30/reducing-image-file-size-at-etsy/)
	* [Improving GIF Performance at Pinterest](https://medium.com/@Pinterest_Engineering/improving-gif-performance-on-pinterest-8dad74bf92f1)
	* [Optimizing Video Playback Performance at Pinterest](https://medium.com/@Pinterest_Engineering/optimizing-video-playback-performance-caf55ce310d1)
	* [Optimizing Video Stream for Low Bandwidth with Dynamic Optimizer at Netflix](https://medium.com/netflix-techblog/optimized-shot-based-encodes-now-streaming-4b9464204830)
	* [Adaptive Video Streaming at YouTube](https://youtube-eng.googleblog.com/2018/04/making-high-quality-video-efficient.html)
	* [Reducing Video Loading Time by Prefetching during Preroll at Dailymotion](http://engineering.dailymotion.com/reducing-video-loading-time-prefetching-video-during-preroll/)
	* [Boosting Site Speed Using Brotli Compression at LinkedIn](https://engineering.linkedin.com/blog/2017/05/boosting-site-speed-using-brotli-compression)
	* [Improving Homepage Performance at Zillow](https://www.zillow.com/engineering/improving-homepage-performance/)
	* [The Process of Optimizing for Client Performance at Expedia](https://techblog.expedia.com/2018/03/09/go-fast-or-go-home-the-process-of-optimizing-for-client-performance/)

## Intelligence
* [Big Data](https://insights.sei.cmu.edu/sei_blog/2017/05/reference-architectures-for-big-data-systems.html)	
	* [Data Platform at Netflix](https://www.youtube.com/watch?v=CSDIThSwA7s)
	* [Data Platform at Flipkart](https://tech.flipkart.com/overview-of-flipkart-data-platform-20c6d3e9a196)
	* [Data Pipeline Management Platform at Khan Academy](http://engineering.khanacademy.org/posts/khanalytics.htm)
	* [Data Infrastructure at Airbnb](https://medium.com/airbnb-engineering/data-infrastructure-at-airbnb-8adfb34f169c)
	* [Data Infrastructure at LinkedIn](https://www.infoq.com/presentations/big-data-infrastructure-linkedin)
	* [Data Infrastructure at GO-JEK](https://blog.gojekengineering.com/data-infrastructure-at-go-jek-cd4dc8cbd929)
	* [Data Ingestion Infrastructure at Pinterest](https://medium.com/@Pinterest_Engineering/scalable-and-reliable-data-ingestion-at-pinterest-b921c2ee8754)
	* [Data Analytics Architecture at Pinterest](https://medium.com/@Pinterest_Engineering/behind-the-pins-building-analytics-f7b508cdacab)
	* [Big Data Processing (2 parts) at Spotify](https://labs.spotify.com/2017/10/23/big-data-processing-at-spotify-the-road-to-scio-part-2/)
	* [Big Data Processing at Uber](https://cdn.oreillystatic.com/en/assets/1/event/160/Big%20data%20processing%20with%20Hadoop%20and%20Spark%2C%20the%20Uber%20way%20Presentation.pdf)
	* [Analytics Pipeline at Lyft](https://cdn.oreillystatic.com/en/assets/1/event/269/Lyft_s%20analytics%20pipeline_%20From%20Redshift%20to%20Apache%20Hive%20and%20Presto%20Presentation.pdf)
	* [Big Data Analytics and ML Techniques at LinkedIn](https://cdn.oreillystatic.com/en/assets/1/event/269/Big%20data%20analytics%20and%20machine%20learning%20techniques%20to%20drive%20and%20grow%20business%20Presentation%201.pdf)
	* [Self-Serve Reporting Platform on Hadoop at LinkedIn](https://cdn.oreillystatic.com/en/assets/1/event/137/Building%20a%20self-serve%20real-time%20reporting%20platform%20at%20LinkedIn%20Presentation%201.pdf)
	* [Analytics Platform for Tracking Item Availability at Walmart](https://medium.com/walmartlabs/how-we-build-a-robust-analytics-platform-using-spark-kafka-and-cassandra-lambda-architecture-70c2d1bc8981)
	* [RBEA: Real-time Analytics Platform at King](https://techblog.king.com/rbea-scalable-real-time-analytics-king/)
	* [Gimel: Analytics Data Processing Platform at PayPal](https://www.paypal-engineering.com/2018/04/17/gimel/)
	* [AthenaX: Streaming Analytics Platform at Uber](https://eng.uber.com/athenax/)
	* [Databook: Turning Big Data into Knowledge with Metadata at Uber](https://eng.uber.com/databook/)
	* [Maze: Funnel Visualization Platform at Uber](https://eng.uber.com/maze/)
	* [Metacat: Making Big Data Discoverable and Meaningful at Netflix](https://medium.com/netflix-techblog/metacat-making-big-data-discoverable-and-meaningful-at-netflix-56fb36a53520)
	* [TensorFlowOnSpark: Distributed Deep Learning on Big Data Clusters at Yahoo](https://yahooeng.tumblr.com/post/157196488076/open-sourcing-tensorflowonspark-distributed-deep)
	* [CaffeOnSpark: Distributed Deep Learning on Big Data Clusters at Yahoo](https://yahooeng.tumblr.com/post/139916828451/caffeonspark-open-sourced-for-distributed-deep)
	* [Experimentation Platform at Airbnb](https://medium.com/airbnb-engineering/https-medium-com-jonathan-parks-scaling-erf-23fd17c91166)
	* [Smart Product Platform at Zalando](https://jobs.zalando.com/tech/blog/zalando-smart-product-platform/?gh_src=4n3gxh1)
	* [Log Analysis Platform at LINE](https://www.slideshare.net/wyukawa/strata2017-sg)
* [Distributed Machine Learning](https://www.csie.ntu.edu.tw/~cjlin/talks/bigdata-bilbao.pdf)
	* [Michelangelo: Machine Learning Platform at Uber](https://eng.uber.com/michelangelo/)
	* [Horovod: Open Source Distributed Deep Learning Framework for TensorFlow at Uber](https://eng.uber.com/horovod/)
	* [COTA: Improving Customer Care with NLP & Machine Learning at Uber](https://eng.uber.com/cota/)	
	* [Repo-Topix: Topic Extraction Framework at GitHub](https://githubengineering.com/topics/)
	* [Concourse: Generating Personalized Content Notifications in Near-Real-Time at LinkedIn](https://engineering.linkedin.com/blog/2018/05/concourse--generating-personalized-content-notifications-in-near)
	* [Altus Care: Applying a Chatbot to Platform Engineering at eBay](https://www.ebayinc.com/stories/blogs/tech/altus-care-apply-chatbot-to-ebay-platform-engineering/)
	* [Box Graph: Spontaneous Social Network at Box](https://blog.box.com/blog/box-graph-how-we-built-spontaneous-social-network/)
	* [PricingNet: Pricing Modelling with Neural Networks at Skyscanner](https://hackernoon.com/pricingnet-modelling-the-global-airline-industry-with-neural-networks-833844d20ea6)
	* [Scaling Gradient Boosted Trees for Click-Through-Rate Prediction at Yelp](https://engineeringblog.yelp.com/2018/01/building-a-distributed-ml-pipeline-part1.html)	
	* [Learning with Privacy at Scale at Apple](https://machinelearning.apple.com/2017/12/06/learning-with-privacy-at-scale.html)
	* [Deep Learning for Image Classification Experiment at Mercari](https://medium.com/mercari-engineering/mercaris-image-classification-experiment-using-deep-learning-9b4e994a18ec)
	* [Deep Learning for Frame Detection in Product Images at Allegro](https://allegro.tech/2016/12/deep-learning-for-frame-detection.html)
	* [Content-based Video Relevance Prediction at Hulu](https://medium.com/hulu-tech-blog/content-based-video-relevance-prediction-b2c448e14752)
	* [Training ML Models with Airflow and BigQuery at WePay](https://wecode.wepay.com/posts/training-machine-learning-models-with-airflow-and-bigquery)
	* [Improving Photo Selection With Deep Learning at TripAdvisor](http://engineering.tripadvisor.com/improving-tripadvisor-photo-selection-deep-learning/)
	* [Machine Learning (2 parts) at Condé Nast](https://technology.condenast.com/story/handbag-brand-and-color-detection)
	* [Machine Learning Applications In The E-commerce Domain (4 parts) at Rakuten](https://techblog.rakuten.co.jp/2017/07/12/machine-learning-applications-in-the-e-commerce-domain-4/)
	* [Mapping the World of Music Using Machine Learning (2 parts) at iHeartRadio](https://tech.iheart.com/mapping-the-world-of-music-using-machine-learning-part-2-aa50b6a0304c)
	* [Venue Rating System at Foursquare](https://engineering.foursquare.com/finding-the-perfect-10-how-we-developed-the-foursquare-venue-rating-system-c76b08f7b9b3)
	* [Using Machine Learning to Improve Streaming Quality at Netflix](https://medium.com/netflix-techblog/using-machine-learning-to-improve-streaming-quality-at-netflix-9651263ef09f)
	* [Improving Video Thumbnails with Deep Neural Nets at YouTube](https://youtube-eng.googleblog.com/2015/10/improving-youtube-video-thumbnails-with_8.html)
	* [Quantile Regression for Delivering On Time at Instacart](https://tech.instacart.com/how-instacart-delivers-on-time-using-quantile-regression-2383e2e03edb)
	* [Cross-Lingual End-to-End Product Search with Deep Learning at Zalando](https://jobs.zalando.com/tech/blog/search-deep-neural-network/)
	* [Machine Learning at Jane Street](https://blog.janestreet.com/real-world-machine-learning-part-1/)
	* [Machine Learning for Ranking Answers End-to-End at Quora](https://engineering.quora.com/A-Machine-Learning-Approach-to-Ranking-Answers-on-Quora)
	* [Clustering Similar Stories Using LDA at Flipboard](http://engineering.flipboard.com/2017/02/storyclustering)
	* [Similarity Search at Flickr](https://code.flickr.net/2017/03/07/introducing-similarity-search-at-flickr/)
	* [Large-Scale Machine Learning Pipeline for Job Recommendations at Indeed](http://engineering.indeedblog.com/blog/2016/04/building-a-large-scale-machine-learning-pipeline-for-job-recommendations/)
	* [Deep Learning from Prototype to Production at Taboola](http://engineering.taboola.com/deep-learning-from-prototype-to-production/)
	* [Atom Smashing using Machine Learning at CERN](https://cdn.oreillystatic.com/en/assets/1/event/144/Atom%20smashing%20using%20machine%20learning%20at%20CERN%20Presentation.pdf)
	* [Mapping Tags at Medium](https://medium.engineering/mapping-mediums-tags-1b9a78d77cf0)
	* [Clustering with the Dirichlet Process Mixture Model in Scala at Monsanto](http://engineering.monsanto.com/2015/11/23/chinese-restaurant-process/)
	* [Map Pins with DBSCAN & Random Forests at Foursquare](https://engineering.foursquare.com/you-are-probably-here-better-map-pins-with-dbscan-random-forests-9d51e8c1964d)
	* [Detecting and Preventing Fraud at Uber](https://eng.uber.com/advanced-technologies-detecting-preventing-fraud-uber/)
	* [Financial Forecasting at Uber](https://eng.uber.com/transforming-financial-forecasting-machine-learning/)
	* [Productionizing ML with Workflows at Twitter](https://blog.twitter.com/engineering/en_us/topics/insights/2018/ml-workflows.html)
	* [GUI Testing Powered by Deep Learning at eBay](https://www.ebayinc.com/stories/blogs/tech/gui-testing-powered-by-deep-learning/)

## Architecture
* [Systems We Make](https://systemswemake.com/)
* [Tech Stack (2 parts) at Uber](https://eng.uber.com/tech-stack-part-two/)
* [Tech Stack at Medium](https://medium.engineering/the-stack-that-helped-medium-drive-2-6-millennia-of-reading-time-e56801f7c492)
* [Services (2 parts) at Airbnb](https://medium.com/airbnb-engineering/building-services-at-airbnb-part-2-142be1c5d506)
* [Back-end at LinkedIn](https://engineering.linkedin.com/architecture/brief-history-scaling-linkedin)
* [Back-end at Flickr](https://yahooeng.tumblr.com/post/157200523046/introducing-tripod-flickrs-backend-refactored)
* [Real-time Presence Platform at LinkedIn](https://engineering.linkedin.com/blog/2018/01/now-you-see-me--now-you-dont--linkedins-real-time-presence-platf)
* [Real-time User Action Counting System for Ads at Pinterest](https://medium.com/@Pinterest_Engineering/building-a-real-time-user-action-counting-system-for-ads-88a60d9c9a)
* [API Platform at Riot Games](https://engineering.riotgames.com/news/riot-games-api-deep-dive)
* [Games Platform at The New York Times](https://open.nytimes.com/play-by-play-moving-the-nyt-games-platform-to-gcp-with-zero-downtime-cf425898d569)
* [Data Visualisation Platform at Myntra](https://medium.com/myntra-engineering/universal-dashboarding-platform-udp-data-visualisation-platform-at-myntra-5f2522fcf72d)
* [Simone: Distributed Simulation Service at Netflix](https://medium.com/netflix-techblog/https-medium-com-netflix-techblog-simone-a-distributed-simulation-service-b2c85131ca1b)
* [Zuul: Edge Service for Dynamic Routing, Monitoring, Resiliency, Security, etc. at Netflix](https://medium.com/netflix-techblog/open-sourcing-zuul-2-82ea476cb2b3)
* [Seagull: Distributed System that Helps Run More Than 20 Million Tests Per Day at Yelp](https://engineeringblog.yelp.com/2017/04/how-yelp-runs-millions-of-tests-every-day.html)
* [MySQL Realtime Traffic Emulator at KakaoTalk](http://tech.kakao.com/2016/02/16/opensource-2-mtre/)
* [Architecture of Sticker Services at LINE](https://www.slideshare.net/linecorp/architecture-sustaining-line-sticker-services)
* [Stack Overflow Enterprise at Palantir](https://medium.com/@palantir/terraforming-stack-overflow-enterprise-in-aws-47ee431e6be7)
* [Distributed Cron at Quora](https://engineering.quora.com/Quoras-Distributed-Cron-Architecture)
* [Architectures of Finance and Banking Systems](https://www.sesameindia.com/images/core-banking-system-architecture)
	* [Reference Architecture For The Open Banking Standard](https://hortonworks.com/blog/reference-architecture-open-banking-standard/)
	* [Building a Modern Bank Backend at Monzo](https://monzo.com/blog/2016/09/19/building-a-modern-bank-backend/)
	* [Reinventing the Trading Platform for Scale at Wealthsimple](https://medium.com/@Wealthsimple/engineering-at-wealthsimple-reinventing-our-trading-platform-for-scale-17e332241b6c)
	* [Architecture for Core Banking System at Margo Bank](https://medium.com/margobank/choosing-an-architecture-85750e1e5a03)
	* [Architecture of Nubank](https://www.infoq.com/presentations/nubank-architecture)
	* [Tech Stack at TransferWise](http://tech.transferwise.com/the-transferwise-stack-heartbeat-of-our-little-revolution/)
	* [Tech Stack at Addepar](https://medium.com/build-addepar/our-tech-stack-a4f55dab4b0d)

## Interview
* [Designing Large-Scale Systems](https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/)
	* [My Scaling Hero - Jeff Atwood (a dose of Endorphins before your interview, JK)](https://blog.codinghorror.com/my-scaling-hero/)
	* [Software Engineering Advice from Building Large-Scale Distributed Systems - Jeff Dean](https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf)
	* [Introduction to Architecting Systems for Scale](https://lethain.com/introduction-to-architecting-systems-for-scale/)
	* [Anatomy of a System Design Interview](https://hackernoon.com/anatomy-of-a-system-design-interview-4cb57d75a53f)
	* [8 Things You Need to Know Before a System Design Interview](http://blog.gainlo.co/index.php/2015/10/22/8-things-you-need-to-know-before-system-design-interviews/)
	* [Top 10 System Design Interview Questions](https://hackernoon.com/top-10-system-design-interview-questions-for-software-engineers-8561290f0444)
	* [Top 10 Common Large-Scale Software Architectural Patterns in a Nutshell](https://towardsdatascience.com/10-common-software-architectural-patterns-in-a-nutshell-a0b47a1e9013)
	* [Cloud Big Data Design Patterns - Lynn Langit](https://lynnlangit.com/2017/03/14/beyond-relational/)	
	* [How NOT to design Netflix in your 45-minute System Design Interview?](https://hackernoon.com/how-not-to-design-netflix-in-your-45-minute-system-design-interview-64953391a054)
* [Explaining Low-Level Systems (OS, Network/Protocol, Database, Storage)](https://www.palantir.com/how-to-ace-a-systems-design-interview/)	
	* [OSI and TCP/IP Cheat Sheet](http://jaredheinrichs.com/mastering-the-osi-tcpip-models.html)
	* [The Precise Meaning of I/O Wait Time in Linux](http://veithen.github.io/2013/11/18/iowait-linux.html)
	* [Paxos Made Live – An Engineering Perspective](https://research.google.com/archive/paxos_made_live.html)
	* [How to do Distributed Locking](https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html)
	* [SQL Transaction Isolation Levels Explained](http://elliot.land/post/sql-transaction-isolation-levels-explained)
* ["What Happens When... and How" Questions](https://www.glassdoor.com/Interview/What-happens-when-you-type-www-google-com-in-your-browser-QTN_56396.htm)
	* [What Happens When You Type google.com into Browser and Press Enter?](https://github.com/alex/what-happens-when)
	* [Netflix: What Happens When You Press Play?](http://highscalability.com/blog/2017/12/11/netflix-what-happens-when-you-press-play.html)
	* [Monzo: How Peer-To-Peer Payments Work](https://monzo.com/blog/2018/04/05/how-monzo-to-monzo-payments-work/)
	* [Transit and Peering: How Your Requests Reach GitHub](https://githubengineering.com/transit-and-peering-how-your-requests-reach-github/)
	* [How Expedia Finds your Flights: A Detailed View](https://techblog.expedia.com/2016/03/07/how-expedia-finds-flights-a-detailed-view/)

## Organization
* [Engineering Levels at SoundCloud](https://developers.soundcloud.com/blog/engineering-levels)
* [Scaling Engineering Teams at Twitter](https://www.youtube.com/watch?v=-PXi_7Ld5kU)
* [Scaling Decision-Making Across Teams at LinkedIn](https://engineering.linkedin.com/blog/2018/03/scaling-decision-making-across-teams-within-linkedin-engineering)
* [Scaling Data Science Team at GOJEK](https://blog.gojekengineering.com/the-dynamics-of-scaling-an-organisation-cb96dbe8aecd)
* [Scaling Agile at Zalando](https://jobs.zalando.com/tech/blog/scaling-agile-zalando/?gh_src=4n3gxh1)
* [Scaling Agile at bol.com](https://hackernoon.com/how-we-run-bol-com-with-60-autonomous-teams-fe7a98c0759)
* [Lessons Learned from Scaling a Product Team at Intercom](https://blog.intercom.com/how-we-build-software/)
* [Hiring, Managing, and Scaling Engineering Teams at Typeform](https://medium.com/@eleonorazucconi/toby-oliver-cto-typeform-on-hiring-managing-and-scaling-engineering-teams-86bef9e5a708)	
* [Scaling the Datagram Team at Instagram](https://instagram-engineering.com/scaling-the-datagram-team-fc67bcf9b721)
* [Scaling the Design Team at Flexport](https://medium.com/flexport-design/designing-a-design-team-a9a066bc48a5)
* [Team Model for Scaling a Design System at Salesforce](https://medium.com/salesforce-ux/the-salesforce-team-model-for-scaling-a-design-system-d89c2a2d404b)
* [Building Analytics Team (4 parts) at Wish](https://medium.com/wish-engineering/scaling-the-analytics-team-at-wish-part-4-recruiting-2a9823b9f5a)
* [From 2 Founders to 1000 Employees at TransferWise](https://medium.com/transferwise-ideas/from-2-founders-to-1000-employees-how-a-small-scale-startup-grew-into-a-global-community-9f26371a551b)
* [Lessons Learned Growing a UX Team from 10 to 170 at Adobe](https://medium.com/thinking-design/lessons-learned-growing-a-ux-team-from-10-to-170-f7b47be02262)
* [Five Lessons from Scaling at Pinterest](https://medium.com/@sarahtavel/five-lessons-from-scaling-pinterest-6a699a889b08)

## Talk
* [Distributed Systems in One Lesson - Tim Berglund, Senior Director of Developer Experience at Confluent](https://www.youtube.com/watch?v=Y6Ev8GIlbxc)
* [Building Real Time Infrastructure at Facebook - Jeff Barber and Shie Erlich, Software Engineers at Facebook](https://www.usenix.org/conference/srecon17americas/program/presentation/erlich)
* [Building Reliable Social Infrastructure for Google - Marc Alvidrez, Senior Manager at Google](https://www.usenix.org/conference/srecon16/program/presentation/alvidrez)
* [Building a Distributed Build System at Google Scale - Aysylu Greenberg, SDE at Google](https://www.youtube.com/watch?v=K8YuavUy6Qc)
* [Site Reliability Engineering at Dropbox - Tammy Butow, Site Reliability Engineering Manager at Dropbox](https://www.youtube.com/watch?v=ggizCjUCCqE)
* [How Google Does Planet-Scale for Planet-Scale Infra - Melissa Binde, SRE Director for Google Cloud Platform](https://www.youtube.com/watch?v=H4vMcD7zKM0)
* [Netflix Guide to Microservices - Josh Evans, Director of Operations Engineering at Netflix](https://www.youtube.com/watch?v=CZ3wIuvmHeM&t=2837s)
* [Achieving Rapid Response Times in Large Online Services - Jeff Dean, Google Senior Fellow](https://www.youtube.com/watch?v=1-3Ahy7Fxsc)
* [Architecture to Handle 80K RPS Celebrity Sales at Shopify - Simon Eskildsen, Engineering Lead at Shopify](https://www.youtube.com/watch?v=N8NWDHgWA28)
* [Lessons of Scale at Facebook - Bobby Johnson, Director of Engineering at Facebook](https://www.youtube.com/watch?v=QCHiNEw73AU)
* [Performance Optimization for the Greater China Region at Salesforce - Jeff Cheng, Enterprise Architect at Salesforce](https://www.salesforce.com/video/1757880/)
* [How GIPHY Delivers a GIF to 300 Million Users - Alex Hoang and Nima Khoshini, Services Engineers at GIPHY](https://vimeo.com/252367076)
* [High Performance Packet Processing Platform at Alibaba - Haiyong Wang, Senior Director at Alibaba](https://www.youtube.com/watch?v=wzsxJqeVIhY&list=PLMu8-hpCxIVENuAue7bd0eCAglLGY_8AW&index=7)
* [Solving Large-scale Data Center and Cloud Interconnection Problems - Ihab Tarazi, CTO at Equinix](https://atscaleconference.com/videos/solving-large-scale-data-center-and-cloud-interconnection-problems/)
* [Scaling Dropbox - Kevin Modzelewski, Back-end Engineer at Dropbox](https://www.youtube.com/watch?v=PE4gwstWhmc)
* [Scaling Reliability at Dropbox - Sat Kriya Khalsa, SRE at Dropbox](https://www.youtube.com/watch?v=IhGWOaD5BYQ)
* [Scaling with Performance at Facebook - Bill Jia, VP of Infrastructure at Facebook](https://atscaleconference.com/videos/performance-scale-2018-opening-remarks/)
* [Scaling Live Videos to a Billion Users at Facebook - Sachin Kulkarni, Director of Engineering at Facebook](https://www.youtube.com/watch?v=IO4teCbHvZw)
* [Scaling Low-latency Live Streams at Facebook (Latencies for Real-time Interactions) - Saral Shodhan, SDE at Facebook](https://atscaleconference.com/videos/scaling-low-latency-live-streams/)
* [Scaling Low-latency Live Streams at Facebook (End-to-End Considerations) - Federico Larumbe, SDE at Facebook](https://atscaleconference.com/videos/scaling-low-latency-live-streams-2-of-2/)
* [Scaling Infrastructure at Instagram - Lisa Guo, Instagram Engineering](https://www.youtube.com/watch?v=hnpzNAPiC0E)
* [Scaling Infrastructure at Twitter - Yao Yue, Staff Software Engineer at Twitter](https://www.youtube.com/watch?v=6OvrFkLSoZ0)
* [Scaling Infrastructure at Etsy - Bethany Macri, Engineering Manager at Etsy](https://www.youtube.com/watch?v=LfqyhM1LeIU)
* [Scaling Real-time Infrastructure at Alibaba for Global Shopping Holiday - Xiaowei Jiang, Senior Director at Alibaba](https://atscaleconference.com/videos/scaling-alibabas-real-time-infrastructure-for-global-shopping-holiday/)
* [Scaling Data Infrastructure at Spotify - Matti (Lepistö) Pehrs, Spotify](https://www.youtube.com/watch?v=cdsfRXr9pJU)
* [Scaling Pinterest - Marty Weiner, Pinterest’s founding engineer](https://www.youtube.com/watch?v=jQNCuD_hxdQ&list=RDhnpzNAPiC0E&index=11)
* [Scaling Slack - Bing Wei, Software Engineer (Infrastructure) at Slack](https://www.infoq.com/presentations/slack-scalability)
* [Scaling Backend at YouTube - Sugu Sougoumarane, SDE at YouTube](https://www.youtube.com/watch?v=5yDO-tmIoXY&feature=youtu.be)
* [Scaling Backend at Uber - Matt Ranney, Chief Systems Architect at Uber](https://www.youtube.com/watch?v=nuiLcWE8sPA)
* [Scaling Global CDN at Netflix - Dave Temkin, Director of Global Networks at Netflix](https://www.youtube.com/watch?v=tbqcsHg-Q_o)
* [Scaling Load Balancing Infra to Support 1.3 Billion Users at Facebook - Patrick Shuff, Production Engineer at Facebook](https://www.youtube.com/watch?v=bxhYNfFeVF4)
* [Scaling (an NSFW site) to 200 Million Views A Day And Beyond - Eric Pickup, Lead Platform Developer at MindGeek](https://www.youtube.com/watch?v=RlkCdM_f3p4)
* [Scaling Counting Infrastructure at Quora - Chun-Ho Hung and Nikhil Garg, SEs at Quora](https://www.infoq.com/presentations/quora-analytics)
* [Scaling Git at Microsoft - Saeed Noursalehi, Principal Program Manager at Microsoft](https://www.youtube.com/watch?v=g_MPGU_m01s)

## Book
* [Big Data, Web Ops & DevOps Ebooks - O'Reilly (Online - Free)](http://www.oreilly.com/webops/free/)
* [Google Site Reliability Engineering (Online - Free)](https://landing.google.com/sre/book.html)
* [Distributed Systems for Fun and Profit (Online - Free)](http://book.mixu.net/distsys/)
* [What Every Developer Should Know About SQL Performance (Online - Free)](https://use-the-index-luke.com/sql/table-of-contents)
* [Beyond the Twelve-Factor App - Exploring the DNA of Highly Scalable, Resilient Cloud Applications (Free)](http://www.oreilly.com/webops-perf/free/beyond-the-twelve-factor-app.csp)
* [Chaos Engineering - Building Confidence in System Behavior through Experiments (Free)](http://www.oreilly.com/webops-perf/free/chaos-engineering.csp?intcmp=il-webops-free-product-na_new_site_chaos_engineering_text_cta)
* [The Art of Scalability](http://theartofscalability.com/)
* [Designing Data-Intensive Applications](https://dataintensive.net/)
* [Web Scalability for Startup Engineers](https://www.goodreads.com/book/show/23615147-web-scalability-for-startup-engineers)
* [Scalability Rules: 50 Principles for Scaling Web Sites](http://scalabilityrules.com/)

## Special Thanks
* Jonas Bonér, CTO at Lightbend, for the [original inspiration](https://www.slideshare.net/jboner/scalability-availability-stability-patterns)

## License

[![CC-BY](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/by.svg)](https://creativecommons.org/licenses/by/4.0/)

This repo is created and maintained by [Binh Nguyen](http://binhnguyennus.com/). Feel free to use it at your convenience! Thank you & Happy coding :heart:
-												Adaptive Video Streaming at YouTube

											
										
										
											2018-08-10 21:27:56 -04:00
+								# High Scalability, High Availability, High Stability, High Performance, and High Intelligence System Design Patterns
-												Update README.md
											
										
										
											2017-12-26 22:47:31 -05:00
-												all about system, i.e every thing behind front-end layer

											
										
										
											2018-08-19 20:57:03 -04:00
+								An updated and curated list of selected readings to illustrate best practices in building high scalability, high availability, high stability, high performance, and high intelligence large-scale systems. Concepts are explained in the articles of prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to billions of users.
-												Update README.md
											
										
										
											2017-12-26 22:47:31 -05:00
-												all about system, i.e every thing behind front-end layer

											
										
										
											2018-08-19 20:57:03 -04:00
+								#### If your system goes slow :traffic_light:
-												minor edit

											
										
										
											2018-06-16 21:02:36 -04:00
+								> Understand your problems: scalability problem (fast for a single user but slow under heavy load) or performance problem (slow for a single user) by reviewing some [design principles](#principle) and checking how [scalability](#scalability) and [performance](#performance) problems are solved at tech companies. The section of [intelligence](#intelligence) are created for those who work with data and machine learning at big (data) and deep (learning) scale.
-												Update README.md
											
										
										
											2017-12-26 22:47:31 -05:00
-												all about system, i.e every thing behind front-end layer

											
										
										
											2018-08-19 20:57:03 -04:00
+								#### If your system goes down :construction:
 								> "Even if you lose all one day, you can build all over again if you retain your calm!" - Thuan Pham, CTO of Uber. So, keep calm and mind the [availability](#availability) and [stability](#stability) matters!
-												Update README.md
											
										
										
											2017-12-26 22:47:31 -05:00
-												all about system, i.e every thing behind front-end layer

											
										
										
											2018-08-19 20:57:03 -04:00
+								#### If you are having a system design interview :ocean:
-												minor edit

											
										
										
											2018-06-16 21:02:36 -04:00
+								> Look at some [interview notes](#interview) and [real-world architectures with completed diagrams](#architecture) to get a comprehensive view before designing your system on whiteboard. You can check some [talks](#talk) of engineers from tech giants to know how they build, scale, and optimize their systems. There are some selected [books](#book) for you (most of them are free)! Good luck :four_leaf_clover:
-												Add the System Design section, enjoy vacation in my Vietnam

											
										
										
											2018-03-10 07:58:39 -05:00
-												Add the section of Organization

											
										
										
											2018-06-02 23:53:32 -04:00
+								#### If you are building your dream team :ferris_wheel:
-												Sending an e-mail to millions of users (with Redis) at Drivy

											
										
										
											2018-06-14 06:32:59 -04:00
> The goal of scaling a team is not to grow its size but to increase its output and value. You can find out how tech companies reach that goal through hiring, management, organization, culture, and communication in the [organization](#organization) section.

#### Community power :mountain_cableway::aerial_tramway::mountain_cableway:
> Contributions are very welcome! You may want to take a look at the [contribution guidelines](CONTRIBUTING.md).

> If you find this project helpful, please share it in your chat groups, [on Twitter](https://ctt.ec/V8B2p), or [on Weibo](http://t.cn/RnjFLCB) so more people can be helped! Power is gained by sharing knowledge, not hoarding it. Thank you! :hibiscus:

## Contents
- [Principle](#principle)
- [Scalability](#scalability)
- [Availability](#availability)
- [Stability](#stability)
- [Performance](#performance)
- [Intelligence](#intelligence)
- [Architecture](#architecture)
- [Interview](#interview)
- [Organization](#organization)
- [Talk](#talk)
- [Book](#book)

## Principle
* [Designs, Lessons and Advice from Building Large Distributed Systems - Jeff Dean, Google](https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf)
* [How To Design A Good API and Why it Matters - Joshua Bloch, CMU & Google](https://www.infoq.com/presentations/effective-api-design)
* [On Efficiency, Reliability, Scaling - James Hamilton, VP at AWS](http://mvdirona.com/jrh/work/)
* [Things to Keep in Mind When Building a Platform for the Enterprise - Heidi Williams, VP Platform at Box](https://blog.box.com/blog/4-things-to-keep-in-mind-when-building-a-platform-for-the-enterprise/)
* [Principles of Chaos Engineering](https://www.usenix.org/conference/srecon17americas/program/presentation/rosenthal)
* [Finding the Order in Chaos](https://www.usenix.org/conference/srecon16/program/presentation/lueder)
* [The Twelve-Factor App](https://12factor.net/)
* [Clean Architecture](https://8thlight.com/blog/uncle-bob/2012/08/13/the-clean-architecture.html)
* [High Cohesion and Low Coupling](http://www.math-cs.gordon.edu/courses/cs211/lectures-2009/Cohesion,Coupling,MVC.pdf)
* [CAP Theorem and Trade-offs](http://robertgreiner.com/2014/08/cap-theorem-revisited/)
* [CP Databases and AP Databases](https://blog.andyet.com/2014/10/01/right-database)
* [Stateless vs Stateful Scalability](http://ithare.com/scaling-stateful-objects/)
* [Scale Up vs Scale Out](https://www.brianjgraf.com/2013/05/17/scalability-scale-up-scale-out-care/)
* [Scale Up vs Scale Out: Hidden Costs](https://blog.codinghorror.com/scaling-up-vs-scaling-out-hidden-costs/)
* [Best Practices for Scaling Out](https://blog.openshift.com/best-practices-for-horizontal-application-scaling/)
* [Best Practices for Continuous Delivery](https://techblog.rakuten.co.jp/2018/02/06/cd-the-best-practice/)
* [ACID and BASE](https://neo4j.com/blog/acid-vs-base-consistency-models-explained/)
* [Blocking/Non-Blocking and Sync/Async](https://blogs.msdn.microsoft.com/csliu/2009/08/27/io-concept-blockingnon-blocking-vs-syncasync/)
* [Performance and Scalability of Databases](https://use-the-index-luke.com/sql/testing-scalability)
* [Database Isolation Levels and Effects on Performance and Scalability](http://highscalability.com/blog/2011/2/10/database-isolation-levels-and-their-effects-on-performance-a.html)
* [The Probability of Data Loss in Large Clusters](https://martin.kleppmann.com/2017/01/26/data-loss-in-large-clusters.html)
* [SQL vs NoSQL](https://www.upwork.com/hiring/data/sql-vs-nosql-databases-whats-the-difference/)
* [SQL vs NoSQL - Lesson Learned at Salesforce](https://engineering.salesforce.com/sql-or-nosql-9eaf1d92545b)
* [NoSQL Databases: Survey and Decision Guidance](https://medium.baqend.com/nosql-databases-a-survey-and-decision-guidance-ea7823a822d)
* [How Sharding Works](https://medium.com/@jeeyoungk/how-sharding-works-b4dec46b3f6)
* [Consistent Hashing](http://www.tom-e-white.com/2007/11/consistent-hashing.html)
* [Consistent Hashing: Algorithmic Tradeoffs](https://medium.com/@dgryski/consistent-hashing-algorithmic-tradeoffs-ef6b8e2fcae8)
* [Don’t be tricked by the Hashing Trick](https://booking.ai/dont-be-tricked-by-the-hashing-trick-192a6aae3087)
* [Uniform Consistent Hashing at Netflix](https://medium.com/netflix-techblog/distributing-content-to-open-connect-3e3e391d4dc9)
* [Eventually Consistent - Werner Vogels, CTO at Amazon](https://www.allthingsdistributed.com/2008/12/eventually_consistent.html)
* [Cache is King](https://www.stevesouders.com/blog/2012/10/11/cache-is-king/)
* [Anti-Caching](http://the-paper-trail.org/blog/paper-notes-anti-caching/)
* [Understand Latency](http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it)
* [Latency Numbers Every Programmer Should Know](http://norvig.com/21-days.html#answers)
* [The Calculus of Service Availability](https://queue.acm.org/detail.cfm?id=3096459&__s=dnkxuaws9pogqdnxmx8i)
* [Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO](http://highscalability.com/blog/2014/5/12/4-architecture-issues-when-scaling-web-applications-bottlene.html)
* [Common Bottlenecks](http://highscalability.com/blog/2012/5/16/big-list-of-20-common-bottlenecks.html)
* [Life Beyond Distributed Transactions](https://queue.acm.org/detail.cfm?id=3025012)
* [Relying on Software to Redirect Traffic Reliably at Various Layers](https://www.usenix.org/conference/srecon15/program/presentation/taveira)
* [Breaking Things on Purpose](https://www.usenix.org/conference/srecon17americas/program/presentation/andrus)
* [Avoid Over Engineering](https://medium.com/@rdsubhas/10-modern-software-engineering-mistakes-bc67fbef4fc8)
* [Scalability Worst Practices](https://www.infoq.com/articles/scalability-worst-practices)
* [Use Solid Technologies - Don’t Re-invent the Wheel - Keep It Simple!](https://medium.com/@DataStax/instagram-engineerings-3-rules-to-a-scalable-cloud-application-architecture-c44afed31406)
* [Simplicity by Distributing Complexity](https://jobs.zalando.com/tech/blog/simplicity-by-distributing-complexity/)
* [Why Over-Reusing is Bad](http://tech.transferwise.com/why-over-reusing-is-bad/)
* [Performance is a Feature](https://blog.codinghorror.com/performance-is-a-feature/)
* [Make Performance Part of Your Workflow](https://codeascraft.com/2014/12/11/make-performance-part-of-your-workflow/)
* [The Benefits of Server Side Rendering over Client Side Rendering](https://medium.com/walmartlabs/the-benefits-of-server-side-rendering-over-client-side-rendering-5d07ff2cefe8)
* [Writing Code that Scales](https://blog.rackspace.com/writing-code-that-scales)
* [Automate and Abstract: Lessons at Facebook](https://architecht.io/lessons-from-facebook-on-engineering-for-scale-f5716f0afc7a)
* [AWS Do's and Don'ts](https://8thlight.com/blog/sarah-sunday/2017/09/15/aws-dos-and-donts.html)
* [(UI) Design Doesn’t Scale - Stanley Wood, Design Director at Spotify](https://medium.com/@hellostanley/design-doesnt-scale-4d81e12cbc3e)
* [Linux Performance](http://www.brendangregg.com/linuxperf.html)
* [Building Fast and Resilient Web Applications - Ilya Grigorik](https://www.igvita.com/2016/05/20/building-fast-and-resilient-web-applications/)
* [Accept Partial Failures, Minimize Service Loss](https://www.usenix.org/conference/srecon17asia/program/presentation/wang_daxin)
* [RACI (Responsible, Accountable, Consulted, Informed) at Etsy](https://codeascraft.com/2018/01/04/selecting-a-cloud-provider/)
* [Design for Loose-coupling](http://bulgerpartners.com/how-loosely-coupled-architectures-are-helping-the-modernization-of-legacy-software/)
* [Design for Resiliency](http://highscalability.com/blog/2012/12/31/designing-for-resiliency-will-be-so-2013.html)
* [Design for Self-healing](https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/self-healing)
* [Design for Scaling Out](https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/scale-out)
* [Design for Evolution](https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/design-for-evolution)
* [Mistakes to Avoid while Creating an Internal Product at Skyscanner](https://medium.com/@SkyscannerEng/9-mistakes-to-avoid-while-creating-an-internal-product-63d579b00b1a)
* [Learn from Mistakes at Reddit](http://highscalability.com/blog/2013/8/26/reddit-lessons-learned-from-mistakes-made-scaling-to-1-billi.html)
* [Code Review Best Practices at Palantir](https://medium.com/@palantir/code-review-best-practices-19e02780015f)

## Scalability
* [Microservices and Orchestration](https://hackernoon.com/microservices-are-hard-an-invaluable-guide-to-microservices-2d06bd7bcf5d)
	* [Microservices Resource Guide - Martin Fowler, Chief Scientist at ThoughtWorks](https://martinfowler.com/microservices/)
	* [Microservices Patterns](http://microservices.io/patterns/)
	* [Advantages and Drawbacks of Microservices](https://cloudacademy.com/blog/microservices-architecture-challenge-advantage-drawback/)
	* [Microservices Scale Cube](http://microservices.io/articles/scalecube.html)
	* [Thinking Inside the Container (8 parts) at Riot Games](https://engineering.riotgames.com/news/thinking-inside-container)
	* [Containerization at Pinterest](https://medium.com/@Pinterest_Engineering/containerization-at-pinterest-92295347f2f3)
	* [Techniques for Splitting Up a Codebase into Microservices and Artifacts at LinkedIn](https://engineering.linkedin.com/blog/2016/02/q-a-with-jim-brikman--splitting-up-a-codebase-into-microservices)
	* [The Evolution of Container Usage at Netflix](https://medium.com/netflix-techblog/the-evolution-of-container-usage-at-netflix-3abfc096781b)
	* [Dockerizing MySQL at Uber](https://eng.uber.com/dockerizing-mysql/)
	* [Testing of Microservices at Spotify](https://labs.spotify.com/2018/01/11/testing-of-microservices/)
	* [Organize Monolith Before Breaking it into Services at Weebly](https://medium.com/weebly-engineering/how-to-organize-your-monolith-before-breaking-it-into-services-69cbdb9248b0)
	* [Lessons learned running Docker in production at Treehouse](https://medium.com/treehouse-engineering/lessons-learned-running-docker-in-production-5dce99ece770)
	* [Inside a SoundCloud Microservice](https://developers.soundcloud.com/blog/inside-a-soundcloud-microservice)
	* [Operate Kubernetes Reliably at Stripe](https://stripe.com/blog/operating-kubernetes)
	* [Kubernetes Traffic Routing (2 parts) at Rakuten](https://techblog.rakuten.co.jp/2017/09/28/k8s-routing2/)
	* [Agrarian-Scale Kubernetes (3 parts) at New York Times](https://open.nytimes.com/agrarian-scale-kubernetes-part-3-ee459887ed7e)
	* [Nanoservices at BBC Online](https://medium.com/bbc-design-engineering/powering-bbc-online-with-nanoservices-727840ba015b)
	* [PowerfulSeal: Testing Tool for Kubernetes Clusters at Bloomberg](https://www.techatbloomberg.com/blog/powerfulseal-testing-tool-kubernetes-clusters/)
	* [Conductor: Microservices Orchestrator at Netflix](https://medium.com/netflix-techblog/netflix-conductor-a-microservices-orchestrator-2e8d4771bf40)
	* [Making 10x Improvement in Release Times with Docker and Amazon ECS at Nextdoor](https://engblog.nextdoor.com/how-nextdoor-made-a-10x-improvement-in-release-times-with-docker-and-amazon-ecs-35aab52b726f)
	* [K8Guard: Auditing System for Kubernetes Clusters at Target.com](http://target.github.io/infrastructure/k8guard-the-guardian-angel-for-kuberentes)
	* [Deconstructing Monolithic Applications into (Kafka-driven) Services at Heroku](https://blog.heroku.com/monolithic-applications-into-services)
	* [Docker Containers that Power Over 100,000 Online Shops at Shopify](https://shopifyengineering.myshopify.com/blogs/engineering/docker-at-shopify-how-we-built-containers-that-power-over-100-000-online-shops)
* [Distributed Caching](https://www.wix.engineering/single-post/scaling-to-100m-to-cache-or-not-to-cache)
	* [Read-Through, Write-Through, Write-Behind, and Refresh-Ahead Caching](https://docs.oracle.com/cd/E15357_01/coh.360/e15723/cache_rtwtwbra.htm#COHDG5177)
	* [Eviction Policy and Expiration Policy](http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html)
	* [EVCache: Caching for a Global Netflix](https://medium.com/netflix-techblog/caching-for-a-global-netflix-7bcc457012f1)
	* [Memsniff: Robust Memcache Traffic Analyzer at Box](https://blog.box.com/blog/introducing-memsniff-robust-memcache-traffic-analyzer/)
	* [Caching with Consistent Hashing and Cache Smearing at Etsy](https://codeascraft.com/2017/11/30/how-etsy-caches/)
	* [Analysis of Photo Caching at Facebook](https://code.facebook.com/posts/220956754772273/an-analysis-of-facebook-photo-caching/)
	* [Cache Efficiency Exercise at Facebook](https://code.facebook.com/posts/964122680272229/web-performance-cache-efficiency-exercise/)
	* [tCache: Scalable Data-aware Java Caching at Trivago](http://tech.trivago.com/2015/10/15/tcache/)
	* [Reduce Memcached Memory Usage by 50% at Trivago](http://tech.trivago.com/2017/12/19/how-trivago-reduced-memcached-memory-usage-by-50/)
	* [Caching Internal Service Calls at Yelp](https://engineeringblog.yelp.com/2018/03/caching-internal-service-calls-at-yelp.html)
	* [Scaling Live Streaming for Large Events (with Distributed Cache) at Hulu](https://medium.com/hulu-tech-blog/scaling-hulu-live-streaming-for-large-events-march-madness-and-beyond-bedd73874f2)
	* [Estimating the Cache Efficiency using Big Data at Allegro](https://allegro.tech/2017/01/estimating-the-cache-efficiency-using-big-data.html)
	* [Caching (with Hashing) at Zenefits](https://engineering.zenefits.com/2016/02/basic-infrastructure-patterns/)
	* [Distributed Cache (Akka, Kubernetes) at Zalando](https://jobs.zalando.com/tech/blog/distributed-cache-akka-kubernetes/)
	* [Application Data Caching from RAM to SSD at Netflix](https://medium.com/netflix-techblog/evolution-of-application-data-caching-from-ram-to-ssd-a33d6fa7a690)
* [Distributed Tracking and Tracing](https://www.oreilly.com/ideas/understanding-the-value-of-distributed-tracing)
	* [Tracking Service Infrastructure at Scale at Shopify](https://www.usenix.org/conference/srecon17americas/program/presentation/arthorne)
	* [Distributed Tracing with Pintrace at Pinterest](https://medium.com/@Pinterest_Engineering/distributed-tracing-at-pinterest-with-new-open-source-tools-a4f8a5562f6b)
	* [Distributed Tracing at HelloFresh](https://engineering.hellofresh.com/scaling-hellofresh-distributed-tracing-7b182928247d)
	* [Analyzing Distributed Trace Data at Pinterest](https://medium.com/@Pinterest_Engineering/analyzing-distributed-trace-data-6aae58919949)
	* [Distributed Tracing at Uber](https://eng.uber.com/distributed-tracing/)
	* [JVM Profiler: Tracing Distributed JVM Applications at Uber](https://eng.uber.com/jvm-profiler/)
	* [Data Checking at Dropbox](https://www.usenix.org/conference/srecon17asia/program/presentation/mah)
	* [Tracing Distributed Systems at Showmax](https://tech.showmax.com/2016/10/tracing-distributed-systems-at-showmax/)
	* [Real-time Distributed Tracing at LinkedIn](https://engineering.linkedin.com/distributed-service-call-graph/real-time-distributed-tracing-website-performance-and-efficiency)
	* [Zipkin: Distributed Systems Tracing at Twitter](https://blog.twitter.com/engineering/en_us/a/2012/distributed-systems-tracing-with-zipkin.html)
	* [osquery Across the Enterprise at Palantir](https://medium.com/@palantir/osquery-across-the-enterprise-3c3c9d13ec55)
+								* [Distributed Logging](https://blog.treasuredata.com/blog/2016/08/03/distributed-logging-architecture-in-the-container-era/)
+									* [The Problem with Logging - Jeff Atwood](https://blog.codinghorror.com/the-problem-with-logging/)
+									* [The Log: What Every Software Engineer Should Know](https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying)
 									* [Using Logs to Build a Solid Data Infrastructure - Martin Kleppmann](https://www.confluent.io/blog/using-logs-to-build-a-solid-data-infrastructure-or-why-dual-writes-are-a-bad-idea/)
+									* [Scalable and Reliable Log Ingestion at Pinterest](https://medium.com/@Pinterest_Engineering/scalable-and-reliable-data-ingestion-at-pinterest-b921c2ee8754)
+									* [Building DistributedLog at Twitter: High-performance replicated log service](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2015/building-distributedlog-twitter-s-high-performance-replicated-log-servic.html)
+									* [Logging Service with Spark at CERN Accelerator](https://databricks.com/blog/2017/12/14/the-architecture-of-the-next-cern-accelerator-logging-service.html)
+									* [Logging and Aggregation at Quora](https://engineering.quora.com/Logging-and-Aggregation-at-Quora)
+									* [BookKeeper: Distributed Log Storage at Yahoo](https://yahooeng.tumblr.com/post/109908973316/bookkeeper-yahoos-distributed-log-storage-is)
+									* [LogDevice: Distributed Data Store for Logs at Facebook](https://code.facebook.com/posts/357056558062811/logdevice-a-distributed-data-store-for-logs/)
+									* [LogFeeder: Log Collection System at Yelp](https://engineeringblog.yelp.com/2018/03/introducing-logfeeder.html)
+									* [Collection and Analysis of Daemon Logs at Badoo](https://badoo.com/techblog/blog/2016/06/06/collection-and-analysis-of-daemon-logs-at-badoo/)
+								* [Distributed Security (Monitoring, Authentication, etc.)](https://msdn.microsoft.com/en-us/library/cc767123.aspx)
+									* [Approach to Security at Scale at Dropbox](https://blogs.dropbox.com/tech/2018/02/security-at-scale-the-dropbox-approach/)
+									* [Aardvark and Repokid: AWS Least Privilege for Distributed, High-Velocity Development at Netflix](https://medium.com/netflix-techblog/introducing-aardvark-and-repokid-53b081bf3a7e)
 									* [LISA: Distributed Firewall at LinkedIn](https://www.slideshare.net/MikeSvoboda/2017-lisa-linkedins-distributed-firewall-dfw)
 									* [Distributed Security Alerting at Slack](https://slack.engineering/distributed-security-alerting-c89414c992d6)
+									* [BinaryAlert: Real-time Serverless Malware Detection at Airbnb](https://medium.com/airbnb-engineering/binaryalert-real-time-serverless-malware-detection-ca44370c1b90)
+									* [Scalable IAM Architecture to Secure Access to 100 AWS Accounts at Segment](https://segment.com/blog/secure-access-to-100-aws-accounts/)
+									* [OAuth Audit Toolbox at Indeed](http://engineering.indeedblog.com/blog/2018/04/oaudit-toolbox/)
+									* [Active Directory Password Blacklisting at Yelp](https://engineeringblog.yelp.com/2018/04/ad-password-blacklisting.html)
 									* [Secure Infrastructure to Store Bitcoin in the Cloud at Coinbase](https://engineering.coinbase.com/how-coinbase-builds-secure-infrastructure-to-store-bitcoin-in-the-cloud-30a6504e40ba)
+									* [Syscall Auditing at Scale at Slack](https://slack.engineering/syscall-auditing-at-scale-e6a3ca8ac1b8)
+									* [Athenz: Fine-Grained, Role-Based Access Control at Yahoo](https://yahooeng.tumblr.com/post/160481899076/open-sourcing-athenz-fine-grained-role-based)
+									* [WebAuthn Support for Secure Sign In at Dropbox](https://blogs.dropbox.com/tech/2018/05/introducing-webauthn-support-for-secure-dropbox-sign-in/)
+									* [Job-based Forecasting Workflow for Observability Anomaly Detection at Uber](https://eng.uber.com/observability-anomaly-detection/)
+									* [Alibaba Monitoring System](https://www.usenix.org/conference/srecon18asia/presentation/xinchi)
+									* [Smart Monitoring System for Anomaly Detection on Business Trends at Alibaba](https://www.usenix.org/conference/srecon17asia/program/presentation/wang)
+									* [Security Development Lifecycle (SDL) at Slack](https://slack.engineering/moving-fast-and-securing-things-540e6c5ae58a)
+									* [Unprivileged Container Builds at Kinvolk](https://kinvolk.io/blog/2018/04/towards-unprivileged-container-builds/)
+									* [Diffy: Differencing Engine for Digital Forensics in the Cloud at Netflix](https://medium.com/netflix-techblog/netflix-sirt-releases-diffy-a-differencing-engine-for-digital-forensics-in-the-cloud-37b71abd2698)
+								* [Distributed Messaging, Queuing, and Event Streaming](https://arxiv.org/pdf/1704.00411.pdf)
+									* [Samza: Stream Processing System for Latency Insights at LinkedIn](https://engineering.linkedin.com/blog/2018/04/samza-aeon--latency-insights-for-asynchronous-one-way-flows)
+									* [Delaying Asynchronous Message Processing with RabbitMQ at Indeed](http://engineering.indeedblog.com/blog/2017/06/delaying-messages/)
+									* [Bullet: Forward-Looking Query Engine for Streaming Data at Yahoo](https://yahooeng.tumblr.com/post/161855616651/open-sourcing-bullet-yahoos-forward-looking)
+									* [EventHorizon: Tool for Watching Events Streaming at Etsy](https://codeascraft.com/2018/05/29/the-eventhorizon-saga/)
+									* [Benchmarking Streaming Computation Engines at Yahoo](https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at)
+									* [Cherami: Message Queue System for Transporting Async Tasks at Uber](https://eng.uber.com/cherami/)
+									* [Qyu: Distributed Task Execution System for Complex Workflows at FindHotel](http://blog.findhotel.net/2018/03/qyu-a-distributed-task-execution-system-for-complex-workflows/)
+									* [Messaging Service at Riot Games](https://engineering.riotgames.com/news/riot-messaging-service)
+									* [Event Stream Analytics with Druid (Search Engine meet Column DB) at Walmart](https://medium.com/walmartlabs/event-stream-analytics-at-walmart-with-druid-dcf1a37ceda7)
+									* [Debugging Production with Event Logging at Zillow](https://www.zillow.com/engineering/debugging-production-event-logging/)
+									* [Kafka the Message Broker](https://martin.kleppmann.com/papers/kafka-debull15.pdf)
 										* [When to use RabbitMQ or Kafka](https://content.pivotal.io/blog/understanding-when-to-use-rabbitmq-or-apache-kafka)
 										* [Kafka at Scale at LinkedIn](https://engineering.linkedin.com/kafka/running-kafka-scale)
 										* [Real-time Data Pipeline with Kafka at Yelp](https://engineeringblog.yelp.com/2016/07/billions-of-messages-a-day-yelps-real-time-data-pipeline.html)
 										* [Building Reliable Reprocessing and Dead Letter Queues with Kafka at Uber](https://eng.uber.com/reliable-reprocessing/)
 										* [Audit Kafka End-to-End at Uber](https://eng.uber.com/chaperone/)
 										* [Kafka for PaaS at Rakuten](https://techblog.rakuten.co.jp/2016/01/28/rakuten-paas-kafka/)
 										* [Publishing with Kafka at The New York Times](https://open.nytimes.com/publishing-with-apache-kafka-at-the-new-york-times-7f0e3b7d2077)
 										* [Kafka Streams on Heroku](https://blog.heroku.com/kafka-streams-on-heroku)
 										* [Kafka in Platform Events Architecture at Salesforce](https://engineering.salesforce.com/how-apache-kafka-inspired-our-platform-events-architecture-2f351fe4cf63)
 										* [Kafka in Socket Architecture (with a Comprehensive Comparison Table) at Trello](https://tech.trello.com/why-we-chose-kafka/)
 										* [Analytics Pipeline (Kafka, Dataflow, BigQuery) at Teads.tv](http://highscalability.com/blog/2018/4/9/give-meaning-to-100-billion-events-a-day-the-analytics-pipel.html)
 									* [Data Deduplication Techniques](https://en.wikipedia.org/wiki/Data_deduplication)
 										* [Exactly-once Semantics are Possible: How Kafka Does it](https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/)
+										* [Real-time Deduping at Scale with Kafka-based Pipeline at Tapjoy](http://eng.tapjoy.com/blog-list/real-time-deduping-at-scale)
+										* [Delivering Billions of Messages Exactly Once: Deduping at Segment](https://segment.com/blog/exactly-once-delivery/)
 										* [Deduplication For Efficient Storage (From 50 PB To 32 PB) At Mail.Ru](https://medium.com/@andrewsumin/efficient-storage-how-we-went-down-from-50-pb-to-32-pb-99f9c61bf6b4)
+								* [Distributed Searching](http://nwds.cs.washington.edu/files/nwds/pdf/Distributed-WR.pdf)
 									* [Search Architecture of Instagram](https://engineering.instagram.com/search-architecture-eeb34a936d3a)
 									* [Search Architecture of eBay](http://www.cs.otago.ac.nz/homepages/andrew/papers/2017-8.pdf)
+									* [Improving Search Engine Efficiency by over 25% at eBay](https://www.ebayinc.com/stories/blogs/tech/making-e-commerce-search-faster/)
+									* [Search Federation Architecture at LinkedIn (2018)](https://engineering.linkedin.com/blog/2018/03/search-federation-architecture-at-linkedin)
+									* [Search at Slack](https://slack.engineering/search-at-slack-431f8c80619e)
+									* [Search and Recommendations at DoorDash](https://blog.doordash.com/powering-search-recommendations-at-doordash-8310c5cfd88c)
+									* [Search Service at Twitter (2014)](https://blog.twitter.com/engineering/en_us/a/2014/building-a-complete-tweet-index.html)
 									* [Nautilus: Travel Search Engine of Expedia](http://blog.expedia.com/expedias-nautilus-travel-search-engine-overview-and-applications/)
 									* [Galene: Search Architecture of LinkedIn](https://engineering.linkedin.com/search/did-you-mean-galene)
+									* [Manas: High Performing Customized Search System at Pinterest](https://medium.com/@Pinterest_Engineering/manas-a-high-performing-customized-search-system-cf189f6ca40f)
 									* [Sherlock: Near Real Time Search Indexing at Flipkart](https://tech.flipkart.com/sherlock-near-real-time-search-indexing-95519783859d)
+									* [Nebula: Storage Platform to Build Search Backends at Airbnb](https://medium.com/airbnb-engineering/nebula-as-a-storage-platform-to-build-airbnbs-search-backends-ecc577b05f06)
+									* [ELK (Elasticsearch, Logstash, Kibana) Stack](https://logz.io/blog/15-tech-companies-chose-elk-stack/)
+										* [Predictions in Real Time with ELK at Uber](https://eng.uber.com/elk/)
+										* [Scaling Elasticsearch Clusters at Uber](https://www.infoq.com/presentations/uber-elasticsearch-clusters?utm_source=presentations_about_Case_Study&utm_medium=link&utm_campaign=Case_Study)
+										* [Elasticsearch Performance Tuning Practice at eBay](https://www.ebayinc.com/stories/blogs/tech/elasticsearch-performance-tuning-practice-at-ebay/)
 										* [Elasticsearch at Kickstarter](https://kickstarter.engineering/elasticsearch-at-kickstarter-db3c487887fc)
 										* [Distributed Troubleshooting Platform with ELK Stack at Target.com](http://target.github.io/infrastructure/distributed-troubleshooting)
 										* [ELK at Robinhood](https://robinhood.engineering/taming-elk-4e1349f077c3)
+										* [Log Parsing with Logstash and Google Protocol Buffers at Trivago](https://tech.trivago.com/2016/01/19/logstash_protobuf_codec/)
+										* [Fast Order Search using Data Pipeline and Elasticsearch at Yelp](https://engineeringblog.yelp.com/2018/06/fast-order-search.html)
+										* [Sharding out Elasticsearch at Vinted](http://engineering.vinted.com/2017/06/05/sharding-out-elasticsearch/)
+								* [Distributed Storage](http://highscalability.com/blog/2011/11/1/finding-the-right-data-solution-for-your-application-in-the.html)
+									* [In-memory Storage](https://medium.com/@denisanikin/what-an-in-memory-database-is-and-how-it-persists-data-efficiently-f43868cff4c1)
+										* [Introduction to In-memory Data - Viktor Gamov, Solutions Architect at Hazelcast](https://www.infoq.com/presentations/in-memory-data)
+										* [MemSQL Architecture - The Fast (MVCC, InMem, LockFree, CodeGen) And Familiar (SQL)](http://highscalability.com/blog/2012/8/14/memsql-architecture-the-fast-mvcc-inmem-lockfree-codegen-and.html)
+										* [Optimizing Memcached Efficiency at Quora](https://engineering.quora.com/Optimizing-Memcached-Efficiency)
+										* [Real-Time Data Warehouse with MemSQL on Cisco UCS](https://blogs.cisco.com/datacenter/memsql)
+										* [Moving to MemSQL (with Horizontally Scalable, ACID Compliant, MySQL Compatibility) at Tapjoy](http://eng.tapjoy.com/blog-list/moving-to-memsql)
+										* [MemSQL and Kinesis for Real-time Insights at Disney-ABC TV](https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/68131)
+									* [Durable Storage (S3, HDFS)](http://www.datacenterknowledge.com/archives/2013/10/04/object-storage-the-future-of-scale-out)
 										* [Scaling HDFS at Uber](https://eng.uber.com/scaling-hdfs/)
+										* [Reasons for Choosing S3 over HDFS at Databricks](https://databricks.com/blog/2017/05/31/top-5-reasons-for-choosing-s3-over-hdfs.html)
 										* [Quantcast File System on Amazon S3](https://www.quantcast.com/blog/quantcast-file-system-on-amazon-s3/)
+										* [Data Sink with S3 at Deliveroo](https://deliveroo.engineering/2017/06/15/data-sink.html)
+										* [Using S3 in Netflix Chukwa](https://medium.com/netflix-techblog/evolution-of-the-netflix-data-pipeline-da246ca36905)
+										* [Yahoo Cloud Object Store - Object Storage at Exabyte Scale](https://yahooeng.tumblr.com/post/116391291701/yahoo-cloud-object-store-object-storage-at)
+										* [Ambry: Distributed Immutable Object Store at LinkedIn](https://www.usenix.org/conference/srecon17americas/program/presentation/shenoy)
+										* [Hammerspace: Persistent, Concurrent, Off-heap Storage at Airbnb](https://medium.com/airbnb-engineering/hammerspace-persistent-concurrent-off-heap-storage-3db39bb04472)
 								* [Relational Databases (MySQL, MSSQL, PostgreSQL)](https://www.mysql.com/products/cluster/scalability.html)
+									* [Stop Using Shiny New Things and Love MySQL - Lesson at Pinterest](https://medium.com/@Pinterest_Engineering/learn-to-stop-using-shiny-new-things-and-love-mysql-3e1613c2ce14)
+									* [Microsoft SQL versus MySQL](https://www.upwork.com/hiring/data/sql-vs-mysql-which-relational-database-is-right-for-you/)
 									* [SQL Database Performance Tuning](https://www.toptal.com/sql-server/sql-database-tuning-for-developers)
 									* [Scaling PostgreSQL Using CUDA](http://highscalability.com/blog/2009/5/28/scaling-postgresql-using-cuda.html)
 									* [Scaling Distributed Joins](http://blog.memsql.com/scaling-distributed-joins/)
 									* [MySQL System Design at Booking.com](https://www.percona.com/live/mysql-conference-2015/sessions/bookingcom-evolution-mysql-system-design)
 									* [PostgreSQL at Twitch](https://blog.twitch.tv/how-twitch-uses-postgresql-c34aa9e56f58)
 									* [Scaling MySQL-based Financial Reporting System at Airbnb](https://medium.com/airbnb-engineering/tracking-the-money-scaling-financial-reporting-at-airbnb-6d742b80f040)
 									* [Scaling MySQL at Wix](https://www.wix.engineering/single-post/scaling-to-100m-mysql-is-a-better-nosql)
+									* [MaxScale (MySQL) Database Proxy at Airbnb](https://medium.com/airbnb-engineering/unlocking-horizontal-scalability-in-our-web-serving-tier-d907449cdbcf)
+									* [Switching from Postgres to MySQL at Uber](https://eng.uber.com/mysql-migration/)
 									* [Handling Growth with Postgres at Instagram](https://engineering.instagram.com/handling-growth-with-postgres-5-tips-from-instagram-d5d7e7ffdfcb)
 									* [Scaling the Analytics Database (Postgres) at TransferWise](http://tech.transferwise.com/scaling-our-analytics-database/)
+									* [Updating a 50 Terabyte PostgreSQL Database at Adyen](https://medium.com/adyen/updating-a-50-terabyte-postgresql-database-f64384b799e7)
+									* [Replication](https://m.alphasights.com/a-primer-on-database-replication-381b319cd032)
 										* [MySQL Parallel Replication (4 parts) at Booking.com](https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-4-annex-under-the-hood-eb456cf8b2fb)
 										* [Mitigating MySQL Replication Lag and Reducing Read Load at Github](https://githubengineering.com/mitigating-replication-lag-and-reducing-read-load-with-freno/)
 										* [Black-Box Auditing: Verifying End-to-End Replication Integrity between MySQL and Redshift at Yelp](https://engineeringblog.yelp.com/2018/04/black-box-auditing.html)
+										* [Monitoring MySQL Delayed Replication at IMVU](https://engineering.imvu.com/2013/01/09/monitoring-delayed-replication-with-a-focus-on-mysql/)
+										* [Partitioning Main MySQL Database at Airbnb](https://medium.com/airbnb-engineering/how-we-partitioned-airbnb-s-main-database-in-two-weeks-55f7e006ff21)
+										* [Herb: Multi-DC Replication Engine for Schemaless Datastore at Uber](https://eng.uber.com/herb-datacenter-replication/)
+									* [Sharding (Horizontal Partitioning)](https://www.educative.io/collection/page/5668639101419520/5649050225344512/5146118144917504)
 										* [Sharding MySQL at Pinterest](https://medium.com/@Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f)
 										* [Sharding MySQL at MailChimp](https://devs.mailchimp.com/blog/using-shards-to-accommodate-millions-of-users/)
+										* [Sharding MySQL at Twilio](https://www.twilio.com/engineering/2014/06/26/how-we-replaced-our-data-pipeline-with-zero-downtime)
+										* [Sharding MySQL (3 parts) at Evernote](https://blog.evernote.com/tech/2015/10/08/the-great-shard-migration-part-ii/)
+										* [Sharding Layer of Schemaless Datastore at Uber](https://eng.uber.com/schemaless-rewrite/)
+										* [Sharding & IDs at Instagram](https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c)
 										* [Solr: Improving Performance for Batch Indexing at Box](https://blog.box.com/blog/solr-improving-performance-batch-indexing/)
+								* [NoSQL Databases](https://www.thoughtworks.com/insights/blog/nosql-databases-overview)
+									* [Key-Value Databases (DynamoDB, Voldemort, Manhattan)](http://highscalability.com/anti-rdbms-list-distributed-key-value-stores)
+										* [Scaling Mapbox infrastructure with DynamoDB Streams](https://blog.mapbox.com/scaling-mapbox-infrastructure-with-dynamodb-streams-d53eabc5e972)
+										* [Manhattan: Twitter’s distributed key-value database](https://blog.twitter.com/engineering/en_us/a/2014/manhattan-our-real-time-multi-tenant-distributed-database-for-twitter-scale.html)
+										* [Sherpa: Yahoo’s distributed NoSQL key-value store](https://yahooeng.tumblr.com/post/120730204806/sherpa-scales-new-heights)
+										* [Riak inside Chat Service Architecture at Riot Games](https://engineering.riotgames.com/news/chat-service-architecture-persistence)
+										* [MPH: Fast and Compact Immutable Key-Value Stores at Indeed](http://engineering.indeedblog.com/blog/2018/02/indeed-mph/)
+										* [zBase: High Performance, Elastic, Distributed Key-Value Store at Zynga](https://www.zynga.com/blogs/engineering/zbase-high-performance-elastic-distributed-key-value-store-2)
+										* [Venice: Distributed Key-Value Database at LinkedIn](https://engineering.linkedin.com/blog/2017/02/building-venice-with-apache-helix)
+										* [DynamoDB Hot Shards at Segment](https://segment.com/blog/the-million-dollar-eng-problem/)
+									* [Columnar Databases (Cassandra, HBase, Redshift)](https://aws.amazon.com/nosql/columnar/)
+										* [Consistent Hashing in Cassandra](https://blog.imaginea.com/consistent-hashing-in-cassandra/)
+										* [Understanding Gossip (Cassandra Internals)](https://www.youtube.com/watch?v=FuP1Fvrv6ZQ)
+										* [When NOT to use Cassandra?](https://stackoverflow.com/questions/2634955/when-not-to-use-cassandra)
+										* [Avoid Pitfalls in Scaling Cassandra Cluster at Walmart](https://medium.com/walmartlabs/avoid-pitfalls-in-scaling-your-cassandra-cluster-lessons-and-remedies-a71ca01f8c04)
 										* [Storing Images in Cassandra at Walmart](https://medium.com/walmartlabs/building-object-store-storing-images-in-cassandra-walmart-scale-a6b9c02af593)
+										* [Cassandra at Instagram](https://www.slideshare.net/DataStax/cassandra-at-instagram-2016)
+										* [Scale Ad Analytics with Cassandra at Yelp](https://engineeringblog.yelp.com/2016/08/how-we-scaled-our-ad-analytics-with-cassandra.html)
 										* [Store Billions of Messages with Cassandra at Discord](https://blog.discordapp.com/how-discord-stores-billions-of-messages-7fa6ec7ee4c7)
 										* [Scale to 100+ Million Reads/Writes using Spark and Cassandra at Dream11](https://medium.com/dream11-tech-blog/leaderboard-dream11-4efc6f93c23e)
+										* [Moving Food Feed from Redis to Cassandra at Zomato](https://www.zomato.com/blog/how-we-moved-our-food-feed-from-redis-to-cassandra)
+										* [Benchmarking Cassandra Scalability on AWS at Netflix](https://medium.com/netflix-techblog/benchmarking-cassandra-scalability-on-aws-over-a-million-writes-per-second-39f45f066c9e)
+										* [Imgur Notification: From MySQL to HBase at Imgur](https://blog.imgur.com/2015/09/15/tech-tuesday-imgur-notifications-from-mysql-to-hbase/)
 										* [Improving HBase Backup Efficiency at Pinterest](https://medium.com/@Pinterest_Engineering/improving-hbase-backup-efficiency-at-pinterest-86159da4b954)
+										* [HBase Practice at Xiaomi](https://www.slideshare.net/HBaseCon/hbase-practice-at-xiaomi)
+										* [ClickHouse - Open Source Distributed Column Database at Yandex](https://clickhouse.yandex/)
+										* [Scaling Redshift without Scaling Costs at GIPHY](https://engineering.giphy.com/scaling-redshift-without-scaling-costs/)
+										* [Service Decomposition at Scale (with Cassandra) at Intuit QuickBooks](https://quickbooks-engineering.intuit.com/service-decomposition-at-scale-70405ac2f637)
+										* [Cassandra for Keeping Counts In Sync at SoundCloud](https://developers.soundcloud.com/blog/keeping-counts-in-sync)
+									* [Document Databases (MongoDB, SimpleDB, CouchDB)](https://msdn.microsoft.com/en-us/magazine/hh547103.aspx)
+										* [eBay: Building Mission-Critical Multi-Data Center Applications with MongoDB](https://www.mongodb.com/blog/post/ebay-building-mission-critical-multi-data-center-applications-with-mongodb)
+										* [MongoDB at Baidu: Multi-Tenant Cluster Storing 200+ Billion Documents across 160 Shards](https://www.mongodb.com/blog/post/mongodb-at-baidu-powering-100-apps-across-600-nodes-at-pb-scale)
+										* [Migrating Mongo Data at Addepar](https://medium.com/build-addepar/migrating-mountains-of-mongo-data-63e530539952)
+										* [The AWS and MongoDB Infrastructure of Parse (acquired by Facebook)](https://medium.baqend.com/parse-is-gone-a-few-secrets-about-their-infrastructure-91b3ab2fcf71)
+										* [Migrating Mountains of Mongo Data at Addepar](https://medium.com/build-addepar/migrating-mountains-of-mongo-data-63e530539952)
+										* [Couchbase Ecosystem at LinkedIn](https://engineering.linkedin.com/blog/2017/12/couchbase-ecosystem-at-linkedin)
+										* [SimpleDB at Zendesk](https://medium.com/zendesk-engineering/resurrecting-amazon-simpledb-9404034ec506)
+									* [Graph Databases](https://www.ibm.com/developerworks/library/cl-graph-database-1/index.html)
 										* [Handling Billions of Edges in a Graph Database](https://www.infoq.com/presentations/graph-database-scalability)
+										* [Neo4j case studies with Walmart, eBay, Airbnb, NASA, etc.](https://neo4j.com/customers/)
 										* [FlockDB: Distributed Graph Database for Storing Adjacency Lists at Twitter](https://blog.twitter.com/engineering/en_us/a/2010/introducing-flockdb.html)
+										* [JanusGraph: Scalable Graph Database backed by Google, IBM and Hortonworks](https://architecht.io/google-ibm-back-new-open-source-graph-database-project-janusgraph-1d74fb78db6b)
+										* [Amazon Neptune](https://aws.amazon.com/neptune/)
+									* [Datastructure Databases (Redis, Hazelcast)](https://db-engines.com/en/system/Hazelcast%3BMemcached%3BRedis)
+										* [Using Redis To Scale at Twitter](http://highscalability.com/blog/2014/9/8/how-twitter-uses-redis-to-scale-105tb-ram-39mm-qps-10000-ins.html)
 										* [Scaling Job Queue with Redis at Slack](https://slack.engineering/scaling-slacks-job-queue-687222e9d100)
+										* [Moving persistent data out of Redis at GitHub](https://githubengineering.com/moving-persistent-data-out-of-redis/)
+										* [Storing Hundreds of Millions of Simple Key-Value Pairs in Redis at Instagram](https://engineering.instagram.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c)
+										* [Redis in Chat Architecture of Twitch (from 27:22)](https://www.infoq.com/presentations/twitch-pokemon)
+										* [Learn Redis the hard way (in production) at Trivago](http://tech.trivago.com/2017/01/25/learn-redis-the-hard-way-in-production/)
+										* [Optimizing Session Key Storage in Redis at Deliveroo](https://deliveroo.engineering/2016/10/07/optimising-session-key-storage.html)
+										* [Optimizing Redis Storage at Deliveroo](https://deliveroo.engineering/2017/01/19/optimising-membership-queries.html)
+										* [Memory Optimization in Redis at Wattpad](http://engineering.wattpad.com/post/23244724794/store-more-stuff-memory-optimization-in-redis)
+										* [Sending an e-mail to millions of users (with Redis) at Drivy](https://drivy.engineering/sending-mass-emails/)
+										* [Redis Fleet at Heroku](https://blog.heroku.com/rolling-redis-fleet)
+								* [Time Series Databases (TSDB)](https://www.influxdata.com/time-series-database/)
+									* [What is Time-Series Data & Why We Need a Time-Series Database](https://blog.timescale.com/what-the-heck-is-time-series-data-and-why-do-i-need-a-time-series-database-dcf3b1b18563)
+									* [Time Series Data: Why and How to Use a Relational Database instead of NoSQL](https://blog.timescale.com/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c)
+									* [Practical Guide to Monitoring and Alerting with Time Series at Scale](https://www.usenix.org/conference/srecon17americas/program/presentation/wilkinson)
+									* [Beringei: High-performance Time Series Storage Engine at Facebook](https://code.facebook.com/posts/952820474848503/beringei-a-high-performance-time-series-storage-engine/)
 									* [Atlas: In-memory Dimensional Time Series Database at Netflix](https://medium.com/netflix-techblog/introducing-atlas-netflixs-primary-telemetry-platform-bd31f4d8ed9a)
 									* [Heroic: Time Series Database at Spotify](https://labs.spotify.com/2015/11/17/monitoring-at-spotify-introducing-heroic/)
+									* [Roshi: Distributed Storage System for Time-Series Event at SoundCloud](https://developers.soundcloud.com/blog/roshi-a-crdt-system-for-timestamped-events)
+									* [Building a Scalable Time Series Database on PostgreSQL](https://blog.timescale.com/when-boring-is-awesome-building-a-scalable-time-series-database-on-postgresql-2900ea453ee2)
+									* [Scaling Time Series Data Storage at Netflix](https://medium.com/netflix-techblog/scaling-time-series-data-storage-part-i-ec2b6d44ba39)
+								* [HTTP Caching (Reverse Proxy, CDN)](https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching)
+									* [Reverse Proxy (Nginx, Varnish, Squid, rack-cache)](https://www.mertech.com/overview-reverse-proxying/)
+									* [Stop Worrying and Love the Proxy](https://blog.turbinelabs.io/how-we-learned-to-stop-worrying-and-love-the-proxy-89af98fabaf8)
+									* [Playing HTTP Tricks with Nginx](https://www.elastic.co/blog/playing-http-tricks-nginx)
+									* [Using CDN to Improve Site Performance at Coursera](https://building.coursera.org/blog/2015/07/09/improving-coursera-global-site-performance-a-head-to-head-cdn-battle-with-production-traffic/)
+									* [Strategy: Caching 404s Saved 66% On Server Time at The Onion](http://highscalability.com/blog/2010/3/26/strategy-caching-404s-saved-the-onion-66-on-server-time.html)
+									* [Increasing Application Performance with HTTP Cache Headers](https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers)
+									* [Zynga Geo Proxy: Reducing Mobile Game Latency at Zynga](https://www.zynga.com/blogs/engineering/zynga-geo-proxy-reducing-mobile-game-latency)
+									* [Google AMP at Condé Nast](https://technology.condenast.com/story/the-why-and-how-of-google-amp-at-conde-nast)
+									* [Running A/B Tests on Hosting Infrastructure (CDNs) at Deliveroo](https://deliveroo.engineering/2016/09/19/ab-testing-cdns.html)
+									* [HAProxy with Kubernetes for User-facing Traffic at SoundCloud](https://developers.soundcloud.com/blog/how-soundcloud-uses-haproxy-with-kubernetes-for-user-facing-traffic)
+									* [Bandaid: Service Proxy at Dropbox](https://blogs.dropbox.com/tech/2018/03/meet-bandaid-the-dropbox-service-proxy/)
+									* [CDN in LIVE's Encoder Layer at LINE](https://engineering.linecorp.com/en/blog/detail/230)
+								* [Load Balancing](https://blog.vivekpanyam.com/scaling-a-web-service-load-balancing/)
+									* [Introduction to Modern Network Load Balancing and Proxying](https://blog.envoyproxy.io/introduction-to-modern-network-load-balancing-and-proxying-a57f6ff80236)
 									* [Load Balancing infrastructure to support more than 1.3 billion users at Facebook](https://www.usenix.org/conference/srecon15europe/program/presentation/shuff)
+									* [DHCPLB: DHCP Load Balancer at Facebook](https://code.facebook.com/posts/1734309626831603/dhcplb-an-open-source-load-balancer/)
 									* [Katran: Scalable Network Load Balancer at Facebook](https://code.facebook.com/posts/1906146702752923/open-sourcing-katran-a-scalable-network-load-balancer/)
+									* [Load Balancing with Eureka at Netflix](https://medium.com/netflix-techblog/netflix-shares-cloud-load-balancing-and-failover-tool-eureka-c10647ef95e5)
 									* [Load Balancing at Yelp](https://engineeringblog.yelp.com/2017/05/taking-zero-downtime-load-balancing-even-further.html)
 									* [Load Balancing at GitHub](https://githubengineering.com/introducing-glb/)
 									* [Consistent Hashing to Improve Load Balancing at Vimeo](https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed)
+									* [UDP Load Balancing at 500px](https://developers.500px.com/udp-load-balancing-with-keepalived-167382d7ad08)
 									* [QALM: QoS Load Management Framework at Uber](https://eng.uber.com/qalm/)
+									* [Traffic Steering using Rum DNS at LinkedIn](https://www.usenix.org/conference/srecon17europe/program/presentation/rastogi)
+								* [Rate Limiting](https://www.keycdn.com/support/rate-limiting/)
 									* [Rate Limiting for Scaling to Millions of Domains at Cloudflare](https://blog.cloudflare.com/counting-things-a-lot-of-different-things/)
 									* [Cloud Bouncer: Distributed Rate Limiting at Yahoo](https://yahooeng.tumblr.com/post/111288877956/cloud-bouncer-distributed-rate-limiting-at-yahoo)
 									* [Scaling API with Rate Limiters at Stripe](https://stripe.com/blog/rate-limiters)
 									* [Rate Limiting at Etsy](https://www.sans.org/summit-archives/file/summit-archive-1509593697.pdf)
+									* [Rate Limiter at BloomReach](http://engineering.bloomreach.com/qps-monitoring-throttling-system/)
+									* [Distributed Rate Limiting at Allegro](https://allegro.tech/2017/04/hermes-max-rate.html)
+									* [Ratequeue: Core Queueing-And-Rate-Limiting System at Twilio](https://www.twilio.com/blog/2017/11/chaos-engineering-ratequeue-ha.html)
+								* [Autoscaling](https://medium.com/@BotmetricHQ/top-11-hard-won-lessons-learned-about-aws-auto-scaling-5bfe56da755f)
+									* [A Horror Movie Featuring Auto Scaling Groups, EBS Volumes, Terraform, and Bash](https://blog.gruntwork.io/yak-shaving-series-1-all-i-need-is-a-little-bit-of-disk-space-6e5ef1644f67)
+									* [Autoscaling Pinterest](https://medium.com/@Pinterest_Engineering/auto-scaling-pinterest-df1d2beb4d64)
+									* [Autoscaling Based on Request Queuing at Square](https://medium.com/square-corner-blog/autoscaling-based-on-request-queuing-c4c0f57f860f)
 									* [Autoscaling Applications at PayPal](https://www.paypal-engineering.com/2017/08/16/autoscaling-applications-paypal/)
 									* [Autoscaling Jenkins at Trivago](http://tech.trivago.com/2017/02/17/your-definite-guide-for-autoscaling-jenkins/)
+									* [Scryer: Predictive Auto Scaling Engine at Netflix](https://medium.com/netflix-techblog/scryer-netflixs-predictive-auto-scaling-engine-a3f8fc922270)
+								* [Concurrency](http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/)
+									* [Message-Passing Concurrency](https://link.springer.com/chapter/10.1007/978-3-642-35170-9_11)
 									* [Software Transactional Memory](https://dl.acm.org/citation.cfm?id=3037750)
 									* [Dataflow Concurrency](http://www.marketwired.com/press-release/java-concurrency-and-scalability-platform-akka-celebrates-fifth-anniversary-1928674.htm)
+									* [Shared-State Concurrency](https://doc.rust-lang.org/book/second-edition/ch16-03-shared-state.html)
+									* [Concurrency series by Larry Osterman (Principal SDE at Microsoft)](https://social.msdn.microsoft.com/Profile/Larry%2bOsterman%2b%5BMSFT%5D/activity)
 										* [Part 8 – Concurrency for scalability](https://blogs.msdn.microsoft.com/larryosterman/2005/02/28/concurrency-part-8-concurrency-for-scalability/)
 										* [Part 9 - APIs that enable scalable programming](https://blogs.msdn.microsoft.com/larryosterman/2005/03/02/concurrency-part-9-apis-that-enable-scalable-programming/)
 										* [Part 10 - How do you know if you’ve got a scalability issue?](https://blogs.msdn.microsoft.com/larryosterman/2005/03/03/concurrency-part-10-how-do-you-know-if-youve-got-a-scalability-issue/)
 										* [Part 11 – Hidden scalability issues](https://blogs.msdn.microsoft.com/larryosterman/2005/03/04/concurrency-part-11-hidden-scalability-issues/)
 										* [Part 12 – Hidden scalability issues (cont)](https://blogs.msdn.microsoft.com/larryosterman/2005/03/07/concurrency-part-12-hidden-scalability-issues-part-2/)
+									* [Concurrency with Erlang](http://learnyousomeerlang.com/the-hitchhikers-guide-to-concurrency)
+										* [Erlang in WhatsApp](https://blog.whatsapp.com/196/1-million-is-so-2011)
 										* [Erlang in Riot Chat Server](https://engineering.riotgames.com/news/chat-service-architecture-servers)
 										* [How Discord Scaled Elixir to Five Million Concurrent Users](https://blog.discordapp.com/scaling-elixir-f9b8e1e7c29b)
+										* [Mnesia: A Distributed DBMS Rooted in Concurrency](https://www.developer.com/db/article.php/3864331/Mnesia-A-Distributed-DBMS-Rooted-in-Concurrency.htm)
 										* [Mnesia and CAP](https://medium.com/@jlouis666/mnesia-and-cap-d2673a92850)
+									* [Running Concurrent Queries in GoSocial (Go and Neo4j) at Medium](https://medium.engineering/running-concurrent-queries-in-gosocial-28e5841b05b5)
+									* [The Secret To 10 Million Concurrent Connections](http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html)
+								* [Parallel Computing](https://blogs.msdn.microsoft.com/ddperf/2009/05/02/are-we-taking-advantage-of-parallelism/)
 									* [SPMD (Single Program Multiple Data): The Genetic Pattern](https://www2.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-186.html)
 									* [Master/Worker Pattern](https://docs.gigaspaces.com/sbp/master-worker-pattern.html)
 									* [Loop Parallelism Pattern: Extracting parallel tasks from loops](https://www.cs.umd.edu/class/fall2001/cmsc411/projects/unroll/main.htm)
 									* [Fork/Join Pattern: Good for recursive data processing](http://highscalability.com/learn-how-exploit-multiple-cores-better-performance-and-scalability)
 									* [Map-Reduce: Born for Simplified Data Processing on Large Clusters](http://static.googleusercontent.com/media/research.google.com/en/us/archive/mapreduce-osdi04.pdf)
 									* [On the Death of Map-Reduce - Henry Robinson, Cloudera](http://the-paper-trail.org/blog/the-elephant-was-a-trojan-horse-on-the-death-of-map-reduce-at-google/)
+									* [Server-side Optimization to Parallelize the Rendering of Web Pages at Yelp](https://engineeringblog.yelp.com/2017/07/generating-web-pages-in-parallel-with-pagelets.html)
+									* [Accelerator: Data Processing Framework with Fast Data Access and Parallel Execution at eBay](https://www.ebayinc.com/stories/blogs/tech/announcing-the-accelerator-processing-1-000-000-000-lines-per-second-on-a-single-computer/)
+								* [Event-Driven Architecture](https://martinfowler.com/articles/201701-event-driven.html)
+									* [Pub-Sub Messaging](https://aws.amazon.com/pub-sub-messaging/)
 										* [Autoscaling Pub-Sub Consumers at Spotify](https://labs.spotify.com/2017/11/20/autoscaling-pub-sub-consumers/)
 										* [Pulsar: Pub-Sub Messaging at Scale at Yahoo](https://yahooeng.tumblr.com/post/150078336821/open-sourcing-pulsar-pub-sub-messaging-at-scale)
 										* [Wormhole: Pub-Sub system at Facebook (2013)](https://code.facebook.com/posts/188966771280871/wormhole-pub-sub-system-moving-data-through-space-and-time/)
+										* [Pub-Sub in Chatting Architecture at LINE](https://engineering.linecorp.com/en/blog/detail/85)
+									* [Domain Events](https://martinfowler.com/eaaDev/DomainEvent.html)
+										* [Domain Events: Simple and Reliable Solution](http://enterprisecraftsmanship.com/2017/10/03/domain-events-simple-and-reliable-solution/)
+										* [Domain-Driven Design in Organizing Monolith Before Breaking it into Services at Weebly](https://medium.com/weebly-engineering/how-to-organize-your-monolith-before-breaking-it-into-services-69cbdb9248b0)
+										* [Modelling for Domain Driven Design at Moonpig](https://engineering.moonpig.com/development/modelling-for-domain-driven-design)
+									* [Event Sourcing](https://martinfowler.com/eaaDev/EventSourcing.html)
 										* [Event Sourced Architectures for High Availability](https://www.infoq.com/presentations/Event-Sourced-Architectures-for-High-Availability)
 										* [Event Sourcing and Stream Processing at Scale](https://martin.kleppmann.com/2016/01/29/event-sourcing-stream-processing-at-ddd-europe.html)
 										* [Scaling Event Sourcing for Netflix Downloads](https://www.infoq.com/presentations/netflix-scale-event-sourcing)
 										* [Scaling Event-Sourcing at Jet.com](https://medium.com/@eulerfx/scaling-event-sourcing-at-jet-9c873cac33b8)
+										* [Event Sourcing (2 parts) at eBay](https://www.ebayinc.com/stories/blogs/tech/event-sourcing-in-action-with-ebays-continuous-delivery-team/)
+									* [Command & Query Responsibility Segregation (CQRS)](https://docs.microsoft.com/en-us/azure/architecture/patterns/cqrs)
+										* [Exploring CQRS and Event Sourcing - MSDN (with free ebook)](https://msdn.microsoft.com/en-us/library/jj554200.aspx)
+										* [CQRS Simple Architecture](https://www.future-processing.pl/blog/cqrs-simple-architecture/)
+										* [Building Scalable Applications Using Event Sourcing and CQRS with Kafka](https://initiate.andela.com/event-sourcing-and-cqrs-a-look-at-kafka-e0c1b90d17d8)
 									* [Stream Processing, Event Sourcing, Reactive, CEP, etc - Martin Kleppmann](https://www.confluent.io/blog/making-sense-of-stream-processing/)
 										* [Point-To-Point and Its Differences from Pub-Sub](https://www.journaldev.com/9743/jms-messaging-models)
 										* [Store-Forward](https://docs.oracle.com/cd/E13222_01/wls/docs91/saf_admin/overview.html)
 										* [Request-Reply](https://docs.tibco.com/pub/ftl/4.3.0/doc/html/GUID-A64ABED1-682E-4E1D-A94A-5590CB91B9BB.html)
+										* [Enterprise Service Bus](http://www.oracle.com/technetwork/articles/soa/ind-soa-esb-1967705.html)
+								* [Distributed Source Code and Configuration Files Management](https://betterexplained.com/articles/intro-to-distributed-version-control-illustrated/)
+									* [Distributed Version Control Systems: A Not-So-Quick Guide Through](https://www.infoq.com/articles/dvcs-guide)
+									* [DGit: Distributed Git at GitHub](https://githubengineering.com/introducing-dgit/)
+									* [Stemma: Distributed Git Server at Palantir](https://medium.com/@palantir/stemma-distributed-git-server-70afbca0fc29)
 									* [Configuration Management for Distributed Systems at Flickr](https://code.flickr.net/2016/03/24/configuration-management-for-distributed-systems-using-github-and-cfg4j/)
+									* [Git Repository at Microsoft](https://blogs.msdn.microsoft.com/bharry/2017/05/24/the-largest-git-repo-on-the-planet/)
 									* [How Microsoft Solved Git’s Problem with Large Repositories](https://www.infoq.com/news/2017/02/GVFS)
 									* [Single Repository at Google](https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext)
+									* [Scaling Infrastructure and (Git) Workflow at Adyen](https://medium.com/adyen/from-0-100-billion-scaling-infrastructure-and-workflow-at-adyen-7b63b690dfb6)
+									* [Dotfiles Distribution at Booking.com](https://medium.com/booking-com-infrastructure/dotfiles-distribution-dedb69c66a75)
+								## Availability
+								* [Resilience Engineering: Learning to Embrace Failure](https://queue.acm.org/detail.cfm?id=2371297)
+									* [Resilience Engineering with Project Waterbear at LinkedIn](https://engineering.linkedin.com/blog/2017/11/resilience-engineering-at-linkedin-with-project-waterbear)
+									* [Resiliency against Traffic Oversaturation at iHeartRadio](https://tech.iheart.com/resiliency-against-traffic-oversaturation-77c5ed92a5fb)
 									* [Resiliency in Distributed Systems at GO-JEK](https://blog.gojekengineering.com/resiliency-in-distributed-systems-efd30f74baf4)
 									* [Practical NoSQL Resilience Design Pattern for the Enterprise at eBay](https://www.ebayinc.com/stories/blogs/tech/practical-nosql-resilience-design-pattern-for-the-enterprise/)
 									* [Ensuring Resilience to Disaster at Quora](https://engineering.quora.com/Ensuring-Quoras-Resilience-to-Disaster)
+									* [Resilience at Shopify](https://scaleyourcode.com/blog/article/23)
+									* [Site Resiliency at Expedia](https://www.infoq.com/presentations/expedia-website-resiliency?utm_source=presentations_about_Case_Study&utm_medium=link&utm_campaign=Case_Study)
+								* [Failover](http://cloudpatterns.org/mechanisms/failover_system)
 									* [The Evolution of Global Traffic Routing and Failover](https://www.usenix.org/conference/srecon16/program/presentation/heady)
 									* [Testing for Disaster Recovery Failover Testing](https://www.usenix.org/conference/srecon17asia/program/presentation/liu_zehua)
 									* [Designing a Microservices Architecture for Failure](https://blog.risingstack.com/designing-microservices-architecture-for-failure/)
 									* [ELB for Automatic Failover at GoSquared](https://engineering.gosquared.com/use-elb-automatic-failover)
 									* [Eliminate the Database for Higher Availability at American Express](http://americanexpress.io/eliminate-the-database-for-higher-availability/)
+									* [Failover with Redis Sentinel at Vinted](http://engineering.vinted.com/2015/09/03/failover-with-redis-sentinel/)
 									* [High-availability SaaS Infrastructure at FreeAgent](http://engineering.freeagent.com/2017/02/06/ha-infrastructure-without-breaking-the-bank/)
+								* [Availability in Globally Distributed Storage Systems](http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/36737.pdf)
+								* [NodeJS High Availability at Yahoo](https://yahooeng.tumblr.com/post/68823943185/nodejs-high-availability)
+								* [Every Day is Monday in Operations (11 parts) at LinkedIn](https://www.linkedin.com/pulse/introduction-every-day-monday-operations-benjamin-purgason)
+								* [How Robust Monitoring Powers High Availability for LinkedIn Feed](https://www.usenix.org/conference/srecon17americas/program/presentation/barot)
+								* [Architectural Patterns for High Availability - Adrian Cockcroft, Director of Architecture at Netflix](https://www.infoq.com/presentations/Netflix-Architecture)
+								* [Supporting Global Events at Facebook](https://code.facebook.com/posts/166966743929963/how-production-engineers-support-global-events-on-facebook/)
+								* [Backends High Availability at BlaBlaCar](https://medium.com/blablacar-tech/the-expendables-backends-high-availability-at-blablacar-8cea3b95b26b)
+								* [Chubby: Lock Service for Loosely Coupled Distributed Systems at Google](https://blog.acolyer.org/2015/02/13/the-chubby-lock-service-for-loosely-coupled-distributed-systems/)
+								* [Tips for High Availability at Netflix](https://medium.com/@NetflixTechBlog/tips-for-high-availability-be0472f2599c)
+								* [Scaling High-Availability Infrastructure in the Cloud at Twilio](https://www.twilio.com/engineering/2011/12/12/scaling-high-availablity-infrastructure-in-cloud)
 								## Stability
+								* [Circuit Breaker](https://martinfowler.com/bliki/CircuitBreaker.html)
 									* [Circuit Breaking in Distributed Systems](https://www.infoq.com/presentations/circuit-breaking-distributed-systems)
+									* [Circuit Breakers for Distributed Services at LINE](https://engineering.linecorp.com/en/blog/detail/76)
+									* [Applying Circuit Breaker to Channel Gateway at LINE](https://engineering.linecorp.com/en/blog/detail/78)
+									* [Lessons in Resilience at SoundCloud](https://developers.soundcloud.com/blog/lessons-in-resilience-at-SoundCloud)
+									* [Circuit Breaker for Scaling Containers](https://f5.com/about-us/blog/articles/the-art-of-scaling-containers-circuit-breakers-28919)
+									* [Protector: Circuit Breaker for Time Series Databases at Trivago](http://tech.trivago.com/2016/02/23/protector/)
+									* [Improved Production Stability with Circuit Breakers at Heroku](https://blog.heroku.com/improved-production-stability-with-circuit-breakers)
+								* [Always Use Timeouts If Possible](https://www.javaworld.com/article/2824163/application-performance/stability-patterns-applied-in-a-restful-architecture.html)
+								* [Crash Early: Better Error Now Than Response Tomorrow](http://odino.org/better-performance-the-case-for-timeouts/)
+								* [Fault Tolerance (Timeouts and Retries, Thread Separation, Semaphores, Circuit Breakers) at Netflix](https://medium.com/netflix-techblog/fault-tolerance-in-a-high-volume-distributed-system-91ab4faae74a)
+								* [Crash-safe Replication for MySQL at Booking.com](https://medium.com/booking-com-infrastructure/better-crash-safe-replication-for-mysql-a336a69b317f)
 								* [Bulkheads: Partition and Tolerate Failure in One Part](https://skife.org/architecture/fault-tolerance/2009/12/31/bulkheads.html)
 								* [Steady State: Always Put Logs on Separate Disk](https://docs.microsoft.com/en-us/sql/relational-databases/policy-based-management/place-data-and-log-files-on-separate-drives)
 								* [Throttling: Maintain a Steady Pace](http://www.sosp.org/2001/papers/welsh.pdf)
 								* [Multi-Clustering: Improving Resiliency and Stability of a Large-scale Monolithic API Service at LinkedIn](https://engineering.linkedin.com/blog/2017/11/improving-resiliency-and-stability-of-a-large-scale-api)
+								* [Determinism (4 parts) in League of Legends Server](https://engineering.riotgames.com/news/determinism-league-legends-fixing-divergences)
+								## Performance
+								* [Performance Optimization on OS, Storage, Database, Network](https://stackify.com/application-performance-metrics/)
+									* [Improving Performance with Background Data Prefetching at Instagram](https://engineering.instagram.com/improving-performance-with-background-data-prefetching-b191acb39898)
 									* [Compression Techniques to Solve Network I/O Bottlenecks at eBay](https://www.ebayinc.com/stories/blogs/tech/how-ebays-shopping-cart-used-compression-techniques-to-solve-network-io-bottlenecks/)
 									* [Optimizing Web Servers for High Throughput and Low Latency at Dropbox](https://blogs.dropbox.com/tech/2017/09/optimizing-web-servers-for-high-throughput-and-low-latency/)
 									* [Linux Performance Analysis in 60,000 Milliseconds at Netflix](https://medium.com/netflix-techblog/linux-performance-analysis-in-60-000-milliseconds-accc10403c55)
 									* [Performance Testing with SSDs (2 parts) at MailChimp](https://devs.mailchimp.com/blog/performance-testing-with-ssds-pt-2/)
+									* [Live Downsizing Google Cloud Persistent Disks (PD-SSD) at Mixpanel](https://engineering.mixpanel.com/2018/07/31/live-downsizing-google-cloud-pds-for-fun-and-profit/)
+									* [Decreasing RAM Usage by 40% Using jemalloc with Python & Celery at Zapier](https://zapier.com/engineering/celery-python-jemalloc/)
+									* [Reducing Memory Footprint at Slack](https://slack.engineering/reducing-slacks-memory-footprint-4480fec7e8eb)
+									* [Performance Improvements at Pinterest](https://medium.com/@Pinterest_Engineering/driving-user-growth-with-performance-improvements-cfc50dafadd7)
+									* [Server Side Rendering at Wix](https://www.youtube.com/watch?v=f9xI2jR71Ms)
 									* [30x Performance Improvements on MySQLStreamer at Yelp](https://engineeringblog.yelp.com/2018/02/making-30x-performance-improvements-on-yelps-mysqlstreamer.html)
 									* [Optimizing APIs through Dynamic Polyglot Runtime, Fully Asynchronous, and Reactive Programming at Netflix](https://medium.com/netflix-techblog/optimizing-the-netflix-api-5c9ac715cf19)
 									* [Performance Monitoring with Riemann and Clojure at Walmart](https://medium.com/walmartlabs/performance-monitoring-with-riemann-and-clojure-eafc07fcd375)
+									* [Performance Tracking Dashboard for Live Games at Zynga](https://www.zynga.com/blogs/engineering/live-games-have-evolving-performance)
+									* [Optimizing CAL Report Hadoop MapReduce Jobs at eBay](https://www.ebayinc.com/stories/blogs/tech/optimization-of-cal-report-hadoop-mapreduce-job/)
+									* [Performance Tuning on Quartz Scheduler at eBay](https://www.ebayinc.com/stories/blogs/tech/performance-tuning-on-quartz-scheduler/)
+									* [Profiling C++ (Part 1: Optimization, Part 2: Measurement and Analysis) at Riot Games](https://engineering.riotgames.com/news/profiling-optimisation)
+									* [Diagnosing Networking Issues in the Linux Kernel at Mixpanel](https://code.mixpanel.com/2015/03/26/diagnosing-networking-issues-in-the-linux-kernel/)
+									* [Hardware-Assisted Video Transcoding at Dailymotion](https://medium.com/dailymotion-engineering/hardware-assisted-video-transcoding-at-dailymotion-66cd2db448ae)
+								* [Performance Optimization by Tuning Garbage Collection](https://confluence.atlassian.com/enterprise/garbage-collection-gc-tuning-guide-461504616.html)
 									* [Garbage Collection Optimization for High-Throughput and Low-Latency Java Applications at LinkedIn](https://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications)
 									* [Analyzing V8 Garbage Collection Logs at Alibaba](https://www.linux.com/blog/can-nodejs-scale-ask-team-alibaba)
 									* [Python Garbage Collection for Dropping 50% Memory Growth Per Request at Instagram](https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf)
 									* [Performance Impact of Removing Out of Band Garbage Collector (OOBGC) at GitHub](https://githubengineering.com/removing-oobgc/)
 									* [Using Java Large Heap (110 GB) at Expedia](https://techblog.expedia.com/2015/09/25/solving-problems-with-very-large-java-heaps/)
+									* [Debugging Java Memory Leaks at Allegro](https://allegro.tech/2018/05/a-comedy-of-errors-debugging-java-memory-leaks.html)
+									* [Optimizing JVM at Alibaba](https://www.youtube.com/watch?v=X4tmr3nhZRg)
+								* [Performance Optimization on Video, Image, Page Load](https://developers.google.com/web/fundamentals/performance/why-performance-matters/)
+									* [Optimizing 360 Photos at Scale at Facebook](https://code.facebook.com/posts/129055711052260/optimizing-360-photos-at-scale/)
 									* [Reducing Image File Size in the Photos Infrastructure at Etsy](https://codeascraft.com/2017/05/30/reducing-image-file-size-at-etsy/)
 									* [Improving GIF Performance at Pinterest](https://medium.com/@Pinterest_Engineering/improving-gif-performance-on-pinterest-8dad74bf92f1)
 									* [Optimizing Video Playback Performance at Pinterest](https://medium.com/@Pinterest_Engineering/optimizing-video-playback-performance-caf55ce310d1)
 									* [Optimizing Video Stream for Low Bandwidth with Dynamic Optimizer at Netflix](https://medium.com/netflix-techblog/optimized-shot-based-encodes-now-streaming-4b9464204830)
+									* [Adaptive Video Streaming at YouTube](https://youtube-eng.googleblog.com/2018/04/making-high-quality-video-efficient.html)
+									* [Reducing Video Loading Time by Prefetching during Preroll at Dailymotion](http://engineering.dailymotion.com/reducing-video-loading-time-prefetching-video-during-preroll/)
+									* [Boosting Site Speed Using Brotli Compression at LinkedIn](https://engineering.linkedin.com/blog/2017/05/boosting-site-speed-using-brotli-compression)
+									* [Improving Homepage Performance at Zillow](https://www.zillow.com/engineering/improving-homepage-performance/)
 									* [The Process of Optimizing for Client Performance at Expedia](https://techblog.expedia.com/2018/03/09/go-fast-or-go-home-the-process-of-optimizing-for-client-performance/)
+								## Intelligence
+								* [Big Data](https://insights.sei.cmu.edu/sei_blog/2017/05/reference-architectures-for-big-data-systems.html)
+									* [Data Platform at Netflix](https://www.youtube.com/watch?v=CSDIThSwA7s)
 									* [Data Platform at Flipkart](https://tech.flipkart.com/overview-of-flipkart-data-platform-20c6d3e9a196)
+									* [Data Pipeline Management Platform at Khan Academy](http://engineering.khanacademy.org/posts/khanalytics.htm)
+									* [Data Infrastructure at Airbnb](https://medium.com/airbnb-engineering/data-infrastructure-at-airbnb-8adfb34f169c)
+									* [Data Infrastructure at LinkedIn](https://www.infoq.com/presentations/big-data-infrastructure-linkedin)
+									* [Data Infrastructure at GO-JEK](https://blog.gojekengineering.com/data-infrastructure-at-go-jek-cd4dc8cbd929)
+									* [Data Ingestion Infrastructure at Pinterest](https://medium.com/@Pinterest_Engineering/scalable-and-reliable-data-ingestion-at-pinterest-b921c2ee8754)
+									* [Data Analytics Architecture at Pinterest](https://medium.com/@Pinterest_Engineering/behind-the-pins-building-analytics-f7b508cdacab)
+									* [Big Data Processing (2 parts) at Spotify](https://labs.spotify.com/2017/10/23/big-data-processing-at-spotify-the-road-to-scio-part-2/)
+									* [Big Data Processing at Uber](https://cdn.oreillystatic.com/en/assets/1/event/160/Big%20data%20processing%20with%20Hadoop%20and%20Spark%2C%20the%20Uber%20way%20Presentation.pdf)
 									* [Analytics Pipeline at Lyft](https://cdn.oreillystatic.com/en/assets/1/event/269/Lyft_s%20analytics%20pipeline_%20From%20Redshift%20to%20Apache%20Hive%20and%20Presto%20Presentation.pdf)
+									* [Big Data Analytics and ML Techniques at LinkedIn](https://cdn.oreillystatic.com/en/assets/1/event/269/Big%20data%20analytics%20and%20machine%20learning%20techniques%20to%20drive%20and%20grow%20business%20Presentation%201.pdf)
+									* [Self-Serve Reporting Platform on Hadoop at LinkedIn](https://cdn.oreillystatic.com/en/assets/1/event/137/Building%20a%20self-serve%20real-time%20reporting%20platform%20at%20LinkedIn%20Presentation%201.pdf)
+									* [Analytics Platform for Tracking Item Availability at Walmart](https://medium.com/walmartlabs/how-we-build-a-robust-analytics-platform-using-spark-kafka-and-cassandra-lambda-architecture-70c2d1bc8981)
+									* [RBEA: Real-time Analytics Platform at King](https://techblog.king.com/rbea-scalable-real-time-analytics-king/)
 									* [Gimel: Analytics Data Processing Platform at PayPal](https://www.paypal-engineering.com/2018/04/17/gimel/)
 									* [AthenaX: Streaming Analytics Platform at Uber](https://eng.uber.com/athenax/)
+									* [Databook: Turning Big Data into Knowledge with Metadata at Uber](https://eng.uber.com/databook/)
+									* [Maze: Funnel Visualization Platform at Uber](https://eng.uber.com/maze/)
+									* [Metacat: Making Big Data Discoverable and Meaningful at Netflix](https://medium.com/netflix-techblog/metacat-making-big-data-discoverable-and-meaningful-at-netflix-56fb36a53520)
+									* [TensorFlowOnSpark: Distributed Deep Learning on Big Data Clusters at Yahoo](https://yahooeng.tumblr.com/post/157196488076/open-sourcing-tensorflowonspark-distributed-deep)
 									* [CaffeOnSpark: Distributed Deep Learning on Big Data Clusters at Yahoo](https://yahooeng.tumblr.com/post/139916828451/caffeonspark-open-sourced-for-distributed-deep)
+									* [Experimentation Platform at Airbnb](https://medium.com/airbnb-engineering/https-medium-com-jonathan-parks-scaling-erf-23fd17c91166)
 									* [Smart Product Platform at Zalando](https://jobs.zalando.com/tech/blog/zalando-smart-product-platform/?gh_src=4n3gxh1)
+									* [Log Analysis Platform at LINE](https://www.slideshare.net/wyukawa/strata2017-sg)
+								* [Distributed Machine Learning](https://www.csie.ntu.edu.tw/~cjlin/talks/bigdata-bilbao.pdf)
+									* [Michelangelo: Machine Learning Platform at Uber](https://eng.uber.com/michelangelo/)
+									* [Horovod: Open Source Distributed Deep Learning Framework for TensorFlow at Uber](https://eng.uber.com/horovod/)
 									* [COTA: Improving Customer Care with NLP & Machine Learning at Uber](https://eng.uber.com/cota/)
 									* [Repo-Topix: Topic Extraction Framework at GitHub](https://githubengineering.com/topics/)
+									* [Concourse: Generating Personalized Content Notifications in Near-Real-Time at LinkedIn](https://engineering.linkedin.com/blog/2018/05/concourse--generating-personalized-content-notifications-in-near)
+									* [Altus Care: Applying a Chatbot to Platform Engineering at eBay](https://www.ebayinc.com/stories/blogs/tech/altus-care-apply-chatbot-to-ebay-platform-engineering/)
 									* [Box Graph: Spontaneous Social Network at Box](https://blog.box.com/blog/box-graph-how-we-built-spontaneous-social-network/)
 									* [PricingNet: Pricing Modelling with Neural Networks at Skyscanner](https://hackernoon.com/pricingnet-modelling-the-global-airline-industry-with-neural-networks-833844d20ea6)
+									* [Scaling Gradient Boosted Trees for Click-Through-Rate Prediction at Yelp](https://engineeringblog.yelp.com/2018/01/building-a-distributed-ml-pipeline-part1.html)
+									* [Learning with Privacy at Scale at Apple](https://machinelearning.apple.com/2017/12/06/learning-with-privacy-at-scale.html)
+									* [Deep Learning for Image Classification Experiment at Mercari](https://medium.com/mercari-engineering/mercaris-image-classification-experiment-using-deep-learning-9b4e994a18ec)
 									* [Deep Learning for Frame Detection in Product Images at Allegro](https://allegro.tech/2016/12/deep-learning-for-frame-detection.html)
+									* [Content-based Video Relevance Prediction at Hulu](https://medium.com/hulu-tech-blog/content-based-video-relevance-prediction-b2c448e14752)
 									* [Training ML Models with Airflow and BigQuery at WePay](https://wecode.wepay.com/posts/training-machine-learning-models-with-airflow-and-bigquery)
 									* [Improving Photo Selection With Deep Learning at TripAdvisor](http://engineering.tripadvisor.com/improving-tripadvisor-photo-selection-deep-learning/)
 									* [Machine Learning (2 parts) at Condé Nast](https://technology.condenast.com/story/handbag-brand-and-color-detection)
 									* [Machine Learning Applications In The E-commerce Domain (4 parts) at Rakuten](https://techblog.rakuten.co.jp/2017/07/12/machine-learning-applications-in-the-e-commerce-domain-4/)
+									* [Mapping the World of Music Using Machine Learning (2 parts) at iHeartRadio](https://tech.iheart.com/mapping-the-world-of-music-using-machine-learning-part-2-aa50b6a0304c)
+									* [Venue Rating System at Foursquare](https://engineering.foursquare.com/finding-the-perfect-10-how-we-developed-the-foursquare-venue-rating-system-c76b08f7b9b3)
 									* [Using Machine Learning to Improve Streaming Quality at Netflix](https://medium.com/netflix-techblog/using-machine-learning-to-improve-streaming-quality-at-netflix-9651263ef09f)
 									* [Improving Video Thumbnails with Deep Neural Nets at YouTube](https://youtube-eng.googleblog.com/2015/10/improving-youtube-video-thumbnails-with_8.html)
 									* [Quantile Regression for Delivering On Time at Instacart](https://tech.instacart.com/how-instacart-delivers-on-time-using-quantile-regression-2383e2e03edb)
 									* [Cross-Lingual End-to-End Product Search with Deep Learning at Zalando](https://jobs.zalando.com/tech/blog/search-deep-neural-network/)
 									* [Machine Learning at Jane Street](https://blog.janestreet.com/real-world-machine-learning-part-1/)
+									* [Machine Learning for Ranking Answers End-to-End at Quora](https://engineering.quora.com/A-Machine-Learning-Approach-to-Ranking-Answers-on-Quora)
+									* [Clustering Similar Stories Using LDA at Flipboard](http://engineering.flipboard.com/2017/02/storyclustering)
+									* [Similarity Search at Flickr](https://code.flickr.net/2017/03/07/introducing-similarity-search-at-flickr/)
+									* [Large-Scale Machine Learning Pipeline for Job Recommendations at Indeed](http://engineering.indeedblog.com/blog/2016/04/building-a-large-scale-machine-learning-pipeline-for-job-recommendations/)
+									* [Deep Learning from Prototype to Production at Taboola](http://engineering.taboola.com/deep-learning-from-prototype-to-production/)
+									* [Atom Smashing using Machine Learning at CERN](https://cdn.oreillystatic.com/en/assets/1/event/144/Atom%20smashing%20using%20machine%20learning%20at%20CERN%20Presentation.pdf)
+									* [Mapping Tags at Medium](https://medium.engineering/mapping-mediums-tags-1b9a78d77cf0)
+									* [Clustering with the Dirichlet Process Mixture Model in Scala at Monsanto](http://engineering.monsanto.com/2015/11/23/chinese-restaurant-process/)
+									* [Map Pins with DBSCAN & Random Forests at Foursquare](https://engineering.foursquare.com/you-are-probably-here-better-map-pins-with-dbscan-random-forests-9d51e8c1964d)
+									* [Detecting and Preventing Fraud at Uber](https://eng.uber.com/advanced-technologies-detecting-preventing-fraud-uber/)
 									* [Financial Forecasting at Uber](https://eng.uber.com/transforming-financial-forecasting-machine-learning/)
+									* [Productionizing ML with Workflows at Twitter](https://blog.twitter.com/engineering/en_us/topics/insights/2018/ml-workflows.html)
+									* [GUI Testing Powered by Deep Learning at eBay](https://www.ebayinc.com/stories/blogs/tech/gui-testing-powered-by-deep-learning/)
+								## Architecture
+								* [Systems We Make](https://systemswemake.com/)
+								* [Tech Stack (2 parts) at Uber](https://eng.uber.com/tech-stack-part-two/)
+								* [Tech Stack at Medium](https://medium.engineering/the-stack-that-helped-medium-drive-2-6-millennia-of-reading-time-e56801f7c492)
+								* [Services (2 parts) at Airbnb](https://medium.com/airbnb-engineering/building-services-at-airbnb-part-2-142be1c5d506)
 								* [Back-end at LinkedIn](https://engineering.linkedin.com/architecture/brief-history-scaling-linkedin)
+								* [Back-end at Flickr](https://yahooeng.tumblr.com/post/157200523046/introducing-tripod-flickrs-backend-refactored)
+								* [Real-time Presence Platform at LinkedIn](https://engineering.linkedin.com/blog/2018/01/now-you-see-me--now-you-dont--linkedins-real-time-presence-platf)
 								* [Real-time User Action Counting System for Ads at Pinterest](https://medium.com/@Pinterest_Engineering/building-a-real-time-user-action-counting-system-for-ads-88a60d9c9a)
+								* [API Platform at Riot Games](https://engineering.riotgames.com/news/riot-games-api-deep-dive)
 								* [Games Platform at The New York Times](https://open.nytimes.com/play-by-play-moving-the-nyt-games-platform-to-gcp-with-zero-downtime-cf425898d569)
 								* [Data Visualisation Platform at Myntra](https://medium.com/myntra-engineering/universal-dashboarding-platform-udp-data-visualisation-platform-at-myntra-5f2522fcf72d)
+								* [Simone: Distributed Simulation Service at Netflix](https://medium.com/netflix-techblog/https-medium-com-netflix-techblog-simone-a-distributed-simulation-service-b2c85131ca1b)
+								* [Zuul: Edge Service for Dynamic Routing, Monitoring, Resiliency, Security, etc. at Netflix](https://medium.com/netflix-techblog/open-sourcing-zuul-2-82ea476cb2b3)
+								* [Seagull: Distributed System that Helps Run > 20 Million Tests Per Day at Yelp](https://engineeringblog.yelp.com/2017/04/how-yelp-runs-millions-of-tests-every-day.html)
+								* [MySQL Realtime Traffic Emulator at KakaoTalk](http://tech.kakao.com/2016/02/16/opensource-2-mtre/)
+								* [Architecture of Sticker Services at LINE](https://www.slideshare.net/linecorp/architecture-sustaining-line-sticker-services)
+								* [Stack Overflow Enterprise at Palantir](https://medium.com/@palantir/terraforming-stack-overflow-enterprise-in-aws-47ee431e6be7)
 								* [Distributed Cron at Quora](https://engineering.quora.com/Quoras-Distributed-Cron-Architecture)
+								* [Architectures of Finance and Banking Systems](https://www.sesameindia.com/images/core-banking-system-architecture)
+									* [Reference Architecture For The Open Banking Standard](https://hortonworks.com/blog/reference-architecture-open-banking-standard/)
 									* [Building a Modern Bank Backend at Monzo](https://monzo.com/blog/2016/09/19/building-a-modern-bank-backend/)
 									* [Reinventing the Trading Platform for Scale at Wealthsimple](https://medium.com/@Wealthsimple/engineering-at-wealthsimple-reinventing-our-trading-platform-for-scale-17e332241b6c)
+									* [Architecture for Core Banking System at Margo Bank](https://medium.com/margobank/choosing-an-architecture-85750e1e5a03)
+									* [Architecture of Nubank](https://www.infoq.com/presentations/nubank-architecture)
+									* [Tech Stack at TransferWise](http://tech.transferwise.com/the-transferwise-stack-heartbeat-of-our-little-revolution/)
 									* [Tech Stack at Addepar](https://medium.com/build-addepar/our-tech-stack-a4f55dab4b0d)
+								## Interview
+								* [Designing Large-Scale Systems](https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/)
+									* [My Scaling Hero - Jeff Atwood (a dose of Endorphins before your interview, JK)](https://blog.codinghorror.com/my-scaling-hero/)
+									* [Software Engineering Advice from Building Large-Scale Distributed Systems - Jeff Dean](https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf)
+									* [Introduction to Architecting Systems for Scale](https://lethain.com/introduction-to-architecting-systems-for-scale/)
+									* [Anatomy of a System Design Interview](https://hackernoon.com/anatomy-of-a-system-design-interview-4cb57d75a53f)
 									* [8 Things You Need to Know Before a System Design Interview](http://blog.gainlo.co/index.php/2015/10/22/8-things-you-need-to-know-before-system-design-interviews/)
 									* [Top 10 System Design Interview Questions](https://hackernoon.com/top-10-system-design-interview-questions-for-software-engineers-8561290f0444)
 									* [Top 10 Common Large-Scale Software Architectural Patterns in a Nutshell](https://towardsdatascience.com/10-common-software-architectural-patterns-in-a-nutshell-a0b47a1e9013)
+									* [Cloud Big Data Design Patterns - Lynn Langit](https://lynnlangit.com/2017/03/14/beyond-relational/)
+									* [How NOT to design Netflix in your 45-minute System Design Interview?](https://hackernoon.com/how-not-to-design-netflix-in-your-45-minute-system-design-interview-64953391a054)
+								* [Explaining Low-Level Systems (OS, Network/Protocol, Database, Storage)](https://www.palantir.com/how-to-ace-a-systems-design-interview/)
+									* [OSI and TCP/IP Cheat Sheet](http://jaredheinrichs.com/mastering-the-osi-tcpip-models.html)
+									* [The Precise Meaning of I/O Wait Time in Linux](http://veithen.github.io/2013/11/18/iowait-linux.html)
+									* [Paxos Made Live – An Engineering Perspective](https://research.google.com/archive/paxos_made_live.html)
+									* [How to do Distributed Locking](https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html)
+									* [SQL Transaction Isolation Levels Explained](http://elliot.land/post/sql-transaction-isolation-levels-explained)
+								* ["What Happens When... and How" Questions](https://www.glassdoor.com/Interview/What-happens-when-you-type-www-google-com-in-your-browser-QTN_56396.htm)
+									* [What Happens When You Type google.com into Browser and Press Enter?](https://github.com/alex/what-happens-when)
 									* [Netflix: What Happens When You Press Play?](http://highscalability.com/blog/2017/12/11/netflix-what-happens-when-you-press-play.html)
+									* [Monzo: How Peer-To-Peer Payments Work](https://monzo.com/blog/2018/04/05/how-monzo-to-monzo-payments-work/)
+									* [Transit and Peering: How Your Requests Reach GitHub](https://githubengineering.com/transit-and-peering-how-your-requests-reach-github/)
+									* [How Expedia Finds your Flights: A Detailed View](https://techblog.expedia.com/2016/03/07/how-expedia-finds-flights-a-detailed-view/)
+								## Organization
 								* [Engineering Levels at SoundCloud](https://developers.soundcloud.com/blog/engineering-levels)
 								* [Scaling Engineering Teams at Twitter](https://www.youtube.com/watch?v=-PXi_7Ld5kU)
 								* [Scaling Decision-Making Across Teams at LinkedIn](https://engineering.linkedin.com/blog/2018/03/scaling-decision-making-across-teams-within-linkedin-engineering)
 								* [Scaling Data Science Team at GOJEK](https://blog.gojekengineering.com/the-dynamics-of-scaling-an-organisation-cb96dbe8aecd)
 								* [Scaling Agile at Zalando](https://jobs.zalando.com/tech/blog/scaling-agile-zalando/?gh_src=4n3gxh1)
+								* [Scaling Agile at bol.com](https://hackernoon.com/how-we-run-bol-com-with-60-autonomous-teams-fe7a98c0759)
+								* [Lessons Learned from Scaling a Product Team at Intercom](https://blog.intercom.com/how-we-build-software/)
 								* [Hiring, Managing, and Scaling Engineering Teams at Typeform](https://medium.com/@eleonorazucconi/toby-oliver-cto-typeform-on-hiring-managing-and-scaling-engineering-teams-86bef9e5a708)
+								* [Scaling the Datagram Team at Instagram](https://instagram-engineering.com/scaling-the-datagram-team-fc67bcf9b721)
+								* [Scaling the Design Team at Flexport](https://medium.com/flexport-design/designing-a-design-team-a9a066bc48a5)
+								* [Team Model for Scaling a Design System at Salesforce](https://medium.com/salesforce-ux/the-salesforce-team-model-for-scaling-a-design-system-d89c2a2d404b)
+								* [Building Analytics Team (4 parts) at Wish](https://medium.com/wish-engineering/scaling-the-analytics-team-at-wish-part-4-recruiting-2a9823b9f5a)
+								* [From 2 Founders to 1000 Employees at TransferWise](https://medium.com/transferwise-ideas/from-2-founders-to-1000-employees-how-a-small-scale-startup-grew-into-a-global-community-9f26371a551b)
+								* [Lessons Learned Growing a UX Team from 10 to 170 at Adobe](https://medium.com/thinking-design/lessons-learned-growing-a-ux-team-from-10-to-170-f7b47be02262)
+								* [Five Lessons from Scaling at Pinterest](https://medium.com/@sarahtavel/five-lessons-from-scaling-pinterest-6a699a889b08)
+								## Talk
+								* [Distributed Systems in One Lesson - Tim Berglund, Senior Director of Developer Experience at Confluent](https://www.youtube.com/watch?v=Y6Ev8GIlbxc)
+								* [Building Real Time Infrastructure at Facebook - Jeff Barber and Shie Erlich, Software Engineers at Facebook](https://www.usenix.org/conference/srecon17americas/program/presentation/erlich)
+								* [Building Reliable Social Infrastructure for Google - Marc Alvidrez, Senior Manager at Google](https://www.usenix.org/conference/srecon16/program/presentation/alvidrez)
+								* [Building a Distributed Build System at Google Scale - Aysylu Greenberg, SDE at Google](https://www.youtube.com/watch?v=K8YuavUy6Qc)
+								* [Site Reliability Engineering at Dropbox - Tammy Butow, Site Reliability Engineering Manager at Dropbox](https://www.youtube.com/watch?v=ggizCjUCCqE)
+								* [How Google Does Planet-Scale Engineering for Planet-Scale Infra - Melissa Binde, SRE Director for Google Cloud Platform](https://www.youtube.com/watch?v=H4vMcD7zKM0)
+								* [Netflix Guide to Microservices - Josh Evans, Director of Operations Engineering at Netflix](https://www.youtube.com/watch?v=CZ3wIuvmHeM&t=2837s)
 								* [Achieving Rapid Response Times in Large Online Services - Jeff Dean, Google Senior Fellow](https://www.youtube.com/watch?v=1-3Ahy7Fxsc)
+								* [Architecture to Handle 80K RPS Celebrity Sales at Shopify - Simon Eskildsen, Engineering Lead at Shopify](https://www.youtube.com/watch?v=N8NWDHgWA28)
+								* [Lessons of Scale at Facebook - Bobby Johnson, Director of Engineering at Facebook](https://www.youtube.com/watch?v=QCHiNEw73AU)
+								* [Performance Optimization for the Greater China Region at Salesforce - Jeff Cheng, Enterprise Architect at Salesforce](https://www.salesforce.com/video/1757880/)
 								* [How GIPHY Delivers a GIF to 300 Million Users - Alex Hoang and Nima Khoshini, Services Engineers at GIPHY](https://vimeo.com/252367076)
+								* [High Performance Packet Processing Platform at Alibaba - Haiyong Wang, Senior Director at Alibaba](https://www.youtube.com/watch?v=wzsxJqeVIhY&list=PLMu8-hpCxIVENuAue7bd0eCAglLGY_8AW&index=7)
+								* [Solving Large-scale Data Center and Cloud Interconnection Problems - Ihab Tarazi, CTO at Equinix](https://atscaleconference.com/videos/solving-large-scale-data-center-and-cloud-interconnection-problems/)
+								* [Scaling Dropbox - Kevin Modzelewski, Back-end Engineer at Dropbox](https://www.youtube.com/watch?v=PE4gwstWhmc)
+								* [Scaling Reliability at Dropbox - Sat Kriya Khalsa, SRE at Dropbox](https://www.youtube.com/watch?v=IhGWOaD5BYQ)
+								* [Scaling with Performance at Facebook - Bill Jia, VP of Infrastructure at Facebook](https://atscaleconference.com/videos/performance-scale-2018-opening-remarks/)
+								* [Scaling Live Videos to a Billion Users at Facebook - Sachin Kulkarni, Director of Engineering at Facebook](https://www.youtube.com/watch?v=IO4teCbHvZw)
+								* [Scaling Low-latency Live Streams at Facebook (Latencies for Real-time Interactions) - Saral Shodhan, SDE at Facebook](https://atscaleconference.com/videos/scaling-low-latency-live-streams/)
 								* [Scaling Low-latency Live Streams at Facebook (End-to-End Considerations) - Federico Larumbe, SDE at Facebook](https://atscaleconference.com/videos/scaling-low-latency-live-streams-2-of-2/)
+								* [Scaling Infrastructure at Instagram - Lisa Guo, Instagram Engineering](https://www.youtube.com/watch?v=hnpzNAPiC0E)
 								* [Scaling Infrastructure at Twitter - Yao Yue, Staff Software Engineer at Twitter](https://www.youtube.com/watch?v=6OvrFkLSoZ0)
 								* [Scaling Infrastructure at Etsy - Bethany Macri, Engineering Manager at Etsy](https://www.youtube.com/watch?v=LfqyhM1LeIU)
+								* [Scaling Real-time Infrastructure at Alibaba for Global Shopping Holiday - Xiaowei Jiang, Senior Director at Alibaba](https://atscaleconference.com/videos/scaling-alibabas-real-time-infrastructure-for-global-shopping-holiday/)
+								* [Scaling Data Infrastructure at Spotify - Matti (Lepistö) Pehrs, Spotify](https://www.youtube.com/watch?v=cdsfRXr9pJU)
+								* [Scaling Pinterest - Marty Weiner, Pinterest’s founding engineer](https://www.youtube.com/watch?v=jQNCuD_hxdQ&list=RDhnpzNAPiC0E&index=11)
* [Scaling Slack - Bing Wei, Software Engineer (Infrastructure) at Slack](https://www.infoq.com/presentations/slack-scalability)
* [Scaling Backend at Youtube - Sugu Sougoumarane, SDE at Youtube](https://www.youtube.com/watch?v=5yDO-tmIoXY&feature=youtu.be)
* [Scaling Backend at Uber - Matt Ranney, Chief Systems Architect at Uber](https://www.youtube.com/watch?v=nuiLcWE8sPA)
* [Scaling Global CDN at Netflix - Dave Temkin, Director of Global Networks at Netflix](https://www.youtube.com/watch?v=tbqcsHg-Q_o)
* [Scaling Load Balancing Infra to Support 1.3 Billion Users at Facebook - Patrick Shuff, Production Engineer at Facebook](https://www.youtube.com/watch?v=bxhYNfFeVF4)
* [Scaling (a NSFW site) to 200 Million Views A Day And Beyond - Eric Pickup, Lead Platform Developer at MindGeek](https://www.youtube.com/watch?v=RlkCdM_f3p4)
* [Scaling Counting Infrastructure at Quora - Chun-Ho Hung and Nikhil Garg, SEs at Quora](https://www.infoq.com/presentations/quora-analytics)
* [Scaling Git at Microsoft - Saeed Noursalehi, Principal Program Manager at Microsoft](https://www.youtube.com/watch?v=g_MPGU_m01s)

## Book

* [Big Data, Web Ops & DevOps Ebooks - O'Reilly (Online - Free)](http://www.oreilly.com/webops/free/)
* [Google Site Reliability Engineering (Online - Free)](https://landing.google.com/sre/book.html)
* [Distributed Systems for Fun and Profit (Online - Free)](http://book.mixu.net/distsys/)
* [What Every Developer Should Know About SQL Performance (Online - Free)](https://use-the-index-luke.com/sql/table-of-contents)
* [Beyond the Twelve-Factor App - Exploring the DNA of Highly Scalable, Resilient Cloud Applications (Free)](http://www.oreilly.com/webops-perf/free/beyond-the-twelve-factor-app.csp)
* [Chaos Engineering - Building Confidence in System Behavior through Experiments (Free)](http://www.oreilly.com/webops-perf/free/chaos-engineering.csp?intcmp=il-webops-free-product-na_new_site_chaos_engineering_text_cta)
* [The Art of Scalability](http://theartofscalability.com/)
* [Designing Data-Intensive Applications](https://dataintensive.net/)
* [Web Scalability for Startup Engineers](https://www.goodreads.com/book/show/23615147-web-scalability-for-startup-engineers)
* [Scalability Rules: 50 Principles for Scaling Web Sites](http://scalabilityrules.com/)

## Special Thanks

* Jonas Bonér, CTO at Lightbend, for the [original inspiration](https://www.slideshare.net/jboner/scalability-availability-stability-patterns)

## License

[![CC-BY](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/by.svg)](https://creativecommons.org/licenses/by/4.0/)

This repo was created and is maintained by [Binh Nguyen](http://binhnguyennus.com/). Feel free to use it at your convenience! Thank you & Happy coding :heart: