From 48b3e076b47cf5b9d4a6d41f789ab98dbd151bbd Mon Sep 17 00:00:00 2001
From: "Mia von Steinkirch, PhD, MSc" <1130416+bt3gl@users.noreply.github.com>
Date: Mon, 2 Mar 2020 16:21:47 -0800
Subject: [PATCH] Create data_engineering.md

---
 data_engineering.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 data_engineering.md

diff --git a/data_engineering.md b/data_engineering.md
new file mode 100644
index 0000000..3e746d1
--- /dev/null
+++ b/data_engineering.md
@@ -0,0 +1,21 @@
+# Enterprise Solutions
+
+* [Netflix data pipeline](https://medium.com/netflix-techblog/evolution-of-the-netflix-data-pipeline-da246ca36905).
+* [Netflix data videos](https://www.youtube.com/channel/UC00QATOrSH4K2uOljTnnaKw).
+* [Yelp data pipeline](https://engineeringblog.yelp.com/2016/07/billions-of-messages-a-day-yelps-real-time-data-pipeline.html).
+* [Gusto data pipeline](https://engineering.gusto.com/building-a-data-informed-culture/).
+* [500px data pipeline](https://medium.com/@samson_hu/building-analytics-at-500px-92e9a7005c83).
+* [Twitter data pipeline](https://blog.twitter.com/engineering/en_us/topics/insights/2018/ml-workflows.html).
+* [Coursera data pipeline](https://medium.com/@zhaojunzhang/building-data-infrastructure-in-coursera-15441ebe18c2).
+* [Cloudflare data pipeline](https://blog.cloudflare.com/how-cloudflare-analyzes-1m-dns-queries-per-second/).
+* [Pandora data pipeline](https://engineering.pandora.com/apache-airflow-at-pandora-1d7a844d68ee).
+* [Heroku data pipeline](https://medium.com/@damesavram/running-airflow-on-heroku-ed1d28f8013d).
+* [Zillow data pipeline](https://www.zillow.com/data-science/airflow-at-zillow/).
+* [Airbnb data pipeline](https://medium.com/airbnb-engineering/https-medium-com-jonathan-parks-scaling-erf-23fd17c91166).
+* [Walmart data pipeline](https://medium.com/walmartlabs/how-we-built-a-data-pipeline-with-lambda-architecture-using-spark-spark-streaming-9d3b4b4555d3).
+* [Robinhood data pipeline](https://robinhood.engineering/why-robinhood-uses-airflow-aed13a9a90c8).
+* [Lyft data pipeline](https://eng.lyft.com/running-apache-airflow-at-lyft-6e53bb8fccff).
+* [Slack data pipeline](https://speakerdeck.com/vananth22/operating-data-pipeline-with-airflow-at-slack).
+* [Remind data pipeline](https://medium.com/@RemindEng/beyond-a-redshift-centric-data-model-1e5c2b542442).
+* [Wish data pipeline](https://medium.com/wish-engineering/scaling-analytics-at-wish-619eacb97d16).
+* [Databricks data pipeline](https://databricks.com/blog/2017/03/31/delivering-personalized-shopping-experience-apache-spark-databricks.html).