From ab30db2f699efb9a4d2c71209c3a458155f25ff8 Mon Sep 17 00:00:00 2001
From: "Mia von Steinkirch, PhD, MSc" <1130416+bt3gl@users.noreply.github.com>
Date: Mon, 2 Mar 2020 16:31:50 -0800
Subject: [PATCH] Update data_engineering.md

---
 data_engineering.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/data_engineering.md b/data_engineering.md
index 4c3d403..87b506f 100644
--- a/data_engineering.md
+++ b/data_engineering.md
@@ -3,10 +3,27 @@
 ### Articles
 
 * [Machine Learning System Design](https://medium.com/@ricomeinl/machine-learning-system-design-f2f4018f2f8).
+* [How to Code Neat Machine Learning Pipelines](https://www.neuraxio.com/en/blog/neuraxle/2019/10/26/neat-machine-learning-pipelines.html).
+
+
+### Courses
+
+* [Coursera's Big Data Pipeline course](https://www.coursera.org/lecture/big-data-integration-processing/big-data-processing-pipelines-c4Wyd).
+
 
 ### Tools
 
 * [Airflow resources](https://github.com/bt3gl/Curated_ETL-and-ML-Pipelines/blob/master/airflow.md).
+* [Data science resources](https://github.com/davidyakobovitch/data_science_resources).
+* [AWS Data pipeline samples](https://github.com/aws-samples/data-pipeline-samples/tree/master/samples).
+* [Lorte data pipelining](https://github.com/instacart/lore).
+
+
+### Videos
+
+* [Industrial Machine Learning Talk](https://www.youtube.com/watch?v=3JYDT8lap5U).
+
+
 
 ### Enterprise Solutions
 
@@ -29,3 +46,6 @@
 * [Remind data pipeline](https://medium.com/@RemindEng/beyond-a-redshift-centric-data-model-1e5c2b542442).
 * [Wish data pipeline](https://medium.com/wish-engineering/scaling-analytics-at-wish-619eacb97d16).
 * [Databrick data pipeline](https://databricks.com/blog/2017/03/31/delivering-personalized-shopping-experience-apache-spark-databricks.html).
+
+
+