Serverless Machine Learning
Overview
Duration is 1 min
In this lab, you go from exploring a taxicab dataset to training and deploying a high-accuracy distributed model with Cloud ML Engine.
What you'll need
To complete this lab, you'll need:
- Access to a standard internet browser (Chrome browser recommended).
- Time. Note the lab's Duration in the Lab Details tab in Qwiklabs; this is an estimate of the time it should take to complete all steps. Plan your schedule so you have time to finish the lab. Once you start the lab, you cannot pause and return later (each time you start a lab, you begin at step 1).
- You do NOT need a Google Cloud Platform account or project. An account, project and associated resources are provided to you as part of this lab.
- If you already have your own GCP account, make sure you do not use it for this lab.
- If your lab prompts you to log into the console, use only the student account provided to you by the lab. This prevents you from incurring charges for lab activities in your personal GCP account.
Before accessing the Cloud Console, log out of all other Google / Gmail accounts you may be logged in with. If this is not possible, use a new Incognito window (Chrome) or another browser for the Qwiklabs session.
Start your lab
Note the Setup time in the Lab Details tab in Qwiklabs. That's how long it will take for the lab account to build its resources. You can track your lab's progress with the status bar at the top of your screen.
When you are ready, click Start Lab.
Important: What is happening during this time?
Your lab is spinning up GCP resources for you behind the scenes, including an account, a project, resources within the project, and permission for you to control the resources you will need to run the lab. This means that instead of spending time manually setting up a project and building resources from scratch as part of your lab, you can more quickly begin learning.
Find Your Lab's GCP Username and Password
To access the resources and console for this lab, locate the Connection Details panel in Qwiklabs. Here you will find the account ID and password for the account you will use to log in to the Google Cloud Platform:
If your lab provides other resource identifiers or connection-related information, it will appear on this panel as well.
What you learn
In this series of labs, you go from exploring a taxicab dataset to training and deploying a high-accuracy distributed model with Cloud ML Engine.
Setup
Activate Google Cloud Shell
From the GCP Console click the icon (as depicted below) on the top right toolbar:
Then click "Start Cloud Shell" as shown here:
It should only take a few moments to provision and connect to the environment:
This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory and runs on Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this lab can be done with just a browser or a Google Chromebook.
Once connected to the cloud shell, you should see that you are already authenticated and that the project is already set to your PROJECT_ID:
gcloud auth list
Command output
Credentialed accounts:
- <myaccount>@<mydomain>.com (active)
Note: gcloud is the powerful and unified command-line tool for Google Cloud Platform. Full documentation is available at https://cloud.google.com/sdk/gcloud. It comes pre-installed on Cloud Shell and supports tab-completion.
gcloud config list project
Command output
[core]
project = <PROJECT_ID>
If the project is not set correctly, you can set it with this command:
gcloud config set project <PROJECT_ID>
Command output
Updated property [core/project].
Launch Cloud Datalab
Duration is 2 min
To launch Cloud Datalab:
Step 1
In Cloud Shell, type:
gcloud compute zones list
Pick a zone in a region geographically close to you.
Step 2
In Cloud Shell, type:
datalab create dataengvm --zone <ZONE>
Datalab will take about 5 minutes to start.
If you are not yet familiar with Datalab, what follows is a graphical cheat sheet for the main Datalab functionality:
Move on to the next step.
Checkout notebook into Cloud Datalab
Duration is 5 min
Step 1
If necessary, wait for Datalab to finish launching. Datalab is ready when you see a message prompting you to do a "Web Preview".
Step 2
Click on the Web Preview icon on the top-left corner of the Cloud Shell ribbon. Switch to port 8081.
Step 3
In Datalab, click on the icon for "Open ungit" in the top-right ribbon.
Step 4
In the Ungit window, select the text that reads /content/datalab/notebooks, delete the trailing notebooks so that it reads /content/datalab, then press Enter.
In the panel that comes up, type the following as the GitHub repository to Clone from:
https://github.com/GoogleCloudPlatform/training-data-analyst
Then, click on Clone repository.
1. Explore dataset, create ML datasets, create benchmark
Duration is 15 min
In this lab, you will:
- Explore a dataset using BigQuery and Datalab
- Sample the dataset and create training, validation, and testing datasets for local development of TensorFlow models
- Create a benchmark against which to evaluate the performance of your ML models
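To make the benchmark idea concrete, here is a minimal stdlib sketch of the kind of baseline the notebook builds: predict the mean fare for every trip and measure RMSE against it. The fare values below are made-up illustrative numbers, not lab data.

```python
import math

# Hypothetical benchmark: always predict the mean fare.
# A trained model must beat this RMSE to be worth deploying.
fares = [4.5, 8.0, 12.5, 6.0, 9.5]   # illustrative fares, not lab data

mean_fare = sum(fares) / len(fares)  # the "model": predict the mean
rmse = math.sqrt(sum((f - mean_fare) ** 2 for f in fares) / len(fares))

print(f"benchmark prediction: {mean_fare:.2f}, RMSE: {rmse:.2f}")
```

Any model the later labs train should achieve a lower RMSE than this trivial predictor; that is what "a benchmark to evaluate against" means here.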
Step 1
In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/datasets/ and open create_datasets.ipynb.
Step 2
In Datalab, click on Clear | All Cells (click Clear, then in the drop-down menu, select All Cells). Now, read the narrative and execute each cell in turn.
2a. Getting Started with TensorFlow
Duration is 15 min
In this lab, you will learn how the TensorFlow Python API works:
- Building a graph
- Running a graph
- Feeding values into a graph
- Find area of a triangle using TensorFlow
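The triangle-area exercise is a good mental model for what the notebook does with a TensorFlow graph. Here is the same arithmetic (Heron's formula) in plain Python; the notebook expresses it with tf ops and runs it in a session, but the computation is identical.

```python
import math

def triangle_area(a, b, c):
    """Heron's formula: area of a triangle from its three side lengths."""
    s = (a + b + c) / 2.0  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

print(triangle_area(3, 4, 5))  # 3-4-5 right triangle -> 6.0
```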
Step 1
In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open a_tfstart.ipynb.
Step 2
In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.
2b. Machine Learning using tf.learn
Duration is 15 min
In this lab, you will implement a simple machine learning model using tf.learn:
- Read .csv data into a Pandas dataframe
- Implement a Linear Regression model in TensorFlow
- Train the model
- Evaluate the model
- Predict with the model
- Repeat with a Deep Neural Network model in TensorFlow
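As a rough sketch of the train/evaluate/predict cycle the notebook walks through with tf.learn's LinearRegressor, here is a closed-form least-squares fit in pure Python. The data points are illustrative, not from the taxi dataset.

```python
# Toy data, roughly y = 2x (illustrative, not lab data)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.0, 6.2, 7.9]

# "Train": fit y = w*x + b by closed-form least squares
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x

# "Evaluate": mean squared error on the training data
mse = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

# "Predict": apply the fitted model to a new input
pred = w * 5.0 + b
```

tf.learn wraps these three phases in fit(), evaluate(), and predict() calls; the notebook then repeats the same cycle with a deep neural network model.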
Step 1
In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open b_tflearn.ipynb.
Step 2
In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.
2c. TensorFlow on Big Data
Duration is 15 min
In this lab, you will learn how to:
- Read from a potentially large file in batches
- Do a wildcard match on filenames
- Break the one-to-one relationship between inputs and features
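The batching and wildcard ideas can be sketched with the standard library. TensorFlow handles this with file-name queues and batched readers; the stand-in below uses glob plus a generator, writing two small shard files so the example is self-contained.

```python
import csv
import glob
import os
import tempfile

# Write two small CSV "shards" for illustration
tmpdir = tempfile.mkdtemp()
for i in range(2):
    with open(os.path.join(tmpdir, f"taxi-{i}.csv"), "w", newline="") as f:
        csv.writer(f).writerows([[i, j] for j in range(3)])

def batched_rows(pattern, batch_size):
    """Match a filename wildcard, then yield rows in fixed-size batches
    instead of loading every file into memory at once."""
    batch = []
    for path in sorted(glob.glob(pattern)):  # wildcard match on filenames
        with open(path, newline="") as f:
            for row in csv.reader(f):
                batch.append(row)
                if len(batch) == batch_size:
                    yield batch
                    batch = []
    if batch:  # final, possibly short batch
        yield batch

batches = list(batched_rows(os.path.join(tmpdir, "taxi-*.csv"), 4))
```

Six rows across two files, read in batches of four, produce one full batch and one short one; the notebook achieves the same streaming behavior with TensorFlow input readers.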
Step 1
In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open c_batched.ipynb.
Step 2
In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.
2d. Refactor for Distributed training and monitoring
Duration is 15 min
In this lab, you will learn how to:
- Use the Experiment class
- Monitor training using TensorBoard
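At a high level, the Experiment class automates a loop like the sketch below: alternate training steps with periodic evaluation, collecting metrics that a tool like TensorBoard would chart. The "training step" here is a stand-in counter, not real TensorFlow code.

```python
def train_and_evaluate(train_steps, eval_every):
    """Alternate training with periodic evaluation, collecting metrics."""
    loss, history = 10.0, []
    for step in range(1, train_steps + 1):
        loss *= 0.9                      # stand-in for one training step
        if step % eval_every == 0:       # periodic evaluation
            history.append((step, round(loss, 4)))
    return history

metrics = train_and_evaluate(train_steps=10, eval_every=5)
```

In the real lab, the Experiment writes these evaluation metrics as event files that TensorBoard reads to plot training curves.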
Step 1
In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/tensorflow and open d_experiment.ipynb.
Step 2
In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.
3. Getting Started with Cloud ML Engine
Duration is 30 min
In this lab, you will learn how to:
- Package up TensorFlow model
- Run training locally
- Run training on cloud
- Deploy model to cloud
- Invoke model to carry out predictions
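The final step, invoking the deployed model, amounts to sending a JSON body of "instances" to the prediction service. Here is a hedged sketch of building such a request body; the feature names are illustrative placeholders, not the exact schema the notebook uses.

```python
import json

# Hypothetical feature names for one taxi trip (illustrative only)
instances = [
    {"pickuplon": -73.99, "pickuplat": 40.75,
     "dropofflon": -73.98, "dropofflat": 40.76, "passengers": 2},
]

# Online prediction requests wrap the inputs in an "instances" list
request_body = json.dumps({"instances": instances})
```

The notebook sends a body like this to the deployed model (via the gcloud CLI or the API client) and receives one prediction per instance in return.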
Step 1
If you don't already have a bucket on Cloud Storage, create one from the Storage section of the GCP console. Bucket names have to be globally unique.
Step 2
In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/cloudmle and open cloudmle.ipynb.
Step 3
In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.
4. Feature Engineering
Duration is 30 min
In this lab, you will improve the ML model using feature engineering. In the process, you will learn how to:
- Work with feature columns
- Add feature crosses in TensorFlow
- Read data from BigQuery
- Create datasets using Dataflow
- Use a wide-and-deep model
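One of the techniques above, the feature cross, can be sketched in plain Python: bucketize latitude and longitude, then combine the bucket ids into a single categorical key so the model can learn location-specific effects. The bucket boundaries below are illustrative; TensorFlow builds the same thing with bucketized and crossed feature columns.

```python
def bucketize(value, boundaries):
    """Return the index of the first boundary the value falls below."""
    for i, b in enumerate(boundaries):
        if value < b:
            return i
    return len(boundaries)

def lat_lon_cross(lat, lon, lat_buckets, lon_buckets):
    """Cross the two bucket ids into one categorical key."""
    return f"{bucketize(lat, lat_buckets)}_x_{bucketize(lon, lon_buckets)}"

# Illustrative boundaries, not the ones used in the notebook
key = lat_lon_cross(40.75, -73.99,
                    lat_buckets=[40.6, 40.7, 40.8],
                    lon_buckets=[-74.0, -73.95, -73.9])
```

Each distinct key identifies one grid cell; a wide model can then learn a separate weight per cell, which a single raw latitude or longitude feature cannot express.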
Step 1
In Cloud Datalab, click on the Home icon, and then navigate to training-data-analyst/courses/machine_learning/feateng and open feateng.ipynb.
Step 2
In Datalab, click on Clear | All Cells. Now read the narrative and execute each cell in turn.
Your instructor will demo notebooks that contain hyper-parameter tuning and training on 500 million rows of data. The changes to the model are minor -- essentially just command-line parameters -- but the impact on model accuracy is huge.
©Google, Inc. or its affiliates. All rights reserved. Do not distribute.