Google Machine Learning – Study Notes


This post in the study note series covers Google Cloud Machine Learning. All details were accurate at the time of writing; please refer to Google for current details.


These are general definitions that are frequently used:

Label: The true answer
Input: Predictor variable(s); what you can use to predict the label
Example: An input plus its corresponding label
Model: Math function that takes input variables and creates an approximation to the label
Prediction: Using the model on unlabelled data
Regression: Continuous labels (e.g. size of tip)
Classification: Discrete labels (e.g. gender)
Linear Model: Neural network with no hidden layers
Gradient Descent: The process of reducing error; used to find the best parameters (weights and biases)
Weights/bias: Parameters we optimize
Batch size: The amount of data we compute error on
Epoch: One pass through the entire dataset
Evaluation: Is the model good enough? Has to be done on the full dataset
Training: Process of optimizing the weights; includes gradient descent + evaluation
Mean Square Error: The loss measure for regression problems
Cross-entropy: The loss measure for classification problems
Accuracy: A more intuitive measure of skill for classifiers
Precision: Accuracy when the classifier says “yes” (useful for unbalanced classes where there are many more yes-es than no-es)
Recall: Accuracy when the truth is “yes” (useful for unbalanced classes where there are very few yes-es)
ROC curve: A way to pick the threshold (of the probability that is output by the classifier) at which a specific precision or recall is reached. The area under the curve (AUC) is a threshold-independent measure of skill.
DG: Directed Graph
DNN: Deep Neural Network


There are two types of models: supervised and unsupervised.

Supervised: Has labels, i.e. the correct answer
Unsupervised: No labels
Model types

Unsupervised ML is all about discovery, not prediction


For business users, use a Confusion Matrix as it is more intuitive. A confusion matrix represents the percentage of times each label was predicted for each label in the training set during evaluation.
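As an illustration, a confusion matrix expressed as row percentages can be computed with a few lines of plain Python. This is a minimal sketch: the `confusion_matrix` helper and the cat/dog data are made up for the example and are not part of any Google tool.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Return {true_label: {predicted_label: percentage}} using row percentages."""
    labels = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = {}
    for t in labels:
        row_total = sum(counts[(t, p)] for p in labels)
        matrix[t] = {
            p: (100.0 * counts[(t, p)] / row_total if row_total else 0.0)
            for p in labels
        }
    return matrix


# Hypothetical evaluation results for a two-class problem.
truth = ["cat", "cat", "cat", "dog", "dog"]
preds = ["cat", "cat", "dog", "dog", "dog"]
matrix = confusion_matrix(truth, preds)

for true_label, row in matrix.items():
    print(true_label, {p: round(pct, 1) for p, pct in row.items()})
```

Each row shows, for one true label, how often each label was predicted, which is usually easier to discuss with business users than a single score.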


  • To calculate the regression error, use the Mean Squared Error (MSE)
  • To calculate the classification error, use cross-entropy
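The two loss measures can be sketched in plain Python as follows. This assumes a binary classifier that outputs probabilities; the function names and the `eps` clipping value are choices made for this example only.

```python
import math


def mean_squared_error(labels, predictions):
    """Average of squared differences; the usual loss for regression."""
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)


def binary_cross_entropy(labels, probabilities, eps=1e-12):
    """Average negative log-likelihood; the usual loss for binary classification."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])  # (1 + 4) / 2
bce = binary_cross_entropy([1, 0], [0.5, 0.5])    # equals ln(2) for a coin-flip classifier
print(mse, bce)
```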


Regularization is a form of regression that constrains, regularizes, or shrinks the coefficient estimates towards zero. In other words, this technique discourages learning a more complex or flexible model, so as to avoid the risk of overfitting.


Data sets need to be balanced, with a similar number of examples of each scenario. If a data set is not balanced, accuracy alone can be misleading, so you need to use precision and recall.

Precision: Positive Predictive Value (TP/(TP + FP))
Recall: True Positive Rate (TP/(TP + FN))
Precision vs Recall
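The two formulas translate directly into code. A minimal sketch, where the counts of true positives (TP), false positives (FP) and false negatives (FN) are made-up numbers:

```python
def precision(tp, fp):
    """TP / (TP + FP): of everything predicted positive, how much was right."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN): of everything actually positive, how much was found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts from an evaluation run.
tp, fp, fn = 8, 2, 4
p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision={p:.2f} recall={r:.2f}")
```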


  • Precision measures how accurate the model is when it predicts a positive. In other words, of all the “yes” outputs, how many are truly “yes”.
  • Recall measures how many of the actual positives the model finds. In other words, of all the cases where the truth is “yes”, how many the model outputs as “yes”.
  • Gradient Descent is an optimization algorithm to find the minimal value of a function. Gradient descent is used to find the minimum of the RMSE or another cost function.
  • Dropout is a regularization method that removes a random selection of a fixed number of units in a neural network layer during training. The more units dropped out, the stronger the regularization.
  • If a model is overfitting, increasing regularization can increase the Area Under the Curve (AUC) on evaluation data.
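To make gradient descent concrete, here is a minimal sketch that fits a single weight `w` in the model `y = w * x` by minimizing the mean squared error. The data, learning rate and step count are arbitrary choices for illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=200):
    """Fit y = w * x by repeatedly stepping against the gradient of the MSE."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # d/dw of mean((w*x - y)^2) is mean(2 * (w*x - y) * x)
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


# Data generated from y = 2x, so the best weight is 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w = gradient_descent(xs, ys)
print(w)
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small needs many more steps to converge.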

Combining Approaches

  • High confusion, low AUC scores, or low precision and recall scores can indicate that your model needs additional training data or has inconsistent labels. A very high AUC score and perfect precision and recall can indicate that the data is too easy and may not generalize well.


Make sure models neither underfit nor overfit. To solve overfitting, the following help improve the model’s quality:

  • Increase the number of examples: the more data a model is trained with, the more use cases it sees during training and the better its predictions become.
  • Tune hyperparameters, such as the number and size of hidden layers (for neural networks), and apply regularization, which means using techniques that make your model simpler, such as dropout to remove units from the network or adding “penalty” terms to the cost function.
  • Remove irrelevant features. Feature engineering is a wide subject and feature selection is a critical part of building and training a model. Some algorithms have built-in feature selection, but in some cases data scientists need to manually select or remove features for debugging and finding the best model output.

Integer Encoding

As a first step, each unique category value is assigned an integer value.

For example, “red” is 1, “green” is 2, and “blue” is 3.

This is called label encoding or integer encoding.
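A minimal sketch of integer encoding in plain Python, assigning integers in order of first appearance. The `integer_encode` helper is hypothetical; libraries such as scikit-learn offer their own encoders, which may number categories differently.

```python
def integer_encode(values):
    """Map each unique category to an integer, starting at 1."""
    mapping = {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping) + 1
    return mapping, [mapping[v] for v in values]


mapping, encoded = integer_encode(["red", "green", "blue", "green"])
print(mapping)  # {'red': 1, 'green': 2, 'blue': 3}
print(encoded)  # [1, 2, 3, 2]
```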


One-Hot Encoding

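For categorical variables where no ordinal relationship exists, integer encoding is not enough, because the model may assume a natural ordering between the categories. One-hot encoding instead represents each category with its own binary variable.

For example, with three colour categories, “red” becomes [1, 0, 0], “green” becomes [0, 1, 0], and “blue” becomes [0, 0, 1].

One-Hot Encoding Example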
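One-hot encoding can be sketched in plain Python as below. The `one_hot_encode` helper and the fixed category order are assumptions made for this example.

```python
def one_hot_encode(values, categories):
    """Return one binary vector per value, with a 1 in that value's category slot."""
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows


rows = one_hot_encode(["red", "blue"], ["red", "green", "blue"])
print(rows)  # [[1, 0, 0], [0, 0, 1]]
```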


Batch size influences the best learning rate. For both there is a Goldilocks value, which is often hard to find.

Feature Crosses

Feature crosses are engineered based on our understanding of the problem, e.g. combine “yellow” and “NYC” to capture that yellow cars in NYC are always cabs.
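One simple way to picture a feature cross is as a new categorical feature built from every combination of two existing ones, which is then one-hot encoded. The sketch below uses made-up vocabularies and a hypothetical `crossed_one_hot` helper; it is not TensorFlow's feature column API.

```python
from itertools import product

# Hypothetical vocabularies for the two original features.
colours = ["yellow", "white"]
cities = ["NYC", "LA"]

# The crossed vocabulary has one entry per (colour, city) combination.
cross_vocab = [f"{colour}_x_{city}" for colour, city in product(colours, cities)]


def crossed_one_hot(colour, city):
    """One-hot encode the combination so a model can weight it separately."""
    key = f"{colour}_x_{city}"
    return [1 if key == entry else 0 for entry in cross_vocab]


vec = crossed_one_hot("yellow", "NYC")
print(cross_vocab)
print(vec)  # [1, 0, 0, 0]
```

A linear model can then learn one weight for "yellow in NYC" that it could not express from colour and city separately.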


DNNs are good for image-based problems, where the data points are dense and correlated.


When looking at a problem

  • Choose one attribute that needs to be predicted (i.e. the label)
  • Choose other attributes that describe the label (i.e. the features)

Split into

  • Training Data
  • Validation Data
  • Test Data (Independent Test Data)

If this is not possible

  • Training Data
  • Validation Data (cross validate)
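A three-way split can be sketched as follows. The 80/10/10 fractions, the fixed seed and the `split_dataset` helper are assumptions for illustration; the right proportions depend on how much data is available.

```python
import random


def split_dataset(examples, train_frac=0.8, validation_frac=0.1, seed=42):
    """Shuffle once, then cut into training, validation and test portions."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # held back until the very end
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))  # 80 10 10
```

When there is too little data for an independent test set, the same shuffling idea is applied repeatedly with different validation folds (cross-validation).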

Cloud AutoML

Cloud AutoML is a newer technology for automatically creating ML models. The API includes:

  • Vision
  • Speech
  • Jobs
  • Translation
  • Natural Language

Cloud AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models specific to their business needs. It relies on Google’s state-of-the-art transfer learning and neural architecture search technology.


Training Models

Open the AutoML Vision UI and click the lightbulb icon in the left navigation bar to display the available models.
To view the models for a different project, select the project from the drop-down list in the upper right of the title bar.

  • Click the row for the model you want to evaluate.
  • If necessary, click the Evaluate tab just below the title bar.
    If training has been completed for the model, AutoML Vision shows its evaluation metrics.
  • To view the metrics for a specific label, select the label name from the list of labels in the lower part of the page.


If you’re not happy with the quality levels, you can go back to earlier steps to improve the quality:

  • AutoML Vision allows you to sort the images by how “confused” the model is, by the true label and its predicted label. Look through these images and make sure they’re labelled correctly.
  • Consider adding more images to any labels with low quality.
  • You may need to add different types of images (e.g. wider angle, higher or lower resolution, different points of view).
  • Consider removing labels altogether if you don’t have enough training images.
  • Remember that machines can’t read your label name; it’s just a random string of letters to them. If you have one label that says “door” and another that says “door_with_knob” the machine has no way of figuring out the nuance other than the images you provide it.
  • Augment your data with more examples of true positives and negatives. Especially important examples are the ones that are close to the decision boundary (i.e. likely to produce confusion, but still correctly labelled).
  • Specify your own TRAIN, TEST, VALIDATION split. The tool randomly assigns images, but near-duplicates may end up in TRAIN and VALIDATION which could lead to overfitting and then poor performance on the TEST set.
  • Once you’ve made changes, train and evaluate a new model until you reach a high enough quality level.


  • TensorFlow does lazy evaluation by default: you write a Directed Graph (DG) and then run the DG in a session to get a result. Often used in production
  • TensorFlow can also do eager evaluation (tf.eager): operations run immediately and return a result. Often used in development
  • TensorFlow allows for autoscaling

NumPy is a natural starting point for programming with TF because the two offer similar array operations. NumPy is quicker to experiment with, as evaluation is immediate. Two options to code with are:

  • np.array, np.add (immediate)
  • tf.constant, tf.add (lazy)
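The difference between immediate and lazy evaluation can be illustrated with a toy sketch in plain Python. It deliberately uses neither NumPy nor TensorFlow: the `Constant` and `Add` classes are hypothetical stand-ins for graph nodes, and calling `run()` plays the role of running the graph in a session.

```python
class Constant:
    """A graph node that simply holds a list of numbers."""

    def __init__(self, value):
        self.value = value

    def run(self):
        return self.value


class Add:
    """A graph node that adds its two inputs element-wise, but only when run."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def run(self):
        return [a + b for a, b in zip(self.left.run(), self.right.run())]


# Immediate (NumPy-style): the result exists as soon as the line executes.
eager_result = [a + b for a, b in zip([5, 3, 8], [3, -1, 2])]

# Lazy (graph-style): building the graph computes nothing yet.
graph = Add(Constant([5, 3, 8]), Constant([3, -1, 2]))
lazy_result = graph.run()  # the work happens here

print(eager_result, lazy_result)
```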

TF can:

  • Distribute computation. Like Java, it can run on any hardware
  • Read shared data using a TextLineDataset
  • When using many workers, make sure they don’t all see the same data by using dataset.shuffle

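  import tensorflow as tf

  # estimator, train_input_fn and eval_input_fn are assumed to be defined elsewhere.
  # Train for a fixed number of steps, evaluating along the way.
  train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=1000)
  eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)
  tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)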



Think “steps”, not “epochs” with production-ready, distributed models.

  • Gradient updates from slow workers could get ignored
  • When retraining a model with fresh data, we’ll resume from the earlier number of steps (and the corresponding hyperparameters)
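When thinking in steps, it can help to convert from epochs. A rough sketch, assuming one step processes exactly one batch; `steps_for_epochs` is a hypothetical helper, not a TensorFlow function, and the dataset size is made up.

```python
import math


def steps_for_epochs(num_examples, batch_size, epochs):
    """Number of training steps needed to see the dataset `epochs` times."""
    return math.ceil(num_examples * epochs / batch_size)


# 10,000 examples in batches of 100, for 10 passes over the data.
steps = steps_for_epochs(num_examples=10000, batch_size=100, epochs=10)
print(steps)  # 1000
```

The resulting number is what would be passed as `max_steps`, so the amount of training stays well defined even when the dataset grows or workers progress at different speeds.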


The EvalSpec controls both the evaluation and the checkpointing of the model, since they happen at the same time. Checkpointing is an essential part of eval. Think of eval as exporting to TensorBoard.


TensorBoard is a collection of visualization tools designed specifically to help you visualize TensorFlow.

  • TensorFlow graph
  • Plot quantitative metrics
  • Pass and graph additional data


Your HyperParameters are the variables that govern the training process itself. For example, part of setting up a deep neural network is deciding how many hidden layers of nodes to use between the input layer and the output layer, and how many nodes each layer should use. These variables are not directly related to the training data. They are configuration variables. Note that parameters change during a training job, while hyperparameters are usually constant during a job.

Hyperparameter tuning is a vital part of tuning a model. Often, input values are chosen arbitrarily. Instead, parse them from command-line parameters and send them along to train_and_evaluate:

  • Number of hidden layers
  • Number of nodes in hidden layers

You can also parameterise the output directory so that the results are not overwritten.
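Parsing hyperparameters from the command line might look like the sketch below, using Python's standard `argparse` module. The flag names (`--hidden_layers`, `--nodes_per_layer`, `--output_dir`) and their defaults are assumptions for this example, not names required by any Google service.

```python
import argparse


def parse_args(argv=None):
    """Parse hyperparameters so each training run can use different values."""
    parser = argparse.ArgumentParser(description="Training hyperparameters")
    parser.add_argument("--hidden_layers", type=int, default=2,
                        help="number of hidden layers")
    parser.add_argument("--nodes_per_layer", type=int, default=64,
                        help="number of nodes in each hidden layer")
    parser.add_argument("--output_dir", type=str, default="./trained_model",
                        help="where checkpoints and results are written")
    return parser.parse_args(argv)


# Simulate: python train.py --hidden_layers 3 --nodes_per_layer 128 --output_dir ./run_01
args = parse_args(["--hidden_layers", "3",
                   "--nodes_per_layer", "128",
                   "--output_dir", "./run_01"])

# A list such as [128, 128, 128] is a common way to describe hidden layer sizes.
hidden_units = [args.nodes_per_layer] * args.hidden_layers
print(hidden_units, args.output_dir)
```

Giving each run its own output directory keeps the results of different hyperparameter combinations from overwriting each other.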

Online prediction:

  • Optimized to minimize the latency of serving predictions.
  • Can process one or more instances per request.
  • Predictions returned in the response message.
  • Input data passed directly as a JSON string.
  • Returns as soon as possible.
  • Accounts with the following IAM roles can request online predictions: Legacy Editor or Viewer; AI Platform Admin or Developer.
  • Runs on the runtime version and in the region selected when you deploy the model.
  • Runs models deployed to AI Platform.
  • Can serve predictions from a TensorFlow SavedModel or a custom prediction routine (beta).
  • $0.0401 to $0.1349 per node hour (Americas). Price depends on machine type selection.

Batch prediction:

  • Optimized to handle a high volume of instances in a job and to run more complex models.
  • Can process one or more instances per request.
  • Predictions written to output files in a Cloud Storage location that you specify.
  • Input data passed indirectly as one or more URIs of files in Cloud Storage locations.
  • Asynchronous request.
  • Accounts with the following IAM roles can request batch predictions: Legacy Editor; AI Platform Admin or Developer.
  • Can run in any available region, using any available runtime version, though you should run with the defaults for deployed model versions.
  • Runs models deployed to AI Platform or models stored in accessible Google Cloud Storage locations.
  • Can serve predictions from a TensorFlow SavedModel.
  • $0.0791 per node hour (Americas).


Saving your configuration

How you specify your cluster configuration depends on how you plan to run your training job:


Create a YAML configuration file representing the TrainingInput object, and specify the scale tier identifier and machine types in the configuration file. You can name this file whatever you want. By convention the name is config.yaml.

The following example shows the contents of the configuration file, config.yaml, for a job with a custom processing cluster.

trainingInput:
  scaleTier: CUSTOM
  masterType: complex_model_m
  workerType: complex_model_m
  parameterServerType: large_model
  workerCount: 9
  parameterServerCount: 3


In Datalab, start locally on a sampled dataset, then scale it out to GCP using serverless technology.


Dialogflow is powered by Google’s machine learning. It incorporates Google’s machine learning expertise and products such as Google Cloud Speech-to-Text.


Kubeflow is a free and open-source software platform developed by Google and first released in 2018. Kubeflow is designed to develop machine learning applications, e.g. using TensorFlow, and to deploy these to Kubernetes. (Source: Wikipedia)

Cloud Speech-to-Text

Synchronous Recognition (REST and gRPC) sends audio data to the Speech-to-Text API, performs recognition on that data, and returns results after all audio has been processed. Synchronous recognition requests are limited to audio data of 1 minute or less in duration.

Asynchronous Recognition (REST and gRPC) sends audio data to the Speech-to-Text API and initiates a Long Running Operation. Using this operation, you can periodically poll for recognition results. Use asynchronous requests for audio data of any duration up to 480 minutes.

Streaming Recognition (gRPC only) performs recognition on audio data provided within a gRPC bi-directional stream. Streaming requests are designed for real-time recognition purposes, such as capturing live audio from a microphone. Streaming recognition provides interim results while audio is being captured, allowing results to appear, for example, while a user is still speaking.
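The choice between the three recognition methods can be summarised as a small decision helper. This is only a sketch of the limits quoted above (1 minute for synchronous, 480 minutes for asynchronous); `choose_recognition_mode` is a hypothetical function and does not call the Speech-to-Text API.

```python
SYNC_LIMIT_SECONDS = 60          # synchronous: audio of 1 minute or less
ASYNC_LIMIT_SECONDS = 480 * 60   # asynchronous: audio up to 480 minutes


def choose_recognition_mode(duration_seconds, live=False):
    """Pick a recognition method from the audio length and whether it is live."""
    if live:
        return "streaming"
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "synchronous"
    if duration_seconds <= ASYNC_LIMIT_SECONDS:
        return "asynchronous"
    raise ValueError("audio exceeds the 480-minute asynchronous limit")


print(choose_recognition_mode(30))            # synchronous
print(choose_recognition_mode(600))           # asynchronous
print(choose_recognition_mode(0, live=True))  # streaming
```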
