ML Problem Framing and Data Strategy
Before building a model, frame the problem correctly: What is the prediction target? What labels do you have? What are the business metrics, and how do they relate to ML metrics (accuracy, AUC, RMSE)? Is this classification, regression, ranking, or generation? Could rules or simple statistics solve the problem without ML at all? Data strategy: structured data (Cloud SQL, BigQuery, Spanner), unstructured data (Cloud Storage for images/audio/video/text). Feature engineering in BigQuery ML or Vertex AI Feature Store (serves features consistently between training and serving, avoiding training-serving skew). Data validation with TensorFlow Data Validation (TFDV), surfaced in TFX (TensorFlow Extended) pipelines as the ExampleValidator component, detects schema drift and anomalies.
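The schema-validation idea can be sketched without any GCP dependencies. This is a toy illustration of what TFDV-style tools do; `infer_schema` and `validate` are local helpers invented here, not the TFDV API:

```python
# Minimal illustration of schema inference and validation, in the spirit of
# TensorFlow Data Validation. All names here are local helpers, not TFDV APIs.

def infer_schema(rows):
    """Infer a {column: type} schema from training rows (list of dicts)."""
    schema = {}
    for row in rows:
        for col, val in row.items():
            schema.setdefault(col, type(val))
    return schema

def validate(rows, schema):
    """Return a list of anomalies: missing columns or type mismatches."""
    anomalies = []
    for i, row in enumerate(rows):
        for col, expected in schema.items():
            if col not in row:
                anomalies.append(f"row {i}: missing column '{col}'")
            elif not isinstance(row[col], expected):
                anomalies.append(f"row {i}: '{col}' expected "
                                 f"{expected.__name__}, got {type(row[col]).__name__}")
    return anomalies

train = [{"age": 34, "country": "DE"}, {"age": 51, "country": "FR"}]
serving = [{"age": "29", "country": "US"}, {"country": "GB"}]

schema = infer_schema(train)
print(validate(serving, schema))
# Flags the string-typed "age" (schema drift) and the missing "age" column.
```

Real tools add much more (statistics comparison, environment-specific schemas), but the core contract is the same: infer a schema from training data, then validate serving data against it before it reaches the model.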
Vertex AI: Training and Model Registry
Vertex AI is GCP's unified ML platform. Custom Training: run training code in a managed container on CPUs, GPUs, or TPUs. Training pipelines in Vertex AI Pipelines (Kubeflow Pipelines SDK or TFX) orchestrate multi-step workflows with automatic caching and artifact tracking. Hyperparameter tuning: Vertex AI Vizier (Bayesian optimisation) explores the hyperparameter space more efficiently than grid or random search. Vertex AI Experiments tracks runs, parameters, and metrics for comparison. Model Registry: versioned model artefacts with aliases (production, staging, challenger) — separates model management from deployment. Distributed training: data parallelism (same model on multiple workers, each sees a different batch), model parallelism (split model layers across devices for models too large for one device). MirroredStrategy (single node, multiple GPUs) versus MultiWorkerMirroredStrategy (multiple nodes).
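Data parallelism rests on a simple identity: with equal-sized shards, averaging the per-worker gradients (the all-reduce step that MirroredStrategy performs for you) reproduces the full-batch gradient exactly. A NumPy sketch of that identity (illustrative only, not the tf.distribute API):

```python
import numpy as np

# Data parallelism in miniature: each "worker" computes the gradient of a
# mean-squared-error loss on its own shard; averaging the per-worker
# gradients (what an all-reduce does) recovers the full-batch gradient
# exactly when the shards are equal-sized.

rng = np.random.default_rng(0)
X, y = rng.normal(size=(8, 3)), rng.normal(size=8)
w = np.zeros(3)

def grad(Xs, ys, w):
    # d/dw mean((Xs @ w - ys)^2) = (2/n) * Xs.T @ (Xs @ w - ys)
    return 2.0 / len(ys) * Xs.T @ (Xs @ w - ys)

full = grad(X, y, w)

# Two workers, each holding half of the batch.
shard_grads = [grad(X[:4], y[:4], w), grad(X[4:], y[4:], w)]
averaged = np.mean(shard_grads, axis=0)   # the all-reduce step

print(np.allclose(full, averaged))  # True
```

Model parallelism has no such shortcut: layers live on different devices and activations must flow between them, which is why it is reserved for models that genuinely do not fit on one device.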
Model Deployment and Serving
Vertex AI Endpoints: deploy model versions with traffic splits for A/B testing and canary rollouts. Dedicated online endpoints stay always on (autoscaling adjusts replica count with load but keeps at least one replica provisioned); batch prediction runs without a standing endpoint. Online prediction: low-latency, single-record requests. Batch prediction: high-throughput, asynchronous, for scoring large datasets. Model optimisation for serving: quantisation (FP32 to INT8 reduces model size and improves latency, with some accuracy trade-off), distillation (train a smaller student model to mimic a larger teacher), TensorRT or ONNX Runtime for GPU inference optimisation. Feature latency: pre-compute slow features offline, serve fast features online from Memorystore. Explainability: Vertex Explainable AI provides feature attributions using SHAP (Shapley values) or Integrated Gradients. Required for regulated industries and for debugging unexpected model behaviour.
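Affine INT8 quantisation can be sketched in a few lines of NumPy. This is a toy per-tensor version; production toolchains such as TensorRT and TFLite typically quantise per channel and calibrate activations as well:

```python
import numpy as np

# Post-training quantisation in miniature: map FP32 weights to INT8 with a
# per-tensor scale and zero point (asymmetric affine quantisation), then
# dequantise and measure the error.

rng = np.random.default_rng(42)
w = rng.normal(scale=0.2, size=1000).astype(np.float32)

qmin, qmax = -128, 127
scale = (w.max() - w.min()) / (qmax - qmin)
zero_point = int(round(qmin - w.min() / scale))

q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.int8)
w_hat = (q.astype(np.float32) - zero_point) * scale   # dequantised weights

print(q.nbytes, w.nbytes)              # 1000 vs 4000 bytes: 4x smaller
print(float(np.abs(w - w_hat).max()))  # error on the order of `scale`
```

The 4x size reduction is exact (one byte per weight instead of four); the latency win comes from INT8 arithmetic on hardware that supports it, and the accuracy trade-off is the per-weight rounding error bounded by the step size.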
MLOps: Pipelines, Monitoring, and Governance
MLOps maturity levels: Level 0 (manual, notebook-driven), Level 1 (automated training pipeline, triggered by schedule or data drift), Level 2 (full CI/CD for ML — code changes trigger pipeline, evaluation gates before promotion). Model monitoring: Vertex AI Model Monitoring detects training-serving skew (difference between training data distribution and live prediction input distribution) and prediction drift (change in model output distribution over time). Alerts trigger retraining pipelines. Data freshness: stale feature data degrades model performance before accuracy metrics detect it. Governance: Dataplex for data cataloguing and lineage, BigQuery Authorized Views for column-level access control on training data, Vertex AI Model Cards for model documentation. Privacy-preserving ML: differential privacy (add calibrated noise during training), federated learning (train on device, aggregate model updates centrally — no raw data leaves the device).
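Prediction drift detection can be illustrated with the Population Stability Index, a common drift statistic; the 0.2 threshold below is an industry rule of thumb, not a Vertex AI Model Monitoring default:

```python
import numpy as np

# Prediction drift in miniature: compare the model's output distribution at
# deployment time against the live output distribution using the Population
# Stability Index (PSI). PSI > 0.2 is a widely used rule of thumb for
# "significant drift, investigate or retrain".

def psi(expected, actual, bins=10):
    edges = np.histogram_bin_edges(expected, bins=bins)
    e = np.histogram(expected, bins=edges)[0] / len(expected)
    a = np.histogram(actual, bins=edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(7)
baseline = rng.beta(2, 5, size=10_000)   # scores at deployment time
live = rng.beta(2, 5, size=10_000)       # same distribution: no drift
shifted = rng.beta(5, 2, size=10_000)    # population changed: drift

print(psi(baseline, live) < 0.2)     # True: no alert
print(psi(baseline, shifted) > 0.2)  # True: trigger retraining pipeline
```

The same statistic applied to input features instead of outputs detects training-serving skew; wiring the alert to launch a Vertex AI Pipelines run is what moves a team from maturity Level 0 to Level 1.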