Core Infrastructure: Compute Engine and GKE
Compute Engine is Google Cloud's IaaS offering: virtual machines running on Google's infrastructure. Key concepts are machine types (general purpose e2 and n2; compute-optimised c2 and c2d; memory-optimised m2 and m3; accelerator-optimised a2 for GPUs), persistent disks (Standard HDD, Balanced SSD, SSD, and Extreme SSD), instance templates (which define the VM configuration), and managed instance groups (MIGs), which use a template to autoscale and auto-heal identical VMs and are usually placed behind a load balancer.
The gcloud CLI is essential for the ACE exam. Know commands such as gcloud compute instances create, gcloud compute ssh, and gcloud compute disks list.
Google Kubernetes Engine (GKE) manages Kubernetes clusters in two modes. In Standard mode you control node configuration. In Autopilot mode Google manages the nodes and you are billed for the resources your pods request rather than per node; it is simpler and recommended for most workloads. Node pools allow different machine types within one cluster, so use dedicated pools for GPU workloads or Spot VMs. Workload Identity lets a Kubernetes service account act as a Google Cloud IAM service account, which eliminates the need for service account key files in pods.
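The relationship between an instance template, a managed instance group, and autoscaling can be sketched as three gcloud invocations. The Python below only assembles and prints the command lines; it does not run them. The resource names (web-template, web-mig), the zone, the Debian image family, and the scaling numbers are illustrative assumptions, not values from this guide.

```python
import shlex


def template_cmd(name, machine_type="e2-medium"):
    """Command that creates an instance template (the VM blueprint)."""
    return [
        "gcloud", "compute", "instance-templates", "create", name,
        f"--machine-type={machine_type}",
        "--image-family=debian-12",       # assumed image, pick your own
        "--image-project=debian-cloud",
    ]


def mig_cmd(name, template, size, zone):
    """Command that creates a zonal managed instance group from a template."""
    return [
        "gcloud", "compute", "instance-groups", "managed", "create", name,
        f"--template={template}",
        f"--size={size}",
        f"--zone={zone}",
    ]


def autoscaling_cmd(name, zone, min_replicas, max_replicas, cpu_target):
    """Command that enables CPU-based autoscaling on an existing MIG."""
    return [
        "gcloud", "compute", "instance-groups", "managed", "set-autoscaling", name,
        f"--zone={zone}",
        f"--min-num-replicas={min_replicas}",
        f"--max-num-replicas={max_replicas}",
        f"--target-cpu-utilization={cpu_target}",
    ]


# Hypothetical names and zone, used only for illustration.
commands = [
    template_cmd("web-template"),
    mig_cmd("web-mig", "web-template", size=2, zone="europe-west1-b"),
    autoscaling_cmd("web-mig", "europe-west1-b", 2, 6, 0.6),
]

for argv in commands:
    # Printed for review only; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

The order matters: the template must exist before the group that references it, and autoscaling is configured on the group, not on the template.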
Networking: VPCs, Firewall Rules, and Load Balancers
Google Cloud networking differs from AWS in important ways. VPCs are global rather than regional, so a single VPC can have subnets in any region, while each subnet is regional. Shared VPC lets multiple projects use a single VPC that is centrally managed in a host project. VPC Network Peering connects VPCs and, as in AWS, is non-transitive.
For hybrid connectivity, Cloud VPN connects on-premises networks to Google Cloud over IPsec; HA VPN uses redundant tunnels to qualify for a 99.99% SLA. Cloud Interconnect comes as Dedicated Interconnect (a physical 10 Gbps or 100 Gbps private connection) or Partner Interconnect (through a Cloud Interconnect partner, for lower bandwidth needs).
Firewall rules are stateful and are targeted at instances via network tags or service accounts. Every VPC has an implied deny-all ingress rule and an implied allow-all egress rule, both at the lowest priority, so any rule you create can override them. Hierarchical firewall policies apply at the organisation or folder level.
Load balancers: the global external HTTP(S) load balancer (layer 7, anycast IP, Cloud CDN integration, URL-based routing; use it for web apps), the TCP Proxy and SSL Proxy load balancers (layer 4, proxy-based, for non-HTTP traffic), the Network Load Balancer (layer 4, regional, pass-through, highest performance), and internal load balancers (reachable only inside the VPC, available for TCP/UDP or HTTP(S)).
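How the implied rules interact with rules you create is easier to see in a small model. The sketch below is a simplified simulation, not Google's implementation: it handles IPv4 only, matches a single port, ignores service-account targets and connection tracking, and the Rule class and decide function are hypothetical names. It does follow the documented ordering: a lower priority number wins, deny beats allow on a tie, and the implied rules sit at priority 65535.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

IMPLIED_PRIORITY = 65535  # priority of the two implied rules


@dataclass(frozen=True)
class Rule:
    direction: str                    # "INGRESS" or "EGRESS"
    action: str                       # "allow" or "deny"
    priority: int                     # 0-65535, lower number wins
    cidr: str                         # source range (ingress) or destination range (egress)
    port: Optional[int] = None        # None matches every port
    target_tag: Optional[str] = None  # None applies to every instance


def decide(rules, direction, peer_ip, port, instance_tags):
    """Return "allow" or "deny" for one new connection."""
    implied_action = "deny" if direction == "INGRESS" else "allow"
    implied = Rule(direction, implied_action, IMPLIED_PRIORITY, "0.0.0.0/0")
    matching = [
        r for r in list(rules) + [implied]
        if r.direction == direction
        and ip_address(peer_ip) in ip_network(r.cidr)
        and (r.port is None or r.port == port)
        and (r.target_tag is None or r.target_tag in instance_tags)
    ]
    # Lowest priority number wins; on a tie, deny sorts before allow.
    best = min(matching, key=lambda r: (r.priority, r.action != "deny"))
    return best.action


# Hypothetical rule set: HTTPS open to the world for "web" instances,
# with one address range blocked by a higher-priority deny rule.
rules = [
    Rule("INGRESS", "allow", 1000, "0.0.0.0/0", port=443, target_tag="web"),
    Rule("INGRESS", "deny", 900, "203.0.113.0/24", target_tag="web"),
]

print(decide(rules, "INGRESS", "198.51.100.7", 443, {"web"}))
print(decide(rules, "INGRESS", "198.51.100.7", 22, {"web"}))
```

Because the implied rule always matches, the list of candidates is never empty: traffic that no explicit rule covers falls through to deny on ingress and allow on egress.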
Storage and Databases for ACE
ACE covers the operational use of Google Cloud storage services.
Cloud Storage: create buckets (names are globally unique), choose a storage class based on access frequency, configure lifecycle rules (for example, transition to Coldline after 30 days and delete after 365 days), enable versioning to protect against accidental deletion, and grant access with IAM at the project or bucket level (prefer uniform bucket-level access to per-object ACLs). gsutil is the CLI tool for Cloud Storage operations: gsutil cp, gsutil rsync, gsutil ls, gsutil iam ch. Google now also offers equivalent gcloud storage commands.
Cloud SQL is a managed relational database: create instances, configure machine type and storage, enable automatic storage increases, and configure read replicas and high availability (regional) configurations. The Cloud SQL Auth Proxy provides a secure connection to Cloud SQL without allowlisting IP addresses; it authenticates with IAM credentials, typically a service account.
Firestore vs Bigtable: Firestore is a serverless NoSQL document database (mobile and web apps, automatic scaling, strong consistency). Bigtable is a wide-column NoSQL database for high-throughput analytical and operational workloads (time-series, IoT); it is not worth the overhead for small datasets of less than millions of rows.
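The lifecycle example above (Coldline after 30 days, delete after 365 days) maps onto a small JSON document. The sketch below builds that configuration in the shape Cloud Storage accepts. The helper action_for_age is a hypothetical, simplified model of rule evaluation that looks only at the age condition; it is included to show that Delete takes precedence when several rules match.

```python
import json


def lifecycle_config(coldline_after_days=30, delete_after_days=365):
    """Build a lifecycle configuration with one transition and one delete rule."""
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": coldline_after_days},
            },
            {
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


def action_for_age(config, age_days):
    """Simplified model: which action applies to an object of this age in days?"""
    matched = [
        rule["action"]["type"]
        for rule in config["rule"]
        if age_days >= rule["condition"]["age"]
    ]
    if "Delete" in matched:  # Delete wins over a storage class change
        return "Delete"
    return matched[0] if matched else None


config = lifecycle_config()
print(json.dumps(config, indent=2))
```

Saved to a file such as lifecycle.json, the printed document can be applied with gsutil lifecycle set lifecycle.json gs://BUCKET_NAME, where BUCKET_NAME is a placeholder for your own bucket.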
Identity, Access, and Security
ACE security topics start with IAM roles: basic roles (Owner, Editor, Viewer; formerly called primitive roles) are too broad and should be avoided, predefined roles are granular per service and should be the default choice, and custom roles let you define your own permission set for least privilege.
Service accounts: create one per application, grant only the minimum required roles, use Workload Identity in GKE, and avoid downloaded key files (rotate them regularly if you must use them).
Organisation policy constraints enforce compliance at scale: restrict resource locations (e.g., only allow resources in europe-west1), prevent external IP addresses on VMs, and restrict VM machine types.
Cloud Key Management Service (KMS) manages encryption keys; customer-managed encryption keys (CMEK) give you control over the key lifecycle. Secret Manager stores API keys, passwords, and certificates; you can version secrets, configure rotation, and grant access via IAM.
VPC Service Controls create security perimeters around sensitive services, preventing data exfiltration from BigQuery or Cloud Storage even by authenticated identities outside the perimeter.
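An IAM allow policy is essentially a list of bindings, each pairing one role with a list of members. The sketch below works on a plain dictionary in that shape to show two habits from the paragraph above: granting a narrow predefined role to a service account, and flagging grants that still use the broad Owner, Editor, or Viewer roles. The function names, the service account, and the project are hypothetical, and conditional bindings are ignored for brevity.

```python
BASIC_ROLES = {"roles/owner", "roles/editor", "roles/viewer"}


def add_binding(policy, role, member):
    """Return a copy of the policy with member granted role (idempotent)."""
    bindings = [
        dict(b, members=list(b["members"])) for b in policy.get("bindings", [])
    ]
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            break
    else:
        # No binding for this role yet, so create one.
        bindings.append({"role": role, "members": [member]})
    return {**policy, "bindings": bindings}


def basic_role_grants(policy):
    """List (role, member) pairs that rely on an overly broad basic role."""
    return [
        (b["role"], member)
        for b in policy.get("bindings", [])
        if b["role"] in BASIC_ROLES
        for member in b["members"]
    ]


# Hypothetical identities, used only for illustration.
sa = "serviceAccount:report-writer@example-project.iam.gserviceaccount.com"
policy = {"bindings": [{"role": "roles/editor", "members": ["user:dev@example.com"]}]}
policy = add_binding(policy, "roles/storage.objectViewer", sa)

print(basic_role_grants(policy))
```

In practice the same read-modify-write cycle is what gcloud performs when you add a policy binding; the point of the sketch is only the data shape and the least-privilege check.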
Monitoring, Logging, and Operations Suite
Google Cloud Operations Suite (formerly Stackdriver) provides observability.
Cloud Monitoring: metrics from all Google Cloud services, custom metrics via the Monitoring API or OpenTelemetry, alerting policies that trigger on metric thresholds, and SLO monitoring that tracks service level objectives.
Cloud Logging: log ingestion from all Google Cloud services and custom applications (structured JSON logs preferred), Logs Explorer for ad-hoc queries, log-based metrics (custom metrics created from log entries), and log sinks that export logs to Cloud Storage, BigQuery, or Pub/Sub for long-term retention and analysis.
Cloud Trace provides distributed tracing for applications, either through manual instrumentation with OpenTelemetry or automatically on supported platforms such as App Engine. Error Reporting aggregates and analyses application errors, groups stack traces, and sends notifications.
Audit logs come in four types: Admin Activity (always on; records who made which administrative API calls), Data Access (mostly off by default; records data reads and writes and can be high volume), System Event (automatic actions by Google), and Policy Denied (access denied because of a security policy). Exam tip: enable Data Access audit logs for sensitive data services such as Cloud Storage; BigQuery writes them by default.
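Structured JSON logs, mentioned above as the preferred format, can be produced with the Python standard library alone. This is a minimal sketch under one assumption: the application runs somewhere that forwards stdout to Cloud Logging (for example GKE or Cloud Run), where a JSON line is parsed and its severity and message fields are recognised. The JsonFormatter class, the build_logger helper, and the extra logger field are illustrative choices, not part of any Google library.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "severity": record.levelname,    # e.g. INFO, WARNING, ERROR
            "message": record.getMessage(),  # message with arguments applied
            "logger": record.name,           # extra field, kept as payload
        }
        if record.exc_info:
            # Append the traceback so the error text stays in one entry.
            entry["message"] += "\n" + self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name="app", stream=sys.stdout):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


log = build_logger()
log.warning("payment retry %d of %d", 2, 3)
```

Python's level names DEBUG, INFO, WARNING, ERROR, and CRITICAL all match Cloud Logging severity names, which is why levelname can be passed through unchanged.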