Database Selection and Trade-offs
GCP database decision tree: relational and transactional with global scale? Spanner. Relational with regional scale and familiar SQL? Cloud SQL. NoSQL document with real-time sync? Firestore. NoSQL wide-column with high throughput? Bigtable. In-memory cache? Memorystore. Data warehouse and analytics? BigQuery. CAP theorem in practice: Spanner achieves global consistency using TrueTime (GPS + atomic clocks to bound clock skew) — it is CP. Firestore in Datastore mode is strongly consistent for single-entity operations but eventually consistent for queries across entities. Bigtable is strongly consistent for reads and writes to a single row. Multi-region versus single-region: Cloud SQL offers regional HA with synchronous standby (failover in 60s). Spanner multi-region adds cross-region replicas for disaster recovery and read performance at the cost of higher write latency.
Cloud SQL: Operations and High Availability
Cloud SQL HA: Primary instance with a standby in the same region but different zone. Replication is synchronous (standby must acknowledge before write commits). Failover is automatic on primary failure; promoted standby becomes the new primary. Read replicas are asynchronous and can be in different regions. Cloud SQL for PostgreSQL/MySQL/SQL Server: version compatibility, flag configuration (shared_buffers, max_connections, innodb_buffer_pool_size), maintenance windows, automatic storage increases. Backup: automated backups (point-in-time recovery up to 7 days), on-demand backups retained until explicitly deleted. Connectivity: Cloud SQL Auth Proxy handles SSL/TLS and IAM-based authentication — the recommended connection method. Private IP (VPC peering) is the secure production approach; Public IP with authorised networks is an alternative but less secure.
Spanner: Architecture and Query Optimisation
Spanner splits data into row ranges called splits. Each split is served by a Paxos group with a leader and replicas. Writes go through the leader; reads can go to any replica (bounded staleness reads) or the leader (strong reads). Interleaving: child tables interleaved into parent tables are stored co-located physically — dramatically reduces cross-split reads for parent-child queries. Schema design: avoid hotspots by not using monotonically increasing primary keys (UUIDs or bit-reversal are common solutions). Composite keys distribute load. Secondary indexes: non-interleaved (global, covering indexes reduce lookups), interleaved (co-located with parent table). Query hints: @{FORCE_INDEX=index_name}, GROUPBY_SCAN_OPTIMIZATION=TRUE. Query Insights: visualise slow queries, lock wait times, transaction statistics. Spanner autoscaler adjusts compute capacity (processing units) based on CPU and storage utilisation.
Migration Strategies and Bigtable
Database Migration Service (DMS): managed migrations for MySQL, PostgreSQL, and SQL Server to Cloud SQL or AlloyDB. Supports homogeneous (same engine) and some heterogeneous migrations. Continuous replication using CDC (Change Data Capture) minimises downtime during cutover. Datastream: CDC service for streaming changes from Oracle, MySQL, PostgreSQL to BigQuery, Cloud Storage, or Spanner. Use for real-time analytics on operational data or ongoing synchronisation after initial migration. Bigtable design principles: row key is the only index — design keys to distribute reads and writes. Column families group related columns and have separate compaction/TTL settings. Hot rows (caused by monotonically increasing timestamps in keys) are the most common performance problem. Solution: salt keys with a hash prefix or reverse the timestamp.