Database Selection: Matching Workload to Service
AWS database selection is the most tested concept. Relational (RDS/Aurora): structured data with complex queries, ACID transactions, foreign keys. Aurora: MySQL and PostgreSQL-compatible, up to 5x faster than standard MySQL — Aurora Serverless v2 scales capacity instantly, Aurora Global Database spans multiple regions with sub-second replication lag, Aurora Multi-Master for multiple write nodes. Use Aurora for new workloads; RDS for lift-and-shift compatibility requirements. Key-value (DynamoDB): single-digit millisecond latency, unlimited scale, schemaless — use for session stores, user profiles, IoT data, gaming leaderboards. Document (DocumentDB): MongoDB-compatible managed service — for existing MongoDB workloads migrating to AWS. Wide-column (Keyspaces): Apache Cassandra-compatible — for Cassandra workloads needing fully managed scale. In-memory (ElastiCache): Redis (rich data structures, persistence, pub/sub, Lua scripting, Cluster mode for horizontal scale) or Memcached (simpler, multi-thread, pure caching). Graph (Neptune): property graph (Gremlin API) and RDF (SPARQL API) — knowledge graphs, fraud detection, social networks. Time-series (Timestream): built-in time-series functions, magnetic/memory store tiers, automatic expiry. Ledger (QLDB): cryptographically verifiable journal of all data changes — finance, supply chain audit trails.
RDS and Aurora Deep Dive
RDS operational knowledge for the specialty. Multi-AZ: synchronous standby replica for automatic failover (1-2 minute RTO), standby is not readable in standard Multi-AZ — use Multi-AZ DB Cluster (two readable standbys, better RTO). Read Replicas: asynchronous replication, readable, up to 15 for Aurora, 5 for RDS — promote to standalone for DR. Aurora storage: distributed across 6 storage nodes in 3 AZs (2 copies per AZ) — no standby needed for storage durability, automatically grows in 10 GB increments up to 128 TB. Aurora backtrack: rewind the database to a previous point in time without restoring from backup — seconds resolution, very fast — only for Aurora MySQL. Parameter groups: engine-specific configuration — DB Parameter Group (instance-level), Cluster Parameter Group (cluster-wide for Aurora). Option groups: RDS-specific extensions (Oracle options, MSSQL features). Performance Insights: visualise database load — DB Load metric shows average active sessions, slice by wait event/SQL/user/host to identify bottlenecks. Enhanced Monitoring: OS-level metrics (CPU, memory, I/O) at 1-60 second granularity.
DynamoDB Advanced: Design Patterns and Global Tables
DynamoDB specialty depth. Access pattern design: define ALL access patterns before designing the table — single-table design collapses multiple entity types into one table using generic attribute names (PK, SK) and attribute overloading (GSI keys serve different entities). Adjacency list pattern: model many-to-many relationships with inverted GSI — row item (PK=OrderId, SK=ITEM#ItemId) and inverted GSI (PK=ItemId, SK=OrderId). Sparse GSI: only items where the indexed attribute exists appear in the GSI — efficient for rare attributes (only suspended accounts have a SuspendedDate attribute). Transactions: TransactWriteItems (up to 100 items, all-or-nothing write — financial operations, inventory deduction) and TransactGetItems (consistent multi-item read). Global Tables: multi-region active-active replication with last-writer-wins conflict resolution — automatic failover for reads and writes to healthy regions. DynamoDB Streams + Lambda: event-driven processing on item changes — search indexing (write to OpenSearch on DynamoDB change), cache invalidation, cross-service event propagation. Capacity planning: on-demand mode for spiky unknown traffic, provisioned + Auto Scaling for predictable steady-state (significantly cheaper at high volume).
Database Migration Service (DMS) and Schema Conversion
Migration is a major DBS-C01 domain. AWS DMS: heterogeneous and homogeneous database migrations — source endpoint, target endpoint, replication instance (sizing determines throughput). Full load: bulk initial migration. Change Data Capture (CDC): ongoing replication after full load using source database logs (binlog for MySQL, WAL for PostgreSQL, LogMiner for Oracle). Full load + CDC: zero-downtime migration — full load while source continues receiving writes, CDC catches up, cutover when replication lag approaches zero. Schema Conversion Tool (SCT): converts schema and application code for heterogeneous migrations (Oracle to Aurora PostgreSQL, SQL Server to Aurora MySQL) — assessment report identifies conversion complexity and manual conversion effort. DMS data validation: compare source and target row counts and checksums — detects migration errors. DMS Fleet Advisor: analyses on-premises database inventory and recommends migration targets based on schema complexity and workload characteristics.