AWS RDS

Managed relational database service. Handles provisioning, patching, backups, and failover. Supports PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, and Aurora.


Core Concepts

Concept | Description
DB Instance | Isolated database environment. Runs one DB engine.
DB Instance Class | CPU/memory size (db.t3.micro → db.r6g.16xlarge)
Storage Type | gp2 (SSD), gp3 (SSD with independently configurable IOPS and throughput), io1 (provisioned IOPS for high performance)
Multi-AZ | Synchronous standby in a different AZ; automatic failover
Read Replica | Async copy for read scaling. Can be in a different region.
Parameter Group | DB engine configuration (e.g., max_connections, work_mem)
Option Group | Engine-specific add-on features (Oracle, SQL Server, MySQL, MariaDB; not used by PostgreSQL)

Multi-AZ vs Read Replicas

Aspect | Multi-AZ | Read Replica
Purpose | HA + failover | Read scaling
Replication | Synchronous (no lag) | Asynchronous (some lag)
Failover | Automatic (~60s) | Manual promotion
Can serve reads? | No (standby is passive) | Yes
Cross-region? | No (same region, different AZ) | Yes
Cost | 2x instance cost | Additional instance cost

Use Multi-AZ for production availability. Use Read Replicas to offload read traffic.
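A minimal sketch of how an application might split traffic when both features are in use: writes (and reads that must see the latest data) go to the primary, other reads rotate across replicas. The endpoint names and the `pick_endpoint` helper are hypothetical, and the "starts with SELECT" check is deliberately naive (it would misroute `SELECT ... FOR UPDATE`, for example).

```python
from itertools import cycle

# Hypothetical endpoints; real ones come from the RDS console or API.
PRIMARY = "mydb.example.us-east-1.rds.amazonaws.com"
REPLICAS = [
    "mydb-ro-1.example.us-east-1.rds.amazonaws.com",
    "mydb-ro-2.example.us-east-1.rds.amazonaws.com",
]

_replica_cycle = cycle(REPLICAS)


def pick_endpoint(sql: str, needs_fresh_read: bool = False) -> str:
    """Return the endpoint a statement should be sent to.

    Replicas are asynchronous, so a read that must observe a write that
    just happened is sent to the primary instead.
    """
    is_read = sql.lstrip().lower().startswith("select")
    if is_read and not needs_fresh_read:
        return next(_replica_cycle)  # simple round-robin over replicas
    return PRIMARY
```

The Multi-AZ standby never appears in this routing table: it is not addressable, and after a failover the primary endpoint simply resolves to the promoted standby.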


Automated Backups vs Snapshots

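Aspect | Automated Backup | Manual Snapshot
Retention | 1–35 days (configurable) | Indefinite
Granularity | Point-in-time (any second in the retention window; latest restorable time is usually within the last 5 minutes) | Specific moment
Cost | Free up to 100% of provisioned DB storage | S3 storage cost
Deleted with DB? | Yes (unless retain specified) | No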

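Restore: Creates a NEW DB instance with a new endpoint (not in-place). The application must be pointed at the new endpoint, for example by updating a DNS CNAME or the connection string.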
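As a rough illustration of why a restore always yields a new instance, the sketch below assembles the arguments for a point-in-time restore: the request names both a source and a distinct target identifier. The parameter names match boto3's `restore_db_instance_to_point_in_time`; the `build_pitr_request` helper and the instance identifiers are hypothetical, and the actual AWS call is left as a comment so the snippet runs without credentials.

```python
from datetime import datetime, timezone
from typing import Optional


def build_pitr_request(source_id: str, target_id: str,
                       restore_time: Optional[datetime] = None) -> dict:
    """Build keyword arguments for a point-in-time restore request."""
    if source_id == target_id:
        # A restore cannot overwrite the source; it creates a new instance.
        raise ValueError("target must be a new instance identifier")
    request = {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,
    }
    if restore_time is None:
        request["UseLatestRestorableTime"] = True
    else:
        if restore_time.tzinfo is None:
            raise ValueError("restore_time must be timezone-aware")
        request["RestoreTime"] = restore_time.astimezone(timezone.utc)
    return request


request = build_pitr_request("orders-db", "orders-db-restored")
# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   boto3.client("rds").restore_db_instance_to_point_in_time(**request)
```

Once the new instance is available, the cut-over is a separate step: repoint the application or DNS record at the new endpoint.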


Connection Management

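The default max_connections is derived from instance memory, so it grows with the instance class. It can be overridden in the parameter group, but every connection consumes memory. Common issue: connection exhaustion.

db.t3.micro (1 GiB): ~85 max connections
db.r6g.large (16 GiB): roughly 1,300 to 1,800 max connections by default, depending on engine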
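The default limits can be estimated from the formulas in the RDS default parameter groups: `LEAST(DBInstanceClassMemory/9531392, 5000)` for PostgreSQL and `DBInstanceClassMemory/12582880` for MySQL. The sketch below plugs in the nominal instance memory, which is an assumption: `DBInstanceClassMemory` is what remains after the OS and RDS processes take their share, so real values come out lower. The function name and engine labels are illustrative.

```python
GIB = 1024 ** 3


def default_max_connections(memory_bytes: int, engine: str) -> int:
    """Estimate the default max_connections for a given memory size.

    Uses nominal memory as a stand-in for DBInstanceClassMemory, so the
    result is an upper estimate rather than the exact RDS value.
    """
    if engine == "postgres":
        return min(memory_bytes // 9531392, 5000)
    if engine == "mysql":
        return memory_bytes // 12582880
    raise ValueError(f"unsupported engine: {engine}")


print(default_max_connections(1 * GIB, "mysql"))      # 85
print(default_max_connections(1 * GIB, "postgres"))   # 112
print(default_max_connections(16 * GIB, "mysql"))     # 1365
print(default_max_connections(16 * GIB, "postgres"))  # 1802
```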

Solutions:

  • RDS Proxy: Connection pooler in front of RDS. Reduces DB connections by pooling app connections. Integrates with IAM auth and Secrets Manager.
  • PgBouncer / pgpool: Self-managed connection pooler (for PostgreSQL)
  • Right-size your instance class

RDS Proxy

App instances (thousands of connections) → RDS Proxy (pool of connections) → RDS Instance (max ~500 connections)
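The pooling idea can be sketched in a few lines: many clients borrow from a small, fixed set of database connections and return them when done. This is a simplified in-process model, not how RDS Proxy is implemented (the proxy multiplexes at the transaction level and runs as a separate service); `FakeConnection`, `Pool`, and `run_clients` are hypothetical names.

```python
import queue
import threading
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a real database connection."""

    def __init__(self, conn_id: int):
        self.conn_id = conn_id

    def execute(self, sql: str) -> str:
        return f"conn {self.conn_id}: {sql}"


class Pool:
    """Fixed-size pool: at most `size` connections ever exist."""

    def __init__(self, size: int):
        self._idle = queue.Queue()
        for i in range(size):
            self._idle.put(FakeConnection(i))

    @contextmanager
    def connection(self, timeout: float = 5.0):
        conn = self._idle.get(timeout=timeout)  # blocks while all are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


def run_clients(pool: Pool, n_clients: int) -> set:
    """Run n_clients threads through the pool; return connection ids used."""
    used = set()
    lock = threading.Lock()

    def client(i: int) -> None:
        with pool.connection() as conn:
            conn.execute(f"SELECT {i}")
            with lock:
                used.add(conn.conn_id)

    threads = [threading.Thread(target=client, args=(i,))
               for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return used
```

Calling `run_clients(Pool(5), 200)` pushes 200 clients through the pool while the "database" never sees more than five connections.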

Benefits:

  • Reduces failover time (proxy maintains pool during Multi-AZ failover — apps don't reconnect)
  • IAM authentication support
  • Secrets Manager integration
  • Useful for Lambda → RDS (Lambda spins up thousands of short-lived functions)

Aurora — AWS-optimized RDS

Aurora is AWS's cloud-native relational DB (compatible with MySQL/PostgreSQL).

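Feature | RDS PostgreSQL | Aurora PostgreSQL
Storage scaling | Manual, or optional storage autoscaling | Auto-scales 10GB–128TB
Replicas | Up to 15 (raised from the earlier limit of 5) | Up to 15
Replication | Async (lag varies with write load) | Shared storage volume; replica lag typically well under 100ms
Failover | ~60s | < 30s
Cost | Baseline | ~20-30% more, but can be cheaper at scale
Serverless | No | Aurora Serverless v2 (auto-scale capacity)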

Aurora Global Database: Primary region + up to 5 read-only secondary regions. < 1s replication lag globally.


Security

VPC → Private Subnets → Security Group → RDS Instance
  • Always deploy in private subnet (no public IP)
  • Security group: allow only from app servers' security group on port 5432/3306
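  • Encryption at rest: KMS. Must be enabled at creation; to encrypt an existing instance, copy a snapshot with encryption enabled and restore from the copy.
  • Encryption in transit: SSL/TLS (enforce with rds.force_ssl=1 on PostgreSQL, or require_secure_transport=ON on MySQL)
  • IAM authentication: Use a short-lived IAM token (valid for 15 minutes) instead of a static password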
  • Secrets Manager: Store DB password, auto-rotate every N days
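On the client side, the TLS and IAM points above might combine as in the sketch below, which builds PostgreSQL connection parameters that verify the server certificate and carry an IAM token as the password. `sslmode` and `sslrootcert` are standard libpq parameters; the `build_conn_kwargs` helper, the host name, and the CA bundle path are assumptions for illustration. Token generation and the actual connection are shown only as comments so the snippet runs without AWS credentials or a database driver.

```python
def build_conn_kwargs(host: str, user: str, dbname: str, password: str,
                      ca_bundle: str = "global-bundle.pem",
                      port: int = 5432) -> dict:
    """Return libpq-style connection parameters with TLS verification on."""
    return {
        "host": host,
        "port": port,
        "user": user,
        "dbname": dbname,
        "password": password,        # IAM token or Secrets Manager value
        "sslmode": "verify-full",    # encrypt and verify the server hostname
        "sslrootcert": ca_bundle,    # assumed local path to the RDS CA bundle
    }


# With boto3 and a driver such as psycopg2 installed, usage would be roughly:
#   import boto3, psycopg2
#   token = boto3.client("rds").generate_db_auth_token(
#       DBHostname=host, Port=5432, DBUsername=user)
#   conn = psycopg2.connect(**build_conn_kwargs(host, user, "app", token))
kwargs = build_conn_kwargs("mydb.example.us-east-1.rds.amazonaws.com",
                           "app_user", "app", "<token>")
```

Because the token expires, it should be generated per connection attempt rather than cached in configuration.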

Monitoring

Metric | Normal | Alert when
CPU Utilization | < 80% | > 90% sustained
DB Connections | < 80% of max | > 90% of max
Free Storage | > 20% | < 10%
Read/Write Latency | < 5ms | > 20ms
Replica Lag | < 1s | > 10s

Enhanced Monitoring: OS-level metrics (per-process CPU, memory). Granularity from 1s to 60s, versus 60s for standard CloudWatch metrics.

Performance Insights: Query-level analysis showing the top SQL statements by load.
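The thresholds in the table can be encoded as a small classifier, sketched below. The metric keys and the intermediate "watch" state (for values between the normal and alert bounds) are assumptions added for illustration; the sketch also judges a single sample, so the "sustained" condition on CPU would need a time window on top.

```python
# metric -> (normal bound, alert bound, higher_is_worse)
THRESHOLDS = {
    "cpu_percent": (80, 90, True),
    "connections_percent_of_max": (80, 90, True),
    "free_storage_percent": (20, 10, False),
    "latency_ms": (5, 20, True),
    "replica_lag_seconds": (1, 10, True),
}


def classify(metric: str, value: float) -> str:
    """Return 'normal', 'watch', or 'alert' for one metric sample."""
    normal, alert, higher_is_worse = THRESHOLDS[metric]
    if not higher_is_worse:
        # Flip signs so the same comparisons work for "lower is worse".
        value, normal, alert = -value, -normal, -alert
    if value > alert:
        return "alert"
    if value < normal:
        return "normal"
    return "watch"


print(classify("cpu_percent", 95))           # alert
print(classify("free_storage_percent", 15))  # watch
print(classify("replica_lag_seconds", 0.2))  # normal
```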


Interview Talking Points

"When would you use Multi-AZ vs Read Replicas?" Multi-AZ for HA (if primary dies, failover in 60s). Read Replicas for horizontal read scaling (reporting, analytics, read-heavy workloads). Can use both together.

"What's Aurora vs RDS PostgreSQL?" Aurora is wire-compatible with PostgreSQL, so it is close to a drop-in replacement, with higher availability (up to 15 low-lag replicas on shared storage), auto-scaling storage, faster failover, and Aurora Serverless for variable workloads. More expensive per instance but can be cheaper at scale due to better hardware utilization.

"Lambda to RDS — what's the problem?" Lambda auto-scales to thousands of concurrent executions, each opening a DB connection. RDS has fixed max connections → exhaustion. Solution: RDS Proxy pools connections; Lambda connects to proxy, not RDS directly.


Related

  • [[AWS/VPC]] — RDS must be in private subnet
  • [[AWS/IAM]] — IAM auth for RDS, Secrets Manager
  • [[AWS/Lambda]] — Lambda to RDS via RDS Proxy
  • [[System Design/Backend 101/Data Base/Basics]] — DB fundamentals
  • [[Distributed Systems Concepts]] — replication, consistency