Virtual Machine Scale Sets
A Virtual Machine Scale Set (VMSS) defines a template, the scale set model, for VM instances: the OS image, VM size, extensions, and network configuration. Azure uses this model to create and manage the instances in the set. When every instance is built from the same model, instances are identical by design. The instance count can grow or shrink automatically, manually, or on a schedule. Because instances share one configuration, you update the whole fleet by updating the model and letting the upgrade policy apply it (for example, as a rolling upgrade), rather than remoting into each VM individually.
VMSS supports two orchestration modes. Uniform orchestration is the original mode: Azure manages all instances identically using a single VM model. Flexible orchestration allows instances in the same scale set to have different configurations, including a mix of VM sizes. Both modes can sit behind Azure Load Balancer or an Application Gateway backend pool, so load balancer integration alone does not force a choice of mode. Uniform mode is simple and sufficient for fleets of identical, stateless web instances, although Microsoft's current guidance recommends Flexible orchestration for new deployments.
Scaling policies define when and how the scale set adjusts instance count. Metric-based scaling adds instances when CPU utilization, memory, or a custom metric rises above a scale-out threshold, and removes instances when it falls below a lower scale-in threshold. Keeping a gap between the two thresholds prevents the set from repeatedly scaling out and back in. Schedule-based scaling pre-scales the set at specific times, which is useful for predictable traffic patterns such as business hours. Separately, instance protection marks specific instances as protected from scale-in, so the scale set does not terminate them even when scaling down.
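The threshold logic described above can be sketched in a few lines of Python. This is a simplified model, not the Azure autoscale engine: the `AutoscaleRule` class, its field names, and the example thresholds (70% and 30%) are assumptions made for illustration, and real autoscale settings also involve aggregation windows and cooldown periods that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class AutoscaleRule:
    """Hypothetical rule shape for illustration, not the Azure autoscale schema."""
    scale_out_above: float  # metric value above which instances are added
    scale_in_below: float   # metric value below which instances are removed
    min_instances: int
    max_instances: int
    step: int = 1           # instances added or removed per decision


def desired_count(current: int, metric: float, rule: AutoscaleRule) -> int:
    """Return the target instance count for one evaluation of the rule."""
    if metric > rule.scale_out_above:
        return min(current + rule.step, rule.max_instances)
    if metric < rule.scale_in_below:
        return max(current - rule.step, rule.min_instances)
    return current  # metric sits between the thresholds: no change


def scale_in_candidates(instances: dict, count: int) -> list:
    """Pick up to `count` instances to remove, skipping protected ones.

    `instances` maps instance name -> True if protected from scale-in.
    """
    unprotected = [name for name, protected in instances.items() if not protected]
    return unprotected[:count]


rule = AutoscaleRule(scale_out_above=70.0, scale_in_below=30.0,
                     min_instances=2, max_instances=10)
print(desired_count(4, 85.0, rule))  # 5: CPU above 70%, add one instance
print(desired_count(4, 20.0, rule))  # 3: CPU below 30%, remove one instance
print(scale_in_candidates({"vm0": True, "vm1": False, "vm2": False}, 1))  # ['vm1']
```

Note how the protected instance `vm0` is never offered for removal, and how the minimum and maximum bounds cap the result regardless of the metric value.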
Availability, SLAs, and choosing the right construct
Azure SLAs for VMs depend on the deployment architecture. A single VM using Premium SSD storage for all of its OS and data disks carries a 99.9% uptime SLA. Two or more VMs in an Availability Set carry a 99.95% SLA. VMs deployed across two or more Availability Zones carry a 99.99% SLA. These numbers reflect Azure's commitment to platform-level availability, not application availability. Your application must be designed to handle individual VM failures regardless of which SLA tier you target.
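To make the SLA percentages concrete, the short calculation below converts each one into the maximum downtime it permits. It assumes a 365-day year for readability; the function name and the `period_hours` parameter are illustrative, and a different window (such as a month) can be passed in.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def max_downtime_hours(sla_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime allowed in a period before the SLA percentage is missed."""
    return period_hours * (1 - sla_percent / 100)


tiers = [("Single VM", 99.9), ("Availability Set", 99.95), ("Availability Zones", 99.99)]
for name, sla in tiers:
    hours = max_downtime_hours(sla)
    print(f"{name}: {sla}% allows {hours:.2f} hours of downtime per year")
```

Over a year, 99.9% allows about 8.76 hours of downtime, 99.95% about 4.38 hours, and 99.99% about 52.6 minutes, which is why each step up in architecture matters more than the small change in percentage suggests.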
Availability Sets distribute VMs across fault domains and update domains within a single datacenter. Fault domains represent separate physical racks with independent power and network. Update domains are groups that Azure patches one at a time during planned maintenance, so VMs spread across update domains are never all unavailable at once. An Availability Set with two VMs across two fault domains guarantees that a single rack failure takes down at most one VM.
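The two-VM guarantee above can be checked with a small placement model. The round-robin assignment used here is an assumption for illustration; actual placement is handled by the Azure platform, and the function names and the update domain count of 5 are hypothetical.

```python
def place_vms(vm_count: int, fault_domains: int, update_domains: int) -> list:
    """Assign each VM a fault domain and an update domain round-robin."""
    return [
        {"vm": f"vm{i}", "fd": i % fault_domains, "ud": i % update_domains}
        for i in range(vm_count)
    ]


def vms_in_fault_domain(placement: list, fd: int) -> list:
    """VMs that would go down if the rack behind fault domain `fd` failed."""
    return [p["vm"] for p in placement if p["fd"] == fd]


placement = place_vms(vm_count=2, fault_domains=2, update_domains=5)
worst_case = max(len(vms_in_fault_domain(placement, fd)) for fd in range(2))
print(worst_case)  # 1: a single rack failure takes down at most one of the two VMs
```

With more VMs than fault domains, some racks necessarily hold several VMs, so the worst-case loss grows; for example, five VMs across two fault domains can lose three at once in this model.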
Availability Zones place resources in physically separate locations within the same Azure region, each made up of one or more datacenters. Each zone has independent power, cooling, and networking infrastructure. A zone failure, caused by anything from a power outage to a cooling failure, does not affect VMs in other zones. Zone-redundant deployments need a load balancer that spans zones (Standard Load Balancer, or Application Gateway v2 deployed across zones) so that traffic is distributed across zones and routed around a failed zone automatically.
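A rough way to reason about zone failures is to count how much capacity survives when one zone disappears. The sketch below assumes instances are spread evenly across zones in round-robin order; the function names and zone labels are illustrative only.

```python
from collections import Counter


def spread_across_zones(instance_count: int, zones: list) -> Counter:
    """Distribute instances as evenly as possible across zones (round-robin)."""
    return Counter(zones[i % len(zones)] for i in range(instance_count))


def capacity_after_zone_loss(spread: Counter, failed_zone: str) -> int:
    """Instances still running if every instance in `failed_zone` is lost."""
    return sum(count for zone, count in spread.items() if zone != failed_zone)


spread = spread_across_zones(6, ["1", "2", "3"])
print(spread["1"], spread["2"], spread["3"])  # 2 2 2
print(capacity_after_zone_loss(spread, "2"))  # 4 of 6 instances remain
```

The practical consequence is that a three-zone deployment should be sized so that two-thirds of the instances can carry the full load while the load balancer routes around the failed zone.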
How to choose the correct answer
Single VM + Premium SSD: 99.9% SLA. Minimum for non-critical production workloads.
Availability Set (2+ VMs): 99.95% SLA. Protects against rack failure and planned maintenance impact.
Availability Zones (2+ VMs across zones): 99.99% SLA. Protects against datacenter failure.
VMSS Uniform: identical instances, managed together, best for auto-scaling stateless workloads.
VMSS Flexible: mix of VM sizes and configurations in one scale set; Microsoft's recommended mode for new deployments. Both modes work with Azure Load Balancer and Application Gateway.
Metric-based scaling: reacts to demand. Schedule-based: pre-scales for known traffic patterns.
Instance protection: prevents specific instances from scale-in, useful for stateful instances within a scale set.
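The SLA tiers in the summary above can also be treated as a lookup: given a target SLA, pick the first construct that meets it. This is only a memory aid built from the figures in this section; the function name is hypothetical, and a real decision would also weigh cost, region support, and application design.

```python
from typing import Optional

# (construct, SLA percent) pairs taken from the summary above, lowest SLA first.
CONSTRUCTS = [
    ("Single VM + Premium SSD", 99.9),
    ("Availability Set (2+ VMs)", 99.95),
    ("Availability Zones (2+ VMs across zones)", 99.99),
]


def first_construct_meeting(target_sla: float) -> Optional[str]:
    """Return the first listed construct whose SLA meets the target, else None."""
    for name, sla in CONSTRUCTS:
        if sla >= target_sla:
            return name
    return None  # no single-region construct in this list meets the target


print(first_construct_meeting(99.95))   # Availability Set (2+ VMs)
print(first_construct_meeting(99.999))  # None
```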