Auto-Scaling lets you scale your EC2 instances up or down according to demand so the service can handle its load. With Auto-Scaling you don’t have to worry about whether the number of instances you’re using can handle a demand spike, or whether you’re overspending during a slower period. Capacity is adjusted for you automatically, so performance stays seamless.
For instance, if three m1.xlarge instances currently handle the service and spend most of their time only 20% loaded, with a smaller portion of their time heavily loaded, they can be vertically scaled down to a smaller instance size and horizontally scaled out or in to more or fewer instances to match whatever load they have at that time. Paying only for the smaller instance size saves money. More can be saved by using reserved instance billing for the minimum number of instances defined in the Auto-Scaling configuration and letting the scaled-out instances pay the on-demand rate while they run. This is a little tricky, though, because an instance's billing cannot be changed while the instance is running. When scaling in, make sure to terminate the newest instances, since they are the ones running at the on-demand rate.
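To make the reserved versus on-demand split concrete, here is a rough sketch in Python. The rates are made-up placeholders in cents per hour, not real AWS prices, and `hourly_cost` is a hypothetical helper. It assumes only that the first `min_size` instances are covered by reserved pricing and every instance above that pays the on-demand rate.

```python
def hourly_cost(running, min_size, reserved_rate, on_demand_rate):
    """Cost of one hour, in cents, for `running` instances.

    Assumption: the first `min_size` instances are billed at the reserved
    rate and every instance above that is billed on-demand.
    """
    if running < min_size:
        raise ValueError("the group never runs below its minimum size")
    on_demand = running - min_size
    return min_size * reserved_rate + on_demand * on_demand_rate


# Placeholder rates in cents per hour (hypothetical, not AWS prices).
RESERVED_RATE = 10
ON_DEMAND_RATE = 25

# A day with 20 quiet hours at the minimum of 2 instances and
# 4 busy hours scaled out to 5 instances.
quiet = 20 * hourly_cost(2, 2, RESERVED_RATE, ON_DEMAND_RATE)
busy = 4 * hourly_cost(5, 2, RESERVED_RATE, ON_DEMAND_RATE)
daily_total = quiet + busy
print(f"daily cost: ${daily_total / 100:.2f}")
```

Swapping in your own rates and hour counts gives a quick feel for how much of the bill comes from the scaled-out instances.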
Vertical Scaling is typically referred to as scale-up or scale-down by changing the instance size, while Horizontal Scaling is typically referred to as scale-out or scale-in by changing the number of instances.
When traffic to a service on AWS rises or falls, predictably or not, Auto-Scaling can keep customers happy with the service because their response times stay more consistent and High Availability is more reliable.
Auto-Scaling to Improve HA
If there is only one server instance, Auto-Scaling can put a new server in place within a few minutes when the running one fails. Just set both the Min and Max number of instances to 1.
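The effect of setting Min and Max to 1 can be pictured as a simple reconciliation rule. The sketch below is only a model of that idea, not how AWS implements it, and `instances_to_launch` is a hypothetical name.

```python
def instances_to_launch(healthy, min_size, max_size):
    """How many replacements are needed to get back to at least min_size.

    Simplified model: failed instances are assumed to be gone already,
    so only the count of healthy instances matters.
    """
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(0, min_size - healthy)


# Single-server group: Min = Max = 1.
print(instances_to_launch(healthy=1, min_size=1, max_size=1))  # server is fine
print(instances_to_launch(healthy=0, min_size=1, max_size=1))  # server failed
```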
Auto-Scaling to Improve Response Time Consistency
If there are multiple servers and the load on them becomes so heavy that the response time slows, expand horizontally only for the time necessary to cover the extra load, and keep the response time low.
AWS Auto-Scaling Options to Set
When Auto-Scaling up or down, there are a lot of things to think about:
- Evaluation Period is the time, in seconds, between checks of the load on the Scaling Group.
- Cool Down is the time, in seconds, that must pass after a scaling operation before a new scaling operation can be performed. When scaling out, this time should be fairly short in case the load is too heavy for one Scale-Out operation to absorb. When scaling in, this time should be at least twice that of the Scale-Out operation.
- With Scale-Out, make sure it scales fast enough to quickly handle a load heavier than one expansion can absorb. 300 seconds is a good starting point for the cooldown.
- With Scale-In, make sure it scales slowly enough that the group does not keep going out and in. We call this “Flapping”. Some call it “Thrashing”.
- When the Auto-Scale Group includes multiple AZs, scaling out and in should be done in multiples of the number of AZs involved. If only one AZ is scaled out and something happens to that AZ, the impact is far more noticeable.
- Scale-In can be accomplished by different rules:
  - Terminate Oldest Instance
  - Terminate Newest Instance
  - Terminate Instance Closest to the next Instance Hour (Best Cost Savings)
  - Terminate Oldest Launch Configuration (default)
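Two of the options above are easy to picture in code: sizing the scaling step as a multiple of the AZ count, and choosing which instance a Scale-In rule would remove. The following is a minimal sketch, not AWS's implementation. The rule labels (`oldest`, `newest`, `closest_to_hour`), the instance IDs, and the function names are all hypothetical, and billing by whole instance hour is assumed, as the list describes.

```python
from datetime import datetime, timedelta


def az_aligned_step(wanted, az_count):
    """Round a scaling step up to a multiple of the number of AZs."""
    if wanted <= 0 or az_count <= 0:
        raise ValueError("wanted and az_count must be positive")
    return -(-wanted // az_count) * az_count  # ceiling division


def minutes_to_next_hour(launch_time, now):
    """Minutes left before the instance starts another billed hour."""
    elapsed = int((now - launch_time).total_seconds() // 60)
    return 60 - (elapsed % 60)


def pick_for_termination(instances, rule, now):
    """Pick one instance id. `instances` maps id -> launch time."""
    if rule == "oldest":
        return min(instances, key=lambda i: instances[i])
    if rule == "newest":
        return max(instances, key=lambda i: instances[i])
    if rule == "closest_to_hour":
        return min(instances,
                   key=lambda i: minutes_to_next_hour(instances[i], now))
    raise ValueError(f"unknown rule: {rule}")


now = datetime(2024, 1, 1, 12, 0)
fleet = {
    "i-aaa": now - timedelta(minutes=130),  # 50 minutes left in its hour
    "i-bbb": now - timedelta(minutes=95),   # 25 minutes left
    "i-ccc": now - timedelta(minutes=20),   # 40 minutes left
}

print(az_aligned_step(wanted=1, az_count=2))
for rule in ("oldest", "newest", "closest_to_hour"):
    print(rule, pick_for_termination(fleet, rule, now))
```

With this sample fleet the three rules each pick a different instance, which shows why the choice of rule matters for cost.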
Auto-Scaling is a two-stage process, and here is the rub: the AWS Management Console does not do Auto-Scaling, so it has to be set up through the AWS APIs.
- Set up the Launch Configuration and assign it to a group of instances you want to control. If there is no user_data file, that argument can be left out. The block-device-mapping argument can be found in the details for the ami_id.
  - # as-create-launch-config <auto_scaling_launch_config_name> --region <region_name> --image-id <AMI_ID> --instance-type <type> --key <SSH_key_pair_name> --group <VPC_security_group_ID> --monitoring-enabled --user-data-file=<path_and_name_for_user_data_file> --block-device-mapping "<device_name>=<snap_id>:100:true:standard"
  - # as-create-auto-scaling-group <auto_scaling_group_name> --region <region_name> --launch-configuration <auto_scaling_launch_config_name> --vpc-zone-identifier <VPC_Subnet_ID>,<VPC_Subnet_ID> --availability-zones <Availability_Zone>,<Availability_Zone> --load-balancers <load_balancer_name> --min-size <min_number_of_instances_that_must_be_running> --max-size <max_number_of_instances_that_can_be_running> --health-check-type ELB --grace-period <time_seconds_before_first_check> --tag "k=Name, v=<friendly_name>, p=true"
- Have CloudWatch initiate Scaling Activities: one CloudWatch Alarm for Scaling Out and one for Scaling In. Also send notifications when scaling.
  - Scale Out (the policy ARN printed by the first command is passed to the second command as its --alarm-actions argument)
    - # as-put-scaling-policy --name <auto_scaling_policy_name_for_high_CPU> --region <region_name> --auto-scaling-group <auto_scaling_group_name> --adjustment <Number_of_instances_to_change_by> --type ChangeInCapacity --cooldown <time_in_seconds_to_wait_to_check_after_adding_instances>
    - # mon-put-metric-alarm --alarm-name <alarm_name_for_high_CPU> --region <region_name> --metric-name CPUUtilization --namespace AWS/EC2 --statistic Average --period <number_of_seconds_to_check_each_time_period> --evaluation-periods <number_of_periods_between_checks> --threshold <percent_number> --unit Percent --comparison-operator GreaterThanThreshold --alarm-description <description_use_alarm_name> --dimensions "AutoScalingGroupName=<auto_scaling_group_name>" --alarm-actions <arn_string_from_last_command>
  - Scale In (the policy ARN printed by the first command is passed to the second command as its --alarm-actions argument)
    - # as-put-scaling-policy --name <auto_scaling_policy_name_for_low_CPU> --region <region_name> --auto-scaling-group <auto_scaling_group_name> "--adjustment=-<Number_of_instances_to_change_by>" --type ChangeInCapacity --cooldown <time_in_seconds_to_wait_to_check_after_removing_instances>
    - # mon-put-metric-alarm --alarm-name <alarm_name_for_low_CPU> --region <region_name> --metric-name CPUUtilization --namespace AWS/EC2 --statistic Average --period <number_of_seconds_to_check_each_time_period> --evaluation-periods <number_of_periods_between_checks> --threshold <percent_number> --unit Percent --comparison-operator LessThanThreshold --alarm-description <description_use_alarm_name> --dimensions "AutoScalingGroupName=<auto_scaling_group_name>" --alarm-actions <arn_string_from_last_command>
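How the period, evaluation periods, threshold, and cooldown arguments interact is easier to see in a small simulation. The sketch below is a simplified model in Python, not the behavior of CloudWatch itself. It assumes an alarm fires when the average CPU breaches the threshold for `evaluation_periods` consecutive periods, and that no scaling happens until the cooldown of the previous activity has passed. The thresholds (80 and 20), the cooldowns (300 and 600 seconds), and the function names are hypothetical choices for illustration.

```python
def alarm_fires(samples, threshold, evaluation_periods, above=True):
    """True if the last `evaluation_periods` samples all breach the threshold.

    `samples` holds one average CPU value per period, oldest first.
    """
    if len(samples) < evaluation_periods:
        return False
    recent = samples[-evaluation_periods:]
    if above:
        return all(s > threshold for s in recent)
    return all(s < threshold for s in recent)


def simulate(cpu_by_period, period, size, min_size, max_size,
             high=80, low=20, evaluation_periods=2,
             out_cooldown=300, in_cooldown=600, step=1):
    """Replay CPU averages and return the group size after each period."""
    history, sizes = [], []
    last_change = None  # time, in seconds, of the last scaling activity
    last_cooldown = 0   # cooldown that applies to that activity
    for n, cpu in enumerate(cpu_by_period):
        t = n * period
        history.append(cpu)
        cooled = last_change is None or t - last_change >= last_cooldown
        if (cooled and size < max_size
                and alarm_fires(history, high, evaluation_periods)):
            size = min(max_size, size + step)
            last_change, last_cooldown = t, out_cooldown
        elif (cooled and size > min_size
                and alarm_fires(history, low, evaluation_periods, above=False)):
            size = max(min_size, size - step)
            last_change, last_cooldown = t, in_cooldown
        sizes.append(size)
    return sizes


cpu = [50, 90, 95, 92, 91, 60, 15, 10, 12, 11, 10, 10]
sizes = simulate(cpu, period=300, size=2, min_size=2, max_size=4)
print(sizes)
```

Because the Scale-In cooldown is twice the Scale-Out cooldown, the group grows on consecutive periods but shrinks only every other period, which is the damping that guards against flapping.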
AMI Changes Require Auto-Scaling Updates
The instance configuration could change for any number of reasons:
- Security Patches
- New Features added
- Removal of unused Old Features
Whenever the AMI specified in the Auto-Scaling definition is changed, the Auto-Scaling Group needs to be updated. The update requires creating a new Scaling Launch Config with the new AMI ID, updating the Auto-Scaling Group, then deleting the old Scaling Launch Config. Without this update the Scale out operation will use the old AMI.
1. Create new Launch Config:
# as-create-launch-config <new_auto_scaling_launch_config_name> --region <region_name> --image-id <AMI_ID> --instance-type <type> --key <SSH_key_pair_name> --group <VPC_security_group_ID> --monitoring-enabled --user-data-file=<path_and_name_for_user_data_file> --block-device-mapping "<device_name>=<snap_id>:100:true:standard"
2. Update Auto Scaling Group:
# as-update-auto-scaling-group <auto_scaling_group_name> --region <region_name> --launch-configuration <new_auto_scaling_launch_config_name> --vpc-zone-identifier <VPC_Subnet_ID>,<VPC_Subnet_ID> --availability-zones <Availability_Zone>,<Availability_Zone> --min-size <min_number_of_instances_that_must_be_running> --max-size <max_number_of_instances_that_can_be_running> --health-check-type ELB --grace-period <time_seconds_before_first_check>
3. Delete Old Launch Config:
# as-delete-launch-config <old_auto_scaling_launch_config_name> --region <region_name> --force
Now all Scale Outs should use the updated AMI.
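The order of the three steps matters, and a toy model shows why. The Python sketch below is not the AWS API. `ScalingGroup`, its methods, and the names `lc-v1`, `lc-v2`, `ami-old`, and `ami-new` are hypothetical. It assumes that a group launches new instances from whichever launch config it currently points at, and that a config still attached to the group cannot be deleted.

```python
class ScalingGroup:
    """Toy model of a group that launches from a named launch config."""

    def __init__(self, launch_configs, active_config):
        self.launch_configs = dict(launch_configs)  # name -> AMI id
        self.active_config = active_config
        self.instances = []  # AMI id of each running instance

    def scale_out(self, count=1):
        # New instances always use the AMI of the active launch config.
        ami = self.launch_configs[self.active_config]
        self.instances.extend([ami] * count)

    def create_launch_config(self, name, ami):
        if name in self.launch_configs:
            raise ValueError("launch config names must be unique")
        self.launch_configs[name] = ami

    def update_group(self, name):
        if name not in self.launch_configs:
            raise KeyError(name)
        self.active_config = name

    def delete_launch_config(self, name):
        if name == self.active_config:
            raise ValueError("config is still attached to the group")
        del self.launch_configs[name]


group = ScalingGroup({"lc-v1": "ami-old"}, "lc-v1")
group.scale_out()                               # launches from ami-old
group.create_launch_config("lc-v2", "ami-new")  # step 1
group.scale_out()                               # still ami-old, group not updated yet
group.update_group("lc-v2")                     # step 2
group.delete_launch_config("lc-v1")             # step 3
group.scale_out()                               # now launches from ami-new
print(group.instances)
```

In this model, instances launched before the update keep the old AMI. Only later Scale Outs pick up the new one.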
-Charles Keagle, Senior Cloud Engineer