Now that AWS has released the new NAT Gateway, you’ll need a plan to migrate off your old legacy NAT (too soon?). The migration isn’t terribly difficult, but following this guide will help provide an outage-free transition to the new service.
Your NAT Gateway migration plan should start with an audit. If you take a look at our blog post on NAT Gateway Considerations you’ll see that there are a number of “gotchas” that need to be taken into account before you begin. Here’s what you’re looking for:
- Are your NAT EIPs whitelisted by external vendors? If yes, you’ll need to take a longer outage window for the migration.
- Does your NAT perform other functions for scripts or access? If you’re using the NAT for account scripts or bastion access, you’ll need to migrate those scripts and endpoints to another instance.
- Do you have inbound NAT security group rules? If yes, you’ll lose this filtering after the migration, because a NAT Gateway can’t have a security group attached. You’ll need to transfer these rules to the outbound rules of the security groups on the originating instances.
- Do you need high availability across AZs? If yes, you’ll need more than one NAT Gateway.
- Is your NAT in an auto scaling group (ASG)? If yes, you’ll need to remove the ASG to clean up.
Check your Routes
Next you’ll want to take a look at your various routes in your private subnets and map them to the NAT instances that are referenced. Unless you are changing your configuration, you can note the public subnet that the NAT currently exists in and use that one for your NAT Gateway. That’s not required, but it introduces the least amount of change in your environment.
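The mapping described above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive tool: it assumes you already have the "RouteTables" list returned by the EC2 describe_route_tables call (for example via boto3), and the function name map_nat_instance_routes is made up for illustration. It also assumes that any route targeting an instance id is a NAT route, which you should confirm against your own environment.

```python
from collections import defaultdict


def map_nat_instance_routes(route_tables):
    """Group routes by the instance they target.

    `route_tables` is assumed to be the "RouteTables" list from
    describe_route_tables, e.g.:
        boto3.client("ec2").describe_route_tables()["RouteTables"]
    """
    by_instance = defaultdict(list)
    for table in route_tables:
        # The main route table association has no SubnetId, so skip it here.
        subnet_ids = [
            assoc["SubnetId"]
            for assoc in table.get("Associations", [])
            if "SubnetId" in assoc
        ]
        for route in table.get("Routes", []):
            instance_id = route.get("InstanceId")
            if instance_id:  # routes to gateways or "local" have no InstanceId
                by_instance[instance_id].append({
                    "route_table_id": table["RouteTableId"],
                    "destination": route.get("DestinationCidrBlock"),
                    "subnet_ids": subnet_ids,
                })
    return dict(by_instance)
```

The result gives you, per NAT instance, the route tables and private subnets you’ll need to touch later in the migration.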
(Optional) Disassociate EIP from NAT Instance
In the case where the EIP of your NAT is whitelisted by third-party providers, you’ll need to remove it from the NAT instance prior to the creation of the replacement gateway. NOTE: removing the EIP from the NAT instance will begin your “downtime” for the migration, so you’ll want to understand the impact and know if a maintenance window is appropriate. When you disassociate the EIP, note the EIP allocation id because you’ll need it later.
Find the EIP that is currently attached to the NAT instance, right-click it, and choose Disassociate.
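If you’d rather script this step than click through the console, something like the following could work. It is a sketch under assumptions: ec2 is expected to be a boto3 EC2 client created by you, the helper name detach_eip is hypothetical, and the NAT instance is assumed to have exactly one EIP attached. Remember that calling it starts your downtime.

```python
def detach_eip(ec2, instance_id):
    """Disassociate the EIP from a NAT instance and return its details.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    resp = ec2.describe_addresses(
        Filters=[{"Name": "instance-id", "Values": [instance_id]}]
    )
    addresses = resp["Addresses"]
    if len(addresses) != 1:
        # Refuse to guess when there is no EIP or more than one.
        raise ValueError(
            f"expected exactly one EIP on {instance_id}, found {len(addresses)}"
        )
    address = addresses[0]
    # Downtime for the private subnets begins here.
    ec2.disassociate_address(AssociationId=address["AssociationId"])
    # Keep the allocation id: the NAT Gateway needs it later.
    return {
        "allocation_id": address["AllocationId"],
        "public_ip": address["PublicIp"],
    }
```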
Deploy the NAT Gateway(s)
At this point, you should have the EIP allocation id and public subnet id for the NAT that you intend to replace. If you aren’t moving your NAT EIP, you can generate one during the creation of the NAT Gateway. Open the VPC service and click NAT Gateways on the left side. Then click Create NAT Gateway.
Select the public subnet and EIP allocation id or create a new EIP. Then click Create a NAT Gateway.
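The same creation step can be sketched with the EC2 API. This assumes ec2 is a boto3 EC2 client and that the helper name create_gateway is just an illustration. If no allocation id is passed, the sketch allocates a fresh EIP, mirroring the “create a new EIP” option in the console.

```python
def create_gateway(ec2, subnet_id, allocation_id=None):
    """Create a NAT Gateway in a public subnet and wait until it is available.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    if allocation_id is None:
        # Not reusing the old NAT EIP, so allocate a new one for the VPC.
        allocation_id = ec2.allocate_address(Domain="vpc")["AllocationId"]
    resp = ec2.create_nat_gateway(SubnetId=subnet_id, AllocationId=allocation_id)
    nat_gateway_id = resp["NatGateway"]["NatGatewayId"]
    # Block until the gateway can carry traffic before touching any routes.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_gateway_id])
    return nat_gateway_id
```

Waiting for the gateway to become available before editing routes keeps the private subnets from pointing at a gateway that can’t forward traffic yet.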
Once you’ve created the NAT Gateway, you’ll be prompted to update your route tables. Click Edit Route Tables.
At this point, you’ll want to go through the route tables that reference the NAT instance you replaced and edit the route to reference the NAT Gateway instead. Unsurprisingly, NAT Gateway ids start with “nat”.
You’ll repeat this process for every NAT instance and subnet that you’ll be migrating.
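For more than a handful of route tables, the route edit can be scripted. The sketch below assumes ec2 is a boto3 EC2 client and route_tables is the "RouteTables" list from describe_route_tables. The function name repoint_routes is hypothetical. It only touches IPv4 routes that currently target the old NAT instance.

```python
def repoint_routes(ec2, route_tables, old_instance_id, nat_gateway_id):
    """Point every route that targets the old NAT instance at the NAT Gateway.

    `ec2` is assumed to be a boto3 EC2 client; `route_tables` is assumed to be
    the "RouteTables" list from describe_route_tables.
    Returns the (route table id, destination) pairs that were changed.
    """
    changed = []
    for table in route_tables:
        for route in table.get("Routes", []):
            destination = route.get("DestinationCidrBlock")
            if route.get("InstanceId") == old_instance_id and destination:
                # replace_route swaps the target of an existing route in place.
                ec2.replace_route(
                    RouteTableId=table["RouteTableId"],
                    DestinationCidrBlock=destination,
                    NatGatewayId=nat_gateway_id,
                )
                changed.append((table["RouteTableId"], destination))
    return changed
```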
Log into at least one instance in each private subnet and verify connectivity based on what is allowed. On a Linux box, running “curl icanhazip.com” will return the external IP address and quickly confirm that you’re connected to the Internet, and the reply should match the EIP attached to your NAT Gateway.
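If you want to automate the check instead of eyeballing the curl output, a small script run on each private instance can compare the reported address with the EIP you expect. This is only a sketch: the function names are made up, and it assumes the instance is allowed outbound HTTP to icanhazip.com.

```python
import ipaddress
import urllib.request


def fetch_external_ip(url="http://icanhazip.com", timeout=5):
    """Return the address the outside world sees for this instance."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()


def egress_matches(reported_ip, nat_gateway_eip):
    """True if the reported external IP equals the NAT Gateway's EIP."""
    # Parsing both sides tolerates stray whitespace and rejects junk input.
    reported = ipaddress.ip_address(reported_ip.strip())
    expected = ipaddress.ip_address(nat_gateway_eip.strip())
    return reported == expected


# Example usage on a private instance (the EIP value is a placeholder):
#   egress_matches(fetch_external_ip(), "203.0.113.10")
```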
Once you’ve migrated to the NAT Gateway and verified everything is working correctly, you’ll likely want to schedule the decommissioning of the NAT instances. If the instances aren’t in an ASG, you can stop the instances and set a calendar entry to terminate them at a safe time in the future. If they are in an ASG, you’ll need to set the minimum and maximum to 0 and let the ASG terminate the instance.
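The two decommissioning paths can be sketched as one helper. It assumes ec2 and autoscaling are boto3 clients for EC2 and Auto Scaling, and the name decommission_nat is hypothetical. For the ASG case the sketch also sets the desired capacity to 0 so the request stays consistent with the new maximum.

```python
def decommission_nat(ec2, autoscaling, instance_id, asg_name=None):
    """Stop a standalone NAT instance, or scale its ASG down to zero.

    `ec2` and `autoscaling` are assumed to be boto3 clients, e.g.
    boto3.client("ec2") and boto3.client("autoscaling").
    """
    if asg_name:
        # Let the ASG terminate the instance; stopping it by hand would
        # only cause the group to launch a replacement.
        autoscaling.update_auto_scaling_group(
            AutoScalingGroupName=asg_name,
            MinSize=0,
            MaxSize=0,
            DesiredCapacity=0,
        )
        return "scaled-to-zero"
    # Standalone instance: stop now, terminate later once you're confident.
    ec2.stop_instances(InstanceIds=[instance_id])
    return "stopped"
```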
-Coin Graham, Senior Cloud Engineer