Wednesday, April 17, 2019

Stitching Clouds - Azure to AWS Cisco CSRs behind IGW (static NAT)

The Microsoft Azure cloud has made tremendous strides in the past few years. Despite entering the market years after the market leader (AWS) had established its dominance, Microsoft has quickly built market share. The playbook was classic Microsoft - court enterprises, and leverage the overwhelming dominance of the Windows operating system with packaged and discounted OS licensing costs.

Because of this and more, Azure now holds a strong second place in the cloud market. Most of its infrastructure abstractions copy the AWS model, but use different names. Cisco has provided a solid mapping of names between the services.


Most enterprises begin with a single cloud and quickly realize that each cloud has its own benefits and potentially unique features, so almost all enterprises are now what the industry calls "multi-cloud". That means linking the networks of these clouds together so all services can communicate.

A good place to start is Azure, since what we build there will help us build our Cisco CSR IOS configuration. There are a half-dozen steps to building the VNG (Virtual Network Gateway, the parallel for the AWS VGW - Virtual Private Gateway), which Azure's documentation covers very well, so I won't rehash it. That doc will walk you through creating:

  • Gateway Network (where Azure-built public-facing services, like the VNG, will live)
  • Virtual Network Gateway (VNG), the device which terminates infrastructure VPNs
  • Local Network Gateways, the reference object which contains public IPs and other context information for your public VPN endpoints, similar to "Customer Gateways" in AWS-land
  • Connections, finally, the real VPN between your Azure tenant and the non-Azure VPN endpoint
Now, the docs from $MSFT are excellent, but there is some low-hanging fruit and a few gotchas, and I'll talk through them here. For the VNG, here are the gotchas:
  • HA - High Availability. If you're an enterprise, you want HA. However, the VNG doesn't launch with HA enabled by default, like it would in AWS. So let's go into the VNG and add HA. Flip "Active-active mode" to "Enabled". 
    • NOTE: When you save after updating the active-active mode, this will cause the VNG to be torn down and non-functional for about 20-30 minutes while it is being rebuilt in HA-mode. All your Local Network Gateways and Connections will endure, but the configuration of each will be affected, so it's a good idea to do this before moving on to the other items. 
  • Set the BGP ASN - it has to be a private (non-registered) ASN, but otherwise can be any value. Make sure it doesn't overlap with your CSRs or other ASNs, or your BGP attribute routing will get complex.
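
If you want to sanity-check an ASN before typing it into the portal, here is a minimal Python sketch. It assumes the private-use ranges from RFC 6996 and a set of ASNs you already use (the function names are mine, not anything Azure or Cisco provides). It does not know about ASNs that Azure reserves for itself, so check Azure's documentation for those.

```python
# Private-use ASN ranges per RFC 6996 (16-bit and 32-bit).
PRIVATE_ASN_RANGES = [(64512, 65534), (4200000000, 4294967294)]


def is_private_asn(asn):
    """Return True if the ASN falls in a private-use range."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


def check_vng_asn(vng_asn, existing_asns):
    """Return a list of problems; an empty list means the ASN looks usable."""
    problems = []
    if not is_private_asn(vng_asn):
        problems.append(f"ASN {vng_asn} is not in a private-use range")
    if vng_asn in existing_asns:
        problems.append(f"ASN {vng_asn} is already used elsewhere")
    return problems
```

Calling check_vng_asn(65010, {64512}) returns an empty list, while reusing 64512 would be flagged as an overlap.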

Go through and build all of these items for your CSR or CSRs. For the Local Network Gateway, make sure to add the BGP configuration and point at the public IP (Elastic IP) of your CSR. The BGP peer IP address will be inside your VPN, so it will be a local loopback on your device. I created a local loopback on my CSR with IP 10.255.255.1. This interface and IP will be what you tell BGP to source your connection from when you build BGP across this tunnel.


Once all that configuration information is saved, switch to the "Connection" configuration item. At this point, it knows everything it should be doing, but there is a gotcha, where you need to enable BGP across this connection. It's a simple on/off, so flip it to "Enabled" and hit save. It'll take a minute to update. 


Now let's let Azure do the work for us. Navigate into your Connection to your CSR, click on "Overview" in the left column, and then click "Download Configuration." There are lots of configuration types; pick the one most relevant to you. For our Cisco CSRs, the one that is easiest for me to read and works is the IOS (ISR, ASR) template.


My advice is to copy all the configuration out to a notepad and make sure you save a copy. Save as you go, because we'll be making a few very important changes in order for this to work.

Right at the top, there are a few items to check. First, make sure there are two public IPs listed. If there aren't, your HA mode isn't finished rebuilding or isn't enabled. I don't recommend any enterprise of any size move forward without HA. It's always worth the investment.

!   > Public IP addresses:   
!     + Public IP 1:         1.2.3.4
!     + Public IP 2:         5.6.7.8

Second, make sure this connection is built with BGP enabled. If you don't see "True" here, or if the line is missing, go back and double-check your Connection, Local Network Gateway, and Virtual Network Gateway config - one of them will be missing the BGP=Enabled section or was rebuilding after a save when you downloaded config. 

!   > Azure virtual network
!     + Enable BGP:            True
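
If you'd rather not eyeball the header every time you re-download the template, the two checks above can be sketched in a few lines of Python. This assumes the comment lines look exactly like the excerpts shown here. The function name and the sample text are mine, so adjust the patterns if your template differs.

```python
import re


def check_header(config_text):
    """Return a list of problems found in the template's comment header."""
    problems = []
    # Lines look like: "!     + Public IP 1:         1.2.3.4"
    public_ips = re.findall(
        r"^![ \t]*\+[ \t]*Public IP \d+:[ \t]*(\S+)", config_text, re.M
    )
    if len(public_ips) < 2:
        problems.append(f"expected 2 public IPs, found {len(public_ips)}")
    # Line looks like: "!     + Enable BGP:            True"
    bgp = re.search(r"^![ \t]*\+[ \t]*Enable BGP:[ \t]*(\S+)", config_text, re.M)
    if bgp is None or bgp.group(1) != "True":
        problems.append("BGP is not enabled on this connection")
    return problems


sample = """\
!   > Public IP addresses:
!     + Public IP 1:         1.2.3.4
!     + Public IP 2:         5.6.7.8
!   > Azure virtual network
!     + Enable BGP:            True
"""
print(check_header(sample))  # an empty list means both checks passed
```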

In the IKEv2 policy, make sure to update the local IP address to your PRIVATE IP address. Hosts in AWS aren't directly on the internet; they almost always sit behind an EIP doing 1:1 NAT, so the CSR never sees the public address on its own interface.

crypto ikev2 policy AzureCSR1
  proposal AzureCSR1-proposal
  match address local 10.20.30.40
  exit

The IKEv2 profile has the same issue, matching on the public (EIP) address rather than the local one, and needs to be updated:

crypto ikev2 profile AzureCSR1-profile
  match address  local 10.20.30.40
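
Both of those edits are the same find-and-replace, so here is a rough Python sketch that does it across the whole downloaded template. The function name is hypothetical, and it assumes the template writes the EIP on "match address local" lines as shown above. It only touches those lines, so the public IP stays intact anywhere else it appears.

```python
import re


def use_private_ip(config_text, public_ip, private_ip):
    """Swap the EIP for the private IP on 'match address local' lines only."""
    pattern = re.compile(
        r"^([ \t]*match address[ \t]+local[ \t]+)"
        + re.escape(public_ip)
        + r"[ \t]*$",
        re.M,
    )
    # A function replacement avoids any backslash handling in the new text.
    return pattern.sub(lambda m: m.group(1) + private_ip, config_text)
```

For example, use_private_ip(template_text, "203.0.113.10", "10.20.30.40") rewrites both the policy and the profile in one pass, where 203.0.113.10 stands in for your EIP.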

The tunnels that are built require a slight modification - when HA is used, there are two destinations, requiring two tunnels. In Cisco-land, when two tunnels share an IPsec profile, they require the "shared" keyword. I have no idea why this is, but Azure's auto-config builder misses this, and that means the CSR can only bring up one tunnel at a time... unless you make this change on each tunnel:

int tun 90
 tunnel protection ipsec profile AzureCSR1-IPsecProfile shared
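
Since this has to be done on every tunnel in the template, a small Python sketch can patch them all at once. It assumes the tunnel protection lines have the same shape as the one above, and the function name is just for illustration.

```python
import re


def add_shared_keyword(config_text):
    """Append 'shared' to tunnel protection lines that do not already have it."""
    pattern = re.compile(
        r"^([ \t]*tunnel protection ipsec profile[ \t]+\S+)[ \t]*$", re.M
    )
    # Lines that already end in 'shared' have extra text after the profile
    # name, so the pattern does not match them and they are left alone.
    return pattern.sub(lambda m: m.group(1) + " shared", config_text)
```

Running it twice over the same text is harmless, which is handy if you re-download the config and lose track of what you already fixed.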

Remember that all tunnels in Cisco's transitCSR product for AWS live in VRFs, and it's a good idea for these tunnels to also exist in VRFs. First, let's build the VRF. 

ip vrf AzureCSR1_VRF
 rd 64518:200
 route-target export 64518:0
 route-target import 64518:0

Then put the new loopback interface in the VRF:

int loopback 90
 ip vrf forwarding AzureCSR1_VRF
 ip address 10.255.255.1 255.255.255.255
 no shut

Make sure to update the BGP config to source traffic from your new interface, and make sure to configure the BGP neighbor in the Azure connectivity VRF:

router bgp 64512
 address-family ipv4 vrf AzureCSR1_VRF
  neighbor 10.9.255.228 remote-as 65555
  neighbor 10.9.255.228 activate
  neighbor 10.9.255.228 update-source loopback90
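
In HA mode there is a second Azure BGP peer, and it needs the same three lines. Here is a minimal Python sketch that renders the stanza for however many peers you have. The function and its parameters are my own invention, and it only emits the commands shown above, so anything else Azure's template adds for the neighbor still has to come from the template.

```python
def bgp_neighbor_lines(local_asn, vrf, remote_asn, peers, source_if):
    """Render a 'router bgp' stanza with one neighbor block per Azure peer."""
    lines = [f"router bgp {local_asn}", f" address-family ipv4 vrf {vrf}"]
    for peer in peers:
        lines.append(f"  neighbor {peer} remote-as {remote_asn}")
        lines.append(f"  neighbor {peer} activate")
        lines.append(f"  neighbor {peer} update-source {source_if}")
    return "\n".join(lines)


print(bgp_neighbor_lines(64512, "AzureCSR1_VRF", 65555,
                         ["10.9.255.228", "10.9.255.229"], "loopback90"))
```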

And there's one final oddity with Azure's provided config - they recommend you use an APIPA reserved address (169.254.X.X) for your tunnel, and a /32 to boot. That means the router can't inherently work out which traffic to send over the tunnel, even though all the pieces are in place. The trick to kick-start it all is to add a static route over the tunnel towards each BGP neighbor so they can establish a neighborship and start routing. If you are using HA mode (again, highly recommended), make sure to send each BGP neighbor over the appropriate tunnel.

ip route vrf AzureCSR1_VRF 10.9.255.228 255.255.255.255 tun 90
ip route vrf AzureCSR1_VRF 10.9.255.229 255.255.255.255 tun 91
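
The peer-to-tunnel pairing is the part that's easy to get backwards, so here is a small Python sketch that generates the routes from an explicit mapping. The function name and the dictionary layout are assumptions of mine. The output format just mirrors the two lines above.

```python
def static_routes(vrf, peer_to_tunnel):
    """Render one /32 static route per BGP peer, pinned to its tunnel."""
    return [
        f"ip route vrf {vrf} {peer} 255.255.255.255 tun {tunnel}"
        for peer, tunnel in peer_to_tunnel.items()
    ]


for line in static_routes("AzureCSR1_VRF",
                          {"10.9.255.228": 90, "10.9.255.229": 91}):
    print(line)
```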

And bam, you now have routing between your Cisco transit CSRs and an Azure tenant. There's no fancy lambda to configure connectivity that I'm aware of - but if you stumble across any Azure-focused Cisco automation here, please link in the comments and we can share with the community.

Thanks all. Good luck out there!
kyler
