So I ran into a networking situation this week while deploying our first integrated Azure Stack. We had planned to use BGP routing for our border connectivity, but we hit some speed bumps along the way that almost led us to switch to static routing.
First, I would like to point out that I am not very proficient when it comes to networking. I have a solid knowledge of networking at a software-defined level within Microsoft technologies like SDNv2 and Azure. Outside of that, when it comes to routing and more complex network-related topics, I will admit I am not the guy to ask for help.
This blog is about my experiences working with Dell EMC and our network team, and how we came to our final architecture decision to work around the issues we had with the way Big Cloud Fabric manages BGP.
- What Is BGP? (Very High Level)
- The Big Cloud Fabric (BCF) Environment
- Our Options and Our Solution
What Is BGP (A very high-level overview!)
What is BGP (Border Gateway Protocol)? The answer from Wikipedia: it is a standardized exterior gateway protocol designed to exchange routing and reachability information among autonomous systems (AS) on the Internet.
BGP is something that I would recommend over static routing. BGP gives us a dynamic routing protocol that is aware of changes in our network. Static routing, well, it uses static routes: it requires more planning before any changes are made, and it can also cause issues once those changes go in. Microsoft doesn't recommend the static routing method, but they will support it.
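To make that difference concrete, here is a toy sketch, not a real routing stack, of why a dynamic protocol reacts to failures while static routes just sit there. Every class, prefix, and address below is made up for the illustration.

```python
# Toy illustration of dynamic vs. static routes (all names/addresses are
# invented for this example; this is not how a real switch implements BGP).

class RouteTable:
    """Minimal route table that remembers where each route came from."""

    def __init__(self):
        self.routes = {}  # prefix -> (next_hop, source)

    def add_static(self, prefix, next_hop):
        self.routes[prefix] = (next_hop, "static")

    def learn_bgp(self, prefix, next_hop):
        self.routes[prefix] = (next_hop, "bgp")

    def peer_down(self, next_hop):
        # A dynamic protocol withdraws routes learned from a failed peer
        # automatically; static routes keep pointing at the dead next hop
        # until a human edits the configuration.
        self.routes = {
            prefix: (nh, source)
            for prefix, (nh, source) in self.routes.items()
            if not (source == "bgp" and nh == next_hop)
        }

table = RouteTable()
table.add_static("10.0.0.0/24", "192.0.2.1")  # manually configured
table.learn_bgp("10.0.1.0/24", "192.0.2.1")   # learned dynamically
table.peer_down("192.0.2.1")                  # the peer fails
print(table.routes)  # only the stale static route is left behind
```

The dynamically learned route disappears on its own when the peer goes down; the static one has to be cleaned up by hand, which is exactly the "more planning before any changes" problem.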
So instead of getting deep into what Azure Stack's BGP routing is, I am going to be lazy here and link you to the Microsoft documentation for a deeper dive. I will go over a little of it myself, but trust me, you will get a better idea by reading the Microsoft documentation first.
Below is a high-level design of how the TOR (Top of Rack) switches within an Azure Stack scale unit currently connect to the border network switches. I say currently because this design will change once they start supporting multiple scale units in a region.
In a normal environment, the two TOR switches from an Azure Stack stamp would physically cross-connect to our border switches, and the routing would be done via BGP.
The Big Cloud Fabric and Our Issues
I am not going to talk about what the Big Cloud Fabric is at a deep level. For one thing, I don't know much about the product outside of what I read on their website. What I know comes directly from the issues we ran into trying to integrate our first multi-node Azure Stack into our datacenter. From what I have researched, this product is very powerful, and our issues were not with BCF itself but with how BGP works within BCF.
In the first section, I mentioned BGP routing and how Azure Stack's TOR switches connect to our border network switches. Normally you would have an ASN assigned on those border switches with a BGP process running on them. The Azure Stack TORs would then peer with that process, routes would be exchanged, connectivity would happen, and bingo, we'd have internet access or something like that. That is my version of what happens and I am sticking to it.
Now, with BCF, the switches are all managed by a centralized control plane. All processes run on that control plane and not on the switches themselves. The BCF website has a good diagram that I took and posted below.
So what exactly was our issue, and where am I going with this blog, you ask? As I mentioned, there are no BGP processes running at the switch level. The switches that our Azure Stack TORs connect to and peer BGP with do not have that capability; all those processes run on the centralized controllers. Not a bad design, unless you are trying to integrate an Azure Stack and Microsoft is not allowing configuration changes to their networking design. To be fair to Microsoft, I don't blame them for how tight they have been with the OEMs about making changes. This is something that needs to be controlled, and in time some of these needed changes may take place.
With no BGP process on the border switches, we would have issues trying to get BGP to peer correctly from our Stack TORs to the Big Cloud Fabric. We had to make some minor changes to the TOR configuration. "We" as in the Dell EMC engineer who was on site with me working out these issues.
Below is a high-level example of our environment. With this setup, the default TOR configurations generated from the Azure Stack Deployment Worksheet wouldn't work. Since there is no BGP process running on those border switches, we would never get BGP to peer and would never get dynamic routes. This meant we needed to change where the TORs were trying to peer.
Our Dell EMC engineer and network team worked together to come up with the configuration changes that we applied to the TORs within Azure Stack.
- They removed the two BGP neighbors pointing at the border switches from each TOR switch.
- They added the centralized controller as a BGP neighbor on each TOR switch.
- They enabled ebgp-multihop with a TTL of 2 for the new neighbor on each TOR switch.
- They also had to add a static route. (Sorry, I did not document this one, and my network routing skills can't remember what route they had to add.)
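Put together, those changes would look something like the sketch below. This is a hypothetical, generic BGP-style configuration fragment: the exact syntax depends on the switch OS, and every ASN and IP address here is a documentation placeholder, not our real environment.

```
! Hypothetical sketch -- placeholder ASNs/IPs; real syntax varies by switch OS.
router bgp 64999
  ! 1. Remove the two border-switch neighbors from the generated TOR config
  no neighbor 192.0.2.1
  no neighbor 192.0.2.2
  ! 2. Peer with the BCF centralized controller instead
  neighbor 198.51.100.10 remote-as 65001
  ! 3. The controller is more than one hop away, so allow an extra TTL hop
  neighbor 198.51.100.10 ebgp-multihop 2
! 4. A static route was also required (I did not record which one)
```

The key idea is simply that the peering target moves from the directly connected border switches to the controller, with ebgp-multihop allowing the session to form across the extra hop.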
At the end of the day, we had a configuration that worked. BGP was active and connected, and we started to see routes. We could now get out of the BMC network and had the correct routing to get back into it.
So, we are happy, right? We can now click the magic button to start the deployment and go to lunch? Not so fast! We, again I mean Dell EMC, now had to submit these configuration changes to Microsoft for approval. Small changes, yes, but even small ones could, and most likely would, get denied by Microsoft.
Our Options blah blah blah (and a solution)
So, now that we had a configuration that let us keep using BGP with the Big Cloud Fabric, we had to wait for Microsoft to approve it. Dell EMC would have to officially request these changes via some process they have with Microsoft, and Dell EMC themselves would have to approve the changes as well. So, what were our options? What if Microsoft denied the changes? Do we wait around for their decision?
So we started to brainstorm alternatives just in case Microsoft denied our configuration change requests. We came up with a few different options:
- Wait for Microsoft to approve these configuration changes. This wasn't really a valid option for us at the time.
- Find a way around BCF.
- Go with Static Routing instead of BGP Routing.
Option 1 wasn't going to work for us; Dell EMC had told us that configuration changes are not easy to get approved, which is why we looked for other options in the first place. Option 3 was something we didn't want to fall back to. We really needed to come up with another option that would work.
So, we came up with our solution, which later turned into a twist on what might be coming down the road anyway... is this foreshadowing? Not sure, I am not an English major.
Our Solution and Our Decision
We decided to place two more switches, not managed by the Big Cloud Fabric, between the Azure Stack TORs and the BCF switches. The only thing we had available were two Dell 6010 switches. A little overkill for two ports on each switch, but this would give us room to grow for future stacks anyway. (hint hint)
This allowed us to fall back to the original configuration that had already been approved by Microsoft. It also let us make configuration changes on our own gear and leave the TORs alone. At the end of the day, I guess you can say we created our own little spin on Azure Stack deployments that lets us deploy Azure Stack and still keep our cloud networking team happy.
Update: I have heard that our network team actually just routed us around the Big Cloud Fabric altogether now. I will update this blog with the correct information once I get confirmation. However, the original plan above would have worked for us at the end of the day as well.
I don’t really have any final notes or thoughts, to be honest. Except for one thing, I really need to learn networking at a deeper level. This has become clearer now working with Stack then ever before.
Also, I love this technology, but I will admit that it isn't perfect yet. There is plenty of room for this product to grow. My network guys constantly tell me Microsoft needs to work better with various types of environments. This was a good example, but one that could be fixed fairly easily, as we discovered.
Now, as to the spine that we created for our Azure Stacks moving forward: I can't confirm or deny anything, but be on the lookout for something like what we did here with your OEM's networking equipment in order to support multiple scale units in a region.