Saturday, December 6, 2014

The Use of NAT Mode on Meraki MR Access Points

In networking, I find myself looking at certain features of wireless equipment and asking myself under what circumstance would I implement this feature for a customer.  I try to come up with a list of Pros and Cons as to when it's appropriate.  One that popped up recently was NAT Mode / Meraki DHCP.



Thursday, November 20, 2014

Hallway Design Nightmares Part 2: TXPower

One of the effects of the hallway design is that Radio Resource Management (RRM) frequently doesn't work as expected.  It's not that it doesn't work; it's just that hallway designs significantly limit the perspective of how APs see each other, which is primarily how RRM determines what it should be doing.  Given that all the APs in the hall generally see each other at or about the same level, they all have a very similar view of the network.  For illustration purposes, I've mocked up an imaginary residence hall.  A specific hall that I visited recently was built with concrete walls.  We will talk about some of the things I saw, and how they apply to this generic sample floor.



For this example, I've placed 3 APs where the black dots are.  Let's also assume that this is the 2nd floor in a 3-story building.



Hallway Design Nightmares Part 1: Introduction

A lot of people know that hallway designs are a bit of a pet peeve of mine.  My opinion is that they do not work, and beyond moving the APs out of the hall, you cannot fix them.  But the reality is that moving APs and existing infrastructure can be really expensive, time consuming and difficult to get management to buy off on.  This means that you probably will have to live with some of your existing hallway designs for the foreseeable future.

In this series, my aim is to explore the options available to improving performance of a hallway design.  As an engineer working for a Cisco Gold Partner, I'm going to talk about some of the tools available in the Cisco Unified Wireless Network (CUWN) to help deal with these designs.

I'll leave this short intro post with the following advice:
Don't put APs in the hallway.
Put APs as close to clients as you can get them.
Think about how RRM works, and incorporate that into your designs
Don't put APs in the hallway, PLEASE!

Monday, October 6, 2014

Avaya: Building the Shortest Path Bridge

It's been... well a while since Avaya presented at #WFD7 and I find myself thinking once or twice a day about SPB networks and how someone (maybe even me) might build a campus network using this technology.



Tuesday, September 2, 2014

Airtight and Scrape: An Interesting Social Wifi Use Case

Social Wifi is one of those polarizing topics where people are either "Meh, I don't care," or "OMG, NOOOOO!"

At the latest Wireless Field Day, I had the pleasure of meeting Drew Lentz (@Wirelessnerd) who I've interacted with over Twitter for a couple years.  At Airtight, he was presenting Scrape, an application and use case for social Wi-Fi.  There's a lot to like about Scrape.  One is that you trade your social media information for a better experience.




Monday, August 4, 2014

Teaser: Project Rawrbox - My idea for remote wireless diagnostics

So I'm sure everyone has noticed that the blog has quieted down a bit since #WFD6.  I could tell you that work's been busy, that I've completed my CCIE-W (#43153), but those are excuses.  I will put some more focus into the blog for the next few months as I wrap up some things in the real world.

What I can tell you is that I have been working on a new project that I hope to complete in the next few months.  I had hoped to have it ready to show off to the other delegates at #WFD7, but it's not going to make it.

The idea behind the project is simple: a wireless diagnostic rig that I can ship to a remote site to perform basic packet captures, gather performance data and maybe even do spectrum analysis, all for under $250 USD.  The goal is to publish the software as an RPI image and give others the ability to use some of the tools I've built to create their own rigs.  The plans, software and image will be published for the community once it's finished.

Today the hardware looks something like this:

Raspberry PI B+ running latest Raspbian
High Performance MicroSD Card (Lexar)
3x USB Pigtails (2x RH, 1x LH)
3x USB Wifi Adapters (currently evaluating)
Pelican Case
TP Link PoE Splitter
Miscellaneous usb and power cables.

Currently I have a number of Python scripts to automate the packet capture process and the uploading of files to both an FTP server and Cloudshark, and I am working on the task scheduler right now.  I have support for AutoSSH for remote management.  Eventually there will be a web interface to simplify the operations, and a setup script to personalize the image to your environment.
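To give a feel for the kind of automation involved, here is a minimal sketch of how a capture job could be named and launched.  This is not the actual Rawrbox code; the function names, the site label, the interface name "mon0" and the output directory are all hypothetical.  It only builds the command line and runs nothing.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def capture_filename(site: str, channel: int, when: datetime) -> str:
    """Build a sortable capture name, e.g. 'site-a_ch36_20140804T120000Z.pcap'."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{site}_ch{channel}_{stamp}.pcap"


def tcpdump_command(interface: str, out_dir: PurePosixPath,
                    filename: str, seconds: int) -> list[str]:
    """Return the argv for a time-limited capture; nothing is executed here."""
    return [
        "timeout", str(seconds),        # coreutils: stop the capture after N seconds
        "tcpdump",
        "-i", interface,                # capture interface (hypothetical name)
        "-w", str(out_dir / filename),  # write raw packets to a pcap file
    ]


when = datetime(2014, 8, 4, 12, 0, 0, tzinfo=timezone.utc)
name = capture_filename("site-a", 36, when)
cmd = tcpdump_command("mon0", PurePosixPath("/tmp/caps"), name, 60)
print(" ".join(cmd))
```

On the rig itself, the argv list could be handed to subprocess.run, and the finished file pushed to the FTP server with the standard library's ftplib.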

There are some hardware limitations with the RPI hardware, but I'm hopeful that I can overcome most of these to get this project published soon.

Thanks for your patience and you'll see more on this soon.

Why SDN is coming to a wireless network near you

We are literally days away from the 7th installment of Wireless Field Day.  I'm really hoping (borderline begging) to hear a couple vendors talk about their Software Defined Network (SDN) solutions.  Now before you hop on the "Shut-up about SDN" bandwagon, here's why I see this as a hot topic right now.

I see SDN changing the way we build networks, and not just in the datacenter.  In the past we built large L2 networks and we eventually hit scaling limitations.  Then we started building routed networks, which solved the scalability problems, and introduced a whole new set of challenges around management of L2 domains, L2 adjacency requirements and troubleshooting routing protocols.

With SDN, the possibility arises where we could have the best of both worlds, and hopefully not the worst of both.  I believe that we will start building underlay L3 networks between our core and access layers, and all the L2 will exist as a dynamic overlay on top.

And since a lot of wireless vendors already have tunneled overlays for their wireless traffic, the question is not IF SDN will be coming to a wireless network near you, but when, and how that will look.  Just the idea of having VXLAN dynamic overlays for your wireless clients gets me all excited.  And with VTEP support being baked into silicon for switches, there are a lot of possibilities for new and innovative solutions to emerge.
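For anyone who hasn't looked at what a VXLAN overlay actually puts on the wire, here is a small sketch of the 8-byte VXLAN header as laid out in RFC 7348.  The idea of giving each wireless client segment its own VNI is my assumption for illustration, and the helper names are made up.

```python
import struct

VXLAN_UDP_PORT = 4789   # IANA-assigned destination port for VXLAN
VNI_VALID_FLAG = 0x08   # the "I" bit: the VNI field carries a valid value


def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits + 24 reserved bits.  Word 2: 24-bit VNI + 8 reserved bits.
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Recover the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VNI_VALID_FLAG:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


header = vxlan_header(5000)   # hypothetical VNI for a wireless client segment
print(header.hex(), parse_vni(header))
```

The original L2 frame simply follows this header inside a UDP datagram, which is what lets the L2 segment ride on top of a routed underlay.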

I hope you will all join in on the Wireless Field Day experience this week.  There will be some great discussions, both around SDN and other mobility topics.  All the sessions will be streamed live over at TechFieldDay.com

Sunday, July 13, 2014

Thoughts on Eliminating VLANs at the Access-Layer Edge

There's a lot of talk in the industry about getting away from VLAN segmentation and relying on stateful firewalls at our access-layer edge to govern what users have access to.  This is a great idea: it solves issues with IPv6 and it simplifies network design.  But there are some significant challenges that make it a no-go for today's enterprise networks.  Most vendors claim that the "stateful" firewalls in their APs and edge switches solve those challenges.  But I find the current generation of these solutions inadequate to solve this issue in enterprise networks.

Issue #1: I need your identity at more places than the access-layer edge
Web Content Filtering is a great example of needing your identity elsewhere in the network.  In a restrictive corporate environment, there are Active Directory integrations that help solve these problems, but what about non-domain devices?  What about organizations with the Internet of Everything?

I've seen solutions from RADIUS Accounting integration to agents on Domain Controllers, but these are usually point solutions; I personally have not had good luck with them, nor are they widely supported.  They are also specific to a single device, so you can't send the identity to multiple devices.  Datacenter firewalls are another place where this falls apart.  How do I write an ACL based off your identity when I may or may not have your identity?  The solution inevitably leads to more identity verification: Captive Web Portals, VPN clients, etc.

Issue #2: Scalability of ACLs
Anyone who has tried to write complex ACLs to govern what a client can or can't get to, can tell you that ACLs get very unwieldy very quickly.  You CANNOT effectively write ACLs for every resource and port that every potential client should be able to get to.  It's also not effective as these ACLs take up precious TCAM space in our network equipment.
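To put a rough number on "unwieldy", here is a back-of-the-envelope sketch.  The counts are made up for illustration, and it assumes the worst case where every role needs its own entry for every resource and port.

```python
def ace_count(client_roles: int, resources: int, ports_per_resource: int) -> int:
    """Worst-case number of ACL entries if every role/resource/port pair gets its own line."""
    return client_roles * resources * ports_per_resource


# Hypothetical campus: 10 client roles, 200 protected resources, 3 ports each.
print(ace_count(10, 200, 3))   # 6000 entries competing for TCAM space
```

Add one more role and you add another 600 lines, which is why per-client ACLs at the edge stop scaling so quickly.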

Solutions?
The solution to this problem is an identity exchange.  Cisco has a pair of technologies called SGT and SXP with ISE or ACS (part of their TrustSec solution) that attempt to solve this problem.  Instead of filtering traffic with ACLs on ingress to the network, they determine identity at the edge, pass that identity information to the rest of the network, and filter packets on egress from the network.  Both protocols are Cisco proprietary, but the idea is sound.  While I'm not a fan of having to have special hardware to pass this info around the network, the idea of a central identity repository that all devices have access to solves the issue of having to filter all packets at the access-layer edge.  We allow the rest of the network to share in this burden and create a solution that scales.
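Here is a toy model of the "tag at ingress, filter at egress" idea.  It is a conceptual sketch only and not how TrustSec is actually implemented; the tag numbers, IP addresses and policy entries are all invented.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    src_tag: int   # identity tag attached where the client enters the network


# Hypothetical group tags.
EMPLOYEE, GUEST, PRINTERS, FINANCE = 10, 20, 100, 200

# The edge learns who the client is (e.g. from authentication) and tags it once.
INGRESS_TAGS = {"10.1.1.5": EMPLOYEE, "10.1.2.9": GUEST}
# The egress device only needs to know the tag of what sits behind it.
DEST_TAGS = {"10.9.0.10": PRINTERS, "10.9.1.20": FINANCE}
# Policy is a small matrix of (source tag, destination tag) pairs.
EGRESS_POLICY = {(EMPLOYEE, PRINTERS), (EMPLOYEE, FINANCE), (GUEST, PRINTERS)}


def classify_at_ingress(src_ip: str, dst_ip: str) -> Packet:
    """Attach the source identity tag; no filtering happens here."""
    return Packet(src_ip, dst_ip, INGRESS_TAGS[src_ip])


def permit_at_egress(packet: Packet) -> bool:
    """Filter on the tag pair instead of on per-client IP and port lists."""
    return (packet.src_tag, DEST_TAGS[packet.dst_ip]) in EGRESS_POLICY


print(permit_at_egress(classify_at_ingress("10.1.2.9", "10.9.1.20")))  # guest to finance
```

The point of the design is that the policy table grows with the number of groups, not with the number of clients or addresses.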

Jake's Opinion:
Personally, I don't think we will see single VLAN designs be successful for quite some time.  The wide variety of firewalls, web content filtering, lack of network-wide identity and complex nature of BYOD policies really prevent us from completely abstracting out the device's IP addressing.  My hope is that with the upcoming SDN-apocalypse we will see SDN solutions providing ways to distribute identity throughout the network and get us closer than ever to the simplified access-layer edge that so many vendors are suggesting today.

Wednesday, April 9, 2014

Scanning internal resources for Heartbleed:

Often I get tapped to look at or figure out things not directly related to the mobility space.  Today was one of those days, as we had a number of customers inquiring about Heartbleed, or CVE-2014-0160.  I won't go into a lot of detail about the mechanics of the vulnerability, but I have been pretty concerned with how I know if my <insert linux based appliance i don't control> is vulnerable.

For public-facing sites, there are a number of scanners; the one I use is over at http://filippo.io/Heartbleed/  I tested a number of the cloud sites I use and found that one of my cloud wifi sites is/was affected.  But then the question came around of how do I determine if my internal resources were affected.


Monday, March 10, 2014

Aerohive Hivemanager: A System of Control

So you will never win an argument with an @Aerohive agent employee about controllers.  I've discussed with them how there are some things (a lot really) that HiveManager does that are functions controllers perform.  "Management Plane!" they scream in unison.  No-one who has faced an @Aerohive agent employee with the word controller has lived.  We have survived by not mentioning controllers around them, but sooner or later someone is going to have to fight them.

Well, this isn't the matrix Neo, but what @Aerohive does with their HiveManager product is implement a system of control.  That system is designed to turn this:





into this:




*EDIT: This is a joke. I'm not saying that HiveManager is a WLC.



Poor Abby, she gave me lots of fodder with her recent blog : What Is a WLAN Controller
It's a good article, well written and she articulates her point clearly, although I don't agree with most of it.  Please don't take this as an attack on your blog, but I think that the good ol' WLC will be around for a long time no matter how much the cooperative control marketing gets thrown around.

But Abby does try to define exactly what a wireless LAN controller is.  Something I challenged her to define.  And I did warn you #ItsATrap



Oxford Dictionary:


She goes on to define what she calls a wireless LAN controller, but I think she misses a lot.  Her definition is: "A wireless LAN controller is a device that directs or regulates traffic on the wireless network..."  I'm truncating her definition.  Not because I want to be snarky, but because you cannot define a WLAN controller by a vendor that sells WLAN controllers.

I actually find her definition pretty weak.  A WLC does a LOT more than directing and regulating wireless traffic.  I would say a better definition would be "A WLAN Controller directs or regulates the operation of a Wireless LAN."  While the distinction is small, a WLC is much more than the regulation of wireless traffic.  You could even say that a WLC is a collection of multiple "System of Control." A WLC also regulates what a wireless AP can or can't do based on its features. I'm going to take a discussion that I've had with Abby and other @Aerohive guys around their BR series of routers/APs.

Did you know that a BR200 can operate as just an AP/Switch?  Bet you didn't.  That's because it isn't supported by @Aerohive.   Now the BR100 does this and is supported.  So what I did was take the config that my HMOL pushed down to my BR100 (Thanks @Aerohive for providing me with one for attending #WLPC) and applied those same commands to my BR200 and Voila, an AP with a switch.  All the commands are documented in their CLI Reference guides http://www.aerohive.com/support/tech-docs-and-online-training but they are supposedly not supported because HiveManager does not support this on the BR200 platform.

The moral of this story is that HiveManager, a supposed "Network Management System" regulates what an AP can or can't do.  I would say that it is not a NMS because it regulates and/or directs what can or cannot be configured on the equipment.  And once you configure things via the CLI, forget about managing it with HiveManager.  Especially if you configure something that isn't possible with HM.  Here's another example:

When trying to configure a port-forward on a BR200, I ran across the limitation that HMOL will only allow you to forward a port that is directly connected to the BR200 (the BR200 is the L3 interface for).  I was trying to setup connectivity from my BR200 over to my home lab, and wanted to forward a port across a L3 routed link to my console server.  No dice, HMOL pukes when trying to configure this.  But if you configure the port-forward from the CLI of the BR200, it works, and works well (although likely unsupported).

There you have it, HiveManager, while not a controller, is certainly a system of control.  I'll tell you something else: even with a WLC you don't always "HAVE TO" have a NMS.  I can deploy a WLC in a small network and configure and operate it without deploying the NMS.  And while I'm sure you *could* deploy @Aerohive gear without HM or HMOL, it would be easier to try to rescue Morpheus from a military-controlled building being held by three agents.  I believe even the @Aerohive team-members would suggest spinning up a temporary version of HiveManager to get the initial config implemented.

P.S. Please don't take this as "I don't like Aerohive equipment."  The hardware is what you would expect from an enterprise equipment manufacturer.  It's been great in the lab and my 330 has been driving my wife's wireless network for over a year with very few issues.  The BR200 and 100 have also been good, but the lack of support in HMOL for features has left a bitter taste in my mouth, especially around HiveManager.  I really think @Aerohive needs to take a hard look at HiveManager and follow the philosophy of "As simple as possible, but no simpler."  Right now, the biggest thing holding back @Aerohive is HiveManager.

*Edit:
After a lengthy and good discussion on twitter, I thought I would clarify that I'm not saying that HM is a controller.  It is fun to make correlations between HM and functions that common WLCs perform.  When I say control, I do not mean a control plane.  I just mean that the HM exerts control over the equipment.  While it doesn't participate in the control plane, limiting what you can do on the equipment in my book is defined as control.

Thursday, February 27, 2014

Airtight Networks: A look at WIPS Part 2 - Over the Air

So in Part 1 of this series on Airtight WIPS, we looked at how Airtight determines if an AP is rogue, or an "on-network rogue" by some vendors' definition.  Now that we have found an AP plugged into my network, what do I do with it?  In Airtight, we can quarantine the rogue AP, meaning we will try to prevent clients from associating to it.

In looking at what the Airtight system is doing, I decided to leverage my 1 year eval of Omnipeek acquired from the #WLPC conference.  Thanks @KeithRParsons for putting this on.
But before we look at what Airtight is doing to protect me, let's take a baseline.  Just an example of our client associating to this 



Here you can see a pretty textbook capture of the authentication/association process of "Jake's iPhone 5S" to my survey-5 SSID (open).

Now let's engage the quarantine of this rogue AP.  For this example, we are going to manually quarantine one of my rogue APs.  I chose manual just to ensure I am not accidentally containing any of my home networks and impacting my wife during testing.  Here is a quick screenshot of me doing this.  It's a pretty simple process: just select the rogue AP and click the Quarantine button.



Now that we are protecting my clients from the big bad "survey-5" rogue network, I'm going to repeat the process.



So for starters we see a deauthentication frame after both the Association Request and Response.  Let's look at the deauth sent just after the Association Request.  The first indication I see is that the deauth frame was sent at 6Mbps, whereas in the previous example, all the management frames were sent at 12Mbps.  This tells me this is the Airtight AP sending this frame and spoofing the rogue SSID.  Now let's look at the details of this packet:

So the Airtight AP sent a broadcast de-authentication packet spoofed from the rogue SSID.  This is fine since A. this rogue is on my network and B. I've decided to quarantine it.  I was half expecting to see a unicast deauth, but broadcast works fine in this scenario.
Now let's look at the deauthentication that occurs right after the association response.  This time it appears that the deauth is coming from my iPhone.  Notice again that it is sent at 6Mbps.



We can see that the Airtight is now pretending to be my iPhone, saying that my iPhone has decided to leave the rogue SSID.  I honestly didn't expect this, but it totally makes sense.  In addition to making sure the client has left, it makes sense to also send the AP a deauth so it ends the client's session, ensuring the client is dropped if it doesn't listen to the deauth message.  Comparing the signal strength to my client:



We see a 22 dB difference, so I'm positive that this packet isn't coming from my phone, which is very close to the capture adapter.  Both the rogue and the Airtight AP are in the lab about 30 feet away.
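The same signal-strength check is easy to automate.  Below is a minimal sketch that compares a suspect frame against a baseline of frames known to come from the real client.  The dBm readings and the 15 dB threshold are made-up illustration values; only the 22 dB gap comes from my capture.

```python
from statistics import median


def likely_spoofed(baseline_dbm: list[int], frame_dbm: int, threshold_db: int = 15) -> bool:
    """Flag a frame whose signal differs too much from the sender's usual level."""
    return abs(frame_dbm - median(baseline_dbm)) >= threshold_db


# Hypothetical readings: the phone sits next to the capture adapter.
phone_baseline = [-30, -32, -31]
print(likely_spoofed(phone_baseline, -53))   # 22 dB weaker than usual
print(likely_spoofed(phone_baseline, -33))   # within normal variation
```

Signal strength moves around with distance and obstructions, so a real detector would want a larger baseline and probably the data rate as a second hint.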
As a side note, I did see some odd behavior on my iPhone during the deauth.  Instead of just disconnecting, the iPhone displayed a PSK entry field, even though this is an open SSID.


I didn't see the same behavior on my Windows laptop, so this may be some Apple specific behavior.  Hopefully one of the Apple guys reads this and leaves a comment (in my dreams).
My Windows 7 laptop was able to connect to the AP.  But while I was able to establish a connection, it was very short-lived.  We see the Airtight change gears and send some unicast deauth packets to both the client and AP.
Shortly after connecting, we see Omnipeek detect a wireless duration attack, which just appears to be a CTS-to-self with a very large duration field:


After this, my laptop starts probing and tries to associate again.  We see the same deauth results, the duration attack is repeated with a different NAV value, and the laptop gives up and disconnects.  The Airtight appears to say: well, if you won't listen to me, I'll just use the duration attack to reduce your ability to do anything on this WLAN until you give up.
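To show why a large duration field hurts so much, here is a small sketch.  Stations that hear the CTS set their NAV and stay quiet for the advertised time, and the field can advertise up to 32767 microseconds.  The 10 ms detection threshold and the repeat interval are my own illustration values, and I have no idea what logic Omnipeek actually uses.

```python
MAX_NAV_US = 32767   # largest NAV value the 802.11 Duration field can carry


def looks_like_duration_attack(subtype: str, duration_us: int,
                               threshold_us: int = 10_000) -> bool:
    """Flag CTS frames that reserve far more airtime than a normal exchange needs."""
    return subtype == "CTS" and duration_us >= threshold_us


def reserved_airtime_fraction(duration_us: int, repeat_interval_us: int) -> float:
    """Share of airtime silenced if the same CTS is resent every repeat_interval_us."""
    return min(1.0, duration_us / repeat_interval_us)


print(looks_like_duration_attack("CTS", MAX_NAV_US))
print(reserved_airtime_fraction(MAX_NAV_US, 100_000))   # one frame every 100 ms
```

Even a handful of these frames per second is enough to take a big bite out of the channel for any station that honors the NAV.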


I'm sure there are some more methods to the Airtight WIPS.  I know I've heard them speak on ARP poisoning as one of the tools in their toolbelt, but this post is running long as it is.  I'm very impressed by Airtight's ability to contain rogue clients and access points.  They've also done a great job making the classification of networks easy to manage.  In the next part of this series, I'm going to look at some common scenarios that I've found in my travels as a wireless engineer.  I would consider these scenarios commonplace and they are not designed as ways to "beat" WIPS, but to show off how well the Airtight system combats common rogue technologies and attacks against WLANs.

For Part 3: I'm going to round up some gear and put together some typical rogue AP type scenarios and show how the Airtight system defends and protects your clients.

Monday, February 17, 2014

Mac OS X Mavericks: DNS-Server required for default route

I ran into an interesting client behavior issue with several Apple MacBook Air laptops today at work.  I was able to solve the mystery, but thought I would post my findings as I just confirmed it in the home lab with my wife's computer.

Observed behavior:
Mac OS X clients running 10.9.1 using a DHCP address require the DNS Server to be configured in order for the default route to take effect.  Without the default route, off-network traffic receives a "no route to host" error.


Now, what the network engineers at work were mocking up was an isolated network with no DNS or internet access. As far as I can tell

Scenario:
Catalyst 3750 acting as a L3 switch and doing DHCP for local VLANs.

DHCP Configuration:
ip dhcp pool ClientTest
   network 192.168.145.0 255.255.255.0
   default-router 192.168.145.1

Vlan Configuration:
interface Vlan145
  ip address 192.168.145.1 255.255.255.0

For this scenario, I am using a Flexconnect AP with the SSID dropping off to vlan 145.  Also I am using a server upstream to ping.  I haven't had a chance to verify that this works with other vendors, but I would guess it does.   Assume that routing is working (as it is).  The production scenario was with a 4500, with local mode APs, and the results were the same.

Now for the client.  My client in this case is a 2012 MBA running 10.9.1.

Here is my ifconfig output:
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 28:37:xx:xx:xx:xx 
inet6 fe80::3 prefixlen 64 scopeid 0x4
inet 192.168.145.2 netmask 0xffffff00 broadcast 192.168.145.255
nd6 options=1<PERFORMNUD>
media: autoselect
status: active

Now let's see if we can ping the default gateway:
Valeries-MacBook-Air:~ valeriesnyder$ ping 192.168.145.1
PING 192.168.145.1 (192.168.145.1): 56 data bytes
64 bytes from 192.168.145.1: icmp_seq=0 ttl=255 time=3.185 ms
64 bytes from 192.168.145.1: icmp_seq=1 ttl=255 time=3.030 ms
64 bytes from 192.168.145.1: icmp_seq=2 ttl=255 time=5.038 ms

Success.  Now let's ping my server upstream from the lab:
Valeries-MacBook-Air:~ valeriesnyder$ ping 10.200.20.20
PING 10.200.20.20 (10.200.20.20): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
Request timeout for icmp_seq 0
ping: sendto: No route to host
Request timeout for icmp_seq 1
ping: sendto: No route to host
Request timeout for icmp_seq 2


Hmm, this should work.  I verify in the network control panel that I have the 192.168.145.1 as the default router.  There's no reason I *shouldn't* be able to ping the server.  Except that I'm not passing the DNS Server through the DHCP server.

Let's change the DHCP server configuration to the following:
ip dhcp pool Test
   network 192.168.145.0 255.255.255.0
   default-router 192.168.145.1 
   dns-server 1.1.1.1

In this example, I'm using 1.1.1.1, which is a totally bogus DNS server.  It doesn't need to work in order to get routing to work on the MBA.
Same SSID, just the DNS-Server change.

Here is the relevant ifconfig output:
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 28:37:xx:xx:xx:xx 
inet6 fe80::3 prefixlen 64 scopeid 0x4 
inet 192.168.145.2 netmask 0xffffff00 broadcast 192.168.145.255
nd6 options=1<PERFORMNUD>
media: autoselect
status: active

Now skipping straight to pinging outside the local subnet, we find that it works, just by adding the DNS-Server:
Valeries-MacBook-Air:~ valeriesnyder$ ping 10.200.20.20
PING 10.200.20.20 (10.200.20.20): 56 data bytes
64 bytes from 10.200.20.20: icmp_seq=0 ttl=124 time=8.056 ms
64 bytes from 10.200.20.20: icmp_seq=1 ttl=124 time=2.113 ms
64 bytes from 10.200.20.20: icmp_seq=2 ttl=124 time=3.217 ms


I have no idea why this client behavior occurs, or if it is intentional. I was able to replicate the behavior with a Win2k8r2 DHCP server and an IP Helper.
   *Note that Windows populates a DNS Server of 127.0.0.1 if you do not specify a DNS server.
   You *MUST* delete this option in order to replicate the behavior.

Further Testing:
Configuring the IP address of the client manually without a DNS server configured exhibits the same behavior.

Conclusion:
Mac OS X Mavericks (10.9.1) requires a DNS Server defined in order to use the configured default route.
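If you have a pile of isolated lab or test VLANs, a quick way to find the pools that would trip up these clients is to scan the switch config for DHCP pools with no dns-server line.  This is a rough sketch that assumes IOS-style text where pool sub-commands are indented under the "ip dhcp pool" line; the function name is made up.

```python
def pools_missing_dns(config: str) -> list[str]:
    """Return the names of 'ip dhcp pool' blocks that have no dns-server line."""
    pools: dict[str, bool] = {}   # pool name -> saw a dns-server line
    current = None
    for raw in config.splitlines():
        if raw.startswith("ip dhcp pool "):
            current = raw.split()[3]
            pools[current] = False
        elif raw[:1].isspace() and current is not None:
            if raw.strip().startswith("dns-server "):
                pools[current] = True
        else:
            current = None        # any other top-level line ends the pool block
    return [name for name, has_dns in pools.items() if not has_dns]


before = """ip dhcp pool ClientTest
   network 192.168.145.0 255.255.255.0
   default-router 192.168.145.1
"""
print(pools_missing_dns(before))
```

Run against the first DHCP configuration in this post it reports the ClientTest pool; against the second, with the dns-server line added, it reports nothing.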

Sunday, February 16, 2014

Airtight Networks: A look at WIPS Part 1: Over the Wire

When it comes to security, I have a personal motto: Think maliciously, act responsibly.  I really enjoy trying to manipulate clients, exploiting behavior and finding ways to prevent it in the "real world".

For protecting wireless networks in the "real world" one of the best tools is WIPS.  Word on the street is that the Airtight solution is pretty good.  The presentation by Rick Farina at Wireless Field Day 6 was fun for me.  Rick is a great presenter and brought a lot of fun and energy to a security presentation.  For more on the WFD6 presentation, check out what fellow delegate Lee Badman wrote on his personal blog: http://wirednot.wordpress.com/2014/02/02/airtight-networks-rising/

Also watch his presentation here:

This is the first of a few posts on Airtight Networks WIPS solution.  Part 1 will cover how Airtight does rogue detection, Part 2 will cover containment and OTA communication and Part 3 will cover common Rogue scenarios and look at how

*I will note that Airtight gave me a C55 AP during my visit during WFD5.  While I'm grateful to them for this, these posts are my own opinion and not influenced by their generosity.



Coming from the Cisco world, I see that Airtight takes a very different approach to identifying on-network rogues. Instead of trying to correlate Wi-Fi traffic to wired traffic by listening on the wire (Rogue Detector), scanning CAM tables on switches, or trying to connect to open access points and sending traffic towards the controller (RLDP), Airtight sends broadcast (or potentially unicast) frames on the VLANs connected to the Sensor/AP and then listens to see if those frames are ever sent over the air.

Let's look at the wired side of my Airtight C55 AP.  For example, here are the MAC addresses from my WIPS mode AP on my network:


You'll notice that there's the management mac and the rest are mac addresses created by the WIPS-mode AP, one for each vlan that we are doing WIPS on.  Here is what I have my AP configured for:

*Note, VLAN125 is present, just not in the picture.


Spanning the port on the Airtight AP, I'm able to capture some of the packets coming from the WIPS-AP.
For simplicity, I wrote a display filter to show only the data from the Airtight AP:

(eth.src > f2:91:4a:7f:00:00 and eth.src < f2:91:4a:7f:ff:ff)
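If you want to do the same range match in a script instead of in the capture tool, the comparison is just integer math on the MAC address.  A small sketch (the helper names are mine, and the sample addresses in the usage lines are made up):

```python
def mac_to_int(mac: str) -> int:
    """Turn 'f2:91:4a:7f:00:00' into a 48-bit integer so ranges can be compared."""
    return int(mac.replace(":", ""), 16)


LOW = mac_to_int("f2:91:4a:7f:00:00")
HIGH = mac_to_int("f2:91:4a:7f:ff:ff")


def from_wips_ap(src_mac: str) -> bool:
    """Same strict bounds as the display filter: both endpoints are excluded."""
    return LOW < mac_to_int(src_mac) < HIGH


print(from_wips_ap("f2:91:4a:7f:00:7d"))   # inside the range
print(from_wips_ap("00:11:22:33:44:55"))   # some other device
```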



You can see it does a lot of GARP for addresses on the VLANs I am monitoring.  I also observed it sending DHCP requests on the VLANs configured as well:


So what does this give us?  Well, by sending these L2 broadcast messages out, the hope is that a rogue (on network) AP will hear the L2 broadcast and forward this traffic out to clients over the air.  Once this happens, the WIPS wireless radio will hear the packet and be able to see that it is from itself.  This allows the Airtight system to tell that an AP is connected to the network.  From here Airtight can take action against this rogue network and the clients connecting to it.
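The correlation logic boils down to "did I hear my own marker frame come back over the air, and from which BSSID?"  Here is a conceptual sketch of that idea.  It is my own model of the behavior described above, not Airtight's code, and every MAC address, VLAN number and name in it is hypothetical.

```python
def on_network_rogues(injected: dict[str, int],
                      heard: list[tuple[str, str]]) -> dict[str, set[int]]:
    """Map each BSSID that relayed a marker frame to the wired VLANs it exposes.

    injected: marker source MAC -> VLAN the sensor sent it on (wired side)
    heard:    (bssid, source MAC) pairs the sensor radio saw over the air
    """
    rogues: dict[str, set[int]] = {}
    for bssid, src_mac in heard:
        if src_mac in injected:   # our own wired frame came back over the air
            rogues.setdefault(bssid, set()).add(injected[src_mac])
    return rogues


markers = {"f2:91:4a:7f:00:7d": 125, "f2:91:4a:7f:00:91": 145}
air = [
    ("aa:bb:cc:00:00:01", "f2:91:4a:7f:00:91"),   # AP bridging VLAN 145 to Wi-Fi
    ("aa:bb:cc:00:00:02", "00:11:22:33:44:55"),   # neighbor AP, unrelated traffic
]
print(on_network_rogues(markers, air))
```

A real system would also subtract its own authorized BSSIDs from the result before calling anything a rogue.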

What's next?  In Part 2 of this series, I'm going to look at how the Airtight WIPS prevents clients from connecting to a Rogue AP, as well as looking at some of the optional settings.  For Part 3, I'm going to try my hand at some common scenarios to see where the Airtight system works and where it doesn't.

Tuesday, January 28, 2014

Onward to Wireless Field Day 6

So it's time for Wireless Field Day 6, and I don't think I finished a quarter of the blogs I wanted to write for WFD5.  Most of you know that I have been head down studying for my CCIE-W which escaped me for the second time earlier this month.  That has eaten up a bunch of my blogging time, but no excuses, I will find more time to talk about the WFD vendors this time around.

Sunday, January 12, 2014

Rube Goldberg Network Design

The holy grail of network engineering is building a completely redundant network with no single point of failure, where outages are never seen by the end users and the network team is a happy upbeat group of individuals who never get blamed for anything.  The problem with adding redundancy is the added complexity needed.  Sadly, in the lust for more 9s of uptime you can build what I call the Rube Goldberg Network.

For those of you not familiar with what a Rube Goldberg machine is, here's the definition from Wikipedia:

"A Rube Goldberg machine, contraption, invention, device, or apparatus is a deliberately over-engineered or overdone machine that performs a very simple task in a very complex fashion..."