
Cooperative Control: Part 4 (final) March 8, 2010

Posted by Devin Akin in: Uncategorized

Fast/Secure Layer 3 Roaming

Mobility in typical IP networks is challenging because, as a user moves from subnet to subnet, their IP settings change, which usually makes IP-based sessions or applications fail.  To allow users to maintain their IP settings and network connections while roaming across subnets throughout a WLAN, Aerohive has developed the Dynamic Network Extension Protocol (DNXP).  At the time a user roams to an AP that is located in a different subnet, DNXP will dynamically establish a tunnel from the new AP back to an AP in the subnet the user roamed from.  The user’s traffic is tunneled back to its original subnet, which allows clients to preserve their IP address settings, authentication state, encryption keys, firewall sessions, and QoS enforcement settings as they roam across HiveAPs in different subnets.  This is especially important for clients using voice and video applications.

When layer 3 roaming is enabled, HiveAPs can automatically discover their layer 3 neighbors (neighboring HiveAPs on different subnets) by scanning radio channels.  If HiveAPs are within radio range of each other, are in the same hive, have layer 3 roaming enabled, and are in different IP networks, the HiveAPs will build layer 3 neighbor relationships with each other over the routed Ethernet network.  HiveAPs will then distribute tunnel and client information to their layer 3 neighbors.  This way, when the user roams across layer 3 boundaries, the tunnels can be built without delay.
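The four conditions in the paragraph above can be written as a single predicate.  The following is a minimal Python sketch, not Aerohive's implementation: the HiveAP class, its field names, and the is_l3_neighbor function are hypothetical, and radio range is passed in as a flag because it would come from channel scanning.

```python
from dataclasses import dataclass
from ipaddress import ip_interface


@dataclass(frozen=True)
class HiveAP:
    """Hypothetical record; field names are illustrative assumptions."""
    name: str
    hive: str
    mgmt_ip: str             # management address with prefix, e.g. "10.1.1.10/24"
    l3_roaming: bool = True


def is_l3_neighbor(a: HiveAP, b: HiveAP, in_radio_range: bool) -> bool:
    """Return True only when all four conditions from the text hold."""
    different_networks = (
        ip_interface(a.mgmt_ip).network != ip_interface(b.mgmt_ip).network
    )
    return (
        in_radio_range
        and a.hive == b.hive
        and a.l3_roaming
        and b.l3_roaming
        and different_networks
    )


ap1 = HiveAP("ap1", "campus", "10.1.1.10/24")
ap2 = HiveAP("ap2", "campus", "10.1.1.11/24")
ap3 = HiveAP("ap3", "campus", "10.1.2.10/24")

print(is_l3_neighbor(ap1, ap3, in_radio_range=True))   # different IP networks
print(is_l3_neighbor(ap1, ap2, in_radio_range=True))   # same IP network
```

In this sketch ap1 and ap2 share a subnet, so they would be ordinary layer 2 neighbors rather than layer 3 neighbors.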

In situations where HiveAPs cannot discover each other automatically over the air, possibly due to being on opposite sides of an RF obstacle, you can manually configure layer 3 neighbors for HiveAPs using HiveManager.

When layer 3 neighbors are discovered, either automatically or manually, HiveAPs in different subnets will exchange lists of available HiveAP portals and client and roaming cache information.  This way, if a client does roam to a new subnet, the HiveAP in the new subnet will be aware of the client and can dynamically build a tunnel back to any one of the portal HiveAPs in the previous subnet.  This allows for fast/secure layer 3 roaming.

The following diagram shows the basic steps performed by HiveAPs as clients roam within their subnet and across subnet boundaries.


Diagram 5. The Process for Fast/Secure Layer 3 Roaming

Step 1 – The client performs seamless, fast/secure layer 2 roaming within subnet A.

Step 2 – After the client successfully roams to HiveAP 2, HiveAP 2 will send an encrypted control packet over the Ethernet infrastructure to HiveAP neighbors in the neighboring subnet.  The control packet contains, as a minimum, the client’s identity, security and QoS information, SIP call state, and the client’s originating subnet.

Step 3 – Because the client’s identity and key information, including SIP call state, is proactively synchronized between neighboring HiveAPs, when the client roams to HiveAP 3, HiveAP 3 has all the information it needs to enforce policies and to tunnel permitted traffic, over a GRE tunnel, to a portal HiveAP in the client’s original subnet.  This behavior allows the client to maintain its IP address and active sessions as it roams.  HiveAP 3 then predictively forwards the wireless client’s roaming information to HiveAP 4 in anticipation of any further roaming.
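The decision a HiveAP makes when a known client associates can be sketched roughly as follows.  This is an illustrative Python sketch only: RoamingCacheEntry, its fields, and tunnel_endpoint are hypothetical names, the security keys are omitted, and the real DNXP message format is not described in this post.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Dict, List, Optional


@dataclass
class RoamingCacheEntry:
    """Hypothetical record mirroring the control-packet contents in Step 2."""
    client_mac: str
    home_subnet: str          # the client's originating subnet, e.g. "10.1.1.0/24"
    portals: List[str]        # portal HiveAPs advertised for the home subnet
    qos_profile: str
    sip_call_active: bool


def tunnel_endpoint(client_mac: str, local_subnet: str,
                    cache: Dict[str, RoamingCacheEntry]) -> Optional[str]:
    """Return the portal to tunnel to, or None when traffic is forwarded locally."""
    entry = cache.get(client_mac)
    if entry is None:
        return None           # no synchronized state: treat as a new client
    if ip_network(entry.home_subnet) == ip_network(local_subnet):
        return None           # layer 2 roam: same subnet, no tunnel needed
    return entry.portals[0]   # layer 3 roam: tunnel back to the home subnet


cache = {
    "00:11:22:33:44:55": RoamingCacheEntry(
        "00:11:22:33:44:55", "10.1.1.0/24", ["hiveap-1", "hiveap-2"],
        qos_profile="voice", sip_call_active=True,
    )
}
print(tunnel_endpoint("00:11:22:33:44:55", "10.1.2.0/24", cache))
```

Because the entry arrives before the client does, the lookup is local and the tunnel can be requested as soon as the client associates.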

The ability of a client to maintain its IP, QoS, firewall, and security settings while roaming across subnet boundaries ensures that client application sessions do not get dropped while roaming.  Based on a configurable idle time or number of packets per minute, HiveAPs can be set to disassociate these wireless clients so that they can reconnect and receive an IP address in their new subnet, which allows traffic to be forwarded locally.  If a client roams across subnet boundaries when it does not have any active sessions in progress, it can be immediately transitioned to the new subnet, eliminating the need to tunnel traffic.
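As a rough illustration of the idle-based transition described above, the check might look like the Python sketch below.  The function name, parameter names, and default thresholds are assumptions made for this example, and it assumes either configured criterion is enough to trigger the move.

```python
def should_move_to_local_subnet(idle_seconds: float,
                                packets_per_minute: float,
                                idle_limit_s: float = 60.0,
                                ppm_limit: float = 10.0) -> bool:
    """Return True when a tunneled client looks idle enough to be disassociated.

    Assumption: the client qualifies if it has been idle for at least
    idle_limit_s seconds OR its traffic rate is at or below ppm_limit.
    Both thresholds are hypothetical defaults, not product values.
    """
    return idle_seconds >= idle_limit_s or packets_per_minute <= ppm_limit


# A client on an active voice call keeps its tunnel.
print(should_move_to_local_subnet(idle_seconds=0.2, packets_per_minute=3000))

# A client that has gone quiet is moved to the new subnet.
print(should_move_to_local_subnet(idle_seconds=120, packets_per_minute=2))
```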

In summary, with HiveAPs and cooperative control, wireless clients have the ability to perform fast/secure roaming between HiveAPs within the same or between different subnets without impacting client data or voice connections.

Tunnel Load Balancing in Large Scale Layer 3 Roaming Environments

Aerohive’s layer 3 roaming feature provides unprecedented scalability by using tunnel load balancing to distribute tunnels across all portal HiveAPs within a subnet.  This leverages the distributed processing power of the wireless network to support thousands of layer 3 roaming tunnels and multiple gigabits of cross subnet throughput.  When a HiveAP in a remote subnet attempts to establish a tunnel to a HiveAP in the original subnet, in the very rare case that the HiveAP in the original subnet has high tunnel load, it can inform the HiveAP in the remote subnet to tunnel to another portal HiveAP in the subnet.  This prevents any single HiveAP from being over-utilized.
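The redirect behavior can be pictured with a short Python sketch.  The choose_portal function, the tunnel-count table, and the high_load threshold are all hypothetical; the post does not say how load is measured or which alternate portal is chosen, so this example simply picks the least-loaded one.

```python
from typing import Dict


def choose_portal(requested: str, tunnel_counts: Dict[str, int],
                  high_load: int) -> str:
    """Return the portal HiveAP that should terminate a new tunnel.

    The requested portal accepts the tunnel unless it is at or above the
    (hypothetical) high_load threshold, in which case the request is
    redirected to the portal with the fewest tunnels in the same subnet.
    """
    if tunnel_counts[requested] < high_load:
        return requested
    return min(tunnel_counts, key=tunnel_counts.get)


counts = {"portal-1": 120, "portal-2": 15, "portal-3": 40}
print(choose_portal("portal-3", counts, high_load=100))  # accepted as requested
print(choose_portal("portal-1", counts, high_load=100))  # redirected elsewhere
```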

Radio Resource Management (RRM)

To respond to changes in the RF environment, HiveAPs use Aerohive Channel Selection Protocol (ACSP).  ACSP allows HiveAPs to cooperate in order to automatically select the best channels and power settings on which to operate for optimal network performance across an entire system.  HiveAPs use ACSP to scan channels and to build tables of discovered wireless devices.  These tables, along with additional RF information such as channel utilization and retry counters, are used to identify and classify interference types and sources.  HiveAPs communicate ACSP state information with each other and use this information to select the appropriate channels and power levels for the network topology and configuration.

For each radio in access mode, ACSP will select a channel and power level to maximize coverage while minimizing interference with its neighbors.  This is accomplished by ensuring that HiveAPs use different channels than their immediate neighbors, and that they adjust their power to minimize co-channel interference with other, more distant, HiveAPs.  For radios in backhaul (mesh) mode, ACSP ensures that they use the same channel throughout the mesh, while still minimizing interference with the access links.
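The goal of giving immediate neighbors different channels resembles graph coloring.  The Python sketch below uses a simple greedy coloring to illustrate the idea; it is not the ACSP algorithm, which is distributed and also weighs interference and power.  The assign_channels function and the neighbor map are made up for this example.

```python
from typing import Dict, List, Set


def assign_channels(neighbors: Dict[str, Set[str]],
                    channels: List[int]) -> Dict[str, int]:
    """Greedy coloring: give each AP a channel unused by its immediate neighbors."""
    assignment: Dict[str, int] = {}
    # Handle the APs with the most neighbors first; break ties by name.
    for ap in sorted(neighbors, key=lambda a: (-len(neighbors[a]), a)):
        used = {assignment[n] for n in neighbors[ap] if n in assignment}
        free = [c for c in channels if c not in used]
        if free:
            assignment[ap] = free[0]
        else:
            # Every channel is already in use nearby: reuse the one that the
            # fewest immediate neighbors are on.
            assignment[ap] = min(
                channels,
                key=lambda c: sum(1 for n in neighbors[ap]
                                  if assignment.get(n) == c),
            )
    return assignment


neighbors = {
    "ap1": {"ap2", "ap3"},
    "ap2": {"ap1", "ap3"},
    "ap3": {"ap1", "ap2", "ap4"},
    "ap4": {"ap3"},
}
plan = assign_channels(neighbors, channels=[1, 6, 11])
print(plan)
```

With three non-overlapping 2.4 GHz channels and this small topology, no AP ends up on the same channel as an immediate neighbor.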

To maintain optimal performance, ACSP constantly checks the radio power settings and can automatically decrease radio power based on communication from neighboring APs to give the maximum coverage possible while minimizing interference.  This behavior is highly beneficial in a failure state or when an AP is taken offline: neighboring APs can automatically adjust their power to the optimum state, essentially compensating for the missing AP.  ACSP can also be scheduled to recalibrate the radio channels during a configurable daily time window and only when no more than a specified number of clients are associated.  This helps ensure that radio channels do not switch while the WLAN is being utilized, preventing a disruption of service for wireless clients.
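The scheduling rule can be sketched in a few lines of Python.  This example assumes the client condition means "no more than a configured number of associated clients", which fits the stated goal of not switching channels while the WLAN is in use; may_recalibrate and its parameters are hypothetical names.

```python
from datetime import time


def may_recalibrate(now: time, window_start: time, window_end: time,
                    associated_clients: int, max_clients: int) -> bool:
    """Allow a channel recalibration only inside the daily window and only
    while few enough clients are associated (assumed interpretation)."""
    if window_start <= window_end:
        in_window = window_start <= now < window_end
    else:
        # The window wraps past midnight, e.g. 23:00 to 01:00.
        in_window = now >= window_start or now < window_end
    return in_window and associated_clients <= max_clients


# 02:30 with no clients, inside a 02:00 to 04:00 window.
print(may_recalibrate(time(2, 30), time(2, 0), time(4, 0), 0, 5))

# Same time of day, but a dozen clients are still connected.
print(may_recalibrate(time(2, 30), time(2, 0), time(4, 0), 12, 5))
```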

Station Load Balancing

Often in a wireless network, many users will unknowingly be connected to the same AP, or even the same radio on the same AP, while neighboring APs are under-utilized.  This can have a significant impact on client performance and may cause the users to have an unsatisfactory experience.  It is logical, therefore, that clients be encouraged to move from the more heavily loaded APs to the lightly loaded ones.  To aid in the distribution of clients among HiveAPs in a cooperative control infrastructure, Aerohive has implemented station load balancing.

HiveAP load is determined, as a minimum, by:

1)    the overall load of the system

2)    the load in a specific area on a specific channel

3)    the voice traffic load of attached stations

4)    the total number of attached stations

5)    the signal quality of attached stations

HiveAPs can decide to offload stations from one radio to another within the same HiveAP (band steering), based on client capabilities, and/or to offload stations to a HiveAP that is better suited to handle the load in the immediate area.  Transitioning clients between radios and between APs is done without breaking the application session.

Use of admission control can prevent over-utilization by ensuring there is enough headroom for stations that roam to the HiveAP.  It also prevents overloading a single HiveAP, especially when there are other HiveAPs nearby that can better handle the load.  This is useful with VoWiFi, because it helps ensure that a HiveAP has availability to support new or roaming voice stations, and that there is enough airtime available for excellent voice quality.
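One way to picture the headroom idea is to reserve a slice of airtime that only roaming stations may use.  The Python sketch below is an assumption-laden illustration, not the product's admission algorithm: the admit function, the airtime fractions, and the reserved headroom value are all hypothetical.

```python
def admit(new_station_airtime: float, current_airtime: float,
          roaming_headroom: float, is_roaming: bool,
          capacity: float = 1.0) -> bool:
    """Decide whether a station may join this radio.

    Airtime values are fractions of the radio's capacity.  New (non-roaming)
    stations may not eat into roaming_headroom, which is kept free for
    stations that roam in from neighboring HiveAPs.
    """
    limit = capacity if is_roaming else capacity - roaming_headroom
    return current_airtime + new_station_airtime <= limit


# Radio at 75% airtime with 20% reserved for roaming stations.
print(admit(0.10, 0.75, roaming_headroom=0.20, is_roaming=False))  # new station
print(admit(0.10, 0.75, roaming_headroom=0.20, is_roaming=True))   # roaming station
```

In this sketch a new station that is refused would be expected to associate with a nearby, less loaded HiveAP, while a roaming voice station can still be accepted.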

————————

This is the last blog post on Cooperative Control.  If you want to read more on the topic, I will refer you to our website at Aerohive.com for the upcoming comprehensive whitepaper on Cooperative Control.  Future blogs will incorporate my personality.  See ya next week, and oh, btw, stay tuned for some really exciting action at the beginning of the quarter.  :)


Comments»

1. Ms MSFT - April 16, 2010

Hi Devin, although I greatly admire the ‘simple’ message, Layer 3 roaming is where this architecture does not scale.

Imagine a voice client roaming from AP1 to AP2 where these APs are on the same floors but connected to different Access Layer switches in a building and on different subnets. Traffic now has to come client – AP2 – access layer switch – down to the core – core switch – across to core switch – up to access layer switch – er I lost it here but the latency added by all this traversing is going to kill voice latency and jitter.

Don’t trivialise a user having to get a different IP address on roaming – you have never worked anywhere where they use MSFT for DHCP servers, have you? :)

OMG, a long wait and your wireless adapter whinging that ‘wireless network is lost’, systray bubbles popping like an elephant crossing a roll of bubble wrap…

2. Devin Akin - April 17, 2010

Hi Ms. MSFT,

L3 roaming is quite simple and quite fast. Setting up a tunnel between APs (which happens only in the case of L3 roaming) takes only a few milliseconds (easily fast enough to support fast/secure roaming of Wi-Fi clients). By the way, subnets are far more often divided up by floor. Additionally, HiveAPs can support hundreds of tunnels each, so in the rare case of an L3 roam, it’s no big deal really. It scales far better than a controller in fact because you’re leveraging many powerful APs rather than one powerful controller. The latency from AP to AP across access and core layer switches is never more than about 1-2 milliseconds, so that’s “in the noise” as they say in Silicon Valley. Considering that a fast/secure roam needs to happen faster than 150 ms and that typical client roam time when using a fast/secure roaming mechanism ranges from 15-50 ms, adding 1-2 ms for end-to-end Ethernet transit and 1-2 ms for L3 tunnel building is irrelevant. Getting a new IP as a client roams is very seldom a good idea because it can break application sessions. I don’t know what you meant by all of that stuff at the end of your post. Hope this helps.