Failover

When a Virtual Traffic Manager in a cluster determines that one of its peers has failed, it may take over some or all of the traffic shares that the failed system was responsible for. The traffic distribution method determines how this is done.

Traffic IP Address Transfer (Single-Hosted Mode)

Each Virtual Traffic Manager in a cluster uses its knowledge of which machines are active to determine which Traffic IP addresses it should be running. The cluster uses a fully deterministic algorithm to distribute IP addresses across the machines:

Because the algorithm is deterministic, the Virtual Traffic Managers do not need to negotiate between themselves when one of their peers fails or recovers.

The algorithm is optimized to spread the distribution of Traffic IP addresses across the active Virtual Traffic Managers in a cluster, and to minimize the number of IP address transfers if a Virtual Traffic Manager fails or recovers.
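The product's actual algorithm is not described here. As an illustration of how a deterministic assignment can avoid negotiation and limit transfers, the following minimal sketch uses rendezvous (highest-random-weight) hashing, a substitute technique chosen for the example. The machine names, IP addresses, and function names are hypothetical.

```python
import hashlib

def score(ip, machine):
    """Stable pseudo-random weight for an (IP address, machine) pair."""
    digest = hashlib.sha256(f"{ip}|{machine}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def assign(ips, active):
    """Map each Traffic IP address to the active machine with the highest score.

    Every machine that knows the same set of active peers computes the
    same mapping, so no negotiation is needed.
    """
    return {ip: max(sorted(active), key=lambda m: score(ip, m)) for ip in ips}

ips = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13"]
before = assign(ips, ["vtm1", "vtm2", "vtm3"])
after = assign(ips, ["vtm1", "vtm3"])  # vtm2 has failed

# Only addresses that vtm2 was hosting change owner.
moved = [ip for ip in ips if before[ip] != after[ip]]
assert all(before[ip] == "vtm2" for ip in moved)
print(before, after, moved, sep="\n")
```

Note that rendezvous hashing only balances addresses approximately when there are few of them. An algorithm optimized for even distribution, as described above, would need additional logic.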

When a Virtual Traffic Manager raises a Traffic IP address, it sends several ARP messages to inform adjacent network devices that the MAC address corresponding to the IP address may have changed. The Virtual Traffic Manager will send up to 10 ARP messages (tunable using the flipper!arp_count setting); the frequency of these messages is controlled by the flipper!monitor_interval setting (by default, the messages are sent at 0.5-second intervals).
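To show how the two settings interact, the sketch below computes when the announcements would go out. It assumes the first message is sent as soon as the address is raised and that the full count is sent; the function name is hypothetical, and the arguments simply mirror flipper!arp_count and flipper!monitor_interval.

```python
def arp_schedule(arp_count=10, interval=0.5):
    """Offsets in seconds, measured from raising the address, of each ARP message.

    Assumption: message 0 is sent immediately and the rest follow at a
    fixed interval.
    """
    return [round(i * interval, 3) for i in range(arp_count)]

schedule = arp_schedule()
print(schedule)
```

Under these assumptions, the default values spread the announcements over a window of 4.5 seconds.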

Note that if a Virtual Traffic Manager detects that its own network connectivity has failed, it will immediately drop its Traffic IP addresses and broadcast "I have failed" health messages to its peers. This is in anticipation of other Virtual Traffic Managers in the cluster raising the interfaces when they realize that the first Virtual Traffic Manager has failed.

Traffic IP Address Transfer (Multi-Hosted Mode)

Each Virtual Traffic Manager in the Traffic IP Group deterministically chooses whether or not it should handle each packet, based on the source IP of that packet (and optionally the source port; see Traffic Distribution).

If a Virtual Traffic Manager fails, its share of the load is spread evenly between the remaining Virtual Traffic Managers. When it recovers, it takes equal shares of the load from its peers, thus ensuring that the traffic is always evenly distributed across the working machines in the Traffic IP Group.
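The per-packet decision can be pictured with the following minimal sketch. It is not the kernel module's implementation. It assumes each member hashes the source IP address against the list of working members and accepts the packet only if it scores highest. The member names and addresses are hypothetical, and the optional source port is left out.

```python
import hashlib
from collections import Counter

def weight(source_ip, member):
    """Stable pseudo-random weight for a (source IP, member) pair."""
    digest = hashlib.sha256(f"{source_ip}|{member}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def should_handle(me, source_ip, active_members):
    """True if `me` is the member that owns packets from source_ip."""
    owner = max(sorted(active_members), key=lambda m: weight(source_ip, m))
    return owner == me

members = ["vtm1", "vtm2", "vtm3"]
sources = [f"198.51.100.{i}" for i in range(1, 255)]

# Every member sees every packet, but exactly one accepts each source.
for src in sources:
    assert sum(should_handle(m, src, members) for m in members) == 1

# After vtm2 fails, its sources are shared out among the survivors.
survivors = ["vtm1", "vtm3"]
load = Counter(
    m for src in sources for m in survivors if should_handle(m, src, survivors)
)
print(load)
```

With this scheme, sources that were already handled by a surviving member stay with that member; only the failed member's share is redistributed.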

Multi-hosted IP functionality is not included with the Virtual Traffic Manager software by default. It is available as an additional kernel module that you can download and install, and it is supported on Linux kernel versions 2.6.18 and later. See the Virtual Traffic Manager documentation on the Ivanti Web site (www.ivanti.com) for more information on supported versions.

Traffic IP Address Transfer (RHI Mode)

If a Virtual Traffic Manager's fault tolerance checks fail, it lowers the addresses used in RHI traffic IP groups and withdraws route advertisements from the network.

If the network detects that it cannot reach the designated active Virtual Traffic Manager in an RHI traffic IP group, routing decisions for the traffic IP address(es) instead use the next best available route, that is, the remaining route with the lowest metric. This might be a route to the designated passive Virtual Traffic Manager, or to a Virtual Traffic Manager hosting the same traffic IP address in another datacenter.
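The route selection described above can be sketched as follows. This is a simplified model of a router's choice, not of any particular routing protocol. The Route class, the next-hop names, and the metric values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Route:
    next_hop: str
    metric: int
    advertised: bool  # False once the traffic manager withdraws the route

def best_route(routes: List[Route]) -> Optional[Route]:
    """Pick the advertised route with the lowest metric, if any remain."""
    candidates = [r for r in routes if r.advertised]
    return min(candidates, key=lambda r: r.metric) if candidates else None

routes = [
    Route("vtm-active", 10, True),
    Route("vtm-passive", 20, True),
    Route("vtm-remote-dc", 100, True),
]
assert best_route(routes).next_hop == "vtm-active"

# The active traffic manager fails its checks and withdraws its route.
routes[0].advertised = False
assert best_route(routes).next_hop == "vtm-passive"
```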

Recovering from Failure

When a failed Virtual Traffic Manager recovers, its share of traffic is transferred back to it.

Each time traffic shares are transferred from one Virtual Traffic Manager to another, any connections currently in that share are dropped. This is inevitable when a transfer occurs because a Virtual Traffic Manager fails, but may not be desirable when a Virtual Traffic Manager recovers.

In this case, you can disable the flipper!autofailback setting on the System > Fault Tolerance page of the Admin UI. When this is disabled, a Virtual Traffic Manager does not take any traffic when it recovers. Instead, the user interface displays a message indicating that the Virtual Traffic Manager has recovered and can take back its IP addresses.

When you want to reactivate the Virtual Traffic Manager, go to the Diagnose page and select the Reactivate this Virtual Traffic Manager link.

Alternatively, you can edit each of your traffic IP groups and set the recovered Virtual Traffic Manager to passive. Once you set it to passive in all of the groups, it will not need to take any shares of traffic; it will then reactivate automatically, clearing the error state. In addition, no traffic will be lost because no traffic shares will have been transferred.
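The recovery options in this section can be summarized as a small decision function. This is only a reading aid under stated assumptions: the function name, the return strings, and the dictionary that maps each traffic IP group to the recovered machine's role are all hypothetical, and the boolean mirrors the flipper!autofailback setting.

```python
def recovery_action(autofailback, group_roles):
    """Describe what a recovered traffic manager does next.

    group_roles maps each traffic IP group name to "active" or "passive"
    for the recovered machine (hypothetical representation).
    """
    if group_roles and all(role == "passive" for role in group_roles.values()):
        # Passive everywhere: nothing to transfer, so no connections drop.
        return "reactivate automatically (no shares to take back)"
    if autofailback:
        # Shares move back, dropping the connections in those shares.
        return "take back traffic shares"
    return "wait for manual reactivation"

print(recovery_action(True, {"web": "active"}))
print(recovery_action(False, {"web": "active", "mail": "passive"}))
print(recovery_action(False, {"web": "passive", "mail": "passive"}))
```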