Troubleshooting Tips

Diagnosing faults in a complex traffic-managed cluster requires a systematic approach. If a problem occurs, work through the following areas in turn to locate and diagnose its cause.

Generating Test Requests

You can test your system using the normal client software that users will employ, such as a Web browser or email client. Tools like the “Live HTTP Headers” extension for Mozilla browsers are very useful when inspecting the request and response flow between the client and the Traffic Manager.

You can also use a network snooping tool such as Wireshark (http://www.wireshark.org/) to record network traffic and assemble request and response sessions.

httpclient

The “httpclient” program, included with the Traffic Manager distribution, can be used to issue HTTP requests to particular machines. You can find it at $ZEUSHOME/admin/bin/httpclient.

Use the following syntax:

httpclient --hostheader=<website_name> http://<trafficIP>/

You can also run a basic initial test of the service using a telnet client:

$ telnet www.mysite.com 80
Trying 62.254.209.66...
Connected to 62.254.209.66.
GET / HTTP/1.1<RETURN>
Host: www.mysite.com<RETURN>
<RETURN>
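The telnet session above can also be scripted. The following is a minimal sketch in Python, not part of the Traffic Manager distribution, that sends the same request over a plain TCP socket and returns the first line of the response. The function names are illustrative, and the "Connection: close" header is an addition that is not in the telnet session; it asks the server to close the connection once it has replied.

```python
import socket


def build_request(host_header, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host_header}",
        "Connection: close",  # addition: not typed in the telnet session
        "",                   # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(address, host_header, port=80, path="/", timeout=5.0):
    """Send the request to address:port and return the response status line."""
    with socket.create_connection((address, port), timeout=timeout) as sock:
        sock.sendall(build_request(host_header, path))
        reader = sock.makefile("rb")
        return reader.readline().decode("iso-8859-1").strip()


# Example call (values taken from the telnet session above):
#   fetch_status_line("62.254.209.66", "www.mysite.com")
```

Passing the address and the Host header separately lets you direct the request at a specific traffic IP address while still naming the website you want.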

The openssl toolkit (http://www.openssl.org/) includes a Telnet-like client, openssl s_client, that connects using the SSL protocol. Use it to test SSL-related problems.
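If you prefer to script an SSL check, the sketch below uses Python's standard ssl module to open a connection and report the negotiated protocol, the cipher, and the certificate subject. It is an illustration only: the function names and the default port of 443 are assumptions, and the default context verifies the server certificate, so an untrusted or mismatched certificate raises an exception rather than returning a result.

```python
import socket
import ssl


def flatten_subject(cert):
    """Flatten the nested 'subject' tuples from getpeercert() into a dict."""
    return {key: value for rdn in cert.get("subject", ()) for key, value in rdn}


def describe_tls(address, server_name, port=443, timeout=5.0):
    """Connect to address:port, complete the handshake, and summarise it."""
    context = ssl.create_default_context()  # verifies the certificate chain
    with socket.create_connection((address, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=server_name) as tls:
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "subject": flatten_subject(tls.getpeercert()),
            }


# Example call (hypothetical address; substitute one of your traffic IPs):
#   describe_tls("192.0.2.10", "www.mysite.com")
```

A verification error from this sketch usually points at the certificate or the name presented, whereas a connection error points at the network or the listening port.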

zeusbench

The “zeusbench” program is a useful benchmarking tool that can be used to send large numbers of HTTP requests for load and performance testing purposes. You can find it at $ZEUSHOME/admin/bin/zeusbench.

There are many command line options. You can run a simple load test using the following command:

zeusbench -t 30 -c 100 -k http://host/url

Run zeusbench -h for a full list of command line options.

Checking Automatic Back-End Failover

To check the Traffic Manager’s automatic failover of back ends, you need at least two back-end servers configured; otherwise there are no machines for the Traffic Manager to fall back on. You can test failover either by pulling out the network cable on one of the back-end machines or by manually stopping the service running on that back end. Then verify that nothing is listening on the service port on that back end.
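One way to confirm that nothing is listening on the back-end port is to attempt a TCP connection to it. The sketch below is a minimal example with an illustrative function name; it treats a refused connection and a timeout (as you would see with the cable pulled) in the same way, reporting both as not listening.

```python
import socket


def is_listening(address, port, timeout=2.0):
    """Return True if a TCP connection to address:port is accepted."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example call (hypothetical back-end address and port):
#   is_listening("192.0.2.21", 80)
```

Run the check from the Traffic Manager machine where possible, so that it follows the same network path as the Traffic Manager's own connections to the back end.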

If you only have one back-end server, you could run two instances of the required service on different ports on the same machine, and then manually stop one instance.

Try to use your selected service through the Traffic Manager. For SMTP, send an email through the server farm; for HTTP, request a Web page. The request should succeed as long as at least one back-end server is available to fulfill it.

Check the Diagnose > Event Log page for notification that a back-end machine has failed.

Checking Automatic Front-End Failover

If you have two or more Traffic Managers, you can check that automatic front-end failover is working properly. Set up a traffic IP group spanning your Traffic Managers, and a service using this traffic IP group, such as a virtual server managing Web content on port 80. Check that you can request Web pages from this service successfully on each of the traffic IP addresses in the group. You can do this by entering the IP address rather than the DNS name in your browser.

Click Services > Traffic IP Groups and then click Unfold All to view details about your traffic IP groups. This shows you which machine has raised which IP address; note that if you have more machines than traffic IP addresses in the group, some machines will be on standby and not actively handling traffic.

Now pick a machine that has raised one of the traffic IP addresses, and pull out all the network cables on that machine. The IP address will be raised by another machine in the group; try browsing to the traffic IP address again and check that you can still receive content. You may need to refresh the page in your browser to ensure that it is not using cached content.
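Rather than refreshing a browser by hand, you can poll the traffic IP address while you pull the cables and count how many requests fail during the handover. The following sketch is an assumption-laden illustration, not a Traffic Manager tool: the function names, the sample count, and the one-second interval are arbitrary, and any response other than 200 is counted as a failure. It sends a Cache-Control: no-cache header so that intermediate caches are less likely to mask an outage.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(address, host_header, timeout=2.0):
    """Return the HTTP status code from address, or None if unreachable."""
    request = urllib.request.Request(
        f"http://{address}/",
        headers={"Host": host_header, "Cache-Control": "no-cache"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (OSError, http.client.HTTPException):
        return None      # no usable answer: refused, timed out, or reset


def watch(address, host_header, samples=30, interval=1.0,
          probe_fn=probe, sleep_fn=time.sleep):
    """Probe the address repeatedly and return the list of results."""
    results = []
    for _ in range(samples):
        results.append(probe_fn(address, host_header))
        sleep_fn(interval)
    return results


def count_failures(results):
    """Count the samples that did not return HTTP 200."""
    return sum(1 for status in results if status != 200)


# Example call (hypothetical traffic IP; start it, then pull the cables):
#   print(count_failures(watch("192.0.2.10", "www.mysite.com")))
```

A short run of failures followed by successful responses suggests that another machine in the group raised the address; failures that never clear suggest that it did not.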

Configuring Fault-Tolerance describes how the Traffic Manager’s fault tolerance works, the tests that it conducts, and the decisions that it makes.