Saturday, September 3, 2011

Load Sharing with Round Robin DNS

One of the most common implementations of DNS is the Berkeley Internet Name Domain (BIND). BIND allows address records (A records) to be duplicated for a specific host name, each with a different IP address. The name server then rotates through the addresses of any name that has multiple A records; this technique is known as DNS round robin.
As an example, your company has three web servers. Their real names and IP addresses are as follows: 

www.yourcompany.com    128.1.1.1
www.yourcompany.com    128.1.1.2
www.yourcompany.com    128.1.1.3
You want to set up the servers so that DNS requests by clients (in this case, web server access) are round robin rotated. This is accomplished by placing multiple A records in the authoritative name server files.
For the above example, we want all clients to access our site by using www.yourcompany.com, but we want these requests to be shared between our three servers using DNS round robin. To do so, we need to place the following A records in the name server file:
www.yourcompany.com.    IN    A    128.1.1.1
www.yourcompany.com.    IN    A    128.1.1.2
www.yourcompany.com.    IN    A    128.1.1.3
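Note the '.' after the name www.yourcompany.com on each A record. This is mandatory: the trailing dot marks the name as fully qualified, and without it BIND appends the zone's origin to the name.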
A Time To Live (TTL) field is often added to the A records. The TTL value is the maximum time that the information may be cached and treated as reliable. Setting the TTL to a fairly small value, e.g. 60 seconds, maximizes the effectiveness of the distribution. A lower value may be specified, but this causes more DNS traffic in updates, which improves the load sharing on the web servers at the expense of increasing the load on the name server.
www.yourcompany.com.    60    IN    A    128.1.1.1
www.yourcompany.com.    60    IN    A    128.1.1.2
www.yourcompany.com.    60    IN    A    128.1.1.3
When a DNS request for an IP address is received, BIND returns one of the IP addresses and makes a note of it. The next request will then return the next IP address in the file and so on until the last one, after which BIND returns to the first address again.
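The rotation described above can be illustrated with a minimal sketch. This is not BIND's actual implementation; the class name RoundRobinRecordSet is hypothetical and only models the behaviour of handing out the next address on each request.

```python
from collections import deque


class RoundRobinRecordSet:
    """Hypothetical model of one name that has several A records."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        """Return the address at the front, then rotate the list by one."""
        address = self._addresses[0]
        self._addresses.rotate(-1)  # move the front address to the back
        return address


records = RoundRobinRecordSet(["128.1.1.1", "128.1.1.2", "128.1.1.3"])
answers = [records.resolve() for _ in range(4)]
print(answers)
# ['128.1.1.1', '128.1.1.2', '128.1.1.3', '128.1.1.1']
```

The fourth answer wraps around to the first address, which is the "returns to the first address again" step.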

Advantages of DNS Round Robin:

An obvious advantage of using DNS Round Robin for load distribution is that it is seamless to the user. It is also simple and cost-effective to implement: BIND is included as standard software on most systems, or else may be obtained at no or low cost. For this reason, it is very effective for small to medium size businesses or organisations and it is extremely popular among ISPs, e-commerce sites, universities and other cost sensitive sites.

Disadvantages of DNS Round Robin:

It should be noted that DNS round robin is a load sharing mechanism rather than a load balancing mechanism. It does not gauge the load on each server in any way; it simply shares requests among multiple hosts. As a result, one or more of the hosts in the pool will tend to get more activity than the others. DNS Round Robin should be quite effective up to about 10 servers per virtual cluster.
DNS has no way of detecting physical failure, e.g. when the hard disk on server2 (www2) fails. As requests come in for www.yourcompany.com, the name server will continue to direct one out of every three requests to www2, and those requests will fail. Effectively, 33% of all requests to www.yourcompany.com are now connecting to a black hole. This is an improvement over having just one web server and losing all requests to a hardware failure, but only to a certain degree.
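The one-in-three figure can be checked with a small simulation. This is only a sketch under the assumption of a strict rotation with no caching; the function name failed_fraction is hypothetical.

```python
from itertools import cycle, islice


def failed_fraction(servers, down, requests):
    """Fraction of requests sent to a failed server under strict rotation."""
    rotation = islice(cycle(servers), requests)
    failures = sum(1 for server in rotation if server in down)
    return failures / requests


pool = ["128.1.1.1", "128.1.1.2", "128.1.1.3"]
fraction = failed_fraction(pool, down={"128.1.1.2"}, requests=300)
print(f"{fraction:.0%} of requests reach the failed server")
```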
It is very common for a computer to request data from the same host several times in any given session. It is also common for many hosts from the same site to make requests to the same servers. Usually, these requests are made to a local name server that in turn asks another name server to resolve the domain. In order to minimize network traffic, these responses are cached, possibly on each name server along the way. This helps the network to respond quickly with domain name resolution, but it can also defeat the round-robin load distribution.
However, this caching problem may be resolved by using the TTL value. The protocol requires that local and intermediate DNS servers drop these entries from their cache when the TTL runs out. Most systems, such as Solaris, NT and Linux (and others), support BIND 4.9 and later versions. DNS round robin supports pools of servers for any application, not just web servers. Pools of web, email, ftp, database and other servers can all be set up to load share using DNS.
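To see how caching and the TTL interact, here is a minimal sketch of a caching resolver. The CachingResolver class and its upstream callable are hypothetical; time is passed in explicitly so the behaviour is easy to follow.

```python
from itertools import cycle


class CachingResolver:
    """Hypothetical local resolver that caches one answer until its TTL expires."""

    def __init__(self, upstream, ttl_seconds):
        self._upstream = upstream      # callable returning the next address
        self._ttl = ttl_seconds
        self._cached = None
        self._expires_at = 0.0
        self.upstream_queries = 0

    def resolve(self, now):
        """Return the cached address, refreshing it once the TTL has run out."""
        if self._cached is None or now >= self._expires_at:
            self._cached = self._upstream()
            self._expires_at = now + self._ttl
            self.upstream_queries += 1
        return self._cached


addresses = cycle(["128.1.1.1", "128.1.1.2", "128.1.1.3"])
resolver = CachingResolver(lambda: next(addresses), ttl_seconds=60)
answers = [resolver.resolve(now=t) for t in (0, 30, 59, 60, 90, 120)]
print(answers, resolver.upstream_queries)
```

With a 60 second TTL, every client behind this resolver gets the same address for a full minute, so only three of the six lookups reach the authoritative server. A shorter TTL spreads clients more evenly but increases the number of upstream queries.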

Dynamic Load Balancing DNS: dlbDNS

Another approach to solving the problem of network traffic congestion is to add a dynamic load balancing feature to the existing DNS.
Distributing a request across servers can be implemented by monitoring the servers regularly and directing the request dynamically to the 'best server'. This way of dynamically directing a request across multiple servers based on the server load is called dynamic load balancing. This feature can be added to the pre-existing DNS, as it already plays a prominent role in resolving client requests and can be configured to direct client requests across multiple servers in an effort to avoid network traffic congestion. Here, the 'best server' refers to the server with the best rating, based on a rating algorithm.
There are several load balancing models available. The model implemented by dlbDNS is an internal monitoring system that monitors the performance of the servers and provides feedback to the DNS. It is easy to maintain and administer. Other advantages include closeness to the source of addressable problems and no security hazards.
Just as there are several load balancing models available, there are also a number of load balancing algorithms available. The rating algorithm implemented in dlbDNS, which determines the best server, is based on the number of users and the load average on each host. The algorithm is reasonable, as it favors the host(s) with the smallest number of unique logins and lower load averages.
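A rating of this kind might be sketched as follows. The actual dlbDNS formula is not given here, so the weighted sum, the weights, and the host names below are all assumptions made for illustration; in this sketch a lower score is treated as a better rating.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    name: str
    unique_logins: int
    load_average: float


def rating(stats, login_weight=1.0, load_weight=10.0):
    """Assumed scoring: weighted sum of logins and load; lower is better."""
    return login_weight * stats.unique_logins + load_weight * stats.load_average


# Hypothetical hosts and measurements.
hosts = [
    HostStats("server1", unique_logins=12, load_average=0.80),
    HostStats("server2", unique_logins=3, load_average=0.25),
    HostStats("server3", unique_logins=7, load_average=1.50),
]
best = min(hosts, key=rating)
print(best.name, rating(best))
```

Here server2 wins because it has both the fewest unique logins and the lowest load average.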
The Server-Side Algorithm is added to the pre-existing DNS feature. During configuration, a new attribute called DNAME is added to distinguish the hosts taking part in dynamic load balancing. If the service requested is of type DNAME, do the following:
  1. Determine the set of participating servers for this service
  2. Request ratings from all participating servers by sending concurrent connectionless (UDP) requests to each server
  3. Using the ratings returned, determine the best server
  4. Handle error conditions such as:
    • Server is too busy to return the rating within the time frame
    • The rating returned by the server gets lost on its way back to the dlbDNS
    • All servers have the same rating
    • A server is down
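The selection step and its error conditions can be sketched as below. This is not the dlbDNS code: the function choose_best_server is hypothetical, the UDP exchange is left out, and a missing or None rating stands for a reply that was late, lost, or never sent because the server is down. It also assumes that a lower rating is better and that ties and total silence are resolved by a random pick, neither of which the text above specifies.

```python
import random


def choose_best_server(participants, ratings, rng=random):
    """Pick a server from the ratings that arrived within the time frame.

    ratings maps server name -> rating (lower is better, by assumption).
    """
    answered = {s: ratings[s] for s in participants if ratings.get(s) is not None}
    if not answered:
        # No server replied in time: fall back to any participant.
        return rng.choice(list(participants))
    best = min(answered.values())
    # Several servers may share the best rating; choose among them at random.
    tied = [s for s in participants if answered.get(s) == best]
    return rng.choice(tied)


participants = ["server1", "server2", "server3"]
chosen = choose_best_server(participants, {"server1": 5.0, "server2": None, "server3": 2.0})
print(chosen)
```

In the example, server2 did not answer and is simply ignored, so server3 is chosen.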
The Rating Daemon Algorithm runs on each server taking part in dynamic load balancing and is as follows:
  1. Receive request for rating from dlbDNS and respond by returning the host rating
  2. Calculate the host rating once every minute rather than calculating it at the time of request, as a quick response time is the most important factor
  3. Ensure the host rating is updated every minute, independent of the dlbDNS request
  4. Handle error conditions such as dlbDNS closing the UDP sockets without waiting for host response
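The idea of answering from a rating that is refreshed once a minute can be sketched like this. The RatingDaemon class, its method names, and the injected clock are hypothetical; the network side is omitted, so handle_request only returns the bytes that would be sent back to dlbDNS.

```python
class RatingDaemon:
    """Hypothetical daemon core: caches the host rating, refreshed every minute."""

    REFRESH_SECONDS = 60

    def __init__(self, compute_rating, clock):
        self._compute_rating = compute_rating  # callable returning the rating
        self._clock = clock                    # callable returning current time
        self._rating = compute_rating()
        self._computed_at = clock()

    def tick(self):
        """Call periodically; recompute only when a minute has passed."""
        now = self._clock()
        if now - self._computed_at >= self.REFRESH_SECONDS:
            self._rating = self._compute_rating()
            self._computed_at = now

    def handle_request(self):
        """Answer immediately from the cached rating, without recomputing."""
        return str(self._rating).encode("ascii")


# Usage with a fake clock and two made-up rating samples.
fake_now = [0.0]
samples = iter([4.0, 9.5])
daemon = RatingDaemon(lambda: next(samples), clock=lambda: fake_now[0])
first = daemon.handle_request()
fake_now[0] = 30.0
daemon.tick()
second = daemon.handle_request()
fake_now[0] = 60.0
daemon.tick()
third = daemon.handle_request()
print(first, second, third)
```

A real daemon would presumably send these bytes over a UDP socket and tolerate send errors when dlbDNS has already closed its socket; that part is left out of the sketch.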
Implementation of the dlbDNS provides efficient utilization of system resources and ensures that facilities newly added to the existing network will be utilized. Since DNS is used, applications such as FTP and TELNET will also utilize dlbDNS.
See figure: dlbDNS
This dlbDNS algorithm was proposed by the Computer Science department of Wichita State University in response to an uneven distribution of load across its servers, which was causing major problems for the department. It should be noted that there is further work to be done on this approach. In particular, the rating algorithm is incomplete. An algorithm that takes into account the number of processors and CPU and memory utilization would make the rating more accurate. Also, a more extensible design is needed, as Linux servers are the only servers that can participate in the dynamic load balancing scheme.

Conclusion:

The most common implementation of DNS load sharing is most certainly the DNS Round Robin technique. It is very feasible for small to middling companies and organizations. However, it is not sufficient for larger organizations and they should probably take into account other load balancing sc