A High-Availability Architecture for the Dynamic Domain Name System
The Domain Name System (DNS) provides a mapping between host names and Internet Protocol (IP) addresses. Hosts that are configured using the Dynamic Host Configuration Protocol (DHCP) can have their assigned IP addresses updated in a Dynamic DNS (DDNS). DNS and DDNS are critical components of the Internet. Most applications use host names rather than IP addresses, allowing the underlying operating system (OS) to translate these host names to IP addresses on behalf of the application. When the DDNS service is unavailable, applications that use DNS cannot contact the hosts served by that DDNS server. Unfortunately, the current DDNS implementation cannot continue to operate after the failure of a master DNS server. Although a slave DNS server can continue to translate names to addresses, new IP addresses and changes to existing IP addresses cannot be recorded. As a result, hosts with new or changed addresses cannot be reached by name.
A new architecture is presented that eliminates this single point of failure. In this design, instead of storing resource records in a flat text file, all name servers connect to a Lightweight Directory Access Protocol (LDAP) directory to store and retrieve resource records. These directory servers replicate all resource records among one another using a multi-master replication mechanism. In the event of an outage, the DHCP servers can add records to any of the functioning DNS servers.
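The failover behavior described above can be illustrated with a minimal sketch. It assumes that a DHCP server holds an ordered list of name servers and simply tries the next one when an update fails. The abstract does not say how a functioning server is chosen, so this selection policy, the function name `update_with_failover`, and the `send_update` callable are all hypothetical stand-ins rather than part of the implementation described here.

```python
from typing import Callable, Iterable, Optional


def update_with_failover(
    servers: Iterable[str],
    send_update: Callable[[str], None],
) -> Optional[str]:
    """Send a dynamic update to the first name server that accepts it.

    `send_update` is a hypothetical callable that delivers the update to one
    server and raises OSError (for example, a refused connection or a
    timeout) when that server is unreachable.
    Returns the server that accepted the update, or None if all failed.
    """
    for server in servers:
        try:
            send_update(server)
            return server  # any functioning server is acceptable
        except OSError:
            continue  # this server is down; try the next one
    return None
```

Because every name server reads from and writes to the replicated directory, an update accepted by any one of them is expected to become visible through the others once replication completes.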
In this scheme, all DNS servers share an anycast address that is advertised through the Border Gateway Protocol (BGP). This allows any of the DNS servers to answer queries sent to a single IP address. The DNS clients always use the same IP address to send queries. The routing system removes routes to non-functional name servers and delivers each request to the closest (according to network metrics) available DNS server.
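The effect of anycast routing on a query can be modeled with a short sketch. This is a deliberately simplified model, not BGP's actual best-path selection: it assumes each candidate route carries a single numeric metric and a flag indicating whether the route is still advertised. The `Route` type and the `select_anycast_target` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    server: str       # name server reachable through this route
    metric: int       # simplified network distance; lower is closer
    advertised: bool  # False once the route has been withdrawn


def select_anycast_target(routes: List[Route]) -> Optional[Route]:
    """Return the closest route that is still advertised, or None."""
    candidates = [r for r in routes if r.advertised]
    return min(candidates, key=lambda r: r.metric, default=None)
```

In this model, withdrawing the route to a failed server is enough to redirect queries to the next closest server, and the client continues to use the same destination address throughout.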
This thesis also describes a concrete implementation of this system that was created to demonstrate the viability of this solution. A reference implementation was built in a laboratory to represent an Internet Service Provider (ISP) with three identical regions. This implementation used Quagga as the BGP routing software, running on a set of core routers and on each of the DNS servers. The Berkeley Internet Name Domain (BIND) software was used as the DNS server implementation. The BIND Simplified Database Backend (SDB) interface was used to allow the DNS server to store and retrieve resource records in an LDAP directory. The Fedora Directory Server was used as a multi-master LDAP directory. DHCP service was provided by the Internet Systems Consortium's (ISC) DHCP server.
The objectives for the design were high availability, scalability, and consistency. These properties were analyzed using three metrics: downtime during failover, replication overhead, and replication latency. The downtime during failover was less than one second. The precision of this metric was limited by the synchronization provided by the Network Time Protocol (NTP) implementation used in the laboratory. The network traffic for three-way replication was shown to be only 3.5 times that of the non-replicated case. The replication latency was also shown to be less than one second. The results show the viability of this approach and indicate that this solution should be usable over a wide area network, serving a large number of clients.