Recursives in the Wild: Engineering Authoritative …johnh/PAPERS/Mueller17a.pdfRecursives in the...

Recursives in the WildEngineering Authoritative DNS Servers (extended)

ISI-TR-720 1 June 2017

Moritz Muumlller12 Giovane C M Moura1

Ricardo de O Schmidt12 John Heidemann3

1 SIDN Labs 2 University of Twente 3 USCInformation Sciences Institute

ABSTRACTIn Internet Domain Name System (DNS) services operateauthoritative name servers that individuals query throughrecursive resolvers Operators strive to provide reliabilityby operating multiple name servers (NS) each on a separateIP address and by using IP anycast to allow NSes to pro-vide service from many physical locations To meet theirgoals of minimizing latency and balancing load across NSesand anycast operators need to know how recursive resolversselect an NS and how that interacts with their NS deploy-ments Prior work has shown some recursives search for lowlatency while others pick an NS at random or round robinbut did not examine how prevalent each choice was Thispaper provides the first analysis of how recursives select be-tween name servers in the wild and from that we provideguidance to name server operators to reach their goals Weconclude that all NSes need to be equally strong and there-fore we recommend to deploy IP anycast at every single au-thoritative

1 INTRODUCTIONThe Internet Domain Name System (DNS) puts the

ldquodotrdquo in com providing a global naming service forweb e-mail and all Internet services [14] DNS is adistributed service with a hierarchical namespace whereeach component (the root org and wikipediaorg) isserved by authoritative servers Often multiple author-itative servers provide the same component with dif-ferent goals including reducing latency providing faulttolerance and to help mitigate denial-of-service (DoS)attacks To use the DNS a userrsquos browser or operat-ing system employs a stub resolver to place a query Itthen talks to a recursive resolver that walks throughthe DNS hierarchy possibly using prior cached results

DNS can be a noticeable part of web latency [25] sousers web browser authors and DNS service providersstrive to reduce latency through DNS server replica-tion [15] and IP anycast [18 13] DNS authoritativeservers can be replicated with multiple servers identi-fied to recursive resolvers through NS records [15] each

pointing at one or multiple IP addresses (eg one IPv4and another IPv6 address)

Today most large DNS services replicate name serversto many physical locations with IP anycast Impor-tant services such as the DNS Root are very widelyreplicated with 13 different name servers (each a rootletter) all with multiple sites totaling over 500 any-cast sites [21] with distinct IP addresses in distinctASes [11] Also top-level domains (TLDs) run at leasttwo different authoritatives with two distinct IP ad-dresses for example nl (the Netherlands) has 8 sep-arate servers of which 5 are unicast and 3 use anycastacross more than 80 sites

A DNS operator is faced with a challenge how manyservers should they operate How many should use any-cast and how many sites should each anycast serviceemploy Recent work has suggested few IP anycastinstances can provide good global latency [22] for onename server but is one anycast service enough to im-prove latency

Answering these questions when engineering a DNSservice is challenging because little is known about therecursive resolvers that make requests There are manydifferent implementations of recursive resolvers and howthey select between authoritative servers is not definedEarly work [29] shows that the behavior across differentrecursive resolvers is diverse with some making inten-tional choices and others alternating across all NSes fora service While this result has been reconfirmed toour knowledge there is no public study on how thisinteracts with different design choices of name serverdeployments nor how it should influence its design

The first contribution of this paper is to re-evaluatehow recursive resolvers select authoritative name servers(sect4) but in the wild with the goal of learning from theaggregated behavior in order to better engineer author-itative deployments We answer this question with acontrolled study of an experimental worldwide nameserver deployment using Amazon Web Services (AWS)coupled with global data from the Root DNS serversand the (nl) TLD (sect5) Our key results are thatmost recursives check all authoritives over time (sect41)

1

about half of recurisves show a preference based on la-tency (sect42) and that these preferences are most signfi-cant when authoratives have large differences in latency(sect43)

Based on thse findings our second contribution is tosuggest how DNS operators can optimize a DNS serviceto reduce latency for diverse clients (sect7) In order toachieve optimal performance we conclude that all NSesneed to be equally strong and therefore recommend touse anycast at every single one of them

2 BACKGROUND OPERATING DNSFigure 1 shows the relationship between the main el-

ements involved in the DNS ecosystem Each authorita-tive server (AT) is identified by a domain name storedin an NS record which can be reachable by one or mul-tiple IP addresses Operators often mix unicast andanycast strategies across their authoritatives and thereis no consensus on how many NSes is the best For ex-ample most of TLDs within the root zone use 4 NSesbut some use up to 13 and each of these NSes can bereplicated and globally distributed using IP anycast andload balancers [16]1

Recursive resolvers (R in Figure 1) answer to DNSqueries originated at clients (CL in Figure 1) by ei-ther finding it in their local cache or sending queriesto authoritative servers to obtain the final answer to bereturned to the client [9] Besides the local cache withinformation on DNS records many recursives also keepan infrastructure cache with information on the latency(Round Trip Time RTT) of each queried authorita-tive server grouped by IP address The infrastructurecache is used to make informed choices among multi-ple authoritatives for a given zone For example Un-bound [26] implements a smoothed RTT (SRTT) andBIND [2] an SRTT with a decaying factor Some im-plementations of recursive resolvers particularly thosefor embedded devices like home routers may omit theinfrastructure cache

3 MEASUREMENTS AND DATASETSNext we describe how we measure the way recursives

choose authoritative servers using both active measure-ments and passive observations of production DNS atthe root and nl Our work focuses on measurementsfrom the field so that we capture the actual range ofcurrent behavior and to evaluate all currently used re-cursives (Our work therefore complements prior stud-ies that examine specific implementations in testbeds [29]Their work are definite about why a recursive makes achoice but not on how many such recursives are in use)

31 Measurement Design1Figure 9 shows the number of NS records of TLDs in theroot zone

ID locations (airport code) VPs2A GRU (Sao Paulo BR) NRT (Tokyo JP) 87022B DUB (Dublin IE) FRA (Frankfurt DE) 86852C FRA SYD (Sydney AU) 86583A GRU NRT SYD 86843B DUB FRA IAD (Washington US) 86934A GRU NRT SYD DUB 87024B DUB FRA IAD SFO (San Francisco US) 8689

Table 1 Combinations of authoritatives we de-ploy and the number of VPs they see

To observe recursive-to-authoritative mapping on theInternet we deploy authoritative servers for a test do-main (ourtestdomainnl) in 7 different datacenters allreachable by a distinct IPv4 unicast address Sites arehosted by Amazon using NSD 417 running on UbuntuLinux on AWS EC2 virtual machines

We then resolve names serviced by this test domainfrom about 9700 vantage points (VPs) all the RIPEAtlas probes that are active when we take each measure-ment [20] Each VP is a DNS client (a CL in Figure 1)that queries for a DNS TXT resource record over IPv4(In principle we could also query over IPv6 we focus onIPv4 for now because 69 of VPs lack IPv6 support)Each VP uses whatever their local configured recursiveis Those recursives are determined by the individualor ISP hosting each VP

To determine which authoritative NS the VP reacheswe configure each with a different response for the sameDNS TXT resource We choose TXT records becausea DNS CHAOS query [27] would be responded by therecursive and not by the authoritative The resultingdataset from the processing described is publicly avail-able at our website [17] and at Ripe Atlas [19]

Cold caches DNS responses are extensively cached [5]We insure that caches do not interfere with our measure-ments in several ways our authoritatives are used onlyfor our test domain we set the time-to-live (TTL) [14] ofthe TXT record to 5 seconds use unique labels for eachquery and run separate measurements with a break ofat least 4 hours giving recursives ample time to dropthe IP addresses of the authoritatives from their infras-tructure caches

Authoritatives location We deploy 7 combina-tions of authoritative servers located around the globe(Table 1) We identify each by the number of sites (2to 4) and a variation (A B or C) The combinationsvary geographic proximity with the authoritatives closeto each other (2B 3B 4B) or farther apart (2A 2C 3A4A) For each combination we determine the recursive-to-authoritative mapping with RIPE Atlas queryingthe TXT record of the domain name every 2 minutesfor 1 hour

2

AT1 AT2 AT3 AT4

unicast anycast

AT authoritative R recursiveMI middlebox CL client

R1 R2 R3 Rn

MI1 MI2 MI3

CL1 CL2 CL3

Figure 1 TLD Setup Recur-sives Middleboxes and Clients

0

5

10

15

20

25

30

2A (9

60

)

2B (9

55

)

2C (8

24

)

3A (9

13

)

3B (8

48

)

4A (9

47

)

4B (7

52

)

o

f q

ue

rie

s a

fte

r firs

t q

ue

ry

authoritative combination

Figure 2 Queries to probe allauthoritatives after the firstquery (Boxes show quartilesand whiskers 1090ile)

0

02

04

06

08

1

2A 2B 2C 3A 3B 4A 4B

qu

erie

s s

ha

re

authoritatives combination

0

100

200

300

400

FRA DUB IAD SFO GRU NRT SYD

RT

T (

ms)

location

Figure 3 Query distribution(top) and median RTT (bot-tom) for combinations of au-thoritatives

Measurement challenges and considerations Weconsider several challenges that might interfere withmeasurement

Atlas probes might be configured via DHCP to usemultiple recursives and therefore in our analysis weconsider unique combinations of probe ID and recursiveIP as a single VP (or client in Figure 1)

Middleboxes (load balancers DNS forwarders) be-tween VPs and recursives (MI in Figure 1) or recursiveswhich use anycast may interfere causing queries to goto different recursives or to warm up a cache We cannoteliminate their effects but we confirm that they haveonly minor effects on our data by comparing client andauthoritative data Specifically we compare Figure 4 tothe same plot using data collected at the authoritativesfor all recursives that send at least five queries duringone measurement (see Figure 8)

The two graphs are basically equivalent suggestingthat middleboxes do not significantly distort what wesee at the clients

To take RIPE Atlasrsquo large number of probes in Eu-rope [4 22 3] into account we group probes by con-tinent and analyze them individually in most researchquestions

We focus on UDP DNS for IPv4 only not TCP orIPv6 IPv6 is future work we cannot study IPv6 at thistime because the majority of our VPs only have IPv4connectivity [3] We focus on DNS over UDP because itis by far the dominant transport protocol today (morethan 97 of connections for nl and most DNS roots)

32 Root DNS and TLD dataWe use passive measurements from the DITL (Day

In The Life of the Internet) [7] collected on 2017-04-12 at 10 Root DNS servers (B G and L are missing)We look at the one-hour sample from 1200 to 1300

(UTC) since that duration sufficient to evaluate ourclaims By default most implementations of recursiveresolvers do not treat Root DNS servers different fromother authoritatives

We also use traffic collected at 4 authoritative serversof the ccTLD nl [28] For consistency we use capturesfrom the same time slot as of DITL data We use thesedata sets to validate our observations from sect31 Notethat we cannot enforce a cold cache condition in thesepassive measurements and RTT data is not available

4 ANALYSIS OF RECURSIVE BEHAVIOR

41 Do recursives query all authoritativesOur first question is to understand how many recur-

sive resolvers query all available authoritative serversFigure 2 shows how many queries after the very firstone it takes for a recursive to probe all available au-thoritatives (2 to 4 depending on the configuration fromTable 1)

The percentage of recursives that query all availableauthoritatives is given in the x-axis labels of Figure 2Most recursives query all authoritatives (75 to 96)and with two authoritatives (2A 2B 2C) half the re-cursives probe the second authoritative already on theirsecond query but with four authoritatives (4A 4B) ittakes a median of up to 7 queries for the recursives toquery them all Operators can conclude that all theirauthoritatives are visible to most recursives

42 How are queries distributed per authori-tative over time

Since most recursives query all available authoritativeservers relatively quickly we next look at how queriesare spread over multiple authoritatives and if this isaffected by RTT Here our analysis starts once each

3

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60Recursives (x100)

SYDFRA

0 10 0 2 01

Figure 4 Recursive queries distribution for authoritative combinations 2A (top) 2B (center) and 2C(bottom) Solid and dotted horizontal lines mark VPs with weak and strong preference towards anauthoritative

recursive reaches a hot-cache condition by querying allauthoritatives at least once

Figure 3 compares the fraction of queries (bottom)received by each authoritative with the median RTT(top) from the recursives to that authoritative We seethat authoritatives with lower RTTs are often favoredeg FRA has the lowest latency (51ms) and always seesmost queries

When running multiple authoritative servers the op-erator should expect an uneven distribution of queriesamong them Servers to which clients see shorter RTTwill likely receive most queries

Our findings in this section and in sect41 confirm thoseof previous work by Yu et al [29] in which authors showthat 3 out of 6 recursive implementations are stronglybased on RTT However unlike the previous work ourconclusions are drawn from real-world observations in-stead of experimental setup and predictions based onalgorithms

43 How do recursives distribute queriesWe now look at how individual recursives in the wild

distribute their queries across multiple options of au-thoritatives

Figure 4 shows the individual preferences of recur-sives (VPsmdashgrouped by continent) when having thechoice between two authoritatives The x-axis of Fig-ure 4 displays all recursives and the y-axis gives thefraction of queries every recursive sends to each author-itative Table 2 summarizes these results

In order to quantify how many recursives are actu-ally RTT based we consider only VPs that experience

a difference in median RTT of at least 50ms betweenthe authoritatives Based on our observations we definetwo thresholds for recursives preference (i) a weak pref-erence if the recursive sends at least 60 of its queriesto one authoritative (solid lines in Figure 4) and (ii)a strong preference if at least 90 of queries go to oneauthoritative (dotted lines in Figure 4)

We see that 61 of recursives in 2A (top) 59 in2B (center) and 69 in 2C have at least a weak pref-erence and 10 12 and 37 have a strong prefer-ence in 2A 2B and 2C respectively (We show in Fig-ure 10 that recursives with a weak preference develop astronger preference the longer they query the authori-tatives See Appendix C for more)

The distribution of queries per authoritative is in-versely proportional to the median RTT to each recur-sive The bottom plot of Figure 4 clearly shows thispoint where there is a strong bias for VPs in Europe(EU) VPs largely prefer FRA (Frankfurt) over SYD(Sydney) and the opposite for VPs in Oceania (OC)SYD over FRA (We have generated Figure 4 using datacollected at Ripe Atlas probes (CL in Figure 1) Weproduce Figure 8 by analyzing pcap data collected atthe authoritatives (AT in Figure 1) for the same mea-surements See Appendix A for more)

By contrast when given a choice between two roughlyequidistant authoritatives there is a more even splitWe see a roughly even split both when the recursives arenear with Europe going to Frankfurt and Dublin (con-figuration 2B EU to FRA and DUB) or far where theygo to Brazil and Japan (configuration 2A EU to GRUand NRT) Some VPs still have a preference we assume

4

config 2A 2B 2C

cont- NRT GRU FRA DUB FRA SYD

ient RTT RTT RTT RTT RTT RTT

AF 39 467 61 393 57 200 43 204 85 200 15 513

AS 70 130 30 353 53 241 47 261 54 200 46 193

EU 37 310 63 248 65 39 35 53 83 39 17 355

NA 46 190 54 173 41 162 59 152 66 149 33 237

OC 74 201 26 363 46 346 54 335 22 370 78 48

SA 27 364 73 102 49 259 51 259 70 258 30 399

(AF Africa ASAsia EU Europe OC Oceania

NA North America SA South America)

Table 2 Query distribution and median RTT forVPs grouped by continent and three differentcombinations of authoritatives (Table 1)

these represent VPs in Ireland or Germany Thus DNSoperators can expect that the majority of recursives willsend most queries to the fastest responding authorita-tive However a significant share of recursives (in caseof 2B up to 41) also send up to 40 of their queriesto the slower responding authoritative

To expand on this result Figure 5 compares the me-dian RTT between VPs that go to a given site and thefraction of queries they send to that site again groupedby continent Differences between the two points foreach continent indicate a spread in preference (differ-ences in queries on the y axis) or RTT (differences inthe x axis) We show the results for 2B because in thissetup both authoritatives are located rather close toeach other such that the VPs should see a similar RTTfor both of them We see that recursives in Europe thatprefer Frankfurt do so because of lower latency (EU VPsthat prefer FRA have 139ms lower latency than DUB)In contrast recursives in Asia distribute queries nearlyequally in spite of a similar difference in latency (ASVPs see 203ms difference) We conclude that prefer-ences based on RTT decrease when authoritatives arefar away (when they have large median RTT) As aconsequence DNS operators who operate two authori-tatives close to each other can expect a roughly equaldistribution from recursives further away and a prefer-ence from recursives closer by

44 How does query frequency influence se-lection

Many recursive resolvers track the latency to author-itatives (sect2) but how long they keep this informationvaries By default BIND [2] caches latency for 10 min-utes and Unbound caches it for about 15 minutes [26]In this section we measure the influence of frequencyof queries in the selection of authoritatives by the re-cursives To do that we repeat the measurement forconfiguration 2C

but instead of a 2-minute interval between querieswe probe every 5 10 15 and 30 minutes We choose

0

02

04

06

08

1

0 50 100 150 200 250 300 350

fra

ctio

n o

f q

ue

rie

s

RTT (ms)

DUBFRA

EU NA AFAS

SAOC

Figure 5 RTT sensitivity of 2B

0

02

04

06

08

1

2 5 10 15 20 30

fra

ctio

n o

f q

ue

rie

s

query interval (minutes)

AFASEUNAOCSA

Figure 6 Fraction of queries to FRA (remaindergo to SYD configuration 2C) as query intervalvaries from 2 to 30 minutes

2C because in this setup we observe the strongest pref-erence for one of the two recursives

We show these results in Figure 6 We see that pref-erences for authoritatives are stronger when probing isvery frequent but persist with less frequent queries par-ticularly at 2 minute intervals Beyond 10 minutes thepreferences are fairly stable but surprisingly continueThis result suggests that recursive preference often per-sist beyond the nominal 10 or 15 minute timeout inBIND and Unbound and therefore also recursives thatquery only occasionally the name servers of an operatorcan still benefit from a once learned preference

5 RECURSIVE BEHAVIOR TOWARDS AU-THORITATIVES IN PRODUCTION

After we have analyzed recursive behavior in a mea-surement setup (sect4) we now want to validate the resultsby looking at DNS traffic of real-life deployments of theroot zone and the ccTLD nl

Root We use DITL-2017 [7] traffic from 10 out of13 root letters (B G and L were missing at the pointof our analysis) to analyze queries to the root servers(root letters) Figure 7 (top) shows the distribution ofqueries of recursives that sent at least 250 queries tothe root servers in one hour For each VP the top colorband represents the letter it queries most with the nextband its second preferred letter etc

While we find that almost all recursives tend to ex-plore all authoritatives (sect41) many recursives (about20) send queries to only one letter The remaindertend to query many letters (60 query at least 6) butonly 2 query all 10 authoritatives One reason thisanalysis of root traffic differs from our experiment is

5

that here we cannot ldquoclearrdquo the client caches and mostrecursives have prior queries to root letters

The nl TLD the picture slightly changes for queriesto a country-code TLD In the bottom plot of Figure 7we plot the distribution of queries to 4 out of 8 nl

authoritatives The majority of recursives query all theauthoritatives which confirms our observations from ourtest deployment Here the number of recursives thatquery only authoritatives is also smaller than at theroot servers

We conclude that recursive behavior at the root andat a TLD is comparable with our testbed except thata much larger fraction of resolvers have a strong prefer-ence for a particular root letter

6 RELATED WORKTo the best of our knowledge this is the first extensive

study that investigates how authoritative server load isaffected by the choices recursives resolvers make

The study by Yu et al [29] considers the closely re-lated question of how different recursives choose author-itatives Their approach is to evaluate different imple-mentations of recursive resolvers in a controlled envi-ronment and they find that half of the implementationschoose the authority with lowest latency while the oth-ers choose randomly (although perhaps biased by la-tency) Our study complements theirs by looking atwhat happens in practice in effect weighing their find-ings by the diverse set of software and latencies seenacross the 9000 vantage points and by all users of theroots and nl

Kuhrer et al [12] evaluates millions of general openrecursives resolvers They consider open recursive re-sponse authenticity and integrity distribution of devicetypes and their potential role in DNS attacks Al-though similar to our work they focus on external iden-tification and attacks not ldquoregularrdquo recursive use

Also close to our work Ager et al [1] examine recur-sive resolution at 50 ISPs and Google Public DNS andOpenDNS Our study considers many more recursives(more than 9000 locations in RIPE Atlas) and we fo-cus on the role those recursives have in designing anauthoritative server system

Schomp et al [23] consider the client-side of recursiveresolvers Unlike our work they do not discuss impli-cations for DNS operators

Finally other studies such as Castro et al [6] have ex-amined DNS traffic at the roots They often use DITLdata (as we do) but typical focus on client performanceand balance of traffic across roots rather than the de-sign of a specific server infrastructure

7 RECOMMENDATIONS AND CONCLU-SIONS

Our main contribution is the analysis of how recur-sives choose authoritatives in the wild and how that caninfluence the design of authoritative server systems Wepresent the following recommendations for DNS providers

Primary recommendation when optimizing userlatency worst-case latency will be limited by the leastanycast authoritative The implication is that if someauthoritatives in a server system are anycast all shouldbe Because we have shown that most recursives will al-ways send some queries to all authoritatives of a serviceeven if authoritatives employ large anycast networks forlow latency recursives will still send some queries to uni-cast sites sometimes thus some queries will see higherlatency This cost is reduced for clients who prefer-entially choose a nearby site but not eliminated (Ofcourse deployments of anycast to some authoratativeshelps some queries but not the tail)

SIDN operates nl and for us this principle suggestsadjusting our architecture We currently have 5 unicastauthoritatives in the Netherlands and three authori-tatives that are anycast with sites around the worldAlthough the anycast authoritatives can offer lower la-tency to users from North America 23 of incomingqueries to the unicast name servers in the Netherlandsare from the US [24] seeing worse latency than theymight otherwise

Other Considerations Other reasons motivate mul-tiple authoritatives per service or large use of anycastAnycast is important to mitigate DDoS attacks [16] Inaddition standard practices recommend multiple au-thoritatives in different locations for fault tolerance [8]

However for latency prior work has shown that rel-atively few well-peered anycast sites can provide goodglobal latency [22] We add to this advice on that all au-thoritatives have to provide low latency to reduce over-all service latency to users of most recursives

Conclusion In this paper we have shown the di-verse server selection strategies of recursives in the wildWhile many select authoritatives preferentially to re-duce latency some queries usually go to all authorita-tives The main implication of these findings is thatall name servers in a DNS service for a zone need tobe consistently provisioned (with reasonable anycast)to provide consistent low latency to users

AcknowledgmentsWe would like to thank Marco Davids for his support in thisresearch This research has been partially supported by measure-ments obtained from RIPE Atlas an open measurements plat-form operated by RIPE NCC as well as by measurement datamade available by DNS-OARC

Moritz Muller Giovane C M Moura and Ricardo de O Schmidtdeveloped this work as part of the SAND project(httpwwwsand-projectnl)

John Heidemannrsquos work in this paper is partially sponsoredby the Department of Homeland Security (DHS) Science andTechnology Directorate HSARPA Cyber Security Division viaBAA 11-01-RIKA and Air Force Research Laboratory Informa-

6

Figure 7 Distribution of queries of recursives with at least 250 queries across 10 out of 13 Rootletters (top) and across 4 out of 8 name servers of nl (bottom)

tion Directorate (agreement FA8750-12-2-0344) and via contractnumber HHSP233201600010C The US Government is autho-rized to make reprints for governmental purposes notwithstand-ing any copyright The views contained herein are those of theauthors and do not necessarily represent those of NSF DHS orthe US Government

8 REFERENCES

[1] B Ager W Muhlbauer G Smaragdakis andS Uhlig Comparing dns resolvers in the wild InProceedings of the 10th ACM SIGCOMMconference on Internet Measurement pages 15ndash21ACM Sept 2010

[2] C Almond Address database dump (ADB) -understanding the fields and what they representhttpskbiscorgarticleAA-014630Ad

dress-database-dump-ADB-understanding-the

-fields-and-what-they-representhtml 2017[3] V Bajpai S Eravuchira J Schonwalder

R Kisteleki and E Aben Vantage PointSelection for IPv6 Measurements Benefits andLimitations of RIPE Atlas Tags In IFIPIEEEInternational Symposium on Integrated NetworkManagement (IM 2017) Lisbon Portugal May2017

[4] V Bajpai S J Eravuchira and J SchonwalderLessons learned from using the RIPE Atlasplatform for measurement research SIGCOMMComput Commun Rev 45(3)35ndash42 July 2015

[5] T Callahan M Allman and M Rabinovich Onmodern DNS behavior and properties ACMSIGCOMM Computer Communication Review43(3)7ndash15 July 2013

[6] S Castro D Wessels M Fomenkov andK Claffy A Day at the Root of the InternetACM Computer Communication Review38(5)41ndash46 Apr 2008

[7] DNS OARC DITL Traces and Analysis httpswwwdns-oarcnetoarcdataditl2017Feb 2017

[8] R Elz R Bush S Bradner and M PattonSelection and Operation of Secondary DNSServers RFC 2182 (Best Current Practice) July1997

[9] P Hoffman A Sullivan and K Fujiwara DNSTerminology RFC 7719 (Informational) Dec2015

[10] Internet Assigned Numbers Authority (IANA)Root Fileshttpswwwianaorgdomainsrootfiles2017

[11] Internet Assigned Numbers Authority (IANA)Technical requirements for authoritative nameservers httpswwwianaorghelpnameserver-requirements 2017

[12] M Kuhrer T Hupperich J Bushart C Rossowand T Holz Going wild Large-scale classificationof open dns resolvers In Proceedings of the 2015ACM Conference on Internet MeasurementConference pages 355ndash368 ACM Oct 2015

[13] D McPherson D Oran D Thaler andE Osterweil Architectural Considerations of IPAnycast RFC 7094 (Informational) Jan 2014

[14] P Mockapetris Domain names - concepts andfacilities RFC 1034 (INTERNET STANDARD)Nov 1987

[15] P Mockapetris Domain names - implementationand specification RFC 1035 Nov 1987

[16] G C M Moura R de O SchmidtJ Heidemann W B de Vries M Muller L Weiand C Hesselman Anycast vs DDoS Evaluatingthe November 2015 Root DNS Event InProceedings of the 2016 ACM Conference on

7

Internet Measurement Conference pages 255ndash270Oct 2016

[17] M Muller G C M Moura R de O Schmidtand J Heidemann Recursives in the wilddatasets httpswwwsimpleweborgwikiindexphpTracesRecursives_in_the_Wild

_Engineering_Authoritative_DNS_Servers andhttpsantisiedudatasetsdns May2017

[18] C Partridge T Mendez and W Milliken HostAnycasting Service RFC 1546 (Informational)Nov 1993

[19] RIPE NCC RIPE Atlas measurement idshttpsatlasripenetmeasurementsID2017 ID is the experiment ID 2A 7951948 2B7953390 2C 7967380 3A 7961003 3B 79541224A 7966930 4B 7960323 2C-5min 83218462C-10min 8323303 2C-15min 83249632C-20min 8329423 2C-15min 8335072

[20] RIPE NCC Staff RIPE Atlas A Global InternetMeasurement Network Internet Protocol Journal(IPJ) 18(3)2ndash26 Sep 2015

[21] Root Server Operators Root DNS Feb 2017httproot-serversorg

[22] R d O Schmidt J Heidemann and J HKuipers Anycast latency How many sites areenough In Proceedings of the Passive and ActiveMeasurement Workshop pages 188ndash200 SydneyAustralia Mar 2017 Springer

[23] K Schomp T Callahan M Rabinovich andM Allman On measuring the client-side dnsinfrastructure In Proceedings of the 2015 ACMConference on Internet Measurement Conferencepages 77ndash90 ACM Oct 2013

[24] SIDN Labs nl stats and data Mar 2017httpstatssidnlabsnlnetworkgeo

[25] A Singla B Chandrasekaran P Godfrey andB Maggs The internet at the speed of light InProceedings of the 13th ACM Workshop on HotTopics in Networks pages 1ndash7 ACM Oct 2014

[26] W Wijngaards Unbound Timeout Informationhttpsunboundnetdocumentationinfo_t

imeouthtml Nov 2010[27] S Woolf and D Conrad Requirements for a

mechanism identifying a name server instanceRFC 4892 Internet Request For Comments June2007

[28] M Wullink G C Moura M Muller andC Hesselman Entrada A high-performancenetwork traffic data streaming warehouse InNetwork Operations and Management Symposium(NOMS) 2016 IEEEIFIP pages 913ndash918 IEEEApr 2016

[29] Y Yu D Wessels M Larson and L ZhangAuthority Server Selection in DNS Caching

Resolvers SIGCOMM Computer CommunicationReview 42(2)80ndash86 Mar 2012

APPENDIXA CLIENT SELECTION FROM THE VIEW

OF AUTHORITATIVE SERVERSThe question of how resolvers choose authoritatives

in the wild can be measured from the clients (CL inFigure 1) and from the authoritative servers themselves(AT in that figure) Each method provides a slightlydifferent perspective In sect43 we looked at data fromthe clientrsquos perspetvie to determine exactly which ATresponded to each CL This method does not makeclear which Rn was used in the process However thisclient perspective represents theldquouser experiencerdquomdashhowusers or applications will observe in practice the effectsof choices of resolvers

In this appendix we add data from the authoriativeservers This perspective allows us to observe which Rn

chose which AT but does not allow to determine whichCL issued each query since they may be behind MI andeven using multiple resolvers This authoritative-sideevaluation allows us to check our results from the clientperspective hold with a different direction of analysis

To do that we evaluate the same measurements col-lected at each authoritative (available at [17]) and pro-duce per recursive resolver IP address (Rn) a distribu-tion of queries across all authoritatives The results isshown in Figure 8 We show only data for clients thatsend at least five queries during one measurement

This authoritative-side analysis shows that our client-side results are not biased based on the location oftheir observation Regardless of where it is measured(from the authoritative or clientrsquos point of view) mostresolvers will choose authoritatives in the wild similarlyeventually selecting all possible authoritatives

B NUMBER OF AUTHORITATIVES PERTLD

The goal of our paper is to consider how DNS providerscan configure their DNS services using replication withIP anycast (multiple sites per authoritiative server) andmultiple NS records (multiple authoritative servers)

However there is no consensus on of an optimal num-ber of ATs for a TLD or DNS provider For examplethe Root DNS employs 13 authoritatives (or root serverletters) with more than 500 anycast sites The com

zone has a similar scope The nl zone on the otherhand uses 8 authoritatives some of which are anycastand some unicast

We next briefly look across all top-level domains (TLDs)to see how many authoritatives they use For this studywe analyze the root zone file [10] on 2017-05-04 count-ing the number of authoritative records (NS) for each

8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02

Figure 8 Recursive queries distribution for authoritative combinations 2A (top) 2B (center) and 2C(bottom) Measured at the authoritative and including only recursives that sent at least 5 queries

0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s

number of authoritatives

Figure 9 Number of authoritatives of TLDs inthe root zone (2017-05-04)

TLD Figure 9 shows the distribution We can see thatthe majority of TLDs employ 4 NS records in regardlessof using anycast or unicast

We also intent as future work to investigate the choiceof multiple records in more details as the deploymentof IP anycast on these records

C PREFERENCE OF RECURSIVES OVERTIME

15 30 45 60measurement period (minutes)

00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es

2A-weak2A-strong2B-weak2B-strong2C-weak2C-strong

Figure 10 Fraction of recursives that have aweak (solid line) or a strong (dotted line) de-pending on the length of the measurement

We show in sect41 that recursives query all authorita-tives fast and in sect42 that they develop a preferencetowards faster responding authoritatives In this ap-pendix we analyze if this preference changes over time

Figure 9 shows the fraction of recursives for 2A ndash 2Cthat have a weak and strong preference with solid anddotted lines for measurements over a period of 15 3045 and 60 minutes In general recursives with a weakpreference develop a stronger preference over time es-pecially after 30 minutes Therefore operators can ex-pect that the longer recursives with a preference sendqueries the stronger they will prefer one of their au-thoritatives

9

about half of recurisves show a preference based on la-tency (sect42) and that these preferences are most signfi-cant when authoratives have large differences in latency(sect43)

Based on thse findings our second contribution is tosuggest how DNS operators can optimize a DNS serviceto reduce latency for diverse clients (sect7) In order toachieve optimal performance we conclude that all NSesneed to be equally strong and therefore recommend touse anycast at every single one of them

2 BACKGROUND OPERATING DNSFigure 1 shows the relationship between the main el-

ements involved in the DNS ecosystem Each authorita-tive server (AT) is identified by a domain name storedin an NS record which can be reachable by one or mul-tiple IP addresses Operators often mix unicast andanycast strategies across their authoritatives and thereis no consensus on how many NSes is the best For ex-ample most of TLDs within the root zone use 4 NSesbut some use up to 13 and each of these NSes can bereplicated and globally distributed using IP anycast andload balancers [16]1

Recursive resolvers (R in Figure 1) answer to DNSqueries originated at clients (CL in Figure 1) by ei-ther finding it in their local cache or sending queriesto authoritative servers to obtain the final answer to bereturned to the client [9] Besides the local cache withinformation on DNS records many recursives also keepan infrastructure cache with information on the latency(Round Trip Time RTT) of each queried authorita-tive server grouped by IP address The infrastructurecache is used to make informed choices among multi-ple authoritatives for a given zone For example Un-bound [26] implements a smoothed RTT (SRTT) andBIND [2] an SRTT with a decaying factor Some im-plementations of recursive resolvers particularly thosefor embedded devices like home routers may omit theinfrastructure cache

3 MEASUREMENTS AND DATASETSNext we describe how we measure the way recursives

choose authoritative servers using both active measure-ments and passive observations of production DNS atthe root and nl Our work focuses on measurementsfrom the field so that we capture the actual range ofcurrent behavior and to evaluate all currently used re-cursives (Our work therefore complements prior stud-ies that examine specific implementations in testbeds [29]Their work are definite about why a recursive makes achoice but not on how many such recursives are in use)

31 Measurement Design1Figure 9 shows the number of NS records of TLDs in theroot zone

ID locations (airport code) VPs2A GRU (Sao Paulo BR) NRT (Tokyo JP) 87022B DUB (Dublin IE) FRA (Frankfurt DE) 86852C FRA SYD (Sydney AU) 86583A GRU NRT SYD 86843B DUB FRA IAD (Washington US) 86934A GRU NRT SYD DUB 87024B DUB FRA IAD SFO (San Francisco US) 8689

Table 1 Combinations of authoritatives we de-ploy and the number of VPs they see

To observe recursive-to-authoritative mapping on theInternet we deploy authoritative servers for a test do-main (ourtestdomainnl) in 7 different datacenters allreachable by a distinct IPv4 unicast address Sites arehosted by Amazon using NSD 417 running on UbuntuLinux on AWS EC2 virtual machines

We then resolve names serviced by this test domainfrom about 9700 vantage points (VPs) all the RIPEAtlas probes that are active when we take each measure-ment [20] Each VP is a DNS client (a CL in Figure 1)that queries for a DNS TXT resource record over IPv4(In principle we could also query over IPv6 we focus onIPv4 for now because 69 of VPs lack IPv6 support)Each VP uses whatever their local configured recursiveis Those recursives are determined by the individualor ISP hosting each VP

To determine which authoritative NS the VP reacheswe configure each with a different response for the sameDNS TXT resource We choose TXT records becausea DNS CHAOS query [27] would be responded by therecursive and not by the authoritative The resultingdataset from the processing described is publicly avail-able at our website [17] and at Ripe Atlas [19]

Cold caches DNS responses are extensively cached [5]We insure that caches do not interfere with our measure-ments in several ways our authoritatives are used onlyfor our test domain we set the time-to-live (TTL) [14] ofthe TXT record to 5 seconds use unique labels for eachquery and run separate measurements with a break ofat least 4 hours giving recursives ample time to dropthe IP addresses of the authoritatives from their infras-tructure caches

Authoritatives location We deploy 7 combina-tions of authoritative servers located around the globe(Table 1) We identify each by the number of sites (2to 4) and a variation (A B or C) The combinationsvary geographic proximity with the authoritatives closeto each other (2B 3B 4B) or farther apart (2A 2C 3A4A) For each combination we determine the recursive-to-authoritative mapping with RIPE Atlas queryingthe TXT record of the domain name every 2 minutesfor 1 hour

2

AT1 AT2 AT3 AT4

unicast anycast


R1 R2 R3 Rn

MI1 MI2 MI3

CL1 CL2 CL3


0

5

10

15

20

25

30

2A (9

60

)

2B (9

55

)

2C (8

24

)

3A (9

13

)

3B (8

48

)

4A (9

47

)

4B (7

52

)

o

f q

ue

rie

s a

fte

r firs

t q

ue

ry



0

02

04

06

08

1

2A 2B 2C 3A 3B 4A 4B

qu

erie

s s

ha

re


0

100

200

300

400


RT

T (

ms)

location


















3

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60Recursives (x100)

SYDFRA

0 10 0 2 01














4

config 2A 2B 2C



AF 39 467 61 393 57 200 43 204 85 200 15 513

AS 70 130 30 353 53 241 47 261 54 200 46 193

EU 37 310 63 248 65 39 35 53 83 39 17 355

NA 46 190 54 173 41 162 59 152 66 149 33 237

OC 74 201 26 363 46 346 54 335 22 370 78 48

SA 27 364 73 102 49 259 51 259 70 258 30 399









0

02

04

06

08

1

0 50 100 150 200 250 300 350

fra

ctio

n o

f q

ue

rie

s

RTT (ms)

DUBFRA

EU NA AFAS

SAOC


0

02

04

06

08

1

2 5 10 15 20 30

fra

ctio

n o

f q

ue

rie

s


AFASEUNAOCSA








5






















6



8 REFERENCES



















7






























8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9

AT1 AT2 AT3 AT4

unicast anycast


R1 R2 R3 Rn

MI1 MI2 MI3

CL1 CL2 CL3


0

5

10

15

20

25

30

2A (9

60

)

2B (9

55

)

2C (8

24

)

3A (9

13

)

3B (8

48

)

4A (9

47

)

4B (7

52

)

o

f q

ue

rie

s a

fte

r firs

t q

ue

ry



0

02

04

06

08

1

2A 2B 2C 3A 3B 4A 4B

qu

erie

s s

ha

re


0

100

200

300

400


RT

T (

ms)

location


















3

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60Recursives (x100)

SYDFRA

0 10 0 2 01














4

config 2A 2B 2C



AF 39 467 61 393 57 200 43 204 85 200 15 513

AS 70 130 30 353 53 241 47 261 54 200 46 193

EU 37 310 63 248 65 39 35 53 83 39 17 355

NA 46 190 54 173 41 162 59 152 66 149 33 237

OC 74 201 26 363 46 346 54 335 22 370 78 48

SA 27 364 73 102 49 259 51 259 70 258 30 399









0

02

04

06

08

1

0 50 100 150 200 250 300 350

fra

ctio

n o

f q

ue

rie

s

RTT (ms)

DUBFRA

EU NA AFAS

SAOC


0

02

04

06

08

1

2 5 10 15 20 30

fra

ctio

n o

f q

ue

rie

s


AFASEUNAOCSA








5






















6



8 REFERENCES



















7






























8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60Recursives (x100)

SYDFRA

0 10 0 2 01














4

config 2A 2B 2C



AF 39 467 61 393 57 200 43 204 85 200 15 513

AS 70 130 30 353 53 241 47 261 54 200 46 193

EU 37 310 63 248 65 39 35 53 83 39 17 355

NA 46 190 54 173 41 162 59 152 66 149 33 237

OC 74 201 26 363 46 346 54 335 22 370 78 48

SA 27 364 73 102 49 259 51 259 70 258 30 399









0

02

04

06

08

1

0 50 100 150 200 250 300 350

fra

ctio

n o

f q

ue

rie

s

RTT (ms)

DUBFRA

EU NA AFAS

SAOC


0

02

04

06

08

1

2 5 10 15 20 30

fra

ctio

n o

f q

ue

rie

s


AFASEUNAOCSA








5






















6



8 REFERENCES



















7






























8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9

config 2A 2B 2C



AF 39 467 61 393 57 200 43 204 85 200 15 513

AS 70 130 30 353 53 241 47 261 54 200 46 193

EU 37 310 63 248 65 39 35 53 83 39 17 355

NA 46 190 54 173 41 162 59 152 66 149 33 237

OC 74 201 26 363 46 346 54 335 22 370 78 48

SA 27 364 73 102 49 259 51 259 70 258 30 399









0

02

04

06

08

1

0 50 100 150 200 250 300 350

fra

ctio

n o

f q

ue

rie

s

RTT (ms)

DUBFRA

EU NA AFAS

SAOC


0

02

04

06

08

1

2 5 10 15 20 30

fra

ctio

n o

f q

ue

rie

s


AFASEUNAOCSA








5






















6



8 REFERENCES



















7






























8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9






















6



8 REFERENCES



















7






























8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9



8 REFERENCES



















7






























8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9






























8

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9

00

05

102A

AF AS EU

NRTGRU

NA OC SA

00

05

10

fract

ion

of q

uerie

s2B DUB

FRA

0 200

05

10

2C

0 5 0 10 20 30 40 50 60 70Recursives (x100)

SYDFRA

0 10 0 2 02


0

200

400

600

800

1 2 3 4 5 6 7 8 9 10 11 12

num

ber

of T

LD

s







00

02

04

06

08

10fra

ctio

n of

recu

rsiv

es





9

Recursives in the Wild: Engineering Authoritative …johnh/PAPERS/Mueller17a.pdfRecursives in the...

Documents

Transcript of Recursives in the Wild: Engineering Authoritative …johnh/PAPERS/Mueller17a.pdfRecursives in the...