As part of our role as the .nz registry, we manage 4 of the 7 .nz authoritative name servers. We receive hundreds of millions of DNS queries from the Internet every day. Many of them come from recursive DNS servers, commonly located at Internet Service Providers (ISPs) or institutional networks, which resolve domain names on behalf of their users. We store this traffic in a Hadoop cluster to support research and operations.
In this post we are going to explore certain patterns in the traffic from some popular resolvers including local ISPs such as Spark, Vodafone, Inspire, Actrix, as well as popular public DNS resolvers such as Google DNS and OpenDNS.
We also receive traffic for domains other than .nz: some for zones we are authoritative for, such as .as, .gg, .je and .aq, and some for domains we are not authoritative for, which reach us because of misconfigurations. By extracting the .nz traffic, we focus our analysis exclusively on the .nz domain space.
We analysed traffic for 1st March 2016.
DNS at a Glance
A query sent to a DNS name server is a message which specifies a target domain name, query type, and query class asking for matching resource records. The query type indicates the requested record type, for example:
- A - IPv4 address
- MX - the mail exchanger of a domain
- AAAA - IPv6 address
- TXT - text records commonly used with Sender Policy Framework (SPF)
- SPF - special data used in the Sender Policy Framework protocol as an anti-spam technique (OBSOLETE - use TXT)
- DS - the record used to identify the DNSSEC signing key of a delegated zone
- DNSKEY - the key record used in DNSSEC
- NS - the authoritative name servers used for delegating a DNS zone
- SRV - records defining service location of servers for specified services
- SOA - authoritative information about a DNS zone, including the primary name server, the email of the domain administrator, the domain serial number, and several timers related to refreshing the zone.
There are many other query types we do not explore here.
The response from the DNS name server either answers the question posed in the query, refers the requester to another set of name servers, or signals an error condition. The response code, set in the header of the response message, reports the status of the query. For example, NOERROR is returned when the query completes successfully, NXDOMAIN indicates that the domain name does not exist, and REFUSED means the DNS server refused to answer the query sent to it. For an exhaustive list of these codes, please refer to the DNS RCODEs section of the link here.
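To make the header field concrete, here is a minimal sketch of reading the response code from a raw DNS message. It assumes the basic 12-byte header and ignores extended response codes carried in EDNS; the function name `response_code` and the example header are ours, for illustration only.

```python
import struct

# Basic RCODE values carried in the low 4 bits of the header flags field.
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def response_code(message: bytes) -> str:
    """Return the response code name found in a raw DNS message header."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # Header layout: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).
    _id, flags, _qd, _an, _ns, _ar = struct.unpack("!6H", message[:12])
    rcode = flags & 0x000F
    return RCODE_NAMES.get(rcode, f"RCODE{rcode}")


# Hypothetical header: QR=1 (response), AA=1 (authoritative), RCODE=3.
example_header = struct.pack("!6H", 0x1234, 0x8403, 1, 0, 1, 0)
print(response_code(example_header))
```

Codes outside the small table above are reported by number rather than guessed.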
Resolvers' Source Addresses
Recursive DNS servers, also called resolvers, are normally deployed with multiple IP addresses for load balancing and service redundancy. Using the ISP addresses we already know and the subnets published on the websites of Google and OpenDNS, we search the source addresses in our query data to filter the traffic from these resolvers. In the dataset used, we identified 737 addresses from Google DNS and 119 addresses from OpenDNS.
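The matching step can be sketched with Python's standard `ipaddress` module. This is only an illustration of the idea, not our actual pipeline: the subnets below are placeholders taken from the documentation address ranges, not the lists published by Google or OpenDNS, and the resolver labels are made up.

```python
import ipaddress

# Placeholder subnets from the documentation ranges, NOT real published lists.
RESOLVER_SUBNETS = {
    "public-resolver-a": ["192.0.2.0/24", "2001:db8:1::/48"],
    "public-resolver-b": ["198.51.100.0/25"],
}

# Parse every subnet once so that each lookup is a simple membership test.
NETWORKS = [
    (name, ipaddress.ip_network(cidr))
    for name, cidrs in RESOLVER_SUBNETS.items()
    for cidr in cidrs
]


def classify_source(address: str):
    """Return the resolver label whose subnet contains address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in NETWORKS:
        # Only compare addresses and networks of the same IP version.
        if ip.version == network.version and ip in network:
            return name
    return None


print(classify_source("192.0.2.53"))
print(classify_source("203.0.113.9"))
```

A linear scan is enough for a few dozen subnets; for much larger lists a prefix tree would be the more usual choice.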
Exploring the Query Traffic
On each resolver's plot, each bar represents one source IP address (we will call them servers from now on). While ISPs have just a few servers, public DNS resolvers have hundreds of them, which are plotted as dense bars.
From the plots, we can observe some interesting patterns:
- Though the distribution of query types varies from server to server, even within the same resolver, the servers still share some common patterns. Several query types are popular, such as A, AAAA, MX and DS, and A is still the most popular one.
- Regarding DNSSEC, the presence of DS and DNSKEY queries suggests that these servers are doing DNSSEC validation. DS queries significantly outnumber DNSKEY queries. This is reasonable: as authoritative name servers for mainly top- and second-level domains, we are queried for the DS records used to validate the chain of trust, while DNSKEY queries are used to validate signed records within the delegated zones.
- Additionally, a large fraction (about 50%) of the queries sent from some of Google's servers are NS queries. This is counterintuitive, as directly requesting NS records is not required in a common DNS resolution process. We cannot exclude the possibility that these servers are not ordinary DNS resolvers but are running specific tasks such as network probing.
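The per-server breakdown behind these observations can be sketched as follows. This is a simplified illustration that assumes the queries are already available as (source address, query type) pairs; the function names, the sample data and the 50% threshold are ours.

```python
from collections import Counter, defaultdict


def qtype_shares(queries):
    """Map each source address to the share of each query type it sent."""
    counts = defaultdict(Counter)
    for source_ip, qtype in queries:
        counts[source_ip][qtype] += 1
    shares = {}
    for source_ip, counter in counts.items():
        total = sum(counter.values())
        shares[source_ip] = {q: n / total for q, n in counter.items()}
    return shares


def ns_heavy_servers(shares, threshold=0.5):
    """Return servers whose share of NS queries reaches the threshold."""
    return sorted(
        ip for ip, dist in shares.items() if dist.get("NS", 0.0) >= threshold
    )


# Made-up sample using documentation addresses.
sample_queries = [
    ("192.0.2.10", "A"), ("192.0.2.10", "A"),
    ("192.0.2.10", "AAAA"), ("192.0.2.10", "DS"),
    ("192.0.2.20", "NS"), ("192.0.2.20", "NS"), ("192.0.2.20", "A"),
]
shares = qtype_shares(sample_queries)
print(shares["192.0.2.10"])
print(ns_heavy_servers(shares))
```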
More explorations could be done in the future, for example, the characteristics of IPv6 traffic, DNS flags in the header of a DNS message, etc.
Exploring the Responses
Responses are interesting because we can explore the quality of queries against the zone data on the authoritative name servers.
From the response codes in the plots above, it is no surprise that the majority of queries are answered with NOERROR, meaning the requested domain names are found in our zone, while the rest receive NXDOMAIN (the requested domain names do not exist in our zone). Other response codes such as REFUSED, FORMERR and SERVFAIL are not observed in our sample of .nz traffic.
Unexpectedly, four servers from Spark and some from OpenDNS have a high fraction of NXDOMAIN responses, which looks suspicious. Possible causes are infected hosts, spam sources, or searches for non-registered domains using DNS instead of WHOIS.
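One way to surface such servers is to compute the NXDOMAIN fraction per source address and flag the outliers. The sketch below assumes responses are available as (source address, response code) pairs; the minimum query count and the 0.5 threshold are arbitrary illustrative values, not the ones behind our plots.

```python
from collections import Counter


def nxdomain_fractions(responses, min_queries=100):
    """Return the NXDOMAIN fraction per server, skipping low-volume servers."""
    totals = Counter()
    nxdomain = Counter()
    for source_ip, rcode in responses:
        totals[source_ip] += 1
        if rcode == "NXDOMAIN":
            nxdomain[source_ip] += 1
    return {
        ip: nxdomain[ip] / n for ip, n in totals.items() if n >= min_queries
    }


def suspicious_servers(fractions, threshold=0.5):
    """Return servers whose NXDOMAIN fraction exceeds the threshold."""
    return sorted(ip for ip, f in fractions.items() if f > threshold)


# Made-up sample using documentation addresses.
sample_responses = (
    [("192.0.2.1", "NOERROR")] * 90
    + [("192.0.2.1", "NXDOMAIN")] * 10
    + [("192.0.2.2", "NXDOMAIN")] * 80
    + [("192.0.2.2", "NOERROR")] * 20
    + [("192.0.2.3", "NXDOMAIN")] * 5
)
fractions = nxdomain_fractions(sample_responses)
print(fractions)
print(suspicious_servers(fractions))
```

The minimum query count keeps servers with only a handful of queries from being flagged on noise alone.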
Exploring the Registered Domains Queried
We explored the percentage of registered domain names queried at least once per day, which is very interesting because it shows how active the .nz namespace is.
To get the queried domain names that are registered, we extract the unique domain names from the responses with NOERROR response code. The total number of registered domain names on that day is obtained from the Internet Data Portal (IDP).
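The calculation itself is simple and can be sketched as below. The sketch assumes each query name has already been reduced to its registered domain (for example `example.co.nz`) and that the total comes from an external source such as the IDP; the function names and the sample figures are made up for illustration.

```python
def registered_domains_queried(responses):
    """Return the unique, normalised names that received a NOERROR response."""
    return {
        name.lower().rstrip(".")
        for name, rcode in responses
        if rcode == "NOERROR"
    }


def percent_queried(responses, total_registered):
    """Percentage of registered domains seen at least once in the responses."""
    if total_registered <= 0:
        raise ValueError("total_registered must be positive")
    return 100.0 * len(registered_domains_queried(responses)) / total_registered


# Made-up sample; names are assumed to be already reduced to registered domains.
sample = [
    ("example.co.nz.", "NOERROR"),
    ("Example.co.nz", "NOERROR"),   # same domain, different case and no dot
    ("missing.nz", "NXDOMAIN"),     # not registered, so not counted
    ("other.org.nz", "NOERROR"),
]
print(percent_queried(sample, total_registered=8))
```

Lower-casing and stripping the trailing dot keeps the same domain from being counted twice.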
From the plots, we observe that public DNS resolvers show lower percentages than local ISPs possibly due to the different population compositions. Clients using public DNS resolvers are more likely to be international so their requests are widely distributed across different TLDs (Top Level Domains) and .nz domain names just represent a small part, while those using local ISPs are mostly .nz centric.
We explored two different days and they show patterns similar to those above.
We explored one day of .nz traffic coming from popular resolvers, including local ISPs and public DNS resolvers. We observed interesting patterns, which can be used to identify well-behaved resolvers and to help classify traffic by type of source address. Further exploration will be carried out to characterise all the traffic based on patterns from the sampled traffic.