There are times when decisive action is the straightest path to success. Starting from 1 February 2019, the organisations behind open source DNS software implementations are going to deploy changes to their code that could break your domains. That day has been labeled DNS Flag day.
Do software developers want to intentionally break domains? Well, no. For years, those developers have had to include workarounds in their code to allow a few domains to work: domains using DNS software that is not standards-compliant, or living behind network devices that do not respect Internet standards. Those workarounds are coming to an end. If you run a domain name and want more information, please check https://dnsflagday.net, which includes an online tool for testing.
As the guardians of the .nz namespace, we see it as our responsibility to investigate how this change will affect .nz, and we have been collecting information about DNS standard compliance across all .nz domains for a couple of months. The research involved was presented at LACNIC 30 and will be presented at DNS-OARC 29 in the coming weeks.
What do we test for?
The workarounds to be removed starting in February 2019 relate to a component of the DNS called EDNS (Extension Mechanisms for DNS). EDNS was created to extend the capabilities of the DNS protocol. For example, there couldn't be DNSSEC without EDNS.
ISC, the organisation behind BIND, the de-facto standard DNS implementation, created a test to verify if a DNS server responds correctly to a series of queries exploring different elements of the DNS standard, including EDNS. They have been collecting compliance data for the root zone and other domains for a while.
CZ.NIC, managers of the Czech Republic ccTLD, created a tool that tests a nameserver once, independently of how many domains it hosts, allowing bulk verification of a whole namespace like .cz or .nz.
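The idea of testing each nameserver once, regardless of how many domains it hosts, can be sketched in a few lines of Python. This is not the CZ.NIC tool's code: the function name and the sample delegations are hypothetical and only illustrate the deduplication step.

```python
from collections import defaultdict

def group_domains_by_nameserver(delegations):
    """Invert a domain -> nameservers mapping so each server appears once."""
    by_server = defaultdict(set)
    for domain, nameservers in delegations.items():
        for ns in nameservers:
            # DNS names are case-insensitive; also drop the trailing dot.
            by_server[ns.lower().rstrip(".")].add(domain)
    return by_server

# Hypothetical delegations, for illustration only.
delegations = {
    "example.nz": ["ns1.example.net.", "ns2.example.net."],
    "example.co.nz": ["NS1.example.net.", "ns2.example.net."],
    "another.nz": ["ns1.example.net."],
}

servers = group_domains_by_nameserver(delegations)
# Three domains, but only two distinct nameservers to probe.
print(len(servers), sorted(servers))
```

A result obtained for one nameserver can then be attributed to every domain in its set, which is what makes scanning a whole namespace practical.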
We currently use the CZ.NIC tool for .nz to check for EDNS compliance. In the future, we will extend the checks to full DNS compliance.
In a coordinated effort with .CL, .CZ, .SE, .NU, .NZ, and using the public results for the root zone, we can compare how different namespaces fare on the test. The figures below are not exhaustive but are the most compelling output.
Our first look is at the general nameserver distribution, as one nameserver can have multiple IPv4 and/or IPv6 addresses.
Although different zones have different numbers of domains, the number of servers is more or less stable, with the exception of Sweden with over 10k extra addresses compared to the rest.
Basic DNS test
dig soa ZONE @SERVER +noedns +noad +norec
For each nameserver, we send a query to confirm that it responds. In general, most nameservers pass this test.
The root zone has higher levels of correctness on this basic test because IANA imposes a set of technical tests on TLD operators. From now on, the root zone metric will serve as a baseline for comparing the other zones.
The errors "NOSOA" and "NOAA" imply the server didn't send the right response to the query, due to misconfiguration mostly.
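As a rough sketch of how a reply to the basic query might be classified, the Python below inspects the 12-byte DNS header and the record types in the answer section. It assumes that "NOAA" means the Authoritative Answer flag is missing and "NOSOA" means no SOA record came back; that reading is inferred from the error names, not taken from the testing tool. The function name and the sample headers are hypothetical.

```python
import struct

AA_FLAG = 0x0400   # Authoritative Answer bit in the header flags word
TYPE_SOA = 6

def classify_basic(header, answer_types):
    """Return 'ok', 'NOSOA' or 'NOAA' for a reply to the basic SOA query.

    header: the first 12 bytes of the DNS response.
    answer_types: RR type codes found in the answer section (parsed elsewhere).
    """
    _id, flags, _qd, _an, _ns, _ar = struct.unpack("!HHHHHH", header[:12])
    if TYPE_SOA not in answer_types:
        return "NOSOA"
    if not flags & AA_FLAG:
        return "NOAA"
    return "ok"

# Hand-built sample headers: QR+AA set, and QR only.
authoritative = struct.pack("!HHHHHH", 0x1234, 0x8400, 1, 1, 0, 0)
not_authoritative = struct.pack("!HHHHHH", 0x1234, 0x8000, 1, 1, 0, 0)

print(classify_basic(authoritative, [TYPE_SOA]))
print(classify_basic(not_authoritative, [TYPE_SOA]))
print(classify_basic(authoritative, []))
```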
dig soa ZONE @SERVER +edns=0 +nocookie +noad +norec
With the baseline defined, we can start showing how increasingly complex queries start producing errors.
From this test we can start seeing the first protocol violations. To activate EDNS, a DNS query includes an OPT record, and the server is required to include an OPT record in its response. The NOOPT errors are servers not returning that record. The NSID errors are servers returning the NSID option when they were not asked to provide it!
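To make the OPT record concrete, here is a minimal Python sketch that builds the wire format of an SOA query with and without EDNS. It is an illustration only, not what dig or the testing tool does internally; the helper names, the query ID and the 4096-byte payload size are arbitrary choices for the example.

```python
import struct

TYPE_SOA, TYPE_OPT, CLASS_IN = 6, 41, 1

def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    wire = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"

def build_opt(version=0, do=False, payload_size=4096, options=b""):
    """Build an OPT pseudo-record: root name, type 41, and the EDNS fields.

    The CLASS field carries the UDP payload size; the TTL field is split
    into extended RCODE (8 bits), version (8 bits) and flags (16 bits).
    """
    flags = 0x8000 if do else 0
    fixed = struct.pack("!HHBBHH", TYPE_OPT, payload_size, 0, version,
                        flags, len(options))
    return b"\x00" + fixed + options

def build_soa_query(zone, opt=None, query_id=0x1234):
    """Build an SOA query with all header flags clear (as with +norec +noad)."""
    arcount = 1 if opt else 0
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, arcount)
    question = encode_name(zone) + struct.pack("!HH", TYPE_SOA, CLASS_IN)
    return header + question + (opt or b"")

plain = build_soa_query("nz")                        # like +noedns
edns0 = build_soa_query("nz", build_opt(version=0))  # like +edns=0
print(len(plain), len(edns0))
```

The only difference between the two packets is the 11-byte OPT record in the additional section, which is why a server that ignores it shows up as NOOPT.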
dig soa ZONE @SERVER +edns=0 +nocookie +noad +norec +dnssec
Having working EDNS is essential for DNSSEC. The DO bit signals a DNS client wants to receive DNSSEC-related records, like RRSIG (signatures) and DNSKEYs. While testing for DO-bit support, we start to find higher levels of failure.
From the plot we can see two different stories. The Root, .SE and .NU zones have nearly 100% of nameservers answering correctly, and .NZ, .CZ and .CL slightly less than 80%, with the other 20% failing to include the DO bit in the response as required! There are also a few nameservers that time out on this query. If there is a signed domain behind those failing servers, DNSSEC will definitely break.
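The DO bit lives in the flags portion of the OPT record's TTL field. The following Python sketch shows how that field can be decoded and how a DO-bit check might look; the function names are hypothetical, and the OPT TTL value is assumed to have been extracted from the response by other code.

```python
DO_BIT = 0x8000  # "DNSSEC OK" flag, the top bit of the 16-bit EDNS flags

def decode_opt_ttl(ttl):
    """Split the 32-bit OPT TTL field into its EDNS components."""
    return {
        "extended_rcode": (ttl >> 24) & 0xFF,
        "version": (ttl >> 16) & 0xFF,
        "do": bool(ttl & DO_BIT),
    }

def do_bit_echoed(response_opt_ttl):
    """True when the response has an OPT record with the DO bit set.

    response_opt_ttl is None when the response carried no OPT record.
    """
    return response_opt_ttl is not None and decode_opt_ttl(response_opt_ttl)["do"]

print(do_bit_echoed(0x00008000))  # OPT present, DO set
print(do_bit_echoed(0x00000000))  # OPT present, DO cleared
print(do_bit_echoed(None))        # no OPT record at all
```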
dig soa ZONE @SERVER +edns=1 +noednsneg +nocookie +noad +norec
The EDNS1 test is quite tricky: EDNS version 1 has not been defined yet, and the only version available is EDNS0. So this test verifies whether the nameserver handles the unknown version correctly, and whether any network device doing packet inspection along the path can tell if the query is valid or not.
You can see that the root zone keeps its high compliance level, but the ccTLDs in our list fall behind with roughly 50% of the nameservers passing the test. The expected response must include a BADVERS return code, no SOA record, and an OPT record signaling EDNS version 0. The noerror and soa cases represent a nameserver that didn't validate the query properly; the noopt case is a nameserver that violated the standards by not returning the OPT record, as we saw above; and the badversion case is a nameserver that actually responded indicating it supports EDNS version 1!
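The expected EDNS1 response can be expressed as a small classifier. This Python sketch is an interpretation of the cases described above, not the testing tool's actual logic; the function names and the fallback label for other return codes are hypothetical. BADVERS is return code 16, which does not fit in the 4-bit header RCODE, so the upper bits travel in the OPT record's TTL field.

```python
BADVERS = 16

def full_rcode(header_rcode, opt_ttl):
    """Combine the 4-bit header RCODE with the 8-bit extended RCODE in OPT."""
    return (((opt_ttl >> 24) & 0xFF) << 4) | (header_rcode & 0xF)

def classify_edns1(header_rcode, has_soa, opt_ttl):
    """Return the list of problems in a reply to an EDNS version 1 query.

    opt_ttl is the TTL field of the response's OPT record, or None if the
    response had no OPT record.
    """
    problems = []
    if opt_ttl is None:
        problems.append("noopt")
        rcode = header_rcode & 0xF
    else:
        rcode = full_rcode(header_rcode, opt_ttl)
        if (opt_ttl >> 16) & 0xFF != 0:
            problems.append("badversion")  # server claims a version above 0
    if rcode == 0:
        problems.append("noerror")
    elif rcode != BADVERS:
        # Hypothetical label for return codes the article does not discuss.
        problems.append("rcode%d" % rcode)
    if has_soa:
        problems.append("soa")
    return problems or ["ok"]

print(classify_edns1(0, False, 0x01000000))  # BADVERS, version 0, no SOA
print(classify_edns1(0, True, 0x00000000))   # answered as if it were EDNS0
print(classify_edns1(0, True, None))         # ignored EDNS entirely
```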
dig soa ZONE @SERVER +edns=0 +noad +norec +nsid +subnet=0.0.0.0/0 +expire +cookie=0102030405060708
The OPTLIST test is intended to explore the adoption of newer DNS options, as they have been added in later years. NSID, defined 11 years ago in RFC 5001, asks the server to reply with a server identification string, useful for anycast deployments. subnet is an option defined 2 years ago in RFC 7871 for clients to signal where the original DNS query came from, useful for CDN operators. expire is defined in RFC 7314 to query the EXPIRE timer in the SOA record. cookie is defined in RFC 7873 and provides a lightweight DNS transaction security mechanism against a variety of attacks. In simple terms, this test is a gauge of how new and fresh the DNS software used on the nameservers is.
The plot provides two views. First, it shows which options are most commonly deployed, like nsid and subnet. Second, it shows the error cases: a few nameservers fail to respond to the query (timeout), and some return a DNS error (formerr), which means they didn't understand the query and indicates the software is a few years old.
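Each of these options travels inside the OPT record as a code, a length and a payload. The Python sketch below encodes the four options used in the query above and parses them back; the helper names are hypothetical, and the option codes (3, 8, 9, 10) are the ones assigned to NSID, client subnet, EXPIRE and COOKIE.

```python
import struct

NSID, CLIENT_SUBNET, EXPIRE, COOKIE = 3, 8, 9, 10

def edns_option(code, data=b""):
    """Encode one EDNS option as code, length, payload."""
    return struct.pack("!HH", code, len(data)) + data

def parse_options(rdata):
    """Split OPT RDATA back into a list of (code, payload) pairs."""
    found, pos = [], 0
    while pos + 4 <= len(rdata):
        code, length = struct.unpack("!HH", rdata[pos:pos + 4])
        found.append((code, rdata[pos + 4:pos + 4 + length]))
        pos += 4 + length
    return found

options = b"".join([
    edns_option(NSID),                                         # +nsid
    # +subnet=0.0.0.0/0: family 1 (IPv4), source prefix 0, scope 0, no address bytes
    edns_option(CLIENT_SUBNET, struct.pack("!HBB", 1, 0, 0)),
    edns_option(EXPIRE),                                       # +expire
    edns_option(COOKIE, bytes.fromhex("0102030405060708")),    # +cookie=...
])

print(len(options), [code for code, _ in parse_options(options)])
```

Applying the same parser to a response's OPT RDATA shows which options a server chose to answer, which is the information the test aggregates.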
Why does it matter?
We started this article pointing out that changes will be introduced due to DNS Flag day. The deployment of these changes will cause currently functioning domains to stop working. We estimate around 1.2% of .nz domains will be broken, and we will notify those registrants and DNS operators about our findings using the Registrar Portal.
The Internet is a tool for innovation and disruption, but introducing innovation in the core DNS protocols has always proven difficult. There are constant demands to be backward compatible and to avoid making big changes that will break existing features. Consequently, the DNS in particular is a protocol that has stayed largely the same for many years. We will be actively monitoring and investigating the level of protocol compliance within the .nz namespace and reporting back our findings. Stay tuned.