ITHI M8: Authoritative server traffic analysis

ITHI Metric M8 characterizes the behavior of recursive resolvers as observed by a sampling of authoritative servers. We want to find answer to the following questions:

Do recursive servers forward gratuitous queries to non-existing domains?

Do recursive servers use Extended DNS (EDNS), and how?

Do recursive servers signal that DNSSEC is OK in their queries?

Do recursive servers start deploying QName minimization as part of DNS privacy efforts?

Recursive resolvers, in the normal course of operation, send queries for domain names to the servers authoritative for these domains. We expect that authoritative servers will receive requests from many resolvers. We also expect that there is some selection bias: pretty much every resolver will query popular global names, but domains specific to particular countries are much more likely to be queried by resolvers in that country. We don’t know how these potential bias affects measurements like the use of EDNS or DNSSEC by recursive. We will only know that when we can get multiple measurements from multiple sites and compare the results. Our first step is thus to instrument a sufficient number of servers to at least get preliminary results.

Measuring QName Minimization

Measuring the use of EDNS or the setting of the DO bit is pretty straightforward, since the information is visible in the queries and responses. Measuring QNAME minimization is less obvious. We first have to understand how QName minimization work, which is shown in this comparison table:

Resolve “X record” for “www.example.com”	Without QName minimization	Standard QName minimization	QName minimization with firewall traversal
Query to the root	www.example.com/X?	.com/NS?	.com/A?
Response from the root	.com/NS	.com/NS	.com/A and .com/NS, or maybe just .com/NS
Query to TLD server	www.example.com/X?	example.com/NS?	example.com/A?
Response from TLD server	.example.com/NS	.example.com/NS	.example.com/A and .example.com/NS, or maybe just .example.com/NS
Query to zone server	www.example.com/X?
Response from zone server	www.example.com/X or maybe www.example.com/CNAME.

The table shows two variants of QName minimization. The standard approach was to break down the queried name into “zone” components, and instead of sending request for the complete name, just send successive requests for the name server (NS) records of those components. However, early deployment attempts showed that some firewalls were surprised to see these NS requests, and sometime blocked the packets. The “firewall traversal” variants works by querying an A record instead of the NS record. The table shows that the length of names is a good decision attribute:

If the name in the query is longer than the name in the response, then we can be pretty sure that the query was not minimized.

If the name in the query has the same length as the name in the response, then the query may or may not have been minimized – since in all three cases there is a final step in which the full name is queried for the server in charge of the zone.

We also assume that in some cases, even resolvers who do not implement QName minimization will send queries for short names such as “example.com”. To make decisions, we need to see multiple queries sent by the same resolver. We consider resolvers as “probably not minimizing” if the names in the queries were sometimes longer than the name in the response, and “maybe minimizing” if the names in the queries were always the same length as the names in the response.

Statistics per resolver

When we do statistics about characteristics like EDNS, DO or QName minimization, we could simply count the number of queries with the tracked property, and then divide by the total number of queries. That would give us for example the “percentage of queries with the DO bit set.” But to get statistics per resolver, we need to get data for individual resolvers.

We use the IP address to identify the resolvers at each data collection point, for each data collection session. We will then for each of the attributes that we look at count the number of queries with the attribute, and the total number of queries. We compute a measure for that data collection point and that session by using a weight of 1 for each resolver. This give us the ratios measured in a particular session. The metric itself is obtained by averaging the ratios over all the sessions.

As part of the EDNS metric, we also track usage of the various EDNS options. We are already tracking global usage of these options as part of M6, but M8 gives us an indication of the distribution of these options in the pool of receivers. Our simple solution is to only sample one packet per resolver and count the appearance of options in that packet.

Comparison with M5

The M5 metric also tracks the behavior of resolvers. The data is collected by running advertisements on a large number of clients distributed more or less randomly on the Internet, generating name resolutions as part of processing the advertisements. The names all belong to domains controlled by the experiment, so the traffic can be observed at servers controlled by the experiment. The name resolutions are processed by resolvers configured for these clients, leading to sampling a large proportion of the resolvers on the Internet.

The main difference between M8 and M5 is that M8 acquires its data by looking at traffic naturally generated by diverse clients, while M5 uses artificial traffic generated for the experiment. There may well be differences, from the choice of names to the specific resource records or EDNS options used in the queries. Instrumentation of a sample of servers provides complementary data on DNSSEC usage. It also provides statictics on EDNS usage or QName minimization deployment that are not available from M5.

Limitation of the DNSSEC OK Option

We are measuring the fraction of resolvers that set the DNSSEC OK option in their queries. We understand that resolvers may very well set that option even if they do not intend to perform response validation according to DNSSEC. For reference, we know that in August 2018 84% of resolvers measured by M5 set the DO bit in queries, but only 25% performed DNSSEC validation. Users of the M8 statistics should be conscious that we are measuring the presence of the DNSSEC OK option, not actual usage of DNSSEC.

Limits of M8

We considered and rejected several additional measurement ideas.

We will not attempt to measure “name leakage” at authoritative servers, because this is hard to measure without knowing a lot about the zone served by the authoritative, and there are privacy issues if we do that.

We were told that we may have a problem with "pseudo CCTLD", such as ".fr.com". There may be some legitimate use for such pseudo-CCTLD, e.g. people setting up some competition for the national service. But there may also well be some more nefarious actions, e.g. phishers setting pseudo domains like "banque-de-paris.fr.com" to phish customers of "banque-de-paris.fr". But it is hard to measure occurrence of such domains by parsing samples of actual traffic. In any case, if we did want to measure this phenomenon, it should be done by parsing recursive resolver traffic, not traffic at a sample of authoritative servers.

We will not attempt to measure the distribution of market share among recursive servers. Doing so would require collecting lists of IP addresses of resolvers at various locations, sending these lists to our staging servers, and computing request frequencies there. This would be a violation of our privacy rules, and we would rather not do it.

List of M8 metrics

The design results in the following list of metrics:

Metric	Description
M8.1	% NX Domain Queries at Authoritative
M8.2.1	% of resolvers using EDNS
M8.2.2.X	% of resolvers using EDNS option code X
M8.3	% of resolvers that set the DNSSEC OK option in queries
M8.4	% resolvers that appear to deploy QName minimization

The current values of the metrics will be available here, once sampling sites become operational.