This is the third post in a multi-post series. The previous post is here.
Which bits are more likely to be affected by bit-errors? What does the bit-error distribution look like? In this blog post, I will attempt to answer those questions by looking at bit-errors in the requested record type field of DNS queries.
This post actually raises more questions than it answers: the bit-errors of the record type field are not distributed uniformly (the distribution one would expect from a random process), but instead mainly occur in bit 6 of the requested record type. I don't know why this is the case. I also don't know if this is only true for the record type field, or if this extends to the query name field as well. If you have any good suggestions, please contact me.
Bit-errors in the requested record type: A records
Astute readers will have noticed that in the previous post I didn't describe some of the top 15 requested record types. As a refresher, let's take another look at the top 15 most requested record types:
Rank | Query Count | Record Type |
---|---|---|
1 | 550892 | a |
2 | 509605 | aaaa |
3 | 358926 | mx |
4 | 26829 | any |
5 | 25039 | soa |
6 | 7729 | cname |
7 | 4835 | 513 |
8 | 4728 | ns |
9 | 2597 | txt |
10 | 1148 | srv |
11 | 698 | 1025 |
12 | 232 | 257 |
13 | 222 | a6 |
14 | 143 | ptr |
15 | 138 | spf |
The 7th most popular record type is 513. Type 513 is not mentioned in the Wikipedia list of record types, and it is not in Wireshark's record type list. Why are there 4835 requests for an undefined record type?
The answer is clearer when we look at 513 in binary (zero-extended to 16 bits):
0000 0010 0000 0001
This value is only one bit away from 1, the A record request type. Other requested record types in the top 15 share this similarity: type 1025 and type 257 are both one bit away from type 1. In the full query types table there are other requests with this property, such as requests for types 65, 2049, and 16385.
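The one-bit neighbours of a record type can be enumerated mechanically. The following is a minimal sketch in POSIX shell, not part of the original analysis; the function names `to_bin16`, `flip_bit`, and `single_bit_neighbours` are made up for illustration. It numbers bits the same way this post does, with bit 0 as the most significant bit of the 16-bit field.

```sh
# Print a value as 16 binary digits, grouped in nibbles like the tables here.
to_bin16() {
    value=$1
    pos=15
    out=""
    while [ "$pos" -ge 0 ]; do
        out="$out$(( (value >> pos) & 1 ))"
        # Add a space between nibbles, but not after the last one.
        if [ $((pos % 4)) -eq 0 ] && [ "$pos" -ne 0 ]; then
            out="$out "
        fi
        pos=$((pos - 1))
    done
    echo "$out"
}

# Flip one bit of a 16-bit value; bit 0 is the most significant bit.
flip_bit() {
    echo $(( $1 ^ (1 << (15 - $2)) ))
}

# Print "bit | binary | value" for every single-bit neighbour of a type.
single_bit_neighbours() {
    i=0
    while [ "$i" -le 15 ]; do
        flipped=$(flip_bit "$1" "$i")
        echo "$i | $(to_bin16 "$flipped") | $flipped"
        i=$((i + 1))
    done
}

single_bit_neighbours 1    # type 1 is the A record
```

Running the same function with 28 (the AAAA record type) gives the candidate values for AAAA queries.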
The requested record types one bit away from type 1, including binary representation and how often they were requested, are represented below:
Bit Flipped | Binary Value | RR Type | Count | Unique Count | Note |
---|---|---|---|---|---|
0 | 1000 0000 0000 0001 | 32769 | 0 | 0 | |
1 | 0100 0000 0000 0001 | 16385 | 5 | 2 | |
2 | 0010 0000 0000 0001 | 8193 | 0 | 0 | |
3 | 0001 0000 0000 0001 | 4097 | 0 | 0 | |
4 | 0000 1000 0000 0001 | 2049 | 22 | 7 | |
5 | 0000 0100 0000 0001 | 1025 | 698 | 25 | |
6 | 0000 0010 0000 0001 | 513 | 4835 | 142 | |
7 | 0000 0001 0000 0001 | 257 | 232 | 50 | |
8 | 0000 0000 1000 0001 | 129 | 0 | 0 | |
9 | 0000 0000 0100 0001 | 65 | 128 | 37 | |
10 | 0000 0000 0010 0001 | 33 | | | overlaps SRV |
11 | 0000 0000 0001 0001 | 17 | 2 | 1 | overlaps RP |
12 | 0000 0000 0000 1001 | 9 | 0 | 0 | overlaps MR |
13 | 0000 0000 0000 0101 | 5 | | | overlaps CNAME |
14 | 0000 0000 0000 0011 | 3 | 0 | 0 | overlaps MD |
15 | 0000 0000 0000 0000 | 0 | 2 | 1 | |
Note: some entries are blank due to overlap with other popular record types. Unpopular/deprecated record types, such as RP, were included in the count for bit errors. All query type overlaps are noted in the notes column.
The count column is how often each record type was requested. The unique count column is how many unique source IPs requested each record type. This was done to minimize the effect of one bit-error repeatedly manifesting itself via many repeated requests.
A visualization of the unique count column:
To obtain the unique count column, first we must get all the unique (source IP, query type) pairs (and disregard any queries for 0mdn.net):
$ tshark -n -r completelog.pcap -R '!(dns.qry.name contains 0mdn.net)' -o column.format:'"SOURCE", "%s", "QTYPE", "%Cus:dns.qry.type"' | sort -u > analysis/src_and_qtype.txt
After getting the (source IP, query type) pairs, a bash for loop can show us how many unique source IPs requested a certain record type.
$ for qt in 32769 16385 8193 4097 2049 1025 513 257 129 65 SRV RP MR CNAME MD unused; do echo "$qt:" `grep -i " $qt$" analysis/src_and_qtype.txt | wc -l`; done
32769: 0
16385: 2
8193: 0
4097: 0
2049: 7
1025: 25
513: 142
257: 50
129: 0
65: 37
SRV: 80
RP: 1
MR: 0
CNAME: 635
MD: 0
unused: 1
Counting only record requests by unique source IP address shows that the same error-prone query is repeated many times from the same source, but the overall distribution stays the same.
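As an alternative to running grep once per type, the tally can be made in a single pass with awk. This is only a sketch: it assumes each line of the de-duplicated file ends with the query type as its last whitespace-separated field, and unlike `grep -i` it is case-sensitive. The function name `count_sources_per_qtype` and the sample addresses (from the documentation range) are hypothetical, not data from the capture.

```sh
# Count how many distinct source IPs requested each query type.
# Expects one "SOURCE QTYPE" pair per line, already reduced with sort -u.
count_sources_per_qtype() {
    awk '{ count[$NF]++ }
         END { for (t in count) print t ": " count[t] }' "$1" | sort
}

# Hypothetical sample input, only to show the shape of the data.
sample=$(mktemp)
printf '%s\n' '192.0.2.1 513' '192.0.2.2 513' '192.0.2.2 A' '192.0.2.3 257' > "$sample"
count_sources_per_qtype "$sample"
rm -f "$sample"
```

Against the real data the call would be `count_sources_per_qtype analysis/src_and_qtype.txt`, which lists every query type seen rather than only the sixteen of interest.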
There has been much speculation about bit-error distribution and whether any bits are more likely to be affected. Judging by bit-errors in the query type field, some bits are considerably more likely to be affected: bit 6 accounts for roughly 82% of the requests in the table above (4835 of 5924) and just over half of the unique sources (142 of 265), with the error rate dropping sharply with distance from bit 6. This distribution is evident in the query type field; I have not verified whether it also holds in the query name field.
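To put numbers on the skew, each bit's share of the unique-source total can be computed directly from the Unique Count column of the A record table. The sketch below is an illustration added here, with the hypothetical helper name `bit_shares`; the counts are copied from the table, and bits 10 and 13 are left out because they overlap SRV and CNAME.

```sh
# Print each bit's share of the unique-source total.
# Reads "BIT UNIQUE_COUNT" pairs, one per line, from standard input.
bit_shares() {
    awk '{ n[$1] = $2; total += $2 }
         END {
             for (b = 0; b <= 15; b++) {
                 if (b in n) printf "bit %2d: %5.1f%%\n", b, 100 * n[b] / total
             }
         }'
}

# Unique Count column of the A record table (bits 10 and 13 omitted).
printf '%s\n' '0 0' '1 2' '2 0' '3 0' '4 7' '5 25' '6 142' '7 50' \
              '8 0' '9 37' '11 1' '12 0' '14 0' '15 1' | bit_shares
```

A uniform distribution over the 14 counted bits would put each bit at about 7%.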
I don't know why the distribution is as skewed as it is. Maybe the distribution is an artifact of the query type field and typical allocation alignments? Other thoughts and ideas are welcome.
Bit-errors in the requested record type: AAAA records
There are nearly as many AAAA record requests as there are A record requests. Do bit-errors of AAAA requests exhibit the same distribution?
Bit Flipped | Binary Value | RR Type | Count | Unique Count | Note |
---|---|---|---|---|---|
0 | 1000 0000 0001 1100 | 32796 | 0 | 0 | |
1 | 0100 0000 0001 1100 | 16412 | 0 | 0 | |
2 | 0010 0000 0001 1100 | 8220 | 0 | 0 | |
3 | 0001 0000 0001 1100 | 4124 | 0 | 0 | |
4 | 0000 1000 0001 1100 | 2076 | 0 | 0 | |
5 | 0000 0100 0001 1100 | 1052 | 0 | 0 | |
6 | 0000 0010 0001 1100 | 540 | 4 | 1 | |
7 | 0000 0001 0001 1100 | 284 | 4 | 1 | |
8 | 0000 0000 1001 1100 | 156 | 0 | 0 | |
9 | 0000 0000 0101 1100 | 92 | 0 | 0 | |
10 | 0000 0000 0011 1100 | 60 | 0 | 0 | |
11 | 0000 0000 0000 1100 | 12 | | | overlaps PTR |
12 | 0000 0000 0001 0100 | 20 | 0 | 0 | |
13 | 0000 0000 0001 1000 | 24 | 0 | 0 | |
14 | 0000 0000 0001 1110 | 30 | 0 | 0 | |
15 | 0000 0000 0001 1101 | 29 | 0 | 0 | |
Despite there being a nearly identical number of queries for each record type (when excluding queries for 0mdn.net), there are almost no bit-errors for AAAA record queries. The few errors that do exist correspond to bit 6 and bit 7. Some of the discrepancy between the number of bit-errors in A and AAAA queries can be explained by there simply being fewer sources of AAAA queries:
$ tshark -n -r completelog.pcap -R '(dns.qry.type == AAAA) and !(dns.qry.name contains 0mdn.net)' -o column.format:'"SOURCE", "%s"' | sort -u > analysis/aaaa_sources.txt
$ tshark -n -r completelog.pcap -R '(dns.qry.type == A) and !(dns.qry.name contains 0mdn.net)' -o column.format:'"SOURCE", "%s"' | sort -u > analysis/a_sources.txt
$ wc -l analysis/aaaa_sources.txt analysis/a_sources.txt
7206 analysis/aaaa_sources.txt
29833 analysis/a_sources.txt
There are only ~24% as many sources of AAAA requests as there are of A requests (7206 versus 29833). Still, if bit-errors scaled with the number of sources, this would only account for ~76% of the difference in error rate.
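One way to read the ~76% figure is to scale the A record error count by the source ratio and compare the result with what was observed. The sketch below does that arithmetic under the assumption that errors scale linearly with the number of sources; the function name `compare_error_rates` is hypothetical. The error counts are the sums of the Unique Count columns in the two tables: 265 for A (including the RP and type 0 rows) and 2 for AAAA.

```sh
# Compare observed AAAA errors with what the source ratio alone predicts.
# Arguments: A sources, AAAA sources, A unique errors, AAAA unique errors.
compare_error_rates() {
    awk -v a_src="$1" -v aaaa_src="$2" -v a_err="$3" -v aaaa_err="$4" 'BEGIN {
        # Errors expected for AAAA if they scaled with the number of sources.
        expected = a_err * aaaa_src / a_src
        printf "source ratio: %.1f%%\n", 100 * aaaa_src / a_src
        printf "expected AAAA errors: %.1f\n", expected
        printf "observed AAAA errors: %d\n", aaaa_err
        printf "share of gap explained: %.1f%%\n", 100 * (a_err - expected) / (a_err - aaaa_err)
    }'
}

compare_error_rates 29833 7206 265 2
```

Under that assumption about 64 unique-source errors would be expected for AAAA queries, far more than the 2 observed, so the smaller number of sources leaves a sizeable part of the gap unexplained.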
Conclusion
The bit-error distribution, at least with respect to the requested record type field, is not uniform. It is centered at bit 6 and sharply falls off with distance from bit 6. I don't have an explanation as to why, but I suspect it might have to do with packet alignment in memory. Other possibilities include errant networking equipment or software somewhere on the Internet. Any ideas and suggestions, especially testable ones, are most welcome.
There are also more bit-errors in A record requests than in AAAA record requests. The fact that there are fewer sources of AAAA record requests accounts for part of this discrepancy, but does not completely eliminate it.
If you have any insight, please contact me.
Update:
Part 4 is now up, Bitsquatting PCAP Analysis Part 4: Source Country Distribution.