No doubt we have all fat-fingered a domain name while typing it into a web browser and ended up not at the page we intended, but at a typosquatted domain that aims to benefit from our mistake.
But Artem Dinaburg’s paper “Bit-squatting DNS Hijacking without Exploitation,” presented last week at the Black Hat security conference, explores how bit-flipping in memory chips or CPU caches can also cause you to visit a wrong domain that may be one character off from the real one.
Why do bits flip? Dinaburg postulates that most often this is a consequence of either cosmic rays or devices being operated outside their optimal temperature range. The latter is likely more frequent in smartphones and other handheld devices that are used in a wide variety of environmental conditions.
Dinaburg conservatively estimates that 614,400 memory errors occur per hour globally. Not all of these could impact a DNS request, but this is still a significant opportunity for problems to occur.
The average computer does 1,500 DNS lookups per day, but only three of the addresses are typed in by a human being.
So Dinaburg registered 31 domain names that were one bit off from popular services like microsoft.com, amazon.com, doubleclick and several CDNs. He ran the experiment for approximately seven months.
During his experiment 52,317 requests were received from 12,949 unique IP addresses. Below is a chart showing his traffic volumes per day.
There were a few spikes in his results. The first two appear to result from a bit error cached at Zynga, the makers of Farmville, while the third seems to be a DNS error cached in a proxy or caching DNS server.
This shows bitsquatting could be criminally profitable if it were to target popular domains, especially domain names for content delivery networks (CDNs) like Akamai and Facebook.
This technique could be harnessed to distribute fake anti-virus or other drive-by exploit-driven scams.
So what can be done about it?
First, as domains can be inexpensively purchased, high volume web providers could proactively register domains that could be bitsquatted, just as some already do to prevent typosquatting, in order to protect their brands and customers.
Dinaburg also suggests that all PCs and internet-connected devices start using ECC memory, a measure that would greatly reduce the frequency of this type of error.
I found Dinaburg to be a dynamic and interesting speaker and his research really innovative. This isn’t likely to be the world’s next big security problem, but it is something all high-volume web service providers should think about.
This is totally weird – do you have an example of what a bitsquatting domain for say www.google.com would actually be registered as?
The www is controlled by google, and I don't think there are any bit-changed top-level domains from .com, so it would be in the 'google' section.
google is:
0x67
0x6f
0x6f
0x67
0x6c
0x65
0x67 is 0b1100111
Let's change the 3rd bit (counting from the left) to 1:
0b1110111
which is 0x77
which is 'w'
so 'www.woogle.com' would be a bitsquatting domain for www.google.com
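The same flip can be tried on every bit of every character. Here is a minimal Python sketch of that idea. The function name `bitsquat_variants` and the filtering rules are illustrative assumptions (lowercase input, old-style hostnames made of letters, digits and hyphens), not the method used in the paper.

```python
import string

# Characters allowed in an old-style hostname label (assumption: ASCII only).
VALID = set(string.ascii_lowercase + string.digits + "-")

def bitsquat_variants(label):
    """Return every valid label that differs from `label` by one flipped bit.

    Hypothetical helper for illustration; expects a lowercase ASCII label.
    """
    variants = set()
    for i, ch in enumerate(label):
        for bit in range(8):  # a byte in memory has 8 bits
            # DNS names are case-insensitive, so fold case before comparing.
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            if flipped not in VALID or flipped == ch:
                continue
            candidate = label[:i] + flipped + label[i + 1:]
            # A hostname label may not start or end with a hyphen.
            if candidate.startswith("-") or candidate.endswith("-"):
                continue
            variants.add(candidate)
    return sorted(variants)

if __name__ == "__main__":
    for name in bitsquat_variants("google"):
        print(name + ".com")
```

The output for `google` includes `woogle`, the example worked through above. Flips that only change the case of a letter, or that produce a character not allowed in a hostname, are dropped.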
You didn't give any examples of the "bitsquat" domains he used, but I assume they were all regular, old-school (non-international) ASCII domain names consisting of letters, digits and hyphens only.
So any single-bit change – such as the above example of woogle.com instead of google.com – is just a special case of a single-character change.
So why are cosmic rays a more plausible explanation for these erroneous lookups than a good, old-fashioned "wrong character typed on keyboard" error?
IMO, for this research to be worth anything at all, it needs to make some comparisons. For example, if you make a _random_ change to one character in a domain name (say, gocgle.com for google.com), do you get results that are statistically significantly different from the (generally rather modest) error rates in the graph above?
For all we know, the error rates correlate better with the position of the new letter on the keyboard relative to the original than with the Hamming distance between its ASCII code and that of the original.
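For reference, the Hamming distance mentioned here is simply the number of bits that differ between two names. A rough Python sketch follows; the helper name `bit_distance` is an assumption made for illustration.

```python
def bit_distance(original, observed):
    """Count the bits that differ between two equal-length ASCII names.

    Hypothetical helper; case is folded because DNS ignores it.
    """
    if len(original) != len(observed):
        raise ValueError("names must have the same length")
    a = original.lower().encode("ascii")
    b = observed.lower().encode("ascii")
    # XOR leaves a 1 wherever the two bytes disagree; count those 1s.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

if __name__ == "__main__":
    print(bit_distance("google.com", "woogle.com"))  # 1
    print(bit_distance("google.com", "gocgle.com"))  # 2
```

A distance of 1 means the observed name is a possible bitsquat of the original. The random one-character change `gocgle.com` differs by two bits, so a single flipped bit cannot produce it.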
Let's go with some rough figures. Assume 1,000,000,000 computers in the world. Round up this researcher's estimated single-bit-error rate to 1,000,000 per hour worldwide. Assume only one per computer. So every hour, 1 in 1000 PCs will have a single-bit error somewhere in their (let's guess again) 10,000,000,000 bits of memory. Let's generously allow that 1,000,000 (1 in 10,000) of these are in characters that make up the domain name in a DNS request that is about to be made.
Not a lot of bitsquat errors there.
On my PC, however, I might type 10,000 characters in a busy hour. I'll probably make well over 100 – closer to 1000 – fairly arbitrary typing mistakes in that time. Of these, at least one error – perhaps as many as 10 – will be in domain names I'm typing in.
So which is more likely to explain this researcher's results? Cosmic rays, or my bad typing?
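The back-of-envelope arithmetic above can be written out as a short Python sketch. Every input is one of the rough guesses from this comment, not a measured value, and the variable names are made up for illustration.

```python
from fractions import Fraction

# All inputs are the rough guesses from the comment above.
computers = 1_000_000_000        # PCs in the world
bit_errors_per_hour = 1_000_000  # rounded-up worldwide single-bit errors
memory_bits = 10_000_000_000     # bits of memory per PC
domain_bits = 1_000_000          # bits assumed to hold a domain name about to be looked up

share_of_pcs_hit = Fraction(bit_errors_per_hour, computers)  # 1 in 1,000
share_in_domain = Fraction(domain_bits, memory_bits)         # 1 in 10,000

# Expected bit errors that land in a domain name, per hour.
worldwide_per_hour = bit_errors_per_hour * share_in_domain
per_pc_per_hour = worldwide_per_hour / computers

# Domain-name typos per PC in a busy hour (low and high guesses).
typos_low, typos_high = 1, 10

if __name__ == "__main__":
    print("PCs hit per hour:", share_of_pcs_hit)
    print("Bitsquat errors per hour, worldwide:", worldwide_per_hour)
    print("Bitsquat errors per hour, per PC:", float(per_pc_per_hour))
    print("Typos per bitsquat error, low guess:", typos_low / per_pc_per_hour)
```

On these guesses, about 100 bit errors per hour worldwide would land in a domain name, against one to ten domain-name typos per busy hour on a single PC. The sketch does not account for flips that produce an invalid hostname character, nor for lookups made by software rather than typed by a person.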
I think he covers that with:
The average computer does 1,500 DNS lookups per day, but only three of the addresses are typed in by a human being.
-although I have no idea how he gathered those statistics. What his data seems to indicate is that more mistakes are off by a bit, rather than by a keyboard key, than should show up in the data, assuming an average of three typed domains per day.
I'd be interested to see the global distribution of his results, though. It's possible that the results could just be due to a keyboard layout other than standard QWERTY, combined with fat-fingering at some level other than the keyboard (I see many examples of mistyped URLs in emails, tweets (via URI shorteners), and blogs, for example).
hoogle.com might be a typo for google.com but it is unlikely that goo4le.com is a typo
How does he know they weren't typos? For services getting that much use, I'd have thought you'd get every possible error without needing a DRAM bit error. Those are pretty rare at ground level, and the odds of having a DNS problem before a crash seem quite low outside a DNS server, and pretty low even there.
There is some confusion about what is going on here. I type Facebook.com into my browser, with no typos. My computer, on its own, because of the make-up of the page I access, requests 10-200 resources per page on average. These requests are all computer-generated and not subject to typos. It is in these requests for images, style sheets, scripts, etc. that these errors happen. Hope that clears things up.