tag:blogger.com,1999:blog-82704338379109804712024-03-05T04:17:44.821-06:00Artem Dinaburg's BlogComputer Security and Various ThoughtsUnknownnoreply@blogger.comBlogger26125tag:blogger.com,1999:blog-8270433837910980471.post-491475562514880882015-06-22T23:45:00.000-05:002015-06-22T23:54:02.282-05:00I Love Open Plan Offices<div dir="ltr" style="text-align: left;" trbidi="on">
It seems like every month there is a story about how open plan offices suck and how private offices are ideal for software developers. Open plan offices <a href="http://qz.com/85400/moving-to-open-plan-offices-makes-employees-less-productive-less-happy-and-more-likely-to-get-sick/#">make workers sick</a>, <a href="http://mattrogish.com/blog/2012/03/17/open-plan-offices-must-die/">developers hate them</a>, and well, they’re “<a href="http://www.theguardian.com/news/2013/nov/18/open-plan-offices-bad-harvard-business-review">devised by Satan in the deepest caverns of hell</a>.” The comments from tech workers are similarly negative (see <a href="https://news.ycombinator.com/item?id=5767414">here</a>, <a href="https://news.ycombinator.com/item?id=8984059">here</a>, <a href="https://news.ycombinator.com/item?id=3729302">here</a>, and <a href="https://news.ycombinator.com/item?id=3553853">here</a>). <a href="http://www.joelonsoftware.com/articles/FieldGuidetoDevelopers.html">People who would know</a> and several studies have said that private offices make the most productive developers.
<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://pbs.twimg.com/media/CBZk1TMU0AAZLKa.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="300" src="https://pbs.twimg.com/media/CBZk1TMU0AAZLKa.jpg" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Some people's view of an open plan office. The open environment ensures close collaboration and teamwork.</td></tr>
</tbody></table>
<br />
Well, I'm a software developer, and I love open plan offices and hate private offices. I’ve worked in open plan offices, shared offices, private offices, and from home. Open plan offices are by far my favorite.<br />
<br />
It will sound cliche, but open plans work for me because of the increased collaboration. Discussing work over email, phone, and chat is simply not the same. Developers work in teams, and it's much easier to be effective within easy conversational range. There is also a social aspect that builds team cohesion. Finally, being in the same environment makes people more accountable. It's much easier to stay on task when you see other people working hard towards the same goal -- and much harder to get distracted and browse <a href="http://www.reddit.com/r/pimpcats">reddit</a>.<br />
<br />
Conversely, private offices don’t work for me because they are stifling and isolating. My private office experience was sitting alone in a room for 8 hours a day with almost no human contact. Sure, maybe I would catch people going to lunch, or someone would stop by to ask a question, but that was rare. The barriers to communication were simply too high. With private offices, communication required getting up, leaving your office, walking around, and knocking on someone’s door. When people had a question, they would first search, then email, then call, and finally walk over to talk in person. And this is all for team members on the same floor. This cycle could take hours. If everyone were in the same room, it would take seconds. And if the person you needed was on a different floor, well, you'd better hope to meet at lunch. The one-floor elevator ride was simply insurmountable, except in the most dire of circumstances.<br />
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://solitarywatch.com/wp-content/uploads/2013/11/tapley-supermax-photo.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://solitarywatch.com/wp-content/uploads/2013/11/tapley-supermax-photo.jpg" height="265" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">My view of private offices. No distractions for 23 hours a day, comes with a private bathroom, and even has a window.</td></tr>
</tbody></table>
<div dir="ltr" style="line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;">
<br />
For some people, open plan offices really are terrible. The solution isn’t to force everyone into private offices -- they're also terrible. Just like <a href="http://www.amazon.com/gp/product/0316076201/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0316076201&linkCode=as2&tag=legknisho0f-20&linkId=MVHKLUSYLIYW7OIH">there is no best Pepsi</a> and <a href="http://www.ted.com/talks/malcolm_gladwell_on_spaghetti_sauce?language=en">no best spaghetti sauce</a>, there is no best office space. I’ve met people who love private offices, who love open offices, who will only work from home, and people who think the ideal environment is a shared office. They are all correct. The answer is to stop forcing a single office layout on all employees.<br />
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.frugallivingnw.com/wp-content/uploads/2012/10/ragu-pasta-sauce-printable-coupon.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://www.frugallivingnw.com/wp-content/uploads/2012/10/ragu-pasta-sauce-printable-coupon.jpg" height="105" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><a href="https://en.wikipedia.org/wiki/Howard_Moskowitz">There is no best office, just like there is no best pasta sauce</a>.</td></tr>
</tbody></table>
<br />
When so many companies make it their mantra to <a href="https://www.google.com/#q=%22hire+the+best%22">hire the best</a> people, why put them into unproductive environments (or make them <a href="https://twitter.com/dhh/status/517477412125564928">move to San Francisco</a>)? Keep some larger offices to share, leave some as private offices, and make a common area an open floor plan. And if people want to, let them work from home. To get and keep the best people, accommodate what makes them productive. It’ll be easier to recruit, and productivity will go up.</div>
Unknownnoreply@blogger.com7tag:blogger.com,1999:blog-8270433837910980471.post-4413319788761987812014-11-02T22:00:00.001-06:002014-11-03T09:36:50.590-06:00Advertising with Google: It Sucks<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="margin-bottom: 0pt; margin-top: 0pt;">
<span style="line-height: 18.3999996185303px;">Advertising with Google as a small business is much different than using Google as a consumer. Paying Google for ads and calling Google support is a lot like <a href="http://www.theverge.com/2014/8/19/6004131/comcast-the-worst-company-in-america">paying Comcast for cable and calling Comcast support</a>. </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">The events described in this post happened about a year ago. I finally decided to make this post for two reasons: to show that Google is exactly like every other large faceless near-monopoly and to show my disappointment with how Google handles paying customers. </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">Before this experience, I thought the parts of Google that make money must be amazing. Look at <a href="https://mail.google.com/">Gmail</a>, <a href="https://drive.google.com/">Google <strike>Docs</strike> Drive</a>, <a href="https://calendar.google.com/">Google Calendar</a>, <a href="https://www.blogger.com/">Blogger</a> and even <a href="https://www.android.com/">Android</a>. They’re great, and they’re all free. Google makes these services to capture you, the product. If the free stuff is great, then the parts of Google that face advertisers (the customer) and take in <a href="https://investor.google.com/financial/tables.html"><b>billions</b> of dollars in revenue</a> have to be <b>absolutely amazing</b>. Sadly, they’re not.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">Let's start with some background. A year ago I had a great idea: <a href="http://legalknifeshop.com/">start a website that would sort pocket knives by legality to carry</a>. You would pick a city and see knives that are legal to carry in that jurisdiction. The listed knives would have Amazon referral links. Everyone would benefit: the customers would stay legal, and I would collect a sales commission. </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">The story of the site is long and deserves its own blog post, but in summary... creating product sales sites is hard. The site I made, <a href="http://legalknifeshop.com/">legalknifeshop.com</a>, is still up. Go and take a look, and if you live in <a href="http://www.cityofchicago.org/city/en.html">Chicago</a>, <a href="http://legalknifeshop.com/product-category/chicago/">order a knife</a>. Also, I know the site has many problems. I’ve been too busy to work on it. If you see something wrong, I’m probably aware of it but don’t have time to fix anything.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">But this post is about advertising, so let's get back to the story. The first order of business after making a site is to advertise. There is <a href="http://blogs.wsj.com/digits/2013/06/13/in-online-ads-theres-google-and-then-everybody-else/">only one choice in online advertising: Google</a>. I signed up for an <a href="https://www.google.com/adwords/">AdWords</a> account, selected keywords, and paid money. Legalknifeshop was approved and the ads started. Time to wait for the sales to roll in! </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">The ads ran for two days. Then I got the first email. </span><br />
<blockquote class="tr_bq">
<span style="line-height: 18.3999996185303px;">Urgent Warning - Your AdWords Account May Get Suspended</span></blockquote>
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">That's not good. It turns out that after being approved, legalknifeshop was unapproved, despite no content changes.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">The site was in violation of the <a href="https://support.google.com/adwordspolicy/answer/6014299?hl=en">weapons policy</a> that prohibits <a href="http://www.smashinglists.com/10-deadliest-combat-knives-daggers/">dangerous knives</a>. The violation was silly: legalknifeshop only sells explicitly legal knives, and knives <a href="http://legalknifeshop.com/product-category/chicago/">legal to carry in Chicago</a> (the only supported location) must have a blade shorter than 2.5 inches. <a href="http://www.scoutstuff.org/bsa/camping/knives-accessories/knives/knife-bs-pocket.htm">Boy Scouts handle longer and more dangerous knives</a>. Certainly someone at Google could understand and reinstate the ads. </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">I called Google support. After 4 different menu prompts there was a human, to whom I explained the situation. </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">She put me on hold for 15 minutes, came back and said that she <b>“had no idea what was going on,”</b> but the site needed to be escalated back to review. </span><span style="line-height: 18.3999996185303px;">“But what about the previous review?” I asked. </span><span style="line-height: 18.3999996185303px;"><b>“I don’t work in that department,”</b> she answered. </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">This was the first sign of what was to come.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">Then another email. </span><br />
<blockquote class="tr_bq">
<span style="line-height: 18.3999996185303px;">Your AdWords account: Ads not running due to AdWords Advertising Policies. </span></blockquote>
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">Maybe the review department wasn’t convinced? Having heard nothing else, I sent support an email asking why legalknifeshop violated the weapons policy.</span><br />
<blockquote class="tr_bq">
<span style="line-height: 18.3999996185303px;">Call 1-866-2-GOOGLE* for free expert help reviewing your AdWords ads </span></blockquote>
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">That was the title of the next email. It continued: "Having reviewed your AdWords account XXXXXXXXXX, we can see that so far 549 potential customers have seen your ad, and 4 of them clicked on your ad to view your website. We can help you attract more potential customers to your website or answer any other questions you may have".</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">Maybe they could start by reinstating my ads? Really, Google? What world do we live in when a <a href="https://cloud.google.com/bigquery/">Big Data company</a> like Google <b>can't link their suspended account database to their promotional emails</b>?</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">I called support again and spent 20 minutes on hold. The reply was "oh, I can't really see what's going on, <b>you'll have to contact the people who you were contacting before"</b>. This is when I knew I was in trouble. There isn’t even coherent customer record management or issue tracking.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">During the next few weeks I called, used the online customer chat, and sent emails. Finally, after persevering and explaining that legalknifeshop sold no dangerous knives, someone at Google admitted that no, legalknifeshop does not violate the weapons policy. Victory at last! </span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">The sweet taste of victory was not to last.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">The next day, another suspension email. Now, my site was in violation of the bridge page policy. A policy that was not mentioned once during the many prior discussions with Google support. I appealed again, but it was not to be.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">It turns out that legalknifeshop doesn't provide enough value to end users. Providing enough value to advertise with Google would require the following changes:</span><br />
<ul style="text-align: left;">
<li><span style="line-height: 18.3999996185303px;">Provide listings for more than one city (doable), <b>and one of:</b></span></li>
<li><span style="line-height: 18.3999996185303px;">Use more than Amazon as a referral partner (pretty much impossible), <b>or</b></span></li>
<li><span style="line-height: 18.3999996185303px;">Sell the knives myself (impossible).</span></li>
</ul>
<br />
<div style="text-align: left;">
<span style="line-height: 18.3999996185303px;">During my new week of appeals, I received the following promotions from Google:</span></div>
<blockquote class="tr_bq">
<span style="line-height: 18.3999996185303px;">Reach the right customers by adding negative keywords to your AdWords ad 'Knives Legal In Chicago'</span></blockquote>
<br />
<blockquote class="tr_bq">
<span style="line-height: 18.3999996185303px;">Your AdWords ads have stopped running. Talk to Google to get help. Remember to call 1-866-2-GOOGLE* for a free review of your AdWords ads with a Google expert</span></blockquote>
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">There is no way I could meet the bridge page requirements. Google won: they refused to take my money. I really tried to pay, but they just wouldn’t take it.</span><br />
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">Two days later, there was one last email:</span><br />
<blockquote class="tr_bq">
<span style="line-height: 18.3999996185303px;">AdWords Tune-up: Make your 'LegalKnifeShop #1' campaign ad stand out with a longer headline</span></blockquote>
<span style="line-height: 18.3999996185303px;"><br /></span>
<span style="line-height: 18.3999996185303px;">That was the last straw. Out of pure spite, I <a href="http://advertise.bingads.microsoft.com/en-us/home">advertised with Bing</a>. But that's a story for another blog post.</span></div>
</div>
Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-8270433837910980471.post-29634793044575140002014-05-18T01:20:00.001-05:002014-05-18T01:23:26.554-05:00What Happens When Your Phone Falls into the Ocean<style>
td, th, table {
border:none;
}
table { border-collapse:collapse }
</style>
<div dir="ltr" style="text-align: left;" trbidi="on">
Short answer: It'll stop working.<br />
<br />
But have you ever wondered why? What happens to the insides of the phone?<br />
<br />
We've all heard that salt water conducts electricity and "fries" your electronics, but what does that mean? Will the phone rust? Will the battery melt? Today we get to find out!<br />
<br />
In this blog post we'll look inside my <a href="http://www.amazon.com/gp/product/B0097CZBH4/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=B0097CZBH4&linkCode=as2&tag=legknisho0f-20&linkId=GVKNZCQ4NIRWH7BH">Black 16GB iPhone 5</a> (yes, that's an affiliate link) that took a swim in the <a href="https://en.wikipedia.org/wiki/Pacific_Ocean">Pacific</a>. Where possible, <a href="http://www.ifixit.com/Teardown/iPhone+5+Teardown/10525">iFixit's iPhone 5 teardown</a> pictures will model what iPhone 5 internals <b>should</b> look like.<br />
<br />
<h3 style="text-align: left;">
The Initial Opening</h3>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0"><tbody>
<tr><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgO2eUWvJPKs16h7PvDaJLkbFyzusmMuo9pzo-bZo7u1qDtwZOAW7OlNa-iNBvTZhW873XVIWoAzSJYcIs4_QCQvxB69wAn1BCo5Lqe_fqoywjqJhG0d2jZmqRYEjnRDloDPcSd0fgcnntr/s1600/initial_opening.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgO2eUWvJPKs16h7PvDaJLkbFyzusmMuo9pzo-bZo7u1qDtwZOAW7OlNa-iNBvTZhW873XVIWoAzSJYcIs4_QCQvxB69wAn1BCo5Lqe_fqoywjqJhG0d2jZmqRYEjnRDloDPcSd0fgcnntr/s1600/initial_opening.JPG" height="320" width="239" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">This is what an iPhone 5 looks like after swimming in the Pacific.</td></tr>
</tbody></table>
</td><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEuybVcY_h0go5bKDyLX8PjEunCI320s_Z0KsLrUbDn8brHlpzuEyIgI3kxl7vtPRqKF8WpRNuh_nUg4TREpojlalM5_xmRT-Fbwt0VgkCM1OHDe9Y2XedzJ7n52kXQeF_fgQxSffwVALr/s1600/ifixit_initial_opening.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEuybVcY_h0go5bKDyLX8PjEunCI320s_Z0KsLrUbDn8brHlpzuEyIgI3kxl7vtPRqKF8WpRNuh_nUg4TREpojlalM5_xmRT-Fbwt0VgkCM1OHDe9Y2XedzJ7n52kXQeF_fgQxSffwVALr/s1600/ifixit_initial_opening.jpg" height="239" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">This is what iPhone 5 innards should look like, courtesy of <a href="http://www.ifixit.com/">iFixit</a>.</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<br />
Right away, there's a huge difference. The ocean phone is full of fine sand, salt stains, and a giant rust spot. If you remember chemistry class, then the rust spot makes sense: <a href="http://science.howstuffworks.com/question445.htm">rusting is an electrochemical process</a>. The biggest rust spot will be at the anode of the battery leads, which is exactly where it is.<br />
<br />
<h3 style="text-align: left;">
The Screen Assembly and Mainboard</h3>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0"><tbody>
<tr><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLY9DKg3hhLQvIOlMtkL6ROtGBsaF3S3FCpOdLE7AHRBFZnjfKxnqLDIXyWtOH3dY22ON2pt0M_HXQWNaXRMrfkKy587tQt8BUWZk7uUfehc2gq4qKB2ARWJpcB0IYa_vBL5RGd9hRJ4qQ/s1600/open_front.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLY9DKg3hhLQvIOlMtkL6ROtGBsaF3S3FCpOdLE7AHRBFZnjfKxnqLDIXyWtOH3dY22ON2pt0M_HXQWNaXRMrfkKy587tQt8BUWZk7uUfehc2gq4qKB2ARWJpcB0IYa_vBL5RGd9hRJ4qQ/s1600/open_front.JPG" height="320" width="239" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The salted, rusty screen assembly. The rust feels like it's simply rubbed off from the other part.</td></tr>
</tbody></table>
</td>
<td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiu2qVRip1hgJWsRBASXFqUP6I4g7b_FQg-nv1v3wce9btEo0taeuax0SEBzuBSFTCBXc_dimTOqJx_1OcF6tbw7XDgOg874HrPDoEJ-oHx9O0Q7WzDxkQ7D7VNVtJPdq-60q1IZSIStT_k/s1600/open_back.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiu2qVRip1hgJWsRBASXFqUP6I4g7b_FQg-nv1v3wce9btEo0taeuax0SEBzuBSFTCBXc_dimTOqJx_1OcF6tbw7XDgOg874HrPDoEJ-oHx9O0Q7WzDxkQ7D7VNVtJPdq-60q1IZSIStT_k/s1600/open_back.JPG" height="320" width="239" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The salted, rusty mainboard and battery. The battery is surprisingly intact.</td></tr>
</tbody></table>
</td>
<td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgv3_Cvh_jws_8PzD7-Z11HA6yurWBfRD69NvBe0BYUicjLoUcmN7l2br-bbLRd43p8wmN3lwMvY_J3NuLa8OTf4gvynMFFFzgWyc5WfxW11pfF-vxwnZMscBYdnNlmNh52tDNZsQOr5MY7/s1600/ifixit_open_backfront.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgv3_Cvh_jws_8PzD7-Z11HA6yurWBfRD69NvBe0BYUicjLoUcmN7l2br-bbLRd43p8wmN3lwMvY_J3NuLa8OTf4gvynMFFFzgWyc5WfxW11pfF-vxwnZMscBYdnNlmNh52tDNZsQOr5MY7/s1600/ifixit_open_backfront.jpg" height="240" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">What an iPhone 5 screen assembly and mainboard should look like. Notice the distinct lack of salt, rust, and sand. Courtesy of <a href="http://www.ifixit.com/">iFixit</a>.</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
Once again, the salt stains and rust are the big difference. Surprisingly, the lithium-ion battery is perfectly intact, without holes or burn damage. Why surprising? <a href="https://www.youtube.com/watch?v=8ypUVpwgcAA">Lithium and water tend to react vigorously</a>.<br />
<br />
<h3 style="text-align: left;">
Battery Closeup</h3>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0"><tbody>
<tr><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkj2LWStsQzStJQqaJwUT4LtaMRaN8YzX3Sp1dq-Ix8cQRp6wDFesKGj13lpGIRE5E1c9FFAhAe997UIjtAT3yrI-TmIzn-QHFftcm57lGDblhvx_uPdnCwU_EXhhgQW-WiQ3zptd8NT_p/s1600/battery_1.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkj2LWStsQzStJQqaJwUT4LtaMRaN8YzX3Sp1dq-Ix8cQRp6wDFesKGj13lpGIRE5E1c9FFAhAe997UIjtAT3yrI-TmIzn-QHFftcm57lGDblhvx_uPdnCwU_EXhhgQW-WiQ3zptd8NT_p/s1600/battery_1.jpg" height="320" width="242" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The battery leads are extra rusty.</td></tr>
</tbody></table>
</td><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJA7vu21StjkTyBH-BWLMzb1NiqlGha5z6A9FQIs5rkrK6RTUZx7rKMSLo1YJQDXhyphenhyphenKfkfkITPrEJNejxhrdWHz0utbiQ3tTZ9QGBn9m03iVVT85zB7xKCvKIC1VVKd484arR7T2FTbGCN/s1600/ifixit_battery.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJA7vu21StjkTyBH-BWLMzb1NiqlGha5z6A9FQIs5rkrK6RTUZx7rKMSLo1YJQDXhyphenhyphenKfkfkITPrEJNejxhrdWHz0utbiQ3tTZ9QGBn9m03iVVT85zB7xKCvKIC1VVKd484arR7T2FTbGCN/s1600/ifixit_battery.jpg" height="240" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">An unsalted battery, courtesy of <a href="http://www.ifixit.com/">iFixit</a>.</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<br />
Other than the rusty leads, the battery seems fine from the outside. The phone would not power on, but I am not sure whether it's related to the battery or to other electronics failures. Probably both.<br />
<br />
<h3 style="text-align: left;">
Mainboard and Chassis Closeup</h3>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0"><tbody>
<tr><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCRUkVQ16Noh_5SFmpJit0hcUomzeESOYdbm0NpZsB05_wLxJsgirF_GbFgERR8hipQRrprRp_t4_-dZSiEDlKT0CGTTpgnXOHbd3gQEGw93rfY474c1P6tbDSA4EiUOUVq0u7nI3s6yCq/s1600/mainboard_closeup.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCRUkVQ16Noh_5SFmpJit0hcUomzeESOYdbm0NpZsB05_wLxJsgirF_GbFgERR8hipQRrprRp_t4_-dZSiEDlKT0CGTTpgnXOHbd3gQEGw93rfY474c1P6tbDSA4EiUOUVq0u7nI3s6yCq/s1600/mainboard_closeup.JPG" height="320" width="239" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">There's a large salt deposit under the mainboard.<br />
The salt water must have pooled there as the phone dried. </td></tr>
</tbody></table>
</td><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjKDMEPhH4sxyuV545kC16DlDI0z2v2h8IomIJMs5RwULlv925Y4vn2iULMZd96tn0Z3ouPvpuMReeUfWxwu_4jBhKrxGBnLgidA2i3BBHsgBN0voj4oBZKGw9y3xIIf4SbmT560a0QxNN4/s1600/ifixit_mainboard_closeup.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjKDMEPhH4sxyuV545kC16DlDI0z2v2h8IomIJMs5RwULlv925Y4vn2iULMZd96tn0Z3ouPvpuMReeUfWxwu_4jBhKrxGBnLgidA2i3BBHsgBN0voj4oBZKGw9y3xIIf4SbmT560a0QxNN4/s1600/ifixit_mainboard_closeup.jpg" height="240" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A pristine mainboard, courtesy of <a href="http://www.ifixit.com/">iFixit</a>.</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<br />
Getting to this part was a challenge: some of the screws were so corroded and rusted that trying to remove them stripped the screw heads. I had to resort to force and some prying, which didn't matter since the phone was already broken.<br />
<br />
<h3 style="text-align: left;">
A Zoom on the Mainboard</h3>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0"><tbody>
<tr>
<td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDyLAHnURve43bMgFrIsJ_QxA-63m6ZzFqnVJ6drXFWrr55NlNawu7dsUNXr-4-CrnlkXzG49minENnlFmOHaQH1X_HtMYA-tbctIgxRku9rrsyGNpCSzlLQ5Y21ofleQXMrt_Occ7844X/s1600/mainboard_zoom_1.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDyLAHnURve43bMgFrIsJ_QxA-63m6ZzFqnVJ6drXFWrr55NlNawu7dsUNXr-4-CrnlkXzG49minENnlFmOHaQH1X_HtMYA-tbctIgxRku9rrsyGNpCSzlLQ5Y21ofleQXMrt_Occ7844X/s1600/mainboard_zoom_1.JPG" height="320" width="239" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A zoom on the corroded mainboard.</td></tr>
</tbody></table>
</td><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg70JBpixeIhWM1WZfFEtEu6YZr0_segNFXQRHjiLxmQcgHWG7O1ROO8TRZp_sRnFgyJ5AcYdBhGq-AuBJ_3bodPSlcNomRWMMmIja9gPOPYcdK-9joto6Fn7fF4khZlpjnGhWCymoYq5lo/s1600/mainboard_zoom_2.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg70JBpixeIhWM1WZfFEtEu6YZr0_segNFXQRHjiLxmQcgHWG7O1ROO8TRZp_sRnFgyJ5AcYdBhGq-AuBJ_3bodPSlcNomRWMMmIja9gPOPYcdK-9joto6Fn7fF4khZlpjnGhWCymoYq5lo/s1600/mainboard_zoom_2.JPG" height="320" width="239" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The front of the camera and mainboard.</td></tr>
</tbody></table>
</td><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgNP3A-2PnTY0GnFMv441wq-_TojNZrftoz7Dy3wmhQJqmw-8L6jDaFvR2v2eEtNM3y37r6LU5iOiM9QjUlJxD9EceXDN5okoW0AfcUv2cLiOKkTMeCQptBufPBjLt3wHcXbWjbWxFwOHr/s1600/ifixit_mainboard_zoom.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgNP3A-2PnTY0GnFMv441wq-_TojNZrftoz7Dy3wmhQJqmw-8L6jDaFvR2v2eEtNM3y37r6LU5iOiM9QjUlJxD9EceXDN5okoW0AfcUv2cLiOKkTMeCQptBufPBjLt3wHcXbWjbWxFwOHr/s1600/ifixit_mainboard_zoom.jpg" height="240" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A pristine mainboard, courtesy of <a href="http://www.ifixit.com/">iFixit</a>.</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<br />
Almost every connector is rusted or otherwise corroded. This is one of the main reasons everything stops working: the small connectors corrode and the mainboard components can't make electrical contact.<br />
<br />
<h3 style="text-align: left;">
Everything</h3>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0"><tbody>
<tr><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinkTEFhISnFrnoy6xbQPZsNej5GfQbsK2R7i7lAey065zxeZ5p3vNDRRNpmOoUPuFsATM38-mitgao3MM_TKuhfDK1xCrt0qruMfr1nU-GK3VAKHEZpUoVzpTCSaap3YUxBcS89XoViVIO/s1600/everything.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinkTEFhISnFrnoy6xbQPZsNej5GfQbsK2R7i7lAey065zxeZ5p3vNDRRNpmOoUPuFsATM38-mitgao3MM_TKuhfDK1xCrt0qruMfr1nU-GK3VAKHEZpUoVzpTCSaap3YUxBcS89XoViVIO/s1600/everything.JPG" height="320" width="239" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">All the rusted components in one glorious photo.</td></tr>
</tbody></table>
</td><td><table style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvZIyfiiX3qjETiMeR6BElmlRDrubYTRQUnWfizHLbc9oCVOMaZRhIX1K9-g7kTpWwU9o0WC_nWoaH8TAsFs62Br-2yhq6iA5TqKhv-UarJpMf3mDkmVtgYhgaNX0muNZBtS2KCM4aUrXf/s1600/ifixit_everything.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvZIyfiiX3qjETiMeR6BElmlRDrubYTRQUnWfizHLbc9oCVOMaZRhIX1K9-g7kTpWwU9o0WC_nWoaH8TAsFs62Br-2yhq6iA5TqKhv-UarJpMf3mDkmVtgYhgaNX0muNZBtS2KCM4aUrXf/s1600/ifixit_everything.jpg" height="240" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">All the pristine components, courtesy of <a href="http://www.ifixit.com/">iFixit</a>.</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<br />
<div>
The obligatory "all components in one photo" shot. Mine isn't as nice as iFixit's, but the other differences should be obvious. The one interesting thing to note is the serious corrosion on the SIM card.<br />
<br />
<h3 style="text-align: left;">
Conclusion</h3>
<div>
<br /></div>
If your phone falls into the ocean, it's going to have a bad time. Don't let your phone fall into the ocean.</div>
</div>
<h2>Bjarne Stroustrup on the Past and Future of C++ (Including Long Template Errors)</h2>
Published 2014-05-15.<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
Someone on IRC pointed me to <a href="http://www.stroustrup.com/">Bjarne Stroustrup</a>'s <a href="http://channel9.msdn.com/Events/GoingNative/2013/Opening-Keynote-Bjarne-Stroustrup">talk</a> on C++11 and <a href="https://en.wikipedia.org/wiki/C++14">C++14</a> at Microsoft's <a href="http://channel9.msdn.com/Events/GoingNative/2013">Going Native 2013</a>. If you work with C++ and haven't seen <a href="http://channel9.msdn.com/Events/GoingNative/2013/Opening-Keynote-Bjarne-Stroustrup">Bjarne's talk</a> yet, <a href="http://channel9.msdn.com/Events/GoingNative/2013/Opening-Keynote-Bjarne-Stroustrup">go watch it now</a>.<br />
<br />
<a href="http://channel9.msdn.com/Events/GoingNative/2013/Opening-Keynote-Bjarne-Stroustrup">Stop reading this and go. I'll wait. </a><br />
<br />
An hour into the talk, Bjarne starts discussing templates, long error messages, and the tradeoffs that were made during the design of C++. The long error messages were a conscious decision to preserve performance and expressiveness with the computing power available back in the mid-1980s.<br />
<br />
It amazed me that Bjarne admits template error messages are a huge debacle, and he has been working for <b>20 years</b> to fix the problem. The solution is near: <a href="https://isocpp.org/blog/2013/02/concepts-lite-constraining-templates-with-predicates-andrew-sutton-bjarne-s">C++14 concepts</a> will finally allow for sane template errors. Messages like "<a href="http://blog.dinaburg.org/2014/05/c11-better-but-still-frustrating.html">Member must be CopyAssignable</a>" will be possible, and hopefully normal. This isn't just theory: <a href="http://concepts.axiomatics.org/~ans/">there is an experimental branch of GCC that supports concepts right now</a>.<br />
<br />
Other parts of the talk are fascinating in their own right and have given me a lot more respect for C++ and Bjarne Stroustrup. The man could have rested after creating the <a href="http://www.stroustrup.com/C++.html">original C++ spec and compiler</a>, but he has been working for 20 years to improve the language. That dedication has made C++11 much better than C++98.<br />
<br />
Bjarne also brings up a good point: many people who dislike C++ are <a href="http://knowyourmeme.com/memes/youre-doing-it-wrong">using it the wrong way</a>. The language should only be used when you need performance and lightweight abstraction at the same time. If you don't care about <a href="http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html?iid=tech_vt_tech+64-32_manuals">performance</a> or you need <a href="https://www.python.org/">high-level abstraction</a>, C++ is the wrong tool for the job.<br />
<br />
The talk has a lot more interesting content. If you haven't <a href="http://channel9.msdn.com/Events/GoingNative/2013/Opening-Keynote-Bjarne-Stroustrup">watched it yet, go now</a>.</div>
<h2>C++11: better, but still frustrating</h2>
Published 2014-05-11; updated 2014-05-13.<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<style>
table.centered
{
border-collapse:collapse;
}
table.centered,th, td
{
border: 1px solid black;
}
table.centered
{
text-align: center;
margin-left: auto;
margin-right: auto;
}
th
{
background: #292929;
}
td
{
padding-left:3px;
padding-right:3px;
}
</style>
<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
Update: jduck pointed out that the before/after code snippets were identical. Oops. Now fixed.<br />
<br />
I'd previously given up on C++ because of its many small frustrations: <a href="http://tgceec.tumblr.com/">incomprehensible error messages</a>, silly parsing issues (e.g. <a href="https://stackoverflow.com/questions/6695261/template-within-template-why-should-be-within-a-nested-template-arg">'<span style="font-family: Courier New, Courier, monospace;">>></span>'</a>), <a href="https://en.wikipedia.org/wiki/Rule_of_three_(C%2B%2B_programming)">rules to avoid subtle errors</a>, and plenty of <a href="https://stackoverflow.com/questions/495021/why-can-templates-only-be-implemented-in-the-header-file">other annoyances</a> that soured me on the language. That was back in the days of <a href="https://en.wikipedia.org/wiki/C%2B%2B#Standardization">C++98</a> and <a href="https://en.wikipedia.org/wiki/C%2B%2B03">C++03</a>.<br />
<br />
The language has evolved, and recently I found myself working on a project written in C++11. So far my experience has been better, but still frustrating.<br />
<br />
<h3 style="text-align: left;">
A Motivating Example</h3>
<br />
I'll start with a real example. The project created a lot of Foo objects that were passed by reference to numerous functions. I needed to keep a collection of every Foo object that was passed to a specific function. <br />
<br />
My first thought was, "I know, I'll create a <span style="font-family: Courier New, Courier, monospace;">vector</span> of <span style="font-family: Courier New, Courier, monospace;">Foo&</span>". This thought is <a href="http://quotegeek.com/literature/h-l-mencken/4265/">simple, elegant, and of course, wrong</a>.<br />
<br />
A vector of references isn't possible because references can't be reassigned. That is, <span style="font-family: Courier New, Courier, monospace;">references[0] = foo;</span> would update the referenced object, not the zeroth entry of the references vector. More technically, references are not <a href="http://en.cppreference.com/w/cpp/concept/CopyAssignable">CopyAssignable</a>, a requirement for members of containers.<br />
<br />
<h3 style="text-align: left;">
Errors Galore</h3>
<br />
But how would someone new to C++ know this? What do compilers say when making a vector of references? Let's find out by compiling this small (and wrong) program.<br />
<br />
<div style="background: #202020; border: solid black; margin: auto; overflow: auto; padding: .2em .6em; text-align: left; width: auto;">
<pre style="background: #202020; color: #d1d1d1;"><span style="color: #008073;">#</span><span style="color: #008073;">include </span><span style="color: #02d045;"><</span><span style="color: #40015a;">vector</span><span style="color: #02d045;">></span>
<span style="color: #008073;">#</span><span style="color: #008073;">include </span><span style="color: #02d045;"><</span><span style="color: #40015a;">iostream</span><span style="color: #02d045;">></span>
<span style="color: #e66170; font-weight: bold;">int</span> <span style="color: #e66170; font-weight: bold;">main</span><span style="color: #d2cd86;">(</span><span style="color: #e66170; font-weight: bold;">int</span> argc<span style="color: #d2cd86;">,</span> <span style="color: #e66170; font-weight: bold;">const</span> <span style="color: #e66170; font-weight: bold;">char</span><span style="color: #d2cd86;">*</span> argv<span style="color: #d2cd86;">[</span><span style="color: #d2cd86;">]</span><span style="color: #d2cd86;">)</span>
<span style="color: #b060b0;">{</span>
<span style="color: #e66170; font-weight: bold;">int</span> a <span style="color: #d2cd86;">=</span> <span style="color: #008c00;">1</span><span style="color: #b060b0;">;</span>
<span style="color: #00dddd;">std</span><span style="color: #b060b0;">::</span><span style="color: #e66170; font-weight: bold;">vector</span><span style="color: #d2cd86;"><</span><span style="color: #e66170; font-weight: bold;">int&</span><span style="color: #d2cd86;">></span> test <span style="color: #d2cd86;">=</span> <span style="color: #b060b0;">{</span>a<span style="color: #b060b0;">}</span><span style="color: #b060b0;">;</span>
<span style="color: #00dddd;">std</span><span style="color: #b060b0;">::</span><span style="color: #e66170; font-weight: bold;">cout</span> <span style="color: #d2cd86;"><</span><span style="color: #d2cd86;"><</span> <span style="color: #02d045;">"</span><span style="color: #00c4c4;">a: </span><span style="color: #02d045;">"</span> <span style="color: #d2cd86;"><</span><span style="color: #d2cd86;"><</span> test<span style="color: #d2cd86;">[</span><span style="color: #008c00;">0</span><span style="color: #d2cd86;">]</span> <span style="color: #d2cd86;"><</span><span style="color: #d2cd86;"><</span> <span style="color: #00dddd;">std</span><span style="color: #b060b0;">::</span>endl<span style="color: #b060b0;">;</span>
<span style="color: #e66170; font-weight: bold;">return</span> <span style="color: #008c00;">0</span><span style="color: #b060b0;">;</span>
<span style="color: #b060b0;">}</span>
</pre>
</div>
<br />
Here are the results for Clang, GCC and MSVC:<br />
<br />
<table class="centered">
<tbody>
<tr>
<th>Compiler</th> <th>Error List</th> <th>Error Count</th>
</tr>
<tr>
<td><a href="http://clang.llvm.org/">Clang</a></td> <td><a href="http://rextester.com/JMFGP72087">rextester.com/JMFGP72087</a></td> <td>158 lines</td>
</tr>
<tr>
<td><a href="http://gcc.gnu.org/">GCC</a></td> <td><a href="http://rextester.com/HGKFIT84222">rextester.com/HGKFIT84222</a></td> <td>187 lines</td>
</tr>
<tr>
<td><a href="http://msdn.microsoft.com/en-US/vstudio/hh386302">MSVC</a></td> <td><a href="http://rextester.com/UXPG39365">rextester.com/UXPG39365</a></td> <td>107 lines</td>
</tr>
</tbody></table>
<br />
In classic C++ style, the errors are hundreds of lines of messages from obscure library implementation code. They give no indication of what is wrong, and no indication of the solution. I pity someone without C++ experience trying to figure out what is wrong with their code. <a href="http://25iq.com/2012/11/16/charlie-munger-on-mistakes/">Pretty much any error would be more helpful</a>, even an obscure message like "Member must be CopyAssignable" -- as long as it pointed out the correct line of code.<br />
<br />
<h3 style="text-align: left;">
The Fix</h3>
<br />
For reference, the corrected program is:<br />
<br />
<div style="background: #202020; border: solid black; margin: auto; overflow: auto; padding: .2em .6em; text-align: left; width: auto;">
<pre style="background: #202020; color: #d1d1d1;"><span style="color: #008073;">#</span><span style="color: #008073;">include </span><span style="color: #02d045;"><</span><span style="color: #40015a;">vector</span><span style="color: #02d045;">></span>
<span style="color: #008073;">#</span><span style="color: #008073;">include </span><span style="color: #02d045;"><</span><span style="color: #40015a;">iostream</span><span style="color: #02d045;">></span>
<span style="color: #008073;">#</span><span style="color: #008073;">include </span><span style="color: #02d045;"><</span><span style="color: #40015a;">functional</span><span style="color: #02d045;">></span>
<span style="color: #e66170; font-weight: bold;">int</span> <span style="color: #e66170; font-weight: bold;">main</span><span style="color: #d2cd86;">(</span><span style="color: #e66170; font-weight: bold;">int</span> argc<span style="color: #d2cd86;">,</span> <span style="color: #e66170; font-weight: bold;">const</span> <span style="color: #e66170; font-weight: bold;">char</span><span style="color: #d2cd86;">*</span> argv<span style="color: #d2cd86;">[</span><span style="color: #d2cd86;">]</span><span style="color: #d2cd86;">)</span>
<span style="color: #b060b0;">{</span>
<span style="color: #e66170; font-weight: bold;">int</span> a <span style="color: #d2cd86;">=</span> <span style="color: #008c00;">1</span><span style="color: #b060b0;">;</span>
<span style="color: #00dddd;">std</span><span style="color: #b060b0;">::</span><span style="color: #e66170; font-weight: bold;">vector</span><span style="color: #d2cd86;"><</span><span style="color: #00dddd;">std</span><span style="color: #b060b0;">::</span>reference_wrapper<span style="color: #d2cd86;"><</span><span style="color: #e66170; font-weight: bold;">int</span><span style="color: #d2cd86;">></span><span style="color: #d2cd86;">></span> test <span style="color: #d2cd86;">=</span> <span style="color: #b060b0;">{</span>a<span style="color: #b060b0;">}</span><span style="color: #b060b0;">;</span>
<span style="color: #00dddd;">std</span><span style="color: #b060b0;">::</span><span style="color: #e66170; font-weight: bold;">cout</span> <span style="color: #d2cd86;"><</span><span style="color: #d2cd86;"><</span> <span style="color: #02d045;">"</span><span style="color: #00c4c4;">a: </span><span style="color: #02d045;">"</span> <span style="color: #d2cd86;"><</span><span style="color: #d2cd86;"><</span> test<span style="color: #d2cd86;">[</span><span style="color: #008c00;">0</span><span style="color: #d2cd86;">]</span> <span style="color: #d2cd86;"><</span><span style="color: #d2cd86;"><</span> <span style="color: #00dddd;">std</span><span style="color: #b060b0;">::</span>endl<span style="color: #b060b0;">;</span>
<span style="color: #e66170; font-weight: bold;">return</span> <span style="color: #008c00;">0</span><span style="color: #b060b0;">;</span>
<span style="color: #b060b0;">}</span>
</pre>
</div>
<br />
The fix is to use the <span style="font-family: Courier New, Courier, monospace;"><a href="http://en.cppreference.com/w/cpp/utility/functional/reference_wrapper">std::reference_wrapper</a></span> class template when making a container of references.<br />
<br />
<h3 style="text-align: left;">
Conclusion</h3>
<br />
There are definitely upsides: the '<span style="font-family: Courier New, Courier, monospace;">>></span>' parse has <a href="https://stackoverflow.com/questions/15785496/c-templates-angle-brackets-pitfall-what-is-the-c11-fix">finally been fixed</a>. Classes can now be initialized with initializer lists. There is type inference via '<span style="font-family: Courier New, Courier, monospace;"><a href="http://www.cprogramming.com/c++11/c++11-auto-decltype-return-value-after-function.html">auto</a></span>'. <a href="http://en.cppreference.com/w/cpp/language/range-for">For-each style loops exist</a>.<br />
<br />
C++11 is a great improvement over C++03, but it's still frustrating: the obvious solution (like containers of references) is wrong in subtle ways, and compilers still generate hundreds of obscure error messages for a one-character typo.</div>
</div>
<h2>Do Not Reply Addresses Suck</h2>
Published 2014-05-06.<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
If you send emails from do not reply addresses, I hate you.<br />
Your customers hate you.<br />
People you've never <a href="http://www.leadformix.com/blog/2013/11/beware-of-using-do-not-reply-email-address/">met</a> <a href="http://blog.signalhq.com/2013/04/30/do-not-reply-email-address-best-practices/">also</a> <a href="http://www.campaignmonitor.com/blog/post/3550/why-no-reply-address-is-an-email-marketing-no-no/">hate</a> <a href="https://www.aweber.com/blog/email-marketing/do-not-reply-address-dont-bother.htm">you</a>.<br />
<br />
Why you should stop using no reply addresses:<br />
<div>
<div>
<ul style="text-align: left;">
<li>Something will go wrong and your customers can't tell you.</li>
<li>You will send email to people who aren't your customers. They will have no way to ask you to stop.</li>
</ul>
</div>
<div>
Stop hating your current and future customers. Stop using no reply addresses. Only send email from monitored email addresses.</div>
<div>
<br /></div>
<div>
And one more thing...</div>
<div>
<br /></div>
<div>
If you create accounts without validating email addresses, I doubly hate you... and from now on I will be naming and shaming.</div>
<div>
<br /></div>
<h3 style="text-align: left;">
The Do Not Reply Hall of Shame</h3>
<h4 style="text-align: left;">
<br /></h4>
<h4 style="text-align: left;">
Ask.fm</h4>
<div>
<br /></div>
<div>
Ask.fm lets people register without validating their email. The many follow-up emails are all from <a href="mailto:noreply@ask.fm">noreply@ask.fm</a>. Ask.fm, please stop.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6GTWLqe5UtUx3-EcNcQjVwngaIiy2ZUxFSghuwPzTE1Y46rR0BEY06yzy3ZT7bkLjZsa_YANSU3rEMmDNiEz8BEG89r-gvuQJlRT1VDbzENjy3v1nxDnVimYR-orA70yCFGoJThQRtqCM/s1600/askfm.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6GTWLqe5UtUx3-EcNcQjVwngaIiy2ZUxFSghuwPzTE1Y46rR0BEY06yzy3ZT7bkLjZsa_YANSU3rEMmDNiEz8BEG89r-gvuQJlRT1VDbzENjy3v1nxDnVimYR-orA70yCFGoJThQRtqCM/s1600/askfm.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">As you can guess, nothing was ever done. Ask.fm, I hate you.</td></tr>
</tbody></table>
<div>
<br /></div>
<h4 style="text-align: left;">
DataViz.com</h4>
<div>
<br /></div>
<div>
DataViz lets people buy software without verifying their email. Wish I could reply and tell them, but the email is from <a href="mailto:do-not-reply@dataviz.com">do-not-reply@dataviz.com</a>.</div>
</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhEgjdAERyZSxRSwZxXVVDmTB7yONmgmxPat2df9mx-VLPxcO1jLpIv8Vo4gNNazLz84Ht48J6UzsndnZ0fNfZxsp1aLRpX4m_sLJ-A0fF22yQdDY5pHoYVpSJEGPiYECUKh9GoF1yqvzU/s1600/dataviz.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhEgjdAERyZSxRSwZxXVVDmTB7yONmgmxPat2df9mx-VLPxcO1jLpIv8Vo4gNNazLz84Ht48J6UzsndnZ0fNfZxsp1aLRpX4m_sLJ-A0fF22yQdDY5pHoYVpSJEGPiYECUKh9GoF1yqvzU/s1600/dataviz.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">My name is not Artem Dubov and I did not buy this software.</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
</div>
<h2>Direct Download Link For Adobe Flash Player</h2>
Published 2014-02-08; updated 2014-11-02.<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
Are you tired of seeing this?<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhim-9tqbE1WwHfFFPCT5yiJ-OqjAJfu3MCoTrgxB8G83qpwaZ-DrNBGDdRaLyiukAmwxfOPTgLI_05xeuCfxkD48TWXocSx1YTywcLuvp-FgVRR7UQOltI5DMvmrcmSuXqrJ2iXo7_b_Kn/s1600/AdobeFail.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhim-9tqbE1WwHfFFPCT5yiJ-OqjAJfu3MCoTrgxB8G83qpwaZ-DrNBGDdRaLyiukAmwxfOPTgLI_05xeuCfxkD48TWXocSx1YTywcLuvp-FgVRR7UQOltI5DMvmrcmSuXqrJ2iXo7_b_Kn/s1600/AdobeFail.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">"Error: Unable to proceed with the installation". That's bad. But there's a green checkmark. That's good? </td></tr>
</tbody></table>
<div>
<div>
<br /></div>
<div>
Do you find Adobe's <a href="http://helpx.adobe.com/flash-player/kb/installation-problems-flash-player-rvm.html">troubleshooting page</a> completely useless?</div>
<div>
</div>
<div>
Then use this handy direct link and bypass Adobe's broken installer. As a bonus, it won't try to trick you into installing Lightroom or other unwanted products.</div>
<div>
<br /></div>
<div>
<s><a href="http://fpdownload.macromedia.com/get/flashplayer/pdc/12.0.0.44/install_flash_player_osx.dmg">http://fpdownload.macromedia.com/get/flashplayer/pdc/12.0.0.44/install_flash_player_osx.dmg</a></s></div>
<br />
Update as of November 2nd, 2014:
<br />
<br />
Jerry Leichter very helpfully pointed out that Adobe maintains an official redistributable version of Flash player that doesn't use the awful installer. You can get it here:
<br />
<br />
<a href="https://www.adobe.com/products/flashplayer/distribution3.html">https://www.adobe.com/products/flashplayer/distribution3.html</a>
</div>
</div>
<h2>Stupid IDN Tricks: Unicode Combining Characters (or http://░͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇.ws)</h2>
Published 2014-01-05; updated 2014-11-02.<br />
<h3>Nov 3, 2014: The domains mentioned in this article are expiring and I'm not renewing them. All links have been redirected to the archive.org mirror of the original site.</h3>
<br/>
<br/>
<div dir="ltr" style="text-align: left;" trbidi="on">
Safari will display Unicode <a href="http://www.ssec.wisc.edu/~tomw/java/unicode.html#x0300">combining diacritical marks</a> in the URL bar (try going to <a href="http://web.archive.org/web/20141103011158/http://xn--luaaaaaaaaaaaaaaaaaaaaaaaaaaaa8465w.ws">http://░͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇.ws</a>). It is possible to register domains with these marks. Some of these domains will look much like legitimate domains (e.g. apple.com vs. apple͢.com). This is probably not good.<br />
<br />
<h3 style="text-align: left;">
Internationalized Domain Names (IDN)</h3>
<br />
DNS hostnames were designed with only 7-bit ASCII in mind. However, not everyone in the world speaks English, and they really want to type domains in their own language. So there is a terrible hack to map Unicode characters to 7-bit ASCII, called <a href="https://en.wikipedia.org/wiki/Internationalized_domain_name">IDNA</a>: each non-ASCII label is encoded as an ASCII string that starts with <span style="font-family: Courier New, Courier, monospace;">xn--</span>.<br />
<br />
<h3 style="text-align: left;">
Homograph Attacks</h3>
<br />
Hopefully everyone has heard of homograph attacks using internationalized domain names. If not, here is a recap (taken from the <a href="http://www.chromium.org/developers/design-documents/idn-in-google-chrome">Chrome wiki</a>):<br />
<blockquote class="tr_bq">
... different characters from different languages can look very similar, and this can make <a href="http://en.wikipedia.org/wiki/Phishing">phishing</a> attacks possible. For example, the Latin "a" looks a lot like the Cyrillic "а", so someone could register <a href="http://xn--eby-7cd.com/">http://ebаy.com</a> (<a href="http://xn--eby-7cd.com/">http://xn--eby-7cd.com/</a>), which would easily be mistaken for <a href="http://ebay.com/">http://ebay.com</a>. This is called a <a href="http://en.wikipedia.org/wiki/Internationalized_domain_name#ASCII_spoofing_concerns">homograph attack</a>.</blockquote>
<br />
<h3 style="text-align: left;">
Defenses Against Homograph Attacks</h3>
<br />
There are multilayered solutions to the homograph attack:<br />
<div style="text-align: left;">
</div>
<ul style="text-align: left;">
<li><a href="http://kb.mozillazine.org/Network.IDN.blacklist_chars">Browser character blacklists</a>. These stop the browser from displaying characters that look like '/', and so on.</li>
<li>IDN character display rules (see: <a href="https://wiki.mozilla.org/IDN_Display_Algorithm">Firefox</a>, <a href="http://www.chromium.org/developers/design-documents/idn-in-google-chrome">Chrome</a>). These rules restrict non-ASCII domain names to only those languages specifically configured by the user, and prevent display of mixed-language domains. For instance, if you have a Chinese installation of Windows then Chinese characters will be displayed for Chinese IDNs.</li>
<li><a href="https://www.verisigninc.com/en_US/products-and-services/domain-name-services/value-added-products/idn-domain-names/idn-code-points/registration-rules/index.xhtml">Registrar restrictions</a>. Registrars will prevent you from registering a domain that combines more than one language. So you can't register a name that is half English and half Russian, for instance.</li>
</ul>
<br />
<h3 style="text-align: left;">
Another Attempt</h3>
<br />
So how do we explain <a href="http://web.archive.org/web/20141103011158/http://xn--luaaaaaaaaaaaaaaaaaaaaaaaaaaaa8465w.ws">http://░͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇͇.ws</a>?<br />
<br />
<h3 style="text-align: left;">
Defeating Registrar Restrictions</h3>
<br />
Registrars prohibit combining languages in domain names. But there are characters that aren't in any language. The most interesting of these are <a href="http://www.ssec.wisc.edu/~tomw/java/unicode.html#x0300">Unicode Combining Diacritical Marks</a>. These Unicode code points modify the glyph right before them instead of adding a new character. For example, the letter A when combined with U+0332 becomes: A̲.<br />
<br />
But will these characters display in browsers?<br />
<br />
Chrome: <span style="color: red;">No :(</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMzY_gxLzztR_xOGuZkhyqIPfhyphenhyphentNjh1JibqX2LRuVfb5jC1swP_SKC89id4Lx_KjsJE0AcUMbgt18mSwkdAvanAJ6aTwO1Sve6woWEjm8Ny6utzDli9T6DCUo0CjXr2HrdvbeKqSNHA7_/s1600/chrome_idn.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="25" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMzY_gxLzztR_xOGuZkhyqIPfhyphenhyphentNjh1JibqX2LRuVfb5jC1swP_SKC89id4Lx_KjsJE0AcUMbgt18mSwkdAvanAJ6aTwO1Sve6woWEjm8Ny6utzDli9T6DCUo0CjXr2HrdvbeKqSNHA7_/s320/chrome_idn.png" width="320" /></a></div>
Firefox: <span style="color: red;">No :(</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh61BXihZfqpGuTLYEzLZi9jPdsF4LhpCLzAy_PcKqm3rlJisfyLwkMFR_r3uVCUloUmfOkMkIzAXKl-WfPSgVoUuHeGXC4Wb14-aYMaCYJ0O25jCRid3lR_ZND21q4QoO_s9SUgwLkkkCK/s1600/firefox_idn.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh61BXihZfqpGuTLYEzLZi9jPdsF4LhpCLzAy_PcKqm3rlJisfyLwkMFR_r3uVCUloUmfOkMkIzAXKl-WfPSgVoUuHeGXC4Wb14-aYMaCYJ0O25jCRid3lR_ZND21q4QoO_s9SUgwLkkkCK/s1600/firefox_idn.png" /></a></div>
Safari: <span style="color: lime;">Yes :)</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi53niZY9uEpO9nCHv0366spbHOtJmGOu3KXvBYjPYDebh3wVTZoaKjzrLdp23BULuiO-d2scggT3mBH8z0yCWbF62CxXB3941Y2_A3oRT0RgIMNtAJx47OJ4O4_PNzJaEM05pYORZGnKyA/s1600/safari_idn.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi53niZY9uEpO9nCHv0366spbHOtJmGOu3KXvBYjPYDebh3wVTZoaKjzrLdp23BULuiO-d2scggT3mBH8z0yCWbF62CxXB3941Y2_A3oRT0RgIMNtAJx47OJ4O4_PNzJaEM05pYORZGnKyA/s1600/safari_idn.png" /></a></div>
<h3 style="text-align: left;">
Impact</h3>
<br />
Someone could register apple͢.com and it would display in Safari as:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4s9e0v9tM0LL0BrnCdCJeOgAoMxgGKWr7BRWjmzKgY2q-oRjYMeW3WJZIagWAMCXE2o2MrWUfUdrEQAvGYHuCEly6jsvSOD3Af4omyStysJs2fjjGE6SOk7-uRG3Wl26av2Rha6ZGY_ku/s1600/apple_com.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4s9e0v9tM0LL0BrnCdCJeOgAoMxgGKWr7BRWjmzKgY2q-oRjYMeW3WJZIagWAMCXE2o2MrWUfUdrEQAvGYHuCEly6jsvSOD3Af4omyStysJs2fjjGE6SOk7-uRG3Wl26av2Rha6ZGY_ku/s1600/apple_com.png" /></a></div>
<br />
This is not good.</div>
<h2>I Hate (General Purpose) Computers</h2>
Published 2013-12-06.<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I hate computers. More specifically, general purpose computers. They cause me many hours of frustration, mostly due to malware.<br />
<br />
Most people don't need or want the freedom to run the malware of their choice. They need a nice computing appliance with a well-designed GUI that "just works". General computing is important, it just shouldn't be the default option.<br />
<br />
I propose appliance-default computers with a big red FTC mandated 'general computing' switch. It would save millions of hours in security and support costs, while protecting consumer freedom.<br />
<br />
<h3 style="text-align: left;">
Anger and Frustration</h3>
<br />
It all started over Thanksgiving. Once again, it was time to answer family computer questions.<br />
<br />
My father asked, "How can I be absolutely sure I don't get infected with <a href="https://en.wikipedia.org/wiki/CryptoLocker">CryptoLocker</a>?". He was very concerned. It was on the news, and there was a warning email at work.<br />
<br />
Unfortunately, there was nothing more I could tell him. He already does everything right, and could <b>still</b> be infected with CryptoLocker. There's nothing I can do: he has a computer and it can run malware. Sure, there are precautions, but they are mostly useless.<br />
<br />
<h3>
Malware Precautions: Largely Useless</h3>
<br />
These (largely useless) precautions to avoid "being a victim" just happened to be on the news as I was drafting this blog post. The news report was about the recent <a href="http://www.infosecurity-magazine.com/view/35995/pony-botnet-lifts-two-million-mostly-weak-passwords">social media password leak</a>.<br />
<br />
The precautions:<br />
<ul style="text-align: left;">
<li><b>Install Anti-Virus Software.</b> <a href="http://searchsecurity.techtarget.com/feature/Antivirus-evasion-techniques-show-ease-in-avoiding-antivirus-detection">AV simply doesn't work</a>. Any criminal worth their salt will ensure their malware works with all common AV products. Also, they're really <a href="https://www.youtube.com/watch?v=r7io8B0R9UU">easy to avoid</a>.</li>
<li><b>Avoid Unfamiliar Websites.</b> That's not going to help you against the <a href="http://barracudalabs.com/2012/03/maliciousness-in-top-ranked-alexa-domains/">next site compromise</a> and <a href="http://java-0day.com/">Java 0-day</a>.</li>
<li><b>Avoid unfamiliar email attachments</b>. People <b>do</b> get legitimate emails from strangers. Also, how many people know that <a href="http://blog.spiderlabs.com/2013/10/hacking-a-reporter-writing-malware-for-fun-and-profit-part-1-of-3.html">picture files don't end in '.jar'</a>? And let's not forget about phishing.</li>
<li><b>Install a firewall</b>. <a href="https://en.wikipedia.org/wiki/Blaster_(computer_worm)">What is this, 2003</a>? Nothing is directly connecting to his NATed machine.</li>
</ul>
<br />
These precautions try to mask the core issue: malicious code can run on a computer, and there is nothing you can do about it except live in fear of every website and email attachment.<br />
<br />
Even when following every single precaution, you could <b>still</b> be infected with malware.<br />
<br />
<h3 style="text-align: left;">
Computers vs. Computing Appliances</h3>
<br />
The problem is that my father has a computer. A computer is a platform that permits arbitrary code execution. This encompasses pretty much all desktops and laptops.<br />
<br />
What he needs is a computing appliance with a large monitor and a keyboard. A computing appliance is a platform that only permits execution of pre-approved code, like iOS or Windows on ARM.<br />
<br />
In fact, the vast majority of people only need a computing appliance. They will <b>never, ever</b> develop software. They have no interest in running arbitrary, unapproved applications. The only unapproved code they will ever run is <a href="https://en.wikipedia.org/wiki/Zeus_(Trojan_horse)">ZeuS</a> or <a href="http://www.forbes.com/sites/parmyolson/2013/11/27/cryptolocker-thieves-likely-making-millions-as-bitcoin-breaks-1000/">CryptoLocker</a>.<br />
<br />
<h3 style="text-align: left;">
A Computing Compromise</h3>
<br />
Every time OS vendors try to move in the direction of computing appliances, a vocal minority <a href="http://www.zdnet.com/blog/hardware/richard-m-stallman-on-steve-jobs-im-not-glad-hes-dead-but-im-glad-hes-gone/15275">screams bloody murder</a>. Just look at <a href="http://techrights.org/2013/01/31/bricked-by-uefi/">what happened</a> when Microsoft introduced Secure Boot with Windows 8.<br />
<br />
To some extent, these people have a point.<br />
<br />
Computing appliances have many faults:<br />
<ul style="text-align: left;">
<li><a href="https://developer.apple.com/appstore/guidelines.html">Arbitrary application guidelines and limits</a></li>
<li><a href="http://blog.forumwarz.com/2011/02/01/irejected-how-apple-took-nine-weeks-to-arbitrarily-reject-our-app/">Opaque and draconian review processes</a></li>
<li>Single vendor control of all applications</li>
<li>Loss of "<a href="https://freedom-to-tinker.com/">freedom to tinker</a>" and exploration</li>
</ul>
<br />
Of these, the last is the most important and can't easily be solved by competition between vendors.<br />
<br />
It <b>is</b> important to let those who want to modify their computer and their software do as they see fit. It just shouldn't be the default option. <br />
<br />
The best execution of this I've heard of is the <a href="http://www.chromium.org/chromium-os/developer-information-for-chrome-os-devices/samsung-series-5-chromebook#TOC-Developer-switch">Developer Mode switch</a> on Google's Chromebooks. You have to physically flip a switch that allows unrestricted code execution. Additionally, flipping the switch wipes all local data.<br />
<br />
It's a beautiful solution: there is no accidental enabling, and it prevents '<a href="http://searchsecurity.techtarget.com/definition/evil-maid-attack">evil maid</a>' attacks.<br />
<br />
There is, of course, little profit in having a general computing mode in appliances. Most customers wouldn't use it, and it would cost time and effort to maintain. The only purpose would be to protect consumer freedoms.<br />
<br />
Which is why computing appliances are a perfect target for government regulation. The <a href="http://www.ftc.gov/">FTC</a> can require all computing appliances to ship with a 'general computing' switch to protect consumers from malware and controlling vendors. The millions of hours in saved frustration and tech support would be well worth it.</div>
Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-8270433837910980471.post-7175580965176364762013-11-09T13:49:00.001-06:002013-11-09T13:49:37.089-06:00A Bit Flip That Killed?<div dir="ltr" style="text-align: left;" trbidi="on">
During my bitsquatting research I was amazed by how much critical memory in a typical PC lacks error correction.<br />
<br />
It turns out that ECC is missing from an even more critical device: cars.<br />
<br />
Details from the recent Toyota civil settlement show that the drive-by-wire control of Toyota cars lacked error detecting and correcting RAM. <br />
<br />
From <a href="http://www.edn.com/design/automotive/4423428/Toyota-s-killer-firmware--Bad-design-and-its-consequences">EDN.com</a>:<br />
<blockquote class="tr_bq">
Although the investigation focused almost entirely on software, there is at least one HW factor: Toyota claimed the 2005 Camry's main CPU had error detecting and correcting (EDAC) RAM. It didn't. EDAC, or at least parity RAM, is relatively easy and low-cost insurance for safety-critical systems.</blockquote>
<div>
<br /></div>
<div>
I can't fathom why that would ever be the case. The amount of RAM required is relatively small, and the extra cost is inconsequential to the total cost of a car. Oh, and the software <b>runs next to a car engine</b>.</div>
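<div>
To make the "low-cost insurance" point concrete, here is a minimal JavaScript sketch of a single parity bit guarding one byte. This is my own illustration, not Toyota's firmware or a real EDAC scheme; the names <code>parityOf</code>, <code>storeWithParity</code>, and <code>isIntact</code> are made up for the example. Parity can only detect an odd number of flipped bits; real EDAC RAM can also correct single-bit errors.</div>

```javascript
// Illustrative only: even parity over one byte.
// Returns 1 if the byte has an odd number of set bits, else 0.
function parityOf(byte) {
  let parity = 0;
  for (let v = byte & 0xff; v !== 0; v >>= 1) {
    parity ^= v & 1;
  }
  return parity;
}

// Store a byte together with the parity computed at write time.
function storeWithParity(byte) {
  return { value: byte & 0xff, parity: parityOf(byte) };
}

// On read, recompute parity and compare it with the stored bit.
function isIntact(cell) {
  return parityOf(cell.value) === cell.parity;
}

const cell = storeWithParity(0x5a);
console.log(isIntact(cell)); // true

cell.value ^= 0x08;          // simulate a single bit flip in RAM
console.log(isIntact(cell)); // false
```

<div>
One extra bit per byte is enough to notice the flip, which is the kind of cheap check the quoted article has in mind.</div>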
<div>
<br /></div>
<div>
As little as a <a href="http://www.eetimes.com/document.asp?doc_id=1319903">single bit flip could, and possibly did, have fatal consequences</a>:</div>
<div>
<blockquote class="tr_bq">
"We've demonstrated how as little as a single bit flip can cause the driver to lose control of the engine speed in real cars due to software malfunction that is not reliably detected by any fail-safe," Michael Barr, CTO and co-founder of Barr Group, told us in an exclusive interview. Barr served as an expert witness in this case.</blockquote>
</div>
<div>
<br /></div>
<div>
Drive-by-wire systems aren't the only critical control systems susceptible to bit-errors. There is some speculation that a <a href="http://www.flightglobal.com/news/articles/qantas-a330-upset-inquiry-considers-cosmic-particle-strike-335187/">bit-error caused a sudden altitude drop in a Qantas A330.</a> Amazingly, airplane software systems <a href="http://www.atsb.gov.au/media/3532398/ao2008070.pdf">did not have to consider single or multiple bit errors until 2010 (see page 222)</a> to achieve certification.</div>
</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-44251921650666785702013-10-21T21:12:00.001-05:002013-10-21T21:12:38.745-05:00Git and Bit Errors<div dir="ltr" style="text-align: left;" trbidi="on">
Finally, a topic to unite my two most popular blog posts: <a href="http://blog.dinaburg.org/2013/07/git-fails-on-large-files.html">git failures</a> and <a href="http://dinaburg.org/bitsquatting">bitsquatting</a>.<br />
<br />
A friend recently pointed me to an <a href="http://article.gmane.org/gmane.comp.version-control.git/236238">amazingly detailed investigation of a corrupted git repository</a>. The cause of the corruption? A single bit flip. To quote the source:<br />
<br />
<blockquote class="tr_bq">
As for the corruption itself, I was lucky that it was indeed a single<br />byte. In fact, it turned out to be a single bit. The byte 0xc7 was<br />corrupted to 0xc5. So presumably it was caused by faulty hardware, or a<br />cosmic ray.<br />And the aborted attempt to look at the inflated output to see what was<br />wrong? I could have looked forever and never found it. Here's the diff<br />between what the corrupted data inflates to, versus the real data:<br /> - cp = strtok (arg, "+");<br /> + cp = strtok (arg, ".");</blockquote>
<div>
<br /></div>
<br />
It is quite amazing to see evidence of a bit error resulting in a perfectly innocuous, syntactically valid and yet completely erroneous change in a real program and a real codebase.<br />
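<br />
As a quick sanity check on the quoted numbers, here is a small JavaScript sketch (my own illustration; <code>flippedBits</code> is a made-up helper) that lists which bit positions differ between two bytes. It confirms that 0xc7 and 0xc5 differ in exactly one bit. The visible change from "+" to "." differs in two bits, which fits the quote's description that the flip happened in the compressed data and the diff shows what that data inflates to.<br />

```javascript
// Return the bit positions (0 = least significant) where two bytes differ.
function flippedBits(a, b) {
  const positions = [];
  let diff = (a ^ b) & 0xff;
  for (let bit = 0; diff !== 0; bit++, diff >>= 1) {
    if (diff & 1) positions.push(bit);
  }
  return positions;
}

// The corrupted byte from the quoted investigation: only bit 1 differs.
console.log(flippedBits(0xc7, 0xc5));

// The resulting source change, '+' (0x2b) vs '.' (0x2e): bits 0 and 2 differ.
console.log(flippedBits('+'.charCodeAt(0), '.'.charCodeAt(0)));
```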
<br />
How many times does this happen without anyone noticing?<br />
<br />
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-21412323573146369582013-09-13T15:32:00.002-05:002013-09-13T15:32:38.269-05:00Bitsquatting at DEFCON21 and More<div dir="ltr" style="text-align: left;" trbidi="on">
I was very excited to see that several researchers are investigating bitsquatting and writing about it. There were two presentations about bitsquatting at DEFCON 21, a presentation at ICANN 47, and a research paper presented at WWW2013. <br />
<b><br /></b>
<b>Jaeson Schultz - <a href="https://www.defcon.org/html/defcon-21/dc-21-speakers.html#Schultz">DEFCON 21</a> - <a href="http://blogs.cisco.com/wp-content/uploads/Schultz-Examining_the_Bitsquatting_Attack_Surface-whitepaper.pdf">Examining the Bitsquatting Attack Surface</a></b> <br />
Jaeson presented some excellent ways to exploit bitsquatting that I did not think of -- such as using bitsquats in URL delimiters to target otherwise unexploitable domains. As an example taken from the paper, <span style="font-family: "Courier New",Courier,monospace;">ecampus.phoenix.edu</span> can become <span style="font-family: "Courier New",Courier,monospace;">ecampus.ph/enix.edu/<span style="font-family: inherit;">.</span></span><br />
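<br />
The delimiter trick works because the letter "o" (0x6f) and the slash (0x2f) differ by a single bit. The following JavaScript sketch is my own illustration, not code from the paper; <code>flipBit</code> and <code>delimiterFlips</code> are made-up names. It enumerates every single-bit flip of a hostname that produces a "/".<br />

```javascript
// Flip one bit of the character at `index` and return the new string.
function flipBit(text, index, bit) {
  const flipped = String.fromCharCode(text.charCodeAt(index) ^ (1 << bit));
  return text.slice(0, index) + flipped + text.slice(index + 1);
}

// Collect all single-bit corruptions of `text` that introduce a '/'.
function delimiterFlips(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    for (let bit = 0; bit < 8; bit++) {
      const candidate = flipBit(text, i, bit);
      if (candidate[i] === '/') hits.push(candidate);
    }
  }
  return hits;
}

const hits = delimiterFlips('ecampus.phoenix.edu');
console.log(hits.includes('ecampus.ph/enix.edu')); // true
console.log(hits);
```

<br />
Dots are also one bit away from a slash (0x2e vs 0x2f), so the list contains more than the one example from the paper.<br />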
<br />
<span style="font-family: inherit;">Additionally Jaeson presents a great mitigation that can be implemented at the local level -- Response Policy Zones. From the paper:</span><br />
<blockquote class="tr_bq">
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: inherit;">An RPZ is a local zone file which allows the DNS resolver to respond to specific DNS requests by saying that the domain name does not exist (NXDOMAIN), or redirecting the user<br />to a walled garden, or other possibilities. To mitigate the effects of single bit errors for users of a DNS resolver the resolver administrator can create a Response Policy Zone that protects against bitsquats of frequently resolved, or internal-only domain names. </span> </span> </blockquote>
<br />
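As a rough sketch of how such a zone could be populated, the JavaScript below generates single-bit-flip variants of a domain and prints one policy record per variant. It is my own illustration, not code from the paper: <code>bitsquatNames</code> and <code>rpzRecords</code> are made-up helpers, <code>example.com</code> is a placeholder, and the output assumes the common RPZ convention that a <code>CNAME .</code> record means "answer NXDOMAIN". Check your resolver's documentation before using anything like this.<br />

```javascript
// Single-bit-flip variants of `domain` that are still plausible hostnames.
function bitsquatNames(domain) {
  const names = new Set();
  for (let i = 0; i < domain.length; i++) {
    if (domain[i] === '.') continue; // keep the label structure intact
    for (let bit = 0; bit < 8; bit++) {
      const code = domain.charCodeAt(i) ^ (1 << bit);
      const ch = String.fromCharCode(code).toLowerCase(); // DNS ignores case
      if (!/^[a-z0-9-]$/.test(ch) || ch === domain[i]) continue;
      const name = domain.slice(0, i) + ch + domain.slice(i + 1);
      // Labels may not start or end with a hyphen.
      const valid = name.split('.').every(
        (label) => !label.startsWith('-') && !label.endsWith('-'));
      if (valid) names.add(name);
    }
  }
  return [...names];
}

// One record for the name itself and one wildcard for its subdomains.
function rpzRecords(domain) {
  const records = [];
  for (const name of bitsquatNames(domain)) {
    records.push(`${name} CNAME .`);
    records.push(`*.${name} CNAME .`);
  }
  return records;
}

console.log(rpzRecords('example.com').join('\n'));
```
<br />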
<b>Robert Stucke - <a href="https://www.defcon.org/html/defcon-21/dc-21-speakers.html#Stucke">DEFCON 21</a> - <a href="http://rot26.net/stucke.pdf">DNS Has Been Found To Be Hazardous To Your Health</a></b><br />
Robert demonstrated some new vectors for bitsquatting, such as web applications and hosted email providers. Specifically, he bitsquatted <span style="font-family: "Courier New",Courier,monospace;">gstatic.com</span> (a site that serves static content for Google properties). Not only was he able to return arbitrary content to people using Google's search services, he could also affect web applications, such as feed readers, that rely on correct resolution of <span style="font-family: "Courier New",Courier,monospace;">gstatic.com</span>. Robert also bitsquatted <span style="font-family: "Courier New",Courier,monospace;">psmtp.com,</span> a hosted email provider. This allowed him to potentially receive other people's email.<br />
<br />
<b>Nigel Roberts - <a href="http://durban47.icann.org/node/39629">ICANN 47</a> - <a href="http://ccnso.icann.org/node/40215">Bitsquatting</a></b><br />
Nigel (who runs .gg) presented about bitsquatting to ICANN. Hopefully this will result in more research at the ccTLD level. <br />
<br />
<b>Nick Nikiforakis, et al. - <a href="http://www2013.org/2013/02/18/www2013-accepted-papers/">WWW2013</a> - <a href="https://lirias.kuleuven.be/bitstream/123456789/395759/1/bitsquatting.pdf">Bitsquatting: Exploiting Bit-flips for Fun, or Profit?</a></b><br />
Nick and his coauthors did a measurement study about the prevalence of bitsquatting and what content appears on bitsquatted domains. They identified several that are used for advertising, affiliate programs, and malware distribution. There is also a great graph in the paper where you can see a huge spike in bitsquat domain registration after my Blackhat presentation :).</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-32851549106677169892013-08-01T02:08:00.003-05:002013-08-01T02:08:43.230-05:00Introducing Binfuzz.js<div dir="ltr" style="text-align: left;" trbidi="on">
Tomorrow morning I will be giving a demonstration of <a href="http://dinaburg.org/binfuzz">Binfuzz.js</a> at <a href="http://www.blackhat.com/us-13/arsenal.html#Dinaburg">Blackhat Arsenal 2013</a>. Please stop by the Arsenal area from 10:00 - 12:30. The <a href="https://media.blackhat.com/us-13/Arsenal/us-13-Dinaburg-Binfuzz.js-Slides.pdf">slides are already available</a> on the Blackhat website.<br />
<br />
The <a href="http://dinaburg.org/binfuzz">Binfuzz.js page on dinaburg.org</a> is now live, and all the code is uploaded to <a href="https://github.com/artemdinaburg/binfuzz">Github</a>.<br />
<br />
<h3 style="text-align: left;">
What is Binfuzz.js?</h3>
<br />
Binfuzz.js is a library for fuzzing structured binary data in JavaScript. Structured binary data is data that can be easily represented by one or more C structures. Binfuzz.js uses the definition of a structure to create instances of the structure with invalid or edge-case values. Supported structure features include nested structures, counted arrays, file offset fields, and length fields. The <a href="http://dinaburg.org/binfuzz">live example </a>uses Binfuzz.js to generate Windows ICO files (<a href="http://msdn.microsoft.com/en-us/library/ms997538.aspx">a surprisingly complex format</a>) to stress your browser's icon parsing and display code.<br />
<br />
<h3 style="text-align: left;">
Features</h3>
<br />
Binfuzz.js includes support for:<br />
<br />
<ul style="text-align: left;">
<li>Several predefined elementary types: Int8, Int16, Int32 and Blob.</li>
<li>Nested structures</li>
<li>Arrays</li>
<li>Counter Fields (e.g. field A = number of elements in Array B)</li>
<li>Length Fields (e.g. field A = length of Blob B)</li>
<li>File Offsets (e.g. field A = how far from the start of the file is Blob B?)</li>
<li>Custom population functions (e.g. field A = fieldB.length + fieldC.length)</li>
</ul>
<br />
The ICO fuzzing example includes uses of all of these because I needed them to implement ICO file generation.<br />
<br />
<h4 style="text-align: left;">
Combinatorics</h4>
<br />
Binfuzz.js calculates the total number of combinations based on how many possible combinations there are for each individual field. It is then possible to generate a structured data instance corresponding to a specific combination number. It is not necessary to generate prior combinations. This way random combinations can be selected when fuzzing run time is limited.<br />
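<br />
The idea can be sketched as decoding the combination number like a mixed-radix counter, where each field is one digit. The code below is a simplified illustration under my own assumptions, not the actual Binfuzz.js API: the <code>fields</code> list, <code>totalCombinations</code>, and <code>combination</code> are hypothetical names.<br />

```javascript
// Hypothetical field list: each field has a fixed set of candidate values.
const fields = [
  { name: 'width',  values: [0, 1, 255] },
  { name: 'height', values: [0, 255] },
  { name: 'planes', values: [0, 1, 0xffff] },
];

// Total = product of the number of candidates per field.
function totalCombinations(fieldList) {
  return fieldList.reduce((n, f) => n * f.values.length, 1);
}

// Build combination number `index` directly, without visiting earlier ones.
function combination(fieldList, index) {
  if (index < 0 || index >= totalCombinations(fieldList)) {
    throw new RangeError('combination number out of range');
  }
  const instance = {};
  let rest = index;
  for (const f of fieldList) {
    instance[f.name] = f.values[rest % f.values.length];
    rest = Math.floor(rest / f.values.length);
  }
  return instance;
}

console.log(totalCombinations(fields)); // 18
console.log(combination(fields, 7));    // { width: 1, height: 0, planes: 1 }

// Pick a random combination when there is no time to try them all.
const pick = Math.floor(Math.random() * totalCombinations(fields));
console.log(combination(fields, pick));
```

<br />
With many fields the total can exceed what a JavaScript number represents exactly, so a real implementation would need to account for that, for example with <code>BigInt</code>.<br />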
<br />
<h3 style="text-align: left;">
Why?</h3>
<br />
The best way to learn is by doing, and I wanted to learn JavaScript. So I decided to create an ICO file fuzzer in JavaScript. I chose ICO files because of <a href="http://en.wikipedia.org/wiki/Favicon">favicon.ico</a>, a file browsers automatically request when navigating to a new page. After starting the project, I realized I got a lot more than I bargained for. Icons are a surprisingly <a href="http://blogs.msdn.com/b/oldnewthing/archive/2010/10/18/10077133.aspx">complex format that has evolved over time</a>. There are several images in one file, each image has corresponding metadata, there are internal fields that refer to offsets in the file, and the size of the icon data for each image depends on the metadata. All of these interlinked relationships need to be described and processed by Binfuzz.js.</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-70942880858547710622013-07-29T20:34:00.000-05:002013-07-29T20:34:49.172-05:00A Travel Story<div dir="ltr" style="text-align: left;" trbidi="on">
I travel frequently, and not just to the usual tourist destinations. I've gone to places like Singapore, Japan, Ukraine, and India. This is a story of trying desperately to get home from a recent trip.<br />
<br />
<h3 style="text-align: left;">
Be Aware of Your Surroundings</h3>
<br />
The first sign of trouble was when the roof started leaking. The storm outside had only been raging for half an hour before the two droplets landed on my head. The buckets and signs on the floor captured most of the leaks, but there were numerous unmarked drips. You don't want this kind of water in your hair or food. Always be aware of your surroundings.<br />
<br />
I arrived at my gate to find a delayed flight. At least there was now time to eat, since all the good food options were in the international terminal. Casually I trekked over, scarfed down some ethnic food and began meandering back to the gate.<br />
<br />
<h3 style="text-align: left;">
An Ominous Sign</h3>
<br />
The blaring alarm noises and the flashing emergency lights of the fire alarm told me something was wrong. Neither the airport staff, nor security, nor the airline staff knew what was happening. Just that there wasn't a fire. Maybe. There was no smoke, no firemen, and everyone was calm. Carefully and slowly, I continued towards the gate. Then the power went out.<br />
<br />
It was still daytime, and the sunlight and fire alarm (probably on backup power) provided enough light to get by. The gate was in complete chaos. None of the computers worked, and the staff tried their best to assuage angry passengers. Some were just grumpy, others in tears, but all wanted some answers. There were no answers, no air conditioning, and it was getting hot.<br />
<br />
The darkened terminal was like an impoverished refugee camp. Uniformed staff handing out bottles of water to angry men, crying women and screaming children. Mobs of people begged staff for answers. It was dark, hot, loud, and no one knew what was going on. This went on for two hours.<br />
<br />
Then the plane arrived, but we couldn't board. The jetbridges were electric and wouldn't extend. Other parts of the airport had power, but the airline couldn't or wouldn't use the working jetbridges. The flight was cancelled, but we weren't rebooked to a new flight because there was no power. We had to call the central reservation office. This was never announced, of course. I happened to overhear another passenger talk with the gate agent.<br />
<br />
So I called. After 30 minutes, a man with an accent answered. He sounded legitimately concerned, but all the flights for the day were sold out. I asked if he could rebook me on a competing airline; he typed something on a keyboard and then told me that all flights on all airlines to my destination were sold out. No inventory; the earliest possibility was the next night. Begrudgingly, I agreed. I asked about hotel vouchers. "Of course, the gate agents can print them for you", he replied. The power was still out.<br />
<br />
<h3 style="text-align: left;">
Perseverance</h3>
<br />
I wasn't about to spend another night in this place. There was still one flight on a competing airline, and it was leaving soon. Sure I was told there were no seats, but one can't blindly trust a company to do something that lowers profit. And besides, maybe someone wouldn't show up. The competing airline was naturally in the furthest possible terminal from where I was. It was a long walk, but they probably had power.<br />
<br />
Turns out there were seats on the flight. There would be a cost, but I would get home, today. Life was also better in this part of the airport. There were no leaks in the roof and the air conditioning was on. Nobody was crying. I sat down near the new gate.<br />
<br />
There was a current flight there, delayed indefinitely. Must be weather, I figured. That assumption was shattered when I heard two airline employees talking next to the gate entrance. Turns out the plane needed fuel. They sent for a fuel truck, but it arrived without fuel. For the past half hour they were trying to find either fuel, a new fuel truck, or whoever got them into this boondoggle. No one was answering. After 15 more minutes, they found a new truck. Two hours later my plane arrived.<br />
<br />
<h3 style="text-align: left;">
Don't Count Your Chickens...</h3>
<br />
As we boarded, I realized there were plenty of seats. The flight was only half full. So much for "no seats available." As we taxied to the runway, I was thankful to finally be out of this wretched place.<br />
<br />
My enthusiasm was premature. Literally as we were next in line to depart, our plane was directed to stop. After an hour on the tarmac, the full story unfolded. The original flight path had unexpected weather. We had an approved alternate flight path, but in the time it took us to taxi (about 30 min in the rain at this airport), the alternate flight path also had unexpected weather. It was 30 more minutes before we moved again.<br />
<br />
<h3 style="text-align: left;">
Departure At Last</h3>
<br />
This time, it was for real. Our wheels lifted off the ground and we ascended into the stormy sky. The plane shook violently as we passed through the rain, wind, and lightning. I clutched my seat and thought about this abhorrent place: the fire alarm, the crying passengers, the hot, dark, sweaty terminal, helpless staff, the leaky roof and fuel-less fuel trucks. I just wanted to go home and to never again fly into Philadelphia International Airport.</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-89880727666520258882013-07-21T01:57:00.002-05:002013-07-21T01:57:51.543-05:00Readability Improvements<div dir="ltr" style="text-align: left;" trbidi="on">
In preparation for new content relating to my <a href="http://www.blackhat.com/us-13/arsenal.html#Dinaburg">Blackhat 2013 Arsenal presentation</a>, I made some readability improvements to the blog and to dinaburg.org in general.<br />
<br />
The layout is wider and the <a href="http://www.smashingmagazine.com/2011/10/07/16-pixels-body-copy-anything-less-costly-mistake/">font size for text</a> is now <a href="https://explodie.org/writings/stop-using-small-font-size.html">16 pixels</a>.<br />
<br />
Please <a href="http://dinaburg.org/about.html">let me know</a> what you think.</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-17268562810434027292013-07-12T02:17:00.000-05:002013-07-12T02:30:39.665-05:00Git Fails On Large Files<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="p1">
Turns out <a href="http://git-scm.com/">git</a> fails spectacularly when working with large files. I was surprised, but the behavior is <a href="http://caca.zoy.org/wiki/git-bigfiles">pretty</a> <a href="http://superuser.com/questions/406907/why-is-git-so-slow-with-large-files">well</a> <a href="https://help.github.com/articles/working-with-large-files">documented</a>. In typical git fashion, there is an obscure error message and an equally obscure command to fix it.<br />
<br />
<h3 style="text-align: left;">
The Problem</h3>
<div>
<br /></div>
A real-life example (with repository names changed):<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 10px 20px 10px 10px; width: 100%;">artem@MBP:~/git$ git clone git@gitlab:has_a_large_file.git
Cloning into 'has_a_large_file'...
Identity added: /Users/artem/.ssh/devkey (/Users/artem/.ssh/devkey)
remote: Counting objects: 6, done.
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: Compressing objects: 100% (5/5), done.
remote: fatal: Out of memory, malloc failed (tried to allocate 1857915877 bytes)
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack failed</pre>
</div>
<div class="p1">
<br />
I pushed the large file without issues, but couldn't pull it again because the remote was dying. The astute reader will notice the remote was running <a href="http://gitlab.org/">gitlab</a>. The push also broke the gitlab web interface for the repository.<br />
<br />
From my Googling, the problem is that the remote side is running out of memory when compressing a large file (<a href="http://git-scm.com/book/en/Git-Internals-Packfiles">read more about git packfiles here</a>). Judging by the error, git attempts to <span style="font-family: Courier New, Courier, monospace;">malloc(size_of_large_file)</span> and the malloc fails.<br />
<br />
This situation raises conundrums that may only be answered by <a href="http://stevelosh.com/blog/2013/04/git-koans/">Master Git</a>:<br />
<ul style="text-align: left;">
<li>Why was I able to push a large file, but not pull it?</li>
<li>Why would one <span style="font-family: Courier New, Courier, monospace;">malloc(size_of_large_file)</span> ?</li>
<li>What happens when you push a >4GB file to a 32-bit remote?</li>
</ul>
<br />
I was curious enough about the last one to look at the code: it will likely die gracefully <a href="https://github.com/git/git/blob/727a46b2f9a1ce69eaf09bc46cb129f1c40833d8/wrapper.c#L49">(see line 49 of wrapper.c)</a>. <a href="http://en.wikipedia.org/wiki/Integer_overflow">Integer overflow</a> likely avoided; would need to read more code much more carefully to be sure.<br />
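<br />
To show why the 32-bit question matters, here is a tiny JavaScript sketch of what happens if a size just over 4 GiB is squeezed into a 32-bit unsigned integer. This is only an illustration of the arithmetic, not git's code, and I have not verified how git actually handles this case; <code>toUint32</code> is a made-up helper.<br />

```javascript
// Keep only the low 32 bits of a size, as a 32-bit unsigned field would.
function toUint32(n) {
  return n >>> 0;
}

const fileSize = 4096 * 1048577;    // 4 GiB plus one extra 4 KiB block
console.log(fileSize);              // 4294971392
console.log(toUint32(fileSize));    // 4096
```

<br />
An allocation sized from the truncated value would be far too small for the data, which is why a size check before the allocation matters.<br />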
<br />
<h3 style="text-align: left;">
The Solution</h3>
<div>
<br /></div>
In theory, the solution is to re-pack the remote with a smaller pack size limit. That requires ssh access to the remote repository, which I don't have. So the following fix is untested, and taken from <a href="http://www.kevinblake.co.uk/development/git-repack/">http://www.kevinblake.co.uk/development/git-repack/</a>. The obscure command in question (must be run on the remote):<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 10px 20px 10px 10px; width: 100%;">git repack -a -f -d</pre>
<br />
Of course, repacking the remote but having non-repacked local repositories around<a href="http://stackoverflow.com/questions/13367431/issues-after-trying-to-repack-a-git-repo-for-improved-performance"> may cause other problems</a>.<br />
<br />
<h3 style="text-align: left;">
Just For Fun</h3>
<div>
<br /></div>
Here is another large file fail:<br />
<br /></div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 10px 20px 10px 10px; width: 100%;">artem@MBP:~/temp/largerandomfile$ dd if=/dev/urandom of=./random_big_file bs=4096 count=1048577
1048577+0 records in
1048577+0 records out
4294971392 bytes transferred in 437.836959 secs (9809522 bytes/sec)
artem@MBP:~/temp/largerandomfile$ git add random_big_file
artem@MBP:~/temp/largerandomfile$ git commit -m "Added a big random file"
[master 377db57] Added a big random file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 random_big_file
artem@MBP:~/temp/largerandomfile$ git push origin master
Counting objects: 4, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (2/2), done.
error: RPC failed; result=22, HTTP code = 413 KiB/s
fatal: The remote end hung up unexpectedly
Writing objects: 100% (3/3), 4.00 GiB | 18.74 MiB/s, done.
Total 3 (delta 0), reused 1 (delta 0)
fatal: recursion detected in die handler
Everything up-to-date</pre>
<br />
Everything up-to-date, indeed.<br />
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-5458282374395421042013-06-16T01:50:00.000-05:002013-06-16T01:50:32.243-05:00OptimizeVM: Fast Windows Guests for VMware<div dir="ltr" style="text-align: left;" trbidi="on">
Do you make Windows VMs? Are they slow? <a href="https://github.com/artemdinaburg/OptimizeVM">OptimizeVM</a> will make them fast(er).<br />
<br />
I got tired of re-googling and re-typing the same commands over and over to make my Windows VMs fast, so I collected them on Github: <a href="https://github.com/artemdinaburg/OptimizeVM">https://github.com/artemdinaburg/OptimizeVM</a>.<br />
<br />
<a href="https://github.com/artemdinaburg/OptimizeVM">OptimizeVM</a> is based on the commands provided in the official <a href="http://www.vmware.com/files/pdf/VMware-View-OptimizationGuideWindows7-EN.pdf">VMware View Optimization Guide</a>.<br />
<br />
The goal of OptimizeVM is to minimize disk access and remove fancy graphical effects. Certain Windows features, such as Windows Search, System Restore, Windows Updates, and Registry Backup will cause constant background disk access. The disk access makes your VM slow and increases virtual disk size. These features are also unnecessary for VMs that get reverted to a clean snapshot every couple of days.<br />
<br />
The script also removes some annoyances like the Action Center, Network Location Wizard, hidden file extensions, and so on.<br />
<br />
Disabling some features, such as Windows Firewall, <a href="http://windows.microsoft.com/en-us/windows7/products/features/windows-defender">Windows Defender</a>, and Windows Update, does lower the security of your system. If you are very worried, turn them back on. I leave them off since in my workflow VMs get reverted to a clean state every few days.<br />
<br />
If you want to speed up your Windows VMs, here are a few more useful links:<br />
<br />
<ul style="text-align: left;">
<li><a href="http://pubs.vmware.com/view-50/index.jsp?topic=/com.vmware.view.administration.doc/GUID-06DF65D1-7151-479A-B388-162535251F6F.html">Optimize Windows Guest Operating System Performance</a></li>
<li><a href="http://pubs.vmware.com/view-50/index.jsp?topic=/com.vmware.view.administration.doc/GUID-E938922D-7DFD-41E7-A774-0C81CAE6595C.html">Optimizing Windows 7 for Linked-Clone Desktops</a></li>
<li><a href="http://communities.vmware.com/thread/291255?start=0&tstart=0">VMware Communities Thread</a></li>
</ul>
<br />
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-45992521307202179042013-04-30T01:19:00.000-05:002013-07-21T01:08:31.804-05:00JavaScript Frustrations and Solutions<div dir="ltr" style="text-align: left;" trbidi="on">
Since there's no better way to learn than by doing, I've been teaching myself JavaScript by writing a structured binary data fuzzer. The fuzzer currently generates <a href="http://msdn.microsoft.com/en-us/library/ms997538.aspx">Windows ICO</a> files, and will soon be released. In the meantime, I wanted to describe some frustrating experiences learning JavaScript and include solutions to them.<br />
<br />
<h3 style="text-align: left;">
Object Orientation in JS is Confusing </h3>
<div>
<br /></div>
Some of this may be because I am used to <a href="http://www.crockford.com/javascript/inheritance.html">class-ical inheritance</a>, but considering the number of JavaScript OOP libraries (e.g. <a href="http://idya.github.io/oolib/">oolib</a>, <a href="http://css.dzone.com/articles/dejavu-high-performance-oop">dejavu</a>, <a href="https://github.com/ded/klass">Klass</a>, <a href="https://github.com/Gozala/selfish">selfish</a>), I'm not alone.<br />
<br />
The first confusing thing is that objects are defined by functions declared via the <span style="font-family: Courier New, Courier, monospace;">function</span> keyword and instantiated via the <span style="font-family: Courier New, Courier, monospace;">new</span> operator. The overloaded use of <span style="font-family: Courier New, Courier, monospace;">function</span> doesn't let you know right away whether the code you are reading is an object or a traditional function. The use of the <span style="font-family: Courier New, Courier, monospace;">new</span> operator gives a false impression of class-ical inheritance and has other deficiencies. For instance, until the introduction of <span style="font-family: Courier New, Courier, monospace;"><a href="https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Object/create">Object.create</a></span>, a parent object's constructor could not reliably validate its arguments, because setting up inheritance meant calling that constructor before any real arguments were known. The deficiency is shown in the following example.<br />
<br />
In this motivating example, we want to create an object to encapsulate integers and validate certain properties in the object's constructor. The initial code could look something like this:<br />
<br />
<blockquote class="tr_bq">
<span style="font-family: Courier New, Courier, monospace;">function Int(arg) {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> console.log("Int constructor");</span><br />
<span style="font-family: Courier New, Courier, monospace;"> this.name = arg['name'];</span><br />
<span style="font-family: Courier New, Courier, monospace;"> if(this.name === undefined)</span><br />
<span style="font-family: Courier New, Courier, monospace;"> {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> alert('a name is required!');</span><br />
<span style="font-family: Courier New, Courier, monospace;"> }</span><br />
<span style="font-family: Courier New, Courier, monospace;"> this.size = arg['size'];</span><br />
<span style="font-family: Courier New, Courier, monospace;">};</span><br />
<span style="font-family: Courier New, Courier, monospace;">Int.prototype.getName = function() {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> console.log("Int: " + this.name);</span><br />
<span style="font-family: Courier New, Courier, monospace;">};</span><br />
<span style="font-family: Courier New, Courier, monospace;">var i = new Int({'name': 'generic int'});</span><br />
<span style="font-family: Courier New, Courier, monospace;">i.getName();</span></blockquote>
<br />
Running this code would print:<br />
<br />
<blockquote class="tr_bq">
<span style="font-family: Courier New, Courier, monospace;">Int constructor</span><br />
<span style="font-family: Courier New, Courier, monospace;">Int: generic int</span></blockquote>
<br />
But now let's say I want to write something to deal specifically with 4-byte integers. The initial code to inherit from the <span style="font-family: Courier New, Courier, monospace;">Int</span> object would look similar to the following:<br />
<br />
<blockquote class="tr_bq">
<span style="font-family: Courier New, Courier, monospace;">function Int4(arg) {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> arg['size'] = 4;</span><br />
<span style="font-family: Courier New, Courier, monospace;"> Int.call(this, arg);</span><br />
<span style="font-family: Courier New, Courier, monospace;"> console.log("Int4 constructor");</span><br />
<span style="font-family: Courier New, Courier, monospace;">};</span><br />
<span style="font-family: Courier New, Courier, monospace;">Int4.prototype = new Int({});</span><br />
<span style="font-family: Courier New, Courier, monospace;">Int4.prototype.constructor = Int4;</span><br />
<span style="font-family: Courier New, Courier, monospace;">Int4.prototype.getName = function() {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> console.log("Int4: " + this.name);</span><br />
<span style="font-family: Courier New, Courier, monospace;">};</span><br />
<span style="font-family: Courier New, Courier, monospace;">var i4 = new Int4({'name': '4-byte int'});</span><br />
<span style="font-family: Courier New, Courier, monospace;">i4.getName();</span></blockquote>
<br />
This code will alert with 'a name is required!'. To set Int4's prototype chain we need to create a new <span style="font-family: Courier New, Courier, monospace;">Int</span> object. Arguments to the constructor cannot be validated since they are not known when <span style="font-family: Courier New, Courier, monospace;">new Int({})</span> is called. Luckily this has been fixed by use of <a href="https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Object/create">Object.create</a>:<br />
<br />
<blockquote class="tr_bq">
<span style="font-family: Courier New, Courier, monospace;">function Int4(arg) {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> arg['size'] = 4;</span><br />
<span style="font-family: Courier New, Courier, monospace;"> Int.call(this, arg);</span><br />
<span style="font-family: Courier New, Courier, monospace;"> console.log("Int4 constructor");</span><br />
<span style="font-family: Courier New, Courier, monospace;">};</span><br />
<span style="font-family: Courier New, Courier, monospace;"><b>Int4.prototype = Object.create(Int.prototype);</b></span><br />
<span style="font-family: Courier New, Courier, monospace;">Int4.prototype.constructor = Int4;</span><br />
<span style="font-family: Courier New, Courier, monospace;">Int4.prototype.getName = function() {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> console.log("Int4: " + this.name);</span><br />
<span style="font-family: Courier New, Courier, monospace;">};</span><br />
<span style="font-family: Courier New, Courier, monospace;">var i4 = new Int4({'name': '4-byte int'});</span><br />
<span style="font-family: Courier New, Courier, monospace;">i4.getName();</span></blockquote>
<br />
<h3>
All Functions are Function Objects and all Objects are Associative Arrays.</h3>
<br />
All functions are actually Function objects, and all objects are associative arrays. There are also <a href="https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array">Arrays</a>, which are not functions but are also associative and also objects. Sometimes you want Arrays to be arrays, and sometimes you actually want Objects to be arrays. Confused yet?<br />
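<br />
A small sketch of what that looks like in practice. The names <span style="font-family: Courier New, Courier, monospace;">greet</span>, <span style="font-family: Courier New, Courier, monospace;">ages</span>, and <span style="font-family: Courier New, Courier, monospace;">list</span> are made up for illustration:<br />

```javascript
// Functions are objects, so they can carry arbitrary properties.
function greet() {
  return "hello";
}
greet.callCount = 0;

// Plain objects behave like associative arrays keyed by strings.
var ages = {};
ages["alice"] = 30;

// Arrays are objects too: a non-index key becomes an ordinary
// property and does not change the array's length.
var list = [1, 2, 3];
list["label"] = "numbers";

console.log(typeof greet);            // "function"
console.log(greet instanceof Object); // true
console.log(ages.alice);              // 30
console.log(typeof list);             // "object"
console.log(Array.isArray(list));     // true
console.log(list.length);             // 3
```

Note that <span style="font-family: Courier New, Courier, monospace;">typeof</span> reports an Array as a plain "object", which is why <span style="font-family: Courier New, Courier, monospace;">Array.isArray</span> (an ES5 addition) is the safer check.<br />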
<br />
<div>
<h3 style="text-align: left;">
Scoping Rules and Variable Definition Rules that Lead to Subtle Bugs</h3>
<div>
<br /></div>
Scoping rules are a bit confusing, since there are at least three ways to declare variables: assignment, <a href="https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Statements/var"><span style="font-family: Courier New, Courier, monospace;">var</span></a>, and <span style="font-family: Courier New, Courier, monospace;"><a href="https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Statements/let">let</a></span>. Of course, all of these have different semantics. The biggest problem for me was that creating a variable by assignment adds it to the global scope, but using <span style="font-family: Courier New, Courier, monospace;">var</span> will keep it in function scope. And when using identically named variables, a missing <span style="font-family: Courier New, Courier, monospace;">var</span> in one function will make that function use the global variable instead of the local. <a href="http://blog.safeshepherd.com/23/how-one-missing-var-ruined-our-launch/">Using the wrong variable will lead to lots of frustrating errors</a>.<br />
<br />
The solution is to always <span style="font-family: Courier New, Courier, monospace;"><a href="http://ejohn.org/blog/ecmascript-5-strict-mode-json-and-more/">"use strict"</a></span> to force variable definitions. Of course, doing this globally will break some existing libraries you are using. Such is life.<br />
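<br />
Here is a minimal sketch of how strict mode turns a missing <span style="font-family: Courier New, Courier, monospace;">var</span> into an immediate error instead of a silent global. The function names <span style="font-family: Courier New, Courier, monospace;">sumTo</span> and <span style="font-family: Courier New, Courier, monospace;">sumToFixed</span> are hypothetical, and the directive is applied per function so it does not affect other code on the page:<br />

```javascript
// Buggy version: loopIndex is never declared.
function sumTo(n) {
  "use strict";
  var total = 0;
  // Without strict mode this assignment would quietly create a global.
  // With strict mode it throws a ReferenceError.
  for (loopIndex = 1; loopIndex <= n; loopIndex++) {
    total += loopIndex;
  }
  return total;
}

// Fixed version: loopIndex is declared with var and stays in function scope.
function sumToFixed(n) {
  "use strict";
  var total = 0;
  for (var loopIndex = 1; loopIndex <= n; loopIndex++) {
    total += loopIndex;
  }
  return total;
}

var errorName = null;
try {
  sumTo(3);
} catch (e) {
  errorName = e.name;
}
console.log(errorName);     // "ReferenceError"
console.log(sumToFixed(3)); // 6
```

Putting the directive inside each function, rather than at the top of the file, is one way to get the check without breaking third-party libraries loaded alongside your code.<br />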
<br />
<h3 style="text-align: left;">
Type Coercion With the Equality Operator (<span style="font-family: Courier New, Courier, monospace;">==</span>)</h3>
<div>
<br /></div>
It's amazing what is considered equal in JavaScript via <span style="font-family: Courier New, Courier, monospace;">==</span>. Instead of restating all these absurdities, I'll just link to someone else who has:<br />
<a href="http://javascriptweblog.wordpress.com/2011/02/07/truth-equality-and-javascript/">http://javascriptweblog.wordpress.com/2011/02/07/truth-equality-and-javascript/</a><br />
<br />
When I started my project, I didn't realize that the Strict Equality operator (<span style="font-family: Courier New, Courier, monospace;">===</span>) existed. It should be used anywhere you would expect <span style="font-family: Courier New, Courier, monospace;">==</span> to work. It seems more sane to have <span style="font-family: Courier New, Courier, monospace;">==</span> be Strict Equality, and another Coercive Equality operator (something like <span style="font-family: Courier New, Courier, monospace;">~=</span> or <span style="font-family: Courier New, Courier, monospace;">~~</span>), but what is done is done.<br />
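<br />
A few concrete cases make the difference between the two operators easier to see. This is only a sketch; the helper <span style="font-family: Courier New, Courier, monospace;">compare</span> and the sample list are made up for illustration:<br />

```javascript
// Return the result of both operators for the same pair of values.
function compare(a, b) {
  return { loose: a == b, strict: a === b };
}

var samples = [
  { label: '0 vs ""',           a: 0,    b: "" },
  { label: '"1" vs 1',          a: "1",  b: 1 },
  { label: "null vs undefined", a: null, b: undefined },
  { label: "[] vs false",       a: [],   b: false }
];

samples.forEach(function (sample) {
  var result = compare(sample.a, sample.b);
  console.log(sample.label + ": == is " + result.loose +
              ", === is " + result.strict);
});
// Every pair above is equal under == and unequal under ===.
```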
<br />
<h3 style="text-align: left;">
Problems Modularizing and Importing Code</h3>
<div>
<br /></div>
C/C++ has <span style="font-family: Courier New, Courier, monospace;">#include</span>, Python has <span style="font-family: Courier New, Courier, monospace;">import</span>, JavaScript has... <a href="http://stackoverflow.com/questions/950087/include-javascript-file-inside-javascript-file">terrible hacks</a>. There is sadly no standard way to import new code in a .js file, making modularization of your code difficult. I resorted to simply including prerequisite scripts in the HTML where they will be used, but I wish there were a way to include JavaScript from JavaScript.<br />
<br />
<h3 style="text-align: left;">
Browser Compatibility Issues</h3>
<div>
<br /></div>
Not all browsers have <span style="font-family: Courier New, Courier, monospace;">Object.create</span>. Not all browsers have <span style="font-family: Courier New, Courier, monospace;">console.log</span> in all situations. Not all browsers support <span style="font-family: Courier New, Courier, monospace;">"use strict"</span>. Turns out every browser is slightly different in a way that will subtly break your code, but of course the main culprit is usually IE.</div>
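<br />
One common way to cope is to test for a feature before using it. The following is a rough sketch, not a complete compatibility layer; <span style="font-family: Courier New, Courier, monospace;">createFrom</span> and <span style="font-family: Courier New, Courier, monospace;">safeLog</span> are hypothetical helper names, and the fallback only covers the single-argument form of <span style="font-family: Courier New, Courier, monospace;">Object.create</span>:<br />

```javascript
// Create an object whose prototype is `proto`, with or without Object.create.
function createFrom(proto) {
  if (typeof Object.create === "function") {
    return Object.create(proto);
  }
  // Fallback for older engines: use a throwaway constructor so that
  // no real constructor (and none of its validation) has to run.
  function Temp() {}
  Temp.prototype = proto;
  return new Temp();
}

// Log only when a console is actually available.
function safeLog(message) {
  if (typeof console !== "undefined" && typeof console.log === "function") {
    console.log(message);
  }
}

var base = { kind: "int" };
var derived = createFrom(base);
safeLog(derived.kind); // "int"
```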
</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-16906435864308090282013-03-20T01:21:00.001-05:002013-03-20T01:21:18.164-05:00Solution to Printing Blank Pages Problem in Linux<div dir="ltr" style="text-align: left;" trbidi="on">
<div>
This isn't an overly technical post but I hope it saves someone hours of frustration printing on Linux. </div>
<div>
<br /></div>
<div>
In my case the problem was a combination of broken generic printer drivers and a bad default value for the "Print Quality" setting. As a word of caution, according to the <a href="http://en.wikipedia.org/wiki/Anna_Karenina_principle">Anna Karenina Principle</a> odds are your problem is its own unique snowflake and this won't help you print.</div>
<div>
<br /></div>
<h3 style="text-align: left;">
Problem </h3>
<div>
<ul style="text-align: left;">
<li>You are trying to print from Linux. </li>
<li>The printer starts, makes printing noises, but only a blank page (i.e. one with no ink on it) comes out.</li>
<li>You verified your printer works by printing from another OS. If you have not, do this. If your printer still prints blanks on Windows/MacOS, you have a printer problem, not a Linux problem.</li>
</ul>
</div>
<div>
<br /></div>
<h3 style="text-align: left;">
Solution</h3>
<div>
The solution has two parts; both were needed to actually see ink on paper.</div>
<div>
<ol style="text-align: left;">
<li>Install printer-specific software. <br /><br />The drivers that came with CUPS and claimed to support my printer didn't work. For HP printers, you need to <span style="font-family: Courier New, Courier, monospace;">sudo apt-get install hplip</span>, and run <span style="font-family: Courier New, Courier, monospace;">hp-setup</span>. If you have another brand printer, look <a href="http://wiki.debian.org/SystemPrinting">here</a> for help.<br /><br /></li>
<li>Change the "Print Quality" setting to something else.<br /><br />The setting is in the CUPS web interface. Go to http://localhost:631 (you may need to log in with a local account) -> Administration -> Manage Printers -> Your Printer's Name -> Administration Selection Box, pick "Set Default Options". Clicking that will take you to the following page:</li>
</ol>
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhM-lg0Hvxrnm5lRCQ-Xjyuz1Kt1gWo4rq3P5MbcQuki-kq_QgbMTS4tx18XyfvkruS9Z3qi4y3XEMWO7jt8IVcyrjyQK1ieddfaXKICn7jmCVrlv1K173ZbXFqBJHpq_usgnmmG1HG7s4r/s1600/DefaultOptions.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhM-lg0Hvxrnm5lRCQ-Xjyuz1Kt1gWo4rq3P5MbcQuki-kq_QgbMTS4tx18XyfvkruS9Z3qi4y3XEMWO7jt8IVcyrjyQK1ieddfaXKICn7jmCVrlv1K173ZbXFqBJHpq_usgnmmG1HG7s4r/s1600/DefaultOptions.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Change the Print Quality setting to something else. Try all the values. For me Normal Grayscale worked, Normal Color did not.</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
Try all the Print Quality options. Hopefully one of them prints. Yes, the setting is obscure and hard to find, but hey, at least you didn't have to edit another config file!</div>
<div>
<br /></div>
<div>
My next post may be about trying to get network printer sharing to work between Linux and Mac OS X Mountain Lion, which was its own struggle.</div>
</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-6276485882088889032013-02-18T21:43:00.001-06:002013-02-18T23:04:00.515-06:00Your Missing Package: When Address Correction Fails<div dir="ltr" style="text-align: left;" trbidi="on">
<style>
table.centered
{
border-collapse:collapse;
}
table.centered,th
{
border: 1px solid black;
padding-left: 4px;
padding-right: 4px;
}
table.centered
{
text-align: center;
margin-left: auto;
margin-right: auto;
}
th
{
background: #292929;
}
td
{
border: 1px solid black;
padding-left:3px;
padding-right:3px;
}
</style>
Amazon address correction is wrong for large parts of Chicago. This leads to late and missing packages. <a href="#map">This handy map</a> shows areas most affected by address correction failure. To avoid delivery problems always use your full <a href="http://en.wikipedia.org/wiki/ZIP_code">ZIP+4</a> when placing online orders. You can find the full ZIP+4 for your address via the <a href="https://tools.usps.com/go/ZipLookupAction!input.action">USPS ZIP Code (TM) Lookup Tool.</a><br />
<br />
I don't mean to pick on Amazon -- this problem has happened with several other retailers. I used Amazon because it was easy to cross-check their address verification with USPS. If you are an online retailer, make sure you have a working address correction system. If Amazon can get it wrong, what makes you think yours works? Bad address correction is costing you customers.<br />
<br />
<h1 style="text-align: left;">
The Problem</h1>
Have your Amazon packages ever been late or missing?<br />
Have you ever gotten a "notice left" email but no notice?<br />
Did USPS confirm delivery but there was no package?<br />
Do you only use a 5 digit ZIP code when filling out your address?<br />
<br />
You may be a victim of address correction failure. And you are not alone.<br />
<br />
Here is how to check:<br />
<br />
First, go to "<a href="http://www.amazon.com/gp/css/account/address/view.html/ref=ya_28">manage addresses</a>" and look at your address on Amazon.<br />
Now, go to the <a href="https://tools.usps.com/go/ZipLookupAction!input.action">USPS ZIP Code (TM) Lookup Tool</a> and check your address.<br />
<br />
If the full 9 digit ZIP Codes do not match, there is a problem. If you live in Chicago, I made a <a href="#map">heat map of where verification failures are most likely to occur</a>.<br />
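<br />
If you would rather not compare the two codes by eye, the check can be sketched in a few lines. This is only an illustration: <span style="font-family: Courier New, Courier, monospace;">compareZip4</span> and its result labels are made up, and you still have to copy the two ZIP Codes in by hand from the USPS and Amazon pages:<br />

```javascript
// Keep only the digits, so "98052-8300" and "980528300" compare equal.
function normalizeZip(zip) {
  return String(zip).replace(/\D/g, "");
}

// Compare the ZIP+4 reported by USPS with the one a retailer shows.
function compareZip4(uspsZip, retailerZip) {
  var usps = normalizeZip(uspsZip);
  var retailer = normalizeZip(retailerZip);
  if (usps.length !== 9 || retailer.length !== 9) {
    return "incomplete";      // one side is missing the +4 digits
  }
  if (usps === retailer) {
    return "match";
  }
  if (usps.slice(0, 5) === retailer.slice(0, 5)) {
    return "plus4-mismatch";  // same 5-digit ZIP, different +4
  }
  return "zip-mismatch";
}

// The second code in each call is invented for the example.
console.log(compareZip4("98052-8300", "98052-8300")); // "match"
console.log(compareZip4("98052-8300", "98052-8301")); // "plus4-mismatch"
console.log(compareZip4("98052-8300", "98052"));      // "incomplete"
```

The "plus4-mismatch" case is the one that bit me: everything looks right until you compare the last four digits.<br />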
<br />
<h3 style="text-align: left;">
Address Verification Failures</h3>
<div>
Mailers validate your address prior to shipment to save money on shipping costs. The address validation step is called <a href="http://www.semaphorecorp.com/cgi/dpv.html">Delivery Point Validation (DPV)</a>, and it requires a complete mailing address including a full 9 digit ZIP Code. Since few people know their full ZIP Codes, a suite of software called <a href="http://en.wikipedia.org/wiki/Coding_Accuracy_Support_System">Coding Accuracy Support System (CASS)</a> will correct an address into one that can be checked via DPV. <b>The correction step can fail, and "correct" your address to a different building</b>. To find out why, it's time for a quick lesson on DPV, CASS, and ZIP Codes.</div>
<div>
<br /></div>
<div>
Note: I am not an expert on mailing; this information is what I have learned from judicious searching. It may be wrong. If it is, <a href="http://dinaburg.org/about.html">please correct me</a>.<br />
<br /></div>
<h3 style="text-align: left;">
DPV and CASS</h3>
Mailers use DPV to ensure an address is deliverable before passing the mail to USPS. In return, they receive discounted postage rates for reducing the work USPS has to do. From <a href="https://ribbs.usps.gov/cassmass/documents/tech_guides/ARCHIVES/CASS_CYCLE_L_2007_2009/CASS_CERT_REQ_MAILERS_GUIDE.PDF">The History of Worksharing Discounts and CASS Certified™ Software</a>:<br />
<br />
<blockquote class="tr_bq">
In 1983, the United States Postal Service (USPS) implemented a program that provided mailers a postage discount for sharing the work to prepare the mail for processing. This allowed the USPS to provide more cost-efficient mail processing based on the advance work performed by the mailer in providing high-quality addresses for their mail.</blockquote>
<br />
People are notoriously bad typists and spellers, and tend to omit information. Before a delivery point is verified, an address has to go through a <a href="http://en.wikipedia.org/wiki/Coding_Accuracy_Support_System">Coding Accuracy Support System (CASS)</a> check. The CASS software will fix an address to one that can be validated by DPV. From the <a href="http://en.wikipedia.org/wiki/Coding_Accuracy_Support_System">Wikipedia page</a>:<br />
<br />
<blockquote class="tr_bq">
The input of:<br />
<span style="font-family: Courier New, Courier, monospace;">1 MICROWSOFT</span><br />
<span style="font-family: Courier New, Courier, monospace;">REDMUND WA</span><br />
Produces the output of:<br />
<span style="font-family: Courier New, Courier, monospace;">1 MICROSOFT WAY</span><br />
<span style="font-family: Courier New, Courier, monospace;">REDMOND WA 98052-8300</span></blockquote>
<br />
CASS software has to be certified by the USPS and has to undergo certification testing every two years. The caveat is that <b>CASS validation only checks address matching, not the accuracy of the matched address</b>. <a href="https://ribbs.usps.gov/index.cfm?page=cassmass">From the USPS</a>:<br />
<br />
<blockquote class="tr_bq">
However, CASS processing does not measure the accuracy of ZIP + 4, delivery point, 5-digit ZIP, or carrier route codes in a mailer’s address file.</blockquote>
<br />
<div>
If the mailer's ZIP+4 database is wrong, CASS can't fix it.<br />
<br /></div>
<h3 style="text-align: left;">
Why do ZIP+4 Codes matter?</h3>
In a city, a ZIP+4 will determine the building or even the floor or group of apartments a piece of mail goes to. From the <a href="http://about.usps.com/who-we-are/postal-facts/welcome.htm">USPS website</a> (emphasis mine):<br />
<br />
<blockquote class="tr_bq">
The ZIP+4 Code was introduced in 1983. The extra four numbers allow <b>mail to be sorted to a specific group of streets or to a high-rise building</b>. In 1991, two more numbers were added so that mail could be sorted directly to a residence or business. Today, the use of ZIP Codes extends far beyond the mailing industry, and they are a fundamental component in the nation’s 911 emergency system.</blockquote>
<br />
If the ZIP+4 code is wrong, your mail goes to the wrong building. Your mailman might not catch this. Mail with electronic mailing information (i.e. pretty much all packages from online retailers) is automatically sorted and binned by machines. On busy urban routes the mailman doesn't know everyone and isn't going to check every single piece of mail. They're going to take the machine-sorted mail bin, deposit it at the address they always do, and move on. If you're lucky, you may get a <a href="https://redelivery.usps.com/redelivery/">redelivery notice</a>.<br />
<br />
<h3 style="text-align: left;">
... but Amazon ships via UPS/FedEx?</h3>
UPS and FedEx may hand packages off to USPS for final delivery. This is a part of USPS work-share programs that UPS calls <a href="http://www.upsmailinnovations.com/services/domestic_mail.html">a mailing innovation</a>.<br />
<br />
<a href="" name="map"></a><br />
<h1 style="text-align: left;">
<a name="map">
The Address Verification Failure Map</a></h1>
The following map shows differences between ZIP+4 Codes returned by USPS and ZIP+4 Codes corrected by Amazon for 1,857 addresses in the City of Chicago. Green markers mean a match, blue markers represent ZIP+4 Codes from USPS, and yellow markers represent ZIP+4 Codes from Amazon. A red connecting line associates the USPS and Amazon results for the same address.<br />
<br />
<iframe height="860" src="http://dinaburg.org/data/zipcodes/chicago_zipmap.html" width="100%"></iframe>
<br />
There are correction mistakes throughout the City, with the most mistakes in the Loop and the area immediately to the north and northwest. This correlates pretty well with the number of large apartments and condos, and hence specificity of ZIP+4 codes.<br />
<br />
I chose Chicago addresses because that's where I live. The addresses were a random sampling from the <a href="https://data.cityofchicago.org/Community-Economic-Development/Business-Licenses/r5kz-chrr">City of Chicago business license holders</a>. The City of Chicago has an excellent open data site at <a href="https://data.cityofchicago.org/">https://data.cityofchicago.org/</a>. This research would not have been possible without it.<br />
<br />
I sampled 2,000 addresses out of a possible 381,677. Of these, 143 (~7%) addresses were not found -- that is, either the USPS or Amazon had a failure in obtaining a ZIP+4 for the address. There were 519 (~26%) addresses with a different ZIP+4 between USPS and Amazon, and 1,338 (~67%) addresses with the same ZIP+4.<br />
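<br />
As a quick sanity check on those numbers, the three groups should add up to the full sample, and the addresses that did get a ZIP+4 from both sources should equal the 1,857 shown on the map. The variable names below are made up for this sketch; the counts are the ones quoted above:<br />

```javascript
var sampled = 2000;
var counts = { notFound: 143, different: 519, same: 1338 };

// Whole-number percentage of the full sample.
function percentOfSample(count) {
  return Math.round(count * 100 / sampled);
}

var total = counts.notFound + counts.different + counts.same;
var mapped = counts.different + counts.same;

console.log(total);                             // 2000
console.log(mapped);                            // 1857
console.log(percentOfSample(counts.notFound));  // 7
console.log(percentOfSample(counts.different)); // 26
console.log(percentOfSample(counts.same));      // 67
```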
<br />
I am making available the addresses used to generate this map.<br />
<br/>
<table class="centered">
<tbody>
<tr>
<th>File</th>
<th>Metadata</th>
<th>Description</th>
</tr>
<tr><td><a href="http://dinaburg.org/data/zipcodes/zip_diffs.txt">zip_diffs.txt</a></td><td>41KB, text</td><td>ZIP+4 Differences</td></tr>
<tr><td><a href="http://dinaburg.org/data/zipcodes/zip_equals.txt">zip_equals.txt</a></td><td>100KB, text</td><td>ZIP+4 Matches</td></tr>
<tr><td><a href="http://dinaburg.org/data/zipcodes/zip_fails.txt">zip_fails.txt</a></td><td>11KB, text</td><td>Failure to get ZIP+4 for an address</td></tr>
</tbody></table>
<br />
My verification scripts selected the first suggested address or, if no address was suggested, the automatically corrected address given by Amazon. For some streets, the suggested address was very far from the initial input. No human would have selected it, so the most egregious correction errors would likely have been caught. The places where the yellow and blue markers are close together are the most dangerous -- the difference is likely only in the +4 digits, which most users (like myself) would never notice.<br />
<br />
To map ZIP+4 addresses to latitude/longitude and to create the map, I used the <a href="http://www.mapquestapi.com/geocoding/">MapQuest API</a>. MapQuest may seem like an odd choice, but it had great documentation and examples, and it was the first service I could find with support for mapping a ZIP+4 to latitude/longitude.<br />
<br />
<h1 style="text-align: left;">
Backstory</h1>
I recently moved to Chicago with only what I could fit in my car, which meant I had to buy a lot of household items. I do most of my shopping online since I hate the crowds, salesmen, and poor selection at brick and mortar stores. This means I buy a lot of stuff on Amazon.<br />
<br />
I first became suspicious when I received the following email:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3Zu1oU8CAKwinceAUM8EjyII97KOaJe4gZDZPgd1lWgtaUvV_ZSHpQupssEHrJYw7X4yLgetNZ7yXg52cDDNibfn3llNMgtzV_YBKGO-2tWmhTBP6NpR9n_1-1FMS7YkEBidcuQw5W5JL/s1600/amazon_first_email.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="192" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3Zu1oU8CAKwinceAUM8EjyII97KOaJe4gZDZPgd1lWgtaUvV_ZSHpQupssEHrJYw7X4yLgetNZ7yXg52cDDNibfn3llNMgtzV_YBKGO-2tWmhTBP6NpR9n_1-1FMS7YkEBidcuQw5W5JL/s640/amazon_first_email.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fool me once, shame on you.</td></tr>
</tbody></table>
It is impossible to leave an unattended package at my address. The building has 24/7 front desk staff and a dedicated package receiving room. I dutifully filled out the re-delivery form, and received my package a few days later. I thought nothing of it until I received this second email:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0YoZVTadrin7tOnXn49FnmSKldE2Wvz2UUabV7yJssfD2qtb3N3PiHRQItOaEOrMiqifWXZ130UNS1vMXhobMMI096CxEf1GyPccgrYGCuWiV2HNUEFE54Ms_6KzAWkTexMbO2C3o30j0/s1600/amazon_second_email.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0YoZVTadrin7tOnXn49FnmSKldE2Wvz2UUabV7yJssfD2qtb3N3PiHRQItOaEOrMiqifWXZ130UNS1vMXhobMMI096CxEf1GyPccgrYGCuWiV2HNUEFE54Ms_6KzAWkTexMbO2C3o30j0/s640/amazon_second_email.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fool me twice, shame on me.</td></tr>
</tbody></table>
Around the same time my fiancee had several packages (not from Amazon, but other vendors) never arrive, despite USPS confirming delivery. Something was wrong; it was time to investigate.<br />
<br />
The addresses on the re-delivered package labels, order confirmation, and amazon.com all seemed correct. The front desk staff hadn't noticed any delivery attempts, and no packages had been left for me.<br />
<br />
I was stumped and considered just not shopping online, until I had a thought: USPS re-delivery worked, but original delivery sent it to a mystery address. Was there a difference between the USPS address and the Amazon.com address?<br />
<br />
Sure enough, there was. The ZIP+4 code had the wrong +4 digits. Searching online for the ZIP+4 Code from USPS results only in matches with my building's address. Searching for the ZIP+4 Code from Amazon results only in matches from buildings a few numbers down, with no front desk staff.<br />
<br />
Mystery solved.<br />
<br />
I immediately emailed Amazon with the problem. This was in mid January. As of February 18th, my address is still corrected to the wrong ZIP+4 Code.<br />
<br />
<h3 style="text-align: left;">
A Bigger Problem</h3>
<div>
Did I just live at the wrong address, making this an isolated case, or was there a more systematic address correction problem?</div>
<div>
<br /></div>
<div>
That is why I made the map. Turns out some areas are more affected than others, and that my address is not the only one. I hope that by exposing this publicly I can help others avoid the hassle and headaches of online ordering. </div>
<br />
<h1 style="text-align: left;">
Conclusion</h1>
Major vendors, including Amazon, get address correction wrong. In my sample of Chicago business addresses, 26% had a ZIP+4 that did not match the one returned by USPS.<br />
<br />
If you are an online retailer, please check your CASS and DPV software. Don't just assume it works, but write some scripts to test it yourself. Your customers will thank you. If your customers complain about missing packages, check that their address corrects properly.<br />
<br />
If you buy things online, memorize your ZIP+4 Code and use the full code where you can. If you live in an urban area, and the vendor only accepts a 5 digit ZIP, shop somewhere else because you may never get what you bought.<br />
<br /></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-88098442130993231652013-01-05T01:08:00.001-06:002013-01-05T01:08:33.804-06:00The Internet Sign<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
The Internet. It enhances communication, enables global commerce, and has become an indispensable part of people's daily lives. The Internet disseminates information around the globe and helps bypass censorship in repressive regimes. It is a great force for good, and some have said, has <a href="http://hbswk.hbs.edu/archive/1799.html">resulted in the largest legal creation of wealth on the planet</a>.<br />
<br />
What commemorates the creation of the Internet? There is a <a href="http://mercury.lcs.mit.edu/~jnc/plaque.html">plaque at Stanford University</a>. And near a "No Parking" sign outside the former ARPA building in <a href="http://arlingtonva.us/">Arlington County, Virginia</a> there is a sign.<br />
<br />
The Internet Sign.<br />
<br />
I refer to the sign as the Internet Sign to make its significance more obvious, but more technically it is the <a href="http://en.wikipedia.org/wiki/ARPANET">ARPANET</a> Sign.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEWlylu_CY4djwsZBLuzN0oR7haWNxzZWwqkHOngYJwD6d_hp4ZK2AOC29Ys0V_QHLVYfrw6w0SlaFSxVb8kiuR6VCDacLwavb52amntMkI8IQtdFhDQ6Rf2z7RFzXC3gtU9qZuNCgXvVU/s1600/IMG_0589.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEWlylu_CY4djwsZBLuzN0oR7haWNxzZWwqkHOngYJwD6d_hp4ZK2AOC29Ys0V_QHLVYfrw6w0SlaFSxVb8kiuR6VCDacLwavb52amntMkI8IQtdFhDQ6Rf2z7RFzXC3gtU9qZuNCgXvVU/s640/IMG_0589.JPG" width="480" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-size: small; text-align: left;">The sign is near the corner of Oak St. and Wilson Boulevard in Arlington, Virginia. It is not (yet) visible on Google Street View. The location of the sign is the old ARPA building. ARPA moved to the Wilson Boulevard location from the Pentagon, then as DARPA it moved to 3701 N. Fairfax Dr. DARPA recently moved again, still within Arlington County, to <a href="http://www.darpa.mil/About/Contact/Address.aspx">675 N. Randolph Street</a>.</span></td></tr>
</tbody></table>
<br />
The following text appears on the sign:<br />
<br />
<blockquote class="tr_bq">
ARPANET</blockquote>
<blockquote class="tr_bq">
THE ARPANET, A PROJECT OF THE<br />ADVANCED RESEARCH PROJECTS AGENCY<br />OF THE DEPARTMENT OF DEFENSE,<br />DEVELOPED THE TECHNOLOGY THAT<br />BECAME THE FOUNDATION FOR THE<br />INTERNET AT THIS SITE FROM 1970 TO<br />1975. ORIGINALLY INTENDED TO SUPPORT<br />MILITARY NEEDS, ARPANET TECHNOLOGY<br />WAS SOON APPLIED TO CIVILIAN USES,<br />ALLOWING INFORMATION TO BE RAPIDLY<br />AND WIDELY AVAILABLE. THE INTERNET,<br />AND SERVICES SUCH AS E-MAIL,<br />E-COMMERCE AND THE WORLDWIDEWEB,<br />CONTINUES TO GROW AS THE UNDER-<br />LYING TECHNOLOGIES EVOLVE. THE<br />INNOVATIONS INSPIRED BY THE<br />ARPANET HAVE PROVIDED GREAT<br />BENEFITS FOR SOCIETY.</blockquote>
<blockquote class="tr_bq">
ERECTED IN 2008 BY ARLINGTON COUNTY, VIRGINIA </blockquote>
<br />
Below the main text is a smaller plaque with binary digits:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdB-c4ztT_GX-e8bEH6WfRz5d6g-asym2drFeABZiuu4GAMOpTV24_tcj12uJLRW-I9vbAeaM3vFQE05TKuduqBi53KXF9kI-Hfi4Z2KglZ5BymkvufTvAGboMSeZ53t7iiQDod4K-Z4b_/s1600/binary.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdB-c4ztT_GX-e8bEH6WfRz5d6g-asym2drFeABZiuu4GAMOpTV24_tcj12uJLRW-I9vbAeaM3vFQE05TKuduqBi53KXF9kI-Hfi4Z2KglZ5BymkvufTvAGboMSeZ53t7iiQDod4K-Z4b_/s640/binary.jpg" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-size: small; text-align: left;">The binary (<span style="font-family: Courier New, Courier, monospace;">01000001 01010010 01010000 01000001 01001110 01000101 01010100</span>) spells <span style="font-family: Courier New, Courier, monospace;">ARPANET</span> in <a href="http://en.wikipedia.org/wiki/ASCII">ASCII</a>.</span></td></tr>
</tbody></table>
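<br />
As a quick check of the caption, the seven bytes can be decoded with a few lines of Python. This is only an illustrative sketch; the names <span style="font-family: Courier New, Courier, monospace;">plaque_bits</span> and <span style="font-family: Courier New, Courier, monospace;">decode_ascii_bits</span> are made up for this example.<br />

```python
# Binary digits as transcribed in the photo caption above.
plaque_bits = "01000001 01010010 01010000 01000001 01001110 01000101 01010100"


def decode_ascii_bits(bits):
    """Decode space-separated 8-bit groups into an ASCII string."""
    return "".join(chr(int(group, 2)) for group in bits.split())


decoded = decode_ascii_bits(plaque_bits)
print(decoded)
```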
<br />
The Internet Sign wasn't actually erected in 2008; the unveiling ceremony happened in 2011. <a href="http://www.arlnow.com/2011/05/24/mystery-solved-new-sign-erected-in-2008/">ARLnow has reasons for the delay</a>:<br />
<blockquote class="tr_bq">
According to Arlington spokeswoman Diana Sun, the county was unable to get permission from the building owner to put the sign on their property, so they had to go through a lengthy process of getting the sign installed in the public right-of-way (sidewalk). By the time all the pieces were in place, and by the time they could organize a small ceremony at a County Board meeting, it was 2011 — three years later than originally planned.</blockquote>
<br />
Which building owner didn't want the sign on their property? A <a href="http://goo.gl/maps/jLPUi">glance at Google Maps</a> will show the adjacent land is used by the US State Department. Why would the State Department refuse to commemorate a tool that has allowed uncensored information to reach the oppressed masses? I imagine security concerns about tourists congregating so close to a government building.<br />
<br />
While I am sure the State Department's reasons for not hosting the Internet Sign are sound, the result is a rather sad commemoration. Surely there is a more tactful way to acknowledge the creation of the Internet than by a sign on the sidewalk.<br />
<br />
</div>
Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-8270433837910980471.post-35683501167995862722012-11-23T19:59:00.001-06:002012-11-23T19:59:58.586-06:00Bitsquatting PCAP Analysis Part 4: Source Country Distribution<div dir="ltr" style="text-align: left;" trbidi="on">
This is part 4 of a multipart series; the previous post is <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-3-bit.html">Bitsquatting PCAP Analysis Part 3: Bit-error distribution</a>.<br />
<br />
This blog post will examine the source country distribution of packets in the bitsquatting PCAPs. To map a source IP address to a physical location, we will use MaxMind's free GeoLite Data (available at <a href="http://dev.maxmind.com/geoip/geolite">http://dev.maxmind.com/geoip/geolite</a>) as the data source, and write a quick Python script using <a href="https://github.com/appliedsec/pygeoip">pygeoip</a> to do the IP-to-location translation.<br />
<br />
<h3 style="text-align: left;">
IP to Location Translation</h3>
<br />
First, let's download and decompress the free <a href="http://dev.maxmind.com/geoip/geolite">GeoLite City Database provided by MaxMind</a>:
<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
$ gunzip GeoLiteCity.dat.gz</pre>
<br />
Next, we will install <a href="https://github.com/appliedsec/pygeoip">pygeoip</a>. The installation procedures for Python packages vary, but it's likely that pygeoip can be installed by setuptools:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;"># easy_install pygeoip</pre>
<br />
The <a href="https://github.com/appliedsec/pygeoip">pygeoip page on GitHub</a> provides all the necessary usage examples to create an IP-to-country script. My script, which reads IPv4 addresses line-by-line from a file (or stdin) and outputs an "ip:country:city" mapping, is available here: <a href="https://github.com/artemdinaburg/ip-to-city">ip_to_city_country.py</a>.<br />
<br />
The example usage:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ ./ip_to_city_country.py --help
usage: ip_to_city_country.py [-h] [-d GEOIPDB] [ipfile]
Show city and country of IP addresses using MaxMind GeoIP Database
positional arguments:
ipfile a file from which to read IP addresses (default: stdin)
optional arguments:
-h, --help show this help message and exit
-d GEOIPDB Path to the GeoIPCity database (default: GeoLiteCity.dat)
$ echo '8.8.8.8' | ./ip_to_city_country.py
8.8.8.8:US:Mountain View</pre>
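<br />
For readers who want a feel for the lookup without opening the repository, here is a minimal sketch of the same idea. It is not the linked script: the function names <span style="font-family: Courier New, Courier, monospace;">format_mapping</span> and <span style="font-family: Courier New, Courier, monospace;">lookup_all</span> are hypothetical, and the exact text used when a lookup fails is an assumption. It relies only on <span style="font-family: Courier New, Courier, monospace;">pygeoip.GeoIP</span> and <span style="font-family: Courier New, Courier, monospace;">record_by_addr</span> from pygeoip.<br />

```python
def format_mapping(ip, record):
    """Render one "ip:country:city" line from a pygeoip city record (or None)."""
    if record is None:
        # Assumption: mark addresses with no database entry as <error>.
        return "%s:<error>:<error>" % ip
    country = record.get("country_code") or "<error>"
    city = record.get("city") or ""
    return "%s:%s:%s" % (ip, country, city)


def lookup_all(lines, db_path="GeoLiteCity.dat"):
    """Yield one mapping line for every non-empty input line."""
    import pygeoip  # imported lazily so format_mapping works without the package

    geo = pygeoip.GeoIP(db_path)
    for line in lines:
        ip = line.strip()
        if ip:
            # record_by_addr returns a dict of location fields, or None.
            yield format_mapping(ip, geo.record_by_addr(ip))
```

With the database file in place, something like <span style="font-family: Courier New, Courier, monospace;">for row in lookup_all(sys.stdin): print(row)</span> would reproduce the stdin behavior shown above.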
<br />
<h3 style="text-align: left;">
Source Address Frequency</h3>
<br />
The first step to mapping source country frequency is to identify source address frequency. While the source address frequency is only an intermediate step to gather source country distribution, it is very handy for a manual analysis of where queries are coming from.<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -o column.format:'"SOURCE", "%s"' | sort | uniq -c | sort -rn > analysis/ips_all.txt</pre>
<br />
A read-filter can be applied to get the source IPs with the <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span> outliers removed:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -R '!(dns.qry.name contains 0mdn.net)' -o column.format:'"SOURCE", "%s"' | sort | uniq -c | sort -rn > analysis/ips_nomdn.txt</pre>
<br />
The results for the frequency of all source IPs (<a href="http://dinaburg.org/data/bitsquat_analysis/ips_all.txt">ips_all.txt</a>, 848KB, text) and only IPs not requesting <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span> (<a href="http://dinaburg.org/data/bitsquat_analysis/ips_nomdn.txt">ips_nomdn.txt</a>, 740KB, text) are available for download.<br />
<br />
These intermediate results show how many packets were received from each IP. The list is interesting in its own right. The top few results are an unresponsive IP in Poland, IPs with PTR records pointing to subdomains of rscott.org (possibly related to <a href="http://rscott.org/dns/">http://rscott.org/dns/</a>?), an <a href="http://www.team-cymru.com/ReadingRoom/Whitepapers/2009/recursion.pdf">open-recursive nameserver</a> at a Russian ISP, a resolver for LeaseWeb, and an <a href="http://en.wikipedia.org/wiki/Message_transfer_agent">MTA</a> for WindStream Communications. Feel free to investigate <a href="http://dinaburg.org/data/bitsquat_analysis/ips_nomdn.txt">more on your own</a>.<br />
<br />
<h3 style="text-align: left;">
Source Country Frequency</h3>
<br />
<br />
To find the frequency of source countries, each address will be mapped to its origin country. Only unique addresses, not how many packets were received from each address, will be counted for the distribution. Some shell commands and the <span style="font-family: Courier New, Courier, monospace;">ip_to_city_country.py</span> script will identify the source countries. In the commands below, <span style="font-family: Courier New, Courier, monospace;">gcut</span>, the GNU version of <span style="font-family: Courier New, Courier, monospace;">cut</span>, is used since the default <span style="font-family: Courier New, Courier, monospace;">cut</span> on Mac OS X cannot handle non-ASCII characters.<br />
<div>
<br /></div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ awk '{print $2}' analysis/ips_all.txt | ./ip_to_city_country.py > analysis/ip_all_location_mapping.txt
$ gcut -f 2 -d ':' analysis/ip_all_location_mapping.txt | sort | uniq -c | sort -rn > analysis/all_country_frequency.txt
$ awk '{print $2}' analysis/ips_nomdn.txt | ./ip_to_city_country.py > analysis/ip_nomdn_location_mapping.txt
$ gcut -f 2 -d ':' analysis/ip_nomdn_location_mapping.txt | sort | uniq -c | sort -rn > analysis/nomdn_country_frequency.txt</pre>
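<br />
The <span style="font-family: Courier New, Courier, monospace;">gcut | sort | uniq -c | sort -rn</span> step can also be sketched in Python, which sidesteps the non-ASCII issue with <span style="font-family: Courier New, Courier, monospace;">cut</span>. This is an illustrative equivalent, not part of the original analysis; the function name <span style="font-family: Courier New, Courier, monospace;">country_frequency</span> and the sample lines (documentation-range addresses) are made up.<br />

```python
from collections import Counter


def country_frequency(mapping_lines):
    """Count mapping lines per country (field 2 of "ip:country:city")."""
    counts = Counter()
    for line in mapping_lines:
        fields = line.rstrip("\n").split(":")
        if len(fields) >= 2:
            counts[fields[1]] += 1
    # Most frequent country first, like `sort -rn`.
    return counts.most_common()


# Made-up sample input in the ip:country:city format.
sample = [
    "8.8.8.8:US:Mountain View",
    "192.0.2.1:US:",
    "198.51.100.7:DE:Berlin",
]
for country, count in country_frequency(sample):
    print(count, country)
```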
<br />
The all country frequency table (<a href="http://dinaburg.org/data/bitsquat_analysis/all_country_frequency.txt">all_country_frequency.txt</a>, 1.5KB, text) and the frequency table sans requests for <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span> (<a href="http://dinaburg.org/data/bitsquat_analysis/nomdn_country_frequency.txt">nomdn_country_frequency.txt</a>, 1.5KB, text) have very similar distributions; only the magnitude changes. This is easier to see in graph form:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJYTjbLHgyuBvkknfI3p9-7UmZTXBEghBr6JlsHva8tAggRJ85KKgQAzTRHD83r9-chvpzcOrExwyRFh9UAyBqiFDLhBvPfF9gvnSe5b1p54Xbj5F87zr2TlXq10uHdgwdiWVqqFEK54OY/s1600/test.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="Number of packets vs. source country ( all queries )" border="0" height="304" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJYTjbLHgyuBvkknfI3p9-7UmZTXBEghBr6JlsHva8tAggRJ85KKgQAzTRHD83r9-chvpzcOrExwyRFh9UAyBqiFDLhBvPfF9gvnSe5b1p54Xbj5F87zr2TlXq10uHdgwdiWVqqFEK54OY/s640/test.png" title="" width="640" /></a></div>
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5DYzpBy1F_iQKYesLtwxg2vQpANvTIIkkbEyW8Wh-02Z9ttXWgFzF-clJ31hne3Ptf-t-JlZErda5LVeHi1sHlm4YCPfF6SzbkXwex-2jqBvHx4r7LvmbaCOpqm4o6QcEctns1ZB2AcYi/s1600/test2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="Number of DNS Packets vs. Source Country (excluding 0mdn.net)" border="0" height="304" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5DYzpBy1F_iQKYesLtwxg2vQpANvTIIkkbEyW8Wh-02Z9ttXWgFzF-clJ31hne3Ptf-t-JlZErda5LVeHi1sHlm4YCPfF6SzbkXwex-2jqBvHx4r7LvmbaCOpqm4o6QcEctns1ZB2AcYi/s640/test2.png" title="" width="640" /></a></div>
<br />
<br />
The &lt;error&gt; field means the MaxMind GeoLite database did not have an entry for the particular IP address.<br />
<br />
The large number for the US is likely due to the US-centric nature of many of the domains I bitsquatted, such as <span style="font-family: Courier New, Courier, monospace;">fbcdn.net</span>, and the fact that the US simply has considerably <a href="http://dev.maxmind.com/ip-allocation">more IP allocations than other countries</a>. The extensive world coverage of bitsquatting queries is really quite amazing; there are queries from 192 of the 250 countries in the MaxMind database.<br />
<br />
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-1336028115739423002012-11-18T23:55:00.000-06:002012-11-23T20:02:20.970-06:00Bitsquatting PCAP Analysis Part 3: Bit-error distribution<div dir="ltr" style="text-align: left;" trbidi="on">
<style>
table.centered
{
border-collapse:collapse;
}
table.centered,th
{
border: 1px solid black;
padding-left: 4px;
padding-right: 4px;
}
table.centered
{
text-align: center;
margin-left: auto;
margin-right: auto;
}
th
{
background: #292929;
}
td
{
border: 1px solid black;
padding-left:3px;
padding-right:3px;
}
</style>
This is the third post in a multi-post series. The previous post is <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-2-query.html">here</a>.<br />
<br />
Which bits are more likely to be affected by bit-errors? What does the bit-error distribution look like? In this blog post, I will attempt to answer those questions by looking at bit-errors in the requested record type field of DNS queries.<br />
<br />
This post actually raises more questions than it answers: the bit-errors of the record type field are <b>not</b> distributed uniformly (the distribution one would expect from a random process), but instead mainly occur in bit 6 of the requested record type. I don't know why this is the case. I also don't know if this is only true for the record type field, or if this extends to the query name field as well. If you have any good suggestions, please <a href="http://dinaburg.org/about.html">contact me</a>.<br />
<br />
<h3 style="text-align: left;">
Bit-errors in the requested record type: A records</h3>
<br />
Astute readers will have noticed that in the <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-2-query.html">previous post</a> I didn't describe some of the top 15 requested record types. As a refresher, let's take another look at the top 15 most requested record types:<br />
<br />
<table class="centered">
<tbody>
<tr>
<th>Rank</th>
<th>Query Count</th>
<th>Record Type</th>
</tr>
<tr><td>1</td><td>550892</td><td>a</td></tr>
<tr><td>2</td><td>509605</td><td>aaaa</td></tr>
<tr><td>3</td><td>358926</td><td>mx</td></tr>
<tr><td>4</td><td>26829</td><td>any</td></tr>
<tr><td>5</td><td>25039</td><td>soa</td></tr>
<tr><td>6</td><td>7729</td><td>cname</td></tr>
<tr><td>7</td><td>4835</td><td>513</td></tr>
<tr><td>8</td><td>4728</td><td>ns</td></tr>
<tr><td>9</td><td>2597</td><td>txt</td></tr>
<tr><td>10</td><td>1148</td><td>srv</td></tr>
<tr><td>11</td><td>698</td><td>1025</td></tr>
<tr><td>12</td><td>232</td><td>257</td></tr>
<tr><td>13</td><td>222</td><td>a6</td></tr>
<tr><td>14</td><td>143</td><td>ptr</td></tr>
<tr><td>15</td><td>138</td><td>spf</td></tr>
</tbody></table>
<br />
The 7th most popular record type is 513. Type 513 is not mentioned in the <a href="http://en.wikipedia.org/wiki/List_of_DNS_record_types">Wikipedia list of record types</a>, and it is not in <a href="http://anonsvn.wireshark.org/wireshark/releases/wireshark-1.8.3/epan/dissectors/packet-dns.c">Wireshark's record type list</a>. Why are there 4835 requests for an undefined record type?<br />
<br />
The answer is clearer when we look at 513 in binary (zero-extended to 16 bits):<br />
<br />
<div style="text-align: center;">
<span style="font-family: Courier New, Courier, monospace;"><b>0000 0010 0000 0001</b></span></div>
<br />
This value is only one bit away from 1, the A record request type. Other requested record types in the top 15 share this similarity: type 1025 and type 257 are both one bit away from type 1. In the full query types table there are other requests with this property, such as requests for type 65, 2049, 16385.<br />
<br />
The requested record types one bit away from type 1, including binary representation and how often they were requested, are represented below:<br />
<br />
<table class="centered">
<tbody>
<tr><th>Bit Flipped</th><th>Binary Value</th><th>RR Type</th><th>Count</th><th>Unique Count</th><th>Note</th></tr>
<tr><td>0</td><td>1000 0000 0000 0001</td><td>32769</td><td>0</td><td>0</td><td></td></tr>
<tr><td>1</td><td>0100 0000 0000 0001</td><td>16385</td><td>5</td><td>2</td><td></td></tr>
<tr><td>2</td><td>0010 0000 0000 0001</td><td>8193</td><td>0</td><td>0</td><td></td></tr>
<tr><td>3</td><td>0001 0000 0000 0001</td><td>4097</td><td>0</td><td>0</td><td></td></tr>
<tr><td>4</td><td>0000 1000 0000 0001</td><td>2049</td><td>22</td><td>7</td><td></td></tr>
<tr><td>5</td><td>0000 0100 0000 0001</td><td>1025</td><td>698</td><td>25</td><td></td></tr>
<tr><td>6</td><td>0000 0010 0000 0001</td><td>513</td><td>4835</td><td>142</td><td></td></tr>
<tr><td>7</td><td>0000 0001 0000 0001</td><td>257</td><td>232</td><td>50</td><td></td></tr>
<tr><td>8</td><td>0000 0000 1000 0001</td><td>129</td><td>0</td><td>0</td><td></td></tr>
<tr><td>9</td><td>0000 0000 0100 0001</td><td>65</td><td>128</td><td>37</td><td></td></tr>
<tr><td>10</td><td>0000 0000 0010 0001</td><td>33</td><td></td><td></td><td>overlaps SRV</td></tr>
<tr><td>11</td><td>0000 0000 0001 0001</td><td>17</td><td>2</td><td>1</td><td>overlaps RP</td></tr>
<tr><td>12</td><td>0000 0000 0000 1001</td><td>9</td><td>0</td><td>0</td><td>overlaps MR</td></tr>
<tr><td>13</td><td>0000 0000 0000 0101</td><td>5</td><td></td><td></td><td>overlaps CNAME</td></tr>
<tr><td>14</td><td>0000 0000 0000 0011</td><td>3</td><td>0</td><td>0</td><td>overlaps MD</td></tr>
<tr><td>15</td><td>0000 0000 0000 0000</td><td>0</td><td>2</td><td>1</td><td></td></tr>
</tbody></table>
<br />
Note: some entries are blank due to overlap with other popular record types. Unpopular/deprecated record types, such as RP, were included in the count for bit errors. All query type overlaps are noted in the notes column.<br />
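<br />
The Binary Value and RR Type columns in the table above can be regenerated mechanically. The sketch below is illustrative only; the names <span style="font-family: Courier New, Courier, monospace;">A_TYPE</span>, <span style="font-family: Courier New, Courier, monospace;">WIDTH</span>, and <span style="font-family: Courier New, Courier, monospace;">single_bit_flips</span> are made up. It follows the table's convention of numbering bit 0 as the most significant bit of the 16-bit query type field.<br />

```python
A_TYPE = 1   # DNS A record type
WIDTH = 16   # the DNS query type field is 16 bits wide


def single_bit_flips(value, width=WIDTH):
    """Return (bit, flipped_value) pairs, with bit 0 as the most significant bit."""
    flips = []
    for bit in range(width):
        mask = 1 << (width - 1 - bit)
        flips.append((bit, value ^ mask))
    return flips


for bit, flipped in single_bit_flips(A_TYPE):
    digits = format(flipped, "016b")
    # Group the 16 binary digits in fours, as the table does.
    grouped = " ".join(digits[i:i + 4] for i in range(0, WIDTH, 4))
    print(bit, grouped, flipped)
```

Passing 28 (the AAAA record type) instead of <span style="font-family: Courier New, Courier, monospace;">A_TYPE</span> produces the values for AAAA queries.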
<br />
The count column is how often each record type was requested. The unique count column is how many unique source IPs requested each record type. This was done to minimize the effect of one bit-error repeatedly manifesting itself via many repeated requests.<br />
<br />
A visualization of the unique count column:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUjpsB1Q03cE8kSiw5y_0MwW8AzGsHEX7Z9CC1RDSgRunxEs4dqwpFektPzE9EWEITQ5DEcts4moHXwaOiS6WCQbU5TqGCDNAKNwZ3s46vvTa0Zm3QXhwIqiE6TANUZ4ocUBl5MMV01Pcs/s1600/graphs.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUjpsB1Q03cE8kSiw5y_0MwW8AzGsHEX7Z9CC1RDSgRunxEs4dqwpFektPzE9EWEITQ5DEcts4moHXwaOiS6WCQbU5TqGCDNAKNwZ3s46vvTa0Zm3QXhwIqiE6TANUZ4ocUBl5MMV01Pcs/s1600/graphs.png" /></a></div>
<br />
To obtain the unique count column, first we must get all the unique (source IP, query type) pairs (and disregard any queries for <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span>):<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -R '!(dns.qry.name contains 0mdn.net)' -o column.format:'"SOURCE", "%s", "QTYPE", "%Cus:dns.qry.type"' | sort -u > analysis/src_and_qtype.txt</pre>
<br />
<br />
After getting the (source IP, query type) pairs, a bash for loop can show us how many unique source IPs requested a certain record type.<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ for qt in 32769 16385 8193 4097 2049 1025 513 257 129 65 SRV RP MR CNAME MD unused; do echo "$qt:" `grep -i " $qt$" analysis/src_and_qtype.txt | wc -l`; done
32769: 0
16385: 2
8193: 0
4097: 0
2049: 7
1025: 25
513: 142
257: 50
129: 0
65: 37
SRV: 80
RP: 1
MR: 0
CNAME: 635
MD: 0
unused: 1</pre>
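<br />
The same tally can be sketched in Python instead of a shell loop. This is an illustrative alternative rather than the method used for the numbers above; it assumes each line of <span style="font-family: Courier New, Courier, monospace;">src_and_qtype.txt</span> holds a source IP and a query type separated by whitespace, and the function name and sample lines are made up.<br />

```python
from collections import defaultdict


def unique_sources_per_qtype(pair_lines):
    """Map each query type (lowercased) to its number of distinct source IPs."""
    sources = defaultdict(set)
    for line in pair_lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            src, qtype = parts
            sources[qtype.lower()].add(src)
    return {qtype: len(ips) for qtype, ips in sources.items()}


# Made-up sample pairs using documentation-range addresses.
sample_pairs = [
    "192.0.2.1 513",
    "192.0.2.1 513",      # a repeated pair counts once
    "198.51.100.7 513",
    "198.51.100.7 CNAME",
]
print(unique_sources_per_qtype(sample_pairs))
```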
<br />
Counting only record requests by unique source IP address shows that the same error-prone query is repeated many times from the same source, but the overall distribution stays the same.
<br />
<br />
There has been much speculation about bit-error distribution and whether any bits are more likely to be affected. Judging by bit-errors of the query type field, some bits are <b>considerably</b> more likely to be affected: bit 6 accounts for the vast majority of bit-errors, with the error rate dropping sharply with distance from bit 6. This distribution is evident in the query type field; I have not verified whether it still holds in the query name field.<br />
<br />
I don't know why the distribution is as skewed as it is. Maybe the distribution is an artifact of the query type field and typical allocation alignments? Other thoughts and ideas are welcome.<br />
<br />
<h3 style="text-align: left;">
Bit-errors in the requested record type: AAAA records</h3>
<br />
There are nearly as many AAAA record requests as there are A record requests. Do bit-errors of AAAA requests exhibit the same distribution?<br />
<br />
<table class="centered">
<tbody>
<tr><th>Bit Flipped</th><th>Binary Value</th><th>Value</th><th>Count</th><th>Unique Count</th><th>Note</th></tr>
<tr><td>0</td><td>1000 0000 0001 1100</td><td>32796</td><td>0</td><td>0</td><td></td></tr>
<tr><td>1</td><td>0100 0000 0001 1100</td><td>16412</td><td>0</td><td>0</td><td></td></tr>
<tr><td>2</td><td>0010 0000 0001 1100</td><td>8220</td><td>0</td><td>0</td><td></td></tr>
<tr><td>3</td><td>0001 0000 0001 1100</td><td>4124</td><td>0</td><td>0</td><td></td></tr>
<tr><td>4</td><td>0000 1000 0001 1100</td><td>2076</td><td>0</td><td>0</td><td></td></tr>
<tr><td>5</td><td>0000 0100 0001 1100</td><td>1052</td><td>0</td><td>0</td><td></td></tr>
<tr><td>6</td><td>0000 0010 0001 1100</td><td>540</td><td>4</td><td>1</td><td></td></tr>
<tr><td>7</td><td>0000 0001 0001 1100</td><td>284</td><td>4</td><td>1</td><td></td></tr>
<tr><td>8</td><td>0000 0000 1001 1100</td><td>156</td><td>0</td><td>0</td><td></td></tr>
<tr><td>9</td><td>0000 0000 0101 1100</td><td>92</td><td>0</td><td>0</td><td></td></tr>
<tr><td>10</td><td>0000 0000 0011 1100</td><td>60</td><td>0</td><td>0</td><td></td></tr>
<tr><td>11</td><td>0000 0000 0000 1100</td><td>12</td><td></td><td></td><td>overlaps PTR</td></tr>
<tr><td>12</td><td>0000 0000 0001 0100</td><td>20</td><td>0</td><td>0</td><td></td></tr>
<tr><td>13</td><td>0000 0000 0001 1000</td><td>24</td><td>0</td><td>0</td><td></td></tr>
<tr><td>14</td><td>0000 0000 0001 1110</td><td>30</td><td>0</td><td>0</td><td></td></tr>
<tr><td>15</td><td>0000 0000 0001 1101</td><td>29</td><td>0</td><td>0</td><td></td></tr>
</tbody></table>
<br />
Despite there being a nearly identical number of queries for each record type (when excluding queries for <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span>), there are almost no bit-errors for AAAA record queries. The errors that do exist, though, correspond to errors in bit 6 and bit 7. Some of the discrepancy between the number of bit-errors in A and AAAA queries can be explained by the fact that there are simply fewer sources of AAAA queries:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -R '(dns.qry.type == AAAA) and !(dns.qry.name contains 0mdn.net)' -o column.format:'"SOURCE", "%s"' | sort -u > analysis/aaaa_sources.txt
$ tshark -n -r completelog.pcap -R '(dns.qry.type == A) and !(dns.qry.name contains 0mdn.net)' -o column.format:'"SOURCE", "%s"' | sort -u > analysis/a_sources.txt
$ wc -l analysis/aaaa_sources.txt analysis/a_sources.txt
7206 analysis/aaaa_sources.txt
29833 analysis/a_sources.txt</pre>
<br />
There are only ~24% as many sources of AAAA requests as there are of A requests. Still, this would only account for ~76% of the difference in error rate. <br />
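<br />
One possible reading of these percentages, offered as a sketch rather than the exact calculation behind them: if bit-errors scaled with the number of sources, AAAA queries would show about 24% as many unique-source errors as A queries. The snippet below sums the Unique Count columns of the two tables above (including the RP and type 0 rows, which is an assumption) and compares that expectation with what was observed. All variable names are made up.<br />

```python
aaaa_sources = 7206    # lines in analysis/aaaa_sources.txt
a_sources = 29833      # lines in analysis/a_sources.txt

source_ratio = float(aaaa_sources) / a_sources

# Unique Count columns from the A and AAAA tables above, summed.
a_errors = 2 + 7 + 25 + 142 + 50 + 37 + 1 + 1
aaaa_errors = 1 + 1

# Errors AAAA would show if errors were proportional to source count.
expected_aaaa_errors = a_errors * source_ratio

# Share of the A-vs-AAAA gap that the smaller source count explains.
explained = (a_errors - expected_aaaa_errors) / (a_errors - aaaa_errors)

print("source ratio: %.3f" % source_ratio)
print("expected AAAA errors: %.1f" % expected_aaaa_errors)
print("share of gap explained: %.3f" % explained)
```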
<br />
<h3 style="text-align: left;">
Conclusion</h3>
<br />
The bit-error distribution, at least with respect to the requested record type field, is not uniform. It is centered at bit 6 and sharply falls off with distance from bit 6. I don't have an explanation as to why, but I suspect it might have to do with packet alignment in memory. Other possibilities include errant networking equipment or software somewhere on the Internet. Any ideas and suggestions, especially testable ones, are most welcome.<br />
<br />
There are also more bit-errors in A record requests than in AAAA record requests. The fact that there are fewer sources of AAAA requests accounts for part of this discrepancy, but does not completely eliminate it.<br />
<br />
If you have any insight, please <a href="http://dinaburg.org/about.html">contact me.</a><br />
<br />
Update:<br />
Part 4 is now up, <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-4.html">Bitsquatting PCAP Analysis Part 4: Source Country Distribution</a>.<br />
<br /></div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-79223695208487076752012-11-14T23:08:00.002-06:002012-11-18T23:58:29.188-06:00Bitsquatting PCAP Analysis Part 2: Query Types, IPv6<div dir="ltr" style="text-align: left;" trbidi="on">
<style>
table.centered
{
border-collapse:collapse;
}
table.centered,th
{
border: 1px solid black;
padding-left: 4px;
padding-right: 4px;
}
table.centered
{
text-align: center;
margin-left: auto;
margin-right: auto;
}
th
{
background: #292929;
}
td
{
border: 1px solid black;
padding-left:3px;
padding-right:3px;
}
</style>
<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
This is the second post in a multi-part series. The previous post is <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-1.html">here</a>.<br />
<br />
In this installment of Bitsquatting PCAP analysis we will make an educated guess about the prevalence of IPv6 on the Internet, which services DNS is used for, and identify some mysteries in the bitsquatting PCAPs.<br />
<br />
All of this information is going to come from just one field: the requested record type of each DNS query.<br />
<br />
<h3 style="text-align: left;">
Background</h3>
<br />
First, some background on DNS record types. DNS is essentially a distributed hierarchical database. Values are retrieved by specifying a location and a record type. The location is a fully qualified domain name. The record type is one of <a href="http://en.wikipedia.org/wiki/List_of_DNS_record_types">several defined record types</a>. The most commonly requested record type is A, which means IPv4 address. When you are using IPv4 and translate <span style="font-family: Courier New, Courier, monospace;">www.google.com</span> to an IP address, you are retrieving the A record for <span style="font-family: Courier New, Courier, monospace;">www.google.com</span>.<br />
<br />
The <span style="font-family: Courier New, Courier, monospace;">dig</span> command is used to manually query for DNS records. The following command will retrieve the A record for <span style="font-family: Courier New, Courier, monospace;">www.google.com</span>:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ dig +short www.google.com a
173.194.75.99
173.194.75.147
173.194.75.104
173.194.75.103
173.194.75.105
173.194.75.106</pre>
<br />
The above command says: ask my local name server (usually specified in <span style="font-family: Courier New, Courier, monospace;">/etc/resolv.conf</span>) for the A record for <span style="font-family: Courier New, Courier, monospace;">www.google.com</span>, and output the result in <span style="font-family: Courier New, Courier, monospace;">short</span> form. Note: the IP addresses returned for you will likely be different. Google attempts to direct you to a physically closer server based on the <a href="http://www.maxmind.com/en/geolocation_landing">geo-ip location</a> of the requesting DNS server. <a href="http://www.cs.brown.edu/courses/csci2950-u/papers/CDN-measuring-IMC08-huang.pdf">This is one part of how most content delivery networks work</a>. More in a future blog post.<br />
<br />
One more common record type is AAAA, which is used to retrieve IPv6 addresses. Why is the record type called AAAA? Because IPv4 addresses are 32 bits wide, and IPv6 addresses are 128 bits wide. If A is 32-bit, then AAAA would be 32+32+32+32=128-bit. Interestingly, there used to be another record type for retrieving IPv6 addresses, <a href="http://tools.ietf.org/html/rfc2874">A6</a>, that has <a href="http://tools.ietf.org/html/rfc6563">since been deprecated</a>. Even if you are using IPv4, you can still retrieve the AAAA record of <span style="font-family: Courier New, Courier, monospace;">www.google.com</span>:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ dig +short www.google.com aaaa
2607:f8b0:400c:c01::67
</pre>
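<br />
The 32-bit versus 128-bit arithmetic behind the AAAA name can be checked with Python's standard <span style="font-family: Courier New, Courier, monospace;">ipaddress</span> module. This is just an illustrative sketch using one address from each <span style="font-family: Courier New, Courier, monospace;">dig</span> output above; the variable names are made up.<br />

```python
import ipaddress

v4 = ipaddress.ip_address("173.194.75.99")
v6 = ipaddress.ip_address("2607:f8b0:400c:c01::67")

# max_prefixlen is the address width in bits for each IP version.
print(v4.version, v4.max_prefixlen)
print(v6.version, v6.max_prefixlen)

# Number of 32-bit "A"-sized pieces that fit in one IPv6 address.
a_count = v6.max_prefixlen // v4.max_prefixlen
print("A" * a_count)
```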
<br />
<h3 style="text-align: left;">
What is DNS used for?</h3>
<br />
By tallying the frequency of requested record types, we can determine the popularity of DNS uses. The requested record type is specified by the query type field of each DNS request. We can retrieve the query type from each packet using <span style="font-family: Courier New, Courier, monospace;">tshark</span>. Let's get a list of all requested record types, and how often each record type was requested:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -o column.format:'"QTYPE", "%Cus:dns.qry.type"' | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn > analysis/all_qtypes.txt</pre>
<br />
The full record type frequency table is available: (<a href="http://dinaburg.org/data/bitsquat_analysis/all_qtypes.txt">all_qtypes.txt</a>, 408B, text).<br />
<br />
The table below shows the top 15 requested record types. Amazingly, the most requested DNS record type is IPv6 address resolution! Considering that <a href="http://www.sput.nl/internet/ipv6/stats/">other places</a> measure IPv6 DNS traffic at only 15% of web traffic, something is definitely amiss. More on this after the discussion of DNS use.<br />
<br />
<table class="centered">
<tbody>
<tr>
<th>Rank</th>
<th>Query Count</th>
<th>Record Type</th>
</tr>
<tr><td>1</td><td>2050660</td><td>aaaa</td></tr>
<tr><td>2</td><td>1132372</td><td>a</td></tr>
<tr><td>3</td><td>359779</td><td>mx</td></tr>
<tr><td>4</td><td>47335</td><td>a6</td></tr>
<tr><td>5</td><td>38404</td><td>any</td></tr>
<tr><td>6</td><td>25954</td><td>soa</td></tr>
<tr><td>7</td><td>8155</td><td>cname</td></tr>
<tr><td>8</td><td>5130</td><td>ns</td></tr>
<tr><td>9</td><td>4835</td><td>513</td></tr>
<tr><td>10</td><td>2622</td><td>txt</td></tr>
<tr><td>11</td><td>1149</td><td>srv</td></tr>
<tr><td>12</td><td>698</td><td>1025</td></tr>
<tr><td>13</td><td>232</td><td>257</td></tr>
<tr><td>14</td><td>144</td><td>ptr</td></tr>
<tr><td>15</td><td>141</td><td>spf</td></tr>
</tbody></table>
<div>
<br /></div>
<br />
Name resolution is by far the most popular use of DNS. Name resolution is responsible for the first, second, fourth, and seventh most frequently requested record types (AAAA, A, A6, and CNAME). Amazingly, there is a very high frequency of requests for deprecated A6 records. Can there really be that many old BIND servers out there?<br />
<br />
The second most popular use of DNS is for email related services. The third most requested record type is MX, which is used for determining the incoming mail servers for a domain. MX records can be viewed from the command line as well:
<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ dig +short gmail.com mx
10 alt1.gmail-smtp-in.l.google.com.
30 alt3.gmail-smtp-in.l.google.com.
5 gmail-smtp-in.l.google.com.
20 alt2.gmail-smtp-in.l.google.com.
40 alt4.gmail-smtp-in.l.google.com.</pre>
<br />
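The number in front of each mail server is its preference value, and senders try the lowest value first. As a small sketch, the helper below (<span style="font-family: Courier New, Courier, monospace;">sort_mx</span> is a name made up for this example, not part of <span style="font-family: Courier New, Courier, monospace;">dig</span>) sorts the output shown above into the order a sending server would attempt delivery:<br />
<br />
```shell
#!/bin/sh
# sort_mx: hypothetical helper that orders "preference host" lines
# (the format printed by `dig +short <domain> mx`) by ascending preference.
sort_mx() {
    sort -n -k1,1
}

# The sample lines are copied from the dig output above.
printf '%s\n' \
    '10 alt1.gmail-smtp-in.l.google.com.' \
    '30 alt3.gmail-smtp-in.l.google.com.' \
    '5 gmail-smtp-in.l.google.com.' \
    '20 alt2.gmail-smtp-in.l.google.com.' \
    '40 alt4.gmail-smtp-in.l.google.com.' | sort_mx
```
<br />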
Along with MX, the other records commonly used for email are TXT (to hold <a href="http://en.wikipedia.org/wiki/Sender_Policy_Framework">SPF</a> and <a href="http://en.wikipedia.org/wiki/DKIM">DKIM</a> data) which is the tenth most frequently requested, and SPF (used for SPF data) which is the fifteenth most frequent.<br />
<br />
The fifth, sixth, and eighth most frequently requested record types are all used for DNS infrastructure purposes. The ANY record type simply retrieves all available records, the SOA record type specifies who is the primary source for information about the domain, and the NS type specifies nameservers that can be used to answer queries about the domain.<br />
<br />
The next most commonly requested record type, SRV, is used for custom protocol related records. In practice, most SRV queries are used to retrieve information for <a href="http://blog.nominet.org.uk/tech/2006/11/21/srv-records/">Jabber/XMPP and other messaging services</a>, including <a href="http://www.onsip.com/about-voip/sip/dns-srv-records-sip">VoIP</a>/<a href="http://technet.microsoft.com/en-us/library/gg398680.aspx">Videoconferencing</a> services.<br />
<br />
Finally, PTR records are used for reverse DNS lookups. A reverse lookup is performed when you want to map an IP address to a domain name. This is one of the few times (maybe the only time?) when you will encounter the <a href="http://en.wikipedia.org/wiki/.arpa">.arpa TLD</a>. ARPA originally stood for the Advanced Research Projects Agency, the US Government agency that funded the creation of the Internet. These days .arpa has been <a href="http://en.wikipedia.org/wiki/Backronym">backronymed</a> to Address and Routing Parameter Area, and what used to be ARPA is now <a href="http://www.darpa.mil/">DARPA</a>.<br />
<br />
To request a PTR record for an IPv4 address, the octets of the IP are reversed, and <span style="font-family: Courier New, Courier, monospace;">.in-addr.arpa</span> is appended. This is because IP addresses are hierarchical from left to right but DNS is hierarchical from right to left. For example, to see what domain <span style="font-family: 'Courier New', Courier, monospace;">173.194.75.99</span><span style="font-family: inherit;"> (one of the IPs for </span><span style="font-family: Courier New, Courier, monospace;">www.google.com</span><span style="font-family: inherit;">) corresponds to, we would use the following command:</span>
<br />
<span style="font-family: inherit;"><br /></span>
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ dig +short 99.75.194.173.in-addr.arpa ptr
ve-in-f99.1e100.net.
</pre>
<br />
The returned domain is not <span style="font-family: Courier New, Courier, monospace;">www.google.com</span>, but this is due to Google's infrastructure. There is a clever easter egg in the domain: 1e100 means 1.0 × 10<sup>100</sup>, which is one <a href="http://www.clear.rice.edu/comp280/05spring/Lectures/lect-extra-googol.shtml">googol</a>.<br />
<br />
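The octet reversal used for PTR lookups is easy to get wrong by hand, so here is a minimal sketch of it as a shell function. The name <span style="font-family: Courier New, Courier, monospace;">ptr_name</span> is made up for this example, and the function only handles dotted-quad IPv4 addresses:<br />
<br />
```shell
#!/bin/sh
# ptr_name: hypothetical helper that turns a dotted-quad IPv4 address into
# the name queried for its PTR record (octets reversed, .in-addr.arpa added).
ptr_name() {
    printf '%s\n' "$1" |
        awk -F. 'NF == 4 { printf "%s.%s.%s.%s.in-addr.arpa\n", $4, $3, $2, $1 }'
}

# Same address as the dig example above.
ptr_name 173.194.75.99
```
<br />
The result can be passed straight to <span style="font-family: Courier New, Courier, monospace;">dig +short ... ptr</span>.<br />
<br />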
<h3 style="text-align: left;">
What can we learn about the prevalence of IPv6?</h3>
<br />
Before we jump to conclusions about IPv6, we should remember that there are outliers in the bitsquatting PCAPs. If you recall from the <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-1.html">previous post</a>, there were numerous queries for <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span> because that domain was an authoritative name server. Queries for <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span> might be affecting the record type distribution. Let's filter out these queries:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -R '!(dns.qry.name contains 0mdn.net)' -o column.format:'"QTYPE", "%Cus:dns.qry.type"' | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn > analysis/nomdn_qtypes.txt</pre>
<br />
The full list of record types and their frequencies is available: (<a href="http://dinaburg.org/data/bitsquat_analysis/nomdn_qtypes.txt">nomdn_qtypes.txt</a>, 379B, text).<br />
<br />
This command works using the <span style="font-family: Courier New, Courier, monospace;">-R</span> option of <span style="font-family: Courier New, Courier, monospace;">tshark</span>. The <span style="font-family: Courier New, Courier, monospace;">-R</span> option specifies a <a href="http://wiki.wireshark.org/DisplayFilters"><span style="font-family: Courier New, Courier, monospace;">wireshark</span> display filter</a> that is applied when reading PCAPs. (In <span style="font-family: Courier New, Courier, monospace;">tshark</span> 1.10 and later, <span style="font-family: Courier New, Courier, monospace;">-R</span> must be combined with <span style="font-family: Courier New, Courier, monospace;">-2</span>; the single-pass equivalent is <span style="font-family: Courier New, Courier, monospace;">-Y</span>.) The filter of <span style="font-family: Courier New, Courier, monospace;">!(dns.qry.name contains 0mdn.net)</span> will match all packets where the query name field does not contain <span style="font-family: Courier New, Courier, monospace;">0mdn.net</span>. Let's examine the new results:<br />
<br />
<table class="centered">
<tbody>
<tr>
<th>Rank</th>
<th>Query Count</th>
<th>Record Type</th>
</tr>
<tr><td>1</td><td>550892</td><td>a</td></tr>
<tr><td>2</td><td>509605</td><td>aaaa</td></tr>
<tr><td>3</td><td>358926</td><td>mx</td></tr>
<tr><td>4</td><td>26829</td><td>any</td></tr>
<tr><td>5</td><td>25039</td><td>soa</td></tr>
<tr><td>6</td><td>7729</td><td>cname</td></tr>
<tr><td>7</td><td>4835</td><td>513</td></tr>
<tr><td>8</td><td>4728</td><td>ns</td></tr>
<tr><td>9</td><td>2597</td><td>txt</td></tr>
<tr><td>10</td><td>1148</td><td>srv</td></tr>
<tr><td>11</td><td>698</td><td>1025</td></tr>
<tr><td>12</td><td>232</td><td>257</td></tr>
<tr><td>13</td><td>222</td><td>a6</td></tr>
<tr><td>14</td><td>143</td><td>ptr</td></tr>
<tr><td>15</td><td>138</td><td>spf</td></tr>
</tbody></table>
<br />
The new table paints a much different picture with regard to IPv6, but there is still a large number of AAAA record requests.<br />
<br />
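To put a number on the difference between the two tables, here is a small sketch that computes the share of address lookups (A plus AAAA) that were AAAA requests. The function name <span style="font-family: Courier New, Courier, monospace;">aaaa_share</span> is made up for this example; the counts are taken from the two tables above:<br />
<br />
```shell
#!/bin/sh
# aaaa_share: hypothetical helper.
#   $1 = number of A queries, $2 = number of AAAA queries
# Prints AAAA queries as a percentage of all address (A + AAAA) queries.
aaaa_share() {
    awk -v a="$1" -v aaaa="$2" 'BEGIN {
        total = a + aaaa
        if (total > 0) printf "%.2f\n", aaaa / total * 100
    }'
}

aaaa_share 1132372 2050660   # full capture (first table)
aaaa_share 550892 509605     # 0mdn.net queries excluded (second table)
```
<br />
With the name server queries excluded, AAAA requests drop from roughly two thirds of address lookups to a little under half.<br />
<br />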
Lesson Learned: There are enough AAAA record requests to indicate IPv6 connectivity is important. If you are attempting to re-do the bitsquatting experiment, have IPv6 connectivity and answer AAAA requests!<br />
<br />
<h3 style="text-align: left;">
What is the nature of IPv6 traffic (AAAA record requests)?</h3>
<br />
Why were there so many AAAA record requests for the authoritative nameservers, and how do these compare to other domains? Let's use <span style="font-family: Courier New, Courier, monospace;">tshark</span> to retrieve all AAAA record requests, along with the domain each request was for:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -R '(dns.qry.type == AAAA)' -o column.format:'"QTYPE", "%Cus:dns.qry.name"' | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn > analysis/AAAA_queries.txt</pre>
<br />
The full list of AAAA query frequencies is available: (<a href="http://dinaburg.org/data/bitsquat_analysis/AAAA_queries.txt">AAAA_queries.txt</a>, 17KB, text).<br />
<br />
<table class="centered">
<tbody>
<tr>
<th>AAAA Queries</th>
<th>Domain</th>
</tr>
<tr><td>794921</td><td>ns2.0mdn.net</td></tr>
<tr><td>774496</td><td>ns1.0mdn.net</td></tr>
<tr><td>77181</td><td>static.ak.dbcdn.net</td></tr>
<tr><td>77053</td><td>support.doublechick.net</td></tr>
<tr><td>66595</td><td>gmaml.com</td></tr>
<tr><td>58634</td><td>g.mic2osoft.com</td></tr>
<tr><td>28107</td><td>s0.0mdn.net</td></tr>
<tr><td>16327</td><td>www.amazgn.com</td></tr>
<tr><td>13401</td><td>mail.gmaml.com</td></tr>
<tr><td>6367</td><td>www.micro3oft.com</td></tr>
<tr><td>5678</td><td>amazgn.com</td></tr>
<tr><td>4924</td><td>www.mic2osoft.com</td></tr>
<tr><td>4789</td><td>www.eicrosoft.com</td></tr>
<tr><td>4578</td><td>pop.gmaml.com</td></tr>
<tr><td>4346</td><td>static.ak.fbgdn.net</td></tr>
</tbody></table>
<br />
<br />
The two authoritative name servers receive the most AAAA requests, but there are other domains with numerous IPv6 lookups. Maybe these domains are just popular?<br />
<br />
<h3 style="text-align: left;">
Ratio of IPv4 to IPv6 address lookups</h3>
<h4 style="text-align: left;">
<span style="font-weight: normal;">The ratio of IPv4 address resolutions to IPv6 address resolutions will show the proportion of IPv6 traffic for each domain. This measurement should completely disregard popularity, as it uses ratios instead of absolute numbers. My hypothesis was that the ratios should be approximately the same for all domains, as none of the domains I bitsquatted were IPv6 related. Let's calculate the ratios. </span></h4>
<h4 style="text-align: left;">
Step 1: Calculate A record frequency</h4>
The following command will tabulate the frequency of A record requests for each domain:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -R '(dns.qry.type == A)' -o column.format:'"QTYPE", "%Cus:dns.qry.name"' | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn > analysis/A_queries.txt</pre>
<br />
The full list of A query frequencies is available: (<a href="http://dinaburg.org/data/bitsquat_analysis/A_queries.txt">A_queries.txt</a>, 99KB, text).<br />
<br />
<h4 style="text-align: left;">
Step 2: Massage Data</h4>
The following commands will prepare both the A record frequency and AAAA record frequency tables to be joined on the domain name field.<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ sort -f -k2 analysis/A_queries.txt > a_q_for_join.txt
$ sort -f -k2 analysis/AAAA_queries.txt > aaaa_q_for_join.txt</pre>
<br />
<h4 style="text-align: left;">
Step 3: Calculate the ratio of A to AAAA record requests</h4>
Amazingly, the POSIX standard specifies a relational join command that operates on specially delimited text files. The <span style="font-family: Courier New, Courier, monospace;">join</span> command below will join the first file on the second field (<span style="font-family: Courier New, Courier, monospace;">-1 2</span>), with the second file also on the second field (<span style="font-family: Courier New, Courier, monospace;">-2 2</span>). The second field of both files is the domain name. The output of <span style="font-family: Courier New, Courier, monospace;">join</span> is then piped to <span style="font-family: Courier New, Courier, monospace;">awk</span> to calculate the ratio of A to AAAA record requests.<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ join -1 2 -2 2 a_q_for_join.txt aaaa_q_for_join.txt | awk '{printf "%d\t%2.2f\t%s\n", $2+$3, $2/$3, $1}' | sort -rn >analysis/ratio_of_a_to_aaaa.txt</pre>
<br />
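To make the join step easier to follow, here is a self-contained sketch that runs the same pipeline on a few made-up rows. The domains and counts are hypothetical (they are not from the bitsquatting data), and the <span style="font-family: Courier New, Courier, monospace;">$3 > 0</span> guard is an addition to avoid dividing by zero:<br />
<br />
```shell
#!/bin/sh
# ratio_demo: hypothetical demo of joining two "count domain" tables on the
# domain column and printing "total<TAB>A:AAAA ratio<TAB>domain".
ratio_demo() {
    tmp=$(mktemp -d) || return 1

    # Made-up A query counts, sorted on the join field (column 2).
    printf '%s\n' '30 example.test' '5 mail.example.test' '12 www.example.test' |
        LC_ALL=C sort -k2,2 > "$tmp/a.txt"

    # Made-up AAAA query counts, sorted the same way.
    printf '%s\n' '10 example.test' '20 cdn.example.test' '8 www.example.test' |
        LC_ALL=C sort -k2,2 > "$tmp/aaaa.txt"

    # join prints: domain, A count, AAAA count.
    LC_ALL=C join -1 2 -2 2 "$tmp/a.txt" "$tmp/aaaa.txt" |
        awk '$3 > 0 { printf "%d\t%.2f\t%s\n", $2 + $3, $2 / $3, $1 }' |
        sort -rn

    rm -rf "$tmp"
}

ratio_demo
```
<br />
Note that <span style="font-family: Courier New, Courier, monospace;">join</span> performs an inner join by default: in the sketch, <span style="font-family: Courier New, Courier, monospace;">mail.example.test</span> and <span style="font-family: Courier New, Courier, monospace;">cdn.example.test</span> appear in only one input and are dropped. The same applies to the real data, so domains that received only A or only AAAA queries do not appear in the ratio list.<br />
<br />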
The full list of A:AAAA ratios is available: (<a href="http://dinaburg.org/data/bitsquat_analysis/ratio_of_a_to_aaaa.txt">ratio_of_a_to_aaaa.txt</a>, 18KB, text).<br />
<br />
<table class="centered">
<tbody>
<tr>
<th>Total Query Count</th>
<th>A to AAAA Query Ratio</th>
<th>Domain</th>
</tr>
<tr><td>1095763</td><td>0.41</td><td>ns1.0mdn.net</td></tr>
<tr><td>1072642</td><td>0.35</td><td>ns2.0mdn.net</td></tr>
<tr><td>93208</td><td>0.40</td><td>gmaml.com</td></tr>
<tr><td>80862</td><td>0.05</td><td>static.ak.dbcdn.net</td></tr>
<tr><td>77147</td><td>0.00</td><td>support.doublechick.net</td></tr>
<tr><td>70140</td><td>4.23</td><td>mail.gmaml.com</td></tr>
<tr><td>59500</td><td>0.01</td><td>g.mic2osoft.com</td></tr>
<tr><td>53969</td><td>2.31</td><td>www.amazgn.com</td></tr>
<tr><td>43270</td><td>6.62</td><td>amazgn.com</td></tr>
<tr><td>28694</td><td>0.02</td><td>s0.0mdn.net</td></tr>
<tr><td>20575</td><td>8.63</td><td>micro3oft.com</td></tr>
<tr><td>13585</td><td>9.32</td><td>miarosoft.com</td></tr>
<tr><td>12175</td><td>0.91</td><td>www.micro3oft.com</td></tr>
<tr><td>10762</td><td>1.19</td><td>www.mic2osoft.com</td></tr>
<tr><td>9032</td><td>26.62</td><td>u2s.micro3oft.com</td></tr>
</tbody>
</table>
<br />
Different domains exhibit wildly different ratios of IPv4 to IPv6 lookups! Some actually have more IPv6 resolutions than IPv4 resolutions (a ratio below 1.00). The mystery is, why is this the case?<br />
<br />
<h3 style="text-align: left;">
Conclusion</h3>
<br />
IPv6 connectivity is important. When removing outliers, there were almost as many IPv6 resolution requests as IPv4 requests. When investigating in more detail, some domains actually receive <b>more</b> IPv6 resolution requests than IPv4 resolution requests. I do not know why. If you have suggestions, please <a href="http://dinaburg.org/about.html">contact me</a>.<br />
<br />
Update:<br />
Part 3 is now up, <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-3-bit.html">Bitsquatting PCAP Analysis Part 3: Bit-error distribution</a>.<br />
<br /></div>
</div>
Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-8270433837910980471.post-76086620067027448382012-11-05T00:12:00.000-06:002012-11-14T23:12:04.364-06:00Bitsquatting PCAP Analysis Part 1: Analyzing PCAPs using Unix command line tools<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
This blog post was originally going to be about domain name distribution in the <a href="http://dinaburg.org/bitsquatting.html">bitsquatting</a> PCAPs, but I found a problem with my first analysis. The problem has been turned into an opportunity, and now this blog post is about domain name distribution in the bitsquatting PCAPs, <b>and</b> a tutorial on how to determine the distribution yourself!<br />
<br />
This blog post/tutorial will follow the process I used to answer the following questions:<br />
<br />
<ul>
<li>How many unique domains appear in queries directed at the bitsquatting nameserver? Answer: 4271.</li>
<li>What is the frequency distribution of queried domains? Answer: long tail; percentages <a href="http://dinaburg.org/data/bitsquat_analysis/requested_domain_percentage.txt">here</a>.</li>
</ul>
<br />
<h3 style="text-align: left;">
Prerequisites</h3>
<br />
A basic familiarity with Unix is assumed throughout this tutorial. While all the commands listed were run on Mac OS X, any sufficiently Unix-y environment should work.<br />
<br />
To do the analysis we are going to install some extra software. We will need the following:<br />
<ul style="text-align: left;">
<li><a href="http://www.wireshark.org/">wireshark</a> (specifically, the <a href="http://www.wireshark.org/docs/man-pages/tshark.html">tshark</a> and <a href="http://www.wireshark.org/docs/man-pages/mergecap.html">mergecap</a> utilities) to dissect packet captures</li>
<li>the GNU version of <a href="http://www.gnu.org/software/coreutils/">coreutils</a> (for ls -v)</li>
<li><a href="http://www.7-zip.org/">7zip</a> to decompress the compressed DNS packet captures</li>
<li>and <a href="http://www.gnu.org/software/wget/">wget</a> because I hate remembering to use "curl -O" to download via the command line.</li>
</ul>
<br />
All the prerequisites should be easily available with your favorite package manager. Since all command examples in this tutorial were run on Mac OS X, I installed the prerequisites via <a href="http://mxcl.github.com/homebrew/">homebrew</a>:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0px 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ brew install coreutils
$ brew install wireshark
$ brew install p7zip
$ brew install wget</pre>
<br />
<h3 style="text-align: left;">
Downloading and Extracting the Data</h3>
<br />
The first step of analysis is to get the data. Let's download and extract the Bitsquatting PCAPs:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ wget http://dinaburg.org/data/dnslogs.tar.7z
$ 7z x dnslogs.tar.7z
$ tar xvf dnslogs.tar
$ rm dnslogs.tar dnslogs.tar.7z</pre>
<br />
Numerous files named dnslog, dnslog1, dnslog2, etc. should now be in your working directory. These files contain packet captures (PCAPs) of DNS traffic.<br />
<br />
The <span style="font-family: Courier New, Courier, monospace;"><a href="http://www.tcpdump.org/">tcpdump</a></span> utility is the most basic way to analyze PCAP contents. Let's take a look to see what the logs contain:<br />
<br />
<blockquote class="tr_bq">
<span style="font-family: Courier New, Courier, monospace;">$ tcpdump -n -v -r dnslog</span></blockquote>
<br />
All of the output should be details about DNS queries. The output format is described in detail on the <a href="http://www.tcpdump.org/tcpdump_man.html">tcpdump man page</a>. This tutorial is not about tcpdump; I included this step because it is a very good idea to investigate any unknown PCAPs with <span style="font-family: Courier New, Courier, monospace;">tcpdump</span> and look for oddities before opening them in more complex tools. <a href="http://isisblogs.poly.edu/2012/08/03/tracing-bugs-in-wireshark/">Opening files from unknown sources in wireshark can be dangerous</a>. Even though it won't be referenced further in this blog post, the <span style="font-family: Courier New, Courier, monospace;">tcpdump</span> utility is extremely handy; I highly recommend <a href="https://www.google.com/search?q=tcpdump+tutorial">reading some <span style="font-family: Courier New, Courier, monospace;">tcpdump</span> tutorials</a> for background knowledge.<br />
<br />
<h3 style="text-align: left;">
Combining PCAPs</h3>
<br />
The PCAPs are cumbersome to work with since they are split into several files. To make analysis easier, let's re-assemble all the disparate PCAPs into a single file. Wireshark comes with a tool called <span style="font-family: Courier New, Courier, monospace;">mergecap</span> that is made exactly for this purpose.<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ mergecap -a -w completelog.pcap `gls -1v`</pre>
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: inherit;">The above command will use </span><span style="font-family: Courier New, Courier, monospace;">mergecap</span><span style="font-family: inherit;"> in append mode (</span><span style="font-family: Courier New, Courier, monospace;">-a</span><span style="font-family: inherit;">), and save the result into </span><span style="font-family: Courier New, Courier, monospace;">completelog.pcap</span><span style="font-family: inherit;">. Append mode instructs </span><span style="font-family: Courier New, Courier, monospace;">mergecap</span><span style="font-family: inherit;"> to simply concatenate the files with correct headers; otherwise, </span><span style="font-family: Courier New, Courier, monospace;">mergecap</span><span style="font-family: inherit;"> will use packet timestamps to create the combined file. The files to merge are given by "</span><span style="font-family: Courier New, Courier, monospace;">gls -1v</span><span style="font-family: inherit;">". Note: </span><span style="font-family: Courier New, Courier, monospace;">gls</span><span style="font-family: inherit;"> is GNU </span><span style="font-family: Courier New, Courier, monospace;">ls</span><span style="font-family: inherit;">; it is used because the default </span><span style="font-family: Courier New, Courier, monospace;">ls</span><span style="font-family: inherit;"> on Mac OS X does not have a numeric sort option. If you are using Linux, just use "</span><span style="font-family: Courier New, Courier, monospace;">ls -1v</span><span style="font-family: inherit;">" in your command line. </span><br />
<span style="font-family: inherit;"></span>
<br />
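Because append mode concatenates files in the order they are listed, the ordering of the file names matters. The sketch below shows why a plain alphabetical sort is not enough. It uses <span style="font-family: Courier New, Courier, monospace;">sort -V</span>, a GNU extension that I am assuming is available; it produces the same kind of ordering as <span style="font-family: Courier New, Courier, monospace;">ls -v</span>. The helper name <span style="font-family: Courier New, Courier, monospace;">natural_order</span> is made up for this example:<br />
<br />
```shell
#!/bin/sh
# natural_order: hypothetical helper that sorts names so that embedded
# numbers compare numerically (dnslog2 before dnslog10).
# Assumes a sort that supports -V (GNU coreutils).
natural_order() {
    sort -V
}

names='dnslog10
dnslog2
dnslog
dnslog1'

echo 'Alphabetical order (dnslog10 lands before dnslog2):'
printf '%s\n' "$names" | LC_ALL=C sort

echo 'Natural order (capture order is preserved):'
printf '%s\n' "$names" | natural_order
```
<br />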
<h3 style="text-align: left;">
Initial Analysis</h3>
<br />
Now that we have a merged PCAP, let's do some analysis. To review, the two questions we will answer in this blog post are:<br />
<br />
<ul style="text-align: left;">
<li>How many unique domains appear in queries directed at the bitsquatting nameserver?</li>
<li>What is the frequency distribution of queried domains?</li>
</ul>
<br />
The answers to both of these questions depend on extracting the query name from every incoming DNS query. Luckily, we will not need to write any PCAP reading code; there are many great projects specifically meant for dissecting PCAPs. In this post, we will be using <span style="font-family: Courier New, Courier, monospace;">tshark</span>, the text-only part of the wireshark network traffic analyzer, as our PCAP dissector.<br />
<br />
The following <span style="font-family: Courier New, Courier, monospace;">tshark</span> command will display the query name field of every DNS query in <span style="font-family: Courier New, Courier, monospace;">completelog.pcap</span>:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -o column.format:'"QNAME", "%Cus:dns.qry.name"'</pre>
<br />
The above command instructs <span style="font-family: Courier New, Courier, monospace;">tshark</span> to not attempt DNS resolution (<span style="font-family: Courier New, Courier, monospace;">-n</span>), to read from <span style="font-family: Courier New, Courier, monospace;">completelog.pcap</span> as the packet source (<span style="font-family: Courier New, Courier, monospace;">-r</span>), and to override the default output format to be the <span style="font-family: Courier New, Courier, monospace;">dns.qry.name</span> field of the packet (<span style="font-family: Courier New, Courier, monospace;">-o</span>).<br />
<br />
The <span style="font-family: Courier New, Courier, monospace;">tshark</span> utility supports many output formats. The <a href="http://anonsvn.wireshark.org/viewvc/trunk/epan/column.c?view=markup">column.c file in the wireshark source</a> specifies allowed formats. It is my understanding that the custom format specifier (<span style="font-family: Courier New, Courier, monospace;">%Cus</span>) can accept any protocol filter field. The <a href="http://www.wireshark.org/docs/dfref/d/dns.html">wireshark Display Filter DNS Protocol reference</a> specifies all filter fields for the DNS protocol.<br />
<br />
If you ran the command, you will notice it takes a long time to finish. It's best to pick a small subset of the data first and ensure there are no problems before working with the full set. Let's verify that it is possible to count domain frequency in the first 1000 queries:<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -o column.format:'"QNAME", "%Cus:dns.qry.name"' | head -n 1000 | sort | uniq -c | sort -rn
183 ns1.0mdn.net
171 ns2.0mdn.net
129 eicrosoft.com
108 micro3oft.com
85 mic2osoft.com
65 static.ak.fbgdn.net
44 www.micro3oft.com
38 iicrosoft.com
32 gmaml.com
24 www.mic2osoft.com
20 www.miarosoft.com
19 www.amazgn.com
14 profile.ak.fjcdn.net
13 forum.micro3oft.com
12 mscrl.eicrosoft.com
12 0mdn.net
11 amazgn.com
8 profile.ak.fbgdn.net
8 aeazon.com
2 www.gmaml.com
2 www.eicrosoft.com
</pre>
<br />
<span style="font-family: inherit;">It turns out there are casing issues with the query name field: </span><span style="font-family: Courier New, Courier, monospace;">micro3oft.com</span><span style="font-family: inherit;"> and </span><span style="font-family: Courier New, Courier, monospace;">MICRO3OFT.COM </span><span style="font-family: inherit;">are counted as different domains although they are semantically the same. Domain resolution is case-insensitive, but query name case can matter. For instance, 0x20 encoding uses query name casing to increase DNS query entropy. Increasing query entropy makes DNS forgery attacks more difficult. More details can be found in the <a href="http://courses.isi.jhu.edu/netsec/papers/increased_dns_resistance.pdf">0x20 encoding paper</a>.</span><br />
<br />
<div>
Since we are interested in only the semantic meaning of domains, they should all be converted to lower case before frequency counting. This can be done by piping the names through <span style="font-family: Courier New, Courier, monospace;"><a href="http://www.gnu.org/software/coreutils/manual/html_node/Translating.html">tr</a></span>. The new command line should only output lowercased domains:</div>
<div>
<br /></div>
<div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ tshark -n -r completelog.pcap -o column.format:'"QNAME", "%Cus:dns.qry.name"' | head -n 1000 | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn
183 ns1.0mdn.net
171 ns2.0mdn.net
129 eicrosoft.com
108 micro3oft.com
85 mic2osoft.com
65 static.ak.fbgdn.net
44 www.micro3oft.com
38 iicrosoft.com
32 gmaml.com
24 www.mic2osoft.com
20 www.miarosoft.com
19 www.amazgn.com
14 profile.ak.fjcdn.net
13 forum.micro3oft.com
12 mscrl.eicrosoft.com
12 0mdn.net
11 amazgn.com
8 profile.ak.fbgdn.net
8 aeazon.com
2 www.gmaml.com
2 www.eicrosoft.com
</pre>
</div>
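To see the effect of the lowercasing step in isolation, here is a small sketch that runs the same tally pipeline on a handful of hand-written names. The function name <span style="font-family: Courier New, Courier, monospace;">tally_names</span> is made up for this example, and the mixed-case inputs are invented to illustrate the problem:<br />
<br />
```shell
#!/bin/sh
# tally_names: hypothetical helper that reads one name per line on stdin,
# lowercases it, and prints "count name" sorted by descending count.
tally_names() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn
}

# Three spellings of the same domain plus one other domain.
printf '%s\n' micro3oft.com MICRO3OFT.COM Micro3oft.com gmaml.com | tally_names
```
<br />
Without the <span style="font-family: Courier New, Courier, monospace;">tr</span> step, the three spellings would be counted as three separate domains with one query each.<br />
<br />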
<div>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Our problem is solved. Let's create a directory for our analysis outputs and count the domain frequency in the full data set:</span>
<br />
<br />
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ mkdir analysis
$ tshark -n -r completelog.pcap -o column.format:'"QNAME", "%Cus:dns.qry.name"' | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn > analysis/requested_domains.txt
</pre>
</div>
<div>
<br />
The full frequency count can be viewed here: (<a href="http://dinaburg.org/data/bitsquat_analysis/requested_domains.txt">requested_domains.txt</a>, 117KB, text).</div>
<div>
<br />
Now we can finally answer our first question: How many unique domains appear in queries directed at the bitsquatting name server? Since the resulting file has one domain per line, the line count will be the number of unique domains:<br />
<br /></div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ wc -l analysis/requested_domains.txt
4271 analysis/requested_domains.txt
</pre>
<div>
<div>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">There was a total of 4271 unique requested domains. </span>
<br />
<br />
<h3 style="text-align: left;">
<span style="font-family: inherit;">Removing Outliers</span></h3>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Let's try to get a feel for the distribution of domains in the query name field. First, let's look at the most frequently requested domains:</span><br />
<span style="font-family: inherit;"><br /></span></div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ head -n 10 analysis/requested_domains.txt
1124949 ns1.0mdn.net
1101708 ns2.0mdn.net
184405 gmaml.com
142283 support.doublechick.net
80865 static.ak.dbcdn.net
78839 miarosoft.com
70174 mail.gmaml.com
59500 g.mic2osoft.com
57514 microsmft.com
54125 www.amazgn.com</pre>
<div>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The left column is the number of times the domain appeared in the query name field, and the right column is the domain. </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The domains </span><span style="font-family: Courier New, Courier, monospace;">ns1.0mdn.net</span><span style="font-family: inherit;"> and </span><span style="font-family: Courier New, Courier, monospace;">ns2.0mdn.net</span><span style="font-family: inherit;"> are outliers: they are by far the most frequently requested. These domains were the <a href="http://en.wikipedia.org/wiki/Name_server#Authoritative_name_server">authoritative name servers</a> for my bitsquatting domains. The high frequency of queries for these domains has nothing to do with their popularity and everything to do with their authoritative name server status. Including them in the top 10 count would be improper. The new top ten most frequently queried domains, with authoritative servers excluded, are:</span><br />
<span style="font-family: inherit;"><br /></span></div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">184405 gmaml.com
142283 support.doublechick.net
80865 static.ak.dbcdn.net
78839 miarosoft.com
70174 mail.gmaml.com
59500 g.mic2osoft.com
57514 microsmft.com
54125 www.amazgn.com
45866 amazgn.com
32021 mssupport.micrgsoft.com
</pre>
<div>
<div>
<br />
<br />
<h3>
Calculating Percentages</h3>
<br />
<span style="font-family: inherit;">Raw query numbers are interesting, but the percentage of total queries is a better way to </span>comprehend<span style="font-family: inherit;"> query name frequency. To calculate the percentages, we first need to calculate the total number of queries, excluding queries for authoritative name servers. The following </span><span style="font-family: Courier New, Courier, monospace;">awk</span><span style="font-family: inherit;"> script will add all values in the query count column (the first column) of </span><span style="font-family: Courier New, Courier, monospace;">requested_domains.txt</span><span style="font-family: inherit;">, excluding the first two rows (the query counts for the authoritative name servers):</span><br />
<span style="font-family: inherit;"><br /></span></div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ awk 'NR > 2 {sum+=$1} END {print sum}' < analysis/requested_domains.txt
1451284</pre>
<div>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Using the total number of queries, we can write another </span><span style="font-family: Courier New, Courier, monospace;">awk</span><span style="font-family: inherit;"> script to convert query frequencies into percentages. Let's look at the percentage of queries </span>represented<span style="font-family: inherit;"> by each of the top 10 most frequently queried domains:</span><br />
<span style="font-family: inherit;"><br /></span></div>
<pre style="border-left: 5px solid #dd7700; font-family: Courier New, Courier, monospace; margin: 0 20px; overflow: auto; padding: 20px 20px 10px 10px; width: 100%;">$ awk 'NR > 2 {printf "%2.2f %s\n", $1/1451284*100, $2}' < analysis/requested_domains.txt | head -n 10
12.71 gmaml.com
9.80 support.doublechick.net
5.57 static.ak.dbcdn.net
5.43 miarosoft.com
4.84 mail.gmaml.com
4.10 g.mic2osoft.com
3.96 microsmft.com
3.73 www.amazgn.com
3.16 amazgn.com
2.21 mssupport.micrgsoft.com</pre>
<div>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The full percentages can be downloaded here: (</span><a href="http://dinaburg.org/data/bitsquat_analysis/requested_domain_percentage.txt">requested_domain_percentage.txt</a>, 117KB, text<span style="font-family: inherit;">).</span><br />
<span style="font-family: inherit;"><br /></span>
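The percentage command above hard-codes the total (1451284), so it has to be edited whenever the input changes. As a sketch of one alternative, the function below reads the file twice in a single <span style="font-family: Courier New, Courier, monospace;">awk</span> invocation: once to compute the total and once to print percentages. The function name <span style="font-family: Courier New, Courier, monospace;">percentages</span>, its arguments, and the sample rows are all made up for this example:<br />
<br />
```shell
#!/bin/sh
# percentages: hypothetical helper.
#   $1 = file of "count name" lines, sorted by descending count
#   $2 = number of leading rows to exclude (e.g. 2 for the name servers)
# Prints "percent name" for every remaining row.
percentages() {
    awk -v skip="$2" '
        # First pass over the file: accumulate the total of the kept rows.
        NR == FNR { if (FNR > skip) total += $1; next }
        # Second pass: print each kept row as a percentage of that total.
        FNR > skip && total > 0 { printf "%.2f %s\n", $1 / total * 100, $2 }
    ' "$1" "$1"
}

# Made-up sample: two outlier rows followed by three ordinary rows.
sample=$(mktemp) || exit 1
printf '%s\n' '900 ns1.example.test' '800 ns2.example.test' \
    '50 a.example.test' '30 b.example.test' '20 c.example.test' > "$sample"
percentages "$sample" 2
rm -f "$sample"
```
<br />
On the real data this would be called as <span style="font-family: Courier New, Courier, monospace;">percentages analysis/requested_domains.txt 2</span>.<br />
<br />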
<span style="font-family: inherit;">We have now answered the second question: </span>What is the frequency distribution of queried domains? The domain frequency distribution is a superb example of a <a href="http://en.wikipedia.org/wiki/Long_tail" style="font-family: inherit;">long tail</a>.</div>
</div>
</div>
<br />
Update:<br />
Part 2 is now up, <a href="http://blog.dinaburg.org/2012/11/bitsquatting-pcap-analysis-part-2-query.html">Bitsquatting PCAP Analysis Part 2: Query Types, IPv6</a>.<br />
<br /></div>
Unknownnoreply@blogger.com0