<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0"><channel><title>cortesi</title><link>http://corte.si</link><description>Cortesi</description><generator>PyRSS2Gen-1.0.0</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Introducing pathod: a pathological HTTP server</title><link>http://corte.si/posts/code/pathod/announce0_1.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/pathod/announce0_1.html"&gt;Introducing pathod: a pathological HTTP server&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;01 May 2012&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;I've just released &lt;a href="http://cortesi.github.com/pathod"&gt;pathod&lt;/a&gt;, a pathological
HTTP/S daemon useful for testing and torturing HTTP clients. At its core is a
tiny, terse language for crafting HTTP responses. It also has a built-in web
interface that lets you play with the response spec language, inspect logs, and
access pathod's full help document. &lt;/p&gt;

&lt;p&gt;The rest of this post is a quick teaser showing some of pathod's abilities.
See the detailed documentation on the &lt;a href="http://cortesi.github.com/pathod"&gt;pathod
site&lt;/a&gt; if you want more. &lt;/p&gt;

&lt;h1&gt;The simplest possible response&lt;/h1&gt;

&lt;p&gt;The easiest way to craft a response is to specify it directly in the request
URL. Let's start with the simplest possible example. Start pathod, and then
visit this URL:&lt;/p&gt;

&lt;pre class="terminal"&gt;
http://localhost:9999/p/200
&lt;/pre&gt;

&lt;p&gt;The "/p/" path is the location of the response generator in pathod's default
configuration - everything after that is a response specification in pathod's
mini-language.  The general form of a response spec is as follows:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;code[MESSAGE]:[colon-separated list of features]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In this case, we're specifying only the HTTP response code - that is, an HTTP
200 OK with no headers and no content, resulting in a response like this:&lt;/p&gt;

&lt;pre class="terminal"&gt;
HTTP/1.1 200 OK
&lt;/pre&gt;

&lt;h1&gt;Specifying features&lt;/h1&gt;

&lt;p&gt;One example of a "feature" is a response header. Let's embellish our response by
adding one:&lt;/p&gt;

&lt;pre class="terminal"&gt;
200:h"Etag"="foo"
&lt;/pre&gt;

&lt;p&gt;The first letter of the feature - "h", in this case - is a mnemonic indicating
the type of feature we're adding. The full response to this spec looks like this:&lt;/p&gt;

&lt;pre class="terminal"&gt;
HTTP/1.1 200 OK
Etag: foo
&lt;/pre&gt;
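
&lt;p&gt;If you'd rather drive this from a script than a browser, the spec has to be
embedded in the URL path. Here's a minimal Python sketch of that step. The
helper name pathod_url is made up for this example, and it assumes that pathod
percent-decodes the request path before parsing the spec, so check that against
your version:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
from urllib.parse import quote


def pathod_url(spec, host="localhost", port=9999, anchor="/p/"):
    """Build a request URL that asks pathod for the given response spec.

    Hypothetical helper: host, port and anchor mirror the defaults used in
    this post. Assumes the server percent-decodes the path.
    """
    # Leave the spec's structural characters readable; quotes become %22.
    encoded = quote(spec, safe=":=@,")
    return "http://{}:{}{}{}".format(host, port, anchor, encoded)


print(pathod_url('200:h"Etag"="foo"'))
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Only the double quotes are escaped here - the colons, equals signs and commas
that give the spec its shape are left alone so the URL stays legible.&lt;/p&gt;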

&lt;p&gt;Both "Etag" and "foo" are Value Specifiers, a syntax used throughout the
response specification language. In this case they are literal values, as
indicated by the fact that they are quoted strings. The Value Specification
syntax also lets us load values from files or generate random data. For
instance, here is a specification that generates 100k of random binary data for
the header value:&lt;/p&gt;

&lt;pre class="terminal"&gt;
200:h"Etag"=@100k
&lt;/pre&gt;

&lt;p&gt;Now, binary data in the header value will probably break things in interesting
ways, but is unlikely to be read by the client as a valid (but over-long)
value. To see if the client really drops off its perch if we feed it a single
100k header, we have to constrain the random data. Here's the same response,
but with data generated only from ASCII letters:&lt;/p&gt;

&lt;pre class="terminal"&gt;
200:h"Etag"=@100k,ascii_letters
&lt;/pre&gt;

&lt;p&gt;pathod has a large number of built-in character classes from which random
data can be generated. &lt;/p&gt;

&lt;h1&gt;Pauses and Disconnects&lt;/h1&gt;

&lt;p&gt;Next, we can disrupt the communications in various ways. At the moment, this
means adding pauses and disconnects to a response. Let's start with an HTTP 404
response with a body consisting of 100k of random binary data:&lt;/p&gt;

&lt;pre class="terminal"&gt;
404:b@100k
&lt;/pre&gt;

&lt;p&gt;Here's the same response, but with a 120 second pause after sending 100 bytes:&lt;/p&gt;

&lt;pre class="terminal"&gt;
404:b@100k:p120,100
&lt;/pre&gt;

&lt;p&gt;And the same response again, but with a hard disconnect after sending 100 bytes:&lt;/p&gt;

&lt;pre class="terminal"&gt;
404:b@100k:d100
&lt;/pre&gt;

&lt;p&gt;Instead of specifying the offset explicitly, we can ask pathod to disconnect
at a random point of its choosing:&lt;/p&gt;

&lt;pre class="terminal"&gt;
404:b@100k:dr
&lt;/pre&gt;
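
&lt;p&gt;Because specs are just strings, it's easy to generate a whole family of them
for a test run. This sketch builds one hard-disconnect spec per byte offset,
using only the syntax shown above. The function name and the offsets are
invented for the example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
def disconnect_specs(code, body, offsets):
    """Return one response spec per offset, each with a hard disconnect.

    Hypothetical helper: it only formats strings in the
    code:bBODY:dOFFSET shape used in the examples above.
    """
    return ["{}:b{}:d{}".format(code, body, offset) for offset in offsets]


# Illustrative offsets - pick ones that matter for the client under test.
for spec in disconnect_specs(404, "@100k", [10, 100, 1000]):
    print(spec)
```
&lt;/code&gt;&lt;/pre&gt;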

&lt;p&gt;That's it for the teaser - hopefully it's enough to entice you into looking at
&lt;a href="http://cortesi.github.com/pathod"&gt;pathod&lt;/a&gt;'s full documentation.&lt;/p&gt;

&lt;h1&gt;What's next?&lt;/h1&gt;

&lt;p&gt;pathod is an "airport project" - the first draft was written in its
entirety during a 40-hour trip back home from New York (I drew a bad lot in
stopovers). I've now firmed it up a bit, but there's still work to be done. In
the next month, mitmproxy's test suite will move to pathod, after which
there will be a simple, well-documented way to unit test. I also plan to build
out the JSON API (which is used to drive pathod in test suites), and expand the
mini-language with convenient ways to generate pathological cookies,
authentication headers, SSL errors, and cache control. &lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/pathod/announce0_1.html</guid><pubDate>Tue, 01 May 2012 08:14:00 GMT</pubDate></item><item><title>mitmproxy 0.8</title><link>http://corte.si/posts/code/mitmproxy/announce0_8/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/announce0_8/index.html"&gt;mitmproxy 0.8&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;09 April 2012&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://mitmproxy.org"&gt;
&lt;img src="http://corte.si/posts/code/mitmproxy/announce0_8/mitmproxy_0_8.png"/&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm happy to announce the release of &lt;a href="http://mitmproxy.org"&gt;mitmproxy 0.8&lt;/a&gt;.
This release has a few major new features, big speedups, and many, many small
bugfixes and improvements. Here are the headlines:&lt;/p&gt;

&lt;h2&gt;Android interception&lt;/h2&gt;

&lt;p&gt;The most prominent new feature is that we now have a supported way to intercept
Android traffic. What's more, we can do this without a cumbersome transparent
proxying rig - see the &lt;a href="http://mitmproxy.org/doc/certinstall/android.html"&gt;Android section in the
documentation&lt;/a&gt; for the
details. Special thanks go to &lt;a href="http://twitter.com/yjmbo"&gt;Jim Cheetham&lt;/a&gt; for
lending me an Android device and helping to get this feature off the ground.&lt;/p&gt;

&lt;h2&gt;Replacement patterns&lt;/h2&gt;

&lt;p&gt;Another exceedingly useful new feature is &lt;a href="http://mitmproxy.org/doc/replacements.html"&gt;replacement
patterns&lt;/a&gt;. These consist of a
filter, a regular expression and a replacement string, and run continuously
while mitmproxy processes requests and responses. You can pass these either on
the command-line, or using a built-in replacement pattern editor.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/code/mitmproxy/announce0_8/mitmproxy0_8_replace.png"/&gt;&lt;/p&gt;

&lt;p&gt;I'm sure you can immediately think of many uses for this flexible feature, but
my favourite is to use it during testing as a way to conveniently inject
complicated exploits into web traffic. I do this by setting a replacement
pattern that swaps a short but likely unique string (say MYXSS) for a long
exploit, and then I use simple interaction and front-end tools like Firebug to
inject exploits into requests manually based on the short string marker.&lt;/p&gt;
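
&lt;p&gt;To make the filter / regex / replacement idea concrete, here's a rough Python
sketch of the marker-swapping trick. It is not mitmproxy's implementation or
API: the function is hypothetical, the filter is a plain callable standing in
for a real filter expression, and the payload is a placeholder:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
import re


def apply_replacement(flow_filter, pattern, replacement, body):
    """Rewrite body with the regex, but only if the filter accepts it.

    Hypothetical stand-in for a replacement pattern; flow_filter is any
    callable that takes the body and returns True or False.
    """
    if not flow_filter(body):
        return body
    return re.sub(pattern, replacement, body)


def match_everything(body):
    """Simplest possible filter: every body qualifies."""
    return True


# Placeholder payload, for illustration only.
payload = "[a long exploit string would go here]"
print(apply_replacement(match_everything, r"MYXSS", payload, "q=MYXSS"))
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One thing to watch in a sketch like this: re.sub interprets backslashes in
the replacement string, so a real payload containing them would need
escaping.&lt;/p&gt;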

&lt;h2&gt;Improved pretty-printing of request and response contents&lt;/h2&gt;

&lt;p&gt;This release of mitmproxy has a completely redesigned subsystem for
pretty-printing request and response bodies. For instance, we now extract EXIF
tags and other basic information to give you something better than a hex dump 
when looking at an image:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/code/mitmproxy/announce0_8/mitmproxy0_8-pretty.png"/&gt;&lt;/p&gt;

&lt;p&gt;We also have much improved HTML indenting (using &lt;a href="http://lxml.de/"&gt;lxml&lt;/a&gt;), and
a built-in JavaScript beautifier (thanks to
&lt;a href="http://jsbeautifier.org"&gt;JSBeautifier&lt;/a&gt;) that teases out compressed and
obfuscated scripts into something readable.&lt;/p&gt;

&lt;h2&gt;Changelog&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Detailed tutorial for Android interception. Some features that land in
this release have finally made reliable Android interception possible.&lt;/li&gt;
&lt;li&gt;Upstream-cert mode, which uses information from the upstream server to
generate interception certificates.&lt;/li&gt;
&lt;li&gt;Replacement patterns that let you easily do global replacements in flows
matching filter patterns. Can be specified on the command-line, or edited
interactively.&lt;/li&gt;
&lt;li&gt;Much more sophisticated and usable pretty printing of request bodies.
Support for auto-indentation of JavaScript, inspection of image EXIF
data, and more.&lt;/li&gt;
&lt;li&gt;Details view for flows, showing connection and SSL cert information (X
keyboard shortcut).&lt;/li&gt;
&lt;li&gt;Server certificates are now stored and serialized in saved traffic for
later analysis. This means that the 0.8 serialization format is NOT
compatible with 0.7.&lt;/li&gt;
&lt;li&gt;Add a shortcut key ("f") to load the remainder of a request or response body,
if it is abbreviated.&lt;/li&gt;
&lt;li&gt;Many other improvements, including bugfixes, an expanded scripting API,
and more sophisticated certificate handling.&lt;/li&gt;
&lt;/ul&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/mitmproxy/announce0_8/index.html</guid><pubDate>Mon, 09 Apr 2012 16:57:00 GMT</pubDate></item><item><title>mitmproxy 0.7</title><link>http://corte.si/posts/code/mitmproxy/announce0_7/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/announce0_7/index.html"&gt;mitmproxy 0.7&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;27 February 2012&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://mitmproxy.org"&gt;
&lt;img src="http://corte.si/posts/code/mitmproxy/announce0_7/mitmproxy_0_7.png"/&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm happy to announce the release of &lt;a href="http://mitmproxy.org"&gt;mitmproxy 0.7&lt;/a&gt;. The
biggest visible change is a new structured editor for headers, query strings
and form fields. Other new features include a reverse proxy mode, an extended
script API that makes many common tasks much easier, and a myriad of
improvements to the interface (including a massive increase in speed).
Everybody still on 0.6 should upgrade - get it here:&lt;/p&gt;

&lt;h2&gt;&lt;a href="http://mitmproxy.org"&gt;mitmproxy-0.7.tar.gz&lt;/a&gt; &lt;a href="http://mitmproxy.org/docs"&gt;(docs)&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;You can also now install mitmproxy using &lt;a href="http://pypi.python.org/pypi/pip"&gt;pip&lt;/a&gt;, like so:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;pip install mitmproxy
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In other news, the project has had an amazing month, after a rash of
high-profile results obtained using mitmproxy were published. It started with
&lt;a href="http://mclov.in/2012/02/08/path-uploads-your-entire-address-book-to-their-servers.html"&gt;Arun Thampi's
discovery&lt;/a&gt;
that Path uploads users' address books to their servers. Things snowballed from
there, and for a few days mitmproxy seemed to be everywhere. Similar findings
were made for
&lt;a href="http://markchang.tumblr.com/post/17244167951/hipster-uploads-part-of-your-iphone-address-book-to-its"&gt;Hipster&lt;/a&gt;,
&lt;a href="http://www.theverge.com/2012/2/14/2798008/ios-apps-and-the-address-book-what-you-need-to-know"&gt;The
Verge&lt;/a&gt;
did a mitmproxy-driven AddressbookGate expose (including vaguely threatening
background shots of mitmproxy doing its dastardly work), and lots of people
said nice things on Twitter. &lt;/p&gt;

&lt;p&gt;To see the impact of all this on the mitmproxy project, you need only look at
the &lt;a href="http://github.com/cortesi/mitmproxy"&gt;GitHub page&lt;/a&gt; - watchers of the repo
went from about 200 a month ago to 950 at the time of this post. &lt;/p&gt;

&lt;h2&gt;Changelog&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;New built-in key/value editor. This lets you interactively edit URL query
strings, headers and URL-encoded form data. &lt;/li&gt;
&lt;li&gt;Extend script API to allow duplication and replay of flows.&lt;/li&gt;
&lt;li&gt;API for easy manipulation of URL-encoded forms and query strings.&lt;/li&gt;
&lt;li&gt;Add "D" shortcut in mitmproxy to duplicate a flow.&lt;/li&gt;
&lt;li&gt;Reverse proxy mode. In this mode mitmproxy acts as an HTTP server,
forwarding all traffic to a specified upstream server.&lt;/li&gt;
&lt;li&gt;UI improvements - use Unicode characters to make GUI more compact,
improve spacing and layout throughout.&lt;/li&gt;
&lt;li&gt;Add support for filtering by HTTP method.&lt;/li&gt;
&lt;li&gt;Add the ability to specify an HTTP body size limit.&lt;/li&gt;
&lt;li&gt;Move to typed netstrings for serialization format - this makes 0.7
backwards-incompatible with serialized data from 0.6!&lt;/li&gt;
&lt;li&gt;Significant improvements in speed and responsiveness of UI. &lt;/li&gt;
&lt;li&gt;Many minor bugfixes and improvements.&lt;/li&gt;
&lt;/ul&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/mitmproxy/announce0_7/index.html</guid><pubDate>Mon, 27 Feb 2012 20:38:00 GMT</pubDate></item><item><title>OpenBSD in decline?</title><link>http://corte.si/posts/security/openbsd-decline/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/security/openbsd-decline/index.html"&gt;OpenBSD in decline?&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;26 February 2012&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;My leisurely Sunday activity today is to set up a new
&lt;a href="http://openbsd.org"&gt;OpenBSD&lt;/a&gt; firewall for my mobile app testing lab. I haven't
done a from-scratch OpenBSD install for years, so I spent some time reading
through the change logs for the last few versions to catch up with what's
changed. Although the project is clearly still making steady, well-engineered
progress, I had the nagging feeling that the rate of change wasn't what it used
to be. So, I pulled some numbers from &lt;a href="http://archives.neohapsis.com/archives/openbsd/cvs/"&gt;CVS commit message list
archives&lt;/a&gt;, and graphed
them. Here is the number of commits per month from January 2001 to January
2012. The orange line is a simple 12-month moving average: &lt;/p&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/security/openbsd-decline/commitspermonth.png"/&gt;&lt;/p&gt;
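
&lt;p&gt;For reference, a simple 12-month moving average can be computed along these
lines. This is a sketch of one common variant (a trailing window that only
emits a value once 12 months are available), not necessarily the exact
calculation behind the graph, and the commit counts are made up:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
def moving_average(values, window=12):
    """Trailing moving average: one output per full window of inputs."""
    averages = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        averages.append(sum(chunk) / window)
    return averages


# Made-up monthly commit counts, not the real OpenBSD numbers.
commits = [100, 120, 110, 90, 95, 105, 115, 100, 85, 90, 110, 80, 70]
print(moving_average(commits))
```
&lt;/code&gt;&lt;/pre&gt;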

&lt;p&gt;Now, we should be cautious about interpreting this - the number of commits
doesn't tell us anything about the quality, importance or magnitude of code
change. Even if it did all of these things, there are other and perhaps better
measures of a project's health. Still, the trend is clear, and suggests a
sustained decline in activity.&lt;/p&gt;

&lt;p&gt;I just &lt;a href="http://openbsd.org/orders.html"&gt;bought some T-shirts&lt;/a&gt; to help support
one of my favourite open source projects. You should too. &lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/security/openbsd-decline/index.html</guid><pubDate>Sun, 26 Feb 2012 09:08:00 GMT</pubDate></item><item><title>Malware</title><link>http://corte.si/posts/visualisation/malware/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/visualisation/malware/index.html"&gt;Malware&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;05 January 2012&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;div class="hidden"&gt;&lt;h2&gt;If you subscribe to my RSS feed, please visit this
article directly.  The table below has interactive elements that won't work in
most feed readers.&lt;/h2&gt; &lt;/div&gt;

&lt;p&gt;Hover and click for more.&lt;/p&gt;

&lt;table class="spacertable"&gt;
    &lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/00f29767bee5f8bd5b2d55d5be734f69.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_00f29767bee5f8bd5b2d55d5be734f69_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/01310712a180d9f939c126712d24363d.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_01310712a180d9f939c126712d24363d_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/023293a96c763bbdee3991994cdcdcef.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_023293a96c763bbdee3991994cdcdcef_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0309fc0e6dbeb714c5361f82b2ccb037.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0309fc0e6dbeb714c5361f82b2ccb037_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/038e3a7add116ac69e5f9539ce461386.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_038e3a7add116ac69e5f9539ce461386_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/03b3f30aed5b7dc39bd6e356bbde3713.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_03b3f30aed5b7dc39bd6e356bbde3713_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/04240e137999dc6b5115de8db3a15f53.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_04240e137999dc6b5115de8db3a15f53_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/04fee7e6dedf912b4a72886486627b05.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_04fee7e6dedf912b4a72886486627b05_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/05fd535d70dfb5ee4f36e87e39d8c70d.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_05fd535d70dfb5ee4f36e87e39d8c70d_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/07ddb50c4cc358fc3718847684ca5fae.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_07ddb50c4cc358fc3718847684ca5fae_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/08b983ec55bfd50d1d2cb9a90b1ae54e.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_08b983ec55bfd50d1d2cb9a90b1ae54e_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/08c926bf7fbb3397236effef1b30b4df.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_08c926bf7fbb3397236effef1b30b4df_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/094fedd2e4c175cd81dc170fd4d03917.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_094fedd2e4c175cd81dc170fd4d03917_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/096381c0f5ddc29319ba2b2647cea116.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_096381c0f5ddc29319ba2b2647cea116_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/09dd27fcccb9c000d37c6394364be1b5.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_09dd27fcccb9c000d37c6394364be1b5_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0b4f82e83741e79310d797d54db5a9be.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0b4f82e83741e79310d797d54db5a9be_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0bcee1314e8c61fa8ef55743f3bb7742.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0bcee1314e8c61fa8ef55743f3bb7742_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0cc9e0ba6a0bd8b79aaf2be22c496228.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0cc9e0ba6a0bd8b79aaf2be22c496228_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0d9109ab6b06f38221b713eb6a54c42f.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0d9109ab6b06f38221b713eb6a54c42f_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0d97f71367f8b6dcb8cbc8ec964ebdbe.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0d97f71367f8b6dcb8cbc8ec964ebdbe_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0dcfe476fbd68148f007e6c48c226e0f.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0dcfe476fbd68148f007e6c48c226e0f_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0e2bf707dbc146c9d60c373237d050b7.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0e2bf707dbc146c9d60c373237d050b7_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0eab36fc4307a1fd3ad8d832c526cf40.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0eab36fc4307a1fd3ad8d832c526cf40_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0f5c70c82a74c8ff3d05fbf4d90bc5bf.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0f5c70c82a74c8ff3d05fbf4d90bc5bf_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0fc12afe2d283b92184897b6e7bcc2c2.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0fc12afe2d283b92184897b6e7bcc2c2_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/0ff25e3cefcce4336d0abeb9f02ccb02.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_0ff25e3cefcce4336d0abeb9f02ccb02_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/109f8c72ff91dee5906aba0e47324526.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_109f8c72ff91dee5906aba0e47324526_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/12e9e61357be212f28ea4c81ef75018d.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_12e9e61357be212f28ea4c81ef75018d_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/12eec9b3e0aa2e6683487c13eede2382.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_12eec9b3e0aa2e6683487c13eede2382_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/131f1cb94df6e2969ac874503cbfd934.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_131f1cb94df6e2969ac874503cbfd934_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/14064e26cbd3daed7e6eb3b4fb245c8f.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_14064e26cbd3daed7e6eb3b4fb245c8f_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/14560f7dc19e6fef87743f83e5234519.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_14560f7dc19e6fef87743f83e5234519_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/14e6950dd4bcffe54bf158a20437e6b4.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_14e6950dd4bcffe54bf158a20437e6b4_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1511f2d75e07bb94f5da8cbc031a51dd.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1511f2d75e07bb94f5da8cbc031a51dd_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1542a2f2732bbdad500bf112686503ac.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1542a2f2732bbdad500bf112686503ac_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/163524fb9a41e6ec79178a902797f8f1.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_163524fb9a41e6ec79178a902797f8f1_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/16c533cc9b3dac1bde9885b4bd967bff.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_16c533cc9b3dac1bde9885b4bd967bff_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/177827ae9615791e067b4a9fb4be1ab9.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_177827ae9615791e067b4a9fb4be1ab9_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/17fa099ecef82edd1e4ddc61be575ae4.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_17fa099ecef82edd1e4ddc61be575ae4_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/17fd97da6d93430ec0d9aa040b4b2c58.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_17fd97da6d93430ec0d9aa040b4b2c58_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/18ce863d41622cd7aaa3c7d3d11e2f3e.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_18ce863d41622cd7aaa3c7d3d11e2f3e_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/18f9ede7d921742f963a0eb06887fdfa.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_18f9ede7d921742f963a0eb06887fdfa_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1998bb714c0de980635ee9b8c1951381.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1998bb714c0de980635ee9b8c1951381_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/19bc481e5cb1113c7eff49b67273f892.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_19bc481e5cb1113c7eff49b67273f892_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1a30184661ee6585f4a188107e63a4d2.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1a30184661ee6585f4a188107e63a4d2_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;&lt;tr&gt;

&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1a3aa70d060be5e6e778e3519b400bf1.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1a3aa70d060be5e6e778e3519b400bf1_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1a8700c754f97c115fa91fa161fa05cc.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1a8700c754f97c115fa91fa161fa05cc_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1aa40b6ea4e7be64d4e6a024fcdf76fe.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1aa40b6ea4e7be64d4e6a024fcdf76fe_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1b0e377994cfdb4eec0d2fb028118844.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1b0e377994cfdb4eec0d2fb028118844_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;


&lt;td&gt;
    &lt;a href='http://corte.si/posts/visualisation/malware/detail/1b5bad65f8b72a52cfcae67e3e538f34.html'&gt;
        &lt;img class='malwareimg' src='http://corte.si/posts/visualisation/malware/images/small_1b5bad65f8b72a52cfcae67e3e538f34_entropy.png'/&gt;
    &lt;/a&gt;
&lt;/td&gt;

&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The images above are &lt;a href="http://corte.si/posts/visualisation/entropy/index.html"&gt;entropy visualizations&lt;/a&gt;
of samples from a malware database - black is zero entropy, with colour ranging
through blue, up to hot pink for maximum entropy. Large areas of very high
entropy are usually sections that are packed - encrypted or obfuscated by the
malware authors to make the malware hard to detect and reverse engineer.
Smaller areas might be keys, passwords, or other chunks of data meant to be
hidden from view.&lt;/p&gt;

&lt;p&gt;When you hover over an image, you see a &lt;a href="http://corte.si/posts/visualisation/binvis/index.html"&gt;character class
visualization&lt;/a&gt; with the following colors:&lt;/p&gt;

&lt;table style="margin: 30px"&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #000000"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;0x00&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #ffffff"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;0xFF&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #377eb8"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;Printable characters&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #e41a1c"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;Everything else&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Clicking will show you high-detail versions of both visualizations, and let you
look up the binary hash to see what it is. I've used a square Hilbert curve
layout - the files start in the top-left corner, and pass through the quadrants
clockwise. &lt;/p&gt;

&lt;p&gt;I spent hours looking through thousands of these visualizations today. I find them
eerie and rather beautiful - an entirely different perspective from my
day-to-day interactions with malware.&lt;/p&gt;

&lt;script src="http://corte.si/jquery-1.7.1.min.js" type="text/javascript"&gt;&lt;/script&gt;
&lt;script&gt;
    $(function(){
        $('.malwareimg').each(function(){
            $('&lt;img/&gt;').appendTo('body')
                .css({ display: "none" })
                .attr('src',$(this).attr('src').replace("entropy", "charclass"));
        });
        $('.malwareimg').hover(
            function(){
                var t = $(this);
                t.attr('src',t.attr('src').replace("entropy", "charclass"));
            },
            function(){
                var t = $(this);
                t.attr('src',t.attr('src').replace('charclass','entropy'));
            }
         );

    })
&lt;/script&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/visualisation/malware/index.html</guid><pubDate>Thu, 05 Jan 2012 22:50:00 GMT</pubDate></item><item><title>Visualizing entropy in binary files</title><link>http://corte.si/posts/visualisation/entropy/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/visualisation/entropy/index.html"&gt;Visualizing entropy in binary files&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;04 January 2012&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;Last week, I wrote about &lt;a href="http://corte.si/posts/visualisation/binvis/index.html"&gt;visualizing binary files using space-filling
curves&lt;/a&gt;, a technique I use when I need to get a
quick overview of the broad structure of a file. Today, I'll show you an
elaboration of the same basic idea - still based on space-filling curves, but
this time using a colour function that measures local entropy.&lt;/p&gt;

&lt;p&gt;Before I get to the details, let's quickly talk about the motivation for a
visualization like this. We can think of entropy as the degree to which a chunk
of data is disordered. If we have a data set where all the elements have the
same value, the amount of disorder is nil, and the entropy is zero. If the data
set has the maximum amount of heterogeneity (i.e. all possible symbols are
represented equally), then we also have the maximum amount of disorder, and
thus the maximum amount of entropy. There are two common types of high-entropy
data that are of special interest to reverse engineers and penetration testers.
The first is compressed data - finding and extracting compressed sections is a
common task in many security audits. The second is cryptographic material -
which is obviously at the heart of most security work. Here, I'm referring not
only to key material and certificates, but also to hashes and actual encrypted
data. As I show below, a tool like the one I'm describing today can be highly
useful in spotting this type of information.&lt;/p&gt;

&lt;p&gt;For this visualization, I use the &lt;a href="http://en.wikipedia.org/wiki/Entropy_(information_theory)"&gt;Shannon
entropy&lt;/a&gt; measure to
calculate byte entropy over a sliding window. This gives us a "local entropy"
value for each byte, even though the concept doesn't really apply to single
symbols. &lt;/p&gt;
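
&lt;p&gt;To make the sliding-window idea concrete, here is a minimal Python sketch. It
is not the binvis code - the function names and the 32-byte window are
assumptions made for illustration, and the window is simply clipped at the
edges of the data.&lt;/p&gt;

&lt;pre&gt;
```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy of a bytes-like window, in bits per byte (0.0 to 8.0)."""
    if not window:
        return 0.0
    total = len(window)
    counts = Counter(window)
    # Sum p * log2(1/p) over the byte values that occur in the window.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def local_entropy(data, window_size=32):
    """Entropy of the window around each byte, clipped at the edges.

    window_size is an illustrative choice, not a value taken from the post.
    """
    half = window_size // 2
    return [
        shannon_entropy(data[max(0, i - half):i + half])
        for i in range(len(data))
    ]
```
&lt;/pre&gt;

&lt;p&gt;A run of identical bytes scores zero, and a window in which all 256 byte values
occur equally often scores the maximum of 8 bits per byte.&lt;/p&gt;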

&lt;p&gt;With that out of the way, let's look at some pretty pictures.&lt;/p&gt;

&lt;h1&gt;Visualizing the OSX ksh binary&lt;/h1&gt;

&lt;p&gt;In my previous post, I used the &lt;a href="http://en.wikipedia.org/wiki/Korn_shell"&gt;ksh&lt;/a&gt;
binary as a guinea pig, and I'll do the same here. On the left is the entropy
visualization with colours ranging from black for zero entropy, through shades
of blue as entropy increases, to hot pink for maximum entropy. On the right is
the Hilbert curve visualization from the last post for comparison - see &lt;a href="http://corte.si/posts/visualisation/binvis/index.html"&gt;the
post itself&lt;/a&gt; for an explanation of the colour
scheme. Click for larger versions with much more detail:&lt;/p&gt;

&lt;table class="spacertable"&gt;
    &lt;tr&gt;
        &lt;td&gt;Entropy&lt;/td&gt;
        &lt;td&gt;Byte class&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;
            &lt;a href="http://corte.si/posts/visualisation/entropy/hilbert-entropy-large.png"&gt;
                &lt;img src="http://corte.si/posts/visualisation/entropy/hilbert-entropy.png"/&gt;
            &lt;/a&gt;
        &lt;/td&gt;
        &lt;td&gt;
            &lt;a href="http://corte.si/posts/visualisation/binvis/binary-large-hilbert.png"&gt;
                &lt;img src="http://corte.si/posts/visualisation/binvis/binary-hilbert.png"/&gt;
            &lt;/a&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Note that this is a dual-architecture
&lt;a href="http://en.wikipedia.org/wiki/Mach-O"&gt;Mach-O&lt;/a&gt; file, containing code for both
i386 and x86_64. You can see this if you squint somewhat at these images - some
broad structures in the file are repeated twice. We can see that there are a
number of different sections of the &lt;strong&gt;ksh&lt;/strong&gt; binary that have very high entropy.
It's not immediately obvious why a system binary would contain either
compressed sections or cryptographic material. As it happens, the explanation
in this case is quite interesting. Let's have a closer look:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;img src="http://corte.si/posts/visualisation/entropy/entropy-annotated.png"/&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Sections &lt;strong&gt;1&lt;/strong&gt; and &lt;strong&gt;2&lt;/strong&gt; are a lovely validation of the central idea of this
post. These two areas do indeed contain cryptographic material - in this case,
&lt;a href="http://developer.apple.com/library/mac/#technotes/tn2206/_index.html"&gt;code signing hashes and
certificates&lt;/a&gt;.
Rather satisfyingly, they stand out like a sore thumb. It turns out that all of
the official OSX binaries are signed by Apple. This is then used in turn to
apply &lt;a href="http://developer.apple.com/library/mac/#technotes/tn2206/_index.html"&gt;a variety of
policies&lt;/a&gt;,
depending on who the signatory is, and whether they are trusted.&lt;/p&gt;

&lt;p&gt;You can dump some rudimentary data about a binary's signature using the
&lt;strong&gt;codesign&lt;/strong&gt; command (which you can also use to sign binaries yourself):&lt;/p&gt;

&lt;pre&gt;
&gt; codesign -dvv /bin/ksh 
Executable=/bin/ksh
Identifier=com.apple.ksh
Format=Mach-O universal (i386 x86_64)
CodeDirectory v=20100 size=5662 flags=0x0(none) hashes=278+2 location=embedded
Signature size=4064
Authority=Software Signing
Authority=Apple Code Signing Certification Authority
Authority=Apple Root CA
Info.plist=not bound
Sealed Resources=none
Internal requirements count=1 size=92
&lt;/pre&gt;

&lt;p&gt;Section &lt;strong&gt;3&lt;/strong&gt; (the two occurrences are the same data repeated for each
architecture) is interesting for a different reason - it's a cautionary example
of how the simple entropy measure we're using sometimes detects high entropy in
highly structured data. A hex dump of the start of the region looks like this: &lt;/p&gt;

&lt;pre&gt;
000d1f00  00 01 00 00 00 02 00 00  00 06 00 00 00 00 00 00  |................|
000d1f10  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000d1f20  00 01 02 03 04 05 06 07  08 09 0a 0b 0c 0d 0e 0f  |................|
000d1f30  10 11 12 13 14 15 16 17  18 19 1a 1b 1c 1d 1e 1f  |................|
000d1f40  20 21 22 23 24 25 26 27  28 29 2a 2b 2c 2d 2e 2f  | !"#$%&amp;'()*+,-./|
000d1f50  30 31 32 33 34 35 36 37  38 39 3a 3b 3c 3d 3e 3f  |0123456789:;&lt;=&gt;?|
000d1f60  40 41 42 43 44 45 46 47  48 49 4a 4b 4c 4d 4e 4f  |@ABCDEFGHIJKLMNO|
000d1f70  50 51 52 53 54 55 56 57  58 59 5a 5b 5c 5d 5e 5f  |PQRSTUVWXYZ[\]^_|
000d1f80  60 61 62 63 64 65 66 67  68 69 6a 6b 6c 6d 6e 6f  |`abcdefghijklmno|
000d1f90  70 71 72 73 74 75 76 77  78 79 7a 7b 7c 7d 7e 7f  |pqrstuvwxyz{|}~.|
000d1fa0  80 81 82 83 84 85 86 87  88 89 8a 8b 8c 8d 8e 8f  |................|
000d1fb0  90 91 92 93 94 95 96 97  98 99 9a 9b 9c 9d 9e 9f  |................|
000d1fc0  a0 a1 a2 a3 a4 a5 a6 a7  a8 a9 aa ab ac ad ae af  |................|
000d1fd0  b0 b1 b2 b3 b4 b5 b6 b7  b8 b9 ba bb bc bd be bf  |................|
000d1fe0  c0 c1 c2 c3 c4 c5 c6 c7  c8 c9 ca cb cc cd ce cf  |................|
000d1ff0  d0 d1 d2 d3 d4 d5 d6 d7  d8 d9 da db dc dd de df  |................|
000d2000  e0 e1 e2 e3 e4 e5 e6 e7  e8 e9 ea eb ec ed ee ef  |................|
000d2010  f0 f1 f2 f3 f4 f5 f6 f7  f8 f9 fa fb fc fd fe ff  |................|
&lt;/pre&gt;

&lt;p&gt;We see that this section contains each byte value from 0x00 to 0xff in order -
furthermore, this whole block is repeated with minor variations a number of
times. There are two things to explain here - why is this detected as "high
entropy" data, and what the heck is it doing in the file? &lt;/p&gt;

&lt;p&gt;First, we need to understand that the Shannon entropy measure looks only at the
relative occurrence frequencies of individual symbols (in this case, bytes). A
chunk of data like the one above therefore looks like it has high entropy,
because each symbol occurs once and only once, making the data highly
heterogeneous. &lt;/p&gt;
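
&lt;p&gt;One way to see the difference between "every byte value occurs equally often"
and "actually random" is to compare byte-frequency entropy with how well a
general-purpose compressor does. The sketch below is a different technique from
the one used for the images, and the test data is an assumption: sixteen
identical copies of the 0x00..0xff ramp, rather than the real table from the
binary.&lt;/p&gt;

&lt;pre&gt;
```python
import math
import os
import zlib
from collections import Counter


def byte_entropy(data):
    """Shannon entropy over byte frequencies, in bits per byte."""
    total = len(data)
    return sum(
        (n / total) * math.log2(total / n) for n in Counter(data).values()
    )


def compressed_ratio(data):
    """Size after zlib compression divided by the original size."""
    return len(zlib.compress(data)) / len(data)


# Sixteen copies of the 0x00..0xff ramp (a stand-in for the real table).
ramp = bytes(range(256)) * 16
# Random bytes of the same length, for comparison.
noise = os.urandom(len(ramp))

print("ramp ", byte_entropy(ramp), compressed_ratio(ramp))
print("noise", byte_entropy(noise), compressed_ratio(noise))
```
&lt;/pre&gt;

&lt;p&gt;The ramp scores the full 8 bits per byte on the frequency measure, yet
compresses to a small fraction of its size because the ordering is so regular.
The random bytes barely compress at all.&lt;/p&gt;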

&lt;p&gt;Now, what earthly use would chunks of data like this be? With a bit of digging,
I found the answer in the &lt;strong&gt;ksh&lt;/strong&gt; source code. These sections are maps used for
translation between various &lt;a href="http://en.wikipedia.org/wiki/EBCDIC"&gt;character&lt;/a&gt;
&lt;a href="http://en.wikipedia.org/wiki/ASCII"&gt;encodings&lt;/a&gt;. If you're interested, here's
the &lt;a href="http://opensource.apple.com/source/ksh/ksh-13/ksh/src/lib/libast/string/ccmap.c"&gt;culprit in all its repetitive
glory&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;The code&lt;/h1&gt;

&lt;p&gt;As usual, the code for generating all of the images in this post is up on
GitHub. The entropy visualizations were created with
&lt;a href="https://github.com/cortesi/scurve/blob/master/binvis"&gt;binvis&lt;/a&gt;, a new addition
to &lt;a href="https://github.com/cortesi/scurve"&gt;scurve&lt;/a&gt;, my compendium of code related
to space-filling curves. &lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/visualisation/entropy/index.html</guid><pubDate>Wed, 04 Jan 2012 05:26:00 GMT</pubDate></item><item><title>A personal link mill</title><link>http://corte.si/posts/socialmedia/linkmill/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/socialmedia/linkmill/index.html"&gt;A personal link mill&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;30 December 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;I posted a link to an interesting visualization paper on Twitter today,
&lt;a href="https://twitter.com/#!/__mharrison__/status/152503684822081537"&gt;prompting someone to ask me where I had found
it&lt;/a&gt;. Sadly, I
had to admit that I had no clue where I first saw it referenced, due to the way
I consume links I find on the net. So, I thought I'd write a quick blog post to
explain myself, and then pitch a product idea that could make my life (and
maybe yours) much easier.&lt;/p&gt;

&lt;p&gt;First, the problem statement: my aim is to efficiently discover links to
interesting stuff on the net. Simple as that. A few years ago, my flow of links
came mostly from social news sites (&lt;a href="http://news.ycombinator.com"&gt;Hacker News&lt;/a&gt;
and &lt;a href="http://reddit.com"&gt;Reddit&lt;/a&gt;), and items shared by people I follow on social
networks. Over time, I became more and more disenchanted with this way of doing
things. The social news approach is to take a torrent of very low quality links
(user submissions), and then crowd-source the filtration process through
voting.  But popularity is not a good measure of information quality, and the
result is a bland, lowest-common-denominator view of the world that has no room
for anything that doesn't make it to the front page. Don't get me wrong -
Reddit and HN do a lot of other things well - but they just don't cut it as
primary information sources. Mining links from social networks is a more
promising approach, but still problematic. None of the social networks provide
the tools needed to extract shared links from the update stream and consume
them efficiently. There is also a structural issue - I don't necessarily want
to mix my social ties and my information sources, and I definitely don't want
to be limited to just one platform. These are separate functions that I feel
require separate tools.&lt;/p&gt;

&lt;h1&gt;My personal link mill&lt;/h1&gt;

&lt;p&gt;Eventually, I took matters into my own hands. First, I hugely broadened the
number of information sources I consumed. The tool I use for this is Google
Reader - I now subscribe to about 800 individual feeds, and this number is
growing daily. The trick here is to find high-quality, low-volume link sources.
The motherlode of good links for me was to be found on social bookmarking
sites. About 700 of my subscriptions are to the RSS feeds of individual users
on &lt;a href="http://pinboard.in"&gt;Pinboard&lt;/a&gt; and &lt;a href="http://delicious.com"&gt;Delicious&lt;/a&gt;. This
gives me very fine control and a great mix of interests. Plus, getting links
from individual curators handily sidesteps the social news group-think problem.
The remainder of my subscriptions are split between blogs, some sub-Reddits, a
few Twitter users and subsections of &lt;a href="http://arxiv.org"&gt;arXiv&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So much for how my intake works. Just as important is the way that I consume
it. I do my "filtering" in batches, usually in the evening. Using
&lt;a href="http://reederapp.com/"&gt;Reeder&lt;/a&gt; on my iPad works well for me, letting me flick
quickly and comfortably through all the new links of the day. When I find
something that looks interesting, I resist the temptation to read it then and
there - instead, I batch up all my reading for later. If it's a web page, it
goes to &lt;a href="http://www.instapaper.com/"&gt;Instapaper&lt;/a&gt;.  If it's a PDF, it gets
downloaded into a &lt;a href="http://www.dropbox.com/"&gt;DropBox&lt;/a&gt; folder, which is synced to
&lt;a href="http://www.goodiware.com/goodreader.html"&gt;GoodReader&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Finally, the actual reading. Every morning, I toddle off to a nice cafe with my
iPad, and read all the interesting stuff I saved the previous day in a single
sitting. I'm ruthless about just skimming things that don't warrant careful
attention. If I find something particularly interesting I save it permanently,
and perhaps tweet it or mail it to someone I think might be interested. &lt;/p&gt;

&lt;h1&gt;Problems - and a product idea?&lt;/h1&gt;

&lt;p&gt;This system works for me, but it has many problems. There's no end-to-end
coordination, so by the time I sit down to actually read something, I have no
easy way to tell which feed it came from. Google Reader sucks at managing
hundreds of low-volume subscriptions. Reeder is great, but is not tailored to
consuming redundant information from many sources. The end result is that
maintaining the system I have is a time-consuming pain in the ass. The fact
that it's still worth it despite this makes me think there might be commercial
room for a better solution.&lt;/p&gt;

&lt;p&gt;Which brings me to a rough product idea - a formalized version of this link
mill for people who want to take direct control of their information intake.
The business end is a generalized feed consumer, letting you subscribe to RSS
feeds, Twitter users, Google+ updates, sub-Reddits and other information
sources.  Links are extracted from these feeds, keeping track of which links
appeared where. The user is then presented with a stream of links to consume,
de-duplicated so that those appearing in multiple feeds are presented only
once. The system keeps track of links the user marks as "interesting", batching
them for later consumption. It also uses this information to score the feeds,
letting the user see which feeds are low quality, and should be ditched. Given
the right tools, the time needed for a user to maintain and tend their link
feed garden would be quite modest, and the rewards would be great.&lt;/p&gt;
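
&lt;p&gt;The core bookkeeping is small enough to sketch in a few lines of Python. This
is a toy model under stated assumptions, not a design: the class and method
names are hypothetical, links are compared as exact strings (no URL
normalisation), and a feed's score is simply the fraction of its links that
were marked interesting.&lt;/p&gt;

&lt;pre&gt;
```python
from collections import defaultdict


class LinkMill:
    """Toy link mill: de-duplicate links across feeds and score the feeds."""

    def __init__(self):
        self.sources = defaultdict(set)  # maps each url to the feeds it appeared in
        self.order = []                  # urls in first-seen order
        self.interesting = set()         # urls the user has marked

    def add(self, feed, url):
        """Record that url appeared in feed."""
        if url not in self.sources:
            self.order.append(url)
        self.sources[url].add(feed)

    def stream(self):
        """Each link once, together with every feed it came from."""
        return [(url, sorted(self.sources[url])) for url in self.order]

    def mark_interesting(self, url):
        self.interesting.add(url)

    def feed_scores(self):
        """Fraction of each feed's links that were marked interesting."""
        seen = defaultdict(int)
        hits = defaultdict(int)
        for url, feeds in self.sources.items():
            for feed in feeds:
                seen[feed] += 1
                if url in self.interesting:
                    hits[feed] += 1
        return {feed: hits[feed] / seen[feed] for feed in seen}


mill = LinkMill()
mill.add("pinboard/alice", "http://example.com/a")
mill.add("reddit/programming", "http://example.com/a")
mill.add("reddit/programming", "http://example.com/b")
mill.mark_interesting("http://example.com/a")
print(mill.stream())
print(mill.feed_scores())
```
&lt;/pre&gt;

&lt;p&gt;Because every link remembers all the feeds it appeared in, the "where did I
find this?" question has an answer at reading time, and consistently
low-scoring feeds are easy to spot.&lt;/p&gt;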

&lt;p&gt;If someone built this, I for one would gladly fork over some of my hard-earned
doubloons to use it. In fact, with some validation of the idea and a few
collaborators I might think of building it myself. Does this sound useful to
anyone else?&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/socialmedia/linkmill/index.html</guid><pubDate>Fri, 30 Dec 2011 15:42:00 GMT</pubDate></item><item><title>Visualizing binaries with space-filling curves</title><link>http://corte.si/posts/visualisation/binvis/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/visualisation/binvis/index.html"&gt;Visualizing binaries with space-filling curves&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;23 December 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;In my day job I often come across binary files with unknown content. I have a
set of standard avenues of attack when I confront such a beast - use "file" to
see if it's a known file type, "strings" to see if there's readable text, run
some in-house code to extract compressed sections, and, of course, fire up a
hex editor to take a direct look. There's something missing in that list,
though - I have no way to get a quick view of the overall structure of the
file.  Using a hex editor for this is not much chop - if the first section of
the file looks random (i.e. probably compressed or encrypted), who's to say
that there isn't a chunk of non-random information a meg further down?
Ideally, we want to do this type of broad pattern-finding by eye, so a
visualization seems to be in order.&lt;/p&gt;

&lt;p&gt;First, let's begin by picking a colour scheme. We have 256 different byte
values, but for a first-pass look at a file, we can compress that down into a
few common classes:&lt;/p&gt;

&lt;table&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #000000"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;0x00&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #ffffff"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;0xFF&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #377eb8"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;Printable characters&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="background-color: #e41a1c"&gt;&amp;nbsp;&lt;/td&gt;
        &lt;td&gt;Everything else&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;
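
&lt;p&gt;As a minimal Python sketch, the same scheme might look like the function below.
The name byte_colour is hypothetical, and "printable" is assumed here to mean
Python's string.printable (ASCII letters, digits, punctuation and whitespace).&lt;/p&gt;

&lt;pre&gt;
```python
import string

# Colours copied from the table above.
BLACK, WHITE, BLUE, RED = "#000000", "#ffffff", "#377eb8", "#e41a1c"

# Assumption: "printable" means the byte values in Python's string.printable.
PRINTABLE = frozenset(string.printable.encode("ascii"))


def byte_colour(value):
    """Map a byte value (0 to 255) to the colour of its class."""
    if value == 0x00:
        return BLACK
    if value == 0xFF:
        return WHITE
    if value in PRINTABLE:
        return BLUE
    return RED
```
&lt;/pre&gt;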

&lt;p&gt;This covers the most common padding bytes, nicely highlights strings, and lumps
everything else into a miscellaneous bucket. The broad outline of what we need
to do next is clear - we sample the file at regular intervals, translate each
sampled byte to a colour, and write the corresponding pixel to our image. This
brings us to the big question - what's the best way to arrange the pixels? A
first stab might be to lay the pixels out row by row, snaking to and fro to
make sure each pixel is always adjacent to its predecessor. It turns out,
however, that this zig-zag pattern is not very satisfying - small scale
features (i.e. features that take up only a few lines) tend to get lost.  What
we want is a layout that maps our one-dimensional sequence of samples onto the
2-d image, while keeping elements that are close together in one dimension as
near as possible to each other in two dimensions.  This is called "locality
preservation", and the &lt;a href="http://en.wikipedia.org/wiki/Space-filling_curve"&gt;space-filling
curves&lt;/a&gt; are a family of
mathematical constructs that have precisely this property. If you're a regular
reader of this blog, you may know that I have an
&lt;a href="http://corte.si/posts/code/hilbert/portrait/index.html"&gt;almost&lt;/a&gt;
&lt;a href="http://corte.si/posts/code/sortvis-fruitsalad/index.html"&gt;unseemly&lt;/a&gt;
&lt;a href="http://corte.si/posts/code/hilbert/swatches/index.html"&gt;fondness&lt;/a&gt; for these critters. So,
let's add a couple of space-filling curves to the mix to see how they stack up.
The &lt;a href="http://en.wikipedia.org/wiki/Z-order_curve"&gt;Z-Order curve&lt;/a&gt; has found wide
practical use in computer science. It's not the best in terms of locality
preservation, but it's easy and quick to compute. The &lt;a href="http://en.wikipedia.org/wiki/Hilbert_curve"&gt;Hilbert
curve&lt;/a&gt;, on the other hand, is
(nearly) as good as it gets at locality preservation, but is much more
complicated to generate. Here's what our three candidate curves look like - in
each case, the traversal starts in the top-left corner:&lt;/p&gt;

&lt;table class="spacertable"&gt;
    &lt;tr&gt;
        &lt;td&gt;Zigzag&lt;/td&gt;
        &lt;td&gt;Z-order&lt;/td&gt;
        &lt;td&gt;Hilbert&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;img src="http://corte.si/posts/visualisation/binvis/zigzag.png"/&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src="http://corte.si/posts/visualisation/binvis/zorder.png"/&gt;&lt;/td&gt;
        &lt;td&gt;&lt;img src="http://corte.si/posts/visualisation/binvis/hilbert.png"/&gt;&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;
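
&lt;p&gt;For the curious, here is a rough Python sketch of the three layouts as
functions from a position in the sequence to a pixel coordinate. These are
textbook formulations rather than the scurve code: the function names are
hypothetical, each works on a square grid with sides of 2**order pixels, and
(0, 0) is taken to be the top-left corner.&lt;/p&gt;

&lt;pre&gt;
```python
def zigzag(order, d):
    """Row by row, reversing direction on every other row."""
    width = 2 ** order
    row, col = divmod(d, width)
    if row % 2 == 1:
        col = width - 1 - col
    return col, row


def zorder(order, d):
    """Z-order: the even bits of d form x, the odd bits form y."""
    x = y = 0
    for i in range(order):
        x += ((d // 4 ** i) % 2) * 2 ** i
        y += ((d // (2 * 4 ** i)) % 2) * 2 ** i
    return x, y


def hilbert(order, d):
    """Hilbert curve: the usual iterative index-to-point conversion."""
    x = y = 0
    t = d
    for i in range(order):
        s = 2 ** i
        rx = (t // 2) % 2
        ry = (t ^ rx) % 2
        # Rotate or flip the quadrant built so far, so the pieces join up.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
    return x, y


def layout(curve, order):
    """Pixel coordinates for every index along the curve, in order."""
    return [curve(order, d) for d in range(4 ** order)]
```
&lt;/pre&gt;

&lt;p&gt;In the zigzag and Hilbert layouts, consecutive samples always land on
neighbouring pixels. The Z-order curve makes occasional long jumps, which is
the price of its simplicity.&lt;/p&gt;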

&lt;p&gt;And here they are, visualizing the
&lt;a href="http://en.wikipedia.org/wiki/Korn_shell"&gt;ksh&lt;/a&gt;
(&lt;a href="http://en.wikipedia.org/wiki/Mach-O"&gt;Mach-O&lt;/a&gt;,
&lt;a href="http://en.wikipedia.org/wiki/Fat_binary"&gt;dual-architecture&lt;/a&gt;) binary
distributed with OSX - click for the significantly more spectacular larger
versions of the images:&lt;/p&gt;

&lt;table class="spacertable"&gt;
    &lt;tr&gt;
        &lt;td&gt;Zigzag&lt;/td&gt;
        &lt;td&gt;Z-order&lt;/td&gt;
        &lt;td&gt;Hilbert&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;
            &lt;a href="http://corte.si/posts/visualisation/binvis/binary-large-zigzag.png"&gt;
                &lt;img src="http://corte.si/posts/visualisation/binvis/binary-zigzag.png"/&gt;
            &lt;/a&gt;
        &lt;/td&gt;
        &lt;td&gt;
            &lt;a href="http://corte.si/posts/visualisation/binvis/binary-large-zorder.png"&gt;
                &lt;img src="http://corte.si/posts/visualisation/binvis/binary-zorder.png"/&gt;
            &lt;/a&gt;
        &lt;/td&gt;
        &lt;td&gt;
            &lt;a href="http://corte.si/posts/visualisation/binvis/binary-large-hilbert.png"&gt;
                &lt;img src="http://corte.si/posts/visualisation/binvis/binary-hilbert.png"/&gt;
            &lt;/a&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The classical Hilbert and Z-Order curves are actually square, so for these
visualizations I've unrolled them, stacking four sub-curves on top of each
other.  To my eye, the Hilbert curve is the clear winner here. Local features
are prominent because they are nicely clumped together. The Z-order curve shows
some annoying artifacts with contiguous chunks of data sometimes split between
two or more visual blocks. &lt;/p&gt;

&lt;p&gt;The downside of the space-filling curve visualizations is that we can't look at
a feature in the image and tell where, exactly, it can be found in the file.
I'm toying with the idea (though not very seriously) of writing an interactive
binary file viewer with a space-filling curve navigation pane. This would let
the user click on or hover over a patch of structure and see the file offset
and the corresponding hex. &lt;/p&gt;

&lt;h1&gt;More detail&lt;/h1&gt;

&lt;p&gt;We can get more detail in these images by increasing the granularity of the
colour mapping. One way to do this is to use a trick I first concocted to
&lt;a href="http://corte.si/posts/code/hilbert/portrait/index.html"&gt;visualize the Hilbert Curve at
scale&lt;/a&gt;. The basic idea is to use a
3-d Hilbert curve traversal of the RGB colour cube to create a palette of
colours. This makes use of the locality-preserving properties of the Hilbert
curve to make sure that similar elements have similar colours in the
visualization. See the &lt;a href="http://corte.si/posts/code/hilbert/portrait/index.html"&gt;original
post&lt;/a&gt; for more.&lt;/p&gt;

&lt;p&gt;So, here's a Hilbert curve mapping of a binary file, using a Hilbert-order
traversal of the RGB cube as a colour palette. Again, click on the image for
the much nicer large scale version:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;a href="http://corte.si/posts/visualisation/binvis/hilbert-hilbert-large.png"&gt;
        &lt;img src="http://corte.si/posts/visualisation/binvis/hilbert-hilbert.png"/&gt;
    &lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;This shows significantly more fine-grained structure, which might be good for a
deep dive into a binary. On the other hand, the colours don't map cleanly to
distinct byte classes, so the image is harder to interpret. An ideal hex viewer
would let you flick between the two palettes for navigation. &lt;/p&gt;

&lt;h1&gt;The code&lt;/h1&gt;

&lt;p&gt;As usual, I'm publishing the code for generating all of the images in this
post. The binary visualizations were created with
&lt;a href="https://github.com/cortesi/scurve/blob/master/binvis"&gt;binvis&lt;/a&gt;, which is a new
addition to &lt;a href="https://github.com/cortesi/scurve"&gt;scurve&lt;/a&gt;, my space-filling curve
project. The curve diagrams were made with the "drawcurve" utility to be found
in the same place.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/visualisation/binvis/index.html</guid><pubDate>Fri, 23 Dec 2011 06:02:00 GMT</pubDate></item><item><title>netograph.com - Realtime privacy snapshots of the social web</title><link>http://corte.si/posts/netograph/launch/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/netograph/launch/index.html"&gt;netograph.com - Realtime privacy snapshots of the social web&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;08 December 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;Today, I'm launching &lt;a href="http://netograph.com"&gt;Netograph&lt;/a&gt;, a new privacy-related
site that I've been hacking on over the past few months. The goal of the
project is to provide you with a quick overview of the privacy picture for a
URL, &lt;strong&gt;before&lt;/strong&gt; you've clicked on the link. At the moment, Netograph scans
&lt;a href="http://reddit.com"&gt;Reddit&lt;/a&gt;, &lt;a href="http://news.ycombinator.com"&gt;Hacker News&lt;/a&gt;,
&lt;a href="http://pinboard.in"&gt;Pinboard&lt;/a&gt;, &lt;a href="http://delicious.com"&gt;Delicious&lt;/a&gt; and
&lt;a href="http://digg.com"&gt;Digg&lt;/a&gt; - links on these sites should show up within a few
minutes of submission.&lt;/p&gt;

&lt;p&gt;For more details, head over to &lt;a href="http://netograph.com"&gt;netograph.com&lt;/a&gt;. There you
will also find
&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/netograph/"&gt;Firefox&lt;/a&gt; and
&lt;a href="https://chrome.google.com/webstore/detail/bfhmbldbigkpniinkmckafbgcajcbaai"&gt;Chrome&lt;/a&gt;
browser addons that let you view the Netograph report for a URL instantly with
a right-click. Enjoy!&lt;/p&gt;

&lt;table class="spacertable"&gt;
    &lt;tr&gt;

        &lt;td&gt;
            &lt;a href="http://netograph.com/starmap/1740"&gt;
                &lt;img src="http://corte.si/posts/netograph/launch/ng-guardian.png"&gt;
                guardian.co.uk
            &lt;/a&gt;
        &lt;/td&gt;

        &lt;td&gt;
            &lt;a href="http://netograph.com/starmap/2512"&gt;
                &lt;img src="http://corte.si/posts/netograph/launch/ng-techcrunch.png"&gt;
                techcrunch.com
            &lt;/a&gt;
        &lt;/td&gt;

        &lt;td&gt;
            &lt;a href="http://netograph.com/starmap/2457"&gt;
                &lt;img src="http://corte.si/posts/netograph/launch/ng-reddit.png"&gt;
                reddit.com
            &lt;/a&gt;
        &lt;/td&gt;

    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;What's next?&lt;/h2&gt;

&lt;p&gt;This is just the first step. As I hinted in a &lt;a href="http://corte.si/posts/privacy/neighbourhoods-of-trust/index.html"&gt;previous
post&lt;/a&gt;, the most interesting
results from Netograph are likely to come from aggregating and
cross-correlating the data for individual URLs. I'm already hard at work on
this - the next iteration of Netograph will aim to shine some light on the
sometimes shadowy network of third parties that track and analyze nearly every
URL we visit. I will also be publishing some interesting tidbits from this data
corpus on my blog as I go along, so watch this space.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/netograph/launch/index.html</guid><pubDate>Thu, 08 Dec 2011 05:39:00 GMT</pubDate></item><item><title>Otago Polytechnic Talk</title><link>http://corte.si/posts/talks/polytech.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/talks/polytech.html"&gt;Otago Polytechnic Talk&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;31 October 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;Further reading for the guest lecture I'm giving at Otago Polytechnic today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The talk I'm not giving: &lt;a href="https://www.owasp.org/index.php/Top_10_2010-Main"&gt;OWASP Top 10&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tools: &lt;a href="http://getfirebug.com/"&gt;FireBug&lt;/a&gt;, &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/tamper-data/"&gt;TamperData&lt;/a&gt;, &lt;a href="http://python.org"&gt;Python&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/Samy_(XSS)"&gt;Myspace Worm&lt;/a&gt;, and Samy
Kamkar's &lt;a href="http://namb.la/popular/tech.html"&gt;own explanation of the exploit&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Halvar Flake's &lt;a href="http://www.immunityinc.com/infiltrate/2011/presentations/Fundamentals_of_exploitation_revisited.pdf"&gt;Programming and state machines&lt;/a&gt;, which is where I first saw the term "programming the weird machine".&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/talks/polytech.html</guid><pubDate>Mon, 31 Oct 2011 09:05:00 GMT</pubDate></item><item><title>Neighborhoods of trust on the web</title><link>http://corte.si/posts/privacy/neighbourhoods-of-trust/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/privacy/neighbourhoods-of-trust/index.html"&gt;Neighborhoods of trust on the web&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;27 September 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;For the last fortnight I've been hard at work on a new project that aims to
examine trust and security on the web at scale. The basic idea is to use a
browser instance to render a URL, and then to extract all persistent state with
browser forensic techniques afterwards. This gives you a dump of cookies, cache
contents, Flash storage, HTML5 databases, and so on. At the same time, all
traffic is routed through a specialised version of
&lt;a href="http://mitmproxy.org"&gt;mitmproxy&lt;/a&gt;, and captured for later analysis. The result
is a very detailed snapshot of what viewing a given URL actually &lt;em&gt;does&lt;/em&gt;. The
next step is to do this "at scale" - this means running many instances of this
process in parallel on headless servers, decoupling things using queues,
backing it all onto a database, and then spending days and days fine-tuning.
I'm happy with my progress so far - my infrastructure is now scanning all
the URLs passing through &lt;a href="http://news.ycombinator.com"&gt;Hacker News&lt;/a&gt;,
&lt;a href="http://reddit.com"&gt;Reddit&lt;/a&gt;, &lt;a href="http://digg.com"&gt;Digg&lt;/a&gt;,
&lt;a href="http://delicious.com"&gt;Delicious&lt;/a&gt; and &lt;a href="http://pinboard.in"&gt;Pinboard&lt;/a&gt; in
realtime, without breaking a sweat.&lt;/p&gt;

&lt;p&gt;I am pretty excited about the possibilities for this project, and I'm exploring
plans for the future with like-minded security folk. Get in touch if this
interests you, and keep an eye on my blog for more news.&lt;/p&gt;

&lt;p&gt;After my pilot run, I had 150 gigs of data covering about 120 thousand URLs.
Below is a quick peek at one tiny slice of this data - an appetizer for things
to come.&lt;/p&gt;

&lt;h1&gt;Neighborhoods of trust&lt;/h1&gt;

&lt;p&gt;&lt;a href="http://corte.si/posts/privacy/neighbourhoods-of-trust/images/full.png"&gt;
    &lt;img src="http://corte.si/posts/privacy/neighbourhoods-of-trust/images/wholegraph.png"/&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This graph shows structures that emerge from the way sites use third-party
executable resources. In this context, "executable" means JavaScript,
Flash and HTML, and "third-party" means domains other than the URL's own. The
nodes in this graph are the third-party domains, and the edges are associations
between them via the URLs I crawled. For example, if a site loaded scripts from
both Google Analytics and from Doubleclick, that would create (or reinforce) an
edge between the nodes "google-analytics.com" and "doubleclick.net".  Using
this data, I calculated a co-occurrence coefficient for the third-party
sources, and then extracted the resulting neighbourhood structures
&lt;a href="http://lanl.arxiv.org/abs/0803.0476"&gt;algorithmically&lt;/a&gt;. The neighbourhood
information was used to colour and lay out the graph, trying to keep nodes that
are closely correlated together. Finally, nodes are scaled based on how many
URLs reference them.&lt;/p&gt;
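
&lt;p&gt;The edge-building step described above can be sketched roughly as follows.
This is a minimal illustration, not the code used for the graph: the input
format (a mapping from crawled URL to the set of third-party domains it loaded),
the function names and the sample data are all assumptions, and raw pair counts
stand in for the co-occurrence coefficient.&lt;/p&gt;

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(crawl):
    """Count how often each pair of third-party domains appears on the same URL.

    `crawl` maps a crawled URL to the set of third-party domains it loaded
    executable resources from (a hypothetical input format).
    """
    edges = Counter()
    for domains in crawl.values():
        # Sort so that each undirected pair has a single canonical key.
        for a, b in combinations(sorted(domains), 2):
            edges[(a, b)] += 1
    return edges


def node_weights(crawl):
    """Count how many URLs reference each third-party domain (node size)."""
    weights = Counter()
    for domains in crawl.values():
        weights.update(set(domains))
    return weights


# Made-up sample data for illustration only.
crawl = {
    "http://example.com/a": {"google-analytics.com", "doubleclick.net"},
    "http://example.org/b": {"google-analytics.com", "doubleclick.net", "addthis.com"},
    "http://example.net/c": {"addthis.com"},
}
edges = cooccurrence_edges(crawl)
weights = node_weights(crawl)
```

&lt;p&gt;Each URL that loads two domains together reinforces the edge between them,
and the per-domain URL count is what a node would be scaled by.&lt;/p&gt;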

&lt;p&gt;The result is a rather stunning graph showing neighborhoods of trust - areas of
the Internet bound together based on the third parties allowed to run code in
users' browsers. I've spent a few hours playing with this data, and the sheer
range of interesting structure is surprising. At one end of the spectrum, you
can zoom in to the individual node relationships, and find small clusters of
surprising sites that cross-load resources from each other, often because they
are owned by the same entity. At the other end, countries, language groups, and
broad fields of interest aggregate in huge tribes of kinship.&lt;/p&gt;

&lt;p&gt;Here are a few of the larger-scale features from the graph: &lt;/p&gt;

&lt;table class="layouttable"&gt;
    &lt;tr&gt;
        &lt;td&gt;
            &lt;img style="float: left" src="http://corte.si/posts/privacy/neighbourhoods-of-trust/images/wholegraph-b.png"/&gt;
        &lt;/td&gt;
        &lt;td&gt;

                &lt;h2&gt;Mainstream&lt;/h2&gt;

                The most widely used resources dominate in the neighbourhood
                extraction algorithm, which causes them to cluster together in
                their own super-community. The top nodes in this cluster, in
                descending order of occurrence, are: google-analytics.com,
                facebook.com, doubleclick.net, fbcdn.net, quantserve.com,
                twitter.com, google.com, googlesyndication.com, googleapis.com,
                scorecardresearch.net, facebook.net, addthis.com. These are
                also the top nodes overall.
        &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;
            &lt;img style="clear: left; float: left;" src="http://corte.si/posts/privacy/neighbourhoods-of-trust/images/wholegraph-a.png"/&gt;
        &lt;/td&gt;
        &lt;td&gt;

            &lt;h2&gt;Japanese&lt;/h2&gt;

            The main resources are hatena.ne.jp, microad.jp, mixi.jp,
            yahoo.co.jp, nakanohito.jp. More surprisingly, also in this cluster
            are topsy.com, appspot.com and postrank.com. Perhaps these
            resources are especially commonly used on Japanese sites. 

        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;

            &lt;img src="http://corte.si/posts/privacy/neighbourhoods-of-trust/images/wholegraph-d.png"/&gt;

        &lt;/td&gt;
        &lt;td&gt;

            &lt;h2&gt;Russian&lt;/h2&gt;

            Top resources are yadro.ru, yandex.ru, rambler.ru, vkontakte.ru,
            openstat.net, userapi.com, shinystat.net, and dt00.net.

        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;

            &lt;img src="http://corte.si/posts/privacy/neighbourhoods-of-trust/images/wholegraph-c.png"/&gt;

        &lt;/td&gt;
        &lt;td&gt;

            &lt;h2&gt;Porn&lt;/h2&gt;

            And here we have a portion of the web dedicated to pron. The top
            resources are awempire.com, clickbank.net, picadmedia.com,
            getresponse.com, adultfriendfinder.com, adultadword.com, phcdn.com,
            juicyads.com, brazzers.com, etology.com, data-ero-advertising.com
            and viddler.com. A more surprising inclusion in this group is
            wufoo.com - I wonder if this is an artifact, or whether Wufoo
            really does have a use in the adult content world.  

        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;

            &lt;img src="http://corte.si/posts/privacy/neighbourhoods-of-trust/images/wholegraph-e.png"/&gt;

        &lt;/td&gt;
        &lt;td&gt;

            &lt;h2&gt;Misc&lt;/h2&gt;

            Just to show that it's not all clear-cut, here's an example of a
            neighbourhood I find harder to explain. The top resources are
            netdna-cdn.com, amgdgt.com, trafficmp.com, ooyala.com,
            suitesmart.com, demdex.net, adfrontiers.com, lycos.com and
            break.com. I speculate that this group might be loosely aligned
            around a number of big CDNs and analysis suites.

        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The graph in this post was created, analyzed and pre-processed using
&lt;a href="http://projects.skewed.de/graph-tool/"&gt;graph-tool&lt;/a&gt;, a great Python library for
dealing with large graphs. The visualization and modularity analysis was done
using the ever-wonderful &lt;a href="http://gephi.org/"&gt;Gephi&lt;/a&gt;. If these aren't both in
your arsenal of analysis tools, you're missing out.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/privacy/neighbourhoods-of-trust/index.html</guid><pubDate>Tue, 27 Sep 2011 23:23:00 GMT</pubDate></item><item><title>Why the Apple UDID had to die</title><link>http://corte.si/posts/security/udid-must-die/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/security/udid-must-die/index.html"&gt;Why the Apple UDID had to die&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;09 September 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;strong&gt;EDIT: A &lt;a href="http://blogs.wsj.com/digits/2011/09/19/privacy-risk-found-on-cellphone-games/"&gt;WSJ Digits
article&lt;/a&gt;
is now up, containing responses from Zynga and Chillingo. Other networks
declined to comment.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A UDID is a "Unique Device Identifier" - you can think of it as a serial number
burned permanently into every iPhone, iPad and iPod Touch. Any installed app
can access the UDID without requiring the user's knowledge or consent.  We know
that UDIDs are very widely used - in a sample of 94 apps I tested, &lt;a href="http://corte.si/posts/security/apple-udid-survey/index.html"&gt;74%
silently sent the UDID to one or more servers on the
Internet&lt;/a&gt;, often without encryption.
This means that UDIDs are not secret values - if you use an Apple device
regularly, it's certain that your UDID has found its way into scores of
databases you're entirely unaware of. Developers often assume UDIDs are
anonymous values, and routinely use them to aggregate detailed and sensitive
user behavioural information. One example is Flurry, a mobile analytics firm
used by 15% of apps I tested, which can monitor application startup, shutdown,
scores achieved, and a host of other application-specific events, all linked to
the user's UDID. I recently showed that it was possible to use
&lt;a href="http://en.wikipedia.org/wiki/OpenFeint"&gt;OpenFeint&lt;/a&gt;, a large mobile social
gaming network, to &lt;a href="http://corte.si/posts/security/openfeint-udid-deanonymization/index.html"&gt;de-anonymize
UDIDs&lt;/a&gt;, linking them to
usernames, email addresses, GPS locations, and even Facebook profiles.&lt;/p&gt;

&lt;p&gt;This post looks at the way UDIDs are used in the broader social gaming
ecosystem. The work is based on a simple question: what happens if we swap our
UDID for another while communicating with the network?  There are a number of
ways to do this - in my case I used &lt;a href="http://mitmproxy.org"&gt;mitmproxy&lt;/a&gt;, an
intercepting HTTP/S proxy I developed which lets me re-write the traffic
leaving a device on the fly. In most cases this was a simple matter of
replacing one string with another, but two networks (Scoreloop and Crystal)
prevented UDID substitution using cryptography. Unfortunately, both networks
relied on the secrecy of key material distributed in the application binaries
to every device. I have verified that it is possible to reverse engineer the
application binaries to extract the key material and circumvent the
cryptographic protection.&lt;/p&gt;
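
&lt;p&gt;To see why a key shipped in the binary offers little protection, consider the
sketch below. It is purely illustrative: the post does not say which scheme
either network used, so the HMAC-SHA256 signature, the request layout, the key
and both UDIDs are assumptions made up for the example.&lt;/p&gt;

```python
import hashlib
import hmac


def swap_udid(body, own_udid, target_udid):
    """Replace every occurrence of one UDID with another in a request body."""
    return body.replace(own_udid, target_udid)


def sign(body, key):
    """Hypothetical request signature: HMAC-SHA256 over the body."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


# All values below are made up for illustration.
OWN_UDID = b"a" * 40
TARGET_UDID = b"b" * 40
EMBEDDED_KEY = b"key-shipped-inside-the-app-binary"

original = b"udid=" + OWN_UDID + b";action=login"
forged = swap_udid(original, OWN_UDID, TARGET_UDID)

# A server that only checks the signature against the body cannot tell the
# difference: anyone holding the extracted key can sign a forged body.
forged_signature = sign(forged, EMBEDDED_KEY)
server_accepts = hmac.compare_digest(forged_signature, sign(forged, EMBEDDED_KEY))
```

&lt;p&gt;The signature only proves knowledge of the key, and the key is on every
device that has the app installed.&lt;/p&gt;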

&lt;p&gt;The outcome of this experiment shows that social gaming networks systematically
misuse UDIDs, resulting in serious privacy breaches for their users. All the
networks I tested allowed UDIDs to be linked to potentially identifying user
information, ranging from usernames to email addresses, friends lists and
private messages. Furthermore, 5 of the 7 networks allow an attacker to log in
as a user using only their UDID, giving the attacker complete control of the
user's account. Two networks had further problems that compromised a user's
Facebook and Twitter accounts - Crystal lets an attacker take control of a user's
accounts by leaking API keys, while Scoreloop partially discloses users'
friends lists, even if they are private. &lt;/p&gt;

&lt;style&gt;
    .yes {
        background-color: #d55858;
        color: #000000;
    }
    .no {
        background-color: #5bd65b;
        color: #000000;
    }
&lt;/style&gt;

&lt;table&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;/th&gt;
        &lt;th&gt;Data leaked&lt;/th&gt;
        &lt;th&gt;Log in as user&lt;/th&gt;
        &lt;th&gt;Social Media Accounts&lt;/th&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;a href="http://www.chillingo.com/"&gt;Crystal&lt;/a&gt;&lt;/th&gt;
        &lt;td class="yes"&gt; Username, friends, Facebook, Twitter, games played, location, email address &lt;/td&gt;
        &lt;td class="yes"&gt; Yes &lt;/td&gt;
        &lt;td class="yes"&gt; Control of Facebook, Twitter accounts&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;a href="http://www.gameloft.com/"&gt;GameLoft&lt;/a&gt;&lt;/th&gt;
        &lt;td class="yes"&gt; Username, email address, games played, nationality, friends &lt;/td&gt;
        &lt;td class="yes"&gt; Yes &lt;/td&gt;
        &lt;td class="no"&gt; No &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;a href="http://www.geocade.com/"&gt;Geocade&lt;/a&gt;&lt;/th&gt;
        &lt;td class="yes"&gt; Username, email address, games played, location &lt;/td&gt;
        &lt;td class="yes"&gt; Yes &lt;/td&gt;
        &lt;td class="no"&gt; No &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;a href="http://openfeint.com/"&gt;OpenFeint&lt;/a&gt;&lt;/th&gt;
        &lt;td class="yes"&gt; Username, last played game, online status, friends &lt;/td&gt;
        &lt;td class="yes"&gt; Yes &lt;/td&gt;
        &lt;td class="no"&gt; No &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;a href="http://www.scoreloop.com/"&gt;Scoreloop&lt;/a&gt;&lt;/th&gt;
        &lt;td class="yes"&gt; Email address, gender, username, nationality, friends &lt;/td&gt;
        &lt;td class="yes"&gt; Yes &lt;/td&gt;
        &lt;td class="yes"&gt; Access private Facebook and Twitter friends lists &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;a href="http://plusplus.com/"&gt;Plus+&lt;/a&gt;&lt;/th&gt;
        &lt;td class="yes"&gt; Username &lt;/td&gt;
        &lt;td class="no"&gt; No &lt;/td&gt;
        &lt;td class="no"&gt; No &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;&lt;a href="http://www.zynga.com/"&gt;Zynga&lt;/a&gt;&lt;/th&gt;
        &lt;td class="yes"&gt; First name, username, friends*, in-game messages*,
        mobile number* &lt;/td&gt;
        &lt;td class="yes"&gt; Yes* &lt;/td&gt;
        &lt;td class="no"&gt; No &lt;/td&gt;
    &lt;/tr&gt;

&lt;/table&gt;

&lt;p&gt;* The starred Zynga findings rely on the fact that other networks can be used
to obtain the user's email address using the UDID. &lt;/p&gt;

&lt;p&gt;There are two caveats to keep in mind while considering these results. First,
the findings are based on the default settings for each social network - some
networks may have settings that reduce the amount of information exposed.
Second, some of the data leaked is optional - for instance, it's not mandatory
for a user to link Facebook or Twitter accounts with any of the networks. &lt;/p&gt;

&lt;p&gt;All the affected companies and Apple were notified 5 weeks ago. The Crystal and
Scoreloop teams have both repaired the problems that could lead to a follow-on
compromise of a user's social network accounts. At the time of writing, it is
still possible to log in as a user using only a UDID on five of the vulnerable
networks. &lt;/p&gt;

&lt;h1&gt;The future&lt;/h1&gt;

&lt;p&gt;A few days after I notified the companies involved, it was revealed that Apple
was &lt;a href="http://techcrunch.com/2011/08/19/apple-ios-5-phasing-out-udid/"&gt;quietly killing the UDID
API&lt;/a&gt;. It will
still be present in iOS 5, but is marked deprecated, and will probably be
removed in future. I recommend that developers shift away from using UDIDs now,
rather than wait for formal removal of the API.&lt;/p&gt;

&lt;p&gt;We can now expect a frenzy of activity as developers look for alternatives. The
challenge will be to make sure that the cure isn't as bad as the disease -
Apple's recommendation to "create a unique identifier specific to your app"
could tempt developers to replicate the UDID mechanism on a smaller scale,
flaws and all. Expect more blog posts on this topic soon.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/security/udid-must-die/index.html</guid><pubDate>Fri, 09 Sep 2011 20:22:00 GMT</pubDate></item><item><title>mitmproxy 0.6</title><link>http://corte.si/posts/code/mitmproxy/announce0_6.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/announce0_6.html"&gt;mitmproxy 0.6&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;07 August 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://mitmproxy.org"&gt;
&lt;img src="http://corte.si/posts/code/mitmproxy/mitmproxy_0_4.png"/&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm happy to announce the release of mitmproxy 0.6, featuring a redesigned
scripting API, a slew of major new features and a panoply of small bugfixes and
improvements.&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://mitmproxy.org/downloads/mitmproxy-0.6.tar.gz"&gt;mitmproxy-0.6.tar.gz&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;We now have an IRC channel and a mailing list - if you're interested in
mitmproxy, come join us!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IRC&lt;/strong&gt;: #mitmproxy on the &lt;a href="http://www.oftc.net/oftc/"&gt;OFTC&lt;/a&gt; IRC network
(irc://irc.oftc.net:6667).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mailing List&lt;/strong&gt;: &lt;a href="http://groups.google.com/group/mitmproxy"&gt;http://groups.google.com/group/mitmproxy&lt;/a&gt;&lt;/p&gt;

&lt;table style="margin: 0; padding: 0; border: 0;"&gt;
  &lt;form action="http://groups.google.com/group/mitmproxy/boxsubscribe"&gt;
  &lt;tr&gt;&lt;td style="padding-left: 5px; border: 0;"&gt;
  Email: &lt;input type=text name=email&gt;
  &lt;input type=submit name="sub" value="Subscribe"&gt;
  &lt;/td&gt;&lt;/tr&gt;
  &lt;/form&gt;
&lt;/table&gt;

&lt;h2&gt;Changelog&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;New scripting API that allows much more flexible and fine-grained
rewriting of traffic. See the docs for more info.&lt;/li&gt;
&lt;li&gt;Support for gzip and deflate content encodings. A new "z"
keybinding in mitmproxy to let us quickly encode and decode content, plus
automatic decoding for the "pretty" view mode.&lt;/li&gt;
&lt;li&gt;An event log, viewable with the "v" shortcut in mitmproxy, and the "-e"
commandline argument in both mitmproxy and mitmdump.&lt;/li&gt;
&lt;li&gt;Huge performance improvements, both in the mitmproxy interface and when loading
large numbers of flows from file.&lt;/li&gt;
&lt;li&gt;A new "replace" convenience method for all flow objects, that does a
universal regex-based string replacement.&lt;/li&gt;
&lt;li&gt;Header management has been rewritten to maintain both case and order.&lt;/li&gt;
&lt;li&gt;Improved stability for SSL interception.&lt;/li&gt;
&lt;li&gt;Default expiry time on generated SSL certs has been dropped to avoid an
OpenSSL overflow bug that caused certificates to expire in the distant
past on some systems.&lt;/li&gt;
&lt;li&gt;A "pretty" view mode for JSON and form submission data.&lt;/li&gt;
&lt;li&gt;Expanded documentation and examples.&lt;/li&gt;
&lt;li&gt;Many other small improvements and bugfixes.&lt;/li&gt;
&lt;/ul&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/mitmproxy/announce0_6.html</guid><pubDate>Sun, 07 Aug 2011 10:30:00 GMT</pubDate></item><item><title>mitmproxy 0.5</title><link>http://corte.si/posts/code/mitmproxy/announce0_5.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/announce0_5.html"&gt;mitmproxy 0.5&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;27 June 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://mitmproxy.org"&gt;
&lt;img src="http://corte.si/posts/code/mitmproxy/mitmproxy_0_4.png"/&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I've just tagged and released mitmproxy 0.5. Everyone should update - this
release squelches a few annoying performance killers. You can download it from
the project website:&lt;/p&gt;

&lt;h2&gt;&lt;a href="http://mitmproxy.org"&gt;mitmproxy.org&lt;/a&gt;&lt;/h2&gt;

&lt;h2&gt;Changelog&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;An -n option to start the tools without binding to a proxy port.&lt;/li&gt;
&lt;li&gt;Allow scripts, hooks, sticky cookies etc. to run on flows loaded from
save files.&lt;/li&gt;
&lt;li&gt;Regularize command-line options for mitmproxy and mitmdump.&lt;/li&gt;
&lt;li&gt;Add an "SSL exception" to mitmproxy's license to remove possible
distribution issues.&lt;/li&gt;
&lt;li&gt;Add a --cert-wait-time option to make mitmproxy pause after a new SSL
certificate is generated. This can pave over small discrepancies in
system time between the client and server.&lt;/li&gt;
&lt;li&gt;Handle viewing big request and response bodies more elegantly. Only
render the first 100k of large documents, and try to avoid running the
XML indenter on non-XML data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BUGFIX&lt;/strong&gt;: Make the "revert" keyboard shortcut in mitmproxy work after a
flow has been replayed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BUGFIX&lt;/strong&gt;: Repair a problem that sometimes caused SSL connections to consume
100% of CPU.&lt;/li&gt;
&lt;/ul&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/mitmproxy/announce0_5.html</guid><pubDate>Mon, 27 Jun 2011 17:06:00 GMT</pubDate></item><item><title>UDID media roundup</title><link>http://corte.si/posts/security/udid-media-roundup.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/security/udid-media-roundup.html"&gt;UDID media roundup&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;10 June 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;After a hectic month, I'm finally able to return to the UDID privacy issues I
covered in my last few blog posts. I plan to publish some further results soon,
but first, a quick roundup of the media coverage of the &lt;a href="http://corte.si/posts/security/openfeint-udid-deanonymization/index.html"&gt;OpenFeint UDID
de-anonymization result&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://blogs.wsj.com/digits/2011/05/11/the-privacy-risks-of-id-codes-in-your-apps/"&gt;A post on on the Wall Street Journal tech
blog&lt;/a&gt;
by &lt;a href="http://www.jennifervalentinodevries.com/"&gt;Jennifer Valentino-DeVries&lt;/a&gt;, one
of the very few journalists who do good, novel investigative work into issues
like UDID privacy.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An interview with &lt;a href="http://www.repubblica.it/tecnologia/2011/06/03/news/identificativo_iphone-17073898/"&gt;La
Repubblica&lt;/a&gt;, a major Italian daily.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An article in &lt;a href="http://www.spiegel.de/netzwelt/gadgets/0,1518,761735,00.html"&gt;Der Spiegel&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Coverage on &lt;a href="http://articles.cnn.com/2011-05-09/tech/identity.iphones.ipads_1_apps-identifier-privacy?_s=PM:TECH"&gt;CNN online&lt;/a&gt;, &lt;a href="http://www.wired.com/gadgetlab/2011/05/iphone-udid/"&gt;Wired Gadgetlab&lt;/a&gt; and the &lt;a href="http://www.huffingtonpost.com/2011/05/10/iphone-udid-personal-information-identity_n_860139.html"&gt;Huffington Post&lt;/a&gt;. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;And, last but not least, a &lt;a href="http://netsecpodcast.com/?p=772"&gt;nice 30-minute
interview&lt;/a&gt; with &lt;a href="https://twitter.com/#!/quine"&gt;Zach
Lanier&lt;/a&gt; from the &lt;a href="http://netsecpodcast.com/"&gt;Network Security
Podcast&lt;/a&gt;. This is your opportunity to get some more
details on the OpenFeint issue and find out what a weird accent I have.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The issue was also mentioned on many, many blogs and smaller publications.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/security/udid-media-roundup.html</guid><pubDate>Fri, 10 Jun 2011 14:38:00 GMT</pubDate></item><item><title>How UDIDs are used: a survey</title><link>http://corte.si/posts/security/apple-udid-survey/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/security/apple-udid-survey/index.html"&gt;How UDIDs are used: a survey&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;19 May 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;I recently published some
&lt;a href="http://corte.si/posts/security/openfeint-udid-deanonymization/index.html"&gt;research&lt;/a&gt; showing that
the OpenFeint social gaming network can be used to link Apple UDIDs to users'
real-world identities. To understand why this is a problem, we have to look at
the way UDIDs are used in the broader app ecosystem. Once we do this, we see
that the vast majority of applications send UDIDs to servers on the Internet,
and that UDID-linked user information is aggregated in literally thousands of
databases on the net. In this context, UDID de-anonymization is a serious
threat to user privacy.&lt;/p&gt;

&lt;p&gt;We have one good research paper surveying UDID use - in 2010, Eric Smith
&lt;a href="http://www.pskl.us/wp/?p=476"&gt;looked at the unencrypted portion of app
traffic&lt;/a&gt;, and found that 68% of tested apps send
UDIDs upstream in the clear. I was curious to see what the figures would look
like if encrypted (HTTPS) traffic was included, so I decided to do my own
survey, using &lt;a href="http://mitmproxy.org"&gt;mitmproxy&lt;/a&gt; to analyse all traffic from the
94 applications I had installed on my iPhone. Below is a set of graphs
highlighting the main facts. I've also published a list of all applications and
the domains they contacted &lt;a href="http://corte.si/posts/security/apple-udid-survey/appdomains.html"&gt;here&lt;/a&gt; - it makes for
interesting reading. &lt;/p&gt;

&lt;h1&gt;Apps are noisier than you think they are&lt;/h1&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/security/apple-udid-survey/all_domains.png"/&gt;&lt;/p&gt;

&lt;p&gt;84% of apps tested contacted one or more domains during use. At the extreme
end,
&lt;a href="http://itunes.apple.com/us/app/idestroy-wicked-sick-stress/id309689677?mt=8"&gt;iDestroy&lt;/a&gt;
contacted 14 domains, including 3 different ad networks and OpenFeint.&lt;/p&gt;

&lt;h1&gt;... and send your UDID to more places than you expect&lt;/h1&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/security/apple-udid-survey/udid_domains.png"/&gt;&lt;/p&gt;

&lt;p&gt;74% of apps tested sent the device UDID  to one or more domains. &lt;/p&gt;

&lt;h1&gt;... often without encryption&lt;/h1&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/security/apple-udid-survey/udid_scheme.png"/&gt;&lt;/p&gt;

&lt;p&gt;46% of apps that transmitted UDIDs did so in the clear. 54% of apps
transmitting UDIDs used encryption for all UDID traffic&lt;sup class="footnote-ref" id="fnref-1"&gt;&lt;a href="#fn-1"&gt;1&lt;/a&gt;&lt;/sup&gt;. &lt;/p&gt;

&lt;h1&gt;A few big UDID aggregators dominate&lt;/h1&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/security/apple-udid-survey/topdomains.png"/&gt;&lt;/p&gt;

&lt;p&gt;Three big aggregators of UDID-related data dominate: &lt;a href="http://apple.com"&gt;Apple&lt;/a&gt;,
&lt;a href="http://www.flurry.com"&gt;Flurry&lt;/a&gt;, and &lt;a href="http://www.openfeint.com"&gt;OpenFeint&lt;/a&gt;.
Each one of these companies has the vast majority of UDIDs on file, linked to a
rich set of privacy-sensitive information. OpenFeint's ubiquity is one of the
reasons why UDID de-anonymization using their API is so serious.&lt;/p&gt;

&lt;h1&gt;... behind them is a long tail of smaller aggregators&lt;/h1&gt;

&lt;p&gt;Here is a list of all the remaining domains that had UDIDs transmitted to them
- a mixture of ad networks, analytics firms, individual developer sites, and
online services. &lt;/p&gt;

&lt;table&gt;
&lt;tr&gt;
&lt;td&gt; ads.mp.mydas.mobi &lt;/td&gt;
&lt;td&gt; analytics.localytics.com &lt;/td&gt;
&lt;td&gt; api.dropbox.com &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt; bayobongo.com &lt;/td&gt;
&lt;td&gt; bbc.112.2o7.net &lt;/td&gt;
&lt;td&gt; beatwave.collect3.com.au &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt; catalog.lexcycle.com &lt;/td&gt;
&lt;td&gt; data.mobclix.com &lt;/td&gt;
&lt;td&gt; init.gc.apple.com &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt; msh.amazon.com &lt;/td&gt;
&lt;td&gt; notifications.lexcycle.com &lt;/td&gt;
&lt;td&gt; promo.limbic.com &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt; soma.smaato.com &lt;/td&gt;
&lt;td&gt; www.chimerasw.com &lt;/td&gt;
&lt;td&gt; www.phasiclabs.com &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt; www.trainyard.ca &lt;/td&gt;
&lt;td&gt; api.twitter.com &lt;/td&gt;
&lt;td&gt; ngpipes.ngmoco.com &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt; npr.122.2o7.net &lt;/td&gt;
&lt;td&gt; ws.tapjoyads.com &lt;/td&gt;
&lt;td&gt;  &lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;h1&gt;Methodology&lt;/h1&gt;

&lt;p&gt;For each application, I started a logging instance of mitmdump, like so:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;mitmdump -w appname
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I then started up the application, interacted with anything that might elicit
network traffic, and shut it down. The collected data was analyzed with a
simple script that used the &lt;a href="http://mitmproxy.org/doc/library.html"&gt;libmproxy&lt;/a&gt;
API to traverse the traffic dumps and extract the needed information. &lt;/p&gt;
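
&lt;p&gt;The shape of that analysis can be sketched as below. This is not the script
used for the survey and it does not call libmproxy: it assumes the captured
flows have already been flattened into hypothetical
&lt;code&gt;(app, host, scheme, has_udid)&lt;/code&gt; tuples, and the sample records are
invented.&lt;/p&gt;

```python
def summarize(observations):
    """Summarize per-app UDID traffic.

    `observations` is a list of (app, host, scheme, has_udid) tuples, a
    hypothetical flattened form of the captured flows.
    """
    apps = set()
    udid_apps = set()
    cleartext_apps = set()
    for app, host, scheme, has_udid in observations:
        apps.add(app)
        if has_udid:
            udid_apps.add(app)
            # An app counts as "in the clear" if any UDID request used HTTP.
            if scheme == "http":
                cleartext_apps.add(app)
    return {
        "apps": len(apps),
        "udid_apps": len(udid_apps),
        "cleartext_udid_apps": len(cleartext_apps),
    }


# Invented sample records for illustration only.
observations = [
    ("game", "api.example.com", "http", True),
    ("game", "ads.example.net", "https", False),
    ("reader", "sync.example.org", "https", True),
    ("clock", "time.example.com", "http", False),
]
summary = summarize(observations)
```

&lt;p&gt;Note that apps which generated no traffic at all never appear in the
observations, so they would have to be counted separately.&lt;/p&gt;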

&lt;div class="footnotes"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn-1"&gt;
&lt;p&gt;The fact that 54% of UDID-using apps would have gone undetected by
Smith's study seems to indicate that there should be a much greater difference
between our results - Smith found 68% of apps use UDIDs vs my 74%. The
discrepancy can be accounted for by the fact that we used different samples -
Smith used predominantly applications in Apple's "Top Free" lists, whereas I
used both paid and unpaid applications that happened to be on my phone.&amp;nbsp;&lt;a href="#fnref-1" class="footnoteBackLink" title="Jump back to footnote 1 in the text."&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/security/apple-udid-survey/index.html</guid><pubDate>Thu, 19 May 2011 17:34:00 GMT</pubDate></item><item><title>De-anonymizing Apple UDIDs with OpenFeint</title><link>http://corte.si/posts/security/openfeint-udid-deanonymization/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/security/openfeint-udid-deanonymization/index.html"&gt;De-anonymizing Apple UDIDs with OpenFeint&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;04 May 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;Every iPhone, iPad and iPod touch has an associated Unique Device Identifier
(UDID). You can think of the UDID as a serial number burned into the device -
one that can't be removed or changed&lt;sup class="footnote-ref" id="fnref-1"&gt;&lt;a href="#fn-1"&gt;1&lt;/a&gt;&lt;/sup&gt;. This number is exposed to app
developers through an API, without requiring the device owner's permission or
knowledge.&lt;/p&gt;

&lt;p&gt;Few Apple users realise just how widely their UDIDs are used. &lt;a href="http://www.pskl.us/wp/?p=476"&gt;Research
shows&lt;/a&gt; that 68% of apps silently send UDIDs to
servers on the Internet. This is often accompanied by information on how, when
and where the device is used.  The most common destination for traffic
containing a user's UDID is Apple itself, followed by the
&lt;a href="http://www.flurry.com/"&gt;Flurry&lt;/a&gt; mobile analytics network and OpenFeint, a
mobile social gaming company. These companies are uber-aggregators of
UDID-linked user information, because so many apps use their APIs. Trailing
behind the big three are thousands of individual developer sites, ad servers
and smaller analytics firms. Users have no way to stop their device from
offering up their UDID, telling who their data is being sent to, or even
telling that it's happening at all. This situation has caused widespread
concern, including coverage in the &lt;a href="http://blogs.wsj.com/digits/2010/12/19/unique-phone-id-numbers-explained/"&gt;Wall Street
Journal&lt;/a&gt;,
and &lt;a href="http://www.txinjuryblog.com/tags/udid-lawsuit/"&gt;two&lt;/a&gt;
&lt;a href="http://www.infosecurity-us.com/view/15643/apple-faces-second-lawsuit-over-udid-disclosure-to-third-parties/"&gt;lawsuits&lt;/a&gt;
aimed at Apple.&lt;/p&gt;

&lt;p&gt;The saving grace is that your device UDID is not linked to your real-world
identity. If it were possible to de-anonymize UDIDs, the result would be a
serious privacy breach. Apple is well aware of this, and &lt;a href="http://developer.apple.com/library/ios/#documentation/uikit/reference/UIDevice_Class/Reference/UIDevice.html"&gt;explicitly tells
developers that they are not permitted to publicly link a UDID to a user
account&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I recently published a tool called &lt;a href="http://mitmproxy.org"&gt;mitmproxy&lt;/a&gt;, a
man-in-the-middle proxy that allows one to intercept and monitor SSL-encrypted
HTTP traffic. Using mitmproxy to view the encrypted traffic sent by my own iOS
devices, I was able to observe protocols and data flows that have clearly
received very little external review. A slew of interesting security results
followed (keep an eye on this blog), but by far the most alarming was the fact
that it was possible to use OpenFeint to completely de-anonymize a large
proportion of UDIDs.&lt;/p&gt;

&lt;h1&gt;De-anonymizing UDIDs with OpenFeint&lt;/h1&gt;

&lt;h2&gt;Linking UDIDs to OpenFeint user accounts&lt;/h2&gt;

&lt;p&gt;When an OpenFeint-enabled app is first fired up, it submits the device's UDID
to OpenFeint's servers, which then return a list of associated accounts:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;https://api.openfeint.com/users/for_device.xml?udid=XXX
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is a completely unauthenticated call - you can try it out by cutting and
pasting it into your browser, replacing XXX with &lt;a href="http://support.apple.com/kb/HT4061"&gt;your own
UDID&lt;/a&gt;. Here's an example of the response
for my UDID, with sensitive information removed: &lt;/p&gt;

&lt;pre&gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;
&amp;lt;resources&amp;gt;
  &amp;lt;user&amp;gt;
    &amp;lt;chat_enabled&amp;gt;true&amp;lt;/chat_enabled&amp;gt;
    &amp;lt;gamer_score&amp;gt;XXX&amp;lt;/gamer_score&amp;gt;
    &amp;lt;id&amp;gt;XXX&amp;lt;/id&amp;gt;
    &amp;lt;last_played_game_id&amp;gt;187402&amp;lt;/last_played_game_id&amp;gt;
    &amp;lt;last_played_game_name&amp;gt;tiny wings&amp;lt;/last_played_game_name&amp;gt;
    &amp;lt;lat&amp;gt;XXX&amp;lt;/lat&amp;gt;
    &amp;lt;lng&amp;gt;XXX&amp;lt;/lng&amp;gt;
    &amp;lt;online&amp;gt;false&amp;lt;/online&amp;gt;
    &amp;lt;profile_picture_source&amp;gt;FbconnectCredential&amp;lt;/profile_picture_source&amp;gt;
    &amp;lt;profile_picture_updated_at&amp;gt;XXX&amp;lt;/profile_picture_updated_at&amp;gt;
    &amp;lt;profile_picture_url&amp;gt;http://XXX&amp;lt;/profile_picture_url&amp;gt;
    &amp;lt;uploaded_profile_picture_content_type nil=&amp;quot;true&amp;quot;&amp;gt;
    &amp;lt;/uploaded_profile_picture_content_type&amp;gt;
    &amp;lt;uploaded_profile_picture_file_name nil=&amp;quot;true&amp;quot;&amp;gt;
    &amp;lt;/uploaded_profile_picture_file_name&amp;gt;
    &amp;lt;uploaded_profile_picture_file_size nil=&amp;quot;true&amp;quot;&amp;gt;
    &amp;lt;/uploaded_profile_picture_file_size&amp;gt;
    &amp;lt;uploaded_profile_picture_updated_at nil=&amp;quot;true&amp;quot;&amp;gt;
    &amp;lt;/uploaded_profile_picture_updated_at&amp;gt;
    &amp;lt;name&amp;gt;XXX&amp;lt;/name&amp;gt;
  &amp;lt;/user&amp;gt;
&amp;lt;/resources&amp;gt;

&lt;/pre&gt;

&lt;p&gt;Included is my latitude and longitude, the last game I played, my chosen
account name, and my Facebook profile picture URL.&lt;/p&gt;

&lt;h2&gt;Linking UDIDs to GPS co-ordinates&lt;/h2&gt;

&lt;p&gt;If the user has opted to allow OpenFeint to use their location, their latitude
and longitude are returned in the profile results. This lets us trivially associate
a UDID with GPS co-ordinates.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The location leak was fixed by OpenFeint after my report. Although some
portions of the OpenFeint API still return a user location, it seems that it
is no longer served for direct profile requests.&lt;/em&gt; &lt;/p&gt;

&lt;h2&gt;Linking UDIDs to Facebook profiles&lt;/h2&gt;

&lt;p&gt;If the user registered a Facebook account with OpenFeint, a profile picture URL
hosted by the Facebook CDN was returned in the user's profile data. Facebook
profile picture URLs include the user's Facebook ID, directly linking it to
their Facebook account.&lt;/p&gt;

&lt;p&gt;For example, here's Bruce Schneier's Facebook profile picture URL:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://profile.ak.fbcdn.net/hprofile-ak-snc4/41795_60615378024_8092_n.jpg
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The 11-digit number in this URL is his Facebook user ID. We can now view his
profile using a URL like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;http://www.facebook.com/profile.php?id=60615378024
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This final step represents a complete de-anonymization of the UDID, directly
linking the supposedly anonymous identifier with a user's real-world identity.&lt;/p&gt;
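
&lt;p&gt;The extraction step is mechanical enough to script. Here's a minimal Python sketch, assuming the URL layout in the example above, where the ID is the second underscore-separated field of the file name. The function name is made up for illustration, and there's no guarantee that Facebook's URL format stays this way:&lt;/p&gt;

```python
from urllib.parse import urlparse


def facebook_id_from_picture_url(url):
    """Return the Facebook user ID embedded in a profile picture URL.

    Assumes the file name looks like PREFIX_USERID_SUFFIX_n.jpg, as in the
    example above. Returns None if the name doesn't match that shape.
    """
    filename = urlparse(url).path.rsplit("/", 1)[-1]
    fields = filename.split("_")
    if len(fields) == 4 and fields[1].isdigit():
        return fields[1]
    return None


picture = "http://profile.ak.fbcdn.net/hprofile-ak-snc4/41795_60615378024_8092_n.jpg"
user_id = facebook_id_from_picture_url(picture)
profile_url = "http://www.facebook.com/profile.php?id=" + str(user_id)
print(profile_url)
```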

&lt;p&gt;&lt;em&gt;The Facebook ID leak was fixed by OpenFeint after my report.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;OpenFeint's response&lt;/h1&gt;

&lt;p&gt;I reported this problem to OpenFeint on the 5th of April. I did not hear back from
them immediately, but I knew they were working on the problem because their API
stopped returning GPS coordinates and Facebook profile picture URLs. On the
12th, I received an email from Jason Citron, OpenFeint's CEO, who wanted to set
up a phone conversation with me, him and an OpenFeint legal representative.  We
spoke on the evening of the 20th of April. I recapped my findings and expressed
concern that their API still linked UDIDs to user accounts. They thanked me for
the vulnerability report, confirmed that they had tightened their API in
response to it, and asked for more time to consider the issue before I released
anything. The following morning, it was announced that OpenFeint had been
&lt;a href="http://openfeint.com/company/press/33-GREE-Puts-Over-100-Million-into-OpenFeint-to-Drive-Global-Expansion-with-100M-users"&gt;bought by GREE for $104
million&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Last week I received what I assume is OpenFeint's last word on the matter, in
the form of an email from Jason Citron: "We will continue to pay attention to
the issues you raised and will continue to adjust our practices as necessary."
At the time of writing, OpenFeint's API still allows you to associate a UDID
with private user information.&lt;/p&gt;

&lt;h1&gt;Impact&lt;/h1&gt;

&lt;p&gt;Testing with a small corpus of UDIDs gathered from my own and friends' devices,
I was able to link roughly 30% of UDIDs to GPS co-ordinates, 20% of users to a
weak identity (e.g.  OpenFeint profile picture, user-chosen account name), and
10% of UDIDs directly to a Facebook profile. I stress that my sample was small
and probably unrepresentative - only OpenFeint knows what the real numbers are.
None the less, we can make a broad guess at the magnitude of the problem, based
on the fact that OpenFeint &lt;a href="http://openfeint.com/company/press/33-GREE-Puts-Over-100-Million-into-OpenFeint-to-Drive-Global-Expansion-with-100M-users"&gt;claims to have 75 million
users&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;This would mean that about 7.5 million users may have had Facebook accounts
linked publicly to their UDIDs until OpenFeint stopped returning profile
picture URLs a few weeks ago.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;About 22.5 million users may have had GPS co-ordinates linked publicly to
their UDIDs until the issue was corrected.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;About 15 million users may still have identifying information like profile
pictures and user-chosen account names (that can often be used to identify
users) exposed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All 75 million users still have personal details like the last
OpenFeint-enabled game they played and whether they are online (i.e. logged in
to the OpenFeint network) exposed. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
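
&lt;p&gt;For the record, the figures above are nothing more than my sample proportions multiplied by OpenFeint's claimed user count. A quick Python sketch of the arithmetic, with the percentages taken from my small and probably unrepresentative sample (the names in the code are just labels I've chosen):&lt;/p&gt;

```python
CLAIMED_USERS = 75_000_000  # OpenFeint's own figure, quoted above

# Rough proportions from the small sample, in whole percent.
SAMPLE_RATES = {
    "facebook": 10,
    "gps": 30,
    "weak_identity": 20,
}


def extrapolate(total, percent):
    """Scale a sample percentage up to the whole user base."""
    return total * percent // 100


estimates = {name: extrapolate(CLAIMED_USERS, pct) for name, pct in SAMPLE_RATES.items()}
for name, count in estimates.items():
    print(name, count)
```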

&lt;p&gt;Although the Facebook and GPS de-anonymization issues have been repaired, we
have to consider the possibility that these vulnerabilities have already been
used to de-anonymize a database of UDIDs. &lt;/p&gt;

&lt;h1&gt;Conclusion&lt;/h1&gt;

&lt;p&gt;I want to stress that the problem here is not primarily with OpenFeint. By
designing an API to expose UDIDs and encouraging developers to use it, Apple
has ensured that there are literally thousands of databases linking UDIDs to
sensitive user information on the net. A leak from any one of these - or worse,
a large-scale de-anonymization like the OpenFeint one - inevitably has serious
consequences for user privacy. &lt;/p&gt;

&lt;div class="footnotes"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn-1"&gt;
&lt;p&gt;I should note that this is not quite accurate. The UDID is actually a
computed value - a hash calculated over a set of identifying hardware
attributes. In a sense, it only really exists as an API call.&amp;nbsp;&lt;a href="#fnref-1" class="footnoteBackLink" title="Jump back to footnote 1 in the text."&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/security/openfeint-udid-deanonymization/index.html</guid><pubDate>Wed, 04 May 2011 19:30:00 GMT</pubDate></item><item><title>subscount: Counting RSS feed subscribers</title><link>http://corte.si/posts/code/subscount/announce.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/subscount/announce.html"&gt;subscount: Counting RSS feed subscribers&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;02 April 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;A couple of months ago, I wrote a post &lt;a href="http://corte.si/posts/socialmedia/post-lifecycle/index.html"&gt;following one of my blog posts through
the social news grist mill&lt;/a&gt;. In it,
I bemoaned the fact that social news seems to be displacing more old-fashioned
person-to-person connections on the 'net - 33,000 unique visitors to my blog
resulted in only 41 new Google Reader subscribers. Google Reader is so
completely dominant in this space that ignoring everything else was good enough
as a first approximation, but I made a mental note to come up with a more
complete figure.&lt;/p&gt;

&lt;p&gt;So, yesterday I hacked up a little tool called
&lt;a href="http://github.com/cortesi/subscount"&gt;subscount&lt;/a&gt; to help. It parses parses
Apache-style logs to make a best guess at feed subscriber numbers, and emits a
snippet of JavaScript that can be used to show subscriber numbers on statically
rendered sites like my blog.&lt;/p&gt;

&lt;h1&gt;Estimating feed subscribers from web server logs&lt;/h1&gt;

&lt;p&gt;Broadly speaking, there are four different groups of feed retrievers we need to
deal with: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Well-behaved aggregators that report a feed ID and the number of end
subscribers in the user agent string. In my case, these are Google Reader,
FriendFeed and NetVibes. There's no standard governing this, but there are so
few significant players that I just catered manually for all the variations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Poorly behaved aggregators that report a subscriber number, but no feed ID.
An example here is PostRank. Again, there are a small number of these, so
subscount handles them with a hand-coded set of rules.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Individual subscribers using tools like Akregator and NetNewsWire. In this
case, we distinguish between subscribers by IP address, which should be good
enough as long as we keep the analysis time window to a day or so.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A myriad of automated feed consumers. These are mostly poorly behaved, and
rarely identify themselves properly. Weeding them out would be nearly
impossible, so we treat them just like individual subscribers.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When subscount traverses a log file, it calculates a unique identifier and a
number of subscribers for each retriever of the feed. For individual
subscribers, the ID is the IP address, and the number of subscribers is 1. For
aggregators, we use the reported feed ID, and the reported number of
subscribers. We use the unique ID to make sure we don't count anyone more than
once, and simply tot up the numbers at the end. &lt;/p&gt;
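
&lt;p&gt;The counting logic can be sketched in a few lines of Python. This is not subscount's actual code: the user-agent patterns are simplified assumptions, the helper names are invented for illustration, and letting a later report replace an earlier one for the same ID is just one reasonable choice:&lt;/p&gt;

```python
import re

# Assumed user-agent conventions; real aggregators vary, and subscount's own
# rules are hand-tuned per aggregator.
SUBSCRIBERS = re.compile(r"(\d+) subscribers?")
FEED_ID = re.compile(r"feed-id=(\w+)")


def classify(ip, user_agent):
    """Return a (unique_id, subscriber_count) pair for one feed request."""
    subs = SUBSCRIBERS.search(user_agent)
    if subs is None:
        # Individual reader or unidentified bot: one subscriber per IP.
        return ("ip:" + ip, 1)
    feed = FEED_ID.search(user_agent)
    if feed is not None:
        # Well-behaved aggregator: key on the feed ID it reports.
        key = "feed:" + feed.group(1)
    else:
        # No feed ID: key on the user agent with the count blanked out.
        key = "agent:" + SUBSCRIBERS.sub("", user_agent)
    return (key, int(subs.group(1)))


def estimate_subscribers(requests):
    """Total subscribers over (ip, user_agent) pairs, counting each ID once."""
    seen = {}
    for ip, user_agent in requests:
        key, count = classify(ip, user_agent)
        seen[key] = count  # a later report for the same ID replaces the earlier one
    return sum(seen.values())


sample = [
    ("10.0.0.1", "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 41 subscribers; feed-id=123)"),
    ("10.0.0.2", "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 42 subscribers; feed-id=123)"),
    ("10.0.0.3", "ExampleRank/2.0 (3 subscribers)"),
    ("10.0.0.4", "NetNewsWire/3.2"),
    ("10.0.0.4", "NetNewsWire/3.2"),
    ("10.0.0.5", "Akregator/1.6"),
]
total = estimate_subscribers(sample)
print(total)
```

&lt;p&gt;In the sample, the two Google fetches share a feed ID and so count once, and the repeated NetNewsWire requests from one IP count as a single subscriber.&lt;/p&gt;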

&lt;p&gt;Needless to say, the figure we come up with is just an estimate - but I think
it's probably a reasonable one. As expected, the figures show that Google
Reader alone is responsible for 68% of my subscribers.&lt;/p&gt;

&lt;h1&gt;Deploying subscount&lt;/h1&gt;

&lt;p&gt;I thought it would be neat to report the number of feed subscribers I have next
to the feed icon on my blog, so I extended subscount to help. The &lt;strong&gt;-j&lt;/strong&gt; flag
to subscount takes a DOM element ID, like so:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;./subscount -p "/rss.xml" -j subscriber_div /var/log/mylog
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It then prints a snippet that modifies the specified element with the subscriber
number, like this:&lt;/p&gt;

&lt;pre&gt;function _subs(){
    var subsdiv = document.getElementById(&amp;quot;subscriber_div&amp;quot;);
    if (subsdiv)
        subsdiv.innerHTML = (&amp;quot;947&amp;quot;);
};
window.onload = _subs;

&lt;/pre&gt;

&lt;p&gt;I run subscount from cron just after log rotation every night (it can read
gzipped log files directly), and pipe the output to a file in my blog's web
root. I then simply source this file in a script tag, and voila! - dynamically
updated subscriber numbers for my statically rendered site. You can see the
results in my sidebar.&lt;/p&gt;

&lt;h1&gt;The code&lt;/h1&gt;

&lt;p&gt;As usual, the &lt;a href="http://github.com/cortesi/subscount"&gt;code is available on GitHub&lt;/a&gt;.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/subscount/announce.html</guid><pubDate>Sat, 02 Apr 2011 12:22:00 GMT</pubDate></item><item><title>mitmproxy: Breaking Apple's Game Center with replay</title><link>http://corte.si/posts/code/mitmproxy/tute-gamecenter/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/tute-gamecenter/index.html"&gt;mitmproxy: Breaking Apple&amp;#39;s Game Center with replay&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;31 March 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;This is the second in the series of tutorials I'm writing for
&lt;a href="http://mitmproxy.org"&gt;mitmproxy&lt;/a&gt;. You can find the first one - a 30 second
tutorial on client replay - &lt;a href="http://corte.si/posts/code/mitmproxy/tute-30-seconds.html"&gt;here&lt;/a&gt;. There
will be more to come in the next few days.&lt;/p&gt;

&lt;h1&gt;The setup&lt;/h1&gt;

&lt;p&gt;In this tutorial, I'm going to show you how simple it is to creatively
interfere with Apple Game Center traffic using mitmproxy. To set things up, I
registered my mitmproxy CA certificate with my iPhone - there's a &lt;a href="http://mitmproxy.org/doc/certinstall/ios.html"&gt;step by step
set of instructions&lt;/a&gt; for doing
this in the mitmproxy docs. I then started mitmproxy on my desktop, and
configured the iPhone to use it as a proxy. &lt;/p&gt;

&lt;h1&gt;Taking a look at the Game Center traffic&lt;/h1&gt;

&lt;p&gt;Let's take a first look at the Game Center traffic. The game I'll use in this
tutorial is &lt;a href="http://itunes.apple.com/us/app/super-mega-worm/id388541990?mt=8"&gt;Super Mega
Worm&lt;/a&gt; - a
great little retro-apocalyptic sidescroller for the iPhone: &lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;img src="http://corte.si/posts/code/mitmproxy/tute-gamecenter/supermega.png"/&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;After finishing a game (take your time), watch the traffic flowing through
mitmproxy:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;img src="http://corte.si/posts/code/mitmproxy/tute-gamecenter/one.png"/&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;We see a bunch of things we might expect - initialisation, the retrieval of
leaderboards and so forth. Then, right at the end, there's a POST to this
tantalising URL:&lt;/p&gt;

&lt;pre&gt;
https://service.gc.apple.com/WebObjects/GKGameStatsService.woa/wa/submitScore
&lt;/pre&gt;

&lt;p&gt;The contents of the submission are particularly interesting:&lt;/p&gt;

&lt;pre&gt;&amp;lt;plist version=&amp;quot;1.0&amp;quot;&amp;gt;
&amp;lt;dict&amp;gt;
    &amp;lt;key&amp;gt;category&amp;lt;/key&amp;gt;
    &amp;lt;string&amp;gt;SMW_Adv_USA1&amp;lt;/string&amp;gt;
    &amp;lt;key&amp;gt;score-value&amp;lt;/key&amp;gt;
    &amp;lt;integer&amp;gt;55&amp;lt;/integer&amp;gt;
    &amp;lt;key&amp;gt;timestamp&amp;lt;/key&amp;gt;
    &amp;lt;integer&amp;gt;1301553284461&amp;lt;/integer&amp;gt;
&amp;lt;/dict&amp;gt;
&amp;lt;/plist&amp;gt;

&lt;/pre&gt;

&lt;p&gt;This is a &lt;a href="http://en.wikipedia.org/wiki/Property_list"&gt;property list&lt;/a&gt;,
containing an identifier for the game, a score (55, in this case), and a
timestamp. Looks pretty simple to mess with.&lt;/p&gt;

&lt;h1&gt;Modifying and replaying the score submission&lt;/h1&gt;

&lt;p&gt;Let's edit the score submission. First, select it in mitmproxy, then press
&lt;strong&gt;enter&lt;/strong&gt; to view it. Make sure you're viewing the request, not the response -
you can use &lt;strong&gt;tab&lt;/strong&gt; to flick between the two. Now press &lt;strong&gt;e&lt;/strong&gt; for edit. You'll
be prompted for the part of the request you want to change - press &lt;strong&gt;b&lt;/strong&gt; for
body.  Your preferred editor (taken from the EDITOR environment variable) will
now fire up. Let's bump the score up to something a bit more ambitious:&lt;/p&gt;

&lt;pre&gt;&amp;lt;plist version=&amp;quot;1.0&amp;quot;&amp;gt;
&amp;lt;dict&amp;gt;
    &amp;lt;key&amp;gt;category&amp;lt;/key&amp;gt;
    &amp;lt;string&amp;gt;SMW_Adv_USA1&amp;lt;/string&amp;gt;
    &amp;lt;key&amp;gt;score-value&amp;lt;/key&amp;gt;
    &amp;lt;integer&amp;gt;2200272667&amp;lt;/integer&amp;gt;
    &amp;lt;key&amp;gt;timestamp&amp;lt;/key&amp;gt;
    &amp;lt;integer&amp;gt;1301553284461&amp;lt;/integer&amp;gt;
&amp;lt;/dict&amp;gt;
&amp;lt;/plist&amp;gt;

&lt;/pre&gt;

&lt;p&gt;Save the file and exit your editor. &lt;/p&gt;
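
&lt;p&gt;If you'd rather not edit the body by hand, the same change could be scripted. The sketch below uses Python's standard plistlib module. It rebuilds the captured submission from a dict instead of pasting the XML, so its output also carries the XML declaration and DOCTYPE that plistlib adds; the set_score helper is hypothetical, not part of mitmproxy:&lt;/p&gt;

```python
import plistlib

# Rebuild the captured submission as a dict rather than pasting the XML.
captured = plistlib.dumps({
    "category": "SMW_Adv_USA1",
    "score-value": 55,
    "timestamp": 1301553284461,
})


def set_score(body, new_score):
    """Return a copy of a score-submission plist with score-value replaced."""
    payload = plistlib.loads(body)
    payload["score-value"] = new_score
    return plistlib.dumps(payload)


edited = set_score(captured, 2200272667)
print(plistlib.loads(edited)["score-value"])
```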

&lt;p&gt;The final step is to replay this modified request. Simply press &lt;strong&gt;r&lt;/strong&gt; for
replay.&lt;/p&gt;

&lt;h1&gt;The glorious result and some intrigue&lt;/h1&gt;

&lt;p&gt;&lt;center&gt;
    &lt;img src="http://corte.si/posts/code/mitmproxy/tute-gamecenter/leaderboard.png"/&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;And that's it - according to the records, I am the greatest Super Mega Worm
player of all time. &lt;/p&gt;

&lt;p&gt;Curiously, the top competitors' scores are all the same: 2,147,483,647. If you
think that number seems familiar, you're right: it's 2^31-1, the maximum value
you can fit into a signed 32-bit int. Now let me tell you another peculiar
thing about Super Mega Worm - at the end of every game, it submits your highest
previous score to the Game Center, not your current score.  This means that it
stores your highscore somewhere, and I'm guessing that it reads that stored
score back into a signed integer. So, if you &lt;em&gt;were&lt;/em&gt; to cheat by the relatively
pedestrian means of modifying the saved score on your jailbroken phone, then
2^31-1 might well be the maximum score you could get. Then again, if the game
itself stores its score in a signed 32-bit int, you could get the same score
through perfect play, effectively beating the game. So, which is it in this
case? I'll leave that for you to decide.&lt;/p&gt;
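
&lt;p&gt;To make the arithmetic concrete, here's a small Python sketch of what a signed 32-bit integer can hold. It uses ctypes purely to emulate a fixed-width int, and says nothing about how Super Mega Worm or Game Center actually store scores, which I can only guess at:&lt;/p&gt;

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647


def as_int32(value):
    """Reinterpret an integer as a signed 32-bit value, wrapping on overflow."""
    return ctypes.c_int32(value).value


print(INT32_MAX)
print(as_int32(INT32_MAX))      # fits, so it comes back unchanged
print(as_int32(INT32_MAX + 1))  # one past the limit wraps to the most negative value
print(as_int32(2200272667))     # the score submitted earlier would not fit either
```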

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/mitmproxy/tute-gamecenter/index.html</guid><pubDate>Thu, 31 Mar 2011 18:23:00 GMT</pubDate></item><item><title>mitmproxy: A 30-second client playback example</title><link>http://corte.si/posts/code/mitmproxy/tute-30-seconds.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/tute-30-seconds.html"&gt;mitmproxy: A 30-second client playback example&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;31 March 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/announce0_4.html"&gt;Yesterday&lt;/a&gt; I published version 0.4 of
&lt;a href="http://mitmproxy.org"&gt;mitmproxy&lt;/a&gt; - an intercepting proxy for HTTP/S traffic.
The tool already has pretty complete documentation, but I've decided to write a
series of less formal tutorials to showcase its abilities. Below is the first,
and simplest, of these - keep an eye on the blog for more in the coming days.&lt;/p&gt;

&lt;h1&gt;A 30-second client playback example&lt;/h1&gt;

&lt;p&gt;My local cafe is serviced by a rickety and unreliable wireless network,
generously sponsored with ratepayers' money by our city council. After
connecting, you  are redirected to an SSL-protected page that prompts you for a
username and password. Once you've entered your details, you are free to enjoy
the intermittent dropouts, treacle-like speeds and incorrectly configured
transparent proxy. &lt;/p&gt;

&lt;p&gt;I tend to automate this kind of thing at the first opportunity, on the theory
that time spent now will be more than made up in the long run. In this case, I
might use &lt;a href="http://getfirebug.com/"&gt;Firebug&lt;/a&gt; to ferret out the form post
parameters and target URL, then fire up an editor to write a little script
using Python's &lt;a href="http://docs.python.org/library/urllib.html"&gt;urllib&lt;/a&gt; to simulate
a submission. That's a lot of futzing about. With mitmproxy we can do the job
in literally 30 seconds, without having to worry about any of the details.
Here's how.&lt;/p&gt;

&lt;h2&gt;1. Run mitmdump to record our HTTP conversation to a file.&lt;/h2&gt;

&lt;pre class="terminal"&gt;
&gt; mitmdump -w wireless-login
&lt;/pre&gt;

&lt;h2&gt;2. Point your browser at the mitmdump instance.&lt;/h2&gt;

&lt;p&gt;I use a tiny Firefox addon called &lt;a href="https://addons.mozilla.org/en-us/firefox/addon/toggle-proxy-51740/"&gt;Toggle
Proxy&lt;/a&gt; to
switch quickly to and from mitmproxy. I'm assuming you've already &lt;a href="http://mitmproxy.org/doc/ssl.html"&gt;configured
your browser with mitmproxy's SSL certificate
authority&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;3. Log in as usual.&lt;/h2&gt;

&lt;p&gt;And that's it! You now have a serialized version of the login process in the
file wireless-login, and you can replay it at any time like this:&lt;/p&gt;

&lt;pre class="terminal"&gt;
&gt; mitmdump -c wireless-login
&lt;/pre&gt;

&lt;h2&gt;Embellishments&lt;/h2&gt;

&lt;p&gt;We're really done at this point, but there are a couple of embellishments we
could make if we wanted. I use &lt;a href="http://wicd.sourceforge.net/"&gt;wicd&lt;/a&gt; to
automatically join wireless networks I frequent, and it lets me specify a
command to run after connecting. I used the client replay command above and
voila! - totally hands-free wireless network startup.&lt;/p&gt;

&lt;p&gt;We might also want to prune requests that download CSS, JS, images and so
forth. These add only a few moments to the time it takes to replay, but they're
not really needed and I somehow feel compelled to trim them anyway. So, we fire up
the mitmproxy console tool on our serialized conversation, like so:&lt;/p&gt;

&lt;pre class="terminal"&gt;
&gt; mitmproxy wireless-login
&lt;/pre&gt;

&lt;p&gt;We can now go through and manually delete (using the &lt;strong&gt;d&lt;/strong&gt; keyboard shortcut)
everything we want to trim. When we're done, we use &lt;strong&gt;S&lt;/strong&gt; to save the
conversation back to the file.&lt;/p&gt;
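
&lt;p&gt;The pruning rule itself is easy to state in code. The Python sketch below works on plain URL strings, not on mitmproxy's dump format, so treat it as an illustration of the filter and not as a way to rewrite the saved file. The extension list, host name and function names are all assumptions:&lt;/p&gt;

```python
from urllib.parse import urlparse

# Extensions treated as static assets; an assumption, adjust to taste.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico")


def is_static_asset(url):
    """True if the URL path ends in one of the static-asset extensions."""
    return urlparse(url).path.lower().endswith(STATIC_EXTENSIONS)


def prune(urls):
    """Keep only the requests worth replaying."""
    return [u for u in urls if not is_static_asset(u)]


recorded = [
    "https://login.example.net/portal",
    "https://login.example.net/static/style.css",
    "https://login.example.net/static/logo.png?v=2",
    "https://login.example.net/portal/login",
]
print(prune(recorded))
```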

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/mitmproxy/tute-30-seconds.html</guid><pubDate>Thu, 31 Mar 2011 09:58:00 GMT</pubDate></item><item><title>mitmproxy 0.4 has been released</title><link>http://corte.si/posts/code/mitmproxy/announce0_4.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/mitmproxy/announce0_4.html"&gt;mitmproxy 0.4 has been released&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;30 March 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://mitmproxy.org"&gt;
&lt;img src="http://corte.si/posts/code/mitmproxy/mitmproxy_0_4.png"/&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I've just tagged and released mitmproxy 0.4. You can download it from the new
project website:&lt;/p&gt;

&lt;h2&gt;&lt;a href="http://mitmproxy.org"&gt;mitmproxy.org&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;This is a huge update, with dozens
of new features, and improvements to almost every aspect of the project.  A few
highlights are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complete serialization of HTTP/S conversations&lt;/li&gt;
&lt;li&gt;On-the-fly generation of SSL interception certificates&lt;/li&gt;
&lt;li&gt;Ability to replay both the client and the server side of HTTP/S conversations&lt;/li&gt;
&lt;li&gt;mitmdump has grown up to be a powerful tcpdump-like commandline tool for HTTP/S&lt;/li&gt;
&lt;li&gt;Scripting hooks for programmatic modification of traffic using Python&lt;/li&gt;
&lt;li&gt;Many, many user interface improvements, bug fixes, and minor features&lt;/li&gt;
&lt;li&gt;Better &lt;a href="http://mitmproxy.org/doc/index.html"&gt;documentation&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Special thanks go to &lt;a href="http://www.henriknordstrom.net/"&gt;Henrik Nordström&lt;/a&gt; for
many great contributions to this release. I'd love more contributors to join
the project - if you feel like hacking on mitmproxy, take a look at the
&lt;a href="https://github.com/cortesi/mitmproxy/blob/master/todo"&gt;todo&lt;/a&gt; file at the top
of the tree for ideas.&lt;/p&gt;

&lt;p&gt;Over the next week I will write a series of tutorials to showcase mitmproxy's
abilities, ranging from simple to quite complex. Keep an eye on the blog for
these - they will be published here first, before making their way into the
official documentation.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/mitmproxy/announce0_4.html</guid><pubDate>Wed, 30 Mar 2011 14:44:00 GMT</pubDate></item><item><title>A self-portrait, drawn on an iPad</title><link>http://corte.si/posts/general/ipad-self-portrait.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/general/ipad-self-portrait.html"&gt;A self-portrait, drawn on an iPad&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;29 March 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;center&gt;
    &lt;img src="http://corte.si/posts/general/ipad-selfportrait.png"/&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;I've just returned from a trip I made to my parents' home to see my
grandmother. She's 92, and had braved a long international journey to come to
see her great-grandchildren for the first, and probably the last, time. My
grandmother is a truly remarkable woman who led a truly remarkable life - she
speaks five languages, lived on every continent bar Antarctica, made dozens of
trips to Egypt as an amateur Egyptologist, and once mounted a canoe expedition
up the Amazon river. And she had talent to match her sense of adventure - she
was a great artist, winning many international prizes for her wood carvings.
Most important to me, though, was the fact that she was the eccentric
grandmother every child deserves to have. She'd seen enough and knew enough
not to care what anyone else thought, living life absolutely (and sometimes
infuriatingly) by her own rules. My fondest memories of childhood are of visits
to her, filled with books, brilliant schemes, and the smell of sawdust in her
studio. &lt;/p&gt;

&lt;p&gt;The 8 years since I last saw her have been cruel. Physically, she seems almost
unchanged, but her memory is fading fast. She loved playing with my baby son,
but would often ask me who he was. I would re-explain that he was her
great-grandson, and she would be delighted all over again. She was overwhelmed
by the helter-skelter of family conversations and small children, and clearly
yearned to be back home, where she spends her days in quiet reminiscence among
the curios and collected oddities of 70 years of travel. Her days as an artist
have long been over - she hasn't been able to hold a pencil for a decade, much
less a wood-chisel or a jigsaw. &lt;/p&gt;

&lt;p&gt;On a whim, a few days into the visit, I started up a sketch app on my iPad and
handed it to her. She was doubtful at first, but quickly became deeply
engrossed. She sat for hours, hunched over, using the side of her index finger
to draw. And draw. Saving a sketch and starting a new one was too complicated,
so she used the eraser to clear space instead, drawing each image over the top
of the previous one. At the end of the day she handed the iPad back to me with
the final sketch of a beautiful, sad young woman still on the screen.  &lt;/p&gt;

&lt;p&gt;A self-portrait, she said with a smile.  &lt;/p&gt;

&lt;p&gt;Truly, the tragedy of life is not that we grow old, it's that we stay young. &lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/general/ipad-self-portrait.html</guid><pubDate>Tue, 29 Mar 2011 17:27:00 GMT</pubDate></item><item><title>Indenting XML-ish markup: a code snippet</title><link>http://corte.si/posts/code/pretty_xmlish/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/pretty_xmlish/index.html"&gt;Indenting XML-ish markup: a code snippet&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;06 February 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;I've been hacking on &lt;a href="https://github.com/cortesi/mitmproxy"&gt;mitmproxy&lt;/a&gt;
recently, gearing up to a new release in the next week. One of the features I
needed was to pretty-print XML-ish markup (HTML, SOAP, etc.) to make it easier
to quickly scan through traffic not formatted for human eyes. I needed this
function to cope robustly with incomplete or malformed data, which ruled out
proper XML parsers like ElementTree. I also needed it to be fast on large-ish
files, which ruled out BeautifulSoup. On the upside, I didn't need it to be
&lt;em&gt;perfect&lt;/em&gt; - and as long as it didn't lose or corrupt data, getting the
indentation mostly right would be good enough. &lt;/p&gt;

&lt;p&gt;Today I sat down and hacked up my own solution. This turns out to be just 40
lines of code, somewhat gnarled and ugly after being fine-tuned against a few
dozen real-world data samples:&lt;/p&gt;

&lt;pre&gt;import re, textwrap

TAG = r&amp;quot;&amp;quot;&amp;quot;
        &amp;lt;\s*
        (?!\s*[!&amp;quot;])
        (?P&amp;lt;close&amp;gt;\s*\/)?
        (?P&amp;lt;name&amp;gt;\w+)
        (
            [^&amp;#39;&amp;quot;\t &amp;gt;]+ |
            &amp;quot;[^\&amp;quot;]*&amp;quot;[&amp;#39;\&amp;quot;]* |
            &amp;#39;[^&amp;#39;]*&amp;#39;[&amp;#39;\&amp;quot;]* |
            \s+
        )*
        (?P&amp;lt;selfcont&amp;gt;\s*\/\s*)?
        \s*&amp;gt;
      &amp;quot;&amp;quot;&amp;quot;
UNI = set([&amp;quot;br&amp;quot;, &amp;quot;hr&amp;quot;, &amp;quot;img&amp;quot;, &amp;quot;input&amp;quot;, &amp;quot;area&amp;quot;, &amp;quot;link&amp;quot;])
INDENT = &amp;quot; &amp;quot;*4
def pretty_xmlish(s):
    &amp;quot;&amp;quot;&amp;quot;
        A robust pretty-printer for XML-ish data.
        Returns a list of lines.
    &amp;quot;&amp;quot;&amp;quot;
    data, offset, indent, prev = [], 0, 0, None
    for i in re.finditer(TAG, s, re.VERBOSE|re.MULTILINE):
        start, end = i.span()
        name = i.group(&amp;quot;name&amp;quot;)
        if start &amp;gt; offset:
            txt = []
            for x in textwrap.dedent(s[offset:start]).split(&amp;quot;\n&amp;quot;):
                if x.strip():
                    txt.append(indent*INDENT + x)
            data.extend(txt)
        if i.group(&amp;quot;close&amp;quot;) and not (name in UNI and name==prev):
            indent = max(indent - 1, 0)
        data.append(indent*INDENT + i.group().strip())
        offset = end
        if not any([i.group(&amp;quot;close&amp;quot;), i.group(&amp;quot;selfcont&amp;quot;), name in UNI]):
            indent += 1
        prev = name
    trail = s[offset:]
    if trail.strip():
        data.append(trail)
    return data
&lt;/pre&gt;

&lt;center&gt;
&lt;div class="subtitle"&gt;
    &lt;a href="http://corte.si/posts/code/pretty_xmlish/snippet.py"&gt;(snippet.py)&lt;/a&gt;
&lt;/div&gt;
&lt;/center&gt;

&lt;p&gt;Little snippets of code like this are too trivial to spin out into an
independent library, but I'd like to put them up somewhere public where other
folks could use them. Right now I don't know of a good place to do this.
There's &lt;a href="http://snipplr.com"&gt;snipplr&lt;/a&gt;, but they went with the wrong kind of
"social" when they made a social snippet repository, ending up with a
social-news-like site focused on upvotes and popularity. This just seems to be
a total mismatch to the problem space. What I really want is some combination
of asymmetric follow, change tracking, tags, powerful search and good curation
tools - more like delicious.com (may it rest in peace) than Reddit. Github's
&lt;a href="http://gist.github.com"&gt;gists&lt;/a&gt; are structurally much closer to this, but
aren't quite there on curation and search. I also suspect that the fact that
gists are full-fledged Git repos is overkill for snippet tracking, much as I
love Git (and Github) for larger projects.&lt;/p&gt;

&lt;p&gt;I'll no doubt be fine-tuning this function in the days to come - if you're
interested, you'll have to keep an eye on &lt;a href="https://github.com/cortesi/mitmproxy/blob/master/libmproxy/utils.py"&gt;this file
here&lt;/a&gt;,
which is less than ideal. If anyone knows of a better snippet sharer, though,
let me know...&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/pretty_xmlish/index.html</guid><pubDate>Sun, 06 Feb 2011 16:45:00 GMT</pubDate></item><item><title>Social news eats a blog post</title><link>http://corte.si/posts/socialmedia/post-lifecycle/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/socialmedia/post-lifecycle/index.html"&gt;Social news eats a blog post&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;24 January 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;This is the second post in which I try to add some data to my nagging doubts
about the technical news ecosystem. In my &lt;a href="http://corte.si/posts/socialmedia/redditgraph/index.html"&gt;previous
post&lt;/a&gt;, I showed off a visualisation of how
the proggit front page changes over time. In this post, I take a look at the
flip-side of the coin - what happens to a specific post as it passes through
the short, fickle social news cycle?  To do this, I'll take a deep dive into my
own server logs, looking at a &lt;a href="http://corte.si/posts/code/cyclesort/index.html"&gt;recent post of
mine&lt;/a&gt; that appeared briefly on both &lt;a href="http://news.ycombinator.com"&gt;Hacker
News&lt;/a&gt; and
&lt;a href="http://www.reddit.com/r/programming"&gt;proggit&lt;/a&gt;. I'd guess that nearly all
posts follow more or less the same trajectory as they are extruded through the
social news mill, so this should be interesting to more people than just me.
At the risk of making things a bit dry and descriptive, I'm saving speculation
and interpretation for a future post.&lt;/p&gt;

&lt;p&gt;The scene is set at about 10pm New Zealand time, when I put the finishing
touches to my blog post, and fire off an rsync up to my server. I quickly
double-check that the blog and the RSS feed have updated OK, &lt;a href="http://twitter.com/cortesi/status/6627667512131584"&gt;tweet a
link&lt;/a&gt; to the post, and go
to bed. While I sleep, the post creeps onto both Hacker News and proggit,
ultimately getting 41000 hits over the next 5 days or so. The graphs below show
only the first 50 hours of the post's lifetime - everything after that is just
a long, slow dénouement as it dwindles into obscurity.&lt;/p&gt;

&lt;h1&gt;Our real-time robot overlords&lt;/h1&gt;

&lt;p&gt;The action starts almost as soon as I click the "tweet" button. Within seconds,
the post is retrieved by Twitterbot. One second later, Googlebot appears, and
almost simultaneously I get hit by Jaxified, Njuice, LinkedIn and PostRank. In
all, 10 bots read my blog post within the first minute, handily beating the
first human, who slouches lethargically into view at a tardy 90 seconds. &lt;/p&gt;

&lt;p&gt;Below is a list of the bots that retrieved my post before the first submission
to a social news site. These are the realtime robots, presumably hoovering up
the Twitter firehose and indexing all the links they find. The cast of
characters is a mixture of the expected big fish, stealth startups, and
skunkworks projects at well-known companies. Bot identity was gleaned from
HTTP &lt;a href="http://en.wikipedia.org/wiki/User_agent"&gt;user-agent&lt;/a&gt; headers when they
were provided, or by checking the ownership of the responsible IP through
reverse DNS resolution and whois lookups when they weren't. Most of the
real-time bots were well behaved, identifying themselves clearly with a URL in
the user-agent string.&lt;/p&gt;

&lt;table&gt;
    &lt;tr&gt;
        &lt;th&gt;minutes after publication&lt;/th&gt;
        &lt;th&gt;bot&lt;/th&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th rowspan="10"&gt;1&lt;/th&gt; &lt;td&gt;&lt;a href="http://twitter.com"&gt;Twitter&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://www.google.com/bot.html"&gt;Google&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://www.jaxified.com/crawler"&gt;Jaxified&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://njuice.com/"&gt;NJuice&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://www.linkedin.com"&gt;LinkedIn&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://www.postrank.com/"&gt;PostRank&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;Unidentified bot from a Microsoft-owned IP&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://help.yahoo.com/help/us/ysearch/slurp"&gt;Yahoo! Slurp&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;Unidentified bot from a &lt;a
        href="http://www.bbc.co.uk/blogs/rad/"&gt;BBC RAD labs&lt;/a&gt; IP. 
        &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://www.oneriot.com/"&gt;OneRiot&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th rowspan="4"&gt;2&lt;/th&gt; &lt;td&gt;&lt;a href="http://friendfeed.com/about/bot"&gt;FriendFeed&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://www.kosmix.com/"&gt;Kosmix&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://labs.topsy.com/butterfly/"&gt;Topsy Butterfly&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;Unidentified bot from &lt;a href="http://marban.com"&gt;marban.com&lt;/a&gt; subdomain. (PoPUrls?)&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th rowspan="2"&gt;3&lt;/th&gt; &lt;td&gt;&lt;a href="http://metauri.com/"&gt;metauri.com&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://search.msn.com/msnbot.htm"&gt;msnbot&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th rowspan="2"&gt;6&lt;/th&gt; &lt;td&gt;&lt;a href="http://summify.com"&gt;Summify&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;td&gt;Bot identifying itself just as "NING", can't confirm that it's &lt;a
        href="http://www.ning.com/"&gt;the Ning&lt;/a&gt;. &lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;9&lt;/th&gt; &lt;td&gt;&lt;a href="http://tineye.com/crawler.html"&gt;tineye&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;26&lt;/th&gt; &lt;td&gt;&lt;a href="http://spinn3r.com/robot"&gt;spinn3r.com&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;27&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.backtype.com/"&gt;backtype.com&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;

    &lt;tr&gt;
        &lt;th&gt;47&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.facebook.com/externalhit_uatext.php"&gt;facebookexternalhit&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;
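
&lt;p&gt;The identification procedure described above (user-agent first, reverse DNS
as a fallback) can be sketched roughly as follows. This is an illustration
only, not the script used to build the table: the names
&lt;strong&gt;url_from_user_agent&lt;/strong&gt;, &lt;strong&gt;reverse_dns&lt;/strong&gt; and
&lt;strong&gt;identify_bot&lt;/strong&gt; are hypothetical, and the whois step is left out.&lt;/p&gt;

```python
import re
import socket

# Matches an http(s) URL embedded in a user-agent string, e.g.
# "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
UA_URL = re.compile(r"https?://[^\s;)]+")


def url_from_user_agent(user_agent):
    """Return the first URL advertised in a user-agent string, or None."""
    match = UA_URL.search(user_agent or "")
    return match.group(0) if match else None


def reverse_dns(ip):
    """Return the hostname for an IP via reverse DNS, or None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def identify_bot(user_agent, ip, resolve=reverse_dns):
    """Prefer the self-reported URL; fall back to a reverse DNS lookup."""
    return url_from_user_agent(user_agent) or resolve(ip)
```

&lt;p&gt;The &lt;strong&gt;resolve&lt;/strong&gt; parameter is there only so the lookup can be swapped
out, for instance when replaying old logs without touching the network.&lt;/p&gt;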

&lt;h1&gt;Enter the heavyweights: Hacker News and Reddit&lt;/h1&gt;

&lt;p&gt;48 minutes after the post was published, the first hit from a social news site
appears: hello &lt;a href="http://news.ycombinator.com"&gt;Hacker News&lt;/a&gt;. The post
quickly makes it onto the front page, and HN traffic peaks at 399 hits per hour
in the second hour after publication. All told, the post got 2337 hits with a
HN &lt;a href="http://en.wikipedia.org/wiki/HTTP_referrer"&gt;referrer header&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;a href="http://corte.si/posts/socialmedia/post-lifecycle/ycombinator.png" title="ycombinator"&gt;
        &lt;img src="http://corte.si/posts/socialmedia/post-lifecycle/ycombinator.png" alt="ycombinator" /&gt;
    &lt;/a&gt;
    &lt;div class="subtitle"&gt;
        news.ycombinator.com
    &lt;/div&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Two hours and three minutes after publication, the real monster of social news
arrives: the first hit from Reddit appears. The Reddit traffic peaks in the
sixth hour after publication at 3025 hits per hour, and delivers a total of
23807 hits in the 51 hours after publication.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;a href="http://corte.si/posts/socialmedia/post-lifecycle/reddit.png" title="reddit"&gt;
        &lt;img src="http://corte.si/posts/socialmedia/post-lifecycle/reddit.png" alt="reddit" /&gt;
    &lt;/a&gt;
    &lt;div class="subtitle"&gt;
        reddit.com/r/programming
    &lt;/div&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;h1&gt;The long tail&lt;/h1&gt;

&lt;p&gt;Reddit accounted for the vast majority of the post's traffic, dwarfing all
other sources combined. In all, I received only 2300 hits with specified
referrer headers that weren't Reddit or HN. Here are all the referrers that
were responsible for more than 10 hits to the post:&lt;/p&gt;

&lt;table&gt;
    &lt;tr&gt;&lt;th&gt;hits&lt;/th&gt;&lt;th&gt;site&lt;/th&gt;&lt;/tr&gt;

    &lt;tr&gt;&lt;th&gt;456&lt;/th&gt; &lt;td&gt;&lt;a href="http://popurls.com"&gt;popurls.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;359&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.google.com/reader"&gt;Google Reader&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;282&lt;/th&gt; &lt;td&gt;&lt;a href="http://twitter.com"&gt;Twitter&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;196&lt;/th&gt; &lt;td&gt;&lt;a href="http://jimmyr.com"&gt;jimmyr.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;183&lt;/th&gt; &lt;td&gt;&lt;a href="http://delicious.com"&gt;delicious&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;153&lt;/th&gt; &lt;td&gt;&lt;a href="http://pop.is"&gt;pop.is&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;139&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.google.com"&gt;Google Search&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;82&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.wired.com"&gt;wired.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;56&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.facebook.com"&gt;Facebook&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;36&lt;/th&gt; &lt;td&gt;&lt;a href="http://longurl.com"&gt;longurl.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;36&lt;/th&gt; &lt;td&gt;&lt;a href="http://glozer.net/trendy"&gt;glozer.net/trendy&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;30&lt;/th&gt; &lt;td&gt;&lt;a href="http://oursignal.com"&gt;oursignal.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;28&lt;/th&gt; &lt;td&gt;&lt;a href="http://hackurls.com"&gt;hackurls.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;24&lt;/th&gt; &lt;td&gt;&lt;a href="http://pipes.yahoo.com"&gt;Yahoo Pipes&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;18&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.netvibes.com"&gt;www.netvibes.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;15&lt;/th&gt; &lt;td&gt;&lt;a href="http://dzone.com"&gt;dzone.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th&gt;11&lt;/th&gt; &lt;td&gt;&lt;a href="http://www.freshnews.com"&gt;www.freshnews.org&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;It's interesting to see that I got nearly 200 hits from delicious.com. By
contrast, &lt;a href="http://pinboard.in"&gt;pinboard.in&lt;/a&gt; - which seems to be delicious.com's
anointed successor - sent me only two hits. Then again, my post was published
in late November 2010, about a month before Yahoo &lt;a href="http://techcrunch.com/2010/12/16/is-yahoo-shutting-down-del-icio-us/"&gt;spectacularly
hobbled&lt;/a&gt;
their bookmarking property. I wonder what those figures would look like today.&lt;/p&gt;

&lt;p&gt;The thin end of the long tail is the 200 hits from 94 sites that were
responsible for 10 or fewer hits each. We can break this motley crew up into a
few different classes: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Sites that provide some sort of social news analysis, piggy-backing off HN,
Reddit and delicious.com. For example, &lt;a href="http://popacular.com"&gt;popacular.com&lt;/a&gt;,
&lt;a href="http://seesmic.com"&gt;seesmic.com&lt;/a&gt;, &lt;a href="http://hotgrog.com"&gt;hotgrog.com&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;URL shorteners like &lt;a href="http://j.mp"&gt;j.mp&lt;/a&gt; and unshorteners like
&lt;a href="http://untiny.me"&gt;untiny.me&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Social media-ish services like &lt;a href="http://friendfeed.com"&gt;FriendFeed&lt;/a&gt;,
&lt;a href="http://stumbleupon.com"&gt;StumbleUpon&lt;/a&gt;, &lt;a href="http://pinboard.in"&gt;pinboard.in&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tiny personal blogs. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;And, surprisingly - a number of sites that just provide an alternative
interface or URL for Hacker News: &lt;a href="http://hackerne.ws/"&gt;hackerne.ws&lt;/a&gt;,
&lt;a href="http://ihackernews.com/"&gt;ihackernews.com&lt;/a&gt;,
&lt;a href="http://hacker-newspaper.gilesb.com/"&gt;hacker-newspaper.gilesb.com&lt;/a&gt;,
&lt;a href="http://www.icombinator.net/"&gt;www.icombinator.net&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;Robot scavengers of the social news ecosphere&lt;/h1&gt;

&lt;p&gt;Let's take a look at overall bot traffic, separating out our silicon friends by
looking for non-human and non-standard user-agent headers. The moment the post
hits the HN front page, bot traffic spikes, and this spike continues as the post
is submitted to Reddit and starts its climb up the proggit front page. &lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;a href="http://corte.si/posts/socialmedia/post-lifecycle/robots.png" title="robots"&gt;
        &lt;img src="http://corte.si/posts/socialmedia/post-lifecycle/robots.png" alt="robots" /&gt;
    &lt;/a&gt;
    &lt;div class="subtitle"&gt;
        all bots
    &lt;/div&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;Enter the robot scavengers of the social news ecosphere - a set of second-tier
aggregators that monitor social news and Twitter for hot stories. Here's a
sample of bot visitors, taken more or less at random from the logs:&lt;/p&gt;

&lt;table&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://inagist.com"&gt;inagist.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://www.netvibes.com"&gt;www.netvibes.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://chattertrap.com"&gt;chattertrap.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://twingly.com"&gt;twingly.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://coder.io"&gt;coder.io&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://newsmagpie.com"&gt;newsmagpie.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://worio.com"&gt;worio.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://www.myvbo.com"&gt;www.myvbo.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://www.zemanta.com"&gt;www.zemanta.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://embed.ly"&gt;embed.ly&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://brandwatch.net"&gt;brandwatch.net&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://www.flipboard.com"&gt;www.flipboard.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://paper.li"&gt;paper.li&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://rivva.de"&gt;rivva.de&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://attribyte.com"&gt;attribyte.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://diffbot.com"&gt;diffbot.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://yoono.com"&gt;yoono.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://hatena.net.jp"&gt;hatena.net.jp&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://hourlypress.com"&gt;hourlypress.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://longurl.org"&gt;longurl.org&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://untiny.me"&gt;untiny.me&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://goo.ne.jp"&gt;goo.ne.jp&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://www.baidu.com"&gt;www.baidu.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://sharethis.com"&gt;sharethis.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://ideashower.com"&gt;ideashower.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://pannous.info"&gt;pannous.info&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://wikiwix.com"&gt;wikiwix.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://pipes.yahoo.com"&gt;pipes.yahoo.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://mustexist.com"&gt;mustexist.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://pics.fefoo.com"&gt;pics.fefoo.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://cyber.law.harvard.edu"&gt;cyber.law.harvard.edu&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://seatgeek.com"&gt;seatgeek.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;a href="http://metadatalabs.com"&gt;metadatalabs.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://moreover.com"&gt;moreover.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://thinglabs.com"&gt;thinglabs.com&lt;/a&gt;&lt;/td&gt;
    &lt;td&gt;&lt;a href="http://stufftotweet.com"&gt;stufftotweet.com&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://chilitweets.com"&gt;chilitweets.com&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="http://bkluster.hut.edu.vn"&gt;bkluster.hut.edu.vn&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="http://wikio.com"&gt;wikio.com&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="http://pipes.yahoo.com"&gt;Yahoo Pipes&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;a href="http://zite.com"&gt;zite.com&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="http://zelist.ro"&gt;zelist.ro&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="http://buzzzy.com"&gt;buzzzy.com&lt;/a&gt;&lt;/td&gt;
        &lt;td&gt;&lt;a href="http://intravnews.com"&gt;intravnews.com&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;At this point, I'd like to bitch a bit about how astonishingly badly behaved
some of the automated systems skulking around today's web are. The vast, vast
majority don't provide any clue about the responsible entity in the user-agent
string. The list above consists of responsible bots that do identify
themselves, and less responsible ones that I could identify through reverse
domain resolution. Most of the irresponsible bots come from Amazon Web
Services, which seems to be a right wretched hive of scum and villainy. The
worst performers here boggle the mind - about a dozen hosts from AWS retrieved
the blog post more than 200 times a day, all using full GET requests, without
an If-Modified-Since header, and with no identification. The arch-villain hit
the post 600 times in its first 24 hours - that's about once every 2.4 minutes.&lt;/p&gt;
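
&lt;p&gt;For contrast, here is a minimal sketch of what a better-behaved poller might
send: a request that names itself in the user-agent and is conditional on the
page having changed. It uses only the Python standard library; the function
names, the bot name and the URLs in the usage are made up for illustration.&lt;/p&gt;

```python
import email.utils
import urllib.error
import urllib.request


def polite_request(url, last_fetched, bot_name, info_url):
    """Build a GET request that identifies itself and is conditional on
    the page having changed since `last_fetched` (a Unix timestamp)."""
    headers = {
        # Name the bot and say where its operator can be found.
        "User-Agent": "%s (+%s)" % (bot_name, info_url),
        # HTTP dates are always expressed in GMT.
        "If-Modified-Since": email.utils.formatdate(last_fetched, usegmt=True),
    }
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(request):
    """Return the body if the page changed, or None on 304 Not Modified."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None
        raise
```

&lt;p&gt;A server that honours the header answers an unchanged page with a bodiless
304, so repeated polling costs both sides a few hundred bytes rather than a
full page.&lt;/p&gt;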

&lt;h1&gt;Referrer-less viewers and stealthy bots&lt;/h1&gt;

&lt;p&gt;I was surprised to see that almost 20% of requests not identified as bot
requests had no specified referrer, a much greater percentage than I would have
anticipated. Here's a graph showing the number of referrer-less requests per
hour:&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;a href="http://corte.si/posts/socialmedia/post-lifecycle/noreferrer.png" title="no referrer"&gt;
        &lt;img src="http://corte.si/posts/socialmedia/post-lifecycle/noreferrer.png" alt="no referrer" /&gt;
    &lt;/a&gt;
    &lt;div class="subtitle"&gt;
        requests without a referrer
    &lt;/div&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;It looks like the double-peak in this graph coincides with the traffic peaks
from HN and Reddit. This suggests that the majority of these hits do in fact
come (perhaps indirectly) from HN and Reddit users. One possibility is that a
chunk of this referrer-less traffic comes from non-browser Twitter clients.&lt;/p&gt;
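
&lt;p&gt;A per-hour count like the one graphed above could be produced along these
lines. This is a sketch under assumptions, not the analysis script behind the
graph: it assumes the log has already been parsed into (timestamp, referrer)
pairs, and the publication time and sample requests are invented.&lt;/p&gt;

```python
from collections import Counter
from datetime import datetime, timedelta


def referrerless_per_hour(requests, published):
    """Count requests with no referrer, bucketed by whole hours elapsed
    since `published`.

    `requests` is an iterable of (timestamp, referrer) pairs, where a
    missing referrer is None, "" or "-" (as written in common log formats).
    """
    counts = Counter()
    for timestamp, referrer in requests:
        if referrer in (None, "", "-"):
            hours = int((timestamp - published) / timedelta(hours=1))
            counts[hours] += 1
    return counts


# Invented sample data, purely to show the call.
published = datetime(2010, 11, 22, 9, 0)
sample = [
    (published + timedelta(minutes=5), "-"),
    (published + timedelta(minutes=59), None),
    (published + timedelta(minutes=61), ""),
    (published + timedelta(minutes=70), "http://news.ycombinator.com/"),
]
print(referrerless_per_hour(sample, published))
```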

&lt;p&gt;A fraction of the referrer-less traffic also comes from stealthy bots sending
user-agent strings that match those of desktop browsers. About 5% of these
requests, for example, come from the Amazon EC2 cloud, so are unlikely to be
real browsers. One Internet darling that does this is Instapaper, which seems
to use the requesting client's user-agent string rather than frankly confessing
itself to be a bot. It also appears to re-request an article in full for each
user, rather than simply checking if there's been a change and using a cached
copy. On the upside, this means that I know that 131 readers used Instapaper to
view my post.&lt;/p&gt;

&lt;h1&gt;Aftermath&lt;/h1&gt;

&lt;p&gt;After the post drifts off the proggit and HN front pages, traffic dies down.
There's a dwindling tail of stragglers that bothered to flip through to the
second or third page of top stories, and a tiny dribble of users who discovered
the link through other sources. A month later, the post gets about 60 hits per
day, of which more than a third are from bots. Non-bot traffic is still
dominated by Reddit, presumably from people searching or idly flicking through
Reddit's history.&lt;/p&gt;

&lt;p&gt;So, in the end, after my once-thrumming server quiets down, what has the
lasting effect been on my own social graph? I had a small surge of Twitter
follows, going from 230 to 245 followers. There was a minor blip of subscribers
to my RSS feed, with Google Reader reporting subscriptions going from about 510
to 551. Out of 33,000 unique visitors, 56 decided to cultivate a more permanent
relationship of some sort to my blog. That's 1 in 600. If you remember only one
figure from this post, this should be it. &lt;/p&gt;
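
&lt;p&gt;The arithmetic behind that figure, spelled out using the numbers quoted in
the paragraph above (the Google Reader starting count is approximate, so the
result is too):&lt;/p&gt;

```python
# Figures quoted in the post (the Google Reader count is approximate).
twitter_gain = 245 - 230      # new Twitter followers
reader_gain = 551 - 510       # new RSS subscribers via Google Reader
unique_visitors = 33000

converted = twitter_gain + reader_gain
visitors_per_convert = unique_visitors / converted

print(converted)                      # 56
print(round(visitors_per_convert))    # 589, i.e. roughly 1 in 600
```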

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/socialmedia/post-lifecycle/index.html</guid><pubDate>Mon, 24 Jan 2011 20:43:00 GMT</pubDate></item><item><title>A journey through the bowels of proggit</title><link>http://corte.si/posts/socialmedia/redditgraph/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/socialmedia/redditgraph/index.html"&gt;A journey through the bowels of proggit&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;12 January 2011&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://corte.si/posts/socialmedia/redditgraph/proggit4.png" title="proggit - 4 hours"&gt;
    &lt;img src="http://corte.si/posts/socialmedia/redditgraph/proggit4.png" alt="proggit - 4 hours" /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I've had a nagging sense of dissatisfaction with my information diet lately,
and it's becoming clear that over-reliance on social news sites like Reddit and
Hacker News (much as I love them) lies at the heart of my discontent. For the
past few months, I've been gathering data to help me come up with a coherent
explanation for my malaise. I'm still working on it, so this post will have no
conclusions, only repulsive metaphors and pretty pictures.&lt;/p&gt;

&lt;p&gt;For a week or so in November I logged the slow, peristaltic progress of stories
through the bowels of &lt;a href="http://www.reddit.com/r/programming"&gt;proggit&lt;/a&gt;, watching
them get nudged this way and that by the malodorous, hot gas of public opinion
before finally being shunted on to the colon of the second page of results.  In
other words, I sampled the top 25 stories every 5 minutes through the RSS feed.
One of the things I was interested in was how submission rankings changed over
time, so I visualised the dataset using the same technique I came up with to
&lt;a href="http://sortvis.org"&gt;visualise sorting algorithms&lt;/a&gt;. The image above shows 4
hours of proggit, with each submission represented by a line. The lines are
coloured based on the average rank the story achieves over its lifetime in the
top 25, ranging between upvote orange for top stories, and downvote blue for
bottom stories.&lt;/p&gt;
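
&lt;p&gt;The colouring step can be sketched as follows. This is an illustrative
reconstruction rather than the code behind the image: it assumes each sample
is a list of story ids in rank order, that the colour varies linearly with
average rank, and the two RGB endpoints are guesses at "upvote orange" and
"downvote blue". The function names are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict

# Assumed endpoint colours (RGB); the post does not give exact values.
TOP_COLOUR = (255, 139, 96)      # an "upvote orange"
BOTTOM_COLOUR = (148, 148, 255)  # a "downvote blue"


def average_ranks(samples):
    """Map each story id to its mean rank over the samples it appears in.

    `samples` is a list of rankings, one per poll of the feed; each
    ranking is a list of story ids ordered from rank 0 (top) downwards.
    """
    ranks = defaultdict(list)
    for ranking in samples:
        for rank, story in enumerate(ranking):
            ranks[story].append(rank)
    return {story: sum(r) / len(r) for story, r in ranks.items()}


def rank_colour(avg_rank, size=25):
    """Linearly interpolate between the top and bottom colours."""
    t = avg_rank / (size - 1)  # 0.0 for the top slot, 1.0 for the bottom
    return tuple(
        round(a + (b - a) * t) for a, b in zip(TOP_COLOUR, BOTTOM_COLOUR)
    )
```

&lt;p&gt;A story is averaged only over the samples in which it appears, which matches
"average rank over its lifetime in the top 25" as described above.&lt;/p&gt;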

&lt;p&gt;Here's a bigger sample - 72 hours of data embedded in a widget to let you zoom
and pan around. The busy cut-and-thrust of life on reddit is all here. The
meteoric rise, inevitably followed by long, slow decay. The sudden, mysterious,
mid-flight disappearances. The jostling and writhing among the bottom
submissions that never quite manage to make it into the big leagues. Heady
stuff. Click to view:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://corte.si/posts/socialmedia/redditgraph/proggit72/index.html" title="proggit - 72 hours"&gt;
    &lt;img src="http://corte.si/posts/socialmedia/redditgraph/mini72.png" alt="proggit - 72 hours" /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Perhaps I'll do an expanded version that lets you view submission titles, times
and so forth later on. &lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/socialmedia/redditgraph/index.html</guid><pubDate>Wed, 12 Jan 2011 20:51:00 GMT</pubDate></item><item><title>Cyclesort - a curious little sorting algorithm</title><link>http://corte.si/posts/code/cyclesort/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/code/cyclesort/index.html"&gt;Cyclesort - a curious little sorting algorithm&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;22 November 2010&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;One of the nice things about building &lt;a href="http://sortvis.org"&gt;sortvis.org&lt;/a&gt; and
writing the posts that led up to it is that people email me with pointers to
esoteric algorithms I've never heard of. Today's post is dedicated to one of
these - a curious little sorting algorithm called
&lt;a href="http://en.wikipedia.org/wiki/Cycle_sort"&gt;cyclesort&lt;/a&gt;. It was described in 1990
in a &lt;a href="http://comjnl.oxfordjournals.org/content/33/4/365.full.pdf"&gt;3-page paper by B.K.
Haddon&lt;/a&gt;, and has
become a firm favourite of mine. &lt;/p&gt;

&lt;p&gt;Cyclesort has some nice properties - for certain restricted types of data it
can do a stable, in-place sort in linear time, while guaranteeing that each
element will be moved at most once. But what I really like about this algorithm
is how naturally it arises from a simple theorem on &lt;a href="http://mathworld.wolfram.com/SymmetricGroup.html"&gt;symmetric
groups&lt;/a&gt;.  Bear with me while
I work up to the algorithm through a couple of basic concepts.&lt;/p&gt;

&lt;h1&gt;Cycles&lt;/h1&gt;

&lt;p&gt;Let's start with the definition of a
&lt;a href="http://mathworld.wolfram.com/PermutationCycle.html"&gt;cycle&lt;/a&gt;. A cycle is a
subset of elements from a permutation that have been rotated from their
original position. So, say we have an ordered set &lt;strong&gt;[0, 1, 2, 3, 4]&lt;/strong&gt;, and a
cycle &lt;strong&gt;[0, 3, 1]&lt;/strong&gt;. The cycle defines a rotation where element 0 moves to
position 3, 3 to 1 and 1 to 0.  Visually, it looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/code/cyclesort/graph1.png"/&gt;&lt;/p&gt;

&lt;p&gt;We can apply a cycle to an ordered set to obtain a permutation, and we can then
reverse that cycle to re-obtain the original set. Here's a Python function that
applies a cycle to a list in-place:&lt;/p&gt;

&lt;pre&gt;
def apply_cycle(lst, c):
    # Extract the cycle&amp;#39;s values
    vals = [lst[i] for i in c]
    # Rotate them circularly by one position
    vals = [vals[-1]] + vals[:-1]
    # Re-insert them into the list
    for i, offset in enumerate(c):
        lst[offset] = vals[i]
&lt;/pre&gt;

&lt;p&gt;Here's an interactive session showing the function in action:&lt;/p&gt;

&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; lst = [0, 1, 2, 3, 4]
&amp;gt;&amp;gt;&amp;gt; c = [0, 3, 1]
&amp;gt;&amp;gt;&amp;gt; apply_cycle(lst, c)
&amp;gt;&amp;gt;&amp;gt; lst
[1, 3, 2, 0, 4]
&amp;gt;&amp;gt;&amp;gt; c.reverse()
&amp;gt;&amp;gt;&amp;gt; apply_cycle(lst, c)
&amp;gt;&amp;gt;&amp;gt; lst
[0, 1, 2, 3, 4]

&lt;/pre&gt;

&lt;h1&gt;Permutations&lt;/h1&gt;

&lt;p&gt;Now, it's a fascinating fact that &lt;strong&gt;any permutation can be decomposed into a
unique set of disjoint cycles&lt;/strong&gt;. We can think of this as analogous to the
factorization of a number - every permutation is the product of a unique set of
component cycles in the same way every number is the product of a unique set of
prime factors.  Taking this as a given, how could we calculate the cycles that
make up a permutation?  One obvious way to proceed is to pick a starting point,
and simply "follow" the cycle in reverse until we get back to where we started.
We know from the result above that the element is guaranteed to be part of a
cycle, so we must eventually reach our starting point again. When we do, hey
presto, we have a complete cycle. If we keep track of the elements that are
already part of a known cycle, we can skip to the next unknown element and
repeat the process.  Once we reach the end of the list we're done.&lt;/p&gt;

&lt;p&gt;This scheme can only work if we know where in the ordered sequence any given
element belongs, because this is the way we find the "previous hop" in a cycle.
In the examples above, we worked with lists that consist of a contiguous range
of numbers &lt;strong&gt;0..n&lt;/strong&gt;, which gives us a short-cut: the element's value &lt;em&gt;is&lt;/em&gt; its
offset in the ordered list. In the code below I've factored this out into a
function &lt;strong&gt;key&lt;/strong&gt;, which takes an element value, and returns its correct offset
- in this case &lt;strong&gt;key&lt;/strong&gt; is simply the identity function.&lt;/p&gt;

&lt;p&gt;Here's a Python function that finds all cycles in permutations of numbers
ranging from &lt;strong&gt;0..n&lt;/strong&gt;:&lt;/p&gt;

&lt;pre&gt;
def key(element):
    return element

def find_cycles(l):
    seen = set()
    cycles = []
    for i in range(len(l)):
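        # Start a cycle at every misplaced element we have not visited yet
        if i != key(l[i]) and i not in seen:
            cycle = []
            n = i
            while 1:
                cycle.append(n)
                # Hop to the offset where the element at n belongs
                n = key(l[n])
                if n == i:
                    break
            seen.update(cycle)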
            cycles.append(list(reversed(cycle)))
    return cycles


&lt;/pre&gt;

&lt;p&gt;Running it on our example permutation produces the cycle we used to produce it:&lt;/p&gt;

&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; find_cycles([1, 3, 2, 0, 4])
[[3, 1, 0]]

&lt;/pre&gt;

&lt;p&gt;Here's &lt;strong&gt;find_cycles&lt;/strong&gt; run on a longer, randomly shuffled list:&lt;/p&gt;

&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; l = [0, 5, 6, 8, 7, 4, 9, 1, 3, 2]
&amp;gt;&amp;gt;&amp;gt; find_cycles(l)
[[7, 4, 5, 1], [9, 6, 2], [8, 3]]

&lt;/pre&gt;

&lt;p&gt;And here's a handsomely colourful graphical version of the output above:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/code/graph2.png"/&gt;&lt;/p&gt;

&lt;h1&gt;A sorting algorithm emerges&lt;/h1&gt;

&lt;p&gt;Let's take a closer look at the &lt;strong&gt;find_cycles&lt;/strong&gt; function above. We keep track of
elements that are already part of a cycle in the &lt;strong&gt;seen&lt;/strong&gt; set, so that we can
skip them as we proceed through the list. The &lt;strong&gt;seen&lt;/strong&gt; set can be as large as
the list itself, so we've doubled the memory requirement for the algorithm. If
we're allowed to destroy the input list, we can avoid explicitly tracking seen
elements by relocating elements to their correct position as we work our way
around each cycle. All the cycles are disjoint and we traverse each cycle only
once, so doing this won't affect the function's output. We can then tell that
we need to skip an element we've already seen by checking whether it's in the
correct sorted position. Here's the result:&lt;/p&gt;

&lt;pre&gt;
def key(element):
    return element

def find_cycles2(l):
    cycles = []
    for i in range(len(l)):
        if i != key(l[i]):
            cycle = []
            n = i
            while 1: 
                cycle.append(n)
                tmp = l[n]
                if n != i:
                    l[n] = last_value
                last_value = tmp
                n = key(last_value)
                if n == i:
                    l[n] = last_value
                    break
            cycles.append(list(reversed(cycle)))
    return cycles



&lt;/pre&gt;

&lt;p&gt;But... at the end of this process, the original list is sorted! Tada: cyclesort
pops out of the shrubbery almost as a side-effect of efficiently finding all
cycles. If we're only interested in sorting, we can strip the code that saves
the cycles, which leaves us with a nice, pared-back sorting algorithm:&lt;/p&gt;

&lt;pre&gt;
def key(element):
    return element

def cyclesort_simple(l):
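    for i in range(len(l)):
        # An element already at its sorted offset needs no work
        if i != key(l[i]):
            n = i
            while 1:
                # Pick up the element at n before it is overwritten
                tmp = l[n]
                if n != i:
                    l[n] = last_value
                last_value = tmp
                # Hop to the offset where the picked-up element belongs
                n = key(last_value)
                if n == i:
                    # Back at the start: close the cycle
                    l[n] = last_value
                    break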


&lt;/pre&gt;

&lt;p&gt;The &lt;strong&gt;cyclesort_simple&lt;/strong&gt; algorithm only works on permutations of sets of
numbers ranging from &lt;strong&gt;0&lt;/strong&gt; to &lt;strong&gt;n&lt;/strong&gt;. There are other fast ways to sort data of
this restricted kind, but all the methods I know of require additional memory
proportional to &lt;strong&gt;n&lt;/strong&gt;. Cyclesort can do it without any extra storage at all,
which is a neat trick.&lt;/p&gt;
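
&lt;p&gt;To make the "moved at most once" property concrete, here's a sketch of a
variant that counts list writes. The name &lt;strong&gt;cyclesort_counting&lt;/strong&gt; and the
counter are additions for illustration only, and &lt;strong&gt;key&lt;/strong&gt; is again inlined as the
identity function. The number of writes should equal the number of elements that
started out of place:&lt;/p&gt;

&lt;pre&gt;
def cyclesort_counting(l):
    # Hypothetical variant of cyclesort_simple that also counts list writes
    writes = 0
    for i in range(len(l)):
        if i != l[i]:
            n = i
            while True:
                tmp = l[n]
                if n != i:
                    l[n] = last_value
                    writes += 1
                last_value = tmp
                n = last_value
                if n == i:
                    # Closing write: the last element lands in the first slot
                    l[n] = last_value
                    writes += 1
                    break
    return writes

data = [0, 5, 6, 8, 7, 4, 9, 1, 3, 2]
misplaced = sum(1 for i, v in enumerate(data) if i != v)
writes = cyclesort_counting(data)
assert data == list(range(10))
assert writes == misplaced
&lt;/pre&gt;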

&lt;h1&gt;Visualising cyclesort&lt;/h1&gt;

&lt;p&gt;At this point, we have enough information to visualise the algorithm, so let's
take a look at the beastie we're working with. I've had to make some little
adjustments to the usual sortvis.org visualisation process to cope with
cyclesort. In the algorithm above, the first element is duplicated into the
second position of each cycle, and that duplicate remains in play until it's
over-written by the last element of the cycle. I changed the algorithm slightly
to write a null placeholder at the start of the cycle to avoid duplicates, and
taught the sortvis.org visualiser to deal with "empty" slots.  The resulting
&lt;a href="http://sortvis.org/visualisations.html"&gt;weave&lt;/a&gt; visualisation looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://corte.si/posts/code/cyclesort/cyclesort.png" title="cyclesort"&gt;
    &lt;img src="http://corte.si/posts/code/cyclesort/cyclesort.png" alt="cyclesort" /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is quite satisfying - you can tell where each cycle begins and ends by the
gaps, which span each cycle exactly. It's immediately clear that the
permutation above, for instance, contained five cycles. Within each cycle, you
can follow along as each element replaces the next, until we finally close the
gap by placing the last element in the first slot.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://sortvis.org/visualisations.html"&gt;dense&lt;/a&gt; visualisation is less
informative because the gaps are too small to see at a single-pixel width, and
the algorithm doesn't have much other large-scale structure. It still looks
neat, though:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://corte.si/posts/code/cyclesort/cyclesort-dense.png" title="cyclesort"&gt;
    &lt;img src="http://corte.si/posts/code/cyclesort/cyclesort-dense.png" alt="cyclesort" /&gt;
&lt;/a&gt;&lt;/p&gt;
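
&lt;p&gt;For the curious, the placeholder tweak described at the start of this section
could look something like the sketch below. This is not the actual sortvis code:
the name &lt;strong&gt;cyclesort_placeholder&lt;/strong&gt;, the use of &lt;strong&gt;None&lt;/strong&gt; as the "empty" marker,
and the list of snapshots it returns are all assumptions made for illustration.&lt;/p&gt;

&lt;pre&gt;
def cyclesort_placeholder(l):
    # Sketch only: sorts a permutation of 0..n in place and returns a
    # snapshot of the list after every write, so the gap can be inspected
    frames = []
    for i in range(len(l)):
        if i != l[i]:
            last_value = l[i]
            # Vacate the first slot of the cycle instead of leaving a duplicate
            l[i] = None
            frames.append(list(l))
            n = last_value
            while n != i:
                # Drop the carried element into place and pick up the next one
                l[n], last_value = last_value, l[n]
                frames.append(list(l))
                n = last_value
            # Close the gap with the last element of the cycle
            l[i] = last_value
            frames.append(list(l))
    return frames

data = [1, 3, 2, 0, 4]
for frame in cyclesort_placeholder(data):
    print(frame)
&lt;/pre&gt;

&lt;p&gt;Each snapshot contains at most one empty slot and no duplicated values, which
is the property the visualiser relies on.&lt;/p&gt;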

&lt;h1&gt;Generalising cyclesort&lt;/h1&gt;

&lt;p&gt;Cyclesort works whenever we can write an implementation of the &lt;strong&gt;key&lt;/strong&gt;
function, so there's quite a bit of scope for clever exploitation of structured
data. The Haddon paper presents a solution for one common case: permutations
whose elements come from a relatively small set, where the number of occurrences
of each element is known. The insight is that the &lt;strong&gt;key&lt;/strong&gt; function can have
persistent state, letting us calculate the positions of elements incrementally
as we work through the list.&lt;/p&gt;

&lt;p&gt;We begin by adding an extra argument to our sort function: a list of
&lt;strong&gt;(element, count)&lt;/strong&gt; tuples telling us a) the order of the keys, and b) the
frequency with which each key occurs.&lt;/p&gt;

&lt;pre&gt;[(&amp;quot;a&amp;quot;, 10), (&amp;quot;b&amp;quot;, 33), (&amp;quot;c&amp;quot;, 18), (&amp;quot;d&amp;quot;, 41)]

&lt;/pre&gt;

&lt;p&gt;Now, in the sorted list, we know that there will be a contiguous block of 10
"a"s, followed by a contiguous block of 33 "b"s, and so forth. We can use this
information to calculate the offset of each contiguous block up front:&lt;/p&gt;

&lt;pre&gt;
def offsets(keys):
    d = {}
    offset = 0
    for key, occurrences in keys:
        d[key] = offset
        offset += occurrences
    return d


&lt;/pre&gt;

&lt;p&gt;The &lt;strong&gt;key&lt;/strong&gt; function uses this offset dictionary to look up the current index
for any element. Each time we insert an element into position, we increment the
relevant offset entry - next time we get to an element of the same type, we
will place it in the next position in the contiguous block. We also make a
small modification to the algorithm to cater for the progressive position
increment process: we start a cycle only when the element's current index is
equal to or above the position where it ought to be. Here's a Python
implementation:&lt;/p&gt;

&lt;pre&gt;
def offsets(keys):
    d = {}
    offset = 0
    for key, occurrences in keys:
        d[key] = offset
        offset += occurrences
    return d


def key(o, element):
    return o[element]


def cyclesort_general(l, keys):
    o = offsets(keys)
    for i in range(len(l)):
        if i &amp;gt;= key(o, l[i]):
            n = i
            while 1: 
                tmp = l[n]
                if n != i:
                    l[n] = last_value
                last_value = tmp
                n = key(o, last_value)
                o[last_value] += 1
                if n == i:
                    l[n] = last_value
                    break


&lt;/pre&gt;

&lt;p&gt;This algorithm runs in &lt;strong&gt;O(n + m)&lt;/strong&gt;, where &lt;strong&gt;n&lt;/strong&gt; is the number of elements and
&lt;strong&gt;m&lt;/strong&gt; is the number of distinct element values. In practice &lt;strong&gt;m&lt;/strong&gt; is usually
small, so this is often tantamount to being &lt;strong&gt;O(n)&lt;/strong&gt;.&lt;/p&gt;
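
&lt;p&gt;If the counts aren't known ahead of time, one option is to tally them in a
preliminary pass, at the cost of some extra memory proportional to &lt;strong&gt;m&lt;/strong&gt;. The
sketch below does this with &lt;strong&gt;collections.Counter&lt;/strong&gt; from the standard library. The
&lt;strong&gt;counted_keys&lt;/strong&gt; helper is a made-up name, &lt;strong&gt;key&lt;/strong&gt; is inlined as a dictionary
lookup, and the comparison is spelled with &lt;strong&gt;operator.ge&lt;/strong&gt;, which performs the same
"equal to or above" test as the listing above:&lt;/p&gt;

&lt;pre&gt;
from collections import Counter
from operator import ge

def offsets(keys):
    d = {}
    offset = 0
    for key, occurrences in keys:
        d[key] = offset
        offset += occurrences
    return d

def cyclesort_general(l, keys):
    o = offsets(keys)
    for i in range(len(l)):
        # ge(i, x) is true when i is at or past the next free slot x
        if ge(i, o[l[i]]):
            n = i
            while True:
                tmp = l[n]
                if n != i:
                    l[n] = last_value
                last_value = tmp
                n = o[last_value]
                o[last_value] += 1
                if n == i:
                    l[n] = last_value
                    break

def counted_keys(l):
    # Hypothetical helper: one counting pass, keys listed in sorted order
    return sorted(Counter(l).items())

data = list("dbacabdcab")
cyclesort_general(data, counted_keys(data))
assert "".join(data) == "aaabbbccdd"
&lt;/pre&gt;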

&lt;h1&gt;The code&lt;/h1&gt;

&lt;p&gt;As usual, the code for these visualisations has been incorporated into the
&lt;a href="https://github.com/cortesi/sortvis"&gt;sortvis project&lt;/a&gt;. I've also added the
visualisations above to the &lt;a href="http://sortvis.org"&gt;sortvis.org&lt;/a&gt; website.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/code/cyclesort/index.html</guid><pubDate>Mon, 22 Nov 2010 21:36:00 GMT</pubDate></item><item><title>What Stuxnet means</title><link>http://corte.si/posts/security/stuxnet.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/security/stuxnet.html"&gt;What Stuxnet means&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;15 November 2010&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;a href="http://www.symantec.com/connect/blogs/stuxnet-breakthrough"&gt;The last bit of evidence is now
in&lt;/a&gt; - it appears
that the mysterious &lt;a href="http://en.wikipedia.org/wiki/Stuxnet"&gt;Stuxnet&lt;/a&gt; worm was
indeed aimed at Iran's nuclear capability. This means that we now know for sure
that Stuxnet was an event of great significance - the first example of a type
of sophisticated interstate warfare that we can expect to see a lot more of in
future. It neatly ties together a number of trends that we've been talking
about to clients at &lt;a href="http://www.nullcube.com"&gt;Nullcube&lt;/a&gt; for years:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The worm as a targeted delivery platform.&lt;/strong&gt; Stuxnet spread
indiscriminately, waiting until it infected its intended target before
springing into action. This is a marvelous delivery platform with excellent
deniability. When executed with flair - using multiple previously unknown
vulnerabilities, spreading through both physical media and networks - it can be
incredibly hard to defend against. Look for a Stuxnet-like worm that
exfiltrates data from targeted systems next.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Internet security is a national concern.&lt;/strong&gt; There's a tendency to view the
Internet as an internationally homogeneous network.  Stuxnet makes it (even
more) clear that the Internet is a domain for contest between nation states,
and that national differences in security readiness and technology populations
matter. Look for more direct government involvement in tracking and improving
the security of local networks. I suspect we'll also see the rise of national
perimeter defenses in some countries in the next few years. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Embedded systems are a target.&lt;/strong&gt; Embedded systems are everywhere, are often
ignored when security is considered, and are opaque, difficult to inspect, and
difficult to monitor. This is a malware nirvana. Whether they are directly or
indirectly connected to a network, embedded systems are a target. My
prediction: soon, we'll see a Stuxnet-like worm that spreads directly from
embedded system to embedded system, most likely affecting DSL modems. In fact,
we've already seen a clumsy precursor of this in
&lt;a href="http://en.wikipedia.org/wiki/Psyb0t"&gt;Psyb0t&lt;/a&gt;, discovered at the beginning of
2009.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There's a lot about this incident that we will most likely never know. We're
unlikely to find out who's behind Stuxnet (although Israel and the US seem to
be the only real possibilities). We're unlikely to find out if Stuxnet ever
repaid the immense technological capital its creators invested. But we do know
that it's a sign of things to come.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/security/stuxnet.html</guid><pubDate>Mon, 15 Nov 2010 15:54:00 GMT</pubDate></item><item><title>Tau: is it worth switching?</title><link>http://corte.si/posts/maths/tau/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/maths/tau/index.html"&gt;Tau: is it worth switching?&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;04 October 2010&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;The mailing list for my &lt;a href="http://dunedin.linux.net.nz/Main/HomePage"&gt;local LUG&lt;/a&gt;
recently had a small flurry of posts on &lt;a href="http://www.tauday.com/"&gt;The Tau
Manifesto&lt;/a&gt;, a proposal to replace the constant &#960;
with &#964;, equal to 2&#960;.  Pro- and anti- camps quickly emerged, and much beer will
likely be spilt over the issue at our next meeting. &lt;/p&gt;

&lt;p&gt;Disregarding for the moment any conceptual elegance or explanatory power that
Tau might have, I was interested to know if the move would really reduce
redundancy in common mathematical expressions. Let's say (rather arbitrarily)
that Tau simplifies a mathematical expression whenever &#960; is preceded by an even
constant - that means that 2&#960; becomes &#964;, and 4&#960; becomes 2&#964;, and so forth. I had
a vague intuition that the majority of occurrences of &#960; in the wild fell into
this category, which might indicate that &#964; is a more natural (or at least
parsimonious) constant to use.  Was my hunch right? This, I felt, was something
I could quantify. &lt;/p&gt;

&lt;h1&gt;Methodology&lt;/h1&gt;

&lt;p&gt;I wrote a small script to crawl all the articles linked to from the Wikipedia
&lt;a href="http://en.wikipedia.org/wiki/List_of_equations"&gt;List of Equations&lt;/a&gt; page. For
each page, I extracted all mathematical expressions, and checked the LaTeX
source of each for occurrences of the symbol &#960;. A little bit of light parsing
was then done to check if the symbol was directly preceded by an integer
constant.  Finally, I rendered the LaTeX source back to images to produce the
equation tables below.&lt;/p&gt;

&lt;p&gt;Of course, anyone of sound judgement will disregard what follows entirely, due
to the many obvious shortcomings of this procedure and its underlying
assumptions.  Readers of my blog, on the other hand, may find the results
interesting.&lt;/p&gt;
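
&lt;p&gt;The original script isn't reproduced here, but the "light parsing" step can be
sketched with a regular expression over the LaTeX source. Everything in this
block is an assumption made for illustration: the pattern, the
&lt;strong&gt;classify&lt;/strong&gt; and &lt;strong&gt;to_tau&lt;/strong&gt; helper names, and the choice to look only at an integer
sitting directly in front of \pi.&lt;/p&gt;

&lt;pre&gt;
import re

# Assumed pattern: an optional integer, optional whitespace, then \pi
# (the lookahead skips longer commands that merely start with \pi)
PI_FACTOR = re.compile(r"(\d+)?\s*\\pi(?![a-zA-Z])")

def classify(latex):
    # One label per occurrence of \pi: "even", "odd" or "none"
    results = []
    for match in PI_FACTOR.finditer(latex):
        digits = match.group(1)
        if digits is None:
            results.append("none")
        elif int(digits) % 2 == 0:
            results.append("even")
        else:
            results.append("odd")
    return results

def to_tau(latex):
    # Rewrite even multiples of \pi in terms of \tau, leave the rest alone
    def swap(match):
        digits = match.group(1)
        if digits is None or int(digits) % 2 != 0:
            return match.group(0)
        half = int(digits) // 2
        return ("" if half == 1 else str(half)) + r"\tau"
    return PI_FACTOR.sub(swap, latex)

assert classify(r"8 \pi G") == ["even"]
assert classify(r"e^{i\pi}") == ["none"]
assert to_tau(r"2\pi r") == r"\tau r"
assert to_tau(r"4\pi r^2") == r"2\tau r^2"
&lt;/pre&gt;

&lt;p&gt;A pattern this simple misses factors written as fractions or placed after the
symbol, which is one of the shortcomings mentioned above.&lt;/p&gt;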

&lt;h1&gt;Results&lt;/h1&gt;

&lt;p&gt;I found a total of 3173 equations, of which 133 contained the symbol &#960;. Of
these 133 equations, the distribution of constant factors preceding &#960; looked
like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://corte.si/posts/maths/tau/taugraph.png"/&gt;&lt;/p&gt;

&lt;p&gt;I call this a straight win for Tau - the vast majority of expressions using &#960;
(119 of 133) are preceded by even integer constants.&lt;/p&gt;

&lt;h1&gt;Equations&lt;/h1&gt;

&lt;p&gt;Below are all the expressions that included &#960;, plus the detected constant
factor. The headings point to the Wikipedia pages from which the equations were
taken.&lt;/p&gt;

&lt;p&gt;If nothing else, this list is a nice reminder of the mysterious ubiquity of a
constant involving the diameter and circumference of a circle in all aspects of
physics and higher math.&lt;/p&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Relativistic_wave_equations"&gt;Relativistic wave equations&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/1.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Sine-Gordon_equation"&gt;Sine-Gordon equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/2.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/3.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Fokker%E2%80%93Planck_equation"&gt;Fokker&#8211;Planck equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/4.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/5.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/6.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/7.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Euler%27s_equation"&gt;Euler&amp;#39;s equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/8.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Friedmann_equations"&gt;Friedmann equations&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/9.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/10.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/11.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/12.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/13.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/14.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/15.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/16.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/17.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/18.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Vlasov_equation"&gt;Vlasov equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/19.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/20.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/21.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Screened_Poisson_equation"&gt;Screened Poisson equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/22.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/23.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/24.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/25.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/26.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Quadratic_equation"&gt;Quadratic equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/27.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/28.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Stokes-Einstein_relation"&gt;Stokes-Einstein relation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;6&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/29.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;6&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/30.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;6&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/31.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Fisher_equation"&gt;Fisher equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/32.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/33.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_odd"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/34.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/35.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Einstein%27s_field_equation"&gt;Einstein&amp;#39;s field equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/36.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/37.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/38.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/39.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/40.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/41.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/42.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/43.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/44.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/45.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/46.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/47.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/48.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/49.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/50.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/51.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;8&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/52.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Sackur-Tetrode_equation"&gt;Sackur-Tetrode equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/53.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Laplace%27s_equation"&gt;Laplace&amp;#39;s equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/54.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/55.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/56.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/57.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/58.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/59.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/60.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Cauchy-Riemann_equations"&gt;Cauchy-Riemann equations&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/61.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Cubic_equation"&gt;Cubic equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/62.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/63.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/64.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/65.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/66.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Partial_differential_equation"&gt;Partial differential equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/67.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/68.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/69.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/70.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/71.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/72.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/73.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Lane-Emden_equation"&gt;Lane-Emden equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/74.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Heat_equation"&gt;Heat equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/75.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/76.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/77.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/78.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/79.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/80.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/81.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/82.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/83.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/84.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/85.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/86.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/87.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/88.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/89.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/90.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/91.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Wave_equation"&gt;Wave equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/92.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/93.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/94.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/95.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Primitive_equations"&gt;Primitive equations&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/96.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_none"&gt;None&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/97.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Quintic_equation"&gt;Quintic equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/98.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/99.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/100.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/101.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/102.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Black%E2%80%93Scholes_equation"&gt;Black&#8211;Scholes equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/103.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/104.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/105.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Fredholm_integral_equation"&gt;Fredholm integral equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/106.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Poisson%27s_equation"&gt;Poisson&amp;#39;s equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/107.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/108.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/109.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Helmholtz_Equation"&gt;Helmholtz Equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/110.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Van_der_Waals_equation"&gt;Van der Waals equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/111.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/112.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/113.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/114.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;2&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/115.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Lorentz_equation"&gt;Lorentz equation&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/116.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/117.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/118.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;a href="http://en.wikipedia.org/wiki/Maxwell%27s_equations"&gt;Maxwell&amp;#39;s equations&lt;/a&gt;&lt;/h2&gt;

&lt;table&gt;
    &lt;th&gt;constant&lt;/th&gt; &lt;th&gt;expression&lt;/th&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/119.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/120.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/121.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/122.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/123.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/124.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/125.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/126.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/127.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/128.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/129.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/130.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/131.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/132.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td style="text-align: center;" class="factor_even"&gt;4&lt;/td&gt;
        &lt;td&gt;
            &lt;img src="http://corte.si/posts/maths/tau/133.png"/&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/maths/tau/index.html</guid><pubDate>Mon, 04 Oct 2010 18:54:00 GMT</pubDate></item><item><title>iPad: the perfect computing device for children?</title><link>http://corte.si/posts/general/ipadbaby.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/general/ipadbaby.html"&gt;iPad: the perfect computing device for children?&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;05 September 2010&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;&lt;center&gt;
&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/aGnnv_LD080?fs=1&amp;amp;hl=en_US"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/aGnnv_LD080?fs=1&amp;amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;This is my 10-week-old son playing with
&lt;a href="http://itunes.apple.com/us/app/uzu/id376551723?mt=8"&gt;Uzu&lt;/a&gt; on the iPad like a
tiny, mad concert pianist. Remarkably, this was the first time we saw him
intentionally manipulating the external world with his hands. That was a week
ago - since then it's like a switch has been flicked somewhere in his fuzzy
little cranium, and now his previously aimless flailing has turned into a
methodical tactile exploration of everything around him.&lt;/p&gt;

&lt;p&gt;Watching this minor domestic miracle, it struck me that the iPad may well be
the perfect computing device for small children. The multitouch interface
couldn't be more simple and direct, and it works equally well for adult hands
and tiny stubby fingers. All the complexity of the desktop-and-windows metaphor
has been stripped away - the iPad does only one thing at a time, so when an app
is running it &lt;em&gt;becomes&lt;/em&gt; the device. The result is the clearest, most unmediated
computing experience so far. It's also interesting to consider that what makes
the iPad so great for children is also what makes it so great for adults. &lt;/p&gt;

&lt;p&gt;As I've &lt;a href="http://corte.si/posts/politics/apple-is-china.html"&gt;written before&lt;/a&gt;, I'm more than
slightly terrified by just how good Apple's devices are.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/general/ipadbaby.html</guid><pubDate>Sun, 05 Sep 2010 13:43:00 GMT</pubDate></item><item><title>Sea lions and lifestyle change</title><link>http://corte.si/posts/photos/sealions-and-lifestyle/index.html</link><description>&lt;div class="post"&gt;
    &lt;div class="posthead"&gt;
        &lt;h1&gt;&lt;a href="http://corte.si/posts/photos/sealions-and-lifestyle/index.html"&gt;Sea lions and lifestyle change&lt;/a&gt;&lt;/h1&gt;
        &lt;h2&gt;02 September 2010&lt;/h2&gt;
    &lt;/div&gt;
    &lt;div class="postbody"&gt;
        &lt;p&gt;About a year and a half ago, after dinner at a favourite local restaurant, and
having entered into that zone of philosophical clarity that sets in around the
dessert wine, my wife and I had the sudden simultaneous realisation that it was
time for a change. For most of our adult lives, we had lived in the suburb of
Newtown in Sydney - a hyper-urban jungle densely packed with coffee shops and
theatres, inhabited by a thronging mixture of students and bohemians with
counterculturally-correct hairdos. It was all beginning to seem a bit tired and
same-ish. We needed more time and more space. We needed to get back to the
essentials of life.&lt;/p&gt;

&lt;p&gt;Four weeks later our furniture was in a shipping container en route to Dunedin,
a small university town near the southern tip of New Zealand. We decided to
work together from home, keeping our schedules flexible to make time for walks,
reading, cooking, and (more recently) spending time with our son. It was a huge
risk - it was quite possible that the isolation would impose a punishing work
travel regime on me, or put a crimp in my wife's very specialised career in
linguistics.  It took enterprise, determination and no small amount of
possibly-foolish optimism, but it's all worked out. Our leap of faith has
turned out to be one of the best decisions we've ever made. Dunedin is a
breathtakingly beautiful place to live - I still can't quite believe that I can
get up from my desk, and within 20 minutes be on a deserted beach littered with
lazy sea lions basking in the winter sun.&lt;/p&gt;

&lt;p&gt;My advice to you is this: when your life begins to seem a bit stuffy and
constricted, when you begin to feel you've lost sight of something more
fundamental and get the urge to refactor - &lt;em&gt;just do it&lt;/em&gt;. There has never been a
better time in history for people who choose to march to a different drum. &lt;/p&gt;

&lt;p&gt;To prove what a lucky fellow I am, here are two photos from my walk yesterday
morning - click to view in a lightbox.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;a href="http://www.flickr.com/photos/cortesi/4947413235/lightbox/" title="New Zealand Sea Lion Bull"&gt;
        &lt;img src="http://corte.si/posts/photos/sealions-and-lifestyle/male.jpg" alt="New Zealand Sea Lion Bull" /&gt;
    &lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;It's not clear from the picture, but this is a massive New Zealand Sea Lion
bull - about 400 kilograms of apparently boneless muscle and blubber.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
    &lt;a href="http://www.flickr.com/photos/cortesi/4948006620/lightbox/" title="Sleeping female New Zealand Sea Lion"&gt;
        &lt;img src="http://corte.si/posts/photos/sealions-and-lifestyle/female.jpg" alt="Sleeping female New Zealand Sea Lion" /&gt;
    &lt;/a&gt;
&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;It's hard to believe that this sleek female is the same species as the dumpy,
snub-nosed chap above. New Zealand Sea Lions are the rarest species of sea lion
in the world - it's an immense privilege to be able to share a beach with them.&lt;/p&gt;

    &lt;/div&gt;
&lt;/div&gt;
</description><guid isPermaLink="true">http://corte.si/posts/photos/sealions-and-lifestyle/index.html</guid><pubDate>Thu, 02 Sep 2010 12:59:00 GMT</pubDate></item></channel></rss>
