
Thanks to Paul Cook for the initial link to this fascinating little JavaScript script, Social History. The script analyzes the CSS color of various links to determine whether or not the user has been to each site. If a link has the “visited” style, it marks the user as having been to that site. Now, the Social History implementation of this is rather innocuous: it’s a clever way of displaying only the sharing buttons of sites that the user actively participates in. Of course, there are far more interesting applications for advertising.
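To make the link-color trick concrete, here is a minimal sketch of the check. It is not the Social History source: the names `VISITED_COLOR`, `hasVisited` and `visitedSites` are made up for illustration, and it assumes the page's stylesheet gives visited links a known color. Keep in mind that current browsers deliberately report the unvisited style for `:visited` links, so this only illustrates how the technique worked at the time.

```javascript
// Hypothetical sketch of the link-color check; not the Social History source.
// Assumes the page's stylesheet contains: a:visited { color: rgb(255, 0, 0); }
const VISITED_COLOR = "rgb(255, 0, 0)";

// `doc` and `getStyle` are passed in so the function can be tried outside a
// browser. In a page you would pass `document` and
// `(el) => window.getComputedStyle(el)`.
function hasVisited(url, doc, getStyle) {
  const link = doc.createElement("a");
  link.href = url;
  doc.body.appendChild(link); // attach the link so its style can be computed
  const visited = getStyle(link).color === VISITED_COLOR;
  doc.body.removeChild(link); // clean up the temporary link
  return visited;
}

// Keep only the URLs whose temporary link picked up the "visited" color.
function visitedSites(urls, doc, getStyle) {
  return urls.filter((url) => hasVisited(url, doc, getStyle));
}
```

In a page, a call would look something like `visitedSites(siteList, document, (el) => window.getComputedStyle(el))`, where `siteList` is whatever list of URLs you want to test.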

One of the things that I always wanted to do but never got around to was to analyze a user’s browsing history to estimate age and gender. Of course, the idea is definitely not new. In fact, Xerox (of all companies??) has a patent on the whole process, and I’m certain plenty of networks already do something of the sort… but what the heck, let’s have some fun!

So what I did was modify the SocialHistory JS so that it polls the browser to find out which of the Quantcast top 10k sites you have visited. I then apply the ratio of male to female users for each site and, with some basic math, come up with a guesstimate of your gender. The math is really quite simple. I just take:

1 / (1 + r_1 * r_2 * … * r_n)

where r_i is the ratio of men to women for the i-th site you have visited, and the result is the estimated probability that you are female. For example, if you had been to two sites that each had a 2-to-1 ratio of men to women, the probability of you being female would be:
1 / (1 + 2 * 2) = 1/5 = 20%
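
For the curious, the calculation above can be sketched in a few lines of JavaScript. The function name `probabilityFemale` is my own for illustration, and the input is assumed to be a plain array of men-to-women ratios, one per visited site.

```javascript
// Hypothetical helper: estimate the probability of being female from the
// men-to-women ratios of the visited sites, using 1 / (1 + r_1 * ... * r_n).
function probabilityFemale(ratios) {
  // Multiply all the ratios together; an empty list gives a product of 1.
  const product = ratios.reduce((acc, r) => acc * r, 1);
  return 1 / (1 + product);
}

// Two sites, each with a 2-to-1 ratio of men to women.
console.log(probabilityFemale([2, 2])); // 0.2
```

With no visited sites the product is 1, so the estimate falls back to an even 50%.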

OK, so click the button to give it a shot (those of you using RSS readers probably need to click this link to open this post in a browser):

UPDATE: This takes a while on Internet Explorer, so please be patient (or try Firefox).



Kind of cute, right? Don’t worry: I am not storing your history in any way; this is purely for fun. I’d appreciate it if you pasted the resulting probabilities in the comments together with your actual gender, as I’m interested to know the accuracy of this simplistic approach. In case it isn’t obvious: please don’t do this for real.

UPDATE: I’ve disabled comments for this post, as there are simply too many!
