Monday, January 29, 2007

(Partial) Death of the Google Bomb

So Google killed some of the best Google bombs. A search for "miserable failure" no longer links to Dubya's official web page. A search for "liar" no longer gets you Tony Blair's web page, and most of the other Google bombs have stopped working too. I guess this is good news for litigious bastards.
Still, it's a relief to see that you can still search for Internet Exploder and get the right page. Hmm. Now that I've pointed this out, will they tweak their algorithm again?

You are here

Ever wondered where you were on the net? This page has a red dot showing your location in Internet address space. The map, from supergeek cartoon xkcd, uses a rather neat integer-to-space mapping, the Peano curve. I used this mapping to map addresses to positions for a video of memory activity for the talk associated with my ISMM paper. The paper was about optimistic stack allocation for Java-like languages, and here is the video of memory activity. The pdf of the paper is subject to the following draconian ACM Copyright Policy: "© ACM, 2006. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2006 international Symposium on Memory Management (Ottawa, Ontario, Canada, June 10 - 11, 2006). ISMM '06. ACM Press, New York, NY, 162-173. DOI="

The movie perhaps needs a little explanation. It shows the execution of the 213.javac benchmark from the SpecJvm98 benchmark suite with the 10% data set. The rightmost column is the normal parameter/local variable Java stack (it's a single threaded benchmark). The next column is the allocation stack, where the system is stack allocating new objects. The next column is the semispace based first conventional GC generation. The large area is the main heap, which is so large that it is never garbage collected. Grey means unallocated, green means allocated and white means recently accessed. The idea of using the Peano curve to map addresses to screen coordinates is to make it easier to see locality of reference at both the cache line and page level. The stack allocation heuristic being used is "caller" (see the paper for details).
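For the curious, here is a rough sketch of how an address can be turned into a screen position with a space-filling curve. It is not the code used for the video: it uses the Hilbert curve, a close relative of the Peano curve with the same locality property, and the names hilbert_d2xy, address_to_pixel, base and bytes_per_pixel are made up for illustration.

```python
def hilbert_d2xy(order, d):
    """Map index d (0 <= d < 4**order) to (x, y) on a 2**order by 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so consecutive indices stay adjacent.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def address_to_pixel(address, base, bytes_per_pixel, order):
    """Hypothetical helper: one pixel covers bytes_per_pixel bytes starting at base."""
    index = (address - base) // bytes_per_pixel
    return hilbert_d2xy(order, index)
```

Neighbouring addresses land on neighbouring pixels, and any aligned power-of-four run of indices fills a square block, which is why cache lines and pages show up as compact blobs rather than thin stripes.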

At any rate I find it quite hypnotic to watch. The main visible activity is the semi-space collector ploughing through memory, and switching halves at regular intervals. In fact a large amount of allocation and deallocation is also taking place on the allocation stack (this is the entire point) but it's not as visible since it's taking place in a very small amount of memory (which is also the entire point). The scale is such that the semispaces are 64 kbytes each, if I recall correctly.

Sunday, January 28, 2007

My password is fish

If you're an active user of the web, you will be constantly asked to create accounts on web sites. For those sites where you really don't care about establishing an online persona or establishing your own true identity there's always BugMeNot which will feed the website some random id and password.

However, there are a lot of sites where you would be sorry to have someone steal your identity. The obvious ones like online banking sites have their own more-or-less secure login procedures, but then there is the middle tier. eBay, Moneybookers and Amazon, for example, are all sites where there would be real privacy and money problems if someone stole your login. Then there are the networking sites. After spending hours a day and years of your life establishing a witty, informed, devastatingly perceptive online persona on some social networking site it would be even sadder to see it hijacked.

So you need secure passwords for these sites. But no one can remember dozens of different passwords, so you have a few choices. Reusing passwords is a popular one, but has obvious huge security problems. Writing them down is a little better, but not much, and it has accessibility problems unless you keep the passwords with you at all times. Letting your browser manage your passwords has the same problem of not having them at hand when you need them, especially considering that hard disks don't live forever.

A newish way to handle those pesky passwords has been implemented by PasswordMaker and in a different version by PwdHash. Here the idea is that you have one master password. The name of the site and the master password are used together to generate a different password for each site. The method used to generate the per-site password is a secure hash function. These functions have the desirable property that if you are given the output of the hash function you cannot feasibly work out what the input to the function was. (However, if you guess the input, you can use the output to check whether your guess was correct.)
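To make the idea concrete, here is a minimal sketch of deriving a per-site password from a master password and a site name. It is not the algorithm either tool actually uses: the choice of HMAC-SHA256, the alphabet, the default length and the function name site_password are all assumptions for illustration.

```python
import hashlib
import hmac
import string

# Characters allowed in the generated password (an arbitrary choice for this sketch).
ALPHABET = string.ascii_letters + string.digits


def site_password(master_password, site_name, length=12):
    """Derive a site-specific password from the master password and the site name."""
    digest = hmac.new(
        master_password.encode("utf-8"),
        site_name.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # Map the first `length` digest bytes onto the alphabet (slightly biased, fine for a sketch).
    return "".join(ALPHABET[byte % len(ALPHABET)] for byte in digest[:length])
```

Calling site_password("fish", "example.com") always gives the same result, while a different site name gives an unrelated-looking one, so a leaked site password does not directly reveal the master password or the passwords for other sites.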

PasswordMaker and PwdHash are available as convenient browser plugins, which means they are hardly any more trouble to use than just typing in "fish" on every site. If you are on a machine where you can't install a browser extension they both have Javascript versions that you can use just by remembering the web page name. Since they are Javascript-based your master password doesn't leave the machine you're sitting at. I've started using PasswordMaker, and it works fine. I'm told PwdHash is a good product too, and they have a very detailed paper analyzing security risks. (Thanks to Christian.)

The producers of both products seem to have given too little attention to the risk of a dictionary attack on the master password, performed by an evil webmaster (or some evil person who has taken control of a website). Because of the way these schemes work (and this is their advantage in many ways) you only need the master password in order to generate all the site passwords. This means you, the user, have just one thing to remember. However, it also means that if the evil webmaster guesses your master password they can verify the guess against your site password and then get into all your other accounts.

Now if you've resisted the temptation to use a really dumb password, then you wouldn't expect Mr. E. Webmaster to be able to guess it. You would expect him to be too busy optimizing his site for Internet Exploder or installing adware on your computer. However, both products, in their default configuration, have the problem that two users with the same master password will have the same site password on a given site. For PasswordMaker, the default is not to use the user name when calculating the site password, and for PwdHash they don't have the option of using the user name at all. Instead they allow you to have two passwords, but it's not the default.

The problem with identical master passwords leading to identical site passwords is that Mr. Evil Webmaster can generate a huge list of master passwords and their associated site passwords ahead of time. If we are in fact dealing with Mr. Evil Website Hijacker, he can use all the other computers he has hijacked to calculate this huge list in parallel. Once he has the list, he just needs to wait until someone uses one of the site passwords in the list, and he has their master password. Needless to say, this is bad news. There are password 'cracking' services right now that use this principle or variations on it to crack fairly difficult passwords.
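The precomputation trick is simple enough to sketch in a few lines. Again, this assumes a made-up derivation (HMAC-SHA256 over the site name) rather than either tool's real algorithm, and build_table and recover_master are hypothetical names.

```python
import hashlib
import hmac
import string

ALPHABET = string.ascii_letters + string.digits


def site_password(master_password, site_name, length=12):
    """Unsalted derivation (an assumption): depends only on master password and site."""
    digest = hmac.new(
        master_password.encode("utf-8"),
        site_name.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return "".join(ALPHABET[byte % len(ALPHABET)] for byte in digest[:length])


def build_table(candidate_masters, site_name):
    """Done once, ahead of time: map each resulting site password back to its master."""
    return {site_password(m, site_name): m for m in candidate_masters}


def recover_master(table, observed_site_password):
    """Done per victim: a single dictionary lookup, no hashing needed."""
    return table.get(observed_site_password)
```

The expensive part (build_table) is paid once per site and works against every user of that site at the same time, which is exactly what makes the attack attractive.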

So I like these tools, but I would urge you to configure PasswordMaker to use your name as a salt and to configure PwdHash with two passwords, where one of them is unique to you. Using your full name or birthday as the second password is not a bad choice here, as long as the real password is secure, because in this case we are using the second password as a salt, not as a password per se. In the case of PasswordMaker, there are hundreds of other options you can configure. Remember that every option you configure you will have to remember if you use PasswordMaker on another computer. My advice is to leave them almost all alone.
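Here is a rough sketch of why the salt helps, under the same assumptions as before (a made-up HMAC-SHA256 derivation, not what PasswordMaker or PwdHash really compute; salted_site_password is a hypothetical name). The salt is mixed into the hashed message, so two users with the same master password no longer end up with the same site password.

```python
import hashlib
import hmac
import string

ALPHABET = string.ascii_letters + string.digits


def salted_site_password(master_password, site_name, salt, length=12):
    """Derive a site password that also depends on a per-user salt (e.g. a user name)."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from producing the same message.
    message = (salt + "\0" + site_name).encode("utf-8")
    digest = hmac.new(master_password.encode("utf-8"), message, hashlib.sha256).digest()
    return "".join(ALPHABET[byte % len(ALPHABET)] for byte in digest[:length])
```

With this, salted_site_password("fish", "example.com", "alice") and salted_site_password("fish", "example.com", "bob") differ, so a precomputed list would have to be rebuilt for every user. The salt does not need to be secret. It only has to differ from person to person.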

I'll still be using normal passwords for high security applications like my stock broker, but I'm going to cut way down on the use of my "1234" password for other sites. Or maybe the passwords I wrote down on a bit of paper. I'm not telling you which it is. That wouldn't be secure, would it?