The valuable data is still available on other sites, including this searchable database and this mirror (439 MB, tgz file). You can look up users like 1983280 and follow all of their searches between March 1st and May 31st of this year. The data set includes these fields: UserID, Query, Query Time, Clicked Rank, Destination Domain. So what can you find out about our user? She's a teenager interested in politics, she's from Washington DC, she likes photography and American Idol, one of her parents died and she's about to get married. From other users' queries, you can find the name, the address, the workplace and other details that make it possible to identify the person. The New York Times identified user no. 4417749 as Thelma Arnold, a 62-year-old widow who lives in Lilburn. AOL didn't realize that, in the name of science, it has committed the biggest privacy breach a search engine has ever made. Google didn't let the Government obtain a similar data set, while AOL, which gets its search results from Google, released its users' queries to the public.
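To make the structure of the data concrete, here is a minimal sketch of how records with those five fields could be grouped by user. It assumes a tab-separated layout with a header row; the column names, the sample rows and the `queries_by_user` helper are all invented for illustration and are not taken from the released files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in an ASSUMED tab-separated layout.
# The rows are invented for illustration; they are not from the real data set.
SAMPLE = (
    "UserID\tQuery\tQueryTime\tClickedRank\tDestinationDomain\n"
    "42\tcheap flights\t2006-03-01 10:15:00\t1\thttp://www.example.com\n"
    "42\tweather tomorrow\t2006-03-02 08:00:00\t\t\n"
    "7\tpasta recipes\t2006-03-01 12:30:00\t3\thttp://www.example.org\n"
)


def queries_by_user(lines):
    """Group (time, query) pairs by user ID, in file order."""
    # QUOTE_NONE keeps quote characters typed in a query as literal text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["UserID"]].append((row["QueryTime"], row["Query"]))
    return grouped


grouped = queries_by_user(io.StringIO(SAMPLE))
for when, query in grouped["42"]:
    print(when, query)
```

Even in this toy form, the point is visible: once every query carries the same user ID, a person's whole search history falls out of a single grouping step.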
Despite all the privacy considerations, the database is fascinating and it could be the subject of a book about human nature.
What happens when your life is exposed to the public through small fragments of text? You reveal your intentions, your problems and fears, your friendships and your hidden desires. Your queries reveal more than any detective or psychiatrist could find out about your life.