It wasn't your typical data breach. In an effort to help researchers interested in search queries, AOL last month shared about 20 million search words and phrases used by 658,000 of its subscribers. But the move backfired as AOL got a harsh lesson in just how revealing search terms can be.
The AOL cache, supposedly stripped of personally identifiable information and posted online, laid bare eye-opening details of nameless subscribers. Social Security numbers, names, dates of birth, cell phone numbers, hometown stores and hospitals, graphic sex acts--all available for public viewing on the Web.
AOL apologized for the error and withdrew the site, but the damage was done. In a startling example of connecting the dots, The New York Times tracked down 62-year-old Thelma Arnold, a resident of Lilburn, Ga., based on her AOL searches. And although AOL wiped all of the data from its public-facing sites, fleet-fingered third parties had already copied the AOL database and made it searchable elsewhere on the Web.
The gaffe demonstrates how much personal information can be gleaned from searches and how quickly it can spread; the resulting uproar could catalyze data protection efforts by some lawmakers. "We must stop companies from unnecessarily storing the building blocks of American citizens' private lives," Rep. Edward J. Markey, D-Mass., huffed last week in a statement. Earlier this year, Markey introduced a bill to stop companies from warehousing certain types of search data. A spokesman says Markey hopes for a renewed push to pass the bill when legislators return from summer break.
Markey had better be prepared for a hard slog. His bill has languished in subcommittee since February, and the federal government has its own appetite for stored data. It uses gobs of data, including personal data, for Homeland Security data mining and other efforts. The Bush administration earlier this year demanded keywords, URLs, and other information from AOL, Google, MSN, Yahoo, and 30 other companies in a quest to prove the necessity of the 1998 Child Online Protection Act. And the Justice Department is pushing for laws that would force Internet service providers and some other companies to retain Web-usage data for specified periods to ensure that it's available if needed for law enforcement.