Abstract
Introduction:
TrackMeNot (TMN), a lightweight browser extension, was created to help protect web searchers against surveillance and data-profiling. It does so not by means of concealment or encryption (i.e. covering one's tracks), but instead, paradoxically, by the opposite: noise and obfuscation. In a cloud of false leads, actual searches are essentially hidden in plain view.
Our presentation covers key aspects of the TMN story: 1) why we created TMN, 2) an overview of its technical mechanisms, 3) a selection of critiques and rebuttals, and 4) ways the Values-at-Play methodology contributed to the definition of several of TMN's distinctive features.
Background and Rationale:
Public awareness of the ubiquitous practice of logging and profiling user search activities by search engine companies was initially raised in August 2005, when front-page articles in the mainstream press revealed that the United States Department of Justice (DOJ) had issued a subpoena to Google for one week's worth of search query records (absent identifying information) and a random list of one million URLs from its Web index. These records were requested to bolster the Government’s defense of the constitutionality of the Child Online Protection Act (COPA), then under challenge. When Google refused the DOJ’s initial request, the DOJ filed a motion in a Federal District Court to force compliance. Google argued that the request imposed a burden, would compromise trade secrets, would undermine customers' trust in Google, and would have a chilling effect on search activities. In March 2006, the Court granted a reduced version of the request for URLs, ordering Google to provide a random listing of 50,000 URLs, but denied the request for the list of search queries.
While from the perspective of user privacy this seemed a positive result, the case raised several disquieting points. First, it revealed the vulnerability of searches to systematic surveillance. Second, the court documents indicated that AOL, Yahoo!, and Microsoft were not issued subpoenas because they had complied with the government's request. Third, individual users had then, and still have, little idea of, or say in, policies concerning their search records. Although, in this instance, Google resisted disclosure, there is no guarantee of such a response in the future. Fourth, although we may believe that our searches are conducted anonymously, the reality is that the query logs kept by search companies can often be linked to our identities.
One year later, another highly publicized news story confirmed the vulnerability of search data to identification when it was revealed that the identities of some searchers had been extracted from anonymized search data that AOL had posted to the internet for research purposes. Other reports followed, detailing how the major search companies (Yahoo!, AOL, MSN, and Google) store and analyze individual search data to create user profiles.
We were disturbed by the idea that search inquiries are systematically monitored, stored, and scrutinized by corporations such as AOL, Yahoo!, and Google, and may even be available to third parties. Because the Web has become a crucial repository of information and association, and because our search and information retrieval behaviors profoundly reflect who we are, what we care about, and how we live our lives, it seemed wrong that search companies should be unilaterally setting policy. But what can be done?
Legal approaches are one obvious avenue for action: citizens can urge legislators to enact limits on access or request new laws that extend protection, e.g. through the Fourth Amendment, to records of search and retrieval. But success in this arena requires orchestrated efforts by many parties and, although not implausible, is likely to be fraught with difficulty. Appealing to search companies directly is another possibility, but seems even less hopeful, as their interests, at least on the surface, are in direct conflict with such limits. Both avenues hold, at best, only long-term promise.
TrackMeNot offers a third avenue of action, a near-term solution that places control directly in the hands of concerned individuals. It depends on neither the cooperation nor the permission of others, particularly those with potentially clashing interests. As such, it fits within the class of strategies, described by Gary T. Marx, whereby individuals resist surveillance by taking advantage of blind spots inherent in large-scale systems. (See Gary T. Marx, "A Tack in the Shoe: Neutralizing and Resisting the New Surveillance," Journal of Social Issues, Vol. 59, No. 2, 2003, 369-390.)
How TMN works:
TrackMeNot runs in Firefox as a low-priority background process that periodically issues randomized search queries to popular search engines: AOL, Yahoo!, Google, and MSN. It hides users' actual search trails in a cloud of 'ghost' queries, thereby increasing the difficulty of aggregating such data into accurate or identifying user profiles. Although early versions included a static word list, later versions used a dynamic query mechanism whereby each client begins with a common seed list that 'evolves' (uniquely) over time: the client parses the results of its searches for 'logical' future query terms, with which it replaces those already used. The seed list, furnished to users when they download TMN, is manually constructed (and updated periodically) from top-50 and top-100 search queries published by various search companies, for example Google’s Zeitgeist, LYCOS, and Yahoo!.
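The evolving query list described above can be illustrated with a minimal sketch. It is written in Python for readability and is not TMN's actual code (TMN is a Firefox extension). All names here (SEED_QUERIES, STOPWORDS, extract_candidates, GhostQueryList) are hypothetical, the seed terms are placeholders, and the rule for picking 'logical' follow-up terms (pairing adjacent non-trivial words from the result text) is an assumption made for illustration only. No network requests are made; the result text is passed in by the caller.

```python
import random
import re

# Placeholder seed list; the real list is compiled from published top queries.
SEED_QUERIES = ["weather forecast", "movie times", "cheap flights", "news headlines"]

# Tiny illustrative stopword set (an assumption, not TMN's filter).
STOPWORDS = {"about", "from", "have", "more", "that", "this", "with", "your"}


def extract_candidates(result_text):
    """Return two-word phrases built from adjacent non-trivial words in the text."""
    words = [
        w for w in re.findall(r"\b[a-z]{4,}\b", result_text.lower())
        if w not in STOPWORDS
    ]
    return [" ".join(words[i:i + 2]) for i in range(0, len(words) - 1, 2)]


class GhostQueryList:
    """A per-client query list that starts from a shared seed and drifts over time."""

    def __init__(self, seed, rng=None):
        self.queries = list(seed)          # copy, so each client evolves separately
        self.rng = rng or random.Random()

    def next_query(self):
        """Pick the next ghost query at random from the current list."""
        return self.rng.choice(self.queries)

    def evolve(self, used_query, result_text):
        """Replace a used query with a new term parsed from its search results.

        Returns the replacement, or None if the list was left unchanged.
        """
        if used_query not in self.queries:
            return None
        candidates = [
            c for c in extract_candidates(result_text) if c not in self.queries
        ]
        if not candidates:
            return None
        replacement = self.rng.choice(candidates)
        self.queries[self.queries.index(used_query)] = replacement
        return replacement


if __name__ == "__main__":
    client = GhostQueryList(SEED_QUERIES)
    query = client.next_query()
    # Stand-in for text returned by a search engine for `query`.
    fake_results = "Local gardening clubs offer spring planting workshops"
    client.evolve(query, fake_results)
    print(client.queries)
```

Because each client draws its replacements at random from its own results, two clients that start from the same seed list would be expected to diverge after a few rounds, which is the property the dynamic mechanism relies on.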
Launched on August 18, 2006, TMN was soon after approved for inclusion on Mozilla’s Add-ons website. While initial attention came primarily from the technical community, word spread to popular media outlets (such as Associated Press and Yahoo! News), after which responses came from a more general audience. Its approach to protecting privacy in search (obfuscating real search queries with fake ones) met with a wide range of reactions. While some applauded the effort, others argued it would not work, and still others that it worked in unacceptably disruptive ways.
In responding to these reactions, we were forced to grapple with several important questions. First, it became clear that different user groups had quite different and sometimes contradictory ideas about how TMN should work and what it should accomplish. To give just one example: some thought TMN’s search terms should blend innocuously with their queries, as background noise, while others thought they should include jarring search terms, including ones that were politically and sexually charged. Some groups believed that the most important purpose of TMN was to protect users’ identities, others that it was to guard against profiling, and still others that it was to protect against social stigma. Some held that perfect secrecy should be the ultimate (though unreachable) goal, while others thought that simply raising the cost of identification and profiling was sufficient. Other responses challenged us to think about proper uses (and misuses) of network resources and about who has a right to determine users’ online experiences. Although these concerns may appear to be technical, their resolution usually involved a decision about which values should be prioritized.
TrackMeNot and Values-at-Play:
The Values-at-Play (VAP) approach to systematically considering moral and political values in technical design proved useful at several critical junctures in the development of TMN. Acknowledging privacy (in web search) as a value embedded in the definition of the project, the approach also highlighted other places where values were implicated. For example, we correlated features such as robust user control, simplicity, and configurability with the value of autonomy, and determinedly maintained a design in which users would not have to rely on third-party cooperation, thus promoting independence. We minimized the draw on client-side resources so that TMN would not interfere with other functionality, yet also attempted to maintain transparency, allowing users easy access to query lists, logs of TMN operations, and real-time monitoring of queries as they were sent. Although we realized that TMN would draw on both bandwidth and server-side resources, we believed the impact to be minimal and, when traded off against the potential gain in user privacy, a justified cost. Finally, VAP was useful in articulating the distinctive character of TMN’s approach to protecting privacy, namely as a social rather than an individual value.
