Why is so much of the recent attention to privacy issues focused on Internet privacy when consumers have had privacy concerns since long before they started doing business online? Certainly, the current hype surrounding the Internet in general has contributed to the buzz. These days, anything that happens online seems much more exciting than things going on in the "real" world. But in the case of online privacy, I think there is some substance behind the hype.
Internet privacy is now a hot-button issue; the flurry of media reports about HTTP cookies has raised public concerns that consumers' online activities are being monitored [1]. In mid-May, Vice President Al Gore announced a White House initiative aimed at helping to improve online privacy protections. And in June, the Federal Trade Commission reported the results of its March privacy "sweep," in which the agency visited more than 1,400 commercial Web sites in search of clearly displayed privacy policies. The FTC reported that while 85% of the sites it had visited collect personal information from consumers, only 14% had posted any privacy-related notices, and only 2% had posted comprehensive privacy policies [2].
Meanwhile, the European Union's Data Protection Directive takes effect Oct. 25. The directive will prohibit EU member countries from sending personal data to other countries that lack adequate privacy protection. Online transactions, which often cross national borders, may be significantly affected.
The Internet and computerized databases make automated collection and processing of information particularly easy and convenient. In fact, for the typical Web site operator, it's easier to collect information about Web site visitors than to figure out how to configure a Web server not to collect that information. As a result, there are now zillions of databases silently collecting mostly innocuous "click-stream" data from everyone who surfs on by. But when these databases are merged, and especially when click-stream data is combined with personally identifiable data that users type in when filling out online forms, Web surfers may be profiled in ways that raise serious privacy concerns. Imagine, for example, if employers started inferring health information about their employees (or prospective employees) based on information about visitors to medical- or health-related Web sites.
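To make the concern concrete, here is a minimal sketch (in Python, with entirely hypothetical field names and data) of how click-stream records could be merged with form data through a shared identifier such as a cookie ID:

```python
# Hypothetical sketch: merging click-stream logs with form data that share
# an identifier (here, a cookie ID). All names and fields are invented.
clickstream = [
    {"cookie_id": "abc123", "url": "http://health.example.com/asthma"},
    {"cookie_id": "abc123", "url": "http://health.example.com/allergies"},
]
form_data = [
    {"cookie_id": "abc123", "name": "Pat Smith", "email": "pat@example.com"},
]

profiles = {}
for record in form_data:                       # start from the identified users
    profiles[record["cookie_id"]] = {"name": record["name"], "pages": []}

for hit in clickstream:                        # attach their browsing history
    if hit["cookie_id"] in profiles:
        profiles[hit["cookie_id"]]["pages"].append(hit["url"])

# The merged profile now links a named individual to health-related pages.
print(profiles)
```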
As better tools are developed for processing huge quantities of data, and as better data-mining applications come to the market, chances are that new businesses will be built around data mining and people will start finding uses for all the data they have been stashing away. And even if the organization that owns the data doesn't make use of it, the data may be subpoenaed in lawsuits or accessed in unauthorized ways by employees or hackers.
Although much of the information being collected online appears to be going unused, some of it is being used actively, often to the benefit of the individual to whom it pertains. Individuals frequently reveal personal information to gain benefits such as home delivery of products, customized services, and the ability to buy items on credit. I enjoy the convenience of ordering books online; with just a few clicks of the mouse they can be billed to my credit card and delivered to my door. But I often wonder whether online stores are using my information for purposes other than processing my order. Which leads me to what I think is the root of the privacy problem: Consumers have little knowledge about or control over the use of their personal information. This problem is exacerbated on the Internet due to the ease with which information can be collected, processed and combined with other information.
Although commercial Web sites are evolving toward more privacy-friendly practices, many still collect information without providing any explanation about what they will do with it. When people find out that their data might be used in ways they didn't expect, or that information they did not know about is being silently collected, they get worried. There is nothing inherently evil about HTTP cookies, although they can potentially be used in undesirable ways. But most people don't understand what cookies are used for, and most Web sites that use them fail to provide any explanatory information.
In a recent article about online privacy, Esther Dyson summed up the problem: "The biggest challenge right now is ignorance: People aren't worried enough, and are careless. Other people are worried too much, and are paranoid. No one knows what is known and what isn't. It's the one-way mirror effect that makes people so uneasy" [3].
In mid-May, the World Wide Web Consortium (W3C) released a public working draft of the Platform for Privacy Preferences (P3P) [4]. P3P is designed to allow Web sites to express their privacy practices -- including which data they collect from users, what they use the data for, and whether that data will be shared with other parties -- in a machine-readable format that can be automatically parsed by Web browsers and compared with privacy preferences input by the user. If there is a match between Web site practices and user preferences, a P3P agreement is reached. Users should be able to configure their browsers to reach agreement with, and proceed seamlessly through, Web sites that have certain types of practices; users should also be able to receive browser prompts when encountering Web sites that engage in potentially objectionable procedures. For example, a user might request to be prompted when a Web site proposes to collect information that will be used for marketing purposes. Thus, users need not read the privacy policies at every site they visit to be assured that their information is going to be used only in ways they consider acceptable.
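To make this concrete, here is a rough sketch of the kind of comparison a P3P-enabled browser might perform. The representation of practices and preferences below is invented purely for illustration -- it is not the actual P3P vocabulary or syntax.

```python
# Illustrative sketch of preference matching; not real P3P syntax.
site_proposal = {
    "purposes": {"site_administration", "marketing"},
    "shared_with_third_parties": False,
}

user_preferences = {
    "refuse_if_shared": True,               # never agree if data is shared
    "prompt_on_purposes": {"marketing"},    # ask before agreeing to these uses
}

def evaluate(proposal, prefs):
    if prefs["refuse_if_shared"] and proposal["shared_with_third_parties"]:
        return "refuse"
    if proposal["purposes"] & prefs["prompt_on_purposes"]:
        return "prompt"     # e.g., a proposed marketing use triggers a prompt
    return "agree"          # seamless agreement, no user interaction needed

print(evaluate(site_proposal, user_preferences))    # -> "prompt"
```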
P3P also includes a user data repository where users can store information that they don't mind sharing with certain Web sites. If they reach an agreement that calls for the collection of certain pieces of information, they can have that information transferred automatically if it is stored in the repository. In addition, the repository can be used to store site-specific IDs that allow for pseudonymous interactions with Web sites that cannot be tracked across multiple sites. Web sites may request to store data in the user's repository as well. These requests are also governed by the P3P agreement reached with the user.
P3P can also facilitate choice by allowing Web sites to offer visitors a selection of privacy policies. For example, a Web site with information on movies might offer visitors who do not wish to provide any personal information a generic index of movie reviews, and offer movie timetables for nearby theaters to visitors who are willing to provide their home address. Visitors could choose whichever P3P agreement best suits their purposes.
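Continuing in the same illustrative (and invented) representation, a browser might choose among a site's proposals and fill in the requested data from the user's repository roughly like this:

```python
# Illustrative sketch only: choosing among proposals and filling requested
# data from the user's local repository. Not real P3P syntax.
repository = {"postal_code": "07932"}     # data the user doesn't mind sharing

proposals = [                             # listed in the user's preferred order
    {"name": "movie timetables", "requests": {"postal_code"}},
    {"name": "generic reviews",  "requests": set()},
]

chosen, sent = None, {}
for proposal in proposals:
    if proposal["requests"] <= repository.keys():   # repository can satisfy it
        chosen = proposal["name"]
        sent = {field: repository[field] for field in proposal["requests"]}
        break

print(chosen, sent)    # -> movie timetables {'postal_code': '07932'}
```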
As one of P3P's designers, I am optimistic that this tool will go a long way toward addressing the privacy concerns of Internet users. However, I realize that it does not solve the whole problem by itself. While P3P offers the opportunity for consumers to find Web sites whose practices are acceptable to them, it does not guarantee that mutually acceptable terms will always be found. Depending on market conditions, individuals may or may not find privacy-friendly choices available.
A number of services have been launched to provide consumers with some assurance up front that a Web site's policies accurately reflect their practices. These services generally require Web sites to pay a fee, enter into certain contractual agreements, and possibly undergo an audit in exchange for the ability to display some sort of seal of approval. So far, these seals have manifested themselves as visual labels displayed on Web sites. However, these seals could also come in the form of PICS labels or digitally signed certificates that could be transmitted as part of a P3P proposal [5]. Users could instruct their browsers to look for such certificates.
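For illustration, a browser's check of such a certificate might look roughly like the sketch below, which signs the policy text with a generic RSA key using the third-party Python "cryptography" package. The key handling and message format are invented; no actual seal program works exactly this way.

```python
# Hedged sketch: verifying a seal issuer's signature over a privacy policy.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

policy_text = b"We collect click-stream data for site administration only."

# In practice the issuer's public key would ship with the browser or the seal
# program; here we simply generate a throwaway key pair for the demonstration.
issuer_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
seal = issuer_key.sign(policy_text, padding.PKCS1v15(), hashes.SHA256())

def seal_is_valid(issuer_public_key, policy, signature):
    try:
        issuer_public_key.verify(signature, policy,
                                 padding.PKCS1v15(), hashes.SHA256())
        return True
    except InvalidSignature:
        return False

print(seal_is_valid(issuer_key.public_key(), policy_text, seal))    # True
```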
The first such privacy seal service was launched in 1996 by the Electronic Frontier Foundation and CommerceNet. Called TRUSTe [6], this service has undergone a number of major changes since its initial launch, in response to comments from TRUSTe members and the public. TRUSTe currently licenses its "trustmark" to companies that sign an agreement and pay a fee on a sliding scale according to their annual revenue. The agreement requires that sites follow the TRUSTe guidelines and submit to periodic reviews. TRUSTe has more than 100 licensees, including IBM, Lands' End, and The New York Times.
The Better Business Bureau (BBB) sponsors the BBBOnline Seal program [7], which helps consumers recognize companies that have a satisfactory complaint-handling record with the BBB, participate in the BBB's advertising self-regulation program, and agree to binding arbitration in the case of disputes with consumers. This program could easily be extended into the privacy area as well.
Other industry groups have developed additional privacy seals and other self-regulatory programs, and I expect several new programs to be announced over the coming months.
Perhaps the best-known Web anonymity tool is the Anonymizer (http://www.anonymizer.com), a service that submits requests to Web sites on behalf of its users. Because the request is submitted by the Anonymizer rather than the user, information about that person is not revealed to the Web site. The Anonymizer is easy to use and provides both free and paid-subscription services. Users must trust the Anonymizer to protect their privacy, however, and this tool does not prevent users' Internet service providers from logging their Web activities.
Other anonymity tools do not require users to trust a third party to maintain anonymity. Most of these tools, however, are still research prototypes. Crowds [8] is an anonymity system developed by my colleagues at AT&T Labs-Research. With the slogan "Anonymity loves company," Crowds is based on the idea that people can be anonymous when they blend into a crowd. Large numbers of geographically distributed Web surfers can join a group called a crowd and forward all of their HTTP requests through the crowd. Each request is randomly forwarded to another member of the crowd, who can either submit it directly to the end server or forward it to another randomly selected member of the crowd. Neither the end Web server nor any of the crowd members can determine where the request originated. Users participate in a crowd by running a proxy server on their local computers and configuring their browsers to use the local computer as a proxy.
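The forwarding rule itself can be sketched in a few lines. The simulation below ignores the encryption, membership management, and reply routing that the real Crowds system provides, and the forwarding probability is an arbitrary illustrative value.

```python
# Simplified simulation of Crowds-style request forwarding.
import random

FORWARD_PROBABILITY = 0.75                    # illustrative value only
members = ["alice", "bob", "carol", "dave"]   # the crowd

def route(initiator):
    path = [initiator]
    current = random.choice(members)          # initiator always forwards once
    path.append(current)
    while random.random() < FORWARD_PROBABILITY:
        current = random.choice(members)      # pass to another random member
        path.append(current)
    path.append("end server")                 # the last member submits it
    return path

print(route("alice"))   # e.g. ['alice', 'carol', 'bob', 'end server']
```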
Another anonymity tool developed at Lucent's Bell Labs is useful for people who want persistent but anonymous relationships with Web sites. The Lucent Personalized Web Assistant (LPWA) [9] inserts pseudonyms into Web forms that request a user's name. LPWA is designed to consistently use the same pseudonyms every time a particular user returns to the same site, but use a different pseudonym at each Web site. This tool works in conjunction with an anonymizing proxy server; it could also be used with a system like Crowds. LPWA users allow Web sites to accumulate a profile of their preferences over time that may be useful for tailoring content and advertisements to their interests. However, LPWA prevents profile information from being linked to a user's name or combined with information revealed to other sites.
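One simple way to get this consistent-but-unlinkable behavior is to derive each pseudonym from a long-term user secret and the site's domain name. The sketch below does this with an HMAC; it is offered only as an illustration of the idea, not as LPWA's actual construction.

```python
# Illustrative per-site pseudonyms: stable across visits to the same site,
# different (and unlinkable without the secret) across different sites.
import hashlib
import hmac

USER_SECRET = b"a long-term secret held by the user or the proxy"

def pseudonym(site_domain):
    digest = hmac.new(USER_SECRET, site_domain.encode(), hashlib.sha1)
    return "user-" + digest.hexdigest()[:12]

print(pseudonym("books.example.com"))   # same alias every visit to this site
print(pseudonym("news.example.com"))    # a different alias at a different site
```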
P3P also contains a feature that allows for pseudonymous relationships and can be used in place of cookies. Users may choose to send the same unique identifier each time they return to a Web site with which they have reached an agreement. A P3P-compliant browser keeps track of the identifiers and sends a different one to each site. The goal is that once P3P implementations are readily available, Web sites will use this feature when they wish to develop persistent relationships with consumers but do not need personally identifiable information to provide their services. A P3P-compliant browser should also give users complete control over when to take advantage of this feature.
I would also like to see more electronic-commerce systems that minimize the amount of personally identifiable information that must be transferred to complete a transaction. This can be done by designing systems that transfer only the information that each party absolutely needs to know. For example, in an electronic payment transaction the bank need only know that the individual is authorized to withdraw money from a particular account, the identification number of that account, and the sum of money to be withdrawn; the vendor need only know that it has received a valid payment. The bank need not know what the individual is doing with the withdrawn money, and the vendor need not know the individual's name or bank account number (in contrast, these pieces of information must be transferred, for example, when individuals purchase goods with checks). Thus, only the purchaser has access to the list of purchases that he or she has made.
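The principle can be illustrated with a toy example showing which fields each party sees. A real payment system would enforce this cryptographically; the sketch below merely partitions the data by hand, and all of the field names are invented.

```python
# Toy illustration of minimal disclosure: each party sees only what it needs.
transaction = {
    "account_id": "12345-678",
    "account_holder": "Pat Smith",
    "amount": 19.95,
    "items": ["one paperback book"],
    "vendor": "books.example.com",
}

bank_view = {k: transaction[k] for k in ("account_id", "amount")}
vendor_view = {"amount": transaction["amount"], "payment_valid": True}
purchaser_view = transaction        # only the purchaser sees the whole record

print(bank_view)     # knows the account and the sum, but not what was bought
print(vendor_view)   # knows it was paid, but not who paid or from which account
```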
Electronic cash systems have the potential to offer the privacy of cash payments with the convenience of electronic funds transfers. However, some of these systems have many of the same vulnerabilities as traditional cash, including risk of theft or loss.
Finally, it is important to recognize that the approaches presented here address only part of the problem. Even the most privacy-friendly information practices may be thwarted if data collectors do not protect their communications and databases. Security precautions should be taken to prevent communications from being intercepted and databases from being compromised.
As more privacy safeguards are deployed, consumer confidence in the Internet should increase -- and the hype surrounding Internet privacy can be directed elsewhere.
Lorrie Faith Cranor (lorrie@acm.org) is a researcher in the Secure Systems Research Department at AT&T Labs-Research in Florham Park, N.J. She also co-chairs the World Wide Web Consortium's P3P Interest Group, and maintains a home page at http://www.research.att.com/~lorrie/. (Note: This column does not reflect official AT&T policy.)