Internet Anonymizing Techniques
by David M. Martin
David Martin is studying network security at Boston University. He is currently finishing up his Ph.D.
I didn't think much about network anonymity until I attended a panel discussion on Web privacy and anonymity at the February 1997 Internet Society Symposium on Network and Distributed System Security. The consensus on privacy was, well, that there just isn't much and that commercial interests are unlikely to champion the cause. The panelists exhorted the audience to take notice, and I listened.

Apparently, many people have heard the call! Our special issue editor mentioned a similar panel at a USENIX conference that I missed. Several different projects have come to light in the past couple of years, and some of them are mature enough to see use on your own networks whether you're ready for that or not. In this article, I'll discuss the most promising and visible anonymizing technologies and try to give you a sense of their strengths and weaknesses.

Anonymity

The first thing to know is that anonymity isn't the same as confidentiality. Network stream secrecy can be pretty much guaranteed with appropriate cryptography, even though technical issues such as key distribution infrastructures and political ones such as exportability and key escrow always seem to overshadow the underlying techniques. If this article were about secrecy, I'd continue with praises of PGP, Kerberos, SSL, SSH, IPSEC, and the like.

Network anonymity instead tries to protect the identities of communication endpoints. Let's say you're working in a nuclear plant and come to believe that your boss is illegally siphoning off the safety budget. You decide to report it to the US Federal Bureau of Investigation, so you type a URL into your browser: <https://www.fbi.gov>. No luck: they don't offer encrypted Web service, so you back off and resolve to try again from home later. But you've already tipped off the network administrator, your boss's brother, Ned, who sees the connection attempt from your PC to <www.fbi.gov> in his logs.
Later, as you're driving home in the heavy rain, the soundtrack becomes rumbling and dissonant. You realize that your brakes aren't working, and the quick cut to the next scene cruelly truncates your cross-armed shriek.

How can you give the story a less cinematic ending? Receiver anonymity would help: if Ned couldn't tell that the connection attempt is to <www.fbi.gov> (the receiver), then you wouldn't be nearly as exposed. Even better might be sender anonymity, so that no one would know your PC, instead of your coworker's PC, is sending the initial connection request. Of course, the best protection would be the combination of sender and receiver anonymity, so that Ned could see only that some computer on his network is attempting a connection to some remote site.

Anon.penet.fi

Even though it's been dead for two years now, it's hard to describe anonymous networking without mentioning the remailing service at <anon.penet.fi>. Penet was a simple and easy-to-use double-blind email forwarding service, run in Finland by Johan Helsingius, that kept its mapping between real and anonymous email addresses secret. If you contacted the FBI through <anon.penet.fi>, Ned could only see that your PC had contacted Penet. Similarly, Ned's fiance's crooked sister Fran, the FBI postmaster, could see only a connection from Penet, not one from the nuclear plant.

The obvious downside to the scheme is that you don't really know Penet's relationship to Ned and Fran, so you just have to hope Penet isn't inclined or able to betray your identity. Unfortunately, that was Penet's undoing: even though it was most certainly disinclined, it was indeed able to betray identities, and so ultimately a legal challenge forced it to do so. After one identity was revealed and more were demanded, Helsingius shut the service down.

The Anonymizer

The Anonymizer <www.anonymizer.com> is to the Web what Penet was to email: a simple and easy-to-use forwarding service.
To fetch the FBI Web page, you could load up <http://www.anonymizer.com:8080/www.fbi.gov>. Ned would see a network connection to the Anonymizer, and Fran would see a connection from the Anonymizer, and neither would know precisely what was happening. But again, you'd have to trust the Anonymizer not to betray your identity; it certainly is able to, at least while your HTTP session is in progress.

Actually, the Anonymizer remembers the session much longer than is technically required. According to the Anonymizer user agreement, "Usage logs are usually kept for fifteen (15) days for maintenance purposes, monitoring Spamming and monitoring abuses of netiquette. Any relevant portion(s) of such logs may be kept for as long as needed to stop the abuses." So the Anonymizer can be raided just as Penet was.

Nonetheless, the Anonymizer is probably the most heavily used anonymizing service today. Although it's fine at hiding client identities from Web servers, it's not so good at hiding the server identities from the network segment between the Web client and the Anonymizer. For example, Ned can see "www.fbi.gov" just by unpacking the URL you sent to the Anonymizer above. Most proxies and firewalls capture and log URLs routinely.

The Anonymizer is the first and only Internet privacy service with a price tag. Nonpaying users like me are awarded a bonus 30-60 second delay per page, giving us time to ponder our customer status while gazing at the (quickly delivered) enticement to upgrade. And they do promise a solution to the server hiding problem soon, "for the affordable price of $50.00," probably by encrypting the target URL on the way to the Anonymizer.

Mixmaster and Nym.alias.net

The email Mixmaster is probably the most untraceable remailing technology to date. It's an implementation of an idea called "mixing" that was invented and expounded in the early 1980s academic press by David Chaum (currently chief technology officer of Digicash Inc.).
Here's the idea: given a would-be anonymous message, first choose a route through a series of special forwarding nodes from the message source to its destination, and then wrap some extra layers of data around the message. To form the innermost layer, concatenate the name of the last node (the node one hop away from the message destination) with the original message, and encrypt the result with the public key of the second-to-last node in the route. Now think about this bundle: it has one layer of routing data prepended to the original message, and it's encrypted with a key possessed by the second-to-last node in the route. So if the bundle were to somehow arrive at the second-to-last node, it could be decrypted there, and that one layer of routing data would be enough to get the original message to its final destination.

If you repeat this sort of encapsulation, this time with the third-to-last node, you can see what happens. You now have a bundle that can be decrypted only by the third-to-last node. Once decrypted there, the interior can be sent on to the second-to-last node. At the same time, the third-to-last node can't read the interior of the bundle, because that part is encrypted with a different key. So each node on the path can see only one hop in either direction.

If you use the Mixmaster to transmit your outgoing email through one Mix hop, the destination will see a connection only from that Mix hop, sort of like an encrypting Anonymizer. Most people forward through two or more hops so no single Mix node experiences both the sender IP address and receiver IP address of a single message. In other words, using two or more hops keeps the sender anonymous to every hop but the first and the receiver anonymous to every hop but the last.

Your identity is best hidden if you actually run your own Mixmaster forwarding node and direct your own outgoing mail through it.
Then the first hop is yours, and because only the first hop can see the true sender IP address, this offers excellent sender protection. But you really do have to carry other sites' traffic, too. If everyone knows that your forwarding node serves only your personal traffic, you haven't gained anything by using it. However, running a busy forwarding node is fairly conspicuous, and it's hard to say how Ned would react to that kind of traffic.

If you're worried about an attacker powerful enough to monitor several nodes simultaneously, then you also have to worry about timing attacks and other subtle correlations. In the extreme case, suppose the Mix network is completely idle until you send your message. Then even though they can't undo the layered encryption, Ned and Fran can locate your route's endpoints just by watching the right part of the network. After all, each Mix packet must be part of your message. Mixmaster resists attacks like this with batching/reordering: each forwarding node keeps quiet (absorbing messages but not transmitting them) until its outbound buffer overflows, at which point the node emits a randomly chosen outbound message to its next hop.

There's a slick gateway to the Mixmaster available at the Anonymizer Web site mentioned above. Also, check out <http://www.cs.berkeley.edu/~raph/remailer-list.html> for links to Mixmaster sources and documentation, as well as plenty of information about other types of remailers and privacy tools.

But wait: when you send the FBI anonymous email with the Mixmaster, how can they reply to you? Well, they can't, unless you explicitly tell them how to do it. One technique is just to tell them to send their reply to the newsgroup alt.anonymous.messages with the subject "74847." OK, it's pretty untraceable, but also expensive and unreliable.
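
The batching/reordering behavior described above for Mixmaster forwarding nodes can be pictured with a small sketch. This is only a toy model, not Mixmaster's actual code: the class name PoolMixNode, the capacity parameter, and the receive method are hypothetical names chosen for illustration, and the sketch ignores encryption, message padding, and networking entirely.

```python
import random


class PoolMixNode:
    """Toy forwarding node that batches and reorders messages (illustrative only)."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity          # hypothetical outbound buffer size
        self.pool = []                    # messages absorbed but not yet forwarded
        self.rng = rng or random.Random()

    def receive(self, message):
        """Absorb one message; return a message to forward, or None to stay quiet."""
        self.pool.append(message)
        if len(self.pool) <= self.capacity:
            return None                   # buffer not overflowing yet: emit nothing
        # Buffer overflowed: emit one randomly chosen buffered message, which
        # may or may not be the one that just arrived.
        index = self.rng.randrange(len(self.pool))
        return self.pool.pop(index)


if __name__ == "__main__":
    node = PoolMixNode(capacity=3)
    for i in range(6):
        out = node.receive(f"msg-{i}")
        print(f"in: msg-{i}  out: {out}")
```

Because the outgoing message is drawn at random from the buffer, an observer watching the node's links cannot simply pair each arrival with the next departure. The price is latency: a message may sit in the buffer for a long time on a quiet node.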
Here's a better idea: you already know how to build an untraceable Mix route from you to the FBI, so why not build one from the FBI to you while you're at it and send it along with your original message? Because it's a Mix route, they can only tell what the first link is on the return path, so it doesn't endanger you.

Mixmaster doesn't directly support including return paths in messages, but there's a third-party solution that works well with Mixmaster and other remailers. The "nymserver" <nym.alias.net> lets people build "reply blocks" in advance and register them under a pseudonym stored in the <nym.alias.net> database. For instance, you could register a Mixmaster return path to your site under the name <NukeRat@nym.alias.net> and just include that address in your original mail. Then, when the FBI replies to you as <NukeRat@nym.alias.net>, the mail will reach you even though neither they nor <nym.alias.net> see the whole return path. Complete instructions are available at <http://www.cs.berkeley.edu/~raph/n.a.n.html>.

Onion Routing

Researchers at the US Naval Research Laboratory have adapted Mixing technology to provide sender and receiver anonymity to other applications in a system called Onion Routing. The "onions" are the layered encrypted messages, and the "onion routers" are the forwarding nodes. The first Onion Routing papers described support for Web traffic, email, and rlogin, and they have since announced development efforts for many other protocols as well: Telnet, FTP, NNTP, DNS, NFS, among others.

But to support interactive applications with even a hope of tolerable latency, the batching and reordering technique of the Mixmaster is out of the question. This means that coordinated observation of the network links connecting onion routers could reveal a connection's route and so betray the connection's source or destination. Therefore, it's important to ensure that the links between onion routers can't be simultaneously sniffed.
The easiest approach is to put onion routers on different LANs in different buildings with different network administrators, ones who would be unlikely to collude.

Interactive applications need two-way communication over the entire route, but Mixing is pretty much a one-way technique. We've seen how to handle this with a reply block, but onion routers instead create and maintain connection state as the onions flow through the routers. Once a connection has been built using Mix-style layered encryption, subsequent messages can flow in either direction along the chosen route. Then the routers switch to secret-key cryptography for the connection, because it's so much faster than public-key cryptography.

To access the FBI Web site through an onion network, you first have to set your browser's HTTP proxy to point to an onion network entry point (an "application proxy"). Then when you attempt to load a page, the onion router sanitizes your request by removing revealing headers, builds a connection as described, and lets the two endpoints communicate freely through it. As with the Mixmaster, the best protection results from having a trusted connection between you and an onion router, such as by running an onion router on your local workstation.

In short, it's much harder for Ned to track you down through an onion network than through the Anonymizer, but it requires a lot more infrastructure support, too. Although not yet widely deployed, Onion Routing is a technology to keep your eye on. Look to <http://www.onion-router.net/> for current status and other useful information.

Crowds

Another general-purpose anonymizing tool called Crowds is in development at AT&T Research. Its slogan, "Anonymity Loves Company," reveals a strategic similarity to the Mixmaster and Onion Routing in the sense that its untraceability improves as more and more people use it. But Crowds doesn't rely on Mixing at all.
Whereas Mixmaster and Onion Routing randomly construct a new and independent route for every connection, Crowds randomly assigns a native route to each Crowds "jondo" (John Doe, Crowdspeak for "forwarding node"). Each jondo has two different input streams. The local stream contains requests for anonymous site access from a local user (these follow the native route), and the remote stream contains requests that are merely visiting the jondo from elsewhere. When requests leave a jondo, the subsequent jondos can't tell whether they're following their predecessor's native route or some other route that just included the predecessor, so neither can they tell who is really sending the message.

Every visited jondo does see the ultimate destination of a connection request, so Crowds doesn't provide connection receiver anonymity. But this allows it to route around broken nodes automatically. To prevent snooping between nodes, Crowds employs secret-key encryption with one key per route.

Like Onion Routing, Crowds builds connection state into the jondos as their routes are built in order to support subsequent two-way communication. And also like Onion Routing, Crowds is vulnerable to the coordinated observation of jondos, so the same remarks about separating onion routers apply to Crowds jondos.

Crowds has another timing vulnerability that its designers have worked hard to minimize. The problem is that the untraceability of a route follows from each jondo not knowing its position on a route, i.e., whether it's the second or third hop or greater. If a malicious jondo knew that it was the second hop on a route, then the predecessor jondo (which it can see) would be the entry point of the supposedly anonymous sender.

Now suppose that Ned has installed a malicious jondo in a Crowds network in order to catch you red-handed. You use Crowds to load up the FBI Web site, and Ned's jondo notices your outbound HTTP GET.
Right as his jondo delivers the last byte of the main Web page, it starts a timer and measures the amount of time that elapses until it sees a GET for the first embedded image in the page. This elapsed time is a strong hint as to the number of Crowds jondos preceding it.

Crowds does try to defeat this kind of attack. In the previous example, Crowds searches for the image tags in the HTML itself and delivers the referenced images along with the original document. Because the first jondo never has to ask for them, there's no reliable second event for a malicious jondo to time. Predicting client behavior and compensating for it in proxies is difficult.

At present, Crowds supports only Web traffic, but one can imagine proxies for other types of applications, too. A version of Crowds written in Perl can be obtained by US and Canadian citizens at <http://www.research.att.com/projects/crowds/>.

The Lucent Personal Web Assistant

LPWA doesn't really deal in sender or receiver anonymity directly. Instead, it combats linkability: the ability of adversaries to correlate your activities at various Web sites and draw conclusions that you'd perhaps prefer to keep private. Have you ever deliberately misspelled your name on a registration form in order to see how much junk mail it generates? LPWA works on the same principle by automatically deriving a unique pseudonym for you at each site you visit and presenting that identity for you. For example, the New York Times site gives you access to much of that newspaper as long as you register. LPWA can automatically generate your New York Times name and password, and even a unique email address where you can receive return mail.

For example: after setting your browser to use the LPWA proxy (lpwa.com:8000), go to <http://www.nytimes.com>. LPWA will interpose a couple of pages describing the LPWA service and asking you for your real identity (or at least the one from which your many pseudonyms will be derived).
Then when LPWA sends you on to your original URL and the New York Times site asks you to register, just give the string "\u" as your subscriber ID, "\p" as your password, and "\@" as your email address. LPWA will intercept those codes and replace them with the nonsensical pseudonyms it uses for you at that site. At one site, my pseudonymous username and password are "elnadkv5" and "dphagcal," and my email address is <lqnb2bx4tdzrv/@lpwa.com>.

A paper describing LPWA shows how the pseudonyms are computed from a collision-resistant hash function of your real LPWA username/password and a site name. Given current knowledge about hash functions, it's computationally infeasible for others, including LPWA, to produce a real username given only a pseudonym at a site. In other words, there doesn't have to be any raidable database at all. In spite of that, LPWA does keep track of its existing mappings somehow. Otherwise, lpwa.com wouldn't be able to forward mail for <lqnb2bx4tdzrv/@lpwa.com> to <dm@cs.bu.edu>.

Incidentally, LPWA calls these pseudonymous email addresses "target-revocable": because you leave a different email address at every site, you can later refuse email from any one of them just by adding your pseudonym there to your mail filter's reject list.

Should you trust LPWA not to gather up all of the correlations between people interested in protecting their privacy? Such a mailing list might fetch noticeably higher prices than others! Well, you could apply LPWA's techniques right on your own equipment instead of using their central proxy. In fact, by coding things up right, you could combine the protection of Onion Routing or Crowds with LPWA rewriting. But at this point, I'm afraid you would indeed have to code it up yourself. The LPWA maintainers currently classify it as a "technology demonstration" only, and source code doesn't appear to be currently available: patent pending!

JANUS

What if you want to publish Web pages without revealing the name of
your server? JANUS is a service of the Fernuniversitaet Hagen that
behaves like the Anonymizer, but it hides receiver identities instead
of sender identities. How can clients request pages if the page
locations are secret? JANUS handles this by encrypting and decrypting
URLs on the fly. For instance, <http://www.fbi.gov> becomes an opaque encrypted URL. The standard cautions about single-hop forwarding apply here; coordinated and well-placed sniffers could determine the mapping for a site pretty easily. See <http://janus.fernuni-hagen.de/> for instructions and more information.

Conclusion

So what are you going to do? You really have only a couple of choices today. You can try contacting the FBI through the Anonymizer and hope Ned isn't logging requested URLs and isn't snooping your data stream. Otherwise, you have to install special software and become part of an anonymous forwarding network, successfully convincing others to forward through your site, before you can send anonymously with any confidence. It's probably easiest to just make a call from a pay phone on the way home. You might get soaked, but Ned won't notice.

I've described only technical approaches to anonymity, but there are strongly supported privacy standards based on voluntary cooperation in development. See <http://www.w3.org/> for information about the Platform for Internet Content Selection (PICS), the Platform for Privacy Preferences (P3P), and other privacy strategies.

I hope this article has given you a sense of today's Internet anonymizing technology. As the author, I get to take advantage of this opportunity to mention my own work, too: an investigation of anonymity as a networking primitive. Check out <http://www.cs.bu.edu/students/grads/dm/anon.html> for information on this as well as links to other sites. Unfortunately, my system is still in initial development, so I don't think it warrants more attention in this article. Maybe next time!
First posted: 28th May 1998 efc
Last changed: 28th May 1998 efc