Adam Barth
|
Collin Jackson
|
John C. Mitchell
|
postMessage, provides authentication, but we discover an attack that breaches
confidentiality. We modify the postMessage API to provide confidentiality and
see our modifications standardized and adopted in browser implementations.
Web sites contain content from sources of varying trustworthiness. For example, many web sites contain third-party advertising supplied by advertisement networks or their sub-syndicates [6]. Other common aggregations of third-party content include Flickr albums [12], Facebook badges [9], and personalized home pages offered by the three major web portals [15,40,28]. More advanced uses of third-party components include Yelp's use of Google Maps [14] to display restaurant locations and the Windows Live Contacts gadget [27]. A web site combining content from multiple sources is called a mashup, with the party combining the content called the integrator and integrated content called a gadget. In simple mashups, the integrator does not intend to communicate with the gadgets and requires only that the browser isolate frames. In more complex mashups, the integrator does intend to communicate with the gadgets and requires secure inter-frame communication.
In this paper, we study the contemporary web version of a recurring problem
in computer systems: isolating untrusted, or partially trusted, software
components while providing secure inter-component communication.
Whenever a site integrates third-party content, such as an advertisement, a
map, or a photo album, the site runs the risk of incorporating malicious
content. Without isolation, malicious content can compromise the
confidentiality and integrity of the user's session with the integrator.
While the browser's well-known "same-origin
policy" [34] restricts script
running in one frame from manipulating content in another frame, the browser
uses a different policy to determine whether one frame is allowed to
navigate (change the location of) another frame. Although restricting
navigation is essential to providing isolation,
navigation also enables one form of inter-frame
communication used in mashup frameworks from leading companies. Furthermore, we
show that an attacker can use frame navigation to attack another inter-frame
communication mechanism, postMessage.
postMessage. The results of our analysis are summarized in Table 1.
Microsoft.Live.Channels library [27], which uses fragment identifier
messaging to let the Windows Live Contacts gadget communicate with its
integrator.
The protocol used by Windows Live is analogous to the Needham-Schroeder
public-key protocol [29]. We discover an
attack on this protocol, related to Lowe's anomaly in the Needham-Schroeder
protocol [23], in which a malicious gadget can
impersonate the integrator to the Contacts gadget.
We suggested a solution based
on Lowe's improvement to the Needham-Schroeder protocol [23], and
Microsoft implemented and deployed our suggestion within days.
postMessage is implemented in
Opera, Internet Explorer 8, Firefox 3, and Safari.
Although postMessage has been deployed since 2005, we demonstrate an
attack on the channel's confidentiality using frame navigation.
In light of this attack, the postMessage channel provides
authentication but lacks confidentiality, analogous to a channel in which
senders cryptographically sign their messages. To secure the channel, we
propose a change to the postMessage API. We implemented our change
in patches for Safari and Firefox. Our
proposal has been adopted by the HTML 5 working group, Internet
Explorer 8, Firefox 3, and Safari.
In this paper, we are concerned with securing in-browser interactions from malicious attackers. We assume an honest user employs a standard web browser to view content from an honest web site. A malicious "web attacker" attempts to disrupt this interaction or steal sensitive information. Typically, a web attacker places malicious content (e.g., JavaScript) in the user's browser and modifies the state of the browser, interfering with the honest session. To study the browser's security policy, which determines the privileges of the attacker's content, we define the web attacker threat model below.
attacker.com. The web attacker can obtain
SSL certificates for domains he or she owns; certificate authorities such
as instantssl.com provide such certificates for free.
The web attacker's network
abilities are decidedly weaker than the usual network attacker
considered in studies of network security because the web attacker can
neither eavesdrop on messages sent to other recipients nor forge messages
from other network locations. For example, a web attacker cannot act as a
"man-in-the-middle."
attacker.com in at least one browser window, thereby rendering the
attacker's content.
We make this assumption because we believe that an
honest user's interaction with an honest site should be secure even if the
user separately visits a malicious site in a different browser window.
We assume the web attacker is constrained by the browser's security policy
and does not employ a browser exploit to circumvent the policy.
The web attacker's host privileges are decidedly weaker than those of an
attacker who can execute arbitrary code on the user's machine with the
user's privileges. For example, a web attacker cannot install or run a
system-wide key logger or botnet client.
attacker.com.
There are several techniques an attacker can use to drive traffic to
attacker.com. For example, an attacker can place web advertisements,
display popular content indexed by search
engines, or send bulk e-mail to attract users. Typically, simply
viewing an attacker's advertisement lets the attacker mount a
web-based attack. In a previous
study [20], we purchased over 50,000
impressions for $30. During each of these impressions, a user's browser rendered our
content, giving us the access required to mount a web attack.
We believe that a normal, but careful, web user who reads news and conducts banking, investment, and retail transactions, cannot effectively monitor or restrict the provenience of all content rendered in his or her browser, especially in light of third-party advertisements. In other words, we believe that the web attacker threat model is an accurate representation of normal web behavior, appropriate for security analysis of browser security, and not an assumption that users promiscuously visit all possible bad sites in order to tempt fate.
attacker.com.
bankofthevvest.com) or using other social engineering. In particular,
we do not assume that a user treats attacker.com as if it were
a site other than attacker.com.
The attacks presented in this paper
are "pixel-perfect" in the sense that the browser provides the user no
indication whatsoever that an attack is underway. The attacks do not
display deceptive images over the browser security indicators, nor do they
spoof the location bar or the lock icon.
In this paper, we do not consider cross-site scripting attacks, in
which an attacker exploits a bug in an honest principal's web site to inject
malicious content into another security origin.
None of the attacks described in this paper rely
on the attacker injecting content into another principal's security origin.
Instead, we focus on privileges the browser itself affords the attacker to
interact with honest sites.
Netscape Navigator 2.0 introduced the HTML <frame> element, which
allows web authors to delegate a portion of their document's screen real
estate to another document. These frames can be navigated independently of
the rest of the main content frame and can, themselves, contain frames, further
delegating screen real estate and creating a frame hierarchy. Most modern
frames are embedded using the more-flexible <iframe> element,
introduced in Internet Explorer 3.0. In this paper, we use the term
frame to refer to both <frame> and <iframe> elements.
The main, or top-level, frame of a browser window displays its
location in
the browser's location bar. Subframes are often indistinguishable from
other parts of a page, and the browser does not display their location in
its user interface.
Browsers decorate a window with a lock icon only if every
frame contained in the window was retrieved over HTTPS but do not
require the frames to be served from the same host. For example, if
https://bank.com/ embeds a frame from https://attacker.com/, the
browser will decorate the window with a lock icon.
otherWindow is another window's frame,

    var stolenPassword = otherWindow.document.forms[0].password.value;

attempts to steal the user's password in the other window. Modern web browsers permit one frame to read and write all the DOM properties of another frame only when their content was retrieved from the same origin, i.e., when the scheme, host, and port number of their locations match. If the content of otherWindow was
retrieved from a different origin, the browser's security policy will
prevent this script from accessing otherWindow.document.
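The origin comparison described above can be sketched as a small function. This is a minimal illustration, not a browser's actual implementation: the helper name sameOrigin is hypothetical, and the sketch ignores browser-specific cases such as the document.domain property.

```javascript
// Hypothetical helper: true when two URLs share scheme, host, and port.
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  // The URL parser normalizes a scheme's default port to "", so
  // "https://bank.com/" and "https://bank.com:443/" compare equal here.
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}
```

Under this rule, a page at https://attacker.com/ and a page at https://bank.com/ are in different origins, as are the HTTP and HTTPS versions of the same host.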
Permissive Policy
A frame can navigate any other frame.
For example, if otherWindow includes a frame,

    otherWindow.frames[0].location = "https://attacker.com/";

navigates the frame to https://attacker.com/. This has the effect of
replacing the frame's document with content retrieved from that URL. Under
the permissive policy, this navigation succeeds even if otherWindow
contains content from a different security origin.
There are a number of other idioms for navigating frames, including

    window.open("https://attacker.com/", "frameName");

which requests that the browser search for a frame named frameName
and navigate the frame to the specified URL. Frame names exist in a
global name space and are not restricted to a single security origin.
In 1999, Georgi Guninski discovered that the permissive frame navigation
policy admits serious attacks [16]. Guninski discovered
that, at the time, the
password field on the CitiBank login page was contained within a frame.
Because the permissive frame navigation policy lets any frame
navigate any other frame, a web attacker can navigate the password frame
on CitiBank's page to https://attacker.com/, replacing the frame with identical-looking
content that sends the user's password to attacker.com. In the modern
web, this cross-window attack might proceed as follows:
attacker.com.
bank.com, which displays its password field in a frame.
https://attacker.com/. The location bar still reads bank.com and the lock icon is not removed.
attacker.com.
Window Policy
A frame can navigate only frames in its window.
This policy prevents the cross-window attack because
the web attacker does not control a frame in the same window
as the CitiBank or the Google AdSense
login page.
Without a foothold in the window, the attacker cannot navigate
the login frame to attacker.com.
The window frame navigation policy is neither universally deployed nor sufficiently strict to protect users on the modern web because mashups violate its implicit security assumption that an honest principal will not embed a frame to a dishonest principal.
Before | After |
attacker.com and impersonates the gadget to the user.
attacker.com, replacing the existing content with the attacker's advertisement.
Although browser vendors do not document their navigation policies, we were able to reverse engineer the navigation policies of existing browsers, and we confirmed our understanding with the browsers' developers. The existing policies are shown in Table 2. In addition to the permissive and window policies described above, we discovered two other frame navigation policies:
Descendant Policy
A frame can navigate only its descendants.
Child Policy
A frame can navigate only its direct children.
The Internet Explorer 6 team wanted to enable the child policy by default, but shipped the permissive policy because the child policy was incompatible with a large number of web sites. The Internet Explorer 7 team designed the descendant policy to balance the security requirement to defeat the cross-window attack with the compatibility requirement to support existing sites [33].
position: absolute style. The descendant policy permits a
frame to navigate a target frame precisely when the frame could overwrite
the screen real estate of the target frame. Although the child policy is
stricter than the descendant policy, the additional strictness does not
prevent many additional attacks because a frame can simulate the visual
effects of navigating a
grandchild frame by drawing over the region of the screen occupied by the
grandchild frame. The child policy's added strictness does, however, reduce
the policy's compatibility with existing sites, discouraging browser vendors
from deploying the child policy.
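The four navigation policies can be compared side by side with a small model. The sketch below is only an illustration of the one-line policy definitions given in this section: the frame representation, the function names, and the example frame tree are all hypothetical, and real browsers apply additional rules (for example, for top-level frames) that are not modeled here.

```javascript
// Hypothetical frame model: each frame records only its parent
// (null for the top-level frame of a window).
function makeFrame(name, parent = null) {
  return { name, parent };
}

// Walk up to the top-level frame of the window containing `frame`.
function topOf(frame) {
  while (frame.parent !== null) frame = frame.parent;
  return frame;
}

// True when `target` is a proper descendant of `source`.
function isDescendant(source, target) {
  for (let f = target.parent; f !== null; f = f.parent) {
    if (f === source) return true;
  }
  return false;
}

// May `source` navigate `target` under the named policy?
function canNavigate(policy, source, target) {
  switch (policy) {
    case "permissive": return true;
    case "window":     return topOf(source) === topOf(target);
    case "descendant": return isDescendant(source, target);
    case "child":      return target.parent === source;
    default: throw new Error("unknown policy: " + policy);
  }
}

// Example mashup: an integrator window with an honest gadget (which itself
// contains a frame) and an attacker's gadget, plus a separate attacker window.
const integrator = makeFrame("integrator");
const gadget = makeFrame("gadget", integrator);
const inner = makeFrame("inner", gadget);
const attackerGadget = makeFrame("attackerGadget", integrator);
const attackerTop = makeFrame("attackerTop");
```

In this model, canNavigate("window", attackerGadget, gadget) is permitted, which is the gadget hijacking case, while the same navigation is refused under the descendant and child policies. The integrator can navigate the inner frame under the descendant policy but not under the child policy.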
Over the past few years, web developers have built sophisticated mashups
that, unlike simple aggregators and advertisements, are composed of gadgets
that communicate with each other and with their integrator. Yelp, which
integrates the Google Maps gadget, motivates the need for secure inter-frame
communication by illustrating how communicating gadgets are used in real
deployments. Sections 4.1 and 4.2 analyze
and improve fragment-identifier messaging and postMessage.
<script> tag that executes JavaScript from maps.google.com. This script
creates a rich JavaScript API the integrator can use to interact with the
map, but the script runs with all of the integrator's privileges.
Although the browser's scripting policy isolates frames from different security origins, clever mashup designers have discovered an unintended channel between frames: the fragment identifier channel [3,36]. This channel is regulated by the browser's less-restrictive frame navigation policy. This "found" technology lets mashup developers place each gadget in a separate frame and rely on the browser's security policy to prevent malicious gadgets from attacking the integrator and honest gadgets.
#), then the browser does not reload
the frame. If frames[0] is currently located at
https://example.com/doc,

    frames[0].location = "https://example.com/doc#message";

changes the frame's location without reloading the frame or destroying its JavaScript context. The frame can observe the value of the fragment by periodically polling window.location.hash to see if the fragment
identifier has changed. This technique can be used to send short string
messages entirely within the browser, avoiding network latency.
However, the communication channel is somewhat
unreliable because, if two navigations occur between polls, the first message
will be lost.
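The polling receiver and the message-loss behavior can be sketched as follows. This is an assumption-laden illustration: makeFragmentReceiver and fakeLocation are hypothetical names, and the simulated location object stands in for window.location so that the sketch runs outside a browser. In a real frame, pollOnce would be invoked from a timer.

```javascript
// Hypothetical receiver for fragment identifier messaging.
// `readHash` stands in for reading window.location.hash in a real frame.
function makeFragmentReceiver(readHash) {
  let lastSeen = readHash();
  // Returns the new message, or null if the fragment has not changed.
  return function pollOnce() {
    const current = readHash();
    if (current === lastSeen) return null;
    lastSeen = current;
    return current.replace(/^#/, "");
  };
}

// Simulated frame location (an assumption made for illustration).
const fakeLocation = { hash: "" };
const pollOnce = makeFragmentReceiver(() => fakeLocation.hash);

fakeLocation.hash = "#first";   // sender navigates the frame
fakeLocation.hash = "#second";  // a second navigation before the next poll
const received = pollOnce();    // only the latest fragment is observable
const nothingNew = pollOnce();  // no change since the previous poll
```

The receiver never observes the first message, which is the unreliability noted above. A receiver written this way also cannot detect a message identical to the previous one, so protocols built on this channel typically need their own framing.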
# from eavesdropping on messages because they are unable to read the
frame's location (even
though the navigation policy permits them to write to the frame's
location). Browsers also prevent arbitrary security origins from tampering
with portions of messages. Other security origins can, however, overwrite
the fragment identifier in its entirety, leaving the recipient to guess the
sender of each message.
To understand these security properties, we develop
an analogy with well-known properties of network channels. We view the
browser as guaranteeing that the fragment identifier channel has
confidentiality: a message can be read only by its intended
recipient.
The fragment identifier channel fails to be a secure channel because
it lacks authentication, the ability of the recipient to unambiguously
determine the sender of a message. The channel also fails to be
reliable because messages might not be delivered, and the attacker
might be able to replay previous messages using the browser's
history API.
The security properties of the fragment identifier channel are analogous to a channel on an untrusted network secured by a public-key cryptosystem in which each message is encrypted with the public key of its intended recipient. In both cases, if Alice sends a message to Bob, no one except Bob learns the contents of the message (unless Bob forwards the message). In both settings, the channel does not provide a reliable procedure for determining who sent a given message. There are two interesting differences between the fragment identifier channel and the public-key channel:
Microsoft.Live.Channels [36]. The Windows
Live Contacts gadget uses this API to communicate with its
integrator. The integrator can instruct the gadget to add or remove
contacts from the user's contacts list, and the gadget can send the integrator
details about the user's contacts. Whenever the integrator asks the gadget
to perform a sensitive action, the gadget asks the user
to confirm the operation and displays the integrator's host name to aid the
user in making trust decisions.
Microsoft.Live.Channels attempts to build a secure channel over the
fragment identifier channel. By reverse engineering the implementation, we
determined that it uses two sessions of the following protocol (one in each
direction) to establish a secure channel:
URI | ||
Message |
The Needham-Schroeder protocol has a well-known anomaly, due to Lowe [23], which leads to an attack in the browser setting. In the Lowe scenario, an honest principal, Alice, initiates the protocol with a dishonest party, Eve. Eve then convinces honest Bob that she is Alice. In order to exploit the Lowe anomaly, an honest principal must be willing to initiate the protocol with a dishonest principal. This requirement is met in mashups because the integrator initiates the protocol with the gadget attacker's gadget in order to establish a channel. The Lowe anomaly can be exploited to impersonate the integrator to the Windows Live Contacts gadget as follows:
Integrator Attacker | URI | |
Attacker Gadget | URI | |
Gadget Integrator | ||
Integrator Attacker | Message |
The SMash library in the mashup application creates the secret, an unguessable random value. When creating the component, it includes the secret in the fragment of the component URL. When the component creates the tunnel iframe it passes the secret in the same manner.
The SMash developers have contributed their code to the OpenAjax project, which plans to include their fragment identifier protocol in version 1.1. The SMash protocol can be understood as follows:
URI | ||
Message |
Attacker Gadget | URI | |
Gadget Integrator | ||
Attacker Gadget | Message |
load event to fire.
Integrator sends secret messages to child | Attacker hijacks integrator's child |
URI | ||
URI | ||
Message | ||
Message |
Microsoft.Live.Channels and of the
Windows Live Contacts gadget. IBM adopted our suggestions and revised their
SMash paper. The OpenAjax Alliance adopted our suggestions and updated
their codebase. All three now use the above protocol to establish a secure
channel using fragment identifiers.
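The key check in a Lowe-style repair can be illustrated with a short sketch. The message layout here is an assumption based on Lowe's published fix to the Needham-Schroeder protocol, in which the responder names itself in its reply; it is not taken from the code of Microsoft.Live.Channels, SMash, or OpenAjax. The function names, field names, origins, and fixed nonce strings are hypothetical, and real code would use unguessable random nonces.

```javascript
// Hypothetical initiator-side state for one handshake.
function startSession(intendedResponder, nonceA) {
  return { intendedResponder, nonceA };
}

// Assumed shape of the second message: { nonceA, nonceB, responder }.
// The `responder` field is the Lowe-style addition: the responder names itself.
function acceptReply(session, reply) {
  return reply.nonceA === session.nonceA &&
         reply.responder === session.intendedResponder;
}

// The integrator starts a session with the attacker's gadget.
const session = startSession("https://attacker.example", "nA-123");

// The attacker relays the first message to the honest gadget, whose reply
// reaches the integrator. The reply names the honest gadget, not the attacker.
const relayedReply = {
  nonceA: "nA-123",
  nonceB: "nB-456",
  responder: "https://contacts.example",
};

// A reply that really comes from the party the integrator addressed.
const directReply = {
  nonceA: "nA-123",
  nonceB: "nB-789",
  responder: "https://attacker.example",
};
```

Because the relayed reply names a different responder than the one the integrator addressed, the integrator rejects it and never forwards the honest gadget's nonce to the attacker, which is the step the impersonation depends on.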
HTML 5 [19]
specifies a new browser API for asynchronous communication between
frames. Unlike the fragment identifier channel, the postMessage
channel was designed for cross-site communication.
The postMessage API was originally implemented in Opera 8
and is now supported by Internet Explorer 8,
Firefox 3 [37], and
Safari [24].
postMessage method:

    frames[0].postMessage("Hello world.");

The browser then generates a message event in the recipient's frame
that contains the message, the origin (scheme, port, and domain) of the
sender, and a JavaScript pointer to the frame that sent the message.
postMessage channel guarantees authentication (messages
accurately identify their senders), but the channel lacks confidentiality.
Thus, postMessage has almost the "opposite" security properties of
the fragment identifier channel. Where the
fragment identifier channel has confidentiality without authentication, the
postMessage channel has authentication without confidentiality.
The security properties of the postMessage channel are analogous to a
channel on an
untrusted network secured by an existentially unforgeable signature scheme.
In both cases, if Alice sends a message to Bob, Bob can determine
unambiguously that Alice sent the message. With postMessage, the
origin property accurately identifies the sender; with
cryptographic signatures, verifying the signature on a message accurately
identifies the signer of the message. One difference between the channels
is that cryptographic signatures can be easily replayed, but the
postMessage channel is resistant to replay attacks. In some cases,
however, an attacker might be able to mount a replay
attack by reloading honest frames.
Although postMessage is widely believed to provide a secure channel
between frames, we show an attack on the confidentiality of the
channel. A message sent with postMessage is
directed at a frame, but if the attacker navigates that frame to
attacker.com before the message event is generated, the
attacker will receive the message instead of the intended recipient.
postMessage on that frame. The attacker can load the
integrator inside a frame and carry out an attack without violating the
descendant frame navigation policy. After the attacker loads the integrator
inside a frame, the attacker navigates the gadget frame to
attacker.com. Then, when the integrator calls postMessage on
the "gadget's" frame, the browser delivers the message to the attacker
whose content now occupies the "gadget's" frame; see
Figure 4. The integrator can prevent this attack
by "frame busting," i.e., by refusing to render the mashup if
top !== self, indicating that the integrator is contained in a frame.
Gadget requests secret from integrator | Integrator's reply is delivered to attacker |
postMessage idiom is also vulnerable to interception, even
under the child frame navigation policy:

    window.onmessage = function(e) {
      if (e.origin == "https://b.com")
        e.source.postMessage(secret);
    };

The source attribute of the MessageEvent is a JavaScript
reference to the frame that sent the message. It is tempting to conclude
that the reply will be sent to
https://b.com. However, an attacker might be able to intercept the
message. Suppose that the honest gadget calls
top.postMessage("Hello"). The gadget attacker can intercept the
message by
embedding the honest gadget in a frame, as depicted in
Figure 5. After the gadget
posts its message to the integrator, the attacker navigates the
honest gadget to
https://attacker.com. (This navigation is permitted under both the
child and descendant frame navigation policies.) When the integrator
replies to the source of the message, the message will be
delivered to the attacker instead of to the honest gadget.
postMessage as an underlying communication primitive, but
we would prefer that postMessage provide
a secure channel natively. In MashupOS [39], we proposed a new
browser API, CommRequest, to send messages between origins. When
sending a message using CommRequest, the sender addresses the message
to a principal:

    var req = new CommRequest();
    req.open("INVOKE", "local:https://b.com//inc");
    req.send("Hello");

Using this interface, CommRequest protects the confidentiality of
messages because the CommServer will deliver messages only to the
specified principal. Although CommRequest provides adequate
security, the postMessage API is further along in the standardization
and deployment process. We therefore propose extending the
postMessage API to provide the additional security benefits of
CommRequest by including a second parameter:
the origin of the intended recipient. If the sender specifies a
target origin, the browser will deliver the message to the targeted frame
only if that frame's current security origin matches the argument.
The browser is free to deliver the message to any principal if the sender
specifies a target origin of *. Using this improved API, a frame can
reply to a message using the following idiom:

    window.onmessage = function(e) {
      if (e.origin == "https://b.com")
        e.source.postMessage(secret, e.origin);
    };

As shown in this example use, the API uses the same origin syntax for both sending and receiving messages. The scheme is included in the origin for those developers who wish to defend against active network attackers by distinguishing between HTTP and HTTPS. We implemented this API change as a patch for Safari and a patch for Firefox. Our proposal was accepted by the HTML 5 working group [17]. The new API is now included in Firefox 3 [38], Safari [32], and Internet Explorer 8 [25].
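The effect of the target origin parameter can be seen in a small simulation of the delivery rule. This is a sketch under stated assumptions, not browser code: makeSimFrame and deliver are hypothetical names, the frame is a plain object whose origin changes when it is navigated, and https://a.com is an assumed sender origin.

```javascript
// Hypothetical stand-in for a browser frame whose document can be
// replaced by navigation.
function makeSimFrame(origin) {
  return { origin, inbox: [] };
}

// Models the delivery rule only: the origin check happens at delivery time,
// against whatever document currently occupies the frame.
function deliver(frame, message, senderOrigin, targetOrigin) {
  if (targetOrigin !== "*" && frame.origin !== targetOrigin) {
    return false; // dropped: the frame no longer holds the intended recipient
  }
  frame.inbox.push({ data: message, origin: senderOrigin });
  return true;
}

const gadgetFrame = makeSimFrame("https://b.com");

// The attacker navigates the gadget's frame before the message is delivered.
gadgetFrame.origin = "https://attacker.com";

// Without a specific target origin, the attacker's document receives the secret.
const leaked = deliver(gadgetFrame, "secret", "https://a.com", "*");

// With the intended recipient's origin specified, the message is dropped.
const deliveredToTarget = deliver(gadgetFrame, "secret", "https://a.com", "https://b.com");
```

In the simulation, the wildcard send reaches the attacker's document while the targeted send is discarded, which is why the reply idiom above passes e.origin as the second argument rather than *.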
XMLHttpRequest or entering an infinite loop.
Unfortunately, this approach can lead to false positives. SMash waits 20 seconds for a gadget to load before assuming that the gadget has been hijacked and warning the user. An attacker might be able to fool the user into entering sensitive information during this time interval. Using a shorter time interval might cause users with slow network connections to receive warnings even though no attack is in progress. We expect that the deployment of the descendant policy will obviate the need for server-enforced gadget hijacking mitigations.
Writing programs in one of these safe subsets is often awkward because the
language is highly
constrained to avoid potentially dangerous features. To improve
usability,
the safe subsets are often accompanied by a compiler that transforms
untrusted HTML and JavaScript into the subset, possibly at the cost of
performance. These safe subsets will become easier to use over time as
these compilers become more sophisticated and more libraries become
available, but with the deployment of postMessage and the
descendant policy, we expect that frame-based mashup designs will continue to
find wide use as well.
document.domain property
to communicate directly in JavaScript. Similar to most frame-based mashups,
the descendant frame navigation policy is required to prevent gadget
hijacking.
<module> tag [5] is similar to an
<iframe> tag, but the module runs in an unprivileged security
context, without a principal, and the browser prevents the integrator from
overlaying content on top of the module. Unlike postMessage, the
communication primitive used with the module tag is intentionally
unauthenticated: it does not identify the sender of a message. It is unknown
whether navigation can be used to intercept messages as there are no
implementations of the <module> tag.
security attribute [26] of frames that can be
set to restricted. With security="restricted", the
frame's content cannot run JavaScript. Similarly, the proposed
<jail> tag [8] encloses untrusted content and prevents the
sandboxed content from running JavaScript. However, eliminating JavaScript
prevents gadgets from offering interactive experiences.
postMessage and frame
navigation policies allow web authors to obtain some of the
benefits of MashupOS using existing web APIs.
Web browsers provide a platform for web applications. These applications rely on the browser to isolate frames from different security origins and to provide secure inter-frame communication. To provide isolation, browsers implement a number of security policies, including a frame navigation policy. The original frame navigation policy, the permissive policy, admits a number of attacks. The modern frame navigation policy, the descendant policy, prevents these attacks by permitting one frame to navigate another only if the frame could draw over the other frame's region of the screen. The descendant policy provides an attractive trade-off between security and compatibility, is deployed in the major browsers, and has been standardized in HTML 5.
In existing browsers, frame navigation can be used as an inter-frame
communication channel with a technique known as fragment identifier
messaging. If used directly, the fragment identifier channel lacks
authentication. To provide authentication, Microsoft.Live.Channels,
SMash, and OpenAjax 1.1 use messaging protocols. These protocols are
vulnerable to attacks on authentication but can be repaired in a manner
analogous to Lowe's variation of the Needham-Schroeder
protocol [23].
The postMessage communication channel suffered the
converse security vulnerability: using frame navigation, an attacker can
breach the confidentiality of the channel. We propose providing
confidentiality by extending the postMessage API to let the sender
specify an intended recipient. Our proposal was adopted by the HTML 5
working group, Internet Explorer 8, Firefox 3, and Safari.
With these improvements to the browser's isolation and communication primitives, frames are a more attractive feature for integrating third-party web content. Two challenges remain for mashups incorporating untrusted content. First, a gadget is permitted to navigate the top-level frame and can redirect the user from the mashup to a site of the attacker's choice. This navigation is made evident by the browser's location bar, but many users ignore the location bar. Improving the usability of the browser's security user interface is an important area of future work. Second, a gadget can subvert the browser's security mechanisms if the attacker employs a browser exploit to execute arbitrary code. A browser design that provides further isolation against this threat is another important area of future work.
<module> tag.