

	 The following paper was originally published in the
   Proceedings of the First USENIX Workshop on Electronic Commerce
		    New York, New York, July 1995


	For more information about USENIX Association contact:

		   1. Phone:	510 528-8649
		   2. FAX:	510 548-5738
		   3. Email:	office@usenix.org
		   4. WWW URL:  http://www.usenix.org


Generic Extensions of WWW Browsers
**********************************

Ralf Hauser and Michael Steiner  
Information Technology Solutions Department, IBM Research Division  
Research Laboratory, CH-8803 Rueschlikon,  Switzerland  
tel: +41.1.724-8426, fax: +41.1.710-3608,
email:  hauser@acm.org, sti@zurich.ibm.com 


Abstract:
+++++++++

Current WWW browsers provide two main services: communication and
information   rendering.   While  this  is  sufficient  for  some
purposes, many future applications will need  more  sophisticated
processing  on  the user side before server-data can be presented
to the user or before the user input can be  transferred  to  the
servers.  For example, electronic payments ought to be seamlessly
integrated into a customer's browser to enable him  to  shop  via
the  Web. This paper proposes two pragmatic approaches to fulfill
these requirements without altering current  browser  technology.
We   conclude  with  the  proposal  of  a  generalized  extension
framework for WWW browsers.


Introduction
************

The World Wide Web (WWW) has grown extremely  fast  in  the  past
months,  not  only  quantitatively  but  also  qualitatively. New
services are being added on a day-by-day  basis  and  range  from
telephone  directories to movie databases and ``shopping malls''.
In particular, the ``POST'' method of HTTP [1] and the ``FORMS''
feature of HTML [2] have made the WWW technology suitable for
most Internet applications.

With new commercial applications emerging daily, it is impossible
to anticipate all future needs and build a single browser that
fulfills all requirements. Security requirements are one example:
the simple authentication mechanisms of early HTTP versions are
no longer sufficient. Advanced services such as secure payments
and intellectual property rights protection require new
mechanisms based on multi-party security.

Therefore, browsers should provide interfaces to allow  the  easy
addition  of  arbitrary extensions. Most current browsers support
the concept of external viewers based on  Internet  Media  (MIME)
Types  [3]. The users of a WWW browser can freely configure their
preferred viewers, enabling them to receive an open-ended set  of
document  types  with  this  feature.  However,  permitting  only
external viewers as extensions is not sufficient.  Services  such
as confidentiality, integrity protection and packetizing data are
only intermediate processing steps.  They are filters and need to
return  output  back  to  the  browser  controlling  the  overall
application run.

WWW-browsers are also proposed as exclusive user interfaces or as
HTTP  engines(fn1). The browser is then the  slave of a different
application.  This  cannot  be  handled  by  the  MIME  extension
feature,  and  the different browser implementations address this
issue of ``remote control'' in different ways.

In the next section we show how rudimentary filter extensions
can be implemented with existing technology and virtually any
browser without modifications of either source or object code. We
use the example of a secure purchase and payment of a digital
movie to illustrate the approaches. In section 3 we outline a
generic framework that accommodates both filters and remote
control.


Short-Term Approaches
*********************

For short- and mid-term solutions, an  extension  mechanism  must
not require changes of the browsers but should depend only on the
features implemented by the majority of available browsers.

We propose two pragmatic approaches with minimal dependencies on
implementation specifics to ``snap-in'' a client-side extension
into today's browsers. The first approach relies only on external
viewers that use a special MIME type, and the second approach is
based on a small daemon resembling the HTTP server (httpd)
running on the user's local machine.

To demonstrate our approaches we use the scenario of a user who
wants to purchase a movie from a server using a secure payment
scheme (e.g., iKP [5]).


External Viewer Approach
========================

Figure 1 shows the purchase of a movie based on external viewers. This
minimal approach consists of

 1. a payment ``extension'' MIME viewer comprising
    o user interface routines to obtain the user's confirmation to
      join the purchase contract, and
    o cryptographic mechanisms to build a secure digital credit card
      slip;
 2. a method to ``re-integrate'' the extension's output into the WWW
    browser.

   
Figure 1: Payment of a Movie based on external viewers

Re 1: Whenever some input requires further processing on the user
side, the server tags it with a specific MIME-type and submits it
to the browser(fn2). In our example of the purchase of a  digital
movie,  the  browser  transparently  passes this information to a
``payment'' extension in the form of a MIME viewer. The extension
then  displays  contract  parameters  such as duration, number of
frames per second, resolution, and price.  Then  it  prompts  the
user for a confirmation. For small amounts, this confirmation can
be provided  automatically  by  the  user's  device.  For  larger
amounts,  the payment extension's policy will require the user to
explicitly enter a PIN or a password to unlock the user's  credit
card for the generation of a digital credit-card slip.
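
The confirmation policy just described can be sketched as follows.
This is only an illustration under stated assumptions: the limit
AUTO_CONFIRM_LIMIT, the function name confirm_payment, and the
ask_pin callback are hypothetical and do not come from the
prototypes.

```python
from decimal import Decimal
from typing import Callable

# Hypothetical limit below which the user's device confirms automatically.
AUTO_CONFIRM_LIMIT = Decimal("1.00")


def confirm_payment(amount: Decimal, ask_pin: Callable[[], bool]) -> bool:
    """Return True if the payment may proceed.

    Small amounts are confirmed automatically. Larger amounts require the
    user to unlock the credit card explicitly, which is modelled here by
    the ask_pin callback (it returns True when the PIN was accepted).
    """
    if amount <= AUTO_CONFIRM_LIMIT:
        return True
    return ask_pin()
```

A real extension would take the limit from the user's own policy
settings rather than from a constant.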

Re 2: The re-integration of the information produced for a
payment meta-protocol aims either at feeding data back to the
browser or at employing the HTTP communication infrastructure
that is already in place between the browser and the server. In
this way the payment extensions do not have to care about
communication and can be unaware of potential gateways to cross
on the way to the merchant. The first versions of the remote
control features of both Mosaic and Netscape do not permit
external applications to transmit more information transparently
to a server via a browser than can be fitted into a URL. Most
implementations expect this URL string to be only a few hundred
characters long. For many situations this length is insufficient
(e.g., with RSA a simple signed or encrypted block would already
require 170 bytes).

One intermediate solution is to include the digital slip in a
hidden input field of an HTML form, since the value of such a
field can be arbitrarily long. The entire HTML document
containing this form is then deposited in a local file with a
standard name. This file can be brought back into the browser in
two ways:


 o The last action of the payment extension is to cause the browser to
   load this local file by the aforementioned remote-control
   features. The user must then click one more time to send the form
   (containing the slip) to the server.
 o If the browser has no remote control feature, the server
   already provides a link to the standard local filename when
   supplying the user with the last information rendered by the HTML
   viewer. In this case, the last action of the browser is to inform
   the user that the local file is ready and the pertinent link can be
   clicked. This approach requires as many as two extra user
   interactions, which will be unnecessary in implementations of
   long-term frameworks.
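
The hidden-field technique can be sketched as follows. The sketch
is not taken from the prototypes: the file name
payment-slip.html, the field name slip, and both function names
are assumptions made for illustration.

```python
import html
from pathlib import Path

# Hypothetical "standard name" of the local file; the text does not fix one.
SLIP_FORM_FILE = Path("payment-slip.html")


def build_slip_form(action_url: str, slip: str) -> str:
    """Return an HTML document whose form carries the slip in a hidden field."""
    return (
        "<html><body>\n"
        f'<form method="POST" action="{html.escape(action_url, quote=True)}">\n'
        f'<input type="hidden" name="slip" value="{html.escape(slip, quote=True)}">\n'
        '<input type="submit" value="Send payment">\n'
        "</form>\n"
        "</body></html>\n"
    )


def deposit_slip_form(action_url: str, slip: str,
                      path: Path = SLIP_FORM_FILE) -> Path:
    """Write the form to the local file that the browser will load."""
    path.write_text(build_slip_form(action_url, slip), encoding="utf-8")
    return path
```

Because the slip travels in the body of the POST request, its
length is not bounded by the URL limit discussed above.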

Evaluation 

Such a short-term solution allows one to ``snap'' almost
arbitrary extensions ``in'' to existing browsers. The
requirements for the communication and operating environments are
minimal. The solution, however, has two drawbacks. First, the
filtering process sporadically needs user input to continue
operation, input that is unnecessary from a logical point of
view. Second, post-processing filters (i.e., filters that are
employed after the last logically necessary user input) require
an extra communication exchange with the server because the MIME
dispatcher of the browser only operates on incoming data. This
extra communication exchange is not possible with certain
extensions(fn3).


Client-Side HTTP Daemon
=======================

The second short-term approach is to have an httpd-like daemon on
the  user's  machine as shown in Figure 2.  For security reasons,
this extension httpd only accepts connections on  a  well-defined
port  from  browsers on the same machine. When the payment for an
electronic purchase is about to take place, the user clicks on  a
URL  (provided  by  the  shop's  server) and causes a normal POST
request   to   be   sent   to    this    local    httpd    (e.g.:
http://localhost:54321/buy).  The  body  of this request contains
the information to assemble the  digital  credit  card  slip.  To
obtain  further  user  input  (such as credit card numbers or the
sensitive PIN for  transaction  approval)  this  extension  httpd
either  directly  pops up with a special window for user keyboard
input, or  collects  this  information  through  the  browser  by
sending back an appropriate HTML form.
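
A minimal sketch of such an extension httpd, using Python's
standard http.server module, is given below. It is an
illustration only: the port is the one from the example URL, and
the helper names parse_order and pin_form, the /approve action,
and the form layout are assumptions.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

EXTENSION_PORT = 54321  # port taken from the example URL in the text


def parse_order(body: bytes) -> dict:
    """Decode a form-encoded POST body into a flat field dictionary."""
    fields = parse_qs(body.decode("utf-8"))
    return {name: values[0] for name, values in fields.items()}


def pin_form(order: dict) -> bytes:
    """Build an HTML form that shows the order and asks for the PIN."""
    rows = "".join(
        f"<li>{html.escape(k)}: {html.escape(v)}</li>" for k, v in order.items()
    )
    page = (
        "<html><body><ul>" + rows + "</ul>"
        '<form method="POST" action="/approve">'  # hypothetical follow-up URL
        '<input type="password" name="pin">'
        '<input type="submit" value="Approve">'
        "</form></body></html>"
    )
    return page.encode("utf-8")


class ExtensionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Second line of defence: refuse anything not from the local machine.
        if self.client_address[0] != "127.0.0.1":
            self.send_error(403)
            return
        if self.path != "/buy":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        page = pin_form(parse_order(self.rfile.read(length)))
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(page)))
        self.end_headers()
        self.wfile.write(page)


def serve() -> None:
    # Binding to the loopback address keeps remote hosts from connecting.
    HTTPServer(("127.0.0.1", EXTENSION_PORT), ExtensionHandler).serve_forever()
```

Calling serve() starts the daemon; the browser then reaches it at
http://localhost:54321/buy as in the example above.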


Figure 2: Payment with a Client-Side Extension HTTP-Daemon acting as a 
Proxy

There are two approaches for the extension httpd to transmit the final 
order with the digital slip to the server: 

 1. It puts the complete order information into a URL addressed to the
    shop's server. It then sends this URL with an HTTP Redirect
    directive to the browser. The browser then automatically
    issues a normal HTTP request with that URL to the server of the
    shop.
 2. It acts as a proxy: it contacts the shop's server directly
    and forwards the shop's response to the browser.
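
The first of these two approaches can be sketched as follows. The
limit MAX_URL_LENGTH is an assumption (the text above only speaks
of ``a few hundred characters''), and the function name is
hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed limit; browsers of the time differ in what they accept.
MAX_URL_LENGTH = 255


def redirect_response(shop_url: str, order: dict,
                      max_len: int = MAX_URL_LENGTH) -> Optional[bytes]:
    """Return a raw HTTP redirect carrying the order in the URL.

    Returns None when the order does not fit into a URL, in which case
    the caller has to fall back to the proxy approach.
    """
    url = shop_url + "?" + urlencode(order)
    if len(url) > max_len:
        return None
    return (
        "HTTP/1.0 302 Moved Temporarily\r\n"
        f"Location: {url}\r\n"
        "\r\n"
    ).encode("ascii")
```

A daemon could try the redirect first and act as a proxy only when
redirect_response returns None.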

The main problem with this approach is how the shop's server can find
out the port number of the extension httpd to construct the mentioned
local URL. The answer is three-fold:

 o A well-known port will be established for the most common case. 
 o If the user cannot use this port for any reason, the port will be
   published as a special MIME-type of the form
   extension-mgr/port-number, where port-number stands for the port
   actually used. This will be sent with the HTTP ACCEPT header-field
   to the server and can be used to generate the URLs.  
 o If all this fails, the user will have to manually fill out a form
   that explicitly asks for the extension's port number, and then send
   it to the server.  
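
On the shop's server, the first two steps of this port discovery
could look roughly like the sketch below. The value of
WELL_KNOWN_PORT is hypothetical (no port has been assigned), and
the function names are assumptions.

```python
import re

WELL_KNOWN_PORT = 54321  # hypothetical; stands in for the well-known port

# Matches the special MIME type extension-mgr/port-number.
_PORT_TYPE = re.compile(r"extension-mgr/(\d+)")


def extension_port(accept_header: str) -> int:
    """Extract the extension's port from an HTTP Accept header value."""
    for media_type in accept_header.split(","):
        # Drop parameters such as ";q=0.5" before matching.
        match = _PORT_TYPE.fullmatch(media_type.split(";")[0].strip())
        if match:
            port = int(match.group(1))
            if 0 < port < 65536:
                return port
    return WELL_KNOWN_PORT  # nothing announced: assume the common case


def local_buy_url(accept_header: str) -> str:
    """Build the local URL that the shop embeds in its pages."""
    return f"http://localhost:{extension_port(accept_header)}/buy"
```

The manual form mentioned as the last resort is not covered by
this sketch.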

Evaluation 

The advantage is that this approach requires no extra user
interaction to keep the filtering process alive. Additionally,
state can easily be kept between different communication
exchanges.

Order transmission  approach  (1)  has  the  advantage  that  the
extension  httpd  only  interacts  with  the  user's browser and,
therefore, can be totally shielded from outside communication. On
the  other  hand  we  have  to  consider the limitation mentioned
before of the amount of information that can  be  fitted  into  a
URL.

Order transmission approach (2) does not suffer from such a size
restriction. It can additionally react to faults such as timeouts
(e.g., by sending retries) and pre-process the response of the
shop's server. Figure 2 shows how the security extension could
provide copyright protection by decrypting the delivered digital
movie. The main disadvantage of this approach is that the daemon
must be aware of proxies or gateways (e.g., SOCKS) to always be
able to reach the shop.


General Client Side Extension Framework
***************************************

The problems of the approaches presented in the last section led
us to look for a long-term solution. This solution should also
support certain extensions, such as load balancing by redirecting
certain URLs to local files or geographically closer servers, and
outlining of document structures [6], that cannot be handled
easily by the approaches mentioned before. For this purpose we
can relax our requirement of not changing browsers, and we
therefore propose a change in the architecture of browsers.

In the currently popular WWW browsers, the HTML viewer, the  HTTP
communications   module,   and   a   primitive   ``MIME  document
dispatcher''  are  combined  in  one  monolithic  package.   Some
implementations  add some limited capabilities of handling remote
control events as well.

   Figure 3: Examples of current Extensions

This architecture can be greatly simplified and its flexibility
can be increased if we view the browser as a generic extension
manager that acts merely as a dispatcher. Not only MIME
extensions, but also the communication toolbox and the HTML
viewer are subsumed as extensions.

This  results  in  an  architecture  with  only   three   logical
components:

 o the extension manager,
 o passive extensions, and
 o active extensions.

Passive extensions are called by  the  extension  manager.  Their
purpose  is  to  either  filter  and  transform data, to serve as
gateways to different protocols or to render data.

Active extensions control the execution of the extension manager
and initiate ``web-actions''. Note that the two types of
extensions are not mutually exclusive. The HTML viewer is clearly
a data viewer, but when it reacts to user interaction and issues
a request it can also be considered an active extension.

This leads  to  the  following  sketch  of  a  minimal  interface
defining  the  interaction  between  the  extension  manager  and
extensions/external applications.

The extension manager exports two functions for active
extensions. Parameters named request or response are required to
be valid HTTP requests or responses, respectively:


 o ProcessURL( 
         IN:                request
         IN:                returnMeResponse?
         OUT:               [response if returnMeResponse?]
   ), 

 o RenderMIME( 
         IN:                response
         IN:                associated URL
   ), 

Similarly, the extension has to provide one or both of the
following two functions:

 o HandleURL( 
         IN:                request
         OUT:               requestOrResponse
   ), 

 o HandleMIME( 
         IN:                response
         OUT:               [requestOrResponse if not(IsRenderer?)]
   ), 

These functions are registered either statically with a
configuration file or dynamically by calling specific
registration functions of the extension manager. Together with
each callback function a regular expression is registered. The
extension manager tries to match it against requested URLs or
MIME types, respectively, to find out which extension to use. If
matching extensions are found, the extension manager takes the
best match and calls the associated callback function.
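
Registration and matching could be sketched as below. The text
does not define what the ``best'' match is; the sketch assumes
that the pattern matching the longest prefix wins and that ties
go to the earlier registration. The class and method names are
hypothetical.

```python
import re
from typing import Callable, List, Optional, Tuple

Handler = Callable[[str], str]


class Registry:
    """Maps regular expressions to extension callbacks."""

    def __init__(self) -> None:
        self._entries: List[Tuple[re.Pattern, Handler]] = []

    def register(self, pattern: str, handler: Handler) -> None:
        # URL schemes and MIME types are matched case-insensitively.
        self._entries.append((re.compile(pattern, re.IGNORECASE), handler))

    def best_match(self, key: str) -> Optional[Handler]:
        """Return the handler whose pattern matches the longest prefix."""
        best, best_len = None, -1
        for pattern, handler in self._entries:
            match = pattern.match(key)  # anchored at the start of key
            if match and len(match.group(0)) > best_len:
                best, best_len = handler, len(match.group(0))
        return best
```

One such registry would be kept for URLs and a second one for MIME
types.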

Once the call to a filter returns, the extension manager inspects
requestOrResponse and, depending on the type of the return value,
looks for the next MIME or URL filter to call. Data viewers do
not return any data, so the extension manager stops processing
after calling one.

If  an  external  application  wants  the  resulting  data  of  a
ProcessURL(fn4) call regardless of its MIME  type,  it  will  set
returnMeResponse? in the ProcessURL call. Instead of handing  the
final  response  to  a  data  viewer  according  to the MIME-type
matching rules the  extension  manager  will  return  it  to  the
caller.


Example
=======

Our example begins with a user who has already browsed the  price
list  of  a  movie-on-demand  WWW  shop  and  has  made a choice.
Therefore, the HTML viewer knows the URL of the selected  digital
movie  as  well as the desired payment protocol. The user's click
is processed in the following steps:


 1. The HTML viewer will initiate the operation by calling the
    extension manager with:

   ProcessURL(
   ``POST 3KP://movie.com/ HTTP/1.0
   Merchandise=Entertainment/movie324.mpg, 
   Value=$5, 
   Shop-Name=MovieComInc, 
   Shop-Certificate=...'',
   FALSE,
   returnDataHandle
   ). 

 2. The extension manager looks for a registered extension and will
    match ``3kp:*''. It will call the payment extension then with: 

   HandleURL(
   ``POST 3KP://movie.com/ HTTP/1.0
   Merchandise=Entertainment/movie324.mpg, 
   Value=$5, 
   Shop-Name=MovieComInc, 
   Shop-Certificate=...'',
   returnDataHandle
   ). 

   The payment extension generates a payment token and puts the
   following HTTP request in returnDataHandle:

   ``POST http://movie.com/3kp-buy HTTP/1.0
   Merchandise=Entertainment/movie324.mpg,
   PayToken="XYZ"''

   Note that the payment extension may actively call the extension
   manager during its operation through the ProcessURL interface, for
   example, to retrieve some other party's public key.  

 3. Inspecting returnDataHandle the extension manager determines that
    it should call the  HTTP communications  module(fn5). This module 
    handles the HTTP request and  returns the HTTP  response with the 
    requested document from the remote server in requestOrResponse:  

   ``HTTP/1.0 200 OK 
   Date: Friday, 23-Jun-95 13:52:57 GMT
   MIME-version: 1.0 
   Content-type: application/copyright_protect 
   encrypted data ...'' 

 4. Because the movie-on-demand shop has some copyright protection in
    place, the response's MIME type will match the copyright protection
    extension, and the extension manager will call the HandleMIME
    callback of this extension. In a simple case, this callback would
    decrypt the data with a session key and return the movie with
    MIME type MPEG.

 5. Finally, the extension manager will hand the movie324.mpg data to
    the movie viewer (e.g. an MPEG viewer). 
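
The five steps above can be simulated with the following sketch.
It is a toy model under several assumptions: all class and
function names are hypothetical, the HTTP communications module
is replaced by a stub without network access, XOR with a fixed
byte stands in for real decryption, and the first matching
pattern is used instead of a ``best match'' rule.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    method: str
    url: str
    fields: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    mime_type: str
    body: bytes


@dataclass
class Entry:
    pattern: re.Pattern
    callback: Callable
    is_renderer: bool = False


class ExtensionManager:
    def __init__(self) -> None:
        self.url_entries: List[Entry] = []
        self.mime_entries: List[Entry] = []

    def register_url(self, pattern: str, callback: Callable) -> None:
        self.url_entries.append(Entry(re.compile(pattern, re.IGNORECASE), callback))

    def register_mime(self, pattern: str, callback: Callable,
                      is_renderer: bool = False) -> None:
        self.mime_entries.append(
            Entry(re.compile(pattern, re.IGNORECASE), callback, is_renderer))

    @staticmethod
    def _find(entries: List[Entry], key: str) -> Optional[Entry]:
        # Simplification: the first registered pattern that matches wins.
        return next((e for e in entries if e.pattern.match(key)), None)

    def process_url(self, request: Request,
                    return_me_response: bool = False) -> Optional[Response]:
        message = request
        while True:
            if isinstance(message, Request):
                entry = self._find(self.url_entries, message.url)
                if entry is None:
                    raise LookupError(f"no extension for URL {message.url}")
                message = entry.callback(message)
                continue
            entry = self._find(self.mime_entries, message.mime_type)
            if return_me_response and (entry is None or entry.is_renderer):
                return message  # hand the final response to the caller
            if entry is None:
                raise LookupError(f"no extension for type {message.mime_type}")
            if entry.is_renderer:
                entry.callback(message)  # data viewers return nothing
                return None
            message = entry.callback(message)


KEY = 0x5A  # toy session key; XOR is only a placeholder for decryption


def payment_extension(request: Request) -> Request:
    return Request("POST", "http://movie.com/3kp-buy",
                   {"Merchandise": request.fields["Merchandise"],
                    "PayToken": "XYZ"})


def fake_http_module(request: Request) -> Response:
    # Stub for the HTTP communications module; pretends to be the shop.
    movie = b"MPEG data for " + request.fields["Merchandise"].encode("ascii")
    return Response("application/copyright_protect",
                    bytes(b ^ KEY for b in movie))


def copyright_extension(response: Response) -> Response:
    return Response("video/mpeg", bytes(b ^ KEY for b in response.body))


shown: List[Response] = []  # the "movie viewer" just records what it gets

manager = ExtensionManager()
manager.register_url(r"3kp:", payment_extension)
manager.register_url(r"http:", fake_http_module)
manager.register_mime(r"application/copyright_protect", copyright_extension)
manager.register_mime(r"video/mpeg", shown.append, is_renderer=True)

order = Request("POST", "3KP://movie.com/",
                {"Merchandise": "Entertainment/movie324.mpg", "Value": "$5"})
manager.process_url(order)
```

After the call, shown holds the decrypted movie. Calling
process_url with return_me_response=True returns that response to
the caller instead of passing it to the viewer.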

Remarks
=======

The HTTP communications module need not be the only one capable
of turning HTTP requests into HTTP responses. A caching filter
returns the document if it is already in the cache, and an access
filter for a browser at a public place will return a document
containing an error message if it refuses requests that may incur
high costs or are against the charter for which the terminal was
provided. Protocol gateways (e.g., ftp, gopher, mailto) are also
examples of filters that semantically convert HTTP requests into
HTTP responses.

Response-side filters may also issue new requests,  for  example,
for pre-fetching of referenced documents.
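
A caching filter of the kind mentioned above could be sketched as
follows. The Request and Response types and the CachingFilter
class are assumptions made for this illustration; restricting the
cache to GET requests is also a choice of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Request:
    method: str
    url: str


@dataclass(frozen=True)
class Response:
    mime_type: str
    body: bytes


class CachingFilter:
    """HandleURL-style filter that answers repeated requests locally."""

    def __init__(self) -> None:
        self._cache: Dict[str, Response] = {}

    def handle_url(self, request: Request) -> Union[Request, Response]:
        # Cache hit: turn the request into a response without any network.
        if request.method == "GET" and request.url in self._cache:
            return self._cache[request.url]
        # Cache miss: pass the request on unchanged to the next filter.
        return request

    def store(self, url: str, response: Response) -> None:
        """Remember a response, e.g. from a response-side filter."""
        self._cache[url] = response
```

An extension manager using this filter has to make sure that a
request passed on unchanged goes to the next URL filter and is
not handed to the caching filter again.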

Integrity protection, confidentiality, and packetizing  data  for
long  messages  require  symmetrical filtering also on the server
side. The Common Gateway Interface (CGI) [7] allows filtering  on
the  server  side but unfortunately, this interface only foresees
one filtering  step.  To  obtain  multiple  filtering  steps,  an
extension  manager  can  be  placed within this single filter and
emulate the flexibility and configurability  of  the  client-side
extension manager.

It is hoped that such a  framework  with  standardized  extension
manager  semantics and interfaces will allow arbitrary developers
to offer their own, compatible and powerful extensions.


Comparison with Existing Approaches
***********************************

In the final paper this section will discuss the current proposals,
namely:

 o Kristol's RFC for an HTTP extension mechanism [8],
 o NCSA's Common Client Interface (CCI) [9], already available as a
   prototype,
 o Spyglass's Software Development Interface [10],
 o the W3A API for WWW applets [11],
 o others.

These proposals are compared with our framework, and their
shortcomings are analyzed.   

Conclusions
***********

This paper has presented two short-term solutions for snapping
almost arbitrary extensions into current WWW browsers without
altering their source code or executables. Our example
illustrates how these two solutions enable electronic payments
with acceptable user interfaces and a high level of security
using existing browser technology. We have built two prototypes
that provided proof of these concepts.

The requirements for extra user input, which is logically not
justified, or for peculiar ``internal'' browser inter-component
communication are a sign that the technology was not originally
designed for the use described here.

Therefore we have stated our own view of a sound architecture for
flexibly configurable and composable WWW browsers of the future.


References
==========

1  Tim Berners-Lee, R. T. Fielding, and H. Frystyk Nielsen. Hypertext
   Transfer Protocol, March 1995. Internet Draft, Expires September 8, 1995. 

2  HyperText Markup Language Specification - 2.0. Internet Draft,
   February 1995. Expires June 19, 1995.  

3  J. Postel. Media type registration procedure. Internet Request for
   Comment RFC 1590, March 1994.  

4  Rawn Shah. The World Wide Web System as an Operating Environment.
   http://www.rtd.com/people/rawn/os-paper.html, 1994. 

5  Mihir Bellare, Juan Garay, Ralf Hauser, Amir Herzberg, Hugo
   Krawczyk, Michael Steiner, Gene Tsudik, and Michael Waidner. iKP --
   a family of secure electronic payment protocols. In First USENIX 
   Workshop on Electronic Commerce. USENIX, July 1995. 

6  Alan F. Slater. Extending W3 clients. In Second International
   Conference on the World-Wide Web, pages 899--908, Chicago, October 1994. 

7  Rob McCool. The Common Gateway Interface. NCSA, 1.1 edition, 1994. 

8  David M. Kristol. A proposed extension mechanism for HTTP. Internet
   Request for Comment RFC, January 1995. 

9  NCSA Mosaic Common Client Interface, September 1994. Version 1.0,
   http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/cci-spec.html. 

10 Paul Rohr. Software Development Interface, December 1994.
   http://www.spyglass.com/techreport/iapi.htm. 

11 Bert Bos. W3A an API for WWW applets, v1.1. http://www.let.rug.nl/
   bert/W3A/W3A.html, February 1995. 

Footnotes
*********

...interfaces fn1
   Shah [4] envisions using the WWW system as an operating environment
   and the browsers as generic user interfaces to other applications
   as well.

...browser fn2
   For example, the user has filled out an order form and passed it to
   the shop's server. A first plausibility check, for example, whether
   the required amount is in stock or whether it requires extra freight
   negotiations, can already be performed by the server before turning
   the information of this flow into the first flow of the payment
   protocol under the pertinent special MIME type.

...extensions fn3
   For example, communication-oriented post-processing such as
   packetizing bulk data, because the GET and POST methods remain the
   only available communication primitives.

...ProcessURL fn4
   ProcessURL together with RenderMIME replace the current
   ``remote-control'' features of some browsers. 

...module fn5
   By ``HTTP communications module'' we mean the piece of code that
   implements the HTTP semantics on a specific network
   infrastructure. Most current WWW browsers assume TCP/IP to be the
   infrastructure - in the future, this may change.