{Don’t} Repeat Yourself: Automatically Synthesizing Client-side Validation Code for Web Applications

Nazari Skrupsky; Maliheh Monshizadeh; Prithvi Bisht; Timothy Hinrichs; V.N. Venkatakrishnan; Lenore Zuck

Technical Sessions

All sessions will be held in Independence West unless otherwise noted.

The full WebApps '12 Proceedings are now available:

webapps12_proceedings.pdf

webapps12.epub

webapps12.mobi

Wednesday, June 13, 2012

9:00 a.m.–9:15 a.m.				Wednesday
Introductory Remarks Program Chair: Michael Maximilien, IBM Research—Watson
9:15 a.m.–10:30 a.m.				Wednesday
Keynote 1: Programming Style and Your Brain Douglas Crockford, PayPal Computer programs are the most complicated things humans make. They must be perfect, which is hard for us because we are not perfect. Programming is thought to be a "head" activity, but there is a lot of "gut" involved. Indeed, it may be the gut that gives us the insight necessary for solving hard problems. But gut messes us up when it come to matters of style.The systems in our brains that make us vulnerable to advertising and propaganda also influence our programming styles. This talk looks systematically at the development of a programming style that specifically improves the reliability of programs. The examples are given in JavaScript, a language with an uncommonly large number of bad parts, but the principles are applicable to all languages. Douglas Crockford was born in the wilds of Minnesota, but left when he was only six months old because it was just too damn cold. He turned his back on a promising career in television when he discovered computers. He has worked in learning systems, small business systems, office automation, games, interactive music, multimedia, location-based entertainment, social systems, and programming languages. He is the inventor of Tilton, the ugliest programming language that was not specifically designed to be an ugly programming language. He is best known for having discovered that there are good parts in JavaScript. This was an important and unexpected discovery. He also discovered the JSON Data Interchange Format, the world's best loved data format. "Programming Style and Your Brain" Presentation by Douglas Crockford Available Media Read more about "Programming Style and Your Brain" Presentation by Douglas Crockford
10:30 a.m.–11:00 a.m.				Wednesday
Break Constitution Foyer
11:00 a.m.–12:30 p.m.				Wednesday
Papers 1: JavaScript, Social Modeling and Reasoning about DOM Events Benjamin S. Lerner, Matthew J. Carroll, Dan P. Kimmel, Hannah Quay-de la Vallee, and Shriram Krishnamurthi, Brown University Web applications are fundamentally reactive. Code in a web page runs in reaction to events, which are triggered either by external stimuli or by other events. The DOM, which specifies these behaviors, is therefore central to the behavior of web applications. We define the first formal model of event behavior in the DOM, with high fidelity to the DOM specification. Our model is concise and executable, and can therefore be used for testing and verification. We have applied it in several settings: to establish some intended meta-properties of the DOM, as an oracle for testing the behavior of browsers (where it found real errors), to demonstrate unwanted interactions between extensions and validate corrections to them, and to examine the impact of a web sandbox. The model composes easily with models of other web components, as a step toward full formal modeling of the web. Available Media Jigsaw: Efﬁcient, Low-effort Mashup Isolation James Mickens, Microsoft Research; Matthew Finifter, University of California, Berkeley A web application often includes content from a variety of origins. Securing such a mashup application is challenging because origins often distrust each other and wish to expose narrow interfaces to their private code and data. Jigsaw is a new framework for isolating these mashup components. Jigsaw is an extension of the JavaScript language that can be run inside standard browsers using a Jigsaw-to-JavaScript compiler. Unlike prior isolation schemes that require developers to specify complex, error-prone policies, Jigsaw leverages the well-understood public/private keywords from traditional object-oriented languages, making it easy for a domain to tag internal data as externally visible. Jigsaw provides strong iframe-like isolation, but unlike previous approaches that use actual iframes as isolation containers, Jigsaw allows mutually distrusting code to run inside the same frame; this allows scripts to share state using synchronous method calls instead of asynchronous message passing. Jigsaw also introduces a novel encapsulation mechanism called surrogates. Surrogates allow domains to safely exchange objects by reference instead of by value. This improves sharing efficiency by eliminating cross-origin marshaling overhead. Available Media Social Networks Proﬁle Mapping Using Games Mohamed Shehab, Moo Nam Ko, and Hakim Touati, University of North Carolina at Charlotte Mapping user profiles across social network sites enables sharing and interactions between social networks, which enriches the social networking experience. Manual mapping for user profiles is a time consuming and tedious task. In addition profile mapping algorithms are inaccurate and are usually based on simple name or email string matching. In this paper, we propose a Game With A Purpose (GWAP) approach to solve the profile mapping problem. The proposed approach leverages the game appeal and social community to generate the profile mappings. We designed and implemented an online social networking game (GameMapping), the game is fun and is based on human verification. The game presents the players with some profiles information, and uses human computation and knowledge about the information being presented to map similar user profiles. The game was modeled using incomplete information game theory, and a proof of sequential equilibrium was provided. To test the effectiveness of the mapping technique and detection strategies, the game was implemented and deployed on Facebook, MySpace and Twitter and the experiments were performed on the real data collected from users playing the game. Available Media
12:30 p.m.–1:30 p.m.				Wednesday
FCW Luncheon Back Bay CD
1:30 p.m.–3:30 p.m.				Wednesday
Papers 2: Distributed Systems and Browser Ext Hybrid Cloud Support for Large Scale Analytics and Web Processing Navraj Chohan, Anand Gupta, Chris Bunch, Kowshik Prakasam, and Chandra Krintz, University of California, Santa Barbara Platform-as-a-service (PaaS) systems, such as Google App Engine (GAE), simplify web application development and cloud deployment by providing developers with complete software stacks: runtime systems and scalable services accessible from well-defined APIs. Extant PaaS offerings are designed and specialized to support large numbers of concurrently executing web applications (multi-tier programs that encapsulate and integrate business logic, user interface, and data persistence). To enable this, PaaS systems impose a programming model that places limits on available library support, execution duration, data access, and data persistence. Although successful and scalable for web services, such support is not as amenable to online analytical processing (OLAP), which have variable resource requirements and require greater flexibility for ad-hoc query and data analysis. OLAP of web applications is key to understanding how programs are used in live settings. In this work, we empirically evaluate OLAP support in the GAE public cloud, discuss its benefits, and limitations. We then present an alternate approach, which combines the scale of GAE with the flexibility of customizable offline data analytics. To enable this, we build upon and extend the AppScale PaaS – an open source private cloud platform that is API-compatible with GAE. Our approach couples GAE and AppScale to provide a hybrid cloud that transparently shares data between public and private platforms, and decouples public application execution from private analytics over the same datasets. Our extensions to AppScale eliminate the restrictions GAE imposes and integrates popular data analytic programming models to provide a framework for complex analytics, testing, and debugging of live GAE applications with low overhead and cost. Available Media Poor Man's Social Network: Consistently Trade Freshness for Scalability Zhiwu Xie, Virginia Polytechnic Institute and State University; Jinyang Liu, Howard Hughes Medical Institute; Herbert Van de Sompel, Los Alamos National Laboratory; Johann van Reenen and Ramiro Jordan, University of New Mexico Typical social networking functionalities such as feed following are known to be hard to scale. Different from the popular approach that sacrifices consistency for scalability, in this paper we describe, implement, and evaluate a method that can simultaneously achieve scalability and consistency in feed following applications built on shared-nothing distributed systems. Timing and client-side processing are the keys to this approach. Assuming global time is available at all the clients and servers, the distributed servers publish a pre-agreed upon schedule based on which the continuously committed updates are periodically released for read. This opens up opportunities for caching and client-side processing, and leads to scalability improvements. This approach trades freshness for scalability. Following this approach, we build a twitter-style feed following application and evaluate it on a following network with about 200,000 users under synthetic workloads. The resulting system exhibits linear scalability in our experiment. With 6 low-end cloud instances costing a total of no more than $1.2 per hour, we recorded a peak timeline query rate at about 10 million requests per day, under a fixed update rate of 1.6 million new tweets per day. The maximum staleness of the responses is 5 seconds. The performance achieved sufficiently verifies the feasibility of this approach, and provides an alternative to build small to medium size social networking applications on the cheap. Available Media Executing Web Application Queries on a Partitioned Database Neha Narula and Robert Morris, MIT CSAIL Partitioning data over multiple storage servers is an attractive way to increase throughput for web-like workloads. However, there is often no one partitioning that yields good performance for all queries, and it can be challenging for the web developer to determine how best to execute queries over partitioned data. This paper presents DIXIE, a SQL query planner, optimizer, and executor for databases horizontally partitioned over multiple servers. DIXIE focuses on increasing inter-query parallel speedup by involving as few servers as possible in each query. One way it does this is by supporting tables with multiple copies partitioned on different columns, in order to expand the set of queries that can be satisified from a single server. DIXIE automatically transforms SQL queries to execute over a partitioned database, using a cost model and plan generator that exploit multiple table copies. We evaluate DIXIE on a database and query stream taken from Wikipedia, partitioned across ten MySQL servers. By adding one copy of a 13 MB table and using DIXIE’s query optimizer, we achieve a throughput improvement of 3.2X over a single optimized partitioning of each table and 8.5X over the same data on a single server. On specific queries DIXIE with table copies increases throughput linearly with the number of servers, while the best single-table-copy partitioning achieves little scaling. For a large class of joins, which traditional wisdom suggests requires tables partitioned on the join keys, DIXIE can find higher-performance plans using other partitionings. Available Media Gibraltar: Exposing Hardware Devices to Web Pages Using AJAX Kaisen Lin, UC San Diego; David Chu, James Mickens, Li Zhuang, and Feng Zhao, Microsoft Research; Jian Qiu, National University of Singapore Gibraltar is a new framework for exposing hardware devices to web pages. Gibraltar’s fundamental insight is that JavaScript’s AJAX facility can be used as a hardware access protocol. Instead of relying on the browser to mediate device interactions, Gibraltar sandboxes the browser and uses a small device server to handle hardware requests. The server uses native code to interact with devices, and it exports a standard web server interface on the localhost. To access hardware, web pages send device commands to the server using HTTP requests; the server returns hardware data via HTTP responses. Using a client-side JavaScript library, we build a simple yet powerful device API atop this HTTP transfer protocol. The API is particularly useful to developers of mobile web pages, since mobile platforms like cell phones have an increasingly wide array of sensors that, prior to Gibraltar, were only accessible via native code plugins or the limited, inconsistent APIs provided by HTML5. Our implementation of Gibraltar on Android shows that Gibraltar provides stronger security guarantees than HTML5; furthermore, it shows that HTTP is responsive enough to support interactive web pages that perform frequent hardware accesses. Gibraltar also supports an HTML5 compatibility layer that implements the HTML5 interface but provides Gibraltar’s stronger security. Available Media
3:30 p.m.–4:00 p.m.				Wednesday
Break Constitution Foyer
4:00 p.m.–5:00 p.m.				Wednesday
Keynote 2: Managing Asynchrony: An Application Perspective Yehuda Katz, Tilde, Inc. "Managing Asynchrony: An Application Perspective" Presentation Available Media Read more about "Managing Asynchrony: An Application Perspective" Presentation
5:00 p.m.–6:15 p.m.				Wednesday
Demo papers: Client and JavaScript LIBERATED: A Fully In-Browser Client and Server Web Application Debug and Test Environment Derrell Lipman, University of Massachusetts Lowell Traditional web-based client-server application development has been accomplished in two separate pieces: the frontend portion which runs on the client machine has been written in HTML and JavaScript; and the backend portion which runs on the server machine has been written in PHP, ASP.net, or some other “server-side” language which typically interfaces to a database. The skill sets required for these two pieces are different. In this paper, I demonstrate a new methodology for web-based client-server application development, in which a simulated server is built into the browser environment to run the backend code. This allows the frontend to issue requests to the backend, and the developer to step, using a debugger, directly from frontend code into backend code, and to debug and test both the frontend and backend portions. Once working, that backend code is moved to a real server. Since the application-specific code has been tested in the simulated environment, it is unlikely that bugs will be encountered at the server that did not exist in the simulated environment. I have implemented this methodology and used it for development of a live application. All of the code is open source. Available Media JavaScript in JavaScript (js.js): Sandboxing Third-Party Scripts Jeff Terrace, Stephen R. Beard, and Naga Praveen Kumar Katta, Princeton University View the slides online. Running on billions of today’s computing devices, JavaScript has become a ubiquitous platform for deploying web applications. Unfortunately, an application developer who wishes to include a third-party script must enter into an implicit trust relationship with the third-party—granting it unmediated access to its entire application content. In this paper, we present js.js, a JavaScript interpreter (which runs in JavaScript) that allows an application to execute a third-party script inside a completely isolated, sandboxed environment. An application can, at runtime, create and interact with the objects, properties, and methods available from within the sandboxed environment, giving it complete control over the third-party script. js.js supports the full range of the JavaScript language, is compatible with major browsers, and is resilient to attacks from malicious scripts. We conduct a performance evaluation quantifying the overhead of using js.js and present an example of using js.js to execute Twitter’s Tweet Button API. Available Media Aperator: Making Tweets Enable Actionable Commands on Third Party Web Applications Peter Zakin, Soumya Sen, and Mung Chiang, Princeton University Twitter has become a persistent part of our digital lives, connecting us not only to our individual audiences but also to an entire landscape of applications built for the web. While much has been done to support the Twitter ecosystem outside of Twitter, little has been done within Twitter to power those same applications. This work introduces a service called Aperator, which supports application-specific actionable commands through tweets. This ability creates several interesting opportunities for both end-users and application developers building on the Twitter platform. For example, the actionable command capability allows a link that a Twitter user shares with his followers to be directly added to any of the user’s connected link sharing networks, such as Delicious or Read it Later. The client side of this system has a console for end-users to sign up and provide their login credentials for various web services that our system supports: Delicious, Foursquare, Read it Later, Foursquare etc. The system’s backend has two cron jobs that run every minute to: (a) retrieve and parse tweets from a specific twitter account and store them in a command form in a MySQL database, and (b) execute the unexecuted commands found in the users tweets. This paper describes the concept, implementation, and results from an experimental study of this new application. Available Media Don’t Repeat Yourself: Automatically Synthesizing Client-side Validation Code for Web Applications Nazari Skrupsky, Maliheh Monshizadeh, Prithvi Bisht, Timothy Hinrichs, V.N. Venkatakrishnan, and Lenore Zuck, University of Illinois at Chicago We outline the groundwork for a new software development approach where developers author the server-side application logic and rely on tools to automatically synthesize the corresponding client-side application logic. Our approach uses program analysis techniques to extract a logical speciﬁcation from the server and synthesizes client code from that speciﬁcation. Our implementation (WAVES) synthesizes interactive client interfaces that include asynchronous callbacks whose performance and coverage rival that of manually written clients, while ensuring that no new security vulnerabilities are introduced. Available Media