Real-Time User-Centric Management of Time-Intensive Analytics Using Convergence of Local Functions
Vinay Deolalikar, HP-Autonomy Research
The past decade has witnessed astonishing growth in unstructured information in enterprises, and the commercial value locked in that information is increasingly recognized. Accordingly, a range of textual document analytics (clustering, classification, taxonomy generation, provenance, etc.) has taken center stage as a potential means to manage this explosive growth and unlock its value.
Several analytics are time-intensive: processing the increasingly large volumes of data takes significantly longer than real-time operation allows. At the same time, users increasingly demand real-time services that rely on such time-intensive analytics. There is a clear tension between these two developments.
In light of this, vendors increasingly realize that although an analytic may take a long time to converge, they need to extract useful information from it in real time. Furthermore, this information has to be application-driven. In other words, it is often not an option to simply "wait until the analytic has finished running": vendors must start providing the user with information while the analytic is still running. In summary, there is an emerging emphasis in Enterprise Information Management (EIM) on extracting application-driven, real-time information from time-intensive analytics.
A priori, it is not clear what could be extracted from an analytic that has yet to complete, or whether any such information would be useful. At present, there is little or no research literature on this problem: it is generally assumed that all of the information from an analytic becomes available only upon its completion.
We present an approach to this problem that is based on decomposing the objective function of the analytic, which is a global function that determines the progress of the analytic, into multiple local, user-centric functions. How can we construct meaningful local functions? How can such functions be measured? How do these functions evolve with time? Do these functions encode useful information that can be obtained in real time? These are the questions we address in this paper.
We demonstrate our approach on document clustering with the de facto standard algorithm, k-means. In this case, the multiple local user-centric functions transform k-means into a flow algorithm, with each local function measuring a flow. Our results show that these flows evolve very differently from the global objective function and, in particular, often converge quickly at many local sites. Using this property, we are able to extract useful information considerably earlier than the time k-means takes to converge to its final state.
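The abstract does not define the local functions or the flows, so the following is only a minimal sketch of the general idea under a stated assumption: here the "flow" at a cluster is taken to be the number of documents entering or leaving it in one k-means iteration. The names `kmeans_with_flows` and `first_quiet_iteration` are hypothetical, and documents are represented as plain numeric tuples. The sketch records the global objective (sum of squared distances) next to the per-cluster flows, so one can see when an individual cluster stops changing even though the run as a whole has not finished.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its current members."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centroids.append(tuple(
                sum(m[d] for m in members) / len(members)
                for d in range(len(old))))
        else:
            new_centroids.append(old)  # empty cluster: keep its centroid
    return new_centroids


def kmeans_with_flows(points, k, max_iter=50, seed=0):
    """Lloyd's k-means that also records per-cluster flows.

    Assumption (not from the paper): the flow at a cluster is the number
    of points that entered or left it during one iteration.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = assign(points, centroids)
    history = []  # one (global_objective, flows_per_cluster) pair per iteration
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        flows = [0] * k
        for old, new in zip(labels, new_labels):
            if old != new:
                flows[old] += 1  # one point left cluster `old`
                flows[new] += 1  # and entered cluster `new`
        objective = sum(sq_dist(p, centroids[lab])
                        for p, lab in zip(points, new_labels))
        history.append((objective, flows))
        labels = new_labels
        if not any(flows):  # no point moved: the whole run has converged
            break
    return labels, centroids, history


def first_quiet_iteration(history, cluster):
    """Earliest iteration from which `cluster` sees no flow until the end.

    Returns len(history) if the cluster was still changing at the end.
    """
    quiet = len(history)
    for t in range(len(history) - 1, -1, -1):
        if history[t][1][cluster] != 0:
            break
        quiet = t
    return quiet


if __name__ == "__main__":
    rng = random.Random(1)
    centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    data = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
            for cx, cy in centers for _ in range(30)]
    _, _, hist = kmeans_with_flows(data, k=3)
    print("iterations run:", len(hist))
    for j in range(3):
        print("cluster", j, "quiet from iteration", first_quiet_iteration(hist, j))
```

A cluster whose quiet iteration is smaller than the total number of iterations is one whose membership could have been reported to a user before the global run finished, which is the kind of early, local information the abstract describes. Whether the paper's own local functions behave this way is not something this sketch can establish.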
We believe that such pragmatic approaches will have to be taken in order to manage systems performing analytics on large volumes of unstructured data.
author = {Vinay Deolalikar},
title = {{Real-Time} {User-Centric} Management of {Time-Intensive} Analytics Using Convergence of Local Functions},
booktitle = {10th International Conference on Autonomic Computing (ICAC 13)},
year = {2013},
isbn = {978-1-931971-02-7},
address = {San Jose, CA},
pages = {167--173},
url = {https://www.usenix.org/conference/icac13/technical-sessions/presentation/deolalikar},
publisher = {USENIX Association},
month = jun
}