Using Hadoop for Webscale Computing
Apache Hadoop is an open-source framework that combines a distributed filesystem and a map-reduce programming model in one package. Hadoop scales smoothly from tens to thousands of computers. The framework lets engineers harness the power of these clusters very simply, taking advantage of three major features:
- A reliable distributed filesystem that requires no specialized hardware: Hadoop DFS runs on any number of commodity nodes, pooling their combined storage while managing replication and recovery from failure.
- A simple, functional programming model: Hadoop Map-Reduce is a parallelized implementation of a very simple programming style built on the map and reduce primitives long familiar from functional programming languages.
- Infrastructure to aid in the automation of job execution: Hadoop automates bringing user code to the data, and it manages parallel execution and handles node failure.
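In a real Hadoop job the map and reduce functions are written against Hadoop's Java API, and the framework handles the distribution, shuffling, and fault tolerance described above. The following plain-Python sketch (an illustration, not Hadoop code) shows only the programming model itself, using the classic word-count example:

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework does
    between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts emitted for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

records = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(records)))
```

The programmer supplies only the map and reduce functions; everything else (partitioning the input, moving intermediate pairs between nodes, re-running failed tasks) is the framework's job, which is why the model parallelizes so easily.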
The talk will provide an overview of Apache Hadoop, along with examples of how this infrastructure is being used at Yahoo! and other organizations today.
@inproceedings{anand-hadoop-2008,
author = {Ajay Anand},
title = {Using Hadoop for Webscale Computing},
year = {2008},
address = {Boston, MA},
publisher = {USENIX Association},
month = jun
}