Hadoop Scales Really Well - Yahoo! Launches World’s Largest Hadoop Production Application
Posted by: Marc Boucher in Search Engine Technology
Some exciting news today from Eric Baldeschwieler, Senior Director, Grid Computing, on the Yahoo Developer Network: Yahoo! Launches World’s Largest Hadoop Production Application. I’ll note that my company, Hyperix, is using Hadoop for our vertical search platform.
Here are some of the Webmap size stats:
* Number of links between pages in the index: roughly 1 trillion links
* Size of output: over 300 TB, compressed!
* Number of cores used to run a single Map-Reduce job: over 10,000
* Raw disk used in the production cluster: over 5 Petabytes
Tags: Hadoop, Yahoo Search













