Posts Tagged “Hadoop”

Hadoop
Some exciting news today from Eric Baldeschwieler, Senior Director, Grid Computing, on the Yahoo Developer Network: Yahoo! Launches World’s Largest Hadoop Production Application. I’ll note that my company, Hyperix, is using Hadoop for our vertical search platform.

Here are some of the stats:

Some Webmap size data:

* Number of links between pages in the index: roughly 1 trillion links
* Size of output: over 300 TB, compressed!
* Number of cores used to run a single Map-Reduce job: over 10,000
* Raw disk used in the production cluster: over 5 Petabytes



Since we’re getting closer to the public unveiling of the first vertical search engine produced from the Hyperix search platform, I think it’s time to release a couple of tidbits about the platform. Hints have been out there for a while, and some people within the search community are already aware of this, but one of the primary components of Hyperix is Hadoop.

Here’s the quick intro to what Hadoop does, from the open source site:

(more…)
