Posts Tagged “Yahoo Search”

Yahoo Search Blog
In his latest entry on the Yahoo Search Blog, Vish Makhijani discusses “Yahoo! Search An Open Approach to Search”. This post builds on last week’s announcement of the largest Hadoop production application, and I love it. It’s innovative, especially for content producers: we finally get a say in the output of Yahoo’s search results like never before. Whether you’re a content producer or a searcher, you can sign up for more information here.

“Because the platform is open it gives all Web site owners — big or small — an opportunity to present more useful information on the Yahoo! Search page as compared to what is presented on other search engines. Site owners will be able to provide all types of additional information about their site directly to Yahoo! Search. So instead of a simple title, abstract and URL, for the first time users will see rich results that incorporate the massive amount of data buried in websites — ratings and reviews, images, deep links, and all kinds of other useful data — directly on the Yahoo! Search results page.”


Hadoop
Some exciting news today from Eric Baldeschwieler, Senior Director, Grid Computing, on the Yahoo Developer Network: “Yahoo! Launches World’s Largest Hadoop Production Application”. I’ll note that my company, Hyperix, is using Hadoop for our vertical search platform.

Here are some of the stats:

Some Webmap size data:

* Number of links between pages in the index: roughly 1 trillion links
* Size of output: over 300 TB, compressed!
* Number of cores used to run a single Map-Reduce job: over 10,000
* Raw disk used in the production cluster: over 5 Petabytes


Over at the Yahoo! Search blog, Sharad Verma recaps WebmasterWorld’s Pubcon. I could not attend, but it sounds like I missed a good conference and a keynote relevant to Hyperix. The keynote’s message is nothing new to me, but it’s nice to see other people talking about it.

Noteworthy Keynote

I thought that Richard Rosenblatt’s keynote on Wednesday delivered sound insight. According to Richard, most content online today is about sports, politics, news, and other common topics, leaving long tail topics underserved. He emphasized that there is significant demand for quality content in the long tail and therefore an unaddressed opportunity to create content and capitalize on the monetization opportunities. Susan Esparza at Bruce Clay, Inc. live-blogged from the address, where Richard revealed, “The old model was about owning a generic domain name (pets.com). The new is that the search engines don’t care where you are. Get a one or two word domain on a nontraditional domain. Target the wide body and the long tail.”
