John Battelle on WebFountain
[John Battelle](http://battellemedia.com/) has finally posted about WebFountain, the corporate search engine being developed inside IBM. I had been waiting for this for some time because there is so little information on WebFountain out there, and John was sure to get to the bottom of it to some extent. Sure enough, he did.
There are some very interesting things in the post, for example: _The platform has been designed to encompass different approaches and paradigms and make the results of each available to the others._
What this means is that you do not go over the content just once: you could run a Google-type query, including something similar to PageRank, and then go over the results with a totally different method. And the cool thing is that you can plug stuff in and out, changing the order and how each piece is used. This is a query you would do in WebFountain:
_“Give me all the documents on the web which have at least one page of content in Arabic, are located in the Midwest, and are connected to at least two similar documents but are not connected to the official Al Jazeera website, and mention anyone on a specified list of suspected terrorists.”_
or:
_“Tell me all the places on the web where “The Passion of the Christ” is discussed that also mentions one of the top five box office movies that is not Lord of the Rings, and throw out all sites that either are in Spanish, or are in the Southern hemisphere. Oh, and translate the ones that are not in English when you return results.”_
Now how cool is that? :) IBM is also using semantics in the search engine and is really moving into the corporate market, which has a suggested price tag of $15 billion in search today. Don’t expect them to replace Google, though. The stuff they do is too processor-intensive and, it seems, just not scalable. So it will be more in the few-but-important-and-complicated-queries market. By the way, they can reindex the web in 24 hours. :)
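To make the plug-things-in-and-out idea a bit more concrete, here is a minimal Python sketch of how chained, reorderable stages might look. This is purely my own illustration and not WebFountain's actual API: the `Doc` fields, the stage names (`in_language`, `in_region`, `not_linked_to`, `mentions_any`), and the example URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Doc:
    """A toy document record; every field here is an assumption."""
    url: str
    language: str
    region: str
    text: str
    links: List[str] = field(default_factory=list)


# A stage takes a list of documents and returns a (usually smaller) list.
Stage = Callable[[List[Doc]], List[Doc]]


def in_language(lang: str) -> Stage:
    return lambda docs: [d for d in docs if d.language == lang]


def in_region(region: str) -> Stage:
    return lambda docs: [d for d in docs if d.region == region]


def not_linked_to(url: str) -> Stage:
    return lambda docs: [d for d in docs if url not in d.links]


def mentions_any(names: Iterable[str]) -> Stage:
    wanted = [n.lower() for n in names]
    return lambda docs: [
        d for d in docs if any(n in d.text.lower() for n in wanted)
    ]


def run_pipeline(docs: List[Doc], stages: List[Stage]) -> List[Doc]:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return docs


corpus = [
    Doc("a.example", "ar", "midwest", "report mentioning Alice", ["b.example"]),
    Doc("b.example", "ar", "midwest", "report mentioning Bob", ["news.example"]),
    Doc("c.example", "en", "south", "nothing relevant here"),
]

stages = [
    in_language("ar"),
    in_region("midwest"),
    not_linked_to("news.example"),
    mentions_any(["Alice", "Bob"]),
]

result = run_pipeline(corpus, stages)
print([d.url for d in result])
```

Because every stage has the same shape (documents in, documents out), you can drop one, add another, or shuffle the order without touching the rest. The real system presumably does far heavier analysis per stage, which would fit with the processor cost mentioned above.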

