Crawling to the people

Yaniv let the cat out of the bag about some of our ideas for making other parts of the search and its relevant data open, free and accessible to all of us.

I’d thought I’ll add some background and my thoughts on the subject.

First, the idea was iterated a couple of times when we were in that place where you have a solution(s) and you are seeking a problem(s) to solve.

It all started from this post by Jeremie Miller. Jeremie, being the good guy that he is, was thinking about create standards and protocols to make the crawling, processing and sharing of data for search and search engines public, free and accessible. While neither Yaniv nor I are in Jeremie’s loop and have no idea of what he is up to (but you can count on it to be interesting, that’s for sure), we talked about it a bit and it sunk in.

We both liked the idea of having the raw data accessible as well as being able to run custom post processors that can make something useful out of it so that no one is tied to whatever logic and algorithms the crawler writer enforces.

Then came the announcement from Kevin Burton about spinn3r, a service that re uses the web index of the Blogosphere crawled by TailRank’s crawler and allows you (and everyone else) to use that crawled data.

This information also sunk in and today at lunch (which did take quite a while :-) ) we started to brainstorm about it a bit more seriously.

This can really open up and innovate search from the bottom up. Give access to a lot of people to APIs and capabilities that were previously only available for big companies. This is the platform that can create something very interesting.

We would love to hear your comments.