Search a Search?
I have been searching for an open source search engine for a while. A year ago at WOS 2004 I heard a talk about Nutch. But this was ‘only’ an attempt to build something like Google, i.e. a web search, so its topics are more related to the problems of crawler distribution and consolidation of search data.
That’s not my main interest. A first step should be something like a framework, a basic modular system that is able to dig into all kinds of data formats on a local basis. Which means some workstations, personal storage, one or two servers, and all the removable media (USB stick, CD, iPod) that enters or leaves my digital computing equipment.
This is where I came across the GNOME Beagle project. Based on the Mono environment, it is a free implementation of a full-blown desktop search engine with further capabilities. I was astonished to find out that almost on the same day SUSE announced Beagle for its upcoming release 9.3. Sure, this is because Beagle was founded by some Novell guys, but I’m sure that this application will emerge on a wide range of distributions soon. Until then, everyone who wants to try it has to get their hands dirty… C# dirty… I know that at this point the development platform it uses could become a problem (see ‘Why Mono is Currently An Unacceptable Risk’), but hey, stay tough.
I just reloaded the Nutch page after a few months, and wow: after it had gotten a little bit quiet around this project, they joined the Apache Incubator. A little step in the right direction? The direction where every content provider will also provide the necessary search index and processing power? We’ll see. Keep it up, Doug.
In general I’m a little bit worried about the topic of open source search engines, because compared with other technical areas the open source scene has missed some development in recent years. Yes, search engine design is a hard topic, both technically and in terms of development coordination, but it’s not about tinkering with secret magic algorithms and discussing the underlying architecture.
What’s more? Beyond Beagle? I think that with the growing presence of marketing search engines (like Google) there is a larger need for free and documented content search. It is not only your local hard disk and not only the WWW: there is so much more content that wants to be found by free algorithms and not proprietary ones. .) The rest will emerge by itself: distribution via modern sharing systems will be used, and bandwidth, to come back to Nutch’s web crawler point, gets cheaper and cheaper.
I hope that we’ll see rapid development in this area during the next few years. For me, large commercial search engines have passed their peak ‘as search engines’. Yes, they will evolve in the commercial market, but not as a technical search base for the masses. That has to be our part.