
Re: Kinosearch: msg#00062

Subject: Re: Kinosearch

On Jun 30, 2005, at 7:02 AM, ed phillips wrote:

Congratulations Marvin,

Thank you.

It seems to me that theoretically there is no reason why a Perl
implementation can't be faster than Lucene. Plenty of aspects of
Lucene could be speeded up.

My question is, and please forgive someone with more curiosity than
time at the moment (production obligations call me), are you using
enough of the design behind Lucene, such as the scoring formula,
to be considered Lucene-based, if not a Lucene port?

Not really. There are a few areas in which Lucene/Plucene has provided inspiration for Kinosearch, but that's also true for mnoGoSearch, Xapian, Egothor, Search::FreeText, a few search engine articles on Perl.com and elsewhere, etc. The two biggest influences on Kinosearch are Lucene/Plucene and, believe it or not, mnoGoSearch. mnoGoSearch was the first search engine I experimented with extensively, and like mnoGoSearch, Kinosearch was originally based on a MySQL backend. Boy, that was a while ago!

My other question is, why not roll your work into Plucene?

That would require a torso transplant for Plucene.  You up for that?  ;)

It would be possible to rearrange chunks of Kinosearch to superficially resemble Lucene. In fact, that's probably a good idea -- though I think that it may be possible to choose names which are slightly more intuitive. (The main class for indexing is "Kindexer". The main class for searching is "KSearch". The name "Kinosearch" contains the word "search". "Kino" is the main character in John Steinbeck's novel "The Pearl", but the real benefit of the name "Kinosearch" is that it lightens the burden on the horrendously overloaded term "index" -- Kinosearch's indexes are "kindexes", and Kinosearch's indexer is a "Kindexer".)

The low-level stuff in Kinosearch is pretty different, though. There are a lot fewer classes. And Lucene/Plucene is so tightly integrated. I don't think you could swap parts of Kinosearch into Plucene -- I think you'd have to start with Kinosearch and abstract out classes analogous to Lucene's.

Also, it has looked to me for some time like the Plucene project, if not quite dead, had reached a persistent vegetative state. When I first wrote to the list last July regarding performance issues, no one even responded. :(

or did you
crib from Plucene and fork?

My earliest experiments in search engine tinkering were with the Perl parts of mnoGoSearch, and were informed by some hard core MySQL optimization I'd had to do as part of a logfile analysis system, and data compression techniques I'd studied during my stint as an audio mastering engineer. Kindexer.pm and KSearch.pm started out as monolithic scripts, kindexer.plx and ksearch.cgi. Things have evolved (quite a lot) from there.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

