> > Has anyone compared the query performance between Plucene
> > and Lucene?
> > I saw mention of slower indexing, what about query
> > performance?
>
> It's also considerably slower.
>
I would also look at resource (CPU/memory) consumption with
Plucene. We've had issues with this in medium to large deployments.
Integrating memcached helped considerably, but in the end we decided
to stick with Lucene or Lucene.NET for anything of scale.
Another thing, and I hesitate to bring this up since we have not fully
debugged it, is that we have seen runaway CGI/Plucene processes.
These would run for minutes at a time, consuming 30-50% CPU and
20-40MB of RAM.
With strace we saw this pattern repeating over and over:
read(9, "\3\0042726\2\1\2\2\2\0011\2\3\2\2\3\0015\2\1\5\3\3\002"..., 4096) = 4096
_llseek(9, 48078, [48078], SEEK_SET) = 0
_llseek(9, 0, [48078], SEEK_CUR) = 0
_llseek(9, 47182, [47182], SEEK_SET) = 0
_llseek(9, 0, [47182], SEEK_CUR) = 0
read(9, "\3\0042726\2\1\2\2\2\0011\2\3\2\2\3\0015\2\1\5\3\3\002"..., 4096) = 4096
...which looks like a looping index read. This would run for minutes on end
and then finally exit.
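
For anyone who wants to check their own traces for the same thing, here is a
rough sketch of how the repetition could be confirmed from saved strace output.
It is not something we ran in production. The helper names (normalize,
find_cycle) are made up for illustration, and it assumes one syscall per line
with the file descriptor as the first argument, as in the trace above.

```python
import re

# Captures the syscall name, the file descriptor, and the second argument.
CALL_RE = re.compile(r"^(\w+)\((\d+),\s*([^,]*)")

def normalize(line):
    """Reduce one strace line to (syscall, fd, seek offset or None)."""
    m = CALL_RE.match(line.strip())
    if m is None:
        return None  # continuation lines, signals, etc. are skipped
    name, fd, arg = m.group(1), int(m.group(2)), m.group(3)
    # Only seek calls carry an offset worth comparing.
    offset = int(arg) if name.endswith("seek") and arg.isdigit() else None
    return (name, fd, offset)

def find_cycle(lines, min_repeats=2):
    """Return the shortest call sequence that repeats back to back
    over the whole trace at least min_repeats times, else None."""
    calls = [c for c in map(normalize, lines) if c is not None]
    for period in range(1, len(calls) // min_repeats + 1):
        if all(calls[i] == calls[i % period] for i in range(len(calls))):
            return calls[:period]
    return None
```

Feeding it a window of the trace (say, a few hundred lines from the middle of a
slow request) and getting back a short cycle with identical seek offsets would
support the "looping index read" reading. It says nothing about why the loop
happens.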
As I said, I'm not clear on what exactly triggers this, but it was
happening with about 5-10% of our search requests and would start to
snowball if we did not monitor things closely.
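
The kind of monitoring meant here can be sketched as a small filter over ps
output. This is an illustration only, not our actual tooling: the function
name, the process name "search.cgi", and the 60-second limit are all
assumptions, and the 30% CPU floor is simply the low end of what we observed.
It expects text in the shape produced by `ps -eo pid,etimes,pcpu,comm` on
Linux (procps), where pcpu is averaged over the life of the process.

```python
def find_runaways(ps_output, name="search.cgi", max_seconds=60, min_cpu=30.0):
    """Return PIDs of processes called `name` that have been running
    longer than max_seconds while averaging at least min_cpu percent CPU."""
    runaways = []
    for line in ps_output.splitlines()[1:]:  # first line is the header
        fields = line.split(None, 3)
        if len(fields) != 4:
            continue  # blank or malformed line
        pid, elapsed, cpu, comm = fields
        if (comm.strip() == name
                and int(elapsed) > max_seconds
                and float(cpu) >= min_cpu):
            runaways.append(int(pid))
    return runaways
```

What to do with the returned PIDs (log, alert, kill) is a separate decision.
Killing them only hides the symptom, but it does stop the snowballing.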
-Brice