April 28, 2007

Update an indexed record with Zend Search Lucene.

Filed under: PHP — Marcus @ 11:00 pm

Looking at the few tutorials available on zftutorials, there’s not much in the way of a proper “here’s a usable search engine based on Zend_Search_Lucene”. While building one, I ran into the problem of updating existing documents. Lucene has no in-place update: you delete the old document, then re-add the new version. That’s fine, as long as you have a way of uniquely identifying the document in the index. So, something like the DB’s unique ID, stored in its own field (say ‘id’) in the index? Well, I thought so, until I realised that nothing was actually being removed. It turns out that the default Lucene analyzer treats digits like whitespace characters, so looking a document up by a purely numeric ID finds nothing to delete. One option, which I’ve implemented for now (whether it stays depends on speed, and on whether I end up needing proper number indexing), is to convert each digit of the ID into a lower-case letter. The other is to write a custom analyzer that treats numbers with the respect they deserve.
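For what it’s worth, the digit-to-letter workaround can be sketched roughly like this. The function names `encodeIdForIndex` and `decodeIdFromIndex` are made up for illustration (they’re not part of Zend_Search_Lucene), and the 0 → a … 9 → j mapping is just one assumed choice; any one-to-one mapping onto letters would do.

```php
<?php
// Hypothetical helpers, not part of Zend_Search_Lucene.
// Assumed mapping: 0 => a, 1 => b, ... 9 => j, so the ID contains
// only letters and survives a letters-only analyzer.

function encodeIdForIndex($id)
{
    $digits = (string) $id;
    // Only accept plain non-negative integers such as 42 or "0042".
    if (!ctype_digit($digits)) {
        throw new InvalidArgumentException('ID must consist of digits only');
    }
    // strtr() swaps each digit for the letter at the same position.
    return strtr($digits, '0123456789', 'abcdefghij');
}

function decodeIdFromIndex($token)
{
    // Only accept tokens made up of the letters used by the mapping.
    if (!preg_match('/^[a-j]+$/', $token)) {
        throw new InvalidArgumentException('Token must consist of letters a-j only');
    }
    // Reverse mapping; returned as a string so leading zeros are kept.
    return strtr($token, 'abcdefghij', '0123456789');
}
```

With this mapping, `encodeIdForIndex(1205)` gives `bcaf`. The idea would be to store the encoded value in the ‘id’ field when adding a document, then search on the same encoded value to find the old copy, delete it and add the new version. The mapping keeps the ID’s length and is trivially reversible, but the encoded values obviously can’t be used for numeric range searches.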

Extending Search Analysis
