Monday, June 25, 2012

What is the theory behind Apache Lucene?

There is a recurring request from users to have more insight into Lucene internals. For example, see:

Although most of the ideas behind Lucene are explained in any good book on information retrieval, Lucene also implements some advanced algorithms for specific tasks. In these cases, it is probably easier to read an article describing the idea than to reverse-engineer the code. This is why I started a wiki page to collect links to research papers and blog articles that explain some of the advanced ideas behind Lucene.

Feel free to help me improve this wiki page by suggesting Lucene algorithms that deserve an entry on it!