<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><atom:link rel="hub" href="http://tumblr.superfeedr.com/" xmlns:atom="http://www.w3.org/2005/Atom"/><description>I am a search engineer working at Twenga and a Lucene committer at The Apache Software Foundation.</description><title>Adrien Grand</title><generator>Tumblr (3.0; @jpountz)</generator><link>http://blog.jpountz.net/</link><item><title>lz4-java 1.1.0 is out</title><description>&lt;p&gt;I&amp;#8217;m happy to announce the release of lz4-java 1.1.0. Artifacts can be downloaded from &lt;a href="http://repo1.maven.org/maven2/net/jpountz/lz4/lz4/1.1.0" target="_blank"&gt;Maven Central&lt;/a&gt; and javadocs can be found at &lt;a href="http://jpountz.github.com/lz4-java/1.1.0/docs/" target="_blank"&gt;jpountz.github.com/lz4-java/1.1.0/docs/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Release highlights&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;lz4 has been upgraded from r87 to &lt;a href="http://code.google.com/p/lz4/source/detail?r=88" target="_blank"&gt;r88&lt;/a&gt; (improves LZ4 HC compression speed).&lt;/li&gt;
&lt;li&gt;Experimental &lt;a href="http://jpountz.github.com/lz4-java/1.1.0/docs/net/jpountz/lz4/LZ4BlockOutputStream.html" target="_blank"&gt;streaming support&lt;/a&gt;: data is serialized into fixed-size blocks of compressed data. This can be useful for people who need to manipulate data using streams and want compression to be transparent.&lt;/li&gt;
&lt;li&gt;The released artifact contains pre-compiled JNI bindings for some common platforms: win32/amd64, darwin/x86_64, linux/i386 and linux/amd64. Users of these platforms can now benefit from the speed of the JNI bindings without having to build from source.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;b&gt;Performance&lt;/b&gt;&lt;/p&gt;

&lt;p&gt;In order to give a sense of the speed of lz4 and xxhash, I published some benchmarks. The lz4 compression/decompression benchmarks have been computed using Ning&amp;#8217;s &lt;a href="http://github.com/ning/jvm-compressor-benchmark" target="_blank"&gt;jvm-compressor-benchmark&lt;/a&gt; framework while the xxhash benchmark has been computed using a &lt;a href="http://code.google.com/p/caliper/" target="_blank"&gt;Caliper&lt;/a&gt; &lt;a href="http://github.com/jpountz/jvm-checksum-benchmark" target="_blank"&gt;benchmark&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;a href="http://jpountz.github.com/lz4-java/1.1.0/lz4-compression-benchmark/" target="_blank"&gt;lz4 compression&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;&lt;a href="http://jpountz.github.com/lz4-java/1.1.0/lz4-decompression-benchmark/" target="_blank"&gt;lz4 decompression&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;&lt;a href="http://jpountz.github.com/lz4-java/1.1.0/xxhash-benchmark/" target="_blank"&gt;xxhash hashing&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;I did my best to make these benchmarks unbiased, but the performance of these algorithms depends a lot on the kind of data being compressed or hashed (its length, for example), so whenever possible you should run benchmarks on your own data to decide which implementation to use.&lt;/p&gt;

&lt;p&gt;Happy compressing and hashing!&lt;/p&gt;</description><link>http://blog.jpountz.net/post/42711432648</link><guid>http://blog.jpountz.net/post/42711432648</guid><pubDate>Sun, 10 Feb 2013 02:13:00 +0100</pubDate><category>lz4</category><category>xxhash</category></item><item><title>Putting term vectors on a diet</title><description>&lt;h3&gt;What are term vectors?&lt;/h3&gt;

&lt;p&gt;&lt;a href="http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/IndexReader.html#getTermVectors(int)" target="_blank"&gt;Term vectors&lt;/a&gt; are an interesting Lucene feature that lets you retrieve a single-document inverted index for any document ID in your index. This means that given any document ID, you can quickly list all its unique terms in sorted order, and for every term you can quickly look up its original positions and offsets. For example, if you indexed the following document:&lt;/p&gt;

&lt;table align="center"&gt;&lt;tr&gt;&lt;th&gt;Field name&lt;/th&gt;&lt;th&gt;Field value&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;text&lt;/td&gt;&lt;td&gt;the quick brown fox jumps over the lazy dog&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;You would retrieve the following term vectors:&lt;/p&gt;

&lt;table align="center"&gt;&lt;tr&gt;&lt;th&gt;Term&lt;/th&gt;&lt;th&gt;Frequency&lt;/th&gt;&lt;th&gt;Positions&lt;/th&gt;&lt;th&gt;Offsets&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;brown&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;[10,15]&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;dog&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;[40,43]&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;fox&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;[16,19]&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;jumps&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;[20,25]&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;lazy&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;[35,39]&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;over&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;[26,30]&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;quick&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;[4,9]&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;the&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;0, 6&lt;/td&gt;&lt;td&gt;[0,3], [31,34]&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;For very small documents, it makes little sense to store term vectors given that they can be recomputed very quickly by re-analyzing a document&amp;#8217;s stored fields. But if your documents are large or if your analysis pipeline is expensive, storing term vectors on disk can be much faster than computing them on the fly.
So far, term vectors have been mainly used for &lt;a href="http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/vectorhighlight/FastVectorHighlighter.html" target="_blank"&gt;highlighting&lt;/a&gt; and &lt;a href="http://lucene.apache.org/core/4_1_0/queries/org/apache/lucene/queries/mlt/MoreLikeThis.html" target="_blank"&gt;MoreLikeThis&lt;/a&gt; (searching for similar documents) but there is an interesting issue open in Lucene JIRA to &lt;a href="https://issues.apache.org/jira/browse/LUCENE-4272" target="_blank"&gt;use term vectors to perform partial document updates&lt;/a&gt;.

&lt;/p&gt;&lt;p&gt;However, term vectors come with a cost. They store a lot of information and often take up a lot of disk space. This is bad because it can make indexing and searching slower (especially if the index size grows beyond the size of your OS cache).&lt;/p&gt;
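&lt;p&gt;To make the term vectors table above concrete, here is a minimal sketch that rebuilds it for the example document. It only uses the JDK and splits on single spaces; it is not how Lucene computes or stores term vectors, and the class and method names (TermVectorSketch, termVectors) are made up for this illustration.&lt;/p&gt;

&lt;pre&gt;
```java
import java.util.Arrays;

// Hypothetical illustration: NOT Lucene code, just the idea of a
// single-document inverted index with positions and offsets.
class TermVectorSketch {

    // Returns one line per unique term, in sorted order:
    // "term freq=N positions=... offsets=[start,end], ..."
    static String[] termVectors(String text) {
        // Assumption: tokens are separated by exactly one space.
        String[] tokens = text.split(" ");

        // Character offset at which each token starts in the original text.
        int[] starts = new int[tokens.length];
        int offset = 0;
        for (int i = 0; i != tokens.length; i++) {
            starts[i] = offset;
            offset += tokens[i].length() + 1; // +1 for the space separator
        }

        // Unique terms in sorted order, like the rows of the table.
        String[] terms = Arrays.stream(tokens).distinct().sorted().toArray(String[]::new);

        String[] lines = new String[terms.length];
        for (int t = 0; t != terms.length; t++) {
            int freq = 0;
            StringBuilder positions = new StringBuilder();
            StringBuilder offsets = new StringBuilder();
            for (int i = 0; i != tokens.length; i++) {
                if (!tokens[i].equals(terms[t])) {
                    continue;
                }
                if (freq > 0) {
                    positions.append(", ");
                    offsets.append(", ");
                }
                positions.append(i); // token position in the document
                offsets.append('[').append(starts[i]).append(',')
                       .append(starts[i] + tokens[i].length()).append(']');
                freq++;
            }
            lines[t] = terms[t] + " freq=" + freq
                + " positions=" + positions + " offsets=" + offsets;
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : termVectors("the quick brown fox jumps over the lazy dog")) {
            System.out.println(line);
        }
    }
}
```
&lt;/pre&gt;

&lt;p&gt;Even for this nine-word document, every term carries a frequency, a list of positions and a list of offset pairs, which gives a feel for why term vectors can take up so much space.&lt;/p&gt;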

&lt;h3&gt;Term vectors compression&lt;/h3&gt;

&lt;p&gt;Having worked on &lt;a href="http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene" target="_blank"&gt;stored fields compression&lt;/a&gt; in the past months, my first idea was to apply the same recipe: collect enough raw data to fill a 16&amp;#160;KB block, then compress it and flush it to disk. However term vectors are more challenging to compress: terms are already unique so it is rather hard for &lt;a href="http://en.wikipedia.org/wiki/LZ77_and_LZ78" target="_blank"&gt;LZ&lt;/a&gt; codecs such as &lt;a href="http://code.google.com/p/lz4/" target="_blank"&gt;LZ4&lt;/a&gt; to reach good compression ratios. Moreover, general-purpose compression algorithms are usually not very good at compressing numeric data (frequencies, term positions and offsets) so I needed something else.&lt;/p&gt;

&lt;p&gt;After long hours of trial and error, I managed to write a new &lt;a href="http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/codecs/TermVectorsFormat.html" target="_blank"&gt;term vectors format&lt;/a&gt; based on LZ4 and &lt;a href="http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/util/packed/PackedInts.html" target="_blank"&gt;bit-packing&lt;/a&gt; which efficiently compresses term vectors for various cases. Depending on the collection of documents, the compression ratio of the term vector files varied from 0.53 to 0.90. For example, indexing term vectors (with positions and offsets enabled) for 1M articles from the English Wikipedia database generates 5.9G of term vector files with the default codec from Lucene 4.0 or 4.1. By switching to this new term vectors format, the size of the term vector files decreased to 3.9G! More good news: this size reduction also made indexing faster. While indexing those 1M articles took 1038 seconds with the current term vectors format, it took only 870 seconds with this new compressed format (see ingestion rate charts below).&lt;/p&gt;
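&lt;p&gt;To give an intuition of why bit-packing suits small numeric values such as positions and offsets, here is a rough sketch that only estimates storage sizes. It is neither the PackedInts API nor the encoding used by the new format; the class BitPackingSketch and its methods are hypothetical, and the delta step is an assumption added purely for illustration.&lt;/p&gt;

&lt;pre&gt;
```java
// Hypothetical illustration of the idea behind bit-packing:
// store every value with just enough bits for the largest one.
class BitPackingSketch {

    // Number of bits needed to represent any value in [0, maxValue].
    static int bitsRequired(int maxValue) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(maxValue));
    }

    // Bytes needed to store all values with a fixed number of bits per value.
    static int packedSizeInBytes(int[] values) {
        int max = 0;
        for (int v : values) {
            max = Math.max(max, v);
        }
        long totalBits = (long) values.length * bitsRequired(max);
        return (int) ((totalBits + 7) / 8); // round up to whole bytes
    }

    // Replaces each value by its difference with the previous one.
    // Assumes the input is sorted in increasing order.
    static int[] deltas(int[] sortedValues) {
        int[] result = new int[sortedValues.length];
        int previous = 0;
        for (int i = 0; i != sortedValues.length; i++) {
            result[i] = sortedValues[i] - previous;
            previous = sortedValues[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Start offsets of the nine tokens of
        // "the quick brown fox jumps over the lazy dog".
        int[] startOffsets = {0, 4, 10, 16, 20, 26, 31, 35, 40};
        System.out.println("plain ints:    " + 4 * startOffsets.length + " bytes");
        System.out.println("packed:        " + packedSizeInBytes(startOffsets) + " bytes");
        System.out.println("packed deltas: " + packedSizeInBytes(deltas(startOffsets)) + " bytes");
    }
}
```
&lt;/pre&gt;

&lt;p&gt;For these nine start offsets, plain 32-bit integers need 36 bytes, packing at 6 bits per value needs 7 bytes, and packing the deltas at 3 bits per value needs 4 bytes. A general-purpose LZ codec would have a hard time finding such savings in raw integers.&lt;/p&gt;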

&lt;div align="center" style="margin:20px"&gt;
&lt;img src="https://issues.apache.org/jira/secure/attachment/12565686/Lucene40TVF_ingest_rate.png"/&gt;&lt;br/&gt;
Ingestion rate with the current default format.
&lt;/div&gt;

&lt;div align="center" style="margin:20px"&gt;
&lt;img src="https://issues.apache.org/jira/secure/attachment/12565685/CompressingTVF_ingest_rate.png"/&gt;&lt;br/&gt;
Ingestion rate with the new compressed format.
&lt;/div&gt;

&lt;p&gt;Although this new format is still very experimental, I think it&amp;#8217;s promising and would make a good candidate to become the new default term vectors format for a future version of Lucene. If you are interested in better understanding how it works and the compression ratio you can expect from this format, you can read more about it in &lt;a href="https://issues.apache.org/jira/browse/LUCENE-4599" target="_blank"&gt;Lucene Jira&lt;/a&gt;.&lt;/p&gt;</description><link>http://blog.jpountz.net/post/41301889664</link><guid>http://blog.jpountz.net/post/41301889664</guid><pubDate>Wed, 23 Jan 2013 22:22:00 +0100</pubDate><category>Lucene</category></item><item><title>lz4-java 1.0.0 released</title><description>&lt;p&gt;I am happy to announce that I released the first version of &lt;a href="http://github.com/jpountz/lz4-java" target="_blank"&gt;lz4-java&lt;/a&gt;, version 1.0.0.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://github.com/jpountz/lz4-java" target="_blank"&gt;lz4-java&lt;/a&gt; is a Java port of the &lt;a href="http://code.google.com/p/lz4/" target="_blank"&gt;lz4&lt;/a&gt; compression library and the &lt;a href="http://code.google.com/p/xxhash/" target="_blank"&gt;xxhash&lt;/a&gt; hashing library, which are both known for being blazing fast.&lt;/p&gt;

&lt;p&gt;This release is based on &lt;a href="http://code.google.com/p/lz4/source/list" target="_blank"&gt;lz4 r87&lt;/a&gt; and &lt;a href="http://code.google.com/p/xxhash/source/list" target="_blank"&gt;xxhash r6&lt;/a&gt;. Artifacts have been pushed to &lt;a href="http://search.maven.org/#artifactdetails%7Cnet.jpountz.lz4%7Clz4%7C1.0.0%7Cjar" target="_blank"&gt;Maven Central (net.jpountz.lz4:lz4:jar:1.0.0)&lt;/a&gt; and javadocs can be found at &lt;a href="http://jpountz.github.com/lz4-java/1.0.0/docs/" target="_blank"&gt;http://jpountz.github.com/lz4-java/1.0.0/docs/&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Examples&lt;/h3&gt;

&lt;p&gt;For those who would like to get started quickly, here are examples:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;lz4 compression and decompression
&lt;script src="https://gist.github.com/4489277.js"&gt;&lt;/script&gt;&lt;/li&gt;
&lt;li&gt;block hashing with xxhash
&lt;script src="https://gist.github.com/4489297.js"&gt;&lt;/script&gt;&lt;/li&gt;
&lt;li&gt;streaming hashing with xxhash.
&lt;script src="https://gist.github.com/4489305.js"&gt;&lt;/script&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Happy compressing and hashing!&lt;/p&gt;</description><link>http://blog.jpountz.net/post/40049763474</link><guid>http://blog.jpountz.net/post/40049763474</guid><pubDate>Wed, 09 Jan 2013 01:13:00 +0100</pubDate><category>lz4</category><category>xxhash</category></item><item><title>Stored fields compression in Lucene 4.1</title><description>&lt;p&gt;Last time, I tried to explain how &lt;a href="http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene" target="_blank"&gt;efficient stored fields compression&lt;/a&gt; can help when your index grows larger than your I/O cache. Indeed, magnetic disks are so slow that it is usually worth spending a few CPU cycles on compression in order to avoid disk seeks.&lt;/p&gt;

&lt;p&gt;I have very good news for you: the stored fields format I used for these experiments will become the new default stored fields format as of Lucene 4.1! Here are the main highlights:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;only one disk seek per document in the worst case (compared to two with the previous default stored fields format)&lt;/li&gt;
&lt;li&gt;documents are compressed together in blocks of 16&amp;#160;KB or more using the blazing fast &lt;a href="http://code.google.com/p/lz4/" target="_blank"&gt;LZ4&lt;/a&gt; compression algorithm&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Over the last weeks, I&amp;#8217;ve had the occasion to talk about this new stored fields format with various Lucene users and developers who raised interesting questions that I&amp;#8217;ll try to answer:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;b&gt;What happens if my documents are larger than 16KB?&lt;/b&gt; This stored fields format prevents documents from spreading across chunks: if your documents are larger than 16KB, you will have larger chunks that contain only one document.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Is it configurable?&lt;/b&gt; Yes and no: the stored fields format that will be used by Lucene41Codec is not configurable. However, it is based on another format: &lt;a href="https://builds.apache.org/job/Lucene-Artifacts-trunk/javadoc/codecs/org/apache/lucene/codecs/compressing/CompressingStoredFieldsFormat.html" target="_blank"&gt;CompressingStoredFieldsFormat&lt;/a&gt;, which allows you to configure the chunk size and the compression algorithm to use (LZ4, LZ4 HC or Deflate).&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Are there limitations?&lt;/b&gt; Yes, there is one: individual documents cannot be larger than 2&lt;sup&gt;31&lt;/sup&gt; - 2&lt;sup&gt;16&lt;/sup&gt; bytes (a little less than 2&amp;#160;GB). But this should be fine for most (if not all) use-cases.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Can I disable compression?&lt;/b&gt; Of course you can, all you need to do is to write a new codec that uses a stored fields format which does not compress stored fields such as &lt;a href="http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/codecs/lucene40/Lucene40StoredFieldsFormat.html" target="_blank"&gt;Lucene40StoredFieldsFormat&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;My index is stored in memory / on a SSD, does it still make sense to compress stored fields?&lt;/b&gt;
I think so:&lt;ul&gt;&lt;li&gt;it won&amp;#8217;t slow down your search engine: on my very slow laptop (Core 2 Duo T6670), decompressing a 16&amp;#160;KB block of English text takes 80µs on average, so even if your result pages have 50 documents, your queries will only be 4ms slower (much less with faster hardware and/or smaller pages)&lt;/li&gt;
&lt;li&gt;RAM and SSD are expensive, so thanks to stored fields compression you&amp;#8217;ll be able to have larger indexes on the same hardware, or equivalent indexes on cheaper hardware&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Can I plug in my own compression algorithm?&lt;/b&gt; Unfortunately you can&amp;#8217;t, but if you really need to use a different compression algorithm, the code should be easy to adapt. However you should be aware of two optimizations of the LZ4 implementation in Lucene that you would almost certainly need to implement if you want to achieve similar performance:
&lt;ul&gt;&lt;li&gt;it doesn&amp;#8217;t compress to a temporary buffer before writing the compressed data to disk, instead it writes directly to a Lucene &lt;a href="http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/store/DataOutput.html" target="_blank"&gt;DataOutput&lt;/a&gt; &amp;#8212; this proved to be faster (with MMapDirectory at least)&lt;/li&gt;
&lt;li&gt;it stops decompressing as soon as enough data has been decompressed: for example, if you need to retrieve the second document of a chunk, which is stored between offsets 1024 and 2048 of the chunk, Lucene will only decompress 2&amp;#160;KB of data.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Many thanks to &lt;a href="https://twitter.com/rcmuir" target="_blank"&gt;Robert Muir&lt;/a&gt; who helped me improve and fix this new stored fields format!&lt;/p&gt;</description><link>http://blog.jpountz.net/post/35667727458</link><guid>http://blog.jpountz.net/post/35667727458</guid><pubDate>Wed, 14 Nov 2012 01:07:00 +0100</pubDate><category>lucene</category><category>lz4</category></item><item><title>Efficient compressed stored fields with Lucene</title><description>&lt;p&gt;Whatever you are storing on disk, everything usually goes perfectly well until your data becomes too large for your &lt;a href="http://en.wikipedia.org/wiki/Page_cache" target="_blank"&gt;I/O cache&lt;/a&gt;. Until then, most disk accesses actually never touch disk and are almost as fast as reading or writing to main memory. The problem arises when your data becomes too large: disk accesses that can&amp;#8217;t be served through the I/O cache will trigger an actual disk seek, and everything will suddenly become much slower. Once data becomes that large, there are three options: find techniques to reduce disk seeks (usually by loading some data in memory and/or relying more on sequential access), buy more RAM or better disks (SSD?), or accept that performance will degrade as your data keeps growing.&lt;/p&gt;

&lt;p&gt;If you have a Lucene index with some stored fields, I wouldn&amp;#8217;t be surprised if most of the size of your index is due to its &lt;tt&gt;.fdt&lt;/tt&gt; files. For example, when indexing 10M documents from a Wikipedia dump that &lt;a href="http://blog.mikemccandless.com/" target="_blank"&gt;Mike McCandless&lt;/a&gt; uses for &lt;a href="http://people.apache.org/~mikemccand/lucenebench/" target="_blank"&gt;nightly benchmarks&lt;/a&gt;, the &lt;tt&gt;.fdt&lt;/tt&gt; files are 69.3% of the index size.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;.fdt&lt;/tt&gt; is one of the two file extensions that are used for stored fields in Lucene. You can read more about how they work in &lt;a href="http://lucene.apache.org/core/4_0_0-BETA/core/org/apache/lucene/codecs/lucene40/Lucene40StoredFieldsFormat.html" target="_blank"&gt;Lucene40StoredFieldsFormat&amp;#8217;s docs&lt;/a&gt;. The important thing to know is that loading a document from disk requires two disk seeks:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;one in the fields index file (&lt;tt&gt;.fdx&lt;/tt&gt;),&lt;/li&gt;
  &lt;li&gt;one in the fields data file (&lt;tt&gt;.fdt&lt;/tt&gt;).&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Since the fields index file is usually small (~ 8 * maxDoc bytes), the I/O cache should be able to serve most disk seeks in this file. However the fields data file is often much larger (a little more than the original data) so the seek in this file is more likely to translate to an actual disk seek. In the worst case, where all seeks in this file translate to actual disk seeks, a search engine that displays &lt;tt&gt;p&lt;/tt&gt; results per page won&amp;#8217;t be able to handle more than &lt;tt&gt;100/p&lt;/tt&gt; requests per second (given that a disk seek on commodity hard drives is ~ 10ms). As a consequence the hit rate of the I/O cache on this file is very important for your query throughput. One option to improve the I/O cache hit rate is to compress stored fields so that the fields data file is smaller overall.&lt;/p&gt;
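&lt;p&gt;The &lt;tt&gt;100/p&lt;/tt&gt; bound above is simple arithmetic: one second divided by the time spent seeking for one page of results. Here is a small sketch of that calculation; the class and method names are made up, and it assumes seeks are strictly sequential and that nothing else costs any time.&lt;/p&gt;

&lt;pre&gt;
```java
// Hypothetical back-of-the-envelope model of the worst case described above.
class SeekBoundSketch {

    // Upper bound on requests per second when every document load costs
    // one disk seek and a page displays resultsPerPage documents.
    static double maxRequestsPerSecond(int resultsPerPage, double seekMillis) {
        return 1000.0 / (seekMillis * resultsPerPage);
    }

    public static void main(String[] args) {
        double seekMillis = 10.0; // ~10ms per seek, the figure used in the text
        for (int p : new int[] {10, 20, 50}) {
            System.out.println(p + " results/page: at most "
                + maxRequestsPerSecond(p, seekMillis) + " requests/s");
        }
    }
}
```
&lt;/pre&gt;

&lt;p&gt;With 10 results per page the bound is 10 requests per second, and it drops to 2 requests per second with 50 results per page.&lt;/p&gt;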

&lt;p&gt;Up to version &lt;tt&gt;2.9&lt;/tt&gt;, Lucene had an option to compress stored fields but it was deprecated and then removed (see &lt;a href="https://issues.apache.org/jira/browse/LUCENE-652" target="_blank"&gt;LUCENE-652&lt;/a&gt; for more information). In newer versions, users can still compress documents but this has to be done at the document level instead of the index level. However, the problem is still the same: if you are working with small fields, most compression algorithms are inefficient. To address this, ElasticSearch 0.19.5 introduced &lt;a href="http://www.elasticsearch.org/guide/reference/index-modules/store.html" target="_blank"&gt;store-level compression&lt;/a&gt;: it compresses large (64KB) fixed-size blocks of data instead of single fields in order to improve the compression ratio. This is probably the best way to compress small docs with Lucene up to version &lt;tt&gt;3.6&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;Fortunately, Lucene 4.0 (which should be released very soon) introduces flexible indexing: it allows you to customize Lucene&amp;#8217;s low-level behavior, in particular the index file formats. With &lt;a href="https://issues.apache.org/jira/browse/LUCENE-4226" target="_blank"&gt;LUCENE-4226&lt;/a&gt;, Lucene got a new &lt;a href="http://lucene.apache.org/core/4_0_0-BETA/core/org/apache/lucene/codecs/StoredFieldsFormat.html" target="_blank"&gt;StoredFieldsFormat&lt;/a&gt; that efficiently compresses stored fields. Handling compression at the codec level allows for several optimizations compared to ElasticSearch&amp;#8217;s approach:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;blocks can have variable size so that documents never spread across two blocks (so that loading a document from disk never requires uncompressing more than one block),&lt;/li&gt;
  &lt;li&gt;uncompression can stop as soon as enough data has been uncompressed,&lt;/li&gt;
  &lt;li&gt;less memory is required.&lt;/li&gt;
&lt;/ul&gt;&lt;h3&gt;Lucene40StoredFieldsFormat vs. CompressingStoredFieldsFormat&lt;/h3&gt;

&lt;p&gt;In order to ensure that it is really a win to compress stored fields, I ran a few benchmarks on a large index:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;10M documents,&lt;/li&gt;
  &lt;li&gt;every document has 4 stored fields:
    &lt;ul&gt;&lt;li&gt;an ID (a few bytes),&lt;/li&gt;
      &lt;li&gt;a title (a few bytes),&lt;/li&gt;
      &lt;li&gt;a date (a few bytes),&lt;/li&gt;
      &lt;li&gt;a body (up to 1KB).&lt;/li&gt;
    &lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;tt&gt;CompressingStoredFieldsFormat&lt;/tt&gt; has been instantiated with the following parameters:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;compressionMode = FAST (fast compression and fast uncompression at the cost of weaker compression; uses &lt;a href="http://code.google.com/p/lz4/" target="_blank"&gt;LZ4&lt;/a&gt; under the hood),&lt;/li&gt;
  &lt;li&gt;chunkSize = 16K (means that data will be compressed into blocks of ~16KB),&lt;/li&gt;
  &lt;li&gt;storedFieldsIndexFormat = MEMORY_CHUNK (the most compact fields index format, requires at most 12 bytes of memory per chunk of compressed documents).&lt;/li&gt;
&lt;/ul&gt;&lt;h4&gt;Index size&lt;/h4&gt;

&lt;ul&gt;&lt;li&gt;Lucene40StoredFieldsFormat
    &lt;ul&gt;&lt;li&gt;Fields index: 76M&lt;/li&gt;
      &lt;li&gt;Field data: 9.4G&lt;/li&gt;
    &lt;/ul&gt;&lt;/li&gt;
  &lt;li&gt;CompressingStoredFieldsFormat
    &lt;ul&gt;&lt;li&gt;Fields index: 1.7M&lt;/li&gt;
      &lt;li&gt;Field data: 5.7G&lt;/li&gt;
    &lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;h4&gt;Indexing speed&lt;/h4&gt;

&lt;p&gt;Indexing took almost the same time with both &lt;tt&gt;StoredFieldsFormat&lt;/tt&gt;s (~ 37 minutes) and ingestion rates are very similar:&lt;/p&gt;

    &lt;script type="text/javascript"&gt;
      google.load("visualization", "1", {packages:["corechart"]});
      google.setOnLoadCallback(drawChart);
      function drawChart() {
        var data = google.visualization.arrayToDataTable([
          ['Second', 'Lucene40Codec', 'CompressingCodec'],
[0 ,  7231.38186593 ,  6461.06305759 ],
[10 ,  5983.30832017 ,  6188.95120861 ],
[20 ,  6229.78593077 ,  6374.51194816 ],
[30 ,  8165.35769749 ,  7913.89082887 ],
[40 ,  7457.86252697 ,  7409.18848067 ],
[50 ,  6067.18303453 ,  6710.46586569 ],
[60 ,  7010.3822783 ,  6916.23453053 ],
[70 ,  7557.35710772 ,  6775.69651741 ],
[80 ,  7462.69918681 ,  7675.58706468 ],
[90 ,  7149.01360737 ,  6845.55942755 ],
[100 ,  7111.57863651 ,  7314.70149254 ],
[110 ,  7814.23383085 ,  6644.7473861 ],
[120 ,  6695.62934096 ,  5843.5374943 ],
[130 ,  4218.97317562 ,  4568.85942966 ],
[140 ,  4706.36186986 ,  4791.3880597 ],
[150 ,  2713.73806877 ,  3602.93266273 ],
[160 ,  6002.34003239 ,  3134.36129971 ],
[170 ,  8704.66079064 ,  8209.12055664 ],
[180 ,  7113.71072309 ,  7825.99070286 ],
[190 ,  8299.567638 ,  8057.9271132 ],
[200 ,  7654.16953672 ,  8253.04975124 ],
[210 ,  7359.90395572 ,  7429.24202517 ],
[220 ,  7605.10177924 ,  6761.24378109 ],
[230 ,  6761.44465739 ,  5827.36008235 ],
[240 ,  7069.33668321 ,  5921.51741294 ],
[250 ,  5613.97064628 ,  6457.06467662 ],
[260 ,  2549.53633932 ,  3776.75919131 ],
[270 ,  3244.58525943 ,  2884.27978713 ],
[280 ,  2838.2185249 ,  2381.68650978 ],
[290 ,  2084.960199 ,  3127.62896274 ],
[300 ,  2457.69788934 ,  3500.65426877 ],
[310 ,  654.349605028 ,  4394.57226936 ],
[320 ,  2792.94156706 ,  6834.94174659 ],
[330 ,  5572.7066753 ,  7620.39796345 ],
[340 ,  8028.7635334 ,  7316.84625331 ],
[350 ,  5394.02560484 ,  7453.95024876 ],
[360 ,  6471.35984212 ,  7237.93094746 ],
[370 ,  7095.42864105 ,  6733.37810945 ],
[380 ,  6588.77398891 ,  6601.45771144 ],
[390 ,  6677.94354315 ,  6092.16633242 ],
[400 ,  6974.12220069 ,  5981.01926985 ],
[410 ,  6128.42708779 ,  3744.87256187 ],
[420 ,  7654.00497512 ,  2589.71142425 ],
[430 ,  2761.36885133 ,  2579.31840796 ],
[440 ,  3060.30135501 ,  2969.90858866 ],
[450 ,  3030.30688344 ,  3144.81606743 ],
[460 ,  3055.5104934 ,  6188.84079602 ],
[470 ,  2544.08167984 ,  6294.02578656 ],
[480 ,  2317.25535499 ,  7038.59715875 ],
[490 ,  2834.3669334 ,  6873.9436069 ],
[500 ,  2685.83636439 ,  7301.26557354 ],
[510 ,  7526.54228856 ,  5915.75 ],
[520 ,  6455.53616533 ,  5979.38640523 ],
[530 ,  7648.70149254 ,  7167.06462492 ],
[540 ,  7847.6695165 ,  7155.98101233 ],
[550 ,  3518.23664431 ,  7159.28395831 ],
[560 ,  7310.57044792 ,  5320.87348345 ],
[570 ,  7936.85572139 ,  3019.70484576 ],
[580 ,  6334.47769191 ,  2849.10966103 ],
[590 ,  4406.32137821 ,  3169.1738367 ],
[600 ,  3074.83730704 ,  3604.41328328 ],
[610 ,  3195.62016649 ,  5705.280685 ],
[620 ,  3108.08281141 ,  7093.3861465 ],
[630 ,  4397.51779043 ,  7952.24074433 ],
[640 ,  3175.8532911 ,  6457.61373718 ],
[650 ,  3740.02331742 ,  6619.28501414 ],
[660 ,  3152.16258168 ,  7521.78662186 ],
[670 ,  5789.78204563 ,  7336.85574827 ],
[680 ,  7753.16455175 ,  7511.41411093 ],
[690 ,  7811.0551863 ,  7503.44930158 ],
[700 ,  5407.88257343 ,  3797.41552354 ],
[710 ,  6010.7244799 ,  3220.99460913 ],
[720 ,  6722.06777039 ,  3655.69544682 ],
[730 ,  6733.38329397 ,  3662.78412045 ],
[740 ,  6594.91170362 ,  3795.47740138 ],
[750 ,  4433.82051519 ,  7646.06715292 ],
[760 ,  2952.94670125 ,  6629.09874995 ],
[770 ,  3058.6952607 ,  7111.80597015 ],
[780 ,  2613.41376217 ,  7619.90144032 ],
[790 ,  3187.36781439 ,  7796.40198092 ],
[800 ,  2159.76980311 ,  7557.00497512 ],
[810 ,  3360.00450965 ,  6177.51376925 ],
[820 ,  2881.54026918 ,  6317.04358402 ],
[830 ,  6603.03530275 ,  6557.03675631 ],
[840 ,  6705.96667381 ,  3939.50683997 ],
[850 ,  7889.86340282 ,  2986.0263652 ],
[860 ,  7591.25131153 ,  3718.7921941 ],
[870 ,  7617.19655784 ,  4011.77053773 ],
[880 ,  7366.32270967 ,  4594.60129894 ],
[890 ,  7037.6331109 ,  7518.56716418 ],
[900 ,  7914.12070934 ,  7150.28643667 ],
[910 ,  6092.25762681 ,  6453.73501683 ],
[920 ,  3451.24138843 ,  7692.3227967 ],
[930 ,  3223.36817709 ,  7290.48756219 ],
[940 ,  2696.00637084 ,  7732.98529164 ],
[950 ,  2307.44870868 ,  6926.05006645 ],
[960 ,  2990.58324017 ,  7210.13496767 ],
[970 ,  1101.32559425 ,  5750.95126455 ],
[980 ,  8286.20895522 ,  4087.97837106 ],
[990 ,  8071.46458802 ,  2639.58386671 ],
[1000 ,  6071.14518744 ,  2867.13371366 ],
[1010 ,  4596.98540541 ,  3547.81651753 ],
[1020 ,  7847.94265302 ,  3683.47186712 ],
[1030 ,  7879.1870348 ,  7435.77395631 ],
[1040 ,  8036.40674675 ,  7740.09784411 ],
[1050 ,  6226.38111264 ,  7444.54795788 ],
[1060 ,  5110.66610661 ,  6800.29723476 ],
[1070 ,  7841.52995953 ,  5741.56301824 ],
[1080 ,  3666.49253731 ,  6588.80099502 ],
[1090 ,  3342.87273037 ,  6426.41917222 ],
[1100 ,  3648.27363184 ,  7748.2495935 ],
[1110 ,  3448.82626076 ,  7443.91864218 ],
[1120 ,  2695.68591917 ,  4218.93810926 ],
[1130 ,  2213.60902385 ,  2902.23135164 ],
[1140 ,  4217.59546472 ,  3116.27098855 ],
[1150 ,  7335.20931524 ,  3681.73352273 ],
[1160 ,  8160.23704818 ,  3685.04084981 ],
[1170 ,  7277.10669827 ,  6451.72419762 ],
[1180 ,  7957.07159616 ,  7638.43781095 ],
[1190 ,  5968.90573649 ,  6586.47313889 ],
[1200 ,  7505.61721221 ,  5709.33769114 ],
[1210 ,  6972.91189539 ,  7081.62767228 ],
[1220 ,  6817.32647573 ,  6666.10466201 ],
[1230 ,  2569.80792107 ,  7796.61542523 ],
[1240 ,  2961.59268911 ,  7748.08825301 ],
[1250 ,  2953.57717458 ,  6809.40671029 ],
[1260 ,  3325.47017678 ,  5255.88258923 ],
[1270 ,  3168.44013553 ,  2974.29010018 ],
[1280 ,  2949.7047839 ,  3247.86363393 ],
[1290 ,  4768.21427468 ,  3164.18798719 ],
[1300 ,  8007.2285868 ,  3826.87042381 ],
[1310 ,  5702.58287024 ,  8012.66101863 ],
[1320 ,  7762.01982022 ,  7552.88557214 ],
[1330 ,  5933.35989634 ,  7175.30010004 ],
[1340 ,  6226.80921056 ,  5621.75674397 ],
[1350 ,  7626.32067376 ,  7147.0964844 ],
[1360 ,  7722.08908213 ,  7518.85679149 ],
[1370 ,  8062.86069652 ,  7485.1641791 ],
[1380 ,  3202.91719315 ,  6509.03083258 ],
[1390 ,  3875.07735676 ,  7235.73750963 ],
[1400 ,  2931.70992276 ,  5295.57422266 ],
[1410 ,  3124.20558293 ,  3050.83011994 ],
[1420 ,  3988.26323983 ,  2906.26329817 ],
[1430 ,  4911.69455177 ,  2920.95385807 ],
[1440 ,  8478.51197501 ,  5175.78191189 ],
[1450 ,  7429.43739802 ,  2415.05359401 ],
[1460 ,  6672.61940717 ,  2047.1583114 ],
[1470 ,  8193.45634146 ,  2161.18532701 ],
[1480 ,  7319.40689394 ,  2637.0722916 ],
[1490 ,  7426.06347744 ,  1797.73932535 ],
[1500 ,  7833.2945414 ,  2402.61120142 ],
[1510 ,  7569.88285812 ,  2802.59987762 ],
[1520 ,  5001.70840602 ,  2281.42312556 ],
[1530 ,  4121.78156519 ,  2729.34289145 ],
[1540 ,  3749.40971661 ,  3942.18441523 ],
[1550 ,  3007.03482587 ,  3257.2696667 ],
[1560 ,  3356.23138159 ,  3863.08955224 ],
[1570 ,  2521.33049723 ,  3591.15223881 ],
[1580 ,  2397.85229939 ,  3013.4251346 ],
[1590 ,  2037.66870398 ,  2080.01637348 ],
[1600 ,  3082.39458595 ,  1980.20895522 ],
[1610 ,  3037.60270653 ,  1859.20662031 ],
[1620 ,  3167.35885716 ,  2236.40878672 ],
[1630 ,  3530.39611309 ,  2859.16566234 ],
[1640 ,  3146.55010809 ,  3087.10640427 ],
[1650 ,  4022.94958755 ,  2075.93355325 ],
[1660 ,  3801.32344672 ,  700.21650961 ],
[1670 ,  1976.5943707 ,  1833.63031049 ],
[1680 ,  3005.26407966 ,  2212.89439139 ],
[1690 ,  1876.57190054 ,  2798.12319433 ],
[1700 ,  2340.41373812 ,  2812.54940568 ],
[1710 ,  2613.67833258 ,  2530.16547233 ],
[1720 ,  3348.88610595 ,  3301.65396613 ],
[1730 ,  3055.9476351 ,  4342.28187919 ],
[1740 ,  2843.47585137 ,  733.944954128 ],
[1750 ,  2941.81365898 ,  2605.76403696 ],
[1760 ,  2988.604194 ,  1954.14622502 ],
[1770 ,  3415.55933323 ,  2530.01199208 ],
[1780 ,  3774.77076322 ,  2092.70736543 ],
[1790 ,  3059.78943844 ,  2063.64917953 ],
[1800 ,  3785.84320569 ,  2038.83234076 ],
[1810 ,  3258.23750181 ,  2014.10521537 ],
[1820 ,  3099.01083796 ,  1761.65584396 ],
[1830 ,  2510.89600777 ,  1228.13280971 ],
[1840 ,  0.0 ,  968.893034826 ],
[1850 ,  0.0 ,  2349.91619374 ],
[1860 ,  0.0 ,  2987.83243641 ],
[1870 ,  4265.97084992 ,  2382.28081129 ],
[1880 ,  2494.22062786 ,  2535.90066252 ],
[1890 ,  3219.86437565 ,  2149.34316606 ],
[1900 ,  3428.48378007 ,  2198.49422698 ],
[1910 ,  2288.44906993 ,  2742.50956625 ],
[1920 ,  3671.15928931 ,  2045.94227292 ],
[1930 ,  2098.21876177 ,  2930.74498688 ],
[1940 ,  1930.70447652 ,  2478.89193881 ],
[1950 ,  1760.08457711 ,  2304.18453316 ],
[1960 ,  1767.12144861 ,  1815.52069502 ],
[1970 ,  2239.42469471 ,  2026.40981879 ],
[1980 ,  2815.93124241 ,  1749.18288726 ],
[1990 ,  2655.28541312 ,  2598.28806715 ],
[2000 ,  2982.60238752 ,  2191.94126575 ],
[2010 ,  2391.31657943 ,  2208.39033997 ],
[2020 ,  2360.64676617 ,  1947.67811398 ],
[2030 ,  3521.1291697 ,  2192.69237625 ],
[2040 ,  3030.18605424 ,  2066.34184235 ],
[2050 ,  3274.53493098 ,  2414.25870531 ],
[2060 ,  3547.98367973 ,  2500.84794466 ],
[2070 ,  2788.19723918 ,  2115.59028064 ],
[2080 ,  3666.30146469 ,  3220.12609062 ],
[2090 ,  3394.85152553 ,  2666.15451867 ],
[2100 ,  2997.23837818 ,  1784.41336328 ],
[2110 ,  2802.07233065 ,  3311.12560856 ],
[2120 ,  2406.01655854 ,  3593.04729447 ],
[2130 ,  2440.63936415 ,  3541.41179891 ],
[2140 ,  3476.44485491 ,  2395.280232 ],
[2150 ,  2805.3771952 ,  2447.43751391 ],
[2160 ,  2963.63743978 ,  1980.42416166 ],
[2170 ,  2648.96490403 ,  2273.24368105 ],
[2180 ,  1720.99542839 ,  2901.5269457 ],
[2190 ,  2271.85227708 ,  1126.26436843 ],
[2200 ,  2979.74817497 ,  0 ],
[2210 ,  3237.79104478 ,  0 ],
[2220 ,  3251.95955352 ,  0 ]
]);

var options = {
          title: 'Ingestion rate (docs/s)'
        };

        var chart = new google.visualization.LineChart(document.getElementById('chart_div'));
        chart.draw(data, options);
      }
    &lt;/script&gt;&lt;div id="chart_div" style="width: 900px; height: 500px;"&gt;&lt;/div&gt;

&lt;h4&gt;Document loading speed&lt;/h4&gt;

&lt;p&gt;I measured the average time to load a document from disk using random document identifiers in the [0 - maxDoc[ range. According to &lt;a href="http://linux.die.net/man/1/free" target="_blank"&gt;free&lt;/a&gt;, my I/O cache was ~ 5.2G when I ran these tests:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;&lt;tt&gt;Lucene40StoredFieldsFormat&lt;/tt&gt;: 11.5ms,&lt;/li&gt;
  &lt;li&gt;&lt;tt&gt;CompressingStoredFieldsFormat&lt;/tt&gt;: 4.25ms.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;In the case of &lt;tt&gt;Lucene40StoredFieldsFormat&lt;/tt&gt;, the fields data file is much larger than the I/O cache, so many requests to load a document translated into an actual disk seek. In contrast, &lt;tt&gt;CompressingStoredFieldsFormat&lt;/tt&gt;&amp;#8217;s fields data file is only a little larger than the I/O cache, so most seeks are served by the I/O cache. This explains why loading documents from disk was more than 2x faster, although it requires more CPU because of uncompression.&lt;/p&gt;

&lt;p&gt;In that very particular case it would probably be even faster to switch to a more aggressive compression mode or a larger block size so that the whole &lt;tt&gt;.fdt&lt;/tt&gt; file can fit into the I/O cache.&lt;/p&gt;

&lt;h3&gt;Conclusion&lt;/h3&gt;

&lt;p&gt;Unless your server has very fast I/O, it is usually faster to compress the fields data file so that most of it can fit into the I/O cache. Compared to &lt;tt&gt;Lucene40StoredFieldsFormat&lt;/tt&gt;, &lt;tt&gt;CompressingStoredFieldsFormat&lt;/tt&gt; allows for efficient stored fields compression and therefore better performance.&lt;/p&gt;</description><link>http://blog.jpountz.net/post/33247161884</link><guid>http://blog.jpountz.net/post/33247161884</guid><pubDate>Tue, 09 Oct 2012 22:04:00 +0200</pubDate><category>lucene</category><category>lz4</category></item><item><title>Wow, LZ4 is fast!</title><description>&lt;p&gt;I&amp;#8217;ve been doing some experiments with &lt;a href="http://code.google.com/p/lz4/" target="_blank"&gt;LZ4&lt;/a&gt; recently and I must admit that I am truly impressed. For those not familiar with LZ4, it is a compression format from the &lt;a href="http://en.wikipedia.org/wiki/LZ77_and_LZ78" target="_blank"&gt;LZ77&lt;/a&gt; family. Compared to other similar algorithms (such as Google&amp;#8217;s &lt;a href="http://code.google.com/p/snappy/" target="_blank"&gt;Snappy&lt;/a&gt;), LZ4&amp;#8217;s &lt;a href="http://code.google.com/p/lz4/source/browse/trunk/lz4_format_description.txt" target="_blank"&gt;file format&lt;/a&gt; does not allow for very high compression ratios since:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;you cannot reference sequences which are more than 64kb backwards in the stream,&lt;/li&gt;
&lt;li&gt;it encodes lengths with an algorithm that requires &lt;tt&gt;1 + floor(n / 255)&lt;/tt&gt; bytes to store an integer n instead of the &lt;tt&gt;1 + floor(log(n) / log(2^7))&lt;/tt&gt; bytes that &lt;a href="http://en.wikipedia.org/wiki/Variable-length_quantity" target="_blank"&gt;variable-length encoding&lt;/a&gt; would require.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;This might sound like a lot of lost space, but fortunately things are not that bad: there are generally a lot of opportunities to find repeated sequences in a 64kb block, and unless you are working with trivial inputs, you very rarely need to encode lengths which are greater than 15. In case you still doubt LZ4&amp;#8217;s ability to achieve high compression ratios, the original implementation includes a &lt;a href="http://code.google.com/p/lz4/source/browse/trunk/lz4hc.c" target="_blank"&gt;high compression algorithm&lt;/a&gt; that can easily achieve a 40% compression ratio on common ASCII text.&lt;/p&gt;
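
&lt;p&gt;To get a feel for the two length-encoding formulas in the list above, here is a minimal sketch that evaluates them for a few lengths. The class and method names are made up for this illustration and are not part of LZ4 or lz4-java; the sketch only counts bytes according to the formulas as quoted and does not encode anything.&lt;/p&gt;
&lt;pre class="prettyprint"&gt;
public class LengthEncodingCost {

  // Bytes needed by the first formula quoted above: 1 + floor(n / 255).
  static int lz4StyleBytes(int n) {
    return 1 + n / 255;
  }

  // Bytes needed by a 7-bits-per-byte variable-length encoding:
  // 1 + floor(log(n) / log(2^7)), valid for n of at least 1.
  static int variableLengthBytes(int n) {
    int highestSetBit = 31 - Integer.numberOfLeadingZeros(n);
    return 1 + highestSetBit / 7;
  }

  public static void main(String[] args) {
    for (int n : new int[] {10, 200, 1000, 100000}) {
      System.out.println(n + ": " + lz4StyleBytes(n) + " vs " + variableLengthBytes(n));
    }
  }
}
&lt;/pre&gt;
&lt;p&gt;For short lengths the two schemes cost about the same (a length of 200 even takes 1 byte with the first formula against 2 with variable-length encoding), while a length of 100000 takes 393 bytes against 3.&lt;/p&gt;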

&lt;p&gt;
But this file format also allows you to write fast compressors and uncompressors, and this is really what LZ4 excels at: compression and uncompression &lt;b&gt;speed&lt;/b&gt;. To measure how much faster LZ4 is compared to other famous compression algorithms, I wrote three &lt;a href="https://github.com/jpountz/lz4-java" target="_blank"&gt;Java implementations of LZ4&lt;/a&gt;:
&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;a JNI binding to the original C implementation (including the high compression algorithm),&lt;/li&gt;
&lt;li&gt;a pure Java port, using the standard API,&lt;/li&gt;
&lt;li&gt;a pure Java port that uses the sun.misc.Unsafe API to speed up (un)compression.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;
Then I modified Ning&amp;#8217;s &lt;a href="https://github.com/ning/jvm-compressor-benchmark" target="_blank"&gt;JVM compressor benchmark&lt;/a&gt; (kudos to Ning for sharing it!) to add my compressors and ran the &lt;a href="http://corpus.canterbury.ac.nz/descriptions/#calgary" target="_blank"&gt;Calgary&lt;/a&gt; compression benchmark.
&lt;/p&gt;

&lt;p&gt;The results are very impressive:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;the JNI default compressor is the fastest one in all cases but one, and the JNI uncompressor is always the fastest one,&lt;/li&gt;
&lt;li&gt;even when compressed with the high compression algorithm, data is still very fast to uncompress, which is great for read-only data,&lt;/li&gt;
&lt;li&gt;the unsafe Java compressor/uncompressor is by far the fastest pure Java compressor/uncompressor,&lt;/li&gt;
&lt;li&gt;the safe Java compressor/uncompressor has comparable performance to some compressors/uncompressors that use the sun.misc.Unsafe API (such as LZF).&lt;/li&gt;
&lt;/ul&gt;&lt;h3&gt;Compression&lt;/h3&gt;

&lt;a href="http://people.apache.org/~jpountz/lz4/calgary-compress/2012_07_26_22_25/result.jpg" target="_blank"&gt;&lt;img src="http://people.apache.org/~jpountz/lz4/calgary-compress/2012_07_26_22_25/result.jpg"/&gt;&lt;/a&gt;

&lt;h3&gt;Uncompression&lt;/h3&gt;

&lt;a href="http://people.apache.org/~jpountz/lz4/calgary-uncompress/2012_07_26_23_53/result.jpg" target="_blank"&gt;&lt;img src="http://people.apache.org/~jpountz/lz4/calgary-uncompress/2012_07_26_23_53/result.jpg"/&gt;&lt;/a&gt;

&lt;p&gt;If you are curious about the compressors whose names start with &amp;#8220;LZ4 chunks&amp;#8221;, these are compressors that are implemented with the Java &lt;a href="http://docs.oracle.com/javase/tutorial/essential/io/streams.html" target="_blank"&gt;streams API&lt;/a&gt; and that compress every 64kb block of the input data separately.&lt;/p&gt;

&lt;p&gt;For the full Japex reports, see &lt;a href="http://people.apache.org/~jpountz/lz4" target="_blank"&gt;people.apache.org/~jpountz/lz4&lt;/a&gt;.&lt;/p&gt;</description><link>http://blog.jpountz.net/post/28092106032</link><guid>http://blog.jpountz.net/post/28092106032</guid><pubDate>Fri, 27 Jul 2012 02:55:00 +0200</pubDate><category>lz4</category></item><item><title>What is the theory behind Apache Lucene?</title><description>&lt;p&gt;There is a recurring request from users to have more insight into Lucene internals. For example, see:
&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://lucene.472066.n3.nabble.com/lucene-algorithm-td3939519.html" target="_blank"&gt;Lucene user mailing-list - lucene algorithm?&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;&lt;a href="http://stackoverflow.com/questions/2602253/how-does-lucene-index-documents" target="_blank"&gt;StackOverflow - How does Lucene index documents?&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.quora.com/Could-you-introduce-the-index-file-structure-and-theory-of-Lucene" target="_blank"&gt;Quora - Could you introduce the index-file structure and theory of Lucene?&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Although most of the ideas behind Lucene are explained in &lt;a href="http://wiki.apache.org/lucene-java/InformationRetrieval" target="_blank"&gt;any good book on Information Retrieval&lt;/a&gt;, Lucene also implements some advanced algorithms for specific tasks. In these cases, it is probably easier to read an article describing the idea than to reverse-engineer the code. This is why I started a &lt;a href="http://wiki.apache.org/lucene-java/LucenePapers" target="_blank"&gt;wiki page&lt;/a&gt; to collect links to research papers and blog articles that explain some advanced ideas behind Lucene.&lt;/p&gt;

&lt;p&gt;Feel free to help me improve this wiki page by sending me ideas of Lucene algorithms that would deserve an entry on it!&lt;/p&gt;</description><link>http://blog.jpountz.net/post/25875440790</link><guid>http://blog.jpountz.net/post/25875440790</guid><pubDate>Mon, 25 Jun 2012 22:51:40 +0200</pubDate><category>lucene</category></item><item><title>How fast is bit packing?</title><description>&lt;p&gt;One of the most anticipated changes in Lucene/Solr 4.0 is its improved memory efficiency. Indeed, according to &lt;a href="http://blog.mikemccandless.com/2010/07/lucenes-ram-usage-for-searching.html" target="_blank"&gt;several&lt;/a&gt; &lt;a href="http://www.lucidimagination.com/blog/2012/04/06/memory-comparisons-between-solr-3x-and-trunk/" target="_blank"&gt;benchmarks&lt;/a&gt;, you could expect a 2/3 reduction in memory use for a Lucene-based application (such as Solr or ElasticSearch) compared to Lucene 3.x.&lt;/p&gt;

&lt;p&gt;One of the techniques that Lucene uses to reduce its memory footprint is bit-packing. This means that integer array values, instead of being fixed-size (8, 16, 32 or 64 bits per value), can have any size in the [1-64] range. If you store 17-bit integers this way, this is a 47% reduction in the size of your array compared to an int[]!&lt;/p&gt;

&lt;p&gt;Here is what the interface looks like:&lt;/p&gt;
&lt;pre class="prettyprint"&gt;
interface Mutable {
  long get(int index);
  void set(int index, long value);
  int size();
}
&lt;/pre&gt;

&lt;p&gt;Under the hood, this interface has 4 families of implementations that differ in speed and memory efficiency:
&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Direct8, Direct16, Direct32 and Direct64, which just wrap a byte[], a short[], an int[] or a long[],&lt;/li&gt;
&lt;li&gt;Packed64, which packs values contiguously in 64-bit (long) blocks,&lt;/li&gt;
&lt;li&gt;Packed64SingleBlock, which looks like Packed64 but uses padding bits to prevent values from spanning across several blocks (32 bits per value at most),&lt;/li&gt;
&lt;li&gt;Packed8ThreeBlocks and Packed16ThreeBlocks, which store values in either 3 bytes (24 bits per value) or 3 shorts (48 bits per value).&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;In case you are interested, the code is available in the &lt;a href="http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/util/packed/" target="_blank"&gt;Lucene svn repository&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Direct{8,16,32,64}&lt;/h3&gt;

&lt;p&gt;The methods of these classes directly translate to operations on an array:
&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Direct8: byte[],&lt;/li&gt;
&lt;li&gt;Direct16: short[],&lt;/li&gt;
&lt;li&gt;Direct32: int[],&lt;/li&gt;
&lt;li&gt;Direct64: long[].&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Operations on these classes should be very fast given that they directly translate into array accesses. However, these implementations also have the same drawback as arrays: if you want to store 17-bit values, you will need to use a Direct32, which has an 88% memory overhead for 17-bit values.&lt;/p&gt;

&lt;h3&gt;Packed64&lt;/h3&gt;

&lt;p&gt;This implementation stores values contiguously in 64-bit blocks. This is the most compact implementation: if you want to store a million 17-bit values, it will require roughly 17 * 1000000 / 8 ~= 2MB of space. One pitfall is that some values may span two different blocks (when the number of bits per value is not a divisor of 64). As a consequence, to avoid &lt;a href="http://stackoverflow.com/questions/9820319/why-is-a-cpu-branch-instruction-slow" target="_blank"&gt;costly CPU branches&lt;/a&gt;, the implementations of the get and set methods are a little tricky and always operate on 2 blocks with different shifts and masks.&lt;/p&gt;
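
&lt;p&gt;As a rough illustration of contiguous packing, here is a minimal sketch of a packed array backed by a long[]. It is not the Lucene code: the class name is hypothetical, and for readability it uses a branch to handle values that span two blocks, which is exactly what the real implementation tries to avoid.&lt;/p&gt;
&lt;pre class="prettyprint"&gt;
public class SimplePacked64 {
  private final long[] blocks;
  private final int bitsPerValue;
  private final long mask;

  public SimplePacked64(int valueCount, int bitsPerValue) {
    this.bitsPerValue = bitsPerValue;
    // Java only uses the low 6 bits of a long shift count, hence the special case.
    this.mask = bitsPerValue == 64 ? -1L : (1L &amp;lt;&amp;lt; bitsPerValue) - 1;
    long totalBits = (long) valueCount * bitsPerValue;
    this.blocks = new long[(int) ((totalBits + 63) / 64)];
  }

  public long get(int index) {
    long bitPosition = (long) index * bitsPerValue;
    int block = (int) (bitPosition / 64);
    int shift = (int) (bitPosition % 64);
    long value = blocks[block] &gt;&gt;&gt; shift;
    if (shift + bitsPerValue &gt; 64) {
      // The value spans two blocks: its high bits are the low bits of the next block.
      value |= blocks[block + 1] &amp;lt;&amp;lt; (64 - shift);
    }
    return value &amp;amp; mask;
  }

  public void set(int index, long value) {
    long bitPosition = (long) index * bitsPerValue;
    int block = (int) (bitPosition / 64);
    int shift = (int) (bitPosition % 64);
    long bits = value &amp;amp; mask;
    // Clear the target bits of the first block, then write the low bits of the value.
    blocks[block] = (blocks[block] &amp;amp; ~(mask &amp;lt;&amp;lt; shift)) | (bits &amp;lt;&amp;lt; shift);
    if (shift + bitsPerValue &gt; 64) {
      // Write the remaining high bits at the start of the next block.
      int bitsInFirstBlock = 64 - shift;
      blocks[block + 1] = (blocks[block + 1] &amp;amp; ~(mask &gt;&gt;&gt; bitsInFirstBlock)) | (bits &gt;&gt;&gt; bitsInFirstBlock);
    }
  }
}
&lt;/pre&gt;
&lt;p&gt;With 17 bits per value, for example, the value at index 3 occupies the last 13 bits of the first block and the first 4 bits of the second one, so both get and set have to touch two blocks.&lt;/p&gt;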

&lt;h3&gt;Packed64SingleBlock&lt;/h3&gt;

&lt;p&gt;This implementation is similar to Packed64 but does not allow its values to span across several blocks. If you want to store 21-bit values, every block will consist of three 21-bit values (using 3*21=63 bits) and 64-63=1 padding bit (2% space loss). Here are the different value sizes that this class accepts:&lt;/p&gt;

&lt;table style="text-align:center"&gt;&lt;tr&gt;&lt;th style="padding:8px"&gt;Bits per value&lt;/th&gt;&lt;th style="padding:8px"&gt;Values per block&lt;/th&gt;&lt;th style="padding:8px"&gt;Padding bits&lt;/th&gt;&lt;th style="padding:8px"&gt;Space loss&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;32&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;21&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;2%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;12&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;6%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;6%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;2%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;2%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;6%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;6%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;21&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;2%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;32&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;64&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;h3&gt;Packed{8,16}ThreeBlocks&lt;/h3&gt;

&lt;p&gt;These classes use 3 bytes or 3 shorts to store a single value. They are well-suited for 24-bit and 48-bit values, but have a maximum size of Integer.MAX_VALUE/3 (so that the underlying array can be addressed by an int).&lt;/p&gt;
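
&lt;p&gt;Here is a minimal sketch of the three-bytes-per-value layout. The class name is hypothetical and the byte order (most significant byte first) is an assumption of this sketch, not something taken from the Lucene sources.&lt;/p&gt;
&lt;pre class="prettyprint"&gt;
public class SimplePacked8ThreeBlocks {
  private final byte[] blocks;

  public SimplePacked8ThreeBlocks(int valueCount) {
    // 3 bytes per value: multiplyExact throws if valueCount exceeds Integer.MAX_VALUE / 3.
    this.blocks = new byte[Math.multiplyExact(valueCount, 3)];
  }

  public long get(int index) {
    int offset = index * 3;
    // Bytes are signed in Java, so mask each one before shifting it into place.
    return (blocks[offset] &amp;amp; 0xFFL) &amp;lt;&amp;lt; 16
        | (blocks[offset + 1] &amp;amp; 0xFFL) &amp;lt;&amp;lt; 8
        | (blocks[offset + 2] &amp;amp; 0xFFL);
  }

  public void set(int index, long value) {
    int offset = index * 3;
    // Only the 24 low bits of the value are kept.
    blocks[offset] = (byte) (value &gt;&gt;&gt; 16);
    blocks[offset + 1] = (byte) (value &gt;&gt;&gt; 8);
    blocks[offset + 2] = (byte) value;
  }
}
&lt;/pre&gt;
&lt;p&gt;Since every value starts on a byte boundary, get and set need no per-value shift computation, only three array accesses.&lt;/p&gt;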

&lt;h3&gt;How do they compare?&lt;/h3&gt;

&lt;p&gt;For every number of bits per value, there are 2 to 4 available implementations. One important criterion to select the one that best suits your needs is the memory overhead.&lt;/p&gt;

&lt;p&gt;Here are the memory overheads for every number of bits per value and bit-packing scheme. The X-axis is the number of bits per value while the Y-axis is the memory overhead (space loss / actually used space).&lt;/p&gt;

&lt;p&gt;For every bit-packing scheme, I only considered the most compact implementation. I could use a Direct64 to store 20-bit values, but it is very likely to have similar (probably a little worse since the CPU cache is less likely to help) performance to a Direct32, although it requires twice as much space.&lt;/p&gt;

&lt;p&gt;For example, there are 4 available implementations to store 20-bit values:
&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Direct32 (32 bits per value), which has 60% memory overhead&lt;/li&gt;
&lt;li&gt;Packed64 (20 bits per value), which has 0% memory overhead&lt;/li&gt;
&lt;li&gt;Packed64SingleBlock (21 bits + 1/3 padding bit per value), which has 7% memory overhead&lt;/li&gt;
&lt;li&gt;Packed8ThreeBlocks (24 bits per value), which has 20% memory overhead&lt;/li&gt;
&lt;/ul&gt;&lt;div id="overhead_chart_div"&gt;&lt;/div&gt;
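
&lt;p&gt;The overheads listed above for 20 bits per value can be reproduced with a few lines of arithmetic. This is only a sketch: the class and method names are made up for this illustration, and the overhead is computed as the extra bits used per value divided by the number of bits actually needed.&lt;/p&gt;
&lt;pre class="prettyprint"&gt;
public class MemoryOverhead {

  // Overhead = (bits used per value - bits needed per value) / bits needed per value.
  static double overhead(double usedBitsPerValue, int bitsPerValue) {
    return (usedBitsPerValue - bitsPerValue) / bitsPerValue;
  }

  public static void main(String[] args) {
    int bits = 20;
    System.out.println("Direct32: " + overhead(32, bits));
    System.out.println("Packed64: " + overhead(20, bits));
    // Packed64SingleBlock stores three values per 64-bit block.
    System.out.println("Packed64SingleBlock: " + overhead(64.0 / 3, bits));
    System.out.println("Packed8ThreeBlocks: " + overhead(24, bits));
  }
}
&lt;/pre&gt;
&lt;p&gt;The four results are roughly 0.6, 0, 0.067 and 0.2, which match the percentages in the list.&lt;/p&gt;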
    &lt;script type="text/javascript"&gt;
      google.load("visualization", "1", {packages:["corechart"]});
      google.setOnLoadCallback(drawOverheadChart);
      function drawOverheadChart() {
        var data = google.visualization.arrayToDataTable([
          ['Bits per value', 'Packed64', 'Packed64SingleBlock', 'Packed*ThreeBlocks', 'Direct*'],
[1,0,0.0,23.0,7.0],
[2,0,0.0,11.0,3.0],
[3,0,0.015873015873,7.0,1.66666666667],
[4,0,0.0,5.0,1.0],
[5,0,0.0666666666667,3.8,0.6],
[6,0,0.0666666666667,3.0,0.333333333333],
[7,0,0.015873015873,2.42857142857,0.142857142857],
[8,0,0.0,2.0,0.0],
[9,0,0.015873015873,1.66666666667,0.777777777778],
[10,0,0.0666666666667,1.4,0.6],
[11,0,0.163636363636,1.18181818182,0.454545454545],
[12,0,0.0666666666667,1.0,0.333333333333],
[13,0,0.230769230769,0.846153846154,0.230769230769],
[14,0,0.142857142857,0.714285714286,0.142857142857],
[15,0,0.0666666666667,0.6,0.0666666666667],
[16,0,0.0,0.5,0.0],
[17,0,0.254901960784,0.411764705882,0.882352941176],
[18,0,0.185185185185,0.333333333333,0.777777777778],
[19,0,0.122807017544,0.263157894737,0.684210526316],
[20,0,0.0666666666667,0.2,0.6],
[21,0,0.015873015873,0.142857142857,0.52380952381],
[22,0,0.454545454545,0.0909090909091,0.454545454545],
[23,0,0.391304347826,0.0434782608696,0.391304347826],
[24,0,0.333333333333,0.0,0.333333333333],
[25,0,0.28,0.92,0.28],
[26,0,0.230769230769,0.846153846154,0.230769230769],
[27,0,0.185185185185,0.777777777778,0.185185185185],
[28,0,0.142857142857,0.714285714286,0.142857142857],
[29,0,0.103448275862,0.655172413793,0.103448275862],
[30,0,0.0666666666667,0.6,0.0666666666667],
[31,0,0.0322580645161,0.548387096774,0.0322580645161],
[32,0,0.0,0.5,0.0],
[33,0,0.939393939394,0.454545454545,0.939393939394],
[34,0,0.882352941176,0.411764705882,0.882352941176],
[35,0,0.828571428571,0.371428571429,0.828571428571],
[36,0,0.777777777778,0.333333333333,0.777777777778],
[37,0,0.72972972973,0.297297297297,0.72972972973],
[38,0,0.684210526316,0.263157894737,0.684210526316],
[39,0,0.641025641026,0.230769230769,0.641025641026],
[40,0,0.6,0.2,0.6],
[41,0,0.560975609756,0.170731707317,0.560975609756],
[42,0,0.52380952381,0.142857142857,0.52380952381],
[43,0,0.488372093023,0.116279069767,0.488372093023],
[44,0,0.454545454545,0.0909090909091,0.454545454545],
[45,0,0.422222222222,0.0666666666667,0.422222222222],
[46,0,0.391304347826,0.0434782608696,0.391304347826],
[47,0,0.36170212766,0.0212765957447,0.36170212766],
[48,0,0.333333333333,0.0,0.333333333333],
[49,0,0.30612244898,0,0.30612244898],
[50,0,0.28,0,0.28],
[51,0,0.254901960784,0,0.254901960784],
[52,0,0.230769230769,0,0.230769230769],
[53,0,0.207547169811,0,0.207547169811],
[54,0,0.185185185185,0,0.185185185185],
[55,0,0.163636363636,0,0.163636363636],
[56,0,0.142857142857,0,0.142857142857],
[57,0,0.122807017544,0,0.122807017544],
[58,0,0.103448275862,0,0.103448275862],
[59,0,0.0847457627119,0,0.0847457627119],
[60,0,0.0666666666667,0,0.0666666666667],
[61,0,0.0491803278689,0,0.0491803278689],
[62,0,0.0322580645161,0,0.0322580645161],
[63,0,0.015873015873,0,0.015873015873],
[64,0,0.0,0,0.0]
        ]);

        var options = {
          title: 'Memory overhead', height: 500, vAxis: {viewWindowMode: 'explicit', viewWindow: {min: 0, max: 1}}
        };

        var chart = new google.visualization.LineChart(document.getElementById('overhead_chart_div'));
        chart.draw(data, options);
      }
    &lt;/script&gt;&lt;p&gt;Even if we now know how compact the different implementations are, it is still very difficult to decide which implementation to use without knowing their relative performance characteristics. This is why I wrote a simple benchmark that, for every number of bits per value in [1,64]:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;creates 2 to 4 packed integer arrays (one per implementation) of size 10,000,000&lt;/li&gt;
&lt;li&gt;tests their random write performance (offsets are randomly chosen in the [0, 10000000[ range),&lt;/li&gt;
&lt;li&gt;tests their random read performance.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The X-axis is the number of bits per value while the Y-axis is the number of read/written values per second.&lt;/p&gt;

&lt;div id="set_chart_div"&gt;&lt;/div&gt;

    &lt;script type="text/javascript"&gt;
      google.setOnLoadCallback(drawSetChart);
      function drawSetChart() {
        var data = google.visualization.arrayToDataTable([
          ['Bits per value', 'Packed64', 'Packed64SingleBlock', 'Packed*ThreeBlocks', 'Direct*'],
[1,40024182,50013915,26799649,31148781],
[2,19674188,23695987,26695755,31289709],
[3,14226400,19857153,26694682,31241852],
[4,12373123,18097508,26728959,31189720],
[5,11454225,17155978,26972793,31156861],
[6,10878472,16655075,26544916,31139097],
[7,10541390,16439906,27059476,31221018],
[8,9937865,16144305,26674469,31103117],
[9,10060535,16016328,26803435,28926206],
[10,9897956,15770329,26866715,29072370],
[11,9777015,15662203,26738005,29207536],
[12,9660067,15671511,26671281,29008772],
[13,9567019,15460832,26837050,29099937],
[14,9506542,15507833,26868253,29088647],
[15,9453000,15438857,26874611,29082010],
[16,9384342,15453020,26715737,29307437],
[17,9334604,15233412,26851190,28265115],
[18,9306847,15345372,26822516,28252309],
[19,9289988,15335770,26317885,28502489],
[20,9242061,15312377,26920845,28462649],
[21,9210224,15309043,26899269,28468981],
[22,9179631,15134307,26856890,28450456],
[23,9159639,15136373,26891619,28469664],
[24,9133198,15067646,26692590,28431212],
[25,9099484,15130953,25196942,28397666],
[26,9056843,15080917,24952860,27843814],
[27,9030181,15135605,24945959,27743461],
[28,9000132,15092810,24773626,28418073],
[29,9018631,15089474,25066920,28290945],
[30,8943546,15014653,25092872,28009670],
[31,8966245,15087269,24824155,27962025],
[32,8928867,15071491,24761137,28264808],
[33,8920969,0,24865661,27320110],
[34,8926212,0,25055605,26690234],
[35,8920330,0,24376696,26660631],
[36,8887081,0,25038496,27374089],
[37,8894831,0,24954637,27500014],
[38,8859078,0,25060738,27508369],
[39,8865104,0,25081292,27471973],
[40,8845813,0,24873532,27695949],
[41,8781041,0,25049886,27087887],
[42,8851026,0,24824304,27354543],
[43,8772517,0,24742211,27132346],
[44,8809178,0,24901051,27417557],
[45,8779912,0,24829358,27322703],
[46,8781733,0,24917159,27129512],
[47,8737012,0,24918855,27470129],
[48,8786427,0,24980024,27382678],
[49,8777120,0,0,27467107],
[50,8742312,0,0,26929687],
[51,8732868,0,0,27359683],
[52,8701630,0,0,27390082],
[53,8634870,0,0,27435931],
[54,8691887,0,0,27172447],
[55,8699268,0,0,27288376],
[56,8699971,0,0,27352817],
[57,8694983,0,0,27521569],
[58,8675943,0,0,27479408],
[59,8680025,0,0,27430633],
[60,8669569,0,0,27455008],
[61,8646643,0,0,27547610],
[62,8646508,0,0,27330770],
[63,8598927,0,0,27387172],
[64,8610027,0,0,27392016]
        ]);

        var options = {
          title: 'Random set', height: 500
        };

        var chart = new google.visualization.LineChart(document.getElementById('set_chart_div'));
        chart.draw(data, options);
      }
    &lt;/script&gt;&lt;p&gt;The Direct* implementations are clearly faster than the packed implementations (~3x faster than Packed64 and 2x faster than Packed64SingleBlock). However, it is interesting to observe that the Packed*ThreeBlocks implementations are almost as fast as the Direct* implementations.&lt;/p&gt;

&lt;p&gt;Packed64 and Packed64SingleBlock are much faster with small values (1 or 2 bits), due to the fact that the CPU caches can hold many more values at the same time, resulting in fewer cache misses when trying to access the data.&lt;/p&gt;

&lt;p&gt;Now, how do read operations compare?&lt;/p&gt;

&lt;div id="get_chart_div"&gt;&lt;/div&gt;

    &lt;script type="text/javascript"&gt;
      google.setOnLoadCallback(drawGetChart);
      function drawGetChart() {
        var data = google.visualization.arrayToDataTable([
           ['Bits per value', 'Packed64', 'Packed64SingleBlock', 'Packed*ThreeBlocks', 'Direct*'],
[1,49499152,55816105,15060428,18722499],
[2,25283155,25005155,15056413,18632532],
[3,19558665,21115686,15129155,18649184],
[4,17847427,19309503,15155653,18593107],
[5,16906615,18166840,15096733,18598612],
[6,16454639,17747338,15155823,18679563],
[7,16180143,17417339,15192372,18622640],
[8,15815419,17218841,15205616,18655037],
[9,15697081,17002324,15146994,17850284],
[10,15565933,16844119,15106870,17875076],
[11,15461803,16699354,15181195,17754281],
[12,15333161,16615448,15162435,17860538],
[13,15326129,16463168,15199208,17744178],
[14,15241576,16495390,15154762,17800154],
[15,15250276,16487404,15226363,17810772],
[16,15082406,16552239,15197053,17823784],
[17,15053857,16350533,15163570,17398572],
[18,15050303,16343254,15224914,17491174],
[19,15059566,16322038,15068781,17555536],
[20,15029306,16356276,15194852,17449492],
[21,15003195,16360029,15210569,17513605],
[22,14961913,16238606,15259364,17495461],
[23,14934477,16184813,15168377,17447383],
[24,14915185,16173809,15158979,17424475],
[25,14899154,16121629,14142718,17415336],
[26,14841785,15920097,14103531,17411601],
[27,14848549,16048678,14166176,17484032],
[28,14831644,16186097,14079716,17459000],
[29,14820707,16121523,14118513,17403430],
[30,14711167,16068491,13925003,17336406],
[31,14737775,15971596,14107706,17431069],
[32,14698641,16104616,13997093,17398179],
[33,14729415,0,14103713,16737702],
[34,14714873,0,13968590,16626710],
[35,14684921,0,14167653,16537464],
[36,14713197,0,14139225,16811790],
[37,14652201,0,14090537,16752590],
[38,14597405,0,14211869,16822380],
[39,14652222,0,14155315,16759882],
[40,14602779,0,14113776,16836072],
[41,14550463,0,14107634,16750897],
[42,14569186,0,14168421,16208709],
[43,14534068,0,14097957,16812648],
[44,14596257,0,14116369,16811819],
[45,14540653,0,14112409,16795018],
[46,14519835,0,14071547,16765412],
[47,14436686,0,14126130,16738008],
[48,14536156,0,14191052,16789650],
[49,14541968,0,0,16808156],
[50,14468183,0,0,16698831],
[51,14427984,0,0,16824616],
[52,14363002,0,0,16902790],
[53,14377267,0,0,16773191],
[54,14379246,0,0,16829771],
[55,14346461,0,0,16833457],
[56,14406927,0,0,16887386],
[57,14332995,0,0,16842712],
[58,14333535,0,0,16900931],
[59,14382419,0,0,16824542],
[60,14337640,0,0,16873512],
[61,14295068,0,0,16890941],
[62,14264630,0,0,16857824],
[63,14221928,0,0,16793856],
[64,14225371,0,0,16789101]
        ]);

        var options = {
          title: 'Random get', height: 500
        };

        var chart = new google.visualization.LineChart(document.getElementById('get_chart_div'));
        chart.draw(data, options);
      }
    &lt;/script&gt;&lt;p&gt;This time the results are very different. The fastest implementations are still the Direct* ones, but they are only ~18% faster than Packed64 and Packed*ThreeBlocks, and only ~8% faster than Packed64SingleBlock on average. This means that for read-only use cases, you could save a lot of memory by switching your arrays to a packed implementation while keeping performance at the same level.&lt;/p&gt;

&lt;h3&gt;Conclusion&lt;/h3&gt;

&lt;p&gt;Although bit-packing can help reduce memory use significantly, it is very rarely used in practice, probably because:
&lt;/p&gt;&lt;ul&gt;&lt;li&gt;people usually don&amp;#8217;t know how many bits per value they actually need,&lt;/li&gt;
&lt;li&gt;8, 16, 32 and 64-bit arrays are language built-ins, while packed arrays require some extra coding.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, this experiment shows that you can achieve significant reductions in memory use by using packed integer arrays, without sacrificing performance too much since packed arrays can be almost as fast as raw arrays, especially for read operations.&lt;/p&gt;</description><link>http://blog.jpountz.net/post/25530978824</link><guid>http://blog.jpountz.net/post/25530978824</guid><pubDate>Thu, 21 Jun 2012 00:02:00 +0200</pubDate><category>packedints</category><category>lucene</category></item></channel></rss>
