# Quantized Indexing

A Data Compression Patent, applying the power of combinatorial mathematics
to enumerative coding by uncovering a method which solves the problem of arithmetic
precision in Enumerative Coding, the underlying problem which has long
prevented it from being used for general purpose entropy coding, and for coding
complex data structures.

While Enumerative Coding has been acknowledged as the most “desirable”
universal coding scheme since the 1970s, to this point it has only been used as a
technique for encoding CDs and DVDs, where brute force techniques can be
used to create the constrained codes.

The wider and longer term research effort to develop a general purpose
implementation had previously only produced the Arithmetic Coder, (invented by
Jorma Rissanen, IBM Research, 1976) which, while it is in widespread use today
as a general purpose entropy coder, is still essentially only an approximation of
the exact codes used in Enumerative Coding.

With the solution to the arithmetic precision problem, the wider potential for this
algorithm was revealed, from delivering a competitive advantage in
large scale applications such as database and web search and media storage to
its obvious advantages in content delivery, particularly for audio and video codec
applications where the requirement is to most efficiently process small amounts
of dynamically and unpredictably changing data.

resilience to uncertain inputs, Quantized Indexing uses only simple instructions,
shifts and adds rather than more complex multiply and divide instructions. This
fact alone delivers an incremental speed advantage for applications running on
the lower powered processors typically employed in portable devices, such as
media players and cell phones, while significantly extending battery life.

Simply stated, Quantized Indexing always compresses data to a smaller size
than the very best research Arithmetic Coders, while simultaneously delivering a
.
For a simple, but immediate, appreciation of the inherent speed advantage of this
algorithm, and the power of Combinatorial Mathematics applied to Enumerative
Coding, consider that in processing any string of data, Quantized Indexing does
no work at all on the most probable symbol and less work on the least probable
symbol. Even that lesser work comprises only simple instructions, not the more
complex and power consuming instructions of an Arithmetic Coder, which is also
required to inspect every bit that is presented in order to maintain its probability table.

In testing and verification with a large number and variety of inputs, such as the
sparse arrays which are typically encountered in search applications
by companies such as Google, Yahoo and Oracle, the speed advantage is more
than 240 times faster, while the compressed output size advantage is “only” 6%.
With minimal optimization, the speed advantage can exceed 1000 times faster.

In addition to its speed and efficiency gains, the stability and predictability of the
encoded size of the data permits optimal packaging into fixed size fields, for
serializing complex data into separately compressed components more suitable
for fast random access without the need to expand other intervening data.
In contrast, when tested with small amounts of dynamically variable data such as
that processed in the audio and video codecs the tested speed advantage is reduced to
“only” 20 times, but the compressed output size can be reduced by more than 50%

In either case, when executed on the lower powered processors typically found in
portable devices, the compressed output size advantage remains constant, but
the speed advantage is increased by a factor of about 1.7, precisely because no
complex instructions are required.

Detailed performance test results and research source code are available.

Patents: