Lzma stands for Lempel-Ziv-Markov chain Algorithm. Lzma is a command-line tool for compressing and decompressing files, like bzip2 and gzip. In the tests below, it was both faster than bzip2 and produced smaller output. As we know, gzip generally achieves a lower compression ratio than bzip2 (and lzma).
In this article, let us see how to use lzma and compare its compression ratio and speed with those of bzip2.
Compress the input text file using lzma -c
$ lzma -c --stdout sample.txt >sample.lzma
Decompress the lzma file using -d option
$ lzma -d --stdout sample.lzma >sample.txt
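To check that a compress/decompress cycle is lossless, the two commands above can be combined into a round trip. The following is a minimal sketch, assuming lzma and the usual POSIX tools are on the PATH; the scratch directory, the generated test file and the roundtrip variable are made up for illustration and are not part of the original test.

```shell
#!/bin/sh
# Round-trip sketch: compress, decompress, then compare with the original.
# File names and the generated content are illustrative assumptions.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT   # remove the scratch directory on exit

# Generate a repetitive text file to play the role of sample.txt.
yes "the quick brown fox jumps over the lazy dog" | head -n 1000 > "$workdir/sample.txt"

# -c writes to stdout, so the input file is left untouched.
lzma -c "$workdir/sample.txt" > "$workdir/sample.lzma"

# -d selects decompression; -c again sends the result to stdout.
lzma -d -c "$workdir/sample.lzma" > "$workdir/restored.txt"

# cmp -s exits with status 0 only if both files are byte-for-byte identical.
if cmp -s "$workdir/sample.txt" "$workdir/restored.txt"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "round trip: $roundtrip"
```

Because both steps write to stdout, neither the original nor the compressed file is deleted along the way, which makes the comparison with cmp possible.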
Comparison between bzip2 and lzma compression tools
To understand the effectiveness of lzma, let us compress and decompress a 1MB sample.txt with both lzma and bzip2 and compare the results. The tests were run on a machine with 1GB of RAM and a Pentium 4 processor.
Size of the sample.txt input file:
$ ls -l sample.txt
-rw-r--r-- 1 bala bala 1048576 2010-05-14 19:43 sample.txt
Note: We prefixed every compression and decompression command with the time command to measure the elapsed and CPU time it used.
Compress the sample.txt using bzip2
Compress the input file with the bzip2 command. It doesn't require any option for compression.
$ time bzip2 sample.txt
real 0m27.874s
user 0m13.981s
sys 0m0.148s
$ ls -l sample.txt.bz2
-rw-r--r-- 1 bala bala 1750 2010-05-14 19:43 sample.txt.bz2
After bzip2 compression, the output file is 1750 bytes.
Decompress sample.txt.bz2 using bunzip2
Decompress the compressed file with the bunzip2 utility, which also doesn't need any option.
$ time bunzip2 sample.txt.bz2
real 0m0.232s
user 0m0.128s
sys 0m0.020s
Compress the sample.txt using lzma
Now, let us compress sample.txt using the lzma command with the following options:
- -c to write the compressed output to stdout and leave the input file in place. Compression is lzma's default mode, so no separate option is needed to select it.
- --stdout is the long form of -c, so giving both is redundant but harmless.
$ time lzma -c --stdout sample.txt >sample.lzma
real 0m2.035s
user 0m1.544s
sys 0m0.132s
$ ls -l sample.lzma
-rw-r--r-- 1 bala bala 543 2010-05-14 19:48 sample.lzma
After compression, lzma produces an output file of 543 bytes, which is smaller than the 1750 bytes produced by bzip2. Also, as seen above, the CPU time used by lzma (1.544s user) is much lower than that used by bzip2 (13.981s user).
Decompress sample.lzma using lzma
Decompress the *.lzma file using the lzma command with the following options:
- -d to decompress
- --stdout to print the decompressed output to stdout
$ time lzma -d --stdout sample.lzma >sample.txt
real 0m0.043s
user 0m0.016s
sys 0m0.004s
As seen above, lzma decompressed the file several times faster than bunzip2 (0.043s versus 0.232s of real time).
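The size comparison above can be repeated on any machine with a short script. This is a sketch under stated assumptions, not the procedure used for the measurements in this article: it assumes gzip, bzip2 and lzma are installed, and it generates its own repetitive input file (the path and content are made up for illustration), so the byte counts it prints will differ from the figures shown here.

```shell
#!/bin/sh
# Size comparison sketch: compress one input with gzip, bzip2 and lzma.
# The input file is generated here and is only a stand-in for sample.txt.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT   # remove the scratch directory on exit

input="$workdir/sample.txt"
yes "the quick brown fox jumps over the lazy dog" | head -n 20000 > "$input"

# -c sends the compressed data to stdout and keeps the input file,
# so the same input can be fed to all three tools.
gzip  -c "$input" > "$workdir/sample.gz"
bzip2 -c "$input" > "$workdir/sample.bz2"
lzma  -c "$input" > "$workdir/sample.lzma"

# Print the size of the original and of each compressed file.
for f in "$input" "$workdir/sample.gz" "$workdir/sample.bz2" "$workdir/sample.lzma"; do
    size=$(wc -c < "$f" | tr -d ' ')
    printf '%-12s %s bytes\n' "$(basename "$f")" "$size"
done
```

To collect timings as well, each of the three compression commands can be prefixed with time, as was done for the measurements in this section.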
Different Levels of Lzma Compression
- Lzma provides compression levels from -1 to -9.
- -9 gives the highest compression ratio, at the cost of more time and system resources. The levels apply only to compression, not to decompression.
- -1 gives the lowest compression ratio and runs much quicker.
For a quick lzma compression, use the lowest compression level:
$ lzma -1 -c --stdout sample.txt >sample.lzma
$ ls -l sample.lzma
-rw-r--r-- 1 bala bala 548 2010-05-14 20:47 sample.lzma
Note: --fast is an alias for -1.
-9 is the highest compression level, and it takes longer to compress than the lower levels. For an intensive compression, use the highest compression level:
$ lzma -9 -c --stdout sample.txt >sample.lzma
$ ls -l sample.lzma
-rw-r--r-- 1 bala bala 543 2010-05-14 20:55 sample.lzma
Note: --best is an alias for -9.
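To see how the level affects output size on your own data, the same command can be run in a loop. This is a minimal sketch, assuming lzma is on the PATH; the generated input file, the output file names and the choice of levels 1, 5 and 9 are illustrative assumptions, and the sizes printed depend entirely on the input.

```shell
#!/bin/sh
# Level comparison sketch: compress the same input at several levels.
# The input file is generated here and is only a stand-in for sample.txt.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT   # remove the scratch directory on exit

input="$workdir/sample.txt"
yes "the quick brown fox jumps over the lazy dog" | head -n 20000 > "$input"

for level in 1 5 9; do
    out="$workdir/sample.$level.lzma"
    # "-$level" expands to -1, -5 or -9; -c keeps the input file in place.
    lzma "-$level" -c "$input" > "$out"
    size=$(wc -c < "$out" | tr -d ' ')
    echo "level -$level: $size bytes"
done
```

On highly repetitive input such as this, the difference between levels may be only a few bytes, much like the 548 versus 543 bytes seen above.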