[Linux] High Compression and Extraction with the bzip2 and bunzip2 Commands

Overview

The bzip2 command compresses (encodes) and decompresses (decodes) files using the Burrows-Wheeler block-sorting algorithm combined with Huffman coding. It is slower than the standard gzip command but generally achieves a higher compression ratio. Compressed files typically use the .bz2 extension.

Specifications (Arguments and Options)

Syntax

bzip2 [options] [filename]

bunzip2 [options] [filename]

Note: bunzip2 is equivalent to bzip2 -d.
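The equivalence can be checked directly. The following is a minimal sketch that assumes both bzip2 and bunzip2 are installed; the file names and the sample data are made up for the demonstration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Hypothetical sample data: 1000 numbered lines.
seq 1 1000 > "$workdir/sample.txt"

# Compress, keeping the original for comparison.
bzip2 -k "$workdir/sample.txt"

# Decompress the same archive both ways, writing to separate files.
bzip2 -d -c "$workdir/sample.txt.bz2" > "$workdir/via_bzip2.txt"
bunzip2 -c "$workdir/sample.txt.bz2" > "$workdir/via_bunzip2.txt"

# cmp exits with status 0 only when the two files are byte-identical.
cmp "$workdir/via_bzip2.txt" "$workdir/via_bunzip2.txt" && echo "bzip2 -d and bunzip2 agree"
cmp "$workdir/sample.txt" "$workdir/via_bunzip2.txt" && echo "round trip matches the original"
```

Remove the directory with rm -rf "$workdir" when you are finished.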

Main Arguments and Options

Option               Description
-d / --decompress    Decompresses the file.
-z / --compress      Compresses the file (default behavior).
-k / --keep          Keeps the original file after processing (it is deleted by default).
-f / --force         Overwrites existing files with the same name.
-c / --stdout        Writes the result to standard output; the original file is not changed.
-s / --small         Reduces memory usage during execution (reduces speed).
-1 .. -9             Sets the compression level by choosing the block size, from 100 kB (-1) to 900 kB (-9). -1 uses the least memory and gives the lowest ratio; -9 gives the highest ratio (default is -9).
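The interaction between -k, -f, and -c is easiest to see on a small file. This is only a sketch under the assumption that bzip2 is installed; notes.txt and its contents are hypothetical.

```shell
workdir=$(mktemp -d)
printf 'first version\n' > "$workdir/notes.txt"

# First run: creates notes.txt.bz2 and keeps notes.txt because of -k.
bzip2 -k "$workdir/notes.txt"

# Second run without -f: bzip2 refuses to overwrite the existing .bz2 file
# and exits with a non-zero status.
if bzip2 -k "$workdir/notes.txt" 2>/dev/null; then
    overwrite_without_force="allowed"
else
    overwrite_without_force="refused"
fi
echo "without -f: $overwrite_without_force"

# Change the source, then replace the old archive by adding -f.
printf 'second version\n' > "$workdir/notes.txt"
bzip2 -k -f "$workdir/notes.txt"

# -d -c prints the decompressed data without touching the archive.
restored=$(bzip2 -d -c "$workdir/notes.txt.bz2")
echo "archive now contains: $restored"
```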

Basic Usage

This process shows how to compress a text file (service.log), check the size change, and then restore it.

# Check the original file
ls -lh service.log

# Compress the file (original is deleted, .bz2 is created)
bzip2 service.log

# Check the compressed file size
ls -lh service.log.bz2

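# Decompress the file (the .bz2 file is removed and service.log is restored)
bzip2 -d service.log.bz2

# Confirm that the original file is back
ls -lh service.log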

Example Result

-rw-r--r-- 1 user user 10M  Jan 20 10:00 service.log
-rw-r--r-- 1 user user 2.1M Jan 20 10:00 service.log.bz2
-rw-r--r-- 1 user user 10M  Jan 20 10:00 service.log

Compressing Without Deleting the Original File

By default, the original file is removed after compression. Use the -k option to keep it.

# Create a compressed file while keeping the original
bzip2 -k service.log
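To see the difference side by side, the sketch below compresses one file with the default behavior and another with -k. The file names are hypothetical and the data is generated only for the demonstration.

```shell
workdir=$(mktemp -d)
seq 1 500 > "$workdir/default.log"
seq 1 500 > "$workdir/kept.log"

# Default behavior: default.log is replaced by default.log.bz2.
bzip2 "$workdir/default.log"

# With -k: kept.log stays in place next to kept.log.bz2.
bzip2 -k "$workdir/kept.log"

# Show what is left in the directory.
ls "$workdir"
```

The listing should contain default.log.bz2, kept.log, and kept.log.bz2, with no default.log.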

Practical Commands

Compressing Files Recursively within a Directory

Since bzip2 cannot traverse directories on its own, you must combine it with the find command. The following example compresses all files inside the archive/ directory individually.

# Target files in the archive directory and run bzip2 on each one
find archive/ -type f -exec bzip2 {} \;

Example Result

# Before execution
archive/log1.txt
archive/subdir/log2.txt

# After execution (each file is compressed individually)
archive/log1.txt.bz2
archive/subdir/log2.txt.bz2
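If the directory may already contain .bz2 files, or holds many files, a slightly more defensive variant of the same idea can help. The sketch below is one possible approach, not the only one: it skips names ending in .bz2 and ends -exec with "+" so that find passes many files to each bzip2 invocation. The directory layout is hypothetical.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/archive/subdir"
seq 1 100 > "$workdir/archive/log1.txt"
seq 1 100 > "$workdir/archive/subdir/log2.txt"
seq 1 100 > "$workdir/archive/old.txt"
bzip2 "$workdir/archive/old.txt"   # already compressed before the run

# Skip files that already end in .bz2, and hand files to bzip2 in batches
# by ending -exec with "+" instead of "\;".
find "$workdir/archive" -type f ! -name '*.bz2' -exec bzip2 {} +

# List what is left; every regular file should now end in .bz2.
find "$workdir/archive" -type f | sort
```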

Customization Points

  • Adjusting Compression Levels (-1 to -9): Use -1 when speed or memory use matters more than file size. Use -9 (the default) if you want the smallest file size possible, even if it takes more time. The speed difference between levels is usually modest.

    bzip2 -1 large_data.csv

  • Using Standard Output (-c): This is useful when you want to pipe the compression result to another command instead of saving it to a file.

    # Example: Transfer compressed data over SSH without saving locally
    bzip2 -c database.sql | ssh remote_host "cat > database.sql.bz2"
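Both points can be tried locally without a remote host. The sketch below measures the output of -1 and -9 on generated data and then uses a local pipe as a stand-in for the SSH transfer. The data is synthetic, so the sizes it prints say nothing definite about real files.

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/large_data.csv"

original_bytes=$(wc -c < "$workdir/large_data.csv" | tr -d ' ')

# -c leaves large_data.csv untouched, so both levels can be tried in turn.
level1_bytes=$(bzip2 -1 -c "$workdir/large_data.csv" | wc -c | tr -d ' ')
level9_bytes=$(bzip2 -9 -c "$workdir/large_data.csv" | wc -c | tr -d ' ')

echo "original: $original_bytes bytes"
echo "bzip2 -1: $level1_bytes bytes"
echo "bzip2 -9: $level9_bytes bytes"

# Stand-in for the SSH pipeline: compress into a pipe, decompress on the
# other end, and compare the result with the source file.
if bzip2 -c "$workdir/large_data.csv" | bzip2 -d -c | cmp -s - "$workdir/large_data.csv"; then
    pipe_roundtrip="ok"
else
    pipe_roundtrip="failed"
fi
echo "pipe round trip: $pipe_roundtrip"
```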

Important Notes

  • No Directory Compression: bzip2 is designed for single files only. If you want to compress an entire directory into one archive, combine it with an archiver such as the tar command.
  • Original Files are Deleted: If you run the command without -k, the original file will be deleted after successful processing. It is safer to use -k if you are unsure.
  • Processing Speed: Compared to gzip, bzip2 has a higher compression ratio but puts more load on the CPU and takes longer to finish. It may not be suitable for tasks where speed is the priority.
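The trade-off with gzip depends heavily on the data, so it is worth measuring on a sample of your own files. The sketch below assumes gzip is installed alongside bzip2 and uses a generated, log-like file as a placeholder; prefix each compression command with time in a shell that supports it to compare speed as well.

```shell
workdir=$(mktemp -d)

# Hypothetical log-like sample: 300000 similar lines.
seq 1 300000 | sed 's/^/INFO request id=/' > "$workdir/service.log"

# -c writes to standard output, so service.log stays in place for both runs.
gzip -c "$workdir/service.log" > "$workdir/service.log.gz"
bzip2 -c "$workdir/service.log" > "$workdir/service.log.bz2"

# Compare the three sizes in bytes.
wc -c "$workdir/service.log" "$workdir/service.log.gz" "$workdir/service.log.bz2"
```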

Advanced Application

Archiving and Compressing Directories with tar

In real-world tasks, bzip2 is most often used as an option within the tar command. By using the -j option, you can create a tar archive and compress it with bzip2 at the same time.

# Archive the logs directory and compress it into logs.tar.bz2
tar -cjvf logs.tar.bz2 logs/

# Extract logs.tar.bz2
tar -xjvf logs.tar.bz2
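The following sketch walks through a full cycle on a throwaway directory: create the archive, list it, extract it elsewhere, and compare the result. It assumes a tar that supports -j (GNU tar and bsdtar both do); the directory and file names are hypothetical. The last step shows that -j corresponds to piping a plain tar stream through bzip2.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/logs/app" "$workdir/restore"
seq 1 100 > "$workdir/logs/access.log"
seq 1 100 > "$workdir/logs/app/error.log"

# -C switches directory first, so the archive stores the relative path logs/...
tar -cjf "$workdir/logs.tar.bz2" -C "$workdir" logs

# List the archive contents without extracting (-t).
tar -tjf "$workdir/logs.tar.bz2"

# Extract into a separate directory and compare with the source tree.
tar -xjf "$workdir/logs.tar.bz2" -C "$workdir/restore"
diff -r "$workdir/logs" "$workdir/restore/logs" && echo "restored tree matches"

# Equivalent pipeline without -j: tar writes to stdout, bzip2 compresses it.
tar -cf - -C "$workdir" logs | bzip2 -c > "$workdir/logs_piped.tar.bz2"
```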

Summary

The bzip2 command is a very effective tool when saving storage space is your top priority. Use it as an alternative to gzip when you need to minimize file size for long-term log storage or file transfers over slow networks, even if it uses more CPU power and time.


