Overview
The bzip2 command compresses (encodes) and decompresses (decodes) files using an algorithm based on the Burrows-Wheeler transform. It is typically slower than the standard gzip command but usually achieves a higher compression ratio. Compressed files conventionally use the .bz2 extension.
Specifications (Arguments and Options)
Syntax
bzip2 [options] [filename ...]
bunzip2 [options] [filename ...]
Note: bunzip2 is equivalent to bzip2 -d.
Main Arguments and Options
| Option | Description |
| --- | --- |
| -d / --decompress | Decompresses the file. |
| -z / --compress | Compresses the file (default behavior). |
| -k / --keep | Keeps the original file after processing (it is deleted by default). |
| -f / --force | Overwrites existing files with the same name. |
| -c / --stdout | Writes the result to standard output; the original file is not changed. |
| -s / --small | Reduces memory usage during execution (reduces speed). |
| -1 .. -9 | Sets the compression level. -1 is fastest/lowest, and -9 is slowest/highest (default is 9). |
Basic Usage
The following commands compress a text file (service.log), check how its size changes, and then restore it.
# Check the original file
ls -lh service.log
# Compress the file (original is deleted, .bz2 is created)
bzip2 service.log
# Check the compressed file size
ls -lh service.log.bz2
# Decompress the file (restore it)
bzip2 -d service.log.bz2
Example Result
-rw-r--r-- 1 user user 10M Jan 20 10:00 service.log
-rw-r--r-- 1 user user 2.1M Jan 20 10:00 service.log.bz2
-rw-r--r-- 1 user user 10M Jan 20 10:00 service.log
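The round trip above can be checked end to end with a small script. This is a minimal sketch, assuming a POSIX shell with bzip2, cksum, and wc available; the temporary directory and the generated log contents are hypothetical stand-ins for a real service.log.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Generate a hypothetical sample log (repetitive text compresses well)
awk 'BEGIN { for (i = 0; i < 2000; i++) print "2024-01-20 10:00:00 INFO request " i " handled" }' > service.log

# Record checksum and size before compression
before_sum=$(cksum < service.log)
before_size=$(wc -c < service.log)

bzip2 service.log                 # service.log is replaced by service.log.bz2
compressed_size=$(wc -c < service.log.bz2)

bzip2 -d service.log.bz2          # service.log.bz2 is replaced by service.log
after_sum=$(cksum < service.log)

echo "original: $before_size bytes, compressed: $compressed_size bytes"
if [ "$before_sum" = "$after_sum" ]; then
    echo "round trip OK: restored file matches the original"
fi
```

Comparing checksums rather than file sizes confirms that the restored content is identical, not merely the same length.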
Compressing Without Deleting the Original File
By default, the original file is removed after compression. Use the -k option to keep it.
# Create a compressed file while keeping the original
bzip2 -k service.log
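One side effect of -k is that the .bz2 file stays around, so a later run on the same input collides with it. The sketch below, which assumes a scratch directory and a hypothetical two-line service.log, shows that bzip2 declines to overwrite the existing output unless -f is added.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'line one\nline two\n' > service.log   # hypothetical sample file

bzip2 -k service.log              # creates service.log.bz2, keeps service.log
ls service.log service.log.bz2

# A second run finds service.log.bz2 already present and refuses to overwrite it
if bzip2 -k service.log 2>/dev/null; then
    second_run="overwrote"
else
    second_run="refused"
fi
echo "second run without -f: $second_run"

bzip2 -kf service.log             # -f allows the existing .bz2 to be replaced
```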
Practical Commands
Compressing Files Recursively within a Directory
Since bzip2 cannot traverse directories on its own, you must combine it with the find command. The following example compresses all files inside the archive/ directory individually.
# Target files in the archive directory and run bzip2 on each one
find archive/ -type f -exec bzip2 {} \;
Example Result
# Before execution
archive/log1.txt
archive/subdir/log2.txt
# After execution (each file is compressed individually)
archive/log1.txt.bz2
archive/subdir/log2.txt.bz2
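A slightly more defensive variant of the same idea is sketched here. It assumes a scratch copy of the directory layout shown above (the paths are illustrative), skips files that already end in .bz2 so the command can be rerun, and uses the `{} +` form of -exec to pass many files to each bzip2 invocation instead of starting one process per file.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/archive/subdir"
echo "first log"  > "$workdir/archive/log1.txt"
echo "second log" > "$workdir/archive/subdir/log2.txt"

# Compress every regular file that is not already a .bz2 file
find "$workdir/archive" -type f ! -name '*.bz2' -exec bzip2 {} +

# Show the resulting tree
find "$workdir/archive" -type f | sort
```

Whether batching with `+` matters depends on the number of files; for a handful of logs the original `\;` form works just as well.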
Customization Points
- Adjusting Compression Levels (-1 to -9): Use -1 if you need to compress a file very quickly. Use -9 if you want the smallest file size possible, even if it takes more time.
bzip2 -1 large_data.csv
- Using Standard Output (-c): This is useful when you want to pipe the compression result to another command instead of saving it to a file.
# Example: Transfer compressed data over SSH without saving locally
bzip2 -c database.sql | ssh remote_host "cat > database.sql.bz2"
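Both options can be tried locally without a remote host. The following sketch assumes a generated large_data.csv (hypothetical contents, several hundred KB so that more than one compression block is involved) and uses -c to measure the output of each level without writing any .bz2 file. The actual sizes depend on the data, so none are shown here.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: 50,000 CSV-like rows
awk 'BEGIN { for (i = 0; i < 50000; i++) print i "," i * 7 % 1000 ",sample" }' > large_data.csv

# -c writes to standard output, so the compressed size can be counted directly
size_fast=$(bzip2 -1 -c large_data.csv | wc -c)
size_best=$(bzip2 -9 -c large_data.csv | wc -c)
echo "-1: $size_fast bytes, -9: $size_best bytes"

# The same stream can be piped onward; here it is decompressed again
# and compared byte for byte with the untouched original
if bzip2 -c large_data.csv | bzip2 -dc | cmp -s - large_data.csv; then
    pipe_check="identical"
else
    pipe_check="different"
fi
echo "pipe round trip: $pipe_check"
```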
Important Notes
- No Directory Compression: bzip2 is designed for single files only. If you want to compress an entire directory into one archive, you must use the tar command.
- Original Files are Deleted: If you run the command without -k, the original file will be deleted after successful processing. It is safer to use -k if you are unsure.
- Processing Speed: Compared to gzip, bzip2 has a higher compression ratio but puts more load on the CPU and takes longer to finish. It may not be suitable for tasks where speed is the priority.
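The size side of the gzip comparison is easy to measure on your own data. This is a rough sketch, assuming both gzip and bzip2 are installed and using a generated, highly repetitive log as a hypothetical input; real files will give different ratios, so treat the numbers as indicative only.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample log with 50,000 similar lines
awk 'BEGIN { for (i = 0; i < 50000; i++) print "2024-01-20 INFO request " i " handled" }' > service.log

# Compress to standard output with each tool and count the bytes produced
orig_size=$(wc -c < service.log)
gzip_size=$(gzip -c service.log | wc -c)
bzip2_size=$(bzip2 -c service.log | wc -c)

echo "original: $orig_size bytes"
echo "gzip:     $gzip_size bytes"
echo "bzip2:    $bzip2_size bytes"
```

To compare speed as well, each pipeline can be prefixed with the shell's time facility, if one is available.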
Advanced Application
Archiving and Compressing Directories with tar
In real-world tasks, bzip2 is most often invoked through the tar command rather than directly. With tar's -j option, you can create a tar archive and compress it with bzip2 in a single step.
# Archive the logs directory and compress it into logs.tar.bz2
tar -cjvf logs.tar.bz2 logs/
# Extract logs.tar.bz2
tar -xjvf logs.tar.bz2
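The sketch below walks through the same workflow in a scratch directory and adds two things: listing the archive with -t before extracting, and an explicit tar-to-bzip2 pipeline that produces a comparable archive in two visible stages. The directory and file names are illustrative, and it assumes a tar that supports -j and -C (GNU tar and bsdtar both do).

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p logs/subdir restore
echo "first log"  > logs/app.log
echo "second log" > logs/subdir/db.log

tar -cjf logs.tar.bz2 logs/           # archive and compress in one step
tar -tjf logs.tar.bz2                 # list the contents without extracting

# The same result as two explicit stages: tar writes to stdout, bzip2 compresses
tar -cf - logs/ | bzip2 -c > logs_pipe.tar.bz2

tar -xjf logs.tar.bz2 -C restore      # extract into restore/
if diff -r logs restore/logs; then
    echo "extracted tree matches the original"
fi
```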
Summary
The bzip2 command is a very effective tool when saving storage space is your top priority. Use it as an alternative to gzip when you need to minimize file size for long-term log storage or file transfers over slow networks, even if it uses more CPU power and time.
