[Python] Compressing and Archiving Files in Tar Format: Writing with tarfile

This article explains how to use Python’s tarfile module to group multiple files and directories into a tar archive.

In addition to creating simple archives (uncompressed), you can also reduce file size by using compression algorithms like gzip.

List of Write Modes for the open Function

The following modes can be passed to tarfile.open() to create a new archive. If a file with the same name already exists, it is overwritten.

Choose the mode that matches the file extension you plan to use.

Mode String   Meaning / Usage
'w'           Creates a tar file without compression (extension .tar).
'w:gz'        Performs gzip compression (extension .tar.gz, .tgz). Most common.
'w:bz2'       Performs bzip2 compression (extension .tar.bz2). Tends to have higher compression rates than gzip.
'w:xz'        Performs lzma (xz) compression (extension .tar.xz). Very high compression rate but slower processing.
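
As a rough illustration of matching the mode to the extension, the sketch below looks up the write mode from the archive file name. The names WRITE_MODES, write_mode_for, and open_for_writing are hypothetical helpers invented for this example; they are not part of the tarfile module.

```python
import tarfile

# Hypothetical lookup table: file-name suffix -> tarfile write mode.
WRITE_MODES = {
    ".tar.gz": "w:gz",
    ".tgz": "w:gz",
    ".tar.bz2": "w:bz2",
    ".tar.xz": "w:xz",
    ".tar": "w",
}


def write_mode_for(archive_name):
    """Return the tarfile write mode matching the archive's extension."""
    lowered = archive_name.lower()
    for suffix, mode in WRITE_MODES.items():
        if lowered.endswith(suffix):
            return mode
    raise ValueError(f"Unrecognized archive extension: {archive_name}")


def open_for_writing(archive_name):
    """Open a new archive using the mode implied by its file name."""
    return tarfile.open(archive_name, write_mode_for(archive_name))


# Show which mode each sample file name would get
for name in ("backup.tar", "backup.tar.gz", "backup.tgz", "backup.tar.xz"):
    print(name, "->", write_mode_for(name))
```

Keeping the mapping in one place avoids mismatches such as writing gzip data into a file named .tar.bz2.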

Implementation Example: Creating a Source Code Backup

In this example, we will bundle a development project’s source code directory (src/) and a ReadMe file (README.md) to create a compressed archive named backup_v1.0.tar.gz.

Source Code

import tarfile
import os

# Archive file name to create
archive_name = "backup_v1.0.tar.gz"

# Files or directories to add to the archive
# (Ensure these exist for the code to run successfully)
targets = ["src", "README.md"]

print(f"--- Creating archive '{archive_name}' ---")

# 1. Open in write mode ('w:gz')
with tarfile.open(archive_name, "w:gz") as tar:
    
    for item in targets:
        # 2. Add using add(path, arcname=alias)
        # Specifying a directory automatically adds its contents recursively
        # If arcname is not specified, the original path structure is used in the archive
        if os.path.exists(item):
            tar.add(item)
            print(f"Added: {item}")
        else:
            print(f"Warning: '{item}' not found.")

print("Backup complete.")

Explanation

tar.add(name, arcname=None)

This method adds a file or directory to the archive.

  • Directory Handling: By default, directories are processed recursively (files inside are included). You can disable this by setting recursive=False, but usually, simply specifying the folder name is sufficient.
  • arcname (Archive Name): Similar to ZIP files, use this if you want to change the name of the file or directory inside the archive. For example, tar.add("C:/Users/Data", arcname="Data") stores the content under the directory Data instead of the full path.
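
The effect of arcname and recursive can be checked with a small experiment. The sketch below builds a throwaway directory tree in a temporary folder, archives it under the alias "code", and lists what was stored. The function name archive_names and all file and directory names are made up for illustration.

```python
import os
import tarfile
import tempfile


def archive_names(recursive):
    """Build a small archive in a temp directory and return its member names."""
    with tempfile.TemporaryDirectory() as workdir:
        # Sample tree: <workdir>/project/src/main.py
        src_dir = os.path.join(workdir, "project", "src")
        os.makedirs(src_dir)
        with open(os.path.join(src_dir, "main.py"), "w") as f:
            f.write("print('hello')\n")

        archive_path = os.path.join(workdir, "demo.tar.gz")
        with tarfile.open(archive_path, "w:gz") as tar:
            # Store the directory as "code" instead of its on-disk path
            tar.add(src_dir, arcname="code", recursive=recursive)

        # Reopen the archive for reading and list the stored names
        with tarfile.open(archive_path, "r:gz") as tar:
            return tar.getnames()


print(archive_names(recursive=True))   # directory entry plus its contents
print(archive_names(recursive=False))  # directory entry only
```

With recursive=True the archive holds both the "code" directory entry and the file inside it; with recursive=False only the directory entry is stored.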

About Append Mode

Unlike ZIP files, compressed tar files (such as .tar.gz) cannot be opened in append mode ('a').

To add files to a compressed archive, you must decompress it and recreate it. Alternatively, use the uncompressed .tar format, which supports appending.
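
A minimal sketch of appending to an uncompressed archive follows. It runs in a temporary directory, and the file names and the function name append_demo are placeholders chosen for this example.

```python
import os
import tarfile
import tempfile


def append_demo():
    """Create an uncompressed tar, append a second file, return member names."""
    with tempfile.TemporaryDirectory() as workdir:
        first = os.path.join(workdir, "README.md")
        second = os.path.join(workdir, "CHANGELOG.md")
        for path in (first, second):
            with open(path, "w") as f:
                f.write("placeholder\n")

        archive_path = os.path.join(workdir, "backup.tar")

        # Create the archive with a single member
        with tarfile.open(archive_path, "w") as tar:
            tar.add(first, arcname="README.md")

        # Reopen in append mode ('a') and add another member
        with tarfile.open(archive_path, "a") as tar:
            tar.add(second, arcname="CHANGELOG.md")

        # Read the archive back to confirm both members are present
        with tarfile.open(archive_path, "r") as tar:
            return tar.getnames()


print(append_demo())
```

If the final archive needs to be compressed, one option is to keep appending to the plain .tar file and compress it once at the end.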

