[Linux] Convert Filename Character Encodings and Case with the convmv Command


Overview

The convmv command is a tool used to convert the character encoding of “filenames” themselves, rather than the contents of the files.

It is specifically designed to fix “mojibake” (garbled text) in filenames that occurs when transferring files created on Windows (Shift_JIS/CP932) or Mac (UTF-8 NFD) to Linux (UTF-8 NFC). It also includes features to convert filenames between uppercase and lowercase in bulk.
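
To make the cause of the garbling concrete, the following sketch prints the raw bytes of the same three-character name (テスト) in both encodings. The helper name hex_bytes and the sample name are illustrative assumptions, not part of convmv.

```shell
# Hypothetical helper: print stdin as a continuous hex string.
hex_bytes() { od -An -tx1 | tr -d ' \n'; }

utf8_name='テスト'                        # the name as stored on a UTF-8 Linux system
sjis_name=$(printf '\203e\203X\203g')    # the same characters as Shift_JIS bytes (octal escapes)

printf 'UTF-8:     %s\n' "$(printf '%s' "$utf8_name" | hex_bytes)"   # e38386e382b9e38388
printf 'Shift_JIS: %s\n' "$(printf '%s' "$sjis_name" | hex_bytes)"   # 836583588367
```

A filesystem stores only these bytes, not the encoding that produced them. A UTF-8 system that receives the Shift_JIS bytes cannot decode them, so the name appears garbled.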

Note: This command is often not installed by default. Install it with your distribution's package manager:

sudo apt install convmv    # Debian, Ubuntu
sudo dnf install convmv    # Fedora

Specifications (Arguments and Options)

Syntax

convmv [options] -f [from_code] -t [to_code] [filename/directory...]

Main Arguments and Options

Option             Description
-f [code]          Specifies the original character encoding (from). Example: shiftjis, utf8.
-t [code]          Specifies the target character encoding (to). Example: utf8.
--notest           Actually renames the files (the default is test mode, which only shows what would happen).
-r                 Processes directories recursively.
-i                 Interactive mode; asks for confirmation before each rename.
--list             Lists all supported character encodings.
--replace          Overwrites the target file if it already exists, but only when its content is identical to the source file.
--upper            Converts all filenames to uppercase.
--lower            Converts all filenames to lowercase.
--exec [command]   Runs the given (quoted) command instead of renaming; #1 is replaced with the old name and #2 with the new name.

Basic Usage

First, run the command in “Test Mode (Dry Run)” without the --notest option to simulate the results. For safety, convmv does not make any actual changes by default.

Assume you have garbled filenames (Shift-JIS) brought over from Windows:

# First, run a test (without --notest) to check the changes.
# The * glob is used because the garbled part of the name cannot be typed reliably.
convmv -f shiftjis -t utf8 win_data_*.txt

Example Output:

Starting a dry run without changes...
mv "./win_data_ƒeƒXƒg.txt"      "./win_data_テスト.txt"
No changes to your files done. Use --notest to finally rename the files.

Once you confirm that the new filenames are correct, add the --notest option to apply the changes.

# Actually rename the files
convmv -f shiftjis -t utf8 --notest win_data_*.txt
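
The two steps above can be wrapped in a small shell function so that the dry run always comes first. This is a minimal sketch: preview_then_apply is a hypothetical name, and it uses only the -f, -t, and --notest options described in this article.

```shell
# Hypothetical helper: show the dry run, then rename only after confirmation.
# Usage: preview_then_apply FROM_CODE TO_CODE FILE...
preview_then_apply() {
    local from=$1 to=$2 reply
    shift 2

    # Step 1: dry run (convmv's default mode); nothing is renamed here.
    convmv -f "$from" -t "$to" "$@" || return 1

    # Step 2: ask before touching anything.
    printf 'Apply these renames? [y/N] ' >&2
    read -r reply
    case $reply in
        [yY]*) convmv -f "$from" -t "$to" --notest "$@" ;;
        *)     echo 'Aborted; no files were renamed.' >&2
               return 1 ;;
    esac
}

# Example call (not executed here):
#   preview_then_apply shiftjis utf8 win_data_*.txt
```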

Practical Commands

Convert All Files in a Directory from Shift-JIS to UTF-8

Use the -r option to process directories and their contents recursively. This is useful for fixing batches of uploaded folders.

# Convert all files under the public_html directory
convmv -r -f shiftjis -t utf8 --notest public_html/
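
Before a recursive run, it can help to see which names actually need converting. The sketch below lists entries whose names are not valid UTF-8, using iconv as a validator. The function names are hypothetical, and paths containing newlines are not handled.

```shell
# Hypothetical helper: succeed if the argument is a valid UTF-8 byte sequence.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# List paths under a directory (default: current directory) whose names are
# NOT valid UTF-8, i.e. the likely candidates for conversion.
list_non_utf8_names() {
    find "${1:-.}" -depth -print | while IFS= read -r path; do
        is_valid_utf8 "$path" || printf '%s\n' "$path"
    done
}

# Example call (not executed here):
#   list_non_utf8_names public_html/
```

If the list is empty, the tree probably contains no Shift-JIS names and there is nothing to convert.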

Standardize Filenames to Lowercase (or Uppercase)

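Linux filesystems are case-sensitive, so Image.PNG and image.png are different files, and a link written with the wrong case fails on a typical Linux web server. Standardizing filenames to a single case avoids this. No character encoding needs to be specified when the filenames are plain ASCII; for non-ASCII names, convmv expects the current encoding to be given with -f.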

# Convert all filenames in the current directory to lowercase
convmv --lower --notest *

# Convert a specific file to uppercase
convmv --upper --notest index.html

Example Output (Lowercase conversion):

mv "./README.TXT"      "./readme.txt"
mv "./Image.PNG"       "./image.png"
Ready!
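
One risk with bulk lowercasing is that two different names can collapse into the same one. The following sketch checks for that before running convmv. The function name lowercase_collisions is hypothetical, and it assumes one filename per line (names containing newlines are not handled).

```shell
# Hypothetical helper: read filenames on stdin and print every lowercase
# name that two or more of them would be converted to.
lowercase_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Example: README.TXT and readme.txt would both become readme.txt.
printf '%s\n' README.TXT readme.txt Image.PNG | lowercase_collisions   # readme.txt
```

For a real directory, the input could come from ls -A instead of the sample names. Empty output means no two names would collide.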

Customization Tips

  • Check Encodings (--list): You must specify encoding names exactly as convmv knows them. Use grep to find the correct name in the list.
      convmv --list | grep -i jis   # Displays shiftjis, euc-jp, iso-2022-jp, etc.
  • Confirm Each Rename (-i): Use the -i option to approve every rename individually, for example when a conversion might result in a filename that already exists.
      convmv --lower -i --notest *

Important Notes

  1. Default Does Not Change Files: convmv will never rename files unless you include the --notest option. If the command runs but nothing changes, you likely forgot this option.
  2. File Contents Remain Unchanged: This tool only renames files (metadata). To convert the character encoding of the text inside a file, use commands like iconv or nkf.
  3. Irreversible Operation: Once you convert a filename, it can be difficult to determine what the original encoding was. Always back up important data before performing these operations.
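
As one way to follow the backup advice in note 3, the sketch below copies a directory tree to a timestamped sibling before any renaming. The function name backup_tree and the .bak suffix are assumptions made for illustration.

```shell
# Hypothetical helper: copy a directory tree to "<dir>.bak.<timestamp>"
# and print the path of the copy.
backup_tree() {
    local src dest
    src=${1%/}                                   # drop a trailing slash, if any
    dest="${src}.bak.$(date +%Y%m%d-%H%M%S)"
    cp -a -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Example call (not executed here):
#   backup_tree public_html/ && convmv -r -f shiftjis -t utf8 --notest public_html/
```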

Advanced Usage

Fix Mac (NFD) Filenames for Linux (NFC)

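Files created on Mac often use the NFD format, a decomposed form in which a character such as が (U+304C) is split into か (U+304B) plus a combining dakuten mark (U+3099). Linux and Windows normally use the composed NFC format. The two forms look identical on screen but differ byte for byte, which causes problems with searching and script processing. The --nfc option combines the decomposed characters into NFC.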

# Convert from UTF-8 to UTF-8 using the NFC normalization option (--nfc)
convmv -r -f utf8 -t utf8 --nfc --notest /mnt/mac_disk/
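
To see why NFC and NFD names do not match even though they look the same, this sketch prints the bytes of が in both forms. The helper name hex_bytes is an assumption. The byte values are the UTF-8 encodings of U+304C (composed) and U+304B followed by U+3099 (decomposed).

```shell
# Hypothetical helper: print stdin as a continuous hex string.
hex_bytes() { od -An -tx1 | tr -d ' \n'; }

nfc=$(printf '\343\201\214')               # が as a single code point, U+304C
nfd=$(printf '\343\201\213\343\202\231')   # か (U+304B) + combining dakuten (U+3099)

printf 'NFC: %s\n' "$(printf '%s' "$nfc" | hex_bytes)"   # e3818c
printf 'NFD: %s\n' "$(printf '%s' "$nfd" | hex_bytes)"   # e3818be38299
```

Because the byte sequences differ, a plain string comparison or a grep for the NFC form will not find a file stored under its NFD name.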

Summary

The convmv command is a practical way to fix garbled filenames that occur during file transfers between different operating systems.

Because it works in two steps (a dry run first, then the actual rename with --notest), you can check every change before it happens. It is a valuable tool to use before deploying to a web server or when migrating old file servers.
