[Excel VBA] How to Correctly Read CSV Files by Specifying Character Encoding (UTF-8, etc.)

Have you ever experienced garbled text (Mojibake) when opening a CSV file in Excel?

Especially when you directly open a CSV file saved in UTF-8 format in a Japanese environment Excel, full-width characters and symbols often get garbled.

In this article, I will introduce how to correctly read text files by specifying the character encoding using Excel VBA.

目次

Target: UTF-8 Format CSV File

For example, let’s assume the following CSV file (e.g., product_utf8.csv):

Product NamePriceRemarks
Apple100Grown in Aomori
Banana80Grown in Philippines

If you open this directly in Excel, the text may be displayed as “??” or “æ”.

The cause is that Excel is not opening the file with the correct character encoding (UTF-8).

VBA Code: Reading CSV in UTF-8

Sub ImportCSVwithUTF8()
    Dim csvPath As String
    csvPath = ThisWorkbook.Path & "\product_utf8.csv"
    
    Workbooks.OpenText _
        Filename:=csvPath, _
        Origin:=65001, _ ' UTF-8 (Code Page)
        DataType:=xlDelimited, _
        Comma:=True
End Sub

Explanation of Arguments

ParameterExplanation
FilenameThe path of the CSV file to read.
Origin:=65001Specifies the UTF-8 code page (65001).
DataTypeSpecifies the delimiter format (xlDelimited).
Comma:=TrueReads the file as comma-separated.

Reference: List of Code Pages (Partial)

Code PageCorresponding Character Encoding
932Shift-JIS (Japanese)
65001UTF-8
1200UTF-16 (Unicode)

Image After Execution (Correctly Displayed)

Product NamePriceRemarks
Apple100Grown in Aomori
Banana80Grown in Philippines

Important Notes

  • The specification of Origin is very important. If omitted, Excel tries to open it with the OS default code (e.g., Shift-JIS), which causes garbled text.
  • UTF-8 CSVs are common in files output from web apps and systems.
  • Workbooks.OpenText is effective not only for text files but also for CSVs.

Application Points

  • For semicolon-separated or tab-separated files, you can use Semicolon:=True or Tab:=True together.
  • Even UTF-8 with BOM (Byte Order Mark) can basically be read with this method.

Summary

When handling UTF-8 CSVs in Excel, there is a high possibility of garbled text if the character encoding is not specified.

By explicitly specifying Origin:=65001 in VBA, you can import data correctly without text corruption.

If you are struggling with handling CSV files containing multi-byte characters, please try utilizing this VBA code.

よかったらシェアしてね!
  • URLをコピーしました!
  • URLをコピーしました!

この記事を書いた人

私が勉強したこと、実践したこと、してることを書いているブログです。
主に資産運用について書いていたのですが、
最近はプログラミングに興味があるので、今はそればっかりです。

目次