Introduction
Have you ever experienced garbled text when importing external text files in VBA? One common cause is the file’s BOM (Byte Order Mark). A BOM is invisible identification data attached to the beginning of files saved in encodings like UTF-8.
By using the ADODB.Stream object in VBA, you can read a file directly as binary data and programmatically check if this BOM exists. This allows for more advanced file manipulation, such as preventing character encoding errors.
This article explains how to use ADODB.Stream to read the first few bytes of a file and verify if it contains a UTF-8 BOM.
Prerequisite
To use this code, you must enable a reference library. In the VBA Editor (VBE), go to Tools > References and check “Microsoft ActiveX Data Objects x.x Library” (version 6.1 is common).
VBA Sample Code to Detect BOM
This macro asks the user to select a file, reads the first 3 bytes, and determines if the byte data matches the UTF-8 BOM (hexadecimal EF BB BF).
' Select a file and check if its header is a UTF-8 BOM
Sub CheckForUtf8Bom()
Dim fileStream As ADODB.Stream
Dim targetFilePath As String
Dim byteData() As Byte
Dim hexCode As String
'--- 1. Show file selection dialog ---
targetFilePath = Application.GetOpenFilename("All Files,*.*")
If targetFilePath = "False" Then Exit Sub ' Exit if cancelled
'--- 2. Open file as binary using ADODB.Stream ---
Set fileStream = New ADODB.Stream
With fileStream
.Type = adTypeBinary ' Binary Mode (Type=1)
.Open
.LoadFromFile targetFilePath
'--- 3. Read the first 3 bytes ---
byteData = .Read(3)
.Close
End With
Set fileStream = Nothing
'--- 4. Convert byte data to a hexadecimal string ---
hexCode = Hex(byteData(0)) & " " & Hex(byteData(1)) & " " & Hex(byteData(2))
'--- 5. Check if it matches UTF-8 BOM ("EF BB BF") ---
If hexCode = "EF BB BF" Then
MsgBox "This file has a UTF-8 BOM attached.", vbInformation, "BOM Check Result"
Else
MsgBox "No UTF-8 BOM found in this file." & vbCrLf & _
"First 3 bytes: " & hexCode, vbExclamation, "BOM Check Result"
End If
End Sub
Code Explanation
ADODB.Stream Object
ADODB.Stream is a powerful object for handling data in files or memory as streams.
.Type = adTypeBinary: Sets the stream type to binary (value 1). This allows accurate reading of the file byte by byte..LoadFromFile targetFilePath: Loads the contents of the specified file into the stream..Read(3): Reads the specified number of bytes (3 in this case) from the current position in the stream and returns them as a Byte array.
Determining the BOM
hexCode = Hex(...): Converts each element of the byte array retrieved by.Readinto a hexadecimal string using theHexfunction and joins them with spaces.If hexCode = "EF BB BF" Then: The UTF-8 BOM is defined as the 3-byte sequenceEF BB BF. The code compares the converted string against this value to branch the logic.
Summary
In this article, we covered how to read file header information byte by byte using ADODB.Stream to detect a UTF-8 BOM.
- Use
ADODB.Streamto handle files as binary data. - Set
.Type = adTypeBinaryto enable binary mode. - Use
.Read(number_of_bytes)to read a specific amount of data from the beginning of the file.
This technique is versatile. Beyond checking text file encoding, it can be applied to advanced file operations such as identifying image file formats by their signatures or parsing custom binary files.
