Introduction
Since XML data has a hierarchical structure, using XPath (XML Path Language) is very effective for pinpointing and retrieving specific information.
In Excel VBA, by using MSXML2.DOMDocument combined with XPath expressions, you can accurately retrieve target nodes and implement simple code to write them into Excel.
This article introduces how to search for nodes with specific attributes or elements using XPath expressions and output the content to Excel.
Assumed XML Structure (Example: staff_list.xml)
Let’s assume we have the following XML file.
<staffs>
<staff id="A01">
<name>Ichiro Suzuki</name>
<department>Sales</department>
</staff>
<staff id="A02">
<name>Hanako Sato</name>
<department>Admin</department>
</staff>
<staff id="A03">
<name>Kenta Tanaka</name>
<department>Sales</department>
</staff>
</staffs>
We want to extract the following information from this XML:
- The name of the employee with ID “A01” (
<name>tag). - The names of all employees belonging to the “Sales” department, concatenated together.
VBA Code: Extracting Nodes with XPath and Outputting to Exce
Sub ExtractFromXMLwithXPath()
Dim xmlDoc As Object
Dim node As Object, nodeList As Object
Dim resultText As String
' Create DOMDocument object
Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
xmlDoc.async = False
' Load the XML file
xmlDoc.Load ThisWorkbook.Path & "\staff_list.xml"
' 1. Extract name for ID "A01"
Set node = xmlDoc.SelectSingleNode("/staffs/staff[@id='A01']/name")
ThisWorkbook.Worksheets(1).Range("C2").Value = node.Text
' 2. Extract names of all employees in "Sales"
Set nodeList = xmlDoc.SelectNodes("/staffs/staff[department='Sales']/name")
' Loop through the results
For Each node In nodeList
resultText = resultText & node.Text & ", "
Next
' Remove the trailing comma and space
If Len(resultText) > 0 Then resultText = Left(resultText, Len(resultText) - 2)
ThisWorkbook.Worksheets(1).Range("C3").Value = resultText
End Sub
Key Points Explanation
| Item | Description |
| SelectSingleNode() | Searches for a single node using XPath (attributes and hierarchy can be specified). |
| SelectNodes() | Retrieves multiple nodes matching the condition (processed via a loop). |
| @id=’A01′ | Specifies a node where the attribute id is “A01”. |
| [department=’Sales’] | Specifies a condition based on the value of a child element. |
| node.Text | Retrieves the text content inside the tag. |
Execution Result (Excel)
| Column | Value |
| C2 | Ichiro Suzuki |
| C3 | Ichiro Suzuki, Kenta Tanaka |
Application Ideas
- Fuzzy Search: You can use XPath functions like
SelectNodes("//staff[starts-with(department,'Sa')]"). - Deep Hierarchy: XPath allows you to flexibly filter data even if the XML hierarchy is deep.
- Data Transfer: This logic can be easily incorporated into processes that transfer extraction results to other sheets or workbooks.
Important Notes
- Character Encoding: Be careful if the XML file is saved in UTF-8 or Shift-JIS, as character corruption (mojibake) may occur depending on the environment.
- Strict Syntax: XPath syntax is strict regarding structure. Pay attention to spelling errors in tag names and incorrect hierarchy paths.
Summary
By using XPath expressions, you can accurately extract necessary data from complex XML structures and simply transcribe it to Excel.
Leveraging the integration between Excel VBA and XML allows you to efficiently utilize system data and configuration files. Please use this method to streamline your business automation and master data management.
