In business settings, data is commonly managed not only in CSV files but also in Microsoft Excel format (.xlsx). By using the Pandas read_excel function, you can efficiently import specific sheets or necessary columns within an Excel workbook as a DataFrame.
This article explains the basic usage of the read_excel function and the main parameters (arguments) for finely controlling the scope of data importation.
List of Key Parameters
While read_excel has many options, the four most commonly used parameters are summarized in the table below.
| Parameter | Meaning / Role | Example Specification |
| sheet_name | Specifies the sheet to read. Can be specified by sheet name (string) or index (0-based integer). Multiple sheets can be specified as a list. | "Sales_Data", 0, ["Sheet1", "Sheet2"] |
| header | Specifies the row number to use as the header (column names). Default is 0 (1st row). Specify None if there is no header. | 1 (use 2nd row as header), None |
| usecols | Limits the “columns” to be read. Can be specified by Excel column letters (“A:C”), column names, or column numbers. | "A:C", [0, 2], ["ID", "Name"] |
| index_col | Specifies the column to use as the DataFrame index (row label). | 0, "Date" |
Implementation Sample Code
Below is the complete Python code that generates a sample Excel file and reads it using various options.
Note: To execute this code, the Excel operation library openpyxl must be installed (pip install openpyxl).
import pandas as pd
import os
def demonstrate_read_excel():
"""
Function to demonstrate Excel file reading options using Pandas
"""
# ---------------------------------------------------------
# 1. Preparation: Create a test Excel file
# ---------------------------------------------------------
file_name = "monthly_sales_report.xlsx"
sheet1_name = "April_2026"
sheet2_name = "May_2026"
# Sheet 1: April Data (Assuming 1st row contains title/extra info)
# Col A: ID, Col B: Product, Col C: Price, Col D: Quantity
df_april = pd.DataFrame({
"Transaction_ID": [1001, 1002, 1003],
"Product_Name": ["Laptop", "Mouse", "Monitor"],
"Unit_Price": [1200, 50, 300],
"Sold_Quantity": [5, 20, 10]
})
# Sheet 2: May Data (Standard format)
df_may = pd.DataFrame({
"Transaction_ID": [2001, 2002],
"Product_Name": ["Keyboard", "HDMI Cable"],
"Unit_Price": [80, 15],
"Sold_Quantity": [15, 30]
})
# Save as Excel file
# Intentionally insert an empty row in Sheet 1 to test the header option
with pd.ExcelWriter(file_name, engine="openpyxl") as writer:
# April Sheet: Put report title in 1st row, start data from 2nd row (index=1)
df_april.to_excel(writer, sheet_name=sheet1_name, index=False, startrow=1)
# May Sheet: Placed normally
df_may.to_excel(writer, sheet_name=sheet2_name, index=False)
print(f"--- Preparation Complete: Created {file_name} ---\n")
# ---------------------------------------------------------
# 2. Basic Reading
# ---------------------------------------------------------
print("=== 1. Basic Reading (Specifying sheet_name) ===")
# Specify sheet name with sheet_name
# Since it only reads the first sheet by default, we specify it explicitly
df_basic = pd.read_excel(file_name, sheet_name=sheet2_name)
print(f"Contents of sheet '{sheet2_name}':")
print(df_basic)
print("\n")
# ---------------------------------------------------------
# 3. Reading Multiple Sheets Simultaneously
# ---------------------------------------------------------
print("=== 2. Reading Multiple Sheets ===")
# Passing a list to sheet_name returns a dictionary of {sheet_name: DataFrame}
sheets_dict = pd.read_excel(file_name, sheet_name=[sheet1_name, sheet2_name], header=1)
print("Keys of loaded sheets:", sheets_dict.keys())
print(f"Top rows of sheet '{sheet1_name}':")
print(sheets_dict[sheet1_name].head(2))
print("\n")
# ---------------------------------------------------------
# 4. Specifying Header Position (header)
# ---------------------------------------------------------
print("=== 3. Specifying Header Position (header=1) ===")
# Since we left the 1st row empty during creation (startrow=1),
# we specify header=1 to read the 2nd row as the header
df_header = pd.read_excel(file_name, sheet_name=sheet1_name, header=1)
print(df_header.head(2))
print("\n")
# ---------------------------------------------------------
# 5. Reading Only Specific Columns (usecols)
# ---------------------------------------------------------
print("=== 4. Reading Only Specific Columns (usecols) ===")
# Pattern A: Range specification using Excel column letters ("A:C")
# Get columns from A to C
df_cols_range = pd.read_excel(
file_name,
sheet_name=sheet2_name,
usecols="A:C"
)
print("--- usecols='A:C' ---")
print(df_cols_range)
# Pattern B: Specification by list of column names
# Specifying by name is the most robust method
df_cols_list = pd.read_excel(
file_name,
sheet_name=sheet2_name,
usecols=["Product_Name", "Sold_Quantity"]
)
print("\n--- usecols=['Product_Name', 'Sold_Quantity'] ---")
print(df_cols_list)
# Pattern C: Specification by column numbers (indices)
# Get 0th (Col A) and 2nd (Col C) columns
df_cols_idx = pd.read_excel(
file_name,
sheet_name=sheet2_name,
usecols=[0, 2]
)
print("\n--- usecols=[0, 2] ---")
print(df_cols_idx)
# (Reference) Delete file
# os.remove(file_name)
if __name__ == "__main__":
demonstrate_read_excel()
Execution Result
--- Preparation Complete: Created monthly_sales_report.xlsx ---
=== 1. Basic Reading (Specifying sheet_name) ===
Contents of sheet 'May_2026':
Transaction_ID Product_Name Unit_Price Sold_Quantity
0 2001 Keyboard 80 15
1 2002 HDMI Cable 15 30
=== 2. Reading Multiple Sheets ===
Keys of loaded sheets: dict_keys(['April_2026', 'May_2026'])
Top rows of sheet 'April_2026':
Transaction_ID Product_Name Unit_Price Sold_Quantity
0 1001 Laptop 1200 5
1 1002 Mouse 50 20
=== 3. Specifying Header Position (header=1) ===
Transaction_ID Product_Name Unit_Price Sold_Quantity
0 1001 Laptop 1200 5
1 1002 Mouse 50 20
=== 4. Reading Only Specific Columns (usecols) ===
--- usecols='A:C' ---
Transaction_ID Product_Name Unit_Price
0 2001 Keyboard 80
1 2002 HDMI Cable 15
--- usecols=['Product_Name', 'Sold_Quantity'] ---
Product_Name Sold_Quantity
0 Keyboard 15
1 HDMI Cable 30
--- usecols=[0, 2] ---
Transaction_ID Unit_Price
0 2001 80
1 2002 15
Detailed Explanation of Each Option
How to Specify usecols
usecols allows for very flexible specifications.
- Range specification by string: You can use formats familiar to Excel users, such as
"A:D"or"A,C,E". - Specification by list of column names: Specifying column names like
["Item", "Price"]makes the code less likely to break even if the column order changes. - Integer list: Specify by column number (0-based) like
[0, 1].
Adjusting header
If the top of the Excel file contains titles or explanatory notes, the row where data starts will be shifted. In that case, specifying the appropriate row number, such as header=2 (3rd row is the header), allows you to create a DataFrame with the correct structure.
By combining these options, you can import raw Excel data directly into Python without pre-processing.
