In Pandas, you can perform arithmetic operations (addition, subtraction, multiplication, and division) between DataFrames intuitively, much like calculating cells in Excel. When you use operators (+, -, *, /), calculations are automatically performed between elements where the row labels (index) and column labels (columns) match.
This article explains comprehensive calculation methods, including not just addition, but also subtraction, multiplication, and division.
Basics of DataFrame Operations
If you have df1 and df2, you can write the operations as follows:
- Addition:
df1 + df2 - Subtraction:
df1 - df2 - Multiplication:
df1 * df2(This is the element-wise product, not the matrix product) - Division:
df1 / df2
Implementation Sample Code
Here, we assume a scenario where we analyze the “Sales Performance” of a retail store. We will perform calculations in the following four steps:
- Multiplication: Calculate “Sales Amount” from “Sales Quantity” and “Unit Price”.
- Subtraction: Calculate “Gross Profit” by subtracting “Cost” from “Sales Amount”.
- Addition: Calculate “Final Profit” by adding “Special Rebate” to “Gross Profit”.
- Division: Calculate “Achievement Rate” by dividing “Sales Amount” by “Target Sales”.
import pandas as pd
def demonstrate_dataframe_operations():
"""
Function to demonstrate arithmetic operations (add, sub, mul, div)
between Pandas DataFrames.
"""
# ---------------------------------------------------------
# Data Preparation: Create DataFrame for Product (Row) x Region (Column)
# ---------------------------------------------------------
products = ["Product_A", "Product_B", "Product_C"]
regions = ["East_Region", "West_Region"]
# 1. Sales Quantity Data (Quantity)
df_quantity = pd.DataFrame(
[[100, 150], [80, 90], [200, 220]],
index=products, columns=regions
)
# 2. Unit Price Data (Unit Price)
# Assuming unit prices differ by region
df_price = pd.DataFrame(
[[500, 520], [1200, 1150], [300, 300]],
index=products, columns=regions
)
print("--- 1. Sales Quantity (Quantity) ---")
print(df_quantity)
print("\n--- 2. Unit Price (Unit Price) ---")
print(df_price)
# ---------------------------------------------------------
# [Multiplication (*)]: Quantity * Unit Price = Sales Amount
# ---------------------------------------------------------
# Calculates element-wise product
df_sales = df_quantity * df_price
print("\n=== [Multiplication] Sales Amount (Quantity * Price) ===")
print(df_sales)
# ---------------------------------------------------------
# [Subtraction (-)]: Sales Amount - Cost = Gross Profit
# ---------------------------------------------------------
# Create Cost Data (Cost)
# Preparing dummy values (approx. 70% of sales)
df_cost = pd.DataFrame(
[[35000, 54000], [67000, 72000], [42000, 46000]],
index=products, columns=regions
)
df_gross_profit = df_sales - df_cost
print("\n=== [Subtraction] Gross Profit (Sales - Cost) ===")
print(df_gross_profit)
# ---------------------------------------------------------
# [Addition (+)]: Gross Profit + Rebate = Final Profit
# ---------------------------------------------------------
# Rebate data from headquarters
df_rebate = pd.DataFrame(
[[1000, 2000], [500, 500], [3000, 3000]],
index=products, columns=regions
)
df_final_profit = df_gross_profit + df_rebate
print("\n=== [Addition] Final Profit (Gross Profit + Rebate) ===")
print(df_final_profit)
# ---------------------------------------------------------
# [Division (/)]: Sales Amount / Target Sales = Achievement Rate
# ---------------------------------------------------------
# Target Sales Data (Target)
df_target = pd.DataFrame(
[[60000, 80000], [100000, 100000], [50000, 70000]],
index=products, columns=regions
)
# Calculate achievement rate (result is decimal)
df_achievement_rate = df_sales / df_target
print("\n=== [Division] Achievement Rate (Sales / Target) ===")
# Round to 2 decimal places for better visibility
print(df_achievement_rate.round(2))
if __name__ == "__main__":
demonstrate_dataframe_operations()
Execution Result
--- 1. Sales Quantity (Quantity) ---
East_Region West_Region
Product_A 100 150
Product_B 80 90
Product_C 200 220
--- 2. Unit Price (Unit Price) ---
East_Region West_Region
Product_A 500 520
Product_B 1200 1150
Product_C 300 300
=== [Multiplication] Sales Amount (Quantity * Price) ===
East_Region West_Region
Product_A 50000 78000
Product_B 96000 103500
Product_C 60000 66000
=== [Subtraction] Gross Profit (Sales - Cost) ===
East_Region West_Region
Product_A 15000 24000
Product_B 29000 31500
Product_C 18000 20000
=== [Addition] Final Profit (Gross Profit + Rebate) ===
East_Region West_Region
Product_A 16000 26000
Product_B 29500 32000
Product_C 21000 23000
=== [Division] Achievement Rate (Sales / Target) ===
East_Region West_Region
Product_A 0.83 0.98
Product_B 0.96 1.04
Product_C 1.20 0.94
Important Notes
- Index Mismatch: If the indices or column names of the two DataFrames used in the calculation do not match, Pandas will automatically combine them (like an outer join), and locations where data exists in only one DataFrame will become
NaN(Not a Number). - Zero Division: When performing division, if the denominator contains
0, the result will beinf(infinity) orNaN. - Matrix Multiplication: If you want to calculate the matrix product (dot product) instead of element-wise multiplication, use the
df.dot()method instead of the*operator.
