[Python] Summary of Arithmetic Operations Between Pandas DataFrames (Addition, Subtraction, Multiplication, Division)

In Pandas, you can perform arithmetic operations (addition, subtraction, multiplication, and division) between DataFrames intuitively, much like calculating cells in Excel. When you use operators (+, -, *, /), calculations are automatically performed between elements where the row labels (index) and column labels (columns) match.

This article explains comprehensive calculation methods, including not just addition, but also subtraction, multiplication, and division.

目次

Basics of DataFrame Operations

If you have df1 and df2, you can write the operations as follows:

  • Addition: df1 + df2
  • Subtraction: df1 - df2
  • Multiplication: df1 * df2 (This is the element-wise product, not the matrix product)
  • Division: df1 / df2

Implementation Sample Code

Here, we assume a scenario where we analyze the “Sales Performance” of a retail store. We will perform calculations in the following four steps:

  1. Multiplication: Calculate “Sales Amount” from “Sales Quantity” and “Unit Price”.
  2. Subtraction: Calculate “Gross Profit” by subtracting “Cost” from “Sales Amount”.
  3. Addition: Calculate “Final Profit” by adding “Special Rebate” to “Gross Profit”.
  4. Division: Calculate “Achievement Rate” by dividing “Sales Amount” by “Target Sales”.
import pandas as pd

def demonstrate_dataframe_operations():
    """
    Function to demonstrate arithmetic operations (add, sub, mul, div) 
    between Pandas DataFrames.
    """
    
    # ---------------------------------------------------------
    # Data Preparation: Create DataFrame for Product (Row) x Region (Column)
    # ---------------------------------------------------------
    products = ["Product_A", "Product_B", "Product_C"]
    regions = ["East_Region", "West_Region"]

    # 1. Sales Quantity Data (Quantity)
    df_quantity = pd.DataFrame(
        [[100, 150], [80, 90], [200, 220]],
        index=products, columns=regions
    )

    # 2. Unit Price Data (Unit Price)
    # Assuming unit prices differ by region
    df_price = pd.DataFrame(
        [[500, 520], [1200, 1150], [300, 300]],
        index=products, columns=regions
    )

    print("--- 1. Sales Quantity (Quantity) ---")
    print(df_quantity)
    print("\n--- 2. Unit Price (Unit Price) ---")
    print(df_price)


    # ---------------------------------------------------------
    # [Multiplication (*)]: Quantity * Unit Price = Sales Amount
    # ---------------------------------------------------------
    # Calculates element-wise product
    df_sales = df_quantity * df_price
    
    print("\n=== [Multiplication] Sales Amount (Quantity * Price) ===")
    print(df_sales)


    # ---------------------------------------------------------
    # [Subtraction (-)]: Sales Amount - Cost = Gross Profit
    # ---------------------------------------------------------
    # Create Cost Data (Cost)
    # Preparing dummy values (approx. 70% of sales)
    df_cost = pd.DataFrame(
        [[35000, 54000], [67000, 72000], [42000, 46000]],
        index=products, columns=regions
    )
    
    df_gross_profit = df_sales - df_cost
    
    print("\n=== [Subtraction] Gross Profit (Sales - Cost) ===")
    print(df_gross_profit)


    # ---------------------------------------------------------
    # [Addition (+)]: Gross Profit + Rebate = Final Profit
    # ---------------------------------------------------------
    # Rebate data from headquarters
    df_rebate = pd.DataFrame(
        [[1000, 2000], [500, 500], [3000, 3000]],
        index=products, columns=regions
    )
    
    df_final_profit = df_gross_profit + df_rebate
    
    print("\n=== [Addition] Final Profit (Gross Profit + Rebate) ===")
    print(df_final_profit)


    # ---------------------------------------------------------
    # [Division (/)]: Sales Amount / Target Sales = Achievement Rate
    # ---------------------------------------------------------
    # Target Sales Data (Target)
    df_target = pd.DataFrame(
        [[60000, 80000], [100000, 100000], [50000, 70000]],
        index=products, columns=regions
    )
    
    # Calculate achievement rate (result is decimal)
    df_achievement_rate = df_sales / df_target
    
    print("\n=== [Division] Achievement Rate (Sales / Target) ===")
    # Round to 2 decimal places for better visibility
    print(df_achievement_rate.round(2))

if __name__ == "__main__":
    demonstrate_dataframe_operations()

Execution Result

--- 1. Sales Quantity (Quantity) ---
           East_Region  West_Region
Product_A          100          150
Product_B           80           90
Product_C          200          220

--- 2. Unit Price (Unit Price) ---
           East_Region  West_Region
Product_A          500          520
Product_B         1200         1150
Product_C          300          300

=== [Multiplication] Sales Amount (Quantity * Price) ===
           East_Region  West_Region
Product_A        50000        78000
Product_B        96000       103500
Product_C        60000        66000

=== [Subtraction] Gross Profit (Sales - Cost) ===
           East_Region  West_Region
Product_A        15000        24000
Product_B        29000        31500
Product_C        18000        20000

=== [Addition] Final Profit (Gross Profit + Rebate) ===
           East_Region  West_Region
Product_A        16000        26000
Product_B        29500        32000
Product_C        21000        23000

=== [Division] Achievement Rate (Sales / Target) ===
           East_Region  West_Region
Product_A         0.83         0.98
Product_B         0.96         1.04
Product_C         1.20         0.94

Important Notes

  • Index Mismatch: If the indices or column names of the two DataFrames used in the calculation do not match, Pandas will automatically combine them (like an outer join), and locations where data exists in only one DataFrame will become NaN (Not a Number).
  • Zero Division: When performing division, if the denominator contains 0, the result will be inf (infinity) or NaN.
  • Matrix Multiplication: If you want to calculate the matrix product (dot product) instead of element-wise multiplication, use the df.dot() method instead of the * operator.
よかったらシェアしてね!
  • URLをコピーしました!
  • URLをコピーしました!

この記事を書いた人

私が勉強したこと、実践したこと、してることを書いているブログです。
主に資産運用について書いていたのですが、
最近はプログラミングに興味があるので、今はそればっかりです。

目次