[Python] Retrieving and Updating Column Data in Pandas DataFrame (Bracket and Dot Notation)

Extracting specific columns from a Pandas DataFrame for analysis, or updating column values based on calculation results, are fundamental operations in data processing.

This article explains how to retrieve column data using bracket notation [] and dot notation ., how to access specific elements, and how to overwrite and update data for an entire column.

目次

Basics of Retrieving and Updating Column Data

There are two main ways to access columns in a DataFrame.

  • Bracket Notation: df["column_name"] This is the most standard method. It can be used even if the column name contains spaces or symbols.
  • Dot Notation: df.column_name This is a concise way to write code, available only when the column name satisfies Python’s variable naming rules (no spaces, does not start with a number, etc.).

Additionally, by assigning a new list or Series to the extracted column, you can update the data for that column all at once.

Implementation Sample Code

Below is code using a fruit shop’s inventory list as an example. We will manipulate a DataFrame containing Price and Stock information.

import pandas as pd

def manage_column_data():
    """
    Function demonstrating DataFrame column access, element reference, and column data updates
    """
    
    # 1. Data Preparation
    # Create a DataFrame with fruit names as the index, and Price/Stock as columns
    fruit_names = ["Apple", "Banana", "Orange", "Grape", "Peach"]
    inventory_data = {
        "Price": [150, 100, 120, 400, 300],
        "Stock": [50, 80, 60, 20, 15]
    }
    
    df = pd.DataFrame(inventory_data, index=fruit_names)
    
    print("--- Initial DataFrame ---")
    print(df)
    print("\n")


    # 2. Retrieving Column Data
    print("=== Retrieving Column Data ===")
    
    # Pattern A: Bracket Notation (Recommended)
    # Retrieved as a Series
    price_series = df["Price"]
    print("--- Price Column (Bracket Notation) ---")
    print(price_series)
    
    # Pattern B: Dot Notation
    # Code is shorter, but may not be usable depending on the column name
    stock_series = df.Stock
    print("\n--- Stock Column (Dot Notation) ---")
    print(stock_series)


    # 3. Accessing Specific Elements (Cells)
    print("\n=== Accessing Specific Elements ===")
    
    # Get the value for "Banana" in the Stock column
    # Access the Series (df.Stock) using the index (.Banana)
    banana_stock = df.Stock.Banana
    print(f"Stock count for Banana: {banana_stock}")


    # 4. Updating Column Data (Overwriting)
    print("\n=== Updating Column Data ===")
    
    # Update the 'Price' column with a new price list
    # The length of the list must match the number of rows in the DataFrame
    new_prices = [160, 110, 130, 420, 320]
    
    # It is also possible to create a Pandas Series and assign it by explicitly specifying the index
    # By matching indices, data is mapped correctly even if the order differs
    new_price_series = pd.Series(new_prices, index=fruit_names)
    
    df["Price"] = new_price_series
    
    print("--- DataFrame after Price Revision ---")
    print(df)

if __name__ == "__main__":
    manage_column_data()

Execution Result

--- Initial DataFrame ---
        Price  Stock
Apple     150     50
Banana    100     80
Orange    120     60
Grape     400     20
Peach     300     15


=== Retrieving Column Data ===
--- Price Column (Bracket Notation) ---
Apple     150
Banana    100
Orange    120
Grape     400
Peach     300
Name: Price, dtype: int64

--- Stock Column (Dot Notation) ---
Apple     50
Banana    80
Orange    60
Grape     20
Peach     15
Name: Stock, dtype: int64

=== Accessing Specific Elements ===
Stock count for Banana: 80

=== Updating Column Data ===
--- DataFrame after Price Revision ---
        Price  Stock
Apple     160     50
Banana    110     80
Orange    130     60
Grape     420     20
Peach     320     15

Explanation and Important Points

Bracket Notation vs. Dot Notation

While dot notation (df.Price) is convenient, bracket notation (df["Price"]) is mandatory in the following cases:

  • When the column name contains spaces: (e.g., "Unit Price")
  • When the column name is a Python reserved word or method name: (e.g., "class", "sum", "count")
  • When specifying the column name dynamically using a variable: (e.g., col = "Price"; df[col])

Benefits of Updating with Series

When updating a column, you can assign a simple list (array). However, assigning a pd.Series has the advantage of automatic adjustment via the index.

If the index of the Series matches the index of the DataFrame, the correct values will be assigned to the correct rows without worrying about the order of the data.

よかったらシェアしてね!
  • URLをコピーしました!
  • URLをコピーしました!

この記事を書いた人

私が勉強したこと、実践したこと、してることを書いているブログです。
主に資産運用について書いていたのですが、
最近はプログラミングに興味があるので、今はそればっかりです。

目次