Sorting Data with the sort_values Method in Pandas

Basics of Sorting Data with the sort_values Method

In a Pandas DataFrame, you can use the sort_values method to sort data based on specific column values. You can set detailed conditions, such as primary and secondary sort keys, by providing a list of multiple columns.
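
As a minimal sketch of the simplest case, the snippet below sorts by a single column. The `scores` DataFrame and its column names are made up for illustration and are not part of the sample data used in the rest of this article.

```python
import pandas as pd

# Hypothetical data for illustration only
scores = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy"],
    "Score": [72, 91, 85],
})

# A single sort key can be passed as a plain string.
# sort_values returns a new DataFrame; `scores` itself is left unchanged.
by_score = scores.sort_values(by="Score", ascending=False)
print(by_score)
```

The rows come back ordered from the highest score to the lowest, and each row keeps its original index label.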

Preparing Sample Data

We will create sample data for project management. This DataFrame includes three columns: Department (Dept), Priority, and Budget.

import pandas as pd

# Define project data
# Dept, Priority, Budget
project_data = {
    "Dept": ["Sales", "Tech", "Sales", "HR", "Tech", "HR"],
    "Priority": [2, 5, 2, 1, 4, 1],
    "Budget": [500000, 1200000, 750000, 300000, 950000, 280000]
}

df = pd.DataFrame(project_data)
print("--- Data before sorting ---")
print(df)

Sorting with Multiple Columns

You can sort by multiple keys by passing a list of column names to the by parameter. You can also control the sort order (ascending or descending) for each column by passing a list of booleans to the ascending parameter.

Implementation Code

In the code below, we sort by Department (Dept) in ascending order. Within each department, rows are sorted by Priority in descending order, and rows with the same Priority are then sorted by Budget in descending order.

# Sort by multiple columns
# Dept: Ascending (True), Priority: Descending (False), Budget: Descending (False)
df_sorted = df.sort_values(
    by=["Dept", "Priority", "Budget"],
    ascending=[True, False, False]
)

print("\n--- Data after sorting ---")
print(df_sorted)

Execution Result

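Running the code sorts the rows by Dept first, then by Priority, then by Budget. Each row keeps its original index label, so the index column is no longer in sequential order after sorting.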

--- Data before sorting ---
    Dept  Priority   Budget
0  Sales         2   500000
1   Tech         5  1200000
2  Sales         2   750000
3     HR         1   300000
4   Tech         4   950000
5     HR         1   280000

--- Data after sorting ---
    Dept  Priority   Budget
3     HR         1   300000
5     HR         1   280000
2  Sales         2   750000
0  Sales         2   500000
1   Tech         5  1200000
4   Tech         4   950000

Parameter Explanation

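  • by: A column name, or a list of column names, used as the sort criteria. When a list is given, earlier columns take priority over later ones.
  • ascending: A boolean, or a list of booleans specifying the sort order for each column in the by parameter. True is for ascending order (small to large), and False is for descending order (large to small). A list must have the same length as by.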
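
The following sketch shows two variations on these parameters, using the same sample data as above. Note that `ignore_index` is a further `sort_values` parameter (available in pandas 1.0 and later) that this article does not otherwise cover; it is included here only to show how to renumber the result.

```python
import pandas as pd

df = pd.DataFrame({
    "Dept": ["Sales", "Tech", "Sales", "HR", "Tech", "HR"],
    "Priority": [2, 5, 2, 1, 4, 1],
    "Budget": [500000, 1200000, 750000, 300000, 950000, 280000],
})

# A single boolean applies the same direction to every key in `by`
all_desc = df.sort_values(by=["Priority", "Budget"], ascending=False)
print(all_desc)

# ignore_index=True renumbers the result 0..n-1
# instead of keeping the original index labels
renumbered = df.sort_values(by="Budget", ignore_index=True)
print(renumbered)
```

Passing one boolean is convenient when every key shares a direction; a list is only needed when the directions differ per column.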

Sorting by multiple keys makes it easy to rank items within each category, or to find the largest and smallest values per group, such as the highest-budget project in each department.
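
One possible way to pick the highest-budget project per department is sketched below. Combining `sort_values` with `drop_duplicates` is an approach chosen for this illustration, not something the article prescribes; other approaches, such as grouping, would also work.

```python
import pandas as pd

df = pd.DataFrame({
    "Dept": ["Sales", "Tech", "Sales", "HR", "Tech", "HR"],
    "Priority": [2, 5, 2, 1, 4, 1],
    "Budget": [500000, 1200000, 750000, 300000, 950000, 280000],
})

# Sort so that the largest Budget comes first within each Dept,
# then keep only the first row seen for each Dept.
top_per_dept = (
    df.sort_values(by=["Dept", "Budget"], ascending=[True, False])
      .drop_duplicates(subset="Dept", keep="first")
)
print(top_per_dept)
```

Changing the second entry of `ascending` to True would instead keep the lowest-budget project in each department.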
