While Python is a language that handles memory management automatically (it has garbage collection), excessive memory usage can become a problem in processes handling large data or long-running server applications. In such cases, you can use the standard library tracemalloc module to track and measure in detail “which lines in which files are consuming how much memory.”
This article explains how to use tracemalloc to take snapshots of memory allocation status and identify areas with high memory consumption.
Basic Usage of tracemalloc
tracemalloc (Trace Memory Allocation) is a module for tracing memory block allocations. The basic procedure is as follows:
- Start tracing with tracemalloc.start().
- Execute the process you want to measure.
- Take a snapshot of the memory allocation status at that moment with tracemalloc.take_snapshot().
- Analyze and display the acquired data.
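These four steps can be sketched in a minimal script (the list comprehension here is just a placeholder workload):

```python
import tracemalloc

# 1. Start tracing memory allocations
tracemalloc.start()

# 2. Execute the process to be measured (placeholder workload)
data = [str(i) for i in range(10000)]

# 3. Take a snapshot of allocations at this moment
snapshot = tracemalloc.take_snapshot()

# 4. Analyze: aggregate the traced blocks by file and line number
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)

tracemalloc.stop()
```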
Practical Code Example: Comparing List vs. Dictionary Memory
As a concrete example, we will create a script that generates a “List” and a “Dictionary” with a large amount of data and compares/analyzes their respective memory consumption.
import tracemalloc
import os

def generate_large_list(size):
    """
    Function to create a list of the specified size
    """
    return [i for i in range(size)]

def generate_large_dict(size):
    """
    Function to create a dictionary of the specified size
    (Tends to consume more memory than a list because it holds both keys and values)
    """
    return {f"key_{i}": i for i in range(size)}

def main():
    # 1. Start tracking memory allocation
    print("--- Starting Memory Measurement ---")
    tracemalloc.start()

    # 2. Execute target processes
    # Create a list with 100,000 elements
    data_list = generate_large_list(100000)
    # Create a dictionary with 100,000 elements
    data_dict = generate_large_dict(100000)

    # 3. Take a snapshot of the current memory state
    snapshot = tracemalloc.take_snapshot()

    # 4. Get and display statistics
    # The 'lineno' option aggregates memory usage by filename and line number
    top_stats = snapshot.statistics('lineno')

    print("\n[Top 5 Memory Consumers]")
    for index, stat in enumerate(top_stats[:5], 1):
        # Each stat object holds size, count (number of blocks), and traceback (location)
        # Convert size to a readable unit (KiB)
        size_kib = stat.size / 1024
        print(f"{index}. {stat.traceback.format()[0]}")
        print(f"   Size: {size_kib:.1f} KiB | Count: {stat.count}")

    # Stop tracking (release the memory used by the traces)
    tracemalloc.stop()

if __name__ == "__main__":
    main()
Example Output:
--- Starting Memory Measurement ---

[Top 5 Memory Consumers]
1. File "memory_check.py", line 15
       return {f"key_{i}": i for i in range(size)}
   Size: 11824.2 KiB | Count: 199923
2. File "memory_check.py", line 8
       return [i for i in range(size)]
   Size: 3624.1 KiB | Count: 99912
...
Analysis of Results
Looking at the output, it is immediately clear which lines of code are consuming how much memory.
- 1st Place: Dictionary generation at line 15. Consumes about 11.8 MiB.
- 2nd Place: List generation at line 8. Consumes about 3.6 MiB.
Thus, we can confirm that even with the same number of elements, dictionaries consume more memory than lists.
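This difference can also be cross-checked with the standard sys.getsizeof. Note that getsizeof measures only the container object itself, not the keys and values it references, so the figures come out smaller than the tracemalloc totals above:

```python
import sys

size = 100_000
data_list = [i for i in range(size)]
data_dict = {f"key_{i}": i for i in range(size)}

# Container overhead only: getsizeof does not follow references,
# so this compares the list/dict structures themselves
list_kib = sys.getsizeof(data_list) / 1024
dict_kib = sys.getsizeof(data_dict) / 1024
print(f"list container: {list_kib:.1f} KiB")
print(f"dict container: {dict_kib:.1f} KiB")
```

Even at the container level, the dictionary's hash table is larger than the list's pointer array, and on top of that the dictionary also allocates 100,000 string keys that the list does not need.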
Getting Current and Peak Memory Usage
If you don’t need a detailed analysis with snapshots but simply want to know “how much memory is being used right now,” the get_traced_memory() function is convenient. It returns a tuple of (current usage, peak usage since start).
import tracemalloc
# Start tracking
tracemalloc.start()
# Process consuming memory
temp_data = [b"0" * 1024 * 1024 for _ in range(50)] # Allocate approx 50MB
# Get memory usage
current, peak = tracemalloc.get_traced_memory()
print(f"Current Memory Usage: {current / 1024 / 1024:.2f} MiB")
print(f"Peak Memory Usage: {peak / 1024 / 1024:.2f} MiB")
tracemalloc.stop()
Output:
Current Memory Usage: 50.00 MiB
Peak Memory Usage: 50.00 MiB
Summary
- Use the tracemalloc module to investigate Python memory usage in detail.
- Start with start(), save the state with take_snapshot(), and analyze line-by-line consumption with statistics('lineno').
- get_traced_memory() is effective for simple measurements.
- It is a very powerful tool for investigating memory leaks and performing performance tuning.
