Overview
When you need to parallelize CPU-intensive tasks (such as heavy calculations) in Python, using multiprocessing is more effective than threading.
Since each process has its own independent memory space and Python interpreter, it is not restricted by the Global Interpreter Lock (GIL). This allows you to fully utilize the performance of multi-core CPUs.
In this article, we will explain the basics of process creation, passing arguments, waiting for completion (join), and daemon processes.
Specifications (Input/Output)
Parameters for multiprocessing.Process
| Parameter | Type | Meaning |
| --- | --- | --- |
| target | callable | The function object to be executed when the process starts. |
| args | tuple | Positional arguments to pass to the target function. A comma is required for a single element (e.g., (1,)). |
| kwargs | dict | A dictionary of keyword arguments to pass to the target function. |
| daemon | bool | If set to True, the process becomes a daemon process and is forced to exit when the main process ends. |
Main Methods
| Method | Description |
| --- | --- |
| start() | Spawns the process and begins execution of the target function. |
| join(timeout=None) | Blocks until the process exits, or until timeout seconds have elapsed if a timeout is given. Without this, the main process's code continues without waiting for the sub-process to finish. |
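As a rough illustration of how these parameters and methods fit together, the sketch below builds a Process, waits for it with a timeout, and then inspects the result. The function name slow_square, the helper build_process, and the timeout value are placeholders chosen for this example; is_alive() and exitcode are additional members of Process that are not listed in the tables above.

```python
import multiprocessing
import time


def slow_square(x: int, delay: float = 0.2) -> None:
    """Hypothetical worker: sleeps briefly, then prints x squared."""
    time.sleep(delay)
    print(f"{x} squared is {x * x}")


def build_process() -> multiprocessing.Process:
    # args supplies positional arguments, kwargs supplies keyword arguments.
    return multiprocessing.Process(
        target=slow_square,
        args=(7,),
        kwargs={"delay": 0.2},
        daemon=False,
    )


if __name__ == "__main__":
    p = build_process()
    p.start()
    p.join(timeout=5.0)  # wait at most 5 seconds
    if p.is_alive():
        print("Still running after the timeout")
    else:
        print(f"Exit code: {p.exitcode}")  # 0 indicates a normal exit
```

Note that join(timeout=...) returns normally even when the timeout expires, so checking is_alive() afterwards is one way to tell the two cases apart.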
Basic Usage
Define a function and specify it as the target when creating a Process instance. Use start() to begin and join() to wait for completion.
# Basic form passing arguments as a tuple
p = multiprocessing.Process(target=my_func, args=("value1",))
p.start()
p.join()
Full Code Example
In this example, two separate processes execute different functions: “counting numbers” and “printing characters.” This code demonstrates how to pass arguments using both args (positional) and kwargs (keyword).
import multiprocessing
import time
import os


def print_numbers(process_name: str, limit: int):
    """
    Function that prints numbers a specified number of times.
    """
    print(f"[{process_name}] PID: {os.getpid()} Started")
    for i in range(limit):
        print(f" {process_name}: {i}")
        time.sleep(0.5)
    print(f"[{process_name}] Finished")


def print_letters(process_name: str, letters: list):
    """
    Function that prints letters from a list sequentially.
    """
    print(f"[{process_name}] PID: {os.getpid()} Started")
    for char in letters:
        print(f" {process_name}: {char}")
        time.sleep(0.7)
    print(f"[{process_name}] Finished")


def main():
    print(f"Main Process PID: {os.getpid()} Started")

    # Process 1: Using args (positional argument tuple)
    # Note: Even with a single element, a comma is required like (val, )
    p1 = multiprocessing.Process(
        target=print_numbers,
        args=("NumProc", 3)
    )

    # Process 2: Using kwargs (keyword argument dictionary)
    p2 = multiprocessing.Process(
        target=print_letters,
        kwargs={"process_name": "CharProc", "letters": ["A", "B", "C"]}
    )

    # Start processes
    p1.start()
    p2.start()
    print("--- Processes are running ---")

    # Wait for processes to finish
    p1.join()
    p2.join()
    print("All processes have finished.")


if __name__ == "__main__":
    # This guard block is required when the spawn start method is used
    # (the default on Windows and macOS)
    main()
Example Output
Since the two processes run concurrently, their output is interleaved, and the exact ordering can vary from run to run. Notice that the PIDs (Process IDs) are different for each.
Main Process PID: 12345 Started
--- Processes are running ---
[NumProc] PID: 12346 Started
[CharProc] PID: 12347 Started
NumProc: 0
CharProc: A
NumProc: 1
CharProc: B
NumProc: 2
[NumProc] Finished
CharProc: C
[CharProc] Finished
All processes have finished.
Customization Points
- Passing Arguments:
  - args=("hoge",): When passing a single argument, remember the trailing comma to ensure it is treated as a tuple.
  - kwargs={"key": "value"}: This is recommended for better readability when you have many arguments or want to override default values.
- Managing Process Lists: When launching many processes, it is common to store them in a list (processes = []) and use loops to call start() and join().
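The list-based pattern described above might look like the following sketch. The worker sum_squares, the helper chunk_ranges, and the choice of four processes are all assumptions made for illustration; the part that matters is the three loops at the bottom.

```python
import multiprocessing
import os


def chunk_ranges(total: int, parts: int) -> list[tuple[int, int]]:
    """Split range(total) into `parts` contiguous (start, stop) pairs."""
    step, extra = divmod(total, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks each take one leftover element.
        stop = start + step + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def sum_squares(start: int, stop: int) -> None:
    """Hypothetical CPU-bound worker: sums squares over [start, stop)."""
    total = sum(n * n for n in range(start, stop))
    print(f"PID {os.getpid()}: sum of squares in [{start}, {stop}) = {total}")


if __name__ == "__main__":
    processes = []
    for start, stop in chunk_ranges(1_000_000, 4):
        p = multiprocessing.Process(target=sum_squares, args=(start, stop))
        processes.append(p)

    for p in processes:
        p.start()  # launch all workers first ...
    for p in processes:
        p.join()   # ... then wait for each one
```

Starting every process before joining any of them is what lets the workers run at the same time; calling start() and join() inside the same loop would run them one after another.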
Important Notes
- Necessity of if __name__ == "__main__": On Windows and macOS, where the default start method is spawn, the entire module is re-imported in each new process. Without this guard block, every child would run the process-creation code again; Python detects this and raises a RuntimeError rather than creating processes without end.
- Independent Memory Space: Modifying a global variable only affects the “copy” inside that specific process. To share data between processes, you must use specific features like Queue or Value.
- Zombie Processes: On Unix-like systems, if a parent process continues running for a long time without calling join(), finished child processes may remain in the system as “zombie” processes. Ensure you call join() on every process you start, or use a higher-level API such as Pool, which can be used as a context manager.
Advanced Application
Here is an example using a daemon process (daemon=True). A daemon process is forced to terminate when the main process ends, even if it is still working. This is useful for background tasks like “log monitoring” or “health checks.”
import multiprocessing
import time


def background_task():
    print("Background task started")
    while True:
        print(" ...working")
        time.sleep(0.5)


def main_daemon():
    # Set daemon=True
    p = multiprocessing.Process(target=background_task, daemon=True)
    p.start()
    print("Main process running (for 2 seconds)...")
    time.sleep(2.0)
    print("Main process ending. The daemon will also be terminated.")
    # Exits without calling p.join()


if __name__ == "__main__":
    main_daemon()
Conclusion
multiprocessing.Process is the most fundamental class for process-based parallelism in Python.
Caution: Creating a process is more expensive than creating a thread. It is not suitable for launching thousands of lightweight tasks that finish in milliseconds (in that case, consider using Pool).
Understanding the difference from threading (memory independence) and using it correctly will help improve the performance of your Python programs.
Best for: CPU-intensive calculations with high independence and background tasks with different lifecycles than the main process.
Key Points: Pay attention to the trailing comma in args and always include the if __name__ == "__main__": block.
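For the many-small-tasks case mentioned in the caution above, a Pool-based version could look roughly like this. The function square and the pool size of 4 are arbitrary choices for the sketch, not recommendations.

```python
import multiprocessing


def square(n: int) -> int:
    """Small task; too cheap to justify one process per call."""
    return n * n


if __name__ == "__main__":
    # A fixed set of worker processes is reused for all 1,000 inputs.
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(square, range(1000))
    print(results[:5])  # [0, 1, 4, 9, 16]
```

Here the cost of creating processes is paid four times rather than a thousand times, and pool.map returns the results in the same order as the inputs.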
