[Python] Exclusive Control of File Writing Between Processes Using multiprocessing.Lock


Overview

In a multi-process environment, “race conditions” can occur when multiple processes attempt to write to the same file or print to standard output simultaneously. This can result in corrupted data or interleaved, unreadable output. To prevent this, multiprocessing.Lock is used. While one process holds the lock, the others are put into a waiting state. This “serializes” the operations, ensuring data integrity.

Specifications (Input/Output)

  • Input:
    • Shared resource (e.g., a text file).
    • multiprocessing.Lock object.
  • Output: A final result that is consistent and accurate, even when updated by multiple processes in parallel.
  • Mechanism: Using a with lock: block acquires the lock upon entry and automatically releases it upon exit.
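The with lock: form is shorthand for an explicit acquire/release pair; a minimal sketch of the equivalence (the function names here are illustrative, not part of the API):

```python
from multiprocessing import Lock

lock = Lock()

def update_with_statement():
    # `with` acquires on entry and releases on exit,
    # even if the block raises an exception.
    with lock:
        pass  # critical section

def update_manually():
    # The equivalent manual form; `finally` is what guarantees
    # the release on the error path.
    lock.acquire()
    try:
        pass  # critical section
    finally:
        lock.release()
```

Both forms behave identically; the with form is preferred because the release cannot be forgotten.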

Basic Usage

Create a Lock in the main process and pass it as an argument to each sub-process. On platforms that use the spawn start method (such as Windows and macOS), process creation must be placed under an if __name__ == "__main__": guard.

from multiprocessing import Process, Lock

def worker(l):
    with l:
        # Only one process can execute this block at a time
        print("Exclusive control in progress")

if __name__ == "__main__":
    lock = Lock()
    # Pass the lock as an argument
    p = Process(target=worker, args=(lock,))
    p.start()
    p.join()

Full Code Example

The following code demonstrates a file-based counter with exclusive control applied. Without a lock, multiple processes would perform “read-then-write” operations simultaneously, causing values to be overwritten and resulting in an incorrect count. Using a lock ensures the program works as expected.

import multiprocessing
import time
import os

# Filename for saving data
FILE_NAME = "counter_tmp.txt"

def init_data():
    """Initializes the file (writes 0)"""
    with open(FILE_NAME, "w") as f:
        f.write("0")
    print("[Main] Initialized file with: 0")

def read_data():
    """Reads the numerical value from the file"""
    with open(FILE_NAME, "r") as f:
        text = f.read()
    return int(text) if text else 0

def write_data(n):
    """Writes the numerical value to the file"""
    with open(FILE_NAME, "w") as f:
        f.write(str(n))

def increment_process(lock, process_name, loops):
    """
    Process that reads from and writes to a file to increment a value.
    Uses a Lock to protect the sequence from reading to writing.
    """
    for _ in range(loops):
        # --- Start Critical Section ---
        with lock:
            # 1. Read
            current_val = read_data()
            
            # 2. Calculate (simulate processing time)
            new_val = current_val + 1
            time.sleep(0.01)
            
            # 3. Write
            write_data(new_val)
            
            # Optional: Log the update (also protected by the lock to prevent mixed output)
            # print(f"[{process_name}] updated to {new_val}")
        # --- End Critical Section ---
        
        # Wait slightly outside the lock to give other processes a turn
        time.sleep(0.001)
    
    print(f"[{process_name}] Finished")

def main():
    # 1. Create a Lock object
    lock = multiprocessing.Lock()
    
    # 2. Initialize data
    init_data()
    
    loops = 10
    process_count = 2
    
    print("--- Starting Parallel Processing ---")
    
    # 3. Create processes and pass the lock
    p1 = multiprocessing.Process(target=increment_process, args=(lock, "Process-1", loops))
    p2 = multiprocessing.Process(target=increment_process, args=(lock, "Process-2", loops))
    
    p1.start()
    p2.start()
    
    p1.join()
    p2.join()
    
    print("--- All processes finished ---")
    
    # Verify results
    final_val = read_data()
    expected_val = loops * process_count
    
    print(f"Final Result: {final_val}")
    print(f"Expected Value: {expected_val}")
    
    if final_val == expected_val:
        print(">> Success: Counted correctly due to exclusive control.")
    else:
        print(">> Failure: A race condition occurred.")
        
    # Cleanup
    if os.path.exists(FILE_NAME):
        os.remove(FILE_NAME)

if __name__ == "__main__":
    main()

Example Output

[Main] Initialized file with: 0
--- Starting Parallel Processing ---
[Process-1] Finished
[Process-2] Finished
--- All processes finished ---
Final Result: 20
Expected Value: 20
>> Success: Counted correctly due to exclusive control.

Customization Points

  • Lock Scope: It is crucial to wrap the entire “sequence of actions”—from read_data to write_data—inside the lock. Locking only the internal parts of individual functions is insufficient if an interruption occurs between reading and writing.
  • Protecting Standard Output: Since print statements can also scramble output when they conflict, you may want to use a separate lock specifically for logging.
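The two points above can be combined: one lock guards the shared data, a second guards the console. A minimal sketch (the worker function and lock names are illustrative assumptions, not from the original code):

```python
import multiprocessing

def worker(data_lock, log_lock, name):
    # Protect the shared resource with one lock...
    with data_lock:
        result = f"{name}: work done"
    # ...and protect stdout with a separate lock, so log lines
    # from different processes never interleave mid-line.
    with log_lock:
        print(result)

if __name__ == "__main__":
    data_lock = multiprocessing.Lock()
    log_lock = multiprocessing.Lock()
    procs = [multiprocessing.Process(target=worker,
                                     args=(data_lock, log_lock, f"Process-{i}"))
             for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Using separate locks keeps logging from blocking data updates and vice versa.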

Important Notes

  • Deadlocks: There is a risk of a “deadlock” (where the program hangs indefinitely) if a lock is acquired manually with acquire() and an exception prevents the matching release() from running, or if two processes each hold one lock while waiting for the other’s. Note that a with lock: block releases the lock even when an exception occurs, which is one reason it is the recommended form.
  • Performance: Locking slow operations like file access can reduce the performance benefits of parallel processing. Ensure you only lock the absolute minimum necessary sections of your code.
  • Difference from Thread Locks: threading.Lock and multiprocessing.Lock are distinct entities. Always use the one from the multiprocessing module when working between processes.
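As a defensive measure against hanging forever, acquire() accepts a timeout argument; a sketch of a back-off pattern (try_update is a hypothetical helper name):

```python
import multiprocessing

def try_update(lock, timeout=1.0):
    # acquire() returns False if the lock could not be obtained
    # within `timeout` seconds, letting the caller back off or
    # report an error instead of blocking indefinitely.
    if lock.acquire(timeout=timeout):
        try:
            # critical section: read, modify, and write here
            return True
        finally:
            lock.release()
    return False
```

A False return can then be handled with a retry loop or logged as a contention warning.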

Advanced Application

Recursive Lock (multiprocessing.RLock): This is a lock that can be acquired multiple times by the same process. It is suitable for using locks within recursive function calls.

import multiprocessing

def recursive_worker(rlock, count):
    if count <= 0:
        return
    
    # An RLock can be acquired repeatedly within the same process
    with rlock:
        print(f"Layer {count}: Holding lock")
        recursive_worker(rlock, count - 1)

if __name__ == "__main__":
    rlock = multiprocessing.RLock()
    p = multiprocessing.Process(target=recursive_worker, args=(rlock, 3))
    p.start()
    p.join()

Conclusion

multiprocessing.Lock is the most fundamental tool for preventing resource conflicts between processes.

Design Tip: If the lock scope is too broad, it becomes no different from serial execution; if it is too narrow, it fails to prevent conflicts. Always implement exclusive control in parallel processing when sharing external resources like files.

Best for: Writing to files, accessing databases, and displaying logs on the standard output.

Key Action: Always wrap the entire range where you need to guarantee atomic “read-calculate-write” operations within a with lock: block.
