To read a text file in Python, you first open it with the built-in open() function. The resulting file object then provides three main methods for extracting the data:
read(): Reads the entire content as a single string.
readlines(): Reads the entire content as a list (array of lines).
readline(): Reads one line at a time sequentially.
It is important to use these appropriately depending on the file size and how you intend to process the data. This article explains the behavior and usage scenarios of each method using a sample file.
Preparation: Creating a Sample File
The examples in this article use a file named todo_list.txt with the following content.
todo_list.txt
Buy milk
Walk the dog
Read a book
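If you want to follow along, the sample file can also be created from Python. The following is only a minimal sketch: the variable name sample_text is my own, the trailing newline after the last line is an assumption about the file, and mode "w" overwrites any existing todo_list.txt in the current directory.

```python
from pathlib import Path

# Sample content used throughout this article (one task per line).
# Assumption: every line, including the last one, ends with a newline.
sample_text = "Buy milk\nWalk the dog\nRead a book\n"

# Create the file in the current working directory.
# Mode "w" replaces an existing file with the same name.
file_path = Path("todo_list.txt")
with open(file_path, "w", encoding="utf-8") as file_handle:
    file_handle.write(sample_text)

# Read it back once to confirm what was written.
print(file_path.read_text(encoding="utf-8"), end="")
```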
1. read(): Get Entire File as a String
The read() method reads the entire file content as one long string (str type) at once. It is suitable for small files, simply displaying the content, or searching through the entire text.
file_path = "todo_list.txt"

with open(file_path, "r", encoding="utf-8") as file_handle:
    # Read the entire file as a string
    full_text = file_handle.read()

    print(f"Type: {type(full_text)}")
    print("-" * 10)
    print(full_text)
Output:
Type: <class 'str'>
----------
Buy milk
Walk the dog
Read a book
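Because read() returns an ordinary string, searching the entire text needs nothing more than the usual string operations. The sketch below illustrates this with a hypothetical search term (keyword). So that it runs on its own, it first recreates the sample file, assuming each line ends with a newline.

```python
file_path = "todo_list.txt"

# Recreate the sample file so this snippet runs on its own.
with open(file_path, "w", encoding="utf-8") as file_handle:
    file_handle.write("Buy milk\nWalk the dog\nRead a book\n")

with open(file_path, "r", encoding="utf-8") as file_handle:
    full_text = file_handle.read()

# The "in" operator searches the whole text at once,
# and len() counts every character, newlines included.
keyword = "dog"  # hypothetical search term
found = keyword in full_text
char_count = len(full_text)

print(f"Contains '{keyword}': {found}")
print(f"Characters read: {char_count}")
```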
2. readlines(): Get All Lines as a List
The readlines() method reads the entire file and returns it as a list (list type) in which each element is one line, including its trailing newline character. It is convenient when you want to access specific lines by index (e.g., only the 1st line, only the 5th line) or loop over the lines more than once.
with open(file_path, "r", encoding="utf-8") as file_handle:
    # Get a list where each element is a line
    lines_list = file_handle.readlines()

    print(f"Type: {type(lines_list)}")
    print(f"Number of lines: {len(lines_list)}")
    print("-" * 10)

    # Display with index using enumerate
    for index, line in enumerate(lines_list):
        # Since the line contains a newline character from the file,
        # remove it with strip() or adjust print with end=""
        print(f"{index}: {line}", end="")
Output:
Type: <class 'list'>
Number of lines: 3
----------
0: Buy milk
1: Walk the dog
2: Read a book
Note: Since readlines() loads the entire file into a list in memory, reading a huge file may exhaust the available memory. The same applies to read(), which also holds the whole file in memory at once.
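Access to a specific line, as mentioned above, is plain list indexing. The following sketch picks out the second and the last line; the variable names are illustrative, and the snippet recreates the sample file first so that it runs on its own.

```python
file_path = "todo_list.txt"

# Recreate the sample file so this snippet runs on its own.
with open(file_path, "w", encoding="utf-8") as file_handle:
    file_handle.write("Buy milk\nWalk the dog\nRead a book\n")

with open(file_path, "r", encoding="utf-8") as file_handle:
    lines_list = file_handle.readlines()

# Indexes start at 0, so index 1 is the second line.
# Each element still carries its trailing newline.
second_line_raw = lines_list[1]

# rstrip("\n") removes only the trailing newline.
second_line = second_line_raw.rstrip("\n")

# Negative indexes count from the end of the list.
last_line = lines_list[-1].rstrip("\n")

print(repr(second_line_raw))
print(second_line)
print(last_line)
```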
3. readline(): Read One Line at a Time
The readline() method (no s at the end) reads only one line from the current position in the file each time it is called. It is suitable for handling huge files or checking only the first few lines.
with open(file_path, "r", encoding="utf-8") as file_handle:
    print("--- Reading line by line ---")

    # Read the 1st line
    first_line = file_handle.readline()
    print(f"1: {first_line}", end="")

    # Read the 2nd line
    second_line = file_handle.readline()
    print(f"2: {second_line}", end="")

    # Read the 3rd line
    third_line = file_handle.readline()
    print(f"3: {third_line}", end="")
Output:
--- Reading line by line ---
1: Buy milk
2: Walk the dog
3: Read a book
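When the number of lines is not known in advance, readline() can be called in a loop. It returns an empty string once the end of the file is reached, whereas a blank line inside the file is returned as "\n". The sketch below relies on that behavior; the list name lines_read is illustrative, and the sample file is recreated first so that the snippet runs on its own.

```python
file_path = "todo_list.txt"

# Recreate the sample file so this snippet runs on its own.
with open(file_path, "w", encoding="utf-8") as file_handle:
    file_handle.write("Buy milk\nWalk the dog\nRead a book\n")

lines_read = []
with open(file_path, "r", encoding="utf-8") as file_handle:
    while True:
        line = file_handle.readline()
        # An empty string signals end of file.
        # A blank line in the file would be "\n", not "".
        if line == "":
            break
        lines_read.append(line.rstrip("\n"))

print(lines_read)
```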
Supplement: Looping Through the File Object Directly (Recommended)
If you just want to “process the file line by line from beginning to end,” looping through the file object directly with a for statement is the most recommended method (it is memory efficient and the code is concise).
with open(file_path, "r", encoding="utf-8") as file_handle:
    # Loop directly without using readlines()
    for i, line in enumerate(file_handle, 1):
        print(f"Line {i}: {line.strip()}")
Output:
Line 1: Buy milk
Line 2: Walk the dog
Line 3: Read a book
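Because the loop yields one line at a time, it can also stop early once it has found what it needs. As a rough sketch, the example below looks for the first line containing a hypothetical search term (keyword) and then leaves the loop. It recreates the sample file first so that it runs on its own.

```python
file_path = "todo_list.txt"

# Recreate the sample file so this snippet runs on its own.
with open(file_path, "w", encoding="utf-8") as file_handle:
    file_handle.write("Buy milk\nWalk the dog\nRead a book\n")

keyword = "dog"  # hypothetical search term
match_line_number = None

with open(file_path, "r", encoding="utf-8") as file_handle:
    for i, line in enumerate(file_handle, 1):
        if keyword in line:
            match_line_number = i
            # Stop iterating; later lines are not processed.
            break

print(f"First match on line: {match_line_number}")
```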
Summary
read(): Converts the entire file into “one string.” Used for full-text search, etc.
readlines(): Converts the entire file into a “list of strings.” Used for random access to lines.
readline(): Reads the file “one line at a time.” Used for manual control or huge files.
for line in f:: The most memory-efficient method for processing all lines sequentially.
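Note that each line read from a file keeps its trailing newline character (\n); only the last line may lack one, if the file does not end with a newline. When outputting with print(), you therefore need to specify end="" or remove the newline first. .strip() does this, but it also removes any other leading and trailing whitespace, while .rstrip("\n") removes only the newline.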
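As a small illustration of the newline handling described above, the sketch below compares .strip() with .rstrip("\n") on a made-up line that has extra spaces around the text. The line is hypothetical and not part of the sample file.

```python
# Hypothetical line with leading and trailing spaces plus a newline.
line = "  Buy milk \n"

# strip() removes all leading and trailing whitespace.
stripped = line.strip()

# rstrip("\n") removes only trailing newline characters.
newline_removed = line.rstrip("\n")

# repr() makes the remaining whitespace visible.
print(repr(stripped))
print(repr(newline_removed))
```

Which one to use depends on whether the surrounding spaces carry meaning in your data.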
