Overview
This article explains how to use Selenium to navigate a browser to a specified URL. The get() method does more than open the page: by default, it also blocks until the browser has finished loading it (that is, until the onload event fires).
Specifications
- Input: A string representing the URL of the website you want to access.
- Output:
  - The browser displays the page.
  - The page title and the current URL can be retrieved.
- Prerequisites: Selenium and WebDriver must be correctly configured.
Basic Usage
This basic code launches the Chrome browser, accesses a specified blog (morinokabu.com), and displays the title.
from selenium import webdriver
import time
driver = webdriver.Chrome()
# Navigate to the specified URL (waits for the page to finish loading)
driver.get("https://morinokabu.com")
# Display the page title to the standard output
print(f"Page Title: {driver.title}")
time.sleep(2) # Wait briefly for confirmation
driver.quit()
Full Code Example
This is a practical implementation that includes error handling and confirms the state change (URL transition) before and after access.
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
import time


def open_website_demo():
    """
    Demo function to access a specific URL and retrieve information using Selenium.
    """
    driver = None
    target_url = "https://morinokabu.com"

    try:
        # Launch browser
        print("Launching the browser...")
        driver = webdriver.Chrome()

        # 1. Execute access
        print(f"Accessing: {target_url}")
        driver.get(target_url)

        # 2. Retrieve page information
        # Since the get() method blocks until the page load is complete,
        # the page is already displayed by the time it reaches here.
        page_title = driver.title
        current_url = driver.current_url

        print("-" * 30)
        print("Access Complete")
        print(f"Title : {page_title}")
        print(f"URL   : {current_url}")
        print("-" * 30)

        # Wait for 3 seconds for confirmation
        time.sleep(3)

    except WebDriverException as e:
        # Error handling for invalid URLs or lack of internet connection
        print(f"An access error occurred: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")
    finally:
        # Close the browser
        if driver:
            print("Closing the browser.")
            driver.quit()


if __name__ == "__main__":
    open_website_demo()
Customization Points
Page Navigation Methods
In addition to get(), there are other methods that utilize the browser’s navigation features:
- driver.back(): Performs the same action as clicking the “Back” button.
- driver.forward(): Performs the same action as clicking the “Forward” button.
- driver.refresh(): Reloads the page (F5).
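The navigation methods above can be sketched as a small helper. This is only an illustration: history_round_trip is a hypothetical name, and the function assumes it receives a driver that has already been started (for example with webdriver.Chrome()).

```python
def history_round_trip(driver, first_url, second_url):
    """Visit two pages, then exercise back(), forward(), and refresh().

    `driver` is assumed to be an already-started Selenium WebDriver.
    Returns the URL reported by the browser after each step.
    """
    visited = []

    driver.get(first_url)
    visited.append(driver.current_url)  # on the first page

    driver.get(second_url)
    visited.append(driver.current_url)  # on the second page

    driver.back()  # same as clicking the "Back" button
    visited.append(driver.current_url)

    driver.forward()  # same as clicking the "Forward" button
    visited.append(driver.current_url)

    driver.refresh()  # reload the current page
    visited.append(driver.current_url)

    return visited
```

With a real browser, the returned URLs may differ from the ones passed in if the site redirects, so treat the list as a record of where the browser actually ended up.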
Specifying the URL
Always specify a complete URL that includes the scheme, such as http:// or https://. If the scheme is omitted (for example, "morinokabu.com"), get() raises an error instead of loading the page.
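If URLs come from user input or a file, one option is to check for a scheme before calling get(). The following is a minimal sketch using only the standard library; ensure_scheme is a hypothetical helper, and defaulting to https is an assumption that may not suit every site.

```python
from urllib.parse import urlsplit

# Schemes passed through unchanged (an assumption for this sketch).
KNOWN_SCHEMES = ("http", "https", "file")


def ensure_scheme(url, default_scheme="https"):
    """Return `url` unchanged if it has a known scheme, else prepend one."""
    url = url.strip()
    if not url:
        raise ValueError("URL must not be empty")
    if urlsplit(url).scheme in KNOWN_SCHEMES:
        return url
    return f"{default_scheme}://{url}"


print(ensure_scheme("morinokabu.com"))
print(ensure_scheme("https://morinokabu.com"))
```

The result can then be passed straight to driver.get(). Checking against a fixed list of schemes avoids misreading inputs such as "localhost:8000", where the part before the colon could otherwise be mistaken for a scheme.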
Important Notes
- Behavior of Loading Wait: driver.get() stops (blocks) processing until the page load completes. However, for modern websites (such as SPAs) where content is rendered later via JavaScript, the elements you need may not exist yet when get() returns, so you may need additional waits using WebDriverWait.
- Timeout: If a page loads extremely slowly, the script may hang for a long time. You can limit the wait by calling driver.set_page_load_timeout(seconds) if necessary.
- Local Files: In addition to URLs on the web, you can open local HTML files by specifying a path like file:///C:/path/to/file.html.
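The three notes above can be sketched as small helpers. The function names are hypothetical, the 10-second defaults are arbitrary, and the CSS selector is whatever identifies the content you are waiting for. The Selenium imports are placed inside the functions only so that local_file_url() can be used on its own; in a normal script they would sit at the top of the file.

```python
from pathlib import Path


def wait_for_element(driver, css_selector, timeout_seconds=10):
    """Wait until an element matching `css_selector` is present in the DOM.

    Intended for use after get() on pages that render content via JavaScript.
    Raises Selenium's TimeoutException if the element never appears.
    """
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout_seconds).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
    )


def get_with_timeout(driver, url, timeout_seconds=10):
    """Load `url`, giving up if the page load exceeds `timeout_seconds`.

    Returns True if the load finished in time, False otherwise.
    """
    from selenium.common.exceptions import TimeoutException

    driver.set_page_load_timeout(timeout_seconds)
    try:
        driver.get(url)
    except TimeoutException:
        return False
    return True


def local_file_url(path):
    """Convert a local file path into a file:// URL that get() accepts."""
    return Path(path).resolve().as_uri()
```

A typical sequence would be get_with_timeout(driver, url) followed by wait_for_element(driver, "#content"), where "#content" is a placeholder selector. For local files, driver.get(local_file_url("page.html")) avoids writing the file:/// prefix by hand.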
Advanced Application
This code serves as the basis for a crawler that visits multiple URLs stored in a list.
from selenium import webdriver
import time


def crawl_pages():
    urls = [
        "https://morinokabu.com",
        "https://www.python.org",
        "https://www.google.com"
    ]

    driver = webdriver.Chrome()
    try:
        for url in urls:
            print(f"Navigating to: {url}")
            driver.get(url)
            print(f" - Title: {driver.title}")
            time.sleep(1)  # Wait to reduce server load
    finally:
        driver.quit()


if __name__ == "__main__":
    crawl_pages()
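In the crawler above, a single failing URL stops the whole loop. One possible variation, sketched here under assumptions, records failures per URL and continues. crawl_titles and its parameters are hypothetical names; the errors parameter defaults to Exception so the sketch has no Selenium dependency, but with a real driver you would more likely pass WebDriverException.

```python
import time


def crawl_titles(driver, urls, delay_seconds=1.0, errors=(Exception,)):
    """Visit each URL and collect its title, skipping pages that fail.

    `driver` is assumed to be an already-started Selenium WebDriver.
    `errors` is the exception type(s) treated as a per-page failure.
    Returns (titles, failures): two dicts keyed by URL.
    """
    titles = {}
    failures = {}
    for url in urls:
        try:
            driver.get(url)
            titles[url] = driver.title
        except errors as exc:
            # Record the problem and move on to the next URL.
            failures[url] = str(exc)
        time.sleep(delay_seconds)  # Wait to reduce server load
    return titles, failures
```

The caller remains responsible for creating the driver and calling driver.quit() in a finally block, as in the original example.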
Conclusion
driver.get("URL") is the most fundamental operation in Selenium. Understanding that it “waits for the load to complete” allows for smoother subsequent element retrieval and operations. For dynamic pages, consider adding further wait logic after this method.
