How to Get Page Title, URL, and Window Information using Selenium in Python


Overview

This article explains how to use Selenium WebDriver to get web page metadata (title, URL, HTML source) and browser status (window size, position, cookies). These features are essential for verifying test results (assertions) and checking the current state during web scraping.

Specifications (Input/Output)

  • Input: None (retrieved from the currently active browser window).
  • Output:
    • Strings (Title, URL, HTML source)
    • Dictionary data (Window size, position)
    • List data (Cookie information)
  • Requirement: A browser must be started with Selenium and a page must be displayed.
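
The outputs listed above can be gathered into a single dictionary. The following is a minimal sketch: collect_page_info is a hypothetical helper name (not part of Selenium), and driver is assumed to be a WebDriver instance that has already loaded a page.

```python
def collect_page_info(driver):
    """Gather page metadata and window state from a started WebDriver.

    collect_page_info is a hypothetical helper, not a Selenium API.
    """
    return {
        "title": driver.title,                            # str (property)
        "url": driver.current_url,                        # str (property)
        "source_length": len(driver.page_source),         # page_source is a str
        "window_size": driver.get_window_size(),          # dict: width, height
        "window_position": driver.get_window_position(),  # dict: x, y
        "cookie_count": len(driver.get_cookies()),        # get_cookies() is a list
    }
```

Calling collect_page_info(driver) after driver.get(...) should give one snapshot that can be printed or compared in a test.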

Basic Usage

This is basic code to get and display the page title, URL, and window size.

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://morinokabu.com")

# Use properties and methods to get information
print(f"Title: {driver.title}")
print(f"URL: {driver.current_url}")
print(f"Size: {driver.get_window_size()}")

driver.quit()

Full Code

This is a complete implementation example that covers all properties and methods. It organizes and displays the retrieved information. Since page_source and cookies can contain a lot of data, the output is limited for better readability.

from selenium import webdriver
import time

def analyze_page_details():
    """
    Get and display various properties and window information using Selenium.
    """
    driver = webdriver.Chrome()
    
    try:
        target_url = "https://morinokabu.com"
        print(f"Accessing: {target_url} ...")
        driver.get(target_url)
        
        # driver.get() normally returns after the page's load event; pause briefly for late-running scripts
        time.sleep(2)

        print("\n--- Basic Page Information ---")
        # Title and URL (Properties)
        print(f"Title       : {driver.title}")
        print(f"Current URL : {driver.current_url}")

        print("\n--- Browser Window Information ---")
        # Window size and position (Methods)
        size = driver.get_window_size()
        position = driver.get_window_position()
        print(f"Size     : Width {size['width']} px, Height {size['height']} px")
        print(f"Position : X {position['x']}, Y {position['y']}")

        print("\n--- Cookie Information ---")
        # List of Cookies (Method)
        cookies = driver.get_cookies()
        print(f"Number of Cookies : {len(cookies)}")
        if cookies:
            # Display the first one as a sample, truncating the value to 10 characters
            print(f"Sample   : {cookies[0]['name']} = {cookies[0]['value'][:10]}...")

        print("\n--- HTML Source ---")
        # Page Source (Property)
        src = driver.page_source
        # Display only the first 200 characters
        print(f"Source (First 200 chars):\n{src[:200]}...")

    except Exception as e:
        print(f"An error occurred: {e}")

    finally:
        driver.quit()

if __name__ == "__main__":
    analyze_page_details()

Customization Points

The following table shows the information you can get and whether it is a “Property” or a “Method”.

| Name | Type | Description / Return Value |
| --- | --- | --- |
| driver.title | Property | Returns the content of the <title> tag as a string. |
| driver.current_url | Property | Returns the current URL in the address bar as a string. |
| driver.page_source | Property | Returns the full HTML source code of the current page as a string. |
| driver.get_cookies() | Method | Returns a list of cookies stored in the browser. |
| driver.get_window_size() | Method | Returns the width and height as a dictionary: {'width': int, 'height': int}. |
| driver.get_window_position() | Method | Returns the top-left coordinates as a dictionary: {'x': int, 'y': int}. |

Important Notes

Mixing Properties and Methods

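title and current_url are properties, so they are read without parentheses (). get_window_size(), on the other hand, is a method and requires (). Mixing them up causes bugs: driver.title() raises a TypeError because the returned string is not callable, and driver.get_window_size without () gives you the method object instead of the size dictionary, so a lookup such as ['width'] on it fails.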
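
The difference can be seen without starting a browser. The sketch below uses FakeDriver, a hypothetical stand-in class that mimics only the property/method split of a WebDriver. It is not Selenium code, and describe_mixups is likewise an illustrative name.

```python
class FakeDriver:
    """Hypothetical stand-in that mimics WebDriver's property/method split."""

    @property
    def title(self):
        return "Example Page"

    def get_window_size(self):
        return {"width": 1280, "height": 720}


def describe_mixups(driver):
    """Record what happens when parentheses are misused (hypothetical helper)."""
    results = {}
    try:
        driver.title()  # wrong: title is a property, so this tries to call a str
    except TypeError as exc:
        results["title()"] = type(exc).__name__
    # wrong: without (), this is the method object, not the size dictionary
    results["get_window_size"] = callable(driver.get_window_size)
    return results
```

With the stand-in, describe_mixups(FakeDriver()) reports a TypeError for title() and shows that get_window_size without parentheses is still a callable rather than a dictionary.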

Freshness of page_source

driver.page_source returns the DOM as it exists at the moment of retrieval, not as it was when the URL was first loaded. Because it reflects changes made by JavaScript, the result is closer to what you see in the browser than the raw HTML fetched by an HTTP library such as Requests, which does not execute JavaScript.
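
Because every read of page_source reflects the current DOM, it can be polled until dynamically inserted content shows up. The following is a rough sketch under that assumption. wait_for_source_text is a hypothetical helper, not a Selenium API, and in real projects Selenium's own WebDriverWait is usually the better tool for waiting.

```python
import time


def wait_for_source_text(driver, text, timeout=10.0, poll=0.5):
    """Poll driver.page_source until `text` appears or `timeout` seconds pass.

    Hypothetical helper, not a Selenium API. Each read of page_source
    returns the DOM as it is at that moment, so a later read can contain
    content that JavaScript inserted after the initial load.
    """
    deadline = time.monotonic() + timeout
    while True:
        if text in driver.page_source:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

The function returns True as soon as the text is found and False once the timeout has passed without a match.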

Privacy and Cookies

The information from get_cookies() may contain sensitive data. Be careful to mask session IDs if you output this information to logs.
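
One way to follow this advice is to mask cookie values before they reach a log. This is a minimal sketch: mask_cookies and its visible parameter are hypothetical names, and the input is assumed to be the list of dictionaries returned by driver.get_cookies().

```python
def mask_cookies(cookies, visible=4):
    """Return copies of cookie dicts with most of each value hidden.

    Hypothetical helper, not a Selenium API. `cookies` is the list of
    dictionaries returned by driver.get_cookies().
    """
    masked = []
    for cookie in cookies:
        value = str(cookie.get("value", ""))
        safe = dict(cookie)  # shallow copy so the original list is untouched
        if len(value) <= visible:
            # Short values are hidden completely
            safe["value"] = "*" * len(value)
        else:
            safe["value"] = value[:visible] + "*" * (len(value) - visible)
        masked.append(safe)
    return masked
```

For example, print(mask_cookies(driver.get_cookies())) would log cookie names and a short prefix of each value instead of the full session ID.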

Advanced Usage

This example sets the browser window size and then moves the window to the center of the screen, assuming a 1920x1080 display.

from selenium import webdriver
import time

def center_window():
    driver = webdriver.Chrome()

    try:
        driver.get("https://morinokabu.com")

        # 1. Define screen size (assuming 1920x1080 resolution)
        screen_w, screen_h = 1920, 1080

        # 2. Set browser size
        target_w, target_h = 1280, 720
        driver.set_window_size(target_w, target_h)

        # 3. Calculate the top-left position that centers the window
        pos_x = (screen_w - target_w) // 2
        pos_y = (screen_h - target_h) // 2

        # 4. Set position
        driver.set_window_position(pos_x, pos_y)
        print(f"Moved window to center: ({pos_x}, {pos_y})")

        # Pause so the result is visible before closing
        time.sleep(2)

    finally:
        # Close the browser even if a step above fails
        driver.quit()

if __name__ == "__main__":
    center_window()
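
The screen size does not have to be hard-coded. As a sketch, the browser can be asked for it through JavaScript with execute_script. The helper names center_position and center_on_detected_screen are hypothetical, and the values reported by window.screen may differ in headless mode or on multi-monitor setups, so treat this as an illustration and not a definitive implementation.

```python
def center_position(screen_w, screen_h, window_w, window_h):
    """Top-left coordinates that center a window on the screen."""
    # Clamp at 0 so an oversized window is not pushed off-screen
    return (
        max((screen_w - window_w) // 2, 0),
        max((screen_h - window_h) // 2, 0),
    )


def center_on_detected_screen(driver, window_w=1280, window_h=720):
    """Resize the window and center it on the screen the browser reports.

    Hypothetical helper, not a Selenium API.
    """
    # A JavaScript array comes back to Python as a list
    screen_w, screen_h = driver.execute_script(
        "return [window.screen.availWidth, window.screen.availHeight];"
    )
    driver.set_window_size(window_w, window_h)
    x, y = center_position(screen_w, screen_h, window_w, window_h)
    driver.set_window_position(x, y)
    return x, y
```

Keeping the arithmetic in center_position separate from the browser calls makes the calculation easy to check without launching a browser.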

Summary

The information you get from WebDriver acts as the “eyes” of your automation script. In particular, current_url is helpful for checking if page navigation was successful, and page_source is useful for debugging when elements cannot be found. Remember the difference between properties and methods when you use them.
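
As a closing illustration of using current_url and title to confirm navigation, here is a small sketch. check_navigation and its parameter names are hypothetical, and driver is assumed to be a WebDriver that has already loaded the page under test.

```python
def check_navigation(driver, expected_url_prefix, expected_title_part):
    """Return a list of mismatch messages; an empty list means the page matches.

    Hypothetical helper, not a Selenium API.
    """
    problems = []
    if not driver.current_url.startswith(expected_url_prefix):
        problems.append(f"Unexpected URL: {driver.current_url}")
    if expected_title_part not in driver.title:
        problems.append(f"Unexpected title: {driver.title}")
    return problems
```

In a test, assert not check_navigation(driver, "https://morinokabu.com", "...") would fail with readable messages when the browser ended up on the wrong page. The title fragment is left as a placeholder here because the actual page title is not stated in this article.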
