How to find all broken links using Selenium WebDriver in Python

Selenium overview

Selenium is an open-source tool for automating web browsers. We'll learn how to find the broken links on a web page using Selenium in Python.

We'll follow the steps mentioned below to find the broken links:

  • Find all links present on the web page.
  • Send an HTTP request to each link and get its status code.
  • Decide from the status code whether the link is broken.
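
The last step can be made explicit with a small helper. The sketch below rests on an assumption that does not come from Selenium or requests: it treats any 4xx or 5xx status code as broken and everything else, including redirects, as working. The name is_broken is hypothetical and only used for illustration.

```python
def is_broken(status_code: int) -> bool:
    """Return True when an HTTP status code signals a client or server error."""
    # Assumption: 4xx and 5xx mean "broken"; 1xx, 2xx, and 3xx count as working.
    return status_code >= 400


# Classify a few example status codes.
for code in (200, 301, 404, 500):
    print(code, "broken" if is_broken(code) else "ok")
```

A stricter rule, such as treating every status other than 200 as broken, only requires changing the comparison inside the helper.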

Example

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
import requests

# Specify where ChromeDriver is located on your PC
PATH = r"C:\Users\educative\Documents\chromedriver\chromedriver.exe"

# Get an instance of the web driver (Selenium 4 passes the driver path through a Service object)
driver = webdriver.Chrome(service=Service(PATH))

# Provide the website URL here
driver.get("http://demo.guru99.com/test/newtours/")

# Get all links
all_links = driver.find_elements(By.CSS_SELECTOR, "a")

# Check whether each link is broken
for link in all_links:
    # Extract the URL from the href attribute
    url = link.get_attribute('href')

    # Send a request to the URL and get the result
    result = requests.head(url)

    # If the status code is not 200, print the URL (customize the condition as needed)
    if result.status_code != 200:
        print(url, result.status_code)

# Close the browser when finished
driver.quit()

Explanation

  • Lines 1–4: We import the required packages.
  • Line 7: We provide the path to the web browser's driver. For Chrome on Windows, this is chromedriver.exe.
  • Line 10: We create an instance of the web driver.
  • Line 13: We pass the URL to the driver.get() method to open the page.
  • Line 16: We use the find_elements() method to get all the links present on the current web page.
  • Line 19: We use a for-in loop to iterate over the links returned in the previous step.
  • Line 21: We extract the URL from the element's href attribute.
  • Line 24: We send an HTTP HEAD request to the URL and store the result.
  • Lines 27–28: We check whether the status code differs from 200. If it does, we consider the link broken and print it along with its status code. We can customize this condition according to our needs.
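
The loop above stops with an exception if a link has no href, uses a scheme such as mailto:, or points to a server that cannot be reached. The following is a minimal sketch of one way to guard against those cases, not the only way to do it. The function names checkable_urls, check_url, and find_broken are hypothetical, and the 10-second timeout, the fallback from HEAD to GET on a 405 response, and the "400 or above means broken" rule are assumptions chosen for illustration.

```python
from urllib.parse import urlparse

import requests


def checkable_urls(hrefs):
    """Return the unique http(s) URLs from a list of href values, in order."""
    seen = set()
    urls = []
    for href in hrefs:
        # An <a> element without an href yields None; skip it.
        if not href:
            continue
        # Skip schemes that requests cannot fetch, such as mailto: or javascript:.
        if urlparse(href).scheme not in ("http", "https"):
            continue
        if href not in seen:
            seen.add(href)
            urls.append(href)
    return urls


def check_url(url, session, timeout=10):
    """Return the status code for url, or None if the request itself failed."""
    try:
        response = session.head(url, allow_redirects=True, timeout=timeout)
        # Assumption: a 405 means the server rejects HEAD, so retry with GET.
        if response.status_code == 405:
            response = session.get(url, timeout=timeout)
        return response.status_code
    except requests.RequestException:
        # Covers connection errors, timeouts, and invalid URLs.
        return None


def find_broken(hrefs, session):
    """Return (url, status) pairs for links that failed or returned 4xx/5xx."""
    broken = []
    for url in checkable_urls(hrefs):
        status = check_url(url, session)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

With the all_links list from the example, this could be called as find_broken([link.get_attribute('href') for link in all_links], requests.Session()). A status of None in the result means the request never received a response.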
