Python – Getting HTML with DOM

The other day I needed to fetch a page's HTML with Python, but only after its JavaScript had finished running. A plain HTTP request returns the markup before any scripts execute, so I used Selenium WebDriver, which drives a real browser and exposes the resulting DOM.

  1. Install Selenium with pip install selenium
  2. Download the driver for the browser you want to automate. You can download them from this page. The driver must be in a directory listed in the PATH environment variable, or you will need to specify its path in the constructor for the webdriver.
  3. Import Selenium with from selenium import webdriver
  4. Now use the following code:
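from selenium import webdriver
browser = webdriver.Chrome()  # launches Chrome through the driver found on PATH
browser.get(input("Enter URL: "))  # on Python 2, use raw_input() instead
html_source = browser.page_source  # HTML serialized from the current DOM
browser.quit()  # close the browser once the HTML has been captured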
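
One caveat: page_source reflects the DOM at the moment it is read, so content that a script adds later (for example after an AJAX call) may not be there yet. Selenium ships its own explicit-wait helper, WebDriverWait, for this. The sketch below is a hand-rolled alternative that only relies on execute_script and page_source. The function name wait_for_selector and the idea of waiting for one particular CSS selector are my assumptions, not part of the original steps.

```python
import time


def wait_for_selector(driver, css_selector, timeout=10.0, poll=0.25):
    """Poll the page until css_selector matches, then return page_source.

    Hypothetical helper: `driver` is any object exposing Selenium's
    execute_script() and page_source. Raises TimeoutError if nothing
    matches within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    # arguments[0] is the first extra argument passed to execute_script
    script = "return document.querySelector(arguments[0]) !== null;"
    while True:
        if driver.execute_script(script, css_selector):
            return driver.page_source
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no element matched %r within %s s" % (css_selector, timeout)
            )
        time.sleep(poll)
```

With the browser from step 4, this would be called after browser.get(...) and before browser.quit(), as html_source = wait_for_selector(browser, "#content"), where "#content" is a placeholder for an element you know the page's script creates.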

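Note: If the driver is not on your PATH, you have to pass its location to the constructor with browser = webdriver.Chrome(<PATH_TO_DRIVER_HERE>). Recent Selenium 4 releases no longer accept the path as a positional argument; there it is wrapped in a Service object (from selenium.webdriver.chrome.service import Service) and passed as webdriver.Chrome(service=Service(executable_path=<PATH_TO_DRIVER_HERE>)).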
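
To make the PATH requirement concrete, here is a minimal sketch that checks whether the driver can be found before starting the browser. The helper names locate_driver and make_chrome are hypothetical, "chromedriver" is assumed to be the executable's name, and the Service form assumes Selenium 4.

```python
import shutil


def locate_driver(name="chromedriver"):
    """Return the full path of the driver if it is on PATH, else None."""
    return shutil.which(name)


def make_chrome(driver_path=None):
    """Start Chrome, using an explicit driver path when one is given."""
    # Imported here so this file still loads where selenium is not installed
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if driver_path is None:
        # Rely on the driver being discoverable without an explicit path
        return webdriver.Chrome()
    return webdriver.Chrome(service=Service(executable_path=driver_path))
```

If locate_driver() returns None, the driver is not on your PATH, and you would call make_chrome with the location where you saved it.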
