Sometimes, we want to use Selenium with Scrapy to scrape dynamic pages with Python.
In this article, we’ll look at how to use Selenium with Scrapy to scrape dynamic pages with Python.
How to use Selenium with Scrapy for dynamic pages with Python?
To use Selenium with Scrapy for dynamic pages with Python, we can create our own scrapy.Spider subclass that opens each page in a Selenium webdriver.
For instance, we write
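import scrapy
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By


class ProductSpider(scrapy.Spider):
    name = "product_spider"
    allowed_domains = ["example.store"]
    start_urls = ["http://example.store"]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Start one Firefox session that the spider reuses.
        self.driver = webdriver.Firefox()

    def parse(self, response):
        # Load the same URL in the browser so its JavaScript runs.
        self.driver.get(response.url)
        while True:
            try:
                # Look up the next-page link again on every pass.
                next_link = self.driver.find_element(
                    By.XPATH, '//td[@class="pagn-next"]/a'
                )
                next_link.click()
            except WebDriverException:
                # The link is missing or cannot be clicked: stop paging.
                break
        self.driver.close()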
to create the ProductSpider class that’s a subclass of the scrapy.Spider class.
We initialize the Firefox webdriver in the __init__ method.
Then we add the parse method, which calls driver.get to open the page at response.url.
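Next, the while loop repeatedly looks up the next-page link, which is the a element inside the td element with class pagn-next.
We call click on that link to move to the next page, and the loop ends once there is no next-page link left to click.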
Once we’re done, we call close to close the webdriver.
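The spider above only clicks through the pages and does not keep anything from them. As a minimal sketch of how the rendered HTML could be collected along the way, the helper below follows the next-page link up to a fixed number of pages. The name collect_page_sources and the max_pages cap are hypothetical and not part of Scrapy or Selenium. The driver argument is assumed to behave like a Selenium 4 webdriver, with a page_source attribute and a find_elements(by, value) method.

```python
def collect_page_sources(driver, next_xpath, max_pages=20):
    """Return the HTML of each page reached by following the next-page link.

    Hypothetical helper: `driver` is assumed to behave like a Selenium 4
    webdriver (it has `page_source` and `find_elements(by, value)`).
    """
    pages = []
    for _ in range(max_pages):
        # Keep the HTML as the browser currently renders it.
        pages.append(driver.page_source)
        # "xpath" is the string value of Selenium's By.XPATH locator.
        # find_elements returns an empty list when nothing matches,
        # so no exception handling is needed for the last page.
        links = driver.find_elements("xpath", next_xpath)
        if not links:
            break
        links[0].click()
    return pages
```

Inside parse, this could be called as collect_page_sources(self.driver, '//td[@class="pagn-next"]/a'), and each returned string could be wrapped with scrapy.Selector(text=html) so the usual Scrapy XPath and CSS queries run against the rendered markup. Real pages may also need an explicit wait after each click, which this sketch leaves out.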
Conclusion
To use Selenium with Scrapy for dynamic pages with Python, we can create our own scrapy.Spider subclass that opens each page in a Selenium webdriver.