Sometimes, we want to use Selenium with Scrapy to scrape dynamic pages with Python, since Scrapy on its own does not run the JavaScript that renders them.
In this article, we'll look at how to use Selenium with Scrapy for dynamic pages with Python.
How to use Selenium with Scrapy for dynamic pages with Python?
To use Selenium with Scrapy for dynamic pages with Python, we can create our own scrapy.Spider subclass that drives a Selenium webdriver.
For instance, we write
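import scrapy
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By


class ProductSpider(scrapy.Spider):
    name = "product_spider"
    allowed_domains = ["example.store"]
    start_urls = ["http://example.store"]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.driver = webdriver.Firefox()

    def parse(self, response):
        # Load the page in the browser so that its JavaScript runs.
        self.driver.get(response.url)
        while True:
            try:
                # Look up the "next page" link again on every pass.
                link = self.driver.find_element(
                    By.XPATH, '//td[@class="pagn-next"]/a'
                )
                link.click()
            except WebDriverException:
                # The link is missing or cannot be clicked: stop paging.
                break
        self.driver.close()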
to create the ProductSpider class that’s a subclass of the scrapy.Spider class.
We initialize the Firefox webdriver in the __init__ method.
Then we add the parse method, which calls driver.get to open the page at response.url in the browser.
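Next, in the while loop, we look up the link inside the td element with the pagn-next class.
And we call click on it to move to the next page. The loop ends as soon as that fails, which is what happens on the last page.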
Once we’re done, we call close to close the webdriver.
Conclusion
To use Selenium with Scrapy for dynamic pages with Python, we can create our own scrapy.Spider subclass that opens each page in a Selenium webdriver.