How to use Selenium with Scrapy for dynamic pages with Python?

Sometimes, we want to use Selenium with Scrapy to scrape dynamic pages, whose content is rendered by JavaScript, with Python.

In this article, we’ll look at how to use Selenium with Scrapy to scrape dynamic pages with Python.

How to use Selenium with Scrapy for dynamic pages with Python?

To use Selenium with Scrapy for dynamic pages with Python, we can create our own scrapy.Spider subclass that drives a Selenium webdriver.

For instance, we write

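import scrapy
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By


class ProductSpider(scrapy.Spider):
    name = "product_spider"
    allowed_domains = ["example.store"]
    start_urls = ["http://example.store"]

    def __init__(self, *args, **kwargs):
        # Let scrapy.Spider run its own setup before starting the browser.
        super().__init__(*args, **kwargs)
        self.driver = webdriver.Firefox()

    def parse(self, response):
        # Open the same URL in the browser so that its JavaScript runs.
        self.driver.get(response.url)

        while True:
            try:
                # Find the "next page" link again on every pass.
                next_link = self.driver.find_element(
                    By.XPATH, '//td[@class="pagn-next"]/a'
                )
                next_link.click()
            except WebDriverException:
                # The link is missing or cannot be clicked: last page reached.
                break

        self.driver.close()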

to create the ProductSpider class that’s a subclass of the scrapy.Spider class.

We initialize the Firefox webdriver in the __init__ method.

Then we add the parse method, which calls driver.get to open the page at response.url in the browser.

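Next, in the while loop, we look up the next-page link, which is the a element inside the td element with the class pagn-next.

Then we call click on it to go to the next page, and we break out of the loop once the link can no longer be clicked.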
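The loop only clicks through the pages and does not keep anything from them. As a minimal sketch of how the HTML of each page could be collected along the way, the helper below works with any Selenium-style driver. The name collect_pages and the max_pages cap are assumptions made for this illustration and are not part of Scrapy or Selenium. It uses find_elements, which returns an empty list instead of raising when nothing matches.

```python
def collect_pages(driver, next_xpath, max_pages=50):
    """Return the HTML of every page reached by following the next link.

    `driver` is expected to be a Selenium webdriver that already has the
    first page open. `max_pages` is a hypothetical safety cap.
    """
    pages = [driver.page_source]

    while len(pages) < max_pages:
        # "xpath" is the string value of selenium's By.XPATH.
        links = driver.find_elements("xpath", next_xpath)
        if not links:
            # No next link on this page, so it is the last one.
            break

        links[0].click()
        # A slow page may need an explicit wait here before its new
        # content shows up in page_source.
        pages.append(driver.page_source)

    return pages
```

Inside parse, this could be called as collect_pages(self.driver, '//td[@class="pagn-next"]/a') right after self.driver.get(response.url).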

Once the loop ends, we call close to close the browser window.
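One caveat with closing the browser at the end of parse is that an exception raised earlier would skip that call and leave the browser running. As a sketch of one way to guard against this, the hypothetical managed_driver helper below wraps any driver factory in a context manager. It calls quit, which ends the whole browser session, whereas close only closes the current window.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with `factory` and always quit it afterwards.

    `factory` is any zero-argument callable that returns a Selenium-style
    driver, for example webdriver.Firefox.
    """
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on normal exit and also when the body raises.
        driver.quit()
```

With this in place, a block that starts with `with managed_driver(webdriver.Firefox) as driver:` would quit the browser even if a call such as driver.get fails part-way through.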

Conclusion

To use Selenium with Scrapy for dynamic pages with Python, we can create our own scrapy.Spider subclass that opens each page in a Selenium webdriver and clicks through the pagination links.

By John Au-Yeung
