Categories
Python Answers

How to do web scraping with Python?

Sometimes, we want to do web scraping with Python.

In this article, we’ll look at how to do web scraping with Python.

How to do web scraping with Python?

To do web scraping with Python, we can use BeautifulSoup.

To install it, we run

pip install beautifulsoup4

Then we use it by writing

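from urllib.request import urlopen
from bs4 import BeautifulSoup

# Download the page and parse it with Python's built-in HTML parser
html = urlopen('http://example.com').read()
soup = BeautifulSoup(html, 'html.parser')

# Print the first two cells of each row in the first table with the class "spad"
for row in soup('table', {'class': 'spad'})[0].tbody('tr'):
    tds = row('td')
    print(tds[0].string, tds[1].string)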

to open the page at the URL with urlopen.

And then we call read to get the response body, which contains the page’s HTML.

Next, we pass the HTML to the BeautifulSoup class to create the soup object.

Calling soup('table', {'class': 'spad'}) returns every table element with the class spad, and [0] picks the first one. Then tbody('tr') returns the tr elements inside that table’s tbody.

Then we get the td elements in each tr by calling row('td').

And then we get the text of the first two td elements with the string property.
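As a self-contained variation, the sketch below parses an inline HTML string instead of downloading a page, so it runs without network access. The table contents and the variable names are made up for illustration. It uses find and find_all, which are the longer spellings of calling the soup object directly.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, used in place of a downloaded page
html = """
<table class="spad">
  <tbody>
    <tr><td>apple</td><td>1</td></tr>
    <tr><td>banana</td><td>2</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find returns the first matching element, or None if there is no match
table = soup.find("table", class_="spad")

# Collect the text of every cell, one list per table row
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in table.tbody.find_all("tr")
]
print(rows)
```

With the html.parser parser, table.tbody is None when the page source has no explicit tbody tag, so it may be safer to call table.find_all("tr") directly on pages we do not control.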

Conclusion

To do web scraping with Python, we can use BeautifulSoup.

How to delete an element from a dictionary with Python?

Sometimes, we want to delete an element from a dictionary with Python.

In this article, we’ll look at how to delete an element from a dictionary with Python.

How to delete an element from a dictionary with Python?

To delete an element from a dictionary with Python, we can use the del statement.

For instance, we write

del d[key]

to remove the item with the key key from the dictionary d. If key isn’t in d, Python raises a KeyError.
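A short sketch of how this behaves in practice, using a made-up dictionary. It also shows dict.pop, which removes a key and returns its value, and which accepts a default for keys that may be missing.

```python
d = {"a": 1, "b": 2, "c": 3}

# Remove the item stored under the key "b"
del d["b"]
print(d)  # → {'a': 1, 'c': 3}

# Deleting a key that does not exist raises KeyError
try:
    del d["missing"]
except KeyError as err:
    missing_key = err.args[0]

# pop removes the key and returns its value
removed = d.pop("a")

# With a default, pop does not raise when the key is missing
fallback = d.pop("missing", None)
```

del is a good fit when the key is known to exist, and pop with a default is convenient when it might not.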

Conclusion

To delete an element from a dictionary with Python, we can use the del statement.

How to get the number of elements in a list in Python?

Sometimes, we want to get the number of elements in a list in Python.

In this article, we’ll look at how to get the number of elements in a list in Python.

How to get the number of elements in a list in Python?

To get the number of elements in a list in Python, we use the len function.

For instance, we write

count = len([1, 2, 3])

to get the number of items in the [1, 2, 3] list with len.

len returns 3 since there are 3 items in the list.
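A few more cases, sketched with made-up lists, show that len counts only the top-level items of a list.

```python
fruits = ["apple", "banana", "cherry"]
fruit_count = len(fruits)  # 3

# A nested list counts as one item of the outer list
nested = [[1, 2], [3, 4, 5]]
outer_count = len(nested)  # 2

# Add up the inner lengths to count the nested items instead
inner_total = sum(len(inner) for inner in nested)  # 5

# An empty list has length 0
empty_count = len([])
```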

Conclusion

To get the number of elements in a list in Python, we use the len function.

How to compute the similarity between two text documents with Python?

Sometimes, we want to compute the similarity between two text documents with Python.

In this article, we’ll look at how to compute the similarity between two text documents with Python.

How to compute the similarity between two text documents with Python?

To compute the similarity between two text documents with Python, we can use the scikit-learn library.

To install it, we run

pip install -U scikit-learn

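Then we use it by writing

from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer

# text_files is a list of paths to the documents we want to compare
documents = [Path(f).read_text() for f in text_files]
tfidf = TfidfVectorizer().fit_transform(documents)
# Entry [i, j] is the similarity between documents i and j
pairwise_similarity = tfidf @ tfidf.T

to read the files with the paths in the text_files list.

Then we create a TfidfVectorizer object and call fit_transform with the strings returned by read_text. It returns a sparse matrix with one row of TF-IDF weights per document.

And then we get their pairwise similarity with tfidf @ tfidf.T. By default, TfidfVectorizer scales each row to unit length, so each entry of the product is the cosine similarity between two documents.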
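To show what multiplying the TF-IDF matrix by its transpose computes, here is a pure-Python sketch of the same idea. It is not scikit-learn’s implementation: the tokenizer is a plain whitespace split, the function names tfidf_vectors and cosine are made up for this example, and the smoothed idf formula is assumed to match the library’s default, so the numbers may differ from TfidfVectorizer’s output.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one unit-length TF-IDF vector (a dict) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term count times a smoothed idf: ln((1 + n) / (1 + df)) + 1
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        }
        # Scale to unit length; fall back to 1.0 for an empty document
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


def cosine(u, v):
    """Dot product of two unit-length vectors, which is their cosine similarity."""
    return sum(w * v.get(term, 0.0) for term, w in u.items())


docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply",
]
vectors = tfidf_vectors(docs)
# similarity[i][j] plays the role of pairwise_similarity[i, j]
similarity = [[cosine(u, v) for v in vectors] for u in vectors]
```

Each document has a similarity of about 1 with itself, the two sentences about the cat score higher with each other than with the third one, and documents that share no words score 0.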

Conclusion

To compute the similarity between two text documents with Python, we can use the scikit-learn library.

How to fix WebDriverException: unknown error: cannot find Chrome binary error with Selenium in Python for older versions of Google Chrome?

Sometimes, we want to fix WebDriverException: unknown error: cannot find Chrome binary error with Selenium in Python for older versions of Google Chrome.

In this article, we’ll look at how to fix WebDriverException: unknown error: cannot find Chrome binary error with Selenium in Python for older versions of Google Chrome.

How to fix WebDriverException: unknown error: cannot find Chrome binary error with Selenium in Python for older versions of Google Chrome?

To fix WebDriverException: unknown error: cannot find Chrome binary error with Selenium in Python for older versions of Google Chrome, we set the path of the Chrome binary.

For instance, we write

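from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

options = Options()
# Point Selenium at the Chrome executable it could not find on its own
options.binary_location = "C:\\Program Files\\Chrome\\chrome64_55.0.2883.75\\chrome.exe"
# Selenium 4 takes the driver path through a Service object
service = Service(executable_path=r'C:\path\to\chromedriver.exe')
driver = webdriver.Chrome(service=service, options=options)
driver.get('http://example.com/')
driver.quit()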

to create an Options object.

Then we set options.binary_location to the path of the Chrome binary.

And then we set executable_path to the path of the ChromeDriver executable and pass the options in when we create the driver.

Then we call get to open the page at the URL and call quit to close the browser.

Conclusion

To fix WebDriverException: unknown error: cannot find Chrome binary error with Selenium in Python for older versions of Google Chrome, we set the path of the Chrome binary.