Categories
Python Answers

How to strip HTML from strings in Python?

Sometimes, we want to strip HTML from strings in Python.

In this article, we’ll look at how to strip HTML from strings in Python.

How to strip HTML from strings in Python?

To strip HTML from strings in Python, we can use the StringIO class from the io module and the HTMLParser class from the html.parser module.

For instance, we write:

from io import StringIO
from html.parser import HTMLParser


class MLStripper(HTMLParser):
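    def __init__(self):
        # convert_charrefs=True makes the parser turn references such as
        # &amp; into the characters they stand for before handle_data runs
        super().__init__(convert_charrefs=True)
        # Buffer that collects the text found between tags
        self.text = StringIO()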

    def handle_data(self, d):
        self.text.write(d)

    def get_data(self):
        return self.text.getvalue()


def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    s.close()  # flush any text the parser is still buffering
    return s.get_data()


print(strip_tags('<p>hello world</p>'))

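We create the MLStripper class, a subclass of HTMLParser, whose constructor sets the options for parsing HTML.

Setting convert_charrefs to True converts character references such as &amp; into the corresponding Unicode characters.

text is a StringIO buffer that collects the extracted text.

In the handle_data method, which the parser calls for each run of text between tags, we append that text to the buffer with self.text.write.

And we return the accumulated text in get_data.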

Next, we create the strip_tags function that creates a new MLStripper instance.

Then we call s.feed with html to parse the string. Only the text between the tags is passed to handle_data, so the tags themselves are dropped.

And then we return the stripped string that we retrieved from get_data.

Therefore, the print function should print ‘hello world’.
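To see what convert_charrefs does to the result, we can feed the parser a string that contains character references. This is only a minimal sketch: the TextExtractor class name and the sample input are our own illustrative choices, not part of the example above.

```python
from html.parser import HTMLParser
from io import StringIO


class TextExtractor(HTMLParser):  # hypothetical name, used for illustration
    def __init__(self):
        # Ask the parser to convert references such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self.text = StringIO()

    def handle_data(self, data):
        # Called for each run of text between tags
        self.text.write(data)


def strip_tags(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush anything the parser is still holding back
    return parser.text.getvalue()


print(strip_tags('<p>Tom &amp; Jerry</p> <b>fish &lt; chips</b>'))
```

This should print Tom & Jerry fish < chips, with the tags removed and the references decoded.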

Conclusion

To strip HTML from strings in Python, we can use the StringIO class from the io module and the HTMLParser class from the html.parser module.

How to compare two lists and return matches with Python?

Sometimes, we want to compare two lists and return matches with Python.

In this article, we’ll look at how to compare two lists and return matches with Python.

How to compare two lists and return matches with Python?

To compare two lists and return matches with Python, we can use the set’s intersection method.

For instance, we write:

a = [1, 2, 3, 4, 5]
b = [9, 8, 7, 6, 5]
intersection = set(a).intersection(b)
print(list(intersection))

We have two lists, a and b, and we want the elements that appear in both.

To do this, we convert a to a set with set.

Then we call intersection with b and assign the intersection of a and b to intersection.

And finally, we call list with intersection to convert it back to a list.

Therefore, [5] is printed.
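Since a set keeps neither the order of the items nor their duplicates, we may sometimes prefer a variation that keeps the matches in the order they appear in the first list. Here is a minimal sketch of that idea. The matches_in_order function name and the sample lists are our own and only serve as an illustration.

```python
def matches_in_order(first, second):
    # Build a set once so that each membership test is fast
    second_items = set(second)
    # Keep every item of first that also occurs in second,
    # preserving the order and the duplicates of first
    return [item for item in first if item in second_items]


a = [1, 2, 3, 4, 5, 3]
b = [9, 3, 7, 6, 5]
print(matches_in_order(a, b))  # [3, 5, 3]
```

As with set, this assumes the list items are hashable.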

Conclusion

To compare two lists and return matches with Python, we can use the set’s intersection method.

How to manually raise or throw an exception in Python?

Sometimes, we want to manually raise or throw an exception in Python.

In this article, we’ll look at how to manually raise or throw an exception in Python.

How to manually raise or throw an exception in Python?

To manually raise or throw an exception in Python, we can use the raise keyword.

For instance, we write:

try:
    raise ValueError('Represents a hidden bug, do not catch this')
    raise Exception('This is the exception you expect to handle')
except Exception as error:
    print(repr(error))

We use raise with ValueError to raise ValueError with a message.

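Then we use the except clause to catch Exception, which is the base class of ValueError and of almost all other built-in exceptions (only a few, such as KeyboardInterrupt and SystemExit, derive directly from BaseException instead).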

So the print call will print ValueError('Represents a hidden bug, do not catch this').

The second raise statement is never reached, so the Exception exception is never raised.
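In practice, we usually raise the most specific exception that fits and catch only that type. The following is a minimal sketch of this pattern. The parse_age function and its messages are hypothetical and only serve as an illustration.

```python
def parse_age(text):
    try:
        age = int(text)
    except ValueError as error:
        # Re-raise with a clearer message, keeping the original as the cause
        raise ValueError(f"age must be an integer, got {text!r}") from error
    if age < 0:
        # Raise manually when the value breaks our own rule
        raise ValueError(f"age must be non-negative, got {age}")
    return age


for value in ("42", "-1", "abc"):
    try:
        print(parse_age(value))
    except ValueError as error:
        print(repr(error))
```

The first value is parsed and printed, while the other two raise ValueError, which the except clause catches and prints.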

Conclusion

To manually raise or throw an exception in Python, we can use the raise keyword.

How to process escape sequences in a string in Python?

Sometimes, we want to process escape sequences in a string in Python.

In this article, we’ll look at how to process escape sequences in a string in Python.

How to process escape sequences in a string in Python?

To process escape sequences in a string in Python, we can use the bytes object’s decode method.

For instance, we write:

s = "spam\\neggs"
decoded_string = bytes(s, "utf-8").decode("unicode_escape")
print(decoded_string)

We call bytes with string s and 'utf-8' encoding.

Then we call decode with 'unicode_escape' to turn escape sequences such as \n into the characters they represent.

Therefore, decoded_string is:

spam
eggs
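One caveat is that the unicode_escape codec reads the bytes as Latin-1, so the UTF-8 round trip above garbles non-ASCII text (for example, café comes back as cafÃ©). A possible workaround, sketched below, is to encode with Latin-1 and the backslashreplace error handler instead. The process_escapes function name and the sample string are our own illustrative choices.

```python
def process_escapes(text):
    # Assumption: Latin-1 maps each byte to one character, and
    # backslashreplace writes anything outside Latin-1 as a \u or \U escape,
    # which unicode_escape then decodes back to the original character
    raw = text.encode("latin-1", errors="backslashreplace")
    return raw.decode("unicode_escape")


s = "café\\tcrème\\n☃"
print(process_escapes(s))
```

Here, the \t and \n sequences become a tab and a newline, and the accented letters and the snowman character are left intact.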

Conclusion

To process escape sequences in a string in Python, we can use the bytes object’s decode method.

How to read subprocess stdout line by line in Python?

Sometimes, we want to read subprocess stdout line by line in Python.

In this article, we’ll look at how to read subprocess stdout line by line in Python.

How to read subprocess stdout line by line in Python?

To read subprocess stdout line by line in Python, we can call stdout.readline on the returned process object.

For instance, we write:

import subprocess

proc = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE)
while True:
    line = proc.stdout.readline()
    if not line:
        break
    print(line.rstrip())

We call subprocess.Popen with a list of strings with the command and command line arguments.

We also set stdout to subprocess.PIPE so that the command's output is captured through a pipe that we can read from as proc.stdout.

Next, we call proc.stdout.readline to return the next line of the stdout output in the while loop.

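If line is an empty bytes object, which is what readline returns at the end of the output, then we stop the loop.

Otherwise, we strip the trailing whitespace and print the line. Since the pipe is read in binary mode, each line is a bytes object.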

Therefore, we get text like:

b'total 64'
b'-rw-r--r-- 1 runner runner   183 Oct 20 01:10 main.py'
b'-rw-r--r-- 1 runner runner 14924 Oct 19 23:40 poetry.lock'
b'drwxr-xr-x 1 runner runner   126 Oct 19 23:17 __pycache__'
b'-rw-r--r-- 1 runner runner   319 Oct 19 23:39 pyproject.toml'
b'-rw-r--r-- 1 runner runner 12543 Oct 20 00:16 somepic.png'
b'-rw-r--r-- 1 runner runner   197 Oct 19 23:21 strings.json'
b'-rw------- 1 runner runner 18453 Oct 20 00:16 test1.png'

on the screen.
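If we would rather get str lines without the b prefix, one option is to open the pipe in text mode and iterate over proc.stdout directly. The following is a minimal sketch. It assumes Python 3.7 or later for the text argument, and it runs a small Python one-liner instead of ls so that it works on any platform.

```python
import subprocess
import sys

# Illustrative command: a child Python process that prints two lines
cmd = [sys.executable, "-c", "print('first'); print('second')"]

lines = []
# The with block closes the pipe and waits for the process to exit
with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
    for line in proc.stdout:  # yields one decoded line at a time
        lines.append(line.rstrip("\n"))

print(lines)
print(proc.returncode)
```

Iterating over proc.stdout stops on its own at the end of the output, so we don't need the explicit empty-line check.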

Conclusion

To read subprocess stdout line by line in Python, we can call stdout.readline on the returned process object.