Sometimes, we want to parse an HTML page with Node.js.
In this article, we’ll look at how to parse an HTML page with Node.js.
How to parse an HTML page with Node.js?
To parse an HTML page with Node.js, we can use the Cheerio library.
To install it, we run
npm i cheerio
Then we write
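const cheerio = require('cheerio');
// Parse the HTML string into a queryable document
const $ = cheerio.load('<h2 class="title">Hello world</h2>');
// Update the heading text, then add a class to every h2
$('h2.title').text('Hello there!');
$('h2').addClass('welcome');
// Serialize the modified document back to an HTML string
console.log($.html());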
to call cheerio.load to load the HTML string into the $ object.
Then we call $ with a CSS selector string to select the elements we want to manipulate.
We call text to set the text of the h2 element with class title.
And we call addClass to add the welcome class to all h2 elements.
Finally, $.html returns the HTML of the manipulated document as a string. Because cheerio.load parses the string as a full document, the result includes the html, head, and body wrapper elements around the h2.
To get the HTML string from a web page, we can make a request with any HTTP client.
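As a rough sketch of that step, the snippet below uses the fetch function built into Node.js 18 and later, which is only one possible client. The fetchHtml helper name is an assumption made for illustration.

```javascript
// Hypothetical helper: download a page and return its HTML as a string.
// Assumes Node.js 18+, where fetch is available globally.
async function fetchHtml(url) {
  const response = await fetch(url);
  // fetch only rejects on network errors, so we check the status ourselves
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  return response.text();
}
```

We could then pass the result to cheerio.load, for example with const $ = cheerio.load(await fetchHtml('https://example.com/')) inside an async function.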
Conclusion
To parse an HTML page with Node.js, we can use the Cheerio library.