How to Scrape a Website Using Puppeteer in Node.js?

Last Updated : 05 Apr, 2023

Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It lets you automate, test, and scrape web pages in a headless or headful browser.

Installing Puppeteer: To use Puppeteer, you must have Node.js installed. Puppeteer can then be installed from the command line using the npm package manager:

npm install puppeteer

Using Puppeteer: The Puppeteer library can be imported in your script using:

const puppeteer = require('puppeteer');

It is important to remember that Puppeteer is a promise-based library that makes asynchronous calls to the headless Chrome instance. We therefore wrap our code in an async immediately invoked function expression (IIFE), so it runs as soon as the script starts. Here is a simple example that takes a screenshot of a page:

JavaScript

// Import Puppeteer
const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();

    // Open a new page in the headless browser
    const page = await browser.newPage();

    // Visit the page in the browser
    await page.goto('https://scrapethissite.com');

    // Save a screenshot at the given path
    await page.screenshot({ path: 'screenshot.png' });

    // Close our browser instance
    await browser.close();
})();

Running your Code: Save your code as a JavaScript file and run it from the command line with:

node filename.js

Example: The following code returns an object mapping each NHL hockey team name to its number of wins for that year.

JavaScript

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://scrapethissite.com/pages/forms/');

    // Collect the team names from the table
    const namesArray = await page.evaluate(() =>
        [...document.querySelectorAll(
            '#hockey > div > table > tbody > tr > td.name')]
            .map(elem => elem.innerText)
    );

    // Collect the win counts from the table
    const winsArray = await page.evaluate(() =>
        [...document.querySelectorAll(
            '#hockey > div > table > tbody > tr > td.wins')]
            .map(elem => elem.innerText)
    );

    // Pair each team name with its win count
    const result = {};
    namesArray.forEach((name, i) => result[name] = winsArray[i]);
    console.log(result);

    await browser.close();
})();

Output:

{ 'Boston Bruins': '36',
  'Buffalo Sabres': '31',
  'Calgary Flames': '31',
  'Chicago Blackhawks': '36',
  'Detroit Red Wings': '34',
  'Edmonton Oilers': '37',
  'Hartford Whalers': '31',
  'Los Angeles Kings': '46',
  'Minnesota North Stars': '27',
  'Montreal Canadiens': '39',
  'New Jersey Devils': '32',
  'New York Islanders': '25',
  'New York Rangers': '36',
  'Philadelphia Flyers': '33',
  'Pittsburgh Penguins': '41',
  'Quebec Nordiques': '16',
  'St. Louis Blues': '47',
  'Toronto Maple Leafs': '23',
  'Vancouver Canucks': '28',
  'Washington Capitals': '37',
  'Winnipeg Jets': '26' }
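As a variation, the two page.evaluate() calls above could be collapsed into a single call that walks each table row and builds the name-to-wins object inside the browser context. The sketch below is not part of the original example; it assumes the same table structure and selectors on scrapethissite.com:

JavaScript

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://scrapethissite.com/pages/forms/');

    // Build the { teamName: wins } object in one pass over the table rows
    const result = await page.evaluate(() => {
        const rows = document.querySelectorAll('#hockey > div > table > tbody > tr');
        const data = {};
        rows.forEach(row => {
            const name = row.querySelector('td.name');
            const wins = row.querySelector('td.wins');
            if (name && wins) {
                data[name.innerText.trim()] = wins.innerText.trim();
            }
        });
        return data;
    });

    console.log(result);
    await browser.close();
})();

Because everything happens in one evaluate call, the team name and win count always come from the same row, so the two lists can never drift out of alignment.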
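One more note: if goto() or a selector fails partway through, browser.close() is never reached and the Chromium process keeps running. A common guard (shown here as a sketch, not part of the original article; the screenshot filename forms.png is just an illustrative choice) is to wrap the scraping steps in try/finally:

JavaScript

const puppeteer = require('puppeteer');

(async () => {
    // headless: true is the default; pass { headless: false } to watch the browser while debugging
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        await page.goto('https://scrapethissite.com/pages/forms/');
        await page.screenshot({ path: 'forms.png' });
    } catch (err) {
        console.error('Scraping failed:', err);
    } finally {
        // Always close the browser, even if an error was thrown above
        await browser.close();
    }
})();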