Playwright and Puppeteer are the most powerful open-source tools for controlling headless browsers. The main difference between them lies in cross-browser support and feature richness: Playwright supports multiple browser engines, while Puppeteer focuses primarily on Chromium-based browsers and offers a more straightforward experience.
Explore the key differences and similarities between Playwright and Puppeteer:
Main differences between Playwright and Puppeteer
Playwright and Puppeteer are both open-source Node.js libraries commonly used for web automation tasks and web scraping. Both tools support controlling headless browsers, automation via DevTools, and provide APIs for page and element interaction.
Features | Playwright | Puppeteer |
---|---|---|
Maintainer | Microsoft | Google (Chrome team) |
Browser Support | Chromium (Chrome, Edge), Firefox, and WebKit (Safari) | Primarily Chromium, limited Firefox support |
Programming Languages | JavaScript/TypeScript, Python, Java, C# (official) | JavaScript/TypeScript (official); unofficial wrappers |
Cross-browser Testing | ✅ | Limited (mostly Chromium-focused) |
Mobile Browser Emulation | Native support for Chrome Android & Mobile Safari | Primarily Chrome Android emulation |
Community & Ecosystem | Rapidly growing but newer | Larger, more mature ecosystem |
GitHub statistics (April 2025) | 71.8k stars, 4.1k forks | 90.4k stars, 9.2k forks |
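To make the comparison concrete, here is roughly the same minimal task (open a page and print its title) in each library. Treat this as a sketch of typical usage, not a full comparison; the target URL is just an example:

// Playwright: the same API shape works for chromium, firefox, and webkit
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  console.log(await page.title());
  await browser.close();
})();

// Puppeteer: Chromium-focused, but the surface looks almost identical
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  console.log(await page.title());
  await browser.close();
})();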
What is Puppeteer?
Puppeteer is an open-source Node.js library that provides a user-friendly API to control headless Chrome or Chromium browsers over the DevTools Protocol or WebDriver BiDi.
Puppeteer supports automated testing of Chrome Extensions and can be used for performance testing. Users can capture precise screenshots of entire pages or specific UI components.
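For example, a minimal screenshot sketch using Puppeteer's documented page.screenshot API (the target URL and file names are just examples):

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Full-page screenshot (not just the visible viewport)
  await page.screenshot({ path: 'page.png', fullPage: true });
  // Screenshot of a single element
  const heading = await page.$('h1');
  await heading.screenshot({ path: 'heading.png' });
  await browser.close();
})();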
Advantages of Puppeteer
- Since Puppeteer is developed and maintained by Google, the tool quickly integrates the latest Chrome developments.
- Runs Chrome/Chromium in headless mode by default, which keeps simple automation tasks fast and lightweight.
- Offers full control over Chrome’s features including clicking buttons, form submission, scrolling, and taking screenshots.
- For Chrome-only tasks, Puppeteer can be slightly faster than Playwright.
Disadvantages of Puppeteer
- Puppeteer does not support WebKit-based browsers such as Safari, and its Firefox support is limited.
- The primary language Puppeteer supports is JavaScript (and TypeScript via typings).
- Puppeteer is tightly coupled with specific versions of Chromium or Firefox. If you want to test on older browser versions, you need to manage the browser binary manually (a sketch of this follows below).
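As a sketch of the manual approach: executablePath is a documented Puppeteer launch option, but the binary path below is only an example; substitute the location of the browser build you want to test against.

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    // Example path - point this at a manually managed browser binary
    executablePath: '/usr/bin/google-chrome-stable',
  });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  console.log(await browser.version()); // Confirm which binary actually launched
  await browser.close();
})();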
What is Playwright?
Playwright is an open-source, cross-browser automation and testing library developed by Microsoft. The tool enables developers to interact with all major browsers, including Chromium (Chrome, Edge), Firefox, and WebKit (Safari).
Playwright allows capturing screenshots of entire pages or specific elements, generating PDFs of pages, and recording videos of test sessions.
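A minimal sketch combining these capabilities (screenshot, pdf, and recordVideo are documented Playwright APIs; the paths and URL are examples):

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  // Record a video of the session; it is saved when the context closes
  const context = await browser.newContext({ recordVideo: { dir: 'videos/' } });
  const page = await context.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'page.png', fullPage: true }); // Full-page screenshot
  await page.pdf({ path: 'page.pdf' }); // PDF generation (headless Chromium only)
  await context.close();
  await browser.close();
})();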
Advantages of Playwright
- Cross-browser and cross-language support: Playwright works across multiple browsers and officially supports several programming languages, including JavaScript/TypeScript, Python, Java, and C# (.NET).
- Built-in cross-browser testing: Developers can run the same scripts and tests across all supported browsers, in both visible (headed) and headless modes.
- Native mobile browser emulation of Chrome for Android and Mobile Safari: Includes predefined device profiles for common mobile devices.
- Built-in auto-wait: Auto-wait mechanisms ensure that elements become actionable before interactions occur. A short sketch combining cross-browser runs and auto-waiting follows this list.
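The following is a minimal sketch, assuming a page at example.com with at least one link; the browser loop and the auto-waiting locator calls reflect Playwright's documented API:

const { chromium, firefox, webkit } = require('playwright');

(async () => {
  // The same script runs unchanged against all three engines
  for (const browserType of [chromium, firefox, webkit]) {
    const browser = await browserType.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');
    console.log(`${browserType.name()}: ${await page.title()}`);
    // Locators auto-wait until the element is visible and actionable
    await page.locator('a').first().click();
    await browser.close();
  }
})();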
Disadvantages of Playwright
- PDF generation limitation: PDF generation is only supported in headless Chromium; Firefox and WebKit currently do not support it.
- Resource intensive: Launching multiple browser instances can consume significant memory and CPU.
- Less mature ecosystem than Puppeteer's: While Playwright has grown quickly in popularity since its initial release in early 2020, its community and third-party ecosystem are still smaller.
Automating News Headline Scraping with Playwright
In this example, we will:
- Navigate to BBC News
- Grab the top 5 headlines
- Save them into a .txt file
Step 1: Install Node.js (if you haven’t already)
Check whether Node.js is installed: open your terminal (or command prompt) and type node -v. If a version number appears (e.g., v22.11.0), you're all set. If you get an error like “command not found”, go to https://nodejs.org and download the LTS version.
node -v
Step 2: Create a Project Folder
Create a new folder (directory) for the project:
mkdir ~/Desktop/news-scraper
You can enter that folder in your terminal by running this command:
cd ~/Desktop/news-scraper
Step 3: Initialize the Project
The following command creates a file called package.json in your folder. It will track your project’s dependencies.
npm init -y
After running the command, a file called package.json is created, including the project's name and version:

Step 4: Install Playwright
npm install playwright
Next, install the browser binaries. The following command downloads Chromium, Firefox, and WebKit:
npx playwright install
Step 5: Create the Automation Script
After installing Playwright and the browser binaries, we can write the first script, which opens a browser, visits a website, collects headlines, and saves them to a file.
- Create a new, empty file named scrape.js:
touch scrape.js
- Open the file:
nano scrape.js
The following image shows the nano text editor after running the command; it is ready for typing or pasting your JavaScript code into scrape.js.

- Paste this code into the nano window:
const { chromium } = require('playwright');
const fs = require('fs'); // Import fs module to save data

(async () => {
  const browser = await chromium.launch(); // Launch the browser
  const page = await browser.newPage(); // Open a new page
  await page.goto('https://www.bbc.com/news'); // Navigate to BBC News

  // Scrape the first 5 headlines
  const headlines = await page.$$eval('.gs-c-promo-heading__title', elements =>
    elements.slice(0, 5).map(el => el.innerText.trim()) // Extract text of the top 5 headlines
  );

  // Save the headlines to a .txt file
  fs.writeFileSync('headlines.txt', headlines.join('\n'), 'utf8');
  console.log('✅ Headlines saved to headlines.txt');

  await browser.close(); // Close the browser
})();
- Press Ctrl + O, then Enter to save the file.
- After saving the file, press Ctrl + X to exit nano.
- Navigate to your project directory:
cd ~/Desktop/news-scraper
- Run the following command in the terminal to scrape the headlines from BBC News:
node scrape.js
Step 6: Check the scraper output
The scraper navigates to the BBC News website, scrapes the top headlines, and saves them into a .txt file.
- You will see:
✅ Headlines saved to headlines.txt
- List all files in the current directory:
ls
- Check the headlines.txt file:
cat headlines.txt
See the top 5 headlines printed out in the terminal:

Step 7: Save the extracted data as a text file (option 1)
- In the terminal, list the files to locate headlines.txt:
ls
- You can copy the file to another location by using the following command:
cp headlines.txt ~/Desktop
Export to CSV (option 2)
- Install the csv-writer package
npm install csv-writer
- Modify the script with the following code for CSV export:
const { chromium } = require('playwright');
const { createObjectCsvWriter } = require('csv-writer'); // For CSV export

(async () => {
  const browser = await chromium.launch(); // Launch browser
  const page = await browser.newPage(); // Open a new page
  await page.goto('https://www.bbc.com/news'); // Navigate to BBC News

  // Extract the top 5 headlines
  const headlines = await page.$$eval('.sc-87075214-3', (elements) => {
    return elements.slice(0, 5).map(el => el.innerText.trim());
  });

  // Log the extracted headlines
  console.log('Extracted Headlines:', headlines);

  // Set up the CSV writer
  const csvWriter = createObjectCsvWriter({
    path: 'headlines.csv',
    header: [
      { id: 'headline', title: 'Headline' }
    ]
  });

  // Write the headlines to the CSV file
  const records = headlines.map(headline => ({ headline }));
  await csvWriter.writeRecords(records);
  console.log('✅ Headlines saved to headlines.csv');

  await browser.close(); // Close the browser
})();
- Run the script again:
node scrape.js
The CSV file should look like this:

Troubleshooting
Selector Not Found
The following response shows that the script ran but didn't extract any headlines from the target:

Extracted Headlines: []
✅ Headlines saved to headlines.txt

Websites change their structure frequently, so the class used in the code may no longer be correct.
How to fix the issue:
The $$eval function in Playwright uses CSS selectors to identify and extract elements. Inspect the page: right-click one of the headline elements and select Inspect. In the Developer Tools, examine the HTML structure and identify the element's class name, then update your script with the new selector.
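One defensive pattern is to wait for the selector explicitly and fail with a clear message instead of silently writing an empty file. A minimal sketch follows; the h2 selector is a placeholder, so substitute whatever you find in DevTools:

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto('https://www.bbc.com/news');

  const selector = 'h2'; // Placeholder - replace with the current headline selector
  try {
    // Fail fast with a clear message instead of silently writing an empty file
    await page.waitForSelector(selector, { timeout: 10000 });
  } catch {
    console.error(`Selector "${selector}" not found - inspect the page and update it.`);
    await browser.close();
    process.exit(1);
  }

  const headlines = await page.$$eval(selector, els =>
    els.slice(0, 5).map(el => el.innerText.trim())
  );
  console.log('Extracted Headlines:', headlines);
  await browser.close();
})();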
Combining Scraping and Automation in One Puppeteer Script
In this example, we will:
- Navigate to the Real Python blog
- Extract the top article titles, excerpts, and URLs
Step 1: Create a folder for your project
mkdir realpython-scraper
Then navigate to the folder:
cd realpython-scraper
Step 2: Initialize a New Node.js Project
The following command creates a package.json file, which will hold your project's dependencies:
npm init -y
After creating a package.json file, install Puppeteer in the folder by running:
npm install puppeteer
Step 3: Create the Scraping Script
- Create a new JavaScript file for the scraping script:
touch realpython-scraper.js
- Open the file:
nano realpython-scraper.js
- Paste the following code, then save and exit (CTRL + O → Enter, CTRL + X):
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false }); // Launch a visible browser window
  const page = await browser.newPage();

  await page.goto('https://realpython.com/', {
    waitUntil: 'domcontentloaded'
  });

  await page.waitForSelector('.card.border-0');

  const articles = await page.$$eval('.card.border-0', cards => {
    return cards
      .filter(card => card.querySelector('h2.card-title')) // filter only articles
      .slice(0, 5)
      .map(card => {
        const title = card.querySelector('h2.card-title')?.innerText.trim();
        const excerpt = card.querySelector('p.card-text')?.innerText.trim();
        const url = card.querySelector('a')?.href;
        return { title, excerpt, url };
      });
  });

  console.log('\n📰 Top 5 Articles on Real Python:\n');
  articles.forEach((a, i) => {
    console.log(`${i + 1}. ${a.title}`);
    console.log(`   ${a.url}`);
    console.log(`   ${a.excerpt}\n`);
  });

  await browser.close();
})();
The script will extract:
- The top 5 article titles, excerpts, and URLs.
Step 4: Run the Script
node realpython-scraper.js
Expected output:

Troubleshooting
In the image below:
- The Extracted Job Listings: [] means the script didn’t find any job listings on the page.
- No element found for selector: #text-input-what indicates the form input for the job search couldn’t be found.

How to fix the issue:
- Job listings scraping issue: The selector used for extracting the job titles can be outdated or incorrect. You need to inspect the page and update the script with the correct selector.
Many social media platforms, job search engines like Indeed, and e-commerce sites like Amazon use anti-bot measures to block automated requests. For example, in the image below, Amazon serves its dog error page, indicating that it detected the bot and blocked the request. Puppeteer, particularly in headless mode or with default settings, is relatively easy for such websites to detect.

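Basic mitigations, sketched below with documented Puppeteer options (headed mode plus a realistic user agent), can reduce trivial detection. They are not guaranteed to work against dedicated anti-bot systems, and you should always respect a site's terms of service; the user agent string and URL here are example values:

const puppeteer = require('puppeteer');

(async () => {
  // Headed mode looks less like a default automation setup
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  // A realistic user agent string (example value - keep it current)
  await page.setUserAgent(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36'
  );
  await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });
  await browser.close();
})();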