Capturing network activity with Puppeteer

Key takeaways:

  • Puppeteer is a Node.js library for controlling Chrome or Chromium (headless by default). Combined with the puppeteer-har library, it can record network activity in the HAR format.

  • HAR files log browser interactions, including requests, responses, and timings, providing crucial insights for performance analysis.

  • Using the puppeteer-har library, developers can easily generate HAR files by navigating to a web page and capturing all network activity.

  • Analyzing HAR data helps identify performance bottlenecks, such as slow resources and excessive redirects, enhancing web page loading times.

Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium over the DevTools protocol. Because it has access to the browser’s network events, it can be paired with a recorder such as puppeteer-har to capture network activity during a browsing session in the HTTP Archive (HAR) format. Here, we’ll cover how to capture network activity with Puppeteer and analyze it using HAR files.

HAR files

HAR (HTTP Archive) files are JSON-formatted archives that log a web browser’s interaction with a web page during a browsing session. They capture network-related data, including requests, responses, headers, and timings, and are commonly used for analyzing and optimizing web page performance.
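
To make the structure concrete, here is a hand-written, heavily trimmed sketch of what a HAR log looks like once parsed into a JavaScript object. The field names follow the HAR 1.2 format, but the URL, sizes, timings, and creator name are made-up illustration values, and real files carry many more fields. The describeEntries helper is hypothetical and only shows how the fields are reached.

```javascript
// A trimmed, hand-written HAR log. All values are illustrative, not captured data.
const sampleHar = {
  log: {
    version: '1.2',
    creator: { name: 'example-tool', version: '0.0.0' }, // hypothetical creator
    pages: [
      {
        id: 'page_1',
        title: 'https://example.com/',
        startedDateTime: '2025-01-01T00:00:00.000Z',
        pageTimings: { onLoad: 850 },
      },
    ],
    entries: [
      {
        pageref: 'page_1',
        startedDateTime: '2025-01-01T00:00:00.010Z',
        time: 120, // total elapsed time for this request, in milliseconds
        request: { method: 'GET', url: 'https://example.com/', headers: [] },
        response: {
          status: 200,
          headers: [],
          content: { size: 5120, mimeType: 'text/html' },
        },
        timings: { send: 1, wait: 100, receive: 19 },
      },
    ],
  },
};

// Hypothetical helper: build one summary line per recorded request.
function describeEntries(har) {
  return har.log.entries.map(
    (e) => `${e.request.method} ${e.request.url} -> ${e.response.status} in ${e.time} ms`
  );
}

console.log(describeEntries(sampleHar).join('\n'));
```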

Setting up Puppeteer with Puppeteer HAR

To capture network activity, we’ll use the puppeteer-har library, which extends Puppeteer to support HAR recording. First, install Puppeteer and puppeteer-har:

npm install puppeteer puppeteer-har

Generating a website HAR file with Puppeteer

A HAR file for a website can be generated with only a few lines of Puppeteer code. With the generated file, developers can review all the traffic the page produced and get performance and security insights for each request.

The following code snippet will navigate to the Packt website and generate a HAR file for review:

const puppeteer = require('puppeteer');
const PuppeteerHar = require('puppeteer-har');

(async () => {
  const browser = await puppeteer.launch({ headless: false, args: ['--no-sandbox'] }); // visible window; --no-sandbox is usually only needed in containers
  const page = await browser.newPage();

  const har = new PuppeteerHar(page);
  await har.start({ path: 'book_demo.har' }); // start recording
  await page.goto('https://www.packtpub.com/');
  await har.stop(); // stop recording and write book_demo.har
  await browser.close();
})();

Capture a HAR file with Puppeteer for the Packt website

  • Line 2: Imports the puppeteer-har library, which provides functionality for capturing HAR files during web page interactions.

  • Line 8: Creates a new PuppeteerHar instance bound to the current page.

  • Line 9: Starts recording to a HAR file named book_demo.har.

  • Line 10: Navigates the browser to https://www.packtpub.com/.

Running the preceding code generates a new HAR file named book_demo.har. To view the .har file, open the Google HAR Analyzer tool and follow the steps below.

  1. Click the “CHOOSE FILE” button.

  2. Browse through the “app” folder and select the book_demo.har file.

  3. Click the “Open” button located in the lower right corner.

If you have a Google account, select the "Sign in" option; otherwise, select the "Don't sign in" option.

The Google HAR Analyzer web tool displays the recorded requests, which frontend developers can examine for web traffic issues, performance problems, and more.

Navigating to a web page

We use Puppeteer’s page.goto() method to navigate to a specific web page. While recording is active (between har.start() and har.stop()), the network requests and responses triggered by the navigation are collected and written to the HAR file.

await page.goto('https://www.packtpub.com/');
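
By default, page.goto() resolves when the page's load event fires, so requests issued after that point can be missed if recording stops immediately. One possible refinement, sketched below, is to pass the waitUntil: 'networkidle0' option and to stop recording in a finally block so that har.stop() still runs if navigation fails. captureHar is a hypothetical helper name, not part of Puppeteer or puppeteer-har; it expects a page and a har object like the ones created in the example above.

```javascript
// Hypothetical helper: records a HAR file for a single navigation.
// `page` is a Puppeteer page and `har` is a PuppeteerHar instance for that page.
async function captureHar(page, har, url, path) {
  await har.start({ path });
  try {
    // 'networkidle0' waits until there have been no network connections
    // for at least 500 ms, which gives late requests a chance to be recorded.
    await page.goto(url, { waitUntil: 'networkidle0' });
  } finally {
    // Always stop recording, even if navigation throws (for example, on a timeout).
    await har.stop();
  }
}
```

Inside the async function from the earlier example, it could be called as await captureHar(page, har, 'https://www.packtpub.com/', 'book_demo.har'). Pages that keep connections open indefinitely may never become idle, so this option is a trade-off rather than a universal default.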

Analyzing HAR data

Once we’ve captured network activity in the HAR file, we can analyze it to gain insights into the web page’s performance. HAR files contain detailed information about each network request, including request and response headers, timings, and content sizes.
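
As a starting point for this kind of analysis, the sketch below lists the slowest requests in a parsed HAR object. It assumes each entry carries the standard time, timings.wait, request.url, response.status, and response.content.size fields; slowestRequests is a hypothetical helper name, and the file name book_demo.har is taken from the earlier example.

```javascript
const fs = require('fs');

// Hypothetical helper: return the `count` slowest entries, slowest first.
function slowestRequests(har, count = 5) {
  return [...har.log.entries] // copy so the original order is left untouched
    .sort((a, b) => b.time - a.time)
    .slice(0, count)
    .map((e) => ({
      url: e.request.url,
      status: e.response.status,
      timeMs: Math.round(e.time), // total time for the request
      waitMs: Math.round(e.timings.wait), // time spent waiting for the server
      bytes: e.response.content.size,
    }));
}

// Only read the file if it exists, so the snippet is safe to run anywhere.
const harPath = 'book_demo.har';
if (fs.existsSync(harPath)) {
  const har = JSON.parse(fs.readFileSync(harPath, 'utf8'));
  console.table(slowestRequests(har));
}
```

A large waitMs relative to timeMs suggests the server was slow to respond, whereas a large gap between the two points toward download time or connection setup.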

Automating network performance testing

Puppeteer and HAR files can be used to automate network performance testing, allowing developers to identify bottlenecks and optimize web page loading times. By analyzing HAR data, we can pinpoint slow-loading resources, excessive redirects, or large file sizes that impact performance.
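
One way to automate such checks is to compare each HAR entry against a simple performance budget and report anything that exceeds it. The sketch below is a minimal illustration under stated assumptions: checkHar and DEFAULT_BUDGET are hypothetical names, and the thresholds are arbitrary placeholders rather than recommended values.

```javascript
// Hypothetical thresholds; tune them for the site under test.
const DEFAULT_BUDGET = { maxTimeMs: 1000, maxBytes: 500 * 1024, maxRedirects: 2 };

// Hypothetical helper: check a parsed HAR object against a budget
// and return a list of human-readable problems (empty if all is well).
function checkHar(har, budget = DEFAULT_BUDGET) {
  const problems = [];
  let redirects = 0;

  for (const entry of har.log.entries) {
    const { url } = entry.request;
    const { status, content } = entry.response;

    if (entry.time > budget.maxTimeMs) {
      problems.push(`slow (${Math.round(entry.time)} ms): ${url}`);
    }
    if (content.size > budget.maxBytes) {
      problems.push(`large (${content.size} bytes): ${url}`);
    }
    // Count 3xx responses as redirects, except 304 (Not Modified).
    if (status >= 300 && status < 400 && status !== 304) {
      redirects += 1;
    }
  }

  if (redirects > budget.maxRedirects) {
    problems.push(`too many redirects: ${redirects}`);
  }
  return problems;
}
```

In a CI job, a script could parse the recorded HAR file, call checkHar, print the returned problems, and set process.exitCode = 1 when the list is not empty.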

Conclusion

Capturing network activity with Puppeteer and HAR files is a valuable technique for web developers and performance engineers. By following the steps outlined in this guide and analyzing the captured HAR data, you can optimize your web pages for better performance and user experience.

Copyright ©2025 Educative, Inc. All rights reserved