Key takeaways:
Puppeteer is a Node.js library that allows control of headless Chrome, enabling the capture of network activity in the HAR format.
HAR files log browser interactions, including requests, responses, and timings, providing crucial insights for performance analysis.
Using the puppeteer-har library, developers can easily generate HAR files by navigating to a web page and capturing all network activity.
Analyzing HAR data helps identify performance bottlenecks, such as slow resources and excessive redirects, enhancing web page loading times.
Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium over the DevTools protocol. One of its powerful features is capturing network activity during web browsing sessions using the HTTP Archive (HAR) format. Here, we’ll cover how to capture network activity with Puppeteer and analyze it using HAR files.
HAR (HTTP archive) files are JSON-formatted archives that contain a log of a web browser’s interaction with a web page during a browsing session. HAR files capture network-related data, including requests, responses, headers, and timings. They are commonly used for analyzing and optimizing web page performance.
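To make the format concrete, here is a minimal sketch of what a HAR document looks like once parsed into a JavaScript object. The field names follow the HAR 1.2 layout (log, entries, request, response, timings), but the values, the example.com URL, and the summarizeEntry helper are made up for illustration and are not taken from a real capture.

```javascript
// A hand-written, heavily abbreviated HAR document (illustrative values only).
const sampleHar = {
  log: {
    version: '1.2',
    creator: { name: 'example-creator', version: '0.0.0' },
    entries: [
      {
        startedDateTime: '2024-01-01T00:00:00.000Z',
        time: 120, // total elapsed time for this request, in milliseconds
        request: { method: 'GET', url: 'https://example.com/', headers: [] },
        response: {
          status: 200,
          headers: [],
          content: { size: 1024, mimeType: 'text/html' },
        },
        timings: { send: 1, wait: 100, receive: 19 }, // per-phase times in ms
      },
    ],
  },
};

// Hypothetical helper: build a one-line summary of a single HAR entry.
function summarizeEntry(entry) {
  const { method, url } = entry.request;
  return `${method} ${url} -> ${entry.response.status} in ${entry.time} ms`;
}

const summaries = sampleHar.log.entries.map(summarizeEntry);
console.log(summaries.join('\n'));
```

Each element of log.entries describes one request and its response, which is why most analysis amounts to iterating over that array.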
To capture network activity, we’ll use the puppeteer-har library, which extends Puppeteer to support HAR recording. First, install Puppeteer and puppeteer-har:
npm install puppeteer puppeteer-har
A website HTTP archive (HAR) file can be easily generated using Puppeteer. Using a generated HAR file, developers can review the entire traffic within their website and get performance and security insights for each transaction.
The following code snippet will navigate to the Packt website and generate a HAR file for review:
const puppeteer = require('puppeteer');
const PuppeteerHar = require('puppeteer-har');

(async () => {
  const browser = await puppeteer.launch({headless:false, args: ['--no-sandbox']});
  const page = await browser.newPage();

  const har = new PuppeteerHar(page);
  await har.start({ path: 'book_demo.har' });
  await page.goto('https://www.packtpub.com/');
  await har.stop();
  await browser.close();
})();
Line 2: Imports the puppeteer-har library, which provides functionality for capturing HAR files during web page interactions.
Line 8: Initializes a new instance of PuppeteerHar using the current page.
Line 9: Begins capturing a HAR file named book_demo.har.
Line 10: Navigates the browser to the URL https://www.packtpub.com/.
On running the preceding test code, a new HAR file under the name book_demo.har will be generated. Please switch to the output tab and follow the steps below to view the .har file in the Google HAR analyzer tool.
Click the “CHOOSE FILE” button.
Browse through the “app” folder and select the book_demo.har file.
Click the “Open” button located in the lower right corner.
The Google HAR analyzer web tool displays the captured traffic, which frontend developers can examine for web traffic issues, performance issues, and more.
We use Puppeteer’s page.goto() method to navigate to a specific web page. While the page loads, the network requests and responses are recorded and then stored in the HAR file.
await page.goto('https://www.packtpub.com/');
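Requests that fire after the initial load event can be missed if recording stops too early. One possible way to reduce that risk is to pass the waitUntil option to page.goto(). The sketch below wraps the capture in a reusable function; the name captureHar and the choice of 'networkidle0' are our own assumptions for illustration, not something the original example prescribes.

```javascript
// Sketch: capture a HAR file for a URL, waiting for the network to go quiet.
// captureHar is a hypothetical helper name used only for this example.
async function captureHar(url, outputPath) {
  // Required lazily so this file can be loaded even if the packages are absent.
  const puppeteer = require('puppeteer');
  const PuppeteerHar = require('puppeteer-har');

  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    const har = new PuppeteerHar(page);

    await har.start({ path: outputPath });
    // 'networkidle0' waits until there have been no network connections
    // for at least 500 ms, so late-loading resources are more likely recorded.
    await page.goto(url, { waitUntil: 'networkidle0' });
    await har.stop();
  } finally {
    // Close the browser even if navigation or recording throws.
    await browser.close();
  }
}

// Example call (not executed here):
// captureHar('https://www.packtpub.com/', 'book_demo.har');
```

Pages that keep a connection open indefinitely (for example, through polling) may never reach 'networkidle0'; in that case 'networkidle2' or the default 'load' event may be a better fit.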
Once we’ve captured network activity in the HAR file, we can analyze it to gain insights into the web page’s performance. HAR files contain detailed information about each network request, including request and response headers, timings, and content sizes.
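Because a HAR file is plain JSON, a first look at it needs nothing beyond Node.js itself. The following is a minimal sketch, assuming a file such as book_demo.har exists on disk; the helper names loadHar, toRows, and slowestFirst are hypothetical and chosen only for this example.

```javascript
const fs = require('fs');

// Read a HAR file from disk and parse it as JSON.
function loadHar(filePath) {
  return JSON.parse(fs.readFileSync(filePath, 'utf8'));
}

// Flatten each entry into the fields that are usually inspected first.
function toRows(har) {
  return har.log.entries.map((entry) => ({
    url: entry.request.url,
    status: entry.response.status,
    timeMs: entry.time,
    // content may be missing in hand-edited files, so fall back to 0 bytes
    bytes: entry.response.content ? entry.response.content.size : 0,
  }));
}

// Return a new array with the slowest requests first.
function slowestFirst(rows) {
  return [...rows].sort((a, b) => b.timeMs - a.timeMs);
}

// Example usage (not executed here):
// console.table(slowestFirst(toRows(loadHar('book_demo.har'))).slice(0, 10));
```

Printing the ten slowest rows is often enough to decide which requests deserve a closer look in a full HAR viewer.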
Puppeteer and HAR files can be used to automate network performance testing, allowing developers to identify bottlenecks and optimize web page loading times. By analyzing HAR data, we can pinpoint slow-loading resources, excessive redirects, or large file sizes that impact performance.
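The three problem types mentioned above (slow resources, redirects, and large files) can each be expressed as a simple filter over the HAR entries. The sketch below is one possible approach; the findBottlenecks name, the threshold defaults, and the sample data are assumptions made for illustration and should be tuned for a real site.

```javascript
// Hypothetical helper: flag slow, redirected, and large responses in a HAR.
// The default thresholds are arbitrary illustrative values.
function findBottlenecks(har, { slowMs = 1000, largeBytes = 500 * 1024 } = {}) {
  const entries = har.log.entries;
  const sizeOf = (e) => (e.response.content && e.response.content.size) || 0;

  return {
    // Requests whose total time exceeds the threshold.
    slow: entries.filter((e) => e.time > slowMs).map((e) => e.request.url),
    // 3xx responses that carry a redirect target (this excludes 304 Not Modified).
    redirects: entries
      .filter((e) => e.response.status >= 300 && e.response.status < 400 && e.response.redirectURL)
      .map((e) => e.request.url),
    // Responses whose body size exceeds the threshold.
    large: entries.filter((e) => sizeOf(e) > largeBytes).map((e) => e.request.url),
  };
}

// Made-up sample data standing in for a parsed HAR file.
const demoHar = {
  log: {
    entries: [
      { time: 2500, request: { url: 'https://example.com/slow.js' },
        response: { status: 200, redirectURL: '', content: { size: 2000 } } },
      { time: 40, request: { url: 'https://example.com/old' },
        response: { status: 301, redirectURL: 'https://example.com/new', content: { size: 0 } } },
      { time: 300, request: { url: 'https://example.com/hero.png' },
        response: { status: 200, redirectURL: '', content: { size: 900000 } } },
      { time: 10, request: { url: 'https://example.com/cached.css' },
        response: { status: 304, redirectURL: '', content: { size: 0 } } },
    ],
  },
};

const report = findBottlenecks(demoHar);
console.log(report);
```

Running a check like this after each capture is one way to turn HAR analysis into an automated performance test rather than a manual review.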
Capturing network activity with Puppeteer and HAR files is a valuable technique for web developers and performance engineers. By following the steps outlined in this guide and analyzing the captured HAR data, you can optimize your web pages for better performance and user experience.