How to create a web scraper with Dart

Dart is a programming language created by Google for building web, mobile, and server applications. It compiles ahead of time to native machine code and can also compile to JavaScript, which helps applications start quickly and run fast. Dart has a simple syntax, supports both object-oriented and functional programming styles, and is often used with Flutter, a popular framework for building cross-platform applications.

What is a web scraper?

A web scraper is a tool that automatically extracts data from websites. It navigates web pages, gathers specific information, and organizes it into a structured format such as a spreadsheet or database. This lets users collect data at scale, for example for market research, price monitoring, or aggregating information from multiple sources on the internet. In this Answer, we will build a web scraper in Dart, step by step.

Set up a Dart project

Firstly, we have to set up a Dart project. This involves the following steps:

  1. Install the Dart SDK: We have already installed the Dart SDK for you on our platform. However, if you are working on your local machine, you can install the Dart SDK by following the instructions on the official Dart website: https://dart.dev/get-dart.

  2. Create a new Dart project: We have already created the relevant Dart project for you on our platform (see the coding playground at the end of this Answer). However, if you are working on your local machine, you can use the following command to create a new Dart project: dart create web_scraper

Create a web scraper with Dart

Now, let’s create a web scraper using Dart.

Add dependencies

First, make sure the following dependencies are added to the pubspec.yaml file:

  1. http: ^1.1.2

  2. html: ^0.15.0

The code snippet below shows a sample pubspec.yaml file.

name: web_scraper
description: A web scraper created using Dart.
version: 1.0.0
environment:
  sdk: ^3.2.3
dependencies:
  http: ^1.1.2
  html: ^0.15.0
dev_dependencies:
  lints: ^3.0.0
  test: ^1.24.0
pubspec.yaml file

Note: Make sure to run the following command within the project's directory to install all the dependencies: dart pub get.

Import dependencies

Next, import both libraries in the Dart file inside the bin directory of the Dart project. For this Answer, it is the web_scraper.dart file (see the coding playground at the end of this Answer).

import 'package:http/http.dart' as http;
import 'package:html/parser.dart' as parser;
Import dependencies

The package:http/http.dart library lets us make HTTP requests to web servers and retrieve data from URLs. The package:html/parser.dart library parses HTML content so that we can extract specific elements and their text from the documents returned by those requests.
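To see the parsing half in isolation, here is a minimal sketch that parses an inline HTML string with no network request. The sample markup and the textOf helper are hypothetical names introduced for illustration; only parser.parse and querySelector come from the html package.

```dart
import 'package:html/parser.dart' as parser;

// Hypothetical markup used only for illustration; it is not the demo page.
const sampleHtml = '''
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Hello</h1>
    <p id="intro">First paragraph.</p>
  </body>
</html>
''';

/// Returns the trimmed text of the first element matching [selector],
/// or null when nothing in [html] matches.
String? textOf(String html, String selector) {
  final document = parser.parse(html);
  return document.querySelector(selector)?.text.trim();
}

void main() {
  print(textOf(sampleHtml, 'title')); // Element selector.
  print(textOf(sampleHtml, '#intro')); // ID selector.
  print(textOf(sampleHtml, '.missing')); // No match, so this prints null.
}
```

Returning null for a missing element keeps the "not found" decision with the caller, which mirrors the null checks used in the scraper in this Answer.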

Set up a web application

A web scraper can extract content from anything between a simple static page and a complex website. For this Answer, we will use a simple web application for demonstration purposes. The scraper looks for a <title>, an <h1> heading, elements with the IDs about, services, and contact, and a <footer> on that application's index.html page.
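If you are following along locally and need a page to scrape, one possible stand-in is a small server built on dart:io. This is only a sketch: the markup, the placeholder text, the startDemoServer name, and the default port 8000 are assumptions chosen to line up with the selectors and URL used by the scraper in this Answer, not the platform's actual demo application.

```dart
import 'dart:io';

// Hypothetical stand-in page. The element names and IDs match the selectors
// used by the scraper; the text content is placeholder only.
const demoHtml = '''<!DOCTYPE html>
<html>
<head><title>Demo Site</title></head>
<body>
  <h1>Welcome</h1>
  <section id="about">About placeholder.</section>
  <section id="services">Services placeholder.</section>
  <section id="contact">Contact placeholder.</section>
  <footer>Footer placeholder.</footer>
</body>
</html>''';

/// Starts a local server that returns [demoHtml] for /index.html
/// and a 404 status for every other path.
Future<HttpServer> startDemoServer({int port = 8000}) async {
  final server = await HttpServer.bind(InternetAddress.loopbackIPv4, port);
  server.listen((HttpRequest request) async {
    if (request.uri.path == '/index.html') {
      request.response
        ..statusCode = HttpStatus.ok
        ..headers.contentType = ContentType.html
        ..write(demoHtml);
    } else {
      request.response.statusCode = HttpStatus.notFound;
    }
    await request.response.close();
  });
  return server;
}

Future<void> main() async {
  final server = await startDemoServer();
  print('Serving http://localhost:${server.port}/index.html');
}
```

Leave this program running in one terminal and run the scraper in another. If the scraper points at a different port, pass that port to startDemoServer.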

Build the web scraper

Now we are ready to create the web scraper using the Dart programming language. The code snippet below shows how to scrape the content from the demo web application.

import 'package:http/http.dart' as http;
import 'package:html/parser.dart' as parser;

void main() async {
  var url = Uri.parse('http://localhost:8000/index.html');
  try {
    var response = await http.get(url);
    if (response.statusCode == 200) {
      var document = parser.parse(response.body);
      var titleElement = document.querySelector('title');
      if (titleElement != null) {
        var title = titleElement.text;
        print('Title: $title');
      } else {
        print('Title not found');
      }
      var heading = document.querySelector('h1');
      if (heading != null) {
        var headingText = heading.text;
        print('Heading: $headingText');
      } else {
        print('Heading not found');
      }
      var aboutSection = document.querySelector('#about');
      if (aboutSection != null) {
        var aboutContent = aboutSection.text;
        print('About Section Content: $aboutContent');
      } else {
        print('About Section not found');
      }
      var servicesSection = document.querySelector('#services');
      if (servicesSection != null) {
        var servicesContent = servicesSection.text;
        print('Services Section Content: $servicesContent');
      } else {
        print('Services Section not found');
      }
      var contactSection = document.querySelector('#contact');
      if (contactSection != null) {
        var contactContent = contactSection.text;
        print('Contact Section Content: $contactContent');
      } else {
        print('Contact Section not found');
      }
      var footerSection = document.querySelector('footer');
      if (footerSection != null) {
        var footerContent = footerSection.text;
        print('Footer Section Content: $footerContent');
      } else {
        print('Footer Section not found');
      }
    } else {
      print('Failed to load page: ${response.statusCode}');
    }
  } catch (e) {
    print('Error: $e');
  }
}
Web scraper
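The snippet above calls http.get directly. A possible variant, sketched here, moves the request into a helper that takes an http.Client and applies a timeout, so that a slow server cannot stall the scraper indefinitely. The fetchHtml name and the 10-second default are assumptions made for illustration.

```dart
import 'package:http/http.dart' as http;

/// Fetches [url] with [client] and returns the response body,
/// or null when the status code is not 200.
/// Throws a TimeoutException if no response arrives within [timeout].
Future<String?> fetchHtml(
  http.Client client,
  Uri url, {
  Duration timeout = const Duration(seconds: 10), // Assumed default.
}) async {
  final response = await client.get(url).timeout(timeout);
  return response.statusCode == 200 ? response.body : null;
}

Future<void> main() async {
  final client = http.Client();
  try {
    final html = await fetchHtml(
        client, Uri.parse('http://localhost:8000/index.html'));
    print(html == null
        ? 'Failed to load page'
        : 'Fetched ${html.length} characters');
  } catch (e) {
    // Covers connection errors and timeouts alike.
    print('Error: $e');
  } finally {
    client.close();
  }
}
```

Passing the client in as a parameter also makes the helper easier to test, because a fake client can be supplied in place of a real network connection.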

Example

Execute the coding playground below by clicking the "Run" button. Open a new terminal using the "+" button and execute the following command:

cd /web_scraper && dart run
Command to run the Dart program

This command runs the scraper and prints the content scraped from the demo web application.

import 'package:http/http.dart' as http;
import 'package:html/parser.dart' as parser;

void main() async {
  var url = Uri.parse('http://localhost:7080/index.html');
  try {
    var response = await http.get(url);
    if (response.statusCode == 200) {
      var document = parser.parse(response.body);
      var titleElement = document.querySelector('title');
      if (titleElement != null) {
        var title = titleElement.text;
        print('Title: $title');
      } else {
        print('Title not found');
      }
      var heading = document.querySelector('h1');
      if (heading != null) {
        var headingText = heading.text;
        print('Heading: $headingText');
      } else {
        print('Heading not found');
      }
      var aboutSection = document.querySelector('#about');
      if (aboutSection != null) {
        var aboutContent = aboutSection.text;
        print('About Section Content: $aboutContent');
      } else {
        print('About Section not found');
      }
      var servicesSection = document.querySelector('#services');
      if (servicesSection != null) {
        var servicesContent = servicesSection.text;
        print('Services Section Content: $servicesContent');
      } else {
        print('Services Section not found');
      }
      var contactSection = document.querySelector('#contact');
      if (contactSection != null) {
        var contactContent = contactSection.text;
        print('Contact Section Content: $contactContent');
      } else {
        print('Contact Section not found');
      }
      var footerSection = document.querySelector('footer');
      if (footerSection != null) {
        var footerContent = footerSection.text;
        print('Footer Section Content: $footerContent');
      } else {
        print('Footer Section not found');
      }
    } else {
      print('Failed to load page: ${response.statusCode}');
    }
  } catch (e) {
    print('Error: $e');
  }
}
Coding playground

Explanation

  • Line 1: We import package:http/http.dart with the prefix http, which gives us access to HTTP functionality.

  • Line 2: We import package:html/parser.dart with the prefix parser, which gives us HTML parsing capabilities.

  • Line 5: We define the URL to fetch by parsing the string representation of the URL into a Uri object.

  • Line 7: We send an HTTP GET request to the specified URL using the http.get function and await the response.

  • Line 8: We check if the response status code indicates success (200).

  • Line 9: We parse the response body (HTML content) into an HTML document using the parser.parse function.

  • Line 10: We retrieve the <title> element from the HTML document.

  • Line 11: We check if the title element exists in the document.

  • Line 12: We extract the text content of the title element.

  • Lines 17–23: We use a similar process as extracting the title, but for the <h1> element.

  • Lines 24–51: We repeat the same process for the elements with the IDs about, services, and contact (selected with #about, #services, and #contact) and for the <footer> element.

  • Lines 52–54: We print an error message indicating the failure to load the page if the response status code is not 200.

  • Lines 55–57: We catch any exception that occurs during execution, such as a failed connection, and print an error message with its details.
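Because every element is handled the same way, the repeated blocks could be collapsed into a loop over label and selector pairs. The sketch below is one way to do that, not the platform's implementation: the targets map and the describe function are hypothetical names, the inline markup is a stand-in so the code runs without a server, and the "not found" wording and the trim() call are simplifications of the original output.

```dart
import 'package:html/dom.dart';
import 'package:html/parser.dart' as parser;

// Output labels and CSS selectors copied from the scraper in this Answer.
const targets = <String, String>{
  'Title': 'title',
  'Heading': 'h1',
  'About Section Content': '#about',
  'Services Section Content': '#services',
  'Contact Section Content': '#contact',
  'Footer Section Content': 'footer',
};

/// Returns one output line per entry in [targets], in insertion order.
List<String> describe(Document document) {
  final lines = <String>[];
  for (final entry in targets.entries) {
    final element = document.querySelector(entry.value);
    lines.add(element == null
        ? '${entry.key}: not found' // Simplified wording (assumption).
        : '${entry.key}: ${element.text.trim()}');
  }
  return lines;
}

void main() {
  // Hypothetical markup so the sketch runs without a server.
  final document = parser.parse(
      '<title>Demo</title><h1>Welcome</h1><footer>Bye</footer>');
  describe(document).forEach(print);
}
```

With this shape, scraping another element means adding one entry to the map instead of another if/else block.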
