Python's html.parser module is a structured markup processing tool. It defines a class called HTMLParser, which is used to parse HTML files. It comes in handy for
HTMLParser.feed is one of the methods of the HTMLParser class. We use it to pass data to the parser.
The data must be a string. The parser processes it as far as it consists of complete elements, meaning that every tag is fully written and nothing is missing. For example, </p is incomplete data because the closing > is missing. The parser buffers incomplete data until more data is fed or close() is called.
HTMLParser.feed(data)
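The buffering behaviour described above can be observed by feeding a tag in two pieces. This is a minimal sketch, not a definitive implementation: the class name TagCollector and its end_tags list are hypothetical names introduced only for illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper that records end tags as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.end_tags = []

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


collector = TagCollector()

# The closing ">" is missing, so the parser buffers "</p" and reports no end tag yet.
collector.feed("<p>Hello</p")
tags_after_first_feed = list(collector.end_tags)

# Feeding the missing ">" completes the tag, and handle_endtag is called.
collector.feed(">")
tags_after_second_feed = list(collector.end_tags)

print(tags_after_first_feed)   # []
print(tags_after_second_feed)  # ['p']
```

Because of this buffering, feed can be called repeatedly with chunks of a larger document, for example when reading a file piece by piece.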
The code below shows how we can use HTMLParser to separate start tags, end tags, comments, and data in an HTML string.
from html.parser import HTMLParser


class Parser(HTMLParser):
    # Method to print the start tags.
    def handle_starttag(self, tag, attrs):
        print("start tag: ", tag)

    # Method to print the end tags.
    def handle_endtag(self, tag):
        print("end tag: ", tag)

    # Method to print the data between the tags.
    def handle_data(self, data):
        print("Data: ", data)

    # Method to print the comments.
    def handle_comment(self, data):
        print("comment: ", data)


# Creating an instance of our class.
parser = Parser()

# Providing the input.
# Use parser.feed for input. The adjacent string literals are joined into one string.
parser.feed('<html><title>Desserts</title><body><p>'
            'I am a fan of frozen yoghurt.</p><'
            '/body><!--My first webpage--></html>')
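The handle_starttag method in the example above receives an attrs argument that the example does not use. The sketch below shows what it contains: a list of (name, value) pairs for the tag's attributes. The class name EventRecorder, its events list, and the sample markup are hypothetical and chosen only for illustration. The handlers store what they receive instead of printing it, which makes the result easier to inspect.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical variant that stores parser events instead of printing them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("href", "menu.html")].
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


recorder = EventRecorder()
recorder.feed('<a href="menu.html" class="link">Desserts</a>')
recorder.close()  # Signals that no more data will be fed.

for event in recorder.events:
    print(event)
```

Converting attrs to a dictionary is a convenience chosen here. It assumes attribute names are not repeated within a tag.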