What is constituency grammar in NLP?

Constituents, groups of words that behave as a single unit, are the building blocks of a grammar. In NLP, a grammar governs how a sentence is composed and defines the linear order in which words may occur for the sentence to be syntactically correct.
Two types of grammar are commonly used:

Context-free grammar (CFG), also known as constituency grammar or phrase structure grammar.
Dependency grammar.

What is constituency grammar?

Constituency grammar generates the class of languages called context-free languages (CFLs). It consists of a set of rules, or productions, stating how a constituent can be broken down into smaller constituents, down to the level of individual words.

Constituency grammar is defined by four parameters:

A finite set of non-terminals (aka variables), each denoting a set of strings.
A finite set of terminal symbols (the lexicon), constituting the alphabet of the language considered.
A starting symbol, which is one of the non-terminals.
A list of rules called productions that recursively define the structure of the language. Each rule has the form A → s, where:
1) "A" is a single non-terminal (variable) symbol on the left-hand side of the rule.
2) "s" is a possibly empty sequence of terminals and non-terminals on the right-hand side.
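The four components above can be written down directly as a data structure. The following is a minimal sketch in plain Python, not a standard API: the `CFG` class, its `validate` method, the `derive` helper, and the two-word toy grammar are all hypothetical names and data introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for the four parameters of a context-free grammar."""
    nonterminals: frozenset   # variables
    terminals: frozenset      # lexicon / alphabet
    start: str                # starting symbol
    productions: tuple        # pairs (lhs, rhs); rhs is a tuple of symbols

    def validate(self):
        """Check that the four parts are consistent with each other."""
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a non-terminal")
        if self.nonterminals & self.terminals:
            raise ValueError("a symbol cannot be both terminal and non-terminal")
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not a non-terminal")
            for sym in rhs:  # rhs may be empty, which is allowed
                if sym not in self.nonterminals and sym not in self.terminals:
                    raise ValueError(f"unknown symbol {sym!r}")
        return True


# Toy grammar (illustrative assumption, not taken from a real lexicon).
toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "Pronoun", "Verb"}),
    terminals=frozenset({"I", "sleep"}),
    start="S",
    productions=(
        ("S", ("NP", "VP")),
        ("NP", ("Pronoun",)),
        ("VP", ("Verb",)),
        ("Pronoun", ("I",)),
        ("Verb", ("sleep",)),
    ),
)


def derive(grammar, choose=lambda options: options[0]):
    """Leftmost derivation: keep rewriting the leftmost non-terminal.

    Terminates for the non-recursive toy grammar above; a recursive grammar
    would need a smarter `choose` function to avoid looping forever.
    """
    form = [grammar.start]
    steps = [tuple(form)]
    while any(sym in grammar.nonterminals for sym in form):
        i = next(k for k, sym in enumerate(form) if sym in grammar.nonterminals)
        options = [rhs for lhs, rhs in grammar.productions if lhs == form[i]]
        form[i:i + 1] = choose(options)
        steps.append(tuple(form))
    return steps


toy.validate()
for step in derive(toy):
    print(" ".join(step))
# S
# NP VP
# Pronoun VP
# I VP
# I Verb
# I sleep
```

Each printed line is one rewriting step, which shows how a constituent (S) is segmented into smaller constituents until only words remain.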

Let's go through some grammar rules to make the concept concrete:

Grammar Rule / Production	Description	Example
S → NP VP	A sentence can be composed of a Noun Phrase followed by a Verb Phrase	I + want a vacation
NP → Pronoun	A Noun Phrase can be composed of a pronoun	I
NP → Proper-Noun	A Noun Phrase can be composed of a proper noun	Las Vegas
NP → Det Nominal	A Noun Phrase can be composed of a determiner (Det) followed by a Nominal	a + knight
Nominal → Nominal Noun	A Nominal can be composed of a Nominal followed by a Noun, so several nouns can be chained	morning + flight
PP → Preposition NP	A Prepositional Phrase can be composed of a preposition followed by a Noun Phrase	from + Las Vegas
VP → Verb	A Verb Phrase can be composed of a single verb	make
VP → Verb NP	A Verb Phrase can be composed of a verb followed by a Noun Phrase	book + a flight
VP → Verb NP PP	A Verb Phrase can be composed of a verb followed by a Noun Phrase and a Prepositional Phrase	book + a flight + in the evening
VP → Verb PP	A Verb Phrase can be composed of a verb followed by a Prepositional Phrase	eating + in the evening
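The productions in the table can be tried out with NLTK's `CFG.fromstring` and `ChartParser`. This is a hedged sketch rather than a complete grammar of English. Several parts are assumptions added for the example: the lexical rules that map categories to words, the base case `Nominal -> Noun` (needed so the recursive Nominal rule can terminate), the spelling `ProperNoun` for Proper-Noun, and the single token `Vegas` standing in for "Las Vegas".

```python
import nltk

# Phrase-structure rules follow the table; lexical rules and the
# "Nominal -> Noun" base case are assumptions added for this sketch.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Pronoun | ProperNoun | Det Nominal
Nominal -> Nominal Noun | Noun
PP -> Preposition NP
VP -> Verb | Verb NP | Verb NP PP | Verb PP
Pronoun -> 'I'
ProperNoun -> 'Vegas'
Det -> 'a'
Noun -> 'morning' | 'flight'
Verb -> 'book'
Preposition -> 'from'
""")

parser = nltk.ChartParser(grammar)

# Pre-split tokens, so no tokenizer model needs to be downloaded.
tokens = "I book a morning flight from Vegas".split()
trees = list(parser.parse(tokens))

for tree in trees:
    print(tree)  # bracketed constituency tree rooted at S
```

With this small grammar the prepositional phrase can only attach to the verb phrase (via `VP -> Verb NP PP`), so the sentence receives a single parse. Adding a rule such as `NP -> NP PP` would make it ambiguous.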

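Implementation

The following example uses NLTK's regular-expression chunker to extract noun-phrase constituents from a POS-tagged sentence:

import nltk

# Resources needed by word_tokenize and pos_tag.
# Newer NLTK releases may ask for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng' instead.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

def extract_constituents(sentence):
    # Split the sentence into words and tag each word with its part of speech
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)
    # NP chunk: optional determiner, any number of adjectives, then a singular noun
    grammar = "NP: {<DT>?<JJ>*<NN>}"
    cp = nltk.RegexpParser(grammar)
    # Group the tagged tokens that match the NP pattern into subtrees
    parse_tree = cp.parse(tagged)
    return parse_tree

sentence = "The man eats the apple"
constituents = extract_constituents(sentence)

# Print the tree in bracketed form
constituents.pprint()

# Open a window with a graphical rendering of the tree (requires a display and Tkinter)
constituents.draw()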
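Once the chunk tree is built, the noun phrases can be collected programmatically instead of being read off the printed tree. The sketch below applies the same NP pattern to a hand-tagged version of the sentence, so it runs without downloading any tagger model; the tags are written by hand as an assumption about what the tagger would return.

```python
import nltk

# Hand-written POS tags (an assumption standing in for nltk.pos_tag output).
tagged = [("The", "DT"), ("man", "NN"), ("eats", "VBZ"),
          ("the", "DT"), ("apple", "NN")]

chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN>}")
tree = chunker.parse(tagged)

# Walk the tree and keep only the subtrees labelled NP.
noun_phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
]
print(noun_phrases)  # ['The man', 'the apple']
```

The verb "eats" stays outside any chunk because the grammar defines only an NP pattern; a rule for VP would be needed to group it.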

License: Creative Commons-Attribution NonCommercial-ShareAlike 4.0 (CC-BY-NC-SA 4.0)

Symbol	Type	Example
NP	Noun Phrases	"he," "the boy," "the man with the old black shoes"
VP	Verb Phrases	"walked," "sit down and be quiet"
PP	Prepositional Phrases	"on the floor," "with the paper," "apart from everything said before."
