The FreqDist class gives the frequency distribution of the words in a text, that is, how many times each word occurs. This is particularly helpful in probability calculations, where a frequency distribution counts the number of times that each outcome of an experiment occurs.
FreqDist is available in the nltk library.
FreqDist expects an iterable of tokens. Although a string is iterable, iterating over it yields individual characters rather than word tokens.
The text is therefore passed to a tokenizer first, and the resulting tokens are then sent to FreqDist.
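To make the difference between characters and tokens concrete, here is a minimal sketch. The sample sentence is made up for illustration, and str.split() is used only as a stand-in tokenizer so that the snippet needs no extra NLTK data.

```python
from nltk.probability import FreqDist

# Hypothetical sample text, chosen only for illustration.
sentence = "to be or not to be"

# Passing the raw string: FreqDist iterates over it character by character,
# so the outcomes being counted are single characters (including spaces).
char_dist = FreqDist(sentence)

# Passing a list of tokens: each word is one outcome.
# str.split() is an assumption here, standing in for a real tokenizer.
word_dist = FreqDist(sentence.split())

print(char_dist["o"])   # count of the letter "o"
print(char_dist["to"])  # the word "to" was never seen as an outcome
print(word_dist["to"])  # count of the word "to"
```

Looking up an outcome that was never counted returns 0 rather than raising an error, which is why the character-level distribution reports no occurrences of the word "to".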
To import the tokenizer:
from nltk.tokenize import word_tokenize
To import FreqDist from the library:
from nltk.probability import FreqDist
To create the frequency distribution (here, sentence is a string holding the text):
fdist = FreqDist(word.lower() for word in word_tokenize(sentence))
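Putting the steps together, the following sketch builds a distribution and queries it. The sample sentence is invented for illustration. word_tokenize relies on NLTK's Punkt tokenizer data, which may need to be downloaded separately, so as an assumption of this sketch it falls back to wordpunct_tokenize (which needs no extra data) when that data is missing.

```python
from nltk.probability import FreqDist
from nltk.tokenize import word_tokenize, wordpunct_tokenize

# Hypothetical sample text, chosen only for illustration.
sentence = "The cat sat on the mat. The dog sat too."

try:
    tokens = word_tokenize(sentence)
except LookupError:
    # Punkt data is not installed; use a regex-based tokenizer instead.
    # This fallback is an assumption of the sketch, not part of the original steps.
    tokens = wordpunct_tokenize(sentence)

# Lowercase each token so that "The" and "the" count as the same word.
fdist = FreqDist(word.lower() for word in tokens)

print(fdist["the"])          # number of times "the" occurs
print(fdist.N())             # total number of tokens counted
print(fdist.most_common(1))  # the most frequent token and its count
print(fdist.freq("the"))     # relative frequency: count divided by N()
```

Note that punctuation marks come out of the tokenizer as tokens too, so they are counted alongside the words unless they are filtered out first.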