We use sentiment analysis to examine the opinions expressed in a given text, that is, to gauge the sentiment of its author. Large platforms such as Twitter and Facebook use it to screen tweets and status updates for hate speech. The algorithm identifies the sentiment by analyzing patterns of words across lines of text.
The algorithm checks the words against a set of positive and negative words. Using this algorithm, we can also check the magnitude of these sentiments, that is, how positive or negative these words are.
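To make the word-matching idea concrete, here is a minimal sketch in base R. The tiny positive_words and negative_words lists and the score_sentence helper are hypothetical, made up for illustration. Real lexicons are far larger and may weight words differently.

```r
# Hypothetical toy lexicons, for illustration only
positive_words <- c("good", "great", "love", "happy")
negative_words <- c("bad", "awful", "hate", "sad")

# Hypothetical helper: number of positive words minus number of negative words
score_sentence <- function(sentence) {
  # Lower-case the text and split it on anything that is not a letter
  words <- strsplit(tolower(sentence), "[^a-z]+")[[1]]
  sum(words %in% positive_words) - sum(words %in% negative_words)
}

score_sentence("I love this great phone")   # 2
score_sentence("What an awful, bad day")    # -2
```

The sign of the score gives the direction of the sentiment, and its absolute value gives a rough magnitude.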
The algorithm picks up words and performs its computation on them. Tweets are among the most commonly used data for sentiment analysis. They contain words, punctuation, and special characters. Since punctuation contributes little to word-based scoring, we remove all punctuation and special characters from the data before analysis.
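As a rough illustration of what this cleaning does to a single tweet, the sketch below uses only base R regular expressions. The sample tweet is made up, and this is an assumption-level stand-in for the idea, not the method a text-mining package uses internally.

```r
# Hypothetical sample tweet, for illustration only
tweet <- "Great day!!! #happy"

# [[:punct:]] matches punctuation and symbols such as ! and #
no_punct <- gsub("[[:punct:]]", "", tweet)

# [[:digit:]] matches numbers; tolower() normalizes the case
cleaned <- tolower(gsub("[[:digit:]]", "", no_punct))
cleaned   # "great day happy"
```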
We use the tm package to clean the text.
library(tm)
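We use the following commands to create a corpus of tweets for preprocessing. Here, data is the data frame of tweets read from the CSV file, and we assume the tweet text is in its Sentence column.
tweets <- iconv(data$Sentence) # converts the tweet text to a character vector
tweets <- Corpus(VectorSource(tweets)) # wraps the vector in a corpus, which tm_map() requires
We use the following commands to remove all the unnecessary data and clean it.
tweets <- tm_map(tweets, content_transformer(tolower)) # converts the dataset to lower case
tweets <- tm_map(tweets, removePunctuation) # removes punctuation
tweets <- tm_map(tweets, removeNumbers) # removes numbers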
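To see the effect of these cleaning steps on actual text, here is a small self-contained sketch. It assumes a recent version of tm, where tm_map() operates on a corpus and base functions such as tolower are wrapped in content_transformer(). The two sample tweets are hypothetical.

```r
library(tm)

# Hypothetical sample tweets, standing in for the real data
sample_tweets <- c("I LOVE this phone!!!", "Worst. Service. Ever. 0/10")

# tm_map() works on a corpus, so wrap the character vector first
corpus <- VCorpus(VectorSource(sample_tweets))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)

# Pull the cleaned text back out as a plain character vector
cleaned <- unname(sapply(corpus, as.character))
cleaned
```

Printing cleaned next to sample_tweets is a quick way to confirm that each transformation did what we expected before scoring the text.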
We use the syuzhet package to classify emotions and their relative scores. It has an in-built classification algorithm for analyzing emotions.
library(syuzhet)
# Reading the tweet data
data <- read.csv("file_path.csv", header = TRUE)
tweet_lines <- iconv(data$Sentence)
# Calculating the scores using the syuzhet library
scores <- get_nrc_sentiment(tweet_lines)
# Plotting the scores in the form of a bar plot
barplot(colSums(scores), las = 2, ylab = 'Count', main = 'Sentiment Analysis of Tweets')
We use the code above to find out the sentiments of the tweets.
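In this code, file_path is the path to the tweets file. We read the .csv file with read.csv() and convert its Sentence column into a vector of tweets with iconv(). We then use get_nrc_sentiment() from syuzhet to find the sentiments. The result has one column per category, ranging from anger to positive: eight emotions plus the two overall sentiments, negative and positive.
We learned how to preprocess the data to make it suitable for sentiment analysis. We also learned about assigning numeric scores to a sentiment, which lets us know the strength of the emotion.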
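As a quick way to try the scoring step without a CSV file, the sketch below runs get_nrc_sentiment() on two made-up sentences and inspects the shape of the result. The sample sentences are hypothetical, and the exact counts depend on the lexicon shipped with the installed version of syuzhet, so no specific scores are shown.

```r
library(syuzhet)

# Hypothetical sample sentences, standing in for real tweets
sample_lines <- c("I love this sunny day", "I hate waiting in traffic")

# One row per sentence, one column per emotion or sentiment
scores <- get_nrc_sentiment(sample_lines)

colnames(scores)   # the categories, from "anger" to "positive"
colSums(scores)    # total count per category across all sentences
```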