What is sentiment analysis in R?

Overview

Sentiment analysis examines the opinion expressed in a piece of text, which lets us infer how the author feels about a subject. Social media platforms such as Twitter and Facebook can use it, for example, to screen tweets or status updates for hate speech. The algorithm identifies the sentiment by analyzing patterns of words across the lines of the text.

The algorithm checks the words against a set of positive and negative words. Using this algorithm, we can also check the magnitude of these sentiments, that is, how positive or negative these words are.
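The word-matching idea can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the two word lists are tiny, made-up lexicons, and score_text is a hypothetical helper, not a function from any package. The sign of the score shows the direction of the sentiment, and its size gives a rough magnitude.

```r
# Hypothetical mini-lexicons, for illustration only.
positive_words <- c("good", "great", "love", "happy")
negative_words <- c("bad", "awful", "hate", "sad")

# Hypothetical helper: positive matches minus negative matches.
score_text <- function(text) {
  # Lower-case the text and keep only letters and spaces.
  cleaned <- gsub("[^a-z ]", "", tolower(text))
  # Split the cleaned text into individual words.
  words <- strsplit(cleaned, " +")[[1]]
  sum(words %in% positive_words) - sum(words %in% negative_words)
}

examples <- c(
  "I love this great phone!",
  "Awful service, I hate it.",
  "The parcel arrived on Monday."
)

scores <- vapply(examples, score_text, integer(1), USE.NAMES = FALSE)
print(scores)  # 2 -2 0
```

Real lexicons are far larger and often attach a weight to each word, but the lookup-and-count principle is the same.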

How to do sentiment analysis

The algorithm works on individual words. Tweets are a common source of data for sentiment analysis, and they contain punctuation, numbers, and special characters alongside the words. Since these characters carry little meaning for a word-based approach, we remove them from the data before analysis.

We use the tm package to clean the text.

library(tm)

Syntax

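We use the following commands to create a vector of tweets and wrap it in a corpus for preprocessing. Here, data is assumed to be a character vector of raw tweets.

tweets <- iconv(data) # character vector of tweets
tweets <- VCorpus(VectorSource(tweets)) # tm_map works on a corpus, not on a plain vector

We use the following commands to remove all the unnecessary data and clean it. Base R functions such as tolower must be wrapped in content_transformer so that the corpus structure is preserved.

tweets <- tm_map(tweets, content_transformer(tolower)) # converts the text to lower case
tweets <- tm_map(tweets, removePunctuation) # removes punctuation
tweets <- tm_map(tweets, removeNumbers) # removes numbers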
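The cleaning steps can be tried on a couple of made-up tweets. This is a minimal sketch that assumes the tm package is installed. The sample strings and the names sample_tweets, corpus, and cleaned are illustrative. It builds a corpus from the vector first, because tm_map operates on corpus objects, and it wraps tolower in content_transformer so that the documents keep their structure.

```r
library(tm)

# Made-up sample tweets, for illustration only.
sample_tweets <- c(
  "I LOVE this phone!!! 10/10",
  "Worst. Service. Ever... #fail"
)

# Build a corpus from the character vector.
corpus <- VCorpus(VectorSource(sample_tweets))

# Apply the same three transformations described above.
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)

# Convert the corpus back to a plain character vector and trim stray spaces.
cleaned <- trimws(unname(sapply(corpus, as.character)))
print(cleaned)
```

Converting back to a character vector at the end is convenient when the next step, such as a scoring function, expects plain strings rather than a corpus.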

Libraries

  • We use the syuzhet package to score the emotions in a text. Its get_nrc_sentiment function looks up words in the built-in NRC emotion lexicon and counts the matches for each emotion.

Code

main.r
library(syuzhet)

# Reading the tweet data
data <- read.csv("file_path.csv", header = TRUE)
tweet_lines <- iconv(data$Sentence)

# Calculating the scores using the syuzhet library
scores <- get_nrc_sentiment(tweet_lines)

# Plotting the scores in the form of a bar plot
barplot(colSums(scores), las = 2, ylab = 'Count', main = 'Sentiment Analysis of Tweets')

Explanation

We use the code above to find out the sentiments of the tweets.

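  • Line 1: We import the syuzhet package.
  • Line 4: We read the data from file_path.csv, which is the path to the file of tweets.
  • Line 5: We extract the Sentence column of the .csv data as a character vector of tweets.
  • Line 8: We use get_nrc_sentiment from syuzhet to score each tweet. The result has one column per category: eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust) and two sentiments (negative and positive).
  • Line 11: We sum the scores from line 8 for each category and draw the totals as a bar plot.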
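To experiment without a .csv file, the same scoring step can be run on an inline vector. This is a small sketch that assumes the syuzhet package is installed. The three sentences are made up and stand in for the Sentence column. It prints the shape and column names of the result rather than specific scores, since the scores depend on the lexicon entries.

```r
library(syuzhet)

# Made-up sentences standing in for the Sentence column of the .csv file.
tweet_lines <- c(
  "I love this phone, it makes me so happy",
  "This is the worst service, I am furious",
  "The parcel arrived on Monday"
)

# One row per tweet, one column per emotion or sentiment.
scores <- get_nrc_sentiment(tweet_lines)
print(dim(scores))
print(colnames(scores))

# Column totals: these are the heights of the bars in the bar plot.
totals <- colSums(scores)
print(totals)
```

Inspecting colnames(scores) before plotting is a quick way to confirm which categories the bar plot will show.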

Conclusion

We learned how to preprocess the data to make it suitable for sentiment analysis. We also learned about assigning numeric scores to a sentiment, which lets us know the strength of the emotion.

Copyright ©2025 Educative, Inc. All rights reserved