Mining Twitter for the Next US President

Sung Yeon Choi
20 min read · Dec 12, 2020

EE380 Data Mining Project at UT Austin

by Allen-Jasmin Farcas, Aishwarya Rajen, Amodh Kant Saxena, Li-Heng Chen, Sung Yeon Choi

In this project, we predicted the outcome of the 2020 US presidential election using sentiment analysis on Twitter data. We collected more than 2.5 million tweets from datasets already available on the internet, using the Twitter API. Predicting on real-world data such as tweets comes at the expense of not having ground-truth labels. To solve this problem, we used existing sentiment analysis tools and a surrogate approach to label our dataset. For our own sentiment analysis, we trained multiple Machine Learning (ML) models, namely Naive Bayes, Support Vector Machines (SVMs), Random Forests (RF), and Recurrent Neural Networks (RNNs), and studied the tradeoffs between the models, analyzing why some perform better than others. The popularity of both the Republican and Democratic candidates was evaluated using the sentiments of the tweets. This allowed us to correctly predict, based on Twitter data, the 46th president of the United States.

Introduction and Background

In the current era of the internet, social media has become a major channel of communication and a popular platform for sharing one's thoughts and opinions. As we confront a pandemic, with stay-at-home warnings, the only thing keeping us in touch with the "real" world is social media. Compared to previous years, its potential for significant social, ecological, and economic impact has grown considerably. Twitter, in particular, is a prominent social media platform where a large amount of political news, political visions, and various political ideologies are shared and debated. The fact that Twitter has its biggest audience in the U.S.A., with more than 68 million users [10], made us curious to see whether the outcome of the presidential election could be predicted using Twitter data. Many people were expressing their opinions on Twitter regarding the good and bad characteristics of the presidential candidates. This is why we wanted to analyze the sentiments of tweets related to the presidential election: it would give us a good perspective on which political party and candidate has more support or criticism, and let us predict the winner of this election.

In recent years, many analyses and predictive models based on social media data [1][2][3] have been proposed to address different scenarios. To the best of our knowledge, only a few works are related, though not directly comparable, to our project. In [4][5], a detailed sentiment analysis is performed on the 2016 US elections. The authors evaluate the sentiments of the Twitter population using the SentiStrength algorithm [11], creating different hypotheses and visualizations to better understand the data and the predicted sentiments. A similar analysis was conducted in [6] on datasets from both the 2016 US elections and the 2017 UK elections, using existing Natural Language Processing libraries and location-based analysis. Inspired by these previous works, we decided to use sentiment analysis as a key ingredient in predicting this US presidential election.

Overview

Fig 1. Overview of our approach towards predicting US presidents using sentiment analysis

Our goal in this project is to predict which party (i.e., Democratic or Republican) has a higher probability of winning the election based on the number of tweets that support or oppose them. It is challenging to predict directly from the text of a tweet which party a person supports. Yet, according to previous works, it is much easier to predict the sentiment of a sentence (or a tweet in this case), identifying whether it is Positive, Negative, or Neutral. Hence, we use a "two-step" approach for each tweet: we first filter the tweets into two categories (i.e., Democratic and Republican) based on keywords, and then use the sentiment predicted from each tweet's text to identify which party the author supports or opposes. As shown in the example below, a datapoint from Twitter is first categorized as a "Democratic tweet" based on keywords such as Twitter handles and name tags (e.g., @JoeBiden). Then, a sentiment analysis model classifies it as a positive tweet. In this case, the tweet is counted as a "vote for Biden" because it is positive with respect to the Democratic party, and Joe Biden is the candidate from the Democratic party.

Fig 2. Core concept of sentiment analysis in this project

The tweets related to the election were gathered from two publicly available datasets [12, 13]. The Twitter API was used to extract the required tweet text and other information such as location and retweet count, to name just a few. The data is preprocessed and labeled using existing sentiment analysis models, namely TextBlob, SentiStrength, and VADER. Using a novel surrogate labelling approach, we generate the ground-truth labels for training our own sentiment classification models: Naive Bayes, SVM, RF, and RNN. Finally, the tweets are filtered and predictions are made on who is most likely to be the next US president. Next, we detail the whole process of data collection, data preprocessing, data labelling, classification models, and prediction shown in Fig. 1.

To get past the problem of having a huge unlabeled dataset with over 2.5 million tweets, we decided to use existing sentiment classifiers to label the dataset with a majority-voting approach. This enabled us to define our own models and train them on this dataset; it was a real-world data mining problem and we successfully overcame it. Another novel step was preprocessing the tweets to purge the dataset of possible bot tweets and irrelevant data.

Data Collection

We will now introduce the two datasets used in this project and refer to them as follows: [12] as the IEEE dataset and [13] as the GitHub dataset. The GitHub dataset contains tweets related to the 2020 US presidential election starting from June 2020. The IEEE dataset contains election-related tweets from July 2020 to October 2020, but only had tweets up to August 2020 at the time of our experiment. We combined the two datasets to cover a longer period of time (June 2020 to October 2020) and, to restrict the possible overlap between them, we merged them using complementary timestamp ranges.

Twitter's regulations on tweet data prevent any dataset from exposing anything other than tweet IDs. Hence, the IEEE and GitHub datasets only contained tweet IDs, and we had to use the Twitter API to extract the text data we needed. We used the Tweepy Python library to extract the data as follows.

import tweepy
import pandas as pd

# Authenticate to the Twitter API
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True,
                 parser=tweepy.parsers.JSONParser())

try:
    api.verify_credentials()
    print("Authentication OK")
except:
    print("Error during authentication")
    raise

# Hydrate the tweet IDs in chunks (statuses_lookup accepts at most 100 IDs per call)
for chunk in pd.read_csv(filename, chunksize=chunksize):
    ids = chunk["Id"].tolist()
    data = api.statuses_lookup(id_=ids, tweet_mode="extended")

Note that you need a Twitter Developer account to obtain the credentials required to execute the code above (such as the consumer key and access token). Refer here to get an account and access the Twitter API. A few things to note about the code above: the Twitter API supports a maximum of 100 IDs per request and 900 requests per window (15 minutes), so retrieving a large amount of data takes time. You also want to make sure you use extended mode to get the full text. After data collection we had about 8 million tweets, whose characteristics are visualized below as word clouds.

Fig 3. Wordcloud of retrieved tweets: All tweets (Left), Republican (Middle), and Democrats (Right)

We will now explain how we preprocessed the data and labelled the ground truth for our ML sentiment analysis.

Data Preprocessing

Fig 4. Simplified tweet structure

Data Reduction — We took a couple of data reduction steps to shrink our initial set of 8 million tweets. First, we removed any tweets that were not in English, as they would confuse our ML models. Then, we removed retweets that did not add any new content, since these would only feed repetitions of the same tweets to our models. For the remaining tweets, we stripped all unnecessary fields from the large tweet JSON to end up with the structure in Fig. 4.
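Below is a minimal sketch of this reduction step, assuming each tweet is the raw JSON dictionary returned by the Twitter API. The exact fields we kept follow Fig. 4; the ones shown here are illustrative, and we treat plain retweets (those carrying a retweeted_status field) as having no additional content.

def keep_tweet(tweet):
    if tweet.get("lang") != "en":      # drop non-English tweets
        return False
    if "retweeted_status" in tweet:    # drop plain retweets with no added content
        return False
    return True

reduced = [
    {
        "id": t["id"],
        "text": t.get("full_text", ""),
        "user": t["user"]["screen_name"],
        "created_at": t["created_at"],
        "retweet_count": t.get("retweet_count", 0),
    }
    for t in raw_tweets
    if keep_tweet(t)
]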

Bot Detection — As part of the data reduction, we also carried out bot detection, since Twitter hosts many bots that post similar messages periodically. We classified as bots any accounts that posted more than 150 tweets on any single day, and removed them, ending up with ~2.5 million tweets.

Fig 5. Bot detection method
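A minimal sketch of this filter, assuming a pandas DataFrame with "user" and "created_at" columns built from the reduced tweets (the column names are illustrative):

import pandas as pd

df["date"] = pd.to_datetime(df["created_at"]).dt.date
daily_counts = df.groupby(["user", "date"]).size()
bot_users = daily_counts[daily_counts > 150].index.get_level_values("user").unique()
df = df[~df["user"].isin(bot_users)]   # drop all tweets from suspected bot accounts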

Labelling ground truth

We chose to classify the tweet texts into three sentiments: Positive, Negative, and Neutral. To obtain ground-truth sentiment labels for training our own supervised machine learning classifiers, we made use of three prevalent off-the-shelf models: SentiStrength, TextBlob, and VADER.

SentiStrength — SentiStrength is a popular tool which has been used in several Twitter sentiment analysis works and is known to have good accuracy for tweets, as the software was specifically developed for short informal texts. It is downloadable software that is free for academic use [11]. SentiStrength reports two sentiment strengths:

  • Sent(-ve) : -1 (not negative) to -5 (extremely negative)
  • Sent(+ve): 1 (not positive) to 5 (extremely positive)

Sentiment(total) = Sent(+ve) + Sent(-ve)

Fig 6. SentiStrength software GUI (Left) and label criteria we used for SentiStrength (right)

One downside of SentiStrength is that the code version is only available to paid users, and free users have to rely on the GUI version: the tweet text had to be exported, run through the SentiStrength GUI manually, and merged back with its labels in separate scripts. Nonetheless, we use the sum of the positive and negative sentiment values given by the software to compute the overall sentiment label of a sentence. Based on our empirical observations on a small sample set of tweets, we used the criteria in the table above to label our tweets.
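As a minimal sketch of this mapping (the exact cutoffs we used are the ones in Fig. 6; simple sign-based thresholds are assumed here for illustration):

def sentistrength_label(sent_pos, sent_neg):
    # sent_pos is in [1, 5], sent_neg is in [-5, -1]
    total = sent_pos + sent_neg
    if total > 0:
        return "Positive"
    if total < 0:
        return "Negative"
    return "Neutral"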

Fig 7. Label criteria for TextBlob

Textblob — TextBlob is a Python library that provides a simple API for many natural language processing tasks. It is a dictionary-based model and predicts both the polarity and the subjectivity of a sentence using the PatternAnalyzer. The polarity is a float in the range [-1, 1], where 1 means a positive statement and -1 means a negative statement. The subjectivity is also a float, in the range [0, 1]. Subjective sentences generally refer to personal opinion, emotion, or judgment, whereas objective sentences refer to factual information.

The sentiment attribute of TextBlob returns both the polarity and the subjectivity of the text; for example, one of our tweets received a polarity of -0.45, indicating negative polarity.
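A minimal sketch of the call (with an illustrative sentence, not one from our dataset) looks like this:

from textblob import TextBlob

blob = TextBlob("The debate last night was a complete disaster")  # illustrative sentence
print(blob.sentiment)            # Sentiment(polarity=..., subjectivity=...)
print(blob.sentiment.polarity)   # negative values indicate a negative statement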

Fig 8. Label criteria for Vader

Vader — VADER (Valence Aware Dictionary and sEntiment Reasoner) is a model specifically tuned for sentiments expressed in social media. It is a lexicon- and rule-based sentiment analysis tool similar to TextBlob, using a dictionary of stored words to determine the overall polarity of a sentence. The 'compound' field returned by the polarity_scores method of the SentimentIntensityAnalyzer gives the sentiment of the text. It is computed by adding the valence scores of each word in the lexicon, adjusted according to the rules, and normalized to lie between -1 (most extreme negative) and +1 (most extreme positive). The sentiment of the text is then chosen based on the criteria in Fig. 8. The following code shows how VADER can be used in Python.

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
# polarity_scores returns 'neg', 'neu', 'pos' and a normalized 'compound' score in [-1, 1]
polarity = analyzer.polarity_scores(text)
compound = polarity['compound']

Surrogate Labelling — As previously mentioned, labelling millions of tweets manually is not a feasible option, so we use a surrogate labelling approach to relax the problem. As shown in Fig. 9 below, only the "unconfident" data in the training set needs to be (partially) labeled by humans.

Fig 9. Surrogate labelling approach

In most cases the tweets can be labeled by the surrogate models, which we call expert predictors in Fig. 10. That table also shows the confidence of the data. The labels predicted by the experts can be used directly if all three experts predict the same label, since the confidence of the prediction is then high. For about 60% of our dataset, two models predicted the same label while the third predicted differently; we handled this by conducting a majority vote. When all three experts predict different labels for the same tweet, the respective data point is labeled manually or omitted due to lack of confidence. As for the test set, all the data should ideally be labeled by humans to best estimate the true accuracy. But since there were more than 100,000 low-confidence tweets, it was not feasible to manually label all of them, so we resorted to omitting them for this project.

Fig 10. Surrogate labelling criteria
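A minimal sketch of this rule, assuming one predicted label per expert (SentiStrength, TextBlob, VADER) for a given tweet:

from collections import Counter

def surrogate_label(expert_labels):
    label, votes = Counter(expert_labels).most_common(1)[0]
    if votes >= 2:      # all three agree, or two out of three (majority vote)
        return label
    return None         # all three disagree: label manually or omit

surrogate_label(["Positive", "Positive", "Neutral"])   # -> "Positive"
surrogate_label(["Positive", "Negative", "Neutral"])   # -> None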

Preparation to Model Sentiments

Now we have the tweet texts to act as input features and the sentiment labels to act as the dependent variable for our supervised machine learning models. The next step is to translate the sentences into suitable feature sets to feed to our models. For this, we first need to denoise the tweets, as social media texts contain many name mentions, URL links, hashtags, smileys, and other symbols that do not contribute to sentiment analysis. Also, to make the words uniform, we convert all tweets to lowercase.

import re

tweet = re.sub(r'@[^\s]+', '', tweet)                                # remove @mentions
tweet = re.sub(r'((www\.[^\s]+)|(https?://[^\s]+))', 'URL', tweet)   # replace links
tweet = tweet.lower()                                                # lowercase
tweet = re.sub(r'[^a-zA-Z]+', ' ', tweet)                            # keep letters only
tweet = re.sub(r' +', ' ', tweet)                                    # collapse whitespace
tweet = tweet.strip()                                                # strip() returns a new string

Bag of Words — After denoising, the next step is to tokenize the sentences into words to create a Bag of Words (BOW) model. The BOW model converts text into numbers by taking into consideration the number of times a particular word is repeated in the dataset. We use the NLTK library’s Regex Tokenizer and Scikit-learn’s CountVectorizer to create the input vectors.

from sklearn.feature_extraction.text import CountVectorizer
from nltk.tokenize import RegexpTokenizer

token = RegexpTokenizer(r'[a-zA-Z0-9]+')
cv = CountVectorizer(stop_words='english', ngram_range=(1, 1), tokenizer=token.tokenize)
text_counts = cv.fit_transform(data['text'])

The CountVectorizer converts a collection of text documents into a matrix of token counts. The implementation produces a sparse matrix representation of the counts, which can be fed as input features to our machine learning models. English sentences contain many common, frequently repeated words such as "the", "a", "it", "is", and so on, known as stop words. These words do not contribute to the prediction of sentiment and hence can be omitted, so we ask CountVectorizer to drop the common English stop words already built into the Scikit-learn package. The parameter ngram_range=(1,1) indicates splitting a sentence into unigrams (individual words). There is an option to split the sentence into bigrams or trigrams as well, but our models achieved the best accuracy with unigrams, so we used them.

TF-IDF — We also experimented with the TF-IDF model. TF-IDF stands for term frequency-inverse document frequency, and the TF-IDF weight is often used in information retrieval and text mining. It is a statistical measure of how important a word is to a document within a collection: the weight grows with how often the word appears in the document, but is discounted for words that appear in many documents.
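In a common formulation, the weight of a term t in a document d is tf-idf(t, d) = tf(t, d) × log(N / df(t)), where tf(t, d) is the number of times t occurs in d, N is the number of documents, and df(t) is the number of documents containing t; scikit-learn's TfidfVectorizer uses a smoothed variant of the idf factor and, with norm="l2", normalizes each document vector to unit length.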

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer(stop_words="english", norm="l2")
tf_input = tfidf.fit_transform(data['text'])

Test-Train split — Once the feature sets have been generated, they and the sentiment labels are split into training and test datasets, which are used to train and evaluate our classification models. We used a 25% test split, which gave us nearly 2M tweets for training our models. The following example splits the BOW features into train and test sets.

from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
X_train, X_test, Y_train, Y_test = train_test_split(text_counts,data['Sentiment'], test_size=0.25, random_state=5)

Learning to Model Sentiments

Naive Bayes — This model applies the Bayes theorem with a naive assumption of conditional independence between input features. It is a simple yet powerful ML model used to solve classification problems. Naive Bayes is known to be particularly successful using the BOW representation as input and hence we tried using it for our sentiment classification.

We used Scikit-learn's Naive Bayes models. Since our training dataset was very large (nearly 2 million tweets), we did the training in small batches (online learning) to reduce the training time and memory footprint. We experimented with the Multinomial Naive Bayes (MNB) and Bernoulli Naive Bayes (BNB) models, both of which support online learning in Scikit-learn. For both models, we performed a grid search over the alpha parameter, which controls additive smoothing (from 0 to 1), and over whether or not the class priors are learned; this ended up selecting alpha = 1 and learned class priors for both models. Note that this grid search was only performed on the first batch, for the same computational reasons as before. We report here the results from the BOW input, as it performed better.
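A minimal sketch of the online (mini-batch) training loop, assuming X_train, Y_train, X_test, and Y_test from the earlier split; the batch size here is illustrative.

import numpy as np
from sklearn.naive_bayes import MultinomialNB

mnb = MultinomialNB(alpha=1.0)          # fit_prior=True (the default) learns the class priors
classes = np.unique(Y_train)
y = np.asarray(Y_train)
batch_size = 100_000
for start in range(0, X_train.shape[0], batch_size):
    end = start + batch_size
    mnb.partial_fit(X_train[start:end], y[start:end], classes=classes)
print(mnb.score(X_test, Y_test))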

MNB — Multinomial Naive Bayes is one of the most widely used Naive Bayes methods for text classification. Unlike BNB, this model works directly with word-count features rather than binary presence/absence features, which is why it achieves a higher accuracy than BNB.

Fig 11. MNB results

BNB — For Bernoulli Naive Bayes, the samples need to be represented as binary-valued feature vectors. Due to this constraint of the classifier, the accuracy suffers because the count features have to be binarized to suit the model.

Fig 12. BNB results

Support Vector Machine — A large text dataset with a dense vocabulary seems the perfect playground for a Support Vector Machine, which leverages the high-dimensional space to achieve higher accuracy. To prevent overfitting, we use the L2 (ridge) regularization penalty. For training, we used Stochastic Gradient Descent (SGD) via the SGDClassifier module from the Scikit-learn library. The SVM is a good discriminative classifier and gives a fairly high overall accuracy of 87%, as shown in the classification report.
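A minimal sketch of this setup, assuming the features and split from earlier; with the hinge loss, SGDClassifier trains a linear SVM, and penalty='l2' is the ridge penalty.

from sklearn.linear_model import SGDClassifier

svm = SGDClassifier(loss='hinge', penalty='l2', random_state=5)
svm.fit(X_train, Y_train)
print(svm.score(X_test, Y_test))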

Fig 13. SVM results

Random Forest — Random Forest uses an ensemble of Decision Tree classifiers. Each of the Decision Trees in the ensemble is trained on Bootstrap replicates of the dataset. Equally weighted averaging is performed over all the Trees to classify a test sample.

We used Scikit-learn's Random Forest classifier. We performed a grid search over the number of trees in the ensemble from the set {50, 100, 500, 1000} to find the optimum for the Random Forest classifier; the best results were achieved with 1000 trees. The classifier was trained on batches of 100,000 tweets. We report here the results from the TF-IDF input, as it performed better.
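A minimal sketch of the grid search over the number of trees, assuming the TF-IDF features and labels from the earlier split; the cv value here is illustrative.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {'n_estimators': [50, 100, 500, 1000]}
grid = GridSearchCV(RandomForestClassifier(random_state=5), param_grid, cv=3)
grid.fit(X_train, Y_train)
print(grid.best_params_)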

Fig 14. Random Forests results

Recurrent Neural Network (RNN) — When we are reading a sentence, we are processing it word by word, while keeping memories of what came before. The Recurrent neural network (RNN) adopts the same principle, but in an extremely simplified version: it processes sequences by iterating through the sequence elements and maintaining a state containing information relative to what it has seen so far, therefore it has an internal loop. When processing two distinct tweets, the state of the RNN is reset, so each tweet will be considered as a single data point. What changes is that this data point is no longer processed in a single step, rather, the network internally loops over the sequence of elements.

Fig 15. RNN concept

One-hot encodings are sparse, high-dimensional, and hardcoded, whereas word embeddings are dense, lower-dimensional, and learned from data, essentially mapping human language into a geometric space. To save memory and computational cost, we use word embeddings for the RNN. We used GloVe embeddings [14] pretrained on 2 billion tweets with a vocabulary of 1.2 million words, and chose the 25-dimensional embeddings due to hardware constraints that limited the amount of processing we could do.

import numpy as np

# Parse the GloVe file into a {word: vector} lookup
embeddings_index = {}
with open(filename) as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs

# Build the embedding matrix for the max_words most frequent tokens
embedding_dim = 25
embedding_matrix = np.zeros((max_words, embedding_dim))
for word, i in word_index.items():
    if i < max_words:
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            embedding_matrix[i] = embedding_vector

We used the TensorFlow framework to implement the RNN, with three output units, one for each predicted label. Sparse categorical cross-entropy was used as the loss because our labels are integer class indices. We also used RMSprop as the optimizer, which keeps a moving average of the squared gradients and divides the current gradient by the root of this average for normalization.

import tensorflow as tf
from tensorflow.keras.layers import Dense, SimpleRNN, Embedding
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Embedding(max_words, embedding_dim))
model.add(SimpleRNN(128))
model.add(Dense(3, activation='softmax'))   # one output per class: Neutral, Positive, Negative
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()]
)

In order to keep the embedding weights from the pretrained file, we set them to be untrainable.

model.layers[0].set_weights([embedding_matrix])
model.layers[0].trainable = False   # freeze the pretrained GloVe embeddings
# (if the model was already compiled, re-compile for the frozen layer to take effect)

Below are the results from the RNN, where class 0 is Neutral, class 1 is Positive, and class 2 is Negative. As can be seen, the results are quite promising compared to the other methods. One of the biggest benefits is the use of word embeddings; another is the recurrent connections, which provide a short-term memory over all the words in a tweet.

Fig 16. RNN results

Analysis on model and results

Model — The Naive Bayes models performed well, but due to the limitation on the features it can use, the BNB model performed slightly worse than the MNB model: 75% accuracy compared to MNB's 82%. The Random Forest model, thanks to its larger capacity, performs slightly better than the Naive Bayes models, achieving 83% accuracy. Even though SVMs are mostly used for two-class problems, the SVM performed better still, achieving 86% accuracy despite the added complexity of handling three classes. With the use of word embeddings, a fairly small RNN "easily" achieved 93% accuracy using a 25-dimensional feature space as input for the word embeddings. This aligns with the current state of the art in ML, where deep neural networks such as RNNs (or, more recently, Transformers) dominate across various tasks. This is not to discourage the other classifiers; they may perform better in different circumstances, but for this sentiment analysis task the RNN is a clear winner.

Data and results — We filtered our tweet dataset into two groups (Democratic and Republican) based on common keywords and usernames likely to be used in the tweets. The keywords used for the filtering are listed below, followed by a sketch of how such a filter might look:

keywords_dem = ["@dnc", "@thedemocrats", "dnc", "biden", "BIDEN", "joe", "dems", "democrat", "democratic", "hilary", "clinton", "@joebiden", "@kamalaharris", "@senkamalaharris", "@mikebloomberg", "our best days still lie ahead", "no malarkey", "harris", "bidenharris", "creepyjoebiden", "sleepyjoe", "sleepy joe", "biden-harris", "kamala", "dr.biden", "voteblue", "blue"]
keywords_rep = ["#maga2020", "@gop", "gop", "trump", "TRUMP", "@potus", "@realdonaldtrump", "republican", "republicans", "pence", "@mike_pence", "@vp", "keep america great", "potus", "flotus", "donaldtrump", "donaldjtrump", "donald", "president", "presidenttrump", "trump2020", "votered", "mike", "pence", "michael", "Pence", "mikepence", "red"]
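A minimal sketch of the keyword filter; how tweets matching both keyword lists were handled is an assumption here, and we mark them "Generic" for illustration.

def classify_party(text):
    t = text.lower()
    dem = any(k.lower() in t for k in keywords_dem)
    rep = any(k.lower() in t for k in keywords_rep)
    if dem and not rep:
        return "Democratic"
    if rep and not dem:
        return "Republican"
    return "Generic"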

After the filtering operation we ended up with 1.5M+ tweets related to the Democrats, 70K+ tweets related to the Republicans, and the remaining tweets were 'Generic' ones, in the sense that they could have been news or facts related to the election without really referring to a particular party. We ignored the generic tweets for our prediction of the president. It makes sense that there are far more tweets talking about the Democrats than the Republicans, as Trump is the sitting president and a lot more people are talking about his work.

Fig 17. Tweets Distribution

The following are a few tweets on each party and their predicted sentiments.

Fig 18. Sample tweets from each party

And finally, we turn each tweet into a predicted vote for a party based on our initial concept — party + sentiment:

Fig 19. Sentiment of all tweets (Left) and Result of predicted votes (Right)

As seen in Fig. 19, the Democratic party has a larger number of positive-sentiment tweets than the Republicans. It is also interesting to note that a majority of the tweets were classified as Neutral by our sentiment analysis models. Positive tweets about the Democratic party and negative tweets about the Republican party were counted as votes for the Democratic party, and vice versa. By doing this we see that the Democrats have roughly 5% more support than the Republicans. This is consistent with the fact that the election was a tight race, with Joe Biden being elected President in the end!
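A minimal sketch of this final tally, assuming a DataFrame with "party" and "sentiment" columns produced by the filtering and classification steps above (the column names are illustrative):

dem_votes = (((df["party"] == "Democratic") & (df["sentiment"] == "Positive")).sum()
             + ((df["party"] == "Republican") & (df["sentiment"] == "Negative")).sum())
rep_votes = (((df["party"] == "Republican") & (df["sentiment"] == "Positive")).sum()
             + ((df["party"] == "Democratic") & (df["sentiment"] == "Negative")).sum())
print("Democratic votes:", dem_votes, "Republican votes:", rep_votes)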

Concluding Remarks

In this work, we framed the election prediction problem as a sentiment analysis problem. We focused on processing raw Twitter data and devised a novel surrogate approach to label the ground truth. We also demonstrated the fairly good accuracy achieved by various classifiers of different complexity. Interestingly, the outcome of our framework predicts the 2020 US presidential election quite accurately.

Rather than just using an existing and usually "clean" dataset, we spent most of our effort on collecting data from scratch. We believe this is the part of the project where we learned the most. We first learned how tricky Twitter data is to work with, especially the difficulty of scraping tweets, and ended up finding existing datasets. These datasets, as they pertain to a (nearly) current event, were updated frequently, and we had to decide at which point to stop and freeze the datasets we used. Another challenge was visualizing these datasets and the results from the models; while we had ideas of our own and from other related works, we simply did not have time to implement most of them. Collaborating in a five-person team was also an interesting and challenging experience: unlike the usual group size of 1–2 people, organization and division of work become much more important.

Despite the successes we achieved in this project, we believe there is room for improving the model accuracy. For example, more sophisticated machine learning models, such as long short-term memory (LSTM) networks, could be investigated. We also plan to seek new features that capture essential information. Looking further ahead, labeling the ground truth of massive amounts of Twitter data is still an unsolved problem. One possible solution is to partially label the "unconfident" tweets by conducting subjective tests via online crowdsourcing, such as Amazon Mechanical Turk (AMT). Yet, this could be too time- and money-consuming at the scale of this final project.

References

[1] E. Kušen and M. Strembeck, “Politics, sentiments, and misinformation: An analysis of the Twitter discussion on the 2016 Austrian Presidential Elections,” Online Social Networks and Media, Vol. 5, Mar. 2018, pp. 37–50.

[2] K. Sarkar, “Sentiment Polarity Detection in Bengali Tweets Using Deep Convolutional Neural Networks,” Journal of Intelligent Systems, Vol. 28, Issue 3, Mar 13, 2018.

[3] H. Sebei, M. A. H. Taieb, and M. B. Aouicha, “Review of social media analytics process and Big Data pipeline,” Social Network Analysis and Mining, Vol. 8, No. 30, Sep. 2018.

[4] U. Yaqub, S. A. Chun, V. Atluri, and J. Vaidya, “Sentiment based Analysis of Tweets during the US Presidential Elections,” In Proc. 18th Annual International Conference on Digital Government Research, Association for Computing Machinery, New York, NY, USA, 1–10. 2017.

[5] U. Yaqub, S. A. Chun, V. Atluri, and J. Vaidya, “Analysis of political discourse on twitter in the context of the 2016 US presidential elections,” Government Information Quarterly, Vol. 34, Issue 4, Dec. 2017, pp. 613–626.

[6] “Textblob: Simplified Text Processing.” [Online]. Available: https://textblob.readthedocs.io/en/dev/

[7] K. Yadav, “Predicting US Presidential Election Result Using Twitter Sentiment Analysis with Python.” [Online]. Available: https://medium.com/datadriveninvestor/predicting-us-presidential-election-using-twitter-sentiment-analysis-with-python-8affe9e9b8f

[8] S. Daityari, “How To Perform Sentiment Analysis in Python 3 Using the Natural Language Toolkit (NLTK).” [Online]. Available: https://www.digitalocean.com/community/tutorials/how-to-perform-sentiment-analysis-in-python-3-using-the-natural-language-toolkit-nltk

[9] K. Ma, Q. Wu, Z. Wang, Z. Duanmu, H. Yong, H. Li, and L. Zhang, “Group MAD Competition − A New Methodology to Compare Objective Image Quality Models,” in Proc. IEEE Conf. Comput. Vision Pattern Recog., Jun. 2016.

[10] “Twitter: most users by country.” [Online]. Available: https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/

[11] “SentiStrength.” [Online]. Available: http://sentistrength.wlv.ac.uk

[12] “USA Nov. 2020 election dataset.” [Online]. Available: https://ieee-dataport.org/open-access/usa-nov2020-election-20-mil-tweets-sentiment-and-party-name-labels-dataset

[13] “2020 US Presidential Election Tweet IDs.” [Online]. Available: https://github.com/echen102/us-pres-elections-2020

[14] “GloVe: Global Vectors for Word Representation.” [Online]. Available: https://nlp.stanford.edu/projects/glove/
