Text Classification & Sentiment Analysis
Muhammad Atif Qureshi
Arjumand Younus
2
Contents
● An Introduction to Text Classification
– Text Classification Examples
– Text Classification Methods
● Naive Bayes
– Formalization
– Learning
● Applications of Sentiment Analysis
● Baseline Algorithm for Sentiment Analysis
● Sentiment Lexicons
● Sentiment Analysis for the Political Domain (Personal Research)
3
Text Classification Examples
● News filtering and organization
● Document organization and retrieval
● Sentiment analysis/Opinion mining
● Email classification and spam filtering
● Authorship attribution
4
Spam Classification Example
Slide borrowed from the Coursera lectures on “Natural Language Processing”
by Prof. Dan Jurafsky
5
Text Classification
● A set of training documents D = {d1,....,dN} such that each
record is labeled with a class value c from C = {c1,....,cJ}
● Features in the training data are related to the labels by means of a
classification model
● The classification model helps predict the label for an unknown
(test) record
● In text classification, the model uses text-based features
6
Text Classification Methods
● Hand-coded rules
● Supervised machine learning
– Naive Bayes
– Logistic regression
– Support vector machines
– K-nearest neighbors
7
Naive Bayes
● Simple (“naive”) classification method based on Bayes' rule
● Relies on a simple document representation, namely the bag of
words
I love this movie. It's sweet but with satirical humor. The
dialogue is great and the adventure scenes are great
fun...It manages to be whimsical and romantic while
laughing at the conventions of the fairy tale genre. I
would recommend it to just about anyone. I've seen it
several times as I love it so much, and I'm always
happy to see it again whenever I have a friend who
hasn't seen it yet.
8
Bag of Words Representation:
Subset of Words
I love this movie. It's sweet but with satirical humor. The
dialogue is great and the adventure scenes are great
fun...It manages to be whimsical and romantic while
laughing at the conventions of the fairy tale genre. I
would recommend it to just about anyone. I've seen it
several times as I love it so much, and I'm always
happy to see it again whenever I have a friend who
hasn't seen it yet.
great 2
love 2
recommend 1
laugh 1
happy 1
..... ....
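The counting step behind this table can be sketched in a few lines of Python. This is only a minimal illustration: the regular-expression tokenizer and the function name `bag_of_words` are assumptions, not part of the slides. The slide lists "laugh", which suggests the words were stemmed; this sketch does no stemming, so it counts "laughing" instead.

```python
import re
from collections import Counter

REVIEW = (
    "I love this movie. It's sweet but with satirical humor. The "
    "dialogue is great and the adventure scenes are great "
    "fun...It manages to be whimsical and romantic while "
    "laughing at the conventions of the fairy tale genre. I "
    "would recommend it to just about anyone. I've seen it "
    "several times as I love it so much, and I'm always "
    "happy to see it again whenever I have a friend who "
    "hasn't seen it yet."
)


def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


counts = bag_of_words(REVIEW)
# Word order is discarded; only the per-word counts remain.
subset = {w: counts[w] for w in ("great", "love", "recommend", "happy")}
print(subset)
```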
9
Bayes' Rule Applied to Documents
and Classes
● For a document d and a class c:
$$P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$$
10
Naive Bayes Classifier (1/3)
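$$
\begin{aligned}
c_{MAP} &= \operatorname*{argmax}_{c \in C} P(c \mid d) \\
        &= \operatorname*{argmax}_{c \in C} \frac{P(d \mid c)\,P(c)}{P(d)} \\
        &= \operatorname*{argmax}_{c \in C} P(d \mid c)\,P(c)
\end{aligned}
$$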
11
Naive Bayes Classifier (2/3)
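$$
\begin{aligned}
c_{MAP} &= \operatorname*{argmax}_{c \in C} P(d \mid c)\,P(c) \\
        &= \operatorname*{argmax}_{c \in C} P(x_1, x_2, \ldots, x_n \mid c)\,P(c)
\end{aligned}
$$
● Document represented as features x1....xn
● P(c): How often does this class occur? We can just count the relative
frequencies in a corpus.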
12
Naive Bayes Classifier (3/3)
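$$
\begin{aligned}
c_{MAP} &= \operatorname*{argmax}_{c \in C} P(d \mid c)\,P(c) \\
        &= \operatorname*{argmax}_{c \in C} P(x_1, x_2, \ldots, x_n \mid c)\,P(c)
\end{aligned}
$$
● $P(x_1, x_2, \ldots, x_n \mid c)$: $O(|X|^n \cdot |C|)$ parameters
● Could only be estimated if a very, very large number of training examples
was available.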
13
Multinomial Naive Bayes
Independence Assumptions
● Bag of Words assumption: Assume position doesn't
matter
● Conditional Independence: Assume the feature
probabilities $P(x_i \mid c_j)$ are independent given the class c.
$$P(x_1, x_2, \ldots, x_n \mid c) = P(x_1 \mid c) \cdot P(x_2 \mid c) \cdots P(x_n \mid c)$$
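To give a feel for what the independence assumptions buy, the sketch below compares the number of parameters of the full joint model, O(|X|^n · |C|), with a model that stores one probability per (word, class) pair. The function names and the vocabulary size, document length, and class count are hypothetical values chosen only for illustration.

```python
def full_joint_parameters(vocab_size, doc_length, num_classes):
    """Parameters needed for P(x_1, ..., x_n | c) without any independence assumption."""
    return vocab_size ** doc_length * num_classes


def naive_bayes_parameters(vocab_size, num_classes):
    """Parameters needed when each word gets a single P(x | c) per class."""
    return vocab_size * num_classes


# Hypothetical sizes: 10,000 word types, documents of 5 positions, 2 classes.
print(full_joint_parameters(10_000, 5, 2))
print(naive_bayes_parameters(10_000, 2))
```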
14
Multinomial Naive Bayes Classifier
positions ← all word positions in test document
$$c_{NB} = \operatorname*{argmax}_{c_j \in C} P(c_j) \prod_{i \in positions} P(x_i \mid c_j)$$
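The decision rule can be sketched as follows. Two things here are assumptions rather than slide content: the probabilities in `priors` and `likelihoods` are made-up numbers, and the product is computed as a sum of logarithms, a common choice that avoids numeric underflow and leaves the argmax unchanged. The sketch also assumes every test word has a nonzero probability in each class.

```python
import math


def predict(tokens, priors, likelihoods):
    """Return the class maximizing log P(c) + sum over positions of log P(x_i | c)."""
    best_class, best_score = None, -math.inf
    for c, prior in priors.items():
        score = math.log(prior)
        for token in tokens:  # one factor per word position in the test document
            score += math.log(likelihoods[c][token])
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical model parameters, for illustration only.
priors = {"pos": 0.5, "neg": 0.5}
likelihoods = {
    "pos": {"great": 0.30, "boring": 0.05, "movie": 0.65},
    "neg": {"great": 0.05, "boring": 0.30, "movie": 0.65},
}

print(predict(["great", "great", "movie"], priors, likelihoods))
```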
15
Multinomial Naive Bayes Classifier
$$c_{MAP} = \operatorname*{argmax}_{c \in C} P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$$
$$c_{NB} = \operatorname*{argmax}_{c \in C} P(c) \prod_{x \in X} P(x \mid c)$$
16
Learning the Multinomial Naive
Bayes Model
● First attempt: maximum likelihood estimates
– simply use frequencies in the data
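The formulas for this slide did not survive in the transcript. As a hedged reconstruction, the standard maximum likelihood estimates for multinomial Naive Bayes would be the following, where doccount, N_doc, count, and the vocabulary V are notation assumed here rather than taken from the slides:

```latex
\hat{P}(c_j) = \frac{\mathrm{doccount}(C = c_j)}{N_{doc}}
\qquad
\hat{P}(w_i \mid c_j) = \frac{\mathrm{count}(w_i, c_j)}{\sum_{w \in V} \mathrm{count}(w, c_j)}
```

The first is the fraction of training documents in class c_j; the second is the fraction of word tokens in class c_j that are w_i.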
17
Parameter Estimation
● Create mega-document for topic j by concatenating all
docs in this topic
– Use frequency of w in mega-document
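A minimal sketch of the mega-document idea, assuming whitespace tokenization and a tiny made-up corpus (the function name `mega_document_counts` is hypothetical):

```python
from collections import Counter, defaultdict


def mega_document_counts(labeled_docs):
    """Concatenate all documents of each topic and count the words in the result."""
    mega = defaultdict(list)
    for text, topic in labeled_docs:
        mega[topic].extend(text.lower().split())
    return {topic: Counter(tokens) for topic, tokens in mega.items()}


# Made-up training documents, for illustration only.
docs = [
    ("great great fun", "pos"),
    ("great acting", "pos"),
    ("boring plot", "neg"),
]

counts = mega_document_counts(docs)
total_pos = sum(counts["pos"].values())
# Relative frequency of "great" in the "pos" mega-document.
print(counts["pos"]["great"], "/", total_pos)
```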
18
Problem with Maximum Likelihood
● What if we have seen no training documents that contain the word
fantastic and are classified as positive?
– The maximum likelihood estimate of P(fantastic | positive) is then 0
● Zero probabilities cannot be conditioned away, no matter
the other evidence!
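The effect is easy to reproduce. In the sketch below the positive-class tokens are made up for illustration; because "fantastic" never occurs among them, its unsmoothed estimate is 0 and the whole product collapses to 0 regardless of the other words.

```python
from collections import Counter

# Made-up tokens from positive training documents; "fantastic" never occurs.
positive_tokens = "great fun great acting lovely story".split()
counts = Counter(positive_tokens)
total = sum(counts.values())


def mle(word):
    """Unsmoothed maximum likelihood estimate of P(word | positive)."""
    return counts[word] / total


review = ["great", "fantastic", "story"]
score = 1.0
for word in review:
    score *= mle(word)  # a single zero factor wipes out all other evidence

print(mle("fantastic"), score)
```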
19
Laplace (add-1) Smoothing for
Naive Bayes
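The formula for this slide is missing from the transcript. The usual form of add-1 smoothing for the word likelihoods, written with count and vocabulary V as assumed notation, would be:

```latex
\hat{P}(w_i \mid c)
  = \frac{\mathrm{count}(w_i, c) + 1}{\sum_{w \in V} \bigl(\mathrm{count}(w, c) + 1\bigr)}
  = \frac{\mathrm{count}(w_i, c) + 1}{\Bigl(\sum_{w \in V} \mathrm{count}(w, c)\Bigr) + |V|}
```

Every word in the vocabulary then receives a nonzero probability in every class.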
20
Multinomial Naive Bayes: Learning
● From training corpus, extract Vocabulary
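Only the first learning step is preserved on this slide. A possible sketch of the whole training procedure, assuming pre-tokenized documents, relative-frequency priors, and add-1 smoothed likelihoods (the function name and the two-document corpus are hypothetical):

```python
from collections import Counter, defaultdict


def train_multinomial_nb(labeled_docs):
    """Return (vocabulary, priors, likelihoods) estimated with add-1 smoothing."""
    vocabulary = set()
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    for tokens, label in labeled_docs:
        vocabulary.update(tokens)          # extract the vocabulary
        doc_counts[label] += 1             # documents per class
        word_counts[label].update(tokens)  # word counts per class

    n_docs = sum(doc_counts.values())
    priors = {c: doc_counts[c] / n_docs for c in doc_counts}
    likelihoods = {}
    for c in doc_counts:
        denom = sum(word_counts[c].values()) + len(vocabulary)
        likelihoods[c] = {w: (word_counts[c][w] + 1) / denom for w in vocabulary}
    return vocabulary, priors, likelihoods


# Made-up training data, for illustration only.
docs = [(["great", "fun"], "pos"), (["boring"], "neg")]
vocabulary, priors, likelihoods = train_multinomial_nb(docs)
print(priors, likelihoods["pos"]["great"])
```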
21
Multinomial Naive Bayes: A
Worked Example
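The example from this slide is not preserved in the transcript. As a stand-in, the sketch below works through a small toy corpus (four training documents, two classes "c" and "j") with exact fractions, using relative-frequency priors and add-1 smoothing. The corpus and all names are assumptions chosen for illustration.

```python
from collections import Counter
from fractions import Fraction

# Toy training corpus: (document text, class label).
train = [
    ("Chinese Beijing Chinese", "c"),
    ("Chinese Chinese Shanghai", "c"),
    ("Chinese Macao", "c"),
    ("Tokyo Japan Chinese", "j"),
]
test_doc = "Chinese Chinese Chinese Tokyo Japan".split()

vocabulary = {w for text, _ in train for w in text.split()}
classes = sorted({label for _, label in train})


def posterior_score(c):
    """P(c) times the product of add-1 smoothed P(w | c) over the test words."""
    docs_in_c = [text for text, label in train if label == c]
    counts = Counter(w for text in docs_in_c for w in text.split())
    total = sum(counts.values())
    score = Fraction(len(docs_in_c), len(train))  # prior P(c)
    for w in test_doc:
        score *= Fraction(counts[w] + 1, total + len(vocabulary))
    return score


scores = {c: posterior_score(c) for c in classes}
prediction = max(scores, key=scores.get)
print({c: float(s) for c, s in scores.items()}, prediction)
```

With this corpus the score for "c" is 3/4 · (3/7)³ · 1/14 · 1/14 ≈ 0.0003 and the score for "j" is 1/4 · (2/9)³ · 2/9 · 2/9 ≈ 0.0001, so the test document is assigned to "c".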
22
Sentiment Analysis Overview
23
Sentiment Analysis Applications
(1/4)
● Movie: is this review positive or negative?
● Products: what do people think about the new iPhone?
● Public sentiment: how is consumer confidence? Is despair
increasing?
● Politics: what do people think about this candidate or
issue?
● Prediction: predict election outcomes or market trends
from sentiment
24
Sentiment Analysis Applications
(2/4)
25
Sentiment Analysis Applications
(3/4)
26
Sentiment Analysis Applications
(4/4)
27
Formal Definition of Sentiment
Analysis
● Sentiment analysis is the detection of attitudes
“enduring, affectively colored beliefs, dispositions towards objects or persons”
1. Holder (source) of attitude
2. Target (aspect) of attitude
3. Type of attitude
➢ From a set of types
• like, love, hate, value, desire, etc.
➢ Or (more commonly) simple weighted polarity:
• positive, negative, neutral together with strength
4. Text containing the attitude
➢ Sentence or entire document
28
Sentiment Analysis Tasks
● Simplest:
– Is the attitude of this text positive or negative?
● More complex:
– Rank the attitude of this text from 1 to 5
● Advanced:
– Detect the target, source, or complex attitude types
29
Sentiment Analysis: A Baseline
Algorithm
● Polarity detection in movie reviews:
– Is an IMDB movie review positive or negative?
● Data: Polarity Data 2.0:
– http://www.cs.cornell.edu/people/pabo/movie-review-data/
30
Baseline Algorithm (adapted from
Pang and Lee)
● Tokenization
● Feature Extraction
● Classification using different classifiers
– Naive Bayes
– MaxEnt
– SVM
31
Sentiment Tokenization Issues
● Deal with HTML and XML markup
● Twitter markup (names, hash tags)
● Capitalization (preserve for words in all caps)
● Phone numbers, dates
● Emoticons
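One way to address several of these issues at once is a regular-expression tokenizer. The sketch below is only an assumption about how such a tokenizer might look: the patterns cover a handful of emoticons, one date format, and simple digit-and-hyphen phone numbers, and they are not exhaustive.

```python
import re

TOKEN_RE = re.compile(
    r"""
    [@\#]\w+                       # Twitter user names and hashtags
    | [:;=][\-']?[)(\]\[dDpP]      # a few common emoticons
    | \d{1,2}/\d{1,2}/\d{2,4}      # dates written as day/month/year
    | \+?\d[\d\-]{6,}\d            # simple phone numbers
    | \w+(?:'\w+)?                 # words, including contractions
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Strip markup, split into tokens, and lowercase words that are not all caps."""
    text = re.sub(r"<[^>]+>", " ", text)  # drop HTML/XML tags
    tokens = TOKEN_RE.findall(text)
    # Keep all-caps words and non-word tokens as they are; lowercase the rest.
    return [t if t.isupper() or not t[0].isalpha() else t.lower() for t in tokens]


print(tokenize("<b>LOVED</b> It :) @bob #fun on 11/05/2013"))
```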
32
Extracting Features for Sentiment
Classification
● How to handle negation
– I didn't like this movie
vs
– I really like this movie
● Which words to use?
– Only adjectives
– All words
33
Negation
● Add NOT_ to every word between negation and following
punctuation:
Didn't like this movie, but I
Didn't NOT_like NOT_this NOT_movie but I
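A minimal sketch of this marking step, assuming the text is already tokenized with punctuation as separate tokens. The list of negation words is an assumption, and unlike the slide's output, the punctuation token is kept in the result.

```python
import re

# Assumed negation cues: "not", "no", "never", and contractions ending in n't.
NEGATION_RE = re.compile(r"^(?:not|no|never|\w+n't)$", re.IGNORECASE)
PUNCTUATION_RE = re.compile(r"^[.,:;!?]$")


def mark_negation(tokens):
    """Prefix NOT_ to each token after a negation word, up to the next punctuation."""
    marked, negating = [], False
    for token in tokens:
        if PUNCTUATION_RE.match(token):
            negating = False  # punctuation ends the negated span
            marked.append(token)
        elif negating:
            marked.append("NOT_" + token)
        else:
            marked.append(token)
            if NEGATION_RE.match(token):
                negating = True
    return marked


tokens = ["Didn't", "like", "this", "movie", ",", "but", "I"]
print(mark_negation(tokens))
```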
34
Reminder: Naive Bayes
35
Sentiment Lexicons
● Dictionary of well-known “sentiment” words
– Abusive terms
– Adjectives like bad, worse, good, better, ugly, pretty
● Available for use in research
– LIWC: Linguistic Inquiry and Word Count
– SentiStrength
– Bing Liu's Opinion Lexicon
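A lexicon can be used directly for a simple polarity score. The sketch below uses a tiny hand-made lexicon built from the adjectives listed above; it does not reproduce the contents or format of LIWC, SentiStrength, or Bing Liu's Opinion Lexicon, and the scoring rule (positive matches minus negative matches) is an assumption.

```python
# Tiny illustrative lexicon; real lexicons are far larger.
POSITIVE = {"good", "better", "pretty"}
NEGATIVE = {"bad", "worse", "ugly"}


def lexicon_score(tokens):
    """Count positive matches minus negative matches."""
    score = 0
    for token in tokens:
        word = token.lower()
        if word in POSITIVE:
            score += 1
        elif word in NEGATIVE:
            score -= 1
    return score


def polarity(tokens):
    """Map the lexicon score to a coarse polarity label."""
    score = lexicon_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("The plot was bad but the ending was good and pretty".split()))
```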
36
My Research: Election Trolling on
Twitter (Pakistan Elections 2013)
Twitterer | Tweet
A | @B Yeh...#Shame with fake account, this is how PTIians think they will get votes
B | @A Stop making a fuss and fuck off.
A | @B A dumb leader like IK can produce followers like you.
B | @A A corrupt leader like Noora can hire paid trolls like you
Ad

More Related Content

What's hot (20)

Approaches to Sentiment Analysis
Approaches to Sentiment AnalysisApproaches to Sentiment Analysis
Approaches to Sentiment Analysis
Nihar Suryawanshi
 
Sentiment analysis - Our approach and use cases
Sentiment analysis - Our approach and use casesSentiment analysis - Our approach and use cases
Sentiment analysis - Our approach and use cases
Karol Chlasta
 
New sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumarNew sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumar
Ravi Kumar
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
Data Science Society
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Seher Can
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Amenda Joy
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Makrand Patil
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
Aditya Nag
 
Sentiment analysis of Twitter Data
Sentiment analysis of Twitter DataSentiment analysis of Twitter Data
Sentiment analysis of Twitter Data
Nurendra Choudhary
 
Sentiment analysis using ml
Sentiment analysis using mlSentiment analysis using ml
Sentiment analysis using ml
Pravin Katiyar
 
Twitter sentiment analysis ppt
Twitter sentiment analysis pptTwitter sentiment analysis ppt
Twitter sentiment analysis ppt
SonuCreation
 
Amazon sentimental analysis
Amazon sentimental analysisAmazon sentimental analysis
Amazon sentimental analysis
Akhila
 
Introduction to Sentiment Analysis
Introduction to Sentiment AnalysisIntroduction to Sentiment Analysis
Introduction to Sentiment Analysis
Jaganadh Gopinadhan
 
Text Classification
Text ClassificationText Classification
Text Classification
RAX Automation Suite
 
Presentation on Sentiment Analysis
Presentation on Sentiment AnalysisPresentation on Sentiment Analysis
Presentation on Sentiment Analysis
Rebecca Williams
 
Sentiment analysis presentation
Sentiment analysis presentationSentiment analysis presentation
Sentiment analysis presentation
GunjanSrivastava23
 
Opinion Mining
Opinion MiningOpinion Mining
Opinion Mining
Ali Habeeb
 
How Sentiment Analysis works
How Sentiment Analysis worksHow Sentiment Analysis works
How Sentiment Analysis works
CJ Jenkins
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
VeenaSKumar2
 
Approaches to Sentiment Analysis
Approaches to Sentiment AnalysisApproaches to Sentiment Analysis
Approaches to Sentiment Analysis
Nihar Suryawanshi
 
Sentiment analysis - Our approach and use cases
Sentiment analysis - Our approach and use casesSentiment analysis - Our approach and use cases
Sentiment analysis - Our approach and use cases
Karol Chlasta
 
New sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumarNew sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumar
Ravi Kumar
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Seher Can
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Amenda Joy
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
Aditya Nag
 
Sentiment analysis of Twitter Data
Sentiment analysis of Twitter DataSentiment analysis of Twitter Data
Sentiment analysis of Twitter Data
Nurendra Choudhary
 
Sentiment analysis using ml
Sentiment analysis using mlSentiment analysis using ml
Sentiment analysis using ml
Pravin Katiyar
 
Twitter sentiment analysis ppt
Twitter sentiment analysis pptTwitter sentiment analysis ppt
Twitter sentiment analysis ppt
SonuCreation
 
Amazon sentimental analysis
Amazon sentimental analysisAmazon sentimental analysis
Amazon sentimental analysis
Akhila
 
Introduction to Sentiment Analysis
Introduction to Sentiment AnalysisIntroduction to Sentiment Analysis
Introduction to Sentiment Analysis
Jaganadh Gopinadhan
 
Presentation on Sentiment Analysis
Presentation on Sentiment AnalysisPresentation on Sentiment Analysis
Presentation on Sentiment Analysis
Rebecca Williams
 
Sentiment analysis presentation
Sentiment analysis presentationSentiment analysis presentation
Sentiment analysis presentation
GunjanSrivastava23
 
Opinion Mining
Opinion MiningOpinion Mining
Opinion Mining
Ali Habeeb
 
How Sentiment Analysis works
How Sentiment Analysis worksHow Sentiment Analysis works
How Sentiment Analysis works
CJ Jenkins
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
VeenaSKumar2
 

Viewers also liked (20)

Machine Learning based Text Classification introduction
Machine Learning based Text Classification introductionMachine Learning based Text Classification introduction
Machine Learning based Text Classification introduction
Treparel
 
Text classification
Text classificationText classification
Text classification
James Wong
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
Dev Sahu
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
Sumit Raj
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
DKALab
 
Text categorization
Text categorizationText categorization
Text categorization
KU Leuven
 
Introduction to text classification using naive bayes
Introduction to text classification using naive bayesIntroduction to text classification using naive bayes
Introduction to text classification using naive bayes
Dhwaj Raj
 
Sentiment analysis of tweets
Sentiment analysis of tweetsSentiment analysis of tweets
Sentiment analysis of tweets
Vasu Jain
 
Text Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter DataText Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter Data
Yanchang Zhao
 
Language-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment AnalysisLanguage-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment Analysis
saschanarr
 
SENTIment POLarity Classification Task - Sentipolc@Evalita 2014
SENTIment POLarity Classification Task - Sentipolc@Evalita 2014 SENTIment POLarity Classification Task - Sentipolc@Evalita 2014
SENTIment POLarity Classification Task - Sentipolc@Evalita 2014
University of Torino
 
SA2: Text Mining from User Generated Content
SA2: Text Mining from User Generated ContentSA2: Text Mining from User Generated Content
SA2: Text Mining from User Generated Content
John Breslin
 
Keyword-based Search and Exploration on Databases (SIGMOD 2011)
Keyword-based Search and Exploration on Databases (SIGMOD 2011)Keyword-based Search and Exploration on Databases (SIGMOD 2011)
Keyword-based Search and Exploration on Databases (SIGMOD 2011)
weiw_oz
 
Interactive Query and Search for your Big Data
Interactive Query and Search for your Big DataInteractive Query and Search for your Big Data
Interactive Query and Search for your Big Data
DataWorks Summit
 
Presentation
PresentationPresentation
Presentation
Xiaoyu Chen
 
Language-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment AnalysisLanguage-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment Analysis
saschanarr
 
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
Ralf Stockmann
 
Algorithm Name Detection & Extraction
Algorithm Name Detection & ExtractionAlgorithm Name Detection & Extraction
Algorithm Name Detection & Extraction
Deeksha thakur
 
Text classification
Text classificationText classification
Text classification
Harry Potter
 
Social media Listening and Analytics: A brief Overview
Social media Listening and Analytics: A brief OverviewSocial media Listening and Analytics: A brief Overview
Social media Listening and Analytics: A brief Overview
Sherin Daniel
 
Machine Learning based Text Classification introduction
Machine Learning based Text Classification introductionMachine Learning based Text Classification introduction
Machine Learning based Text Classification introduction
Treparel
 
Text classification
Text classificationText classification
Text classification
James Wong
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
Dev Sahu
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
Sumit Raj
 
Naive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event ModelsNaive Bayesian Text Classifier Event Models
Naive Bayesian Text Classifier Event Models
DKALab
 
Text categorization
Text categorizationText categorization
Text categorization
KU Leuven
 
Introduction to text classification using naive bayes
Introduction to text classification using naive bayesIntroduction to text classification using naive bayes
Introduction to text classification using naive bayes
Dhwaj Raj
 
Sentiment analysis of tweets
Sentiment analysis of tweetsSentiment analysis of tweets
Sentiment analysis of tweets
Vasu Jain
 
Text Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter DataText Mining with R -- an Analysis of Twitter Data
Text Mining with R -- an Analysis of Twitter Data
Yanchang Zhao
 
Language-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment AnalysisLanguage-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment Analysis
saschanarr
 
SENTIment POLarity Classification Task - Sentipolc@Evalita 2014
SENTIment POLarity Classification Task - Sentipolc@Evalita 2014 SENTIment POLarity Classification Task - Sentipolc@Evalita 2014
SENTIment POLarity Classification Task - Sentipolc@Evalita 2014
University of Torino
 
SA2: Text Mining from User Generated Content
SA2: Text Mining from User Generated ContentSA2: Text Mining from User Generated Content
SA2: Text Mining from User Generated Content
John Breslin
 
Keyword-based Search and Exploration on Databases (SIGMOD 2011)
Keyword-based Search and Exploration on Databases (SIGMOD 2011)Keyword-based Search and Exploration on Databases (SIGMOD 2011)
Keyword-based Search and Exploration on Databases (SIGMOD 2011)
weiw_oz
 
Interactive Query and Search for your Big Data
Interactive Query and Search for your Big DataInteractive Query and Search for your Big Data
Interactive Query and Search for your Big Data
DataWorks Summit
 
Language-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment AnalysisLanguage-Independent Twitter Sentiment Analysis
Language-Independent Twitter Sentiment Analysis
saschanarr
 
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
Controlled Vocabularies and Text Mining - Use Cases at the Goettingen
Ralf Stockmann
 
Algorithm Name Detection & Extraction
Algorithm Name Detection & ExtractionAlgorithm Name Detection & Extraction
Algorithm Name Detection & Extraction
Deeksha thakur
 
Text classification
Text classificationText classification
Text classification
Harry Potter
 
Social media Listening and Analytics: A brief Overview
Social media Listening and Analytics: A brief OverviewSocial media Listening and Analytics: A brief Overview
Social media Listening and Analytics: A brief Overview
Sherin Daniel
 
Ad

Similar to Text classification & sentiment analysis (20)

Sarcasm Detection: Achilles Heel of sentiment analysis
Sarcasm Detection: Achilles Heel of sentiment analysisSarcasm Detection: Achilles Heel of sentiment analysis
Sarcasm Detection: Achilles Heel of sentiment analysis
Anuj Gupta
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
Xavier Amatriain
 
Learning to learn - to retrieve information
Learning to learn - to retrieve informationLearning to learn - to retrieve information
Learning to learn - to retrieve information
Pramit Choudhary
 
02 naive bays classifier and sentiment analysis
02 naive bays classifier and sentiment analysis02 naive bays classifier and sentiment analysis
02 naive bays classifier and sentiment analysis
Subhas Kumar Ghosh
 
Machine learning with in the python lecture for computer science
Machine learning with in the python lecture for computer scienceMachine learning with in the python lecture for computer science
Machine learning with in the python lecture for computer science
jayasreepalani02
 
#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation
parlamind
 
Analyzing Movie Reviews : Machine learning project
Analyzing Movie Reviews : Machine learning projectAnalyzing Movie Reviews : Machine learning project
Analyzing Movie Reviews : Machine learning project
Boston Institute of Analytics
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Francesco Casalegno
 
[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation
[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation
[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation
YONG ZHENG
 
W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...
W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...
W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...
cniclsh1
 
Icdm2013 slides
Icdm2013 slidesIcdm2013 slides
Icdm2013 slides
Mohsen Farhadloo
 
Openbar Leuven // Less is more. Working with less data in NLP by Yves Peirsman
Openbar Leuven // Less is more. Working with less data in NLP by Yves PeirsmanOpenbar Leuven // Less is more. Working with less data in NLP by Yves Peirsman
Openbar Leuven // Less is more. Working with less data in NLP by Yves Peirsman
Openbar
 
Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine Reading
Isabelle Augenstein
 
Cooperative game model based sentiment analysis of product reviews.pptx
Cooperative game model based sentiment analysis of product reviews.pptxCooperative game model based sentiment analysis of product reviews.pptx
Cooperative game model based sentiment analysis of product reviews.pptx
UsamaHassan90
 
Sentimental Analysis - Naive Bayes Algorithm
Sentimental Analysis - Naive Bayes AlgorithmSentimental Analysis - Naive Bayes Algorithm
Sentimental Analysis - Naive Bayes Algorithm
Khushboo Gupta
 
2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt
milkesa13
 
Hard-Negatives Selection Strategy for Cross-Modal Retrieval
Hard-Negatives Selection Strategy for Cross-Modal RetrievalHard-Negatives Selection Strategy for Cross-Modal Retrieval
Hard-Negatives Selection Strategy for Cross-Modal Retrieval
VasileiosMezaris
 
Collective sensing
Collective sensingCollective sensing
Collective sensing
mahdikianirad1
 
Recommender Systems Fairness Evaluation via Generalized Cross Entropy
Recommender Systems Fairness Evaluation via Generalized Cross EntropyRecommender Systems Fairness Evaluation via Generalized Cross Entropy
Recommender Systems Fairness Evaluation via Generalized Cross Entropy
Vito Walter Anelli
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdf
SisayNegash4
 
Sarcasm Detection: Achilles Heel of sentiment analysis
Sarcasm Detection: Achilles Heel of sentiment analysisSarcasm Detection: Achilles Heel of sentiment analysis
Sarcasm Detection: Achilles Heel of sentiment analysis
Anuj Gupta
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
Xavier Amatriain
 
Learning to learn - to retrieve information
Learning to learn - to retrieve informationLearning to learn - to retrieve information
Learning to learn - to retrieve information
Pramit Choudhary
 
02 naive bays classifier and sentiment analysis
02 naive bays classifier and sentiment analysis02 naive bays classifier and sentiment analysis
02 naive bays classifier and sentiment analysis
Subhas Kumar Ghosh
 
Machine learning with in the python lecture for computer science
Machine learning with in the python lecture for computer scienceMachine learning with in the python lecture for computer science
Machine learning with in the python lecture for computer science
jayasreepalani02
 
#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation
parlamind
 
[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation
[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation
[WI 2017] Affective Prediction By Collaborative Chains In Movie Recommendation
YONG ZHENG
 
W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...
W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...
W4L1_11-667: LARGE LANGUAGE MODELS: METHODS AND APPLICATIONS - Human Evaluati...
cniclsh1
 
Openbar Leuven // Less is more. Working with less data in NLP by Yves Peirsman
Openbar Leuven // Less is more. Working with less data in NLP by Yves PeirsmanOpenbar Leuven // Less is more. Working with less data in NLP by Yves Peirsman
Openbar Leuven // Less is more. Working with less data in NLP by Yves Peirsman
Openbar
 
Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine Reading
Isabelle Augenstein
 
Cooperative game model based sentiment analysis of product reviews.pptx
Cooperative game model based sentiment analysis of product reviews.pptxCooperative game model based sentiment analysis of product reviews.pptx
Cooperative game model based sentiment analysis of product reviews.pptx
UsamaHassan90
 
Sentimental Analysis - Naive Bayes Algorithm
Sentimental Analysis - Naive Bayes AlgorithmSentimental Analysis - Naive Bayes Algorithm
Sentimental Analysis - Naive Bayes Algorithm
Khushboo Gupta
 
2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt2-Chapter Two-N-gram Language Models.ppt
2-Chapter Two-N-gram Language Models.ppt
milkesa13
 
Hard-Negatives Selection Strategy for Cross-Modal Retrieval
Hard-Negatives Selection Strategy for Cross-Modal RetrievalHard-Negatives Selection Strategy for Cross-Modal Retrieval
Hard-Negatives Selection Strategy for Cross-Modal Retrieval
VasileiosMezaris
 
Recommender Systems Fairness Evaluation via Generalized Cross Entropy
Recommender Systems Fairness Evaluation via Generalized Cross EntropyRecommender Systems Fairness Evaluation via Generalized Cross Entropy
Recommender Systems Fairness Evaluation via Generalized Cross Entropy
Vito Walter Anelli
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdf
SisayNegash4
 
Ad

More from M. Atif Qureshi (10)

Utilising wikipedia to explain recommendations
Utilising wikipedia to explain recommendationsUtilising wikipedia to explain recommendations
Utilising wikipedia to explain recommendations
M. Atif Qureshi
 
Text mining, word embeddings, & wikipedia
Text mining, word embeddings, & wikipediaText mining, word embeddings, & wikipedia
Text mining, word embeddings, & wikipedia
M. Atif Qureshi
 
Fundamentals of IR models
Fundamentals of IR modelsFundamentals of IR models
Fundamentals of IR models
M. Atif Qureshi
 
Exploiting Wikipedia for Entity Name Disambiguation in Tweets
Exploiting Wikipedia for Entity Name Disambiguation in TweetsExploiting Wikipedia for Entity Name Disambiguation in Tweets
Exploiting Wikipedia for Entity Name Disambiguation in Tweets
M. Atif Qureshi
 
A Perspective-Aware Approach to Search: Visualizing Perspectives in News Sear...
A Perspective-Aware Approach to Search: Visualizing Perspectives in News Sear...A Perspective-Aware Approach to Search: Visualizing Perspectives in News Sear...
A Perspective-Aware Approach to Search: Visualizing Perspectives in News Sear...
M. Atif Qureshi
 
Welcoming Webology
Welcoming WebologyWelcoming Webology
Welcoming Webology
M. Atif Qureshi
 
Master's Thesis Defense: Improving the Quality of Web Spam Filtering by Using...
Master's Thesis Defense: Improving the Quality of Web Spam Filtering by Using...Master's Thesis Defense: Improving the Quality of Web Spam Filtering by Using...
Master's Thesis Defense: Improving the Quality of Web Spam Filtering by Using...
M. Atif Qureshi
 
Identifying and ranking topic clusters in the blogosphere
Identifying and ranking topic clusters in the blogosphereIdentifying and ranking topic clusters in the blogosphere
Identifying and ranking topic clusters in the blogosphere
M. Atif Qureshi
 
Invent Episode 3: Tech Talk on Parallel Future
Text classification & sentiment analysis

  • 1. Text Classification & Sentiment Analysis Muhammad Atif Qureshi Arjumand Younus
  • 2. Contents ● An Introduction to Text Classification – Text Classification Examples – Text Classification Methods ● Naive Bayes – Formalization – Learning ● Applications of Sentiment Analysis ● Baseline Algorithm for Sentiment Analysis ● Sentiment Lexicons ● Sentiment Analysis for the Political Domain (Personal Research)
  • 3. Text Classification Examples ● News filtering and organization ● Document organization and retrieval ● Sentiment analysis/Opinion mining ● Email classification and spam filtering ● Authorship attribution
  • 4. Spam Classification Example (slide borrowed from the Coursera lectures on “Natural Language Processing” by Prof. Dan Jurafsky)
  • 5. Text Classification ● Set of training documents $D = \{d_1, \ldots, d_N\}$ such that each record is labeled with a class value $c$ from $C = \{c_1, \ldots, c_J\}$ ● Features in the training data are related to labels by means of a classification model ● The classification model helps predict the label for an unknown (test) record ● With text classification, the model uses text-based features
  • 6. Text Classification Methods ● Hand-coded rules ● Supervised machine learning – Naive Bayes – Logistic regression – Support vector machines – K-nearest neighbors
  • 7. Naive Bayes ● Simple (“naive”) classification method based on Bayes' rule ● Relies on a simple document representation, namely bag of words ● Example review: “I love this movie. It's sweet but with satirical humor. The dialogue is great and the adventure scenes are great fun...It manages to be whimsical and romantic while laughing at the conventions of the fairy tale genre. I would recommend it to just about anyone. I've seen it several times as I love it so much, and I'm always happy to see it again whenever I have a friend who hasn't seen it yet.”
  • 8. Bag of Words Representation: Subset of Words ● The same review as on slide 7, reduced to counts for a subset of its words: great 2, love 2, recommend 1, laugh 1, happy 1, ...
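The bag-of-words idea from slides 7 and 8 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `bag_of_words` and the simple regular-expression tokenizer are assumptions, and no stemming is applied (so "laughing" would not be folded into "laugh" as on the slide).

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each token.

    Word order is discarded, which is exactly the bag-of-words assumption.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# A shortened version of the review used on the slides.
review = "I love this movie. The dialogue is great and the adventure scenes are great fun."
counts = bag_of_words(review)
print(counts["great"], counts["love"])
```

Restricting the counter to a chosen subset of words (for example, only sentiment-bearing ones) gives the "subset of words" variant named in the slide title.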
  • 9. Bayes' Rule Applied to Documents and Classes ● For a document $d$ and a class $c$: $P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$
  • 10. Naive Bayes Classifier (1/3) ● $c_{MAP} = \arg\max_{c \in C} P(c \mid d) = \arg\max_{c \in C} \frac{P(d \mid c)\,P(c)}{P(d)} = \arg\max_{c \in C} P(d \mid c)\,P(c)$ (the denominator $P(d)$ is the same for every class, so it can be dropped)
  • 11. Naive Bayes Classifier (2/3) ● $c_{MAP} = \arg\max_{c \in C} P(d \mid c)\,P(c) = \arg\max_{c \in C} P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$, with the document represented as features $x_1, \ldots, x_n$ ● $P(c)$: how often does this class occur? We can just count the relative frequencies in a corpus.
  • 12. Naive Bayes Classifier (3/3) ● $c_{MAP} = \arg\max_{c \in C} P(d \mid c)\,P(c) = \arg\max_{c \in C} P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$ ● $P(x_1, x_2, \ldots, x_n \mid c)$ has $O(|X|^n \cdot |C|)$ parameters, which could only be estimated if a very, very large number of training examples were available.
  • 13. Multinomial Naive Bayes Independence Assumptions ● Bag of words assumption: assume position doesn't matter ● Conditional independence: assume the feature probabilities $P(x_i \mid c_j)$ are independent given the class $c$: $P(x_1, x_2, \ldots, x_n \mid c) = P(x_1 \mid c) \cdot P(x_2 \mid c) \cdots P(x_n \mid c)$
  • 14. Multinomial Naive Bayes Classifier ● positions ← all word positions in the test document ● $c_{NB} = \arg\max_{c_j \in C} P(c_j) \prod_{i \in \text{positions}} P(x_i \mid c_j)$
  • 15. Multinomial Naive Bayes Classifier ● $c_{MAP} = \arg\max_{c \in C} P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$ ● $c_{NB} = \arg\max_{c \in C} P(c) \prod_{x \in X} P(x \mid c)$
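The decision rule on slides 14 and 15 can be sketched as follows. Two things here are assumptions rather than content from the slides: the probabilities are hand-picked, hypothetical numbers, and the product is computed as a sum of logarithms, a common trick to avoid floating-point underflow on long documents. The function name `classify` is also illustrative.

```python
import math

def classify(tokens, log_prior, log_likelihood):
    """Return the class c maximising log P(c) + sum_i log P(x_i | c)."""
    best_class, best_score = None, -math.inf
    for c in log_prior:
        score = log_prior[c]
        for tok in tokens:
            # Words with no stored probability are skipped (an assumption of this sketch).
            if tok in log_likelihood[c]:
                score += log_likelihood[c][tok]
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical, hand-picked probabilities for illustration only.
log_prior = {"pos": math.log(0.5), "neg": math.log(0.5)}
log_likelihood = {
    "pos": {"great": math.log(0.6), "boring": math.log(0.1), "movie": math.log(0.3)},
    "neg": {"great": math.log(0.1), "boring": math.log(0.6), "movie": math.log(0.3)},
}

print(classify(["great", "movie"], log_prior, log_likelihood))
```

Because the logarithm is monotonic, the class with the highest log score is also the class with the highest product of probabilities.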
  • 16. Learning the Multinomial Naive Bayes Model ● First attempt: maximum likelihood estimates – simply use frequencies in the data
  • 17. Parameter Estimation ● Create a mega-document for topic j by concatenating all docs in this topic – Use the frequency of w in the mega-document
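The formulas for these estimates did not survive in the transcript. In their standard textbook form, which is assumed here rather than copied from the slides, the maximum likelihood estimates are relative frequencies. $N$ is the number of training documents (slide 5), while $N_{c_j}$ (the number of training documents labeled $c_j$), $\operatorname{count}(w, c_j)$ (the number of times $w$ occurs in the mega-document for $c_j$) and $V$ (the vocabulary) are notation introduced for this sketch:

```latex
\hat{P}(c_j) = \frac{N_{c_j}}{N},
\qquad
\hat{P}(w_i \mid c_j) = \frac{\operatorname{count}(w_i, c_j)}{\sum_{w \in V} \operatorname{count}(w, c_j)}
```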
  • 18. Problem with Maximum Likelihood ● What if no training document classified as positive contains the word “fantastic”? ● Zero probabilities cannot be conditioned away, no matter the other evidence!
  • 19. Laplace (add-1) Smoothing for Naive Bayes
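Only the title of this slide survives in the transcript. As a hedged reminder, the usual form of add-1 smoothing adds one to every word count and adds the vocabulary size $|V|$ to the denominator, so that no word receives zero probability. The notation $\operatorname{count}(w, c)$ and $V$ is assumed for this sketch, not taken from the slide:

```latex
\hat{P}(w_i \mid c) = \frac{\operatorname{count}(w_i, c) + 1}{\sum_{w \in V} \operatorname{count}(w, c) + |V|}
```

With this estimate, a word such as “fantastic” that never occurs in positive training documents gets the small but non-zero probability $1 / \left(\sum_{w \in V} \operatorname{count}(w, \text{pos}) + |V|\right)$.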
  • 20. Multinomial Naive Bayes: Learning ● From training corpus, extract Vocabulary
  • 21. Multinomial Naive Bayes: A Worked Example
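The worked example itself is not preserved in the transcript, so the sketch below uses a hypothetical three-document toy corpus instead. It follows the learning steps named on slides 17 to 20 (extract the vocabulary, build one mega-document per class, estimate priors and add-1 smoothed likelihoods); the function names `train_multinomial_nb` and `predict` are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs):
    """docs: list of (tokens, label). Returns (log_prior, log_likelihood, vocab)."""
    vocab = {tok for tokens, _ in docs for tok in tokens}
    doc_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)  # the per-class "mega-document"

    log_prior, log_likelihood = {}, {}
    for c in doc_counts:
        log_prior[c] = math.log(doc_counts[c] / len(docs))
        total = sum(word_counts[c].values())
        # Laplace (add-1) smoothing over the whole vocabulary.
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / (total + len(vocab)))
            for w in vocab
        }
    return log_prior, log_likelihood, vocab

def predict(tokens, log_prior, log_likelihood, vocab):
    """Pick the class with the highest log-score; out-of-vocabulary words are ignored."""
    scores = {
        c: log_prior[c] + sum(log_likelihood[c][t] for t in tokens if t in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Hypothetical toy corpus, for illustration only.
train = [
    ("great fun great".split(), "pos"),
    ("love this movie".split(), "pos"),
    ("boring dull movie".split(), "neg"),
]
model = train_multinomial_nb(train)
print(predict("great movie".split(), *model))
```

On this toy data the vocabulary has 7 words and the positive mega-document has 6 tokens, so the smoothed estimate of P(great | pos) is (2 + 1) / (6 + 7) = 3/13.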
  • 23. Sentiment Analysis Applications (1/4) ● Movie: is this review positive or negative? ● Products: what do people think about the new iPhone? ● Public sentiment: how is consumer confidence? Is despair increasing? ● Politics: what do people think about this candidate or issue? ● Prediction: predict election outcomes or market trends from sentiment
  • 27. Formal Definition of Sentiment Analysis ● Sentiment analysis is the detection of attitudes: “enduring, affectively colored beliefs, dispositions towards objects or persons” 1. Holder (source) of attitude 2. Target (aspect) of attitude 3. Type of attitude ➢ From a set of types • like, love, hate, value, desire, etc. ➢ Or (more commonly) simple weighted polarity: • positive, negative, neutral, together with strength 4. Text containing the attitude ➢ Sentence or entire document
  • 28. Sentiment Analysis Tasks ● Simplest: – Is the attitude of this text positive or negative? ● More complex: – Rank the attitude of this text from 1 to 5 ● Advanced: – Detect the target, source, or complex attitude types
  • 29. Sentiment Analysis: A Baseline Algorithm ● Polarity detection in movie reviews: – Is an IMDB movie review positive or negative? ● Data: Polarity Data 2.0: – http://www.cs.cornell.edu/people/pabo/movie-review-data/
  • 30. Baseline Algorithm (adapted from Pang and Lee) ● Tokenization ● Feature Extraction ● Classification using different classifiers – Naive Bayes – MaxEnt – SVM
  • 31. Sentiment Tokenization Issues ● Deal with HTML and XML markup ● Twitter markup (names, hash tags) ● Capitalization (preserve for words in all caps) ● Phone numbers, dates ● Emoticons
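Several of the tokenization issues listed on slide 31 can be illustrated with a small regular-expression tokenizer. This is a rough sketch under stated assumptions, not the tokenizer used by the authors: the patterns `TOKEN_RE` and `TAG_RE` and the function `tokenize` are hypothetical, the emoticon pattern covers only a few simple Western-style emoticons, and phone numbers and dates are not handled.

```python
import re

TOKEN_RE = re.compile(
    r"""
    (?:[:;=8][\-o\*']?[\)\]\(\[dDpP/\\|])   # simple emoticons such as :) ;-( :D
    | (?:@\w+)                               # Twitter user names
    | (?:\#\w+)                              # hashtags
    | (?:\w+(?:'\w+)?)                       # words, with an optional apostrophe part
    """,
    re.VERBOSE,
)
TAG_RE = re.compile(r"<[^>]+>")  # crude HTML/XML tag matcher

def tokenize(text):
    """Strip markup, split into tokens, and lowercase everything except all-caps tokens."""
    text = TAG_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text)
    # Preserve capitalization for tokens written entirely in capitals (e.g. "GREAT").
    return [t if t.isupper() and len(t) > 1 else t.lower() for t in tokens]

print(tokenize("<b>GREAT</b> Movie :) @bob #fun didn't"))
```

Keeping emoticons and all-caps words as distinct tokens matters for sentiment because both often carry the emphasis or polarity of a message.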
  • 32. Extracting Features for Sentiment Classification ● How to handle negation – “I didn't like this movie” vs. “I really like this movie” ● Which words to use? – Only adjectives – All words
  • 33. Negation ● Add NOT_ to every word between a negation and the following punctuation: “Didn't like this movie, but I” → “Didn't NOT_like NOT_this NOT_movie but I”
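The NOT_ marking rule from slide 33 can be sketched like this. The list of negation words and the set of punctuation marks that end a negation scope are assumptions of this sketch, as is the function name `mark_negation`; unlike the slide's example, the punctuation token is kept in the output.

```python
import re

# Assumed negation cues: a few common words plus any token ending in "n't".
NEGATION_RE = re.compile(r"^(?:not|no|never|n't|\w+n't)$", re.IGNORECASE)
# Assumed scope-ending punctuation.
PUNCT_RE = re.compile(r"^[.,:;!?]$")

def mark_negation(tokens):
    """Prefix NOT_ to every token between a negation word and the next punctuation mark."""
    out, negating = [], False
    for tok in tokens:
        if PUNCT_RE.match(tok):
            negating = False          # punctuation closes the negation scope
            out.append(tok)
        elif negating:
            out.append("NOT_" + tok)
        else:
            out.append(tok)
            if NEGATION_RE.match(tok):
                negating = True       # start marking from the next token
    return out

tokens = ["didn't", "like", "this", "movie", ",", "but", "I"]
print(mark_negation(tokens))
```

The effect is that "like" and "NOT_like" become separate features, so a classifier can learn opposite polarities for them.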
  • 35. Sentiment Lexicons ● Dictionary of well-known “sentiment” words – Abusive terms – Adjectives like bad, worse, good, better, ugly, pretty ● Available for use in research – LIWC: Linguistic Inquiry and Word Count – SentiStrength – Bing Liu's Opinion Lexicon
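A lexicon can be used directly as a very simple polarity classifier by counting matches. The sketch below is an assumption-laden illustration: the two word sets are a hypothetical mini-lexicon built from the adjectives on the slide plus a few extra words, not an excerpt from LIWC, SentiStrength, or Bing Liu's Opinion Lexicon, and the function name `lexicon_polarity` is made up for this example.

```python
# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"good", "better", "pretty", "great", "love"}
NEGATIVE = {"bad", "worse", "ugly", "hate"}

def lexicon_polarity(tokens):
    """Count positive minus negative lexicon hits and map the total to a label."""
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(lexicon_polarity("a good and pretty film".split()))
```

Real lexicons typically attach strengths or categories to words rather than a plain positive or negative flag, so this counting scheme is only a starting point.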
  • 36. My Research: Election Trolling on Twitter (Pakistan Elections 2013) ● Example exchange (Twitterer: Tweet) – A: “@B Yeh...#Shame with fake account, this is how PTIians think they will get votes” – B: “@A Stop making a fuss and fuck off.” – A: “@B A dumb leader like IK can produce followers like you.” – B: “@A A corrupt leader like Noora can hire paid trolls like you”