Sentiment Analysis in Machine Learning
Prof Pranali V Deshmukh
Department of Information Technology
International Institute of Information Technology, I²IT
www.isquareit.edu.in
Predicting sentiment by topic: An intelligent restaurant review system
It’s a big day & I want to book a table at a nice Japanese restaurant
Seattle has many ★★★★ sushi restaurants
What are people saying about the food? the ambiance? ...
Positive reviews not positive about everything
Sample review:
Watching the chefs create incredible edible art made the experience very unique.
My wife tried their ramen and it was pretty forgettable.
All the sushi was delicious! Easily best sushi in Seattle.
Experience
From reviews to topic sentiments
Experience ★★★★
Ramen ★★★
Sushi ★★★★★
Novel intelligent restaurant review app
Easily best sushi in Seattle.
All reviews for restaurant
Intelligent restaurant review system
All reviews for restaurant
Break all reviews into sentences
The seaweed salad was just OK, vegetable salad was just ordinary.
I like the interior decoration and the blackboard menu on the wall.
All the sushi was delicious.
My wife tried their ramen and it was pretty forgettable.
The sushi was amazing, and the rice is just outstanding.
The service is somewhat hectic.
Easily best sushi in Seattle.
Core building block
Easily best sushi in Seattle.
Sentence Sentiment Classifier
Intelligent restaurant review system
All reviews for restaurant
My wife tried their ramen and
it was pretty forgettable.
The service is somewhat hectic.
Easily best sushi in Seattle.
All the sushi was delicious.
The sushi was amazing, and
the rice is just outstanding.
Easily best sushi in Seattle.
All the sushi was delicious.
The sushi was amazing, and
the rice is just outstanding.
Break all reviews into sentences
Select sentences about “sushi”
The seaweed salad was just OK,
vegetable salad was just ordinary.
I like the interior decoration and the blackboard menu on the wall.
Sentence Sentiment Classifier
Sushi ★★★★★
Average predictions
Easily best sushi in Seattle.
Most
&
8
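The pipeline on this slide (break reviews into sentences, select the sentences about a topic, classify each one, average the predictions) can be sketched roughly as follows. This is only an illustration: the word lists, the helper names (split_sentences, classify_sentence, topic_sentiment) and the naive sentence splitter are assumptions made for the example, and the stand-in classifier is not the one the slides go on to build.

```python
import re

# Hypothetical word lists, chosen only so the example sentences classify sensibly.
POSITIVE = {"delicious", "amazing", "outstanding", "best"}
NEGATIVE = {"forgettable", "hectic", "ordinary"}

def split_sentences(review):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", review.strip()) if s]

def classify_sentence(sentence):
    """Stand-in classifier: +1 if positive words outnumber negative ones, else -1."""
    words = re.findall(r"[a-z]+", sentence.lower())
    n_pos = sum(w in POSITIVE for w in words)
    n_neg = sum(w in NEGATIVE for w in words)
    return 1 if n_pos > n_neg else -1

def topic_sentiment(reviews, topic):
    """Average the predictions over all sentences that mention `topic`.

    Returns the fraction of those sentences predicted positive,
    or None if no sentence mentions the topic.
    """
    sentences = [s for review in reviews for s in split_sentences(review)]
    selected = [s for s in sentences if topic in s.lower()]
    if not selected:
        return None
    predictions = [classify_sentence(s) for s in selected]
    return sum(p == 1 for p in predictions) / len(predictions)

reviews = [
    "All the sushi was delicious. My wife tried their ramen and it was pretty forgettable.",
    "The sushi was amazing, and the rice is just outstanding. The service is somewhat hectic.",
    "Easily best sushi in Seattle.",
]

print(topic_sentiment(reviews, "sushi"))  # 1.0
print(topic_sentiment(reviews, "ramen"))  # 0.0
```

How the averaged value is turned into a star rating is not specified on the slide, so the sketch stops at the fraction of positive sentences.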
Machine Learning Specialization
Classifier applications
©2015 Emily Fox & Carlos Guestrin
Classifier
Input: x (sentence from review)
Classifier MODEL
Output: y (predicted class)
Example multiclass classifier
Output y has more than 2 categories
Input: x (webpage)
Output: y (Education, Finance, Technology)

Spam filtering
Input: x (text of email, sender, IP, …)
Output: y (Not spam, Spam)

Image classification
Input: x (image pixels)
Output: y (predicted object)

Personalized medical diagnosis
Input: x
Disease Classifier MODEL
Output: y (Healthy, Cold, Flu, Pneumonia, …)

Reading your mind
“Hammer”
“House”
Linear classifiers
Representing classifiers
Input: x (sentence from review)
Classifier MODEL
Output: y (predicted class)
How does it work???
Count positive & negative words in sentence
If number of positive words > number of negative words:
    ŷ = positive
Else:
    ŷ = negative
List of positive words: great, awesome, good, amazing, …
List of negative words: bad, terrible, disgusting, sucks, …
Simple threshold classifier
Input x (sentence from review):
Sushi was great, the food was awesome, but the service was terrible.
Number of positive words: 2 (great, awesome)
Number of negative words: 1 (terrible)
2 > 1, so ŷ = positive
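A minimal sketch of the threshold classifier described above, assuming lowercase word matching and using only the words actually listed on the slide (the elided "…" entries are left out). The function name threshold_classify is hypothetical.

```python
import re

# Word lists as given on the slide; the "..." entries are not filled in.
POSITIVE_WORDS = {"great", "awesome", "good", "amazing"}
NEGATIVE_WORDS = {"bad", "terrible", "disgusting", "sucks"}

def threshold_classify(sentence):
    """Return (label, n_positive, n_negative) for one sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    n_pos = sum(w in POSITIVE_WORDS for w in words)
    n_neg = sum(w in NEGATIVE_WORDS for w in words)
    # Positive only if positive words strictly outnumber negative words.
    label = "+" if n_pos > n_neg else "-"
    return label, n_pos, n_neg

print(threshold_classify(
    "Sushi was great, the food was awesome, but the service was terrible."
))  # ('+', 2, 1)
```

Because it only counts single words, the same function labels "The sushi was not good." as positive (one positive word, no negative words).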
Problems with threshold classifier
• How do we get list of positive/negative words?
• Words have different degrees of sentiment:
  - Great > good
  - How do we weigh different words?
  Addressed by learning a classifier
• Single words are not enough:
  - Good → Positive
  - Not good → Negative
  Addressed by more elaborate features
A (linear) classifier
• Will use training data to learn a weight for each word
Word                           Weight
good                           1.0
great                          1.5
awesome                        2.7
bad                            -1.0
terrible                       -2.1
awful                          -3.3
restaurant, the, we, where, …  0.0
…                              …
Scoring a sentence
Word                           Weight
good                           1.0
great                          1.2
awesome                        1.7
bad                            -1.0
terrible                       -2.1
awful                          -3.3
restaurant, the, we, where, …  0.0
…                              …
Input x:
Sushi was great, the food was awesome, but the service was terrible.
Score(x) = 1.2 + 1.7 - 2.1 = 0.8
Called a linear classifier, because output is weighted sum of input.
Simple linear classifier
Input x: sentence from review
Score(x) = weighted count of words in sentence
If Score(x) > 0:
    ŷ = positive
Else:
    ŷ = negative
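The scoring rule can be sketched as below, using the weights from the "Scoring a sentence" table and treating every unlisted word as weight 0.0. The names score and predict, and the lowercase tokenization, are assumptions made for this example.

```python
import re

# Weights from the "Scoring a sentence" table; unlisted words count as 0.0.
WEIGHTS = {
    "good": 1.0, "great": 1.2, "awesome": 1.7,
    "bad": -1.0, "terrible": -2.1, "awful": -3.3,
}

def score(sentence, weights=WEIGHTS):
    """Weighted count of the words in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(weights.get(w, 0.0) for w in words)

def predict(sentence, weights=WEIGHTS):
    """Positive if Score(x) > 0, otherwise negative."""
    return "+" if score(sentence, weights) > 0 else "-"

x = "Sushi was great, the food was awesome, but the service was terrible."
print(round(score(x), 2), predict(x))  # 0.8 +
```

A sentence with no weighted words scores exactly 0 and therefore falls on the negative side under this rule.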
Decision boundaries
Suppose only two words had non-zero weight
Word     Weight
awesome  1.0
awful    -1.5
Sushi was awesome, the food was awesome, but the service was awful.
Score(x) = 1.0 #awesome – 1.5 #awful
[Plot: #awesome on the horizontal axis (0 1 2 3 4 …), #awful on the vertical axis (0 1 2 3 4 …)]
Decision boundary example
Word     Weight
awesome  1.0
awful    -1.5
Score(x) = 1.0 #awesome – 1.5 #awful
[Plot: #awesome on the horizontal axis, #awful on the vertical axis; a line separates the region where Score(x) > 0 from the region where Score(x) < 0]
Decision boundary separates positive & negative predictions
• For linear classifiers:
  - When 2 weights are non-zero → line
  - When 3 weights are non-zero → plane
  - When many weights are non-zero → hyperplane
• For more general classifiers → more complicated shapes
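For the two-word example, the boundary is the set of points where Score(x) is exactly 0. The sketch below evaluates the score for the example sentence and prints which side of the line each (#awesome, #awful) pair falls on. The function names and the small text grid are assumptions made for illustration.

```python
def score(n_awesome, n_awful, w_awesome=1.0, w_awful=-1.5):
    """Score(x) = 1.0 * #awesome - 1.5 * #awful, with the slide's weights as defaults."""
    return w_awesome * n_awesome + w_awful * n_awful

def side(n_awesome, n_awful):
    """Which side of the decision boundary a point falls on."""
    s = score(n_awesome, n_awful)
    if s > 0:
        return "+"
    if s < 0:
        return "-"
    return "boundary"

# Example sentence from the slide: 2 x "awesome", 1 x "awful".
print(score(2, 1), side(2, 1))  # 0.5 +

# A point lying exactly on the line 1.0 * #awesome - 1.5 * #awful = 0.
print(side(3, 2))  # boundary

# Text grid: rows are #awful (4 down to 0), columns are #awesome (0 to 4);
# "b" marks points on the boundary.
for n_awful in range(4, -1, -1):
    print(n_awful, " ".join(side(a, n_awful)[0] for a in range(5)))
```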
Training and evaluating a classifier
Training a classifier = Learning the weights
Data (x,y)
(Sentence1, )
(Sentence2, )
…
Training set
Test set
Learn classifier
Evaluate?
Word     Weight
good     1.0
awesome  1.7
bad      -1.0
awful    -3.3
…        …
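The slides do not say which learning algorithm produces the weights. As one possible illustration, the sketch below uses a simple perceptron-style update (a stand-in chosen for brevity, not the method of the slides) on a tiny made-up training set, then evaluates on a separate made-up test set. All sentences, labels and function names here are hypothetical.

```python
import re

def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())

def score(weights, sentence):
    """Weighted count of words; words without a learned weight count as 0.0."""
    return sum(weights.get(w, 0.0) for w in tokenize(sentence))

def predict(weights, sentence):
    return 1 if score(weights, sentence) > 0 else -1

def train(train_set, epochs=10):
    """Perceptron-style training: on each mistake, nudge the weight of every
    word in the sentence towards the true label (+1 or -1)."""
    weights = {}
    for _ in range(epochs):
        for sentence, label in train_set:
            if predict(weights, sentence) != label:
                for w in tokenize(sentence):
                    weights[w] = weights.get(w, 0.0) + label
    return weights

def accuracy(weights, data):
    return sum(predict(weights, s) == y for s, y in data) / len(data)

# Hypothetical labelled data, split by hand into training and test sets.
train_set = [
    ("The sushi was awesome", 1),
    ("The ramen was awful", -1),
    ("Awesome service and good food", 1),
    ("The service was bad", -1),
]
test_set = [
    ("The food was awesome", 1),
    ("Awful ramen", -1),
]

weights = train(train_set)
print(weights["awesome"], weights["awful"])
print(accuracy(weights, test_set))
```

The test set is kept out of training so that the evaluation reflects sentences the classifier has not seen.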
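Classification error
Test examples: (Sushi was great, ), (Food was OK, )
Hide label
Learned classifier
ŷ =
Correct! / Mistake!
[Slide animation: each test sentence is passed to the learned classifier with its label hidden, and the prediction is tallied under Correct or Mistakes]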
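Classification error & accuracy
• Error measures fraction of mistakes
  - Best possible value is 0.0
• Often, measure accuracy
  - Fraction of correct predictions
  - Best possible value is 1.0
error = (number of mistakes) / (total number of test sentences)
accuracy = (number of correct predictions) / (total number of test sentences)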
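Both measures can be computed directly from the definitions above. The following is a small sketch; the function name and the example labels are made up for illustration.

```python
def error_and_accuracy(true_labels, predicted_labels):
    """Return (error, accuracy) as fractions of the test examples."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    mistakes = sum(t != p for t, p in zip(true_labels, predicted_labels))
    error = mistakes / len(true_labels)
    return error, 1.0 - error

# Hypothetical test set of four sentences with one mistake.
true_labels = ["+", "+", "-", "-"]
predicted_labels = ["+", "-", "-", "-"]
print(error_and_accuracy(true_labels, predicted_labels))  # (0.25, 0.75)
```

Error and accuracy always sum to 1, so reporting either one is enough.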
What’s a good accuracy?
What if you ignore the sentence, and just guess?
• For binary classification:
  - Half the time, you’ll get it right! (on average) → accuracy = 0.5
• For k classes, accuracy = 1/k
  - 0.333 for 3 classes, 0.25 for 4 classes, …
At the very, very, very least, you should healthily beat random…
Otherwise, it’s (usually) pointless…
Is a classifier with 90% accuracy good? …
2010 data shows: “90% emails sent are spam!”
Predicting every email is spam gets you 90% accuracy!!!
Majority class prediction
Amazing performance when there is class imbalance (but silly approach)
• One class is more common than others
• Beats random (if you know the majority class)
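The two baselines can be checked on a synthetic data set that mirrors the 90% spam figure. This is a sketch under assumptions: the 9,000/1,000 split, the function names and the fixed random seed are all chosen for illustration only.

```python
import random
from collections import Counter

def majority_class_baseline(labels):
    """Always predict the most common label; return (label, accuracy)."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

def random_guess_accuracy(labels, classes, seed=0):
    """Accuracy of guessing uniformly at random among `classes` (simulated)."""
    rng = random.Random(seed)
    correct = sum(rng.choice(classes) == y for y in labels)
    return correct / len(labels)

# Synthetic labels mirroring "90% of emails sent are spam".
labels = ["spam"] * 9000 + ["not spam"] * 1000
classes = ["spam", "not spam"]

print(majority_class_baseline(labels))         # ('spam', 0.9)
print(random_guess_accuracy(labels, classes))  # close to 1/k = 0.5
```

A learned spam filter on data like this would need to beat 0.9, not 0.5, before its accuracy says anything useful.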
So, always be digging in and asking the hard questions about reported accuracies
• Is there class imbalance?
• How does it compare to a simple, baseline approach?
  - Random guessing
  - Majority class
  - …
• Most importantly: what accuracy does my application need?
  - What is good enough for my user’s experience?
  - What is the impact of the mistakes we make?
False positives, false negatives, and confusion matrices
Types of mistakes
                      Predicted label: positive   Predicted label: negative
True label: positive  True Positive               False Negative (FN)
True label: negative  False Positive (FP)         True Negative
Cost of different types of mistakes can be different (& high) in some applications
                   False negative       False positive
Spam filtering     Annoying             Email lost
Medical diagnosis  Disease not treated  Wasteful treatment
Confusion matrix – binary classification
[Table: rows are the true label, columns are the predicted label]
Confusion matrix – multiclass classification
[Table: rows are the true label (Healthy, Cold, Flu), columns are the predicted label (Healthy, Cold, Flu)]
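A confusion matrix is a table of counts indexed by (true label, predicted label). The sketch below builds one for the three-class example; the eight true/predicted label pairs are invented for illustration, since the counts on the original slide are not available in this text.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows follow the true label, columns the predicted label, in `classes` order."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

classes = ["Healthy", "Cold", "Flu"]

# Hypothetical labels for eight patients.
true_labels = ["Healthy", "Healthy", "Healthy", "Cold", "Cold", "Flu", "Flu", "Flu"]
predicted_labels = ["Healthy", "Healthy", "Cold", "Cold", "Flu", "Flu", "Flu", "Healthy"]

matrix = confusion_matrix(true_labels, predicted_labels, classes)
for name, row in zip(classes, matrix):
    print(f"{name:8s}", row)

# Correct predictions sit on the diagonal, so accuracy is the diagonal sum over the total.
correct = sum(matrix[i][i] for i in range(len(classes)))
print(correct / len(true_labels))  # 0.625
```

Off-diagonal cells show which classes get confused with which, something a single accuracy number hides.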
Learning curves: How much data do I need?
How much data does a model need to learn?
• The more the merrier
  - But data quality is most important factor
• Theoretical techniques sometimes can bound how much data is needed
  - Typically too loose for practical application
  - But provide guidance
• In practice:
  - More complex models require more data
  - Empirical analysis can provide guidance
Learning curves
[Plot: test error (vertical axis) against amount of training data (horizontal axis)]
Is there a limit?
Yes, for most models…
[Plot: test error (vertical axis) against amount of training data (horizontal axis), with the bias of the model marked]
More complex models tend to have less bias…
Sentiment classifier using single words can do OK, but…
Never classifies correctly: “The sushi was not good.”
More complex model: consider pairs of words (bigrams)
Word      Weight
good      +1.5
not good  -2.1
Less bias → potentially more accurate, needs more data to learn
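The effect of adding bigram features can be sketched as follows, using the two weights from the table. The sketch assumes that both the single word "good" and the pair "not good" contribute to the score when they occur, which the slide does not state explicitly; the function names are hypothetical.

```python
import re

def features(sentence):
    """Single words plus adjacent word pairs (bigrams)."""
    words = re.findall(r"[a-z]+", sentence.lower())
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams

def score(sentence, weights):
    return sum(weights.get(f, 0.0) for f in features(sentence))

single_word_weights = {"good": 1.5}
bigram_weights = {"good": 1.5, "not good": -2.1}  # weights from the table above

x = "The sushi was not good."
print(round(score(x, single_word_weights), 2))  # 1.5  -> predicted positive
print(round(score(x, bigram_weights), 2))       # -0.6 -> predicted negative
```

The bigram model has many more possible features than the single-word model, which is why it needs more training data to learn reliable weights.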
Models with less bias tend to need more data to learn well, but do better with sufficient data
[Plot: test error (vertical axis) against amount of training data (horizontal axis), with a curve labeled "Classifier based on single words"]
Class probabilities
How confident is your prediction?
• Thus far, we’ve outputted a prediction
• But, how sure are you about the prediction?
  - “The sushi & everything else were awesome!” (Definite): P(y=+|x) = 0.99
  - “The sushi was good, the service was OK.” (Not sure): P(y=+|x) = 0.55
Many classifiers provide a confidence level: P(y|x), the probability of the output label y given the input sentence x
Extremely useful in practice
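The slides do not say how P(y|x) is computed. One common choice, shown here purely as an assumption, is to pass the linear score through the logistic (sigmoid) function, so that a score of 0 maps to 0.5 and large positive scores map to values near 1. The function names are hypothetical.

```python
import math

def probability_positive(score):
    """Logistic function of the score, written to avoid overflow for large |score|."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)

def predict_with_confidence(score):
    """Return the predicted label and the probability assigned to that label."""
    p_pos = probability_positive(score)
    return ("+", p_pos) if p_pos > 0.5 else ("-", 1.0 - p_pos)

print(probability_positive(0.0))                       # 0.5
print(predict_with_confidence(5.0))                    # ('+', about 0.993)
print(round(predict_with_confidence(-2.0)[1], 3))      # 0.881
```

Under this assumption, a sentence whose score is barely above zero would be reported as positive with a probability only slightly above 0.5.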
Summary of classification
What you can do now…
• Identify a classification problem and some common applications
• Describe decision boundaries and linear classifiers
• Train a classifier
• Measure its error
  - Some rules of thumb for good accuracy
• Interpret the types of error associated with classification
• Describe the tradeoffs between model bias and dataset size
• Use class probability to express degree of confidence in prediction
Thank You !!
https://www.isquareit.edu.in/
Slack like a pro: strategies for 10x engineering teams
Nacho Cougil
 
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptxSmart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Seasia Infotech
 
Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)Design pattern talk by Kaya Weers - 2025 (v2)
Design pattern talk by Kaya Weers - 2025 (v2)
Kaya Weers
 
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptxReimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
John Moore
 
Bepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firmBepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firm
Benard76
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Christian Folini
 
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
SOFTTECHHUB
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
AsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API DesignAsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API Design
leonid54
 

Sentiment Analysis in Machine Learning

  • 1. Sentiment Analysis in Machine Learning. Prof Pranali V Deshmukh, Department of Information Technology, International Institute of Information Technology, I²IT. www.isquareit.edu.in
  • 2. Predicting sentiment by topic: An intelligent restaurant review system
  • 3. It’s a big day & I want to book a table at a nice Japanese restaurant. Seattle has many ★★★★ sushi restaurants. What are people saying about the food? The ambiance? ...
  • 4. Positive reviews not positive about everything. Sample review: Watching the chefs create incredible edible art made the experience very unique. My wife tried their ramen and it was pretty forgettable. All the sushi was delicious! Easily best sushi in Seattle. (Topic label: Experience)
  • 5. From reviews to topic sentiments. All reviews for restaurant. Novel intelligent restaurant review app. Experience ★★★★, Ramen ★★★, Sushi ★★★★★. Easily best sushi in Seattle.
  • 6. Intelligent restaurant review system. All reviews for restaurant. Break all reviews into sentences: The seaweed salad was just OK, vegetable salad was just ordinary. I like the interior decoration and the blackboard menu on the wall. All the sushi was delicious. My wife tried their ramen and it was pretty forgettable. The sushi was amazing, and the rice is just outstanding. The service is somewhat hectic. Easily best sushi in Seattle.
  • 7. Core building block: Sentence Sentiment Classifier. Example input: Easily best sushi in Seattle.
  • 8. Intelligent restaurant review system. All reviews for restaurant. Break all reviews into sentences: The seaweed salad was just OK, vegetable salad was just ordinary. I like the interior decoration and the blackboard menu on the wall. My wife tried their ramen and it was pretty forgettable. The service is somewhat hectic. Easily best sushi in Seattle. All the sushi was delicious. The sushi was amazing, and the rice is just outstanding. Select sentences about “sushi”: Easily best sushi in Seattle. All the sushi was delicious. The sushi was amazing, and the rice is just outstanding. Sentence Sentiment Classifier. Average predictions: Sushi ★★★★★. Easily best sushi in Seattle. Most &
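The pipeline on slide 8 (break reviews into sentences, select the sentences about a topic, classify each one, average the predictions) can be sketched as follows. This is a minimal illustration, not the system from the slides: `classify_sentence` is a hypothetical stand-in for the sentence sentiment classifier, the word list is made up for the example, and the sentence splitter is deliberately naive.

```python
import re
from statistics import mean
from typing import List, Optional

# Hypothetical stand-in for the sentence sentiment classifier:
# returns 1.0 (positive) if any word from a small made-up list appears, else 0.0.
POSITIVE_WORDS = {"delicious", "amazing", "outstanding", "best"}

def classify_sentence(sentence: str) -> float:
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return 1.0 if words & POSITIVE_WORDS else 0.0

def split_into_sentences(review: str) -> List[str]:
    # Naive split after sentence-ending punctuation; real text needs more care.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]

def topic_sentiment(reviews: List[str], topic: str) -> Optional[float]:
    # 1. Break all reviews into sentences.
    sentences = [s for r in reviews for s in split_into_sentences(r)]
    # 2. Select sentences that mention the topic.
    about_topic = [s for s in sentences if topic.lower() in s.lower()]
    if not about_topic:
        return None  # no sentence mentions this topic
    # 3. Classify each selected sentence and average the predictions.
    return mean(classify_sentence(s) for s in about_topic)

reviews = [
    "All the sushi was delicious! Easily best sushi in Seattle.",
    "My wife tried their ramen and it was pretty forgettable.",
    "The sushi was amazing, and the rice is just outstanding. "
    "The service is somewhat hectic.",
]
sushi_score = topic_sentiment(reviews, "sushi")
ramen_score = topic_sentiment(reviews, "ramen")
print(sushi_score, ramen_score)
```

The averaged score per topic could then be mapped onto a star rating; that mapping is not specified in the slides.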
  • 9. Machine Learning Specialization. Classifier applications. ©2015 Emily Fox & Carlos Guestrin
  • 11. Example multiclass classifier. Output y has more than 2 categories. Input x: Webpage. Output y: Education, Finance, Technology.
  • 12. Spam filtering. Input x: Text of email, sender, IP, … Output y: Not spam, Spam.
  • 13. Image classification. Input x: Image pixels. Output y: Predicted object.
  • 14. Personalized medical diagnosis. Input: x. Disease classifier (model). Output y: Healthy, Cold, Flu, Pneumonia, …
  • 18. Simple threshold classifier. Input x: Sentence from review. List of positive words: great, awesome, good, amazing, … List of negative words: bad, terrible, disgusting, sucks, … Count positive & negative words in sentence. If number of positive words > number of negative words: ŷ = positive. Else: ŷ = negative.
  • 19. Simple threshold classifier. List of positive words: great, awesome, good, amazing, … List of negative words: bad, terrible, disgusting, sucks, … Count positive & negative words in sentence. If number of positive words > number of negative words: ŷ = positive. Else: ŷ = negative. Example: Sushi was great, the food was awesome, but the service was terrible. (2 positive words, 1 negative word, so ŷ = positive.)
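The counting rule on slides 18 and 19 can be written in a few lines. This is a sketch under stated assumptions: the word lists contain only the words shown on the slide (the slide's lists continue with "…"), the tokenizer is a simple regular expression, and encoding the labels as +1 and -1 is a choice made for this example.

```python
import re

# Only the words visible on the slide; the real lists are longer.
POSITIVE_WORDS = {"great", "awesome", "good", "amazing"}
NEGATIVE_WORDS = {"bad", "terrible", "disgusting", "sucks"}

def threshold_classify(sentence):
    """Return (label, n_pos, n_neg); label is +1 (positive) or -1 (negative)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    n_pos = sum(w in POSITIVE_WORDS for w in words)
    n_neg = sum(w in NEGATIVE_WORDS for w in words)
    label = +1 if n_pos > n_neg else -1
    return label, n_pos, n_neg

label, n_pos, n_neg = threshold_classify(
    "Sushi was great, the food was awesome, but the service was terrible."
)
print(label, n_pos, n_neg)  # 1 2 1

# The rule ignores negation: "not good" still counts "good" as positive.
negated_label, _, _ = threshold_classify("The sushi was not good.")
print(negated_label)  # 1
```

The second call shows one of the weaknesses of this rule: a negated positive word is still counted as positive.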
  • 20. Problems with threshold classifier. • How do we get list of positive/negative words? • Words have different degrees of sentiment: - Great > good - How do we weigh different words? • Single words are not enough: - Good → Positive - Not good → Negative. The first two problems are addressed by learning a classifier; the third is addressed by more elaborate features.
  • 21. A (linear) classifier. • Will use training data to learn a weight for each word. Word: Weight. good: 1.0; great: 1.5; awesome: 2.7; bad: -1.0; terrible: -2.1; awful: -3.3; restaurant, the, we, where, …: 0.0; …
  • 22. Scoring a sentence. Word: Weight. good: 1.0; great: 1.2; awesome: 1.7; bad: -1.0; terrible: -2.1; awful: -3.3; restaurant, the, we, where, …: 0.0; … Input x: Sushi was great, the food was awesome, but the service was terrible. Called a linear classifier, because output is weighted sum of input.
  • 23. Simple linear classifier. Input x: Sentence from review. Word: Weight (…). Score(x) = weighted count of words in sentence. If Score(x) > 0: ŷ = positive. Else: ŷ = negative.
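As a worked example of the scoring rule, the sketch below applies the weights listed on slide 22 to the slide's input sentence. The function names and the tokenizer are assumptions made for illustration; words not in the table get weight 0.0, as the slide indicates for words like "restaurant" and "the".

```python
import re

# Weights as listed on slide 22; every other word has weight 0.0.
WEIGHTS = {
    "good": 1.0, "great": 1.2, "awesome": 1.7,
    "bad": -1.0, "terrible": -2.1, "awful": -3.3,
}

def score(sentence, weights=WEIGHTS):
    """Weighted count of words: each occurrence adds that word's weight."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(weights.get(w, 0.0) for w in words)

def classify(sentence, weights=WEIGHTS):
    return "positive" if score(sentence, weights) > 0 else "negative"

sentence = "Sushi was great, the food was awesome, but the service was terrible."
sentence_score = score(sentence)
# great (1.2) + awesome (1.7) + terrible (-2.1) = 0.8, up to floating-point rounding
print(round(sentence_score, 2), classify(sentence))
```

Because the score is a weighted sum, a single strongly negative word can outweigh several mildly positive ones, which the plain counting rule cannot express.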
  • 24. Decision boundaries
  • 25. Suppose only two words had non-zero weight. Word: Weight. awesome: 1.0; awful: -1.5. Score(x) = 1.0 #awesome - 1.5 #awful. Example: Sushi was awesome, the food was awesome, but the service was awful. [Plot: #awesome on the horizontal axis and #awful on the vertical axis, each running 0, 1, 2, 3, 4, …]
  • 26. Decision boundary example. Word: Weight. awesome: 1.0; awful: -1.5. Score(x) = 1.0 #awesome - 1.5 #awful. [Plot: #awesome vs. #awful, with one region where Score(x) > 0 and one where Score(x) < 0.]
  • 27. Decision boundary separates positive & negative predictions. • For linear classifiers: - When 2 weights are non-zero → line - When 3 weights are non-zero → plane - When many weights are non-zero → hyperplane • For more general classifiers → more complicated shapes
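With only two non-zero weights, the decision boundary is the set of points where Score(x) = 0, that is, the line 1.0 #awesome - 1.5 #awful = 0. The sketch below evaluates a few (#awesome, #awful) points against that line. The function names are hypothetical, and treating a score of exactly 0 as negative follows the "If Score(x) > 0 … Else" rule from the earlier slide.

```python
W_AWESOME = 1.0   # weight from the slide
W_AWFUL = -1.5    # weight from the slide

def score(n_awesome, n_awful):
    """Score(x) = 1.0 * #awesome - 1.5 * #awful."""
    return W_AWESOME * n_awesome + W_AWFUL * n_awful

def side(n_awesome, n_awful):
    s = score(n_awesome, n_awful)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "on the boundary (predicted negative by the > 0 rule)"

# The slide's example sentence has 2 x "awesome" and 1 x "awful".
for point in [(2, 1), (1, 1), (3, 2), (0, 0)]:
    print(point, score(*point), side(*point))
```

The point (3, 2) lies exactly on the line, since 3 - 1.5 * 2 = 0; points below the line (more "awesome" relative to "awful") are predicted positive.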
  • 28. Training and evaluating a classifier
  • 29. Training a classifier = Learning the weights. Data (x, y): (Sentence1, label), (Sentence2, label), … The data is split into a training set and a test set. Training set: learn classifier. Test set: evaluate. Word: Weight. good: 1.0; awesome: 1.7; bad: -1.0; awful: -3.3; …
  • 30. Classification error. Test examples: (Sushi was great, label), (Food was OK, label). Hide the label, pass the sentence to the learned classifier to get ŷ, then compare ŷ with the true label: Correct! or Mistake! Running counts of correct predictions and mistakes are kept (each going from 0 to 1 in this example).
  • 31. Classification error & accuracy. • Error measures fraction of mistakes: error = (number of mistakes) / (total number of sentences) - Best possible value is 0.0 • Often, measure accuracy - Fraction of correct predictions: accuracy = (number of correct predictions) / (total number of sentences) - Best possible value is 1.0
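The two measures are simple to compute once predictions on the test set are available. The following is a small sketch; the label encoding (+1 / -1) and the example labels are made up for illustration.

```python
def error_and_accuracy(true_labels, predicted_labels):
    """Return (error, accuracy) as fractions of the test set."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    total = len(true_labels)
    mistakes = sum(t != p for t, p in zip(true_labels, predicted_labels))
    error = mistakes / total
    accuracy = (total - mistakes) / total
    return error, accuracy

# Hypothetical test set of four sentences with one mistake.
true_labels = [+1, +1, -1, -1]
predicted_labels = [+1, -1, -1, -1]
error, accuracy = error_and_accuracy(true_labels, predicted_labels)
print(error, accuracy)  # 0.25 0.75
```

Error and accuracy always sum to 1, so reporting either one is enough.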
  • 32. What’s a good accuracy?
  • 33. What if you ignore the sentence, and just guess? • For binary classification: - Half the time, you’ll get it right! (on average) → accuracy = 0.5 • For k classes, accuracy = 1/k - 0.333 for 3 classes, 0.25 for 4 classes, … At the very, very, very least, you should healthily beat random… Otherwise, it’s (usually) pointless…
  • 34. Is a classifier with 90% accuracy good? 2010 data shows: “90% emails sent are spam!” Predicting every email is spam gets you 90% accuracy!!! Majority class prediction: amazing performance when there is class imbalance (but silly approach). • One class is more common than others • Beats random (if you know the majority class)
  • 35. So, always be digging in and asking the hard questions about reported accuracies. • Is there class imbalance? • How does it compare to a simple, baseline approach? - Random guessing - Majority class - … • Most importantly: what accuracy does my application need? - What is good enough for my user’s experience? - What is the impact of the mistakes we make?
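The two baselines named above can be computed directly from the label distribution, which makes them easy to report next to a classifier's accuracy. This is a sketch with made-up data: the 90/10 split mirrors the spam figure quoted on slide 34, and the function names are hypothetical.

```python
from collections import Counter

def majority_class_baseline(labels):
    """Accuracy obtained by always predicting the most common label."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

def random_guess_baseline(num_classes):
    """Expected accuracy of uniform random guessing over k classes: 1/k."""
    return 1.0 / num_classes

# Hypothetical data set with the 90% spam share quoted on the slide.
labels = ["spam"] * 90 + ["not spam"] * 10
majority_label, majority_accuracy = majority_class_baseline(labels)
print(majority_label, majority_accuracy)   # spam 0.9
print(random_guess_baseline(2))            # 0.5
```

On this data a classifier reporting 90% accuracy has learned nothing beyond the class imbalance, so the baseline comparison is what gives the number meaning.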
  • 36. False positives, false negatives, and confusion matrices
  • 37. Types of mistakes (true label vs. predicted label). True label positive: predicted positive is a True Positive; predicted negative is a False Negative (FN). True label negative: predicted positive is a False Positive (FP); predicted negative is a True Negative.
  • 38. Cost of different types of mistakes can be different (& high) in some applications. Spam filtering: false negative is annoying; false positive means an email is lost. Medical diagnosis: false negative means a disease is not treated; false positive means wasteful treatment.
  • 39. Confusion matrix: binary classification. [Table of true label vs. predicted label.]
  • 40. Confusion matrix: multiclass classification. [Table of true label (Healthy, Cold, Flu) vs. predicted label (Healthy, Cold, Flu).]
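A confusion matrix is a table of counts indexed by (true label, predicted label). The sketch below builds one for the three classes on slide 40. The labels and predictions are invented for the example, and putting true labels on the rows and predictions on the columns is an assumed convention, not something the transcript specifies.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[i][j] = number of examples with true class i predicted as class j."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

classes = ["Healthy", "Cold", "Flu"]
# Hypothetical true labels and predictions for seven patients.
true_labels = ["Healthy", "Healthy", "Cold", "Cold", "Flu", "Flu", "Flu"]
predicted = ["Healthy", "Cold", "Cold", "Cold", "Flu", "Cold", "Healthy"]

matrix = confusion_matrix(true_labels, predicted, classes)
for name, row in zip(classes, matrix):
    print(f"{name:8s}", row)

# Correct predictions sit on the diagonal; everything else is a mistake.
correct = sum(matrix[i][i] for i in range(len(classes)))
accuracy = correct / len(true_labels)
print(correct, len(true_labels))
```

Reading along a row shows how a given true class gets misclassified, which accuracy alone hides.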
  • 41. Learning curves: How much data do I need?
  • 42. How much data does a model need to learn? • The more the merrier - But data quality is most important factor • Theoretical techniques sometimes can bound how much data is needed - Typically too loose for practical application - But provide guidance • In practice: - More complex models require more data - Empirical analysis can provide guidance
  • 44. Is there a limit? Yes, for most models… [Plot: test error vs. amount of training data, with the bias of the model marked.]
  • 45. More complex models tend to have less bias… Sentiment classifier using single words can do OK, but… never classifies correctly: “The sushi was not good.” More complex model: consider pairs of words (bigrams). Word: Weight. good: +1.5; not good: -2.1. Less bias → potentially more accurate, needs more data to learn.
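The effect of adding bigrams can be checked with the two weights on slide 45. In this sketch the feature extraction (single words plus adjacent word pairs) and the function names are assumptions made for illustration; the weights are the ones on the slide.

```python
import re

# Weights from the slide: one single word and one bigram.
WEIGHTS = {"good": 1.5, "not good": -2.1}

def features(sentence, use_bigrams):
    words = re.findall(r"[a-z']+", sentence.lower())
    feats = list(words)
    if use_bigrams:
        # Adjacent word pairs such as "not good".
        feats += [f"{a} {b}" for a, b in zip(words, words[1:])]
    return feats

def score(sentence, use_bigrams):
    return sum(WEIGHTS.get(f, 0.0) for f in features(sentence, use_bigrams))

sentence = "The sushi was not good."
unigram_score = score(sentence, use_bigrams=False)  # only "good" fires
bigram_score = score(sentence, use_bigrams=True)    # "good" and "not good" fire
print(unigram_score, round(bigram_score, 2))        # 1.5 -0.6
```

With single words the sentence scores +1.5 and is misclassified as positive; with the bigram feature the score drops to about -0.6. The cost is a much larger feature set, which is why such a model needs more data to learn its weights.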
  • 46. Models with less bias tend to need more data to learn well, but do better with sufficient data. [Plot: test error vs. amount of training data, with a curve labeled “Classifier based on single words”.]
  • 47. Class probabilities
  • 48. How confident is your prediction? • Thus far, we’ve outputted a prediction • But, how sure are you about the prediction? - “The sushi & everything else were awesome!” Definite: P(y=+|x) = 0.99 - “The sushi was good, the service was OK.” Not sure: P(y=+|x) = 0.55. Many classifiers provide a confidence level: P(y|x), the probability of the output label given the input sentence. Extremely useful in practice.
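The slides do not say how P(y|x) is obtained. One common choice, assumed here purely for illustration, is to pass the linear score through the logistic (sigmoid) function, as logistic regression does. The scores below are made-up inputs, not values computed from the slide's sentences.

```python
import math

def sigmoid(score):
    """Map a real-valued score to a probability in (0, 1)."""
    # Two branches keep math.exp from overflowing for large |score|.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)

def prob_positive(score):
    """Assumed model: P(y=+|x) = sigmoid(Score(x))."""
    return sigmoid(score)

# Hypothetical scores: a clearly positive sentence and a borderline one.
for s in (4.6, 0.2, 0.0, -3.0):
    print(s, round(prob_positive(s), 2))
```

Under this assumption a score of 0 (a point on the decision boundary) maps to probability 0.5, and the probability approaches 1 or 0 as the score moves away from the boundary.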
  • 49. Summary of classification
  • 50. What you can do now… • Identify a classification problem and some common applications • Describe decision boundaries and linear classifiers • Train a classifier • Measure its error - Some rules of thumb for good accuracy • Interpret the types of error associated with classification • Describe the tradeoffs between model bias and dataset size • Use class probability to express degree of confidence in prediction