Mining Explicit User Generated Content:
Sentiment Analysis
Kira Radinsky
Slides based on material from: Bing Liu (WWW-2008 tutorial)
Introduction
• Two main types of textual information.
– Facts and Opinions
• Most current text information processing
methods (e.g., web search, text mining) work
with factual information.
• Sentiment analysis/opinion mining
– computational study of opinions, sentiments and
emotions expressed in text.
– huge volumes of opinionated text on the web
User Generated Media
Word-of-mouth on the Web
– User-generated media: One can express opinions
on anything in reviews, forums, discussion groups,
blogs ...
– Opinions of global scale: No longer limited to:
• Individuals: one’s circle of friends
• Businesses: Small scale surveys, tiny focus
groups, etc.
An Example Review
• “I bought an iPhone a few days ago. It was such a nice
phone. The touch screen was really cool. The voice quality
was clear too. Although the battery life was not long, that
is ok for me. However, my mother was mad with me as I
did not tell her before I bought the phone. She also thought
the phone was too expensive, and wanted me to return it
to the shop. …”
• What do we see?
– Opinions, targets of opinions, and opinion holders
Target Object
• Definition (object): An object o is a product, person, event,
organization, or topic. o is represented as
– A hierarchy of components, sub-components, and so on.
– Each node represents a component and is associated with a
set of attributes of the component.
• An opinion can be expressed on any node or on any attribute
of a node (nodes and attributes are both called features)
* Liu, Web Data Mining book, 2006
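The object definition above can be sketched as a small tree structure. This is a minimal illustration, not code from the cited book: the class name `Component`, the `features` method, and the particular iPhone hierarchy are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """A node in the object hierarchy: a component plus its attributes."""
    name: str
    attributes: List[str] = field(default_factory=list)
    children: List["Component"] = field(default_factory=list)

    def features(self) -> List[str]:
        """List every opinion target in this subtree (nodes and attributes)."""
        found = [self.name]
        found += [f"{self.name}.{attr}" for attr in self.attributes]
        for child in self.children:
            found += child.features()
        return found


# Hypothetical hierarchy for the iPhone review used in these slides.
phone = Component(
    name="iPhone",
    attributes=["price", "voice quality"],
    children=[
        Component("touch screen"),
        Component("battery", attributes=["life"]),
    ],
)

for feature in phone.features():
    print(feature)
```

Flattening the tree this way gives a single list of features that opinions can later be attached to.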
What is an Opinion?
An opinion is a quintuple:
(o_j, f_jk, so_ijkl, h_i, t_l),
• o_j is a target object.
• f_jk is a feature of the object o_j.
• h_i is an opinion holder.
• t_l is the time when the opinion is expressed.
• so_ijkl is the sentiment value of the opinion of
the opinion holder h_i on feature f_jk of object o_j
at time t_l. so_ijkl is +ve, -ve, or neu, or a more
granular rating.
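As a rough sketch, the quintuple can be written down as a record type. The field names, the `"GENERAL"` placeholder for an opinion on the object as a whole, and the hand-labelled opinions below are assumptions for illustration. They are one plausible reading of the example review, not the output of any system. The time is left as `None` because the review gives no date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    target: str                   # o_j: the target object
    feature: str                  # f_jk: a feature of the object
    sentiment: str                # so_ijkl: "+ve", "-ve", or "neu"
    holder: str                   # h_i: the opinion holder
    time: Optional[date] = None   # t_l: when the opinion was expressed


# Hand-labelled (hypothetical) reading of the example iPhone review.
review_opinions = [
    Opinion("iPhone", "GENERAL", "+ve", "author"),
    Opinion("iPhone", "touch screen", "+ve", "author"),
    Opinion("iPhone", "voice quality", "+ve", "author"),
    Opinion("iPhone", "battery life", "-ve", "author"),
    Opinion("iPhone", "price", "-ve", "author's mother"),
]

holders = {op.holder for op in review_opinions}
negatives = [op.feature for op in review_opinions if op.sentiment == "-ve"]
print(sorted(holders))
print(negatives)
```

Keeping the holder as its own field is what separates the author's opinions from the mother's opinion on price.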
Sentiment Analysis approaches
• Document level sentiment classification
– Unsupervised review classification (Turney, ACL-02)
– Sentiment classification using machine learning
methods (Pang et al, EMNLP-02)
• Sentence level sentiment analysis
– Using learnt patterns (Riloff and Wiebe, EMNLP-03)
• Feature-based opinion mining and
summarization
– Next slides
Feature-Based Sentiment Analysis
• Objective: Discovering all quintuples
(o_j, f_jk, so_ijkl, h_i, t_l)
• Sentiment classification at the document and sentence
(or clause) levels is not enough:
– It does not tell what people like and/or dislike.
– A positive opinion on an object does not mean that the opinion
holder likes everything.
– A negative opinion on an object does not mean that the opinion
holder dislikes everything.
Feature-Based Opinion Summary
“I bought an iPhone a few days
ago. It was such a nice phone.
The touch screen was really cool.
The voice quality was clear too.
Although the battery life was not
long, that is ok for me. However,
my mother was mad with me as
I did not tell her before I bought
the phone. She also thought the
phone was too expensive, and
wanted me to return it to the
shop. …”
Feature Based Summary:
Feature1: Touch screen
Positive: 212
• The touch screen was really cool.
• The touch screen was so easy to use
and can do amazing things.
…
Negative: 6
• The screen is easily scratched.
• I have a lot of difficulty in removing
finger marks from the touch screen.
…
Feature2: battery life
…
*Hu & Liu, KDD-2004
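The summary above amounts to grouping extracted opinions by feature and polarity. The sketch below assumes an upstream extractor has already produced (feature, sentiment, sentence) triples. The function name `summarize` and the tiny input list are hypothetical, so the counts it prints are for this toy input, not the 212 and 6 shown on the slide.

```python
from collections import Counter, defaultdict

# Assumed output of an upstream feature/sentiment extractor (not shown).
extracted = [
    ("touch screen", "+ve", "The touch screen was really cool."),
    ("touch screen", "+ve",
     "The touch screen was so easy to use and can do amazing things."),
    ("touch screen", "-ve", "The screen is easily scratched."),
    ("touch screen", "-ve",
     "I have a lot of difficulty in removing finger marks from the touch screen."),
    ("battery life", "-ve",
     "Although the battery life was not long, that is ok for me."),
]


def summarize(opinions):
    """Group opinion sentences by feature, then by sentiment polarity."""
    counts = defaultdict(Counter)
    examples = defaultdict(lambda: defaultdict(list))
    for feature, sentiment, sentence in opinions:
        counts[feature][sentiment] += 1
        examples[feature][sentiment].append(sentence)
    return counts, examples


counts, examples = summarize(extracted)
for feature in counts:
    print(f"Feature: {feature}")
    for label, polarity in (("Positive", "+ve"), ("Negative", "-ve")):
        print(f"  {label}: {counts[feature][polarity]}")
        for sentence in examples[feature][polarity]:
            print(f"    • {sentence}")
```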
Visual Comparison
[Figure: positive (+) and negative (_) opinions per feature (Voice, Screen, Size, Weight, Battery). First panel: summary of reviews of Cell Phone 1. Second panel: comparison of reviews of Cell Phone 1 and Cell Phone 2.]
* Liu et al. WWW-2005
Bing feature-based opinion summary
Sentiment Analysis is Hard!
• “This past Saturday, I bought a Nokia phone
and my girlfriend bought a Motorola phone
with Bluetooth. We called each other when we
got home. The voice on my phone was not so
clear, worse than my previous phone. The
battery life was long. My girlfriend was quite
happy with her phone. I wanted a phone with
good sound quality. So my purchase was a real
disappointment. I returned the phone
yesterday.”
Not Just ONE Problem
• (o_j, f_jk, so_ijkl, h_i, t_l),
– o_j - a target object: Named Entity Extraction (more)
– f_jk - a feature of o_j: Information Extraction
– so_ijkl is sentiment: Sentiment determination
– h_i is an opinion holder: Information/Data Extraction
– t_l is the time: Data Extraction
• Co-reference resolution
• Relation extraction
• Synonym match (voice = sound quality) …
• None of them is a solved problem!
Easier and Harder Problems
• Reviews are easier.
– Objects/entities are given (almost), and little noise
• Forum discussions and blogs are harder.
– Objects are not given, and a large amount of noise
• Determining sentiments seems to be easier.
• Determining objects and their corresponding
features is harder.
• Combining them is even harder.
Two Main Types of Opinions
• Direct Opinions: direct sentiment expressions
on some target objects, e.g., products, events,
topics, persons.
– E.g., “the picture quality of this camera is great.”
• Comparative Opinions: Comparisons
expressing similarities or differences of more
than one object. Usually stating an ordering or
preference.
– E.g., “car x is cheaper than car y.”
Comparative Opinions
• Gradable
– Non-Equal Gradable: Relations of the type greater
or less than
• Ex: “optics of camera A is better than that of camera B”
– Equative: Relations of the type equal to
• Ex: “camera A and camera B both come in 7MP”
– Superlative: Relations of the type greater or less
than all others
• Ex: “camera A is the cheapest camera available in
market”
* Jindal and Liu, AAAI 2006
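To make the three gradable types concrete, here is a rough rule-of-thumb classifier over POS-tagged sentences. It is only an illustrative heuristic and not the method of the cited paper. It assumes Penn Treebank tags, hand-tagged input, and a made-up cue list `EQUATIVE_CUES`.

```python
from enum import Enum


class ComparativeType(Enum):
    NON_EQUAL_GRADABLE = "non-equal gradable"
    EQUATIVE = "equative"
    SUPERLATIVE = "superlative"


# Hypothetical cue words for equative sentences; a real list would be larger.
EQUATIVE_CUES = {"both", "same", "equal", "equally"}


def guess_type(tagged):
    """Guess the comparative type of a list of (word, POS tag) pairs.

    Returns None when no cue is found, e.g. for a direct opinion.
    """
    tags = {tag for _, tag in tagged}
    words = {word.lower() for word, _ in tagged}
    if tags & {"JJS", "RBS"}:       # superlative adjective or adverb
        return ComparativeType.SUPERLATIVE
    if tags & {"JJR", "RBR"}:       # comparative adjective or adverb
        return ComparativeType.NON_EQUAL_GRADABLE
    if words & EQUATIVE_CUES:
        return ComparativeType.EQUATIVE
    return None


# Hand-tagged versions of the three example sentences above.
non_equal = [("optics", "NNS"), ("of", "IN"), ("camera", "NN"), ("A", "NN"),
             ("is", "VBZ"), ("better", "JJR"), ("than", "IN"), ("that", "DT"),
             ("of", "IN"), ("camera", "NN"), ("B", "NN")]
equative = [("camera", "NN"), ("A", "NN"), ("and", "CC"), ("camera", "NN"),
            ("B", "NN"), ("both", "DT"), ("come", "VBP"), ("in", "IN"),
            ("7MP", "NN")]
superlative = [("camera", "NN"), ("A", "NN"), ("is", "VBZ"), ("the", "DT"),
               ("cheapest", "JJS"), ("camera", "NN"), ("available", "JJ"),
               ("in", "IN"), ("market", "NN")]

for sentence in (non_equal, equative, superlative):
    print(guess_type(sentence))
```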
Mining Comparative Opinions
(Jindal and Liu, SIGIR-06)
Given a collection of evaluative texts
Task 1: Identify comparative sentences.
Task 2: Categorize different types of
comparative sentences.
Task 3: Extract comparative relations from the
sentences.
Identify comparative sentences
Keyword strategy
• An observation: It is easy to find a small set of
keywords that covers almost all comparative sentences, i.e.,
with a very high recall and a reasonable precision
• Compiled a list of 83 keywords used in comparative
sentences, which includes:
– Words with POS tags of JJR, JJS, RBR, RBS
• The POS tags are used as keywords instead of the individual
words.
• Exceptions: more, less, most and least
– Other indicative words like beat, exceed, ahead, etc.
– Phrases like “in the lead”, “on par with”, etc.
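A minimal sketch of the keyword strategy follows. It assumes the input is already POS-tagged with Penn Treebank tags (no tagger is called here) and uses only the handful of keywords named on the slide, not the full list of 83. The "exceptions" are read here as words kept as individual keywords rather than folded into their POS tag. The function name `find_keywords` is hypothetical.

```python
COMPARATIVE_TAGS = {"JJR", "JJS", "RBR", "RBS"}
EXCEPTION_WORDS = {"more", "less", "most", "least"}   # kept as words, not tags
INDICATIVE_WORDS = {"beat", "exceed", "ahead"}
INDICATIVE_PHRASES = [("in", "the", "lead"), ("on", "par", "with")]


def find_keywords(tagged):
    """Return sorted (position, keyword) pairs for (word, POS tag) input.

    An empty result means the sentence is filtered out as non-comparative.
    """
    hits = []
    words = [word.lower() for word, _ in tagged]
    for i, (_, tag) in enumerate(tagged):
        if words[i] in EXCEPTION_WORDS or words[i] in INDICATIVE_WORDS:
            hits.append((i, words[i]))
        elif tag in COMPARATIVE_TAGS:
            hits.append((i, tag))        # the POS tag itself is the keyword
    for phrase in INDICATIVE_PHRASES:
        n = len(phrase)
        for i in range(len(words) - n + 1):
            if tuple(words[i:i + n]) == phrase:
                hits.append((i, " ".join(phrase)))
    return sorted(hits)


comparative = [("car", "NN"), ("x", "NN"), ("is", "VBZ"), ("cheaper", "JJR"),
               ("than", "IN"), ("car", "NN"), ("y", "NN")]
direct = [("the", "DT"), ("picture", "NN"), ("quality", "NN"), ("of", "IN"),
          ("this", "DT"), ("camera", "NN"), ("is", "VBZ"), ("great", "JJ")]

print(find_keywords(comparative))
print(find_keywords(direct))
```

The word match is exact, so inflected forms such as "exceeds" would need lemmatization, which this sketch leaves out.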
2-step learning strategy
• Step 1: Extract sentences that contain at
least one keyword (recall = 98%, precision = 32%
on our data set for gradables)
• Step 2: Use a naïve Bayes (NB) classifier to
classify sentences into two classes,
comparative and non-comparative, with
features built as follows:
– Use words within radius r of a keyword to form a
sequence (words are replaced with POS tags)
– Use different minimum supports for different
keywords (multiple minimum supports)
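Step 2 can be sketched roughly as follows. This is a simplified stand-in, not the paper's pipeline: the POS tags inside the window are used directly as bag-of-tags features for a small multinomial naïve Bayes with add-one smoothing, and the pattern mining with multiple minimum supports is not shown. The four training sentences are hypothetical and hand-tagged, and the names `window_tags` and `NaiveBayes` are made up for the example.

```python
import math
from collections import Counter


def window_tags(tagged, keyword_index, radius):
    """POS tags of the words within `radius` of the keyword (inclusive)."""
    lo = max(0, keyword_index - radius)
    hi = min(len(tagged), keyword_index + radius + 1)
    return [tag for _, tag in tagged[lo:hi]]


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.feature_counts = {}
        self.vocab = set()

    def fit(self, examples):
        for features, label in examples:
            self.class_counts[label] += 1
            self.feature_counts.setdefault(label, Counter()).update(features)
            self.vocab.update(features)
        return self

    def predict(self, features):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(count / total)
            for f in features:
                score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def tag(words, tags):
    """Pair a space-separated sentence with its space-separated tags."""
    return list(zip(words.split(), tags.split()))


# Hypothetical training set: (tagged sentence, keyword position, label).
train = [
    (tag("car x is cheaper than car y", "NN NN VBZ JJR IN NN NN"), 3, "comparative"),
    (tag("optics of camera A is better than that of camera B",
         "NNS IN NN NN VBZ JJR IN DT IN NN NN"), 5, "comparative"),
    (tag("I need a better phone", "PRP VBP DT JJR NN"), 3, "non-comparative"),
    (tag("we bought the cheaper model", "PRP VBD DT JJR NN"), 3, "non-comparative"),
]
RADIUS = 2
model = NaiveBayes().fit(
    [(window_tags(sent, pos, RADIUS), label) for sent, pos, label in train]
)

test_a = tag("phone x is lighter than phone y", "NN NN VBZ JJR IN NN NN")
test_b = tag("she bought a cheaper case", "PRP VBD DT JJR NN")
pred_a = model.predict(window_tags(test_a, 3, RADIUS))
pred_b = model.predict(window_tags(test_b, 3, RADIUS))
print(pred_a, pred_b)
```

Both test sentences contain a JJR keyword and so would pass Step 1. The surrounding tag context is what lets the classifier separate them.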
Mining Comparative Opinions
1. (Bos and Nissim 2006) proposes a method to
extract items from superlative sentences. It
does not study sentiments.
2. (Fiszman et al 2007) tried to identify which
entity has more of a certain property in a
comparative sentence.
3. (Ding and Liu 2008) studies sentiment
analysis of comparatives, i.e., identifying
which entity is preferred.
João Esperancinha
 
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptxReimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
John Moore
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 

Tutorial 13 (explicit ugc + sentiment analysis)

  • 1. Mining Explicit User Generated Content: Sentiment Analysis Kira Radinsky Slides based on material from: Bing Liu (WWW-2008 tutorial)
  • 2. Introduction • Two main types of textual information. – Facts and Opinions • Most current text information processing methods (e.g., web search, text mining) work with factual information. • Sentiment analysis/opinion mining – computational study of opinions, sentiments and emotions expressed in text. – huge volumes of opinionated text on the web
  • 3. User Generated Media Word-of-mouth on the Web – User-generated media: One can express opinions on anything in reviews, forums, discussion groups, blogs ... – Opinions of global scale: No longer limited to: • Individuals: one’s circle of friends • Businesses: Small scale surveys, tiny focus groups, etc.
  • 4. An Example Review • “I bought an iPhone a few days ago. It was such a nice phone. The touch screen was really cool. The voice quality was clear too. Although the battery life was not long, that is ok for me. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, and wanted me to return it to the shop. …” • What do we see? – Opinions, targets of opinions, and opinion holders
  • 5. Target Object • Definition (object): An object o is a product, person, event, organization, or topic. o is represented as – A hierarchy of components, sub-components, and so on. – Each node represents a component and is associated with a set of attributes of the component. • An opinion can be expressed on any node or attribute of the node (also called features) * Liu, Web Data Mining book, 2006
  • 6. What is an Opinion? An opinion is a quintuple (o_j, f_jk, so_ijkl, h_i, t_l), where • o_j is a target object. • f_jk is a feature of the object o_j. • h_i is an opinion holder. • t_l is the time when the opinion is expressed. • so_ijkl is the sentiment value of the opinion of the opinion holder h_i on feature f_jk of object o_j at time t_l. so_ijkl is +ve, -ve, or neu, or a more granular rating.
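As a rough illustration of the quintuple, the opinions in the example iPhone review could be stored as records like the ones below. This is only a sketch: the class name `Opinion`, its field names, the feature label "price" (my reading of "too expensive"), and the placeholder date are assumptions, since the review gives no timestamp.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Opinion:
    target: str      # o_j: the object the opinion is about
    feature: str     # f_jk: the feature of that object
    sentiment: str   # so_ijkl: "+ve", "-ve", or "neu"
    holder: str      # h_i: who holds the opinion
    time: date       # t_l: when the opinion was expressed

# Hypothetical date: the review only says "a few days ago".
WHEN = date(2008, 1, 1)

opinions = [
    Opinion("iPhone", "touch screen", "+ve", "author", WHEN),
    Opinion("iPhone", "voice quality", "+ve", "author", WHEN),
    Opinion("iPhone", "battery life", "-ve", "author", WHEN),
    Opinion("iPhone", "price", "-ve", "author's mother", WHEN),
]

# Distinct opinion holders found in the review.
holders = sorted({op.holder for op in opinions})
```

Note that one review yields several quintuples with different holders, which is why document-level polarity alone loses information.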
  • 7. Sentiment Analysis approaches • Document level sentiment classification – Unsupervised review classification (Turney, ACL-02) – Sentiment classification using machine learning methods (Pang et al., EMNLP-02) • Sentence level sentiment analysis – Using learnt patterns (Riloff and Wiebe, EMNLP-03) • Feature-based opinion mining and summarization – Next slides
  • 8. Feature-Based Sentiment Analysis • Objective: Discovering all quintuples (o_j, f_jk, so_ijkl, h_i, t_l) • Sentiment classification at the document and sentence (or clause) levels is not enough: – it does not tell what people like and/or dislike – A positive opinion on an object does not mean that the opinion holder likes everything. – A negative opinion on an object does not mean that the opinion holder dislikes everything.
  • 9. Feature-Based Opinion Summary “I bought an iPhone a few days ago. It was such a nice phone. The touch screen was really cool. The voice quality was clear too. Although the battery life was not long, that is ok for me. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, and wanted me to return it to the shop. …” Feature Based Summary: Feature1: Touch screen Positive: 212 • The touch screen was really cool. • The touch screen was so easy to use and can do amazing things. … Negative: 6 • The screen is easily scratched. • I have a lot of difficulty in removing finger marks from the touch screen. … Feature2: battery life … *Hu & Liu, KDD-2004
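The summary on this slide can be thought of as a grouping of already-extracted opinions by feature and polarity. The sketch below assumes the hard part (extracting features and polarities) has been done elsewhere; the function name `summarize` and the row layout are illustrative, not taken from Hu & Liu.

```python
from collections import defaultdict

def summarize(rows):
    """Group (feature, polarity, sentence) rows into a per-feature summary."""
    summary = defaultdict(lambda: {"+ve": [], "-ve": []})
    for feature, polarity, sentence in rows:
        if polarity in ("+ve", "-ve"):  # neutral sentences are left out
            summary[feature][polarity].append(sentence)
    return dict(summary)

rows = [
    ("touch screen", "+ve", "The touch screen was really cool."),
    ("touch screen", "+ve", "The touch screen was so easy to use and can do amazing things."),
    ("touch screen", "-ve", "The screen is easily scratched."),
    ("battery life", "-ve", "Although the battery life was not long, that is ok for me."),
]

summary = summarize(rows)
for feature, by_polarity in summary.items():
    print(feature,
          "Positive:", len(by_polarity["+ve"]),
          "Negative:", len(by_polarity["-ve"]))
```

Keeping the supporting sentences next to the counts mirrors the slide, where each count is followed by example sentences.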
  • 10. Visual Comparison • Summary of reviews of Cell Phone 1 • Comparison of reviews of Cell Phone 1 and Cell Phone 2 [Figure: + and − axes over the features Voice, Screen, Size, Weight, Battery] * Liu et al., WWW-2005
  • 12. Sentiment Analysis is Hard! • “This past Saturday, I bought a Nokia phone and my girlfriend bought a Motorola phone with Bluetooth. We called each other when we got home. The voice on my phone was not so clear, worse than my previous phone. The battery life was long. My girlfriend was quite happy with her phone. I wanted a phone with good sound quality. So my purchase was a real disappointment. I returned the phone yesterday.”
  • 13. Not Just ONE Problem • (o_j, f_jk, so_ijkl, h_i, t_l), – o_j - a target object: Named Entity Extraction (more) – f_jk - a feature of o_j: Information Extraction – so_ijkl is sentiment: Sentiment determination – h_i is an opinion holder: Information/Data Extraction – t_l is the time: Data Extraction • Co-reference resolution • Relation extraction • Synonym match (voice = sound quality) … • None of them is a solved problem!
  • 14. Easier and Harder Problems • Reviews are easier. – Objects/entities are given (almost), and little noise • Forum discussions and blogs are harder. – Objects are not given, and a large amount of noise • Determining sentiments seems to be easier. • Determining objects and their corresponding features is harder. • Combining them is even harder.
  • 15. Two Main Types of Opinions • Direct Opinions: direct sentiment expressions on some target objects, e.g., products, events, topics, persons. – E.g., “the picture quality of this camera is great.” • Comparative Opinions: Comparisons expressing similarities or differences of more than one object. Usually stating an ordering or preference. – E.g., “car x is cheaper than car y.”
  • 16. Comparative Opinions • Gradable – Non-Equal Gradable: Relations of the type greater or less than • Ex: “optics of camera A is better than that of camera B” – Equative: Relations of the type equal to • Ex: “camera A and camera B both come in 7MP” – Superlative: Relations of the type greater or less than all others • Ex: “camera A is the cheapest camera available in market” * Jindal and Liu, AAAI 2006
  • 17. Mining Comparative Opinions (Jindal and Liu, SIGIR-06) Given a collection of evaluative texts Task 1: Identify comparative sentences. Task 2: Categorize different types of comparative sentences. Task 3: Extract comparative relations from the sentences.
  • 18. Identify comparative sentences Keyword strategy • An observation: It is easy to find a small set of keywords that covers almost all comparative sentences, i.e., with a very high recall and a reasonable precision • Compiled a list of 83 keywords used in comparative sentences, which includes: – Words with POS tags of JJR, JJS, RBR, RBS • POS tags are used as keywords instead of individual words. • Exceptions: more, less, most and least – Other indicative words like beat, exceed, ahead, etc. – Phrases like “in the lead”, “on par with”, etc.
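A minimal sketch of the keyword strategy follows. It assumes sentences arrive already POS-tagged as (word, tag) pairs with Penn Treebank tags, reads the "exceptions" as words that are matched individually rather than by tag, and uses only the handful of keywords named on the slide, not the full list of 83. The function name `has_comparative_keyword` is hypothetical.

```python
# Tags for comparative/superlative adjectives and adverbs (Penn Treebank).
COMPARATIVE_TAGS = {"JJR", "JJS", "RBR", "RBS"}
# Words matched individually: the four exceptions plus other indicative words.
KEYWORD_WORDS = {"more", "less", "most", "least", "beat", "exceed", "ahead"}
KEYWORD_PHRASES = ("in the lead", "on par with")

def has_comparative_keyword(tagged):
    """Return True if a tagged sentence contains any comparative keyword."""
    words = [word.lower() for word, _ in tagged]
    if any(tag in COMPARATIVE_TAGS for _, tag in tagged):
        return True
    if any(word in KEYWORD_WORDS for word in words):
        return True
    # Pad with spaces so phrases only match on whole-word boundaries.
    text = " " + " ".join(words) + " "
    return any(" " + phrase + " " in text for phrase in KEYWORD_PHRASES)

comparative = [("car", "NN"), ("x", "NN"), ("is", "VBZ"), ("cheaper", "JJR"),
               ("than", "IN"), ("car", "NN"), ("y", "NN")]
plain = [("the", "DT"), ("battery", "NN"), ("life", "NN"),
         ("was", "VBD"), ("long", "JJ")]
phrase = [("they", "PRP"), ("are", "VBP"), ("on", "IN"), ("par", "NN"),
          ("with", "IN"), ("nokia", "NNP")]
```

A filter like this favors recall: it flags many sentences that merely contain a keyword, which is why the slides follow it with a classifier.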
  • 19. 2-step learning strategy • Step 1: Extract sentences which contain at least one keyword (recall = 98%, precision = 32% on our data set for gradables) • Step 2: Use the naïve Bayes (NB) classifier to classify sentences into two classes: comparative and non-comparative, and use features like: – Use words within radius r of a keyword to form a sequence (words are replaced with POS tags) – Use different minimum supports for different keywords (multiple minimum supports)
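The two steps can be sketched as a keyword filter followed by a small naive Bayes classifier. This is a deliberate simplification and not the authors' method: it treats the POS tags within radius r of a keyword as a bag of features, whereas the slide describes tag sequences with multiple minimum supports, which are not modeled here. The two training sentences are made-up, hand-tagged toy data, and all names are illustrative.

```python
import math
from collections import Counter

COMPARATIVE_TAGS = {"JJR", "JJS", "RBR", "RBS"}

def window_features(tagged, radius=2):
    """POS tags within `radius` tokens of each keyword; empty if no keyword."""
    feats = []
    for i, (_, tag) in enumerate(tagged):
        if tag in COMPARATIVE_TAGS:
            lo, hi = max(0, i - radius), min(len(tagged), i + radius + 1)
            feats.extend(t for _, t in tagged[lo:hi])
    return feats

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over tag features."""
    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.prior = {c: math.log(y.count(c) / len(y)) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for feats, label in zip(X, y):
            self.counts[label].update(feats)
        self.vocab = {f for c in self.classes for f in self.counts[c]}
        return self

    def predict(self, feats):
        def score(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            return self.prior[c] + sum(
                math.log((self.counts[c][f] + 1) / total) for f in feats)
        return max(self.classes, key=score)

def classify(model, tagged, radius=2):
    feats = window_features(tagged, radius)
    if not feats:                      # Step 1: no keyword, filtered out
        return "non-comparative"
    return model.predict(feats)        # Step 2: naive Bayes decision

# Toy training data (hypothetical, hand-tagged).
train = [
    ([("car", "NN"), ("x", "NN"), ("is", "VBZ"), ("cheaper", "JJR"),
      ("than", "IN"), ("car", "NN"), ("y", "NN")], "comparative"),
    ([("i", "PRP"), ("cannot", "MD"), ("agree", "VB"), ("with", "IN"),
      ("you", "PRP"), ("more", "RBR")], "non-comparative"),
]
model = NaiveBayes().fit([window_features(s) for s, _ in train],
                         [label for _, label in train])

test_sentence = [("camera", "NN"), ("a", "NN"), ("is", "VBZ"), ("better", "JJR"),
                 ("than", "IN"), ("camera", "NN"), ("b", "NN")]
label = classify(model, test_sentence)
```

With realistic data the second step is what raises precision above the keyword filter's 32%, at some cost in recall.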
  • 20. Mining Comparative Opinions 1. (Bos and Nissim 2006) proposes a method to extract items from superlative sentences. It does not study sentiments either. 2. (Fiszman et al. 2007) tries to identify which entity has more of a certain property in a comparative sentence. 3. (Ding and Liu 2008) studies sentiment analysis of comparatives, i.e., identifying which entity is preferred.