Sentiment Analysis + MaxEnt *

  • Published on
    24-Feb-2016


DESCRIPTION

Sentiment Analysis + MaxEnt*. MAS.S60, Rob Speer and Catherine Havasi. (* Lots of slides borrowed from lots of sources; see end.) People on the Web have opinions. The world is full of text: customer verbatims, blogs, comments, reviews, forums. Measuring public opinion through social media?

Transcript

Slide 1

Sentiment Analysis + MaxEnt*
MAS.S60
Rob Speer
Catherine Havasi
* Lots of slides borrowed from lots of sources! See end.
Thanks to people

People on the Web have opinions

The world is full of text
  • Customer verbatims
  • Blogs
  • Comments
  • Reviews
  • Forums

Measuring public opinion through social media?
[Diagram labels: People in U.S.; Write; Query; Aggregate; Text Sentiment Measure; "I like Obama" / "I do not"]
Can we derive a similar measurement?

Anne Hathaway

  • Oct. 3, 2008 - Rachel Getting Married opens: BRK.A up .44%
  • Jan. 5, 2009 - Bride Wars opens: BRK.A up 2.61%
  • Feb. 8, 2010 - Valentine's Day opens: BRK.A up 1.01%
  • March 5, 2010 - Alice in Wonderland opens: BRK.A up .74%
  • Nov. 24, 2010 - Love and Other Drugs opens: BRK.A up 1.62%
  • Nov. 29, 2010 - Anne announced as co-host of the Oscars: BRK.A up .25%

Application: Information Extraction (ICWSM 2008)
"The Parliament exploded into fury against the government when word leaked out"
  • Observation: subjectivity often causes false hits for IE
  • Goal: augment the results of IE
  • Subjectivity filtering strategies to improve IE: Riloff, Wiebe, Phillips AAAI05

Sentiment can be hard to analyze
http://www.killermovies.com/o/operationcondor/reviews/6na.html
This is where the money was spent, on well-choreographed kung-fu sequences, on giant Kevlar hamster balls, on smashed-up crates of bananas, and on scorpions. Ignore the gaping holes in the plot (how, exactly, if the villain's legs were broken, did he escape from the secret Nazi base, and why didn't he take the key with him?). Don't worry about the production values, or what, exactly, the Japanese girl was doing hitchhiking across the Sahara. Just go see the movie.

Thwarted Expectations Narrative
I thought it was going to be amazing but it's not unless you're a hungover college student. (Tripadvisor, Amy's Café)

This is a messy task
  • Inter-annotator agreement on sentiment analysis tasks can be as low as 70%
  • Pang et al., 2002: adding n-grams doesn't seem to help

Twitter Mood Swings
Alan Mislove, Northeastern
[Figures: Daily Mood; Weekly Mood]

Opinion mining tasks
At the document (or review) level:
  • Task: sentiment classification of reviews
  • Classes: positive, negative, and neutral
  • Assumption: each document (or review) focuses on a single object (not true in many discussion posts) and contains opinion from a single opinion holder.
At the sentence level:
  • Task 1: identifying subjective/opinionated sentences
      Classes: objective and subjective (opinionated)
  • Task 2: sentiment classification of sentences
      Classes: positive, negative and neutral.
  • Assumption: a sentence contains only one opinion; not true in many cases.
  • Then we can also consider clauses or phrases.

Opinion Mining Tasks (cont.)
At the feature level:
  • Task 1: Identify and extract object features that have been commented on by an opinion holder (e.g., a reviewer).
  • Task 2: Determine whether the opinions on the features are positive, negative or neutral.
  • Task 3: Group feature synonyms.
  • Produce a feature-based opinion summary of multiple reviews.
Opinion holders: identifying holders is also useful, e.g., in news articles, but they are usually known in user-generated content, i.e., they are the authors of the posts.

Bags of Words
  • Look for certain keywords
  • Valence of a word
  • Advantage: Works quickly
  • Disadvantage: Lexical Creativity

Sentiment analysis: word counting
Subjectivity Clues lexicon from OpinionFinder / U Pitt
  • Wilson et al 2005
  • 2000 positive, 3600 negative words

Procedure
  • Within topical messages,
  • Count messages containing these positive and negative words
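The counting procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny word sets below are hypothetical stand-ins for the OpinionFinder lexicon, "topical" is taken to mean "contains the topic keyword", and the function name is made up for this example.

```python
import re

# Hypothetical stand-in lexicon (a handful of words, not the real OpinionFinder list).
POSITIVE = {"good", "help", "funny", "fantastic"}
NEGATIVE = {"bad", "slump", "bearish", "crackdown"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_sentiment_messages(messages, topic):
    """Count topical messages containing at least one positive / negative word.

    A message with both kinds of words is counted once on each side.
    """
    pos = neg = 0
    for message in messages:
        words = set(tokenize(message))
        if topic not in words:
            continue  # assumption: a message is topical if it mentions the keyword
        if words & POSITIVE:
            pos += 1
        if words & NEGATIVE:
            neg += 1
    return pos, neg

messages = [
    "Good news on jobs",
    "jobs report is bad, bad",
    "pie is good",
    "jobs jobs",
    "bad slump in jobs but good help",
]
pos, neg = count_sentiment_messages(messages, "jobs")
print(pos, neg)
```

Because messages are counted rather than words, repeating "bad" twice in one message does not change the totals.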

Main resources

Lexicons
  • General Inquirer (Stone et al., 1966)
  • OpinionFinder lexicon (Wiebe & Riloff, 2005)
  • SentiWordNet (Esuli & Sebastiani, 2006)

Annotated corpora
  • Used in statistical approaches (Hu & Liu 2004, Pang & Lee 2004)
  • MPQA corpus (Wiebe et al., 2005)

Tools
  • Algorithm based on minimum cuts (Pang & Lee, 2004)
  • OpinionFinder (Wiebe et al., 2005)

Corpus
MPQA: www.cs.pitt.edu/mqpa/databaserelease (version 2)

English language versions of articles from the world press (187 news sources)

Also includes contextual polarity annotations

Themes of the instructions:
  • No rules about how particular words should be annotated.
  • Don't take expressions out of context and think about what they could mean, but judge them as they are used in that sentence.

The annotation scheme that I just described has been used in building a corpus of annotated news articles.

535 documents, 11,114 sentences

The corpus also includes contextual polarity annotations, which I haven't mentioned yet and which we'll see later.

Gold Standards
Derived from manually annotated data
Derived from found data (examples):
  • Livejournal: Cambria, Havasi 2008
  • Blog tags: Balog, Mishne, de Rijke EACL 2006
  • Websites for reviews, complaints, political arguments
      amazon.com: Pang and Lee ACL 2004
      complaints.com: Kim and Hovy ACL 2006
      bitterlemons.com: Lin and Hauptmann ACL 2006
Word lists (example):
  • General Inquirer: Stone et al. 1966

A note on the sentiment list
This list is not well suited for social media English.
  • sucks, :) , :(

(Top examples)
word         valence    count
will         positive   3934
bad          negative   3402
good         positive   2655
help         positive   1971

(Random examples)
word         valence    count
funny        positive   114
fantastic    positive   37
cornerstone  positive   2
slump        negative   85
bearish      negative   17
crackdown    negative   5

Patterns
Lexico-syntactic patterns (Riloff & Wiebe 2003)
  • way with : to ever let China use force to have its way with
  • expense of : at the expense of the world's security and stability
  • underlined : Jiang's subdued tone underlined his desire to avoid disputes

Conjunction

*We cause great leaders

Statistical association
If words of the same orientation are likely to co-occur, then the presence of one makes the other more probable (co-occurrence within a window, in a particular context, etc.)

Use statistical measures of association to capture this interdependence, e.g., Mutual Information (Church & Hanks 1989)

Sentiment Ratio Moving Average
  • High day-to-day volatility.
  • Average last k days.
  • Keyword "jobs", k = 1, 7, 30
  • (Gallup tracking polls: 3 or 7-day smoothing)
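A small sketch of the smoothing step, for illustration only. It assumes the sentiment ratio for a day is the positive message count divided by the negative message count (the slides do not spell out the definition), and the daily counts below are made up.

```python
def sentiment_ratio(pos_count, neg_count):
    """Assumed definition: positive messages divided by negative messages.

    Raises ZeroDivisionError if a day has no negative messages.
    """
    return pos_count / neg_count

def moving_average(values, k):
    """Average each day with up to k-1 preceding days.

    The first few days use a shorter window because less history exists.
    """
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - k + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Made-up daily (positive, negative) counts for one keyword.
daily_counts = [(30, 20), (10, 20), (40, 20), (20, 20), (50, 20), (10, 20), (30, 20)]
ratios = [sentiment_ratio(p, n) for p, n in daily_counts]

for k in (1, 3, 7):
    print(k, [round(x, 2) for x in moving_average(ratios, k)])
```

With k = 1 the series is unchanged; larger k trades responsiveness for a smoother curve, which is the same trade-off the 3-day and 7-day tracking polls make.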

Smoothed comparisons: "jobs" sentiment

Beyond good and bad
Can we identify excitement, embarrassment, fear, and all kinds of other emotions?

Sentiment as Topics

The Hourglass of Emotions
A quantified version of Robert Plutchik's psychoevolutionary wheel of emotions (1980)

SenticNet
  • Augments ConceptNet with emotion-tagged data
  • Learns a function from semantic vectors to the emotion space
  • Evaluation: classify LJ posts that are tagged with "Current mood: ..."

Learning valence
Most classifiers are effectively learning a valence for every feature
  • funny = +1
  • disappointed = -2
  • seagal = -3
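To make the "valence for every feature" idea concrete, here is a toy scorer that uses the three valences from the slide. It is a sketch, not any particular classifier: the whitespace tokenization, the zero threshold, and the function names are assumptions for illustration.

```python
# Valences copied from the slide; a trained classifier would learn many more.
VALENCE = {"funny": 1, "disappointed": -2, "seagal": -3}

def valence_score(text):
    """Sum the valence of every known word; unknown words contribute 0."""
    return sum(VALENCE.get(word, 0) for word in text.lower().split())

def classify(text):
    """Map the summed valence to a label (threshold at 0 is an assumption)."""
    score = valence_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(valence_score("Funny but I was disappointed"), classify("a funny movie"))
```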

Naive Bayes again?
  • Sure, okay
  • But interesting n-grams clearly aren't independent
  • "Gwyneth Paltrow" will be double-counted every time

Maximum Entropy (MaxEnt)
  • MaxEnt finds a probability distribution that follows a logistic curve
  • Doesn't require independence

If someone says "logistic regression", that's the *same* as maxent. The maximum entropy classifier happens to be one that fits a logistic curve.

The logistic (logit) distribution
  • c = class
  • d = data
  • f = features in the data
  • λ = a value for every feature
  • Z = whatever you need to divide by to make it add up to 1
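The formula that these symbols belong to did not survive in the transcript. The standard MaxEnt (multinomial logistic) form, written with the same names, is given here as a reconstruction; it is an assumption that the slide showed exactly this form. Here λ_i is the weight for feature f_i (the "value for every feature") and Z is the normalizer.

```latex
P(c \mid d, \lambda) = \frac{\exp\left(\sum_i \lambda_i f_i(c, d)\right)}{Z(d)},
\qquad
Z(d) = \sum_{c'} \exp\left(\sum_i \lambda_i f_i(c', d)\right)
```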

Features are just evidence for or against the data being in a particular class.

MaxEnt learns a probability distribution
Optimizing for two things:
  • Maximize the probability of your data
  • ...but be as uninformative as possible about things missing from your data
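A minimal sketch of how a fixed set of feature weights turns into class probabilities in a MaxEnt model. The weights, feature names, and function name are hypothetical; learning the weights is not shown.

```python
import math

def maxent_probabilities(features, weights, classes):
    """Return p(class | data) for every class.

    features: set of feature names active in the data
    weights:  dict mapping (class, feature) -> weight; missing pairs count as 0
    """
    # Each active feature adds its weight as evidence for a class.
    scores = {c: sum(weights.get((c, f), 0.0) for f in features) for c in classes}
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp_scores.values())  # the normalizer that makes it add up to 1
    return {c: e / z for c, e in exp_scores.items()}

# Hypothetical weights for illustration.
weights = {
    ("positive", "funny"): 1.0,
    ("negative", "disappointed"): 2.0,
    ("negative", "seagal"): 3.0,
}
probs = maxent_probabilities({"funny", "disappointed"}, weights, ["positive", "negative"])
print(probs)
```

With no active features every class gets the same score, so the distribution is uniform, which is the "uninformative" default.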

An unfair coin comes up heads four times. What's the probability that it comes up heads the next time?

Maximum Likelihood

optimal value of lambda = infinity?

Maximum A Posteriori
p(class | data) ∝ p(data | class) * p(class)
posterior ∝ likelihood * prior
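The coin question can be worked through numerically. The sketch below compares the maximum likelihood estimate with a MAP estimate under a Beta(2, 2) prior; the choice of a Beta prior and its parameters are assumptions made for illustration, not something the slides specify.

```python
def mle_heads(heads, tails):
    """Maximum likelihood estimate: the observed fraction of heads."""
    return heads / (heads + tails)

def map_heads(heads, tails, a=2.0, b=2.0):
    """Mode of the Beta(heads + a, tails + b) posterior.

    Assumes a Beta(a, b) prior with a, b > 1 so that the mode formula applies.
    """
    return (heads + a - 1) / (heads + tails + a + b - 2)

print(mle_heads(4, 0))  # 1.0: maximum likelihood is certain the coin always lands heads
print(map_heads(4, 0))  # 5/6: the prior pulls the estimate away from certainty
```

A probability of exactly 1 corresponds to an infinite weight on the logistic scale, which is why the maximum likelihood fit never settles on a finite value for this data.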

Our prior on p(class) can apply a penalty to large weights
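One common way to realize such a penalty is a Gaussian prior on the weights, which shows up as an L2 term subtracted from the log-likelihood. The sketch below applies it to the four-heads coin with a single weight; the L2 form, the penalty strength, the learning rate, and the step counts are all assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_coin_weight(heads, tails, penalty, steps=2000, lr=0.1):
    """Gradient ascent on the penalized log-likelihood of one weight.

    Objective: heads*log(s(w)) + tails*log(1 - s(w)) - (penalty / 2) * w**2,
    where s is the sigmoid and p(heads) = s(w).
    """
    w = 0.0
    for _ in range(steps):
        p = sigmoid(w)
        gradient = heads * (1.0 - p) - tails * p - penalty * w
        w += lr * gradient
    return w

unpenalized = fit_coin_weight(4, 0, penalty=0.0)
penalized = fit_coin_weight(4, 0, penalty=1.0)
print(unpenalized, sigmoid(unpenalized))  # weight keeps growing with more steps
print(penalized, sigmoid(penalized))      # weight settles at a finite value
```

Without the penalty the gradient is always positive, so the weight only stops growing when the loop does; with the penalty it converges and the predicted probability of heads stays below 1.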

Slide Credits
  • Brendan O'Connor, CMU
  • OpinionFinder
  • Carmen Banea / Jan Wiebe
  • Luminoso
  • Smilies: Aditya Joshi
