Politics_Moscow.pptx

Transcript of Politics_Moscow.pptx

  • 7/30/2019 Politics_Moscow.pptx

    1/32

    2/32

    Political Search Trends

    WebSci'12, SIGIR'12 (demo)

    Joint Work with E. Borra and K. Garimella

    3/32

    Political Search Trends

    http://politicalsearchtrends.sandbox.yahoo.com/

    4/32

    Point of Departure: Labeled Blogs

    Left-leaning blogs (387)

    Right-leaning blogs (644)

    From Benkler and Shaw, "A Tale of Two Blogospheres" (2010), and the Wonkosphere Blog Directory

    5/32

    Who are these People?

    Use self-provided age and gender and ZIP-derived estimates

    People clicking on right-leaning blogs:

    Are older (50 vs. 45 years)

    Are more male (63% vs. 55%)

    Are more white (81% vs. 78%)

    More likely to work for Yahoo (92.3% vs. 11.4%)

    All these trends agree with voter demographics

    6/32

    From Blogs to Queries

    huffingtonpost.com is left-leaning, so a click on it is a left-leaning vote for the query "pizza is a vegetable"

    Aggregate votes across all clicks on political blogs to compute overall leaning

    vL = left clicks for the query

    VL = total left clicks
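
The formula that combined these counts on the slide is not preserved in this transcript. One plausible form, given here as a sketch rather than the authors' exact definition, normalizes each side's clicks by that side's total volume. The right-side counterparts v_R and V_R and the convention that a value near 1 means left-leaning are assumptions.

```latex
\[
\text{leaning}(q) \;=\; \frac{v_L / V_L}{\,v_L / V_L \;+\; v_R / V_R\,}
\]
```

With made-up numbers, a query that receives 30 of 1,000 left clicks and 10 of 2,000 right clicks would score 0.03 / (0.03 + 0.005), or about 0.86.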

    7/32

    Examples of Assigned Leaning

    Examples using Wikipedia mapping for 6 months of data, July 4, 2011 to January 8, 2012.

    Queries for Wikipedia entity "Patient Protection & Affordable Care Act":

    obama healthcare bill text (.91)                who pays for obamacare (.04)
    obama health care privileges (.83)              obamacare reaches the supreme court (.09)
    is affordable care act unconstitutional (.78)   is obamacare constitutional (.16)

    Queries for Wikipedia category "Occupy":

    who started occupy wall street (.94)            occupy wall street rape (.09)
    we are the 99% (.91)                            occupy movement violence (.25)
    occupy movement supporters (.78)                crime in occupy movement (.44)

    lies: http://politicalsearchtrends.sandbox.yahoo.com/?q=lies

    protest: http://politicalsearchtrends.sandbox.yahoo.com/?q=protest

    8/32

    Mapping Queries to Statements

    "cost obama trip to india"

    364 distinct queries mapped to true statements

    574 distinct queries mapped to false statements

    9/32

    Impact of Truth Value

    Correlation with leaning? Any guess? None.

    Correlation with leaning, when conditioned on source? Any guess? None.

    Correlation with volume? Any guess?

    Well ...

    10/32

    Political Twitter Trends

    Under review

    Joint Work with Venkata Garimella and Asmelash Teka

    11/32

    Twitter and Politics

    12/32

    Hashtag Wars

    13/32

    Political Twitter Trends (PTT)

    Show live demo

    14/32

    Data Set

    Start with a seed set of users with known political orientation, e.g. @BarackObama or @MittRomney

    Get their tweets

    15/32

    Extending from Seed Set

    Get all the retweets
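
One way to turn those retweets into labeled users is sketched below. The slides do not spell out the rule, so this assumes each retweeter is given the leaning of the seed side they retweet more often. The record layout, the `label_retweeters` name, and the majority rule are illustrative assumptions; only the two seed handles come from the slides.

```python
from collections import Counter, defaultdict

# Seed accounts with known leaning (the two handles come from the slides).
SEED_LEANING = {"@BarackObama": "left", "@MittRomney": "right"}


def label_retweeters(retweets, seed_leaning=SEED_LEANING):
    """Assign each retweeter the side whose seed accounts they retweet most.

    `retweets` is an iterable of (retweeter, seed_account) pairs.
    Users with a tie are left unlabeled. The majority rule is an assumption.
    """
    counts = defaultdict(Counter)
    for retweeter, seed in retweets:
        side = seed_leaning.get(seed)
        if side is not None:  # ignore retweets of non-seed accounts
            counts[retweeter][side] += 1

    labels = {}
    for user, c in counts.items():
        if c["left"] != c["right"]:
            labels[user] = "left" if c["left"] > c["right"] else "right"
    return labels
```

Leaving ties unlabeled keeps ambiguous users out of the data set instead of guessing a side for them.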

    16/32

    U.S. Users Only

    Lots of international interest in U.S. politics

    People from all over the world retweet

    Use Yahoo! Placemaker to remove non-US users

    http://developer.yahoo.com/geo/placemaker/

    17/32

    Evaluating Data Quality

    Do we have the correct political leaning?

    Accuracy = 0.98 and 0.93 for Wefollow and Twellow, respectively

    Inspection: "greatest environmentalist. Also, despise republicans"

    Corrected accuracy: 0.99 and 0.95

    18/32

    From Users Back to Hashtags

    [Figures: tag cloud for left users; tag cloud for right users]

    19/32

    Detecting Political Hashtags

    Most hashtags are non-political: #fb, #FavouriteAlbums, ...

    Not always obvious: #yes4m, #usmc, ...

    Use co-occurrence with seed political hashtags: #p2, #tcot, #gop, #ows, obama*, romney*, ...

    Keep top 10% in terms of P(POL|h)

    Time dependence: #america during the Olympics and during elections

    Remove low-volume hashtags

    Mostly noise and no large political issues
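
A minimal sketch of the co-occurrence filter follows. It assumes P(POL|h) is estimated as the fraction of tweets containing h that also contain a seed political hashtag. That estimator, the function names, and the treatment of obama* and romney* as hashtag prefixes are assumptions for illustration.

```python
import math

SEED_TAGS = {"#p2", "#tcot", "#gop", "#ows"}
SEED_PREFIXES = ("#obama", "#romney")  # stands in for obama*, romney*


def is_seed(tag):
    """True if the tag is a seed political hashtag or matches a seed prefix."""
    return tag in SEED_TAGS or tag.startswith(SEED_PREFIXES)


def political_probability(tweets):
    """Estimate P(POL|h) for every non-seed hashtag h.

    `tweets` is a list of sets of lowercase hashtags, one set per tweet.
    """
    total, political = {}, {}
    for tags in tweets:
        has_seed = any(is_seed(t) for t in tags)
        for t in tags:
            if is_seed(t):
                continue
            total[t] = total.get(t, 0) + 1
            political[t] = political.get(t, 0) + (1 if has_seed else 0)
    return {t: political[t] / total[t] for t in total}


def top_fraction(scores, fraction=0.10):
    """Keep the top `fraction` of hashtags by score (always at least one)."""
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:keep]
```

Because usage shifts over time, as with #america, the estimate would in practice be recomputed per time window instead of once over the whole data set.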

    20/32

    Detecting Trending Hashtags

    Trending = currently popular

    Having a higher volume than expected

    #obamagotosama: May 1 to May 8, 2011

    #ows: Sep. 25 to Oct. 2, 2011

    Non-trending hashtags: #vote, #democracy
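
"Higher volume than expected" can be made concrete in many ways. The sketch below assumes the expected volume is the mean of all earlier weeks and that a hashtag trends when it exceeds that baseline by a fixed factor. The baseline, the `factor` default, and the `min_history` guard are assumptions, not taken from the slides.

```python
from statistics import mean


def is_trending(weekly_volume, week, factor=2.0, min_history=3):
    """Return True if volume in `week` exceeds `factor` times the expected volume.

    `weekly_volume` is a list of tweet counts, one per week, oldest first.
    Expected volume is the mean of all weeks before `week` (an assumption).
    """
    history = weekly_volume[:week]
    if len(history) < min_history:
        return False  # not enough past data to form an expectation
    expected = mean(history)
    return weekly_volume[week] > factor * expected
```

Under this rule a steady, high-volume hashtag such as #vote never trends, while a burst like #obamagotosama does in its peak week.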

    21/32

    Assigning a Leaning to Hashtags

    Voting approach:

    Mere counts:

    Normalized counts:

    + smoothing:
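
The formulas for the three variants are not preserved in this transcript. The sketch below gives one plausible reading, with a leaning of 1.0 meaning fully left: raw vote share, vote share after normalizing by each side's total volume, and the normalized version with additive smoothing. The function names, the smoothing constant `alpha`, and the exact smoothing form are assumptions.

```python
def leaning_counts(v_left, v_right):
    """Mere counts: share of left votes among all votes for the hashtag."""
    return v_left / (v_left + v_right)


def leaning_normalized(v_left, v_right, total_left, total_right):
    """Normalized counts: divide each side's votes by that side's total volume."""
    left = v_left / total_left
    right = v_right / total_right
    return left / (left + right)


def leaning_smoothed(v_left, v_right, total_left, total_right, alpha=1e-3):
    """Normalized counts plus additive smoothing.

    Smoothing pulls hashtags with very few votes toward the neutral value 0.5.
    """
    left = v_left / total_left + alpha
    right = v_right / total_right + alpha
    return left / (left + right)
```

Normalization matters when one side tweets far more than the other, and smoothing keeps a hashtag used once by a single left user from receiving a leaning of exactly 1.0.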

    22/32

    Leanings over Time: Constant

    23/32

    Leanings over Time: Shifting

    24/32

    Leanings over Time: Outliers

    25/32

    Detecting Change Points

    Filter hashtags without sufficient support

    Total number of weeks > 4

    Relative and absolute change in leaning from previous week

    Change from previous week > std and

    Change from previous week > 0.25

    Change from average value is big

    Current value - Average value > std

    Change in leaning is in the direction of other leaning

    Change in direction = TRUE
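
The four criteria above can be sketched as a single check on a hashtag's weekly leaning series. Several details are assumptions: the mean and standard deviation are taken over the whole series, changes are compared by absolute value, and "direction of the other leaning" is read as moving away from the side of 0.5 on which the hashtag's average sits.

```python
from statistics import mean, pstdev


def is_change_point(leanings, week, min_weeks=4, min_abs_change=0.25):
    """Apply the four slide criteria to one week of a weekly leaning series.

    `leanings` holds one value in [0, 1] per week; 1 = left is an assumption.
    """
    # 1. Sufficient support: more than `min_weeks` weeks of data.
    if len(leanings) <= min_weeks or week < 1:
        return False

    avg = mean(leanings)
    std = pstdev(leanings)
    current, previous = leanings[week], leanings[week - 1]
    change = current - previous

    # 2. Relative (> std) and absolute (> 0.25) change from the previous week.
    if abs(change) <= std or abs(change) <= min_abs_change:
        return False

    # 3. The current value is far from the hashtag's average.
    if abs(current - avg) <= std:
        return False

    # 4. The move goes toward the other side.
    usually_left = avg > 0.5
    return change < 0 if usually_left else change > 0
```

Criterion 3 is what stops the rebound week after an outlier from being flagged as a second change point.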

    26/32

    Detected Change Points

    27/32

    What Causes Change Points

    Volume-to-user ratio:

    High means small, active set (hijackers)

    Low means general masses
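
As a small illustration, the ratio can be computed as tweets per distinct user for a hashtag in a given week. The input layout (one user id per tweet) and the function name are assumptions.

```python
def volume_to_user_ratio(tweet_authors):
    """Tweets per distinct user for one hashtag in one time window.

    `tweet_authors` is a list of user ids, one entry per tweet using the hashtag.
    """
    if not tweet_authors:
        return 0.0
    return len(tweet_authors) / len(set(tweet_authors))
```

Twelve tweets from two accounts give a ratio of 6.0, pointing to a small active group, whereas twelve tweets from twelve accounts give 1.0.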

    28/32

    Description of Hijackers

    29/32

    Topical Clustering of Hashtags

    Hashtags are often micro-topics

    Cluster hashtags to have more high-level topics

    We used simple k-means clustering on co-occurrence feature vectors
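
A self-contained sketch of this idea follows, not the authors' implementation. It assumes each hashtag is described by how often it co-occurs with a fixed list of context hashtags and that distance is Euclidean. For reproducibility it also starts the centers from the first k vectors, where a real run would normally use random or k-means++ initialization.

```python
def cooccurrence_vectors(tweets, tags, features):
    """Build one vector per tag: counts of co-occurrence with each feature tag.

    `tweets` is a list of sets of hashtags; `features` is the context vocabulary.
    """
    vectors = [[0.0] * len(features) for _ in tags]
    for tweet in tweets:
        for i, tag in enumerate(tags):
            if tag in tweet:
                for j, feat in enumerate(features):
                    if feat in tweet and feat != tag:
                        vectors[i][j] += 1.0
    return vectors


def kmeans(vectors, k, iterations=20):
    """Plain k-means (Lloyd's algorithm); returns a cluster index per vector."""
    centers = [list(v) for v in vectors[:k]]  # deterministic init (assumption)
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, v in enumerate(vectors):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment
```

Hashtags that share a cluster index can then be reported together as one higher-level topic.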

    30/32

    Cluster Evolution Over Time

    31/32

    Ongoing Work

    Beyond 2-party systems: UK and Germany

    Fractional party membership

    Visualization challenges: Hans Rosling's bubbles

    32/32

    !

    [email protected]
