#pman: 32,107 posts, 1,979 authors, high RT #s, influencers
flickr photo by mhartford, cc
New York Times (1857-Current file); Dec 30, 2000; ProQuest Historical Newspapers: The New York Times (1851-2006), pg. A1
Nutritional Information: New York Times
Ingredients: International coverage 42% (includes 8% Iraq, 5% Afghanistan, minimum weekly 5% China, and no less than 2% Africa), Washington coverage 28% (includes 7% Obama, 6% Congress, and trace amounts of Limbaugh), New York State/Albany coverage 14%, New York City coverage 10%, and less than 6% domestic US coverage.
Warning: contains less than 40% of sports coverage of the leading competitor, the New York Post, and 50% of business coverage of the Wall Street Journal. May contain less than your recommended daily allowance of Latin America News.
You have not read any international news today.
newspapers
radio (npr, talk)
television (network, cable)
blogs (int’l, domestic, left, right)
email forwards
magazines
websites
?
hand-coding
link analysis
automated content analysis
PEJ News coverage index
mediatenor.com
Hand-coding: traditional media monitoring
Upsides:
- High accuracy
- Flexibility
- Low startup cost
Downsides:
- Small data sets, problems extrapolating
- Time consuming, no real-time data
- Intercoder reliability, difficulty of coder setup
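Intercoder reliability is commonly quantified with Cohen's kappa, which corrects raw agreement between two coders for the agreement expected by chance. The following is a minimal sketch under that assumption; the slides do not say which statistic was used, and the topic codes and the `cohens_kappa` helper are hypothetical illustrations.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items on which the two coders gave the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: probability both coders pick the same code independently,
    # given each coder's own label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical code everywhere; kappa is undefined,
        # so this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical topic codes assigned to six stories by two coders.
coder_a = ["intl", "intl", "washington", "sports", "intl", "washington"]
coder_b = ["intl", "washington", "washington", "sports", "intl", "intl"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

Here the coders agree on four of six stories, but because both use "intl" so often, much of that agreement could arise by chance, and kappa comes out well below the raw agreement rate.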
Figure 1: Community structure of political blogs (expanded set), shown using a GEM layout [11] in the GUESS [3] visualization and analysis tool. The colors reflect political orientation, red for conservative, and blue for liberal. Orange links go from liberal to conservative, and purple ones from conservative to liberal. The size of each blog reflects the number of other blogs that link to it.
longer existed, or had moved to a different location. When looking at the front page of a blog we did not make a distinction between blog references made in blogrolls (blogroll links) from those made in posts (post citations). This had the disadvantage of not differentiating between blogs that were actively mentioned in a post on that day, from blogroll links that remain static over many weeks [10]. Since posts usually contain sparse references to other blogs, and blogrolls usually contain dozens of blogs, we assumed that the network obtained by crawling the front page of each blog would strongly reflect blogroll links. 479 blogs had blogrolls through blogrolling.com, while many others simply maintained a list of links to their favorite blogs. We did not include blogrolls placed on a secondary page.
We constructed a citation network by identifying whether a URL present on the page of one blog references another political blog. We called a link found anywhere on a blog’s page, a “page link” to distinguish it from a “post citation”, a link to another blog that occurs strictly within a post. Figure 1 shows the unmistakable division between the liberal and conservative political (blogo)spheres. In fact, 91% of the links originating within either the conservative or liberal communities stay within that community. An effect that may not be as apparent from the visualization is that even though we started with a balanced set of blogs, conservative blogs show a greater tendency to link. 84% of conservative blogs link to at least one other blog, and 82% receive a link. In contrast, 74% of liberal blogs link to another blog, while only 67% are linked to by another blog. So overall, we see a slightly higher tendency for conservative blogs to link. Liberal blogs linked to 13.6 blogs on average, while conservative blogs linked to an average of 15.1, and this difference is almost entirely due to the higher proportion of liberal blogs with no links at all.
Although liberal blogs may not link as generously on average, the most popular liberal blogs, Daily Kos and Eschaton (atrios.blogspot.com), had 338 and 264 links from our single-day snapshot
Adamic and Glance
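The statistics quoted in the excerpt (share of links that stay within a community, share of blogs that link out, average number of blogs linked to) can be computed from a directed edge list plus an orientation label per blog. The sketch below uses a tiny hypothetical network, not the paper's data; the `link_stats` helper is illustrative, and averaging out-links over all blogs on a side (including those with none) is an assumption about how the averages were taken.

```python
from collections import defaultdict


def link_stats(edges, orientation):
    """Summarize a blog citation network.

    edges: iterable of (source_blog, target_blog) page links.
    orientation: dict mapping blog -> community label (e.g. "liberal").
    """
    # Drop self-links, links to unlabelled blogs, and duplicate links.
    unique_edges = {
        (src, dst)
        for src, dst in edges
        if src != dst and src in orientation and dst in orientation
    }
    within = sum(orientation[src] == orientation[dst] for src, dst in unique_edges)
    within_share = within / len(unique_edges) if unique_edges else 0.0

    out_links = defaultdict(set)
    for src, dst in unique_edges:
        out_links[src].add(dst)

    per_side = {}
    for side in set(orientation.values()):
        blogs = [blog for blog, label in orientation.items() if label == side]
        counts = [len(out_links[blog]) for blog in blogs]
        per_side[side] = {
            # Share of blogs on this side linking to at least one other blog.
            "share_linking": sum(c > 0 for c in counts) / len(blogs),
            # Mean number of distinct blogs linked to, over all blogs on this side.
            "mean_out_links": sum(counts) / len(blogs),
        }
    return within_share, per_side


# Hypothetical five-blog network for illustration only.
orientation = {
    "lib1": "liberal", "lib2": "liberal", "lib3": "liberal",
    "con1": "conservative", "con2": "conservative",
}
edges = [
    ("lib1", "lib2"), ("lib2", "lib1"), ("lib1", "con1"),
    ("con1", "con2"), ("con2", "con1"),
]
within_share, per_side = link_stats(edges, orientation)
print(within_share, per_side)
```

In this toy network both sides link to one blog on average, yet only two of three liberal blogs link at all, which mirrors the excerpt's point that a gap in averages can come mostly from blogs with no links.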
Link analysis: Leveraging web architectures
Upsides:
- Highly automatable
- Large data sets
- Leverage existing tools for network research
Downsides:
- Only consider link structure, not content
- Danger of conflating linking with social structure
- Need for hand-coding to make sense of clusters
- Good for blogs, bad for MSM
newsmap.jp
Content analysis: Just becoming possible
Upsides:
- Can work with unstructured text, blogs and MSM
- Large data sets, highly automatable
- Easy linkage with visualization platforms
Downsides:
- Inaccuracy
- Language constraints
- Major programming investments
What We Have Done
1. get news stories
2. extract story text
3. create term list
4. allow rich queries
Terming
Lexicon-based simple matching
More complex term extraction
Archer Daniels Midland Company
Archer, Bill
Archer, Dennis W.
Archer, Jeffrey
Archery
Archibold, Randal C.
Architecture
Architecture and Design
Archives and Records
Archon Corp.
Archstone-Smith Trust
ArcSight Inc.
Arctic Cat Inc.
Arctic Monkeys
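Lexicon-based simple matching can be sketched as a case-insensitive search for each lexicon entry as a whole phrase in the story text. This is an illustrative sketch only, not the project's actual code; the `build_matcher` and `tag_story` helpers and the sample sentence are hypothetical, and the lexicon is a small subset of the entries listed above.

```python
import re


def build_matcher(lexicon):
    """Compile one regex that matches any lexicon term as a whole phrase."""
    # Longest terms first, so a long entry is preferred over a shorter one
    # that starts at the same position.
    terms = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in terms)
    # Lookarounds instead of \b so terms ending in punctuation ("Inc.") still match.
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


def tag_story(text, lexicon):
    """Return the sorted list of lexicon terms that occur in the text."""
    canonical = {term.lower(): term for term in lexicon}
    matcher = build_matcher(lexicon)
    return sorted({canonical[m.group(0).lower()] for m in matcher.finditer(text)})


lexicon = [
    "Archer Daniels Midland Company",
    "Archery",
    "Architecture",
    "Arctic Monkeys",
]
text = (
    "Arctic Monkeys played a benefit for an archery club "
    "sponsored by Archer Daniels Midland Company."
)
tags = tag_story(text, lexicon)
print(tags)
```

One limitation of matching this literally: inverted entries such as "Archer, Bill" will not match a story that writes "Bill Archer", which is part of why more complex term extraction is listed as an alternative.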
- 9.25 million stories
- 900G of database + downloaded content
- 162 million story / tag associations
- 1,500 sources
- 10,000 feeds
- roughly 20,000 stories per day
Topic focus
Pivoting on “republican”
Global Attention and Power Laws
You say “stimulus” and I say “bailout”
what’s been hard:
topic clustering
replicating across languages
legal concerns
dark matter