Mac281 big data & journalism lecture 2014

Big Data & Journalism MAC281 twitter/rob_jewitt [email protected]

Transcript of Mac281 big data & journalism lecture 2014


2009 #iranelection

Image: Gilad Lotan, ReTweet Revolution


Anatomy of a tweet


Overview

Intro

Database Journalism and Computer Assisted Reporting

Data Today: Visualisations and Interactivity

How To Be A Data Journalist

Ethics?


Adam Westbrook

“I think data-driven journalism is one of the big potential growth areas in the future of journalism. A lot of the forward-thinking discussion about the future of news focuses on the ‘glamorous’ possibilities, like video journalism and interactivity, but I often see data journalism being ignored.

In fact, I believe it is journalism in its truest essence: uncovering and mining through information the public do not have enough time to do themselves, interrogating it, and making sense of it before sharing it with the audience. If more journalists did this (rather than relying on ‘data’ from press releases) we would be a far more enlightened public.


My message to the next generation of journalists - or any journalist looking for a new niche or direction - would be to learn the skills and tools of data interrogation. It’s not glamorous, but it’s a skill not many journalists have, and one which will give one an edge in the market.”


Brian Storm

One of our big goals in the storytelling process is to humanize the statistics. It’s hard for people to care about numbers, especially large numbers. How do you get your head around the death of 800,000 people in the Rwandan genocide? I think if you meet the individuals - see and hear the stories of the survivors - you can gain a better insight into the tragedy.


“Data-driven journalism is the future”

“[Journalism’s] going to be about poring over data and equipping yourself with the tools to analyse it and picking out what's interesting. And keeping it in perspective, helping people out by really seeing where it all fits together, and what's going on in the country.” Sir Tim Berners-Lee, inventor of the Web, 2010


Origins

1950s

Database Journalism

Computer Assisted Reporting (CAR)

Very expensive


The Indianapolis Star

Capital Journal circa 1961


CBS: 1952, Walter Cronkite

Presidential election battle

Eisenhower vs Stevenson

Remington Rand UNIVAC

Early vote returns analysis

Predicted a landslide victory

Contrary to popular opinion


Other notable examples

Clarence Jones, The Miami Herald, 1969: criminal justice systems

David Burnham, The New York Times, 1972: police crime rates

Elliot Jaspin, The Providence Journal, 1986: school bus drivers and criminal records

Bill Dedman, The Atlanta Journal-Constitution, 1988: The Color of Money (Pulitzer Prize, 1989)


Since 2004


Adrian Holovaty (2005)

Chicago Transit Authority map + Firefox plug-in + Google Maps = real-time updates

Chicago Police Department + Google Maps = real time police reports


Adrian Holovaty (2006)

Now working for the Washington Post

A fundamental way newspaper sites need to change

Most material collected by journalists is: “structured information: the type of information that can be sliced-and-diced, in an automated fashion, by computers”


Traditional journalism

Articles as the finished product

Data journalism

Continually maintained and improved

Radical overhaul needed:
- Employing data
- Making data available
- Storing data
- Coding data



Maps Everywhere!


2007 – Holovaty won a $1.1 million Knight Foundation grant for EveryBlock

2010 – SR2 Blog won Guardian.co.uk’s ‘most inspirational site’ accolade


Bella Hurrell, Specials Editor with BBC News Online (2011)

Proximity of “journalists, designers and developers all working together, sitting alongside each other”


“We have found that proximity really important to the success of projects. Although we have done this for a while, increasingly other organisations are reorganising along these lines after coming to realise the benefits of breaking down silos and co-locating people with different skillsets can produce more innovative solutions at a faster pace.”


“As data visualisation has come into the zeitgeist, and we have started using it more regularly in our story-telling, journalists and designers on the specials team have become much more proficient at using basic spreadsheet applications like Excel or Google Docs”


Paul Bradshaw


“It represents the convergence of a number of fields which are significant in their own right - from investigative research and statistics to design and programming. The idea of combining those skills to tell important stories is powerful - but also intimidating. Who can do all that?”


“The reality is that almost no one is doing all of that, but there are enough different parts of the puzzle for people to easily get involved in, and go from there”


Dealing with Data (Bradshaw, 2010)

4 crucial aspects


1. Finding data  

2. Interrogating data  

3. Visualizing data

4. Mashing data
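
The four aspects can be tried out on a toy dataset. What follows is a minimal Python sketch, not Bradshaw's own method: the council names, figures and field names (`council`, `spend_gbp`, `population`) are invented for illustration, and the helper functions are hypothetical.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
SPENDING_CSV = """council,spend_gbp
Northtown,1200000
Southvale,800000
Eastport,450000
"""

POPULATION_CSV = """council,population
Northtown,300000
Southvale,100000
Eastport,90000
"""


def load_rows(text):
    """1. Finding data: read CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def mash(spending, population):
    """4. Mashing data: join the two datasets on the shared 'council' key."""
    pop_by_council = {row["council"]: int(row["population"]) for row in population}
    merged = []
    for row in spending:
        name = row["council"]
        if name in pop_by_council:
            spend = int(row["spend_gbp"])
            merged.append({
                "council": name,
                "spend_gbp": spend,
                "population": pop_by_council[name],
                "spend_per_head": spend / pop_by_council[name],
            })
    return merged


def interrogate(rows):
    """2. Interrogating data: rank councils by spend per head, highest first."""
    return sorted(rows, key=lambda r: r["spend_per_head"], reverse=True)


def visualise(rows, width=40):
    """3. Visualizing data: a crude text bar chart, one line per council."""
    top = max(r["spend_per_head"] for r in rows)
    lines = []
    for r in rows:
        bar = "#" * round(width * r["spend_per_head"] / top)
        lines.append(f"{r['council']:<10} {bar} {r['spend_per_head']:.2f}")
    return lines


ranked = interrogate(mash(load_rows(SPENDING_CSV), load_rows(POPULATION_CSV)))
chart = visualise(ranked)
print("\n".join(chart))
```

Total spend alone would put Northtown first. Once the two datasets are mashed together, the per-head figure ranks it last, which is the kind of question interrogation is meant to surface.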


Data visualisation vs data journalism


New Tools of the Trade?

Analysis

Excel or Calc: sort your data

Google Refine: clean your dirty data

Yahoo Pipes: composition mash-up tool

ScraperWiki: transforms info from webpages into data

R: process and manipulate data

Visualisation

Google Fusion Tables: visualise data on maps, timelines, etc.

Tableau Public: visualise and share

IBM’s Many Eyes: data visualisation tool

Processing: create images and interactives

Wordle: generate word clouds from bulky text
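
To give a feel for what "cleaning dirty data" and sorting involve, here is a minimal Python sketch. It does not reproduce how Google Refine or Excel work internally; the records and the `clean` and `count_clean` helpers are hypothetical and invented for illustration.

```python
from collections import Counter

# Hypothetical messy records, invented for illustration only.
RAW_NAMES = [
    "  Sunderland City Council",
    "sunderland city council ",
    "SUNDERLAND  CITY COUNCIL",
    "Durham County Council",
    "durham county council",
]


def clean(name):
    """Normalise case and whitespace so variant spellings collapse together."""
    return " ".join(name.split()).title()


def count_clean(names):
    """Count records per cleaned name, most frequent first."""
    return Counter(clean(n) for n in names).most_common()


counts = count_clean(RAW_NAMES)
for name, n in counts:
    print(f"{n}  {name}")
```

Without the cleaning step the five records would be counted as five different organisations. After it, they collapse into two.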


Free tools…


Summary

Is this journalism?

Journalism educators doing students a disservice?

Journalists replaced by programmers?

WikiLeaks: no journalists required?


Images

Knight Foundation, 2008, Sir Tim Berners-Lee talking about the Web at the Newseum

Bill on Capitol Hill, 2007, The Rim and the Slot

Marion Doss, 2008, Capital Journalism News Room 16 October 1961

Igorschwarzmann, 2010, NYT News Room

Mkandlez, 2009, The Billion Pound O Gram

BitBoy, 2006, The Elephant in the Room

Ravages, 2008, Links


Issues

To what extent is the traditional craft of storytelling being challenged by the emergence of big data?

What kinds of problems arise from the deluge of large data sets (e.g. MPs’ expenses, WikiLeaks Iraq war logs, US cables, etc.)?

Can the use or release of big data sets have ethical implications?