Carl Miller
CASM: The Centre for the Analysis of Social Media
Since 2008: 1.2 bn regular users
More time on social media than any other way of using the Internet
Rapid transition of our lives onto social-digital platforms
More political, social and intellectual activity is being captured:
data which had previously been lost
SOCMINT: The Promise of SOCMINT
Social media intelligence (SOCMINT)
New bodies of data that are:
§ Very large
§ Constantly refreshing
§ Unmediated
§ Rich
§ Linked
SOCMINT: The Opportunities
1. Inform: the ‘who’, ‘what’, ‘when’
2. Understand: the ‘why’ and ‘how’
3. Predict: ‘what next’
To inform Socmint
Event Detection
Targeted Event Detection
2012 Olympics
We knew: ‘Olympians’ were competing in ‘events’ for ‘medals’.
We didn’t know: when the medals were being won, or by whom.
Targeted events, Step 1: Collecting Tweets
1 30,470,932 Tweets posted between 18 July and 13 August 2012 were collected. Each of these Tweets contained at least the first or last name of an Olympian competing in the Games.
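The collection rule in Step 1 (keep a Tweet if it contains an Olympian's first or last name) can be sketched as a simple keyword filter. This is only an illustration under assumptions: the two-name athlete list, the helper names and the whole-word, case-insensitive matching are all hypothetical and are not taken from the project itself.

```python
import re

def name_tokens(full_name):
    """Return the first and last name of an athlete as a set of tokens."""
    parts = full_name.split()
    return {parts[0], parts[-1]} if parts else set()

def build_name_pattern(athletes):
    """Compile one case-insensitive regex matching any first or last name as a whole word."""
    tokens = set()
    for full_name in athletes:
        tokens |= name_tokens(full_name)
    alternation = "|".join(re.escape(t) for t in sorted(tokens))
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)

def mentions_olympian(tweet_text, pattern):
    """True if the Tweet contains at least one athlete first or last name."""
    return pattern.search(tweet_text) is not None

# Hypothetical two-athlete list; a real list would cover every competitor.
ATHLETES = ["Chad le Clos", "Michael Phelps"]
PATTERN = build_name_pattern(ATHLETES)

print(mentions_olympian("Phelps takes silver #London2012", PATTERN))
print(mentions_olympian("Watching the swimming tonight", PATTERN))
```

A filter this broad also matches a Tweet such as "michael jackson tribute", which is one reason keyword-based samples need cleaning before they are analysed.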
Targeted events, Step 2: Measuring Tweets over time
Targeted events, Step 3: Measuring change in the rate of Tweets
Targeted events, Step 4: Identifying the possible pre- and post-event windows of the Tweetstream
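Steps 2 to 4 can be sketched in a few lines. The slides do not say how Tweets were binned or how the windows were chosen, so everything specific here is an assumption made for illustration: one-minute bins, a simple bin-to-bin difference as the "change in rate", and the single largest jump taken as the candidate event time.

```python
from collections import Counter
from datetime import datetime, timedelta

def tweets_per_minute(timestamps):
    """Step 2: count Tweets in one-minute bins, including empty minutes."""
    counts = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    minutes = []
    current, last = min(counts), max(counts)
    while current <= last:
        minutes.append(current)
        current += timedelta(minutes=1)
    return minutes, [counts[m] for m in minutes]

def rate_changes(volumes):
    """Step 3: change in Tweet volume from each bin to the next."""
    return [later - earlier for earlier, later in zip(volumes, volumes[1:])]

def split_windows(minutes, volumes, window=5):
    """Step 4: treat the largest jump as a candidate event and return
    (event_minute, pre_event_minutes, post_event_minutes)."""
    changes = rate_changes(volumes)
    if not changes:
        return None
    # Index of the first bin after the largest increase.
    i = max(range(len(changes)), key=changes.__getitem__) + 1
    return minutes[i], minutes[max(0, i - window):i], minutes[i:i + window]

# Made-up timestamps: a quiet stream followed by a sudden burst.
base = datetime(2012, 7, 31, 19, 0)
stamps = []
for minute, n in enumerate([1, 1, 1, 1, 1, 20, 10, 10, 10, 10]):
    stamps += [base + timedelta(minutes=minute, seconds=s) for s in range(n)]

minutes, volumes = tweets_per_minute(stamps)
event_minute, pre_window, post_window = split_windows(minutes, volumes, window=3)
print(event_minute, len(pre_window), len(post_window))
```

Tweets falling in the post-event window are the ones most likely to report a result, which is where a scoring step like the one shown in the tables would be applied.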
Tweet Text | Score
Gold Gold Chad le Clos by the fingertip #teamSA #london2012 | 0.715
20th of a second between Chad Le Clos and Michael Phelps in the 200M Butterfly!? Wow, what a final! Credit to Le Clos! #London2012Olympics | 0.618
Here comes Michael Phelps - WOW! Misses gold by 0.01 seconds! Phelps takes silver. South African Chad Le Clos wins gold #London2012 #Olympx | 0.597
Wow Michael Phelps misses gold by 0.01 seconds! Phelps takes silver. South African Chad Le Clos wins gold #London2012 | 0.595
Tweet Text | Score
When I was 10 I dreamed of going to the Olympics, the furthest I got was European Champion at the age of 12 and then I stopped... | -0.066
took gymnastics from 18 ms - 10 yrs then quit cuz I didn't think I was good I find out I prob wld hv been in the Olympics fml | -0.067
@raytetreault: I'm gonna get the Olympics ring tattoo and just tell everyone I was in the Olympics. #soundsgood | -0.067
Situational Awareness
To understand Socmint
Social network analysis
Natural Language Processing
Natural language processing: The Classifier
§ The practical value of NLP is to create classifiers
§ Classifiers are models that are taught to put natural language (most often Tweets) into categories defined by the analyst, on the basis of examples of each category provided by an analyst.
§ This is ‘machine learning’ through ‘annotation’. We’ll be doing it this afternoon.
§ The basis of this is Bayesian mathematics: it is inherently probabilistic.
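The idea of teaching a classifier from annotated examples can be sketched with a small naive Bayes model. This is an illustration of the general Bayesian approach only, not the classifier used in the talk: the class name, the whitespace tokeniser, the add-one smoothing and the four annotated examples are all assumptions made for the sketch.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one smoothing, trained on (text, label) pairs."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of annotated examples
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def train(self, annotated):
        """'Annotation': the analyst supplies examples of each category."""
        for text, label in annotated:
            self.doc_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def probabilities(self, text):
        """Return P(label | text) for every label; words never seen in training are ignored."""
        total_docs = sum(self.doc_counts.values())
        words = [w for w in self.tokenize(text) if w in self.vocab]
        log_scores = {}
        for label in self.doc_counts:
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalise the log scores into probabilities that sum to 1.
        top = max(log_scores.values())
        unnormalised = {l: math.exp(s - top) for l, s in log_scores.items()}
        total = sum(unnormalised.values())
        return {l: v / total for l, v in unnormalised.items()}

    def classify(self, text):
        probs = self.probabilities(text)
        return max(probs, key=probs.get)

# Hypothetical annotations: 'event' reports a result, 'not_event' does not.
classifier = NaiveBayesClassifier()
classifier.train([
    ("wins gold in the final", "event"),
    ("takes silver what a final", "event"),
    ("I dreamed of going to the olympics", "not_event"),
    ("gonna get the olympics ring tattoo", "not_event"),
])
print(classifier.classify("Le Clos wins gold"))
print(classifier.probabilities("Le Clos wins gold"))
```

The output is a probability for each category rather than a hard answer, which is the sense in which the approach is inherently probabilistic.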
Method 51 Natural language processing
Predicting X Factor
Predicting X Factor: From Opinion to Action
[Diagram: Social Media Opinion, “Brand” Evaluation, Immediate Evaluation, Action, Behaviour Modelling]
Challenges to SOCMINT
SOCMINT: Two Parallel Challenges
Methodology: Old methods overwhelmed. New, unfamiliar applications of new technologies to understand a new, digital-social world
Legitimacy: An obviously contested area, and one that stands to suffer much harm from use without public consent
Two challenges that stand in the way of it paying decisive dividends to public security
Challenges to SOCMINT: Representivity
§ Most data needs to be applicable to a given group in the offline world. There are a number of reasons to be sceptical about social media data:
§ Data gathered from the platform may not represent the platform (sampling issues, especially keywords)
§ Social media content may not represent social media users: social media is subject to power laws. Research suggests that a small number of ‘power-users’ on Twitter, around 5 percent, are responsible for 75 percent of Twitter activity.
§ Social media users may not represent actual people (sock-puppets and bots)
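The power-law concern above can be checked on any collected dataset by measuring how much of the activity comes from the most active accounts. The sketch below assumes the data is available as a flat list with one author name per Tweet; the function name and the sample data are made up for illustration.

```python
import math
from collections import Counter

def top_user_share(tweet_authors, top_percent=5):
    """Fraction of all Tweets posted by the most active `top_percent` percent of users."""
    counts = Counter(tweet_authors)
    ranked = sorted(counts.values(), reverse=True)
    # At least one user is always counted as the 'top' group.
    n_top = max(1, math.ceil(len(ranked) * top_percent / 100))
    return sum(ranked[:n_top]) / sum(ranked)

# Made-up sample: 20 users, one of whom posts 81 of the 100 Tweets.
authors = ["power_user"] * 81 + [f"user{i}" for i in range(19)]
print(top_user_share(authors))
```

A share far above the top group's proportion of users suggests that counts of Tweets describe a handful of prolific accounts rather than the user base as a whole.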
Challenges to SOCMINT: Veracity
§ Is what is being measured what is happening? New technologies and methodologies, many experimental and probabilistic.
§ The openness and anonymity of social media, especially, make them a suitable medium for deceptive tactics. A deliberate intent to mislead could be expressed through disinformation, misinformation, honeypot accounts, impersonation, wiki-circularity and even self-deception.
Deception
Challenges to SOCMINT: Reality
§ Does it actually correspond with the real world?
§ Online disinhibition effect. Our ability to count things on social media has outpaced our understanding of what these things mean as social and cultural practices – as symbols, as language-games, as rituals, as products of digital worlds ruled by new norms and subjective truths.
Challenges to SOCMINT: Validation
§ There are not yet any developed and tested strategies in common use across SOCMINT practice or social media research to validate whatever is produced.
§ Either at the single-source level, to rate the confidence in any single piece of intelligence reporting
§ Or at the all-source level, where it feeds into assessment against other pieces of intelligence and bodies of open data.
Challenges to SOCMINT: Use
§ SOCMINT depends on reaching the right people in time, securely, and in a format that makes sense to strategic and operational decision makers as well as those at the front line. Issues are:
§ SOCMINT is often complex, self-contradictory and dynamic
§ It must be understood within a cloud of caveats
Challenges to SOCMINT: Legitimacy, public acceptability and law
§ The Internet is a contested place – from the beginning, there has been a cyber-libertarian belief that the Internet exists to evolve humanity beyond states
§ Its universal language – the TCP/IP protocol – embraces an open architecture that distrusts centralised control, allows any computer or network to join, and does not make (nor allow internet service providers to make) judgments about content.
§ It is therefore vital that the collection and use of SOCMINT rests on a firm basis of public acceptability
Challenges to SOCMINT: Not yet a Discipline
§ Scattering of isolated islands of emphasis
§ Not a united body of learning, method or example
§ Spans disciplines from computer sciences and ethnography to advertising and brand management
§ Conducted across the private sector (from tech start-ups to large business analytics firms), academia, and now beginning in the public sector.
§ Fastest take-up was in marketing and advertising
§ Slower was the government and public sector
§ Still barely made an impact on charities and the third sector
CASM @carljackmiller [email protected]