Ikanow oanyc summit
![Page 1: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/1.jpg)
REALIZING BUSINESS VALUE FROM OPEN SOURCE DATA
AND OPEN SOURCE INTELLIGENCE
Presented by: Chris Morgan
![Page 2: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/2.jpg)
http://bit.ly/data-vending
DATA AND ART (PRIMER)
Providing value on the potential of bad news to serve out a bag of salty potato chips
harnessing the power of open data and sentiment
![Page 3: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/3.jpg)
Data Intelligence
Operational Lens
Intelligence is information that has been transformed to meet an operational need
Intelligence
![Page 4: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/4.jpg)
Intelligence Cycle
No matter what methodology you use… intelligence analysis is an iterative process.
![Page 5: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/5.jpg)
• Provide value to the organization – turn data into intelligence using an “operational lens”
• Ensure cyclical feedback occurs during collection, processing, analysis, and consumption
• Validate that a particular network is the right source of data for the questions you need answered
Open Source Analysis Goals
![Page 6: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/6.jpg)
Common Pitfalls
Analyzing What Instead of Why
The important thing is often not what people are saying… but why they are saying it.
![Page 7: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/7.jpg)
Common Pitfalls
Using the Wrong Analysis Tools
Reporting tools rarely help dig into the why. Many common tools, reports, and metrics are misleading:
– Word clouds atomize message context
– Sentiment metrics are often highly inaccurate
– Information in aggregate hides more than it reveals
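The point about aggregates can be illustrated with a small sketch. The sender names and scores below are invented for illustration and are not taken from the Enron data; `aggregate_by_week` is a hypothetical helper.

```python
from statistics import mean

# Made-up weekly sentiment scores for two hypothetical senders.
weekly_by_sender = {
    "sender_a": [0.25, 0.25, -0.5],  # sharp drop in week 3
    "sender_b": [0.25, 0.25, 1.0],   # sharp rise in week 3
}

def aggregate_by_week(series_by_sender):
    """Average all senders' scores for each week."""
    return [mean(week) for week in zip(*series_by_sender.values())]

aggregate = aggregate_by_week(weekly_by_sender)
print(aggregate)  # the organization-wide series stays flat
```

Here the organization-wide average is the same in every week, even though one sender's score falls sharply in the third week. Looking at the per-sender series is what exposes the shift.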
![Page 8: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/8.jpg)
Use Case
Sentiment Analysis
http://bit.ly/ikanow-and-r
![Page 9: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/9.jpg)
Enron Sentiment Analysis
Data source
~500,000 publicly available Enron emails
![Page 10: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/10.jpg)
Enron Sentiment Analysis
Hypothesis
Use sentiment analysis as a first-order process to prioritize and streamline the overall analysis process
![Page 11: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/11.jpg)
Enron Sentiment Analysis
Caveats
• Sentiment was attributed only to the sender
• Not a complete representation of an organization's email corpus
• Counteraction of uneven coverage was estimated
• Not a full analysis of the set of information (the objective was to use sentiment analysis as a reduction technique)
![Page 12: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/12.jpg)
Workflow
• Data Ingestion Process
– Extraction of entities, events, facts, and some basic statistics
• Aggregation and Reduction
– Aggregation of keywords with sentiment from each email
– Average sentiment score
– Follow-on aggregation by the sender's email address over a given week (average sentiment score)
• Visualize and Analyze
– Imported into Infinit.e and R for visualization
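The aggregation and reduction step might look roughly like the sketch below. It is not the presenters' code: the slides name Infinit.e and R, while this uses plain Python, and the function names, the `(sender, date, keyword scores)` record layout, the ISO-week bucketing, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def email_score(keyword_sentiments):
    """Average the per-keyword sentiment scores of one email (0.0 if none)."""
    return mean(keyword_sentiments) if keyword_sentiments else 0.0

def weekly_sender_sentiment(emails):
    """Average email scores per (sender, ISO year, ISO week)."""
    buckets = defaultdict(list)
    for sender, sent_on, keyword_sentiments in emails:
        iso = sent_on.isocalendar()  # (ISO year, ISO week, ISO weekday)
        buckets[(sender, iso[0], iso[1])].append(email_score(keyword_sentiments))
    return {key: mean(scores) for key, scores in buckets.items()}

# Hypothetical records: (sender address, send date, keyword sentiment scores).
emails = [
    ("a@example.com", date(2001, 10, 15), [0.5, -0.5]),
    ("a@example.com", date(2001, 10, 17), [-1.0]),
    ("a@example.com", date(2001, 10, 22), [0.5]),
]
weekly = weekly_sender_sentiment(emails)
```

Each key of `weekly` identifies one sender in one week, which is the unit the later charts plot over time.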
![Page 13: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/13.jpg)
• Horizontal Bar
– Positive sentiment = Green
– Negative sentiment = Red
• Chart on Left
– Positive sentiment = Green
– Negative sentiment = Red
• Chart on Right
– Heuristic: weeks with abrupt negative shifts indicated problems in the organization
– Positive sentiment = Blue
– Negative sentiment = Red
One email sender's weekly average sentiment across time
Workflow
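One way to express the "abrupt negative shift" heuristic is a week-over-week drop test, sketched below. The slides do not say how "abrupt" was defined, so the `drop_threshold` parameter, its default of 0.5, and the function name are assumptions, and the sample scores are invented.

```python
def abrupt_negative_weeks(weekly_scores, drop_threshold=0.5):
    """Return indices of weeks whose score fell by at least
    drop_threshold compared with the previous week."""
    flagged = []
    for i in range(1, len(weekly_scores)):
        drop = weekly_scores[i - 1] - weekly_scores[i]
        if drop >= drop_threshold:
            flagged.append(i)
    return flagged

# Hypothetical weekly averages for one sender.
scores = [0.2, 0.3, -0.4, -0.3, 0.1]
flagged = abrupt_negative_weeks(scores)
```

Only the third week (index 2) is flagged here, because it is the only week that falls by more than the threshold relative to the week before it.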
![Page 14: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/14.jpg)
Workflow
Close-up snapshot of a subset of 20 individuals' average email sentiment scores over time
![Page 15: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/15.jpg)
Individual analysis based on the reduction of the information by the sentiment analysis process
Workflow
![Page 16: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/16.jpg)
Findings
• Indicators and Additional Analysis
– 801 weeks out of 11,500 were highlighted as important for further investigation
– The keywords found could be used to statistically investigate the 801 weeks highlighted for manual review
– Individual evaluation of emails highlighted through a reduction process (case construction)
– Pipeline created for further analysis
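To put the reduction in proportion, the two counts from the slide can be turned into a share. Only the numbers 801 and 11,500 come from the source; the variable names are illustrative.

```python
flagged_weeks = 801     # weeks highlighted for further investigation
total_weeks = 11_500    # weeks considered in total

share = flagged_weeks / total_weeks
print(f"{share:.1%} flagged, {1 - share:.1%} set aside")
# prints "7.0% flagged, 93.0% set aside"
```

In other words, the sentiment pass left roughly one week in fourteen for manual review.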
![Page 17: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/17.jpg)
Lessons Learned
1. Drastically reduced the timeline necessary for case construction
![Page 18: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/18.jpg)
Lessons Learned
2. Multiple contexts for this type of technique
• Intelligence Analysis
• E-Discovery
• Brand Management
• Social Media Analysis
![Page 19: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/19.jpg)
Lessons Learned
3. Only negative shifts were investigated; analysis of positive shifts could easily be applied to different questions in other use cases
![Page 20: Ikanow oanyc summit](https://reader035.fdocuments.us/reader035/viewer/2022062513/5550fd7fb4c90572478b4bcd/html5/thumbnails/20.jpg)
Lessons Learned
4. R and Infinit.e provide an interesting technology integration for evaluating and reducing unstructured data