Transforming instagram data into location intelligence

Data Science Innovation: Transforming Instagram Data Into Location Intelligence and Internet of Things April 2014 [email protected] or linkedin.com/in/sureshsood


Transcript of Transforming instagram data into location intelligence

Page 1: Transforming instagram data into location intelligence

Data Science Innovation: Transforming Instagram Data Into Location Intelligence and Internet of Things
April 2014
[email protected] or linkedin.com/in/sureshsood

Page 2: Transforming instagram data into location intelligence

Topic Areas

1. Statistics/Data mining or Data Science?
2. Data Science workflows/discovery
3. Research informing our thinking about location intelligence
4. Data Science innovation and exploratory analysis
5. Motivations for Instagram project
6. Pattern mining trajectories/Data mining
7. Instagram analytics tools
8. NoSQL - MongoDB
9. Datafication 3 back end (walk thru)
10. Location Social Recommender system
11. Q&A

Page 3: Transforming instagram data into location intelligence

Statistics, Data Mining or Data Science ?

• Statistics
– precise deterministic causal analysis over precisely collected data
• Data Mining
– deterministic causal analysis over re-purposed data, carefully sampled
• Data Science
– trending/correlation analysis over existing data using the bulk of the population, i.e. big data

Adapted from: NIST Big Data taxonomy draft report (see http://bigdatawg.nist.gov/show_InputDoc.php)

Page 4: Transforming instagram data into location intelligence

Data Science Workflows & Discovery

Page 5: Transforming instagram data into location intelligence

Useful References Informing our Thinking about Location Intelligence

• Silva et al. (2013), A comparison of Foursquare and Instagram to the study of city dynamics and urban social behavior, Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing
– Instagram and Foursquare datasets might be compatible in finding popular regions of a city
• Chaoming Song et al. (2010), Limits of Predictability in Human Mobility, Science
– "There is a potential 93% average predictability in user mobility, an exceptionally high value rooted in the inherent regularity of human behavior. Yet it is not the 93% predictability that we find the most surprising. Rather, it is the lack of variability in predictability across the population."
• Scellato et al. (2011), NextPlace: A Spatio-temporal Prediction Framework for Pervasive Systems, Proceedings of the 9th International Conference on Pervasive Computing (Pervasive'11)
– Daily and weekly routines => Few significant places every day => Regularity in human activities => Regularity leads to predictability
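The 93% figure quoted from Song et al. is an upper bound on predictability obtained from the entropy of a user's location history through Fano's inequality. The following is a minimal sketch of that last step only, assuming an entropy estimate S (in bits) and a count N of distinct locations are already available; the function names are hypothetical, and the entropy estimation itself is not reproduced here.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def max_predictability(entropy_bits: float, n_locations: int,
                       tol: float = 1e-12) -> float:
    """Solve S = H(p) + (1 - p) * log2(N - 1) for p by bisection.

    p is the upper bound on how often any algorithm could correctly
    guess the next location, given entropy S over N distinct places.
    """
    if n_locations < 2 or entropy_bits <= 0.0:
        return 1.0  # a single place or zero entropy: fully predictable
    if entropy_bits >= math.log2(n_locations):
        return 1.0 / n_locations  # maximal entropy: no better than chance

    def fano(p: float) -> float:
        return binary_entropy(p) + (1 - p) * math.log2(n_locations - 1)

    # fano() decreases from log2(N) at p = 1/N down to 0 at p = 1.
    lo, hi = 1.0 / n_locations, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano(mid) > entropy_bits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Lower entropy gives a higher bound: a user who alternates between a handful of places with little surprise sits close to 1, while a user whose next place is nearly random sits close to 1/N.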

Page 6: Transforming instagram data into location intelligence

Useful References Informing our Thinking about Location Intelligence

• De Domenico, M., Lima, A. and Musolesi, M. (2012), Interdependence and Predictability of Human Mobility and Social Interactions, Proceedings of the Nokia Mobile Data Challenge Workshop
– "we have shown that it is possible to exploit the correlation between movement data and social interactions in order to improve the accuracy of forecasting of the future geographic position of a user. In particular, mobility correlation, measured by means of mutual information, and the presence of social ties can be used to improve movement forecasting by exploiting mobility data of friends. Moreover, this correlation can be used as indicator of potential existence of physical or distant social interactions and vice versa."
• Sadilek, A. and Krumm, J. (2012), Far Out: Predicting Long-Term Human Mobility
– "Where are you going to be 285 days from now at 2pm … we show that it is possible to predict location of a wide variety of hundreds of subjects even years into the future and with high accuracy."
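The mobility correlation mentioned in the Nokia Mobile Data Challenge reference is measured with mutual information. As a rough illustration, the sketch below computes a plug-in estimate of mutual information between two users' place labels observed at the same time steps. The discretisation into labels such as "home" and "work" and the function name are assumptions for the example, not the authors' procedure.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Plug-in mutual information (bits) between two aligned label sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must cover the same time steps")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi
```

Two users who always move together share as much information as one trace contains, while two users whose places are unrelated score zero; values in between suggest how much a friend's trace could help a forecast.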

Page 7: Transforming instagram data into location intelligence

Useful References Informing our Thinking about Location Intelligence

“One of the most fascinating aspects of location-based data is the stability and predictability of patterns that can be mined from seemingly unrelated data. A cluster of random dots on a map can represent a daily transportation route, the most popular dating spots or the neighborhoods with the highest concentration of gang violence. These patterns, analyzed over time and in large numbers, begin to allow for informed predictions of behaviors and events. For government, this analytical capability enables better resource allocation and more effective outcomes.”

Interview with G. Edward DeSeve, former White House ARRA chief administrator, December 15, 2011. Seen in “The power of zoom: Transforming government through location intelligence” by Deloitte Consulting LLP.
Source: https://www.deloitte.com/assets/Dcom-UnitedStates/Local%20Assets/Documents/Federal/us_fed_govlab_power_of_zoom_report_100212.pdf

Page 8: Transforming instagram data into location intelligence

Useful NSW Govt resources on Location Intelligence

• NSW Globe
– globe.six.nsw.gov.au
– Uses Google Earth to explore spatial data and images
• NSW Location Intelligence Strategy (April 2014)
– http://www.finance.nsw.gov.au/ict/sites/default/files/NSW Location Intelliegence Strategy.pdf
• NSW Government datasets
– http://data.nsw.gov.au/

Page 9: Transforming instagram data into location intelligence

Data Science Innovation

Data Science innovation is something an organization has not done before or even something nobody anywhere has done before. A data science innovation focuses on discovering and using new or untraditional data sources to solve new problems.

Adapted from: Franks, B. (2012), Taming the Big Data Tidal Wave, p. 255, John Wiley & Sons

Page 10: Transforming instagram data into location intelligence

The ANZ Heavy Traffic Index comprises flows of vehicles weighing more than 3.5 tonnes (primarily trucks) on 11 selected roads around NZ. It is contemporaneous with GDP growth.

The ANZ Light Traffic Index is made up of light or total traffic flows (primarily cars and vans) on 10 selected roads around the country. It gives a six month lead on GDP growth

http://www.anz.co.nz/commercial-institutional/economic-markets-research/truckometer/
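The difference between a contemporaneous indicator and a leading one can be checked with a lagged correlation: correlate the indicator at time t with the target at time t + lead, and see which lead fits best. The sketch below is illustrative only. The function names are hypothetical, and any series passed in would need to come from the user's own data; no ANZ figures are reproduced here.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lead_correlation(indicator: Sequence[float], target: Sequence[float],
                     lead: int) -> float:
    """Correlate indicator[t] with target[t + lead]; lead 0 is contemporaneous."""
    if lead < 0:
        raise ValueError("lead must be non-negative")
    if lead == 0:
        return pearson(indicator, target)
    return pearson(indicator[:-lead], target[lead:])


def best_lead(indicator: Sequence[float], target: Sequence[float],
              max_lead: int) -> int:
    """Lead (in periods) at which the indicator correlates most with the target."""
    return max(range(max_lead + 1),
               key=lambda k: lead_correlation(indicator, target, k))
```

With monthly series, a best lead of 0 would correspond to the contemporaneous behaviour described for the heavy traffic index, and a best lead of 6 to the six month lead described for the light traffic index.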

Page 11: Transforming instagram data into location intelligence

Discovery (Exploratory) Analytics

Exploratory
– Unstructured
– Machine learning
– Data mining
– Complex analysis
– Data diversity
Richness of new sources

Business Intelligence
– Dashboard
– Real time decisioning
– Alerts
– Fresh data
– Response time
Speed of Query

Page 12: Transforming instagram data into location intelligence

Data Science Innovation
New sources of information for data driven applications and Internet of Things

• Number of journeys made
• Distances travelled
• Types of roads used
• Speed
• Time of travel
• Levels of acceleration and braking
• Any accidents which may occur

The Industrial Ecology Lab - towards an integrated Australian research platform

Page 13: Transforming instagram data into location intelligence

Black Box Insurance
• Telematics technology (black box) helps assess driving behavior and deliver true driver centric premiums by capturing:
– Number of journeys
– Distances travelled
– Types of roads
– Speed
– Time of travel
– Acceleration and braking
– Any accidents
• Benefits low mileage, smooth and safe drivers
• Privacy vs. saving money on insurance (Canada)
– http://bit.ly/Black_box

Page 14: Transforming instagram data into location intelligence

Internet of Things
“trillion sensors”

Source: www.tsensorssummit.org

Page 15: Transforming instagram data into location intelligence

Smartphone, Google Glass or Apple Watch will Know What you Want before you do

“…from 2014 your phone [glasses or watch] will anticipate your needs, do the research, tell you what you want to know – sometimes before the question even occurs to you…”

Chapman, Jake (2013), The Wired World in 2014

Page 16: Transforming instagram data into location intelligence

Push Notification Providers
1. Appboy
2. Urban Airship
3. StackMob
4. Parse
5. https://notifica.re
6. http://www.xtify.com
7. http://push.io
8. http://streamin.io
9. https://pushbots.com
10. http://appsfire.com
11. mBlox
12. http://quickblox.com/
13. https://www.mobdb.net
14. http://www.elementwave.com
15. Kahuna - http://www.usekahuna.com/

http://www.quora.com/What-are-some-alternatives-to-Urban-Airship-for-mobile-push

Page 17: Transforming instagram data into location intelligence

Mobile Relationship Management Workflow (Urban Airship)
What? / When? / Where?

Page 18: Transforming instagram data into location intelligence

Apple Passbook Styles Urban Airship

Page 19: Transforming instagram data into location intelligence

Motivations for Instagram Project
• Trajectory data (not i.i.d., i.e. not independent and identically distributed)
• A new authentication approach based on trajectory
• Predictive capability for phones, glasses and watches
• Internet of Things (sensors, RFID, wheelchairs and drones)
• Indoor GPS
• Car parking “anywhere”
• Location based services, e.g. advertising
• Tourist recommender system
• Food analytics and traceability (farm to fork)
• Mobile apps with trajectory data, e.g. Foursquare, Instagram, Nike+, EveryTrail
• Insurance “pay as you drive”
– telematics black box based insurance policy

Page 20: Transforming instagram data into location intelligence

Pattern Mining Trajectories

Group of Trajectories

Trajectory Patterns:
1. Hot regions (basic unit)
2. A trajectory pattern is the set of relationships amongst regions

Opportunities:
• Location based networks
• Destination prediction
• Car-pooling
• Personal route planning
• Group buying
• Loyalty
• Credit card data

Adapted from: Chang, Wei, Yeh and Peng, “Discovering Personalised Routes from Trajectories”, ACM LBSN’11, Chicago, Illinois, USA, 1 November 2011
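The two ideas on this slide, hot regions as the basic unit and patterns as relationships amongst regions, can be sketched with a simple grid. The code below is an illustrative simplification and not the method of the cited paper: it assumes a fixed grid in degrees, treats a cell as hot when enough distinct trajectories pass through it, and counts moves between consecutive hot cells. The cell size, threshold and all names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float]  # (latitude, longitude)
Cell = Tuple[int, int]       # grid indices


def to_cell(point: Point, cell_deg: float = 0.01) -> Cell:
    """Map a coordinate to the grid cell that contains it."""
    lat, lon = point
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def hot_regions(trajectories: Sequence[Sequence[Point]],
                cell_deg: float = 0.01, min_users: int = 2) -> Set[Cell]:
    """Cells visited by at least min_users distinct trajectories."""
    visits: Counter = Counter()
    for trajectory in trajectories:
        # Count each trajectory once per cell, however long it lingers there.
        for cell in {to_cell(p, cell_deg) for p in trajectory}:
            visits[cell] += 1
    return {cell for cell, n in visits.items() if n >= min_users}


def region_transitions(trajectories: Sequence[Sequence[Point]],
                       hot: Set[Cell], cell_deg: float = 0.01) -> Counter:
    """Count moves between consecutive hot regions across all trajectories."""
    transitions: Counter = Counter()
    for trajectory in trajectories:
        path: List[Cell] = []
        for point in trajectory:
            cell = to_cell(point, cell_deg)
            # Keep hot cells only, collapsing repeated stays in the same cell.
            if cell in hot and (not path or path[-1] != cell):
                path.append(cell)
        transitions.update(zip(path, path[1:]))
    return transitions
```

The most frequent transitions are candidate trajectory patterns; a density-based clustering could replace the fixed grid where region shapes matter.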

Page 21: Transforming instagram data into location intelligence

Open Source Artifact Highlighting 68 Data Mining Algorithms

Page 22: Transforming instagram data into location intelligence

First Australian Instagram Study Conducted by UTS:AAI

Page 23: Transforming instagram data into location intelligence

Why is Instagram Popular ?

• Mobile photo sharing app + social network
• Mobile first workflow:
– take picture or select => crop/filter => geo-tag/hashtag/description/share
• Instagram is “Twitter but with photo updates”
• Status updates are transformed photos
• By default, pictures and accounts are public
• Pictures include:
– Geolocation, hashtags, comments and likes
• Mobile app friendly vs. desktop

Page 24: Transforming instagram data into location intelligence

Instagram Analytics Tools (off the shelf)
• Statigram
– Lifetime likes
– Total comments
– New followers/last 7 days
– Most liked photos
• Simply Measured
– Total engagement across Instagram, Facebook and Twitter
– Engaging photo/filter/location
– Top photos by date
– Active commenters
– Best time for engagement
– Best day for engagement
– Top filters
• Nitrogram
– Countries of followers
– Most engaging
– Most commented
– Likes and comments on a photo

Page 25: Transforming instagram data into location intelligence

MongoDB - An Innovation in Databases?
• “MongoDB gets the job done”
• “document-oriented NoSQL database”
• “MongoDB is a natural choice when dealing with JSON”
• “Same data model in code = same model in database”
• “Data structure store to model applications”
• “In MongoDB an Instagram post can be stored in a single collection, exactly as represented in the program, as one object. In a relational database an Instagram post would occupy multiple tables.”
• “MongoDB understands geo-spatial co-ordinates and supports geo-spatial indexing”
• Initial MongoDB prototype on Red Hat OpenShift (public/private or community “Platform as a Service”)
• Recommendation engine integrating Mahout libraries and MongoDB (see Roadmap)

As discussed at: Journey to MongoDB: Trajectory Pattern Mining in Australian Instagram, by Suresh Sood and Xinhua Zhu
Sydney MongoDB Meetup, 30 April 2013
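To make the single-document model and geospatial support concrete, here is a hedged sketch of how one Instagram post might look as a MongoDB document, with its location stored as a GeoJSON point, and of a proximity query against it. The field names are hypothetical and are not taken from the project's data model. The documents are built as plain Python dicts so the example runs without a database; the pymongo calls in the comments show where they would be used.

```python
import json
from typing import Any, Dict, Iterable


def make_post_document(post_id: str, user: str, caption: str,
                       tags: Iterable[str], lat: float, lon: float,
                       created_time: int, likes: int) -> Dict[str, Any]:
    """Build one post as a single nested document (hypothetical fields)."""
    return {
        "_id": post_id,
        "user": user,
        "caption": caption,
        "tags": list(tags),  # an array inside the document, no join table
        # GeoJSON Point: coordinates are [longitude, latitude], in that order.
        "location": {"type": "Point", "coordinates": [lon, lat]},
        "created_time": created_time,  # Unix timestamp (assumed)
        "likes": likes,
    }


def near_query(lat: float, lon: float, max_distance_m: float) -> Dict[str, Any]:
    """Query document for posts within max_distance_m metres of a point."""
    return {
        "location": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": max_distance_m,
            }
        }
    }


# With a live server and pymongo (illustrative, not run here):
#   posts = MongoClient()["instagram"]["posts"]   # hypothetical names
#   posts.create_index([("location", "2dsphere")])
#   posts.insert_one(make_post_document(...))
#   nearby = posts.find(near_query(-33.8568, 151.2153, 500))
```

The `$near` form shown follows current MongoDB documentation; the exact operator syntax should be checked against the release in use. Because the document is already JSON-shaped, `json.dumps` on it gives the same structure the program holds in memory.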

Page 26: Transforming instagram data into location intelligence

JSON Sources Driving Internet of Things

• RaZberry– http://www.theregister.co.uk/Print/2013/09/16/zwave_pi_its_time_the_raspberry_pi_took_control/

• Teradata– http://www.teradata.com.au/newsrelease.aspx?LangType=3081

• Google– http://googledevelopers.blogspot.com.au/2012/10/got-big-json-bigquery-expands-data.html

Page 27: Transforming instagram data into location intelligence

MongoDB Analytics Support of Instagram Project

• Rich query language
• Native secondary indexes
• Geospatial indexes & search
• Text indexes & search
• Aggregation framework (see MongoDB documentation for Release 2.4.9)
• Map-Reduce (JavaScript) implementation
• Client-side analytics
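As one example of the aggregation framework listed on this slide, the sketch below builds a pipeline that counts hashtags across posts, next to a pure-Python equivalent for checking results on a small sample. It assumes each post document carries a `tags` array; that field name and the function names are hypothetical. The stages used (`$unwind`, `$group`, `$sort`, `$limit`) are standard aggregation operators.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def top_hashtags_pipeline(limit: int = 10) -> List[Dict[str, Any]]:
    """Aggregation pipeline: one output row per hashtag, most used first."""
    return [
        {"$unwind": "$tags"},                                 # one doc per tag
        {"$group": {"_id": "$tags", "count": {"$sum": 1}}},   # count per tag
        {"$sort": {"count": -1}},                             # most used first
        {"$limit": limit},
    ]


def top_hashtags_local(posts: Iterable[Dict[str, Any]],
                       limit: int = 10) -> List[Tuple[str, int]]:
    """Same count computed client-side, for checking on a small sample."""
    counts = Counter(tag for post in posts for tag in post.get("tags", []))
    return counts.most_common(limit)


# With pymongo (illustrative, not run here):
#   results = list(collection.aggregate(top_hashtags_pipeline(20)))
```

Running the count inside the database avoids shipping every post to the client, which is where the server-side options on the slide differ from client-side analytics.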

Page 28: Transforming instagram data into location intelligence

Architectural Implementation using MongoDB

[Architecture diagram. Labels: Instagram via API; Data Collection; Mongo database distributed across shards; Name Node; Map Reduce; Stats]

Page 29: Transforming instagram data into location intelligence

Client for Instagram project
datafication.com.au/instagram

Page 30: Transforming instagram data into location intelligence

Timeline based Trajectory Analysis

Page 31: Transforming instagram data into location intelligence

Google Map based Trajectory Analysis

Page 32: Transforming instagram data into location intelligence

Social Relationship Analysis

Page 33: Transforming instagram data into location intelligence

Location based Retrieval
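Location based retrieval can be illustrated without a database by filtering posts on great-circle distance from a query point. This is a minimal sketch using the haversine formula. The `lat` and `lon` keys on each post are assumptions made for the example, and a geospatial index would normally do this work on the server.

```python
import math
from typing import Any, Dict, Iterable, List

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def posts_within(posts: Iterable[Dict[str, Any]], lat: float, lon: float,
                 radius_m: float) -> List[Dict[str, Any]]:
    """Posts whose location lies within radius_m metres of (lat, lon)."""
    return [p for p in posts
            if haversine_m(lat, lon, p["lat"], p["lon"]) <= radius_m]
```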

Page 34: Transforming instagram data into location intelligence

Popular HashTag Analysis

Page 35: Transforming instagram data into location intelligence

Popular Image Analysis

Page 36: Transforming instagram data into location intelligence

Peak Usage Time Analysis
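Peak usage analysis reduces to a histogram of posts by hour of day. The sketch below assumes each post carries a Unix timestamp and converts it with a fixed UTC offset (+10 hours by default, which ignores daylight saving); both are simplifying assumptions, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def posts_per_hour(timestamps: Iterable[int],
                   utc_offset_hours: int = 10) -> Counter:
    """Count posts by local hour of day (0-23) using a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return Counter(datetime.fromtimestamp(ts, tz).hour for ts in timestamps)


def peak_hour(timestamps: Iterable[int],
              utc_offset_hours: int = 10) -> Optional[int]:
    """Hour with the most posts; ties go to the earliest hour, None if empty."""
    counts = posts_per_hour(timestamps, utc_offset_hours)
    if not counts:
        return None
    # Iterating hours in ascending order makes max() return the earliest tie.
    return max(sorted(counts), key=lambda hour: counts[hour])
```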

Page 37: Transforming instagram data into location intelligence

Active User Analysis

Page 38: Transforming instagram data into location intelligence

Roadmap

• Data collection
• Individual (Group) Analysis
– Find preference and behavior patterns (including trajectory patterns)
• Recommendation
– Recommend the right product (or service) to the right person (or group) at the right time and place

Manually / Automatically

Page 39: Transforming instagram data into location intelligence

MongoDB Mahout or Mortar Recommender

• Trajectories
• Points of Interest
• User profiles
• Image details
• Recommender engine (Mahout or Mortar)
• Algorithms
• MongoDB Connector for Hadoop Version 1.2.0
• Recommended Trajectories
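The recommender on this slide is built on Mahout or Mortar, whose APIs are not shown here. As a stand-in that conveys the idea, the sketch below recommends points of interest by co-occurrence: places often visited by the same users as the places a person has already been. The data layout (user name mapped to a list of place names) and all function names are assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, List, Sequence


def cooccurrence(visits: Dict[str, Sequence[str]]) -> Dict[str, Counter]:
    """For each place, count how many users also visited each other place."""
    co: Dict[str, Counter] = defaultdict(Counter)
    for places in visits.values():
        for a, b in combinations(sorted(set(places)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(user: str, visits: Dict[str, Sequence[str]],
              top_n: int = 3) -> List[str]:
    """Unvisited places ranked by co-occurrence with the user's own places."""
    seen = set(visits.get(user, ()))
    co = cooccurrence(visits)
    scores: Counter = Counter()
    for place in seen:
        for other, count in co[place].items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [place for place, _ in ranked[:top_n]]
```

Time and place could be added by restricting `visits` to a time window or to hot regions near the user before scoring.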

Page 40: Transforming instagram data into location intelligence

Supporting Documentation

• Instagram project documentation – Data Model and Data Collection Procedure (V2.0)

• MongoDB Aggregation and Data Processing Release 2.4.9