RedisConf17 - Redis Powers Next-gen Ambient Intelligence Platform
Uploaded by: redis-labs
Category: Technology
Title
Visualize, Engage and Analyze Audience
https://near.co/products/allspark
2.0
REDIS POWERS NEXT-GEN AMBIENT INTELLIGENCE PLATFORM
REDISCONF 2017
Madhu Therani
www.near.co | CONFIDENTIAL**
Company Overview
www.near.co
Near’s Mission
Near is an “Ambient Intelligence” platform that uses massive data and artificial intelligence to understand consumers in smart environments
Unify streaming and static data from multiple sources to map out “consumer” journeys
Varied Data Sources – Connected via Location
Bringing massive data into a unified platform for the most accurate understanding of consumer behavior
Location Data
SINGLE IDENTIFIER
Our Current Platform
[Platform architecture diagram. Labels shown: Ad Engine, Places API, Near Places DB, ID System, Location Refinement, Batch Analytics, To External Systems, Profile Cache, Count Store, Audience Analytics API, Audience Rules API, Campaign Analytics API, Sources, Profile Store, Product API, People API, ALLSPARK, DaaS APIs, Deduping ID Linkage, Realtime Audience Segmentation, Spatial Counts, Model-based Profile, Realtime Analytics]
Our SaaS Product - AllSpark
Enables definition and curation of, and engagement with, “Audiences”
Our requirements - 24 months ago
Scale from 20K events per sec to 200K events per sec, with and without location
Map to the physical world - Gather/index geo information and associated activity to “physical spaces”
Maintain “user” level aggregated identity - accumulate, summarize, analyze, expire
Evaluate utility of inferences in realtime and batch mode
Where are we now?
Process 150K events per sec globally, 6-10 TB of data every day
Global footprint - Washington DC, Amsterdam, Hong Kong, Singapore, Tokyo
Geo-data from 44 countries
1.5 Billion profiles actively managed
Allspark actively manages nearly 3000 audience segments, 1000+ campaigns per quarter
Role of Redis
Powered incremental development
Scaling of key subsystems
Allowed “exploration” of data structures as platform requirements evolved
Enabled the development of a reliable realtime + batch pipeline - Storm/MR
Key Subsystem - ID System
Powers ID generation for profiles - across sources at different data centers
Deduplication
Opt-out management
ID unification - maintain multiple channel-specific IDs under one system-wide Allspark ID
Metrics - what is the traffic like? Location vs non-location? Source mix?
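The ID unification and opt-out bullets above could be sketched roughly as follows. This is a minimal illustration, not Near's implementation: the key names (`id:next`, `id:map:...`, `id:optout`) and all function names are hypothetical, `client` is assumed to be a redis-py style client created with `decode_responses=True`, and cross-data-center coordination is not addressed.

```python
# Hypothetical key names; the talk does not describe the real schema.
ID_COUNTER = "id:next"
OPT_OUT_SET = "id:optout"


def channel_key(channel: str, channel_id: str) -> str:
    """Key that maps one channel-specific ID to a system-wide ID."""
    return f"id:map:{channel}:{channel_id}"


def resolve_id(client, channel: str, channel_id: str) -> str:
    """Return the system-wide ID for a channel ID, minting one if needed."""
    key = channel_key(channel, channel_id)
    existing = client.get(key)
    if existing is not None:
        return existing
    candidate = str(client.incr(ID_COUNTER))
    # SET with NX succeeds only for the first writer, so two concurrent
    # requests for the same channel ID end up with the same system ID.
    if client.set(key, candidate, nx=True):
        return candidate
    return client.get(key)


def link_id(client, channel: str, channel_id: str, system_id: str) -> bool:
    """Attach another channel-specific ID to an existing system-wide ID."""
    return bool(client.set(channel_key(channel, channel_id), system_id, nx=True))


def opt_out(client, system_id: str) -> None:
    """Record that a profile has opted out."""
    client.sadd(OPT_OUT_SET, system_id)


def is_opted_out(client, system_id: str) -> bool:
    return bool(client.sismember(OPT_OUT_SET, system_id))
```

A losing `SET ... NX` leaves a gap in the counter sequence, which is harmless if IDs only need to be unique.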
Key Subsystem - GridStore
Measuring physical pings at scale
Every point is an HLL (HyperLogLog)
HLLs by hour/day
For all countries
At multiple resolutions
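One way to picture "every point is an HLL, by hour, per country, at multiple resolutions" is a key per grid cell, resolution and hour, fed with `PFADD` and read with `PFCOUNT`. The sketch below is an assumption-laden illustration: the square lat/lon grid, the `grid:...` key layout, the default resolutions and the function names are all invented for this example, and `client` is assumed to be a redis-py style client.

```python
import math
from datetime import datetime, timezone
from typing import Iterable, Tuple


def grid_cell(lat: float, lon: float, resolution_deg: float) -> Tuple[int, int]:
    """Snap a coordinate to a (row, col) cell of a square lat/lon grid."""
    return math.floor(lat / resolution_deg), math.floor(lon / resolution_deg)


def hll_key(country: str, resolution_deg: float, lat: float, lon: float,
            when: datetime) -> str:
    """Hypothetical key layout: country, resolution, cell, UTC hour bucket.

    `when` is expected to be a timezone-aware datetime.
    """
    row, col = grid_cell(lat, lon, resolution_deg)
    hour = when.astimezone(timezone.utc).strftime("%Y%m%d%H")
    return f"grid:{country}:{resolution_deg}:{row}:{col}:{hour}"


def record_ping(client, user_id: str, country: str, lat: float, lon: float,
                when: datetime, resolutions: Iterable[float] = (0.01, 0.1)) -> None:
    """Add the user to one HyperLogLog per resolution for this hour."""
    for res in resolutions:
        client.pfadd(hll_key(country, res, lat, lon, when), user_id)


def unique_visitors(client, keys: Iterable[str]) -> int:
    """PFCOUNT over several keys estimates the cardinality of their union."""
    return client.pfcount(*keys)
```

Daily figures can then be estimated by passing the 24 hourly keys of a cell to `unique_visitors`, since HyperLogLogs merge without double counting repeat visitors.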
Key Subsystem - CountStore
Counts about Audience Groups
Powers various kinds of demographic and behavioral analysis - what interests/behaviors
Spatio-temporal distribution - What other locales were visited in a given time period?
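Counts per audience group map naturally onto Redis hashes, with one hash per segment and dimension and one field per observed value. The following is only a sketch under that assumption: the `count:...` key layout and the function names are hypothetical, and `client` is assumed to be a redis-py style client created with `decode_responses=True`.

```python
def count_key(segment_id: str, dimension: str) -> str:
    """Hypothetical layout: one hash per audience segment and dimension."""
    return f"count:{segment_id}:{dimension}"


def record_attribute(client, segment_id: str, dimension: str, value: str,
                     amount: int = 1) -> int:
    """Increment the counter for one value; HINCRBY creates it on first use."""
    return client.hincrby(count_key(segment_id, dimension), value, amount)


def distribution(client, segment_id: str, dimension: str) -> dict:
    """Return each value's share of the total count for a segment."""
    raw = client.hgetall(count_key(segment_id, dimension))
    counts = {field: int(n) for field, n in raw.items()}
    total = sum(counts.values())
    if total == 0:
        return {}
    return {field: n / total for field, n in counts.items()}
```

The same shape would cover the spatio-temporal question by using a dimension such as a locale and time bucket, for example `distribution(client, "seg42", "locale:201705")`.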
Key Subsystem - RealTime Bidding Engine
Powering Programmatic Advertising
Metrics on a billion+ events in a day
Caches - bid cache, event cache, ad cache
Realtime status for various kinds of optimization
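The bid, event and ad caches listed above are presumably some form of cache-aside lookup with expiry. A minimal sketch of that pattern, assuming JSON-serializable values, a hypothetical `cache:<name>:<id>` key layout, an arbitrary 60-second TTL and a redis-py style `client`:

```python
import json
from typing import Any, Callable


def cached_lookup(client, cache_name: str, item_id: str,
                  loader: Callable[[str], Any], ttl_seconds: int = 60) -> Any:
    """Cache-aside read: try Redis first, fall back to `loader`, then store.

    `loader` stands in for whatever slower system owns the data.
    """
    key = f"cache:{cache_name}:{item_id}"
    hit = client.get(key)
    if hit is not None:
        return json.loads(hit)
    value = loader(item_id)
    # EX sets a time-to-live so stale entries expire on their own.
    client.set(key, json.dumps(value), ex=ttl_seconds)
    return value
```

In a bidding path the TTL would be chosen per cache: short for fast-changing bid state, longer for ad metadata.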
Other Use Cases
Realtime analytics across pipelines
Caches across pipeline - Audience analytics
Powering model-as-a-service - metrics for data science
Shared state across both realtime and batch pipelines
Open source Redis deployment scale
60+ high-end servers
RAM requirements range from 128 GB to 512 GB
DBs are clustered per sub-system
Backup and maintenance with manual processes
Ad hoc Issues
Backups - snapshots versus AOF - how to evaluate
Replication across DCs
Key semantics and management - Key definition embeds a lot of ad hoc meaning
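One common way to tame keys that "embed a lot of ad hoc meaning" is to route every key through a single typed encoder and decoder, so the meaning lives in code rather than in string conventions. The sketch below illustrates that idea only; the three-part layout, the `:` separator and the `StoreKey` name are assumptions, not something described in the talk.

```python
from dataclasses import dataclass

SEP = ":"  # assumed separator; any character banned from the parts would do


@dataclass(frozen=True)
class StoreKey:
    """Hypothetical three-part key: subsystem, entity, time bucket."""
    subsystem: str
    entity: str
    bucket: str

    def encode(self) -> str:
        parts = (self.subsystem, self.entity, self.bucket)
        for part in parts:
            # Reject parts that would make the key ambiguous to parse back.
            if not part or SEP in part:
                raise ValueError(f"invalid key part: {part!r}")
        return SEP.join(parts)

    @classmethod
    def decode(cls, raw: str) -> "StoreKey":
        parts = raw.split(SEP)
        if len(parts) != 3:
            raise ValueError(f"unexpected key shape: {raw!r}")
        return cls(*parts)
```

For example, `StoreKey("grid", "IN", "2017053114").encode()` yields `grid:IN:2017053114`, and `StoreKey.decode` recovers the parts for tooling such as expiry or migration scripts.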
Redis Benefits
Enabled incremental development
Externalization of core data structures - developers of varying ability can develop scalable systems
Developing systems in multiple languages
Sweet spot between relational DBs (MySQL) and NoSQL (Couch/Mongo) - build using Redis, then decide
What next
Evaluating Enterprise Redis
Reorganizing some of the other datastores - Kyoto Tycoon, Elasticsearch, MongoDB, Cassandra, HBase
ML on the edge - Realtime analytics on samples - alert generation
Thank You
Acknowledgements
Near’s Tech Team in Bangalore & SFO