Implementing and Visualizing Clickstream data with MongoDB
![Page 1: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/1.jpg)
Implementing and Visualizing Click-Stream Data with MongoDB
Jan 22, 2013 - New York MongoDB User Group
Cameron Sim - LearnVest.com
![Page 2: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/2.jpg)
Agenda • About LearnVest
• HL Application Architecture
• Data Capture
• Event Packaging
• MongoDB Data Warehousing
• Loading & Visualization
• Finishing up
• Next Steps
![Page 3: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/3.jpg)
LearnVest Inc. (www.learnvest.com)
Company: Founded in 2008 by Alexa Von Tobel, CEO
50+ people and growing rapidly
Based in NYC
Platforms: Web & iPhone
Mission Statement: Aiming to make Financial Planning as accessible as having a gym membership
Key Products:
• Account Aggregation and Management (Bank, Credit, Loan, Investment, Mortgage)
• Original and Syndicated Newsletter Content
• Financial Planning (tiered product offering)
Stack
Operational: Wordpress, Backbone.js, Node.js, Java Spring 3, Redis, Memcached, MongoDB, ActiveMQ, Nginx, MySQL 5.x
Analytics: MongoDB 2.2.0 (3-node replica-set), Java 6, Spring 3, pyMongo, Django 1.4
![Page 4: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/4.jpg)
LearnVest.com Web
![Page 5: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/5.jpg)
LearnVest.com iPhone
![Page 6: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/6.jpg)
High Level Architecture
[Architecture diagram. Stages: Event Collection, Event Packaging, MongoDB Data Warehousing, Loading & Visualization. Groups: Production Platform, Delivery Services, Analytics Services, Loaders & Dashboards. Connection labels: HTTPS, pyMongo, MongoDB Java Conn, MongoDB Replication, JDBC.]
![Page 7: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/7.jpg)
Philosophy For Data Collection
Capture Everything
• User-driven events over web and mobile
• System-level exceptions
• Everything else
Temporary Data
• Be 'ok' with approximate data
• Operational databases are the system of record
Aggregate events as they come in
• Remove the overhead of basic metrics (counts, sums) on core events
• Group by user unique id and increment counts per event, over time dimensions (day, week-ending, month, year)
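The "aggregate events as they come in" idea can be sketched in a few lines of Python. This is an in-memory illustration only, not the production code from the talk: the function names, the label formats and the choice of Sunday as the week-ending day are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta


def time_buckets(day: date) -> dict:
    """Return one label per time dimension for the given calendar day."""
    # Assumption: a week ends on Sunday; the slides do not say which day is used.
    week_ending = day + timedelta(days=6 - day.weekday())
    return {
        "day": day.isoformat(),
        "week": week_ending.isoformat(),
        "month": day.strftime("%Y-%m"),
        "year": day.strftime("%Y"),
    }


def record_event(counters: Counter, user_id: str, event_code: str, day: date) -> None:
    """Increment the per-user, per-event counter in every time dimension."""
    for dimension, label in time_buckets(day).items():
        counters[(dimension, user_id, event_code, label)] += 1


counters = Counter()
record_event(counters, "u1", "user.login", date(2013, 1, 2))
record_event(counters, "u1", "user.login", date(2013, 1, 3))
print(counters[("week", "u1", "user.login", "2013-01-06")])  # 2
```

Because every event touches one counter per dimension, a dashboard can read a daily or monthly total with a single lookup instead of scanning raw events.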
![Page 8: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/8.jpg)
Data Capture
iOS (Objective-C)

    - (void)sendAnalyticEventType:(NSString *)eventType
                           object:(NSString *)object
                             name:(NSString *)name
                             page:(NSString *)page
                           source:(NSString *)source
    {
        // params was not declared on the slide; assumed here to be a local dictionary
        NSMutableDictionary *params = [NSMutableDictionary dictionary];
        NSMutableDictionary *eventData = [NSMutableDictionary dictionary];

        // Only set keys that have a value; setObject:forKey: throws on nil
        if (eventType != nil) [params setObject:eventType forKey:@"eventType"];
        if (object != nil) [eventData setObject:object forKey:@"object"];
        if (name != nil) [eventData setObject:name forKey:@"name"];
        if (page != nil) [eventData setObject:page forKey:@"page"];
        if (source != nil) [eventData setObject:source forKey:@"source"];
        [params setObject:eventData forKey:@"eventData"];

        [[LVNetworkEngine sharedManager] analytics_send:params];
    }
![Page 9: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/9.jpg)
Data Capture
Web (JavaScript)

    function internalTrackPageView() {
        var cookie = {
            userContext: jQuery.cookie('UserContextCookie')
        };

        var trackEvent = {
            eventType: "pageView",
            eventData: {
                page: window.location.pathname + window.location.search
            }
        };

        // AJAX
        jQuery.ajax({
            url: "/api/track",
            type: "POST",
            dataType: "json",
            data: JSON.stringify(trackEvent),
            // Set Request Headers
            beforeSend: function (xhr, settings) {
                xhr.setRequestHeader('Accept', 'application/json');
                xhr.setRequestHeader('User-Context', cookie.userContext);
                if (settings.type === 'PUT' || settings.type === 'POST') {
                    xhr.setRequestHeader('Content-Type', 'application/json');
                }
            }
        });
    }
![Page 10: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/10.jpg)
Bus Event Packaging
1. Spring 3 RESTful service layer: controller methods define the eventCode via the @Tracking annotation
2. A custom Interceptor class extends HandlerInterceptorAdapter and implements postHandle() (for each event) to invoke calls via Spring @Async to an EventPublisher
3. The EventPublisher publishes to a common event bus queue with multiple subscribers, one of which packages the eventPayload Map<String, Object> object and forwards it to the Analytics REST Service
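The publish/subscribe flow described above can be illustrated without Spring or ActiveMQ. The sketch below is a hypothetical in-process stand-in for the shared queue: `EventBus` and the three subscribers are invented names, and a real broker would deliver messages asynchronously rather than in a loop.

```python
from typing import Callable, List

Subscriber = Callable[[str, dict], None]


class EventBus:
    """In-process stand-in for the common event bus queue (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: List[Subscriber] = []
        self.failures: List[str] = []

    def subscribe(self, subscriber: Subscriber) -> None:
        self._subscribers.append(subscriber)

    def publish(self, event_code: str, event_data: dict) -> None:
        # Every subscriber sees every event. A failing subscriber is recorded
        # but must not stop the others or the request that raised the event.
        for subscriber in self._subscribers:
            try:
                subscriber(event_code, event_data)
            except Exception as exc:
                self.failures.append(f"{event_code}: {exc}")


audit_log: List[str] = []
analytics_outbox: List[dict] = []  # stands in for the POST to the analytics service


def audit_subscriber(event_code: str, event_data: dict) -> None:
    audit_log.append(event_code)


def analytics_subscriber(event_code: str, event_data: dict) -> None:
    analytics_outbox.append({"eventCode": event_code, "eventData": dict(event_data)})


def broken_subscriber(event_code: str, event_data: dict) -> None:
    raise RuntimeError("downstream unavailable")


bus = EventBus()
bus.subscribe(broken_subscriber)
bus.subscribe(audit_subscriber)
bus.subscribe(analytics_subscriber)
bus.publish("user.login", {"eventType": "login"})
```

The point of the design is isolation: tracking is a side effect of the request, so an analytics outage should cost a log line, not a failed login.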
![Page 11: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/11.jpg)
Bus Event Packaging
1) Spring RestController Methods

Interface

    @RequestMapping(value = "/user/login", method = RequestMethod.POST,
                    headers = "Accept=application/json")
    public Map<String, Object> userLogin(@RequestBody Map<String, Object> event,
                                         HttpServletRequest request);

Concrete/Impl Class

    @Override
    @Tracking("user.login")
    public Map<String, Object> userLogin(@RequestBody Map<String, Object> event,
                                         HttpServletRequest request) {
        // Implementation
        return event;
    }
![Page 12: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/12.jpg)
Bus Event Packaging
2) Custom Interceptor class extends HandlerInterceptorAdapter

    protected void handleTracking(String trackingCode, Map<String, Object> modelMap,
                                  HttpServletRequest request) {
        Map<String, Object> responseModel = new HashMap<String, Object>();
        // remove non-serializables & copy over data from modelMap
        try {
            this.eventPublisher.publish(trackingCode, responseModel, request);
        } catch (Exception e) {
            // A tracking failure is logged and never propagated to the caller
            log.error("Error tracking event '" + trackingCode + "' : "
                    + ExceptionUtils.getStackTrace(e));
        }
    }
![Page 13: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/13.jpg)
Bus Event Packaging
3) EventPublisher (invoked by the interceptor)

    public void publish(String eventCode, Map<String, Object> eventData,
                        HttpServletRequest request) {
        Map<String, Object> payload = new HashMap<String, Object>();
        String eventId = UUID.randomUUID().toString();
        Map<String, String> requestMap = HttpRequestUtils.getRequestHeaders(request);

        // Normalize message.
        // The slide read eventData.get("eventType") for all three keys,
        // which looks like a copy-paste slip; each key now reads its own value.
        payload.put("eventType", eventData.get("eventType"));
        payload.put("eventData", eventData.get("eventData"));
        payload.put("version", eventData.get("version"));
        payload.put("eventId", eventId);
        payload.put("eventTime", new Date());
        payload.put("request", requestMap);
        . . .
        // Send to the Analytics Service for MongoDB persistence
    }

    public void sendPost(EventPayload payload) {
        HttpEntity request = new HttpEntity(payload.getEventPayload(), headers);
        Map m = restTemplate.postForObject(endpoint, request, java.util.Map.class);
    }
![Page 14: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/14.jpg)
Bus Event Packaging
The Serialized JSON (User Action)

    {
      "eventCode" : "user.login",
      "eventType" : "login",
      "version" : "1.0",
      "eventTime" : "1358603157746",
      "eventData" : { "" : "", "" : "", "" : "" },
      "request" : {
        "call-source" : "WEB",
        "user-context" : "00002b4f1150249206ac2b692e48ddb3",
        "user.agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
        "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
        "content-length" : "204",
        "accept-encoding" : "gzip,deflate,sdch"
      }
    }
![Page 15: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/15.jpg)
Bus Event Packaging
The Serialized JSON (Generic Event)

    {
      "eventCode" : "generic.ui",
      "eventType" : "pageView",
      "version" : "1.0",
      "eventTime" : "1358603157746",
      "eventData" : {
        "page" : "/learnvest/moneycenter/inbox",
        "section" : "transactions",
        "name" : "view transactions",
        "object" : "page"
      },
      "request" : {
        "call-source" : "WEB",
        "user-context" : "00002b4f1150249206ac2b692e48ddb3",
        "user.agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
        "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
        "content-length" : "204",
        "accept-encoding" : "gzip,deflate,sdch"
      }
    }
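For readers who want to experiment with the envelope format, here is a rough Python sketch that produces a document with the same top-level fields. `build_envelope` is a hypothetical helper, not part of the LearnVest code base, and lower-casing the header names is an assumption made so that lookups such as "user-context" behave predictably.

```python
import json
import time
import uuid


def build_envelope(event_code, event_type, event_data, headers, version="1.0"):
    """Wrap one event in the envelope fields shown in the serialized examples."""
    return {
        "eventCode": event_code,
        "eventType": event_type,
        "version": version,
        "eventId": str(uuid.uuid4()),
        # Epoch milliseconds as a string, like "1358603157746" in the examples
        "eventTime": str(int(time.time() * 1000)),
        "eventData": dict(event_data),
        # Assumption: header names are lower-cased before storage
        "request": {name.lower(): value for name, value in headers.items()},
    }


envelope = build_envelope(
    "generic.ui",
    "pageView",
    {"page": "/learnvest/moneycenter/inbox", "object": "page"},
    {"Call-Source": "WEB", "User-Context": "00002b4f1150249206ac2b692e48ddb3"},
)
print(json.dumps(envelope, indent=2))
```

Keeping the client-supplied `eventData` separate from the server-derived `request` block means new event types need no schema change on the analytics side.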
![Page 16: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/16.jpg)
MongoDB Data Warehousing
MongoDB Information
• v2.2.0
• 3-node replica-set
• 1 Large (primary), 2x Medium (secondary) AWS Amazon-Linux machines
• Each with a single 500GB EBS volume mounted to /opt/data
MongoDB Config File

    dbpath = /opt/data/mongodb/data
    rest = true
    replSet = voyager

Volumes
• ~1M events daily on web, ~600K on mobile
• 2-3 GB per day at start, slowed to ~1GB per day
• Currently at 78GB (collecting since August 2012)
Future Scaling Strategy
• Set up 2nd Replica-Set
• Shard replica-sets to n at 60% / 250GB per EBS volume
• Shard key probably based on sequential mix of email_address & additional string
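As a back-of-the-envelope check on the scaling trigger, the quoted figures (78GB stored, roughly 1GB per day, a 250GB threshold per volume) give the remaining headroom in days. The helper below is illustrative and assumes constant linear growth, which the slide itself suggests is only approximate.

```python
import math


def days_until(threshold_gb: float, current_gb: float, daily_growth_gb: float) -> int:
    """Whole days until the data set reaches threshold_gb at constant daily growth."""
    if daily_growth_gb <= 0:
        raise ValueError("daily_growth_gb must be positive")
    remaining = max(0.0, threshold_gb - current_gb)
    return math.ceil(remaining / daily_growth_gb)


print(days_until(250, 78, 1.0))  # 172 days at ~1GB per day
print(days_until(250, 78, 3.0))  # 58 days if growth returned to ~3GB per day
```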
![Page 17: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/17.jpg)
MongoDB Data Warehousing
Approach
1. Persist all events, bucketed by source:
   WEB
   MOBILE
2. Persist all events, bucketed by source, event code and time:
   WEB/MOBILE
   user.login
   time (day, week-ending, month, year)
3. Insert into collection e_web / e_mobile
4. Upsert into:
   e_web_user_login_day
   e_web_user_login_week
   e_web_user_login_month
   e_web_user_login_year
5. Predictable model for scaling and measuring business growth
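The naming scheme in steps 3 and 4 can be expressed as a small helper. This is a sketch under the assumption, inferred from names such as e_web_user_login_day, that dots in the event code become underscores; the function names are hypothetical.

```python
def raw_collection(source: str) -> str:
    """Collection holding every raw event for a source, e.g. 'e_web'."""
    return "e_" + source.lower()


def rollup_collections(source: str, event_code: str) -> dict:
    """Map each time dimension to its per-event counter collection."""
    base = raw_collection(source) + "_" + event_code.replace(".", "_")
    return {dim: f"{base}_{dim}" for dim in ("day", "week", "month", "year")}


def upsert_spec(user_context: str, event_type: str, date_label: str):
    """Build the (filter, update) pair for one counter increment."""
    filter_doc = {
        "user-context": user_context,
        "eventType": event_type,
        "date": date_label,
    }
    update_doc = {"$inc": {"count": 1}}
    return filter_doc, update_doc


print(raw_collection("WEB"))
print(rollup_collections("WEB", "user.login")["day"])
```

With PyMongo, each pair would typically be applied as `db[name].update_one(filter_doc, update_doc, upsert=True)`, so the first event for a user and date creates the counter document and later events increment it.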
![Page 18: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/18.jpg)
MongoDB Data Warehousing
2. Persist all events, bucketed by source, event code and time:

    // instantiate collections dynamically
    DBCollection collection_day = mongodb.getCollection(eventCode + "_day");
    DBCollection collection_week = mongodb.getCollection(eventCode + "_week");
    DBCollection collection_month = mongodb.getCollection(eventCode + "_month");
    DBCollection collection_year = mongodb.getCollection(eventCode + "_year");

    // $inc creates "count" at 1 on first upsert and increments it afterwards
    BasicDBObject newDocument = new BasicDBObject()
            .append("$inc", new BasicDBObject().append("count", 1));

    // update day dimension (upsert = true, multi = false)
    collection_day.update(new BasicDBObject().append("user-context", userContext)
            .append("eventType", eventType)
            .append("date", sdf_day.format(d)), newDocument, true, false);

    // update week dimension (w is the week-ending date)
    collection_week.update(new BasicDBObject().append("user-context", userContext)
            .append("eventType", eventType)
            .append("date", sdf_day.format(w)), newDocument, true, false);

    // update month dimension
    collection_month.update(new BasicDBObject().append("user-context", userContext)
            .append("eventType", eventType)
            .append("date", sdf_month.format(d)), newDocument, true, false);

    // update year dimension
    collection_year.update(new BasicDBObject().append("user-context", userContext)
            .append("eventType", eventType)
            .append("date", sdf_year.format(d)), newDocument, true, false);
![Page 19: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/19.jpg)
MongoDB Data Warehousing
Persist all events, bucketed by source, event code and time:

    > show collections
    e_mobile
    e_web
    e_web_account_addManual_day
    e_web_account_addManual_month
    e_web_account_addManual_week
    e_web_account_addManual_year
    e_web_user_login_day
    e_web_user_login_week
    e_web_user_login_month
    e_web_user_login_year
    e_mobile_generic_ui_day
    e_mobile_generic_ui_month
    e_mobile_generic_ui_week
    e_mobile_generic_ui_year

    > db.e_web_user_login_day.find()
    { "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 5, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
    { "_id" : ObjectId("50cd6cfcb9a80a2b4ee21422"), "count" : 7, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
    { "_id" : ObjectId("50cd6e51b9a80a2b4ee21427"), "count" : 2, "date" : "01/02", "user-context" : "c4ca4238a0b923820dcc509a6f75849b" }
    { "_id" : ObjectId("50e4b9871b36921910222c42"), "count" : 3, "date" : "01/03", "user-context" : "50e49a561b36921910222c33" }
![Page 20: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/20.jpg)
MongoDB Data Warehousing
Persist all events

    > db.e_web.findOne()
    {
      "_id" : ObjectId("50e4a1ab0364f55ed07c2662"),
      "created_datetime" : ISODate("2013-01-02T21:07:55.656Z"),
      "created_date" : ISODate("2013-01-02T00:00:00.000Z"),
      "request" : {
        "content-type" : "application/json",
        "connection" : "keep-alive",
        "accept-language" : "en-US,en;q=0.8",
        "host" : "localhost:8080",
        "call-source" : "WEB",
        "accept" : "*/*",
        "user-context" : "c4ca4238a0b923820dcc509a6f75849b",
        "origin" : "chrome-extension://fdmmgilgnpjigdojojpjoooidkmcomcm",
        "user-agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.101 Safari/537.11",
        "accept-charset" : "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
        "cookie" : "size=4; CP.mode=B; PHPSESSID=c087908516ee2fae50cef6500101dc89; resolution=1920; JSESSIONID=56EB165266A2C4AFF946F139669D746F; csrftoken=73bdcdddf151dc56b8020855b2cb10c8",
        "content-length" : "255",
        "accept-encoding" : "gzip,deflate,sdch"
      },
      "eventType" : "flick",
      "eventData" : {
        "object" : "button",
        "name" : "split transaction button",
        "page" : "#inbox/79876/",
        "section" : "transaction_river_details"
      }
    }
![Page 21: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/21.jpg)
MongoDB Data Warehousing
Indexing Strategy
• Indexes on the core collections (e_web and e_mobile) come in under 3GB, against 7.5GB of RAM on the Large instance and 3.75GB on the Medium instances
• Split datetime into two fields and build compound indexes on date with other fields such as eventType and user unique id (user-context)
• Heavy insertion rates, much lower read rates, so the fewer indexes the better
![Page 22: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/22.jpg)
MongoDB Data Warehousing
Indexing Strategy

    > db.e_web.getIndexes()
    [
      {
        "v" : 1,
        "key" : { "request.user-context" : 1, "created_date" : 1 },
        "ns" : "moneycenter.e_web",
        "name" : "request.user-context_1_created_date_1"
      },
      {
        "v" : 1,
        "key" : { "eventData.name" : 1, "created_date" : 1 },
        "ns" : "moneycenter.e_web",
        "name" : "eventData.name_1_created_date_1"
      }
    ]
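A short sketch of the "two date fields" idea follows. `split_datetime` is a hypothetical helper that derives the midnight-truncated `created_date` alongside the full `created_datetime`, and `E_WEB_INDEXES` restates the two compound indexes listed above as PyMongo-style key lists.

```python
from datetime import datetime


def split_datetime(ts: datetime) -> dict:
    """Return the full timestamp plus a midnight-truncated date for indexing."""
    return {
        "created_datetime": ts,
        "created_date": ts.replace(hour=0, minute=0, second=0, microsecond=0),
    }


# Compound index keys: an equality field first, then the coarse date field
E_WEB_INDEXES = [
    [("request.user-context", 1), ("created_date", 1)],
    [("eventData.name", 1), ("created_date", 1)],
]

fields = split_datetime(datetime(2013, 1, 2, 21, 7, 55, 656000))
print(fields["created_date"].isoformat())  # 2013-01-02T00:00:00
```

With PyMongo, each key list could be passed to `db.e_web.create_index(keys)`. Indexing the truncated date keeps the number of distinct index values low, which suits a write-heavy collection.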
![Page 23: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/23.jpg)
Loading & Visualization
Objective
• Show historic and intraday stats on core use cases (logins, conversions)
• Show user funnel rates on conversion pages
• Show general usability: how do users really use the Web and iOS platforms?
Non-Functionals
• Intraday doesn't need to be "real-time"; polling is good enough for now
• Overnight batch job for historic data must scale horizontally
General Implementation Strategy
• Do all heavy lifting & object manipulation in the service; the UI should just display a graph or table
• Modularize the service to be able to regenerate any graphs/tables without a full load
![Page 24: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/24.jpg)
Loading & Visualization
Java Batch Service
Java Mongo library to query key collections and return user counts and sum of events

    DBCursor webUserLogins = c.find(
            new BasicDBObject("date", sdf.format(new Date())));

    private HashMap<String, Object> getSumAndCount(DBCursor cursor) {
        HashMap<String, Object> m = new HashMap<String, Object>();
        int sum = 0;    // total events across all users
        int count = 0;  // number of user documents
        DBObject obj;
        while (cursor.hasNext()) {
            obj = (DBObject) cursor.next();
            count++;
            sum = sum + (Integer) obj.get("count");
        }
        m.put("sum", sum);
        m.put("count", count);
        // The slide formatted the average with the date formatter sdf;
        // df is assumed here to be a java.text.DecimalFormat.
        // Guard against division by zero on an empty cursor.
        m.put("average", df.format(count == 0 ? 0f : (float) sum / count));
        return m;
    }
![Page 25: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/25.jpg)
Loading & Visualization
Java Batch Service
Use Aggregation Framework where required on core collections (e_web) and external data

    // create aggregation objects
    DBObject project = new BasicDBObject("$project",
            new BasicDBObject("day_value", fields));
    DBObject day_value = new BasicDBObject("day_value", "$day_value");
    DBObject groupFields = new BasicDBObject("_id", day_value);

    // add the accumulator: "number" counts the documents in each group
    groupFields.put("number", new BasicDBObject("$sum", 1));

    // create the group
    DBObject group = new BasicDBObject("$group", groupFields);

    // execute
    AggregationOutput output = mycollection.aggregate(project, group);
    for (DBObject obj : output.results()) {
        . .
    }
![Page 26: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/26.jpg)
Loading & Visualization
Java Batch Service
MongoDB Command Line example on aggregation over a time period, e.g. month

    > db.e_web.aggregate([
        { $match : { created_date : { $gt : ISODate("2012-10-25T00:00:00") } } },
        { $project : { day_value : { "day" : { $dayOfMonth : "$created_date" },
                                     "month" : { $month : "$created_date" } } } },
        { $group : { _id : { day_value : "$day_value" }, number : { $sum : 1 } } },
        // after $group the key lives under _id, so sort on "_id.day_value"
        // (the slide sorted on day_value, which no longer exists at this stage)
        { $sort : { "_id.day_value" : -1 } }
      ])
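The same pipeline can be written as plain Python data, which is how PyMongo expects it. The sketch below also includes a pure-Python counterpart, `daily_counts`, to show what the `$match`, `$project` and `$group` stages compute. Both function names are hypothetical, and the Python version orders results by (month, day), which is a choice made here rather than an emulation of MongoDB's sort.

```python
from collections import Counter
from datetime import datetime


def daily_counts_pipeline(since: datetime) -> list:
    """Aggregation stages counting events per (day, month) after `since`."""
    return [
        {"$match": {"created_date": {"$gt": since}}},
        {"$project": {"day_value": {
            "day": {"$dayOfMonth": "$created_date"},
            "month": {"$month": "$created_date"},
        }}},
        {"$group": {"_id": {"day_value": "$day_value"}, "number": {"$sum": 1}}},
        # After $group the grouping key is nested under _id
        {"$sort": {"_id.day_value": -1}},
    ]


def daily_counts(events: list, since: datetime) -> list:
    """Pure-Python counterpart: count events per (month, day), newest first."""
    counts = Counter(
        (e["created_date"].month, e["created_date"].day)
        for e in events
        if e["created_date"] > since
    )
    return sorted(counts.items(), reverse=True)


events = [
    {"created_date": datetime(2012, 10, 20)},
    {"created_date": datetime(2012, 10, 26)},
    {"created_date": datetime(2012, 10, 26)},
    {"created_date": datetime(2012, 10, 27)},
]
print(daily_counts(events, datetime(2012, 10, 25)))
```

With a live connection, the stage list would be passed as `db.e_web.aggregate(daily_counts_pipeline(since))`.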
![Page 27: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/27.jpg)
Loading & Visualization
Java Batch Service
Persisting events into graph and table collections

    > db.homeGraphs.find()
    { "_id" : ObjectId("50f57b5c1d4e714b581674e2"), "accounts_natural" : 54, "accounts_total" : 54, "date" : ISODate("2011-02-06T05:00:00Z"), "linked_rate" : "12.96", "premium_rate" : "0", "str_date" : "2011,01,06", "upgrade_rate" : "0", "users_avg_linked" : "3.43", "users_linked" : 7 }
    { "_id" : ObjectId("50f57b5c1d4e714b581674e3"), "accounts_natural" : 144, "accounts_total" : 144, "date" : ISODate("2011-02-07T05:00:00Z"), "linked_rate" : "11.11", "premium_rate" : "0", "str_date" : "2011,01,07", "upgrade_rate" : "0", "users_avg_linked" : "4", "users_linked" : 16 }
    { "_id" : ObjectId("50f57b5c1d4e714b581674e4"), "accounts_natural" : 119, "accounts_total" : 119, "date" : ISODate("2011-02-08T05:00:00Z"), "linked_rate" : "15.13", "premium_rate" : "0", "str_date" : "2011,01,08", "upgrade_rate" : "0", "users_avg_linked" : "4.5", "users_linked" : 18 }
![Page 28: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/28.jpg)
Loading & Visualization
Django and HighCharts
Extract data (pyMongo)

    def getHomeChart(dt_from, dt_to):
        """Called by home method to get latest 30 day numbers"""
        try:
            # MongoClient replaces the deprecated pymongo.Connection
            conn = pymongo.MongoClient('localhost', 27017)
            db = conn['lvanalytics']
            cursor = db.accountmetrics.find(
                {"date": {"$gte": dt_from, "$lte": dt_to}}).sort("date")
            return buildMetricsDict(cursor)
        except Exception:
            # Log the full traceback; the function then returns None
            logger.exception("getHomeChart failed")

Return the graph object (as a list or a dict of lists) to the view that called the method

    pagedata = {}
    # dt_from and dt_to: the date window computed earlier in the view
    # (the slide called getHomeChart() without its two required arguments)
    pagedata['accountsGraph'] = mongodb_home.getHomeChart(dt_from, dt_to)
    return render_to_response('home.html', {'pagedata': pagedata},
                              context_instance=RequestContext(request))
![Page 29: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/29.jpg)
Loading & Visualization
Django and HighCharts
Populate the series (JavaScript with Django templating)

    seriesOptions[0] = {
        id: 'naturalAccounts',
        name: "Natural Accounts",
        data: [
            {% for a in pagedata.metrics.accounts_natural %}
                {% if not forloop.first %}, {% endif %}
                [Date.UTC({{a.0}}),{{a.1}}]
            {% endfor %}
        ],
        tooltip: { valueDecimals: 2 }
    };
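The template expects each data point as a pair whose first element is a ready-made argument string for JavaScript's `Date.UTC`, which uses zero-based months. That matches the stored `str_date` values, where 6 February 2011 appears as "2011,01,06". A hedged sketch of building such pairs is below; `date_utc_args` and `to_series` are hypothetical helpers, and the talk's own `buildMetricsDict` is not shown in the slides.

```python
from datetime import datetime


def date_utc_args(day: datetime) -> str:
    """Format a date as arguments for JavaScript's Date.UTC (zero-based month)."""
    return f"{day.year},{day.month - 1:02d},{day.day:02d}"


def to_series(docs: list, field: str) -> list:
    """Build (Date.UTC argument string, value) pairs sorted by date."""
    ordered = sorted(docs, key=lambda d: d["date"])
    return [(date_utc_args(d["date"]), d[field]) for d in ordered]


docs = [
    {"date": datetime(2011, 2, 7, 5), "accounts_natural": 144},
    {"date": datetime(2011, 2, 6, 5), "accounts_natural": 54},
]
print(to_series(docs, "accounts_natural"))
# [('2011,01,06', 54), ('2011,01,07', 144)]
```

In the template, `a.0` and `a.1` then pick out the two elements of each pair.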
![Page 30: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/30.jpg)
Loading & Visualization Django and HighCharts And Create the Charts and Tables...
![Page 31: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/31.jpg)
Loading & Visualization Django and HighCharts And Create the Charts and Tables...
![Page 32: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/32.jpg)
Lessons Learned • Date Time managed as two fields, Datetime and Date
• Aggregating and upserting documents as events are received works for us
• Real-time Map-Reduce in pyMongo - too slow, don't do this.
• Django-noRel - Unstable, use Django and configure MongoDB as a datastore only
• Memcached on Django is good enough (at the moment) - use django-celery with rabbitmq to pre-cache all data after data loading
• HighCharts is buggy - considering D3 & other libraries
• Don’t need to retrieve data directly from MongoDB to Django, perhaps provide all data via a service layer (at the expense of ever-additional features in pyMongo)
![Page 33: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/33.jpg)
Next Steps • A/B testing framework, experiments and variances
• Unauthenticated / Authenticated user tracking
• Provide data async over service layer
• Segmentation with graphical libraries like D3 & Cross-Filter (http://square.github.com/crossfilter/)
• Saving Query Criteria, expanding out BI tools for internal users
• MongoDB Connector, Hadoop and Hive (maybe Tableau and other tools)
• Storm / Kafka for real-time analytics processing
• Shard the Replica-Set, looking into Gizzard as the middleware
![Page 34: Implementing and Visualizing Clickstream data with MongoDB](https://reader033.fdocuments.us/reader033/viewer/2022061218/54b72b824a79599d2a8b462c/html5/thumbnails/34.jpg)
Thanks & Questions
Hrishi Dixit, Chief Technology Officer
Kevin Connelly, Director of Engineering [email protected]
Cameron Sim, Director of Analytics Tech [email protected]
Jeremy Brennan, Director of UI/UX Technology [email protected]
Will Larche, Lead iOS Developer [email protected]
<your name here>, New Awesome Developer [email protected]
HIRED!