Distributed Stream Processing on Fluentd / #fluentd


Description: at Fluentd meetup in Japan, 2012/02/04

Transcript of Distributed Stream Processing on Fluentd / #fluentd

  • 1. Distributed Stream Processing on Fluentd / #fluentd (2012/02/04)
  • 2. (no transcribed text)
  • 3. Working at NHN Japan. We are hiring!
  • 4. What we are doing about logs with fluentd: data mining and reporting (page views, unique users, traffic amount per page, ...)
  • 5. What we are doing about logs with fluentd: super-large-scale "sed | grep | wc"-like processes
  • 6. Why fluentd? (not Storm, Kafka or Flume?) Ruby, Ruby, Ruby! (NOT Java!): we work in a lightweight-language culture, so it is easy to try and easy to patch; plugin model architecture; built-in TimeSlicedOutput mechanism
  • 7. What I'll talk about today: what we are trying with fluentd; how we did it, and how we are doing it now; what distributed stream-processing topologies look like; what is important about stream processing; implementation details (appendix)
  • 8. Architecture in last week's presentation. [diagram: web servers send logs to deliver servers (scribed), which send them both to archive servers (scribed, large-volume RAID) and, as a stream, to the Fluentd cluster; past logs are imported and converted on demand, as a batch; the Fluentd cluster converts logs into structured data and writes them to HDFS as a stream; Hadoop/Hive clusters answer aggregation queries from the Shib web client]
  • 9. Now. [the same diagram, except that the deliver servers run Fluentd instead of scribed, and a Fluentd Watcher has been added]
  • 10. Fluentd in production service: 10 days
  • 11. Scale of Fluentd processes: 146 log streams from 127 web servers
  • 12. Scale of Fluentd processes: 70,000 messages/sec, 120 Mbps (at peak time)
  • 13. Scale of Fluentd processes: 650 GB/day (non-blog: 100 GB)
  • 14. Scale of Fluentd processes: 89 fluentd instances on 12 nodes (4-core HT)
  • 15. We can't go back. (photo: crouton by kbysmnr)
  • 16. What we are trying with fluentd: log conversion, from raw logs (Apache combined-like format) to structured, query-friendly logs (TAB-separated, some fields masked, many flags added)
  • 17. What we are trying with fluentd: log conversion.
    from: 99.999.999.99 - - [03/Feb/2012:10:59:48 +0900] "GET /article/detail/6246245/ HTTP/1.1" 200 17509 "http://news.livedoor.com/topics/detail/6246245/" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.1; .NET4.0C)" "news.livedoor.com" "xxxxxxx.xx.xxxxxxx.xxx" "-" 163266152930
    to: news.livedoor.com /topics/detail/6242972/ GET 302 210 226 - 99.999.999.99 TQmljv9QtXkpNtCSuWVGGg Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A406 Safari/7534.48.3 TRUE TRUE FALSE FALSE FALSE FALSE FALSE
    fields: hhmmdd vhost path method status bytes duration referer rhost userlabel agent FLAG [FLAGS]. FLAGS: status_redirection, status_errors, rhost_internal, suffix_miscfile, suffix_imagefile, agent_bot. FLAG: logical OR of the FLAGS. userlabel: hash of (tracking cookie / terminal id (mobile phone) / rhost+agent)
  • 18. TimeSlicedOutput in fluentd: traditional log rotation is important, but troublesome. We want a log from 2/3 23:59:59 to land in access.0203_23.log, and a log from 2/4 00:00:00 to land in access.0204_00.log. (A config sketch follows slide 19.)
  • 19. How we did it, and how we are doing it now: collect → archive → convert → aggregate → show
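(As an aside, not from the deck: the behavior slide 18 asks for is what fluentd's built-in TimeSlicedOutput mechanism provides, sketched here with the built-in out_file plugin. The match tag and path are hypothetical. Records are bucketed by their own timestamp, so a 2/3 23:59:59 record lands in the 0203_23 slice even if it arrives after midnight.)

    <match converted.blog>
      type file
      path /var/log/converted/access   # slice files come out as access.0203_23*.log
      time_slice_format %m%d_%H        # one slice per hour, as slide 18 wants
      time_slice_wait 10m              # grace period for records that arrive late
    </match>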
  • 20. How we did it in the past (2011): collect (scribed) → stream → archive (scribed) → store to HDFS; convert (Hadoop Streaming) by hourly invocation, with a 20-25 minute running time; aggregate (Hive) on demand; show on demand. HIGH LATENCY: hourly/daily (time to flush + hourly invocation + conversion running time).
  • 21. How we are doing it now: collect (Fluentd) → stream → archive (scribed); → stream → convert (Fluentd) → stream → store to HDFS (over Cloudera's Hoop); aggregate (Hive) on demand; show on demand. VERY LOW LATENCY: 2-3 minutes (only the time to wait for the flush).
  • 22. break. (photo: crouton by kbysmnr)
  • 23. What is important about stream processing: reasonable efficiency (compared with batch throughput); ease of re-running the same conversion as a batch; no SPOF; ease of adding/removing nodes
  • 24. Stream processing and batch: how do we re-run a conversion as a batch when we hit trouble? We want to use just one converter program for both stream processes and batch processes!
  • 25. out_exec_filter (fluentd built-in plugin): 1. forks and execs a command program; 2. writes data to the child process's stdin as TAB-separated fields specified by in_keys (for the tag, remove_prefix is available); 3. reads data from the child process's stdout as TAB-separated fields named by out_keys (for the tag, add_prefix is available); 4. sets the message's timestamp from the time_key value in the parsed data, in the format specified by time_format
  • 26. out_exec_filter and Hadoop Streaming: both read from stdin and write to stdout, with TAB-separated values as input/output. WOW!!!!!!! Difference: a tag field may be needed with out_exec_filter. Simple solution: if it does not exist, ignore it. (A converter sketch follows slide 33.)
  • 27. What is important about stream processing: reasonable efficiency (compared with batch throughput); ease of re-running the same conversion as a batch; no SPOF; ease of adding/removing nodes
  • 28. What are distributed stream-processing topologies like? [diagram: web servers → deliver servers → worker servers → serializer servers → HDFS (Hoop server), with archiver and backup servers beside the delivers] Redundancy and load balancing MUST be guaranteed everywhere.
  • 29. Deliver nodes: accept connections from the web servers, copy messages and send them to 1. the archiver (and its backup), 2. the convert workers (with load balancing), and 3. ...; useful for casual worker addition/removal
  • 30. Worker nodes: under load balancing, run as many workers as you want
  • 31. Serializer nodes: receive the converted data streams from the workers, aggregate them by service, and 1. write them to storage (HDFS via Hoop), and 2. ...; useful for reducing the storage overhead of many concurrent write operations
  • 32. Watcher nodes: watch the data for real-time workload reporting and trouble notifications; 1. raw data from the delivers, 2. structured data from the serializers
  • 33. break. (photo: crouton by kbysmnr)
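(A hedged sketch of the single-converter idea from slides 24-26, not the talk's actual convert.sh: a Ruby filter that reads TSV on stdin and writes TSV on stdout, so the same program can run under out_exec_filter and as a Hadoop Streaming mapper. The parsing and output fields are illustrative; the tag handling is slide 26's "if not exists, ignore" trick.)

    #!/usr/bin/env ruby
    # Minimal Apache-combined-like parser; the real converter also masks
    # fields and computes the flags listed on slide 17.
    COMBINED = /\A(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d+) (\S+)/

    def convert(raw)
      m = COMBINED.match(raw) or return nil
      rhost, time, method, path, status, bytes = m.captures
      [time, path, method, status, bytes, rhost]   # illustrative field set
    end

    STDIN.each_line do |line|
      cols = line.chomp.split("\t")
      # out_exec_filter writes "tag<TAB>message"; Hadoop Streaming writes the
      # raw message only. If the tag column is absent, simply ignore the
      # difference (slide 26).
      tag = cols.size > 1 ? cols.shift : nil
      converted = convert(cols.join("\t")) or next
      converted.unshift(tag) if tag   # echo the tag back only in stream mode
      puts converted.join("\t")
    end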
  • 34. Implementation details: log agents on the servers (scribeline); deliver (copy, in_scribe, out_scribe, out_forward); worker (in/out_forward, out_exec_filter); serializer/hooper (in/out_forward, out_hoop); watcher (in_forward, out_flowcounter, out_growthforecast)
  • 35. Log agent: scribeline, a log delivery agent tool (Python 2.4, scribe/thrift). Easy to set up and start/stop; keeps working across httpd configuration updates; works with logrotate-ed log files; automatic delivery-target failover/takeback; (NEW) cluster support (random selection from a server list). https://github.com/tagomoris/scribe_line
  • 36. From scribeline to deliver. [diagram: scribeline on the web servers sends scribe messages (category: blog, message: RAW LOG (Apache combined + α)) to the fluentd on the primary deliver server (in_scribe), failing over to the secondary deliver server (in_scribe)]
  • 37. From scribeline to deliver. [diagram: NN web servers → deliver 01 (primary), deliver 02 (secondary), deliver 03 (primary for high-throughput nodes); 8 fluentd instances per deliver node]
  • 38. (same as slide 36)
  • 39. Deliver node internal routing. [diagram: each of the 8 fluentd instances on a deliver server takes in_scribe input (category: blog, message: RAW LOG) and copies scribe.* to: out_scribe toward the archiver (host archive.server.local, remove_prefix scribe, add_prefix scribe, remove_newline true, add_newline true); out_flowcounter (see later); and roundrobin (see next) into out_forward (see later, with out_flowcounter). Records: time: received_at, tag: scribe.blog, message: RAW LOG]
  • 40. Deliver node: roundrobin strategy to the workers. roundrobin over 56 substore configurations (7 workers x 8 instances), each an out_forward with a secondary: server worker01 port 24211 (secondary: worker02 port 24211); worker01 port 24212 (secondary: worker03 port 24212); worker01 port 24213 (secondary: worker04 port 24213); worker01 port 24214 (secondary: worker05 port 24214); ... (A config sketch follows slide 43.)
  • 41. From deliver to worker. [diagram: the deliver fluentd (copy scribe.* → roundrobin → out_forward) sends each record (time: received_at, tag: scribe.blog, message: RAW LOG) to worker fluentd Xn1 on worker server X, or to worker fluentd Yn2 on worker server Y, both via in_forward]
  • 42. Worker node internal routing. [diagram: each worker server runs 8 worker instances and 1 serializer instance. Worker fluentd: in_forward → out_exec_filter on scribe.* (command: convert.sh; in_keys: tag,message; remove_prefix scribe; out_keys: .......; add_prefix: converted; time_key: time; time_format: %Y%m%d%H%M%S) → out_forward converted.* to the serializer. Serializer fluentd: in_forward → out_hoop converted.blog (hoop_server servername.local, username, path /on_hdfs/%Y%m%d/blog-%H.log) and out_hoop converted.news (path /on_hdfs/%Y%m%d/news-%H.log). Input records: time: received_at, tag: scribe.blog, message: RAW LOG; output records: time: written_time, tag: converted.blog, TAB-separated text data (many data fields)]
  • 43. out_exec_filter (review): 1. forks and execs a command program; 2. writes data to the child process's stdin as TAB-separated fields specified by in_keys (for the tag, remove_prefix is available); 3. reads data from the child process's stdout as TAB-separated fields named by out_keys (for the tag, add_prefix is available)
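(To make slides 42-43 concrete, a hedged sketch of the worker's out_exec_filter match, using only the parameters named on those slides. The command path and the out_keys list are hypothetical: slide 42 elides them, and the real out_keys would carry the slide-17 field set.)

    <match scribe.*>
      type exec_filter
      command /usr/local/bin/convert.sh   # converter path is an assumption
      in_keys tag,message                 # written to convert.sh's stdin
      remove_prefix scribe
      out_keys time,vhost,path,method,status,bytes   # hypothetical subset
      add_prefix converted                # scribe.blog -> converted.blog
      time_key time
      time_format %Y%m%d%H%M%S
    </match>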
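(And the deliver side from slides 40-41, equally hedged: a roundrobin output whose substores are out_forward blocks, each paired with a distinct secondary. Only two of the 56 substores are shown; hosts and ports are taken from slide 40.)

    <match scribe.*>
      type roundrobin
      <store>
        type forward
        <server>
          host worker01
          port 24211
        </server>
        <secondary>
          type forward
          <server>
            host worker02
            port 24211
          </server>
        </secondary>
      </store>
      <store>
        type forward
        <server>
          host worker01
          port 24212
        </server>
        <secondary>
          type forward
          <server>
            host worker03
            port 24212
          </server>
        </secondary>
      </store>
      # ... 54 more substores, one per worker instance
    </match>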