CUBRID Reference Architecture for Social Networking Service (2011!8!7)


  • 7/30/2019 CUBRID Reference Architecture for Social Networking Service (2011!8!7)


    2010 NHN BUSINESS PLATFORM CORPORATION

    CUBRID Reference Architecture

    for Social Networking Service

    Kieun Park

    NHN Business Platform Corp.

    2011.8


    Copyright Notice

    Copyright 2010 NHN Corporation. All Rights Reserved.


    This document is an intellectual asset of NHN Corp.; it cannot be arbitrarily used for other purposes without the approval of NHN Corp. This document is offered only for the purpose of information provision. NHN Corp. has endeavored to verify the completeness and accuracy of information contained in this document, but it does not take the responsibility for possible errors or omissions in this document. Therefore, the responsibility for the usage of this document or the results of the usage falls entirely upon the user, and NHN Corp. does not make any explicit or implicit guarantee regarding this.

    Software products or merchandises mentioned in this document, including relevant URL information, conform to the copyright laws of their respective owners. It is the responsibility of the user to abide by the corresponding copyright law. NHN Corp. may modify the details of this document without prior notice.


    Abstract

    The top-ranked Facebook celebrity has 44 million fans. The top-ranked Twitter user has 11 million followers. There are over 900 million objects on the Facebook site, and people send 140 million tweets per day. Needless to say, these facts weigh heavily on the databases behind these services. Thus, best practice in database architecture is important.

    Online social networking (OSN) services have rapidly proliferated and changed the way data is stored and served. Social data is an enormous graph of small objects that are tightly interconnected. The service page of an OSN is a view of those small objects customized for a specific viewer at a specific time. Typically, the view is an aggregation of events connected by the social graph, which changes constantly with users' real-time interaction. Even though Dunbar's number suggests that the number of people with whom one can maintain stable social relationships is relatively small, about 150, celebrities on OSN sites have so many followers that the social graph is huge. These properties of the data lead to new challenges and demand a new database architecture to handle them.

    The main considerations of database architecture for OSN are scale-out and performance, with high availability as a mandatory requirement. The main characteristics of OSN service in terms of data are power-law scaling, data feeding frenzy and Zipfian distribution access. The data being delivered grows exponentially with the popularity of the service. A cost-effective database scale-out architecture matters to business requirements as well as to technical issues.

    In this presentation, the CUBRID Reference Architecture for social networking service will be shown. The presented architectures are based on best practices developed from real business cases of NHN, the biggest portal service provider in Korea. The features that help support the database architecture demands of OSN service are described. For example, an index scan with a top-k sorting technique was developed for fast feed aggregation. Also, the HA, automatic sharding and clustering features of CUBRID will be explained. Finally, nStore, a distributed database system based on CUBRID, will be introduced. The concept of nStore is similar to Amazon Dynamo but differs in that it supports SQL.


    I Am


    Kieun Park

    Software/Database Architect

    Service Platform Development Center

    NHN Business Platform Corp.

    [email protected]

    CUBRID Open Source DBMS

    nStore Distributed Database System

    Contents

    Characteristics of online social networking service
        How fast is the data growing in online social networking service?
        Characteristics of OSN service: power-law scaling growth, data feeding frenzy, and Zipfian distribution access
        How does it access database? Feed aggregation
    Challenges and demands on database architecture
    CUBRID features
    CUBRID reference architecture for social networking service

    Contents

    Characteristics of online social networking service
    Challenges and demands on database architecture
        Business demands and system requirements
        Main considerations of database architecture for OSN service
        Scale-out, performance, and high availability
    CUBRID features
    CUBRID reference architecture for social networking service

    Contents

    Characteristics of online social networking service
    Challenges and demands on database architecture
    CUBRID unique features
        Index scan with top-k sorting technique
        High availability feature
        Automatic sharding component
        CUBRID Cluster System
        nStore, a distributed database system based on the CUBRID
    CUBRID reference architecture for social networking service

    Contents

    Characteristics of online social networking service
    Challenges and demands on database architecture
    CUBRID features
    CUBRID reference architecture for social networking service
        CUBRID Web Reference Architecture
        CUBRID SNS Reference Architecture


    Characteristics of online social

    networking service

    Some Infographics about Online Social Networking Service

    Source http://blog.skloog.com/history-social-media-history-social-media-bookmarking/

    The history and evolution of OSN have taken place within the last 10 years.

    Some Infographics about Online Social Networking Service

    Source http://www.digitalsurgeons.com/facebook-vs-twitter-infographic/

    500 million Facebook users, 106 million Twitter users

    Social networks with user bases larger than the population of most countries

    Some Infographics about Online Social Networking Service

    Source http://www.digitalbuzzblog.com/infographic-twitter-statistics-facts-figures/

    The top ranked twitter user, Lady Gaga, has 11 million followers.

    About 55 million Tweets per day.

    Twitter gets about 600 million queries every day.

    (http://twitaholic.com)

    Some Infographics about Online Social Networking Service

    Source http://www.digitalbuzzblog.com/facebook-statistics-stats-facts-2011/

    Source http://www.digitalbuzzblog.com/facebook-statistics-facts-figures-for-2010/

    The most followed person, Eminem, has more than 44 million fans.

    More than 5 billion pieces of content shared each week.

    2,716,000 messages, 1,587,000 wall posts, 10,208,000 comments in 20 minutes on Facebook.

    (http://www.independent.co.uk)

    Some Infographics about Online Social Networking Service

    Source http://www.flowtown.com/blog/have-we-reached-a-world-of-infinite-information

    Have we reached a world of infinite information?

    In a similar manner to our universe, the Internet is expanding at an incredibly rapid pace, reaching new levels of information storage and content creation every second.

    Every minute, 24 hours of video

    By 2020, roughly 25 x 10^18 (25 quintillion) information containers

    The growth gap between the digital contents created and the available storage

    Statistics of Facebook and Twitter

    Source http://blog.twitter.com/2011/03/numbers.html
    Source http://www.facebook.com/press/info.php?statistics

    Facebook: more than 750 million active users.

    Facebook: there are over 900 million objects that people interact with (pages, groups, events and community pages).

    Twitter: 140 million, the average number of Tweets people sent per day.

    Twitter: 6,939, the current TPS record.


    Statistics of Me2Day

    Number of members by month:

    Month     Members
    Jan-11    4,367,861
    Feb-11    4,721,644
    Mar-11    5,010,230
    Apr-11    5,430,343
    May-11    6,019,556
    Jun-11    6,425,847
    Jul-11    6,684,905

    Postings per day: 278,461

    Total postings: 123,456,727

    Total photos: 10,638,089

    Rank  Nickname  Friends
    1     **        432,186
    2     **        427,021
    3     *         337,414
    4     **        258,272
    5               257,759
    6     *         228,359
    7     **        224,226
    8     *         223,739
    9     **        223,541
    10    **        221,132
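To make the growth in the member chart concrete, the short sketch below computes month-over-month and overall growth from the seven member counts listed above. The function name `monthly_growth` is only illustrative; nothing here comes from Me2Day itself beyond the numbers.

```python
# Member counts copied from the chart above (Jan-11 through Jul-11).
members = {
    "Jan-11": 4_367_861,
    "Feb-11": 4_721_644,
    "Mar-11": 5_010_230,
    "Apr-11": 5_430_343,
    "May-11": 6_019_556,
    "Jun-11": 6_425_847,
    "Jul-11": 6_684_905,
}


def monthly_growth(counts):
    """Return the relative growth of each month over the previous one."""
    months = list(counts)
    return {
        cur: counts[cur] / counts[prev] - 1
        for prev, cur in zip(months, months[1:])
    }


growth = monthly_growth(members)
total_growth = members["Jul-11"] / members["Jan-11"] - 1

for month, rate in growth.items():
    print(f"{month}: {rate:+.1%}")
print(f"Jan-11 to Jul-11: {total_growth:+.1%}")
```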

    Online social networking service

    Social data is an enormous graph of small objects that are tightly interconnected.

    The service page of an OSN is an aggregation of events connected by the social graph, which changes constantly with users' real-time interaction.

    Feed Following Works

    [Diagram: three layers. Application Layer: Feeds, Following, Follower. Content Management Layer: Delivery & Aggregation Engine, Cache. Data Storage Layer: Database. Data elements shown: Contents (comment, photo, tag, ...), Outbox, Inbox, News Feeds (personalized feeds).]
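The diagram names an Outbox, an Inbox, and a Delivery & Aggregation Engine but does not spell out the algorithm. The following is a minimal sketch, under the assumption that "delivery" means copying each new post into every follower's inbox at write time (push) and "aggregation" means merging the outboxes of followed users at read time (pull). All names (`follow`, `post`, `read_push`, `read_pull`) are hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import islice

followers = defaultdict(set)   # author -> users who follow the author
following = defaultdict(set)   # user -> authors the user follows
outbox = defaultdict(list)     # author -> own posts, appended in time order
inbox = defaultdict(list)      # user -> posts delivered by followed authors


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def post(author, ts, text):
    """Store the post in the author's outbox and deliver it to follower inboxes."""
    event = (ts, author, text)
    outbox[author].append(event)
    for user in followers[author]:          # delivery: fan-out on write
        inbox[user].append(event)


def read_push(user, k):
    """Read the newest k events from the precomputed inbox."""
    return sorted(inbox[user], reverse=True)[:k]


def read_pull(user, k):
    """Aggregation: merge followed outboxes at read time, newest first."""
    streams = (reversed(outbox[a]) for a in following[user])
    return list(islice(heapq.merge(*streams, reverse=True), k))


follow("alice", "bob")
follow("alice", "carol")
post("bob", 1, "comment")
post("carol", 2, "photo")
post("bob", 3, "tag")

push_feed = read_push("alice", 2)
pull_feed = read_pull("alice", 2)
print(push_feed)
```

Push makes reads cheap but multiplies writes by the number of followers; pull keeps writes cheap but makes each read touch every followed outbox. The pull path assumes posts are appended with increasing timestamps.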

    Characteristics of Online Social Networking Service

    Data feeding frenzy
        Users follow activity and news of other users and entities.
        Followers get personalized feeds that aggregate streams produced by those followed.
        The highly variable and somewhat big fan-out of the follows graph makes data feeding difficult to implement and costly to operate.

    Power-law scaling growth
        Online social networks have properties of significant clustering, small diameter, and power-law degrees.

    Zipfian distribution access
        Twitter activity: 5% of users account for 75% of all activity, 10% account for 86% of activity, and the top 30% account for 97.4%.
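As a rough illustration of why a Zipfian access pattern concentrates activity in a small head of users, the sketch below computes the share of total activity produced by the top fraction of users when the activity of the user at rank r is proportional to 1/r^s. The exponent `s` and the population size are assumptions chosen for illustration; they are not fitted to the Twitter figures quoted above.

```python
def zipf_activity_share(num_users, top_fraction, s=1.0):
    """Share of total activity from the top `top_fraction` of users
    when activity at rank r is proportional to 1 / r**s."""
    weights = [1.0 / rank ** s for rank in range(1, num_users + 1)]
    top_n = max(1, round(num_users * top_fraction))
    return sum(weights[:top_n]) / sum(weights)


shares = {f: zipf_activity_share(10_000, f) for f in (0.05, 0.10, 0.30)}
for fraction, share in shares.items():
    print(f"top {fraction:.0%} of users -> {share:.1%} of activity")
```

For a database this skew means a small set of hot rows receives most reads, which is one reason the architecture slides later in the deck include a cache layer.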


    Challenges and demands on database

    architecture

    Challenge and Demands on Database Architecture

    Online social networking services have rapidly proliferated and changed the way data is stored and served.

    Today social media generates more information in a short period of time than was previously available in the entire world a few generations ago.

    Not only the exponential growth of Facebook, Google+ and Twitter, but also the use of more and more rich media, such as user-generated video from smartphones, is surely driving big data.

    Source http://www.itu.int/net/itunews/issues/2010/06/35.aspx

    From business demands to technology implementation.

    Challenge and Demands on Database Architecture

    When an application is being designed, software architects need to plan for much greater application load to avoid major redesigns in the future. While scaling out web servers can be done quite easily, properly scaling out database servers is far more challenging and happens.

    With enterprise data volumes moving past terabytes to tens of petabytes and more, business and IT leaders face significant opportunities and challenges from big data. For a large enterprise, big data may be in the petabytes or more; for a small or mid-size enterprise, data volumes that grow into tens of terabytes may become challenging to analyze and manage.

    Social media now produces massive amounts of data. Facebook's network, for instance, consists of 100 million entities generating tens of millions of events per second. Twitter, meanwhile, funnels 140 million public tweets a day. [GigaOM research notes]

    Managing user generated social interaction data!

    Coping with explosion in data volume!

    Cost-effective scale-out to meet rapidly growing demands!


    CUBRID unique features

    CUBRID

    Free: open source is the choice of the modern world

    Powerful: clean architecture with rich functionality for competitive performance

    Enterprise: unique features for stability and reliability

    CUBRID

    [Timeline figure: year axis 2006, 2007, 2008, 2009, 2010, 2011, 2012]

    Milestones shown:
        CUBRID became an open source project.
        CUBRID 2008 R1.1 stable was released.
        The development of CUBRID DBMS started.
        First internal release CUBRID 2008 R1.0
        CUBRID 2008 R2.0 stable released.
        CUBRID Cluster Project has been started.
        Official open source community, www.cubrid.org, opened.
        CUBRID 3.0 stable released.
        CUBRID 4.0 stable released.

    Dates shown: October, 2008; November, 2008; August, 2009; October, 2009; September, 2009; October, 2010; July, 2011

    Features shown:
        HA feature
        Reclaim deleted space
        Fast serial data (cached)
        LFS (large file support) for database volume
        INSERT performance enhancement
        Database volume size reduced.
        Multi-range scan and key limit function
        Covered index
        FBO (file-based object)
        HA monitoring
        Full SQL function support

    CUBRID Index Scan with Top-k Sorting Technique

    My friends' newest twenty comments:

    SELECT post_no FROM posts
    WHERE id IN (4, 15, 36, ...) AND registered_date < 20000
    ORDER BY registered_date DESC LIMIT 20

    Example index keys:

    (4,10001) (4,9999) (4,875)
    (15,10000) (15,9999) (15,7467)
    (36,947) (36,120) (36,3)

    Single range scan with key filter
        Keys that do not match are filtered out.
        # of leaf pages accessed > # of keys of scan result
        Sort after scan
        Disk I/O ?!

    Multi-range scan
        CUBRID does multi-range index scan.
        # of leaf pages accessed = # of keys of scan result
        On-the-fly sorting during scan
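The idea of sorting on the fly across several index ranges can be sketched as a k-way merge: each `id` contributes one range that is already ordered by `registered_date`, so the newest k keys can be taken from the heads of the ranges without materializing and sorting the full result. This is an illustration of the general technique only, not CUBRID's actual implementation; the `index` dictionary and function names are hypothetical, and the key values are the ones shown on the slide.

```python
import heapq
from itertools import islice

# Hypothetical index contents: for each id, registered_date values
# stored in descending order, as in the example keys above.
index = {
    4: [10001, 9999, 875],
    15: [10000, 9999, 7467],
    36: [947, 120, 3],
}


def range_scan(index, user_id, upper_bound):
    """Yield (registered_date, id) keys for one id, newest first, below the bound."""
    for registered_date in index.get(user_id, []):
        if registered_date < upper_bound:
            yield (registered_date, user_id)


def top_k_newest(index, ids, upper_bound, k):
    """Merge the per-id ranges lazily and stop after k keys."""
    scans = [range_scan(index, user_id, upper_bound) for user_id in ids]
    return list(islice(heapq.merge(*scans, reverse=True), k))


result = top_k_newest(index, [4, 15, 36], upper_bound=20000, k=4)
print(result)
```

Because the merge is lazy, only about k keys plus one head per range are read, which corresponds to the slide's claim that the number of leaf pages accessed matches the number of keys in the result.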

    CUBRID Index Scan with Top-k Sorting Technique

    SELECT * FROM tbl WHERE a = 2 AND b < K ORDER BY b LIMIT 3;

    SELECT * FROM tbl WHERE a IN (2, 4, 5) AND b < K ORDER BY b LIMIT 3;
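To see what the two query shapes return, the sketch below runs them against an in-memory SQLite database with a composite index on `(a, b)`. SQLite is used only because it ships with Python; its planner is not CUBRID's, so this shows the query semantics rather than the multi-range scan itself. The sample rows and the value chosen for `K` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX idx_tbl_a_b ON tbl (a, b)")  # composite index (a, b)

# Hypothetical sample rows; b values are unique so the ordering is unambiguous.
rows = [(2, 1), (2, 4), (2, 7), (2, 10), (4, 2), (4, 5), (4, 8), (5, 3), (5, 6), (5, 9)]
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

K = 8

# Single range: one value of a, so the index already returns b in order.
single = conn.execute(
    "SELECT a, b FROM tbl WHERE a = 2 AND b < ? ORDER BY b LIMIT 3", (K,)
).fetchall()

# Multiple ranges: one ordered range per value of a, merged by b.
multi = conn.execute(
    "SELECT a, b FROM tbl WHERE a IN (2, 4, 5) AND b < ? ORDER BY b LIMIT 3", (K,)
).fetchall()

print(single)
print(multi)
```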

    CUBRID Test Results

    Refer http://www.cubrid.org/cubrid_mysql_sns_benchmark_test

    [Bar chart: vertical axis from 0 to 300; horizontal axis Test Case 1 to Test Case 4; series M UNION, M IN, C UNION, C IN]

    User group 1: users with 50 or fewer friends
    User group 2: users with 51~2000 friends
    User group 3: users with friends up to tens of thousands

    Test case 1: user group 1 only
    Test case 2: user group 2 only
    Test case 3: 40% of user group 1, 50% of user group 2, 10% of user group 3
    Test case 4: 10% of user group 1, 50% of user group 2, 40% of user group 3

    CUBRID High Availability Feature

    [Diagram: applications connect through the CUBRID Driver to the brokers (Active Broker and Backup Broker; Read-Write Mode and Read-Only Mode; automatic switch-over). The database servers are a Master DB (Active Server) and two Slave DBs (Standby-1 Server, and Standby-2 Server at a remote IDC), with automatic fail-over/fail-back. UPDATE and SELECT requests are shown flowing from the applications.]

    CUBRID HA, a highly fault-resistant DBMS, enables:
        Non-stop 24x7 service
        System maintenance without shutdown
        Automatic fail-over (less than 20 sec)
        Various access modes (read-write, read-only)
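Automatic fail-over can be pictured as a heartbeat check followed by promotion of a standby. The sketch below is a deliberately simplified model, not CUBRID's HA protocol: the `Node` class, the `fail_over` function and the 3-second heartbeat timeout are all assumptions made for illustration (the slide only states that fail-over completes in less than 20 seconds).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str              # "active", "standby" or "failed"
    last_heartbeat: float  # time of the last heartbeat received, in seconds


def fail_over(nodes, now, timeout=3.0):
    """Return the node that should serve writes.

    If the active node has missed heartbeats for longer than `timeout`
    (a hypothetical value), mark it failed and promote the first standby
    whose heartbeat is still fresh.
    """
    active = next(n for n in nodes if n.role == "active")
    if now - active.last_heartbeat <= timeout:
        return active
    active.role = "failed"
    for node in nodes:
        if node.role == "standby" and now - node.last_heartbeat <= timeout:
            node.role = "active"
            return node
    raise RuntimeError("no healthy standby available")


nodes = [
    Node("A-Node", "active", last_heartbeat=0.0),
    Node("S1-Node", "standby", last_heartbeat=9.0),
    Node("S2-Node", "standby", last_heartbeat=9.5),
]
promoted = fail_over(nodes, now=10.0)
print(promoted.name)
```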

    CUBRID High Availability Feature

    [Diagram: three nodes. A-Node (Active Server Node) holds the Master DB and receives UPDATE. S1-Node (Standby Server Node) and S2-Node each hold a Slave DB and serve SELECT. Components shown: CUBRID Server, Log Writer, Log Applier, Transaction Log, Replication Log. Links shown: Log Shipping (synchronous), Log Shipping (asynchronous), Log Applying, Heartbeat.]

    The HA feature is based on database replication with a transaction log multiplication technique.

    Statement-based replication could cause data inconsistency.
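One way to see why statement-based replication can diverge while log shipping does not: a statement containing a non-deterministic function is re-evaluated on the replica and may produce a different value, whereas a log record carries the value the master already computed. The sketch below models databases as dictionaries; the function names and the seeded random generators standing in for a non-deterministic SQL function are assumptions for illustration.

```python
import random


def apply_statement(db, rng):
    """Statement-based: each node re-executes something like
    'UPDATE t SET token = RANDOM()' and evaluates the function itself."""
    db["token"] = rng.random()


def apply_log_record(db, record):
    """Log-based: the replica applies the value already computed on the master."""
    key, value = record
    db[key] = value


master, statement_replica, log_replica = {}, {}, {}

apply_statement(master, random.Random(1))
apply_statement(statement_replica, random.Random(2))   # evaluated independently
apply_log_record(log_replica, ("token", master["token"]))

print(master == statement_replica)  # the statement-based replica has diverged
print(master == log_replica)        # the log-based replica matches
```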

    CUBRID Automatic Sharding Component

    [Diagram: the application sends SQL to the Broker, which consults the Sharding Metadata and routes to database servers Shard #1 to Shard #4 (automatic sharding); a New Shard can be added (Expand Shard).]

    Shard #1: k0001, k0005, K000
    Shard #2: k0002, k0006, K000
    Shard #3: k0003, k0007, K000
    Shard #4: k0004, k0008, K000

    SELECT ... WHERE key=k0008
    UPDATE ... WHERE key=k0002

    The automatic sharding feature enables:
        No more application logic
        Scale-out DB architecture

    Features:
        Multiple sharding strategies: shard by modulus, date/time range, extendible hash
        User hint-aware: SELECT * FROM tbl WHERE nonkey=abc /* shard=1 */
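A minimal sketch of the routing decision a sharding broker has to make, assuming the modulus strategy and the key layout shown in the diagram (k0001 and k0005 on Shard #1, k0002 and k0006 on Shard #2, and so on). The function names and the regular expression for the `/* shard=N */` hint are hypothetical; the real component's metadata format is not described in the slides.

```python
import re

NUM_SHARDS = 4
_SHARD_HINT = re.compile(r"/\*\s*shard\s*=\s*(\d+)\s*\*/")


def shard_by_modulus(key):
    """Map keys like 'k0008' to a shard number in 1..NUM_SHARDS."""
    number = int(key.lstrip("kK"))
    return (number - 1) % NUM_SHARDS + 1


def route(sql, key=None):
    """Pick a shard: an explicit hint wins, otherwise use the shard key."""
    hint = _SHARD_HINT.search(sql)
    if hint:
        return int(hint.group(1))
    if key is None:
        raise ValueError("query has neither a shard key nor a shard hint")
    return shard_by_modulus(key)


print(route("SELECT * FROM tbl WHERE key='k0008'", key="k0008"))
print(route("UPDATE tbl SET v=1 WHERE key='k0002'", key="k0002"))
print(route("SELECT * FROM tbl WHERE nonkey='abc' /* shard=1 */"))
```

Keeping this decision in the broker is what lets the application send plain SQL without embedding shard logic of its own.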

    CUBRID Cluster System

    [Diagram: the application sends SQL to the Broker (load balancing), which accesses cluster servers Node #1 to Node #4 through a global schema with distributed partitions.]

    Node #1: gtable part_01, part_05
    Node #2: gtable part_02, part_06
    Node #3: gtable part_03, part_07
    Node #4: gtable part_04, part_08

    SELECT * FROM gtable WHERE part_key=2 AND ...
    INSERT INTO gtable ...

    Main features of CUBRID Cluster are:
        Global schema
        Distributed partition
        Load balancing

    Users can get:
        Single big database view
        Location transparency
        Additionally, linear scalability

    CUBRID Cluster System

    [Diagram: a Global Schema spans Local Schema #1 to #4 on Database #1 to #4. Tables shown: contents (on several databases), info, author, code, level, local. A Global Schema User and a Local Schema User issue the statements below.]

    SELECT * FROM info, code WHERE info.id = code.id
    INSERT INTO contents ...
    UPDATE local ...
    SELECT * FROM contents WHERE ...
    SELECT * FROM contents WHERE auth = (SELECT name FROM author WHERE ...)

    The global schema is a single representation or a global view of all nodes, where each node has its own database and schema.

    Users can access any database through a single schema, regardless of and without knowing the location of the distributed data.

    CUBRID Cluster

    [Diagram labels: Global Schema; Logical View (Schema); Physical View (Data, System Catalog, Index)]

    CUBRID Cluster

    The distributed partition maps the global schema onto table partitioning. Partitions are resident in different nodes but accessed through the global schema.

    gtable PARTITION BY HASH (part_key)

    Database #1: part_01, part_05
    Database #2: part_02, part_06
    Database #3: part_03, part_07
    Database #4: part_04, part_08

    SELECT * FROM gtable, info
    WHERE gtable.part_key=02
    AND info.id = gtable.id

    [Diagram: the Global Schema sits above Database #1 to #4; each database holds its partition data, and the info table is also shown.]

    nStore, a distributed database system based on the CUBRID

    Concept: Container > Table > Column
    Data Model: Simplified Tabular
    Query Language: Simplified SQL
    Availability: 3-copy Replication
    Distribution: Key-based Consistent Hashing

    RDB-like tabular model
        Schema, column, record
        Index on columns (ordered search)

    Restricted data type
        Integer (bigint), string, timestamp (msec), id (128bit), bool

    Data partitioned by key
        E.g., user-id could be a key

    SQL-like query language
        SELECT a, b, c FROM post WHERE fid IN (?, ?, ?) AND b=? ORDER BY ts LIMIT 20, CK=iamyaw
        INSERT INTO post(no, id, date) VALUES (?, ?, ?), CK=iamyaw

    Join supported
        Between tables in one container
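Key-based distribution with three copies can be sketched with a consistent-hash ring: the container key is hashed onto the ring and the next three distinct servers clockwise hold the replicas. The slides do not describe nStore's actual placement algorithm, so everything below (the `Ring` class, MD5 as the hash, 64 virtual nodes per server, the server names) is an assumption in the style of Dynamo-like systems.

```python
import bisect
import hashlib


def _hash(value):
    """Hash a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes=64):
        # Each server appears at several ring positions (virtual nodes).
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, container_key, copies=3):
        """Return the distinct servers holding copies of the container."""
        copies = min(copies, self._node_count)
        i = bisect.bisect(self._points, _hash(container_key))
        chosen = []
        while len(chosen) < copies:
            node = self._ring[i % len(self._ring)][1]
            if node not in chosen:
                chosen.append(node)
            i += 1
        return chosen


ring = Ring([f"container-server-{n}" for n in range(1, 6)])
placement = ring.replicas("iamyaw")
print(placement)
```

Because all tables of one container share the same key, they land on the same servers, which is consistent with joins being supported only between tables in one container.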

    nStore, a distributed database system based on the CUBRID

    [Diagram: applications call nStore nodes, each paired with CUBRID, through the REST API. Functions shown: Data Distribution, Replication (3-Copy), Rebalancing, Query Processing, Storage System.]

    REST API

    http://server/keyspace/query?ckey=iamyaw&nsql=select a from tbl where k=100&format=json
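The example URL contains spaces and an equals sign inside the `nsql` parameter, so a client would normally URL-encode it. The sketch below builds such a request URL with the standard library, using only the path and parameter names visible in the slide (`ckey`, `nsql`, `format`); `server` and `keyspace` are the slide's own placeholders, and the helper name `build_query_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(server, keyspace, ckey, nsql, fmt="json"):
    """Build an nStore-style query URL with properly encoded parameters."""
    query = urlencode({"ckey": ckey, "nsql": nsql, "format": fmt})
    return f"http://{server}/{keyspace}/query?{query}"


url = build_query_url("server", "keyspace", "iamyaw", "select a from tbl where k=100")
params = parse_qs(urlsplit(url).query)
print(url)
```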

    nStore, a distributed database system based on the CUBRID

    [Diagram: the application reaches nStore through the REST API. nStore consists of a distribution layer, a Management Node and several Container Servers, each an RDBMS holding tables. Two example containers are shown, Container (ckey=iamyaw) and Container (ckey=kieun_park); each holds Table A, Table B and Table C with indexed columns, related by equi-joins. A Global Table G is also shown.]

    nStore, a distributed database system based on the CUBRID

    nStore

    nStore Test Results

    [Bar chart: vertical axis from 0 to 25000; workloads INSERT, READ, READ w/ compaction, READ/UPDATE, READ/INSERT; series Cassandra, HBase, MongoDB, nStore]

    Tested using YCSB (http://research.yahoo.com/Web_Information_Management/YCSB)

    INSERT: 50,000,000 records (1K size)
    READ: Zipfian distribution
    READ w/ compaction: after SSTable compaction (Cassandra, HBase)
    READ/UPDATE: 50:50 (50,000,000 records DB)
    READ/INSERT: 50:50 (50,000,000 records DB)

    CUBRID reference architecture for social networking service

    CUBRID Web Reference Architecture

    [Diagram: Small-size web service: Web Server with a CUBRID HA master/slave pair (RW, RO). Mid-size web service: Web Server (User Interface), Web Application Server (Business Logic), Cache Server, and DB Sharding over four CUBRID HA master/slave pairs, with CUNITOR.]

    Social Networking Service Architecture

    Web Servers (User Interface)

    Web Application Servers (Business Logic)

    Social Query Engine, Aggregation Engine, Delivery Engine, Search Engine, Recommendation Engine

    Cache Layer

    User Profile DB, Social Relation DB, Analytics DB, Feed Outbox DB, Feed Inbox DB, Search Index

    CUBRID SNS Reference Architecture

    User profile DB: sharded by user-id (DB Sharding broker over CUBRID HA master/slave pairs, RW/RO)

    Social relation DB: sharded by user-id (DB Sharding broker over CUBRID HA master/slave pairs, RW/RO)

    Inbox/Outbox storage: distributed according to user-id (nStore w/ CUBRID, containers with management)

    Analytic DB: partitioned for OLAP (CUBRID Cluster, node #1, node #2, ... node #n)

    Also shown: CUNITOR monitoring server (OAM), cache server farm, application servers, ETL

    Best Practices

    Automatic sharding is an effective way to scale out a DB system storing relational model data.

    A highly available database architecture is a basic business requirement and no longer a technical barrier.

    nStore is a solution for petabyte-scale data, with the benefits of a highly available and scalable distributed store.


    End of Slides.