OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh


Presentation in OGDC 2012 organized by VNG Corp.


Page 1: OGDC Datastorage Solution_Mr.Dung, Dinh Nguyen Anh

Data Storage Solutions for SNS game

Dinh Nguyen Anh Dung – P2S – G6 – VNG

Page 2

CONTENT

• SNS games and SQL-based databases

• NoSQL technology and Couchbase

• NoSQL does not come without challenges

• SNS Storage Engine (SSE)

Page 3

SNS games AND SQL-based databases

Page 4

SNS games characteristics

• Huge number of concurrent requests, yet low response time is required

• Accounts can be stored separately

– No need for centralized storage

– In most cases, no need to put strict constraints on data relationships

Page 5

Native limitations of SQL-based DBMS

• Centralized fundamentally

– Vertical scale up issue

• Schema

– High risk (and cost) for updates

• Normalized data

– Unnecessary overhead: table joins, locking, data constraint checks, …

Page 6

Native limitations of SQL-based DBMS

Source : NoSQL - WhitePaper

Page 7

Native limitations of SQL-based DBMS

• SQL processing overhead on both the DBMS and the client side

• Most data accesses end up hitting the hard disk

– Very challenging to meet low response times

– Internal caching does not help much

• Hard to distribute data across multiple servers

Page 8

NoSQL technology and Couchbase

Page 9

NoSQL technology

• Persistent distributed hash-table

• Active set resides in RAM

– Extremely fast response time

• Horizontal scaling (scale-out)

• Raw and direct data access

– set, get, add, inc, dec : no overhead
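The five operations listed above can be illustrated with a small in-memory stand-in. This is a sketch for illustration only: `KVStore` is a hypothetical class, not a Couchbase or memcached client, and it simplifies some behaviour (here a missing counter starts at 0, which a real server may not do).

```python
class KVStore:
    """In-memory stand-in for a key-value store (hypothetical, not a real client)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Unconditional write: creates the key or overwrites it.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def add(self, key, value):
        # Write only if the key does not exist yet; report whether it was written.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def inc(self, key, delta=1):
        # Simplification: a missing key is treated as 0.
        self._data[key] = self._data.get(key, 0) + delta
        return self._data[key]

    def dec(self, key, delta=1):
        return self.inc(key, -delta)


store = KVStore()
store.set("Jack.Gold", 50123)
store.inc("Jack.Gold", 77)
added_again = store.add("Jack.Gold", 0)  # key already exists, so nothing is written
```

Each call is a direct lookup by key, with no query parsing or planning involved, which is the "no overhead" point made on the slide.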

Page 10

NoSQL technology

Key          Value
Jack.Gold    50123
Jack.Exp     4670
Jack.Coin    700
Peter.Gold   7050
Peter.Exp    20005
Peter.Coin   1

Key          Value
Peter.Gold   7050
Jack.Exp     4670
Peter.Exp    20005

Key          Value
Jack.Gold    50123

Key          Value
Peter.Coin   1
Jack.Coin    700
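The tables on this slide appear to show one key space split across several servers. A minimal sketch of how a client could decide which server owns a key follows. The server names and the plain "hash modulo server count" rule are assumptions made for illustration; production systems usually add a level of indirection so that adding a server does not move most keys.

```python
import zlib

SERVERS = ["server-0", "server-1", "server-2"]  # hypothetical names


def server_for(key, servers=SERVERS):
    # CRC32 is stable across runs, unlike Python's built-in hash() for strings.
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


keys = ["Jack.Gold", "Jack.Exp", "Jack.Coin",
        "Peter.Gold", "Peter.Exp", "Peter.Coin"]

# Group the keys by the server that owns them.
placement = {}
for k in keys:
    placement.setdefault(server_for(k), []).append(k)
```

Because every client computes the same function, no central lookup is needed to find a key.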

Page 11

Active set on RAM

[Diagram: CLIENT, ACTIVE SET ON RAM, HDD; data reaches the HDD by lazy write]
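The lazy-write idea in this diagram can be sketched as a write-behind cache: the client is answered from RAM, and changed keys are persisted later in a batch. `LazyWriteStore` and its two dictionaries are hypothetical stand-ins, not how Couchbase is implemented internally.

```python
class LazyWriteStore:
    """Write-behind sketch: RAM is updated first, disk is updated on flush()."""

    def __init__(self):
        self.ram = {}       # active set
        self.disk = {}      # stand-in for the HDD
        self.dirty = set()  # keys changed in RAM but not yet persisted

    def set(self, key, value):
        # The caller returns immediately; no disk I/O happens here.
        self.ram[key] = value
        self.dirty.add(key)

    def get(self, key):
        if key in self.ram:
            return self.ram[key]
        value = self.disk[key]  # raises KeyError for unknown keys
        self.ram[key] = value   # pull the key into the active set
        return value

    def flush(self):
        # Persist all pending changes; returns how many keys were written.
        count = len(self.dirty)
        for key in self.dirty:
            self.disk[key] = self.ram[key]
        self.dirty.clear()
        return count


lazy = LazyWriteStore()
lazy.set("Peter.Coin", 1)
on_disk_before_flush = "Peter.Coin" in lazy.disk
flushed = lazy.flush()
```

The trade-off is visible in the sketch: anything still in `dirty` is lost if the process dies before `flush()` runs.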

Page 12

Couchbase server

• Based on Membase technology

• Distributed

• Replication

• Native PHP client since version 1.8

• Bucket types

– Couchbase (persistent)

– Memcached (memory only)

Page 13

NoSQL does not come without challenges

Page 14

Our first SNS game with Couchbase

Page 15

Architecture and design issues

• Transition from relational database design to key-value design

– Account data => keys: how?

• Only minimal support for locking and concurrency control

– add: fails if the key already exists (usable as a mutex)

– cas: a read returns a CAS value; the write fails if that CAS value is outdated
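The two primitives above can be sketched with an in-memory stand-in. `CasStore`, `gets`, and `add_gold` are hypothetical names chosen for this example; a real client exposes similar operations but with its own signatures.

```python
class CasStore:
    """In-memory stand-in: every stored value carries a version token (CAS)."""

    def __init__(self):
        self._data = {}      # key -> (value, cas_token)
        self._next_cas = 1

    def _store(self, key, value):
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1

    def add(self, key, value):
        # Fails if the key exists, so only one caller can "take" the key.
        if key in self._data:
            return False
        self._store(key, value)
        return True

    def delete(self, key):
        self._data.pop(key, None)

    def gets(self, key):
        return self._data[key]  # (value, cas_token)

    def cas(self, key, value, token):
        # Write only if nobody has changed the key since `token` was read.
        if self._data[key][1] != token:
            return False
        self._store(key, value)
        return True


def add_gold(store, key, amount, max_retries=10):
    # Optimistic update: re-read and retry whenever the token is stale.
    for _ in range(max_retries):
        value, token = store.gets(key)
        if store.cas(key, value + amount, token):
            return True
    return False


cas_store = CasStore()
cas_store.add("Jack.Gold", 50123)

# Two clients read the same version; only the first write succeeds.
v1, t1 = cas_store.gets("Jack.Gold")
v2, t2 = cas_store.gets("Jack.Gold")
first = cas_store.cas("Jack.Gold", v1 + 10, t1)
second = cas_store.cas("Jack.Gold", v2 + 5, t2)  # stale token, rejected
retried = add_gold(cas_store, "Jack.Gold", 5)    # re-reads, then succeeds

# add used as a mutex: the second attempt to take the lock fails.
got_lock = cas_store.add("lock:Jack", 1)
got_lock_again = cas_store.add("lock:Jack", 1)
cas_store.delete("lock:Jack")                    # release
```

A lock built on `add` needs an expiry time in practice, otherwise a crashed holder blocks everyone; that detail is left out of this sketch.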

Page 16

Architecture and design issues

• No transaction support

– Data corruption becomes so easy!

• No high-level data structure support (e.g. list, queue, …)

• No tools for raw data viewing / editing

Page 17

Pitfalls

• Too much freedom for developers

– Anyone can add / modify any key any time

• Epic key design mindset

– One key for all: bad performance, and concurrency control is a true nightmare

• Abusing the power of set

– It never fails! Developers LOVE it!
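The last two pitfalls combine badly. The sketch below replays two interleaved requests against a plain dictionary standing in for the store: both read the whole account under one key, each changes a different field, and both write back with a blind set. The key names and values are made up for the example.

```python
kv = {}  # plain dict standing in for the key-value store

# "One key for all": the whole account is a single value.
kv["Jack"] = {"gold": 100, "exp": 10}

# Two requests read the same snapshot of the account.
request_a = dict(kv["Jack"])
request_b = dict(kv["Jack"])

request_a["gold"] += 50   # request A rewards gold
request_b["exp"] += 5     # request B grants experience

kv["Jack"] = request_a    # blind set: always succeeds
kv["Jack"] = request_b    # blind set: succeeds too, and silently drops A's gold

gold_after_blob_writes = kv["Jack"]["gold"]

# With one key per field, the two requests never touch the same key.
kv["Jack.Gold"] = 100
kv["Jack.Exp"] = 10
kv["Jack.Gold"] += 50     # stands in for an atomic inc on the server
kv["Jack.Exp"] += 5
```

Neither write reports an error in the first half, which is why this kind of loss tends to surface only as a mysterious bug later.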

Page 18

SSE – SNS Storage Engine

Page 19

Our second SNS game with Couchbase

Page 20

What is SSE ?

• A thin “layer” between developers and the almighty Couchbase

– SSE is simply a PHP library

• Provides better support for locking and concurrency control

– Basic support for begin / update / commit

• Provides high-level data structures

– Collection, queue, stack, integer (gold), inc-only integer (exp), binary flags (quest)…
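SSE itself is a PHP library and its code is not shown in the slides. The following Python sketch only illustrates what two of the listed types could look like: an increase-only integer (as for exp) and a set of binary flags (as for quests). The class and method names are assumptions made for this example.

```python
class IncOnlyInt:
    """Integer that can only grow, e.g. experience points."""

    def __init__(self, value=0):
        self.value = value

    def inc(self, delta):
        if delta < 0:
            raise ValueError("inc-only field cannot decrease")
        self.value += delta
        return self.value


class Flags:
    """Binary flags packed into one integer, e.g. completed quests."""

    def __init__(self, bits=0):
        self.bits = bits

    def set(self, index):
        self.bits |= 1 << index

    def is_set(self, index):
        return bool(self.bits & (1 << index))


exp = IncOnlyInt(4670)
exp.inc(30)

quests = Flags()
quests.set(3)  # mark quest number 3 as done

try:
    exp.inc(-1)
    rejected = False
except ValueError:
    rejected = True
```

Encoding the rule in the type means a buggy code path cannot lower a player's experience, whatever value it tries to write.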

Page 21

What is SSE ?

• Minimize the risk of weak concurrency support

– Ability to rollback pending writes

• Schema

– Limit freedom of developers!

– No more nightmares with backup and raw data viewing/editing

• Buffers to eliminate repeated reads / writes
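The combination of a schema, buffered reads and writes, and the ability to roll back pending writes can be sketched as a small session object. This is not SSE's actual design: `Session`, its method names, and the "Owner.Field" key convention are assumptions, and locking is left out entirely.

```python
class Session:
    """Buffers reads and writes; nothing reaches the backend until commit()."""

    def __init__(self, backend, schema):
        self.backend = backend  # dict standing in for the store
        self.schema = schema    # set of field names developers may use
        self.cache = {}         # values already read in this session
        self.pending = {}       # writes waiting for commit
        self.backend_reads = 0

    def _check(self, key):
        # Keys look like "Owner.Field"; only declared fields are allowed.
        field = key.rpartition(".")[2]
        if field not in self.schema:
            raise KeyError(f"field not in schema: {field}")

    def get(self, key):
        self._check(key)
        if key in self.pending:
            return self.pending[key]
        if key not in self.cache:
            self.cache[key] = self.backend[key]
            self.backend_reads += 1
        return self.cache[key]

    def update(self, key, value):
        self._check(key)
        self.pending[key] = value

    def commit(self):
        self.backend.update(self.pending)
        self.cache.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


backend = {"Jack.Gold": 50123, "Jack.Exp": 4670}
session = Session(backend, schema={"Gold", "Exp"})

session.update("Jack.Gold", session.get("Jack.Gold") - 100)
session.rollback()                    # the purchase failed: nothing was written
gold_after_rollback = backend["Jack.Gold"]

session.update("Jack.Gold", session.get("Jack.Gold") - 100)
session.commit()                      # now the backend sees the change

try:
    session.update("Jack.Hack", 1)    # not declared in the schema
    schema_rejected = False
except KeyError:
    schema_rejected = True
```

The second `get` is served from the session cache, so the backend is read only once for the whole sequence.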

Page 22

Raw account view / editing tool

Page 23

What is SSE ?

Page 24

What is SSE ?

Page 25

Multi-instance architecture

• Replication is too costly for performance

• One failed node means the cluster fails

• Adding nodes requires a rebalance

– Only good for clusters with a large number of nodes (more than 20 nodes)

Page 26

Multi-instance architecture

• One instance for the index (user-to-instance mapping)

– Use APC on logic servers to cache lookups and reduce load on the index instance

• Many data instances

– Dynamically adjust the weight of each instance based on its average load

– Node failure only affects part of the user-base
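A minimal sketch of this routing scheme follows. `InstanceRouter` is a hypothetical class: one dictionary stands in for the index instance, another for the APC cache on a logic server, and new users are assigned to a data instance by weighted random choice. How the real system derives weights from load is not described in the slides.

```python
import random


class InstanceRouter:
    """Maps each user to one data instance and caches the mapping locally."""

    def __init__(self, weights, rng=None):
        self.weights = dict(weights)  # instance name -> weight
        self.index = {}               # stand-in for the index instance
        self.local_cache = {}         # stand-in for APC on a logic server
        self.index_lookups = 0
        self.rng = rng or random.Random()

    def set_weight(self, name, weight):
        # Lower the weight of a busy instance so it receives fewer new users.
        self.weights[name] = weight

    def instance_for(self, user_id):
        if user_id in self.local_cache:
            return self.local_cache[user_id]
        self.index_lookups += 1
        if user_id not in self.index:
            # New user: pick an instance in proportion to the weights.
            names = list(self.weights)
            chosen = self.rng.choices(
                names, weights=[self.weights[n] for n in names]
            )[0]
            self.index[user_id] = chosen
        self.local_cache[user_id] = self.index[user_id]
        return self.local_cache[user_id]


router = InstanceRouter(
    {"data-1": 1, "data-2": 1, "data-3": 0},  # data-3 takes no new users
    rng=random.Random(0),
)
first = router.instance_for("jack")
second = router.instance_for("jack")          # served from the local cache
assigned = {router.instance_for(f"user{i}") for i in range(50)}
```

Existing users keep their instance when weights change, since the mapping is stored once in the index; only new assignments follow the new weights.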

Page 27

Multi-instance architecture

[Diagram: four Game Logic servers, each with APC; Index Instance; Data Instance 1, Data Instance 2, Data Instance 3]

Page 28

Disadvantages

• Lower multi-get performance

• Accesses are not well balanced between instances

Page 29

How good is SSE for us ?

• No more data loss due to concurrency

• No more data corruption

• No mysterious bugs due to unintended writes

• Server developers' workload reduced by more than 3 times

Page 30