NoSQL and Cloud Services - Philip Balinow, Komfo

Post on 24-May-2015

Introduction into Cloud Computing

philip.balinov@komfo.com, DevOps Engineer

SOME DEFINITIONS

def. CLOUD COMPUTING - distributed computing over a network: the ability to run a program or application on many connected computers at the same time.

def. BIG DATA - data sets so large and complex that they become difficult to process using traditional data processing applications.

SOME DEFINITIONS, CONTD.

Q: No, seriously. What is the Cloud?

A: Well, there are three types

Infrastructure-as-a-Service

Platform-as-a-Service

Software-as-a-Service

THE CLOUD – PROS AND CONS

WHY CLOUD COMPUTING IS SO GREAT

Better hardware utilization

Economy of scale

Usage-based pricing

In-built resilience (here be monsters)

No up-front costs

No long-term contracts

KOMFO'S CLIENTS

Retail

Finance

E-Commerce

Telecommunication

B2B

Publishing/Media

Government & NGO

Automotive

Travel

KOMFO PLATFORM WORKFLOW OVERVIEW

EXTERNAL PROVIDERS

OUR TECHNOLOGY

PHP, Python, C

MySQL

MongoDB, Elasticsearch

JavaScript (Node.js)

Ruby

Freedom to use any tool fit for the job

KOMFO MAIN CHALLENGES

Continuously changing workload

Fast feature changes

Big Data

(Predictive) Analytics

Security

Availability

HOW TO USE THE CLOUD

Automation

Horizontal scaling

Break tasks into many small sub-tasks

Synchronize all the workers

Write for eventual consistency
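
The points above (horizontal scaling, many small sub-tasks, synchronizing the workers) can be sketched in a few lines. This is only an illustration on a single machine, with a thread pool standing in for cloud worker nodes; the function names and the sum-of-squares workload are assumptions, not part of the Komfo platform.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, chunk_size):
    """Break one large task into many small sub-tasks (chunks)."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Stand-in for the real work one worker would do on its chunk."""
    return sum(x * x for x in chunk)


def run(items, chunk_size=100, workers=4):
    """Fan the chunks out to workers, then combine the partial results."""
    chunks = split(items, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map blocks until every worker is done: the synchronization point.
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)
```

Because each chunk is independent, adding workers (or machines) is the only change needed to scale out; the combine step is the one place where the workers have to be synchronized.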

HOW TO USE THE CLOUD

Beware! There are traps*

* Minimally shared with fair weighting. Allows burst when idle resources are available. In rare cases, resources may be throttled back under heavy host contention.

** Disk I/O is shared across the host.

*** A vCPU corresponds to a physical CPU thread.

HOW TO USE THE CLOUD

There are more traps*

Coordinate tasks: good messaging system (AMQP, DB, MemCached)

Asynchronous task execution (see above, also API callback hooks)

Implement transactions in software
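
One common way to "implement transactions in software" is to pair every step with a compensating undo action and roll back the completed steps when a later one fails. The sketch below assumes that approach; `run_transaction`, the `accounts` dictionary and the `debit`/`credit`/`transfer` helpers are hypothetical names used only for illustration.

```python
def run_transaction(steps):
    """Run (action, undo) pairs in order.

    If any action raises, undo the already completed steps in reverse
    order and report failure instead of leaving partial changes behind.
    """
    completed_undos = []
    try:
        for action, undo in steps:
            action()
            completed_undos.append(undo)
    except Exception:
        for undo in reversed(completed_undos):
            undo()
        return False
    return True


# Hypothetical in-memory "accounts" standing in for two separate services.
accounts = {"a": 100, "b": 0}


def debit(name, amount):
    if accounts[name] < amount:
        raise ValueError("insufficient funds")
    accounts[name] -= amount


def credit(name, amount):
    accounts[name] += amount


def transfer(src, dst, amount):
    """Move money between accounts as one software-level transaction."""
    return run_transaction([
        (lambda: debit(src, amount), lambda: credit(src, amount)),
        (lambda: credit(dst, amount), lambda: debit(dst, amount)),
    ])
```

A call such as `transfer("a", "b", 30)` either applies both steps or neither. In a real distributed setup the undo actions can themselves fail, so they usually need retries and logging on top of this.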

CLOUD ARCHITECTURE

Messaging

Ensure communication between a dynamic number of nodes

Message-oriented middleware

Exactly-once delivery

At-least-once delivery

Transaction-based delivery

Timeout-based delivery
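
Of the delivery modes listed above, at-least-once is often paired with an idempotent consumer, so that a redelivered message is not processed twice. The following is a minimal in-memory sketch of that idea, not a real message broker; the class and function names are assumptions, and a production system would keep the set of seen message IDs in durable storage.

```python
class IdempotentConsumer:
    """Processes each message ID at most once, even if it is delivered again."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # assumption: in memory only; real systems persist this

    def receive(self, message_id, payload):
        if message_id in self.seen:
            return False  # duplicate delivery, safely ignored
        self.handler(payload)
        # Mark as seen only after the handler succeeded, so a failed
        # attempt can be retried.
        self.seen.add(message_id)
        return True


def deliver_at_least_once(consumer, message_id, payload, max_attempts=3):
    """Retry delivery until the consumer accepts the message.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            consumer.receive(message_id, payload)
            return attempt
        except Exception:
            continue  # treat any error as "not acknowledged" and redeliver
    raise RuntimeError("delivery failed after %d attempts" % max_attempts)
```

The sender may deliver the same message several times; deduplication on the consumer side is what makes the overall effect look like exactly-once processing.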

MIX & MATCH

Crunch numbers in the cloud

Application servers

Slow running tasks

Temporary services

Test servers

Automation: automatic deployment of multi-tiered environments

MIX & MATCH, CONTD.

Traditional servers for:

Incompatible apps (single-threaded, memory, disk intensive, specialized hardware) do not work well in cloud environments

Database servers are best kept on dedicated machines

OK, so we have an (endlessly) scalable cloud app now.

What are we forgetting?

DATABASES, NOSQL

def. NoSQL - a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.

DATABASES

Postgres, Hadoop, MongoDB, Cassandra, Riak

In-memory dataset for faster operation

No predefined structure

Integrated sharding, load-balancing and failover

Versatility - can be used for anything from data storage to real-time messaging to search indexes
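
To make "no predefined structure" and "integrated sharding" more concrete, here is a toy sketch of hash-based shard routing over schemaless documents. It illustrates the general idea only and is not how MongoDB, Cassandra or Riak implement it internally; `shard_for` and `ShardedStore` are hypothetical names, and simple modulo hashing is used instead of the more robust schemes real systems rely on.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


class ShardedStore:
    """Toy key-document store; each shard is a plain dict."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, document):
        # Documents are ordinary dicts: no schema is declared up front.
        self.shards[shard_for(key, len(self.shards))][key] = document

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Documents with entirely different fields can live side by side, and the same key always routes to the same shard. Changing the shard count with modulo hashing moves most keys, which is one reason real databases handle rebalancing themselves.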

DATABASES

Use the best tool for the job depending on the task

NoSQL Advantages

Some sources generate a lot of data

Complex interconnections, cyclical dependencies

Aggregations must be performed on both new and old data

Structure of foreign sources may change on short notice
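
When the structure of a foreign source changes on short notice, schemaless documents let old and new records sit in the same collection, and the aggregation code only has to tolerate missing fields. A small sketch of that idea follows; the field names `source` and `likes` are invented for the example.

```python
from collections import defaultdict


def total_by_source(documents, field="likes"):
    """Sum a numeric field per source, tolerating documents that lack it."""
    totals = defaultdict(int)
    for doc in documents:
        source = doc.get("source", "unknown")
        totals[source] += doc.get(field, 0)  # a missing field counts as 0
    return dict(totals)
```

Old documents without the field and new documents with extra fields are aggregated together without any migration step.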

DATABASES

Use the best tool for the job depending on the task

NoSQL Disadvantages (Classic SQL advantages)

Usually not fully ACID compliant

Limited or no multi-record transactions

No enforced relations (joins, foreign keys) between data

Lack of structure can make aggregations slow

DATABASES, LONG TERM STRATEGY

Data quickly becomes irrelevant

Archive it, but keep it accessible

Online Data Warehouse solutions

Amazon Redshift

Keep Everything

Terabytes for pennies
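
The "archive it, but keep it accessible" strategy amounts to separating recent (hot) records from older ones that move to cheaper storage. The sketch below shows only that split, under the assumption that each record carries a `created` timestamp; the 90-day threshold and the function name are illustrative, and loading the archive into a warehouse such as Amazon Redshift is outside its scope.

```python
from datetime import datetime, timedelta


def partition_by_age(records, now, max_age_days=90):
    """Split records into (hot, archive) by their 'created' timestamp."""
    cutoff = now - timedelta(days=max_age_days)
    hot, archive = [], []
    for record in records:
        if record["created"] >= cutoff:
            hot.append(record)
        else:
            archive.append(record)  # nothing is deleted, only moved
    return hot, archive
```

Nothing is dropped: every record ends up in exactly one of the two lists, which matches the "keep everything" point above.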

Summary

The cloud rocks, mmkay?

Questions?