Opportunities to Improve System Reliability and Resilience by Donald Belcham


Opportunities to Improve System Reliability and Resilience Donald Belcham .NET Conf UY 2014 http://netconf.uy

Transcript of Opportunities to Improve System Reliability and Resilience by Donald Belcham

Page 1: Opportunities to Improve System Reliability and Resilience by Donald Belcham

System Reliability and Resilience

and stuff

Page 2: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Some things need to be cleared up first

Page 3: Opportunities to Improve System Reliability and Resilience by Donald Belcham

http://en.wikipedia.org/wiki/Vedette_(cabaret)

Page 4: Opportunities to Improve System Reliability and Resilience by Donald Belcham

tuple

Page 5: Opportunities to Improve System Reliability and Resilience by Donald Belcham

// Initialize customer and invoice
Initialize(customer, invoice);

Page 6: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public void Initialize(Customer customer, Invoice invoice) {
    customer.Name = "asdf";
    invoice.Date = DateTime.Now;
}

Page 7: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Initialize(customer, invoice);
// did something happen to customer
// and/or invoice?

Page 8: Opportunities to Improve System Reliability and Resilience by Donald Belcham

customer.Name = InitNameFrom(customer, invoice);
invoice.Date = InitDateFrom(customer, invoice);

Page 9: Opportunities to Improve System Reliability and Resilience by Donald Belcham

customer.Name = GetNameFrom(customer, invoice);
invoice.Date = GetDateFrom(customer, invoice);

Page 10: Opportunities to Improve System Reliability and Resilience by Donald Belcham

var results = Initialize(customer, invoice);

customer.Name = results.Item1;
invoice.Date = results.Item2;

Page 11: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public Tuple<string, DateTime> Initialize(Customer customer, Invoice invoice) {
    return new Tuple<string, DateTime>("asdf", DateTime.Now);
}

Page 12: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public static bool TryParse(string s, out DateTime result)

or

public static Tuple<bool, DateTime?> TryParse(string s)

Page 13: Opportunities to Improve System Reliability and Resilience by Donald Belcham

tuple

• Avoid side effects

• Avoid out parameters

• Return multiple values without a specific type
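The tuple-returning TryParse from the earlier slide can be sketched as a thin wrapper. This is a minimal sketch, not the speaker's code: the `DateParser` class is a hypothetical name, and it assumes the invariant culture is acceptable for parsing. It still calls the built-in `DateTime.TryParse` internally, but the caller never sees an out parameter.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, introduced only for this illustration.
public static class DateParser
{
    // Returns the success flag and the parsed value together,
    // so the caller does not need an out parameter.
    public static Tuple<bool, DateTime?> TryParse(string s)
    {
        DateTime parsed;
        bool ok = DateTime.TryParse(s, CultureInfo.InvariantCulture,
                                    DateTimeStyles.None, out parsed);
        // Item2 is null when parsing failed, so a bogus default date never leaks out.
        return Tuple.Create(ok, ok ? (DateTime?)parsed : null);
    }
}
```

A caller reads `result.Item1` for success and `result.Item2` for the value, which keeps the call site free of variables that may or may not have been modified.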

Page 14: Opportunities to Improve System Reliability and Resilience by Donald Belcham

null object

Page 15: Opportunities to Improve System Reliability and Resilience by Donald Belcham

private ILogger _logger;

public MyClass(ILogger logger) {
    _logger = logger;
}

if (_logger != null) {
    _logger.Debug("it worked on my machine!");
}

Page 16: Opportunities to Improve System Reliability and Resilience by Donald Belcham

null checks for everyone!

Page 17: Opportunities to Improve System Reliability and Resilience by Donald Belcham

forget one and…

Page 18: Opportunities to Improve System Reliability and Resilience by Donald Belcham

public class NullLogger : ILogger {
    public void Debug(string text) {
        // do sweet nothing
    }
}

Page 19: Opportunities to Improve System Reliability and Resilience by Donald Belcham

private ILogger _logger = new NullLogger();

public MyClass(ILogger logger) {
    _logger = logger;
}

_logger.Debug("it worked on my machine!");

Page 20: Opportunities to Improve System Reliability and Resilience by Donald Belcham

null object

• Can eliminate null checks

• Simple to implement
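Putting the null object slides together gives something like the sketch below. `ILogger`, `NullLogger` and `MyClass` come from the slides; `ListLogger` and `DoWork` are hypothetical additions for illustration. One assumption goes beyond the slides: the constructor falls back to the null object when it is handed null, so the field can never end up null.

```csharp
using System;
using System.Collections.Generic;

public interface ILogger
{
    void Debug(string text);
}

// Null object: satisfies ILogger but deliberately does nothing.
public class NullLogger : ILogger
{
    public void Debug(string text) { }
}

// Hypothetical logger that records messages, used only to show a real one plugged in.
public class ListLogger : ILogger
{
    public readonly List<string> Messages = new List<string>();
    public void Debug(string text) { Messages.Add(text); }
}

public class MyClass
{
    private readonly ILogger _logger;

    // Assumption (not on the slides): a null argument falls back to the
    // null object, so _logger is never null inside the class.
    public MyClass(ILogger logger = null)
    {
        _logger = logger ?? new NullLogger();
    }

    public void DoWork()
    {
        _logger.Debug("it worked on my machine!"); // no null check needed
    }
}
```

With this in place, `new MyClass().DoWork()` and `new MyClass(null).DoWork()` both run without a NullReferenceException, and no call site needs its own null check.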

Page 21: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Circuit Breaker

Page 22: Opportunities to Improve System Reliability and Resilience by Donald Belcham
Page 23: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Retry

Page 24: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[Diagram: Your Application calls an Out of Process Dependency N times]

Page 25: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[Diagram: Out of Process Dependency called N times * Y clients]

Page 26: Opportunities to Improve System Reliability and Resilience by Donald Belcham

= Denial of Service Attack

Page 27: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Limit the # of retries

Page 28: Opportunities to Improve System Reliability and Resilience by Donald Belcham

N * Y becomes 5 * Y
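Capping the retries can be sketched as follows. This is a minimal sketch under stated assumptions: the `Retry` class, its `Execute` method and the parameter names are hypothetical, and the pause between attempts is an addition that anticipates the later slides. Each client now makes at most `maxAttempts` calls (5 in the slide's example) instead of an unbounded N.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper class, introduced only for this illustration.
public static class Retry
{
    // Runs the operation at most maxAttempts times, pausing between attempts.
    // If every attempt fails, all failures are rethrown inside one
    // AggregateException so no exception information is lost.
    public static T Execute<T>(Func<T> operation, int maxAttempts, TimeSpan pause)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        var failures = new List<Exception>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                failures.Add(ex);
                // Do not sleep after the final attempt; just fall through and fail.
                if (attempt < maxAttempts) Thread.Sleep(pause);
            }
        }
        throw new AggregateException("All retry attempts failed.", failures);
    }
}
```

A call such as `Retry.Execute(() => client.Fetch(), 5, TimeSpan.FromMilliseconds(10))` (with `client.Fetch` standing in for any remote call) bounds the load from one client, but the number of clients is untouched.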

Page 29: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Y is still a problem

Page 30: Opportunities to Improve System Reliability and Resilience by Donald Belcham
Page 31: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Circuit Breaker

Page 32: Opportunities to Improve System Reliability and Resilience by Donald Belcham
Page 33: Opportunities to Improve System Reliability and Resilience by Donald Belcham

State Machine

On :: Off

Page 34: Opportunities to Improve System Reliability and Resilience by Donald Belcham

On → Off when not healthy

Page 35: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Off → On manually

Page 36: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Get to software before we ask you to dance

Page 37: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Healthy or Unhealthy

[Diagram: Out of Process Dependency]

Page 38: Opportunities to Improve System Reliability and Resilience by Donald Belcham

State is independent of requestor

[Diagram: Out of Process Dependency]

Page 39: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[Diagram: Your Application]

Has many independent external dependencies

Page 40: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[Diagram: Your Application]

Can throttle itself

Page 41: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[Diagram: Your Application]

Has a wait threshold

Page 42: Opportunities to Improve System Reliability and Resilience by Donald Belcham

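[Sequence diagram: Your Application, Circuit Breaker, External Dependency]

Threshold = 2, Pause = 10ms, Timeout = 30s, State = Closed

Your Application → Circuit Breaker: Request
Circuit Breaker → External Dependency: Request
External Dependency → Circuit Breaker: Failure (e.g. HTTP 500)
Failure Count = 1, Pause 10ms
Circuit Breaker → External Dependency: Request
External Dependency → Circuit Breaker: Failure (e.g. HTTP 500)
Failure Count = 2, State = Open
Circuit Breaker → Your Application: OperationFailedException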

Page 43: Opportunities to Improve System Reliability and Resilience by Donald Belcham

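[Sequence diagram: Your Application, Circuit Breaker, External Dependency]

Threshold = 2, Pause = 10ms, Timeout = 30s, State = Open

Your Application → Circuit Breaker: Request
30s has not passed
Circuit Breaker → Your Application: CircuitBreakerOpenException
Your Application → Circuit Breaker: Request
30s has not passed
Circuit Breaker → Your Application: CircuitBreakerOpenException

System can try to become healthy for 30s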

Page 44: Opportunities to Improve System Reliability and Resilience by Donald Belcham

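[Sequence diagram: Your Application, Circuit Breaker, External Dependency]

Threshold = 2, Pause = 10ms, Timeout = 30s, State = ½ Open

30s has passed
Your Application → Circuit Breaker: Request
Circuit Breaker → External Dependency: Request
External Dependency → Circuit Breaker: Failure (e.g. HTTP 500)
Failure Count = 2, State = Open
Circuit Breaker → Your Application: OperationFailedException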

Page 45: Opportunities to Improve System Reliability and Resilience by Donald Belcham

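[Sequence diagram: Your Application, Circuit Breaker, External Dependency]

Threshold = 2, Pause = 10ms, Timeout = 30s, State = ½ Open

30s has passed
Your Application → Circuit Breaker: Request
Circuit Breaker → External Dependency: Request
External Dependency → Circuit Breaker: Response
Failure Count = 0, State = Closed
Circuit Breaker → Your Application: Response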

Page 46: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[State diagram: Closed, Open, ½ Open]

Page 47: Opportunities to Improve System Reliability and Resilience by Donald Belcham

½ Open is like a manual reset

Page 48: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Pause

Timeout

Page 49: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Pause between calls in the loop

Page 50: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Timeout before you can call again

Page 51: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Exceptions

Page 52: Opportunities to Improve System Reliability and Resilience by Donald Belcham

OperationFailed:

AggregateException

Page 53: Opportunities to Improve System Reliability and Resilience by Donald Belcham

CircuitBreakerOpen:

ApplicationException

Page 54: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Don’t Lose Exception Info

Page 55: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Always use InnerException(s)
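The states, the pause, the timeout and the two exception types from the preceding slides fit together roughly as in the sketch below. It is one possible reading of the diagrams, not the speaker's implementation. The exception names and their base classes come from the slides; the `CircuitBreaker` class shape, `Execute`, `CircuitState`, `FailureCount` and the injectable clock are assumptions made for illustration. The sketch is not thread-safe, and it assumes any success resets the failure count.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public enum CircuitState { Closed, Open, HalfOpen }

// Carries every failure from the retry loop as InnerExceptions.
public class OperationFailedException : AggregateException
{
    public OperationFailedException(IEnumerable<Exception> failures)
        : base("The operation failed and the circuit is now open.", failures) { }
}

public class CircuitBreakerOpenException : ApplicationException
{
    public CircuitBreakerOpenException()
        : base("The circuit is open; the dependency was not called.") { }
}

// Sketch only: not thread-safe.
public class CircuitBreaker
{
    private readonly int _threshold;
    private readonly TimeSpan _pause;
    private readonly TimeSpan _timeout;
    private readonly Func<DateTime> _clock; // injectable so tests can control time
    private DateTime _openedAt;

    public CircuitState State { get; private set; }
    public int FailureCount { get; private set; }

    public CircuitBreaker(int threshold, TimeSpan pause, TimeSpan timeout,
                          Func<DateTime> clock = null)
    {
        _threshold = threshold;
        _pause = pause;
        _timeout = timeout;
        _clock = clock ?? (() => DateTime.UtcNow);
        State = CircuitState.Closed;
    }

    public T Execute<T>(Func<T> operation)
    {
        if (State == CircuitState.Open)
        {
            // Timeout: how long callers are refused before one trial call is allowed.
            if (_clock() - _openedAt < _timeout)
                throw new CircuitBreakerOpenException();
            State = CircuitState.HalfOpen;
        }

        var failures = new List<Exception>();
        while (true)
        {
            try
            {
                T result = operation();
                FailureCount = 0; // assumption: any success resets the count
                State = CircuitState.Closed;
                return result;
            }
            catch (Exception ex)
            {
                failures.Add(ex);
                // A failed trial call in HalfOpen reopens at once; the count stays put.
                if (State != CircuitState.HalfOpen) FailureCount++;
                if (State == CircuitState.HalfOpen || FailureCount >= _threshold)
                {
                    State = CircuitState.Open;
                    _openedAt = _clock();
                    throw new OperationFailedException(failures);
                }
                Thread.Sleep(_pause); // Pause: wait between calls in the loop
            }
        }
    }
}
```

With Threshold = 2, a failing dependency is called twice before the breaker opens, and callers then get `CircuitBreakerOpenException` without the dependency being touched until the timeout has passed.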

Page 56: Opportunities to Improve System Reliability and Resilience by Donald Belcham

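[Sequence diagram: Your Application, Circuit Breaker, External Dependency]

Threshold = 3, State = Closed

Your Application → Circuit Breaker: Request
Circuit Breaker → External Dependency: Request
External Dependency → Circuit Breaker: Failure (e.g. HTTP 500)
Failure Count = 1
Circuit Breaker → External Dependency: Request
External Dependency → Circuit Breaker: Failure (e.g. HTTP 500)
Failure Count = 2
Circuit Breaker → External Dependency: Request?
External Dependency → Circuit Breaker: Response
Failure Count = 0, State = Closed
Circuit Breaker → Your Application: Response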

Page 57: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Segregate Dependencies

Page 58: Opportunities to Improve System Reliability and Resilience by Donald Belcham

circuitBreaker("database")

circuitBreaker("weatherservice")

Page 59: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Dependency type, endpoint svc, endpoint
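One way to keep a separate breaker per dependency is a small registry keyed by name, so that tripping the database breaker leaves the weather service untouched. This is a sketch under assumptions: `CircuitBreakers.For` and `DependencyBreaker` are hypothetical names, and `DependencyBreaker` is only a stand-in that tracks an open flag rather than a full breaker.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a real breaker: it only tracks its own open flag.
public class DependencyBreaker
{
    public string Name { get; private set; }
    public bool IsOpen { get; private set; }

    public DependencyBreaker(string name) { Name = name; }

    public void Trip() { IsOpen = true; }
}

public static class CircuitBreakers
{
    // One breaker per key; the key could be a dependency type,
    // an endpoint service, or a single endpoint.
    private static readonly ConcurrentDictionary<string, DependencyBreaker> Breakers =
        new ConcurrentDictionary<string, DependencyBreaker>(StringComparer.OrdinalIgnoreCase);

    // Returns the shared breaker for a dependency, creating it on first use.
    public static DependencyBreaker For(string dependency)
    {
        return Breakers.GetOrAdd(dependency, name => new DependencyBreaker(name));
    }
}
```

How fine-grained the key should be is the design choice here: a key per dependency type is simplest, while a key per endpoint isolates one slow operation from the healthy ones on the same service.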

Page 60: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Where?

Page 61: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[Diagram: Your Application, Out of Process Dependency, Circuit Breaker, Proxy]
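Placing the breaker inside a proxy might look like the sketch below: the proxy exposes the same interface as the real client, so the application does not know a breaker is involved. Every name here (`IWeatherService`, `ICircuitBreaker`, `WeatherServiceProxy`, and the two fakes) is hypothetical and exists only for illustration.

```csharp
using System;

// Hypothetical dependency contract.
public interface IWeatherService
{
    string GetForecast(string city);
}

// Hypothetical breaker contract.
public interface ICircuitBreaker
{
    T Execute<T>(Func<T> operation);
}

// Proxy: same interface as the real client, but every call goes through the breaker.
public class WeatherServiceProxy : IWeatherService
{
    private readonly IWeatherService _inner;
    private readonly ICircuitBreaker _breaker;

    public WeatherServiceProxy(IWeatherService inner, ICircuitBreaker breaker)
    {
        _inner = inner;
        _breaker = breaker;
    }

    public string GetForecast(string city)
    {
        return _breaker.Execute(() => _inner.GetForecast(city));
    }
}

// Fakes for illustration: a canned service and a breaker that only counts calls.
public class FakeWeatherService : IWeatherService
{
    public string GetForecast(string city) { return "sunny in " + city; }
}

public class CountingBreaker : ICircuitBreaker
{
    public int Calls { get; private set; }

    public T Execute<T>(Func<T> operation)
    {
        Calls++;
        return operation();
    }
}
```

The application depends only on `IWeatherService`, so the breaker can be added or removed where the object graph is composed without touching calling code.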

Page 62: Opportunities to Improve System Reliability and Resilience by Donald Belcham

Watch for Inception

Page 63: Opportunities to Improve System Reliability and Resilience by Donald Belcham

[Diagram: Your Application, Web Service, Circuit Breaker, Circuit Breaker, Proxy, Database, Repository]

Page 64: Opportunities to Improve System Reliability and Resilience by Donald Belcham

circuit breaker

• retry looping

• slow down attempts

• good neighbour

Page 65: Opportunities to Improve System Reliability and Resilience by Donald Belcham

¡Muchas gracias!

Page 66: Opportunities to Improve System Reliability and Resilience by Donald Belcham

gracias

Donald Belcham

@dbelcham

[email protected]