Post on 28-Nov-2014
Caching in the Distributed Environment
Based on the article published in the Microsoft Architecture Journal, Issue 17
Available online at http://www.msarchitecturejournal.com/pdf/Journal17.pdf
Abhijit Gadkari
http://msdn.microsoft.com/en-us/arcjournal/default.aspx
Agenda
Background info and basics
Different types of cache: temporal, spatial, primed, and demand
Some Examples
Caching in the ORM world!
Transactional cache and Shared cache
Managing the interaction
Size of a cache and its impact on application performance
Five-minute introduction to “Velocity”, Microsoft's distributed caching platform
Open Forum!
Basics
Data is held in fast memory close to the processor (L1, L2, L3, etc.), known as cache. This concept is used extensively in the von Neumann architecture.
Memory performance is measured in access time: given an address, the memory presents the data some time later.
Memory Access Time = Latency + Transfer Size / Transfer Rate [2]
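As a worked example of the formula above, here is a minimal Python sketch. The latency and transfer-rate figures are hypothetical, chosen only to show how latency dominates for slow devices; they are not measurements from the article.

```python
def access_time(latency_s, size_bytes, rate_bytes_per_s):
    """Memory access time = latency + transfer size / transfer rate."""
    return latency_s + size_bytes / rate_bytes_per_s


# Hypothetical figures for illustration only.
ram_time = access_time(100e-9, 4096, 10e9)    # 100 ns latency, 10 GB/s
disk_time = access_time(10e-3, 4096, 100e6)   # 10 ms latency, 100 MB/s

print(f"RAM:  {ram_time:.9f} s")
print(f"Disk: {disk_time:.9f} s")
```

With these assumed numbers, reading the same 4 KB block is dominated by transfer time in RAM and by latency on disk.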
[Figure: storage hierarchy (on board, RAM, hard disk, cloud) compared by cost per byte, storage size, latency, and persistence]
Types of Data
[Figure: data divided into reference data, activity data, and resource data]
Understanding the different types of data and their semantics helps to understand the different caching needs that come with each data type. [1]
Data Type [1] | Caching Strategy [1]
Reference Data | Practically immutable, non-volatile, and long-lasting; an ideal candidate for caching. Can be shared across processes and applications. For example, zip codes, state list, department list.
Activity Data | Generated by the currently executing activity as part of a business transaction. Only good for the life of the transaction; short-lived. For example, a shopping cart on an e-commerce web site.
Resource Data | Highly dependent on domain logic and volatile in nature. Cache only when absolutely required. Commonly associated keywords: concurrency, locking, ACID, dirty read, corrupt cache, business logic. For example, quantity information in an inventory application.
Unknown | DO NOT CACHE [ME]
“Keep a data item in electronic memory if its access frequency is five minutes or higher, otherwise keep it in magnetic memory” [2]
Wikipedia defines cache as “a temporary area where frequently accessed data can be stored for rapid access” [3]
Why? For performance and availability
Principle of Locality: based on work done in 1959 on the Atlas system's virtual memory [4]
Temporal Cache: good for frequently accessed, relatively nonvolatile data. For example, a drop-down list on a web page
Spatial Cache: data adjacent to recently referenced data will be requested in the near future. For example, GridView paging
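The spatial idea can be sketched as a paging cache that also loads the neighbouring page. This is a minimal Python illustration, not GridView code; `PagePrefetchCache`, `load_page`, and the sample rows are hypothetical names and data introduced for the example.

```python
class PagePrefetchCache:
    """Spatial cache sketch: fetching one page also loads the next page."""

    def __init__(self, load_page):
        self._load_page = load_page   # hypothetical loader: page number -> rows
        self._pages = {}
        self.loads = 0                # how many times the backing store was read

    def _ensure(self, page):
        if page not in self._pages:
            self._pages[page] = self._load_page(page)
            self.loads += 1

    def get(self, page):
        self._ensure(page)
        self._ensure(page + 1)        # prefetch the adjacent page
        return self._pages[page]


ROWS = list(range(100))               # stand-in for a result set
PAGE_SIZE = 10


def load_page(page):
    start = page * PAGE_SIZE
    return ROWS[start:start + PAGE_SIZE]


cache = PagePrefetchCache(load_page)
first = cache.get(0)    # loads pages 0 and 1
second = cache.get(1)   # page 1 is already cached; page 2 is prefetched
```

When the user pages forward, the requested page is already in memory and only the following page has to be read.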
Temporal Cache
using System.Web.Caching;
public sealed class Cache : IEnumerable
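To illustrate the temporal pattern independently of the .NET class above, here is a minimal Python sketch of a time-to-live cache. It does not reproduce the `System.Web.Caching.Cache` API; `TemporalCache`, `load_states`, and the injected clock are assumptions made for the example.

```python
import time


class TemporalCache:
    """Temporal cache sketch: keep a value until its time-to-live expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock           # injectable so the example is deterministic
        self._items = {}              # key -> (value, expires_at)

    def get(self, key, loader):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]           # still fresh: serve from cache
        value = loader()              # missing or expired: reload
        self._items[key] = (value, now + self._ttl)
        return value


calls = []


def load_states():
    calls.append(1)                   # count trips to the backing store
    return ["CA", "NY", "TX"]


now = [0.0]                           # fake clock for the example
cache = TemporalCache(ttl_seconds=60, clock=lambda: now[0])
cache.get("states", load_states)      # miss: loads the list
cache.get("states", load_states)      # hit: served from memory
now[0] = 61.0
states = cache.get("states", load_states)   # expired: reloads
```

A drop-down list that changes rarely is read many times between reloads, which is what makes it a good temporal-cache candidate.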
Spatial Cache
In .NET, cache can be synchronized using SqlCacheDependency
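The general idea of keeping a cache synchronized with its data source can be sketched as follows. This Python example is only an analogy: it does not show how SqlCacheDependency works internally, and `VersionedSource` and `DependentCache` are hypothetical classes in which a change counter stands in for the database's change signal.

```python
class VersionedSource:
    """Hypothetical stand-in for a table that exposes a change counter."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.version = 0

    def update(self, rows):
        self.rows = list(rows)
        self.version += 1             # signal that cached copies are stale


class DependentCache:
    """Cache whose content is invalidated when the source changes."""

    def __init__(self, source):
        self._source = source
        self._value = None
        self._version = None
        self.reloads = 0

    def get(self):
        if self._version != self._source.version:
            self._value = list(self._source.rows)   # refresh the stale copy
            self._version = self._source.version
            self.reloads += 1
        return self._value


source = VersionedSource(["widget"])
cache = DependentCache(source)
cache.get()                           # first read loads the rows
cache.get()                           # unchanged source: no reload
source.update(["widget", "gadget"])
rows = cache.get()                    # source changed: reload
```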
Primed and Demand Cache [5,6]
Primed and demand caches are based on the future use of the data. Predicting the future is not easy and should be based on sound engineering principles
The primed cache pattern is applicable when the cache, or part of it, can be predicted in advance. For example, a web browser cache
The demand cache pattern is useful when the cache cannot be predicted in advance. For example, a cached copy of user credentials
The primed cache is populated at the beginning of the application, whereas the demand cache is populated during the execution of the application
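The difference in when each cache is populated can be sketched in a few lines of Python. The class names and the sample data below are hypothetical and serve only to contrast the two patterns.

```python
class PrimedCache:
    """Populated once, up front, from data that can be predicted in advance."""

    def __init__(self, load_all):
        self._items = dict(load_all())

    def get(self, key):
        return self._items.get(key)

    def __len__(self):
        return len(self._items)


class DemandCache:
    """Populated lazily, one entry at a time, as requests arrive."""

    def __init__(self, load_one):
        self._load_one = load_one
        self._items = {}

    def get(self, key):
        if key not in self._items:
            self._items[key] = self._load_one(key)
        return self._items[key]

    def __len__(self):
        return len(self._items)


STATES = {"CA": "California", "NY": "New York"}         # hypothetical data
CREDENTIALS = {"alice": "token-a", "bob": "token-b"}    # hypothetical data

primed = PrimedCache(lambda: STATES)
demand = DemandCache(lambda user: CREDENTIALS[user])

state = primed.get("CA")        # already present at start-up
token = demand.get("alice")     # loaded on first request, then cached
```

The primed cache has its full size from the start, while the demand cache grows with the set of keys actually requested.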
Primed Cache
In .NET, the ICachedReport interface can be used to store pre-populated reports. The primed cache results in an almost constant-size cache structure
Demand Cache
1 user can have many roles
1 role can have many permissions
Managing demand cache:
Minimize memory leak
Maximize hit-ratio
Effective eviction policy
In a dynamic environment, adaptive caching strategies can be very effective
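One common way to bound a demand cache and measure its hit ratio is least-recently-used (LRU) eviction. The Python sketch below is an illustration under that assumption; the slides do not prescribe a specific eviction policy, and `LruCache` and the `ROLES` data are hypothetical.

```python
from collections import OrderedDict


class LruCache:
    """Demand cache with a fixed capacity and least-recently-used eviction."""

    def __init__(self, capacity, loader):
        self._capacity = capacity
        self._loader = loader
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._loader(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # evict the least recently used
        return value

    def keys(self):
        return list(self._items)

    @property
    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


ROLES = {"alice": ["admin"], "bob": ["reader"], "carol": ["editor"]}

cache = LruCache(capacity=2, loader=lambda user: ROLES[user])
for user in ["alice", "bob", "alice", "carol", "bob"]:
    cache.get(user)
```

The fixed capacity keeps memory bounded, and the hit and miss counters give the hit ratio needed to judge whether the eviction policy is effective.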
Caching in the ORM World!
cust_id type credit_allowed
3456 gold 1
7890 bronze 0
RDBMS
[Figure: impedance mismatch between the in-memory object graph and RDBMS persistent storage; ORM examples: MS Entity Framework / LINQ, JDO, TopLink, Hibernate, NHibernate]
The ORM manager populates the data stored in persistent storage, such as a database, in the form of an object graph. An object graph is a good caching candidate
[Figure: Customer object with Gold, Silver, and Bronze types]
Layered Cache Architecture
The layering principle is based on the explicit separation of responsibilities
Cache layering is prevalent in many ORM solutions. For example, Velocity, Hibernate
The first layer represents the transactional cache, and the second layer is the shared cache, designed as a process or clustered cache
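The two layers can be sketched as a per-transaction cache sitting in front of a shared cache. This Python example is a simplified illustration, not the design of any particular ORM; `LayeredSession`, the dictionary-based store, and the evict-on-commit choice are assumptions made for the sketch.

```python
class LayeredSession:
    """First layer: private transactional cache. Second layer: shared cache."""

    def __init__(self, shared, store):
        self._shared = shared         # second layer, shared by all sessions
        self._store = store           # hypothetical backing store (a dict)
        self._local = {}              # first layer, private to this transaction
        self._dirty = set()

    def get(self, key):
        if key in self._local:
            return self._local[key]
        if key not in self._shared:
            self._shared[key] = self._store[key]   # read through to the store
        self._local[key] = self._shared[key]
        return self._local[key]

    def set(self, key, value):
        self._local[key] = value      # invisible to other sessions until commit
        self._dirty.add(key)

    def commit(self):
        for key in self._dirty:
            self._store[key] = self._local[key]
            self._shared.pop(key, None)   # evict so the next read reloads
        self.rollback()

    def rollback(self):
        self._local.clear()
        self._dirty.clear()


store = {"cust:3456": "gold"}
shared = {}
s1 = LayeredSession(shared, store)
s2 = LayeredSession(shared, store)

s1.get("cust:3456")
s1.set("cust:3456", "silver")
seen_by_s2 = s2.get("cust:3456")      # s2 does not see the uncommitted change
s1.commit()
seen_after = LayeredSession(shared, store).get("cust:3456")
```

Uncommitted changes stay in the first layer, so the shared layer only ever serves committed data.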
Transactional Cache
Objects formed in a valid state and participating in a transaction can be stored in the transactional cache
Strictly bounded by the ACID rules
The transactional cache is small and short-lived
Thrashing, cache corruption, and caching conflicts should be strictly avoided
Many caching frameworks offer an out-of-the-box, prepackaged transactional cache solution
Shared Cache
Can be implemented as a process cache or clustered cache. The clustered cache introduces resource replication overhead
Shared cache is a read-only cache
Distributed caching solutions typically implement a shared cache solution
Can be implemented as an identity map. For example, caching read-only, static reports using ICachedReport
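The identity map mentioned above can be sketched as follows: each identifier maps to exactly one in-memory object, so repeated lookups return the same instance. This is a minimal Python illustration; `Customer`, `IdentityMap`, and the sample rows are hypothetical and unrelated to ICachedReport.

```python
class Customer:
    def __init__(self, cust_id, cust_type):
        self.cust_id = cust_id
        self.cust_type = cust_type


class IdentityMap:
    """Keeps exactly one in-memory object per identifier."""

    def __init__(self, load):
        self._load = load             # hypothetical loader: id -> object
        self._objects = {}

    def get(self, cust_id):
        if cust_id not in self._objects:
            self._objects[cust_id] = self._load(cust_id)
        return self._objects[cust_id]


ROWS = {3456: "gold", 7890: "bronze"}   # sample rows
loads = []


def load_customer(cust_id):
    loads.append(cust_id)               # record each trip to the store
    return Customer(cust_id, ROWS[cust_id])


customers = IdentityMap(load_customer)
a = customers.get(3456)
b = customers.get(3456)                 # same object, no second load
```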
Chasing the Right Size Cache
Remember the 80-20 rule, a.k.a. the Pareto principle, and the bell-shaped graph
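One way to explore how cache size affects hit ratio is to replay a skewed access trace against caches of different capacities. The Python sketch below assumes LRU eviction and a synthetic trace in which low-numbered keys are requested far more often than the rest; both are assumptions for illustration, not data from the article.

```python
import random
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay a trace against an LRU cache and return the fraction of hits."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            cache.move_to_end(key)
            hits += 1
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)
    return hits / len(trace)


KEYS = 100
# Synthetic skewed workload: key k appears 1000 // (k + 1) times.
trace = [k for k in range(KEYS) for _ in range(1000 // (k + 1))]
random.Random(17).shuffle(trace)      # fixed seed keeps the run repeatable

sizes = [5, 10, 20, 50, 100]
ratios = {size: lru_hit_ratio(trace, size) for size in sizes}
for size in sizes:
    print(f"capacity {size:3d}: hit ratio {ratios[size]:.2f}")
```

Running it against a trace from a real application shows where adding more capacity stops paying off.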
Microsoft project code named Velocity [1]
http://msdn.microsoft.com/fi-fi/library/cc645013(en-us).aspx
Distributed in-memory application cache platform
Can store any serializable CLR object
Allows clustering and provides an ASP.NET session provider object so that ASP.NET session objects can be stored in the distributed cache without having to write to the database
[Figure: conventional stack (applications, web servers / app servers, database) compared with a stack that adds a distributed cache]
[Figure: Velocity presents one logical view to applications over the physical implementation; it holds named caches, each divided into regions]
Features [1]
Machine -> Cache Host -> Named Cache -> Regions -> Cache Items -> objects
Cache Operations:
Get [select] - Returns the object or the entire cache item
Add [insert] - Creates a new entry, else raises an exception if the entry exists
Put [update] - Replaces an existing entry or creates a new one
Remove [delete] - Removes an existing entry
Expiration and Eviction Policy is based on time-to-live [TTL] logic
Concurrency model supports optimistic version based updates and pessimistic locking
“Velocity” can be deployed as a service or embedded within the application. For example, host application can be ASP.NET / .NET application
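The operation semantics and the optimistic, version-based update model listed above can be sketched with a small in-memory class. This Python example is not the Velocity API; `MiniCache`, `VersionConflict`, and the `expected_version` parameter are hypothetical names used to illustrate the behaviour.

```python
class VersionConflict(Exception):
    """Raised when an optimistic update sees a newer version than expected."""


class MiniCache:
    def __init__(self):
        self._items = {}              # key -> (value, version)

    def get(self, key):
        entry = self._items.get(key)
        return None if entry is None else entry[0]

    def get_with_version(self, key):
        return self._items.get(key)   # (value, version) or None

    def add(self, key, value):
        if key in self._items:        # insert semantics: fail if present
            raise KeyError(f"{key!r} already exists")
        self._items[key] = (value, 1)

    def put(self, key, value, expected_version=None):
        current = self._items.get(key)
        actual = current[1] if current else 0
        if expected_version is not None and actual != expected_version:
            raise VersionConflict(key)
        self._items[key] = (value, actual + 1)   # replace or create
        return actual + 1

    def remove(self, key):
        return self._items.pop(key, None) is not None


cache = MiniCache()
cache.add("toy-101", "thomas")
value, version = cache.get_with_version("toy-101")
new_version = cache.put("toy-101", "percy", expected_version=version)
```

A second writer still holding the old version number would get a `VersionConflict` instead of silently overwriting the newer value.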
Example [1]
// Create instance of CacheFactory (reads appconfig)
CacheFactory fac = new CacheFactory();
// Get a named cache from the factory
Cache catalog = fac.GetCache("catalogcache");
// Simple Get/Put
catalog.Put("toy-101", new Toy("thomas", .,.));
// From the same or a different client
Toy toyObj = (Toy)catalog.Get("toy-101");
// Region based Get/Put
catalog.CreateRegion("toyRegion");
// Both toy and toyparts are put in the same region
catalog.Put("toyRegion", "toy-101", new Toy( .,.));
catalog.Put("toyRegion", "toypart-100", new ToyParts(…));
// Read the toy back from its region
toyObj = (Toy)catalog.Get("toyRegion", "toy-101");
Resources
Based on the paper “Caching in the Distributed Environment” published in the Microsoft Architecture Journal, Issue 17
1. Microsoft Project Code Named “Velocity” by N. Sampathkumar, M. Krishnaprasad and A. Nori
2. Transaction Processing: Concepts and Techniques by Jim Gray and Andreas Reuter [ISBN: 1558601902]
3. http://en.wikipedia.org/wiki/Cache
4. “The Locality Principle” by Peter J. Denning, Communications of the ACM, July 2005, Vol 48, No 7
5. “Caching Patterns and Implementation” by Octavian Paul Rotaru, Leonardo Journal of Sciences LJS: 5:8, January-June 2006
6. Data Access Patterns: Database Interactions in Object-Oriented Applications by Clifton Nock, Addison-Wesley
Open Forum!
Abhijit Gadkari
Abhijit.Gadkari@gmail.com
Blog : http://soaas.blogspot.com/