Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

42
Coimbra, April 18th, 2012 Sanne Grinovero Hibernate Team, JBoss Red Hat, Inc

description

Apresentação de Sanne Grinovero - 8º encontro PT.JUG.

Transcript of Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Page 1: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Coimbra, April 18th, 2012

Sanne GrinoveroHibernate Team, JBoss

Red Hat, Inc

Page 2: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

About me• Hibernate

• Hibernate Search

• Hibernate OGM

• Infinispan

• Lucene Directory

• Infinispan Query

in.relation.to/Bloggers/Sanne

Twitter: @SanneGrinovero

Studied at FEUP (Porto)!

Page 3: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Hibernate Object/Grid Mapper ?

JPA for NoSQL

• initially Key/Value store• we started with Infinispan

Page 4: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Relational Databases• Transactions • Referential integrity• Simple Types• Well understood- tuning, backup, resilience

Page 5: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Relational Databases

But scaling is hard!-Replication-Multiple instances w/ shared disk-Sharding

Page 6: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Relational Databases on a cloud

Master/replicas: which master?

A single master? I was promised elasticity

Less reliable “disks”

IP in configuration files? DNS update times?

Who coordinates this? How does that failover?

Page 7: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

¬SQL

more meaning NotOnlySQL

¬SQL U SQL = anything

Page 8: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

No-SQL goalsVery heterogeneus• Large datasets• High availability• Low latency / higher throughput• Specific data access pattern• Specific data structures• ...

Page 9: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

• Document based stores • Column based • Graph oriented databases• Key / value stores• Full-Text Search

NotOnlySQL

Page 10: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Choose one.Before starting.

Stick to it.

NotOnlySQL

Page 11: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Flexibility at a cost

•Programming model•one per product :-(•Often very thight code coupling•No standard drivers / stable APIs

•no schema => app driven schema•query (Map Reduce, specific DSL, ...)•data structure transpires•Transactions ?•durability / consistency puzzles

Page 12: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Where does Infinispan fit?

Distributed Key/Value store• (or Replicated, local only efficient cache, invalidating cache)

Each node is equal• Just start more nodes, or kill some

No bottlenecks• by design

Cloud-network friendly• JGroups• And “cloud storage” friendly too!

Page 13: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

But how to use it?

map.put( “user-34”, userInstance );

map.get( “user-34” );

map.remove( “user-34” );

Page 14: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

It's a ConcurrentMap !

map.put( “user-34”, userInstance );

map.get( “user-34” );

map.remove( “user-34” );

map.putIfAbsent( “user-38”, another );

Page 15: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Other Hibernate/Infinispancollaborations

● Second level cache for Hibernate ORM

● Hibernate Search indexing backend

● Infinispan Query

Page 16: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Cloud-hack experiments

Let's play with Infinispan's integration for Hibernate's second level cache design:- usually configured in clustering mode INVALIDATION.

•Let's use DIST or REPL instead.- Disable expiry/timeouts.

What's the effect on your cloud-deployed database?

Page 17: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Cloud-hack experiments

Now introduce Hibernate Search: - full-text queries should be handled by Lucene, NOT by the database.

Hibernate Search identifies hits from the Lucene index, but loads them by PK. *by default

Page 18: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

What's the work left to the database?

Page 19: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

These tools are very appropriate for the job:

Load by PK ->second level cache ->

Key/Value store

FullText query ->Hibernate Search ->

Lucene Indexes

Page 20: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

These tools are very appropriate for the job:

Load by PK ->second level cache ->

Key/Value store

FullText query ->Hibernate Search ->

Lucene Indexes

What if we now shut down the database?

Page 21: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM
Page 22: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Goals

• Encourage new data usage patterns• Familiar environment• Ease of use• Easy to jump in• Easy to jump out• Push NoSQL exploration in enterprises• “PaaS for existing API” initiative

Page 23: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

What it does

•JPA front end to key/value stores•Object CRUD (incl polymorphism and associations)•OO queries (JP-QL)

•Reuses•Hibernate Core•Hibernate Search (and Lucene)•Infinispan

•Is not a silver bullet•not for all NoSQL use cases

Page 24: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Concepts

Page 25: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Schema or no schema?

•Schema-less•move to new schema very easy•app deal with old and new structure or migrate all

data•need strict development guidelines

•Schema•reduce likelihood of rogue developer corruption•share with other apps•“didn’t think about that” bugs reduced

Page 26: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Entities as serialized blobs?

•Serialize objects into the (key) value•store the whole graph?

•maintain consistency with duplicated objects•guaranteed identity a == b•concurrency / latency•structure change and (de)serialization, class definition

changes

Page 27: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

OGM’s approach to schema

•Keep what’s best from relational model•as much as possible•tables / columns / pks

•Decorrelate object structure from data structure•Data stored as (self-described) tuples•Core types limited

•portability

Page 28: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

OGM’s approach to schema

•Store metadata for queries•Lucene index

•CRUD operations are key lookups

Page 29: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

• Entities are stored as tuples (Map<String,Object>)• Or Documents?

• The key is composed of• table name• entity id

• Collections are represented as a list of tuples- The key is composed of:

• table name hosting the collection information• column names representing the FK• column values representing the FK

How does it work?

Page 30: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM
Page 31: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Let's see some code...

Page 32: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Queries / Infinispan

•Hibernate Search indexes entities•Store Lucene indexes in Infinispan•JP-QL to Lucene query transformation

•Works for simple queries•Lucene is not a relational SQL engine

Page 33: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

select a from Animal a where a.size > 20

> animalQueryBuilder.range().onField(“size”).above(20).excludeLimit().createQuery();

select u from Order o join o.user u where o.price > 100 and u.city = “Paris”> orderQB.bool() .must( orderQB.range() .onField(“price”).above(100).excludeLimit().createQuery() ) .must( orderQB.keyword(“user.city”).matching(“Paris”) .createQuery()).createQuery();

Page 34: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Why Infinispan?

•We know it well•Supports transactions•Supports distribution of Lucene indexes•Designed for clouds•It's a key/value store with support for Map/Reduce

•Simple•Likely a common point for many other “databases”

Page 35: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Why Infinispan?

•Map/Reduce as an alternative to indexed queries•Might be chosen by a clever JP-QL engine

•Potential for additional query types

Page 36: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM
Page 37: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Why ?

Nothing new to learn for most common operations:

• JPA models

• JP-QL queries

Everything else is performance tuning, including:

• Move to/from different NoSQL implementations

• Move to/from a SQL implementation

• Move to/from clouds/laptops

• JPA is a well known standard: move to/from Hibernate :-)

Page 38: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Development state:• Query via Hibernate Search• Smart JP-QL parser is on github

• Available in master:• EHCache• Infinispan

• In development branches:• MongoDB• Voldemort

Page 39: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Summary

• Performance / scalability is different• Isolation is different

Page 40: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

http://ogm.hibernate.org

Page 41: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

http://www.jboss.org/jbw2011keynote.htmlhttps://github.com/Sanne/tweets-ogm

Page 42: Using JPA applications in the era of NoSQL: Introducing Hibernate OGM

Q + A