Dynamic Content Web Sites: Technologies & Scalability

Emmanuel Cecchet [email protected]



LinuxWorld 2003 – San Francisco – [email protected] 2

Dynamic content Web site

Web content is more and more dynamic


e-Commerce servers

Multi-tier architecture

[Diagram: Client, Internet (HTTP), Web server, application protocol, Application server, SQL, Database]


Outline

Technologies
Performance
Clustering
Conclusion


PHP

Hypertext Preprocessor
Scripting language
Module integrated in Web server

[Diagram: Client, HTTP, Web server (httpd processes, each embedding the PHP module), SQL, Database]


PHP example

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <body>
    <h1>Region list</h1>
    <?php
      // $link is the MySQL connection identifier, assumed to be opened earlier (not shown on the slide).
      $result = mysql_query("SELECT * FROM regions", $link)
          or die("ERROR: Request failed");
      if (mysql_num_rows($result) == 0)
        print("<h2>Sorry, no region, db is empty.</h2><br>");
      else
        while ($row = mysql_fetch_array($result))
        {
          print("<a href=\"BrowseCategories.php?region=".$row["id"]."\">".$row["name"]."</a><br>\n");
        }
      mysql_free_result($result);
    ?>
  </body>
</html>


PHP

Pros
  easy to learn
  ideal for small projects
  widely used
  no strong typing

Cons
  no strong typing
  code maintenance
  interpreted language
  executes in the Web server process
  ad-hoc APIs for database access


Java Servlets

Java-based
Executes in a “Servlet Container”
JDBC: unified interface for database access

[Diagram: Client, HTTP, Web server (httpd processes), AJP12, Servlet server (Tomcat servlet container running the servlets in a JVM), JDBC, Database]


Java Servlet example

public class BrowseRegions extends HttpServlet
{
  …

  public void doGet(HttpServletRequest request, HttpServletResponse response)
      throws IOException, ServletException
  {
    PrintWriter out = response.getWriter();
    out.print("<h1>Region list</h1>");
    try
    {
      // connection is assumed to be a java.sql.Connection set up in the elided part of the class.
      ResultSet rs = connection.createStatement().executeQuery("SELECT * FROM regions");
      // next() also works on forward-only result sets, where first() may be rejected.
      if (!rs.next())
        out.print("<h2>Sorry, no region, db is empty</h2><br>");
      else
        do
        {
          out.print("<a href=\"BrowseCategories?region=" + rs.getInt("id") + "\">"
              + rs.getString("name") + "</a><br>\n");
        }
        while (rs.next());
    }
    catch (Exception e)
    {
      out.print("ERROR: Request failed for the following reason: " + e);
      return;
    }
  }
}


What about JSP?

Java Server Pages
  Sun’s answer to Microsoft ASP
  “scripting for servlets”

Scripting language
Compiled into a Java servlet at the first execution
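The "compiled at the first execution" behaviour can be pictured as a compile-once cache. The sketch below is only an illustration in plain Java: `CompiledPage`, `PageCache`, and the lambda standing in for the JSP compiler are hypothetical names, not part of any servlet container's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for the servlet generated from a JSP page.
interface CompiledPage {
    String render(String param);
}

// Keeps one compiled page per path; the compiler runs on the first request only.
class PageCache {
    private final Map<String, CompiledPage> compiled = new ConcurrentHashMap<>();
    private final AtomicInteger compilations = new AtomicInteger();
    private final Function<String, CompiledPage> compiler;

    PageCache(Function<String, CompiledPage> compiler) {
        this.compiler = compiler;
    }

    String handle(String path, String param) {
        // computeIfAbsent invokes the (expensive) compiler at most once per path.
        CompiledPage page = compiled.computeIfAbsent(path, p -> {
            compilations.incrementAndGet();
            return compiler.apply(p);
        });
        return page.render(param);
    }

    int compilationCount() {
        return compilations.get();
    }
}

class JspCacheDemo {
    public static void main(String[] args) {
        // The "compiler" here just builds a trivial renderer for the given path.
        PageCache cache = new PageCache(path -> param -> "<h1>" + path + ": " + param + "</h1>");
        System.out.println(cache.handle("/regions.jsp", "first"));
        System.out.println(cache.handle("/regions.jsp", "second"));
        System.out.println("compilations = " + cache.compilationCount());
    }
}
```

Only the first request for a page pays the compilation cost; later requests reuse the cached object.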


JSP example

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <body>
    <h1>Region list</h1>
    <%
      try
      {
        ResultSet rs = connection.createStatement().executeQuery("SELECT * FROM regions");
        if (!rs.next())
        { %>
          <h2>Sorry, no region, db is empty</h2><br>
    <%  }
        else
          do
          { %>
            <a href="BrowseCategories?region=<%= rs.getInt("id") %>"><%= rs.getString("name") %></a><br>
    <%    }
          while (rs.next());
      }
      catch (Exception e)
      { %>
        ERROR: Request failed for the following reason: <%= e.getMessage() %>
    <% } %>
  </body>
</html>


Servlets/JSP

Pros
  OO programming (JSP for scripting)
  design patterns maturity
  JDBC for database access
  Servlet container independent from Web server

Cons
  Web server / Servlet server communication
  limited number of services
  OO programming is more verbose (servlets)


J2EE Servers

Java 2 Enterprise Edition
Separation of presentation and business logic

[Diagram: Client, Internet, Web server, J2EE Application Server (Web container holding the presentation logic, EJB container holding the business logic), Database]


J2EE Servers

Presentation logic
  JSP or Servlets

Business logic
  Enterprise JavaBeans (EJB)
    Entity Beans
      database mapping
        BMP (bean-managed persistence): by hand
        CMP (container-managed persistence): automatic
    Session Beans
      stateless: temporary operations
      stateful: temporary objects (shopping cart)
    Message Driven Beans
      asynchronous messages
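The stateless/stateful distinction for session beans can be sketched with plain Java objects. This deliberately avoids the EJB API; `PriceCalculator` and `ShoppingCart` are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stateless: each call is self-contained, so any instance can serve any client.
class PriceCalculator {
    // Amounts are in cents to avoid floating-point rounding.
    long totalWithTax(long netCents, int taxPercent) {
        return netCents + (netCents * taxPercent) / 100;
    }
}

// Stateful: the instance keeps per-client data between calls (a shopping cart).
class ShoppingCart {
    private final Map<String, Integer> quantities = new LinkedHashMap<>();

    void add(String item, int quantity) {
        // merge() adds to the existing quantity or inserts a new entry.
        quantities.merge(item, quantity, Integer::sum);
    }

    int quantityOf(String item) {
        return quantities.getOrDefault(item, 0);
    }

    int itemCount() {
        return quantities.values().stream().mapToInt(Integer::intValue).sum();
    }
}

class SessionBeanDemo {
    public static void main(String[] args) {
        PriceCalculator calc = new PriceCalculator();
        System.out.println(calc.totalWithTax(1000, 8));

        ShoppingCart cart = new ShoppingCart();
        cart.add("book", 1);
        cart.add("book", 2);
        cart.add("lamp", 1);
        System.out.println(cart.itemCount());
    }
}
```

A container can pool and share stateless instances freely, whereas a stateful instance has to stay associated with one client for the length of the conversation.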


J2EE

Pros
  well suited for large projects or EAI
  presentation and business logic isolation
  large number of services (transactions, security, asynchronous messaging, clustering, …)

Cons
  requires skills
    large number of specs
    impact of design on performance
  complex to set up
  portability across servers still needs improvement


Open-source offers

PHP
  implementation from php.net
  included in Apache

Servlets
  Tomcat (http://jakarta.apache.org/tomcat/)
  Jetty (http://jetty.mortbay.com)

J2EE
  JOnAS (http://jonas.objectweb.org)
  JBoss (http://jboss.org)


Outline

Technologies
Performance
Clustering
Conclusion


RUBiS Benchmark

online auction site modeled after eBay.com

9 open-source implementations
  PHP
  Servlets
  7 EJB

all results are online: http://rubis.objectweb.org/


RUBiS – PHP & Servlets

Apache/PHP vs Apache/Tomcat


JVM Performance

[Chart: throughput in requests/minute (0 to 5000) versus number of clients (200 to 600) for the IBM, JRockit, and Sun JVMs]


Design patterns: Servlets only

Presentation and business logic mixed

[Diagram: each servlet in the Web container holds both presentation logic and business logic and accesses the database]


Design patterns: Session Beans

Presentation and business logic separation

[Diagram: servlets in the Web container hold the presentation logic and call session beans in the EJB container, which hold the business logic and access the database]
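The separation of presentation and business logic can be sketched without any container. The example below is an assumption-laden illustration: `Region`, `RegionService`, and `RegionPage` are hypothetical names, the service is an in-memory stand-in for a session bean, and HTML escaping is added here although the slide examples do not show it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value object for one row of the "regions" table.
class Region {
    final int id;
    final String name;

    Region(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Business logic: knows where regions come from, nothing about HTML.
interface RegionService {
    List<Region> listRegions();
}

// Presentation logic: knows how to render, nothing about the database.
class RegionPage {
    static String escape(String s) {
        // '&' must be replaced first so later entities are not double-escaped.
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static String render(RegionService service) {
        List<Region> regions = service.listRegions();
        StringBuilder html = new StringBuilder("<h1>Region list</h1>\n");
        if (regions.isEmpty()) {
            return html.append("<h2>Sorry, no region, db is empty</h2><br>\n").toString();
        }
        for (Region r : regions) {
            html.append("<a href=\"BrowseCategories?region=").append(r.id).append("\">")
                .append(escape(r.name)).append("</a><br>\n");
        }
        return html.toString();
    }
}

class RegionPageDemo {
    public static void main(String[] args) {
        // In-memory stand-in for the component that would query the database.
        RegionService service = () -> {
            List<Region> regions = new ArrayList<>();
            regions.add(new Region(1, "Bay Area"));
            regions.add(new Region(2, "R&D <East>"));
            return regions;
        };
        System.out.print(RegionPage.render(service));
    }
}
```

Because the page only depends on the `RegionService` interface, the data source can be swapped (direct JDBC, session bean, test stub) without touching the rendering code.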


Design pattern: Entity Beans

Data Access Objects separation with Entity Beans (BMP or CMP)

[Diagram: servlets in the Web container hold presentation logic and business logic; entity beans in the EJB container handle access to the database]


Design patterns: Session façade

Façade session bean with EJB 1.1

[Diagram: servlets (presentation logic) in the Web container call session façade beans (business logic) in the EJB container; the façades reach the entity beans through their remote interface and the communication layer; the entity beans access the database]


Design patterns: EJB 2.0 local

Session façade with EJB 2.0 local interface Entity Beans

[Diagram: servlets (presentation logic) in the Web container call session façade beans (business logic) in the EJB container; the façades reach the entity beans through a local interface; the entity beans access the database]


Code complexity

[Chart: lines of code (0 to 14000), split into business logic and presentation logic, for PHP, Servlets-only, Session beans, EB CMP, EB BMP, Session façade, and EJB 2.0 local]


RUBiS - J2EE Servers

Apache/Tomcat/JBoss vs Apache/Tomcat/JOnAS

[Chart: maximum throughput in requests/minute (0 to 10000) for Session Beans, EB-CMP, EB-BMP, Session façade, and EJB 2.0 Local, comparing Optimized JBoss and Optimized JOnAS]


J2EE Performance

Session Beans = Servlets >= PHP

Entity Beans
  BMP = CMP
  data access very (too?) fine grain

Design pattern determines performance
  Communication layers: 45 to 90% CPU usage
  Container design
  Less than 2% of execution time in user bean code
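One way to see why the design pattern matters is to count how often a request crosses the communication layer. The sketch below is a toy model, not a measurement: `RemoteCallCounter`, `RemoteItem`, and `CatalogFacade` are hypothetical names, and the "remote" boundary is simulated by a counter rather than RMI.

```java
import java.util.ArrayList;
import java.util.List;

// Counts calls that cross the (simulated) communication layer.
class RemoteCallCounter {
    private int calls;

    void crossBoundary() { calls++; }

    int calls() { return calls; }
}

// Hypothetical entity with fine-grained accessors, as an entity bean would expose.
class ItemEntity {
    private final String name;
    private final long priceCents;

    ItemEntity(String name, long priceCents) {
        this.name = name;
        this.priceCents = priceCents;
    }

    String getName() { return name; }

    long getPriceCents() { return priceCents; }
}

// Remote view of an entity: every accessor call crosses the boundary.
class RemoteItem {
    private final ItemEntity target;
    private final RemoteCallCounter counter;

    RemoteItem(ItemEntity target, RemoteCallCounter counter) {
        this.target = target;
        this.counter = counter;
    }

    String getName() { counter.crossBoundary(); return target.getName(); }

    long getPriceCents() { counter.crossBoundary(); return target.getPriceCents(); }
}

// Session façade: one coarse-grained remote call; entity access stays local.
class CatalogFacade {
    private final List<ItemEntity> items;
    private final RemoteCallCounter counter;

    CatalogFacade(List<ItemEntity> items, RemoteCallCounter counter) {
        this.items = items;
        this.counter = counter;
    }

    List<String> describeAll() {
        counter.crossBoundary();
        List<String> lines = new ArrayList<>();
        for (ItemEntity item : items) {
            lines.add(item.getName() + " " + item.getPriceCents());
        }
        return lines;
    }
}

class FacadeDemo {
    public static void main(String[] args) {
        List<ItemEntity> items = new ArrayList<>();
        items.add(new ItemEntity("book", 1200));
        items.add(new ItemEntity("lamp", 3500));
        items.add(new ItemEntity("pen", 150));

        RemoteCallCounter fine = new RemoteCallCounter();
        for (ItemEntity e : items) {
            RemoteItem r = new RemoteItem(e, fine);
            String line = r.getName() + " " + r.getPriceCents(); // two crossings per item
        }
        RemoteCallCounter coarse = new RemoteCallCounter();
        new CatalogFacade(items, coarse).describeAll();          // one crossing in total

        System.out.println("fine-grained calls: " + fine.calls());
        System.out.println("facade calls: " + coarse.calls());
    }
}
```

With fine-grained access the number of crossings grows with the number of items and attributes read, while the façade keeps it at one per request.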


Outline

Technologies
Performance
Clustering
Conclusion


Clustering

[Diagram: clients reach, through the Internet, a tier of Web servers, followed by tiers of Servlet servers, EJB servers, and DB servers, with each tier replicated across several machines]


Web site Clustering

Load balancing on Web servers
  hardware: L4-switch
  software
    One-IP techniques
    LVS (http://www.linuxvirtualserver.org/)
    RR-DNS


Servlet/JSP Clustering

Web server to Servlet server
  Load balancing with JK module (mod_jk)
  Static weighted round-robin
  Session affinity

Servlet/JSP server clustering
  Tomcat in-memory session replication
  failover ensured by mod_jk
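Static weighted round-robin combined with session affinity can be sketched in a few lines. This is a simplified illustration of the idea only, not mod_jk's actual implementation; `WeightedBalancer` and the worker names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WeightedBalancer {
    // Each worker appears 'weight' times in the schedule.
    private final List<String> schedule = new ArrayList<>();
    // Session id -> worker that created the session.
    private final Map<String, String> affinity = new HashMap<>();
    private int next;

    // Static weights: a worker with weight 2 receives twice as many new sessions as one with weight 1.
    synchronized void addWorker(String name, int weight) {
        for (int i = 0; i < weight; i++) {
            schedule.add(name);
        }
    }

    // Requests with a known session return to the same worker;
    // all other requests take the next slot in the schedule.
    synchronized String route(String sessionId) {
        if (schedule.isEmpty()) {
            throw new IllegalStateException("no workers configured");
        }
        if (sessionId != null && affinity.containsKey(sessionId)) {
            return affinity.get(sessionId);
        }
        String worker = schedule.get(next);
        next = (next + 1) % schedule.size();
        if (sessionId != null) {
            affinity.put(sessionId, worker);
        }
        return worker;
    }
}

class BalancerDemo {
    public static void main(String[] args) {
        WeightedBalancer lb = new WeightedBalancer();
        lb.addWorker("tomcat1", 2);
        lb.addWorker("tomcat2", 1);
        for (String session : new String[] {"s1", "s2", "s3", "s1", "s3"}) {
            System.out.println(session + " -> " + lb.route(session));
        }
    }
}
```

Affinity keeps a session's requests on the server that holds its state; replication, as listed on the slide, is what lets another server take over if that one fails.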


EJB Clustering

Servlet/JSP to EJB server
  clustered JNDI
  load-balancing and failover by cluster-aware stubs

EJB Server clustering
  cluster stubs for load-balancing
  transparent failover for idempotent methods
  bean state persisted in database
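The reason failover is transparent only for idempotent methods is that a retried call may have already taken effect on the failed node. A minimal sketch of a cluster-aware stub follows; `ClusterStub` is a hypothetical class, and real application servers generate such stubs rather than exposing an API like this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical cluster-aware stub: picks nodes round-robin and retries on the
// next node only when the caller declares the operation idempotent.
class ClusterStub<N> {
    private final List<N> nodes;
    private int next;

    ClusterStub(List<N> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = nodes;
    }

    synchronized <R> R invoke(Function<N, R> call, boolean idempotent) {
        // A non-idempotent call gets exactly one attempt: retrying could apply it twice.
        int attempts = idempotent ? nodes.size() : 1;
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            N node = nodes.get(next);
            next = (next + 1) % nodes.size();
            try {
                return call.apply(node);
            } catch (RuntimeException e) {
                last = e; // this node failed; move on to the next one if allowed
            }
        }
        throw last;
    }
}

class ClusterStubDemo {
    public static void main(String[] args) {
        ClusterStub<String> stub = new ClusterStub<>(Arrays.asList("node-down", "node-up"));
        Function<String, String> call = node -> {
            if (node.equals("node-down")) {
                throw new IllegalStateException(node + " unreachable");
            }
            return "answered by " + node;
        };
        System.out.println(stub.invoke(call, true));
    }
}
```

A read such as a catalog lookup can safely be marked idempotent; an operation like placing a bid would be left non-idempotent so that a failure surfaces to the caller instead of being silently retried.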


J2EE Clustering

Database clustering
  Commercial offers
    Oracle RAC ($60,000 per CPU)
    based on expensive SAN (Storage Area Network)
  Open-source solutions
    No real clustering
      – master/slave replication in MySQL
      – Postgres-R (still in alpha)


Database clustering

Performance scalability bounded by database
Large SMP machines are not commodity hardware
Database tier must be
  scalable
  fault tolerant (high availability + failover)
  without modifying the client application
  using open source databases
  on commodity hardware


RAIDb

Redundant Array of Inexpensive Databases (RAIDb)
  better performance and fault tolerance than a single database, at a low cost, by combining multiple DB instances into an array of DB

RAIDb controller
  gives the view of a single database to the client
  balances the load on the database backends

RAIDb levels
  RAIDb-0: full partitioning
  RAIDb-1: full mirroring (best fault tolerance)
  RAIDb-2: partial replication (best performance)
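The three RAIDb levels differ only in which backends hold which tables. The sketch below models that as a placement map and routes reads to one copy and writes to every copy. It is a simplified illustration limited to single-table requests; `RaidbController` and its methods are hypothetical and do not reflect the C-JDBC code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RaidbController {
    // Table name -> backends hosting a copy of that table.
    private final Map<String, List<String>> placement = new HashMap<>();
    // Table name -> index of the backend that serves the next read.
    private final Map<String, Integer> nextReader = new HashMap<>();

    void place(String table, String... backends) {
        if (backends.length == 0) {
            throw new IllegalArgumentException("a table needs at least one backend");
        }
        placement.put(table, new ArrayList<>(Arrays.asList(backends)));
        nextReader.put(table, 0);
    }

    // A read needs only one copy: rotate over the backends hosting the table.
    String backendForRead(String table) {
        List<String> hosts = hostsOf(table);
        int i = nextReader.get(table);
        nextReader.put(table, (i + 1) % hosts.size());
        return hosts.get(i);
    }

    // A write must reach every copy so that the replicas stay consistent.
    List<String> backendsForWrite(String table) {
        return new ArrayList<>(hostsOf(table));
    }

    private List<String> hostsOf(String table) {
        List<String> hosts = placement.get(table);
        if (hosts == null) {
            throw new IllegalArgumentException("unknown table: " + table);
        }
        return hosts;
    }
}

class RaidbDemo {
    public static void main(String[] args) {
        // RAIDb-0 (partitioning): each table lives on exactly one backend.
        RaidbController raidb0 = new RaidbController();
        raidb0.place("items", "db1");
        raidb0.place("users", "db2");

        // RAIDb-1 (mirroring): every table lives on every backend.
        RaidbController raidb1 = new RaidbController();
        raidb1.place("items", "db1", "db2");
        raidb1.place("users", "db1", "db2");

        // RAIDb-2 (partial replication): each table lives on a subset of the backends.
        RaidbController raidb2 = new RaidbController();
        raidb2.place("items", "db1", "db2");
        raidb2.place("users", "db2", "db3");

        System.out.println(raidb0.backendsForWrite("items"));
        System.out.println(raidb1.backendsForWrite("items"));
        System.out.println(raidb2.backendsForWrite("users"));
    }
}
```

Under this model, mirroring spreads reads over all backends but makes every write hit all of them, while partial replication limits each write to the subset that hosts the table.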


C-JDBC

Middleware implementing RAIDb

Two components
  generic JDBC 2.0 driver (C-JDBC driver)
  C-JDBC Controller

C-JDBC Controller provides
  performance scalability
  high availability
  failover
  caching, logging, monitoring, …

Supports heterogeneous databases


RAIDb with C-JDBC

[Diagram, without C-JDBC: Java client programs, servlet containers (Tomcat, Jetty, ...), and EJB containers (JOnAS, WebLogic, JBoss, WebSphere, ...) each use a database JDBC driver in their JVM to reach a single database (MySQL, PostgreSQL, Oracle, DB2, InstantDB, ...): no scalability, no fault tolerance, no failover]

[Diagram, with C-JDBC: the same clients use the C-JDBC driver to reach the C-JDBC Controller (scalability, fault tolerance, failover, monitoring, caching, logging, ...), which uses the database JDBC drivers to reach several database backends (MySQL, PostgreSQL, Oracle, DB2, InstantDB, ...)]


C-JDBC RAIDb-1 example

no client code modification
original PostgreSQL driver and RDBMS engine
C-JDBC provides scalable performance and high availability

[Diagram: Java client programs, servlet containers (Tomcat, Jetty, ...), and EJB containers (JOnAS, WebLogic, JBoss, WebSphere, ...) use the C-JDBC driver to reach a C-JDBC Controller configured for RAIDb-1, which uses the PostgreSQL JDBC driver to reach several PostgreSQL backends]


C-JDBC RAIDb-2 example

offload a single Oracle DB with several MySQL backends
add caching, fault tolerance, and monitoring for free

[Diagram: Java client programs, servlet containers (Tomcat, Jetty, ...), and EJB containers (JOnAS, WebLogic, JBoss, WebSphere, ...) use the C-JDBC driver to reach a C-JDBC Controller configured for RAIDb-2, which uses the MySQL JDBC driver to reach several MySQL backends and the Oracle JDBC driver to reach the Oracle database]


TPC-W Performance

[Chart: throughput in requests per minute (0 to 1600) versus number of nodes (0 to 6) for Single DB, RAIDb-0, RAIDb-1, and RAIDb-2]


Outline

Technologies
Performance
Clustering
Conclusion


PHP, Servlets or J2EE ?

PHP: Apache
  ideal for small projects
  no typing, ad-hoc APIs

Servlets: Tomcat/Jetty
  OO programming
  JDBC for database access

J2EE: JOnAS/JBoss
  for large projects
  business and presentation logic isolation
  large number of services

Clustering for scalability


Questions?

Apache/PHP/Tomcat: http://www.apache.org
Jetty: http://jetty.mortbay.com
JOnAS: http://jonas.objectweb.org/
JBoss: http://www.jboss.org
RUBiS: http://www.objectweb.org/rubis
LVS: http://www.linuxvirtualserver.org
C-JDBC: http://c-jdbc.objectweb.org/


RUBiS – Overall results

[Chart: maximum throughput in interactions/minute (0 to 10000) for Session Beans, EB-CMP, EB-BMP, Session façade, and EJB 2.0 Local, comparing JBoss with standard RMI, JOnAS with standard RMI, JBoss with optimized RMI, and JOnAS with Jeremie]