Documentum System Sizing Guide


Transcript of Documentum System Sizing Guide


Documentum® System Sizing Guide

for all platforms

Release 1.1

November 2001

DOC3-SYSIZEGD-1101



Copyright © 2000, 2001 Documentum, Inc.
6801 Koll Center Parkway
Pleasanton, CA 94566
All Rights Reserved.

Documentum®, Documentum 4i™, Docbase™, Documentum eContent Server™, Documentum Server®, Documentum Desktop Client™, Documentum Intranet Client™, Documentum WebPublisher™, Documentum ftpIntegrator™, Documentum RightSite®, Documentum Administrator™, Documentum Developer Studio™, Documentum Web Development Kit™, Documentum WebCache™, Documentum ContentCaster™, AutoRender Pro™, Documentum iTeam™, Documentum Reporting Gateway™, Documentum Content Personalization Services™, Documentum Site Delivery Services™, Documentum Content Authentication Services™, Documentum DocControl Manager™, Documentum Corrective Action Manager™, DocInput™, Documentum DocViewer™, Virtual Document Manager™, Docbasic®, Documentum DocPage Server®, Documentum WorkSpace®, Documentum SmartSpace®, and Documentum ViewSpace® are trademarks or registered trademarks of Documentum, Inc. in the United States and throughout the world. All other company and product names are used for identification purposes only and may be trademarks of their respective owners.

CONTENTS

Preface

1 Overview of System Sizing
    Overview of the Sizing Process  1-1
    Common Sizing Mistakes  1-3
    Terminology Used in This Guide  1-4

2 Deriving Workload Requirements
    What Is a Workload?  2-1
    Determining the Workload for a Site  2-2
    User Connection States and Resource Consumption  2-2
        Inactive Connections and Resource Consumption  2-5
        RightSite Server Connection States  2-5
    The Busy Hour  2-6
    Response Time Expectations  2-9
    Using the Derived Workload  2-9
    The Documentum Workloads  2-9
        The iTeam™ 2.2 Workload  2-10
            Workload Scenario  2-11
            Workload Operations  2-12
            Workload Scaling  2-13
            Workload Response Time Requirements  2-14
        The WebPublisher™ 4.1 Workload  2-16
            Workload Scenario  2-17
            Workload Operations  2-18
            Workload Scaling  2-19
            Workload Response Time Requirements  2-20
        The Load and Delete Workload  2-22
    Comparing and Contrasting the Workloads  2-23
        Software Architecture  2-23
        Resource Consumption  2-24
        Usage Patterns  2-25
    Operations Not Included in Workloads  2-26

3 Hardware Architecture and Scaling
    Overview of Software Trends Affecting Scaling  3-1
        More Powerful Processors and Software Reuse  3-2
        Wide Variance in User Deployments  3-4
        The Trends and Documentum  3-5
    Scaling the Web Tier  3-5
    Scaling the eContent Server Tier  3-7
    Scaling DocBrokers  3-10
    Scaling the RDBMS  3-10
    Host-based vs. Multi-tiered Configurations  3-11
    High Availability Considerations  3-12
    Scaling Across the Enterprise  3-14
    Scaling the Web Content Management Edition  3-17
        Web Content Authoring  3-19
        Site Delivery Services  3-20
        Access Software for Dynamic Page and Metadata Retrieval  3-20
    Scaling the Portal Edition  3-21

4 Server Configuration and Sizing
    Overview of Server Sizing Process  4-1
    Hardware Configurations  4-2
        Host-based Configuration  4-3
        N-Tier Configurations  4-3
    Server Sizing Results from Benchmark Tests  4-4
        Special Focus for Some Tests  4-5
        Interpreting the CPU Sizing Tables  4-6
        Compaq Sizing Information  4-9
        Sun/Solaris Sizing Information  4-10
            Sun Enterprise 450  4-11
            Sun Enterprise 6500 and 4500  4-12
        IBM, Windows NT, and AIX Sizing Information  4-15
            IBM Netfinity 7000 M10  4-15
            IBM AIX Systems: S7A and F50  4-16
        HP Windows NT and HP-UX Servers  4-18
            HP NT/Intel Servers  4-18
            HP-UX Servers  4-20
    Other CPU-Related Notes  4-22
    Sizing Server Memory  4-23
        Overview of the Sizing Process  4-23
        Key Concepts Relating to Memory Use  4-25
            Virtual and Physical Memory  4-25
            Cache Memory Usage  4-26
                DBMS Caches  4-27
                eContent Server Caches  4-27
                RightSite Server Caches and Work Areas  4-27
        Estimating Physical Memory Usage  4-28
            User Connection Memory Requirements  4-28
            DBMS Memory Requirements  4-29
            Operating System Memory Requirements  4-30
        Estimating Paging File Space  4-30
        Additional Considerations  4-31
    Examples of Memory Calculation  4-32
        Example One  4-32
        Example Two  4-33
        Example Three  4-34
    Sizing Server Disk Capacity  4-35
        Key Concepts for Disk Sizing  4-37
            Disk Space and Disk Access Capacity  4-37
            Effect of Table Scans, Indexes, and Cost-based Optimizers on I/O  4-38
                Tuning with the Optimizer  4-38
            DBMS Buffer Cache Memory Effect on Disk I/Os  4-39
        Disk Striping and RAID Configurations  4-40
        Disk Storage Areas  4-42
        Disk Space Sizing  4-44
            Physical Disk Requirements of the Documentum Software Components  4-44
            Typical Disk Space Calculation Model for Content and Attribute Data  4-45
        Additional Considerations  4-46
        Additional References  4-46
    Database License Sizing  4-46
    Certified Database and HTTP Server Versions  4-47

5 Server Network Configuration Guidelines
    Overview of Network Sizing  5-1
    Key Concepts for Network Sizing  5-2
        Bandwidth and Latency  5-3
        Bandwidth Needs and Response Time  5-4
    Making the Decision: Localizing Traffic or Buying More Bandwidth  5-6
        More Bandwidth or Remote Web Servers  5-8
        Content Transfer Response Time: More Bandwidth or Content Servers  5-9
        Operation Response Time: More Bandwidth or Replication  5-11
    Additional Specific Network Recommendations  5-12

6 Sizing for Client Applications
    Sizing for Desktop Client  6-1
        CPU Speed  6-1
        Component Initialization and Steady State Processing  6-3
        Memory Resource Needs  6-4
    Sizing for AutoRender Pro  6-5
    System Requirements for Client Products  6-6

A Additional Workloads
    The EDMI Workload  A-1
        Workload Scenario  A-2
        Workload Operations  A-3
        Workload Response Time Requirements  A-4
        Workload Scaling  A-6
    The Web Site Workload  A-6
        Workload Operations  A-7
        Workload Response Times  A-8
        Workload Scaling  A-9
    The Document Find and View Workload  A-9
    The Online Customer Care Workload  A-9
        Workload Operations  A-10
        Workload Response Time Requirements  A-13
    Comparing and Contrasting the Workloads  A-14
        Software Architecture  A-14
        Usage Models and Resource Consumption  A-15
        Document Find and View Workload  A-17
    Operations Not Included in Workloads  A-17

Index


P R E F A C E

Purpose of the Manual

This document describes the process for estimating an organization's initial requirements for server capacity and configuration options. The specific intention is to assist our customers in determining a range of optimal configuration options relative to their business objectives. This document presumes that customers understand their business requirements and network operating environment.

Intended Audience

This document is intended for system administrators, technical managers, operation coordinators, or other technical personnel responsible for sizing a Documentum environment.

Organization of the Manual

This manual contains six chapters and one appendix. The list below describes the information that you can expect to find in each.

Chapter 1, Overview of System Sizing: An overview of the sizing process and definitions of terms used in the document.

Chapter 2, Deriving Workload Requirements: A discussion of key concepts in the capacity planning process and descriptions of workloads.

Chapter 3, Hardware Architecture and Scaling: The architecture of the Documentum Server software, how it scales with increased load, and the implications for hardware configuration.

Chapter 4, Server Configuration and Sizing: Information and guidelines for sizing the server configuration.

Chapter 5, Server Network Configuration Guidelines: Information and guidelines for sizing the server network configuration.

Chapter 6, Sizing for Client Applications: Information about Documentum client sizing requirements.

Appendix A, Additional Workloads: Information about workload models other than those described in Chapter 2.

Using Links in PDF Files

If you are reading this document as a Portable Document Format (PDF) file, cross-references and page numbers in the index appear as clickable blue hypertext links. Table of contents page numbers are also clickable links, but they appear in black.

➤ To follow a link:

1. Move the pointer over a linked area.

The pointer changes to a pointing finger when positioned over a link. The finger pointer displays a W when moved over a Weblink.

2. Click to follow the link.

Note: A Web browser must be chosen in your Weblink preferences to follow a Weblink. See Setting Weblink preferences in your Adobe Acrobat Help for more information.


Bug Lists and Documentation On-Line

Customers with a Software Support Agreement can read our product documentation and, after commercial release of a product, view lists of fixed bugs on Documentum’s Technical Support Web pages, Support On-Line. To enter Support On-Line, you must request access and obtain a user name and password.

Applying for Access

➤ To apply for access to Support On-Line:

1. In your Web browser, open http://www.documentum.com/

2. Click the Technical Support link.

3. Click the Request Access link.

4. Complete the form and send it.

Documentum will respond to your request within two business days.

Fixed Bugs List

A list of customer-reported bugs that have been fixed will be available two weeks after this release, at Support On-Line, the Technical Support area of the Documentum Web site. For information about obtaining access to Support On-Line, refer to “Applying for Access.” You must have Adobe Acrobat Reader or Acrobat Exchange installed to view the lists of fixed bugs.

➤ To view the lists of fixed bugs:

1. In your Web browser, open http://www.documentum.com/

2. Click the Technical Support link.

3. Log on to the Technical Support site.

4. In the Troubleshooting section, click View Bugs.

5. Click Fixed Bugs and Feature Requests Lists.


6. Click the name of the bug list.

Product Documentation

Customers with a Software Support Agreement can read our product documentation at the Documentum Web site. You must have a user name and password, and Adobe Acrobat Exchange or Acrobat Reader installed in order to view the documentation. To obtain a user name and password, refer to “Applying for Access.”

➤ To view a document:

1. In your Web browser, open http://www.documentum.com/

2. Click the Technical Support link.

3. Log on to the Technical Support site.

4. In the Resources section, click Documentation.

5. Click the name of the document.

Purchasing Bound Paper Manuals

Our product documentation is available for purchase as bound paper manuals. To place an order, call the Documentation Order Line at (925) 600-6666. You can pay with a purchase order, check, or credit card.


1 Overview of System Sizing

This chapter provides a brief overview of the system sizing process and introduces the terminology used in this guide. The following topics are discussed in this chapter:

■ “Overview of the Sizing Process” on page 1-1

■ “Common Sizing Mistakes” on page 1-3

■ “Terminology Used in This Guide” on page 1-4

Overview of the Sizing Process

System sizing is the process of determining what hardware, software, and network configurations will provide the best performance for users at the lowest cost to the enterprise. Another term for system sizing is capacity planning.

Figure 1-1 illustrates the system sizing process.


Figure 1-1 The System Sizing Process.

The first and most important step is to determine the performance and configuration requirements for the application or service that will be using Documentum. These requirements include expectations for:

■ Number of users serviced during the hour of peak usage (the busy hour)

■ Acceptable response times

■ Document sizes

■ Document availability (for distributed sites)

■ Documentum products used

■ Geographic access (some local and some remote)

After the requirements are known, usage of server and network resources can be estimated and then budgeted. Typically, the budget for resources is allocated far in advance of their final implementation and deployment.

[Figure 1-1 shows the sizing process as a sequence of steps: derive the workload and customer performance requirements; decide the high-level hardware deployment architecture; estimate the CPU configuration; estimate memory needs; estimate disk capacity and access needs; refine the network analysis; budget for servers and telecommunications services; and, before actual deployment, check that requirements have not grown to exceed the purchased hardware.]


Consequently, it is wise to review current knowledge of the application and environment between budgeting and actual deployment, to ensure that the budgeted resources satisfy the requirements. For instance, the budgeted hardware may have been intended for 1,000 users per hour, but the initial rollout now must cover 2,000 users per hour during the busy hour. The difference may require a reassessment of the hardware resources.

Documentum provides a spreadsheet for the system sizing process. After the customer enters user, hardware, and document profile information for the system, the spreadsheet suggests configuration information.

Common Sizing Mistakes

Several mistakes are commonly made in system sizing:

■ Failure to obtain sufficient information about the customer requirements and the deployed application

Sometimes systems are sized based on only a partial picture of the workload. If a significant portion of the workload is left out, the estimated hardware resources might be insufficient to serve the entire workload. This often happens, unnoticed, when an application that is being developed experiences feature creep (the addition of features).

■ Paying insufficient attention to server machine differences

For example, both an Intel server and an Intel-based laptop might have a single processor and the same memory, but there are performance differences between them beyond mere expandability. Server-class machines have more processor cache, more bus bandwidth, and faster disk I/O subsystems than a laptop. These result in large performance differences when running server applications such as Documentum.

■ Assuming that somehow an Intel server will need fewer disks and less memory than a comparable UNIX server machine

Intel-based servers are subject to the same limitations with respect to these resources as UNIX-based servers.

■ Assuming that application developers will always tune their applications prior to deployment.


In many cases, the hardware for an application is chosen and budgeted for many months in advance of the application’s full deployment. The budgeting used in this guide assumes some level of application performance tuning. However, a couple of poorly optimized queries can result in large response times. Capacity planning is not a substitute for performance testing and tuning.

Terminology Used in This Guide

Table 1-1 lists and defines terms used throughout this document.

Table 1-1 Definition of Common Terms

Term Definition

Active User A connected user who has not reached activity time-out. Active users consume Documentum Server® resources.

Active User – In Transaction

A connected user who is currently waiting for a response to a request from Documentum Server. An example is a user who is logged into Desktop Client™ and waiting for the Docbase™ View window to open. An active user – in transaction consumes the most Documentum Server resources of any user state, including CPU, RAM, network throughput, and disk throughput.

Active User – Out of Transaction

A connected user who is not currently waiting for a Documentum Server response to a request. An example is a user who is logged into WorkSpace and viewing a document in a word processor. While the word processor displays the document, the Documentum Server does not receive any more requests from the user until the user is finished viewing. An active user – out of transaction consumes fewer Documentum Server resources than an active user – in transaction, but more than an inactive user. RAM is the primary resource that an active user – out of transaction consumes.


Activity Time-out A Documentum Server feature that conserves server-side resources. When a connected user has not made a request of the Documentum Server within a specified time limit, the server transparently frees the connection to release unused OS and DBMS resources. The next time the user makes a request of the Documentum Server, the request is handled automatically without requiring the user to log in again. The activity time-out counter is reset after each completed user request of the Documentum Server.

Bandwidth Refer to Network Throughput.

Bottleneck A resource that limits performance. Examples are CPU, RAM, network throughput and disk throughput.

Connected User A user who is currently logged into a Docbase.

Connecting User A user who has requested a Docbase connection but is not yet logged in.

Database Server (RDBMS instance)

SQL Relational Database Management System required as part of the Documentum Docbase. Used to store Documentum object attribute information.

Docbase The dynamic document and Web page repository accessed by the Documentum Server. The Docbase stores a document or Web page as an object that encapsulates the document’s native content together with its attributes, including information about the document’s relationships, associated versions, renditions, formats, workflow, and security.

DocPage Server Documentum Server version 3.x

Documentum Server Software used to service incoming and outgoing document management requests for data in the Docbase. Different versions carry different product names:

■ eContent Server refers to Documentum Server, version 4.2.

■ e-Content Server refers to version 4.1.

■ EDM Server refers to version 4.0.

■ DocPage Server refers to version 3.x.


Disk Throughput Number of bytes per unit of time transferred to or from the disk subsystem during read or write operations.

e-Content Server Documentum Server version 4.1

eContent Server Documentum Server version 4.2.

EDM Server Documentum Server version 4.0.

HTTP Server (Web Server)

Software required to service HTTP requests by a Web browser from a file system or from Documentum RightSite®.

Inactive User A connected user who has reached activity time-out. Inactive users do not consume Documentum Server resources.

Named User A user for whom a user profile is defined in the Docbase. Each user profile is stored as a dm_user object in the Docbase.

Network Latency The delay in response to a network request due to the time it takes for a byte of data to traverse the network and travel from the client to the server and back again. Latency depends on the distance between the client and server, how many pieces of equipment are in between, and the types of communication lines.

Network Throughput

The number of bytes per unit of time that can flow between a client and server. Throughput is also referred to as bandwidth.

Physical Memory Total RAM dedicated to a physical computer system.

RightSite Server technology required to coordinate document management requests between an HTTP server and the Documentum eContent Server. This component is required for all Documentum Web products including Intranet Client™ and other web-content-based applications. RightSite must be physically installed on the same host as the HTTP server.


Transactions Requests from the client that have a response from the server. The client must wait for the response before it can continue. For example, the client might send "Update the Check Out field with my name" and the server sends back "Done". Multiple application transactions can occur for specific user-level functions.

Transformation Engine

The facility within Documentum Server that automatically transforms content in one format to another format.

The transformation engine uses supported converters to perform the transformation. Through the transformation engine, you can:

■ Transform one word processing file format to another word processing file format

■ Transform one graphic image format to another graphic image format

■ Transform one kind of format to another kind of format—for example, changing a raster image format to a page description language format

Some of the converters are supplied with the Documentum system; others must be purchased separately.

User States Named users may be in one of the following activity states: connected user, active user, or inactive user. Active users may be divided into two categories: active user – in transaction and active user – out of transaction. Understanding the different user states can have a beneficial impact on system sizing because they vary in resource consumption due to the activity time-out feature. (For more information, refer to “User Connection States and Resource Consumption” on page 2-2.)

Virtual Memory A service provided by the operating system (and hardware) that allows each process to operate as if it has exclusive access to all memory (0 to 2³², typically). However, a process only needs a small amount of this memory to perform its activities. This small amount, called the process working set, is actually kept in memory. The operating system manages sharing of physical memory among the various working sets.



2 Deriving Workload Requirements

This chapter introduces workloads, discusses two concepts that are key to determining workload requirements, and describes the workloads used in the benchmark tests provided by Documentum. The following topics are included:

■ “What Is a Workload?” on page 2-1

■ “Determining the Workload for a Site” on page 2-2

■ “User Connection States and Resource Consumption” on page 2-2

■ “The Busy Hour” on page 2-6

■ “Response Time Expectations” on page 2-9

■ “Using the Derived Workload” on page 2-9

■ “The Documentum Workloads” on page 2-9

■ “Comparing and Contrasting the Workloads” on page 2-23

■ “Operations Not Included in Workloads” on page 2-26

Note: Benchmark results are described in Chapter 4, Server Configuration and Sizing. Detailed benchmark reports are available in Kpool.

What Is a Workload?

A workload is a usage pattern for a group of users. For example, checking documents out of the Docbase and checking them in at a later time represents a simple usage pattern for the users called contributors (because they contribute content to the Docbase). However, typical workloads involve more than simple check ins and check outs. Typically, a workload also includes activities such as navigating through folders, viewing documents, participating in workflows, constructing or publishing virtual documents, and so forth.


Determining the Workload for a Site

Before you attempt to define the workload for a site, you should understand two principles: the relationship between user connection states and resources consumed by a workload, and the concept of the busy hour. “User Connection States and Resource Consumption” on page 2-2 defines the user connection states and describes how they differ in resource consumption. “The Busy Hour” on page 2-6 defines the concept of the busy hour.

To estimate a workload for a site, you must obtain the following information:

■ What Documentum products are in use

■ The estimated number of users who are connected during the busy hour

■ The estimated number of active users during the busy hour

■ What Docbase operations are performed and how often each is performed

■ The number, size, and content profile of documents in the Docbase

Because it can be especially difficult to identify the Docbase operations and how often each will be performed, it is strongly recommended that you use the Documentum Sizing Spreadsheet. The spreadsheet makes some standard assumptions about Docbase operations based on the user category (contributor or consumer) and products in use (the workload column in which you enter the information).

User Connection States and Resource Consumption

User connection states affect the amounts of resources consumed by a workload. There are four user connection states, and the resources consumed by a user in each state differ. The four states are:

■ Connecting

■ Active - in transaction

■ Active - out of transaction

■ Inactive


A connecting user is a user who is requesting a connection with the Docbase. A connecting user consumes CPU, network, memory, and disk and swap space.

An active user - in transaction is a user who is connected to the Docbase and is currently waiting for a response to a request from eContent Server. An active - in transaction user consumes CPU, network, memory, and disk and swap space.

An active user - out of transaction is a user who is connected to the Docbase but is not currently waiting for a response from eContent Server. An active - out of transaction user consumes server memory and swap space.

An inactive user is an active user - out of transaction whose Docbase session has timed out. An inactive user consumes only memory on the client machine.

Figure 2-1 illustrates the user connection states and their relationship with resource consumption.


Figure 2-1 User Connection States and Resource Consumption

Users consume server CPU mainly during session establishment and when they initiate a request to eContent Server. When an active user is not initiating a request, only server memory and some operating system networking resources are consumed. When a user is inactive, the server resources are reclaimed for other purposes. Only the client machine will consume some memory resources in this state (essentially remembering where the session should resume when the session returns to the Active state).

[Figure 2-1 depicts the connection states and their resource consumption: a connecting user and an active user in transaction consume CPU, network, memory, disk, and swap space; an active user out of transaction consumes server memory and swap space; an inactive user consumes memory only on the client machine. The transitions are: the session comes up (connecting to active in transaction), the inactivity time-out is reached (active to inactive), and the user initiates an action (inactive back to active).]
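The state and resource relationships above can be summarized in a small lookup structure. The following Python sketch is illustrative only: the state and resource names are taken from this section, but the event names ("session_up", "response_returned", and so on) are labels invented for the sketch and are not part of any Documentum API.

```python
# Illustrative only: the user connection states and the server-side resources
# each state consumes, as described in the text and shown in Figure 2-1.
RESOURCES_BY_STATE = {
    "connecting":                {"cpu", "network", "memory", "disk", "swap"},
    "active_in_transaction":     {"cpu", "network", "memory", "disk", "swap"},
    "active_out_of_transaction": {"memory", "swap"},   # server memory and swap space only
    "inactive":                  set(),                # memory only on the client machine
}

# Transitions from Figure 2-1; "response_returned" is an implied event marking
# the end of a request, not a term used in the guide.
TRANSITIONS = {
    ("connecting", "session_up"):                   "active_in_transaction",
    ("active_in_transaction", "response_returned"): "active_out_of_transaction",
    ("active_out_of_transaction", "user_action"):   "active_in_transaction",
    ("active_out_of_transaction", "timeout"):       "inactive",
    ("inactive", "user_action"):                    "active_in_transaction",
}

def next_state(state: str, event: str) -> str:
    """Return the next connection state for an event, or stay in place if none applies."""
    return TRANSITIONS.get((state, event), state)

print(next_state("active_out_of_transaction", "timeout"))  # -> inactive
```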


Inactive Connections and Resource Consumption

Both eContent Server and the RightSite Server free inactive connections. Freeing inactive connections reduces the memory demands on the system and minimizes the number of concurrent DBMS sessions. By default, eContent Server frees inactive connections after 5 minutes of inactivity and the RightSite Server does so after 30 minutes of inactivity.

When eContent Server frees an inactive connection, the server disconnects from the DBMS and kills the process (Unix) or thread (Windows NT) that corresponds to the inactive connection. When the RightSite Server frees an inactive connection, the server kills the process or thread associated with the connection.

The freed sessions can be re-established. With eContent Server, the session is re-established transparently when the user initiates another command. With RightSite, the user must log in again (for named sessions). However, when a session is restarted, there is a startup cost that includes operations such as reconnecting to the DBMS, resetting caches, and so forth.

Inactive time-out trades off CPU time for reduced memory and concurrent session requirements. That is, stopping and restarting a session repeatedly uses more CPU than leaving the session connected continuously. However, disconnecting the session frees memory for other uses and reduces the maximum number of active database sessions needed.

RightSite Server Connection States

From eContent Server’s viewpoint, the RightSite Server is a user. RightSite Server connections to eContent Server go through state transitions with associated resource use similar to any other user connecting or connected to eContent Server.

An active RightSite Server that is not processing a request consumes only memory and swap space. When the RightSite Server requests a Docbase connection or processes requests, it consumes all of the major types of resources (CPU, network, memory, disk, and swap space).


The Busy Hour

The busy hour is the hour during the day in which the largest number of operations and active sessions occur. Even in the busy hour, however, the total amount of activity is only a percentage of the total possible activity.

To illustrate this, consider the telephone world. Suppose that ABC Telephone Company has 1 million telephones installed in a given calling area, and in that area the busy hour is from 11:00 a.m. to 12:00 noon. Assuming that the average phone call lasts 2 minutes, the busy hour could theoretically involve 30 million calls—1 million phones used to make 30 calls each within that hour. In reality, only a percentage of the phones are used during the busy hour and the calls vary in duration and occurrence. Users do not typically make repeated 2-minute phone calls. They make a call of some duration, hang up, and engage in some other activity, and they may or may not make another call.
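The theoretical ceiling in this analogy is simple arithmetic. The short sketch below reproduces it; the 1 percent "actual" fraction at the end is an arbitrary illustration of how far real busy-hour traffic falls below the ceiling, not a figure from this guide.

```python
# Theoretical ceiling: every phone making back-to-back 2-minute calls for an hour.
phones = 1_000_000
calls_per_phone_per_hour = 60 // 2            # 30 two-minute calls fit in one hour
theoretical_calls = phones * calls_per_phone_per_hour
print(theoretical_calls)                      # 30,000,000 potential calls

# Real busy-hour traffic is only a small fraction of that ceiling.
actual_fraction = 0.01                        # illustrative value only
print(int(theoretical_calls * actual_fraction))   # e.g., 300,000 calls
```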

Figure 2-2 illustrates the busy hour and the assumption that activity during the busy hour is only a percentage of total possible activity.


Figure 2-2 Telephone Busy Hour

The ABC Telephone Company sizes the back end of its phone system to accommodate the real-world busy-hour use of the phones, not the upper limit of theoretical use.

Applying the analogy to Documentum systems, the number of phones is the number of users (or seats) in a Documentum installation. The number of phone owners actually making phone calls is the number of users during the busy hour. The phone conversations are the active sessions established between a user and the Docbase.

Just as only a percentage of installed telephones are used during the busy hour and used only intermittently, only a percentage of Docbase users are logged-in during the busy hour and only a percentage are making requests. Because eContent Server frees inactive sessions to save on resource consumption, only a percentage of the logged-in users served in the busy hour have active sessions at any one time. Figure 2-3 shows the relative proportion of busy hour users and active users to licensed users. Proportions will vary from site to site.

[Figure 2-2 plots thousands of calls per hour against time of day on a logarithmic scale, contrasting the actual busy-hour call volume with the number of installed telephones and the far larger number of potential 2-minute calls those phones could make.]


Figure 2-3 Licensed Users Versus Busy-Hour Users Versus Currently Active Users

In general, it is best to try to size the server systems for the busy hour. But when is it? And how can one estimate it for an application that has not been deployed yet? A bit of “guess-timating” is typically required. Typical sites estimate that 20 to 30 percent of the licensed users (in a full deployment) request service during the busy hour. Testing has shown about 20 percent of the users served in one hour are active at any point in the hour (assuming users make random requests to the Docbase throughout the hour). In the absence of any data, these are reliable proportions to use on the sizing spreadsheet.

Note: Because RightSite waits longer to time out its session, there are typically more RightSite active sessions than eContent Server active sessions at any one time.
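As a rough worked example of these proportions, the sketch below applies mid-range values from the figures quoted above (25 percent of licensed users in the busy hour, 20 percent of those active at once). The 4,000-seat example is hypothetical, and the fractions are defaults to adjust, not Documentum-mandated constants.

```python
def estimate_busy_hour_load(licensed_users: int,
                            busy_hour_fraction: float = 0.25,  # 20-30% of licensed users
                            active_fraction: float = 0.20):    # ~20% of busy-hour users
    """Rough busy-hour estimate in the absence of measured data."""
    busy_hour_users = int(licensed_users * busy_hour_fraction)
    concurrently_active = int(busy_hour_users * active_fraction)
    return busy_hour_users, concurrently_active

# Example: 4,000 licensed seats (hypothetical)
print(estimate_busy_hour_load(4000))   # -> (1000, 200)
```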



Response Time Expectations

Response times are an important criterion for judging the effectiveness of the system or service that is deployed. Response times should meet or beat user expectations.

When users are asked about desired response times, they typically respond “two to three seconds” if they have no information about the document size or format. However, users also expect that it will take longer to check out a 10-megabyte document than one that is 10K bytes. They also expect that it will take longer to publish a virtual document with 1000 parts than it will to publish one with 10 parts.

We recommend determining the major components of the workload and corresponding expectations for response time.

Using the Derived Workload

After you have determined the workload at your site, compare it to the workloads described in the following section, “The Documentum Workloads.” After you determine which workload most closely matches your site’s workload, you can fill in the appropriate columns in the Documentum Sizing Spreadsheet with your information.

You can also examine the benchmark test results reported in Chapter 4, Server Configuration and Sizing, for those tests conducted using the workload that matches your workload. Using these may also help you determine your configuration requirements.

The Documentum Workloads

This section describes the workloads used in the benchmark tests conducted by Documentum. The following three workloads are included in this section:

■ “The iTeam™ 2.2 Workload” on page 2-10

■ “The WebPublisher™ 4.1 Workload” on page 2-16


■ “The Load and Delete Workload” on page 2-22

Appendix A describes four additional workloads:

■ “The EDMI Workload” on page A-1

■ “The Web Site Workload” on page A-6

■ “The Document Find and View Workload” on page A-9

■ “The Online Customer Care Workload” on page A-9

The iTeam™ 2.2 Workload

iTeam is a Documentum application that provides a collaborative framework for developing projects. It groups together the documents, resources, news, and discussions for a project and allows users to easily reuse these for future projects. The iTeam workload simulates the activities of iTeam users.

iTeam is Web-based and uses the following Documentum products:

■ Documentum eContent Server

■ Documentum RightSite Server

■ Documentum Web Development Kit™ (WDK)

■ Docbasic®

■ Documentum Intranet Client™

Figure 2-4 illustrates the software architecture.


Figure 2-4 iTeam Software Architecture

Each iTeam connection through the Web server establishes two Documentum sessions to eContent Server: one through the Documentum RightSite Server and the other through the JRun server. The RightSite-based session is used for various Documentum Intranet Client component customizations, and the JRun-based session supports the operations made through the WDK and DFC using JavaServer Pages.

Workload Scenario

Each iTeam user logs into iTeam throughout the hour and performs a series of different tasks. The majority of the users execute frequently occurring tasks that include checking their inboxes, handling the tasks, displaying the iTeam personal view, viewing documents in text, MS Word, and Powerpoint formats, reading news, and participating in group discussions. Other users perform tasks that occur less frequently, such as document check out and check in, workflow and business policy operations, and attribute searches. In addition to these standard operations, the workload also exercises the following Documentum capabilities:

■ JSP code written with Documentum Web Development kit

■ Customized Documentum 4i Intranet Client components that use the Documentum RightSite Server


This workload stresses the multi-user capabilities of the system. The operations performed on the objects are feature-rich. For example, documents proceed through a lifecycle of three states: InProgress, Review, and Approved. When a document is promoted to the Review state, a workflow is started to notify a group of 10 users that the document is ready for review. The notifications are placed in the users’ inboxes, and the first user to review the change completes the task for the team.

The attribute search is resource intensive. Each document has a set of at least 50 attributes, set to pre-determined values, that are used to achieve fixed-sized result sets for attribute-based searches. The number of hits for each search is typically about 30 documents but can be as high as 110. The searches are case-insensitive and wildcard-based, and use the iTeam simple single-box search.

Note: An acceptable variation of the workload customizes the single-box search to include a full-text search. In such cases, the values in the attributes are also placed in the document content. (The full benchmark reports, found in Kpool, disclose when full-text searching was used.)

Workload Operations

Table 2-1 describes the operations performed by the workload.

Table 2-1 Operations in the iTeam 2.2 Workload

Operation Description

CONN_+_PERSONAL_VIEW Establishes a user’s connection and Docbase session and automatically displays the Personal View for the user. A Personal View shows items of interest for every project with which the user is associated. The items include news, activities, and issues for the user. In this workload, each user has an interest in 5 deliverables in each of the 5 projects in which the user is participating.

PERSONAL_VIEW Displays the user’s Personal View.

VIEW_INBOX Displays the workflow tasks in which a user is participating.

PROCESS_WORKFLOW_ITEM  Forwards a completed workflow task.

VIEW_DOCUMENT  Selects a text, Word, or Powerpoint document for display and then returns to the Document portion of the Center view.

DISCUSS_POST_REPLY  Displays the Discussion tab of the Center/Project view so that the user can review some random discussion and then reply to the discussion.

CHECKOUT_DOC  Checks out a random text, Word, or Powerpoint document. The document is selected from the document list of a Project view's deliverables.

CHECKIN_DOC  Checks in a checked out document.

PROMOTE_DOCUMENT  Displays the page that lists all possible operations on a document and then promotes the document to the Review state. The promotion starts a distribution workflow to route the document to other members of the team for review.

READ_NEWS  Displays a random news article (text format) from a project.

SEARCH_ATTRIBUTES  Searches the documents in one project. The search is conducted on 4 or 5 attributes in a case-insensitive mode using wild card matching. The search returns from 0 to 100 hits.

Workload Scaling

To scale the workload, the Docbase size is increased as more users are tested. Each user is associated with 5 projects and each iTeam project has 20 users. Consequently, for every 20 users in the workload there are 5 projects, and for a 200 users/busy hour run there will be at least 50 active projects. Additionally, a production Docbase in operation for at least one year will have at least as many inactive or completed projects in the Docbase as there are active projects. Consequently, in a configuration supporting 200 users per busy hour, there should be at least 100 projects.

Each project in this workload includes:

■ 10 news items

■ 10 project deliverables (iTeam activities, with at least 5 visible to each user)

■ 5 deliverable documents per project deliverable (2 text, 2 Word, and 1 Powerpoint)

■ 5 reference documents per project deliverable (2 text, 2 Word, and 1 Powerpoint)

■ 10 discussion groups

Each deliverable or reference document (not including discussions and news) has three versions. In total, each project represents about 160 objects with content, so a 200-user per hour configuration should have at least 35,000 objects. This represents the minimum number of documents. More projects and documents can be loaded to support a larger number of users. (The total number of documents loaded for each benchmark test is disclosed in the full benchmark reports.) The document formats, sizes, and distributions are shown in Table 2-2.
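To translate these object counts into a rough content-volume figure, the distribution in Table 2-2 below can be applied to the minimum object count. The sketch assumes the Table 2-2 average sizes are in bytes (the guide does not state the unit), so treat the result as an order-of-magnitude estimate only.

```python
# Rough content-volume estimate for the minimum iTeam Docbase population.
# Assumes the "Estimated Average Size" values in Table 2-2 are bytes.
min_objects = 35_000

# (portion of total documents, estimated average size) from Table 2-2
profile = {
    "powerpoint": (0.17, 50_000),
    "word":       (0.34, 40_000),
    "text":       (0.34, 25_000),
    "messages":   (0.15, 2_000),
}

avg_size = sum(share * size for share, size in profile.values())   # ~30,900 per object
total_bytes = min_objects * avg_size
print(f"{total_bytes / 1e9:.1f} GB of raw content")                # roughly 1.1 GB
```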

Table 2-2 Document Formats, Sizes, and Distributions

Documents          Portion of Total Number of Documents    Estimated Average Size
Powerpoint         17%                                      50,000
Word               34%                                      40,000
Text               34%                                      25,000
Messages (text)    15%                                      2,000

Workload Response Time Requirements

When a benchmark test is run, the primary benchmark obtained is the number of users, performing their tasks, who can be supported with acceptable response times.

Each iTeam reference user performs 10 operations at random times, and the response time for these operations is measured. Each operation typically displays several HTML screens dynamically generated by RightSite or WDK/JRun and the content from the Docbase (Powerpoint, Word, and text files).

Users start and end their work randomly throughout the busy hour. The interval between a user's requests affects performance and response time because Documentum frees a user connection that has no activity after some amount of time (typically two to five minutes). Re-establishing the session (which happens transparently when work is initiated on an idle session) consumes more CPU resources. Simulating this behavior in the test models the real world more accurately. ("User Connection States and Resource Consumption" on page 2-2 provides more information about the resource consumption of the various user connection states.)

Two to four seconds per screen is the acceptable response time generally, with some exceptions when the operation is complex. After factoring in the relative weights of all operations and the number of screens, the average response time per screen is typically two seconds. Table 2-3 lists the response time requirements for the operations.

Table 2-3 Response Time Requirements for iTeam 2.2 Workload

Task                     Number of Screens   Acceptable Response Time per Screen (seconds)   Total Acceptable Average Response Time (seconds)
CONN_+_PERSONAL_VIEW     2                   5                                               10
PERSONAL_VIEW            1                   5                                               5
VIEW_INBOX               1                   6                                               6
PROCESS_WORKFLOW_ITEM    4                   3                                               12
VIEW_DOCUMENT            3                   3                                               9
DISCUSS_POST_REPLY       4                   2                                               8
CHECKOUT_DOC             3                   3                                               9
CHECKIN_DOC              4                   2                                               8
PROMOTE_DOCUMENT         4                   4                                               16
READ_NEWS                1                   2                                               2
SEARCH_ATTRIBUTES        3                   7                                               20


Table 2-4 shows a set of sample results. (Note that the sample results were generated by running the benchmark test on an optimal hardware configuration for the number of users tested.)
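The overall 2.31-second figure in the last row of Table 2-4 (below) is the per-screen response time averaged across all operations, weighted by how often each operation ran. The minimal sketch below reproduces it from the table's own data; the screens-per-operation counts are taken from Table 2-3 and are consistent with the per-screen column of Table 2-4.

```python
# (average operation response time in seconds, operations in one hour, screens per operation)
# from Table 2-4, with screen counts from Table 2-3.
iteam_results = {
    "CONN_+_PERSONAL_VIEW":  (9.54, 200, 2),
    "PERSONAL_VIEW":         (4.89, 147, 1),
    "VIEW_INBOX":            (3.62, 200, 1),
    "PROCESS_WORKFLOW_ITEM": (6.78, 20, 4),
    "VIEW_DOCUMENT":         (6.75, 649, 3),
    "DISCUSS_POST_REPLY":    (6.60, 588, 4),
    "CHECKOUT_DOC":          (8.30, 56, 3),
    "CHECKIN_DOC":           (5.35, 56, 4),
    "PROMOTE_DOCUMENT":      (13.67, 63, 4),
    "READ_NEWS":             (0.42, 200, 1),
    "SEARCH_ON_TITLE":       (19.87, 24, 3),
}

total_time    = sum(t * n for t, n, s in iteam_results.values())   # total seconds of response time
total_screens = sum(n * s for t, n, s in iteam_results.values())   # total screens displayed
total_ops     = sum(n for _, n, _ in iteam_results.values())

print(total_ops)                              # 2,203 operations in the hour
print(round(total_time / total_screens, 2))   # ~2.31 seconds per screen
```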

Table 2-4 Example Response Times for iTeam 2.2 Workload

Operation                Average Operation Response Time (seconds)   Total Acceptable Average Response Time (seconds)   Total Operations in One Hour   Average Response Time per Screen (seconds)
CONN_+_PERSONAL_VIEW     9.54      10     200     4.77
PERSONAL_VIEW            4.89      5      147     4.89
VIEW_INBOX               3.62      4      200     3.62
PROCESS_WORKFLOW_ITEM    6.78      12     20      1.70
VIEW_DOCUMENT            6.75      9      649     2.25
DISCUSS_POST_REPLY       6.6       8      588     1.65
CHECKOUT_DOC             8.3       9      56      2.77
CHECKIN_DOC              5.35      8      56      1.34
PROMOTE_DOCUMENT         13.67     16     63      3.42
READ_NEWS                0.42      2      200     0.42
SEARCH_ON_TITLE          19.87     20     24      6.62
For all                                   2,203   2.31

The WebPublisher™ 4.1 Workload

WebPublisher is a Documentum application that provides a framework for managing content on a Web site. It integrates with Documentum AutoRender Pro™ for file renditions and with Documentum WebCache Server for content delivery to a Web site. The WebPublisher 4.1 workload simulates the operations of WebPublisher users.

WebPublisher is Web-based and uses the following Documentum products:


■ Documentum e-Content Server

■ Documentum RightSite Server

■ Docbasic

■ AutoRender Pro

■ WebCache Server

Figure 2-5 illustrates the software architecture.

Figure 2-5 WebPublisher Architecture

Workload Scenario

The operations in the WebPublisher workload simulate the various actions of WebPublisher users. In addition to many basic Documentum operations, this workload exercises the following features and technology of Documentum 4i:

■ eContent Server business lifecycle processing

■ eContent Server 4i workflow processing

■ Customized Documentum 4i Intranet Client components using the Documentum RightSite Server

■ AutoRender Pro rendering of documents from Word to HTML

■ WebCache Web-page publishing

There are two kinds of users: content managers and content authors. Each content manager works with ten content authors on their own Web site (wcm_channel).


Each content manager:

■ Randomly selects a text or Word template from Content Configuration and creates a page for a designated author

Content managers create one page for each author with whom they work.

■ Reviews any tasks labeled Reviewer

■ Approves any tasks labeled Approver

■ Views a page on a Web site

Each content author:

■ Checks out and edits the page provided by the content manager

■ Checks in an edited page

■ Unlocks the checked out page

■ Publishes a page using the WebCache server and views it on a Web site

■ Submits and routes a page to the content manager group for review

The workload documents are attached to the WebPublisher Engineering Lifecycle, and all operations in the workload are activities in the Manager Process Workflow.

Workload Operations

Table 2-5 lists the operations in the WebPublisher workload.

Table 2-5 Operations in the WebPublisher Workload

Operation Description

CONN Establishes a user’s connection and session and displays the EDIT WEB PAGE screen for the user.

WORK_ON_TASKS Displays any available Reviewer or Approver tasks for a content manager or any available Author tasks for a content author.

CREATE_WEB_PAGE Displays a screen that allows the user to create a new Web page. The user enters a file name, the type of workflow, the name of the next user who will work on the page, and the page’s expiration date.

SELECT_A_TASK  Selects the Author, Reviewer, or Approver task and displays actions that a user can execute on the Web page associated with the task.

VIEW_DETAILS  Displays the tasks in which the user is participating and the attributes of the Web page on which the user is to work.

REVIEW_WEB_PAGE  Writes and saves some review notes for a Web page and submits the page to the next person for approval.

APPROVE_WEB_PAGE  Writes and saves some review notes for a Web page and submits the page to the next person for approval.

WEB_VIEW  Moves the Web page from the source Docbase to the target Docbase.

CHECKOUT_TEXT_PAGE  Checks out a text page and brings up a text editor.

CHECKOUT_WORD_PAGE  Checks out a Word page and brings up a Word editor.

CHECKIN_TEXT_PAGE  Checks in a checked out text page. The operation occurs when the Save This Page command is executed.

CHECKIN_WORD_PAGE  Checks in a checked out Word page. The operation occurs when the Save This Page command is executed.

CANCELCHECKOUT_PAGE  Unlocks a locked page.

FINISH_SUBMIT  Submits a Web page to the next person, who may be a content manager, for review, approval, or both.

Workload Scaling

The Docbase size is increased as the workload's user population grows.

Page 38: Document Um System Sizing Guide

Deriving Workload RequirementsThe Documentum Workloads

2–20 Documentum System Sizing Guide

Workload Response Time Requirements

When a benchmark test is run, the primary benchmark obtained is the number of users who can be supported with acceptable response times while performing their tasks.

The user operations in this workload represent a possible worst-case scenario, because each user moves a document through the entire lifecycle (WIP, Staging, and Approved) within the test period.

Each content manager task consists of 7 operations. Each content author task consists of 11 operations. All tasks are performed at random times and the response time for them is measured. Each task typically displays several HTML screens dynamically generated by RightSite and the content from the Docbase (Word and text files).

The interval between a user’s requests affects performance and response time because Documentum frees a user connection that does not have any activity after some amount of time (typically two to five minutes). Re-establishing the session (which happens transparently when work is initiated on an idle session) consumes more CPU resources. Simulating this behavior in the test more accurately models the real world.

Two to four seconds per screen is generally the acceptable response time, with some exceptions when the operation is complex. After factoring in the relative weights of all operations and the number of screens, the average response time per screen is typically three seconds. Table 2-6 lists the response time requirements for the WebPublisher workload.
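To make the averaging concrete, here is a small Python sketch (an illustration only, not part of any Documentum tool) that reproduces the per-screen arithmetic using a few rows taken from Tables 2-6 and 2-7: each operation’s measured response time is divided by its number of screens, and the overall figure is weighted by how often each operation runs.

```python
# Illustrative only: reproduces the per-screen arithmetic behind Tables 2-6 and 2-7.
# (operation, measured response time in seconds, screens per operation, operations per test period)
results = [
    ("CREATE_WEB_PAGE", 20.39, 7, 80),
    ("REVIEW_WEB_PAGE", 3.09, 3, 138),
    ("FINISH_SUBMIT", 4.95, 3, 295),
]

total_time = 0.0
total_screens = 0
for name, response, screens, count in results:
    per_screen = response / screens    # e.g. 20.39 / 7 = 2.91 seconds per screen
    total_time += response * count     # time spent in this operation over the period
    total_screens += screens * count   # screens displayed by this operation
    print(f"{name}: {per_screen:.2f} s per screen")

# Overall average response time per screen for this subset, weighted by operation counts
print(f"overall: {total_time / total_screens:.2f} s per screen")
```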

Table 2-6 Response Time Requirements for WebPublisher Workload Operations

Operation | Number of Screens | Acceptable Response Time per Screen (in seconds) | Total Acceptable Average Response Time (in seconds)
CONN | 1 | 10 | 10
WORK_ON_TASKS | 1 | 2 | 2
CREATE_WEB_PAGE | 7 | 3 | 21
SELECT_A_TASK | 1 | 2 | 2
VIEW_DETAILS | 1 | 2 | 2
REVIEW_WEB_PAGE | 3 | 1.3 | 6
APPROVE_WEB_PAGE | 3 | 1.3 | 6
WEB_VIEW | 1 | 12 | 12
CHECKOUT_TEXT_PAGE | 2 | 4 | 8
CHECKOUT_WORD_PAGE | 2 | 4 | 8
CHECKIN_TEXT_PAGE | 2 | 4 | 8
CHECKIN_WORD_PAGE | 2 | 4 | 8
CANCELCHECKOUT_PAGE | 3 | 3 | 9
FINISH_SUBMIT | 3 | 3 | 9

Table 2-7 shows a sample set of results.

Table 2-7 Example Set of Results for WebPublisher Workload

Operation | Average Operation Response Time (in seconds) | Total Acceptable Average Response Time (in seconds) | Total Operations in 1.5 Hours | Average Screen Response Time (in seconds)
CONN | 9.77 | 10 | 88 | 9.77
WORK_ON_TASKS | 1.8 | 2 | 686 | 1.8
CREATE_WEB_PAGE | 20.39 | 21 | 80 | 2.91
SELECT_A_TASK | 0.83 | 2 | 686 | 0.83
VIEW_DETAILS | 2.06 | 2 | 159 | 2.06
REVIEW_WEB_PAGE | 3.09 | 5 | 138 | 1.03
APPROVE_WEB_PAGE | 2.35 | 5 | 78 | 0.78
WEB_VIEW | 11.39 | 12 | 157 | 11.39
CHECKOUT_TEXT_PAGE | 5.73 | 8 | 82 | 2.87
CHECKOUT_WORD_PAGE | 5.84 | 8 | 76 | 2.92
CHECKIN_TEXT_PAGE | 7.00 | 8 | 82 | 3.5
CHECKIN_WORD_PAGE | 6.93 | 8 | 76 | 3.47
CANCELCHECKOUT_PAGE | 6.47 | 9 | 158 | 2.31
FINISH_SUBMIT | 4.95 | 9 | 295 | 1.65

The Load and Delete Workload

The load and delete workload simulates a common scenario for content repositories: loading and deleting documents. Many Documentum sites load a large number of documents in batches and then provide online user access to those documents or process the documents in some way (for example, the documents may be assembled and published). In all cases, Docbases do not grow infinitely. Documents are brought into the Docbase and eventually aged out.

The load and delete workload uses eContent Server and Docbasic.

The primary benchmarks for this workload are:

■ The time to load 10,000 PDF documents (each 600,000+ bytes in size)

■ The time to delete 10,000 PDF documents

■ The time to load and delete 10,000 PDF documents (at the same time)

The Docbase is loaded using as many parallel loading sessions as are needed to increase throughput. Each loading session loads 10,000 documents. A unique number is assigned to the load session and recorded in the chap_num attribute of all documents loaded during that session. (This number is indexed in the DBMS and used by the delete program.) The reported time-to-load is the longest time reported for any of the sessions.

The documents are deleted in a single session. Each time the delete program runs, it deletes all the documents that have the same value in chap_num. This ensures that 10,000 documents are deleted each time the program runs.

A secondary benchmark for the tests is the number of documents that have been pre-loaded into the Docbase. The size of the Docbase directly affects the response times for the primary metrics. The more documents a Docbase contains, the longer it takes to insert new objects. The minimum number that are pre-loaded for benchmark tests using this workload is 100,000. To minimize disk space requirements, the pre-loaded documents can be any size (for example, 5K bytes).

When loaded, the documents are created as a subtype of dm_document and have various custom attributes.
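The bookkeeping described above can be modeled in a few lines. The sketch below is only a simplified illustration of the benchmark’s timing rules (the function and variable names are invented, and no eContent Server API is shown): each parallel session tags its documents with a chap_num, the reported load time is the slowest session, and a delete pass removes every document that shares one chap_num value.

```python
# Simplified model of the load-and-delete benchmark; not the actual test driver.
import time
from concurrent.futures import ThreadPoolExecutor

DOCS_PER_SESSION = 10_000

def load_session(chap_num, store):
    """Load one batch of documents, tagging each with this session's chap_num."""
    start = time.perf_counter()
    for i in range(DOCS_PER_SESSION):
        # The real workload imports 600,000+ byte PDF files through eContent Server;
        # here only the bookkeeping that matters to the benchmark is recorded.
        store.append({"object_name": f"doc_{chap_num}_{i}", "chap_num": chap_num})
    return time.perf_counter() - start

def delete_batch(chap_num, store):
    """Delete every document whose chap_num matches; always 10,000 per run."""
    remaining = [d for d in store if d["chap_num"] != chap_num]
    deleted = len(store) - len(remaining)
    store[:] = remaining
    return deleted

docbase = []  # stand-in for a Docbase that would be pre-loaded with 100,000+ documents
with ThreadPoolExecutor(max_workers=4) as pool:
    times = list(pool.map(lambda n: load_session(n, docbase), range(1, 5)))

# The reported time-to-load is the longest time of any parallel session.
print(f"time to load: {max(times):.2f} s; deleted: {delete_batch(1, docbase)} documents")
```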

Comparing and Contrasting the Workloads

This section compares and contrasts the workloads in terms of the software architecture, resource consumption, and usage patterns modeled in each.

Software Architecture

All of the workloads except for the load and delete workload use an HTTP thin-client, or N-tier, paradigm. In an HTTP thin-client architecture, Documentum DMCL (client library) processing occurs on the machine that hosts RightSite and the Internet Server. This is in contrast to the 3-tier architecture, in which client library processing occurs on the users’ PCs. With HTTP thin-client architecture, very little work actually happens on the client machine (all users are assumed to be using browsers that support HTTP). RightSite performs the operations that, in a 3-tier environment, are performed by the hundreds or thousands of client PCs. Figure 2-6 illustrates the difference between 3-tier architecture and HTTP thin-client architecture.


Figure 2-6 Documentum Client-library/3-tier Versus N-tier

(The figure contrasts the two models: in 3-tier mode, Documentum DMCL operations run on hundreds to thousands of individual user PCs that connect to the Documentum DocPage Server; in the thin-client HTTP model, the DMCL operations run on a centralized middle-tier server hosting the Internet Server and Documentum RightSite, which connects to the DocPage Server on behalf of the user PCs.)

Resource Consumption

The workloads differ as to per-user operations and resource consumption.

The most resource-intensive workload is the WebPublisher workload, primarily due to its heavy workflow and lifecycle component. The next most expensive workload is the iTeam workload, for which the workflow and lifecycle components are less intensive than the Web Publisher workload but more intensive than the EDMI workload. (The EDMI workload is described in Appendix A, Additional Workloads.) The least expensive workload is the RightSite static Web site workload (which has a very small dynamic HTML component, while all of the other workloads have heavy dynamic HTML content).

Figure 2-7 illustrates the CPU consumption, normalized per user, for each of the workloads (assuming a 400Mhz Pentium II processor). This data is useful when comparing workloads on hardware for which there is no benchmark data.
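One common back-of-envelope use of per-user CPU figures like those in Figure 2-7 is sketched below. The formula and all the numbers are placeholders for illustration; they are not taken from the benchmark reports, and the guide itself does not prescribe this calculation.

```python
# Back-of-envelope use of per-user CPU figures; the numbers below are placeholders,
# not values from Figure 2-7 or from any benchmark report.
def users_per_busy_hour(cpu_secs_per_user, n_cpus, relative_cpu_speed=1.0, target_util=0.7):
    """Rough busy-hour capacity from CPU seconds consumed per reference user.

    cpu_secs_per_user  -- CPU seconds per user on the 400MHz Pentium II baseline
    relative_cpu_speed -- how much faster the target processor is than the baseline
    target_util        -- fraction of CPU you are willing to consume at the busy hour
    """
    effective_cost = cpu_secs_per_user / relative_cpu_speed
    return int(n_cpus * 3600 * target_util / effective_cost)

# Example: a hypothetical workload costing 40 CPU seconds per user on the baseline,
# run on 4 CPUs assumed to be roughly twice as fast as a 400MHz Pentium II.
print(users_per_busy_hour(cpu_secs_per_user=40, n_cpus=4, relative_cpu_speed=2.0))
```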



Figure 2-7 Per User CPU Relationships Among the Workloads

(The chart plots CPU seconds per reference user, normalized to a 400MHz Pentium II processor, for the Web Publisher, iTeam, Online Customer Care, EDMI, and RightSite static Web site workloads.)

Usage Patterns

All the workloads except the RightSite static Web site workload use 100-percent named access. There were no anonymous RightSite users for those tests. A typical Documentum deployment that includes RightSite has both named and anonymous users. Named users provide a user name and password and then are authenticated and provided with some exclusive resources. In particular, there is a separate RightSite process, WDK session, or both created for each named user. RightSite anonymous users do not provide a name or password; they share the anonymous login configured with RightSite. On the resource side, they share from a pool of RightSite processes, rather than having their own resources. Consequently, a workload that uses 100 percent named users consumes more CPU and memory resources than a workload that has anonymous users.

The Document-Find-and-View workload has a client-server architecture and uses named users. RightSite is not part of that workload. Although the EDMI workload (described in Appendix A, Additional Workloads) is also Web-based, in most cases the RightSite portion can be factored out of the data to allow you to size this workload based on a client-server model.



Operations Not Included in Workloads

The workloads described in this chapter do not include the following operations:

■ Creating full-text indexes and full-text searching

■ Dumping and loading a Docbase

■ Distributed content operations and object replication

■ Operations involving turbo storage areas

If these operations are part of your workload, you may want to increase the expected resource consumption accordingly.


3 Hardware Architecture and Scaling

This chapter discusses the architecture of Documentum Server-side software and how to scale the software with increased load. The server software is multi-tiered and can be partitioned horizontally to allow for scaling up within a single host or scaling out across multiple hosts. This chapter outlines how this can be done for each server tier. The chapter includes the following topics:

■ “Overview of Software Trends Affecting Scaling” on page 3-1

■ “Scaling the Web Tier” on page 3-5

■ “Scaling the eContent Server Tier” on page 3-7

■ “Scaling DocBrokers” on page 3-10

■ “Scaling the RDBMS” on page 3-10

■ “Host-based vs. Multi-tiered Configurations” on page 3-11

■ “High Availability Considerations” on page 3-12

■ “Scaling Across the Enterprise” on page 3-14

■ “Scaling the Web Content Management Edition” on page 3-17

■ “Scaling the Portal Edition” on page 3-21

Overview of Software Trends Affecting Scaling

Several trends in the computer industry today impose scaling requirements on server software:

■ More processors per server and more powerful processors

■ Software reuse

■ Wide variance in potential user deployments


The requirements of the first two trends point to partitioning software within a single server as a solution. The requirements of the third trend point to partitioning the software over unit machines (for higher capacity and better reliability) as a solution.

More Powerful Processors and Software Reuse

One trend affecting scaling decisions is the rapid increase in processor power (133MHz to 400MHz to 800MHz within a few years) and the availability of server systems that can support more of these powerful processors within a single system. Theoretically, if server software were well tuned, performance would be improving dramatically. However, this trend is countered by the popularity of software reuse.

Software reuse ensures that no single vendor provides all the software used in a system and thus, scaling the system is limited by the weakest component. One chief lesson of the RDBMS scaling efforts in the past ten years has been that excellent scaling within a large SMP server is most likely when the server software vendor has almost complete control over most aspects of the software, down almost to the operating system level. With software reuse, this level of control is significantly more difficult to obtain and multi-user response time problems caused by internal server resource bottlenecks become harder to locate and fix.

The best option left to software vendors is to ensure that the server software can be partitioned into independent units that can be accessed transparently as a single server. If partitioning is possible, sites can add additional servers as the existing servers start to reach their user limits. The sites can add additional servers until the entire capacity of the processors is exceeded.

Partitioning allows Documentum to scale on large SMP-based systems. Customers might prefer to use such systems to consolidate software on one easily managed system rather than distributing it over a large number of small server machines. Figure 3-1 illustrates scaling up (partitioning within one large system).


Figure 3-1 Scaling Up and Internally Partitioning Server Software

The ability to partition the server software into separate units also supports the ability to scale the system out across multiple server machines. That is, enterprises can spread the software server units over many machines or put them within a single large system. Figure 3-2 illustrates scaling out (partitioning over multiple servers).


Figure 3-2 Scaling Out to Increase Capacity

Wide Variance in User Deployments

Scaling out across many small machines is critical for IT groups faced with uncertain rollout schedules and user populations. IT groups are faced with Web-based software deployments that could serve not only their company employees, but the company’s suppliers and customers. The user populations could easily number in the tens of thousands, and there is typically a large possible variance in the types of usage.

One way to deal with this uncertainty is to buy a configuration that can handle the worst-case scenario. This is potentially a waste of hardware budget money if actual usage falls far below expectations. The other way to handle the situation is to purchase a smaller system that might need more capacity later. But, if the small system is not chosen correctly, complicated upgrade and cutover scenarios might result when the new hardware is added (especially if servers need to be replaced). It is typically advantageous for an IT group to be able to add capacity using unit server machines. That is, if one system reaches its capacity, the ability to add to the overall system capacity by simply plugging in another system without bringing down the original system is helpful.



The Trends and Documentum

The Documentum software supports scaling up (partitioning over one server machine) and scaling out (partitioning over multiple server machines). The Web-tier software, the eContent Server software, and the DocBroker (the name server) can all be partitioned internally within a single system and externally across multiple machines. Additionally, they can be partitioned so that they appear to be a single server to the connecting clients.

How this transparent load balancing is accomplished depends on the server component. With the Web-tier software, it is done with standard load balancing hardware. eContent Server load balancing is accomplished by the DocBrokers (name servers), and the DocBroker load balancing is handled by each calling client (app server).

Partitioning not only allows the system to scale seamlessly and easily with increased user load, it also helps make the entire system less sensitive to single-system failures. If one server machine crashes, the other machines assume the work of the crashed machine. Sometimes applications must be able to handle and retry under certain failures, but in general this failover feature works for the Web-tier software, eContent Server tier, and the DocBrokers.

Finally, it’s important for an enterprise application to be able to scale over large geographic areas that might be separated by low-bandwidth, high-latency connections or by heavily used, high-bandwidth connections. Documentum provides many distributed feature options to allow applications to scale over these deployment scenarios. (Table 3-1 on page 3-14 includes a list of these options.)

Scaling the Web Tier

This section describes how to scale the Web-tier software up within a single system or out across multiple systems.

Documentum offers two basic Web solutions:

■ A RightSite-based solution

■ An application-server-based solution


Typically, there are different reasons for partitioning each solution. Prior to Release 4.2, RightSite servers were partitioned within a single system to work around an overloaded DocBroker. The dmcl.ini file identified a single primary DocBroker, and the RightSite server accessed that DocBroker for every user. Release 4.2 introduced DocBroker load balancing features to address the problem of overloaded DocBrokers. (DocBroker scaling is discussed in “Scaling DocBrokers” on page 3-10.)

The application-server-based solutions (for example, the WDK with BEA or any application server with DFC) need to be partitioned due to multi-user internal bottlenecks that cause response times to increase.

Internally, the application servers can be partitioned in the following manner:

1. The Ethernet boards are set up to listen on multiple IP addresses.

2. An HTTP server is set up to work with each of these separate addresses.

3. Each application server is associated with an HTTP server.

After partitioning the software, a network load balancer can be employed to spread the traffic over the various software servers. One type of network load balancer that is frequently used is a hardware device that maps a single virtual IP address to the Web farm’s IP addresses. Many different vendors provide these products, but the products must have some form of session stickiness to ensure that all session traffic from one client stays with the same server for the duration of the session.

The load balancer can ensure this stickiness in many ways, including source IP-based mapping, insert-cookie, and passive-cookie-based scenarios. If all users are on separate PCs, the load balancer must at least support source IP mapping. This method maps the source IP address of the packet to a particular server IP address for the duration of the session.

The stickiness between two hosts is typically undone after some preset period of inactivity, when it is assumed that the session has timed out due to inactivity.
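Conceptually, source-IP stickiness is a small table that maps a client address to a server address and forgets the mapping after a period of inactivity. The sketch below is only a model of that behavior, not vendor load-balancer code.

```python
# Conceptual model of source-IP session stickiness with an inactivity timeout.
import time

class StickySourceIPBalancer:
    def __init__(self, servers, idle_timeout=300):
        self.servers = servers            # Web farm addresses, e.g. IP-1 .. IP-4
        self.idle_timeout = idle_timeout  # seconds of inactivity before the mapping is undone
        self.table = {}                   # client source IP -> (server, last-seen timestamp)
        self.next_server = 0

    def route(self, source_ip):
        now = time.time()
        entry = self.table.get(source_ip)
        if entry and now - entry[1] < self.idle_timeout:
            server = entry[0]             # existing sticky mapping: keep the same server
        else:
            server = self.servers[self.next_server % len(self.servers)]
            self.next_server += 1         # new or expired mapping: pick the next server
        self.table[source_ip] = (server, now)
        return server

lb = StickySourceIPBalancer(["IP-1", "IP-2", "IP-3", "IP-4"])
print(lb.route("IP-A"), lb.route("IP-B"), lb.route("IP-A"))  # IP-A stays on the same server
```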

Figure 3-3 illustrates the use of a network load balancer. Some examples of network load balancers include the Cisco Local Director, the Coyote Point Equalizer, and BIG IP.


Figure 3-3 Transparent Load Balancing of Web Software

(The figure shows a network load balancer mapping traffic addressed to a virtual address, IP-0, onto Web farm addresses IP-1 through IP-4 for users at addresses IP-A and IP-B; the balancer should support at least source-based mapping, for example IP-B to IP-4. One server in the farm has a partitioned HTTP/application server setup and therefore uses two addresses, IP-4 and IP-5.)

Scaling the eContent Server Tier

This section describes how to scale eContent Server up within a single system and out across multiple systems.



eContent Server can be spread over multiple instances within the same host or over multiple hosts for the same Docbase. When this is done at a single data center (that is, not split in some fashion to handle geographic separation of users), all the eContent Servers typically access the same file store as a network file system. Content is stored on a file server, and file synchronization over the shared file system is provided by the interactions that eContent Server has with the database. The procedure for setting up such a configuration is detailed in the eContent Server Administrator’s Guide.

The view provided to the client software (such as an application server, RightSite, or Desktop Client) is that of a single system for the Docbase. The DocBroker maps client requests to the Docbase’s eContent Servers. The DocBroker has a list of all active eContent Servers for a particular Docbase. When a client initiates a connection to a Docbase, the client first queries the DocBroker to obtain connection information for the eContent Server. The DocBroker provides the information, and the client proceeds to set up its session with that eContent Server. If no special proximity values have been set for the servers, the default behavior is to randomly pick an eContent Server from a list of available servers for each client request. Benchmark studies show that this random-pick load balancing method works quite well.

A group of eContent Servers for a single Docbase is referred to as an eContent Server cluster or server set. Using a server set not only provides higher capacity but also provides high availability in an active/active cluster arrangement. If one server fails, the DocBroker will assign new connections or re-connections to the other eContent Servers in the cluster.
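The random-pick and active/active failover behavior can be pictured with the following sketch. It is a conceptual model only; the class and method names are invented and the real DocBroker protocol is not shown.

```python
# Conceptual model of DocBroker load balancing over an eContent Server cluster.
import random

class DocBroker:
    def __init__(self):
        self.active = {}              # Docbase name -> list of registered eContent Servers

    def register(self, docbase, server):
        self.active.setdefault(docbase, []).append(server)

    def mark_failed(self, docbase, server):
        self.active[docbase].remove(server)   # failed server is no longer handed out

    def connection_info(self, docbase):
        # Default behavior when no proximity values are set: random pick from the
        # list of available servers for each client request.
        return random.choice(self.active[docbase])

broker = DocBroker()
for name in ("econtent_1", "econtent_2", "econtent_3"):
    broker.register("sales_docbase", name)

print(broker.connection_info("sales_docbase"))
broker.mark_failed("sales_docbase", "econtent_2")   # remaining servers absorb new connections
print(broker.connection_info("sales_docbase"))
```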

Figure 3-4 illustrates load balancing for eContent Server.


Figure 3-4 Load Balancing Over a Cluster of eContent Servers

(The figure shows an application server or RightSite server acting as a client to a group of eContent Server machines for the same Docbase, backed by an RDBMS for attribute information and a file server for shared content. Each eContent Server registers with the DocBroker; when the client software queries the DocBroker to connect to the Docbase, the DocBroker randomly picks an eContent Server from the registered list and the client connects to that server.)


Scaling DocBrokers

DocBrokers (Docbase name servers) can become a bottleneck in some large multi-user environments. As the number of simultaneous requests to the DocBroker increases (more users connect), the DocBroker reaches a point where it cannot service requests fast enough. When this happens, connection response time increases.

One solution to this problem is to add more DocBrokers and spread user requests over these multiple DocBrokers. The DocBrokers can be started on separate machines or on the same machine. If they are on the same machine, they have either different port numbers or different IP addresses (after the Ethernet board has been multi-homed to support multiple IP addresses).

To spread the user requests over multiple DocBrokers, eContent Server release 4.2 lets you configure the client software to randomly pick from the list of DocBrokers in the dmcl.ini file when the client needs connection information. That is, the client software will read in the list of DocBrokers and then randomly pick one from the list. This algorithm load-balances across a group of symmetric DocBrokers, ensuring that the DocBroker’s services scale with the increase in the number of users.

(Prior to eContent Server release 4.2, the users could only be spread across multiple DocBrokers by changing the DocBroker referenced in the [PRIMARY_DOCBROKER] clause in the dmcl.ini file. Although secondary DocBrokers could be listed in the file, they were used only if the primary DocBroker did not respond. With application servers and RightSite, this meant having each of the separate servers reference a different dmcl.ini file.)
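The difference between the two client behaviors is easy to see in a sketch. The broker names and port are placeholders; a real client reads its DocBroker list from dmcl.ini.

```python
# Conceptual contrast of pre-4.2 and 4.2 client-side DocBroker selection.
import random

docbrokers = ["broker_a:1489", "broker_b:1489", "broker_c:1489"]   # placeholder entries

def pick_docbroker_pre_42(brokers, primary_alive=True):
    # Before release 4.2: always the primary; backups are used only if the primary fails.
    return brokers[0] if primary_alive else random.choice(brokers[1:])

def pick_docbroker_42(brokers):
    # Release 4.2: a random pick from the whole list, spreading requests
    # across a group of symmetric DocBrokers.
    return random.choice(brokers)

print(pick_docbroker_pre_42(docbrokers))
print(pick_docbroker_42(docbrokers))
```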

Scaling the RDBMS

Given that the Web software, the eContent Servers, and the DocBrokers can increase their capacity just by adding servers and enough hardware, the bottleneck will shift to the RDBMS (for attribute information) or the shared file server (for the content). RDBMS technologies can’t be partitioned easily and rely more on scaling up to add capacity, rather than scaling out. Each file server and database vendor has different scalability limits. We recommend viewing the Documentum detailed benchmark reports for more information.


Host-based vs. Multi-tiered Configurations

This section outlines various considerations affecting the decision to implement a host-based configuration (all software running on the same host) or a multi-tiered configuration (one or more server software components running on separate hosts).

First, typically no solution is perfect in every way. The decision must balance requirements and needs against what each solution offers. Second, many of the considerations that make the largest difference are not technical. This section focuses on the strengths of each solution and mentions items that are perhaps neutral for both.

The advantages of a host-based solution include:

■ Simplifies some administrative tasks due to the single-host nature of the installation

■ Allows for simplified software upgrade procedures (all software on same host)

■ Provides better load balancing of applications and servers to the available CPUs

■ Enhances ability to share CPU capacity and system resources among separate applications

The advantages of a multi-tiered, multi-server configuration include:

■ Lets you add capacity more simply, through additional server machines (does not require a “fork-lift” removal/replacement of one server machine with a larger one)

■ Lets you take advantage of the lowest cost/high-capacity server hardware

■ Allows problems with software servers to be isolated more easily

■ Supports high-availability configurations at less cost

Although an important consideration is performance, in just about every case, you can set up the Documentum software on a host-based configuration and achieve the same performance as on a multi-tier, multi-server configuration. Performance is, therefore, a neutral consideration.

Another consideration is ease of administration. Aside from the administrative advantages mentioned for each configuration (a single host means fewer boxes, but multiple hosts ease problem isolation), this very important consideration can only be addressed within the context of the skills of a company’s administrative staff. This is especially true when trying to decide which operating system to use (Windows NT or Unix). For example, if the administrative staff’s skills are in Unix and not Windows NT, the staff can more likely ensure good performance and high availability with Unix-based applications than with Windows NT applications. Their skill set could have a profound impact on quality of service.

High Availability Considerations

This section outlines some of the considerations and options available for configurations that support high availability. The section assumes that, at the data integrity level, the need for mirrored drives (RAID 1) or other redundant configurations (such as RAID 5) is obvious and well covered in the literature and does not need to be discussed here. This section is more concerned with computer system and service availability and refers to the configurations discussed earlier.

One advantage of a multi-tier, multi-server architecture is the ability to take advantage of high availability at lower cost. This advantage is due to the lower costs of the backup hardware. A system with fewer machines must have backup machines with more capacity than systems with many machines. This is illustrated in Figure 3-5.


Figure 3-5 Host-Based versus Multi-Tiered System

If the load is concentrated on a single host, then the backup machine must be able to handle 100 percent of the load to avoid major degradations in service response. However, if the load is distributed over six hosts, for example, service response can be maintained if each host only supports 20 percent of the load.
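The arithmetic behind these figures is simple. A minimal sketch, assuming the load is spread evenly over the surviving hosts after a single host failure:

```python
# Per-host share of the total load needed to ride out one host failure,
# assuming the load is spread evenly over the surviving hosts.
def share_per_host_after_one_failure(n_hosts):
    if n_hosts < 2:
        raise ValueError("need at least two hosts to survive a failure")
    return 1.0 / (n_hosts - 1)

for n in (2, 6):
    print(f"{n} hosts: each must handle {share_per_host_after_one_failure(n):.0%} of the load")
# 2 hosts: each must handle 100% of the load
# 6 hosts: each must handle 20% of the load
```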

The other important consideration is the high-availability technology. It is possible to use different solutions for the Web tier and the eContent Server tier. For example, the network load balancers described earlier for the Web tier typically re-route traffic to non-failed servers. In the eContent Server tier, the DocBrokers also route users to non-failed eContent Servers. Finally, the client dmcl software finds other DocBrokers on the list if the one picked initially does not respond. The native Documentum high-availability features allow for maximal use of the hardware at lower cost than typical operating system vendor solutions.

With the Documentum eContent Server, the solutions provided by the operating system are typically desirable only when a pair of systems supports numerous Docbases and no single Docbase runs on both systems. Failover will move the Docbases from one system to the other. If a Docbase spans multiple hardware servers, you should use the native high-availability features of eContent Server.

However, in any high-availability scenario, the RDBMS and the file server (if used) must be set up to be highly available. These servers usually rely on the OS-specific high-availability technology (for example, the Microsoft cluster server on Windows NT).

Scaling Across the Enterprise

This section outlines some of the distributed features and cost trade-offs that can achieve good user response time. Most enterprise deployments include users at remote data centers or branch offices. Deciding which configuration to use depends heavily on examining the cost of network bandwidth between the sites against the cost of extra server administration at the remote site. If the cost of the network bandwidth is high between sites, then it might be more economical to replicate Docbases at the remote site.

Table 3-1 summarizes the various deployment options offered by Documentum to handle remote users.

Table 3-1 Summary of Deployment Options

Option Description

Central System All server software is located at some central data center.

Remote Web Servers Remote users access local Web and application servers that communicate through the DMCL to Docbases that are stored in a centralized data center.

Distributed Storage Area The RDBMS is in a centralized site. The Docbase has eContent Servers installed at the central site and each remote site. Each eContent Server has its own storage area so that local users can access their content at higher speeds.

Content Servers A feature available for Distributed Storage Areas. The eContent Servers at remote sites are designated as content servers and used to retrieve only content. All metadata requests are handled by the server at the primary site. This reduces the amount of database communication.

Content Replication Similar to the Content Servers option, except that the content is replicated to the remote site. All users are able to access the content quickly, unlike in the Content Server situation.

Reference Links In a multi-Docbase model, reference links allow users to reference objects in other Docbases. Only metadata is copied to the remote site. Content is not replicated to the remote site. Users have fast access to local objects, but when remote objects are displayed, users need to get them from the remote site.

Object Replication In a multi-Docbase model, objects and their content are replicated from one Docbase to another. Replication occurs in off-peak hours; during the busier day users are able to access local objects. This option is most useful when bandwidth is extremely expensive compared to the additional cost of setting up Docbases at the remote site.

Federations In this model, multiple Docbases already exist at the local and remote site. Federations allow those Docbases to be integrated in a more seamless fashion. Users have fast access to local objects; however, accessing remote objects still requires high bandwidth for good response.

One criterion for selecting one of the options listed in Table 3-1 is whether a single Docbase will be employed. There are times when multiple Docbases occur naturally in an enterprise for reasons that have nothing to do with response time. In those cases, object replication, reference links, or federations might be the appropriate deployment option given the existence of multiple Docbases. In many cases, an enterprise deployment is being considered and the decision whether to create multiple Docbases or a single set of centrally managed Docbases has not been made. Often it is easiest to create a small set of centrally managed Docbases, with multiple Docbases considered only when trying to ensure good response time across an enterprise whose remote sites have poor connecting network bandwidth. The most important points of this section are the administrative and network bandwidth needs of the various deployment options, which should guide the decision process.


Figure 3-6 shows the relationship between administrative overhead at a remote site and the various Documentum deployment options. Using modems and routers at the remote site generates the least amount of overhead. However, doing so requires remote users to access the central system to get to the Docbase. One possible solution is to move the Web servers to the remote site (causing a slight increase in administrative overhead to deal with the Web server machine). This might be done in order to take advantage of the DMCL being less verbose over the network than HTML (in Web deployments). The distributed storage areas, content servers, and content replication options all involve setting up an eContent Server at the remote site. Finally, in environments that use federations, reference links, or object replication, the remote site typically has an entire Docbase setup (RDBMS, eContent Server, Web server, and network equipment). This represents the largest administrative overhead for the remote site.

Figure 3-6 Administrative Overhead for the Deployment Options

(Figure 3-6 plots administrative overhead against the deployment options: modems and routers only for remote users accessing the central system; local Web servers and routers; eContent Servers, Web servers, and routers for distributed storage areas, content servers, and content replication; and a full remote setup with an RDBMS, eContent Servers, Web servers, and routers for reference links, object replication, and federations.)

Network bandwidth is another consideration. It might be administratively simple to locate all Docbases in a central site but financially prohibitive due to the cost of network bandwidth. The cost of networking associated with the deployment options is in many respects the inverse of the amount of administrative overhead. Locating a Docbase at a remote site (which localizes access to their own site for the remote users) typically reduces the need for network bandwidth between the two sites. This relationship is outlined in Figure 3-7.

Figure 3-7 Documentum Deployment Feature vs. Network Bandwidth Need

(The chart compares the network bandwidth needed by the deployment options: central system; distributed storage areas, content servers, and content replication; reference links and federations; and object replication.)

Scaling the Web Content Management Edition

The Web Content Management (WCM) Edition allows a company to manage their Web content and deliver it to a Web site as content and metadata. The WCM Edition has several major components: content authoring, Site Delivery Services, and access software for dynamic content and attribute retrieval from Internet users. It can be characterized by the hardware and software needed on either side of the Internet firewall and the software that connects them. Figure 3-8 shows those components.



Figure 3-8 Outline of WCM Architecture

(The figure shows content authors, typically on the intranet, working against the RightSite/HTTP server, eContent Server, AutoRender, Content Personalization Services, and an RDBMS. WebCache delivers content and attributes across the firewall to the application server and Web server machines, backed by their own RDBMS, that serve content consumers, typically on the Internet; the application servers retrieve attribute information from that RDBMS through JDBC.)


Web Content Authoring

A Web content management offering must be able to handle increasing numbers of:

■ Content authors (or contributors) and organizational complexity

■ Static and dynamic content

■ Content consumers (or Internet users)

Managing the content going into the Web site is accomplished with Documentum WebPublisher. WebPublisher is a Web-based application that uses the services of RightSite, eContent Server, AutoRender Pro, and Content Personalization Services™. For the most part, sizing and scaling of users and content in a WebPublisher implementation is like sizing traditional Documentum applications.

One way in which WebPublisher differs is that AutoRender Pro, an optional piece of the 4i family, is highly integrated into the WebPublisher application. WebPublisher uses AutoRender Pro to convert WebPublisher-managed documents into PDF or HTML automatically. AutoRender Pro polls a work queue for each Docbase it services. One AutoRender Pro server can service multiple Docbases, or multiple AutoRender Pro servers can be set up to serve a single Docbase (when the capacity of one AutoRender Pro machine has been exceeded). Content Personalization Services, integrated into WebPublisher, provides the optional ability to set content attributes and link documents automatically.

WebPublisher’s support of eContent Server document lifecycle and workflow services allows for scaling of Web site content as well as organizational complexity. For example, it permits the use of a business process to ensure that all content on the Web has been approved at the right levels. As the content in a Web site grows, it becomes more difficult to ensure that all the content and information is correct, up-to-date, and approved. In addition, the organization required to manage that process becomes more complex as the process matures. Both of these items are addressed by the eContent Server document lifecycle and workflow features used by WebPublisher. Consequently, for regular content contributors, WebPublisher can be more workflow intensive than the average eContent Server implementation.


Site Delivery Services

The software that distributes content and attributes outside a firewall is Documentum Site Delivery Services. This software has two main components: WebCache™ and ContentCaster™.

WebCache moves content and attributes to the other side of a firewall, and ContentCaster moves content to Web server machines worldwide. WebCache is integrated into WebPublisher in that single-object pushes of content and attributes are done by WebPublisher users. The WebCache software consists of two parts: a source transmitter and a target receiver. The source transmitter consumes the larger part of the system resources. Because the transmitter is coupled tightly with eContent Server, it can be scaled to accommodate additional users in the same fashion as eContent Server. That is, adding more eContent Servers for the Docbase provides more WebCache transmitters.

Access Software for Dynamic Page and Metadata Retrieval

An Internet Web site grows not only in terms of the amount of content on the site, but also the number of users who access the site and the number of dynamic pages created.

The growth in users has several effects on scaling and sizing. First, the resource consumption of these additional users will require additional Web server, application server, and database server machines. ContentCaster will ensure that the content delivered by WebCache is synchronized throughout all the Web server machines in a data center and across a worldwide deployment. After the content is delivered, no additional overhead is incurred by Documentum in accessing that content from a Web server. Consequently, sizing and scaling the machines needed for static access is no different from sizing for standard static Web site content.

The metadata is delivered to the RDBMS by WebCache. After it is delivered, metadata values can be accessed in several ways. Documentum provides a JDBC interface for application servers that use metadata to construct dynamic HTML pages; however, any native database interface can be used. In this way, Documentum software also causes no additional overhead in the creation of dynamic pages after the metadata values have been stored in the database serving the Internet users. Consequently, a site’s scalability is limited not by Documentum, but by the Web server environment, the application servers, and the RDBMS serving up the dynamic content.
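As a rough illustration of that last point, the sketch below builds a dynamic HTML fragment from metadata that has already been delivered to the database. The table name, columns, and sample rows are invented for the example, and an in-memory SQLite database stands in for whatever RDBMS and interface (JDBC or a native driver) a site actually uses.

```python
# Illustrative only: building a dynamic page fragment from metadata that WebCache
# has delivered to the RDBMS. The schema below is invented for the example.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE published_pages (object_name TEXT, title TEXT, url TEXT)")
db.executemany(
    "INSERT INTO published_pages VALUES (?, ?, ?)",
    [("press_01", "Quarterly Results", "/news/press_01.htm"),
     ("press_02", "New Product Launch", "/news/press_02.htm")],
)

rows = db.execute("SELECT title, url FROM published_pages ORDER BY object_name").fetchall()
html = "<ul>" + "".join(f'<li><a href="{url}">{title}</a></li>' for title, url in rows) + "</ul>"
print(html)
```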


Scaling the Portal Edition

The application basis for the Portal edition is the iTeam application. iTeam is based on Documentum WDK and the RightSite and eContent Servers. Sizing and scaling an iTeam implementation is the same as sizing an application based on those technologies.


4 Server Configuration and Sizing

This chapter discusses server configuration and sizing. The following topics are included:

■ “Overview of Server Sizing Process” on page 4-1

■ “Hardware Configurations” on page 4-2

■ “Server Sizing Results from Benchmark Tests” on page 4-4

■ “Other CPU-Related Notes” on page 4-22

■ “Sizing Server Memory” on page 4-23

■ “Examples of Memory Calculation” on page 4-32

■ “Sizing Server Disk Capacity” on page 4-35

■ “Database License Sizing” on page 4-46

■ “Certified Database and HTTP Server Versions” on page 4-47

Overview of Server Sizing Process

The server sizing process has two components:

■ Sizing server CPU

■ Sizing memory needs

Sizing server CPU for a Documentum deployment requires you to identify the workload and the hardware configuration for the site. With that information, you can use the benchmark reports to help determine how many CPUs are needed. You must also size server memory. Figure 4-1 illustrates the server sizing process.


Figure 4-1 CPU Selection Process

(Figure 4-1 shows the steps: estimate your users per busy hour; match your workload to a Documentum standard workload; locate the desired hardware type, for example Sun or Intel, and find the corresponding users per hour in the chart to select the number of CPUs; adjust the number of CPUs based on differences between your workload and the standard workload; then move on to selecting the memory.)

The benchmark tests reported in this chapter were conducted using the standard workloads described in Chapter 2, Deriving Workload Requirements, and Appendix A, Additional Workloads. The hardware configurations are briefly described in “Hardware Configurations” below and compared in detail in “Host-based vs. Multi-tiered Configurations” on page 3-11. Not all possible combinations of workloads and hardware configurations were tested.
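As an illustration of the lookup step in Figure 4-1, the sketch below picks the smallest benchmarked configuration whose users-per-busy-hour figure covers a site’s estimate. The sample numbers are the host-based EDMI results for the Sun Enterprise 450 reported later in this chapter; the function itself is not a Documentum tool, just a way to picture the process.

```python
# Illustration of the CPU selection step: pick the smallest benchmarked configuration
# whose users/busy-hour meets the site's estimate. The sample figures are the
# host-based EDMI results for the Sun E450 reported later in this chapter.
benchmark_users_per_hour = {2: 200, 4: 450}   # CPUs -> users per busy hour

def cpus_needed(estimated_busy_hour_users):
    for cpus in sorted(benchmark_users_per_hour):
        if benchmark_users_per_hour[cpus] >= estimated_busy_hour_users:
            return cpus
    return None   # beyond any benchmarked configuration: scale out or test further

print(cpus_needed(180))   # -> 2
print(cpus_needed(400))   # -> 4
print(cpus_needed(900))   # -> None (beyond the benchmarked host-based configurations)
```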

Hardware Configurations

This section describes the hardware configurations used in the benchmark tests. These configurations include only eContent Server, the RDBMS, and Documentum Web-tier software. Products like WebCache are add-ons with respect to CPU consumption.

Benchmark results for these configurations with a variety of workloads and hardware vendors are reported in “Server Sizing Results from Benchmark Tests” on page 4-4. The reports include sizing information for CPU and memory for each configuration tested.



Host-based Configuration

In a host-based configuration, all server software runs on the same physical machine. The machine can be either a relatively small machine or a larger mainframe-class system (for example, a Sun E6500 or HP V2250). (The smallest machine used in benchmark tests on a host-based configuration had 2 processors.) Figure 4-2 illustrates a host-based configuration.

Figure 4-2 Host-Based Configuration: All Server Software Running on Same Host

(The figure shows the HTTP server, RightSite server and/or WDK/application server, eContent Server, and RDBMS all running on a single host.)

N-Tier Configurations

In an N-tier configuration, the server software resides on different host machines.

Figure 4-3 shows an N-tier configuration in which the Web servers, the HTTP server (for example, Iplanet or IIS) and the Documentum application server software (either RightSite or WDK running under some application server) reside on a different server machine than the one that hosts eContent server and the RDBMS. The CPU data for benchmark tests using an N-tier configuration will focus on how many CPUs are needed for the entire system, not just for a particular server.



Figure 4-3 N-tier Web Separate

Figure 4-4 illustrates an N-tier configuration in which the Web server, the eContent server and the RDBMS are on separate machines. If a benchmark uses this configuration with non-homogenous processors, the benchmark notes will clearly identify which systems were used for what purpose.

Figure 4-4 N-tier - All Separate

Server Sizing Results from Benchmark Tests

This section contains sizing results from benchmarks conducted on a variety of hardware configurations, using the following workloads:

■ iTeam

■ WebPublisher

■ Online Customer Care

■ EDMI

■ Website (Anonymous access to RightSite virtual links)

■ Find and View



Not all possible combinations of hardware configurations and workloads were tested.

The results are reported by hardware vendor in tables that list the number of users per hour supported on various hardware configurations. Some tables also list a projected number of supported users for some configurations. (“Interpreting the CPU Sizing Tables” on page 4-6 provides some assistance in interpreting the tables.) Because it is unlikely that a customer’s prospective workload will match one of the standard Documentum workloads perfectly, it is necessary to adjust the selection of CPUs to ensure that the system is sized appropriately.

Note: Treat RightSite and WDK/App server CPU as the same. For example, if a set of operations takes 3 CPU seconds for RightSite, then you can assume that the operations consume 3 CPU seconds for the WDK/App server configuration. This is not true for memory, but can be assumed for CPU.

Special Focus for Some Tests

Some of the benchmarks conducted on N-tier configurations focus on the capacity of a single tier within the N-tier environment. For those benchmarks, the sizing result tables reference the figures below, to ensure that you recognize the focus of the test.

In Figure 4-5, the focus is on the Web tier, running either RightSite or an application server running the WDK.

Figure 4-5 Special Focus: Web Server Software Only in an N-tier Environment



Figure 4-6 shows a configuration in which the focus is on eContent Server on its own host. Figure 4-7 shows a configuration in which the focus is on the host machine on which both eContent Server and the RDBMS server reside. Sizing results from benchmark tests that use the configuration in either Figure 4-6 or Figure 4-7 can be applied to deployments that don’t include Web software because the focus ignores the Web-server hardware.

Figure 4-6 Special Focus eContent Server Machine in an N-tier Environment

Figure 4-7 Special Focus: eContent Server and RDBMS on the Same Host

Interpreting the CPU Sizing Tables

Table 4-1, Table 4-2, and the accompanying notes provide guidance for interpreting the sizing tables in this chapter.



Table 4-1 is the first example table.

Table 4-1 <Workload_name> on <Hardware_Vendor>

Configuration | Users/Busy Hour | Notes
model-8 (RDBMS), 2 x model-4 (eContent), 2 x model-2 (Web) | 1600 | The database and file server could have easily been run on a 4-processor machine. The RDBMS host is the bottleneck in this test.
model-8 (RDBMS), 1 x model-2 (eContent), 2 x model-2 (Web) | 400 | This is a derived configuration based on the test results with a 4-processor 6400R. The eContent Server host is the bottleneck in this test (Figure 4-6).

Explanation of the first example table:

■ The table title identifies the workload used in the test and the hardware vendor of the server machines used in the test.

■ The first column, Configuration, lists all of the hardware that was used in a test. The component in boldface was the focus and the limiting factor for a particular test. For example, in the second row, the component on which eContent Server was installed is the limiting factor. The “model-2” means that the component had two CPUs.

2x means that there were two of those systems used for that tier.

■ The second column identifies the number of users per hour supported on the configuration with acceptable response times. The users-per-hour number has meaning only within the context of the workload identified in the title.


The second example, Table 4-2, shows another type of sizing table found in this chapter.

Table 4-2 <Workload_name> on <Hardware_vendor_name>

System and Number of CPUs | Users/Hour on Host-based System (Figure 4-2) | Users/Hour on an N-tier System (Figure 4-3) | Users/Hour on N-tier with Focus on eContent Server and RDBMS (Figure 4-7) | Users/Hour on N-tier with Focus on Web Server (Figure 4-5)
Vendor-model-2 | 200 | - | 700 | 500
Vendor-model-4 | 450* | - | 1500 | 900
Vendor-model-8 (2 servers) | - | 1000 | - | 1500*
Vendor-model-12 (3 servers) | - | 1500* | - | -

Explanation of the second example table:

■ The table title identifies the workload used in the test and the hardware vendor of the server machines used in the test.

■ The first column identifies the system. The notation is:

vendor-model-number of processors

■ The remaining columns identify the configuration on which the workload was run. For example, in the above table:

❒ The second column indicates the number of users per hour running the workload on a host-based system with two processors.

❒ The third column indicates the number of users per hour running the workload on an N-tier system with 8 and 12 processors.

❒ In this case, the model could have at most 4 processors, so the System column (first column) indicates how many of the servers were used for these tests.

❒ The fourth column indicates the number of users per hour running the workload on the eContent Server/RDBMS system. This ignores the Web tier.


❒ The fifth column indicates the number of users per hour running the EDMI workload on the Web server tier. It is sizing for the Web server only.

Compaq Sizing Information

The information in this section covers the following Compaq servers:

■ DL360

■ Proliant 6400R

■ Proliant 8500

The sizing information is based on N-tier tests. In the tests, the eContent Server tier machines were Proliant 6400Rs. The features of a Proliant 6400R include:

■ Up to four Intel® Pentium III Xeon™ processors at 550 MHz

■ 6 PCI slots: 2 64-Bit/66 MHz; 3 64-Bit/33MHz; 1 32-bit/33 MHz

The Web-tier machines were Proliant DL360s. Each machine has, at most, two 800 MHz CPUs, 4 GB of memory, and two internal drives set up as a mirrored pair. The DL360s are 1U machines that fit 42 to a standard rack.

The Proliant 8500 was the RDBMS server and file server for this test. Its features include:

■ Up to eight 700MHz Pentium III Xeon processors (1M or 2M L2 Caches)

■ Up to 16GB of 100MHz SDRAM memory (only 4 GB addressable by NT)

■ Multi-peer 64-bit PCI buses including 66MHz PCI slots


The tests employed 550 MHz processors, and the machine had two Compaq disk storage arrays attached to it. Table 4-3 shows the results of the tests.


Table 4-3 Compaq Sizing Data for iTeam Workload

Configuration Users/busy hour Notes

■ 8500-8 (RDBMS)

■ 2 x 6400R-4 (eContent)

■ 4 x DL360-2 (Web)

1600 The database and file server could have easily been run on a 4-processor 8500

The database was the bottleneck in this test (single process address space limitation).

■ 8500-8 (RDBMS)

■ 1 x 6400R-2 (eContent)

■ 2 x DL360-2 (Web)

400 This is a derived configuration based on the test results with a 4-processor 6400R.

The eContent Server host was the bottleneck in this test (Figure 4-6).

■ 8500-8 (RDBMS)

■ 1 x 6400R-4 (eContent)

■ 2 x DL360-2 (Web)

800 The eContent Server host was the bottleneck in this test (Figure 4-6).

■ 8500-8 (RDBMS)

■ 1 x 6400R-4 (eContent)

■ 1 x DL360-2 (Web)

500 The Web server host was the bottleneck in this test (Figure 4-5).

Sun/Solaris Sizing Information

This section describes the following Sun machines and the benchmarks run on them:

■ “Sun Enterprise 450” on page 4-11

■ “Sun Enterprise 6500 and 4500” on page 4-12

Sun Enterprise 450

The Sun Enterprise 450 is the top-of-the-line workgroup server from Sun. It can have up to four 400MHz Ultra Sparc II processors (each with 4MB E-cache memory). The E450 can have up to 182 GB of internal storage capacity and up to 6 TB of external storage. It has 6 PCI buses providing up to 1GB/sec I/O throughput.

Table 4-4 shows the results when the EDMI workload is run on configurations with the Sun Enterprise 450.

Table 4-5 shows the results of the Anonymous RightSite Website Workload run on configurations with the Sun Enterprise 450.


Table 4-4 EDMI Workload on Sun Enterprise 450 with 400 MHz Ultra Sparc II Processor

System and Number of CPUs

Users/Hour on Host-based System (Figure 4-2)

Users/Hour on an N-tier System (Figure 4-3)

Users/Hour on N-tier with Focus on eContent Server (Figure 4-7)

Users/Hour on N-tier with Focus on Web-Server (Figure 4-5)

Sun-E450-2 200 - 700 500

Sun-E450-4 450* - 1500 900

Sun-E450-8 (2 servers) - 1000 - 1500*

Sun-E450-12 (3 servers) - 1500* - -

Table 4-5 Anonymous RightSite Website Workload on Sun Enterprise 450 with 400MHz Ultra Sparc II Processors

System and Number of CPUs Users/Hour on Host-Based System (Figure 4-2)

Sun-E450-2 1000

Sun-E450-4 2000*

Notes:

■ The Sun E450 can have up to four 400MHz Ultra Sparc II processors. The system Sun-E450-8 is two E450s with 4 processors each. Sun-E450-12 represents three E450s with 4 processors each.

■ Values marked with an asterisk (*) are from actual runs. All other values are estimated based on the actual runs.

■ The benchmark tests were run with Solaris 2.6 and EDMS 98 (v 3.1.6)

■ In the 1500 EDMI users/hour test, each of the two E450s was about 60 percent busy. It is likely that the RightSite Application server results can be increased by 30 percent.

■ The metric for the Website benchmark can also be stated in terms of HTTP Gets serviced per hour. Each Website user performs five STATIC_HTML operations, and each STATIC_HTML operation, in turn, performs five HTTP gets. Consequently, a 2000 Website users/hour run generates 50,000 HTTP gets per hour (each with an average HTTP get response time of 550 msecs).
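The arithmetic behind this metric can be captured in a few lines. The following Python sketch is illustrative only; the operation counts (five STATIC_HTML operations per user and five HTTP GETs per operation) come from the workload description above, and the function name is not part of any Documentum tool.

STATIC_HTML_OPS_PER_USER = 5      # from the Website workload description
HTTP_GETS_PER_STATIC_HTML = 5     # each STATIC_HTML operation issues five GETs

def http_gets_per_hour(website_users_per_hour):
    """Convert a Website users/busy-hour figure into HTTP GETs per hour."""
    return (website_users_per_hour
            * STATIC_HTML_OPS_PER_USER
            * HTTP_GETS_PER_STATIC_HTML)

print(http_gets_per_hour(2000))   # 50000, matching the 2,000 users/hour example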

Sun Enterprise 6500 and 4500

The Sun Enterprise E6500 is the top-of-the-line mid-range server from Sun. It can have up to 30 x 336 MHz Ultra Sparc II processors (each with 4MB E-cache memory). The E6500 can have up to 375 GB of storage capacity in the internal cabinet and up to 10 TB of external storage. It can have 16 system boards, which are either I/O boards or CPU/memory boards (2 CPUs per board). Each I/O board has four PCI channels. The system on which the tests were conducted had a total of 26 physical slots (the slots used to house the CPU/memory system boards are also used to house the I/O boards).

The Sun Enterprise 4500 is the mid-range server from Sun. The Enterprise 4500 can have up to 14 CPUs at 336 MHz (on up to 8 system boards).

See http://www.sun.com/servers/ for more information.

It is assumed that, with respect to the Documentum application, given equal numbers of CPUs, memory, and disk, these systems will perform in an identical fashion. That is, a 14-CPU E4500 will achieve the same performance as a 14-CPU E6500. Therefore, in the sizing tables (Table 4-6 and Table 4-7) they are treated as the same.


Table 4-6 shows the sizing figures when the EDMI workload was run on configurations with the Sun Enterprise 4500 and 6500.

Table 4-6 EDMI Workload on Sun Enterprise 4500 and 6500 with 336MHz Ultra Sparc II Processors

System and Number of CPUs

Users/Hour on Host-based System (Figure 4-2)

Users/Hour on an N-tier System (Figure 4-3) (refer to the Notes, the fourth bullet)

Users/Hour on N-tier with Focus on eContent Server and RDBMS (Figure 4-7)

Users/Hour on N-tier with Focus on Web Server (Figure 4-5)

Sun-E6500/E4500-2 225 - 475 475

Sun-E6500/E4500-4 425 - 950 950

Sun-E6500/E4500-6 650 - 1425 1425

Sun-E6500/E4500-8 850 950 1900 1900

Sun-E6500/E4500-10 1075 1200 2350 2350

Sun-E6500/E4500-12 1300 1425 2825 2825

Sun-E6500/E4500-14 1500* 1650 3300 3300

Sun-E6500/E4500-16 1650 1900 3775 3775

Sun-E6500/E4500-18 1850 2125 4250* 4250*

Sun-E6500/E4500-20 2050 2350 - -

Sun-E6500/E4500-22 2250 2600 - -

Sun-E6500/E4500-24 - 2825 - -

Sun-E6500/E4500-26 - 3050 - -

Sun-E6500/E4500-28 - 3300 - -

Sun-E6500/E4500-30 - 3550 - -

Sun-E6500/E4500-32 - 3775 - -

Sun-E6500/E4500-34 - 4025 - -

Sun-E6500/E4500-36 - 4250 - -


Table 4-7 shows the sizing figures when a host-based Web site workload was run on configurations with the Sun Enterprise 4500 and 6500.


Table 4-7 Web Site Workload CPU Sizing for Sun Enterprise 4500 & 6500 with 336MHz Ultra Sparc II Processors

System and Number of CPUs Users/Busy Hour

Sun-E6500/E4500-2 1300

Sun-E6500/E4500-4 2800

Sun-E6500/E4500-6 4300

Sun-E6500/E4500-8 5800

Sun-E6500/E4500-10 7300

Sun-E6500/E4500-12 8800*

Notes:

■ Values marked with an asterisk (*) are from actual runs. All other values are estimated based on the actual runs.

■ The host-based EDMI test that achieved 2,250 EDMI users/hour was actually run on a 26-processor machine. However, even at the peak, the average CPU utilization was around 65 percent. We believe that if the system is reduced to 22 CPUs, response times will be maintained and the utilization will be 80 percent.

■ The 3-tier 4,250 result was actually run on a 3-tier configuration of two E4500s and a single E6500 with 54 total CPUs. However, from an analysis of the CPU utilization we feel that the response times could have been maintained on a system with 18 CPUs on both the eContent Server/DBMS machine and the RightSite application server machine (for a total of 36 CPUs).

■ The scores on the 4-processor Website test are 40 percent higher than those shown in Table 4-5. This difference is due to the fact that the DBMS servers were from different vendors.

■ In all cases, we take the largest number of users tested with and extrapolate downward for lower CPU counts. The largest number is not necessarily the limit of that server technology; it is the largest number tested in a lab situation.

■ The metric for the Web site benchmark can also be stated in terms of HTTP Gets serviced per hour. Each Website user performs five STATIC_HTML operations. Each STATIC_HTML operation, in turn, performs five HTTP gets. Consequently, an 8,800 Web site users/hour run generates 220,000 HTTP gets per hour (each with an average HTTP get response time of 300 msecs).

IBM, Windows NT, and AIX Sizing Information

This section contains sizing information for the following machines:

■ “IBM Netfinity 7000 M10” on page 4-15

■ “IBM AIX Systems: S7A and F50” on page 4-16

IBM Netfinity 7000 M10

The IBM NF7000M10 is the top-of-the-line Intel server from IBM. It can have up to 4 x 400 MHz Pentium II Xeon processors. The NF7000 M10 has 6 PCI slots and can support up to 54 GB of internal storage and 5 TB of external storage. The system on which the tests were conducted had a total of 4 physical CPUs, 4GB of memory, and 2 EXP10 disk arrays.

The EXP10 array can hold up to ten 9GB drives. Two I/O channels are used to interface with the two controllers on the array. The array supports various levels of RAID.

Table 4-8 lists the sizing figures for the IBM Netfinity 7000 M10.

Table 4-8 EDMI Workload on IBM Netfinity 7000 M10

System and Number of CPUs

Users/Hour on Host-based System (Figure 4-2)

Users/Hour on an N-tier System(Figure 4-3)

Users/Hour on N-tier with Focus on eContent Server and RDBMS (Figure 4-7)

Users/Hour on N-tier with Focus on Web Server (Figure 4-5)

IBM-NF7000M10-2 250 - 500 500

IBM-NF7000M10-4 500* 500 900* 1000*

IBM-NF7000M10-8 (2 servers) - 900* - -


Notes:

■ Values marked with an asterisk (*) are from actual runs. All other values are estimated based on the actual runs.

■ There are some performance differences between certain DBMS vendors on Windows NT.

■ In all cases, we take the largest number of users tested with and extrapolate downward for lower CPUs. The largest number is not necessarily the limit of that server technology; it is the largest number tested in a lab situation.

■ These tests were conducted with Windows NT 4.0 SP4.

IBM AIX Systems: S7A and F50

The IBM S7A is a high-end RS6000 AIX server from IBM. Its highlights are:

■ Standard configuration

❒ Microprocessor: 4-way 125 MHz RS64-I or 262 MHz RS64-II (upgrade only)

❒ Level 2 (L2) cache: 4MB for 125 MHz processors; 8MB for 262 MHz processors

❒ RAM (memory): 512MB

❒ Media bays: 3 (2 available)

❒ Expansion slots: 14 PCI (11 available)

❒ PCI bus width: 32- and 64-bit

❒ Memory slots: 20

■ AIX operating system

❒ Version 4.3 for 125 MHz processors and Version 4.3.1 for 262 MHz processors

■ System expansion

❒ SMP configurations: Up to 2 additional 4-way processor cards

❒ RAM: Up to 32GB

❒ Internal PCI slots: Up to 56 per system

❒ Internal media bays: Up to 12 per system


❒ Internal disk bays: Up to 48 (hot-swappable)

❒ Internal disk storage: Up to 436.8GB

❒ External disk storage: Up to 1.3TB SCSI; up to 14.0TB SSA

The IBM F50 is a lower-end Enterprise server. Its highlights are:

■ Standard configuration

❒ Microprocessors: 166 MHz or 332 MHz PowerPC 604e with X5 cache

❒ Level 2 (L2) cache: 256KB ECC

❒ RAM (memory): 128MB ECC Synchronous DRAM

❒ Disk/media bays: 18 (1 used)/4 (2 used)

❒ I/O expansion slots: 9 (7 PCI, 2 PCI/ISA)

❒ PCI bus widths: 2 32-bit and 1 64-bit

❒ Memory slots: 2

■ AIX operating system

❒ Version 4.2.1 or Version 4.3

■ System expansion

❒ SMP configurations: Up to 2, 3, or 4 processors at 166 MHz or 332 MHz (processor speeds cannot be mixed)

❒ RAM: Up to 3GB

❒ Internal disk storage: Up to 172.8GB (163.8GB hot-swappable)

❒ External disk storage: Up to 4.8TB SCSI-2; up to 3.5TB SSA

Table 4-9 shows the CPU sizing for the Tested IBM/AIX Servers.

Table 4-9 EDMI Workload on IBM/AIX Servers

Configuration

in the format: No. of Servers x System-No. of Processors

Users/Busy Hour

■ 1 x S7A-4 (RDBMS)

■ 1 x F50-4 (eContent)

■ 3 x Netserver-2 (Web)

3000

■ 1 x S7A-2 (RDBMS)

■ 1 x F50-2 (eContent)

■ 3 x Netserver-2 (Web)

1500


Notes:

■ The 4-tier tests used three IBM NF7000M10s as RightSite application servers. Each 7000M10 had four 400MHz Pentium II Xeon processors (for a total of 20 CPUs used between Unix and NT).

■ The tested F50/S7A combination equipped both systems with four processors. The DBMS server ran on the S7A (AIX 4.3 operating system) and the eContent Server ran on the F50 (AIX 4.2 operating system).

■ The notation F50/S7A-2 means that both systems have two processors.

■ The S7A was set up to have only 4 processors, to match the processing power of the new, less expensive IBM H70 (4-processor machine).

HP Windows NT and HP-UX Servers

This section discusses the following machines:

■ “HP NT/Intel Servers” on page 4-18

■ “HP-UX Servers” on page 4-20

HP NT/Intel Servers

The HP NETSERVER LXR 8000 is the top-of-the-line Intel server from HP. It can have up to 4 x 400 MHz Pentium II Xeon processors (with up to 1M cache per processor) and upgrades to 8-way multiprocessing and future processors. The NETSERVER LXR 8000 has 10 full length PCI slots: four 64-bit hot-swap slots, five 32-bit slots, and one shared 32-bit PCI/ISA slot. It can handle up to 8 GB of physical memory. The system on which tests were conducted had a total of 4 physical CPUs and 4GB of memory and connected to a disk array model 30/FC through two fiber-optic interfaces.



HP has another comparable server called the LH4. This server is also a 4-processor Intel-based system; it lacks only some of the expandability of the LXR 8000 and supports at most 4 GB of memory.

The HP Lpr is a two-processor rack-mounted server that can have up to 1 GB of memory. The processors used in the tests were 600 MHz. Each server is 2U in size, and a standard rack can hold 20 servers.

Table 4-10 lists the sizing results for the HP LXR8000 and LH4 when the iTeam workload is run.

Table 4-11 lists the sizing results for the Lpr/LH4 N-tier test.

Table 4-10 iTeam Workload on HP LXR8000 & LH4

System and Number of CPUs Users per Busy Hour on Host-Based System (Figure 4-2)

HP-LH4-2 100

HP-LH4-4 200*

Table 4-11 iTeam Workload on Lpr/LH4, N-tier Test

Configuration

in the format: No. of Servers x System-No. of Processors

Users per Busy Hour

Notes

■ 1 x LH4-2 (RDBMS),

■ 1 x Lpr-2 (eContent)

■ 2 x Lpr-2 (Web)

400 -

■ 1 x LH4-2 (RDBMS)

■ 1 x Lpr-2 (eContent)

■ 1 x Lpr-2 (Web)

200 The Lpr can have only 1 GB of memory, which limits the number of RightSite server connections. The 2-processor Lpr was memory bound, not CPU bound.


Table 4-12 lists the sizing results for the EDMI workload on HP LXR8000 and LH4 machines.

Table 4-12 EDMI Workload on HP LXR8000 and LH4

System and Number of CPUs

Users per Hour on Host-based System (Figure 4-2)

Users per Hour on an N-tier System (Figure 4-3)

Users per Hour on N-tier with Focus on eContent and RDBMS (Figure 4-7)

Users per Hour on N-tier with Focus on Web Server (Figure 4-5)

HP-LH4-2 250 - 500 500

HP-LH4-4 500* 500 900* 1000*

HP-LH4-8 (2 servers) - 900* - -

Note:

■ Values marked with an asterisk (*) are from actual runs. All others are estimates based on the actual runs.

HP-UX Servers

Two types of HP-UX servers have been tested with Documentum: the V2600 and the K580.

The V2600 machine is a high-end HP-UX server with the following features:

■ Up to 32 CPUs (a maximum of 16 were used in the tests)

■ Up to 32 GB of memory

■ Up to 28 2X PCI slots

■ System-wide throughput of up to 15.36 GB/sec with HP’s HyperPlane crossbar technology

■ Up to 19-GB I/O throughput

The K580 machine has the following features:

■ Up to six-way symmetric multiprocessing

■ Single-level, large, full-processor-speed 2-MB/2-MB and 1-MB/1-MB instruction/data caches

■ Up to 37 I/O slots with optional I/O expansion cabinets

■ Four internal Fast/Wide Differential SCSI-2 disk storage bays

■ Up to 30 TB of total disk capacity using optional expansion cabinets




Table 4-13 shows the sizing results for the HP-V2600 using the online customer care workload. (The online customer care workload is described in “The Online Customer Care Workload” on page A-9.)

Table 4-14 shows the sizing results for the HP-K580 running the Document-Find-and-View workload. (The Document-Find-and-View workload is described in “The Document Find and View Workload” on page A-9.)


Table 4-13 Online Customer Care Workload on V2600 Machines

System and Number of CPUs Users/Hour on Host-based System (Figure 4-2)

HP-V2600-4 500

HP-V2600-8 1000

HP-V2600-16 2000

Table 4-14 Document Find and View Workload on K580 Machines

System and Number of CPUs Users/Hour on Host-based System (Figure 4-2)

HP-K580-2 1300

HP-K580-4 2600

HP-K580-6 4025*

Notes:

■ The Document-Find-and-View workload differs from the EDMI workload in that it is client-server based (not Web-based). It also uses only a subset of the EDMI operations (folder searching and attribute searching) and is read-only.

■ Values marked with an asterisk (*) are from actual runs. All others are estimates based on the actual runs.

Other CPU-Related Notes

Here are some other guidelines and recommendations:

■ When feasible, dedicate a separate server to the Documentum installation.

■ Do not run Documentum on the same physical server as an ERP (Enterprise Resource Planning) system. Do not install Documentum on the PDC (Primary Domain Controller), the NIS (Network Information Service) master, a file or print server, or another application server.

■ Documentum supports eContent Server and RightSite on Windows NT Server hardware from all Microsoft-certified vendors. Omission of any Windows NT Server hardware vendor from this document is due to lack of space and is not intended to imply any lack of support for that vendor’s Windows NT Server hardware.

■ Contact your chosen hardware vendor to select the best server for your immediate and future needs. Note that the term Enterprise Server refers to business, not scientific, application servers. Workgroup Servers may also be acceptable for development, testing, workgroup, and other less demanding configurations. Consult the release notes for the specific product for the exact hardware on which a product is certified.

The following web sites may be useful references for researching hardware vendor and associated product information:

HP UNIX Enterprise Servers http://www.enterprisecomputing.hp.com/

Sun Enterprise Servers http://www.sun.com/servers

IBM RS/6000 Servers http://www.rs6000.ibm.com/hardware

IBM NT servers http://www.pc.ibm.com/us/netfinity/

NT Server OS http://www.microsoft.com/ntserver

HP NT Server http://www.hp.com/netserver/

COMPAQ NT Server http://www.compaq.com/products/servers


Sizing Server Memory

This section discusses how to size server memory. It includes the following topics:

■ “Overview of the Sizing Process” on page 4-23

■ “Key Concepts Relating to Memory Use” on page 4-25

■ “Estimating Physical Memory Usage” on page 4-28

■ “Estimating Paging File Space” on page 4-30

■ “Additional Considerations” on page 4-31

Overview of the Sizing Process

Memory sizing considers two system components: physical memory and paging (or swap) file space.

To size physical memory, you must determine how much memory process working sets are going to consume. Aside from the DBMS, you should be primarily concerned with the non-shared pages of process working sets. (“Key Concepts Relating to Memory Use” on page 4-25 defines process working sets and how they are managed within physical and virtual memory.)

To size the paging file, you consider the virtual memory allocated.

If you run out of either, problems arise. If you run out of physical memory, heavy I/O to the paging (swap) file can occur, which leads to poor performance. If you run out of space in the paging file, commands may fail. Although some operating systems clearly distinguish between the two problems, most issue messages that are confusing and vague.

Figure 4-8 illustrates the steps to take to determine memory and swap space needs.


Figure 4-8 High-Level Steps to Determining Memory and Swap Space Needs

The information gathered in the first three steps is used to obtain the estimates for physical memory and paging file space.

Oversizing memory is strongly recommended because most memory use in a Documentum deployment is attributed to the server caches in the system. The caches enhance the performance of various operations, and more efficient operations mean better response times. Consequently, it is better to oversize memory than risk undersizing it.

“Estimating Physical Memory Usage” on page 4-28 and “Estimating Paging File Space” on page 4-30 contain guidelines for estimating physical and paging or swap file memory. Some general guidelines are listed in “Additional Considerations” on page 4-31.

For some examples of memory calculations, refer to “Examples of Memory Calculation” on page 4-32.

(Figure 4-8 shows three inputs: 1. Determine the operating system. 2. Estimate the number of “active” users. 3. Estimate the number of documents. These feed two estimates: the amount of physical memory required and the amount of swap or paging space required.)


Key Concepts Relating to Memory Use

This section discusses two key concepts:

■ Virtual and physical memory

■ Cache memory use

Understanding these concepts is crucial to accurate memory estimation.

Virtual and Physical Memory

Virtual memory is a service provided by the operating system (and hardware) that allows each process to operate as if it had exclusive access to all physical memory. However, a process only needs a small amount of the virtual memory to perform its activities. This small amount, called the process working set, is the actual amount of physical memory used by the process. The operating system manages the sharing of physical memory among the various working sets.

Physical memory is a limited resource. When the operating system wants to conserve physical memory or manage a situation in which all working sets won’t fit into physical memory, it moves the excess pages into a paging file. Additionally, although a process may think it has exclusive access to all of virtual memory, the operating system transparently shares many of the read-only portions (such as program instructions).

Figure 4-9 illustrates the relationships between physical memory, virtual memory, and process working sets.


Figure 4-9 Real Memory vs. Virtual Memory (the figure shows each process working set, made up of private pages plus shared pages such as executable code and read-only data, residing in physical memory; virtual memory as an abstraction provided by the OS and hardware; and the paging or swap file, which holds pages pulled out of real memory and, depending on the OS, may have to reserve space for all virtual memory allocated)

Cache Memory Usage

On a typical Documentum server software installation (RightSite, eContent Server, and a DBMS server), memory is used most heavily by various caches. A cache is a memory area used to hold data or objects so the software can avoid performing an expensive operation (reading data from disk, from the network, and so forth).

As a particular process grows, it typically fills its caches with information, making its operations less expensive. (An administrator can control the maximum size of some caches, but others are sized automatically.) This trade-off between performance and cache size means that excessive memory use by a cache is not always bad. The most important sizing task is to ensure that cache needs do not outstrip the available physical memory.



DBMS Caches

The DBMS data cache is generally the most dominant cache. (Its size is under administrative control.) The DBMS uses the data cache to minimize the number of disk I/Os that it must perform for data and index rows. It is significantly less expensive for a DBMS to find 100 rows in a data cache than to find them on disk. A production server system with many documents will likely need hundreds of Mbytes (perhaps even one or more Gbytes) of memory for this cache to ensure acceptable performance. Sizing the DBMS cache generously reduces disk I/Os significantly.

Several DBMS servers also have caches for SQL statements that are executed repeatedly. These caches conserve the number of CPU cycles needed for operations such as security validation, parsing, and execution plan preparation. It is typically good to give these caches plenty of memory. Check the RDBMS documentation for more details.

eContent Server Caches

eContent Server uses several caches to avoid repeating expensive work such as DBMS interactions, CPU-intensive processing, and network operations. Most of the caches are small (less than 1M byte) and bounded by the number of objects they can contain.

The global type cache is the most dominant of eContent Server’s caches. The global type cache holds structures that provide eContent Server with fast access to the DBMS tables that make up a type’s instances. The size of this cache is limited by the number of types in the Docbase. The amount of real memory consumed is determined by how many instances and types are accessed. Although this cache is called the global type cache, it primarily supports per-session access. Each eContent Server process, or thread, has its own copy.

If the process working set of your eContent Server is larger than the memory estimates listed in Table 4-15, your installation is probably using more custom types than those used in the capacity testing.

RightSite Server Caches and Work Areas

RightSite’s memory use is dominated by:

■ DMCL object cache


■ Docbasic compiled code

■ Temporary intermediate Dynamic HTML construction memory

The DMCL object cache requires memory to store recently referenced Docbase objects. Its size is bounded by the maximum number of objects that can be stored in it (the number is set by an environment variable).

The Docbasic compiled-code memory area contains the pre-compiled Docbasic code for the dynamic HTML executed by RightSite. The memory used is, at most, equal to the space used by the on-disk cache of pre-compiled Docbasic routines.

The temporary, intermediate dynamic HTML construction memory is the memory used by RightSite to construct a dynamic HTML page. RightSite makes heavy use of memory when constructing dynamic pages, and the larger the number of dynamic pages accessed, the more memory is needed.

All of these areas can grow to many Mbytes in size, depending on the workload.

Estimating Physical Memory Usage

This section contains guidelines for estimating physical memory needs for a server machine.

User Connection Memory Requirements

Table 4-15 lists the physical memory estimates for a single-user connection for eContent Server and RightSite on Unix and Windows NT. These estimates are based on observations of WebPublisher™ 4.2 and iTeam 4.2. Actual memory needs will vary depending on the complexity of the workload.

Table 4-15 Estimated Physical Memory Needed per Connection

Server Solaris and AIX HP-UX Windows NT

eContent Server 10M Bytes 10M Bytes 10M Bytes

RightSite Server 20M Bytes 20M Bytes 20M Bytes


DBMS Memory Requirements

When estimating memory usage for the DBMS, consider giving the DBMS hundreds of Mbytes of memory (perhaps even several Gbytes if the Docbase has several million documents). Because eContent Server supports a general purpose query language, some queries can result in DBMS table scans. By having more memory, you minimize the number of disk I/Os needed to execute those queries.

Table 4-16 describes the DBMS memory requirements for Docbases of various sizes. The values in this table are presented for planning use only. The actual requirements may vary, depending on the object types and number of objects in the Docbase.

Operating systems generally require special tuning to support multiple gigabytes of physical memory for the DBMS. Table 4-17 lists some examples of the required tuning.

Table 4-16 DBMS Memory Recommendations by Docbase Size

Number of Documents Planning Ranges of Document Metadata Size (DBMS disk space)

Minimum Recommended Memory Size for DBMS

1,000,000 4G to 5G 500M to 1GB

2,000,000 8G to 10G > 1GB

3,000,000 12G to 15G > 1GB

4,000,000 16G to 20G > 2GB

5,000,000 20G to 25G > 2GB
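The planning ranges in Table 4-16 amount to roughly 4 GB to 5 GB of metadata per million documents, with the recommended DBMS memory stepping up as the Docbase grows. The following Python sketch simply encodes the table's rows for planning arithmetic; the function is illustrative and not part of any Documentum tool.

def dbms_planning_estimates(num_documents):
    """Return (metadata disk range in GB, minimum DBMS memory) per Table 4-16."""
    millions = num_documents / 1_000_000
    metadata_gb = (4 * millions, 5 * millions)   # 4G to 5G per million documents
    if millions <= 1:
        min_memory = "500 MB to 1 GB"
    elif millions <= 3:
        min_memory = "more than 1 GB"
    else:
        min_memory = "more than 2 GB"
    return metadata_gb, min_memory

print(dbms_planning_estimates(2_000_000))   # ((8.0, 10.0), 'more than 1 GB')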

Table 4-17 Special Considerations for Supporting Large Memory

DBMS and Platform To Achieve Required Tuning

Oracle on Unix A DBMS buffer cache >2GB Oracle executable must be relinked/rebased. See Oracle documentation for more details.

Oracle on Solaris A DBMS buffer cache >1GB Shared memory parameters must be set properly in the system file.

Oracle on Windows NT A DBMS buffer cache >2GB Windows NT Enterprise must be booted with the /3GB flag in the boot.ini.


Operating System Memory Requirements

The operating system buffer cache also uses physical memory. This cache is used to store content file blocks as they are read (or written) to disk. On some versions of Unix, this cache can be explicitly sized. It should be small for installations with content files that are small. For deployments with large content files, it should be large.

On Windows NT, set up the operating system to favor process working sets over the buffer cache. The buffer cache and the memory set aside for process working sets are dynamically sized, and an administrator can configure how conflict between the two areas is resolved. The buffer cache must have less priority than the process working sets or an anomalous situation could arise that forces the DBMS or eContent Server out of memory to make way for the file cache. This will lead to very poor performance.

Additionally, if the RDBMS is Microsoft SQL Server, you can put the tempdb in physical memory (called temp DB in RAM). Doing so may lead to some performance gain and is useful especially when there is sufficient memory available. See the SQL Server product information for more details.

Estimating Paging File Space

Determining the space required for the paging file is almost as important as determining the required physical memory. Table 4-18 lists the paging file space recommendations for eContent Server and the RightSite Server.


Table 4-18 Recommended Paging File Space per Active Connection

Server Unix Windows NT

eContent Server 11M to 20M Bytes 4G

RightSite Server 6M to 12M Bytes 4G


The actual amount of paging file space required for each active connection depends on the number of Documentum object types that are created. Implementations that create many customized types might require more paging file space per connection than shown in Table 4-18.
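As a rough planning aid, the Unix per-connection ranges in Table 4-18 can be turned into a total paging-space estimate. The sketch below assumes eContent Server and RightSite share the same Unix host and adds a 20 percent safety margin; both assumptions are examples, not Documentum requirements.

ECONTENT_PAGING_MB = (11, 20)    # Table 4-18, Unix, per active connection
RIGHTSITE_PAGING_MB = (6, 12)    # Table 4-18, Unix, per active connection

def unix_paging_space_mb(active_connections, safety_margin=0.20):
    """Return a (low, high) paging-space estimate in MB for one Unix host."""
    low = active_connections * (ECONTENT_PAGING_MB[0] + RIGHTSITE_PAGING_MB[0])
    high = active_connections * (ECONTENT_PAGING_MB[1] + RIGHTSITE_PAGING_MB[1])
    return low * (1 + safety_margin), high * (1 + safety_margin)

print(unix_paging_space_mb(50))   # roughly 1020 MB to 1920 MB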

The maximum page file size for each drive letter on Windows NT is 4GB. Always size the paging file to this maximum for each drive in the server unless it interferes with the operating system’s ability to create a crash dump.

If the paging file area is improperly sized, errors can occur that appear to be memory errors. On Unix, you can use vmstat to detect an out-of-memory or out-of-paging-file condition. On Windows NT, you can use the performance monitor, the task manager, or the error text posted in an error popup to detect these conditions.

Additional Considerations

Keep in mind these additional guidelines when you size memory:

■ On Solaris 2.6, the disk space for /tmp is treated like a process and its working set. This means that /tmp will consume some physical memory and swap space. You may need to increase some of the swap space estimates if there is heavy use of the /tmp file system.

■ Consider the impact of the following business-related factors on memory utilization:

❒ End of month, quarter, year peaks

❒ Company growth over three years

❒ Batch processing (system administration jobs)

❒ Use of AutoRender Pro and the Transformation Engine

❒ Users connecting from other sites

❒ Integrations with other systems (such as SAP R3, PeopleSoft, and so forth.)

■ Manage the risk of the unknown by adding a safety margin to your calculation

■ Research maximum RAM capacity per server. If you may need more physical RAM in the future, do not buy a server box with maximum RAM currently installed. If it is easy to buy physical RAM, then plan for the worst case and buy for the best. Your choice of a server model may be governed by the memory requirements calculated for your active Documentum users rather than by the CPU requirements.

Examples of Memory Calculation

The three examples in this section use the following equation as a basis for memory estimates:

Memory = Base Memory + DBMS Memory + (Per User Memory for Documentum x Number of Active Users)
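This equation can be sketched in Python as shown below. The per-user figures are the Table 4-15 estimates (10 MB per eContent Server connection, 20 MB per RightSite connection), the baseline figures match the examples that follow, and the function itself is only an illustration.

ECONTENT_MB_PER_USER = 10    # Table 4-15
RIGHTSITE_MB_PER_USER = 20   # Table 4-15

def host_memory_mb(os_mb, dbms_mb, server_base_mb, active_users, per_user_mb):
    """Memory = base (OS plus server software) + DBMS + per-user memory * active users."""
    return os_mb + dbms_mb + server_base_mb + active_users * per_user_mb

# Example One below: everything on one host, 50 active users.
total = host_memory_mb(os_mb=128, dbms_mb=500, server_base_mb=32 + 32,
                       active_users=50,
                       per_user_mb=ECONTENT_MB_PER_USER + RIGHTSITE_MB_PER_USER)
print(total)   # 2192 MB, roughly the 2.2 GB total shown in Table 4-19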

Example One

This example is based on the following assumptions:

■ All components (eContent Server, HTTP Server, RightSite, RDBMS) reside on one server (host-based).

■ The Docbase will have up to 500,000 objects within the next two years.

■ The maximum number of active users will be 50 users (20 percent of 250 users per hour).

Table 4-19 shows the memory calculations for the first example.

Table 4-19 Memory Calculation Example 1

Software Components Minimum Memory Capacity Required

Operating system 128 MB

RDBMS 500 MB

eContent Server 32 MB

RightSite and HTTP (required only for Web clients or RightSite applications)

32 MB

50 Active Users x (10 MB + 20 MB) 1.5 GB

TOTAL Estimated Server Memory Requirements 2.2 GB


Example Two

This example is based on the following assumptions:

■ eContent Server and the RDBMS are on one server; RightSite and the HTTP Server are on a second server.

■ The Docbase will have up to 500,000 objects within the next two years.

■ The maximum number of active users will be 50 users (20 percent of 250 users per hour).

Table 4-20 shows the calculations for the first server, and Table 4-21 shows the calculations for the second server.

Table 4-20 Memory Calculation for Example 2, First Server (eContent Server and RDBMS)

Software Component Required Minimum Memory Capacity

Operating system 128 MB

RDBMS 500 MB

eContent Server 32 MB

50 Active Users x 10 MB 500 MB

TOTAL Estimated Server 1 Memory Requirements

1.2 GB

Table 4-21 Memory Calculation for Example 2, Second Server (RightSite and HTTP Server)

Software Component Required Minimum Memory Capacity

Operating system 128 MB

HTTP Server 32 MB

RightSite Server 32 MB

50 Active Users x 20 MB 1 GB

TOTAL Estimated Server 2 Memory Requirements

1.2 GB


The aggregate minimum required memory for both servers is 2.4 GB.

Example Three

This example is based on the following assumptions:

■ Three servers: One for the Documentum eContent Server, another for the RDBMS, and another server for the HTTP server software and RightSite.

■ The Docbase will have up to 500,000 objects within the next two years

■ The maximum number of active users will be 50 users (20 percent of 250 users per hour)

Table 4-22 shows the calculations for the first server; Table 4-23 shows the calculations for the second server; and Table 4-24 shows the calculations for the third server.

Table 4-22 Memory Calculation for Example 3, First Server (eContent Server)

Software Component Required Minimum Memory Capacity

Operating system 128 MB

eContent Server 32 MB

50 Active Users x 10 MB 500 MB

TOTAL Estimated Memory Requirement for the eContent Server Machine

660 MB (round up to 700 MB)

Table 4-23 Memory Calculation for Example 3, Second Server (RDBMS)

Software Component Required Minimum Memory Capacity

Operating system 128 MB

RDBMS 500 MB

TOTAL Estimated Memory Requirement for the RDBMS Machine

628 MB (round up to 768 MB)

Table 4-24 Memory Calculation for Example 3, Third Server (HTTP and RightSite Servers)

Software Component Required Minimum Memory Capacity

Operating system 128 MB

HTTP Server 32 MB

RightSite Server 32 MB

50 Active Users x 20 MB 1 GB

TOTAL Estimated Memory Requirement for the HTTP and RightSite Server Machine

1.2 GB

The aggregate estimated minimum memory for all three servers is approximately 2.4 GB.

Sizing Server Disk Capacity

This section describes how to size server disk capacity and provides guidelines and information to help you with that task.

Figure 4-10 summarizes the disk capacity sizing process.



Figure 4-10 Disk Storage Sizing Process

There are numerous inputs to this process, and it is important to obtain correct estimates and information for them. If the inputs are incorrect and disk space is sized incorrectly, the server will perform very badly. Fixing problems related to disk space is difficult after the fact. The inputs to the process include:

■ The amount of physical space needed for all file stores

■ The recovery characteristics for all file store areas

■ The disk access needs for each file store area

After you obtain the needed inputs, you can determine which disk configuration will best suit your needs.

(Figure 4-10 shows five inputs: 1. Determine the operating system. 2. Estimate “active” users and users per hour. 3. Estimate the number of documents. 4. Estimate the number of renditions and versions. 5. Estimate the frequency of full-text searches, case-insensitive searches, and dump/load operations. These feed three outputs: estimate the amount of physical disk space required, estimate the number of drives needed to provide sufficient disk access capacity, and establish recovery requirements for each data storage area.)


The next section “Key Concepts for Disk Sizing,” reviews some of the factors that affect decisions about disk sizing and configuration. “Disk Striping and RAID Configurations” on page 4-40 describes a variety of common disk configurations. “Disk Storage Areas” on page 4-42 describes the characteristics of the disk storage areas used in a Documentum deployment and makes some disk sizing and configuration recommendations for each.

Key Concepts for Disk Sizing

To correctly size disk space, it is important that you understand several key concepts:

■ Disk Space and Disk Access Capacity

■ Effect of Table Scans, Indexes, and Cost-based Optimizers on I/O

■ DBMS Buffer Cache Memory Effect on Disk I/Os

Disk Space and Disk Access Capacity

To size disks for a Documentum installation, you must size both the required disk space and the disk access capacity. Sizing disk space entails ensuring that there is sufficient room to store the permanent and temporary data. Sizing access capacity entails ensuring that there are sufficient disk spindles (or arms) to gather all of the data in a reasonable time frame.

To illustrate the difference, suppose there is a 4 GB DBMS table, which can fit on a single 9 GB drive. Suppose also that the DBMS needs to scan the entire table. If the DBMS uses 16K blocks, then 262,000 disk I/Os are needed to scan the table. If the scan takes place over a single disk drive (at 10 msec an I/O), it takes 44 minutes to scan the table (assuming that no pre-fetching occurs). However, if the table is striped over 10 drives in a RAID 0 stripe, the scan might actually only take 4 minutes if one read could hit all 10 drives (by small stripe units). Notice that ten 9 GB drives offer 10 times the space needed for the table; however, meeting the space requirements did not necessarily meet the response time requirements.
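The numbers in this example work out as follows (an illustrative calculation only, assuming 16K blocks, roughly 10 msec per random I/O, and no pre-fetching):

table_bytes = 4 * 1024**3          # 4 GB table
block_bytes = 16 * 1024            # 16K DBMS blocks
io_seconds = 0.010                 # about 10 msec per random I/O

ios_needed = table_bytes // block_bytes            # 262,144 I/Os
single_drive_min = ios_needed * io_seconds / 60    # about 44 minutes on one drive
striped_min = single_drive_min / 10                # about 4.4 minutes over a 10-drive RAID 0 stripe

print(ios_needed, round(single_drive_min), round(striped_min, 1))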


Effect of Table Scans, Indexes, and Cost-based Optimizers on I/O

The biggest demands on disks typically come from large table scans. A table scan occurs when a DBMS reads all of the data in a table looking for a few rows. In a table scan, the DBMS does not use an index to shorten the lookup effort. Table scans are not always bad (in some cases using an index can actually hurt performance). However, if the table is large enough or the amount of physical RAM is small enough, table scans generate enormous amounts of disk I/O.

Note: A database index is a data structure that allows the DBMS to locate some records efficiently without having to read every row in the table. Documentum maintains many indexes on tables and even allows the Administrator to define additional indexes on their underlying tables.

In general, large table scans should not appear in your workload, but there are some operations that will result in a table scan. For example, the DBMS cannot use an index for a case-insensitive attribute search, which leads to a table scan. If your applications contain operations that result in large table scans which can’t be tuned away, it is important to size the server’s disks properly to get best performance.

Tuning with the Optimizer

Some table scans are unavoidable (case-insensitive search); others are the result of query optimization problems by the DBMS vendors. Some vendors, such as Oracle, actually support multiple modes of query optimization. One popular mode used frequently by the Documentum Server relies on the Oracle rule-based optimizer. This optimizer picks a data access plan based on a sequence of rules, not on the statistics of the tables (the number of rows). As Docbases get larger, using a rule-based optimizer can lead to some costly mistakes that cause table scans. On large Docbases, a cost-based optimizer might deliver better access plans because it can determine table size and column value distributions. However, cost-based optimizers are not guaranteed to pick the best data access plan. Even with the table statistics, they can mistakenly pick an access plan that leads to more disk I/O.

It’s not too difficult to switch between the Oracle rule-based optimizer and a cost-based optimizer (like “ALL_ROWS”). (The global parameter is set in the init.ora file.) If you use a cost-based optimizer but the tables have no statistics, Oracle uses the rule-based optimizer instead. Therefore, at a site that does not otherwise generate statistics, individual Oracle tables can be moved selectively from rule-based to cost-based optimizing by generating statistics for just those tables.

Databases like Sybase and the Microsoft SQL Server always use cost-based optimization and need to have statistics for the various tables.

DBMS Buffer Cache Memory Effect on Disk I/Os

The DBMS data cache (illustrated in Figure 4-11) holds images of rows that were read from or written to disk. If the DBMS can find the data rows in the cache, it does not need to read from the disk. Consequently, there is an inverse relationship between the amount of data in the cache and the demands on the disk.

Figure 4-11 Illustration of a DBMS Cache

Theoretically, if memory is large enough and the Docbase small enough, nearly all of the Docbase attribute information may fit in memory, almost eliminating the requirement for disk I/O. However, in reality, all attribute information for large Docbases cannot fit in physical memory. So, the DBMS will keep some information and remove other information from the cache as needed. A Least Recently Used (LRU) algorithm keeps the data most often referenced in the cache and lets the least referenced go. This also helps minimize disk I/O because the most referenced data stays in the cache.
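The least-recently-used behavior described above can be illustrated with a toy cache. The sketch below is only a conceptual illustration; real DBMS data caches also handle dirty pages, multiple buffer pools, checkpointing, and so on.

from collections import OrderedDict

class ToyLRUCache:
    """A minimal LRU page cache illustrating the eviction policy described above."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._pages = OrderedDict()
    def get(self, page_id):
        if page_id not in self._pages:
            return None                      # cache miss: a disk I/O would be needed
        self._pages.move_to_end(page_id)     # mark the page most recently used
        return self._pages[page_id]
    def put(self, page_id, rows):
        self._pages[page_id] = rows
        self._pages.move_to_end(page_id)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # evict the least recently used page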

Is data in cache?

Yes! No need for disk I/O

No! then go out to disk and get it.

DBMS data cache


Table 4-25 shows disk I/O ranges for some Docbase sizes based on the EDMI workload. The workload does not have any table scans; however, this represents disk I/O statistics for various database vendors under different memory configurations.

Table 4-25 Ranges of Disk I/Os vs. Memory Used vs. Docbase Sizes vs. Users/Hour

Docbase Size (Megabytes)

Disk I/Os per Second

DBMS Memory (Gigabytes)

EDMI Users/Hour

1 60 - 200 1 - 2 900 - 1500

2 - 2.5 60 - 300 1 - 2 800 - 3000

5 410 - 2000 1.5 - 3 2000+ - 4000+

Disk Striping and RAID Configurations

Disk striping distributes a piece of data over separate disks. When the data is requested, all of the disks participate in returning the data.

It might seem that retrieving data from multiple drives would take longer than retrieving data from one drive. However, given that the data is typically large (many Kbytes), it is unlikely that retrieving the data from a single disk could be accomplished in a single operation anyway. If you stripe the data, the drives work in parallel, thus allowing the operation to happen much more quickly.

The striping logic of an operating system (or of disk array firmware) makes a group of disks appear as a single disk or volume. The operating system will break up a disk request into multiple disk requests across multiple drives. Figure 4-12 illustrates how striping works.



Figure 4-12 Disk Striping Concept (the figure contrasts retrieving the data in four sequential I/Os from a single drive with retrieving it in four parallel I/Os across a striped set)

A key component of disk striping performance is the size of the data stripe (data block). The smaller the stripe, the more parallel drives that might be used for a single I/O. The more parallel drives, the better the performance. However, if the stripe is too small, the overhead of dealing with the stripe exceeds any performance gains from the striping. If the stripe is too large, then I/Os are likely to queue up on a single drive.

To illustrate, using an extreme example of poor stripe size from the DBMS world, suppose an administrator has many individual disks and stripes the data by creating multiple table space files across the independent disks. The administrator puts a single, large, sequential portion of the table space on each drive. If a request is made for a portion of the table, it is likely that the I/Os are going to be concentrated on a single drive in an uneven fashion.

In general, RAID0 (striping without parity), as described above, outperforms tablespace or DBMS device-level disk striping. However, RAID0 has some disadvantages. If a single drive fails, the entire stripe set fails. That is, four disks in a RAID 0 stripe set (or logical drive) have a shorter mean time before failure (MTBF) than a single drive, because any one of the four physical drives can bring the logical drive down.



There are two major ways to protect performance yet maintain reliability: mirroring and striping with parity. With mirroring (or RAID 1), the data is written to two drives. If one drive fails, the data can be read from the other. When data is striped over a set of mirrored drive pairs, the configuration is called RAID1+0 or RAID10.

In striping with parity (or RAID5), parity information is written out in addition to the data. The parity information can be used to recreate data if a drive fails, and it is always written to a drive in the configuration that does not contain the data that generated it. For example, suppose there are four drives on which data is striped. One write might put data on drives 1, 2, and 3 with the parity information on drive 4. Another write might put the data on drives 1 and 2, with the parity information on drive 3.

The disadvantage of a RAID1+0 configuration is the cost of the additional drives. The disadvantage of a RAID5 configuration is the extra I/Os needed to write out the parity information. In general, the access penalty for RAID5 is fairly severe for DBMS files. It can provide decent performance for Docbase content.
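The write penalty can be quantified with the commonly cited figures: a small random write costs about four physical I/Os on RAID 5 (read old data, read old parity, write new data, write new parity) versus two on RAID 1+0. The sketch below assumes a nominal 100 random I/Os per second per drive; these figures are generic assumptions, not Documentum measurements.

def logical_writes_per_sec(drives, per_drive_iops=100, write_penalty=4):
    """Rough random-write capacity of an array given a per-write I/O penalty."""
    return drives * per_drive_iops / write_penalty

print(logical_writes_per_sec(8, write_penalty=4))   # RAID 5:   about 200 writes/sec
print(logical_writes_per_sec(8, write_penalty=2))   # RAID 1+0: about 400 writes/sec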

Disk Storage Areas

A Documentum installation has many different disk storage areas that need sizing. Table 4-26 outlines these storage areas and some of their characteristics.

Table 4-26 I/O Characteristics for Documentum

Data Area Used For Size Recovery Characteristics

I/O Activity Advice

DBMS and Documentum Server backup

Backup copies of metadata and content files

Same size as content plus DBMS data—typically Gbytes

Hard to recover Sequential writes, but infrequent usage

RAID 5

Documentum content and fulltext index

Actual online content plus full text index

Gbytes Hard to recover Random read/write for small files and sequential for large files

RAID5 or mirrored pairs bound into a RAID 0 stripe (RAID 1+0)


Documentum eContent Server temp storage area

Intermediate file transfer area

Typically less than 100M bytes

Easy to recover Random read/write for small files and sequential for large files

RAID5 is acceptable

DBMS transaction logs

Ensuring DBMS operations remain stable after a failure

Mbytes to Gbytes

Hard to recover Sequential writes

Mirrored pairs bound into a RAID 0 stripe (RAID 1+0)

DBMS data & index

Holding the document meta-data

Gbytes Hard to recover Random read/write

RAID 1+0 preferred

DBMS temp & Oracle rollback segments

Temp is used for DBMS sorting and worktables; rollback seg-ments are used for transaction aborts

Mbytes to Gbytes

Easier to recover and rebuild than DBMS data and indexes

Sequential and random writes (some reads)

Mirrored pairs bound into a RAID 0 stripe (RAID 1+0)

RightSite temp & log storage area

Temp is used as an intermediate file transfer area; log storage area stores log files

Hundreds of Mbytes

Easy to recover Random read and write for small files and sequential for large files

RAID5 is acceptable

RightSite DMCL cache

Per session file cache (what would have been on client machines in a client server environment)

Hundreds of Mbytes

Easy to recover Random read and write for small files and sequential for large files

RAID5 is acceptable

RightSite Docbasic compiled disk cache

Pre-compiled Docbasic scripts

Mbytes Easy to recover Random read and write for small files and sequential for large files

RAID5 is acceptable

Internet Server log files

Per-operation logging by Internet Server

Tens of Mbytes Easy to recover Random read and write for small files and sequential for large files

RAID5 is acceptable

OS paging/swap files

Holds the pages of a process working set no longer needed in memory

Gbytes Painful to recover (OS will have hard time to boot)

Mostly sequential writes

Suggest RAID 1+0


Disk Space Sizing

This section provides a formula for calculating disk space needs.

Physical Disk Requirements of the Documentum Software Components

Table 4-27 lists the disk space requirements for the servers in the system. The figures in this table are based on test Docbase scenarios. The actual requirements for a particular Docbase will vary based on such factors as subtype depth, number of attributes, size of each attribute value, number of repeating values, and overhead for non-document objects.


Table 4-27 Physical Disk Requirements for Server Software in Documentum System

Server Requirement

eContent Server 175 MB

RDBMS (not including Table Space) 100 MB

Base Documentum Objects (including RDBMS Table Space) 50 MB

RightSite 60 MB


Typical Disk Space Calculation Model for Content and Attribute Data

You can use the following formula to estimate disk usage for document content and the metadata:

5K Per Object x Number of saved versions stored in the RDBMS table (Document Object Data) +

Document Size x Number of saved versions stored in the Docbase (Document Content) +

Rendition Size x Number of saved versions stored in the Docbase (Renditions) +

30% of the original document size x Number of saved versions stored in the Docbase (Fulltext Indexes) +

2.5K x Number of saved versions stored in the Docbase (Annotations)

Expressed mathematically, the formula is: ((5K * number of versions) + (Document Size * number of versions) + (Rendition Size * number of versions) + ((30% * Document Size) * number of versions) + (2.5K * number of versions)) * Total Number of Documents = Total Disk Capacity

(Note that not all documents require versions, renditions, annotations or full-text indexes. You can configure a system to prune version and rendition trees and annotations).

Use the Excel Spreadsheet to automatically calculate the estimate.
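The same per-document formula can also be sketched in Python. The 5K metadata, 30 percent full-text, and 2.5K annotation figures are taken from the formula above; the function name, arguments, and example values are illustrative only.

def docbase_disk_kb(total_documents, versions, document_kb,
                    rendition_kb=0.0, fulltext=True, annotations=True):
    """Estimate total disk capacity in KB using the formula above."""
    per_doc = 5.0 * versions                      # object metadata in the RDBMS
    per_doc += document_kb * versions             # document content
    per_doc += rendition_kb * versions            # renditions, if kept
    if fulltext:
        per_doc += 0.30 * document_kb * versions  # full-text indexes
    if annotations:
        per_doc += 2.5 * versions                 # annotations
    return per_doc * total_documents

# Example: 500,000 documents, 3 saved versions each, 100 KB content, no renditions.
print(docbase_disk_kb(500_000, versions=3, document_kb=100) / 1024**2, "GB")   # about 197 GB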



Additional Considerations

Documentum does not impose additional overhead on content storage in the file system beyond the actual size of the content file unless the contents are full-text indexed.

If a Docbase is participating in object replication, there must be available disk space to execute the requirements of the replication job.

Similarly, disk space must be available when a dump and load is performed.

Additional References

■ System Administrator's Guide, Chapter 11, Tools

■ Docbase Design Principles Customer Training Course

If your organization has an existing Docbase, you can generate pertinent information regarding disk utilization (and more) by executing eContent Server’s State of Docbase system administration tool. This tool is described in the System Administrator’s Guide, Chapter 11, Tools.

Database License Sizing

RDBMS user licenses are an important component of the cost of a deployed system. Database vendors typically have three different licensing schemes:

■ Per seat licensing (fee for each possible named user of the system)

■ Concurrent user licensing (fee for the maximum number of concurrent users on a system)

■ Per CPU licensing (fee for each CPU used in production)

Per-CPU licensing is common for Internet applications.

In most cases, Documentum works well with concurrent user licensing. Because user interaction with the Documentum environment is likely to be erratic (get document, read document) and eContent Server closes inactive sessions, the number of concurrent DBMS users is determined by the number of active Documentum sessions. The number of active Documentum sessions


is a percentage of the number of users supported per hour. For example, in the EDMI workload, the number of active Documentum sessions was about 17 to 20 percent of the total EDMI users supported per hour.

Consequently, you can estimate the number of active sessions. Increase this value to account for peak periods (for example, estimated number + 20 percent extra).

Some notes on concurrent user licensing and Documentum:

■ For some implementations (for example, Oracle), multiple DBMS sessions will be created for some user activities. Consequently, the actual number of concurrent DBMS users will be greater than the number of active Documentum users. Typically, the actual number is approximately one and a half times the number of active users. In these cases, however, the DBMS vendors typically do not charge extra for multiple sessions per Documentum user.

■ Although shortening the eContent Server client session time-out reduces the number of active sessions at any one time, it also causes more frequent reconnections, which drives response time up. For remote users, this penalty might be severe. You might actually need to increase the client session timeout to keep remote users happy.
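As a rough illustration of this arithmetic, the sketch below combines the 17-to-20-percent active-session ratio from the EDMI example, the 20 percent peak margin, and the approximate 1.5x multiple-session factor noted above; the user count and the variable names are hypothetical.

```python
users_per_hour = 1000           # busy-hour users supported (example figure)
active_ratio = 0.18             # ~17-20% of users have an active session (EDMI example)
peak_margin = 1.20              # add ~20% for peak periods
dbms_session_factor = 1.5       # some RDBMSs (e.g., Oracle) open ~1.5 sessions per active user

active_sessions = users_per_hour * active_ratio * peak_margin
dbms_sessions = active_sessions * dbms_session_factor

print(f"Concurrent Documentum sessions to license for: {active_sessions:.0f}")
print(f"Approximate concurrent DBMS sessions:          {dbms_sessions:.0f}")
```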

For anonymous RightSite users, per-CPU licensing might be more appropriate. In this case, the number of active Documentum sessions created by the pool of anonymous RightSite servers is typically fairly small. However, each of those active servers might perform numerous operations. This fits well with a per-CPU licensing scheme.

Certified Database and HTTP Server Versions

Please refer to the eContent Server release notes and the RightSite release notes for information about the certified RDBMS versions and HTTP server versions.


5   Server Network Configuration Guidelines

This chapter contains guidelines for sizing and configuring the network between multiple Documentum sites. The following topics are covered:

■ “Overview of Network Sizing” on page 5-1

■ “Key Concepts for Network Sizing” on page 5-2

■ “Making the Decision: Localizing Traffic or Buying More Bandwidth” on page 5-6

■ “Additional Specific Network Recommendations” on page 5-12

Overview of Network Sizing

Sizing the network between Documentum sites is principally a matter of:

■ Sizing bandwidth needs for servers and users

■ Determining locations for servers

First, you must understand network bandwidth needs. Sometimes the need for bandwidth between remote users and their server is so great that it makes more sense to relocate the server (or just content) closer to users to improve response times and drive down telecommunications costs.

The main issue that affects remote user response time is content size relative to available bandwidth. The second issue is the number of operations that have to take place between a client and the server and how much data is generated during those operations.

Figure 5-1 illustrates the steps for configuring your network resources.


Figure 5-1   Steps To Configure Network Resources
[Figure: 1. Estimate the number of users per hour; 2. Estimate operations per user; 3. Estimate the bytes per operation; 4. Estimate document size; 5. Determine the locations of users. Then determine network demand per user community as if each had its own link, determine possible geographic locations for servers, and rework the network load based on those locations, adjusting if demand outstrips the budget.]

Key Concepts for Network Sizing

Before you can make decisions about sizing and configuring the network, you must understand several key concepts about networks and how they affect sizing. These key concepts are:

■ “Bandwidth and Latency” on page 5-3

■ “Bandwidth Needs and Response Time” on page 5-4



Bandwidth and Latency

Bandwidth describes the available throughput of the various components in a network. It is usually measured in Bytes/sec or bits/sec. The factors affecting bandwidth are:

■ Transmission speed

■ The amount of external traffic using the media

To illustrate how external traffic affects bandwidth, let’s look at a single-lane freeway with a speed limit of 60 miles per hour. Optimal bandwidth allows the cars to go at the speed limit (that is, a car could cover 60 miles in an hour). Actual bandwidth is likely to be far less due to rush hour traffic or accidents. For example, the traffic on the freeway may have an average speed of 20 miles per hour during rush hour. Available bandwidth is diminished by external traffic forces, in this case, additional cars.

Data transfer latency is the time it takes to send data between two hosts. The factors affecting latency are:

■ Processing time on the respective host machines

■ Propagation delay across the transmission media

■ Available bandwidth for each network between the two hosts

■ Processing time for all routers and gateways between the two hosts

To continue with the above example, latency is the time it takes to get from one place to another. The distance from Pleasanton, California to the Golden Gate bridge in San Francisco might only be about 50 miles, but the delays caused by the toll bridge between Oakland and San Francisco and the various traffic lights in San Francisco are likely to make that trip take longer than 1 hour (Figure 5-2).


Figure 5-2   Example of Bandwidth vs. Latency
[Figure: a 50-mile trip from Pleasanton, CA to the Golden Gate Bridge in San Francisco runs at 65 miles/hour on Highway 580 (bandwidth) but only 25 miles/hour on Highway 101 in San Francisco, so the latency for the trip is 1 hour or more.]

When you apply the concepts of latency and bandwidth to sizing a real-world network, you must be sensitive to the various components in the network and how the components affect the response time of a Documentum implementation.

Bandwidth Needs and Response Time

There are two ways to provision network resources based on network load. You can:

■ Optimize for average network demand

■ Optimize for online response time

To optimize for average demand, you need only to determine the number of bytes transferred in a busy hour and ensure that there is enough bandwidth to meet that demand.

Unfortunately, optimizing for network demand often leaves online users with poor response time. For example, suppose only 5 Mbytes of data must be transferred between two sites in one hour. A 56K bps link would seem to provide sufficient bandwidth for users to get good response time (5M bytes per hour is only 20 percent of the total amount of data that could be transferred on a 56K bps link in 1 hour). However, that isn't the case. Network demands of online users are characterized by small bursts separated by long pauses, and response will be judged as good or bad by how long it takes to service one of those bursts. Figure 5-3 illustrates the nature of the network demand.



Figure 5-3   Example of Bursty Network Load Caused by Online Users
[Figure: percentage of bandwidth used (0 to 1) plotted against time in seconds; short bursts of activity are separated by long stretches of unused bandwidth, and the longer the burst, the poorer the response time.]

As an example, suppose that a particular command (such as the display of a dynamically rendered Web page) transfers 80,000 bytes of data. With a 56K bps line, the command will take about 11 seconds to complete at best and will be judged by users as a command with poor response time. At 256K bps, the response goes down to approximately 2.5 seconds and the command is considered to have acceptable performance. Now suppose that 10 users each issue the above command 6 times in 1 hour. Their average network demand for the hour is 4.8M bytes (80,000 bytes x 6 commands x 10 users), which is only 20 percent of the 56K bps bandwidth and 4 percent of a 256K bps bandwidth.
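The arithmetic behind these figures is straightforward; the following small sketch reproduces them, using the payload size, command counts, and line speeds from the example above.

```python
def transfer_seconds(payload_bytes, line_bps):
    """Best-case time to move one burst across a line of the given speed."""
    return payload_bytes * 8 / line_bps

def busy_hour_utilization(bytes_per_hour, line_bps):
    """Fraction of the line's hourly capacity actually consumed."""
    return bytes_per_hour / (line_bps / 8 * 3600)

burst = 80_000                       # bytes per command (dynamic page)
demand = burst * 6 * 10              # 10 users x 6 commands = 4.8M bytes/hour

for bps in (56_000, 256_000):
    print(f"{bps // 1000}K bps: {transfer_seconds(burst, bps):.1f} s per command, "
          f"{busy_hour_utilization(demand, bps):.0%} of hourly capacity")
```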

Although this seems to indicate that an enormous amount of bandwidth is needed for users to achieve any level of decent performance, it turns out that because there are large pauses in the use of the line, more online users can share bandwidth due to the random nature of their requests. That is, the 256K bps line could also likely serve an additional 70 users at the same number of requests with good response time (which would drive the average usage of the line to about 30%).

Typically, it is not cost effective to provide so much bandwidth so that all commands can complete in 3 seconds or less. Rather, it is important to try to ensure that the most frequently performed operations have good response times. For example, if users log into their application only once an hour, then a login that takes 10 seconds over some amount of bandwidth is much less



annoying than some other command that takes 10 seconds and is run 10 times in the hour for each user. Determining how much bandwidth to allocate for a particular Documentum application focuses on trying to make the most frequent commands run the most quickly. This can be achieved by adding more bandwidth or by choosing a Documentum distributed option to localize network requests. “Making the Decision: Localizing Traffic or Buying More Bandwidth,” discusses those decisions.

Making the Decision: Localizing Traffic or Buying More Bandwidth

When you are sizing and configuring a network between Documentum sites, the goal is to achieve good user response time at minimum cost. This section describes the trade offs between bandwidth and a variety of Documentum options, including:

■ Remote Web servers

■ Content servers

■ Object replication

For all options, there is a trade off between purchasing more bandwidth and localizing access by putting a server at the remote site. No single formula can be applied to make the evaluation. Bandwidth costs differ by region and by proximity to service and telephone facilities. There are also different choices for the types of server machines to put at the remote site, and maintenance and staffing costs for that software and hardware will also differ by region.

Let’s illustrate the cost trade offs with a hypothetical example. Suppose a remote office supports 25 users. Analysis determines that without any special servers at that site, users need a bandwidth of 700K bps to achieve good response time. However, locating a special server at the remote office would reduce the traffic demands so that a bandwidth of 128K bps would provide good response time.

The assumed costs associated with the remote server machine include:

■ Server host (2 CPUs, 1GB memory, 2 internal disks, monitor, keyboard, tape drive, and rack): $15,000

■ Software costs (OS, etc): $1,500


■ Tape backup costs: $200 per year

■ Other administrative costs: $1,000 one-time fee for training remote admin (who already supports the users at the remote site) + $2,000 one-time fee for initial setup

In this example, the remote server machine requires fairly little day-to-day administration, and the administration can be done remotely from the central site. Additionally, the example assumes that the service will be in production for three years and that power costs are negligible. These assumptions bring the total charge for the remote machine to about $20,000.

Let’s also assume that the remote office has access to a frame-relay service provided by the local telephone company and that the frame-relay service charges include both port prices and access facility charges. These port prices are shown in Table 5-1.

Table 5-2 shows the access facility charges.

Table 5-1   Example Frame Relay Port Prices

Speed | Installation Fee | Monthly Fee
56K bps | $354.67 | $70.93
128K bps | $354.67 | $141.87
384K bps | $354.67 | $378.32
1.544M bps | $354.67 | $472.90
37M bps | $1,418.69 | $4,539.79

Table 5-2   Example Frame Access Facility Charges

Speed | Installation Fee | Monthly Fee
56K bps | $597.37 | $47.41
128K bps | $600.69 | $165.94
384K bps | $600.69 | $165.94
1.544M bps | $600.69 | $165.94


Table 5-3 shows the cost of setting up and using frame relay from the central site to the remote office for bandwidths of 1.544M bps and 128K bps.

Table 5-3   Total Cost for Three-Year Period for Two Line Sizes

Charges | 128K bps | 1.544M bps
Port installation fee | $354.00 | $354.00
Port charge for 3 years | $5,076.00 | $16,992.00
Access installation fee | $600.00 | $600.00
Access charge for 3 years | $5,940.00 | $5,940.00
Total for a single site | $11,970.00 | $23,886.00
Total for both sites | $23,940.00 | $47,772.00

In this example, the remote server saves the company roughly $3,000 to $4,000 over three years ($47,772 for the higher-bandwidth lines alone, versus about $20,000 for the server plus $23,940 for the 128K bps lines). Such a savings is less than 10 percent. If the remote office is in a different country, or if a cost-effective frame-relay service isn't available, the remote server could yield significantly greater savings.
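The totals in Table 5-3 and the comparison above can be reproduced with a few lines of arithmetic. This is only a sketch: it uses whole-dollar monthly fees consistent with the three-year totals in Table 5-3, plus the rounded $20,000 remote-server figure from this example.

```python
def three_year_line_cost(port_install, port_monthly, access_install, access_monthly, sites=2):
    """Three-year frame-relay cost for the given per-site charges (36 monthly payments)."""
    per_site = port_install + access_install + (port_monthly + access_monthly) * 36
    return per_site * sites

# Monthly fees rounded to whole dollars, matching the totals in Table 5-3
no_server = three_year_line_cost(354.00, 472.00, 600.00, 165.00)              # 1.544M bps links
with_server = 20_000 + three_year_line_cost(354.00, 141.00, 600.00, 165.00)   # 128K bps links + server

print(f"High-bandwidth lines only:  ${no_server:,.0f}")
print(f"Remote server + 128K lines: ${with_server:,.0f}")
print(f"Savings over three years:   ${no_server - with_server:,.0f}")
```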

The following sections discuss how to determine the bandwidth needs for a variety of deployment options. The Documentum Sizing Spreadsheet also contains information about this topic.

More Bandwidth or Remote Web Servers

Documentum uses an N-tier server architecture, and each tier employs a different protocol with different networking characteristics. In Web-based deployments, the browser-to-Web server protocol (HTTP/HTML) is likely to be the most verbose. Depending on the application, the HTML could be 20 times more verbose than the corresponding DMCL. Consequently, if remote users are concentrated in a single office, it might be best to locate a remote Web server in that office to achieve better response times. Doing so takes advantage of the fact that HTML requires more bandwidth than the corresponding Documentum DMCL protocol between the Web tier and eContent Server. Because there is little state to maintain on a Web or application server machine, the cost of using a remote Web or application server machine is low.



Figure 5-4   Bandwidth and Remote Web Servers
[Figure: verbose HTML flows between the Web/application server and the browser, while less verbose DMCL flows between the Web/application server and eContent Server; placing the Web/application server near the users keeps the verbose HTML traffic local.]

Content Transfer Response Time: More Bandwidth or Content Servers

Suppose that the Web server is located at the remote site, but performance is still poor (or expected to be poor) given the bandwidth provided. The next item to improve is the time required to transfer content (or files). You can add more network bandwidth (for example, upgrade a 56K bps link to a 128K bps link) or add a Documentum content server at the remote site to localize content access. Deciding which strategy to employ will depend on the relative costs of each, and the costs are likely to differ based on geography.



Using Documentum content servers is quite attractive when bandwidth is very expensive and users typically access large files. The administration and hardware needs of a Documentum content server are fairly small. A content server only requires additional disk space for content (beyond the needs of a regular Web or application server) and it can be administered remotely.

Figure 5-5 illustrates the use of content servers.

Figure 5-5   Bandwidth and Remote Content Servers
[Figure: with a remote Docbase, DMCL plus content transfer crosses the WAN and response time can be improved only by increasing bandwidth; with a remote content server, content is served locally and only DMCL operation traffic is sent to the remote eContent Server.]



Operation Response Time: More Bandwidth or Replication

The trade off between bandwidth and local server machines also applies to DMCL operations issued from remote application servers or from remote Desktop Client users. That is, even after content and Web access are moved to the remote site, users may still get poor response time. In such cases, Documentum object replication might be the best solution for the enterprise. With object replication, all or part of a Docbase is replicated to the remote site. The replication process runs during off-peak hours, so remote users almost always interact with their local servers and get very good response time. The costs of object replication include setting up a remote RDBMS, the overhead of replication administration, and the latency of replica updates (for example, nightly updates). Figure 5-6 illustrates how bandwidth and object replication interact.

Figure 5-6   Bandwidth and Object Replication
[Figure: replication updates run across the WAN at night; during the day, users get fast online access to the replicated data from their local servers.]



Additional Specific Network Recommendations

■ Keep roundtrip ping time between servers in a distributed server configuration to below 250 milliseconds. Roundtrip ping time measures network latency.

■ Land-based communication is better than satellite communication due to the physical distance data travels using satellite communication.

■ Consider adding additional network interface cards to each server to prevent network saturation.

■ To handle large images across the network, consider a second network card and higher-speed media (for example, 100 Mbit Ethernet rather than 10 Mbit).

■ Place the WAN between the client and eContent Server or the Web Browser and RightSite (if possible).

■ Do not place a WAN between eContent Server and the RDBMS (if possible).

■ Determine whether you can use Documentum’s network compression feature to optimize network performance between the client and server.


6   Sizing for Client Applications

This chapter describes sizing considerations and requirements for Documentum clients. The following topics are covered:

■ “Sizing for Desktop Client” on page 6-1

■ “Sizing for AutoRender Pro” on page 6-5

■ “System Requirements for Client Products” on page 6-6

Sizing for Desktop Client

CPU speed and memory are the main resources that must be sized properly for Documentum Desktop Client. It is also important to size the disk space and network resources correctly, as these can have a profound impact on response time.

CPU Speed

Desktop Client offers a wide array of operations and functionality. Generally, the richer the feature set, the more CPU processing is required. A typical large enterprise will deploy a range of functionality over a PC population that varies from very slow (for example, 166 MHz) to extremely fast (for example, 800 MHz). There is a direct relationship between the speed of a machine and the number of features that can run on it with acceptable response times. Figure 6-1 illustrates this relationship.


Figure 6-1   CPU Needs vs. Features Used for Desktop Client
[Figure: client CPU speed (160 MHz, 300 MHz, 600 MHz) plotted against the features that run with acceptable response times. Basic usage: navigation, checkout/checkin, workflow, no OLE-linked documents. Additional usage scenarios: Business Policy, and so on. Advanced usage: OLE link processing, XML, large document processing, custom validations.]

Basic usage represents a fixed set of basic operations carried out on documents that are not extraordinarily large. For example, navigating folders and checking documents out and in are fairly basic operations and should perform acceptably on a 166 MHz machine. However, if the folders in the navigation path have a large number (hundreds) of subfolders and documents, or if the documents checked out and in are many megabytes in size, a 166 MHz machine probably won't be fast enough to give acceptable response time.

Table 6-1 shows some example response times for a set of basic office integration operations on two different client machines. (Response times were measured for the steady-state invocation, not the initial operation.)


Table 6-1   Response Time on Two CPUs

Operation | Response Time (in seconds) on 400 MHz CPU | Response Time (in seconds) on 166 MHz CPU
Launch App | 0.42 | 1.93
Open Dialog Box | 1.58 | 3.73
Open Small Doc (70K bytes) | 2.73 | 4.71
Save Small Doc (70K bytes) | 2.74 | 4.05
Open Medium Doc (300K bytes) | 3.99 | 5.81
Save Medium Doc (300K bytes) | 3.01 | 5.56
Open Large Doc (4M bytes) | 4.98 | 8.58
Save Large Doc (4M bytes) | 5.40 | 12.71


Simple usage of custom validations (user-defined attribute integrity checks made when importing a document or changing its attributes) is unlikely to provide good response time with slower machines. And the more advanced the validation, the more processing is required.

Advanced usage features are those that not only provide more sophisticated functionality (such as the conversion of OLE linked documents or XML documents into Documentum virtual documents) but also process large documents (large in size or in the number of virtual documents created). For example, the response time for an XML document chunked into thousands of nodes will be much higher than the response time for one chunked into a few nodes. Similarly, checking in a document with hundreds of OLE links will take longer than checking in a document with a single OLE link. Finally, checking in a 40 Mbyte Powerpoint document takes longer than checking in a 2 Mbyte Powerpoint document.

As the documents get larger and more disk I/O is performed at the PC, disk performance becomes a larger issue. SCSI drives are typically a better choice than EIDE drives for PCs that will use advanced features.

Component Initialization and Steady State Processing

Some operations take longer the first time they occur in a Desktop Client session than on second or subsequent executions, due to component initialization. Those operations display dialog boxes that use data that must be initialized the first time the dialog box is displayed. Once the start-up penalty is paid, the data is cached for the duration of the session.



Response times when an operation must initialize components are longer than when the operation executes in a steady state (using cached data). Table 6-2 lists some example initialization and steady-state response times for a set of operations (on a 400 MHz CPU).

Table 6-2   Example Response Times Comparing Initialization and Steady State Operations

Operation | Response Time (in seconds), First | Response Time (in seconds), 2nd and Subsequent
Open Dialog Box | 10.40 | 1.58
Open Sm Doc | 6.63 | 2.73
Save Sm Doc | 3.65 | 2.74

If initialization response times on a system are unacceptable, you can reduce them by increasing the CPU speed of the Desktop Client machine. (Note that this will also reduce the steady-state response time; there will always be some difference between initialization and steady-state response times.)

Memory Resource Needs

This section describes memory resources needed for Documentum Desktop Client and integrations.

The base Explorer integration (using basic features) can run using 64M bytes. However, actual memory resource needs will vary, depending on:

■ The level of features used

■ The type of applications used

■ Size of the documents being processed and the folders being navigated

Using advanced features, such as converting OLE links to documents, could require 96 Mbytes and more of memory.

When using the MS Office integrations, the memory required is about 10 Mbytes for each application invoked. (The Office integrations integrate Desktop Client with MS Word, PowerPoint, and Excel.) If all three are invoked and are running at the same time, the system requires an extra 30 Mbytes of memory.
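One way to budget client memory from these figures: start from the 64 Mbyte base (or roughly 96 Mbytes when advanced features such as OLE-link conversion are used) and add about 10 Mbytes per concurrently running Office integration. The sketch below is illustrative only; the function name is hypothetical.

```python
def desktop_client_memory_mb(advanced_features=False, concurrent_office_apps=0):
    """Rough Desktop Client memory budget in Mbytes, using the figures above."""
    base = 96 if advanced_features else 64      # ~96 MB with advanced features, 64 MB base
    return base + 10 * concurrent_office_apps   # ~10 MB per MS Office application running

# Example: advanced features plus Word, Excel, and PowerPoint open at the same time
print(desktop_client_memory_mb(advanced_features=True, concurrent_office_apps=3), "MB")
```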



Sizing for AutoRender Pro

AutoRender Pro services rendering requests issued by Documentum clients to convert documents from one format into another. For example, WebPublisher™ uses AutoRender Pro to automatically convert MS Word documents to HTML when they are checked in. AutoRender Pro is a single-threaded server application that runs on a server machine separate from eContent Server. Figure 6-2 illustrates how AutoRender Pro works with eContent Server and documents.

Figure 6-2   AutoRender Pro in Conjunction with a Docbase
[Figure: the eContent Server machine maintains a queue of pending rendering requests; a document is brought to the AutoRender Pro server machine's disk, converted there, and the converted document is checked back into the Docbase.]

The factors affecting response time for a single rendering job are:

■ The CPU speed and disk access speed of the AutoRender Pro server

■ The complexity of the work



To convert a document, AutoRender Pro must parse and rewrite the document to a different format. The conversion is CPU intensive, and document size, complexity (graphics and so forth), and format affect the amount of resources needed for the job. If the document is large, physical disk I/Os occur on the AutoRender drives as the document’s temporary version is created. In fact, performance studies show that during the conversion process both the CPU and the disk are quite busy. If the CPU is slow or if the drives or their controllers have poor response time, rendering jobs will take longer to process. Consequently, using good disk controllers, fast drives, and fast CPUs (> 600MHz) for the AutoRender Pro server machines is strongly recommended.

Memory requirements are minimal (refer to the release notes for actual figures).

Multiple AutoRender Pro servers can be set up for a single Docbase. Each server pulls work requests off the same work queue. Adding more servers increases the rendering capacity, allowing more users to enter requests simultaneously while maintaining the desired response time.
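Because each AutoRender Pro server is single-threaded and pulls jobs from a shared queue, rendering capacity scales roughly linearly with the number of servers. The sketch below estimates the number of servers for a given busy-hour rendering demand; the per-job rendering time is an assumption you would measure for your own document mix, and the estimate is a lower bound with no headroom for peaks.

```python
import math

def autorender_servers_needed(jobs_per_hour, avg_seconds_per_job):
    """Servers required if each single-threaded server renders jobs back to back."""
    jobs_per_server_per_hour = 3600 / avg_seconds_per_job
    return math.ceil(jobs_per_hour / jobs_per_server_per_hour)

# Example: 500 check-ins requiring rendition per busy hour, averaging 45 seconds each
print(autorender_servers_needed(500, 45), "AutoRender Pro server(s)")
```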

Note: It is possible for a single AutoRender Pro server to support multiple Docbases.

System Requirements for Client Products

Refer to the current product release notes or installation guides for information about system requirements and certification information. That documentation is available on the Documentum ftp site, the Documentum Support Web site, and in the Documentation Library in dm_notes.


A   Additional Workloads

This appendix contains descriptions of workloads not included in Chapter 2. It includes the following topics:

■ “The EDMI Workload” on page A-1

■ “The Web Site Workload” on page A-6

■ “The Document Find and View Workload” on page A-9

■ “The Online Customer Care Workload” on page A-9

■ “Comparing and Contrasting the Workloads” on page A-14

■ “Operations Not Included in Workloads” on page A-17

The EDMI Workload

The EDMI workload represents a common software configuration for Web-based Documentum deployments. The workload uses the following products:

■ eContent Server

■ RightSite Server

■ Docbasic

■ SmartSpace Intranet Client

The users in this workload are named users accessing the Docbase through Internet browsers and SmartSpace Intranet Client. The software architecture for the workload is illustrated in Figure A-1.


Figure A-1   EDMI Software Architecture
[Figure: Documentum SmartSpace running in browsers requests dynamic and static Web pages from a Web server (for example, Microsoft or Netscape), which fronts the Documentum RightSite Server; RightSite communicates with the Documentum DocPage Server, which uses the RDBMS and the OS file system.]

This architecture is capable of supporting parallel multi-tier servers (multiple RightSite and eContent Servers). The architecture allows maximum scalability and adaptability by ensuring that storage, network, and server capacity can be added to increase performance and throughput as more clients are added to the network. A company can easily scale any tier to accommodate growth and change. The architecture also provides the flexibility for optimizing server and client hardware.

Workload Scenario

In this workload, a large number of named users work with documents representing standard operating procedures, work instructions, human resource notes, corporate Web pages, or other information that must be read prior to performing some task.

There are three different groups of users: contributors, coordinators, and consumers. Contributors access and modify documents and submit modified documents for review using a workflow. Coordinators are a type of contributor with coordinating workflow tasks. Consumers only read the documents. Consumers sometimes access documents by logging onto the system explicitly and at other times only access public Web pages.



Typically, in the type of Web-based deployment modeled by this workload, the largest number of users are consumers. Consequently, in the workload, 80 percent of the user population are consumers and 20 percent are contributors.

Workload Operations

The operations in the workload include:

■ Locating a document through a folder search and viewing it

■ Checking out and checking in documents

■ Workflow processing (Inbox, routing, and so forth)

■ Virtual document processing (publishing)

■ Accessing static Web pages stored in the Docbase

■ Accessing dynamic Web pages that query the Docbase

Table A-1 lists the operations in the workload.

Table A-1   Operations in the EDMI Workload

Operation | Description
CONN | Starts a Docbase session through SmartSpace Intranet.
FOLDER_SRCH | Searches folder by folder, eventually displaying a selected document.
STATIC_HTML | Accesses a sequence of Web pages. 20 percent of the pages are dynamic. 80 percent of the pages are static (stored in the Docbase).
VDM_PUBLISH | Constructs a virtual document from its components and then displays the document to the user. Each document has 10 components of 2K each.
VIEW_INBOX | Displays the user's Inbox.
CHECKOUT_DOC | Checks out a previously selected and viewed document.
CHECKIN_DOC | Checks in an edited document.
SUBMIT_REVIEW | Submits a modified document to a review workflow, for a technical and compliance review. This is a contributor's task.
ASSIGN_REVIEW | Assigns a document to be reviewed for technical or regulatory compliance. This is a coordinator's task.
FORWARD_REVIEW | Forwards a document to the next activity in the review workflow.


Workload Response Time Requirements

When a benchmark test is run, the primary metric obtained is the number of users who can be supported with acceptable response times.

In the EDMI workload, each user type (contributor, coordinator, consumer) performs a specific number of random tasks (operations) at random times during the hour, and the response times for these tasks are measured. Each task typically consists of dynamically generating several HTML screens from RightSite.

The interval between a user's requests affects performance and response time because Documentum frees a user connection that does not have any activity after some amount of time (typically two to five minutes, settable by the administrator). Re-establishing the session (which happens transparently when work is initiated on an idle session) consumes more CPU resources. Simulating this behavior in the test more accurately models the real world.

The acceptable response time is, generally, no more than one to two seconds per screen. Table A-2 lists the response time requirements for the EDMI workload. Table A-3 shows a sample set of results.



Table A-2   Response Time Requirements for the EDMI Workload

Operation | Number of Screens | Acceptable Response Time per Screen (in seconds) | Total Acceptable Average Response Time (in seconds)
CONN | 3 | 2 | 6
FOLDER_SRCH | 5 | 2 | 10
STATIC_HTML | 5 | 1 | 5
VDM_PUBLISH | 1 | 6 | 6
VIEW_INBOX | 1 | 4 | 4
CHECKOUT_DOC | 2 | 2 | 4
CHECKIN_DOC | 3 | 2 | 6
SUBMIT_REVIEW | 3 | 3 | 9
ASSIGN_REVIEW | 3 | 3.3 | 10
FORWARD_REVIEW | 2 | 2.5 | 5

Table A-3   Example Response Times for the EDMI Workload

Operation | Average Response Time (in seconds) | Total Acceptable Average Response Time (in seconds) | Total Operations in One Hour
CONN | 2.83 | 6 | 2268
FOLDER_SRCH | 7.26 | 10 | 3860
STATIC_HTML | 1.87 | 5 | 2981
VDM_PUBLISH | 2.3 | 6 | 3025
VIEW_INBOX | 0.65 | 4 | 1742
CHECKOUT_DOC | 1.63 | 4 | 1397
CHECKIN_DOC | 5.7 | 6 | 805
SUBMIT_REVIEW | 7.17 | 9 | 428
ASSIGN_REVIEW | 9.17 | 10 | 155
FORWARD_REVIEW | 2.64 | 5 | 241
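One way to read Tables A-2 and A-3 together is to compare each measured average against its acceptable total; the small sketch below does exactly that, with the data copied from the two tables.

```python
# Acceptable totals (Table A-2) vs. measured averages (Table A-3), in seconds
acceptable = {"CONN": 6, "FOLDER_SRCH": 10, "STATIC_HTML": 5, "VDM_PUBLISH": 6,
              "VIEW_INBOX": 4, "CHECKOUT_DOC": 4, "CHECKIN_DOC": 6,
              "SUBMIT_REVIEW": 9, "ASSIGN_REVIEW": 10, "FORWARD_REVIEW": 5}
measured = {"CONN": 2.83, "FOLDER_SRCH": 7.26, "STATIC_HTML": 1.87, "VDM_PUBLISH": 2.3,
            "VIEW_INBOX": 0.65, "CHECKOUT_DOC": 1.63, "CHECKIN_DOC": 5.7,
            "SUBMIT_REVIEW": 7.17, "ASSIGN_REVIEW": 9.17, "FORWARD_REVIEW": 2.64}

for op, limit in acceptable.items():
    status = "OK" if measured[op] <= limit else "exceeds limit"
    print(f"{op:15s} {measured[op]:5.2f}s vs {limit:4.1f}s acceptable -> {status}")
```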


Workload Scaling

The Docbase size is increased as the number of users in the workload increases. To support larger numbers of users per hour, many more documents are preloaded. There are at least 650 documents in the Docbase for each supported user. However, in most benchmarks with this workload, there are well over 2,000 documents per user. Consequently, as more users are added to the test, the queries become more expensive to perform. The documents range in size from 2 Kbytes to 1 Mbyte.

The Web Site Workload

The Web Site workload simulates a common deployment scenario for some of our Web-based implementations. It uses eContent Server and RightSite Server and incorporates RightSite’s ability to retrieve HTML content from the Docbase transparently for the user. Companies often use RightSite in this manner to obtain better security and version control for their Web content.

The user population is entirely anonymous. Anonymous users don’t need to provide any security credentials in order to access a page. (Tighter security can be provided by RightSite named users. The EDMI workload uses named users.)



User access starts at a dynamically generated home page, EDMI_home.htm. This page includes an HREF reference to all the root-level static Web pages. (Static Web pages are pages that are stored in the Docbase.) The home page dynamically constructs the next level of references and serves them as a page to the user.

The static Web pages are all stored in the cabinet /Website in a folder structure that is several layers deep. The pages are linked to each other through <HREF> tags in a hierarchical structure. The structure models the real-world way of storing Web pages in separate directories (grouped by applications, for example). The HREF references are five levels deep, with four references per level. Consequently, in a large Docbase, each root page ultimately points, directly and indirectly, to thousands of pages.

Each static web page consists of 3000 bytes of random text, followed by up to four references to other web pages. The pages at the bottom level don’t reference any pages. (Figure A-2, in the next section, illustrates the Web page structure.)

Workload Operations

There is only one operation in this workload: STATIC_HTML. The STATIC_HTML operation consists of a sequence of Web page accesses. In 20 percent of the accesses, the page is dynamically generated. In the remaining 80 percent of the accesses, the page is fetched from the Docbase.

The sequence of access operations starts from the home page. A reference is picked at random from each page, and the operation moves to the next page until the bottom is reached. Figure A-2 illustrates the access strategy. The boxes in bold represent the total number of Web pages accessed by one cycle (six Web page references).


Figure A-2   Static Web Page Example
[Figure: a tree of static Web pages rooted at the EDMI home page, with root-level group pages and 2nd- through 5th-level group pages below them; the highlighted path from the home page down to a 5th-level page represents one access cycle of six Web page references.]

Workload Response Times

When a benchmark test is run, the primary metric obtained is the number of users who can be supported with acceptable response times.

Each user performs five tasks. Each task consists of an HTML screen that is dynamically generated from RightSite plus the additional static pages. The tasks are performed at random times and the response time is measured.

Acceptable response time is, in general, considered to be no more than one or two seconds per screen. Table A-4 shows the response time requirements for the workload.

The Documentum client session time-out has little effect on this workload because there is a steady stream of activity to only a few anonymous servers in the RightSite pool. However, their random activities do affect the CPU


Table A-4   Response Time Requirements for the Web Site Workload

Operation | Number of Screens | Acceptable Response Time per Screen (in seconds) | Total Acceptable Average Response Time (in seconds)
STATIC_HTML | 5 | 1 | 5


resources used, because RightSite spawns more servers to handle the excess load when all anonymous servers are busy. Random user activity helps ensure that more servers are spawned during a run, which ensures that the test models real world activity peaks more accurately.

Workload Scaling

The Docbase size is increased as the number of users in the workload increases. There are at least 40 Web pages in the Docbase for each anonymous user supported. However, this workload is usually part of a configuration used in the EDMI benchmark and, consequently, the static Web pages are only 20 percent of the total number of documents stored in the Docbase. The Web pages are all 3K bytes.

The Document Find and View Workload

The Document Find and View workload is essentially a small subset of the EDMI workload, exercised in a client-server environment using WorkSpace rather than in a Web environment using SmartSpace Intranet. Users locate documents using attribute searches or by navigating through folders. After a document is found, it is displayed to the user. Each document is 400K bytes.

The Online Customer Care Workload

The online customer care workload demonstrates Documentum interactive performance for a large number of users on a very large Docbase (100,000,000 content objects). This workload uses out-of-the-box SmartSpace with a few customizations. The customizations are centered on querying and setting custom attributes.

The workload includes the following Documentum products commonly used in Web-based deployments:

■ eContent Server

■ RightSite

■ Docbasic


■ SmartSpace Intranet Client

Workload Operations

In this workload, a large number of users work with documents representing customer and supplier correspondence or standard operating procedures that must be read prior to performing some task. The users access the Docbase through Internet browsers.

The operations mimic the content management needs of businesses such as insurance or financial services for online customer care. These needs are defined by large volumes of images (customer correspondence or policy agreements) and large groups of users who read the documents or modify their attributes. The operations include:

■ Locating a document (through a folder search)

■ Inserting documents and notes into the Docbase and viewing them

■ Workflow processing (Inbox, routing, and so forth)

■ Accessing dynamic Web pages that query the Docbase

Table A-5 lists the basic operations in the workload.

Table A-5   Online Customer Care Workload Operations

Operation | Description
CONN | Starts a Docbase session using SmartSpace Intranet.
CREATE_DOCUMENT | Creates a new document and opens the document's editor for the user.
CREATE_DOCUMENT_F | Checks in the new document.
FOLDER_SRCH | Searches through a folder hierarchy for a document. This is performed by office users.
FORWARD_REVIEW | Forwards a document for review in a workflow. This operation occurs when a workflow or data entry user completes a task.
QUERY_SRCH | Represents a Docbase search based on a customer ID.
SET_PROPERTIES | Sets the classification attributes of a TIFF document. Data entry users perform this operation.
VIEW_DOCUMENT | Displays a TIFF, Word, text, or Excel document for a user to view.
VIEW_INBOX | Selects a user's Inbox icon and displays the next set of tasks in the Inbox.


In the workload, the activities performed by a user are made up of the actions described in Table A-5 grouped in a way that is meaningful for the user’s function. Table A-6 describes the user functions and their associated activities.


Table A-6   User Activities in the Online Customer Care Workload

Data entry user: Data entry users examine correspondence that was scanned into the Docbase to ensure that attributes are correctly set. They work on documents that have been routed to them through their Inboxes. A data entry user's activity includes the following operations:
■ Check the Inbox for items to re-attribute (VIEW_INBOX)
■ View the TIFF image noted by the next Inbox item (VIEW_DOCUMENT)
■ Change the attributes to fit the desired descriptions and categories (SET_PROPERTIES)
■ Complete the workflow task (FORWARD_REVIEW)

Workflow user: Workflow users review and sometimes annotate documents routed to them. (The documents are routed prior to the busy hour.) A workflow user's activity includes the following operations:
■ Check the Inbox for work (VIEW_INBOX)
■ View the document designated by the next item in the Inbox (VIEW_DOCUMENT)
■ Complete the task (FORWARD_REVIEW)
■ For 30 percent of the users, create a text note associated with the document (CREATE_DOCUMENT & CREATE_DOCUMENT_F)
Workflow users continue these activities until all items in their Inboxes are processed.

Office user: Office users create MS Office Word and Excel documents. An office user's activity includes the following operations:
■ Create a Word document (CREATE_DOCUMENT & CREATE_DOCUMENT_F)
■ Create an Excel document (CREATE_DOCUMENT & CREATE_DOCUMENT_F)
■ Navigate some folders (FOLDER_SRCH)
■ View a text document (VIEW_DOCUMENT)

Branch user: Branch users search for documents using a policy number or customer ID and then view the documents. A branch user's activity includes the following operations:
■ Query for a document based on customer ID (QUERY_SRCH)
■ View the TIFF image (VIEW_DOCUMENT)

Call center user: Call center users staff phone banks, taking customer calls. The user receives a call and queries the customer's records based on the customer's ID. Sometimes, the call center user must put some additional information in the Docbase in response to the call. A call center user's activity includes the following operations:
■ Query for a document based on customer ID (QUERY_SRCH)
■ View the TIFF image (VIEW_DOCUMENT)
■ For 25 percent of the activities, create a text document associated with this image (CREATE_DOCUMENT & CREATE_DOCUMENT_F)
■ Otherwise, query again once complete




Workload Response Time Requirements

When a benchmark test is run, the primary metric obtained is the number of users who can be supported with acceptable response times.

Each user type performs a specific number of random tasks at random times during the hour, and the response time for these tasks are measured. Each task typically consists of several HTML screens that are dynamically generated from RightSite. Acceptable response time is, in general, considered to be no more than one to two seconds per screen.

The interval between a user's requests affects performance and response time because Documentum frees a user connection that does not have any activity after some amount of time (typically two to five minutes, settable by the administrator). Re-establishing the session (which happens transparently when work is initiated on an idle session) consumes more CPU resources. Simulating this behavior in the test more accurately models the real world.

Another important metric is obtained when response times are measured for operations on the documents in the Docbase. For tests using this workload, the Docbase is loaded with 100,000,000 TIFF images, to model a multi-user environment after years of use. To preserve space but still have a sufficient number of database rows, 90 percent of the content objects have zero length.



When the benchmark test starts, an array of policy numbers corresponding to documents with non-zero length content is loaded. All queries for documents choose from this array. In this way, only documents with content are queried. Content sizes range from 50K to 500K bytes.
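For a sense of the storage behind this test Docbase, the content volume can be approximated from the figures above. The 275 Kbyte average is simply the midpoint of the stated 50K-to-500K range, and a single version per object is assumed; both are assumptions, not measured values.

```python
total_objects = 100_000_000
with_content = int(total_objects * 0.10)        # 90% of objects have zero-length content
avg_content_bytes = 275 * 1024                  # assumed midpoint of the 50K-500K range

content_tb = with_content * avg_content_bytes / 2**40
metadata_gb = total_objects * 5 * 1024 / 2**30  # ~5K of object metadata per version
                                                # (disk space formula earlier in this guide)

print(f"Approximate content storage:  {content_tb:.1f} TB")
print(f"Approximate object metadata:  {metadata_gb:.0f} GB")
```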

Table A-7 lists the response time requirements for the workload.

Table A-7   Response Time Requirements for the Online Customer Care Workload

Operation | Number of Screens | Acceptable Response Time per Screen (in seconds) | Total Acceptable Average Response Time (in seconds)
CONN | 3 | 2 | 6
CREATE_DOCUMENT | 6 | 2 | 12
CREATE_DOCUMENT_F | 5 | 2 | 10
FOLDER_SRCH | 4 | 2 | 8
FORWARD_REVIEW | 2 | 2 | 4
QUERY_SRCH | 3 | 2 | 6
SET_PROPERTIES | 2 | 2 | 4
VIEW_DOCUMENT | 1 | 2 | 2
VIEW_INBOX | 2 | 2 | 4

Comparing and Contrasting the Workloads

This section compares and contrasts the workloads in terms of the software architecture, the usage patterns modeled in each, and the resulting resource consumption.

Software Architecture

The EDMI and Web site workloads use an HTTP thin-client or 4-tier architecture. In an HTTP thin-client architecture, Documentum DMCL (client library) processing occurs on the machine that hosts RightSite and the Internet Server. This is in contrast to the 3-tier architecture, in which client library



processing occurs on the user’s PC. With HTTP thin-client architecture, very little work actually happens on the client machine (all users are assumed to be using browsers sending HTTP). That is, RightSite performs those operations that, in a 3-tier architecture, are performed on the hundreds (or even thousands) of client PCs. Figure A-3 illustrates the difference between 3-tier architecture and HTTP thin-client architecture.

Figure A-3   Client-Library 3-Tier Architecture vs. 4-Tier (HTTP Thin-Client) Architecture
[Figure: in the 3-tier mode, hundreds to thousands of individual user PCs issue Documentum DMCL operations directly to the Documentum DocPage Server; in the 4-tier (HTTP thin-client) mode, those PCs send HTTP to the Internet Server plus Documentum RightSite on a centralized middle-tier server, which performs the DMCL operations against the DocPage Server.]

Usage Models and Resource Consumption

The EDMI and Anonymous RightSite Web Site workload usage models are opposites.

In the EDMI workload, all the users are named users. Named users provide a user name and password and then are authenticated and provided with some exclusive resources. In particular, a separate RightSite process is created for each named user.

The tasks in the EDMI workload operate primarily on dynamic Web pages. The named users use RightSite with SmartSpace to generate dynamic pages in 80 percent of the accesses and fetch static pages from the Docbase in 20 percent of the accesses.



In the Web site workload, all users are anonymous users. Anonymous users don’t provide a name or password; they share the anonymous login configured with RightSite. On the resource side, anonymous users share a pool of RightSite processes, rather than having their own resources.

The tasks in the Website workload operate primarily on static Web pages. The anonymous users use RightSite only (SmartSpace is not used) to generate dynamic pages in 20 percent of the accesses and to fetch static pages in 80 percent of the accesses. This dynamic-to-static Web page profile is opposite the one used in the EDMI workload.

The difference in the dynamic-to-static Web page profiles means that the EDMI workload makes much heavier demands on the RightSite Server than the Website workload does. In benchmarks using the EDMI workload, the RightSite Server will consume more CPU than any other piece of server software included in the benchmarks because the dynamic-to-static page ratio makes heavy demands on the RightSite Server. Figure A-4 illustrates the relationship between CPU consumption and dynamic-to-static ratio.

Figure A-4 Relationship Between Workloads and RightSite CPU Consumption

The difference in the user profiles means that the EDMI workload will consume more CPU and memory resources per user than the Web site workload, because named users consume more resources than anonymous users.


Document Find and View Workload

The Document Find and View workload is client-server and named. RightSite is not part of that workload. In addition, although the EDMI workload is Web-based, in most cases the RightSite portion can be factored out of the data to allow you to size this workload based on a client/server model.

Operations Not Included in Workloads

The following operations are not included in the workloads described in this appendix:

■ Creating PDF renditions

■ Creating or searching full-text indexes

■ Default SSI attribute searching (case insensitive attribute searches)

■ Deleting documents from the Docbase

■ Dumping and loading a Docbase

■ Distributed content operations and object replication

■ Operations on objects in turbo storage

If these are part of your workload, you may want to increase the expected resource consumption for your workload.


INDEX

A
active user 1-4
active user - in transaction 1-4, 2-3
activity timeout 1-5
AIX F50 (IBM server) 4-17
AIX S7A (IBM server) 4-16
Anonymous RightSite Web Site workload. See Web Site workload
AutoRender Pro
  sizing 6-5
  in WCM Edition 3-19
availability considerations, for system 3-12

B
backup capacity and sizing 3-12
bandwidth
  defined 1-5, 5-3
  response times, affect on 5-4
  vs localizing 5-6
  vs remote Web servers 5-8
benchmark tests
  Document Find and View workload on HP K580 machines 4-21
  EDMI workload
    on AIX servers 4-17
    on IBM Netfinity 7000 M10 4-15
    on LXR8000 & LH4 4-20
    on Sun Enterprise 450 4-11
    on Sun Enterprise 6500/4500 4-13
  focus for N-tier tests 4-5
  hardware configurations 4-2
  iTeam workload
    on Compaq servers 4-10
    on HP LXR8000 & LH4 4-19
    on Lpr/LH4 4-19
  Online Customer Care workload on V2600 4-21
  result tables, interpreting 4-6
  Web site workload on Sun Enterprise 6500/4500 4-14
  Web Site workload on Sun Enterprise 450 4-11
bottleneck, defined 1-5
busy hour
  active sessions, estimating 2-8
  defined 2-6

C
caches
  affect on performance 4-26
  database 4-27
  eContent Server 4-27
  memory use 4-26
  RightSite 4-27
capacity planning. See system sizing
cluster, eContent Server 3-8
Compaq servers
  described 4-9
  Web site for 4-22
configurations
  host-based vs multi-tiered 3-11
connected user 1-5
connecting user 1-5, 2-3
connection states. See user connection states
content
  replication, described 3-15
  servers 3-14
  transfers, response times 5-9
ContentCaster 3-20
cost-based optimizers 4-38
CPU usage
  Documentum server 2-4
  Documentum workloads 2-24
  RightSite Server 2-5

D
data caches, database 4-27
database
  caches
    described 4-27
    disk I/O and 4-39
  license sizing 4-46
  memory requirements 4-29
  scaling 3-10
  server 1-5
DBMS. See databases
Desktop Client sizing
  CPU speed 6-1
  dmcl operations 6-3
  memory 6-4
disk access capacity 4-37
disk capacity sizing
  overview 4-35
  process inputs 4-36
  query optimizers and 4-38
  space vs access capacity 4-37
  table scans, affect on 4-38
disk I/O
  database cache and 4-39
  of disk storage areas 4-42
disk space sizing
  formula for 4-45
  general notes 4-46
  server software requirements 4-44
  space vs access speed 4-37
disk storage areas
  sizing 4-42
disk striping
  defined 4-40
  with parity 4-42
  without parity 4-41
disk throughput 1-6
distributed storage areas 3-14
DL360 (Compaq server) 4-9
DMCL
  object cache 4-28
  response times 5-11
Docbase size and memory requirements 4-29
Docbasic compiled-code memory area 4-28
DocBrokers
  load balancing 3-10
  scaling 3-10
DocPage Server 1-5
Document Find and View workload
  described A-9
  on HP K580 machines 4-21
Documentum Server
  defined 1-5
  transformation engine 1-7
Documentum Sizing Spreadsheet 1-3, 2-2
Documentum workloads
  Docbase usage patterns 2-25
  Document Find and View A-9
  EDMI A-1
  iTeam 2-10
  Load and Delete workload 2-22
  Online Customer Care A-9
  operations not included 2-26, A-17
  resource consumption 2-24, A-15
  software architecture 2-23, A-14
  Web Site A-6
  WebPublisher 2-16
dynamic HTML, memory use 4-28
dynamic Web pages, affects on scaling 3-20

E
e-Content Server 1-6
eContent Server
  caches 4-27
  cluster 3-8
  defined 1-6
  load balancing 3-8
  scaling 3-7
  server set 3-8
editions
  Portal 3-21
  Web Content Management 3-17
EDM Server 1-6
EDMI workload
  on AIX servers 4-17
  described A-1
  execution scenario A-2
  on IBM Netfinity 7000 M10 4-15
  on LXR8000 & LH4 4-20
  operations A-3
  response time requirements A-4
  scaling A-6
  on Sun Enterprise 450 4-11
  on Sun Enterprise 6500/4500 4-13
Enterprise 450 (Sun server) 4-11
Enterprise 4500 (Sun server) 4-12
Enterprise 6500 (Sun server) 4-12
Enterprise Resource Planning system, Documentum and 4-22
errors in system sizing 1-3

F
F50 (AIX server) 4-17
failover
  operating system solutions 3-13
  partitioning and 3-5
federations 3-15

G
global type cache 4-27

H
hardware configurations
  in benchmarks 4-2
  choosing configuration 3-11
host-based configurations 3-11
HP servers
  K580 4-20
  LH4 4-19
  Lpr 4-19
  NETSERVER LXR 8000 4-18
  V2600 4-20
  Web sites for 4-22
HTTP Server 1-6

I
IBM servers
  AIX F50 4-17
  AIX S7A 4-16
  Netfinity 7000 M10 4-15
  Web site for 4-22
inactive users
  resource consumption 2-5
inactive users, defined 1-6, 2-3
indexes and disk capacity 4-38
iTeam in Portal edition 3-21
iTeam workload
  on Compaq servers 4-10
  execution scenario 2-11
  on HP LXR8000 & LH4 4-19
  on Lpr/LH4 4-19
  operations 2-12
  purpose 2-10
  resource consumption 2-24
  response times
    examples 2-16
    requirements 2-15
    task performance and 2-14
  scaling 2-13

K
K580 (HP-UX server) 4-20

L
latency
  defined 5-3
  network 1-6
LH4 (HP server) 4-19
Load and Delete 2-22
load balancing
  DocBrokers 3-10
  eContent Server 3-8
  network 3-6
  transparent 3-5
Lpr (HP server) 4-19

M
memory
  physical 1-6
  virtual 1-7
memory sizing
  cache usage 4-26
  database requirements 4-29
  for Desktop Client 6-4
  Docbase size, affect on 4-29
  dynamic HTML 4-28
  equation for 4-32
  examples 4-32 to 4-35
  guidelines, general 4-31
  for MS Office integrations 6-4
  operating system 4-30
  operating system requirements 4-29
  oversizing 4-24
  overview 4-23
  paging file 4-30
  RightSite 4-27
  user connection requirements 4-28
  virtual memory 4-25

metadata retrieval, affects on scaling 3-20Microsoft Windows NT

Web site 4-22mirroring, in disk striping 4-42MS Office integrations memory use 6-4multi-site deployments

administrative overhead 3-16deployment options, list of 3-14network bandwidth 3-16

multi-tiered configurationsadvantages 3-11availability and 3-12

Nname servers. See DocBrokersnamed user 1-6Netfinity 7000 M10 (IBM Windows NT

machine) 4-15NETSERVER LXR 8000 (HP server) 4-18network

latency 1-6load balancer 3-6throughput 1-6

network sizingbandwidth

cost of 3-16vs localizing 5-6vs response time 5-4

for content transfer speed 5-9for DMCL operations 5-11guidelines, general 5-12overview 5-1

N-tier configurationsbenchmark test focus 4-5described 4-3

Oobject replication, described 3-15Online Customer Care workload

described A-9on HP V2600 machines 4-21operations A-10response time requirements A-13user activities A-11

operating systemsmemory requirements 4-30physical memory, tuning 4-29

Page 155: Document Um System Sizing Guide

Documentum System Sizing Guide Index-5

Ppaging file

memory use, estimating 4-30out of memory detection 4-31sizing 4-23usage 4-25on Windows NT 4-31

parity, in disk striping 4-41partitioning

and failover 3-5RDBMS 3-10scaling out 3-3scaling up 3-2Web tier software 3-6

performance and cache use 4-26physical memory

cache usage 4-26calculation examples 4-32 to 4-35defined 1-6global type cache 4-27large memory support 4-29operating system, configuring 4-30process working set 4-25sizing 4-23user connection requirements 4-28

Portal Edition, scaling 3-21primary domain controller, Documentum

and 4-22process working sets

defined 4-25on Windows NT 4-30

Proliant servers (Compaq) 4-9

Qquery optimizers, affects on table scans

4-38

RRAID configurations 4-40reference links, described 3-15

relational databases. See databasesremote Web servers

defined 3-14vs bandwidth 5-8

resource consumptionbusy hour 2-6Documentum workloads 2-24, A-15dynamic and static web pages A-16inactive users 2-5process working set 4-25RightSite 2-5user connection states and 2-2

response timesDesktop client 6-3examples

iTeam workload 2-16WebPublisher workload 2-21

requirementsEDMI workload A-4iTeam workload 2-15Online Customer Care workload

A-13Web Site workload A-8WebPublisher workload 2-20

user expectations 2-9vs bandwidth needs 5-4

RightSiteactive sessions, estimating 2-8described 1-6DMCL object cache 4-28user connection states 2-5

rule-based optimizers 4-38

SS7A (AIX machine) 4-16scaling

across sites 3-14databases 3-10DocBrokers 3-10dynamic Web access and 3-20

Page 156: Document Um System Sizing Guide

Index-6 Documentum System Sizing Guide

eContent Server 3-7iTeam workload 2-13out 3-3, 3-4, 3-5Portal Edition 3-21trends affecting 3-1 to 3-5up 3-2, 3-5Web Content Management Edition

3-17, 3-19Web tier software 3-5

server machinesCompaq 4-9Sun Solaris 4-10

server sizingCPU use

RightSite and WDK/App server4-5

guidelines, general 4-22overview 4-1server CPU use 2-4

server softwareconfigurations, compared A-14host-based configurations 4-3N-tier configurations 4-3partitioning

across multiple hosts 3-3, 3-4on one host 3-2

reuse 3-2Site Delivery Services 3-20Solaris memory sizing 4-31spreadsheet, sizing 1-3, 2-2striping. See disk stripingSun servers

Enterprise 450 4-11Enterprise 6500 & 4500 4-12Web site for 4-22

swap file. See paging filesystem sizing

AutoRender Pro 6-5benchmark result tables, interpreting

4-6benchmark test results 4-4 to 4-21

configuration constraints 4-22database license sizing 4-46Desktop Client 6-1disk capacity 4-35disk space sizing formula 4-45disk space vs access capacity 4-37disk storage areas 4-42disk striping 4-40Documentum Spreadsheet 1-3, 2-2glossary of terms 1-4hardware configurations 3-11for high availability 3-12memory calculation examples 4-32 to

4-35mistakes, common 1-3network

bandwidth needs 5-4overview 5-1

overview 1-1paging file 4-30requirements of 1-2scaling

across sites 3-14databases 3-10DocBrokers 3-10Web tier software 3-5

serverguidelines 4-22memory 4-23process overview 4-1software requirements 4-44software reuse and 3-2

table scans, affect on 4-29user deployment considerations 3-4virtual memory 4-25workloads

defined 2-1estimating 2-2

Page 157: Document Um System Sizing Guide

Documentum System Sizing Guide Index-7

Ttable scans

disk capacity, affect on 4-38query optimizers and 4-38

throughput, defined 1-6transactions 1-7transformation engine 1-7

Uuser connection states

active user 1-4active user - in transaction 1-4, 2-3active user - out of transaction 1-4, 2-3connected user 1-5connecting user 1-5, 2-3described 1-7inactive user 1-6inactive users

defined 2-3memory requirements 4-28paging file use 4-30resource consumption 2-2RightSite Server 2-5

VV2600 (HP-UX server) 4-20virtual memory 1-7, 4-25vmstat UNIX utility 4-31

WWCM Edition. See Web Content

Management EditionWeb Content Management Edition

AutoRender Pro and 3-19described 3-17scaling requirements 3-19Site Delivery Services and 3-20WebPublisher and 3-19

Web Site workloaddescribed A-6operations A-7response time requirements A-8scaling A-9on Sun Enterprise 450 4-11on Sun Enterprise 6500/4500 4-14

Web tier software, scaling 3-5WebCache 3-20WebPublisher 3-19WebPublisher workload

operationslist of 2-18overview 2-17

purpose 2-16resource consumption 2-24response times

examples 2-21requirements 2-20task performance and 2-20

scaling 2-19Windows NT paging file 4-31workloads

See also Documentum workloadsactive sessions, estimating 2-8busy hour 2-6defined 2-1estimating 2-2response time expectations 2-9use in sizing 2-9

Page 158: Document Um System Sizing Guide

Index-8 Documentum System Sizing Guide