Performance Study on SharePoint Workloads in a SQL Server Environment
A Dell Technical White Paper
Dell │ SharePoint Solutions Engineering
Ravikanth Chaganti and Jisha J
August 2010
Executive Summary
A Microsoft® SharePoint® Server 2010 farm hosts the core platform services and applications that provide many different functions for its users, including document management, version control, ease of access, and intuitive administration. Fundamental to the architecture of a SharePoint environment is the back-end database. Microsoft SQL Server provides this database tier and needs to be configured correctly in order to save time, effort, and money.
This white paper focuses on the SQL Server I/O subsystem and the role it plays in a SharePoint environment. The goal of this research is to provide guidance and insight into optimizing the database host and the related benefits to the overall scalability and performance of a SharePoint farm. This paper describes the factors to be considered when designing a farm and how best to configure them. Finally, this paper covers several performance metrics for various farm components and shows how the recommended farm architecture can achieve sub-one-second response times.
Dell is able to provide this data because of an internally developed load generation tool, created specifically for SharePoint, that was used to conduct several experiments intended to stress the SQL Server I/O subsystem. The lessons learned in this paper will help IT managers build more efficient and effective SharePoint environments on a SQL Server back end.
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.
© 2010 Dell Inc. All rights reserved. Reproduction in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.
Dell, the DELL logo, and the DELL badge, and PowerEdge are trademarks of Dell Inc. Microsoft, Windows Server, SharePoint, and SQL Server are registered trademarks of Microsoft Corporation in the United States and/or other countries. Intel and Xeon are registered trademarks of Intel Corporation in the U.S. and/or other countries. EMC is a registered trademark of EMC Corporation. Adobe is a registered trademark of Adobe Systems Incorporated in the United States and/or other countries. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
August 2010
Contents
Executive Summary
Introduction
  SharePoint Farm Topologies
  Microsoft SQL Server and SharePoint 2010
SharePoint Farm Performance Study
  Dell SharePoint Load Generation Framework
    Content Population Tool
    VSTS Load Testing Framework
  Load Testing Workload Test Mix
  Test Methodology
Experimental Design
Test Results and Analysis
Conclusion
References
Introduction
Microsoft SharePoint Server 2010 offers functionality that makes it a good choice for many different
business scenarios. Typically, SharePoint is deployed in a server farm that includes a web presentation
tier, an application tier, and a database tier. More detailed information about designing and building
SharePoint farms for organizations with different scale and performance needs is provided in
a series of white papers available at www.dell.com/sharepoint.
This white paper will examine how representative SharePoint workloads impact a Microsoft SQL Server
hosting the database tier in a SharePoint server farm. Dell has developed a tool that implements
common tasks associated with collaboration and document publishing workloads. This tool has been
used to place loads on a SharePoint farm, allowing a detailed analysis of the impact of different
workload patterns on the database tier. This white paper will focus on the impacts to the SQL Server
I/O subsystem, with a goal of providing insight into when changes to the database host are likely to
benefit the overall scalability or performance of a SharePoint farm.
SharePoint Farm Topologies
The SharePoint server farm offers opportunities to employ a scale-out philosophy; many of the
SharePoint roles can be configured to operate on multiple servers. When used in conjunction with load-
balancing techniques, the capacity, availability, and throughput of the farm can be increased relatively
easily. Similarly, for deployments with modest scale or performance needs, the roles generally
associated with the presentation and application tiers can be consolidated onto fewer servers. In
smaller farms, the database server may be the only component which remains separate.
Single-server SharePoint deployments are possible, but are recommended only for development or
testing environments because such deployments are effectively locked on the single server and unable
to scale. Certain key SharePoint roles cannot be relocated from this topology and deployed onto other
servers. This single-server scalability limitation may be overcome by deploying a single physical server,
but using separate virtual machines to house the database and farm roles. However, the performance
and scalability considerations of a virtualized farm are beyond the scope of this white paper.
Microsoft SQL Server and SharePoint 2010
All of the data in a SharePoint farm is stored in content databases on a Microsoft SQL Server host. The
data includes everything from simple text-based list items to large binary files that are stored in
SharePoint document libraries. When the web front-end servers in a farm process a user request, they
make queries to the database server in order to process the request. Therefore, the performance of
the back-end database can have a significant influence in the perceived speed and quality of the entire
farm. Because of this, it is important to gain a better understanding of the impact of different types of
end-user requests on the SQL Server database in a SharePoint farm. In order to meet this need, the Dell
SharePoint Solutions Engineering team developed a load generation tool for SharePoint and conducted
several different experiments that were intended to stress the SQL Server I/O subsystem.
SharePoint Farm Performance Study
Microsoft SharePoint 2010 is a versatile platform that can be used in a large variety of ways. Some
SharePoint workloads work almost out of the box, others require or allow significant
customization, and still others are the result of completely custom-developed applications. This
flexibility results in a very large number of possible ways of using SharePoint, which makes it difficult to
accurately size servers and storage for a SharePoint farm. In addition, there is not yet a standard benchmark for
sizing SharePoint workloads. It is very important to be able to provide the right guidance to
customers when recommending infrastructure elements of a SharePoint implementation.
This understanding of customer needs led to the development of the Dell SharePoint Load Generation
framework, which is used to perform load testing of a SharePoint farm.
Dell SharePoint Load Generation Framework
An internally developed load generation framework was used to understand the performance
characteristics of the SharePoint farm. This framework includes load testing of SharePoint out of the
box usage profiles, such as collaboration and publishing.
The Dell SharePoint load generation framework has two components: a content population tool and the
Visual Studio Team System (VSTS) web test framework.
Content Population Tool
The content population tool prepares the SharePoint farm for load testing by distributing the
SharePoint content across multiple site collections.
Figure 1. SharePoint Content Population Tool
The content population tool was developed to:
Create SharePoint web applications
Create site collections
Add web parts to home pages
Create document libraries
Create SharePoint list items
Upload documents, images, etc.
This tool is capable of populating hundreds of gigabytes of SharePoint content in a few hours. The size
of the SharePoint content database and other aspects, such as the number of site collections,
vary based on the usage profile selection. A usage profile is a collection of use cases closely mapped to
real world SharePoint usage. To some extent, these usage profiles were mapped to the SharePoint
Capacity Planner¹ and other Microsoft recommendations. Although the SharePoint Capacity Planner was
intended for MOSS 2007, several aspects of these recommendations² still apply to
SharePoint 2010 out of the box workloads. The content generated and uploaded by the content
population tool serves as a baseline for SharePoint 2010 load testing using the VSTS web test
framework.
VSTS Load Testing Framework
Dell’s SharePoint load generation framework uses VSTS 2008 to perform load testing. Within VSTS, each
load test directly maps to a SharePoint usage profile, and each usage profile defines a list of use cases
and how many use cases are run per hour per connected user. Using VSTS 2008 helps in the rapid
creation of use cases and the parameterization of those use cases. SharePoint load testing is performed
using a VSTS test rig of several physical test agents (shown in Figure 2), and the results are captured in
a SQL database on the test controller.
¹ SharePoint Capacity Planner: http://www.microsoft.com/downloads/details.aspx?FamilyID=dbee0227-d4f7-48f8-85f0-e71493b2fd87&displaylang=en
² Microsoft SharePoint 2010 Performance and Capacity Management: http://technet.microsoft.com/en-us/library/cc262971.aspx
Figure 2. VSTS Test Rig
[Figure 2 diagram: the test controller (Start Test) drives Agent 1 through Agent 10, which run tests (Run Test) against the SharePoint farm: three web servers in an NLB cluster, an application server, and a database server.]
Load Testing Workload Test Mix
As mentioned earlier, the load test usage profiles were based on the SharePoint Capacity Planner and
other Microsoft recommendations for SharePoint 2010. The System Center SharePoint Capacity Planner
defines several usage profiles for both collaboration and document publishing workloads. These usage
profiles are categorized into low, medium, and heavy usage profiles. These categories define several
aspects of a usage profile, such as how many requests are sent per hour per connected user, what use
cases constitute a load test, and what percentage (test mix) of each use case is used within each load
test.
Within the scope of this performance study white paper, the heavy collaboration usage profile was
used. Table 1 shows the heavy collaboration test mix as suggested by the SharePoint Capacity Planner
(SCP).
Table 1. SCP Usage Profile Definition
SCP Usage Profiles                    Heavy Collaboration
Home Page Access (%)                  30
List Page Access (%)                  20
Document/Picture Download (%)         15
Document/Picture Upload (%)           8
Search (%)                            15
Total requests/hour/connected user    60
As shown in Table 1, SCP defines only a high level test mix for each usage profile. Table 2 shows a more
granular translation of this SCP heavy collaboration usage profile. Several use cases were mapped to
each of the categories described by SCP, and the number of use cases per hour per connected user has
been assigned.
Table 2. Dell's Test Mix for a Heavy Collaboration Profile
Heavy Collaboration Test Mix          Number of tests/hr/user
Home Page Access
  Read Site Home Page                 18
List Page Access
  Read Survey                         6
  Read Lists                          6
Document/Picture Download
  Read Document Library               2
  Read Home to Document Library       1
  Read Wiki Page                      2
  Read Picture Library                1
  Read Home to Wiki Page              2
  Read Home to Picture Library        1
Document/Picture Upload
  Create Wiki Page                    3
  Upload Document                     2
Search
  Search Site                         10
List Item Insertion/Deletion
  Respond to Survey                   2
  Reply to Discussion Topic           1
  Edit Wiki Page                      2
  Comment Home to Blog Post           1
Total tests/hour/connected user       60
It is important to note that Dell's test mix (shown in Table 2) is not a one-to-one mapping to the
previously described SCP and Microsoft recommendations. For example, SCP defines total requests per
hour per connected user. Dell's test mix for the heavy collaboration profile instead defines 60 tests per hour
per connected user, and a single test can issue more than one request, so the profile
translates to more than 60 requests per hour per connected user. Hence, the results published in
this white paper may or may not map directly to the SharePoint Capacity Planner recommendations,
but they are specific to the workload mix defined in Table 2.
Test Methodology
The intent of the experiments conducted as a part of this performance study was to understand how
disk I/O and memory requirements scale with the user load on the SharePoint farm. Several load test
iterations were conducted with increasing user loads. For example, testing started with an initial load of 500 virtual
users, and the load was then increased in steps of 500 users until the monitored resources (disk
I/O or memory) reached a bottleneck state.
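The stepped ramp described above can be sketched as a simple loop. This is a minimal illustration, not Dell's tooling: `measure` is a hypothetical callback standing in for one load test iteration, and the one-second response time is assumed here as the bottleneck signal, in line with the response time limit used later in this paper.

```python
from typing import Callable, List, Tuple


def ramp_user_load(
    measure: Callable[[int], float],
    start: int = 500,
    step: int = 500,
    threshold_s: float = 1.0,
    max_users: int = 20000,
) -> List[Tuple[int, float]]:
    """Run load iterations until the response time exceeds the threshold.

    `measure` is a hypothetical callback that runs one load test iteration
    at the given user load and returns the average farm response time in
    seconds. Returns every (user_load, response_time) pair that was run,
    including the first iteration that crossed the threshold.
    """
    results = []
    users = start
    while users <= max_users:
        response_time = measure(users)
        results.append((users, response_time))
        if response_time > threshold_s:
            break  # bottleneck reached; stop ramping
        users += step
    return results


if __name__ == "__main__":
    # Stand-in measurement for demonstration only; not real test data.
    fake_measure = lambda users: users / 2000.0
    for users, seconds in ramp_user_load(fake_measure):
        print(f"{users} users -> {seconds:.2f}s")
```

The `max_users` guard is an added safety limit so that the loop terminates even if the threshold is never crossed.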
The data set used to build the content database included several different types of files. These file
types included Microsoft Office documents and Adobe® PDF documents, as well as several image
formats. Table 3 shows a distribution of file content sizes used in this performance study.
Table 3. Data Set
File Size Range     Number of Files
1KB to 500KB        34240
500KB to 1MB        5223
1MB to 10MB         13003
10MB to 70MB        125
The aggregated SharePoint content database size was approximately 53GB. For the duration of the load
tests, this content database grew by almost 20%. This performance study involved load testing of an
out of the box SharePoint deployment using a test mix shown in Table 2. A full content crawl was
performed once at the beginning of the load tests. No subsequent crawls were performed after the
load tests or during the load tests.
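As a rough cross-check of the data set figures, the sketch below totals the file counts from Table 3 and derives two estimates. The per-file footprint is an upper bound that assumes the whole 53GB database is file content, and the growth figure treats "almost 20%" as exactly 20%; both are simplifying assumptions.

```python
# File counts per size range, transcribed from Table 3.
FILE_COUNTS = {
    "1KB to 500KB": 34240,
    "500KB to 1MB": 5223,
    "1MB to 10MB": 13003,
    "10MB to 70MB": 125,
}

INITIAL_DB_SIZE_GB = 53.0   # approximate size reported in the text
GROWTH_FRACTION = 0.20      # "almost 20%" growth, rounded up (assumption)

total_files = sum(FILE_COUNTS.values())
# Upper-bound estimate: treats the whole database as file content.
avg_mb_per_file = INITIAL_DB_SIZE_GB * 1024 / total_files
final_db_size_gb = INITIAL_DB_SIZE_GB * (1 + GROWTH_FRACTION)

print(f"Total files: {total_files}")
print(f"Average footprint per file: {avg_mb_per_file:.2f} MB")
print(f"Estimated size after tests: {final_db_size_gb:.1f} GB")
```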
Two metrics, disk I/O and memory, were used to characterize SQL Server performance in a
SharePoint deployment. For disk I/O characterization, SQL Server memory was restricted to
2GB, and SharePoint load testing was performed with increasing user loads. The farm average response
time was monitored during this process, and the disk backend was upgraded as required.
For memory characterization, the disk backend was upgraded to 14 disks in a RAID 10
configuration, and load testing was performed with increasing user loads. SQL Server memory was
increased in increments of 2GB whenever a bottleneck caused the average farm response time to go
beyond one second.
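This paper does not state how the SQL Server memory limit was applied. One common way is the `max server memory (MB)` server option set through `sp_configure`; the sketch below assumes that approach and only generates the T-SQL text for each memory step, without connecting to a server. The helper name is hypothetical.

```python
def max_server_memory_tsql(memory_gb: int) -> str:
    """Build a T-SQL batch that caps SQL Server memory at `memory_gb` GB.

    Assumes the limit is applied with sp_configure; the paper does not
    say which mechanism was actually used.
    """
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    memory_mb = memory_gb * 1024
    return "\n".join([
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max server memory (MB)', {memory_mb};",
        "RECONFIGURE;",
    ])


if __name__ == "__main__":
    # Memory steps reported for the memory characterization: 2GB, 4GB, 6GB.
    for step_gb in (2, 4, 6):
        print(f"-- {step_gb}GB cap")
        print(max_server_memory_tsql(step_gb))
```

Generating the statements as text keeps the sketch free of any database driver dependency; the output would be run with whatever SQL client is in use.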
Experimental Design
This section describes the experimental setup used to carry out the test strategy discussed
above. This information helps in understanding the load exerted on the SharePoint farm
for a given usage profile and user load, and may be helpful in the preliminary sizing of a
SharePoint farm deployment.
SharePoint workloads were executed on a SharePoint farm using VSTS and recorded
web tests, with varied user loads. The heavy collaboration scenario was analyzed for this purpose.
The detailed configuration of the SharePoint farm is shown in Table 4.
Table 4. SharePoint Farm Configuration
Server                          Machine
SQL Server Database Server      Dell™ PowerEdge™ R710
SharePoint App Server           PowerEdge R610
SharePoint Web Front End 1      PowerEdge R610
SharePoint Web Front End 2      PowerEdge R610
SharePoint Web Front End 3      PowerEdge R610
All three web front-end servers were clustered using the Network Load Balancing (NLB) feature. More
information on NLB may be found here: http://technet.microsoft.com/en-us/library/bb742455.aspx.
The content database and the tempdb database files were placed on separate sets of disks so that
each database component could be analyzed individually.
The detailed configuration of the SQL Server database is provided in Table 5.
Table 5. SQL Server Configuration
Components                    Details
Hardware
  Server                      Model: Dell PowerEdge R710
                              Processor: 2 x quad-core Intel® Xeon® E5530 @ 2.40GHz, 8MB L3
                              Memory: 16GB (8 x 2GB RDIMM, 1067MHz)
                              NOTE: SQL Server memory was restricted to 2GB to redirect most of
                              the database requests to the storage.
  Storage                     Model: Dell EMC® CX4-120
                              Hard drives: 146GB 15k SAS drives
                              Flare version: 3.26.040.5.025
  Network Interface Cards     Broadcom teamed NIC (2 x Broadcom BCM5709C NetXtreme II GigE
                              (NDIS VBD Client))
                              Driver: 5.0.13
Software
  Operating System            Microsoft Windows Server® 2008 R2 Enterprise Edition
  Database                    Microsoft SQL Server 2008 R2 x64
An overall farm response time of approximately one second was used as the performance
limit for determining the maximum database load within acceptable performance
bounds. When the farm reached the one-second response time limit, the content database disks were expanded
to accommodate more load. SQL Server memory was restricted to 2GB to push as many requests as possible to
the disks.
Throughout the test period, the overall utilization of the web front-end servers was monitored and
verified not to exceed 50% of the system capacity.
Test Results and Analysis
As mentioned earlier, to understand the disk I/O characteristics, the database disks were deployed on
RAID 10 volumes consisting of varying numbers of physical disks. For I/O characterization,
an initial two-disk configuration was deployed. Throughout the disk I/O characterization testing, SQL
Server was restricted to use only 2GB of physical memory. This restriction forced most I/O requests to
the disk and, hence, increased the load on the disk I/O subsystem.
Table 6. Disk I/O Performance
Number of Disks    Maximum Concurrent User Load    Average Farm Response Time (s)
2                  1500                            0.96
2                  2000                            1.04
2                  2500                            1.1
14                 3000                            0.58
14                 4000                            2.73
As shown in Table 6, a two-disk configuration could support up to 2500 concurrent users with an
average farm response time of approximately one second. With the goal of keeping the average farm
response time below one second, a 14-disk configuration was tested. Adding more disks to the
database backend supported up to 3000 concurrent users. Pushing the user load beyond
3000 concurrent users resulted in a farm response time higher than one second. This result occurred
because the 2GB of memory allocated to SQL Server started to become a bottleneck.
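The reading of Table 6 can be reproduced with a small sketch that finds, for each disk count, the highest tested user load within a response time limit. The data rows are transcribed from Table 6; the function name and the 1.1-second tolerance used to model "approximately one second" are assumptions for illustration.

```python
# (number_of_disks, concurrent_users, avg_response_time_s) from Table 6.
DISK_IO_RESULTS = [
    (2, 1500, 0.96),
    (2, 2000, 1.04),
    (2, 2500, 1.10),
    (14, 3000, 0.58),
    (14, 4000, 2.73),
]


def max_load_within(results, limit_s):
    """Highest tested user load per disk count with response time <= limit_s."""
    best = {}
    for disks, users, response_s in results:
        if response_s <= limit_s:
            best[disks] = max(best.get(disks, 0), users)
    return best


if __name__ == "__main__":
    print("Strict 1.0s limit:", max_load_within(DISK_IO_RESULTS, 1.0))
    print("Approximate 1.1s limit:", max_load_within(DISK_IO_RESULTS, 1.1))
```

Under a strict one-second limit the two-disk configuration tops out at 1500 tested users; the 2500-user figure quoted in the text holds only when "approximately one second" is read with some tolerance.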
For performing memory characterization, the SQL server backend and the SharePoint content database
were placed on a 14 disk RAID volume. With a constant disk backend, the SQL server memory was
increased in increments of 2GB to find the maximum concurrent user load supported by the SQL
database backend.
Table 7. Memory Performance
Number of Disks    SQL Server Memory    Maximum Concurrent User Load    Average Farm Response Time (s)
14                 2GB                  3000                            0.58
14                 4GB                  4000                            0.35
14                 4GB                  5000                            0.67
14                 6GB                  6000                            0.79
As shown in Table 7, the test runs were started with the SQL Server memory restricted to 2GB. This
configuration, with a 14-disk backend for the content database, could scale up to 3000 concurrent users.
The SQL Server memory was then scaled up to 6GB to support 6000 concurrent users. This behavior
suggests that SQL Server memory alone can be scaled up to support increased user loads. However, this
holds only up to a point, because the web front-end servers will eventually become a bottleneck.
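The relationship between SQL Server memory and supported user load in Table 7 can be summarized with a short sketch. The rows are transcribed from Table 7; the users-per-GB ratio is a descriptive figure added here for illustration, based only on the tested load points, and is not a sizing rule from this study.

```python
# (sql_memory_gb, concurrent_users, avg_response_time_s) from Table 7,
# all measured on the 14-disk back end.
MEMORY_RESULTS = [
    (2, 3000, 0.58),
    (4, 4000, 0.35),
    (4, 5000, 0.67),
    (6, 6000, 0.79),
]


def users_per_gb(results, limit_s=1.0):
    """Highest sub-limit tested load at each memory size, with users per GB."""
    best = {}
    for memory_gb, users, response_s in results:
        if response_s < limit_s:
            best[memory_gb] = max(best.get(memory_gb, 0), users)
    return {gb: (users, users / gb) for gb, users in sorted(best.items())}


if __name__ == "__main__":
    for gb, (users, ratio) in users_per_gb(MEMORY_RESULTS).items():
        print(f"{gb}GB: {users} users ({ratio:.0f} users per GB)")
```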
Taken together, Tables 6 and 7 show a clear pattern: SQL Server memory, rather than the
underlying disk backend used to store SQL content, can become the bottleneck.
SharePoint requests ultimately translate into reads and writes against the database
files. The database serves these requests as fast as its resources allow. Once the number of
application requests (and, in turn, database requests) exceeds the maximum capability of the
database, the application's performance starts to suffer. The key criterion is therefore
to size the SharePoint database to meet the highest expected user load.
Database memory is a key factor when planning adequate capacity to support existing and
future user loads.
Another important point is that the trend and the performance parameters are strongly
influenced by the data set used during the test period. For example, uploading a 5MB document
consumes more resources than uploading a 100KB document. The aggregate amount of data
being operated on at any point in time should be considered when sizing the database resources,
taking into account the maximum user load expected. Suppose a total of 8000 users are active on the
farm during a particular period. If a majority of the users perform heavyweight activities such as
document and image upload or download, farm resource consumption may be high, and the high
consumption may affect the overall farm performance.
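To make the data set effect concrete, the sketch below compares the hourly upload volume for the 8000-user example with 100KB and 5MB documents. It assumes every user uploads at the "Upload Document" rate from Table 2 (2 per hour) and that all files have the same size; both are simplifications for illustration, not measurements from this study.

```python
def hourly_upload_volume_mb(users, uploads_per_user_per_hour, file_size_kb):
    """Total data uploaded per hour, in MB, for a uniform file size."""
    return users * uploads_per_user_per_hour * file_size_kb / 1024.0


USERS = 8000                 # example user count from the text
UPLOADS_PER_USER_HOUR = 2    # "Upload Document" rate from Table 2

small = hourly_upload_volume_mb(USERS, UPLOADS_PER_USER_HOUR, 100)        # 100KB files
large = hourly_upload_volume_mb(USERS, UPLOADS_PER_USER_HOUR, 5 * 1024)   # 5MB files

print(f"100KB documents: {small:,.1f} MB/hour")
print(f"5MB documents:   {large:,.1f} MB/hour")
print(f"Ratio: {large / small:.1f}x")
```

The same user count and test mix therefore produce very different write volumes on the database host depending on document size.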
Note
With unrestricted memory and enough physical disks for the database, backed by adequate
processor and network capability, the database response time may be expected to remain
consistent even as the user load increases substantially. To verify this expectation, a number of
collaboration test iterations were run with the following configuration.
SQL Server Memory: Unrestricted (16GB of server RAM)
Number of physical disks hosting the Content Database: 14 (RAID 10)
The content database response time with varied user loads using the preceding configuration is shown
in Table 8.
Table 8. User Load Versus Content Database Response Time
User Load    Database Response Time (s)
8000         0.01
10000        0.01
12000        0.012
Conclusion
SharePoint allows organizations to store and manipulate unstructured data with great ease and
flexibility. The performance of a SharePoint farm depends heavily on the working data set of all
the users accessing SharePoint at any point in time, so the content database must
be sized and implemented to meet the organizational requirements.
In the experiments conducted with a working data set of about 53GB, adding an effective
physical disk pair in RAID 10 to the content database backend increased the capability of the
farm by an additional 1000 users. This data may be helpful in sizing the content
database backend based on the expected user load for the organization. However, an
oversized disk backend combined with an undersized or restricted SQL Server memory configuration will still result in
poor farm performance.
References
SharePoint Server Home Page
http://office.microsoft.com/en-us/sharepointserver/default.aspx
Dell SharePoint Solutions
www.dell.com/sharepoint
Dell SQL Server 2008 Solutions
www.dell.com/sql2008
Windows Server 2008 R2
www.dell.com/microsoft
Microsoft Tech Blogs: SHAREPOINT Performance Counters
http://blogs.msdn.com/ketaanhs/archive/2010/03/13/moss-performance-counters.aspx
Network Load Balancing Technical Overview
http://technet.microsoft.com/en-us/library/bb742455.aspx
How Network Load Balancing Works
http://www.isaserver.org/tutorials/basicnlbpart1.html