Post on 24-Dec-2015
Architecting for Scale in SharePoint 2010
Russ Houberg
Senior Technical Architect, MCM
KnowledgeLake, Inc.
Storage Architecture
SQL Tuning Tidbits
Remote Blob Storage (Demo)
Performance and Control
Scalable Taxonomy Design (Demo)
Search… A Complete Story
The Big Picture: 10 million, 100 million, A BILLION Documents…
Scaling SP2010 from the Ground Up
Storage Architecture can make or break SharePoint Performance
• Poor storage performance can tank the whole SharePoint farm!
Can Be Tough to Estimate
• Use an extendable storage platform if possible
Wider is Better
• More spindles always better than higher GB
• Avoid using a small number of large disks for increasing storage capacity
Storage Architecture
TempDB, Search DBs, Content DBs
• Multiple Data Files in Primary File Group
  • # Files = ½ to ¼ of CPU Cores | <= CPU Cores
  • Separate to unique spindle sets if possible
• Pre-Allocate all Data Files, Including TempDB
  • Estimate Projected DB Size and Divide by # Files to get the pre-allocation size for each file
• Leave “AutoGrow” enabled, but don’t rely on it
  • Pre-Allocation to prevent AutoGrow
  • Set AutoGrow to 10% or logical MB/GB value based on projected database Size
Storage Architecture
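The file-count and pre-allocation rules above come down to a little arithmetic. The sketch below is only an illustration of that arithmetic: the function name, the `ratio` parameter and the clamping behaviour are assumptions made for the example, not part of any SQL Server or SharePoint tooling.

```python
from dataclasses import dataclass


@dataclass
class DataFilePlan:
    file_count: int
    size_per_file_gb: float


def plan_data_files(projected_db_gb: float, cpu_cores: int,
                    ratio: float = 0.5) -> DataFilePlan:
    """Pick a data file count and a per-file pre-allocation size.

    ratio is the fraction of CPU cores used as the file count (the slide
    suggests 1/4 to 1/2, and never more files than cores).
    """
    if projected_db_gb <= 0 or cpu_cores < 1:
        raise ValueError("projected size and core count must be positive")
    if not 0.25 <= ratio <= 1.0:
        raise ValueError("ratio should be between 0.25 and 1.0")
    # Clamp to at least one file and at most one file per core.
    file_count = max(1, min(cpu_cores, round(cpu_cores * ratio)))
    # Pre-allocation size = projected DB size divided by the number of files.
    return DataFilePlan(file_count, projected_db_gb / file_count)


if __name__ == "__main__":
    # Hypothetical example: a 400 GB content DB on an 8-core SQL Server.
    print(plan_data_files(400, 8))
```

With these example inputs the plan is four data files, each pre-allocated to 100 GB. AutoGrow stays enabled as a safety net rather than as the sizing mechanism.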
Data / Log File Spindle Priority
Storage Architecture
Priority  DB File                    RAID       IOPS          Optimization
1         TempDB Data                RAID 10    2 IOPS/GB     Write
2         TempDB Log                 RAID 10    2 IOPS/GB     Write
3         Content DB Log             RAID 10    2 IOPS/GB     Write
4         Crawl DB Log               RAID 10    2 IOPS/GB     Write
5         Crawl DB Data              RAID 10    2 IOPS/GB     Read/Write
6         Property DB Log            RAID 10    2 IOPS/GB     Write
7         Property DB Data           RAID 10    2 IOPS/GB     Read/Write
8         Services DB Log            RAID 10    2 IOPS/GB     Write
9         Services DB Data           [Depends]  [Depends]     [Depends]
10        Content DB Data (Collab)   RAID 10    0.75 IOPS/GB  Read/Write
11        Content DB Data (Archive)  RAID 5     0.75 IOPS/GB  Read
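The IOPS column is a per-gigabyte target, so the IOPS a volume needs scales with the size of the files placed on it. Here is a minimal sketch of that multiplication. The dictionary keys are labels invented for this example, and only a few rows of the table are included.

```python
# IOPS-per-GB targets copied from the priority table; the keys are
# hypothetical labels used only in this sketch.
IOPS_PER_GB = {
    "tempdb_data": 2.0,
    "tempdb_log": 2.0,
    "content_log": 2.0,
    "crawl_data": 2.0,
    "content_data_collab": 0.75,
    "content_data_archive": 0.75,
}


def required_iops(file_kind: str, size_gb: float) -> float:
    """Return the IOPS a volume should sustain for a file of the given size."""
    if size_gb < 0:
        raise ValueError("size_gb must not be negative")
    return IOPS_PER_GB[file_kind] * size_gb


if __name__ == "__main__":
    # Hypothetical example: a 200 GB collaboration content DB data file.
    print(required_iops("content_data_collab", 200))
```

A 200 GB collaboration content database would therefore call for roughly 150 IOPS on its data volume, while a 20 GB TempDB data file would call for about 40.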
SQL Instant Initialization
• Run SQL As Domain User with either…
  • Local Admin
  • Grant “Perform Volume Maintenance Tasks”
TempDB Pre-Allocation to 10% of Largest DB
SAN vs DAS vs NAS (Don’t Overshare!)
Host Bus Adapter (HBA) Configuration
NTFS Allocation Unit Size: 64K
Enable Locked Pages in Memory (SQL Std.)
Don’t skimp on RAM!
SQL Tuning Tidbits
Remote BLOB Storage (RBS)
• By default SharePoint stores Binary Large Objects (BLOBs) in the content database
• When enabled, RBS intercepts binary content (documents) and sends it to a BLOB store
• Microsoft provides the “local” FILESTREAM provider to allow for usage of the SQL Server local NTFS file system as a BLOB store.
RBS Background
Remote BLOB Storage
What’s this ECM thing?
- Interesting workarounds
• API access was problematic
SharePoint 2003
SP1 Brings us EBS Provider
- BLOBs are orphaned during edit/save
- Orphan cleanup is resource intensive
- Externalization happens on the WFE (reduced RPS)
- Future support of EBS API is not guaranteed
SharePoint 2007
Long Live RBS
- Transactional consistency supports “VETO”
- Transactional consistency allows for UPDATE
- Orphan cleanup uses SQL Indexes
- Transparent to the SharePoint API
- RBS is the best option for future support
SharePoint 2010
Remote BLOB Storage
[Diagram: RBS save path. Components shown: SharePoint WFE, SharePoint Object Model, RBS Client Library, BLOB Store Provider Library, BLOB Store, SQL Server, ContentDB, ConfigDB. One connection is labeled “Relational Access”.]
1. Save Request
2. Enforce Business Logic
3. Save Blob
4. Write Blob
5. Return BLOB ID
6. Save Metadata & BLOB ID
7. Back to User
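The seven steps can be walked through with a toy model. This is only a sketch: `BlobStore`, `ContentDb` and `save_document` are hypothetical in-memory stand-ins invented for the example and do not correspond to the real RBS client library or SharePoint object model. The business rule in step 2 is also just a placeholder.

```python
import uuid


class BlobStore:
    """Hypothetical in-memory stand-in for an external BLOB store."""

    def __init__(self):
        self._blobs = {}

    def write(self, data: bytes) -> str:
        blob_id = str(uuid.uuid4())
        self._blobs[blob_id] = data   # step 4: write BLOB
        return blob_id                # step 5: return BLOB ID

    def read(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


class ContentDb:
    """Hypothetical stand-in for the content database (metadata only)."""

    def __init__(self):
        self.rows = {}

    def save(self, name: str, metadata: dict, blob_id: str) -> None:
        self.rows[name] = {"metadata": metadata, "blob_id": blob_id}


def save_document(name, data, metadata, blob_store, content_db):
    # Step 1: the save request arrives with the file and its metadata.
    # Step 2: enforce business logic (placeholder rule for this sketch).
    if not name or not data:
        raise ValueError("a document needs a name and content")
    # Step 3: hand the BLOB to the provider, which performs steps 4 and 5.
    blob_id = blob_store.write(data)
    # Step 6: only the metadata and the BLOB ID go to the content database.
    content_db.save(name, metadata, blob_id)
    # Step 7: control returns to the user.
    return blob_id


if __name__ == "__main__":
    store, db = BlobStore(), ContentDb()
    doc_id = save_document("plan.docx", b"binary content", {"Title": "Plan"}, store, db)
    print(db.rows["plan.docx"]["blob_id"] == doc_id)
```

The point of the model is that the content database row holds a reference rather than the bytes, which is why RBS keeps content database size down.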
SQL Server 2008 R2
• Any Version, even SQL Express R2
FILESTREAM RBS Provider (Current Version)
• http://go.microsoft.com/fwlink/?LinkId=177388
RBS Requirements
The FILESTREAM provider is supported by SharePoint Server 2010 only when it is used with SQL Server 2008 R2 or SQL Server 2008 R2 Express.
• Only “local commodity storage” (hard drive) is supported.
• Direct Attached Storage (DAS), Network Attached Storage (NAS), and Storage Area Network (SAN) are all considered to be “remote commodity storage” and are not supported by SharePoint 2010.
Any other 3rd Party RBS Provider is considered to be a “remote server” provider and SharePoint 2010 licensing requires that SQL Server 2008 R2 Enterprise Edition be implemented.
RBS Licensing and Limitations
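The rules on this slide can be read as a small decision function. The sketch below encodes only what the slide states. The function name and the string labels ("filestream", "local", "enterprise" and so on) are assumptions made for the example, and it is not a substitute for the actual licensing terms.

```python
def rbs_configuration_allowed(provider: str, storage: str, sql_edition: str) -> bool:
    """Check an RBS setup against the rules stated on the slide.

    provider:    "filestream" or any other string for a 3rd party provider
    storage:     "local", "das", "nas" or "san"
    sql_edition: SQL Server 2008 R2 edition, e.g. "express", "standard",
                 "enterprise"
    """
    provider = provider.lower()
    storage = storage.lower()
    sql_edition = sql_edition.lower()
    if provider == "filestream":
        # Any SQL Server 2008 R2 edition, but local commodity storage only.
        return storage == "local"
    # Any other provider counts as a "remote server" provider,
    # which requires Enterprise Edition.
    return sql_edition == "enterprise"


if __name__ == "__main__":
    print(rbs_configuration_allowed("filestream", "san", "enterprise"))
    print(rbs_configuration_allowed("third-party", "san", "enterprise"))
```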
Performance and Control
- Column Indexes were not possible
- Database Indexes were not supported
SharePoint 2003
- Column Indexes (10) could be configured via the UI
- End users could impact performance with poorly performing list views
SharePoint 2007
- Database optimizations allow far more items in a list
- Support for (20) Multi-Column Indexes
- Resource intensive operations can be limited or disallowed during production hours
  • Large query thresholds
  • Blocking Operations
  • Can be overridden via the Object Model
  • Can configure an unblocked “window”
SharePoint 2010
SP2010 Boundaries – Now More Stuff!!!
• 30 Million Documents/Items in a List
• 5000 Item View/Query Result Size (Default for a reason)
• 100 Million Items in SharePoint Server 2010 Search
• 1 BILLION Items in FAST For SharePoint 2010 Index
• 250,000 Site Collections per Web Application
• 200GB Content DB Size (SOFT LIMIT)
  • Recommended for Collaboration content or Fast Backup/Restore SLA
  • Content DB sizes up to 1TB are SUPPORTED for large single-site repositories and archives of non-collaborative content!
  • That’s 150 Million items in a single Site Collection in a single Content Database with RBS enabled (avg. 7KB metadata row)
Scalable Taxonomy Design
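The 150 million figure follows from dividing the 1TB content database size by the average 7KB metadata row. A quick sketch of that arithmetic, assuming binary units (1TB = 1024³ KB) and assuming that with RBS enabled only the metadata rows count against the database size; the function name is invented for the example.

```python
def max_items_for_db(db_size_tb: float, avg_row_kb: float = 7.0) -> int:
    """Estimate how many metadata rows fit in a content DB of the given size."""
    if db_size_tb <= 0 or avg_row_kb <= 0:
        raise ValueError("sizes must be positive")
    db_size_kb = db_size_tb * 1024 ** 3   # TB -> KB, binary units
    return int(db_size_kb // avg_row_kb)


if __name__ == "__main__":
    # Roughly 153 million rows, which the slide rounds to 150 million.
    print(max_items_for_db(1.0))
```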
Enabling 100 Million
• Place large Collaboration Site Collections (20GB+) in their own content database
• Break Up Archive/Records Site Collections by Year or, if necessary, Content Type and Year
• AVOID Item Level ACLs!!!
  • Release to Metadata Based Folder Structures as a workaround
• Use Content Type Syndication to facilitate multiple Site Collections of the same type
• Use Content Organizer as a “Drop Zone”
Scalable Taxonomy Design
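Splitting archive site collections by year, or by content type and year, amounts to deriving a destination from a document's metadata. The sketch below shows one way such a routing rule could look. The function name and the URL scheme are hypothetical, and in a real farm the Content Organizer would apply rules like this rather than custom code.

```python
def target_site_collection(content_type: str, year: int,
                           split_by_content_type: bool = False,
                           base_url: str = "/sites/archive") -> str:
    """Return the (hypothetical) site collection URL a document is routed to."""
    if split_by_content_type:
        # Only split by content type when yearly volumes make it necessary.
        slug = content_type.strip().lower().replace(" ", "-")
        return f"{base_url}-{slug}-{year}"
    return f"{base_url}-{year}"


if __name__ == "__main__":
    print(target_site_collection("Invoice", 2010))
    print(target_site_collection("AP Invoice", 2010, split_by_content_type=True))
```

Because each yearly site collection can sit in its own content database, a rule like this is also what spreads load across databases.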
Search… A Complete Story
- WSS CAML Only
- SPS Shared Services yielded decent full text results
SharePoint 2003
- WSS 3.0 SiteDataQuery allowed search across lists/sites
- MOSS Search added Managed Properties
- FAST ESP for SharePoint was a late player
SharePoint 2007
- Microsoft SharePoint Foundation Search
  - Site Collection Scope | No Redundancy | 10 Million
- Microsoft Search Server Express 2010
  - Extended Features | No Redundancy | 10 Million
- Microsoft SharePoint 2010 Search / Search Server
  - Extended Features | Scale Out | Redundancy | 100 Million
- Microsoft FAST Search Server 2010 for SharePoint
  - Extreme Scale | Redundancy | Doc Processing Pipeline
  - 1 Billion documents! (per farm)
SharePoint 2010
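The item limits quoted for the SharePoint 2010 search options suggest a simple lookup for the smallest product that covers a given corpus. This sketch uses only the limits from the slide. The tier list and function name are invented for the example, and it ignores the other criteria on the slide such as redundancy and feature set.

```python
# (product, maximum items) pairs taken from the slide, smallest first.
SEARCH_TIERS = [
    ("SharePoint Foundation Search / Search Server Express 2010", 10_000_000),
    ("SharePoint Server 2010 Search / Search Server 2010", 100_000_000),
    ("FAST Search Server 2010 for SharePoint", 1_000_000_000),
]


def smallest_search_tier(item_count: int) -> str:
    """Return the first tier whose quoted item limit covers the corpus."""
    if item_count < 0:
        raise ValueError("item_count must not be negative")
    for product, limit in SEARCH_TIERS:
        if item_count <= limit:
            return product
    raise ValueError("corpus exceeds the limits quoted on the slide")


if __name__ == "__main__":
    print(smallest_search_tier(40_000_000))
```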
SharePoint Server 2010 / Search Server
• Multiple Crawl Servers (Scale Out/Redundancy)
• Crawl Servers comprised of stateless Crawlers
• Multiple Crawlers improve crawl performance
• Multiple Crawl DBs support more Crawlers
• Crawl DB is separated from Property DB
• Index is comprised of multiple Index Partitions that can be mirrored on different Query Servers
• Multiple Index Partitions improve Query Performance
Search… A Complete Story
FAST Search Server 2010 for SharePoint
• Extreme Scale and Performance
• Custom Relevancy and Navigation Tuning
• Tune Performance for content volume, query volume, crawl pipeline performance and query speed
• Uses SharePoint 2010 Query Servers
• Bolt on FAST Servers for additional processing
• Add server ROWS for query performance and high availability or COLUMNS for crawl performance
• Can scale to support 1 Billion items!
Search… A Complete Story
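The row and column scaling can be pictured as a grid of servers. The sketch below assumes that each column holds a distinct slice of the index and that each row holds a full copy of every column. That model is an assumption made for illustration, as are the function name and the example numbers.

```python
import math


def fast_farm_layout(total_items: int, columns: int, rows: int) -> dict:
    """Summarise a rows x columns server grid for a given corpus size."""
    if total_items < 0 or columns < 1 or rows < 1:
        raise ValueError("need a non-negative corpus and at least one row and column")
    return {
        "servers": rows * columns,
        # Assumed: the index is split evenly across columns.
        "items_per_column": math.ceil(total_items / columns),
        # Assumed: every row carries a copy of each column.
        "copies_of_each_column": rows,
    }


if __name__ == "__main__":
    # Hypothetical layout: 1 billion items over 4 columns and 2 rows.
    print(fast_farm_layout(1_000_000_000, columns=4, rows=2))
```

In this hypothetical layout, eight servers hold 250 million items per column, and each column exists twice for query throughput and availability.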
Storage is the KEY to Performance
RBS reduces Content DB Size and facilitates large repositories
SharePoint governs end-user operations
Content Type Publishing and Content Organization help balance database loading
Search solutions now handle the entire range of corpus possibilities
10 million is easy, 100 million can be done, 1 BILLION is possible!
In Review…
http://www.houberg.net
@rhouberg
http://www.knowledgelake.com/resources
More…