SharePoint Search Topology and Optimization

Post on 01-Jun-2015


This presentation covers the architecture of the SharePoint search topology, how to extend search, and how to optimize your search farm for better results. It describes how to build your search topology with PowerShell commands and explains how to use Query Rules and the Query Builder for great search results.

Transcript of SharePoint Search Topology and Optimization

Search Topology and Optimization

Mike Maadarani
SharePoint Architect
mike@maadarani.com

November 23rd, 2013

Thank you to all of our Sponsors!!

Bio

Mike Maadarani
• App Dev and Architecture for over 18 years (15 years Microsoft, 3 years with the “Other Guys”)
• Business focused on Enterprise Content Management & Publishing Sites
• Technology focused on SharePoint, SQL Server and SharePoint Integration
• Architect, trainer, and presenter
• Blog: www.maadarani.com; mike@maadarani.com; @mikemaadarani

Agenda

• SharePoint 2013 Search Overview
• Architecture and Resource Utilization
• Configuring SSA and PS
• Topology Scenarios
• Relevancy, Query Builder, & Optimization
• Closing and Q&A

Search in 2010

SharePoint 2010 Search Service Application

[Diagram: Content, Crawl Component (Crawl Indexing Engine), Query Component (Query Engine), Search Admin, Property Store (SQL), WFE, and User.]

FAST Search for SharePoint 2010

[Diagram: FAST Content SSA, FAST Query SSA, and FAST back-end components (managed separately), with the Crawl Indexing Engine, Content Pipeline, Analysis Engine, Query Pipeline, Query Engine, Search Admin, Content, WFE, and User.]

Extensibility:
• Sandbox
• Entity Extraction

… In SharePoint 2013

SharePoint 2013 Search Service Application

[Diagram: Crawl Component, Content Processing Component, Index Component, Query Processing Component, Analytics Processing Component, Admin Component, Property Store (SQL), Content, WFE, and User. Other labels: Crawl Indexing Engine, Content Pipeline, Analysis Engine, Query Pipeline, Query Engine, Search Admin.]

Callouts:
• Entire index on local disk
• Link/query analysis & recommendations
• Separate crawl and indexing

Extensibility:
• Web callout
• Entity Extraction

SharePoint 2013 Search Architecture

[Diagram: search topology components, split into content and query sides: Crawl, Content Processing, Index, Query Processing, Analytics Processing, and Search Admin, together with the WFE, the Public API, and the FAST Search Index.]

• Databases: Crawl, Search Admin, Link, Analytics Reporting
• Content sources: HTTP, File shares, SharePoint, User profiles, Lotus Notes, Documentum, Exchange folders, Custom - BCS
• Clients: SharePoint, SP Apps, Devices, Non-SP UX

Why is Search so important?

• FAST: I just uploaded a document. Make it searchable, quick!
• EASY
• Search Driven Applications
• Search Everything: I can find ALL of Rob Ford’s hidden videos!

Where does Search live in the farm?

Windows services
• SharePoint Search Host Controller service (hostcontrollerservice.exe): runtime/lifecycle control of search components (except crawler)
• SharePoint Server Search service (mssearch.exe, mssdmn.exe): still there, but only hosts the Crawl Component

Processes
• noderunner.exe: runtime environment for search components (except crawler)
• mssearch.exe, mssdmn.exe: Crawl Component

Admin entities
• Search Service Instance: provisioning of the search service on each box
• Search Service Application: SharePoint configuration entity

[Diagram: SharePoint App Server running the Host Controller (hostcontrollerservice.exe) and the Search Runtime Environment, with a noderunner.exe process for each of the Admin, Query Processing, Content Processing, Index, and Analytics Processing components, plus the Crawl Component.]

Where do I host my components?

Query processing component (QPC)

CPU load, driving factors:
• QPS
• Query transformations

Network load, driving factors:
• Number of index partitions
• Size of queries and results
• Example: 20 index partitions @ 20 QPS => 200/100 Mbit/s in/outbound

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]
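The QPC network example can be turned into a rough calculator. The sketch below assumes, beyond what the slide states, that traffic scales linearly with both the number of index partitions and QPS, so the slide's 200/100 Mbit/s at 20 partitions and 20 QPS becomes 0.5 Mbit/s inbound and 0.25 Mbit/s outbound per partition per QPS. The function name `Get-QpcNetworkEstimate` is hypothetical.

```powershell
function Get-QpcNetworkEstimate {
    param(
        [Parameter(Mandatory = $true)] [int]    $IndexPartitions,
        [Parameter(Mandatory = $true)] [double] $Qps
    )

    # Reference point from the slide: 20 partitions @ 20 QPS => 200/100 Mbit/s in/out.
    # Linear scaling in both dimensions is an assumption, not a documented rule.
    $inPerPartitionQps  = 200.0 / (20 * 20)   # 0.5 Mbit/s
    $outPerPartitionQps = 100.0 / (20 * 20)   # 0.25 Mbit/s

    New-Object PSObject -Property @{
        InboundMbit  = $IndexPartitions * $Qps * $inPerPartitionQps
        OutboundMbit = $IndexPartitions * $Qps * $outPerPartitionQps
    }
}

# Reproduces the slide's own data point (200 in, 100 out).
Get-QpcNetworkEstimate -IndexPartitions 20 -Qps 20
```

Treat the result as an order-of-magnitude figure for NIC planning; the size of queries and results, which the slide also lists as a driving factor, is not modeled here.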

Index component

CPU load, driving factors:
• QPS and item count

Guidelines per index component @ 2 GHz CPU:
• 1M items: 5 QPS per CPU core
• 5M items: 2 QPS per CPU core
• 10M items: 1 QPS per CPU core

Disk load, driving factors:
• QPS and item count
• New content invalidates caches
• Disk size: 500 GB @ 10M items per index component

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]
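The index component guidelines above lend themselves to a small sizing helper. This is a minimal sketch, not an official sizing tool: the function names are hypothetical, item counts between the listed tiers are rounded up to the next tier (a conservative assumption), and disk is scaled linearly from the 500 GB @ 10M items figure (also an assumption).

```powershell
function Get-IndexCoreEstimate {
    param(
        [Parameter(Mandatory = $true)] [long]   $ItemCount,
        [Parameter(Mandatory = $true)] [double] $TargetQps
    )

    # QPS per CPU core from the slide (per index component, 2 GHz CPU).
    $guideline = @(
        @{ MaxItems = 1000000;  QpsPerCore = 5 },
        @{ MaxItems = 5000000;  QpsPerCore = 2 },
        @{ MaxItems = 10000000; QpsPerCore = 1 }
    )

    # Assumption: use the first tier that can hold the item count.
    $tier = $guideline | Where-Object { $ItemCount -le $_.MaxItems } | Select-Object -First 1
    if ($null -eq $tier) {
        throw "The guideline only covers up to 10M items per index component."
    }

    [int][math]::Ceiling($TargetQps / $tier.QpsPerCore)
}

function Get-IndexDiskEstimateGB {
    param([Parameter(Mandatory = $true)] [long] $ItemCount)

    # Assumption: linear scaling from 500 GB at 10M items.
    [int][math]::Ceiling([double]$ItemCount * 500 / 10000000)
}

# Example: a 10M-item index component that must serve 8 QPS.
Get-IndexCoreEstimate -ItemCount 10000000 -TargetQps 8
Get-IndexDiskEstimateGB -ItemCount 10000000
```

For 10M items at 8 QPS the helper asks for 8 cores, which lines up with the 8x CPU cores quoted for the small search farm later in the deck.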

Crawl component

CPU load, driving factors:
• Documents per second
• Link discovery
• Crawl management

Network load, driving factors:
• Downloading items from content sources
• Passing items on to CPC

Disk load:
• All documents are temporarily stored in data folder

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]

Content processing component (CPC)

CPU load, driving factors:
• Documents per second
• Document size and complexity
• Feature extraction
• Estimate: 5-10 DPS per CPU core

Network load, driving factors:
• Documents per second
• Document size

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]
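The 5-10 DPS per CPU core estimate gives a quick way to bound full-crawl time. The sketch below assumes content processing is the only bottleneck (crawl download speed and index write speed are ignored) and uses a hypothetical function name; it is a back-of-the-envelope aid, not a measured figure.

```powershell
function Get-FullCrawlHoursEstimate {
    param(
        [Parameter(Mandatory = $true)] [long] $ItemCount,
        [Parameter(Mandatory = $true)] [int]  $CpuCores,
        # The slide estimates 5-10 DPS per core; default to the pessimistic end.
        [double] $DpsPerCore = 5
    )

    $documentsPerSecond = $CpuCores * $DpsPerCore
    $seconds = $ItemCount / $documentsPerSecond

    # Hours, rounded to one decimal place.
    [math]::Round($seconds / 3600, 1)
}

# Range for 10M items on 8 CPC cores: pessimistic first, then optimistic.
Get-FullCrawlHoursEstimate -ItemCount 10000000 -CpuCores 8 -DpsPerCore 5
Get-FullCrawlHoursEstimate -ItemCount 10000000 -CpuCores 8 -DpsPerCore 10
```

Running both ends of the range gives a window rather than a single number, which is about as precise as the per-core estimate allows.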

Analytics processing component (APC)

CPU load, driving factors:
• Number of items
• Site activity

Disk load:
• Local disk used for temporary storage
• Bulk load, primary concern is load isolation

Network load:
• Same driving factors as for CPU load
• PLUS: Network traffic increases when distributing APC across multiple machines

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]

Search administration component

• Low CPU and network load
• Load increases with more components in the search topology

[Chart: relative load impact on CPU, network, and disk, by item count, DPS, and QPS]

Create your SSA

$SSADB = "SharePoint_Demo_Search"
$SSAName = "Search Service Application Ottawa"
$SVCAcct = "mcm\sp_search"
$SSI = Get-SPEnterpriseSearchServiceInstance -Local

#1. Start the search services for SSI
Start-SPEnterpriseSearchServiceInstance -Identity $SSI

#2. Create the Application Pool
$AppPool = New-SPServiceApplicationPool -Name "${SSAName}-AppPool" -Account $SVCAcct

#3. Create the search application and set it to a variable
$SearchApp = New-SPEnterpriseSearchServiceApplication -Name $SSAName -ApplicationPool $AppPool -DatabaseServer SQL2012 -DatabaseName $SSADB

#4. Create search service application proxy
$SSAProxy = New-SPEnterpriseSearchServiceApplicationProxy -Name "${SSAName} Application Proxy" -Uri $SearchApp.Uri.AbsoluteURI

#5. Provision Search Admin Component
Set-SPEnterpriseSearchAdministrationComponent -SearchApplication $SearchApp -SearchServiceInstance $SSI

#6. Create the topology
$Topology = New-SPEnterpriseSearchTopology -SearchApplication $SearchApp

#7. Assign server(s) to the topology
$hostApp1 = Get-SPEnterpriseSearchServiceInstance -Identity "SPWFE"
New-SPEnterpriseSearchAdminComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchCrawlComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchIndexComponent -SearchTopology $Topology -SearchServiceInstance $hostApp1 -IndexPartition 0

#8. Activate the topology
$Topology | Set-SPEnterpriseSearchTopology
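After activating a topology it is worth checking that every component actually came up. The helper below is a sketch with a hypothetical name; it assumes the objects returned by `Get-SPEnterpriseSearchStatus` expose `Name` and `State` properties and that a healthy entry reports `Active`. The farm-only call is left as a comment so the function itself can be tried anywhere.

```powershell
function Get-InactiveSearchComponent {
    param(
        [Parameter(Mandatory = $true)]
        [AllowEmptyCollection()]
        [object[]] $Status
    )

    # Keep only entries whose State is anything other than 'Active'.
    $Status | Where-Object { "$($_.State)" -ne 'Active' }
}

# On a farm server, in the SharePoint 2013 Management Shell:
# $ssa = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application Ottawa"
# Get-InactiveSearchComponent -Status (Get-SPEnterpriseSearchStatus -SearchApplication $ssa)

# Stand-in data shaped like the assumed status output.
$sample = @(
    (New-Object PSObject -Property @{ Name = 'AdminComponent1'; State = 'Active' }),
    (New-Object PSObject -Property @{ Name = 'IndexComponent1'; State = 'Degraded' })
)
Get-InactiveSearchComponent -Status $sample | Select-Object Name, State
```

An empty result means nothing needs attention; anything listed is a component to investigate before sending users to the new topology.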

Small Search Topology

Fault tolerant small search topology

[Diagram: two hosts, each with two VMs. One VM runs Index and QPC; the other runs Admin, Crawl, CPC, and APC.]

Small search farm (up to 10M items)

[Diagram: Web front end and other SharePoint applications alongside the search components Admin, Crawl, CPC, APC, Index, and QPC. Callouts: sized independently; separate disk for index.]

Resources @ 10M items: 8x CPU cores, 24 GB RAM, 800 GB disk

Scaling from small to medium search topology

[Diagram: two identical rows of components, each with Adm, Crawl, four Index, one QPC, two CPC, and one APC.]

Extend your SSA

#2. Extend the Search Topology:
$hostApp1 = Get-SPEnterpriseSearchServiceInstance -Identity "SPWFE"
$hostApp2 = Get-SPEnterpriseSearchServiceInstance -Identity "SPSearch"
Start-SPEnterpriseSearchServiceInstance -Identity $hostApp1
Start-SPEnterpriseSearchServiceInstance -Identity $hostApp2

#3. Keep running these commands until the Status is Online:
Get-SPEnterpriseSearchServiceInstance -Identity $hostApp1
Get-SPEnterpriseSearchServiceInstance -Identity $hostApp2

#4. Once the status is online, you can proceed with the following commands:
$ssa = Get-SPEnterpriseSearchServiceApplication
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
$newTopology = New-SPEnterpriseSearchTopology -SearchApplication $ssa

New-SPEnterpriseSearchAdminComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchCrawlComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1
New-SPEnterpriseSearchIndexComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp1 -IndexPartition 0

New-SPEnterpriseSearchAdminComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchCrawlComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2
New-SPEnterpriseSearchIndexComponent -SearchTopology $newTopology -SearchServiceInstance $hostApp2 -IndexPartition 1

#5. Activate the topology:
Set-SPEnterpriseSearchTopology -Identity $newTopology
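Step #3 of the script asks you to re-run the status command by hand until it reports Online. A polling helper can do the waiting instead. This is a minimal sketch: the function name, the 10-second interval, and the attempt limit are all assumptions, and the status lookup is passed in as a script block so the helper does not depend on the SharePoint snap-in.

```powershell
function Wait-SearchInstanceOnline {
    param(
        # Script block that returns the current status, e.g. the Status
        # property of a search service instance.
        [Parameter(Mandatory = $true)] [scriptblock] $GetStatus,
        [int] $PollSeconds = 10,
        [int] $MaxAttempts = 60
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        $status = & $GetStatus
        if ("$status" -eq 'Online') { return $true }
        Start-Sleep -Seconds $PollSeconds
    }

    # Gave up: the instance never reported Online within the limit.
    return $false
}

# On a farm server, in the SharePoint 2013 Management Shell:
# Wait-SearchInstanceOnline -GetStatus {
#     (Get-SPEnterpriseSearchServiceInstance -Identity "SPSearch").Status
# }
```

Returning `$false` on timeout rather than looping forever lets a provisioning script stop before it tries to build a topology on an instance that never started.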

Medium Search Topology

Tweaking Your results

Challenges: Intent

• Where is my talk Project Plan?
• Are Documents held at the same place?
• I wonder if there are references from previous projects?

Different people have different intents.
Query Rules help you handle intents.
There is rarely a single right answer.

Infrastructure Project

Configuration in the Conceptual Relevance Flow

[Diagram: Search Web Part, Query Processing Engine, Document Collection]

Query: HR Employment quarterly report

For all queries:
• Authorities: Level 1: http://employment
• Ranking model: {incorporate user ratings}
• Thesaurus: HR, Human Resources
• Best bets: HR Employment, /HR/employment

Expanded query:
(WORDS HR, Human Resources) AND (WORDS employees, employed) AND (WORDS quarterly, quarterlies) AND (WORDS report, reports, reported)

Mixed results for:
• HR Employment best bet
• HR Employment quarterly report
• HR Employment ContentType=reports

Dynamic Reordering Rules: Quarterly Report {prefer docs from http://reports}

Query Rule: {Terms} Quarterly Report {Terms} ContentType="reports"
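The "prefer docs from http://reports" reordering above is configured through query rules in the deck. A comparable boost can also be written directly as query text with the KQL `XRANK` operator, which is a different mechanism from the dynamic reordering UI. The sketch below only builds the query string; the function name and the boost value of 100 are assumptions for illustration.

```powershell
function New-BoostedQueryText {
    param(
        [Parameter(Mandatory = $true)] [string] $Terms,
        [Parameter(Mandatory = $true)] [string] $PreferredPath,
        # Constant boost added to the rank of matching items (assumed value).
        [int] $Boost = 100
    )

    # KQL shape: <match expression> XRANK(cb=<boost>) <rank expression>
    '({0}) XRANK(cb={1}) path:"{2}"' -f $Terms, $Boost, $PreferredPath
}

New-BoostedQueryText -Terms 'HR Employment quarterly report' -PreferredPath 'http://reports'
```

The resulting text still returns every match for the terms; items whose path matches the preferred site simply rank higher, which is the intent the slide describes.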

Authorities: SSA-level configuration

• Sites that are important
• Sites with low intrinsic relevance
• Takes ~24hrs to propagate

Authorities: Connected

• Setting an authority affects all sites connected through hyperlinks
• Sites are weighted by distance to the authority

[Diagram: graph of linked sites labeled with numbers from 0 to 4]

Query Rules

Tune Search Results

Created at the SSA, Tenant, Site Collection or Site

[Diagram: SSA, Site Collection, Site]

• Condition: When do I apply the rule?
• Action: What to do when the rule is matched?
• Publishing: When should the rule be active?

Conditions
• Exact match, beginning or end
• Ad-hoc or term store dictionary
• Match a regex (advanced)
• Is this query more likely aimed at the following source…?
• Do people mostly click on result of the following type…?

Actions
• Show a promoted result
• Show a block of results
• Replace the core results with a different query

Query Builder

• Dynamic ranking changes
• Part of the query
• Results ranking

Session Objective and Takeaways

High Availability and Performance

Better Search Quality

Better management

Friendly results and tools

Thank You / Merci

www.maadarani.com, mike@maadarani.com , @mikemaadarani

Q & A

Join us for SharePint today!

Date & Time: Nov 23rd, 2013 @ 6:00 pm

Location: The Observatory Pub, Algonquin Student’s Association

Address: A-170 on Algonquin Campus

Parking: No need to move your car!*

Site: http://www.algonquinsa.com/ob.aspx

*Please drive responsibly! We are happy to call you a cab

Remember to fill out your evaluation forms to win some great prizes!
