SharePoint Search - SPSNYC 2014

Post on 13-Jan-2015

Avtex's Brian Caauwe was a presenter at SharePoint Saturday in NYC in July of 2014. Here is his presentation.

SHAREPOINT SEARCH

Introducing the new search service

Brian Caauwe – Sr. Consultant

July 26th, 2014

KEY TOPICS

• Editions

• Components

• Administration

• Customizations

WHO AM I?

• Brian Caauwe

• SharePoint Consultant & Speaker

• Avtex Solutions (Minneapolis, MN)

• Email: bcaauwe@avtex.com

• Twitter: @bcaauwe

• Blog: http://blog.avtex.com/author/bcaauwe

• Unfortunate Sports Fan

• Minnesota Twins

• Minnesota Vikings

• Technical Editor

• Professional SharePoint 2013 Administration

• Certifications

• MCM: SharePoint Server 2010

THANK YOU EVENT SPONSORS

• Please visit them and inquire about their products & services

• To win prizes make sure to get your bingo card stamped by ALL sponsors

POLL

• SharePoint Version

• 2007 – WSS, MOSS

• 2010 – SPF, Server, FAST

• 2013 – SPF, Server

• Work Roles

• SharePoint Administrator

• SharePoint Developer

• Business User

• Other

SEARCH EDITIONS

• SharePoint Foundation 2013

• SharePoint Server 2013

• Standard

• Enterprise

• ALL editions now use the SAME search service

• osearch15

• TechNet Reference: http://technet.microsoft.com/en-us/library/cb36484c-0e8f-480e-be88-5daa8bf2d47d#bkmk_SearchfeaturesOnPrem

SEARCH EDITIONS: SHAREPOINT FOUNDATION 2013

• Now uses enterprise search engine

• Can now administer service

• Content Sources

• Crawl Schedule

• etc

• Limited scalability

SEARCH EDITIONS: SHAREPOINT SERVER 2013 - STANDARD

• Scalable components

• People Search

• Promoted Results

• Customized Sorting

• Graphical Refiners

• Search Server web parts

SEARCH EDITIONS: SHAREPOINT SERVER 2013 - ENTERPRISE

• Content by Search web part

• Entity Extraction

• Content Processing Enrichment

• Video Search

• Item Recommendations

SEARCH COMPONENTS

SEARCH COMPONENTS: LOGICAL ARCHITECTURE

[Diagram: the WFE and Event Store; the Crawl, Content Processing, Index, Query Processing, Administration, and Analytics Processing components; and the Search Admin, Crawl, Links, and Analytics Reporting databases]

SEARCH COMPONENTS: ADMINISTRATION COMPONENT

Component

• Monitors states of all other components

• Manages topology changes

• Finally scalable

• Only one active at a time

Database

• Search Admin Database

• Configuration data

• Topology

• Crawl rules, query rules

• Property Mappings

• Content Sources, Crawl Schedules

• Analytics Settings


SEARCH COMPONENTS: CRAWL COMPONENT

Component

• Performs the crawling

• Invokes connectors / protocol handlers

• SharePoint content

• Business Applications

• File Shares

• More…

• Delivers crawled items AND metadata to Content Processing Component

• Communicates with ALL crawl databases

Database(s)

• Crawl Database

• Crawl history

• Information on crawled items

• Scale out for each 20 million items crawled

• Host distribution

• 2010 Handled by Host URL

• 2013 Handled by Content DB


SEARCH COMPONENTS: CONTENT PROCESSING COMPONENT (CPC)

Component

• Handles document parsing and iFilters

• Extracts data for Document Parsing and Property Mappings

• Performs linguistic processing

• Entity Extraction

• Generates phonetic name variations (people search)

• Sends items to the Index Component

Database(s)

• Link Database

• Receives information about links and URLs from CPC

• Stores unprocessed information for use in analytics

• Information on search clicks

• Number of times people click on results

• Scale out for each 20 million items crawled

• Scale out for each 100 million queries / year


SEARCH COMPONENTS: ANALYTICS PROCESSING COMPONENT (APC)

Component

• Performs Search Analytics

• Pulls information from Links DB

• Stores information for search reports

• Performs Usage Analytics

• Pulls information from event store

• Generates recommendations, usage and statistics reports

• Sends results to the content processing component to be pushed to the index

Database(s)

• Analytics Reporting Database

• Results of usage analytics

• Statistics information from the analyses

• Scale out when size > 200 GB
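The scale-out thresholds quoted on the database slides (20 million items per crawl or link database, 100 million queries per year per link database, 200 GB for the analytics reporting database) can be turned into a rough sizing check. This is only a sketch: Get-SearchDatabaseEstimate and its parameters are hypothetical names, not SharePoint cmdlets, and the numbers are planning guidance rather than hard limits.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): applies the per-database
# thresholds quoted on the slides to estimate how many databases to plan for.
function Get-SearchDatabaseEstimate {
    param(
        [Parameter(Mandatory = $true)][long]$ItemCount,
        [Parameter(Mandatory = $true)][long]$QueriesPerYear,
        [double]$AnalyticsDbSizeGB = 0
    )

    # Crawl database: one per 20 million crawled items, never fewer than one.
    $byItems  = [math]::Ceiling($ItemCount / 20000000.0)
    $crawlDbs = [int][math]::Max(1, $byItems)

    # Link database: scale on items or on queries per year, whichever asks for more.
    $byQueries = [math]::Ceiling($QueriesPerYear / 100000000.0)
    $linkDbs   = [int][math]::Max(1, [math]::Max($byItems, $byQueries))

    [pscustomobject]@{
        CrawlDatabases          = $crawlDbs
        LinkDatabases           = $linkDbs
        # Analytics reporting database: flag a split once it passes 200 GB.
        SplitAnalyticsReporting = ($AnalyticsDbSizeGB -gt 200)
    }
}

# Example: 45 million items and 250 million queries per year.
Get-SearchDatabaseEstimate -ItemCount 45000000 -QueriesPerYear 250000000
```

For the example call, the estimate is three crawl databases and three link databases, with no analytics split flagged.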


SEARCH COMPONENTS: INDEX COMPONENT

Component

• Logical representation of an index replica

• Mapped one-to-one to an index replica

• Each partition holds one or more index replicas

• Receives processed items from content processing component and writes them to the index

• Receives queries from query processing component

• Returns result sets to the query processing component

On File index

• Located ON SharePoint servers housing index component

• Index update groups

• Default (majority of managed properties)

• Security (ACL managed property)

• Link (managed properties related to link structure)

• Usage (managed properties related to usage data)

• People (managed properties related to people search)

• Full-text index

• Contains text from searchable managed properties

• Multiple replicas / server supported after October 2013 CU


SEARCH COMPONENTS: QUERY PROCESSING COMPONENT (QPC)

Component

• Analyses and processes queries

• Decides which query rules are applicable

• Submits query to index component

• Determines which index partition to send query to

• Performs pre-processing

• Receives result sets from index component

• Performs post-processing

• Sends result set back to requestor

• Performs linguistic processing at query time

• Word breaking, stemming, spellchecking, thesaurus


SEARCH COMPONENTS: COMPONENT PARTNERS

Name CPU Network Disk Memory

Administration ● ● ● ●

Crawl ●● ●●● ●● ●●

Content Processing (CPC) ●●● ●● ●●●

Analytics Processing (APC) ●● ●●● ●● ●●

Index ●●● ●● ●●● ●●●

Query Processing (QPC) ● ●● ●●

The content of this slide is borrowed from Neil Hodgkinson (@nellymo)


SEARCH ADMINISTRATION

SEARCH ADMINISTRATION: MAPPING TERMINOLOGY FROM 2010 TO 2013

2010 Term | 2013 Term

Scopes | Result Source

Federated Location | Result Source

Keyword | Query Rule

Best Bets | Promoted Result

Managed Property | Schema > Managed Property

Crawled Property | Schema > Crawled Property

Search Result Removal | Crawl Log > URL View > Remove the item from the Index

XSLT | Display Templates

N/A | Result Types

N/A | Result Block

N/A | Continuous Crawl

Host Distribution Rule | N/A

SEARCH ADMINISTRATION: SEARCH TOPOLOGY

Central Administration

• View topology

• No more options…

PowerShell

• Manage the search service instances

• Manage topology and components

SEARCH ADMINISTRATION: SEARCH TOPOLOGY - POWERSHELL

## Get Service ##
$svc = Get-SPEnterpriseSearchServiceInstance -Identity "servername"

## Start Service ##
Start-SPEnterpriseSearchServiceInstance -Identity $svc

## Get Search Service Application ##
$ssa = Get-SPEnterpriseSearchServiceApplication

## Get Active Topology ##
$activeTop = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active

## Clone Topology ##
$clone = New-SPEnterpriseSearchTopology -SearchApplication $ssa -SearchTopology $activeTop -Clone

SEARCH ADMINISTRATION: SEARCH TOPOLOGY - POWERSHELL

## New Administration Component ##
$adminComp = New-SPEnterpriseSearchAdminComponent -SearchTopology $clone -SearchServiceInstance $svc

## New Analytics Processing Component ##
$apc = New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $svc

## New Crawl Component ##
$crawlComp = New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone -SearchServiceInstance $svc

## New Content Processing Component ##
$cpc = New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $svc

SEARCH ADMINISTRATION: SEARCH TOPOLOGY - POWERSHELL

## New Query Processing Component ##
$qpc = New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $svc

## New Index Partition / Replica ##
$idx = New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $svc -IndexPartition 0 -RootDirectory "D:\SP\SearchIndex"

## Activate New Topology ##
$clone.Activate()
## OR ##
Set-SPEnterpriseSearchTopology -Identity $clone

SEARCH ADMINISTRATION: SEARCH TOPOLOGY

Topology Recap

• Ensure service is “online” before using in search topology

• To clone topology, use New-SPEnterpriseSearchTopology -Clone

• Otherwise you won’t have component IDs

• Index Component

• When specifying a root directory, it MUST exist but be empty

• Also, if referencing a remote server, the cmdlet checks the local server for the directory

• Always specify a partition, otherwise it chooses 0

• When adding a new partition, it must have the same number of replicas as existing partitions

• After adding a new partition, the index WILL be repartitioned … amount of time it takes depends on index size

• You can ADD a partition, but not DELETE

• Clean up old topologies / components
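Two of the recap points, waiting for the service instance to come online and cleaning up old topologies, could be scripted roughly as follows. This is a sketch that assumes it runs on a SharePoint 2013 farm server with farm administrator rights; "servername" is the same placeholder used on the earlier slides, and the five-second polling interval is arbitrary.

```powershell
# Load the SharePoint snap-in if this is a plain Windows PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication
$svc = Get-SPEnterpriseSearchServiceInstance -Identity "servername"

# Wait until the search service instance reports Online
# before adding components to it in a cloned topology.
while ($svc.Status -ne "Online") {
    Start-Sleep -Seconds 5
    $svc = Get-SPEnterpriseSearchServiceInstance -Identity "servername"
}

# Once the new topology is active, remove the inactive ones left behind.
Get-SPEnterpriseSearchTopology -SearchApplication $ssa |
    Where-Object { $_.State -eq "Inactive" } |
    ForEach-Object {
        Remove-SPEnterpriseSearchTopology -Identity $_ -SearchApplication $ssa -Confirm:$false
    }
```

Run the cleanup only after confirming the new topology is healthy, since the inactive topology is the easiest way back to the previous layout.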

SEARCH ADMINISTRATION: FARM ADMINISTRATION

Diagnostics

• Crawl Logs

• Only way to directly remove item from index

• Search Reports

• Crawl Health

• Query Health

• Usage Reports

SEARCH ADMINISTRATION: FARM ADMINISTRATION

Crawling

• Content Sources

• Crawl Schedules

• Continuous OR Incremental crawl

• Full crawl

• Crawl Rules

• Server Name Mappings

• File Types

• Index Reset

• Pause / Resume

• Crawler Impact Rules
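Crawl schedules are usually set in Central Administration, but the continuous crawl option can also be switched on from PowerShell. A minimal sketch, assuming the default content source name "Local SharePoint sites" (adjust to your own farm):

```powershell
# Load the SharePoint snap-in if this is a plain Windows PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication

# "Local SharePoint sites" is the default content source name; yours may differ.
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

# Switch the content source from incremental to continuous crawls.
Set-SPEnterpriseSearchCrawlContentSource -Identity $cs -SearchApplication $ssa -EnableContinuousCrawls $true
```

Continuous crawl applies only to SharePoint content sources, so file shares and other source types keep their incremental schedules.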

SEARCH ADMINISTRATION: FARM ADMINISTRATION

Queries and Results

• Authoritative Pages

• Result Sources

• Query Rules

• Query Client Types

• Search Schema

• Query Suggestions

• Enabled / Disabled

• Always / Never Suggest

• Import AND Export

• Search Dictionaries (Term Store Management)

• Company Exclusion / Inclusion

• Query Spelling Exclusion / Inclusion

• Search Result Removal

SEARCH ADMINISTRATION: FARM ADMINISTRATION

Search Schema (Managed / Crawled Properties)

• Searchable

• Advanced Searchable Settings

• Full-text index

• Weight group

• Queryable

• Retrievable

• Allow Multiple Values

• Refinable

• Sortable

• Safe for Anonymous

• Alias

• Token Normalization

• Complete Matching

• Company Name Extraction

• Custom Entity Extraction

SEARCH ADMINISTRATION: FARM ADMINISTRATION - POWERSHELL ONLY

## Result Types ##
$owner = Get-SPEnterpriseSearchOwner -Level Ssa

$word = Get-SPEnterpriseSearchResultItemType -SearchApplication $ssa -Owner $owner | ?{$_.Name -eq "Microsoft Word"}

$pdf = Get-SPEnterpriseSearchResultItemType -SearchApplication $ssa -Owner $owner | ?{$_.Name -eq "PDF"}

$wordPDF = New-SPEnterpriseSearchResultItemType -SearchApplication $ssa -Name "WordPDF" -Owner $owner -ExistingResultItemType $pdf -ExistingResultItemTypeOwner $owner

Set-SPEnterpriseSearchResultItemType -Identity $wordPDF -SearchApplication $ssa -Owner $owner -RulePriority 1 -DisplayTemplateUrl $word.DisplayTemplateUrl

## Thesaurus ##
Import-SPEnterpriseSearchThesaurus -SearchApplication $ssa -FileName "\\server\share\thesaurus.csv"

SEARCH ADMINISTRATION: SITE ADMINISTRATION

Result Types

• Map results to display templates

Consumes farm settings, but allows site independent settings

• Result Sources

• Query Rules

• Search Schema

• Map Existing Managed Properties to Crawled Properties

• New Managed Properties - Types: Text or Yes/No

• Cannot make Sortable, Refinable, Multiple Values

SEARCH ADMINISTRATION: SITE ADMINISTRATION

Search Settings

• Search Center URL

• Search Navigation

Searchable Columns

• Exclude site columns from indexing

List Settings

• Can flag a list to force re-index

SEARCH CUSTOMIZATIONS

SEARCH CUSTOMIZATIONS: CRAWL COMPONENT

Custom Connectors

• Really means BCS

• LOBSystemInstance needs ShowInSearchUI to show in Central Admin for content source

• DisplayUriField must be set on the method; otherwise URLs in search will start with bdc3://

• LastModifiedTimeStampField set and ChangedIdEnumerator and DeletedIdEnumerator implemented if you want incremental crawls

MSDN Reference: http://msdn.microsoft.com/en-us/library/gg294165.aspx


SEARCH CUSTOMIZATIONS: CONTENT PROCESSING COMPONENT (CPC)

Content Enrichment Web Service

• Web service call outside of SharePoint to:

• Clean data

• Remove from index

• Augment properties

• Configurations

• Trigger Expression

• Input Managed Properties

• Output Managed Properties

• Failure Mode

• Debug Mode

MSDN Reference: http://msdn.microsoft.com/en-us/library/jj163968.aspx


SEARCH CUSTOMIZATIONS: CONTENT PROCESSING COMPONENT (CPC)

Content Enrichment Web Service

• Registering the service in PowerShell

$ssa = Get-SPEnterpriseSearchServiceApplication

$cewsConfig = New-SPEnterpriseSearchContentEnrichmentConfiguration
$cewsConfig.Endpoint = "http://externalserver/cews.svc"
$cewsConfig.InputProperties = "Title", "Company"
$cewsConfig.OutputProperties = "Title", "Company", "Prop3"
$cewsConfig.Trigger = 'Contains(Company, "CoName")'
$cewsConfig.FailureMode = "Error"
$cewsConfig.DebugMode = $false

Set-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa -ContentEnrichmentConfiguration $cewsConfig
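After registering, it can help to read the configuration back, and to know how to unregister the callout if the external service goes away. A short sketch, assuming the same farm session and default Search service application as the registration example:

```powershell
# Load the SharePoint snap-in if this is a plain Windows PowerShell session.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication

# Read back the configuration currently registered for this Search service application.
Get-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa

# Unregister the callout, for example while the external service is being redeployed.
Remove-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa
```

With FailureMode set to "Error" as in the registration example, items that fail the callout are reported as crawl errors, so removing the configuration is the quickest way to unblock crawling during an outage of the external service.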


SEARCH CUSTOMIZATIONS: CONTENT PROCESSING COMPONENT (CPC)

Custom Entity Extraction

• Different Extraction types

• Word Extraction

• 5 Dictionaries

• Microsoft.UserDictionaries.EntityExtraction.Custom.Word.n

• Word Part Extraction

• 5 Dictionaries

• Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.n

• Word Exact Extraction

• One Dictionary

• Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWord.1

• Word Part Exact Extraction

• One Dictionary

• Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWordPart.1

TechNet Reference: http://technet.microsoft.com/en-us/library/jj219480.aspx


SEARCH CUSTOMIZATIONS: CONTENT PROCESSING COMPONENT (CPC)

## Entity Extraction ##
Import-SPEnterpriseSearchCustomExtractionDictionary -SearchApplication $ssa -DictionaryName Microsoft.UserDictionaries.EntityExtraction.Custom.Word.1 -FileName "\\server\share\dictionary.csv"

Custom Entity Extraction

• Sample File

• Import through PowerShell
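The sample file from the slide did not survive in this transcript. As a stand-in, the sketch below writes a small dictionary file in the layout described in the TechNet reference, a header row of "Key,Display form" followed by one entity per row. The company names and the temp-folder path are made-up examples, not the presenter's data.

```powershell
# Assumed layout: header row "Key,Display form", then one entity per row.
# The entries below are made-up examples.
$rows = @(
    'Key,Display form'
    'Contoso,Contoso Ltd.'
    'Fabrikam,Fabrikam Inc.'
)

# Written to a temp folder here; the import example on the slide reads from
# a UNC path, so copy the file to a share such as \\server\share first.
$dictionaryPath = Join-Path ([System.IO.Path]::GetTempPath()) 'dictionary.csv'
Set-Content -Path $dictionaryPath -Value $rows -Encoding UTF8
```

The Key column holds the term to match in crawled content, and the display form is what appears in the refiner.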


SEARCH CUSTOMIZATIONS: CONTENT PROCESSING COMPONENT (CPC)

Custom Entity Extraction

• Map in Central Administration


SEARCH CUSTOMIZATIONS: QUERY PROCESSING COMPONENT (QPC)

Ranking Models

• Customize ranking based on YOUR logic

• VERY complex… a LOT of math

Registered in PowerShell

MSDN Reference: http://msdn.microsoft.com/en-us/library/sharepoint/dn169052.aspx

$ssa = Get-SPEnterpriseSearchServiceApplication
$owner = Get-SPEnterpriseSearchOwner -Level Ssa
$customModel = [string](Get-Content .\CustomModel.xml)

$newModel = New-SPEnterpriseSearchRankingModel -SearchApplication $ssa -Owner $owner -RankingModelXML $customModel


SEARCH CUSTOMIZATIONS: QUERY PROCESSING COMPONENT (QPC)

Security Trimming

• Pre

• Augments claims

• Processed BEFORE index lookup

• Accurate refiner counts

• Post

• Secondary security checkpoint

• Processed AFTER index lookup

• Negatively affects refiner counts

Needs to be deployed to GAC

Registered in PowerShell

MSDN Reference: http://msdn.microsoft.com/en-us/library/sharepoint/ee819930.aspx

$ssa = Get-SPEnterpriseSearchServiceApplication

New-SPEnterpriseSearchSecurityTrimmer -ID "1" -SearchApplication $ssa -TypeName "<strong typed assembly>"


SEARCH CUSTOMIZATIONS: USER EXPERIENCE

Display Templates

• New way to change search results

• Goodbye XSLT

• Get used to JavaScript

• Available through Design Manager

• Live in Master Page Gallery

• Separate folders for Content by Search and Core Search

• .HTML file

• .JS file (DO NOT TOUCH)

MSDN Reference: http://msdn.microsoft.com/en-us/library/jj945138.aspx


SEARCH CUSTOMIZATIONS: USER EXPERIENCE

Display Templates

• Samples

• Announcements

• Pages

• Documents


SEARCH CUSTOMIZATIONS: USER EXPERIENCE

Search Web Parts

• Search Results

• Query Builder

• Auto Refine

• Sorting

• Query Rules

• Inline testing

• Content by Search

• Search Results Web Part settings plus

• Term Navigation

• Tuned for use outside of the search center

SESSION SUMMARY

• Editions

• Components

• Administration

• Customizations

HOW TO CONTACT ME

• Brian Caauwe

• SharePoint Consultant & Speaker

• Email: bcaauwe@avtex.com

• Twitter: @bcaauwe

• Blog: http://blog.avtex.com/author/bcaauwe

REFERENCES

SharePoint 2013 training for IT pros

• http://technet.microsoft.com/en-US/sharepoint/fp123606

Search Edition Features

• http://technet.microsoft.com/en-us/library/cb36484c-0e8f-480e-be88-5daa8bf2d47d#bkmk_SearchfeaturesOnPrem

BCS Connector

• http://msdn.microsoft.com/en-us/library/gg294165.aspx

Content Enrichment Web Service

• http://msdn.microsoft.com/en-us/library/jj163968.aspx

Custom Entity Extraction

• http://technet.microsoft.com/en-us/library/jj219480.aspx

Ranking Models

• http://msdn.microsoft.com/en-us/library/sharepoint/dn169052.aspx

Security Trimming

• http://msdn.microsoft.com/en-us/library/sharepoint/ee819930.aspx

Display Templates

• http://msdn.microsoft.com/en-us/library/jj945138.aspx