
  • 8/16/2019 Dave Hay - WUG - Integrating the Google Search Appliance with WebSphere Portal and Lotus Web Content Management

    © 2008 IBM Corporation

    Integrating the Google Search Appliance with WebSphere Portal and Lotus Web Content Management

    Dave Hay
    Portal and Collaboration Architect
    IBM Software Services for Lotus (ISSL)


    IBM Software Group | Lotus software

     

     About Me 

    With IBM since 1992

    Experienced with hardware, software and now services

     – AS/400 and iSeries

     – Network Station

     – WebSphere and Lotus software

     – Linux advocate

     – Collaboration evangelist

    Infrastructure Architect

    With ISSL since 2009

    http://www-05.ibm.com/uk/locations/hursley_details.html


    Introduction – The Project 

    Major UK financial institution

    Internal and external websites

    Content held in Lotus Web Content Management

    Existing intranet and internet sites using Google Search Appliance

    IBM team and Google partner engaged

    Solution adoption programme


    Requirements 

    To deliver access to unsecured AND secured content

    To maintain security of content within search results

    To present content in context via search results

    To deliver personalized results with variance and relevance

    To integrate with WebSphere Portal

    To maintain access to existing search facilities

    To perform in line with non-functional requirements


    Lotus Web Content Management 

    Role-based content management system

    Built upon WebSphere Portal

    Workflow-driven authoring, approval and publishing process

    Content accessible via portlets, standalone websites, API, feeds etc.

    Content stored in standards-based Java Content Repository (JCR) database


    Google Search Appliance

    "Search in a box"

    Self-contained appliance

     – Only requires power and data

    Different models for different requirements

    Client uses B


    Challenges 

    Preserve existing search functionality

    Integrate with client's custom security solution

    Need to maintain segregation

     – GSA should never interact with WCM directly

    WCM supports standard Seedlist format

    GSA supports Google Feeds format

    User experience – what and where


    Terminology

    Crawling – the process that the GSA goes through to build its on-box search index (known as the default collection)

    Serving – the GSA provides search request form and search results to users

    Searching – the process that the users go through

    Collections – provide "views" into the default collection based upon URL patterns

    Front Ends – defines the user experience IN and OUT of GSA

    XSLT – Extensible Stylesheet Language Transformations, used to drive the user experience


    Seedlists and Feeds

    Google Feeds is the format that the GSA uses when crawling, and what our solution needed to produce

    WCM automatically produces a Seedlist, albeit on-demand

    Seedlist can also be scheduled and, perhaps, persisted

     – Question about where seedlist would be persisted e.g. file system, database

    Both are XML structures

    What are the differences?

    IBM Seedlist format has features that Google Feeds doesn't offer:

     – Pre-filtering by user groups stored in metadata in the index

     – Post-filtering at runtime

     – Pagination – useful for large content stores

     – Embedded seedlists (seedlists within seedlists)

     – Incremental indexing (what has changed since the last crawl)

    Long-term objective is for standardization around the Seedlist format


    The Solution 

    IBM team developed Crawling Proxy (CP) solution

    CP is based upon an established Google pattern, so not First Of A Kind (FOAK)

    CP is a standard JEE application deployed onto WebSphere Application Server 6.1

    CP acts as broker between GSA and WCM

     – GSA never connects to WCM direct

    CP can be scaled across clustered WebSphere environment to meet non-functional requirements


    System Context Diagram

    [Figure: system context diagram. Labels: WCM Database; Core Web Security; Web Server; Portal Administrator; Google Search Appliance; Content Author; End User; CWS Administrator; Content Authoring Server; Portal Databases; Existing Content; Insite / GSA Administrator; Database Administrator; Insite Administrator; Portal Content Delivery Cluster. Legend: Admin flow; User data flow; Security flow.]


    Crawling Process

    GSA makes a crawl request to Crawling Proxy via a specific URL

    CP requests Seedlist from WCM

    CP generates "Jump Page"

     – HTML page of links, paged as needed

    GSA crawls "Jump Page", requesting each URL from CP

    CP returns content and metadata to GSA

     – Injected into GSA using Google Feeds format
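
The "Jump Page" step can be sketched roughly as follows. This is an illustrative Python sketch only, not the project's JEE code: the function name, the `target` and `page` query parameters, and the endpoint URLs are all assumptions made for the example.

```python
from html import escape
from urllib.parse import quote


def build_jump_pages(content_urls, content_endpoint, jump_endpoint, page_size=100):
    """Return a list of HTML jump pages, each holding at most page_size links.

    content_endpoint and jump_endpoint are hypothetical Crawling Proxy URLs.
    """
    pages = []
    for start in range(0, len(content_urls), page_size):
        page_no = start // page_size + 1  # 1-based number of this page
        # Every link points back at the proxy, never at WCM itself.
        links = [
            '<a href="{}?target={}">{}</a>'.format(
                content_endpoint, quote(url, safe=""), escape(url))
            for url in content_urls[start:start + page_size]
        ]
        # Link to the following page so the crawler can walk the whole list.
        if start + page_size < len(content_urls):
            links.append(
                '<a href="{}?page={}">next</a>'.format(jump_endpoint, page_no + 1))
        pages.append("<html><body>\n" + "\n".join(links) + "\n</body></html>")
    return pages
```

Because each link resolves to the proxy rather than to WCM, the segregation requirement (GSA never connects to WCM direct) is preserved while the crawler still discovers every item in the Seedlist.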


    Crawling Process – Jump Page

    [Figure]


    Crawling Process - Feeds

    [Figure]


    Delivering Secured Search

    Content is secured in WCM using user groups

    Crawling proxy injects groups into GSA as metadata via Feed process

    GSA needs to get the user groups to perform search across ACL-secured content in index

    How does the GSA know the identity and groups of the user?

    GSA can use LDAP, but client doesn't use it; a custom authentication mechanism is used instead
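
As a rough illustration of injecting groups as metadata, the sketch below builds a feed document in Python. The element names (`gsafeed`, `header`, `group`, `record`, `metadata`, `meta`, `content`) follow the published Google feeds layout as best recalled here and should be checked against the Feeds Protocol guide. The metadata name `wcm_groups`, the function name and the record dictionary keys are hypothetical, and the DOCTYPE declaration is omitted.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, records):
    """Build a feed document carrying each item's user groups as metadata."""
    root = ET.Element("gsafeed")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "incremental"
    group = ET.SubElement(root, "group")
    for rec in records:
        record = ET.SubElement(
            group, "record", url=rec["url"], mimetype="text/html")
        metadata = ET.SubElement(record, "metadata")
        # "wcm_groups" is an assumed metadata name, not one from the deck.
        ET.SubElement(
            metadata, "meta", name="wcm_groups", content=",".join(rec["groups"]))
        ET.SubElement(record, "content").text = rec["html"]
    return ET.tostring(root, encoding="unicode")
```

With the groups stored alongside each record, a secured search only has to compare the searching user's groups against this metadata, which is why the appliance must first learn who the user is.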


    The Cookie Cracker

    Like the Crawling Proxy, this is another pattern that GSA supports

    Cookie Cracker is used to decrypt and validate user's security token

    Then returns user ID and groups to GSA

    GSA can then perform search across ACL-secured content in index

    Also need a Redirect URL to "force" user to authenticate if anonymous or expired session
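
A minimal sketch of the Cookie Cracker idea, in Python for brevity. The client's real token format is not described in the deck, so this example invents one (a base64 JSON payload plus an HMAC-SHA256 signature) and only validates it rather than decrypting anything. The `X-Username` and `X-Groups` response header names are as recalled from the GSA cookie-cracking documentation and should be verified; the secret and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # hypothetical shared secret, for illustration only


def make_token(user, groups, expires_at, secret=SECRET):
    """Create a signed token (stand-in for the client's security token)."""
    body = json.dumps({"user": user, "groups": groups, "exp": expires_at})
    payload = base64.urlsafe_b64encode(body.encode()).decode()
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + signature


def crack_cookie(token, now=None, secret=SECRET):
    """Return identity headers for a valid token, or None to force a logon."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".", 1)
        expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(signature, expected):
            return None
        data = json.loads(base64.urlsafe_b64decode(payload))
    except (ValueError, TypeError, AttributeError):
        return None  # malformed or missing token
    if data["exp"] <= now:
        return None  # expired session
    return {"X-Username": data["user"], "X-Groups": ",".join(data["groups"])}
```

A `None` result corresponds to the anonymous or expired-session case, where the user is sent to the Redirect URL to authenticate.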


    Serving Process

    User initiates search request

     – Either by accessing GSA directly or via portal

    User indicates whether secured or unsecured search is required

     – If unsecured, then GSA searches as usual

     – If secured, GSA redirects user request to Cookie Cracker

     – If no valid token, GSA redirects user request to Redirect URL to force logon

     – Once valid token, Cookie Cracker returns user ID and groups to GSA

     – GSA performs search across ACL-secured content
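
The decision flow above can be summarised in a few lines of Python. This is a simplified model of the behaviour described on the slide, not appliance code: the function name, the `cracker` callable and the redirect URL are assumptions for the example.

```python
def serve(query, secured, token, cracker,
          redirect_url="https://portal.example/login"):
    """Model the serving decision: returns an (action, details) pair.

    cracker is any callable that maps a token to (user_id, groups),
    or to None when the token is missing, invalid or expired.
    """
    if not secured:
        # Unsecured search: search public content as usual.
        return ("search", {"q": query, "user": None, "groups": []})
    identity = cracker(token)
    if identity is None:
        # No valid token: force the user to log on first.
        return ("redirect", {"location": redirect_url})
    user_id, groups = identity
    # Valid token: search ACL-secured content with the user's groups.
    return ("search", {"q": query, "user": user_id, "groups": groups})
```

For example, `serve("pensions", True, None, lambda t: None)` yields a redirect to the logon page, while the same call with `secured=False` goes straight to an unsecured search.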


    The Multiple GSA Scenario

    May be needed for performance and/or resilience

    Multiple patterns including Active/Active Crawl, Active/Active Search, Active/Passive Crawl etc.

    Option to use mirroring to keep passive GSA in sync with active GSA

    Crawling Proxy needs to be designed to "know" which GSA is making a request

    Crawling Proxy also needs to persist timestamp of last Seedlist request
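
One way the last two points might be handled is sketched below in Python. Identifying an appliance by its source address and persisting timestamps in a JSON file are both assumptions chosen to keep the example small; the deck does not say how the project did either, and the addresses and names are made up.

```python
import json
import os

# Hypothetical mapping from source address to appliance name.
GSA_BY_ADDRESS = {"10.0.0.11": "gsa-active", "10.0.0.12": "gsa-passive"}


def identify_gsa(remote_addr):
    """Return the name of the requesting appliance, or None if unknown."""
    return GSA_BY_ADDRESS.get(remote_addr)


class CheckpointStore:
    """Persist the time of the last Seedlist request for each appliance."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as handle:
            return json.load(handle)

    def last_request(self, gsa_id):
        """Timestamp of the previous request, or None on the first crawl."""
        return self._load().get(gsa_id)

    def record_request(self, gsa_id, timestamp):
        data = self._load()
        data[gsa_id] = timestamp
        with open(self.path, "w") as handle:
            json.dump(data, handle)
```

Keeping one timestamp per appliance means an incremental Seedlist request made on behalf of the passive GSA does not skip changes that only the active GSA has already seen.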


    GSA Security

    GSA can use security mechanisms such as NTLM and Form-Based Authentication to control crawler access

     – We chose to use NTLM

    GSA also supports solutions such as Kerberos and SAML for client authentication – essential for secure serving

     – We chose to use Cookie Cracking

    We also needed to consider other aspects:

     – Using HTTPS to encrypt access from GSA to Crawling Proxy

     – Using IP whitelist and network ACLs to control access to GSA ports such as Feeds and Admin

     – Using HTTPS to encrypt data being fed into the Feed port

     – Using on-box user accounts (administrator, manager) rather than LDAP


    End-user Experience

    Options to deliver UX from portal or from GSA

    GSA experience driven by "front-end"

    Front-end provides search request and search results

    Option to have multiple front-ends; each with different theme/style

    Front-ends delivered using Extensible Stylesheet Language Transformations (XSLT)

    Reuse existing styles e.g. CSS files, icons, logos etc.


    Examples of UX


    Component Design


    Skills

    Client had previous experience with GSA

     – Needed to acquire additional GSA administration experience

    Crawling Proxy, Cookie Cracker and Redirect URL applications realized in JEE

    XSLT skills needed to customize front-ends

     – GSA has on-box front-end tooling

     – XSLT expertise needed to modify over and above


    Project Lifecycle

    Conduct requirements gathering exercise

     – We started with a baseline requirement for secured search in Portal

     – Equates to an agile project; we knew where we wanted to get to, but the waypoints on the journey changed along the way

    Work with Google partner to understand art of the possible

     – Patterns such as Crawling Proxy and Cookie Cracking came this way

    Identify dependencies

     – Need GSA 6.8 software level to support content-level ACLs

     – Needed additional fix for SSL support

    Develop and functionally test, iteratively

    Plan for non-functional testing, to build capacity model

     – Using Crawling Proxy against WCM was a known unknown

    Plan to upgrade production GSAs to 6.8

    Plan for administrator and developer training


    The Future

    Client plans to make this Search Solution a standard part of all future Portal/WCM deployments

    This includes internal AND external web sites

    Option to reuse all/part of solution (esp. Crawling Proxy) for Collaboration project with Lotus Connections

    Extend solution to offer Personalization (variance and relevance) using metadata

    Consider scheduling Seedlist generation, and caching across clusters

    Look at options to standardize XSLT across organization

    Consider search on mobile devices e.g. iPad, Android


    Lessons Learned

    Need complete set of skills

     – Portal/WCM

     – GSA

     – XSLT

     – Security infrastructure

     – Networking

    Project spans infrastructure, application and security disciplines

    Decide on UX as soon as possible

    Focus on requirements, requirements, requirements


    Any questions?


    How to contact me

    Lotus Sametime

    Lotus Notes

    w3.ibm.com/connections/profiles/html/profileView.do?key=9bd96659-504f-4433-af15-4ab4dfd4863c&lang=en

    http://www.linkedin.com/profile/view?id=38767494&authType=NAME_SEARCH&authToken=wqqF&locale=en_US&srchid=eed489c9-8c34-443f-8dbe-6c9e9cb2efe3-0&srchindex=1&srchtotal=160&pvs=ps&pohelp=&goback=.fps_*1_Dave_Hay_*1_*1_*1_*1_*51_*1_Y_*1_*1_*1_false_1_R_true_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2

    http://twitter.com/david_hay

    http://portal2portal.blogspot.com/