© Galatea Training Services Limited, 2002
Web Servers - 1
V0.1 ISA1-1.PPT
Internet Systems Administration
Session 1: Web Servers
Contents
Client/Server Basics
Electronic Publishing
HTTP Overview
Other Web-Related Servers
Clients and Servers
[Diagram labels: Requests, Services]
TCP/IP Network Connections/Ports
[Diagram labels: a.b.c.d, Services, e.f.g.h, Port 80]
Servers and Browsers
[Diagram labels: request, resource returned, browser display]
Browser Plug-Ins
Extend browser capability
More than just HTML
RealPlayer: live audio and video
Shockwave: animations
Acrobat Reader: view PDF files
Hypertext
[Diagram labels: Hyperlink, Hypertext document, Target]
File Types
ASCII text files
– Letters, numbers and punctuation
– View and edit with standard tools
– HTML
Binary files
– Images
– Sound
– Programs
Image File Types (1)
GIF (Graphics Interchange Format)
– 256 colours
– Lossless compression
– Transparency
– Can be animated
– Good for illustrations
– Proprietary (patent)
Image File Types (2)
PNG (Portable Network Graphics)
– As GIF, except: more colours, no animation, not proprietary
JPEG (Joint Photographic Experts Group)
– Millions of colours
– Lossy compression
– Good for photographs
Audio File Types
WAV
– Windows
AIFF
– Macintosh
AU
– UNIX
Modern browsers support all these and more
MIME Types
Application
– application/excel
Audio
– audio/midi
Image
– image/jpeg
Message
– message/news
Multipart
– multipart/digest
Text
– text/html
Video
– video/mpeg
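As a rough illustration of how a file extension maps onto a MIME type, the sketch below uses Python's standard `mimetypes` module. The file names are invented for the example, and results for less common extensions can depend on the type tables of the local system.

```python
import mimetypes

def mime_type_for(filename):
    """Return the MIME type guessed from the file extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"

# Hypothetical file names, one per top-level type family.
for name in ("index.html", "photo.jpeg", "clip.mpeg"):
    print(name, "->", mime_type_for(name))
```

A web server performs a similar lookup to fill in the Content-Type header of each response.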
HTTP Request
Request Line
– Request method, resource location, protocol version
Header Section
– Information about request, client
Entity Body
– Data to be passed to the server
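The three parts of a request can be sketched by assembling one as text. This is a minimal illustration only: the resource `/cgi-bin/order` and the form field are hypothetical, and a real client would use an HTTP library rather than build the string by hand.

```python
def build_request(method, path, headers, body=""):
    """Assemble a raw HTTP/1.1 request from its three parts."""
    request_line = f"{method} {path} HTTP/1.1"
    header_section = "\r\n".join(f"{k}: {v}" for k, v in headers.items())
    # A blank line separates the header section from the entity body.
    return f"{request_line}\r\n{header_section}\r\n\r\n{body}"

raw = build_request(
    "POST", "/cgi-bin/order",  # hypothetical resource
    {
        "Host": "www.vortexwidgets.com",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": "10",
    },
    body="quantity=3",  # hypothetical form data, 10 bytes long
)
print(raw)
```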
HTTP Response
Status Line
– Protocol version, status code, reason phrase
Header Section
– Information about server, response
Entity Body
– Requested resource – often HTML
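A status line can be taken apart with a simple split, as in this sketch. The function name is an assumption made for the example; the sample line follows the standard layout of protocol version, status code and reason phrase.

```python
def parse_status_line(line):
    """Split a status line into (protocol version, status code, reason phrase)."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason

print(parse_status_line("HTTP/1.1 404 Not Found"))
```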
Request Methods
GET
– Typical way of getting a resource from a server
– Can be used to pass data to the server
HEAD
– Server returns only header data
– Use to verify the existence of a resource
POST
– Used to send data to the server
– Typically used to send HTML form data to the server
HTTP Status Code Categories
Informational (1xx)
Success (2xx)
Redirection (3xx)
Client error (4xx)
Server error (5xx)
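The category of a status code is given by its first digit (1 to 5). A minimal sketch of that lookup, with a function name chosen for the example:

```python
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def status_category(code):
    """Return the category for a three-digit HTTP status code."""
    # Integer division by 100 keeps only the leading digit.
    return CATEGORIES.get(code // 100, "Unknown")

for code in (200, 302, 404, 503):
    print(code, status_category(code))
```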
Proxy Server
Proxy
– security
– content filter
– cache
[Diagram label: Web server]
Streaming
Request resource
Data transmitted continuously
Display without waiting for complete message
FTP
Copies files from one host to another
Used to retrieve files from Internet archives
Useful for binary and text files
Log in identification
– Anonymous ftp
SSL
Secure Sockets Layer
Encrypts data in TCP/IP packets
– ordinary HTTP uses clear text
Commercial web applications
Web server support
Summary
We have covered:
– Client/Server Basics
– Electronic Publishing
– HTTP Overview
– Other Web-Related Servers
Planning a Server - 1
V0.1 ISA1-2.PPT
Internet Systems Administration
Session 2: Planning a Server
Contents
Hosting Options
UNIX or NT?
Sizing a Server
Domain Names
Hosting
Host – a computer connected to the Internet with an address
A web server is a host
– Where can you locate your web server?
– How can you connect your web server to the Internet?
Hosting Options
Set up your own web server
Co-location
Virtual host
Personal-Page sites (ISP)
Free-Page sites
Your Own Server - Issues
Cost of server hardware and software
Operations
– Backup
– 24/7
– Power supplies
Security
– Protecting your server
– Protecting other people’s resources
Co-Located/Dedicated Server
[Diagram labels: Your office, Low cost connection, ISP, ISP’s connection to Internet, Internet]
Your office
– Develop web pages
– Monitor server usage
ISP
– Manages set up, maintenance, accounting, backups etc
Co-Located Server Issues
You must supply the server
You must administer it
Fees to ISP for service
But
– Good connectivity to Internet
– No local floor-space required
Dedicated Server Issues
Not your own server
Fees to ISP for service
But
– Much easier to set up
– 24/7 support
– Good connectivity to Internet
Virtual Host
[Diagram labels: Your office, Low cost connection, ISP, ISP’s connection to Internet, Internet, Shared Server, Own domain, Another Co., Another Co.]
Your office
– Develop web pages
– Monitor server usage
Virtual Host Issues
Shared server
Limited or no server programming access
Standard solutions
But
– Inexpensive
– Good connectivity to Internet
Personal and Free Pages
ISP standard offering
Very limited storage
Not for commercial use
May be free with some ISPs
Connecting Your Server
[Diagram labels: Your office, ISP, Internet, IP address, Domain name]
Connection types: ISDN, Cable, DSL, T1/E1
IP Addressing
Static IP address
– needed for a web server
– never changes
– allows other domains to connect to your server
Dynamic IP address
– different on each dial-up
– acceptable for dial-up Internet access
– inappropriate for running web based services
Router
Directs packets to different networks or to Internet depending on address
[Diagram labels: ISP, ISP’s connection to Internet, Internet]
Servers - UNIX or NT? (1)
Unix
– Built for TCP/IP networking
– Scalable
– Many hardware platforms
– Robust
– Command line interface
– X Windows GUI
– Longer learning curve
Servers - UNIX or NT? (2)
NT/XP/Win 2000
– Closed - proprietary
– Limited hardware platform choice
– GUI oriented
– Easy to learn
– Getting more robust
Servers - UNIX or NT? (3)
Linux
– Open, free UNIX-like operating system
– Works well on limited hardware
– Robust
– Very versatile
– Requires investment in learning
– Becoming extremely popular for web servers
Sizing Your Server
Bandwidth and network capacity
– Moving data from and to your server
– Buy a bigger pipe
Web server processor
– Serving out web pages fast enough to keep up with demand
– Buy a faster CPU
– Add RAM
Domain Names
www.vortexwidgets.com
– www: host name
– vortexwidgets: domain
– com: top level domain
Top Level Domains
.com business
.org non-profit
.net ISPs
.edu university
.gov government
.mil military
.us United States
.au Australia
.uk United Kingdom
.jp Japan
.se Sweden
Summary
We have covered:
– Hosting Options
– UNIX or NT?
– Sizing a Server
– Domain Names
Users and Documents - 1
V0.1 ISA1-3.PPT
Internet Systems Administration
Session 3: Users and Documents
Contents
Server Users and Directories
Server Administrators
Document Hierarchy
Directory Indexing
File and Directory Names
Transferring Files
Web Servers and Directories
[Diagram labels: / (root), usr, home, bin, dev, htdocs, Web server]
Users and Accounts
Web site visitors do not need accounts
User accounts define server privileges
Users have a home directory
User name and password
System Administration Duties (1)
Install and configure server OS
Set up daemons/services (UNIX/NT)
Install and configure web server
Keep server software up to date
– Patches (UNIX)
– Service packs (NT)
System Administration Duties (2)
Backup and recovery
– To disk or tape
– Full and incremental backups
Accounts and quotas
– Maintain users and their accounts
– Decide level of resources for users
Network software configuration
Monitoring security
File Systems (1)
[Diagram: directory tree with root / containing usr, home, bin, dev, lib, etc, var]
File Systems (2)
Microsoft
– FAT/FAT16/FAT32
– NTFS
Apple
– HFS
UNIX
– UFS
– EXT2
– ReiserFS
– NFS
Directories
Absolute pathname
– /usr/local/httpd/htdocs
Relative pathname
– cd /usr/local/httpd
– cd htdocs
Dot and Dot Dot
– . = current directory
– .. = parent directory
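The way a relative pathname, dot and dot dot resolve against the current directory can be sketched with Python's `posixpath` module, which applies UNIX path rules on any platform. The helper name `resolve` is an assumption made for the example.

```python
import posixpath

def resolve(current_dir, path):
    """Resolve a pathname the way the shell would from current_dir."""
    # join() keeps an absolute path as it is; normpath() folds "." and "..".
    return posixpath.normpath(posixpath.join(current_dir, path))

base = "/usr/local/httpd"
print(resolve(base, "htdocs"))  # relative pathname
print(resolve(base, "."))       # current directory
print(resolve(base, ".."))      # parent directory
```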
Uniform Resource Locators
Describe how to find a web resource
http://www.vortexwidgets.com/support/industrial.html
– Server name: www.vortexwidgets.com
– Path: support
– Filename: industrial.html
Use relative URLs in your own web pages
– Make it easy to move to another directory
Dot and Dot Dot
– . = current directory
– .. = parent directory
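The URL breakdown and the effect of relative URLs can be checked with Python's `urllib.parse`. This is only a sketch: `domestic.html` and `index.html` are hypothetical pages used to show how a relative link resolves against the page that contains it.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

url = "http://www.vortexwidgets.com/support/industrial.html"

parts = urlsplit(url)
server_name = parts.netloc
directory, filename = posixpath.split(parts.path)
print(server_name, directory, filename)

# Relative URLs are resolved against the URL of the current page.
print(urljoin(url, "domestic.html"))  # same directory
print(urljoin(url, "../index.html"))  # parent directory
```

Because the relative links carry no server name or absolute path, the whole directory can be moved without editing them.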
Directory Indexing
If no file name in a URL:
– Directory browsing enabled: if there is an index document, return it as the default document; otherwise return a list of files
– Directory browsing disabled: if there is an index document, return it as the default document; otherwise return nothing
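The decision described above can be sketched as a small function. The name `resolve_directory`, the default index name `index.html` and the return values are assumptions made for the illustration; real servers make the index names configurable and usually answer with an error status rather than literally nothing.

```python
def resolve_directory(files, browsing_enabled, index_names=("index.html",)):
    """Decide what to return for a URL that names a directory only."""
    # An index document always wins, whether or not browsing is enabled.
    for name in index_names:
        if name in files:
            return ("document", name)
    if browsing_enabled:
        return ("listing", sorted(files))
    # Browsing disabled and no index document: nothing to return.
    return None

print(resolve_directory(["index.html", "logo.gif"], browsing_enabled=False))
print(resolve_directory(["b.html", "a.html"], browsing_enabled=True))
print(resolve_directory(["b.html", "a.html"], browsing_enabled=False))
```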
Good Filename Practice (1)
Don’t use spaces in names
– Use underscores (_) or dashes (-) instead
Don’t use special characters
– & + ?
Keep filenames short
Use a standard naming convention
Good Filename Practice (2)
Use consistent filename extensions
– .html
– .gif
Don’t use extensions with directory names
Use lowercase letters in all filenames
– UNIX is case sensitive
– Windows is (mostly) not case sensitive
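One possible way to apply these naming rules automatically is sketched below. The function `clean_filename` and the exact set of permitted characters are assumptions for the example, not a standard.

```python
import re

def clean_filename(name):
    """Lowercase a filename, replace spaces and drop special characters."""
    name = name.lower().replace(" ", "_")
    # Keep only letters, digits, dot, underscore and dash.
    name = re.sub(r"[^a-z0-9._-]", "", name)
    # Collapse runs of underscores left behind by removed characters.
    return re.sub(r"_+", "_", name)

print(clean_filename("My Page & More.HTML"))
print(clean_filename("price+list?.gif"))
```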
Transferring Files
Development PC
– HTML editor
– Programming tools
– Graphics tools
Transfer to the Web Server using
– FTP
– HTTP PUT
– FrontPage extensions
FTP
File Transfer Protocol
Use a GUI FTP client, OR
FTP CLI commands
– open hostname
– put filename
– get filename
– cd directory
– ls (list directory)
– mkdir dirname
– mput
– mget
FrontPage
An easy to use HTML editor
Synchronisation between development PC and web server
BUT
– Not all ISPs support this type of usage
– Works best with Microsoft web server
– Proprietary
Summary
We have covered:
– Server Users and Directories
– Server Administrators
– Document Hierarchy
– Directory Indexing
– File and Directory Names
– Transferring Files
Server Configuration - 1
V0.1 ISA1-4.PPT
Internet Systems Administration
Session 4: Server Configuration
Contents
Choosing web server software
Customising your web server
Controlling access
Secure Socket Layer configuration
Virtual hosts
Popular Web Servers
Apache
Microsoft IIS
Netscape Enterprise Server
Others
Apache
Open Source
Multiple platforms (UNIX and Microsoft)
Very powerful and configurable
Uses configuration files
– httpd.conf
Microsoft IIS
Easy to use, GUI oriented
Closed - proprietary
Microsoft Management Console
Extendable through ISAPI
– DLL
– ASP
Support for FrontPage extensions
Evaluating Web Servers
Performance Benchmarks
– SPECweb96/99
– WebStone
Cost
Ease of use or installation
Scalability
Security
Support of industry standards
Apache Configuration
All configuration through configuration files
Directives define options
Directives are organised into sections:
– Directory
– DirectoryMatch
– Files
– FilesMatch
– Location
– LocationMatch
IIS Configuration
IP address
TCP port
Home directory
Execute
Virtual directory
Default document
Directory browsing
Authentication control
Application mappings
Redirect to URL
Controlling Access
UNIX model
File permissions
– User, Group, Other
– Read, Write, Execute
In any combination per file and directory
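The user/group/other and read/write/execute combinations are usually written as an octal mode. As a small sketch, Python's `stat.filemode` turns a mode into the string that `ls -l` displays; the two modes shown are common choices for web content, not requirements.

```python
import stat

def permission_string(mode_bits, is_directory=False):
    """Render octal permission bits in the style of `ls -l`."""
    file_type = stat.S_IFDIR if is_directory else stat.S_IFREG
    return stat.filemode(file_type | mode_bits)

# 644: owner may read and write; group and other may only read.
print(permission_string(0o644))
# 755: owner has full access; group and other may read and enter the directory.
print(permission_string(0o755, is_directory=True))
```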
HTTP Authentication
Protect specific files and directories
Require user name and password
HTTP/1.1
– Basic authentication: no encryption across network
– Digest authentication: uses an MD5 hash, so the password itself is not sent
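The difference between the two schemes can be sketched in a few lines. Basic authentication only base64-encodes `user:password`, which anyone on the network can decode. Digest authentication instead hashes the credentials with MD5; the sketch shows only the first hash (`user:realm:password`), not the complete exchange with server nonces. The user name, realm and password are invented for the example.

```python
import base64
import hashlib

def basic_credentials(user, password):
    """Build the value of a Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def digest_ha1(user, realm, password):
    """First hash used by Digest authentication: MD5 of user:realm:password."""
    return hashlib.md5(f"{user}:{realm}:{password}".encode("utf-8")).hexdigest()

header = basic_credentials("alice", "secret")  # hypothetical credentials
print(header)
# Base64 is an encoding, not encryption: it decodes straight back.
print(base64.b64decode(header.split(" ", 1)[1]))
print(digest_ha1("alice", "widgets", "secret"))
```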
Secure Socket Layer
HTTPS
– HTTP using SSL
[Diagram labels: Server, Port 443]
Certificates
Supplies information about your site
Certificate Authority issues
– Verisign
– Thawte
Helps user trust a transaction with your server
Costs?
Virtual Hosts (1)
[Diagram labels: Name 1, Name 2, Name 3]
Virtual Hosts (2)
Name based
– One IP address, multiple names
– Web server distinguishes using hostname
IP address based
– Multiple IP addresses (and names)
– Not all servers support this
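Name-based virtual hosting amounts to a lookup on the hostname the browser sends in its Host header. The sketch below shows the idea only; the second site and all directory paths are hypothetical, and a real server such as Apache does this from its configuration files.

```python
# Hypothetical mapping of hostnames to document roots on one shared server.
SITES = {
    "www.vortexwidgets.com": "/usr/local/httpd/htdocs/vortex",
    "www.example.org": "/usr/local/httpd/htdocs/example",
}
DEFAULT_ROOT = "/usr/local/httpd/htdocs"

def document_root(host_header):
    """Pick the document root from the Host header of a request."""
    # Drop an optional ":port" suffix and ignore case.
    hostname = host_header.split(":")[0].lower()
    return SITES.get(hostname, DEFAULT_ROOT)

print(document_root("www.vortexwidgets.com"))
print(document_root("WWW.EXAMPLE.ORG:80"))
print(document_root("unknown.example.net"))
```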
Summary
We have covered:
– Choosing web server software
– Customising your web server
– Controlling access
– Secure Socket Layer configuration
– Virtual hosts
Server-Side Programming - 1
V0.1 ISA1-5.PPT
Internet Systems Administration
Session 5: Server-Side Programming
Contents
Dynamic Documents
CGI and Forms
Server-Side Includes
Active Server Pages
Servlets and Java Server Pages
Why Server-Side Programming?
Normal HTML is static
– only changes when you edit it
Some web sites need to present information that changes rapidly
Some web sites need to process data received from visitors
Server-Side Programming meets this requirement
The CGI Process
[Diagram of the exchange between browser, web server, gateway and user program]
– user requests a form: browser requests URL
– server sends HTML: form display
– form filled: browser sends fields
– field parameters passed through the gateway to the user program
– HTML results returned
– server HTML response: response displayed
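The program at the far end of the gateway can be sketched as follows. This is a simplified illustration in Python rather than the Perl used elsewhere in the course: under CGI a form submitted with GET arrives in the `QUERY_STRING` environment variable, and the program writes a Content-Type header, a blank line and then HTML. The field name `name` is hypothetical, and a real script would read `os.environ` and print the result.

```python
from html import escape
from urllib.parse import parse_qs

def cgi_response(environ):
    """Build the output of a CGI program from its environment."""
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    name = fields.get("name", ["visitor"])[0]
    # Escape visitor input before placing it in the page.
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    # Header, blank line, then the HTML results.
    return "Content-Type: text/html\r\n\r\n" + body

print(cgi_response({"QUERY_STRING": "name=Ada+Lovelace"}))
```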
CGI Scripting - Perl
CGI Form
CGI Form HTML
Server-Side Includes
Not all web sites need CGI
A small amount of data needs to be dynamic
Get the server to fill this in
Use special tags in HTML
– Directives
– Interpreted and replaced with data by the server
Server-Side Includes
Typical Server-Side Includes
INCLUDE
– Includes a file or a URL at this position
– Useful for maintaining a common look and feel
EXEC
– Execute a program and insert output at this position
ECHO
– Insert the value of a variable at this position
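What the server does with an ECHO directive can be imitated in a few lines. The sketch handles only the Apache-style form `<!--#echo var="NAME" -->`; the function name and the sample value are assumptions, and a real server supplies the variables itself.

```python
import re

# Matches directives of the form <!--#echo var="NAME" -->
ECHO = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')

def expand_echo(html, variables):
    """Replace each echo directive with the value of the named variable."""
    return ECHO.sub(lambda m: variables.get(m.group(1), "(none)"), html)

page = 'Last changed: <!--#echo var="LAST_MODIFIED" -->'
print(expand_echo(page, {"LAST_MODIFIED": "23-Oct-2000"}))
```

The browser never sees the directive, only the text the server substituted for it.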
Active Server Pages
Microsoft proprietary scripting environment
Embed scripts into documents
Server processes scripts
– issues ‘pure’ HTML
Use any language that supports COM
– VBScript, JScript, C++, Perl, Java
ASP Example
[Figure: an ASP script and the HTML it generates]
Servlets and Java Server Pages
Sun Microsystems Java language
– a well designed language
– suitable for both client and server side programming
Servlets – Java programs executed by the server
Java Virtual Machine (JVM)
– Interprets compiled Java code (bytecode)
Typical Java Code
Java Server Pages
Simpler than a full servlet
Similar to SSI
Example:
<HTML>
Today’s date is <%= new java.util.Date() %>
</HTML>
Code between <% %> is executed by server
Java Beans
A component model
Written in Java
Building block for an application
Can be used to provide common functionality wherever required
Similar in principle to Microsoft ActiveX controls
Summary
We have covered:
– Dynamic Documents
– CGI and Forms
– Server-Side Includes
– Active Server Pages
– Servlets and Java Server Pages
Log Files - 1
V0.1 ISA1-6.PPT
Internet Systems Administration
Session 6: Log Files
Contents
Log File Formats
Referrers
Being Proactive
Statistics
Log Files
Provide a lot of information about the usage of your web site
You can log any transaction
Debug server-side programs
Help you tune your web site
Logging Transactions
One line per transaction
Not computationally expensive
Can grow very large
– Store on a separate partition
– Rotate log files
Two main logging formats
– Common Logfile Format (CLF)
– Extended Logfile Format (ELF)
Common Logfile Format
remotehost rfc1413 authuser [date] "request" status bytes
– remotehost: client hostname or IP address
– rfc1413: the identity of the remote user (usually -)
– authuser: the authenticated user name (may be -)
– [date]: date and time of request
– request: the HTTP request line as it came from the client
– status: the HTTP status code returned by the server
– bytes: the content length of the returned document
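A log line in this format can be split into its fields with a regular expression. The sketch below is one possible pattern, not an official one, and the sample line is invented for the illustration.

```python
import re

# One named group per Common Logfile Format field.
CLF = re.compile(
    r'(?P<remotehost>\S+) (?P<rfc1413>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_clf(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF.match(line)
    return match.groupdict() if match else None

# Hypothetical log entry.
sample = ('192.0.2.10 - - [23/Oct/2000:02:17:00 -0400] '
          '"GET /support/industrial.html HTTP/1.0" 200 2326')
entry = parse_clf(sample)
print(entry["remotehost"], entry["request"], entry["status"], entry["bytes"])
```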
Combined Logfile Format
remotehost rfc1413 authuser [date] "request" status bytes referer user-agent
– referer: the URL that brought the user to this resource (the referrer)
– user-agent: the client that made the request (e.g. Netscape 4.5)
Extended Logfile Format
Allows the administrator to specify the fields to be logged
Example:
Referrers
How did visitors reach your web site?
What web page were they previously viewing?
Understanding referrers helps you to make your web site more accessible
Referrers
Visitor uses a search engine to look for "vortex widgets"
The referrer field might contain:
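As a sketch of how search terms can be recovered from a referrer, the code below parses the query string of a referring URL. The search engine address and the parameter name `q` are assumptions for the example; each search engine uses its own URL layout.

```python
from urllib.parse import parse_qs, urlsplit

def search_terms(referrer, param="q"):
    """Extract the search phrase from a referring search-engine URL."""
    query = parse_qs(urlsplit(referrer).query)
    # parse_qs decodes "+" and %xx escapes; return "" when the parameter is absent.
    return query.get(param, [""])[0]

# Hypothetical referrer for a visitor who searched for "vortex widgets".
print(search_terms("http://search.example.com/find?q=vortex+widgets"))
```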
Example Error Diagnosis (1)
Error is in line 7
As a result the form did not work
Example Error Diagnosis (2)
Error is in line 7
Web server is denied access
User denied access
Statistics
Getting information from data
Understand usage patterns for your web site
Obtain "hit counts"
Tune your web site to attract more users
Log File Analysis
Most requested pages
Top entry pages
Most used browsers
Bandwidth utilisation
Most active domains
Information about search engines
Top referring sites and URLs
Error counts
Statistics
You need tools to extract statistics from logs
Webalizer
– http://www.mrunix.net/webalizer
WebTrends
– http://www.webtrends.com
Wusage
– http://www.boutell.com/wusage
Using Webalizer (1)
Usage Statistics for www.mrunix.net
Summary Period: Last 12 Months
Generated 23-Oct-2000 02:17 EDT
Using Webalizer (2)
Summary by Month
Columns 2 to 5 are daily averages; columns 6 to 11 are monthly totals.
Month    Hits Files Pages Visits Sites  KBytes Visits  Pages   Files    Hits
May 1999 6377  5570   903    455 10484  884568  14119  28004  172671  197696
Apr 1999 6216  5394   858    419 10087  821968  12594  25758  161844  186504
Mar 1999 7530  6582  1046    499 12128 1052978  15480  32432  204059  233445
Feb 1999 4712  4128   656    321  6629  511793   8048  16419  103203  117816
Jan 1999 4470  3934   607    284  8079  605694   8808  18844  121980  138571
Dec 1998 2998  2673   411    197  6524  410110   6120  12769   82875   92951
Nov 1998 2910  2567   400    192  4260  346705   5588  11627   74468   84403
Oct 1998 3052  2668   457    202  2203  189253   2839   6399   37360   42738
Sep 1998 2072  1826   345    169  3475  314492   5075  10376   54807   62165
Aug 1998 1014   901   211    125  2693  196560   3890   6571   27958   31455
Jul 1998 1484  1325   302    184  4041  298225   5716   9383   41102   46019
Jun 1998 1707  1502   322    222  4809  251502   6675   9687   45077   51227
Totals                                 5883848  94952 188269 1127404 1284990
Summary
We have covered:
– Log File Formats
– Referrers
– Being Proactive
– Statistics
Search Engines, Robots & Automation - 1
V0.1 ISA1-7.PPT
Internet Systems Administration
Session 7: Search Engines, Robots and Automation
Contents
Search Engines
Publicising your site
Robots and Spiders
Automation
Objectives
How to increase usage of your site
How to enhance it
How to publicise it
How your site interacts with search engines
Search Engines
How can users find content within your site?
Navigation bar or table of contents
– simple but slow
A web site based search engine
– greatly increases usefulness of your web site
– easy to install
– offers other useful features
Search Engines
Excite for Web Servers (EWS)
– http://www.excite.com/navigate/
SWISH-E (Simple Web Indexing System for Humans – Enhanced)
– http://sunsite.berkeley.edu/SWISH-E/
AltaVista Search Intranet
– http://www.altavista-software.com/
Microsoft Index Server
– http://www.microsoft.com/NTServer/fileprint/exec/feature/Indexfaq.asp
Example – SWISH
Virtual Search Engine
Use an existing high capacity search engine
– Infoseek
Filter out all hits except those for your web site
No cost, no disk space, minimal effort
BUT
– Only if your site is properly indexed by the search engine
Installing a Search Engine
Indexing your site
Search Form
Search Engine
Publicising Your Site
Use tags well
<META>
– Use keywords to describe your site and capture hits
<TITLE>, <H1>, <H2>
– Search engines look at these
Publicity Tips
Register with the major search engines and directories
– Search engines match key words
– Directories match topics
– Metasearch engines use other search engines
Content needs to be useful/interesting
Advertise your site
Find sites that will link to yours
Controlling the Robots and Spiders
By default robots visit every page of your site
This may be what you want, but:
– what about dynamic content?
– what if you have too many matching keywords?
Use a robot exclusion protocol
Robots.txt
Specify which robots are controlled
Specify which directories are not allowed
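The effect of a robots.txt file can be tried out with Python's standard `urllib.robotparser`, which applies the same rules a well-behaved robot would. The file contents, the directories and the robot name `ExampleBot` are hypothetical; `User-agent` selects the robots being controlled and each `Disallow` line names a directory they must not visit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all robots are kept out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""

def allowed(url, agent="ExampleBot"):
    """Report whether the given robot may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

print(allowed("http://www.vortexwidgets.com/index.html"))
print(allowed("http://www.vortexwidgets.com/cgi-bin/search"))
```

Robots follow these rules voluntarily, so robots.txt is a courtesy mechanism and not a form of access control.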
Automation
System administrators have a lot of responsibility
– managing disk space
– checking for errors in the logs
– generating reports
– performing backups
Scripts can automate most of these
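As one example of such a script, the sketch below checks how full a file system is, using Python's standard `shutil.disk_usage`. The 90% threshold and the wording of the warning are arbitrary choices for the illustration; a scheduler could run a check like this at regular intervals.

```python
import shutil

def percent_used(path="/"):
    """Return the percentage of the file system holding path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def disk_warning(path="/", threshold=90.0):
    """Return a warning message when usage reaches the threshold, else None."""
    used = percent_used(path)
    if used >= threshold:
        return f"WARNING: {path} is {used:.1f}% full"
    return None

print(f"{percent_used('.'):.1f}% used")
print(disk_warning(".", threshold=90.0))
```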
UNIX Tools (1)
UNIX has strong tools and scripting languages
CRON
– Daemon that starts programs according to a schedule
– Crontab creates list of tasks
AT
– Run a command at a future time
– Windows NT has a version of AT
UNIX Tools (2)
Perl, Tcl
– very powerful, easy to use scripting languages
Shell scripts
– csh, ksh, etc
– files containing OS commands and programming features
Expect
– automation of command-line tasks where there is user interaction in response to system prompts
Summary
We have covered:
– Search Engines
– Publicising your site
– Robots and Spiders
– Automation