Evaluating Web Server Log Analysis Tools. David Strom. SD'98, 2/13/98.
SD'98 (c) David Strom, Inc. 2
Summary
• Examine different log files
• What you can and can’t learn from your logs
• Pros and cons of various tools
Access logs
• Domain name
• Date, time
• Server command processed and result
• URL the visitor requested
• Bytes transmitted
Sample access log data
• rm258.fav.usu.edu [31/May/1995:09:03:23 +0600] "GET /NEI.html HTTP/1.0" 302 396
• rm258.fav.usu.edu [31/May/1995:09:03:28 +0600] "GET /xculture/nei/nei.html HTTP/1.0" 200 2114
• rm258.fav.usu.edu [31/May/1995:09:03:30 +0600] "GET /gifs/sedlbutton.gif HTTP/1.0" 200 1336
• 129.71.83.161 [31/May/1995:09:20:32 +0600] "GET /RELs.html HTTP/1.0" 304 0
• Leslie-Francis.tenet.edu [31/May/1995:09:36:06 +0600] "GET / HTTP/1.0" 200 1867
• ls973.ulib.albany.edu [31/May/1995:09:40:52 +0600] "GET /viii1.html HTTP/1.0" 404 244
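The sample lines above can be split into fields with a short script. This is a minimal sketch that assumes the abbreviated layout shown on the slide (host, timestamp, request, status, bytes); the names `LINE_RE` and `parse_access_line` are made up for illustration, and a full common log format line would carry two more fields after the host.

```python
import re
from datetime import datetime

# Pattern for the abbreviated layout shown on the slide:
# host [timestamp] "request" status bytes
LINE_RE = re.compile(
    r'^(?P<host>\S+) \[(?P<when>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)

def parse_access_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # The request is "METHOD URL PROTOCOL"; split it into its three parts.
    method, _, rest = m.group("request").partition(" ")
    url, _, protocol = rest.partition(" ")
    size = m.group("size")
    return {
        "host": m.group("host"),
        "when": datetime.strptime(m.group("when"), "%d/%b/%Y:%H:%M:%S %z"),
        "method": method,
        "url": url,
        "protocol": protocol,
        "status": int(m.group("status")),
        "bytes": 0 if size == "-" else int(size),
    }

sample = ('rm258.fav.usu.edu [31/May/1995:09:03:23 +0600] '
          '"GET /NEI.html HTTP/1.0" 302 396')
record = parse_access_line(sample)
print(record["host"], record["status"], record["url"], record["bytes"])
```

Returning None for lines that do not match lets a caller skip damaged entries instead of stopping on the first one.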
Errors reported in your logs
• Clients that time out (or leave in frustration!)
• Scripts that don’t produce any output
• Server bugs
• User authentication or configuration problems
Sample error log data
• [Thu May 30 07:25:32 1996] send timed out for bamberg.sedl.org
• [Thu May 30 07:57:41 1996] send timed out for kenya.sedl.org
• [Thu May 30 08:23:11 1996] send timed out for ppp092.kyoto-inet.or.jp
• [Thu May 30 09:15:52 1996] access to /usr/local/www/htdocs/scimath/compass/vol03 failed for 170.211.67.51, reason: File does not exist
• [Thu May 30 09:57:56 1996] send timed out for dd10-048.compuserve.com
• [Thu May 30 10:47:25 1996] read timed out for ncia110b.ncia.net
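Error entries like these can be tallied to see which clients time out most often. The sketch below assumes the message wording shown in the sample ("send timed out for ..." and "read timed out for ..."); other servers word their errors differently, and `count_timeouts` is a hypothetical helper name.

```python
import re
from collections import Counter

# "[timestamp] message" as in the sample error log
ERROR_RE = re.compile(r'^\[(?P<when>[^\]]+)\] (?P<message>.*)$')
# Timeout wording taken from the sample; an assumption for other servers
TIMEOUT_RE = re.compile(r'^(?P<kind>send|read) timed out for (?P<host>\S+)$')

def count_timeouts(lines):
    """Count timeout entries per client host, ignoring other errors."""
    counts = Counter()
    for line in lines:
        entry = ERROR_RE.match(line.strip())
        if entry is None:
            continue
        timeout = TIMEOUT_RE.match(entry.group("message"))
        if timeout:
            counts[timeout.group("host")] += 1
    return counts

error_lines = [
    "[Thu May 30 07:25:32 1996] send timed out for bamberg.sedl.org",
    "[Thu May 30 09:15:52 1996] access to /usr/local/www/htdocs/scimath/"
    "compass/vol03 failed for 170.211.67.51, reason: File does not exist",
    "[Thu May 30 10:47:25 1996] read timed out for ncia110b.ncia.net",
]
timeouts = count_timeouts(error_lines)
for host, n in timeouts.most_common():
    print(host, n)
```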
Sample referral log data
• http://www.isisnet.com/ -> /change/welcome.html
• http://www.ipl.org/ref/RR/EDU/Research-rr.html -> /welcome.html
• http://www.tenet.edu/snp/main.html -> /policy/networks/toc.html
• http://www.tenet.edu/new/main.html -> /policy/networks/toc.html
• http://guide-p.infoseek.com/NS/Titles?qt=teacher+training -> /resources/SCIMAST/announcement.html
• http://www.tenet.edu/new/main.html -> /policy/networks/toc.html
• http://www.tenet.edu/new/main.html -> /policy/networks/toc.html
• http://www.nwrel.org/national/regional-labs.html -> /welcome.html
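Each referral entry pairs the page a visitor came from with the local page they reached, so counting the pairs shows which outside links bring the most traffic. A minimal sketch, assuming the "referrer -> local path" layout of the sample data; `parse_referral` is an illustrative name.

```python
from collections import Counter

def parse_referral(line):
    """Split 'referrer -> /local/path' into a (referrer, target) pair."""
    referrer, sep, target = line.partition("->")
    if not sep:
        return None  # no arrow: not a referral entry
    return referrer.strip(), target.strip()

referral_lines = [
    "http://www.isisnet.com/ -> /change/welcome.html",
    "http://www.tenet.edu/new/main.html -> /policy/networks/toc.html",
    "http://www.tenet.edu/new/main.html -> /policy/networks/toc.html",
    "http://www.nwrel.org/national/regional-labs.html -> /welcome.html",
]

pairs = Counter()
for line in referral_lines:
    parsed = parse_referral(line)
    if parsed is not None:
        pairs[parsed] += 1

for (referrer, target), n in pairs.most_common():
    print(n, referrer, "->", target)
```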
Common log format
• Output by most standard servers
• Needed by most third-party log analyzers
• hoohoo.ncsa.uiuc.edu/docs/setup/httpd/Overview.html
Extended/custom log formats
• Log whatever you wish in whatever order you wish
• Useful if you will read them regularly!
• But may not work with third-party analyzers
• Now in IIS v4, NSCP v3, others.
What you can learn from your log files
• Hits per day
• Domain origins
• The path people take in and around your web
• Problem areas
HITS
• (How Idiots Track Success)
• Nobody uses this word anymore
• Doesn’t really measure individual users, just access
• Caching servers and proxies mess up these statistics
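The gap between hits and visitors shows up even in the six sample access log lines. The sketch below counts raw hits, requests that are not images, and distinct hosts; the list of image extensions is an assumption, and a distinct host is still not the same thing as a distinct person, since a proxy can stand in for many users.

```python
# (host, url) pairs taken from the sample access log data
requests = [
    ("rm258.fav.usu.edu", "/NEI.html"),
    ("rm258.fav.usu.edu", "/xculture/nei/nei.html"),
    ("rm258.fav.usu.edu", "/gifs/sedlbutton.gif"),
    ("129.71.83.161", "/RELs.html"),
    ("Leslie-Francis.tenet.edu", "/"),
    ("ls973.ulib.albany.edu", "/viii1.html"),
]

# Assumed list of image extensions; adjust for your own site
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg")

hits = len(requests)
non_image_requests = sum(
    1 for _, url in requests if not url.lower().endswith(IMAGE_SUFFIXES)
)
unique_hosts = len({host.lower() for host, _ in requests})

print("hits:", hits)
print("non-image requests:", non_image_requests)
print("unique hosts:", unique_hosts)
```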
Domain origins
• Where users are coming from -- sometimes
• Just because they are from ibm.net doesn’t mean they work at IBM!
• Forgotten accounts, friends and family using the account
• Hacked user names
• Proxies don’t help here either
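A rough tally of domain origins can be built from the host field. This sketch keeps the last two labels of each host name, which is a simplifying assumption: it reports "or.jp" for a host such as ppp092.kyoto-inet.or.jp, and it lumps numeric addresses the server never resolved under "unresolved". The function name `origin` is illustrative.

```python
import ipaddress
from collections import Counter

def origin(host):
    """Return a crude domain label for a logged host."""
    try:
        ipaddress.ip_address(host)
        return "unresolved"  # numeric address, no name to classify
    except ValueError:
        pass
    labels = host.lower().rstrip(".").split(".")
    # Naive rule: keep the last two labels (wrong for names like x.or.jp)
    return ".".join(labels[-2:]) if len(labels) >= 2 else labels[0]

hosts = [
    "rm258.fav.usu.edu",
    "129.71.83.161",
    "Leslie-Francis.tenet.edu",
    "ls973.ulib.albany.edu",
    "dd10-048.compuserve.com",
]
by_origin = Counter(origin(h) for h in hosts)
for name, n in by_origin.most_common():
    print(name, n)
```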
The path people take in and around your web
• Search engines help sometimes
• Which search site was the most popular front door
• Who links to you and why
• Is there a pattern or a random walk?
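When the referrer is a search results page, the query string often carries the words the visitor typed. A minimal sketch using the Infoseek referrer from the sample referral data; the parameter name "qt" is taken from that one URL and is an assumption for any other search site, and `search_terms` is a hypothetical helper.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms(referrer, param="qt"):
    """Return (search site, query text or None) for a referrer URL."""
    parts = urlsplit(referrer)
    values = parse_qs(parts.query).get(param, [])
    return parts.netloc, (values[0] if values else None)

site, terms = search_terms(
    "http://guide-p.infoseek.com/NS/Titles?qt=teacher+training"
)
print(site, "|", terms)
```

Referrers without that parameter come back with None for the query, so ordinary links and search hits can be separated in one pass.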
Problem areas to deal with
• Broken links (locally)
• Broken outbound links
• Time outs (sunspots?)
What you can’t learn from your logs
• Who are these people, anyway?
  – No specific user names
  – Is it a bot or a real human?
• How long did they view a page?
  – Most people don’t spend much time on your web
  – Where did they go visit next?
What technologies are available?
• Built-in analyzer tools
• Sites that capture user info
• Secure sites with registration
• Build your own from perl
• Third-party tools
Built-in tools
• WebSite, website.ora.com
• IIS with Site Server, www.microsoft.com/iis
• Netscape servers, www.netscape.com
• Easy to use but limited
WebSite Professional v2
• Win NT, 95
• Best web server for learning about logs, best docs
• QuickStats module for instant analysis:
  – single report but nice set of information
  – shows today, last two days requests and unique hosts
  – IP addresses of visitors, average requests/hour
IIS Site Server
• NT Server v4 w/SP3 only
• Lots of preconfigured reports
• Two versions, Express and Full (customized reports)
• backoffice.microsoft.com/products/siteserver/express/
Netscape v3 web servers
• Various NT, Unix versions
• Reports for a few variables but nothing too extensive
• Best to use a third-party tool here
Sites that capture user info
• WebCounter, www.digits.com -- third-party hit counter
• Someone else does the programming and debugging
• But beyond your control
Secure sites with registration
• You know your users
• But many won’t register, or forget their passwords
• Requires scripting, database integration, more maintenance
Build your own from perl
• Needs some in-house support
• Works best with Unix-based webs
• Examples:
  – refstats, members.aol.com/htmlguru/refstats.html
  – surfreport, bienlogic.com/SurfReport/
Third-party tools
• WebTracker, www.CQMInc.com/webtrack
• WebTrends, www.webtrends.com
• net.Genesis, www.netgen.com
• MarketWave, www.marketwave.com
• IIS Assistant, www.go-iis.com