MANUAL Urchin v5x

Urchin 5.000 Urchin Administration/User Guide © Copyright 2003 Urchin Software Corporation. All rights reserved. Printed Date: 01/25/2005 03:03:02 Modified Date: 01/23/2005 11:41:26


Table of Contents

Chapter 1: Getting Started
  Welcome to Urchin!
  System Requirements
    Supported Platforms and Hardware Requirements
    Urchin Setup Requirements
  Installation
    Quickstart Installation Guide
    Installation Guide (Windows)
    Installation Guide (UNIX)
    Installation Guide (Mac OS X 10.2.x)
    Installation Guide (Sun Cobalt)
    Uninstalling Urchin 5
    Troubleshooting Install Problems
  Upgrades
    Upgrading Urchin 4
    Upgrading Urchin 3
    Upgrading Urchin 3 on Sun Cobalt
    Urchin 3, 4, & 5 Reporting Differences
    Upgrading Urchin 5
  Initial Configuration
    E-commerce Reporting
    Setup Recommendations

Chapter 2: Visitor Tracking
  Using UTM with E-commerce
  Visitor Identification Methods
  Urchin Traffic Monitor (UTM)
  Session-ID Identification
  UTM Quick-Install (Apache)
  Installing UTM On Every Page (Apache)
  UTM Quick-Install (IIS)
  Using UTM with Domain Aliases
  Using UTM with Multiple Sites
  Tracking Flash and Browser Events (UTM-5 only)
  Tracking Banner Ad Exits and Other Outbound Links

Chapter 3: Urchin Administration
  Administration Overview
  Profiles
    Importing Profiles (Windows)
    Working with Profiles
  Log Files
    Working with Log Sources
    Log Management
    Log Rotation Best Practices
    Logging - Apache and IIS
    Logging - iPlanet
    Logging: Tomcat (Apache Jakarta Project)
    Logging - Other Webservers
    Wildcard & Date Substitution in Log Path
    Processing Historical Logs
    Log Reprocessing
  Filtering
    Filtering Overview
    Filter Fields
    Exclude/Include Filters
    Decode URL Filters
    Search & Replace
    Lookup Table Filters
    Advanced Filters
    DynamicURL Filters (deprecated)
    Regular Expression Overview
  Affiliations, Users & Groups
    Working with Affiliations
    Working with Users & Groups
  Scheduling Tasks
    Working with the Task Scheduler
  System Settings
    Changing the Port Number
    Licensing Urchin
    DNS Database Update

Chapter 4: Reporting Interface
  Report-Side Filtering
  Reporting Interface Overview
  Exporting Data
  Date Range

Chapter 5: E-commerce Module
  E-commerce Overview
  ELF & ELF2 Log Formats
  Custom E-commerce Logs
  Visitor Correlation
  Cancelling E-commerce Transactions

Chapter 6: Campaign Tracking Module
  Campaign Tracking Overview
  The Five Dimensions of Campaign Tracking
  Step 1: Track Campaign Data (Set up UTM-3)
  Step 2: Install and License Campaign Tracking
  Step 3: Define a Conversion Goal
  Tagging Your Online Links 1-2-3
  Import Cost Data from Google
  Import Cost Data from Overture
  Adding Cost and Impression Data
  How To Analyze Keyword Buying
  How To Track Content-Targeted Ads
  How To Track Email Campaigns
  How To Use Master Tracking Codes
  URL Builder
  Implementation Checklist

Chapter 7: Advanced Topics
  Utilities
    Administration Utilities Overview
    geo-update: DNS Database Update Utility
    inspector: Urchin Installation Integrity Checker
    u3importer: Urchin 3 Data Import Utility
    uconf-driver: Configuration Management Utility
    uconf-export: Text-based Configuration Export Utility
    uconf-import: Text-based Configuration Import Utility
    uconf-schedule: Global Scheduling Utility
    udb-sanitizer: Database Maintenance Utility
    urchinctl: Urchin Services Control Utility
    urchin: Urchin Log Processing Engine
  Integration
    NFS locking requirement
    Overview of Urchin Integration Capabilities
    Changing the Location of the Urchin Data Directory
    Using an Existing Apache Webserver (UNIX-type Platforms)
    Using an Existing IIS Webserver (Windows Platforms)
    Using External Authentication or Authentication Bypass
    Linking Directly to Urchin Reports
    Script-based Configuration Management Overview
    Data Export
  Customization
    Custom Log Formats
    Custom Navigation
    Custom Reports
    Custom Date/Time Formats
    Custom DNS Entries
    Custom Lookup Tables
    Cobranding Urchin
  Hosting Automation Solutions
    How are H-Sphere and Urchin 5 Integrated?
    Using Urchin with Plesk PSA 5.0
    Ensim Webppliance
    Sphera's HostingDirector
  Performance & Tuning
    Global Filtering of Hits from Monitoring Software
    Reducing Disk Storage for Urchin Profile Monthly Databases
  Security Features
    Activating SSL on the Urchin Webserver

Chapter 8: Reference
  Integer Field List
  Regular Field List
  Regular Report List
  Configuration Table and Directive List
  Error code list for failed FTP and HTTP remote webserver log transfers


Chapter 1: Getting Started

Welcome to Urchin!

Urchin 5 represents seven years of development and is, in our view, the most advanced web analytics package available today. Combining proven datacenter-class performance with unprecedented ease of use, Urchin 5 is the best choice for businesses and hosting providers of all sizes.

What is Urchin?

Urchin is a web analytics system designed to enable businesses to easily analyze the traffic to their website(s) and create detailed, insightful, and intuitive reports. At its core, Urchin is a log-analysis program, but its sophisticated unique visitor reporting goes far beyond what was previously available.


How Does Urchin Work?

Urchin consists of four primary components:

• The Admin Server
• The log-processing and DNS resolution engine
• The Visitor Interaction Data Architecture (VIDA) database
• The Scheduler

The Admin Server is Urchin's nerve center. It is a web-based control panel system, powered by a customized Apache web server, that controls all the other Urchin components. With the Admin Server, you can access and control the Urchin system from any computer on the Internet (by turning on remote access and reporting).

The log-processing and DNS resolution engine does the heavy lifting in the Urchin system, converting large raw log files into meaningful data, translating IP addresses to domains, and entering that information into the VIDA database.

The VIDA system is our highly specialized, optimized, proprietary database for quickly entering and extracting web analytics data. This analytics-specific database is a significant part of Urchin's speed advantage over the competition.

The Scheduler regularly checks the configuration database for scheduled tasks that need to be run, and executes Urchin to process them at their scheduled times.
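
To make the DNS resolution step more concrete, the following Python sketch shows one way a log processor might translate visitor IP addresses to host names while looking up each distinct address only once. It is an illustration only, not Urchin's actual engine: the function names (default_resolver, resolve_hits) and the fall-back-to-raw-address behavior are assumptions made for this example.

```python
import socket
from typing import Callable, Dict, Iterable


def default_resolver(ip: str) -> str:
    """Reverse-resolve an IP address using the system resolver."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    return hostname


def resolve_hits(
    ips: Iterable[str],
    resolver: Callable[[str], str] = default_resolver,
) -> Dict[str, str]:
    """Map each distinct IP to a host name, looking each one up only once.

    Addresses that cannot be resolved are mapped to themselves
    (an assumption for this sketch, not documented Urchin behavior).
    """
    cache: Dict[str, str] = {}
    for ip in ips:
        if ip in cache:
            continue  # already resolved; skip the repeat lookup
        try:
            cache[ip] = resolver(ip)
        except OSError:
            cache[ip] = ip  # keep the raw address when the lookup fails
    return cache
```

Passing the resolver as a parameter keeps the sketch testable without network access; a real log processor would typically also persist the cache between runs.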

Who should use Urchin?

Urchin is ideal for any individual or business that has access to its website's log file(s) and HTML. If you do not have access to your site's log file(s), ask your hosting provider to install Urchin. It is very popular among hosts. Contact [email protected].

System Requirements

Supported Platforms and Hardware Requirements

Urchin runs on numerous architectures and operating systems. An Urchin installation is only needed on a system that will be processing logs. For viewing reports, only a web browser is required.

Supported Platforms


Windows

• Windows 2003 Server
• Windows XP
• Windows 2000 (Professional and Server)
• Windows NT 4.x

UNIX-type Systems

• Mac OS X (10.1 and higher)
• Mac OS X Server (10.1 and higher)
• Linux x86
    ♦ RedHat Enterprise 3.0, RedHat 9, RedHat 8, RedHat 7.x, RedHat 6.x
    ♦ Fedora Core 2, Fedora Core 1
    ♦ SuSE 9
    ♦ Other Linux OSes should be compatible; see the list in the Non-Explicitly Supported Platforms section
• FreeBSD 5.2, FreeBSD 4.x
• Solaris 2.6, 7, 8, 9 (SPARC)
• Solaris 9 (x86)
• Sun Cobalt RaQ550, Qube3, RaQ4, RaQ3

Anticipated OS Support

The following OSes should have a native build of Urchin released in the timeframe noted for each one:

• FreeBSD 5.3 - first quarter 2005
• Solaris 10 - first quarter 2005

If you don't see your OS listed, and a substitute cannot be found in the compatibility list in the next section, contact us to suggest it as a possible inclusion.

Non-Explicitly Supported Platforms

We strive to make Urchin available natively on as many platforms as is economically reasonable. If there is no specific Urchin distribution for your platform, you may find an available Urchin distribution that is compatible with your OS as explained below.

Windows 98, Windows 3.x:

Urchin cannot be installed on Windows 98 or 3.x, but these platforms can be used to view reports with Internet Explorer 4.x and newer.

Linux:

There are many different variants of Linux and we don't build an individual Urchin distribution for all of them. However, there is typically a high degree of compatibility across Linux flavors, so one of our distributions almost certainly will work on your machine. Some known compatible distributions are:


• RedHat Enterprise Linux 2.1: use the RedHat 7.2 distribution of Urchin
• SuSE Linux 8: use the RedHat 7.2 distribution of Urchin

For all other x86-based Linux variants you can determine which Urchin distribution to use by looking at our FAQ article on this topic.

Solaris:

• For SPARC systems, any OS release prior to Solaris 2.6 is not supported.
• For x86 systems, any OS release prior to Solaris 9 x86 is not supported.

Urchin 5 System Requirements

Urchin's superior performance allows you to get more from less hardware investment. For instance, an older Pentium II might be too slow for desktop use, but will make a fine Urchin server. And Urchin's unmatched portability means you can use whichever operating system you like. Below, we provide a recommended level of hardware for high performance.

Recommended Systems

Single Small to Medium Website Analysis

• 500 MHz or better processor
• 128 MB RAM
• 10 GB+ IDE hard disk
• Ethernet interface

Service Provider / Enterprise Installations

• 1 GHz Pentium IV / 500 MHz UltraSPARC / PPC, MIPS, or other processor in a similar MHz range
• 256 MB RAM
• Ultra2/Wide SCSI hard disk (such as a Seagate Cheetah)
• 100BASE-T Ethernet
• Backup system

Memory/System/Disk Usage

• Urchin's memory (RAM) usage can be configured to between 20 and 500 MB
• Urchin can be configured to run at low, normal, or high priority
• Urchin's data storage will use approximately 10% of the size of the raw logs
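
The 10% figure above lends itself to a quick capacity estimate. The Python sketch below is a back-of-the-envelope calculation only: the function name estimate_urchin_storage and the sample volume of 2 GB of raw logs per month are assumptions for illustration, and actual usage will vary with the site and configuration.

```python
def estimate_urchin_storage(raw_log_bytes: int, ratio: float = 0.10) -> int:
    """Estimate report-database size as a fraction of raw log size.

    The default ratio of 0.10 reflects the "approximately 10%" guideline.
    """
    if raw_log_bytes < 0:
        raise ValueError("raw_log_bytes must be non-negative")
    return round(raw_log_bytes * ratio)


# Hypothetical example: 2 GB of raw logs per month, kept for 12 months.
monthly_logs = 2 * 1024**3
yearly_estimate = 12 * estimate_urchin_storage(monthly_logs)
print(f"Estimated database storage after one year: {yearly_estimate / 1024**3:.1f} GB")
```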

Urchin Setup Requirements


This article lists the operational issues that should be anticipated prior to installing and running Urchin. Some of the information is required to operate Urchin successfully. Other items are important for using Urchin most effectively once the software is installed.

Basic Urchin Installation Considerations

• On Windows you must install while logged in as the Administrator.
• On UNIX-type systems you may install as any user, but if you do not install as the superuser, you will be restricted in what areas of the file system you may install.
• Urchin comes bundled with an Apache webserver binary for configuration and report delivery. Your systems administrators should be aware that this new web service will be running after Urchin is installed.
• Although the Urchin distribution itself is small, taking up only about 25 megabytes, you should install in a disk location that has plenty of room (e.g. several hundred megabytes at least) to allow for the growth of the Urchin databases over time. See the Performance and Management Issues section for additional considerations.
• If you are upgrading from Urchin 3, you will need to import your databases into Urchin 5 using the u3importer utility. There is no direct upgrade of Urchin 3 to Urchin 5 simply by running the Urchin 5 installer. See Upgrades in the Getting Started section of the Documentation Center.
• Upgrading from Urchin 3 or Urchin 4 to Urchin 5 requires relicensing your product.

Basic Urchin Processing Considerations

• Access to webserver logs - you must know the path to the log files for a given site, and you must have permission to access these files. If the logs are on a remote system, then you will also need an account name and password to use when retrieving the logs.
• Properly configured log format - although Urchin can process custom log formats, you will simplify the management requirements if you configure your webserver to log in a standard format. It is recommended that you use either Extended Combined Log Format (e.g. NCSA or Apache logs), or W3C Extended Log Format (e.g. IIS logs). For IIS sites, logging of Process Accounting should be turned off. See the Advanced Configuration section for additional considerations.
• Unique user account for Urchin processes - on UNIX-type systems it is desirable to enhance security by having Urchin programs run as a special user id that is used exclusively for Urchin and has only limited privileges. Setting up such an account will require that you have elevated or superuser privileges on the system in question.
• Scheduling - you will need to choose a run schedule for Urchin processing that delivers reports in a timely fashion and allows for the time needed to process large data sets.
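
As an illustration of why a standard log format simplifies processing, the Python sketch below parses a single line in the NCSA combined layout (host, identity, user, timestamp, request, status, size, referrer, user agent). The optional trailing quoted cookie field is an assumption made for this example; check your webserver's actual log format definition before relying on it. The name parse_combined and the sample line are hypothetical.

```python
import re
from typing import Dict, Optional

# Standard combined log layout, plus an optional quoted cookie field
# at the end (the cookie field is an assumption for this sketch).
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
    r'(?: "(?P<cookie>[^"]*)")?\s*$'
)


def parse_combined(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the named fields of one log line, or None if it does not match."""
    match = COMBINED_RE.match(line)
    return match.groupdict() if match else None


sample = (
    '192.0.2.10 - - [23/Jan/2005:11:41:26 -0700] '
    '"GET /index.html HTTP/1.1" 200 5120 '
    '"http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"'
)
fields = parse_combined(sample)
print(fields["host"], fields["status"], fields["request"])
```

Lines that do not match the expected layout return None, which gives a simple way to spot a misconfigured log format before scheduling regular processing.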

Advanced Urchin Processing Considerations

• If you desire Unique Visitor tracking then you will have to perform the following basic steps:
    ♦ Install the UTM sensor code in the web pages on your site
    ♦ Activate cookie logging in the log format for your webserver
    ♦ Set the tracking methodology in the Urchin Profile for the website to be UTM
• If you choose not to use Unique Visitor tracking then you should consider what level of granularity you desire for visitor or session reporting, and select the appropriate alternative Visitor Tracking Method for each site. Besides UTM the choices are IP only, IP/User-Agent (the default), Session ID, or Username.
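
The difference in granularity between these methods can be sketched as a choice of grouping key. The Python example below is a simplified illustration, not Urchin's implementation: it ignores session timeouts, covers only three of the methods, and the dictionary keys (host, agent, user) and function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Hit = Dict[str, str]


def visitor_key(hit: Hit, method: str = "ip_user_agent") -> Tuple[str, ...]:
    """Build the key that decides which hits count as the same visitor."""
    if method == "ip":
        return (hit["host"],)
    if method == "ip_user_agent":
        return (hit["host"], hit["agent"])
    if method == "username":
        return (hit["user"],)
    raise ValueError(f"unsupported method: {method}")


def count_visitors(hits: Iterable[Hit], method: str = "ip_user_agent") -> int:
    """Count distinct visitors under the chosen identification method."""
    return len({visitor_key(hit, method) for hit in hits})


# Two browsers behind one shared IP address (hypothetical data).
hits = [
    {"host": "192.0.2.10", "agent": "Mozilla/4.0", "user": "-"},
    {"host": "192.0.2.10", "agent": "Opera/7.0", "user": "-"},
]
print(count_visitors(hits, "ip"), count_visitors(hits, "ip_user_agent"))
```

With IP-only identification the two browsers collapse into one visitor, while IP/User-Agent keeps them apart, which is why the finer-grained key is a sensible default when cookies are not available.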


Performance and Management Issues

• Log rotation - if you do not have some external mechanism for archiving or removing webserver logs after they have been processed by Urchin, you can configure Urchin to perform this task in the Advanced Settings for each Log Source.
• Retaining past Urchin databases for historical reporting - once the databases for a given month are created they are available from then on for historical analysis. Users should consider how far back they need to keep historical data so they can plan for purging unnecessary data to save disk space. Urchin can be configured to compress databases that are older than a certain date.
• Memory requirements - Urchin has configuration controls to limit the amount of RAM it utilizes when processing logs. The default is set to 20 MB, which may be too conservative for sites with logs greater than 10 MB in size. Plan to have sufficient system RAM so that you may increase Urchin memory usage as needed and tune the software's memory settings for maximum processing performance.
• Location of Urchin data storage - utilizing the etc/urchin.conf file, Urchin can be configured so that the report databases are stored in a file system area outside the Urchin distribution. This allows you to allocate sufficient dedicated file system space for database growth where it's most convenient.
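
If you prefer an external mechanism for archiving processed logs, a small script can do the job. The Python sketch below compresses one processed log into an archive directory and then removes the original. It is an assumption-laden illustration: the function name archive_log and the directory layout are hypothetical, and it should only be run on logs that Urchin has already finished processing.

```python
import gzip
import shutil
from pathlib import Path


def archive_log(log_path: Path, archive_dir: Path) -> Path:
    """Gzip a processed log into archive_dir and delete the original."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / (log_path.name + ".gz")
    with log_path.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    log_path.unlink()  # reached only if the compressed copy was written
    return target
```

A call such as archive_log(Path("access.log"), Path("archive")) would leave archive/access.log.gz in place of the original file.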

Remote Access and Integration Issues

• Using SSL for Urchin administration and reporting - the webserver that is bundled with Urchin is compiled with support for SSL. SSL is not activated in the default configuration, but the user can turn it on as desired.
• Firewall configuration - if your network topology includes firewalls, proxy servers, and other elements that sit between the Urchin processing server and users trying to view reports, or between that server and systems that hold logs that need to be retrieved, then those devices will have to be configured so that they don't interfere with Urchin's remote access. This typically can be done without subverting the security that such a topology is intended to provide.

Installation

Quickstart Installation Guide

This Quickstart article is for first-time installers of Urchin. If you have an existing installation, read the Upgrades section.

When you have completed the installation steps, log in to the Urchin administration interface to perform configuration. The initial username and password are:


Username: admin
Password: urchin

Reset the password during your initial configuration in the Setup Wizard.

If you require unique visitor and session tracking, complete the steps in this Quickstart Guide and continue with the UTM Quick-Install article in the Visitor Tracking section.

Installing on Windows Systems

• Go to www.urchin.com and click the Download link.
• Download the Urchin for Windows installer to your desktop.
• Once the download has completed, double-click the installer file to start the InstallShield® wizard.
• Follow the on-screen instructions. The defaults should be acceptable for most installations.
• Once the installer has completed, go to Start -> Programs -> Urchin -> Urchin Administration and log in. You will get a License Urchin screen. Click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process.
• Once licensing is completed you will be presented with a Setup Wizard. Follow the instructions to complete your initial configuration. Please make sure to reset the password for the admin account and to record this password somewhere for safekeeping.
• When the Setup Wizard has completed you'll be taken to the Profile configuration screen. Click Add to create a new Profile.
• Once you have created the appropriate Profiles, you're ready to start processing logs so that you can view Report data for your sites. To access the administration interface remotely or for users to see their individual reports, use the URL http://yourhost:9999, where yourhost should be replaced by the name of the system where Urchin is installed.

Installing on UNIX-type Systems

• Go to www.urchin.com and click the Download link.
• Select the installer for the OS type that most closely matches your platform. The name of the installer image will include the Urchin version and the operating system type (e.g. urchin5000_freebsd4x.sh, urchin5000_redhat9.tar.gz).
• If necessary, upload the installer to a temporary location on the system on which you are installing Urchin.
• If you are not on the system's console, telnet (or use ssh if available) to the system and cd to the directory where the installer is located.
• Installers will have either a .sh or a .tar.gz suffix. Depending on the type of installer you will do one of the following:

For a shell archive (e.g. urchin5000_freebsd4x.sh) simply type the name of the file like so:

./urchin5000_freebsd4x.sh

This will unpack several files that comprise the installation kit.

For a tar.gz image (e.g. urchin5000_redhat9.tar.gz), uncompress and unpack the installation files with the commands:

gunzip urchin5000_redhat9.tar.gz
tar xf urchin5000_redhat9.tar

• From the command line execute the main installation script by typing:

./install.sh

• The script will prompt you for input as needed; just follow the instructions.
• When the installer has finished, you will be given the URL to access the Urchin administration interface, as well as the default admin password.
• Copy/paste the URL into a browser window, and enter the admin username and password to start configuring Urchin. You will get a License Urchin screen. Click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process.
• Once licensing is completed you will be presented with a Setup Wizard. Follow the instructions to complete your initial configuration. Please make sure to reset the password for the admin account and to record this password somewhere for safekeeping.
• When the Setup Wizard has completed you'll be taken to the Profile configuration screen. Click Add to create a new Profile.
• Once you have created the appropriate Profiles, you're ready to start processing logs so that you can view Report data for your sites. To access the administration interface remotely or for users to see their individual reports, use the URL http://yourhost:9999, where yourhost should be replaced by the name of the system where Urchin is installed.

Installing on Mac OS X 10.2.x Systems

• Go to www.urchin.com and click the Download link.
• Download the Urchin installation archive for Mac OS X 10.2.x.
• If the installer is downloaded directly via a browser to the system where it will be installed, an Urchin 5 folder will automatically be created on the desktop. If it is downloaded via some other mechanism such as ftp, double-clicking the installation archive icon will unpack the archive and create the desktop folder.
• Open the Urchin 5 folder, and double-click the Urchin.mpkg file, which will launch an interactive installation process. You must use an account with administration privileges to install.
• At the end of the installation a browser will launch and take you to the Urchin administration screen. You will get a License Urchin screen. Click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process.
• Once licensing is completed you will be presented with a Setup Wizard. Follow the instructions to complete your initial configuration. Please make sure to reset the password for the admin account and to record this password somewhere for safekeeping.
• When the Setup Wizard has completed you'll be taken to the Profile configuration screen. Click Add to create a new Profile.
• Once you have created the appropriate Profiles, you're ready to start processing logs so that you can view Report data for your sites. To access the administration interface remotely or for users to see their individual reports, use the URL http://yourhost:9999, where yourhost should be replaced by the name of the system where Urchin is installed.

Installing on Sun Cobalt Systems

• Use a web browser from your desktop system to connect to http://www.urchin.com/download/urchin5
• On the download page select the installer that most closely matches your platform. The name of the installer will include the Urchin version and the Sun Cobalt system type. For example:

urchinc-5.0.00_cobalt_raq550.i386.pkg

• Save the .pkg file to your desktop or to a temporary folder.
• Using your browser, connect to the main Site Administrator's page for your Cobalt box.
• Navigate to the section of the interface for installing new third party software. The location of this area in the Cobalt interface will be platform specific:

    RaQ 3, RaQ 4 - click on Maintenance in the left hand frame, then click Install Software in the top row

    Qube 3, RaQ 550 - click on the BlueLinQ tab, then click Third Party Software, then click the Install Manually button

    XTR - click on the BlueLinQ tab, then click New Software, then click the Install Manually button

• Prepare and launch the package installer:

    RaQ 3, RaQ 4 - In the Software Package box select the Upload radio button, then click the Browse button to the right and navigate to the location on your desktop system where you saved the .pkg file you downloaded. Once you click the Open button in the browse window, the pkg filename will be entered into the Upload text box. Then click the "Install a pkg Package" button. When the installation is finished an Urchin link will appear in the lower box for installed software.

    Qube 3, XTR, RaQ 550 - In the Location box select the Upload radio button, then click the Browse button to the right and navigate to the location on your desktop system where you saved the .pkg file you downloaded. Once you click the Open button in the browse window, the pkg filename will be entered into the Upload text box. Then click the Prepare button. Once the package has been prepared, an Install Software window will appear. In this window click the Install button. When the installation is finished Urchin 5 will be listed under the Programs tab.

• Click on the Urchin link in your Cobalt administration interface and the Urchin administration login window will appear. Enter the admin username and password to start configuring Urchin. You will get a License Urchin screen. Click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process.
• Once licensing is completed you will be presented with the Setup Wizard. Follow the instructions to complete your initial configuration. Please make sure to reset the password for the admin account and to record this password somewhere for safekeeping.
• When the Setup Wizard has completed you'll be taken to the Profile configuration screen. Click Add to create a new Profile.
• Once you have created the appropriate Profiles, you're ready to start processing logs so that you can view Report data for your sites. To access the administration interface remotely or for users to see their individual reports, use the URL http://yourhost:9999, where yourhost should be replaced by the name of the system where Urchin is installed.

Installation Guide (Windows)

The installer is an executable which guides you through all the steps necessary to install Urchin. The basic components of the Urchin 5 installation process are:

• Creating the distribution directory and unpacking the files
• Installing and starting an Apache webserver as an NT service to allow web-based configuration and report delivery
• Installing and launching the Urchin task scheduler, which manages log processing jobs, as an NT service
• Initial configuration and demo licensing of Urchin via the administration interface

Installation Preparation

You must be logged in as Administrator on the console of your system in order to install Urchin. By default the Urchin webserver service will use port 9999 when it launches. You will have the option of choosing a different port number during installation. Please verify that any port you choose does not conflict with existing operational services on your system.

You will need access to the Internet from your machine. Internet access is required to complete the demo licensing and activate your Urchin distribution once it is installed.

Installation Instructions

If you are upgrading an existing installation of Urchin, please consult the Upgrades section of the Documentation Center for relevant details.

Double-click on the urchin5XXX_win_setup.exe (e.g. urchin5000_win_setup.exe) icon to launch the installer, and follow the instructions in the dialog screens.

Initial Configuration Using the Administration Interface

Once Urchin is installed you can connect to your Urchin administration interface by going to the Start Menu, and selecting Programs->Urchin->Urchin Administration. Alternatively, you can enter the direct URL

http://localhost:port_number

into your browser, where port_number is either 9999 or a number you may have chosen during the installation. When you initially connect to the configuration interface, you will be presented with a License Urchin wizard. You should click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process. When finished with the license process, you will be returned to the Urchin administration interface, where you will be led through a Setup wizard that will set some required initial configuration parameters.

Remote Access Configuration

If you connect to the Urchin configuration interface by using the hostname in the URL (e.g. http://yourhost:9999 instead of http://localhost:9999) the program will detect this as a remote access (even if you are on the console of the machine you're connecting to) and will prompt you for a username and password. The default settings for logging in with administration privileges are:

Username: admin
Password: urchin

Managing Urchin Services

There are two programs that are installed as NT services, the Urchin Task Scheduler and the Urchin Webserver. These services may be manually stopped and started by using the Disable Services and Enable Services shortcuts under Start Menu->Programs->Urchin. When these shortcuts are used both services are simultaneously turned off or on.

User Access to Reports

Users should use URL http://yourhost:portnumber, where yourhost is the name of the system where Urchin is installed and portnumber is the port number for the Urchin webserver (9999, unless you specified a different number during installation).

Advanced Reporting Options

If you require unique visitor and session tracking, continue with the Visitor Tracking section of the Documentation Library. If you would like to know about processing e-commerce data, please see the E-commerce Module section as well.

Installation Guide (UNIX)

The basic components of the Urchin 5 installation process are:

• Creating the distribution directory, unpacking the files, and setting appropriate ownership and file permissions
• Configuring and launching an Apache webserver to allow web-based configuration and report delivery
• Launching the Urchin task scheduler daemon, which manages log processing jobs
• Initial configuration and demo licensing of Urchin via the administration interface

The installer image you download is in the form of an archive, which will unpack into an install script, some support files, and the Urchin distribution. Urchin can be installed by any legitimate user on your system. It neither expects nor requires any special system privileges either to install or operate, and is specifically designed to run as a non-root user for security reasons.

Installation Preparation

You may install as any user, with the exceptions that you will have to install as the superuser if you install in a directory that has write access restrictions, or if you configure your webserver to respond to requests on a port number that is lower than 1025. Only the superuser can configure the webserver with a port number lower than 1025. Please verify that the port you choose does not conflict with existing operational services on your system. The installation process will attempt to check for conflicts.
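The port rule above can be checked before running the installer. The following is a minimal sketch, not part of the Urchin distribution; the function names port_needs_root and can_install_on_port are hypothetical, and the 1025 threshold is taken directly from the text above.

```shell
#!/bin/sh
# Sketch only: mirrors the manual's rule that a webserver port lower
# than 1025 can be configured only by the superuser.
# Both function names are hypothetical helpers, not Urchin commands.

# port_needs_root PORT: succeeds if PORT requires the superuser.
port_needs_root() {
    [ "$1" -lt 1025 ]
}

# can_install_on_port PORT UID: succeeds if a user with numeric id UID
# may choose PORT for the Urchin webserver.
can_install_on_port() {
    if port_needs_root "$1" && [ "$2" -ne 0 ]; then
        return 1
    fi
    return 0
}
```

For example, can_install_on_port 9999 "$(id -u)" succeeds for any user, while can_install_on_port 80 "$(id -u)" succeeds only when run as root. The sketch does not detect whether another service already occupies the port.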

If you are installing as root, you will also be asked for a user account name and a group name, which are used in the configuration file for the webserver, and also used to set the ownership on the installed Urchin distribution. The user and group names you select must be valid logins recognized by your system; you cannot choose arbitrary names for these. You also are not allowed to use root as the login to own the Urchin files for system security reasons. If you are not logged in as root while installing, you will typically not have the privileges to set the ownership of the files to the user of your choice. The install script will automatically detect this and install the distribution with your login as the owner of the files.

Lastly, you will need access to the Internet from your machine, since it is required for you to connect to the urchin.com site to complete the demo licensing and activate your Urchin distribution once it is installed.

Installation Instructions

The installer archive could be either a .tar.gz or a .sh archive, depending on your OS, and will be labeled with a name that identifies it for your OS type (e.g. urchin5000_freebsd4x.sh, urchin5000_redhat9.tar.gz). Copy the archive to any writeable area on your system and depending on your install image type do one of the following:

For a shell archive (e.g. urchin5000_freebsd4x.sh) simply type the name of the file like so:

./urchin5000_freebsd4x.sh

If you get a "Permission Denied" error, then run the command in this fashion:

sh ./urchin5000_freebsd4x.sh

For a .tar.gz image (e.g. urchin5000_redhat9.tar.gz), uncompress and unpack the installation files with the commands:

gunzip urchin5000_redhat9.tar.gz
tar xf urchin5000_redhat9.tar

Once the archive has been unpacked you should have the following files:

• install.sh (the installation script)
• install.txt (instructions similar to this document)
• license.txt (legal restrictions, licensing, and purchasing info)
• inspector (verifies the installed distribution)
• gunzip (supplied to unpack urchin.tar.gz)
• urchin.tar.gz (a tarred and compressed Urchin distribution)
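Before running the installer it can be useful to confirm that the kit unpacked completely. The following is a minimal sketch that checks a directory for the six files listed above; check_kit is a hypothetical helper name, not a tool shipped with Urchin.

```shell
#!/bin/sh
# Sketch only: check_kit is a hypothetical helper, not an Urchin tool.
# check_kit DIR: report any file from the manual's list that is absent
# from DIR; succeeds only when all six files are present.
check_kit() {
    missing=0
    for f in install.sh install.txt license.txt inspector gunzip urchin.tar.gz; do
        if [ ! -e "$1/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_kit . in the directory where the archive was unpacked prints one line per missing file and returns a non-zero status if anything is absent.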

To install simply type:

./install.sh

and follow the instructions.
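The choice between the two unpacking methods depends only on the installer's suffix, so it can be sketched as a small helper. This is an illustration under stated assumptions: unpack_plan is a hypothetical name, and it only prints the commands from this guide rather than running them. For shell archives it prints the "sh ./name" form, which the text above gives as the fallback when a "Permission Denied" error occurs.

```shell
#!/bin/sh
# Sketch only: unpack_plan is a hypothetical helper, not an Urchin tool.
# unpack_plan FILE: print the unpacking commands this guide gives for
# the installer type identified by FILE's suffix.
unpack_plan() {
    case "$1" in
        *.tar.gz)
            echo "gunzip $1"
            # gunzip leaves a .tar file, which tar then unpacks
            echo "tar xf ${1%.gz}"
            ;;
        *.sh)
            # "sh ./name" also works when the file is not executable
            echo "sh ./$1"
            ;;
        *)
            echo "unknown installer type: $1" >&2
            return 1
            ;;
    esac
}
```

For example, unpack_plan urchin5000_redhat9.tar.gz prints the gunzip and tar commands shown earlier. After unpacking with either method, the next step is still ./install.sh.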

Initial Configuration Using the Administration Interface

The installation script will start the Urchin webserver and Task Scheduler daemons. Once they are started you can connect to your Urchin administration interface by using the URL http://yourhost:9999, where yourhost is the DNS hostname for your system. If you have changed the default port number from 9999 to some other port during the installation, then you should use that port number in the URL. You will get a login screen. Use these initial login values:

Username: admin
Password: urchin

Upon initial login, the interface will take you to a License Urchin wizard. You should click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process. When finished with the license process, you will be returned to the Urchin administration interface where you will be led through a Setup wizard that will set some required initial configuration parameters.

Managing Urchin 5 Services

There are 2 daemons, urchind and urchinwebd, that need to be running in order for log processing, reporting, and configuration administration to occur. These daemons are stopped and started by the urchinctl program in the bin subdirectory of your Urchin distribution. To start or stop both daemons, use:

./urchinctl start

./urchinctl stop

You can also specify to only start or stop one daemon at a time by using a -w option for the webserver or a -s option for the scheduler. To see all of the available options, execute urchinctl with the -h option.

Any errors encountered when one of the daemons is launched should be reported on the command line. For the urchinwebd daemon, once you think it is running successfully, you should also check the var/error_log file for any startup problems.
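The start, stop, -w, and -s forms described above can be composed with a small helper. This is a sketch under stated assumptions: URCHIN_HOME is a hypothetical variable naming the top of your Urchin distribution (the default shown is only a placeholder), urchinctl_cmd is a hypothetical function that prints the command rather than running it, and the placement of the option before the action is an assumption, since this guide names the options but does not show a full command line. Run urchinctl with the -h option to confirm the exact syntax.

```shell
#!/bin/sh
# Sketch only. URCHIN_HOME is a hypothetical variable for the top of the
# Urchin distribution; adjust it to your actual install location.
URCHIN_HOME="${URCHIN_HOME:-/usr/local/urchin}"

# urchinctl_cmd ACTION [both|web|scheduler]
# Prints the urchinctl command line for ACTION (start or stop).
urchinctl_cmd() {
    case "${2:-both}" in
        both)      opt="" ;;
        web)       opt="-w " ;;   # webserver daemon only
        scheduler) opt="-s " ;;   # task scheduler daemon only
        *) echo "unknown target: $2" >&2; return 1 ;;
    esac
    # ASSUMPTION: the option precedes the action; verify with urchinctl -h
    echo "$URCHIN_HOME/bin/urchinctl ${opt}$1"
}
```

For example, urchinctl_cmd stop web prints the command that would stop only the webserver. After starting urchinwebd, a command such as tail "$URCHIN_HOME/var/error_log" shows the most recent entries of the error log mentioned above.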

At install time, the install.sh script will create a bootup/shutdown script that you can use in conjunction with your system rc files to cause the Urchin services daemons to be started at boot time and halted at shutdown. The script is named urchin_daemons and is located in the util subdirectory of your Urchin distribution.

User Access to Reports

Users should use URL http://yourhost:portnumber, where yourhost is the name of the system where Urchin is installed and portnumber is the port number for the Urchin webserver (9999, unless you specified a different number during installation).
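The URL pattern above can be sketched as a helper that falls back to port 9999 when no port is given. Both function names are hypothetical, and the reachability check assumes that curl is installed on the machine running the check; it is not part of Urchin.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not Urchin tools.

# urchin_url HOST [PORT]: print the report URL; PORT defaults to 9999.
urchin_url() {
    echo "http://$1:${2:-9999}"
}

# urchin_reachable HOST [PORT]: succeeds if the webserver sends any HTTP
# response within 5 seconds.
# ASSUMPTION: curl is available on the machine running the check.
urchin_reachable() {
    curl -s -o /dev/null --max-time 5 "$(urchin_url "$@")"
}
```

For example, urchin_url yourhost prints http://yourhost:9999, and urchin_reachable yourhost 8080 tests a webserver that was installed on port 8080. A failure can also mean that a firewall or proxy between the two machines is blocking the port.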

Advanced Reporting Options

If you require unique visitor and session tracking, continue with the Visitor Tracking section of the Documentation Library. If you would like to know about processing e-commerce data, please see the E-commerce Module section as well.

Installation Guide (Mac OS X 10.2.x)

These installation notes pertain to installing Urchin 5 on systems running a minimum of Mac OS X 10.2. For older Mac OS X versions please see the general instructions for UNIX-type installations.

The Mac OS X 10.2 installer is a point-and-click package style installer that is downloaded in the form of a disk image. The basic components of the Urchin 5 installation process are:

• Download Urchin and unpack the installation archive
• Double-click the Urchin.mpkg file, which will launch an interactive installation process

The installer will install 3 distinct parts:

• Urchin binaries, utilities, and support files, including an Apache webserver for administration and report delivery
• Urchin StartupItems
• Urchin Preference Pane

Installation Preparation

The Mac OS X installer requires Mac OS X 10.2 or higher. Users of older Mac OS X systems need to use the Mac OS X 10.1.x shell archive installer.

An installing user must be able to authenticate using an account that has administrative privileges on the system since the installer will be installing files in restricted locations.

During installation, a dialog will ask which disk you want to install on. Currently, you must install on the Startup volume.

Installation Instructions

If you are upgrading an existing installation of Urchin, please consult the Upgrades section of the Documentation Center for relevant details.

If the installation disk image is downloaded to your system via a browser, it will automatically unpack and create an Urchin 5 folder on the desktop. If the installer image is downloaded via ftp or other mechanism, once the disk image is double-clicked, it will uncompress and create the desktop folder. Inside the folder the contents will be as follows:

Urchin.mpkg
Readme.rtf
Install.rtf
License.rtf
uninstall_urchin.sh
Packages folder

Double-click the Urchin.mpkg icon and follow the instructions in the dialog boxes to complete your installation. The dialogs will prompt you

Initial Configuration Using the Administration Interface

The installer will start the Urchin webserver and Task Scheduler daemons and launch a browser to connect you to the Urchin administration interface. You will get a login screen. Use these initial login values:

Username: admin
Password: urchin

Upon initial login, the interface will take you to a License Urchin wizard.

Click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process. When finished with the license process, you will be returned to the Urchin 5 administration interface where you will be led through a Setup wizard that will set some required initial configuration parameters. At any time in the future you can connect to your Urchin 5 administration interface by using the URL http://yourhost:9999, where yourhost is the DNS hostname for your system.

Managing Urchin 5 Services

The Urchin 5 services can be controlled or monitored by launching the System Preferences and clicking the Urchin icon.

User Access to Reports

Users should use URL http://yourhost:9999, where yourhost is the name of the system where Urchin is installed.

Advanced Reporting Options

If you require unique visitor and session tracking, continue with the Visitor Tracking section of the Documentation Library. If you would like to know about processing e-commerce data, please see the E-commerce Module section as well.

Installation Guide (Sun Cobalt)

The installer is a .pkg file, installed via the Cobalt Administration interface. The basic tasks of the Urchin 5 installer for Sun Cobalt are:

• Create the /home/urchinc distribution directory, unpack the files, and set appropriate ownership and file permissions
• Configure and launch a light Apache webserver to allow web-based configuration and report delivery
• Launch the Urchin task scheduler daemon, which manages log processing jobs
• Permit initial configuration and demo licensing of Urchin via the administration interface

Installation Preparation

You must have root access to install Urchin on a Sun Cobalt system. Although Urchin itself does not require any special system privileges to operate, and is specifically designed to run as a non-root user for security reasons, installation requires superuser access to some areas of your system.

You should download the appropriate package file for your system. This can be done in one of 2 ways:

• Use a web browser from your desktop system to download from http://www.urchin.com/download/urchin5 and save the .pkg file on your local machine until you're ready to install
• Use ftp directly from your Cobalt system to ftp.urchin.com/pub/urchin5, and put the downloaded .pkg file into the /home/packages directory

Your Cobalt system will need access to the Internet, since it is necessary for you to connect to the urchin.com site to complete the demo licensing and activate your Urchin distribution once it is installed.

RaQ550 owners should read and understand the information on RaQ550 web.log permissions issues when installing Urchin.

Installation Instructions

If you are upgrading an existing installation of Urchin, please consult the Upgrades section of the Documentation Center for relevant details.

Begin by connecting with your browser to the main Site Administrator's page for your Cobalt box, and navigate to the section of the Sun Cobalt administration interface used to install new third party software. The location of this area in the Cobalt interface will be platform specific:

• RaQ 3 or RaQ 4 - click on Maintenance in the left hand frame, then click Install Software in the top row
• RaQ 550 or Qube 3 - click on the BlueLinQ tab, then click Third Party Software, then click the Install Manually button

In the new software area, prepare and launch the package installer using the directions appropriate for your platform:

• RaQ 3 or RaQ 4

If you downloaded the Urchin package by using a browser and saving the .pkg file on your desktop system, then in the Software Package box select the Upload radio button, then click the Browse button to the right and navigate through your local filesystem until you locate the file. Once you click the Open button in the browse window, the pkg filename will be entered into the Upload text box.

If you copied the software into your Cobalt system's /home/packages directory, then select the radio button labeled Loaded. Then choose your package installer from the drop down box to the right of this button.

Click the "Install a pkg Package" button. When the installation is finished, in the lower section labeled Software on the Sun Cobalt Server, an Urchin link will appear.

• RaQ 550 or Qube 3

If you downloaded the Urchin package by using a browser and saving the .pkg file on your desktop system, then in the Location box select the Upload radio button, then click the Browse button to the right and navigate through your local filesystem until you locate the file. Once you click the Open button in the browse window, the pkg filename will be entered into the Upload text box.

If you copied the software into your Cobalt system's /home/packages directory, then in the Location box, select the radio button labeled "Packages in /home/packages", and choose your package installer from the drop down box to the right of this button.

Click the Prepare button. Once the package has been prepared, an Install Software window will appear. In this window click the Install button. When the installation is finished, Urchin will be listed under the Programs tab.

Initial Configuration Using the Administration Interface

Click on the Urchin link in your Cobalt administration interface and the Urchin administration login window will appear. Alternatively, you can connect directly to your Urchin administration interface without going through the Cobalt administration interface by using the URL http://yourhost:9999, where yourhost is the DNS hostname for your Cobalt system. Enter the admin username and password to start configuring Urchin. Use these initial login values:

Username: admin
Password: urchin

Upon initial login, the interface will take you to a License Urchin wizard. You should click on "Obtain Demo License". The interface will connect to the licensing server at the Urchin Software Corporation website and walk you through the process. When finished with the license process, you will be returned to the Urchin administration interface where you will be led through the Setup Wizard, which will set some required initial configuration parameters. Follow the instructions to complete your initial configuration. Please make sure to reset the password for the admin account and to record this password somewhere for safekeeping. When the Setup Wizard has completed you'll be taken to the Profile configuration screen. Click Add to create a new Profile. You will see a list of some sample Cobalt profiles and Log Sources that you can use as templates. Once you have a Profile you're ready to start processing logs and viewing Report data.

Managing Urchin Services

There are 2 daemons, "urchind" and "urchinwebd", that need to be running in order for log processing, reporting, and configuration administration to occur. These daemons are automatically launched by the installation process and are configured for your system so that they should always restart if the system is rebooted. However, you may need to control these processes manually. The daemons are stopped and started by the urchinctl program in the bin subdirectory of your Urchin distribution in /home/urchinc. To start or stop both daemons, use:

/home/urchinc/bin/urchinctl start

/home/urchinc/bin/urchinctl stop

You can also specify to only start or stop one daemon at a time by using a -w option for the webserver or a -s option for the scheduler. To see all of the available options, execute urchinctl with the -h option. Any errors encountered when one of the daemons is launched should be reported on the command line. For the urchinwebd daemon, once you think it is running successfully, you should also check the var/error_log file for any startup problems.

User Access to Reports

Users should use URL http://yourhost:portnumber, where yourhost is the name of the system where Urchin is installed and portnumber is the port number for the Urchin webserver (9999, unless you specified a different number during installation).

Urchin Traffic Monitor

On Sun Cobalt systems, due to the combination of the default webserver logging format, the automated webserver log splitting mechanism, and the built-in statistics gathering software, it is currently not possible to utilize the Urchin UTM.

Uninstalling Urchin 5

Windows

Uninstalling on a Windows system can be done in two ways.

• Using Add/Remove Programs control panel - go to Start->Settings->Control Panel and double-click on Add/Remove Programs. Highlight Urchin and click the Change/Remove button. An InstallShield window should launch and present you with a dialog box with 3 radio button choices: Modify, Repair, and Remove. Select the Remove button and click Next, then follow the remaining dialog boxes to complete.
• Re-running an Urchin installer - Running the setup.exe you installed Urchin with should detect that Urchin is already installed and present you with the dialog box with the Modify, Repair, and Remove radio buttons.

When the uninstall process is completed there will be an Urchin data folder left in its original installation location. This folder contains Urchin report and configuration data, and is not removed during uninstallation. If you are completely removing Urchin from your system, you may remove the Urchin folder to reclaim disk space.

UNIX-type Systems

Using the urchinctl program, stop the Urchin webserver and Urchin task scheduler services like so:

/path/to/urchin/bin/urchinctl stop

Once this is done you can remove the entire Urchin installation directory. If you have installed the urchin_daemons boot script that causes the Urchin services to start/stop when the system is rebooted, you should remove this script from the startup initialization area of your system.
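The two uninstall steps above (stop the daemons, then remove the directory) can be sketched as a single function with a dry-run mode. This is an illustration, not a script shipped with Urchin: uninstall_urchin and the DRY_RUN variable are hypothetical names, and the function does not touch the urchin_daemons boot script, whose location depends on your system. Note that removing the directory also deletes any report and configuration data stored inside it.

```shell
#!/bin/sh
# Sketch only: uninstall_urchin and DRY_RUN are hypothetical names.
# uninstall_urchin DIR: stop both daemons with DIR/bin/urchinctl, then
# remove DIR. With DRY_RUN=1 the commands are printed, not executed.
uninstall_urchin() {
    dir="$1"
    # Refuse to delete anything that does not look like an Urchin install
    if [ ! -x "$dir/bin/urchinctl" ]; then
        echo "no urchinctl under $dir; refusing to continue" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$dir/bin/urchinctl stop"
        echo "rm -r $dir"
        return 0
    fi
    # Remove the directory only if the daemons stopped cleanly
    "$dir/bin/urchinctl" stop && rm -r "$dir"
}
```

A cautious way to use it is to run DRY_RUN=1 uninstall_urchin /path/to/urchin first, review the printed commands, and then run it again without DRY_RUN.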

Mac OS X 10.2.x and later

In the Urchin installer disk image (e.g. the urchin5.XXXX_macosx102.dmg), there is a script that can be used to automate removing Urchin.

• Mount the Urchin installer disk image by double-clicking on it. This will mount a new volume on your desktop named Urchin 5.XXX (where 5.XXX is the Urchin version number, e.g. 5.702).
• Open up a Terminal window by launching the Finder and selecting Applications from the Go menu. Navigate into the Utilities folder and double-click on Terminal.
• In your terminal window run the command below. The quotes are needed because the volume name contains a space:

sudo "/Volumes/Urchin 5.XXX/uninstall_urchin.sh"

where 5.XXX is the Urchin version number.

The uninstall_urchin.sh script will remove all of the Urchin binaries and support files, but leave your configuration and report data intact. If you want to remove all data as well, then you should manually delete the /usr/local/urchin directory with the command:

sudo rm -r /usr/local/urchin

Sun Cobalt

Connect to the main Site Administration page for your Cobalt system and follow the directions below for your system type:

RaQ3 or RaQ4 - Click on Maintenance in the left hand frame, then click Install Software in the top row. In the list of installed software on the system, click the Urchin link. In the Urchin management screen, click Uninstall Urchin, then click to confirm that you want to uninstall.

Qube3, XTR, or RaQ 550 - Select the BlueLinQ tab, then click Installed Software in the left hand frame. In the software list you will see an entry for Urchin. The right hand column of the Urchin entry has an uninstall icon. Click the icon and then click OK to confirm that you want to uninstall.

When the uninstall has completed the Cobalt Administration interface should refresh and any entry for Urchin should be gone. Once this is done you can remove the /home/urchin directory. Please note that removing /home/urchin will irretrievably delete any remaining configuration and report data.

Troubleshooting Install Problems

All Platforms

Please pay close attention to the output from your installer. Read all dialog boxes, requests for user input, and output text carefully. If your installer fails, please record the complete and exact error messages that are generated, including any error codes. This information is required for full analysis of your problem.

Windows

To create debugging output of what's happening during a Windows installation, you can have the setup.exe program log its activities to a file. This is particularly useful when you have some unknown error during installation. To trigger logging, you'll have to launch setup.exe by using the Run Command mechanism. Go to the Start menu and select Run... and in the Open: entry box, enter the full path to the setup.exe along with the appropriate logging options like so:

C:\temp\setup.exe /v"/Lv C:\temp\installer.log"

Pay close attention to the syntax of this command. The spaces, quotes, slashes, and backslashes should all be entered exactly as shown. C:\temp\setup.exe should be replaced by the real path to setup.exe on your system. The file C:\temp\installer.log is where the execution logging output will be stored.

UNIX-type Systems

If you have problems executing the shell archive installer, you may download a compressed tar archive of the installation kit, which you simply gunzip and untar. Then you can run the install.sh installation script and complete the install as described in the installation guide notes for UNIX.
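The gunzip and untar step can be done in one pipeline. The function below is a sketch rather than part of the Urchin kit: the name unpack_kit and both arguments are illustrative, and the archive file name depends on the kit you downloaded.

```shell
# Sketch: unpack a gzip-compressed tar installation kit into a directory.
# The function name and arguments are illustrative assumptions.
#   $1 = path to the downloaded .tar.gz archive
#   $2 = directory to unpack into (created if missing)
unpack_kit() {
    archive=$1
    dest=$2

    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1

    # gunzip -c writes the uncompressed tar stream to stdout and leaves
    # the downloaded file untouched; tar reads the stream from stdin.
    gunzip -c "$archive" | (cd "$dest" && tar xf -)
}

# Example (not run here), with a placeholder archive name:
#   unpack_kit ./urchin-kit.tar.gz /tmp/urchin-kit
```

After unpacking, locate install.sh inside the extracted files and run it from that directory.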

Upgrades

Upgrading Urchin 4

Overview

Upgrading from Urchin 4 to Urchin 5 requires using the procedure that is specific both to your operating system and the Urchin version you're running. This document contains upgrade sections that cover all supported platforms and Urchin versions. Please make sure to verify that you are following the appropriate instructions for your situation.

Before performing any upgrades please make sure you do the following:

1. Shut down the Urchin services and back up your entire existing Urchin installation. Having the services disabled will guarantee that there is no database activity while the backup is in progress.

2. Have a record of the existing installation location and port number of the webserver.
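On UNIX-type systems the shutdown and backup in step 1 might look like the sketch below. The function name backup_urchin and the backup file location are illustrative assumptions. The only Urchin command used is urchinctl stop, which stops the webserver and task scheduler.

```shell
# Sketch: stop Urchin, then archive the whole installation directory.
# The function name and arguments are illustrative assumptions.
#   $1 = Urchin installation directory
#   $2 = backup file to create (gzip-compressed tar)
backup_urchin() {
    urchin_dir=$1
    backup_file=$2

    [ -d "$urchin_dir" ] || { echo "no such directory: $urchin_dir" >&2; return 1; }

    # Stop the services so there is no database activity during the backup.
    "$urchin_dir/bin/urchinctl" stop || return 1

    # Archive relative to the parent directory so the backup restores
    # as a single top-level directory.
    parent=$(dirname "$urchin_dir")
    name=$(basename "$urchin_dir")
    (cd "$parent" && tar cf - "$name") | gzip > "$backup_file"
}

# Example (not run here): backup_urchin /path/to/urchin /tmp/urchin-backup.tar.gz
```

Write the backup file outside the installation directory so the archive does not try to include itself.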

Considerations When Upgrading

Licensing: Urchin 4 licenses are not compatible with Urchin 5. You will have to upgrade your Urchin 4 licenses. Speak with your sales rep for assistance with this.

Differences in report numbers: After upgrading to Urchin 5, you may notice some changes in the session counts in your reports compared to using Urchin 4. Please see the article entitled Urchin 3, 4, & 5 Reporting Differences in this section for details on these differences.

Visitor tracking with UTM-1: If you are using UTM-1 with Urchin 4 to track unique visitors to your site, no website changes are required. Urchin 5 will process and report on UTM-1 data. Once you have upgraded to Urchin 5, it is advisable, although not required, to edit the Profiles for any UTM enabled sites. Go to the Profile Settings tab and select UTM-Enabled All for the Default Report Set.

It is strongly advised that you upgrade to at least Urchin 5.6 and update your website to use UTM-4. UTM-4 improves visitor tracking metrics and options for campaign tracking. If you do not need the campaign tracking capability UTM-4 provides, you can reduce log space overhead by editing the __utm.js file and setting __utmctm=0.

Procedures

Windows with Urchin 4.10x

1. Double-click on the urchin5xxx_win_setup.exe file and follow the instructions in the Welcome and License Agreement dialog screens.

2. In the dialog screen labeled Preparing to Upgrade Urchin Installation, the installer will present you with a list showing you the directory location and webserver port number it has determined for your existing installation. It will use these parameters for your upgrade.

3. If you decide you don't want to use these installer settings, you may exit the installation by clicking the Cancel button. It is not an option to alter the settings of your current installation during your software upgrade, since the upgrade has to match the previous configuration information stored in your Urchin 4 databases.

4. Click the Next button and the installer will proceed with converting your installation to the new version. Your report and configuration data will automatically be preserved during this process.

Windows with Urchin 4.00x

To migrate properly from Urchin 4.00x installations to Urchin 5, you will have to save some of your existing configuration and data files.

1. Disable your currently running Urchin Services by going to Start->Programs->Urchin->Disable Services

2. Navigate to your Urchin installation folder (e.g. C:\Program Files\Urchin) and rename the data folder to data-saved

3. Copy etc\httpd.conf to var\urchinwebd.conf

4. Launch the urchin5xxx_win_setup.exe installer. The installer should detect your previous installation and determine the configuration parameters. Just follow the instructions in the dialog windows of the installer.

5. After the installer has finished go to Start->Programs->Urchin->Disable Services to deactivate your currently running Urchin 5 services.

6. In the Urchin installation folder (e.g. C:\Program Files\Urchin), rename the new data folder to data-notused, and rename the data-saved folder from step 2 back to data

7. Launch the Urchin services by going to Start->Programs->Urchin->Enable Services

UNIX with all versions of Urchin 4

The Urchin 5 install.sh script will properly handle all existing installations of Urchin 4 on UNIX-type systems. When running install.sh, be sure to select Upgrade as the installation type when prompted. Please see the Installation Guide (UNIX) section of the Documentation Center for full instructions.

Mac OS X 10.2.x

Note: Mac OS X 10.1 users should use the section on upgrades for UNIX-type systems.

For the majority of cases the Urchin 5 package installer for Mac OS X 10.2 systems will automatically detect existing Urchin 4 installations and upgrade them. You simply use the instructions in the Installation Guide for Mac OS X 10.2. The one exception to this standard package installer upgrade procedure is if you previously installed using the shell archive installer but did not install in the default location of /usr/local/urchin4, and are now using the package installer to upgrade. In this case you should take the following steps:

• Turn off your existing Urchin services using the urchinctl program (i.e. ./urchinctl stop)

• Install normally using the package installer, but do not do any initial configuration using the admin interface. When the browser launches at the end of the install, simply close the window.

• After the package installer has completed, go to the Urchin Preference Pane in the System Preferences and stop all running Urchin services

• In the new Urchin 5 installation directory, /usr/local/urchin, rename the data subdirectory to data-saved

• Move the entire data subdirectory from your old Urchin version install directory into /usr/local/urchin

• In /usr/local/urchin update the ownership of the data subdirectory using the command:

chown -R www:www data

• Launch your new Urchin 5 services by using the Urchin preference pane in System Preferences to start the scheduler and webserver

Your old configuration and report data should now be available in your new Urchin installation. Once you have confirmed that the configuration and processing is normal you can remove your old Urchin 4 distribution, as well as the /usr/local/urchin/data-saved directory.
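The data directory swap in the steps above can be sketched as a shell function. It is a sketch under assumptions: the function name swap_urchin_data is made up, and the old installation directory in the example is a placeholder. Run it only after the Urchin services have been stopped, and as a user allowed to change ownership to www:www.

```shell
# Sketch: set aside the fresh Urchin 5 data directory and move the old one in.
# The function name and arguments are illustrative assumptions.
#   $1 = old Urchin install directory (placeholder in the example below)
#   $2 = new Urchin 5 install directory (/usr/local/urchin in the text)
#   $3 = owner for the data directory (www:www in the text)
swap_urchin_data() {
    old_dir=$1
    new_dir=$2
    owner=$3

    # Refuse to run if a data directory is missing or if an earlier
    # data-saved directory would be overwritten.
    [ -d "$old_dir/data" ] || { echo "no data directory in $old_dir" >&2; return 1; }
    [ -d "$new_dir/data" ] || { echo "no data directory in $new_dir" >&2; return 1; }
    [ ! -e "$new_dir/data-saved" ] || { echo "data-saved already exists" >&2; return 1; }

    mv "$new_dir/data" "$new_dir/data-saved" || return 1
    mv "$old_dir/data" "$new_dir/data" || return 1
    chown -R "$owner" "$new_dir/data"
}

# Example (not run here), with a placeholder old location:
#   swap_urchin_data /path/to/old/urchin /usr/local/urchin www:www
```

The checks at the top make the function safe to re-run: a second attempt stops before moving anything.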

Sun Cobalt

Log in to your Sun Cobalt administration interface and navigate to the section where Urchin is listed. Uninstall Urchin 4 by clicking the Uninstall Urchin 4 link. Once the system reports the uninstall process is complete, you can install the Urchin 5 pkg installer normally per instructions in the installation section of this guide. The Urchin 5 installation will detect the existing Urchin 4 data and move it into place as part of the Urchin 5 installation.

Upgrading Urchin 3

Overview

Urchin 5 is an entirely new product with thoroughly revised internal workings and data formats that are not compatible with Urchin 3. Therefore an existing Urchin 3 installation cannot be upgraded by simply installing Urchin 5 in its place. However, it is possible to install Urchin 5 side by side with Urchin 3 so that you may migrate report and configuration data from one to the other. The basic Urchin 3 to Urchin 5 upgrade process consists of:

• Installing and licensing Urchin 5
• Deactivating Urchin 3 log processing
• Running a migration tool to import Urchin 3 data into Urchin 5
• Post migration configuration of Urchin 5 processing

There are some special circumstances to consider for Urchin 3 to Urchin 5 migrations:

Not all Urchin 3 configurations can be migrated. In particular existing configurations that rely on the Urchin 3 SubreportMode directive cannot be imported directly into Urchin 5, which does not support SubreportMode.

u3importer cannot be used to migrate Urchin 3 data between differing platform types as part of the import process. So you cannot, for example, take Urchin 3 databases created on a Windows platform and try to import them into an Urchin 5 installation on a Sun. u3importer must be run on a platform of the type where the Urchin 3 databases were created.

If you are upgrading a Sun Cobalt server, you should use the instructions in the special dedicated section of the Upgrades documentation on Upgrading Urchin 3 on Sun Cobalt.

Procedure

You should already have downloaded the Urchin 5 installer appropriate for your system. Also you will need to know the full path to your Urchin 3 config file to complete the upgrade process. Proceed as follows:

Install Urchin 5 as appropriate for your platform per instructions in the Installation section of the Documentation Center.

Obtain a license for Urchin 5 and perform basic configuration of global settings such as assigning an admin password and so forth, but do not create any Profiles.

Run the inspector program in the Urchin 5 util subdirectory to verify that your installation is correct. If any errors are reported, correct them before proceeding with your Urchin 3 migration.

If necessary to guarantee that no changes are made to your Urchin 3 databases during migration, deactivate your Urchin 3 log processing as follows:

♦ Windows - launch the Urchin 3 configuration interface and set reports to Off as appropriate
♦ UNIX-type systems - edit your crontab and comment out the line that controls Urchin processing

Run the u3importer program located in the Urchin 5 util subdirectory. This program will prompt you for the full path to your Urchin 3 config file, then prompt to indicate which sites you want to import into Urchin 5.

Once u3importer has finished, your Urchin 3 report and configuration data should be established in Urchin 5. Connect to the Urchin 5 administration interface and verify that you have correct Profiles for all your websites.

Ecommerce Processing

Urchin 5 has the ability to process ELF logs and correlate the data with access logs. The ELF log source can be added to the regular log source for a profile if the Urchin 5 Ecommerce module is installed.

Upgrading Urchin 3 on Sun Cobalt

Overview

Please read and understand all these instructions first before proceeding with your migration from Urchin 3 to Urchin 5 on Sun Cobalt systems. The recommended way to upgrade on Sun Cobalt is to start with a fresh installation of Urchin 5 which has not gone through any configuration other than having the initial Setup Wizard run. You should not manually configure any Profiles, Log Sources, Users, etc. after installing. The migration utilities will create these as needed while importing your Urchin 3 data.

The basic steps in upgrading a Sun Cobalt system are:

1. Back up your entire Urchin 3 distribution

2. Deactivate Urchin 3 log processing

3. Install Urchin 5, connect to the administration interface and run the Setup Wizard

4. Run u3importer to import Urchin 3 report configurations as Urchin 5 profiles and convert Urchin 3 data to Urchin 5 format

5. Download and run the u5_cobalt_import.pl script to import other Urchin 3 for Cobalt config settings, such as Customers and Users, into your Urchin 5 configuration

6. Schedule tasks to process logs for each newly created Profile

Procedure

You will have to telnet or ssh into your Cobalt system as root to perform some of these instructions. You should keep a terminal window open so that you can move back and forth from the command line to the graphical interfaces as necessary.

1. Back up Urchin 3 distribution - this is suggested strictly as a standard precaution. The process of importing your Urchin 3 data into Urchin 5 does not alter your Urchin 3 installation.

2. Deactivate Urchin 3 log processing - on Cobalt systems this requires that you move 2 scripts that manage the daily execution of Urchin. On the command line in your terminal window execute these commands:

mv /etc/cron.daily/urchin /home/urchin3da/admin/bin
mv /etc/cron.daily/urchin_purge_weblogs /home/urchin3da/admin/bin

3. Install Urchin 5 following the instructions for Sun Cobalt in the Installation section of this guide, and run the Setup Wizard to do the initial configuration. Important: in the Admin Settings screen of the Setup Wizard, you must set Data Center Mode to On.

4. Run u3importer - this program, located in the util subdirectory of your Urchin 5 installation, will prompt you to import report directives from your Urchin 3 config file. When prompted for the location of your config file path, enter

/home/urchin3da/config

Subsequently you will see additional prompts that say

Import Urchin 3 Configurations

and then

Import Urchin 3 Data

Just hit the return key to accept the default response of all for these last two steps. When u3importer has finished you should verify that correct Profiles were created by examining the configuration via the Urchin administration interface.

5. Download and run u5_cobalt_import.pl - since u3importer only deals with importing Urchin 3 databases and creating Urchin 5 Profiles for your existing Urchin 3 reports, other configuration info that is specific to Cobalt installations has to be imported separately using this tool. You can download u5_cobalt_import.pl from ftp://ftp.urchin.com/urchin5/support. Put this script in the /home/urchin/util directory on your Cobalt system. Then in your terminal window execute the program like so:

./u5_cobalt_import.pl

The script will prompt you for input as needed. When the script has finished, the configuration import portion of the migration process will be complete.

6. Schedule tasks to process logs for the Urchin 5 profiles - when you import Urchin 3 data using u3importer, a task is created in the Scheduler for each Profile, but the scheduled time to run is not set. You can either set each schedule manually via the Urchin 5 administration interface, or use the uconf-schedule utility to set all tasks simultaneously.

Urchin 3, 4, & 5 Reporting Differences

Differences to Note When Migrating Between Urchin Products

This document is an overview describing basic differences in data analysis as well as certain migration issues when moving from one major Urchin version to another. Each major Urchin version is listed along with a summary explaining key elements of how data is analyzed. The latter portion of this page covers issues to anticipate for particular migration scenarios.

Urchin 3

• Visitor tracking is done by incoming IP address only. There is no distinction between a visitor and a session.

• All MIME types except images (gif/jpg/png) are treated as pageviews.

• Pageview hits with a HEAD request type are treated as actual pageviews.

• Pageviews are not required to count a visitor, so a request for a single image file could be counted as a new visitor.

• Hits with error codes of 404 or 5xx are considered legitimate visits and could increment the visitor count.

• Traffic->Hourly report and Tracking reports (e.g. Top Entrances, Top Exits) data is stored on a monthly basis, therefore the only report granularity is for a single month date range.

Urchin 4

• The default visitor tracking method uses a combination of IP address plus the User-Agent field from log entries. Other tracking options include UTM-1, session id, and IP-only (i.e. Urchin 3 style tracking).

• Urchin 4 provides UTM-1 to enable optimal visitor and session tracking. UTM-1 utilizes client side cookies to identify unique individuals as opposed to relying on IP addresses, which are not necessarily unique to a particular person or system.

• By default a session requires a legitimate pageview to be counted. A request for an image is not considered a pageview nor is a request with a status code other than 2xx, 302, or 304. This will typically reduce counts for visitors, sessions, pageviews and related reports in Urchin 4 using IP-Only tracking when compared to Urchin 3 reports for the same data. The pageview requirement is configurable by the Urchin administrator for sites that have a design that makes counting of images as pageviews desirable.

• When using UTM-1 tracking, sessions without UTM cookie info will be processed using the default of IP+User-Agent.

• Traffic->Hourly report and Tracking reports (e.g. Top Entrances, Top Exits) data is stored on a daily basis, therefore the report granularity is for any time period of a day or greater.

Urchin 5

• The default visitor tracking method uses a combination of IP address plus the User-Agent field from log entries. Other tracking options include UTM (either UTM-2 or UTM-1), session id, username, and IP-only (i.e. Urchin 3 style tracking).

• Urchin 5 provides improved visitor and session tracking based on UTM-2, which uses client side cookies with a configurable session timeout. With this technology, hits with the same cookies spread out over a large period of time can be counted as multiple sessions as opposed to a single long session. This produces more meaningful averages in the reports.

• For both UTM-1 and UTM-2 tracking, the processing logic has changed so that only hits with UTM cookie information are processed when counting visitors, sessions, and pageviews. Hits without UTM info do not fall back to processing using IP+User-Agent as in Urchin 4. Such non-cookie sessions are tracked only for reports that are based on hits and bytes. This can lower counts for visitors, sessions, pageviews, and related reports when compared to Urchin 4 because it significantly reduces the effect of robot traffic on your statistics.

• An explicit include or exclude MIME type list is now used to define what a pageview is. By default, Urchin 5 excludes the following MIME types from the pageview list: gif, jpg, jpeg, png, js, css, cur, ico, ida. All other MIME types are considered to be pageviews or downloads.

• Pageview hits which use HEAD as the request type only cause the Hits count for that page to be incremented; the pageview count is not incremented.

• By default a session requires a legitimate pageview to be counted. A request for an image is not considered a pageview nor is a request with a status code other than 2xx, 302, or 304. This will typically reduce counts for visitors, sessions, pageviews and related reports in Urchin 5 using IP-Only tracking when compared with older Urchin version reports for the same data. The pageview requirement is configurable by the Urchin administrator for sites that have a design that makes counting of images as pageviews desirable.

• With the exception of the Status and Errors report, all reports that graph vs. hits are based on valid hits. Previously, such graphs were based on all hits (i.e. valid and hits with errors).

• Traffic->Hourly report and Tracking reports (e.g. Top Entrances, Top Exits) data is stored on a daily basis, therefore the report granularity is for any time period of a day or greater.
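The default Urchin 5 pageview rules described above (HEAD requests excluded, status limited to 2xx, 302, or 304, and the default exclusion list) can be illustrated with a small shell function. This is a sketch for understanding the rules, not Urchin's actual processing code. The function name is_pageview is made up, and the type is judged from the file extension in the request path, which is a simplifying assumption.

```shell
# Sketch: decide whether one hit would count as a pageview under the
# default Urchin 5 rules described in the text. Illustrative only.
#   $1 = request method, $2 = request path, $3 = HTTP status code
# Returns 0 (true) for a pageview, 1 otherwise.
is_pageview() {
    method=$1
    path=$2
    status=$3

    # HEAD requests only increment the hit count.
    if [ "$method" = "HEAD" ]; then
        return 1
    fi

    # Only 2xx, 302, and 304 responses qualify.
    case $status in
        2[0-9][0-9]|302|304) ;;
        *) return 1 ;;
    esac

    # Drop any query string, then take the extension of the last path part.
    path=${path%%\?*}
    file=${path##*/}
    case $file in
        *.*) ext=${file##*.} ;;
        *)   ext= ;;
    esac
    ext=$(printf '%s' "$ext" | tr 'A-Z' 'a-z')

    # Default exclusion list from the text.
    case $ext in
        gif|jpg|jpeg|png|js|css|cur|ico|ida) return 1 ;;
    esac
    return 0
}

# Example: is_pageview GET /index.html 200 && echo "pageview"
```

Under these rules a GET for /logo.gif with status 200 and a GET for /missing.html with status 404 are both hits but not pageviews, so neither can start a session by default.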

Migrating from Urchin 3 to Urchin 5

Reporting

Since in Urchin 3 the Tracking reports and Traffic->Hourly Graph data is only stored on a monthly basis, and in Urchin 4 and Urchin 5 this data is stored on a daily basis, a side by side comparison of these reports requires that you set the date range to one month in the newer products. Also, when importing Urchin 3 data there is no way to break out the monthly data for these reports into individual days, so all data for these specific reports for a given month will be placed into the first day of the month in the newer Urchin version. Here too, setting the date range to one month will allow the imported historical data to be viewed in the correct context.

Administration

Administration is primarily via graphical interface and is based on a binary configuration database. However, command line tools and the ability to import a flat file configuration are available for those who are used to and prefer the config file approach of Urchin 3.

Migrating from Urchin 4 to Urchin 5

Reporting

Urchin 4 databases are fully compatible with Urchin 5. Report data will be immediately available once you upgrade. As noted above in the product descriptions, Urchin 5 uses a different logic for processing hits, so once you upgrade you will initially see a difference in report numbers compared to recent historical data generated with Urchin 4. These variances will differ depending on which visitor tracking method you've been using.

Log Tracking

Logtracking data in Urchin 4 is kept in a single tracking file. In Urchin 5 this data is kept in individual monthly databases. When an Urchin 4 installation is upgraded to Urchin 5, the old logtracking data is converted into equivalent Urchin 5 monthly logtracking databases, and the Urchin 4 logtrack file is archived.

Migrating from UTM−1 to UTM−2

Reporting

IMPORTANT: UTM-2 cannot be used with Urchin 4. You must be running Urchin 5 before switching your website to use UTM-2.

The improved accuracy in identifying unique visitors that UTM-2 provides means that you may see some differences in reported numbers compared to what you have been seeing using UTM-1. These differences should be on the order of 10% or less.

Upgrading Urchin 5

Overview

Upgrading Urchin 5 is a straightforward process. The installers typically deal automatically with upgrading existing installations while leaving your configuration and report data intact. This document contains upgrade sections that cover all supported platforms. Please make sure to verify that you are following the appropriate instructions for your situation.

Before performing any upgrades please make sure you do the following:

1. Back up your entire existing Urchin installation, in particular any customized configuration files.

2. Shut down the Urchin services. Having the services disabled will guarantee that there is no database activity while the backup is in progress.

3. Have a record of the existing installation location and port number of the webserver.

Considerations When Upgrading

It is always advisable to install on your website the latest __utm.js provided with the current release when upgrading Urchin. In addition, as of Urchin 5.7 there is a new UTM, and all users of Urchin 5.x products are encouraged to upgrade to this UTM version even if they do not upgrade to 5.7 at this time.

Campaign Tracking Module users who download Google CPC data must modify their Google download process when upgrading to Urchin 5.6 or 5.7. Please see the help article on importing Google cost data in the Campaign Tracking Module section.

Visitor tracking with UTM-1, UTM-2, or UTM-3: Urchin 5.6 and newer versions are backwards compatible when processing all older versions of UTM data. Although not required, it is strongly advised that you upgrade your website to UTM-4 or later regardless of the Urchin 5 version you are using.

Optimizing UTM-4 settings: UTM-4 improves visitor tracking metrics and options for campaign tracking. If you do not need the campaign tracking capability, you can reduce log space overhead by editing the __utm.js file and setting __utmctm=0. This will still allow you to benefit from the improved UTM-4 visitor tracking.

Procedures

Windows

1. Double-click on the urchin5xxx_win_setup.exe file and follow the instructions in the Welcome and License Agreement dialog screens.

2. In the dialog screen labeled Preparing to Upgrade Urchin Installation, the installer will present you with a list showing you the directory location and webserver port number it has determined for your existing installation. It will use these parameters for your upgrade. It is not an option to alter the settings of your current installation during your software upgrade, since the upgrade has to match the previous configuration information stored in your Urchin 5 databases.

3. Click the Next button and the installer will proceed with converting your installation to the new version. Your report and configuration data will automatically be preserved during this process.

UNIX

The install.sh installation script, which is bundled as part of all UNIX-type installers, will properly upgrade any older version of Urchin 5 installed on UNIX-type systems. When running install.sh, be sure to select Upgrade as the installation type when prompted. Otherwise the upgrade procedure for UNIX is identical to a new installation.

When using install.sh interactively to do an upgrade, at one point you will be presented with the prompt:

Please select the installation type [Default: 1]
1. New
2. Upgrade
->

Be sure to select 2 to trigger an Upgrade. If you are using install.sh in non-interactive mode by specifying command line options, then be sure to use the -m option to specify an upgrade. Please see the Installation Guide (UNIX) section of the Documentation Center for full instructions on using install.sh.

Mac OS X 10.2.x

Note: Mac OS X 10.1 users should use the section on upgrades for UNIX-type systems.

If you have previously used an Urchin 5 package installer for Mac OS X 10.2, then using a newer package installer will automatically detect your existing Urchin 5 installation and upgrade it. Users in this situation should simply use the instructions in the Installation Guide for Mac OS X 10.2 and skip the rest of the instructions in this subsection, as in this case the instructions for new installation and upgrade are the same.

If you did not previously use the package installer, but installed using the install.sh installation script and did not install in the default location of /usr/local/urchin, then you must use a modified procedure to upgrade. The package installer will only install in /usr/local/urchin, so it cannot be used to automatically upgrade another install location. If you have this situation, you have two choices:

1. Do not use the package installer to upgrade. Instead download and use the same type of installer you used previously. This means you can follow the standard upgrade instructions for UNIX-type systems detailed in the previous subsection.

2. If you prefer to start using the package installer to upgrade, you can take the steps listed below, but realize that this procedure will cause your Urchin installation to be relocated to the default of /usr/local/urchin:

♦ Turn off your existing Urchin services using the urchinctl program (i.e. ./urchinctl stop)

♦ Move the current Urchin installation directory to /usr/local/urchin. You will need to move the entire directory structure starting with the top level directory of your current Urchin installation. For example if you previously had installed Urchin in /applications/urchin, then you would use the following command:

mv /applications/urchin /usr/local/urchin

You should verify that you have enough disk space in the /usr/local file system for your current Urchin installation before doing the move.

♦ Once you've relocated Urchin to the proper location you may launch the Urchin 5 pkg installer and follow the interactive instructions to upgrade

Your old configuration and report data should now be available in your updated Urchin installation.
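The disk space check recommended before relocating an installation can be sketched with the standard du and df utilities. The function names below are made up for illustration, and the sketch assumes the file system name reported by df contains no spaces.

```shell
# Sketch: check that a directory tree fits in the free space of a target
# file system before moving it. Function names are illustrative.

# Size of a directory tree in kilobytes.
dir_size_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Free space, in kilobytes, on the file system that holds a path.
# df -P gives one line per file system; column 4 is the available space.
free_kb() {
    df -kP "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds only if the tree at $1 is smaller than the free space on the
# file system holding $2.
has_room_for() {
    [ -d "$1" ] || return 1
    needed=$(dir_size_kb "$1")
    available=$(free_kb "$2")
    [ -n "$needed" ] && [ -n "$available" ] && [ "$needed" -lt "$available" ]
}

# Example (not run here):
#   has_room_for /applications/urchin /usr/local && echo "enough space"
```

When the old and new locations are on the same file system, mv only renames the directory and needs no extra space, so this check is conservative in that case.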

Sun Cobalt

The Urchin 5 pkg installers for Sun Cobalt systems automatically detect existing installations and upgrade the Urchin 5 files as needed. Simply follow the instructions for a new Sun Cobalt installation to perform an upgrade.

Initial Configuration

E-commerce Reporting

Urchin is capable of extensive e-commerce reporting in conjunction with its standard web traffic reports. To accomplish this, two basic elements are required:

• Shopping cart software that produces activity logs in the ELF/ELF2 format (many can be configured to do so).

• The Urchin E-commerce Module, which is available as an add-on to any Urchin 5.x license.

To set up a Profile for ELF/ELF2 processing, use the Profile Setup Wizard in the Urchin admin interface andchoose Profile type E−commerce. In the Log Source Wizard (which you will be taken through in the Profilesetup process), you will need to specify two Log Sources − the standard website access log, and the ELF log.

ELF: To process existing ELF logs with Urchin requires only that you set LogFormat in the Log Source toELF (or auto), and that the Visitor Tracking method in the Profile for the site be set to IP−ONLY.

ELF2: To use ELF2 you must configure your shopping cart software to generate log entries formatted asshown below. The ELF2 log format is based on the ELF log format and specification. Some additional fieldswere added to improve visitor tracking. Any fields containing internal tab characters must be quoted.

The transaction line starts with an exclamation character '!' and contains the following fields separated by tabs:

• !orderid
• remote host IP (as given by %h in NCSA extended/combined log format)
• time (as given by %t in NCSA extended/combined log format)
• store
• sessionid
• total
• tax
• shipping
• billcity
• billstate
• billzip
• billcountry
• cs_useragent
• cs_cookie

The item line does not start with an exclamation character and contains the following fields separated by tabs:

• orderid
• remote host IP (as given by %h in NCSA extended/combined log format)
• time (as given by %t in NCSA extended/combined log format)
• productcode
• productname
• variation
• price
• quantity
• upsold
• cs_useragent
• cs_cookie
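As a rough illustration of the layout above, the following Python sketch assembles one transaction line and one item line. The sample order values, the helper names, the dictionary key "host" for the remote host IP, and the use of double quotes for fields that contain tabs are all assumptions made for illustration; check your shopping cart's logging options for the details that apply to your system.

```python
from datetime import datetime, timezone

# Field order taken from the two field lists above.
TRANSACTION_FIELDS = ["orderid", "host", "time", "store", "sessionid", "total",
                      "tax", "shipping", "billcity", "billstate", "billzip",
                      "billcountry", "cs_useragent", "cs_cookie"]
ITEM_FIELDS = ["orderid", "host", "time", "productcode", "productname",
               "variation", "price", "quantity", "upsold",
               "cs_useragent", "cs_cookie"]


def ncsa_time(moment):
    """Format a timezone-aware datetime the way %t appears in NCSA logs."""
    return moment.strftime("[%d/%b/%Y:%H:%M:%S %z]")


def quote_if_needed(value):
    """Quote a field that contains a tab (double quotes are an assumption)."""
    text = str(value)
    return '"' + text + '"' if "\t" in text else text


def elf2_line(fields, record, is_transaction):
    """Join the record's fields with tabs; transaction lines start with '!'."""
    line = "\t".join(quote_if_needed(record[name]) for name in fields)
    return "!" + line if is_transaction else line


# Hypothetical sample order, for illustration only.
when = ncsa_time(datetime(2005, 1, 23, 11, 41, 26, tzinfo=timezone.utc))
common = {"orderid": "1001", "host": "192.0.2.10", "time": when,
          "cs_useragent": "ExampleBrowser/1.0", "cs_cookie": "-"}
order = dict(common, store="Main Store", sessionid="abc123", total="25.00",
             tax="1.50", shipping="3.50", billcity="San Diego",
             billstate="CA", billzip="92101", billcountry="US")
item = dict(common, productcode="SKU-1", productname="Sample Widget",
            variation="blue", price="20.00", quantity="1", upsold="0")

print(elf2_line(TRANSACTION_FIELDS, order, True))
print(elf2_line(ITEM_FIELDS, item, False))
```

Keeping the field order in two lists makes it easy to compare the generated lines against the format description field by field.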

Setup Recommendations

Overview

Once Urchin is installed, some initial operational parameters have to be configured. This is done via a Setup Wizard that runs when you connect to your Urchin administration interface for the first time, and during the first stages while you are establishing Profiles. These initial configuration actions include:

• Licensing Urchin
• Configuring Admin Settings for remote report and administration access, as well as establishing Data Center Mode operation
• Setting the Urchin Administrator account password
• Scheduling tasks for each of your profiles to process data
• Log management

Procedure

Connect to the Urchin administration interface. On Windows systems you can go to Start->Programs->Urchin->Urchin Administration. On UNIX-type systems, Sun Cobalt, and Mac OS X you can use the URL http://hostname:9999, where hostname is the registered hostname of your Urchin system. For a new installation, log in with the following:

Username: admin
Password: urchin

You'll be presented with an Urchin Setup Wizard welcome screen. Click Continue to proceed through each of the following wizard screens. Note that the choices you make in this initial configuration can always be altered later on.

License Urchin

You must choose one of the links under the Action Items area of this screen to license Urchin before you can proceed with configuring and using the software. Click Buy License to purchase and install a license via the web right away. If you purchased a license via a sales rep prior to installing Urchin, click Activate Pre-Purchased License. Otherwise, click Obtain Demo License to install an expiring license.

Admin Settings

Remote Access Settings - select On for each case if you want to allow remote browser connections. If you select Off for either of these, then the only allowed access is on the console of the system where Urchin is installed.

Data Center Mode - this setting determines whether Urchin is configured to allow creation of Affiliations, which allow you to logically organize Profiles, Groups, and Users into restricted access categories. If you are undecided, it is best to set this to On, as it adds no overhead and gives you the flexibility to use it in the future.

Administrative User

Reset the password for the admin account and record this password for safekeeping.

Scheduling Tasks

When you create profiles you are given the option to schedule the time at which the task that processes the data for that profile runs. You should check the settings for each profile to be sure that the timing of your task makes sense in terms of when the log data will be available, how long it will take to process that data, and when you want the updated reports available to your users.

Log Management

Urchin includes a log tracking module that keeps track of how far into each log it has processed. Thus, log file rotation does not necessarily need to be coordinated with Urchin operation. However, Urchin does provide automation for log rotation or removal under the Advanced Settings for each Log Source.

Chapter 2: Visitor Tracking

Using UTM with E-commerce

Overview

The key aspect of UTM is the ability to identify and correlate visitor activity. When UTM is used on an e-commerce site in conjunction with the E-commerce Module, visitor activity that generates revenue can be tracked across your sites and reported on collectively. Transactions on the server that hosts your shopping cart software can be correlated with sessions on your other webserver, allowing session variables, such as referrals and keywords, to be reported on versus the revenue they generate. When using the Campaign Tracking Module, the UTM provides multi-session tracking that tracks the visitor from source to purchase or goal. Conversion ratios and ROI reports in the Campaign Tracking Module provide detailed results of on-line marketing efforts including keyword buying, e-mail campaigns, and link exchanges.

Same Domain Configuration

If the front-end website and secure e-commerce site use the same domain, installing the UTM on your e-commerce site is no different than installing on other types of websites. The other areas of this section on UTM installation provide the specifics of installation. Pay special attention to the areas explaining how to set the UTM domain appropriately for your e-commerce and other sites. Further information on e-commerce transaction log formats is provided in the E-commerce section of the documentation.

Cross Domain Configuration

It is increasingly common for web sites with e-commerce shopping carts to outsource the e-commerce component to another organization such as Amazon, PayPal, or Yahoo Stores. This can create a problem for the UTM tracking, as the domain for UTM changes as a visitor goes from the main website to the secure store. In order for the secure store to use the same UTM visitor ID as the main website, the visitor ID must be passed in the link to the secure store.

The UTM contains a __utmLinker function that will wrap the link with the appropriate id before sending the link to the store. Instead of linking directly to the store, simply pass the link to the __utmLinker function. Here are the specific instructions for using the __utmLinker:

1. Edit the __utm.js file in the document root of both web sites and set the __utmdn variable to "none".
2. Set the UTM Domain for the profile to nothing (blank).
3. Change the links from the main site to the secure site in the form:

<a href='javascript:__utmLinker("https://previous_link?with_parameters");'> link-to-shopping-cart </a>

Visitor Identification Methods

Overview

Urchin has five different methods for identifying visitors and sessions, depending on available information. Of these, the patent-pending Urchin Traffic Monitor (UTM) is a highly accurate system that was specifically designed to identify unique visitors, sessions, exact paths, and return frequency behavior. There are a number of visitor loyalty and client reports that are only available when using the UTM System. The UTM System is easy to install and is highly recommended for all businesses.

In addition to the UTM System, Urchin can use IP addresses, User-Agents, Usernames, and Session-IDs to identify sessions. The following table compares the abilities of each of the five identification techniques:

Ability IP Only IP+User-Agent Username Session ID UTM

Identifies non-proxied sessions X X X X X

Identifies some proxied sessions X X X X

Uniquely identifies each session X X

Defeats session IP proxying X X

Defeats most provider caching X X

Defeats browser caching X

Uniquely identifies visitors X X

Captures exact path sequence X

Captures visitor loyalty metrics X

Captures browser capabilities X

Data Model

The underlying model within Urchin for handling unique visitors is hierarchical: a unique set of visitors interacts with the website through one or more sessions. Each session can contain one or more hits and pageviews. Pageviews are kept in order so that the path through the website for each session is understood. As shown in the diagram, the Visitor represents an individual’s interaction with the website over time. Each unique visitor will have one or more sessions, and within each session are zero or more pageviews that comprise the path the visitor took for that session.
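The hierarchy described above can be pictured with a small Python sketch. The class and attribute names here are hypothetical and chosen only to mirror the description; they are not taken from Urchin itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Session:
    hits: int = 0                                        # every logged request
    pageviews: List[str] = field(default_factory=list)   # pages, in viewing order

    def record(self, url, is_pageview):
        """Count the hit and, for pageviews, extend the ordered path."""
        self.hits += 1
        if is_pageview:
            self.pageviews.append(url)

    def path(self):
        """Return the path through the site for this session."""
        return " -> ".join(self.pageviews)


@dataclass
class Visitor:
    visitor_id: str
    sessions: List[Session] = field(default_factory=list)


# Hypothetical example: one visitor, one session, three hits, two pageviews.
visitor = Visitor("visitor-1")
first = Session()
first.record("/index.html", True)
first.record("/logo.gif", False)
first.record("/products.html", True)
visitor.sessions.append(first)
print(first.path())
```

Storing pageviews as an ordered list is what allows a path to be read back for each session, while the hit counter also includes requests such as images that are not pageviews.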

Proxying and Caching

In attempting to identify and track unique visitors and sessions, we are basically going against the nature of the web, which is anonymous interaction. Particularly troublesome to tracking visitors are the increasingly common proxying and caching techniques used by service providers and the browsers themselves. Proxying hides the actual IP address of the visitor and can use one IP address to represent more than one web user. A user’s IP address can change between sessions, and in some cases multiple IP addresses will be used to represent a cluster of users. Thus, it is possible that one visitor will have different IP addresses for each hit and/or different IP addresses between sessions.

Caching of pages can occur at several locations. Large providers look to decrease the load on their network by caching, or remembering, commonly viewed pages and images. For example, if thousands of users from a particular provider are viewing the CNN website, the provider may benefit from caching the static pages and images of the website and delivering those pieces to the users from within the provider’s network. As a result, pages are delivered without the knowledge of the actual website.

Browser caching compounds the problem. Most browsers are configured to only check content once per session. If a visitor lands on the home page of a particular website, clicks to a subpage, and then uses the back button to go back to the home page, the second request for the home page is most likely never sent to the website server, but pulled from the browser’s memory. An analysis of paths may therefore yield an incomplete path that is missing the cached pages.

In the above diagram, the actual path taken through the website by the client is shown at the top, while the apparent path from the server’s point of view is shown at the bottom. In this case, before proceeding to Page-3 the user goes back to Page-1. The server never sees this request, and from its point of view it appears the user went directly from Page-2 to Page-3. There may not even be a link from Page-2 to Page-3.

Visitor Identification Methods

As mentioned previously, Urchin has five different methods for identifying visitors, sessions and paths. The more sophisticated methods, which can address the above issues, may require special configuration of your website. The following sections describe the workings of each method in more detail.

1. IP-Only: The IP-Only method is provided for backward compatibility with Urchin 3, and for basic IT reporting where uniquely identifying sessions is not needed. This method uses only the IP Address to identify visitor sessions. Thirty minutes of inactivity will constitute a new session. The only data requirements for using this method are a timestamp and the IP Address of the visitor.

2. IP-Agent: The default method, which requires no additional configuration, uses the IP address and user-agent (browser) information to determine unique sessions. A configurable thirty-minute timeout is used to identify the beginning of a new session for a visitor. While this method is still susceptible to proxying and caching, the addition of the user-agent information can help detect multiple users from one IP address. In addition, this method includes a special AOL filter, which attempts to reduce the impact of AOL's round-robin proxying techniques.

3. Usernames: This method is provided for secure sites that require logins such as Intranets and Extranets. Websites that are only partially protected should not use this method. The Username identification is taken directly from the username field in the log file. This information is generally logged if the website is configured to require authentication. This method uses a thirty-minute period of inactivity to separate sessions from the same username.

4. Session ID: The fourth visitor identification method available in Urchin is the Session ID method, which can use pre-existing unique session identifiers to uniquely identify each session. Many content delivery applications and web servers will provide session ids to manage user interaction with the webserver. These session ids are typically located in the URI query or stored in a Cookie. As long as this information is available in the log data, Urchin can be configured to take advantage of these identifiers. Using session ids provides a much more accurate measurement of unique sessions, but still does not identify returning unique visitors. This method is also susceptible to some forms of caching, including the above example.

In many cases, the ability to use session ids may already be available, and thus the time required to configure this feature may be short. For dynamically generated sites, taking advantage of this feature should be straightforward. The result is more accurate visitor session and path analysis.

5. Urchin Traffic Monitor (UTM): The last method for visitor identification available in Urchin is the Urchin Traffic Monitor. This system was specifically designed to negate the effects of caching and proxying and allow the server to see every unique click from every visitor without significantly increasing the load on the server. The UTM system tracks return visitor behavior, loyalty and frequency of use. The client-side data collection also provides information on browser capabilities.

The UTM is installed by including a small amount of JavaScript code in each of your webpages. This can be done manually or automatically via server side includes and other template systems. Complete details on installing UTM are covered in the articles later in this section.

Once installed, the Urchin Traffic Monitor is triggered each time someone views a page from the website. The UTM Sensor uniquely identifies each visitor and sends one extra hit for each pageview. This additional hit is very lightweight and most systems will not see any additional load. The Urchin engine identifies these extra hits in the normal log file and uses this additional data to create an exact picture of every step taken by the users. This method also identifies visitors and sessions uniquely so that return visitation behavior can be properly analyzed. While this method takes a little extra time to configure, it is highly recommended for comprehensive detailed analytics.
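The thirty-minute inactivity rule used by the IP-Only and IP-Agent methods above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Urchin's actual engine: the hit records, the function names, and the choice to start a new session at exactly thirty minutes of inactivity are all hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # inactivity gap that starts a new session


def ip_only_key(hit):
    """IP-Only: a visitor is identified by IP address alone."""
    return hit["ip"]


def ip_agent_key(hit):
    """IP-Agent: a visitor is identified by IP address plus user-agent."""
    return (hit["ip"], hit["agent"])


def count_sessions(hits, key_func, timeout=SESSION_TIMEOUT):
    """Count sessions; a new one starts when a key is first seen or has
    been inactive for at least `timeout`."""
    last_seen = {}
    sessions = 0
    for hit in sorted(hits, key=lambda h: h["time"]):
        key = key_func(hit)
        previous = last_seen.get(key)
        if previous is None or hit["time"] - previous >= timeout:
            sessions += 1
        last_seen[key] = hit["time"]
    return sessions


# Hypothetical hits: two browsers behind one proxy address.
hits = [
    {"ip": "192.0.2.1", "agent": "BrowserA", "time": datetime(2005, 1, 23, 9, 0)},
    {"ip": "192.0.2.1", "agent": "BrowserB", "time": datetime(2005, 1, 23, 9, 5)},
    {"ip": "192.0.2.1", "agent": "BrowserA", "time": datetime(2005, 1, 23, 9, 10)},
    {"ip": "192.0.2.1", "agent": "BrowserA", "time": datetime(2005, 1, 23, 10, 0)},
]

print(count_sessions(hits, ip_only_key))   # sessions seen when keyed by IP only
print(count_sessions(hits, ip_agent_key))  # sessions seen when keyed by IP + agent
```

With these sample hits, keying on the IP address alone merges the two browsers into the same sessions, while adding the user-agent separates them, which is the improvement the IP-Agent method is described as providing.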

Urchin Traffic Monitor (UTM)

Overview

The patent-pending Urchin Traffic Monitor (UTM), originally available in Urchin 4, was specifically designed to provide the most accurate measurements of unique website visitors. For businesses looking to get a deeper understanding of their online visitor behavior, the UTM is an extremely valuable technology that combines the best of client and server side information while letting you control the data. Easy to install, this technology allows business owners to exactly identify unique visitors, click paths, and return loyalty metrics including first time visitors, returning visitors, and frequency of use. The second version of UTM, UTM-2, released with Urchin 5, expands these capabilities, capturing additional browser parameters and loyalty metrics. UTM-3, released with Urchin 5.5, adds a powerful campaign tracking capability. Subsequent versions of the UTM released with Urchin 5.6 and Urchin 5.7 contain a number of enhancements to the campaign tracking capability.

There are two components to the Urchin Traffic Monitor System: the UTM Sensor, which is a lightweight module installed into the content of the website; and the UTM Engine, which is part of the log processing Urchin Engine. The UTM Sensor enables client-side data collection, which is then funneled back through the web server, augmenting the normal logfile. The client-side information is combined with the existing server-side data by the UTM Engine to provide a more accurate and complete picture of website activity.

The UTM Sensor is a small amount of JavaScript code that accomplishes two important functions. First, the Sensor negates the effects of caching by forcing at least one hit to progress to the original web server for each pageview. The impact on the server is minimal, and the details about the additional hit are logged into the normal web logfiles, resulting in a more complete data set. Secondly, the UTM Sensor uniquely identifies each visitor by using client-side "1st party" cookies to keep track of first time and returning visitors. This cookie identifier is a communication tag only viewable to your web server, in the same way as session ids. It is not a third party cookie, which would provide information outside your system and violate many privacy policies.

The above diagram illustrates the operation of the UTM System. The web server in the middle of the diagram provides two basic functions: content delivery and logging. The content of the website includes the UTM Sensor, which is delivered to the user’s browser, shown on the left. The UTM Sensor sets unique identifiers and sends an additional request back to the same web server. This additional request is logged into the normal log file along with all of the normal traffic. The UTM Engine, which is part of the Urchin log processing engine, understands this additional data and merges the two types of data together, providing an accurate and more complete picture of visitor behavior.

UTM Sensor

The UTM Sensor increases the accuracy and completeness of logfile data by negating the effects of caching and proxying. The following example illustrates how the UTM Sensor handles caching. As shown in the figure below, the user receives the content of a pageview from the cached memory of the browser. This typically occurs when the user goes back to a previously viewed page. The same model applies if the caching is provided by a service provider. In the example, the content for page "X" is not delivered from the web server, but from the cached memory of the browser. At this point, there is no knowledge of the pageview, as it is not seen by the web server. However, the UTM Sensor activates an additional unique hit that forces at least one small record back to the original web server. This information is logged in the normal logfile, which now has knowledge of the originating "X" pageview.

The second important function of the UTM Sensor is to uniquely identify both sessions and unique visitors. Through a patent-pending combination of browser cookies, the Sensor detects and initializes the unique visitor and session identifiers, allowing exact monitoring of new and returning visitors regardless of service provider proxy behavior. Most service providers take advantage of proxying by recycling IP addresses and clustering users behind firewalls. This can cause problems with normal logfile tracking, which typically utilizes the IP address as an identifier of the user.

In the example shown in the figure below, the UTM Sensor is able to pierce the veil of the proxy by utilizing the cookie identifiers instead of the IP addresses. In the figure, a first time unique visitor accesses the website through a firewall with IP address #1. The delivered Pageview contains the UTM Sensor, which sets the identifier on the visitor’s browser.

On the return visit by the same visitor, shown in the bottom of the figure, the unique id is passed to the web server along with each request. So even if the user is now assigned a second IP address, the UTM technology properly identifies the visitor with the original id. In addition to negating the effects of complex proxying techniques, this also tracks visitors who travel and may use their laptops from several locations and through several providers.
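The difference between counting by IP address and counting by cookie identifier can be shown with a short Python sketch. The hit records and the "visitor_id" field are hypothetical stand-ins for the identifier the Sensor stores in its cookie; the real cookie contents are not modeled here.

```python
def count_unique(hits, field_name):
    """Count the distinct values of one field across all hits."""
    return len({hit[field_name] for hit in hits})


# Hypothetical hits: visitor A arrives from three addresses over time,
# and visitor B shares the first proxy address with A.
hits = [
    {"ip": "198.51.100.7", "visitor_id": "A"},
    {"ip": "203.0.113.9", "visitor_id": "A"},
    {"ip": "198.51.100.7", "visitor_id": "B"},
    {"ip": "192.0.2.44", "visitor_id": "A"},
]

print(count_unique(hits, "ip"))          # distinct IP addresses
print(count_unique(hits, "visitor_id"))  # distinct cookie identifiers
```

In this sample the IP-based count reports three "visitors" although only two people visited, and it cannot tell A and B apart behind the shared address; the cookie-based count is unaffected by either problem.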

Once the additional UTM data is recorded in the normal web server log, the UTM Engine will recognize and process these additional hits in order to create an exact analysis of each click of the user. During installation, it is important to check that the logging format includes both referral and cookie logging so that all of the appropriate data is stored.

Installation

There are four steps to installing the UTM system, which can be accomplished in a very short amount of time. Complex sites may be able to take advantage of existing server-side includes or centralized delivery methods to shorten the installation. During installation, you will need access and permissions to modify the content of the website. You may also need to modify the logging of the web server, which may require a different set of permissions. The following four steps do not necessarily need to be performed in order.

Upgrade Note: UTM-2, which ships with Urchin 5, is not recognized by Urchin 4. Once UTM-2 is installed, you will no longer be able to run Urchin 4. All versions of Urchin 5 recognize both UTM-1 and UTM-2. In addition, as of Urchin 5.5 there is UTM-3, and only Urchin 5.5 and up can process UTM-3 data. Therefore, when upgrading, it is important to migrate to the appropriate version of Urchin before installing a more recent version of the UTM sensor. UTM-4 data, however, can be processed by any Urchin 5.x version.

1. Install UTM Sensor into content: The first step in installing the UTM is to include the JavaScript and GIF components of the UTM Sensor in the content of the site. The two pieces necessary for completing this step are included in the util/utm/ folder within the Urchin distribution. It is important that the names of these two files are not changed and that they are copied to the document root directory of the website. Either drag and drop, upload, or copy the __utm.js and the __utm.gif files into the main directory of your website.

Once these files are in place, you will need to include the __utm.js file at the beginning of each webpage in the website. If your site utilizes server side includes and you use a header include for each file, it is possible to include the UTM in the beginning of this include file only. It will then automatically be a part of each webpage. For static HTML sites that do not use includes, you will need to add the UTM entry to each page individually. For dynamic sites that use a content generation engine, the UTM can be included at the beginning of the template that is delivered to the customer.

In any case, the following line of code should be included at the beginning of each HTML page that is delivered to the end user, but after any META tags. For static sites, edit each webpage and add the line below before the rest of the HTML content (but after any META tags).

<script src="/__utm.js" type="text/javascript"></script>

For sites that undergo regular maintenance or have multiple authors, be sure to build the addition of this line into your internal website authoring procedures, guidelines, and QA processes.

If you are using a package like "HTML Tidy", you may want to include the JavaScript line in the HEAD area of your page to make it more palatable, for instance:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<script src="/__utm.js" type="text/javascript"></script>
...
</head>

2. Set UTM Domain (if necessary): The UTM (beginning with UTM-2) has a domain setting that controls the scope of the cookies. For single websites, the default setting, "auto", can be left alone. If you have multiple websites that share a common root domain and you wish to process them together, then the domain should be set to the common root domain.

To set the domain setting, edit the __utm.js file that was copied into your document root in step 1. Towards the top, you will see the line:

var __utmdn="auto"; /*-- ...

Change the word "auto" to the domain that the cookies should apply to. The domain must be part or all of the actual URL for this site. Example:

var __utmdn="urchin.com"; /*-- ...

3. Activate cookies in the logging: The third step to installing the UTM System is to verify and potentially modify the logging format of your web server. For the UTM to function properly, both referral and cookie information must be logged. You will need access to the configuration of the web server. The following general guidelines should work for most IIS and Apache users; however, you should check with your system administrator to ensure proper formats.

For Apache Users:
Apache servers typically use a configuration file called "httpd.conf". Within this file, configuration directives determine the format and location of logfiles. By default, most Apache configurations will log in the NCSA Extended Combined format, which includes referrals and user-agents, but is missing the cookie information. Be sure that your logfiles contain the "%{Cookie}i" field specification. To modify your logging format from the default, a "special" LogFormat directive can be added and then the log files can reference this format using the CustomLog directive.

The above example is provided as a reference and does not apply to all possible Apache settings. Please refer to the Apache documentation and consult with your system administrator on the actual directives needed to activate cookie logging. The LogFormat directive specifies the specific format of the log file. The example shows the addition of the cookie information to the end of the log file. This format is then named "special" so that it can be identified in the virtual host configurations. The CustomLog directive in the virtual host specification identifies the location of the log file and the format to use. The example uses the "special" format as defined previously.
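Directives along the lines described above, together with a quick check that a format logs what the UTM needs, might look like the following Python sketch. The sample httpd.conf text is an assumption: it appends the cookie field to the common combined format under the nickname "special", and the log path is a placeholder. The function names are hypothetical as well.

```python
import re

# Assumed sample httpd.conf fragment; the log path is a placeholder.
SAMPLE_CONF = r'''
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\"" special
CustomLog logs/access_log special
'''

# Matches: LogFormat "<format with escaped quotes>" <nickname>
LOGFORMAT = re.compile(r'^\s*LogFormat\s+"((?:[^"\\]|\\.)*)"\s+(\S+)\s*$', re.M)


def log_formats(conf_text):
    """Map each LogFormat nickname to its format string."""
    return {m.group(2): m.group(1) for m in LOGFORMAT.finditer(conf_text)}


def missing_utm_fields(format_string):
    """List the referral/cookie fields that the format does not log."""
    required = ["%{Referer}i", "%{Cookie}i"]
    lowered = format_string.lower()  # header names are case-insensitive
    return [name for name in required if name.lower() not in lowered]


formats = log_formats(SAMPLE_CONF)
print(missing_utm_fields(formats["special"]))
```

An empty list means both the referral and cookie fields are present in the named format; any field reported as missing should be added to the LogFormat directive before relying on UTM data.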

For Microsoft IIS users:
The Internet Services Manager provides a point-and-click interface for adjusting the web server configuration. To access this manager, you will need to log in to the web server with the appropriate administrator privileges.

To access the Internet Services Manager, click on the "Start" menu -> Settings -> Control Panel, then double-click on the Administrative Tools folder and then on the Internet Services Manager icon to open the manager.

Modifications can either be made to each website individually or to the entire server. In the left window either right-click on the server name to modify the entire server, or right-click on the website name such as "mysite1.com." Select the "Properties" option to open the properties dialog box. For the entire server, click on the "Edit" button with "WWW Services" selected in the menu to bring up the Properties dialog box shown on the left below.

As shown in the above figure, be sure that logging is enabled and set to the "W3C Extended Log File Format." Then select the "Properties" button to configure the log file format specifics.

The window shown above will appear. Click on the "Extended Properties" tab, scroll down and make sure both the "Cookie" and "Referrer" boxes are checked. If not, check these boxes and "Apply" the changes to the site.

Whether you use IIS, Apache or another web server, please refer to your server documentation for more information on configuring logfile formats. All major web servers support the logging of cookies and are easily modified to activate this feature.

4. Set Urchin configuration to UTM: The final step in configuring the UTM for your site is to enable UTM tracking in the Urchin Configuration. This can be done either when the Profile is created or afterward by editing the Profile. Open the Urchin Configuration either directly on the machine or by logging in remotely as the "admin" user. Your installation instructions will provide more details on how to access the configuration.

Once open, click on the "Configuration" icon to the left to display a list of the existing Profiles in the configuration. To enable UTM tracking for a particular Profile, click the "Edit" button to the right of the profile name. (Note: if you have not already added the profile, do so now using the "Add" button.) After clicking on the "Edit" button, click on the "Reporting" tab to bring up the Reporting Settings Window.

Under the "Visitor Tracking Options" section, use the menu to select "Urchin Traffic Monitor (UTM)" for the Visitor Tracking Method. If you explicitly set the UTM Domain in step 2, then set the UTM Domain setting in the above figure to the same value as in step 2. If you did not specify the domain in step 2, then set the UTM Domain to the address of your website without the "www.". If your website domain does not start with "www.", then use the whole thing. Click the "Update" button to save your settings. That’s it. The installation is complete, and future traffic will contain and benefit from the UTM System.
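The rule above for choosing the UTM Domain when none was set in step 2 can be written as a small Python helper. The function name is hypothetical, and trimming whitespace and lowercasing the hostname are assumptions added for convenience.

```python
def utm_domain(hostname):
    """Return the website address without a leading "www."."""
    host = hostname.strip().lower()
    prefix = "www."
    return host[len(prefix):] if host.startswith(prefix) else host


print(utm_domain("www.urchin.com"))
print(utm_domain("store.urchin.com"))
```

A hostname that does not begin with "www." is returned whole, matching the instruction to "use the whole thing".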

Session-ID Identification

Overview

Many application servers, including ASP pages, use a unique session number to identify individuals currently on the site. While this information doesn't usually contain any historical tracking, it does provide an accurate way of identifying unique sessions.

Session IDs are typically located in either the URL query parameters or in a Cookie that is assigned to the user. As long as this information is logged into the Log File, Urchin can use it to uniquely identify each session. Using Session IDs increases the accuracy of reporting by defeating the effects of proxy servers. Using Session IDs does not provide unique visitor tracking like the UTM system, but if you already have Session IDs in place, it can be an easy way to increase the session accuracy immediately.

Session ID Location

Before configuring Urchin to use Session IDs, check your log file to make sure the IDs are coming through and make a note of the field and format. If the Session IDs are in the request, then the 'request_query' field will contain the variable string. If they are in the cookie field, then the 'cs_cookie' field should be used.

Make a note of the field and the variable name used to mark the identifier. If you don't see the ID in the Log File and you are sure you are using Session IDs, check to see that the logging format contains the appropriate field.

Urchin Configuration

Once you have the Session ID information, you can easily set your Profile in Urchin to use this for visitor identification. Bring up the Urchin Administration Interface and under Configuration, click on Edit next to the Profile you wish to configure (or click Add if you don't have the Profile configured yet). Click on the Reporting Tab to bring up the "Visitor Identification" settings:

As shown in the above image, change the Visitor Tracking Method to "Session ID" and set the Session Field to either request_query or cs_cookie as determined above. Then enter what comes before and after the Session ID in the two Parsing boxes. For the first example provided at the beginning of this document, sid=12345, enter "sid=" and "" in the two boxes. Click Update and you are ready to go.
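The "before" and "after" parsing boxes can be thought of as delimiters around the identifier. The Python sketch below illustrates the idea; it is not Urchin's parser, the function name is hypothetical, and treating an empty "after" value as "read to the end of the field" is an assumption.

```python
def extract_session_id(field_value, before, after=""):
    """Return the text between `before` and `after`, or None if absent."""
    start = field_value.find(before)
    if start == -1:
        return None
    start += len(before)
    end = field_value.find(after, start) if after else -1
    if end == -1:
        end = len(field_value)  # no closing delimiter: take the rest
    return field_value[start:end]


print(extract_session_id("sid=12345", "sid=", ""))
print(extract_session_id("page=2&sid=12345&lang=en", "sid=", "&"))
```

The second call shows why a closing delimiter matters when the Session ID is followed by other query parameters.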

UTM Quick-Install (Apache)

The following is intended as a quick run-through on installing the UTM for websites running on an Apache server on all platforms except for Sun Cobalt. For more detailed information on the UTM, please see the article entitled "Urchin Traffic Monitor (UTM)" found in this section.

Step 1: Copy UTM files to website document root. The files, __utm.js and __utm.gif, are located in the "util/utm" directory in the Urchin distribution. Copy these two files to the main directory of your website content. IMPORTANT: the filenames start with two underscore characters.

Step 2: Reference UTM in your HTML. Enter the following line in all of your HTML pages. While it can go anywhere in the pages, we recommend putting it in the <head> section. If you use a common include or template, you can enter it there. IMPORTANT: the filename starts with two underscores.

<script src="/__utm.js" type="text/javascript"></script>

If you are using a package like "HTML Tidy", you may want to include the JavaScript line in the HEAD area of your page to make it more palatable, for instance:

<html>
<head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
 <script src="/__utm.js" type="text/javascript"></script>
 ...
</head>

Step 3: Enable cookies in your Apache logging. If not already enabled, cookie logging is turned on in httpd.conf by adding the cookie field (%{Cookie}i in Apache's LogFormat syntax) to the log format used for your access log.

Step 4: Set Urchin Profile to use UTM. In the Urchin Administration interface, edit the profile in question and click on the Reporting tab. Set the Visitor Tracking Method to UTM. Set the UTM Domain to the address of your website without the www. When done, click the Update button. Then click on the Profile Settings tab and choose UTM-Enable All for the Default Report Set, then click Update again.

That's it! Your website will now begin logging UTM data into your normal log file, which will be identified the next time you run Urchin.

Is it working? To see if the UTM is successfully making entries to your log file, examine the log after you have installed the UTM and clicked on a few pages of the site. You should see an entry similar to the following at the end of the log file:

... "GET /__utm.gif?..." 200 ..."__utma=..."

If you don't see the __utma entries, check that cookie logging was enabled properly. If the status code is not 200, then check to make sure the files were properly copied to your document root.
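The manual check above can also be scripted. The Python sketch below counts __utm.gif requests, how many returned status 200, and how many carry a __utma cookie. It assumes NCSA-style lines in which the status code follows the quoted request; the function name utm_check and the sample lines are made up for illustration.

```python
def utm_check(log_lines):
    """Summarise UTM evidence in an iterable of access-log lines."""
    result = {"gif_hits": 0, "gif_ok": 0, "with_utma": 0}
    for line in log_lines:
        if "GET /__utm.gif" not in line:
            continue
        result["gif_hits"] += 1
        # Assumption: NCSA-style layout, where the status code is the
        # field immediately after the closing quote of the request.
        if '" 200 ' in line:
            result["gif_ok"] += 1
        if "__utma=" in line:
            result["with_utma"] += 1
    return result


# Hypothetical sample lines, not taken from a real server.
sample = [
    '1.2.3.4 - - [17/Jun/2002:10:00:00 -0700] "GET /index.html HTTP/1.1" 200 1043 "-" "Mozilla/4.0" "-"',
    '1.2.3.4 - - [17/Jun/2002:10:00:01 -0700] "GET /__utm.gif?x=1 HTTP/1.1" 200 35 "-" "Mozilla/4.0" "__utma=1.2.3"',
    '1.2.3.4 - - [17/Jun/2002:10:00:02 -0700] "GET /__utm.gif?x=2 HTTP/1.1" 404 0 "-" "Mozilla/4.0" "-"',
]
print(utm_check(sample))
```

A gif_hits count of zero suggests the script tag is missing from the pages, gif_ok lower than gif_hits points at the file copy in Step 1, and with_utma of zero points at the cookie logging in Step 3.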

Installing UTM On Every Page (Apache)

Installing the UTM sensor on every page of a web site allows Urchin to provide the most accurate analytics possible. This article describes how to easily install the UTM sensor on every page of a large site.

How can I install UTM on every page?

mod_layout is an Apache module that provides both a Footer and Header directive to automatically include output from other URIs at the beginning and ending of every web page. You can use it to include the __utm.js calls on every page of a site. It is an invaluable tool for service providers who do not wish to modify their clients' web pages, as well as for single sites with a large number of web pages.

To install mod_layout:

1. Download mod_layout from tangent.org.
2. Extract the compressed file and read the README.
3. Install mod_layout as described in INSTALL.
4. Create an HTML file called utm.html.
5. Add <script src="/__utm.js" type="text/javascript"></script> to utm.html.
6. Modify your current Apache configuration file to include the utm.html file.

Example

<VirtualHost 63.212.171.4>
ServerName urchin.com
ServerAlias www.urchin.com
LayoutHeader /path/to/file/utm.html
...
</VirtualHost>

UTM Quick-Install (IIS)

The following is intended as a quick run-through on installing the UTM for websites running on a Microsoft IIS server on any Windows platform. For more detailed information on the UTM, please see the article entitled "Urchin Traffic Monitor (UTM)" found in this section.

Step 1: Copy UTM files to website document root. The files, __utm.js and __utm.gif, are located in the "utils\utm" folder in the Urchin distribution. Copy these two files to the main folder of your website content. IMPORTANT: the filenames start with two underscore characters.

Step 2: Reference UTM in your HTML. Enter the following line in all of your HTML pages. While it can go anywhere in the pages, we recommend putting it in the <head> section. If you use a common include or template, you can enter it there. IMPORTANT: the filename starts with two underscores.

<script src="/__utm.js" type="text/javascript"></script>

If you are using a package like "HTML Tidy", you may want to include the JavaScript line in the HEAD area of your page to make it more palatable, for instance:

<html>
<head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
 <script src="/__utm.js" type="text/javascript"></script>
 ...
</head>

Step 3: Enable cookies in your IIS logging. Open the IIS Manager and bring up the Properties window for your website. Make sure the logging is enabled and set to the W3C Extended format. Click the Properties button next to the format and under the Extended Properties Tab, check the box next to Cookie.

Step 4: Set Urchin Profile to use UTM. In the Urchin Administration interface, edit the profile in question and click on the Reporting tab. Set the Visitor Tracking Method to UTM. Set the UTM Domain to the address of your website without the www. When done, click the Update button. Then click on the Profile Settings tab and choose UTM-Enable All for the Default Report Set, then click Update again.

That's it! Your website will now begin logging UTM data into your normal log file, which will be identified the next time you run Urchin.

Is it working? To see if the UTM is successfully making entries to your log file, examine the log after you have installed the UTM and clicked on a few pages of the site. You should see an entry similar to the following at the end of the log file:

... "GET /__utm.gif?..." 200 ..."__utma=..."

If you don't see the __utma entries, check that cookie logging was enabled properly. If the status code is not 200, then check to make sure the files were properly copied to your document root.

Using UTM with Domain Aliases

Background

Because cookies are domain-based objects, there are some important considerations when a site has multiple domains. A cookie that is set under a domain, "mysite.com", will be passed to all subdomains such as "www.mysite.com". However, this cookie will not be passed to "mysite.net" or any other different root domains.

If your website only has one domain responding to "mysite.com" and "www.mysite.com", you can follow the standard UTM installation. However, if you have a website with one or many aliases, it is recommended to redirect traffic from the aliases to the primary site. This will ensure that the UTM visitor tracking is set under the primary domain and that all visitors are tracked consistently.

Otherwise, a visitor may appear as two visitors if they access the same site through two separate domains. The following instructions provide an example of how to redirect aliased domains to the primary domain in Apache and IIS servers.
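The domain rule described in this background can be expressed as a small check. The Python sketch below is a simplified version of cookie domain matching (exact match, or the host ends with a dot followed by the cookie domain); browsers apply further rules that are not modelled here, and the function name cookie_applies is hypothetical.

```python
def cookie_applies(cookie_domain, host):
    """True if a cookie set for `cookie_domain` would be sent to `host`.

    Simplified rule: the host equals the cookie domain, or is a
    subdomain of it. Real browsers apply additional restrictions.
    """
    cookie_domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    return host == cookie_domain or host.endswith("." + cookie_domain)


print(cookie_applies("mysite.com", "www.mysite.com"))  # True
print(cookie_applies("mysite.com", "mysite.net"))      # False
```

This is why a visitor who arrives once through mysite.com and once through mysite.net carries two unrelated sets of cookies and is counted twice unless the aliases are redirected.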

Redirecting Aliases in Apache

If you are using an Apache webserver, the configuration can be easily modified to redirect all traffic originating under one of the aliases to the primary site. One way to do this is to create two VirtualHost entries. The first will be the primary domain, which will include your normal configuration; the second VirtualHost will be for all the aliases and will redirect to the primary. Example:

#---primary virtualhost
<VirtualHost 1.2.3.4>
ServerName www.mysite.com
ServerAlias mysite.com
...
</VirtualHost>

#---second virtualhost
<VirtualHost 1.2.3.4>
ServerName mysite.org
ServerAlias www.mysite.org mysite.net www.mysite.net
RewriteEngine on
RewriteRule ^(.*) http://www.mysite.com$1 [R=301]
</VirtualHost>

The second VirtualHost uses a rewrite rule with a 301 (Moved Permanently) redirect code to forward all traffic to the original site. A single 301 hit will still be recorded in the log file, which is useful for tracking which domains people are entering on, but all remaining traffic will be forced under the one domain. At this point, as far as the UTM is concerned, the site appears to be a one-domain site and is ready for normal UTM installation.

Note: you should work with your administrator and consult the apache.org site for details on configuration parameters.
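To make the effect of the second VirtualHost concrete, the following Python sketch models the decision it encodes: requests for an alias host receive a 301 to the same path on the primary host, while requests for the primary site are served normally. It is an illustration of the logic only, not of how Apache evaluates rewrite rules; the names ALIASES, PRIMARY and redirect_for are hypothetical.

```python
# Host names taken from the VirtualHost example above.
ALIASES = {"mysite.org", "www.mysite.org", "mysite.net", "www.mysite.net"}
PRIMARY = "www.mysite.com"


def redirect_for(host, path):
    """Return (status, location) for alias hosts, or None for the primary site."""
    if host.lower() in ALIASES:
        # Same path, primary host: mirrors "http://www.mysite.com$1".
        return 301, "http://%s%s" % (PRIMARY, path)
    return None


print(redirect_for("mysite.net", "/products/index.html"))
print(redirect_for("www.mysite.com", "/products/index.html"))
```

The first call returns a 301 pointing at http://www.mysite.com/products/index.html; the second returns None because the primary site needs no redirect.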

Redirecting Aliases in IIS

If you are using a Microsoft IIS webserver, the configuration can be easily modified to redirect all traffic originating under one of the aliases to the primary site. One way to do this is to create two websites in the IIS configuration. The first will be the primary domain (www.mysite.com), which will include your normal configuration; the second will be for all the aliases (mysite.net, mysite.org, etc.) and will redirect to the primary.

In the IIS Manager, right-click on one of the websites and bring up the properties dialog. On the "Web Site" tab, click the "Advanced..." button. This brings up the window where additional domains can be assigned to the website using the "Host Header" field. Set the primary domain in the primary website, and use the second website to house all of the aliases.

Once the second website housing all of the aliases is configured and enabled, create a blank homepage with the following redirect code:

<head>
<META HTTP-EQUIV=Refresh CONTENT="0; URL=http://www.mysite.com/">
</head>

This will instruct the visitor's browser to immediately redirect to the primary URL. At this point, the primary website appears to be a simple one-domain configuration, and normal UTM installation can proceed with default settings.

Using UTM with Multiple Sites

Multiple Sites - Same Root Domain

Multiple sites with the same root domain (e.g., www.urchin.com and help.urchin.com) can either be processed together or separately, depending on the UTM Domain setting of the two sites. If the UTM Domain is set to the default, "auto", then the two sites will be processed separately. This means that Visitor tracking information will be kept separate for each site. Visitor reporting for one site will not be affected by visitor traffic to the other site.

Process Together

If you wish to process the sites together, sharing Visitor tracking information, then the UTM Domain can be explicitly set to the common domain (e.g., urchin.com). You will need to set this in the UTM code and the Urchin configuration. To set this in UTM code, edit the __utm.js file in the document root of each site. Towards the top, you will see the line:

var __utmdn="auto"; /*-- ...

Change the word "auto" to the common domain:

var __utmdn="urchin.com"; /*-- ...

Next, in the Urchin configuration, create a single Profile with UTM activated, and set the UTM Domain to the common domain. You will be processing the logs from both sites in the same Profile.

In processing the logs for two sites together, it is recommended to apply a Filter to one of the logs in order to distinguish pages and paths. For the www.urchin.com and help.urchin.com example, inserting '/help' in front of the URLs for the help.urchin.com log will allow you to distinguish between http://www.urchin.com/foo.html and http://help.urchin.com/foo.html. The resulting pages will be referenced as "/foo.html" and "/help/foo.html", respectively.

Create a search and replace filter on the 'request_stem' field with the following settings:

Filter Field: request_stem
Search String: ^/
Replace String: /help/

In our example, this filter would then be applied to the log file for help.urchin.com. Running the two separate Log Sources together will require an additional Load Balanced Server Module in the license. Please contact your sales representative for details.
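The effect of the search and replace filter above can be previewed outside Urchin. The Python sketch below applies the same search and replace strings using Python's re module; it assumes that Urchin's pattern syntax behaves like Python's for this simple anchored pattern, and the function name apply_filter is hypothetical.

```python
import re


def apply_filter(request_stem, search=r"^/", replace="/help/"):
    """Apply a search-and-replace filter to one request_stem value."""
    # count=1 makes the single-substitution intent explicit; the "^"
    # anchor already limits the match to the leading slash.
    return re.sub(search, replace, request_stem, count=1)


print(apply_filter("/foo.html"))        # /help/foo.html
print(apply_filter("/docs/faq.html"))   # /help/docs/faq.html
```

Because only the leading slash is matched, the rest of each path is preserved unchanged beneath the /help/ prefix.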

Tracking Flash and Browser Events (UTM-5 only)

You can track any browser-based event, including Flash and JavaScript events, if you have installed the UTM-5 (available at ftp://ftp.urchin.com/urchin5/utm-5/) on your website. To track an event, call the urchinTracker JavaScript function with an argument specifying a name for the event. For example, calling:

javascript:urchinTracker('/homepage/flashbuttons/button1');

will cause each occurrence of the calling Flash event to be logged as though it were a pageview under the name /homepage/flashbuttons/button1. The argument must begin with a forward slash. The event names may be organized into any directory-style structure you wish.

For example, if you wish to organize Flash events by page and by type of event, you might organize a hierarchy along these lines:

• '/homepage/flashbuttons/button1'
• '/homepage/clips/clip1'

Flash Code Examples

on (release) {
 // Track with no action
 getURL("javascript:urchinTracker('/folder/file');");
}

on (release) {
 // Track with action
 getURL("javascript:urchinTracker('/folder/file');");
 _root.gotoAndPlay(3);
 myVar = "Flash Track Test";
}

onClipEvent (enterFrame) {
 getURL("javascript:urchinTracker('/folder/file');");
}

HTML Code Examples

The following illustrates how to log an onClick event:

<a href="javascript:void(0);" onClick="javascript:urchinTracker('/folder/file');">

The following illustrates how to log a rollover event:

<a href="javascript:void(0);" onMouseOver="javascript:urchinTracker('/folder/file');">

Tracking Banner Ad Exits and Other Outbound Links

If you publish advertising banners on your site, there is an easy way for you to track which banners visitors click on to leave your site and which advertisers they visit.

First, make sure that you have installed the UTM-5 (available at ftp://ftp.urchin.com/urchin5/utm-5/) on your website.

Next, you will need to add some code to each of the banners.

For an animated GIF or other type of static banner ad, you would add the following code:

<a href="http://www.advertisersite.com" onClick="javascript:urchinTracker('/bannerads/advertisername/bannername');">

This code causes each click on the banner to be logged as though it were a pageview named /bannerads/advertisername/bannername. It is a good idea to log all of your advertising banners into a logical directory structure such as /bannerads/the name of the advertiser/the name of the banner. This way, you will be able to easily identify the number of referrals to each advertiser.

The equivalent code for a Flash banner is provided below:

on (release) {
 getURL("javascript:urchinTracker('/bannerads/advertisername/bannername');");
 getURL("http://www.advertisersite.com");
}

Chapter 3: Urchin Administration

Administration Overview

Introduction

The Urchin Administration Interface is a browser-based command center from which you can control virtually everything related to running Urchin, including setting up Profiles, scheduling log processing events, managing Users and Groups, configuring Filters, and much more.

To get started, log in to your Urchin system using a browser. If the default port was used during installation, then the URL should be http://your.server.com:9999/, replacing 'your.server.com' with the actual name of the system Urchin is running on. Alternatively, http://localhost:9999/ can be used if you are directly on the system. On Windows platforms, there is an 'Urchin Administration' shortcut in the Start menu. The default password for the 'admin' account is 'urchin'. Be sure to change this to a more secure password.

Controls

After logging into the system and proceeding through the startup wizard, you will see the administration screen with a navigation menu on the left side. The three primary buttons are 'View Reports', 'Configuration', and 'Preferences'. Click on the 'Configuration' button to begin configuring Urchin.

This menu provides access to all of the critical configuration controls. Click on the arrows to expand a particular section. The darkened color indicates which control is currently being displayed. When first clicking on one of the configuration sections, a list of existing entries may be shown with appropriate 'Edit' buttons next to each entry.

Clicking on the 'Edit' button next to a particular entry will allow you to modify the configuration for the entry. To add new entries, click the 'Add' button in the upper right. After clicking 'Edit' on a particular entry, the set of configuration screens available for that entry is shown using tabs across the top to select the particular configuration subject.

Click on a particular tab to access the configuration settings under the tab. After changing any settings, be sure to click the 'Update' button provided at the bottom of each screen.

Once you have a long list of entries in a particular area, there are some additional controls that make it easier to find those entries. The Next/Previous buttons, located just above and below the list of entries, scroll through the entries. The number shown can also be increased to display more entries at one time.

The + - Filter option can help you quickly find a particular entry. Simply enter all or part of the entry's name and press return.

Details about each section are provided further in this manual and by clicking the 'Help' link provided at the bottom of each admin screen. Definitions of each configuration parameter are generally found by clicking the 'Help' link.

Profiles

Importing Profiles (Windows)

Overview

Urchin's Import Profiles function is a convenient way for users with systems running the Microsoft Internet Information Server to set up Profiles for each of their IIS sites quickly. Urchin can read the IIS configuration, determine what websites are running on the server, and then build basic Profiles for each website that use the IIS logs as their Log Sources. You can then customize the Profiles or add additional Profiles as desired for the imported sites.

How to Import Profiles

To get started importing Profiles, log in to the Urchin administration system as admin and click on the Configuration button at left. Click the Import button at top-right and you will be taken to the Import Profiles screen. This screen allows you to select which, if any, websites to import. Once you've checked sites to import, click the Import button. Click Done when you've finished with all your import choices.

Recommendations

It's a good idea to create at least one Profile for each website on the server so that you get a complete picture of traffic to the server via Urchin's Summary Report. The Summary Report gives you overall traffic information for the server, as well as a ranking of each site by various traffic parameters. This is very handy if you are a host and bill according to bandwidth usage. Note that the Summary Report only shows data based on Profiles that have been configured; sites without functioning Profiles are not included.

Working with Profiles

Overview

A Profile is the term used for a set of reports for a website and the configuration settings needed to create those reports. In general, you will need to set up a Profile for each website for which you want reporting. If needed, multiple Profiles can be used for the same website with different filtering options.

The configuration of a Profile includes information about the website, log file sources, filters, and the schedule for processing. Once a Profile is created and configured, it needs to be 'Run' in order to process raw log file data.

Licensing Information

The Urchin base license includes 100 Profiles. If you need more Profiles, the license can be upgraded by contacting your sales representative or by clicking on the Settings -> License -> Upgrade link within the configuration. The base license also includes one server per Profile. If you need additional Load Balanced Servers, you will need to upgrade your license.

Creating a Profile

To get started creating a Profile, log in to the Urchin administration interface as an Urchin admin and click on the Configuration button at left. To create a new Profile, click the Add button at top-right. You will be taken to the Add Profile Wizard. This is a simple series of steps designed to help you get the Profile set up in basic form quickly and easily. Each screen in the Wizard has explicit help information that is available by clicking on the ? icon.

Once a Profile is created, the configuration can be modified by clicking the 'Edit' button next to that entry in the list. Tabs are provided at the top of the configuration area to easily access the different configuration screens.

Recommendations

Urchin has several different methods for identifying visitors and sessions, depending on available information. Of these, the patent-pending Urchin Traffic Monitor (UTM) is a highly accurate system that was specifically designed to identify unique visitors, sessions, exact paths, and return frequency behavior. There are a number of visitor loyalty and client reports that are only available when using the UTM System. The UTM System is easy to install and is highly recommended for all businesses. To install UTM, please refer to the UTM install instructions in the Visitor Tracking section of this documentation.

If you intend to set up one or more Filters in conjunction with your Profile, it is advisable to have more than one Profile for that website or part thereof. We recommend having one Profile that is the "master", containing everything. If you wish, for example, to filter out spiders or robots, it's a good idea to put these Filters in a second Profile so you can easily compare the results of the Filters to the master Profile.

Log Files

Working with Log Sources

Overview

You will generally add a Log Source in the course of creating a Profile. A Log Source is Urchin's way of identifying the characteristics of an access log (sometimes called a transfer log) for one of your websites. Access logs contain all the hits, or requests for web documents, that are made to your website. Some of the log file characteristics that are associated with a Log Source are the path to the log file, the format of the log file (e.g. W3C or NCSA), whether the log is local or on a remote system, and whether a filter should be applied to the log file during processing.

An important concept to understand is that Log Sources exist independently of Profiles. Every Profile must have at least one Log Source associated with it to obtain reporting. However, several Profiles could conceivably use the same Log Source. For example, you may want to create multiple Profiles using the same Log Source, but give each Profile a different filter to produce varying report results. So there is not necessarily a 1:1 ratio between Log Sources and Profiles.

Configuring Log Sources

To get started adding a Log Source to the system, log in to the Urchin administrative system as the administrator and click on the Configuration button at left. Next, click the Log Manager button. To create a new Log Source, click the Add button at the top right of the screen. You will be taken to the Add Log Source Wizard. This is a simple series of steps designed to help you get the Log Source set up quickly and easily. Each screen in the Wizard has explicit help information to explain the configuration information displayed on that screen.

In the Log Settings screen you will note that you have to choose a Log Format. This setting tells Urchin how the data in your log file is arranged. It is important that you select the correct format for your log or Urchin will not be able to produce meaningful report data. Urchin understands a default set of log formats that you can choose from via a dropdown menu. They are:

• Auto: Urchin uses this format to automatically detect NCSA, W3C, Netscape, ELF, and ELF2 log formats. Instead of explicitly selecting one of these, you may choose Auto and Urchin will correctly deduce how to read the data if your log format is in this list.
• NCSA: Apache modified Extended/Combined format (see Logging - Apache and IIS for a description of this format).
• W3C: Microsoft IIS servers typically use this format, although other webservers can also be configured to produce W3C logs.
• Netscape: Netscape and iPlanet servers use this format by default.
• ELF/ELF2: E-Commerce Log Format; see the specification in the E-commerce Module section for details.
• Google: If you have licensed the Campaign Tracking Module, use this format for logs containing Google cost-per-click spending data. Note that the Google log format cannot be auto-detected.
• Overture: If you have licensed the Campaign Tracking Module, use this format for logs containing Overture cost-per-click spending data. Note that the Overture log format cannot be auto-detected.
• Custom: Although not initially listed in the dropdown menu, you can create your own custom log formats, which will automatically appear in the dropdown menu when properly configured. Please refer to the "Custom Log Formats" article in the Advanced Topics -> Customization section of the Documentation Center.

If you don't believe your webserver currently produces logs in one of the recognized default formats, then you can either reconfigure your webserver to log in one of these formats or create a custom log format that conforms to how your webserver currently logs. If you want to reconfigure your webserver logging, then it is recommended that you choose the W3C or NCSA style logging.

Load Balancing and Parallel Log Processing

If you have purchased a Load Balancing License, the Log Source Wizard provides a Parallel Log Processing option. When Parallel Log Processing is enabled, Urchin opens all of the log files at once and reads them in a rotating fashion, one section at a time, each section corresponding to 15 minutes of log activity. Enabling Parallel Log Processing significantly increases performance on load balanced sites.
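The rotating read described above can be pictured with a short Python sketch. It visits each log in turn and takes only the entries that fall inside the current 15-minute window before moving to the next log, then advances the window. This is an illustration of the idea under simplifying assumptions (each log is an in-memory list of (timestamp, text) pairs already in time order); it is not Urchin's implementation, and the name read_in_windows is hypothetical.

```python
from datetime import datetime, timedelta


def read_in_windows(logs, window=timedelta(minutes=15)):
    """Yield (log_index, text), visiting each log once per time window."""
    positions = [0] * len(logs)          # next unread entry in each log
    starts = [log[0][0] for log in logs if log]
    if not starts:
        return
    window_start = min(starts)
    while any(pos < len(log) for pos, log in zip(positions, logs)):
        window_end = window_start + window
        for i, log in enumerate(logs):
            # Take this log's entries that belong to the current window.
            while positions[i] < len(log) and log[positions[i]][0] < window_end:
                yield i, log[positions[i]][1]
                positions[i] += 1
        window_start = window_end


t0 = datetime(2002, 6, 17, 10, 0)
server_a = [(t0, "a1"), (t0 + timedelta(minutes=20), "a2")]
server_b = [(t0 + timedelta(minutes=5), "b1"), (t0 + timedelta(minutes=16), "b2")]
print(list(read_in_windows([server_a, server_b])))
# [(0, 'a1'), (1, 'b1'), (0, 'a2'), (1, 'b2')]
```

Reading in windows keeps hits from different servers roughly in time order without having to merge and sort the complete logs first.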

Log Management

Overview

Log management is an important concern when running software such as Urchin. Because busy sites will build up large log files fairly quickly (up to several gigabytes in one month in some cases), log management should be considered carefully. It is recommended that a standard log rotation practice be established. Compressing and otherwise archiving files offline are standard practices. Please see the article on Log Rotation Best Practices in this section for further information on establishing such a procedure.

Log management is necessary only for disk resource usage considerations, not for purposes of avoiding reprocessing data. Urchin does not need any sort of log rotation to avoid data duplication, as it is equipped with a log tracking capability that ensures that previously read log data is not reprocessed. Because Urchin should never need to re-read a log file once it has been processed, at your discretion you may delete the log(s) after each processing run. However, it is not uncommon to keep old logs for a specified amount of time for historical or auditing reasons.

Managing Logs via Urchin

Each Log Source has a Log Destiny setting with the options Don't Touch, Archive/Compress, and Delete. Once all Profiles that are utilizing a Log Source have finished their processing, Urchin uses the Log Destiny setting to determine the disposition of the Log Source. The Log Destiny setting is accessible under the Advanced Settings tab for a given Log Source. It is recommended to set Log Destiny to Archive/Compress so that you save disk space if you want to keep your logs for some period of time. If you are comfortable with a log being removed once it has been processed, then you can choose a Log Destiny of Delete. However, realize that this means you will not have the option of rerunning Urchin against that log in the future unless you have a backup elsewhere.

Considerations

A few special situations should be noted:

• Do not use the Archive or Delete options with a Log Source if you are processing live logs. A live log is one that is being actively written to by a webserver. Using these settings with a live log will cause a loss of data.
• If Log Destiny for a remotely retrieved Log Source is set to Don't Touch, then that log will grow continually unless there is some process external to Urchin that is handling log management on the machine where the log is created. Since Urchin must transfer a copy of the remote log file to the local system before processing, as the log file grows it will take Urchin longer and longer to transfer the file. This will have the side effect of lengthening your overall Urchin run time.

Log Rotation Best Practices

Overview

In most operating environments, system services and applications such as webservers generate log files that record actions and events related to those services. In most cases, it is also standard practice for the operating system and/or applications to perform regular maintenance on the log files to keep their size in check. This prevents the log files from growing without bounds and eventually exhausting disk space.

A common approach to managing logs is to have a regularly scheduled log rotation task that renames the existing logs with a timestamp and then restarts the service or application with a new, zero-length log file. It is also a standard practice for the log rotation task to compress the old log files, and to delete log files after a certain age or rotation cycle threshold has been reached.

In the specific case of webserver logs, the rotation is usually handled on a daily basis to ensure that the logs remain at a manageable size. In addition, a daily rotation schedule is generally a good granularity to facilitate post-processing of webserver logs with an analysis tool like Urchin. Some webservers such as Microsoft's IIS have built-in log rotation functionality, which, when enabled, will rotate logs on a daily basis by default. Other webservers such as Apache have no explicit log rotation handler, but provide tools for easily restarting the webserver (without loss of web service) to accommodate the log rotation operation (e.g. apachectl restart).

Log Rotation in Previous Versions of Urchin

Unlike Urchin 4 and 5, previous versions of Urchin have no built-in log tracking mechanism to determine which logs have already been processed, so those earlier Urchin versions depend heavily on a reliable log rotation scheme to ensure that logs are only processed a single time. As such, pre-Urchin 4 versions have the option of providing simple log rotation functionality and the ability to restart the webserver as part of the overall processing duties. If this Urchin log rotation mechanism is not utilized, the responsibility of reliable log rotation must be handled completely by an external log management mechanism. This has traditionally been the function of a larger overall system log management scheme provided as part of the operating system (e.g. the open-source "logrotate" found in many Linux distributions).

Log Rotation Practices with Urchin 5

With the advent of Urchin 4, the need for log rotation to avoid duplicate processing of logs has been eliminated thanks to Urchin's Log Tracking technology. This allows Urchin 5 much greater flexibility in processing of logs, such as the ability to process "live" logs that are still being written by the webserver, or to process logs that are rotated on a manual or irregular basis.

Important Note: Unlike previous versions of Urchin, Urchin 5 does not provide hooks for invoking a log rotation procedure or restarting a webserver after log rotation tasks have been performed, although certain post-log-processing actions are possible as described below.

While Urchin 5 operation does not require that webserver logs be rotated regularly or at all, it is recommended that a standard log rotation scheme be implemented to ensure smooth operation and to keep the Log Tracking utility from having to do a lot of unnecessary processing. It is much more efficient from both a system and application standpoint to manage several smaller logs than one very large log, as file operations tend to slow considerably as files get larger. Smaller files are also much easier to back up and restore in the event of a disk failure or other system failure.

Log rotation mechanisms need not be overly complex. In most cases, a simple shell script or Perl script run daily from cron on UNIX-type systems is all that is necessary. The script merely needs to rotate the existing webserver log, timestamp it (the %Y%m%d or YYYYMMDD format is recommended), and restart the webserver. Additional logic can be added to prune old logfiles to keep disk space usage in check. A sample log rotation script written in Perl can be downloaded from http://www.urchin.com/support in the Helper Scripts area. This script rotates one or more logs and timestamps them appropriately, then removes logs that are older than a configurable number of days. Note: If you are running IIS on a Windows system, log rotation functionality is included as part of IIS management and no external script is needed.
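As an illustration only, the rotate-timestamp-prune logic described above might be sketched in Python as follows. This is not the Helper Scripts Perl script. The function names are hypothetical, the sketch assumes it is run just after midnight (so the rotated log is stamped with the previous day's date), and the site-specific webserver restart is deliberately left out.

```python
import os
from datetime import datetime, timedelta


def rotate_log(log_path, now=None):
    """Rename log_path to log_path.YYYYMMDD, stamped with the previous day's date."""
    now = now or datetime.now()
    stamp = (now - timedelta(days=1)).strftime("%Y%m%d")
    rotated = f"{log_path}.{stamp}"
    os.rename(log_path, rotated)
    # The webserver must be restarted (or told to reopen its logs) at this
    # point so that it starts a fresh file. That command is site-specific
    # and is intentionally not part of this sketch.
    return rotated


def prune_logs(log_path, keep_days, now=None):
    """Delete rotated copies of log_path last modified more than keep_days ago."""
    cutoff = (now or datetime.now()).timestamp() - keep_days * 86400
    directory = os.path.dirname(log_path) or "."
    prefix = os.path.basename(log_path) + "."
    removed = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if name.startswith(prefix) and os.path.getmtime(full) < cutoff:
            os.remove(full)
            removed.append(full)
    return removed
```

A cron job could call rotate_log("/var/log/httpd/access.log") followed by prune_logs("/var/log/httpd/access.log", keep_days=30); both the path and the retention period here are placeholders.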

Configuring Urchin 5 for Use with Log Rotation

Once your log rotation scheme is in place, it is a simple matter to configure Urchin to process your rotated logs. When configuring a Log Source, you can either set the Log File Path to a wildcard that matches the time-stamped log filename pattern (e.g. access-log.* for Apache logs or ex*.log for IIS logs), or use Urchin's built-in timestamp pattern matching (e.g. access-log.%Y%m%d for Apache, ex%y%m%d.log for IIS). When Urchin encounters such a pattern, it substitutes yesterday's date for the %Y%m%d pattern and processes the log with the resulting filename (e.g. access-log.20020617). For further information on the date matching pattern, please see the article in this section entitled Wildcard & Date Substitution in Log Path.

The wildcard specification has the advantage of allowing you to place a number of unprocessed logs in a single directory and have Urchin process them the next time it runs. This is especially convenient for handling situations where the expected logfiles are not in place when Urchin runs, e.g. due to a remote webserver being down or loss of network connectivity. The disadvantage is that Urchin must open up the directory and search each log file to determine if it has already been processed, and this can induce significant overhead when many log files are resident in the directory. If you deem your log rotation scheme to be reliable, using the YYYYMMDD pattern matching scheme is a more efficient method.

You may also wish to have Urchin 5 delete or archive/compress the log once it has been processed. Different Log Destiny options can be set in the Advanced Settings of a Log Source. For more information on these Log Destiny settings, please see the Log Management document in the Log Files section of the Urchin Administration area.

Important! Log Destiny options should not be used with live logs that have not been rotated!

Configuring Log Rotation on UNIX−type systems

Due to the large variation in operating system functionality and webserver configurations, and the high likelihood that log rotation procedures are site-specific, there is no cookbook method for establishing webserver log rotation on UNIX-type systems. However, a sample log rotation script called WebLogRotate is available from the Urchin web site in the Helper Scripts area. This script is written in Perl to make it as portable as possible, and is typically invoked from cron on a daily basis.

Configuring Log Rotation for Windows IIS Webservers

As mentioned above, the management functions of IIS allow for automatic log rotation of webserver logs, though this functionality is not enabled by default. Please follow the steps below to configure an IIS webserver for proper log rotation. It is recommended that the logs be rotated daily, and that the log rotation be set to happen in relation to local time. By default, IIS will rotate logs at midnight GMT rather than local time.

Under Windows 2000, you should ensure that the IIS webserver is configured properly to do log rotation. This is accomplished using the Computer Management function of Windows 2000. Windows NT, Windows XP and Windows 2003 Server use a similar procedure. To open Computer Management and establish log rotation, perform the following actions:

• Click Start -> Settings -> Control Panel
• Double-click Administrative Tools
• Double-click Computer Management
• Double-click Internet Information Services
• Right-click Default Web Site and select Properties
• In the pop-up window, select the Web Site tab
• At the bottom of the window, click the Properties button
• Click the Daily radio button under the New Log Time Period heading
• Click the Use local time for file naming and rollover checkbox

This will ensure that IIS rotates the webserver logs on a daily basis just after midnight.

Logging − Apache and IIS

Overview

It is critical to set up your webserver logging in a format that allows Urchin to properly interpret the data and produce fully detailed reporting. This article explains the process for the most common webservers, Apache and Microsoft IIS. For maximum reporting depth, it is important to enable logging to include Referral and User Agent information. To enable unique visitor reporting when using the Urchin Tracking Module (UTM), it is additionally required to enable cookie logging. UTM-based tracking is the only way to get true unique visitor reporting. It is advisable, although not required, that you decide whether you want to use UTM prior to changing your webserver logging. If so, you should enable cookies in your logs now. It will not hurt if you enable cookies but do not install UTM on your website immediately. You may want to look over the section on Visitor Tracking to familiarize yourself with the UTM installation before proceeding.

Configuration

Apache

By default, Apache generally logs in what is called common log format, and also provides an option to log in a more detailed format known as NCSA extended/combined log format. For optimal reporting, Urchin requires a variation of the NCSA extended/combined format. To configure Apache to use the appropriate format, do the following:

1. Make a backup copy of your httpd.conf file. Then use a text editor to open your original httpd.conf.

2. Locate the section containing lines that begin with the word LogFormat.

3. Insert a new LogFormat line using one of the forms shown below, depending on whether you will be using UTM or not. The LogFormat entry must be added to your configuration file as a single line without carriage returns or line breaks. Take care to enter all of the characters correctly.

For websites that will not use UTM:

LogFormat "%h %v %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" urchin

For UTM-enabled websites:

LogFormat "%h %v %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\"" urchin

The word "urchin" at the end of the LogFormat line is a nickname that will be used elsewhere in your httpd.conf to apply this format to a log file. This string can be anything you choose. Using "urchin" will help identify that this entry was created to accommodate Urchin processing.

4. Examine the <VirtualHost> entry for which you wish to enable this new logging format. Deactivate any existing TransferLog or CustomLog entries within a <VirtualHost></VirtualHost> group by inserting a # in front (e.g. TransferLog becomes #TransferLog). Then insert the following new CustomLog entry, replacing the string path_to_log with the appropriate path to your log location:

CustomLog path_to_log/access.log urchin

If you chose some identifier other than "urchin" as the nickname for your LogFormat entry earlier, use that nickname in place of "urchin" in the CustomLog entry.

5. Save the edits to your httpd.conf file.

6. IMPORTANT! Check the syntax of your new httpd.conf by running the command:

apachectl configtest

This should produce the response Syntax OK. If not, double-check your httpd.conf file and fix any errors. If you cannot get the correct response, do not continue with this procedure. Instead, make a backup copy of your edited file, then restore the original by overwriting this version with the copy of httpd.conf you saved at the start of this procedure. This will ensure that your webserver continues to work normally while you figure out what is wrong with your changes.

7. Once you have confirmed the syntax of your httpd.conf, restart Apache. The preferred method is to call the apachectl script, which is typically installed with Apache:

apachectl restart

8. Check the logging. Open a browser and hit the site in question a few times. Then examine the last few lines of the log file specified in your CustomLog entry. You should see that several recent hits have been written to the log. For the Urchin modified extended/combined log format, a log line will look similar to this:

64.40.51.27 www.urchin.com - [28/Aug/2002:15:11:01 -0700] "GET //var/www/urchin_help-test/images/urchin_header_logo.gif HTTP/1.1" 200 3017 "http://www.urchin.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

If you have configured UTM on your site and have turned on cookie logging, a log line will look similar to this:

64.40.51.27 www.urchin.com - [28/Aug/2002:15:11:01 -0700] "GET //var/www/urchin_help-test/images/urchin_header_logo.gif HTTP/1.1" 200 3017 "http://www.urchin.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" "__utma=171060324.1378004559.1063331913.1063334677.1063521838.3; __utmb=171060324; __utmc=171060324"

Note the additional UTM cookie information at the end of the line.
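When checking the logging, it can help to confirm mechanically that a line splits into the fields the LogFormat above describes. The following Python sketch is one way to do that. The regular expression and the function name parse_line are illustrative assumptions, not part of Urchin, and the sketch does not handle quote characters that Apache has escaped inside a field.

```python
import re

# Field order follows the LogFormat above:
# %h %v %u %t "%r" %>s %b "Referer" "User-Agent" and, for UTM sites, "Cookie"
URCHIN_LINE = re.compile(
    r'(?P<host>\S+) (?P<vhost>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)"\s*"(?P<agent>[^"]*)"'
    r'(?:\s*"(?P<cookie>[^"]*)")?\s*$'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not fit the format."""
    match = URCHIN_LINE.match(line)
    return match.groupdict() if match else None
```

A line that returns None points at a LogFormat entry that was mistyped or split across lines. For a line logged without cookies, the cookie entry of the returned dict is None.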

Microsoft Internet Information Server (IIS)

Note: Microsoft IIS uses a W3C logging format.

Urchin can provide very basic reporting if your IIS log files have, at the very least, the following fields:

• Date
• Time
• C-IP
• CS-URI-Stem
• SC-Status
• SC-Bytes

These are required fields. Without them you will not get meaningful reporting. However, this minimal logging does not provide enough information for Referral and Browser reporting. Therefore it is advisable to set more detailed logging properties for your IIS server.

IIS logging properties are configured either separately for each domain on the server, or globally. For servers with more than a few domains, the global option is recommended. The following steps will ensure that the required log file fields are being recorded. If you elect to log additional fields, Urchin will just ignore them at processing time. However, logging unneeded fields will increase the size of your log files, so it is best to log only the fields needed by Urchin.

1. Launch the IIS services management tool by going to Start -> Programs -> Administrative Tools -> Computer Management.

2. Expand the Services and Applications tree, then select Internet Information Services, which should bring up a list of websites (except on Windows 2003 Server, which will require that you further expand the Web Sites folder to get a listing of sites).

3. Right-click the entry for the site you want to modify and select Properties.

4. Select the Web Site tab and, in the section at the bottom of this screen, verify that the Enable Logging checkbox is checked. Then from the Active Log Format dropdown menu choose W3C Extended Log File Format.

5. Click the Properties button next to the Active Log Format box.

6. Select the Extended Properties tab.

7. Check the boxes for the following fields:

Date [ date ]
Time [ time ]
Client IP Address [ c-ip ]
User Name [ cs-username ]
Method [ cs-method ]
URI Stem [ cs-uri-stem ]
URI Query [ cs-uri-query ]
Protocol Status [ sc-status ]
Bytes Sent [ sc-bytes ]
User Agent [ cs(User-Agent) ]
Referer [ cs(Referer) ]
Cookie [ cs(Cookie) ] (This field is only required for UTM tracking)

8. Make sure the Process Accounting box is unchecked, as it does not provide useful web access activity information.

9. Select Apply and OK on each window to save your settings.

10. It is not necessary to restart IIS. Your logs should immediately begin logging according to the new settings.
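W3C extended logs written by IIS begin with directive lines, one of which (#Fields:) lists the fields being recorded. A short Python sketch such as the following could be used to confirm that a log contains the fields listed in this section. The function name and the grouping into required and recommended sets are assumptions made for illustration; the field names are written as they appear in IIS log headers, e.g. cs(User-Agent).

```python
# Fields this section lists as the bare minimum for basic reporting.
REQUIRED = {"date", "time", "c-ip", "cs-uri-stem", "sc-status", "sc-bytes"}
# Additional fields from the checklist above (cs(Cookie) matters only for UTM).
RECOMMENDED = {"cs-username", "cs-method", "cs-uri-query",
               "cs(User-Agent)", "cs(Referer)"}


def check_fields(log_lines):
    """Return (missing_required, missing_recommended) for the last #Fields line."""
    fields = None
    for line in log_lines:
        if line.startswith("#Fields:"):
            fields = set(line[len("#Fields:"):].split())
    if fields is None:
        raise ValueError("no #Fields directive found")
    return sorted(REQUIRED - fields), sorted(RECOMMENDED - fields)
```

If the first list returned is not empty, the log lacks a field needed for even basic reporting and the Extended Properties settings should be revisited.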

Logging − iPlanet

Overview

This article provides a brief overview of how to configure logging for an iPlanet webserver to facilitate proper processing and reporting for Urchin.

Use "Netscape" type for Log Source setting.

There is a set of minimally required fields necessary for Urchin to produce reports. They are:

• date
• time
• hostname or IP address of requesting system
• request (i.e. what document did the requesting system ask from your webserver)
• status code generated by request (numeric)
• bytes (bytes transferred from server to client)

In addition, for the most complete reporting you need the following fields:

• referral
• user-agent
• cookies (if the Urchin Traffic Monitor is installed on your site)

Configuration

Init fn=flex-init access="$accesslog" format.access="%Ses->client.ip% - %Req->vars.auth-user% [%SYSDATE%] \"%Req->reqpb.clf-request%\" %Req->srvhdrs.clf-status% %Req->srvhdrs.content-length% \"%Req->headers.user-agent%\" \"%Req->headers.referer%\" \"%Req->headers.cookie%\""

Logging: Tomcat (Apache Jakarta Project)

Overview

This article describes how to configure the Tomcat webserver for use with Urchin.

Standard logging format without cookies:

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="access_log" suffix=".log" pattern="%h %v %u %t &quot;%r&quot; %s %b &quot;%{Referer}i&quot; &quot;%{User-Agent}i&quot;" resolveHosts="false"/>

You must have Tomcat 5 to log cookies:

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="access." suffix=".log" pattern="%h %v %u %t %r %s %b %{Referer}i %{User-Agent}i %{Cookie}i" resolveHosts="false"/>

Logging − Other Webservers

Overview

This article provides a brief overview of how to configure logging for webservers other than Apache and IIS to facilitate proper processing and reporting for Urchin. Urchin will process any webserver log as long as it can understand how the data is organized in each log file entry. The information in this article applies only to access logs. If you are interested in details of e-commerce logging, please see the E-commerce Module section of the Documentation Center.

Regardless of your webserver type or logging format, there is a set of minimally required fields necessary for Urchin to produce reports. They are:

• date
• time
• hostname or IP address of requesting system
• request (i.e. what document did the requesting system ask from your webserver)
• status code generated by request (numeric)
• bytes (bytes transferred from server to client)

In addition, for the most complete reporting you need the following fields:

• referral
• user-agent
• cookies (if the Urchin Traffic Monitor is installed on your site)

Configuration

The specifics of how to change logging characteristics for every webserver would be too cumbersome to list. In general, the easiest approach is to configure your logging to conform to either Urchin's NCSA or W3C form, then choose the appropriate default format from the Log Format dropdown menu in the Log Source. If your webserver can support this approach, see the document Logging - Apache and IIS. The information there on how the necessary data fields are set up may be useful to you, even though the details of the methods for making the changes won't necessarily apply to your webserver.

Wildcard & Date Substitution in Log Path

Overview

Urchin 5 allows you to specify wildcard and date matching variables in the path to a log file. When an Urchin task is executed and the log path is read, these variables are converted and compared for matches with the directories and filenames on your system.

The date matching capabilities in Urchin 5 are more extensive than those provided in previous versions of Urchin, namely:

• Date substitution may happen at any point in the pathname of the logfile; previous versions of Urchin only allowed substitutions in the actual filename specification

• A more robust and flexible date pattern matching algorithm has been implemented, although the previous YYYYMMDD-style pattern matching is still supported for backward compatibility

The most commonly used time matching variables and formats are shown next. The full set of supported time formatting variables is listed at the end of this article.

*      an asterisk matches zero or more consecutive characters
DD     is replaced by the 2 digit numeric day of the month, e.g. 01-31
%d     is equivalent to DD
MM     is replaced by the 2 digit numeric month, e.g. 01-12
%m     is equivalent to MM
YY     is replaced by the 2 digit numeric year, e.g. 01-99
YYYY   is replaced by the 4 digit numeric year, e.g. 0001-2003
%Y     is equivalent to YYYY

Note that the asterisk in this context behaves like filename matching in a UNIX or DOS command shell, not like regular expression matching, where this character would match zero or more instances of the preceding character. These variables can be combined in any way the user chooses.

The list below shows examples of how instances of these variables would translate on 08/13/2003. Note that the day specifiers DD and %d get converted into the day before the 13th.

• YYYYMMDD would translate into 20030812
• %Y-%m-%d would translate into 2003-08-12
• %Y/%m/%d would translate into 2003/08/12 (note that this has implications in a path)
• *YYYYMMDD would match any filename ending in the string 20030812
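The translations listed above can be reproduced with a few lines of Python. This is only a sketch of the substitution rule as this article describes it, not Urchin's own code; the function name is hypothetical, and the plain string replacement would also rewrite a literal "MM" or "DD" that happens to occur elsewhere in a path.

```python
from datetime import datetime, timedelta

# Legacy tokens and their strftime equivalents, longest first so that
# YYYY is rewritten before YY can match part of it.
LEGACY_TOKENS = [("YYYY", "%Y"), ("YY", "%y"), ("MM", "%m"), ("DD", "%d")]


def resolve_log_path(pattern, run_time, offset_hours=-24):
    """Substitute the date of run_time plus offset_hours into pattern."""
    for token, directive in LEGACY_TOKENS:
        pattern = pattern.replace(token, directive)
    return (run_time + timedelta(hours=offset_hours)).strftime(pattern)


run = datetime(2003, 8, 13, 2, 0)
print(resolve_log_path("YYYYMMDD", run))   # 20030812
print(resolve_log_path("%Y/%m/%d", run))   # 2003/08/12
```

Any asterisk in the pattern passes through unchanged and is left for a later filename-matching step.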

The DD and %d day specifiers get converted into the previous day by default because of the way webserver logs and Urchin processing are typically managed. Your logs will usually be rotated daily to keep them from growing too large and so that each log contains primarily data for a single day. This rotation happens most frequently just before midnight. Urchin processing would usually occur after this, when the clock has moved past midnight to the next day of the month. If you were adding a YYYYMMDD style timestamp to your log file name as it is rotated, then that date and Urchin's run time would differ by one day. Evaluating a day conversion at the time Urchin is run would result in a failure to find the correct log name, since the log timestamp would read 20030812 but Urchin would be executed on 20030813.

Although this is the most common model for log management, it isn't the only option. So Urchin has a configuration parameter that controls the manner in which these variables get resolved to a particular day. The Date/Time Wildcard Substitution in Log Path Name setting can be used to adjust a time offset that controls how DD and %d are evaluated. This setting is explained in greater detail at the end of this article.

The year, month, and day variables can be used either in the log file name or in the directory/folder path to the log file. The asterisk can only be used in the filename portion of the log file path. In addition, time format variables can be repeated within a log source path, but the asterisk may only be used once. The examples in the Procedure section will help clarify this.

Procedure

When creating or editing a Log Source, you should use the time variables in the path you enter in the Log File Path box under the Log Settings tab. As an example, a typical daily Apache webserver log rotation scheme creates a log with the datestamp indicating the date of the log entries, e.g. at 1 minute after midnight on 07/16/2002 the log rotation mechanism archives the log:

/var/log/httpd/access.log

and saves it as

/var/log/httpd/access.log.20020715

To match this pattern in the log source for an Urchin Profile, you'd simply specify

/var/log/httpd/access.log.YYYYMMDD

in the Log File Path and Urchin will automatically look for the previous day's log when it runs that day. As another example, when Microsoft's IIS webserver is configured to rotate logs daily, it will name the logfile and include the current date as part of the filename, e.g. ex021127.log. Therefore, to process a daily IIS log, you would use a logfile specification something like:

C:\WINNT\System32\LogFiles\W3SVC1\exYYMMDD.log

in the Log File Path field of the Log Source for the Profile.

To allow Urchin to process logs that are rotated more frequently than daily, you can use a combination of the YYYYMMDD syntax and wildcards to match all logfiles created the previous day. To do this, you would need to ensure that the rotated log file was named consistently, e.g. with an hour appended to the filename. In the Log File Path specification, you'd then use a pattern such as:

/var/log/httpd/access.log.YYYYMMDD*

or

C:\WINNT\System32\LogFiles\W3SVC1\exYYMMDD*.log

A more complex usage would be one where logs are stored in directories named so that they reflect the year, month, and day. Suppose you had the following directory paths for storing logs:

/logs/2003/07
/logs/2003/08
/logs/2003/09

and you kept all logs for a given month in their respective directories and each log had the day of the month appended to it (e.g. access.log.01, access.log.02). To allow Urchin to figure out what logs to process you could use one of the following log path formats:

/logs/YYYY/MM/access.log.DD
/logs/%Y/%m/access.log.%d

At log processing times, Urchin will then process all logs matching yesterday's date pattern, with any suffix. As with any use of wildcards in the Log File Path field specification, it is important that Log Tracking for the Profile be enabled to ensure that Urchin does not re-process logs.
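To see which files a combined date and wildcard pattern from this Procedure section would select, the matching can be approximated in Python with the standard fnmatch module, which implements shell-style wildcards. The sketch below is an approximation for checking your own naming scheme rather than a description of Urchin's internals. It assumes the filename pattern is written with % variables, and the function name is hypothetical. Note that fnmatch also treats ? and [ ] as special characters.

```python
import fnmatch
from datetime import datetime, timedelta


def matching_logs(name_pattern, filenames, run_time):
    """Return the filenames matched once yesterday's date is substituted."""
    yesterday = run_time - timedelta(hours=24)
    resolved = yesterday.strftime(name_pattern)   # e.g. access.log.20030812*
    return sorted(n for n in filenames if fnmatch.fnmatchcase(n, resolved))


names = ["access.log.2003081200", "access.log.2003081212",
         "access.log.2003081300", "error.log.20030812"]
print(matching_logs("access.log.%Y%m%d*", names, datetime(2003, 8, 13, 1, 0)))
# ['access.log.2003081200', 'access.log.2003081212']
```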

Considerations

To determine the date for the replacement pattern, Urchin subtracts 24 hours from the current time, based on the local time. It will properly handle month and year boundaries. However, this can be modified using the Date/Time Wildcard Substitution in Log Path Name setting under the Advanced Settings tab of a log source. You can select either Localtime or GMT time as the basis for your time adjustments, then use the Hours edit box to specify a plus or minus offset in hours.
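The effect of the time basis and the hours offset can be illustrated with a small Python sketch. The function and parameter names below are hypothetical stand-ins for the Localtime/GMT choice and the Hours box; they are not Urchin settings files or APIs.

```python
from datetime import datetime, timedelta, timezone


def substitution_time(now_utc, basis="localtime", offset_hours=-24, local_tz=None):
    """Return the moment whose date is substituted into the log path.

    now_utc must be a timezone-aware datetime. With basis "localtime",
    local_tz selects the zone (None means the system's local zone).
    """
    if basis == "gmt":
        base = now_utc.astimezone(timezone.utc)
    else:
        base = now_utc.astimezone(local_tz)
    return base + timedelta(hours=offset_hours)


# A run at 05:00 GMT on 13 Aug 2003 on a server seven hours behind GMT:
now = datetime(2003, 8, 13, 5, 0, tzinfo=timezone.utc)
server_zone = timezone(timedelta(hours=-7))
print(substitution_time(now, "localtime", -24, server_zone).strftime("%Y%m%d"))  # 20030811
print(substitution_time(now, "gmt", -24).strftime("%Y%m%d"))                     # 20030812
```

The two results differ by a day because the server's local clock still reads 12 August when the run starts, which is the kind of mismatch the basis and offset settings are meant to correct.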

Complete Date and Time Format Reference

This is the full list of supported time format variables, which follows conventions used in the Standard C Library strftime() routine:

• %A = national representation of the full weekday name.
• %a = national representation of the abbreviated weekday name.
• %B = national representation of the full month name.
• %b = national representation of the abbreviated month name.
• %d = the day of the month as a decimal number (01-31).
• %e = the day of month as a decimal number (1-31); single digits are preceded by a blank.
• %H = the hour (24-hour clock) as a decimal number (00-23).
• %I = the hour (12-hour clock) as a decimal number (01-12).
• %j = the day of the year as a decimal number (001-366).
• %k = the hour (24-hour clock) as a decimal number (0-23); single digits are preceded by a blank.
• %l = the hour (12-hour clock) as a decimal number (1-12); single digits are preceded by a blank.
• %M = the minute as a decimal number (00-59).
• %m = the month as a decimal number (01-12).
• %p = national representation of either "ante meridiem" or "post meridiem" as appropriate.
• %S = the second as a decimal number (00-60).
• %s = the number of seconds since the Epoch, UTC (see mktime(3)).
• %w = the weekday (Sunday as the first day of the week) as a decimal number (0-6).
• %Y = the year with century as a decimal number.
• %y = the year without century as a decimal number (00-99).
• %z = the time zone offset from UTC; a leading plus sign stands for east of UTC, a minus sign for west of UTC, hours and minutes follow with two digits each and no delimiter between them (common form for RFC 822 date headers).
• %% = `%' (for use when a literal percent sign is needed inside a date/time entry).

Processing Historical Logs

Overview

You may wish to process your historical logs after installing Urchin. This is easily accomplished. Simply specify a directory and a partial filename and/or wildcard (including regular expressions) in the Log Manager's Log Settings screens. NOTE: You may not use wildcards on remote HTTP and HTTPS log sources.

How to Process Historical Logs

First, add a Log Source to the system. Click the Configuration button at left, and then the Log Manager button. On the main screen, click the Add button at top-right. On the first screen, select Add Local Log Source, and continue. On the next screen, click Browse, which will bring up the File Browser. Locate the correct directory in the left-side window. The right-side window will display the files in the directory, and the left side will display any other directories. When you are in the correct directory, enter a partial filename and an asterisk (or other regular expression), and click the Verify button. A window will open showing all the matches to your pattern. Click any of the filenames to get information on the file: location, size, modification date, and file permissions. If the pattern match is correct, click OK, and then OK again in the File Browser window.

Next, if it hasn't been already, associate this log file with a Profile by clicking the Configuration button at left, and then the Profiles button. Once the association has been completed (see Working with Profiles), click the Run/Schedule button next to the Profile in the main Profiles listing, and schedule the execution of the Profile, or click the Run Now button for immediate processing.

Urchin does not need any sort of log rotation to avoid data duplication. Urchin is equipped with a log tracking capability that ensures only new hits are processed. However, as mentioned above, logs can quickly consume large volumes of disk space, so it is a good idea to periodically compress and archive log files. Because Urchin never needs to re-read log files once they have been processed, it is perfectly acceptable to delete the log(s) after each processing run. However, many people keep logs for a specified amount of time in case they are needed for some reason, such as if a new Profile is created for that site and historical analysis is desired.

Recommendations

Log management is not essential from the outset, but as logs grow, it becomes important. We recommend deciding on a log management plan when you initially deploy Urchin.

Log Reprocessing

Overview

Certain circumstances may warrant re-processing of log data, such as a DNS server being down when the logs were processed, incorrectly applied filters, and so on. The following document describes the procedure necessary to back out and reprocess webserver log data.

Please note that reprocessing logs requires the use of Urchin utilities that are only available from a command line shell environment. It is not possible to do the complete procedure exclusively from the Urchin web-based administrative GUI.

Reprocessing a Single Day:

• In the Urchin admin GUI, edit the Profile and turn off Log Tracking under the Storage/DB tab. Be sure to click Update to save your change.

• Under the Log Sources tab, ensure that the proper log file(s) to be re-processed are specified. The log data should only contain hits for the date(s) whose statistics you are zeroing out.

• Invoke a command shell on the Urchin system.

• Run the udb-sanitizer utility in the 'util' directory/folder of the Urchin distribution with the command

udb-sanitizer -p profile-name -d YYYYMM

where YYYYMM is the year and month containing the day you wish to reprocess.

• Select option 5, Zero out one or more days. The utility will prompt you for the correct day and will zero out the statistics for that particular day. If you have a range of contiguous days you'd like to zero, you can specify that range by using the numbers of the start and end days separated by a hyphen (e.g. 5-10 to zero out days 5 through 10 of the month). If necessary, re-invoke the utility to zero out statistics for additional days in that month if you cannot use a range.

• Click the Run Now button under the Run/Schedule tab for the Profile to reprocess the log data.

• Reset the Log Source by changing the Log File Path back to its original setting.

• Under the Storage/DB tab in the profile edit area, turn Log Tracking back on.
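When several days in a month need to be zeroed, it can be convenient to work out the day ranges to type at the option 5 prompt ahead of time. The Python sketch below collapses a list of day numbers into the start-end form described above and assembles the command line shown in this procedure. Both function names are hypothetical, the profile name is a placeholder, and the sketch only builds the argument list; it does not run the utility or answer its prompts.

```python
def day_ranges(days):
    """Collapse day-of-month numbers into 'start-end' ranges or single days."""
    ranges = []
    for day in sorted(set(days)):
        if ranges and day == ranges[-1][1] + 1:
            ranges[-1][1] = day          # extend the current contiguous run
        else:
            ranges.append([day, day])    # start a new run
    return [str(a) if a == b else f"{a}-{b}" for a, b in ranges]


def sanitizer_command(profile, year, month):
    """Build the udb-sanitizer argument list in the -p/-d form given above."""
    return ["udb-sanitizer", "-p", profile, "-d", f"{year:04d}{month:02d}"]


print(day_ranges([5, 6, 7, 8, 9, 10, 14, 20, 21]))  # ['5-10', '14', '20-21']
print(sanitizer_command("example-profile", 2003, 8))
```

Each entry in the first list corresponds to one pass through option 5, re-invoking the utility as needed.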

Reprocessing an Entire Month:

The procedure for reprocessing an entire month's worth of data is identical to the single day procedure above, except that when invoking the udb-sanitizer utility you select Option 2, Delete this month entirely, instead of Option 5.

Additional information:

The udb-sanitizer utility provides additional functionality for managing Urchin databases. Please see the udb-sanitizer article in the Advanced Topics -> Utilities section for further information about its capabilities and usage.

Filtering

Filtering Overview

This article describes data processing filters, which are applied before reports are generated. In addition to data processing filters, Urchin provides report filtering on the reporting interface. Read Reporting Interface -> Report Side Filtering for information.

To create a filter, click the Add button in the Filter Manager screen.

Filtering Sequence

Each time the scheduler runs a profile, each entry in the log files passes through the steps shown in the figure below. Before any of the report tables are updated, the 'raw' fields in the log file entry are parsed, which creates a number of 'auto' calculated fields. For example, the browser and platform fields are calculated from the raw cs_useragent field.

Filtering is applied once all of the fields have been populated, and before any entries are made in the report tables. Filters can be applied to any type of field, including calculated fields. No additional parsing occurs after filters are applied. Thus, it is important to apply the Filter to the correct field. A list of the purpose of each available field is provided in the next section.

Filters are applied in the following order:

1. Advanced Filters, Search & Replace Filters, and Dynamic URL Filters
2. Decode URL and Japanese Encoding Filters
3. Lookup Tables
4. Include and Exclude Filters

For example, if an Exclude Filter is applied to the same field as the Decode URL Filter, the Exclude Filter must take into account that encoded characters, such as %20, will have already been translated.
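The consequence of this ordering can be shown with a short Python sketch in which a decode step runs before an exclude step. This only models the ordering rule described above; the function name and the field name request_stem are hypothetical, and Urchin's own filter engine is not shown in this article.

```python
import re
from urllib.parse import unquote


def process_hit(fields, exclude_pattern):
    """Decode the URL field first, then apply an exclude pattern to it."""
    decoded = dict(fields, request_stem=unquote(fields["request_stem"]))
    if re.search(exclude_pattern, decoded["request_stem"]):
        return None        # the whole hit is dropped, not just the field
    return decoded


hit = {"request_stem": "/my%20docs/report.html", "status": "200"}
print(process_hit(hit, r"/my docs/"))     # None: pattern matches the decoded text
print(process_hit(hit, r"/my%20docs/"))   # hit is kept: %20 no longer appears
```

An exclude pattern written against the encoded form silently stops matching once a Decode URL filter is applied to the same field.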

Filter Types

Exclude Pattern: This type of filter excludes log file lines (hits) that match the Filter Pattern. Matching lines are ignored in their entirety; for example, a filter that excludes Netscape will also exclude all other information in that log line, such as visitor, path, referral, and domain information.

Include Pattern: This type of filter includes log file lines (hits) that match the Filter Pattern. All non-matching hits will be ignored and any data in non-matching hits is unavailable to the Urchin reports.

Decode URL: This is a predefined filter that decodes URL-encoded characters back to their original form. For example, '%20' in a URL is replaced with a space. Apply this filter to URI-stems and queries to see the original text.

Japanese Encode (UTF-8): This is a predefined filter, generally applied to the keywords field or other potentially multi-encoded field, that looks for Japanese encoded words and converts the encoding to UTF-8 format for consistent storage and display.

Search & Replace: This is a simple filter that can be used to search for a pattern within a field and replace the found pattern with an alternate form. See the section on Search & Replace Filters for more information.

Dynamic URL (deprecated): This type of filter is used to translate arcane dynamically generated URLs into more human-readable page names. Note: the new Page Query Terms Report duplicates the bulk of this function, and the Advanced Filter encompasses all of the Dynamic URL filter's features and more. It is strongly recommended that you either eliminate old Dynamic URL filters if possible or else convert them to one of the newer forms of filter.

Advanced: This type of filter allows you to build a field from one or two other fields. The filtering engine will apply the expressions in the two Extract fields to the specified fields and then construct a field using the Constructor expression. Read the Advanced Filters article for more information.

Choosing Where To Apply a Filter

Filters can be applied either to profiles or to individual log sources. The scope of the filter differs in each case. A filter applied to a profile will affect all log sources processed for that profile. A filter applied to a log source will always affect that specific log source, even if multiple profiles are using the same log source. In general, you should apply filters to the profile unless one of the following cases occurs:

• You have multiple log sources for a profile and you do not want the filter to apply to all of the log sources.

• You have multiple profiles using the same log source, and you want all of the profiles to use the same filter.

In these two cases, apply the filter to the specific log source; otherwise, apply the filter to the profile.

Creating and Managing Filters

In the Urchin administration interface, click Configuration, then Urchin Profile-->Filter Manager. Click the Add button to launch the Filter Wizard.

Once you have created a filter, edit the profiles or log sources to which you wish to apply the filter, and add the filter.

To create a filter while editing a profile or log source, click the Profile Filters tab or Log Filters tab. A window appears showing the currently active filters. Click the Add button on this window to launch the Filter Wizard.

The filter creation screen has a dropdown menu at the top with selectable built-in filters for common filtering tasks such as filtering out robot traffic to your site. These built-in filters also serve as examples of how to set up various kinds of filters.

Filter Fields

Overview

When a hit or line in a log file is read during processing, the hit is broken down into 'Raw Fields'. Fields are generally separated by spaces, tabs, or commas. The Log Format as chosen in the Log Source->Log Settings screen determines how these Raw Fields are assigned internally. Once the Raw Fields are read, Urchin automatically calculates the 'Auto Fields', using the values in the 'Raw Fields'. Most reports use data in these Auto Fields for updating.

Filters can be applied to either Raw or Auto Fields. The following two tables provide insight into the purpose of each Field. The first table lists the Fields used for standard reports. A dash in the Fields Used column means that the report in question summarizes numbers generated in other reports and therefore is not tied specifically to the data in particular fields. The second table lists all available Fields and their purpose.
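As a rough illustration of how Auto Fields relate to a raw request, the sketch below splits a request URI into a few of the request_ fields named in the Complete Field List. The splitting rules are assumptions based on the field descriptions, not Urchin's actual implementation, and the function name is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def derive_request_fields(request_uri: str) -> dict:
    """Split a request URI into a few AUTO-style fields (illustrative only)."""
    parts = urlsplit(request_uri)
    directory, filename = posixpath.split(parts.path)
    _, extension = posixpath.splitext(filename)
    return {
        "request_uri": request_uri,             # URI with query
        "request_stem": parts.path,             # URI without query
        "request_query": parts.query,           # text after the "?"
        "request_directory": directory,         # directory without filename
        "request_filename": filename,           # filename without directory
        "request_mime": extension.lstrip("."),  # file extension
    }


fields = derive_request_fields("/docs/document.cgi?id=1000")
```

A filter applied to request_stem in this sketch would see "/docs/document.cgi", while a filter applied to request_query would see "id=1000".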

Report Field List

Report Name Fields Used

Traffic

Sessions Graph −

Pageviews Graph −

Hits Graph −

Bytes Graph −

Summary −

Visitors & Sessions

Visitors by Day −

Sessions by Day −

Unique Visitors −

Unique Sessions −

Visitor Loyalty utm_session_number

Session Frequency −

Summary −

Pages & Files

Requested Pages request_stem

Downloads request_stem

All Files request_origfilepath

Directory by Pages Drilldown request_stem

Directory by Files Drilldown request_origfilepath

Directory by Bytes Drilldown request_stem

File Types by Hits request_origmime

File Types by Bytes request_origmime

Page Query Terms request_stem|request_query

Posted Forms request_stem

Status and Errors sc_status|request_errordetail

Navigation

Entrance Pages request_stem

Exit Pages request_stem

Click Paths request_stem

Click To and From request_stem

Length of Pageview request_stem

Depth of Session −

Length of Session −

Click To and From Report request_stem

Referrals

Referrals referral_domainandstem

Referral Drilldown referral_domainandstem

Search Terms referral_domain|referral_keywords

Search Engines referral_domain|referral_keywords

Referral Errors referral_errordetail|referral_domainandstem

Domains & Users

Domains domain_primary|domain_complete

Domain Drilldown domain_primary|domain_complete

Countries domain_primary|domain_complete

IP Addresses c_ip

IP Drilldown c_ip

Usernames by Hits cs_username

Usernames by Bytes cs_username

Usernames by Sessions cs_username

Browsers & Robots

Browsers by Sessions Drilldown useragent_complete

Browsers by Hits Drilldown useragent_complete

Browsers by Bytes Drilldown useragent_complete

Platforms by Sessions Drilldown useragent_complete

Platforms by Hits Drilldown useragent_complete

Platforms by Bytes Drilldown useragent_complete

Combos by Sessions useragent_complete

Robots by Hits Drilldown browser_base

Robots by Bytes Drilldown browser_base

Client Parameters

Screen Resolution utm_screen_resolution

Screen Colors utm_screen_colors

Languages utm_language

Java Enabled utm_java_enabled

Timezone Offset utm_timezone_offset

Javascript Version utm_js_version

E−Commerce

Revenue −

Number of Transactions −

Products by Revenue elf_productname|elf_productcode

Products by Quantity elf_productname|elf_productcode

Products by Revenue Drilldown elf_productname|elf_productcode

Products by Quantity Drilldown elf_productname|elf_productcode

E−Commerce Summary −

Revenue Source

Revenue by Region Drilldown elf_region

Revenue by City elf_region

Revenue by Referrals referral_domainandstem

Revenue by Search Terms referral_domain|referral_keywords

Revenue by Search Engines Drilldown referral_domain|referral_keywords

Revenue by Domains Drilldown domain_primary|domain_complete

Complete Field List

id Field Type Purpose

1 iis_date (RAW) IIS raw date of hit field.

2 iis_time (RAW) IIS raw time of hit field.

3 apache_time (RAW) Apache raw date & time of hit field.

4 c_ip (RAW) Client IP Address.

5 cs_username (RAW) Client username (if any)

6 cs_request (RAW) Apache raw entire request field.

7 cs_method (RAW) IIS raw request method field.

8 cs_uristem (RAW) IIS raw request stem field.

9 cs_uriquery (RAW) IIS raw request query field.

10 sc_status (RAW) Return status code from server.

11 sc_bytes (RAW) Number of bytes transferred for request.

12 c_host (RAW) Client hostname (converts to c_ip if necessary).

13 cs_useragent (RAW) Browser user−agent information.

14 cs_cookie (RAW) Cookies sent by browser.

15 cs_referer (RAW) Raw Referral information (could be internal).

16 custom_date (RAW) Used for datestamp in Custom Logs.

17 custom_time (RAW) Used for timestamp in Custom Logs.

19 cs_host (RAW) Requested virtualhost by Client.

20 s_port (RAW) Server port number.

21 cs_version (RAW) IIS Raw HTTP version.

22 s_sitename (RAW) IIS Server site name.

23 s_computername (RAW) IIS Computer name.

24 s_ip (RAW) IIS Server IP address.

25 elf_orderid (RAW) E−commerce order id number.

26 elf_store (RAW) E−commerce store name.

27 elf_sessionid (RAW) E−commerce session id.

28 elf_total (RAW) E−commerce transaction amount.

29 elf_tax (RAW) E−commerce tax amount.

30 elf_shipping (RAW) E−commerce shipping amount.

31 elf_billcity (RAW) E−commerce customer city.

32 elf_billstate (RAW) E−commerce customer state.

33 elf_billzip (RAW) E−commerce customer zip code.

34 elf_billcountry (RAW) E−commerce customer country.

35 elf_productcode (RAW) E−commerce product code.

36 elf_productname (RAW) E−commerce product name.

37 elf_variation (RAW) E−commerce product variation.

38 elf_price (RAW) E−commerce product price.

39 elf_quantity (RAW) E−commerce product quantity.

40 elf_upsold (RAW) E−commerce upsold variable.

76 referral_protocol (AUTO) Referral protocol (http/https/etc.)

77 referral_host (AUTO) Referral complete hostname.

78 referral_domain (AUTO) Referral domain name.

79 referral_port (AUTO) Referral port number (if any).

80 referral_url (AUTO) Referral complete URL. (includes host)

81 referral_uri (AUTO) Referral complete URI. (no host)

82 referral_stem (AUTO) Referral URI stem without query info.

83 referral_query (AUTO) Referral Query info by itself.

84 referral_anchor (AUTO) Referral information after # tag.

85 referral_directory (AUTO) Referral directory up to filename.

86 referral_filename (AUTO) Referral filename without directory.

87 referral_mime (AUTO) Referral mime type (file extension)

88 referral_keywords (AUTO) Referral search engine keywords

89 referral_domainandstem (AUTO) Referral domain and URI stem together.

90 referral_errordetail (AUTO) Referral error detail information.

91 request_method (AUTO) Request method (GET/POST/etc.).

92 request_url (AUTO) Request complete URL (if provided).

93 request_version (AUTO) Request protocol version.

94 request_protocol (AUTO) Request protocol (HTTP/etc.).

95 request_host (AUTO) Request hostname (if any).

96 request_port (AUTO) Request port number (if any).

97 request_uri (AUTO) Request URI with query.

98 request_stem (AUTO) Request URI without query.

99 request_query (AUTO) Request query information (e.g., after ?)

100 request_anchor (AUTO) Request information after # tag

101 request_directory (AUTO) Request directory without filename.

102 request_filename (AUTO) Request filename without directory.

103 request_mime (AUTO) Request mime type (file extension).

104 request_origfilepath (AUTO) Request original uri stem if UTM.

105 request_origmime (AUTO) Request original mime type if UTM.

106 request_errordetail (AUTO) Request detail for error hits.

107 useragent_complete (AUTO) Complete user-agent.

108 browser_base (AUTO) Browser name (e.g., Netscape).

109 browser_version (AUTO) Browser version.

110 platform_base (AUTO) Platform (e.g., Windows).

111 platform_version (AUTO) Platform version.

112 domain_primary (AUTO) First level domain. (e.g. com).

113 domain_complete (AUTO) Complete domain. (e.g. urchin.com).

114 sid (AUTO) Session id (if any).

115 utm_cookiea (AUTO) UTM−2 cookie−a

116 utm_cookieb (AUTO) UTM−2 cookie−b

117 utm_cookiec (AUTO) UTM−2 cookie−c

119 utm_cookie1 (AUTO) UTM−1 cookie−1

120 utm_cookie2 (AUTO) UTM−2 cookie−2

121 utm_cookie3 (AUTO) UTM−3 cookie−3

122 utm_unique_id (AUTO) UTM unique visitor id.

124 utm_page (AUTO) UTM page variable (used for request_ variables).

125 utm_referral (AUTO) UTM Referral (used for referral_ variables).

126 utm_screen_resolution (AUTO) Screen resolution (e.g., 800x600).

127 utm_screen_available (AUTO) Available screen resolution in pixels.

128 utm_browser_size (AUTO) Browser size in pixels.

129 utm_screen_colors (AUTO) Screen color bit depth.

130 utm_language (AUTO) Browser language code setting.

131 utm_java_enabled (AUTO) yes|no if java is enabled.

132 utm_cookies_enabled (AUTO) yes|no if cookies are enabled.

133 utm_timezone_offset (AUTO) +/−HHMM timezone offset value of browser.

134 utm_js_version (AUTO) Javascript version info.

135 utm_session_number (AUTO) Number of sessions for this visitor.

145 elf_region (AUTO) E−Commerce region drilldown information.

Exclude/Include Filters

Introduction

Exclude and Include Filters, set up in the admin interface and applied to a log source or profile, are used to eliminate unwanted hits when processing a log file. The filters use POSIX regular expressions when matching against data in the fields of a hit. If you are unfamiliar with regular expressions, please read the Regular Expression Overview document in this section before proceeding.

How Urchin Uses Exclude/Include Filters

These filters are applied after the Decode URL, Japanese Encode, Dynamic URL, Search & Replace, and Advanced filters. Urchin applies the Exclude/Include filters in succession. If the filter being applied is an Exclude Filter and the pattern matches, the hit is thrown away and Urchin continues with the next hit. If the pattern does not match, Urchin applies the next filter to the hit. This means that you can create either a single Exclude Filter with multiple patterns separated by '|' or you can create multiple Exclude Filters with a single pattern each.

Include Filters are applied with the reverse logic. When an Include Filter is applied, the hit is thrown away if the pattern does not match the data. If multiple Include Filters are applied, the hit must match every applied Include Filter in order for the hit to be saved. To include multiple patterns for a specific field, create a single Include Filter that contains all of the individual expressions separated by '|'.
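The logic described above can be sketched as a small Python function. This is a simplified model, not Urchin's code: Python's `re` module stands in for the POSIX matcher, and the function name, the filter tuples, and the sample hits are all hypothetical.

```python
import re


def keep_hit(hit: dict, filters: list) -> bool:
    """Apply (kind, field, pattern) filters in order; False means the hit is discarded."""
    for kind, field, pattern in filters:
        matched = re.search(pattern, hit.get(field, "")) is not None
        if kind == "exclude" and matched:
            return False  # an Exclude Filter discards hits that match
        if kind == "include" and not matched:
            return False  # an Include Filter discards hits that do not match
    return True


filters = [
    ("exclude", "request_mime", r"gif|jpg|png"),
    # Several alternatives for one field go into a single Include Filter, joined by "|".
    ("include", "request_stem", r"^/docs/|^/news/"),
]

hits = [
    {"request_stem": "/docs/logo.gif", "request_mime": "gif"},
    {"request_stem": "/docs/index.html", "request_mime": "html"},
    {"request_stem": "/shop/cart.html", "request_mime": "html"},
]

kept = [hit["request_stem"] for hit in hits if keep_hit(hit, filters)]
```

Splitting the include pattern into two separate Include Filters would discard every hit in this model, because no request stem can start with both /docs/ and /news/.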

Using Exclude/Include Filters

In the figure above, the exclude filter requires a filter expression and a filter field. During processing, the filter expression is compared with data in the filter field and the hit is thrown away if the filter matches. See the Filter Fields article for a complete list of fields that are available. The above example illustrates how to filter out image hits by filtering out all mime types that match gif, jpg, png, jpeg, and ico. This list can be customized to match any mime type.

In the figure above, the include filter requires a filter expression and a filter field. During processing, the filter expression is compared with data in the filter field and the hit is thrown away if the filter does not match. See the Filter Fields article for a complete list of fields that are available. This example shows how to filter in only html pages by requiring the mime type of the request to be html.

Controls

The 'Case Sensitive' control allows you to specify whether the filter should be applied with or without case sensitivity.
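A minimal sketch of what case sensitivity changes, using the image mime types from the exclude example. The anchored pattern and the function name are assumptions made for illustration, and Python's `re.IGNORECASE` flag stands in for leaving 'Case Sensitive' switched off.

```python
import re

# Assumed pattern: the mime types from the exclude example, anchored to the whole field.
IMAGE_MIME_PATTERN = r"^(gif|jpg|png|jpeg|ico)$"


def is_image(request_mime: str, case_sensitive: bool = False) -> bool:
    """Return True if the mime type matches the image pattern."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(IMAGE_MIME_PATTERN, request_mime, flags) is not None


upper_insensitive = is_image("GIF")                     # matches without case sensitivity
upper_sensitive = is_image("GIF", case_sensitive=True)  # does not match with it
```

With case sensitivity on, a log containing both "logo.gif" and "LOGO.GIF" would need both spellings covered by the pattern.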

Decode URL Filters

Introduction

The Decode URL filter is used to convert data from a URL encoded form to a more readable form. Encodings such as %20, for example, are converted into spaces.

Using Decode URL Filters

To use the Decode URL filter, select a Filter Field. During processing, the data in the Filter Field is decoded and stored back in that field. The field can then be displayed in the reports. Refer to the Filter Fields article for a complete field list. Although the above example illustrates how to use a Decode URL filter to decode referral keywords, the filter is also useful for decoding request stem and request query.
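The "decode and store back in the same field" behavior can be sketched as follows. `urllib.parse.unquote` is Python's percent-decoder and is used here only as a stand-in; the function name and the sample hit are hypothetical.

```python
from urllib.parse import unquote


def apply_decode_url(hit: dict, field: str) -> dict:
    """Return a copy of the hit with one field percent-decoded in place."""
    decoded = dict(hit)
    if field in decoded:
        decoded[field] = unquote(decoded[field])
    return decoded


hit = {"referral_keywords": "log%20file%20analysis", "request_stem": "/a%2Fb.html"}
result = apply_decode_url(hit, "referral_keywords")
```

Only the selected field changes; request_stem keeps its encoded form until a Decode URL filter is applied to that field as well. Note that Python's `unquote` leaves "+" untouched (`unquote_plus` converts it to a space); how Urchin treats "+" is not stated in this section.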

Search & Replace

Introduction

Use Search & Replace Filters to replace a matched expression with another string. This type of filter is a simplified version of an Advanced Filter.

Using Search & Replace Filters

The Search & Replace filter requires a filter field, an expression to search for, and a replace expression. The search expression is a POSIX regular expression. The replace expression is any text that you wish to have replace the matched part. Refer to the Filter Fields article for a complete field list. The above example illustrates how to use a Search & Replace filter to remove a leading directory from the path of a page.
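Removing a leading directory can be sketched with Python's `re.sub`, which stands in for the filter here. The directory name /site and the helper name are hypothetical, and Python's regular expression syntax only approximates POSIX for simple patterns like this one.

```python
import re


def search_and_replace(value: str, search: str, replace: str) -> str:
    """Replace the part of the field that matches the search expression."""
    return re.sub(search, replace, value)


# The ^ anchor limits the replacement to a directory at the start of the path.
page = search_and_replace("/site/products/index.html", r"^/site", "")
nested = search_and_replace("/archive/site/index.html", r"^/site", "")
```

The first path becomes /products/index.html, while the second is left alone because /site is not at the beginning.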

Another use for this type of filter would be to replace category id numbers with descriptive words in the query string of a request. For example, suppose that samples of the requested file with attached queries look as follows:

/docs/document.cgi?id=1000
/docs/document.cgi?id=2000

Using the Search & Replace filter, you could convert the 1000 or 2000 ids to their equivalents. For example, 1000 could be changed to books and 2000 to magazines. This would make the pages report more useful for people who are not familiar with the codes used to identify the individual items.
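A sketch of the id-to-name conversion, modeled as one search/replace pair per id. The pairs, the `$` anchors, and the function name are assumptions for illustration; in Urchin each pair would presumably be its own Search & Replace filter on the query field.

```python
import re

# One (search, replace) pair per category id.
ID_FILTERS = [
    (r"id=1000$", "id=books"),
    (r"id=2000$", "id=magazines"),
]


def rename_ids(request_query: str) -> str:
    """Apply each search/replace pair in turn to the query string."""
    for search, replace in ID_FILTERS:
        request_query = re.sub(search, replace, request_query)
    return request_query


books = rename_ids("id=1000")
magazines = rename_ids("id=2000")
other = rename_ids("id=10000")  # left unchanged because of the $ anchor
```

Anchoring the search expression keeps a longer id such as 10000 from being partially rewritten.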

Lookup Table Filters

The Lookup Table filter is available beginning with Urchin 5.6. The Lookup Table filter can be used to:

• implement master tracking codes for campaign tracking. Read Campaign Tracking Module-->How To Use Master Tracking Codes.

• implement an external data table to look up and replace character field values when a match occurs. Lookup tables can match against a single field and update multiple fields. Read Advanced Topics-->Customization-->Custom Lookup Tables.

• map Japanese phone manufacturer/model abbreviations to full names. See below.

To apply the Japanese phone filter:

1. In the Filter Wizard:Settings screen, enter your desired filter name (Filter Name field).
2. Select Lookup Table as shown in the screen image below.
3. Select platform_version (AUTO) from the Filter Field drop down menu, as shown in the screen image below.
4. Select phone models from the Table Name drop down menu, as shown below.
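Conceptually, a lookup table matches one field and rewrites one or more fields when a row is found. The sketch below models that with a Python dictionary; the table contents, abbreviations, and function name are invented for illustration and do not reflect Urchin's table file format.

```python
# Hypothetical lookup table: keys and replacement values are invented for illustration.
PHONE_MODELS = {
    "XM01": {"platform_base": "Example Maker", "platform_version": "Example Maker Model 01"},
    "XM02": {"platform_base": "Example Maker", "platform_version": "Example Maker Model 02"},
}


def apply_lookup(hit: dict, match_field: str, table: dict) -> dict:
    """Match on one field and, if a row is found, update every field the row names."""
    row = table.get(hit.get(match_field))
    if row is None:
        return hit  # no match: the hit is left as it was
    updated = dict(hit)
    updated.update(row)
    return updated


known = apply_lookup({"platform_version": "XM01"}, "platform_version", PHONE_MODELS)
unknown = apply_lookup({"platform_version": "ZZ99"}, "platform_version", PHONE_MODELS)
```

In this model a single match on platform_version fills in both platform_version and platform_base, mirroring the "match a single field, update multiple fields" behavior described above.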

Advanced Filters

Introduction

The Advanced Filter option allows you to construct Fields for reporting from one or two existing Fields. POSIX regular expressions and corresponding variables can be used to capture all or parts of Fields and combine the result in any order you wish. For general information on how filtering works and a list of what each Field is used for, see the Filtering Overview and Filter Fields articles at the beginning of this section.

Using Advanced Filters

As shown in the figure above, the Advanced Filter takes up to two fields, Field A and Field B, and constructs the Output Field. The construction occurs in the following manner. The Extract A expression is applied to Field A, and the Extract B expression is applied to Field B. These expressions can use complete or partial text matches and include wildcards. The following is a list of the most common wildcards and their meanings. The expressions conform to POSIX regular expressions.

Wildcard Meaning

. match any single character

* match zero or more of the previous item

+ match one or more of the previous item

? match zero or one of the previous item

() remember contents of parentheses as item

[] match one item in this list

- create a range in a list

| or

^ match to the beginning of the field

$ match to the end of the field

\ escape any of the above

Use parentheses () to capture parts of the Fields. These can be referenced in the Constructor using the $A1, $A2, $B1, $B2 notation. The A|B refers to the Field, and the number refers to which set of parentheses to grab. In the above example, the entire A Field and the entire B Field are captured and assembled as the new field. The Output Field can be a separate field or the same field as Field A or Field B.
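The extract-and-construct flow can be sketched in Python as below. This is an approximation under stated assumptions: Python's `re` stands in for the POSIX engine, the function name and sample values are hypothetical, and returning `None` when an extract fails is a simplification, since the real behavior depends on the filter's controls.

```python
import re


def advanced_filter(field_a, extract_a, field_b, extract_b, constructor):
    """Build an output value from the captures of two extract expressions."""
    match_a = re.search(extract_a, field_a)
    match_b = re.search(extract_b, field_b)
    if match_a is None or match_b is None:
        return None  # assumption: treat a failed extract as "no output"
    captures = {"A": match_a.groups(), "B": match_b.groups()}

    def substitute(reference):
        groups = captures[reference.group(1)]
        index = int(reference.group(2))
        # References to missing or empty captures become an empty string in this sketch.
        return (groups[index - 1] or "") if 1 <= index <= len(groups) else ""

    # Replace each $A<n> or $B<n> reference in the Constructor with its capture.
    return re.sub(r"\$([AB])([0-9]+)", substitute, constructor)


page = advanced_filter(
    "/catalog/show.cgi", r"^(.*)$",         # Field A and Extract A
    "item=42&color=red", r"item=([0-9]+)",  # Field B and Extract B
    "$A1?item=$B1",                         # Constructor
)
```

Here the whole of Field A is captured as $A1 and only the item number is captured from Field B, so the constructed value keeps the item parameter and drops the color parameter.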

Controls

The 'Override Output Field' control allows you to decide what to do if the Output Field already exists. The 'Required Field' control allows you to decide what to do if one of the expressions does not match.

DynamicURL Filters (deprecated)

Note: Urchin 5 and later display query terms in a separate report by default. Unlike versions 4 and earlier, it is not necessary to create a filter to display query strings. However, also unlike versions 4 and earlier, the data is not displayed in the 'Pages' report. Urchin 5 displays this data in a drill-down report titled 'Page Query Terms.' This report can be found under the 'Pages & Files' report menu.

Many sites today use a CGI, ASP, or other scripting mechanism to provide dynamic content. Often, a single script is used to deliver multiple pages of information. While this can be a handy way to track user sessions or provide "live" content, it poses an additional challenge for meaningful reporting.

By default, Urchin strips all the parameters associated with a page request (e.g. those that would typically be used with a CGI or ASP) and stores only the pathname of the page requested in its database. The DynamicURL filtering feature allows you to use regular expressions to selectively capture these parameters and present them in an intuitive way.

As an example, a CGI script might be used to deliver information about all products in a catalog. The script draws from a database, and uses parameters passed through the request to determine which product to display. The resulting hit in the webserver log for this request might look like:

/cgi-bin/showProduct.cgi?sessionId=123456789

|______________________| |_________________________________|

Under normal operation, Urchin will record that the showProduct.cgi page was requested, and the "?" and all parameters that follow it will be stripped. By using a DynamicURL filter, Urchin can store some or all of the parameters and produce a unique page record based on the parameter list.

Now in this example, we don't necessarily want to capture the entire second part of the request because of the "sessionId." Let's assume that this parameter changes for each visit and we get 30,000 visits per day. Including this piece of information would create far too many unique pages and render the Pages reporting useless. Instead we just want to capture the "productId" and report only on that information.

/cgi-bin/showProduct.cgi?sessionId=123456789

We may still want to know which script was used as well as which product was implicated in the request. By using a DynamicURL filter, we can capture multiple parts of the request and recombine them into a new, formatted request ready for reporting. Here is an example of a filter that could be used with the page request above:

(/cgi-bin/showProduct.cgi\?).*productId=(.*)

This regular expression will match the above request no matter what the value of the sessionId or productId was. The parentheses capture the parts of the request that we want to keep for reporting. The effective request of the above example would look like:

/cgi-bin/showProduct.cgi/knobs

Up to 5 sets of parentheses can be used, and multiple filters can be applied. If a request does not match the DynamicURL filter, it is left unmodified, but still included in the reporting. This allows you to use multiple DynamicURL filters for each area of a site. Keep in mind there is a slight performance hit for each filter used.

Note that DynamicURL filters can only be applied to the base URL and query string that form the page request. They cannot be used to filter referrals or any other fields in the log file. Also, when DynamicURLs and FilterIn/FilterOut are used together, the DynamicURL will be applied after the other filters. Consideration must therefore be given to how one set of filters affects the others when choosing what to filter.
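The capture-and-recombine behavior can be approximated in Python as follows. Several things here are assumptions: the sample request adds a productId=knobs parameter (inferred from the filter and the resulting page name, since the sample request shown in this section does not include it), the captured parts are joined with "/" and the trailing "?" is dropped to reproduce the page name given above, and the function name is hypothetical.

```python
import re

# Filter expression as given in the text above.
PRODUCT_FILTER = r"(/cgi-bin/showProduct.cgi\?).*productId=(.*)"


def dynamic_url(request: str, pattern: str) -> str:
    """Rebuild a page name from the captured parts of a request."""
    match = re.search(pattern, request)
    if match is None:
        return request  # non-matching requests are left unmodified
    parts = [group for group in match.groups() if group]
    if not parts:
        return request
    # Assumption: drop the trailing "?" and join the captures with "/".
    return "/".join([parts[0].rstrip("?")] + parts[1:])


# Assumed full request; the productId parameter is not shown in the sample above.
rewritten = dynamic_url(
    "/cgi-bin/showProduct.cgi?sessionId=123456789&productId=knobs", PRODUCT_FILTER
)
untouched = dynamic_url("/index.html", PRODUCT_FILTER)
```

Because the sessionId value is matched by `.*` outside the parentheses, it never reaches the rebuilt page name, which keeps the number of unique pages manageable.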

Examples

Example 1: We want to capture all the specific Knowledgebase article IDs in the Urchin 4 report for help.urchin.com. Here's a sample of what the Request portion of the hit looks like in the log file:

GET /knowledge.cgi?cmd=2

The proper Dynamic URL filter to extract the article ID is:

(/knowledge\.cgi\?)cmd=2

and this produces Top Pages reports that look like:

1. /knowledge.cgi 1,081 46.43%
2. /knowledge.cgi/id=767 244 10.48%
3. /knowledge.cgi/id=807 136 5.84%
4. /knowledge.cgi/id=768 50 2.15%
5. /knowledge.cgi/id=777 40 1.72%

Example 2: We want to capture all the search keywords used in the Urchin 4 report for help.urchin.com. Here's a sample of what the Request portion of the hit looks like in the log file:

GET /knowledge.cgi?cmd=1PE=0= utm

The proper Dynamic URL filter to extract the keyword information is:

(/knowledge.cgi\?).*s_(keyword=[^

and this produces Top Pages reports that look like:

1. /knowledge.cgi 1,373 68.65%
2. /knowledge.cgi/keyword=utm 29 1.45%
3. /knowledge.cgi/keyword=default+page 18 0.90%
4. /knowledge.cgi/keyword=no+referral 11 0.55%
5. /knowledge.cgi/keyword=scheduler 10 0.50%

Regular Expression Overview

Introduction

POSIX regular expressions are used to match or capture portions of a field using wildcards and metacharacters. They are often used for text manipulation tasks. Most of the filters included in Urchin use these expressions to match the data and perform an action when a match is achieved. For instance, an exclude filter is designed to exclude the hit if the regular expression in the filter matches the data contained in the field specified by the filter.

Regular expressions are text strings that contain characters, numbers, and wildcards. A list of common wildcards is contained in the table below. Note that these wildcard characters can be used literally by escaping them with a backslash '\'.

Wildcard Meaning

. match any single character

* match zero or more of the previous item

+ match one or more of the previous item

? match zero or one of the previous item

() remember contents of parentheses as item

[] match one item in this list

- create a range in a list

| or

^ match to the beginning of the field

$ match to the end of the field

\ escape any of the above

Tips for Regular Expressions

1. Make the regular expression as simple as possible. Complex expressions take longer to process or match than simple expressions.
2. Avoid the use of .* if possible, since this expression matches everything and may slow down processing. For instance, if you need to match index.html, use index\.html, not .*index\.html.*
3. Try to group patterns together when possible. For instance, if you wish to match a file suffix of .gif, .jpg, or .png, use "\.(gif|jpg|png)", not "\.gif|\.jpg|\.png".
4. Be sure to escape the regular expression wildcards or metacharacters if you wish to match those literal characters.
5. Use anchors whenever possible. The anchor characters are ^ and $, which match the beginning and the end of the field, respectively. Using these when possible will speed up processing. For instance, to match the foo directory in /foo/bar, use ^/foo/ instead of /foo/. Using the ^ forces the expression to match at the beginning and improves processing speed.
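The tips above can be tried out with Python's `re` module, whose syntax agrees with POSIX for simple patterns like these. The sample paths are hypothetical, and adding `$` to the grouped suffix pattern is an extra step that combines tips 3 and 5.

```python
import re

grouped = re.compile(r"\.(gif|jpg|png)$")  # tip 3 (grouping) plus an end anchor
anchored = re.compile(r"^/foo/")           # tip 5: match only at the start of the field
literal = re.compile(r"index\.html")       # tip 4: escaped dot matches a literal "."
unescaped = re.compile(r"index.html")      # unescaped dot matches any character

checks = {
    "grouped_matches_png": grouped.search("/images/logo.png") is not None,
    "anchored_rejects_nested": anchored.search("/bar/foo/") is None,
    "literal_rejects_lookalike": literal.search("/indexXhtml") is None,
    "unescaped_accepts_lookalike": unescaped.search("/indexXhtml") is not None,
}

# re.escape builds the escaped form of a literal string.
escaped_name = re.escape("index.html")
```

Every entry in `checks` is True, which shows in particular that the unescaped dot accepts strings the author probably did not intend to match.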

Affiliations, Users & Groups

Working with Affiliations

Overview

An Affiliation is a high level association that is used to group together related Profiles, Log Sources, Users and Groups under a single identifying label, which is typically a corporate or client name. For Urchin installations where there is a need to support multiple complex client organizations, creating an Affiliation allows the Urchin administrator to easily keep track of all the Urchin reporting components for a particular client or corporate entity. Access rights to Urchin reports can be controlled via an Affiliation association, and within an Affiliation even more granular access rights to certain reports can be assigned to Groups or Users, thereby protecting your data as desired at multiple levels. In addition, Affiliation level administration rights can be assigned to a user who can then act as a local Urchin administrator for the Affiliation. This allows distribution of the responsibility for managing, configuring, and maintaining the Urchin reports within an Affiliation.

It is important to note that since Affiliations are the highest level organizational "element" in the Urchin administration interface, you should create the Affiliations before you create any Profiles, Log Sources, Users, or Groups that will be associated with the Affiliation. The choice of Affiliation must be made when an Urchin element is created, and it cannot be changed afterwards. If you do not choose a specific Affiliation at creation time, the default Affiliation of (NONE) will be set. This tells Urchin that the element in question has no Affiliation.

Creating Affiliations

To create an Affiliation, go to the Configuration->Users & Groups->Affiliation screen and click the Add button. Only the Affiliation Name is required; Contact and Contact Email are not used by Urchin and are strictly informational fields for the benefit of the administrator. Report Data Location (optional) specifies where to store the report data for all the Profiles that belong to the Affiliation. The default location is the data/reports directory within the Urchin distribution. Changing the Report Data Location allows you to physically separate within your file system the report data for different organizations. Directory Browsing Location (optional) is provided as a security measure when giving an Affiliation administrator access to create Profiles. A directory entered into this field will limit the Affiliation admin's ability to browse for log files to only that directory. By default, there are no restrictions on where an Affiliation admin can browse for log files.

Using Affiliations

To assign a Profile, Log Source, User, or Group to an Affiliation, use the dropdown menu labeled Optional Affiliation in the initial screen of the setup wizard when creating the given element. Once an Affiliation is assigned to a Profile, Log Source, User, etc., the Urchin admin interface will restrict modification choices to those elements that are associated with the Affiliation. In this way the Affiliation acts to control access rights at a high level so that you can isolate organizations from one another without the need to set specific access permissions on every report.

Affiliations also aid in distributing management responsibilities for the Urchin configuration. Within an Affiliation, the primary Urchin administrator can assign local admin privileges to Affiliation users. There are three administrative levels of control that can be assigned to a User. See the Working with Users & Groups article in this section for details.

When viewing the admin screens for Urchin configuration parameters, you may filter the entries to selectively show only those with a particular Affiliation name by using the Affiliation dropdown menu in the top bar of the table. Although (NONE) means no affiliation, for purposes of filtering (NONE) is shown as an option in the Affiliation dropdown so that you may view only entries that are not affiliated.

Working with Users & Groups

Overview

Urchin's Users & Groups functionality allows Urchin administrators to easily set up any number of users and grant them access to whichever reports are deemed appropriate. These users can then be put into groups to expedite and simplify management of large numbers of users. If a group of users is granted report access, all users in that group will have access upon logging in to the system. Users never have report access unless it is specifically granted.

How to Use Users

1. Log in to your Urchin system as an administrator. Note: the URL for accessing the Urchin system is identical regardless of the user type.
2. Click on Configuration in the main left-side navigation.
3. Click on Users & Groups.
4. Click the Add button at upper-right to enter the User Wizard.
5. Select a username -- it should be lower-case and must not have spaces -- and choose a password.
6. Enter the user's real name -- this will be displayed in the Urchin system. Click the Next button.
7. Determine what level of control the user should have. Unless you are running Urchin in Datacenter mode, the only choices will be User and Super Admin. If you are running in Datacenter mode, you also have the choice of Affiliate Admin (see the embedded help by clicking on the help link for specifics on affiliate admin settings). Click the Next button.
8. Once the user has been created, click on the Edit icon next to that user.
9. Click on the Report Access tab. The available Profiles will be shown in the box at left.
10. Select one or more Profiles. To select multiple Profiles, use the command key or control key depending on your platform.
11. Click the right-facing arrow to move the Profile(s) to the Access Granted box.
12. Click Update to save changes.

How to Use Groups

1. Log in to your Urchin system as an administrator and click on Configuration at left.
2. Click on Users & Groups.
3. Click on Groups.
4. Click the Add button at top-right.
5. Enter the Group Name -- this can be anything descriptive.
6. Enter the Group Description -- this might be something to do with the location or composition of the group.
7. Click Finish.
8. Click Done and then click Edit next to the group name.
9. Click the Users in Group tab to select users to add to the group -- to select multiple users, use the command key or control key depending on your platform.
10. Click the right-facing arrow to move users to the Users in Group box.
11. Click the Update button to save changes.
12. To add users to the group, click the Users in Group tab and add users as described in the procedure above.

Recommendations

Any "Super Admin" level user has complete control over the Urchin system, so it is advisable to only grant that privilege to one person.

Passwords should contain one or more capital letters and/or symbols to make them difficult to guess.

Scheduling Tasks

Working with the Task Scheduler

Overview

The Task Scheduler is the nerve center of Urchin: it is responsible for the scheduling and execution of Urchin log processing events for all Profiles. From the Scheduler, you can run tasks immediately or add them to the list of Urchin events for repeated execution at nearly any interval desired.

How to Use the Scheduler

1. Log in to your Urchin Administration Interface and click Configuration in the main left-side navigation, then Urchin Profiles.
2. Locate the Profile you wish to schedule and click Edit.
3. Click the Run/Schedule tab.
4. Under Task Settings, select the desired interval. Daily is recommended.
5. Set the time of day for the task.
6. Click Update to save changes.
7. To run the task immediately, click Run Now. Subsequent scheduled tasks will occur according to the schedule you have set.

Recommendations

Most tasks should be scheduled for daily execution, since that is the log rotation schedule for many webservers. However, Urchin's log tracking facility makes it possible to read the same log multiple times without double-counting data, so a daily schedule is not required.

Notes on the Scheduler's Operation

All tasks are handled sequentially by Urchin, so multiple tasks given the same time of execution will still be processed one at a time.

To see the results of all tasks that have been executed, see the Task History screen in the Scheduler navigation section.

System Settings

Changing the Port Number

The default port number that the Urchin webserver listens on is 9999. Changing this number consists of two basic steps:

♦ Changing the port number in the Server Settings screen
♦ Stopping and starting the Urchin services, which is a slightly different process on Windows than on Unix-type systems

The detailed process is as follows:

♦ Log in to the Urchin administration interface
♦ Navigate to Configuration->Settings->Access Settings and click on the Server Settings tab
♦ Set your new port number in the Server Port Number box
♦ Click on the Update button

Now you must restart the Urchin services:

On Unix−type systems go to the bin directory of your Urchin distribution and run:

./urchinctl restart

On Windows systems, from the console, go to Start->Programs->Urchin and choose Disable Services, then choose Enable Services.

The webserver should now be listening for connection requests on the new port number. This means that the URL used to view reports and configure the Urchin software has changed, and your users should be notified regarding the new URL.

Notes

Please note that on many systems, root privileges may be required to use port numbers less than 1024. Also, if another service is already running on the port specified, Urchin will fail to start.
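Before committing to a new port, it can help to check both conditions from the notes above. The following is a minimal sketch that is not part of Urchin: the function names are hypothetical, and the probe binds only to the loopback address, which is an assumption about where the service will listen.

```python
import socket


def needs_root(port):
    """Ports below 1024 are privileged on many Unix-type systems."""
    return port < 1024


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port.

    A failed bind (port in use, or insufficient privileges) returns False.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True
```

For example, `port_is_free(8080)` can be run on the server before entering 8080 in the Server Port Number box. A True result only reflects the moment of the check; another service could still claim the port before Urchin is restarted.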

Licensing Urchin

Overview

Urchin must be licensed in one of three ways before it can be used:

♦ Obtain Demo License
♦ Buy License
♦ Activate Pre-Purchased License

If you are trying out Urchin for the first time, you will want to install a demo license. This is a free 15-day evaluation, with no limitations on Urchin's function.

Installing a Demo License

To install a demo license, log in to your Urchin Administration Interface with a web browser (usually http://your.server.com:9999), click "Install Demo License", and follow the on-screen steps, including entering your contact information. It's important to enter your real information, as it will be necessary if and when you decide to purchase Urchin later. Click the Install Demo License link to complete the process.

Buy License

To purchase a license, log in to the Urchin system as an administrator and click on Configuration at left. Next, click on the Settings button and then the License button. On the main screen, click the Buy License link, which will take you to our online licensing center. Once you have completed the purchase, the Urchin system will be fully operational in perpetuity.

Activate Pre−Purchased License

To activate a pre-purchased license (such as if you purchased Urchin on CD or you have moved the Urchin installation to a new server), log in to the Urchin system as an administrator and click on Configuration at left. Next, click on the Settings button and then the License button. On the main screen, click the Activate Pre-Purchased License link, which will take you to our online licensing center. Once you have completed the process, the Urchin system will be fully operational in perpetuity.

Installing a License Without Internet Access

It is possible to license Urchin without internet access, such as behind a firewall. To accomplish this, you will need to run the "inspector" utility, which is bundled with Urchin and found in the "util" directory. Attach its output to an email and send it to [email protected] for assistance.

Recommendations

Please enter your real contact information when activating your demo, as it will be necessary for billing purposes if you decide to buy. We will also need to know who you are in order to provide support.

DNS Database Update

Urchin includes a DNS database which provides the information used in creating the Domain Reports, including the conversion of IP addresses to domain names. These databases are stored in the Urchin data directory and need to be updated on a periodic basis. Urchin includes geo-update, a utility that checks for updates and downloads new updates when they are available. The utility is scheduled to check for updates once a month. The user can set the day and time for the download or disable the downloads.

The geo-update utility can also be used to import custom entries into the DNS databases. For more information, see the geo-update utility article in the Advanced Topics -> Utilities section.

Considerations

The geo-update utility needs an internet connection to be able to check for and download new updates. The utility uses port 80 to communicate with the webserver providing the updates. Proxy servers and firewalls can interfere with Urchin's ability to download updates.

Chapter 4: Reporting Interface

Report−Side Filtering

Urchin is capable of sophisticated filtering of any text-based report via the reporting interface. To filter in or out any text string, enter it into the Filter box and click the "+" (include) or "-" (exclude) button. The Urchin reporting system will re-query the database and only display corresponding results.

To conduct more complex filtering operations, POSIX regular expressions can be used (the POSIX regular expression syntax is beyond the scope of this guide).
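The include/exclude behaviour can be illustrated outside Urchin with a short sketch. It uses Python's `re` module, whose syntax overlaps with, but is not identical to, POSIX regular expressions. The function name and the sample report rows are hypothetical.

```python
import re


def apply_filter(rows, pattern, include=True):
    """Keep rows matching pattern (like "+") or drop them (like "-")."""
    regex = re.compile(pattern)
    return [row for row in rows if bool(regex.search(row)) == include]


rows = ["/index.html", "/products/list.html", "/images/logo.gif", "/products/item.php"]
only_products = apply_filter(rows, r"^/products/")
without_images = apply_filter(rows, r"\.(gif|jpg)$", include=False)
```

Here `^/products/` keeps only entries under the products directory, while `\.(gif|jpg)$` with `include=False` hides image requests.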

Reporting Interface Overview

Welcome to the Urchin Reporting Interface!

Overview

The Urchin Reporting Interface is the system that displays the actual Urchin reports. To access the Reporting Interface, log in to your Urchin Administration Interface and select a report to view. If you are the Administrator, you will have access to all reports. If you are a User, you will have access to those reports specified by the Administrator.

Each Profile that has been configured has its own set of reports. Click the magnifying glass icon next to a profile to view the reports for that profile.

Controls

Note the Date Range at the top of any report. All data shown is for that time period only. To change the timeframe, just select a different date range from the controls at bottom-left of the screen. See the Date Range article in this section for more information.

♦ Standard/SVG: Urchin can display reports in either standard HTML, or via Adobe's Scalable Vector Graphics (SVG) format. By default, Urchin will attempt to determine if the browser in use has the SVG plug-in installed. If so, reports will be displayed in SVG format. If not, Urchin will use standard HTML. If the user attempts to select SVG with a browser that does not have the SVG plug-in, a link will be provided to a web page with information on getting the plug-in.

♦ Search: To instantly find any item in list-type reports, enter a search term or phrase into the Search box and press Enter. The list will be updated with any matches.

♦ Filter: To filter in or out any items in list-type reports, enter a string of text into the Filter box and click the "+" (include) or "-" (exclude) button. The list will be changed accordingly.

♦ #Shown: To show a different number of items in the report being viewed, simply select the desired number from the pulldown menu.

♦ Go To #: If you know the position number of the entry you would like to see, enter it here and press Enter.

♦ Export:
   ◊ Tab: click the "T" button to export data in tab-delimited format.
   ◊ Word: click the Word icon to export data in Microsoft Word native format.
   ◊ Excel: click the Excel icon to export data in Microsoft Excel native format.

♦ Printing: click the printer icon to get a print-friendly view of the data; click the Print Page link from that screen to actually print the report.

Recommendations

Try different Date Range settings to see how your data changes over time. For low-traffic sites, a month may be a better timeframe than a week, since traffic over so short a period might not be statistically significant.

See Also

♦ Glossary of Terms

Exporting Data

Overview

Urchin's data export function makes it easy to extract data from any Urchin report. This is useful for bringing report data into a spreadsheet, word processor, database, etc. for further analysis.

How to Use Export Data

To export data from any report, select the appropriate type based on the application you plan to use to manipulate the data. For general database importing, use tab-separated format. For Word and Excel export, the application should launch automatically after the data is exported, and the new document should be populated with the data you have exported.

♦ Tab: click the "T" button to export data in tab-delimited format.
♦ Word: click the Word icon to export data in Microsoft Word native format.
♦ Excel: click the Excel icon to export data in Microsoft Excel native format.

Printing: click the printer icon to get a print-friendly view of the data; click the Print Page link from that screen to actually print the report.

Recommendations

♦ To export data to a database, tab-separated is usually the preferred format.
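As an illustration of working with a tab-delimited export outside Urchin, the sketch below parses such text with Python's standard `csv` module. The column names and values are hypothetical, since the actual columns depend on the report exported, and the presence of a header row is an assumption.

```python
import csv
import io

# Hypothetical tab-delimited export; real column names depend on the report.
exported = "Page\tVisits\n/index.html\t120\n/products.html\t45\n"


def load_tab_export(text):
    """Parse tab-delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


rows = load_tab_export(exported)
total_visits = sum(int(row["Visits"]) for row in rows)
```

The resulting list of dictionaries can then be passed to a database driver or a spreadsheet library of your choice.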

Date Range

Overview

Urchin's Date Range function allows you to view report data by any timeframe desired, from 1 day to the entire period of time for which data exists, or any part thereof. The Date Range feature makes it easy to specify either a standard timeframe (such as a week, month, or year), or any custom timeframe.

Using the Date Range Function

♦ Standard Date Range: To view data by a standard timeframe such as a day, week, month, or year, just click the desired period in the Date Range navigation area, and the report's data will change accordingly. The Date Range Calendar is clickable in many ways to accomplish this:
   ◊ Year: click the year to display data for the entire calendar year.
   ◊ Month: click the name of the month.
   ◊ Week: click the arrow to the left of the calendar for the week you are interested in.
   ◊ Date: click the date you are interested in.
   ◊ Day: to only show data for every instance of a particular day of the week in the currently selected Date Range, click the day name.

♦ Custom: click the Enter Range button, which brings up the Urchin Calendar. Select the starting date in the calendar at left, and the ending date in the calendar at right. Click Apply Date Range, and the report will change to show data for the timeframe selected.

After selecting a custom Date Range, all reports you examine that are compatible with the selected timeframe will display data for that period until either the browser is closed or a different Date Range is specified.
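The Day selection described above restricts a range to every instance of one weekday. The sketch below shows which dates such a selection would cover. It is an illustration only, with a hypothetical function name, and says nothing about how Urchin computes the selection internally.

```python
from datetime import date, timedelta


def weekday_dates(start, end, weekday):
    """Return each date in [start, end] on the given weekday (Monday=0 ... Sunday=6)."""
    offset = (weekday - start.weekday()) % 7
    current = start + timedelta(days=offset)
    matches = []
    while current <= end:
        matches.append(current)
        current += timedelta(days=7)
    return matches


mondays = weekday_dates(date(2003, 8, 1), date(2003, 8, 31), 0)
```

For August 2003 this yields the 4th, 11th, 18th, and 25th.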

Recommendations

If you are examining a low-traffic site, try looking at a longer timeframe to get more meaningful data.

If you are interested in traffic trends over the life of your site (and the data exists), try analyzing a year or more worth of data. Urchin will adjust the size of bar graph elements to accommodate the selection.

Urchin 5.6 feature: You can also see the data displayed hourly, daily, or monthly over your selected date range. Select hourly, daily, or monthly from the Date View pulldown, as shown in the image below.

Chapter 5: E−commerce Module

E−commerce Overview

Introduction

Urchin's E-commerce reporting module expands the power of Urchin's reporting to allow you to follow visitors all the way to the point of conversion and actually measure your ROI on various aspects of your website and marketing campaigns. There are two sections of reporting enabled by the module. The first section of reports, E-Commerce, provides trend analysis of on-line revenue, transactions, and product detail. The second commerce reports section, Revenue Source, correlates revenue against visitor parameters including keywords and search engines.

This valuable reporting capability allows you to exploit cross-system on-line business resources and optimize on-line campaigns. Easily calculate your ROI from CPC and organic search engine placements.

System Overview

When a visitor to a website makes a purchase, the shopping cart software will make an entry into a transaction log file which, when processed along with normal web traffic logs, creates a complete picture of the E-commerce system.

Urchin processes these logs together and correlates the web site session with the E-commerce transaction. The purchase and product information is stored in the Urchin databases ready for viewing in the E-Commerce reports.

Configuration

There are three key elements for configuring Urchin's e−commerce processing:

Establish a usable e-commerce log format: Urchin needs to understand how your e-commerce logs are constructed. The choices are ELF2, ELF, or custom log format. See the ELF & ELF2 Log Formats or Custom E-commerce Log Formats articles in this section for details.

Coordinating processing of e-commerce and webserver access logs together: typically e-commerce transactions are tracked separately from normal webserver activity, and frequently the sites are not even hosted on the same machine. You'll have to make sure that both sets of logs are available to Urchin so that they can be processed together. Both logs should be listed as log sources in the single profile that is set up to handle your e-commerce reporting.

Choosing a visitor tracking method: the visitor tracking method will determine how well Urchin can correlate e-commerce activity with normal website activity. You should decide which visitor tracking method will yield the level of analysis you desire. The more accurate UTM method requires making some simple modifications to your website documents to achieve the most complete analysis. These modifications should be made to all websites involved in your online business. See the Visitor Tracking section of the Documentation Center for details on setting up UTM.

Considerations

It is strongly advised to have your shopping cart software log in ELF2 format if possible. This will reduce some of the Urchin administration overhead in setting up your e-commerce reporting, since Urchin has a built-in capability to deal with this format automatically.

ELF & ELF2 Log Formats

Overview

The E-commerce log formats (ELF & ELF2) were designed to record information about customer transactions from online shopping sites. ELF was originally created for use with Urchin 3 and may be used with Urchin 5 when processing data with the IP-Only visitor method. ELF2 is similar to ELF and includes additional fields that allow for visitor correlation using the IP+UserAgent, UTM, and other visitor tracking methods. Logging data in the ELF2 format is recommended, since it provides better visitor correlation with your webserver data. If you cannot set up your shopping cart software to log in ELF/ELF2, then you must configure your own Urchin custom log format before attempting to process your e-commerce data.

This document describes the format of the ELF and ELF2 log files that are created by the shopping cart software and explains how to configure Urchin for processing of e-commerce logs.

Configuring Urchin for ELF/ELF2 Log Files

You must select specific Urchin configuration parameters depending on your e−commerce log type.

ELF processing

♦ In the Log Source->Log Settings screen, set the Log Format to either elf or auto
♦ In the Profile->Reporting screen, set the Visitor Tracking Method in the Profile to IP-ONLY, which is the only method supported when using ELF e-commerce log formats

ELF2 processing

♦ In the Log Source->Log Settings screen, set the Log Format to either elf2 or auto
♦ In the Profile->Reporting screen, set the Visitor Tracking Method to any of the choices, which are all supported when using ELF2

Your e-commerce log should be listed as a second log source along with your main website log in the profile that is created to handle your e-commerce reporting. The logs are processed sequentially by Urchin.

ELF/ELF2 Log Format Description

Both ELF and ELF2 are tab-separated multi-line log formats. The first line begins with an '!' exclamation character and contains overall information about the purchase. Subsequent lines contain detailed information about the items purchased. The first line is referred to as the transaction and the subsequent lines are referred to as items. Blank fields should contain a '-' character. Since tabs are used to separate fields, the tab character is not allowed to be used within a field.

A typical ELF/ELF2 log file will have the following general form:

!transaction_1
item1
item2
item3
!transaction_2
item1
item2
...

ELF2 Log Format

ELF2 Transaction Line

The ELF2 transaction line begins with an '!' exclamation and contains the following tab-separated fields (empty fields should contain a '-' character):

!%{ORDERID} %{REMOTE_HOST} %{DATE/TIME} %{STORE} %{SESSIONID} %{TOTAL} %{TAX} %{SHIPPING} %{BILL_CITY} %{BILL_STATE} %{BILL_ZIP} %{BILL_COUNTRY} %{USER_AGENT} %{COOKIES}

where:

♦ %{ORDERID} is the order number
♦ %{REMOTE_HOST} is the hostname/ip address of the remote machine
♦ %{DATE/TIME} is the time in the common log format [dd/mmm/yyyy:HH:MM:SS +/-ZZZZ]
♦ %{STORE} is the name/id of the storefront
♦ %{SESSIONID} is the unique session identifier of the customer
♦ %{TOTAL} is the transaction total including tax and shipping (decimal only, no '$' characters)
♦ %{TAX} is the amount of tax charged to the subtotal
♦ %{SHIPPING} is the amount of shipping charges
♦ %{BILL_CITY} is the billing city of the customer
♦ %{BILL_STATE} is the billing state of the customer
♦ %{BILL_ZIP} is the billing zip code of the customer
♦ %{BILL_COUNTRY} is the billing country of the customer
♦ %{USER_AGENT} is the user agent of the customer's browser
♦ %{COOKIES} are the incoming cookies contained in the headers from the customer's browser

ELF2 Item Line

The ELF2 item line contains the following tab-separated fields (empty fields should contain a '-' character):

%{ORDERID} %{REMOTE_HOST} %{DATE/TIME} %{PRODUCT_CODE} %{PRODUCT_NAME} %{VARIATION} %{PRICE} %{QUANTITY} %{UPSOLD} %{USER_AGENT} %{COOKIES}

where:

♦ %{ORDERID} is the order number
♦ %{REMOTE_HOST} is the hostname/ip address of the remote machine
♦ %{DATE/TIME} is the time in the common log format [dd/mmm/yyyy:HH:MM:SS +/-ZZZZ]
♦ %{PRODUCT_CODE} is the identifier of the product
♦ %{PRODUCT_NAME} is the name of the product
♦ %{VARIATION} is an optional variation of the product for colors, sizes, etc.
♦ %{PRICE} is the unit price of the product (decimal only, no '$' signs)
♦ %{QUANTITY} is the quantity ordered of this product
♦ %{UPSOLD} is a boolean (0|1) indicating whether the product was on sale
♦ %{USER_AGENT} is the user agent of the customer's browser
♦ %{COOKIES} are the incoming cookies contained in the headers from the customer's browser

ELF2 Log File Example

The following 2 lines demonstrate a transaction and corresponding item entry in an ELF2 log:

!36530 123.123.123.123 [21/Aug/2003:11:31:45 -0800] - - 895.00 - - Virginia Beach VA 23452 US "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "__utma=171060324.2002410569.1061216915.1061216915.1061490246.2;__utmb=171060324;__utmc=171060324"
36530 123.123.123.123 [21/Aug/2003:11:31:45 -0800] U5-BASE Urchin 5 Base License - 895.00 1 - "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "__utma=171060324.2002410569.1061216915.1061216915.1061490246.2;__utmb=171060324;__utmc=171060324"
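If your shopping cart can be extended with a small script, ELF2 lines can be assembled from the field lists above. The following is a minimal sketch, not code shipped with Urchin: the function names and the dictionary-based record layout are hypothetical. It writes '-' for blank fields and replaces any tab inside a value with a space, as the format requires. Any quoting of the user agent and cookies, as seen in the example, is left to the caller.

```python
TRANSACTION_FIELDS = [
    "ORDERID", "REMOTE_HOST", "DATE/TIME", "STORE", "SESSIONID", "TOTAL", "TAX",
    "SHIPPING", "BILL_CITY", "BILL_STATE", "BILL_ZIP", "BILL_COUNTRY",
    "USER_AGENT", "COOKIES",
]
ITEM_FIELDS = [
    "ORDERID", "REMOTE_HOST", "DATE/TIME", "PRODUCT_CODE", "PRODUCT_NAME",
    "VARIATION", "PRICE", "QUANTITY", "UPSOLD", "USER_AGENT", "COOKIES",
]


def _clean(value):
    """Render one field: '-' for blanks, and no tab characters inside a field."""
    if value is None or value == "":
        return "-"
    return str(value).replace("\t", " ")


def elf2_transaction_line(record):
    """Build the '!'-prefixed transaction line from a dict of field values."""
    return "!" + "\t".join(_clean(record.get(name)) for name in TRANSACTION_FIELDS)


def elf2_item_line(record):
    """Build one item line from a dict of field values."""
    return "\t".join(_clean(record.get(name)) for name in ITEM_FIELDS)
```

A transaction would be logged by writing one `elf2_transaction_line(...)` result followed by one `elf2_item_line(...)` result per product purchased.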

ELF Log Format

ELF Transaction Line

The ELF transaction line begins with an '!' exclamation and contains the following tab-separated fields (empty fields should contain a '-' character):

!%{ORDERID} %{REMOTE_HOST} %{STORE} %{SESSIONID} %{DATE/TIME} %{TOTAL} %{TAX} %{SHIPPING} %{BILL_CITY} %{BILL_STATE} %{BILL_ZIP} %{BILL_COUNTRY}

where:

♦ %{ORDERID} is the order number
♦ %{REMOTE_HOST} is the hostname/ip address of the remote machine
♦ %{STORE} is the name/id of the storefront
♦ %{SESSIONID} is the unique session identifier of the customer
♦ %{DATE/TIME} is the time in the common log format [dd/mmm/yyyy:HH:MM:SS +/-ZZZZ]
♦ %{TOTAL} is the transaction total including tax and shipping (decimal only, no '$' characters)
♦ %{TAX} is the amount of tax charged to the subtotal
♦ %{SHIPPING} is the amount of shipping charges
♦ %{BILL_CITY} is the billing city of the customer
♦ %{BILL_STATE} is the billing state of the customer
♦ %{BILL_ZIP} is the billing zip code of the customer
♦ %{BILL_COUNTRY} is the billing country of the customer

ELF Item Line

The ELF item line contains the following tab-separated fields (empty fields should contain a '-' character):

%{ORDERID} %{PRODUCT_CODE} %{PRODUCT_NAME} %{VARIATION} %{PRICE} %{QUANTITY} %{UPSOLD}

where:

♦ %{ORDERID} is the order number
♦ %{PRODUCT_CODE} is the identifier of the product
♦ %{PRODUCT_NAME} is the name of the product
♦ %{VARIATION} is an optional variation of the product for colors, sizes, etc.
♦ %{PRICE} is the unit price of the product (decimal only, no '$' signs)
♦ %{QUANTITY} is the quantity ordered of this product
♦ %{UPSOLD} is a boolean (0|1) indicating whether the product was on sale

ELF Log File Example

The following lines demonstrate 2 transactions and corresponding item entries in an ELF log:

!12313 ppp-46.mia-tc-2.netrox.net ZongStore 1102323131 [27/Jul/1999:11:43:02 -0700] 198.12 8.12 10.00 Cedar Rapids Iowa 52403 US
12313 102 T Shirt XL 10.00 10 0
12313 103 Boxers L 9.00 10 0
!12314 213.12.54.123 - 110123413 [27/Jul/1999:11:43:02 -0700] 11.75 0.75 1.00 Santa Ana CA 92705 US
12314 102 T Shirt S 10.00 1 0
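To check that a shopping cart is writing well-formed ELF before handing the log to Urchin, the transaction/item structure can be read back with a short script. This is a sketch under stated assumptions, not Urchin's own parser: the function name is hypothetical, the fields are assumed to be separated by real tab characters, and item lines that appear before any transaction are skipped.

```python
ELF_TRANSACTION_FIELDS = [
    "ORDERID", "REMOTE_HOST", "STORE", "SESSIONID", "DATE/TIME", "TOTAL", "TAX",
    "SHIPPING", "BILL_CITY", "BILL_STATE", "BILL_ZIP", "BILL_COUNTRY",
]
ELF_ITEM_FIELDS = [
    "ORDERID", "PRODUCT_CODE", "PRODUCT_NAME", "VARIATION", "PRICE",
    "QUANTITY", "UPSOLD",
]


def parse_elf(lines):
    """Group ELF lines into transactions, each carrying a list of its items."""
    transactions = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("!"):
            # A leading '!' marks a transaction line.
            record = dict(zip(ELF_TRANSACTION_FIELDS, line[1:].split("\t")))
            record["items"] = []
            transactions.append(record)
        elif transactions:
            # Any other line is an item of the most recent transaction.
            item = dict(zip(ELF_ITEM_FIELDS, line.split("\t")))
            transactions[-1]["items"].append(item)
    return transactions


sample = [
    "!12314\t213.12.54.123\t-\t110123413\t[27/Jul/1999:11:43:02 -0700]\t11.75\t0.75\t1.00\tSanta Ana\tCA\t92705\tUS\n",
    "12314\t102\tT Shirt\tS\t10.00\t1\t0\n",
]
parsed = parse_elf(sample)
```

The sample reuses the second transaction from the example above, with the tab separators written explicitly.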

Custom E−commerce Logs

Overview

Many shopping carts capture and log valuable information regarding purchases in formats other than ELF or ELF2, and such logs cannot be automatically processed by Urchin. This article explains how to create a custom log format for your E-commerce log file if you cannot alter your shopping cart to generate ELF/ELF2. Before continuing, please read the article titled "Custom Log Formats" in the Advanced Topics->Customization section of the Document Library, which explains the creation of custom logs in detail.

E-commerce Log Format Types

Shopping carts are capable of logging information about purchases and the items purchased in either a single-line format or a multi-line format. In single-line format, each line contains all the information necessary to completely describe a transaction and the items purchased, and all lines have the same layout. In multi-line formats, multiple lines are used to describe a purchase, with one format for the transaction lines and another format for the items purchased. ELF/ELF2 logs are multi-line formats. You must examine your E-commerce logs to determine if the data is single-line or multi-line, as this will affect how you set up your custom log format. Please follow the instructions below depending on your type of log format.

General E−commerce Logging Requirements

Regardless of the format of the log entries your shopping cart produces, each entry must contain the date and time and at least one of the following fields to provide visitor correlation:

♦ Remote Host or IP Address (for IP-Only or IP-Useragent visitor methods)
♦ Useragent (for IP-Useragent visitor method)
♦ Cookies (for UTM or SID visitor method)
♦ Session ID (for SID visitor method)

If the fields required by your chosen visitor method are missing, Urchin will not produce meaningful analysis of your revenue. Urchin also defines the following E-commerce fields:

♦ %{ORDERID} is the order number
♦ %{STORE} is the name/id of the storefront
♦ %{SESSIONID} is the unique session identifier of the customer
♦ %{TOTAL} is the transaction total including tax and shipping (decimal only, no '$' characters)
♦ %{TAX} is the amount of tax charged to the subtotal
♦ %{SHIPPING} is the amount of shipping charges
♦ %{BILL_CITY} is the billing city of the customer
♦ %{BILL_STATE} is the billing state of the customer
♦ %{BILL_ZIP} is the billing zip code of the customer
♦ %{BILL_COUNTRY} is the billing country of the customer
♦ %{PRODUCT_CODE} is the identifier of the product
♦ %{PRODUCT_NAME} is the name of the product
♦ %{VARIATION} is an optional variation of the product for colors, sizes, etc.
♦ %{PRICE} is the unit price of the product (decimal only, no '$' characters)
♦ %{QUANTITY} is the quantity ordered of this product
♦ %{UPSOLD} is a boolean (0|1) indicating whether the product was on sale
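The visitor-correlation requirements listed earlier in this section can be summarized as a lookup from the fields your log provides to the visitor methods that could use them. This is an interpretive sketch: the function name and field labels are hypothetical, and it assumes that IP-Useragent needs both the remote host and the useragent, and that either cookies or a session ID is enough for SID.

```python
def supported_visitor_methods(fields):
    """Map the fields present in a log entry to usable visitor tracking methods.

    fields is a set drawn from: "remote_host", "user_agent", "cookies", "session_id".
    """
    methods = []
    if "remote_host" in fields:
        methods.append("IP-Only")
        if "user_agent" in fields:
            # Assumption: IP-Useragent needs both the address and the useragent.
            methods.append("IP-Useragent")
    if "cookies" in fields:
        methods.append("UTM")
    if "cookies" in fields or "session_id" in fields:
        # Assumption: either field is sufficient for the SID method.
        methods.append("SID")
    return methods
```

For instance, a log that records only the remote host supports IP-Only correlation, while one that also records cookies adds UTM and SID to the list.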

Single−line Format Logs

Follow these instructions if your E-commerce log file only contains hits that all have the same line format as explained above.

1. Create a new custom log format in the lib/custom/logformats directory by making a copy of the custom.lf.sample logformat file. Name your copy with a .lf suffix.
2. Edit your new custom log format file and set the following entries based on the recommendations below:
   ◊ PrimaryPositions: This entry specifies the order of fields in your log file. Create a comma-separated list of field ids which describes your field order. The field names and ids are found in the lib/reporting/logformats/fieldlist.txt file. See example below.
   ◊ SecondaryPositions: Leave this as '-' since it is not used for single-line format log files.
   ◊ PrimaryKey: Leave this as '-' since it is not used for single-line format log files.
   ◊ SecondaryKey: Leave this as '-' since it is not used for single-line format log files.
   ◊ PrimaryContent: Valid entries for this field are TRANSACTION or ITEM. If the hits in your log file describe the purchase of each individual product, set this to ITEM. If the hits in the log file describe the entire purchase, set this to TRANSACTION.
   ◊ SecondaryContent: Leave this as '-' since it is not used for single-line format log files.
   ◊ CommentKey: If some of the lines in your log file are comments or are not considered hits and begin with a specific character, enter the character here.
   ◊ FieldSeparator1: The field separators define which characters are considered field separators. Typical entries are tabs (\t) and spaces (\s). Set these appropriately based on the characters between the fields in your log file.
   ◊ FieldSeparator2: See FieldSeparator1 above.
   ◊ QuotesEscapeSep: This specifies whether field separators will be ignored inside a field that contains quote "" characters. This should probably be left as YES.
   ◊ BracketsEscapeSep: This specifies whether field separators will be ignored inside a field that contains bracket [] characters. This should probably be left as YES.
   ◊ MergSuccessiveSep: This specifies whether to consider two separator characters in a row as one separator. This can probably be left as NO.
   ◊ CleanWhiteSpace: This specifies whether to remove white space from the ends of the fields when they are parsed. This can probably be left as NO.
   ◊ StatusRequired: Leave this set to NO unless your hits contain web server type status codes.
   ◊ CustomDateFormat: If your log format contains a custom date format, set the appropriate strptime format that describes the entry.
   ◊ CustomTimeFormat: If your log format contains a custom time format, set the appropriate strptime format that describes the entry.
3. Save your custom log format in the lib/custom/logformats directory.
4. Select the custom log format for your log source in the Urchin Admin interface.
5. Process your log file(s) with Urchin.
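The CustomDateFormat and CustomTimeFormat entries above take strptime-style format strings. A candidate format can be tried against a sample value before it goes into the .lf file, for example with Python's `datetime.strptime`, which uses similar directives. The formats and sample values below are hypothetical, and whether Urchin accepts exactly the same set of directives as Python is an assumption.

```python
from datetime import datetime

# Hypothetical custom formats for a log that writes "2003-08-26" and "11:43:02".
custom_date_format = "%Y-%m-%d"
custom_time_format = "%H:%M:%S"


def parse_entry_timestamp(date_text, time_text):
    """Combine separately formatted date and time fields into one datetime."""
    day = datetime.strptime(date_text, custom_date_format).date()
    clock = datetime.strptime(time_text, custom_time_format).time()
    return datetime.combine(day, clock)


stamp = parse_entry_timestamp("2003-08-26", "11:43:02")
```

If the format string does not describe the sample, `strptime` raises a `ValueError`, which is a quick signal that the format needs adjusting.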

Single−line Format Example

The following example is a single hit from a log that only has transaction data.

12345 123.123.123.123 "Urchin Store" [26/Aug/2003:11:43:02 -0700] 192.73 "San Diego" "CA" 92101 "US" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)" "__utma=171060324.2734232095.1061444425.1061444425.1061444763.2"

The list below shows each field name listed with the id number obtained from the lib/reporting/logformats/fieldslist.txt file. The id numbers thus assigned are used in the PrimaryPositions field in your custom log format file.

1. Transaction ID: 25
2. Remote Host or IP Address: 12
3. Store Name: 26
4. Apache Date/Time: 3
5. Total Cost: 28
6. Bill City: 31
7. Bill State: 32
8. Bill Zip: 33
9. Bill Country: 34
10. User Agent: 13
11. Cookies: 14

Based on the list above, you would set the following entries in the custom log format file:

PrimaryPositions: "25, 12, 26, 3, 28, 31, 32, 33, 34, 13, 14"
SecondaryPositions: -
PrimaryKey: -
SecondaryKey: -
PrimaryContent: TRANSACTION
SecondaryContent: -
CommentKey: #
FieldSeparator1: \s
FieldSeparator2: \t
QuotesEscapeSep: YES
BracketsEscapeSep: YES
MergSuccessiveSep: NO
CleanWhiteSpace: NO
StatusRequired: NO
CustomDateFormat: -
CustomTimeFormat: -

The PrimaryPositions specify the field order and the PrimaryContent tells Urchin that this log contains transactions (or general information about purchases). The field separators were set to space and tab since the fields were separated by white space. The custom date/time formats were not specified since the date/time was formatted as an Apache date.
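To see how these settings interact, the sketch below splits the example hit on spaces and tabs while ignoring separators inside quotes and brackets, then pairs each value with its field id. It is an illustration of the settings' described behaviour, not Urchin's parser: the function names are hypothetical, and stripping the quote characters from values is a choice made for this sketch.

```python
# Field ids in the order given by PrimaryPositions in the example above.
PRIMARY_POSITIONS = [25, 12, 26, 3, 28, 31, 32, 33, 34, 13, 14]
SEPARATORS = {" ", "\t"}  # FieldSeparator1 (\s) and FieldSeparator2 (\t)


def split_hit(line):
    """Split a hit on separators, ignoring separators inside "..." or [...]."""
    fields, current = [], []
    in_quotes = in_brackets = False
    for ch in line:
        if ch == '"' and not in_brackets:
            in_quotes = not in_quotes
            continue  # drop the quote characters themselves
        if ch == "[" and not in_quotes:
            in_brackets = True
        elif ch == "]" and not in_quotes:
            in_brackets = False
        if ch in SEPARATORS and not (in_quotes or in_brackets):
            # Successive separators are not merged, so an empty field may result.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def map_hit(line):
    """Pair each parsed value with its field id from PRIMARY_POSITIONS."""
    return dict(zip(PRIMARY_POSITIONS, split_hit(line)))


hit = ('12345 123.123.123.123 "Urchin Store" [26/Aug/2003:11:43:02 -0700] 192.73 '
       '"San Diego" "CA" 92101 "US" '
       '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)" '
       '"__utma=171060324.2734232095.1061444425.1061444425.1061444763.2"')
mapped = map_hit(hit)
```

Without the quote and bracket handling, "Urchin Store" and the bracketed date would each be split in two, and the eleven PrimaryPositions ids would no longer line up with the values.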

Multi−line Format Logs

Urchin has the ability to read multi-line formats as long as the first character of each line identifies which format is being used. For example, the ELF/ELF2 log files contain a '!' exclamation character as the first character in the transaction line. The item lines do NOT contain a leading '!' character.

Follow these instructions if your E-commerce log file contains two different format lines, one for the transaction and the other for product or item details.

1. Create a new custom log format in the lib/custom/logformats directory by making a copy of the custom.lf.sample logformat file. Name your copy with a .lf suffix.

2. Edit the new custom log format file and set the following entries based on the recommendations below:

   ◊ PrimaryPositions: This entry specifies the order of fields in the log file lines that use the primary format. Create a comma separated list of field ids which describes your field order. The field names and ids are found in the lib/reporting/logformats/fieldlist.txt file.

   ◊ SecondaryPositions: This entry specifies the order of fields in the log file lines that use the secondary format. Create a comma separated list of field ids which describes your field order. The field names and ids are found in the lib/reporting/logformats/fieldlist.txt file.

   ◊ PrimaryKey: Set the primary key to the character that identifies a log file line as having the format described by PrimaryPositions.

   ◊ SecondaryKey: Set the secondary key to the character that identifies a log file line as having the format described by SecondaryPositions.

   ◊ PrimaryContent: Valid entries for this field are TRANSACTION or ITEM. If the hits in your log file describe the purchase of each individual product, set this to ITEM. If the hits in the log file describe the entire purchase, set this to TRANSACTION.

   ◊ SecondaryContent: See PrimaryContent above.

   ◊ CommentKey: If some of the lines in your log file are comments or are not considered hits and begin with a specific character, enter the character here.

   ◊ FieldSeparator1: The field separators define which characters are considered field separators. Typical entries are tabs (\t) and spaces (\s). Set these appropriately based on the characters between the fields in your log file.

   ◊ FieldSeparator2: See FieldSeparator1 above.

   ◊ QuotesEscapeSep: This specifies whether field separators will be ignored inside a field that contains quote "" characters. This should probably be left as YES.

   ◊ BracketsEscapeSep: This specifies whether field separators will be ignored inside a field that contains bracket [] characters. This should probably be left as YES.

   ◊ MergSuccessiveSep: This specifies whether to consider two separator characters in a row as one separator. This can probably be left as NO.

   ◊ CleanWhiteSpace: This specifies whether to remove white space from the ends of the fields when they are parsed. This can probably be left as NO.

   ◊ StatusRequired: Leave this set to NO unless your hits contain web server type status codes.

   ◊ CustomDateFormat: If your log format contains a custom date format, set the appropriate strptime format that describes the entry.

   ◊ CustomTimeFormat: If your log format contains a custom time format, set the appropriate strptime format that describes the entry.

3. Save your custom log format in the lib/custom/logformats directory.
4. Select the custom log format for your log source in the Urchin Admin interface.
5. Process your log file(s) with Urchin.
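The role of the key characters can be pictured with a short Python sketch. It only illustrates the idea of choosing a line format from the first character; it is not how Urchin is implemented. The classify_line helper, the default key values, and the sample lines are all hypothetical, and the sketch assumes that a line without the primary key belongs to the secondary format when no secondary key is set.

```python
def classify_line(line, primary_key="!", secondary_key=None, comment_key="#"):
    """Return 'comment', 'primary', 'secondary', or 'unknown' for one log line."""
    if comment_key and line.startswith(comment_key):
        return "comment"
    if line.startswith(primary_key):
        return "primary"
    # Assumption: with no secondary key, every remaining line is an item line.
    if secondary_key is None or line.startswith(secondary_key):
        return "secondary"
    return "unknown"


# Made-up lines: one comment, one transaction line, one item line.
sample_lines = [
    "# exported by the shop",
    "!1001 192.73",
    "1001 SKU-1 2",
]

for text in sample_lines:
    print(classify_line(text), "<-", text)
```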

Visitor Correlation

Overview

Visitor correlation is the process of identifying visitor behavior even if the sessions come from different log files or independent web or E-commerce servers. Urchin uses data from the various log sources to analyze relationships between sessions and transactions and then correlates this information to provide a clear picture of how visitor activity relates to purchases, thereby providing valuable return on investment reporting. For example, referrals from search engines and specific keywords can be correlated with the amount purchased from your E-commerce site to tell you which referrals and search terms are yielding the most revenue. Typically you will have one log for your public-facing website, and another log from your secure transaction website. In order to correlate disparate data sources, Urchin must use the same visitor identification method for each site.

Types of Visitor Correlation

Urchin is capable of correlating visitors based on several methods. These methods are described in the Visitor Identification Methods article in the Visitor Tracking section. Your choice of visitor tracking method will directly affect the accuracy of the information Urchin has at its disposal to correlate the E-commerce transactions with other website activity. Please choose the method that is suitable for the level of detail you desire.

The following data is required in the E−commerce log file for each of the visitor methods:

♦ UTM: cookies
♦ SID: cookies or SID field
♦ Username: username
♦ IP+UserAgent: remote host or IP address and user-agent (i.e. browser type)
♦ IP-Only: remote host or IP address

These items need to be considered when you examine your E-commerce logging format. Please review the ELF & ELF2 Log Formats and Custom E-commerce Log Formats documents in this section for more detail on how to ensure the proper data is in your logs.
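The requirements listed above can be written down as a small lookup, which may help when checking a log format against a chosen visitor method. This is a Python sketch only; the field labels (cookies, sid, username, remote_host, user_agent) and the supports helper are hypothetical names, not Urchin settings.

```python
# Each method maps to one or more alternative sets of required data.
REQUIRED = {
    "UTM": [{"cookies"}],
    "SID": [{"cookies"}, {"sid"}],  # either cookies or an SID field
    "Username": [{"username"}],
    "IP+UserAgent": [{"remote_host", "user_agent"}],
    "IP-Only": [{"remote_host"}],
}


def supports(method, available_fields):
    """Return True if the log fields satisfy at least one alternative for the method."""
    available = set(available_fields)
    return any(option <= available for option in REQUIRED[method])


log_fields = ["remote_host", "user_agent"]
print([name for name in REQUIRED if supports(name, log_fields)])
```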

Configuration

This section presents a general overview of e−commerce visitor correlation configuration issues.

Non−UTM Sites

If you do not have the UTM sensor installed on your sites, then your visitor correlation configuration will depend on what e-commerce format you are using.

ELF: Use IP-Only as your visitor tracking method. In such a case Urchin will automatically correlate the various sessions as long as all logs contain the IP fields. Simply choose IP-Only for the Visitor Tracking Method in the Profile's Reporting screen.

ELF2: Use IP+User-Agent as your visitor tracking method. In such a case Urchin will automatically correlate the various sessions as long as all logs contain the IP and User-Agent fields. Simply choose IP+User-Agent for the Visitor Tracking Method in the Profile's Reporting screen.

UTM−Enabled Sites

For UTM-enabled sites, the same version of the UTM sensor must be installed on all the pages you want to track. In the Profile Reporting screen set Visitor Tracking Method to Urchin Traffic Monitor. In general you would set the UTM Domain to the domain that is common to the sites you're processing. For example, when processing web logs from ads.urchin.com along with logs from secure.urchin.com the UTM Domain would be set to urchin.com. For specific details on installing and configuring UTM, please see the Visitor Tracking section of the Documentation Library.
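One way to find the domain that is common to several sites is to compare their host names label by label from the right. The Python sketch below does this for the two hosts named above; common_domain is a hypothetical helper, not part of Urchin.

```python
def common_domain(hosts):
    """Return the longest domain suffix shared by all host names."""
    reversed_labels = [host.lower().split(".")[::-1] for host in hosts]
    shared = []
    for labels in zip(*reversed_labels):
        if len(set(labels)) != 1:
            break  # the hosts differ at this label
        shared.append(labels[0])
    return ".".join(reversed(shared))


print(common_domain(["ads.urchin.com", "secure.urchin.com"]))
```

For hosts that share only a top-level domain the function returns just that label (for example "com"), which is not a usable UTM Domain, so the result still needs a sanity check.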

Cancelling E−commerce Transactions

It is sometimes necessary to back out or cancel an e-commerce transaction. Cancelling orders which did not go through or which were disallowed for one reason or another ensures that your Urchin reports, including Campaign Tracking reports, provide accurate information.

To cancel an order or transaction, find the transaction in your ELF or ELF2 log. Then, create a duplicate entry which contains a negative transaction total that cancels out the original transaction. For example, if the original transaction total is $699, enter a duplicate entry with -699 dollars as the transaction total.
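The effect of the cancelling entry is simple arithmetic: the original total and its negative sum to zero, so the order no longer contributes revenue. A minimal Python sketch, using the $699 figure from the example (cancelling_total is a hypothetical helper):

```python
from decimal import Decimal


def cancelling_total(original_total):
    """Return the transaction total to use in the duplicate (cancelling) entry."""
    return -Decimal(original_total)


totals = [Decimal("699"), cancelling_total("699")]
net_revenue = sum(totals)
print(totals[1], net_revenue)
```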

Read ELF & ELF2 Log Formats to understand the E-commerce log format that applies to you.

Chapter 6: Campaign Tracking Module

Campaign Tracking Overview

The Urchin Campaign Tracking Module accurately tracks visitors from a source, such as a search engine or email link, to a conversion or transaction on your site.

With the Urchin Campaign Tracking Module, you gain the benefits of:

♦ Multi-Session Tracking: Track visitors from lead to conversion across multiple sessions.
♦ ROI Analysis: Buy the keywords that convert. Cut those that don't.
♦ Goal Conversion: Verify conversion to purchase or any other goal.
♦ A/B Testing: Test content and go with what works.
♦ Click Fraud Reporting: Identify and take action against click fraud.
♦ Day Parts Reporting: Don't waste money when your audience is away.
♦ Multi-Dimensional Comparisons: Marketing campaigns, advertising channels, e-mail blasts, search engines, specific keywords, organic searches, and more.

How does it work?

The Urchin Campaign Tracking Module tracks data from a variety of sources to provide closed-loop ROI analysis. Let's look at the steps.

Step 1: From Link to Web Page

Each visitor to your site enters via a link indicating where they clicked from, the keywords they used, if any, as well as campaign and medium information. The patent-pending Urchin Traffic Monitor (UTM-3 and UTM-4), which is part of the Campaign Tracking Module, parses the link to obtain this information.

The UTM is a small amount of JavaScript code in each of your web pages. You can install the UTM-3 or UTM-4 manually in each web page or automatically via server side includes and other template systems. Once installed, the UTM is triggered each time a visitor views the page. The UTM performs three tasks:

♦ it ensures that a page hit is registered in the web log if the page was cached or proxied,
♦ it parses the link to obtain and log campaign information, and
♦ it updates visitor activity information.

Step 2: Parsing the Link

The UTM parses the incoming link to obtain the campaign information. For example,

http://www....com/?utm_source=google&utm_medium=cost-per-click

indicates that the visitor clicked on a cost-per-click link on the Google search engine. (UTM-4 automatically detects the keywords that the visitor searched on.) Although this particular link uses only two variables, utm_source and utm_medium, which indicate the source, Google, and the medium, "cost-per-click", your links may incorporate three additional variables: utm_campaign, utm_content, and utm_term. These three variables are available to indicate a specific marketing initiative, ad content, and a paid search term (necessary for UTM-3), respectively. Information on these variables and how to set up your Urchin Campaign Tracking Module software is provided in the article Step 1: Track Campaign Data.
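The parsing step can be pictured with a few lines of Python. The UTM itself is JavaScript, so this is only a sketch of the idea, not the UTM code; campaign_info is a hypothetical helper and www.example.com is a placeholder host.

```python
from urllib.parse import parse_qs, urlsplit

UTM_VARIABLES = ("utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term")


def campaign_info(url):
    """Return the UTM variables found in the query string of a link."""
    query = parse_qs(urlsplit(url).query)
    return {name: query[name][0] for name in UTM_VARIABLES if name in query}


link = "http://www.example.com/?utm_source=google&utm_medium=cost-per-click"
print(campaign_info(link))
```

A plus sign in a value is decoded to a space, so utm_term=running+shoes yields the term "running shoes".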

The UTM is not limited to parsing links that you embed in emails or paid keywords, but also parses keyword information from organic links. This is important because it enables you to make side-by-side comparisons of paid versus unpaid search results. The UTM recognizes links from the top search engines and parses out the source and keyword information. In addition, the Campaign Tracking Module can also be configured to recognize and parse links from custom organic search engines, if required. Information on how to do this is provided in the article Adding a Custom Search Engine.

Step 3: Logging Campaign Information and User Activity

The UTM does two things with the campaign information it parses from the links; it

♦ formats a web document request that allows the web server to make a special entry in the web log, and
♦ updates the client first-party cookie.

The UTM formats the information it parses from the link into the appropriate web document request that will result in the web server adding the referral information to the web log. The UTM also reads the client's first-party cookie, updating user tracking information as required. For example, if this is the user's first visit to your site, the UTM will add the campaign tracking information to the cookie. If the user previously found and visited your site, the UTM increments the session counter in the cookie. Regardless of how many sessions or how much time has passed, the UTM "remembers" the original referral. This gives the Campaign Tracking Module true multi-session tracking capability.
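The cookie behavior described above amounts to a small state update: store the campaign on the first visit, then only count sessions on later visits. The Python sketch below models that logic with a plain dictionary. The dictionary layout and the update_visitor_state helper are assumptions for illustration and do not reflect the actual cookie format.

```python
def update_visitor_state(cookie, campaign):
    """Return the new visitor state; cookie is None on the first visit."""
    if cookie is None:
        # First visit: record the referral that brought the visitor in.
        return {"campaign": dict(campaign), "sessions": 1}
    # Returning visitor: keep the original referral, count the new session.
    return {"campaign": cookie["campaign"], "sessions": cookie["sessions"] + 1}


first = update_visitor_state(None, {"utm_source": "google", "utm_medium": "cpc"})
second = update_visitor_state(first, {"utm_source": "newsletter1", "utm_medium": "email"})
print(second)
```

After the second call the state still carries the Google referral, with a session count of 2.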

Step 4: Adding Goal and CPC Data

For the purposes of campaign tracking and ROI calculation, the Urchin Database receives

♦ a conversion goal via the Urchin Admin interface (optional),
♦ search engine cost-per-click data (optional), and
♦ data from the web log.

Once a page in your web site has been defined as a conversion goal, the Urchin Campaign Tracking Module will be able to calculate metrics indicating how successful your site is at converting visitors. By comparing referrals, sessions, and visitor activity to conversions, the Urchin Campaign Tracking Module can report on the effectiveness of your keywords, mediums, campaigns, and content. The system can also report latency metrics such as time to goal and sessions to goal. To learn how to define a conversion goal, read Step 3: Define a Conversion Goal.

The Campaign Tracking Module allows you to import your cost-per-click data directly from your Google and Overture spending accounts. This allows the system to report ROI at all levels of granularity, from per-keyword/per-search engine to per-campaign aggregates. To learn how to import spending data from Google, read Import Cost Data from Google. To learn how to import spending data from Overture, read Import Cost Data from Overture.

Updates from the web log to the Urchin Database occur according to the schedule that you establish for your profile, as part of your Urchin base product configuration.

Step 5: Closing the Loop: Reporting and ROI

Once the Urchin database has been updated with visitor activity, a conversion goal, and cost-per-click data, the Urchin Reporting Engine is able to create over fifty campaign tracking reports. Among these reports is the following report excerpt, which compares ROI for the keyword "analytics system architecture" for each search engine (both "cpc", cost-per-click, and "organic") on which visitors searched for the keyword.

In this case, the keyword "analytics system architecture" was purchased on Google AdWords (google[cpc]). Visitors clicked on the sponsored link 102 times, for a total cost of $6.34 to the advertiser. A revenue amount of $89.15 resulted from these clicks, for an ROI of 1306.15%. The average value of each click indicates that the advertiser should bid a maximum of 88 cents per click on this keyword. There was also one click on an organic (unpaid) search link, but it did not result in any revenue.
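The figures in this example can be checked with a short calculation. The guide does not state the formula Urchin uses, so the Python sketch below assumes ROI = (revenue - cost) / cost, which reproduces the 1306.15 figure. The function names are hypothetical.

```python
def roi_percent(revenue, cost):
    """Return on investment as a percentage, assuming (revenue - cost) / cost."""
    return (revenue - cost) / cost * 100


def value_per_click(revenue, clicks):
    """Average revenue produced by one click."""
    return revenue / clicks


print(round(roi_percent(89.15, 6.34), 2))     # 1306.15
print(round(value_per_click(89.15, 102), 3))  # 0.874
```

The average value of about 87.4 cents per click is close to the 88-cent maximum bid quoted above.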

The Next Step

With visitor tracking and referral link parsing by the UTM and E-commerce revenue and cost-per-click data import, the Campaign Tracking Module can accurately correlate conversions to specific campaigns and keywords, provide side-by-side comparisons of paid versus unpaid keywords, and calculate ROI and conversion ratios for keyword buys. To begin realizing these benefits, read Step 1: Track Campaign Data.

The Five Dimensions of Campaign Tracking

Effective campaign tracking uses a combination of the following five marketing dimensions:

♦ Source
♦ Medium
♦ Term
♦ Content
♦ Campaign

This article describes how these marketing dimensions are used in the Urchin Campaign Tracking Module to track campaign referrals.

Source
Every referral to a web site has an origin, or source. Examples of sources are the Google search engine, the AOL search engine, the name of a newsletter, or the name of a referring web site.

Medium
The medium helps to qualify the source; together, the source and medium provide specific information about the origin of a referral. For example, in the case of a Google search engine source, the medium might be "cost-per-click", indicating a sponsored link for which the advertiser paid, or "organic", indicating a link in the unpaid search engine results. In the case of a newsletter source, examples of medium include "email" and "print".

Term
The term or keyword is the word or phrase that a user types into a search engine.

Content
The content dimension describes the version of an advertisement on which a visitor clicked. It is used in content-targeted advertising and Content (A/B) Testing to determine which version of an advertisement is most effective at attracting profitable leads.

Campaign
The campaign dimension differentiates product promotions such as "Spring Ski Sale" or slogan campaigns such as "Get Fit For Summer".

To Learn More

To learn how to use Urchin Campaign Tracking Management software to track your referrals along the five dimensions of campaign tracking, read Step 1: Track Campaign Data.

Step 1: Track Campaign Data (Set up UTM−3)

In order to track campaign data, you need to:

♦ copy the UTM files to your web site document root,
♦ reference the UTM in your HTML and enable cookies in your logging,
♦ pass the UTM variables in your links.

Copy the UTM files to your web site document root

Copy the files __utm.js and __utm.gif from the util/utm directory of your Urchin distribution to your web site document root.

Important: Do not change the names of these files.

Reference the UTM in your HTML and enable cookies in your logging

In the Visitor Tracking section of the Urchin documentation, find the Quick-Install article that applies to your environment. Follow the instructions in Step 2 and Step 3 of the article to reference the UTM in your HTML and enable cookies in your logging.

Pass the UTM variables in your links

The UTM variables provide a way of tracking your referrals along the five dimensions of campaign tracking by attaching campaign data to your links. The UTM parses this campaign data to determine the referral source, the keywords used, and other campaign tracking information.

To pass the UTM variables in a link, add a question mark (?) to the URL followed by the variables and values you would like to assign, separating multiple variables with an ampersand (&). Values may be any string containing letters, numbers, underscore (_), and plus (+). Use underscores to separate multiple words in a value (e.g. utm_campaign=think_different); your URL may not contain spaces.

Urchin provides a URL builder tool that creates links for you, embedding the campaign information that you specify. Using this tool ensures that your links contain the correct syntax.

The following link indicates that the visitor was referred by a paid Google link and that the visitor had searched on the keywords "running shoes". It also indicates that the medium was cost-per-click.

Example

http://www.mycompany.com/?utm_source=google&utm_medium=cpc&utm_term=running+shoes
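A link of this kind can also be assembled programmatically. The Python sketch below is not the Urchin URL builder tool; tagged_link is a hypothetical helper that joins the variables with "&" and encodes spaces as "+".

```python
from urllib.parse import urlencode


def tagged_link(base_url, **variables):
    """Append UTM variables to a URL; utm_source is required."""
    if "utm_source" not in variables:
        raise ValueError("utm_source is required")
    return base_url + "?" + urlencode(variables)


link = tagged_link(
    "http://www.mycompany.com/",
    utm_source="google",
    utm_medium="cpc",
    utm_term="running shoes",
)
print(link)
```

The sketch assumes the base URL has no query string of its own; a URL that already contains a question mark would need "&" instead.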

How you use the variables in your links will depend upon your campaign tracking objectives.

♦ For recommendations on how to use the UTM variables for Search Engine Marketing, read How To Analyze Keyword Buying.
♦ For recommendations on how to use the UTM variables for A/B testing, read How To Perform A/B Testing.
♦ For recommendations on how to use the UTM variables for content-targeted advertising, read How To Track Content-Targeted Ads.

Learn how to use each variable from the table below.

utm_source
   Description: Required. Use utm_source to identify a search engine or other source.
   Example: utm_source=google

utm_medium
   Description: Recommended. Use utm_medium to identify a medium such as email, cost-per-click (cpc), or cpc-content.
   Example: utm_medium=cpc

utm_term
   Description: Required for keyword analysis using Urchin 5.5/UTM-3. (Urchin 5.6/UTM-4 and later versions automatically detect keywords from cost-per-click referrals.) Use utm_term to identify the keywords that the visitor searched on to get your link. If you specify a utm_term with Urchin 5.6/UTM-4 and higher, your specified term overrides the detected term.
   Example: utm_term=running+shoes

utm_content
   Description: Required for content-targeted advertising and A/B testing. Use utm_content to differentiate ads or links that point to the same URL.
   Example: utm_content=logolink

utm_campaign
   Description: Required for keyword analysis. Use utm_campaign to identify a specific product promotion or strategic campaign.
   Example: utm_campaign=spring_sale

Step 2: Install and License Campaign Tracking

You must have an Urchin base product license and a Campaign Tracking Module license in order to use the Campaign Tracking Module. If you wish to perform ROI calculations, you must also have an E-commerce Module License.

If you have not yet installed the Urchin base product, read the Getting Started-->System Requirements-->Urchin Setup Requirements article and the Getting Started-->Installation section now.

If you are already using the Urchin base product, follow these steps to obtain a Campaign Tracking Module License:

1. Sign into Urchin as Administrator.
2. In the Urchin Admin Interface, click Configuration.
3. Click Settings-->License.
   The License Information screen appears with a link entitled Upgrade License.
4. Click Upgrade License.
   The Upgrade License wizard appears.
5. Follow the steps as indicated in the Wizard to purchase and install the license.

Step 3: Define a Conversion Goal

A goal is a web site page which a visitor reaches once she or he has made a purchase or performed some other desired action, such as a download or user registration. Before Urchin can calculate goal conversion metrics, you must define one or more goals within your campaign profile.

What is my campaign profile?

Your campaign profile is the profile from which you intend to run campaign reports.

♦ If you have never used Urchin before, you will first need to create a basic profile. Follow the instructions in Urchin Administration-->Profiles-->Working with Profiles, then follow the instructions in this article.
♦ If you have already defined a profile, follow the instructions in this article to enable the profile for campaign tracking and create a conversion goal.

Enable a profile for campaign tracking and create a conversion goal

1. Sign on as Administrator and, in the Admin Interface, click Configuration.
2. Click Urchin Profiles-->Profiles.
3. Click the Edit key next to the profile you wish to edit.
4. In the Profile Settings tab, click the Campaign Website radio button and click Update.
5. On the Profile Filters tab, make sure that the following filters are applied:

   ◊ Decode UTM Campaign Content
   ◊ Decode UTM Campaign Name
   ◊ Decode UTM Campaign Source (Medium)
   ◊ Decode UTM Campaign Source (Medium) Term
   ◊ Decode UTM Campaign Term

   If these filters are not applied, click Add. The Filter Wizard appears. Select the Pre-Configured Filter radio button and press Next. Select the filters listed above in the Available Filters area, move them to the Applied Filters area, and click Finish. On the Profile Filters tab, click Update.

6. In the Reporting tab, add a Primary Goal Match, a Primary Goal Field, and click Update.

   Urchin logs a goal completion each time that the Primary Goal Field matches the value specified in Primary Goal Match. Any POSIX regular expression may be entered in the Primary Goal Match field.

   For example, if you select "request_stem" from the drop-down menu as the Primary Goal Field and enter "/downloads" as the Primary Goal Match, Urchin logs a goal completion each time the request_stem (i.e. request URI without query information) has a value of "/downloads".

   The field "request_stem" is the most common Primary Goal Field used; however, other fields may be used as well. For a complete description of fields, read Reference-->Regular Field List.

Setting Multiple Goals

It is possible to set multiple goals in the Primary Goal Match field. For example, entering the following in the Primary Goal Match field and "request_stem" in the Primary Goal Field:

/((forms/(downloadarea|registerarea)_confirmation)|(special/profile_form))\.asp

will tag the following pages as goals:

♦ /forms/downloadarea_confirmation.asp
♦ /forms/registerarea_confirmation.asp
♦ /special/profile_form.asp

You may enter any POSIX regular expression, up to 255 characters in length, in the Primary Goal Match field.
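Before entering an expression in the Primary Goal Match field, it can be useful to try it against a few request stems. The Python sketch below does this for the multiple-goal example, written with balanced parentheses. Python's re module is not a POSIX regular expression engine, but this particular pattern uses only syntax the two have in common; is_goal is a hypothetical helper, and the unanchored search is an assumption about how the match is applied.

```python
import re

GOAL_MATCH = r"/((forms/(downloadarea|registerarea)_confirmation)|(special/profile_form))\.asp"


def is_goal(request_stem):
    """Return True if the request stem matches the goal expression."""
    return re.search(GOAL_MATCH, request_stem) is not None


for stem in (
    "/forms/downloadarea_confirmation.asp",
    "/forms/registerarea_confirmation.asp",
    "/special/profile_form.asp",
    "/forms/contact.asp",
):
    print(stem, is_goal(stem))
```

The first three stems match and the last one does not.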

Tagging Your Online Links 1−2−3

If you are using the Urchin Profit Suite or the Urchin Campaign Tracking Module, you'll want to make sure that you've got a comprehensive strategy for tagging your online ads. This is an important prerequisite to allowing Urchin to show you which marketing activities are really paying off. Fortunately, the tagging process goes smoothly once you understand how to differentiate your campaigns. Here is a three-step process to help you get started.

1. Tag only what you need to.

Generally speaking, you need to tag all of your paid keyword links (such as those on Google AdWords and Overture), your banners and other ads, and the links inside your promotional e-mail messages.

There are certain links that you don't need to tag, and many times will not be able to tag. You should not attempt to tag organic (unpaid) keyword links from search engines, and it isn't necessary to tag links that come from referral sites, such as portals and affiliate sites. Urchin automatically detects the search engine and keyword from organic (unpaid) keyword referrals, and you'll see metrics for these referrals in your Urchin reports, typically under "Organic" listings. Urchin also detects referrals from other websites and displays them in your reports, whether or not you have tagged them.

2. Create your links using the URL Builder.

Campaign links consist of a URL address followed by a question mark and your campaign variables. But you won't need to worry about link syntax if you fill out the URL Builder form and press the Generate URL button. A tagged link will be generated for you and you'll be able to copy and paste it to your ad. If you are asking "which fields should I fill in?", you're ready for Step 3.

3. Use only the campaign variables you need.

Urchin's link tagging capabilities allow you to uniquely identify virtually any campaign you can think of. But don't think that you must use all six fields in the URL Builder form in each of your links. On the contrary, you should usually only need to use three: Source, Medium, and Campaign. Let's look at the best ways to tag the three most common kinds of online campaigns: banner ads, email campaigns, and paid keywords.

                    Banner Ad     E-mail Campaign   Pay Per Click Keywords
Campaign Source     citysearch    newsletter1       google
Campaign Medium     banner        email             ppc
Campaign Term
Campaign Content
Campaign Name       productxyz    productxyz        productxyz

You'll notice that Campaign Term isn't used for any of these links, even the Pay Per Click Keyword campaign. That's because Term is no longer necessary as long as you are using Urchin 5.6/UTM-4 and later. (Campaign Term IS necessary if you are still using Urchin 5.5 and UTM-3.)

What about Campaign Content? We're only covering the most common scenarios in this article, but if you are interested in Campaign Content, read the article How To Perform A/B Testing.

What about the Campaign ID/Master Tracking Code? If you want to hide the tagging information that you put in your links, Urchin gives you a way of creating a table that keeps all the information private. To read more about this, see the article How To Use Master Tracking Codes.

So get started tagging your links and tracking your way to online success!

Import Cost Data from Google

Importing cost data from Google is easy. Just perform the following steps:

Download your Google AdWords spending into a log file.
Download your spending data on a daily or weekly basis, prior to the regularly scheduled run of your profile (or before manually running the profile).

Modify your profile to read the log file.
You will only need to modify your profile once, as part of your initial setup.

Download Google AdWords spending into a log file

1. On adwords.google.com, log in to your Google AdWords account.
2. In the Reports tab, click Custom Report.
3. Fill out the URL Report fields and click Create Report.

   ◊ View - check the Daily Metrics radio button.
   ◊ Date Range -

      If this is the first time you are downloading data for campaign tracking, enter a date range beginning with the date you started tracking campaign data and ending with yesterday's date. (The date you started tracking campaign data is the date you completed implementing the instructions in the article Campaign Tracking Module-->Step 1: Track Campaign Data.)

      If you have already downloaded historical data, enter a date range beginning with the day after your previous download end-date, and ending with yesterday's date.

      For daily downloads (recommended), enter a date range beginning with yesterday's date and ending with yesterday's date.

   ◊ Detail Level - click Show options.
      ⋅ Check Keyword names and Include all keywords.
      ⋅ Check Campaign names and select All Campaigns.
      ⋅ Check Ad Group names and select All Ad Groups.
   ◊ Values -

      Ad Text - Check Destination URL.

      Conversions - (Optional. Check the following if you have enabled conversion tracking on your Google AdWords account. If you do not have conversion tracking enabled, skip this step.)

   ◊ Report name - Enter a report name.
   ◊ Scheduling - Checkmarking this box means that you will not have to define this report again. The report format will be saved and will be automatically run each day, week, or month (according to your selection).
   ◊ Email - Checkmarking this box tells Google to email you when the report has run.

4. Click the "Create Report" button.
5. Once you have manually run the report, or once it has been automatically scheduled and run by Google, the report will appear in the Download Center. (Click the Download Center link at the top of the page.) Select the report for the day or week you want and download it as a .tsv file.
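The three date range rules in step 3 can be summarized in a few lines of Python. This is only a sketch of the rules as described; download_range and its parameter names are hypothetical.

```python
from datetime import date, timedelta


def download_range(today, tracking_start=None, previous_end=None):
    """Return (start, end) for the next spending download; end is always yesterday."""
    yesterday = today - timedelta(days=1)
    if previous_end is not None:
        # Historical data already downloaded: continue the day after the last end date.
        start = previous_end + timedelta(days=1)
    elif tracking_start is not None:
        # First download: begin on the date campaign tracking started.
        start = tracking_start
    else:
        # Plain daily download: yesterday only.
        start = yesterday
    return start, yesterday


print(download_range(date(2005, 1, 25), tracking_start=date(2005, 1, 10)))
print(download_range(date(2005, 1, 25), previous_end=date(2005, 1, 20)))
print(download_range(date(2005, 1, 25)))
```

With daily downloads the second rule and the third rule give the same result, since the previous download ended the day before yesterday.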

Modify your profile to read the log file

You will only need to modify your profile once, as part of your initial setup.

1. In the Urchin Admin interface, click Configuration, and then Urchin Profiles-->Profiles.
2. Click the Edit icon for your campaign profile.
   Your campaign profile is the profile that you configured as part of Step 2: Configure Urchin-->Define a Conversion Goal.
3. On the Profile Settings tab, make sure that Profile Type is Campaign with E-Commerce Website.
4. On the Reporting tab, under Campaign Options, make sure that Primary Goal Match and a Primary Goal Field are filled in. If they are not, read Step 2: Configure Urchin-->Define a Conversion Goal.
5. On the Profile Filters tab, make sure that the following filters are applied:

   ◊ Decode UTM Campaign Content
   ◊ Decode UTM Campaign Name
   ◊ Decode UTM Campaign Source (Medium)
   ◊ Decode UTM Campaign Source (Medium) Term
   ◊ Decode UTM Campaign Term

   If these filters are not applied, click Add. The Filter Wizard appears. Select the Pre-Configured Filter radio button and press Next. Select the filters listed above in the Available Filters area, move them to the Applied Filters area, and click Finish. On the Profile Filters tab, click Update.

6. On the Log Sources tab, click Add.
   The Log Source Wizard appears.
7. Select Pre-Configured Log Source and click Next.
8. In the Available Log Sources area, select a log source that contains your Google AdWords spending data, move it to the Log Sources to Process area, and click Finish.
9. On the Log Sources tab, click Update.

Import Cost Data from Overture

Importing cost data from Overture is easy. Just perform the following steps:

Download your Overture spending into a log file.
Download your spending data on a daily or weekly basis, prior to the regularly scheduled run of your profile (or before manually running the profile).

Modify your profile to read the log file.
You will only need to modify your profile once, as part of your initial setup.

Download Overture spending into a log file

1. On www.overture.com, click Advertiser Login and log in to your Overture account.

2. Click the Reports tab on the top of the page.

3. In the Select a Report Type dropdown menu, select Account Activity Detail (Match Type).

4. Specify a filter, date range, and click Create Report.
   ◊ Select the Overture Results filter.
   ◊ Enter a date range.

     If this is the first time you are downloading data for campaign tracking, enter a date range beginning with the date you started tracking campaign data and ending with yesterday's date. (The date you started tracking campaign data is the date you completed implementing the instructions in the article Campaign Tracking Module-->Step 1: Track Campaign Data.)

     If you have already downloaded historical data, enter a date range beginning with the day after your previous download end-date, and ending with yesterday's date.

     For daily downloads (recommended), enter a date range beginning with yesterday's date and ending with yesterday's date.


5. When the report appears in your browser, scroll to the bottom of the page and click Download as Spreadsheet.

6. Save the file to your logsource directory.

7. Convert the report file to the UTF-8 format, using one of the two methods below.
   ◊ Open the file in Excel, choose File-->Save As, and choose tab-delimited format.

   or

   ◊ In the util directory of your Urchin distribution, execute the following script:
     iconv -f UTF-16 -t UTF-8 filename > newlogsourcename
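If iconv is not convenient on your system, the same UTF-16 to UTF-8 conversion can be sketched in a few lines of Python. This is only an illustration and not part of the Urchin distribution; the function name and file names are hypothetical.

```python
from pathlib import Path


def convert_utf16_to_utf8(src_path, dst_path):
    """Re-encode a UTF-16 report file as UTF-8 without changing its content."""
    # Work on bytes so that line endings are preserved exactly; the "utf-16"
    # codec uses the byte-order mark at the start of the file to pick the
    # byte order and drops the mark from the decoded text.
    text = Path(src_path).read_bytes().decode("utf-16")
    Path(dst_path).write_bytes(text.encode("utf-8"))
```

For example, convert_utf16_to_utf8("filename", "newlogsourcename") mirrors the placeholder names used in the iconv command.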

Modify your profile to read the log file

You will only need to modify your profile once, as part of your initial setup.

1. In the Urchin Admin interface, click Configuration, and then Urchin Profiles-->Profiles.

2. Click the Edit icon for your campaign profile. Your campaign profile is the profile that you configured as part of Step 2: Configure Urchin-->Define a Conversion Goal.

3. On the Profile Settings tab, make sure that Profile Type is Campaign with E-Commerce Website.

4. On the Reporting tab, under Campaign Options, make sure that Primary Goal Match and a Primary Goal Field are filled in. If they are not, read Step 2: Configure Urchin-->Define a Conversion Goal.

5. On the Profile Filters tab, make sure that the following filters are applied:
   ◊ Decode UTM Campaign Content
   ◊ Decode UTM Campaign Name
   ◊ Decode UTM Campaign Source (Medium)
   ◊ Decode UTM Campaign Source (Medium) Term
   ◊ Decode UTM Campaign Term

   If these filters are not applied, click Add. The Filter Wizard appears. Select the Pre-Configured Filter radio button and press Next. Select the filters listed above in the Available Filters area, move them to the Applied Filters area, and click Finish. On the Profile Filters tab, click Update.

6. On the Log Sources tab, click Add. The Log Source Wizard appears.

7. Select Pre-Configured Log Source and click Next.

8. In the Available Log Sources area, select a log source that contains your Overture spending data, move it to the Log Sources to Process area, and click Finish.

9. On the Log Sources tab, click Update.


Adding Cost and Impression Data

The Urchin Campaign Tracking Module (beginning with version 5.6) allows you to add fixed advertising costs and impression data to campaigns. If, for example, you have a cost associated with search engine optimization, website development, or an email campaign, you can enter this cost and see it reflected in Urchin reports, including campaign ROI calculations.

The cost and impression data you enter is aggregated for the date range you specify when viewing reports. For example, if you enter 10,000 impressions for a campaign for January 1 and 5,000 impressions for February 1, Urchin will report 15,000 impressions for the reporting date range of January through February. You may also enter negative numbers, thereby adjusting cost and impression data. Using the same example, if you enter 10,000 impressions for January 1 and -5,000 impressions for February 1, Urchin will report 5,000 impressions for January through February.
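The aggregation described above amounts to summing every entry whose date falls inside the reporting range. The following Python sketch reproduces the two examples; it is not Urchin code, and the function name, the data layout, and the year are assumptions made for illustration.

```python
from datetime import date


def total_for_range(entries, start, end):
    """Sum the values of all entries dated within [start, end], inclusive."""
    return sum(value for day, value in entries if start <= day <= end)


# Hypothetical impression entries; the year is arbitrary.
entries = [(date(2005, 1, 1), 10000), (date(2005, 2, 1), 5000)]
adjusted = [(date(2005, 1, 1), 10000), (date(2005, 2, 1), -5000)]

january_to_february = (date(2005, 1, 1), date(2005, 2, 28))
print(total_for_range(entries, *january_to_february))   # 15000
print(total_for_range(adjusted, *january_to_february))  # 5000
```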

How To Add Cost and Impression Data

1. In the Admin interface, click Configuration.

2. Edit the profile to which you wish to add data.

3. In the Storage/DB tab, click the Add Cost Data button. The Add CTM Entry Wizard appears.

4. Enter the date as of which the cost and/or number of impressions should apply.

5. Enter the CTM variable(s) that describe the campaign for which you are entering data. For example, to apply the data towards all organic Google referrals, specify the Source as google and the Medium as organic. To apply the data towards the summer newsletter (and assuming that you tag summer newsletter referrals with utm_source=summer_news), specify the Source as summer_news.

6. Enter the cost amount and/or number of impressions that you want to associate with this campaign.

7. Click Add to Next Run. Urchin adds the cost/impression data to the Urchin database the next time that this profile is run.

Example: How To Add Non−Search Engine Specific SEO Costs

If you wish to enter a cost that applies to all organic search (i.e., the cost is not specific to Google, Yahoo, etc.), enter "-" for Source and "organic" for Medium.


How To Analyze Keyword Buying

How does keyword buying analysis help me?

Which keywords should I invest in? How much should I bid for a keyword? How much do I make on keywords? At which times of the day should I maximize my search engine exposure? How can I identify click fraud?

You can answer these and other questions by analyzing your keyword buying with the Urchin Campaign Tracking Module. This article provides a walkthrough of each step, from collecting the data to analyzing the reports.

What are the steps to analyze keyword buying?

♦ License and Install Urchin
  You will need to purchase and install the Urchin base product and the Campaign Tracking Module. If you need keyword ROI metrics, you should license the Profit Suite, which includes the Urchin base product, the Campaign Tracking Module, and the E-Commerce Module. Read Step 2: Install and License Campaign Tracking for more information.

♦ Define a Conversion Goal
  Read Step 3: Define a Conversion Goal to learn how to specify a goal for your site.

♦ Purchase Your Keywords
  For each purchased keyword from a pay-per-click search engine (such as Google or Overture), you will need to set up a referral link to your site and embed UTM variables. This article describes the best way to use the UTM variables for keyword buying analysis, below.

♦ Track Campaign Data
  You will need to install the UTM, enable cookie logging, and embed UTM variables in your referral links. The article Step 1: Track Campaign Data contains information on how to do all three of these things, and additional information on how to use the UTM variables for keyword buying analysis is provided in this article, below.

♦ Import Keyword Spending Data (only necessary for ROI reporting)
  Read Import Cost Data from Google and/or Import Cost Data from Overture.

♦ Import E-commerce Data (only necessary for e-commerce ROI analysis reports)

♦ Optimize Your Keyword Buys
  Information on how to use the keyword reports to optimize your keyword buying is provided in this article, below.

Which UTM variables should I use for keyword analysis?

If you are using UTM-4 (Urchin 5.6) and later versions

For paid search engine links, such as Google AdWords, use utm_source, utm_medium, and utm_campaign.

If you are using broad matching, and you wish to see metrics for the broad matched keyword (rather than letting UTM detect the specific keyword), you may wish to use utm_term.

The following link is an example of how you would use the UTM variables in a Google AdWords link. It indicates that the referral came from a paid Google search term and that the medium was cost-per-click. It also indicates that the visitor clicked on your adidas promotion link. The UTM-4 automatically detects the keywords used to find your site.

Example

http://www.mycompany.com/?utm_source=google&utm_medium=cpc&utm_campaign=adidas

Urchin provides a URL builder tool that creates links for you, embedding the campaign information that you specify. Using this tool ensures that your links contain the correct syntax.
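If you generate many links from a script, the tagging can also be sketched in Python. The example below is not the Urchin URL builder; it is a minimal illustration using the standard library, and the function name tag_url is hypothetical. It appends the UTM variables described in this article to a destination URL and encodes spaces in keywords as "+".

```python
from urllib.parse import urlencode


def tag_url(base_url, source, medium, term=None, content=None, campaign=None):
    """Append UTM campaign variables to a destination URL."""
    params = [("utm_source", source), ("utm_medium", medium)]
    optional = (("utm_term", term), ("utm_content", content), ("utm_campaign", campaign))
    for name, value in optional:
        if value:  # skip variables that were not supplied
            params.append((name, value))
    # Use "&" if the destination URL already carries a query string.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


print(tag_url("http://www.mycompany.com/", "google", "cpc", campaign="adidas"))
```

Passing term="running shoes" adds utm_term=running+shoes to the link, which is the form needed for UTM-3 tracked sites.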

If you are using UTM-3 (Urchin 5.5 and 5.501)

For paid search engine links, such as Google AdWords, use utm_source, utm_medium, utm_term, and utm_campaign.

The following link is an example of how you would use the UTM-3 variables in a Google AdWords link. It indicates that the referral came from a paid Google search term, that the medium was cost-per-click, and that the visitor had searched on the keywords "running shoes". It also indicates that the visitor clicked on your adidas promotion link.

Example

http://www.mycompany.com/?utm_source=google&utm_medium=cpc&utm_term=running+shoes&utm_campaign=adidas


Urchin provides a URL builder tool that creates links for you, embedding the campaign information that you specify. Using this tool ensures that your links contain the correct syntax.

What about un−sponsored links? (aka unpaid, free, or organic listings in search engines)

You only embed variables in sponsored links: links for which you paid on a search engine, or links over which you otherwise have control, such as links in an email that you send to customers.

You don’t have to worry about un-sponsored links because Urchin Campaign Tracking automatically determines which search engine the referral came from and which keywords the visitor used.

Be consistent with UTM variables.

It is important that you use consistent names and spellings for all of your campaign variable values. For example, choose a code or name that indicates “cost-per-click” and use it consistently. To Urchin Campaign Tracking, utm_medium=cpc and utm_medium=cost_per_click are different mediums.

Beginning with Urchin 5.6, a master tracking code feature is available that significantly reduces the possibility of consistency errors. Read How To Use Master Tracking Codes.
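One way to catch inconsistent spellings before a campaign goes live is to group your tagged links by their utm_medium value and review the list of distinct values. The Python sketch below is an illustration only, not an Urchin feature; the function name and the sample links are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlparse


def links_by_medium(links):
    """Group tagged links by utm_medium so spelling variants stand out."""
    groups = defaultdict(list)
    for link in links:
        query = parse_qs(urlparse(link).query)
        for medium in query.get("utm_medium", ["(missing)"]):
            groups[medium].append(link)
    return dict(groups)


links = [
    "http://www.mycompany.com/?utm_source=google&utm_medium=cpc",
    "http://www.mycompany.com/?utm_source=overture&utm_medium=cost_per_click",
]
for medium, tagged in sorted(links_by_medium(links).items()):
    print(medium, len(tagged))
```

Two mediums in the output where you expected one indicates that the same idea has been tagged in two different ways.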

Optimize your keyword buys.

♦ Which keywords should I buy?
♦ How much should I pay for a keyword?
♦ How much do I make on a keyword?
♦ At which times of the day should I maximize keyword exposure?
♦ How can I identify click-fraud?

Which keywords should I buy?

You should buy keywords that return the highest number of transactions and/or goals, or yield the highest revenue. Begin by looking at the Keyword Comparison-->Conversion report. Which keywords deliver the highest goal conversion and/or sales conversion rates? In this example, the highest goal conversion and sales conversion rates (1.8% and 1.75%, respectively) come from the third item. (Note: Sales conversion rate appears only if you have licensed the E-commerce module.)


Next, look at the Keyword Analysis-->Conversion report and drill down to see keyword by keyword detail inside of each of your organic search engines. Keywords that deliver high conversion rates on organic search engines are often good keywords to buy.

If you have licensed the E-commerce module, look at the Keyword Comparison-->ROI report and the Keyword Analysis-->ROI report. Which keywords perform the best on each search engine? Again, often organic search engine results can give a good indication of how a keyword will perform as a sponsored link on a particular search engine.

How much should I pay for a keyword?

To answer this question, you will need to have licensed the E-commerce module. Look at the Keyword Comparison-->ROI report and drill down on the keyword you are analyzing. The Avg. Value metric will tell you the average value per click, or total revenue divided by clicks. This is the maximum amount you should bid on the keyword. Note that the average value does not take into account production costs or other business expenses. In the example below, the keyword "analytics system architecture" on the "google [cpc]" (Google cost-per-click) search engine is yielding an average value of 33 cents.
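The Avg. Value calculation is simply total revenue divided by clicks. The Python sketch below shows the arithmetic with made-up figures (the revenue and click counts are assumptions chosen to yield 33 cents, not values taken from an Urchin report).

```python
def average_value_per_click(total_revenue, clicks):
    """Return total revenue divided by clicks (0.0 when there are no clicks)."""
    if clicks == 0:
        return 0.0
    return total_revenue / clicks


# Hypothetical keyword: $66.00 of revenue from 200 paid clicks.
print(round(average_value_per_click(66.00, 200), 2))  # 0.33
```

With these figures, a bid above 33 cents per click would cost more than the keyword brings in, before production costs and other expenses are considered.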

How much do I make on a keyword?

If you have licensed the E-commerce module, look at the Keyword Comparison-->ROI report or the Keyword Analysis-->ROI report. Both of these reports show your Return on Investment for each keyword on each search engine, for all keywords across a single search engine, and for all search engines across a single keyword. In the example below, the cost-per-click ROI for "analytics system architecture" on Google is 763%.

At which times of the day should I maximize a keyword's exposure?

Look at the Day Parts Breakdown-->Goal Conversion by Hour or Sales Conversion by Hour. Drill down on the keyword you are analyzing. The report will display the number of goals or transactions and a conversion rate by hour of the day. The timezone is controlled by your administrator using the Time Offset field on the Reporting tab of Configuration-->Urchin Profiles-->Profiles-->Edit. By default, this is set to "Local Time".

How can I identify click−fraud?

Look at the Click Fraud Watch-->Repeat Clicks by IP report. Drill down on any IP-Visitor ID to view search engines. If any cost-per-click search engines appear with Repeat Clicks, click on those search engines to determine the keyword(s) on which Repeat Clicks occurred. You can ignore any Repeat Clicks on organic search engines since these clicks are harmless. Note that Repeat Clicks do not necessarily indicate hostile activity; Repeat Clicks often occur naturally as a result of visitors going back and forth between the referral and your site. Look for a high number of Repeat Clicks (10 or more) per day over a period of several days on specific paid keywords.

You can click on an IP-Visitor ID in the Repeat Clicks by IP report to obtain information on the click originator. Note that this data may contain information on the ISP and not the individual visitor.
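If you copy the Repeat Clicks figures into a spreadsheet or script, the rule of thumb above (10 or more Repeat Clicks per day, over several days, on a specific paid keyword) can be sketched as follows. This is an illustration only: the row layout and function name are hypothetical, and "several days" is assumed here to mean three or more.

```python
from collections import defaultdict


def flag_repeat_clickers(rows, min_clicks=10, min_days=3):
    """Return (ip, keyword) pairs with heavy repeat clicks on several days.

    rows: iterable of (day, ip, keyword, repeat_clicks) tuples taken from
    paid (cost-per-click) keywords only.
    """
    heavy_days = defaultdict(set)
    for day, ip, keyword, repeat_clicks in rows:
        if repeat_clicks >= min_clicks:
            heavy_days[(ip, keyword)].add(day)
    # Keep only the pairs that crossed the threshold on enough distinct days.
    return sorted(pair for pair, days in heavy_days.items() if len(days) >= min_days)
```

A flagged pair is only a starting point for investigation, since Repeat Clicks can also occur naturally.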

For additional information on click-fraud, visit Alchemist Media, Inc. Click Fraud Guidelines or contact Jessie Stricchiola ([email protected]).

Acquisition report. In the Filter field, type "referral|none|organic" and click the minus button. The report now displays only referrals that were tagged with UTM variables. For example, in the following report, the report displays referrals from three source[medium] combinations.

Drill down on the source[medium] for the content you would like to examine. The report now displays the different versions of content within that source[medium]. The CTR (Click Through Rate) tells you the percentage of impressions (ad displays) that resulted in clicks. The %New field tells you the percentage of clicks that are new leads.
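As a quick check on the CTR figure, the percentage can be reproduced by hand. The Python sketch below uses made-up click and impression counts; it is not Urchin code.

```python
def click_through_rate(clicks, impressions):
    """Return the percentage of impressions that resulted in clicks."""
    if impressions == 0:
        return 0.0
    return 100.0 * clicks / impressions


# Hypothetical ad version: 50 clicks from 2,000 impressions.
print(click_through_rate(50, 2000))  # 2.5
```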

Which versions of my advertisements refer the visitors most interested in my site?

Look at the Content (A/B) Testing-->Quality report. In the Filter field, type "referral|none|organic" and click the minus button. The report now displays only referrals that were tagged with UTM variables.
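The effect of the filter can be pictured as an exclusion match. The Python sketch below assumes that the Filter field treats "|" as "or" in the manner of a regular expression and that the minus button removes matching rows; both points are assumptions based on the behavior described above, and the sample row labels are hypothetical.

```python
import re


def exclude_rows(labels, pattern):
    """Drop every row label that matches the pattern (the 'minus' behavior)."""
    matcher = re.compile(pattern)
    return [label for label in labels if not matcher.search(label)]


rows = [
    "google[organic]",
    "(direct)[(none)]",
    "somesite.com[referral]",
    "news1[email]",
    "google[cpc]",
]
print(exclude_rows(rows, "referral|none|organic"))
```

Only the rows tagged with your own UTM mediums (here, email and cpc) remain.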

Drill down on a source[medium] for the content you would like to examine. The report now displays the different versions of content within that source[medium]. Depth is a measure of interest which tells you the average number of pages on your site that each visitor viewed. Loyalty indicates the average number of times visitors returned to your site.

Which versions of my advertisements refer the visitors most likely to reach a conversion goal on my site?

Look at the Content (A/B) Testing-->Conversion report. In the Filter field, type "referral|none|organic" and click the minus button. The report now displays only referrals that were tagged with UTM variables.

Drill down on a source[medium] for the content you would like to examine. The report now displays the different versions of content within that source[medium]. Goal Conv. (Goal Conversion Rate) is the percentage of referrals that reached a goal on your site. If you have licensed the E-commerce Module, Sales Conv. (Sales Conversion Rate), the percentage of referrals that made a purchase on your site, is also displayed.

Which versions of my advertisements provide the biggest return?

If you have licensed the E-commerce module, you will be able to see the revenue associated with each version of content. Look at the Content (A/B) Testing-->ROI report. In the Filter field, type "referral|none|organic" and click the minus button. The report now displays only referrals that were tagged with UTM variables.

Drill down on a source[medium] for the content you would like to examine. The report now displays the different versions of content within that source[medium]. Revenue is the gross revenue associated with referrals from the content within the source[medium]. For Content Analysis, the Cost column will always be 0 and the ROI column will be equal to Revenue.

How To Track Content−Targeted Ads

If you currently purchase keywords on Google or Overture, you should also consider participating in the Google and Overture content-targeted advertising programs. These programs place your cost-per-click search ads on content sites that are published by Google and Overture partners. For example, if you sell vacation packages to France, your ad might appear in an article on Parisian restaurants. Participation in the Google and Overture programs is free, and you pay the same cost-per-click that you pay for search engine referrals.

To track your content−targeted ad referrals, you will need to:

♦ sign up for content-targeted advertising on your Google and/or Overture cost-per-click account

♦ edit your links to track content-targeted ad referrals

Once you have signed up for content-targeted advertising, each of your cost-per-click ads will have two links associated with it: one link for search referrals and one link for content-targeted referrals. You will need to edit the link used for content-targeted referrals so that you can track search engine referrals and content-targeted referrals separately.

Which UTM variables should I use to track content−targeted ad referrals?

For content-targeted ad referrals, you should use utm_source, utm_medium, and utm_content.

♦ Use utm_source to indicate the search engine.

♦ Use utm_medium to indicate a cost-per-click content-targeted ad. For example, you might use "utm_medium=cpc-content" to differentiate from your search referrals which say "utm_medium=cpc".

♦ Use utm_content to specify which specific ad referred the visitor.

♦ If you have multiple types of products or multiple campaigns, you should also use utm_campaign. For example, if you have a “spring sale” campaign and an “adidas” promotion, you should indicate the appropriate campaign in your link.

The following link illustrates how you would use the UTM variables for a content-targeted ad referral:

Example

http://www.mycompany.com/buy_page?utm_source=google&utm_medium=cpc-content

Urchin provides a URL builder tool that creates links for you, embedding the campaign information that you specify. Using this tool ensures that your links contain the correct syntax.
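To see why the separate utm_medium value matters, consider how a script could tell the two kinds of referral apart from the landing URL alone. The Python sketch below is an illustration, not Urchin's implementation; it assumes you adopted the "cpc" and "cpc-content" values suggested above, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def referral_kind(url):
    """Classify a tagged landing URL by its utm_medium value."""
    mediums = parse_qs(urlparse(url).query).get("utm_medium", [])
    medium = mediums[0] if mediums else ""
    if medium == "cpc-content":
        return "content-targeted"
    if medium == "cpc":
        return "search"
    return "other"


print(referral_kind("http://www.mycompany.com/buy_page?utm_source=google&utm_medium=cpc-content"))
```

If both links carried utm_medium=cpc, the two kinds of referral would be indistinguishable in your reports.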

To sign up for content−targeted ad placements on Google

1. Log in to your AdWords account.

2. In the Campaign Summary table, click the appropriate ad campaign.

3. Click Edit Campaign Settings above the Ad Groups table.

4. At the bottom of the Edit Campaign Settings table, locate the distribution preferences checkboxes, and:

5. Click the checkbox next to content sites in Google’s network to check this option. Your ad will be included on additional content sites in the expanded network. (If you click again to remove the check, your ad will not be included on these sites.)

6. Click Save All Changes at the bottom of the page to finish.

To edit your links to track content targeted referrals from Google

1. Log in to your AdWords account.

2. Navigate to your campaign and Ad Group.

3. Scroll to the bottom of your keyword list to edit your Ad(s).

4. Click Edit on one of your ads (or click Create New Ad).

5. Create a link according to the guidelines described above, in the section "Which UTM variables should I use to track content-targeted ad referrals?"

To sign up for content−targeted ad placements on Overture

1. Log in to your Overture account.

2. Click the Account Set-Up link on the Account tab.

3. Under Content Match Advertising, select On and click Submit.

To edit your links to track content targeted referrals from Overture

1. Log in to your Overture account.

2. Click the Manage Products tab. "Pay-For-Performance Search | Content Match" displays on the top margin of the page.

3. Click Content Match and select Manage Listings from the drop down menu.

4. Click next to the search term you wish to edit and press Edit Listings.

5. Click Modify Listings in the pop up.

6. Create a link according to the guidelines described above, in the section "Which UTM variables should I use to track content-targeted ad referrals?"

How To Track Email Campaigns

The Urchin Campaign Tracking Module (beginning with version 5.6 and UTM-4) allows you to track email campaign impressions, clickthroughs, and conversions. An email impression is registered when the email recipient opens the email message. A clickthrough is registered when the recipient clicks on a link inside the email message. A conversion is registered when the recipient reaches a goal page on your site or completes a purchase.

This article describes how to:

◊ create the email message, and
◊ interpret the email campaign results

Creating the Email Message

You will need to create your email message as described in this section, so that the Urchin Campaign Tracking Module can accurately track impressions (opened emails) and referrals.

1. To track the email impressions, embed the __utm.gif image anywhere in the message as illustrated below.

   Example 1 (Tracking the email impressions using a master tracking code)

   <img src="http://www.mysite.com/__utm.gif?utmt=imp&utmcid=10">

   ◊ Reference the __utm.gif installed on your site.
   ◊ utmt=imp is required.
   ◊ The remaining variables are the ad campaign tracking codes.

   Example 1 illustrates how your reference to __utm.gif should look if you are using master tracking codes. (To learn how to use master tracking codes, read How To Use Master Tracking Codes.) In Example 1, the email impressions will be credited to the master tracking code of 10.

   Example 2 (Tracking the email impressions without a master tracking code)

   <img src="http://www.mysite.com/__utm.gif?utmt=imp&utmcsr=news1&utmcmd=email">

   ◊ Reference the __utm.gif installed on your site.
   ◊ utmt=imp is required.
   ◊ The remaining variables are the ad campaign tracking codes.

   Example 2 illustrates how your reference to __utm.gif should look if you do not use master tracking codes, and therefore explicitly state your campaign tracking variables in the reference. In Example 2, the email impressions would be credited to the source "news1" and the medium "email".

Explanation of Variables

   The reference to __utm.gif includes campaign variables that are different than the campaign variables you use in links. In your reference to __utm.gif, use:

   ⋅ utmccn instead of utm_campaign
   ⋅ utmcsr instead of utm_source
   ⋅ utmcmd instead of utm_medium
   ⋅ utmcct instead of utm_content
   ⋅ utmcid instead of utm_id (for use with master tracking codes)

2. To track the email referrals, create links to your site in the email message. Tag these links using the utm_medium, utm_source, utm_content, utm_campaign, and utm_id campaign variables. Continuing with the example above of an email message which you track using the source "news1" and the medium "email":

   <a href="http://www.mysite.com/?utm_source=news1&utm_medium=email">ad text</a>
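If you assemble your email messages with a script, the variable renaming described under Explanation of Variables can be sketched in Python as follows. This is an illustration under stated assumptions, not Urchin code: the function name is hypothetical, and the mapping is taken from the list above.

```python
from urllib.parse import urlencode

# Link variable -> __utm.gif variable, as listed under "Explanation of Variables".
GIF_VARIABLES = {
    "utm_campaign": "utmccn",
    "utm_source": "utmcsr",
    "utm_medium": "utmcmd",
    "utm_content": "utmcct",
    "utm_id": "utmcid",
}


def impression_image_tag(gif_url, **link_variables):
    """Build the <img> tag that registers an email impression."""
    params = [("utmt", "imp")]  # utmt=imp is required
    for name, value in link_variables.items():
        params.append((GIF_VARIABLES[name], value))
    return '<img src="{}?{}">'.format(gif_url, urlencode(params))


print(impression_image_tag("http://www.mysite.com/__utm.gif",
                           utm_source="news1", utm_medium="email"))
```

Calling the function with utm_id="10" instead produces a reference that uses a master tracking code. Passing a variable name that is not in the mapping raises a KeyError, which is a deliberate choice here so that typing mistakes surface early.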

Analyzing the Results

To see the number of clicks, impressions, and the clickthrough rate for email campaigns, look at the Medium Comparison-->Acquisition report under Campaign Tracking. For example, if you have tagged your links with utm_medium=email, this report will show all your email activity summarized under "email". You can drill down on "email" to view each individual newsletter or mailing (source).

If you are conducting A/B testing in conjunction with an email campaign, look at the Content (A/B) Testing-->Acquisition report. For information on A/B Testing, read How To Perform A/B Testing.


How To Use Master Tracking Codes

The Urchin Campaign Tracking Module (beginning with version 5.6 and UTM-4) allows you to tag your links using master tracking codes (a utm_id) instead of individual variables. You simply use utm_id in your links, and define the meaning of each utm_id in a table. For example, instead of

http://www.hostsite.com/?utm_source=overture&utm_medium=cpc&utm_campaign=springpromo

you can use the UTM−4 variable utm_id as follows.

http://www.hostsite.com/?utm_id=2

Using utm_id hides your campaign tracking variables from web surfers and makes your tagging process less error-prone, since campaign tracking variable values are specified in a table, where corrections and changes are easily made.

To use master tracking codes, you

◊ Define your codes in a table
◊ Apply the table as a filter to your profile
◊ Use your master tracking codes in your links

Defining Your Codes

To define your codes:

1. Create a table in Excel that maps your codes to a set of campaign variables. An example is shown below.

   The first row of the file must begin with "#Fields:", followed by UTM variable names in any order.

   Each row defines the campaign variable settings for the master tracking code you place in the #Fields column. Use a hyphen (-) to indicate no value.

   You may omit any fields from the table for which there are no values for any code. For example, utm_term has been omitted from the spreadsheet, below.

2. Save the Excel table as a tab delimited plain text file in the lib/custom/lookuptables directory of your Urchin distribution. You must save the file with an extension of ".lt".
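If you prefer to generate the lookup table from a script rather than Excel, the following Python sketch writes a tab delimited file along the lines described above. The exact layout is inferred from the description in this section (a first row beginning with "#Fields:", one row per code, and a hyphen for no value), so compare the output against a table that Urchin is known to accept. The function name, file name, and campaign values are hypothetical.

```python
import csv


def write_lookup_table(path, fields, codes):
    """Write master tracking codes to a tab delimited lookup table file.

    fields: UTM variable names to include as columns.
    codes: mapping of master tracking code -> {variable name: value}.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["#Fields:"] + list(fields))
        for code, values in codes.items():
            # A hyphen marks a variable that has no value for this code.
            writer.writerow([code] + [values.get(name) or "-" for name in fields])
```

For example, write_lookup_table("campaigns.lt", ["utm_source", "utm_medium", "utm_campaign"], {"1": {"utm_source": "google", "utm_medium": "cpc"}}) writes a header row and one row for code 1, with a hyphen in the utm_campaign column.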


Applying the Table to Your Profile

To apply the table to a profile:

1. In the Admin tool, click Configuration.

2. Edit the profile to which you wish to apply the master tracking codes.

3. In the Profile Filters tab, click Add.

4. In the Filter Wizard: Options screen, select Add New Filter and click Next.

5. In the Filter Wizard: Settings screen, select Lookup Table. The Table Name field appears in the wizard.

6. From the Table Name drop down list, select the name of the table you created in the section Defining Your Codes, above. If your table does not appear in the drop down list, make sure that the table file name ends with .lt and that it has been saved in the lib/custom/lookuptables directory of your Urchin distribution.

7. From the Filter Field drop down list, select utm_id (AUTO).

8. Click Finish.

Using Your Codes in Your Links

Use the values in the Fields column of your lookup table as utm_id values. For example, using the lookup table created in this article, above, you might tag a link as follows:

http://www.hostsite.com/?utm_id=1

Additional Notes

If you use utm_id in conjunction with other UTM variables in a link (such as utm_source), the values of the other variables will be overwritten with the values in your lookup table. For example, using the lookup table shown in this article, above, the utm_source for the following link would be overwritten with the value of "google":

http://www.hostsite.com/?utm_id=1&utm_source=overture

However, the utm_term in the following link would not be overwritten, since no value for utm_term was provided in the lookup table. (utm_source and utm_medium would be overwritten with values from the table.)

http://www.hostsite.com/?utm_id=1ureutm_medium=cpc
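The overwrite rule can be summarized in a short Python sketch. This models the behavior described in this section and is not Urchin's implementation; the function name and data layout are hypothetical, and treating a hyphen in the table the same as an omitted field is an assumption.

```python
def resolve_campaign(link_variables, lookup_table):
    """Apply lookup table values for the link's utm_id over its other variables."""
    resolved = dict(link_variables)
    row = lookup_table.get(link_variables.get("utm_id"), {})
    for name, value in row.items():
        if value != "-":  # assumption: a hyphen leaves the link's value alone
            resolved[name] = value
    return resolved


table = {"1": {"utm_source": "google", "utm_medium": "cpc"}}
link = {"utm_id": "1", "utm_source": "overture", "utm_term": "shoes", "utm_medium": "email"}
print(resolve_campaign(link, table))
```

Here utm_source and utm_medium take the table values "google" and "cpc", while utm_term keeps the value from the link because the table does not define it.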

URL Builder


Urchin CTM URL Generator

Fill in the form information and click the "Generate URL" button, below. If you are new to tagging links or this is your first time using this tool, read Tagging Your Online Links 1-2-3.

Step 1: Type the URL of your website.

Website URL: *(e.g. http://www.urchin.com/download.html)

Step 2: If you are using a Master Tracking Code, type the code and go to Step 3.

Campaign ID/Master Tracking Code: *(e.g. 1003)

Step 2: Or, if you are not using a Master Tracking Code, fill in the fields below and go to Step 3.

Campaign Source: * (referrer: google, citysearch, newsletter4)

Campaign Medium: * (marketing medium: ppc, banner, email)

Campaign Term: (keywords. Not necessary beginning with Urchin 5.6)

Campaign Content: (use to differentiate ads)

Campaign Name: * (product, promo code, or slogan)

Step 3

Help Information

Campaign ID/Master Tracking Code (utm_id)
  • Available for 5.6/UTM-4 and later versions. If you are using master tracking codes, enter the code with which to tag this link. If you do not enter a Campaign ID, you must enter a Campaign Source and a Campaign Medium.

Campaign Source (utm_source)
  • Required. Use utm_source to identify a search engine, newsletter name, or other source. Example: utm_source=google

Campaign Medium (utm_medium)
  • Required. Use utm_medium to identify a medium such as email or cost-per-click. Example: utm_medium=cpc

Campaign Term (utm_term)
  • Required for keyword analysis on pre-Urchin 5.6 and UTM-3 tracked sites. Use utm_term to identify the keywords that the visitor searched on to get your link. Example: utm_term=running+shoes

Campaign Content (utm_content)
  • Required for A/B testing and content-targeted ads. Use utm_content to differentiate ads or links that point to the same URL. Examples: utm_content=logolink or utm_content=textlink

Campaign Name (utm_campaign)
  • Required for keyword analysis. Use utm_campaign to identify a specific product promotion or strategic campaign. Example: utm_campaign=spring_sale

* Required Fields. You must enter a Campaign ID or enter a Campaign Source and Campaign Medium.
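The required-field rule above can be expressed as a one-line check. The Python sketch below is only an illustration of the rule, not the code behind the URL builder form; the function and parameter names are hypothetical.

```python
def has_required_fields(campaign_id=None, source=None, medium=None):
    """Return True if a Campaign ID, or both a Source and a Medium, were entered."""
    return bool(campaign_id) or (bool(source) and bool(medium))


print(has_required_fields(campaign_id="1003"))              # True
print(has_required_fields(source="google", medium="cpc"))   # True
print(has_required_fields(source="google"))                 # False
```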

Implementation Checklist

Print this checklist and use it to implement campaign tracking.

Task Status / Tasks / help.urchin.com Campaign Tracking Module Articles

[ ] Create a plan to ensure consistent use of campaign tracking variables in referral links
    Articles: The Five Dimensions of Campaign Tracking; Step 1: Track Campaign Data; How To Use Master Tracking Codes

[ ] Tag paid search engine keywords
    Articles: How To Analyze Keyword Buying; Step 1: Track Campaign Data

[ ] Tag Google and Overture content-targeted ads
    Articles: How To Track Content-Targeted Ads

[ ] Tag newsletter and advertising links
    Articles: Step 1: Track Campaign Data; How To Perform A/B Testing; How To Track Email Campaigns

[ ] Determine your license requirements (Profit Suite, Base License+CTM, or Campaign Tracking Module only (if you currently license Urchin 5.0))

[ ] Purchase and download software
    Articles: Step 2: Install and License Campaign Tracking

[ ] Install software
    Articles: Getting Started-->Installation; Getting Started-->Initial Configuration

[ ] Copy the UTM-3 or UTM-4 to web site and reference it in all HTML pages
    Articles: Step 1: Track Campaign Data

[ ] Enable cookie logging on webserver
    Articles: Visitor Tracking-->Quick Install

[ ] Set up a campaign profile with a goal
    Articles: Step 3: Define a Conversion Goal

[ ] Add cost-per-click data log sources to campaign profile for Google and Overture spending data
    Articles: Import Cost Data from Google; Import Cost Data from Overture

[ ] Add e-commerce (ELF-2) log sources to campaign profile for e-commerce revenue data
    Articles: E-commerce Module - ELF & ELF2 Log Formats

[ ] Create a schedule for downloading keyword spending data from Google and Overture
    Articles: Import Cost Data from Google; Import Cost Data from Overture


Chapter 7: Advanced Topics

Utilities

Administration Utilities Overview

Overview

Urchin ships with a number of utility programs that are used for diagnostic and configuration purposes. These utilities are located in the util directory of the Urchin distribution. This document is intended as an introductory overview of these utilities. It is not a comprehensive guide to their usage. Please consult the specific documentation for each utility in the Utilities section of the Advanced Topics area of the Urchin Documentation Center at http://help.urchin.com for detailed information about the capabilities and usage of each of these programs.

All utilities support "-h" and "-v" command line arguments. Invoking a utility with the "-h" argument will give a summary of usage and the available options for that tool. Invoking with the "-v" argument prints the Urchin version of the utility.

geo-update

This utility is used to check for updates to Urchin's internal DNS database files and download the updates if they are available. The utility can also be used to import custom entries into the DNS databases by using the domain.local file or another specified text file.

inspector

This utility performs basic sanity checks on your installed Urchin distribution, ensuring that the overall structure of the distribution is intact, that all the binaries shipped with the product are the proper version, and that the underlying permissions are correct (on UNIX-type platforms). The utility also reports on the operational status of the Urchin Scheduler (urchind) and the bundled Apache web server (urchinwebd).

u3importer

This utility is used to migrate existing Urchin 3 config file information and report databases into Urchin. It runs interactively and prompts the user for the location of the Urchin 3 config file. The process then imports the Urchin 3 data without disturbing the existing Urchin 3 installation. u3importer cannot import all configuration data specified by Urchin 3 config file directives. Some Urchin 3 directives, such as subreport mode, are not supported in Urchin 5 and have no equivalent. Others, such as filters, are organized significantly differently in Urchin 5, so they cannot be imported exactly as they were specified in Urchin 3. The main objective of this tool is to get all your Urchin 3 report blocks and data imported in a basic fashion so that you have Urchin 5 reporting operational for all your sites, and past report data is available.

uconf-export/uconf-import

These utilities use an XML-style text format to represent the contents of the Urchin 5 configuration database in a human readable intermediate form. An Urchin configuration can therefore be exported and saved, or imported to restore the state of an Urchin configuration. Saved configurations can also be modified with an editor, or configuration files can be constructed from scratch, before being imported back into the Urchin configuration database. You can therefore mimic the "config" file functionality that existed with Urchin 3 if desired. It is recommended that you use uconf-export on a regular basis to save your current configuration state as a backup.

uconf-driver

This utility provides a command line interface for administering the Urchin configuration. All functionality present in the Urchin administration interface is available in this utility, so it can completely replace the administration interface for managing all aspects of Urchin. uconf-driver is intended for use in situations where scriptable actions for managing the Urchin configuration are desired. This makes it ideal for environments such as large shared hosting operations, where the amount of data that must be managed makes it impractical to manage Urchin via the web-based administrative interface.

uconf-schedule

This global task scheduling utility allows you to schedule all configured Profiles to run at a certain time, including scheduling them all to run immediately. The Urchin Task Scheduler (urchind) must be running for uconf-schedule to work.

udb-sanitizer

The udb-sanitizer program is used to effect repairs on your Profile databases when there is a problem that leads to database inconsistency or corruption. The Urchin log processing engine routinely does database consistency checks while processing logs. When it detects a database that needs repair it will report the need for udb-sanitizer to be run. In addition to database repair, this utility allows removal of a single day or a month's worth of data in the event that webserver logs need to be reprocessed.

urchinctl

(Note: this utility lives in the bin directory, not the util directory.)

This utility provides a means of starting and stopping the Urchin Scheduler and Urchin Webserver services. On UNIX-type systems, urchinctl is typically called from one of the system's boot-time scripts to automatically start up or shut down Urchin services.

urchin_daemons

On UNIX-type machines Urchin runs a scheduler daemon (urchind) and an Apache webserver (urchinwebd). These programs should be started at boot time. The urchin_daemons script can be added to the system initialization scripts in the location appropriate for your UNIX-type OS, and it will cause the Urchin service daemons to be launched properly.

geo-update: DNS Database Update Utility

Overview

The geo-update utility is used to check for updates to Urchin's internal DNS database files and download the updates if they are available. The utility can also be used to import custom entries into the DNS databases by using the domain.local file or another specified text file.

Usage

The geo-update utility is most often executed by the Urchin scheduler (urchind) based on the __domaindb task. This task is set to check for new updates once per month and can be configured by the user to occur at a certain time using the admin interface.

There are two main functions for this utility:

1. Download new versions of the domain databases if they are available
2. Import custom domain entries

The default behavior is to check for updates, download the new databases and overwrite the existing ones, then import the local data from domain.local. The domain.local file must be located in the data/geodata subdirectory of your Urchin installation. The default behaviors can be overridden by running geo-update from the command line with the appropriate options. The usage is:

geo-update [-Hhv] [-D | -F] [-i file]

-H log output to History file instead of stdout
-h print help information
-v print version information for this utility
-D disable download of domain databases (Cannot be used with -F)
-F force download of domain databases (Cannot be used with -D)
-i import domain data from specified file rather than domain.local

Using the -D option will disable downloading new databases. This is most often useful when importing local changes into the database without causing a complete update. Using the -F option will force the download of new databases even if the databases are already up to date. This feature is useful if you imported incorrect custom domain information or otherwise damaged your domain database and wish to start over with a fresh copy. Logging output to a history file with the -H option will make the run information available when viewing the Task History screen in the Configuration->Scheduler area of the Urchin administration interface.

Examples

To force a download of the latest domains database:

geo-update -F

To import custom entries from the domain.local file without downloading a new database:

geo-update -D

To import custom entries from the file mydomains without downloading a new database:

geo-update -D -i mydomains
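Because -D and -F cannot be combined, a wrapper script can make the conflict impossible by choosing a single mode up front. The following is a minimal sketch; the function name and the mode names ("default", "no-download", "force") are illustrative and not part of Urchin.

```shell
#!/bin/sh
# Build a geo-update argument string from one mode and an optional import
# file. A single mode value means -D and -F can never be emitted together.
# The mode names are made up for this sketch.
geo_update_args() {
    mode=$1
    import_file=${2:-}
    case $mode in
        default)     args="" ;;
        no-download) args="-D" ;;
        force)       args="-F" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    if [ -n "$import_file" ]; then
        # Append "-i file", separated by a space if a flag is already set.
        args="${args:+$args }-i $import_file"
    fi
    printf '%s\n' "$args"
}
```

A caller might then run `./geo-update $(geo_update_args no-download mydomains)`. The unquoted substitution means file names containing spaces are not supported by this sketch.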

Custom DNS Entries

When using either the domain.local file in the data/geodata directory or some other custom file, the format of the entries should be one entry per line, starting with the IP or network address followed by a space or tab, and then the domain for that address. Spaces are not allowed in the domain name. The allowed forms include the following:

• 192.168.10.100 somehost.somedomain.net (Explicit hostname IP)
• 192.168.10.16/24 somedomain.net (IP address with network prefix)
• 192.168.10.0/24 somedomain.net (subnet with network prefix)

When processing, Urchin checks for specific IPs first and then looks for encompassing network ranges.
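Since a malformed entry is easier to catch before it is imported, it may help to check a custom file against the format described above. This sketch classifies a single line; the function name is hypothetical, and the check is deliberately loose (it does not verify that each octet is in the range 0-255 or that the prefix length is sensible).

```shell
#!/bin/sh
# Classify one line of a domain.local-style file.
# Prints "host" for a plain IP entry, "network" for an address with a
# /prefix, and "invalid" for anything else.
classify_entry() {
    printf '%s\n' "$1" | awk '
        {
            # Exactly two whitespace-separated fields: address and domain.
            if (NF != 2) { print "invalid"; exit }
            if ($1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) { print "host"; exit }
            if ($1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\/[0-9]+$/) { print "network"; exit }
            print "invalid"
        }'
}
```

For example, `classify_entry "192.168.10.0/24 somedomain.net"` prints `network`, while a line with a space inside the domain name prints `invalid`.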

Considerations

Since geo-update will completely overwrite any existing Urchin domains database each time it updates, it is advisable that you always keep your local modifications in the domain.local file so they will automatically be added. Otherwise, after an update you will have to add local adjustments manually. Be sure to keep a backup copy of your domain.local file elsewhere.

The geo-update utility needs an internet connection to be able to check for and download new updates. The utility uses port 80 to communicate with the webserver providing the updates. It is possible that you will have problems when going through firewalls and proxy servers when doing updates. Please consult your network administrator if this is the case.

inspector: Urchin Installation Integrity Checker

Overview

The Urchin Installation Integrity Checker, inspector, provides a means of checking the Urchin 5 distribution and reporting if there are problems with permissions, missing files, or certain problems with the Urchin configuration itself. By default, it also provides a summary of information about the Urchin installation and the type of platform it is installed on.

The types of operations that inspector can perform are:

• Perform a sanity check on the Urchin distribution, including existence of files and proper permissions
• Check availability of a network port (for Urchin's webserver)
• Reset proper permissions on the Urchin distribution

Usage

inspector is located in the util directory of the Urchin 5 distribution.

Usage of the utility is as follows:

inspector [-h] (prints usage message and exits)
inspector [-v] (prints version and exits)
inspector [-v] [-p port] [-r]

where:

-p checks a specified port to determine availability
-r directs inspector to fix permissions on the distribution files

When called with no command line arguments, inspector will perform a number of sanity checks on the Urchin installation to ensure integrity of the installation. Specifically, it performs the following operations:

1. Prints the Urchin version and Server information such as the operating system version and hostname
2. Verifies that all the proper Urchin binaries and utilities are in place and are the correct version. On UNIX-type platforms, discrepancies in the permissions of the binaries and files in the Urchin distribution are also noted.
3. Checks and reports the status of the Urchin Apache webserver (urchinwebd)
4. Checks and reports the status of the Urchin Scheduler (urchind)
5. Checks the Urchin configuration and reports the total number of records and some summary licensing information.

When invoked with the "-p" argument, inspector will check for the availability of the specified network port. This option is intended primarily for use by the Urchin installation process, and provides no information about the actual Urchin distribution itself.

The "-r" (repair) option will cause the utility to attempt to repair the permissions of any files in the Urchin distribution that are not consistent with the original installation. This is typically only useful on UNIX-type platforms, and in most cases will require that the utility be run as the "root" user.

u3importer: Urchin 3 Data Import Utility

Overview

The u3importer is a command line utility found in the util directory of the Urchin distribution. This utility allows for the importing of report configurations from an existing Urchin 3 config file as well as the data associated with each Urchin 3 report.

The u3importer has two modes: an interactive mode which prompts the user for the required information to perform the import operations, and a non-interactive mode that requires no dialog (more suited to unattended scripting operations).

When run in interactive mode without any command line arguments, for each <Report> block and its associated directives in the Urchin 3 config file the utility will:

1. Create an Urchin 5 Profile, Log Source, and Task
2. Create any associated Filter entries
3. Read the Urchin 3 databases and write the data into new Urchin 5 databases in the appropriate location in the Urchin 5 distribution

In non-interactive mode using the "-c" option (explained below), u3importer will simply take Urchin 3 databases and import them into new Urchin 5 database files without making any changes to the Urchin 5 configuration.

Important Note: The utility must be run on the same operating system platform that originally created the Urchin 3 report data. It is unable to read and convert data created on a different operating system platform, and attempting to do so may cause unexpected termination of the utility. If you wish to upgrade from an Urchin 3 installation on one operating system to a new Urchin 5 installation on another platform type, you should first install a temporary copy of Urchin 5 on the old platform. Next, run the u3importer to create an interim Urchin 5 configuration and profile data. Since the resulting Urchin 5 profile data is platform-independent, this data can then be moved over to your permanent Urchin 5 installation on the new platform. Please note that it is not necessary to license the Urchin 5 distribution on your old platform in order to use the u3importer utility.

Please be sure to read the Limitations section below for a more detailed explanation of the limitations of importing Urchin 3 data.

Usage

Usage of the utility is as follows:

u3importer [-v] (prints usage and exits)
u3importer (interactive mode)
u3importer [-c urchin3-report-data-path urchin5-report-data-path] (non-interactive mode)

Data Importing Procedure

When upgrading from Urchin 3, u3importer should be run before creating any new Urchin 5 profiles, if possible, since it must create a new profile for each Urchin 3 report it reads. The utility will not add data to an existing profile. Instead, if a profile of the same name as the one it is trying to import exists, it will create a similar name with the number 2 appended to it and import the data into that profile. Therefore, it is strongly advised to run u3importer before normal operation begins under Urchin 5. It is also recommended to disable the automation of any existing Urchin 3 processing, so that new log files are not discarded or lost during the upgrade process. This will also ensure that the Urchin 3 data is not changing while running u3importer to import your databases.

Interactive Mode Operation

UNIX

In a command line shell on the system where Urchin is installed, change directory to the Urchin util directory and execute u3importer like so:

./u3importer

Windows

On the system where Urchin is installed, open a Command Shell by going to Start->Run..., enter "cmd" and hit Enter. Once the shell window launches, type:

C:
cd \Program Files\Urchin\util
u3importer.exe

Step 1: Locate the Urchin 3 configuration

The u3importer utility will prompt for the location of the Urchin 3 configuration file. A suggested location is provided. To accept the suggestion, simply press return. Otherwise, enter the complete path to the config file located in the Urchin 3 folder. Wherever Urchin 3 was installed, there should be a config file located in that Urchin 3 folder. On Unix systems, this could be /usr/local/urchin3/config. On Windows systems, this could be C:\Program Files\Urchin3\config. If you cannot find the Urchin 3 installation, please contact your system administrator for details.

Step 2: Import Urchin 3 configuration profiles

Once the utility locates the Urchin 3 configuration, it will list all of the sites that exist in the configuration and prompt you for which ones to import. To import all profiles, press enter. To import only select profiles, type Y or N as each profile is prompted. Before continuing to the next step, you may verify that the configurations were imported correctly by inspecting the Urchin 5 Configuration interface.

Step 3: Import Urchin 3 data

After importing the configurations, the utility will then prompt to import the data associated with each profile. Importing the data will allow you to view Urchin 3.x historical reports under the new Urchin interface. To import data for all profiles, press enter. To import data for only select profiles, type Y or N as each profile is prompted.

Non-interactive Mode Operation

In non-interactive mode u3importer simply reads existing Urchin 3 databases and creates the associated new Urchin 5 databases. It is up to the user to make sure the Urchin 5 databases are located properly within the Urchin distribution and that the necessary Profiles, Log Sources, and Tasks are created to fully configure the site. To launch u3importer in non-interactive mode without additional dialog, use the "-c" option like so:

u3importer -c /path/to/urchin3_udata /path/to/urchin5_databases

The path to your Urchin 3 data should point to the same directory path as shown in your Urchin 3 config file ReportDirectory directive for a given site. The Urchin 5 directory can be created anywhere. However, if you want it to automatically become a part of your Urchin 5 configuration when you later add a profile for a site, you should make the path point to the data/reports subdirectory of your Urchin distribution. As an example, if you have a site named test.urchin.com in your Urchin 3 configuration, then the report block for that site will contain a ReportDirectory directive similar to:

ReportDirectory: /urchin3/test.urchin.com/

Assuming your Urchin 5 installation is in the default location of /usr/local/urchin5, then to convert the Urchin 3 databases for this site and have them put in the proper location in your Urchin 5 installation, you would run the following command (line breaks added for readability; this command would be invoked on a single line):

u3importer -c \
    /urchin3/test.urchin.com \
    /usr/local/urchin5/data/reports/test.urchin.com

In a scripted environment, after running this command you would typically use the uconf-import or uconf-driver utilities to create the associated Profile, Log Source, and Task configuration records to complete your migration for this site.
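For a host with many sites, the non-interactive conversion can be wrapped in a loop. The sketch below assumes that every Urchin 3 report directory sits directly under one parent directory (as /urchin3/test.urchin.com/ does in the example above) and that the Urchin 5 directory should carry the same site name; the function name and the U3IMPORTER variable are illustrative. Whether u3importer creates the destination directory itself is not stated here, so verify that on a single site first.

```shell
#!/bin/sh
# Path to the utility; override U3IMPORTER to point elsewhere.
U3IMPORTER=${U3IMPORTER:-./u3importer}

# Convert every Urchin 3 report directory under src_root into a directory
# of the same name under dest_root, one "u3importer -c" run per site.
import_all_sites() {
    src_root=$1
    dest_root=$2
    for site_dir in "$src_root"/*/; do
        # Skip the unexpanded pattern when src_root has no subdirectories.
        [ -d "$site_dir" ] || continue
        site=$(basename "$site_dir")
        echo "importing $site"
        "$U3IMPORTER" -c "$src_root/$site" "$dest_root/$site"
    done
}
```

With the layout from the example, the call would be `import_all_sites /urchin3 /usr/local/urchin5/data/reports`.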

Limitations

Urchin 3 provided only a subset of the reports that are available by default in Urchin 5, and in many cases the reporting data was maintained only on a monthly basis. Therefore, not all Urchin 5 reports will be populated when importing Urchin 3 data. The following outlines the limitations of importing data from Urchin 3:

1. Unpopulated reports: the following reports were not present in Urchin 3, so they will be blank in Urchin 5 after importing Urchin 3 data.

i. Traffic -> (hourly view for all graph reports)
ii. Visitors & Sessions (all reports)
iii. Pages & Files -> Downloads
iv. Pages & Files -> All Files
v. Pages & Files -> Directory by Files Drilldown
vi. Pages & Files -> Page Query Terms
vii. Navigation -> Click Paths
viii. Navigation -> Length of Pageview
ix. Referrals -> Referral Errors
x. Domains & Users -> IP Addresses
xi. Domains & Users -> IP Drilldown
xii. Domains & Users -> Usernames by Bytes
xiii. Browsers & Robots -> Browsers by Bytes Drilldown
xiv. Browsers & Robots -> Platforms by Bytes Drilldown
xv. Browsers & Robots -> Robots by Hits Drilldown
xvi. Browsers & Robots -> Robots by Bytes Drilldown
xvii. Client Parameters (all reports)

2. Navigation data in Urchin 3 is stored on a monthly basis, whereas in Urchin 5 it is stored daily. When importing Urchin 3 data, the data is all consolidated into the first of the month for the following reports:

i. Navigation -> Entrance Pages
ii. Navigation -> Exit Pages

3. No imported Profiles are scheduled to run, since there was no concept of scheduled tasks in Urchin 3. Use the uconf-schedule utility to globally schedule all imported Profiles to run at a certain time.

4. Unique Log Sources are created for every TransferLog entry in the Urchin 3 config file, even if multiple Urchin 3 reports use that same TransferLog. You may want to consolidate the Log Sources after importing your Urchin 3 data if multiple Profiles share the same log file.

5. Unique Filters are created for each FilterIn, FilterOut, and DynamicURL directive encountered in the Urchin 3 config file. Again, you may want to consolidate the imported Filters if you have multiple Profiles or Log Sources that use the same filter.

6. The u3importer does not perform any type of disk space checking before running, so you will want to ensure that you have enough disk space to perform the import operation. Generally speaking, you will need to allocate at least as much additional disk space as what is currently required for all your Urchin 3 report directories.
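Because u3importer does no disk space checking of its own, a short pre-flight check can apply the rule of thumb above: free space at the destination should be at least the current size of the Urchin 3 report directories. This is a rough sketch using the standard du and df tools; the function name is hypothetical.

```shell
#!/bin/sh
# Succeed only if the filesystem holding dest_dir has at least as many free
# kilobytes as src_dir currently occupies.
has_room_for_import() {
    src_dir=$1
    dest_dir=$2
    # Total size of the source tree, in kilobytes.
    needed_kb=$(du -sk "$src_dir" | awk '{ print $1 }')
    # "Available" column of the POSIX-format df output, in kilobytes.
    free_kb=$(df -kP "$dest_dir" | awk 'NR == 2 { print $4 }')
    echo "needed: ${needed_kb} KB, free: ${free_kb} KB"
    [ "$free_kb" -ge "$needed_kb" ]
}
```

A migration script could call `has_room_for_import /urchin3 /usr/local/urchin5/data/reports || exit 1` before starting any import.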

Considerations

1. No checking is done to see if previous data exists in the Urchin databases for the time range covered by the imported data. Therefore, if you run the u3importer tool again, you will double your statistics for that period.

2. Care should be taken when using the u3importer tool to import data from months that have already been populated with native Urchin 5 data. While the importing process does not overwrite existing data, it does no checking to see if data already exists for a given day and will happily add (possibly duplicate) data to the databases.
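One way to guard against an accidental second run in scripted migrations is to leave a marker behind after each import and refuse to import again while it exists. The sketch below is only an illustration: the marker file name (".u3imported", written next to the destination directory) is invented for this example, and it assumes u3importer returns a non-zero exit status on failure, which is not documented here.

```shell
#!/bin/sh
# Path to the utility; override U3IMPORTER to point elsewhere.
U3IMPORTER=${U3IMPORTER:-./u3importer}

# Import src into dest at most once. The marker lives beside the
# destination directory (dest.u3imported) so it does not sit among the
# report databases. The marker name is a convention made up for this sketch.
import_once() {
    src=$1
    dest=$2
    marker="${dest%/}.u3imported"
    if [ -f "$marker" ]; then
        echo "skipping $dest: already imported"
        return 0
    fi
    "$U3IMPORTER" -c "$src" "$dest" || return 1
    # Record when the import happened.
    date > "$marker"
}
```

Deleting the marker file re-enables the import for that site, which should only be done after the previously imported data has been removed.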

uconf-driver: Configuration Management Utility

Overview

The Configuration Management Utility, uconf-driver, provides a command line interface for administering the Urchin 5 configuration. All functionality present in the Urchin 5 administration interface is available in this utility, so it can completely replace the administration interface for managing any facet of the Urchin 5 configuration. The uconf-driver is intended for use in situations where managing the Urchin 5 configuration through automated/unattended scripts is desired. This makes it ideal for environments such as web hosting or large corporations where the amount of data that must be managed makes it impractical to manage Urchin via the standard web-based administration interface.

The uconf-driver utility is located in the util directory of the Urchin 5 distribution.

Usage

Usage of the utility is as follows:

uconf-driver [-h] (prints usage message and exits)
uconf-driver [-v] (prints version and exits)
uconf-driver [-f file] [-d path] command parameters...

where:

-f specifies the path to a file containing uconf-driver commands
-d specifies an alternative path to the Urchin configuration database
-e specifies printing of the "entry=" field (Urchin 5.100 and later)

and command is one of the following (note that each command is all on one line, line breaks added for readability):

action=ntotalrecords
action=nrecords table=tablename
action=list table=tablename [start=startnum] [n=count]
action=get ident
action=add table=tablename name=recordname \
    directive=value [directive=value ...]
action=edit ident directive=value \
    [directive=value ...]
action=delete ident
action=get_parameter ident parameter=directive
action=set_parameter ident directive=value

and ident is one of:

recnum=recordnum
table=tablename entry=tblentry
table=tablename name=recordname

and tablename is one of:

Global
Machine
Filter
Logfile
Profile
Task
Affiliation
Group
User

The format of a uconf-driver command is a series of name/value pairs passed as command line arguments that define the action to be taken. In particular, the action command line argument directs the behavior of uconf-driver. The nine possible actions are described in the Usage section above. Valid user-specified configuration directives begin with the prefixes "ct_", "cr_", or "cs_", but Urchin also uses directives with a "cx_" prefix for internal purposes. These "cx_" directives should never be modified with the uconf-driver utility.

Beginning with Urchin 5.100, you must use the "-e" option to have the "entry=" field printed for uconf-driver commands that return an entire record. This was done to improve uconf-driver performance when it is used with very large Urchin configurations. Calculation of the "entry" field requires uconf-driver to search sequentially through all records in the database, which is a slow process on Urchin configurations with tens of thousands of records. When used without the "-e" argument, uconf-driver actions are very quick even with Urchin configuration databases containing millions of records.

When invoked, uconf-driver will either execute the single action specified by the command line arguments or the multiple actions contained in a file specified with the "-f" option. If no command line arguments are given and no "-f" option is specified, uconf-driver will read actions from stdin. When specifying multiple commands via stdin or the "-f" option, each line of input should represent one uconf-driver command.
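Since each input line is one command, a command file for the "-f" option (or for stdin) can be generated from a simple list. The sketch below turns "username password" pairs into "add user" commands. The input format and the function name are assumptions of this example; the directives ct_password and cs_adminlevel=3 are taken from the sample commands later in this document. How uconf-driver treats quotes or spaces inside a command file line is not described here, so the sketch avoids both.

```shell
#!/bin/sh
# Emit one uconf-driver "add user" command per input line.
# Input (an assumption of this sketch): "username password", one user per
# line, with no spaces inside either field.
make_user_commands() {
    while read -r user pass; do
        # Skip blank lines.
        [ -n "$user" ] || continue
        echo "action=add table=user name=$user ct_password=$pass cs_adminlevel=3"
    done
}
```

The result can be saved with `make_user_commands < users.txt > commands.txt` and applied with `./uconf-driver -f commands.txt`, or piped straight into `./uconf-driver` with no arguments.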

A complete list of valid Urchin configuration directives is available in the Reference section of the Urchin Documentation Center at http://help.urchin.com and is entitled "Configuration Table and Directive List".

Overview of uconf-driver Actions

Several of the uconf-driver actions require you to identify the record that is to be retrieved or modified. This identification can be accomplished in three ways. One of the argument lists below should be substituted for the ident string (see "Usage" above) wherever a command requires a record identifier.

1. If the internal record number is known, then substitute "recnum=recordnum" for ident as the command line argument. Note that this record number is not necessarily sequential within a table.

2. If it is desired to loop through all of the records in a particular table, then specify "entry=tblentry" and "table=tablename" as the ident command line argument. The first entry defaults to "1".

3. The exact name of the record can be specified using the "name=recordname" argument. Be sure to use quotes if the name may contain white space or characters that may be interpreted as metacharacters by the shell running your command.
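The second form of ident lends itself to a simple loop: ask for the record count with "nrecords", then fetch entries 1 through that count. The following is a sketch only; the function name and the UCONF_DRIVER variable are illustrative, and it assumes "nrecords" prints a bare number.

```shell
#!/bin/sh
# Path to the utility; override UCONF_DRIVER to point elsewhere.
UCONF_DRIVER=${UCONF_DRIVER:-./uconf-driver}

# Print every record in a table, one "get" call per table entry.
each_record() {
    table=$1
    total=$("$UCONF_DRIVER" action=nrecords table="$table")
    # Stop if the count is empty or not a plain number.
    case $total in
        ''|*[!0-9]*) echo "could not read record count" >&2; return 1 ;;
    esac
    entry=1
    while [ "$entry" -le "$total" ]; do
        "$UCONF_DRIVER" action=get table="$table" entry="$entry"
        entry=$((entry + 1))
    done
}
```

For example, `each_record user` prints one line per User record.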

Description of Actions

Retrieving Data: these commands provide the ability to extract records, partial records, and record counts from the Urchin configuration.

• "ntotalrecords" outputs the total number of records in all tables in the Urchin configuration database.
• "nrecords table=tablename" outputs the number of records in the specified tablename.
• "list table=tablename [start=startnum] [n=count]" retrieves multiple records from a particular table. Each record is printed on a separate line. The optional "start=startnum" argument specifies the starting number of the table entry to begin grabbing, and the "n=count" argument specifies how many entries to retrieve. For example, if a particular table has 44 records and you want to grab records 20-29, specify "start=20 n=10".
• "get ident" prints all information associated with the particular ident argument on a single line.
• "get_parameter" allows you to retrieve the value for a specific directive from the record matching ident in the configuration database, and requires the "parameter=directive" argument.
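The "start" and "n" arguments make it possible to walk a large table in fixed-size windows rather than in one call. The sketch below computes each window from the "nrecords" count; the function name and the UCONF_DRIVER variable are illustrative. It trims the last window to the records that remain, since the behavior of a count that runs past the end of the table is not described here.

```shell
#!/bin/sh
# Path to the utility; override UCONF_DRIVER to point elsewhere.
UCONF_DRIVER=${UCONF_DRIVER:-./uconf-driver}

# List a table in windows of at most "batch" records each.
list_in_batches() {
    table=$1
    batch=$2
    [ "$batch" -gt 0 ] || return 1
    total=$("$UCONF_DRIVER" action=nrecords table="$table")
    case $total in
        ''|*[!0-9]*) return 1 ;;
    esac
    start=1
    while [ "$start" -le "$total" ]; do
        # Shrink the final window to the number of records left.
        remaining=$((total - start + 1))
        count=$batch
        if [ "$remaining" -lt "$count" ]; then
            count=$remaining
        fi
        "$UCONF_DRIVER" action=list table="$table" start="$start" n="$count"
        start=$((start + count))
    done
}
```

For the 44-record table in the example above, `list_in_batches logfile 20` issues three list calls: start=1 n=20, start=21 n=20, and start=41 n=4.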

Modifying Data: the "add", "edit" and "delete" functions provide comprehensive editing ability of records in the Urchin configuration database. A directive list should also be passed along with these actions. The "set_parameter" function sets a particular directive within a record.

• "add" requires both the "table=tablename" and "name=recordname" parameters and inserts a new record into the database with the specified set of "directive=value" pairs.
• "edit" is similar to the "add" command, except the record directives for the specified ident are replaced with the new list of "directive=value" pairs. This function clears all previous directives for ident and adds all new directives specified on the command line.
• "delete" deletes the record matching ident from the table.
• "set_parameter" sets the specified "directive=value" directive in the configuration for the record matching ident.
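Because "edit" clears every directive that is not restated on the command line, changing only a few fields of an existing record is safer with one "set_parameter" call per directive. A minimal sketch follows; the function name and the UCONF_DRIVER variable are illustrative, and the record is identified by its record number only to keep the example short.

```shell
#!/bin/sh
# Path to the utility; override UCONF_DRIVER to point elsewhere.
UCONF_DRIVER=${UCONF_DRIVER:-./uconf-driver}

# Apply several directive=value pairs to one record, leaving all other
# directives of that record untouched.
set_parameters() {
    recnum=$1
    shift
    for pair in "$@"; do
        "$UCONF_DRIVER" action=set_parameter recnum="$recnum" "$pair"
    done
}
```

For example, `set_parameters 12 cs_language=en cs_region=us` updates two directives of record 12 in two separate calls and prints whatever uconf-driver prints for each.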

Diagnostics Returned and Exit status

Upon completion of a command, uconf-driver will print out one of the following diagnostics based on the action parameter that was specified:

Action | Output | Description
(any) | [usage msg] | command line not in recognized format
(any) | [no msg] | command didn't perform any action, perhaps due to an out of range entry, incorrect table name, etc.
(any) | -l | command line parameters invalid for request type
add | [recnum] | record number of record that was created
delete | [recnum] | record number of record that was deleted
edit | [recnum] | record number of record that was edited
get | [record] | complete set of name/value directives for specified record
get_parameter | [param] | value of requested directive
list | [records] | complete list of all name/value directives for all records in the specified table
nrecords | [count] | count of records in the specified table
ntotalrecords | [count] | count of all records in all tables
set_parameter | 0 | always prints 0 on success

Please note that you must parse the runtime output from uconf-driver to determine if the command was successful. At present, the utility always exits with a status code of 0, so the exit status cannot be used to determine if the command succeeded or not.
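In a script, this means wrapping each call in a check of what was printed. The sketch below does so for the "add" action, treating the call as successful only when the output is a plain non-negative integer, which is assumed here to be the form of the record number. The function name and the UCONF_DRIVER variable are illustrative.

```shell
#!/bin/sh
# Path to the utility; override UCONF_DRIVER to point elsewhere.
UCONF_DRIVER=${UCONF_DRIVER:-./uconf-driver}

# Add a record; succeed and print the record number only if the output
# consists of digits. Empty output, a usage message, or any other
# diagnostic is reported as a failure.
uconf_add() {
    out=$("$UCONF_DRIVER" action=add "$@")
    case $out in
        ''|*[!0-9]*)
            echo "uconf-driver add failed: ${out:-no output}" >&2
            return 1 ;;
    esac
    echo "$out"
}
```

A caller can then write `p_recnum=$(uconf_add table=profile name=mysite.com) || exit 1` and rely on the shell's normal error handling.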

Examples

Here is a set of example commands using the uconf-driver utility. Please note that all commands are on a single line; line breaks have been added for readability.

# Extract the total number of records from the Urchin configuration database
uconf-driver action=ntotalrecords

# Extract the number of records in the "profile" table
uconf-driver action=nrecords table="profile"

# List records 6-8 in the "logfile" table
uconf-driver action=list table="logfile" start=6 n=3

# Extract the record for user "joe" from the "user" table
uconf-driver action=get table=user name="joe"

# Add a regular non-privileged user to the configuration
uconf-driver action=add table=user name="bob" ct_fullname="Bob Jones" \
    ct_password="b0bz@pw" cs_adminlevel=3

# Set the default language for the Urchin reports to English
uconf-driver action=set_parameter table=user name="bob" cs_language=en

# Change the network port that the Urchin webserver uses
uconf-driver action=set_parameter recnum=1 ct_port=1234

Sample Bourne Shell script to add a Profile/Task/LogSource/User using uconf-driver

This is a sample working script that demonstrates how uconf-driver could be embedded in a script to automate the creation of an entire new Profile, a Log Source for it to process, a scheduled Task, and a User with rights to view the Profile.

#!/bin/sh
#
# Proof-of-concept Bourne shell script for adding a Profile,
# Task, Log Source and User record set to the Urchin configuration.
# The Profile will be set to run at 01:05am daily.
#
# NOTE: Long commands are continued across lines with a trailing backslash
#
# Define the pertinent information here. Obviously, this should really be
# stuff that's parsed from the command line.
domain=mysite.com
logfile=/path/to/webserverlogs/mysite-access.log
username=userjoe
password=joepasswd
language=en
region=us

cd /path/to/urchin/util || exit 1

# Add Profile
p_recnum=`./uconf-driver action=add table=profile name=$domain \
    ct_name=$domain ct_website=http://www.$domain \
    ct_reportdomains=$domain,www.$domain`

# Add Task
t_recnum=`./uconf-driver action=add table=task name=$domain \
    ct_name=$domain cr_frequency=5 cr_enabled=on cs_hour=01 \
    cs_minute=05 cs_rid=$p_recnum`

# Set proper cross reference from Profile to Task
recnum=`./uconf-driver action=set_parameter recnum=$p_recnum \
    cs_taskid=$t_recnum`
if [ "$recnum" != "$p_recnum" ]; then
    echo "Failed to associate profile with task"
fi

# Add Log Source
l_recnum=`./uconf-driver action=add table=logfile name=$domain \
    cr_action=2 ct_name=$domain cr_type=local ct_loglocation=$logfile \
    cs_logformat=auto cs_rlist=!$p_recnum!`

# Set proper cross reference from Profile to Log Source
recnum=`./uconf-driver action=set_parameter recnum=$p_recnum \
    cs_llist=!$l_recnum!`
if [ "$recnum" != "$p_recnum" ]; then
    echo "Failed to associate profile with log source"
fi

# Add regular non-privileged user with access to this Profile
u_recnum=`./uconf-driver action=add table=user name=$username \
    ct_name=$username ct_password=$password ct_fullname="$domain user" \
    cs_language=$language cs_region=$region cs_adminlevel=3 \
    cs_rlist=!$p_recnum!`

# Set proper cross reference from Profile to User
recnum=`./uconf-driver action=set_parameter recnum=$p_recnum \
    cs_ulist=!$u_recnum!`
if [ "$recnum" != "$p_recnum" ]; then
    echo "Failed to associate user with profile"
fi
exit
##
## END OF SCRIPT
##

Here is the same script, written using DOS commands.

@echo off
REM Proof-of-concept Windows batch file for adding a Profile,
REM Task, Log Source and User record set to the Urchin configuration.
REM The Profile will be set to run at 01:05am daily.
REM
REM This should work on Windows 2000, XP and 2003 Server.
REM
REM NOTE: Line breaks have been added for readability - you will need
REM to ensure that all your commands appear on one line in the script
REM
REM
REM Prompt the user for the information we need. This section could
REM be replaced using values from command line arguments instead.
REM
set/p domain=Enter domain: 
set/p logfile=Enter webserver log pathname: 
set/p username=Enter username: 
set/p password=Enter password: 
set/p language=Enter language: 
set/p region=Enter region: 

cd c:\program files\urchin\util

REM
REM Add Profile
REM
uconf-driver action=add table=profile name=%domain% ct_name=%domain% \
    ct_website=http://www.%domain% \
    ct_reportdomains=%domain%,www.%domain% > #tmp
set/p p_recnum= <#tmp
REM
REM Add Task
REM
uconf-driver action=add table=task name=%domain% ct_name=%domain% \
    cr_frequency=5 cr_enabled=on cs_hour=01 cs_minute=05 \
    cs_rid=%p_recnum% > #tmp
set/p t_recnum= <#tmp
REM
REM Set proper cross reference from Profile to Task
REM
uconf-driver action=set_parameter recnum=%p_recnum% cs_taskid=%t_recnum% > #tmp
set/p recnum= <#tmp
if not %recnum%==%p_recnum% echo Failed to associate profile with task
REM
REM Add Log Source
REM
uconf-driver action=add table=logfile name=%domain% cr_action=2 \
    ct_name=%domain% cr_type=local ct_loglocation=%logfile% \
    cs_logformat=auto cs_rlist=!%p_recnum%! > #tmp
set/p l_recnum= <#tmp
REM
REM Set proper cross reference from Profile to Log Source
REM
uconf-driver action=set_parameter recnum=%p_recnum% cs_llist=!%l_recnum%! > #tmp
set/p recnum= <#tmp
if not %recnum%==%p_recnum% echo Failed to associate profile with log source
REM
REM Add regular non-privileged user with access to this Profile
REM
uconf-driver action=add table=user name=%username% ct_name=%username% \
    ct_password=%password% ct_fullname="%domain% user" \
    cs_language=%language% cs_region=%region% cs_adminlevel=3 \
    cs_rlist=!%p_recnum%! > #tmp
set/p u_recnum= <#tmp
REM
REM Set proper cross reference from Profile to User
REM
uconf-driver action=set_parameter recnum=%p_recnum% cs_ulist=!%u_recnum%! > #tmp
set/p recnum= <#tmp
if not %recnum%==%p_recnum% echo Failed to associate user with profile
del #tmp
REM
REM END OF SCRIPT
REM


Here is the same script, written using VBS. On a Windows system, you would need to use "cscript" to run this script, e.g. "cscript add_records.vbs".

'-----------------------------------------------------------------------------
' Proof-of-concept win32 cscript for adding a Profile,
' Task, Log Source and User record set to the Urchin configuration.
' The Profile will be set to run at 01:05am daily.
'
' NOTE: Line breaks have been added for readability
'
' Define the pertinent information here. Obviously, this should really be
' stuff that's parsed from the command line.
'-----------------------------------------------------------------------------

Option Explicit
'On Error Resume Next

'-----------------------------------------------------------------------------
' Declare needed vars
'-----------------------------------------------------------------------------
Dim driverpath
Dim tmpfilepath
Dim domain
Dim logfile
Dim username
Dim password
Dim language
Dim region
Dim calls

driverpath = "C:\Program Files\Urchin\util\uconf-driver.exe"
tmpfilepath = "C:\Program Files\Urchin\var\udtemp"
domain = "mysite.com"
logfile = "/path/to/webserverlogs/mysite-access.log"
username = "userjoe"
password = "joepasswd"
language = "en"
region = "us"

'-----------------------------------------------------------------------------
' Declare objects
'-----------------------------------------------------------------------------
Dim shell, fso, execo, tfile
Dim recnum, p_recnum, t_recnum, l_recnum, u_recnum

Set shell = CreateObject("WScript.Shell")
Set fso = CreateObject("Scripting.FileSystemObject")

'-----------------------------------------------------------------------------
' create temporary empty file
'-----------------------------------------------------------------------------
Set tfile = fso.CreateTextFile(tmpfilepath)
tfile.Close

'-----------------------------------------------------------------------------
' add profile
'-----------------------------------------------------------------------------
calls = """" & driverpath & """ action=add table=profile name=""" & domain & """ ct_name=""" & domain & """ ct_website=""http://www." & domain & """ ct_reportdomains=""" & domain & ",www." & domain & """ -f " & tmpfilepath
WScript.Echo calls

Set execo = Shell.Exec(calls)

While execo.StdOut.AtEndOfStream <> True
    p_recnum = p_recnum & (execo.StdOut.ReadLine())
Wend

If (IsNumeric(p_recnum) = False Or p_recnum <= 0) Then
    WScript.Echo "Add Profile Failed " & p_recnum
    WScript.Quit(-1)
End If
WScript.StdOut.Write (p_recnum) & IsNumeric(p_recnum) & vbCrLf

'-----------------------------------------------------------------------------
' add Task
'-----------------------------------------------------------------------------
calls = """" & driverpath & """ action=add table=task name=""" & domain & """ ct_name=""" & domain & """ cr_frequency=5 cr_enabled=on cs_hour=01 cs_minute=05 cs_rid=" & p_recnum & " -f " & tmpfilepath
WScript.Echo calls

Set execo = Shell.Exec(calls)

While execo.StdOut.AtEndOfStream <> True
    t_recnum = t_recnum & (execo.StdOut.ReadLine())
Wend

If (IsNumeric(t_recnum) = False Or t_recnum <= 0) Then
    WScript.Echo "Add Task Failed " & t_recnum
    WScript.Quit(-1)
End If

WScript.StdOut.Write (t_recnum) & IsNumeric(t_recnum) & vbCrLf

'-----------------------------------------------------------------------------
' Associate task back to profile
'-----------------------------------------------------------------------------
calls = """" & driverpath & """ action=set_parameter recnum=" & p_recnum & " cs_taskid=" & t_recnum & " -f " & tmpfilepath
WScript.Echo calls

Set execo = Shell.Exec(calls)

recnum = ""
While execo.StdOut.AtEndOfStream <> True
    recnum = recnum & (execo.StdOut.ReadLine())
Wend

If (IsNumeric(recnum) = False Or recnum <= 0 Or recnum <> p_recnum) Then
    WScript.Echo "Failed to associate profile to task " & recnum
    WScript.Quit(-1)
End If

WScript.StdOut.Write (recnum) & IsNumeric(recnum) & vbCrLf

'-----------------------------------------------------------------------------
' Add log source
'-----------------------------------------------------------------------------
calls = """" & driverpath & """ action=add table=logfile name=""" & domain & """ cr_action=2 ct_name=""" & domain & """ cr_type=local ct_loglocation=""" & logfile & """ cs_logformat=auto cs_rlist=!" & p_recnum & "! -f " & tmpfilepath
WScript.Echo calls

Set execo = Shell.Exec(calls)

While execo.StdOut.AtEndOfStream <> True
    l_recnum = l_recnum & (execo.StdOut.ReadLine())
Wend

If (IsNumeric(l_recnum) = False Or l_recnum <= 0) Then
    WScript.Echo "Failed to Add Log Source " & l_recnum
    WScript.Quit(-1)
End If

WScript.StdOut.Write (l_recnum) & IsNumeric(l_recnum) & vbCrLf

'-----------------------------------------------------------------------------
' Associate Log Source to Profile
'-----------------------------------------------------------------------------
calls = """" & driverpath & """ action=set_parameter recnum=" & p_recnum & " cs_llist=!" & l_recnum & "! -f " & tmpfilepath
WScript.Echo calls

Set execo = Shell.Exec(calls)

recnum = ""
While execo.StdOut.AtEndOfStream <> True
    recnum = recnum & (execo.StdOut.ReadLine())
Wend

If (IsNumeric(recnum) = False Or recnum <= 0 Or recnum <> p_recnum) Then
    WScript.Echo "Failed to associate profile with log source " & recnum
    WScript.Quit(-1)
End If

WScript.StdOut.Write (recnum) & IsNumeric(recnum) & vbCrLf

'-----------------------------------------------------------------------------
' Add regular non-privileged user with access to this Profile
'-----------------------------------------------------------------------------
calls = """" & driverpath & """ action=add table=user name=""" & username & """ ct_name=""" & username & """ ct_password=""" & password & """ ct_fullname=""" & domain & """ cs_language=""" & language & """ cs_region=""" & region & """ cs_adminlevel=3 cs_rlist=!" & p_recnum & "! -f " & tmpfilepath
WScript.Echo calls

Set execo = Shell.Exec(calls)

While execo.StdOut.AtEndOfStream <> True
    u_recnum = u_recnum & (execo.StdOut.ReadLine())
Wend

If (IsNumeric(u_recnum) = False Or u_recnum <= 0) Then
    WScript.Echo "Failed to Add User " & u_recnum
    WScript.Quit(-1)
End If

WScript.StdOut.Write (u_recnum) & IsNumeric(u_recnum) & vbCrLf

'-----------------------------------------------------------------------------
' Associate user to profile
'-----------------------------------------------------------------------------
calls = """" & driverpath & """ action=set_parameter recnum=" & p_recnum & " cs_ulist=!" & u_recnum & "! -f " & tmpfilepath
WScript.Echo calls

Set execo = Shell.Exec(calls)

recnum = ""
While execo.StdOut.AtEndOfStream <> True
    recnum = recnum & (execo.StdOut.ReadLine())
Wend

If (IsNumeric(recnum) = False Or recnum <= 0 Or recnum <> p_recnum) Then
    WScript.Echo "Failed to associate profile with user " & recnum
    WScript.Quit(-1)
End If

WScript.StdOut.Write (recnum) & IsNumeric(recnum) & vbCrLf

'-----------------------------------------------------------------------------
' delete temp file and exit
'-----------------------------------------------------------------------------
If (fso.FileExists(tmpfilepath) = True) Then
    fso.DeleteFile(tmpfilepath)
End If
WScript.Quit (0)

'-----------------------------------------------------------------------------
' END OF SCRIPT
'-----------------------------------------------------------------------------

Special Conditions of Use

Passwords

When adding or editing the "ct_password" directive for either a User or a Remote Log Source password, uconf-driver will automatically encrypt the password before writing it to the configuration database to ensure that passwords are not stored in clear text. For portability reasons, the encryption is in a proprietary format that is not compatible with other password encryption formats such as "crypt" on UNIX-type systems.

Managing associations between various records

Note that uconf-driver is a lower-level utility that does not map any of the directives for you. In using the utility to script some operations, please be aware that many of the tables contain directives that refer to other records, or lists of records. These directives are: ct_ulist, ct_glist, ct_llist, ct_flist, and ct_rlist, which refer to the user, group, logfile, filter, and profile tables respectively. These lists are represented as an exclamation-point-delimited list of record numbers:

ct_flist="!13!36!56!"


where each entry represents the recnum value of a record and is surrounded with exclamation points.

Important! Be sure to keep the cross-references intact. For example, a Filter record has a "ct_rlist" which details all of the profiles that the filter applies to; and a Profile record has a "ct_flist" which details all of the filters that apply to this profile.

Important! The uconf-driver utility uses these exclamation-point-delimited lists of record numbers, whereas the uconf-export and uconf-import utilities use comma-delimited lists of names. Be sure to use the appropriate list specification depending on which utility you are using.
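The list syntax described above can be assembled with a small helper rather than typed by hand. The following is a minimal sketch and is not part of the Urchin distribution: make_reclist and reclist_add are hypothetical function names, and the assumption that record numbers are plain non-negative integers is taken only from the examples in this section.

```shell
#!/bin/sh
# Hypothetical helpers for the "!n!n!" record-number lists used by uconf-driver.

# make_reclist N...: join one or more record numbers into "!N!N!" form.
make_reclist() {
    if [ "$#" -eq 0 ]; then
        echo "make_reclist: at least one record number is required" >&2
        return 1
    fi
    list='!'
    for recnum in "$@"; do
        case "$recnum" in
            ''|*[!0-9]*)
                echo "make_reclist: not a record number: $recnum" >&2
                return 1
                ;;
        esac
        list="${list}${recnum}!"
    done
    printf '%s\n' "$list"
}

# reclist_add LIST N: append N to LIST unless it is already present.
# An empty LIST is treated as a list with no entries.
reclist_add() {
    current=${1:-!}
    new=$2
    case "$current" in
        *"!${new}!"*) printf '%s\n' "$current" ;;
        *)            printf '%s\n' "${current}${new}!" ;;
    esac
}
```

A script that attaches a second Log Source to a Profile could, for instance, compute the new list with reclist_add and pass the result to a set_parameter call in the same way the sample scripts above pass cs_llist.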

Considerations

The uconf-driver utility is intended for advanced users who are comfortable with command line scripting, and as such it provides only minimal error/sanity checking. Exercise caution when using the utility, as improper usage could result in an unusable Urchin configuration. It is strongly recommended that the Urchin configuration be backed up using the uconf-export utility before making changes with uconf-driver.

See Also:

Script-based Configuration Management in the Integration section of the Advanced Topics area of the Urchin documentation provides a general overview of using a script-based, unattended/automated approach to managing Urchin's configuration.

Configuration Table and Directive List in the Reference area of the Urchin documentation. This document has a complete list of all the valid tables and parameters that can be used with the uconf-driver utility.

uconf-export: Text-based Configuration Export Utility

Overview

The Text-based Configuration Export Utility, uconf-export, provides a command line interface for reading the Urchin configuration database and exporting it in a human-readable text format. The exported data is in an XML-type record format, which is directly compatible with the format expected by the uconf-import utility. Each record in the exported data corresponds to a configuration record in the Urchin configuration database.

The uconf-export utility is located in the util directory of the Urchin 5 distribution.

Usage

Usage of the utility is as follows:

uconf-export [-h] (prints usage message and exits)
uconf-export [-v] (prints version and exits)
uconf-export [-f file]


where:

-f specifies the path to a file that uconf-export will write to

If no command line arguments are given, uconf-export will write to the standard output.

Output format for uconf-export

The utility exports configuration records in an XML-style format. Each record begins with a <Table Name="RecordName"> line and ends with a </Table> line. The configuration directives associated with the record are printed, one per line, between the record begin/end lines. For instance, a Profile record would look something like this:

<Profile Name="help.urchin.com">
    ct_name=help.urchin.com
    ct_website=http://help.urchin.com
    ct_reportdomains=help.urchin.com,www.help.urchin.com
    cs_llist="help.urchin.com daily log"
    ct_defaultpage=index.html
    cs_referrallevel=3
    cs_timeoffset=localtime
    cr_logtracking=on
    cr_processpath=on
    cs_pathlevel=3
    cs_vmethod=0
    cs_visitortimeout=1800
    cr_sessionpageview=on
    cs_ulist="webmaster"
    ct_affiliation=(NONE)
    cr_profiletype=Standard_Website
    cs_reportset=Basic_All
</Profile>

The valid list of Table names is:

Global
Machine
Filter
Logfile
Profile
Task
Affiliation
Group
User

Considerations

The uconf-export utility provides no ability to export a certain subset of the configuration data. If you wish to extract only certain records from the output of uconf-export, you should use an external script to parse this output or use the uconf-driver utility to extract single records.

The output of uconf-export is carefully constructed so that it can be directly imported into the Urchin configuration using the uconf-import utility. This allows you to use uconf-export to create a text-based backup of your Urchin configuration which can then be easily restored using uconf-import.
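As an illustration of the "external script" approach mentioned above, the sketch below filters exported text down to the records of one table, optionally restricted to a single record name. It assumes that each opening <Table Name="..."> line and each closing </Table> line sits on a line of its own, as in the sample record; extract_records is a hypothetical name, and record names containing backslashes are not handled.

```shell
#!/bin/sh
# extract_records TABLE [NAME]
# Reads uconf-export style text on standard input and prints only the
# records belonging to TABLE (and, if given, only the record called NAME).
extract_records() {
    awk -v table="$1" -v name="${2:-}" '
        {
            # Work on a copy with surrounding whitespace removed so that
            # indentation or trailing carriage returns do not matter.
            line = $0
            sub(/^[ \t]+/, "", line)
            sub(/[ \t\r]+$/, "", line)
        }
        # Opening line of a record in the requested table.
        !inrec && index(line, "<" table " Name=\"") == 1 {
            if (name == "" || line == "<" table " Name=\"" name "\">") inrec = 1
        }
        inrec { print }
        # Closing line ends the record after it has been printed.
        inrec && line == "</" table ">" { inrec = 0 }
    '
}
```

Under these assumptions, the output of uconf-export could be piped through extract_records Profile mysite.com to isolate one Profile record for inspection.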

See Also:

Script-based Configuration Management in the Integration section of the Advanced Topics area of the Urchin documentation provides a general overview of using a script-based, unattended/automated approach to managing Urchin's configuration.

Configuration Table and Directive List in the Reference area of the Urchin documentation. This document has a complete list of all the valid tables and parameters that may appear in the output of the uconf-export utility.

uconf-import: Text-based Configuration Import Utility

Overview

The Text-based Configuration Import Utility, uconf-import, provides a command line interface for importing text-based configuration records into the Urchin configuration database. The imported data is in an XML-type record format, which is directly compatible with the format exported by the uconf-export utility.

The uconf-import utility is located in the util directory of the Urchin 5 distribution.

Usage

Usage of the utility is as follows:

uconf-import [-h] (prints usage message and exits)
uconf-import [-v] (prints version and exits)
uconf-import [-o|-r] [-f file]

where:

-f specifies the path to a file that uconf-import will read from
-o overwrites existing records with the same name with the data being imported
-r removes all existing configuration data and replaces it with the data being imported

If neither the "-o" nor the "-r" argument is specified, only new records will be written into the configuration database; no existing records will be modified or overwritten. If no command line arguments are given, uconf-import will read from the standard input.

Input format for uconf-import

The utility imports configuration records in an XML-style format. Each record begins with a <Table Name="RecordName"> line and ends with a </Table> line. The configuration directives associated with the record should be listed, one per line, between the record begin/end lines. For instance, a Profile record would look something like this:

<Profile Name="help.urchin.com">
    ct_name=help.urchin.com
    ct_website=http://help.urchin.com
    ct_reportdomains=help.urchin.com,www.help.urchin.com
    cs_llist="help.urchin.com daily log"
    ct_defaultpage=index.html
    cs_referrallevel=3
    cs_timeoffset=localtime
    cr_logtracking=on
    cr_processpath=on
    cs_pathlevel=3
    cs_vmethod=0
    cs_visitortimeout=1800
    cr_sessionpageview=on
    cs_ulist="webmaster"
    ct_affiliation=(NONE)
    cr_profiletype=Standard_Website
    cs_reportset=Basic_All
</Profile>

The valid list of Table names is:

Global
Machine
Filter
Logfile
Profile
Task
Affiliation
Group
User

Cross Linking of Records

Many records in the Urchin configuration contain directives that cross reference records in other tables. For instance, a Profile record will have a cross reference to the Logfile table for each specified Log Source; likewise, a record in the Logfile table will have a cross reference to the Profiles which utilize that Log Source. Directives of this type usually have a name like ct_Xlist, and consist of a string of comma-separated record names in other tables that the record references. When using uconf-import, the utility will verify the input data and build the proper internal cross-reference links. For instance, uconf-import will properly cross-reference the following two records when it imports them. Notice the use of cs_llist in the Profile record, and cs_rlist in the Logfile record.

<Profile Name="download.urchin.com">
    ct_name=download.urchin.com
    ct_website=http://download.urchin.com
    ct_reportdomains=download.urchin.com,www.download.urchin.com
    cs_llist="download.urchin.com daily log"
    cs_timeoffset=localtime
    cr_logtracking=on
    cr_processpath=on
    cs_pathlevel=3
    cs_vmethod=0
    cs_visitortimeout=1800
    cr_sessionpageview=on
    cr_profiletype=Standard_Website
    cs_reportset=Basic_All
</Profile>

<Logfile Name="download.urchin.com daily log">
    ct_name=download.urchin.com daily log
    ct_loglocation=/bigdisk/rawlogs/download.urchin.com/access.log
    cr_type=local
    cs_rlist="download.urchin.com"
</Logfile>

Record Ordering Requirement with "-r" argument

When using uconf-import with the "-r" argument, it is required that the first four records in the imported configuration be as follows:

<Global Name="Access Settings">
    ...
</Global>

<Machine Name="Process Settings">
    ...
</Machine>

<Affiliation Name="(NONE)">
    ...
</Affiliation>

<User Name="(admin)">
    ...
</User>

If uconf-import does not find the first four records to be of the type and order specified above, the utility will print a warning diagnostic and exit without performing the import.
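Because a failed ordering check aborts the import, it can be convenient to test a hand-edited file before passing it to uconf-import with "-r". The sketch below checks only the four opening lines listed above; check_record_order is a hypothetical helper, and it assumes each opening <Table Name="..."> line appears on a line of its own. It does not reproduce any other validation that uconf-import itself may perform.

```shell
#!/bin/sh
# check_record_order FILE
# Returns 0 if the first four records in FILE are the ones required for
# "uconf-import -r", in the required order; prints a message and returns 1
# otherwise.
check_record_order() {
    awk '
        BEGIN {
            want[1] = "<Global Name=\"Access Settings\">"
            want[2] = "<Machine Name=\"Process Settings\">"
            want[3] = "<Affiliation Name=\"(NONE)\">"
            want[4] = "<User Name=\"(admin)\">"
        }
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            sub(/[ \t\r]+$/, "", line)
        }
        # Opening lines start with "<Table Name="; closing lines start with "</".
        line ~ /^<[A-Za-z]+ Name="/ {
            n++
            if (line != want[n]) {
                printf "record %d is %s, expected %s\n", n, line, want[n]
                bad = 1
                exit
            }
            if (n == 4) exit
        }
        END { if (bad || n < 4) exit 1 }
    ' "$1"
}
```

A wrapper script could call check_record_order on the file and run the import only when the function succeeds.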

Saving and Restoring the Urchin Configuration

Using the capabilities of the uconf-export and uconf-import utilities, it is a simple process to both save and restore the Urchin configuration to a known good state. You merely need to save the output of the uconf-export utility in a file, and then reimport it at some later date with uconf-import. This allows you to recover easily from corruption caused by a server crash or other catastrophic event. For example, consider the following scenario:

• As a standard operational policy, you save your Urchin configuration regularly with the command:

uconf-export -f /path/to/backupdir/urchin_saved_config.txt

• Time goes by, bad stuff happens and the server disk crashes, requiring a complete reinstall of Urchin.
• You restore your Urchin configuration with the command:

uconf-import -r -f /path/to/backupdir/urchin_saved_config.txt

In addition, it should be noted that the "-r" argument of uconf-import can be used to clean out old unreferenced records in the Urchin configuration database. The following procedure clears out all unused records in the database, thereby compacting it.

uconf-export -f /path/to/tempdir/urchinconfig.txt
uconf-import -r -f /path/to/tempdir/urchinconfig.txt

Upon importing, uconf-import will also do a certain amount of sanity checking to ensure the critical records are in the proper order and that the proper record cross-references are in place.
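The regular save described in the scenario above could be wrapped in a small function that keeps one timestamped file per run. This is a sketch under stated assumptions: the UCONF_EXPORT and BACKUP_DIR defaults are the placeholder paths used in this section, save_urchin_config is a hypothetical name, and it is not documented here whether uconf-export returns a non-zero exit status on failure, so the function also checks that the exported file is not empty.

```shell
#!/bin/sh
# Placeholder locations; adjust to the actual installation.
UCONF_EXPORT=${UCONF_EXPORT:-/path/to/urchin/util/uconf-export}
BACKUP_DIR=${BACKUP_DIR:-/path/to/backupdir}

# save_urchin_config
# Exports the configuration to a timestamped file in BACKUP_DIR and prints
# the name of the file that was written.
save_urchin_config() {
    stamp=$(date +%Y%m%d%H%M%S)
    target="$BACKUP_DIR/urchin_saved_config.$stamp.txt"
    # Export under a temporary name first so that a failed run never leaves
    # behind a truncated file that looks like a good backup.
    if "$UCONF_EXPORT" -f "$target.tmp" && [ -s "$target.tmp" ]; then
        mv "$target.tmp" "$target"
        printf '%s\n' "$target"
    else
        rm -f "$target.tmp"
        echo "save_urchin_config: export failed" >&2
        return 1
    fi
}
```

The function could be called from cron or another scheduler; restoring from one of the saved files uses the uconf-import -r -f command shown above.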

Considerations

When a ct_password directive for a Logfile or User record is imported, the supplied password must be in cleartext format or already encrypted in Urchin's proprietary encryption format. The utility will automatically encrypt cleartext passwords as it stores them in the configuration database. To allow Urchin configuration data to be moved among different operating system platforms, Urchin uses an internal encryption format that is not compatible with standard password formats such as those generated by crypt on UNIX-type systems.

The "-r" argument should be used with caution. When using this mode, you must ensure that the input data is both comprehensive and in the correct format, since the Urchin configuration is being completely overwritten by the data supplied to uconf-import. It is safest to use "-r" importing functionality only with configuration data created by the uconf-export utility.

See Also:

Script-based Configuration Management in the Integration section of the Advanced Topics area of the Urchin documentation provides a general overview of using a script-based, unattended/automated approach to managing Urchin's configuration.

Configuration Table and Directive List in the Reference area of the Urchin documentation. This document has a complete list of all the valid tables and directives that may be used with the uconf-import utility.

uconf-schedule: Global Scheduling Utility

Overview

The Global Scheduler utility, uconf-schedule, provides a means of scheduling Urchin 5 tasks to run immediately or at some regularly scheduled interval without having to make individual scheduling changes to each Urchin Profile.

The types of scheduling operations that uconf-schedule can perform are:


• schedule all profiles to run immediately
• schedule all profiles to run at a particular time and/or frequency
• disable scheduling for all profiles

The Global Scheduling Utility is one of the Urchin 5 utilities that is typically used in an automated/scripted environment. For a synopsis of the entire suite of scripting utilities and an in-depth description of their usage, please see the Script-based Configuration Management document in the Integration section of the Advanced Topics area on http://help.urchin.com.

Usage

uconf-schedule is located in the util directory of the Urchin 5 distribution.

Usage of the utility is as follows:

uconf-schedule [-v] (prints version and exits)
uconf-schedule [-k] [-r]

In the default mode, uconf-schedule will interactively prompt for the type of global scheduling operation to apply, and any other parameters necessary (such as the time of day, date, etc.). If the "-k" option is supplied, the scheduling is done without modifying the existing schedule for the profiles. If the "-r" option is specified, the utility will simply schedule all profiles to run immediately, with no further input required. Running "uconf-schedule -k -r" is a useful way of scheduling all profiles to run immediately without changing the normally scheduled run times for the Profiles.

Considerations

The uconf-schedule utility does not actually run jobs itself. It merely applies a global scheduling change to all profiles, which in turn is picked up by the Urchin Scheduler (urchind). Therefore, the Urchin Scheduler must be running in order for events scheduled by uconf-schedule to be invoked, including Run-Now events.

udb-sanitizer: Database Maintenance Utility

Overview

The Urchin Database Maintenance Utility, udb-sanitizer, provides a means of checking the Urchin 5 profile databases and performing various maintenance operations on these databases.

The types of operations that udb-sanitizer can perform are:

• Check the integrity of Urchin monthly profile databases
• Rebuild Urchin monthly profile database headers and indexes
• Roll back databases to a previous saved backup state
• Delete profile data for a day, multiple days, or an entire month


Usage

udb-sanitizer is located in the util directory of the Urchin 5 distribution.

Usage of the utility is as follows:

udb-sanitizer [-h] (prints usage message and exits)
udb-sanitizer [-v] (prints version and exits)
udb-sanitizer -p profile [-d YYYYMM[DD]] [-bfhiprqx] [-z [-e DD]]

where:

-b  go directly to rollback option
-d  specifies year-month and optionally the day to operate on
-e  with z and d options, zero multiple days (range d->e) in same month
-f  force action to occur without confirmation
-h  print this help information
-i  go directly to rebuild-index option
-p  specifies name of profile (required)
-r  go directly to remove option
-q  quiet mode, suppress output except for critical user confirmation
-x  go directly to rebuild-header option
-z  go directly to zero-day option

Note: When udb-sanitizer is called with options that do not completely describe what action to take, it will prompt the user as needed for additional input. You can cause an action to be performed without any user interaction by using the "-d" option in conjunction with any of the -b, -i, -r, -x, or -z options.

Operation

In normal operation, udb-sanitizer is invoked from a command shell and interactively prompts the user for the actions to take. For each month of Urchin reporting data that the utility finds, it will present the following interactive menu:

Options:
 1. Rollback data to state before last run
 2. Delete this month entirely
 3. Rebuild header to match data
 4. Rebuild indexes
 5. Zero out one or more days
Please choose 1-5 or press return to do nothing:

If no action for the currently selected month is desired, pressing the Enter/Return key will cause the utility to move forward to the next chronological month where data is present and present the same menu choices.

Actions associated with the options presented above are:

1. Data rollback
The utility will revert all reporting data for a profile to that contained in a ZIP archive. The user is presented with a list of ZIP archive backups to choose from. The ZIP archives are named with the following convention "YYYYMM-backup-YYYYMMDDHHMMSS.zip", where the first YYYYMM refers to the month of data being backed up (e.g. 200309 refers to September 2003), and the YYYYMMDDHHMMSS portion is the timestamp of when the ZIP archive was created. This timestamp should be helpful in determining which ZIP archive you want to roll back to. Please note that there is no way to invoke udb-sanitizer to do a rollback based solely on command line arguments; it will always prompt for the ZIP archive to roll back to if any exist. If no ZIP backup archives exist, the utility prints a diagnostic to that effect and exits.

2. Delete monthly data
All data for a particular profile for the specified month is removed. This option is useful for zeroing out the statistics for a month if the data is incorrect, e.g. the wrong filters were applied or the wrong logs were processed; or perhaps some of the advanced profile parameters were changed, such as the click path depth or referral level, and it is desirable to update that month's Urchin reporting data to reflect the change. This action can be performed without user interaction by invoking udb-sanitizer with the "-f", "-r" and "-d" arguments, e.g.

udb-sanitizer -f -r -d 200309 -p mysite.com

3. Rebuild database headers
This causes the utility to read the Urchin database tables directly and rebuild the database headers based on the data found. This should only be done if udb-sanitizer finds a discrepancy between the headers and the data. WARNING: if the database headers do not match the data, this is typically indicative of some type of database corruption; in this case, the prudent course of action is to completely remove the data for that month and reprocess the corresponding webserver logs. This may not be possible for various reasons, so rebuilding the headers may be the only way to resuscitate the databases so that the Urchin log processing and reporting engines are able to work with them, but this is not guaranteed to fix corruption. This action can be performed without user interaction by invoking udb-sanitizer with the "-f", "-x" and "-d" arguments, e.g.

udb-sanitizer -f -x -d 200309 -p mysite.com

4. Rebuild database indexes
This causes the utility to read the Urchin database tables directly and rebuild the database indexes based on the data found. This should only be done if udb-sanitizer finds a discrepancy between the headers and the data. NOTE: the same warning given about corruption in the database headers applies to this option as well. This action can be performed without user interaction by invoking udb-sanitizer with the "-f", "-i" and "-d" arguments, e.g.

udb-sanitizer -f -i -d 200309 -p mysite.com

5. Zero data for one or more days
This option allows data for selected days within the month to be zeroed out, thereby allowing Urchin log processing to be rerun for those days only (e.g. urchin -p profile -d YYYYMMDD). This action can be performed without user interaction to zero out a single day by invoking udb-sanitizer with the "-f", "-z" and "-d" arguments, e.g.

udb-sanitizer -f -z -d 20030907 -p mysite.com

and for multiple days by including the "-e" argument as well to specify an end date, e.g.

udb-sanitizer -f -z -d 20030907 -e 10 -p mysite.com

which will zero out data for September 7th through the 10th. This is more efficient than invoking multiple instances of udb-sanitizer to zero out a single day at a time, as the database indexes and headers are checked only once. The index/header checking operation can require a noticeable amount of time on profiles with a lot of data.
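Since zeroed data cannot be recovered, a script that zeroes a range of days may want to validate its arguments and show the command before running it. The sketch below only builds and prints the command line in the form used in the examples above; zero_days_cmd is a hypothetical helper, and the requirement that the end day not precede the start day is an assumption drawn from the "range d->e in same month" description of the "-e" option.

```shell
#!/bin/sh
# zero_days_cmd PROFILE YYYYMMDD [DD]
# Prints the udb-sanitizer command that zeroes a single day, or the range
# from the start date through end day DD of the same month.
zero_days_cmd() {
    profile=$1
    start=$2
    endday=${3:-}

    case "$start" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "zero_days_cmd: start date must be YYYYMMDD" >&2; return 1 ;;
    esac
    cmd="udb-sanitizer -f -z -d $start"

    if [ -n "$endday" ]; then
        case "$endday" in
            [0-9][0-9]) ;;
            *) echo "zero_days_cmd: end day must be DD" >&2; return 1 ;;
        esac
        # Day-of-month is the last two digits of the start date.
        startday=${start#??????}
        # Drop one leading zero so both values compare as plain decimals.
        if [ "${endday#0}" -lt "${startday#0}" ]; then
            echo "zero_days_cmd: end day is before start day" >&2
            return 1
        fi
        cmd="$cmd -e $endday"
    fi

    # Profile names containing spaces would need quoting by the caller.
    printf '%s -p %s\n' "$cmd" "$profile"
}
```

Printing the command rather than executing it lets an operator review the exact dates before any data is removed.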

Considerations


Invoking udb-sanitizer without specific dates on profiles with a lot of historical data can be time consuming, as the utility must open up the databases for each month, perform sanity checks, and then present the menu of actions.

Actions that delete daily or monthly data cannot be undone! The only recourse is to reprocess the webserver logs for that time period to repopulate the profile databases. Use these options with care.
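The backup archive naming convention described under Data rollback ("YYYYMM-backup-YYYYMMDDHHMMSS.zip") is regular enough to be parsed by a script, for example to report the most recent archive for a month before starting an interactive rollback. The following sketch works purely on file names; parse_backup_name and latest_backup are hypothetical helpers, and the directory in which Urchin stores these archives is not stated in this section, so the caller must supply the file names.

```shell
#!/bin/sh
# parse_backup_name FILE
# Prints "YYYYMM YYYYMMDDHHMMSS" for a correctly named archive, or returns 1.
parse_backup_name() {
    base=${1##*/}
    case "$base" in
        [0-9][0-9][0-9][0-9][0-9][0-9]-backup-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].zip) ;;
        *) return 1 ;;
    esac
    month=${base%%-*}       # data month, before the first hyphen
    stamp=${base##*-}       # creation timestamp, after the last hyphen
    stamp=${stamp%.zip}
    printf '%s %s\n' "$month" "$stamp"
}

# latest_backup YYYYMM FILE...
# Prints the archive for the given month with the newest creation timestamp.
latest_backup() {
    want=$1
    shift
    for f in "$@"; do
        parsed=$(parse_backup_name "$f") || continue
        if [ "${parsed% *}" = "$want" ]; then
            printf '%s %s\n' "${parsed#* }" "$f"
        fi
    done | sort | tail -n 1 | cut -d' ' -f2-
}
```

Because the timestamps have a fixed width, sorting them as text yields chronological order, which is why a plain sort is sufficient here.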

urchinctl: Urchin Services Control Utility

Overview

The Urchin Services Control utility, urchinctl, provides a means of starting and stopping the Urchin Scheduler and Urchin Webserver services. On UNIX-type systems, urchinctl is typically called from one of the system's boot-time scripts to automatically start up or shut down Urchin services.

The types of operations that urchinctl can perform are:

• Start, stop, or restart the scheduler or webserver (or both)
• Start the webserver on an alternate port
• Start the webserver with SSL encryption

Usage

urchinctl is located in the bin directory of the Urchin distribution.

Usage of the utility is as follows:

urchinctl [-h]    (prints usage message and exits)
urchinctl [-v]    (prints version and exits)
urchinctl [-e] [-p port] [-s | -w] action

where:

-e    activates encryption (SSL) in the webserver
-p    specifies the port for the webserver to listen on
-s    performs the action on the Urchin scheduler ONLY
-w    performs the action on the Urchin webserver ONLY

and action is one of:

start      (starts the service(s))
stop       (stops the service(s))
restart    (stops and then starts the service(s))
status     (displays webserver/scheduler runtime status)


By default, the action is performed on both the webserver and the scheduler unless the "-s" or "-w" command line arguments are specified. Note that these arguments are mutually exclusive.

Considerations

On UNIX-type systems, urchinctl should be run as the user/UID that Urchin is installed as to ensure that the urchinwebd and urchind processes are started as that UID.

Starting up the Urchin webserver with SSL encryption initially requires additional configuration steps. Please see the document titled Activating SSL on the Urchin Webserver in the Security Features section of the Advanced Topics area of the Urchin Documentation Library.
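A boot-time script that runs urchinctl as the Urchin user might be built around something like the sketch below. It is not the startup script supplied with Urchin; the install path /usr/local/urchin and the account name urchin are assumed values, and urchinctl_command is a hypothetical helper that only prints the command line.

```shell
#!/bin/sh
# Hypothetical boot-time wrapper around urchinctl (a sketch, not the script
# shipped with Urchin). URCHIN_HOME and URCHIN_USER are assumed values.
URCHIN_HOME=${URCHIN_HOME:-/usr/local/urchin}
URCHIN_USER=${URCHIN_USER:-urchin}

# Build the command line that runs urchinctl as the Urchin user, so that
# urchinwebd and urchind are started under that UID.
urchinctl_command() {
    case $1 in
        start|stop|restart|status)
            echo "su $URCHIN_USER -c \"$URCHIN_HOME/bin/urchinctl $1\""
            ;;
        *)
            echo "usage: urchinctl_command {start|stop|restart|status}" >&2
            return 1
            ;;
    esac
}

# Print the command for inspection; a real boot script would execute it,
# e.g. with: eval "$(urchinctl_command start)"
urchinctl_command status
```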

urchin: Urchin Log Processing Engine

Overview

The Urchin Log Processing Engine, urchin, is the core log processing component of Urchin. Ordinarily, the log processing engine is invoked from the Urchin Scheduler (urchind) when a task is run. However, it is possible to execute urchin directly from a command shell to run a specific profile. This is useful in highly scripted environments where Urchin tasks are run from an external source such as the Windows Task Scheduler or cron on UNIX-type systems. It is also useful for running a profile under special circumstances, such as to process only hits for a particular day, or to do some type of debugging.

urchin is not truly a utility; it is documented here because it possesses some limited command-line capabilities that may prove useful in certain environments.

Usage

urchin is located in the bin directory of the Urchin distribution.

Usage of the utility is as follows:

urchin [-h]    (prints usage message and exits)
urchin [-v]    (prints version and exits)
urchin [-DHt] [-d YYYYMMDD] -p profile

where:

-D    runs urchin in debug mode
-H    causes the runtime output of urchin to be logged to the standard history directory for the profile
-t    runs urchin in test mode only to output runtime parameters
-d    specifies that Urchin should only process hits from logs that match the specified date (YYYYMMDD format)
-p    specifies the profile to run


Considerations

On UNIX-type systems, urchin should be run as the user/UID that Urchin is installed as to ensure that the databases for the profile are owned by that UID, since urchin will create them if they do not already exist.
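For the cron-driven case mentioned in the overview, a wrapper along the following lines could run several profiles in turn. This is only a sketch: process_profiles, URCHIN_BIN and DRY_RUN are illustrative names, the install path is assumed, and the error handling assumes that urchin returns a non-zero exit status on failure, which this guide does not state.

```shell
#!/bin/sh
# Hypothetical cron wrapper: run the log processing engine for each profile
# named on the command line, logging to the profile's history directory (-H).
# URCHIN_BIN is an assumed install location; DRY_RUN=1 only prints commands.
URCHIN_BIN=${URCHIN_BIN:-/usr/local/urchin/bin}
DRY_RUN=${DRY_RUN:-1}

process_profiles() {
    failed=0
    for profile in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "$URCHIN_BIN/urchin -H -p $profile"
        elif ! "$URCHIN_BIN/urchin" -H -p "$profile"; then
            # Assumes a non-zero exit status signals failure.
            echo "urchin failed for profile $profile" >&2
            failed=1
        fi
    done
    return "$failed"
}

process_profiles mysite.com shop.mysite.com
```

Run the wrapper from the crontab of the user that Urchin is installed as, so that any databases it creates are owned by that UID.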

Integration

NFS locking requirement

Urchin configuration, which uses data files located in the 'data/conf' folder of the installation, will set read and write locks during setup and administration. If the data/conf folder is mounted over NFS, it is required that the NFS server is running the appropriate locking daemons to handle remote locking.

If the locking does not work properly, the Urchin installation may hang indefinitely.

Technical Notes: Urchin uses the fcntl() function on UNIX to perform advisory and exclusive (read/write) locking. Some older systems may not support all of the rpc locking requests of different platforms. Be sure you are running recent versions of your OS with the latest patches. Make sure that the NFS server is running 'statd' and 'lockd' or equivalent.
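One way to check whether an NFS server has its locking daemons registered is to inspect the output of rpcinfo -p, where lockd normally appears as the service "nlockmgr" and statd as "status". The helper below is a sketch that reads such a listing on standard input; has_nfs_locking is a hypothetical name and nfs-server is a placeholder hostname.

```shell
#!/bin/sh
# Sketch: succeed only if an `rpcinfo -p` listing (read from stdin) shows both
# the lock manager (nlockmgr, i.e. lockd) and the status monitor (status,
# i.e. statd). The service name is the last column of each line.
has_nfs_locking() {
    awk '$NF == "nlockmgr" { lockd = 1 }
         $NF == "status"   { statd = 1 }
         END { exit !(lockd && statd) }'
}

# Typical use against a real server (nfs-server is a placeholder):
#   rpcinfo -p nfs-server | has_nfs_locking && echo "locking daemons registered"
```

A registered service is a necessary condition, not proof that remote locking works end to end; firewalls between client and server can still block lock requests.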

Overview of Urchin Integration Capabilities

Introduction
The Urchin 5 front end is built on modular components and industry standard techniques to allow administrators to integrate Urchin report access into their existing or planned infrastructures. As part of our commitment to hosting providers and data centers, we test and support three integration points and six different functions which should allow Urchin to integrate into most architectures.

The following diagram illustrates the primary components of the Urchin front-end and the three integration points. Both administrative and reporting functions are web-based and delivered from this system.


All content from the Urchin system is delivered via an Apache server that is installed as "urchinwebd", shown on the left side of the diagram. Requests are handled by the Apache server and passed using the CGI interface to a Session Controller application (session.cgi). The Session Controller performs authentication and, depending on the action, passes the request on to either the Admin Engine (admin) or the Urchin Report Engine (urchin.cgi). Content with embedded session identifiers is passed back to the user.

The three points labeled "A", "B", and "C" are the three integration points mentioned previously. Point "A" allows the administrator to either replace or bypass the Urchin web server. Point "B" allows for external or no authentication. And point "C" allows for direct access to the reporting via a wrapper or portal.

Point A: Web Server Integration
Many hosting companies will already have a running web server or have a specially compiled web server that runs within their systems. As long as a web server is providing basic CGI operations, you may replace the Apache binary or use an existing web server to provide access to Urchin.

This integration point keeps the rest of the Urchin system intact. The Urchin interface will be used for administration, authentication, and report delivery. Users and Profiles will need to be configured within the Urchin system so that when a user enters the system, Urchin knows which reports to allow them access to. For complete details on the requirements of a replaced binary or an example of using an existing web server, please see the appropriate document for the type of webserver you are running under the Integration section of the Advanced Topics area of the Urchin Documentation Center at http://help.urchin.com.

Point B: External Authentication or Authentication Bypass
Complex hosting environments will often have existing centralized controls for providing user authentication such as LDAP. With a simple configuration change, the Session Controller can call an authentication routine of your choice. A simple interface is provided for returning successful or failed logins. Existing authentication routines can be easily wrapped to provide the correct framework for both systems. Using integration at point "B", it is also possible to bypass the authentication with a dummy routine and link directly to the landing page.


By using integration at point "B", the overall Urchin system remains intact with the exception that authentication is performed by an external application. Users and Profiles will need to be configured within the Urchin system so that when a user enters the system, Urchin knows which reports to allow them access to. For complete details on the interface for using external authentication and an example of how to bypass authentication, please see the associated Integration document entitled Using External Authentication or Authentication Bypass.

Point C: Link Directly to Reports from Wrapper or Portal
Many of our customers have a one-to-one relation between a customer and a website, and wish to link to reporting for a particular website directly from the website's administration area. This area is typically already authenticated. Point C integration makes it easy to link directly to reporting either via a wrapper script or from an existing portal. In this scenario, the Urchin authentication and initial report selection screen are bypassed as users are taken directly to the Urchin reports.

By using integration at point "C", it is assumed that the service provider has control over who gets to view what, either by a one-to-one relation or by an existing configuration database. Urchin will still need to be configured for generating Profile reports, but this can be automated within or outside of the administrative interface. Urchin provides the capability to link directly to an Urchin report with a specific URL when "Direct Report Linking" is enabled in the administration interface. Access to the Urchin report is controlled via a ".report.conf" file embedded in the directory specified by this URL.

For a complex portal integration, the Urchin reporting engine will propagate session and other portal variables in order to keep the session operating. For details on how to use a wrapper or portal to access Urchin reporting directly, and proper configuration of the ".report.conf" file, please see the Integration document entitled Linking Directly to Urchin Reports.

Changing the Location of the Urchin Data Directory

Overview

For simplicity, the default Urchin 5 installation is contained in a single umbrella directory that contains all Urchin applications and utilities, library files, documentation, etc. All data processed by Urchin is also stored in this installation directory under the data sub-directory.

In many cases, it is desirable to configure Urchin to store its data on another disk partition or drive that is better suited to dynamic data and allows for greater storage capacity. Urchin can easily be configured to do this.

Procedure

The location where Urchin stores its report data can be changed in the urchin.conf file, which is located in the etc directory of the main Urchin installation directory.


For UNIX-type systems:

1. Open a command shell as the user that Urchin was installed as
2. cd to the directory where Urchin is installed
3. Stop the Urchin services with the command: bin/urchinctl stop
4. Using a text editor, open the urchin.conf file in the etc directory of the Urchin distribution
5. Uncomment the following line by removing the leading '#' character:
   #dataDirectory: ./data/
   and substitute the full directory path you want to use instead of the default directory, e.g.
   dataDirectory: /bigdisk/urchin/data/
6. If necessary, create the new data directory. Important note: this directory must be readable and writable by the user Urchin runs as.
7. Copy the existing data to the new location. To ensure that all permissions and data are properly preserved, it is recommended that you use the following command:
   cd data; tar cf - . | (cd /bigdisk/urchin/data; tar xpf -)
8. Rename the data directory to data.old. You can remove it completely if you wish, but you may want to ensure that everything is working properly before doing this.
9. For ease of administration, you may want to create a symlink in the main Urchin directory pointing at this new location, e.g.
   ln -s /bigdisk/urchin/data ./data
10. Restart the Urchin services with the command: bin/urchinctl start
11. Log in to Urchin as the admin user. You should be presented with the License Urchin screen. Simply click on the Reactivate License link and you are finished.
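The command-line portion of the UNIX procedure (steps 3 through 10) can be collected into a reviewable plan, as in the sketch below. print_move_plan is a hypothetical helper that only prints commands; the sed expression assumes the commented line reads exactly "#dataDirectory: ./data/" as shown in step 5, and the tar step uses subshells so the working directory stays at the Urchin installation root.

```shell
#!/bin/sh
# Sketch: print the commands for relocating the Urchin data directory so they
# can be reviewed before being run from the Urchin installation directory.
# Paths containing spaces are not handled.
print_move_plan() {
    new_data=$1
    cat <<EOF
bin/urchinctl stop
sed -i.bak 's|^#dataDirectory:.*|dataDirectory: $new_data/|' etc/urchin.conf
mkdir -p $new_data
(cd data && tar cf - .) | (cd $new_data && tar xpf -)
mv data data.old
ln -s $new_data ./data
bin/urchinctl start
EOF
}

print_move_plan /bigdisk/urchin/data
```

The final step, reactivating the license from the admin interface, still has to be done by hand.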

For Windows systems:

1. Stop the Urchin Services: Start->Programs->Urchin->Disable Urchin Services
2. Open Windows Explorer and navigate to the etc folder of the Urchin distribution. By default, this is C:\Program Files\Urchin\etc
3. Using a text editor, open the urchin.conf file
4. Uncomment the following line by removing the leading '#' character:
   #dataDirectory: ./data/
   and substitute the drive letter and full pathname you want to use instead of the default folder, e.g.
   dataDirectory: E:\Urchin\data
5. If necessary, create the new data folder. Important note: the permissions on this folder must allow read and write access to the Urchin service.
6. Copy the contents of the data folder to the new folder.
7. Rename the data folder to data.old. You can remove it completely if you wish, but you may want to ensure that everything is working properly before doing this.
8. Restart the Urchin Services: Start->Programs->Urchin->Enable Urchin Services

Considerations

Due to the way licensing is implemented on UNIX-type systems to prevent tampering, moving Urchin's data directory will require the Urchin license to be reactivated. However, the relicense operation is extremely simple: you merely need to log in as the Urchin admin user and click Reactivate License.

Using an Existing Apache Webserver (UNIX−type Platforms)

By default, Urchin 5 administration and reporting are done using a standalone Apache server that is bundled with the Urchin product. In the vast majority of Urchin installations, this is the preferred method for delivering Urchin admin and reporting interfaces. However, in rare instances it may be necessary to utilize an existing Apache installation. This may be due to site requirements that a localized version of Apache be used throughout the organization, or that all web services be controlled via a single Apache configuration. The information below describes two different models that can be employed to meet these requirements.

DISCLAIMER: These modifications to the Urchin installation are unsupported and fall outside the scope of the standard Urchin free and paid support plans. Any assistance rendered to set up or debug these configurations will be done at Urchin Software Corporation's standard Hourly Support rate.

Option 1: Utilizing an existing site-specific Apache httpd binary to run Urchin services as a separate instance

Side effects:

• Urchin upgrades that depend on features added and/or configuration changes to the bundled Apache may not work properly if the existing Apache binary doesn't support these changes/features.

Configuration changes

Ensure that your httpd includes support for the following modules:

mod_access
mod_cgi
mod_dir
mod_mime

Install Urchin 5 in the normal fashion, choosing the desired port for Urchin's admin and reporting interfaces to run on.

Once Urchin is installed, do the following:

cd /path/to/urchin/bin
./urchinctl stop
mv urchinwebd urchinwebd.orig
ln -s /path/to/your/httpd urchinwebd
./urchinctl start

This will start up a separate instance of Apache that uses your Apache binary, but runs independently from your standard web services.
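Before swapping in a site-specific httpd binary, the required modules can be checked against the output of httpd -l, which lists the modules compiled into an Apache binary. The sketch below reads that listing on standard input and prints any of the four modules that are absent; missing_modules is a hypothetical helper name. Modules loaded dynamically through LoadModule do not appear in httpd -l, so an empty result here is the only conclusive outcome.

```shell
#!/bin/sh
# Sketch: print each required module that is NOT present in an `httpd -l`
# listing read from stdin. Compiled-in modules appear as e.g. "  mod_cgi.c".
missing_modules() {
    listing=$(cat)
    for mod in mod_access mod_cgi mod_dir mod_mime; do
        printf '%s\n' "$listing" | grep -q "$mod\.c" || echo "$mod"
    done
}

# usage:
#   /path/to/your/httpd -l | missing_modules
```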

Option 2: Running Urchin services from an existing Apache configuration


Side effects:

• The entire Urchin distribution must be owned by the same UID that httpd runs as.
• The admin GUI cannot be used to change the port Urchin runs on.
• Use the bin/urchinctl command with the -s argument exclusively to start/stop only the Urchin scheduler (urchind)

Configuration changes

Add the following lines to your existing httpd.conf file. You will need to supply the IP address for your server, the port number for Urchin to use, as well as the path to the location where you've installed Urchin 5.

## Support for Urchin administration and reporting services

Listen [port#]
<VirtualHost [server-ip]:[port#]>
ErrorLog /path/to/urchin-distribution/var/error.log
DocumentRoot /path/to/urchin-distribution/htdocs/
<Directory /path/to/urchin-distribution/htdocs/>
    AddHandler cgi-script .cgi
    Options ExecCGI
    DirectoryIndex session.cgi
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
</VirtualHost>

Once these configuration changes have been made, perform the following tasks:

• Change the ownership of the Urchin distribution to the UID that your Apache webserver runs as:

chown -R apache-user /path/to/urchin

• Ensure that the Apache bundled with Urchin is stopped:

cd /path/to/urchin/bin
./urchinctl -w stop

• Restart your Apache server to enable Urchin reporting and administration
• Edit your Urchin boot-time startup script(s) and replace any instances of

urchinctl start

with

urchinctl -s start

This will cause only the Urchin scheduler to be run at boot time, rather than both the scheduler and Apache server.
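The startup-script change can be made with a one-line sed substitution. The sketch below wraps it in a filter that reads the script on standard input and writes the modified copy to standard output, so the result can be reviewed before it replaces the original; scheduler_only is a hypothetical name and the init script path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch: rewrite "urchinctl start" to "urchinctl -s start" so that only the
# Urchin scheduler is started at boot. Other urchinctl actions are untouched.
scheduler_only() {
    sed 's/urchinctl start/urchinctl -s start/g'
}

# usage (placeholder path):
#   scheduler_only < /etc/init.d/urchin > /tmp/urchin.new
```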


Using an Existing IIS Webserver (Windows Platforms)

By default, Urchin 5 administration and reporting are done using a standalone Apache server that is bundled with the Urchin product. In the vast majority of Urchin installations, this is the preferred method for delivering Urchin admin and reporting interfaces. However, in rare instances it may be necessary to utilize an existing IIS webserver. This may be due to site requirements that disallow the use of a third party webserver product on the server, or the need to set up Urchin reporting as a virtual host on an existing IIS server.

DISCLAIMER: These modifications to the Urchin installation are unsupported and fall outside the scope of the standard Urchin free and paid support plans. Any assistance rendered to set up or debug these configurations will be done at Urchin Software Corporation's standard Hourly Support rate.

Procedure

Note: this procedure assumes that Urchin has been installed in the default location of C:\Program Files\Urchin. If you have installed Urchin elsewhere, please be sure to substitute the proper location in the example below.

Step 1: Create a new user for the Urchin web interface

1. Go to "Administrative Tools" -> "Computer Management"
2. On the left hand side of the Computer Management screen, click "Local Users and Groups"
3. Right-click on the Users folder and select "New User..."
4. Enter "IUSR_URCHIN" in the "User name:" field
5. Uncheck the "User must change password at next logon" box
6. Check the "User cannot change password" box
7. Click "Create" and then "Close"
8. Double click on the "Users" folder on the left
9. Right-click on the IUSR_URCHIN on the right and select "Properties..."
10. Under the "Member of" tab, remove all existing entries, and then click the "Add.." button and choose "Guests" in the popup window, and click "Add" again, then click "OK"
11. Click "Apply" and "Close" to save your changes

Step 2: Install Urchin (if not already done)

Step 3: Disable the Urchin Apache web server

1. Go to "Administrative Tools" -> "Services"
2. Under "Services" find the "Urchin Webserver" record
3. Right click on "Urchin Webserver" and select "Stop"
4. Right click on "Urchin Webserver" and select "Properties", then change the "Startup type" to "Disabled"
5. Click "OK"


Step 4: Add a new web site to IIS

1. Go to "Administrative Tools" -> "Internet Services Manager"
2. Right-click on your server's name and select "New" and then "Web Site"
3. In the "Description:" field type "Urchin" and click Next
4. Select the IP address and port number (typically 9999) and click Next
5. In the "Path:" field browse to the location where Urchin is installed (typically C:\Program Files\Urchin\htdocs) and click Next
6. Add a check mark in "Execute:" and click Next and then Finish
7. Right-click on the new Urchin web site and go to "Properties"
   a. Under the "Web Site" tab, un-check "Enable Logging"
   b. Click on the "Home Directory" tab and check "Script source access"
   c. Click on the "Documents" tab and Remove both Default entries, then click Add and enter "session.cgi" in the popup window, then click OK.
   d. Click on the "Directory Security" tab, and then click Edit in the "Anonymous access and authentication control" area. Ensure that the "Anonymous access" box is checked, then click Edit... to change the "Account used for anonymous access". In the pop-up window, select "IUSR_URCHIN" for the Username. Click OK and then OK again to get back to the Properties window.
   e. Click OK to save your changes and exit the Properties window

Step 5: Set up directory permissions

1. Right click on "Start" and select "Explore"
2. Navigate to the location where Urchin is installed (typically C:\Program Files\Urchin)
3. Right click on the "Urchin" folder and select "Properties"
4. Click on the "Security" tab
5. Un-check "Allow inheritable permissions from parent to propagate to this object" and then click "Remove" in the pop-up window.
6. Click "Add" and select the Administrator user and then click "Add"
7. Click "Add" and select IUSR_URCHIN and then click "Add"
8. In the "Name:" field ensure that only the Administrator and IUSR_URCHIN entries are there
9. Ensure that only the following permissions are allowed for both the Administrator and IUSR_URCHIN users:
   • Read & Execute
   • List Folder Contents
   • Read
10. Click OK to save the permissions.
11. Click into the "Urchin" folder in the Windows Explorer window.
12. Right click on the "data" folder and select "Properties"
13. Click on the "Security" tab.
14. Un-check "Allow inheritable permissions from parent to propagate to this object" and then click "Remove" in the pop-up window.
15. Click "Add" and select the Administrator user and then click "Add"
16. Click "Add" and select IUSR_URCHIN and then click "Add"
17. In the "Name:" field ensure that only the Administrator and IUSR_URCHIN entries are there
18. Ensure that the following permissions are allowed for both the Administrator and IUSR_URCHIN users:
   • Full Control
   • Modify
   • Read & Execute
   • List Folder Contents
   • Read
   • Write
19. Click "OK"

Step 6: IIS 6 only

1. Go to the "Web Service Extensions Manager"
2. Click on 'Add new web service extension..'
3. Enter "Urchin CGI" in the "Extension Name:" field
4. Click 'Add'
5. Browse to and highlight report.cgi and session.cgi
   • Default location: C:\Program Files\Urchin\htdocs
6. Check "Set Extension Status to Allow"
7. Click "OK"
8. Go to the main IIS entry for your server: "Hostname"
9. Right click and select "Properties"
10. Click on "Mime Types"
11. Click "New"
12. Enter ".cgi" in the "Extension:" field.
13. Enter "application/octet-stream" in the "Mime Type" field.
14. Click "OK"
15. Click "OK"

Your IIS webserver should now be set up to call the Urchin web interface if you connect to it using the URL of http://my.server.com:port, where port is what you set Urchin to use when you installed it (default is 9999).

Using External Authentication or Authentication Bypass

Overview
By default, Urchin authentication is performed when the Urchin Session Controller (session.cgi) calls the “auth” binary located in the “bin” directory of the Urchin Installation. This binary queries the configuration database and compares the username and password provided with that stored in the configuration. An exit code signifying either success or failure is returned to the Session Controller. The location of the authentication binary can be controlled with a configuration change. This modular design allows administrators to call an external authentication program instead of the default “auth” binary.


As shown in the above diagram, this external authentication program could perform any desired authentication function including LDAP and other database calls. As long as the program is executable by the Urchin user and conforms to the input/output requirements, Urchin can be easily modified to use a different form of authentication.

Specifying the Authentication Routine
To configure which authentication routine the Session Controller calls, edit the “etc/session.conf” file located in the Urchin Installation. This file contains configurable parameters that control the behavior of the Session Controller including which routine to call for authentication. Edit the line:

AUTHENTICATION: ../bin/auth

Replace the “../bin/auth” with the path to your authentication routine. Be sure that the authentication routine is executable by the same user that urchinwebd (Urchin’s Apache web server) is running as.

Input/Output Requirements
When the Session Controller calls the authentication routine, it will pass the username, password, and the remote IP address of the user as command line arguments, such that:

argv[1] = username
argv[2] = password
argv[3] = remote_addr

The external authentication routine could choose to ignore any or all of these parameters, but typical authentication routines will at least look at the first two. After performing any desired authentication, the routine should exit with a code equal to zero for success and minus one for failure.

Exit Code
0 = successful authentication
-1 = authentication failed

The above authentication interface allows administrators to easily customize their own routines for validating user logins.
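As an illustration of this interface, an external routine could be as small as the sketch below. It is not an Urchin component: the flat file of "username:password" lines, the CREDENTIALS_FILE variable and the authenticate function are assumptions made purely for the example, and a production routine would normally consult LDAP or another directory instead of a plain-text file. Because a process exit status is limited to the range 0-255, the "-1" failure code is reported as 255.

```shell
#!/bin/sh
# Hypothetical external authentication routine (illustration only).
# Looks up "username:password" lines in a flat file; CREDENTIALS_FILE and its
# plain-text format are assumptions for this sketch, not an Urchin feature.
CREDENTIALS_FILE=${CREDENTIALS_FILE:-/path/to/credentials.txt}

authenticate() {
    username=$1 password=$2 remote_addr=$3   # remote_addr is unused here
    [ -n "$username" ] || return 255
    # -F fixed string, -x whole-line match, -q quiet
    grep -Fxq -- "$username:$password" "$CREDENTIALS_FILE" 2>/dev/null && return 0
    # Exit statuses are 0-255, so the manual's "-1" is reported as 255.
    return 255
}

# In a standalone script the last line would be:
#   authenticate "$1" "$2" "$3"; exit $?
```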

Bypassing Authentication
Using the above techniques, the Urchin authentication can be purposefully bypassed. In the case where a hosting provider wants to use the entire Urchin System for controlling users and groups, but has already authenticated the user by the time the user arrives at Urchin, bypassing the authentication is an option to avoid a double login. As long as the host can guarantee that access to the Urchin System is controlled from an authenticating portal and that the username cannot be tampered with, the host can bypass authentication using the following technique.

To bypass the authentication, create a dummy external authentication routine that always exits with a zero. For example, Perl code might look like:

#!/usr/bin/perl
exit(0);


Point the Session Controller at this dummy authentication routine by editing the “etc/session.conf” file as described above. Next, simply provide a link that looks like:

http://hostname:9999/session.cgi?action=login

Modify the above link to point to your actual hostname and port, and modify the user to point to the desired username or variable. The dummy authentication routine will automatically approve this login. Please use this method with care to avoid security problems.

Note for Windows Users

In order to provide similar functionality in Windows environments where Perl is not installed, a simple noauth.exe binary is available from the Helper Scripts area of the Urchin Support web site.

This binary is merely a "no-op": it simply returns a successful status when called. Be sure you understand the security implications of this before implementing this solution.

Linking Directly to Urchin Reports

Overview

In a standard Urchin installation, delivery of Urchin reports is controlled via the embedded session controller and Apache webserver that ship with Urchin. Users view their Urchin reports by authenticating themselves via an Urchin-controlled login process, and are then presented with a list of Urchin reports that they are authorized to view.

It is possible, however, to bypass Urchin's authentication and session controller and provide users with Urchin reports via a direct link from a portal or other external web site. Urchin 5 provides this capability as one of its standard integration points.

• If all Urchin report data is in /data/reports, read Basic Scenario, below.
• If Urchin report data is in a location other than /data/reports, read Advanced Scenario, below.
• If you are supporting multiple users with multiple profiles, and each user has his/her reports in his/her own directory, read Large Configuration Scenario, below.

Basic Scenario

This section applies to a standard Urchin installation, where all Urchin reporting data is located in the data/reports directory of the Urchin distribution and the Apache webserver supplied with Urchin (urchinwebd) delivers the content for the Urchin reports.

Step 1: Enable "Direct Report Linking"

• Log in to the Urchin administrative interface as the "admin" user.
• Navigate to the Settings -> Access Settings screen.
• Set the "Direct Report Linking" field to "on".

Alternatively, you can enable direct report linking by using the "uconf-driver" utility in the "util" directory of the Urchin distribution:

• Start a command shell on the Urchin system
• Change directory to the "util" directory of the Urchin distribution
• Set the direct report linking by typing:
  uconf-driver action=set_parameter recnum=1 cr_directlink=on

Step 2: Configure links to Urchin reports

For each profile (report) to which you want to provide a link, create a link in the following format:

(baseurl)/report.cgi?profile=(profilename)&user=(username)

where (baseurl) is everything before session.cgi in the current URL, (profilename) is the URI-escaped name of the profile, and (username) is a user that has access to the report. The user= setting is optional and controls the language and localization preferences; if not specified, the admin user is assumed. For example,

http://www.hollywoodweb.com/report.cgi?profile=www.hollywoodweb.com
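Because the profile name must be URI-escaped, links are easiest to generate with a small helper. The following is a sketch in POSIX shell: uri_escape and report_link are hypothetical names, the escaping handles single-byte (ASCII) characters only, and appending the optional user as "&user=" is an assumption about the query-string syntax based on the description above.

```shell
#!/bin/sh
# Sketch: percent-encode every character outside the unreserved set
# (letters, digits, ".", "_", "~", "-"). ASCII input only.
uri_escape() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}            # everything after the first character
        c=${s%"$rest"}         # the first character
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

# usage: report_link BASEURL PROFILE [USER]
# The "&user=" form is an assumption for illustration.
report_link() {
    link="$1/report.cgi?profile=$(uri_escape "$2")"
    if [ -n "${3:-}" ]; then
        link="$link&user=$(uri_escape "$3")"
    fi
    printf '%s\n' "$link"
}

report_link http://www.hollywoodweb.com www.hollywoodweb.com
```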

Advanced Scenario

Step 1: Enable "Direct Report Linking"

Follow Step 1 in Basic Scenario, above.

Step 2: Set urchin.cgi permissions

Make sure that bin/urchin.cgi and util/uconf-driver in your Urchin installation are accessible and executable by your web server user.

Step 3: Copy htdocs/report.cgi to the directory(ies) from which you wish to run it.

Step 4: Link/alias uicons, ujs, usvg, and ucss folders

The reporting interface needs access to certain javascript files and icons. From the directory that will contain report.cgi, create links or aliases to these folders. This can be done as symbolic links, web server aliases, or you can simply copy the folders into the location. To set up symbolic links:

cd [report.cgi location]
ln -s [urchin path]/htdocs/uicons uicons
ln -s [urchin path]/htdocs/ujs ujs
ln -s [urchin path]/htdocs/usvg usvg
ln -s [urchin path]/htdocs/ucss ucss

When using symbolic links, make sure that your web server is configured to allow the following of symbolic links. For Apache, this is the FollowSymLinks directive.

Step 5: Ensure that the web server is configured to execute CGI applications.

For Apache, enable ExecCGI. Do not use script aliases.

Step 6: Edit htdocs/.report.conf

Uncomment the "Profile =" and "User =" lines if they are commented, and specify the profile and user. For example:

Profile = www.hollywoodweb.com
User = johng

Large Configuration Scenario

This scenario allows you to easily support multiple users with multiple profiles. Instead of creating a .report.conf file for each user-profile combination, you create a single .report.conf file that allows the report engine to dynamically extract the user and profile name from the URI stem.

The steps to establishing the Large Configuration Scenario are identical to the Advanced Scenario, above, except that instead of explicitly setting Profile and User in a .report.conf file, you specify a Regular Expression (as is commonly used in Tcl, Perl, C#, VB.NET, and other languages) in the URIMatch field. For example, if the user johng types in the following URL to get his reports:

http://mach1.net/johng/www.hollywoodweb.com/report.cgi

then specifying the following fields in the .report.conf file would extract "johng" as the user and"www.hollywoodweb.com" as the profile:

URIMatch = ^/(.*)/(.*)/report.cgi Profile = $2 User = $1

Explanation: The URI stem is "/johng/www.hollywoodweb.com/report.cgi". The URIMatch field is a RegularExpression which parses out the user and profile from the URI stem.

• The first part of the URIMatch field is "^/", indicating that the URI stem must begin with a "/".
• The last part of the URIMatch field is "report.cgi", indicating that the URI stem must contain "report.cgi" before or at the end.
• The first "(.*)/" means "extract the string up to the next "/" and save it" (argument 1). Thus, the string "johng" is assigned to $1.
• The next "(.*)/" means "extract the string up to the next "/" and save it" (argument 2). Thus, the string "www.hollywoodweb.com" is assigned to $2.
• "Profile = $2" means assign the contents of argument 2 to Profile.
• "User = $1" means assign the contents of argument 1 to User.
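The extraction described above can be tried out before editing .report.conf. The following is a minimal sketch that uses Python's re module as a stand-in for Urchin's own matcher, which may differ in detail; the function name is illustrative only and is not part of Urchin.

```python
import re

# Pattern copied from the URIMatch example above.
URI_MATCH = re.compile(r"^/(.*)/(.*)/report.cgi")


def extract_user_and_profile(uri_stem):
    """Return (user, profile) for a matching URI stem, or None if it does not match."""
    match = URI_MATCH.match(uri_stem)
    if match is None:
        return None
    # Mirrors "User = $1" and "Profile = $2" in .report.conf.
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(extract_user_and_profile("/johng/www.hollywoodweb.com/report.cgi"))
```

In Python, ".*" is greedy, so a deeper stem such as "/a/b/c/report.cgi" yields "a/b" as the user and "c" as the profile. If that is not wanted, a stricter group such as "([^/]*)" keeps each argument to a single path segment. Whether Urchin's matcher behaves the same way should be checked against a test URL.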

Therefore, the steps for the Large Configuration Scenario are:

• Steps 1-5: Follow steps 1 through 5 in the Advanced Scenario.
• Step 6: Create a .report.conf file that uses Regular Expressions to dynamically extract the user and profile name from the URL string that the user types in.

Script−based Configuration Management Overview

Overview

Urchin 5 provides utilities and functionality to allow all administrative operations to be performed via unattended scripts. Only report viewing requires the web-based interface to be operational. All configuration and log processing activities can be scripted using the following utilities and techniques. For first-time users, it is helpful to run the web-based administrative interface first, in order to get familiar with the terminology and capabilities of the Urchin administrative system.

Urchin 5 includes several utilities for modifying the Urchin configuration database without using the web-based administrative interface. Located in the util directory of the distribution, these utilities are:

uconf-driver
uconf-export
uconf-import
uconf-schedule

Each of these utilities must be run from a command shell or a script, as there is no ability to execute them from the web-based Urchin administrative interface. Complete documentation for each of these utilities is available in the Utilities section of the Advanced Topics area of the Urchin Documentation Center at http://help.urchin.com. Here is a summary of the functional use of these utilities:

1. The uconf-export utility exports the entire configuration into a file, or to standard output (stdout) if no file is specified. The format of the exported data is an XML-type format defined in the documentation for uconf-export. Each record in the exported data corresponds to a configuration record in the configuration database.

2. The uconf-import utility imports the same XML-type formatted data used by uconf-export into the configuration database. This tool provides functionality for importing or editing single records, or replacing the entire Urchin configuration with the contents of a text-based configuration file.

3. The uconf-driver utility performs specific actions on individual records. All parameters can be passed on one line as arguments to the script, or a file with multiple commands (one per line) can be used.

4. The uconf-schedule utility updates task scheduling directives on a global basis for all configured profiles; for example, to set all Profiles to run at 1:00am daily. It can also run Profiles immediately, with or without permanently changing the scheduled time; see the documentation for uconf-schedule for additional details on these features.

Note that you can use the uconf-export and uconf-import utilities to easily make a backup of or restore your Urchin configuration. This provides a very quick method of recovering an Urchin installation after a disk failure or other system problem. An example of this functionality:

Save configuration for safekeeping:

uconf-export > /path/to/saved-configurations/urchin-cfg.save

Restore Urchin configuration from a known good backup:

uconf-import -r -f /path/to/saved-configurations/urchin-cfg.save
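The two commands above can also be driven from a scheduled script. The following is a minimal sketch in Python, assuming the utilities live in a util directory whose location you supply. The function names are illustrative, and only the options shown above (-r and -f) are used.

```python
import os
import subprocess


def export_command(util_dir):
    """Argument list for uconf-export, which writes to stdout when no file is given."""
    return [os.path.join(util_dir, "uconf-export")]


def restore_command(util_dir, backup_file):
    """Argument list for a full restore: -r resets the database, -f names the input file."""
    return [os.path.join(util_dir, "uconf-import"), "-r", "-f", backup_file]


def save_configuration(util_dir, backup_file, run=subprocess.run):
    """Run uconf-export and capture its standard output in backup_file."""
    with open(backup_file, "wb") as out:
        run(export_command(util_dir), stdout=out, check=True)


def restore_configuration(util_dir, backup_file, run=subprocess.run):
    """Replace the whole configuration with the contents of backup_file."""
    run(restore_command(util_dir, backup_file), check=True)
```

Passing check=True makes the script stop with an error if either utility exits with a non-zero status, so a failed export is not mistaken for a good backup. The run parameter exists only so the functions can be exercised without Urchin installed.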

Intended Usage

The uconf-export and uconf-import utilities are intended to provide a simple method for importing and exporting data from the Urchin configuration database using regular text files in an XML-type format. These utilities also allow you to specify the names of profiles, log sources, filters, users, etc. in configuration directives which specify access lists, rather than the more cryptic record number lists that are used by the uconf-driver utility. The uconf-import utility can be used to add new records or modify existing records, but it cannot remove old records (unless the database is completely reset with the "-r" option).

The uconf-driver utility is very powerful and can be used for very specific scripting operations that may change only a few parameters in a database record, as well as for complete record additions, modifications and deletions. It can also be used for querying the configuration database for several parameters. This utility is better suited to environments where scripting all administrative functions of Urchin is desired, such as automated provisioning systems or very large hosting environments where use of the Urchin administrative GUI is impractical.

Note that uconf-driver is a lower-level utility that does not automatically maintain associations between the various database tables when working with directives that maintain cross-reference lists. When using uconf-driver to script configuration operations, please be aware that many of the tables contain directives that refer to other records, or lists of records. These directives are: ct_ulist, ct_glist, ct_llist, ct_flist, and ct_rlist, which refer to the user, group, logfile, filter, and profile tables respectively. These lists are represented as exclamation point delimited lists of recnums, as demonstrated by this list of filter records:

ct_flist="!13!36!56!"

where each entry represents the recnum value of a record and is surrounded with exclamation points. For the uconf-import and uconf-export utilities, this directive would be specified as:

ct_flist="filter1,filter2,filter3"

Important! Regardless of which utility you use to manipulate the configuration, you must be careful to keep cross-references intact. For example, a Filter record has a ct_rlist which details all of the profiles that the filter applies to; and a Profile record has a ct_flist which details all of the filters that apply to this profile. Note that the uconf-import and uconf-export utilities translate and verify the lists specified in the directive for you; uconf-driver does not.
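Since uconf-driver leaves the list bookkeeping to the caller, a script that uses it needs a dependable way to read and write the exclamation point delimited form. The following Python helpers are a minimal sketch. The function names are illustrative, and the treatment of an empty list is an assumption, as this guide does not show one.

```python
def encode_recnum_list(recnums):
    """Format record numbers for uconf-driver, e.g. [13, 36, 56] becomes '!13!36!56!'."""
    if not recnums:
        # Assumption: an empty association list is written as an empty value.
        return ""
    return "!" + "!".join(str(int(n)) for n in recnums) + "!"


def decode_recnum_list(value):
    """Parse '!13!36!56!' back into [13, 36, 56]."""
    return [int(part) for part in value.split("!") if part]


def add_cross_reference(value, recnum):
    """Return the list value with recnum appended, unless it is already present."""
    recnums = decode_recnum_list(value)
    if recnum not in recnums:
        recnums.append(recnum)
    return encode_recnum_list(recnums)
```

To keep both sides of an association intact, a script would call add_cross_reference once for the Profile's ct_flist and once for the Filter's ct_rlist, then write both values back.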

Special Usage Notes

• uconf-driver uses exclamation point delimited lists of record numbers for directives that maintain associations with other tables (e.g. ct_flist), whereas the uconf-export and uconf-import utilities use comma-delimited lists of names in these directives. Be sure to use the appropriate list specification syntax for the utility you are using.

• If a Profile is added but no corresponding Task is added, scheduling of the Profile cannot be managed within the Urchin admin GUI. In addition, the Profile cannot be scheduled to run with the Urchin Task Scheduler.

• When adding or editing the "ct_password" directive for use with either a User or Remote Log Source password, uconf-driver and uconf-import will automatically encrypt the password before writing it to the Urchin configuration database to ensure that passwords are not stored in clear text. For portability reasons, the encryption is in a proprietary format that is not compatible with other password encryption formats such as "crypt" on UNIX-type systems.

Examples of pseudo−code scripts to perform tasks

This section gives examples of using uconf-driver and uconf-import in scripting pseudo-code, which could be easily translated into a UNIX-type shell script, a Perl script or a Visual Basic script depending on the needs of the application. Additional examples are given in the documentation for each specific utility.

Apply a German language setting to all users

# Count the user records, then set the language on each one in turn.
$user_count = `uconf-driver action=nrecords table=user`;
for ($i = 1; $i <= $user_count; $i++) {
    system("uconf-driver action=set_parameter table=user entry=$i ct_language=ge");
}

Add a new profile/task/logsource/user

Note that in this example, uconf-import takes care of building the proper mappings so that the new profile, task, log source and user will all be properly cross-referenced, e.g. the log source is associated with the profile and vice versa, the user will be in the profile's list of authorized users and vice versa, etc. It should also be noted that the Task is associated with the Profile because they share the same "Name=" tag.

Step 1: Some automated provisioning application creates the following text file (/path/to/text-file):

<Profile Name="www.newdomain.com">
ct_name=www.urchin.com
ct_affiliation=(NONE)
ct_website=http://www.newdomain.com
ct_reportdomains=www.newdomain.com,newdomain.com
cs_llist=newdomain.com-access-log
ct_defaultpage=index.html
cs_vmethod=0
cs_ulist=newuser
</Profile>

<Task Name="www.newdomain.com">
ct_name=www.newdomain.com
ct_affiliation=(NONE)
cr_frequency=0
cr_runnow=0
cr_enabled=off
</Task>

<Logfile Name="newdomain.com-access-log">
ct_name=newdomain.com-access-log
ct_affiliation=(NONE)
ct_loglocation=/path/to/logs/newdomain.com-access.log
cs_logformat=auto
cr_type=local
cs_rlist=www.newdomain.com
</Logfile>

<User Name="newuser">
ct_affiliation=(NONE)
ct_fullname="New User"
ct_name=newuser
ct_password=change$me
cs_rlist=www.newdomain.com
</User>

Step 2: Call the uconf-import utility to import the new profile and user into Urchin:

uconf-import -f /path/to/text-file
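A provisioning application could generate the text file from Step 1 with a few lines of code. The following Python sketch assumes one directive per line inside each record; check the exact layout rules in the uconf-import documentation before relying on it. The function names are illustrative only.

```python
def render_record(tag, name, directives):
    """Render one record (Profile, Task, Logfile or User) in the XML-type layout."""
    lines = ['<%s Name="%s">' % (tag, name)]
    lines.extend("%s=%s" % (key, value) for key, value in directives.items())
    lines.append("</%s>" % tag)
    return "\n".join(lines)


def render_import_file(records):
    """Join (tag, name, directives) records, separated by blank lines."""
    return "\n\n".join(render_record(*record) for record in records) + "\n"


if __name__ == "__main__":
    # Values taken from the Task record in Step 1.
    task = ("Task", "www.newdomain.com", {
        "ct_name": "www.newdomain.com",
        "ct_affiliation": "(NONE)",
        "cr_frequency": "0",
        "cr_runnow": "0",
        "cr_enabled": "off",
    })
    print(render_import_file([task]))
```

The resulting text can be written to a file and passed to uconf-import with the -f option, as in Step 2.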

Data Export

There are two ways to export data from Urchin:

♦ using the buttons on the upper right of any report screen (easy for any Urchin user)
♦ using a database export script (advanced option for programmers only)

Using the Export buttons

Urchin's data export function makes it easy to extract data from any Urchin report. This is useful for bringing report data into a spreadsheet, word processor, database, etc. for further analysis.

To export data from any report, select the appropriate type based on the application you plan to use to manipulate the data. For general database importing, use tab-separated format. For Word and Excel export, the application should launch automatically after the data is exported, and the new document should be populated with the data you have exported.

♦ Tab: click the "T" button to export data in tab-delimited format.
♦ Word: click the Word icon to export data in Microsoft Word native format.
♦ Excel: click the Excel icon to export data in Microsoft Excel native format.

Printing: click the printer icon to get a print-friendly view of the data; click the Print Page link from that screen to actually print the report.

Recommendation: To export data to a database, tab−separated is usually the preferred format.

Using a Database Export Script

The following Perl script, U5DataExtractor, queries the Urchin databases for a particular profile and produces text-based reports that are suitable for sending via email.

The script should be configured before use with the proper path to the Urchin distribution and the default profile name. Executing the script with the "--help" option displays the usage.

Important Note: This script is strictly provided as-is, with no warranty expressed or implied. This is an unsupported script. Use this script for Urchin 5 only.

Customization

Custom Log Formats

Introduction

Urchin can process virtually any log file format. By providing Urchin with the necessary information about the log format, Urchin will read and parse the raw data according to your configuration. This article describes a step-by-step method for creating custom log formats. Once a custom format is created, it can be selected in the Administration Interface as the format for any log file.

It should be noted that certain log data is required in order to create certain reports. You will need to be sure your log contains the minimum required fields for Urchin to process it. If you are unsure, see the Log Files - Logging - Other Webservers document in the Urchin Administration section of the Documentation Center.

Creating Custom Formats

A custom log format is created by creating a format specification file in the

[urchin install dir]/lib/custom/logformats/

folder. A sample file is provided called 'custom.lf'. Multiple custom files can be created. Each one needs the '.lf' extension for Urchin to recognize it. The default built-in log formats such as apache, w3c, netscape, etc. are located in

[urchin install dir]/lib/reporting/logformats/

In each of the above directories, there is an available fields list in the fieldlist.txt file. The custom folder holds the custom fields list and the reporting folder holds the standard fields list. Custom log formats can refer to fields in either list by using the field id numbers. Once a custom format is created, it is available for selection in the Administration Interface. Here are the basic steps for creating a custom format:

Step 1: Copy the example

In the 'lib/custom/logformats' folder of the Urchin distribution, make a copy of the 'custom.lf' file. The new filename should not use spaces, and it needs the '.lf' extension.

Step 2: Set the primary positions

Edit the file created in the above step. The file contains a lot of useful information about each variable. The first step is to set the PrimaryPositions variable. This comma-separated list identifies each field (by id number) in the log format and its relative position. Use the fieldlist.txt files in the two directories mentioned above to determine which field ids to use. If you have custom date and time formats that are different from Apache and IIS formats, use fields 16 and 17 respectively. If the date and time are together in one field, just use field 16.

Step 3: Check the fields separator

Check the FieldSeparator1 variable, and set this to the separator between your fields. Use \s for a space and \t for a tab.

Step 4: Is the HTTP status field available?

If the HTTP status field is available, then leave the StatusRequired variable set to YES. This will separate valid and error hits appropriately. If there is no status in the log, then set this to NO, so that all hits will be considered valid.

Step 5: Are you using a custom Date/Time format?

If so, then edit the CustomDateFormat for field 16 and the CustomTimeFormat for field 17. The format is specified using the % variables defined in the Custom Date Format article later in this section of the documentation.

Step 6: Check the other variables

Most of the other variables will be OK for most custom log formats, but check the comments provided in the file on each variable to see if it applies to your situation.

Step 7: Do you have custom calculated fields?

In addition to the format, you can specify custom calculated fields in the specification file. An example with comments is provided at the bottom of the file. The custom calculated field works the same way as the Advanced Filter in the filtering section, except that custom calculated fields are processed first. Please see the article on custom calculated fields for more information.

Save the file and you are ready to assign it as the format for a log file in the Administrative Interface. It will automatically show up as one of the format options in the pull-down menu for log formats.
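To see how a positional field list and a separator work together, the idea behind PrimaryPositions and FieldSeparator1 can be modeled in a few lines of Python. This is a conceptual sketch only, not Urchin's parser and not the syntax of a '.lf' file. Field ids 16 and 17 are the custom date and time fields mentioned in Step 2; ids 901 and 902 and all field names are made up for illustration, and real ids come from the fieldlist.txt files.

```python
# Hypothetical id-to-name table; ids 901 and 902 are invented for this sketch.
FIELD_NAMES = {16: "date", 17: "time", 901: "client", 902: "request"}

# The two separator notations named in Step 3.
SEPARATORS = {r"\s": " ", r"\t": "\t"}


def parse_line(line, primary_positions, field_separator):
    """Split one log line and label each value by its position in the list."""
    values = line.rstrip("\n").split(SEPARATORS[field_separator])
    return {FIELD_NAMES[field_id]: value
            for field_id, value in zip(primary_positions, values)}


if __name__ == "__main__":
    sample = "2002-11-12\t07:01:47\t10.0.0.1\t/index.html\n"
    print(parse_line(sample, [16, 17, 901, 902], r"\t"))
```

Trying a few real log lines against a list like this is a quick way to confirm the field order before entering it in the format file.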

Custom Navigation

Introduction

The list of reports in the navigation of the reporting interface can be completely customized. Urchin ships with a number of default "Report Sets" which control the navigation and list of reports. Additional Report Sets can be easily added. Once created, they are automatically selectable in the Report Set configuration setting for the Profile. Different Report Sets can be set for different users as well.

Using Report Sets, you can modify the list of reports, turn off entire sections, change colors, change text and move reports around. This is all done by creating a Report Set definition file. Keep in mind that a Report Set definition is created for a particular Profile Type such as "Standard Website". Furthermore, adding a Profile can automatically select a custom Report Set as the default if desired.

Creating Custom Report Set

A custom Report Set is created by creating a Report Set definition file in the

[urchin install dir]/lib/custom/profiletypes/<ProfileType>/

folder, where <ProfileType> is "Standard_Website", "E-Commerce_Website", or another Profile Type. Multiple custom Report Sets can be created. Each one will need a '.rs' extension for Urchin to recognize it. Be sure to use underscore "_" characters in the directory and filenames instead of spaces. Once a custom Report Set is created, it is available for selection in the Administration Interface. Here is a step-by-step procedure for creating a custom Report Set.

Step 1: Copy an existing Report Set

Sample Report Set definition files are included in the Profile Type folders that ship with the product. The first step is to copy the sample and rename the file with a ".rs" extension. Be sure to use underscores instead of spaces in the name. For example, if creating a Report Set for the Standard Website Profile Type, then in the directory "lib/custom/profiletypes/Standard_Website/" within the Urchin installation, copy the file "All_Reports.rs.sample" and rename it to "my_report_set.rs" or some other descriptor.

Step 2: Modify the Report Set

Edit the new Report Set definition file. Each line in the definition file can represent a report or a menu. All the configuration fields for each line are described at the top of the file. In general, to turn off reports or entire sections, enter a '#' character in front of the entries you wish to turn off. The names of reports can be changed by modifying the third field. Names and help text can either reference an entry in the dictionary system or you can enter the text directly into the field in this file. Colors can be changed by editing fields five and six. Menus can be created by copying one of the existing menu lines and editing the text entry appropriately. The order of the entries in the file mirrors the order seen in the navigation. Each entry will need a unique ID, which is in the first field. For a list of all Regular Reports, see the Reference Section of this manual.

Step 3: Is this the default Report Set for Profiles of this type?

If you want the newly created Report Set to be the default setting when creating new Profiles of this type, then copy the 'default.config' file from the 'lib/reporting/profiletypes/Standard_Website/' folder to the 'lib/custom/profiletypes/Standard_Website/' folder, where 'Standard_Website' is the Profile Type that we are creating the Report Set for. Edit the new default.config file in the custom area and add a line that contains 'cs_reportset=my_report_set', where 'my_report_set' is the name of the Report Set we just created, without the .rs extension.

Save the file(s) and you are ready to assign it as the Report Set for existing Profiles. In the Administrative Interface, edit a profile and, under the Profile Settings tab, use the Report Set pull-down menu to select your new Report Set.

Custom Reports

Overview

Urchin's internal data processing and reporting engine is very powerful and can be configured to process virtually any type of data. This document has been created as an overview and guide for creating custom reports.

As shown in the figure below, there are two separate processes within Urchin that control the creation of reports. The first is the Log Processing step, which parses the raw log file data and populates a number of report data tables. Log processing is triggered by the Urchin Scheduler or when you click the 'Run Now' button in the Administration Interface.

The second process is triggered when someone actually clicks to view a report. All reports are created dynamically. Clicking on a particular report issues a query against the data tables; the Reporting Engine compiles the results and delivers them to the user as a viewable report.

The key to creating valuable custom reports is understanding the three controls that affect reporting. The first control is the Log Format, which tells the Urchin Engine what is in the log file and how to process it. To create custom log formats, please see the appropriate article in this section of the manual.

The second control is the Data Map. The Data Map controls which fields are stored in which data tables. To create a custom report, you may need to create an additional entry in the Data Map as discussed below. The Data Map is critical to defining what information is available for reporting.

The last control is the Report Set. The Report Set contains the listing of reports that is seen in the navigation when viewing reports. The Report Set entries also contain the information necessary for making the query to one of the data tables, including the table number, and how to display the data. Creating a Report Set entry is discussed below.

Step 1: Select your Fields

You may need to reference the Regular Field List and Regular Report List in the Reference section of this guide to see exactly which fields are available and what existing reports are storing. Urchin data tables can correlate two fields together as an option. In general, the data tables store text data versus numeric data versus time. You will need to define the text fields and the numeric fields to use. See the examples at the end of this document.

Step 2: Create a new Profile Type

In the Urchin installation folder, copy one of the existing Profile Types in 'lib/reporting/profiletypes/' to 'lib/custom/profiletypes/'. Simply create a copy of the entire folder, such as 'Standard_Website', and place the copy into the 'lib/custom/profiletypes/' folder. The copy should include all of the files contained in the original folder. Be sure to rename the folder to give it a unique Profile Type name. Use underscores in the name instead of spaces.

Step 3: Edit the Data Map

In your newly created folder, there will be a "datamap.dm" file which contains all of the Data Table entries. The format of the file is described at the beginning of the file itself. Edit the file and create a new entry at the end of the file. Here are some notes on setting each field in the new entry:

♦ TABLE: Make sure this is a unique entry between 1 and 200. This is the table id number that will be referenced during the query.
♦ AFIELD: Use the Regular Field List in the reference section to look up the id value of the field you wish to report on.
♦ BFIELD: If you are correlating two fields, enter the second field here and set the SEP field to a pipe symbol (|); otherwise leave these two fields as dashes (-).
♦ IFIELD: Use the Integer Field List in the reference section to look up the id value of the field you wish to report versus.
♦ REQUIRED: Set this to A, B, or BOTH to require the field(s) to exist before making an entry.

Step 4: Edit the Report Set

In the same folder, edit one of the Report Sets (.rs files) to include the reference to your new table. You can also create a new Report Set if desired. This file contains entries in the same order as the navigation for the reporting. Each section of reports begins with a menu. Determine in which section you wish to place your new report and copy one of the existing report entries that matches the format and type of report you are creating. Edit your newly created entry as follows:

♦ VID: The first field is the unique View ID. Modify this so that it is a unique value for this report.
♦ NAVNAME/FNAME/NAME: Set all three of these fields to the name of your report. Replace the dictionary entry with the name in quotes.
♦ IFIELD: Set this to one of (VISITORS, SESSIONS, PAGEVIEWS, HITS, BYTES, TIME, REVENUE, TRANS, ITEMS).
♦ INAME: Set this to the text name of the Integer field.
♦ TABLE: Set this to the table number created in the previous step.
♦ HELP: Set to "-" for now.
♦ FPART: If you used only Field A in the previous step, then set this to CFIELD1; otherwise you can grab BOTH.
♦ FILTYPE/FILTER/FILCON/SOPT: Set all of these to "-".

Step 5: Configure and Process

Create or edit a Profile and set the Profile Type to the newly created Profile Type and the Report Set to the one created/edited in the previous step. Process log data and view your new report.

Custom Date/Time Formats

Introduction

Urchin can process virtually any date or time format contained in a log file. The only requirement is to provide Urchin with the necessary date or time format which matches the pattern of the date contained in the log file. This article describes the variables used to specify the date or time format. These formats are specified inside a custom log format file. Please see the "Custom Log Formats" article above in this section.

How Date/Time Parsing Works

Urchin determines the date/time by comparing a specified format against the date/time field(s) in the log file.

For example, an IIS log contains the date in the following form:

2002-11-12

Urchin is able to determine the year, month, and day by using the following format:

%Y-%m-%d

Creating A Date/Time Format

To create a custom date/time format, first look at the order and pattern of the date/time data contained in your log file. Then, select from the Date/Time variables listed below to make up the format.

For example, if your log file contains the time as "07:01:47", then you need to create a pattern to match this. The first thing to note is that the pattern is hours:minutes:seconds. Looking at the variable list below, you will note that %H is the variable for hours, %M is the variable for minutes, and %S is the variable for seconds. Putting these together yields a format of "%H:%M:%S". If you have a literal '%' character in the date or time format field, you can specify the literal % as %%.

The most common variables are: %Y, %m, %d, %H, %M, and %S.
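A candidate format can be checked against sample log values before it goes into the log format file. The sketch below uses Python's datetime.strptime, which understands the common variables above. It is a stand-in for Urchin's own parser rather than the same code, and Python may not accept every variable in the full list. The function name is illustrative.

```python
from datetime import datetime


def parse_log_timestamp(date_text, time_text,
                        date_format="%Y-%m-%d", time_format="%H:%M:%S"):
    """Parse separate date and time fields using strftime-style format strings."""
    combined_text = date_text + " " + time_text
    combined_format = date_format + " " + time_format
    # strptime raises ValueError when the text does not match the format.
    return datetime.strptime(combined_text, combined_format)


if __name__ == "__main__":
    print(parse_log_timestamp("2002-11-12", "07:01:47"))
```

A ValueError from this check usually means the order of the variables or a separator character does not match the log data.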

Date/Time Variable Definitions

♦ %A = national representation of the full weekday name.
♦ %a = national representation of the abbreviated weekday name.
♦ %B = national representation of the full month name.
♦ %b = national representation of the abbreviated month name.
♦ %d = the day of the month as a decimal number (01-31).
♦ %e = the day of month as a decimal number (1-31); single digits are preceded by a blank.
♦ %H = the hour (24-hour clock) as a decimal number (00-23).
♦ %I = the hour (12-hour clock) as a decimal number (01-12).
♦ %j = the day of the year as a decimal number (001-366).
♦ %k = the hour (24-hour clock) as a decimal number (0-23); single digits are preceded by a blank.
♦ %l = the hour (12-hour clock) as a decimal number (1-12); single digits are preceded by a blank.
♦ %M = the minute as a decimal number (00-59).
♦ %m = the month as a decimal number (01-12).
♦ %p = national representation of either "ante meridiem" or "post meridiem" as appropriate.
♦ %S = the second as a decimal number (00-60).
♦ %s = the number of seconds since the Epoch, UTC (see mktime(3)).
♦ %w = the weekday (Sunday as the first day of the week) as a decimal number (0-6).
♦ %Y = the year with century as a decimal number.
♦ %y = the year without century as a decimal number (00-99).
♦ %z = the time zone offset from UTC; a leading plus sign stands for east of UTC, a minus sign for west of UTC, hours and minutes follow with two digits each and no delimiter between them (common form for RFC 822 date headers).
♦ %% = `%'.

Custom DNS Entries

Overview

The geo-update utility is used to check for updates to Urchin's internal DNS database files and download the updates if they are available. This utility is run regularly via the __domaindb task that should be listed in the Configuration->Scheduler screen of the Urchin administration interface. The utility can also be used to import custom entries into the DNS databases. See the section on the geo-update program under Advanced Topics->Utilities for complete instructions on the options to use for creating custom DNS entries.

Custom Lookup Tables

Beginning with version 5.6, Urchin allows you to define custom lookup tables. One useful application of a lookup table is to substitute human-readable text for the often cryptic request parameters used with dynamic URLs. For example, consider a web site in which the Pages & Files-->Page Query Terms report is used to rank the popularity of requested documents. In the report (shown below), the document id is displayed instead of the full document name. The numeric id is shown because the report simply ranks the popularity of requests of the form

http://www.hostsite.com/index.cgi?id=1001

Applying a lookup table which maps document names to numeric ids allows us to view the same information in Pages & Files-->Requested Pages, with the full document name displayed.

This article illustrates how to create and apply a lookup table for this example. The details of your lookup table and filters may differ according to your particular application; however, the basic steps will still apply.

Defining Your Lookup Table

To define your table:

1. Create a table in Excel that maps your codes to text labels. An example is shown below. The first row of the file must begin with "#Fields:", followed by "request_stem" in column 2.

2. Save the Excel table as a tab-delimited plain text file in the lib/custom/lookuptables directory of your Urchin distribution. You must save the file with an extension of ".lt".

Applying the Table

1. Apply the following Advanced filter to your profile. This filter tells Urchin to look in the request_uri for the string "id=", extract the id, and write the id into the request_stem. (Note that request_stem is the title of the second column of the lookup table.)

2. Apply the following Lookup Table filter (applied on the request_stem field) to your profile. Select your lookup table in the Table Name drop-down list. If your lookup table does not appear as an option in the drop-down list, make sure that your lookup table file name ends with .lt and that it has been saved in the lib/custom/lookuptables directory of your Urchin distribution.
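The combined effect of the two filters can be modeled outside Urchin to check a mapping before deploying it. The following Python sketch is conceptual only. The id values and labels are hypothetical, the regular expression is one possible way to find "id=", and the fallback behavior for unknown ids is an assumption rather than documented Urchin behavior.

```python
import re

# Hypothetical id-to-label mapping, standing in for the rows of a .lt file.
LOOKUP = {"1001": "Product Overview", "1002": "Pricing"}

# Match an id= parameter anywhere in the query string.
ID_PATTERN = re.compile(r"[?&]id=([^&]*)")


def extract_id(request_uri):
    """Step 1: pull the value of the id= parameter out of the request URI."""
    match = ID_PATTERN.search(request_uri)
    return match.group(1) if match else None


def apply_lookup(request_uri, table=LOOKUP):
    """Step 2: replace the extracted id with its label; unknown ids pass through."""
    doc_id = extract_id(request_uri)
    if doc_id is None:
        # Assumption: requests without an id= parameter are left unchanged.
        return request_uri
    return table.get(doc_id, doc_id)


if __name__ == "__main__":
    print(apply_lookup("/index.cgi?id=1001"))
```

Running real request URIs from a log through a model like this shows which ids are still missing from the table.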

Cobranding Urchin

Overview

Urchin accommodates cobranding in the administration interface, the reporting interface, and, beginning with Urchin 5.6, the login screen. There are two files to edit in order to include HTML at the top of the interface (three files with version 5.6). If a complete portal integration is being done, then the Urchin reporting can be delivered within a frameset or table by your application server. (Please see the article on Portal Integration in the Integration Section.) Otherwise, follow the instructions below to cobrand your interface.

Please note that your license agreement may prohibit obscuring or changing the Urchin Logo and Reports beyond what is provided in this article.

Cobranding Instructions

To cobrand your interface, you will need to edit the following files located in the Urchin installation:

[urchin install dir]/lib/custom/cobrands/cobrand_admin.tpl
[urchin install dir]/lib/custom/cobrands/cobrand_report.tpl
[urchin install dir]/lib/custom/cobrands/cobrand_session.tpl (version 5.6 only)

The first file controls the cobranding on the admin interface and the second controls the cobranding on the reporting interface. The third file, available beginning with version 5.6, controls the cobranding on the login screen. Add HTML content to these files as necessary to include your branding. The HTML provided in these files will be placed on top of the Urchin interface as shown in the example below.


Hosting Automation Solutions

How are H−Sphere and Urchin 5 Integrated?

An unlicensed copy of Urchin 5 is now integrated into Positive Software Corporation's H−Sphere. Psoft customers who wish to integrate Urchin 5 can now download H−Sphere to enable Urchin 5:

http://www.psoft.net/HSdocumentation/new_features.html#231

Note: Urchin 5 comes unlicensed, so customers may activate the 15 day demo license, then purchase an Urchin license via the standard methods: either in the Urchin 5 admin interface, on the urchin.com website, or by contacting [email protected]. Detailed download and installation information is available here:

http://www.psoft.net/HSdocumentation/sysadmin/urchin4.html

Using Urchin with Plesk PSA 5.0

Note: the following information has been provided by Plesk technical personnel.

Instructions


Log into your PSA interface as Admin and select the Extras button. This takes you to MyPlesk.com and allows MyPlesk.com to know that you are the Admin of a PSA license. Under the Server Tools tab you will see the Urchin offering. Below are some additional instructions that you will need when installing Urchin:

For PSA 5.0 only

♦ Use Urchin install instructions found at MyPlesk.com
♦ PSA log rotation feature should be turned off
♦ Configure and use Urchin log archiving/deleting

Everything in Urchin should be set as described in the documentation, with one difference: the Log File path should point to /path/to/vhosts/domain.com/logs/access_log.processed and, if SSL is enabled, /path/to/vhosts/domain.com/logs/access_ssl_log.processed. The .processed files are created by the PSA statistics utility (after calculations by internal PSA stats), and processing should be scheduled to run daily at 5:00am.

Here are some examples:

If you are running the standard version of PSA, your path will be /usr/local/psa/home/vhosts/DOMAIN.NAME/logs/access_log.processed

If you are running the RPM version of PSA, your path will be /home/httpd/vhosts/DOMAIN.NAME/logs/access_log.processed

If you began using PSA version 1.3.x and have upgraded to PSA version 5.x, your path will be /usr/local/plesk/apache/vhosts/DOMAIN.NAME/logs/access_log.processed

Affiliate Program Info

Please note that if your server is registered with MyPlesk.com (and you join the Affiliate Program), your MyPlesk.com account will be credited with an amount equal to 10% of the amount you pay for the Urchin Solution.

Ensim Webppliance

How are Ensim Webppliance and Urchin 5 Integrated?

An unlicensed copy of Urchin 5 is currently integrated into Webppliance 3.6 for Windows.

Ensim has not announced plans to integrate Urchin into their Webppliance for Linux/Unix products. They are interested in determining demand, though. If you would like Ensim to support this integration, please contact [email protected].

You may attempt to run Urchin 5 outside of your Ensim environment, but it is unsupported.


Sphera's HostingDirector

How are Sphera's HostingDirector and Urchin 5 Integrated?

An unlicensed copy of Urchin 5 is being integrated into HostingDirector. Sphera customers who wish to upgrade to Urchin 5 will be able to do so when HostingDirector 3.8 is available toward the end of 2003. With that release Urchin 5 will become a shared service on the server, so customers will no longer have to license a separate copy of Urchin for each VPS.

Performance & Tuning

Global Filtering of Hits from Monitoring Software

Overview

Most Hosting environments provide some sort of monitoring of customer webservers in order to maintain Service Level Agreements (SLAs). As a side effect, however, the hits from this monitoring can skew the Urchin reporting for the monitored web sites by artificially inflating session, pageview, hit and byte counts.

Recommendation

In Hosting environments that employ such monitoring, it is highly recommended that a standard/global Urchin filter be applied to each customer's configured Urchin profiles to strip out the hits generated by monitoring software. This is easily done in an environment where a centralized Urchin installation (managed by the hosting company) provides reporting for each customer's website(s). In dedicated/colocation environments where the customers themselves maintain an instance of Urchin on their server(s), the Hosting company should provide a sample filter that is appropriate for the monitoring being used.

To aid in the implementation of Urchin filtering, the Host and the customer should work together to create a specific page on the customer website that only the monitoring software utilizes, e.g. something like:

http://www.customerdomain.com/healthpage.html

Examples

Example 1: Filter out the IP address for the monitoring system


Filter Type: Exclude
Filter Field: IP
Filter Spec: 172\.16\.1\.1

This will strip any hits with the IP address 172.16.1.1 out of the webserver log as Urchin is processing it.

Example 2: Filter out specific page that the monitoring system hits

Filter Type: Exclude
Filter Field: REQUEST
Filter Spec: ^/healthpage\.html

This will strip any hits with a request for /healthpage.html out of the webserver log as Urchin is processing it.
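The matching behavior of filter specs like these can be tried out before a filter is deployed. The sketch below uses Python's re module as a stand-in for Urchin's regular expression engine and assumes that a spec is applied as an unanchored search against the field value; both points, and the is_excluded helper itself, are assumptions for illustration. The dot in the page name is escaped so that it matches only a literal period.

```python
import re


def is_excluded(field_value, filter_spec):
    """Return True if an Exclude filter with this spec would drop the hit.

    Assumes the spec is matched anywhere in the field value unless the
    spec itself anchors the match with ^ or $.
    """
    return re.search(filter_spec, field_value) is not None


if __name__ == "__main__":
    ip_spec = r"172\.16\.1\.1"
    page_spec = r"^/healthpage\.html"
    for ip in ("172.16.1.1", "172.16.1.10", "10.0.0.5"):
        print(ip, is_excluded(ip, ip_spec))
    for request in ("/healthpage.html", "/index.html"):
        print(request, is_excluded(request, page_spec))
```

Under these assumptions the unanchored IP spec also matches 172.16.1.10 through 172.16.1.199. If other hosts in that range generate real traffic, a fully anchored spec such as ^172\.16\.1\.1$ would limit the filter to the single monitoring address.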

Considerations

1. It may be desirable to create additional, non−filtered Profiles for the customers so they can see the actual traffic load (including the monitoring hits) on the webserver(s).

2. The Hosting company may want to provide a Profile that provides reporting exclusively for the monitoring hits, i.e. one that filters in only hits from the monitoring software. This profile could be used to show that the proper monitoring is being done and that SLAs are being met.

Reducing Disk Storage for Urchin Profile Monthly Databases

Overview

Urchin reporting data is stored in independent monthly databases for each Profile configured within Urchin. These databases typically reside in the data/reports directory of the Urchin distribution. By default, Urchin will keep an unlimited number of these monthly Profile databases. For most small and medium sized sites, the storage requirements are modest. Because Urchin reporting does not require access to the raw webserver logs once they've been processed, there is no need to keep the webserver logs. The processed Urchin monthly databases will be approximately 5−10% of the size of the raw webserver logs that were processed to populate the Urchin databases, and in most cases this will represent a very minimal amount of disk space even if all Urchin databases are kept indefinitely.

For large sites, however, which produce hundreds or thousands of megabytes worth of webserver logs per day, or hosting providers who have a very large number of Profiles configured, it may be desirable to reduce Urchin's ongoing data storage requirement. This can be accomplished in one of the following ways:

1. Set the profile to automatically delete the raw tracking data after processing the logs
2. Set the profile to archive historic data
3. Limit the number of months of historical reporting data that are retained

Instructions for each of these methods are provided at the end of this article.

Technical Overview of Urchin Database Storage

For each Urchin profile, Urchin maintains a set of eleven monthly database files that provide data for the reporting engine. The databases are named after the month for which they store data. The complete list of databases is:

YYYYMM-hdata.und --> hash table data
YYYYMM-hdata.uni --> hash table index
YYYYMM-hdata.uns --> hash table string data
YYYYMM-ldata.und --> log tracking data
YYYYMM-ldata.uni --> log tracking indexes
YYYYMM-pdata.und --> path data
YYYYMM-sdata.und --> session data
YYYYMM-tdata.und --> totals data
YYYYMM-udata.unf --> header for the database
YYYYMM-vdata.und --> visitor data
YYYYMM-vdata.uni --> visitor index

Each set of databases is complete for the month of data that it contains. Since there is no interdependency between the monthly database sets, archiving and pruning operations can be performed independently on each database set without affecting any other month.

Under normal operation, the entire set of monthly database files is retained for each month. However, four of these database files are used only by the Urchin log processing engine. These database files are:

YYYYMM-pdata.und
YYYYMM-sdata.und
YYYYMM-vdata.und
YYYYMM-vdata.uni

These databases contain information about paths, sessions and visitors and can account for a substantial percentage of the total storage space required for the month, on the order of 10−50%. Thus there can be a significant disk space advantage in setting the Keep Raw Tracking Data option to off in the Storage/DB screen of the Profile configuration.

Important Note: If you plan to upgrade to a future major release of Urchin, this raw tracking data will be used for linking records together. Absence of this data will affect certain new visitor−centric drill down reports that are planned for Urchin. Therefore, it is recommended that only extremely high traffic sites for which keeping the raw tracking data represents a disk or CPU resource consumption issue disable the keeping of raw tracking data.
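Before turning the option off, it can help to measure how much of a month's storage the raw tracking files actually occupy for a given profile. The following Python sketch sums file sizes for one YYYYMM set. The raw_tracking_share function name is hypothetical, and the sketch assumes the on-disk file names use an ordinary ASCII hyphen after the month prefix.

```python
from pathlib import Path

# The four files described above as used only by the log processing engine.
RAW_TRACKING_SUFFIXES = ("-pdata.und", "-sdata.und", "-vdata.und", "-vdata.uni")


def raw_tracking_share(profile_dir, month):
    """Return (raw_tracking_bytes, total_bytes) for one YYYYMM database set."""
    raw_bytes = 0
    total_bytes = 0
    for path in Path(profile_dir).glob(f"{month}-*data.un*"):
        size = path.stat().st_size
        total_bytes += size
        if path.name.endswith(RAW_TRACKING_SUFFIXES):
            raw_bytes += size
    return raw_bytes, total_bytes


if __name__ == "__main__":
    # Example path only; substitute a real profile directory under data/reports.
    raw, total = raw_tracking_share("data/reports/profile-name", "200401")
    if total:
        print(f"raw tracking data: {raw} of {total} bytes ({100 * raw / total:.1f}%)")
```

The script only reads file sizes and changes nothing, so it can be run against a live profile directory.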

Other potential disk space savings can be obtained by compressing historic Urchin monthly databases into ZIP archives. The resulting archives are typically only 20−30% the size of the uncompressed database set. While the Urchin reporting engine cannot read the ZIP archives directly, it has the ability to extract the databases it needs from the ZIP archives on the fly. This is completely transparent to a person viewing Urchin reports, other than a slight delay while the databases are being unpacked. The reporting engine does not remove the databases it has unpacked; this allows quicker access to data while the person is viewing the Urchin reports. However, the original ZIP archive is left in place, so a periodic cleanup operation can simply remove the unpacked databases to regain the disk space once again.
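A periodic cleanup of this kind might look like the Python sketch below. It rests on assumptions that should be checked against a real installation before use: that the ZIP archives sit in the profile's data directory with a .zip extension, and that the member names inside each archive equal the names of the unpacked database files. The remove_unpacked_copies function is a hypothetical helper, not part of Urchin.

```python
import zipfile
from pathlib import Path


def remove_unpacked_copies(profile_dir):
    """Delete unpacked files that are still present in an intact ZIP archive.

    Returns the sorted names of the files that were removed.
    """
    profile_dir = Path(profile_dir)
    removed = []
    for archive in profile_dir.glob("*.zip"):
        with zipfile.ZipFile(archive) as zf:
            # testzip() returns None when every member passes its CRC check.
            if zf.testzip() is not None:
                continue  # never delete data whose archive looks damaged
            member_names = zf.namelist()
        for member in member_names:
            unpacked = profile_dir / Path(member).name
            if unpacked.is_file():
                unpacked.unlink()
                removed.append(unpacked.name)
    return sorted(removed)
```

The archive is verified before anything is deleted, so a corrupt ZIP never causes the only good copy of a month's data to be removed.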

The last avenue for reducing Urchin storage requirements is to establish a policy for the duration of historical reporting that Urchin is to provide. For instance, in environments where Urchin is provided as a reporting service with a hosting package, it is very common to provide Urchin historical reporting for a period of one year. Due to the monthly organization of Urchin databases, it is very easy for scripted mechanisms to automatically remove old monthly databases that have aged past a certain threshold. When a historical reporting length policy is implemented, Urchin's data storage requirement will typically stabilize or only increase slightly once the historical retention limit has been reached.

Methods for Reducing Data Storage − How To

Method 1: Delete the Raw Tracking Data after Log Processing

You can configure the profile to delete raw visitor and session information after processing. For large sites, this improves performance and reduces the amount of data stored. Note: When this configuration is selected, sessions that overlap days appear as two sessions (one for each day) instead of one session. The difference in results will be negligible for most sites.

To configure the profile to delete raw visitor and session information after processing:

1. In the Admin interface, click Configuration, then Urchin Profiles-->Profiles.
2. Edit the desired profile.
3. In the Storage/DB tab, turn the Keep Raw Tracking Data field "off".
4. Click Update.

Method 2: Auto−Archive Historic Data

You can configure the profile to compress historic monthly data into an archive. The reports can view the archived data, but no additional hits may be processed for the archived months.

To configure the profile to archive historic data:

1. In the Admin interface, click Configuration, then Urchin Profiles-->Profiles.
2. Edit the desired profile.
3. In the Storage/DB tab, turn the Archive DB field "on".
4. Specify a number of months for the Archive DB After field.
5. Click Update.

Method 3: Limit Retention of Databases for Historical Reporting

For each Urchin Profile, simply remove any databases in the data/reports/profile-name directory that begin with a YYYYMM prefix and have aged past the threshold needed for historical reporting. For example, if you wish to retain a one−year reporting history and the current month is February 2004, you would remove any databases named 200301-*data.un* to delete the reporting data from January 2003 for that Urchin profile. This would be repeated for all databases older than January 2003.


For an example of a ready−to−run Perl script that will automatically prune the Urchin databases after a certain period of time, please see the PruneUrchinData script at http://www.urchin.com/support/scripts/purge_udata.pl
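The retention rule in Method 3 can also be sketched in a few lines of Python. This is an independent illustration and not the PruneUrchinData script: the function names, the file name pattern (written with an ASCII hyphen), and the dry-run default are assumptions. With keep_months set to 12 and a current month of February 2004, databases for January 2003 and earlier are selected, matching the example above.

```python
import re
from datetime import date
from pathlib import Path

# Assumed naming: YYYYMM-<letter>data.un<letter>, e.g. 200301-hdata.und
DB_NAME = re.compile(r"^(\d{4})(\d{2})-[a-z]data\.un[a-z]$")


def find_expired(profile_dir, keep_months, today=None):
    """Return database files more than keep_months older than the current month."""
    today = today or date.today()
    expired = []
    for path in sorted(Path(profile_dir).iterdir()):
        match = DB_NAME.match(path.name)
        if match is None or not path.is_file():
            continue
        year, month = int(match.group(1)), int(match.group(2))
        age_in_months = (today.year - year) * 12 + (today.month - month)
        if age_in_months > keep_months:
            expired.append(path)
    return expired


def prune(profile_dir, keep_months, today=None, dry_run=True):
    """List expired databases and delete them only when dry_run is False."""
    expired = find_expired(profile_dir, keep_months, today)
    if not dry_run:
        for path in expired:
            path.unlink()
    return [path.name for path in expired]
```

Running prune with dry_run left at True first shows what would be deleted. Months that have already been compressed into archives are not touched by this sketch and would need their own rule.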

Security Features

Activating SSL on the Urchin Webserver

The Urchin webserver that ships with Urchin 4.100 and later is capable of encrypting communication via SSL. To enable SSL, you will need to have either a valid certificate signed by a certificate authority or a self−signed certificate.

The steps for enabling SSL in the Urchin webserver are as follows:

1. Copy your SSL certificate file into the Urchin var directory and name it server.crt
2. Copy your SSL key file into the Urchin var directory and name it server.key
3. Edit the urchinwebd.conf.template file located in the Urchin var directory. Change the ServerName directive from localhost to the name of your webserver. For instance:

ServerName: www.urchin.com

NOTE: The ServerName in the urchinwebd.conf.template file needs to match the name of the server that is in the certificate file.

4. Start or restart the webserver using urchinctl with the "-e" option. Urchinctl is located in the Urchin bin directory. The "-e" option instructs urchinctl to enable SSL in the webserver. For example, to restart the webserver with SSL enabled, use:

urchinctl -e restart

To start the server without SSL enabled, just remove the "-e" option from the urchinctl command.

You should now be able to access your SSL enabled server using https://servername.domain.com:port/

NOTE: Customizing the SSL settings in the urchinwebd.conf.template may result in problems that could prohibit the webserver from starting.


Chapter 8: Reference

Integer Field List

Overview

When a hit is processed by Urchin, certain integer fields are available including whether the hit is a pageview, a new session, how many bytes were transferred, etc. These integer values are used in updating many of the tables. In particular the Data Map, which maps all of the text−type data tables, references these integer fields by number.

Integer Field List

The following table lists all of the available integer fields and their corresponding id number.

IFIELD id Field Name

1 Session

2 Pageview

3 Non−Pageviews

4 Hits

5 Valid Hits

6 Error Hits


7 UTM Hits

8 Non UTM Hits

9 Robot Hits

10 Non Robot Hits

11 Bytes

12 Robot Bytes

13 Non Robot Bytes

14 Forms

15 Responses

16 Transactions

17 Items

18 Transaction Revenue

19 Item Revenue

20 Downloads

21 Repeat Responses

22 Cost

23 Primary Goals

24 Clicks

25 Impressions

Regular Field List

Overview

When a hit or entry in a log file is read during processing, the hit is broken down into 'Raw Fields'. Fields are generally separated by spaces, tabs, or commas. The Log Format determines how these Raw Fields are assigned internally. Once the Raw Fields are read, Urchin calculates a number of 'Auto Fields' based on the 'Raw Fields'. Most reports use these Auto Fields for updating.
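As a rough illustration of how Auto Fields relate to a raw request, the Python sketch below splits a request URI into several of the request_ fields. It approximates the idea using the standard library and does not reproduce Urchin's actual derivation: the derive_request_fields function is hypothetical, and details such as whether request_directory carries a trailing slash are assumptions.

```python
import posixpath
from urllib.parse import urlsplit


def derive_request_fields(request_uri):
    """Split a request URI into approximations of some request_ Auto Fields."""
    parts = urlsplit(request_uri)
    stem = parts.path                                # URI without query
    directory, filename = posixpath.split(stem)
    extension = posixpath.splitext(filename)[1].lstrip(".")
    return {
        "request_uri": request_uri,
        "request_stem": stem,
        "request_query": parts.query,                # text after "?"
        "request_anchor": parts.fragment,            # text after "#"
        "request_directory": directory,
        "request_filename": filename,
        "request_mime": extension,                   # file extension
    }


if __name__ == "__main__":
    fields = derive_request_fields("/catalog/item.asp?id=42&color=red#top")
    for name, value in fields.items():
        print(f"{name}: {value}")
```

Seeing the fields side by side makes it easier to pick the right one for a filter, for example request_stem when the query string should be ignored.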

Filters can be applied to either Raw or Auto Fields. The following table lists all available Fields and their purpose.

Regular Field List

id Field Type Purpose

1 iis_date (RAW) IIS raw date of hit field.

2 iis_time (RAW) IIS raw time of hit field.


3 apache_time (RAW) Apache raw date & time of hit field.

4 c_ip (RAW) Client IP Address.

5 cs_username (RAW) Client username (if any)

6 cs_request (RAW) Apache raw entire request field.

7 cs_method (RAW) IIS raw request method field.

8 cs_uristem (RAW) IIS raw request stem field.

9 cs_uriquery (RAW) IIS raw request query field.

10 sc_status (RAW) Return status code from server.

11 sc_bytes (RAW) Number of bytes transferred for request.

12 c_host (RAW) Client hostname (converts to c_ip if necessary).

13 cs_useragent (RAW) Browser user−agent information.

14 cs_cookie (RAW) Cookies sent by browser.

15 cs_referer (RAW) Raw Referral information (could be internal).

16 custom_date (RAW) Used for datestamp in Custom Logs.

17 custom_time (RAW) Used for timestamp in Custom Logs.

19 cs_host (RAW) Requested virtualhost by Client.

20 s_port (RAW) Server port number.

21 cs_version (RAW) IIS Raw HTTP version.

22 s_sitename (RAW) IIS Server site name.

23 s_computername (RAW) IIS Computer name.

24 s_ip (RAW) IIS Server IP address.

25 elf_orderid (RAW) E−commerce order id number.

26 elf_store (RAW) E−commerce store name.

27 elf_sessionid (RAW) E−commerce session id.

28 elf_total (RAW) E−commerce transaction amount.

29 elf_tax (RAW) E−commerce tax amount.

30 elf_shipping (RAW) E−commerce shipping amount.

31 elf_billcity (RAW) E−commerce customer city.

32 elf_billstate (RAW) E−commerce customer state.

33 elf_billzip (RAW) E−commerce customer zip code.

34 elf_billcountry (RAW) E−commerce customer country.

35 elf_productcode (RAW) E−commerce product code.

36 elf_productname (RAW) E−commerce product name.

37 elf_variation (RAW) E−commerce product variation.

38 elf_price (RAW) E−commerce product price.

39 elf_quantity (RAW) E−commerce product quantity.

40 elf_upsold (RAW) E−commerce upsold variable.

76 referral_protocol (AUTO) Referral protocol (http/https/etc.)

77 referral_host (AUTO) Referral complete hostname.


78 referral_domain (AUTO) Referral domain name.

79 referral_port (AUTO) Referral port number (if any).

80 referral_url (AUTO) Referral complete URL. (includes host)

81 referral_uri (AUTO) Referral complete URI. (no host)

82 referral_stem (AUTO) Referral URI stem without query info.

83 referral_query (AUTO) Referral Query info by itself.

84 referral_anchor (AUTO) Referral information after # tag.

85 referral_directory (AUTO) Referral directory up to filename.

86 referral_filename (AUTO) Referral filename without directory.

87 referral_mime (AUTO) Referral mime type (file extension)

88 referral_keywords (AUTO) Referral search engine keywords

89 referral_domainandstem (AUTO) Referral domain and URI stem together.

90 referral_errordetail (AUTO) Referral error detail information.

91 request_method (AUTO) Request method (GET/POST/etc.).

92 request_url (AUTO) Request complete URL (if provided).

93 request_version (AUTO) Request protocol version.

94 request_protocol (AUTO) Request protocol (HTTP/etc.).

95 request_host (AUTO) Request hostname (if any).

96 request_port (AUTO) Request port number (if any).

97 request_uri (AUTO) Request URI with query.

98 request_stem (AUTO) Request URI without query.

99 request_query (AUTO) Request query information (e.g., after ?)

100 request_anchor (AUTO) Request information after # tag

101 request_directory (AUTO) Request directory without filename.

102 request_filename (AUTO) Request filename without directory.

103 request_mime (AUTO) Request mime type (file extension).

104 request_origfilepath (AUTO) Request original uri stem if UTM.

105 request_origmime (AUTO) Request original mime type if UTM.

106 request_errordetail (AUTO) Request detail for error hits.

107 useragent_complete (AUTO) Complete user−agent.

108 browser_base (AUTO) Browser name (e.g., Netscape).

109 browser_version (AUTO) Browser version.

110 platform_base (AUTO) Platform (e.g., Windows).

111 platform_version (AUTO) Platform version.

112 domain_primary (AUTO) First level domain. (e.g. com).

113 domain_complete (AUTO) Complete domain. (e.g. urchin.com).

114 sid (AUTO) Session id (if any).

115 utm_cookiea (AUTO) UTM−2 cookie−a

116 utm_cookieb (AUTO) UTM−2 cookie−b


117 utm_cookiec (AUTO) UTM−2 cookie−c

119 utm_cookie1 (AUTO) UTM−1 cookie−1

120 utm_cookie2 (AUTO) UTM−2 cookie−2

121 utm_cookie3 (AUTO) UTM−3 cookie−3

122 utm_unique_id (AUTO) UTM unique visitor id.

123 utm_new_campaign (AUTO) new campaign variables detected

124 utm_page (AUTO) UTM page variable (used for request_ variables).

125 utm_referral (AUTO) UTM Referral (used for referral_ variables).

126 utm_screen_resolution (AUTO) Screen resolution (e.g., 800x600).

127 utm_screen_available (AUTO) Available screen resolution in pixels.

128 utm_browser_size (AUTO) Browser size in pixels.

129 utm_screen_colors (AUTO) Screen color bit depth.

130 utm_language (AUTO) Browser language code setting.

131 utm_java_enabled (AUTO) yes|no if java is enabled.

132 utm_cookies_enabled (AUTO) yes|no if cookies are enabled.

133 utm_timezone_offset (AUTO) +/−HHMM timezone offset value of browser.

134 utm_js_version (AUTO) Javascript version info.

135 utm_session_number (AUTO) Number of sessions for this visitor.

136 utm_repeat_campaign (AUTO) Repeat campaign detected.

137 utm_campaign (AUTO) same as utm_campaign in a link.

138 utm_medium (AUTO) same as utm_medium in a link.

139 utm_source (AUTO) same as utm_source in a link.

140 utm_term (AUTO) same as utm_term in a link.

141 utm_content (AUTO) same as utm_content in a link.

142 utm_campaign_session (AUTO) session number of this campaign.

143 utm_campaign_number (AUTO) Number of responses in __utmz.

144 utm_campaign_time (AUTO) Time in seconds of the current campaign.

145 elf_region (AUTO) E−Commerce region drilldown information.

146 utm_campaign_srcmedium (AUTO) utm_campaign [utm_medium].

147 utm_campaign_srcmedtrm (AUTO) utm_campaign [utm_medium] | utm_term.

148 utm_campaign_sesdelta (AUTO) difference in current session number and campaign session number.

149 utm_campaign_daysdelta (AUTO) difference in days between the hit viewtime and the campaign time.

150 utm_campaign_hour (AUTO) hour of the day the campaign occurred.

151 utm_campaign_goal (AUTO) campaign goal that was met.

152 log_source_name (AUTO) Log source name in Log Source Wizard.

153 utm_ipandvisitorid (AUTO) IP address or host − visitor key.

154 utm_id (AUTO) same as utm_id in a link.

155 utm_type (AUTO) used for email impressions.


Regular Report List

Overview

During processing, each hit in the log file is separated and calculated into different fields. These fields are then used to update data tables which are queried for report views. Some reports have special storage and are not included in the data tables.

The following table lists all of the predefined reports and which data table is queried for each.

Regular Report List

View# Report Name Data Table Fields Used

1100 Traffic

1102 Sessions Graph − −

1103 Pageviews Graph − −

1104 Hits Graph − −

1105 Bytes Graph − −

1110 Summary − −

1900 Visitors & Sessions

1903 Visitors by Day − −

1907 Sessions by Day − −

1901 Unique Visitors − −

1905 Unique Sessions − −

1904 Visitor Loyalty 37 utm_session_number vs. sessions

1906 Session Frequency − −

1902 Summary − −

1200 Pages & Files

1201 Requested Pages 7 request_stem vs. pageviews

1211 Downloads 17 request_stem vs. downloads

1206 All Files 12 request_origfilepath vs. hits

1202 Directory by Pages Drilldown 7 request_stem vs. pageviews

1207 Directory by Files Drilldown 12 request_origfilepath vs. hits

1208 Directory by Bytes Drilldown 18 request_stem vs. bytes

1203 File Types by Hits 13 request_origmime vs. hits

1209 File Types by Bytes 19 request_origmime vs. bytes


1210 Page Query Terms 11 request_stem|request_query vs. hits

1205 Posted Forms 8 request_stem vs. hits

1204 Status and Errors 14 sc_status|request_errordetail vs. hits

1600 Navigation

1601 Entrance Pages 20 request_stem vs. pageviews

1602 Exit Pages 20 request_stem vs. pageviews

1609 Click Paths 21 request_stem vs. sessions

1603 Click To and From 7 request_stem vs. pageviews

1610 Length of Pageview 22 request_stem vs. time

1604 Depth of Session − −

1608 Length of Session − −

1606 Click To and From Report 20 request_stem vs. pages

1300 Referrals

1301 Referrals 1 referral_domainandstem vs. sessions

1303 Referral Drilldown 1 referral_domainandstem vs. sessions

1302 Search Terms 2 referral_domain|referral_keywords vs. sessions

1304 Search Engines 2 referral_domain|referral_keywords vs. sessions

1305 Referral Errors 23 referral_errordetail|referral_domainandstem vs. hits

1400 Domains & Users

1401 Domains 4 domain_primary|domain_complete

1402 Domain Drilldown 4 domain_primary|domain_complete vs. sessions

1403 Countries 4 domain_primary|domain_complete vs. sessions

1404 IP Addresses 6 c_ip vs. sessions

1405 IP Drilldown 6 c_ip vs. sessions

1406 Usernames by Hits 10 cs_username vs. hits

1407 Usernames by Bytes 16 cs_username vs. bytes

1409 Usernames by Sessions 5 cs_username vs. sessions

1500 Browsers & Robots

1501 Browsers by Sessions Drilldown 3 useragent_complete vs. sessions

1504 Browsers by Hits Drilldown 9 useragent_complete vs. hits

1505 Browsers by Bytes Drilldown 15 useragent_complete vs. bytes

1502 Platforms by Sessions Drilldown 3 useragent_complete vs. sessions

1506 Platforms by Hits Drilldown 9 useragent_complete vs. hits

1507 Platforms by Bytes Drilldown 15 useragent_complete vs. bytes

1503 Combos by Sessions 3 useragent_complete vs. sessions

1510 Robots by Hits Drilldown 24 browser_base vs. hits

1511 Robots by Bytes Drilldown 25 browser_base vs. bytes

1800 Client Parameters

1801 Screen Resolution 31 utm_screen_resolution vs. sessions


1804 Screen Colors 32 utm_screen_colors vs. sessions

1805 Languages 33 utm_language vs. sessions

1806 Java Enabled 34 utm_java_enabled vs. sessions

1808 Timezone Offset 35 utm_timezone_offset vs. sessions

1809 Javascript Version 36 utm_js_version vs. sessions

2100 E−Commerce

2101 Revenue − −

2102 Number of Transactions − −

2103 Products by Revenue 42 elf_productname|elf_productcode vs. revenue

2104 Products by Quantity 41 elf_productname|elf_productcode vs. items

2105 Products by Revenue Drilldown 42 elf_productname|elf_productcode vs. revenue

2106 Products by Quantity Drilldown 41 elf_productname|elf_productcode vs. items

2107 E−Commerce Summary − −

2100 Revenue Source

2101 Revenue by Region Drilldown 43 elf_region vs. revenue

2102 Revenue by City 43 elf_region vs. revenue

2103 Revenue by Referrals 44 referral_domainandstem vs. revenue

2104 Revenue by Search Terms 45 referral_domain|referral_keywords vs. revenue

2105 Revenue by Search Engines Drilldown 45 referral_domain|referral_keywords vs. revenue

2106 Revenue by Domains Drilldown 46 domain_primary|domain_complete vs. revenue

2200 Campaign Tracking

2201 Lead Source−Acquisition 56, 57, 53 source(medium) vs clicks, impressions, new leads

2202 Lead Source−Quality 56, 52, 51 source(medium) vs clicks, pages, sessions

2203 Lead Source−Conversion 56, 55, 81 source(medium) vs clicks, goals, transactions

2204 Lead Source−ROI 56, 82, 54 source(medium) vs clicks, revenue, cost

2206 Lead Source−Conversion 56, 55 source(medium) vs clicks, goals

2206 Lead Source−cost breakdown 54 source(medium) vs cost

2211 Keyword Analysis−Acquisition 70, 71, 67 source (medium) vs clicks, impressions, new leads

2212 Keyword Analysis−Quality 70, 66, 65 source(medium) | term vs clicks, pages, sessions

2213 Keyword Analysis−Conversion 70, 69, 85 source (medium) | term vs clicks, goals, transactions

2214 Keyword Analysis−ROI 70, 86, 68 source(medium) | term vs clicks, revenue, cost

2215 Keyword Analysis−Conversion 70, 69 source(medium) | term vs clicks, goals

2216 Keyword Analysis−cost breakdown 68 source(medium) | term vs clicks

2221 Keyword Comparison−Acquisition 70, 71, 67 source (medium) vs clicks, impressions, new leads

2222 Keyword Comparison−Quality 70, 66, 65 source (medium) | term vs clicks, pages, sessions

2223 Keyword Comparison−Conversion 70, 69, 85 source (medium) | term vs clicks, goals, transactions


2224 Keyword Comparison−ROI 70, 86, 68 source(medium) | term vs clicks, revenue, cost

2225 Keyword Comparison−Conversion 70, 69 source (medium) | term vs clicks, goals

2226 Keyword Comparison−cost breakdown 68 source (medium) | term vs clicks

2231 Campaign Comparison−Acquisition 56, 57, 53 campaign name | source(medium) vs clicks, impressions, leads

2232 Campaign Comparison−Quality 56, 52, 51 campaign name | source(medium) vs clicks, pages, sessions

2233 Campaign Comparison−Conversion 56, 55, 81 campaign name | source(medium) vs clicks, goals, transactions

2234 Campaign Comparison−ROI 56, 82, 54 campaign name | source(medium) vs clicks, revenue, cost

2235 Campaign Comparison−Conversion 56, 55 campaign name | source(medium) vs clicks, goals

2236 Campaign Comparison−cost breakdown 54 campaign name | source(medium) vs cost

2241 Medium Comparison−Acquisition 56, 57, 53 campaign name | source(medium) vs clicks, impressions, leads

2242 Medium Comparison−Quality 56, 52, 51 campaign name | source(medium) vs clicks, pages, sessions

2243 Medium Comparison−Conversion 56, 55, 81 campaign name | source(medium) vs clicks, goals, transactions

2244 Medium Comparison−ROI 56, 82, 54 campaign name | source(medium) vs clicks, revenue, cost

2245 Medium Comparison−Conversion 56, 55 campaign name | source(medium) vs clicks, goals

2246 Medium Comparison−cost breakdown 54 campaign name | source(medium) vs cost

2251 Content Testing−Acquisition 63, 64, 60 campaign name | source(medium) vs clicks, impressions, leads

2252 Content Testing−Quality 63, 59, 58 campaign name | source(medium) vs clicks, pages, sessions

2253 Content Testing−Conversion 63, 62, 83 campaign name | source(medium) vs clicks, goals, transactions

2254 Content Testing−ROI 63, 84, 61 campaign name | source(medium) vs clicks, revenue, cost

2255 Content Testing−Conversion 63, 62 campaign name | source(medium) vs clicks, goals

2256 Content Testing−cost breakdown 61 campaign name | source(medium) vs cost

2265 Goal Conversion by Hour 73, 72 term | hour vs goals, clicks

2266 Sales Conversion by Hour 87, 72 term | hour vs transactions, clicks

2267 Repeat Clicks by IP 76 IP−VisitorID | source (medium) vs |term repeat responses

2268 Repeat Clicks by IP 76 IP−VisitorID | source (medium) vs |term repeat responses

2261 Time To Goal 75 days delta vs goals

2262 Sessions To Goal 74 session delta vs goals

2263 Time To Transaction 89 days delta vs transactions

2264 Sessions To Transaction 88 session delta vs transactions

Configuration Table and Directive List

Overview

The following matrices provide exact details on the table names, directives, and meanings for each database table in the Urchin 5 configuration. The first matrix defines each of the table names and what that table is used for. Then, for each database table, a comprehensive list of directives is provided.

Please note that most records will not specify a value for every possible directive for the table to which the record belongs. In some cases the directives may not be applicable to that particular record. Also, Urchin will use default values if there is no explicit definition for a directive. Directives may be manipulated by the web−based Urchin administration interface, or by scripts that use the uconf-driver or uconf-import utilities.

It should also be noted that this reference guide does not contain verbose descriptions of the directives and how they are to be used. In many cases, the intended usage of the directive may not be immediately obvious from the directive name and description provided. You should consult the appropriate sections in the Urchin Documentation Center at http://help.urchin.com to gain more insight about the capabilities of the product (e.g. filtering, backups, archiving, report view customization, etc.) and how the capabilities can be controlled with the configuration directives detailed below.

Note: Where applicable, default values are printed in bold typeface.

Table Name Definitions

Table Name Meaning/Purpose

global General settings including licensing and remote access

machine Process settings including database sizing, memory usage and process priority

filter Specifies log and profile runtime filter parameters

logfile Specifies the location and format of a log source

profile Log Processing and Reporting settings for a particular website

task Runtime schedule settings for a particular profile

affiliation Enterprise−level management of profiles, log sources, filters, groups and users

group Group−level management of users including profile access

user Individual user settings including password, language and locality

Directive List: global table

Directive Meaning/Purpose

cr_dcmode datacenter mode (on|off)

cr_directlink allow direct web links to Urchin reports (on|off)

cr_remoteaccess allow remote access (on|off)

cr_remoteadmin allow remote administration (on|off)

cs_region two−letter global region code

fr=France

ge=Germany

it=Italy

ja=Japan

ko=Korea

po=Portugal

sp=Spain

sw=Sweden

uk=United Kingdom

ct_license license code used by Urchin Licensing

ct_name identifier for record in global table

ct_port port that Apache runs on (default: 9999)

ct_serial serial code used by Urchin Licensing

ct_schedulers [internal use only]

cr_setupwizard run setup wizard first time (on|off)

ct_var VAR code used by Urchin licensing

ct_schedulersleep length of time in seconds that the scheduler waits before checking for the next task (default: 3)

Directive List: machine table

Directive Meaning/Purpose

cr_priority run priority of Urchin log processing engine (low|normal|high)

cs_preset [internal use only]

cs_limitdbtable maximum number of records allowed in database tables (default: 10000)

ct_dbuffsize data buffer size in MB (default: 13)

ct_pbuffsize path buffer size in MB (default: 1)

ct_name identifier for record in machine table

ct_sbuffsize session buffer size in MB (default: 3)

ct_tbuffsize text buffer size in MB (default: 1)

ct_vbuffsize visitor buffer size in MB (default: 2)
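
The fallback to default values can be pictured with a small sketch. The default numbers below are copied from the machine table above; the function names, the dictionary layout, and the sample record are hypothetical and only illustrate the idea that an unset directive takes its documented default. This is not how Urchin stores its configuration internally.

```python
# Defaults copied from the machine table above; everything else is illustrative.
MACHINE_DEFAULTS = {
    "cs_limitdbtable": 10000,  # maximum records per database table
    "ct_dbuffsize": 13,        # data buffer size, MB
    "ct_pbuffsize": 1,         # path buffer size, MB
    "ct_sbuffsize": 3,         # session buffer size, MB
    "ct_tbuffsize": 1,         # text buffer size, MB
    "ct_vbuffsize": 2,         # visitor buffer size, MB
}


def effective_machine_settings(record):
    """Return the record with documented defaults filled in for unset directives."""
    settings = dict(MACHINE_DEFAULTS)
    settings.update(record)
    return settings


def total_buffer_mb(settings):
    """Sum the five buffer-size directives (all expressed in MB)."""
    return sum(v for k, v in settings.items() if k.endswith("buffsize"))


# Hypothetical record that only overrides the data buffer size.
example = effective_machine_settings({"ct_name": "machine1", "ct_dbuffsize": 26})
print(total_buffer_mb(example))
```

Summing the buffer directives in this way gives a rough idea of how much memory a given machine record asks for.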

Directive List: filter table

Unless otherwise noted, directives in this table apply to all filter types.

cr_action [internal use only]

cr_casesensitive filter is case sensitive (yes|no); applies to advanced|exclude|include|replace filter types

cr_filtertype type of filter:

advanced=Advanced filter built from two other fields

decode=Decode URL−encoded characters back to their original form

dynamicurl=DynamicURL filter from Urchin 3 and Urchin 4 (deprecated)

exclude=Exclude pattern filter

include=Include pattern filter

jaconv=Convert various Japanese encodings into UTF−8 encoding

replace=Pattern search and replace filter

cr_override overwrite data in the output field if it is already populated (yes|no); applies to advanced filter type

cs_filterfield ID number of field to apply filter to, from the Regular Field List reference table; applies to decode|exclude|include|jaconv|replace filter types

cs_infielda ID number of first field to apply filter to, from the Regular Field List reference table; applies to advanced filter type

cs_infieldb ID number of second field to apply filter to, from the Regular Field List reference table; applies to advanced filter type

cs_outfield ID number of the field to output filter results to, from the Regular Field List reference table; applies to advanced filter type

cs_llist exclamation−point delimited list of log source recnums to which this filter is applied (uconf−driver)

comma delimited list of log source names to which this filter is applied(uconf−import)

cs_rlist exclamation−point delimited list of profile recnums to which this filter is applied (uconf−driver)

comma delimited list of profile names to which this filter is applied (uconf−import)

ct_affiliation optional affiliation

ct_filter filter pattern (simple pattern or POSIX regular expression); applies to include|exclude filter types

ct_inexpa regular expression pattern for first filter; applies to advanced filter type

ct_inexpb regular expression pattern for second filter; applies to advanced filter type

ct_name identifier for record in filter table

ct_outexp expression defined explicitly or constructed from saved pattern parts of input expressions, e.g. ($A1, $B2); applies to advanced filter type

ct_replace replacement string pattern; applies to replace filter type

ct_search search string pattern; applies to replace filter type
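
As a rough illustration of how the advanced filter directives relate to one another, the sketch below matches two input fields against ct_inexpa and ct_inexpb and builds an output value from a ct_outexp that refers to saved pattern parts such as $A1 and $B1. It uses Python's re module, which is not a POSIX regular expression engine, and the function name, the single-digit group references, and the sample values are assumptions made for illustration. It is not Urchin's filter engine.

```python
import re


def apply_advanced_filter(field_a, field_b, inexp_a, inexp_b, outexp):
    """Build an output value from the captured groups of two input patterns.

    Returns None when either pattern fails to match.
    """
    match_a = re.search(inexp_a, field_a)
    match_b = re.search(inexp_b, field_b)
    if match_a is None or match_b is None:
        return None
    groups = {"A": match_a, "B": match_b}

    def substitute(ref):
        # ref.group(1) is "A" or "B"; ref.group(2) is the saved-part number.
        return groups[ref.group(1)].group(int(ref.group(2))) or ""

    return re.sub(r"\$([AB])(\d)", substitute, outexp)


# Hypothetical use: prefix the first path component with the hostname.
result = apply_advanced_filter(
    "www.example.com", "/docs/index.html",
    r"^(.*)$", r"^(/[^/]*)", "$A1$B1",
)
print(result)
```

In a real record the result would be written to the field named by cs_outfield, and cr_override would decide whether an already populated field is overwritten.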

Directive List: logfile table

Directive Meaning/Purpose

cr_action [internal use only]

cr_logdestiny disposition of log after processing (1=don't touch, 2=archive/compress, 3=delete)

cr_protocol remote log transfer protocol (ftp|http)

cr_type location of log (local|remote)

cr_uristemtolower convert the URI stem to lower case when reading log (on|off)

cs_flist exclamation−point delimited list of filter recnums which are applied to this log source (uconf−driver)

comma delimited list of filter names which are applied to this log source (uconf−import)

cs_logformat logging format for logfile (auto|elf|elf2|ncsa|netscape|w3c)

cs_rlist exclamation−point delimited list of profile recnums using this log source (uconf−driver)

comma delimited list of profile names using this log source (uconf−import)

ct_affiliation optional affiliation

ct_loglocation local log pathname/location (e.g. /logs/access.log)

ct_name identifier for record in logfile table

ct_password password for ftp/http remote log access and UNC pathnames in Windows environments

ct_pathtimeoffset offset in hours from local time when using date matching patterns in the logfile specification (e.g. +8)

ct_pathtimegmt substitute GMT time for local time when using date matching patterns in the logfile specification (on|off)

ct_port port number (e.g. 21 for ftp, 80 for http)

ct_querytoken specify the query token separating the URI stem from the query (default: ? )

ct_remotelocation remote log pathname/location

ct_separator single character field separator (\s, \t are escaped characters for space and tab)

ct_server fully qualified domain name or IP address of remote host/server for remote log downloads

ct_username username to use for ftp/http remote log access and UNC pathnames in Windows environments (default: anonymous)
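
The following sketch shows one way to read the ct_separator, ct_querytoken, and cr_uristemtolower directives together. The helper names and the sample log line are hypothetical, and the behavior shown (splitting at the first occurrence of the query token) is an assumption about how such settings are typically applied, not a description of Urchin's log reader.

```python
def decode_separator(value):
    """Translate the escaped forms \\s and \\t into a literal space or tab."""
    return {r"\s": " ", r"\t": "\t"}.get(value, value)


def split_uri(uri, querytoken="?", stem_to_lower=False):
    """Split a request URI into (stem, query) at the first query token."""
    stem, _, query = uri.partition(querytoken)
    if stem_to_lower:
        stem = stem.lower()
    return stem, query


# Hypothetical tab-separated log line whose site uses ";" as the query token.
line = "2005-01-23\t/Cart.asp;item=4"
fields = line.split(decode_separator(r"\t"))
print(split_uri(fields[1], querytoken=";", stem_to_lower=True))
```

With the default query token, `split_uri("/a.html?x=1")` separates the stem `/a.html` from the query `x=1` in the same way.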

Directive List: profile table

Directive Meaning/Purpose

cr_archivedata enable automatic ZIP archiving of older Urchin monthly databases (on|off)

cr_autorollback enable automatic rollback of Urchin databases after failed log processing (on|off)

cr_cleanbackups enable automatic removal of outdated Urchin database ZIP backups (on|off)

cr_createbackups enable automatic creation of Urchin database ZIP backups to allow rollback functionality (on|off)

cr_includemimes specify whether ct_mimes list of pageview suffix/MIME types should be an include or exclude list (exclude|include)

cr_includeparameters specify whether ct_parameters list of URI query terms should be an include or exclude list (exclude|include)

cr_keeprawtrackingdata specify whether raw tracking data should be retained (on|off)

cr_logtracking turn log tracking (on|off)

cr_pgoalcasesensitive campaign primary goal match case sensitive (yes|no)

cr_processpath turn visitor tracking (on|off)

cr_processvisitors specify whether to keep visitor information between log processing runs (on|off)

cr_profiletype profile type (Standard_Website|E−Commerce_Website)

cr_sessionpageview session requires a pageview (on|off)

cs_archivenmonths create monthly ZIP archives of Urchin databases after n months (default: 12)

cs_flist exclamation−point delimited list of filter recnums applied to this profile (uconf−driver)

comma delimited list of filter names applied to this profile (uconf−import)

cs_glist exclamation−point delimited list of group recnums granted access to this profile (uconf−driver)

comma delimited list of group names granted access to this profile

cs_keepnbackups specify number of ZIP backups to keep (0−10, default: 2)

cs_limitdbtable maximum number of database records to keep for any database table for this profile (overrides cs_limitdbtable global value; default: 10000)

cs_llist exclamation−point delimited list of log source recnums associated with this profile (uconf−driver)

comma delimited list of log source names associated with this profile (uconf−import)

cs_pathlevel depth of path reporting (default: 3)

cs_pgoalfield internal numeric id of field in ct_pgoalfield

cs_referrallevel referral level to report (default: 3)

cs_reportset report view template for this profile, specified as one of the six built−in templates (Basic All|Basic Lite|Basic IT|UTM−Enabled All|UTM−Enabled Nopaths|UTM−Enabled Webdesign) or a user−specified reporting template that matches a custom ".rs" reporting template file

cs_sidfield ID number for field where session ID is contained, from the Regular Field List reference table

cs_taskid recnum for associated task in Task table (uconf−driver only)

cs_timeoffset Time offset (in seconds) for data in log (default: 0=GMT)

cs_ulist exclamation−point delimited list of user recnums granted access to this profile (uconf−driver)

comma delimited list of user names granted access to this profile(uconf−import)

cs_vmethod visitor tracking (0=IP−UserAgent, 1=Session ID, 2=UTM, 3=IP−Only)

cs_visitortimeout session timeout in seconds (default: 3600)

ct_affiliation optional affiliation

ct_defaultpage default page for site (e.g. index.html)

ct_downloads comma separated list of download page suffix/MIME−types to match (default: dmg,doc,exe,gz,pdf,pkg,ppt,sh,tar,xls,zip)

ct_keywords comma separated list of search engine referral keywords to match (default: general,key,kw,mt,p,q,qs,qt,query,search,search_string,text,word,words)

ct_lasthit time of most recent hit processed for this profile in seconds since 1970 [read−only, set by log processing engine]

ct_mimes comma separated list of pageview suffixes/MIME types to match or exclude (default: css,cur,gif,ico,ida,jpeg,jpg,js,png)

ct_name identifier for record in profile table

ct_parameters comma separated list of URI query terms to include or exclude in the Page Query Terms report (default: sid)

ct_pgoalexp campaign primary goal expression to match

ct_pgoalfield field name to match expression in ct_pgoalexp against

ct_reportdomains comma delimited list of site domains (e.g. urchin.com,www.urchin.com,quantified.net,www.quantified.net)

ct_sidpre text pattern that precedes session id pattern being matched

ct_sidpost text pattern that terminates the session id pattern being matched

ct_utmdomain domain name to be used for UTM tracking (must match that set in the __utm.js file in the document root of the website itself)

ct_website URL for website associated with this profile (e.g. http://www.urchin.com)
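
The interplay between ct_mimes and cr_includemimes can be sketched as follows. The default suffix list is copied from the table above. The interpretation (in exclude mode a request whose suffix is listed is not counted as a pageview, and in include mode only listed suffixes are counted) is an assumption based on the directive descriptions, and the function name is hypothetical.

```python
import posixpath

# Default copied from the ct_mimes entry in the profile table above.
DEFAULT_MIMES = "css,cur,gif,ico,ida,jpeg,jpg,js,png"


def is_pageview(uri_stem, ct_mimes=DEFAULT_MIMES, cr_includemimes="exclude"):
    """Decide whether a request counts as a pageview under the assumed rules."""
    suffixes = {s.strip().lower() for s in ct_mimes.split(",") if s.strip()}
    ext = posixpath.splitext(uri_stem)[1].lstrip(".").lower()
    listed = ext in suffixes
    # include: only listed suffixes count; exclude: listed suffixes are dropped.
    return listed if cr_includemimes == "include" else not listed


print(is_pageview("/index.html"))    # suffix not in the exclude list
print(is_pageview("/img/logo.gif"))  # suffix in the exclude list
```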

Directive List: task table

Directive Meaning/Purpose

cd_btime start time of last run for this task in seconds since 1970 [read−only, set by log processing engine]

cd_etime finish time of last run for this task in seconds since 1970 [read−only, set by log processing engine]

cd_lastrun time of last initiation for this task in seconds since 1970 [read−only, set by log processing engine]

cd_nextrun time of next run for this task in seconds since 1970

cr_dow day of week to run task (0=Sun,1=Mon,2=Tue,3=Wed,4=Thu,5=Fri,6=Sat)

cr_enabled [internal use only]

cr_frequency task frequency (0=never,3=once,4=hourly,5=daily,6=weekly,7=monthly)

cr_runnow [internal use only]

cs_dom day of month to run task (monthly scheduling option) [1−31]

cs_hour hour of day to run task [0−23]

cs_minute minute of hour to run task [0−59]

cs_rid recnum for associated profile in Profile table (uconf−driver only)

ct_affiliation optional affiliation

ct_application [internal use only]

ct_completed percent of log processing completed [read−only, set by log processing engine]

ct_day day of month to run task (run−once option) [1−31]

ct_lockid [internal use only]

ct_month month to run task (run−once option) [1−12]

ct_pid [internal use only]

ct_name identifier for record in task table

ct_runstatus current runtime status of task (0=processing logs,1=processing DNS,2=completed,3=error,4=queued) [read−only, set by log processing engine]

ct_status current scheduling status of task (0=disabled,1=not scheduled,2=scheduled,3=running,4=completed,5=error) [read−only, set by log processing engine]

ct_year year for task to run (run−once option) [4−digit CCYY format]
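
A script that reads task records may want to turn the numeric codes and timestamps above into readable values. The sketch below does this for a hypothetical record. The code tables are copied from the directive list. The record contents and the function name are invented for illustration, and "seconds since 1970" is assumed to mean a standard Unix timestamp interpreted in UTC.

```python
from datetime import datetime, timezone

# Code tables copied from the task table above.
FREQUENCY = {0: "never", 3: "once", 4: "hourly", 5: "daily", 6: "weekly", 7: "monthly"}
RUNSTATUS = {0: "processing logs", 1: "processing DNS", 2: "completed",
             3: "error", 4: "queued"}
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def describe_task(record):
    """Return a readable summary of a task record given as a dict of directives."""
    next_run = datetime.fromtimestamp(int(record["cd_nextrun"]), tz=timezone.utc)
    return {
        "name": record["ct_name"],
        "frequency": FREQUENCY.get(int(record["cr_frequency"]), "unknown"),
        "day_of_week": DAYS[int(record["cr_dow"])],
        "runstatus": RUNSTATUS.get(int(record["ct_runstatus"]), "unknown"),
        "next_run_utc": next_run.strftime("%Y-%m-%d %H:%M:%S"),
    }


# Hypothetical weekly task; values are strings, as they might be read from a file.
summary = describe_task({
    "ct_name": "example-task",
    "cr_frequency": "6",
    "cr_dow": "0",
    "ct_runstatus": "4",
    "cd_nextrun": "1106449200",
})
print(summary)
```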

Directive List: affiliation table

Directive Meaning/Purpose

ct_browselocation pathname specification for top−level directory allowed for browsing for logs in Log Source

ct_cachedirectory pathname specification of directory used to store temporary cache files used in display of reports for an affiliation

ct_contact descriptive name for the affiliation's contact person

ct_email email address for affiliation's contact person

ct_name identifier for record in affiliation table

ct_reportdirectory pathname specification for top−level directory where Urchin reporting databases will live for the affiliation

Directive List: user table

cr_changelanguage user may change language preference (no|yes)

cr_changepassword user may change password (no|yes)

cr_changeregion user may change region preference (no|yes)

cr_leveltype affiliation admin privilege level

0=manage users/groups/tasks

1=manage users/groups/tasks/filters

2=manage users/groups/tasks/filters/log sources/profiles

cs_adminlevel admin level (1=admin, 2=affiliate admin, 3=user)

cs_glist exclamation−point delimited list of group recnums the user belongs to (uconf−driver)

comma delimited list of group names the user belongs to (uconf−import)

cs_language two−letter report language code for user

en=English

fr=French

ge=German

ja=Japanese

sp=Spanish

cs_region two−letter region code for user

us=United States

ch=China

fr=France

ge=Germany

it=Italy

ja=Japan

ko=Korea

po=Portugal

sp=Spain

sw=Sweden

uk=United Kingdom

cs_rlist exclamation−point delimited list of profile recnums the user has access to (uconf−driver)

comma−delimited list of profile names the user has access to (uconf−import)

cs_rslist exclamation−point delimited set of "recnum|ReportSetName" pairs that optionally controls the report view for this user for a particular report (e.g. !79|Basic_All!83|Basic_Lite!)

ct_affiliation optional affiliation for user

ct_fullname full name of user

ct_name identifier for record in user table

ct_password user password (automatically encrypted on input by uconf−driver or uconf−import)

Directive List: group table

cs_rlist exclamation−point delimited list of profile recnums the group has access to (uconf−driver)

comma−delimited list of profile names the group has access to (uconf−import)

cs_rslist exclamation−point delimited set of "recnum|ReportSetName" pairs that optionally controls the report view for this group for a particular report (e.g. !79|Basic_All!83|Basic_Lite!)

cs_ulist exclamation−point delimited list of user recnums assigned to the group (uconf−driver)

comma−delimited list of user names assigned to the group (uconf−import)

ct_affiliation optional affiliation

ct_groupdesc description of the group

ct_name identifier for record in group table
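
The cs_rslist format used in the user and group tables can be read and written with a few lines of code. The sketch below parses the example value given above into a mapping from recnum to report set name and formats it back. The function names are hypothetical, and the leading and trailing exclamation points are inferred from the single example in this guide, so treat that detail as an assumption.

```python
def parse_rslist(value):
    """Parse '!79|Basic_All!83|Basic_Lite!' into {79: 'Basic_All', 83: 'Basic_Lite'}."""
    pairs = {}
    for item in value.split("!"):
        if not item:
            continue  # skip the empty pieces around leading/trailing delimiters
        recnum, _, report_set = item.partition("|")
        pairs[int(recnum)] = report_set
    return pairs


def format_rslist(pairs):
    """Format a {recnum: report set name} mapping back into the delimited form."""
    if not pairs:
        return ""
    body = "!".join(f"{recnum}|{name}" for recnum, name in sorted(pairs.items()))
    return f"!{body}!"


parsed = parse_rslist("!79|Basic_All!83|Basic_Lite!")
print(parsed)
print(format_rslist(parsed))
```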

Error code list for failed FTP and HTTP remote webserver log transfers

Overview

An Urchin Log Source can be configured to collect a webserver log from a remote server via FTP or HTTP. Under normal circumstances, the transfer will be successful and no errors appear in the runtime log. However, if some error is encountered during the transfer (e.g. an invalid username/password, remote server unreachable, remote log unreadable, etc.), Urchin will log an error code in the runtime output, as viewable in the Task History for the Profile. This error code appears in parentheses next to the "failed" message after the webserver log transfer is attempted, e.g. (−9)

The error codes are listed below along with a text message explaining the problem that was encountered.

Error Code List

1 Unsupported protocol. This build of curl has no support for this protocol.

2 Failed to initialize.

3 URL malformat. The syntax was not correct.

4 URL user malformatted. The user−part of the URL syntax was not correct.

5 Couldn't resolve proxy. The given proxy host could not be resolved.

6 Couldn't resolve host. The given remote host was not resolved.

7 Failed to connect to host.

8 FTP weird server reply. The server sent data curl couldn't parse.

9 FTP access denied. The server denied login.

10 FTP user/password incorrect. Either one or both were not accepted by the server.

11 FTP weird PASS reply. Curl couldn't parse the reply sent to the PASS request.

12 FTP weird USER reply. Curl couldn't parse the reply sent to the USER request.

13 FTP weird PASV reply. Curl couldn't parse the reply sent to the PASV request.

14 FTP weird 227 format. Curl couldn't parse the 227−line the server sent.

15 FTP can't get host. Couldn't resolve the host IP we got in the 227−line.

16 FTP can't reconnect. Couldn't connect to the host we got in the 227−line.

17 FTP couldn't set binary. Couldn't change transfer method to binary.

18 Partial file. Only a part of the file was transferred.

19 FTP couldn't download/access the given file, the RETR (or similar) command failed.

20 FTP write error. The transfer was reported bad by the server.

21 FTP quote error. A quote command returned error from the server.

22 HTTP not found. The requested page was not found. This return code only appears if −−fail is used.

23 Write error. Curl couldn't write data to a local filesystem or similar.

24 Malformat user. User name badly specified.

25 FTP couldn't STOR file. The server denied the STOR operation.

26 Read error. Various reading problems.

27 Out of memory. A memory allocation request failed.

28 Operation timeout. The specified time−out period was reached according to the conditions.

29 FTP couldn't set ASCII. The server returned an unknown reply.

30 FTP PORT failed. The PORT command failed.

31 FTP couldn't use REST. The REST command failed.

32 FTP couldn't use SIZE. The SIZE command failed. The command is an extension to the original FTP spec RFC 959.

33 HTTP range error. The range "command" didn't work.

34 HTTP post error. Internal post−request generation error.

35 SSL connect error. The SSL handshaking failed.

36 FTP bad download resume. Couldn't continue an earlier aborted download.

37 FILE couldn't read file. Failed to open the file. Permissions?

38 LDAP cannot bind. LDAP bind operation failed.

39 LDAP search failed.

40 Library not found. The LDAP library was not found.

41 Function not found. A required LDAP function was not found.

42 Aborted by callback. An application told curl to abort the operation.

43 Internal error. A function was called with a bad parameter.

44 Internal error. A function was called in a bad order.

45 Interface error. A specified outgoing interface could not be used.

46 Bad password entered. An error was signaled when the password was entered.

47 Too many redirects. When following redirects, curl hit the maximum amount.

48 Unknown TELNET option specified.

49 Malformed telnet option.

51 The remote peer's SSL certificate wasn't ok

52 The server didn't reply anything, which here is considered an error.

53 SSL crypto engine not found

54 Cannot set SSL crypto engine as default

55 Failed sending network data

56 Failure in receiving network data

57 Share is in use (internal error)

58 Problem with the local certificate

59 Couldn't use specified SSL cipher

60 Problem with the CA cert (path? permission?)

61 Unrecognized transfer encoding
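
When scanning the Task History for failed transfers, a small lookup can save a trip to this list. The sketch below pulls the parenthesized code out of a log line and maps it to a short message. The messages are abbreviated from a few entries in the list above. The log line is hypothetical, and because the overview shows the code with a leading minus sign, e.g. (−9), while the list uses positive numbers, the sketch assumes the magnitude of the logged value is the list entry.

```python
import re

# A few entries abbreviated from the Error Code List above.
TRANSFER_ERRORS = {
    6: "Couldn't resolve host",
    7: "Failed to connect to host",
    9: "FTP access denied",
    10: "FTP user/password incorrect",
    19: "FTP couldn't download/access the given file",
    22: "HTTP not found",
    28: "Operation timeout",
}


def explain_failure(log_line):
    """Return (code, message) for a parenthesized error code, or None if absent."""
    # Accept an ASCII hyphen or the U+2212 minus sign in front of the digits.
    match = re.search(r"\([-\u2212]?(\d+)\)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, TRANSFER_ERRORS.get(code, "see the Error Code List")


# Hypothetical runtime log line; the real wording may differ.
print(explain_failure("remote log transfer failed (-9)"))
```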
