Website Evaluation

Architectural criteria for website evaluation: conceptual framework and empirical validation

SEOYOUNG HONG and JINWOO KIM

HCI Lab, School of Business, Yonsei University, Seoul, Korea; e-mail: [email protected]

Abstract. With the rapid development of the Internet, many types of websites have been developed. This variety of websites makes it necessary to adopt systemized evaluation criteria with a strong theoretical basis. This study proposes a set of evaluation criteria derived from an architectural perspective which has been used for over 1000 years in the evaluation of buildings. The six evaluation criteria are internal reliability and external security for structural robustness, useful content and usable navigation for functional utility, and system interface and communication interface for aesthetic appeal. The impacts of the six criteria on user satisfaction and loyalty have been investigated through a large-scale survey. The study results indicate that the six criteria have different impacts on user satisfaction for different types of websites, which can be classified along two dimensions: users' goals and users' activity levels.

1. Introduction

As the number of Internet users has increased, so has the variety of websites (Wilson et al. 1997). At the beginning of the Internet era, most websites were personal homepages. However, websites now include many different types, for example, sites for trading physical products (Daft and Lengel 1986) and sites for online network games (Mulligan 1998).

As the variety of websites increases, it becomes more important to have a set of usability evaluation criteria that meet three requirements identified by Wilkinson et al. (1997). First, we need evaluation criteria that have strong theoretical foundations, so that we can be sure that they are comprehensive and do not miss any important aspects of the usability of websites (Kim et al. 2002). Second, we need empirical validation for the evaluation criteria to be sure that they measure what they are intended to measure and produce reliable results (Shneiderman 1994). Third, the criteria should be applicable to different types of websites. The effectiveness of specific evaluation criteria may vary significantly between different types of websites (Kim and Lee 2002). For example, from the users' perspective, providing security might be very important for online brokerage sites but not as important for cyber museum sites. Therefore, an objective classification scheme for websites should be developed in conjunction with the development of evaluation criteria.

The primary objective of this study was to propose systemized evaluation criteria and a classification framework for websites of various kinds, with a theoretical basis and empirical validation. Systemized evaluation criteria with a theoretical basis and empirical validation are an important pre-condition for building usable and useful websites. This is because the evaluation criteria may be used both to diagnose problems in current websites and to allocate resources for future websites. The evaluation criteria in this study were based on the theory of the architectural framework. This theoretical background helps us to explain why the evaluation criteria are important for website development. The reliability of the proposed evaluation criteria was then investigated empirically through a large-scale survey. This empirical validation assures practitioners that these criteria measure what they are intended to measure. Finally, various websites were classified along two dimensions by multi-dimensional scaling and the impacts of the proposed evaluation criteria were compared along the two dimensions by structural equation modelling methods. This comparison allows practitioners to allocate limited resources appropriately according to the characteristics of specific websites.

To summarize, this study was conducted to answer three main research questions. First, what are the important evaluation criteria for website usability? Second, how can diverse websites be classified? Finally, how does the relative importance of evaluation criteria change for different types of websites?

This paper consists of five sections. The next section describes the theoretical background of the proposed evaluation criteria based on the architectural framework. The following section explains the classification scheme for diverse websites. Then we explain the procedure and measures of the large-scale survey, and present its results. Finally, the paper ends with a review of the limitations and implications of the study results.

BEHAVIOUR & INFORMATION TECHNOLOGY, SEPTEMBER-OCTOBER 2004, VOL. 23, NO. 5, 337-357

Behaviour & Information Technology ISSN 0144-929X print/ISSN 1362-3001 online © 2004 Taylor & Francis Ltd

http://www.tandf.co.uk/journals DOI: 10.1080/01449290410001712753

2. Design principles and evaluation criteria

Many design principles and evaluation criteria have been proposed for the development and evaluation of websites (Alastrair 1997, Selz and Schubert 1998, Nielsen 2000, Krug 2000). However, these criteria have problems in three respects. First, the criteria generally lack a theoretical background. Some studies suggest several measures based on existing practices with no explicit theoretical constructs. Others suggest numerous criteria without any justification of why they are needed. Therefore, we cannot be sure whether they are comprehensive or miss some important aspects of the quality of websites. Second, some studies simply proposed evaluation criteria without any empirical validation (Selz and Schubert 1998). Therefore, we cannot be sure whether they are measuring what they are supposed to evaluate, i.e. the usability of the website, and whether the usability is relevant to the performance of the websites. Finally, the criteria proposed by some prior studies lack external validity. We cannot be sure whether they can be applied to other types of websites. For example, some criteria are domain-specific and applicable only to e-commerce sites but not, for example, to personal web pages (Shneiderman 1993, Ho et al. 1998, Perry and Bodkin 2000, Kwon et al. 2002). Similarly, Kim and Lee (2002) provided a set of evaluation criteria based on architectural theory but the criteria were applied only to a specific type of website (stock trading sites). Other constructs are too abstract and do not reflect the characteristics of websites even though they are applicable to all kinds of sites (Bauer and Scharl 2000). For example, aesthetic screen design might be important for cyber galleries but not as important for online stock trading. Evaluation criteria should be sensitive enough to reflect the unique characteristics of different kinds of websites.

We propose evaluation criteria and a classification scheme based on the parallels between websites and buildings (Mitchell 1995, Winograd and Tabor 1996). Just as a building is a type of artifact that people construct in real space, so a website is a type of artifact that people build in cyber space. Websites can be regarded as similar to buildings in cyber space for two reasons. First, websites and buildings serve similar objectives. Buildings offer a physical space where various activities are performed, whereas websites offer a virtual space on the Internet for many of the same activities. In other words, buildings such as marketplaces, schools, post offices, and libraries in the real world can be compared to websites such as virtual malls, distance learning, e-mail, and portal sites on the Internet (Mitchell 1995). Second, users' perceptions are important both for websites and for real-world buildings because one of the ultimate goals of the two is to provide appropriate experiences for users (Gonzales et al. 1997, Liao and Cheung 2001). Therefore, the architecture of both websites and buildings emphasizes the quality of users' experiences. For example, reliability and convenient functions are important factors for both website users and building residents. The architectural quality of websites may therefore be similar to that of buildings from the user-experience perspective.

One of the advantages of using the parallels with buildings is that we can learn from the conceptual framework of architectural principles that has been used to evaluate buildings for over 1000 years (Giedion 1941). Buildings have usually been designed and appraised from three interrelated perspectives based on the works of the famous Roman architecture critic Vitruvius: firmitas, utilitas, and venustas (Rasmussen 1959). These three perspectives were later elaborated in the domain of POE (Post Occupancy Evaluation), which is the process of evaluating buildings in a systematic and rigorous manner after they have been built and occupied for some time (Zirmring and Reizenstein 1980, Preiser et al. 1988, Gonzales et al. 1997).

Firmitas refers to the structural robustness of a design (Giedion 1941). A building has to be robust enough to protect inhabitants from all external threats such as cold winds and snow. It also has to stand firm through internal wear and tear in order to avoid collapsing. Utilitas means the appropriate allocation of space in a design. A building should provide spaces suitable for the purposes for which it is intended (Giedion 1941). Finally, venustas represents the aesthetic appeal of architecture (Rasmussen 1959). A building should have a pleasant appearance to arouse pleasurable emotions. In summary, a good building has to provide structural robustness, functional utility, and aesthetic appeal. This conceptual framework of architectural quality is used in this study as a useful tool to organize design dimensions of websites into a systematic evaluation framework.


2.1. Robustness dimension

The robustness of websites can be defined as the solidity of the system structure in overcoming all expected and unexpected threats. We hypothesize that robustness is an important design dimension that may affect user satisfaction and loyalty to websites. This is because users want to feel secure before they initiate any interaction with sites. For example, a survey conducted by the European Messaging Association revealed that the vast majority of respondents demanded appropriate robustness before they conducted any interaction on the web (Shankar 1996). A survey study also noted that structural robustness on the Internet has received considerable attention both directly in the form of safe and secure interaction and indirectly in the form of possible risks (National Computer Board 1997).

We believe that the robustness dimension of a website can be evaluated by two criteria: internal reliability and external security. The internal reliability criterion denotes the operational stability of websites (Huang et al. 1999). We hypothesize that internal reliability is important for the structural robustness of websites because unstable systems frustrate users and diminish the quality of the user experience. For example, it has been found that interaction with e-commerce sites depends on the perceived stability of user experience (Liang and Huang 1998). Similarly, it has been argued that the most important obstacle to online interaction is the lack of system stability (Venkata and Lili 2000). Internal reliability of websites can be measured by such factors as speed of access and stability of performance (Bhimani 1996).

The external security criterion represents the safety of websites from external threats (Zona Research 2000). We believe that external security is important for the structural robustness of websites because a website that is not considered a safe place would not attract users (Liu et al. 1997). Lack of security has been found to be one of the main factors inhibiting users from engaging in online transactions (Sasa 2000). Recent studies in e-commerce also found that perceived security risks exert a significant effect on users' willingness to be involved in transaction activities (Jarvenpaa et al. 2000, Liao and Cheung 2001, McKnigh et al. 2002). The external security of websites can be evaluated by such factors as the quality of firewalls and privacy policies (Panurach 1996). Therefore, external security includes criteria for security levels as well as privacy.

2.2. Utility dimension

From the functional perspective, a building should be appropriate for its usage (Britannica 2001). A building that is good for an office may not necessarily be suitable for residential purposes. This dimension in POE includes such factors as storage, workflow, human factors, and flexibility (Preiser et al. 1988).

The functional utility principle for websites indicates that they should provide appropriate features for the users' interactions with the system. We hypothesize that providing appropriate features for users to complete their intended activities is an important architectural construct because it determines how effectively websites help users accomplish their goals.

We propose that the utility dimension for websites can be evaluated by two criteria: useful contents and usable navigation. It has been found that usefulness and usability are two of the most important factors for user satisfaction (Davis 1989). Content usefulness refers to the quality of information provided in websites (Huang et al. 1999, Perry and Bodkin 2000). A recent study revealed that information quality is important for the success of general information systems (Huang et al. 1999). Such factors as accuracy of information and relevance of contents can be used to measure the usefulness of contents. Navigation usability refers to the ease of navigation of websites. Many prior studies have shown that usable navigation is one of the most important criteria for quality websites (Alexander and Tate 1999, Lichtenberg 1999, Chircu and Kauffman 2000, Wang 2000). The usability of site navigation can be evaluated by the ease of finding target locations and identifying the user's current location.

2.3. Aesthetic appeal dimension

From the aesthetic perspective, a building should be enjoyable enough to provide a pleasant feeling to the inhabitants. This dimension includes such factors as image, graphics, and environmental perception (Preiser et al. 1988).

Aesthetic appeal in websites refers to the user interface, because the user interface is the aspect of computer systems that users actually see and hear (Moran 1981). We hypothesize that aesthetic appeal is an important architectural dimension for websites because it enhances customers' pleasure as they browse and find relevant information (Benjamin 1995). A recent study found that pleasurable interfaces were important to the success of commercial websites (Liu and Arnett 2000).

The appeal of websites can be evaluated on two criteria: (1) system interface attractiveness and (2) communication interface attractiveness. System interface attractiveness refers to the pleasantness of the human-computer interface (Lohse and Spiller 1998). We believe that providing a pleasant system interface is important for aesthetic appeal. This is because users are likely to return to a website if it provides an interesting and entertaining interface experience (Rice 1997). It has been found that the appearance of the homepage makes a major contribution to user satisfaction with commercial websites (Ho and Wu 1999). The attractiveness of the system interface can be evaluated on the basis of visual interface features such as graphic design and images. To assess this visual aspect, this study evaluated how appropriately visual design components such as information text, images and colour are provided in websites (Parunak 1989). It also evaluated how diverse individual web pages look without seriously undermining the stylistic coherence of the entire website (Lynch and Horton 2002).

Communication interface attractiveness refers to the pleasantness of the interfaces between users. These are mostly implemented by communication systems (Daft and Lengel 1986, Wilson et al. 1997). We hypothesize that providing pleasant communication interfaces between users is important because communicating with other people in a community is a key feature of websites (Armstrong and Hagel 1996). For example, it has been found that providing a pleasant peer review feedback section is one of the best ways to increase user satisfaction (Kim 1999). It is also noted that most commercial websites allow buyers and sellers to interact through the electronic medium (Liu et al. 1997). The attractiveness of the communication interface can be measured by the variety of communication aids provided and communities accessed (Daft and Lengel 1986, Wilson et al. 1997).

In summary, we proposed three architectural dimensions based on a building metaphor. We also proposed the two most important evaluation criteria for each architectural dimension. The applicability of each architectural dimension to websites, supported by the building analogy where possible, identifies the most important evaluation criteria contributing to that architectural dimension. We further conducted a survey study, which will be explained in Section 5, to empirically verify the validity and relevance of the proposed criteria to websites.

3. Dimensions for the classification of websites

We propose two dimensions for classifying websites from the perspective of users' behaviour. A behavioural perspective on architecture focuses on why buildings exist and how they are used by the occupants (Zirmring and Reizenstein 1997). The key behavioural determinants of building architecture are the goals and activities of the occupants of the buildings (Barrett 1992, Zirmring and Reizenstein 1997). Buildings can be classified by their purpose (e.g. are they for selling products or teaching students?) and by how people act in them (e.g. are they running around or are they sitting quietly?). These same dimensions are applicable to website classification. As with buildings, each website can have different purposes and activities. Therefore, this study proposes two dimensions for classifying websites: users' goals and users' activity levels. The evaluation criteria become more or less important according to the goals and activity levels associated with a website.

3.1. Users' goals dimension

The first dimension we propose is the goals users have when they visit buildings or websites. Users' goals can be either instrumental, which we will call utilitarian goals, or experiential, which we will call hedonic goals (Mano and Oliver 1993, Dhar and Werternbroch 2000). For example, people visiting government buildings usually have utilitarian goals, whereas those visiting amusement parks usually have hedonic goals. In the study of Dhar and Werternbroch (2000), the utilitarian and hedonic dimensions were found to be valid classification dimensions of various products and services. Hoffman and Novak (1996) also classified the benefits that users could obtain from computer-mediated environments into utilitarian and hedonic benefits. Hence, based on these prior studies, this study proposes that websites can also be classified into utilitarian sites that offer instrumental benefits and hedonic sites that offer experiential benefits.

3.2. Users' activity levels dimension

Another important dimension for classifying buildings is how active the occupants are (Zirmring and Reizenstein 1997). For example, visitors in a movie theatre are usually passive and do not initiate activities voluntarily (unless there is an emergency such as fire), whereas visitors in a theme park are usually more active, participating in a number of activities voluntarily. Likewise, websites can be classified according to how actively visitors participate in website activities; they may be divided into active or passive depending on how actively users interact with them (Yamaguchi et al. 1997, McCrickard et al. 2003). On active websites, users interact with the system and communicate with other visitors more frequently than those on passive websites (Goldberg et al. 1992). For example, users of online-game sites participate in site activities more actively than those of personal-homepage sites. Therefore, this study proposes that websites can be classified into active or passive sites along the user activity level dimension.

4. Framework for classification and evaluation

We believe that the extent to which websites follow the architectural principles and are optimized on the evaluation criteria has an impact on the level of user satisfaction and, in turn, on the level of user loyalty. In other words, a website with a high architectural quality may produce a higher level of user satisfaction, which then leads to increased motivation for users to revisit the site.

User satisfaction is a subjective evaluation on a pleasant-unpleasant continuum of the consequences of using a website (Fournier and Mick 1999). User satisfaction is one of the most frequently used measures of system success because the performance of a system is usually related to users' satisfaction ratings (DeLone and McLean 1992).

User satisfaction is also clearly related to loyalty, which is the customers' intention to visit a website again based on their previous experiences and future expectations (Czepiel and Gilmore 1987, Berry 1995). It is especially important for e-commerce websites to ensure that customers visit their sites repeatedly because their value is determined mostly by the number of loyal users (Rose et al. 1999). If none of the users is willing to visit a site again, the site becomes worthless as a business despite its technical or managerial assets. A recent study on Internet shoppers provides some concrete evidence of the economic value of loyalty: the expenditure of loyal users is almost twice as much as that of new users (George 2002, Korgaonkar and Wolin 2002). This is because users conduct major transactions only with those sites proven reliable after several trial purchases for relatively small amounts. Therefore, we selected loyalty as the final dependent variable in our causal model. The overall model with the architectural metrics is shown in figure 1. The proposed hypotheses linking constructs are presented as lines in figure 1. The hypotheses are to be verified through empirical study.

H1: Evaluation criteria will have a positive effect on corresponding architectural quality dimensions:

H11: Internal reliability will have a positive effect on robustness.

H12: External security will have a positive effect on robustness.

H13: Content usefulness will have a positive effect on utility.

H14: Navigation usability will have a positive effect on utility.

H15: System interface attractiveness will have a positive effect on aesthetic appeal.

H16: Communication interface attractiveness will have a positive effect on aesthetic appeal.

H2: Architectural quality dimensions will have a positive effect on user satisfaction:

H21: Robustness will have a positive effect on user satisfaction.

H22: Utility will have a positive effect on user satisfaction.

H23: Aesthetic appeal will have a positive effect on user satisfaction.

H3: User satisfaction will have a positive influence on loyalty.

H4: Types of websites (defined by goals and activity levels) will have a moderating effect on the hypothesized relationships between evaluation criteria and architectural quality dimensions:

H41: Users' goals (Utilitarian/Hedonic) will moderate the hypothesized relationships between evaluation criteria and architectural quality dimensions.

H42: Users' activity levels (Active/Passive) will moderate the hypothesized relationships between evaluation criteria and architectural quality dimensions.

H5: Types of websites (defined by goals and activity levels) will have a moderating effect on the hypothesized relationships between architectural quality dimensions and user satisfaction:

H51: Users' goals (Utilitarian/Hedonic) will moderate the hypothesized relationships between architectural quality dimensions and user satisfaction.

H52: Users' activity levels (Active/Passive) will moderate the hypothesized relationships between architectural quality dimensions and user satisfaction.
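The hypothesized structure can be written down as a small directed graph, which makes it easy to see which criteria are expected to reach loyalty and through which constructs. The sketch below is only an illustrative encoding of H1 to H5 as listed above; the names PATHS, MODERATORS, chains_to and moderated_paths are hypothetical helpers and are not part of the study's analysis.

```python
# Hypothesized paths (label, cause, effect), transcribed from H1 to H3.
PATHS = [
    ("H11", "internal reliability", "robustness"),
    ("H12", "external security", "robustness"),
    ("H13", "content usefulness", "utility"),
    ("H14", "navigation usability", "utility"),
    ("H15", "system interface attractiveness", "aesthetic appeal"),
    ("H16", "communication interface attractiveness", "aesthetic appeal"),
    ("H21", "robustness", "user satisfaction"),
    ("H22", "utility", "user satisfaction"),
    ("H23", "aesthetic appeal", "user satisfaction"),
    ("H3", "user satisfaction", "loyalty"),
]

# Moderation hypotheses: (moderator, group of paths it is expected to moderate).
MODERATORS = {
    "H41": ("users' goals", "H1"),
    "H42": ("users' activity levels", "H1"),
    "H51": ("users' goals", "H2"),
    "H52": ("users' activity levels", "H2"),
}


def chains_to(target, paths=PATHS):
    """Return every hypothesized causal chain that ends at `target`."""
    causes = [cause for _, cause, effect in paths if effect == target]
    if not causes:
        return [[target]]  # an exogenous construct starts a chain
    chains = []
    for cause in causes:
        for chain in chains_to(cause, paths):
            chains.append(chain + [target])
    return chains


def moderated_paths(hypothesis, paths=PATHS):
    """Return the labels of the paths covered by a moderation hypothesis."""
    _, group = MODERATORS[hypothesis]
    return [label for label, _, _ in paths if label.startswith(group)]


if __name__ == "__main__":
    for chain in chains_to("loyalty"):
        print(" -> ".join(chain))
    print(moderated_paths("H51"))
```

Each of the six criteria appears in exactly one chain to loyalty, which reflects the assumption in the model that a criterion contributes to a single architectural dimension.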

5. Survey

In order to test empirically the impacts of architectural quality dimensions and evaluation criteria suggested in this study, we constructed a questionnaire to measure them in a large-scale online survey.


5.1. Questionnaire development

A total of 81 questionnaire items were initially compiled from the published literature (e.g., Selz and Schubert 1998, Huang et al. 1999) and interviews with industry experts in website evaluation. The purpose of the interviews was to increase the content validity of the questionnaire items. In order to increase the validity of the measures used, we conducted an online pre-test of the 81 items. A total of 2396 users responded to an online survey for the pre-test. In the pre-test, the participants were asked to rate a website that was randomly assigned to them on the 81 questionnaire items. Participants responded to each item on a 7-point Likert scale with 'strongly agree' and 'strongly disagree' at the two ends of the scale.

The websites used in the pre-test were drawn from a total of 516 websites selected by the authors with a view to maximizing the external validity of the measures. Exploratory factor analyses were conducted to screen out irrelevant or unnecessary items, and consequently a total of 38 items were selected for the final questionnaire used in the main survey. A more detailed description of the pre-test can be found in Hong (2002).

For the evaluation criteria, there were two items for internal reliability, three for external security, four for content usefulness, three for navigation usability, three for system interface attractiveness and three for communication interface attractiveness. The items used in the main survey are presented in table 1.

For the architectural quality dimensions, there were three items for robustness, three for utility, and three for aesthetic appeal. The items used in the main survey are shown in table 2.

For the website classification dimensions, there were five items for the utilitarian/hedonic dimension and four items for the active/passive dimension. The items used in the main survey are shown in table 3.

Finally, the questionnaire ended with one item for user satisfaction ('I am satisfied with the website in general') and one item for loyalty ('I will use the website again in the future').
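As a quick consistency check, the item counts reported in this section can be added up and compared with the 38 items of the final questionnaire. The snippet below is only a worked tally of the numbers stated above; the dictionary layout and the function names are illustrative choices.

```python
# Item counts per construct, as reported in Section 5.1.
ITEMS_PER_CONSTRUCT = {
    "evaluation criteria": {
        "internal reliability": 2,
        "external security": 3,
        "content usefulness": 4,
        "navigation usability": 3,
        "system interface attractiveness": 3,
        "communication interface attractiveness": 3,
    },
    "architectural quality dimensions": {
        "robustness": 3,
        "utility": 3,
        "aesthetic appeal": 3,
    },
    "classification dimensions": {
        "utilitarian/hedonic": 5,
        "active/passive": 4,
    },
    "outcomes": {
        "user satisfaction": 1,
        "loyalty": 1,
    },
}


def block_totals(items):
    """Sum the item counts within each block of constructs."""
    return {block: sum(counts.values()) for block, counts in items.items()}


def total_items(items):
    """Sum the item counts over the whole questionnaire."""
    return sum(block_totals(items).values())


if __name__ == "__main__":
    for block, n in block_totals(ITEMS_PER_CONSTRUCT).items():
        print(f"{block}: {n}")
    print("total:", total_items(ITEMS_PER_CONSTRUCT))
```

The blocks contribute 18, 9, 9 and 2 items respectively, which matches the 38 items selected after the pre-test.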

5.2. Data collection

The main survey was conducted in Korea as a part of a nationwide contest organized by the Korean government and a major newspaper company. Owners of websites were induced to participate in the contest by the offer of cash prizes and official certification as a top website (like a Webby award) for the winners. The newspaper company also provided a special editorial section for the winners in their leading newspaper, which was expected to provide significant advertising benefits for the winning sites. Three hundred and eleven websites were submitted for evaluation, of which 11 were rejected because of insufficient data in their application form and 300 were accepted.

Figure 1. An overall framework of evaluation criteria and classification schema.

In the application form, owners were asked to answer two questions and give the URL of their websites. The first question asked them to allocate their website to one of 16 listed categories. Several examples of each category were provided. The categories are presented in table 4, with the number of applicants in each. The 16 categories were derived from a report issued by a Korean government agency on a census of websites (KRNIC 2001). The personal homepages category attracted the most applications (58), followed by the game sites category (43), whereas the web-hosting category had the fewest applications (3).

The second question on the application form asked website owners what functions were provided in their websites. We provided respondents with a list of 12 generic functions that can frequently be found on websites (ordering, payment, trading, search, e-mail, chatting, bulletin-board with web master, bulletin-board with other users, multimedia, game, information and community). Answers to this question were used later to provide post-hoc confirmation for our classification scheme based on the items in table 3. The integrity of the answers from the website owners was maintained by a strong warning message saying that any incorrect information in the application form they submitted would cause the nomination to be rejected and the applicant to be banned from reentering the contest in the future.

In order to evaluate the websites objectively, volunteer web users were recruited through an advertisement in a major daily newspaper. Several advertisements were placed in the newspaper as well as on the company website. Respondents were compensated by a small gift worth 10 US dollars and with public recognition as independent evaluators for the best website contest. A total of 2381 web users participated as independent evaluators in this study. Demographic information on the independent evaluators is provided in table 5. Most of the evaluators were males (85.85%) in their twenties (64.17%). They were mostly heavy users of the Internet because they had used the Internet for more than 2 years (97.78%) and accessed it for more than 2 hours per day (85.17%).

Table 1. Questionnaire items for the six evaluation criteria.

| Criterion | Code | Questionnaire item | Reference |
|---|---|---|---|
| Internal reliability | IS1 | The web site quickly responds to my requests in a consistent manner | *QUIS |
| | IS2 | The web site operates stably in the process of downloading and uploading information | *QUIS |
| External security | ES1 | The web site has strong protection against any unauthorized attempts from outside | By Author |
| | ES2 | The web site exercises enough precaution to provide a safe place on the web | By Author |
| | ES3 | The web site has a strict policy to protect private information of its users | By Author |
| Content usefulness | CU1 | The content in the web site is objective | **IQ |
| | CU2 | The content in the web site is accurate | **IQ |
| | CU3 | The content in the website is frequently updated | **IQ |
| | CU4 | The web site contains many valuable contents | **IQ |
| Navigation usability | NU1 | It is easy to identify the current location in the web site | ***PUEU |
| | NU2 | It is easy to understand the overall navigation structure of the web site | ***PUEU |
| | NU3 | It is easy to navigate the web site toward the target point | ***PUEU |
| System interface attractiveness | SI1 | The web site uses appropriate colors for its screens | By Author |
| | SI2 | The web site has attractive screen layouts | By Author |
| | SI3 | The web site has a well-diversified screen design | By Author |
| Communication interface attractiveness | CI1 | I can communicate with other users pleasantly within the web site | By Author |
| | CI2 | The web site provides a comfortable cyber place for meeting other users | By Author |
| | CI3 | The web site promotes an atmosphere for intimate meeting with other users | By Author |

*QUIS (Chin 1988), **IQ (Huang and Yang 1999), ***PUEU (Davis 1989)

The main survey was conducted through an online survey service. The URLs of the submitted websites were randomly allocated to independent evaluators incrementally, keeping the numbers of evaluators balanced across all websites. Before the evaluators filled in any questions about the allocated website, they were first asked to sign a form declaring that they did not have any relationship with the websites allocated to them. If they identified a relationship with the allocated site, a new site was allocated instead. Before filling in the questionnaire, evaluators were also asked to perform a simple task in order to familiarize them with the assigned website. For example, they were asked to search for and download certain information from information-providing websites or upload their opinion to a bulletin board on communication sites. In total, 10 051 evaluations were made by the 2381 evaluators of the 300 websites. On average, each evaluator was allocated around five websites, and each website was evaluated by around thirty evaluators.
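The allocation procedure described above (random assignment, balanced numbers of evaluators per website, and no evaluator rating a site they are related to) can be sketched as a greedy "least-evaluated first" rule with random tie-breaking. This is a minimal illustration under stated assumptions, not the procedure actually implemented by the survey service: the function name allocate and its parameters are hypothetical, and declared relationships are assumed to be known before allocation, whereas in the study a related site was replaced after the evaluator reported it.

```python
import random


def allocate(evaluators, websites, per_evaluator, related=None, seed=None):
    """Randomly assign websites to evaluators while keeping loads balanced.

    related: optional dict mapping an evaluator to the set of websites the
    evaluator has declared a relationship with; these are never assigned.
    Returns (assignment, load), where load counts evaluators per website.
    """
    rng = random.Random(seed)
    related = related or {}
    load = {site: 0 for site in websites}
    assignment = {}
    for evaluator in evaluators:
        blocked = related.get(evaluator, set())
        candidates = [s for s in websites if s not in blocked]
        rng.shuffle(candidates)                # random order among equal loads
        candidates.sort(key=load.__getitem__)  # stable sort: least evaluated first
        chosen = candidates[:per_evaluator]
        for site in chosen:
            load[site] += 1
        assignment[evaluator] = chosen
    return assignment, load


if __name__ == "__main__":
    # Toy sizes for illustration only (the study had 2381 evaluators and 300 sites).
    assignment, load = allocate(
        evaluators=list(range(60)),
        websites=[f"site{i}" for i in range(10)],
        per_evaluator=5,
        related={0: {"site0", "site1"}},
        seed=1,
    )
    print(assignment[0])
    print(min(load.values()), max(load.values()))
```

Because every evaluator receives the currently least-evaluated sites, the numbers of evaluators per site stay within one of each other when no sites are blocked. Many blocked sites can widen that gap, so a real implementation would need an explicit rebalancing step.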

6. Results

The following analyses were performed to validate the proposed website design principles and classification schema. We first conducted a Multi-Dimensional Scaling (MDS) analysis to identify the main dimensions of website classification by plotting the website categories on a perceptual map. We then tested the validity and reliability of the measures for the architectural quality dimensions and evaluation criteria. Next, we conducted structural equation modelling to analyse the impacts of evaluation criteria and architectural quality dimensions on user satisfaction and loyalty. Finally, we conducted multi-group chi-square tests to compare active with passive websites, and utilitarian with hedonic websites.
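The fit of an MDS solution is commonly summarized by a Stress value and an RSQ value, and Section 6.1 applies cutoffs to both (RSQ above 0.6 and Stress below 0.5). The sketch below shows one common way to compute these quantities for a given configuration. It rests on assumptions that may differ from the software used in the study: Stress is computed as Kruskal's Stress-1 with the raw dissimilarities used as disparities (no monotone regression step), and RSQ as the squared Pearson correlation between dissimilarities and configuration distances. All function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances between all pairs of points, in combinations() order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def stress1(dissimilarities, distances):
    """Kruskal's Stress-1, using raw dissimilarities as disparities (assumption)."""
    num = sum((d - delta) ** 2 for delta, d in zip(dissimilarities, distances))
    den = sum(d ** 2 for d in distances)
    return math.sqrt(num / den)


def rsq(dissimilarities, distances):
    """Squared Pearson correlation between dissimilarities and distances."""
    n = len(distances)
    mx = sum(dissimilarities) / n
    my = sum(distances) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(dissimilarities, distances))
    sxx = sum((x - mx) ** 2 for x in dissimilarities)
    syy = sum((y - my) ** 2 for y in distances)
    return sxy ** 2 / (sxx * syy)


def adequate(stress, r2, max_stress=0.5, min_rsq=0.6):
    """Apply the cutoffs quoted in Section 6.1 (RSQ > 0.6 and Stress < 0.5)."""
    return r2 > min_rsq and stress < max_stress


if __name__ == "__main__":
    # Toy example: three categories placed on a line (made-up numbers).
    config = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    observed = [1.1, 1.9, 1.0]  # dissimilarities for pairs (0,1), (0,2), (1,2)
    dist = pairwise_distances(config)
    print(stress1(observed, dist), rsq(observed, dist))
    print(adequate(stress1(observed, dist), rsq(observed, dist)))
```

Lower Stress and higher RSQ both indicate that distances on the perceptual map reproduce the observed dissimilarities more closely.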

6.1. MDS analysis for website classification

To verify the two classification dimensions proposed in this study, users' goals and users' activity levels, we conducted an MDS (Multi-Dimensional Scaling) analysis with the nine items presented in table 3. The MDS was conducted to identify two-dimensional clusters according to respondents' perceptions of their goals and activity levels. The MDS also provided metrics to evaluate the reliability of the classification, in the form of RSQ and Stress values. Any clustering with an RSQ value higher than 0.6 and Stress value less than 0.5 indicates an adequate level of reliability, which would provide evidence for the behavioural dimensions as a suitable classification scheme. The five items measuring utilitarian vs. hedonic goals of users were used to classify the 16 categories of websites into two main groupings as shown in figure 2. Six website categories (e.g. VA11, Game sites) were classified into the hedonic grouping, whereas the remaining 10 categories (e.g. VA1, General Shopping Mall) were classified into the utilitarian grouping. The stress value of the classification was 0.05 and the RSQ value was 0.99, which indicated that the users' goals dimension could be used as a reliable classification dimension.

Similarly, the four items measuring the active or passive activity levels of users were used to classify the 16 website categories into two main groupings as shown in figure 3. The stress value of this classification was 0.02 and the RSQ value was 0.99. These values indicate that users' activity levels was also an appropriate dimension for website classification. Seven website groups (e.g. VA15, personal homepage) were classified into the passive category, whereas nine website groups (e.g. VA3, auction) were classified into the active category.

In order to complement the results of the MDS analysis, we conducted a post-hoc analysis to compare the proportion of websites providing each of the 12 generic functions for each of the classification dimensions. This analysis was based on the data provided by website owners when they submitted their website for the contest as explained earlier in section 5.2. The comparison was also conducted to identify generic functions that were more commonly provided in certain

Table 2. Questionnaire items for the three architectural quality dimensions.

Architectural dimensions | Code | Questionnaire items | Reference
Robustness | FR1 | The web site is stable in general | By Author
Robustness | FR2 | The web site is dependable in general | By Author
Robustness | FR3 | I can use the website without worry | By Author
Utility | CV1 | It is easy to use the web site in general | ***PUEU
Utility | CV2 | I can get useful information from the web site | **IQ
Utility | CV3 | I can navigate the web site conveniently | *QUIS
Aesthetic appeal | DL1 | It is interesting to use the website | By Author
Aesthetic appeal | DL2 | The process of using the website is delightful | By Author
Aesthetic appeal | DL3 | I like the look and feel of the web site | By Author

*QUIS (Chin 1988), **IQ (Huang and Yang 1999), ***PUEU (Davis 1989)


types of websites. The results of the post-hoc comparison analyses are provided in table 6(1) (utilitarian vs. hedonic) and table 6(2) (active vs. passive).

It was found that websites classified into the utilitarian group provided more functions that allowed users to conduct goal-oriented activities, such as ordering (t-value: -7.21), payment (t-value: -5.11), trading (t-value: -2.92) and searching (t-value: -4.65). On the other hand, websites classified into the hedonic group provided more functions related to chatting (t-value: 2.23). It was also found that websites classified into the active group provided more functions that allowed users to engage in interactions with the sites, such as community (t-value: 2.650) and ordering (t-value: 5.416). The results of the post-hoc comparison therefore complement the results of the MDS classification.
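The reliability check used for the MDS classification can be illustrated with a minimal Python sketch. It assumes the common definitions of Kruskal's stress-1 and of RSQ as the squared correlation between the input dissimilarities and the distances on the perceptual map; the paper does not state which formulas or software produced its values. The category profiles below are hypothetical numbers invented for illustration, not data from the study, and the function names are our own.

```python
import math
from itertools import combinations

# Hypothetical mean item ratings for four website categories (NOT study data).
profiles = {
    "VA1": (4.2, 2.1, 3.0),
    "VA3": (4.0, 2.4, 3.2),
    "VA11": (1.8, 4.5, 3.1),
    "VA15": (2.0, 4.1, 2.8),
}

def pairwise_distances(points):
    """Euclidean distance for every pair of points, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def stress1(dissimilarities, distances):
    """Kruskal's stress-1: sqrt(sum((d - delta)^2) / sum(d^2))."""
    num = sum((d - delta) ** 2 for delta, d in zip(dissimilarities, distances))
    den = sum(d ** 2 for d in distances)
    return math.sqrt(num / den)

def rsq(dissimilarities, distances):
    """Squared Pearson correlation between dissimilarities and map distances."""
    n = len(distances)
    mean_x = sum(dissimilarities) / n
    mean_y = sum(distances) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dissimilarities, distances))
    sxx = sum((x - mean_x) ** 2 for x in dissimilarities)
    syy = sum((y - mean_y) ** 2 for y in distances)
    return sxy ** 2 / (sxx * syy)

def is_adequate(stress, r2, max_stress=0.5, min_rsq=0.6):
    """Apply the cut-offs quoted in the text (RSQ > 0.6, stress < 0.5)."""
    return stress < max_stress and r2 > min_rsq

# Dissimilarities come from the full profiles; the "map" is a crude
# two-dimensional configuration that keeps only the first two coordinates.
full = list(profiles.values())
flat = [p[:2] for p in full]
delta = pairwise_distances(full)
dist = pairwise_distances(flat)

fit_stress = stress1(delta, dist)
fit_rsq = rsq(delta, dist)
print(f"stress={fit_stress:.3f}, RSQ={fit_rsq:.3f}, "
      f"adequate={is_adequate(fit_stress, fit_rsq)}")
```

Under the cut-offs quoted above, the reported pairs (stress 0.05, RSQ 0.99) and (stress 0.02, RSQ 0.99) both pass `is_adequate`.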

6.2. Validity and reliability of the measures

We conducted a confirmatory factor analysis (CFA) to test the reliability and validity of the six evaluation criteria proposed in this study. The CFA was conducted with data pooled across all categories of websites. Scores on the 18 questions presented in table 1 were used as input. The results of the CFA are presented in table 7 below.

As can be seen in table 7, all 18 items converged neatly onto their corresponding constructs with relatively high factor loadings. For example, the two items developed to measure the internal reliability of websites (IS1 and

Table 3. Questionnaire items for the two behavioural dimensions for classifying websites.

Dimensions | Code | Questionnaire items | Reference
Utilitarian vs. Hedonic | UH1 | The web site helps me to finish my job effectively | By Author
Utilitarian vs. Hedonic | UH2 | The web site provides many interesting materials | By Author
Utilitarian vs. Hedonic | UH3 | The web site is mostly used in my leisure time | By Author
Utilitarian vs. Hedonic | UH4 | The web site mostly provides utilitarian information | By Author
Utilitarian vs. Hedonic | UH5 | The web site focuses on affective satisfaction | By Author
Active vs. Passive | AP1 | The web site is suitable for being used actively in conjunction with other users | By Author
Active vs. Passive | AP2 | The website requires me to input much information and to manipulate many features | By Author
Active vs. Passive | AP3 | The website consists of many interactive multimedia features | By Author
Active vs. Passive | AP4 | The website can be used to actively interact with other users | By Author

Table 4. Categories of websites and number of applicants for each category.

No | Code | Category | Number of sites
1 | VA1 | General shopping mall | 4
2 | VA2 | Specialty shopping mall | 38
3 | VA3 | Auction | 5
4 | VA4 | Reservation service | 8
5 | VA7 | Health and medical | 6
6 | VA8 | Computer and Internet | 16
7 | VA9 | Economy and industry | 13
8 | VA10 | Women and children | 25
9 | VA11 | Games | 43
10 | VA12 | Portals | 10
11 | VA13 | Web hosting | 3
12 | VA14 | Community | 25
13 | VA15 | Personal homepages | 58
14 | VA16 | Organization homepage | 9
15 | VA17 | Online education | 26
16 | VA18 | Web casting | 11

Note: VA5 and VA6 were excluded from further analysis because of a lack of sites submitted.

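Table 5. Demographic data on independent evaluators.

Respondents | Category | Number | Ratio (%)
Gender | Male | 2044 | 85.85
Gender | Female | 337 | 14.15
Age in years | 10–20 | 34 | 1.43
Age in years | 20–30 | 1528 | 64.17
Age in years | 30–40 | 791 | 33.22
Age in years | 40 or over | 28 | 1.18
Internet usage (years) | 0.5–1 | 7 | 0.29
Internet usage (years) | 1–1.5 | 13 | 0.54
Internet usage (years) | 1.5–2 | 33 | 1.39
Internet usage (years) | 2 or more | 2328 | 97.78
Internet usage (hours a day) | Under 1 | 33 | 1.39
Internet usage (hours a day) | 1–2 | 320 | 13.44
Internet usage (hours a day) | 2–3 | 517 | 21.71
Internet usage (hours a day) | 3–5 | 584 | 24.53
Internet usage (hours a day) | 5–10 | 604 | 25.37
Internet usage (hours a day) | 10 or more | 323 | 13.57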


Figure 2. The perceptual map for Utilitarian-Hedonic groupings.

Figure 3. The perceptual map for Active-Passive groupings.


IS2) were found to converge onto one factor (Fac1) with factor loadings of 0.91 and 0.76, respectively. In terms of reliability, all six criteria showed relatively high construct reliability coefficients. The lowest was for system interface (0.89), but this was still higher than the threshold value of 0.7. The average variance extracted (AVE) by the six criteria also met the accepted cut-off point of 0.75. Finally, several goodness-of-fit measures indicated that the six constructs measured by the 18 items have sufficient reliability to warrant further causal analysis.

A similar analysis was conducted to test the three architectural quality dimensions measured by the nine items shown in table 2. The results of this CFA analysis are presented in table 8 below.

The results were similar to those for the evaluation criteria. The nine items converged onto their respective three constructs as shown in table 8. The reliability coefficients for the three design dimensions were again higher than the threshold value of 0.7, and the average variance extracted by the three dimensions was also higher than the cut-off point of 0.75. Finally, all the

Table 6.(1). The proportion of web sites offering each function, classified by users' goals.

Function | Utilitarian Mean | Utilitarian SD | Hedonic Mean | Hedonic SD | t-value | p-value
Ordering | 0.500 | 0.502 | 0.129 | 0.336 | |
Payment | 0.366 | 0.483 | 0.117 | 0.322 | -5.11 | *0.000
Trading | 0.112 | 0.316 | 0.025 | 0.155 | -2.92 | *0.000
Search | 0.575 | 0.496 | 0.313 | 0.465 | -4.65 | *0.004
E-mail | 0.306 | 0.463 | 0.344 | 0.476 | 0.69 | 0.492
Chatting | 0.164 | 0.372 | 0.270 | 0.445 | 2.23 | *0.026
Bulletin board with web master | 0.687 | 0.466 | 0.583 | 0.495 | -1.86 | 0.064
Bulletin board with other users | 0.619 | 0.487 | 0.724 | 0.448 | 1.91 | 0.058
Multimedia | 0.127 | 0.334 | 0.135 | 0.343 | 0.21 | 0.837
Game | 0.328 | 0.471 | 0.325 | 0.470 | -0.06 | 0.953
Information | 0.052 | 0.223 | 0.037 | 0.189 | -0.63 | 0.526
Community | 0.187 | 0.391 | 0.178 | 0.384 | -0.19 | 0.848

(*p < 0.05)

Table 6.(2). The proportion of web sites offering each function, classified by users' activity levels.

Function | Active Mean | Active SD | Passive Mean | Passive SD | t-value | p-value
Ordering | 0.510 | 0.503 | 0.194 | 0.396 | |
Payment | 0.354 | 0.481 | 0.169 | 0.376 | 3.317 | *0.001
Trading | 0.083 | 0.278 | 0.055 | 0.228 | 0.941 | 0.348
Search | 0.385 | 0.489 | 0.453 | 0.499 | -1.102 | 0.272
E-mail | 0.313 | 0.466 | 0.333 | 0.473 | -0.357 | 0.721
Chatting | 0.156 | 0.365 | 0.254 | 0.436 | -1.895 | 0.059
Bulletin board with web master | 0.677 | 0.470 | 0.607 | 0.490 | 1.186 | 0.237
Bulletin board with other users | 0.594 | 0.494 | 0.716 | 0.452 | -2.057 | *0.041
Multimedia | 0.104 | 0.307 | 0.144 | 0.352 | -0.956 | 0.340
Game | 0.344 | 0.477 | 0.318 | 0.467 | 0.434 | 0.664
Information | 0.063 | 0.243 | 0.035 | 0.184 | 0.988 | 0.325
Community | 0.438 | 0.499 | 0.279 | 0.449 | 2.650 | *0.009

(*p < 0.05)


goodness-of-fit statistics were well above the recommended level.

In summary, the 18 items shown in table 1 and the nine items shown in table 2 were found to be reliable and valid measures for the six evaluation criteria and three architectural quality dimensions respectively. We therefore proceeded to analyse the causal relationships among the evaluation criteria, architectural quality dimensions, user satisfaction and loyalty using these measures.
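The reliability statistics can be cross-checked from the factor loadings alone. The sketch below applies the widely used formulas for composite (construct) reliability and average variance extracted to the loadings printed in table 7, assuming those loadings are standardized; the paper does not state which formulas it used. With these formulas the raw values are lower than the tabulated ones, while their square roots agree with the "Construct reliability" and "Average variance extracted" rows of table 7 to two decimals. This is an observation from the arithmetic only, not a statement about the authors' procedure.

```python
import math

# Factor loadings as printed in table 7 (six evaluation criteria).
LOADINGS = {
    "Internal reliability": [0.91, 0.76],
    "External security": [0.86, 0.90, 0.88],
    "Content usefulness": [0.76, 0.78, 0.80, 0.80],
    "Navigation usability": [0.81, 0.84, 0.69],
    "System interface attractiveness": [0.71, 0.77, 0.78],
    "Communication interface attractiveness": [0.87, 0.87, 0.82],
}

# (construct reliability, average variance extracted) as printed in table 7.
REPORTED = {
    "Internal reliability": (0.91, 0.84),
    "External security": (0.95, 0.88),
    "Content usefulness": (0.93, 0.79),
    "Navigation usability": (0.91, 0.78),
    "System interface attractiveness": (0.89, 0.75),
    "Communication interface attractiveness": (0.94, 0.85),
}

def composite_reliability(lams):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(lams) ** 2
    error = sum(1 - lam ** 2 for lam in lams)
    return total / (total + error)

def average_variance_extracted(lams):
    """Mean of the squared standardized loadings."""
    return sum(lam ** 2 for lam in lams) / len(lams)

summary = {}
for construct, lams in LOADINGS.items():
    cr = composite_reliability(lams)
    ave = average_variance_extracted(lams)
    summary[construct] = (cr, ave)
    # Compare the square roots with the printed table values.
    matches = (round(math.sqrt(cr), 2), round(math.sqrt(ave), 2)) == REPORTED[construct]
    print(f"{construct}: CR={cr:.3f}, AVE={ave:.3f}, "
          f"sqrt matches table 7: {matches}")
```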

6.3. Impacts of evaluation criteria on user satisfaction and loyalty

This study verified the structural equation model for each of the four website groupings (utilitarian, hedonic, active, and passive) to investigate the relationship of the evaluation criteria to users' satisfaction and loyalty to a website. The covariance matrix of the factor scores of the six evaluation criteria and three design dimensions was used as the input for the analysis.

The fit of the model was assessed using several indicators, including the adjusted goodness-of-fit index and root mean square residuals. Table 9 is a summary of the goodness-of-fit indices, which shows that each model for each of the four categories of websites had goodness-of-fit indexes higher than the threshold value. Since the goodness of fit of the models was above the threshold, the set of paths hypothesized by the model was tested using maximum likelihood estimation.

Figures 4 to 7 present four LISREL models for the utilitarian, hedonic, active, and passive websites respectively. Each figure shows the causal relations among the six evaluation criteria, three design dimensions, user satisfaction, and loyalty. The coefficients for the paths in the figures represent the strength of the relationships between the proposed constructs. Paths drawn with a dashed line and with the coefficient value in italics indicate statistically non-significant relationships.

Figure 4 presents a structural model for the utilitarian websites. Most of the hypothesized relationships were observed except for two. The relationship between communication interface attractiveness and aesthetic appeal, and that between robustness and user satisfaction, were found not to be statistically significant. Internal reliability was found to have more influence than external security on robustness. Navigation usability was found to have more influence than content usefulness on utility. System interface attractiveness was found to have more influence than communication interface attractiveness on aesthetic appeal. The impact of utility on user satisfaction was almost the same as that of aesthetic appeal. Finally, satisfaction was found to have a significant impact on user loyalty.

Figure 5 presents a similar structural model for the hedonic websites. Most of the hypothesized relationships were confirmed except for two. The impact of

Table 7. Results of the confirmatory factor analysis for the six evaluation criteria.

Construct | Factor | Items (factor loadings) | Cronbach alpha | Construct reliability | Average variance extracted
Internal reliability | Fac1 | IS1 (0.91), IS2 (0.76) | 0.82 | 0.91 | 0.84
External security | Fac2 | ES1 (0.86), ES2 (0.90), ES3 (0.88) | 0.91 | 0.95 | 0.88
Content usefulness | Fac3 | CU1 (0.76), CU2 (0.78), CU3 (0.80), CU4 (0.80) | 0.86 | 0.93 | 0.79
Navigation usability | Fac4 | NU1 (0.81), NU2 (0.84), NU3 (0.69) | 0.83 | 0.91 | 0.78
System interface attractiveness | Fac5 | SI1 (0.71), SI2 (0.77), SI3 (0.78) | 0.79 | 0.89 | 0.75
Communication interface attractiveness | Fac6 | CI1 (0.87), CI2 (0.87), CI3 (0.82) | 0.88 | 0.94 | 0.85

DF=120, RMSEA=0.078, RMR=0.036, GFI=0.99, AGFI=0.99.


communication interface attractiveness on aesthetic appeal was barely significant but with a negative sign, whereas the relationship between robustness and user satisfaction was statistically non-significant. External security was found to have slightly more influence than internal reliability on robustness. Navigation usability was found to have more influence than content usefulness on utility. System interface attractiveness had more influence than communication interface attractiveness on aesthetic appeal. Finally, aesthetic appeal was found to have more influence than utility on user satisfaction.

Figure 6 presents a structural model for the active websites. The figure shows that all the proposed hypotheses were confirmed for active websites. External security was found to have more influence than internal reliability on the robustness of active websites. Navigation usability was found to have more influence than content usefulness on utility. System interface attractiveness was found to have more influence than communication interface attractiveness on aesthetic appeal. Finally, robustness was found to have the most influence on user satisfaction of the three architectural quality dimensions.

Finally, figure 7 presents a structural model for the passive websites. Most of the hypothesized relationships were found to hold except for three. The relationship between navigation usability and utility was found to be barely significant with a negative direction, the one between communication interface attractiveness and aesthetic appeal was also barely significant with a negative direction, and the one between robustness and user satisfaction was statistically non-significant. External security was found to have more influence than internal reliability on the robustness of passive websites. Content usefulness was more important than navigation usability for utility. System interface attractiveness was more important than communication interface attractiveness for aesthetic appeal. Finally, aesthetic appeal was found to be the most important design dimension for user satisfaction with passive websites.
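Path coefficients of this kind can be combined to approximate how strongly each evaluation criterion feeds through to user satisfaction. The sketch below multiplies each criterion-to-dimension coefficient by the corresponding dimension-to-satisfaction coefficient, which is the usual product rule for indirect effects in path models. The numbers are the path values printed in table 11 for the active and passive groups, which may differ from those drawn in figures 6 and 7. Treating them as directly comparable standardized coefficients is an assumption, and the names used are our own.

```python
# Path coefficients as printed in table 11. The passive navigation path is
# printed as 0.05 with a negative t value, so its sign is uncertain; the
# printed magnitude is used here.
CRITERION_PATHS = {
    "active": {
        "internal reliability": ("robustness", 0.45),
        "external security": ("robustness", 0.68),
        "content usefulness": ("utility", 0.13),
        "navigation usability": ("utility", 0.88),
        "system interface": ("aesthetic appeal", 0.84),
        "communication interface": ("aesthetic appeal", 0.07),
    },
    "passive": {
        "internal reliability": ("robustness", 0.31),
        "external security": ("robustness", 0.78),
        "content usefulness": ("utility", 0.98),
        "navigation usability": ("utility", 0.05),
        "system interface": ("aesthetic appeal", 0.91),
        "communication interface": ("aesthetic appeal", 0.06),
    },
}

SATISFACTION_PATHS = {
    "active": {"robustness": 0.06, "utility": 0.40, "aesthetic appeal": 0.49},
    "passive": {"robustness": 0.01, "utility": 0.12, "aesthetic appeal": 0.73},
}

def indirect_effects(group):
    """Indirect effect of each criterion on satisfaction via its dimension."""
    return {
        criterion: coef * SATISFACTION_PATHS[group][dimension]
        for criterion, (dimension, coef) in CRITERION_PATHS[group].items()
    }

for group in ("active", "passive"):
    ranked = sorted(indirect_effects(group).items(),
                    key=lambda item: item[1], reverse=True)
    print(group, [(name, round(value, 3)) for name, value in ranked])
```

Ranking the products gives a rough ordering of where improvement effort would pay off for each group, under the stated assumptions.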

6.4. Multi-group analysis comparing different websites

Two multi-group analyses were conducted to compare different categories of websites: one between the utilitarian and hedonic groups, the other between the active and passive groups.

The results of a multi-group analysis using nested chi-square tests comparing the utilitarian and hedonic groups are presented in table 10. Of the two evaluation criteria for the robustness dimension, internal reliability was found to affect robustness more significantly in the hedonic group than in the utilitarian group (Δχ² = 32.21, p < 0.01), but the impact of external security was more significant in the utilitarian group than in the hedonic group (Δχ² = 3.87, Δdf = 1, p < 0.05). Of the two evaluation criteria for utility, content usefulness was found to affect utility more significantly in the utilitarian group than in the hedonic group (Δχ² = 8.99, p < 0.01), but the impact of navigation usability was more significant in the hedonic group than in the utilitarian group (Δχ² = 9.91, p < 0.01). There was no statistically significant difference between the two groups in terms of the two criteria of aesthetic appeal. Finally, as regards the impacts of the three architectural quality dimensions on user satisfaction, the impacts of robustness and utility on satisfaction were more significant in the utilitarian group than in the hedonic group (Δχ² = 4.42, p < 0.01 for robustness; Δχ² = 69.03, p < 0.01 for utility), whereas the impact of aesthetic appeal was more significant in the hedonic group than in the utilitarian group (Δχ² = 19.02, p < 0.01).

The results of a similar multi-group analysis comparing the active and passive groups are presented in table 11. There was no statistically significant

Table 8. Results of the confirmatory factor analysis for the three architectural quality dimensions.

Construct | Factor | Items (factor loadings) | Cronbach alpha | Construct reliability | Average variance extracted
Robustness | Fac1 | FR1 (0.81), FR2 (0.85), FR3 (0.77) | 0.85 | 0.92 | 0.81
Utility | Fac2 | CV1 (0.80), CV2 (0.85), CV3 (0.75) | 0.84 | 0.92 | 0.81
Aesthetic appeal | Fac3 | DL1 (0.85), DL2 (0.86), DL3 (0.90) | 0.90 | 0.95 | 0.87

DF=24, RMSEA=0.076, RMR=0.024, GFI=0.99, AGFI=0.99.

Table 9. The goodness-of-fit indices of the structural equation model for the four groups.

Group | Df | GFI | AGFI | NFI | RMSEA
Utilitarian group | 347 | 0.92 | 0.88 | 0.94 | 0.094
Hedonic group | 347 | 0.92 | 0.87 | 0.94 | 0.096
Active group | 347 | 0.92 | 0.87 | 0.93 | 0.095
Passive group | 347 | 0.93 | 0.88 | 0.96 | 0.092


difference between the groups in terms of the two evaluation criteria of robustness. In terms of the two evaluation criteria for utility, content usefulness was found to affect utility more significantly in the passive group than in the active group (Δχ² = 85.46, p < 0.01). However, the impact of navigation usability was more significant in the active group than in the passive group (Δχ² = 78.03, p < 0.01). For the two evaluation criteria for aesthetic appeal, the only significant finding was that system interface attractiveness had more effect in the passive group than in the active group (Δχ² = 13.57, p < 0.01). Finally, as regards the impacts of the three architectural quality dimensions on user satisfaction, the impacts of robustness and utility on user satisfaction were more significant in the active group than in the passive group (Δχ² = 639.53, p < 0.01 for robustness; Δχ² = 132.01, p < 0.01 for utility), whereas the impact of aesthetic appeal was more significant in the passive group than in the active group (Δχ² = 609.99, p < 0.01).
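The nested chi-square comparison can be reproduced from the fit statistics alone. The sketch below takes the unconstrained and constrained chi-square values printed in table 11, forms the difference for a single constrained path (one degree of freedom), and converts it to a p-value. It relies on the identity that a chi-square variable with one degree of freedom is a squared standard normal, so the upper-tail probability is erfc(sqrt(x / 2)). The function names are our own, and the paper does not describe how its p-values were computed.

```python
import math

def chi_square_difference(constrained, unconstrained):
    """Delta chi-square between a constrained model and the baseline model."""
    return constrained - unconstrained

def p_value_df1(delta_chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), since X is a squared standard normal.
    """
    return math.erfc(math.sqrt(max(delta_chi2, 0.0) / 2.0))

# Chi-square values as printed in table 11 (active vs. passive groups).
UNCONSTRAINED = 27072.16
CONSTRAINED = {
    "internal reliability -> robustness": 27072.17,
    "content usefulness -> utility": 27157.62,
    "navigation usability -> utility": 27150.19,
    "system interface -> aesthetic appeal": 27085.73,
}

results = {}
for path, chi2 in CONSTRAINED.items():
    delta = chi_square_difference(chi2, UNCONSTRAINED)
    p_value = p_value_df1(delta)
    results[path] = (round(delta, 2), p_value)
    print(f"{path}: delta chi2 = {delta:.2f}, p = {p_value:.4g}")
```

For one degree of freedom the critical values are about 3.84 for p < 0.05 and 6.63 for p < 0.01, so a difference is flagged as significant only when constraining the path to be equal across groups worsens fit by more than those amounts.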

7. Conclusions and discussion

This study proposed three design dimensions and six evaluation criteria for websites, along with a two-dimensional classification scheme. We proposed a conceptual model based on long-established architectural theory. This theoretical model was verified empirically by multi-dimensional scaling and structural equation modeling methods. We classified websites into utilitarian and hedonic groups according to users' goals, and into active or passive groups according to users' activity levels. These classification schemes were supplemented by calculating the ratio of generic functions provided by the websites. We also empirically verified the validity and reliability of the architectural dimensions and evaluation criteria using confirmatory factor analysis. The results of the structural equation modeling indicate that the hypothesized relationships hold, with a few exceptions. Internal reliability and external security have impacts on the robustness of websites; content usefulness and navigation usability have impacts on the utility of websites; system interface attractiveness and communication interface attractiveness have impacts on the aesthetic appeal of websites; robustness, utility and aesthetic appeal have impacts on user satisfaction; and finally user satisfaction has an impact on user loyalty to websites. More interestingly, the impacts of the six evaluation criteria and three architectural quality dimensions were found to change according to the types of websites, as defined by users' activity levels and goals.

Figure 4. A path diagram for the utilitarian group.

This study has several limitations. Firstly, it depends

mainly on survey data. The absence of an analysis of objective features of websites (e.g. background colour or number of levels) makes it hard to explain which concrete design factors matched the subjective measures. For example, it was not clear whether the number of distinctive items or the number of levels in the site hierarchy affected navigation usability.

Secondly, the websites evaluated in this study were limited to those voluntarily submitted for a best website award competition. Even though we recruited a large number of websites, we cannot be sure that our sample represented the full range of websites. For example, all the websites evaluated were domestic websites for Korean users. Moreover, there was neither a branding website nor a non-profit organization website in our samples. Our results may therefore be biased towards websites whose owners were eager to gain recognition from the Korean public.

Thirdly, our study has the generic limitations of the online survey method, which include self-selection bias. Moreover, we could not collect detailed demographic information such as income or education levels because of the privacy policy of the newspaper company. Lack of demographic information might undermine the validity of the study results, because a recent study indicated that the relative importance of evaluation criteria might vary with income and education level (Lightner 2003). We also could not collect performance data on the survey procedure, for example how long it took evaluators to answer the 38 questions.

Fourthly, the 16 categories and 12 generic functions of websites were used in this study to supplement a classification scheme based on architectural dimensions.

Figure 5. A path diagram for the hedonic group.


Even though these groupings and generic functions were constructed on the basis of a survey of many websites and interviews with industry experts, they are by no means comprehensive or concrete. For example, personal homepages, which were classified as hedonic and passive in our results, can be quite diverse, and some may be very utilitarian and active (e.g. a homepage about simulating stock trading games).

Fifthly, the two classification dimensions need more refinement. We assumed that the two dimensions were not inter-related. Consequently, we could only compare either active and passive groups or hedonic and utilitarian groups, but could not explore the interaction between the two dimensions.

Finally, we need to elaborate questions for evaluation criteria and user satisfaction. We only had one question for measuring user satisfaction. However, a recent study found that user satisfaction for e-commerce customers consists of multiple dimensions (McKinney et al. 2002). We also need to include more dependent variables in our research model. For example, trust has been identified as an important issue in web site design and e-commerce (Jarvenpaa, Tractinsky and Vitale 2000, McKnight et al. 2002). It would be an interesting extension to add trust to our model and investigate the impacts of evaluation criteria on the level of perceived trust (Kim and Moon 1998).

In spite of these limitations, this study has several interesting implications. From the theoretical perspective, it makes three main contributions. First, it provides a conceptual framework based on principles and evaluation criteria derived from the architecture of buildings. The strong theoretical background of the architectural principles enables us to provide a plausible rationale for the evaluation criteria proposed in this study. Second, this study provides a set of subjective metrics whose reliability and validity have been empirically verified. The metrics can be used for various kinds of websites in future because their reliability and validity were consistently high across different categories of websites. The metrics could be extended to an objective index of website quality in future when they are meshed

Figure 6. A path diagram for the active group.


with objective feature lists (Kim et al. 2002). Third, this study provides two dimensions for website classification. They need further refinement in terms of categories and generic functions, but they can be used as a building block to construct a comprehensive structure for classifying websites.

From a practical perspective, this study indicates where effort should be focused to improve user satisfaction and loyalty. The more interesting point is that the focus should be shifted according to the type of website, in terms of users' goals and activity levels.

For example, the priorities in developing a website with a strong utilitarian emphasis (e.g. an auction) differ from those for a website with a strong hedonic emphasis (e.g. a game). For the utilitarian website, we need to focus on how to improve the utility and aesthetic appeal of the website in a balanced way, without as much attention to robustness. On the other hand, for the hedonic website we need to focus most of our attention on improving aesthetic appeal, with less attention to utility and robustness. This may be because users with hedonic goals mainly care about their enjoyment, and websites that are too convenient might not be as interesting as those with an appropriate level of challenge (Csikszentmihalyi and Csikszentmihalyi 1988). Also, robustness was found to be less important in both cases, probably because it is not a principle that can easily be observed by ordinary users or because most websites are now reasonably robust. In order to improve the aesthetic appeal of hedonic websites, we need to focus on the system interface, because system interface attractiveness was found to be more important than communication interface attractiveness for hedonic websites, in addition to being more important for hedonic sites than for utilitarian ones. On the other hand, in order to improve the convenience of utilitarian sites, we need to focus more on navigation usability because navigation usability was found to be more important than content usefulness, and was more important for utilitarian websites than for hedonic websites. Therefore, in summary, among the six evaluation criteria, improving system interface attractiveness was found to be the most effective way to increase user satisfaction and loyalty for hedonic websites, whereas navigation usability and system interface attractiveness were the most effective methods for utilitarian websites.

Figure 7. A path diagram for the passive group.

To take another example, compare the maintenance

of a website for those who want to actively interact with the site (e.g. online education) and another in which users are likely to watch what is going on passively (e.g. a personal homepage). For the active website, we need to allocate resources evenly for improving utility and aesthetic appeal, with a much smaller but still significant allocation for robustness. By contrast, for the passive website, we need to allocate more resources for aesthetic

Table 10. Results of a nested chi-square test for utilitarian vs. hedonic groups.

Unconstrained model: chi-square = 52269.06, df = 738.

IV | DV | Utilitarian Path | Utilitarian T value | Utilitarian Std | Hedonic Path | Hedonic T value | Hedonic Std | Chi-square (Δχ²) | df (Δdf)
Internal reliability | Robustness | 0.18** | 11.97 | 0.02 | 0.45** | 29.41 | 0.02 | 52301.27** (32.21) | 739 (1)
External security | Robustness | 0.85** | 36.56 | 0.02 | 0.69** | 38.27 | 0.02 | 52272.93* (3.87) | 739 (1)
Content usefulness | Utility | 0.13* | 3.96 | 0.03 | 0.05* | 2.34 | 0.02 | 52338.05** (8.99) | 739 (1)
Navigation usability | Utility | 0.85** | 24.05 | 0.04 | 1.02** | 42.48 | 0.02 | 52338.07** (9.01) | 739 (1)
System interface attractiveness | Aesthetic appeal | 0.88** | 44.51 | 0.02 | 0.92** | 48.87 | 0.02 | 52269.55 (0.49) | 739 (1)
Communication interface attractiveness | Aesthetic appeal | -0.00 | 0.08 | 0.02 | 0.10** | 5.91 | 0.02 | 52271.56 (2.5) | 739 (1)
Robustness | Satisfaction | -0.05 | 2.67 | 0.02 | 0.02 | 1.02 | 0.02 | 52273.48** (4.42) | 739 (1)
Utility | Satisfaction | 0.50** | 18.05 | 0.03 | 0.15** | 6.47 | 0.02 | 52338.09** (69.03) | 739 (1)
Aesthetic appeal | Satisfaction | 0.53** | 15.49 | 0.03 | 0.90** | 24.98 | 0.04 | 52298.08** (19.02) | 739 (1)

(*p < 0.05, **p < 0.01)

Table 11. Results of a nested chi-square test for active vs. passive groups.

Unconstrained model: chi-square = 27072.16, df = 684.

IV | DV | Active Path | Active T value | Active Std | Passive Path | Passive T value | Passive Std | Chi-square (Δχ²) | df (Δdf)
Internal reliability | Robustness | 0.45** | 21.55 | 0.02 | 0.31** | 24.56 | 0.01 | 27072.17 (0.01) | 685 (1)
External security | Robustness | 0.68** | 28.04 | 0.02 | 0.78** | 43.66 | 0.02 | 27072.16 (0.00) | 685 (1)
Content usefulness | Utility | 0.13* | 2.36 | 0.06 | 0.98** | 35.94 | 0.03 | 27157.62** (85.46) | 685 (1)
Navigation usability | Utility | 0.88* | 15.43 | 0.06 | 0.05* | -2.03 | 0.02 | 27150.19** (78.03) | 685 (1)
System interface attractiveness | Aesthetic appeal | 0.84** | 39.14 | 0.02 | 0.91** | 55.69 | 0.02 | 27085.73** (13.57) | 685 (1)
Communication interface attractiveness | Aesthetic appeal | 0.07** | 3.67 | 0.02 | 0.06** | 4.69 | 0.01 | 27072.16 (0.00) | 685 (1)
Robustness | Satisfaction | 0.06* | 2.55 | 0.02 | 0.01 | 0.84 | 0.01 | 27711.69** (639.53) | 685 (1)
Utility | Satisfaction | 0.40** | 13.52 | 0.03 | 0.12** | 2.86 | 0.12 | 27204.17** (132.01) | 685 (1)
Aesthetic appeal | Satisfaction | 0.49** | 12.32 | 0.04 | 0.73** | 16.23 | 0.73 | 27682.15** (609.99) | 685 (1)

(*p < 0.05, **p < 0.01)


appeal than for utility, with a much smaller allocation for robustness. In order to increase aesthetic appeal, we need to focus on the system interface for both websites, but more for the passive website than for the active one. Moreover, the results have an interesting implication for the utility dimension. In order to increase utility, web developers need to focus on content usefulness rather than navigation usability for the passive group. The path coefficient of the relationship between content usefulness and utility was only 0.13 for the active group but 0.98 for the passive group. However, navigation usability was found to be more important for the active group (path coefficient 0.88) than for the passive group (path coefficient 0.05). Useful content might be more important for passive users, because they primarily view whatever content is provided, but navigation usability might be more important for active users because they engage with the website by navigating between various parts of a site. In order to enhance robustness, both sites would need to focus more on external security than on internal reliability. However, the active website should allocate more resources to internal reliability than the passive one, whereas the passive website should allocate more resources to external security than the active one. External security might be more important for passive websites because much of the content in these websites is highly personalized, which requires a higher level of security and protection. On the other hand, internal reliability might be more important for active websites because reliable access is an important pre-condition for interaction with them.

Evaluating websites using the six criteria proposed in this study and developing websites with the three architectural dimensions in mind may help us to construct more pleasant environments on the Internet. Allocating our attention and resources differently according to the type of website will increase the effectiveness of our investment. After all, we would not wish to put a luxurious Italian leather sofa in the middle of a fast food restaurant, nor a plastic office desk in the middle of a cozy living room. The study results indicate that similar principles and criteria should be applied in the evaluation and development of websites.

Acknowledgements

The authors appreciate the support of the members of the HCI Lab at Yonsei University. This work was supported by a Korea Research Foundation Grant (KRF-2002-005-H20002) to the second author of this paper. The authors also appreciate the comments from two anonymous reviewers and an Editor of Behaviour and Information Technology.

References

ALASTRAIR, G. 1997, Testing the surf: criteria for evaluationInternet information resource. The Public-Access ComputerSystem Review, 8(30), 5 23.

ALEXANDER, J. and TATE, M. A. 1999, Web wisdom: How toevaluate and create information quality on the Web (NewJersey: Lawrence Erlbaum Associates).

ARMSTRONG, A. and HAGEL, J. 1996, The real value of onlinecommunities. Harvard Business Review, 74(3), 134 141.

BARRETT, P. S. 1992, Development of a post occupancybuilding appraisal model. Facilities Management: ResearchDirections, 5, 116 125.

BAUER, C. and SCHARL, A. 2000, Quantitative evaluation ofweb site content and structure. Internet Research: ElectronicNetworking Application Policy, 10(1), 31 43.

BENJAMIN, R. I. 1995, Electronic markets and virtual chains onthe information superhighway. Sloan Management Review,36(1), 62 72.

BERRY, L. 1995, Relationship marketing of services: growing interest, emerging perspectives. Journal of the Academy of Marketing Science, 23(4), 236-245.

BHIMANI, A. 1996, Securing the commercial Internet. Communications of ACM, 39(6), 29-35.

BRITANNICA. 2001, Commodity, Firmness, and delight: The ultimate synthesis. Encyclopedia Britannica article. Available at: http://www.britannica.com/bcom/eb/article/0/0,5716,119280+6,00.html

CHIRCU, A. M. and KAUFFMAN, R. J. 2000, Reintermediation strategies in business-to-business electronic commerce. International Journal of Electronic Commerce, 4(4), 7-42.

CSIKSZENTMIHALYI, M. and CSIKSZENTMIHALYI, I. S. 1988, Optimal Experience: Psychological Studies of Flow in Consciousness (New York: Cambridge University Press).

CZEPIEL, J. A. and GILMORE, R. 1987, Exploring the concept of loyalty in services. In: J. A. Czepiel, C. A. Congram and J. Shanahan (eds) The Services Challenge: Integrating for Competitive Advantage (Chicago, IL: American Marketing Association), pp. 91-94.

DAFT, R. and LENGEL, R. 1986, Organizational information requirements, media richness and structural design. Management Science, 32(5), 554-571.

DAVIS, F. D. 1989, Perceived usefulness and easiness of use. MIS Quarterly, 13(3), 319-340.

DELONE, W. H. and MCLEAN, E. R. 1992, Information systems success: the quest for the dependent variable. Information Systems Research, 3(1), 60-95.

DHAR, R. and WERTENBROCH, K. 2000, Consumer choice between hedonic and utilitarian goods. Journal of Marketing Research, 35(2), 60-71.

FOURNIER, S. and MICK, D. G. 1999, Rediscovering satisfaction. Journal of Marketing, 63(4), 5-23.

GEORGE, J. F. 2002, Influences on the intent to make Internet purchases. Internet Research, 12(2), 165-181.

GIEDION, S. 1941, Space, Time, and Architecture: The Growth of a New Tradition (Cambridge: Harvard University Press).

GOLDBERG, Y., SAFRAN, M. and SHAPIRO, E. 1992, Active mail? A FrameWork for implementing groupware. CSCW '92, October 1992 (Toronto: Canada), pp. 75-83.

GONZALES, M., FERNANDEZ, C. and CAMESELLE, J. 1997, Empirical validation of a model of user satisfaction with buildings and their environments as workplaces. Journal of Environmental Psychology, 17, 69-74.


HO, T.-H., CHRISTOPHER, S. T. and DAVID, R. B. 1998, Rational shopping behavior and the option value of variable pricing. Management Science, 44, 145-160.

HO, C. F. and WU, W. 1999, Antecedents of customer satisfaction on the Internet: An empirical study of online shopping. Proceeding of 32nd Hawaii International Conference on System Science (Maui, Hawaii), pp. 1-9.

HOFFMAN, D. L. and NOVAK, P. T. 1996, Marketing in hypermedia computer-mediated environments: conceptual foundations. Journal of Marketing, 60, 50-68.

HONG, S. 2002, Developing measures for testing architectural usability of diverse websites. Unpublished thesis at Yonsei University.

HUANG, K., LEE, Y. and WANG, R. 1999, Quality Information and Knowledge (New Jersey: Prentice Hall).

JARVENPAA, S. L., TRACTINSKY, N. and VITALE, M. 2000, Consumer trust in an Internet store. Information Technology and Management, 1(1-2), 45-71.

KIM, J. and MOON, J. 1998, Designing towards emotional usability in customer interface - trustworthiness of cyber-banking system interfaces. Interacting with Computers, 10, 1-29.

KIM, J. 1999, An empirical study of navigation aids in customer interface. Behaviour and Information Technology, 18(3), 213-224.

KIM, J. and LEE, J. 2002, Critical design factors for successful e-commerce systems. Behaviour and Information Technology, 21(3), 185-199.

KIM, J., LEE, J., HAN, K. and LEE, M. 2002, Business as buildings: metrics for the architectural quality of internet business. Information Systems Research, 13(3), 239-254.

KORGAONKAR, P. and WOLIN, L. D. 2002, Web usage, advertising, and shopping: relationship patterns. Internet Research, 12(2), 191-205.

KRNIC. 2001, available at http://www.nic.or.kr/index_kr.html

KRUG, S. 2000, Don't make me think - a common sense approach to web usability (Indianapolis: New Riders Publishing).

KWON, O. B., KIM, C.-R. and LEE, E. J. 2002, Impact of website information design factors on consumer ratings of web-based auction sites. Behaviour and Information Technology, 21(6), 387-402.

LIANG, T. P. and HUANG, J. S. 1998, An empirical study on consumer acceptance products in electronic markets: a transaction cost model. Decision Support System, 24(1), 29-43.

LIAO, Z. and CHEUNG, M. T. 2001, Internet-based e-shopping and consumer attitudes: an empirical study. Information Management, 38, 299-306.

LICHTENBERG, L. 1999, Influences of electronic developments on the role of editors and publishers - strategic issues. The International Journal on Media Management, 1(1), 23-30.

LIU, C. and ARNETT, K. P. 2000, Exploring the factors associated with web site success in the context of electronic commerce. Information Management, 38(1), 23-33.

LIU, C., ARNETT, K. P., CALELLA, L. and BEATTY, B. 1997, Web sites of the Fortune 500 companies: facing customers through home pages. Information Management, 31(6), 335-345.

LOHSE, G. L. and SPILLER, P. 1998, Electronic shopping: the effect of customer interfaces on transfer and sales. Communications of ACM, 41(7), 81-88.

LYNCH, P. J. and HORTON, S. 2002, Web style guide: basic design principles for creating web site, 2nd edition. Yale University Press.

MANO, H. and OLIVER, R. L. 1993, Assessing the dimensionality and structure of the consumption experience: evaluation, feeling, and satisfaction. Journal of Consumer Research, 20(12), 452-466.

MCCRICKARD, D. S., CHEWAR, C. M., SOMERVELL, J. P. and NDIWALANA, A. 2003, A model for notification systems evaluation - assessing user goals for multitasking activity. ACM Transactions on Computer-Human Interaction (TOCHI), 10(4), 312-338.

MCKINNEY, V., YOON, K. and ZAHEDI, F. 2002, Web-customer satisfaction: an expectation and disconfirmation approach. Information Systems Research, 13(3), 296-315.

MCKNIGHT, D. H., CHOUDHURY, V. and KACMAR, C. 2002, Developing and validating trust measures for e-commerce: an integrative typology. Information Systems Research, 13(3), 334-359.

MITCHELL, T. 1995, City of bits: Space, Place, and the Infobahn (Cambridge, MA: MIT Press).

MORAN, T. P. 1981, The command language grammar: a representation for the user interface of interactive systems. International Journal of Man-Machine Studies, 15(1), 3-50.

NATIONAL COMPUTER BOARD. 1997, First Secure VISA Card Payment Over the Internet (Singapore: Singapore National Computer Board Corporate Publication).

NIELSEN, J. 2000, Designing web usability (Indianapolis: New Riders Publishing).

PANURACH, P. 1996, Money in electronic commerce: digital cash, electronic fund transfer, and ecash. Communications of ACM, 39(6), 45-50.

PARUNAK, H. 1989, Hypermedia typologies and user navigation. In Proceedings of Hypertext '89 Conference, November 1989 (Pittsburgh, USA), pp. 43-50.

PERRY, M. and BODKIN, C. 2000, Content analysis of Fortune 100 Company web sites. Corporate Communication, 5(2), 87-96.

PREISER, W. F., RABINOWITZ, H. Z. and WHITE, E. T. 1988, Post-Occupancy Evaluation (New York: Van Nostrand Reinhold Co.).

RASMUSSEN, S. 1959, Experiencing Architecture (Cambridge: MIT Press).

RICE, M. 1997, What makes users revisit a web site? Marketing News, 31(6), 12.

ROSE, G., KHOO, J. and STRAUB, D. W. 1999, Current technological impediments to business-to-consumer electronic commerce. Communications of the AIS, 1(16), 1-74.

SASA, D. 2000, Electronic commerce: a half-empty glass? Communications of the AIS, 3(18), 1-99.

SCHMITT, B. 1999, Experiential Marketing (The Free Press).

SELZ, D. and SCHUBERT, P. 1998, Web assessment - a model for the evaluation and the assessment of successful electronic commerce applications. EM-Electronic Markets, 7, 46-48.

SHANKAR, B. 1996, Electronic commerce will be a big business. Telecommunications, 30(7), 24.

SHNEIDERMAN, B. 1993, Design the user interface: Strategy for effective human-computer interaction (Reading, MA: Addison-Wesley Publishing Co).

SHNEIDERMAN, B. 1994, Beyond accuracy, reliability, and efficiency: criteria for a good computer system. CHI '94: Conference on Human Factors in Computing Systems, April 1994 (Boston: United States), pp. 195-198.

UTTING, K. and YANKELOVICH, N. 1990, Context and orientation in hypertext networks. ACM Transactions On Information Systems, 7(1), 58-84.

VENKATA, N. P. and LILI, Q. 2000, The content and access dynamics of a busy Web site: findings and implications. SIGCOMM '00, October 2000 (Stockholm, Sweden), pp. 111-123.

WANG, P. 2000, Users' interaction with world wide web resources: an exploratory study using a holistic approach. Information Processing and Management, 36, 229-251.

WILKINSON, G. L., BENNETT, L. T. and OLIVER, K. M. 1997, Evaluation criteria and indicators of quality for Internet resources. Educational Technology, 37(3), 52-59.

WILSON, E., MORRISON, J. and NAPIER, A. 1997, Perceived effectiveness of computer mediated communications and face-to-face communications in student software development teams. Journal of Computer Information Systems, 38(2), 2-7.

WINOGRAD, T. and TABOR, P. 1996, Software design and architecture (Reading, MA: Addison-Wesley).

YAMAGUCHI, T., HOSOMI, I. and MIYASHITA, T. 1997, WebStage: An Active Media Enhanced World Wide Web Browser. In the Proceeding of CHI '97, March 1997 (Atlanta: USA), pp. 391-398.

ZIMRING, C. M. and REIZENSTEIN, J. 1980, Post-Occupancy evaluation: an overview. Environment and Behavior, 12(4), 429-450.

ZONA RESEARCH, INC. 2000, Web robustness measurement: The future may be now. Zona market report, available at: http://www.zonaresearch.com

