Daily prediction of marine beach water quality in Hong Kong


Available online at www.sciencedirect.com

Journal of Hydro-environment Research 6 (2012) 164–180

www.elsevier.com/locate/jher

Research paper

Daily prediction of marine beach water quality in Hong Kong

W. Thoe a, S.H.C. Wong b, K.W. Choi a, J.H.W. Lee c,*

a Croucher Laboratory of Environmental Hydraulics, Department of Civil Engineering, The University of Hong Kong, China

b Department of Civil and Environmental Engineering, Environmental and Water Studies, Stanford University, Stanford, USA

c Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, China

Received 1 February 2012; revised 8 April 2012; accepted 8 May 2012

Abstract

Bacterial concentration (Escherichia coli) is generally adopted as a key indicator of beach water quality. Currently the beach management system in Hong Kong relies on past water quality data sampled at intervals between 3 and 14 days. Beach advisories are issued when the geometric mean E. coli level of the past five samples exceeds the beach water quality objective (WQO) of 180 counts/100 mL. When the E. coli level varies dynamically, the system is not able to track the daily bacterial variation. And yet worldwide there does not exist a generally accepted method to predict beach water quality in a marine environment, which is influenced by hydro-meteorological variables, catchment characteristics, as well as complicated tidal currents and wave effects.

A comprehensive study of beach water quality prediction has been carried out for four representative beaches in Hong Kong: Big Wave Bay (BW), Deep Water Bay (DW), New Cafeteria (NC) and Silvermine Bay (SIL). Statistical analysis of the extensive regular monitoring data was carried out for two periods before and after the commissioning of the Harbour Area Treatment Scheme (HATS): (1990–1997) and (2002–2006) respectively. The data analysis shows that E. coli is strongly correlated with seven hydro-environmental variables: rainfall, solar radiation, wind speed, tide level, salinity, water temperature and past E. coli concentration. The relative importance of the parameters is beach-specific, and depends on the local geographical and hydrographical characteristics as well as location of nearby pollution sources.

Multiple Linear Regression (MLR) and Artificial Neural Network (ANN) models are developed from the sparsely sampled regular monitoring data (2002–2006) to predict the next-day E. coli concentration using the key hydro-environmental variables as input parameters. The models are validated against daily monitoring data in the bathing seasons of 2007 and 2008. The models are able to track the dynamic changes in E. coli concentration and predict WQO compliance/exceedance with an overall accuracy of 70–96%. Both the MLR and ANN models are superior to the current beach advisories in capturing water quality variations, and in predicting WQO exceedances. For example, the models predict around 80% and 50% of the exceedances at BW and NC respectively in June–July 2007, as compared to 0% and 14% based purely on past data. Similarly, observed exceedances are predicted with success rates of 71%, 42%, and 53% at BW, NC, and SIL respectively during July–October 2008, as compared with 0%, 0%, and 6% using the current water quality assessment criterion. The MLR and ANN models have similar performances; the ANN model tends to be better in predicting the high-end concentrations, with however a greater number of false positive predictions (false alarms).

This work demonstrates the practical feasibility of predicting bacterial concentration based on the critical hydro-environmental variables, and paves the way for developing a real time water quality forecast and management system for Hong Kong.

© 2012 International Association for Hydro-environment Engineering and Research, Asia Pacific Division. Published by Elsevier B.V. All rights reserved.

Keywords: Marine beach water quality; Daily prediction; Statistical model; Data-driven model

* Corresponding author. Tel.: +86 852 2358 6161.

E-mail addresses: [email protected], [email protected] (J.H.W. Lee).

1570-6443/$ - see front matter © 2012 International Association for Hydro-environment Engineering and Research, Asia Pacific Division. Published by Elsevier B.V. All rights reserved.

doi:10.1016/j.jher.2012.05.003


1. Introduction

The Hong Kong Special Administrative Region (HKSAR) of China is a coastal city with a land area of 1100 km² and a population of over 7 million. It is located at the mouth of the Pearl River Estuary in Southern China (Fig. 1); the coastal waters of around 1800 km² serve a wide variety of beneficial uses that include protected wetland habitats, passages for endangered wildlife species, recreational bathing waters, navigation, water supply, and receiving grounds for wastewater disposal. The Environmental Protection Department (EPD) of the HKSAR Government has been monitoring the water quality of Hong Kong’s 41 gazetted bathing beaches since 1986. During the bathing season (March–October), water samples are collected at the beach at least three times a month at intervals between 3 and 14 days; an extensive set of beach water quality data has been collected. Fig. 1 shows the locations of the 41 beaches around different parts of Hong Kong.

Fig. 1. Pearl River Estuary and Hong Kong, and the location of the beaches in Hong Kong. The connection points (D) and the outfall location of the Harbour Area Treatment Scheme (HATS) are also shown.

To establish scientific criteria for assessing beach water quality, epidemiological studies were conducted in the late 1980s on bathers at Hong Kong beaches. Escherichia coli (E. coli) was found to be a good indicator of faecal pollution, and its concentration showed a strong relationship with the incidence rate of swimming-associated illnesses such as skin and gastrointestinal illnesses (Cheung et al., 1990). Accordingly, Hong Kong beaches are graded into four categories (Table 1) on a weekly basis according to the geometric mean of the five most recent E. coli concentration measurements, which typically span about 30 days (EPD, 2006). Beaches with a ‘very poor’ grading would be closed to swimmers immediately. The beaches are also ranked annually according to the geometric mean of the E. coli concentration over the entire bathing season. Beaches with ‘poor’ or ‘very poor’ annual rankings (E. coli >180 and >610 counts/100 mL respectively) are considered not to meet the Water Quality Objectives (WQO), and will be considered for closure in the following year. Similar monitoring programmes have been practised worldwide for several decades. For example, the United States Environmental Protection Agency (USEPA) recommends upper limits of 35 and 33 enterococci per 100 mL for the geometric mean of at least five samples over 30 days in marine and freshwater environments respectively (USEPA, 1986).

Table 1
Beach grading system in Hong Kong.

Grade   Beach water quality   E. coli count per 100 mL   Minor illness rate (cases per 1000 swimmers)
1       Good                  ≤24                        Undetectable
2       Fair                  25–180                     ≤10
3       Poor                  181–610                    11–15
4       Very poor             >610                       >15
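The grading rule described above can be illustrated with a minimal Python sketch. It is not part of the original study: the sample values are hypothetical, the function names are introduced here for illustration only, and the treatment of non-integer geometric means between grade boundaries (e.g. 24.5) is an assumption.

```python
import math

# Upper limits of grades 1-3 from Table 1 (E. coli counts per 100 mL).
GRADE_LIMITS = [(24, "Good"), (180, "Fair"), (610, "Poor")]


def geometric_mean(counts):
    """Geometric mean computed as exp(mean(ln(count)))."""
    if not counts or any(c <= 0 for c in counts):
        # Assumption: zero counts must be handled upstream (ln(0) is undefined).
        raise ValueError("counts must be non-empty and positive")
    return math.exp(sum(math.log(c) for c in counts) / len(counts))


def beach_grade(recent_counts):
    """Grade a beach from its five most recent E. coli measurements."""
    gm = geometric_mean(recent_counts[-5:])
    for upper, label in GRADE_LIMITS:
        if gm <= upper:
            return gm, label
    return gm, "Very poor"


# Hypothetical samples: one very high reading among four low ones.
samples = [20, 35, 15, 1200, 40]
gm, grade = beach_grade(samples)
print(f"geometric mean = {gm:.1f} counts/100 mL, grade = {grade}")
```

In this hypothetical case a single reading far above 610 counts/100 mL still leaves the five-sample geometric mean below 180 counts/100 mL, which illustrates why a criterion based purely on past data can lag behind day-to-day changes.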

The hydrography of Hong Kong waters is mainly influenced by the tidal currents, monsoon-affected ocean currents, and the Pearl River discharges (e.g. Lee et al., 2006). Tides are predominantly semi-diurnal, with spring and neap tidal ranges of 2 m and 1 m respectively. The tidal flow is vertically well-mixed in the dry season, and resembles a partially-mixed estuarine circulation with significant vertical density stratification in the wet season. The Pearl River is China’s third longest river (2197 km) and the second largest river in terms of annual discharge volume (~3.3 × 10^11 m³). Its average flow is approximately 10,500 m³/s and 80% of the total flow occurs in the wet season (May–September) due to the high rainfall during this period (annual rainfall 2100 mm). Fig. 2 shows the typical surface flow field during flood and ebb tides in the dry season. In general, the flow is from east/south-east to west/north-west through Victoria Harbour and East Lamma Channel up to the Pearl River Estuary during flood, and from west/north-west to east/south-east during ebb. The direction of flood and ebb currents in Victoria Harbour is similar in the dry and wet seasons (Lee et al., 2006).

Fig. 2. Typical tidal flow in Hong Kong in the dry season as computed by a validated three-dimensional hydrodynamic model: (a) flood currents; (b) ebb currents (Lee et al., 2006).

Due to rapid urbanisation and industrialisation in the Pearl River Delta, the water quality in the Pearl River has been deteriorating over the past decades; the increasing nitrogen loads, for example, affect the waters to the west of Hong Kong Island. In addition to inputs from the Pearl River, local domestic and industrial wastewater is typically discharged through a number of submarine outfalls after primary or secondary treatment. The Harbour Area Treatment Scheme (HATS) is a major USD 2.5 billion environmental infrastructure project to collect sewage from the urban areas of Kowloon and Hong Kong Island, through a 24 km long deep tunnel sewerage system, to a centralized treatment works at Stonecutters Island (SCISTW). The sewage receives Chemically Enhanced Primary Treatment (CEPT), and the treated sewage flow (average flow of 19 m³/s; 1.4 million m³/d) is discharged via a 1.1 km long submarine outfall diffuser into western Victoria Harbour at a mean depth of around 12 m (Choi et al., 2009). Stage I of HATS serves a population of around 3 million and has been operating since December 2001. Beach water quality is affected by major point sources such as the HATS outfall as well as local non-point sources (e.g. storm water runoff).

The bacterial level at a marine beach varies dynamically from day to day (and in fact on an hourly basis); the E. coli level at a standard sampling time (e.g. 11 am on the sampling day) depends on various environmental parameters. For example, a clear relationship can be found between the incidence of rainfall and reduction of beach water quality (Olivieri et al., 1977). Bacterial decay is also greatly enhanced by sunlight (Fujioka et al., 1981). Furthermore, mixing and transport of pollutants are influenced by tidal flows (Hose et al., 2005; Boehm and Weisberg, 2005) and wind (Smith et al., 1999). Rosenfeld et al. (2006) reported higher bacterial levels during spring tide and at night. In addition, the measurement of E. coli concentration using standard methods requires at least 24–48 h. Hence routine beach water quality sampling may not provide an adequate basis for management. There may be many occasions when beaches were closed when criteria (based purely on past data) indicated that they should have remained open, or left open when they should have been closed (Whitman and Nevers, 2004). Over the past decade, two-dimensional and three-dimensional hydrodynamic and water quality models have been extensively used to predict E. coli concentrations for planning and environmental impact assessment studies. Generally speaking, model predictions are intended for design situations based only on several typical scenarios using averaged parameters (e.g. assumed dry weather or stormwater conditions). While these predictions are useful for planning purposes, they are rarely used for water quality forecasts. In particular, there are many challenges to the use of 3D/2D hydrodynamic models for real time prediction, because the changing boundary conditions are usually unknown. There is also great uncertainty associated with the specification of the bacterial decay coefficient and the bacteria source loading. Larsen (1992) has argued that beach water quality can only be fruitfully tackled using a stochastic approach.

Internationally there has been an increasing trend of using statistical or data-driven models to provide short-term forecasts to assist in beach management. A notable example is the EMPACT (Environmental Monitoring for Public Access and Community Tracking) Beaches Project carried out by the U.S. Environmental Protection Agency (USEPA) (Wymer et al., 2005); their model input parameters include rainfall, onshore wind speed, water temperature, and previous bacterial concentration. Another example is the study for the Fylde Coast in the United Kingdom (Crowther et al., 2001), with daily discharge, daily sunshine, wind and tide level as the major model inputs. Successful predictions using simple statistical models (e.g. Multiple Linear Regression (MLR) models) have been reported for beaches in lakes and rivers; for example, a series of studies have been carried out for several beaches in the Great Lakes in the United States (Olyphant, 2005; Nevers and Whitman, 2005), in which up to 50–60% of the observed variance can be explained by the models. On the other hand, water quality forecasting at marine beaches has received relatively scant attention. Marine beaches, being more complicated as an open system, are influenced by complex estuarine circulation and mixing brought by tidal currents and wave effects. In a study of a California marine beach, statistical models (Hou et al., 2006) were able to explain around 30–50% of the observed variation in the E. coli data. He and He (2008) also explored the use of Artificial Neural Networks (ANN) to predict water quality at two marine beaches in San Diego, which are influenced by lagoon outlets and hinterlands with high residential land use. Limited model validation based only on measurements during the wet winter (non-bathing) season was presented.

The availability of extensive regular monitoring data makes Hong Kong an ideal laboratory for the study of marine beach water quality. In this paper, we present a statistical analysis of bacterial water quality of four representative beaches using data during 1990–2006. The critical environmental factors affecting marine beach water quality are identified and incorporated in Multiple Linear Regression (MLR) and Artificial Neural Network (ANN) models for predicting the next-day E. coli level. The models developed from sparsely sampled routine monitoring data are validated using high frequency (daily) monitoring data during the bathing seasons of 2007 and 2008. In particular, the model performance is compared with the existing beach management criterion (based purely on past data); the ability of the models to predict compliance/exceedances of WQO is assessed. The models provide a basis for a real time beach forecast system to improve current water quality management practices.
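To make the notion of predicting compliance/exceedance concrete, the following minimal Python sketch scores a set of predictions against the WQO of 180 counts/100 mL. It is illustrative only: the observed and predicted values are hypothetical, and the metric definitions (overall accuracy, fraction of observed exceedances predicted, number of false alarms) are common conventions assumed here, not necessarily the exact definitions used in this paper.

```python
WQO = 180.0  # beach water quality objective, counts/100 mL


def exceedance_scores(observed, predicted, threshold=WQO):
    """Compare observed and predicted E. coli levels against a threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for obs, pred in zip(observed, predicted):
        obs_exc, pred_exc = obs > threshold, pred > threshold
        if obs_exc and pred_exc:
            hits += 1                 # exceedance correctly predicted
        elif obs_exc:
            misses += 1               # exceedance not predicted
        elif pred_exc:
            false_alarms += 1         # predicted exceedance that did not occur
        else:
            correct_negatives += 1    # compliance correctly predicted
    total = hits + misses + false_alarms + correct_negatives
    exceedances = hits + misses
    return {
        "overall_accuracy": (hits + correct_negatives) / total,
        "exceedances_predicted": hits / exceedances if exceedances else None,
        "false_alarms": false_alarms,
    }


# Hypothetical daily values (counts/100 mL).
observed = [50, 400, 120, 900, 30, 200]
predicted = [60, 250, 300, 150, 20, 500]
scores = exceedance_scores(observed, predicted)
print(scores)
```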

2. Methods

2.1. Study beaches

Four representative beaches with different hydrographic and pollution conditions are selected for detailed analysis: Big Wave Bay, Deep Water Bay, New Cafeteria and Silvermine Bay. The locations of the beaches are shown in Fig. 1.

Big Wave Bay (BW) is located in a rural environment at the eastern edge of Hong Kong Island, and opens out to the southern waters of Hong Kong, where the water quality is generally good to fair. The beach is about 100 m long; a stream enters the beach at the northern end. The stream drains from an upstream forested catchment (1.15 km²) and receives pollution from an un-sewered village (population of around 500).

Lying on the southern shores of Hong Kong Island, the 350 m long Deep Water Bay (DW) is a beach with excellent water quality throughout the year, with occasional pollution after heavy rainfall. There is also a stream entering the beach with a drainage area of 1.04 km²; the stream drains through sewered residential areas (16% of the catchment, the rest of which is mainly forest).

New Cafeteria (NC) is located at the edge of the north-western waters of Hong Kong. The area has been developed as a tourist spot with hotels and shopping malls. No major freshwater source exists at NC; the beach is more affected by the ambient marine water quality: the Pearl River Estuary and Pillar Point Sewage Outfall lie to the west, and the Stonecutters Island HATS outfall is about 20 km to the east of the beach.

Silvermine Bay (SIL) is located on the southeastern shore of the less developed Lantau Island. The beach is about 200 m long with a gentle slope; it is relatively enclosed, and the sea is usually calm. There is a stream 70 m south of the beach, with a drainage area of 1.08 km²; the catchment mainly comprises grassland/shrubland, with low-density agricultural activities and rural settlement in its downstream part. The Mui Wo sewage treatment works is also located to the south of the beach.

Relatively speaking, the Big Wave Bay and Silvermine Bay beaches are dominated by local non-point pollution. New Cafeteria Beach is primarily affected by the ambient condition, and Deep Water Bay is a clean beach with no significant pollution sources. More details on the beaches can be found in Thoe (2010) and http://www.waterman.hku.hk/.

Table 2
Parameters considered for analysis of beach water quality.

Parameter                Unit            Source
Beach E. coli count      count/100 mL    EPD (a)
Rainfall                 mm              HKO (b), GEO (c)
Wind speed               m/s             HKO, GEO
Wind direction           degree          HKO, GEO
Global solar radiation   MJ/m²           HKO, GEO
Tide level               m above CD      Predicted
Tidal range              m               Predicted
Water temperature        °C              EPD
Salinity                 ppt             EPD
Turbidity                NTU             EPD
Dissolved oxygen         mg/L            EPD
pH                       –               EPD
Beach user               –               LCSD (d)

(a) Environmental Protection Department.
(b) Hong Kong Observatory.
(c) Geotechnical Engineering Office.
(d) Leisure and Cultural Services Department.

Fig. 3. Histograms and cumulative frequency distribution of lnEC at Big Wave Bay showing the log-normality of E. coli concentration. [Big Wave Bay, 1986–2001: μ = 4.5, σ = 1.78, N = 538; axes: measured ln EC against frequency, and measured ln EC against expected ln EC.]


2.2. Data

2.2.1. Regular monitoring data

The regular water quality monitoring data of the Hong Kong Environmental Protection Department (EPD) from 1986 to 2006 is used for analysis. At each beach, water samples are taken at a location of about 1 m depth, and 300 mm below the water surface. The samples are stored in sterilized plastic bottles in an ice box, and sent to the laboratory for E. coli measurement within 6 h upon collection. Water temperature and dissolved oxygen are readily measured on site, while E. coli level, salinity, turbidity and pH are obtained through laboratory analysis of the water samples using well-established standard methods (EPD, 2006). The sampling interval is from 3 to 14 days, with a mean of 6.5 days during the bathing season (March–October), and at a minimum of once a month during the non-bathing season (November–February). In addition, wind velocity, rainfall and global solar radiation at a nearby station of each beach are also obtained from the Hong Kong Observatory (HKO). The number of beach users is obtained from the Leisure and Cultural Services Department (LCSD) of the Hong Kong Government. The tide level and tidal range at the sampling time are predicted by a well-validated Extended Harmonic Analysis using 25 tidal constituents of the closest tidal station (Lee, 1986). The statistical correlation of the following parameters with the E. coli concentration is investigated:

1. total daily rainfall 1-day, 2-day, and 3-day prior to the sampling day (mm)
2. yesterday’s global solar radiation (MJ/m²)
3. tide level at sampling time (m CD)
4. tidal range on the day of sampling (m)
5. yesterday’s prevailing wind direction and wind speed at Waglan Island (Fig. 1)
6. yesterday’s prevailing onshore wind speed based on Waglan wind (m/s)
7. salinity (ppt)
8. water temperature (°C)
9. turbidity (NTU)
10. dissolved oxygen (DO, mg/L), and DO percentage saturation
11. pH
12. geometric mean of the NL most recent measurements of E. coli concentrations (NL = 1–5)
13. number of beach users.
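Several of the candidate parameters listed above are derived quantities. The following minimal Python sketch shows one way they might be computed; it is an illustration under stated assumptions rather than the procedure used in the study. In particular, the definition of onshore wind speed as the wind component directed towards the beach, the beach orientation `beach_normal_deg`, and all function names and data values are hypothetical.

```python
import math


def cumulative_rainfall(daily_rain_mm, day_index, n_days=3):
    """Total rainfall over the n_days preceding day_index (sampling day excluded)."""
    start = max(0, day_index - n_days)
    return sum(daily_rain_mm[start:day_index])


def onshore_wind_speed(speed_ms, wind_from_deg, beach_normal_deg):
    """Wind component blowing towards the beach; 0 when the wind blows offshore.

    wind_from_deg: direction the wind comes from (meteorological convention).
    beach_normal_deg: hypothetical bearing of the seaward normal of the beach.
    Assumption: onshore speed = speed * cos(angle between the two bearings).
    """
    angle = math.radians(wind_from_deg - beach_normal_deg)
    return max(0.0, speed_ms * math.cos(angle))


def ln_ec_past(ec_counts, n_l=3):
    """Natural log of the geometric mean of the n_l most recent E. coli counts."""
    recent = ec_counts[-n_l:]
    return sum(math.log(c) for c in recent) / len(recent)


# Hypothetical daily rainfall (mm); index 4 is the sampling day.
rain = [0.0, 12.5, 30.0, 4.5, 0.0]
rain_3day = cumulative_rainfall(rain, 4)

# Hypothetical beach facing east (seaward normal at 90 degrees).
wind_onshore = onshore_wind_speed(6.0, 150.0, 90.0)

# Hypothetical past E. coli counts (counts/100 mL).
ln_ec3 = ln_ec_past([40, 100, 250, 160], n_l=3)
print(rain_3day, round(wind_onshore, 2), round(ln_ec3, 2))
```

Note that the logarithm of a geometric mean equals the arithmetic mean of the logarithms, which is what `ln_ec_past` computes.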

Table 2 gives a summary of all the data used and their respective sources. Two study periods are selected: 1990–1997 and 2002–2006. The first period (1990–1997) coincides with the years not much affected by major infrastructure development such as the HATS Stage I project; the beach water quality is relatively stable over this period, and least affected by construction and engineering projects. The second period (2002–2006) reflects the situation after the commissioning of HATS Stage I in 2001, during which a considerable improvement in water quality has been observed in most areas of Hong Kong. It is found that the E. coli concentration at a beach is usually log-normally distributed, which is shown in Fig. 3 using Big Wave Bay as an example. The cumulative frequency distribution of the expected lnEC also agrees well with that of a normal distribution (straight line plot at upper right corner of Fig. 3). In this study the natural logarithm of E. coli concentration (lnEC) is taken as the desired parameter to be modelled.
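A check of log-normality of the kind described above can be sketched in a few lines of Python. This is an illustrative sketch only: the counts are hypothetical, and the plotting position (i − 0.5)/n is one common convention assumed here, not necessarily the one used for Fig. 3.

```python
import math
from statistics import NormalDist, mean, stdev


def ln_ec_summary(ec_counts):
    """Return lnEC values with their sample mean and standard deviation."""
    ln_ec = [math.log(c) for c in ec_counts]
    return ln_ec, mean(ln_ec), stdev(ln_ec)


def expected_normal_scores(ln_ec):
    """Expected lnEC for each ranked sample if lnEC were exactly normal."""
    n = len(ln_ec)
    fitted = NormalDist(mean(ln_ec), stdev(ln_ec))
    # Plotting position (i - 0.5) / n is an assumed convention.
    return [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


# Hypothetical E. coli counts (counts/100 mL).
counts = [5, 12, 30, 48, 90, 150, 260, 700, 1900]
ln_ec, mu, sigma = ln_ec_summary(counts)
expected = expected_normal_scores(ln_ec)

# Pairs of (measured lnEC, expected lnEC); points close to the 1:1 line
# suggest that the counts are approximately log-normally distributed.
pairs = list(zip(sorted(ln_ec), expected))
for measured, exp_value in pairs:
    print(f"{measured:6.2f} {exp_value:6.2f}")
```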

2.2.2. Daily beach water quality data 2007 and 2008

To supplement the regular monitoring data, intensive daily beach water quality surveys were carried out in the bathing seasons of 2007 and 2008. During 1 June–31 July 2007, the E. coli concentration and the corresponding environmental parameters were monitored daily at Big Wave Bay (BW) and New Cafeteria (NC). This was followed by more extensive daily monitoring during 14 July–13 October 2008 for the four selected beaches (BW, NC, DW, SIL). These daily data help reveal the complex temporal and spatial variation of marine beach water quality, and its cause–effect relationship with different hydro-meteorological factors.

3. Data analysis

3.1. Trend of beach water quality

Fig. 4 shows the box plots of E. coli concentration at the four beaches in three different periods using the EPD’s regular monitoring data: 1990–1997, 1986–2001 (pre-HATS) and 2002–2006 (post-HATS). Each box plot shows the median and the 25th and 75th percentile values of the data set, and outliers which exceed 1.5 times the inter-quartile range; indicative E. coli levels of 180 and 610 counts/100 mL are also shown for reference. Compared to the pre-HATS period, it is clear that there is a great improvement in water quality in the post-HATS period, indicated by the significant decrease in E. coli concentration. However, there are still many occasions when the E. coli level exceeds 180 counts/100 mL, as evident from the significant variation of E. coli level from near zero to more than 610 counts/100 mL (e.g. Big Wave Bay, 2002–2006). There is no significant difference in water quality between the periods 1986–2001 and 1990–1997.

Fig. 4. Box plots of E. coli concentration at the four representative beaches. [Panels: Big Wave Bay, Deep Water Bay, New Cafeteria, Silvermine Bay; x-axis: period (90–97, 86–01, 02–06); y-axis: E. coli (counts/100 mL), with reference lines at 180, 610 and 1600 counts/100 mL.]

3.2. Key factors affecting beach water quality

Through systematic correlation analysis between lnEC and all hydro-meteorological parameters in the two study periods (1990–1997 and 2002–2006), 7 of the 13 environmental parameters are identified as critical factors affecting beach bacterial concentration: previous 3 days’ cumulative rainfall, previous day’s global solar radiation, previous day’s onshore wind speed, salinity, water temperature, tide level, and geometric mean of past 3 measurements of E. coli concentration in natural logarithm (lnEC3).

Table 3 shows the Pearson’s correlation coefficients (r) of lnEC with the seven parameters at the four selected beaches in both study periods. Fig. 5 shows the correlation in bar charts. Among the selected parameters, lnEC is most correlated with rainfall, salinity, global solar radiation and tide level. Water temperature, wind speed and past E. coli concentration vary in significance between beaches, but tend to be less important. The correlation can be interpreted against the known hydraulic and water quality characteristics of Hong Kong waters. Rainfall is generally the most common parameter that correlates with lnEC. It is an indication of the existence of non-point source pollution which is brought from land to the beach through rainfall and the associated surface runoff. Overflow of bacteria-laden sewage from septic tanks or soakaway pits may also occur after heavy rain. Salinity is statistically co-correlated with rainfall, and reflects the impact of the bacteria-laden freshwater inflow to the beach. The correlation with salinity is particularly strong at BW, DW and SIL (r ~ −0.5), where a stream is situated in close proximity to the beach. Beach water quality is usually poor when salinity is low. On the other hand, the correlation with salinity is less marked at NC (r ~ −0.3). Bacterial mortality increases with ultraviolet radiation; hence the E. coli level is negatively correlated with global solar radiation, and this correlation is relatively consistent across the beaches. This is consistent with the findings of Gameson and Gould (1975) and a recent study on bacterial decay rate in Hong Kong coastal waters (Chan, 2010). A strong correlation is observed at NC between lnEC and tide level (especially during 2002–2006), as NC is very close to the submarine outfalls of the nearby sewage treatment plants at Urmston Road and Pillar Point (Fig. 1). Past E. coli level indicates the general trend of the recent water quality at the beach, and has a consistent positive correlation with lnEC. Water temperature and onshore wind speed show appreciable correlation at several beaches (e.g. onshore wind speed at DW and NC, water temperature at DW), but they tend to be less important than the other parameters (Table 3).

Table 3
Correlation of lnEC with key parameters: (1) 1990–1997 and (2) 2002–2006 for the four representative beaches.

Parameter                                             Big Wave Bay      Deep Water Bay    New Cafeteria     Silvermine Bay
                                                       90–97    02–06    90–97    02–06    90–97    02–06    90–97    02–06
Number of data                                         261      251      259      282      340      214      334      217
Previous 3 days’ cumulative rainfall (mm)              0.345    0.452    0.445    0.486    0.420    0.349    0.393    0.348
Previous day’s global solar radiation (MJ/m²)         −0.366   −0.211   −0.114   −0.080   −0.251   −0.241   −0.207   −0.182
Tide level (m)                                         0.188    0.091    0.051    0.229    0.227    0.317    0.041   −0.156
Water temperature (°C)                                −0.099    0.269    0.390    0.399    0.137    0.091    0.199    0.093
Salinity (ppt)                                        −0.363   −0.610   −0.576   −0.511   −0.246   −0.317   −0.582   −0.531
Onshore wind speed (m/s)                               0.149    0.125    0.392    0.408    0.200    0.285    0.101    0.267
Geometric mean of past 3 E. coli measurements, lnEC3   0.133    0.316    0.475    0.426    0.196    0.063    0.365    0.098

Fig. 5. Correlation of lnEC with different variables at the four representative beaches (2002–2006). [Panels: 3 day rain, solar radiation, salinity, onshore wind speed, tide level, lnEC3; y-axis: correlation coefficient; x-axis: BW, DW, NC, SIL.]

Changes in the statistical correlation with time can also be noted. Compared to the pre-HATS period (1990–1997), the statistical correlation with rainfall, radiation, salinity, and wind speed for the post-HATS period (2002–2006) is generally similar (Table 3). The most significant change is the correlation with tide level and/or water temperature at Big Wave Bay, Deep Water Bay, and Silvermine Bay. This is mainly a consequence of HATS: pollution sources previously discharged from sewage outfalls in eastern Victoria Harbour (the connection points in Fig. 1) used to affect beach water quality at southern waters (BW and DW), being advected with the flooding tide from the source to the beaches. In contrast, after the commissioning of HATS, the sewage flow is centrally treated and discharged at Stonecutters Island (Fig. 1). The concentrated sewage flow is now transported to the southern waters of Hong Kong Island on an ebb tide (Figs. 1 and 2). This also explains the change from positive to negative correlation with tide level at Silvermine Bay, a location that would be impacted by the HATS outfall on the ebb tide. Overall the results suggest that each beach has its own characteristics and the correlation can change over time due to change of pollution sources, and point to the need to develop individual correlation analysis and forecast model for each beach. Despite the changes in magnitudes in correlation coefficients, the critical parameters remain the same for the two periods. Similar correlation results (not shown) have been obtained using the daily E. coli data.
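The correlation analysis discussed above can be illustrated with a minimal sketch. It assumes the coefficients are ordinary Pearson correlations between lnEC and each hydro-environmental variable; the paper's analysis was run in SPSS, and the function names and sample values below are hypothetical, not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

def correlate_with_lnec(ecoli_counts, variables):
    """Correlate ln(E. coli) with each named variable series."""
    # Counts must be positive; zero counts would need special handling.
    ln_ec = [math.log(c) for c in ecoli_counts]
    return {name: pearson_r(series, ln_ec) for name, series in variables.items()}

# Hypothetical illustrative values, NOT data from the study.
ecoli = [20, 35, 400, 900, 60, 15]            # counts/100 mL
variables = {
    "rain_3day_mm": [0, 2, 40, 85, 5, 0],
    "salinity_psu": [33, 32, 27, 24, 31, 33],
}
for name, r in correlate_with_lnec(ecoli, variables).items():
    print(f"{name}: r = {r:+.2f}")
```

Running the same function separately on the 1990–1997 and 2002–2006 subsets of a beach's record would give the kind of period-to-period comparison described in the text.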

4. Development of predictive models

Based on the critical environmental factors revealed by the statistical correlations, the Multiple Linear Regression model and the Artificial Neural Network model are used to develop predictive models for beach E. coli concentration.

4.1. Multiple Linear Regression model

Fig. 6. Structure of Artificial Neural Network for daily beach water quality prediction.

In the Multiple Linear Regression (MLR) method (Box and Jenkins, 1976; Ostrom, 1978), it is assumed that the dependent variable, lnEC, is a linear combination of the selected hydro-environmental variables:

$$Y = b_0 + \sum_{j=1}^{M} b_j x_j = b_0 + b_1 x_1 + \cdots + b_M x_M \qquad (1)$$

where $Y$ = predicted value of lnEC, $x_j$ = value of the jth variable, $b_j$ = regression coefficient of the jth variable, and $b_0$ = constant term. For the kth observation, the error between the predicted and observed value of the dependent variable is then:

$$e_k = y_k - Y_k = y_k - \left( b_0 + \sum_{j=1}^{M} b_j x_{jk} \right) \qquad (2)$$

where $y_k$, $Y_k$ = observed and predicted values of lnEC respectively, $e_k$ = residual = difference between observed and predicted values of lnEC, and $x_{jk}$ = value of the jth variable corresponding to the kth observation. The regression coefficients ($b_0, b_1, \ldots, b_M$) are obtained by best-fitting the assumed linear regression equation model to the observations. The goodness of fit can be measured by the coefficient of determination, $R^2$, as an indication of the percentage of variance explained by the model.

$$R^2 = \frac{\sum_{k=1}^{N} (Y_k - \bar{y})^2}{\sum_{k=1}^{N} (y_k - \bar{y})^2} = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} \qquad (3)$$

where SSR = regression sum of squares, SST = total sum of squares (observed variance), SSE = E = sum of squares of residuals, $\bar{y}$ = mean of the observed lnEC, and $N$ = number of observations.
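As a hedged illustration of Eqs. (1) to (3), the sketch below fits the regression coefficients by ordinary least squares (normal equations solved by Gaussian elimination) and evaluates R2 in its 1 − SSE/SST form. It is a minimal stand-in for the SPSS procedure, not a reproduction of it; the function names and the sample values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(rows, y):
    """Least-squares fit of Eq. (1); returns [b0, b1, ..., bM]."""
    design = [[1.0] + list(r) for r in rows]          # leading 1 carries b0
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yk for d, yk in zip(design, y)) for i in range(p)]
    return solve_linear(xtx, xty)

def predict(b, row):
    """Evaluate Eq. (1) for one observation."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], row))

def r_squared(b, rows, y):
    """Eq. (3), using R2 = 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yk - predict(b, r)) ** 2 for r, yk in zip(rows, y))
    sst = sum((yk - y_bar) ** 2 for yk in y)
    return 1.0 - sse / sst

# Hypothetical values: (3-day rain in mm, salinity) and observed lnEC.
rows = [(0, 30), (10, 28), (40, 25), (5, 31), (80, 22), (20, 29)]
ln_ec = [3.1, 3.9, 5.8, 3.0, 7.2, 4.1]
coeffs = fit_mlr(rows, ln_ec)
print("coefficients:", [round(c, 3) for c in coeffs])
print("R2:", round(r_squared(coeffs, rows, ln_ec), 3))
```

For more than a handful of predictors a library least-squares routine would be numerically safer than the normal equations used here.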

All the correlation and regression analyses are performed using SPSS (Version 18). A stepwise regression algorithm is employed to seek the best combination of the selected variables. The entry and removal probabilities are 0.05 and 0.10 respectively. Excluded parameters are further manually combined with the stepwise model to maximize the variance explained.

4.2. Artificial Neural Network

A feed-forward back-propagation Artificial Neural Network (ANN) with three layers is used: an input layer with seven nodes representing the key parameters affecting bacteria level, one hidden layer with five nodes, and a single output layer with one node to give the prediction of lnEC. Fig. 6 shows the structure of the neural network. The number of hidden layers and hidden nodes are determined through a series of model calibration to maximize model performance. All input data are linearly normalized into a range from −0.9 to +0.9. Log-sigmoidal function is used as the transfer function from input layer to hidden layer; no transfer function is used from hidden layer to output layer. The network is trained through iterations via the gradient-descent method with momentum correction; the model learning rate is 0.1 and the momentum term is 0.3. The data for each beach are randomly divided into learning, validation and testing period in a ratio of 60:20:20. To prevent data over-fitting, the model learning is stopped if the error between the prediction and the validation data increases for 1000 iterations. The MATLAB Neural Network Toolbox 5 is used to develop the ANN model. The main advantages of ANNs are: (i) they are relatively easy to set up; (ii) they can provide quick response and hence are well-suited for real time operation; and most importantly (iii) ANNs can model dynamic, non-linear and noisy data. Details of the ANN method can be found elsewhere (Haykin, 1994).

Table 4. Multiple Linear Regression (MLR) models and the corresponding adjusted R² for the four representative beaches, 2002–2006.

| Order | Big Wave Bay: parameter | adj R² | Deep Water Bay: parameter | adj R² | New Cafeteria: parameter | adj R² | Silvermine Bay: parameter | adj R² |
|---|---|---|---|---|---|---|---|---|
| 1 | Salinity | 0.369 | Salinity | 0.258 | 3 day rain | 0.112 | Salinity | 0.278 |
| 2 | Radiation | 0.431 | 3 day rain | 0.38 | Tide | 0.204 | Radiation | 0.344 |
| 3 | Water temp. | 0.447 | Tide | 0.439 | Salinity | 0.234 | Wind | 0.363 |
| 4 | lnEC3 | 0.454 | Water temp. | 0.454 | Radiation | 0.286 | 3 day rain | 0.371 |
| 5 | 3 day rain | 0.459 | Radiation | 0.476 | Wind | 0.286 | Tide | 0.375 |
| 6 | | | Wind | 0.479 | | | Water temp. | 0.376 |
| 7 | | | lnEC3 | 0.479 | | | | |
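The network described above can be sketched in a few lines of plain Python. This is an illustrative reimplementation under stated assumptions, not the MATLAB Neural Network Toolbox model used in the study: the weight initialisation, the per-sample (online) update, and the class and function names are all hypothetical, and the 60:20:20 split and validation-based early stopping are omitted for brevity. Only the 7-5-1 layout, the log-sigmoid hidden layer, the linear output, the [−0.9, +0.9] scaling, the learning rate of 0.1 and the momentum of 0.3 come from the text.

```python
import math
import random

def normalize(value, lo, hi):
    """Linearly map [lo, hi] onto [-0.9, +0.9]."""
    return -0.9 + 1.8 * (value - lo) / (hi - lo)

def logsig(z):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

class TinyANN:
    """7-5-1 feed-forward network: log-sigmoid hidden layer, linear output."""

    def __init__(self, n_in=7, n_hidden=5, lr=0.1, momentum=0.3, seed=0):
        rng = random.Random(seed)
        # Each hidden row holds n_in weights plus a trailing bias (assumed init).
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.lr, self.momentum = lr, momentum
        self.v1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.v2 = [0.0] * (n_hidden + 1)

    def forward(self, x):
        """Return (output, inputs with bias, hidden activations with bias)."""
        xb = list(x) + [1.0]
        h = [logsig(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        return sum(w * v for w, v in zip(self.w2, hb)), xb, hb

    def train_step(self, x, target):
        """One gradient-descent update with momentum on the squared error."""
        out, xb, hb = self.forward(x)
        err = out - target
        # Hidden-layer deltas are computed with the pre-update output weights.
        deltas = [err * self.w2[j] * hb[j] * (1.0 - hb[j])
                  for j in range(len(self.w1))]
        for j in range(len(self.w2)):
            self.v2[j] = self.momentum * self.v2[j] - self.lr * err * hb[j]
            self.w2[j] += self.v2[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                self.v1[j][i] = (self.momentum * self.v1[j][i]
                                 - self.lr * deltas[j] * xb[i])
                row[i] += self.v1[j][i]
        return 0.5 * err * err

# Hypothetical single sample: seven raw inputs scaled with an assumed 0-10 range.
net = TinyANN()
sample = [normalize(v, 0.0, 10.0) for v in [1, 2, 3, 4, 5, 6, 7]]
first_loss = net.train_step(sample, 0.5)
for _ in range(200):
    last_loss = net.train_step(sample, 0.5)
print(first_loss, last_loss)
```

In practice the loop would run over the learning subset, with the validation error monitored to decide when to stop, as the text describes.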

The seven parameters identified in the correlation analysis are used to develop the predictive models. Both MLR and ANN models are developed for the period of 2002–2006, and their performances are compared with the current beach water quality assessment criterion (ClnEC5). Table 4 shows the adjusted R² achieved by the MLR models for the four beaches. The order of the parameters being selected in the MLR models and their respective cumulative adjusted R² are indicated. The MLR models can explain about 30–50 percent of the variances of lnEC. This is consistent with previous studies conducted in a marine environment (Hou et al., 2006).
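For reference, the adjusted R² quoted above is presumably the usual correction of Eq. (3) for the number of predictors. The text does not give the formula, so the standard definition is assumed here, with $N$ the number of observations and $M$ the number of variables in the model:

$$R^2_{adj} = 1 - \left(1 - R^2\right)\frac{N - 1}{N - M - 1}$$

Under this definition, adding a variable raises the adjusted R² only when the gain in R² outweighs the loss of one degree of freedom, which would be consistent with the small or zero increments at the higher orders in Table 4.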

The performance of the models in predicting WQO compliance/exceedance is assessed according to the following parameters: (i) the percentage of observed non-compliances that are actually predicted (‘sensitivity’); (ii) the percentage of observed compliances that are actually predicted (‘specificity’). A high sensitivity implies a better protection of public health, while a low specificity would suggest frequent undesirable false alarms (issuing a beach advisory when the water quality is actually under the threshold). On the other hand, the percentages of predicted exceedances/compliances that are actually observed are shown as ‘predicted value’ (±). Table 5 shows the legend of the performance table used in this study. Table 6 shows the model performance table for Big Wave Bay and Deep Water Bay during 2002–2006. Table 7 shows the performance table for New Cafeteria and Silvermine Bay for the same period.

Table 5. Legend of the model performance table.
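The performance measures defined above can be computed from paired observed and predicted concentrations as in the following sketch. The function and key names are hypothetical, the sample values are invented for illustration, and an exceedance is taken to mean a value strictly above the 180 counts/100 mL objective.

```python
def performance(observed, predicted, threshold=180.0):
    """Sensitivity, specificity, predictive values and overall accuracy."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        obs_exc = obs > threshold      # observed exceedance
        pred_exc = pred > threshold    # predicted exceedance
        if obs_exc and pred_exc:
            tp += 1
        elif obs_exc:
            fn += 1                    # missed exceedance
        elif pred_exc:
            fp += 1                    # false alarm
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "predictive_value_exceedance": ratio(tp, tp + fp),
        "predictive_value_compliance": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }

# Hypothetical E. coli values in counts/100 mL, NOT data from the study.
observed = [50, 300, 120, 900, 200, 30]
predicted = [40, 250, 200, 100, 400, 60]
for key, value in performance(observed, predicted).items():
    print(f"{key}: {value:.2f}")
```

The same function can score the current criterion by passing the ClnEC5 series as `predicted`.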

The following observations can be made from an examination of Tables 6 and 7: (i) The model predictions achieve a significantly higher correlation with observations than the current monitoring method (ClnEC5). For example, the MLR model achieves a correlation coefficient of 0.69 and 0.57 for BW and NC respectively, while the ClnEC5 attains corresponding values of only 0.32 and 0.063 respectively. (ii) The models are superior to the current criterion in predicting WQO compliance/exceedance. For example, the MLR model predicts 47% of the observed exceedances at BW, compared to only 14% based purely on past data. (iii) The ANN model achieves a higher sensitivity (65% at BW, 46% at NC and 51% at SIL) when compared to MLR model (47% at BW, 32% at NC and 32% at SIL); both models perform well at predicting compliances (high specificity at ~95%) at the three beaches. On the contrary, when using ClnEC5 as the predictive criterion, a sensitivity of only 4% is obtained at SIL. At the beach with excellent water quality (DW), the specificity can be as high as 99% for both models; ANN achieves a sensitivity of 36%, while the MLR model can only predict 1 exceedance out of 11 cases (sensitivity 9%).

Table 6. Performance of MLR, ANN models and the current assessment criterion (ClnEC5) at Big Wave Bay and Deep Water Bay (2002–2006).

Overall, the comparison for the regular monitoring data shows that predictive models are superior to purely relying on past monitoring data for beach advisories, and the non-linear ANN model shows a better predictability on water quality exceedances (high-end values), although it usually slightly over-predicts when the observed E. coli level is low.

A sensitivity test is performed to study which parameter plays a more important role in the model prediction of E. coli variation. Each of the seven input parameters is increased by 5% while keeping the other parameters unchanged, and the absolute percentage change in the model output (lnEC) is calculated. Fig. 7 shows the results for Big Wave Bay and Deep Water Bay. In general, MLR and ANN show similar parameter sensitivity. At BW, salinity stands out as the most influential parameter that changes the prediction of lnEC (6.8% for ANN and 4.3% for MLR); DW shows a double peak in salinity and rainfall (sensitivity ranges from 4.3 to 6.1%). Both BW and DW are affected by non-point source pollution from freshwater inflow onto the beach. The freshwater input decreases the beach water salinity and at the same time increases E. coli concentration. Although both salinity and rainfall indirectly represent the freshwater source, the in-situ salinity seems to be more critical. In general, the sensitivity to the other parameters (water temperature, onshore wind speed, solar radiation, lnEC3 and tide level) is consistent with the importance indicated by the statistical analysis.
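The one-at-a-time sensitivity test described above can be sketched as follows. The text does not state which baseline input values were perturbed, so the baseline here is an assumption, and the toy linear model, its coefficients and all names are hypothetical rather than the fitted models of the study.

```python
def parameter_sensitivity(model, baseline, increase=0.05):
    """Absolute % change in model output when each input rises by `increase`."""
    base_out = model(baseline)
    if base_out == 0:
        raise ValueError("baseline output is zero; percentage change undefined")
    result = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)                 # other inputs held fixed
        perturbed[name] = value * (1.0 + increase)
        change = model(perturbed) - base_out
        result[name] = abs(change) / abs(base_out) * 100.0
    return result

def toy_mlr(x):
    """Hypothetical linear model for lnEC with made-up coefficients."""
    return 12.0 - 0.25 * x["salinity"] + 0.02 * x["rain_3day"]

# Assumed baseline inputs, NOT values from the study.
baseline = {"salinity": 30.0, "rain_3day": 20.0}
for name, pct in parameter_sensitivity(toy_mlr, baseline).items():
    print(f"{name}: {pct:.2f}% change in lnEC")
```

Because `model` is any callable taking a dictionary of inputs, the same routine could be applied to an ANN prediction function for a like-for-like comparison.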

5. Daily prediction of beach water quality

It is of interest to study whether the models developed from the sparsely sampled data (3–14 days) of the regular monitoring program can be reliably applied on a daily basis. The ANN and MLR models developed from the 2002–2006 data are further validated against the high frequency daily data obtained in 2007 and 2008.

Table 7. Performance of MLR, ANN models and the current assessment criterion (ClnEC5) at New Cafeteria and Silvermine Bay (2002–2006).


Fig. 8 shows the daily observed and predicted lnEC with the current assessment (ClnEC5) over a two-month period at BW and NC in 2007. The daily global solar radiation and rainfall are also shown. It can be seen that a rapid rise in E. coli level is induced by a heavy rainstorm, followed by a fall back to background levels in around 3 days in the presence of strong sunshine (e.g. July 1). At BW, ClnEC5 lags behind the observation by about 1/2 to 1 month. When the E. coli level frequently exceeds the threshold during early June following heavy rainfall, the corresponding ClnEC5 is however very low. On the other hand, when ClnEC5 climbs up to above 180 counts/100 mL in late June, the actual E. coli concentration has already dropped to a low level. ClnEC5 is insensitive at NC; it has only exceeded the threshold for three days from 13 to 15 June, while in reality there are 20 occasions out of 52 that the daily E. coli concentration exceeds 180 counts/100 mL (exceedance rate 38%). Besides ClnEC5, the current system issues beach advisory note when a single measurement exceeds 1600 counts/100 mL, but the decision can only be made after 1–2 days due to the time required for bacterial measurement. As observed in the daily data, the E. coli concentration seldom stays at a high value over a long time. This single measurement exceedance criterion is therefore impractical and provides out-dated information.

Fig. 7. Sensitivity analysis of MLR and ANN models for Big Wave Bay and Deep Water Bay. [Bar charts of absolute change in lnEC (%) for ANN and MLR; inputs: lnEC3, rain, radiation, tide, wind, salinity, water temperature; panels: Big Wave Bay, Deep Water Bay.]

Fig. 8. Comparison of daily prediction of E. coli level using ANN and MLR models at Big Wave Bay and New Cafeteria (June–July 2007). [Time series, 1 Jun to 31 Jul: E. coli (counts/100 mL), rain (mm/d) and radiation (MJ/m2) against date; series: ClnEC5, data, MLR, ANN; panels: Big Wave Bay, New Cafeteria.]
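The two advisory criteria discussed above can be written out as a short sketch, which also shows why the five-sample criterion reacts slowly. It assumes ClnEC5 denotes the geometric mean of the five most recent samples, in line with the advisory rule described in the abstract; the function names and the sample values are hypothetical.

```python
import math

WQO = 180.0              # counts/100 mL, geometric-mean objective
SINGLE_SAMPLE = 1600.0   # counts/100 mL, single-measurement trigger

def geometric_mean(counts):
    """Geometric mean, computed as exp of the mean of the logs."""
    return math.exp(sum(math.log(c) for c in counts) / len(counts))

def advisory(past_counts):
    """Apply both criteria to a chronological list of past samples."""
    if len(past_counts) < 5:
        raise ValueError("need at least five past samples")
    gm5 = geometric_mean(past_counts[-5:])
    return {
        "geomean_5": gm5,
        "exceeds_wqo": gm5 > WQO,
        "single_sample_exceedance": past_counts[-1] > SINGLE_SAMPLE,
    }

# Hypothetical record with one storm-driven spike, NOT data from the study.
record = [30, 40, 25, 2000, 50]
result = advisory(record)
print(f"geometric mean of last five: {result['geomean_5']:.1f} counts/100 mL")
print("advisory by geometric mean:", result["exceeds_wqo"])
```

In this made-up record a single sample well above 1600 counts/100 mL leaves the five-sample geometric mean below the objective, which is the kind of damping and lag noted in the text.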

Comparatively, both predictive models manage to track the daily changes in water quality, and follow quite closely the ever-changing observed E. coli level; the correlation coefficient between predicted and observed lnEC at BW by both models is about 0.75, and ~0.45 at NC. Both models are superior to using ClnEC5 (correlation ~0.26 and 0.07 respectively). Table 8 shows the performance of the models in predicting WQO compliance/exceedance during June–July 2007. The current system (ClnEC5) performs poorly in predicting WQO exceedances, with a sensitivity of 0% and 14% at BW and NC respectively. On the contrary, MLR and ANN model performances for the daily forecast are very encouraging, accounting for around 80% and 50% of the exceedances at BW and NC respectively; the overall prediction accuracy is around 85% and 70% at BW and NC respectively. In general, the performances of ANN and MLR models in predicting daily water quality are similar.

Similar water quality prediction is carried out for the daily data in 2008. Fig. 9 shows the observed and predicted lnEC at the four representative beaches in 2008 (August–September); Tables A.1 and A.2 in the appendix show the performance tables for BW, DW and NC, SIL respectively (July–October). Similar conclusions can be reached. The correlation coefficients between MLR predicted and observed lnEC for BW, DW, NC and SIL are 0.576, 0.272, 0.275 and 0.680 respectively, as compared to corresponding values of −0.044, 0.106, 0.080 and −0.131 respectively based purely on past data. The models perform well in predicting water quality threshold exceedance, with success rates of 71%, 42% and 53% at BW, NC and SIL respectively, as compared to 0% at BW and NC and 6% at SIL when using the current system. Nevertheless, the correlation between the observed and predicted lnEC is significantly lower in 2008 (about 0.3 for DW and NC) when compared to that obtained in 2007. As observed in Fig. 9, the models tend to over-predict the very low E. coli level. This reflects a better performance of the predictive models at beaches with fair water quality and critically affected by local non-point source pollution (BW and SIL). Also, for beaches with excellent water quality such as Deep Water Bay (DW), the data is essentially biased to WQO compliances, and the model has less explanatory power for those extreme high-end values.

Table 8. Performance of MLR, ANN models and the current assessment criterion (ClnEC5) in daily prediction at Big Wave Bay and New Cafeteria (June–July 2007).


Both MLR and ANN models are developed based on hydro-environmental parameters that are significantly correlated with beach water quality. They have an encouraging performance on predicting sudden deterioration of water quality due to changes in environmental conditions, which is the major weakness of the traditional beach water quality assessment method (ClnEC5). Generally speaking, ANN is more attractive than MLR for its better performance on capturing high-end values; on the other hand ANN usually over-predicts E. coli level and leads to more ‘false alarms’. Care should be exercised when applying ANN models in real time forecast, as extreme cases (e.g. typhoons with extreme winds or severe rain storms) are often missed by the regular monitoring calibration data. As ANN models essentially rely on interpolated data, unreasonable prediction may result when the input parameter exceeds the range of the dataset used for model training.
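One simple safeguard against the extrapolation problem noted above is to flag any forecast whose inputs fall outside the range seen during training. The sketch below is a hypothetical illustration of such a check, not a procedure described in the paper; the variable names and values are invented.

```python
def training_ranges(rows):
    """Per-variable (min, max) over the training rows (a list of dicts)."""
    names = rows[0].keys()
    return {n: (min(r[n] for r in rows), max(r[n] for r in rows)) for n in names}

def out_of_range(ranges, sample):
    """Names of inputs that fall outside the range seen in training."""
    return [n for n, (lo, hi) in ranges.items() if not lo <= sample[n] <= hi]

# Hypothetical training inputs, NOT data from the study.
training = [
    {"rain_3day": 0.0, "salinity": 33.0, "wind": 2.0},
    {"rain_3day": 85.0, "salinity": 24.0, "wind": 9.0},
    {"rain_3day": 12.0, "salinity": 30.0, "wind": 5.0},
]
ranges = training_ranges(training)

# A severe rainstorm beyond anything in the training record.
typhoon_day = {"rain_3day": 300.0, "salinity": 26.0, "wind": 8.0}
flagged = out_of_range(ranges, typhoon_day)
if flagged:
    print("prediction extrapolates beyond training data for:", flagged)
```

A flagged forecast could then be withheld or reported with a warning rather than issued as a routine prediction.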

6. Concluding remarks

A comprehensive study of beach water quality prediction has been carried out for four representative beaches in Hong Kong. Statistical analysis of the extensive sparsely sampled (3–14 days) regular monitoring data was carried out for two periods before (1990–1997) and after (2002–2006) the commissioning of a major environmental infrastructure project, the Harbour Area Treatment Scheme (HATS). The data analysis shows that E. coli, the key beach water quality indicator, is strongly correlated with seven hydro-environmental variables: rainfall, solar radiation, wind speed, tide level, salinity, water temperature and past E. coli concentration. The relative importance of the parameters is beach-specific, and depends on the local geographical and hydrographical characteristics. Hence, beach water quality model parameters are necessarily beach-specific. The analysis also suggests that the correlations change over time in relation to changing distribution of pollution sources. Similar analysis has been carried out for all of Hong Kong’s beaches with similar conclusions reached (Lee et al., 2008).

Multiple Linear Regression (MLR) and Artificial Neural Network (ANN) models have been developed to predict the next-day E. coli concentration (in natural logarithmic scale) using the above hydro-environmental variables as input parameters. The models are developed using the 2002–2006 regular monitoring data. When compared with the 2002–2006 data, it is shown that the MLR model performs well in predicting water quality objective (WQO) compliance/exceedance, with an overall accuracy of 87%, 95%, 82%, and 85% for Big Wave Bay (BW), Deep Water Bay (DW), New Cafeteria (NC), and Silvermine Bay (SIL) respectively. The performance of the ANN model is similar, and tends to be better in predicting the high concentrations. Both models are superior to the current assessment criterion (relying on past five measurements) in predicting WQO exceedances. For example, the ANN model predicts 65%, 36%, 46%, and 51% of the observed WQO exceedances for BW, DW, NC, SIL respectively, as compared to 14%, 0%, 14% and 4% using the current criterion.

Fig. 9. Comparison of daily prediction of E. coli level using ANN and MLR models at the four representative beaches (August–September 2008). [Time series, 1 Aug to 30 Sep: E. coli (count/100 mL), rain (mm/d) and radiation (MJ/m2) against date; series: ClnEC5, data, MLR, ANN; panels: Big Wave Bay, Deep Water Bay, New Cafeteria, Silvermine Bay.]

The models developed from the sparsely sampled regular monitoring data are successfully validated using daily data for the bathing seasons of 2007 and 2008. When compared with the high frequency data, the models are able to track the dynamic changes in E. coli concentration at BW and NC during June–July 2007, with an overall accuracy of 85% and 70% respectively. More importantly, the models predict around 80% and 50% of the exceedances at BW and NC respectively, as compared to 0% and 14% based on the current beach advisory criterion. Similarly, the models perform well in predicting WQO compliance/exceedances during July–October 2008, with an overall accuracy of around 80–96%. Observed exceedances are predicted with success rates of 71%, 42%, and 53% at BW, NC, and SIL respectively, as compared with 0%, 0%, and 6% based purely on past data.

The extensive model validation demonstrates the practical feasibility of predicting bacterial concentration based on the critical hydro-environmental variables, in particular the use of the model in predicting WQO compliances. The approach can also be extended to study the compliance or exceedance of the 610 counts/100 mL threshold (‘very poor’ water quality). Table A.3 shows that both models can predict about 50% of the observed serious violations of WQO at BW, while the current criterion fails to detect any violation event. More details on the model performances can be found in Thoe (2010).

The present contribution paves the way for developing a daily beach water quality forecast system. Further work is required before the predictive models can be used for actual daily forecasting. For example, the beach salinity measurements are typically not available on a daily or real time basis. Deterministic process-based hydrological models may be developed in conjunction with ANN models to predict beach salinity as a function of rainfall. For beaches adjacent to significant point sources, the use of 3D deterministic models may also provide an alternative to purely data-driven models. Different models are being developed as part of an effort to develop a real time water quality forecast and management system for Hong Kong (Project WATERMAN: http://www.waterman.hku.hk/). These ongoing developments and field experiences with real time beach water quality forecasting will be separately reported.

Acknowledgement

This work is supported by the Hong Kong Jockey Club Charities Trust and in part by a grant from the University Grants Committee of the Hong Kong Special Administrative Region, China (Project No. AoE/P-04/04) to the Area of Excellence (AoE) in Marine Environment Research and Innovative Technology (MERIT). The assistance of the Hong Kong Environmental Protection Department, Hong Kong Observatory, and the Leisure and Cultural Services Department in this project (2007–2011) is gratefully acknowledged.

Appendix A. Additional beach water quality forecast model performance tables

Table A.1. Performance of MLR, ANN models and the current assessment criterion (ClnEC5) in daily prediction at Big Wave Bay and Deep Water Bay (July–October 2008).

Table A.2. Performance of MLR, ANN models and the current assessment criterion (ClnEC5) in daily prediction at New Cafeteria and Silvermine Bay (July–October 2008).

Table A.3. Performance of MLR, ANN models and the current assessment criterion (ClnEC5) in daily prediction at Big Wave Bay (June–July 2007).


References

Boehm, A.B., Weisberg, S.B., 2005. Tidal forcing of enterococci at marine recreational beaches at fortnightly and semidiurnal frequencies. Environmental Science and Technology 39, 5575–5583.

Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting & Control. Holden-Day Inc, San Francisco.

Chan, Y.M., 2010. Field and laboratory studies of E. coli decay rate at a coastal beach with reference to storm events. M.Phil. thesis, The University of Hong Kong.

Cheung, W.H.S., Chang, K.C.K., Hung, R.P.S., Kleevens, J.W.L., 1990. Health effects of beach water pollution in Hong Kong. Epidemiology and Infection 105, 139–162.

Choi, K.W., Lee, J.H.W., Kwok, K.W.H., Leung, K.M.Y., 2009. Integrated stochastic environmental risk assessment of the Harbour Area Treatment Scheme (HATS) in Hong Kong. Environmental Science and Technology 43, 3705–3711.

Crowther, J., Kay, D., Wyer, M.D., 2001. Relationships between microbial water quality and environmental conditions in coastal recreational waters: the Fylde Coast, UK. Water Research 35 (17), 4029–4038.

Environmental Protection Department (EPD), HKSAR, 2006. 20 Years of Beach Water Quality Monitoring in Hong Kong, 1986–2005.

Fujioka, R.S., Hashimoto, H.H., Siwak, E.B., Young, R.H.F., 1981. Effect of sunlight on survival of indicator bacteria in seawater. Applied and Environmental Microbiology 41 (3), 690–696.

Gameson, A.L.H., Gould, D.J., 1975. Effect of solar radiation on the mortality of source terrestrial bacteria in sea water. In: Gameson, A.L.H. (Ed.), Discharge of Sewage from Sea Outfalls. Pergamon Press, New York, pp. 209–219.

Haykin, S., 1994. Neural Networks. Macmillan.

He, L.M., He, Z.L., 2008. Water quality prediction of marine recreational beaches receiving watershed baseflow and stormwater runoff in southern California, USA. Water Research 42, 2563–2573.

Hose, G.C., Gordon, G.G., McCullough, F.E., Pulver, N., Murray, B.R., 2005. Spatial and rainfall related patterns of bacterial contamination in Sydney Harbour estuary. Journal of Water and Health 3 (4), 349–358.

Hou, D., Rabinovici, S.J., Boehm, A.B., 2006. Enterococci predictions from partial least squares regression models in conjunction with a single-sample standard improve the efficacy of beach management advisories. Environmental Science and Technology 40 (6), 1737–1743.

Larsen, T., 1992. Debate on uncertainty in estimating bathing water quality. Water Science and Technology 25 (9), 197–202.

Lee, J.H.W., 1986. Analysis and prediction of tides – A practical introduction. M.Sc Course Notes (Unpublished).

Lee, J.H.W., Harrison, P.J., Kuang, C.P., Yin, K.D., 2006. Eutrophication dynamics in Hong Kong coastal waters: physical and biological interactions. In: Wolanski, E. (Ed.), The Environment in Asian Pacific Harbors. Springer, Netherlands, pp. 187–206.

Lee, J.H.W., Choi, K.W., Thoe, W., Wong, H.C., 2008. Identification of Critical Factors Affecting the Bacteriological Water Quality of Hong Kong Beaches. Technical Report prepared for Hong Kong Environmental Protection Department, December 2008.

Nevers, M.B., Whitman, R.L., 2005. Nowcast modeling of Escherichia coli concentrations at multiple urban beaches of southern Lake Michigan. Water Research 39 (20), 5250–5260.

Olivieri, V., Kruse, C., Kazuyoshi, K., Smith, J., 1977. Microorganisms in Urban Stormwater. EPA-600/2-77-087. U.S. Environmental Protection Agency, Cincinnati, OH.

Olyphant, G.A., 2005. Statistical basis for predicting the need for bacterially induced beach closures: emergence of a paradigm? Water Research 39, 4953–4960.

Ostrom, C.W., 1978. Time Series Analysis: Regression Techniques. SAGE Publications, London.

Rosenfeld, L.K., McGee, C.D., Robertson, G.L., Noble, M.A., Jones, B.H., 2006. Temporal and spatial variability of fecal indicator bacteria in the surf zone off Huntington Beach, CA. Marine Environmental Research 61, 471–493.

Smith, P., Carroll, C., Wilkins, B., Johnson, P., Nic Gabhainn, S., Smith, L.P., 1999. The effect of wind speed and direction on the distribution of sewage-associated bacteria. Letters in Applied Microbiology 28 (3), 184–188.

Thoe, W., 2010. A daily forecasting system of marine beach water quality in Hong Kong. Ph.D. thesis, The University of Hong Kong.

United States Environmental Protection Agency, 1986. Ambient Water Quality Criteria for Bacteria – 1986. USEPA, Office of Water Regulations and Standards, Washington, DC.

Whitman, R.L., Nevers, M.B., 2004. Escherichia coli sampling reliability at a frequently closed Chicago beach: monitoring and management implications. Environmental Science and Technology 38 (16), 4241–4246.

Wymer, L.J., Brenner, K.P., Martinson, J.W., Stutts, W.R., Schaub, S.A., Dufour, A.P., 2005. The EMPACT Beaches Project: Results From a Study on Microbiological Monitoring in Recreational Waters. Technical Report 600/R-04/023. United States Environmental Protection Agency, Washington, DC.