Latest Correction 30-1-2015

"A COHERENT EXAMINATION OF RAINFALL AND FLOOD DATA IN SOME SELECTED SITES OF PAKISTAN"


BY
ABAD ALI
ROLL NO: 0563-M.Phil-STAT-2011
SESSION 2011-2013

DEPARTMENT OF STATISTICS
GC UNIVERSITY, LAHORE (PAKISTAN)

DECLARATION

I, Abad Ali, Roll No. 0563-M.Phil-STAT-2011, student of M.Phil in the subject of Statistics, Session 2011-13, hereby declare that the matter printed in the thesis titled "A COHERENT EXAMINATION OF RAINFALL AND FLOOD DATA IN SOME SELECTED SITES OF PAKISTAN" is my own work and has not been printed, published or submitted as a thesis in any university or research institute in Pakistan or abroad.

Dated: ______________ ABAD ALI

THESIS COMPLETION CERTIFICATE

Certified that the research work contained in this thesis titled "A COHERENT EXAMINATION OF RAINFALL AND FLOOD DATA IN SOME SELECTED SITES OF PAKISTAN" has been carried out and completed by Mr. ABAD ALI, Roll No. 0563-M.Phil-STAT-2011, under my supervision.

Supervisor

Dated: _____________

__________________________
Prof. Dr. Saleha Naghmi Habibullah
Visiting Professor
Department of Statistics, GC University Lahore

Submitted Through:

___________________
Mr. Jaffer Hussain
Chairperson, Department of Statistics
GC University Lahore

___________________
Controller of Examination
GC University Lahore

ACKNOWLEDGEMENT

None is worthy of praise except gracious ALLAH, Who created the worlds of numerous creatures in the capacity of Absolute Authority. Almighty Allah has opened new dimensions of knowledge for me and has led me to complete this task. All my respects to Almighty Allah's last Prophet HAZRAT MUHAMMAD (peace be upon him), who is the great mentor of the world. He enabled us to recognize the Creator of the world and to understand the philosophy of life.

I feel great pleasure in expressing, from the core of my heart, my gratitude to my Supervisor, who has been cooperative in all circumstances. I am extremely appreciative of her keen interest, motivational behavior, tolerance and inspiring guidance, which enabled me to surmount this uphill task. It has been a great honor for me to work under her supervision. Her comments and valuable suggestions have played a vitally important role.

I am also very thankful to Prof. Sam C. Saunders, Prof. Emer., Washington State University, USA, for his valuable comments, suggestions and guidance throughout the period of my research.

I gratefully acknowledge Mr. Jaffer Hussain, Chairperson, Department of Statistics, GC University, Lahore, for his polite, helpful, encouraging and motivational behavior, which helped me complete this task. Last but not least, I would like to thank my dearest mother and my family members.

ABBREVIATIONS

AEP     Annual Exceedance Probability
MOM     Method of Moments
ECDF    Empirical Cumulative Distribution Function
MLE     Maximum Likelihood Estimator
GOF     Goodness of Fit
EVT     Extreme Value Theory
GEVD    Generalized Extreme Value Distribution
IWD     Inverse Weibull Distribution
PPT     Phillips-Perron Test
MAPE    Mean Absolute Percentage Error
MAD     Mean Absolute Deviation
MSD     Mean Squared Deviation

TABLE OF CONTENTS

Chapter 1 Introduction
1.1 Preliminary Remarks
1.2 Global warming
1.3 Extreme Events
1.4 Precipitation
1.5 Rainfall
    1.5.1 Intensity Of Rainfall
    1.5.2 Rainfall measurement
    1.5.3 Rainfall impact on human life
    1.5.4 Effects on agriculture
    1.5.5 Effect on culture aspect
    1.5.6 Influence of rainfall on Pakistan
1.6 Seasonal Environment of Pakistan
    1.6.1 Winter
    1.6.2 Monsoon
    1.6.3 Pre Monsoon
    1.6.4 Post Monsoon
1.7 Records and forecasting of weather in Pakistan
    1.7.1 Hydraulic Structure
1.8 Flood
    1.8.1 Flood damaging
    1.8.2 Health Hazard
    1.8.3 Agriculture, Livestock, and Fisheries
    1.8.4 Education
    1.8.5 Energy
    1.8.6 Transport & Communication
    1.8.7 Environment
1.9 Area used for analysis

Chapter 2 Literature Review
2.1 Introduction

Chapter 3 Methodology
3.1 Introduction
3.2 Quantile
3.3 Exceedence Probability
3.4 Method of Estimations
3.5 Method of moments
3.6 Maximum Likelihood Method
3.7 Trend analysis by Graphical technique
3.8 Q-Statistics
3.9 Autocorrelation
3.10 Class of Distributions
    3.10.1 Extreme value Distribution
    3.10.2 Generalized Extreme value Distribution
    3.10.3 Exponential Distribution
    3.10.4 Gamma Distribution
    3.10.5 Normal Distribution
    3.10.6 Three parameters Log-Normal Distribution
    3.10.7 Logistic Distribution
    3.10.8 Nakagami Distribution
    3.10.9 Weibull Distribution
    3.10.10 Inverse Gaussian distribution
    3.10.11 Rayleigh Distribution
    3.10.12 Frechet Distribution
3.11 Goodness of fit test
3.12 Probability Plots
3.13 Utilization of Software

Chapter 4 Least Squares Analysis
4.1 Introduction
4.2 Linear Models
4.3 Least Square estimates
4.4 Missing data analysis
4.5 Analysis of Mangla Flood Peaks
4.6 Analysis of site Shahdara Flood Peaks
4.7 Analysis of site Balakot Rainfall Station
4.8 Fitting parabolic trend to Flood and Rainfall data
4.9 Measures of accuracy for time series data
4.10 Risalpur Rainfall Site
4.11 Dir Rainfall
4.12 Kohat Rainfall
4.13 Marala Rainfall Site
4.14 D.G. Khan Site
4.15 Terbela Flood Site
4.16 Muzafarabad
4.17 Correlation Between rainfall and flood peaks
    4.17.1 Lag Correlation
    4.17.2 Cross correlation

Chapter 5 Record values
5.1 Introduction
5.2 Probability Density Function of Upper Record Values
5.3 Probability Density Function of Lower Record Values
5.4 Properties of Record Values
5.5 Properties for Lower record values
5.6 Lower Record Values from IWD
5.7 Lower Record Values from Frechet distribution
5.8 Maximum Likelihood Method
5.9 Means and Variances

Chapter 6 Stationary Models
6.1 Introduction
6.2 Tests For The Detection Of Stationarity
6.3 Unit Root Test
6.4 Dickey And Fuller Test
6.5 Philips Perron (PP) Unit Root Test
6.4 Analysis Of The Stationary Time Series Data At Different Sites
    6.4.1 Marala Site Flood Peaks
    6.4.2 Tarbela Site
    6.4.3 Shahdara Site
    6.4.4 Mangla Site
    6.4.4 Muzafarabad Site
    6.4.4 Balakot Site
    6.4.4 Risalpur Site
    6.4.4 Kohat Site
    6.4.4 Dir Site

Chapter 7 Probability distributions
7.1 Introduction
7.2 Marala Flood Peaks
    7.2.1 Exponential Distribution
    7.2.2 Gamma Distribution
    7.2.3 Normal Distribution
    7.2.4 Log-Normal Distribution
    7.2.5 Logistic Distribution
    7.2.6 Nakagami Distribution
    7.2.7 Weibull Distribution
    7.2.8 Burr Distribution
    7.2.9 Inverse Gaussian Distribution
    7.2.10 Rayleigh Distribution
    7.2.11 Generalized Extreme Value Distribution
    7.2.12 Frechet Distribution
7.3 Shahdara Flood Peaks
7.4 Mangla Flood Peaks
7.5 Kohat
7.6 Muzafarabad Rainfall Site

Chapter 8 Expected loss
8.1 Introduction
8.2 A Decision Based upon Expected Loss
8.3 Pareto Distribution
8.4 Maximum Likelihood Method of Pareto Distribution
8.5 The variance and covariance of these MLE
8.6 Method of Moments
8.7 The variance and covariance matrix
8.8 Probability Weighted Moments

REFERENCES

CHAPTER 1
INTRODUCTION

1.1 Preliminary Remarks

Climate change is one of the hottest topics among the scientific community of the world today. In particular, the phenomenon of global warming is a cause for grave concern for meteorologists, oceanographers, and many other categories of scientists. United Nations bodies (such as the World Health Organization) and national organizations (such as the US Environmental Protection Agency) are investigating the risks to the inhabitants of this world due to global warming.

One very important area of concern linked with global warming is the occurrence of flooding that is liable to cause loss of life and heavy damage to property. In particular, people in developing countries suffer heavily from the damage caused by excessive rain and flooding. Pakistan, for example, has experienced a number of floods during the past few decades, some of which caused severe loss of life and damage to property.

1.2 Global warming

The phenomenon of global warming has been taking place on planet Earth for the past 15,000 years. Global warming is the gradual increase in the temperature of the planet and is commonly attributed to the greenhouse effect. Greenhouse gases (e.g. carbon monoxide, carbon dioxide, sulphur dioxide, etc.) are the main cause of global warming. These gases are released by human beings, industries and vehicles. According to scientists, these gases severely affect the atmosphere. They have made a hole in the ozone layer, which protects the earth against the ultraviolet rays released by the sun. As a result, the earth receives more heat, which causes global warming.

In the opinion of a vast majority of scientists today, this undesirable phenomenon is the result of industrialization, a significant increase in the human population and other factors for which we ourselves are responsible.

Global warming exerts a variety of negative effects on the planet and its inhabitants, such as reduction of territories, damage to marine ecologies, destruction of seasonal insects and many others. Extreme events such as severe storms, cyclones and hurricanes can cause enormous damage and destruction of infrastructure. People may experience increased water loss from reservoirs due to dryness, long summers and short winters, as well as extreme temperatures in both summer and winter. Sometimes the situation may become very grave, as in the case of severe famines.

1.3 Extreme Events

An event is considered extreme if its magnitude differs greatly from its normal value; examples include inundation, droughts and earthquakes. Extreme events affect human life and property to a large extent. As such, attempts at accurate modeling of extreme events carry great significance for protecting human life and property.

During the past few decades, researchers have been greatly interested in studying both lower and upper extreme events. The smallest value and the largest value of a data set are both studied as extreme events. Record values (lower and upper) from several probability distributions have also been analyzed.

1.5 Rainfall

Precipitation in liquid form is called rain. Surface water on the earth's crust is evaporated by the sun's rays and forms clouds, from which it returns to the earth's surface as drops. This type of precipitation is called rainfall.

1.5.1 Intensity of rainfall

Rainfall has a broad effect on socio-economic life and human culture, so it is necessary to measure it. Intensity levels are classified as follows. Rain is rated light at up to 0.098 inches per hour, moderate between 0.098 and 0.3 inches per hour, heavy between 0.3 and 2 inches per hour, and extreme above 2 inches per hour.
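The intensity classes above amount to a simple threshold lookup. A minimal sketch in Python follows; the function name and the treatment of rates that fall exactly on a boundary (assigned here to the higher class) are assumptions made for illustration and are not part of the classification as stated.

```python
def classify_rainfall_intensity(rate_in_per_hr: float) -> str:
    """Map an hourly rainfall rate (inches/hour) to an intensity class.

    Thresholds are the ones quoted in the text (0.098, 0.3 and 2 in/h).
    Assumption: a rate exactly equal to a threshold goes to the higher class.
    """
    if rate_in_per_hr < 0:
        raise ValueError("rainfall rate cannot be negative")
    if rate_in_per_hr < 0.098:
        return "light"
    if rate_in_per_hr < 0.3:
        return "moderate"
    if rate_in_per_hr < 2.0:
        return "heavy"
    return "extreme"


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    for rate in (0.05, 0.2, 1.0, 2.5):
        print(rate, classify_rainfall_intensity(rate))
```

Keeping the thresholds in one function makes it easy to swap in a different classification if another standard is preferred.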

1.5.2 Rainfall measurement

Sectors such as industry, forestry and agriculture require prompt updates of rainfall measurements. A standardized rain gauge is used to measure rain and snow. A rain gauge is a device that measures the depth of precipitation, in millimeters, falling on a unit area (m²). One millimeter of rainfall over one square meter corresponds to one liter of water. Rainfall depth is also reported in inches. A rain gauge consists of a funnel whose upper end is open and which drains into a storage beaker, where the depth of the collected water is measured.

The Pakistan Meteorological Department measures rainfall at different stations and keeps records on a daily, weekly, monthly and annual basis.
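The relation between rainfall depth and volume follows from unit conversion: a depth of 1 mm over 1 m² is 0.001 m³, which is 1 liter. The short sketch below works this out; the function names are hypothetical, and the inch conversion uses the standard factor of 25.4 mm per inch.

```python
MM_PER_INCH = 25.4  # standard conversion factor


def inches_to_mm(depth_in: float) -> float:
    """Convert a rainfall depth from inches to millimeters."""
    return depth_in * MM_PER_INCH


def rainfall_volume_litres(depth_mm: float, area_m2: float) -> float:
    """Volume of water (liters) from a rainfall depth (mm) over an area (m^2)."""
    depth_m = depth_mm / 1000.0    # mm -> m
    volume_m3 = depth_m * area_m2  # m * m^2 = m^3
    return volume_m3 * 1000.0      # 1 m^3 = 1000 liters


if __name__ == "__main__":
    # 1 mm of rain on 1 square meter
    print(rainfall_volume_litres(1.0, 1.0))
    # A hypothetical 2-inch fall on a 10 m^2 roof
    print(rainfall_volume_litres(inches_to_mm(2.0), 10.0))
```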

1.5.3 Rainfall impact on human life

Rainfall is a natural consequence of climate and is especially important for areas that are far from irrigation systems. It has a marked effect on human beings, including their moods, celebrations, socio-economic activities and poetry.

1.5.4 Effects on agriculture

Rainfall is a natural phenomenon that has a significant impact on human existence. Plants need rainfall at regular intervals to survive and grow. Excessive irregularity in rainfall patterns adversely affects the agriculture sector and its allied socio-economic sectors, such as irrigation, grain storage and fodder.

1.5.5 Effect on culture aspect

Rainfall has a great effect on the socio-cultural aspects of life, which are closely related to the economy. A society with strong economic foundations tends to have a more developed culture. Agricultural economies are largely affected by rainfall patterns, so their lifestyles and culture are affected as well. Rainfall also has a direct influence on the behavior and moods of people. A country that receives excessive rain considers sunshine a blessing, while a dry land considers a single drop of rain a great blessing. Both the absence of rain and the occurrence of floods have considerable psychological and social effects. Poetry, music, attitudes and literature are also influenced by rain. Overall, rain as a weather event has a great impact on human life.

1.5.6 Influence of rainfall on Pakistan

Pakistan is a country of diversity whose climate is, on average, hot and arid. It is an agricultural country lying between latitudes 24°N and 37°N and longitudes of approximately 61°E and 77°E. Some areas receive heavy rainfall, some moderate rainfall and some light rainfall. Monsoon areas get heavy rainfall during the monsoon season, and floods are common because of the lack of proper management of resources.

1.6 Seasonal Environment of Pakistan

Pakistan has a variety of weather of every type. It has the following four seasons:
(i) Winter
(ii) Pre-Monsoon
(iii) Monsoon
(iv) Post-Monsoon

1.6.1 Winter

The winter season in Pakistan generally lasts from December to March, but it varies across different areas of the country. Some areas are very cold and others only slightly cool. The Himalayan region receives heavy snowfall in winter.

1.6.2 Monsoon

The monsoon is well known in Asia for the change in climate and the rainfall it brings. The word monsoon is derived from the Arabic word mawsim and the Portuguese word monção. Monsoon winds enter Pakistan from the Indian Ocean and the Arabian Sea. They cause heavy rain and storms in the affected areas of Pakistan and are also a cause of floods. The monsoon season lasts from July to September and peaks in August.

1.6.3 Pre Monsoon

Pakistan usually experiences dry and hot weather in summer. In one respect this hot season is harmful for Pakistan, because it melts the ice on glaciers, which causes heavy floods such as the flood of 2005. The monsoon starts at the end of the summer season, which is why summer is also called the pre-monsoon. The summer season runs from April to June.

1.6.4 Post Monsoon

The monsoon ends at the end of September. The post-monsoon season is very short, covering October and November. A little rain is received in this period, which is very useful for agriculture.

1.7 Records and forecasting of weather in Pakistan

Proper planning of any kind of scheme or structure depends on proper forecasting of the relevant events. The Pakistan Meteorological Department is responsible for weather records and forecasting. In 2010 the highest temperature in Pakistan was recorded on 26 May at Mohenjo-daro, in the province of Sindh. It was the highest reliably measured temperature ever recorded in Asia.

1.7.1 Hydraulic Structure

Pakistan possesses many kinds of land, including mountains, rivers, deserts and cultivated land. It is classified into two main regions according to geographical importance: the first is the Indus basin and the second consists of the dry areas. The irrigation system of Pakistan is based on several rivers, including (i) Sutlej, (ii) Jhelum, (iii) Chenab, (iv) Indus and (v) Ravi. The hydraulic structure of Pakistan depends upon dams, weirs, barrages, rivers, lakes, etc.

Pakistan faces a dual problem: scarcity of rainfall on the one hand and excessive rainfall and floods on the other. In some cultivated areas of Pakistan, rainfall is the best source of irrigation. However, the lack of methods for storing surplus rainwater leads to large floods.

Two main categories of causes drive climate change under global warming: (i) natural variability and (ii) human activities. Greenhouse gases, the use of fossil fuels, properties of the land surface, aerosols and natural phenomena are the major causes of global warming. As a result of the large rise in temperature, glaciers are melting rapidly, and the resulting overflow of water causes floods.

1.8 Floods

Floods are among the most destructive natural catastrophes and occur in many parts of the world. They are recognized as among the costliest natural hazards, with a high tendency to destroy property as well as human life. They are very hard to predict because many human and natural factors are involved in their occurrence. Excessive rainfall creates an extreme situation that leads to heavy losses of life and property, as the excess water turns into a flood. Pakistan has faced high-intensity floods and flood damage over the last few decades.

1.8.1 Flood damaging

The monsoon in Pakistan can be very severe. Even when a very low average rainfall is forecast, a huge amount of rain can fall, and heavy rainfall is observed every year in mid-August in the southern areas of Pakistan. The maximum rainfall appears at the beginning of July and continues until the last week of September.

1.8.2 Health Hazard

The health infrastructure in rural areas serves every kind of health need and provides basic first aid. This infrastructure is damaged by rainfall. Basic health units and rural health centers suffer the most damage, and millions of dollars invested in the health sector are lost. The excess of rainfall and flooding therefore needs to be investigated together with measures of health hazards.

1.8.3 Agriculture, Livestock, and Fisheries

Agriculture has a central role in the growth of the economy. Being a primary activity, it engages a large manual workforce. The livelihood of most of Pakistan's population depends directly or indirectly on agriculture. Wheat, the main Rabi crop, is the staple food of a major portion of the population. Fruits include grapes, citrus and mangoes, and vegetables include potato, tomato, chilies and onion. Livestock is an integral part of the agricultural scene. Buffalo and cattle are the main sources of milk, meat and hides, as well as draft power. Fodder crops include wheat straw and maize thinnings.

1.8.4 Education

The education sector has a significant effect on the economy and the development of any country. The educational institutions in these areas (schools, madrasas and colleges) are spread over large distances. They have been constructed without regard to safety against floods, heavy rainfall and earthquakes. A large proportion of these institutions have been completely or partially affected by floods. In the flood of 2011, 4096 educational institutions were damaged in Sindh and Baluchistan. In Sindh, out of a total of 3892 damaged institutions, 1032 were girls' schools that were completely or partially damaged.

1.8.5 Energy

The energy sector is considered the backbone of any country. Many large and small units work to meet energy requirements for multiple uses. In Pakistan the following types of units serve this purpose:
(i) Thermal Plants
(ii) Hydro-electric plants
(iii) Small Nuclear Plants

Most of the energy supply depends on hydrological resources. Heavy rainfall and floods cause huge damage to these units as well. Well-planned policies can protect much of this infrastructure and reduce the cost of damage.

1.8.6 Transport & Communication

Communication and transportation have been a necessity in the past as well as in the present. The modern world requires a global network to improve the basic infrastructure of any community. A large number of modern means of communication and transportation exist. The fundamental ones in Pakistan are as follows:
(i) Roads
(ii) Railway Lines
(iii) Airports

The total area of Pakistan is 796,095 sq km. It has 259,618 km of roads and 7,791 km of railway lines, and there are 42 airports. The flood of 2011 destroyed communication and transport infrastructure, including coastal highways, roads and the railway network. In the two affected provinces, five districts of Baluchistan and eighteen districts of Sindh suffered heavy destruction of this infrastructure.

1.8.7 Environment

The environment provides the foundation for every society, and it is highly affected by extreme changes. Pakistan is already facing a complex set of environment-related diseases.

Objectives of the Study

The objective of this study is to identify appropriate statistical distributions for forecasting extreme rainfall and floods in coming years. The main objectives of this research are:
i. To examine the trend and pattern of weather change
ii. To assist in the prediction of flooding under the projected climate change anticipated during this century due to global warming
iii. To assess the suitable statistical distributions for flood and rainfall data

1.9 Area used for analysis

Pakistan receives very little rainfall in most areas, especially those located below latitude 32 degrees. The study area consists mostly of the northern sites of Pakistan and Azad Jammu & Kashmir. The data were gathered from the Meteorological Department of Pakistan. We selected different rainfall and flood sites for this analysis. Although the rainfall sites are not very close to the flood sites, all of them lie within a region that covers both the rainfall and the flood areas.

This study is useful for predicting the worst rainfall in coming years. It is based on rainfall data collected and compiled with the contribution of the Meteorological Department of Pakistan. Pakistan is an agricultural country. Its geographical extent in latitude and longitude is stated in Section 1.5.6. Its climate is hot and arid; however, there is vast diversity in its climate. Some areas of Pakistan receive high rainfall, some moderate rainfall and some very little. Pakistan's monsoon regions usually receive heavy rainfall during the monsoon period, which results in floods due to the lack of proper water resource management and planning.

1.10 Geographical positions of the sites

S/N | Site | Latitude N | Longitude E | Elevation | Years | Length | Site Description
1 | Balakot | 34.33 | 72.21 | 995.40 m | 1977-2012 | 36 | Rainfall
2 | Dir | 35.12 | 71.51 | 1375.0 m | 1977-2012 | 36 | Rainfall
3 | Kohat | 33.35 | 71.26 | 489 m | 1977-2012 | 36 | Rainfall
4 | Mangla | 33.14 | 73.64 | 147 m | 1925-2013 | 89 | Flood
5 | Marala | 32.67 | 74.46 | 250 m | 1925-2012 | 88 | Flood
6 | Muzafarabad | 34.22 | 73.29 | 702 m | 1955-2010 | 56 | Rainfall
7 | Risalpur | 34.04 | 71.58 | 1014 m | 1977-2012 | 36 | Rainfall
8 | Shahdara | 34.15 | 73.49 | 820 m | 1925-2012 | 88 | Flood
9 | Terbela | 34.74 | 72.48 | 148 m | 1977-2012 | 36 | Flood

Table 1.3 Route map from Terbela dam to Mangla dam through Dir, Risalpur, Kohat and Balakot

Here point A indicates the Terbela flood site, point B the Dir rainfall site, point C the Risalpur rainfall site, point D the Kohat rainfall site, point E the Balakot rainfall site and point F the Mangla flood site.

CHAPTER 2
LITERATURE REVIEW

2.1 Introduction

Global warming is currently a major issue all around the world. Glaciers are melting year by year because of warmer temperatures. Changes in temperature give rise to extreme events such as extreme temperatures, heavy rainfall and high floods. Increasing rainfall intensity leads to higher flood levels.

In the first half of the 20th century, Professor Gumbel was the first to suggest the application of the extreme value distribution. Gumbel (1941) used the extreme value distribution for the empirical analysis of meteorological variables (annual flood flow, maximum precipitation, etc.). Statisticians and engineers have used it frequently since.

Huff and Neill (1959) used maximum rainfall magnitudes in Illinois to compare five statistical distributions. They analyzed annual and seasonal maxima for periods of 1 to 10 days, using 30 stations with 40 years of data each, and computed useful statistical results. The method of moments and the least squares method were compared, and the difference between their results was found to be insignificant.

Hershfield (1962) investigated the annual maximum series (AMS) of 24-hour rainfall in the USA. He used the Gumbel distribution, which appeared to fit well and gave significant results.

Alexander (1963) used the method of storm transposition to estimate the frequency of rare events.

Markovic (1965) fitted five probability distributions to annual precipitation and river flow data from 2506 gauge stations in Canada and the USA. The five candidate distributions were the normal, the two-parameter lognormal, the two-parameter gamma, the three-parameter lognormal and the three-parameter gamma. He found no significant difference between the results for the gamma and lognormal distributions, and no significant difference between the gamma distribution with fewer parameters and the three-parameter form.

Dickinson (1976) applied extreme value distributions to rainfall data from three stations in Southern Ontario in order to develop useful extreme value distributions for rainfall. He also suggested analyzing seasonal patterns to estimate rainfall runoff.

Landwehr et al. (1979) compared three methods of parameter estimation that are of great interest in inferential statistics. The method of moments, maximum likelihood and the probability weighted moments (PWM) method were applied to the Gumbel distribution and compared. They found that PWM gave a good fit and performed best of the three methods.
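To make the comparison of estimation methods concrete, the sketch below fits a Gumbel (maximum) distribution with location mu and scale beta by two of the three methods, using the standard textbook relations: for the method of moments, beta = s·sqrt(6)/pi and mu = mean − 0.5772·beta; for PWM, beta = (2·b1 − b0)/ln 2 and mu = b0 − 0.5772·beta, where b0 and b1 are the usual unbiased sample probability weighted moments. This is an illustrative sketch, not the procedure of the cited paper; the function names are hypothetical, and maximum likelihood is omitted because it needs an iterative solution.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_mom(sample):
    """Method-of-moments estimates (location, scale) for a Gumbel (max) law."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    scale = sd * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_pwm(sample):
    """Probability-weighted-moments estimates (location, scale) for a Gumbel (max) law."""
    xs = sorted(sample)  # ascending order statistics
    n = len(xs)
    b0 = sum(xs) / n
    # Unbiased b1: weight (rank - 1) / (n - 1) on the rank-th smallest value.
    b1 = sum(i * x for i, x in enumerate(xs)) / (n * (n - 1))
    scale = (2.0 * b1 - b0) / math.log(2.0)
    loc = b0 - EULER_GAMMA * scale
    return loc, scale


if __name__ == "__main__":
    # Synthetic check: Gumbel quantiles on a regular probability grid,
    # with hypothetical true values location = 100 and scale = 20.
    n = 200
    sample = [100.0 - 20.0 * math.log(-math.log((i + 0.5) / n)) for i in range(n)]
    print("MOM:", gumbel_mom(sample))
    print("PWM:", gumbel_pwm(sample))
```

On real annual maxima the two methods generally give different estimates, and comparing them against observed quantiles is one way to judge which fits better.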

In 1984, Stern and Coe fitted non-stationary Markov chains to the occurrence of rainfall. They fitted the Gamma distribution to rainfall totals, with parameter values varying with the time of year. The fitted models gave results useful for prediction and planning.

In 1992, Haktanir applied thirteen different distributions to annual flood peak series of more than 30 observations taken from 45 unregulated streams in Anatolia. The parameters of these distributions were mostly estimated by the methods of maximum likelihood, moments and probability weighted moments (Haktanir, 1992).

The World Meteorological Organization (1989) published a summary report intended for policy makers and engineers. The report discussed different methodologies for estimating extreme events and the use of different distributions for the data.

Aksoy (2000) suggested that the gamma distribution is appropriate for daily rainfall analysis. He used Markov chains to determine dry and wet days and used the Gamma distribution to generate sequences of daily rainfall data.

Koutsoyiannis & Baloutsos (2000) investigated the longest available record in Greece, covering 136 years of annual maximum rainfall data. They advised that the type 1 extreme value distribution, as conventionally used, was not suitable, and suggested that the generalized extreme value distribution is much better for the analysis of record values and is a good predictor for return periods.

Park et al. (2001) studied maximum summer rainfall in South Korea using sixty-one gauge stations. They estimated the Wakeby distribution by L-moments and estimated quantiles for different return periods. They also produced isopluvial maps of the estimated design values for different return periods.

Kuczera (2001) carried out a comprehensive study of at-site flood frequency data. He used a Monte Carlo Bayesian method to estimate the confidence limits of quantiles and the expected probability distribution for any flood frequency distribution.

Pathak (2001) carried out a frequency analysis of short-duration rainfall in the South Florida Water Management District. The data covered the period from January 1, 1900 to December 31, 1999. He used one-day, three-day and five-day periods of maximum rainfall.

Zalina et al. (2002) compared eight candidate distributions to find the best and most reliable estimator of annual maximum rainfall in Peninsular Malaysia. They applied extrapolation of quantiles and a goodness of fit test and found that the generalized extreme value distribution fits the data well.

Coles et al. (2003) used Bayesian inference to model daily rainfall data. They found a distribution that can predict extreme rainfall in coming years.

Ware & Lad (2003) used two different methods (frequentist and conventional) to compare the precision and accuracy of regional and at-site analysis of flood quantiles for the Waimakariri River. They found that the frequentist method performed well compared with estimates from the conventional method.

Ahsanullah, Chan and Balakrishnan (1993) discussed recurrence relations between the product moments of the extreme value distribution based upon record values. They also obtained the product and single moments from these relations in a very simple way.

Kamps (1995) investigated order statistics and record values. He found some useful results that yield explicit expressions and recurrence relations for the moments of generalized order statistics of the Pareto, power function and Weibull distributions. He introduced the idea of generalized extreme values and their properties, and derived the joint density function of the first r and the nth uniform generalized order statistics. He also proposed necessary and fundamental conditions for the existence of moments of generalized order statistics.

Brabson and Palutikof (1999) carried out a frequency analysis of extreme wind speeds on five Scottish islands. They applied the Generalized Pareto Distribution to the data and found that the distribution fails in the presence of non-stationarity in the wind speeds.

Pawlas and Szynal (2000) discussed necessary conditions for the characterization of the inverse Weibull and generalized extreme value distributions by means of the moments of kth record values.

In 2006, Thompson et al. first introduced a global index for earthquakes similar to the index flood method. They used data from 46 regions around the globe and showed, using a goodness of fit test and the L-moments diagram, that GPA and GEV give the best fit to the magnitude and annual maximum series.

Soliman, Abd Ellah and Sultan (2006) investigated the Bayesian analysis of the two-parameter Weibull distribution on the basis of record values. They obtained the Bayes and maximum likelihood estimators from record values and also discussed the hazard and reliability functions.

(Change, 2007) has reported that the eleven years from 1996 to 2006 were the warmest years. Inthe third assessment report he had compared two intervals of the years to see the change in linear trend. He found that from the years 1906 -2005 has a 0.74 linear trend which is increased from 0.560C to 0.920C and in the interval 1901 to 2000 years had received 0.60C change in temperature. There is double linear warming trend in temperature of last half century as compared to the change in temperature measured in last century.

Kao (2008) conducted an at-site frequency analysis of rainfall by using hourly precipitation dat for 53 gauging station in Indiana. A combination of generalized extreme value distribution and extreme value type-1 distribution was used to find at-site estimates and these estimates, at site and regionals were compared. He found that the regional estimates were not better as compared to at-site estimates.Khan et al. (2008) found that the Frechet distribution is very useful and flexible having a property to converge in different distributions. They used Monte Carlo Simulation to compare the shape and scale parameters. They also gave some important results between the relation of shape parameter to the Mode, Mean, Median, Variance, Coefficient of variance, Kurtosis and Skewness. They also used mathematical and graphical technique for theoretical analysis of Frechect distribution. Sultan (2008) used Bayesian and Maximum Likelihood method to estimate the parameters of Frechet distribution. He worked out with two different cases, one is estimation of both parameter (shape and scale) being unknown and other is keeping location parameter as known. He estimated the hazard function, survival rate and made a comparison between mean square errors which is estimated by simulation method.Kwon et al. (2008) used the gumble mixed model to analyze the bivariate storm frequency analysis. They used hourly rainfall data collected for 34 years at Jecheon station in Korea. They estimated the bivariate return periods, joint return periods and conditional return periods of storm events.Jakob et al. (2009) investigated the pattern of extreme rainfall in Sydney Australia in the absence of stationary in the data. The data was based on the years 1921- 2005 of Sydney Observatory Hills. He found that the rainfall pattern in Australia showed a large amount of variation both seasonally and spatially.

In 2009, the NOAA National Climatic Data Center compiled a climate report indicating that the temperature had increased over the last few decades. The report was based on the work of 300 scientists from 160 research groups all over the world. They used ten indicators to examine the behavior of the weather and temperature. Seven of these ten indicators were found to be increasing, which indicates increasing temperature during the last few decades.

Table 1.2 Indicators of Warming World

Sr #   Indicator                       Increasing   Decreasing
1      Air temperature near surface    Yes          No
2      Humidity                        Yes          No
3      Snow cover                      No           Yes
4      Temperature over oceans         Yes          No
5      Sea surface temperature         Yes          No
6      Sea ice                         No           Yes
7      Ocean heat content              Yes          No
8      Temperature over land           Yes          No
9      Glaciers                        No           Yes
10     Sea level                       Yes          No

Hamdi (2011) presented a new approach to estimating droughts in Tunisia. He conducted a frequency analysis which used the deceedance probability for the first time. Historical information was fully evaluated in the study. A combination of the Weibull formula and the log-normal III distribution was considered the best fit to the data.

Weiss et al. (2013) studied the probabilistic patterns of extreme storm surges along the French coasts of the Atlantic Ocean. They formed homogeneous regions, based on common storm events, to predict the occurrence of extreme storm surges. The regions included the Eastern and Western English Channel regions and the Atlantic region.

Kantima Meeyaem et al. (2014) proposed a hybrid model for flood forecasting using a case study area. They discussed three models based on mathematical techniques, namely the Gumbel distribution function, drainage density and the Muskingum method. They found that hybrid models are very useful for flood prediction, especially with small data sets.

Chapter No 3
Methodology

3.1 Introduction

The purpose of this analysis is to evaluate the relationship between the length and intensity of extreme events (such as heavy rainfall and severe floods) and the chances of these events happening. In this study we have taken the annual flood peaks of four dams of Pakistan and annual rainfall data of different stations. Most of these rainfall stations are located near the dams. The empirical data analysis has been done using descriptive summaries and graphical techniques. The graphical techniques (probability plots, histograms, density graphs, empirical distribution functions, etc.) can be used to find the probability density function suitable for the relevant data.

3.2Quantile

Quantiles are of vital importance in statistics. They split a distribution into as many parts as the researcher needs. They are particularly important in hydrology for estimating the intensity of rainfall, floods, droughts and storms, which provides the basis for future planning policies and hydraulic designs. A quantile is the value of an extreme event that has a particular probability of exceedence and a specified return period.

3.3Exceedence Probability

It is simply the probability that a particular event will exceed a specific value. In frequency analysis of flood peaks, the probability that the flood height exceeds the available capacity is known as the exceedence probability. The Annual Exceedence Probability (AEP) is the chance that a natural hazard event (such as a rainfall or flooding event) occurs within a year; it is mostly expressed as a percentage. Extreme flood events are exceeded only rarely, so such events have a smaller annual probability. It is denoted by

Where k is the rank of the observed values and n is the total observations.
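As an illustration of how an exceedence probability can be obtained from ranked data, the sketch below assumes the Weibull plotting position p = k / (n + 1), with k the rank in descending order and n the number of observations; other plotting positions exist and may be the one intended here. The function name and the sample peaks are hypothetical.

```python
def exceedance_probabilities(peaks):
    """Return (value, rank, probability, return_period) tuples, largest value first.

    Assumption: Weibull plotting position p = k / (n + 1), where k is the
    rank of the value in descending order and n is the number of observations.
    Ties are not treated specially in this sketch.
    """
    n = len(peaks)
    ordered = sorted(peaks, reverse=True)
    rows = []
    for k, value in enumerate(ordered, start=1):
        p = k / (n + 1)              # empirical exceedence probability
        rows.append((value, k, p, 1.0 / p))  # return period T = 1 / p
    return rows


# Hypothetical annual flood peaks, for illustration only.
sample_peaks = [310.0, 455.0, 280.0, 520.0, 390.0, 610.0, 345.0, 300.0, 475.0]
table = exceedance_probabilities(sample_peaks)
for value, k, p, t in table:
    print(f"{value:8.1f}  rank={k}  AEP={p:.2f}  return period={t:.1f} years")
```

The largest observation receives the smallest exceedence probability and hence the longest return period, which matches the remark that extreme events have a smaller annual probability.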

3.4 Method of Estimations

There are many methods and advanced techniques available for estimating parameters. Different researchers use different techniques according to their convenience. The following methods are commonly used:
(i) Method of moments
(ii) Maximum likelihood method
(iii) Probability weighted moments

3.4.1Method of moments

The method of moments (MOM) is one of the oldest estimation techniques. It was developed and used by Karl Pearson (1857-1936). In this method the sample (raw) moments are equated to the corresponding population moments. The non-central rth population moment is calculated as

$$\mu'_r = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx$$

for a continuous random variable x. And the sample moments are

$$m'_r = \frac{1}{n}\sum_{i=1}^{n} x_i^r$$
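A minimal sketch of the method of moments, using the two-parameter gamma distribution as an example. It assumes the shape-scale parameterisation, for which the mean is shape × scale and the variance is shape × scale², so equating the first two sample raw moments to the population moments gives closed-form estimates. The data are synthetic and the function names are illustrative.

```python
import random


def raw_moment(sample, r):
    """r-th sample raw (non-central) moment: (1/n) * sum(x_i ** r)."""
    return sum(x ** r for x in sample) / len(sample)


def gamma_method_of_moments(sample):
    """Estimate gamma (shape, scale) by matching the first two raw moments."""
    m1 = raw_moment(sample, 1)
    m2 = raw_moment(sample, 2)
    variance = m2 - m1 ** 2          # second central moment from raw moments
    shape = m1 ** 2 / variance
    scale = variance / m1
    return shape, scale


# Synthetic data with known parameters (shape 2, scale 50), illustration only.
random.seed(1)
synthetic = [random.gammavariate(2.0, 50.0) for _ in range(20000)]
shape_hat, scale_hat = gamma_method_of_moments(synthetic)
print(f"shape = {shape_hat:.3f}, scale = {scale_hat:.3f}")
```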

3.4.2Maximum Likelihood Method

Maximum likelihood is a well-known and frequently used method for estimating unknown parameters. Let $L(\theta) = \prod_{i=1}^{n} f(x_i;\theta)$ be the likelihood function of the random variables $x_i$ based on the unknown parameter $\theta$. Let $\hat{\theta}$ be a function of the sample observations, i.e. $\hat{\theta}$ is the value of $\theta$ that maximizes $L(\theta)$; then $\hat{\theta}$ is called the ML estimator of $\theta$. The MLE can be found by solving

$$\frac{\partial L(\theta)}{\partial \theta} = 0$$

or

$$\frac{\partial \log L(\theta)}{\partial \theta} = 0 \tag{3.4.1}$$

Equation (3.4.1) gives the maxima or minima of the likelihood function.

The functions $L(\theta)$ and $\log L(\theta)$ give the same value of the parameter, and the log-likelihood function is often much easier to maximize than the likelihood function. The Cramer-Rao inequality can be used to find a lower bound for the variance of the MLE.

The important properties of the MLE are as follows:
(i) ML estimators are usually consistent; under some regularity conditions, when the sample size becomes large enough, i.e. $n \to \infty$, a unique consistent MLE exists.
(ii) The ML estimator is approximately normally distributed when n becomes very large.
(iii) ML estimators are efficient and consistent as n increases.
(iv) A sufficient estimator is always found by the ML method if a sufficient estimator exists.
(v) The ML estimator may be biased, and can be moderately biased when the sample size is quite small; in that case a lack of normality may also be suspected.
(vi) The likelihood is always positive or zero, i.e. $L(\theta)$ cannot be negative.
(vii) ML estimators are invariant under some functional transformations.
(viii) The MLE can only be found when the density function is available.
(ix) The MLE may or may not be unique. If $\hat{\theta}$ is an MLE of $\theta$ and $h(\theta)$ is a one-to-one function of $\theta$, then $h(\hat{\theta})$ is an MLE of $h(\theta)$.
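To make the likelihood equation concrete, the sketch below works through the exponential distribution, chosen only because its MLE has a closed form: setting the derivative of the log-likelihood n log(rate) - rate × Σx to zero gives rate = n / Σx. The data are synthetic and the function names are illustrative.

```python
import math
import random


def exponential_log_likelihood(rate, sample):
    """Log-likelihood of an exponential sample: n*log(rate) - rate*sum(x)."""
    return len(sample) * math.log(rate) - rate * sum(sample)


def exponential_mle(sample):
    """Solution of d(log L)/d(rate) = 0, i.e. rate = n / sum(x)."""
    return len(sample) / sum(sample)


# Synthetic waiting times with true rate 0.5, for illustration only.
random.seed(7)
waits = [random.expovariate(0.5) for _ in range(5000)]
rate_hat = exponential_mle(waits)

# The log-likelihood at the MLE should exceed its value at nearby rates.
ll_hat = exponential_log_likelihood(rate_hat, waits)
ll_lower = exponential_log_likelihood(0.9 * rate_hat, waits)
ll_upper = exponential_log_likelihood(1.1 * rate_hat, waits)
print(f"rate_hat = {rate_hat:.4f}")
print(ll_hat > ll_lower and ll_hat > ll_upper)
```

For distributions without a closed-form solution, the same log-likelihood function would be maximized numerically instead.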

3.5 Trend analysis by Graphical technique

In time series analysis, the long-term movement is called a trend. A trend can be of two types: (i) increasing with the passage of time or (ii) decreasing with time. A time plot is used to see whether the series is increasing or decreasing. A time plot is a simple graph of the observed values against time.

3.6 Q-Statistics

The Ljung-Box Q-statistic at lag k is a test statistic for the null hypothesis that there is no autocorrelation up to order k. It is calculated as

$$Q = n(n+2)\sum_{j=1}^{k}\frac{r_j^2}{n-j}$$

where $r_j$ is the j-th autocorrelation and n is the number of observations. If the series is not based on the results of ARIMA estimation, then under the null hypothesis Q is asymptotically distributed as a $\chi^2$ with degrees of freedom equal to the number of autocorrelations.

3.7 Autocorrelation

There are three types of data used for empirical analysis:
(i) Cross section
(ii) Time series
(iii) Pooled

Pooled data is a combination of time series and cross-sectional data. Time series data is an important type of data for any empirical analysis. A number of assumptions have been made for developing models on these types of data.

The examination of the relation between two or more sets of variables has always been of great interest to investigators. This kind of relationship is considered as correlation among those variables. The correlation between two random variables (X, Y) is the interdependence between them.

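The correlation is measured by using the formula:

$$\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X\,\sigma_Y}$$

For sample data it is calculated by using the formula:

$$r = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum (x_i-\bar{x})^2 \sum (y_i-\bar{y})^2}}$$

Autocorrelation provides a good lead for investigating the properties of a time series. The autocorrelation is the simple correlation between pairs of observations $(y_t, y_{t+k})$ that are k periods apart, and it is called the autocorrelation at lag k.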
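The lag-k autocorrelation and the Ljung-Box Q-statistic of Section 3.6 can be computed as in the sketch below. It assumes the usual sample autocorrelation (lagged cross-products about the overall mean divided by the total sum of squares) and the standard form Q = n(n+2) Σ r_j² / (n - j). The series and function names are illustrative.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag, using the overall mean."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t + lag] - mean)
                for t in range(n - lag))
    return numer / denom


def ljung_box_q(series, max_lag):
    """Ljung-Box Q = n(n+2) * sum_{j=1..k} r_j^2 / (n - j)."""
    n = len(series)
    total = sum(autocorrelation(series, j) ** 2 / (n - j)
                for j in range(1, max_lag + 1))
    return n * (n + 2) * total


# A strictly alternating series has strong negative lag-1 autocorrelation.
alternating = [1.0, -1.0] * 5
r1 = autocorrelation(alternating, 1)
q1 = ljung_box_q(alternating, 1)
print(r1, q1)
```

Q would then be compared with a chi-square critical value whose degrees of freedom equal the number of lags used.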

Cross-sectional data are collected through a random sample of cross-sectional units. For example, in household consumption data collected through a sample survey, one cannot assume in advance that the random error term of one household is correlated with that of another household. If such correlation exists among the cross-sectional units, it is called spatial autocorrelation.

3.10 Class of Distributions

Probability distributions are used in various fields of research (hydrology, economics, civil engineering design and modelling, weather forecasting and flood risk management). The following distributions are used to carry out the analysis of rainfall and flood data.

3.10.1 Extreme value Distribution

Extreme events have very negative impacts in some fields. They happen very rarely but have great consequences, for example heavy snowfall, extreme floods, high temperatures, storms and high wind speeds. Most researchers and analysts use extreme value theory (EVT) to develop suitable models for evaluating the loss and risk due to extreme events. The probability density function of the extreme value distribution is as follows:

where

Where is a scale parameter and is a shape parameter. If Z follows a Weibull distribution with parameters (,) then Log(Z) follows the Extreme Value Distribution with

3.10.2 Generalized Extreme value Distribution

The Generalized Extreme Value distribution (abbreviated GEV) belongs to a class of continuous probability distributions. It is also known as the Fisher-Tippett distribution. Extreme value distributions are recognized as limiting distributions for optimization problems. The GEV distribution is used to model the normalized maxima or minima of a set of independent and identically distributed random variables. Extreme value theory (EVT) provides the basis for measuring and modeling events which have a very low chance of occurrence.

K is a shape parameter of GEV distribution and

the GEV distribution converge into Type I when and Type (II) for and in Type (III) for
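The three limiting types can be explored numerically. The sketch below assumes the common parameterisation F(x) = exp{-[1 + ξ(x - μ)/σ]^(-1/ξ)}, in which ξ = 0 gives Type I (Gumbel), ξ > 0 gives Type II (Frechet) and ξ < 0 gives Type III (Weibull); some hydrology texts write the shape as k = -ξ, which reverses the signs. The function and argument names are illustrative.

```python
import math


def gev_cdf(x, location, scale, shape):
    """GEV distribution function under the xi-parameterisation (assumed here).

    shape == 0 -> Type I (Gumbel)
    shape > 0  -> Type II (Frechet), bounded below
    shape < 0  -> Type III (Weibull), bounded above
    """
    z = (x - location) / scale
    if shape == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + shape * z
    if t <= 0.0:
        # Outside the support: below the lower bound or above the upper bound.
        return 0.0 if shape > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / shape))


for xi in (0.0, 0.2, -0.2):
    print(xi, round(gev_cdf(2.0, 0.0, 1.0, xi), 4))
```

For the same location and scale, a positive shape gives a heavier upper tail (a lower value of F at large x) than the Gumbel case, and a negative shape gives a lighter, bounded tail.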

3.10.3Exponential Distribution

The exponential distribution is a continuous distribution. It measures the length of time between two events in a Poisson process. These events are rare events. The pdf is

$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$

where $\lambda > 0$. And the distribution function is as follows:

$$F(x) = 1 - e^{-\lambda x}, \quad x \ge 0$$

3.10.4Gamma Distribution

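The Gamma distribution is a two-parameter continuous probability distribution. It is very useful because of its flexibility. The chi-square and exponential distributions are sometimes called the children of the Gamma distribution. Its pdf is

$$f(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \quad x > 0$$

with shape parameter $\alpha > 0$ and scale parameter $\beta > 0$, where $\Gamma(\alpha)$ is the gamma function defined as

$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\,dt$$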

3.10.5Normal Distribution

The Normal distribution is a very well-known and frequently used continuous distribution. It is also called the Gaussian distribution. Its pdf is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty$$

3.10.6Three parameters Log-Normal Distribution

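The three-parameter log-normal (3P) distribution belongs to a class of continuous distributions. It has three parameters $\sigma$, $\mu$ and $\gamma$. The probability density function is

$$f(x) = \frac{1}{(x-\gamma)\,\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x-\gamma)-\mu}{\sigma}\right)^2\right], \quad x > \gamma$$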

3.10.7 Logistic Distribution

The logistic distribution also belongs to a family of continuous distributions. Its shape is similar to that of the normal distribution, but its peak is higher than the Normal. The logistic distribution is useful for hydrologic records (discharge in rivers, rainfall, etc.).

3.10.8Nakagami Distribution

The Nakagami distribution was proposed by Nakagami (1960). It is useful for developing models for the fading of radio signals. Applications of this distribution are spread across many fields such as communications, hydrology, multimedia analysis, network traffic and ultrasound data. Its pdf is

$$f(x) = \frac{2\,m^m}{\Gamma(m)\,\Omega^m}\, x^{2m-1} \exp\left(-\frac{m}{\Omega}x^2\right), \quad x \ge 0$$

where $m$ is the shape parameter and $\Omega$ is a scale parameter. If $m = 1$ this distribution reduces to the Rayleigh distribution, and if $m = 0.5$ it reduces to the half-normal distribution.

3.10.9Weibull Distribution

Among the large group of well-known distributions, the Weibull distribution is very useful for analyzing lifetime data. The Inverse Weibull distribution also plays a vital role in the prediction and analysis of many extreme events such as earthquakes, rainfall, sea currents, floods and wind speeds. Applications of the Inverse Weibull distribution in many fields are given in Harlow (2002), who found this distribution important for modeling the statistical behavior of material properties in engineering applications. Nadarajah and Kotz (2008) pointed out sociological models based on Inverse Weibull random variables. The scale form of the Inverse Weibull distribution has its density function given by

While the location-scale form of Inverse Weibull distribution has its density function given by

3.10.10Inverse Gaussian distribution

The inverse Gaussian distribution was derived in 1915 by Schrodinger. In 1945 Tweedie proposed the name inverse Gaussian distribution. In 1947 Wald revisited this distribution and suggested it as a limiting form for the sample size in a sequential probability ratio test. That is why the inverse Gaussian distribution is also called the Wald distribution. Its pdf is as follows:

$$f(x) = \sqrt{\frac{\lambda}{2\pi x^3}}\, \exp\left[-\frac{\lambda (x-\mu)^2}{2\mu^2 x}\right], \quad x > 0$$

where $\mu$ is the mean and $\lambda$ is the shape parameter.

3.10.11Rayleigh Distribution

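The Rayleigh distribution belongs to a class of continuous distributions. It is used for the magnitude of complex numbers, wind speed, wave heights, etc. A random variable is said to follow a Rayleigh distribution if it has the pdf

$$f(x) = \frac{x}{\sigma^2} \exp\left(-\frac{x^2}{2\sigma^2}\right), \quad x \ge 0$$

where $\sigma$ is a scale parameter and $\sigma > 0$. The distribution function is as follows:

$$F(x) = 1 - \exp\left(-\frac{x^2}{2\sigma^2}\right), \quad x \ge 0$$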

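3.10.12 Frechet Distribution

The French mathematician Maurice Frechet (1878-1973) gave a limiting distribution for the sequence of local maxima under a scale normalization (Frechet, 1927).

The Frechet distribution has probability density function

$$f(x) = \frac{\alpha}{\beta}\left(\frac{x-\gamma}{\beta}\right)^{-1-\alpha} \exp\left[-\left(\frac{x-\gamma}{\beta}\right)^{-\alpha}\right], \quad x > \gamma$$

where $\beta$ is a scale parameter, $\alpha$ is a shape parameter and $\gamma$ is a location parameter. And the cumulative distribution function is

$$F(x) = \exp\left[-\left(\frac{x-\gamma}{\beta}\right)^{-\alpha}\right]$$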

3.11Goodness of fit test

As an intermediate step, the probability distributions have been fitted to the rainfall and flood data of the different sites. After that, a goodness of fit test is carried out to see whether each distribution fits the available data well. For this purpose the following tests are used.

Chi Square goodness of fit test

The chi-square test is widely applied in the literature and is commonly used for investigating the fit of a particular distribution to the data. The hypotheses of the chi-square test are

H0 : Distribution is a good fit for data
H1 : Distribution is not a good fit for data

Test statistic:

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$$

with (N-n-1) degrees of freedom, where $O_i$ are the observed frequencies and $E_i$ are the expected frequencies. The critical region is

$$\chi^2 > \chi^2_{\alpha,\,v}$$

where v is the degrees of freedom. The conclusion is based on the critical region and the calculated value of chi-square: if the calculated value is greater than the critical value, the null hypothesis is rejected; otherwise it is not rejected.

Kolmogorov-Smirnov test

The Kolmogorov-Smirnov test is another tool for testing goodness of fit to a specified distribution. The hypotheses under this test are

H0 : selected sample is drawn from the specified distribution.
H1 : selected sample is not drawn from the specified distribution.

The test statistic used is

$$D_n = \sup_x \left| F_n(x) - F(x) \right|$$

where $F_n(x)$ is the empirical distribution function of the sample, $F(x)$ is the specified distribution function, and sup denotes the supremum of the set of values. The decision is based on the value of Dn: if Dn is close to zero, the distribution is considered a good fit to the data.

3.12 Probability Plots

Probability plots are commonly used as a graphical technique for checking basic assumptions about the nature of the data. The data are plotted against the theoretical distribution and the position of the points around the line is examined. If most of the points lie around the straight line, then the observed data follow the theoretical distribution. We have used some probability plots to see the behavior of the data.
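The two goodness-of-fit statistics of Section 3.11 can be sketched as follows. The chi-square function takes class frequencies that have already been tabulated, and the Kolmogorov-Smirnov function evaluates the largest gap between the empirical distribution function and a supplied distribution function at the ordered sample points. The example data and the uniform distribution function are hypothetical; critical values would still come from tables or statistical software.

```python
def chi_square_statistic(observed, expected):
    """Sum over classes of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def ks_statistic(sample, cdf):
    """D_n = sup |F_n(x) - F(x)|, checked just before and at each ordered point."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # The empirical distribution function jumps from (i-1)/n to i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical class frequencies: five equally likely classes, 100 observations.
chi2 = chi_square_statistic([18, 22, 20, 20, 20], [20.0] * 5)

# Hypothetical sample tested against the uniform distribution on [0, 1].
d_n = ks_statistic([0.1, 0.4, 0.7], lambda x: min(max(x, 0.0), 1.0))
print(chi2, d_n)
```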

3.13 Utilization of Software

The work has been done using different statistical software including MATLAB 5, SPSS 16, MINITAB 15 and EASYFIT. Some graphical analysis was obtained using the R language.

CHAPTER 4
LEAST SQUARE ANALYSIS

4.1 Introduction

Researchers are always interested in the nature of the relation between variables. For instance, a researcher may want to determine the relationship between disasters and extreme events such as rainfall, storms, hurricanes, earthquakes and floods. In recent years much work has been done to find better and more precise methods for estimating linear models and fitting data, but the least squares method is still dominant and remains an important tool for estimating parameters.

The least squares method is perhaps the most widespread technique in statistics, and there are several reasons for this. Mathematically, the use of squares makes the method very tractable because, by the Pythagorean theorem, when the error is independent of an estimated quantity one can add the squared error and the squared estimated quantity. Another reason is that the mathematical tools involved in the least squares method (derivatives, eigen-decomposition and singular value decomposition) have been well studied for a relatively long period of time.

As its name indicates, the method of least squares obtains estimates by minimizing the sum of squares of the deviations from the corresponding observations. The underlying idea, that a combination of different observations is the best estimate of the true value because errors decrease with aggregation rather than increase, goes back to Roger Cotes (1722).

4.2Least Square estimates

A preliminary examination of the data has been done by fitting a straight line and using some graphical techniques to see what kind of variation exists. For this purpose, we fit a straight line to the data to see whether the slope is positive and to what degree. The least squares method is commonly used to find the estimates of the parameters. Here it is used to find the estimates, which are then used to predict the variation of rainfall in coming years. This approach was suggested by Sam C. Saunders, Professor Emeritus, Washington State University.

Consider the yearly (maximum) flood height data over a period of, say, n years, where there may be missing observations. The data are $\{y_i : i \in J\}$, where each $y_i$ represents the real measurement of the recorded yearly flood height in the ith year of the sample sequence.

A preliminary examination of the data can be done using some graphical techniques. The following simple examinations can be completed using elementary procedures. Consider the expected model for the rainfall and flood data

$$y_k = \alpha + \beta k + \varepsilon_k \quad \text{for } k = 1, 2, 3, \ldots, n \tag{4.2.1}$$

We fit a straight line $\alpha + \beta k$ to the data. Consider the sum of squares

$$\sum_{k=1}^{n} (y_k - \alpha - \beta k)^2 \tag{4.2.2}$$

Consider the yearly (maximum) flood height data over a period of n years. The data are $\{y_i : i \in J\}$, where $J \subseteq \{1, 2, \ldots, n\}$. These represent real measurements of the recorded yearly flood height in the ith year of the sample sequence. Let $|J|$ be the cardinality of the set J. Let S denote the mean sum of squares as follows

(4.2.3)

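We are to obtain the LS-estimators, say $\hat{\alpha}$ and $\hat{\beta}$, as functions of the data. They are found by solving the simultaneous equations obtained by setting the partial derivatives equal to zero, i.e.

$$\frac{\partial S}{\partial \alpha} = 0 \quad \text{and} \quad \frac{\partial S}{\partial \beta} = 0$$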

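First assume that $J = \{1, 2, \ldots, n\}$, where J is the set of indices of the independent observations, and recall the formulae for the sum of integers and their squares:

$$\sum_{k=1}^{n} k = \frac{n(n+1)}{2} \quad \text{and} \quad \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6} \tag{4.2.4}$$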

Now we have

And(4.2.5)

(4.2.6)Where we define

and for later use And thus the two equations solved simultaneously for and we get

(4.2.7)And

whereThese estimates can also be written in more convenient form. This is more useful for numerical calculations.

(4.2.8)

(4.2.9) Where k = 1,2,3 . . .n ,and n is total number of observations

If Y~F with E[Y]= V where are identically and independently distributed then andthis shows that these estimators give the true answer in expectations. i.e they are unbiased when there is no true increase.

Also show if there is no true increase. If there is annual average increase per year, namely, then

Once the least squares estimates $\hat{\alpha}$ and $\hat{\beta}$ are computed, we can estimate the worst average rainfall and flood peaks after ten years by using the formula

$$\hat{\alpha} + \hat{\beta}\,(10 + n) \tag{4.2.10}$$

Now one might ask "What would the worst rainfall look like?" To answer this question, we compute the reduced values

$$Z_k = y_k - (\hat{\alpha} + \hat{\beta}\,k) \tag{4.2.11}$$

and find

$$Z_{\max} = \max_k Z_k \tag{4.2.12}$$

So the worst rainfall over the next ten years is estimated as

$$\hat{\alpha} + \hat{\beta}\,(10 + n) + Z_{\max} \tag{4.2.13}$$
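The procedure of this section can be sketched in a few lines: fit a straight line to the coded years by least squares, compute the reduced values, and add the largest one to the projected line. The fit works on (k, y) pairs, so years with missing observations are simply left out. The function names and the toy data are illustrative; the final check uses the intercept, slope and maximum reduced value reported for Mangla in Section 4.4, assuming 1925 is coded as k = 1 so that 2015 corresponds to k = 91.

```python
def fit_line(index, values):
    """Least squares intercept and slope for y = intercept + slope * k."""
    n = len(values)
    mean_k = sum(index) / n
    mean_y = sum(values) / n
    sxy = sum((k - mean_k) * (y - mean_y) for k, y in zip(index, values))
    sxx = sum((k - mean_k) ** 2 for k in index)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_k
    return intercept, slope


def reduced_values(index, values, intercept, slope):
    """Deviations of the observations from the fitted line."""
    return [y - (intercept + slope * k) for k, y in zip(index, values)]


def worst_case_forecast(intercept, slope, z_max, k_future):
    """Projected line at coded year k_future plus the largest reduced value."""
    return intercept + slope * k_future + z_max


# Toy data with year 3 missing; the observed points lie exactly on y = 2 + 3k.
toy_index = [1, 2, 4, 5]
toy_values = [5.0, 8.0, 14.0, 17.0]
a_hat, b_hat = fit_line(toy_index, toy_values)
z = reduced_values(toy_index, toy_values, a_hat, b_hat)

# Mangla figures as reported in the text (k = 91 for 2015 is an assumption).
mangla_2015 = worst_case_forecast(271525.5, -829.500511, 474880.5, 91)
print(round(mangla_2015, 4))
```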

4.3 Missing data analysis

The results of any analysis depend mostly on the availability and accuracy of the data. Unfortunately, missing observations are a real problem for every researcher. We now discuss how one can tackle the issue of missing values.

Recall that when j cardinality we have

(4.3.1)And so

(4.3.2)Now we equate these both to zero and with some acknowledged ambiguity wealso now denote

(4.3.3)

(4.3.4)The two equations determining the estimators are the solutions to the pair

Which are

(4.3.5)

It should also be noted that no error is made by writing the ith year instead of the calendar year, say m + i, where perhaps m = 1997. But if there is no data for two years then the next coded entry would be the appropriate 1 value plus 2.

4.4 Analysis of Mangla Flood Peaks

The historigram of the Mangla flood peaks data for the 89 years from 1925 to 2013 shows a decreasing line with a very small value of r2 = 0.011 in fig (4.4.1). The R-square of the least squares line is close to zero, which indicates that the line is almost horizontal. That means there is neither an upward nor a downward secular trend at Mangla.

Fig. (4.4.1) The Historigram of the flood peaks at Mangla site

The estimates of the intercept and slope are 271525.5 and -829.500511 respectively. The maximum of the reduced values Zk is 474880.5, corresponding to the year 1992. The worst flood peaks after some years may also be estimated as

Table (4.5.1)

Years   Estimated flood peaks
2015    670921.4535
2016    670091.953
2017    670067.937

The above table contains the forecasts of the flood height in coming years.

4.6 Analysis of site Shahdara Flood Peaks

The historigram of the Shahdara flood peaks data for the 88 years from 1925 to 2012 shows a slightly decreasing line with a very small value of r2 = 0.008. The R-square of the least squares line is close to zero, which indicates that the line is almost horizontal. That means there is neither an upward nor a downward secular trend at Shahdara.

The Historigram of the flood peaks at Shahdara site

Fig. (4.6.1)

The estimates of the intercept and slope are 102514.7 and -300.04 respectively. The maximum of the reduced values Zk is 492685.3, corresponding to the year 1988. The worst flood peaks after some years may also be estimated as

Table (4.6.1) Forecasting of worst flood peaks

Years   Estimated flood peaks
2015    567900
2016    567600
2017    567300

Table (4.6.1) gives the forecast flood height for three years at Shahdara.

4.7 Analysis of site Balakot Rainfall Station

The historigram of the Balakot rainfall data for the 36 years from 1977 to 2012 shows an increasing line with r2 = 0.28. The amount of rainfall may increase in coming years and cause severe floods. There may be other reasons for this along with global warming. Forecasting of the worst rainfall can play a vital role in decision making and hydrological engineering. It provides the basis for developing design values for rainfall and flood protection structures (dikes).

Fig. (4.7.1)The Historigram of the rainfall data of Balakot Site

The estimates of the intercept and slope are 378.2527 and 2.223 respectively. The maximum of the reduced values Zk is 319.81, corresponding to the year 2010.

Table (4.7.1) Forecasting of worst rainfall

Years   Estimated rainfall
2015    784.75
2016    786.98
2017    789.44

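4.8 Fitting parabolic trend to Flood and Rainfall data

Let $X_t$ be a time series of independent observations indexed by time t, and suppose the series contains a parabolic trend. Then we have the parabolic trend

$$X_t = a + bt + ct^2 + \varepsilon_t \tag{4.8.1}$$

and

$$S = \sum_{t=1}^{n} (X_t - a - bt - ct^2)^2 \tag{4.8.2}$$

To find the estimators we differentiate S with respect to a, b and c, and setting each derivative equal to zero we get

$$\sum X_t = na + b\sum t + c\sum t^2 \tag{4.8.3}$$

$$\sum tX_t = a\sum t + b\sum t^2 + c\sum t^3 \tag{4.8.4}$$

$$\sum t^2X_t = a\sum t^2 + b\sum t^3 + c\sum t^4 \tag{4.8.5}$$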
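A parabolic trend can be fitted by building the three normal equations from sums of powers of t and solving the resulting 3 × 3 linear system. The sketch below assumes the model X_t = a + b·t + c·t² with t = 1, 2, ..., n and uses a small Gaussian elimination routine so that no external library is needed. The function names and the test series are illustrative.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):                       # back substitution
        tail = sum(a[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (a[r][n] - tail) / a[r][r]
    return solution


def fit_quadratic_trend(series):
    """Return (a, b, c) for X_t = a + b*t + c*t^2, with t = 1..n."""
    t = range(1, len(series) + 1)
    s = [sum(k ** p for k in t) for p in range(5)]        # s[p] = sum of t^p
    m = [sum(x * k ** p for k, x in zip(t, series)) for p in range(3)]
    normal_matrix = [[s[0], s[1], s[2]],
                     [s[1], s[2], s[3]],
                     [s[2], s[3], s[4]]]
    return solve_linear_system(normal_matrix, m)


# Noise-free series generated from a = 5, b = 2, c = 0.5, illustration only.
exact = [5.0 + 2.0 * k + 0.5 * k ** 2 for k in range(1, 11)]
a_q, b_q, c_q = fit_quadratic_trend(exact)
print(round(a_q, 6), round(b_q, 6), round(c_q, 6))
```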

4.9 Measures of accuracy for time series data

Forecasting of time series data depends on the selection of an appropriate model. We used some measures of accuracy to compare the different fits to the sampled data and their forecasts. The three measures of the accuracy of the specified models are:
(i) Mean Absolute Percentage Error (MAPE)
(ii) Mean Absolute Deviation (MAD)
(iii) Mean Squared Deviation (MSD)

Outliers have a significant effect on these measures; for example, the MAD is less affected by outliers than the MSD. Generally, the model with the smallest values of these measures is considered the better model.

4.10 Risalpur Rainfall Site

A visual investigation of the data suggests that a quadratic model could be useful to explain the trend in the observed data. The R square value (only 11.9%) indicates that the rainfall changes very little with the passage of time. The estimated quadratic model is as follows

R-Sq = 11.9%

The R square statistic is a measure of the strength of association between the observed and model-predicted values of the dependent variable. Large R square values indicate strong relationships. The R square for the quadratic model is larger than for the linear model, though it is not clear whether this is due to the quadratic model capitalizing on chance with an extra parameter.

The scatter plot of the Risalpur site shows a parabolic trend in the observed data and indicates that some outliers are present. For a more precise examination, the reasons for these outliers should be investigated.

Table (4.10.1) Measures of accuracy

Trend Methods   Linear     Quadratic   Exponential
MAPE            24.4       23.19       23.7
MAD             78.0       72.97       75.4
MSD             10543.1    9547.50     10740.8
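The three accuracy measures reported in the tables of this chapter can be computed as in the sketch below. It assumes the usual definitions used by trend-analysis software: MAPE is the mean of |actual - fitted| / |actual| expressed as a percentage, MAD is the mean absolute deviation and MSD is the mean squared deviation, each divided by the number of observations. The example values are hypothetical.

```python
def mape(actual, fitted):
    """Mean absolute percentage error (requires non-zero actual values)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, fitted)) / len(actual)


def mad(actual, fitted):
    """Mean absolute deviation between actual and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def msd(actual, fitted):
    """Mean squared deviation between actual and fitted values."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)


# Hypothetical observed and fitted values, for illustration only.
observed = [100.0, 200.0, 400.0]
model_fit = [110.0, 190.0, 380.0]
print(mape(observed, model_fit), mad(observed, model_fit), msd(observed, model_fit))
```

Because the MSD squares each deviation, a single large outlier inflates it far more than it inflates the MAD, which is why the two measures can rank the same models differently.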

Table (4.10.2) Scatter plot and residuals

(a) Scatter plot of Quadratic model (b) Residuals versus fitted values

The rainfall estimated by the quadratic model provides the best prediction compared with the other methods. From table (4.10.2), one can see the difference between the values estimated by the different methods for the same year, which clearly shows that misspecification of the model will give misleading results.

Table (4.10.2) Forecasting

Years   Linear     Quadratic   Exponential
2013    377.408    454.100     359.241
2014    378.838    467.966     360.518
2015    380.269    482.488     361.800
2016    381.699    497.664     363.086
2017    383.130    513.494     364.377

A huge difference is observed in forecasting with selected models. Large amount of rainfall is expected according to second degree curve.

Figure (4.10.2) Graphical comparison of three models

The quadratic model gives the smallest values of the accuracy measures among all three methods, so the quadratic model is a better choice for forecasting the rainfall in coming years.

4.11 Dir Rainfall

The rainfall data at the Dir site exhibit a quadratic trend. Figure (4.11.1) contains the fitted plot of the Dir site and the residuals against the fitted values. The value of R square is also very small, i.e. about 18 %.

Figure (4.11.1) (a) Scatter plot (b) Residuals plot

The estimated model is as follows.

R-Sq = 18.5%

Table (4.11.1) Measures of Accuracy

Trend Methods   Linear     Quadratic   Exponential
MAPE            11.689     11.594      11.327
MAD             16.964     16.853      16.695
MSD             560.513    558.224     563.690

Two of the accuracy measures, MAPE and MAD, are smaller for the exponential model than for the other models. From figure (4.11.1 a) one can perceive that the presence of outliers can change the true picture of the model.

Table (4.11.2) Forecasting of worst rainfall

Years   Linear     Quadratic   Exponential
2013    163.282    166.959     160.698
2014    164.528    168.801     162.030
2015    165.774    170.675     163.374
2016    167.020    172.580     164.728
2017    168.265    174.516     166.093

Graph of three models

From the accuracy measures we can see that the Exponential model receives the smaller values of MAPE and MAD while the quadratic model has the minimum value of MSD.

4.12Kohat Rainfall

The Kohat site has a parabolic trend. The visual investigation of the graph suggests the presence of outliers in the data. The estimated model is as follows.

Scatter plot of Kohat rainfall site

Figure (4.12.1)

Figure (4.12.1) shows that there is no linear trend in the data. The measures of accuracy show that the quadratic model gives the smallest MSD for the Kohat rainfall site.

Table (4.12.1) Measures of Accuracy

Trend Methods   Linear     Quadratic   Exponential
MAPE            12.006     11.960      11.674
MAD             17.475     17.368      17.183
MSD             474.968    470.556     477.505

Table (4.12.2) Forecasting of rainfall

Years   Linear     Quadratic   Exponential
2013    148.110    143.005     146.023
2014    148.346    142.413     146.233
2015    148.582    141.777     146.444
2016    148.818    141.098     146.655
2017    149.054    140.376     146.866

Graphical Comparison of three models

Figure (4.12.2)

The exponential model has smaller values of MAPE and MAD than the other models. We can also observe from the scatter plot in Figure (4.12.1) that the pattern does not follow a linear or quadratic trend. The best choice for this data is the exponential model.

4.13Marala Rainfall Site

The Marala flood site has a parabolic trend. The visual investigation of the graph suggests the presence of outliers in the data. The estimated model is as follows.

R-Sq = 4.4%

Table (4.13.1) Measures of Accuracy

Trend Methods   Linear        Quadratic     Exponential
MAPE            56.9958       55.3978       56.9169
MAD             159975        156370        157535
MSD             4.35181E+10   4.16378E+10   4.59399E+10

Table (4.13.2) Forecasting of worst flood

Years   Linear    Quadratic   Exponential
2013    322730    222405      266671
2014    322465    215376      266263
2015    322199    208196      265855
2016    321933    200867      265448
2017    321668    193387      265041

From table (4.13.1) we can see that the quadratic model has the smallest values of MAPE, MAD and MSD. By these measures of accuracy, the best model is the quadratic model.

4.14 D.G. Khan Site

The estimated linear and quadratic models are as follows

Figure(4.14.1)

Table (4.14.1) Measures of Accuracy

Trend Methods   Linear     Quadratic   Exponential
MAPE            36.1637    33.3424     32.4880
MAD             7.6410     7.2195      7.4172
MSD             92.9969    83.7162     96.9935

Table (4.14.2) Forecasting

Years   Linear     Quadratic   Exponential
2012    28.2392    20.5184     26.8601
2013    28.6230    19.0494     27.3407
2014    29.0069    17.4377     27.8298
2015    29.3908    15.6836     28.3277
2016    29.7747    13.7869     28.8345

Figure (4.14.2)

The quadratic model is a better fit for forecasting at the D.G. Khan rainfall site.

4.15 Terbela Flood Site

The quadratic model seems to be a good model for the Terbela flood site. There is an indication of outliers in the data, which need to be investigated in order to make better predictions of future flood amounts. The R square value is about 60 %, which shows that a large amount of the variation is accounted for. The estimated model is

R-Sq = 0.6%

Figure(4.15.1)

Table (4.15.1) Measures of Accuracy

Trend Methods   Linear        Quadratic     Exponential
MAPE            16            16            18
MAD             63210         62645         63444
MSD             9254221795    9238780213    9386600743

Table (4.15.2) Forecasting

Years   Linear    Quadratic   Exponential
2013    395152    385601      373603
2014    395752    384652      373533
2015    396351    383621      373463
2016    396951    382509      373394
2017    397550    381315      373324

4.16 Muzafarabad

The linear model seems to be a good model for the Muzafarabad rainfall site. There is an indication of outliers in the data, which need to be investigated in order to make better predictions of future rainfall amounts. The R square value is about 70 %, which shows that a large amount of the variation is accounted for. The estimated model is

R-Square= 70.2%

Table (4.16.1) Measures of Accuracy

Trend Methods   Linear     Quadratic   Exponential
MAPE            14.434     14.450      14.476
MAD             17.705     17.726      17.841
MSD             459.259    459.205     462.689

Figure (4.16.2)Graphical comparison of three models

The scatter plot of Muzafarabad suggests a linear trend. From the accuracy table, the linear trend is found to be a good model.

Table (4.16.2) Forecasting of rainfall

Years   Linear     Quadratic   Exponential
2011    123.855    123.311     121.395
2012    123.745    123.144     121.265
2013    123.635    122.975     121.135
2014    123.526    122.804     121.006
2015    123.416    122.631     120.876

The above estimation methods are used as a preliminary analysis of the rainfall and flood data. Forecasting of floods and flood warnings require some further analysis.

CHAPTER 5

ANALYSIS OF RECORD VALUES

5.1 Introduction

A record is a value that is larger or smaller than all the previous values in a sequence of independent random variables. A value that is larger than all the preceding values is called an upper record value, and a value that is smaller than all the preceding values is called a lower record value. Consider the level of flood in the river on the kth day at the jth site. If we are interested in the local maximum of this flood height, that local maximum is known as an upper record value, and the local minimum is known as a lower record value.

Chandler (1952) introduced the concept of record times and record values. He found that the expected inter-record time is infinite, whatever continuous probability distribution the random variable follows. Feller (1966) quoted several examples from gambling problems that are based on record values.

There are many practical real-life examples of record values, as well as examples in statistical settings such as economics, sports and weather. Sometimes we are interested in seeking new records and maintaining them for further analysis and comparison, for example Olympic records, world records in sports, records of earthquakes, records of rainfall and the highest flood peaks over the years.

5.2 Probability Density Function of Upper Record Values

Let x1, x2, x3, ..., xn be independent and identically distributed random variables from any distribution having probability density function f(x) and probability distribution function F(x), with a specified random sample size.
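As a minimal sketch of the definitions in Section 5.1, the upper and lower record values of an observed sequence can be extracted as follows. The sketch assumes the common convention that the first observation counts as both an upper and a lower record, and that ties do not create new records. The function names and the sample sequence are hypothetical.

```python
def upper_records(values):
    """Values that strictly exceed every earlier value in the sequence."""
    records = []
    for v in values:
        # The first observation is a record by convention; ties are not.
        if not records or v > records[-1]:
            records.append(v)
    return records


def lower_records(values):
    """Values that fall strictly below every earlier value in the sequence."""
    records = []
    for v in values:
        if not records or v < records[-1]:
            records.append(v)
    return records


# Hypothetical sequence of daily flood levels.
levels = [5, 3, 8, 8, 2, 10, 1]
upper = upper_records(levels)
lower = lower_records(levels)
```

Each list is monotone by construction: the upper records are strictly increasing and the lower records strictly decreasing, which is why records become rarer as the sequence grows.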

If the upper record values are taken from this sequence, their probability density function is as given below.

The probability density function of the upper record value is

(5.2.1)

where the reliability function is as follows:

and

The joint probability distribution of r upper record values is written as follows:

(5.2.2)
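For reference, these two densities are usually stated in the record-value literature in the form sketched below. The notation is an assumption of this sketch: the nth upper record value has density f_n(x), \bar{F}(x) = 1 - F(x), R(x) = -\ln \bar{F}(x), and h(x) = f(x) / \bar{F}(x) is the hazard rate (written h rather than r to avoid a clash with the number of records r).

```latex
% Assumed notation: \bar{F}(x) = 1 - F(x), R(x) = -\ln \bar{F}(x), h(x) = f(x) / \bar{F}(x)

% Density of the n-th upper record value
\[
  f_{n}(x) = \frac{[R(x)]^{\,n-1}}{\Gamma(n)} \, f(x), \qquad -\infty < x < \infty .
\]

% Joint density of the first r upper record values
\[
  f_{1,2,\ldots,r}(x_1, x_2, \ldots, x_r)
    = \left[ \prod_{i=1}^{r-1} h(x_i) \right] f(x_r),
  \qquad x_1 < x_2 < \cdots < x_r .
\]
```

As a check on the first form, a standard exponential parent with F(x) = 1 - e^{-x} gives R(x) = x, so the nth upper record value follows a gamma distribution with shape n and unit scale.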

The joint probability density function of the rth and sth upper record values is as follows:

(5.2.3)

where r