Post on 03-Nov-2019
MINOR RESEARCH PROJECT
ON
A STATISTICAL MODELING OF RAINFALL IN TELANGANA STATE USING BOX-JENKINS
METHODOLOGY
(UGC – MRP File No:MRP-6862/16 (SERO/UGC) Dated 26-03-2017)
BY
B.VITTAL
PRINCIPAL INVESTIGATOR
DEPARTMENT OF HUMANITIES AND SCIENCES
CVR COLLEGE OF ENGINEERING
JNTU, HYDERABAD-501510
REPORT SUBMITTED TO
UNIVERSITY GRANTS COMMISSION, NEW DELHI
(Project Duration: 01-06-2017 to 31-05-2019)
B.Vittal, M.Sc., (Ph.D)
Principal Investigator, UGC-MRP
Dept. of Humanities & Sciences
CVR College of Engineering
Vastunagar, Hyd.
Date:
FOREWORD
This is the final report of the UGC-sanctioned Minor Research Project entitled “A
Statistical Modeling of Rainfall in Telangana State using Box-Jenkins
Methodology”, bearing F.No. MRP-6862/16 (SERO/UGC) dated 26-03-2017, which
was carried out by me as Principal Investigator during the period 1st June
2017 to 31st May 2019 in the Department of Humanities and Sciences, CVR
College of Engineering, JNTU, Hyderabad, Telangana State, India, 501510.
This report describes in detail the procedures followed, the problems
attempted and the results obtained so far in relation to the objectives stated
in the project proposal submitted to UGC for sanction of the MRP. This
project was technically assisted by Dr. Krishna Kishor and Dr. N. Raju, and their
cooperation is gratefully acknowledged.
( B.VITTAL)
ACKNOWLEDGMENTS
I would like to express my sincere thanks to the University Grants
Commission, New Delhi, for providing financial assistance to carry out this
project under UGC-MRP. I am also grateful to the Chairman, the Dean Research,
the Dean Projects, the Principal and the HOD of Humanities & Sciences, CVR College
of Engineering, JNTUH, for providing the facilities necessary to carry out this research
project smoothly. I am thankful to Prof. M. Krishna Reddy, Prof. V. Hara
Gopal, Dr. C. Jayalakshmi and Dr. N. Ch. Bhatra Charyulu for their help and
encouragement throughout the period of this project. I am also thankful to all
my well-wishers who helped directly or indirectly in completing this project. I
would not have been able to complete this project work without the support
of my family members and their understanding of my goals and aspirations
towards research.
B.VITTAL
CONTENTS
1. INTRODUCTION
1.1 Introduction
1.2 Methodology of Seasonal rainfall
1.3 Methodology of Monthly rainfall
1.4 Summary of the Report
2. GEOGRAPHICAL SITUATION OF TELANGANA STATE
2.1 Geographical & Economical position of Telangana Region
2.2 Ecological Situation of Telangana State
2.3 Climate & Agriculture Situation of Telangana State
2.4 Irrigation Projects in Telangana State
2.5 Electricity Generation in Telangana State
3. ANALYSIS OF ZONE WISE AND SEASONAL RAINFALL
3.1 Introduction
3.2 Definition and various Methods of measuring Rainfall
3.3 Methodology of Research
3.4 Analysis of Zone wise rainfall
3.5 Conclusion from Zone wise rainfall
3.6 Analysis of Seasonal Rainfall
3.7 ANOVA Approach for analysis of three seasonal Rainfall
3.8 Tukey's HSD Post Hoc Approach for analysis of seasonal Rainfall
3.9 Conclusion from Seasonal Rainfall
4. ANALYSIS OF MONTHLY RAINFALL
4.1 Introduction
4.2 Methodology of Research
4.3 ARIMA Model
4.4 Autocorrelation and Partial Autocorrelation Function
4.5 Road Map of Box-Jenkins Methodology
4.6 Monthly wise Rainfall Data
4.7 Time Series Plots of Rainfall
4.8 Model Identification
4.9 Diagnostic checking of Rainfall
4.9.1 Forecasting of Rainfall
5. SUMMARY OF FINDINGS AND FUTURE SCOPE
5.1 Objectives of the Project
5.2 Summary of Findings
5.3 Future Scope
6. BIBLIOGRAPHY
1. INTRODUCTION
1.1 Introduction
Water is a natural resource that is vital to all forms of life. It is used for
transportation, as a source of power, and for domestic, agricultural and industrial
purposes. The availability of water depends on the amount of precipitation in a
particular area. Heavy rainfall causes floods, and an extended absence of rainfall
causes drought. Understanding rainfall variability is essential for managing water
resources optimally, because water is one of the most important factors in the
geographical and economic development of a nation. Genuine and reliable statistical
data on water resources and their utilization help the government to take proper
decisions on constructing projects and supplying water to the people for various
purposes. Statistical tools play a major role in the testing and analysis of
rainfall data.
1.2 Methodology of Seasonal Rainfall
The research design of the study is causal in nature. Secondary data were collected from
the Statistical Abstract of Telangana, India; the number of studies conducted in this area
is limited. Yearly data for 10 districts of Telangana are divided into two groups, South
and North Telangana, of five districts each. A regression model is constructed for each
group of five districts. The data are analysed with a t-test to test the significance of the
difference between the South and North districts. ANOVA is used to test the variation
among the three seasons, and Tukey's HSD post hoc test is used to test the significance of
the difference between pairs of seasons.
1.3 Methodology of Monthly Rainfall
Univariate time series analysis and forecasting have become major tools in hydrology,
environmental management and climatic studies. Various time series methods have been
used for modeling and forecasting rainfall data in the literature, but according to Pankratz
(1983) the Box and Jenkins method is the most general approach to forecasting because,
unlike other models, it does not require a fixed and specified pattern to be assumed
initially. The univariate Box and Jenkins models are the most useful for the analysis of a
single time series.
The Box and Jenkins methodology is probably the most accurate method for forecasting
time series data. According to Caldwell (2006), the Box-Jenkins methodology is particularly
suited to modeling processes that exhibit strong seasonal behaviour. Forecasting techniques
that exploit the relation among observations yield better results; most of these
techniques are based on the advances in time series analysis consolidated and
developed by Box and Jenkins (1976). In this project the Box-Jenkins approach is used to
build a seasonal model of the monthly rainfall data of the Telangana region. The estimation
and diagnostic analysis results revealed that the model fits the historical data well.
Residual analysis revealed no violation of the assumptions related to model adequacy.
Further, the forecasting accuracy of the model was assessed by holding out some rainfall
values. The point forecasts closely matched the pattern of the actual data and showed good
forecasting accuracy in the validation period.
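The hold-out check described above can be illustrated with a minimal sketch. It is not the model used in this project: a seasonal-naive forecast (repeat the last observed year) stands in for the fitted Box-Jenkins model, the monthly values are made up for illustration, and the function names are hypothetical.

```python
# Minimal sketch of hold-out validation for a monthly rainfall series.
# Assumptions: illustrative (not real) data; a seasonal-naive forecast is
# used as a stand-in for the fitted seasonal Box-Jenkins model.

SEASON = 12  # months per seasonal cycle


def holdout_split(series, n_holdout):
    """Split a series into a training part and the last n_holdout values."""
    return series[:-n_holdout], series[-n_holdout:]


def seasonal_naive_forecast(train, horizon, season=SEASON):
    """Forecast by repeating the last full season of the training data."""
    last_season = train[-season:]
    return [last_season[h % season] for h in range(horizon)]


def mean_absolute_error(actual, forecast):
    """Average absolute difference between actual and forecast values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Illustrative monthly pattern (mm), scaled differently for three years.
pattern = [5, 5, 10, 20, 30, 130, 240, 220, 160, 90, 25, 5]
series = [v * scale for scale in (1.0, 0.9, 1.1) for v in pattern]

train, test = holdout_split(series, n_holdout=12)
forecast = seasonal_naive_forecast(train, horizon=len(test))
mae = mean_absolute_error(test, forecast)
print(f"Hold-out MAE: {mae:.2f} mm")
```

The same split and error measure apply unchanged when the stand-in forecast is replaced by forecasts from a fitted seasonal model.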
1.4 Summary of the Report
This report is devoted to the study “A Statistical Modelling of Rainfall in Telangana State
using Box-Jenkins Methodology”. The research work is presented in five chapters.
In Chapter – 1: The present chapter provides a brief introduction to the methodology of
research, a bird's eye view of some of the useful methods, and a chapter-wise summary of
the report.
In Chapter – 2: This chapter gives a brief description of the geographical and economic
situation of Telangana. The ecological and agricultural position of Telangana is described
in order to understand water and its resources. The irrigation projects that store rainfall
and the electricity generation capacity of the state are described individually.
In Chapter – 3: This chapter introduces zone-wise rainfall and its analysis. The literature
available on the analysis of zone-wise rainfall is covered. A regression model is
constructed for predicting the zone-wise rainfall, and the t-test is used to test for a
significant difference between the zones. The analysis shows no significant difference
between the zones.
Seasonal rainfall is also analysed in this chapter. For this analysis the rainfall is
classified into three seasons, namely the South-West monsoon, the North-East monsoon and
the winter season. ANOVA and Tukey's HSD post hoc test are used to test the significance
of seasonal differences in rainfall. The analysis shows that the amount of rainfall changes
from the South-West monsoon to the North-East monsoon and the winter season.
In Chapter – 4: This chapter introduces monthly rainfall and its analysis. SARIMA is the
statistical method used for building a model of the existing rainfall data and
forecasting future rainfall. The rainfall data in this project are seasonal, so SARIMA is
one of the best approaches for building the forecasting model. The ACF and PACF are used
to test for seasonality and stationarity, and it is found that the rainfall series is
seasonal and non-stationary. For diagnostic checking, the residual ACF and PACF are
constructed and the Ljung-Box test is used to check whether the model is adequate. Q-Q
plots are also constructed for diagnostic checking.
The analysis of the monthly rainfall data shows that the data involve seasonality and
non-stationarity. To rectify these, differencing at the appropriate lags is applied. From
the constructed model, forecasts are made for the two years 2019 and 2020. Diagnostic
checking is done to verify that the model is appropriate. The most recent four months of
rainfall in 2019 are compared with the forecasted rainfall for four months of 2019 and
2020, and the actual and forecasted values are found to be close.
In Chapter – 5: This chapter presents the conclusions drawn from the zone-wise, seasonal
and monthly rainfall analyses, and the various tools used for drawing conclusions from
rainfall data in different situations. The SARIMA model is observed to give good results in
model construction, diagnostic checking and forecasting of the rainfall data.
IMPORTANCE
The economy of Telangana is basically agro-based. The agriculture sector directly
contributes around 25 percent of the State Gross Domestic Product and provides a
livelihood to about 63 percent of the State's population. Historically, the economy
of the State has been heavily dependent on the vagaries of the monsoon. As such, rainfall
statistics play a vital role in the planning of irrigation projects and flood control
measures and in the study of drought conditions. In the short run, rainfall statistics are
needed to monitor the progress of agricultural operations in an area, to assess the
recurrence of drought or floods, to prepare contingency plans, and to advise farmers on the
cultivation practices to be adopted for different levels of precipitation and soil moisture.
2. GEOGRAPHICAL SITUATION OF TELANGANA STATE
2.1 Geographical & Economical position of Telangana Region
Geographical Situation of Telangana
Telangana is a state in India situated on the centre-south stretch of the Indian peninsula on
the high Deccan Plateau. It is the twelfth largest state and the twelfth-most populated state in
India with a geographical area of 112,077 km2 (43,273 sq mi) and 35,193,978 residents as
per 2011 census. On 2 June 2014, the area was separated from the northwestern part of
Andhra Pradesh as the newly formed 29th state with Hyderabad as its historic permanent
capital. Its other major cities include Warangal, Nizamabad and Karimnagar. Telangana is
bordered by the states of Maharashtra to the north, Chhattisgarh to the east, Karnataka to the
west, and Andhra Pradesh to the east and south. The terrain of Telangana region consists
mostly of hills, mountain ranges, and thick dense forests covering an area of 27,292 sq. km.
As of 2019, the state of Telangana is divided into 33 districts. Telangana is situated on the
Deccan Plateau, in the central stretch of the eastern seaboard of the Indian Peninsula. It
covers 112,077 square kilometres (43,273 sq mi). The region is drained by two major rivers,
with about 79% of the Godavari River catchment area and about 69% of the Krishna River
catchment area, but most of the land is arid. Telangana is also drained by several minor
rivers such as the Bhima, the Maner, the Manjira and the Musi.
The annual rainfall is between 900 and 1500 mm in northern Telangana and 700 to 900 mm
in southern Telangana, from the southwest monsoons. Various soil types abound, including
chalkas, red sandy soils, dubbas, deep red loamy soils, and very deep black cotton soils that
facilitate planting mangoes, oranges and flowers.
Region               Area
Geographical Area    112,077 km2
Residents            3,51,93,978
Forest Area          27,292 sq. km
Telangana Economy
The economy of Telangana is mainly driven by agriculture. Two important rivers of India,
the Godavari and Krishna, flow through the state, providing irrigation. Farmers in Telangana
mainly depend on rain-fed water sources for irrigation. Rice is the major food crop. Other
important crops are cotton, sugar cane, mango, and tobacco. Recently, crops used for
vegetable oil production such as sunflower and peanuts have gained favour. There are many
multi-state irrigation projects in development, including Godavari River Basin Irrigation
Projects and Nagarjuna Sagar Dam, the world's highest masonry dam. The state has also
started to focus on the fields of information technology and biotechnology. Telangana is one
of the top IT-exporting states of India. There are 68 Special Economic Zones in the state.
Telangana is a mineral-rich state, with coal reserves at Singareni Collieries Company.
The economy of Telangana is the eighth-largest state economy in India, with ₹8.66 lakh
crore (US$130 billion) in gross domestic product and a per capita GDP of ₹206,000
(US$3,000). Telangana ranks sixteenth among Indian states in human development index.
The state has emerged as a major focus for robust IT software, industry and services sector.
The state is also the main administrative centre for a large number of Indian defence
aerospace and research labs like Bharat Dynamics Limited, Defence Metallurgical Research
Laboratory, Defence Research and Development Organisation and Defence Research and
Development Laboratory.
2.2 Ecological Situation of Telangana State
Ecological situation of Telangana
The Central Deccan Plateau dry deciduous forests eco region covers much of the state,
including Hyderabad. The characteristic vegetation is woodlands of Hardwickia binata and
Albizia amara. Over 80% of the original forest cover has been cleared for agriculture, timber
harvesting, or cattle grazing, but large blocks of forest can be found in Nagarjunsagar-
Srisailam Tiger Reserve and elsewhere. The more humid Eastern Highlands moist deciduous
forests cover the Eastern Ghats in the eastern part of the state.
2.3 Climate & Agriculture Situation of Telangana State
Climate in Telangana region
Telangana is a semi-arid area and has a predominantly hot and dry climate. Summers start in
March, and peak in May with average high temperatures in the 42 °C (108 °F) range. The monsoon
arrives in June and lasts until September with about 755 mm (29.7 inches) of precipitation. A dry,
mild winter starts in late November and lasts until early February with little humidity and average
temperatures in the 22–23 °C (72–73 °F) range.
Telangana Agriculture
Rice is the major food crop and staple food of the state. Other important crops are maize,
tobacco, mango, cotton and sugar cane. Agriculture has been the chief source of income for
the state's economy. The Godavari and Krishna rivers flow through the state, providing
irrigation. Apart from major rivers, there are small rivers like Tunga Bhadra, Bima, Dindi,
Kinnerasani, Manjeera, Manair, Penganga, Pranahitha, peddavagu and Taliperu. There are
many multi-state irrigation projects in development, including Godavari River Basin
Irrigation Projects and Nagarjuna Sagar Dam, the world's highest masonry dam.
Agri Export Zones have been proposed for the following produce and locations:
Gherkins – Mahabubnagar, Rangareddy, Medak, Karimnagar, Warangal
Mangoes and grapes – Hyderabad, Rangareddy, Medak, Mahabubnagar
2.4 Irrigation Projects in Telangana State
Irrigation Projects in Telangana
Telangana region has a rich heritage of cultivation and irrigation dating back to several
centuries. In the past, rulers paid a good deal of attention to the development of irrigation in
their kingdoms for the benefit of their subjects. Big lakes like Ramappa, Pakhal, Laknavaram
and many other irrigation works of the Kakatiya period have become names to remember.
The Mir Alam Tank is the finest example of an arched dam. Hussain Sagar, Ghanapur
Anicut across the Manjira with two canals called Fathenahar and Mahaboobnahar Projects,
Pocharam lake, Osmansagar, Himayatsagar, Nizamsagar Project, Mannair Project, Dindi
Project, Palair Project, Wyra Project and Sarlasagar Projects are some of the magnificent
contributions of the eminent engineers of Hyderabad State under Nawab Ali Nawaz Jung
Bahadur during the Nizam's rule in the Telangana Region.
Projects are classified as under, based on the extent of irrigated ayacut (commandable area) under them.
Project           Area
Major Project     Ayacut above 25000 Acres (10,000 ha.)
Medium Project    Ayacut above 5000 Acres (2000 ha) & upto 25000 Acres (10000 ha.)
Minor Project     Ayacut upto 5000 Acres (2000 ha)
Major Irrigation projects in Telangana
Project                        River       District
Upper Manair                   Maner       Karimnagar
Srisailam                      Krishna     Mahabubnagar, Kurnool
Sriram Sagar                   Godavari    Nizamabad
Sripada Yellampalli Project    Godavari    Peddapalli, Mancherial
Osman Sagar                    Musi        Ranga Reddy district
Nizam Sagar                    Manjira     Nizamabad
Nagarjuna Sagar                Krishna     Nalgonda, Guntur
Koil Sagar                     Krishna     Mahabubnagar
Kaleswaram                     Godavari    Jayashankar Bhupalpally district
Jurala                         Krishna     Jogulamba Gadwal
Himayat Sagar                  Musi        Ranga Reddy district
Waterfalls in Telangana
Telangana State is home to many waterfalls, which are a great sight for nature lovers
when visited in the right season. The best-known waterfalls, spread over Telangana, are
listed below.
Kuntala Waterfall (45 metres (148 ft)) located in Kuntala, Adilabad district.
Bogatha Waterfall is located in Koyaveerapuram G, Wazeedu Mandal, Jayashankar
Bhupalpally district, Telangana. It is located 120 kilometres (75 mi) from Bhadrachalam,
140 kilometres (87 mi) away from Warangal and 329 kilometres (204 mi) from Hyderabad.
Savatula Gundam Waterfalls are one of the many waterfalls located in Adilabad district,
Telangana, India. They are located 30 km (19 mi) from Asifabad and 350 km (220 mi) from
Hyderabad, the state capital.
Gowri Gundaala waterfalls at Sabitham village near Manthani in Peddapalli district.
2.5 Electricity Generation in Telangana State
Electricity generation in Telangana
Telangana State Power Generation Corporation Limited is the power generating organization of Telangana. After the formation of Telangana state, it ceased power trading and retained the powers of controlling system operations of power generation.
Telangana State Power Generation Corporation Limited was incorporated under the Companies Act, 2013, on 19 May 2014 and commenced its operations from 2 June 2014.
Power stations in Telangana
Non-renewable
Thermal power
Thermal power is the "largest" source of power in Telangana state. There are different types
of Thermal power plants based on the fuel used to generate the steam such as coal, gas,
diesel etc.
Name                   Operator    Location                   Sector    Fuel
Kakatiya TPP           TSGENCO     Chelpur, Bhupalpally       State     coal
Kothagudem TPS         TSGENCO     Paloncha, Kothagudem       State     coal
Ramagundam TPS         TSGENCO     Ramagundam, Peddapalli     State     coal
NTPC Ramagundam        NTPC        Ramagundam, Peddapalli     Central   coal
Singareni TPP          SCCL        Jaipur, Mancherial         State     coal
Bhadradri TPP          TSGENCO     Manuguru, Kothagudem       State     coal
Telangana Super TPP    NTPC        Telangana                  Central   coal
Yadadri TPP            TSGENCO     Dameracherla, Nalgonda     State     coal
Renewable
Hydroelectric
This is a list of hydroelectric power plants in Telangana.
S. No  Project Name                                  Operator                Sector    Units (MW)        Inst. Capacity (MW)
1      Nagarjuna Sagar Main PH                       TSGENCO                 State     1x110, 7x100.8    815.6
2      Nagarjuna Sagar LCPH                          TSGENCO                 State     2x30              60
3      Srisailam LBPH                                TSGENCO                 State     6x150             900
4      Pochampad PH                                  TSGENCO                 State     4x9               36
5      Singur PH                                     TSGENCO                 State     2x7.5             15
6      Nizam Sagar PH                                TSGENCO                 State     2x5               10
7      Paleru Mini Hydel                             TSGENCO                 State     2x1               2
8      Peddapalli Mini Hydels                        TSGENCO                 State     1x9.16            9.16
9      Pulichintala HEP                              TSGENCO                 State     4x30              30
10     Lower Jurala HEP                              TSGENCO                 State     6x40              240
11     Jurala HEP                                    TSGENCO                 State     6x39              234
12     Dummugudem Mini Hydel Power Project           SLS Power Corporation   Private   6x4               24
13     Janapadu Hydro Power Project Pvt Ltd          JHPPPL                  Private   1x1               1
14     Nagarjuna Agro Tech Ltd                       NATL                    Private   1x1.335           4
15     Saraswati Power Industries Pvt Ltd            SPIL                    Private   2x1               2
16     Komaram Bheem Small Hydro Electric Project    DesignGroup             Private   1x3               3
Total capacity (MW)                                                                                      2385.76
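The total in the last row can be cross-checked by adding up the installed-capacity column. The short sketch below does this; the values are copied from the hydroelectric table above, and the variable names are only illustrative.

```python
# Cross-check of the total installed hydroelectric capacity.
# Each entry is (project name, sector, installed capacity in MW),
# copied from the table above.
hydro_plants = [
    ("Nagarjuna Sagar Main PH", "State", 815.6),
    ("Nagarjuna Sagar LCPH", "State", 60),
    ("Srisailam LBPH", "State", 900),
    ("Pochampad PH", "State", 36),
    ("Singur PH", "State", 15),
    ("Nizam Sagar PH", "State", 10),
    ("Paleru Mini Hydel", "State", 2),
    ("Peddapalli Mini Hydels", "State", 9.16),
    ("Pulichintala HEP", "State", 30),
    ("Lower Jurala HEP", "State", 240),
    ("Jurala HEP", "State", 234),
    ("Dummugudem Mini Hydel Power Project", "Private", 24),
    ("Janapadu Hydro Power Project Pvt Ltd", "Private", 1),
    ("Nagarjuna Agro Tech Ltd", "Private", 4),
    ("Saraswati Power Industries Pvt Ltd", "Private", 2),
    ("Komaram Bheem Small Hydro Electric Project", "Private", 3),
]

total_mw = sum(cap for _, _, cap in hydro_plants)
state_mw = sum(cap for _, sector, cap in hydro_plants if sector == "State")
private_mw = sum(cap for _, sector, cap in hydro_plants if sector == "Private")

print(f"State sector:   {state_mw:.2f} MW")
print(f"Private sector: {private_mw:.2f} MW")
print(f"Total:          {total_mw:.2f} MW")
```

The column sum agrees with the stated total of 2385.76 MW.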
3. ANALYSIS OF ZONE WISE & SEASONAL RAINFALL
3.1 Introduction
The economy of Telangana is basically agro-based. The agriculture sector directly
contributes around 25 percent of the State Gross Domestic Product and provides a
livelihood to about 63 percent of the State's population. Historically, the economy
of the State has been heavily dependent on the vagaries of the monsoon. As such, rainfall
statistics play a vital role in the planning of irrigation projects and flood control
measures and in the study of drought conditions. In the short run, rainfall statistics are
needed to monitor the progress of agricultural operations in an area, to assess the
recurrence of drought or floods, to prepare contingency plans, and to advise farmers on the
cultivation practices to be adopted for different levels of precipitation and soil moisture.
3.2 Definition and various Methods of measuring Rainfall
Rainfall Definition
The total amount of rain deposited on a given area during a given time, as measured by a
rain gauge, is called rainfall. A rain gauge is an instrument used by meteorologists and
hydrologists to measure rainfall over a certain period of time. Rain gauges are also known
as udometers, pluviometers and ombrometers.
Types of Rain Gauge
There are two types of rain gauges:
1) Non-recording type
2) Recording type
Non-Recording Type Rain Gauge
Example – Symons Rain Gauge
The non-recording rain gauge is the most common type of rain gauge used by the
meteorological department. It consists of a cylindrical vessel 127 mm in diameter with a
base enlarged to 210 mm diameter. At its top, a funnel is provided with a circular brass
rim of exactly 127 mm diameter so that it fits well into the vessel. The funnel shank is
inserted in the neck of a receiving bottle, which is 75 to 100 mm high from the base and
thinner than the cylinder, and is placed inside it to receive the rainfall.
Fig: Symons Rain gauge with graduated glass of accuracy 0.1mm
The receiving bottle has a capacity of 100 mm of rainfall. During heavy rainfall this
amount is frequently exceeded, so the reading should be taken 3 to 4 times a day. The water
in the receiving bottle is measured with a graduated measuring glass accurate to 0.1 mm.
For uniformity, rainfall is measured every day at 8:30 AM IST and recorded as the rainfall
of the day.
Proper care, maintenance and inspection of the rain gauge, especially during dry weather,
are necessary to keep the instrument free from dust and dirt so that the readings are accurate.
Recording Type Rain Gauges
There are three types of recording rain gauges:
a) Weighing bucket type
b) Tipping bucket type
c) Floating or natural syphon type rain gauge
a) Weighing Bucket Type Rain Gauge
The weighing bucket rain gauge is the most common self-recording rain gauge. It consists of
a receiver bucket supported by a spring, a lever balance or some other weighing mechanism.
The movement of the bucket due to its increasing weight is transmitted to a pen, which
traces a record on a clock-driven chart.
The weighing bucket rain gauge gives a plot of the accumulated rainfall against the elapsed
time; the curve so formed is called the mass curve.
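A mass curve is simply the running total of the rainfall recorded in successive intervals. The following minimal sketch shows the calculation; the hourly readings are made-up illustrative values and the function name is hypothetical.

```python
from itertools import accumulate


def mass_curve(increments_mm):
    """Return the accumulated rainfall (mm) at the end of each interval."""
    return list(accumulate(increments_mm))


# Illustrative hourly rainfall increments in mm (not measured data).
hourly_mm = [0.0, 2.5, 1.0, 0.0, 4.0]
curve = mass_curve(hourly_mm)

for hour, total in enumerate(curve, start=1):
    print(f"hour {hour}: {total} mm accumulated")
```

Because rainfall increments are never negative, the curve never decreases; flat segments correspond to dry intervals.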
Fig: Weighing Bucket Type Rain Gauge
b) Tipping Bucket Type Rain Gauge
The tipping bucket rain gauge is a 30 cm circular rain gauge adopted for use by the US
Weather Bureau. It has a sharp-edged receiver of 30 cm diameter, at the end of which a
funnel is provided.
A pair of buckets is pivoted under this funnel in such a manner that when one bucket
receives 0.25 mm of precipitation (rainfall), it tips, discharging its rainfall into the
container and bringing the other bucket under the funnel.
Fig: Tipping Bucket Rain Gauge
The tipping of the bucket completes an electric circuit, causing a pen to mark a record
sheet carried on a clock-driven receiving drum. The electric pulses generated can be
recorded at a control room far away from the rain gauge station. This instrument is
therefore well suited to digitizing the output signal.
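Since each tip corresponds to 0.25 mm of rainfall, the recorded pulse counts convert directly into rainfall depth and intensity. The sketch below illustrates this under stated assumptions: the pulse counts and the 15-minute logging interval are hypothetical, and the function names are illustrative.

```python
MM_PER_TIP = 0.25  # rainfall depth per bucket tip, as stated in the text


def rainfall_from_tips(tip_count, mm_per_tip=MM_PER_TIP):
    """Convert a number of bucket tips into rainfall depth in mm."""
    return tip_count * mm_per_tip


def intensity_mm_per_hour(tip_count, interval_minutes, mm_per_tip=MM_PER_TIP):
    """Average rainfall intensity over one logging interval, in mm per hour."""
    return rainfall_from_tips(tip_count, mm_per_tip) * 60.0 / interval_minutes


# Hypothetical pulse counts logged every 15 minutes.
tips_per_interval = [0, 4, 12, 6]
depths_mm = [rainfall_from_tips(n) for n in tips_per_interval]
total_mm = sum(depths_mm)
peak_intensity = max(intensity_mm_per_hour(n, 15) for n in tips_per_interval)

print(f"Depth per interval (mm): {depths_mm}")
print(f"Total rainfall: {total_mm} mm, peak intensity: {peak_intensity} mm/h")
```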
c) Floating or Natural Syphon Type Rain Gauge
The working of this type of rain gauge is similar to that of the weighing bucket rain gauge.
A funnel receives the water, which is collected in a rectangular container. A float is
provided at the bottom of the container, and this float rises as the water level rises in
the container. Its movement is recorded by a pen moving on a recording drum actuated by clockwork.
Fig: Natural Syphon or Float Type Rain Gauge
Fig: Natural Syphon or Float Type Rain Gauge Details
When the water rises and the float reaches the top, the syphon comes into operation and
releases the water through the connecting pipe, so that all the water in the box is
drained out. This rain gauge is adopted as the standard recording rain gauge in India, and
the curve drawn using its data is known as the mass curve of rainfall.
3.3 METHODOLOGY OF RESEARCH
The research design of the study is causal in nature. Secondary data were collected from
the Statistical Abstract of Telangana, India; the number of studies conducted in this area
is limited. Yearly data for 10 districts of Telangana are divided into two groups, South
and North Telangana, of five districts each. A regression model is constructed for each
group of five districts. The data are analysed with a t-test to test the significance of
the difference between the South and North districts. ANOVA is used to test the variation
among the three seasons, and Tukey's HSD post hoc test is used to test the significance of
the difference between pairs of seasons.
3.4 Analysis of Zone wise Rainfall
For the analysis of zone-wise rainfall, Telangana state is divided into two major zones,
i.e. the South and North zones. The North districts of Telangana are Adilabad, Karimnagar,
Warangal, Khammam and Nizamabad. The South districts of Telangana are Medak,
Nalgonda, Rangareddy, Mahabubnagar and Hyderabad.
Graph-1: Graphical Representation of analysis Avg. rainfall from South and North districts
[Graph: average rainfall in mm by year, 2004 to 2018, for South Districts and North Districts]
Table-1: Year wise average rainfall of South and North districts from 2004-2016
YEAR South districts(Avg. rainfall) North districts(avg. rainfall)
2004 763 772.2
2005 1185.84 1303.88
2006 924.96 1200.26
2007 865.26 887.32
2008 1029.4 1013.92
2009 731.1 642.38
2010 1166.5 1292.82
2011 588.9 750.08
2012 873.48 1003.12
2013 1104.2 1362.34
2014 614.42 698.6
2015 658.32 784.9
2016 917.9 1184.68
2017 789.08 969.523
2018 774.32 966.142
Interpretation: Graph-1 shows that the average rainfall of both the South and the North
districts fluctuates from year to year, and that the two series move closely together. The
average rainfall is somewhat higher in the North region, but the difference between the two
regions is not statistically significant.
Graph-2: Scatter plot of South districts average rainfall
[Scatter plot: South districts average rainfall in mm against year, with fitted trend line f(x) = − 14.7669230769231 x + 30560.2292307692]
Regression statistical results of average rainfall for South zone
Year         Regression equation    R-Square    P-value    Statistically Significant
2004-2016    Y = -14.77X + 30560    0.08        0.33       No
Interpretation: From the scatter plot, the coefficient of determination R^2 = 0.08 indicates that the linear trend explains very little of the year-to-year variability of rainfall in the South region.
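The trend line can be reproduced with an ordinary least-squares fit. The sketch below is a plain-Python illustration, not the software used in the project: fitting the 2004-2016 rows of Table-1 gives the slope and intercept printed on the scatter plot, which suggests those are the years the fit was based on. The function name is illustrative.

```python
# Least-squares trend line for the South districts (Table-1, 2004-2016).
years = list(range(2004, 2017))
south_mm = [763, 1185.84, 924.96, 865.26, 1029.4, 731.1, 1166.5,
            588.9, 873.48, 1104.2, 614.42, 658.32, 917.9]


def fit_line(x, y):
    """Return slope, intercept and R-squared of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(years, south_mm)
print(f"Y = {slope:.4f} X + {intercept:.2f}, R-square = {r_squared:.3f}")
```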
Graph-3: Scatter plot of North districts average rainfall
[Scatter plot: North districts average rainfall in mm against year, with fitted trend line f(x) = − 3.38186813186814 x + 7789.59340659341]
Regression statistical results of average rainfall for North zone
Year         Regression equation    R-Square    P-value    Statistically Significant
2004-2016    Y = -3.381X + 7789     0.002       0.847      No
Interpretation: From the scatter plot, the coefficient of determination R^2 = 0.002 indicates that the linear trend explains almost none of the year-to-year variability of rainfall in the North region.
Since the average rainfall of the two regions is a quantitative variable, the t-test is appropriate for testing the significance of the difference between the two regions.
The SPSS output for the North and South regions is as follows.
t-Test: Two-Sample Assuming Equal Variances
                                South districts (Avg. Rainfall)    North districts (Avg. Rainfall)
Mean                            878.7138462                        992.0384615
Variance                        41014.71829                        64584.89976
Observations                    13                                 13
Pooled Variance                 52799.80903
Hypothesized Mean Difference    0
df                              24
t Stat                          -1.257374812
P(T<=t) one-tail                0.110358157
t Critical one-tail             1.710882067
P(T<=t) two-tail                0.220716314
t Critical two-tail             2.063898547
Research Hypothesis:
Null Hypothesis H0: There is no significant difference between the average rainfall of the two regions.
Alternative Hypothesis H1: There is a significant difference between the average rainfall of the two regions.
Conclusion: Since the P-value (0.2207) > α = 0.05, the null hypothesis H0 is not rejected. Hence, there is no significant difference between the average rainfall of the two regions.
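The test statistic in the output can be recomputed from Table-1. The sketch below uses only the Python standard library and the 2004-2016 rows, which are the 13 observations per region implied by df = 24 and by the reported means; the function name is illustrative. The p-value is not recomputed here because it needs a t-distribution routine that the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

# Average rainfall (mm), Table-1, years 2004-2016.
south_mm = [763, 1185.84, 924.96, 865.26, 1029.4, 731.1, 1166.5,
            588.9, 873.48, 1104.2, 614.42, 658.32, 917.9]
north_mm = [772.2, 1303.88, 1200.26, 887.32, 1013.92, 642.38, 1292.82,
            750.08, 1003.12, 1362.34, 698.6, 784.9, 1184.68]


def pooled_t_test(a, b):
    """Two-sample t statistic assuming equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df, pooled_var


t_stat, df, pooled_var = pooled_t_test(south_mm, north_mm)
T_CRITICAL_TWO_TAIL = 2.063898547  # taken from the output above (df = 24)

print(f"t = {t_stat:.4f}, df = {df}, pooled variance = {pooled_var:.2f}")
print("Reject H0" if abs(t_stat) > T_CRITICAL_TWO_TAIL else "Do not reject H0")
```

Comparing |t| with the two-tail critical value leads to the same decision as comparing the P-value with α.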
3.5 Conclusion from Zone wise rainfall
From the above analysis of zone-wise rainfall, it is found that precipitation fluctuates
within each zone, but no significant difference in rainfall is observed between the zones.
The reason is that the seasonal rainfall is distributed about equally across all the
districts, except in the Mahabubnagar and Nalgonda region.
3.6 ANALYSIS OF SEASONAL RAINFALL
The year in Telangana consists of three rainfall seasons, namely the South West monsoon,
the North East monsoon and the winter season. Rainfall varies considerably across the three
seasons according to climate. In this study the seasonal rainfall is analysed using
statistical tools, which help in drawing insights from the seasonal data.
Table-5: South West Monsoon (June-September) rainfall statistics
South West Monsoon (June-Sept), rainfall in mm
S.No  Year  Actual  Normal  % of deviation
1 2004-05 455.8 715.1 -36.26
2 2005-06 808.2 715.1 13.02
3 2006-07 728.9 715.1 1.93
4 2007-08 734.6 715.1 2.73
5 2008-09 755.2 715.1 5.61
6 2009-10 494.9 715.1 -30.79
7 2010-11 894.4 715.1 25.07
8 2011-12 601.1 715.1 -15.94
9 2012-13 707.2 715.1 -1.10
10 2013-14 851.5 715.1 19.07
11 2014-15 494.7 715.1 -30.82
12 2015-16 611.2 715.1 -14.53
13 2016-17 912 715.1 27.53
14 2017-18 846.4 715.1 15.51
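The "% of deviation" column is the departure of the actual rainfall from the normal, expressed as a percentage of the normal. The short sketch below recomputes it for three rows of Table-5; the function name is illustrative.

```python
NORMAL_SW_MM = 715.1  # normal South West Monsoon rainfall (mm), from Table-5


def percent_deviation(actual_mm, normal_mm):
    """Departure of actual rainfall from normal, as a percentage of normal."""
    return (actual_mm - normal_mm) / normal_mm * 100.0


# Actual rainfall (mm) for selected years, from Table-5.
actual_sw_mm = {"2004-05": 455.8, "2005-06": 808.2, "2016-17": 912.0}
deviations = {
    year: round(percent_deviation(mm, NORMAL_SW_MM), 2)
    for year, mm in actual_sw_mm.items()
}

for year, dev in deviations.items():
    print(f"{year}: {dev:+.2f} %")
```

A negative value marks a deficit relative to the normal and a positive value marks an excess.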
Graph-4: Graphical Representation of Southwest Monsoon rainfall
[Graph: actual and normal South West Monsoon rainfall in mm by year, 2004-05 to 2017-18]
Interpretation: The above graph shows that the highest rainfall, 912 mm, occurred in 2016-17 and the lowest, 455.8 mm, in 2004-05; in the remaining years rainfall fluctuates considerably.
Table-6: North East (October-December) rainfall statistics
North-East Monsoon (Oct-Dec), rainfall in mm
S.No Year Actual Normal % of deviation
1 2004-05 76.4 129.2 -40.87
2 2005-06 172.3 129.2 33.36
3 2006-07 65.4 129.2 -49.38
4 2007-08 61.6 129.2 -52.32
5 2008-09 38.6 129.2 -70.12
6 2009-10 122 129.2 -5.57
7 2010-11 152.6 129.2 18.11
8 2011-12 24 129.2 -81.42
9 2012-13 141.8 129.2 9.75
10 2013-14 243.2 129.2 88.24
11 2014-15 54.4 129.2 -57.89
12 2015-16 27.5 129.2 -78.72
13 2016-17 70.8 129.2 -45.20
14 2017-18 54.7 129.2 -57.66
Graph-5: Graphical Representation of North East Monsoon rainfall (x-axis: year, 2004-05 to 2017-18; y-axis: rainfall in mm; series: Actual and Normal)
Interpretation: The above graph shows that the highest rainfall, 243.2 mm, occurred in 2013-14 and the lowest, 24 mm, in 2011-12.
Table-7: Winter period (Jan-Feb) rainfall statistics
Winter Period (Jan-Feb), rainfall in mm
S.No Year Actual Normal % of deviation
1 2004-05 37.4 11.5 225.22
2 2005-06 0 11.5 -100.00
3 2006-07 0.6 11.5 -94.78
4 2007-08 19.6 11.5 70.43
5 2008-09 0 11.5 -100.00
6 2009-10 18.8 11.5 63.48
7 2010-11 10.1 11.5 -12.17
8 2011-12 8 11.5 -30.43
9 2012-13 34.5 11.5 200.00
10 2013-14 1.3 11.5 -88.70
11 2014-15 13 11.5 13.04
12 2015-16 1.5 11.5 -86.96
13 2016-17 0.6 11.5 -94.78
14 2017-18 0.2 11.5 -98.26
Graph-6: Graphical Representation of winter season rainfall statistics (x-axis: year, 2004-05 to 2017-18; y-axis: rainfall in mm; series: Actual and Normal)
Interpretation: The above graph shows a good amount of rainfall in 2004-05 and 2012-13, while in 2005-06 and 2008-09 no rainfall was recorded. In general, average rainfall in the winter season drops drastically compared with the preceding months.
3.7 ANOVA APPROACH FOR ANALYSIS OF THREE SEASONAL RAINFALL
Statistical Analysis of Seasonal Rainfall from 2004-2017
Year South West Monsoon (A) North-East Monsoon (B) Winter Period (C)
2004-05 455.8 76.4 37.4
2005-06 808.2 172.3 0
2006-07 728.9 65.4 0.6
2007-08 734.6 61.6 19.6
2008-09 755.2 38.6 0
2009-10 494.9 122 18.8
2010-11 894.4 152.6 10.1
2011-12 601.1 24 8
2012-13 707.2 141.8 34.5
2013-14 851.5 243.2 1.3
2014-15 494.7 54.4 13
2015-16 611.2 27.5 1.5
2016-17 912 70.8 0.6
2017-18 846.4 54.7 0.2
From the above table, the significance of the difference between the three seasonal variables can be analysed by one-way ANOVA.
SPSS Output: ANOVA Single Factor
Source of Variation SS df MS F P-value F crit
Between Groups 3623948 2 1811974.184 193.3388 0.000 3.259446
Within Groups 337392.6 39 9372.016795
Total 3961341 41
Research Hypothesis:
Null Hypothesis H0: There is no significant difference between the mean rainfall of the three seasons.
Alternative Hypothesis H1: The mean rainfall of at least one season differs significantly from the others.
Conclusion: Since P-value (0.000) < α = 0.05, reject the null hypothesis H0. Hence, we conclude that there is a significant difference between the rainfall of the three seasons.
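The one-way ANOVA computation can be sketched in plain Python as below. The function name is illustrative and this is not the SPSS procedure itself. As an assumption, only the 13 seasons 2004-05 to 2016-17 are used, since that subset appears to reproduce the reported sums of squares and F value; including 2017-18 would change the figures.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, ss_between, ss_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared mean offset.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared deviations from each group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within, ss_between, ss_within


# Seasonal rainfall (mm) for 2004-05 .. 2016-17, copied from the table above.
# Assumption: the last season (2017-18) is left out.
south_west = [455.8, 808.2, 728.9, 734.6, 755.2, 494.9, 894.4,
              601.1, 707.2, 851.5, 494.7, 611.2, 912.0]
north_east = [76.4, 172.3, 65.4, 61.6, 38.6, 122.0, 152.6,
              24.0, 141.8, 243.2, 54.4, 27.5, 70.8]
winter = [37.4, 0.0, 0.6, 19.6, 0.0, 18.8, 10.1,
          8.0, 34.5, 1.3, 13.0, 1.5, 0.6]

f_stat, df_b, df_w, ss_b, ss_w = one_way_anova([south_west, north_east, winter])
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

The F value is then compared with the critical value of the F distribution for the same degrees of freedom.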
3.8 TUKEY'S HSD POST HOC TEST APPROACH FOR AN ANALYSIS OF THREE SEASONAL RAINFALL
Tukey's HSD Post-Hoc test
ANOVA suggests that one or more treatments are significantly different, but a post-hoc test is needed to find which pairs of treatments are significantly different from each other.
$$Q = \frac{M_1 - M_2}{\sqrt{MS_w \left( \frac{1}{n} \right)}}$$
where M = treatment/group mean, n = number per treatment/group and MS_w = within-groups mean square from the ANOVA table.
Tukey HSD results
Treatment pair Tukey HSD Q-Statistic Tukey HSD P-value Tukey HSD Inference
A vs B 22.3438 0.0010053 P <0.05
A vs C 25.5101 0.0010053 P < 0.05
B vs C 3.1663 0.0781684 Insignificant
Decision: From the above table, we conclude that rainfall from the South-West monsoon is significantly different from that of the other two seasons, while the difference between North-East monsoon and winter season rainfall is not significant.
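The Q statistics in the table follow directly from the formula above. The sketch below uses the within-groups mean square from the ANOVA table; the group totals and the group size of 13 are assumptions, computed from the seasons 2004-05 to 2016-17 of the seasonal table, and the function name is illustrative.

```python
import math


def tukey_q(mean_i, mean_j, ms_within, n_per_group):
    """Studentized range statistic Q for one pair of group means."""
    return abs(mean_i - mean_j) / math.sqrt(ms_within / n_per_group)


MS_WITHIN = 9372.016795  # within-groups mean square from the ANOVA table
N_PER_GROUP = 13         # assumption: seasons 2004-05 .. 2016-17

# Group means = group totals over the 13 seasons / 13 (totals computed
# from the seasonal table; they are not printed in the report).
means = {
    "A": 9049.7 / N_PER_GROUP,  # South West monsoon
    "B": 1250.6 / N_PER_GROUP,  # North-East monsoon
    "C": 145.4 / N_PER_GROUP,   # Winter period
}

for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    q = tukey_q(means[first], means[second], MS_WITHIN, N_PER_GROUP)
    print(f"{first} vs {second}: Q = {q:.4f}")
```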
3.9 CONCLUSION OF SEASONAL RAINFALL
From the above analysis, we conclude that the amount of rainfall changes from one season to another. In the winter season, the amount of rainfall is expected to be less than in the other two seasons. The one-way ANOVA shows a significant difference in rainfall among the three seasons. The Tukey HSD test revealed that rainfall from the Southwest monsoon differs significantly from that of the other two seasons, while the difference between Northeast monsoon and winter season rainfall is not significant.
4. ANALYSIS OF MONTHLY RAINFALL
4.1 Introduction
This project is based on univariate time series analysis and forecasting, which has become a major tool in hydrology, environmental management and climatic fields. Various time series methods have been used for modeling and forecasting rainfall data in the literature, but according to Pankratz (1983) the Box and Jenkins method is the most general approach to forecasting: unlike other models, there is no need to assume initially a fixed and specified pattern. The univariate Box and Jenkins models are most useful for the analysis of a single time series.
The Box and Jenkins methodology is regarded as one of the most accurate methods for forecasting time series data. According to Caldwell (2006), the Box-Jenkins methodology is particularly suited for the development of models of processes exhibiting strong seasonal behavior. Other forecasting techniques that explore the relation among observations can also yield good results; most of these are based on the advances in time series analysis consolidated and developed by Box and Jenkins (1976). In this project the Box-Jenkins approach is used to build a seasonal model of the monthly rainfall data of the Telangana region. The estimation and diagnostic analysis results revealed that the model is well fitted to the historical data. Residual analysis revealed no violation of the assumptions relating to model adequacy. Further, the forecasting accuracy of the model was assessed by holding out some rainfall values. The point forecasts showed a close match with the pattern of the actual data and good forecasting accuracy in the validation period.
Seasonal behavior is found in the data, as rainfall reaches a peak level in certain months of every year. To account for this seasonal variation, the Seasonal Autoregressive Integrated Moving Average (SARIMA) method is adopted for analysis, diagnostic checking and forecasting.
In this project I used the Seasonal Autoregressive Integrated Moving Average (SARIMA) model, proposed by Box and Jenkins (1976), for model identification, building and forecasting of rainfall data. The Box and Jenkins methodology is a powerful approach to many forecasting problems (Johnson and Montgomery, 1976); it can provide reliable and accurate forecasts of time series and offers a formal, structured approach to model building and analysis. There are many quantitative methods of model building and forecasting in use in climatological and meteorological studies. With the development and availability of statistical software packages, these techniques have become easier, faster and more accurate to use. In this study, the SPSS and R software packages are used for the statistical data analysis.
4.2 Methodology of Research
Box-Jenkins analysis refers to a systematic method of identifying, fitting, checking, and using autoregressive integrated moving average (ARIMA) time series models.
A time series is a set of values observed sequentially through time. The series may be denoted by X1, X2, ..., Xt, where t refers to the time period and X refers to the value. If the X's are exactly determined by a mathematical formula, the series is said to be deterministic. If future values can be described only by their probability distribution, the series is said to be a statistical or stochastic process. A special class of stochastic processes is the stationary stochastic process. A statistical process is stationary if the probability distribution is the same for all starting values of t. This implies that the mean and variance are constant for all values of t. A series that exhibits a simple trend is not stationary because the values of the series depend on t. A stationary stochastic process is completely defined by its mean, variance, and autocorrelation function. One of the steps in the Box-Jenkins method is to transform a non-stationary series into a stationary one.
Since a stationary series is completely specified by its mean, variance, and autocorrelation function, one of the major (and most subjective) tasks in Box-Jenkins analysis is to identify an appropriate model from the sample autocorrelation function. Although the sample autocorrelations contain random fluctuations, for moderate sample sizes they are fairly accurate in signalling the order of the ARIMA model.
4.3 ARIMA Model
The ARMA (autoregressive, moving average) model is defined as follows:
$$X_t = \phi_1 X_{t-1} + \dots + \phi_p X_{t-p} + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q}$$
where the φ's (phis) are the autoregressive parameters to be estimated, the θ's (thetas) are the moving average parameters to be estimated, the X's are the original series and the a's are a series of unknown random errors (or residuals) which are assumed to follow the normal probability distribution. Box-Jenkins use the backshift operator to make writing these models easier. The backshift operator, B, has the effect of changing time period t to time period t-1. Thus $B X_t = X_{t-1}$ and $B^2 X_t = X_{t-2}$. Using backshift notation, the above model may be rewritten as
$$(1 - \phi_1 B - \dots - \phi_p B^p) X_t = (1 - \theta_1 B - \dots - \theta_q B^q) a_t$$
This may be abbreviated even further by writing
$$\phi_p(B) X_t = \theta_q(B) a_t$$
where
$$\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^p$$
and
$$\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^q$$
These formulas show that the operators $\phi_p(B)$ and $\theta_q(B)$ are polynomials in B of orders p and q respectively. One of the benefits of writing models in this fashion is that we can see why several models may be equivalent.
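As a small illustration of such an equivalence (the coefficient 0.5 is chosen purely for the example and is not an estimate from the rainfall data), consider an ARMA(1,1) model whose autoregressive and moving average polynomials coincide. The common factor cancels, so the model reduces to white noise:

```latex
\begin{align*}
X_t &= 0.5\,X_{t-1} + a_t - 0.5\,a_{t-1} \\
(1 - 0.5B)\,X_t &= (1 - 0.5B)\,a_t \\
X_t &= a_t
\end{align*}
```

In the expanded form the redundancy is hard to spot, whereas in backshift notation the shared factor $(1 - 0.5B)$ is visible at once.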
Non-stationary Models
Many time series encountered in practice exhibit non-stationary behavior. Usually, the non-stationarity is due to a trend, a change in the local mean, or seasonal variation. Since the Box-Jenkins methodology is for stationary models only, we have to make some adjustments before we can model these non-stationary series. We use one of two methods for reducing a non-stationary series with trend to a stationary series (without trend):
1. Use the first differences of the series, $W_t = X_t - X_{t-1}$. Note that this can be rewritten as $W_t = (1 - B) X_t$. A more general form of this equation is
$$\phi_p(B) (1 - B)^d X_t = \theta_q(B) a_t$$
where d is the order of differencing. This is known as the ARIMA(p,d,q) model.
2. Fit a least squares trend and fit the Box-Jenkins model to the residuals.
If the series exhibits an occasional change of mean, first differences will result in a stationary series. For seasonal series, Box-Jenkins provided a modification to this equation that will be the subject of the next section.
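The differencing operation described above can be sketched as follows. The function name and its `lag` and `order` parameters are illustrative; `lag=1` gives the ordinary difference $(1 - B)X_t$, and a larger lag gives a seasonal difference.

```python
def difference(series, lag=1, order=1):
    """Apply (1 - B^lag) to the series `order` times."""
    out = list(series)
    for _ in range(order):
        # Each pass shortens the series by `lag` observations.
        out = [out[i] - out[i - lag] for i in range(lag, len(out))]
    return out


# A series with a linear trend becomes constant after one difference.
trend = [2, 5, 8, 11, 14]
print(difference(trend))           # first differences
print(difference(trend, order=2))  # second differences
```

Each pass of differencing shortens the series by `lag` observations, so a heavily differenced series has fewer points available for estimation.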
4.4 Autocorrelation and Partial Autocorrelation Function
Autocorrelation function(ACF)
Autocorrelation refers to the correlation of a time series with its own past and future values.
Autocorrelation is sometimes called “serial correlation”, which refers to the correlation
between members of a series of numbers arranged in time. Alternative terms are “lagged
correlation”, and “persistence.” Geophysical time series are frequently autocorrelated because
of inertia or carryover processes in the physical system. For example, the slow drainage of
ground water reserves might impart correlation to successive annual flows of a river. Or
stored photosynthates might impart correlation to successive annual values of tree-ring
indices. Autocorrelation complicates the application of statistical tests by reducing the
effective sample size. Autocorrelation can also complicate the identification of significant
covariance or correlation between time series (e.g., precipitation with a tree-ring series).
Three tools for assessing the autocorrelation of a time series are (1) the time series plot, (2)
the lagged scatterplot, and (3) the autocorrelation function.
In statistics, the autocorrelation of a random process is the Pearson correlation between
values of the process at different times, as a function of the two times or of the time lag. Let
X be a random process, and t be any point in time (t may be an integer for a discrete-time
process or a real number for a continuous-time process). Then Xt is the value (or realization)
produced by a given run of the process at time t. Suppose that the process has mean μ_t and
variance σ_t² at time t, for each t. Then the definition of the autocorrelation between times s
and t is
$$R(s,t) = \frac{E\left[ (X_t - \mu_t)(X_s - \mu_s) \right]}{\sigma_t \sigma_s}$$
where "E" is the expected value operator. Note that this expression is not well-defined for
all time series or processes, because the mean may not exist, or the variance may be zero (for
a constant process) or infinite (for processes with distribution lacking well-behaved
moments, such as certain types of power law). If the function R is well-defined, its value
must lie in the range [−1, 1], with 1 indicating perfect correlation and −1 indicating perfect
anti-correlation.
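In practice the autocorrelation is estimated from a single observed series. The sketch below uses the usual sample estimator, which assumes stationarity (one overall mean and variance, so the correlation depends only on the lag); the function name is illustrative.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


r = sample_acf([1, 2, 3, 4, 5], max_lag=2)
print([round(v, 3) for v in r])
```

By construction `r[0]` is 1, and every other value lies between -1 and 1.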
Partial Autocorrelation function(PACF)
In time series analysis, the partial autocorrelation function (PACF) gives the partial
correlation of a time series with its own lagged values, controlling for the values of the time
series at all shorter lags. It contrasts with the autocorrelation function, which does not
control for other lags.
This function plays an important role in data analysis aimed at identifying the extent of the
lag in an autoregressive model. The use of this function was introduced as part of the Box–
Jenkins approach to time series modelling, whereby plotting the partial autocorrelation
function one could determine the appropriate lag p in an AR(p) model or in an extended
ARIMA(p,d,q) model. Given a time series Z_t, the partial autocorrelation at lag k, denoted α(k),
is the autocorrelation between Z_t and Z_{t+k} with the linear dependence of Z_t on
Z_{t+1} through Z_{t+k-1} removed; equivalently, it is the autocorrelation between Z_t
and Z_{t+k} that is not accounted for by lags 1 to k − 1, inclusive.
$$\alpha(1) = \operatorname{Cor}(Z_{t+1}, Z_t)$$
$$\alpha(k) = \operatorname{Cor}\left( Z_{t+k} - P_{t,k}(Z_{t+k}),\; Z_t - P_{t,k}(Z_t) \right), \quad k \ge 2$$
where P_{t,k}(x) denotes the projection of x onto the space spanned by Z_{t+1}, ..., Z_{t+k-1}.
There are algorithms for estimating the partial autocorrelation based on the sample
autocorrelations. These algorithms derive from the exact theoretical relation between the
partial autocorrelation function and the autocorrelation function.
Partial autocorrelation plots are a commonly used tool for identifying the order of an
autoregressive model. The partial autocorrelation of an AR(p) process is zero at lag p + 1
and greater. If the sample autocorrelation plot indicates that an AR model may be
appropriate, then the sample partial autocorrelation plot is examined to help identify the
order. One looks for the point on the plot where the partial autocorrelations for all higher
lags are essentially zero. Placing on the plot an indication of the sampling uncertainty of the
sample PACF is helpful for this purpose: this is usually constructed on the basis that the true
value of the PACF, at any given positive lag, is zero. This can be formalized as described
below.
An approximate test that a given partial correlation is zero (at a 5% significance level) is given by comparing the sample partial autocorrelations against the critical region with upper and lower limits given by $\pm 1.96 / \sqrt{n}$, where n is the record length (number of points) of the time series being analysed. This approximation relies on the assumption that the record length is at least moderately large (say n > 30) and that the underlying process has finite second moment.
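One common algorithm for obtaining partial autocorrelations from autocorrelations is the Durbin-Levinson recursion. The sketch below is a plain-Python illustration under that assumption (the report does not say which algorithm SPSS or R used), together with the approximate 5% limits; both function names are illustrative.

```python
import math


def pacf_from_acf(rho):
    """Durbin-Levinson recursion.

    rho[k] is the lag-k autocorrelation, with rho[0] == 1.
    Returns the partial autocorrelations for lags 1 .. len(rho) - 1.
    """
    pacf = []
    phi_prev = []  # phi_prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(rho)):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


def significance_limit(n):
    """Approximate 5% limit for a sample (partial) autocorrelation."""
    return 1.96 / math.sqrt(n)


# Theoretical ACF of an AR(1) process with coefficient 0.5: rho_k = 0.5 ** k.
print(pacf_from_acf([1.0, 0.5, 0.25, 0.125]))
print(round(significance_limit(228), 4))  # 228 monthly observations
```

For the AR(1) autocorrelations the partial autocorrelation is non-zero only at lag 1, which is the cut-off pattern used to identify the autoregressive order.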
4.5 Road Map of Box-Jenkins Methodology
1. Introduction: Overview
2. Identification: Overview; Identifying d; Seasonality; Identifying p and q
3. Estimation and information criteria: Estimation; Information criteria
4. Diagnostic checking
5. Model's use
Step 1 (identification) involves determining the order of the model required (p, d, and q) in order to capture the salient dynamic features of the data. This mainly relies on graphical procedures (plotting the series, the ACF and PACF, etc.).
Step 2 (estimation and selection) involves estimation of the parameters of the different models (identified in step 1) and proceeds to a first selection of models (using information criteria).
Step 3 (checking) involves determining whether the model(s) specified and estimated are adequate. Notably, one uses residual diagnostics.
Flow chart of the Box-Jenkins procedure:
Tentative Identification (Time Series Plot, Range-Mean Plot, ACF and PACF)
Estimation (Least Squares or Maximum Likelihood)
Diagnostic Checking (Residual Analysis & Forecast)
Model OK? No: return to Tentative Identification. Yes: Use the Model (Forecasting)
Identification:
Three Objectives
1. Stationarity or non-stationarity? What is the order of differencing (d)?
2. Seasonal component?
3. Identification of the ARIMA order (p, q).
4.6 Monthly wise Rainfall data from 2000-2018
For the analysis and forecasting of rainfall, 19 years of month-wise rainfall data (2000-2018) has been collected, giving 228 data points.
Month 2000 2001 2002 2003 2004 2005 2006 2007 2008
JAN 0 5.2 13.3 7.12 21.3 31.9 0 0 0
FEB 25.4 0 1.8 1.23 6.34 16.6 4.6 1.99 20
MAR 1 17.5 1.3 7.25 8.55 13.7 4.85 0.12 11.77
APR 15.5 32.5 6.3 22.32 30.72 8.2 44.83 14.91 13
MAY 53.8 7.4 33.3 38.21 44.39 9.1 100 15.49 7.7
JUN 241.1 183.8 136.9 100.24 52.44 104.8 150.1 166.08 123.8
JUL 212.5 122.3 88.9 52.12 235.59 412.8 289.6 126.58 173.7
AUG 226.6 240.6 289.6 127.23 133.97 146.8 290.2 205.17 352.9
SEP 68.5 144 86.5 98.25 121.97 319.6 262.61 260.43 145.6
OCT 24.1 139.8 108.7 89.12 75.38 169.1 46.2 51.37 28.8
NOV 3.4 2.9 1.1 3.27 6.77 2.3 24 15.62 13.8
DEC 2.4 0 0 0 0 3.2 0 1.54 0.59

Month 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
JAN 2.2 11.2 0.0 15.8 2.8 0.1 12.2 1.1 0 0
FEB 0.0 7.2 9.7 0.3 3 1.9 0 0.2 0 2.5
MAR 5.3 0.9 2.3 1.5 0.1 47.7 30 11.3 8.7 5.5
APR 6.2 10.9 33.1 16.6 25.1 10 56.4 4.4 3.2 23.4
MAY 26.9 40.6 20.3 10.5 12.5 70.5 22.6 54.8 22.4 39.2
JUN 65.5 102.3 80.5 114 185.6 55.4 216.1 199.8 202.2 169.3
JUL 112.2 389.5 261.6 241.6 386.2 142.9 79.2 234.2 141.4 197.1
AUG 165.4 278.9 230.3 190.03 212.3 173.4 151.4 131.1 200.7 294.1
SEP 154.8 242.3 84.4 167 165.1 126.8 154.2 336.3 113 80.1
OCT 94.3 101.7 15.7 85.8 232.6 43.3 22.1 70.4 114.6 17.1
NOV 31.9 45.7 1.7 58.4 17.3 12 2.3 0.5 4.9 1.4
DEC 1.1 17.7 0.0 0 1.1 1.5 1.4 0.4 0 23.2
4.7 Time Series plots of Rainfall
Graph: Monthly rainfall from 2000-2018
The above time series plot indicates that the series is non-stationary: its mean and variance are not constant, because rainfall in some months is much higher than in other months in every year. The data in this form cannot be used directly to forecast future rainfall; the series must first be made stationary, which requires differencing the data points.
Graph: Rainfall for d=1
From the above time series plot, the differenced series appears to be stationary, as its peaks are uniform from one period to the next; hence this data is adequate for forecasting further rainfall using ARIMA. Before forecasting, an appropriate model must be built with suitable p, d, q values, which play a major role in building the model. For this, the ACF and PACF plots are essential for assessing stationarity and identifying the p, d, q values.
Figure: ACF, PACF plots of rainfall
Figure 1: ACF time plot at d=0 of Telangana State Rainfall
Figure 1: PACF time plot at d=0 of Telangana State Rainfall
An ARIMA model is estimated only after transforming the variable under forecasting into a stationary series. For this, the sample autocorrelation function is computed to check whether the series is stationary or non-stationary. Plotting the series is the first step of ARIMA model identification; the plots for d = 0 are shown in Figure 1. From the above ACF and PACF charts, the sample ACF dies out slowly at the higher lags, which together with the time plot indicates that the time series is non-stationary and that there is clear seasonal variation in the present data set. Non-stationarity in variance is corrected by taking a log transformation, and non-stationarity in mean is corrected by appropriate differencing of the data. Hence the original series needs to be differenced to make it stationary.
Figure 2: ACF time plot after log transformation and differencing at d = 1 of Telangana State Rainfall
Figure 2: PACF time plot after log transformation and differencing at d = 1 of Telangana State Rainfall
Stationarity can now be checked using a visual display of the ACF and PACF time plots in Figure 2.
The ACF and PACF plots in Figure 1 show that the rainfall time series is not stable for d = 0: the ACF decays slowly, so the series is non-stationary. For d = 1, the time plot is stationary. Converting a non-stationary time series to a stationary one through differencing (where needed) is an important part of the process of fitting an ARIMA model.
The ACF and PACF plots for d = 1 in Figure 2 indicate that the first-differenced rainfall series is stationary, since the peaks are within the limits; further examination is therefore required to establish the most suitable ARIMA model.
4.8 Model Identification
Several SARIMA models with different parameters have been fitted and checked for adequacy. It should be mentioned that identifying a best ARIMA model does not mean that it is the only model that can be considered; other ARIMA models with AR and MA orders lower than those of the selected model may also be candidates. This shows the need for a specific criterion to select the most reliable model. BIC was chosen to select the best model after estimating its parameters using the maximum likelihood technique.
Table 1: BIC, RMSE and MAPE values of some tested SARIMA models with different parameters
(p,d,q)(P,D,Q)4 BIC RMSE MAPE Significance of parameters Ljung-Box Sig. Adequacy
(0,1,1)(1,0,1) 8.226 57.710 397.219 Significant 23.355 0.077 Adequate
(0,1,1)(1,0,2) 8.215 57.270 378.857 Significant 21.965 0.079 Adequate
(0,1,1)(1,0,3) 8.269 58.132 450.867 Insignificant 22.082 0.054 Adequate
(0,1,1)(1,0,4) 8.265 57.332 398.865 Insignificant 22.705 0.030 Adequate
(0,1,1)(1,1,1) 8.120 55.149 342.413 Insignificant 13.490 0.565 Adequate
(0,1,1)(1,1,3) 8.173 55.239 348.651 Insignificant 13.059 0.443 Adequate
(0,1,1)(1,1,4) 8.192 55.07 355.922 Insignificant 13.358 0.344 Adequate
(1,1,1)(1,0,1) 8.226 57.581 390.553 Insignificant 21.168 0.097 Adequate
(1,1,1)(1,0,2) 8.217 56.369 387.241 Insignificant 13.047 0.444 Adequate
(1,1,1)(1,0,3) 8.246 56.778 368.317 Insignificant 13.763 0.316 Adequate
(1,1,1)(1,0,4) 8.268 56.736 311.021 Insignificant 14.354 0.214 Adequate
(1,1,1)(1,1,1) 8.146 55.176 327.812 Insignificant 11.578 0.640 Adequate
(1,1,1)(1,1,2) 8.197 55.898 542.336 Insignificant 13.297 0.425 Adequate
(1,1,1)(1,1,3) 8.198 55.240 325.528 Insignificant 11.808 0.461 Adequate
(1,1,1)(1,1,4) 8.222 55.197 347.177 Insignificant 12.077 0.358 Adequate
(1,1,1)(1,1,5) 8.246 55.183 354.120 Insignificant 12.521 0.252 Adequate
(1,1,6)(1,1,1) 8.238 54.951 418.741 Insignificant 6.179 0.800 Adequate
(0,1,12)(1,1,1) 8.401 55.315 432.346 Insignificant 3.670 0.453 Adequate
(0,1,1)(1,1,6) 8.217 54.376 425.690 Insignificant 14.099 0.169 Adequate
From the above table it is observed that only SARIMA(0,1,1)(1,0,1) and SARIMA(0,1,1)(1,0,2) have significant parameters, and both are adequate; of these two, SARIMA(0,1,1)(1,0,2) has the lower BIC (8.215). So the most suitable model is SARIMA(0,1,1)(1,0,2).
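The selection rule applied to Table 1 can be written down explicitly. The sketch below encodes one reading of that rule, which is an assumption rather than the author's stated procedure: keep the models whose parameters are all significant and whose Ljung-Box significance exceeds 0.05, then take the one with the smallest BIC. The rows are copied from Table 1; the function name is illustrative.

```python
# (model, BIC, parameters significant?, Ljung-Box Sig.) copied from Table 1
CANDIDATES = [
    ("(0,1,1)(1,0,1)", 8.226, True, 0.077),
    ("(0,1,1)(1,0,2)", 8.215, True, 0.079),
    ("(0,1,1)(1,0,3)", 8.269, False, 0.054),
    ("(0,1,1)(1,0,4)", 8.265, False, 0.030),
    ("(0,1,1)(1,1,1)", 8.120, False, 0.565),
    ("(0,1,1)(1,1,3)", 8.173, False, 0.443),
    ("(0,1,1)(1,1,4)", 8.192, False, 0.344),
    ("(1,1,1)(1,0,1)", 8.226, False, 0.097),
    ("(1,1,1)(1,0,2)", 8.217, False, 0.444),
    ("(1,1,1)(1,0,3)", 8.246, False, 0.316),
    ("(1,1,1)(1,0,4)", 8.268, False, 0.214),
    ("(1,1,1)(1,1,1)", 8.146, False, 0.640),
    ("(1,1,1)(1,1,2)", 8.197, False, 0.425),
    ("(1,1,1)(1,1,3)", 8.198, False, 0.461),
    ("(1,1,1)(1,1,4)", 8.222, False, 0.358),
    ("(1,1,1)(1,1,5)", 8.246, False, 0.252),
    ("(1,1,6)(1,1,1)", 8.238, False, 0.800),
    ("(0,1,12)(1,1,1)", 8.401, False, 0.453),
    ("(0,1,1)(1,1,6)", 8.217, False, 0.169),
]


def select_model(candidates, alpha=0.05):
    """Lowest-BIC model among those with significant parameters
    and no evidence of residual autocorrelation at level alpha."""
    eligible = [c for c in candidates if c[2] and c[3] > alpha]
    return min(eligible, key=lambda c: c[1])


best = select_model(CANDIDATES)
print(best[0], best[1])
```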
Table: R-Output of SARIMA (0,1,1)(1,0,2)4 model parameters
Parameter Estimate SE t Sig. Remark
Difference 1
MA 1 1.000 0.055 18.162 0.000 Significant
AR Seasonal 1 0.999 0.006 175.462 0.000 Significant
MA Seasonal 1 0.727 0.166 4.378 0.000 Significant
MA Seasonal 2 0.213 0.095 2.238 0.026 Significant
The fitted model for forecasting rainfall has the general form
$$\phi_p(B) X_t = \theta_q(B) a_t$$
With the estimates above, the SARIMA(0,1,1)(1,0,2)4 model is
$$(1 - 0.999B^4)(1 - B) X_t = (1 - 1.000B)(1 - 0.727B^4 - 0.213B^8) a_t$$
Similarly, the equation of the time series for forecasting can be written as
$$\hat{X}_t = \mu + 0.999 X_{t-4} + 1.00 a_{t-1} + 0.727 a_{t-4} + 0.213 a_{t-5}$$
where $\hat{X}_t$ is the estimate of rainfall at time t, μ is the mean of the time series process, $X_{t-i}$ are the past observations and $a_{t-i}$ are the random shocks.
4.9 Diagnostic Checking of Rainfall data:
The final stage in Box and Jenkins modelling is diagnostic checking, which requires that the residuals look like white noise. Diagnostic checking is done by examining the autocorrelations and partial autocorrelations of the residuals at various lags. It is through diagnostic checks that a model can be declared statistically adequate and thereafter be used to forecast. The residual plots and the Normal Q-Q plot show that the residuals behave like a white noise process.
Graph: Residual ACF and PACF Plots of rainfall statistics
From the above residual plots, the peaks are within the limits; hence the residual autocorrelations are not significantly different from zero at the 5% level. This indicates that the model is appropriate.
Portmanteau Test:
In this test, the autocorrelations of the residuals for 25 lags are computed and their joint significance is tested by the Box-Ljung Q-test statistic. The hypotheses on the model are:
H0: The selected model is adequate.
H1: The selected model is inadequate.
Table 4: Portmanteau Test
Ljung-Box Q-Test Statistic DF P-value
21.965 14 0.079
Since the probability corresponding to the Box-Ljung Q-statistic is greater than 0.05, we fail to reject H0 and conclude that the selected seasonal autoregressive integrated moving average model is an adequate model for the given time series of monthly rainfall in Telangana State.
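For reference, the Ljung-Box statistic is $Q = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$, where $r_k$ is the lag-k residual autocorrelation and n is the number of residuals. The sketch below computes Q from a list of residual autocorrelations and compares the reported statistic with the 5% chi-square critical value for 14 degrees of freedom (23.685, a standard table value). The function name and the toy input are illustrative; the residual autocorrelations of the fitted model are not listed in the report.

```python
def ljung_box_q(residual_acf, n):
    """Ljung-Box Q; residual_acf[k - 1] is the lag-k autocorrelation."""
    return n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(residual_acf, start=1)
    )


# Toy input only: one residual autocorrelation of 0.1 from 100 residuals.
print(round(ljung_box_q([0.1], n=100), 4))

# Decision for the reported statistic (Table 4): Q = 21.965 with 14 df.
CHI2_CRIT_05_DF14 = 23.685  # upper 5% point of chi-square with 14 df
REPORTED_Q = 21.965
model_adequate = REPORTED_Q < CHI2_CRIT_05_DF14
print("fail to reject H0" if model_adequate else "reject H0")
```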
Graph: Normal Q-Q Plot of Standardized Residual Rainfall
The above Q-Q plot indicates that the standardized residuals of the monthly rainfall model are approximately normal. The points are scattered around the straight line, which rises from lower left to upper right; this indicates that the assumption of normality is satisfied.
4.9.1 Forecasting of monthly Rainfall
Forecasting results play a vital role for decision makers in taking the right decision with minimum error; decision making is the final stage of statistical analysis. In this project the SARIMA model is used for model building and for forecasting the rainfall for the two years 2019 and 2020.
Graph: R-Output of forecasting the monthly rainfall from 2019 to 2020
Table: Forecast monthly rainfall data of the years 2019 and 2020
Month/Year Forecast Rainfall (in mm) Month/Year Forecast Rainfall (in mm)
Jan 2019 08.88 Jan 2020 11.15
Feb 2019 08.08 Feb 2020 10.35
Mar 2019 12.62 Mar 2020 14.65
Apr 2019 24.77 Apr 2020 24.47
May 2019 38.94 May 2020 37.82
Jun 2019 146.31 Jun 2020 143.18
Jul 2019 206.28 Jul 2020 204.26
Aug 2019 229.85 Aug 2020 211.76
Sep 2019 148.62 Sep 2020 162.85
Oct 2019 64.44 Oct 2020 80.55
Nov 2019 14.31 Nov 2020 17.69
Dec 2019 12.05 Dec 2020 08.66
Conclusion: The table above gives the forecast rainfall for the two years 2019 and 2020. Comparing the preceding year (2018) with the forecast rainfall, an increasing trend is observed in a few months. Over the two forecast years, rainfall increases in the initial months and shows oscillatory movement in the remaining months.
Comparison of Actual Rainfall in 2019 for four months and Forecasted Rainfall
Month Jan Feb Mar Apr
Actual Rainfall 2019 13.2 5.7 10.2 16.4
Forecast 2019 8.88 8.08 12.62 24.77
Forecast 2020 11.15 10.35 14.65 24.47
Comparing the actual rainfall with the forecast rainfall for the first four months of 2019, the differences are modest (between about 2.4 and 8.4 mm) and the prediction is acceptable. Hence we may conclude that the SARIMA forecast of rainfall is approximately similar to the actual rainfall.
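The closeness of forecast and actual rainfall can be summarised with standard error measures. The sketch below computes the mean absolute error and root mean square error for the four available months of 2019, using the values from the comparison table; the function name is illustrative and these summary figures are not reported in the study itself.

```python
import math


def forecast_errors(actual, forecast):
    """Return (MAE, RMSE) for paired actual and forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Jan-Apr 2019 rainfall in mm, from the comparison table
actual_2019 = [13.2, 5.7, 10.2, 16.4]
forecast_2019 = [8.88, 8.08, 12.62, 24.77]

mae, rmse = forecast_errors(actual_2019, forecast_2019)
print(f"MAE = {mae:.2f} mm, RMSE = {rmse:.2f} mm")
```

With only four months available, these figures give a rough indication rather than a full validation of the model.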
5. SUMMARY OF FINDINGS AND FUTURE SCOPE
5.1 Objectives of the Project
To provide and improve the best decision support under time constraints and weather fluctuations.
To support the best decisions on optimum utilization of rainfall, so that decision making is optimally feasible for the people.
To improve the situation in periods of lowest rainfall.
To minimize the maximum loss at the time of heavy rainfall.
To verify the forecasting results of the proposed models graphically using SPSS and R software.
5.2 Summary of Findings
Water is our natural resource and it is vital to all forms of life. Water is used in various fields such as transportation, power generation, domestic consumption, agriculture and industry. The availability of water depends on the amount of precipitation in a particular area.
The rainfall characteristics, especially variability and trend, are necessary for the proper design of hydro-related schemes such as
(i) clean water supply
(ii) reservoirs
(iii) storm water channels in rapidly growing regions.
In the first part of the analysis, zone-wise data was taken. Average rainfall fluctuates between the South and North districts, but the fluctuation from one district to another is very small. The average rainfall is slightly higher in the South region, but the two regions do not differ significantly. The linear regression analysis shows a statistically insignificant increasing trend in annual mean rainfall among the zones under study. A t-test was applied to test the significance of the difference in rainfall, and no significant difference was found between the two zones.
In the second part of the analysis, seasonal rainfall data was taken. One-way ANOVA was used for the analysis of seasonal rainfall, and a significant difference in seasonal rainfall was observed. Tukey's HSD post-hoc test was used for testing the pairs of seasons; rainfall from the South-West monsoon is significantly different from that of the other two seasons, while the difference between North-East monsoon and winter season rainfall is not significant.
In the third part of the analysis, monthly data for nineteen years was taken. The Box-Jenkins methodology was used for building the model and forecasting future rainfall. Seasonality was found in the monthly rainfall data, so the SARIMA model was used. The ACF and PACF functions were used to test stationarity, and the rainfall series was found to be non-stationary. To convert the non-stationary series to a stationary one, the data were differenced. A t-test was used to test the significance of the parameters of the constructed model, and the parameters were found to be significant. The Ljung-Box test was used to test the adequacy of the model, and the model was found to be adequate. Using the SARIMA model of the Box-Jenkins methodology, rainfall was forecast for twenty-four months.
5.3 Future Scope
Model selection plays a major role in rainfall forecasting, because a suitable model extracts better results from the collected data. In this analysis the SARIMA model of the Box-Jenkins methodology gave good results on the data points. Even though this model has given a good solution, the solution can be improved by applying more advanced tools to obtain more accurate results. Neural networks and GARIMA models are advanced tools for model building and forecasting of rainfall data.
Sources of Data Collection
1. Indiastats.com
2. Data.gov.in
3. Directorate of Economics and Statistics, Telangana State
4. Meteorological Department of Telangana State
5. Meteorological Department of India