

Journal Pre-proof

Detection of oil pollution impacts on vegetation using multifrequency SAR, multispectral images with fuzzy forest and random forest methods

Mohammed Ozigis, Jorg Kaduk, Claire Jarvis, Polyanna da Conceição Bispo, Heiko Balzter

PII: S0269-7491(19)31660-4

DOI: https://doi.org/10.1016/j.envpol.2019.113360

Reference: ENPO 113360

To appear in: Environmental Pollution

Received Date: 31 March 2019

Revised Date: 28 September 2019

Accepted Date: 6 October 2019

Please cite this article as: Ozigis, M., Kaduk, J., Jarvis, C., da Conceição Bispo, P., Balzter, H., Detection of oil pollution impacts on vegetation using multifrequency SAR, multispectral images with fuzzy forest and random forest methods, Environmental Pollution (2019), doi: https://doi.org/10.1016/j.envpol.2019.113360.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier Ltd.


Detection of Oil Pollution Impacts on Vegetation using Multifrequency SAR, Multispectral Images with Fuzzy Forest and Random Forest Methods

Mohammed Ozigis 1, 3 *, Jorg Kaduk 1, Claire Jarvis 1, Polyanna da Conceição Bispo 1, 2, 4, Heiko Balzter 1, 2

1 Centre for Landscape and Climate Research, School of Geography, Geology and Environment, University of Leicester, United Kingdom;

2 National Centre for Earth Observation, University of Leicester, United Kingdom;

3 Department of Strategic Space Applications, National Space Research and Development Agency, Nigeria (NASRDA)

4 Department of Geography, School of Environment, Education and Development, The University of Manchester, Oxford Road, Manchester, United Kingdom;

* Correspondence: [email protected]; [email protected]

Abstract

Oil pollution harms terrestrial ecosystems. There is an urgent requirement to improve on existing methods for detecting, mapping and establishing the precise extent of oil-impacted and oil-free vegetation. This is needed to quantify existing spill extents, formulate effective remediation strategies and enable effective pipeline monitoring strategies to identify leakages at an early stage. An effective oil spill detection algorithm based on optical image spectral responses can benefit immensely from the inclusion of multi-frequency Synthetic Aperture Radar (SAR) data, especially when the effect of multi-collinearity is sufficiently reduced. This study compared the Fuzzy Forest (FF) and Random Forest (RF) methods in detecting and mapping oil-impacted vegetation from a post-spill multispectral Sentinel 2 image and multifrequency C- and X-band Sentinel 1, COSMO Skymed and TanDEM-X images. FF and RF classifiers were employed to discriminate oil-spill impacted and oil-free vegetation in a study area in Nigeria. Fuzzy Forest uses specific functions for the selection and use of uncorrelated variables in the classification process to yield an improved result. This method proved an efficient variable selection technique addressing the effects of high dimensionality and multi-collinearity, as the optimization and use of different SAR and optical image variables generated more accurate results than the RF algorithm in densely vegetated areas. An Overall Accuracy (OA) of 75% was obtained for the dense (Tree Cover Area, TCA) vegetation, while cropland and grassland areas had 59.4% and 65% OA respectively. However, RF performed better in Cropland areas with OA=75% when SAR-optical image variables were used for classification, while both methods performed equally well in Grassland areas with OA=65%. Similarly, significant backscatter differences (P<0.005) were observed in the C-Band backscatter sample mean of polluted and oil-free TCA, while strong linear associations existed between LAI and backscatter in grassland and TCA. This study demonstrates that SAR based monitoring of petroleum hydrocarbon impacts on vegetation is feasible and has high potential for establishing oil-impacted areas and oil pipeline monitoring.

Key Words: Multi-frequency SAR, Vegetation Indices, Oil Pollution, Random Forest, Fuzzy Forest, Variable Importance

1 Introduction

Detecting oil spills forms the basis for establishing the total area affected by oil pollution, facilitating remediation efforts and recovery after an oil spill, and monitoring the impacts of oil pollution on plant life. Difficulties encountered in post-spill impact mapping are often related to the overestimation of the size of oil polluted areas owing to confusion among features with similar spectral characteristics, such as oil pools in exposed bare land and water bodies, burn scars and other vegetation types (Hese and Schmullius, 2008; Khanna et al., 2013; Kokaly et al., 2013). This can result either from direct confusion of land cover types (Hese and Schmullius, 2008) or from background materials with overlapping absorption features, such as dry vegetation having a high Hydrocarbon Index (HI) similar to the signatures of hydrocarbons (Kokaly et al., 2013). Terrestrial oil slicks may also present little to no visually reconcilable signature even in high-resolution optical images, primarily due to oil seepage into the soil, topsoil erosion, weathering and degradation (Brown and Ulrich, 2014; Khanna et al., 2018; Shapiro, Khanna and Ustin, 2016). Hence, structural, biophysical, physiological and biochemical changes in vegetation have to be relied on to detect oil pollution indirectly.

Narrowband optical hyperspectral sensors could overcome these limitations, as they allow changes to be detected in very specific wavelengths. Khanna et al. (2013) and Li et al. (2005) explored the potential of AVIRIS data to monitor the impact of oil spills on the terrestrial landscape. Mishra et al. (2012) also relied on a hyperspectral sensor to quantify the short-term impacts of oil spills on the photosynthetic activity and physiological status of coastal saltmarshes and showed that phenological indicators from the polluted and unpolluted sites allowed the successful delineation of the hotspots of plant stress, indicating oil pollution. In addition, airborne hyperspectral imagery and support vector machine (SVM) methodology have been tested for terrestrial oil spill mapping with some success (Achard et al., 2018). Mahdianpari et al. (2018) also used high-resolution UAV and electromagnetic (EM) induction data for terrestrial oil spill detection and mapping, and showed that these data have potential for the discrimination of oil-polluted land from adjoining vegetation and other land cover components.

Major constraints limiting the application of multitemporal spaceborne or airborne hyperspectral imagery for oil spill mapping range from scale and high cost implications to accessibility in some data-scarce parts of the world (especially in low and middle-income countries). Thus the amount of and access to airborne hyperspectral data is limited. Other factors such as the cloudy weather conditions experienced almost all year round (especially within the tropical and mangrove ecosystems) limit the application of field and image spectroscopy in the detection and mapping of terrestrial oil spills. Synthetic Aperture Radar (SAR) sensors, on the other hand, acquire imagery in the microwave domain of the electromagnetic spectrum independent of solar illumination conditions (day and night, and in the boreal wintertime) and are relatively unconstrained by weather conditions due to their ability to penetrate clouds. SAR images are relatively accessible and have increasingly been used to assess and monitor the state of earth surface dynamics, especially within forest and cropland vegetation. This is owing to their capability to penetrate sparse and dense vegetation canopies and depict their current health and physical status. Vegetation structural change can also be detected from time-series of radar backscatter, as has been demonstrated in several studies (Nielsen et al., 2017; Ramsey et al., 2015; Zhou et al., 2017). Radar backscatter analysis has also allowed the mapping of vegetation canopy types (Wagner et al., 1999) and plant biomass (Patel et al., 2006; Bispo et al., 2014; Varghese et al., 2016), and the identification of stages of crop development (Duguay et al., 2015; Gebhardt et al., 2012; Zhou et al., 2017).

Other studies have used multi-frequency SAR variables for distinguishing native and invasive species (Ghulam et al., 2014; Rajah et al., 2018). Results show that the integration of optical and multi-frequency SAR images can improve vegetation class discrimination. This is especially the case when more advanced techniques such as machine learning, data fusion and object-based image classification are employed. Despite the potential of combined multi-frequency SAR and optical image products for detecting and mapping terrestrial oil spills, however, their application for discriminating between oil polluted and oil-free land cover types has not yet been investigated in detail.

In this study, we investigate C- and X-band SAR-derived textures, interferometric coherence, backscatter, topography and soil variables, as well as optical image derived vegetation indices for identifying oil pollution using fuzzy forest (FF) (Conn et al., 2015) and random forest (RF) (Breiman, 2001) classifiers. The rationale for the choice of classifiers is that RF classifiers have a high flexibility in the use of input variables for describing different situations. In contrast, FF seeks to use only high-performing uncorrelated variables for classification. FF is an extension of RF designed to yield less biased variable importance rankings when there is high correlation among input variables. This is largely because the use of highly correlated variables in a classification process can lead to redundancy and confuse variable importance information (Gregorutti et al., 2017). This can thereafter translate to less accurate maps (Darst et al., 2018; Schmidt et al., 2017; Strobl et al., 2008) owing to the effect of overfitting or underfitting in the final model from the n variables.

The present study is the first application of FF to satellite imagery for a terrestrial classification application.

The specific objectives of this study were:

To test the potential of multi-frequency SAR imagery for the classification of oil-impacted vegetation and compare its performance with fused multi-frequency SAR and optical imagery.

To test the fuzzy forest and random forest methods for distinguishing oil spill-impacted and oil-free vegetation types.

To identify the best input variables with the least multi-collinearity for discriminating between oil-impacted and oil-free vegetation.

To model the relationship between biochemical indicators of vegetation from the derived optical vegetation indices, with vegetation structural components from the multi-frequency SAR backscatter, in order to understand the biophysical properties of vegetation affected by oil pollution.

2 Material and Methods

2.1 Study Area

2.1.1 Location

The study area (Figure 1) represents a small part of the Niger Delta region of Nigeria (~70,000 km²) and is bounded by the four corner coordinates of (6.957°E, 5.025°N), (7.247°E, 5.025°N), (6.96°E, 4.795°N) and (7.254°E, 4.804°N). It covers an area of 1320 km². The topography of the Niger Delta region where the study area is located is characterized by low lying flat terrain (Nriagu, 2011) with several tributaries of the River Benue, River Niger, Bonny River and Escravos River (Musa et al. 2016; Nriagu, 2011).

2.1.2 Climate and Ecology

The climate pattern of the Niger Delta region is dominated by the wet equatorial climate, such that from February to November the climate of the coastal zone is dominated by the tropical maritime air mass, and between December and January the weather system is influenced by the dry tropical continental air mass (Adejuwon, 2012). Average annual precipitation ranges from about 4500 mm around the coastal margins to about 2000 mm in the northern part of the region. This accounts for the intermittent flooding during the wet season (Adejuwon, 2012). Temperature variability across the region is relatively low throughout the year, with an average annual temperature of about 27 °C and little seasonal variation. Humidity is high (80% to 90%) from June through September and lower from December to March (Adejuwon, 2012).

2.1.3 Oil Pollution in the Niger Delta

Nigeria’s oil-rich Niger Delta has experienced numerous oil spill incidents since the early 1970s. The region has witnessed an increase in oil production activities since the discovery of crude in 1956, leading to a substantial expansion of oil pipeline facilities across the length and breadth of the region (Taiwo et al. 2012). Specific causes of oil spillage in the Niger Delta region range from sabotage of oil facilities to pipeline infrastructure decay, operational failure and other unknown causes (UNEP, 2011). This has led to some devastating spill incidents over the years. Ndimele et al. (2018) noted that accumulated spill statistics in the region translate to an average of nearly one thousand spills yearly, according to official records of the National Oil Spill Detection and Response Agency (NOSDRA). The area chosen for this study has a large concentration of oil and gas pipeline facilities, and also represents a prime oil spill hotspot area (Figure 1) with recurring spill incidents since 2013. These spills have mostly occurred on farmland, marsh, mangrove and forest vegetation, as these are the predominant land cover types.

2.1.4 Predominant land cover

The Niger Delta region is characterised by 6 land cover types according to the most recent land cover map produced by the European Space Agency (ESA) Climate Change Initiative (ECCI). These include: cropland, grassland, mangrove, fresh water swamp, tree cover area, bare land, built up areas and water body (https://www.esa-landcover-cci.org/?q=node/187). However, only cropland, grassland and tree cover areas (TCA) were used in this study. These three classes were used to establish the vegetation-specific (cropland, grassland and TCA) spill detection and image classification processes.

Figure 1. The Study Area (Panel A) within the Niger Delta Region of Nigeria in the West African sub region. Panel ‘B’ shows a true colour composite of the Sentinel 2A image of December 2016 (ESA, 2015) of the study area and oil spill sites as retrieved from the NOSDRA and SPDC databases (https://oilspillmonitor.ng/ and https://www.shell.com.ng/sustainability/environment/oil-spills.html). Panel ‘C’ shows the predominant land cover types within the study area according to the most recent land cover map produced by the European Space Agency (ESA) Climate Change Initiative (ECCI).

In addition, the three land cover classes explored represent the land cover most frequently affected by the recurring oil spill incidents. The northern part of the study area has a higher concentration of riparian forest and dense vegetation, while patches of grassland are concentrated in the central part of the study area. The southern part of the study area is dominated by agricultural land, agriculture being the main socio-economic activity of this area, as between 50% and 70% of the Niger Delta inhabitants depend on the natural environment for agriculture, fishing, and the collection of forest products as their principal source of livelihood.

2.2 Data and Pre-processing

Datasets used in this study comprise the Oil Spill Point and Incident data (2.2.1) for training and accuracy assessment of the classifiers, the ESA Land Cover Map (2.2.2), as training and classification was executed separately for each land cover class (Ozigis et al., 2019), and remote sensing optical and SAR images for the classification of the vegetation into oil free or oil polluted classes. These images as well as derived variables and ancillary datasets such as soil and geological maps are shown in Table 1a and b.

Table 1a: Remote sensing data used in this study.

Platform                          Sensor                          Swath   Spatial res.  Image month     Season
Sentinel 1                        C-band SAR                      250km   5x20m         January 2017    Dry Season
Tandem-X                          X-band SAR                      200     6m            February 2016   Dry Season
Cosmo Skymed                      X-band SAR                      40km    3 x 3m        December 2016   Dry Season
Sentinel 2A                       Multispectral Instrument (MSI)  290km   20m           December 2016   Dry Season
Shuttle Radar Topography Mission  C- and X-band SAR                       30m           September 2014  Nil

Table 1b: List of SAR, optical and geophysical variables for image classification comprising soil variables and variables derived from S1 (Sentinel 1), CSM (Cosmo Skymed), TDX (TanDEM X) and S2 (Sentinel 2)

S/No  Variables               S1   CSM  TDX  S2
1     Sigma Nought            v    v    v
2     VV/VH                   v
3     VV+VH                   v
4     VV-VH                   v
5     Texture-Variance             v    v
6     Texture-Second Moment        v    v
7     Texture-Mean                 v    v
8     Texture-Homogeneity          v    v
9     Texture-Entropy              v    v
10    Texture-Dissimilarity        v    v
11    Texture-Correlation          v    v
12    Texture-Contrast             v    v
13    Coherence                         v
14    DEM                     v (column unclear)
15    SLOPE                   v (column unclear)
16    ASPECT                  v (column unclear)
17    NDWI                                   v
18    LAI                                    v
19    NDVI                                   v
20    B11                                    v
21    B12                                    v
22    B5                                     v
23    B6                                     v
24    B7                                     v
25    B8A                                    v
26    Soil Map
27    Soil Geology

v = variable derived from the indicated sensor. For DEM, SLOPE and ASPECT the table carries a single mark whose column is not recoverable here.

Data and data pre-processing are described in detail in the following sections.

2.2.1 Oil Spill Point and Incident data

Published oil spill records from the National Oil Spill Detection and Response Agency (NOSDRA) https://oilspillmonitor.ng/ and the Shell Petroleum Development Corporation (SPDC) https://www.shell.com.ng/sustainability/environment/oil-spills.html were used for this study. The SPDC is the largest private crude oil company within the Niger Delta region, while NOSDRA is a Government Agency tasked with capturing all oil spill incidents on both marine and terrestrial ecosystems across the country. The incident data from the two published sources were combined, and large spill incidents covering areas of not less than 1000 m² were identified to ensure that the area covered is greater than a single pixel of 100 m². The greater the spill size, the higher the number of sample points selected within the spill area (Supplementary Materials, A1). Conversely, only single sample points were selected for relatively small spill sites. As the focus of the study was to distinguish between oil spill-impacted and oil-free areas, it was necessary to select sample locations of non-spill/oil-free sites sufficiently far away from the known spill sites. A buffer of 600 m was implemented around the spill areas to create the High Consequence Area (HCA) isolating all existing spill points (Whanda et al., 2015). Outside this HCA, unpolluted sites were selected at random.
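The buffer-and-sample step can be illustrated with a minimal sketch. It assumes point coordinates are already in a projected system with metre units (for example UTM) and treats each spill as a point rather than a polygon; the function names, the rejection-sampling approach and the fixed seed are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def distance_m(p, q):
    """Euclidean distance between two projected points, in metres."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def sample_oil_free_points(spill_points, extent, n, buffer_m=600.0, seed=0):
    """Draw n random points inside extent that fall outside every spill buffer.

    extent is (xmin, ymin, xmax, ymax) in metres. Candidates closer than
    buffer_m to any spill point are rejected (hypothetical helper).
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = extent
    selected = []
    attempts = 0
    while len(selected) < n:
        attempts += 1
        if attempts > 100000:
            raise RuntimeError("extent is almost entirely covered by buffers")
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if all(distance_m(candidate, s) > buffer_m for s in spill_points):
            selected.append(candidate)
    return selected


spills = [(1000.0, 1000.0), (3000.0, 2500.0)]
oil_free = sample_oil_free_points(spills, (0.0, 0.0, 5000.0, 5000.0), n=20)
```

Rejection sampling keeps the sketch short; with real spill polygons a GIS buffer and difference operation would be the more usual choice.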

The selected spill and non-spill points were thereafter assigned land cover types (Cropland, Grassland and TCA) as provided by the ECCI dataset (http://2016africalandcover20m.esrin.esa.int/) and are given in Supplementary Materials (A2). The processed points served two major purposes in this study. First, they were used for the class-wise assignment of pixels in the RF and FF image classification operation. Here points were divided into two sets by a ratio of 60:40 for training and validation respectively. Secondly, the processed points were used in extracting requisite spectral and backscatter information from the images for further assessment.
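A 60:40 division of reference points could be sketched as follows. The split is done separately within each (land cover, class) stratum, which is an assumption made here for illustration; the text states only the overall ratio. The record layout and function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(records, train_fraction=0.6, seed=42):
    """Split (point_id, stratum) records into training and validation sets.

    Each stratum is shuffled and divided on its own so that both sets keep
    roughly the same class balance (an assumption, not stated in the text).
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for record in records:
        by_stratum[record[1]].append(record)
    train, valid = [], []
    for stratum in sorted(by_stratum):
        group = by_stratum[stratum]
        rng.shuffle(group)
        n_train = round(train_fraction * len(group))
        train.extend(group[:n_train])
        valid.extend(group[n_train:])
    return train, valid


strata = [("TCA", "oil"), ("TCA", "free"), ("Cropland", "oil"), ("Cropland", "free")]
records = [(f"{s[0]}-{s[1]}-{k}", s) for s in strata for k in range(10)]
train, valid = stratified_split(records)
```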

2.2.2 ESA Land Cover Map

The existing land cover map for the African Continent produced by the ESA Climate Change Initiative 2016 (https://www.esa-landcover-cci.org/?q=node/187) was used in this study to stratify the composite image into separate regions for the different vegetation types. The land cover map contains 10 classes, including built-up areas, waterbodies and various vegetation types, and was produced from a 20 m spatial resolution Sentinel-2A image mosaic over Africa. The study area shapefile was used to extract land cover information from the ECCI data, before a subset of the different vegetation types was taken from the final stacked images. The land cover categories used in this study are Cropland, Grassland and Tree Cover Areas (TCA); these were used because most oil and gas pipeline facilities and the corresponding spill incidents are largely concentrated in these areas. Pictures of the Cropland, Grassland and Tree Cover Areas are given in the supplementary material (A3). Features such as built-up areas, waterbodies and bare surfaces were excluded to further reduce the effect of artifacts and spectral confusion among features, as implemented in previous studies (Ozigis et al., 2019).

2.2.3 Optical reflectance data and vegetation indices from Sentinel 2

From the post-spill Sentinel 2 (S2) acquisition, only the orthorectified image bands with 20 m spatial resolution acquired in December 2016 were used for the oil spill detection and image classification process (https://scihub.copernicus.eu/dhus/#/home). The S2 image was pre-processed from digital number radiance to top-of-atmosphere reflectance using the Sen2Cor module in the Sentinel Application Platform (SNAP) (Zuhlke et al., 2015) environment. This was done to eliminate atmospheric and radiometric effects. The image pixels were then resampled to 10 m spatial resolution. Pre-processed bands were used as input into the experimental image classification and retrieval of Vegetation Indices. Three vegetation health indices, namely the Normalized Difference Water Index (NDWI), Leaf Area Index (LAI) and Normalized Difference Vegetation Index (NDVI), were generated and incorporated into the machine learning classification. They were chosen because they have been shown to be particularly efficient in identifying stressed vegetation in previous studies (Adamu et al., 2015; Arellano et al., 2015). The NDVI is widely used for remote sensing of vegetation because of its ability to depict stress in vegetation. It uses land surface reflectance from the red channel, which is the strong chlorophyll absorption region, and the near-infrared, which represents a high reflectance plateau of vegetation canopies (Eq. 1).

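$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{Red}}{\rho_{NIR} + \rho_{Red}} \qquad (1)$$

where $\rho_{NIR}$ and $\rho_{Red}$ are the near-infrared and red reflectance.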

Introduced by Gao (1996), the NDWI uses the mid-infrared and near-infrared bands located in the high reflectance plateau of vegetation (Eq. 2). Due to the weak liquid absorption in the mid-infrared, the index is sensitive to changes in liquid water content of vegetation, and vegetation with near or absolute water loss is detected better with this index than with NDVI.

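$$\mathrm{NDWI} = \frac{\rho_{NIR} - \rho_{MIR}}{\rho_{NIR} + \rho_{MIR}} \qquad (2)$$

where $\rho_{NIR}$ and $\rho_{MIR}$ are the near-infrared and mid-infrared reflectance.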
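Both indices are normalized differences of two reflectance values, which the following minimal sketch makes explicit. It works on single reflectance values for clarity; the function names and the handling of a zero denominator are illustrative assumptions, and no particular Sentinel 2 band assignment is implied.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    total = a + b
    if total == 0:
        return float("nan")
    return (a - b) / total


def ndvi(nir, red):
    """NDVI from near-infrared and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(nir, mir):
    """NDWI (Gao, 1996) from near-infrared and mid-infrared reflectance."""
    return normalized_difference(nir, mir)


healthy = ndvi(nir=0.5, red=0.1)    # dense green canopy: high NDVI
stressed = ndvi(nir=0.3, red=0.2)   # weaker red absorption: lower NDVI
```

Applied per pixel to two co-registered band rasters, the same expression yields the index image.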

The leaf area index (LAI) is defined as the one-sided green leaf area per unit of horizontal ground surface area. Various methods, ranging from field-based measurements to satellite image processing, are used to compute LAI. LAI is an important variable for establishing gross photosynthesis, net primary productivity, evapotranspiration and bidirectional reflectance, as it depicts structural properties of vegetation. The index can reveal a lot about the health and structural state of vegetation (Clevers et al., 2017; Verrelst et al., 2015). In this study, LAI was generated from Sentinel 2 optical imagery using the biophysical processor in the SNAP software. The processor estimates LAI through a radiative transfer process with 8 Sentinel 2 bands, sun zenith and viewing zenith angles, with the aid of a three-layered neural network.

2.2.4 Normalized Cross Section Backscatter

Sentinel 1 GRD Product

Sentinel 1A is a C-band radar launched by the European Space Agency on the 3rd of April 2014, acquiring data in VH and VV polarization (https://scihub.copernicus.eu/dhus/#/home). The image used in this study was acquired in January 2017 and obtained in Level-1 Ground Range Detected (GRD) format in dual-polarization VV and VH mode. This was pre-processed to obtain radar cross-section backscatter. The S1 image was radiometrically calibrated, multi-looked (one look in range and four in azimuth) and geocoded based on Shuttle Radar Topography Mission (SRTM) data, with a final pixel spacing of 10 m × 10 m. Pixel values were converted to backscatter coefficient (or normalized radar cross section) in units of dB using the formula below in SNAP.

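$$\sigma^0_{dB} = 10 \log_{10}\left(\sigma^0\right) \qquad (3)$$

where $\sigma^0_{dB}$ is the normalized radar cross section in dB, $\sigma^0$ is the backscatter for a specific polarization and $\log_{10}$ is the decimal logarithm.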
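The linear-to-decibel conversion is a one-line operation; a minimal sketch is given below, assuming the input is a calibrated linear backscatter value. The function name and the decision to reject non-positive input are illustrative choices.

```python
import math


def to_db(sigma0_linear):
    """Convert linear backscatter to decibels: 10 * log10(sigma0)."""
    if sigma0_linear <= 0:
        raise ValueError("backscatter must be positive to take a logarithm")
    return 10.0 * math.log10(sigma0_linear)


vv_db = to_db(0.1)    # a linear value of 0.1 corresponds to -10 dB
vh_db = to_db(0.01)   # a linear value of 0.01 corresponds to -20 dB
```

Because the scale is logarithmic, a ratio of two linear backscatter values becomes a difference of the corresponding dB values.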

TanDEM X

Multiple post-spill TanDEM-X datasets were acquired from the German Space Agency (DLR). TanDEM-X is a constellation of two satellites, jointly operated by the German Space Agency (DLR) and EADS Astrium. It is a bistatic X-band SAR system consisting of twin satellites, namely TerraSAR-X (launched June 15, 2007) and TanDEM-X (launched June 21, 2010). The product used in this study is also a dry season image of February 2017 in Level 1b Geocoded Ellipsoid Corrected (GEC) format and in Stripmap mode. However, only the return signal in the HH channel was used in this study, as cross-polarization HV data was not readily available for the desired period. The acquired image was radiometrically calibrated using equation (4) below (Sportouche et al., 2012):

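$$\sigma^0(i,j) = 10 \log_{10}\left( k_s \, |DN(i,j)|^2 \, \sin(\theta_{loc}) \right) \qquad (4)$$

where $\theta_{loc}$ is the local incidence angle, $k_s$ is the calibration and processor scaling factor, $DN(i,j)$ is the image value at pixel $(i,j)$ and $\log_{10}$ is the decimal logarithm.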

Cosmo Skymed

A post-spill Cosmo Skymed high-resolution SAR image was also used in this study. The satellite constellation was launched and is operated by the Italian Space Agency (ASI). It acquires dual polarizations in HH and HV mode. The image used in this study was acquired in December 2016 in the dry season as a level 1A image, which needed to be corrected for the range spreading loss effect using antenna pattern gain compensation and for the incidence angle effect following Eq. 5. The corrected image was further multi-looked (one look in range and two in azimuth), geocoded based on the SRTM, and radiometrically calibrated with a final pixel spacing of 10 m × 10 m (Sportouche et al., 2012).

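$$\sigma^0(i,j) = 10 \log_{10}\left( |DN(i,j)|^2 \, \frac{\sin(\theta_{ref}) \, R_{ref}^{\,2 R_{exp}}}{K \, F^2} \right) \qquad (5)$$

where $DN(i,j)$ is the image value at pixel $(i,j)$, $\theta_{ref}$ is the reference incidence angle, $R_{ref}$ is the reference slant range, $R_{exp}$ is the reference slant range exponent, $K$ is the calibration constant and $F$ is the rescaling factor.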

Terrain correction was implemented using the range Doppler terrain correction module in the Sentinel Application Platform (SNAP).

2.2.5 Coherence

Interferometric coherence was generated from the post-spill TanDEM-X image in the SNAP software. Topographic phase was removed with the aid of the SRTM 3 data and the final product was multi-looked with ratio 2:2 to obtain the same spatial resolution as the Sentinel 1, Sentinel 2 and Cosmo Skymed images. The Refined Lee filter was used for noise suppression before the range Doppler terrain correction module was applied to geometrically correct the final coherence image. Coherence decreases with increasing volume decorrelation and with temporal decorrelation due to movement of targets between the two image acquisitions, e.g. movement/defoliation of leaves and changing surface roughness of water bodies. The magnitude of coherence ranges from 0 to 1, where 0 represents a completely incoherent target and 1 represents complete coherence (Bamler and Hartl, 1998).
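The 0 to 1 range follows from the usual sample estimator of coherence magnitude, which normalizes the complex cross-product of two co-registered acquisitions by their powers. The sketch below applies that textbook estimator to a small window of complex samples; it is illustrative only and does not reproduce the SNAP processing chain (no topographic phase removal or filtering), and the function name is hypothetical.

```python
import math


def coherence(s1, s2):
    """Coherence magnitude of two equally sized complex sample windows.

    |sum(s1 * conj(s2))| / sqrt(sum|s1|^2 * sum|s2|^2), bounded by 0 and 1.
    """
    cross = sum(a * b.conjugate() for a, b in zip(s1, s2))
    power1 = sum(abs(a) ** 2 for a in s1)
    power2 = sum(abs(b) ** 2 for b in s2)
    if power1 == 0 or power2 == 0:
        return 0.0
    return abs(cross) / math.sqrt(power1 * power2)


window = [1 + 1j, 2 - 1j, 0.5j, -1 + 0.25j]
shifted = [s * 1j for s in window]            # constant phase offset only
stable = coherence(window, shifted)           # unchanged scatterers
decorrelated = coherence([1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j])
```

A constant phase offset leaves the magnitude at 1, whereas samples whose phases change independently between acquisitions cancel in the numerator and drive the value toward 0.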

2.2.6 Texture features

Textural features were used in this study because they have the ability to depict important rapid change in vegetation structural composition, which in turn can influence the grey level tonation of SAR images (Hlatshwayo et al., 2019; Jin et al., 2014). Eight gray-level co-occurrence matrix (GLCM) images were generated as prescribed by Haralick and Shanmugam (1973) from the high resolution Cosmo Skymed and TanDEM-X images. The GLCM features generated are: Contrast, Correlation, Dissimilarity, Homogeneity, Mean, Second Moment, Variance and Entropy.
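To make the eight measures concrete, the sketch below builds a symmetric, normalized GLCM for one pixel offset from a small quantized image and derives the features from it. The formulas are common textbook definitions and are an assumption here; the exact definitions, window size, offsets and quantization used in SNAP may differ, and the function names are hypothetical.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Symmetric, normalized co-occurrence probabilities for offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                counts[(j, i)] += 1   # count both directions (symmetric GLCM)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """Eight texture measures from GLCM probabilities p[(i, j)]."""
    mean = sum(i * v for (i, j), v in p.items())
    variance = sum(v * (i - mean) ** 2 for (i, j), v in p.items())
    covariance = sum(v * (i - mean) * (j - mean) for (i, j), v in p.items())
    return {
        "contrast": sum(v * (i - j) ** 2 for (i, j), v in p.items()),
        "dissimilarity": sum(v * abs(i - j) for (i, j), v in p.items()),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items()),
        "second_moment": sum(v * v for v in p.values()),
        "entropy": -sum(v * math.log(v) for v in p.values() if v > 0),
        "mean": mean,
        "variance": variance,
        # Correlation is undefined for a perfectly uniform window.
        "correlation": covariance / variance if variance > 0 else float("nan"),
    }


rough = texture_features(glcm([[0, 1], [1, 0]]))    # alternating grey levels
smooth = texture_features(glcm([[1, 1], [1, 1]]))   # uniform patch
```

The alternating patch gives high contrast and low homogeneity, while the uniform patch gives zero contrast and entropy, which is the kind of structural difference the texture layers are meant to capture.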

2.2.7 Digital elevation model

A digital elevation model was obtained from the Shuttle Radar Topography Mission (SRTM). The SRTM 1 Arc-Second Global dataset with acquisition date of 23rd September 2014 (https://earthexplorer.usgs.gov/) was re-sampled to fit the 10 m pixel resolution baseline used for all other images in this study. The product is void-filled and was resampled from the initial 3 arc-seconds to a finer resolution of 1 arc-second, which corresponds to approximately 30 m spatial resolution.

2.2.8 Soil Map and Geological data

In this study, soil type and geological maps compiled by the Nigerian Geologic Survey Agency (NGSA) (http://ngsa.gov.ng/GeoMaps) were obtained, pre-processed and incorporated into the classification process. First, the maps were georeferenced in the Geographic Coordinate System with the WGS_1984 datum and reprojected to the UTM projection (WGS_1984_UTM_Zone_32N). The study area extent was extracted from the entire map before the intersecting layers were digitized. The vector soil map, which comprised two predominant soil types, ferrosols and fluvisols, was rasterized using the kriging method; the geology layer likewise comprised two predominant types, Coastal Plain Sands and Alluvium. All layers were incorporated into the classification process because several studies (Abdel-Moghny et al., 2012; Klamerus-Iwan et al., 2015; Wang et al., 2013) have suggested that soil type and geological characteristics can strongly influence crude hydrocarbon seepage and runoff processes, which can indirectly influence the vegetation-impact nexus.

2.3 Evaluation of discriminatory potential of the developed variables

Backscatter from the various SAR sensors and the optical-derived vegetation indices were tested for their potential to discriminate between oil-polluted and unpolluted areas in each of the three vegetation types (Cropland, Grassland and TCA). Box plots, means, medians and interquartile ranges of each variable at the polluted and unpolluted reference sites were analysed and tested for differences using several statistical tools. The paired-sample t-test was used to compare pairwise differences in means between oil-free and polluted areas, as implemented in Khanna et al. (2013). The information content of the SAR backscatter independent of the optical variables was also assessed using linear regression of the SAR variables on the three vegetation indices for polluted sites.
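The paired-sample comparison can be sketched as follows. This is a minimal illustration of the standard paired t statistic, not the authors' script; the NDVI values and the function name are hypothetical.

```python
import math
import statistics

def paired_t_statistic(x, y):
    """t statistic and degrees of freedom for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical NDVI values at matched oil-free and polluted sites
ndvi_oil_free = [0.62, 0.58, 0.71, 0.66, 0.60, 0.69]
ndvi_polluted = [0.48, 0.51, 0.55, 0.50, 0.47, 0.58]
t_stat, dof = paired_t_statistic(ndvi_oil_free, ndvi_polluted)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The statistic is then compared against the Student t distribution with n - 1 degrees of freedom to obtain the P value; a positive t here means the oil-free sites have the higher mean.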

2.4 Image Classification and Experimental Scenarios

In order to evaluate the FF methodology for image classification, in particular the efficiency of its variable reduction, RF image classification was also implemented for comparison. Two experimental image classification scenarios were implemented in this study to assess the potential of multifrequency SAR Image Fusion (MSIF) and multifrequency SAR Optical Image Fusion (MSOIF), as shown in Figure 2. Specifically, the setup makes it possible to determine whether the inclusion of optical data in addition to the SAR data leads to a significant improvement of the classification.

Figure 2. Illustration of the image classification process

Image variables for the MSIF experiment include backscatter (C and X band), coherence, texture, elevation model (DEM, Slope and Aspect), soil map and geology map. Similarly, image variables for the MSOIF experiment include backscatter (C and X band), coherence, texture, elevation model (DEM, Slope and Aspect), soil map, geology map, optical (Sentinel-2) bands and vegetation indices. The total numbers of variables for the MSIF and MSOIF experiments were 29 and 37 respectively. Bilinear interpolation was used to resample all 37 derived image variables to a uniform pixel size of 10 m. The FF and RF classifiers were trained separately on each of the three vegetation classes with the reference spill and oil-free sites. Consequently, there are four classifications for each of the three vegetation classes: RF and FF, each applied to the MSIF and the MSOIF variables.

2.5 Random Forest

RF is an ensemble classification method introduced by Breiman (2001). It works on the assumption that aggregating the class predictions of a large ensemble of randomly generated individual decision trees achieves higher classification accuracy. Each decision tree in the classifier is trained using a subset of the input variables and about two thirds of the training samples. The remaining one third is used to generate the out-of-bag error, which is an internal validation of the final model. The method also generates a measure of importance for each of the variables used in the classification process, based on the Gini index (mean decrease in Gini).
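The roughly two-thirds/one-third split follows from bootstrap sampling. Assuming each tree is grown on a sample drawn with replacement from the training set (as in Breiman's formulation), the short simulation below estimates the share of samples left out of bag; the sample size, tree count and seed are arbitrary illustrative choices.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample; return (in-bag indices, out-of-bag indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 1000                 # hypothetical number of training pixels
oob_fractions = []
for _ in range(200):     # 200 hypothetical trees
    _, oob = bootstrap_split(n, rng)
    oob_fractions.append(len(oob) / n)

mean_oob = sum(oob_fractions) / len(oob_fractions)
print(f"mean out-of-bag fraction: {mean_oob:.3f}")
```

The expected out-of-bag share is (1 - 1/n)^n, which approaches 1/e (about 37%) for large n, so each tree is evaluated on samples it never saw during training.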

However, it has been observed that RF variable importance measures can be biased when highly correlated features are incorporated in a single classification (Nicodemus and Malley, 2009; Strobl et al., 2008), thereby influencing the overall classification accuracy. In this study, RF image classification was implemented in the R software (TeamR, 2017) using the Caret package (Kuhn, 2012). Several calibration/parameterization runs were carried out to determine the optimal ntree and mtry values for training the RF models for each vegetation type, using all the input variables. The results shown in Figure 3 indicate that ntree = 500 and mtry = 6 yielded the best calibration accuracy with the lowest out-of-bag error. Figure 3 also shows that the calibration result for TCA had the lowest out-of-bag error of around 0.06%, while grassland had the highest error of around 0.2%.

Figure 3. Out-of-bag accuracy (1 - OOB error) as a function of the number of decision trees for the three land cover classes: (a) Cropland, (b) Grassland, (c) TCA and (d) overall calibration accuracy.

2.6 Fuzzy Forest

FF is an extension of the RF method that seeks to obtain less biased variable importance rankings in the presence of highly correlated features (input variables). This is accomplished in two steps. The first is a screening step, in which features are assigned to separate variable clusters called 'modules' and unimportant variables are eliminated. Here, the target is for FF to partition the features into groups of highly correlated features using Weighted Gene Co-expression Network Analysis (WGCNA). This partition can be denoted by the set $P = \{P^{(1)}, \ldots, P^{(m)}\}$. Let $p_l = |P^{(l)}|$, so that $\sum_{l=1}^{m} p_l = p$, the total number of features.

The theoretical foundation of the FF method is a piecewise screening process that eliminates features from the initially assigned variable clusters, which are obtained through Weighted Gene Co-expression Network Analysis (WGCNA) for detecting correlation networks. A selection phase is then implemented through the Recursive Feature Elimination Random Forest (RFE-RF) process to allow for interaction between different variable clusters when selecting the unique/important variables from each cluster. WGCNA is a statistical network tool from biology that is used primarily to analyse genes through a network correlation assessment across microarray samples. The method is robust for finding linkages among gene clusters, which is a necessary step for developing sound clinical gene and cell therapies. However, this is the first known application of this method in a remote sensing image-processing context.

The screening step operates independently on each element of the partition, and each element is relieved of unimportant variables using the Recursive Feature Elimination Random Forest (RFE-RF). Starting with all features in the $l$-th partition element $P^{(l)}$, an RF model is fitted and the least important features are eliminated. The resultant features after the first round of elimination are denoted $P_1^{(l)}$. A second RF is then fitted using the features in $P_1^{(l)}$ and the least important features are again eliminated, leading to a further reduced set of features $P_2^{(l)} \subset P_1^{(l)}$. The subset obtained after iteration $t$ can be denoted $P_t^{(l)}$, with $p_t^{(l)}$ the number of features in $P_t^{(l)}$. The process of feature elimination continues until a user-defined threshold is reached, for instance until only 25% of the most important variables in $P^{(l)}$ remain (Conn et al., 2015).

To select the most important variables, the user must specify how many features are dropped after each iteration by tuning the screening parameters, and must specify a stopping criterion (Conn et al., 2015). In this study, RStudio version 1.0.143 was used for the analysis and the screening parameters were set as: ntree = 500; drop_fraction = 0.5; keep_fraction = 0.5; number_selected = 5. The model specification ensured that only 10% of the original variables, which corresponds to the 5 most important variables, are kept for the final FF classification.
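The recursive elimination within one module can be sketched as below. This is a simplified illustration under stated assumptions, not the fuzzy forest implementation: the `importance_fn` callback stands in for refitting an RF and reading its variable importances, the rounding rules are a guess at a reasonable schedule, and the feature names and scores are hypothetical.

```python
import math

def recursive_elimination(features, importance_fn, drop_fraction, keep_fraction):
    """Return the successive feature subsets produced by recursive elimination.

    At each iteration the current features are ranked by importance and the
    lowest-ranked drop_fraction of them is removed, until only keep_fraction
    of the original features remains.
    """
    target = max(1, math.ceil(keep_fraction * len(features)))
    current = list(features)
    history = [list(current)]
    while len(current) > target:
        scores = importance_fn(current)  # stands in for refitting an RF
        ranked = sorted(current, key=lambda f: scores[f], reverse=True)
        n_drop = max(1, math.floor(drop_fraction * len(current)))
        n_keep = max(target, len(current) - n_drop)
        current = ranked[:n_keep]
        history.append(list(current))
    return history

# Hypothetical, fixed importance scores for one module of eight variables
toy_scores = {"DEM": 0.9, "NDVI": 0.8, "Slope": 0.7, "Aspect": 0.6,
              "SWIR": 0.5, "S1_VV": 0.4, "S1_VH": 0.3, "Soil": 0.1}

history = recursive_elimination(
    list(toy_scores),
    lambda feats: {f: toy_scores[f] for f in feats},
    drop_fraction=0.5,
    keep_fraction=0.25,
)
print([len(subset) for subset in history], history[-1])
```

With real data the importances change after each refit, so the ranking is recomputed at every iteration rather than fixed in advance as in this toy example.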

2.7 Confusion Matrix

The results of the RF and FF classifications were validated using the error matrix (Congalton, 1991) produced with the remaining 40% of the oil-free and oil-polluted ground reference data, resulting in 180 pixels per class. Selected pixels from the classification result were compared to the ground truth reference classes determined in section 2.2.1. The validation process evaluated whether known oil spill sites were correctly classified as oil-polluted vegetation and whether known unpolluted sites were correctly classified as oil-free vegetation.
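The summary measures usually derived from such an error matrix (overall, user's and producer's accuracy) can be computed as in the sketch below. The counts are hypothetical and chosen only to match the stated 180 reference pixels per class; the row/column convention is an assumption noted in the code.

```python
def accuracy_metrics(matrix, classes):
    """Overall, user's and producer's accuracy from an error matrix.

    Assumed convention: matrix[i][j] is the number of pixels mapped as
    class i whose reference (ground truth) class is j.
    """
    total = sum(map(sum, matrix))
    correct = sum(matrix[k][k] for k in range(len(classes)))
    overall = correct / total
    users, producers = {}, {}
    for k, name in enumerate(classes):
        mapped_as_k = sum(matrix[k])                 # row total
        reference_k = sum(row[k] for row in matrix)  # column total
        users[name] = matrix[k][k] / mapped_as_k
        producers[name] = matrix[k][k] / reference_k
    return overall, users, producers

classes = ["oil-free", "polluted"]
# Hypothetical counts (rows: map, columns: reference), 180 reference pixels per class
error_matrix = [[150, 45],
                [30, 135]]
oa, ua, pa = accuracy_metrics(error_matrix, classes)
print(f"OA = {oa:.3f}", ua, pa)
```

User's accuracy is one minus the commission error of a mapped class, and producer's accuracy is one minus the omission error of a reference class.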

The performances of the two methods were further assessed using the overall accuracy (OA), producer's accuracy (PA) and user's accuracy (UA). While the overall accuracy measures the agreement between the map classes and the total ground truth used for validation, the producer's accuracy (the complement of the error of omission) represents how well the reference pixels of a vegetation type are classified. The user's accuracy (the complement of the error of commission), on the other hand, represents the probability that a pixel classified into a particular class actually represents that class on the ground. Whether the FF or RF classification provided better results was evaluated using McNemar's test (de Leeuw et al., 2006), which has been applied in several studies (Onojeghuo et al., 2018; Son et al., 2018; Whyte et al., 2018). McNemar's test is a nonparametric test based on the $\chi^2$ statistic, computed from a 2 x 2 contingency matrix, to assess the performance of the classifier outputs based on the number of correctly predicted samples. The difference in accuracies was considered statistically significant at a confidence level of 95% if the calculated $\chi^2$ (from Eq. 6) was larger than the critical value of 1.5. The samples are labelled $f_{12}$ and $f_{21}$, which represent the number of correctly classified samples for FF that were misclassified by RF and the number of correctly classified samples for RF that were misclassified by FF, respectively (Whyte et al., 2018).

$$\chi^2 = \frac{(f_{12} - f_{21})^2}{f_{12} + f_{21}} \qquad (6)$$
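A minimal numerical sketch of the test follows, assuming the common form of the statistic without continuity correction, $\chi^2 = (f_{12} - f_{21})^2 / (f_{12} + f_{21})$, with one degree of freedom. The counts and the function name are hypothetical.

```python
import math

def mcnemar(f12, f21):
    """McNemar chi-square (no continuity correction) and its p-value, 1 d.o.f.

    f12: samples correct under the first classifier only
    f21: samples correct under the second classifier only
    """
    chi2 = (f12 - f21) ** 2 / (f12 + f21)
    # Survival function of the chi-square distribution with one degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: 25 pixels correct only under FF, 10 correct only under RF
chi2, p = mcnemar(25, 10)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Only the discordant pixels enter the statistic; pixels that both classifiers label correctly, or both label incorrectly, carry no information about which classifier is better.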

2.8 Field and Qualitative Validation

A qualitative validation to assess prediction performance was undertaken using high-resolution Sentinel-2A images and Google Earth imagery. The spatial extent of the classified land cover from the different classification processes was visually evaluated against the known oil spill extent and compared between the classifications. In addition, field validation data were collected during a post-spill fieldwork assessment carried out in October 2018 at some of the oil spill sites. These data formed the basis of a toxicology analysis to establish the hydrocarbon content present in the soil. The toxicology results for soil sediment samples from three spill site locations and one oil-free site location are displayed on the high-resolution image for comparison. The toxicology analysis tested for the Total Hydrocarbon Content (THC) levels at the respective locations, and this was compared visually to the classified maps from the two classifiers.

3 Results

3.1 Detecting Hydrocarbon Pollution Using Sentinel 2 Multispectral Bands

For cropland, the vegetation indices (LAI, NDWI and NDVI) tended to be higher for unpolluted cropland than for polluted cropland, with significantly different means for LAI, NDVI and NDWI (P<0.05) (Figure 4). The range of the indices was smaller (median < 0.3 and smaller interquartile range) for oil-free cropland compared to the large median and interquartile range for the polluted cropland (median > 0.3). For grassland areas, the results indicate significant differences between the means for oil-impacted and oil-free grassland (P<0.05 for all three indices), as retrieved LAI and NDVI for oil-free grassland often had a higher median and interquartile range. This indicates a larger heterogeneity of the unpolluted sites. With respect to TCA, the results show that all three indices were more dissimilar and heterogeneous, as P<0.005 and the medians for oil-free TCA were higher than those retrieved from polluted TCA.

Figure 4. Distribution of retrieved LAI (Pink), NDWI (Blue), and NDVI (Green) for both polluted and oil-free land cover types in the study area: (a) cropland, (b) grassland, (c) forest. Results show that median values of indices for oil-free land cover types are mostly significantly higher than for polluted land cover.

3.2 SAR C- and X-band Backscatter for Detecting Hydrocarbon Pollution

Figure 5 shows boxplots of retrieved backscatter values for the different classes. The results indicate that backscatter values from unpolluted cropland often had lower interquartile ranges with median values > -35 dB, compared to the polluted cropland, which often had a larger interquartile spread and median values > -35 dB. The variations were most pronounced for the Sentinel-1 derived polarizations and cross-polarization ratios. A similar trend was observed for grassland areas, as retrieved backscatter values for unpolluted grassland had much lower variability and lower median backscatter. Variations were also associated more with the Sentinel-1 cross-polarization ratios and the TanDEM-X data. However, these results were not statistically significant (P > 0.05) for cropland and grassland. In TCA, however, the results generally showed that backscatter values retrieved from the unpolluted sites had higher medians of -10 dB, -17 dB, -8 dB, 7 dB and -19 dB from the CSM, S1 VH, S1 VV, S1 VV–VH and S1 VV+VH respectively. In contrast, the polluted TCA had median values of -13 dB, -14 dB, -9 dB, 6 dB and -17 dB from the CSM, S1 VH, S1 VV, S1 VV–VH and S1 VV/VH respectively. The interquartile range of backscatter was uniform between the oil-free and polluted TCA across the different sensors. The paired t-test showed that the difference between means was statistically significant (P < 0.05), most notably for the S1-derived products S1 VV (P=0.0006), S1 VV + VH (P=0.0008) and S1 VH (P=0.0229).

Figure 5: The distribution of TDX Backscatter (Purple), TDX coherence, Cosmo Skymed (Pink) and Sentinel 1 VV (Green), Sentinel 1 VH (Blue) and Sentinel 1 VV/VH (Magenta) backscatter for polluted and oil-free land cover types in the study area: (a) Cropland, (b) Grassland, (c) TCA. Results show that median backscatter values and interquartile range in oil-free TCA are significantly higher than in the polluted TCA.

3.3 Relationship between the various biophysical variables

For croplands, there is generally a weak relationship between the SAR variables and the optically derived LAI, as indicated by the results of least-squares regressions (Supplementary Materials – A4). The NDWI showed a stronger relationship (R of 0.4 or above) with the Sentinel-1 VV, VH, and VV + VH derived backscatter (P<0.05). TDX backscatter had R = 0.3 with the NDWI (P=0.004). The NDVI showed a linear relationship with CSM, S1 VV, VV + VH and VV-VH (all R>0.3; P<0.05). For grassland there is generally a strong relationship between backscatter and vegetation indices (Supplementary Materials – A5). The NDWI and LAI had the strongest association with the S1 variables. R values between NDWI and S1 VV, S1 VH, S1 VV+VH and S1 VV/VH were 0.62, 0.55, 0.62 and 0.44 respectively. R values between LAI and S1 VV, S1 VH, VV+VH, VV/VH and TDX coherence were 0.47, 0.46, 0.5, 0.3 and 0.4 respectively (all P < 0.05). The results of the linear regressions for TCA vegetation showed that NDWI and LAI had the strongest association with the various backscatter variables (Supplementary Materials – A6). High R values for NDWI were obtained with TDX coherence, S1 VV, S1 VH, S1 VV+VH, S1 VV-VH and S1 VV/VH, at 0.48, 0.73, 0.45, 0.62, 0.44 and 0.7 respectively. Similarly, high R values for LAI were recorded with TDX coherence, S1 VV, S1 VV + VH, S1 VV – VH and VV/VH, at 0.503, 0.57, 0.46, 0.41 and 0.575 respectively. These results were also statistically significant (P < 0.05).

3.4 Classifying and Mapping Oil Polluted Vegetation

Figure 6 shows the results of the image classifications. The best overall accuracies (OA) were obtained when the FF and RF methods were used to classify the MSOIF variables for TCA and cropland areas. Generally, the OA differed only slightly between the FF and RF outputs. However, MSOIF yielded about 10% higher overall accuracy than the multifrequency SAR image fusion (MSIF). This implies that the exclusion of the optical variables from the two classification methodologies increased inter-class errors, thereby reducing the OA. A visual assessment of the outputs showed that the spatial extent of oil-polluted cropland within the cropland segment from the MSOIF using FF was larger than the extent from RF, which led to lower classification accuracy. The results for the TCA using the RF with the MSOIF dataset also had large oil-impacted segments, especially within the central parts of the study area. This likely also accounted for the lower classification accuracy compared to the results obtained from the FF, which had smaller segments of polluted areas. Results obtained for grassland areas did not show as much contrast and dissimilarity as in the cropland and TCA classes. A similar OA was obtained when FF and RF were used to classify the MSIF variables for cropland and TCA areas (62.5% for FF and 60% for RF). The RF outperformed the FF when the same data were used to classify grassland: overall accuracies of 55% and 40% were obtained for RF and FF respectively.

Figure 6: Classified maps of polluted and unpolluted areas for three land cover types. (a) Using MSOIF data. (b) Using MSIF data. Results show that FF had the highest performance (OA 75%) in TCA using the MSOIF variables, while RF had the highest performance (75%) in Cropland using the MSOIF variables. Note that crop cover is more dominant in the southwest of the study area and tree cover more in the north (Figure 1).

3.5 Variable Importance Measure

3.5.1 Multifrequency SAR – Optical Image Fusion (MSOIF)

The results of the variable importance from the RF and FF using the MSOIF dataset are presented in Figure 7. Elevation-derived variables including the DEM were the most important variables. For cropland classification, the Red Edge (band 7), Aspect and NDVI were the most important variables in the discrimination of oil-free and oil-impacted cropland areas when FF classification was implemented. In contrast, for RF, the slope, aspect and DEM were the three most important variables. This implies that more diverse features were selected for the FF than for the RF. For grassland areas, the DEM, SWIR and Red bands were the most important explanatory variables in the FF classification. However, the result from the RF classification showed that the three elevation variables (Aspect, DEM and Slope) and SWIR bands had higher importance. The result obtained from the classification of tree cover (dense vegetation) areas showed that NDVI, DEM, SWIR and the Red Edge (band 7) were the most important variables in the FF classification. Only the DEM and Red Edge bands featured in the top 5 variables when the RF classification was implemented. This exploration of the variable importance highlights that the use of the reduced variables, which were free of high dimensionality effects, in the FF classification yielded comparable or improved performance in the discrimination of oil-polluted and oil-free grassland and TCA compared to RF (Figure 7). This implies that the FF was able to reduce the n input variables to the most important uncorrelated features for the classification of the MSOIF data.

3.5.2 Multifrequency SAR Image Fusion (MSIF)

The results of the variable importance obtained from the RF and FF classifications of the MSIF variables are shown in Figure 8. Elevation variables had greater importance than any other variables used in the classifications. In cropland areas, the results of the RF showed that the 5 most important variables were mostly derived from the elevation model, including slope, aspect and DEM, and textural variables. However, the variable importance chart from the FF showed that both elevation-derived variables and Sentinel-1 backscatter data were selected by the FF classifier in the classification process. This is partly different for grassland areas, as the 5 most important variables obtained from the RF classification were mostly the S1 and elevation-derived variables. In contrast, important variables obtained from the FF classifier were more diverse, as all three elevation-derived variables, the soil map and Sentinel-1 data were used for the classification of grassland areas. Results for TCA areas followed a similar trend to cropland, as the most important variables in the FF classification were Sentinel-1 and elevation features. In contrast, S1, elevation and soil type variables were the top 5 variables in the RF classification. Notably, the FF classification using only the MSIF variables performed comparably to the RF, especially in cropland and TCA, where the best overall classification accuracies of 62.5% and 60% respectively were obtained.

Figure 7. Variable Importance (VI) plot from the FF and RF classification of the MSOIF variables. This shows that Aspect, DEM and SWIR are consistently the most important variables for the FF and RF classification in discriminating polluted Cropland, Grassland and TCA from their respective oil-free cover types. Blue dots represent RF variables and pink dots represent FF variables.

Figure 8. Variable Importance (VI) plot from the FF and RF classification of the MSIF variables. This also shows that Aspect and DEM are the important variables in the FF and RF classification to discriminate polluted Cropland, Grassland and TCA from their respective oil-free cover types.

3.6 Accuracy Assessment

The overall accuracy (OA), user's accuracy (UA) and producer's accuracy (PA) (Congalton, 1991) for cropland, grassland and TCA using the RF and FF classifiers are presented in the supplementary material (A7 and A8). Results show that the cropland and TCA vegetation types classified with the MSOIF data gave the highest OA of 75% in both classifiers. FF had lower errors of commission in cropland and TCA, as user accuracies reached 100% and 90% respectively. The result of McNemar's test (de Leeuw et al., 2006) (also given in the supplementary materials – A9) shows that FF outperformed RF in cropland areas, as the difference between the errors was significant ($\chi^2$ test, P<0.05), compared to the low $\chi^2$ value of 0.2 (P>0.05) obtained when the MSIF data were classified. The result of McNemar's test for grassland shows no significant difference between the errors in the FF and RF classifications (overall classification accuracies of 65%). However, of the two classes investigated, the oil-free grassland had the higher UA (70%) when FF was used to classify the MSOIF. McNemar's test also showed no significant difference between the oil-free and oil-impacted grassland areas for the MSOIF and MSIF (P>0.05). Results for tree cover areas showed that FF outperformed RF in the discrimination of oil-free and oil-polluted vegetation. OA was 75% and 70% for FF and RF respectively when the MSOIF data were classified, and McNemar's test showed a significant difference between the OA values (P<0.05). In addition, a low $\chi^2$ of 0.3 was obtained when the MSIF data for TCA were classified. An overall accuracy of 60% was recorded by the two classifiers, explaining why there is no significant difference between them (P>0.05).

3.7 Field and Qualitative Validation

The results of the field and qualitative validation are presented in Figure 9. The results of the lab test showed that spill sites visited in Eleme, TAI and Gokana had THC values of 641 mg/kg, 620 mg/kg and 605 mg/kg respectively. The areas around the oil spill locations were correctly classified as true positive sites of oil-impacted cropland by both the FF and RF classifiers. The non-spill site visited in Etche had a THC value of 548 mg/kg, which is less than the values at the spill sites. Very-high-resolution Google Earth imagery and a Sentinel-2 true colour composite over the polluted sites were used to evaluate the extent classified as oil-polluted by the two classifiers. The spatial extent of oil-polluted sites obtained from FF was broader and captured the extensive vegetated areas impacted by the oil spill.

Figure 9. Spill points and Total Hydrocarbon Content (THC) on a true colour composite Sentinel 2A image of December 2016 (Source: https://scihub.copernicus.eu/dhus/#/home) of the study area. Results show that THC around the spill points is much higher than at the unpolluted site. FF represented the polluted sites better than RF.

4 Discussion

Multispectral optical data have been used for the detection and mapping of oil-impacted land cover in several studies (Achard et al., 2018; Bianchi et al., 1995; Hese and Schmullius, 2008; Hese and Schmullius, 2009; Mahdianpari et al., 2018; Ozigis et al., 2019). Most of these studies had, however, relatively low accuracy due to spectral confusion, especially in cases where broadband multispectral images were used (Hese and Schmullius, 2008; Hese and Schmullius, 2009). This study investigated the potential of multi-frequency SAR images, the fuzzy forest machine learning algorithm and land-cover-specific image analysis (i.e. separately for different vegetation types) to improve discrimination accuracy on a large scale. The results demonstrate an improvement over previous work (Bianchi et al., 1995; Mahdianpari et al., 2018; Khanna et al., 2013; Achard et al., 2018; Van der Werff et al., 2007; Ozigis et al., 2019) in which multispectral or hyperspectral images were used to map polluted vegetation in smaller areas. Van der Werff et al. (2007) experimented with minimum distance to class means and spectral angle mapper methods to classify a hyperspectral Probe-1 image; only 48% and 29% of pixels were correctly classified. Similarly, Hese and Schmullius (2009) used Landsat images in the 'oil spill contamination mapping in Russia' project (OSCAR) to discriminate oil-contaminated vegetation from oil-free vegetation, soil and industrial land use. Their results suggest that successional changes in vegetation composition owing to regeneration and continuous spill incidents are a major hindrance in mapping class types.

However, the improved results in this study are mainly attributed to the use of integrated multi-frequency SAR 597

backscatter signals (which effectively captures the variation in vegetation structural composition) and multispectral 598

image bands in combination with derived vegetation health indices (which also effectively depict the biochemical properties of vegetation). This provided a superior product that improved discrimination accuracy between impacted and oil-free vegetation types. However, one limitation of the fuzzy forest method observed in this study is that it did not further improve classification accuracy on the multifrequency SAR image fusion (MSIF) dataset for the three vegetation classes investigated. Moreover, classification accuracy was lower for cropland vegetation when the multifrequency SAR optical image fusion (MSOIF) classification was implemented. Figure 10 shows the total area coverage retrieved from the FF and RF image classifications using the MSOIF and MSIF datasets. The result of the MSOIF classification, which had the best accuracy for both RF and FF, indicates that the area of oil-free vegetation was significantly greater than that of impacted vegetation. Comparison of the retrieved impacted vegetation with the ECCI land cover data showed that 43%, 47% and 48% of the total grassland, cropland and TCA vegetation, respectively, are under the influence of oil pollution within the study area. This result suggests a great reduction in the ecosystem services provided, especially by the tree cover (dense) vegetation and cultivated farmlands. These vegetation types are important for reducing the impact of climate change, improving agricultural productivity and fostering food security. In the Niger Delta in particular, Nriagu (2011) and UNEP (2011) have noted impacts of oil pollution ranging from land degradation, depletion of forest vegetation and reduced crop yield to increased migration of local inhabitants to other rural and urban areas. These impacts are of great concern, as between 50% and 70% of the Niger Delta inhabitants depend on the natural environment for agriculture, fishing and the collection of forest products as their principal source of livelihood (Nriagu, 2011).
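As a rough illustration of how such percentages can be derived, the sketch below converts pixel counts from a classified map into hectares and expresses the impacted area as a share of the total class area. The pixel counts and the 10 m pixel size are hypothetical values chosen only to reproduce the reported proportions; they are not data from this study, and the function names are assumptions made for illustration.

```python
# Hypothetical sketch: share of each vegetation class mapped as oil-impacted.
# Pixel counts and the 10 m pixel size are illustrative assumptions, not study data.

def pixels_to_hectares(n_pixels: int, pixel_size_m: float = 10.0) -> float:
    """Convert a pixel count to hectares (1 ha = 10,000 square metres)."""
    return n_pixels * pixel_size_m ** 2 / 10_000.0


def impacted_share(impacted_ha: float, total_ha: float) -> float:
    """Percentage of a class's total area that was classified as impacted."""
    if total_ha <= 0:
        raise ValueError("total_ha must be positive")
    return 100.0 * impacted_ha / total_ha


# Illustrative counts per class: (impacted pixels, total pixels in the land cover map).
example_counts = {
    "grassland": (43_000, 100_000),
    "cropland": (47_000, 100_000),
    "TCA": (48_000, 100_000),
}

shares = {}
for name, (impacted_px, total_px) in example_counts.items():
    shares[name] = impacted_share(
        pixels_to_hectares(impacted_px), pixels_to_hectares(total_px)
    )
    print(f"{name}: {shares[name]:.0f}% impacted")
```

In practice the totals would come from the land cover product and the impacted counts from the classified image, both masked to the same extent and resampled to the same grid.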

Figure 10. Area (ha) of oil-free and impacted cropland, grassland and TCA vegetation according to fuzzy forest and random forest classifications of the MSOIF and MSIF data.

The impact of oil on the various vegetation types manifests largely because only small quantities of oil on land are weathered, biodegraded or vaporized, while a larger percentage remains in place and immobile, especially heavy compounds such as bitumens, resins, waxes and asphaltenes. This generally leads to infiltration of toxins into the soil column, polluting surface and groundwater systems. The major consequences are prolonged barren periods and ecosystem damage, which reduce the potential for recolonization by dominant species such as arthropods, zooplankton and macrozoobenthos. Environmental degradation caused by oil spills can also indirectly affect human health through groundwater consumption (from shallow wells and boreholes), agricultural produce and fisheries.

A major concern is that these effects can spread beyond the actual spill points to adjoining areas through topsoil erosion and groundwater movement. The results obtained in this study demonstrate this trend, as oil-impacted vegetation was observed to extend beyond the local incident points and catchments to adjoining areas (see Figure 9). While spectral confusion could also be responsible for this spread, evidence from comparison with high-resolution Google Earth imagery and the McNemar test showed that spectral confusion was substantially reduced with the introduction of the multifrequency SAR data and the fuzzy forest method. Similarly, NOSDRA data for the study area showed that 3,020 ha of cropland was contaminated by the 2015 and 2016 oil spills, while the result obtained from the best analysis procedure in this study showed that nearly 20,000 ha of cropland vegetation may have been impacted. Extrapolating this figure to the entire Niger Delta would suggest that the area of cropland vegetation impacted by the spill incidents could be six times the originally recorded spill extent. This certainly calls for more deliberate efforts to tackle terrestrial oil spill incidents through improved pipeline surveillance and maintenance, owing to their negative impacts on human health and the environment.
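The two quantitative checks mentioned above can be sketched as follows. The first function reproduces the ratio between the mapped and recorded cropland extents using the two figures quoted in the text (with "nearly 20,000 ha" rounded to 20,000). The second shows a generic McNemar test with continuity correction for comparing two classifiers on the same validation samples; the discordant counts used in the example are hypothetical, and this is not necessarily the exact variant of the test applied in this study.

```python
import math
from typing import Tuple


def extent_ratio(mapped_ha: float, recorded_ha: float) -> float:
    """Ratio of the mapped impacted area to the officially recorded spill extent."""
    if recorded_ha <= 0:
        raise ValueError("recorded_ha must be positive")
    return mapped_ha / recorded_ha


def mcnemar(b: int, c: int) -> Tuple[float, float]:
    """McNemar chi-square statistic (continuity corrected) and its p-value.

    b: samples classified correctly only by classifier A
    c: samples classified correctly only by classifier B
    """
    if b + c == 0:
        raise ValueError("at least one discordant sample is required")
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Figures quoted in the text: about 20,000 ha mapped versus 3,020 ha recorded.
print(f"mapped / recorded extent: {extent_ratio(20_000, 3_020):.1f}")

# Hypothetical discordant counts, for illustration only.
stat, p_value = mcnemar(b=25, c=10)
print(f"McNemar chi-square = {stat:.2f}, p = {p_value:.3f}")
```

A small p-value would indicate that the two classifications differ in their error patterns by more than chance alone would explain, which is the sense in which the test supports a reduction in spectral confusion.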

An improved ability to precisely identify areas under the influence of hydrocarbons (as demonstrated in this study) would help in targeting remediation efforts at various scales in polluted areas. Such efforts and projects tend to focus on spill epicentres and reference locations, as opposed to a general wide-area remediation approach. There is certainly a need to improve on the current strategy for identifying and mapping spill sites. In the long term, this would ensure that the urgent goals of biodiversity conservation, improved ecosystem services and increased productivity in agriculture, fisheries and mariculture are achieved on a larger scale. It would also help restore human capital development and livelihoods, as the pollution pathways causing ecological impacts would be precisely identified and remediated accordingly.

5 Conclusions

In conclusion, spaceborne SAR data hold great potential for mapping oil-polluted areas, monitoring leakages in oil and gas transportation facilities and identifying hydrocarbon micro- and macro-seepage locations, the effects of which disturb adjoining vegetation. The results obtained in this study showed that a vegetation-specific approach to image classification, the integration of optical and SAR variables and the use of machine learning methods give better results than rigid conventional techniques that limit the number of input variables. The results also suggest that oil spill impact on the terrestrial ecosystem, and specifically on vegetation, can extend beyond the original spill reference location to adjacent areas. Environmental remediation and rehabilitation efforts should therefore adopt a top-down approach, in which results obtained through mapping operations guide the entire remediation exercise, as opposed to traditional approaches that focus only on spill epicentre locations. With this approach, no impacted area would be omitted from the remediation exercise. By extension, this would ensure a better and safer environment and improve agricultural productivity for farmers in the region. Future studies can explore other indicators of vegetation productivity affected by hydrocarbon pollution. To date, little work has focused on above-ground net primary productivity (ANPP), grass biomass, vegetation biomass, forest carbon stock and other variables for detecting oil pollution. Exploring these would provide new perspectives on the multifaceted impacts of hydrocarbons on vegetation productivity and growth.

6 Acknowledgments

The TanDEM-X imagery was provided by the German Aerospace Centre (DLR) through the proposal XTI_VEGE3408. Cosmo-SkyMed data were acquired from the Italian Space Agency (CosAIR Programme). “Project carried out using CSK® Products, © ASI (Italian Space Agency), delivered under an ASI licence to use”. Sentinel-1 and Sentinel-2 data are copyright of the European Space Agency (ESA) and were acquired from the Sentinel Hub (Sci Hub). We would also like to acknowledge the National Oil Spill Detection and Response Agency (NOSDRA) and Shell Petroleum Development Corporation for making the oil spill incident data available. The authors would like to thank the anonymous reviewers for their helpful comments and suggestions. We also thank Dr Ogochukwu Amukali of the Niger Delta University, and Mr Goodluck Nakaima and Dr Bolaji Babatunde of the Department of Animal and Environmental Biology, University of Port Harcourt, for their support during the various fieldwork campaigns conducted in the Niger Delta.

Funding: This research was undertaken with financial support through a scholarship provided by the Petroleum Technology Development Fund (PTDF), the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 660020, the Royal Society Wolfson Research Merit Award (2011/R3) and the Natural Environment Research Council’s National Centre for Earth Observation.

The authors declare no conflict of interest.

7 References

Abam, T. (2001) 'Regional hydrological research perspectives in the Niger Delta', Hydrological sciences journal, 46(1), pp. 13-25.

Abdel-Moghny, T., Mohamed, R. S., El-Sayed, E., Mohammed Aly, S. and Snousy, M. G. (2012) 'Effect of soil texture on remediation of hydrocarbons-contaminated soil at El-Minia district, Upper Egypt', ISRN Chemical Engineering, 2012.

Achard, V., Fabre, S., Alakian, A., Dubucq, D. and Déliot, P. (Year) 'Direct or indirect on-shore hydrocarbon detection methods applied to hyperspectral data in tropical area', Earth Resources and Environmental Remote Sensing/GIS Applications IX. International Society for Optics and Photonics, p. 107900N.

Adamu, B., Tansey, K. and Ogutu, B. (2015) 'Using vegetation spectral indices to detect oil pollution in the Niger Delta', Remote Sensing Letters, 6(2), pp. 145-154. doi: 10.1080/2150704x.2015.1015656.

Adejuwon, J. O. (2012) 'Rainfall seasonality in the Niger delta belt, Nigeria', Journal of Geography and Regional Planning, 5(2), pp. 51-60.

Akpokodje, E. G. (1987) 'The engineering-geological characteristics and classification of the major superficial soils of the Niger Delta', Engineering Geology, 23(3-4), pp. 193-211.

Arellano, P., Tansey, K., Balzter, H. and Boyd, D. S. (2015) 'Detecting the effects of hydrocarbon pollution in the Amazon forest using hyperspectral satellite images', Environ Pollut, 205, pp. 225-239. doi: 10.1016/j.envpol.2015.05.041.

Bamler, R. and Hartl, P. (1998) 'Synthetic aperture radar interferometry', Inverse problems, 14(4), p. R1.

Barrett, B., Nitze, I., Green, S. and Cawkwell, F. (2014) 'Assessment of multi-temporal, multi-sensor radar and ancillary spatial data for grasslands monitoring in Ireland using machine learning approaches', Remote Sensing of Environment, 152, pp. 109-124. doi: 10.1016/j.rse.2014.05.018.

Bianchi, R., Cavalli, R. M., Marino, C. M., Pignatti, S. and Poscolieri, M. (Year) 'Use of airborne hyperspectral images to assess the spatial distribution of oil spilled during the Trecate blow-out (Northern Italy)', Remote Sensing for Agriculture, Forestry, and Natural Resources. International Society for Optics and Photonics, pp. 352-363.

Bispo, P. C., Santos, J. R., Valeriano, M. M., Touzi, R. and Seifert, F. M. (2014) 'Integration of Polarimetric PALSAR Attributes and Local Geomorphometric Variables Derived from SRTM for Forest Biomass Modeling in Central Amazonia', Canadian Journal of Remote Sensing, 40, pp. 26-42.

Breiman, L. (2001) 'Random forests', Machine learning, 45(1), pp. 5-32.

Brown, L. D. and Ulrich, A. C. (2014) 'Bioremediation of oil spills on land', Handbook of Oil Spill Science and Technology, pp. 395-406.

Clevers, J., Kooistra, L. and Van Den Brande, M. (2017) 'Using Sentinel-2 data for retrieving LAI and leaf and canopy chlorophyll content of a potato crop', Remote Sensing, 9(5), p. 405.

Cloutis, E. (1989) 'Spectral reflectance properties of hydrocarbons: remote-sensing implications', Science, 245(4914), pp. 165-168.

Congalton, R. G. (1991) 'A review of assessing the accuracy of classifications of remotely sensed data', Remote Sensing of Environment, 37(1), pp. 35-46.

Conn, D., Ngun, T., Li, G. and Ramirez, C. (2015) 'Fuzzy Forests: Extending Random Forests for Correlated, High-Dimensional Data'.

Darst, B. F., Malecki, K. C. and Engelman, C. D. (2018) 'Using recursive feature elimination in random forest to account for correlated variables in high dimensional data', BMC genetics, 19(1), p. 65.

de Leeuw, J., Jia, H., Yang, L., Liu, X., Schmidt, K. and Skidmore, A. (2006) 'Comparing accuracy assessments to infer superiority of image classification methods', International Journal of Remote Sensing, 27(1), pp. 223-232.

Duguay, Y., Bernier, M., Lévesque, E. and Tremblay, B. (2015) 'Potential of C and X Band SAR for Shrub Growth Monitoring in Sub-Arctic Environments', Remote Sensing, 7, pp. 9410-9430.

ESA (2018) https://scihub.copernicus.eu/dhus/#/home (Accessed 03/04/2016).

Gao, B.-C. (1996) 'NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space', Remote Sensing of Environment, 58(3), pp. 257-266.

Gebhardt, S., Huth, J., Nguyen, L. D., Roth, A. and Kuenzer, C. (2012) 'A comparison of TerraSAR-X Quadpol backscattering with RapidEye multispectral vegetation indices over rice fields in the Mekong Delta, Vietnam', International Journal of Remote Sensing, 33(24), pp. 7644-7661.

Ghulam, A., Porton, I. and Freeman, K. (2014) 'Detecting subcanopy invasive plant species in tropical rainforest by integrating optical and microwave (InSAR/PolInSAR) remote sensing data, and a decision tree algorithm', ISPRS Journal of Photogrammetry and Remote Sensing, 88, pp. 174-192. doi: 10.1016/j.isprsjprs.2013.12.007.

Gregorutti, B., Michel, B. and Saint-Pierre, P. (2017) 'Correlation and variable importance in random forests', Statistics and Computing, 27(3), pp. 659-678.

Haralick, R. M. and Shanmugam, K. (1973) 'Textural features for image classification', IEEE Transactions on systems, man, and cybernetics, (6), pp. 610-621.

Hese, S. and Schmullius, C. (2008) 'Object oriented oil spill contamination mapping in West Siberia with Quickbird data'. Object-Based Image Analysis. Springer, pp. 367-382.

Hese, S. and Schmullius, C. (2009) 'High spatial resolution image object classification for terrestrial oil spill contamination mapping in West Siberia', International Journal of Applied Earth Observation and Geoinformation, 11(2), pp. 130-141.

Hlatshwayo, S. T., Mutanga, O., Lottering, R. T., Kiala, Z. and Ismail, R. (2019) 'Mapping forest aboveground biomass in the reforested Buffelsdraai landfill site using texture combinations computed from SPOT-6 pan-sharpened imagery', International Journal of Applied Earth Observation and Geoinformation, 74, pp. 65-77.

Hörig, B., Kühn, F., Oschütz, F. and Lehmann, F. (2001) 'HyMap hyperspectral remote sensing to detect hydrocarbons', International Journal of Remote Sensing, 22(8), pp. 1413-1422.

Jin, H., Mountrakis, G. and Stehman, S. V. (2014) 'Assessing integration of intensity, polarimetric scattering, interferometric coherence and spatial texture metrics in PALSAR-derived land cover classification', ISPRS Journal of Photogrammetry and Remote Sensing, 98, pp. 70-84.

Khanna, S., Santos, M. J., Ustin, S. L., Koltunov, A., Kokaly, R. F. and Roberts, D. A. (2013) 'Detection of salt marsh vegetation stress and recovery after the Deepwater Horizon Oil Spill in Barataria Bay, Gulf of Mexico using AVIRIS data', PLoS One, 8(11), p. e78989. doi: 10.1371/journal.pone.0078989.

Khanna, S., Santos, M. J., Ustin, S. L., Shapiro, K., Haverkamp, P. J. and Lay, M. (2018) 'Comparing the potential of multispectral and hyperspectral data for monitoring oil spill impact', Sensors, 18(2), p. 558.

Klamerus-Iwan, A., Błońska, E., Lasota, J., Kalandyk, A. and Waligórski, P. (2015) 'Influence of oil contamination on physical and biological properties of forest soil after chainsaw use', Water, Air, & Soil Pollution, 226(11), p. 389.

Kokaly, R. F., Couvillion, B. R., Holloway, J. M., Roberts, D. A., Ustin, S. L., Peterson, S. H., . . . Piazza, S. C. (2013) 'Spectroscopic remote sensing of the distribution and persistence of oil from the Deepwater Horizon spill in Barataria Bay marshes', Remote Sensing of Environment, 129, pp. 210-230. doi: 10.1016/j.rse.2012.10.028.

Kuhn, M. (2012) 'The caret package', R Foundation for Statistical Computing, Vienna, Austria. URL https://cran.r-project.org/package=caret.

Li, L., Ustin, S. L. and Lay, M. (2005) 'Application of AVIRIS data in detection of oil-induced vegetation stress and cover change at Jornada, New Mexico', Remote Sensing of Environment, 94(1), pp. 1-16.

Mahdianpari, M., Salehi, B., Mohammadimanesh, F., Larsen, G. and Peddle, D. R. (2018) 'Mapping land-based oil spills using high spatial resolution unmanned aerial vehicle imagery and electromagnetic induction survey data', Journal of Applied Remote Sensing, 12(3), p. 036015.

Mishra, D. R., Cho, H. J., Ghosh, S., Fox, A., Downs, C., Merani, P. B. T., . . . Mishra, S. (2012) 'Post-spill state of the marsh: Remote estimation of the ecological impact of the Gulf of Mexico oil spill on Louisiana Salt Marshes', Remote Sensing of Environment, 118, pp. 176-185. doi: 10.1016/j.rse.2011.11.007.

Mohammadimanesh, F., Salehi, B., Mahdianpari, M., Motagh, M. and Brisco, B. (2018) 'An efficient feature optimization for wetland mapping by synergistic use of SAR intensity, interferometry, and polarimetry data', International Journal of Applied Earth Observation and Geoinformation, 73, pp. 450-462.

Musa, Z. N., Popescu, I. and Mynett, A. (2016) 'Assessing the sustainability of local resilience practices against sea level rise impacts on the lower Niger delta', Ocean & Coastal Management, 130, pp. 221-228.

Ndimele, P. E., Saba, A. O., Ojo, D. O., Ndimele, C. C., Anetekhai, M. A. and Erondu, E. S. (2018) 'Remediation of Crude Oil Spillage'. The Political Ecology of Oil and Gas Activities in the Nigerian Aquatic Ecosystem. Elsevier, pp. 369-384.

Nicodemus, K. K. and Malley, J. D. (2009) 'Predictor correlation impacts machine learning algorithms: implications for genomic studies', bioinformatics, 25(15), pp. 1884-1890.

Nielsen, A. A., Conradsen, K., Skriver, H. and Canty, M. J. (Year) 'Change detection in a series of Sentinel-1 SAR data', Analysis of Multitemporal Remote Sensing Images (MultiTemp), 2017 9th International Workshop on the. IEEE, pp. 1-3.

Nriagu, J. (2011) 'Oil industry and the health of communities in the Niger Delta of Nigeria'.

Ohsol, E. O., Pinkerton, J. W., Gillespie, T. E. and Laity, T. H. (1999) 'Process for upgrading heavy crude oil production'. Google Patents.

Onojeghuo, A. O., Blackburn, G. A., Wang, Q., Atkinson, P. M., Kindred, D. and Miao, Y. (2018) 'Mapping paddy rice fields by applying machine learning algorithms to multi-temporal Sentinel-1A and Landsat data', International Journal of Remote Sensing, 39(4), pp. 1042-1067.

Ozigis, M. S., Kaduk, J. D. and Jarvis, C. H. (2019) 'Mapping terrestrial oil spill impact using machine learning random forest and Landsat 8 OLI imagery: a case site within the Niger Delta region of Nigeria', Environmental Science and Pollution Research, 26(4), pp. 3621-3635.

Patel, P., Srivastava, H. S., Panigrahy, S. and Parihar, J. S. (2006) 'Comparative evaluation of the sensitivity of multi-polarized multi-frequency SAR backscatter to plant density', International Journal of Remote Sensing, 27(2), pp. 293-305.

Rajah, P., Odindi, J. and Mutanga, O. (2018) 'Feature level image fusion of optical imagery and Synthetic Aperture Radar (SAR) for invasive alien plant species detection and mapping', Remote Sensing Applications: Society and Environment, 10, pp. 198-208.

Ramsey, E., Rangoonwala, A. and Jones, C. (2015) 'Structural Classification of Marshes with Polarimetric SAR Highlighting the Temporal Mapping of Marshes Exposed to Oil', Remote Sensing, 7(9), pp. 11295-11321. doi: 10.3390/rs70911295.

Schmidt, J., Fassnacht, F. E., Förster, M. and Schmidtlein, S. (2017) 'Synergetic use of Sentinel-1 and Sentinel-2 for assessments of heathland conservation status', Remote Sensing in Ecology and Conservation.

Shapiro, K., Khanna, S. and Ustin, S. L. (2016) 'Vegetation impact and recovery from oil-induced stress on three ecologically distinct wetland sites in the Gulf of Mexico', Journal of Marine Science and Engineering, 4(2), p. 33.

Son, N.-T., Chen, C.-F., Chen, C.-R. and Minh, V.-Q. (2018) 'Assessment of Sentinel-1A data for rice crop classification using random forests and support vector machines', Geocarto International, 33(6), pp. 587-601.

Sportouche, H., Tupin, F., Nicolas, J.-M., Perciano, T. and Deledalle, C.-A. (Year) 'How to combine TerraSAR-X and Cosmo-SkyMed high-resolution images for a better scene understanding?', Geoscience and Remote Sensing Symposium (IGARSS), 2012 IEEE International. IEEE, pp. 178-181.

Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T. and Zeileis, A. (2008) 'Conditional variable importance for random forests', BMC bioinformatics, 9(1), p. 307.

Taiwo, E., Otolorin, J. and Afolabi, T. (2012) 'Crude oil transportation: Nigerian Niger delta waxy crude'. Crude oil exploration in the world. IntechOpen.

R Core Team (2017) 'R: A language and environment for statistical computing'. Vienna, Austria: R Foundation for Statistical Computing.

UNEP (2011) 'Environmental Assessment of Ogoniland'. UNEP, Nairobi.

Van der Werff, H., Noomen, M., Van der Meijde, M. and Van der Meer, F. (2007) 'Remote sensing of onshore hydrocarbon seepage: problems and solutions', Geological Society, London, Special Publications, 283(1), pp. 125-133.

Varghese, A. O., Suryavanshi, A. and Joshi, A. K. (2016) 'Analysis of different polarimetric target decomposition methods in forest density classification using C band SAR data', International Journal of Remote Sensing, 37(3), pp. 694-709. doi: 10.1080/01431161.2015.1136448.

Verrelst, J., Rivera, J. P., Veroustraete, F., Muñoz-Marí, J., Clevers, J. G., Camps-Valls, G. and Moreno, J. (2015) 'Experimental Sentinel-2 LAI estimation using parametric, non-parametric and physical retrieval methods–A comparison', ISPRS Journal of Photogrammetry and Remote Sensing, 108, pp. 260-272.

Wagner, W., Lemoine, G., Borgeaud, M. and Rott, H. (1999) 'A study of vegetation cover effects on ERS scatterometer data', IEEE Transactions on Geoscience and Remote Sensing, 37(2), pp. 938-948.

Wang, Y., Feng, J., Lin, Q., Lyu, X., Wang, X. and Wang, G. (2013) 'Effects of crude oil contamination on soil physical and chemical properties in Momoge wetland of China', Chinese geographical science, 23(6), pp. 708-715.

Whanda, S., Sani, Y. and Bulus, G. (2015) 'Modelling of Potential Pipeline Impact Radius and High Consequence Area in a Wetland Sub-Region of Nigeria', Journal of Geographic Information System, 7(06), p. 692.

Whyte, A., Ferentinos, K. P. and Petropoulos, G. P. (2018) 'A new synergistic approach for monitoring wetlands using Sentinels-1 and 2 data with object-based machine learning algorithms', Environmental Modelling & Software, 104, pp. 40-54.

Zhou, T., Pan, J., Zhang, P., Wei, S. and Han, T. (2017) 'Mapping winter wheat with multi-temporal SAR and optical images in an urban agricultural region', Sensors, 17(6), p. 1210.

Zuhlke, M., Fomferra, N., Brockmann, C., Peters, M., Veci, L., Malik, J. and Regner, P. (Year) 'SNAP (sentinel application platform) and the ESA sentinel 3 toolbox', Sentinel-3 for Science Workshop. p. 21.

• Combined S2, C- and X-band SAR, geophysical and digital elevation model (DEM) variables were used to detect and map polluted and oil-free vegetation.

• Polluted and oil-free vegetation types presented varying backscatter distributions, in which dense vegetation had a strong relationship with leaf area index (LAI).

• The use of multifrequency SAR optical image fusion (MSOIF) in combination with fuzzy forest and random forest methods provided better results than multifrequency SAR image fusion (MSIF).

The authors have no conflict of interest to declare.