
Expert Systems With Applications 88 (2017) 435–447

Contents lists available at ScienceDirect

Expert Systems With Applications

journal homepage: www.elsevier.com/locate/eswa

Data mining and machine learning for identifying sweet spots in shale reservoirs

Pejman Tahmasebi a,∗, Farzam Javadpour b, Muhammad Sahimi c

a Department of Petroleum Engineering, University of Wyoming, Laramie, Wyoming 82071, USA
b Bureau of Economic Geology, Jackson School of Geosciences, The University of Texas at Austin, Austin, TX 78713, USA
c Mork Family Department of Chemical Engineering and Materials Science, University of Southern California, Los Angeles, CA 90089-1211, USA

a r t i c l e i n f o

Article history:

Received 12 February 2017

Revised 20 June 2017

Accepted 11 July 2017

Available online 14 July 2017

Keywords:

Neural networks

Genetic algorithm

Big data

Shale formation

Fracable zones

Brittleness index

a b s t r a c t

Due to its complex structure, production from a shale-gas formation requires more drilling than a traditional reservoir does. Modeling such reservoirs and making predictions for their production also require highly extensive datasets. Both are very costly. In-situ measurements, such as well logging, are among the most indispensable tools for providing a considerable amount of information and data for such unconventional reservoirs. Production from shale reservoirs involves the so-called fracking, i.e. injection of water and chemicals into the formation in order to open up flow paths for the hydrocarbons. The measurements and any other types of data are utilized for making critical decisions regarding development of a potential shale reservoir, as it requires an initial investment of hundreds of millions of dollars. The questions that must be addressed include: can the region under study be used economically for producing hydrocarbons? If the response to the first question is affirmative, then, where are the best places to carry out hydro-fracking? Through the answers to such questions one identifies the sweet spots of shale reservoirs, which are the regions that contain high total organic carbon (TOC) and brittle rocks that can be fractured. In this paper, two methods from data mining and machine learning are used to aid in identifying such regions. The first method is based on a stepwise algorithm that determines the best combination of the variables (well-log data) to predict the target parameters. However, in order to incorporate more training and efficiently use the available datasets, a hybrid machine-learning algorithm is also presented that models more accurately the complex spatial correlations between the input and target parameters. Then, statistical comparisons between the estimated variables and the available data are made, which indicate very good agreement between the two. The proposed model can be used effectively to estimate the probability of targeting the sweet spots. In light of its automatic input and parameter selection, the algorithm does not require any further adjustment and can continuously evaluate the target parameters as more data become available. Furthermore, the method is able to optimally identify the necessary logs that must be run, which significantly reduces data acquisition operations.

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Due to the recent progress in multistage hydraulic fracturing, horizontal drilling and advanced recovery methods, shales, recognized as unconventional reservoirs, have become a promising source of energy. They are formed by fine-grained, organic-rich matter and were previously considered as source and seal rock that, due to gas production and high pressures in the conventional reservoirs, were traditionally called the trouble zones. Such regions were usually ignored, which explains why no comprehensive data are available for them. Thus, their characterization is still a massive task (Tahmasebi, Javadpour, & Sahimi, 2015a,b; Tahmasebi, Javadpour, Sahimi, & Piri, 2016; Tahmasebi, 2017; Tahmasebi & Sahimi, 2015). Furthermore, shales exhibit highly variable structures and complexities from basin to basin, and even in small fields. They host very small pores, and have low matrix permeability and heterogeneity, both at the laboratory and field scales. Due to such difficulties, and given the fact that new methods for characterization of shale reservoirs are still being developed, application of the characterization and modeling methods for traditional reservoirs to shales is of great importance. In particular, accurate characterization of such reservoirs entails integrating various information, including petrophysical, geochemical, geomechanical, and reservoir data (Tahmasebi & Sahimi, 2016a,b; Tahmasebi, Javadpour, & Sahimi, 2016; Tahmasebi, Sahimi, & Andrade, 2017; Tahmasebi, Sahimi, Kohanpur, & Valocchi, 2016).

∗ Corresponding author. E-mail addresses: [email protected] (P. Tahmasebi), [email protected] (F. Javadpour), [email protected] (M. Sahimi).


Apart from its type, efficient drilling may be thought of as targeting the most productive zones of a reservoir with maximum exposure. For shale reservoirs, this concept is equivalent to areas with high total organic carbon (TOC) and high fracability, i.e. brittleness, which calls for comprehensive characterization of such complex formations. High TOC and fracable index (FI) reflect high quality of shale-gas reservoirs. The role of the TOC is clear, as it is one of the main factors for identifying an economical shale reservoir. Higher TOC, ranging from 2% to 10%, represents richer organic contents and, consequently, higher potential for gas production. Since natural gas is trapped in both organic and inorganic matter, the TOC denotes the entire organic carbon, and is a direct measure of the volume and maturity of the reservoir.

The FI influences the flow of hydrocarbons in a shale reservoir and any future fracking in it. Thus, identifying the layers in a reservoir with high FI is of great importance. The FI controls a shale reservoir's performance, since it strongly influences the wells' production. It also provides very useful insight into where and how new wells should be placed and spaced. In fact, unlike conventional reservoirs that depend on long-range connectivity of the permeable zones, optimal well locations and spacing control the performance of shale reservoirs and future fracking operations in them. Thus, separating the brittle and ductile zones of rock is a key aspect of successful characterization of shale-gas reservoirs. Moreover, brittle shale has high potential for being naturally fractured and, consequently, exhibits a good response to fracking treatments.

Past successful experience has indicated that characterization of shale reservoirs needs accurate identification of the so-called sweet spots, i.e. the zones that present the best production or the potential for high production, and the potential fracable zones, which are critical to maximizing the production and future recovery. The placement of most of the wells is closely linked with the sweet spots, as well as the fracable zones for hydraulic fracturing. For example, the TOC represents the ability of a shale reservoir in storing and producing hydrocarbons. Fracability is controlled mainly by mineralogy and elastic properties, such as the Young's and bulk moduli and the Poisson's ratio (Sullivan Glaser et al., 2013). Therefore, identification of the sweet spots is of great importance to shale reservoirs. Such spots are characterized through high kerogen content, low water saturation, high permeability, high Young's modulus and low Poisson's ratio.

One of the primary, as well as most affordable, methods for characterizing complex reservoirs is coring and collecting petrophysical data, as well as well logs. The latter can be integrated with the former in order to develop a more reliable model. In principle, well-log data can be provided continuously and, thus, they represent a real-time resource. Because of the huge number of wells in a typical shale reservoir, we refer to such datasets as big data. Clearly, such information is very useful when it is coupled with techniques that help better identify the sweet spots and fracable zones. Eventually, the questions that must be addressed are: where should one drill, or not drill, new wells? Where are the zones with high or low fracability index?

Aside from such critical questions, another issue regarding the available big data is the fact that new data are continuously obtained as production proceeds. Thus, any algorithm for the analysis of big data should be flexible enough for rapid adaptation to new data. Furthermore, another important feature of the algorithm should be its ability to use the available information to create a "training platform" for forecasting the important parameters.

In this paper, a very large database consisting of well logs, X-ray diffraction (XRD) data, and experimental core analysis is used to develop a model that reduces the cost and increases the probability of identifying the sweet spots. First, we describe a method of data mining called stepwise regression (Efroymson, 1960; Montgomery, Peck, & Vining, 2012) for identifying the correlations between the target (i.e., dependent) parameters, the TOC and FI, and the available well-log data (the independent variables). Then, a hybrid method borrowed from machine learning and artificial intelligence is proposed for accurate predictions of the parameters. Both methods can be tuned rapidly, and can use the older database to accurately characterize shale reservoirs.

2. Methodology

As mentioned earlier, two very different methods are used in this paper. The first is borrowed from the data-mining field, by which the correlation between a dependent (response) variable and a series of independent variables is constructed and used for future forecasting. Next, a method of machine learning for developing a more robust model is introduced that recognizes the complex relations between the variables.

2.1. Linear regression models based on statistical techniques

Since we are dealing with a multivariate dataset, multiple linear regression (MLR) is selected to model the relationship between the independent variables and the response variables, the TOC and FI. Suppose that the response variable y is related to k predictor variables. Then, as is well known,

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$   (1)

represents an MLR model that contains k predictor variables. Here ε is the model error and the $\beta_j$, with j = 0, 1, ..., k, are the partial regression coefficients. In practice, the $\beta_j$ and the variance of the model error ($\sigma^2$) are not known a priori. The parameters may be estimated using the sample data.

In the k-dimensional space of the predictor variables, Eq. (1) represents a hyperplane. The expected change in the response y as a result of a change in $x_j$ is contained in the coefficient $\beta_j$. It should be noted that the variation of y is subject to holding all the remaining variables $x_i$ (i ≠ j) constant. More complex functional forms can still be included and modeled with a slight modification of Eq. (1). For example, consider the following equation:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2^3 + \beta_3 e^{x_3} + \beta_4 x_1 x_2 + \varepsilon.$   (2)

If we let $\Phi_1 = x_1$, $\Phi_2 = x_2^3$, $\Phi_3 = e^{x_3}$ and $\Phi_4 = x_1 x_2$, then Eq. (2) is written as

$y = \beta_0 + \beta_1 \Phi_1 + \beta_2 \Phi_2 + \beta_3 \Phi_3 + \beta_4 \Phi_4 + \varepsilon,$   (3)

which is again an MLR model. Thus, Eq. (3) may be used to predict the response variable using the new predictor variables. The coefficients of Eq. (1) are estimated by least-squares fitting.
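As an illustration of how the coefficients of Eq. (1) can be obtained by least squares, the following minimal Python sketch fits an MLR model to synthetic data; the array names and values are placeholders, not data from this study.

import numpy as np

# Minimal sketch: fit the MLR model of Eq. (1) by ordinary least squares.
# X holds the predictor (well-log) variables, y the response (e.g., TOC);
# the data below are synthetic placeholders.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                    # 50 samples, k = 3 predictors
y = 1.0 + X @ np.array([0.5, -0.2, 0.8]) + 0.1 * rng.normal(size=50)

X1 = np.column_stack([np.ones(len(X)), X])      # prepend a column of 1s for beta_0
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares estimate of [beta_0 .. beta_k]
y_hat = X1 @ beta                               # fitted response
print(beta, np.mean((y - y_hat) ** 2))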

2.1.1. Evaluating sub-model significance

Clearly, in the presence of multiple variables, selecting the most important ones for reducing the complexity and improving the performance of a predictive model is of great importance. In other words, we are interested in identifying the best subset of parameters that maintains the variability while minimizing the complexity. Thus, various subsets must be compared and, then, one must decide which one is more representative and accurate than the others. One common criterion for model adequacy is the coefficient of multiple determination, $R^2$. Thus, for a model with p variables, $R^2$ is calculated by

$R_p^2 = \frac{SS_R(p)}{SS_T} = 1 - \frac{SS_{Res}(p)}{SS_T},$   (4)


where $SS_{Res}(p)$ and $SS_R(p)$ indicate, respectively, the residual sum of squares and the regression sum of squares. A higher p results in a higher $R_p^2$. However, an adjusted coefficient of multiple determination that is more transparent is defined by

$R_{Adj,p}^2 = 1 - \left(\frac{n-1}{n-p}\right)\left(1 - R_p^2\right),$   (5)

where n is the number of observations. It should be noted that $R_{Adj,p}^2$ does not necessarily increase by adding more variables; $R_{Adj,p+s}^2$ exceeds $R_{Adj,p}^2$ if and only if the partial F statistic exceeds 1 (Edwards, 1969; Haitovsky, 1968; Seber & Lee, 2003). The partial F statistic is used for evaluating the significance of adding a new variable s. Thus, one can use the subset of models that maximizes $R_{Adj,p}^2$.
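A short sketch of Eqs. (4) and (5), assuming n is taken as the number of observations and y_hat holds the fitted values from a model with p variables; the function name is ours, not from the paper.

import numpy as np

def r2_and_adjusted(y, y_hat, p):
    """R^2_p (Eq. 4) and adjusted R^2 (Eq. 5) for a model with p variables."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    ss_t = np.sum((y - y.mean()) ** 2)        # total sum of squares
    ss_res = np.sum((y - y_hat) ** 2)         # residual sum of squares
    n = len(y)                                # number of observations
    r2 = 1.0 - ss_res / ss_t
    r2_adj = 1.0 - (n - 1) / (n - p) * (1.0 - r2)
    return r2, r2_adj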

2.1.2. Variable selection

Variable selection can be computationally expensive, as one must determine the result of adding or eliminating each variable one by one. There are, however, some efficient methods for doing so, which are briefly described below.

- Forward selection

This method begins by adding the first variable that has the maximum correlation with the response variable. Then, if its statistic F exceeds $F_{in}$, it is added to the model. Note that $F_{in}$ is used as a threshold and F is calculated by

$F = \frac{\left(SS_{Res,1} - SS_{Res,2}\right)/\left(p_2 - p_1\right)}{SS_{Res,2}/\left(n - p_2\right)}.$

In a similar fashion, the next variable is added to the model if it carries the maximum correlation after the previously added variable. Once again, the statistic F is compared with $F_{in}$ and the variable is added to the model if it exceeds the threshold. The process continues until F does not exceed the predefined threshold or no variable is left.

- Backward selection

Backward variable elimination acts exactly as the opposite of forward selection. It first incorporates all the variables in the model and, then, removes them one by one according to some criterion. The elimination is carried out using the statistic of elimination, $F_{out}$: the minimum statistic among the variables is compared with $F_{out}$, and the corresponding variable is eliminated if it is less than $F_{out}$. The procedure is then repeated for the remaining variables until the smallest statistic F is not less than $F_{out}$.

- Stepwise selection

The above algorithms are computationally expensive when one deals with a large dataset, as is the case for most unconventional reservoirs. Instead, one may use stepwise selection, which is an efficient method to identify the best combination of the variables (Efroymson, 1960). Similar to the backward algorithm, in stepwise selection the independent variables are all incorporated into the model one at a time and are re-evaluated using a pre-defined F. After adding a new variable, the statistic $F_{out}$ is compared with the pre-defined criterion to decide whether some of the existing variables can be eliminated without losing accuracy. That this is possible is due to one of the previously incorporated variables losing its significance when one or more new variables are added to the model. The algorithm is terminated when either the defined statistic F is maximized, or the improvement from adding any new variable does not exceed the pre-defined F. It should also be noted that both $F_{in}$ and $F_{out}$ are used in this method. Generally, one can consider $F_{out} < F_{in}$ in order to allow a variable to be added more easily. It should be noted that one may also benefit from the recently developed algorithms for forward selection (Cernuda et al., 2013; Cernuda, Lughofer, Märzinger, & Kasberger, 2011; Serdio et al., 2017). A sketch of the forward step is given below.
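The following sketch is one possible reading of the forward step described above, using the partial F statistic as the entry criterion; the removal step of the stepwise algorithm (with F_out) would follow the same pattern. Function names and the default threshold are illustrative assumptions.

import numpy as np

def fit_rss(X, y, cols):
    """Residual sum of squares of an OLS fit on the selected columns (plus intercept)."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ beta) ** 2)

def forward_selection(X, y, f_in=4.0):
    """Greedy forward selection: add the variable with the largest partial F,
    F = ((RSS1 - RSS2)/(p2 - p1)) / (RSS2/(n - p2)), while F exceeds f_in."""
    n, k = X.shape
    selected, remaining = [], list(range(k))
    while remaining:
        rss1, p1 = fit_rss(X, y, selected), len(selected) + 1
        scores = []                               # partial F of each candidate variable
        for j in remaining:
            rss2, p2 = fit_rss(X, y, selected + [j]), p1 + 1
            scores.append(((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2)))
        best = int(np.argmax(scores))
        if scores[best] < f_in:
            break
        selected.append(remaining.pop(best))
    return selected

# Usage: selected = forward_selection(X, y) with X of shape (n, k) and response y.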

2.2. Non-linear models based on soft computing architectures

The method described above can be used when one expects to extract linear relationships between the available variables. Such methods are, however, unable to discover the nonlinear relationships that we frequently encounter in characterization of complex shale reservoirs. Thus, in this section we demonstrate the ability of non-linear models based on soft computing for extracting such nonlinear relationships.

The main purpose of machine learning algorithms is to learn from the existing data and evidence, in order to be able to make predictions. Thus, most of such approaches rely on the input data, which is why they are classified as data-driven techniques. Such methods have been used widely in earth science problems, including permeability/porosity estimation (Karimpouli & Malehmir, 2015), grade estimation (Tahmasebi & Hezarkhani, 2011), contaminant modeling (Rogers & Dowla, 1994), and predicting temperature distribution and methane output in large landfills, which are essentially large-scale porous media in which biodegradation occurs (Li et al., 2011; Li, Qin, Tsotsis, & Sahimi, 2012; Li, Tsotsis, Sahimi, & Qin, 2014). Due to the considerable complexity of, and unpredictable spatial correlations between, the available data for shale reservoirs, in this paper we use the fuzzy logic approach, first proposed by Zadeh (1965), and combine it with two more machine learning algorithms, namely, neural networks (NNs) and the genetic algorithm (GA), in order to increase the model's predictive power. What follows is a brief description of the techniques.

2.2.1. Fuzzy inference system

Fuzzy set theory, or fuzzy logic (FL), was first introduced by Zadeh (1965). Unlike Boolean logic, which has two outcomes, namely, true (1) or false (0), the truth value of a variable in FL may lie anywhere in [0, 1]. In other words, the FL allows one to define partial set membership (see below). Thus, even if a set of vague, ambiguous, noisy, imprecise, and incomplete input information is made available, the FL can still produce a valid response.

Briefly, the FL tries to identify a connection between the input and output through defining some if-then rules, and is composed of some elements that are described here. Typically, the input and output data are presented using a membership function, represented by a curve. The curve attributes some weights to the input and defines how it can be mapped onto the output. The rules are the instructions that define the conditional statements. They, along with the membership functions, map the input onto the output. Finally, the interdependence of the input and output is defined through some fuzzy propositions, or the so-called FL operators. A recent comprehensive review of the FL is given by Gomide (2016).

The aforementioned elements in the FL are typically difficult to set. For example, the membership functions, their distributions and the compositions of the fuzzy rules strictly control the FL performance, but are difficult to set. Thus, as a first solution, such parameters may be defined through exhaustive trial and error. In addition, prior knowledge and experience of the problem under study can also help in defining more accurate if-then rules. In fact, ultra-tight shale gas reservoirs are currently under extensive study and, thus, some of their complexities are not even discovered or understood yet. Thus, defining such elements for shales may be very misleading. For this aim, the NNs are employed to alleviate such shortcomings and define the necessary parameters in a more meaningful way. Another solution can be the use of the GA in the FL (Cordón, Gomide, Herrera, Hoffmann, & Magdalena, 2004). It should be noted that there are many other techniques that deal with fuzzy system extraction and learning from data (Abonyi, 2003; Lughofer, 2011; Pedrycz & Gomide, 2007), but they are not discussed in this paper.
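As a toy illustration of these elements, the sketch below evaluates two if-then rules on a single normalized input using the generalized bell membership function (the same form that appears later in Eq. (25)); the rule consequents and parameter values are made up for illustration only.

import numpy as np

def bell_mf(x, a, b, c):
    """Generalized bell membership function (the form used later in Eq. (25))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

# Two illustrative rules on a normalized input (e.g., a scaled GR reading):
#   R1: if x is LOW  then f1 = 0.2
#   R2: if x is HIGH then f2 = 0.9
# The crisp output is the firing-strength-weighted average of the consequents.
x = 0.7
w1, w2 = bell_mf(x, a=0.3, b=2, c=0.0), bell_mf(x, a=0.3, b=2, c=1.0)
out = (w1 * 0.2 + w2 * 0.9) / (w1 + w2)
print(out)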


2.2.2. Integrated neural networks and fuzzy logic

The NNs have been inspired by the human brain, with a significantly lower complexity, and can be used for modeling complex and highly nonlinear problems whose solutions by the common statistical or analytical methods are not straightforward (Bishop, 1995), or impossible to derive. Briefly, a NN is composed of some hierarchical nodes called neurons, which are trained using some data. The inputs are mapped onto the output space through some weights (or functions) that are initially generated randomly. Then, the weights are updated using the produced error in the output. The process, called training, continues until the error is smaller than a threshold. Next, the resulting NN, or more precisely the mathematical functions derived by it, is used for making predictions. Back propagation is one of the most popular learning algorithms for the NN. It is used in conjunction with gradient descent or the Levenberg-Marquardt method, two optimization algorithms by which the weights are changed iteratively to minimize the following error, in order to determine their optimal values:

$E = \frac{1}{2}\sum_i \left(outp_i - prd_i\right)^2,$   (6)

where $outp_i$ is the actual output and $prd_i$ is the predicted value. Minimizing E entails setting to zero its derivatives with respect to the parameters $w_{ij}$ in order to decide how to vary $w_{ij}$ for two connected nodes (i, j). Thus,

$w_{ij} := w_{ij} + \Delta w_{ij},$   (7)

where

$\Delta w_{ij} = -\eta\,\frac{\partial E}{\partial w_{ij}},$   (8)

in which η is a learning rate. In order to prevent the optimization method from being trapped in a local minimum of E, as opposed to its true global minimum, one can use the following approach. Suppose that

$\varphi_j = -\frac{\partial E}{\partial NN_j}$   (9)

and

$\xi_j = -\frac{\partial NN_j}{\partial w_j}.$   (10)

Then, $\Delta w_{ij}$ is evaluated through

$\frac{\partial E}{\partial w_{ij}} = \frac{\partial NN_j}{\partial w_j} \times \frac{\partial E}{\partial NN_j} = \varphi_j \xi_j.$   (11)

We then have

$\Delta w_{ij} = \eta\,\varphi_j \xi_j.$   (12)

Since

$\frac{\partial E}{\partial \xi_j} = -\left(outp_i - prd_i\right)$   (13)

and

$\frac{\partial \xi}{\partial NN_j} = f'\!\left(NN_j\right),$   (14)

$\varphi_j$ is also evaluated by

$\varphi_j = \frac{\partial E}{\partial NN_j} = \frac{\partial E}{\partial \xi_j} \times \frac{\partial \xi}{\partial NN_j} = \left(outp_i - prd_i\right) f'\!\left(NN_j\right).$   (15)
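A minimal sketch of the weight update implied by Eqs. (6)-(8) for a single linear node on a synthetic dataset; it is not the network used in the paper, only the update rule.

import numpy as np

# E = 0.5 * sum_i (outp_i - prd_i)^2,  w := w + dw,  dw = -eta * dE/dw.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))                 # inputs
w_true = np.array([0.4, -0.3, 0.1, 0.7])
outp = X @ w_true                             # actual (target) outputs

w, eta = np.zeros(4), 0.05                    # initial weights and learning rate
for _ in range(200):
    prd = X @ w                               # predicted values
    grad = -(outp - prd) @ X                  # dE/dw for the quadratic error of Eq. (6)
    w += -eta * grad / len(X)                 # Eq. (8): dw = -eta * dE/dw (averaged)
print(w)                                      # approaches w_true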

Thus, an approach based on integrating the NN and FL can address the aforementioned issues in defining the latter's parameters (Jang, 1992, 1993). The integrated system benefits from the training ability of the NN for minimizing the generated error in the output. Thus, through the optimization loop in the NN, the FL parameters are tuned. In this fashion, the strengths of the FL qualification and the NN quantification are efficiently integrated.

Suppose that n input variables (inpt) and one output (out) are available, as is the case in this paper. Then, the first-order Sugeno fuzzy system (Tanaka & Sugeno, 1992) is defined by

$R_j$: If $inpt_1$ is $\Phi_1^1$, $inpt_2$ is $\Phi_2^1$, ..., and $inpt_j$ is $\Phi_i^j$, then

$f_r = \sum_{i=1}^{r}\sum_{j=1}^{n} \beta_i^j\, inpt_j + \varepsilon_i,$   (16)

where $\Phi_i^j$ and $\beta_i^j$ are, respectively, the ith fuzzy membership set and the consequent parameter for the jth input, and $f_r$ indicates the consequence part of the rth rule. In this study, the Sugeno model is used, in which the membership functions and the weight adjustment are denoted by the premise and consequent parts, respectively; see Fig. 1. The internal operations for the input mapping, weight adjustment and consequent parameter integration are demonstrated as well.

As already mentioned, the NN is integrated with the fuzzy system shown in Fig. 1. The new integrated architecture is shown in Fig. 2 (Jang, 1993). The resulting hybrid machine learning (HML) network contains various elements in each layer, which are as follows. The inputs are denoted by L1, which in this study are various well logs. L2 represents the layer that contains the membership functions $\Phi_i^j$, and measures the degree to which $inpt_j$ satisfies the quantifier:

$outp_i^j = \Phi_i^j\!\left(inpt_j\right).$   (17)

L3, the third layer, multiplies the input data and produces the firing weights for each rule r, as follows:

$\omega_i = t\!\left(\xi_{\Phi_1^1}, \ldots, \xi_{\Phi_i^j}\right).$   (18)

L4, the fourth layer, normalizes each firing rule based on the sum of all the previously calculated weights:

$\bar{\omega}_r = \frac{\omega_i}{\sum_{i=1}^{r} \omega_i}.$   (19)

Next, the calculated normalized firing strengths are used in the next layer, L5, to generate the adaptive function by

$\bar{\omega}_r f_r = \bar{\omega}_r\left(\beta_r^1\, inpt_1 + \beta_r^2\, inpt_2 + \cdots + \beta_r^n\, inpt_n + \varepsilon_r\right).$   (20)

The overall output is then calculated in the sixth layer, L6, as

$\sum_{i=1}^{r} \bar{\omega}_i f_i = \frac{\sum_{i=1}^{r} \omega_i f_i}{\sum_{i=1}^{r} \omega_i}.$   (21)

Finally, the response, calculated from the previous layer, appears in the last layer, L7.
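The following is a compact sketch of one forward pass through layers L2-L6 of the first-order Sugeno system of Eqs. (16)-(21), assuming generalized bell membership functions and a product t-norm for Eq. (18); the two-rule, two-input configuration and all parameter values are illustrative only.

import numpy as np

def bell_mf(x, a, b, c):
    # Generalized bell membership function used in the premise layer (L2).
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def sugeno_forward(inpt, premise, consequent):
    """Memberships (Eq. 17), rule firing by product t-norm (Eq. 18),
    normalization (Eq. 19), first-order consequents (Eqs. 16, 20)
    and the weighted sum of Eq. (21)."""
    n_rules = len(premise)
    w = np.array([np.prod([bell_mf(x, *premise[r][j]) for j, x in enumerate(inpt)])
                  for r in range(n_rules)])              # firing strengths
    w_bar = w / w.sum()                                  # normalized strengths
    f = np.array([consequent[r][:-1] @ inpt + consequent[r][-1]
                  for r in range(n_rules)])              # rule consequents
    return np.sum(w_bar * f)                             # overall output

inpt = np.array([0.3, 0.8])                              # two normalized well-log readings
premise = [[(0.4, 2, 0.0), (0.4, 2, 0.0)],               # rule 1 MF parameters (a, b, c)
           [(0.4, 2, 1.0), (0.4, 2, 1.0)]]               # rule 2 MF parameters
consequent = np.array([[0.5, 0.2, 0.1],                  # rule 1: beta_1, beta_2, eps
                       [1.0, -0.3, 0.0]])                # rule 2
print(sugeno_forward(inpt, premise, consequent))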

Clearly, some of the aforementioned parameters, such as the membership functions, are vital for an optimal hybrid network. A hybrid learning procedure is applied to the architecture of Fig. 2 that estimates the premise and the consequent parameters and, consequently, allows capturing the complexity and variability between the input and output data. Thus, based on the back-propagation error, the parameters in the premise part are first kept fixed in order to compute the error. Next, by holding fixed the parameters in the consequent part, the premise's parameters are updated using the gradient-descent method. Finally, the membership functions and their consequent weights are optimized.

Although the NN assists the FL with optimal parameter selection, there are still some parameters that need to be optimized. For example, the NN cannot decide how many membership functions are needed. In addition, due to the use of two user-dependent parameters, the learning rate and the momentum coefficient, the NN itself may be trapped in a local minimum.


Fig. 1. Demonstration of first-order Sugeno fuzzy model along with various membership functions and inputs.

Fig. 2. The integrated neural-fuzzy architecture.

It is worth mentioning that momentum adds a small friction to the previous weights and updates to the current state. This parameter prevents the network from converging to a local minimum. A high value can overshoot the minimum and, likewise, a low value cannot avoid local minima and causes the computations to be too long. To address such issues, a third method, namely the GA, is integrated with the neural-fuzzy structure to help determine the parameters. Briefly, due to the presence of various user-dependent and time-demanding parameters in the NN, such as its number of layers, the number of membership functions in the FL and the learning parameters (e.g., the momentum and the rate), the GA can be used to accelerate the above trial-and-error procedure. In other words, defining the aforementioned parameters requires a considerable amount of time, whereas the GA can efficiently and systematically identify the optimal values.
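A toy illustration of the momentum term described above, added to the step of Eq. (12); the learning rate, momentum coefficient and gradient sequence are arbitrary.

# Momentum update: the previous step is carried over with a coefficient alpha
# (the "friction" described above), which damps oscillations near a minimum.
eta, alpha = 0.1, 0.6                 # learning rate and momentum coefficient (illustrative)
dw_prev = 0.0
for grad in [0.8, 0.5, 0.3, 0.2]:     # toy gradient sequence
    dw = -eta * grad + alpha * dw_prev
    dw_prev = dw
    print(dw)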

2.2.3. The genetic algorithm

The GA is one of the most successful optimization algorithms and can deal with a variety of problems that are not feasible, both computationally and in terms of accuracy, to be addressed using many other optimization techniques. For example, the GA can be used when the objective function (the energy E to be minimized) is discontinuous, non-differentiable, highly nonlinear, and even stochastic (Goldberg & Holland, 1988). The technique is built on an initial population of candidates, and mimics, and is inspired by, natural selection. Each candidate has some specific properties, i.e., chromosomes, which can be modified, e.g. mutated, based on the existing evidence. Thus, the aim is to modify the initial population to reach an optimal solution that minimizes E globally. At each step, the chromosomes (the parameters to be estimated in the present work) are randomly selected from the current parents, i.e., the population. Then, "children" are generated for the next population. After some initial trial and error, the GA identifies the potentially useful chromosomes, so that the children will produce a smaller error.

The GA uses three types of rules to create the next generation using the current population, which are as follows (Deb & Agrawal, 1999): (i) mutation, an operation that changes the value of chromosomes; (ii) selection, which chooses the parents from the current children in order to produce the next generation; and (iii) crossover, an operation that combines the chromosomes (the parameters) to increase the variability and prevent the energy E from getting trapped in a local minimum.

In the first step, all the variables are represented by binary strings of 0 and 1. Each chromosome contains several genes, by which various possible values for each free parameter that needs to be optimized are examined. Thus, an initial population is generated that contains various parameters of the FL and NN, such as the number of inputs (i.e., well logs), the number of membership functions, the learning rate, and the momentum coefficient. The candidates are tested and updated progressively by the GA, until the optimal values are obtained.
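A bare-bones sketch of the GA loop with the three operators just described; the fitness function here is a simple placeholder, whereas in the paper it would be the training error of the hybrid network. All names and settings are illustrative.

import numpy as np

rng = np.random.default_rng(2)

def fitness(x):
    # Placeholder objective standing in for the training error E; in the workflow
    # the chromosome would encode FL/NN settings and E would come from training the HML.
    return np.sum((x - 0.3) ** 2)

def ga_minimize(n_genes=4, pop_size=20, n_gen=50, p_cross=0.7, p_mut=0.1):
    """Bare-bones GA: selection (keep the better half), crossover and mutation."""
    pop = rng.random((pop_size, n_genes))
    for _ in range(n_gen):
        order = np.argsort([fitness(ind) for ind in pop])
        parents = pop[order[: pop_size // 2]]            # selection
        children = []
        while len(children) < pop_size - len(parents):
            i, j = rng.integers(len(parents), size=2)
            child = parents[i].copy()
            if rng.random() < p_cross:                    # crossover: mix two parents
                cut = rng.integers(1, n_genes)
                child[cut:] = parents[j][cut:]
            mask = rng.random(n_genes) < p_mut            # mutation: perturb some genes
            child[mask] = rng.random(mask.sum())
            children.append(child)
        pop = np.vstack([parents, children])
    return pop[np.argmin([fitness(ind) for ind in pop])]

print(ga_minimize())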


Fig. 3. (a) The wireline well-log data used in this study, and (b) mineral compositions from the XRD analysis.

3. Results and discussion

As discussed, due to the extensive variability of shale reservoirs, an extensive amount of information is required for their characterization, which helps to reduce the uncertainty and improve real-time recovery operations. Thus, since well logs provide useful information, and at the same time are widely available, the objective of this study is to use such data to predict two important properties of shales, namely, the TOC and FI. Clearly, none of the well logs can by itself predict the two indices, as there is always a very complex correlation between the well logs (Dashtian, Jafari, Koohi Lai, Masihi, & Sahimi, 2011; Karimpouli & Tahmasebi, 2016), which in some cases is mixed with noise. Another important goal of this study is to develop a methodology for deciding the necessary logs that must be run. In other words, a significant reduction in data acquisition efforts can be achieved, once the correlation between the two indices and the available data is identified.


Fig. 4. Mineralogical composition of the samples used in the analysis.

Together, the steps described below quantify the two important properties of shale reservoirs in real time.

There are several ways of defining and measuring brittleness (Zoback, 2007). One of the most common methods is based on using the elastic properties, such as the Young's modulus and the Poisson's ratio. The latter can be thought of as the rock's ability to fail under stress, whereas the ability of rock to resist fracturing can be quantified using the Young's modulus. Rickman, Mullen, Petre, Grieser, and Kundert (2008) proposed equations for estimating the brittleness index using well logs.

In this paper, the FI is defined using the mineralogical contents. This is motivated by the fact that fracability is mostly controlled by the hard mineral contents of rock, such as quartz, calcite and pyrite, and, thus, the XRD measurements of such minerals are used as follows (Jarvie, Hill, Ruble, & Pollastro, 2007):

$FI = \frac{Qz}{Qz + Cal + Cly},$   (22)

where Qz, Cal and Cly are the quartz, calcite and clay contents, respectively. The FI defined by Eq. (22), along with the TOC, was used as the dependent (i.e., target) parameters.
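Eq. (22) translates directly into a one-line function; the mineral fractions used below are illustrative, not measurements from the studied wells.

def fracability_index(qz, cal, cly):
    """FI of Eq. (22): quartz content relative to quartz + calcite + clay.
    Inputs are mineral contents (e.g., weight fractions from XRD)."""
    return qz / (qz + cal + cly)

# Illustrative values only:
print(fracability_index(qz=35.0, cal=40.0, cly=25.0))   # -> 0.35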

3.1. Linear regression models based on statistical techniques

The mineralogical analysis and geochemical data for defining the FI were used; their distribution is depicted in Fig. 4, where the sum of the quartz, clay and carbonate minerals is displayed on each vertex. The plot indicates that the samples have a higher carbonate content.

On the other hand, well-log data are also available. The spatial relationships between the mineralogy, the FI and the gamma ray (GR) log are presented in Fig. 5. As expected, the GR log and the FI positively correlate with the Qz content.

Before applying the MLR analysis, one might also check the linear dependence (i.e., pairwise correlation) between the dependent variables FI and TOC, on the one hand, and the available well logs, on the other hand. Such a relationship was studied, and the results are presented in Table 1. The correlations between the FI and the sonic (DT), density (PE), resistivity (RT10) and Ur logs are stronger than those with the other logs, whereas the overall correlation between the TOC and the well logs is strong. For example, there is considerable correlation between the TOC and the GR and DT logs. Strong correlation between the TOC and GR is expected, since most of the gas in the samples is trapped in the organic matter.

One can also investigate more complex correlations between three variables. Such correlations are shown in Figs. 6 and 7 for the FI and TOC evaluations, respectively. A weak correlation between the first and second variables and the FI can be seen. Such weak correlations are not sufficient when fitting a 3D surface. For example, a considerable number of mismatches between the fitted surfaces and the selected variables are visible in both cases. Clearly, the correlations with one or two variables do not reveal significant information, and cannot be relied upon. Thus, in order to develop a more robust and accurate model, the proposed MLR is used, which optimally selects the informative variables. The stepwise algorithm is applied to select the appropriate variables among the initial well-log data to estimate the TOC and FI. For this aim, the threshold F_in for entering any new variable was considered to be 0.1 with a confidence interval of 90%. For removing a variable, F_out was set to 0.15. Thus, the following model for estimating the FI was obtained:

$FI = 0.004\,PE + 0.001\,Thor + 0.032.$   (23)

The correlation coefficient for Eq. (23) is 0.44, which is very low and indicates that the MLR is not able to accurately predict the FI. On the other hand, the equation for the TOC has a correlation coefficient of 0.88, hence providing reliable predictions for the TOC based on the well-log data, and is given by

$TOC = 0.016\,GR + 0.087\,DT - 1.380\,K - 4.020.$   (24)

3.2. Non-linear models based on soft computing architectures

Before feeding the input data to the proposed HML method, the data are normalized to prevent any superficial scaling effect. Thus, all the data were normalized between zero and one. The well logs are used as the input, while the TOC and FI represent the output. To test the performance of the proposed method, the data were randomly divided into two groups for training (80%) and testing (20%). The training data should be selected carefully to cover the upper and lower possible boundaries of the input space.

Due to its flexibility for controlling the shape, the bell membership function,

$f(x; a, b, c) = \frac{1}{1 + \left|\frac{x - c}{a}\right|^{2b}},$   (25)

instead of the Gaussian distribution, was used because it can be adjusted easily when the proposed hybrid network needs to tune the variability space between the well logs and the TOC/FI output. In Eq. (25), a, b, and c are three parameters that can be varied in order to change the shape of the function. In this paper, the momentum and pure-line axon were used as the learning algorithm and the transfer function, respectively. Clearly, the output of the pure-line function is the same as its input. Some other parameters that need to be set include the step size and the momentum rate as the learning variables, and the input processing elements as the topological parameter. Such parameters can be obtained using trial and error, which is computationally expensive. The GA can, however, provide optimal solutions, so as to avoid long and cumbersome steps.

An important feature of the GA is its chromosomes (parameters in the present study), which are composed of genes. The chromosomes are each composed of four genes, namely the number of inputs, the number of membership functions for each input, the momentum, and the learning rate, which is the same as the number of variables that should be optimized. First, some random values are generated and assigned to each chromosome.
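A short sketch of the preprocessing and the membership function of Eq. (25): min-max normalization of a placeholder well-log matrix, a random 80/20 train/test split, and the bell function itself. The array sizes and parameter values are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(3)

def minmax(x):
    # Normalize each column to [0, 1], as done before feeding the HML.
    return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

def bell_mf(x, a, b, c):
    # Generalized bell membership function of Eq. (25).
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

logs = rng.normal(size=(200, 6))                 # placeholder well-log matrix
logs_n = minmax(logs)

# Random 80/20 split into training and testing subsets.
idx = rng.permutation(len(logs_n))
n_train = int(0.8 * len(logs_n))
train, test = logs_n[idx[:n_train]], logs_n[idx[n_train:]]

print(train.shape, test.shape, bell_mf(0.5, a=0.2, b=2, c=0.4))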


Fig. 5. Spatial distribution of (a) the FI, and (b) the GR, based on the mineralogical information.

Table 1
Correlation between the FI, TOC and well-log data.

Dep. var.   GR       Rhob     Nphi    DT      PE       RT10     SP       Ur      Thor    K
FI          −0.014   0.245    −0.34   0.458   0.443    −0.413   −0.260   0.443   0.366   0.179
TOC         0.736    −0.563   0.703   0.745   −0.675   −0.266   0.428    0.538   0.597   0.410

Fig. 6. (a) and (b) represent the scatter plots with the third variable, the FI, while (c) and (d) show the 3D fitted surfaces using the available data.


Fig. 7. Same as Fig. 6 , but for the TOC.

Table 2
Comparison of MSEs for the HML and MLR methods for estimating the TOC and FI.

                TOC                 FI
Method      HML      MLR        HML      MLR
MSE         0.08     0.43       0.12     1.12

Then, the network is evaluated and the fitness function, represented by the following mean-squared error (MSE), is computed for the chromosomes:

$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(T_i - P_i\right)^2,$   (26)

where $T_i$ and $P_i$ are the target and predicted values, respectively, and n indicates the number of training data. The MSE can take on any positive value, of course.

Next, the chromosomes (parameters) are updated based on the magnitude of the last computed MSE to obtain a better fit. Thus, the proposed MSE, Eq. (26), is evaluated in each successful run of the training. The individuals are selected, combined, and mutated in such a way that they minimize the error. Then, the new parents replace the previous, erroneous ones, which helps produce a more accurate generation. The steps that result in the final optimal network need to be repeated several times. The algorithm is summarized in Fig. 8.

The GA parameters were set in variable ranges. For example, after some preliminary calculations the range of the crossover rate was set between 0.05 and 0.95, while the mutation rate was between 0.01 and 0.3. Each of the probabilities was tested in a round of 54 iterations and the optimal values were used in the GA. The population size was also set to be between 12 and 60. Using the algorithm, several networks with various configurations of the GA operators were tested in order to set the final optimal parameters. For example, as demonstrated in Fig. 9, the optimal networks for the TOC and FI were obtained, respectively, in the 25th and 17th generations. Unlike the MLR, the GA recognized that not all the input parameters have to be incorporated in the model. Therefore, the final models are based on the contributions of a few of the well-log data. The resulting networks contain the optimal parameters, since various configurations were tested using the GA. Such models must achieve the lowest estimation error. Thus, the following parameters were determined to provide the optimal solution:

(i) For the FI evaluation: a mutation rate of 0.15, a network with five nodes in the MFs layer, a learning rate of 0.55, and a momentum of 0.64.

(ii) For the TOC evaluation: a mutation rate of 0.12, a network with four nodes in the MFs layer, a learning rate of 0.62, and a momentum of 0.71.

The results of application of the MLR and HML algorithms are shown in Fig. 10. They both provide acceptable results for the TOC estimation, but the HML delivers predictions that are more accurate. To fit the TOC curve, the MLR tries to mimic the overall distribution of the selected variables, such as, for example, the GR and DT logs, whereas the HML represents a different form, and its predictions are in very good agreement with the GR log, indicating that the proposed TOC evaluation is physically valid. The results of the FI estimation, however, represent very different variations, which are mainly due to the very complex correlation between the FI and the well logs (see also Table 1). For example, the MLR method represents a strongly biased FI curve. On the other hand, the HML method determines a more accurate curve for the FI evaluation. The results are numerically compared using the MSE in Table 2.
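The fitness of Eq. (26) and the four-gene chromosome described above can be sketched as follows; the chromosome values echo the ranges reported in the text, but the scored arrays are dummies, since training the full HML is outside this snippet.

import numpy as np

def mse(target, predicted):
    """Fitness of Eq. (26): mean-squared error over the training data."""
    target, predicted = np.asarray(target), np.asarray(predicted)
    return np.mean((target - predicted) ** 2)

# A chromosome as described above: number of inputs, number of membership
# functions, learning rate, momentum (values are illustrative only).
chromosome = {"n_inputs": 5, "n_mfs": 4, "learning_rate": 0.55, "momentum": 0.64}
# In the workflow, the HML would be trained with these settings and its
# predictions scored with mse(); here we only score dummy arrays.
print(mse([1.2, 0.8, 2.0], [1.0, 1.0, 1.9]))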


Fig. 8. Flowchart of the implemented HML algorithm. The data are first analyzed and then fed to the FL. The NN and GA are integrated to train and optimize the new hybrid system.

Fig. 9. The MSE variation versus generation for evaluation of (a) the TOC, and (b) the FI.

The distribution of the GR log versus the FI is presented in Fig. 11(a), which indicates a direct relationship between the two variables and the TOC. Indeed, as can be seen, the TOC increases when both the GR and FI do. It also indicates significant positive correlations between the GR log and the FI. Furthermore, since the MLR method recognized strong correlation between the TOC and the GR, DT and K (permeability) logs, a more informative multidimensional representation of such relationships is depicted in Fig. 11(b). The axes represent the GR, DT and K logs. The other two variables, namely, the TOC and FI, are shown using the size and color, respectively. Simultaneous variation of the size and color indicates the spatial dependence of the selected variables.

For the purpose of better identification of the fracable zones, the same practice was implemented on the GR log and the TOC against the FI. Then, the FI was clustered into four groups using the k-means method (MacQueen, 1967). The classification was performed using the information on the GR log, the FI and the TOC. Thus, one may use the well-log data and instantly determine whether the area under development is fracable or not. The accuracy of the method can also be verified in Fig. 13, as the probability curves represent very different distributions of the FI data.
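A plain k-means sketch of the four-group clustering mentioned above; the data matrix is a random placeholder standing in for the (GR, TOC, FI) triplets, and the implementation is a generic one rather than the exact routine used in the study.

import numpy as np

def kmeans(data, k=4, n_iter=100, seed=0):
    """Plain k-means used here to group samples into four classes, analogous to the
    brittle / medium-brittle / medium-ductile / ductile grouping described above."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((data[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return labels, centers

data = np.random.default_rng(4).random((300, 3))   # placeholder [GR, TOC, FI] rows
labels, centers = kmeans(data)
print(np.bincount(labels, minlength=4))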

4. Summary and conclusions

Due to their highly complex structures, shale-gas reservoirs require very accurate modeling. Vertical and lateral variability necessitates more drilling, which consequently leads to a significant increase in the cost of the operations. Wireline well logging is one of the most accessible and affordable approaches for continuously monitoring such complexities. Shale-gas repositories are associated with extensive, or big, data. Obviously, analysis of the uncertainty and risk assessment for future development would benefit from big data. However, processing and analyzing such data require advanced techniques.

Aside from acquiring more informative data using various techniques, real-time data integration and modeling is still a significant unsolved problem. Together, both the tools and the data analysis techniques aim at identifying the sweet spots that are potentially favorable for successful and economic development of shale-gas reservoirs. In shale-gas reservoirs, 'sweet spots' are mostly defined as the zones that contain high total organic carbon (TOC) and are easily fracable, i.e., have a high fracable index (FI).


[Fig. 10 log tracks: MD, normalized XRD (Qz, Carb., Clay), RT10 [5 340], GR [9 154], Nphi [2 20], Dt [48 90], U [0.75 7.5], PE [3.5 6], HML (TOC) [0.24 4.36], HML (FI) [0 1], MLR (TOC) [0.24 4.36], MLR (FI) [0 1].]

Fig. 10. Evaluation of the TOC and FI using the HML and MLR methods. The core analysis data for each dataset are shown as dots.

Fig. 11. (a) Spatial correlation between the GR (the X axis), FI (Y axis) and TOC (the third axis represented by color). (b) Spatial multivariate correlation between the GR, Dt, K, TOC (size of the spheres) and FI (color of the spheres). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In this paper, two methods were presented for simultaneously dealing with such big data and identifying the sweet spots.

The first method was based on data mining techniques that try to determine the spatial correlations between the variables using an efficient stepwise algorithm. The method fits a surface based on the well-log data and predicts both the TOC and FI. The final surface is based only on switching the input parameters and testing various coefficients. Clearly, however, the method is not able to first learn and then adaptively change the coefficients to fit a more complex and higher-order surface.

As an alternative, fuzzy logic, a useful method in machine learning, was selected, since it can better process and analyze the complex spatial relationships in the shale data. The technique is, however, based on some user-dependent rules that need a vast dataset for shales, their geological settings, wireline well-log data interpretation, and so on, which gives rise to another obstacle. From the same school of machine learning, neural networks, due to their excellent performance in providing a learning basis, were used to assist in defining the knowledge-based rules. The new technique works very well when the input data are not very complex and high-dimensional. Furthermore, it also has some parameters that must be set carefully, such as the number of input data, the number of MFs, and the learning-related parameters. The data used in this study are, however, highly complex and multidimensional, and would require extensive trial-and-error calculations if "brute-force computations" were to be used. Thus, the method was integrated with the genetic algorithm in order to avoid such time-demanding and cumbersome parameter adjustment. The new hybrid machine learning (HML) technique was developed, and shown to provide much more accurate predictions for the TOC and FI when compared with those of the MLR method; its predictions are in very good agreement with the available experimental data for both properties.

The proposed methods in this study still cannot capture the whole complexity of shale reservoirs, as they manifest highly nonlinear behavior. For example, the HML can hardly follow the trends for the TOC and fracable index. The soft computing method, on the other hand, requires knowledge of various aspects of artificial intelligence. Aside from all such weaknesses, the soft computing method was designed in such a way as to minimize the required knowledge. These techniques can be used when one requires the use of a minimum amount of information in order to identify the sweet spots. The HML method requires the least possible effort and user knowledge for shale reservoir characterization, and can be rapidly updated using new data. The other available techniques for the application of this study, however, are mostly based on trial and error. They require a vast amount of knowledge in geology and well-log interpretation, whereas the proposed methods in this paper minimize such necessities.

The implication of the present work is clear. Methods that are based on trial and error are not useful for characterization of shale reservoirs, and increasing the amount of data, which can be costly, will not improve their performance. On the other hand, advanced computational and analysis techniques of the type that we have discussed and utilized in this paper offer at least two advantages.


Fig. 12. (a) Spatial correlation between the GR (x axis), the TOC (y axis), and the FI (third axis, represented by color). (b) Graphical representation of the k-means clustering. (c) 3D view of the clustering of the well-log data into four groups of brittle, medium brittle, ductile, and medium ductile.

Fig. 13. Density distribution of the clustered well logs based on the FI data. Three distinct probabilities indicate various distributions of the FI data.
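As a rough illustration of the clustering shown in Fig. 12, the sketch below groups well-log samples into four classes with k-means; the feature set and function name are hypothetical and not taken from the paper.

```python
# Minimal sketch of the clustering step: k-means grouping of well-log samples
# into four classes (brittle, medium brittle, medium ductile, ductile).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_brittleness(features: np.ndarray, n_clusters: int = 4, seed: int = 0):
    """Standardize the logs and assign each depth sample to one of four groups."""
    X = StandardScaler().fit_transform(features)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    return km.labels_, km.cluster_centers_

# Hypothetical usage: columns could be GR, TOC, and FI values per depth sample.
# labels, centers = cluster_brittleness(np.column_stack([gr, toc, fi]))
```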


One is that one does not blindly, based on ad hoc approaches, try to develop correlations between the various variables. The second is that such methods identify the most pertinent variables, so that if one is to invest more resources in collecting additional data, one can concentrate on measuring and gathering information about the most relevant variables, hence reducing the financial burden of such operations.

As future work, one might consider bringing in more geological information and defining more informative inputs. There are geological signatures that are not recorded in the logging but can be extracted from other studies, such as geophysical surveys. Developing a more robust definition of fracability can also be pursued. As another future avenue, one can use the elastic properties, which are more feasible and versatile to obtain, instead of the XRD data.

Acknowledgements

PT thanks the financial support from the University of Wyoming for this research. FJ would like to thank the support from the Nano Geosciences lab and the Mudrock Systems Research Laboratory (MSRL) consortium at the Bureau of Economic Geology, The University of Texas at Austin. MSRL member companies are Anadarko, BP, Cenovus, Centrica, Chesapeake, Cima, Cimarex, Chevron, Concho, ConocoPhillips, Cypress, Devon, Encana, Eni, EOG, EXCO, ExxonMobil, Hess, Husky, Kerogen, Marathon, Murphy, Newfield, Penn Virginia, Penn West, Pioneer, Samson, Shell, Statoil, Talisman, Texas American Resources, The Unconventionals, U.S. Intercrop, Valence, and YPF. XRD data were provided by Dr. H. Rowe and wireline well-log data by R. Eastwood.

References

Abonyi, J. (2003). Fuzzy model identification. In Fuzzy model identification for control (pp. 87–164). Boston, MA: Birkhäuser Boston. https://doi.org/10.1007/978-1-4612-0027-7_4
Bishop, C. M. (1995). Neural networks for pattern recognition. Clarendon Press.
Cernuda, C., Lughofer, E., Hintenaus, P., Märzinger, W., Reischer, T., Pawliczek, M., et al. (2013). Hybrid adaptive calibration methods and ensemble strategy for prediction of cloud point in melamine resin production. Chemometrics and Intelligent Laboratory Systems, 126, 60–75. https://doi.org/10.1016/j.chemolab.2013.05.001



Cernuda, C., Lughofer, E., Märzinger, W., & Kasberger, J. (2011). NIR-based quantification of process parameters in polyetheracrylat (PEA) production using flexible non-linear fuzzy systems. Chemometrics and Intelligent Laboratory Systems, 109(1), 22–33. https://doi.org/10.1016/j.chemolab.2011.07.004
Cordón, O., Gomide, F., Herrera, F., Hoffmann, F., & Magdalena, L. (2004). Ten years of genetic fuzzy systems: Current framework and new trends. Fuzzy Sets and Systems, 141(1), 5–31. https://doi.org/10.1016/S0165-0114(03)00111-8
Dashtian, H., Jafari, G. R., Koohi Lai, Z., Masihi, M., & Sahimi, M. (2011). Analysis of cross correlations between well logs of hydrocarbon reservoirs. Transport in Porous Media, 90(2), 445–464. https://doi.org/10.1007/s11242-011-9794-x
Deb, K., & Agrawal, S. (1999). Understanding interactions among genetic algorithm parameters. Foundations of Genetic Algorithms, 5(5), 265–286.
Edwards, A. W. F. (1969). Statistical methods in scientific inference. Nature, 222(5200), 1233–1237. https://doi.org/10.1038/2221233a0
Efroymson, M. A. (1960). Multiple regression analysis. In Mathematical methods for digital computers. New York: John Wiley and Sons.
Goldberg, D. E., & Holland, J. H. (1988). Genetic algorithms and machine learning. Machine Learning, 3(2/3), 95–99. https://doi.org/10.1023/A:1022602019183
Gomide, F. (2016). Fundamentals of fuzzy set theory. In Handbook on computational intelligence (pp. 5–42). World Scientific. https://doi.org/10.1142/9789814675017_0001
Haitovsky, Y. (1968). Missing data in regression analysis. Journal of the Royal Statistical Society, Series B (Methodological), 30(1), 67–82. Retrieved from http://www.jstor.org/stable/2984459
Jang, J.-S. R. (1992). Self-learning fuzzy controllers based on temporal backpropagation. IEEE Transactions on Neural Networks, 3(5), 714–723. https://doi.org/10.1109/72.159060
Jang, J.-S. R. (1993). ANFIS: Adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man, and Cybernetics, 23(3), 665–685. https://doi.org/10.1109/21.256541
Jarvie, D. M., Hill, R. J., Ruble, T. E., & Pollastro, R. M. (2007). Unconventional shale-gas systems: The Mississippian Barnett Shale of north-central Texas as one model for thermogenic shale-gas assessment. AAPG Bulletin, 91(4), 475–499. https://doi.org/10.1306/12190606068

Karimpouli, S., & Malehmir, A. (2015). Neuro-Bayesian facies inversion of prestack seismic data from a carbonate reservoir in Iran. Journal of Petroleum Science and Engineering, 131, 11–17. https://doi.org/10.1016/j.petrol.2015.04.024
Karimpouli, S., & Tahmasebi, P. (2016). Conditional reconstruction: An alternative strategy in digital rock physics. Geophysics, 81(4), D465–D477. https://doi.org/10.1190/geo2015
Li, H., Qin, S. J., Tsotsis, T. T., & Sahimi, M. (2012). Computer simulation of gas generation and transport in landfills: VI – Dynamic updating of the model using the ensemble Kalman filter. Chemical Engineering Science, 74, 69–78. https://doi.org/10.1016/j.ces.2012.01.054
Li, H., Sanchez, R., Joe Qin, S., Kavak, H. I., Webster, I. A., Tsotsis, T. T., et al. (2011). Computer simulation of gas generation and transport in landfills. V: Use of artificial neural network and the genetic algorithm for short- and long-term forecasting and planning. Chemical Engineering Science, 66(12), 2646–2659. https://doi.org/10.1016/j.ces.2011.03.013
Li, H., Tsotsis, T. T., Sahimi, M., & Qin, S. J. (2014). Ensembles-based and GA-based optimization for landfill gas production. AIChE Journal, 60(6), 2063–2071. https://doi.org/10.1002/aic.14396
Lughofer, E. (2011). Evolving fuzzy systems – Methodologies, advanced concepts and applications: Vol. 266. Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-642-18087-3
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. The Regents of the University of California.
Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to linear regression analysis. Wiley.
Pedrycz, W., & Gomide, F. (2007). Fuzzy systems engineering: Toward human-centric computing. John Wiley.
Rickman, R., Mullen, M. J., Petre, J. E., Grieser, W. V., & Kundert, D. (2008). A practical use of shale petrophysics for stimulation design optimization: All shale plays are not clones of the Barnett Shale. In SPE annual technical conference and exhibition. Society of Petroleum Engineers. https://doi.org/10.2118/115258-MS
Rogers, L. L., & Dowla, F. U. (1994). Optimization of groundwater remediation using artificial neural networks with parallel solute transport modeling. Water Resources Research, 30(2), 457–481. https://doi.org/10.1029/93WR01494

Seber, G. A. F., & Lee, A. J. (2003). Linear regression analysis. Wiley-Interscience.
Serdio, F., Lughofer, E., Zavoianu, A.-C., Pichler, K., Pichler, M., Buchegger, T., et al. (2017). Improved fault detection employing hybrid memetic fuzzy modeling and adaptive filters. Applied Soft Computing, 51, 60–82. https://doi.org/10.1016/j.asoc.2016.11.038

Sullivan Glaser, K. K., Johnson, G. M., Toelle, B., Kleinberg, R. L., Miller, P., Pennington, W. D., Brown, A. L., et al. (2013). Seeking the sweet spot: Reservoir and completion quality in organic shales. Oilfield Review, 25, 16–29.
Tahmasebi, P. (2017). Structural adjustment for accurate conditioning in large-scale subsurface systems. Advances in Water Resources, 101. https://doi.org/10.1016/j.advwatres.2017.01.009

Tahmasebi, P., & Hezarkhani, A. (2011). Application of a modular feedforward neural network for grade estimation. Natural Resources Research, 20(1), 25–32. https://doi.org/10.1007/s11053-011-9135-3
Tahmasebi, P., & Sahimi, M. (2015). Reconstruction of nonstationary disordered materials and media: Watershed transform and cross-correlation function. Physical Review E, 91(3), 32401. https://doi.org/10.1103/PhysRevE.91.03240
Tahmasebi, P., Javadpour, F., & Sahimi, M. (2015a). Multiscale and multiresolution modeling of shales and their flow and morphological properties. Scientific Reports, 5, 16373. https://doi.org/10.1038/srep16373
Tahmasebi, P., Javadpour, F., & Sahimi, M. (2015b). Three-dimensional stochastic characterization of shale SEM images. Transport in Porous Media, 110(3), 521–531. https://doi.org/10.1007/s11242-015-0570-1
Tahmasebi, P., Javadpour, F., & Sahimi, M. (2016a). Stochastic shale permeability matching: Three-dimensional characterization and modeling. International Journal of Coal Geology, 165, 231–242. https://doi.org/10.1016/j.coal.2016.08.024
Tahmasebi, P., Javadpour, F., Sahimi, M., & Piri, M. (2016b). Multiscale study for stochastic characterization of shale samples. Advances in Water Resources, 89, 91–103. https://doi.org/10.1016/j.advwatres.2016.01.008
Tahmasebi, P., & Sahimi, M. (2016a). Enhancing multiple-point geostatistical modeling: 1. Graph theory and pattern adjustment. Water Resources Research, 52(3), 2074–2098. https://doi.org/10.1002/2015WR017806
Tahmasebi, P., & Sahimi, M. (2016b). Enhancing multiple-point geostatistical modeling: 2. Iterative simulation and multiple distance function. Water Resources Research, 52(3), 2099–2122. https://doi.org/10.1002/2015WR017807
Tahmasebi, P., Sahimi, M., & Andrade, J. E. (2017). Image-based modeling of granular porous media. Geophysical Research Letters. https://doi.org/10.1002/2017GL073938
Tahmasebi, P., Sahimi, M., Kohanpur, A. H., & Valocchi, A. (2016c). Pore-scale simulation of flow of CO2 and brine in reconstructed and actual 3D rock cores. Journal of Petroleum Science and Engineering. https://doi.org/10.1016/j.petrol.2016.12.031
Tanaka, K., & Sugeno, M. (1992). Stability analysis and design of fuzzy control systems. Fuzzy Sets and Systems, 45(2), 135–156. https://doi.org/10.1016/0165-0114(92)90113-I
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. https://doi.org/10.1016/S0019-9958(65)90241-X
Zoback, M. D. (2007). Reservoir geomechanics. Cambridge University Press.