Building Bayesian networks from basin modeling scenarios for improved geological decision making
Gabriele Martinelli∗, Jo Eidsvik
Dept. of Mathematical Sciences, Alfred Getz’ vei 1,
Norwegian University of Science and Technology, Trondheim, Norway
Richard Sinding-Larsen, Sara Rekstad
Dept. of Geology and Mineral Resources Engineering, Sem Sælands veg 1,
Norwegian University of Science and Technology, Trondheim, Norway
Tapan Mukerji
Department of Energy Resources Engineering, School of Earth Sciences,
Stanford University, USA
Abstract
Basin models are used to gain insights about a petroleum system and to simulate geological processes required to form oil and gas accumulations. The focus of such simulations is usually on charge and timing-related issues, although uncertainty analysis about a wider range of parameters is becoming more common. Bayesian Networks are useful for decision making in geological prospect analysis and exploration. In this paper we propose a framework for merging these two methodologies: by doing so, we explicitly account for dependencies between the geological elements. The probabilistic description of the Bayesian Network is trained by using multiple scenarios of Basin and Petroleum Systems Modeling. A range of different input parameters is used for total organic carbon content, heat flow, porosity, and faulting, to span a full categorical design for the Basin and Petroleum Systems Modeling scenarios. Given the consistent Bayesian Network for trap, reservoir and source attributes, we demonstrate important decision making applications such as evidence propagation and the value of information.
Keywords: Bayesian Networks, Scenario Evaluation, Basin Modeling, Uncertainty Quantification, Petroleum Exploration
∗Corresponding author. Email address: [email protected], [email protected] (Gabriele Martinelli)
Preprint submitted to Petroleum Geoscience
1. Introduction
The correct integration of geological and geophysical information within a decision framework for oil and gas exploration is a challenge that will become more important as the cost and difficulty of exploring new targets increase. Currently it is common practice among scientists to quantify information about risk through detailed exploration analysis, and then forward these results to management. On the geophysical side, we can interpret 2D and 3D seismic surveys and magnetic, gravimetric and electromagnetic data. On the geological side, we can evaluate the chance of having adequate trapping, reservoir facies, seal capacity and charge. The latter is aided by basin modeling studies. Other aspects concerning the economic evaluation of a prospect (costs and investments connected to development in case of success) must also be taken into account by the decision makers. On its way to the decision makers, the information is processed and quantified through expert opinions and commercial software (such as GeoX®) for risk assessment, multiple-scenario evaluation and estimation of the amount and value of the hydrocarbon (HC) resources under study. In this work we propose a supplement to the existing framework by directly integrating basin modeling scenarios and decision strategies.
We can identify the problem by analyzing how we currently move from the Earth model to the decision space: the geological and geophysical know-how is first translated into basin and petroleum system modeling (BPSM). Outputs from multiple runs of basin modeling under different geologic scenarios are then used to establish a Bayesian network (BN) that models play element dependencies. The BN is used to test decisions. In this work we have used a common commercial software package for BPSM, namely PetroMod®. PetroMod is based on a finite-element simulator (Hantschel and Kauerauf, 2009) that numerically solves the coupled system of equations for sediment compaction, heat flow, petroleum generation and migration, accounting for both chemical and physical processes. We have not used the PetroRisk® extension of the software, since we needed to control explicitly all the input scenarios.
In this framework a sensitivity analysis is then carried out, and a database with multiple runs (corresponding to different geologic scenarios) is built. The database is the starting point for the value assessment part that provides the basis for efficient decisions.
The idea of modeling play element prospect dependencies with a BN was proposed in VanWees et al. (2008) and Martinelli et al. (2011). Martinelli et al. (2011) constructed a BN model for assessing the likelihood of source presence in a part of the North Sea. The network describes the prior distribution of the source system in terms of kitchen, prospects and segments. We will use the word segment to identify a volume possibly filled with HC resulting from a source-reservoir-trap system, and the word prospect to describe a collection of segments that may share some common features.
When the BN is established, one can use standard techniques to propagate the evidence at certain nodes to all other nodes. This allows us to study the value of information (VOI) at one or more segments (Bhattacharjya et al., 2010). Similar ideas were developed in VanWees et al. (2008). We will use a Matlab package developed by Murphy (1999) to learn, build and perform inference on the network.
One of the critical points of Martinelli et al. (2011) was the substantial reliance on expert opinion when designing the BN. In the present paper we propose an alternative idea for building the BN, integrating expert opinions with quantitative geological data. The main idea is to train the probabilistic structure of the BN from the multiple basin modeling outputs. This is done by statistical parameter estimation, together with discretization and clustering guided by geological intuition. This BN model couples the geological processes and their responses with risk assessment. By assigning expected revenues to segments, the production strategy and other required economic variables can now easily be communicated. The BN model provides explicit probability statements, for single segments and for prospects.
Using statistical design of experiments (DOE) for oil and gas forecasting problems is not new: Damsleth et al. (1992) and Dejean and Blanc (1999) propose a DOE-based approach for reservoir modeling simulations; Corre et al. (2000) extend DOE and Monte Carlo methods in order to study uncertainties in geophysics, geology and reservoir engineering. Other relevant works include Wendebourg (2003) and Wendebourg and Trabelsi (2005); the former uses DOE for determining and later calibrating the most influential variables and thus reducing the uncertainty of the outcome, while in the latter sensitivity analysis is performed on critical parameters that determine petroleum generation and migration.
Dependency among wildcat wells has been discussed in Kaufman and Lee (1992), where a binary logit model for the number of successes is proposed. Kaufman and Lee (1992) mention, though, that the forecasting capacity of the model was poor in the absence of a correct geological model of the basin.
The paper is organized as follows: In Section 2 we introduce BPSM and the synthetic case study; Section 3 discusses the DOE simulation setup with interpretations. In Section 4 we show the procedure for developing the BN model. Finally, in Section 5, we apply the model for decision making and in Section 6 we provide some guidelines and discussion topics for the extension of the methodology to a real case study; Section 7 presents the conclusions.
2. A Case study for basin and petroleum systems modeling
2.1. Basin and Petroleum Systems Modeling
BPSM is a useful component in exploration risk assessment and is applicable with increasing reliability during all stages of exploration, from frontier basins with no well control to more mature areas. The idea is to simulate the geological processes and chemical reactions that have occurred in the basin through geological time, in order to identify the critical aspects of HC generation, migration and accumulation. Important geological risk factors in oil and gas exploration are the trapping (consisting of trap geometry, reservoir and seal), the oil and gas charge (migration and source factors), and the timing relationship between the charge and the formation of potential traps. These risk factors apply equally to basin, play and prospect scale assessments. BPSM software combines seismic, well, geological and petrophysical information to model the evolution of a sedimentary basin. As output, PetroMod® will predict if, and how, a reservoir has been charged with HC, including the source and timing of HC generation, migration routes and amount of HC both at subsurface and at surface conditions.
In this paper we will use the 3D version of the software, which allows for full visualization of the migration paths that lead to the accumulation of HC in the basin.
2.2. The Bezurk case study
As a training model we have decided to use a synthetic basin developed in the Petroleum Geology class at NTNU, Trondheim, Norway (Tviberg, 2011). The controlled basin environment is called the Bezurk Basin (Figure 1), and it includes three potential kinds of prospects, namely an anticlinal type, a fault type and a shoestring type. The latter is located within impermeable shale and consequently the chances for HC to migrate into this reservoir are low; therefore we will not use it in our discussion. The Bezurk Basin mimics the behavior of a possible real basin with a main anticlinal trap in the NE sector of the basin, and a series of faults in the NS direction. All lithologies are based on those in the PetroMod library and default values are used for the sand/shale compaction coefficient, porosity-permeability trends and type of consolidation/sealing relations. A description can be found in the supplementary materials. A major uplift followed by strong erosion has occurred in the western part of the basin, and this activity has caused the major faulting shown by Faults 1 and 2.
The history of the basin has been characterized by the deposition of organic-rich shale and good-porosity sandstone layers. In particular, we recognize two main possible HC producing layers, the deepest being the coal bed layer denoted Eek, and the shallowest being a shale rich in organic content denoted Mlf.
Figure 1: Bezurk basin; we see the 100 km² area and the different thicknesses of the layers; in the west part of the basin we identify the two faults that characterize the system. Labels in the figure: Fault 1, Fault 2, Eek (source), Ou (res.), Mlf (source), Mmd (res.).
Another assumption is that the Bezurk Basin is an onshore basin, with the sediment surface at zero meters above sea level. The depositional history started 55 Ma ago and has continued until today, with a number of erosional episodes. Figure 2 shows two cross-sections of the basin. Marked in black (Eek) and pink (Mlf) are the two main source rocks, and in yellow (Mmd) and red (Ou) the two main reservoirs. The third reservoir layer, a shoestring reservoir, is best visible in the second cross-section and lies between the two Mua seal layers just in the synclinal part of the basin. The main anticlinal reservoirs are clearly visible in the first cross-section, in the eastern part of the basin.
We have identified two main plays, corresponding to the two main potential reservoir rocks:
• The reservoir of the Mmd play in the Bezurk Basin is made up of sandstone, deposited in a regressive shallow marine environment during the time interval 20 Ma to 15 Ma. The sandstone reservoir has porosity ranging from 12% to 30%, which is considered to be a good porosity. The reservoir covers the whole area on the east side of the faults, and has a thickness ranging from about 300 m to 900 m.
• The reservoir of the Ou play was deposited from 34 Ma to 23 Ma in a transgressive shallow marine environment, with the overlying Mlf shale acting as a seal. The underlying Eek coal was deposited on a coastal plain in the same transgressive system as the reservoir and is expected to generate HC due to its depth of burial and the corresponding heat flow. Potential traps lie along the western faults and in the four-way closure of the northeastern anticline; they are similar to the traps of the younger Mmd play. The porosity of the Ou reservoir ranges from 7% to about 20%, which overall is lower than the porosity in the Mmd reservoir. Both reservoirs have the same kind of sandstone, but due to compaction the lower reservoir (Ou) has a lower porosity than the upper reservoir (Mmd).
Generated HC are expected to migrate to the overlying reservoir. The critical factor is the geological timing, both for the Ou play and for the Mmd play. In both plays the seal is deposited on top of the reservoir rock. The sealing efficiency may be inadequate to keep the HC inside the trap in scenarios with early generation and migration. This can cause large amounts of HC to be lost.
The basin was exposed to normal faulting at a young age (11 Ma). Two faults are observed in the profiles (western part). The faults are considered to be closed faults. HC accumulated in these trap segments constitute the fault prospect. The critical factors for this prospect are the uplift and erosion related to the faulting.
2.3. Expected Results
• HC generation: Both source rocks are buried deep enough to generate HC. Eek was deposited in a coastal plain environment in the time interval 34.8 Ma to 34 Ma, and is today located at a depth of about 3000 m to 5000 m. The lithology of the deepest source rock (Eek) is coal, which mainly generates gas, but can also be oil prone. The source rock which today is at a depth of 2000 m to about 4500 m is the Mlf black shale. Mlf was deposited in a deep marine environment in the time interval 20.60 Ma to 20.00 Ma, and is expected to generate both oil and gas. The generated HC are expected to migrate into the overlying Anticlinal prospect and the Fault prospect.
• Anticlinal prospect: The Anticlinal prospect is expected to contain HC in both the Mmd reservoir and the Ou reservoir. It has a four-way closure and no large risks are related to the trapping mechanism. The sealing rocks for both reservoirs are shales, which over time are expected to obtain adequate sealing capacity and thus prevent the HC from leaking during the last part of the migration process. The lower accumulation is expected to contain more gas than the upper accumulation, because the Eek source rock is more gas prone.
• Fault prospect: The Fault prospect carries more uncertainty regarding HC preservation. The trap mechanism is a normal fault, which has remained closed from 11 Ma to today. However, the effect of the uplift and the subsequent erosion in the western part of the basin needs to be modeled: will the timing of the fault and its sealing capacity be adequate to hold accumulations in place throughout the basin development? Other crucial questions that need to be evaluated relate to the change in the geometry of the basin with time and how it affects the flow paths, and the size of the drainage area of the anticline that influences the accumulation of the Fault prospect.
Figure 2: Cross sections. In the top one we can recognize the four-way anticlinal trap located in the eastern part of the basin; in both we can identify the faults in the western part of the basin. Layer labels, top section: Qal, Mua (seal), Mmd (res.), Mlf (source), Ou (res.), Eek (source), Base. Layer labels, bottom section: Qal, Muh, Mlq (res.), Mua (seal), Mua (seal), Mmd (res.), Mlf (source), Ou (res.), Eek (source), Base.
2.4. The master model
We have designed a master model by establishing a plausible petroleum system scenario and a series of boundary conditions. In particular we have chosen a constant heat flow (HF) of 60 mW/m², which corresponds to a moderately active basin (Allen and Allen, 2005). We have estimated the paleo water depth (PWD) according to the depositional environment through time (see Table 5 in the Supplementary materials). Finally, since Bezurk is conceived as an onshore basin there is no water present and the sediment-water-interface temperature (SWIT) is in reality the sediment-air-interface temperature.
An illustrative run (Figure 3) shows that the only prospect filled with HC today consists of the two trap segments of the anticlinal formations in the eastern part of the basin. We see traces of HC against the wall of the closed faults, but no significant accumulation. Figure 3 shows paths and drainage areas, illustrating how the migration at the present time converges on the anticlines, while a minor part of the HC migrates westwards toward Fault 2.
The HC that migrate into the fault prospect are mainly lost during the time step 1.77 Ma to 1.55 Ma (Figure 4), which is the critical time when the Muh seal is eroded. This particular uplift creates erosion, and losses can consequently be explained by the change in the geometry of the basin. The reservoir layer creates a small anticlinal trap structure against the fault where the HC accumulate. After the uplift the trap structure flattens out and the HC migrate out of the trap.
Figure 3: HC accumulations and flow paths for Mmd play (left) and Ou play (right) at present day. We see the oil (green) and gas (red) accumulations in the anticlinal segments, with traces of HC in the fault segment. The drainage area of the anticlinal traps is much larger than the drainage area for the fault traps.
Text recovered from the page image embedded in the figure (headed "Simulation of the Bezurk Basin"): the simulation shows two accumulations in the Anticline prospect (panels: a) Mmd reservoir, b) Ou reservoir). The information in Table 13 is extracted from PetroMod and shows that the Anticlinal prospect constitutes almost 100% of the total resources in the basin. As expected, the lower accumulation contains more gas.
Table 13. Accumulations in the Bezurk Basin
Accumulation                                  Oil (1e6 STB)  Assoc. gas (1e9 scf)  Non-assoc. gas (1e9 scf)  Condensate (1e6 STB)
Anticlinal top segment (Mmd), acc. no. 25     866.5          236.49                0.12                      0.01
Total Mmd reservoir                           866.5          236.49                0.16                      0.01
Anticlinal bottom segment (Ou), acc. no. 38   149.97         151.26                92.03                     2.18
Total Ou reservoir                            149.98         151.44                95.64                     2.19
Figure 4: Accumulation in the fault segments; screenshots of the process at 1.77 Ma and 1.55 Ma. Most of the HC leak out during and after the uplift of the basin. Panels: HC accumulation at a) 1.77 Ma and b) 1.55 Ma, with close-up views at the same two times in which the Mmd reservoir is shown as a transparent layer together with the oil (green) and gas (red).
3. Basin modeling scenarios
During the analysis of the basin we have identified four critical elements that constitute possible sources of uncertainty in our model. In real life there are large uncertainty ranges in most of the input parameters, and previous studies such as Lerche (1997) and Wendebourg (2003) have discussed the problem thoroughly. Usually the modeller constrains the model with the sparse measurements available, and leaves more uncertainty for those parameters that cannot be measured directly, such as the heat flow (HF), or that present a larger range, such as porosity or Total Organic Carbon (TOC) content. To accommodate the uncertainty in our synthetic basin, several scenarios for TOC content, HF and porosity are considered. We have also noticed that there is a zone in the western part characterized by prominent faulting activity; for this reason we can hypothesize a possible structural uncertainty, by adding or removing one of these fault elements from our model. The erosion magnitude is another common uncertainty factor; we have chosen not to include it, mainly because of the way erosion is modelled in the PetroMod maps, which makes it difficult to modify in a consistent way.
We next run multiple scenarios of BPSM, changing the key factors in a controlled design of experiments (DOE).
3.1. A full factorial design
In order to study the interactions among the different factors, we have designed a full factorial study (Fisher, 1971), where each factor is represented by two to three levels. We have chosen three levels for the HF: cool (50 mW/m²), normal (60 mW/m²) or hot (70 mW/m²); it is expected that a cool basin will mainly stay in the oil window, consequently generating mostly oil, while a warm basin will reach the gas window at an earlier stage, and therefore generate more gas. We have further chosen two levels for the porosity of the reservoir rock, high or low (see profiles in Figure 5). We use two levels for the TOC content of both source rocks: 8% (high) or 4% (low) for the Mlf black shale and 20% (high) or 10% (low) for the Eek coal. Finally, we select two levels, open or closed, for the presence of a new fault (Fault 3) located east of Fault 2. Table 1 summarizes the results from the scenarios. From the master model (see Section 2.4), we observe that the HC which accumulated in the fault trap were lost during the time period 1.77 Ma to 1.55 Ma. The reason for adding Fault 3 is to see if it could trap HC and potentially create a prospect.
Figure 5: Porosity profiles; on the left the high case, with initial porosity around 40%; on the right the low case, with initial porosity around 30% and a rapid decrease.
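The full factorial design described above can be sketched in a few lines of Python. This is only an illustration of how the 2 × 3 × 2 × 2 = 24 combinations arise: the dictionary keys, the helper name `full_factorial` and the ordering of the factors are assumptions made for this sketch (the ordering is chosen so that the numbering follows Table 1), and nothing here is part of PetroMod or of the workflow actually used for the runs.

```python
from itertools import product

# Factor levels as described in Section 3.1. The key names are illustrative.
# The last factor varies fastest, which reproduces the model order of Table 1.
LEVELS = {
    "toc": ("high", "low"),
    "fault3": ("closed", "open"),
    "heat_flow": ("cool", "normal", "hot"),
    "porosity": ("high", "low"),
}


def full_factorial(levels):
    """Return one dict per scenario, covering every combination of levels."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


scenarios = full_factorial(LEVELS)
print(len(scenarios), "scenarios")
for number, scenario in enumerate(scenarios[:3], start=1):
    print(number, scenario)
```

Each dictionary would then correspond to one BPSM run whose inputs are set by hand, which is in line with the stated need to control all scenarios explicitly.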
3.2. Simulation outcomes
In each of the 24 different BPSM runs, we measure the size and type of HC accumulations. We further record which source rock has generated them and we observe the migration path. We gain insight into the HC production, the expulsion from the source rock and the accumulation in the reservoirs. As a result, the amount of HC that has leaked is available, and we can try to explain this leakage phenomenon through the observation of the complete evolution of the basin.
In this section we discuss the main effects of the different scenarios. A more complete analysis is provided by analysis-of-variance printouts and diagrams in the supplementary material.
The supplementary material also gives a table containing the main data concerning generation, expulsion, accumulation and leakage for each of the 24 scenarios. Data are in MMBOE (million barrels of oil equivalent) throughout the whole analysis. Some results concerning important data are presented in Figure 6. Here we depict six Pareto charts, showing which factors are more relevant in terms of variance decomposition, following the principles of a classical ANOVA with multiple factors, see Cochran and Cox (1992). The variance (or the equivalent total sum of squares) of the response is subdivided into five components, four related to the factors under consideration and one related to the residuals. The cumulative sums of the first four components are shown in the charts. In this way, these charts allow immediate identification of the most relevant factors, i.e. the factors that bear the highest contributions to the total variance. Similar conclusions can be drawn from the boxplots presented in the supplementary material.
Table 1: Experimental table, full factorial design with 4 factors (Porosity, Heat Flow, Fault 3 and TOC) and 2 × 3 × 2 × 2 = 24 total combinations. Accumulation results for the main anticlinal traps in MMBOE are reported. TE and BE refer respectively to the upper reservoir (Top East) and to the lower reservoir (Bottom East).
Model  Porosity  HF      Fault 3  TOC   Acc TE oil  Acc TE gas  Acc BE oil  Acc BE gas
1      high      cool    closed   high  580         10.66       340         32.35
2      low       cool    closed   high  247         3.29        90          10.41
3      high      normal  closed   high  776         35.65       172         43.79
4      low       normal  closed   high  220         10.33       35          6.28
5      high      hot     closed   high  736         31.16       2           29.63
6      low       hot     closed   high  212         9.17        1           9.02
7      high      cool    open     high  537         9.43        343         31.59
8      low       cool    open     high  247         3.15        91          10.23
9      high      normal  open     high  773         35.49       167         44.39
10     low       normal  open     high  218         10.22       35          12.95
11     high      hot     open     high  731         32.67       5           33.42
12     low       hot     open     high  207         9.90        7           10.58
13     high      cool    closed   low   265         4.72        213         18.61
14     low       cool    closed   low   218         2.68        95          7.47
15     high      normal  closed   low   659         30.37       106         40.36
16     low       normal  closed   low   218         10.03       38          12.75
17     high      hot     closed   low   528         22.92       1           29.76
18     low       hot     closed   low   206         8.71        5           9.43
19     high      cool    open     low   265         4.72        213         18.62
20     low       cool    open     low   218         2.68        95          7.47
21     high      normal  open     low   661         32.08       84          38.83
22     low       normal  open     low   218         10.03       37          12.75
23     high      hot     open     low   527         24.12       3           33.25
24     low       hot     open     low   206         9.45        8           10.55
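To make the variance decomposition concrete, the following Python sketch computes the main-effect sum of squares of each factor for a single response, the oil accumulation in the upper anticlinal segment (Acc TE oil in Table 1). It is a simplified illustration under stated assumptions, not the analysis behind Figure 6: the function names are hypothetical, only main effects are computed (interactions and residuals are lumped together), and the response is one column of Table 1 rather than the aggregated quantities shown in the Pareto charts, so the ranking need not coincide with any panel of the figure.

```python
from collections import defaultdict

FACTORS = ("porosity", "heat_flow", "fault3", "toc")

# Acc TE oil (MMBOE), transcribed from Table 1 in model order 1..24.
TE_OIL = [580, 247, 776, 220, 736, 212, 537, 247, 773, 218, 731, 207,
          265, 218, 659, 218, 528, 206, 265, 218, 661, 218, 527, 206]


def design():
    """Rebuild the factor settings of models 1..24 in the order of Table 1."""
    rows = []
    for toc in ("high", "low"):
        for fault3 in ("closed", "open"):
            for heat_flow in ("cool", "normal", "hot"):
                for porosity in ("high", "low"):
                    rows.append({"porosity": porosity, "heat_flow": heat_flow,
                                 "fault3": fault3, "toc": toc})
    return rows


def main_effect_ss(rows, response):
    """Main-effect sum of squares per factor for a balanced design."""
    grand_mean = sum(response) / len(response)
    ss = {}
    for factor in FACTORS:
        groups = defaultdict(list)
        for row, y in zip(rows, response):
            groups[row[factor]].append(y)
        # n_level * (level mean - grand mean)^2, summed over the levels
        ss[factor] = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                         for ys in groups.values())
    return ss


grand_mean = sum(TE_OIL) / len(TE_OIL)
total_ss = sum((y - grand_mean) ** 2 for y in TE_OIL)
ss = main_effect_ss(design(), TE_OIL)
ranked = sorted(ss, key=ss.get, reverse=True)
for factor in ranked:
    print(f"{factor:10s} SS = {ss[factor]:12.1f}  share = {ss[factor] / total_ss:6.1%}")
```

Sorting the factors by their sum of squares and accumulating the shares gives the kind of ordering that a Pareto chart displays.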
The generation phase is divided into oil and gas generation and further subdivided by the two source rocks that are responsible for the HC generation, the Eek source rock and the Mlf source rock. The main factor driving the HC generation is the level of maturation of the source rock itself, which ultimately depends on the burial depth, the HF and the TOC. The analysis shows that higher HF allows an earlier and faster maturation and therefore a more abundant generation of gas in both source rocks. For the oil generation there are no significant differences in the impact of HF for the Eek source rock. This means that the oil generation has reached its maximum potential already when HF is at the medium level, and this is consistent with our hypothesis. It turns out that when HF is high, most of the oil generated by the deeper source rock leaks out before being trapped. Therefore, the overall effect is a smaller oil accumulation in the Ou reservoir when HF is high. HF and TOC are also the main parameters controlling the quantity of expelled HC, i.e. the amount of HC that leave the source rock after generation.
Figure 6: Pareto charts concerning six quantities extracted from PetroMod. TOC and heat flow are relevant factors when measuring HC generation and expulsion. Porosity is relevant for accumulation. Fault 3 appears to be a relevant factor only when measuring the outflow from the lateral side of the traps. Each panel plots the sum of squares per factor together with the cumulative percentage; the factors appear in the following left-to-right order:
Generation Tot: Heat Flow, TOC, Porosity, Fault 3
Generation Eek: TOC, Heat Flow, Porosity, Fault 3
Expulsion Tot: Heat Flow, TOC, Porosity, Fault 3
Accumulation Mmd: Porosity, TOC, Heat Flow, Fault 3
Accumulation Ou: Heat Flow, Porosity, TOC, Fault 3
Outflow Side: Heat Flow, TOC, Fault 3, Porosity
Regarding the size of accumulations, we can see (again referring to Figure 6 and to the supplementary material) that the main factor is the porosity, followed by the HF, especially for the Ou accumulations. It is quite natural that the porosity is relevant, since a sandstone with good porosity can trap much more HC than a poor reservoir. It is interesting to notice how the effects of HF and TOC vanish, showing that the surplus of HC generated has almost totally been lost before the seal rock had reached its sealing capacity. In fact, we notice that the oil accumulations in the Ou reservoir decrease appreciably with increased levels of HF, and are quite stable with respect to TOC.
Finally, Fault 3 in the western part has a strong effect only when it is leaking. When the fault is not present, there is no leaking through the fault's wall. In contrast, when the fault is present, there is some leaking along the fault, especially when there is an early maturation (high HF). The fault has its clearest effect when measuring the outflow from the side. The outflow from the top and the total outflow, on the other hand, are governed by the HF and TOC, since the scenarios with early maturation leak most of the HC before the seal is adequately sealing.
We have run a similar analysis on the second major kind of data that we get from a BPSM analysis, i.e. the oil and gas accumulations. In the supplementary materials we attach Table 2 with the expected accumulation values at surface conditions for each of the 24 scenarios. Some of these results can also be found directly in Table 1.
We have distinguished four main accumulations, two in the eastern part of the basin, under anticlinal traps, and two in the western part of the basin, against Fault 3. We name the first two accumulations TE (Top East) and BE (Bottom East), and the latter two TW (Top West) and BW (Bottom West). TE and TW refer to the Mmd play, while BE and BW refer to the Ou play. The name Top refers to the upper reservoir, while the name Bottom refers to the lower reservoir. In the following sections these four accumulations will represent our four segments; TE and BE belong to the anticlinal prospect, while TW and BW belong to the fault prospect. The data confirm what was already observed in the previous analysis: oil accumulations in the Ou reservoir decrease with increasing HF, while gas accumulations increase from cool to normal HF, and then remain almost stationary. The main effect is the porosity of the reservoir rock, which accounts for the largest part of the total variability.
4. Building the Bayesian network
The experimental design setup gives better insight into the key factors responsible for the main geological processes in the basin. We will next use the multiple-scenario information to build a dependency structure for the segments. This takes the form of a BN that will be useful for decision making.
A BN is characterized by a set of nodes and edges. The nodes are random variables, which may be discrete or continuous. As an example, we will define nodes for trap presence (on/off), which are binary random variables. Edges define the conditional probability structure of the variables, connecting parents to children. For instance, we will define a parent node 'Trap Anticline' that can be on/off. This node has two children, 'TrapTopEast' and 'TrapBottomEast', which are also on/off, and which have conditional probability distributions depending on the outcome of the parent node. Let V be the set of all nodes, x_v the variable at node v ∈ V, and x the vector of all node variables. The joint probability model can then be defined by

$$p(x \mid \theta) = \prod_{v \in V} p(x_v \mid x_{\mathrm{pa}(v)}, \theta_v).$$

Here, pa(v) denotes the parents of node v. Further, θ denotes the set of model parameters required for the conditional probability tables (CPTs), where θ_v is the local parameter for node x_v. We show below how we can train or learn these parameter values from the multiple-scenario BPSM outputs.
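As a minimal sketch of the factorization above, the fragment below evaluates the joint probability of a three-node trap fragment (one parent, two children) as a product of local conditional terms. The prior 0.5 and the local success probability 0.9 are illustrative assumptions, and the function names are hypothetical; this is plain enumeration, not the software used in the study.

```python
from itertools import product

# Hypothetical three-node fragment: TrapFault -> TrapTW, TrapFault -> TrapBW.
# All numbers are illustrative assumptions.
P_PARENT = {True: 0.5, False: 0.5}   # p(TrapFault)
THETA_T = 0.9                         # assumed local success probability

def p_child(child_on, parent_on):
    """p(child | parent): a child can only be on if the parent is on."""
    p_on = THETA_T if parent_on else 0.0
    return p_on if child_on else 1.0 - p_on

def joint(parent_on, tw_on, bw_on):
    """Joint probability as the product of the local conditional terms."""
    return P_PARENT[parent_on] * p_child(tw_on, parent_on) * p_child(bw_on, parent_on)

def marginal_tw_on():
    """p(TrapTW = on), summing the joint over the other two nodes."""
    return sum(joint(pa, True, bw) for pa, bw in product([True, False], repeat=2))

def posterior_bw_given_tw_on():
    """p(TrapBW = on | TrapTW = on) by enumeration."""
    numerator = sum(joint(pa, True, True) for pa in (True, False))
    return numerator / marginal_tw_on()

print(marginal_tw_on(), posterior_bw_given_tw_on())
```

Observing one child switched on removes the uncertainty about the parent in this toy setting, so the probability of the sibling rises from the marginal value to the local success probability.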
We have chosen to use a BN structure similar to that of Martinelli et al. (2011), and we have used software developed by Murphy (1999). The CPTs are then parameterized by incorporating basic geological mechanisms and allowing for local failure in the propagation of HC elements, and we train parameters within this context. The formulation in Martinelli et al. (2011) appears to be a flexible way of modeling dependencies stemming from different geological elements (trap, source, reservoir). The separate assignment of these elements gives a generic model specification that is easy to interpret and communicate. Finally, a BN formulation allows explicit evaluation of the changes in the probabilities when single elements are observed, which leads to what-if studies or VOI calculations.
By using trap, source and reservoir elements in the BN, we thus avoid direct use of the factors involved in the DOE. We have seen that the HF, for example, interacts at both source and trap level, and that the porosity affects both the accumulation and the leaking phase. In order to capture the same effects one would need a quadratic regression, such as that introduced in Wendebourg and Trabelsi (2005). The BN obviates such complex constructions.
4.1. Learning the network
The complete set of 24 scenarios, and associated observations, is shown in Figure 7 (generation) and in Figure 8 (accumulation). We have used a standard k-means algorithm (Kaufman and Rousseeuw, 2005) with k = 2 (accumulation) or k = 3 (generation) to assess the thresholds for categorizing the data. The optimal choice of the number of levels has been dictated by the algorithm itself (two clusters appear clearly in the accumulation figures, three or more in the generation figures). Other statistical methods for discriminant analysis could be useful in this case, for example methods based on quantile regression or other hierarchical or centroid-based clustering methods. Note that the data in this way become a proxy for the knowledge about the geological elements that could potentially be observed at segment level.
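The thresholding step can be illustrated with a one-dimensional k-means sketch. The volumes below are made-up numbers, the quantile-based initialization is an assumption made for reproducibility, and the helper names are hypothetical; the study itself may have used a different implementation.

```python
def kmeans_1d(values, k, iters=100):
    """Plain Lloyd iterations on scalars; centroids start at evenly spaced quantiles."""
    data = sorted(values)
    n = len(data)
    centroids = [data[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        new = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    return sorted(centroids)

def thresholds(centroids):
    """Midpoints between adjacent centroids act as the class boundaries."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]

def discretize(value, cuts):
    """Class index (0 = lowest) of a value given sorted cut points."""
    return sum(value > c for c in cuts)

# Hypothetical accumulation values in MMBOE, for illustration only.
volumes = [210, 230, 250, 260, 240, 820, 860, 900, 880, 840]
cents = kmeans_1d(volumes, k=2)
cuts = thresholds(cents)
labels = [discretize(v, cuts) for v in volumes]
print(cents, cuts, labels)
```

The same helpers apply with k = 3, in which case two cut points separate the low, medium and high classes.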
[Figure 7 panels: Generation Total; Generation Mlf; Generation Eek; Generation Mlf Gas; Generation Mlf Oil; Generation Eek Gas; Generation Eek Oil.]
Figure 7: Data for learning the source network. The x-axis represents the 24 experiments. Top: values for the HC generation. Middle: values for Eek and Mlf generation. Bottom: values for oil and gas generation in each of the Eek and Mlf source rocks. Values in MMBOE.

We next consider the three main geological elements (trap, reservoir and source) separately. The BN model we have established is shown in Figure 9.
• Trap: We have developed a network with 6 nodes: two parents, TrapAnticlinal and TrapFault, and four children, TrapTE, TrapBE, TrapTW and TrapBW. The marginal probabilities for the top nodes are {0, 1} for the anticlinal trap and {0.5, 0.5} for the fault trap. These are set by direct learning from the DOE output. The local CPTs for the children nodes include the possibility of a local failure, quantified through the success probability θ_T (∼0.9). This allows strong and effective learning when the fault trap presence is confirmed or ruled out.
• Reservoir: The reservoir network is another small network, with 7 nodes: one common parent that represents the total accumulation, two mid-level parents that represent the Mmd and Ou reservoirs respectively, and four children representing oil and gas accumulations in the reservoirs (ResMmdGas, ResMmdOil, ResOuGas and ResOuOil). The nodes can be efficiently learned through a simple BN algorithm, given the data provided by the BPSM simulations.
[Figure 8 panels: Accumulation Total; Accumulation Mmd; Accumulation Ou; Accumulation Mmd Gas; Accumulation Mmd Oil; Accumulation Ou Gas; Accumulation Ou Oil.]
Figure 8: Data for learning the reservoir network. The x-axis represents the 24 experiments. Top: values for the accumulated HC. Middle: values for Mmd and Ou accumulation. Bottom: values for oil and gas accumulation in each of the Mmd and Ou reservoirs. Values in MMBOE.

The learning process follows the classical maximum likelihood procedure with complete data using the joint model (Cowell et al., 2007). Because of conditional independence, and the database output from the DOE, we can maximize each term separately. This means that we can restrict our attention to the term we are interested in and find the maximum likelihood estimate θ̂_v locally, based on our database:

$$\hat{\theta}_{v}^{x_v} = \frac{\#(x_v \wedge x_{\mathrm{pa}(v)})}{\#(x_{\mathrm{pa}(v)})},$$
i.e. the CPTs are estimated by the ratio of the corresponding counts in the database. For the top nodes, we just count the fractions directly. In our case we update on the basis of a limited number of experiments. Different and possibly more complex prior distributions, such as Dirichlet priors, can be assigned to the top nodes. When there are missing or incomplete data, more refined techniques are suggested, such as Expectation-Maximization (EM) or penalized EM algorithms; see e.g. Jordan (1998) and Cowell et al. (2007).
In our example all reservoir nodes are binary (two states, high and low), and we impose a threshold for the accumulations being larger than a certain value. These values are the separating planes indicated by the k-means algorithm (Figure 8).
With this procedure we can derive explicitly the correlation between the different nodes. Note that we have not imposed these correlations, but derived them from the data. They can nonetheless be tuned if the values contradict expert belief or other sources of data. For more details we refer to Appendix A. Here we just provide an example. For the reservoir subnetwork just described, we can check the correlation between some of the bottom nodes: the correlation between the nodes ResMmdGas and ResMmdOil is 0.84, while the correlation between ResOuOil and ResMmdOil is 0.32. The first result is natural since a good porosity of the reservoir rock increases its ability to hold both oil and gas. The second result tells us that the porosities in the Ou reservoir and in the Mmd reservoir are weakly but positively correlated. This is the effect of having the same sandstone in both reservoir rocks, and of the choice of changing the porosity in tandem in the two reservoir rocks (either poor-poor or good-good). A confirmation that our discretization has not altered the results too dramatically comes from comparing the correlations computed on the BN with the correlations of the data series. We note that Accumulation Mmd Gas and Accumulation Mmd Oil have a correlation coefficient of 0.90, while Accumulation Mmd Oil and Accumulation Ou Oil have a correlation coefficient of 0.25. These results are reproduced well by the distributions of the network. Finally, we report in Table 2 the conditional probabilities of the given variables (note that they are not in a parent-child relationship), derived from our BN; the marginals for the state high for the variables ResMmdGas and ResOuOil are 0.33 and 0.25 respectively. The Bayesian computations behind these and other results are discussed in detail in Appendix A.
ResMmdGas / ResMmdOil    low      high
low                      1        0
high                     0.2      0.8

ResOuOil / ResMmdOil     low      high
low                      0.867    0.133
high                     0.575    0.425

Table 2: Conditional Probability Tables for the variables ResMmdGas vs ResMmdOil (first table) and ResOuOil vs ResMmdOil (second table), within the Reservoir subnetwork.
• Source: The source network is more complicated, since we have to take into account two phenomena that interact through a complex correlation structure, namely gas generation and oil generation. As we have previously discussed, for the shallower top rock an increase in HF has the dual effect of increasing both oil and gas generation. For the deeper source rock, on the other hand, it affects just the gas generation. TOC affects both types of generation in similar ways. We learn the statistical effect of this behavior from the DOE outputs concerning the generation phase. We include a correlation structure with 3 levels: a top node for the total generation, intermediate nodes for the Mlf and Eek generation, and bottom nodes for the gas and oil generation in each of the source rocks.
In this case we have assigned three levels to all the nodes: high, medium and low generation. Most of the CPTs are learned directly from the data, using GenTot, GenMlf, GenEek, GenMlfOil, GenMlfGas, GenEekGas and GenEekOil along with the thresholds discretizing the data. All the nodes are discrete, with three possible states, i.e. k = 3 in Figure 7.
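The count-based estimate used for the discrete nodes can be sketched as follows. The 24 records are invented for illustration and are not the study's scenarios; `learn_cpt` and the `alpha` pseudo-count (a symmetric Dirichlet prior, one of the alternatives mentioned in the text) are hypothetical names introduced here.

```python
from collections import Counter

def learn_cpt(records, child, parents, states, alpha=0.0):
    """Estimate p(child | parents) by counting.

    alpha > 0 adds a symmetric Dirichlet pseudo-count to every cell;
    alpha = 0 gives the plain maximum likelihood ratio of counts.
    """
    joint = Counter()
    parent_totals = Counter()
    for rec in records:
        pa = tuple(rec[p] for p in parents)
        joint[(pa, rec[child])] += 1
        parent_totals[pa] += 1
    cpt = {}
    for pa, n_pa in parent_totals.items():
        denom = n_pa + alpha * len(states)
        cpt[pa] = {s: (joint[(pa, s)] + alpha) / denom for s in states}
    return cpt

# 24 hypothetical discretized records (illustration only).
records = (
    [{"ResMmd": "high", "ResMmdOil": "high"}] * 8
    + [{"ResMmd": "high", "ResMmdOil": "low"}] * 2
    + [{"ResMmd": "low", "ResMmdOil": "low"}] * 14
)
cpt_ml = learn_cpt(records, "ResMmdOil", ["ResMmd"], ["low", "high"])
cpt_smooth = learn_cpt(records, "ResMmdOil", ["ResMmd"], ["low", "high"], alpha=1.0)
print(cpt_ml)
print(cpt_smooth)
```

With few experiments, the maximum likelihood estimate assigns probability exactly 0 to combinations that never occur, whereas the pseudo-count keeps a small probability for them.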
Finally, we gather the information that we get from source, reservoir and trap in a single node, using our geological understanding of the process. We know that the source rock is essential for the presence of HC in the prospect, while a poor reservoir quality or a poor trap makes it less likely to have a commercial discovery in the prospect. We will next discuss other considerations for joining the last part of the network.
4.2. Gaussian nodes
So far in the BN building, we have not used the accumulation volumes extracted from the multiple-scenario BPSM. For learning the reservoir network we have used joint layer accumulation values and not prospect/segment values. We will now incorporate this information in the bottom nodes of the network. It seems reasonable to have discrete nodes in the top parts of the network, since attributes such as source, reservoir and trap are on/off or multi-level features. In the bottom part of the network it may be more realistic to have continuous nodes that mimic the actual behavior of the simulated scenarios. We therefore split each of the bottom nodes TE, BE, TW and BW into two nodes, one for gas volume and the other for oil volume, and state that they represent accumulation distributions whose mean and (possibly) variance depend on the states of their parents. The simultaneous use of discrete and continuous variables in BNs has been explored in Chang and Fung (1995) and Friedman and Goldszmidt (1996). A good inference algorithm is presented in Murphy (1999); the algorithm used is the Junction Tree Algorithm presented in Cowell et al. (2007). The related CPTs have to be assessed; for example, the conditional probability density of BEg (BE gas) is:

$$p_{BEg}(x \mid \mathrm{TraBE}, \mathrm{ResOuGas}, \mathrm{SouEekGas}) \sim N(\mu_{BEg}, \sigma^2_{BEg}),$$
where µ_BEg is the conditional mean value and σ_BEg is the conditional standard deviation of this Gaussian distribution. This means assessing 12 mean and variance parameters (2 states each for Trap and Reservoir and 3 for Source) for each of the 8 nodes. We use the accumulations from our experimental design as references for the mean values of our Gaussian distributions. The choice of Gaussian nodes comes from their wide use as continuous nodes in BNs, particularly because of their simple parameter estimation via ML. Furthermore, it is reasonable to believe that, with all the input parameters fixed, the accumulation distribution will be Gaussian. This does not contradict the classical lognormally shaped distribution of the total reserves, since the lognormal shape is due to the contribution of several uncertainty factors, while our nodes assume that for each state the set of parameters is fixed. Marginalizing over the discrete top nodes, we get a mixture of Gaussian distributions, which produces results consistent with the classical theory.
[Figure 9 nodes: ResTop, ResMmd, ResOu, ResMmdGas, ResMmdOil, ResOuGas, ResOuOil; TraFault, TraAnti, TraTE, TraBE, TraTW, TraBW; SouTop, SouMlf, SouEek, SouMlfGas, SouMlfOil, SouEekGas, SouEekOil; TEg, BEg, TWg, BWg, TEo, BEo, TWo, BWo.]
Figure 9: BN with trap (top), reservoir (left) and source (lower right) branches. Top nodes are all discrete, while bottom nodes are Gaussian, as explained in Section 4.2.
Further, we include the possibility of local failure of one element, with a reduced volume, according to Table 3. Let the parameters γ_R and γ_S be the local importance factors for the elements source and trap. The effect is that the failure of a single element can still produce minor accumulations. This occurs for instance if we believe that low/high states for factors like porosity do not totally preclude the accumulation of HC, but simply produce an appreciable reduction in the quantity (as seen in the simulations), due to unpredictable local variations as a consequence of porosity-reducing or porosity-enhancing effects. The choice of the parametrization for Table 3 has some immediate effects: we are implicitly assuming, for example, that if we find a volume equal to 0 in a segment and we know that a trap is in place, we have to blame the source for this lack of volume (rows 3 and 4 in the table), but the same situation can also occur when both trap and reservoir fail (rows 5 and 9), whatever the outcome of the source, as is natural to assume. When just one of these two elements fails, on the other hand, we still allow a marginal possibility of finding HC, and this is summarized by the parameters γ_R and γ_S. We have fixed γ_R and γ_S to be equal to 0.2. The effects are multiplicative, therefore a factor 0.2 reduces the expected accumulations to 20% of the expected accumulation with all the elements in place. The choice of a factor 0.2 is due to the need to cover approximately the accumulations from our experimental design. This parameter could possibly, in larger studies, be learned directly from data as well.
The numbers 1, 2 and 3 in Table 3 represent the different states of the nodes. For the trap node, state 1 corresponds to the failure state (trap not present), while state 2 corresponds to the success case (trap present). For the source node, state 1 corresponds to the failure case (charge not present), state 2 corresponds to the intermediate case (weak charge), and state 3 corresponds to the success case (strong charge).
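The multiplicative factors of Table 3 can be written compactly as a function of the three parent states. This is a sketch that transcribes the table rows; the function name `mean_factor` and the reference mean of 100 MMBOE are hypothetical, while the value 0.2 for both factors is the one reported in the text.

```python
GAMMA_R = 0.2  # value reported in the text
GAMMA_S = 0.2  # value reported in the text

def mean_factor(reservoir, trap, source, gamma_r=GAMMA_R, gamma_s=GAMMA_S):
    """Multiplicative factor on the Gaussian mean, following the rows of Table 3.

    States: reservoir and trap in {1, 2}, source in {1, 2, 3}; state 1 is failure.
    """
    if source == 1 or (reservoir == 1 and trap == 1):
        return 0.0                      # no charge, or both reservoir and trap fail
    if reservoir == 2 and trap == 2:
        return 1.0 if source == 3 else gamma_r + gamma_s
    scale = 2.0 if source == 3 else 1.0  # a strong charge doubles the reduced factor
    return scale * (gamma_r if reservoir == 2 else gamma_s)

# Hypothetical reference mean (MMBOE) with all elements in place.
REFERENCE_MEAN = 100.0
table = {
    (r, t, s): mean_factor(r, t, s) * REFERENCE_MEAN
    for s in (1, 2, 3) for t in (1, 2) for r in (1, 2)
}
for states, mean in table.items():
    print(states, mean)
```

With both factors equal to 0.2, a weak charge with reservoir and trap in place and a strong charge with a single element in place give the same reduced mean, 40% of the reference value.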
The second important point to discuss is how to assign the variances of the Gaussian distributions. We acknowledge that this is a crucial point, with large and important implications when analyzing the effect of the nodes' behaviour, as shown in the previous paragraph. We have decided to assign the variances so as to have a constant coefficient of variation σ/µ in all the possible scenarios described in Table 3; a constant coefficient of variation will be our standard hypothesis for the variability of HC volumes. We need to stress, though, that we do not have a definitive answer or suggestion on this point, since we will never be able to consider and describe all the scenarios that could possibly happen in the basin that we are considering.
The complete BN is shown in Figure 9. Since the accumulations cannot be negative, we concentrate at 0 the probability mass corresponding to negative values. Such negative values are a consequence of choosing Gaussian nodes with a set of fixed parameters for the accumulations. The resulting distributions will therefore be a mixture of truncated Gaussian distributions.
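A small sketch can make the effect of moving the negative mass to zero concrete. The mixture weights, means and the coefficient of variation 0.3 below are illustrative assumptions, not the fitted values of the network; components with zero mean are treated as point masses at zero.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def chance_of_success(components):
    """P(volume > 0) after all negative mass has been moved to zero.

    components: list of (weight, mu, sigma); sigma == 0 denotes a point mass at mu.
    """
    cos = 0.0
    for w, mu, sigma in components:
        if sigma == 0.0:
            cos += w if mu > 0 else 0.0
        else:
            cos += w * (1.0 - normal_cdf(0.0, mu, sigma))
    return cos

CV = 0.3  # assumed constant coefficient of variation sigma / mu
# Hypothetical mixture over parent configurations: (weight, mean in MMBOE).
weights_and_means = [(0.2, 0.0), (0.3, 40.0), (0.5, 100.0)]
components = [(w, mu, CV * mu) for w, mu in weights_and_means]

cos = chance_of_success(components)
mass_at_zero = 1.0 - cos
print(cos, mass_at_zero)
```

Under a constant coefficient of variation, every non-degenerate component loses the same fraction of its mass to the point at zero, so the chance of success is driven almost entirely by the weights of the failure configurations.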
The effects of this parametrization on the HC distributions can be seen in Figure 10. The distributions are truncated at 0, resulting in mixed discrete-continuous distributions. The probabilities of discovery or chance of success (COS) are 0.919 for Top Anticlinal and 0.797 for Bottom Anticlinal. Since our approach is completely data-driven in this case, we cannot compare it directly with a classical approach based on the multiplication of several risk factors, such as COS = P(trap) · P(reservoir) · P(source). Note, though, that BNs can be efficiently used also in that setting, as shown in Martinelli et al. (2011) and Martinelli et al. (2012).
The distributions are multimodal, and the different modes reflect the likelihood of being in each of the 24 configurations taken into account. The comparison between the empirical distribution (24 configurations, shown with blue stars) and the BN distribution can be found in Figure 11. Here the bivariate distributions for the states oil and gas for the main TE (left) and BE (right) accumulations are shown. First, there is a positive correlation between the oil and gas accumulations, due to the positive effect of TOC and HF in the maturation of the source rock. Second, the BN distribution covers the empirical distribution quite well, though there are discrepancies due to the thresholds introduced in Section 4.1 and the prior values (again learned from the data) imposed on the upper nodes of the network. Recall that the main goal of this work is not to reproduce exactly the BPSM behaviour, but to integrate the results in a probabilistic framework where it is easier to evaluate the effect of particular observables. Nonetheless, since we have considered quite extreme settings in our parameter space, we have good reasons to believe that our distributions would constitute an ideal contour line (envelope) of a much larger range of scenarios than our original 24, and therefore would capture most of the uncertainties that characterize the basin modelling behaviour of this case study.

Reservoir   Trap   Source   µ
1           1      1        0
2           1      1        0
1           2      1        0
2           2      1        0
1           1      2        0
2           1      2        γ_R
1           2      2        γ_S
2           2      2        γ_R + γ_S
1           1      3        0
2           1      3        2 γ_R
1           2      3        2 γ_S
2           2      3        1

Table 3: Conditional Probability Table for the oil and gas accumulations in the four prospects; the column µ represents the multiplicative factor assigned to the mean of the Gaussian conditional distribution. The numbers 1, 2 and 3 represent the different states of the nodes. For the trap node, state 1 corresponds to the failure state (trap not present), while state 2 corresponds to the success case (trap present). For the source node, state 1 corresponds to the failure case (charge not present), state 2 corresponds to the intermediate case (weak charge), and state 3 corresponds to the success case (strong charge).
[Figure 10 panels: gas volume distributions (curves: Distribution Gas Ou (BE), Distribution Gas Mmd (TE)) and oil volume distributions (curves: Distribution Oil Ou (BE), Distribution Oil Mmd (TE)); x-axis: Volume (MMBOE).]
Figure 10: Oil and gas volume distributions in prospects BE and TE. The multimodality of the distribution is due to failure of local geological elements that do not totally jeopardize the likelihood of finding HC.

For economic evaluation purposes it is interesting to analyze the inverse cumulative distributions of recoverable HC. To compute such distributions we need to take into account the recovery factor, which is estimated to be 0.45 for oil accumulations and 0.75 for gas accumulations. In Figure 12 we show the inverse cumulative distributions for segments TE and BE of the anticlinal prospect. The black line represents the contribution of the oil part, while the red line represents the added value brought by the gas accumulation. As we can see, the gas accumulation is more important for prospect BE, since this has a source rock (Eek) maturity level sufficient to produce commercial quantities of gas.
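As an illustration of how the recovery factors enter the inverse cumulative (exceedance) curves, the sketch below uses a made-up discrete distribution of in-place oil and gas volumes for one segment. Only the recovery factors 0.45 and 0.75 come from the text; the outcomes and function names are hypothetical.

```python
RF_OIL, RF_GAS = 0.45, 0.75  # recovery factors quoted in the text

def recoverable(oil_in_place, gas_in_place):
    """Recoverable volume (MMBOE) from in-place oil and gas volumes."""
    return RF_OIL * oil_in_place + RF_GAS * gas_in_place

def exceedance(outcomes, v):
    """P(recoverable volume > v) for a list of (probability, oil, gas) outcomes."""
    return sum(p for p, oil, gas in outcomes if recoverable(oil, gas) > v)

# Hypothetical discrete outcomes for one segment: (probability, oil, gas) in MMBOE.
outcomes = [(0.2, 0.0, 0.0), (0.3, 100.0, 10.0), (0.5, 400.0, 20.0)]
curve = [(v, exceedance(outcomes, v)) for v in (0, 50, 100, 200)]
for v, p in curve:
    print(v, p)
```

Setting the gas volumes to zero in the same outcomes gives the oil-only curve, so the gap between the two curves shows the added value of the gas accumulation.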
These distributions can be updated when more information becomes available. Let us focus our attention on the effect of added information on the gas accumulation relative to segment BE of the anticlinal prospect. Let us assume that we receive information that confirms the presence/absence of the reservoir or trapping condition in that prospect. The network is updated, and the conditional accumulation distributions can be seen in Figure 13. The effect of confirming an adequate reservoir layer is much stronger than that of a positive trap, since the prior for the anticlinal trapping to be adequate is already as large as 0.9, while the uncertainty about the quality of the reservoir layer (porosity) is much larger.
5. Applications for decision making
In this section we demonstrate a couple of different applications of the network.
[Figure 11 panels: TE (left) and BE (right); axes: Volume Gas (MMBOE) against Volume Oil (MMBOE).]
Figure 11: Oil and gas volumes joint bivariate distributions. Values are given for the accumulations TE (left) and BE (right).
5.1. What-if scenarios
We are interested in the behavior of the network when a HC column is observed in another prospect. In order to mimic a real situation, we consider drilling a well in the anticlinal prospect, and observe the impact of various evidence in TE, the top segment, on BE, the bottom segment (Figure 14). We then compare with similar observations made on the fault prospect BW (Figure 15).
In Figure 14 we see that even a rich observation in TE is not sufficient to resolve the bimodality of the marginal distribution, since the uncertainty about the quality of the reservoir remains (TE and BE belong to two different reservoirs). In Figure 15 we see that both an extremely poor and a rich observation in the fault prospect BW can substantially change the shape of the posterior oil BE distribution, as both segments belong to the same play. As we have already pointed out, a positive HC column observation in a high-risk prospect such as BW, which confirms for the play both the quality of the reservoir and the existence of a charge, has a higher impact on BE than an observation in TE, which belongs to a different play.
5.2. Value of Information
The Value of Information (VoI) and the Value of Perfect Information (VoPI) are indices that are becoming popular in the industry for evaluating the economic benefit of acquiring a certain set of data or drilling an exploration well; see Eidsvik et al. (2008) and Bhattacharjya et al. (2010) for details. In this study we will consider just the VoPI, meaning that we consider the possibility of getting perfect information (oil/gas vs dry) from an exploration well.
[Figure 12 panels: inverse cumulative distribution of recoverable resources for TE (top) and BE (bottom); x-axis: Volume, MMBOE; curves: Oil resources, Oil+gas resources.]
Figure 12: Inverse cumulative distribution of recoverable resources for segments TE (top) and BE (bottom) of the anticlinal prospect. Black: volumes related to the oil accumulations; red: volumes composed of the joint contribution of oil and gas accumulations.
Geologists and decision makers need to establish the probability of discovering recoverable HC larger than an economic threshold. We can use a similar criterion for assessing the VoPI, saying that a value for expected resources falling below the economic threshold is equivalent to having no discovered resources at all. Furthermore, when we compute the VoPI we always have to specify the cost of collecting that information. In this case, since our reference unit is the volume expressed in MMBOE, we will express the costs in the same units.
[Figure 13 curves: marginal BE oil; marginal BE oil | Res BE OK; marginal BE oil | Tra BE OK; x-axis: Volume (MMBOE).]
Figure 13: Distribution of the oil accumulation in segment BE before and after observing positive Reservoir and Trap evidence in the same segment. Positive evidence about the reservoir quality will not just increase the COS but also affect the expected volume, as we can see from the dashed line. Positive evidence about the trap will have a much smaller impact, since its prior value is already close to 1.

[Figure 14 curves: marginal BE oil; marginal BE oil | TE oil Acc = 0; marginal BE oil | TE oil Acc = 350; marginal BE oil | TE oil Acc = 700; x-axis: Volume (MMBOE).]
Figure 14: Distribution of the oil accumulation in segment BE before and after observing an oil column of different height in segment TE. The evidence collected in segment TE efficiently propagates through the network and has an impact on the expected volume distribution for prospect BE; the larger the discovery, the bigger the impact.

[Figure 15 curves: marginal BE oil; marginal BE oil | BW oil Acc = 0; marginal BE oil | BW oil Acc = 5; marginal BE oil | BW oil Acc = 10.]
Figure 15: Distribution of the oil accumulation in segment BE before and after observing an oil column of different height in segment BW. The evidence collected in segment BW efficiently propagates through the network and has an impact on the expected volume distribution for prospect BE; the larger the discovery, the bigger the impact. The impact is overall larger than that shown in Figure 14, since a relevant oil accumulation in segment BW is extremely unlikely.

Given these premises, the prior value for threshold t and cost C would be:

$$PV = \sum_{j \in \{BE, TE, BW, TW\}} \max\left\{ \int_{x>t} P(x_j = x) \cdot v_x \, dx - C, \; 0 \right\}.$$

The value of having free clairvoyance in segment i would then be:

$$VFC(i) = \int \sum_{j \in \{BE, TE, BW, TW\}} \max\left\{ \int_{x>t} P(x_j = x \mid x_i = e) \cdot v_x \, dx - C, \; 0 \right\} P(x_i = e) \, de,$$

and finally:

$$VoPI(i) = VFC(i) - PV.$$

In these expressions the quantity v_x is intended to be proportional to the recoverable resources. It is worth noticing that when i = j the integral collapses to a single point, and we observe what is called self-evidence, i.e. the effect of observing a prospect itself.
Since the distributions are numerically approximated, the integrals are computed through a discretization, and this makes the process computationally intensive.
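The three expressions above can be traced on a small discrete example. The sketch below replaces the integrals with sums over three invented joint scenarios for two segments, takes v_x equal to the recoverable volume itself, and includes self-evidence; all numbers and function names are assumptions for illustration.

```python
SEGMENTS = ("TE", "BE")
# Hypothetical joint scenarios: (probability, {segment: recoverable volume in MMBOE}).
SCENARIOS = [
    (0.5, {"TE": 300.0, "BE": 120.0}),
    (0.3, {"TE": 300.0, "BE": 0.0}),
    (0.2, {"TE": 0.0, "BE": 0.0}),
]

def value(scenarios, t, cost):
    """Sum over segments of max(E[v * 1(v > t)] - C, 0) under normalized weights."""
    total_p = sum(p for p, _ in scenarios)
    out = 0.0
    for seg in SEGMENTS:
        ev = sum(p * vol[seg] for p, vol in scenarios if vol[seg] > t) / total_p
        out += max(ev - cost, 0.0)
    return out

def vopi(observed, t, cost):
    """Value of perfect information from observing one segment exactly."""
    prior_value = value(SCENARIOS, t, cost)
    vfc = 0.0
    for e in {vol[observed] for _, vol in SCENARIOS}:
        subset = [(p, vol) for p, vol in SCENARIOS if vol[observed] == e]
        p_e = sum(p for p, _ in subset)
        vfc += p_e * value(subset, t, cost)   # conditional value weighted by P(x_i = e)
    return vfc - prior_value

print(value(SCENARIOS, 0.0, 100.0))
print(vopi("BE", 0.0, 100.0), vopi("TE", 0.0, 100.0))
```

In this example the value of observing a segment comes from avoiding the cost C when the segment is dry, or from drilling a segment that would otherwise be left undrilled; sweeping the cost over a grid would trace out curves of the kind described for the VoPI figures.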
When computing the VoPI, we state that a certain prospect will be drilled if its expected recoverable resources exceed a certain threshold. We have considered two possible scenarios for t, t = 0 and t = 80. The value t may also represent risk-averse behavior of the decision maker: the higher t is, the more conservative the decision maker. For each scenario we have computed the VoPI for the four prospects for different costs C, representing the operational cost connected to developing the prospect. We have decided not to introduce monetary units, but to express everything in MMBOE, which is the reference unit for the prospects' volumes; for this reason C is expressed in the same terms. We have repeated the procedure with and without the self-evidence.
Results are in Figures 16 and 17. We can immediately see two major spikes, corresponding to the range of costs that affects decisions in the biggest prospects, namely BE and TE. This means that for operation costs in the regions close to the spikes, having the possibility of observing the state of one of the prospects would appreciably change our decision about the other prospects. We recognize that the first spike corresponds to a decision change in prospect BE, and the second spike to a change in prospect TE. This is confirmed both by the VoPI computed without self-evidence (the spikes corresponding to self-evidence disappear; see for example the dashed line of TE that goes to 0 in Figure 17 for high values of C), and by observing which prospects have the highest impacts: BW in the case of the first spike and TW in the case of the second spike. The geological reasons have been discussed when commenting on Figures 14 and 15 (the effect of confirming an adequate reservoir layer given by an observation in the fault prospect is stronger than that of an adequate trap, since from our data the prior likelihood for the anticlinal trap is much larger than the prior likelihood of a good reservoir quality), and they are confirmed by this VoPI analysis, which compresses the information into outputs useful for decision making. Similar discussions and considerations can be found in Martinelli et al. (2011).

[Figure 16 panels: VOPI, t=0 and VOPI, t=80; x-axis: Cost C; curves: VOPI TE, VOPI BE, VOPI TW, VOPI BW.]
Figure 16: Value of Perfect Information for the four prospects BE, TE, BW and TW, as a function of the threshold t and of the project/operation costs C. The two major spikes correspond to the range of costs that affects decisions in the biggest prospects, namely BE and TE.

[Figure 17 panels: VOPI without Self-Evidence, t=0 and VOPI without Self-Evidence, t=80; x-axis: Cost C; curves: VOPI TE, VOPI BE, VOPI TW, VOPI BW.]
Figure 17: Value of Perfect Information without self-evidence for the four prospects BE, TE, BW and TW, as a function of the threshold t and of the project/operation costs C. The two major spikes correspond to the range of costs that affects decisions in the biggest prospects, namely BE and TE.
The values of VoPI that we get from such analysis must be compared with the exploration cost necessary to get that information, again expressed in MMBOE. As we can see, VoPI values are much smaller than the operation costs C, and this is consistent since they need to be compared to the exploration costs and not to the operation costs. If the exploration cost is, say, equal to 10 MMBOE, the threshold is fixed to t = 0, and the development costs C are equal to 50, it is optimal (more informative) to focus on segment BE; in this case TE is not very informative since its high volume makes it profitable anyway. If the costs C are equal to 150, on the other hand, it is more informative to explore TE; in this situation even TW could be a good candidate (its VoPI lies above 10 MMBOE for C = 150), while TE becomes irrelevant since its volume does not cover the operation costs and it is less informative than TW for estimating the outcome of BE. The last comment is about the threshold t: we can see that higher values of t lead to a shift of the VoPI peaks towards smaller costs C. This is reasonable since higher values of t reduce the chance of the prospects being commercially viable, and therefore make them interesting only if the operational costs are smaller. This effect is bigger for prospects with low volumes (first and foremost TW and BW, but also BE), while it is almost impossible to detect when the volume is large (TE), since the imposed threshold makes this prospect very appealing in any situation.
6. Guidelines for practical use
The example discussed in this paper is simplified in several respects compared with a real-world scenario. For this reason, we would like to point out here some considerations that should guide a practical application of this methodology.
• How reliable is a process based simply on a basin model?
In the present study we rely completely on a single basin model, whose parameters can change, but whose structure is essentially fixed. This means that we can correct for possible discrepancies in many geometrical or geological parameters, but we are assuming that all the relevant information comes from this unique source. This is clearly a simplification due to the necessity of presenting the workflow, but we believe that improvements are possible. The first important point is that the approach presented here does not include the experts' knowledge as a source for driving the probabilities of the risk factors. This is relevant since expert knowledge is commonly used in the industry for quantifying the risk factors of the different geological elements. One idea for integrating it is to build this knowledge into the network's structure, as proposed in Martinelli et al. (2012). Another idea is to integrate it in a Dirichlet prior over the network's parameters, which is subsequently updated with the results of the experiments. The second point is whether, together with expert opinion, other data-driven sources of information can be used in parallel with basin modeling to improve the estimate of the segments' and prospects' COS. In this regard, there are recent contributions that aim to use seismic or EM data directly to update the segments' chances of success; see for example Kolbjornsen et al. (2012).
• Is an experimental design the ideal way to train the BN?
The experimental design plan is a simple yet complete way of handling the problem. If the extreme
points of the design plan capture the extremes of the distributions, and if these distributions
are reasonably well behaved (not too skewed towards one of the extremes), the method is robust. The
problem is that in real life we often work with skewed, possibly even multi-modal, distributions. In this
case a simple DOE approach such as the one proposed here is no longer sufficient, and more complex
approaches, such as those based on Response Surface Models (RSM) presented in Wendebourg (2003),
should be considered. The bottleneck in using this approach would be the discretization procedure:
put simply, it is useless to explore the sample space in detail if we afterwards have to
summarize our results into a few discrete outcomes. More complex BNs with continuous nodes should
probably be considered in this case.
• Is it consistent to modify the parameters one at a time, disregarding the possible interdependencies?
Even if the distributions are not skewed and a DOE approach is a reasonable way
to integrate the uncertainty, there may be problems due to the incompatibility of some configurations
in the likely case where the input parameters are correlated. In this case we should ideally first build
the joint distribution over the input parameters, and then draw configurations from that distribution.
Our suggestion remains to apply Occam’s razor when possible, and to avoid unnecessary constructions
that make the final result more difficult to read and interpret. A nice aspect of our approach is that
we can immediately identify which factors are most responsible for certain results; more complex design
tables would jeopardize this clarity and effectiveness.
• How to handle a more complex scenario with the same approach? What are the limits?
As said before, in a real scenario the uncertainty range is usually wider (more parameters involved) and
more complex (correlations involved). Great care should therefore be taken when trying to reproduce
a similar analysis. As mentioned above, we believe that this method is attractive because it allows
an explicit evaluation of what-if scenarios, and because it allows decisions and conclusions to be drawn
from a sound and consistent framework where every assumption is made explicit. We do not believe that
computational complexity is an issue when dealing with real case studies, not even with many
uncertain parameters. In this regard, other BM software simpler than Petromod could be used; see
for example Sylta (2004). We believe that the main challenges when dealing with real case studies
are the parametrization of the network and of the BN distributions involved, and the choice of the
discretization thresholds.
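The Dirichlet prior mentioned in the first guideline can be sketched for a single row of a conditional probability table. This is a minimal illustration, not the authors' procedure: the prior pseudo-counts `alpha` are hypothetical stand-ins for an expert opinion, while the counts are those used in Appendix A for ResMmd given ResTop (14 low and 1 high out of 15 scenarios when ResTop is low, 0 low and 9 high when ResTop is high).

```python
def dirichlet_posterior_mean(counts, alpha):
    """Posterior mean of a categorical distribution under a Dirichlet prior.

    counts: observed number of scenarios in each state.
    alpha:  prior pseudo-counts (hypothetical expert input), one per state.
    """
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]


# Hypothetical weak, symmetric prior: one pseudo-observation per state.
alpha = [1.0, 1.0]

# Row "ResTop = low": 14 scenarios with ResMmd low, 1 with ResMmd high.
row_low = dirichlet_posterior_mean([14, 1], alpha)
# Row "ResTop = high": 0 scenarios with ResMmd low, 9 with ResMmd high.
row_high = dirichlet_posterior_mean([0, 9], alpha)

print("P(ResMmd | ResTop = low)  =", [round(p, 4) for p in row_low])
print("P(ResMmd | ResTop = high) =", [round(p, 4) for p in row_high])
```

One consequence of this design choice is that states never observed in the 24 scenarios no longer receive exactly zero probability, which a purely count-based estimate would assign.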
7. Discussion and Conclusions
We have shown how Basin and Petroleum System Models can help in assessing the probability structure
of the Bayesian Network that models prospect and play element dependencies. The workflow moves from
the Earth model to the decision space. The geological and geophysical know-how is translated into BPSM.
Outputs from multiple runs of basin modeling under different geologic scenarios are then used to establish
the Bayesian Network, which is used to test decision scenarios and to perform value of information analysis.
The work underlines the importance of assessing uncertainty in petroleum systems. The emphasis is
less on knowing the right answer, which may never be known before drilling, and more on determining the
range of outcomes given the available data and the state of understanding of the petroleum system. Problems
are caused by the complex and often non-linear interactions among the different parameters, which make the
prediction problem extremely difficult. Currently these problems are addressed by running a number of simulations
with different parameters, and studying the uncertainty in the resulting distribution of accumulated volumes
as the main or sole output. We believe that this process is no longer sufficient, since too many
parameters remain hidden (implicit parameters) when the effect of many parameters is tested at the
same time. Our framework provides an alternative solution by making explicit all the interconnected
parameters, which are not chosen arbitrarily but derived from a multiple scenario evaluation.
Acknowledgments
We thank the Statistics for Innovation (SFI2) research center in Oslo, which partially financed GM’s
scholarship through the FindOil project. We acknowledge the Stanford BPSM group for giving GM the
opportunity to learn and practice the software used in this work.
References
Allen, P., Allen, J., 2005. Basin Analysis, Principles and Applications. 2nd ed. Blackwell Publishing.
Bhattacharjya, D., Eidsvik, J., Mukerji, T., 2010. The value of information in spatial decision making. Mathematical Geosciences 42 (2), 141–163.
Chang, K., Fung, R., 1995. Symbolic probabilistic inference with both discrete and continuous variables. IEEE Transactions on Systems, Man and Cybernetics 25 (6), 910–917.
Cochran, W. G., Cox, G. M., 1992. Experimental Designs. Wiley.
Corre, B., Thore, P., deFeraudy, V., Vincent, G., 2000. Integrated uncertainty assessment for project evaluation and risk analysis. SPE European Petroleum Conference.
Cowell, R., Dawid, P., Lauritzen, S., Spiegelhalter, D., 2007. Probabilistic Networks and Expert Systems. Springer series in Information Science and Statistics.
Damsleth, E., Hage, A., Volden, R., 1992. Maximum information at minimum cost: A North Sea field development study with an experimental design. Journal of Petroleum Technology 44 (12), 1350–1356.
Dejean, J.-P., Blanc, G., 1999. Managing uncertainties on production predictions using integrated statistical methods. SPE Annual Technical Conference and Exhibition.
Eidsvik, J., Bhattacharjya, D., Mukerji, T., 2008. Value of information of seismic amplitude and CSEM resistivity. Geophysics 73 (4), R59–R69.
Fisher, R., 1971. The Design of Experiments, 9th Edition. Macmillan.
Friedman, N., Goldszmidt, M., 1996. Discretizing continuous attributes while learning Bayesian networks. Machine Learning: Proceedings of the International Conference.
Hantschel, T., Kauerauf, A. I., 2009. Fundamentals of Basin and Petroleum Systems Modeling. Springer.
Jordan, M., 1998. Learning in graphical models. Kluwer Academic Publishers.
Kaufman, G. M., Lee, P. J., 1992. Are wildcat well outcomes dependent or independent? Working papers 3373-92, Massachusetts Institute of Technology (MIT), Sloan School of Management.
Kaufman, L., Rousseeuw, P., 2005. Finding Groups in Data: An Introduction to Cluster Analysis. Wiley Series in Probability and Statistics.
Kolbjornsen, O., Hauge, R., Drange-Espeland, M., Buland, A., 2012. Model-based fluid factor for controlled source electromagnetic data. Geophysics 77 (1), E21–E31.
Lerche, I., 1997. Geological risk and uncertainty in oil exploration. Academic Press.
Martinelli, G., Eidsvik, J., Hauge, R., Drange-Forland, M., 2011. Bayesian networks for prospect analysis in the North Sea. AAPG Bulletin 95 (8), 1423–1442.
Martinelli, G., Eidsvik, J., Hauge, R., Hokstad, K., 2012. Strategies for petroleum exploration based on Bayesian networks: a case study, SPE paper 159722. SPE Annual Technical Conference and Exhibition, San Antonio, TX, USA, 8-10 October 2012.
Murphy, K. P., 1999. A variational approximation for Bayesian networks with discrete and continuous latent variables. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, UAI ’99.
Sylta, O., 2004. Hydrocarbon migration modelling and exploration risk. Dr. philos, NTNU.
Tviberg, S., 2011. To assess the petroleum net present value and accumulation process in a controlled Petromod environment. Master Thesis at the Department of Geology and Mineral Resources Engineering, NTNU.
VanWees, J., Mijnlieff, H., Lutgert, J., Breunese, J., Bos, C., Rosenkranz, P., Neele, F., 2008. A Bayesian belief network approach for assessing the impact of exploration prospect interdependency: An application to predict gas discoveries in the Netherlands. AAPG Bulletin 92 (10), 1315–1336.
Wendebourg, J., 2003. Uncertainty of petroleum generation using methods of experimental design and response surface modeling: Application to the Gippsland Basin, Australia. In: AAPG/Datapages Discovery Series No. 7: Multidimensional Basin Modeling, Chapter 19. AAPG Special Volumes, pp. 295–307.
Wendebourg, J., Trabelsi, K., 2005. How wrong can it be? Understanding uncertainty in petroleum systems modelling. In: Geological Society, London, Petroleum Geology Conference series. Vol. 6. Geological Society of London, pp. 1289–1299.
Appendix A. Basic computations on Bayesian Networks
We discuss here in detail the learning procedure for the Reservoir part of the BN presented in Figure 9,
and the related computations.
We use seven series of data provided by our 24 basin modelling scenarios. The data are, respectively,
Accumulation Total, Accumulation Mmd, Accumulation Ou, Accumulation Mmd Gas, Accumulation Mmd
Oil, Accumulation Ou Gas and Accumulation Ou Oil.
Most of these data are reported in Table 1, in the supplementary materials, and shown in Figure 8. Given
these data we build a network with seven nodes, named ResTop, ResMmd, ResOu,
ResMmdGas, ResMmdOil, ResOuGas and ResOuOil. The structure of the network is imposed, and it can be
seen on the left side of Figure 9. It consists of a top node, ResTop, with two children, ResMmd and ResOu,
each of which has two children in turn: ResMmdGas and ResMmdOil for the first, and ResOuGas and
ResOuOil for the second. The distributions are learned directly from the data. We do not incorporate
any prior opinion. This means that the learning process is based just on the counts of the successful cases.
We show the discretized values for the seven nodes of interest in Table A.4.
Let us consider as an example the first two nodes, ResTop and ResMmd. We see that whenever the
node ResTop is in state 2 (high), the node ResMmd is in state 2 as well. We also notice that only once do we
observe ResTop in state 1 (low) and ResMmd in state 2 (high). This happens in the 23rd scenario.
This means that we have a 93% probability (14 times out of 15) of observing the node ResMmd in state 1 when
we observe the node ResTop in state 1 as well. From these considerations we can write the conditional
probability distribution of node ResMmd given node ResTop (Table A.5).
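The counting procedure described above can be sketched in a few lines. The two lists reproduce the ResTop and ResMmd columns of Table A.4 (scenarios 1 to 24); the helper name `learn_cpt` is our own and is not part of any BN package.

```python
from collections import Counter

# ResTop and ResMmd columns of Table A.4, scenarios 1..24 (1 = low, 2 = high).
res_top = [2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 2, 1, 1, 1]
res_mmd = [2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1]


def learn_cpt(parent, child, states=(1, 2)):
    """Estimate P(child | parent) from relative frequencies, with no prior.

    Assumes every parent state is observed at least once; otherwise the
    corresponding row is undefined (division by zero).
    """
    counts = Counter(zip(parent, child))
    cpt = {}
    for p in states:
        total = sum(counts[(p, c)] for c in states)
        cpt[p] = {c: counts[(p, c)] / total for c in states}
    return cpt


cpt = learn_cpt(res_top, res_mmd)
for parent_state, row in cpt.items():
    print(parent_state, {c: round(v, 4) for c, v in row.items()})
```

The row for ResTop = 1 gives 14/15 and 1/15, and the row for ResTop = 2 gives 0 and 1, in agreement with the values quoted in the text.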
We can use the same procedure for all the nodes of this subnetwork, and learn all the CPTs that we
need. In this way we build the joint distribution for the network. We report the marginal distributions for
the seven nodes of the network in Table A.6. The complete joint distribution representation would require
$2^7$ assessments, and it is therefore too large to show here. It can be derived with the same criteria, but recall
that the idea of BNs is to break up the joint modelling using local CPTs.

Scenario  ResTop  ResMmd  ResOu  ResMmdGas  ResMmdOil  ResOuGas  ResOuOil
1         2       2       2      1          2          2         2
2         1       1       1      1          1          1         1
3         2       2       2      2          2          2         2
4         1       1       1      1          1          1         1
5         2       2       1      2          2          2         1
6         1       1       1      1          1          1         1
7         2       2       2      1          2          2         2
8         1       1       1      1          1          1         1
9         2       2       2      2          2          2         2
10        1       1       1      1          1          1         1
11        2       2       1      2          2          2         1
12        1       1       1      1          1          1         1
13        1       1       2      1          1          1         2
14        1       1       1      1          1          1         1
15        2       2       1      2          2          2         1
16        1       1       1      1          1          1         1
17        2       2       1      2          2          2         1
18        1       1       1      1          1          1         1
19        1       1       2      1          1          1         2
20        1       1       1      1          1          1         1
21        2       2       1      2          2          2         1
22        1       1       1      1          1          1         1
23        1       2       1      2          2          2         1
24        1       1       1      1          1          1         1
Table A.4: Discretized values for the seven nodes of the subnetwork Reservoir. The discretization is carried out according to the clusters identified in Figure 8.

ResTop / ResMmd  low     high
low              0.9333  0.0667
high             0       1.0000
Table A.5: Conditional probability table for the node ResMmd given node ResTop, within the Reservoir subnetwork.

Node / State  1 (low)  2 (high)
ResTop        0.6250   0.3750
ResMmd        0.5833   0.4167
ResOu         0.7500   0.2500
ResMmdGas     0.6667   0.3333
ResMmdOil     0.5833   0.4167
ResOuGas      0.5833   0.4167
ResOuOil      0.7500   0.2500
Table A.6: Marginal prior distributions of the seven variables (nodes) within the Reservoir subnetwork.
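The saving obtained by using local CPTs instead of the full joint distribution can be made concrete by counting table entries. The sketch below assumes the tree structure described for the Reservoir subnetwork (ResTop with children ResMmd and ResOu, each with two children) and binary nodes; the dictionary `parents` is simply our own encoding of that structure.

```python
# Parent sets for the Reservoir subnetwork, as described in the text.
parents = {
    "ResTop": [],
    "ResMmd": ["ResTop"],
    "ResOu": ["ResTop"],
    "ResMmdGas": ["ResMmd"],
    "ResMmdOil": ["ResMmd"],
    "ResOuGas": ["ResOu"],
    "ResOuOil": ["ResOu"],
}
n_states = 2  # every node is discretized into low / high

# Full joint table: one entry per configuration of the seven nodes.
joint_entries = n_states ** len(parents)

# Local CPTs: one row per parent configuration, one column per node state.
cpt_entries = sum(n_states ** (len(pa) + 1) for pa in parents.values())

# Free parameters: each CPT row sums to one, so one entry per row is redundant.
cpt_free_parameters = sum((n_states - 1) * n_states ** len(pa) for pa in parents.values())

print(joint_entries, cpt_entries, cpt_free_parameters)
```

Under these assumptions the joint table has 128 entries, while the seven CPTs together hold 26 entries, of which 13 are free parameters.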
In order to derive the correlation coefficients reported in Section 4.1, we can use the standard Pearson
correlation coefficient formula in the special case of discrete variables. Let us consider the joint distribution
of two binary random variables $x_1$ and $x_2$. In this case we have four possible outcomes: $p_{11} = p(x_1 = 1, x_2 = 1)$,
$p_{21} = p(x_1 = 2, x_2 = 1)$, $p_{12} = p(x_1 = 1, x_2 = 2)$ and $p_{22} = p(x_1 = 2, x_2 = 2)$. Let us denote
$p_1 = p(x_1 = 1)$ and $p_2 = p(x_2 = 1)$. Then, the correlation coefficient is:
\[
\rho = \frac{p_{11} p_{22} - p_{12} p_{21}}{\sqrt{p_1 p_2 (1 - p_1)(1 - p_2)}}
\]
In the case of the variables $x_1$ = ResMmdGas and $x_2$ = ResMmdOil, for example, we have $p_{11} = 0.5833$, $p_{12} = 0.0833$,
$p_{21} = 0$ and $p_{22} = 0.3333$. Furthermore, from Table A.6, we have $p_1 = 0.6667$ and $p_2 = 0.5833$.
Therefore in this case $\rho = 0.8367$, i.e. the two variables are highly correlated.
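As a check on the arithmetic, the coefficient can be recomputed directly from the discretized data. The sketch below assumes that the joint probabilities coincide with the relative frequencies of the ResMmdGas and ResMmdOil columns of Table A.4; the function name `binary_correlation` is our own.

```python
from math import sqrt

# ResMmdGas and ResMmdOil columns of Table A.4, scenarios 1..24 (1 = low, 2 = high).
res_mmd_gas = [1, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1]
res_mmd_oil = [2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1]


def binary_correlation(x1, x2):
    """Pearson correlation of two variables taking values in {1, 2}."""
    n = len(x1)
    # Joint relative frequencies p[(a, b)] = p(x1 = a, x2 = b).
    p = {(a, b): sum(1 for u, v in zip(x1, x2) if (u, v) == (a, b)) / n
         for a in (1, 2) for b in (1, 2)}
    p1 = p[(1, 1)] + p[(1, 2)]  # p(x1 = 1)
    p2 = p[(1, 1)] + p[(2, 1)]  # p(x2 = 1)
    numerator = p[(1, 1)] * p[(2, 2)] - p[(1, 2)] * p[(2, 1)]
    return numerator / sqrt(p1 * p2 * (1 - p1) * (1 - p2))


rho = binary_correlation(res_mmd_gas, res_mmd_oil)
print(round(rho, 4))
```

The result agrees with the value of 0.8367 reported in the text.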
Finally, in order to derive the CPT for nodes that are not in a parent-child relation, we must propagate the
information on the BN using Bayes’ theorem and sum out the intermediate variables (marginalization). For
example, when we ask for $p(\mathrm{ResMmdGas} = 1 \mid \mathrm{ResMmdOil} = 1)$, only the variable ResMmd lies in
between, and we can proceed as follows:
\[
\begin{aligned}
p(\mathrm{ResMmdGas} = 1 \mid \mathrm{ResMmdOil} = 1)
&= \sum_{j=1}^{2} p(\mathrm{ResMmdGas} = 1, \mathrm{ResMmd} = j \mid \mathrm{ResMmdOil} = 1) \\
&= \sum_{j=1}^{2} p(\mathrm{ResMmdGas} = 1 \mid \mathrm{ResMmd} = j, \mathrm{ResMmdOil} = 1)\, p(\mathrm{ResMmd} = j \mid \mathrm{ResMmdOil} = 1) \\
&= \sum_{j=1}^{2} \frac{p(\mathrm{ResMmdGas} = 1 \mid \mathrm{ResMmd} = j)\, p(\mathrm{ResMmdOil} = 1 \mid \mathrm{ResMmd} = j)\, p(\mathrm{ResMmd} = j)}{p(\mathrm{ResMmdOil} = 1)}
\end{aligned}
\]
If we substitute all the values from our original CPTs, we obtain p(ResMmdGas =
1|ResMmdOil = 1) = 1, as shown in Table 2. In this last step we have exploited the conditional
independence property of the BN, i.e. a node, given its parents, is conditionally independent of all the
nodes that are not its descendants. In this case, therefore, ResMmdGas is independent of ResMmdOil
given the value of its parent ResMmd. The same procedure can be applied on much larger scales using
specific propagation algorithms. One approach is the so-called Variable Elimination algorithm. A more efficient
algorithm, implemented in the software package that we have used (Murphy, 1999), is the Junction Tree
Algorithm. For details about this algorithm see Jordan (1998) and Cowell et al. (2007).
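The marginalization over ResMmd can be reproduced numerically. This is a sketch by direct enumeration, not the Junction Tree Algorithm used by the authors; the CPT values below are our own relative-frequency estimates obtained by counting in Table A.4, and the function name `p_gas_given_oil` is hypothetical.

```python
# Marginal of the common parent ResMmd (Table A.6): 14/24 low, 10/24 high.
p_mmd = {1: 14 / 24, 2: 10 / 24}

# CPTs indexed as table[parent_state][child_state], counted from Table A.4.
# Assumed values: with ResMmd low, gas and oil are always low; with ResMmd
# high, gas is high in 8 of 10 scenarios and oil is high in all 10.
p_gas_given_mmd = {1: {1: 1.0, 2: 0.0}, 2: {1: 0.2, 2: 0.8}}
p_oil_given_mmd = {1: {1: 1.0, 2: 0.0}, 2: {1: 0.0, 2: 1.0}}


def p_gas_given_oil(gas, oil):
    """p(ResMmdGas = gas | ResMmdOil = oil), summing out ResMmd."""
    # Numerator: sum_j p(gas | j) p(oil | j) p(j), using conditional independence.
    joint = sum(p_gas_given_mmd[j][gas] * p_oil_given_mmd[j][oil] * p_mmd[j]
                for j in (1, 2))
    # Denominator: p(oil) = sum_j p(oil | j) p(j).
    evidence = sum(p_oil_given_mmd[j][oil] * p_mmd[j] for j in (1, 2))
    return joint / evidence


print(p_gas_given_oil(1, 1))
```

With these inputs the query for ResMmdGas = 1 given ResMmdOil = 1 returns 1, matching the result stated in the text.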