
INTERPRETING PRESSURE AND FLOW RATE DATA

FROM PERMANENT DOWNHOLE GAUGES

USING DATA MINING APPROACHES

A DISSERTATION

SUBMITTED TO THE DEPARTMENT OF ENERGY RESOURCES ENGINEERING

AND THE COMMITTEE ON GRADUATE STUDIES

OF STANFORD UNIVERSITY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

Yang Liu

March 2013

© 2013 by Yang Liu. All Rights Reserved.

Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License: http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/xp635wx9603


I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Roland Horne, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Margot Gerritsen

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Tapan Mukerji

Approved for the Stanford University Committee on Graduate Studies.

Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.


Abstract

The Permanent Downhole Gauge (PDG) is a promising resource for real-time downhole measurement. However, a bottleneck in utilizing the PDG data is that the commonly applied well test methods are limited (practically) to short sections of shut-in data only and thus fail to utilize the long-term PDG record. Recent technology developments have provided the ability for PDGs to measure both flow rate and pressure, so the limitation of using only shut-in periods could, in theory, be avoided. In practice, however, it is still difficult to make use of the combined flow rate and pressure data over a PDG record of long duration, due to the noise in both signals as well as uncertainty with respect to the appropriate reservoir model over such a long period.

The successful application of data mining in computer science shows great potential in revealing the relationship between variables from voluminous data sets. This inspired us to investigate the application of data mining methodologies as a way to reveal the relationship between flow rate and pressure histories from PDG data, and hence extract the reservoir model.

In this study, nonparametric kernel-based data mining approaches were investigated. The data mining process was conducted in two stages, namely learning and prediction. In the learning process, the reservoir model was obtained implicitly in a suitable functional form in the high-dimensional kernel Hilbert space (defined by the kernel function) when the learning algorithm converged after being trained on the pressure and flow rate data. In the prediction process, a pressure prediction was made by the data mining algorithm according to an arbitrary flow rate history (usually a constant flow rate history, for simplicity). This flow rate history and the corresponding pressure prediction revealed the reservoir model underlying the variable PDG data. In a second mode, recalculating the pressure history based on the measured flow rate history removed noise from the pressure signal effectively. Recalculating the pressure based on a denoised flow rate history removed noise from both signals.

In this work, a series of data mining methods using different kernel functions and input vectors was investigated. Methods A, B, and C utilized simple kernel functions. Method A and Method B did not require knowledge of the breakpoints in advance. The difference between the two was that Method A used a low-order kernel function with a high-order input vector, while Method B used a high-order kernel function with a low-order input vector. Method C required knowledge of the breakpoints. Nine synthetic test cases with different well/reservoir models were used to test these methods. The results showed that all three methods achieved good pressure reproduction of the training flow rate history and good pressure prediction for the constant flow rate history. However, each of them has limitations in different respects.

The limitations of the simple kernel methods led us to a reconsideration of kernelization and superposition. In the simple kernel methods, the kernelization was deployed over the superposition, which was reflected as the summation in the input vector. However, the architecture of superposition over kernelization would be more suitable to capture the essence of the transient, and this approach was implemented by using a convolution kernel in Method D. The convolution kernel was invented and applied in the domain of natural language machine learning. In the original linguistic study, the convolution kernel decomposed words into parts, and evaluated the parts using a simple kernel function. This inspired us to apply the convolution kernel method to PDG data by decomposing the pressure transient into a series of pressure responses to the previous flow rate change events. The superposition was then reflected as the summation of simple kernels (hence superposition over kernelization). Sixteen synthetic and real field test cases were tested using this approach. The method recovered the reservoir model successfully in all cases. By comparison, Method D outperformed all the simple kernel methods in stability and accuracy across all test cases, without knowing the breakpoints in advance.


This study also discussed the performance of Method D under complicated data situations, including the existence of significant outliers and aberrant segments, incomplete production history, unknown initial pressure, different sampling frequencies, and different time spans of the data set. The results suggested that: 1) Method D tolerated a moderate level of outliers and aberrant segments without any preprocessing; 2) Method D could reveal the reservoir/well model with effective rate correction and/or optimization of the initial pressure value when the production history was incomplete and/or the initial pressure was unknown; and 3) an appropriate sampling frequency and time span of the data set were required to ensure the sufficiency of the basis functions in the kernel Hilbert space.

In order to improve the performance of the convolution kernel method in dealing with large data sets, two block algorithms, namely Methods E and F, were also investigated. The two methods rescaled the original kernel matrix into a series of block matrices, and used only some of the blocks to complete the training process. A series of synthetic cases and real cases illustrated their efficiency and accuracy. A comparison of the performance of Methods D, E, and F was also conducted.


Acknowledgements

As my Ph.D. study finally approaches its end, there is a long list of people to whom I would like to express my thanks. In my five-year journey of graduate study, some of them pointed the direction for me many a time, encouraging me to persist in my research despite failures; some of them lent me their hands whenever I met problems, whether in daily life or academic study; and some of them accompanied me day after day, sharing my happiness and sadness. It is their guidance, help, and care that supported me in reaching where I am today.

The first person I would like to thank is my advisor, Professor Roland Horne. He was the professor who recruited me in Beijing when I applied to the department for admission nearly six years ago. He was also the advisor who guided both my master's and Ph.D. studies. Many a time when I hesitated to try a new idea that might very possibly fail the tests, he always encouraged me to go ahead. His warm words, "proving that a method does not work for a case is still a part of research", comforted me a lot when the research reached a plateau. His remarkable insight and precise intuition helped me keep on the right track of study. His creative thoughts always provided me with more ideas. I feel lucky and honored to have been his student in my graduate study. I still remember the first day I saw him in his office in 2007; he said, "Yang, we have a long way to go." Today, the way reaches a milestone but does not end. I will cherish these days with Professor Horne, and carefully maintain this close personal relationship in the future.

I would also like to express my gratitude to the rest of my thesis committee members: Professor Margot Gerritsen, Professor Tapan Mukerji, Professor Lou Durlofsky, and Professor Norman Sleep. Each of them helped me in my academic growth and gave constructive comments on my thesis and research. Professor Margot Gerritsen's linear algebra course gave me a solid foundation in mathematical theory and computation. Professor Tapan Mukerji has a wide range of knowledge, so his courses and talks were always good sources of reference. As one of the organizers of the Smart Field Annual Conference, Professor Lou Durlofsky provided a series of constructive comments and suggestions on my research work presented at the meeting. I owe thanks to Professor Norman Sleep as well. Although he did not know me well, he was still willing to be my committee chairman, to read through my thesis, and to discuss the details with me. His attitude towards scientific study earned my great respect.

To the Smart Field Consortium and the SUPRI-D Research Group, I express my sincere thankfulness as well. The two research groups provided me with not only important financial support throughout my whole graduate study, but also friendly, interactive academic platforms. I am grateful to Professor Khalid Aziz, as he always kept an eye on my research progress, suggesting that I widen the usage of my study as a generic petroleum data processing method. The weekly SUPRI-D group meeting was a joyful event in a busy life. I enjoyed the free discussion and knowledge sharing among all SUPRI-D members. In particular, my acknowledgements go to Priscila Ribeiro, Sanghui Ahn, Zhe Wang, and Maytham Ibrahim Al Ismail. They never stinted their encouragement whenever I made a little progress.

Five years of campus life gave me the chance to meet a lot of friends who cared for me and cherished our friendship. I thank Siyao Xu, my roommate and best friend, for his kindness, help, and generosity. I also thank Thanapong Boontaeng, my officemate, for his patience the many times I presented him the progress of my research. I owe thanks to the Chinese community as well. Their friendship and support made my daily life easier.

To my parents and my wife, I express my utmost love and acknowledgement. Neither of my parents attended university, due to the limitations of a special period in China's history. However, they always encouraged me to complete my Ph.D. study despite any difficulties. I appreciate them for giving me a life in which I could experience such an exciting education and meet so many friends. I owe everything to my wife, Zhizhen Liu. She accompanied me on the long journey of my graduate study, sharing all my happiness and sadness. Her persistent love supported every step of my progress. I leave my final sincere gratitude to my devoted wife.



Contents

Abstract iv

Acknowledgements vii

1 Introduction 1

1.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.3 Dissertation Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Literature Review 14

2.1 Reservoir Monitoring and Management . . . . . . . . . . . . . . . . . 15

2.2 Pressure Transient Analysis . . . . . . . . . . . . . . . . . . . . . . . 17

2.2.1 Data Processing and Denoising . . . . . . . . . . . . . . . . . 18

2.2.2 Breakpoint Detection . . . . . . . . . . . . . . . . . . . . . . . 22

2.2.3 Flow Rate Reconstruction . . . . . . . . . . . . . . . . . . . . 25

2.2.4 Change of Reservoir Properties . . . . . . . . . . . . . . . . . 27

2.3 Deconvolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.4 Temperature Transient Analysis . . . . . . . . . . . . . . . . . . . . . 34

2.5 Data Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3 Data Mining Concept and Simple Kernel 40

3.1 Components of Learning Algorithm . . . . . . . . . . . . . . . . . . . 41

3.1.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

3.1.2 Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 43


3.1.3 Optimization Search Method . . . . . . . . . . . . . . . . . . . 45

3.2 Kernelization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3.3 Kernelized Data Mining without Breakpoint Detection . . . . . . . . 53

3.4 Kernelized Data Mining with Breakpoint Detection . . . . . . . . . . 57

3.5 Application on Synthetic Cases . . . . . . . . . . . . . . . . . . . . . 59

3.5.1 Radial Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3.5.2 Radial Flow + Wellbore . . . . . . . . . . . . . . . . . . . . . 61

3.5.3 Radial Flow + Skin . . . . . . . . . . . . . . . . . . . . . . . . 61

3.5.4 Radial Flow + Wellbore + Skin . . . . . . . . . . . . . . . . . 64

3.5.5 Radial Flow + Closed Boundary . . . . . . . . . . . . . . . . . 64

3.5.6 Radial Flow + Constant Pressure Boundary . . . . . . . . . . 65

3.5.7 Radial Flow + Wellbore + Skin + Closed Boundary . . . . . . 65

3.5.8 Radial Flow + Wellbore + Skin + Constant Boundary . . . . 67

3.5.9 Radial Flow + Dual Porosity . . . . . . . . . . . . . . . . . . 67

3.6 Summary and Limitation . . . . . . . . . . . . . . . . . . . . . . . . . 68

4 Convolution Kernel 72

4.1 The Origination of Convolution Kernel . . . . . . . . . . . . . . . . . 72

4.2 Convolution Kernel Applied to PDG Data . . . . . . . . . . . . . . . 75

4.3 Conjugate Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

4.4 Input Vector Selection . . . . . . . . . . . . . . . . . . . . . . . . . . 83

4.5 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

4.5.1 Radial Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

4.5.2 Radial Flow + Wellbore . . . . . . . . . . . . . . . . . . . . . 93

4.5.3 Radial Flow + Skin . . . . . . . . . . . . . . . . . . . . . . . . 95

4.5.4 Radial Flow + Wellbore + Skin . . . . . . . . . . . . . . . . . 95

4.5.5 Radial Flow + Closed Boundary . . . . . . . . . . . . . . . . . 96

4.5.6 Radial Flow + Constant Pressure Boundary . . . . . . . . . . 98

4.5.7 Radial Flow + Wellbore + Skin + Closed Boundary . . . . . . 98

4.5.8 Radial Flow + Wellbore + Skin + Constant Boundary . . . . 99

4.5.9 Radial Flow + Dual Porosity . . . . . . . . . . . . . . . . . . 102


4.5.10 Complicated Synthetic Case A . . . . . . . . . . . . . . . . . . 103

4.5.11 Complicated Synthetic Case B . . . . . . . . . . . . . . . . . . 105

4.5.12 Semireal Case A . . . . . . . . . . . . . . . . . . . . . . . . . 107

4.5.13 Semireal Case B . . . . . . . . . . . . . . . . . . . . . . . . . . 109

4.5.14 Real Case A . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

4.5.15 Real Case B . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

4.5.16 Real Case C . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

5 Performance Analysis 119

5.1 Outliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

5.2 Aberrant Segments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

5.3 Partial Production History . . . . . . . . . . . . . . . . . . . . . . . . 136

5.4 Unknown Initial Pressure . . . . . . . . . . . . . . . . . . . . . . . . . 143

5.5 Sampling Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

5.6 Evolution of Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

6 Rescalability 162

6.1 Block Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

6.2 Advanced Block Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 170

6.3 Real Data Application . . . . . . . . . . . . . . . . . . . . . . . . . . 173

6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178

7 Conclusion and Future Work 180

A Data 186

B Proof of Kernel Closure Rules 213

B.1 Summation Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

B.2 Tensor Product Closure . . . . . . . . . . . . . . . . . . . . . . . . . 214

B.3 Positive Scaling Closure . . . . . . . . . . . . . . . . . . . . . . . . . 215


C Breakpoint Detection Using Data Mining Approaches 217

C.1 K-means and Bilateral . . . . . . . . . . . . . . . . . . . . . . . . . . 217

C.2 Minimum Message Length . . . . . . . . . . . . . . . . . . . . . . . . 220

D Implementation 225

D.1 Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225

D.2 Work Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

Nomenclature 234

Bibliography 237


List of Tables

3.1 Kernel Function and the Corresponding Φ(x) (Ng, 2009) . . . . . . 51

3.2 Reservoir behavior and input features . . . . . . . . . . . . . . . . . . 55

3.3 Input vectors and kernel functions for Method A and Method B . . . 56

3.4 Input vector and kernel function for Method C . . . . . . . . . . . . . 58

3.5 Test cases for simple kernel method . . . . . . . . . . . . . . . . . . . 60

4.1 Input vector for convolution kernel . . . . . . . . . . . . . . . . . . . 84

4.2 Test cases for convolution kernel input vector selection . . . . . . . . 85

4.3 Input vector and kernel function for Method D . . . . . . . . . . . . . 88

4.4 Test cases for convolution kernel method . . . . . . . . . . . . . . . . 89

4.5 Result plots for all tests on convolution kernel method . . . . . . . . 92

5.1 Test cases for outliers performance analysis . . . . . . . . . . . . . . . 122

5.2 Test cases for aberrant segment performance analysis . . . . . . . . . 129

5.3 Test case for partial production history performance analysis . . . . . 138

5.4 Test case for partial production history performance analysis . . . . . 140

5.5 Test case for unknown initial pressure performance analysis . . . . . . 145

5.6 Test case for unknown initial pressure analysis . . . . . . . . . . . . . 147

5.7 Test cases for sampling frequency performance analysis . . . . . . . . 150

5.8 Test cases for evolution learning performance analysis . . . . . . . . . 155

6.1 Comparison between Method D and Method E . . . . . . . . . . . . . 166

6.2 Test cases for rescalability test using Method E . . . . . . . . . . . . 167

6.3 Comparison between Method E and Method F . . . . . . . . . . . . . 171


6.4 Test cases for rescalability test using Method F . . . . . . . . . . . . 172

6.5 Test cases for rescalability test on large PDG data set . . . . . . . . . 175

6.6 Execution time of Case 36 with different block sizes . . . . . . . . . . 178

A.1 Data for Case 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

A.2 Data for Case 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

A.3 Data for Case 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

A.4 Data for Case 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

A.5 Data for Case 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

A.6 Data for Case 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

A.7 Data for Case 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

A.8 Data for Case 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

A.9 Data for Case 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

A.10 Data for Case 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

A.11 Data for Case 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

A.12 Data for Case 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

A.13 Data for Case 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

A.14 Data for Case 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

A.15 Data for Case 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

A.16 Data for Cases 16-18 . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

A.17 Data for Cases 19-22 . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

A.18 Data for Cases 23-24 . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

A.19 Data for Cases 25-26 . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

A.20 Data for Cases 27-30 . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

A.21 Data for Cases 31-34 . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

A.22 Data for Case 35 . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

A.23 Data for Cases 36-37 . . . . . . . . . . . . . . . . . . . . . . . . . . . 202


List of Figures

1.1 The structure of PDG . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 The appearance of PDG . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Variable flow rate and noisy data from PDG . . . . . . . . . . . . . . 6

1.4 Detect the real reservoir response . . . . . . . . . . . . . . . . . . . . 7

1.5 Discover the real reservoir model . . . . . . . . . . . . . . . . . . . . 8

1.6 Work Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.1 Pressure response of a slant well . . . . . . . . . . . . . . . . . . . . . 15

2.2 History matching using PDG data . . . . . . . . . . . . . . . . . . . . 16

2.3 A general downhole data acquisition system . . . . . . . . . . . . . . 19

2.4 Three categories of noise from PDG . . . . . . . . . . . . . . . . . . . 20

2.5 Fitting pressure data with two approaches . . . . . . . . . . . . . . . 22

2.6 Breakpoint detection with both pressure and the flow rate data . . . 24

2.7 The effect of inaccuracy in breakpoint detection . . . . . . . . . . . . 25

2.8 Flow rate reconstruction using PDG pressure data . . . . . . . . . . . 26

2.9 Flow rate reconstruction using wavelet transformation . . . . . . . . . 28

2.10 History matching with variable reservoir properties . . . . . . . . . . 29

2.11 Variable reservoir properties as functions of time . . . . . . . . . . . . 29

2.12 Variable reservoir properties using the moving window method . . . . 30

2.13 Piecewise constant reservoir properties . . . . . . . . . . . . . . . . . 31

2.14 Deconvolution applied on the simulated data . . . . . . . . . . . . . . 32

2.15 Recover the initial pressure by deconvolution . . . . . . . . . . . . . . 33

2.16 Deconvolution with convex optimization on real field data . . . . . . 34

2.17 Temperature and pressure data from a PDG . . . . . . . . . . . . . . 35


2.18 Temperature and pressure transient analysis . . . . . . . . . . . . . . 36

2.19 Pressure prediction on synthetic data . . . . . . . . . . . . . . . . . . 38

2.20 Pressure prediction on real data . . . . . . . . . . . . . . . . . . . . . 38

3.1 Superposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

3.2 Demonstration of the construction of feature-based input variable. . . 58

3.3 Simple kernel learning results for Case 1 . . . . . . . . . . . . . . . . 62

3.4 Simple kernel learning results for Case 2 . . . . . . . . . . . . . . . . 63

3.5 Simple kernel learning results for Case 3 . . . . . . . . . . . . . . . . 63

3.6 Simple kernel learning results for Case 4 . . . . . . . . . . . . . . . . 64

3.7 Simple kernel learning results for Case 5 . . . . . . . . . . . . . . . . 65

3.8 Simple kernel learning results for Case 6 . . . . . . . . . . . . . . . . 66

3.9 Simple kernel learning results for Case 7 . . . . . . . . . . . . . . . . 66

3.10 Simple kernel learning results for Case 8 . . . . . . . . . . . . . . . . 67

3.11 Simple kernel learning results for Case 9 . . . . . . . . . . . . . . . . 68

3.12 Method B failed to predict on a more variable flow rate history . . . . 70

4.1 Decompose an input sample point into parts . . . . . . . . . . . . . . 75

4.2 Comparison between SGD and CG . . . . . . . . . . . . . . . . . . . 78

4.3 Comparison between different convolution input vectors . . . . . . . . 86

4.4 Convolution kernel learning results for Case 1 . . . . . . . . . . . . . 94

4.5 Convolution kernel learning results for Case 2 . . . . . . . . . . . . . 95

4.6 Convolution kernel learning results for Case 3 . . . . . . . . . . . . . 96

4.7 Convolution kernel learning results for Case 4 . . . . . . . . . . . . . 97

4.8 Convolution kernel learning results for Case 5 . . . . . . . . . . . . . 98

4.9 Convolution kernel learning results for Case 6 . . . . . . . . . . . . . 99

4.10 Convolution kernel learning results for Case 7 . . . . . . . . . . . . . 100

4.11 Convolution kernel learning results for Case 8 . . . . . . . . . . . . . 101

4.12 Convolution kernel learning results for Case 9 . . . . . . . . . . . . . 102

4.13 Convolution kernel learning results for Case 10 . . . . . . . . . . . . . 104

4.14 Convolution kernel learning results for Case 11 . . . . . . . . . . . . . 106

4.15 Convolution kernel learning results for Case 12 . . . . . . . . . . . . . 108


4.16 Convolution kernel learning results for Case 13 . . . . . . . . . . . . . 110

4.17 Convolution kernel learning results for Case 14 . . . . . . . . . . . . . 112

4.18 Convolution kernel learning results for Case 15 . . . . . . . . . . . . . 113

4.19 Comparison between the prediction of two real cases . . . . . . . . . 114

4.20 Convolution kernel learning results for Case 37 . . . . . . . . . . . . . 116

5.1 Outlier performance test on Case 16 . . . . . . . . . . . . . . . . . . . 123

5.2 Outlier performance test on Case 17 . . . . . . . . . . . . . . . . . . . 125

5.3 Outlier performance test on Case 18 . . . . . . . . . . . . . . . . . . . 126

5.4 Aberrant segment performance test on Case 19 . . . . . . . . . . . . . 131

5.5 Aberrant segment performance test on Case 20 . . . . . . . . . . . . . 132

5.6 Aberrant segment performance test on Case 21 . . . . . . . . . . . . . 134

5.7 Aberrant segment performance test on Case 22 . . . . . . . . . . . . . 135

5.8 The original complete data set for Cases 23 and 24 . . . . . . . . . . 137

5.9 Partial production history test on Case 23 . . . . . . . . . . . . . . . 139

5.10 Partial production history test on Case 24 A . . . . . . . . . . . . . . 141

5.11 Partial production history test on Case 24 B . . . . . . . . . . . . . . 142

5.12 The true data and training data for Cases 25 and 26 . . . . . . . . . 144

5.13 Unknown initial pressure performance test on Case 25 . . . . . . . . . 145

5.14 Unknown initial pressure performance test on Case 26 . . . . . . . . . 148

5.15 The original complete data set for Cases 27-30 . . . . . . . . . . . . . 149

5.16 Pressure reproduction in the frequency tests on Cases 27 - 30 . . . . . 151

5.17 Pressure prediction in the frequency tests on Cases 27 - 30 . . . . . . 153

5.18 The original complete data set for Cases 31-34 . . . . . . . . . . . . . 155

5.19 Pressure reproduction in the evolution tests on Cases 31 - 34 . . . . . 157

5.20 Pressure prediction in the evolution tests on Cases 31-34 . . . . . . . 159

6.1 The block matrices used in the block algorithm . . . . . . . . . . . . 165

6.2 Rescalability test results on Case 35 using Method E . . . . . . . . . 169

6.3 The block matrices used in the advanced block algorithm . . . . . . . 170

6.4 Rescalability test results on Case 35 using Method F . . . . . . . . . 174

6.5 The real field data for rescalability tests . . . . . . . . . . . . . . . . 175


6.6 The real field data and the resampled data for Case 36 . . . . . . . . 176

6.7 Rescalability test results on Case 36 . . . . . . . . . . . . . . . . . . 177

C.1 K-means and Bilateral methods on breakpoint detection . . . . . . . 219

C.2 MML method on breakpoint detection (no outliers) . . . . . . . . . . 222

C.3 MML method on breakpoint detection (outliers) . . . . . . . . . . . . 223

C.4 MML method using flow rate and time data only . . . . . . . . . . . 224

D.1 The class diagram of the PDG project . . . . . . . . . . . . . . . . . 226

D.2 The work flow of tests . . . . . . . . . . . . . . . . . . . . . . . . . . 230


Chapter 1

Introduction

Downhole data acquisition from a producing well is very important for petroleum development, mainly for two reasons. On one hand, real-time measurements enable petroleum engineers to access the immediate well status, so that a quick response may be taken if any abnormal reservoir behaviours are observed. On the other hand, the accumulated downhole data may be used to better calibrate the reservoir model in a history matching process, in which a reservoir model is proposed to match the obtained measurements and thereafter to predict the future performance of the reservoir. Conventionally, only surface measurements such as surface rates and cumulative production volume are utilized in a history matching process. The downhole production data, including pressure, flow rate, and temperature as functions of time, may improve the accuracy of the reservoir model by capturing more details of real-time reservoir behaviour. Based on the predictions of the improved model, petroleum engineers may make complex decisions to optimize long-term production.

However, although the importance of real-time downhole measurement is recognized in the petroleum industry, long-term continuous downhole measurement was not feasible, due to technical limitations, until the invention and deployment of the Permanent Downhole Gauge (PDG).

Permanent Downhole Gauges were designed initially for well monitoring. The installation of PDGs may date back as far as 1963 (Nestlerode, 1963). However, they were not widely deployed until the late 1980s, when a new generation of reliable PDGs was developed (Horne, 2007; Eck et al., 2000).

Figure 1.1: The structure of a commercial PDG (Eck et al., 2000).

Fig. 1.1 demonstrates the structure of a commercial PDG, while Fig. 1.2 shows the appearance of a PDG used in offshore reservoirs. At the early stages, the PDG measured only the temperature and the pressure, and was not able to obtain the flow rate information. Therefore, at that time, only the temperature and pressure existed in the PDG data set.

Figure 1.2: The appearance of a commercial PDG used in offshore reservoirs (Konopczynski and McKay, 2009).

However, with the further development of PDGs, this problem was overcome. In one commercial downhole device, two pressure gauges in gauge mandrels measure the pressure drop across an integrated venturi, which is directly proportional to the square of the fluid velocity. A third pressure gauge may be used to measure the fluid density. Using these two measurements, the flow rate may be calculated. Other forms of flow rate measurement are used in the permanent downhole gauge configurations of other service companies. These settings enable permanent downhole gauges to provide pressure, temperature, fluid density, and flow rate simultaneously at each time point.
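As a rough illustration of the velocity-squared relation just described, here is a minimal sketch in Python of how such a venturi rate computation might look; the geometry values and the discharge coefficient are illustrative assumptions, not the specifications of any particular gauge:

```python
import math

def venturi_flow_rate(dp_pa, rho_kg_m3, a_pipe_m2, a_throat_m2, cd=0.98):
    """Volumetric flow rate (m^3/s) from a venturi pressure drop.

    Classical venturi-meter relation: the measured pressure drop is
    proportional to the square of the fluid velocity, so q ~ sqrt(dp/rho).
    The areas and the discharge coefficient cd here are illustrative
    assumptions, not gauge specifications.
    """
    area_ratio_sq = (a_throat_m2 / a_pipe_m2) ** 2  # (A_throat / A_pipe)^2
    v_throat = math.sqrt(2.0 * dp_pa / (rho_kg_m3 * (1.0 - area_ratio_sq)))
    return cd * a_throat_m2 * v_throat
```

With the venturi pressure drop from the two gauges and the fluid density from the third, the rate follows directly, which is why the independent density measurement matters for the rate computation.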

Since the 1960s, when the first PDG was installed, half a century has passed. Modern PDGs have more functionality and better accuracy and stability. More than 1000 wells worldwide had been equipped with PDGs by 2001 (Khong, 2001), and the number was possibly close to 20,000 in 2012. The development of the PDG can be traced through the milestones of PDG application in a major oilfield service company (Eck et al., 2000):

1973 First permanent downhole gauge installation in West Africa, based on wireline logging cable and equipment.

1975 First pressure and temperature transmitter on a single wireline cable.

1978 First subsea installations in the North Sea and West Africa.

1983 First subsea installation with acoustic data transmission to surface.

1986 Fully welded metal-tubing-encased permanent downhole cable.

1986 Introduction of quartz crystal permanent pressure gauge in a subsea well.

1990 Fully supported copper conductor in permanent downhole cable.

1993 New generation of quartz and sapphire crystal permanent gauges.

1994 Installation for mass flow rate measurement.

With the ability to record pressure and flow rate continuously over the long term during production, the PDG has become a new and significant source of downhole reservoir data. However, in many cases, the data from PDGs are still used mainly to monitor the production status of the well, not for reservoir analysis. The reason that PDG data are not used frequently for reservoir analysis is the difficulty of dealing with the uncontrolled flow rate variations in typical PDG data using conventional well test interpretation methods. Nevertheless, for the past ten years, petroleum engineers have persisted in working on how to utilize the huge volume of PDG data to better characterize the well and the reservoir for reservoir management.

1.1 Problem Statement

From the reservoir engineering point of view, the pressure transient measured by a PDG is a function of the flow rate changes (the flow rate changes may themselves be calculated from the flow rates measured by the PDG). This is very much like the data collected in a conventional well test, such as a buildup or a drawdown test (Horne, 2007). However, considering the nature of a conventional well test (an intentionally controlled, imposed flow rate change) and the nature of PDG measurement (unrestrained fluctuations in a producing well), there are several difficulties in applying conventional well test analysis methods to PDG data.
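For orientation, the textbook relation behind this statement is the superposition principle of well testing: for a piecewise-constant rate history in which rate q_j begins at breakpoint t_j, and with initial pressure p_i,

p(t) = p_i - \sum_{j=1}^{N} (q_j - q_{j-1}) \, p_u(t - t_j)

where p_u(t) is the pressure drop per unit production rate. This standard form is quoted here only for reference; the data mining methods developed in this work learn such a pressure-rate relationship without assuming its functional form explicitly.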

Firstly, a conventional well test is designed to impose a flow rate change that is as simple as possible, so that the pressure response will be easy to interpret. Fig. 1.3 shows a typical pressure and flow rate acquisition from a real PDG. The flow rate data are variable, and only two small sections of data, highlighted in boxes, are suitable for a conventional buildup interpretation. Compared to the huge volume of measurements, the conventional buildup interpretation methods are applicable to only a very limited portion of the data.

Secondly, the PDG data are very noisy. Unlike traditional well testing tools that are used in controlled environments, PDGs measure the pressure and flow rate in the well during production. Therefore, the uncontrolled nature of the flow introduces several kinds of noise and artifacts into the data. Fig. 1.3(b) shows a zoomed-in view of Fig. 1.3(a). The flow rate and pressure data are both very noisy. The problem with the noise is not so much an absolute bias from the true values as the frequent fluctuation. This leads to two issues. For the pressure transient, it becomes hard to recognize what is the real reservoir response and what is due to noise. For the flow rate, there is no easy way to detect the breakpoints (where the flow rate really changes).

Thirdly, the flow rate history is not usually needed in conventional well test interpretation. In a conventional well test, such as a buildup or a drawdown test, the flow rate is intended to be maintained at zero or at a constant value. Nowadays, the PDG has the capability to provide the flow rate information as well, so there is a strong demand for a method that can cointerpret the pressure and the flow rate simultaneously.

Fourthly, PDGs measure the pressure and flow rate at high frequency over a long duration. A single year of measurement can amount to gigabytes of data. The volume of the data is far beyond the capability of manual processing and thus requires algorithmic approaches.

In addition to these problems, which are mainly technical difficulties, there is also a restriction on conventional methods, namely physical model dependency. In conventional well testing methods, reservoir models, which are used to deduce a relationship between the flow rate and pressure, usually start from predefined physical equations. This requires the engineer to predefine a physical model before making any interpretation. This requirement increases the risk of making an incorrect presumption of physical model, especially bearing in mind that the model must describe months or years of data, not just a few hours as in a conventional well test. This study made a major departure from conventional approaches by seeking a physical-model-independent method to achieve a nonparametric regression, matching a model without knowing in advance what it is. With plentiful PDG data, the method is expected to discover the reservoir model in the process, rather than depend on knowing the model in advance. This is the fundamental premise of this study.

Figure 1.3: (a) Variable flow rate data. Only two small pieces of data are good for a buildup test. (b) The pressure and flow rate data are both very noisy.

Therefore, the target for this research has been to determine what method is able to utilize data sets that are (1) variable in flow rate, (2) noisy, and (3) large in number of measurements, to achieve cointerpretation of the pressure and flow rate data from permanent downhole gauges with a nonparametric regression. Specifically, two targets were achieved in this study.

For the first target, we would like to detect the real reservoir response from the noisy data set. Suppose we have a noisy data set (on the left of Fig. 1.4); our method will learn and obtain the reservoir properties from the noisy data. Then, our method is expected to return a cleaned pressure when the clipped flow rate is provided (on the right of Fig. 1.4).

Figure 1.4: Target 1: Detect the real reservoir response from the noisy data.

For the second target, we would like the method to discover the reservoir model without knowing it in advance. This is an extension of the first target. Suppose we have a noisy data set from PDGs; our method will learn and obtain the reservoir model behind the noisy PDG data. After that, the method will give a pressure prediction according to an arbitrary given flow rate. In particular, when a constant flow rate history is provided (as shown in the right part of Fig. 1.5), the predicted pressure transient corresponding to the given constant flow rate will work like a deconvolution process, revealing the reservoir model behind the noisy data set.

Figure 1.5: Target 2: Discover the reservoir model without knowing it in advance.

1.2 Methodology

In order to achieve the research targets, this study investigated the application of data mining, which is a nonparametric regression method that does not require knowledge of the reservoir model in advance.

Data mining is the process of extracting patterns from data. Data mining plays a key role in many areas of science, finance, and industry. Here are some examples of data mining problems:

• Predict the price of a stock 6 months from now, on the basis of company performance measures and economic data (Hastie et al., 2009).

• Identify the numbers in a handwritten ZIP code from a digitized image (Hastie et al., 2009).

• Search for association rules in supermarket transaction data (Tan et al., 2005).

• Classify spam or junk emails among incoming emails (Hastie et al., 2009).

• Recognize the face pattern from a database of photographs to confirm an identity (Hastie et al., 2009).

Before computers were invented and widely used, manual data mining processes had been used for centuries. Early methods of identifying patterns in data include Bayes' theorem (1700s) and least squares regression analysis (1800s). The development of computer science and technology has stimulated the study of data mining techniques. In addition to Bayes' theorem and least squares regression analysis, many efficient and powerful methods have been invented, including neural networks, genetic algorithms, decision trees, support vector machines, minimum message length, etc. Most of these modern data mining methods are computationally intensive, and hence are computer-aided. With the help of these methods, many aspects of our daily life have changed greatly. A typical example is handwritten ZIP code identification. Neural networks, invented in the 1950s, made automated ZIP code recognition possible, freeing many post office laborers from the tedious work of reading ZIP codes on envelopes. With the further development of data mining techniques, the Support Vector Machine (SVM) method achieved efficient identification of handwritten letters in the 1990s. Nowadays, with the aid of these data mining methods, post offices are able to process hundreds of thousands of pieces of mail faster and more accurately with fewer manual workers.

Given data mining's ability to detect models from large volumes of data, it seems worthwhile to use data mining in the processing of PDG data. Assuming the PDG data reflect the properties of the reservoir, proper data mining algorithms may be able to extract the reservoir model from the PDG data despite their being variable and noisy. This study focused on applying data mining algorithms to the cointerpretation of pressure and flow rate signals from permanent downhole gauges.

Fig. 1.6 shows the flow chart of this data mining approach. The whole algorithm starts from the PDG data, including the pressure series and the flow rate series. Then we create a training data set from the raw PDG data to train the machine learning algorithm. This training process is iterative. Once the machine learning algorithm converges, the reservoir properties are expected to have been obtained and stored within the algorithm. Then, we may provide an arbitrary flow rate history (a constant flow rate as an example in Fig. 1.6) as an input, and the well-trained machine learning algorithm may then give a pressure prediction according to this given flow rate history. The reservoir properties can be obtained if, as expected, the pressure prediction can be treated as the real pressure response given the specific new flow rate history. At this point, the original PDG data set, which is noisy, huge in volume, and variable in flow rate, will have been translated into a low-noise, constant flow rate data set. Engineers may apply the conventional well test interpretation methods to this predicted pressure to estimate more information about the reservoir. Engineers may even provide a future flow rate projection and ask the algorithm to give a pressure prediction, which could be used for production optimization. In this research, kernelization was implemented in the machine learning algorithm, and the training data set was created according to the selection of different kernel functions.

Figure 1.6: Work flow chart of cointerpreting pressure and flow rate data from PDG using data mining approaches.
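To make this train-then-predict loop concrete, here is a minimal kernel regression sketch in Python. It is only a schematic analogue of the flow chart: the Gaussian kernel, the ridge regularizer, and the construction of the input matrix X from the flow rate history are illustrative assumptions, not the Methods A through F developed in later chapters.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma=0.1):
    """K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def train(X, y, lam=1e-3):
    """Learning stage: X holds input vectors built from the flow rate
    history (e.g. time and rate features), y the measured pressures.
    Solving (K + lam*I) alpha = y gives dual weights that implicitly
    encode the learned pressure-rate relationship."""
    K = gaussian_kernel(X, X)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def predict(X_train, alpha, X_new):
    """Prediction stage: evaluate the learned function on new inputs,
    e.g. a constant-rate history, to reveal the underlying response."""
    return gaussian_kernel(X_new, X_train) @ alpha
```

The dual-weight formulation is what permits the kernelization discussed next: the learned function only ever touches the data through kernel evaluations, never through explicit high-dimensional features.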

One of the key reasons for the success of the Support Vector Machine (SVM) approach is that SVM uses the process of kernelization, which enables the data mining process to work in a high-dimensional Hilbert space (a space defined by the inner product of vectors). The main advantage of kernelization is that the data mining is performed in a very high-dimensional space while the computation is done in a lower-dimensional space. This work took advantage of this characteristic of kernelization and applied it in the data mining processing of PDG data.
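A standard textbook example (not specific to this study) shows why that is possible. For x, z in R^2, the second-order polynomial kernel satisfies

K(x, z) = (x^T z)^2 = \Phi(x)^T \Phi(z), \quad \Phi(x) = (x_1^2,\; \sqrt{2}\, x_1 x_2,\; x_2^2),

so it evaluates the inner product of three-dimensional feature vectors while the arithmetic only ever touches the original two-dimensional inputs; richer kernels extend the same trick to spaces of very high, even infinite, dimension.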

1.3 Dissertation Outline

This dissertation proceeds as follows.

Chapter 2 provides a literature review of PDG data interpretation. This review introduces the methodology of previous works in utilizing PDG data for well analysis. The advantages and restrictions of the previous methods are described.


Chapter 3 first presents an overview of data mining concepts and introduces the key components of data mining algorithms. It then explains the concept and algorithm of kernelization, after which a simple kernel, the linear kernel, is discussed. The simple kernel methods were applied to a series of synthetic cases, and the outstanding issues are discussed in this chapter.

The restrictions of the simple kernel methods discussed in Chapter 3 lead to an exploration of a complex kernel, namely the convolution kernel, which is described in Chapter 4. In Chapter 4, the origin of the convolution kernel is first introduced, followed by the detailed algorithm for using it in the PDG context. A series of synthetic data, semireal data, and real field data were used to test the convolution kernel methods. The results are discussed in this chapter as well.

Following Chapter 4, Chapter 5 discusses the performance analysis of the convolution kernel. A series of sensitivity tests was carried out to demonstrate the method's performance under some special real-field conditions, including the existence of outliers and aberrant segments, missing flow rate history, and unknown initial pressure. In this chapter, the effect of the training data timespan and sampling frequency on the method's performance is also discussed. Finally, a test of evolution learning is shown to demonstrate how the data mining results change with time in a production well.

In Chapter 6, an important issue, the scalability of the data mining method on huge data sets, is investigated. In this chapter, three block learning algorithms are discussed. The results of applying these methods to a large-scale data set are also presented in the final part of this chapter.

Chapter 7 summarizes the whole work and provides some insights into possible future work in PDG data analysis using data mining approaches.

In addition to the seven chapters, there are four appendices. Appendix A lists all the data for the 37 test cases discussed in the dissertation.

Appendix B proves the three kernel closure rules used in Chapter 4.

In Appendix C, breakpoint detection using data mining techniques, another important topic in transient testing, is discussed. Because it is not the focus of this work (the cointerpretation of pressure and flow rate data), the discussion is put in the appendix. Three different data mining methods, K-means, bilateral, and Minimum Message Length, were applied. This appendix describes the advantages and limitations of the three methods.

Appendix D explains in detail the C++ implementation of the project. In this appendix, a class diagram and a work flow diagram are used to demonstrate the structure of the program, the functionalities of the classes, the interactions between classes, and the invocation of the functions. In addition, this appendix also explains the extensibility of the programs through the abstract classes that define the interfaces.

Chapter 2

Literature Review

With the wide deployment of PDGs, using PDG data for reservoir analysis has become a topic of interest over the past decades. As mentioned in Chapter 1, PDGs were initially designed for well monitoring, but the characteristics of real-time downhole measurement make the PDG a promising data source for reservoir analysis. In recent years, studies on PDG data interpretation have flourished, covering several areas of reservoir engineering.

In this chapter, the previous work on PDG data interpretation is reviewed. According to the target of the analyses, and the data content to which the analyses were applied, the review is organized into five sections:

Reservoir monitoring and management: the studies that used the PDG data directly for reservoir monitoring and management;

Pressure transient analysis: the studies that analyzed PDG pressure transient data, mainly to characterize the reservoir;

Deconvolution: the studies that utilized both the pressure and the flow rate data from PDGs to characterize the reservoir;

Temperature transient analysis: the studies that interpreted the temperature data from PDGs;

Data Mining: the studies that applied data mining techniques to cointerpret the pressure and flow rate data from PDGs.

2.1 Reservoir Monitoring and Management

The usage of PDG data started with utilizing the real-time downhole pressure measurement to monitor subsurface activities. Chalaturnyk and Moffatt (1995) presented the PDG pressure at the stages of completion, initial startup, and early production of a slant well. They determined that most significant reservoir events may be reflected in the pressure response, illustrating the effectiveness of a PDG in reservoir management. Figure 2.1 shows a synchronized downhole pressure response at the initial startup stage.

Figure 2.1: Pressure response from PDG during initial startup of a slant well, from Chalaturnyk and Moffatt (1995).

de Oliveira and Kato (2004) showed a real field example in the Campos Basin, Brazil, which demonstrated a full workflow of integrating PDG data into reservoir management optimization. The work progressed from using the PDG data in reservoir characterisation, to reservoir development, to production optimization. Figure 2.2 shows a history matching result using the PDG pressure data. de Oliveira and Kato determined from the comparison between the PDG data and the history matching data that the PDG data reflected the interaction between production wells while the history matched model did not. This provides a direction for the improvement of history matching models.

Figure 2.2: History matching using PDG data from de Oliveira and Kato (2004).

Kragas et al. (2004) presented a list of applications utilizing the PDG data in reservoir monitoring and management. They include:

• Reservoir pressure measurement.

• Reduced well interventions.

• Reduced shut-ins.

• Flowing-bottomhole-pressure management.

• Skin determination.

• Compartmentalization detection.

• Voidage control.

• Problem well diagnosis.

• Tubing-hydraulics matching.

Kragas et al. showed an example of the Northstar field, located in the Ivishak formation approximately 6 miles offshore Alaska in the Beaufort Sea, to illustrate the application of the PDG data. The application demonstrated the great value of the PDG data in the management and monitoring of perforation and completion.

These early studies worked mostly on correlating the PDG pressure transient with reservoir events directly. In the meantime, researchers and engineers began to investigate extracting more useful information from the PDG data through further complex data processing.

2.2 Pressure Transient Analysis

The most direct way to use the PDG data is for pressure transient analysis. Pressure transient analysis requires measurement of both pressure and flow rates, but the downhole flow rate data were not available at the early stages of PDG deployment due to technical limitations. In addition, the disparity of purpose between PDG measurement and well test analysis makes it challenging to apply conventional well test analysis methods directly to PDG data.

Athichanagorn (1999) developed a multistep procedure to process and interpret PDG data. Athichanagorn determined that special handling, such as outlier removal, denoising, data reduction, and flow rate reconstruction, was required for PDG pressure transient analysis. This was due to the volume of data, the uncontrolled and unmeasured downhole flow rate, and the fluctuations of the subsurface conditions over the long-term production life.

Athichanagorn et al. (2002) described a work flow for applying pressure transient analysis to the PDG data. The work flow included seven steps (Athichanagorn et al., 2002):

1. Outlier removal

2. Denoising

3. Transient identification / breakpoint detection¹

4. Data reduction

5. Flow history reconstruction

6. Aberrant segment filtering

7. Transient analysis on moving windows

¹A breakpoint is a point where a flow rate change event happens. The breakpoint usually indicates the end of the previous transient and the beginning of the next transient. Therefore, transient identification requires breakpoint detection.

In this work flow, the first six steps are data preparation, and the last step applies the conventional transient analysis method on a moving window of pressure data. In order to make this work flow go smoothly, a lot of work related to each step has been done. For the sake of convenience, the seven aspects of work are classified into four topics: (1) data processing and denoising, (2) breakpoint detection, (3) flow rate reconstruction, and (4) change in reservoir properties.

2.2.1 Data Processing and Denoising

PDGs may provide measurements at a very high frequency, as high as once per second (Horne, 2007). Working at such high frequency, each PDG may accumulate a data set of about 125 MB per year (roughly 31.5 million samples per year, with each sampling point stored as a 32-bit single-precision floating point number in memory). In addition to the size of the data set, noise is also very common in the PDG data, as demonstrated in Fig. 1.3(b). Handling the huge volume of noisy PDG data requires special mathematical methods and careful implementation.

Veneruso et al. (1992) addressed the noise problem from the source of the data,

the computer-based data acquisition system related with both hardware and soft-

ware. Fig. 2.3 shows the block graph of a general computer-based downhole data

acquisition and transmission system. Veneruso et al. determined that the measuring

1A breakpoint is a point where a flow rate change event happens. The breakpoint usually indicatesthe end of the previous transient and the beginning of the next transient. Therefore, transientidentification requires breakpoint detection.

CHAPTER 2. LITERATURE REVIEW 19

system itself, working under the complex and extreme subsurface conditions, might very possibly be the source of noise without careful tuning. Using a field example, they demonstrated that noise could be caused by any key part of the system, such as A/D conversion, sampling, and transmission channel capacity. To ensure the quality of the data, the whole system should be matched to the downhole sensor's full-scale measurement range, resolution, and frequency band. They also tried to utilize straightforward signal processing methods, such as digital filtering, to denoise the data. In general, Veneruso et al. pointed out that noise in the downhole measurement may come from the measuring devices as well as from the uncontrolled subsurface environment itself. However, they did not provide much thought on how to process the noisy data after they were loaded to the computer.

Figure 2.3: A general computer-based downhole data acquisition and transmission system, from Veneruso et al. (1992).

Athichanagorn et al. (2002) presented a work flow for processing the long-term data from PDGs. Athichanagorn et al. (2002) pointed out three major categories of noise in PDG signals: outliers, normal noise, and aberrant segments. In their work, the outliers and the normal noise are filtered out using the wavelet method. After applying the wavelet transformation to the noisy PDG signals, Athichanagorn et al. (2002) identified outliers as the values above a threshold in the detail signal, and normal noise as the values below a threshold in the detail signal.


For aberrant segments, the authors proposed an iterative method to regress on each transient and exclude the transients whose regressed parameters show a large variance. Fig. 2.4(a) and Fig. 2.4(b) show the processed results after applying the

wavelet methods on the example field data.

The classification of noise, and the data processing methods corresponding to the three different classes of noise, in Athichanagorn et al. (2002) are very useful and are already applied in industry practice. However, there are still some issues with the methods. Firstly, the thresholds in the wavelet methods are empirical, which means a trial-and-error process is needed to decide what the thresholds should be in each case. Secondly, simple filtering of the data by the wavelet methods is based on the assumption that the fluctuations and outliers are caused purely by noise and do not reflect the reservoir behavior. This may lead to a loss of useful information. Thirdly, the iterative regression method used to handle the aberrant segments requires a predetermination of the transient periods. This requirement is challenging when the data set is large or when the pressure change has a slow transition from one flow period to the next (rather than a sharp break).

The work of Ouyang and Kikani (2002) is an extension of Athichanagorn et al. (2002). They continued to use the wavelet method in the PDG data processing and denoising, focusing on the improvement of transient identification and automatic noise level determination. Ouyang and Kikani's improvement on the transient identification will be reviewed in Section 2.2.2. Before Ouyang and Kikani (2002), Khong (2001) demonstrated ways to determine the noise level (the noise threshold in the detail signals after wavelet transformation) using the statistical equation, Eq. 2.1 (Donoho and Johnstone, 1994):

λ = σ √(2 log n)   (2.1)

where n is the total number of data points in the data set, and σ is the standard deviation of the noise. Ouyang and Kikani determined that, in calculating the standard deviation σ, Khong's assumption that pressure varies linearly with time is not valid for the majority of the time. Ouyang and Kikani replaced the linear Least

Squares Error regression with a nonlinear regression, and improved the accuracy of σ.

Figure 2.4: (a) outliers and (b) normal noise filtered out using the wavelet method, and (c) the final regression result matched to the pressure data with aberrant segments, from Athichanagorn et al. (2002).

As a result, the noise threshold in the wavelet method may be better determined

and achieve a higher degree of automation.
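To make the use of Eq. 2.1 concrete, the following is a minimal Python sketch of wavelet denoising with the universal threshold, using the PyWavelets library. The wavelet choice ("db4"), the decomposition level, and the soft-threshold mode are illustrative assumptions, not the specific settings of the cited works; σ is estimated here from the finest-scale detail coefficients by the median absolute deviation.

    import numpy as np
    import pywt

    def wavelet_denoise(p, wavelet="db4", level=4):
        """Denoise a pressure signal using the universal threshold of Eq. 2.1."""
        coeffs = pywt.wavedec(p, wavelet, level=level)
        # estimate the noise standard deviation from the finest detail coefficients
        sigma = np.median(np.abs(coeffs[-1])) / 0.6745
        lam = sigma * np.sqrt(2.0 * np.log(len(p)))          # Eq. 2.1
        # shrink all detail coefficients; leave the approximation untouched
        coeffs[1:] = [pywt.threshold(c, lam, mode="soft") for c in coeffs[1:]]
        return pywt.waverec(coeffs, wavelet)[: len(p)]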

Figure 2.5: Comparison of two approaches for best fitting pressure data, from Ouyang and Kikani (2002).

One important restriction of Ouyang and Kikani's method is that a transient period needs to be selected in advance. Compared to the full duration of the PDG data, a short transient period may not represent the general noise level of the whole data set. The selection itself may introduce new uncertainties and systematic errors into the denoising process.

Liu (2009) also presented a denoising method using the Haar wavelet transformation. Liu (2009) applied a full-level Haar wavelet transformation on both the pressure and the flow rate data, and plotted one against the other. The idea was to truncate the detail signals falling in the first and third quadrants (because those points violate the rule that the sign of the pressure change should be opposite to the sign of the flow rate change), and then to reconstruct the pressure signal using the truncated detail signal. Compared to other denoising methods that use only the pressure, this method filters the data using both the pressure and the flow rate. Liu (2009)


also showed another denoising method using the data mining methods, which will be

reviewed in Section 2.5.
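A minimal sketch of this quadrant-truncation idea is given below, assuming the pressure and flow rate are sampled at the same times; the exact truncation rule and reconstruction details in Liu (2009) may differ.

    import numpy as np
    import pywt

    def quadrant_denoise(p, q):
        """Zero the detail coefficients where pressure and flow rate details
        share the same sign (first/third quadrants of the detail crossplot)."""
        cp = pywt.wavedec(p, "haar")              # full-level Haar transform
        cq = pywt.wavedec(q, "haar")
        for dp, dq in zip(cp[1:], cq[1:]):        # detail levels only
            dp[dp * dq > 0] = 0.0                 # same sign violates dp ~ -dq
        return pywt.waverec(cp, "haar")[: len(p)]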

2.2.2 Breakpoint Detection

One of the major differences between a conventional well test and the PDG measurements is the number of pressure transients. A conventional well test interpretation is designed to work on an imposed flow rate change. Hence, a constant flow rate (drawdown test) or a zero flow rate (buildup test) is preferred, to provide as simple a flow rate change as possible. This simple flow rate change may be analyzed using a simple mathematical solution. However, PDGs are used in the production environment, so the flow rates are variable most of the time. Even when the producing well is set to produce at a constant flow rate, the uncontrolled fluctuations in the well condition and the subsurface still result in a fluctuating flow rate history. Therefore, the conventional well test analysis method usually works on a single pressure transient corresponding to a constant flow rate, while a PDG data analysis has to face multiple pressure transients. In order to utilize the conventional well testing method on a PDG data set, it is necessary to break the long-term record into individual transients. Hence, finding the locations of the real breakpoints (the places where the flow rate changes) is unavoidable.

Athichanagorn et al. (2002) proposed a threshold method in which a breakpoint is identified when the pressure change is higher than a predefined ∆pmax, and whenever the timespan between samples becomes larger than a predefined ∆tmax. Detecting a breakpoint from the pressure differential relies on the basic correlation between the flow rate and the pressure. However, the detection depends on the choice of the two parameters, ∆pmax and ∆tmax, which are quite tricky to decide. A trial-and-error process requiring frequent human interaction cannot be avoided to find proper thresholds. However, this frequent user interaction is not feasible for a large data set.
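A minimal sketch of such a threshold rule is shown below; the function and parameter names are hypothetical, and a practical implementation would operate on denoised data rather than the raw samples.

    import numpy as np

    def detect_breakpoints(t, p, dp_max, dt_max):
        """Flag sample i as a breakpoint if the pressure change exceeds dp_max
        or the sampling gap exceeds dt_max."""
        dp = np.abs(np.diff(p))
        dt = np.diff(t)
        return np.where((dp > dp_max) | (dt > dt_max))[0] + 1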

Ouyang and Kikani (2002) first studied a case of 30 transients using the ratio

between the absolute value of the pressure differential over the first 0.1 hours of


each transient and the transient threshold, and then developed a practical formula to

predict the detectability of transients, as shown in Eq. 2.2.

∆q ≥ 0.0018 k h S / (B µ)   (2.2)

where S stands for the transient threshold used in the PDG data processing program. With the help of Eq. 2.2, supposing that the parameters k, h, B, and µ are all given, the maximum flow rate change whose corresponding pressure transient may be missed can be determined from the threshold S. This flow rate change may also be stated as the minimum detectable flow rate change for the specific pressure transient. The equation may also be used to guide the selection of the transient threshold S under a given flow rate change, if Eq. 2.2 is rearranged into the form of Eq. 2.3:

S ≤ B µ ∆q / (0.0018 k h)   (2.3)

Ouyang and Kikani's work improved the selection of the threshold parameter. However, the procedure still cannot guarantee highly accurate breakpoint detection, because it is easy to know the flow rate change of a specific transient, but difficult to know the minimum flow rate change of the whole PDG data set.

Even the downhole flow rate data do not help much in breakpoint detection. Rai

(2005) applied breakpoint detection to both the pressure and the flow rate data at

the same time. Most visually apparent breakpoints were detected, but still some were

missed.

The accuracy of the breakpoint detection significantly affects the calculations in PDG data analysis, especially the calculation of deconvolution. Nomura (2006) determined that a breakpoint inaccuracy that could not even be detected by the human eye still led to huge deviations in the deconvolution results. As shown in Figure 2.7, the breakpoint detection (Fig. 2.7(a)) using the current commercial algorithm from Athichanagorn (1999) looks good visually, but the deconvolution result (Fig. 2.7(b)) based on this breakpoint detection deviates substantially from the true answer.

Nomura's examples illustrate the industrial demand for high accuracy in breakpoint detection.

Figure 2.6: Breakpoint detection using both the pressure and the flow rate data. Most visually apparent breakpoints are detected, but still some are missed. From Rai (2005).

Nowadays, researchers and engineers are still working on

accurate breakpoint detection. In this dissertation, both the methods with breakpoint

detection and the methods without breakpoint detection will be discussed. Providing

a method that does not require breakpoint detection is a clear advantage.

2.2.3 Flow Rate Reconstruction

As the early PDG tools did not provide downhole flow rate information, the pressure data have been used to reconstruct the flow rate series. This approach is still often used today: PDGs that measure both pressure and flow rate are now available, but are deployed relatively infrequently.

Ouyang and Sawiris (2003) raised the question of reconstructing the production

and injection flow rate profile using the PDG pressure data. In their work, a key

numerical solution of flow rate as a function of downhole pressure was derived, based on the assumption of single-phase flow along the wellbore.

Figure 2.7: (a) shows the breakpoints detected by current algorithms used in the industry, while (b) demonstrates the deconvolution results using the detected breakpoints in (a). From Nomura (2006).

An offshore field

example was tested using the method, as demonstrated in Fig. 2.8. In addition to

the field example, Ouyang and Sawiris also performed sensitivity tests on all param-

eters in the method. Although the formulation was derived under single-phase flow,

Ouyang and Sawiris still observed that the method should be valid in the multiphase

situation, as long as the phases were well mixed.

Zheng and Wang (2011) utilized wavelet transformation to recover an oil-water two-phase flow rate history. Zheng and Wang's method first applied the wavelet transformation on the PDG pressure transient, and obtained the frequency amplitude change. They determined a relationship between the wavelet frequency amplitude change and the liquid flow rate change, with which the liquid flow rate change was derived as a function of the frequency amplitude change. As demonstrated in Fig. 2.9, the flow rate profile is reconstructed as shown in Fig. 2.9(b), using the wavelet transformation coefficients of PDG pressure as shown in Fig. 2.9(a). However, all the example cases were synthetic, so the method still requires more tests on real field data.

Duru (2011) investigated using the temperature and the pressure data together

to reconstruct the flow rate history. Considering that Duru's study is actually a temperature transient analysis, it will be reviewed in Section 2.4.

Figure 2.8: Using the pressure profile as shown in (a), the flow rate profile is reconstructed as shown in (b). From Ouyang and Sawiris (2003).

Figure 2.9: Using the wavelet transformation coefficients of PDG pressure as shown in (a), the flow rate profile is reconstructed as shown in (b). From Zheng and Wang (2011).

Although the modern permanent downhole gauges have already achieved the ca-

pability of downhole flow rate measurement, better flow rate reconstruction methods

are still needed because many permanent downhole gauges that cannot measure the

flow rate are still deployed. Moreover, PDG data sets with partially missing flow rate

are also very common. Accurate flow rate reconstruction methods will be very helpful

in these cases.

2.2.4 Change of Reservoir Properties

The fact that the reservoir properties change during production has been noticed

for a long time. Lee (2003) demonstrated history matching to a two-year record of pressure, as shown in Fig. 2.10. The pressure simulation result with variable permeability and skin clearly outperforms that with constant reservoir properties.

Unlike the conventional well tests which only use measurements of short duration,

PDGs may provide long-term measurements. The long-term PDG data, therefore,

are expected to be affected by the change in the reservoir properties or behavior.

Figure 2.10: A comparison between the pressure history matching with constant and variable reservoir properties, from Lee (2003).

Dealing with the reservoir property change, Lee (2003) estimated the permeabil-

ity and the skin factor as functions of time. Regressing on the parameters of the


functions (simulations were performed in each iteration), the estimation of the reser-

voir properties is shown in Fig. 2.11. These estimations made by Lee were all based

on the assumption that the reservoir properties are functions of time only. Actually

the reservoir properties may also be affected by other factors, such as the change

of reservoir flow mechanisms. Nevertheless, assuming the properties are functions of time only could be treated as a convenient model to accommodate all the factors.

Figure 2.11: Estimation of the permeability as a quadratic function of time, and of the skin factor as a linear function of time, from Lee (2003).

For the same PDG data, Khong (2001) made a further investigation of the moving window method proposed by Athichanagorn (1999). Khong set a window of fixed width, and moved the window slowly with a predefined interval from the beginning of the data set to the end. Transient analysis was applied to each window, thus yielding a series of reservoir properties as a function of time. Fig. 2.12 shows the permeability change using the moving window method. The moving window method was also used in Athichanagorn et al. (2002).

Zheng and Li (2007) also used a window-like method. In Zheng and Li's study, they first applied the wavelet method to detect breakpoints and to define transients. Within each transient, they assumed the reservoir properties were constant. The result is a sequence of piecewise constant reservoir properties, such as the permeability and the skin factor shown in Fig. 2.13.

Figure 2.12: Variable permeability obtained by the moving window method on the PDG pressure data, from Khong (2001).

Figure 2.13: Piecewise constant reservoir properties after transient analysis, from Zheng and Li (2007).

2.3 Deconvolution

As some modern PDG devices have the capability of flow rate measurement, the cointerpretation of pressure and flow rate data from the PDG, rather than an analysis of the pressure data only, has gained attention. Beyond its use specifically for PDG data analysis, a common pressure/flow rate cointerpretation method is deconvolution. Deconvolution is the process that uses the pressure transient response to a variable flow rate to compute the corresponding constant flow rate response. As expressed in Eq. 2.4, the wellbore pressure drop ∆pw(t) can be constructed by the convolution of the individual constant rate transient ∆p0(t). Therefore, the deconvolution process is to extract the constant flow rate transient ∆p0(t) from the convolved variable rate transients ∆pw(t):

∆pw(t) = ∫_0^t q′(τ) · ∆p0(t − τ) dτ   (2.4)
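To connect Eq. 2.4 with sampled data, the following is a minimal sketch of the discretized forward (convolution) model; dp0, tau, and dq are hypothetical names for the unit-rate response function and the step-wise rate-change events assumed by this sketch.

    import numpy as np

    def superposed_pressure(t, tau, dq, dp0):
        """Discretized Eq. 2.4: superpose the constant-rate response dp0 over
        all rate changes dq occurring at times tau."""
        t = np.asarray(t, dtype=float)
        dpw = np.zeros_like(t)
        for tj, dqj in zip(tau, dq):
            active = t > tj                  # each event acts only after it starts
            dpw[active] += dqj * dp0(t[active] - tj)
        return dpw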

Deconvolution has been discussed for decades, with most approaches being analytical. For example, Ramey (1970) applied the Laplace transform to the pressure diffusion equation to solve the partial differential equation in the Laplace space. However, applying deconvolution to the numerical measurements from PDGs was not practical until a series of important works by von Schroeter et al. (2004). The difficulty is that the deconvolution process is actually a "desmoothing" process (Horne, 2007), because the forward convolution equation (Eq. 2.4) is a smoothing operation. This "desmoothing" process brings serious instability issues into the mathematical solution, especially when the data are noisy.

von Schroeter et al. (2004) made important breakthroughs by proposing a formulation that enables solving the deconvolution problem as a separable nonlinear Total Least Squares problem, and applied this algorithm to PDG data. A multitransient simulated example was demonstrated in their work, as shown in Fig. 2.14. The dashed curve is the deconvolution result, while the black and the grey dots are the true and noisy data. von Schroeter et al. developed a method by which a feasible and stable deconvolution process may be carried out.

Levitan et al. (2006) described a deconvolution technique using an unconstrained

Figure 2.14: Deconvolution applied on (a) the variable flow rate and (b) the multiple transients. The dashed curve is the deconvolution result, while the black and the grey dots are true and noisy data, from von Schroeter et al. (2004).

objective function constructed by matching pressure and pressure derivative generated

by the response functions derived from different pressure build-up (PBU) periods.

One successful application of their method is to recover the initial pressure. According

to their work, the initial reservoir pressure may be regressed until the deconvolution results in two or more PBUs converge, as illustrated in Fig. 2.15. This method was proven to be effective when the flow rate history is simple and the flow rate data are accurate, especially when the breakpoints of the flow rate are identified accurately.

Ahn and Horne (2008) proposed another deconvolution method using convex op-

timization approaches. The method permits the existence of noise in both pressure

and flow rate data and the accuracy of the deconvolution is assured by the iterative

convex optimization process. The method was successful on real field data, as shown

in Fig. 2.16. However, the method does not perform well if a buildup transient was

not fully developed in the deconvolution range. Especially when the data set is short

(not enough for a full transient) and noisy, the method will not work.

Most current deconvolution algorithms require breakpoint detection in advance. As discussed earlier, breakpoint detection with a high level of accuracy is a challenge. As discussed in Section 2.2.2, Nomura (2006) showed that a tiny miss in breakpoint detection may result in a huge deviation in the deconvolution result (Fig. 2.7). The instability of deconvolution algorithms, and the difficulty of highly accurate

Figure 2.15: (a) Using the initial reservoir pressure of 6310 psi leads to nonconvergence of the deconvolution results on two PBUs. (b) However, with the initial reservoir pressure of 6314.3 psi, the deconvolution results on the two PBUs converged with each other, from Levitan et al. (2006).

Figure 2.16: Deconvolution using convex optimization on the real field data: (a) shows the pressure matching after four iterations, and (b) shows the flow rate matching after four iterations, from Ahn and Horne (2008).

breakpoint detection are the two prime difficulties of deconvolution of PDG data. Nevertheless, the insights into the reservoir that would be brought by a successful deconvolution process are still irreplaceable in the PDG data analysis domain. Actually, the second target of this study, stated in Section 1.1, may obtain results similar to those of deconvolution. The detailed discussion will appear in later chapters.

2.4 Temperature Transient Analysis

Downhole temperature measurement by permanent downhole temperature gauges has been available since the initial development of PDGs. However, unlike the pressure data, which have been widely and deeply studied, the usage of PDG temperature measurements still remains mostly for well monitoring. For example, Kragas et al. (2004) demonstrated an example of downhole pressure and temperature from PDGs synchronized with the well events (Fig. 2.17). The events, such as perforation, may be clearly reflected in the pressure change as well as in the temperature change.

Fig. 2.17 actually shows another important relation behind the curves: the temperature also has transient behavior corresponding to the flow rate change, analogous to that of pressure. This has inspired work on temperature transient

Figure 2.17: Real time downhole temperature and pressure data from a PDG synchronized with the well events, from Kragas et al. (2004).

analysis.

Some fundamental work on temperature transient analysis has been achieved in Duru and Horne (2010, 2011). Duru and Horne (2010) derived a temperature transient model as a function of fluid properties, formation parameters, pressure, and flow rate. The model related the mass balance and the energy balance through the Joule-Thomson effect and the heat diffusion effect. Duru and Horne (2011) applied a Bayesian inversion method to solve the temperature transient model derived in Duru and Horne (2010). The Bayesian inversion method is stochastic and powerful. The method deconvolves variable-rate pressure data and extracts the pressure response kernel function. The method may even recover the flow rate history from the variable-rate temperature data. The method has a good tolerance of noise; that is, even with 10% noise, the method still works well. Fig. 2.18 shows a case of temperature and pressure transient analysis using the temperature model and the Bayesian inversion method. The pressure response kernel function (Fig. 2.18(b)) was extracted from the pressure data with 10% noise (Fig. 2.18(a)), while the flow rate history (Fig. 2.18(d)) was recovered from the temperature data (Fig. 2.18(c)).

Figure 2.18: Temperature and pressure transient analysis with the Bayesian inversion method. (a) the pressure data with 10% noise; (b) the extracted pressure response kernel function from the noisy pressure data; (c) the temperature data and the reproduced temperature; (d) the recovered flow rate history from the temperature data, from Duru and Horne (2011).

Temperature is not the main focus of this study. However, temperature data are another important data source for reservoir characterization. It could be very helpful for the current study to retain some flexibility and extensibility to incorporate cointerpretation using the pressure, flow rate, and temperature simultaneously. This flexibility and extensibility requires some foresight in the high-level architectural design of the method.

2.5 Data Mining

One of the fundamental properties of PDG data is the large number of data points.

This property associates the PDG data exploration with an idea from the computer

science domain, data mining. Data mining is a process to extract models from large

volumes of data. It seems promising to use data mining in the processing of PDG

data. Assuming the PDG data reflect the properties of the reservoir, proper data

mining algorithms may be able to extract the reservoir model from the PDG data

despite them being variable and noisy. However, up to now, few attempts to apply

data mining to PDG data have been made.

Liu (2009) proposed a data mining method applied on the pressure transient in

the Laplace space. The method first transformed all the PDG data from the real

time space to the Laplace space, then applied a data mining process, namely Locally

Weighted Projection Regression, on the transformed pressure transient in the Laplace

space, and finally inverted the prediction of the data mining process from the Laplace

space to the real time space. Fig. 2.19 shows a synthetic case, comparing the pressure

prediction after data mining in the Laplace space, the original noisy data, and the

underlying synthetic true data. Considering that the synthetic true data is invisible

to the data mining process, and that all that the data mining algorithm could see

are the noisy data, the prediction is good. Fig. 2.20 shows the result of a real field

case, in which the real pressure is unknown. Compared with Fast Fourier Transform

or wavelet methods which smooth and filter the data, the method removes the larger

noise of the PDG data while preserving the local variations. These local variations

may contain useful information from the subsurface rather than being just noise.


Figure 2.19: Pressure prediction in the real space after data mining in the Laplace space. (a) shows the pressure prediction compared with the noisy and synthetic true data; (b) shows a zoom-in view of the pressure prediction (Liu, 2009).

Figure 2.20: Pressure prediction on real data after data mining in Laplace space (Liu, 2009).


The reason that Liu (2009) chose to perform the data mining process in the Laplace space is that the pressure transients are convolved in the real time space but can be deconvolved easily in the Laplace space. However, Liu (2009) encountered the difficulty that more than 40% of the computational time was spent on the transformation and inversion between the real time space and the Laplace space. A data mining method applied directly in the real time space would be more desirable. Therefore, a fundamental target of the current study was that all the data mining algorithms should work directly in the real time space.

Chapter 3

Data Mining Concept and Simple Kernel

Data mining is a technique that is widely used in computer science. It is the process

of extracting patterns from data, and it plays a key role in many areas of science,

finance and industry. There are some examples of data mining described earlier in

Section 1.2.

Data mining commonly involves two main classes of tasks, regression and classification. Regression attempts to solve continuous-solution problems by finding a function that models the data with least error. In regression problems, the output is a continuous physical or mathematical variable. For example, in stock price prediction problems, the regression output is the predicted stock price, which is continuous. Classification intends to solve discrete-solution problems by categorizing the data into different groups with the least misclassification. In classification problems, the output is a discrete group label and usually has no physical or mathematical meaning. For example, in spam email classification problems, the output is binary, 0 or 1, representing whether the email is spam or not.

There are two major classes of data mining algorithms, supervised learning and unsupervised learning. Supervised learning is based on a training data set. For each training sample in the training data set, an input and an output are provided. The supervised learning algorithm aims to find the general correlation between the


input and the output by being trained on all the samples in the training data set. Take the spam mail case as an example. Usually people provide an email database as the training data set. In the training data set, each email and its corresponding spam-indicator (0 for a spam email and 1 for a nonspam email) form a training sample, in which the email is an input and the spam-indicator is an output. After being trained on all the samples in the training data set, the supervised learning algorithm is assumed to have obtained the relationship between the email and the spam-indicator, such that whenever a new email is received, the learning algorithm may be able to give a spam-indicator prediction of whether this email is spam or not. As the training data set acts as a guide or a teacher to the data mining algorithm, supervised learning is also named learning with a teacher.

In contrast to supervised learning, unsupervised learning is often called learning without a teacher. For an unsupervised learning algorithm, there is no training data set provided. The unsupervised learning algorithm is assumed to work directly on a data set and infer the relationships or properties among the variables in the data set. Association rules make a good demonstration: an unsupervised learning algorithm may dig into a transaction data set and find out, for example, that 80% of supermarket customers who buy beer also buy chips.

The data mining problem in this study is mainly a supervised regression problem.

The PDG data are used to construct the training data set, and the data mining

algorithm predicts the pressure as a function of flow rate and time. There is also

a study regarding the breakpoint detection via data mining approaches, described in

the appendix. That problem is an unsupervised classification problem in which the

data mining algorithm groups the PDG data points into transients.

3.1 Components of Learning Algorithm

In a data mining algorithm, there are three important components, including model,

cost function, and optimization search method. They are the core parts of a learning

algorithm, so a brief introduction of them is given here.


3.1.1 Model

The model serves as the pattern structure or underlying functional form sought from

the data. A model reflects the pattern structure of the observed data, and may also

provide the prediction given any predefined inputs, shown in Equation 3.1.

ypred = hθ (x) (3.1)

where x = (x1, x2, . . . , xNx)ᵀ in general is the input vector of input values, and Nx is the number of elements of each input vector. An element of the input vector is also called a "feature". hθ : ℝ^Nx → ℝ is a model, which in general is a nonlinear function. θ = (θ1, θ2, . . . , θNθ)ᵀ is a vector of model parameters, and Nθ is the number of model parameters. ypred is the prediction of the hypothesis hθ(x) at x. In this study θ will have no physical meaning and is needed only to train the algorithm.

In a few cases, the pattern of the data is known before data mining, and people may use the known pattern structure as the model. For example, in some seismic studies the forward model is known, and thus the forward model may be used as the model for the data mining. However, the pattern structure of the data is unknown in most cases. In this situation, the model is a pattern structure proposed intuitively or intentionally. Because complex models always lead to intensive computation, a linear model is often the first choice. Thus for the case of the PDG data model, as a preliminary investigation (only), we could use a linear model as expressed in Eq. 3.2:

hθ (x) = θTx = 〈θ,x〉 (3.2)

where 〈·, ·〉 is the inner product of two vectors. θ = (θ1, θ2, . . . , θNx)T is a vector of

model parameters with the same size as input vector x.

For PDG data, studied in this work, the input vector is defined throughout this

and the next section as:

x(i) = (x1(i), x2(i), x3(i))ᵀ = (1, q(i), t(i))ᵀ, i = 1, . . . , Np   (3.3)

where Np is the number of observed (measured) pressures, t(i) is the time at which the ith pressure was measured, and q(i) is the flow rate at time t(i). The constant value 1 is included to allow the linear expression to cover a possible offset. The model will predict the values ypred(i), which in fact are the predicted pressures ppred(i) at the times t(i) where the PDG pressures were measured. Thus the model for PDG data is:

ypred(i) = θᵀx(i), i = 1, . . . , Np   (3.4)

where the model parameter vector is:

θ = (θ1, θ2, θ3)ᵀ   (3.5)

The observed data are the pressures measured at time t(i), namely p(i):

yobs(i) = p(i), i = 1, . . . , Np (3.6)

When a model is provided, the data mining question actually becomes an optimization question: determine the values of θ that give the best fit of the data.
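As a minimal sketch of this optimization for the linear model of Eqs. 3.3 and 3.4 (the numbers are hypothetical, and the direct least-squares solve stands in for the iterative methods of Section 3.1.3):

    import numpy as np

    t = np.array([1.0, 2.0, 3.0, 4.0])               # measurement times
    q = np.array([100.0, 100.0, 50.0, 50.0])         # flow rates at those times
    p = np.array([4950.0, 4940.0, 4970.0, 4965.0])   # observed pressures

    # rows of X are the input vectors x(i) = (1, q(i), t(i)) of Eq. 3.3
    X = np.column_stack((np.ones_like(t), q, t))

    theta, *_ = np.linalg.lstsq(X, p, rcond=None)    # best-fit parameters
    p_pred = X @ theta                               # predictions, Eq. 3.4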

3.1.2 Cost Function

A cost function judges the quality of the model compared to the acquired data, and is denoted L(ypred, yobs). Here, ypred is the prediction from the model, as stated in Eq. 3.1, and yobs is the observation, y(i). Considering that the model ypred = hθ(x) is also a function of θ given the input data x, the cost function is essentially a function of θ, such that L = L(θ). This reveals the essence of a cost function as an evaluation of the parameter vector θ. In this study, the least-mean-square (LMS) cost function is employed, as shown in Eq. 3.7:

LLMS(θ) = (1/2) ∑_{i=1}^{Np} ( hθ(x(i)) − y(i) )²   (3.7)

The LMS cost function emphasizes the fitting of the model to the observed data. However, it tends to produce very complex parameters θ, which results in overfitting of the data. A better way to restrain θ is to add a penalty term to the cost function, as shown in Equation 3.8:

LMAP(θ) = (1/2) ∑_{i=1}^{Np} ( hθ(x(i)) − y(i) )² + c ‖θ‖   (3.8)

The cost function in Equation 3.8 is also called the Maximum A Posteriori (MAP) cost function (Koller and Friedman, 2009). When the vector θ is too complex, ‖θ‖, the norm of θ, becomes larger, and the cost is increased through the penalty term. So the MAP cost function restrains the model in terms of both the data fitting and the model structure. The weight between the two is controlled by the coefficient c.
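Both cost functions translate directly into code; a minimal sketch (reading ‖θ‖ as the Euclidean norm is an assumption of this sketch):

    import numpy as np

    def lms_cost(theta, X, y):
        """Eq. 3.7: half the sum of squared residuals."""
        r = X @ theta - y
        return 0.5 * r @ r

    def map_cost(theta, X, y, c):
        """Eq. 3.8: the LMS cost plus a norm penalty on theta, weighted by c."""
        return lms_cost(theta, X, y) + c * np.linalg.norm(theta)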

When the model of a data mining algorithm is fixed, the cost function needs careful selection to regularize the model in the expected way. In this work, the LMS cost function was taken. The LMS was chosen as the cost function not only because of its simplicity: if the data mining model is linear, the LMS cost function is a convex function whose Hessian matrix is positive semi-definite, that is, every local minimum of the cost function is a global minimum. This property is very useful in data mining, because it releases the optimization method from choosing a very good initial guess that has to be very close to the global minimum.

To prove that the LMS cost function is convex, we may examine the element (p, q) of the Hessian matrix H of L(θ), which is:

Hpq = ∂²L(θ) / (∂θp ∂θq) = ∑_{i=1}^{Np} xp(i) xq(i)   (3.9)

Then for any vector z,

zᵀHz = ∑_{p=1}^{Nθ} ∑_{q=1}^{Nθ} zp zq Hpq
     = ∑_{p=1}^{Nθ} ∑_{q=1}^{Nθ} zp zq ∑_{i=1}^{Np} xp(i) xq(i)
     = ∑_{i=1}^{Np} ∑_{p=1}^{Nθ} ∑_{q=1}^{Nθ} zp zq xp(i) xq(i)
     = ∑_{i=1}^{Np} ( ∑_{p=1}^{Nθ} zp xp(i) ) ( ∑_{q=1}^{Nθ} zq xq(i) )
     = ∑_{i=1}^{Np} ( ∑_{p=1}^{Nθ} zp xp(i) )²
     ≥ 0   (3.10)

The derivation in Eq. 3.10 shows that the Hessian matrix in Eq. 3.9 is positive semi-definite (and positive-definite whenever the input vectors x(i) span the parameter space), so the LMS cost function is indeed convex.

3.1.3 Optimization Search Method

The optimization search method minimizes the cost function. Any data mining algo-

rithm requires minimization of the cost function to a global minimum robustly and

efficiently. There are generally two kinds of optimization methods, gradient-based and

non-gradient-based. The gradient-based methods are usually very fast, like Steepest

Gradient Descent methods, Conjugate Gradient Descent, etc. In early stages of this

research, the Steepest Gradient Descent method was used. The gradient descent

method starts from an initial guess, and performs updates repeatedly as shown in

Eq. 3.11.

θ[i+1]j = θ

[i]j − α

∂θjL (θ) = θ

[i]j − α

Np∑

k=1

(

hθ[j](

x(k))

− y(k))

x(k)j (3.11)

where α is the learning rate, whose proper value is chosen by experience¹, and [m] denotes the mth iteration.

Eq. 3.11 could be realized by the Batch Gradient Descent algorithm, as shown by

pseudocode in Algorithm 1. Actually, this is too expensive computationally, especially

when the training data set is very large. Also, it is not necessary to use the whole data

set to do the training in a single update step because it is very possible that θ has

converged before the whole data set is applied. Therefore, a better way to implement

the gradient descent method is to use Stochastic Gradient Descent algorithm, as

shown by pseudocode in Algorithm 2. In the Stochastic Gradient Descent algorithm,

each time θ is updated by a single sample only. The training is processed repeatedly

until θ has converged.

Algorithm 1 Batch Gradient Descent

    iter = 0                                  ▷ initialize the iteration counter
    θ[0] = 0                                  ▷ initial guess of θ
    while θ[iter] is not converged do
        θ[iter+1] = θ[iter]
        for i = 1 to Np do
            θ[iter+1] = θ[iter+1] − α (hθ[iter](x(i)) − y(i)) x(i)    ▷ update θ by all samples
        end for
        iter = iter + 1                       ▷ update the iteration counter
    end while

With the Stochastic Gradient Descent method, the training process in Eq. 3.11 may be simplified into Eq. 3.12:

θj[i+1] = θj[0] − α ∑_{k=0}^{i} ( hθ[k](x(k+1)) − y(k+1) ) xj(k+1)   (3.12)

¹ α does not have a fixed range. A large α will lead to fast convergence at the beginning and a zigzag oscillation at the end, which very possibly leads to nonconvergence; a small α can improve the convergence at the end, but it converges very slowly at the beginning. A proper strategy to select α is trial-and-error. In the coming chapters, the Conjugate Gradient method will replace the current Steepest Gradient Descent method to accelerate the convergence rate as well as avoid the tricky selection of α. This is described later, in Section 4.3.

Algorithm 2 Stochastic Gradient Descent

    iter = 0                                  ▷ initialize the iteration counter
    θ[0] = 0                                  ▷ initial guess of θ
    while θ[iter] is not converged do
        i = ((iter + 1) mod Np) + 1
        θ[iter+1] = θ[iter] − α (hθ[iter](x(i)) − y(i)) x(i)    ▷ update θ by a single sample
        iter = iter + 1
    end while
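A minimal Python sketch of Algorithm 2 for the linear model follows; the learning rate and the convergence test are illustrative assumptions, and the inputs are assumed to be scaled to moderate magnitudes.

    import numpy as np

    def sgd_train(X, y, alpha=1e-3, tol=1e-10, max_iter=100_000):
        """Stochastic Gradient Descent for the linear model (Algorithm 2)."""
        Np, Nx = X.shape
        theta = np.zeros(Nx)                   # initial guess theta[0] = 0
        for it in range(max_iter):
            i = it % Np                        # cycle through the samples
            step = alpha * (X[i] @ theta - y[i]) * X[i]
            theta -= step
            if np.linalg.norm(step) < tol:     # crude convergence test
                break
        return theta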

If we make an initial guess of θ[0] as 0, then Eq. 3.12 will be simplified as Eq. 3.13.

θj[i+1] = α ∑_{k=0}^{i} ( y(k+1) − hθ[k](x(k+1)) ) xj(k+1)   (3.13)

Eq. 3.13 is the training equation of the learning process. According to the model we selected, the prediction equation may be written easily as Eq. 3.14, where xpred is a given input at which a prediction is required:

ypred = hθ(xpred) = θᵀxpred   (3.14)

In a supervised learning process, the cost function is a function of the parameters

θ and the training data. The optimization process is a process of obtaining the best

θ through utilizing the training data set to minimize the cost function. The process

looks like using the training data set to train the model parameter θ. This is where

the word “train” originates.

With these three components, a data mining algorithm could be performed, that

is, given a model and a cost function, a proper optimization method may find the

best parameters in the model to produce the least cost. Now that the whole data

mining process has been discussed, it is appropriate to discuss some of its problems.

The model proposed in this section is the most generic linear model. The difficulty of

proposing this model is that we force the pattern structure behind the PDG data to

be linear, or in other words, the model only captures the linearity of the PDG data.

However it is well understood that the PDG data are nonlinear at the large scale,

as shown in Fig. 1.3. Apparently from the curve, it may be seen that the pressure


transient is not related to the flow rate in a linear manner. Essentially, the pressure

transient is the convolution result of flow rate change events, which means that each pressure is affected by all previous flow rate events starting at different times. Because

the nonlinearity is dominant throughout the whole duration of the PDG data, failing

to capture the nonlinearity will lead to failure to obtain the reservoir model, and

ultimately will fail to give a correct interpretation of the PDG data. So the question

becomes how to capture the nonlinearity with a linear model in the data mining

process.

3.2 Kernelization

An easy way to capture the nonlinearity by a linear model is to use a transformation

on the input vector x. For example, suppose the actual pattern structure behind the

data pair (y, z) is y = θ1 + θ2z + θ3z2, and suppose the model is linear as y = θTx. If

x is defined as:

x = (1, z)ᵀ   (3.15)

then in this two-dimensional space of (1, z)ᵀ, the linear model hθ(x) = θᵀx will only capture the linearity of y, and the second-order nonlinearity z² could not be captured. However, if there exists a transformation Φ(x) over vector x such that:

Φ(x) : (1, z)ᵀ ↦ (1, z, z²)ᵀ   (3.16)

Then in this three-dimensional space of (1, z, z2)T, the linear model hθ (x) = θTΦ (x)

will capture the quadratic nonlinearity of y. This example reveals that the nonlinearity

in a low-dimensional space could be approached by the linearity in a high-dimensional

space. The Φ (x) here is just a general form of transformation over vector x. We

actually do not know the dimension of Φ (x). It may be more than Nx, may be less,

and may be equal. Considering that we would like to capture more nonlinearity by


imposing Φ (x), the dimension of Φ (x) will be mostly more than Nx. In the example

in Eq. 3.16, the dimension of Φ (x) is three, which is greater than the dimension of x.

Correspondingly, the dimension of θ will be the same as that of Φ(x), rather than that of x. To be clear, θ has the same dimension as x in the original linear model, while θ has the same dimension as Φ(x) once the transformation is applied.

Therefore, we may slightly modify Eq. 3.13 and Eq. 3.14 to reflect this transfor-

mation. The input variable x is replaced by Φ (x). The training equation becomes:

θj[i+1] = α ∑_{k=0}^{i} ( y(k+1) − hθ[k](Φ(x(k+1))) ) Φj(x(k+1))   (3.17)

and the prediction equation becomes:

ypred = hθ(Φ(xpred)) = θᵀΦ(xpred)   (3.18)

This form of the model (Eq. 3.17 and Eq. 3.18) will capture different nonlin-

earities according to different selections of the transformation. For the case dis-

cussed, to capture the nonlinearity of a p-degree polynomial, a transformation of

Φ(x) = (1, z, z², . . . , zᵖ)ᵀ has to be constructed explicitly. This brings in another problem: the more nonlinearity we would like to capture, the more complex the transformation Φ(x) will be, and the more computation would be required. Another

difficulty is that we may not know the relevant functional form from Φ (x) in advance.

It is very natural to ask “is it possible that we construct Φ (x) without writing Φ (x)

out explicitly?”

If we multiply Eq. 3.17 by Φ(x(i+2)), we have:

θ[i+1]ᵀ Φ(x(i+2)) = α ∑_{k=0}^{i} ( y(k+1) − hθ[k](Φ(x(k+1))) ) Φ(x(k+1))ᵀ Φ(x(i+2))
                  = α ∑_{k=0}^{i} ( y(k+1) − θ[k]ᵀ Φ(x(k+1)) ) Φ(x(k+1))ᵀ Φ(x(i+2))
                  = α ∑_{k=0}^{i} ( y(k+1) − θ[k]ᵀ Φ(x(k+1)) ) ⟨Φ(x(k+1)), Φ(x(i+2))⟩
                  = α ∑_{k=0}^{i} ( y(k+1) − θ[k]ᵀ Φ(x(k+1)) ) K(x(k+1), x(i+2))   (3.19)

Finally, we write out the new form of the training equation as:

θ[i+1]ᵀ Φ(x(i+2)) = α ∑_{k=0}^{i} ( y(k+1) − θ[k]ᵀ Φ(x(k+1)) ) K(x(k+1), x(i+2))   (3.20)

Here, K(x, z) is named the kernel function, defined by the inner product of the transformations, as shown in Eq. 3.21:

K(x, z) = ⟨Φ(x), Φ(z)⟩   (3.21)

Although the kernel function is defined by the inner products of Φ(x), it usually does not require the explicit formation of Φ(x) to make the computation. Assume x = (x1, x2, x3)ᵀ and z = (z1, z2, z3)ᵀ; two classical kernel functions and their corresponding Φ(x) are shown in Table 3.1. The first kernel function maps x from the three-dimensional space onto a nine-dimensional space Φ(x), while the second kernel function maps onto a 13-dimensional space. Although the two Φ(x) in the table are in high-dimensional spaces, the calculations of K(x, z) are both done in the three-dimensional space, which significantly improves the performance. This demonstrates that the kernel function may realize a transformation onto a high-dimensional space with the calculation in the low-dimensional space.


Table 3.1: Kernel Function and the Corresponding Φ(x) (Ng, 2009)

    K(x, z) = (xᵀz)²:
        Φ(x) = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3)ᵀ

    K(x, z) = (xᵀz + c)²:
        Φ(x) = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3,
                √(2c) x1, √(2c) x2, √(2c) x3, c)ᵀ

The proof of the first correspondence between the kernel function K(x, z) and the transformation Φ(x) in Table 3.1 is shown as follows (Ng, 2009).

Proof. Assume x = (x1, x2, x3)ᵀ and z = (z1, z2, z3)ᵀ. Then:

K(x, z) = (xᵀz)²
        = (x1z1 + x2z2 + x3z3)²
        = x1²z1² + x2²z2² + x3²z3² + 2x1z1x2z2 + 2x1z1x3z3 + 2x2z2x3z3
        = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3)
          (z1z1, z1z2, z1z3, z2z1, z2z2, z2z3, z3z1, z3z2, z3z3)ᵀ

Considering K(x, z) = Φ(x)ᵀΦ(z), we have:

Φ(x) = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3)ᵀ

and:

Φ(z) = (z1z1, z1z2, z1z3, z2z1, z2z2, z2z3, z3z1, z3z2, z3z3)ᵀ

Here we show that when we use the kernel function K(x, z) = (xᵀz)², we are implicitly using Φ(x) = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3)ᵀ. This enables us to make the calculation in the three-dimensional space (the space of vector x), while running the learning algorithm in the nine-dimensional space (the space of Φ(x)).
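This equivalence is easy to verify numerically; a minimal sketch with arbitrary vectors:

    import numpy as np

    def phi(x):
        """Explicit feature map for K(x, z) = (x^T z)^2 with x in R^3 (Table 3.1)."""
        return np.array([xi * xj for xi in x for xj in x])   # nine components

    x = np.array([1.0, 2.0, 3.0])
    z = np.array([0.5, -1.0, 2.0])
    lhs = np.dot(x, z) ** 2            # kernel computed in three dimensions
    rhs = np.dot(phi(x), phi(z))       # inner product in nine dimensions
    assert np.isclose(lhs, rhs)        # both evaluate to the same number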

However, although the kernel function only evaluates on vectors x and z, not all functions evaluating on x and z are valid kernel functions. A valid kernel is also called a Mercer kernel, because Mercer developed the criteria for kernel validity, namely the Mercer Theorem.

Mercer Theorem (Ng, 2009). Let K : ℝⁿ × ℝⁿ → ℝ be given. Then for K to be a valid (Mercer) kernel, it is necessary and sufficient that for any {x(1), . . . , x(m)}, m < ∞, the corresponding kernel matrix K is symmetric positive semi-definite, where the kernel matrix K is defined so that its (i, j)-entry is given by Kij = K(x(i), x(j)).
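A minimal numerical sketch of this criterion checks the eigenvalues of a kernel matrix built from arbitrary inputs; the polynomial kernel below is the form used later in this chapter.

    import numpy as np

    def kernel_matrix(K, xs):
        """Kernel matrix with entries Kij = K(x(i), x(j))."""
        m = len(xs)
        return np.array([[K(xs[i], xs[j]) for j in range(m)] for i in range(m)])

    K_poly = lambda x, z: (1.0 + np.dot(x, z)) ** 3
    xs = [np.random.randn(3) for _ in range(20)]
    eigvals = np.linalg.eigvalsh(kernel_matrix(K_poly, xs))
    print(eigvals.min() >= -1e-9)      # True: symmetric positive semi-definite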

By using these "kernel tricks", although Eq. 3.20 involves Φ(x), it does not require Φ(x) to be written out explicitly. The term θ[k]ᵀΦ(x(k+1)) is a recurrence of the left-hand side of Eq. 3.20, which is calculated and stored in the previous iterations. The ⟨Φ(x(k+1)), Φ(x(i+2))⟩ term is replaced by a kernel function K(x(k+1), x(i+2)), which is expressed in terms of x(k+1) and x(i+2) instead. Therefore, Eq. 3.20 makes the calculation using Φ(x) implicitly. In practice, for well test data interpretation, this means that we match the measured data without knowing in advance what the reservoir model is – in fact, the reservoir model is discovered in the process.

Similarly, the prediction equation, Eq. 3.18, becomes:

ypred = hθ(Φ(xpred)) = ∑_{k=0}^{Niter} α ( y(k+1) − θ[k]ᵀ Φ(x(k+1)) ) K(x(k+1), xpred)   (3.22)

where Niter is the total number of iterations before convergence. Usually Niter is larger than Np because the training data will be used repeatedly before θ converges. Even though there is a term θ[k]ᵀΦ(x(k+1)) in Eq. 3.22, the equation still does not require us to know Φ(x) explicitly. This is because the term θ[k]ᵀΦ(x(k+1)) is the recurrent term stored during the training process (the left-hand side of the training equation, Eq. 3.20). So we do not have to know Φ(x) at all.

Eq. 3.20 and Eq. 3.22 kernelize the linear model with a kernel function enabling

the data mining algorithm to capture the nonlinearity using a linear model with

the calculation still in low-dimensional space. This process is named “kernelization”.

With the kernelized data mining algorithm, we are able to formulate different methods

to perform data mining on the PDG data. Those methods may be classified into two

categories, kernelized data mining without breakpoint detection, and kernelized data

mining with breakpoint detection. These will be discussed separately in the following

sections.
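Before moving on, the training and prediction equations (Eq. 3.20 and Eq. 3.22) can be condensed into a minimal Python sketch. The coefficient list below plays the role of the stored recurrent terms, each coefficient being one α(y(k+1) − θ[k]ᵀΦ(x(k+1))) term; the fixed number of passes, the learning rate, and the kernel in the usage line are illustrative assumptions.

    import numpy as np

    def kernel_sgd_train(X, y, K, alpha=1e-3, n_pass=50):
        """Kernelized training (cf. Eq. 3.20): instead of updating theta, store
        one coefficient per update, equal to alpha times the current residual."""
        Np = len(y)
        centers, beta = [], []
        for it in range(n_pass * Np):
            i = it % Np                        # cycle through the training samples
            # current prediction theta[k]^T Phi(x(i)), expressed through the kernel
            f_xi = sum(b * K(xc, X[i]) for b, xc in zip(beta, centers))
            beta.append(alpha * (y[i] - f_xi))
            centers.append(X[i])
        return centers, beta

    def kernel_predict(centers, beta, K, x_pred):
        """Prediction (cf. Eq. 3.22)."""
        return sum(b * K(xc, x_pred) for b, xc in zip(beta, centers))

    # usage with the polynomial kernel of this chapter:
    # K = lambda x, z: (1.0 + np.dot(x, z)) ** 3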

3.3 Kernelized Data Mining without Breakpoint Detection

The reservoir response is a linear combination, by superposition, of the responses to individual flow rate changes. Fig. 3.1 demonstrates the formation of the superposition. Assume there are two flow rates q1 and q2 starting at different times. When either

of them appears individually, the reservoir will respond with pressure changes p1 and

p2 respectively. When they appear together, the flow rates will be combined, shown in

the figure as a constant flow rate followed by a zero flow rate. The pressure responses

are also summed up so that the pressure curve behaves as a drawdown followed by a

buildup. What is observed by the PDG is the combination result after superposition.

In fact, it will be much easier for the data mining algorithm to converge if trained

with separated constant flow rate and corresponding pressure responses. Therefore,

it will be very beneficial if the real breakpoints, where the real flow rate change events

happen, could be detected. However, considering the flow rate and pressure signals


from subsurface are very noisy (refer to Fig. 1.3), it is very difficult in practice to

decide whether the change is caused by noise or by a real flow rate change event. To

avoid the need to detect breakpoints, an easy solution is to treat all the points as

breakpoints, so that there is no need to know where the real breakpoints are.

Figure 3.1: The demonstration of superposition.

Eq. 3.23 expresses the new form of input variable x(i). This reflects the superpo-

sition effect in the PDG data formation. In this expression, each sample point before

time t(i) is treated as a flow rate change event, although they do not necessarily have

to be. This approach releases the algorithm from detecting the breakpoints.

x(i) = ( 1,
         ∑_{j=1}^{i−1} (q(j) − q(j−1)),
         ∑_{j=1}^{i−1} (q(j) − q(j−1)) log(t(i) − t(j)) )ᵀ   (3.23)
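A minimal sketch of constructing this input vector from sampled times and rates follows (zero-based array indexing; t is assumed strictly increasing, and i ≥ 2):

    import numpy as np

    def superposition_features(t, q, i):
        """Build x(i) of Eq. 3.23; every earlier sample is treated as a
        potential flow rate change event."""
        dq = q[1:i] - q[:i-1]                  # q(j) - q(j-1), j = 1, ..., i-1
        dlogt = np.log(t[i] - t[1:i])          # log(t(i) - t(j))
        return np.array([1.0, dq.sum(), (dq * dlogt).sum()])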


In addition to the processing of superposition in the input variable, the feature selection of the input variable is another problem. Three components are shown in Eq. 3.23: 1, q, and q log t, with 1 as the constant to cover the offset. Actually, this constant term could be taken off from the input variable if K(x, z) = (1 + xᵀz)ᵈ is used as the kernel function, where d is an integer power no less than 1. The ∑ (q(j) − q(j−1)) term is the flow rate change term after the superposition processing. The last term, ∑ (q(j) − q(j−1)) log(t(i) − t(j)), is the superposition of the log ∆t term, which is dominant during infinite-acting radial flow. In addition to those two terms, Table 3.2 shows some typical reservoir behaviors and their corresponding dominant terms that could be used as the data mining features. The pure wellbore storage effect of a single flow rate change event is an exponential function of the flow rate and the time. Consider that the kernel function tries to approach the reservoir behavior by linear functions in the high-dimensional space. Therefore, the Taylor expansion is used to represent the exponential function by a linear summation in the pseudospace. For other behaviors that may exist in the reservoir response, we also expect that the Taylor expansion may capture some of them.

Table 3.2: Reservoir behavior and input features

    Reservoir Behavior                       Data Mining Features
    Infinite-acting radial flow              ∆q log ∆t
    Closed boundary (pseudosteady state)     ∆q ∆t
    Constant pressure boundary               ∆q ∆t
    Skin factor                              ∆q ∆t
    Wellbore effect                          ∆q (∆t ⊕ (∆t)² ⊕ . . .)
    Others                                   ∆q (∆t ⊕ (∆t)² ⊕ . . .)

According to Table 3.2, to better capture the different reservoir behaviors, we need a series of terms in the input vector, including ∆q, ∆q log ∆t, ∆q ∆t, ∆q (∆t)², and other ∆q (∆t)ⁿ terms to perform the Taylor expansion. Because the Taylor expansion will require several high-order terms to keep the accuracy, there will be a balance between the kernel function and the input vector. For one selection, we constructed a complex input vector containing high-order terms, and a low-order kernel function; for the other, we constructed a simple input vector containing low-order terms only, and a high-order kernel function to form the high-order terms in the pseudo-high-dimensional space. Table 3.3 shows the comparison between the two methods. The two methods are named Method A and Method B for further discussion.

Table 3.3: Input vectors and kernel functions for Method A and Method B

Method A

    Input Vector:
        x(i) = ( ∑_{j=1}^{i−1} (q(j) − q(j−1)),
                 ∑_{j=1}^{i−1} (q(j) − q(j−1)) log(t(i) − t(j)),
                 ∑_{j=1}^{i−1} (q(j) − q(j−1)) (t(i) − t(j)),
                 ∑_{j=1}^{i−1} (q(j) − q(j−1)) (t(i) − t(j))²,
                 . . . ,
                 ∑_{j=1}^{i−1} (q(j) − q(j−1)) (t(i) − t(j))⁶ )ᵀ

    Kernel Function: K(x, z) = (1 + xᵀz)¹

Method B

    Input Vector:
        x(i) = ( ∑_{j=1}^{i−1} (q(j) − q(j−1)),
                 ∑_{j=1}^{i−1} (q(j) − q(j−1)) log(t(i) − t(j)),
                 ∑_{j=1}^{i−1} (q(j) − q(j−1)) (t(i) − t(j)) )ᵀ

    Kernel Function: K(x, z) = (1 + xᵀz)³

It is correct to say that Methods A and B have different numbers of model parame-

ters, because they have different kernel functions or different input vectors. However,

it is not that important to discuss the parameters θ separately because with the

kernel-based learning algorithm, the learning parameters θ were coupled with the

Φ (x) term, referring to Eq. 3.20. That is, rather than training θ, we actually trained

θTΦ (x). Also Φ (x) has never been required explicitly in the training or prediction

process due to the introduction of the kernel function, and actually we do not know

the exact form of Φ (x) either. Eventually, the outputs of the learning and prediction

process are neither θ nor Φ (x), but the pressure prediction (ypred) given any xpred,

referring to Eq. 3.22.

We will next discuss the kernelized data mining with breakpoint detection, and

then apply all three methods to synthetic data sets in Section 3.5.


3.4 Kernelized Data Mining with Breakpoint Detection

In this section, a third method, Method C, is introduced. Unlike Methods A and B, which do not require the knowledge of breakpoints, Method C requires the knowledge of all breakpoints (provided by the user or by an external algorithm).

Before discussing the performance of the methods, let us revisit the construction of the superposition input variable, as shown in Eq. 3.23. Each element of the vector is a linear combination of terms because the superposition is linear. Because the kernelization is essentially a linear combination in a high-dimensional space, the summation terms in each feature of vector x in Eq. 3.23 can be incorporated into the kernelization without being written out explicitly, provided a proper form of input variable is selected. That is to say, with a properly chosen input variable, the superposition is reflected automatically in the kernelization.

So we could define two new features:

q_j^{(i)}: the jth constant flow rate share of flow rate q^{(i)}

t_j^{(i)}: the time elapsed from the start of the jth constant flow rate share to the time t^{(i)}

Fig. 3.2 shows a demonstration of the variables for the input vector x^{(i)}.

A new input variable was constructed as shown in Table 3.4. The reason we constructed x^{(i)} in this way is that all the summation terms of the superposition in Eq. 3.23 can be written as linear combinations of the elements of the newly constructed input vector x^{(i)}. This enables the superposition to be reflected automatically in the kernelization process.

Here, k is the total number of flow rate change events before t^{(i)}.

We call this Method C. We tested the performance of the three methods using a series of synthetic data sets.


Figure 3.2: Demonstration of the construction of the feature-based input variable.

Table 3.4: Input vector and kernel function for Method C

Method C

Input Vector:

x^{(i)} = [ q_1^{(i)}, …, q_k^{(i)}, t_1^{(i)}, …, t_k^{(i)}, \log t_1^{(i)}, …, \log t_k^{(i)} ]^T

Kernel Function:

K(x, z) = (1 + x^T z)^3


3.5 Application on Synthetic Cases

To test the three methods, a test workflow was formed as follows.

1. Construct a synthetic pressure and flow rate data set, and add 3% artificial noise (normally distributed) to both the pressure and the flow rate data².

2. Use the synthetic data set (with artificial noise) as the training data set. Apply

the three kernelized data mining algorithms (Methods A, B and C) to learn the

data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without noise) and collect the prediction from the data mining algorithm.

4. Compare the predicted pressure data (in Step 3) with the synthetic pressure

data without noise (in Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate in Step 5

using the same wellbore/reservoir model in Step 1.

7. Compare the predicted pressure data (in Step 5) with the synthetic pressure

data (in Step 6).

In these test steps, Steps 1-2 train the learning algorithm, while Steps 3-7 make the

prediction. In the prediction, we first try to reproduce the training data set by feeding

it with clean (noise-free) flow rate history (Steps 3-4), and then we make a prediction

for a constant flow rate history (Step 5). From a practical standpoint, Step 3 generates a noise-reduced version of the data, while Step 5 effectively performs a deconvolution. These two steps correspond to the two specific targets of this study.

²In this study, the artificial noise was most often 3%, normally distributed. In real practice, the noise in PDG data is commonly less than 1% (normally distributed) if no mechanical problems exist. We made the artificial noise larger to ensure that the method remains feasible in a harsher environment.

Because the test data are constructed using a synthetic model, we

can make a comparison between the prediction and the true data to evaluate the

accuracy of the prediction. In the data mining view, the machine learning algorithm acquires the reservoir model once trained on the training data set; the prediction may then be made for any given flow rate history. So for all the cases, we show at least two predictions. One is a reproduction of the training data set, which has a variable flow rate and the same time length as the training data set. The other prediction is for a constant flow rate of shorter duration; this helps avoid the misconception that predictions can only be made over the same time span as the training data set. One important note is that although the synthetic

true data without noise are generated by the wellbore/reservoir model, the actual

training data were noisy in both flow rate and pressure because artificial noise was

added, and the noise-free true data remain invisible to the machine learning algorithm throughout the whole process. There were nine synthetic test cases using this test workflow, listed in Table 3.5. To simplify the figures, the actual training data are shown only for Case 1.
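For concreteness, Steps 1-2 of the workflow might begin as in the sketch below. The "true" model here is a toy superposition-of-logs stand-in for the wellbore/reservoir simulator, and all names and constants are illustrative assumptions, not the dissertation's code.

    import numpy as np

    rng = np.random.default_rng(0)

    def true_pressure(t, q, c=2.0):
        """Toy model: superpose log responses of all prior rate changes."""
        dq = np.diff(q, prepend=0.0)
        p = np.zeros_like(t)
        for i, ti in enumerate(t):
            elapsed = ti - (t[:i + 1] - 1.0)   # time since each rate change
            p[i] = -c * np.sum(dq[:i + 1] * np.log(elapsed))
        return p

    # Step 1: synthetic histories, 3% normally distributed noise on both signals
    t = np.arange(1.0, 201.0)                            # 1 h sampling
    q_true = np.repeat(rng.uniform(20.0, 80.0, 10), 20)  # variable rate history
    p_true = true_pressure(t, q_true)
    q_train = q_true * (1 + 0.03 * rng.standard_normal(q_true.shape))
    p_train = p_true * (1 + 0.03 * rng.standard_normal(p_true.shape))

    # Steps 2-4: train on (q_train, p_train), then feed the clean rate
    # history back in and compare the prediction with p_true (e.g. RMS error).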

Table 3.5: Test cases for simple kernel method

Test Case #   Test Case Characteristics
1             Infinite-acting radial flow
2             Infinite-acting radial flow + wellbore effect
3             Infinite-acting radial flow + skin
4             Infinite-acting radial flow + wellbore effect + skin
5             Infinite-acting radial flow + closed boundary (pseudosteady state)
6             Infinite-acting radial flow + constant pressure boundary
7             Infinite-acting radial flow + wellbore effect + skin + closed boundary
8             Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary
9             Infinite-acting radial flow + dual porosity

All data used to generate the synthetic cases are listed in Appendix A. The results of the test sets are shown below one by one.


3.5.1 Case 1: Infinite-acting Radial Flow

For the tests, we started with the easiest scenario, namely the infinite-acting radial

flow case. In this case, only the infinite-acting radial flow behavior exists in the

data. The training data are shown in Fig. 3.3(a), and there is no skin, no wellbore

effect, and no boundary. Fig. 3.3(b) shows that all three methods give a good reproduction of the infinite-acting radial flow behavior. For the constant flow rate test in Fig. 3.3(c) and Fig. 3.3(d), the three methods perform equally well in the Cartesian plot, but the log-log plot shows that Method A has better accuracy throughout the whole test period, and that Methods B and C have a small drop-down at the end of the prediction (the last half log cycle). From this case on, the figure showing the training data and the figure showing the prediction using the constant flow rate on the Cartesian plot will be omitted.

3.5.2 Case 2: Infinite-acting Radial Flow + Wellbore Effect

The second test set has one more feature than the first, namely a wellbore effect. The predictions for the variable flow rate and the constant flow rate are shown in Fig. 3.4(a) and Fig. 3.4(b). From the figures, we may observe that Method A captures the overall trend but misses the wellbore effect at the beginning; Methods B and C work better than Method A in this case. Because the training data set has a flow rate history with 1 h intervals, the prediction is also made for a flow rate history with 1 h intervals. The first point is therefore at 1 h but has no calculated derivative, so all derivatives start from 2 h. Consequently, in Fig. 3.4(b), the unit-slope line of the early time wellbore effect cannot be seen before 2 h. However, because there is no skin in this case, the hump in the derivative is caused purely by the wellbore effect. The derivatives of Methods B and C in the predicted pressure reproduce the hump, which is an indication of capturing the wellbore effect.

3.5.3 Case 3: Infinite-acting Radial Flow + Skin

This test case adds skin to the infinite-acting radial flow. Usually the skin factor does not change the shape of the curve much. Therefore, Fig. 3.3(d) and Fig. 3.5(b)


Figure 3.3: Data mining results using simple kernel methods on Case 1: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a Cartesian plot; (d) prediction using the constant flow rate history on a log-log plot.

Figure 3.4: Data mining results using simple kernel methods on Case 2: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

are very much alike. The prediction results are also similar to those of Case 1 (Section 3.5.1): Method A captures the behavior better than either Method B or Method C.

Figure 3.5: Data mining results using simple kernel methods on Case 3: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

3.5.4 Case 4: Infinite-acting Radial Flow + Wellbore Effect

+ Skin

In this case we added both the wellbore effect and the skin to the infinite-acting radial flow. Compared to the case in Section 3.5.2, which contains only the wellbore effect, the skin sharpens the curve: the first derivative point in Fig. 3.4(b) is around 140, while the first derivative point in Fig. 3.6(b) is around 155. The added skin leads to more deviation of Method A in Fig. 3.6(b). However, Methods B and C still work very well in this case.

Figure 3.6: Data mining results using simple kernel methods on Case 4: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

3.5.5 Case 5: Infinite-acting Radial Flow + Closed Boundary

(Pseudosteady State)

Sections 3.5.5 to 3.5.8 show tests of the cases with reservoir boundaries. In the data set in this section, only infinite-acting radial flow and the closed boundary (pseudosteady state) are present. Fig. 3.7(a) shows the prediction for the variable flow rate history. All three methods captured the boundary behavior, but Methods B and C introduced a spurious wellbore-like behavior at the beginning. Method A performed well over the whole prediction.

Figure 3.7: Data mining results using simple kernel methods on Case 5: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

3.5.6 Case 6: Infinite-acting Radial Flow + Constant Pres-

sure Boundary

This case tested the constant pressure boundary behavior. From Fig. 3.8(a) and Fig. 3.8(b), we see that all three methods worked well over the majority of both the variable flow rate and constant flow rate scenarios. However, the log-log plot shows that Methods B and C begin to deviate from the true data in the last half log cycle.

3.5.7 Case 7: Infinite-acting Radial Flow + Wellbore Effect

+ Skin + Closed Boundary

This is a comprehensive case in which four different features exist at the same time. Method A captures the main trend of the curve in the Cartesian plot, but in the log-log plot, Fig. 3.9(b), Methods B and C show their advantage over Method A. Failing to capture the wellbore effect is the reason why Method A deviates at the turns of the pressure transient curve. Methods B and C reproduced all four features in this case.

Figure 3.8: Data mining results using simple kernel methods on Case 6: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

Figure 3.9: Data mining results using simple kernel methods on Case 7: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.


3.5.8 Case 8: Infinite-acting Radial Flow + Wellbore Effect

+ Skin + Constant Pressure Boundary

This case combined the constant pressure boundary with the other well test features. Fig. 3.10(b) indicates that all three methods give accurate predictions. Comparatively, Methods B and C gave better predictions in the details (such as the curvature in the middle of the derivative curve), while Method A captured the trend but lost the detail.

Figure 3.10: Data mining results using simple kernel methods on Case 8: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

3.5.9 Case 9: Infinite-acting Radial Flow + Dual Porosity

This test case was designed to test the dual porosity behavior. The prediction results are all acceptable in the Cartesian plot, as shown in Fig. 3.11(a). In the log-log plot, Fig. 3.11(b), Methods B and C captured the initial infinite-acting radial flow but lost accuracy in the last half log cycle. Comparatively, Method B gives the best prediction of the dual porosity characteristics (the drop in the derivative).

Figure 3.11: Data mining results using simple kernel methods on Case 9: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

3.6 Summary and Limitations

After applying the data mining algorithms to the synthetic test data sets, all three methods were found to work well in most cases. Comparatively, Method A performed better in the cases with less curvature change in the derivative, such as the cases without wellbore effect, while Methods B and C worked better in the cases with complex curvature change in the derivative, such as the cases with more reservoir features. Because Method B does not require the exact positions of the breakpoints to be provided, Method B would be preferable in real practice.

In all test cases in Section 3.5, the training data set had a flow rate history with intervals of at least 1 h, so the prediction was correspondingly made for a flow rate history with 1 h intervals. The first point is at 1 h but has no calculated derivative; therefore, all derivatives start from 2 h. This raises a concern about the capability of the methods to handle early time behavior. Generally speaking, as long as there are sufficient data in the early time zone (0.001 h ≤ t ≤ 1 h) and they are used to train the machine learning algorithm, the early behavior can be obtained. However, some real PDG data may have a larger time interval (e.g. 1 h) or miss early time data, in which case it is impossible for the machine learning algorithm to learn the information before 1 h. Hence, in those cases, the data mining may very possibly fail to predict the response in this very early time stage.

Nowadays many downhole devices available commercially from major service com-

panies can provide simultaneous flow rate and pressure information. However, there

may still be some data sets in which the pressure and flow rate signals are not synchronized. In these situations, a few remedies can be proposed. First, if the data set is very large, we may simply pick out those sample times at which both the pressure and the flow rate information are present; because the data mining algorithm can handle uneven time intervals, this works as a resampling of the data set. Second, we may interpolate the flow rate signals using locally weighted regression. Last, if a large period of flow rate data is missing, we may use the effective flow rate (cumulative production divided by cumulative production time) to replace the whole period. The problem of incomplete production history will be addressed in Chapter 5.

The problems of early time data and incomplete production history come from the measuring hardware and software systems; these are generic problems and not specific to the data mining approaches. However, the three methods (Methods A, B and C) also have some limitations of their own.

Method C first requires exact knowledge of the breakpoints in advance. As discussed in Section 2.2.2, detecting breakpoints accurately is still a very difficult problem. We introduce some methods of detecting breakpoints using data mining approaches in Appendix C; however, even those methods fail to discover the breakpoint locations with 100% accuracy. The second problem that Method C faces is the unbounded size of its input vector: even when exact knowledge of the breakpoint locations is provided, the input vector grows without bound as flow rate change events accumulate, according to Table 3.4.

Method A works stably in capturing the major trend of the reservoir behavior, but it fails to capture the details of the pressure transient, especially when the derivative curve contains frequent curvature changes (such as the cases with wellbore effect). This greatly limits the wide application of Method A, because real reservoir data exhibit much richer pressure transient behavior than a simple synthetic case.

Method B made consistently good predictions in most of the test cases. It returned a good reproduction of the training variable flow rate transient, and also gave accurate predictions for a constant flow rate history. However, Method B failed to make accurate predictions for a different variable flow rate history. For example, in Case 7, if Method B was required to predict for a more variable flow rate, the prediction deviated from the true answer, as shown in Fig. 3.12, even though it succeeded in the original flow rate transient reproduction and the prediction for the constant flow rate (Fig. 3.9).

Figure 3.12: In Case 7, Method B fails to predict for a more variable flow rate history, even though it succeeded in the variable flow rate transient reproduction and the prediction for the constant flow rate (Fig. 3.9).

Actually, in order to solve this problem, the Gaussian Kernel, shown in Eq. 3.24,

was also investigated to replace the linear kernel.

K(x, z) = \exp\left( -\frac{\| x - z \|^2}{2 \sigma^2} \right)    (3.24)

where σ is a parameter that adjusts the decay speed of the Gaussian curve. Compared with the linear kernel, which projects the input vector into a finite high-dimensional space, the Gaussian kernel projects the input vector into an infinite-dimensional space (Ng, 2009). Therefore, the Gaussian kernel in principle has good potential for capturing subtle reservoir behaviors. However, the Gaussian kernel ultimately proved not to be helpful in this problem. Nevertheless, we document it here for future study: we expect that once more understanding of data mining in PDG data interpretation is obtained, the Gaussian kernel may be studied systematically to reveal the relationship between convolved variables.
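For reference, Eq. 3.24 is straightforward to evaluate; a one-line NumPy sketch (ours) follows.

    import numpy as np

    def gaussian_kernel(x, z, sigma=1.0):
        """Eq. 3.24: exp(-||x - z||^2 / (2 sigma^2));
        sigma sets the decay speed of the Gaussian curve."""
        return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))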

The problem comes from the construction of the input vector and the kernel function. In the current formulation, the superposition was applied at the level of the input vector, and the kernelization was applied over the superposition. But considering the essence of the reservoir pressure transient, namely that each measured pressure is the linear combination of the pressure responses created by previous flow rate change events, the reservoir properties are reflected directly in the individual pressure responses. Moreover, the pressure convolution, functioning as a smoothing process, blurs the distinction of the reservoir properties by mixing multiple pressure responses together. Therefore, the kernel function that is expected to explore the reservoir properties should detect them better if it works more fundamentally at the level of the pressure response rather than at the level of the superposition. This idea led to the development of the convolution kernel, which is described in Chapter 4. Compared to the complexity of the convolution kernel discussed in Chapter 4, the kernel functions used in all three methods discussed in this chapter are relatively “simple”, hence the title of this chapter.

Chapter 4

Convolution Kernel

As mentioned in Section 3.6, a limitation of the previous methods comes from the architecture of kernelization and superposition. In Method B's design, the kernelization is deployed over the superposition, while a better deployment would be “superposition over kernelization”, owing to the essence of convolution in the reservoir pressure transient, as discussed in Section 3.6. This chapter describes a new data mining method using the “convolution kernel”, which was developed to implement the idea of “superposition over kernelization”.

4.1 The Origination of Convolution Kernel

The convolution kernel was initially introduced and applied by David Haussler in the domain of natural language machine learning. His work was published in a technical report by the University of California at Santa Cruz in 1999 (Haussler, 1999).

The problem that Haussler faced was how to process natural language automatically, for example, how to determine whether two words share the same origin purely by computer, without human interaction. Before Haussler, kernel methods had already been applied to such discrete problems. At that time, a simple kernel function, namely k(str1, str2), was used to evaluate the similarity between two strings str1 and str2. However, due to the complexity of word construction, simple kernel functions were very limited for accurate linguistic study.


Haussler (1999) proposed an alternative path in kernelized data mining. He observed that words are fundamentally composed of “parts”, and that the similarity between words is essentially the identity between the parts of the two words. Therefore, he creatively reformed the kernel function by applying a simple kernel function to the parts and summing all the simple kernels together to form a complex kernel function, named the “convolution kernel”. Eq. 4.1 shows the convolution kernel, K(str1, str2), in mathematical form.

K(str_1, str_2) = \sum_{u_i \in \text{all parts of } str_1} \; \sum_{v_j \in \text{all parts of } str_2} k(u_i, v_j)    (4.1)

Take the example of comparing the two words “move” and “remove”. Data mining with the convolution kernel works as follows:

1. u_i ∈ all parts of “move”: m, o, v, e, mo, ov, ve, mov, ove, move

2. v_j ∈ all parts of “remove”: r, e, m, o, v, re, em, mo, ov, ve, rem, emo, mov, ove, remo, emov, move, remov, emove, remove

3. Evaluate the parts using a given simple kernel: k(u_i, v_j)

4. Sum the kernels over all part pairs to form the convolution kernel: K(“move”, “remove”) = \sum_{i=1}^{10} \sum_{j=1}^{20} k(u_i, v_j)

Following Haussler's idea, a series of studies in computational linguistics flourished in the past decade, such as Collins and Duffy (2002). These studies greatly improved the accuracy and efficiency of automated linguistic processing. Reviewing these studies, one cannot fail to see that the key to their success lies in the use of the convolution kernel, which enables the data mining to work directly on the more fundamental elements of the words.

However, the convolution kernel is not a single kernel function, but a generic methodology for constructing complex kernel functions from simple kernels; the function used in Haussler (1999) was just one example of a convolution kernel. In order to construct a convolution kernel function, there are three key elements:

Part definition: the way to decompose the original data into elemental parts.

Simple kernel: the kernel function used to evaluate pairs of parts.

Combination of simple kernels: the way to combine all simple kernels over the parts to form the convolution kernel.

There is considerable flexibility in the selection of all three key elements in the construction of a convolution kernel. However, there are still some restrictions. The most important restriction is that the convolution kernel must itself be a valid kernel function, so that a convolution kernel K(x^{(i)}, x^{(j)}) can be written as the inner product of two transformations of x^{(i)} and x^{(j)}. That is, there must exist at least one transformation Φ(x) that satisfies K(x^{(i)}, x^{(j)}) = Φ(x^{(i)})^T Φ(x^{(j)}). Because the simple kernels are already valid kernel functions, the restrictions fall on the combination of the simple kernels, so as to guarantee the validity of the convolution kernel function.

An easy approach to combining the simple kernels is to follow the kernel closure rules. Suppose K_1(x, z) and K_2(x, z) are two valid kernels; then the following K(x, z) are also valid kernels (Berg et al., 1984; Laskov and Nelson, 2012):

(a) K(x, z) = K_1(x, z) + K_2(x, z)

(b) K(x, z) = K_1(x, z) K_2(x, z)

(c) K(x, z) = a K_1(x, z), where a ∈ ℜ^+

The proof of these rules is shown in Appendix B.

If the combination of simple kernels follows these rules1, the formulated convo-

lution kernel will still be a valid kernel function. For example, Eq. 4.1 utilized the

summation closure rule, so the constructed convolution kernel is valid.
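A small sketch of these closure rules in code (ours): starting from two kernels known to be valid, each combination below is again a valid kernel.

    import numpy as np

    def k1(x, z):                       # linear kernel (valid)
        return float(x @ z)

    def k2(x, z):                       # cubic polynomial kernel (valid)
        return (1.0 + float(x @ z)) ** 3

    def k_sum(x, z):                    # rule (a): sum of kernels
        return k1(x, z) + k2(x, z)

    def k_prod(x, z):                   # rule (b): product of kernels
        return k1(x, z) * k2(x, z)

    def k_scaled(x, z, a=2.5):          # rule (c): positive scaling, a > 0
        return a * k1(x, z)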

With the three elements, a convolution kernel may be constructed for data min-

ing. Similar to simple kernels, the convolution kernel will project the input vector

into a pseudo-high-dimensional space, which helps to capture the nonlinearity while

maintaining a linear form in the training and prediction equations.

¹There are also other closure rules for kernel combination. However, only the three most commonly seen rules are listed here.


4.2 Convolution Kernel Applied to PDG Data

Although the convolution kernel was invented to solve the discrete linguistic problem,

it has wider application to the solution of complex continuous problems, such as data

mining on PDG data.

As discussed in Sections 3.3 and 3.6, the reservoir pressure transient is a convo-

lution result of the pressure responses due to all previous flow rate change events.

Therefore, a pressure transient may be decomposed into a series of pressure responses; this provides the “part definition” in a PDG data mining problem, analogous to the word parts in the original linguistic application.

Figure 4.1: Decomposition of an input sample point into parts.

Suppose there is an input sample point x(a) in a PDG data set, as shown in

Fig. 4.1. There are two flow rate change events before point x^{(a)}, namely q_1^{(a)} and q_2^{(a)}. Suppose each flow rate change event has a corresponding input vector, x_1^{(a)} and x_2^{(a)}; then we may define the parts of this given x^{(a)} as:

{all parts of x^{(a)}} = { x_1^{(a)}, x_2^{(a)} }    (4.2)

Here, x_1^{(a)} and x_2^{(a)} are general forms of the input vectors for the two parts of x^{(a)}. We intentionally do not specify the detailed form of each part's input vector here, because the input vector is discussed in detail in Section 4.4.


Generally, for any input sample x(i), we have the parts definition as Eq. 4.3.

{all parts of x^{(i)}} = { x_1^{(i)}, x_2^{(i)}, …, x_k^{(i)}, …, x_{N_i}^{(i)} }    (4.3)

Here, x_k^{(i)} is the general form of the input vector of the kth part of x^{(i)}, and N_i is the total number of flow rate change events before x^{(i)}. Because it is very hard to detect all breakpoints accurately, a wise solution is to treat every point before the current sample point as a breakpoint. In this way, no breakpoint detection is required while the accuracy is still maintained. Then N_i = i, and Eq. 4.3 becomes:

{all parts of x^{(i)}} = { x_1^{(i)}, x_2^{(i)}, …, x_k^{(i)}, …, x_i^{(i)} }    (4.4)

With the parts defined, the second element of a convolution kernel is the simple kernel. Supposing that we have two parts x_k^{(i)} and x_l^{(j)} from two sample points x^{(i)} and x^{(j)}, the simple kernel we use is the linear kernel, as shown in Eq. 4.5:

k(x_k^{(i)}, x_l^{(j)}) = (x_k^{(i)})^T x_l^{(j)}    (4.5)

Finally, linearly combining all simple kernels evaluated on all possible part pairs, we form the convolution kernel, as shown in Eq. 4.6:

K(x^{(i)}, x^{(j)}) = \sum_{k=1}^{N_i} \sum_{l=1}^{N_j} k(x_k^{(i)}, x_l^{(j)}),   where k(x_k^{(i)}, x_l^{(j)}) = (x_k^{(i)})^T x_l^{(j)}    (4.6)

As every point before the sample point is treated as a breakpoint, to avoid the requirement of accurate breakpoint detection, we have N_i = i and N_j = j, so that the convolution kernel function is:

K(x^{(i)}, x^{(j)}) = \sum_{k=1}^{i} \sum_{l=1}^{j} k(x_k^{(i)}, x_l^{(j)}),   where k(x_k^{(i)}, x_l^{(j)}) = (x_k^{(i)})^T x_l^{(j)}    (4.7)

Selecting the linear summation to form the convolution kernel follows mainly from the characteristics of the reservoir pressure transient. As discussed previously, the subsurface pressure is the convolution of the pressure responses due to previous flow rate change events. Specifically, the convolution is a linear summation of all pressure responses, namely superposition. Because the superposition linearly adds up all the pressure responses, which are the “parts” in our convolution kernel, it is quite natural to linearly combine all the simple kernels that evaluate over the parts. This also implements the idea discussed in Section 3.6 that a better deployment of the kernel functions would be superposition over kernelization, rather than kernelization over superposition: linearly combining all simple kernels exactly reflects the methodology of “superposition over kernelization”.

In addition to mirroring the superposition, the linear summation also takes advantage of the summation closure rule discussed in Section 4.1, so the newly formed convolution kernel in Eq. 4.7 satisfies the requirement of a valid kernel function.

In this way, a convolution kernel was successfully constructed for the data mining of PDG data.
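A minimal sketch of Eq. 4.7 in code (our own naming): each sample point is represented by the list of its parts' input vectors, the simple kernel on parts is the linear kernel, and all part pairs are summed.

    import numpy as np

    def pdg_convolution_kernel(parts_i, parts_j):
        """Eq. 4.7: K(x_i, x_j) = sum_k sum_l (x_k)^T x_l over all part
        pairs. parts_i, parts_j: lists of per-part NumPy vectors."""
        return sum(float(xk @ xl) for xk in parts_i for xl in parts_j)

Because the simple kernel is linear, the double sum factorizes into the inner product of the summed part vectors, (\sum_k x_k)^T (\sum_l x_l), so the kernel can also be evaluated without the nested loop.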

4.3 New Formulation for Conjugate Gradient

Before moving on to the selection of the input vector, we would like to improve the optimization method. The optimization method discussed in this section uses technical concepts including the Conjugate Gradient method and the Reproducing Kernel Hilbert Space; either topic could easily fill a book on its own. It was not the target of this study or this dissertation to explore the detailed theories of the two topics; therefore, only the conclusions useful to the data mining project are shown in this section. Deeper proofs, derivations and theorems can be found in the references.

In the simple study described in Chapter 3, the Steepest Gradient Descent (SGD) method was used in the iterative training process. However, the SGD method is very inefficient for two major reasons. First, the SGD method has no limit on the number of iterations before convergence; it may require a long time to reach the predefined residual, leading to low learning efficiency. Second, the SGD method may follow zig-zag iterations when the Hessian matrix has a large condition number (Caers, 2009). Therefore, it is necessary to improve the optimization method to raise the overall efficiency of the data mining process.

The Conjugate Gradient (CG) method can usefully replace the SGD method, as it avoids the two disadvantages of SGD discussed earlier. For any linear equation Ax = b, where A ∈ ℜ^{n×n} and x, b ∈ ℜ^n, the CG method searches a Krylov space generated by A and b, and the solution converges in at most n iterations (Trefethen and Bau, 1997), so there are no longer any zig-zag steps. Fig. 4.2 shows a comparison between the CG and SGD methods solving a linear system with n = 2. The CG method (red line) takes at most two steps to converge, while the SGD method (green line) zig-zags when approaching the optimum.

Figure 4.2: A comparison of the SGD method (in green) and the CG method (in red) for minimizing a quadratic function associated with a given linear system. The CG method converges in at most n steps (here n = 2), while the SGD method zig-zags when approaching the optimum. From Alexandrov (2007).

However, in order to utilize the CG method, we have to formulate a linear equation of the form Ax = b. In this case, we need to reformulate the training equation (Eq. 3.20) and the prediction equation (Eq. 3.22) to suit the CG method.


In fact, each valid kernel function K(x, z) is positive-definite² and is associated with a corresponding space of functions, H_K, named the reproducing kernel Hilbert space (RKHS) (Hastie et al., 2009). The definitions of a Hilbert space and a reproducing kernel Hilbert space are as follows.

Hilbert Space. A Hilbert space is an inner product space that is complete and separable with respect to the norm defined by the inner product. An example is the vector space ℜ^n with the inner product defined as 〈x, z〉 = x^T z.

Reproducing Kernel Hilbert Space (RKHS) (Evgeniou et al., 2000). A reproducing kernel Hilbert space (RKHS) is a Hilbert space H of functions defined over some bounded domain X ⊂ ℜ^n with the property that, for each x ∈ X, the evaluation functionals defined as

F_x[f] = f(x)   ∀f ∈ H    (4.8)

are linear, bounded functionals. The boundedness means that there exists a U = U_x ∈ ℜ^+ such that

|F_x[f]| = |f(x)| ≤ U ‖f‖    (4.9)

for all f in the RKHS.

The relation between RKHS and Hilbert space is that a RKHS is a Hilbert

space where the inner product is defined using a positive-definite kernel function

K (x, z) (Evgeniou et al., 2000).

As mentioned at the beginning of the section, the RKHS is a profound topic, and further derivation or discussion is not necessary here. This section emphasizes the items that are helpful to the PDG data mining project:

1. Each positive-definite kernel function K(x, z) is associated with an RKHS H_K, while each RKHS H_K corresponds to a unique positive-definite kernel function K(x, z), named the reproducing kernel of H_K (hence the terminology RKHS) (Hastie et al., 2009; Evgeniou et al., 2000).

²To be exact, a valid kernel is positive semidefinite according to the Mercer Theorem (refer to Section 3.2 and Appendix B). However, the zero kernel is not useful. In this dissertation, a valid kernel usually means a positive-definite kernel.


2. Using a positive-definite kernel K(x, z) for machine learning is equivalent to finding a function f in the RKHS H_K corresponding to K(x, z) such that f(x) = y, where x ∈ ℜ^n is the general form of the input vector, and y ∈ ℜ is the general form of the observation (in the training process) or the prediction (in the prediction process). n is the dimension of the input vector x; specifically, in the context of this project, n = N_x.

3. When the true function f in H_K is not visible, but a (training) data set {(x^{(i)}, y^{(i)}) | x^{(i)} ∈ ℜ^n, y^{(i)} ∈ ℜ, i = 1, …, m} is provided, the true function f may be approached by the set of half-evaluated functions K(·, x^{(i)}). This can be expressed mathematically (Wahba, 1990; Hastie et al., 2009) as:

f_β(x) = \sum_{i=1}^{m} β_i K(x, x^{(i)})    (4.10)

Here, m is the total number of training data; specifically, in the context of this project, m = N_p. The half-evaluated function K(·, x^{(i)}) works as the basis function, also known as the representer of evaluation at x^{(i)} (Hastie et al., 2009); mathematically, K_{x^{(i)}}(x) = K(x, x^{(i)}) (treating x^{(i)} as a parameter and x as the unknown variable). In this dissertation, we name these half-evaluated kernel functions kernel basis functions for simplicity.

These discussions of the kernel function and RKHS give at least two important hints regarding kernelized learning. First, because the half-evaluated kernel functions K(·, x^{(i)}) work as the basis functions of H_K to span the true function f, the more training data (x^{(i)}, y^{(i)}) are provided, the more likely the K(·, x^{(i)}) are to form a complete basis, and the closer the function f_β will approach the true function f. This explains why the kernelized learning method (indeed, data mining methods generally) requires a large amount of data.

Secondly, we may summarize the ultimate target of kernelized learning from a new point of view: the task of kernelized learning is to find the coefficients β such that the function f_β(x) defined in Eq. 4.10 is an adequate estimator of the true function f (Blanchard and Kramer, 2010).


To obtain the coefficients β, the training data are utilized. Substituting x = x^{(1)}, y = y^{(1)}, we have:

f_β(x^{(1)}) = \sum_{i=1}^{m} β_i K(x^{(1)}, x^{(i)}) = y^{(1)}    (4.11)

Similarly, for all training data:

f_β(x^{(1)}) = \sum_{i=1}^{m} β_i K(x^{(1)}, x^{(i)}) = y^{(1)}
    ⋮
f_β(x^{(k)}) = \sum_{i=1}^{m} β_i K(x^{(k)}, x^{(i)}) = y^{(k)}    (4.12)
    ⋮
f_β(x^{(m)}) = \sum_{i=1}^{m} β_i K(x^{(m)}, x^{(i)}) = y^{(m)}

Recalling the definition of the kernel matrix in the Mercer Theorem (refer to Section 3.2), we define the kernel matrix as follows.

Kernel Matrix. Suppose a data set {(x^{(i)}, y^{(i)}) | x^{(i)} ∈ ℜ^n, y^{(i)} ∈ ℜ, i = 1, …, m} is given, and a valid kernel function K(x, z) is provided. Then a kernel matrix K may be defined as

        | K_11  …  K_1j  …  K_1m |
        |  ⋮        ⋮        ⋮   |
    K = | K_i1  …  K_ij  …  K_im |    (4.13)
        |  ⋮        ⋮        ⋮   |
        | K_m1  …  K_mj  …  K_mm |

where:

K_ij = K(x^{(i)}, x^{(j)})    (4.14)

Using the kernel matrix K, Equation 4.12 can be rewritten in matrix form as:

Kβ = y    (4.15)


where:

y = (y^{(1)}, …, y^{(m)})^T    (4.16)

β = (β_1, …, β_m)^T    (4.17)

Applying Eq. 4.15 to the PDG data mining project, in which m = N_p, the training equation becomes:

Kβ = y    (4.18)

where:

K = { K_ij | K_ij = K(x^{(i)}, x^{(j)}), i, j = 1, …, N_p }    (4.19)

β = (β_1, …, β_{N_p})^T    (4.20)

y = (y_obs^{(1)}, …, y_obs^{(N_p)})^T    (4.21)

With β obtained from the matrix-form training equation, Eq. 4.18, the prediction equation is:

y_pred = \sum_{i=1}^{N_p} β_i K(x_pred, x^{(i)})    (4.22)

Eq. 4.18 and Eq. 4.22 are the new matrix-form training and prediction equations for the kernelized learning from now on. Furthermore, because the new training equation, Eq. 4.18, has a linear form, the CG method may be applied to the learning process. Algorithm 3 shows the conjugate gradient method applied to the training equation, Eq. 4.18, to obtain the coefficients β.

Algorithm 3 Learning with the Conjugate Gradient Method

    β[0] = 0, r[0] = y, q[0] = r[0]                            (initialization)
    for k = 1, 2, …, N_p do
        a[k] = (r[k−1]^T r[k−1]) / (q[k−1]^T K q[k−1])         (calculate step length)
        β[k] = β[k−1] + a[k] q[k−1]                            (update solution)
        r[k] = r[k−1] − a[k] K q[k−1]                          (update residual)
        if β[k] is convergent then
            return β[k]
        end if
        b[k] = (r[k]^T r[k]) / (r[k−1]^T r[k−1])
        q[k] = r[k] + b[k] q[k−1]                              (update search direction)
    end for
    return β[N_p]

In the algorithm, the iteration loop executes at most N_p times, because the conjugate gradient method converges in at most N_p steps as long as the kernel matrix is well-posed. In practice, however, the CG method has a very good convergence rate and usually converges in far fewer steps.
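For concreteness, a minimal NumPy version of Algorithm 3, together with the prediction equation Eq. 4.22, might look as follows (a sketch with our own names, not the dissertation's code).

    import numpy as np

    def cg_train(K, y, tol=1e-8):
        """Solve K beta = y by conjugate gradients (Algorithm 3).
        K: positive-definite kernel matrix; y: observed pressures."""
        n = len(y)
        beta = np.zeros(n)
        r = y.astype(float).copy()       # residual
        q = r.copy()                     # search direction
        for _ in range(n):               # at most n = Np iterations
            Kq = K @ q
            a = (r @ r) / (q @ Kq)       # step length
            beta = beta + a * q          # update solution
            r_new = r - a * Kq           # update residual
            if np.linalg.norm(r_new) < tol:
                break
            b = (r_new @ r_new) / (r @ r)
            q = r_new + b * q            # update search direction
            r = r_new
        return beta

    def predict(beta, kernel, X_train, x_pred):
        """Eq. 4.22: y_pred = sum_i beta_i K(x_pred, x^(i))."""
        return sum(bi * kernel(x_pred, xi) for bi, xi in zip(beta, X_train))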

As described so far, the learning method with the convolution kernel successfully adopts the CG method as its optimization method. The last thing to do before application is to select a proper input vector for the parts in the convolution kernel.

4.4 Input Vector Selection

For the convolution kernel, an input vector still needs to be selected for the training

and prediction purposes. However, the input vector for the convolution kernel differs slightly from that of a simple kernel. For a simple kernel, one input vector corresponds to one sampling point, while for the convolution kernel, one input vector corresponds to one part of one sampling point. Mathematically, for each sampling point (t^{(i)}, q^{(i)}, p^{(i)}), there is only one corresponding input vector for the simple kernel method:

(t^{(i)}, q^{(i)}, p^{(i)}) → x^{(i)}    (4.23)

However, for the convolution kernel there is a series of input vectors corresponding to all “parts” of this sampling point:

(t^{(i)}, q^{(i)}, p^{(i)}) → { x_k^{(i)}, k = 1, …, i }    (4.24)

The selection of the input vector for the convolution kernel is hence the selection of x_k^{(i)}.

Table 3.2 in Section 3.3 provides guidance for selecting the input vector corresponding to particular reservoir behaviors. Here, three choices are prepared, as shown in Table 4.1.

Table 4.1: Input vector for convolution kernel

KV3F:

x_k^{(i)} = [ q_k^{(i)}, q_k^{(i)} \log t_k^{(i)}, q_k^{(i)} t_k^{(i)} ]^T

KV4FA:

x_k^{(i)} = [ q_k^{(i)}, q_k^{(i)} \log t_k^{(i)}, q_k^{(i)} t_k^{(i)}, q_k^{(i)} (t_k^{(i)})^2 ]^T

KV4FB:

x_k^{(i)} = [ q_k^{(i)}, q_k^{(i)} \log t_k^{(i)}, q_k^{(i)} t_k^{(i)}, q_k^{(i)} / t_k^{(i)} ]^T

Here, k = 1, …, i. The name of the first input vector, “KV3F”, is an abbreviation for “kernel input vector with three features”; “KV4FA” and “KV4FB” are two kinds of input vectors with four features.

In the input vector KV4FB, q_k^{(i)} / t_k^{(i)} is added as the last feature. This feature is intended to capture reservoir behaviors that decay with elapsed time, such as the wellbore effect. Moreover, the exponential integral function

Ei(x) = −\int_{−x}^{+∞} \frac{\exp(−u)}{u} du

is the main function in the solution for constant-rate infinite-acting radial flow (Ramey, 1970; Horne, 1995), and q_k^{(i)} / t_k^{(i)} is the main term of the second-order approximation of the exponential integral function, so this feature is important for improving the accuracy of the prediction.
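A brief sketch of building the KV4FB parts for one sample (our own code), under the simplified reading in which every earlier sample is treated as a flow rate change event, so q[k] stands in for the kth rate share and t[i] − t[k] for the elapsed time.

    import numpy as np

    def kv4fb_part(q_k, t_k):
        """KV4FB features of one part: [q, q*log t, q*t, q/t],
        with t the time elapsed since the corresponding rate event."""
        return np.array([q_k, q_k * np.log(t_k), q_k * t_k, q_k / t_k])

    def kv4fb_parts(t, q, i):
        """All parts of sample i (one per earlier sample, k = 0..i-1)."""
        return [kv4fb_part(q[k], t[i] - t[k]) for k in range(i)]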

To test which input vector results in the best prediction, three test cases were used, as listed in Table 4.2. These test cases are the same as those in Table 3.5 in Section 3.5, so the test case numbers are kept the same to maintain consistency throughout the whole dissertation.

Table 4.2: Test cases for convolution kernel input vector selection

Test Case #   Test Case Characteristics
4             Infinite-acting radial flow + wellbore effect + skin
7             Infinite-acting radial flow + wellbore effect + skin + closed boundary
8             Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary

Fig. 4.3 shows the results of the three input vectors (with the convolution kernel) applied to the three test cases. Fig. 4.3(a) and Fig. 4.3(b) are the prediction results for a variable flow rate and a constant flow rate using the three different input vectors on Test Case 4. All three input vectors return good predictions in the variable flow rate case (Fig. 4.3(a)). However, in the constant flow rate case (Fig. 4.3(b)), KV3F and KV4FA miss the wellbore storage at the beginning, while KV4FB captures the wellbore storage very well. After the short wellbore effect, all three methods capture the infinite-acting radial flow with good accuracy.

Fig. 4.3(c) and Fig. 4.3(d) show the results on Test Case 7. In the log-log plot (Fig. 4.3(d)), KV4FB clearly shows its advantage in capturing the wellbore storage, but it has a slight deviation on the infinite-acting radial flow. Conversely, KV4FA misses the wellbore storage but predicts the infinite-acting radial flow accurately. KV3F deviates in both the wellbore storage and the infinite-acting radial flow stages. All three input vectors predict the boundary behavior well.

Fig. 4.3(e) and Fig. 4.3(f) show the results on Test Case 8. All three input vectors work well in both the variable flow rate case and the constant flow rate case, and KV4FB gives the best prediction in the log-log plot (Fig. 4.3(f)).

To sum up, all three input vectors give good predictions in all three test cases. Comparatively, KV3F and KV4FA miss the wellbore effect in Test Cases 4 and 7, while KV4FB captures all the reservoir features well in all test cases, except for the small deviation on the radial flow in Case 7. Consequently, KV4FB was selected as the most suitable input vector for the convolution kernel.

Figure 4.3: (a) and (b) show the results for variable and constant flow rates using the three different input vectors on Test Case 4. (c) and (d) show the results on Test Case 7. (e) and (f) show the results on Test Case 8. From the comparison in all three test cases, KV4FB gives better predictions, especially in the detail of the wellbore effect.


In addition, the input vector “KV5F”, as shown in Eq. 4.25:

x_k^{(i)} = [ q_k^{(i)}, q_k^{(i)} \log t_k^{(i)}, q_k^{(i)} t_k^{(i)}, q_k^{(i)} (t_k^{(i)})^2, q_k^{(i)} / t_k^{(i)} ]^T    (4.25)

has also been investigated. In the study, we found that KV5F performed very closely to KV4FB in most of the test cases, except in some of the real data cases, where KV5F was sensitive and unstable and made biased predictions. Therefore, KV5F was finally rejected from the input vector candidates. We still document it here, as it might be reselected as an input vector candidate someday, after further study of its characteristics. In addition, although KV3F and KV4FA were not finally selected as the input vector, they remain promising for the convolution kernel method: in our study they passed all test cases with reasonable predictions, and KV3F in particular showed surprisingly good stability in very noisy cases (with a high percentage of outliers and aberrant segments). For further study and real practice, KV3F and KV4FA are still worth investigation.

Having selected the input vector, it is time to formally define a new method, Method D, as the method using the convolution kernel. Table 4.3 shows the input vector and kernel function for Method D. The next section describes the application of Method D to a series of test cases.

4.5 Application

A series of test cases, listed in Table 4.4 were constructed to test the performance of

the convolution kernel method in different scenarios. Test Cases 1 through 9 are the

same as those discussed in Chapter 3. The test results are shown and discussed in

this section.

The test workflow for Test Cases 1-13 was formed as follows.

CHAPTER 4. CONVOLUTION KERNEL 89

Table 4.3: Input vector and kernel function for Method D

Method D

Input Vector x(i)k =

q(i)k

q(i)k log t

(i)k

q(i)k t

(i)k

q(i)k /t

(i)k

Kernel FunctionK(

x(i),x(j))

=∑i

k=1

∑j

l=1 k(

x(i)k ,x

(j)l

)

k(

x(i)k ,x

(j)l

)

=(

x(i)k

)T

x(j)l

1. Construct a synthetic pressure and flow rate data set, and add 3% artificial noise (normally distributed) to both the pressure and the flow rate data.

2. Use the synthetic data set (with artificial noise) as the training data set. Apply

the convolution kernelized data mining algorithms (Method D) to learn the data

set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without noise) and collect the prediction from the data mining algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate in Step 5

using the same wellbore/reservoir model as Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

8. Feed the data mining algorithm with a multivariable flow rate history (without

noise) and collect the predicted pressures from the data mining algorithm.

9. Construct a synthetic pressure according to the multivariable flow rate in Step 8, using the same wellbore/reservoir model from Step 1.

10. Compare the predicted pressure data (from Step 8) with the synthetic pressure data (from Step 9).

Table 4.4: Test cases for convolution kernel method

Test Case #   Test Case Characteristics
1             Infinite-acting radial flow
2             Infinite-acting radial flow + wellbore effect
3             Infinite-acting radial flow + skin
4             Infinite-acting radial flow + wellbore effect + skin
5             Infinite-acting radial flow + closed boundary (pseudosteady state)
6             Infinite-acting radial flow + constant pressure boundary
7             Infinite-acting radial flow + wellbore effect + skin + closed boundary
8             Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary
9             Infinite-acting radial flow + dual porosity
10            Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary + step flow rate history (Complicated Synthetic Case A)
11            Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary + fast shifted flow rate history (Complicated Synthetic Case B)
12            Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary + real flow rate history (Semireal Case A)
13            Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary + real flow rate history (Semireal Case B)
14            real pressure + real flow rate history (Real Case A)
15            real pressure + real flow rate history (Real Case B)
37            real pressure + real flow rate history + cross validation (Real Case C)

For the real case tests, Test Cases 14-15, and 37, there were no “true data” for

comparison, so the test workflow becomes:

1. Use the real data set as the training data set. Apply the convolution kernelized

data mining algorithms (Method D) to learn the data set until convergence.

2. Feed the data mining algorithm with the training variable flow rate history (real

flow rate history) and collect the prediction from the data mining algorithm.

3. Compare the predicted pressure data (from Step 2) with the real pressure data

(from Step 1).

4. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

5. Feed the data mining algorithm with a multivariable flow rate history (without

noise) and collect the predicted pressures from the data mining algorithm.

The tests were conducted step by step, from synthetic to real cases. Test Cases 1-9 are synthetic test cases that exercise the convolution kernel in different scenarios. These nine tests are the same as the nine test cases for the simple kernel methods listed in Table 3.5, the aim being to compare the convolution kernel method with the simple kernel method. Therefore, the predictions using both Method D (convolution kernel method) and Method B (simple kernel method) are displayed in the result plots. Test Cases 10-11 are complicated synthetic cases, in which the flow rate history changes rapidly to represent the complex real reservoir production environment. Test Cases 12-13 are “semireal” field cases, in which the training flow rate history is real while the training pressure data were generated using the models. These two cases are particularly meaningful, because they are the closest artificial cases to the real field cases while still providing true data for evaluating the prediction accuracy. Test Cases 14-15 are real cases in which both the pressure and the flow rate data are real. However, because there are no “true data” for the real cases, there is no known true answer for comparison. Test Case 37 is another real case, with nearly nine months of production data. In this case, a cross validation was performed, in which the real data set was divided into two parts that were used as the training data set and the test data set, respectively. The target of this case was to validate the prediction results using the real data set itself.

The same test cases keep the same numbers throughout the dissertation; for example, Test Cases 1-9 appear in the tests of both the simple kernel method in Section 3.5 and the convolution kernel method in this chapter. Case 37 was added after the other cases in the following chapters, so its number is not contiguous. The model data for all test cases are listed in Appendix A.

The test results are shown in four kinds of plots, including:

• a Cartesian plot of the training data (noisy data) and the true data (noise-free data); for the real data cases, no true data are plotted.

• a Cartesian plot of the prediction to the variable flow rate history (training data

set reproduction)

• a log-log plot of the prediction to the constant flow rate history

• a Cartesian plot of the prediction to the multivariable flow rate history (more

variable flow rate prediction)

For simplicity, not all the plots are shown for all test cases as not all are relevant.

Table 4.5 shows which plots are shown for each test case. The test cases and their

results are discussed one by one in the following 16 subsections.
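For reference when reading the log-log plots that follow, the sketch below computes a pressure change and its logarithmic (Bourdet-style) derivative for a synthetic drawdown; the response function and the use of np.gradient are illustrative assumptions, not the derivative algorithm used in this dissertation.

import numpy as np
import matplotlib.pyplot as plt

t = np.logspace(-1, 2, 200)                      # time (hours)
dp = 70.0 * np.log(t + 10.0)                     # placeholder Delta-p response (psi)

# Logarithmic derivative: t * d(dp)/dt, computed as d(dp)/d(ln t).
deriv = np.gradient(dp, np.log(t))

plt.loglog(t, dp, label="Delta-p")
plt.loglog(t, deriv, "--", label="Delta-p derivative")
plt.xlabel("Time (hours)")
plt.ylabel("Delta-Pressure (psi)")
plt.legend()
plt.show()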

4.5.1 Case 1: Infinite-acting Radial Flow

As in Section 3.5.1, Case 1 is the simplest case, with only one reservoir behavior: infinite-acting radial flow. Fig. 4.4(a) shows the training data set (pink line) and the true data (blue line).

Table 4.5: Result plots for all tests on convolution kernel method

Test Case #  Training/True Data  Variable Flow Rate  Constant Flow Rate  Multivariable Flow Rate
1            X                   X                   X
2                                X                   X
3                                X                   X
4                                X                   X                   X
5                                X                   X
6                                X                   X
7                                X                   X                   X
8                                X                   X                   X
9                                X                   X
10           X                   X                   X                   X
11           X                   X                   X                   X
12           X                   X                   X                   X
13           X                   X                   X                   X
14           X                   X                   X                   X
15           X                   X                   X                   X
37           X                   X                   X                   (cross validation)

Both the flow rate and the pressure data had 3% artificial noise (normally distributed) added. Although the true data are shown in the figures, they are not visible to the data mining process: in the test, only the noisy data were provided to the machine learning algorithms, and the true data are shown here for comparison purposes only. For Tests 2-9, the figures for the training data are omitted for simplicity.
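A minimal sketch of this noise model is shown below; scaling the Gaussian noise to 3% of each sample's magnitude is an assumption made for illustration.

import numpy as np

rng = np.random.default_rng(0)

def add_noise(signal, level=0.03):
    """Add zero-mean Gaussian noise scaled to `level` of each sample."""
    return signal + level * np.abs(signal) * rng.standard_normal(signal.shape)

p_true = -np.linspace(0.0, 800.0, 200)           # placeholder true pressure (psi)
q_true = np.full(200, 50.0)                      # placeholder true rate (STB/d)
p_noisy, q_noisy = add_noise(p_true), add_noise(q_true)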

In the variable flow rate prediction (Fig. 4.4(b)), Method B and Method D both give very good predictions compared with the true data. In the constant flow rate

prediction (Fig. 4.4(c)), the log-log plot shows that Methods B and D both capture

the overall infinite-acting radial flow behavior. However, Method B has a drop in the

derivative curve in the last half log cycle, while Method D has no such problem. The half-log-cycle problem for Method B was discussed in Section 3.6. At that time, we anticipated that a new architecture of kernel function and superposition might yield better results. Here, the convolution kernel method (Method D) shows its advantage over the simple kernel method, at least in this specific kind of example.

4.5.2 Case 2: Infinite-acting Radial Flow + Wellbore Effect

Fig. 4.5 demonstrates the results of Method D working on an infinite-acting radial

flow with wellbore storage effect. Method B still shows a drop-off in the last half

log cycle in the derivative curve in Fig. 4.5(b), while Method D captures the infinite-

acting radial flow as well as the wellbore storage effect. In Fig. 4.5(a), the two methods

reproduce the pressure transients very well according to the variable flow rate history. As discussed in Section 3.5, the sampling of the data leads to the absence of the early-time data (data in the range [10^-3 h, 1 h]), so the unit-slope straight line of the wellbore storage effect is not seen in the derivative curve. However, the pressure prediction showing the end of the hump of the pressure derivative can be treated as an indirect demonstration that Method D still captures the wellbore effect.

Figure 4.4: Data mining results using convolution kernel method on Case 1: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot.

Figure 4.5: Data mining results using convolution kernel method on Case 2: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

4.5.3 Case 3: Infinite-acting Radial Flow + Skin

Case 3 is a synthetic case with infinite-acting radial flow and a skin effect. The results are shown in Fig. 4.6. The true pressure in this case has a shape similar to that in Case 1, as shown in Fig. 4.6(b) and Fig. 4.4(c). However, the skin factor did affect the prediction of Method B, leading to a hump in the early stage of infinite-acting radial flow and a drop in the last log cycle in Fig. 4.6(b). In contrast, Method D, using the convolution kernel, still makes a highly precise prediction. In the Cartesian plot (Fig. 4.6(a)), at the corner near 120 hours, Method B deviates slightly from the true data, while Method D predicts a sharp corner that better represents the true data.

4.5.4 Case 4: Infinite-acting Radial Flow + Wellbore Effect

+ Skin

Fig. 4.7 shows the results of applying the simple kernel method (Method B) and the convolution kernel method (Method D) to Case 4, which includes infinite-acting radial flow, wellbore effect, and skin factor. Fig. 4.7(a) shows the pressure reproduction of the variable flow rate history, in which both methods give good predictions.

Figure 4.6: Data mining results using convolution kernel method on Case 3: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

In the constant flow rate prediction (Fig. 4.7(b)), Method B has a drop in the last half log cycle, while Method D maintains a good prediction until the end. In this case, the multivariable flow rate prediction results are also shown in Fig. 4.7(c). The figure clearly shows the advantage of Method D over the large deviation of Method B. In Section 3.6, we discussed that the failure to predict a multivariable flow rate history greatly limits the application of Method B; here, Fig. 4.7(c) demonstrates that Method D overcomes this limitation successfully.

4.5.5 Case 5: Infinite-acting Radial Flow + Closed Boundary

(Pseudosteady State)

Sections 4.5.5 to 4.5.8 show the tests on Cases 5-8, which include boundary effects. Fig. 4.8 shows the results for Case 5, containing infinite-acting radial flow together with a closed boundary (pseudosteady state). The two methods made good predictions for the variable flow rate history, as shown in Fig. 4.8(a). However, in the pressure prediction for the constant flow rate (Fig. 4.8(b)), both Method B and Method D deviated slightly from the true infinite-acting radial flow region; by comparison, Method B is closer to the true answer. Both methods captured the pseudosteady state boundary.

Figure 4.7: Data mining results using convolution kernel method on Case 4: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot; (c) prediction using the multivariable flow rate history on a Cartesian plot.

Figure 4.8: Data mining results using convolution kernel method on Case 5: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

4.5.6 Case 6: Infinite-acting Radial Flow + Constant Pres-

sure Boundary

Case 6 studied infinite-acting radial flow with a constant pressure boundary. The results are shown in Fig. 4.9. Both methods gave a good pressure reproduction of the variable flow rate, as illustrated in Fig. 4.9(a). In the prediction for a constant flow rate history, shown in Fig. 4.9(b), Method B had a large drop at the constant pressure boundary stage. Method D deviated slightly from the true data at the very end of the constant pressure boundary, but overall followed the trend well. On balance, Method D was preferred in this case.

4.5.7 Case 7: Infinite-acting Radial Flow + Wellbore Effect

+ Skin + Closed Boundary

Case 7 combines four kinds of reservoir behavior: infinite-acting radial flow, wellbore effect, skin factor, and a closed boundary. Fig. 4.10 shows the results of Method B and Method D. The two methods made good pressure reproductions of the variable flow rate history, as shown in Fig. 4.10(a). In Fig. 4.10(b), Method B captured all reservoir features well, including the early-stage wellbore

Figure 4.9: Data mining results using convolution kernel method on Case 6: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

effect, the middle-stage radial flow, and the late-stage pseudosteady state boundary. By comparison, Method D captured the wellbore effect as well as the pseudosteady state boundary, but deviated slightly during the radial flow. However, Method B did not maintain its good performance in the prediction for a multivariable flow rate history, as shown in Fig. 4.10(c): Method B predicted only the overall trend of the pressure, and lost accuracy in the absolute values. Meanwhile, Method D demonstrated its stability and accuracy by making a good prediction for the multivariable flow rate history. Weighed against the small deviation in Fig. 4.10(b), this stability and accuracy in the multivariable flow rate prediction is preferable.

4.5.8 Case 8: Infinite-acting Radial Flow + Wellbore Effect

+ Skin + Constant Pressure Boundary

Case 8 is similar to Case 7 in that it contains four reservoir/well features, except that Case 7 has a closed boundary while Case 8 has a constant pressure boundary. As in previous cases, the two methods worked well in the pressure reproduction of the variable flow rate, as shown in Fig. 4.11(a). In Fig. 4.11(b), Method B and Method D both made good pressure predictions for the constant flow rate, correctly capturing the wellbore effect, skin, infinite-acting radial flow, and the boundary.

Figure 4.10: Data mining results using convolution kernel method on Case 7: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot; (c) prediction using the multivariable flow rate history on a Cartesian plot.


In Fig. 4.11(c), as in the previous case, Method B failed to predict the multivariable flow rate history, while Method D still made a good prediction.
The prediction results for the multivariable flow rate histories in Cases 4, 7 and 8 demonstrate that Method D overcomes the limitation of Method B and maintains accurate predictions even for a multivariable flow rate history. Although no multivariable flow rate predictions were shown for Cases 1-3 and 5-6, those tests were also carried out, and Method D made accurate multivariable predictions in those cases as well.

Figure 4.11: Data mining results using convolution kernel method on Case 8: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot; (c) prediction using the multivariable flow rate history on a Cartesian plot.


4.5.9 Case 9: Infinite-acting Radial Flow + Dual Porosity

Fig. 4.12 demonstrates the prediction results for the dual porosity case. In Fig. 4.12(a), Method D made a better prediction around the corner near 120 hours, but overall the two methods' predictions were acceptable. In the log-log plot of the pressure prediction for the constant flow rate, the two methods deviated from the true answer, but Method D captured the trend of the derivative curve. In addition, the final stage of Method B's prediction shows strong instability, in that the derivative increased and decreased abruptly. Method D performed much more stably, and even captured the final stage of the derivative.

Figure 4.12: Data mining results using convolution kernel method on Case 9: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

At this point, all nine cases that had been run with the simple kernel method in Section 3.5 had also been tested using the convolution kernel method, Method D. From the comparison between Method B and Method D, we may see that Method D made stable and accurate predictions across scenarios, whether for a variable flow rate, a constant flow rate, or a multivariable flow rate. Method D captured early-stage behavior such as the wellbore effect as well as late-stage behavior such as the boundary effect.
From the next subsection onward, tests of more complicated cases are reported. In these tests, only Method D was applied, and hence only the Method D results are shown in the plots.

4.5.10 Case 10: Complicated Synthetic Case A

The study evaluated the method step by step, from easy synthetic cases (Cases 1-9) to complicated synthetic cases (Cases 10 and 11), to semireal cases (Cases 12 and 13), and finally to real field cases (Cases 14-15 and 37). This section reports

the test results using Case 10, a complicated case that contains infinite-acting radial

flow, wellbore effect, skin factor, constant pressure boundary, and step-like flow rate

history.

Fig. 4.13(a) shows the true data (blue line) and the noisy data (pink line). The true data were used for comparison purposes only and were never visible to the data mining algorithm. The noisy data were used as the training

data. In Fig. 4.13(a), we may see that the flow rate decreased and increased in steps. In real field practice, the flow rate is often controlled in steps, so we constructed such a case to test whether the learning algorithm can function well in a step-like flow rate environment. Furthermore, the step-like flow rate changes enhanced the pressure convolution and shortened the constant flow rate periods, such that the pressure transient had insufficient time to develop fully. All these factors made the learning more difficult.

However, the pressure reproduction in Fig. 4.13(b) shows that the difficulty did not affect the performance of Method D: the pressure reproduction is essentially identical to the true data. Fig. 4.13(c) and Fig. 4.13(d) show the predictions for a constant flow rate history and a multivariable flow rate history; the pressure predictions are accurate in both plots. Considering that the true data were not known in advance by the learning algorithm, which saw only the noisy data (the pink line in Fig. 4.13(a)), it is clear that Method D did discover the controlling logic of the reservoir behavior behind the noisy data set.

Figure 4.13: Data mining results using convolution kernel method on Case 10: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.

CHAPTER 4. CONVOLUTION KERNEL 106

4.5.11 Case 11: Complicated Synthetic Case B

Case 11 contains infinite-acting radial flow, wellbore effect, skin factor, a constant pressure boundary, and a fast-shifting flow rate history. The true data and the noisy training data are shown in Fig. 4.14(a). The figures show that the flow rate history changes every 10 hours, leading to a wave-like pressure transient. These fast-shifting flow rates contribute very significantly to the pressure convolution. Also, because of their short duration, no single constant flow rate period was sufficient to develop the full reservoir behavior. In the forward model, at least 30 hours was required for the pressure to respond to the boundary, so in the training data set no single constant flow rate period and its corresponding pressure transient would reveal the overall model of the well and the reservoir; the machine learning algorithm had to dig into the convoluted pressure to obtain the well/reservoir model. In the test plan, learning would be judged successful if the pressure prediction for a constant flow rate showed a boundary in addition to the infinite-acting radial flow.

The learning and prediction results did not disappoint. Fig. 4.14(b) shows a very good reproduction of the pressure response to the true flow rate history. At the same time, Fig. 4.14(c) demonstrates the pressure prediction for a constant flow rate. The plot shows that Method D captured the constant pressure boundary response from the highly convoluted pressure transient, even though no single period of constant flow rate provided an overall description of the reservoir and the well. This shows the merit of data mining in the interpretation of PDG data: the data mining method can extract the reservoir and well model from a collection of pieces of data, even if no piece is diagnostic by itself from a conventional well test point of view. Fig. 4.14(d) shows the pressure prediction for a multivariable flow rate history. The prediction is also very accurate compared to the true data.
The results of Cases 10 and 11 illustrate that Method D was capable of extracting the well/reservoir model in a noisy and frequently changing environment. This is very promising, in that it reveals value in PDG data that was previously ignored.

Figure 4.14: Data mining results using convolution kernel method on Case 11: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.


4.5.12 Case 12: Semireal Case A

Sections 4.5.12 and 4.5.13 discuss two semireal cases, Case 12 and Case 13. In the semireal cases, the flow rate data came from a real data set, while the pressure data were generated using the forward model. The cases were designed to mimic a real reservoir setting while retaining knowledge of the well/reservoir model; this setting simulates real field data while keeping the model available for comparison purposes. Case 12 used a small piece of the production history, while Case 13 used a longer one.

Case 12 is a case in which infinite-acting radial flow, wellbore effect, skin factor,

and constant pressure boundary behavior all existed. Fig. 4.15(a) shows the true

data and the noisy data. Fig. 4.15(b) shows the pressure reproduction according to the true flow rate history. In the training data, the reservoir pressure data were mixed with 3% artificial noise in the period [120 h, 130 h]. However, in the pressure prediction in Fig. 4.15(b) only the true reservoir pressure changes are seen, and the noise is no longer apparent. This demonstrates that through the training process the learning algorithm learned to distinguish real reservoir behavior from noise. This advantage of the learning method directly serves one of our study targets: denoising. With other denoising methods that simply smooth the curve, such as the Fast Fourier Transform (FFT) method, any small variation of the real reservoir response would be smoothed away together with the noise; a real reservoir response that might represent specific reservoir events would then be lost. The denoising provided by Method D works at high resolution, giving potentially more useful information to the engineers.
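The contrast can be made concrete with the sketch below, in which a moving-average filter stands in for FFT-style low-pass smoothing: the filter attenuates a short genuine pressure event along with the noise, which is exactly the loss of information described above. The event window and magnitudes are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 200.0, 2000)
p = -300.0 * np.log1p(t) / np.log(201.0)          # placeholder smooth trend (psi)
p[(t > 120.0) & (t < 130.0)] -= 15.0              # small genuine reservoir event
noisy = p + 5.0 * rng.standard_normal(t.size)

window = 101                                      # assumed filter width (samples)
smoothed = np.convolve(noisy, np.ones(window) / window, mode="same")
# The 10-hour event survives in `p` but is strongly attenuated in `smoothed`;
# a learned model that captures the rate/pressure relationship can retain it.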

Fig. 4.15(c) shows the pressure prediction for a constant flow rate. The learning algorithm discovered all the preset well/reservoir features in the forward model, which effectively achieves deconvolution. Fig. 4.15(d) demonstrates the pressure prediction for a multivariable flow rate history; the prediction is accurate compared to the true data.

Figure 4.15: Data mining results using convolution kernel method on Case 12: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.

CHAPTER 4. CONVOLUTION KERNEL 110

4.5.13 Case 13: Semireal Case B

Fig. 4.16 shows the test results on the second semireal case, Case 13. In Case 13,

the infinite-acting radial flow, wellbore effect, skin factor, and the constant pressure

boundary behavior all existed. The true data and the training data are shown in

Fig. 4.16(a). In this case, the flow rate history is longer and has more flow rate changes than in Case 12.
Fig. 4.16(b) shows the pressure reproduction of the true flow rate history. As in Case 12, the data mining algorithm detected the real reservoir behavior from within the noise and made an accurate pressure prediction. Fig. 4.16(c) shows the pressure prediction for a constant flow rate history. Method D captured the infinite-acting radial flow, skin factor, and the constant pressure boundary, but it deviated from the true data in the wellbore effect stage. Fig. 4.16(d) shows a good pressure prediction for the multivariable flow rate. Although Method D missed the wellbore effect in the prediction for the constant flow rate, the overall prediction still demonstrates its efficiency and accuracy.
Including the two semireal cases, the convolution kernel method, Method D, handled all test scenarios.

4.5.14 Case 14: Real Case A

Sections 4.5.14 and 4.5.15 report the results of applying Method D to two real cases, Case 14 and Case 15. Because they are real field cases, the reservoir model is not known; therefore, only the prediction and the real data are shown in the plots. Likewise, no artificial noise was added to the training data, so the machine learning algorithm was fed the real data set in its original form. Case 14 and Case 15 were two sections of one real field data set, a short section and a long section respectively, to illustrate the method working in different settings.

The results of Case 14 are shown in Fig. 4.17. The real data are shown in

Fig. 4.17(a). The pressure reproduction according to the real flow rate history is

shown in Fig. 4.17(b). The figure shows that Method D reproduced the original real data set well, honoring many details of the real data.

Figure 4.16: Data mining results using convolution kernel method on Case 13: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.


Fig. 4.17(c) and Fig. 4.17(d) demonstrate the pressure predictions for a constant flow rate and a multivariable flow rate. Fig. 4.17(c) suggests an infinite-acting radial flow region and a constant pressure boundary region in the derivative curve; no wellbore storage effect is suggested. We would like to emphasize an important use of the data mining pressure prediction. Supposing that the data mining method successfully retrieved the well/reservoir properties, Fig. 4.17(c) reflects the well/reservoir pressure response to a constant flow rate; therefore, the features seen in Fig. 4.17(c) are features of the well/reservoir. In other words, the data mining method, Method D, was reporting that the PDG well did not have a strong wellbore effect, but that the producing reservoir had a constant pressure boundary. The engineer could also obtain the permeability and radius of investigation from the prediction in Fig. 4.17(c) using conventional well testing methods. All this information about the well/reservoir model properties was extracted by the data mining algorithms; without them, it would be very hard for the engineer to identify such useful information from the raw PDG data shown in Fig. 4.17(a). Herein lies the great merit of the data mining methods.

4.5.15 Case 15: Real Case B

Case 15 is a second real field case, with a longer production period, as shown in Fig. 4.18(a). Fig. 4.18(b) demonstrates the pressure reproduction of the real flow rate history: Method D captured the overall trend and most of the details, such as the sudden pressure peak and the pressure variation at the curve corners. However, the pressure reproduction also showed a deviation of about 10 psi at the late stage of the pressure curve. We do not know the cause of this deviation; it may reflect an inaccuracy introduced by the data mining method, or it may imply some unknown reservoir event that changed the nature of the pressure response over time. For real practice, this prediction is a good clue for further investigation of the reservoir production.
From the pressure predictions for a constant flow rate (Fig. 4.18(c)) and a multivariable flow rate (Fig. 4.18(d)), we can have some confidence in the prediction.

Figure 4.17: Data mining results using convolution kernel method on Case 14: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.

Figure 4.18: Data mining results using convolution kernel method on Case 15: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.


This is because Case 14 and Case 15 came from the same real field PDG data set: Case 14 was selected to be shorter whereas Case 15 was longer; in fact, Case 14 was a part of Case 15. Because the two pieces of data come from the same reservoir, their pressure predictions should ultimately be close, and this is exactly what we saw in the two test cases. Fig. 4.18(c) shows only the infinite-acting radial flow and the constant pressure boundary, with no evidence of a wellbore effect; this is the same situation as in Case 14, discussed in Section 4.5.14. To make the comparison more direct, Fig. 4.19(a) and Fig. 4.19(b) plot the pressure predictions from the two cases together. The two predictions follow the same shape and cover the same range. The small offset between the two curves might come from the difference in duration; that is, Case 15's longer duration brought more reservoir information to the machine learning algorithm. For example, the longer duration in Case 15 allowed the pressure data to respond to the boundary effect, leading to a large decrease in the derivative curve in Fig. 4.19(a). Hence, the predictions in Cases 14 and 15 give us more confidence in the performance of the convolution kernel method.

Figure 4.19: Comparison between the predictions from Case 14 and Case 15: (a) comparison between the pressure predictions for the constant flow rate history on a log-log plot; (b) comparison between the pressure predictions for the multivariable flow rate history on a Cartesian plot.


4.5.16 Case 37: Real Case C (Cross Validation)

In the previous two real cases, Real Cases A and B, the true data were unknown, so it was hard to verify the predictions. In Real Case C, we applied cross validation to verify the prediction results. In the cross validation process, the real data set was divided into two parts: the first part was used as the training data set, while the second part was invisible to the data mining algorithm and was used solely as the test data set. If the data mining algorithm discovered the reservoir model behind the first part of the real data, it should predict the second part well.
Real Case C had nearly nine months of production history, as shown in Fig. 4.20(a). We used the first two thirds of the data as the training data set to train the data mining algorithm. Then we asked it to make three predictions: a reproduction of the training data set, a prediction for a constant flow rate history, and a prediction over the whole data set for cross validation, as shown in Figs. 4.20(b), 4.20(c) and 4.20(d) respectively. In Fig. 4.20(b), we may see that the data mining algorithm reproduced the training data set very well. In Fig. 4.20(d), the data mining algorithm made a good prediction for the last one third of the real data set. The fact that the prediction captured both the trend and the details in the last one third of the data set demonstrates the correctness of the prediction results and the effectiveness of the method.
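A minimal sketch of this chronological cross validation is given below, with scikit-learn's KernelRidge again standing in for the actual learner and placeholder data shaped loosely like Fig. 4.20(a); only the two-thirds/one-third split is taken from the text.

import numpy as np
from sklearn.kernel_ridge import KernelRidge

t = np.linspace(500.0, 800.0, 900)               # time (days), as in Fig. 4.20
q = 1.0e4 * (1.0 + 0.5 * np.sin(t / 30.0))       # placeholder rate history (STB/d)
p = -0.02 * q - 2.0 * (t - 500.0)                # placeholder pressure history (psi)

X, y = np.column_stack([t, q]), p
split = int(2 * len(t) / 3)                      # first two thirds for training

model = KernelRidge(kernel="rbf", alpha=1e-2, gamma=1e-3)
model.fit(X[:split], y[:split])

y_hat = model.predict(X[split:])                 # prediction on held-out third
rmse = np.sqrt(np.mean((y_hat - y[split:]) ** 2))
print(f"hold-out RMSE: {rmse:.1f} psi")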

4.6 Summary

In the 16 test cases, comprising nine synthetic cases, two complicated synthetic cases, two semireal cases and three real cases, Method D performed stably and predicted accurately in most scenarios. Compared with the simple kernel method (Method B), Method D not only captured the well/reservoir models better, but also overcame the limitations in predicting a multivariable flow rate history. In the complicated cases,

even though the constant flow rate period was too short for the reservoir response to

develop fully, Method D could still obtain the whole reservoir model correctly from

Figure 4.20: Data mining results using convolution kernel method on Case 37: (a) the full set of real data, with the first two thirds used as the training data set; (b) prediction using the first two thirds of the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction over the whole real data set.


the highly convoluted pressure history. Because PDGs operate during real production, there is rarely a long constant flow rate period, and in some cases there may be few shut-ins. The ability to extract the reservoir model from highly convoluted data not only widens the applicability of Method D in real practice, but also adds value to PDG data that were not fully utilized before. In the semireal cases, unlike

the traditional denoising method that would generally smooth the curve, Method D

detected the real reservoir response from the noise, exposing the tiny changes in the

pressure response that might indicate some specific reservoir events. In the first two

real cases that came from the same source, Method D showed its stability in the

extraction of the well/reservoir model. This gave us some confidence even though the

true answer was unknown. In Real Case C, the cross validation showed the robustness

of the method in a more direct manner. To sum up, with the new architecture of

“superposition over kernelization”, the data mining method using convolution kernel,

Method D, showed great potential in the interpretation of PDG data.

Beyond the application results, a more promising point is that Method D is not a single data mining method but a platform for constructing data mining methods. Using the same methodology, by varying the design of the input vector, the form of the kernel function, and the way the simple kernel functions are combined to form the convolution kernel, a family of Method-D-like data mining methods may be constructed and studied. This idea opens a new direction for further exploration of data mining techniques in the interpretation of PDG data.

One important problem that Method D faces is the computational cost. The most time-consuming step in Method D is the construction of the kernel matrix. According to the equations discussed in this chapter, the computational cost of the kernel matrix construction is O(N_p^4). This cost would decrease significantly if the breakpoints were known, because we would no longer need to assume that every sample before the current sample is a breakpoint. However, as discussed before, breakpoint detection is itself a difficult problem, so the scalability of the method is discussed further in Chapter 6.
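The scaling argument can be made concrete with the schematic sketch below: each of the N_p^2 kernel matrix entries needs a double sum over candidate breakpoints, itself O(N_p^2) when every earlier sample must be treated as a candidate, giving O(N_p^4) overall. The kernel form here only mimics the "superposition over kernelization" structure and is not the exact expression derived in this chapter.

import numpy as np

def base_kernel(a, b, gamma=0.05):
    # Simple kernel on (shifted) times; an illustrative choice only.
    return np.exp(-gamma * (a - b) ** 2)

def convolution_kernel_matrix(t, dq):
    """Schematic O(Np^4) construction with unknown breakpoints."""
    n = len(t)                                   # Np samples
    K = np.zeros((n, n))
    for i in range(n):                           # Np^2 entries ...
        for j in range(n):
            s = 0.0
            for k in range(i + 1):               # ... each an O(Np^2) double sum
                for l in range(j + 1):           # over candidate breakpoints
                    s += dq[k] * dq[l] * base_kernel(t[i] - t[k], t[j] - t[l])
            K[i, j] = s
    return K

# If the true breakpoints are known, the two inner loops run only over the
# (few) actual rate changes, collapsing the cost toward O(Np^2 * Nb^2).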

Although Method D showed powerful capability in the 16 test cases, many realistic and uncontrolled problems still challenge its performance in real field practice. In the next chapter, a few realistic performance-related problems will be addressed, including the effect of outliers and aberrant segments in the raw data, and the effects of, and remedies for, an incomplete production history. By addressing these problems, we obtained a further understanding of the performance of the data mining methods in real practice.

Chapter 5

Performance Analysis

Chapter 4 discussed the formulation, derivation and application of the convolution kernel method (Method D). The 16 test cases discussed in Section 4.5 focused on applying the convolution kernel method to different well/reservoir models, with flow rate profiles increasingly approaching those of a real reservoir. Accordingly, most of the examples, except the real cases, had complete production histories with added artificial noise. However, in real practice an application may face more complex difficulties within the data set that challenge the performance of the data mining method.

This chapter focuses on the performance analysis of the data mining method under

various situations, including:

• the existence of other kinds of noise in the data set, such as outliers and aberrant

segments;

• incomplete production history, such as missing pressure/flow rate records and unknown initial pressure;

• different sampling frequency of the data set; and

• the evolution of the learning process with the increasing size of the data set.



5.1 Outliers

In PDG data, due to the uncontrolled subsurface environment, outliers often exist

in addition to normal noise. An outlying observation, or outlier, is one that appears

to deviate markedly from other members of the sample in which it occurs (Grubbs,

1969). In PDG measurement, outliers may arise for many reasons: a malfunction of the sensor, a temporary and sudden change in the subsurface, an uncontrolled disturbance of the signal transmission or recording tool, and so on. Unlike the normal noise that may exist over the whole data set, the outliers account for only a very small portion of the measurement. However, they do affect PDG data interpretation, because they impose major discontinuities in the derivative calculation and make real breakpoint detection more difficult. In conventional well test analysis, outliers are mainly filtered out by hand; however, with the huge volume of PDG data, manual filtering becomes infeasible. Therefore, the data mining method is expected to tolerate outliers to some extent. Hence, this test was performed to investigate the performance of the convolution kernel method in the presence of outliers in addition to normal noise.

The test was performed as follows.

1. Construct a synthetic pressure and flow rate data set, arbitrarily replace a certain percentage of the pressure and flow rate data with outliers, and then add 3% artificial noise (normally distributed) to both the pressure and the flow rate data (a sketch of this construction follows the list).

2. Use the synthetic data set (with artificial outliers and artificial noise) as the

training data set. Apply convolution kernelized data mining algorithms (Method

D) to learn the data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without outliers or noise) and collect the prediction from the data mining

algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).


5. Feed the data mining algorithm with a constant flow rate history (without

outliers or noise) and collect the predicted pressures from the data mining al-

gorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

8. Feed the data mining algorithm with a multivariable flow rate history (with-

out outliers or noise) and collect the predicted pressures from the data mining

algorithm.

9. Construct a synthetic pressure according to the multivariable flow rate from Step 8, using the same wellbore/reservoir model from Step 1.

10. Compare the predicted pressure data (from Step 8) with the synthetic pressure

data (from Step 9).
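A minimal sketch of the data construction in Step 1 is shown below; the outlier magnitudes and placement are assumptions, chosen only so that the outliers "deviate markedly" from the rest of the sample.

import numpy as np

rng = np.random.default_rng(2)

def inject_outliers(signal, fraction, scale=3.0):
    """Replace a random `fraction` of samples with markedly deviating values."""
    out = signal.copy()
    idx = rng.choice(len(signal), size=int(fraction * len(signal)), replace=False)
    out[idx] += scale * np.std(signal) * rng.standard_normal(idx.size)
    return out

p = -np.linspace(0.0, 800.0, 500)                    # placeholder pressure (psi)
q = np.where(np.arange(500) < 250, 50.0, 80.0)       # placeholder rate (STB/d)

p_train = inject_outliers(p, 0.06)                   # Case 16: 6% pressure outliers
q_train = inject_outliers(q, 0.03)                   # Case 16: 3% rate outliers
p_train += 0.03 * np.abs(p_train) * rng.standard_normal(p_train.shape)
q_train += 0.03 * np.abs(q_train) * rng.standard_normal(q_train.shape)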

With this workflow, Method D was applied to three test cases, namely Cases 16, 17 and 18. A brief characterization of the test cases is listed in Table 5.1. The results of the test cases are discussed one by one below.

Case 16 had a moderate level of outliers: 6% of the pressure data and 3% of the flow rate data are outliers (in fact, given the accuracy of modern PDGs, these percentages are already high). Case 16 hence tests the convolution kernel method working with a normal level of outliers. Fig. 5.1(a) shows the training data in pink and the true data in blue. It can be seen that the outliers deviate from the true data in both pressure and flow rate. At the same time, 3% (in absolute value) normal noise was added throughout the data. Again, it should be noted that only the noisy training data with the outliers were fed into the learning algorithm; the true data are plotted here for comparison purposes only. Fig. 5.1(b) shows the pressure reproduction using the variable flow rate. Compared to the true data, the prediction is very precise.


Table 5.1: Test cases for outlier performance analysis

Test Case #  Test Case Characteristics
16           Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 6% of pressure and 3% of flow rate training data are outliers; 3% artificial normal noise added to all pressure and flow rate data.
17           Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 10% of pressure and 10% of flow rate training data are outliers; 3% artificial normal noise added to all pressure and flow rate data.
18           Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 10% of pressure training data are outliers, no outliers in flow rate; 3% artificial normal noise added to all pressure and flow rate data.

The normal noise and the outliers are both excluded from the prediction, leaving only the correct pressure. Fig. 5.1(c) shows the pressure prediction for a constant flow rate history on a log-log plot. The data mining algorithm successfully captured the wellbore storage, skin factor, infinite-acting radial flow, and the constant pressure boundary. Fig. 5.1(d) demonstrates the pressure prediction for a multivariable flow rate; the prediction is also good compared to the true answer.
In this test, no extra noise filtering or outlier removal was performed in advance: all pressure prediction results shown in Fig. 5.1 were based on data mining applied directly to the noisy, outlier-contaminated data. Therefore, we may conclude that the convolution kernel method, Method D, can tolerate a moderate level of outliers without any preprocessing.
Because the convolution kernel method could tolerate a moderate level of outliers, a further test was conducted using Case 17. In Case 17, 10% (by count) of the data are outliers, in both the pressure and flow rate data, as shown in Fig. 5.2(a). The increase in the number of outliers effectively cut the data set into small pieces, so the pressure prediction was affected. In Fig. 5.2(b), the reproduction of the pressure for the variable flow rate shows deviation in the two drawdown periods. However, the overall trend of the prediction is still reasonable.

Figure 5.1: Outlier performance test results on Case 16: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.


The deviation may be seen more clearly in the log-log plot in Fig. 5.2(c), mainly in the wellbore effect and infinite-acting radial flow stages. The constant pressure boundary is still captured by the data mining method. In this case the outliers interrupted the early transient, so they had a strong effect on those short-period behaviors because of the lack of good data. In the long-period behavior, such as the boundary, the good data still far outnumber the outliers, so the behavior was extracted successfully. Fig. 5.2(d) demonstrates the pressure prediction for the multivariable flow rate: deviation can be observed in the early stages, while at late time the prediction returned to the correct track.
Case 17 illustrates that outliers do affect the prediction of the convolution kernel method, especially when they make up a larger share of the data set. More specifically, the effect is serious in stages where the outliers are dense, and smaller in periods where good data dominate.

Compared with the pressure, the flow rate is more sensitive to outliers, because outliers make the breakpoints less evident, as discussed in Section 2.2.2. This was demonstrated by Case 18. In contrast to Case 17, in which outliers existed in both pressure and flow rate, in Case 18 10% (by count) outliers existed only in the pressure data, with no outliers in the flow rate data. This case shows that when the flow rate data are relatively clean, the convolution kernel method may still work well even though a large number of outliers exist in the pressure.
Fig. 5.3(a) shows the training data and the true data. Outliers exist in the pressure, while only normal noise exists in the flow rate. Fig. 5.3(b) shows the pressure reproduction for the variable flow rate. Compared with the reproduction in Fig. 5.2(b), the pressure prediction was much improved. In the log-log plot of Fig. 5.3(c), the improvement is more obvious: this time, the learning algorithm captured the wellbore storage, skin factor, and infinite-acting radial flow very well, and only some deviation was seen in the boundary stage. Fig. 5.3(d) shows the pressure prediction for the multivariable flow rate history; the precision of the prediction is better than that in Fig. 5.2(d).
Case 18 illustrated that outliers in the flow rate history have a greater effect on the precision of the prediction. A clean flow rate history will help the data mining algorithm make a good prediction.

Figure 5.2: Outlier performance test results on Case 17: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.



Figure 5.3: Outlier performance test results on Case 18: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.

Cases 16-18 verified the performance of the convolution kernel method in the presence of outliers. The convolution kernel method, Method D, can handle a moderate level of outliers naturally, without any preprocessing (such as outlier removal or noise filtering) in advance. However, when outliers account for a high percentage of the training data, the accuracy of the prediction decays as the proportion of outliers increases. Moreover, outliers in the flow rate have a greater effect on prediction precision than those in the pressure. Therefore, a clean flow rate history is desirable for an accurate prediction.

5.2 Aberrant Segments

Aberrant segments are also referred to as behavior excursions (Horne, 2007). If an

outlier is a sampling point that appears to deviate markedly from other members of

the sample in which it occurs, then an aberrant segment is a period of sampling points

that appears to deviate markedly from other members. In the context of data mining,

an aberrant segment has larger effect on the pressure prediction than a collection of

discontinuous outliers. This is not only because the data in the aberrant segments

decreases the density of good data, but also because the continuous data in the aber-

rant segment form a “second” controlling logic in addition to the true well/reservoir

model, which “puzzles” the data mining algorithm in the learning process.

Aberrant segments are often seen in real PDG data records. As such, the aberrant

segments (and also the normal noise and the outliers) must be treated as the inherent

behavior of PDG data (Horne, 2007), and hence, an accommodation of the aberrant

segments is needed. In this section, the tests on the convolution kernel method

working in the presence of aberrant segments are reported.

The tests were conducted as follows.

1. Construct a synthetic pressure, flow rate data set, arbitrarily replace a period

of pressure data with an aberrant segment, and then add 3% artificial noise

(normally distributed) to both the pressure and the flow rate data (a construction sketch follows this list).

2. Use the synthetic data set (with aberrant segments and artificial noise) as

the training data set. Apply convolution kernelized data mining algorithms

(Method D) to learn the data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without aberrant segments or noise) and collect the prediction from the data

mining algorithm.


4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without

aberrant segments or noise) and collect the predicted pressures from the data

mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

8. Feed the data mining algorithm with a multivariable flow rate history (without

aberrant segments or noise) and collect the predicted pressures from the data

mining algorithm.

9. Construct a synthetic pressure according to the multivariable flow rate from

Step 8 using the same wellbore/reservoir model from Step 1.

10. Compare the predicted pressure data (from Step 8) with the synthetic pressure

data (from Step 9).
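As a concrete illustration of Step 1, the following minimal sketch builds such a training set. This is not the code used in this study: the drawdown model is a stand-in for the actual wellbore/reservoir simulator, and reading "3% noise" as zero-mean Gaussian noise with a standard deviation of 3% of the mean signal magnitude is an assumption.

    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0.5, 200.0, 400)           # time grid, hours
    q_true = np.full_like(t, 50.0)             # stand-in flow rate history, STB/d
    dp_true = -80.0 * np.log(t + 1.0)          # stand-in pressure response, psi

    # Replace the period [120h, 150h] with an aberrant segment (a linear drift).
    mask = (t >= 120.0) & (t <= 150.0)
    dp_aber = dp_true.copy()
    dp_aber[mask] = np.linspace(dp_true[mask][0], dp_true[mask][0] + 200.0,
                                mask.sum())

    # Add the "3%" artificial noise to both signals (assumed interpretation).
    dp_train = dp_aber + rng.normal(0.0, 0.03 * np.abs(dp_true).mean(), t.size)
    q_train = q_true + rng.normal(0.0, 0.03 * np.abs(q_true).mean(), t.size)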

The convolution kernel method was applied to four test cases, namely Cases 19,

20, 21 and 22, as listed in Table 5.2. The aberrant segments in Cases 19, 20 and

21 deviate farther and farther away from the true data, such that the performance

of the convolution kernel method could be tested under different extent of aberrant

segments. Case 22 has the same aberrant segment as Case 21. However, in Case

22, the aberrant segment was removed in preprocessing, leaving an absent period

of pressure data in the training data set. Case 22 was utilized for comparison with Case 21, to determine whether exclusion of the aberrant segment would be helpful for the pressure prediction.

Table 5.2: Test cases for aberrant segment performance analysis

Test Case #  Test Case Characteristics

19  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 8% of pressure training data lay in an aberrant segment; 3% artificial normal noise added to all pressure and flow rate data.

20  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 8% of pressure training data lay in an aberrant segment that deviated far from the true data; 3% artificial normal noise added to all pressure and flow rate data.

21  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 8% of pressure training data lay in an aberrant segment that deviated totally from the true data, no outliers in flow rate; 3% artificial normal noise added to all pressure and flow rate data.

22  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 8% of pressure training data lay in an aberrant segment that deviated totally from the true data, no outliers in flow rate; 3% artificial normal noise added to all pressure and flow rate data. The difference between Case 21 and Case 22 was that in Case 22, the aberrant segment was excluded by preprocessing before feeding into the data mining algorithm.

Fig. 5.4 shows the results for Case 19. In Case 19, 8% (in number) of the pressure data deviated gradually from the true data, in addition to 3% normally distributed noise added globally. Fig. 5.4(a) shows the training data with the aberrant segment and

the noise in pink and the true data in blue. The aberrant segment is located in the

range of [120h, 150h]. The deviation of the aberrant segment increases gradually from

early time to late time. In addition to the aberrant segment, the artificial noise was

added everywhere in both the pressure and the flow rate data. After training the data mining algorithm on these noisy data containing the aberrant segment, the

prediction is shown in Figs. 5.4(b), 5.4(c), and 5.4(d).

Fig. 5.4(b) shows a pressure reproduction to the variable flow rate history. The

aberrant segment of the training data in the range of [120h, 150h] is not apparent in

the pressure prediction in Fig. 5.4(b). The pressure prediction is close to the true

data. In the log-log plot of Fig. 5.4(c), the convolution kernel method captures the

infinite-acting radial flow and the boundary effect. There is a small deviation in the

region of the wellbore effect. The pressure prediction to the multivariable flow rate

history in Fig. 5.4(d) is good compared to the true data. The last 20 hours show a slight deviation of about 5 psi, less than 1% of the pressure scale, introduced by the aberrant segment in the training data set.

In Case 19, the aberrant segment is moderate. In Case 20, we intensified the

aberrant segment, so that the whole segment deviated from the true data from the

beginning to the end. Fig. 5.5(a) shows the training data in pink and the true data

in blue. The figure shows that the aberrant segment in the range of [120h, 150h] is

separated completely from the true answer. Fig. 5.5(b) shows the pressure reproduc-

tion to the variable flow rate after being trained by the noisy data in Fig. 5.5(a).

The pressure prediction demonstrates that the learning algorithm tried to return to

the correct track in the aberrant segment range. However, the pressure prediction

deviates more than that in Case 19 (Fig. 5.4(b)) due to the effect of the aberrant seg-

ment. Nevertheless, the aberrant segment behavior was still recognized by the data

mining algorithm and excluded from the prediction. Fig. 5.5(c) shows the pressure

prediction to the constant flow rate history. Compared with Case 19 in Fig. 5.4(c),

the deviation in the wellbore effect region increases and a small deviation exists in the

infinite-acting radial flow region. However, the boundary behavior was still captured

well. In Fig. 5.5(d), the pressure prediction to the multivariable flow rate is good overall,


[Figure 5.4 appears here. Panel (a): ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours), True Data versus Noisy Data, with the aberrant segment annotated. Panels (b)-(d): ∆Pressure (psi) versus Time (hours), True Data versus Method D, with derivative curves in the log-log panel (c).]

Figure 5.4: Aberrant segment performance test results on Case 19: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.


but deviates from the true answer in the last 40 hours. In this way, Case 20 illustrates

that the accuracy of the pressure prediction would decay when the aberrant segment

is more severe.

[Figure 5.5 appears here. Panel (a): ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours), True Data versus Noisy Data, with the aberrant segment annotated. Panels (b)-(d): ∆Pressure (psi) versus Time (hours), True Data versus Method D, with derivative curves in the log-log panel (c).]

Figure 5.5: Aberrant segment performance test results on Case 20: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.

Case 21 is an extension of Case 20. In Case 21, a pressure transient to the

multivariable flow rate was used as the training data. In the training data, the

second hump of the pressure was replaced by a straight line, as shown in Fig. 5.6(a).

Compared with the aberrant segments in Cases 19 and 20 (Figs. 5.4(a) and 5.5(a)), the

aberrant segment in this case is very severe and deviated totally from the true data.


However, the prediction is not completely biased. Fig. 5.6(d) shows the pressure

reproduction to the multivariable flow rate history. What we see is that the data

mining algorithm still captures a good overall trend of the pressure prediction. Even

in the aberrant segment region of [120h, 150h], the pressure prediction still follows the shape of the normal (correct) pressure transient, while the straight-line-like shape of the aberrant segment is not apparent in the prediction result. Fig. 5.6(b) demonstrates the pressure prediction to a variable flow rate. Similar to the prediction

to the multivariable flow rate in Fig. 5.6(d), the prediction deviates from the true data,

while it still captures the overall trend so that the deviation is still in a reasonable

range. The pressure prediction to a constant flow rate history is shown in Fig. 5.6(c).

The derivative curve of the pressure prediction has a similar shape to that of the true

data. However, in the infinite-acting radial flow region the prediction runs parallel to the true data, which would introduce a deviation in the permeability estimate.
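For reference, the derivative curves in these log-log plots are the standard well-test diagnostic, the log-time derivative t d∆p/dt. The sketch below shows one common way to compute such a curve from sampled data; it is a minimal sketch assuming the classical definition, not the differentiation scheme actually used to generate the figures.

    import numpy as np

    def welltest_derivative(t, dp):
        """Log-time derivative d(dp)/d(ln t), which equals t * d(dp)/dt."""
        return np.gradient(dp, np.log(t))

    t = np.logspace(-1, 2.3, 200)        # 0.1 to about 200 hours
    dp = 70.0 * np.log(t + 1.0)          # stand-in drawdown response, psi
    deriv = welltest_derivative(t, dp)
    # On log-log axes, a flat derivative marks infinite-acting radial flow,
    # while a falling late-time derivative marks a constant pressure boundary.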

As Case 21 suggested that a severely aberrant segment would distort the pressure

prediction, a further test, Case 22, was carried out to determine whether the removal

of the aberrant segment would help in the accuracy of the prediction. A severely

aberrant segment would usually be easily detected (by human eye or by external

algorithms). Hence, it would be very helpful if a preremoval of the severe aberrant

segment could increase the precision of the prediction.

The convolution kernel method did not disappoint in this regard. Fig. 5.7(a) shows

the training data in pink and the true data in blue. We may see that the aberrant

segment in the range of [120h, 150h] in Fig. 5.6(a) does not appear in the training data

in Fig. 5.7(a). The pressure reproduction to the multivariable flow rate is shown in

Fig. 5.7(d). The pressure prediction is nearly identical to the true data. In the plot of

the pressure prediction to a variable flow rate in Fig. 5.7(b), the pressure prediction

returns to the correct track compared to that in Fig. 5.6(b). In the log-log plot of the

pressure prediction to a constant flow rate history in Fig. 5.7(c), all four behaviors

including the wellbore effect, skin effect, infinite-acting radial flow and the constant

pressure boundary are well revealed by the data mining algorithm.

The results of Cases 19-22 suggest that the data mining algorithm is robust to the

existence of the aberrant segments, although the accuracy of the pressure prediction


[Figure 5.6 appears here. Panel (a): ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours), True Data versus Noisy Data, with the aberrant segment annotated. Panels (b)-(d): ∆Pressure (psi) versus Time (hours), True Data versus Method D, with derivative curves in the log-log panel (c).]

Figure 5.6: Aberrant segment performance test results on Case 21: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.


[Figure 5.7 appears here. Panel (a): ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours), True Data versus Noisy Data, annotated where the data were removed. Panels (b)-(d): ∆Pressure (psi) versus Time (hours), True Data versus Method D, with derivative curves in the log-log panel (c).]

Figure 5.7: Aberrant segment performance test results on Case 22: (a) the training data set; (b) prediction using the variable flow rate history; (c) prediction using the constant flow rate history on a log-log plot; (d) prediction using the multivariable flow rate history on a Cartesian plot.


reduces as the aberrancy becomes more severe. The convolution kernel method may

handle moderate aberrant segments with a resulting small deviation in the prediction,

whereas a severe aberrant segment would affect the prediction in absolute value while

retaining the overall shape of the true response. A preremoval of the aberrant segment

improves the precision of the data mining algorithm significantly.

5.3 Partial Production History

In a PDG data set, it is common to see a period of production history that is missing.

The missing data might be due to many reasons, such as the unavailability of the measuring devices or unexpected loss in the data storage. In this case, the

data mining algorithm will face a partial production history. In this section, the

performance of the data mining algorithm working with a partial production history

is discussed.

In order to investigate the performance of the convolution kernel method working

with an incomplete data set, we first formed a semireal data set, as shown in Fig. 5.8.

We used this data set as the complete data set, and then used only the data in the

range of [100h, 300h] (data in the red box) as an incomplete data set to conduct the

tests.

The work flow of the test is as follows.

1. Construct a synthetic pressure, flow rate data set, and add 3% artificial noise

(normally distributed) to both the pressure and the flow rate data. This will

be the original complete data set as shown in Fig. 5.8.

2. Extract a part of the data in the range of [100h, 300h] from the original com-

plete data set (with artificial noise) as the training data set. Apply convolution

kernelized data mining algorithms (Method D) to learn the data set until con-

vergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without outliers or noise) and collect the prediction from the data mining

algorithm.


[Figure 5.8 appears here: ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours) over 300 hours, True Data versus Noisy Data.]

Figure 5.8: The original complete semireal data set for the performance test of the convolution kernel method working with partial production history.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without

outliers or noise) and collect the predicted pressures from the data mining

algorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

This work flow was performed in Test Case 23, listed in Table 5.3.

Table 5.3: Test case for partial production history performance analysis

Test Case #  Test Case Characteristics

23  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; use the data from 100h to 300h as the training data to simulate the situation in which the first 100h of production history is missing; no effective rate readjustment was made.

Fig. 5.9 demonstrates the results of Case 23. The pressure reproduction to the training flow rate is shown in Fig. 5.9(a). In the figure, the pressure appears to be reproduced well; the pressure near 170h and in the range of [200h, 220h] shows only slight deviations from the true data. However, this good result was not retained

in the pressure prediction to the constant flow rate history, demonstrated by a log-log

plot in Fig. 5.9(b). In the log-log plot, the pressure prediction is still close to that

of the true data. However, the derivative curve shows that the prediction misses the

wellbore effect and the infinite-acting radial flow. For the boundary, the prediction not only deviated from the true answer in absolute value, but also mistakenly replaced the constant pressure boundary behavior with a pseudosteady state boundary behavior.

These errors are caused by the missing production history, especially the missing

flow rate history in the first 100 hours. This could be explained by the construction

of the “parts” of the convolution kernel. Referring to Section 4.2, each part is defined

by a flow rate change event. Also in the convolution kernel method, in order to avoid

the breakpoint detection, each sampling point is treated as a flow rate change event.

Therefore, missing the flow rate history for the first 100 hours means that the first

part of the training data is defined by the sampling point at 100h. Because no earlier

part is available before it, this part was treated as representative of the first 100 hours, which means that the flow rate in the first 100 hours was treated as constant at its value at 100h. Referring to Fig. 5.8, the flow rate at 100h is not representative of the first 100 hours, due to a nearly 50-hour shut-in from around 50h to around 75h. The dramatic change of the flow rate in the first 100 hours causes the error in

the well/reservoir model extraction by the data mining algorithm.

An approach to fix this problem is to reconstruct the flow rate data in the miss-

ing period using the PDG pressure data, as discussed in the literature review (Sec-

tion 2.2.3). However, this method would not work when the PDG pressure data is also unavailable, which is quite possible when an unexpected data loss happens.

We proposed an alternative solution to handle this problem, namely effective rate


[Figure 5.9 appears here. Panels (a)-(b): ∆Pressure (psi) versus Time (hours), True Data versus Method D; panel (b) is a log-log plot with derivative curves.]

Figure 5.9: Partial production history performance test results on Case 23 (without effective rate correction): (a) pressure reproduction to the training flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

readjustment. In a reservoir, the cumulative production is usually well documented.

With this cumulative oil production, we may calculate the effective rate as:

\[
q_{\text{eff}} = \frac{Q^{(1)}}{t^{(1)}} \tag{5.1}
\]

where \(t^{(1)}\) is the first available time in the partial data set, and \(Q^{(1)}\) is the cumulative oil production at \(t^{(1)}\).

We then replace \(q^{(1)}\) (the flow rate at \(t^{(1)}\)) with this effective flow rate. That is, the effective flow rate is selected to construct the first part of the partial data, representing the flow rate of the whole missing period. A minimal sketch of this readjustment follows.
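This is a hypothetical NumPy fragment; t, q, and Q_cum_first are illustrative names for the times and rates of the partial data set and the documented cumulative production, with consistent units assumed:

    import numpy as np

    t = np.array([100.0, 101.0, 102.0])  # first available times, hours
    q = np.array([62.0, 58.0, 57.0])     # measured flow rates at those times
    Q_cum_first = 5400.0                 # documented cumulative production at
                                         # t[0] (illustrative value)

    q_eff = Q_cum_first / t[0]           # Eq. 5.1: effective rate representing
                                         # the whole missing period
    q[0] = q_eff                         # replace q(1) with the effective rate

Using the concept of the effective rate, a new work flow was formed as follows: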

1. Construct a synthetic pressure, flow rate data set, and add 3% artificial noise

(normally distributed) to both the pressure and the flow rate data. This will

be the original complete data set as shown in Fig. 5.8.

2. Extract the data in the range of [100h, 300h] in the original complete data

set (with artificial noise) as the training data set. Calculate the effective

flow rate for the first 100h. Replace the first flow rate (flow rate at


100h) with the effective rate. Apply the convolution kernelized data mining

algorithm (Method D) to learn the data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without outliers or noise) and collect the prediction from the data mining

algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without

outliers or noise) and collect the predicted pressures from the data mining

algorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

This work flow was applied to Test Case 24, listed in Table 5.4.

Table 5.4: Test case for partial production history performance analysis

Test Case # Test Case Characteristics

24  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; use the data from 100h to 300h as the training data to simulate the situation in which the first 100h of production history is missing; effective rate readjustment was made.

Fig. 5.10 demonstrates the results of Case 24. The pressure reproduction to the

training flow rate is shown in Fig. 5.10(a). The pressure reproduction is very close

to the true data. Compared to Fig. 5.9(a), the slight deviations near 170h and the

pressure in [200h, 220h] are not apparent in Fig. 5.10(a). Also, in the log-log plot

of Fig. 5.10(b) the pressure derivative curve prediction to the constant flow rate is


much improved compared to that in Fig. 5.9(b). The data mining algorithm captured the infinite-acting radial flow and the constant pressure boundary. A deviation exists only in the wellbore storage region. These results show that the pressure prediction improves when the effective rate readjustment is imposed.

[Figure 5.10 appears here. Panels (a)-(b): ∆Pressure (psi) versus Time (hours), True Data versus Method D; panel (b) is a log-log plot with derivative curves.]

Figure 5.10: Partial production history performance test results on Case 24 (with effective rate correction): (a) pressure reproduction to the training flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

In addition to the effective rate correction that was discussed above, we also inves-

tigated another kind of correction, namely effective time correction, to accommodate

the incomplete production history. In the effective time correction, instead of chang-

ing the flow rate, we readjusted the time of the incomplete production history. First,

we calculated the effective start time by Eq. 5.2.

\[
t_{\text{eff}} = \frac{Q^{(1)}}{q^{(1)}} \tag{5.2}
\]

where q(1) is the first available flow rate in the partial data set. Then, we shift the

incomplete production data set to the effective start time using Eq. 5.3

\[
\left(t^{(i)}\right)' = t^{(i)} - \left(t^{(1)} - t_{\text{eff}}\right), \qquad i = 1, \ldots, N_p \tag{5.3}
\]

In this effective time correction, we tried to use the first known flow rate as the constant average flow rate in the unknown production period, and adjusted the production time to satisfy the material balance between the constant average flow rate and the cumulative production. The reason we imposed this effective time correction came from the fact that the effective rate correction changed the first flow rate without changing the first pressure, leading to an unmatched flow-rate/pressure pair.
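A minimal sketch of this time shift, reusing the illustrative names from the effective rate sketch above (again an illustration, not code from the study):

    import numpy as np

    t = np.array([100.0, 101.0, 102.0])  # times of the partial data set, hours
    q = np.array([62.0, 58.0, 57.0])     # flow rates (consistent units assumed)
    Q_cum_first = 5400.0                 # cumulative production at t[0]

    t_eff = Q_cum_first / q[0]           # Eq. 5.2: effective start time
    t_shifted = t - (t[0] - t_eff)       # Eq. 5.3: shift the whole history so
                                         # that q(1), held over the new start
                                         # time, balances the cumulative production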

However, the test result did not meet our expectation. Applying this effective

time correction on Case 24, the results are shown in Fig. 5.11. Fig. 5.11(a) shows the

reproduction of the training data set which is an incomplete production history. The

reproduction is very close to the training pressure. However, the prediction to the

constant flow rate history on the log-log plot demonstrated in Fig. 5.11(b) suggests

another story. Apparently, the prediction misses the early stage (wellbore effect) and

the late stage (boundary effect) of the pressure transient. Only the infinite-acting

radial flow period is close to the true answer, yet still has an obvious deviation. These

results suggest that the effective time correction does not improve the prediction result

as well as the effective rate correction does, at least for Case 24. Further study might

be required to investigate the feasibility of the effective time correction thoroughly.

[Figure 5.11 appears here. Panels (a)-(b): ∆Pressure (psi) versus Time (hours), True Data versus Method D; panel (b) is a log-log plot with derivative curves.]

Figure 5.11: Partial production history performance test results on Case 24 (with effective time correction): (a) pressure reproduction to the training flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

Test Cases 23 and 24 in this section demonstrate the performance of the convolution kernel method working with a partial production history, without


production history (especially the missing flow rate history) affects the accuracy of

the prediction. However, by imposing the effective rate readjustment as in Case 24,

the effect of the missing period may be decreased to an acceptable level.

5.4 Unknown Initial Pressure

Similar to the missing production history, initial pressure identification is also a com-

monly seen problem in the PDG data analysis. In a real production, the initial

pressure of the reservoir might be unknown due to the long time elapsed since the

initial production, or due to the transition of the ownership of the reservoir. Even

though the initial pressure may be on file, it could be inappropriate due to subse-

quent shut-ins. All of these difficulties require the convolution kernel method to face

a unknown or inappropriate initial pressure, which is discussed in this section.

In order to address this problem, we first constructed a semireal data set, and

then shifted the initial pressure by 100 psi to form a training data to simulate the

scenario of inappropriate initial pressure, as shown in Fig. 5.12. This training data

set was the training data set for both Cases 25 and 26 discussed in this section. When

an inappropriate initial pressure is proposed, there would be an offset in the pressure

change compared to the true data. For example, in Fig. 5.12, the initial pressure of

the training data is set 100 psi more than that of the true data, leading to a constant

offset of 100 psi globally.

The study then investigated the effect of this inappropriate initial pressure by

using the work flow as follows.

1. Construct a synthetic pressure, flow rate data set.

2. Make the initial pressure 100 psi more than the true initial pressure, and add

3% artificial noise (normally distributed) to both the pressure and the flow rate

data. This will be the training data set with a wrong initial pressure. Apply

convolution kernelized data mining algorithms (Method D) to learn the data

set until convergence.


[Figure 5.12 appears here: ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours), True Data versus Noisy Data.]

Figure 5.12: The original true semireal data set and the training data for the performance test of the convolution kernel method working with a wrong initial pressure. The training data set has an offset from the true data due to the inappropriate initial pressure setting.

3. Feed the data mining algorithm with the training variable flow rate history

(without noise) and collect the prediction from the data mining algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

The work flow was performed on Test Case 25, as listed in Table 5.5.

Table 5.5: Test case for unknown initial pressure performance analysis

Test Case #  Test Case Characteristics

25  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; no optimization on the initial pressure.

Fig. 5.13 shows the results of Case 25. The pressure reproduction to the training flow rate history is shown in Fig. 5.13(a). Despite a global offset of 100 psi in the training pressure data, the pressure reproduction still tried to return to the correct track of the true data beginning at 100 hours into the prediction. However, the pressure

prediction in the last 100 hours still shows a deviation of around 100 psi, which is

a reflection of the effect of the inappropriate initial pressure. The log-log plot in

Fig. 5.13(b) demonstrates the pressure prediction to a constant flow rate history. The

derivative curves show that the pressure prediction deviates from the true answer in

the infinite-acting radial flow region. These plots show that the inappropriate initial

pressure results in inaccuracy in the pressure prediction.

[Figure 5.13 appears here. Panels (a)-(b): ∆Pressure (psi) versus Time (hours), True Data versus Method D; panel (b) is a log-log plot with derivative curves.]

Figure 5.13: Unknown initial pressure performance test results on Case 25: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

To solve the problem of unknown initial pressure, we imposed an outer iteration on the initial pressure during the data mining process. In this way, the initial pressure is treated as an unknown argument in the optimization iterations. In each optimization iteration, the initial pressure is used to regenerate the training data set on which the data mining algorithm is applied. When the data mining process finishes, we make the data mining algorithm reproduce the pressure to the training flow rate. There is a difference between the pressure reproduction and the training pressure. When the difference is small enough, we conclude that the pressure reproduction has converged to the training pressure, and hence obtain the initial pressure. Otherwise, the initial pressure is updated using the pressure difference for the next iteration. This logic can be expressed using the pseudocode in Algorithm 4.

Algorithm 4 Data mining coupled with optimization on initial pressure

    p_i^[0] = max{ p^(1), ..., p^(Np) }   // use the max pressure as the initial guess of the initial pressure
    iter = 0                              // initialize the iteration counter
    while iter < MAX_ITER do
        use p_i^[iter] as the initial pressure to update the training data set, obtaining y^[iter]
        apply the data mining algorithm on the new training data set
        use the data mining result to obtain the pressure reproduction y_pred^[iter] to the training flow rate history
        if y_pred^[iter] is convergent to y^[iter] then
            return p_i^[iter]
        end if
        update p_i^[iter+1] using the difference between y_pred^[iter] and y^[iter]
        iter = iter + 1                   // update the iteration counter
    end while
    return p_i^[MAX_ITER]
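A compact Python sketch of this outer loop may help clarify the control flow. Here train_and_reproduce stands in for the data mining black box (train Method D on the training set, then reproduce the pressure to the training flow rate history); its name, the convergence tolerance, and the mean-residual update rule are assumptions of this sketch, not part of Algorithm 4 itself.

    import numpy as np

    def estimate_initial_pressure(p_meas, q, train_and_reproduce,
                                  tol=1.0, max_iter=20):
        """Outer optimization on the initial pressure (cf. Algorithm 4).

        p_meas: measured absolute pressures; q: flow rate history;
        train_and_reproduce(y, q) -> y_pred is the data mining black box.
        """
        p_i = p_meas.max()                  # initial guess: the maximum pressure
        for _ in range(max_iter):
            y = p_meas - p_i                # regenerate the training targets (delta p)
            y_pred = train_and_reproduce(y, q)
            resid = y_pred - y              # reproduction vs. training pressure
            if np.abs(resid).mean() < tol:  # reproduction has converged
                return p_i
            p_i -= resid.mean()             # illustrative update from the mismatch
        return p_i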

With Algorithm 4, a new work flow was formed as follows.

1. Construct a synthetic pressure, flow rate data set.

2. Make the initial pressure 100 psi more than the true initial pressure, and add

3% artificial noise (normally distributed) to both the pressure and the flow rate

data. This will be the training data set with a wrong initial pressure. Apply

convolution kernelized data mining with initial pressure optimization

algorithm (Algorithm 4) to learn the data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without noise) and collect the prediction from the data mining algorithm.


4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

The new work flow was performed on Test Case 26, as listed in Table 5.6.

Table 5.6: Test case for unknown initial pressure analysis

Test Case # Test Case Characteristics

26  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; optimize the initial pressure as the outer loop over the data mining algorithm.

The results of Case 26 are demonstrated in Fig. 5.14. Fig. 5.14(a) shows the

pressure reproduction to the training flow rate. The pressure was reproduced well

compared to the true data. The log-log plot in Fig. 5.14(b) shows the pressure predic-

tion to the constant flow rate history. The derivative curves show that the pressure

prediction captures well the infinite-acting radial flow and the boundary, despite a

slight deviation in the wellbore effect region. The improvement in these plots suggests

that an optimization algorithm outside the data mining process helps to estimate the

appropriate initial pressure, and hence improve the accuracy of the pressure predic-

tion. In this optimization process, the data mining algorithm is actually used as a

black box whose input is the new guess of initial pressure, and whose output is the

pressure reproduction to the training flow rate history.

To sum up, in this section Case 25 demonstrated the effect of the inappropriate

initial pressure in the pressure prediction, while Case 26 suggested that an optimiza-

tion algorithm on the initial pressure outside the data mining algorithm would help to


[Figure 5.14 appears here. Panels (a)-(b): ∆Pressure (psi) versus Time (hours), True Data versus Method D; panel (b) is a log-log plot with derivative curves.]

Figure 5.14: Unknown initial pressure performance test results on Case 26: (a) prediction using the variable flow rate history; (b) prediction using the constant flow rate history on a log-log plot.

find the appropriate initial pressure and improve the precision of the pressure predic-

tion. As a byproduct, these tests show another application of the data mining algorithm: finding the appropriate initial pressure by using the data mining algorithm as a black box.

5.5 Sampling Frequency

Sampling frequency is an important property of the data set. The sampling rate of

the data set may be affected by factors in both the hardware and the software. On

one hand, each PDG device may be programmed to record at a specific frequency, hence determining the sampling frequency of the raw measurement. On the other hand, when the data set is large due to a high measurement frequency, the data set is usually resampled in preprocessing for performance reasons. As discussed in Section 4.3, the richness of the data fundamentally affects the data mining process in the pseudo-high-dimensional space. In this way, the sampling frequency also affects the precision of the data mining, because the sampling frequency is in effect the data density of the data set, which decides the richness of the data over a given period. The performance of the convolution kernel method working at


different sampling frequencies is investigated in this section.

[Figure 5.15 appears here: ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours), True Data versus Noisy Data.]

Figure 5.15: The original complete semireal data set for the sampling frequency tests for Cases 27-30.

To test the effect of the sampling frequency, we constructed a data set as shown

in Fig. 5.15. In this data set, there are 200 hours of data sampled by 400 points.

This set serves as the base data set for Test Cases 27-30, which resampled the data set in Fig. 5.15 at different frequencies to observe the behavior of the data mining algorithm. The detailed characterization of these test cases is listed in Table 5.7. According to the descriptions, the sampling frequencies for Cases 27-30 are 1 point per 4 hours, 1 point per 2 hours, 1 point per hour, and 1 point per half hour respectively.

Table 5.7: Test cases for sampling frequency performance analysis

Test Case #  Test Case Characteristics

27  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 200 hours of data sampled by 50 points.

28  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 200 hours of data sampled by 100 points.

29  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 200 hours of data sampled by 200 points.

30  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; 200 hours of data sampled by 400 points.

These test cases were conducted with the work flow as follows.

1. Construct a synthetic pressure, flow rate data set, and add 3% artificial noise (normally distributed) to both the pressure and the flow rate.

2. Resample the data set from Step 1 with a given sampling frequency to form a training data set (a resampling sketch follows this list). Apply the convolution kernel method to learn the training data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history (without noise) and collect the prediction from the data mining algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 2).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).
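The resampling in Step 2 can be as simple as keeping a uniform subset of points. A minimal sketch follows (the uniform-stride choice and the stand-in signals are assumptions; the study's actual resampler is not specified in this section):

    import numpy as np

    def resample(t, dp, q, n_keep):
        """Uniformly subsample a (time, pressure, rate) data set to n_keep points."""
        idx = np.linspace(0, t.size - 1, n_keep).round().astype(int)
        return t[idx], dp[idx], q[idx]

    # Base set: 200 hours sampled by 400 points (stand-in signals).
    t = np.linspace(0.5, 200.0, 400)
    dp = -80.0 * np.log(t + 1.0)
    q = np.full_like(t, 50.0)

    # Cases 27-30 keep 50, 100, 200, and 400 training points respectively.
    training_sets = {n: resample(t, dp, q, n) for n in (50, 100, 200, 400)}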

Fig. 5.16 shows the pressure reproductions to the training data set in Cases 27-

30. Recall that Case 27 has the least frequent sampling, so its training data set is the smallest of the four. Fewer data points lead to an incomplete kernel function basis, which explains the deviation in the corners in Fig. 5.16(a). Similarly, some slight deviations also exist in Case 28, as shown in Fig. 5.16(b). Compared with the pressure reproductions in Cases 27 and 28, those in Cases 29 and 30, demonstrated in Figs. 5.16(c) and 5.16(d), are so close to the true data that the true data are barely

seen behind the prediction. This is because the more frequent sampling brought more

data in the training data sets in Cases 29 and 30.

The effect of the sampling frequency can be seen more clearly in the log-log plots

in Fig. 5.17. Comparing the pressure derivative curves in the four log-log plots, it is


[Figure 5.16 appears here. Panels (a)-(d): ∆Pressure (psi) versus Time (hours) on Cartesian axes, True Data versus Method D.]

Figure 5.16: The pressure reproduction to the training flow rate history in (a) Case 27, (b) Case 28, (c) Case 29, and (d) Case 30.


apparent that the prediction improves consistently from Case 27 in Fig. 5.17(a) to

Case 30 in Fig. 5.17(d). In Fig. 5.17(a), the prediction captures the overall trend but

loses the details in the wellbore effect and the infinite-acting radial flow regions. In

Fig. 5.17(b), the infinite-acting radial flow region begins to approach the true answer,

while in Figs. 5.17(c) and 5.17(d) the infinite-acting radial flow is captured.

In Fig. 5.17, three observations attract our attention. First, in Fig. 5.17(a), al-

though the prediction loses the infinite-acting radial flow region, it captures the con-

stant pressure boundary very well. From the log-log plot we may see that in the

synthetic model, it takes at least 30 hours for the pressure to respond to the bound-

ary effect. Hence, more than 3/4 of the data (200 hours in total) lay in the region

of the boundary effect, leaving less than 1/4 of the data for the rest of the transient.

Considering there were only 50 points in total in Case 27, actually only around 10

points were involved in the prediction for the infinite-acting radial flow region, while

there were around 40 points for the boundary effect region. Therefore, the difference

for the prediction in the infinite-acting radial flow region and the boundary effect

region are still caused by the richness of the data. Secondly, comparing Fig. 5.17(c)

with Fig. 5.17(d), Case 30 has a better prediction in the infinite-acting radial flow

region, especially the connection to the boundary effect region, while Case 30 deviates

from the wellbore storage region. This is because the wellbore storage effect is a short

period feature dominating the derivative curve for only a few hours at the beginning.

However, the infinite-acting radial flow and the boundary effect are long lasting be-

haviors. Hence, when the data are more numerous, the learning algorithm prefers to

focus more on the lasting behavior rather than the short term behavior, leading to

the slight deviation in the wellbore storage region and a small improvement in the

infinite-acting radial flow region in Fig. 5.17(d). Finally, appropriate resampling did

not decrease the precision of the prediction much. For example, in Case 29, the size of the

training data set was half that of the original data set. However, the accuracy of the

prediction was not decreased. This implies the feasibility of resampling the data set

to improve the speed of the computation.

After the tests of Cases 27-30, we may conclude that the convolution kernel method

is able to work under different sampling frequencies. Too infrequent sampling leads


[Figure 5.17 appears here. Panels (a)-(d): log-log plots of ∆Pressure (psi) versus Time (hours) with derivative curves, True Data versus Method D.]

Figure 5.17: The pressure reproduction to the constant flow rate history in (a) Case 27, (b) Case 28, (c) Case 29, and (d) Case 30.


to deviation in the prediction due to the lack of data. However, appropriate sampling

retains the precision of the prediction as well as accelerates the computation of the

prediction. Determining the best sampling/resampling frequency is worth further exploration in a future study.

5.6 Evolution of Learning

Section 5.5 discussed the effect of the sampling frequency of the data set. However,

when the sampling frequency is fixed, the richness of the data set will depend on the

total time span of the data. A typical scenario relevant to the effect of the time span

is real-time measurement: more and more measurements are fed in as time elapses. Considering that the data mining algorithm extracts information only from the data it has, different time spans of the data are expected to bring different information to the data mining algorithm. Hence, in a real-time data analysis, an evolution of

the learning and the prediction will be seen along with the growth of the time span

of the training data. In this section, we demonstrate an evolution learning process,

and discuss the effect of the time span.

In order to investigate the issue, a semireal case was constructed, as shown in Fig. 5.18. This data set covers 200 hours sampled by 200 points. Four test cases, namely Cases 31-34, were then constructed by spanning different lengths of the data set. The characterization of the four test cases is listed in Table 5.8. Cases 31-34 spanned the first 25, 50, 100, and 200 hours of the original data set to simulate the real-time progress of data acquisition.

[Figure 5.18 appears here: ∆Pressure (psi) and Flow Rate (STB/d) versus Time (hours), True Data versus Noisy Data.]

Figure 5.18: The original complete semireal data set for evolution learning tests for Cases 31-34.

Table 5.8: Test cases for evolution learning performance analysis

Test Case #  Test Case Characteristics

31  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; first 25 hours of data.

32  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; first 50 hours of data.

33  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; first 100 hours of data.

34  Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary; first 200 hours of data.

These four tests were conducted using the work flow as follows.

1. Construct a synthetic pressure, flow rate data set, and add 3% artificial noise (normally distributed) to both the pressure and the flow rate.

2. Extract a piece of the data set from Step 1 with a given length of period to form the training data set (see the windowing sketch after this list). Apply the convolution kernel method to learn the training data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without noise) and collect the prediction from the data mining algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 2).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate from Step 5

using the same wellbore/reservoir model from Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).
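Step 2 amounts to truncating the data set at a growing horizon. A minimal sketch (stand-in arrays; the horizons follow Table 5.8):

    import numpy as np

    t = np.linspace(0.5, 200.0, 200)   # 200 hours sampled by 200 points
    dp = -80.0 * np.log(t + 1.0)       # stand-in pressure response, psi
    q = np.full_like(t, 50.0)          # stand-in flow rate history, STB/d

    # Cases 31-34 train on the first 25, 50, 100, and 200 hours respectively.
    windows = {}
    for horizon in (25.0, 50.0, 100.0, 200.0):
        keep = t <= horizon
        windows[horizon] = (t[keep], dp[keep], q[keep])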

Fig. 5.19 shows the pressure reproduction to the training flow rate history in

Cases 31-34. As there are only 25 hours of data (25 points) in Case 31, the pressure

reproduction has a slight deviation compared to the true data in Fig. 5.19(a) due

to the lack of data. For Cases 32-34, the pressure reproductions all seem good, as

demonstrated in Figs. 5.19(b) to 5.19(d).

However, log-log plots of the pressure prediction to the constant flow rate history reveal another side of the story, as demonstrated in Fig. 5.20. The derivative curves in the log-log plots suggest an evolution of the pressure prediction: in Fig. 5.20(a), the

pressure prediction only captures the wellbore effect and the infinite-acting radial flow,

while in Fig. 5.20(d) the prediction captures nearly all features of the well/reservoir

model. The whole evolution demonstrates the learning process of the data mining

algorithm. In the synthetic model, it requires at least 30 hours for the pressure to

respond to the constant pressure boundary. However, in Case 31, only the first 25

hour data were fed to the learning algorithm, so what the data mining algorithm saw

was only the wellbore effect, skin effect and the infinite-acting radial flow. That is

why the prediction in Fig. 5.20(a) does not show the constant pressure boundary in

the derivative curve. In Case 32, because the first 50 hours of data were provided,

it became possible for the data mining algorithm to detect the constant pressure


[Figure 5.19 appears here. Panels (a)-(d): ∆Pressure (psi) versus Time (hours) on Cartesian axes, True Data versus Method D.]

Figure 5.19: The pressure reproduction to the training flow rate history in (a) Case 31, (b) Case 32, (c) Case 33, and (d) Case 34.


boundary. Therefore, the derivative curve in Fig. 5.20(b) demonstrated a “dropping

tail” showing that the data mining algorithm had some detection of the boundary.

However, because the 50 hours was not sufficient for the boundary effect to develop

fully, the data mining algorithm was still not able to capture the boundary effect

accurately. Finally, in Cases 33 and 34, the time span of the data set was long enough,

so the predictions in Figs. 5.20(c) and 5.20(d) captured the boundary effect.

Supposing that there was an engineer processing these data in real time, he or she

would see this evolution of the prediction as a reflection of the data mining algorithm's deepening understanding of the well/reservoir as the time span of the data set grows. Furthermore, supposing there is a reservoir that has a production history of five years, the comparison between the prediction using the first two years and the prediction using the first five years could demonstrate a property change of the reservoir along with the continuing production. This implies a new application of the data mining approach: observing the reservoir property change by applying the data mining algorithm progressively to increasing time spans of the PDG data.

Cases 31-34 demonstrated the effect of the time span of the data set. Because

the data mining algorithm can only make predictions based on what it has learned

from the data, a short period of data leads to the absence of well/reservoir features

that require a long time to develop. Therefore, when the time span of the data set

grows, the prediction by the data mining algorithm also evolves. As a byproduct,

this evolution points to another potential application of the data mining algorithm

to evaluate the well/reservoir property change.

5.7 Summary

In this chapter, more realistic problems related to performance of the data mining

algorithm were discussed.

Sections 5.1 and 5.2 first discussed two kinds of commonly seen noise behaviors

of the PDG data, outliers and aberrant segments. These two behaviors may have a larger effect on the pressure prediction than the normal noise. The normal noise


[Figure 5.20 appears here. Panels (a)-(d): log-log plots of ∆Pressure (psi) versus Time (hours) with derivative curves, True Data versus Method D.]

Figure 5.20: The pressure reproduction to the constant flow rate history in (a) Case 31, (b) Case 32, (c) Case 33, and (d) Case 34.


is usually pervasive and relatively small compared to the absolute value of the true

measurement. However, the outliers and the aberrant segments are more arbitrary and deviate far more from the true measurement. The aberrant segments are especially problematic because the deviated data in an aberrant segment bring a second controlling logic that interferes with the ability of the data mining algorithm to focus on the true well/reservoir model. Nevertheless, the tests carried out here still demonstrated

a good tolerance of the data mining algorithm to a moderate level of outliers and

aberrant segments. For severe outliers, Section 5.1 suggested that a low-noise flow rate

history may improve the precision of the prediction effectively despite the existence

of severe outliers in the pressure. For severe aberrant segments, Section 5.2

demonstrated that a preremoval (just simple deletion, no interpolation required) of

the aberrant segment immediately corrected the prediction.

An incomplete production history and an unknown initial pressure are two further problems, discussed in Sections 5.3 and 5.4. These sections first showed the effect on the pressure prediction when the data mining algorithm works in these two kinds of situations. Thereafter, two solutions were found for the incomplete production history and the inappropriate initial pressure problems. For the partial production history, an effective rate \(q_{\text{eff}}\) was defined in Eq. 5.1. The effective rate, calculated using the cumulative production, represents the average flow rate in the missing period. Also

for the unknown initial pressure, an algorithm optimizing the initial pressure out-

side the data mining process was proposed, as shown in Algorithm 4. The algorithm

utilizes the data mining process as a black box to search for the appropriate initial

pressure iteratively until the pressure prediction converges to the training data. The

test cases demonstrated the feasibility of the two methods.

Finally, Sections 5.5 and 5.6 discussed the effect of sampling frequency and the

time span of the data set. The sampling frequency and the overall time span together

decide the total number of data points fed to the data mining algorithm. The test cases in the two sections demonstrated that either a low sampling frequency or a short time span of the data set would lead to deviation in the final prediction. However, the mech-

anisms of the inaccurate prediction in the two situations are slightly different. For the

infrequent sampling, the whole data set is provided to the data mining algorithm, but


not in detail, due to the low sampling rate. Therefore, the data mining algorithm may capture the overall trend but lose accuracy in the feature details. For the short time span, by contrast, only part of the data is provided to the data mining algorithm, so the algorithm can only infer the well/reservoir model from a partial data set that misses the features occurring after the training time span. Nevertheless, the deviations in the two scenarios are consistent in their mathematical essence: the lack of training data leads to an incomplete basis of kernel functions K(·, x^(i)), such that f_β(x) defined in Eq. 4.10 is not eligible to form an adequate estimator of the true function f (refer to Section 4.3). As a result, Section 5.5 suggested that an appropriate sampling rate, one that provides a sufficient basis of kernel functions, should retain good precision of the prediction as well as accelerate the data mining process. Also, Section 5.6 illustrated the importance of the completeness of the data set and revealed a potential approach to observing reservoir property change using the data mining methods.

Along with the performance analysis, there were three important byproducts worth future investigation. Firstly, the data mining method can be used as a black box in an iterative optimization process to discover the appropriate initial pressure of the reservoir. Secondly, resampling the PDG data set at an appropriate sampling rate does not harm the prediction, but improves the computational performance of the data mining process; it would be a very helpful supplement if there were a way to decide the appropriate sampling rate in advance of the training process. Finally, applying the data mining to different lengths of the data time span results in an evolution of the prediction, which could be utilized as a potential approach to observe the property change of the well/reservoir model. All of these point toward wide application of data mining methods in real practice.

Chapter 6

Rescalability

When the training data are numerous, the kernel matrix in the training equation, Eq. 4.18, will be large. Recall that the computational cost of constructing the kernel matrix is O(N_p^4) (refer to Section 4.6), so the large size of the matrix leads to low computational performance of the data mining process. To solve this problem, a direct solution is to reduce the size of the data by resampling the original data set at an appropriate sampling rate, as discussed in Section 5.5. An alternative idea is to rescale the large kernel matrix into a series of block matrices, and to solve the training equation (Eq. 4.18) through a series of equations involving the smaller block matrices. In this chapter, the idea of rescaling the large kernel matrix into smaller matrices is discussed. Based on the difference in the blocks that are used in training and prediction, two algorithms, namely the block algorithm and the advanced block algorithm, are discussed in Sections 6.1 and 6.2 respectively. Finally, in Section 6.3, a real field case is demonstrated in which both the resampling and the advanced block algorithm were applied.


6.1 Block Algorithm

To introduce the block algorithm, let us start with a simple example. Suppose that a training data set has a total of 400 training samples. Then the training equation is:

Kβ = y    (6.1)

where:

K = {K_ij | K_ij = K(x^(i), x^(j)), i, j = 1, ..., 400}    (6.2)

β = (β_1, ..., β_400)^T    (6.3)

y = (y_obs(1), ..., y_obs(400))^T    (6.4)

Then we divide the kernel matrix K ∈ ℜ^(400×400) into four 200 × 200 blocks. We have:

[ K_11  K_12 ] [ β_1 ]   [ y_1 ]
[ K_21  K_22 ] [ β_2 ] = [ y_2 ]    (6.5)

where:

K_11 = {K_ij | K_ij = K(x^(i), x^(j)), i, j = 1, ..., 200}    (6.6)

K_12 = {K_ij | K_ij = K(x^(i), x^(j)), i = 1, ..., 200, j = 201, ..., 400}    (6.7)

K_21 = {K_ij | K_ij = K(x^(i), x^(j)), i = 201, ..., 400, j = 1, ..., 200}    (6.8)

K_22 = {K_ij | K_ij = K(x^(i), x^(j)), i, j = 201, ..., 400}    (6.9)

β_1 = (β_1, ..., β_200)^T    (6.10)

β_2 = (β_201, ..., β_400)^T    (6.11)

y_1 = (y_obs(1), ..., y_obs(200))^T    (6.12)

y_2 = (y_obs(201), ..., y_obs(400))^T    (6.13)

Then Eq. 6.5 may be expanded as:

K_11 β_1 + K_12 β_2 = y_1    (6.14)


and

K_21 β_1 + K_22 β_2 = y_2    (6.15)

Focusing on Eq. 6.14, we find that the terms K_11 β_1 and y_1 relate to the first 200 samples only, while the term K_12 β_2 relates to the last 200 samples. Hence, Eq. 6.14 implies that the first 200 pressure observations (y_1) are affected not only by the flow rate history in the first 200 samples (K_11 β_1), but also by the flow rate history in the last 200 samples (K_12 β_2). This nonphysical implication inspired us to conceive a reduction scenario: supposing that only the first 200 samples were available for data mining, what would the training equation look like? Using the same variable definitions, the training equation would be:

K_11 β_1 = y_1    (6.16)

Comparing Eq. 6.14 with Eq. 6.16, the extra term in Eq. 6.14, K_12 β_2, is caused by the additional basis kernel functions brought in by the later 200 samples. Therefore, assuming that the first 200 samples are eligible to form an adequate estimator of the true function f (refer to Section 4.3), the extra term K_12 β_2 can be removed from Eq. 6.14. In this way, Eq. 6.14 degrades to Eq. 6.16. Hence, β_1 can be solved as:

β_1 = K_11^(-1) y_1    (6.17)

Substituting Eq. 6.17 into Eq. 6.15, we have:

β_2 = K_22^(-1) (y_2 − K_21 β_1)    (6.18)

Using Eq. 6.17 and Eq. 6.18, the coefficient vector β is solved in two steps using the block matrices K_ij instead of in one step using the full matrix K.

To make this case general, for any kernel matrix K ∈ ℜ^((u·v)×(u·v)), u, v ∈ N, the coefficient vector can be solved in v steps using the block matrices K_ij ∈ ℜ^(u×u), as demonstrated in Eq. 6.19, as long as u is large enough that, for all 1 ≤ k ≤ v, the data set {(x^(i), y^(i)) | 1 ≤ i ≤ k·u} is eligible to form an adequate estimator of the true function f behind that data set. With the calculated β, the pressure prediction can be made using the original prediction equation, Eq. 4.22.

β_1 = K_11^(-1) y_1
β_2 = K_22^(-1) (y_2 − K_21 β_1)
...
β_k = K_kk^(-1) (y_k − Σ_{l=1}^{k−1} K_kl β_l)
...
β_v = K_vv^(-1) (y_v − Σ_{l=1}^{v−1} K_vl β_l)    (6.19)

In this way, the original kernel matrix is divided into relatively small block matrices, and only the lower triangular blocks, roughly half of the total, are actually involved in the computation, as demonstrated in Fig. 6.1. Physically, the block algorithm can be explained as saying that a pressure transient is affected only by the preceding flow rate history.


Figure 6.1: The block matrices used in the block algorithm, taking a 7 × 7-block kernel matrix as an example.
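As a concrete illustration of Eq. 6.19, the following minimal Python/NumPy sketch solves for β block by block. The function name and the assumption that the sample count is a multiple of the block size are choices of this sketch, not of the dissertation.

    import numpy as np

    def block_solve(K, y, block):
        # Block algorithm of Eq. 6.19 (named Method E just below): solve
        # beta_k = K_kk^(-1) (y_k - sum_{l<k} K_kl beta_l) block by block,
        # touching only the lower triangular blocks of K.
        n = len(y)
        beta = np.empty(n)
        for start in range(0, n, block):
            k = slice(start, start + block)
            rhs = y[k].copy()
            for prev in range(0, start, block):
                l = slice(prev, prev + block)
                rhs -= K[k, l] @ beta[l]             # response to earlier blocks
            beta[k] = np.linalg.solve(K[k, k], rhs)  # solve the diagonal block
        return beta

In practice only the lower triangular blocks of K would be constructed at all, which is where the saving over the full-matrix solve comes from.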

To remain consistent with the naming convention used for the previous methods, we name this block algorithm Method E. The comparison between the normal convolution kernel method (Method D) and the block convolution kernel method (Method E) is shown in Table 6.1.


Table 6.1: Comparison between Method D and Method E

Method D
  Input Vector: x_k^(i) = (q_k^(i), q_k^(i) log t_k^(i), q_k^(i) t_k^(i), q_k^(i)/t_k^(i))^T
  Kernel Function: K(x^(i), x^(j)) = Σ_{k=1}^{i} Σ_{l=1}^{j} k(x_k^(i), x_l^(j)), with k(x_k^(i), x_l^(j)) = (x_k^(i))^T x_l^(j)
  Block Algorithm: No
  Blocks Used for Training: NA

Method E
  Input Vector: x_k^(i) = (q_k^(i), q_k^(i) log t_k^(i), q_k^(i) t_k^(i), q_k^(i)/t_k^(i))^T
  Kernel Function: K(x^(i), x^(j)) = Σ_{k=1}^{i} Σ_{l=1}^{j} k(x_k^(i), x_l^(j)), with k(x_k^(i), x_l^(j)) = (x_k^(i))^T x_l^(j)
  Block Algorithm: Yes
  Blocks Used for Training: Lower Triangular Blocks
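To make the quantities in Table 6.1 concrete, here is a minimal Python/NumPy sketch of one way the convolution kernel matrix could be assembled for a single shared (q, t) history. The cumulative-sum factorization exploits the linearity of the elementary kernel and is an implementation choice of this sketch, not necessarily the dissertation's code.

    import numpy as np

    def feature(q_k, t_k):
        # The four-feature input vector of Table 6.1 (t_k must be > 0).
        return np.array([q_k, q_k * np.log(t_k), q_k * t_k, q_k / t_k])

    def convolution_kernel_matrix(q, t):
        # K[i, j] = sum_{k<=i} sum_{l<=j} <x_k, x_l>. Because the elementary
        # kernel is linear, this factors as <S_i, S_j>, where S_i is the
        # cumulative sum of the feature vectors up to step i.
        X = np.array([feature(qk, tk) for qk, tk in zip(q, t)])  # N x 4
        S = np.cumsum(X, axis=0)
        return S @ S.T

With K built this way, the training equation Kβ = y can be solved directly for small N, or via the block algorithms of this chapter for large N.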


To test Method E, we constructed a semireal case, Case 35, as listed in Table 6.2.

There are 600 samples in Case 35. We used 200 as the size of the block matrices.

The test work flow is as follows.

Table 6.2: Test cases for rescalability test using Method E

Test Case #   Test Case Characteristics
35            Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary, covering 600 hours by 600 samples.

1. Construct a synthetic pressure and flow rate data set, and add 3% artificial noise (normally distributed) to both the pressure and the flow rate data (a minimal noise sketch follows this list).

2. Use the synthetic data set (with artificial noise) as the training data set. Apply the convolution kernelized data mining with the block algorithm (Method E) to learn the data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without noise) and collect the prediction from the data mining algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate in Step 5

using the same wellbore/reservoir model as Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

8. Feed the data mining algorithm with a multivariable flow rate history (without

noise) and collect the predicted pressures from the data mining algorithm.

9. Construct a synthetic pressure according to the multivariable flow rate in Step 8, using the same wellbore/reservoir model as Step 1.


10. Compare the predicted pressure data (from Step 8) with the synthetic pressure

data (from Step 9).
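Step 1 is the only step that manufactures data. Below is a minimal sketch of one plausible way to add the 3% normally distributed noise; scaling the noise by the sample magnitude is an assumption of this sketch, since the text does not spell out the scaling, and the arrays are hypothetical stand-ins.

    import numpy as np

    rng = np.random.default_rng(0)

    def add_noise(signal, level=0.03):
        # Perturb each sample by a zero-mean Gaussian whose standard
        # deviation is `level` times the sample magnitude.
        return signal + level * np.abs(signal) * rng.standard_normal(signal.shape)

    p_true = np.linspace(5000.0, 3500.0, 600)   # hypothetical synthetic pressure, psi
    q_true = np.full(600, 150.0)                # hypothetical synthetic rate, STB/d
    p_train = add_noise(p_true)                 # noisy pressure for training
    q_train = add_noise(q_true)                 # noisy flow rate for training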

Fig. 6.2 shows the test results of Case 35 using the block algorithm. Because the block size is 200 and the total number of data points is 600, the original kernel matrix was divided into nine block matrices. The noisy training data and the synthetic true data are shown in pink and in blue respectively in Fig. 6.2(a). Fig. 6.2(b) demonstrates the pressure reproduction to the training flow rate history; the pressure prediction is so close to the true data that the true data are barely seen. The pressure prediction to the constant flow rate history is shown in Fig. 6.2(c). The derivative curve shows that the block algorithm captured nearly all features of the well/reservoir model, including the wellbore storage effect, the skin factor effect, infinite-acting radial flow, and the constant pressure boundary. The only slight deviation exists in the connection region between the infinite-acting radial flow and the constant pressure boundary. The pressure prediction to the multivariable flow rate history in Fig. 6.2(d) retains good accuracy compared to the true data.

The results of Case 35 demonstrate the feasibility of the block algorithm (Method E). As discussed, Method E rescales the original kernel matrix into a series of block matrices and uses the lower triangular blocks in the training process. It took 318 minutes to complete the whole test of Case 35 (one training process with three predictions), compared to 478 minutes using Method D without the block algorithm, on an Intel Dual Core 2.66 GHz, 2 GB memory desktop. The performance increase comes from two aspects: the smaller dimension of the rescaled block matrices, and the decreased number (roughly half) of blocks used in the calculation due to the use of the lower triangular blocks only. Because Method E still utilizes the lower triangular block matrices, about half of the original kernel matrix, further simplification is needed to improve the computational performance further. This simplification, namely the advanced block algorithm, is discussed in the next section, Section 6.2.


[Figure 6.2 appears here: panel (a) shows ∆Pressure (psi) and Flow Rate (STB/d) versus Time for the real and noisy data; panels (b) and (d) compare the real data with the Method E predictions on Cartesian plots; panel (c) is a log-log plot with derivative curves.]

Figure 6.2: Test results on Case 35 using Method E (block algorithm): (a) the true data and the training data; (b) pressure prediction using the variable flow rate history; (c) pressure prediction using the constant flow rate history on a log-log plot; (d) pressure prediction using the multivariable flow rate history on a Cartesian plot.


6.2 Advanced Block Algorithm

To further improve the block algorithm, let us first revisit the solution of the coefficient vector in Eq. 6.19. The general form of the solution is:

β_k = K_kk^(-1) (y_k − Σ_{l=1}^{k−1} K_kl β_l)    (6.20)

The term Σ_{l=1}^{k−1} K_kl β_l in Eq. 6.20 can be interpreted as the total pressure response to the previous k − 1 blocks of flow rate changes, so the term y_k − Σ_{l=1}^{k−1} K_kl β_l is the pressure response to the flow rate changes in the current kth block. Considering that the pressure response to a flow rate change far in the past is small and limited, only the most recent flow rate changes actually dominate the total pressure response. In this way, if we make the further assumption that a pressure transient is related to at most one block of flow rate changes before the current block, Eq. 6.20 becomes:

β_k = K_kk^(-1) (y_k − K_{k,k−1} β_{k−1})    (6.21)

Then the blocks used from the original kernel matrix form a bidiagonal pattern, as demonstrated in Fig. 6.3.


Figure 6.3: The block matrices used in the advanced block algorithm, taking a 7 × 7-block kernel matrix as an example.
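Relative to the Method E sketch shown earlier, Eq. 6.21 changes only the inner loop: just the immediately preceding block contributes. A minimal sketch (names hypothetical):

    import numpy as np

    def advanced_block_solve(K, y, block):
        # Advanced block algorithm, a sketch of Eq. 6.21:
        # beta_k = K_kk^(-1) (y_k - K_{k,k-1} beta_{k-1}), so only the
        # bidiagonal blocks of K are ever touched.
        n = len(y)
        beta = np.empty(n)
        for start in range(0, n, block):
            k = slice(start, start + block)
            rhs = y[k].copy()
            if start > 0:
                prev = slice(start - block, start)
                rhs -= K[k, prev] @ beta[prev]   # only the previous block matters
            beta[k] = np.linalg.solve(K[k, k], rhs)
        return beta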


Using Eq. 6.21 to solve for the coefficient vector β, we form an advanced block algorithm. To distinguish it from the simpler block algorithm in Section 6.1, we name this advanced block algorithm Method F. Table 6.3 shows the comparison between Method E and Method F.

Table 6.3: Comparison between Method E and Method F

Method E
  Input Vector: x_k^(i) = (q_k^(i), q_k^(i) log t_k^(i), q_k^(i) t_k^(i), q_k^(i)/t_k^(i))^T
  Kernel Function: K(x^(i), x^(j)) = Σ_{k=1}^{i} Σ_{l=1}^{j} k(x_k^(i), x_l^(j)), with k(x_k^(i), x_l^(j)) = (x_k^(i))^T x_l^(j)
  Block Algorithm: Yes
  Blocks Used for Training: Lower Triangular Blocks

Method F
  Input Vector: x_k^(i) = (q_k^(i), q_k^(i) log t_k^(i), q_k^(i) t_k^(i), q_k^(i)/t_k^(i))^T
  Kernel Function: K(x^(i), x^(j)) = Σ_{k=1}^{i} Σ_{l=1}^{j} k(x_k^(i), x_l^(j)), with k(x_k^(i), x_l^(j)) = (x_k^(i))^T x_l^(j)
  Block Algorithm: Yes
  Blocks Used for Training: Bidiagonal Blocks

To test Method F, we used the same test case as for Method E, Test Case 35 listed in Table 6.4, and followed the work flow below.

Table 6.4: Test cases for rescalability test using Method F

Test Case #   Test Case Characteristics
35            Infinite-acting radial flow + wellbore effect + skin + constant pressure boundary, covering 600 hours by 600 samples.

1. Construct a synthetic pressure and flow rate data set, and add 3% artificial noise (normally distributed) to both the pressure and the flow rate data.

2. Use the synthetic data set (with artificial noise) as the training data set. Apply the convolution kernelized data mining with the advanced block algorithm (Method F) to learn the data set until convergence.

3. Feed the data mining algorithm with the training variable flow rate history

(without noise) and collect the prediction from the data mining algorithm.

4. Compare the predicted pressure data (from Step 3) with the synthetic pressure

data without noise (from Step 1).

5. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

6. Construct a synthetic pressure according to the constant flow rate in Step 5

using the same wellbore/reservoir model as Step 1.

7. Compare the predicted pressure data (from Step 5) with the synthetic pressure

data (from Step 6).

8. Feed the data mining algorithm with a multivariable flow rate history (without

noise) and collect the predicted pressures from the data mining algorithm.

9. Construct a synthetic pressure according to the multivariable flow rate in Step 8, using the same wellbore/reservoir model as Step 1.

10. Compare the predicted pressure data (from Step 8) with the synthetic pressure

data (from Step 9).

The test results are shown in Fig. 6.4. Because the block size is 200 and the total number of data points is 600, the original kernel matrix was divided into nine block matrices. Fig. 6.4(a) shows the noisy training data (in pink) and the true data (in blue). The pressure reproduction to the training flow rate history is demonstrated


in Fig. 6.4(b). We may see that most of the pressure predictions are very close to the synthetic true data, except in two regions, at 200 h and 400 h. Because the size of the block matrices is 200, these two points are exactly where the blocks connect; the sharp kinks there are caused by the connections between the blocks. Despite these slight deviations at the block connections, the overall prediction is still acceptable. Fig. 6.4(c) demonstrates the pressure prediction to a constant flow rate history. The derivative curve shows that Method F captured the major features of the well/reservoir model, including the wellbore effect, the skin factor effect, infinite-acting radial flow, and the constant pressure boundary. Fig. 6.4(d) also shows a good pressure prediction to the multivariable flow rate history.

It took Method F 138 minutes to complete the whole test of Case 35 (one training with three predictions), compared to 318 minutes with Method E and 478 minutes with Method D. Method F increased the performance while sacrificing a little prediction precision. However, the precision sacrifice is still acceptable considering the performance increase, and when the training data set is larger, the computational advantage of Method F will be more substantial. The next section describes the application of Method F to a real field case with different block sizes.

6.3 Real Data Application

To demonstrate the application of Method F in real practice, a real field test case

was conducted. In this real case, the PDG sampling rate is one measurement per

2.6 minutes. Hence, there are around 140,000 sampling points for about 250-day

production history, as shown in Fig. 6.5.

Considering that 140,000 samples are too many even for the block algorithm, a resampling was performed to reduce the size of the data set to 600 samples, or one sample per 10 hours. Fig. 6.6 demonstrates the comparison between the original real field data and the resampled real field data. In the zoom-out view in Fig. 6.6(a), the resampled data are nearly the same as the full-size data set, which implies that the resampling rate is adequate overall. However, the zoom-in view in Fig. 6.6(b) shows


[Figure 6.4 appears here: panel (a) shows ∆Pressure (psi) and Flow Rate (STB/d) versus Time for the real and noisy data; panels (b) and (d) compare the real data with the Method F predictions on Cartesian plots; panel (c) is a log-log plot with derivative curves.]

Figure 6.4: Test results on Case 35 using Method F (advanced block algorithm): (a) the true data and the training data; (b) pressure prediction using the variable flow rate history; (c) pressure prediction using the constant flow rate history on a log-log plot; (d) pressure prediction using the multivariable flow rate history on a Cartesian plot.


[Figure 6.5 appears here: ∆Pressure (psi) and Flow Rate (STB/d) versus Time (days) for the real field data.]

Figure 6.5: The real field data from the 250-day production history, sampled by 140,000 points.

the difference between the two. Over a five-day zoom-in range, the resampled data follow the trend of the original data set while losing the local variation. This resampled data set was the data set of Test Case 36, listed in Table 6.5.

Table 6.5: Test cases for rescalability test on large PDG data set

Test Case #   Test Case Characteristics
36            Use the data set in Fig. 6.5 as the original data set, resampled to 600 points.
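A minimal sketch of such a resampling step follows. Keeping every N/600-th measurement is one plausible choice (averaging within each 10-hour bin would be another), and the text does not specify which was used; the arrays below are hypothetical stand-ins for the record in Fig. 6.5.

    import numpy as np

    def resample_uniform(t, p, q, n_out=600):
        # Keep roughly n_out evenly spaced measurements from a dense record.
        step = max(1, len(t) // n_out)
        idx = np.arange(0, len(t), step)[:n_out]
        return t[idx], p[idx], q[idx]

    t_full = np.linspace(500.0, 800.0, 140000)   # days (hypothetical)
    p_full = -800.0 * np.random.rand(140000)     # delta-pressure, psi (hypothetical)
    q_full = 20000.0 * np.random.rand(140000)    # flow rate, STB/d (hypothetical)
    t600, p600, q600 = resample_uniform(t_full, p_full, q_full)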

Test Case 36 was performed three times with three different block sizes: 600, 300, and 200. Hence, the total numbers of blocks in the three executions are one, four, and nine. Using different block sizes allowed us to observe the effect of the block size. When the block size is 600, the kernel matrix becomes a single block, and Method F is then essentially the same as Method D, because the bidiagonal blocks reduce to the full kernel matrix. Therefore, in addition to observing the effect of the block size, a comparison between Method F and Method D can also be seen in these tests. In each execution, the test was conducted with the work flow as follows.

[Figure 6.6 appears here: (a) zoom-out and (b) zoom-in views of ∆Pressure (psi) and Flow Rate (STB/d) versus Time (days), overlaying the real data (140,000 samples) and the resampled data (600 samples).]

Figure 6.6: The original real field data set (140,000 samples) and the resampled training data set (600 samples) for the rescalability test in Case 36: (a) zoom-out view, and (b) zoom-in view.

1. Use the real data set as the training data set. Apply the convolution kernelized data mining algorithms (Method D or Method F) to learn the data set until convergence.

2. Feed the data mining algorithm with the training variable flow rate history (real

flow rate history) and collect the prediction from the data mining algorithm.

3. Feed the data mining algorithm with a constant flow rate history (without noise)

and collect the predicted pressures from the data mining algorithm.

4. Feed the data mining algorithm with a multivariable flow rate history (without

noise) and collect the predicted pressures from the data mining algorithm.

Fig. 6.7 shows the results for Case 36. The original field data are shown in Fig. 6.7(a). Fig. 6.7(b) shows the pressure reproductions to the training flow rate history; all three block sizes return good pressure reproductions compared to the real field data, and no obvious difference between the three tests is observed. Fig. 6.7(c) demonstrates the pressure predictions to a constant flow rate with the three different block sizes. The derivative curves in Fig. 6.7(c) show that all three block sizes capture similar well/reservoir features, including infinite-acting radial flow and a constant pressure boundary. Also, none of the derivatives shows a wellbore storage


effect. These consistencies give us some confidence in the accuracy and robustness of Method F, although the true answer is unknown. Fig. 6.7(d) shows the pressure predictions to a multivariable flow rate history; the consistency was retained for all three block sizes.

[Figure 6.7 appears here: panel (a) shows the original field data; panels (b)-(d) overlay the Method F results for block sizes 600×1, 300×2, and 200×3, with derivative curves in the log-log panel (c).]

Figure 6.7: Rescalability test results on Case 36: (a) the original real field data (140,000 samples); (b) pressure reproduction to the training flow rate history; (c) pressure prediction using the constant flow rate history on a log-log plot; (d) pressure prediction using the multivariable flow rate history on a Cartesian plot.

The execution times of Case 36 with the three different block sizes are listed in Table 6.6. Each execution time covers one training process and three predictions (pressure reproduction, constant flow rate history, and multivariable flow rate history). As expected, the block size of 200 has the shortest execution time, due to the dramatic decrease in the amount of calculation, while the block size of 600 has the longest execution time, due to the use of the full kernel matrix in the calculation.

Table 6.6: Execution time of Case 36 with different block sizes

Block Size   Execution Time (minutes)
600          492
300          298
200          141

The three tests on the same Case 36 with different block sizes illustrated the feasibility of the advanced block algorithm in real field practice. Meanwhile, the three different block sizes did not show obvious differences in the pressure prediction. On one hand, this gives us confidence in the method; on the other hand, it demonstrates that the blocking did not affect the pressure prediction by much, at least in this test case.

6.4 Summary

In order to improve the performance of the learning process for large data sets, the kernel matrix in the training process has to be rescaled to an appropriate size for calculation. In this chapter, two ways of rescaling the kernel matrix, the block algorithm (Method E) and the advanced block algorithm (Method F), were investigated. Both of them rescale the original kernel matrix into a series of block matrices; however, Method E utilizes the lower triangular blocks for training, whereas Method F utilizes the bidiagonal blocks. The semireal case and the real field case demonstrated the feasibility of the two methods. In the tests, Method F showed better performance compared to Methods D and E.

However, the success of Methods E and F rests on an important condition: an appropriate block size. For Method E, an appropriate block size helps to maintain the eligibility of the data in the blocks to form an adequate estimator of the true function. For Method F, in addition to the adequate-estimator issue, an appropriate block size requires that flow rate changes that happened two or more blocks earlier have an insignificant effect on the current pressure transient; this is what releases the earlier blocks from the coefficient calculation. Hence, too small a block size would be detrimental to the pressure prediction.

The higher computational performance of Method F comes at the cost of a small loss in prediction precision. However, the precision of Method F can be increased by increasing the count of blocks involved in the coefficient calculation. In Section 6.2, only two diagonals of blocks were used in the calculation, but in real practice the count could be increased (at the largest count, Method F becomes Method E). In this way, however, the increase in precision brings a decrease in computational performance, because more block matrices are involved in the calculation. What block count properly balances the prediction precision and the computational performance is worth future investigation (see the generalized sketch below).
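The trade-off can be expressed with a single band-width parameter. The minimal sketch below generalizes the earlier Method E/F sketches; the parameter m and the function name are hypothetical.

    import numpy as np

    def banded_block_solve(K, y, block, m=1):
        # Banded generalization of the two block algorithms: keep the m most
        # recent preceding blocks in each update. m = 1 reproduces Method F
        # (bidiagonal blocks); m large enough reproduces Method E (all lower
        # triangular blocks). m is the knob trading precision for speed.
        n = len(y)
        beta = np.empty(n)
        for start in range(0, n, block):
            k = slice(start, start + block)
            rhs = y[k].copy()
            for prev in range(max(0, start - m * block), start, block):
                l = slice(prev, prev + block)
                rhs -= K[k, l] @ beta[l]
            beta[k] = np.linalg.solve(K[k, k], rhs)
        return beta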

Chapter 7

Conclusion and Future Work

Results obtained in this study show that data mining can be a useful mathematical tool for PDG data analysis. By using the data mining method, the well/reservoir model can be discovered in the form of hypothesis parameters in a pseudo-high-dimensional space defined by the kernel function. Here is a summary of the main points of this study.

1. The nonparametric data mining algorithms do not require any physical model

or mathematical assumption ahead of time. As long as the algorithm puts all

the possible features in the input vector, the data mining methods will find a

suitable functional form in the high-dimensional space and thereby discover the

most appropriate reservoir model in the process.

2. The data mining approaches cointerpret the pressure and flow rate data simulta-

neously by utilizing both the pressure and the flow rate in the training process.

This provides a way to make use of flow rate measurements that can now be

recorded with some modern PDG tools.

3. The data mining methods do not require constant flow rate, and utilize the whole set of variable flow rate PDG data. The procedures also work well in the absence of any shut-in periods, which are the periods most commonly used by present analysis techniques.


4. The data mining methods tolerate noise in the data set naturally. No denoising

procedure is required in advance, and in fact the procedure provides a robust

way of removing noise without removing reservoir response signal.

5. The data mining approach can help the reservoir management in different ways:

• The prediction results of the data mining approaches may be analyzed

using conventional well test methods, and hence provide better character-

ization of the well and the reservoir.

• The data mining approaches can make pressure prediction to complex flow

rate histories so that the prediction result can be used for production

optimization or history matching.

• The data mining approaches can reproduce the pressure to a clipped flow

rate history to denoise the data set.

6. Among Methods A–D, Method D, which uses the convolution kernel method, was found to be preferable due to its superiority in the following aspects:

• Method D has accurate prediction in most cases, and overcomes the limi-

tation of predicting to a multivariable flow rate history.

• Method D does not require knowledge of the break points in advance while

still giving accurate prediction.

• Method D has a high level of tolerance to outliers and aberrant segments

in addition to the normal noise.

• Method D handles the incomplete production history by imposing an ef-

fective rate as a readjustment.

• Method D works under the condition of unknown initial pressure value by

an iterative process optimizing the initial pressure.

Hence, Method D should attract more attention in future study and in field

application.


7. When the data set is large, a resampling with an appropriate sampling rate helps

to improve the computational performance while maintaining the precision of

the prediction. In addition to the resampling, rescaling the kernel matrix with

Method E or F also improves the performance. In real practice, it is efficient to

use both resampling and rescaling methods, that is, to resample the data set to

a proper size, and then apply Method E or F on the reduced data set.

8. Comparing the two rescaling methods, Method E and Method F, Method F is preferable for the following reasons.

• Method F utilizes the bidiagonal block matrices only, and hence provides

better computational performance. Although there is some loss in the pre-

diction precision, the sacrifice in the precision is still acceptable considering

the performance increase.

• The performance and the precision of Method F may be balanced by increasing the count of block matrices in the computation.

As the data mining algorithm is a new approach to PDG data interpretation, the work completed in this study is only a start. There are quite a few improvements that could be made, in the following aspects.

Appropriate resampling rate: As discussed in Section 5.5, an appropriate resam-

pling rate helps to improve the performance of the data mining process while

maintaining the precision of the prediction. Therefore, it will be very helpful to

determine an appropriate sampling rate in advance of the data mining process.

Appropriate count of involved block matrices: Chapter 6 demonstrated that Method F efficiently improves the computational performance by using the bidiagonal block matrices. However, there is some precision loss in the pressure prediction. The performance and the precision can be balanced by increasing the count of the block matrices involved in the computation. Hence, more investigation is needed to determine the appropriate count of block matrices for Method F.


Discovery of unknown initial pressure: As demonstrated in Section 5.4, the data mining algorithm can be utilized as a black box in an iterative process to discover the appropriate initial pressure. In this way, PDG data also become a promising resource for initial pressure recovery. More tests should be performed to establish a best practice for initial pressure discovery using the data mining algorithms.

Observation of reservoir property change: Section 5.6 showed an alternative application of PDG data in observing reservoir property change through the evolution of pressure predictions made over different time spans of PDG data. The reservoir property change may have many causes, such as water flooding, hydraulic fracturing, etc. Close monitoring of reservoir property change may help to revise the reservoir production settings to accommodate the subsurface changes. Therefore, more investigation is required to formulate a proper work flow for applying the data mining approaches to this specific purpose.

Unsynchronized data: As mentioned in Chapter 1, PDGs in the early stage did not have the capability of measuring the downhole flow rate, and the flow rate data were mostly provided by other sources at that time. In this situation, the pressure measurement usually has a more frequent sampling rate, while the flow rate data are just a collection of sparsely distributed points; the pressure and flow rate data are therefore not synchronized. Even today, unsynchronized data are still very common. In order to apply the data mining approaches to those data sets, it is worth improving the current data mining approaches to accommodate unsynchronized data.

Temperature data: As an advantage discussed in this dissertation, the data mining algorithm does not impose any physical model in advance. Therefore, the method should have the capability to discover the relationships not only between pressure and flow rate data, but also between other measurements, such as temperature. Because modern PDGs may provide pressure, flow rate, and temperature data at each time step, it will be a good study direction to utilize the data mining approaches in the cointerpretation of pressure, flow rate, and temperature data simultaneously.

Multiphase flow: In this study, the test cases were all single-phase. In a multiphase flow well, the flow rate of each phase could also be provided in addition to the total flow rate. More variables bring more complex relationships behind the data. It would be interesting to apply the data mining approaches to reveal the relationships among all the variables.

Multiple wells: Multiple-well interaction also requires further investigation. In real reservoir production, interaction among wells, including production wells and injection wells, is very common. In conventional well testing, the interference test is used to obtain reservoir properties through the interaction between wells. In the data mining context, multiple wells not only bring in more data from different sources, but also indicate more complex relationships: the data mining methods face the relationships between variables as well as the relationships between wells. This study direction is very promising and important, because it implies an integrated interpretation across wells, a target chased by generations of reservoir engineers.

Parallel computation: The performance bottleneck of the data mining approaches lies in the heavy computational load of constructing the kernel matrix. In addition to resampling the data set and rescaling the kernel matrix, there is an alternative approach brought by improvements in computation techniques: parallel computation. Because each element in the kernel matrix is independent, parallel computation may be used in constructing the kernel matrix. For example, for Case 35 it took Method D nearly 500 minutes to complete; with ideal speedup, a five-thread computation would decrease this time to roughly 100 minutes. The advantage of parallel computation is that it increases the computational performance significantly without any reduction in the prediction precision, which would enable fast decisions with the data mining approaches. Thus, further investigation in this direction will accelerate the application of the data mining approaches in real practice.
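A minimal sketch of the idea follows, using Python's standard process pool and a plain linear kernel standing in for the convolution kernel of Chapter 6. Shipping the full feature matrix to every worker is wasteful, and a real implementation would share it; treat this purely as an illustration of the row-level independence.

    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def kernel_row(args):
        # Row i of the kernel matrix: K[i, j] = k(x_i, x_j). Rows are
        # independent, which is what makes the construction parallelizable.
        i, X = args
        return X @ X[i]                   # linear kernel stand-in

    def kernel_matrix_parallel(X, workers=5):
        with ProcessPoolExecutor(max_workers=workers) as pool:
            rows = list(pool.map(kernel_row, ((i, X) for i in range(len(X)))))
        return np.stack(rows)

    if __name__ == "__main__":            # required for process pools on some platforms
        X = np.random.rand(600, 4)        # hypothetical feature vectors
        K = kernel_matrix_parallel(X)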


Nowadays, data mining approaches are already used in many walks of daily life, but this study may be the first trial of the data mining approach in PDG data interpretation. The aspects of future work listed above may be just a small portion of the future applications of data mining. At the end of this research, we hope it will not be long before data mining approaches are used widely for reservoir data analysis in the resource industry.

Appendix A

Data

Table A.1: Data for Case 1

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB

Table A.2: Data for Case 2

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
C           1E-3    STB/psi


Table A.3: Data for Case 3

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
s           1       NA

Table A.4: Data for Case 4

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
C           1E-3    STB/psi
s           1       NA

Table A.5: Data for Case 5

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft


Table A.6: Data for Case 6

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft

Table A.7: Data for Case 7

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA


Table A.8: Data for Case 8

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA

Table A.9: Data for Case 9

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
Ω           0.1     NA
λ           1E-7    NA


Table A.10: Data for Case 10

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA

Table A.11: Data for Case 11

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA


Table A.12: Data for Case 12

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA

Table A.13: Data for Case 13

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA


Table A.14: Data for Case 14

Time Flow Rate Pressure Time Flow Rate Pressure

day STB/day psi day STB/day psi

332.218611 21620.832 8649.1488 341.336667 6424.8864 8878.1664

332.308889 21635.1264 8648.7648 341.426944 6273.7248 8880.9408

332.399167 21638.8416 8649.1872 341.517223 5306.2176 8884.4256

332.489444 21400.08 8651.9712 341.6075 5275.2 8884.2144

332.579723 21083.9424 8656.7328 341.697777 5276.8608 8883.9456

332.67 21169.9872 8655.552 341.788056 5271.84 8883.8016

332.760277 21141.7344 8655.9456 341.878333 5270.5824 8883.6384

332.850556 20913.9648 8659.056 341.968611 8885.952 8851.0272

332.940833 20008.4448 8672.2176 342.058889 10661.9328 8826.5376

333.031111 20315.8656 8668.368 342.149167 10570.3968 8827.7472

333.121389 20795.2704 8661.648 342.239444 10512.2112 8827.8144

333.211667 21121.6416 8656.0224 342.329723 10456.1952 8827.9392

333.301944 21589.1808 8649.1776 342.42 10417.152 8828.0832

333.392223 19312.8576 8681.76 342.510277 10376.5056 8828.1696

333.4825 20062.7904 8672.1792 342.600556 10314.6336 8823.84

333.572777 20563.4496 8664.912 342.690833 12082.3872 8808.384

333.663056 20879.1168 8659.9488 342.781111 12243.2256 8805.3024

333.753333 18797.5008 8688.0672 342.871389 11889.1584 8809.0752

333.843611 19653.3696 8678.208 342.961667 13089.456 8794.3872

333.933889 19900.9536 8674.5888 343.051944 13505.2512 8788.7616

334.024167 20060.928 8672.4096 343.142223 14925.0624 8771.0016

334.114444 20207.3664 8670.192 343.230694 14718.9696 8772.5472

334.204723 20319.4368 8668.1472 343.320973 15687.5904 8759.9328

334.295 20435.6352 8666.1888 343.41125 16434.9888 8749.9488

334.385277 20481.8016 8665.6608 343.501527 17125.4016 8740.6944

334.475556 20501.7984 8665.1232 343.591806 16861.6608 8742.8544

334.565833 20543.1552 8664.5952 343.682083 18017.6448 8727.0432


334.656111 20669.088 8662.1568 343.772361 20105.0016 8698.0608

334.746389 20804.3616 8660.112 343.862639 20054.4 8696.544

334.836667 20907.4272 8658.5472 343.952917 19756.0128 8699.9904

334.926944 20979.1104 8657.2128 344.043194 19783.2096 8699.0976

335.017223 21016.2912 8656.4736 344.133473 19675.7184 8699.7984

335.1075 21083.8656 8655.408 344.22375 18887.0784 8709.8592

335.197777 19002.384 8685.888 344.314027 20352.1728 8689.824

335.288056 19105.104 8684.7264 344.404306 22319.6544 8659.6896

335.378333 19150.5408 8684.496 344.494583 21994.1088 8663.4528

335.468611 19140.8352 8684.3616 344.584861 21762.5472 8665.6992

335.558889 20755.7088 8662.3872 344.675139 21591.0336 8667.4656

335.649167 21446.736 8650.8192 344.765417 21447.8208 8668.56

335.739444 21319.9584 8652.0864 344.855694 22048.8288 8660.2848

335.829723 21278.3616 8652.2976 344.945973 21931.7568 8660.6304

335.92 21220.9056 8652.8736 345.03625 21869.5776 8660.8224

336.010277 21168.6624 8653.2384 345.126527 21830.0256 8661.2352

336.100556 21156.0672 8653.632 345.216806 21811.0272 8661.1104

336.190833 21119.6832 8653.6032 345.307083 21807.4848 8660.7648

336.281111 21108.864 8653.9392 345.397361 21799.5552 8660.3808

336.371389 21085.8432 8654.1792 345.487639 21767.1936 8660.3232

336.461667 21071.0112 8654.448 345.577917 21786.6528 8659.728

336.551944 21043.5552 8654.4768 345.668194 21793.1136 8659.5648

336.642223 21034.9248 8654.4768 345.758473 21767.0304 8659.3824

336.7325 20999.8176 8655.072 345.84875 21754.2912 8659.3632

336.822777 10711.9008 8792.3616 345.939027 21751.3152 8659.1712

336.913056 0 8889.4272 346.029306 21738.1152 8658.864

337.003333 0 8896.9056 346.119583 21706.6368 8659.1328

337.093611 0 8900.928 346.209861 21721.7472 8658.0768

337.183889 0 8903.7888 346.300139 21708.2208 8658.2208

337.274167 0 8905.9296 346.390417 21673.7184 8658.1536

337.364444 0 8907.7824 346.480694 21671.2896 8658.096


337.454723 0 8909.376 346.570973 21687.7824 8657.472

337.545 0 8910.6912 346.66125 21658.4448 8657.8464

337.635277 0 8912.1216 346.751527 21701.7984 8657.2512

337.725556 0 8912.7456 346.841806 21712.5792 8656.8288

337.815833 0 8914.128 346.932083 21709.0272 8656.7424

337.906111 0 8915.088 347.022361 21698.9664 8656.5504

337.996389 0 8915.9712 347.112639 21698.5248 8656.0704

338.086667 0 8916.7488 347.202917 21703.584 8656.1376

338.176944 0 8917.4784 347.293194 21689.9424 8655.7344

338.267223 0 8918.1984 347.383473 21684.6048 8655.888

338.3575 0 8918.8608 347.47375 21662.8224 8655.7728

338.447777 0 8919.504 347.564027 21666.0864 8655.6192

338.538056 0 8920.1184 347.654306 21669.1488 8655.2064

338.628333 0 8920.7424 347.744583 21645.5808 8655.2544

338.718611 0 8921.328 347.834861 21663.0048 8654.976

338.808889 0 8921.904 347.925139 21640.704 8654.6496

338.899167 0 8922.4416 348.015417 21645.5808 8654.6784

338.989444 0 8922.9984 348.105694 21635.184 8654.5824

339.079723 0 8923.5168 348.195973 21656.4576 8654.256

339.17 0 8924.0448 348.28625 21630.8928 8654.3232

339.260277 0 8924.5248 348.376527 21634.5984 8654.4672

339.350556 0 8924.9952 348.466806 21594.096 8654.0928

339.440833 0 8925.4464 348.557083 21611.0784 8653.9968

339.531111 0 8925.8976 348.647361 21604.56 8653.8432

339.621389 0 8926.32 348.737639 21641.3472 8653.4592

339.711667 0 8926.7232 348.827917 21666.7104 8652.8064

339.801944 0 8927.136 348.918194 21647.9712 8653.008

339.892223 0 8927.52 349.008473 21632.4288 8652.8736

339.9825 0 8927.904 349.09875 21657.8784 8652.384

340.072777 0 8928.2592 349.189027 21646.08 8652.1728

340.163056 0 8928.6432 349.279306 21638.6016 8651.9904


340.253333 0 8928.9888 349.369583 21649.6224 8651.904

340.343611 0 8929.344 349.459861 21638.016 8651.8848

340.433889 0 8929.6992 349.550139 21652.416 8651.3184

340.524167 0 8930.0256 349.640417 21659.712 8651.0304

340.614444 0 8930.352 349.730694 21641.9328 8650.9728

340.704723 0 8930.688 349.820973 21636.5376 8651.136

340.795 0 8930.9856 349.91125 21626.0448 8650.8864

340.885277 0 8931.3024 350.001527 21645.3792 8650.656

340.975556 0 8931.6096 350.091806 21626.0928 8650.9728

341.065833 0 8931.8976 350.182083 21617.088 8650.6176

341.156111 0 8932.2048 350.272361 21642.864 8650.08

341.246389 0 8932.5024

pi = 9000 psi

Table A.15: Data for Case 15

Time Flow Rate Pressure Time Flow Rate Pressure

day STB/day psi day STB/day psi

260.001806 7495.4016 8912.112 599.426389 7317.0816 8903.2512

263.612917 7496.0928 8911.44 603.0375 7311.168 8902.7616

267.224027 7485.9264 8911.0752 606.648611 7323.696 8902.0224

270.835139 7483.5456 8910.5664 610.259723 7317.4944 8901.4176

274.44625 7482.1056 8910.1056 613.870833 7297.0656 8901.1776

278.057361 7525.8144 8909.1168 617.481944 7299.4944 8900.6688

281.666667 0 8988.1344 621.093056 7312.5504 8900.256

285.277777 6742.4256 8925.8496 624.702361 7325.6832 8899.7376

288.888889 7420.8768 8914.9344 628.313473 7348.608 8899.1232

292.5 13523.712 8838.9792 631.924583 7324.656 8898.8928

296.111111 15747.792 8798.2176 635.535694 7322.7744 8898.4512


299.722223 16996.6752 8772.3168 639.146806 7305.7152 8898.1152

303.333333 16918.704 8766.8736 642.757917 7292.352 8897.9136

306.944444 16883.2416 8763.1104 646.369027 7281.8208 8897.6832

310.555556 17756.4384 8746.7136 649.980139 7271.472 8897.4336

314.164861 19486.4448 8715.024 653.59125 7261.9584 8897.3376

317.775973 19424.9184 8710.6176 657.200556 7243.1808 8897.1936

321.387083 19340.6208 8707.1424 660.811667 7234.4832 8897.0784

324.998194 19569.3696 8699.136 664.422777 7226.832 8896.848

328.609306 21829.536 8659.0464 668.033889 7220.7552 8896.6656

332.220417 21620.832 8654.6208 671.645 0 8965.2576

335.831527 21263.0976 8658.096 675.256111 4526.1984 8928.5568

339.442639 0 8930.9376 678.867223 8088.8256 8892.5856

343.05375 13503.168 8794.2432 682.478333 7407.9936 8897.5968

346.663056 21664.5504 8663.1744 686.089444 7325.0016 8897.7408

350.274167 21647.4336 8655.3408 689.69875 7277.088 8897.6256

353.885277 20825.6448 8663.6544 693.309861 7242.0576 8897.3952

357.496389 21332.4192 8651.2896 696.920973 7270.3296 8896.6368

361.1075 21298.0704 8647.6704 700.532083 7239.7056 8896.464

364.718611 21302.1984 8644.4544 704.143194 7226.4864 8896.2624

368.329723 21298.4064 8641.3728 707.754306 7196.1888 8896.032

371.940833 21289.5936 8638.7904 711.365417 7183.0848 8895.792

375.550139 21262.56 8636.112 714.976527 7167.5616 8895.744

379.16125 21248.7264 8633.9424 718.585833 7157.3088 8895.5808

382.772361 21234.4704 8631.4272 722.196944 7140.72 8895.5328

386.383473 20123.5488 8646.96 725.808056 7134.3456 8895.3504

389.994583 20119.0272 8645.8752 729.419167 7115.0208 8895.216

393.605694 20104.3488 8644.8192 733.030277 7120.2528 8894.976

397.216806 20091.5136 8643.3504 736.641389 7115.6544 8894.8032

400.827917 20096.208 8642.2752 740.2525 7109.0976 8894.4864

404.439027 20081.7792 8641.1328 743.863611 7107.8112 8894.3232

408.048333 20055.6096 8639.6832 747.474723 7106.3136 8894.1408


411.659444 20056.2528 8638.7808 751.084027 7097.3952 8893.9104

415.270556 0 8875.92 754.695139 7097.7792 8893.8144

418.881667 0 8897.0016 758.30625 7094.7072 8893.5744

422.492777 0 8908.128 761.917361 7094.064 8893.4208

426.103889 0 8916.72 765.528473 7092.624 8893.1616

429.715 0 8916.0864 769.139583 7083.8496 8893.008

433.326111 0 8930.8992 772.750694 7088.9568 8892.8064

436.935417 0 8936.6496 776.361806 7082.8896 8892.624

440.546527 0 8941.6992 779.972917 7066.9248 8892.5952

444.157639 0 8946.2784 783.582223 7077.9072 8892.3744

447.76875 0 8950.6656 787.193333 7070.448 8892.2592

451.379861 0 8954.7072 790.804444 7067.0784 8892.0384

454.990973 0 8958.6144 794.415556 0 8948.2464

458.602083 0 8962.416 798.026667 5245.0368 8915.0304

462.213194 0 8966.1312 801.637777 7859.0784 8887.0176

465.824306 0 8969.7312 805.248889 7720.1664 8886.6336

469.433611 0 8973.2256 808.86 7646.64 8886.4128

473.044723 0 8976.5376 812.469306 7594.5312 8886.1824

476.655833 0 8979.6192 816.080417 7565.0496 8886.0384

480.266944 0 8982.4704 819.691527 7547.472 8885.5872

483.878056 0 8985.1584 823.302639 7521.4176 8885.4336

487.489167 0 8987.7216 826.91375 7503.4944 8885.2224

491.100277 0 8990.1696 830.524861 7477.344 8885.1552

494.711389 0 8992.464 834.135973 7454.2944 8885.0208

498.3225 0 8994.5856 837.747083 7435.5648 8884.8672

501.931806 0 8996.544 841.358194 7435.0272 8884.6176

505.542917 0 8998.3296 844.9675 7430.976 8884.368

509.154027 0 9000 848.578611 7420.8768 8884.1376

512.765139 6629.8752 8935.6992 852.189723 7411.8432 8883.9648

516.37625 5830.3296 8940.3648 855.800833 7410.912 8883.6864

519.987361 7242.1344 8923.536 859.411944 7404.2208 8883.5616


523.598473 7169.1552 8922.1248 863.023056 7405.392 8883.312

527.209583 7136.8992 8920.6944 866.634167 7398.8256 8883.216

530.818889 7121.1744 8919.2448 870.245277 7399.3248 8882.9568

534.43 0 8976.9984 873.854583 7396.5408 8882.7456

538.041111 4596.9216 8946.9504 877.465694 7394.832 8882.496

541.652223 5465.4048 8936.1696 881.076806 7392.9504 8882.256

545.263333 7609.872 8912.2656 884.687917 7386.2016 8882.0352

548.874444 7517.1744 8911.296 888.299027 7380.6528 8881.9488

552.485556 7452.6432 8910.6912 891.910139 7375.6224 8881.7952

556.096667 7427.3856 8910.2208 895.52125 7378.1376 8881.5744

559.707777 7388.5056 8910.1152 899.132361 7378.2528 8881.392

563.317083 7363.056 8909.7696 902.743473 7369.3632 8881.3728

566.928194 7347.9936 8909.2896 906.352777 7367.8848 8881.1616

570.539306 7345.4304 8908.3872 909.963889 7369.008 8880.912

574.150417 7360.2912 8907.36 913.575 7354.704 8880.7584

577.761527 7345.392 8906.6496 917.186111 7349.7792 8880.6336

581.372639 7342.608 8905.9776 920.797223 7351.5936 8880.4896

584.98375 7323.1104 8905.3824 924.408333 7341.6864 8880.4032

588.594861 7338.5664 8904.7968 928.019444 7344.4032 8880.2976

592.205973 7341.5616 8904.1152 931.630556 7342.8192 8880.1344

595.815277 7316.6496 8903.6928 935.241667 7341.2928 8880

pi = 9000 psi


Table A.16: Data for Cases 16-18

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA

Table A.17: Data for Cases 19-22

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA


Table A.18: Data for Cases 23-24

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA

Table A.19: Data for Cases 25-26

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA


Table A.20: Data for Cases 27-30

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA

Table A.21: Data for Cases 31-34

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA


Table A.22: Data for Case 35

Parameter   Value   Unit
k           20      md
h           10      ft
φ           0.2     VOL/VOL
Ct          5E-6    /psi
rw          0.32    ft
µ           2       cp
pi          5000    psi
B           1       RB/STB
re          600     ft
C           1E-3    STB/psi
s           1       NA

Table A.23: Data for Cases 36-37

Time Flow Rate Pressure Time Flow Rate Pressure

day STB/day psi day STB/day psi

505.524861 23034.7066 8213.8599 632.267639 0 8955.7229

505.947361 23035.3008 8213.7629 632.690139 0 8956.3047

506.369861 23035.1155 8213.5498 633.112639 0 8956.8787

506.792361 23039.9866 8213.496 633.535139 0 8957.4499

507.214861 23035.9354 8213.3491 633.957639 3407.7917 8899.6551

507.637361 23036.496 8213.2455 634.380139 3465.5942 8897.3818

508.059861 23038.273 8213.1437 634.802639 5457.3581 8857.6003

508.482361 23036.0266 8212.9459 635.225139 6725.3174 8829.3639

508.904861 23034.9504 8212.7482 635.647639 7416.8352 8811.4109

509.327361 23034.24 8212.7741 636.070139 8873.7197 8774.8771

509.749861 23037.3226 8212.6839 636.492639 10320.9686 8735.3751

510.172361 23033.496 8212.4967 636.915139 9960.073 8741.88

510.594861 23045.7542 8212.2903 637.337639 12710.0179 8666.0871

511.017361 23037.1594 8212.1789 637.760139 13028.4106 8654.4115

511.439861 23037.8698 8212.0541 638.182639 12534.9514 8664.8535


511.862361 23040.8304 8211.8688 638.605139 12325.5715 8669.0871

512.284861 23036.7955 8211.8688 639.027639 12121.0867 8673.3312

512.707361 23034.287 8211.7421 639.450139 12501.7498 8661.4541

513.129861 23219.8272 8205.1527 639.872639 12578.4643 8657.8839

513.552361 23320.5187 8201.0295 640.295139 12190.2422 8667.5011

513.974861 23328.3792 8200.7731 640.717639 12578.6208 8656.2672

514.397361 23321.7974 8200.5399 641.140139 12580.0397 8655.0624

514.819861 23326.4947 8200.2221 641.562639 12580.6339 8654.1053

515.242361 23322.5827 8200.3075 641.985139 12582.3446 8653.2403

515.664861 23322.3917 8200.0032 642.407639 12581.7859 8652.456

516.087361 23321.0208 8199.8707 642.830139 12579.0019 8651.8282

516.509861 23326.1174 8199.8333 643.252639 12577.3104 8651.1533

516.932361 23320.2278 8199.8055 643.675139 12588.24 8650.4343

517.354861 23315.7773 8199.7584 644.097639 12580.537 8649.9495

517.777361 23320.8374 8199.5463 644.520139 12582.215 8649.3831

518.199861 23317.4381 8199.1536 644.942639 12582.4291 8648.8618

518.622361 23323.727 8199.2755 645.365139 12748.1971 8643.7075

519.044861 23320.6877 8199.2103 645.787639 13239.7046 8629.1981

519.467361 23321.711 8199.0451 646.210139 13374.9427 8624.2666

519.889861 23321.6986 8198.9683 646.632639 13376.3837 8623.4621

520.312361 23318.4202 8198.6659 647.055139 13372.9603 8622.8698

520.734861 23321.1283 8198.6477 647.477639 13368.9696 8622.4791

521.157361 23322.1459 8198.5114 647.900139 13377.2602 8621.8963

521.579861 23316.7085 8198.4442 648.322639 13367.3539 8621.449

522.002361 23314.9238 8198.3079 648.745139 13368.9619 8620.9267

522.424861 23321.3808 8198.0698 649.167639 13374.0451 8620.4151

522.847361 23323.5677 8198.0823 649.590139 13381.4506 8619.7363

523.269861 23316.9878 8197.9968 650.012639 13378.9267 8619.3955

523.692361 23324.2032 8197.6282 650.435139 13371.8352 8619.2275

524.114861 23317.5792 8197.6445 650.857639 13380.6221 8618.592

524.537361 23321.1523 8197.5754 651.280139 13383.6509 8618.2531


524.959861 23316.8774 8197.2442 651.702639 13384.9834 8617.9229

525.382361 23319.2083 8197.2288 652.125139 13388.2176 8617.4554

525.804861 23317.9747 8197.0589 652.547639 13393.7376 8617.0752

526.227361 23319.4646 8196.9264 652.970139 13389.407 8616.8775

526.649861 23313.5722 8197.0944 653.392639 13385.1965 8616.6211

527.072361 23318.6698 8197.0455 653.815139 13394.5085 8616.169

527.494861 23318.8051 8196.4839 654.237639 13395.8813 8615.8407

527.917361 23320.5619 8196.361 654.660139 13394.9779 8615.5863

528.339861 23317.8451 8196.5319 655.082639 13399.1232 8615.1648

528.762361 23320.0013 8196.2208 655.503333 13401.0634 8614.9315

529.184861 23319.9619 8196.3946 655.925833 13405.4102 8614.6051

529.607361 23320.7606 8196.2333 656.348333 13410.1594 8614.1482

530.029861 23319.8064 8196.1095 656.770833 13412.5891 8613.8621

530.450556 23318.593 8196.0336 657.193333 13413.7248 8613.7229

530.873056 23321.0458 8195.8042 657.615833 13409.3107 8613.5491

531.295556 23326.2826 8195.3501 658.038333 13416.1123 8613.2967

531.718056 23326.033 8195.6602 658.460833 13416.5866 8613.073

532.140556 23324.3472 8195.3616 658.883333 13415.5469 8612.8272

532.563056 23327.9002 8195.3453 659.305833 13413.4838 8612.7159

532.985556 23326.1664 8195.3232 659.728333 13418.5651 8612.4643

533.408056 23325.167 8195.0813 660.150833 13409.8531 8612.4605

533.830556 23334.3504 8194.9229 660.573333 13416.3005 8612.184

534.253056 23320.7453 8194.6944 660.995833 13416.5664 8612.0573

534.675556 23325.5626 8194.7655 661.418333 13422.6461 8611.7789

535.098056 23323.1626 8194.8135 661.840833 13429.8979 8611.4871

535.520556 23326.5245 8194.6666 662.263333 13428.7747 8611.3152

535.943056 23320.8374 8194.488 662.685833 13427.4662 8611.1664

536.365556 23317.9219 8194.6992 663.108333 13427.6227 8610.9466

536.788056 23330.3962 8194.3248 663.530833 13428.1536 8610.6797

537.210556 23334.2698 8193.9523 663.953333 13432.2605 8610.6663

537.633056 23327.8234 8194.1011 664.375833 13427.977 8610.5866


538.055556 23326.8528 8194.033 664.798333 13431.7642 8610.4551

538.478056 23319.1354 8193.8967 665.220833 13423.2912 8610.3523

538.900556 23320.6253 8193.7805 665.643333 13437.935 8609.9962

539.323056 23320.007 8193.7402 666.065833 13280.5104 8614.6531

539.745556 23329.0051 8193.1143 666.488333 13283.7206 8614.488

540.168056 23314.0877 8193.5242 666.910833 13289.1821 8614.3095

540.590556 23319.7574 8193.3149 667.333333 13285.0771 8614.2048

541.013056 23318.4432 8193.3879 667.755833 13293.2256 8614.0627

541.435556 23324.1638 8193.2026 668.178333 13286.0803 8614.152

541.858056 23313.0902 8193.0605 668.600833 13292.2896 8613.9331

542.280556 23322.889 8192.9539 669.023333 13301.5315 8613.6221

542.703056 23317.7722 8192.7965 669.445833 13294.7683 8613.4867

543.125556 23316.2659 8192.761 669.868333 13296.9773 8613.4925

543.548056 23315.8781 8192.5335 670.290833 13301.2224 8613.312

543.970556 23318.4941 8192.4567 670.713333 13304.1014 8613.1507

544.393056 23321.4413 8192.2291 671.135833 0 8902.897

544.815556 23317.416 8192.281 671.558333 0 8917.0666

545.238056 23317.2566 8192.2339 671.980833 0 8926.8941

545.660556 23311.0406 8192.1303 672.403333 0 8936.3175

546.083056 23317.8019 8191.9949 672.825833 0 8941.0176

546.505556 23320.7002 8191.6416 673.248333 0 8940.8112

546.928056 23322.049 8191.6282 673.670833 0 8932.7808

547.350556 23318.0266 8191.6387 674.093333 2235.9264 8906.3463

547.773056 23325.3504 8191.5091 674.515833 4358.0765 8866.3363

548.195556 23310.9245 8191.393 674.938333 5988.5923 8830.489

548.618056 23314.4045 8191.32 675.360833 7103.543 8805.8266

549.040556 23312.0659 8191.2384 675.783333 6823.4026 8810.8272

549.463056 23317.4246 8191.0695 676.205833 8048.879 8782.9171

549.885556 23316.9187 8191.0061 676.628333 8910.6883 8760.9629

550.308056 23317.1117 8190.8746 677.050833 8925.2976 8758.5562

550.730556 23315.3405 8190.9024 677.473333 8883.9302 8758.657


551.153056 23323.9555 8190.4397 677.895833 10011.3082 8729.6592

551.575556 23309.8128 8190.6922 678.318333 11394.0518 8692.0099

551.998056 23308.7827 8190.5367 678.740833 12184.8077 8668.1712

552.420556 23317.0992 8190.4099 679.163333 12167.6534 8666.8887

552.843056 23316.3994 8190.3043 679.585833 12155.5363 8666.064

553.265556 22843.2192 8207.6439 680.008333 12146.2138 8665.1232

553.688056 22317.5645 8227.1914 680.430833 12137.7216 8664.5635

554.110556 22586.689 8218.2039 680.853333 12134.9933 8664.024

554.533056 0 8802.1709 681.275833 12131.1418 8663.5546

554.955556 0 8830.2586 681.698333 12128.3299 8663.0986

555.378056 0 8841.8919 682.120833 12124.7318 8662.6157

555.800556 0 8849.6842 682.543333 12124.5168 8662.2941

556.223056 0 8856.0624 682.965833 12123.1306 8661.8851

556.645556 0 8861.3597 683.388333 12125.1466 8661.5741

557.068056 0 8865.8928 683.810833 12122.88 8661.2007

557.490556 0 8869.9258 684.233333 12127.6752 8660.8243

557.913056 0 8873.5421 684.655833 12124.1126 8660.5651

558.335556 0 8876.7946 685.078333 12121.8422 8660.3376

558.758056 0 8879.8301 685.500833 12122.9549 8660.1351

559.180556 0 8882.64 685.923333 12125.6534 8659.8451

559.603056 0 8885.2531 686.345833 12125.2445 8659.5706

560.025556 0 8887.7155 686.766527 12134.1984 8659.2826

560.448056 0 8890.0339 687.189027 12123.8688 8659.1098

560.870556 0 8892.2189 687.611527 12130.2298 8658.7719

561.293056 0 8894.2867 688.034027 12131.6227 8658.6413

561.71375 0 8896.248 688.456527 12129.792 8658.4752

562.13625 0 8898.1335 688.879027 12130.5062 8658.3965

562.55875 0 8899.9306 689.301527 12126.8698 8658.2602

562.98125 0 8901.6711 689.724027 12128.0035 8658.121

563.40375 5870.6822 8799.9043 690.146527 12129.2285 8657.8435

563.82625 5929.8115 8794.9277 690.569027 12136.417 8657.7255


564.24875 8012.1936 8746.9728 690.991527 12132.3264 8657.6026

564.67125 9441.9638 8711.3885 691.414027 12137.5267 8657.4509

565.09375 9398.063 8709.553 691.836527 11776.9142 8667.6394

565.51625 9746.8618 8699.1619 692.259027 11774.6266 8667.6605

565.93875 12688.9939 8621.1754 692.681527 11779.9632 8667.6768

566.36125 12593.8051 8619.5434 693.104027 11778.383 8667.5559

566.78375 14735.1667 8555.4931 693.526527 11783.5162 8667.5232

567.20625 14609.0717 8556.3005 693.949027 11781.6806 8667.4733

567.62875 14451.0154 8559.025 694.371527 11782.9325 8667.409

568.05125 13536.8064 8584.2461 694.794027 12598.7405 8645.0775

568.47375 14268.9533 8562.3955 695.216527 13145.544 8628.3773

568.89625 15251.4163 8531.5498 695.639027 13191.0346 8626.0579

569.31875 16929.2602 8480.7178 696.061527 13190.615 8625.7085

569.74125 16334.8704 8495.4154 696.484027 13186.151 8625.1959

570.16375 17485.1885 8458.0675 696.906527 13185.1718 8624.9885

570.58625 17289.2006 8462.2032 697.329027 13182.2957 8624.6938

571.00875 18622.3805 8418.0394 697.751527 13175.1898 8624.3597

571.43125 18482.2051 8420.4576 698.174027 13182.696 8624.1946

571.85375 19350.8602 8390.8186 698.596527 12507.481 8643.7805

572.27625 19974.1709 8367.5079 699.019027 12511.991 8644.0023

572.69875 21013.1174 8330.4547 699.441527 12504.4022 8644.105

573.12125 21668.6851 8305.4122 699.864027 12847.8077 8634.0835

573.54375 22086.2323 8288.7831 700.286527 12851.3443 8633.6602

573.96625 22075.3594 8287.1367 700.709027 12708.3437 8637.5367

574.38875 22074.6586 8285.4624 701.131527 12704.6525 8637.7507

574.81125 22065.8102 8283.7623 701.554027 12705.6134 8637.6797

575.23375 22056.3005 8282.6707 701.976527 12713.0851 8637.5424

575.65625 22052.376 8281.6215 702.399027 12705.4973 8637.4176

576.07875 22046.424 8280.529 702.821527 13042.2442 8627.4567

576.50125 22034.5402 8280.0586 703.244027 13045.0282 8627.0823

576.92375 22037.5517 8279.04 703.666527 13372.176 8617.3987


577.34625 22029.0509 8278.1472 704.089027 13368.3571 8617.0426

577.76875 22030.9219 8277.3591 704.511527 13366.9651 8616.8026

578.19125 22029.8323 8276.7629 704.934027 13375.1875 8616.5511

578.61375 22027.9949 8275.9671 705.356527 13371.5069 8616.2218

579.03625 22027.3296 8275.2413 705.779027 13363.6656 8616.2727

579.45875 22021.2922 8274.6576 706.201527 13357.1357 8616.2602

579.88125 22024.6214 8274.0192 706.624027 13355.1158 8616.265

580.30375 22013.9923 8273.7226 707.046527 13356.191 8616.072

580.72625 22011.3014 8273.1187 707.469027 13364.6122 8615.8896

581.14875 22016.6246 8272.2922 707.891527 13353.5405 8615.8474

581.57125 22008.9302 8272.0954 708.314027 13352.905 8615.7178

581.99375 22010.3702 8271.3831 708.736527 13352.9002 8615.6631

582.41625 22010.3981 8270.9895 709.159027 13355.7734 8615.4528

582.83875 22010.2704 8270.5181 709.581527 13364.927 8615.28

583.26125 22009.6858 8269.9507 710.004027 13353.4339 8615.2234

583.68375 22007.5853 8269.7443 710.426527 13355.4 8615.065

584.10625 21998.9846 8269.3459 710.849027 13357.4688 8614.9258

584.52875 22006.0166 8268.6269 711.271527 13358.4624 8614.824

584.95125 22007.6736 8268.24 711.694027 13357.9536 8614.7472

585.37375 22013.6381 8267.6256 712.116527 13358.5478 8614.4938

585.79625 22009.009 8267.4346 712.539027 13357.6934 8614.4554

586.21875 22008.7728 8266.992 712.961527 13358.1514 8614.3738

586.64125 22002.6883 8266.7847 713.384027 13357.0253 8614.2922

587.06375 22003.2941 8266.4659 713.806527 13353.8006 8614.2394

587.48625 22004.0755 8266.1491 714.229027 13354.0656 8614.1031

587.90875 22001.3885 8265.7738 714.651527 13359.2506 8613.8775

588.33125 22004.5949 8265.4752 715.074027 13362.2141 8613.8535

588.75375 22002.2227 8265.2928 715.496527 13356.4166 8613.8199

589.17625 21996.1651 8264.9415 715.919027 13352.9779 8613.7517

589.59875 22004.7773 8264.2877 716.341527 13353.1133 8613.6567

590.02125 22009.6666 8263.9901 716.764027 13355.3491 8613.5943


590.44375 21999.9734 8264.0995 717.186527 12739.1453 8631.9139

590.86625 21995.8272 8263.8931 717.607223 12744.7363 8632.0752

591.28875 21990.5414 8263.6407 718.029723 12747.7066 8632.0762

591.71125 21990.816 8263.4189 718.452223 12747.0662 8632.1597

592.13375 21996.2045 8263.1069 718.874723 12750.3014 8632.0695

592.55625 21988.7923 8262.816 719.297223 12748.4064 8632.0858

592.976944 21992.7466 8262.4167 719.719723 12749.8848 8632.1578

593.399444 21992.9453 8262.0269 720.142223 12752.5882 8632.0541

593.821944 21994.0272 8261.8512 720.564723 12756.0682 8631.9706

594.244444 21991.0685 8261.5978 720.987223 12757.4198 8631.9283

594.666944 21993.1334 8261.2819 721.409723 12757.9373 8631.8871

595.089444 21991.4026 8261.1619 721.832223 12758.2589 8631.8391

595.511944 21991.56 8260.8864 722.254723 12754.6714 8631.8947

595.934444 21988.5302 8260.7395 722.677223 12758.1946 8631.7594

596.356944 21990.5981 8260.4304 723.099723 12756.8582 8631.7373

596.779444 21992.1245 8260.1655 723.522223 12758.0688 8631.6595

597.201944 21990.1728 8259.8986 723.944723 12758.161 8631.5731

597.624444 21998.8934 8259.4944 724.367223 12756.5798 8631.5664

598.046944 22000.9296 8259.0499 724.789723 12763.5398 8631.5319

598.469444 21148.151 8289.3754 725.212223 12760.4275 8631.5866

598.891944 20970.2592 8295.9725 725.634723 12758.5027 8631.4637

599.314444 20511.4954 8313.2256 726.057223 12760.7578 8631.4599

599.736944 20697.0605 8306.5219 726.479723 12758.2685 8631.3955

600.159444 21093 8292.6547 726.902223 12760.1146 8631.3667

600.581944 20084.2954 8328.3197 727.324723 12763.1107 8631.2343

601.004444 20250.7987 8322.6557 727.747223 12759.7786 8631.3091

601.426944 20257.969 8322.3648 728.169723 12760.0829 8631.2525

601.849444 20304.1258 8320.9699 728.592223 12764.975 8631.121

602.271944 20298.4522 8321.1677 729.014723 12764.4106 8631.0739

602.694444 20264.6534 8322.1411 729.437223 12760.751 8631.001

603.116944 20302.6541 8320.9171 729.859723 12765.4963 8631.0259


603.539444 20297.2253 8321.3242 730.282223 12766.5571 8630.9808

603.961944 5237.2541 8736.3466 730.704723 12768.863 8630.9146

604.384444 0 8847.2151 731.127223 12765.2976 8630.8474

604.806944 0 8858.8752 731.549723 12771.3542 8630.8023

605.229444 0 8866.4775 731.972223 12767.3664 8630.7255

605.651944 0 8872.1856 732.394723 12770.6534 8630.6986

606.074444 0 8876.7389 732.817223 12770.0256 8630.6045

606.496944 0 8881.0272 733.239723 12771.7805 8630.617

606.919444 0 8884.6819 733.662223 12776.3626 8630.5095

607.341944 0 8887.9258 734.084723 12770.327 8630.5507

607.764444 0 8890.8855 734.507223 12779.3731 8630.3415

608.186944 0 8893.6282 734.929723 12775.7645 8630.3741

608.609444 0 8896.1751 735.352223 12775.4438 8630.2858

609.031944 0 8898.5069 735.774723 12775.5533 8630.2541

609.454444 0 8900.6717 736.197223 12781.5437 8630.256

609.876944 0 8902.7895 736.619723 12777.791 8630.1658

610.299444 0 8904.8813 737.042223 12783.0067 8630.1831

610.721944 0 8906.8695 737.464723 12783.1066 8630.0535

611.144444 0 8908.6848 737.887223 12780.3917 8630.1783

611.566944 0 8910.3245 738.309723 12795.7325 8629.8643

611.989444 0 8911.9507 738.732223 13102.4266 8620.4429

612.411944 0 8913.5155 739.154723 13358.6131 8613.1152

612.834444 0 8915.0074 739.577223 13427.3328 8610.3744

613.256944 0 8916.4483 739.999723 13351.2509 8612.5056

613.679444 0 8917.849 740.422223 13353.073 8612.4461

614.101944 0 8919.2016 740.844723 13346.0496 8612.257

614.524444 0 8920.5053 741.267223 13344.7661 8612.2224

614.946944 0 8921.7648 741.689723 13342.4371 8612.0861

615.369444 0 8922.9888 742.112223 13346.711 8611.9219

615.791944 0 8924.184 742.534723 13340.0938 8612.0371

616.214444 0 8925.3562 742.957223 13340.4365 8611.9123


616.636944 0 8926.4909 743.379723 13340.9242 8611.8336

617.059444 0 8927.5939 743.802223 13343.4835 8611.8221

617.481944 0 8928.6835 744.224723 13340.6285 8611.6339

617.904444 0 8929.7367 744.647223 13338.4445 8611.5888

618.326944 0 8930.7591 745.069723 13339.5446 8611.4947

618.749444 0 8931.7642 745.492223 13338.3427 8611.5034

619.171944 0 8932.7319 745.914723 13337.4547 8611.3661

619.594444 0 8933.6957 746.337223 13334.593 8611.4151

620.016944 0 8934.6346 746.759723 13336.4602 8611.3248

620.439444 0 8935.5562 747.182223 13340.449 8611.1635

620.861944 0 8936.4509 747.604723 13345.559 8611.0455

621.284444 0 8937.3331 748.027223 13343.111 8610.8851

621.706944 0 8938.1991 748.449723 13342.3219 8610.8487

622.129444 0 8939.0439 748.870417 13344.0029 8610.6912

622.551944 0 8939.881 749.292917 13342.5754 8610.7527

622.974444 0 8940.7008 749.715417 13341.4992 8610.6144

623.396944 0 8941.5005 750.137917 13339.4947 8610.4531

623.819444 0 8942.2915 750.560417 13340.9395 8610.5117

624.240139 0 8943.0643 750.982917 13337.2598 8610.4829

624.662639 0 8943.8247 751.405417 13337.7302 8610.432

625.085139 0 8944.5792 751.827917 13337.9741 8610.4339

625.507639 0 8945.3194 752.250417 13346.4077 8610.3091

625.930139 0 8946.0432 752.672917 13333.0454 8610.3341

626.352639 0 8946.7623 753.095417 13338.1373 8610.2362

626.775139 0 8947.4669 753.517917 13334.7802 8610.1987

627.197639 0 8948.161 753.940417 13341.1632 8609.9443

627.620139 0 8948.8435 754.362917 13338.1795 8610.0739

628.042639 0 8949.5146 754.785417 13344.3485 8609.8608

628.465139 0 8950.1741 755.207917 13341.7718 8609.7706

628.887639 0 8950.8298 755.630417 13343.255 8609.6487

629.310139 0 8951.4682 756.052917 13350.4416 8609.5287


629.732639 0 8952.097 756.475417 13344.6893 8609.2531

630.155139 0 8952.7287 756.897917 13351.5101 8609.2666

630.577639 0 8953.3469 757.320417 13351.224 8609.1965

631.000139 0 8953.9421 757.742917 13351.7069 8608.9373

631.422639 0 8954.5402 758.165417 13358.7782 8608.8557

631.845139 0 8955.1344 758.288194 13352.7475 8609.0314

pi = 8958 psi

Appendix B

Proof of Kernel Closure Rules

To prove the three kernel closure rules, the Mercer Theorem is needed.

Mercer Theorem (Ng, 2009). Let K : ℜ^n × ℜ^n → ℜ be given. Then for K to be a valid (Mercer) kernel, it is necessary and sufficient that for any x(1), . . . , x(m) (m < ∞), the corresponding kernel matrix K is symmetric positive semidefinite, where the kernel matrix K is defined so that its (i, j)-entry is given by K_ij = K(x(i), x(j)).

With the Mercer Theorem, we can now prove all three kernel closure rules from Chapter 4.1.

B.1 Summation Closure

Summation Closure Rule. Suppose K_1(x, z) and K_2(x, z) are two valid kernels; then K(x, z) = K_1(x, z) + K_2(x, z) is also a valid kernel.

Proof. Since K_1 and K_2 are valid kernels, for all z ∈ ℜ^m we have:

z^T K_1 z ≥ 0    (B.1)

z^T K_2 z ≥ 0    (B.2)

where K_1, K_2 ∈ ℜ^{m×m} are the kernel matrices of the kernel functions K_1 and K_2.



Summing Eq. B.1 and Eq. B.2, we have:

z^T (K_1 + K_2) z = z^T K z ≥ 0    (B.3)

Eq. B.3 indicates that the matrix K = K_1 + K_2 is positive semidefinite. At the same time, because K_1 and K_2 are both symmetric matrices, K = K_1 + K_2 is also symmetric. Because K is symmetric and positive semidefinite, the matrix K is a valid kernel matrix and the function K is a valid kernel function.

B.2 Tensor Product Closure

Tensor Product Closure Rule. Suppose K_1(x, z) and K_2(x, z) are two valid kernels; then K(x, z) = K_1(x, z) K_2(x, z) is also a valid kernel.

Proof. Because K_1 and K_2 are valid kernel functions, the kernel matrices K_1 and K_2 are symmetric. Therefore,

K_ij = K(x(i), x(j))
     = K_1(x(i), x(j)) K_2(x(i), x(j))
     = (K_1)_ij (K_2)_ij
     = (K_1)_ji (K_2)_ji
     = K_1(x(j), x(i)) K_2(x(j), x(i))
     = K(x(j), x(i))
     = K_ji    (B.4)

So the kernel matrix of K is symmetric. Now, we will prove that the kernel matrix

of K is positive semidefinite. ∀z,


z^T K z = ∑_i ∑_j z_i z_j K_ij
        = ∑_i ∑_j z_i z_j K(x(i), x(j))
        = ∑_i ∑_j z_i z_j K_1(x(i), x(j)) K_2(x(i), x(j))
        = ∑_i ∑_j z_i z_j (Φ_1^T(x(i)) Φ_1(x(j))) (Φ_2^T(x(i)) Φ_2(x(j)))
        = ∑_i ∑_j ∑_p ∑_q z_i z_j φ_1(x(i))_p φ_1(x(j))_p φ_2(x(i))_q φ_2(x(j))_q
        = ∑_i ∑_j ∑_p ∑_q (z_i φ_1(x(i))_p φ_2(x(i))_q) (z_j φ_1(x(j))_p φ_2(x(j))_q)
        = ∑_p ∑_q (∑_i z_i φ_1(x(i))_p φ_2(x(i))_q) (∑_j z_j φ_1(x(j))_p φ_2(x(j))_q)
        = ∑_p ∑_q (∑_i z_i φ_1(x(i))_p φ_2(x(i))_q)^2
        ≥ 0    (B.5)

where Φ_1 and Φ_2 are the feature maps guaranteed by the Mercer Theorem, such that K_1(x, z) = Φ_1^T(x) Φ_1(z) and K_2(x, z) = Φ_2^T(x) Φ_2(z), and φ_1(·)_p, φ_2(·)_q denote their pth and qth components.

So, the kernel matrix of K is positive-semidefinite.

Because the kernel matrix of K is symmetric and positive-semidefinite, K is a

valid kernel.

B.3 Positive Scaling Closure

Positive Scaling Closure Rule. Suppose K_1(x, z) is a valid kernel and a ∈ ℜ^+; then K(x, z) = a K_1(x, z) is also a valid kernel.

Proof. Because K_1 is a symmetric matrix, (K)_ij = (aK_1)_ij = a(K_1)_ij = a(K_1)_ji = (aK_1)_ji = (K)_ji, so K = aK_1 is still a symmetric matrix.


Because a > 0, multiplying Eq. B.1 by a gives

z^T (a K_1) z = z^T K z ≥ 0    (B.6)

so K is positive semidefinite, and hence K is a valid kernel function.
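As a simple illustration (added here for concreteness; it is not part of the original proofs), the three rules compose. If K_1 and K_2 are valid kernels, then 2K_1 is valid by positive scaling and K_1 K_2 is valid by the tensor product rule, so

K(x, z) = 2 K_1(x, z) + K_1(x, z) K_2(x, z)

is also a valid kernel by summation closure.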

Appendix C

Breakpoint Detection Using Data

Mining Approaches

C.1 K-means and Bilateral

The breakpoint detection problem may be treated as an unsupervised classification problem. The input variables are given, consisting of the time, flow rate, and pressure series, but there is no foreknown output variable Y, which would be the group number of the different piecewise constant flow rate periods. The data mining algorithms are expected to discover the relationships among the data and classify them into different groups. Each group will be one piecewise constant flow rate period, and the points at the transitions between the groups are the breakpoints.

The difficulties of breakpoint detection arise from:

1. The total number of the piecewise constant flow rate periods is not known. The

data mining algorithm has to make its own decision in the classification process.

2. The data are very noisy, so that although two neighboring samples may have different flow rates, they may still lie in the same piecewise constant flow rate period, the difference being caused by the noise.

3. The noise between different piecewise constant flow rate periods sometimes makes the difference between them very small. For example, suppose the previous period has a flow rate of 70 and the current period a flow rate of 80. The noise may well shift the apparent flow rate of the previous period to 73 and that of the current period to 77; the original difference of 10 is weakened to 4, which is not easily recognized.

There are several methods that can be used to detect the breakpoints. Here, the K-means method and the Bilateral method were studied and are introduced briefly in this section.

K-means classification is a method to partition m observations into k clusters in which each observation belongs to the cluster with the nearest mean. Suppose that there is a set of m observations x(1), x(2), . . . , x(i), . . . , x(m), where x(i) ∈ ℜ^n. In the context of the PDG project, n = Nx and m = Np. The K-means method aims to partition the m observations into k sets S_1, S_2, . . . , S_k with centroids µ_1, µ_2, . . . , µ_j, . . . , µ_k, where µ_j ∈ ℜ^n and k ≤ m. The method optimizes the cost function:

L(µ_1, µ_2, . . . , µ_k) = ∑_{j=1}^{k} ∑_{x(i)∈S_j} ‖x(i) − µ_j‖²    (C.1)
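As a concrete illustration of the iteration just described, the following is a minimal C++ sketch of Lloyd's algorithm for K-means. The function name and signature are illustrative, not the project's actual code, and note that the number of clusters k (the size of mu) must be supplied in advance, which is exactly the limitation discussed later in this section.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

typedef std::vector<double> Point;

// One run of Lloyd's iteration: alternate nearest-centroid assignment
// and centroid (mean) updates for a fixed number of iterations.
void kmeans(const std::vector<Point>& x, std::vector<Point>& mu,
            std::vector<int>& label, int iterations) {
    const std::size_t k = mu.size(), n = x[0].size();
    label.assign(x.size(), 0);
    for (int it = 0; it < iterations; ++it) {
        // Assignment step: nearest centroid in squared Euclidean distance.
        for (std::size_t i = 0; i < x.size(); ++i) {
            double best = std::numeric_limits<double>::max();
            for (std::size_t j = 0; j < k; ++j) {
                double d = 0.0;
                for (std::size_t c = 0; c < n; ++c)
                    d += (x[i][c] - mu[j][c]) * (x[i][c] - mu[j][c]);
                if (d < best) { best = d; label[i] = (int)j; }
            }
        }
        // Update step: each centroid becomes the mean of its points.
        std::vector<Point> sum(k, Point(n, 0.0));
        std::vector<int> cnt(k, 0);
        for (std::size_t i = 0; i < x.size(); ++i) {
            for (std::size_t c = 0; c < n; ++c) sum[label[i]][c] += x[i][c];
            ++cnt[label[i]];
        }
        for (std::size_t j = 0; j < k; ++j)
            if (cnt[j] > 0)
                for (std::size_t c = 0; c < n; ++c)
                    mu[j][c] = sum[j][c] / cnt[j];
    }
}
```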

The Bilateral method is a filtering technique in which the value at each sample is evaluated as a weighted average over the whole domain. The weight of a data point x(i) with respect to x_pred combines both magnitude and spatial differences, in contrast to normal filters, which take into account only spatial information. Equation C.2 calculates the weight: the term (f(x_pred) − f(x(i)))² / σ_f² accounts for the magnitude difference, while the term ‖x_pred − x(i)‖² / σ_x² accounts for the spatial difference. The weights are then used to calculate the evaluation at each x(i) using Equation C.3. By simply using the Bilateral method, the outliers in the data are filtered out and the whole curve is smoothed. Given the threshold ξ, each place x(i) such that ‖x(i+1) − x(i)‖ ≥ ξ will be a breakpoint.

W(x_pred; x(i)) = exp[ −( (f(x_pred) − f(x(i)))² / σ_f² + ‖x_pred − x(i)‖² / σ_x² ) ]    (C.2)


y_pred = ( ∑_{i=1}^{m} x(i) W(x_pred; x(i)) ) / ( ∑_{i=1}^{m} W(x_pred; x(i)) )    (C.3)
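The following is a minimal C++ sketch of Eqs. C.2 and C.3 applied to a one-dimensional flow rate signal, followed by the threshold test; the function name and the parameter names (sf2, sx2, xi) are illustrative assumptions, not the project's actual code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Bilateral smoothing of a rate signal f sampled at times t (Eq. C.3,
// with the weights of Eq. C.2 evaluated at each sample), then a simple
// jump test against the threshold xi; returns indices of breakpoints.
std::vector<std::size_t> bilateralBreakpoints(
        const std::vector<double>& t, const std::vector<double>& f,
        double sf2, double sx2, double xi) {
    const std::size_t m = t.size();
    std::vector<double> y(m, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        double num = 0.0, den = 0.0;
        for (std::size_t j = 0; j < m; ++j) {
            double df = f[j] - f[i], dt = t[j] - t[i];
            // Eq. C.2: magnitude term plus spatial term in the exponent.
            double w = std::exp(-(df * df / sf2 + dt * dt / sx2));
            num += f[j] * w;
            den += w;
        }
        y[i] = num / den;  // Eq. C.3
    }
    std::vector<std::size_t> breaks;
    for (std::size_t i = 0; i + 1 < m; ++i)
        if (std::fabs(y[i + 1] - y[i]) >= xi)
            breaks.push_back(i + 1);
    return breaks;
}
```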

The breakpoint detection results obtained with the two methods discussed above are shown in Figure C.1. Both methods detect the breakpoints well on the noisy flow rate data set. However, the two methods have two serious disadvantages:

• the K-means method requires foreknowledge of the total count k of the breakpoints; and

• the Bilateral method requires foreknowledge of the separation threshold ξ.

These two requirements could not be satisfied in this study. As discussed at the beginning of this section, the data mining algorithms are required to detect the breakpoints without foreknowledge of either the total number of breakpoints or the separation threshold between two neighboring piecewise constant flow rate periods. Therefore, a more powerful and intelligent data mining method is required. Minimum Message Length is such a method, discussed in the next section.

Figure C.1: Breakpoint detection by (a) the K-means method and (b) the Bilateral method. (Each panel plots flow rate (STB/d) against time (hours), comparing the true and the noisy flow rate signals.)


C.2 Minimum Message Length

Minimum Message Length (MML) is a data mining method originating from information theory. The basic assumption of the theory is that even when models are not equally accurate in fitting the observed data, the one generating the shortest overall message is more likely to be correct (Wallace and Boulton, 1968). The MML method is therefore a data mining method that finds a fitting model by minimizing the length of the message that describes the whole data set.

The message referred to in this method contains (Wallace and Boulton, 1968):

• the number of classes;

• a dictionary of class names;

• a description of the distribution function for each class;

• for each thing, the name of the class to which it belongs;

• for each thing, its attribute values in the code set up for its class.

The total length of the message is the sum of the lengths of these five items. This length is the minimization target of the MML method. Because the number of classes is included in the message, the total number of breakpoints is also optimized in the data mining process. Therefore the number of breakpoints is not required as foreknowledge, but is a result of the data mining process. Similarly, the separation threshold is not required by MML either. Thus, the MML method should be able to perform the task of breakpoint detection on the PDG data.
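To make the idea concrete, the following is a toy C++ sketch of the two-part message-length principle only; it is not the actual Wallace and Boulton (1968) coding scheme, and the Gaussian noise model, fixed parameter precision, and all names are assumptions. The model part charges a fixed number of bits per stated parameter (the number of segments plus one mean per segment), and the data part charges the Gaussian code length of the residuals; comparing this quantity across candidate segmentations with different numbers of segments illustrates how the number of classes can be chosen without foreknowledge.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Two-part message length (in bits) of a piecewise-constant model of
// the series q, where segStart lists the first index of each segment.
double messageLengthBits(const std::vector<double>& q,
                         const std::vector<std::size_t>& segStart,
                         double sigma, double bitsPerParam) {
    const double LN2 = std::log(2.0);
    const double PI = 3.14159265358979;
    // Model part: number of segments plus one mean per segment.
    double modelBits = bitsPerParam * (1.0 + segStart.size());
    double dataBits = 0.0;
    for (std::size_t s = 0; s < segStart.size(); ++s) {
        std::size_t a = segStart[s];
        std::size_t b = (s + 1 < segStart.size()) ? segStart[s + 1] : q.size();
        double mean = 0.0;
        for (std::size_t i = a; i < b; ++i) mean += q[i];
        mean /= double(b - a);
        for (std::size_t i = a; i < b; ++i) {
            double r = q[i] - mean;  // -log2 of the Gaussian density
            dataBits += 0.5 * std::log(2.0 * PI * sigma * sigma) / LN2
                      + r * r / (2.0 * sigma * sigma * LN2);
        }
    }
    return modelBits + dataBits;
}
```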

Applying data mining algorithms to breakpoint detection is not the focus of this study; instead, this study sought methods that avoid the need for breakpoint detection altogether. Therefore, this section does not discuss the method in detail, but only reports some of the results obtained.

Figure C.2 shows the results of applying MML to the noisy data set without outliers. When the input variable is x(i) = (t(i), q(i))^T, the MML method makes the classification according to the time and flow rate series only; the result is shown in Figure C.2(a). In this process, the pressure data is never known to the MML algorithm. Similarly, Figure C.2(b) shows the results using the pressure and time series only, and Figure C.2(c) shows the results using time, pressure, and flow rate together. From the comparison, MML works well in all three cases. Figure C.3 shows a test in which some artificial outliers were added; the MML method still works well with these outliers.

The MML method does not always work as well with two parameters as with three. In Figure C.4(a), MML uses only the flow rate and time series, and one breakpoint is not detected. Using the three variables (pressure, flow rate, and time) together, the missed breakpoint is detected. Although MML with three variables misrecognizes one extra "breakpoint", this is not a serious problem, because only missed breakpoints lead to erroneous calculations. This agrees with the intuition that the more parameters are utilized, the more accurate the data mining results will be. The result also implies that breakpoint detection is another scenario in which the cointerpretation of pressure and flow rate can be employed.

The investigation of the MML method is still ongoing. There is an important difference between a generic classification problem and the breakpoint detection problem. In a generic classification problem, each point is classified individually, so it is very common for neighboring points to be classified into different groups. In the breakpoint detection problem, however, the classification is based on continuous periods, and this continuity should be considered in the data mining process. One of the future research directions is how to modify the original MML method to reflect this continuity.


Figure C.2: Applying the MML method to breakpoint detection in a noisy data set without outliers: (a) using flow rate and time data only; (b) using pressure and time data only; (c) using pressure, flow rate, and time data together. (Each panel plots pressure (psi) and flow rate (STB/d) against time (hours).)


Figure C.3: Applying the MML method to breakpoint detection in a noisy data set with outliers: (a) using flow rate and time data only; (b) using pressure and time data only; (c) using pressure, flow rate, and time data together. (Each panel plots pressure (psi) and flow rate (STB/d) against time (hours).)


Figure C.4: (a) Using flow rate and time data only fails to capture a breakpoint. (b) Using pressure, flow rate, and time data together detects all breakpoints successfully. (Each panel plots pressure (psi) and flow rate (STB/d) against time (hours).)

Appendix D

Implementation

The project programs were implemented in C++. This appendix reports the C++ implementation briefly from two viewpoints: the class interactions in Section D.1 and the work flow in Section D.2.

D.1 Classes

Fig. D.1 shows a class diagram of the project. The diagram contains the major classes used in the project and the relationships (interaction and inheritance) between them. A filled arrow represents an interaction (such as a function call) from the class at the start of the arrow to the class at its end, while an unfilled arrow indicates that the class at the start of the arrow inherits from the class at its end. In the diagram, the abstract classes that define the interfaces are shown in yellow, the implementation classes are in blue, and green represents the data entities such as the input and output data files.

The classes are introduced as follows:

TesterBase: TesterBase is the abstract class that defines the interface of a test work flow. Its main abstract function is TesterBase::execute(), which performs the whole test work flow. All subclasses that inherit from TesterBase have to implement the execute() function, each representing a different test work flow.



Figure D.1: The class diagram of the PDG analysis program.


GenericTester: GenericTester is an implementation of TesterBase. It is the generic, and the only, test work flow implemented in the whole project; the detailed work flow is introduced in Section D.2. Although GenericTester is the single implementation, flexibility is retained for future implementations through the TesterBase abstract class: if a new test work flow is needed, a new class that implements TesterBase can be added to the project without changing the existing test work flows.
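A minimal sketch of this interface pattern is shown below; the class and function names follow the text, but the bodies and signature details are illustrative assumptions, not the project's actual declarations.

```cpp
// Sketch of the TesterBase/GenericTester pattern described above.
class TesterBase {
public:
    virtual ~TesterBase() {}
    // Performs the whole test work flow; each subclass implements it.
    virtual void execute() = 0;
};

class GenericTester : public TesterBase {
public:
    void execute() {
        // load data, create input vectors, train and predict,
        // compute derivatives, write results (see Section D.2)
    }
};
```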

ColLoader: ColLoader is a class that loads the PDG input data. The PDG input data are organized in three columns, containing time, flow rate, and pressure sequentially. Columns are separated by a tab.

ColWriter: ColWriter is a class that writes the results into a text file. The results are organized in six columns, containing time, flow rate, true pressure, pressure prediction, true derivative, and predicted derivative sequentially. Columns are separated by a tab. Both ColLoader and ColWriter are used in GenericTester for data input and output.
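The three-column input format is simple enough to sketch; the struct and function names below are illustrative, not the project's actual declarations.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Reads tab-separated lines of "time, flow rate, pressure", the input
// format described above.
struct PdgRecord { double time, rate, pressure; };

std::vector<PdgRecord> loadColumns(const std::string& path) {
    std::vector<PdgRecord> rows;
    std::ifstream in(path.c_str());
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        PdgRecord r;
        // operator>> treats tabs as whitespace separators
        if (ss >> r.time >> r.rate >> r.pressure)
            rows.push_back(r);
    }
    return rows;
}
```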

CaseScriptLoader: CaseScriptLoader is a class that loads a test script in which the arguments for a test work flow are provided. CaseScriptLoader loads the script file and provides the arguments written in the script file to GenericTesterArgs.

GenericTesterArgs: GenericTesterArgs is a class that collects all the arguments used in GenericTester. When GenericTester executes a test, it calls GenericTesterArgs to provide the arguments necessary for the test.

TestPackage: TestPackage is one of the major arguments in GenericTesterArgs. It

is used to store the full path file names of the testing files and test result files.

These file names are originally stored in the test script file on the hard disk,

and are loaded into the memory by the CaseScriptLoader.

InputVectorBase: InputVectorBase is the abstract class that defines the interface of the different input vector creators. Different input vector creators, corresponding to different input vectors, derive from this abstract base class, including (but not limited to) the classes KernelVector_4F, KernelVector_4F_B, KernelVector_3F, and KernelVector_5F, which correspond to the input vectors KV4FA, KV4FB, KV3F, and KV5F in Chapter 4. InputVectorBase is another argument in GenericTesterArgs. The reason the abstract class, rather than the derived subclasses, is used as the argument is that this design enables more flexibility for future expansion. For example, suppose there is another input vector, say VectorX, that we would like to test some day. We just need to make it derive from and implement the abstract class InputVectorBase in order for it to be used in the project without any modification of the existing code. In general, all abstract classes defined and used in this project follow this guideline.
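The extension path just described can be sketched as follows; the shape of the InputVectorBase interface shown here is an assumption for illustration, as is the feature choice inside the hypothetical VectorX.

```cpp
#include <vector>

// An assumed shape for the InputVectorBase interface, for illustration.
class InputVectorBase {
public:
    virtual ~InputVectorBase() {}
    // Builds the input vector x(i) from the time/rate/pressure series.
    virtual std::vector<double> create(int i,
                                       const std::vector<double>& t,
                                       const std::vector<double>& q,
                                       const std::vector<double>& p) = 0;
};

// The hypothetical VectorX of the text: added by deriving from the
// abstract class, with no change to existing code.
class VectorX : public InputVectorBase {
public:
    std::vector<double> create(int i,
                               const std::vector<double>& t,
                               const std::vector<double>& q,
                               const std::vector<double>& p) {
        std::vector<double> x;
        x.push_back(t[i]);  // illustrative feature choices only
        x.push_back(q[i]);
        x.push_back(p[i]);
        return x;
    }
};
```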

LearnerBase: LearnerBase is the abstract class that defines the interface of the training and prediction process. Unlike TesterBase, which defines the overall work flow including data input, input vector creation, training and prediction, and data output, the abstract class LearnerBase focuses only on the training and prediction part. In the whole study, we discussed six methods, namely Methods A-F. In these six methods, the data input and the data output were the same; the difference between them came from the input vector creation (the choice of the subclass derived from InputVectorBase) and the training and prediction process (the choice of the subclass derived from LearnerBase). There are four different subclasses that implement the abstract class LearnerBase:

• ConjugateGradient_ConvolutionKernel, used by Method D,

• ConjugateGradient_ConvolutionKernelBlock, used by Method E,

• ConjugateGradient_ConvolutionKernelBlock_Advanced, used by Method F, and

• GradientDescent_Kernel, used by Methods A-C.

In case we need to study another training and prediction method, we may create another subclass that implements LearnerBase to extend the functionality without affecting the previous methods. LearnerBase is an argument of GenericTesterArgs and is invoked by GenericTester.

DerivativeBase: DerivativeBase is the abstract class that defines the interface of the derivative calculation. Currently, there is only one derivative calculation implementation, the subclass Derivative_LogTime, whose calculation algorithm follows the derivative calculation in Horne (1995).
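For reference, a derivative of this kind can be sketched as a weighted central difference in log time, in the spirit of the calculation in Horne (1995); this is not the project's Derivative_LogTime code, and the function name is illustrative.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pressure derivative dp/d(ln t) by a weighted central difference in
// log time; the endpoints are left at zero for simplicity.
std::vector<double> logTimeDerivative(const std::vector<double>& t,
                                      const std::vector<double>& p) {
    std::vector<double> d(t.size(), 0.0);
    for (std::size_t i = 1; i + 1 < t.size(); ++i) {
        double dl1 = std::log(t[i])     - std::log(t[i - 1]);
        double dl2 = std::log(t[i + 1]) - std::log(t[i]);
        double s1  = (p[i]     - p[i - 1]) / dl1;    // left slope
        double s2  = (p[i + 1] - p[i])     / dl2;    // right slope
        d[i] = (s1 * dl2 + s2 * dl1) / (dl1 + dl2);  // weighted average
    }
    return d;
}
```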

D.2 Work Flow

After the introduction of the chief classes of the project in Section D.1, the execution work flow of the programs is introduced in this section. For convenience, we take the effective rate test as an example. Fig. D.2 demonstrates the general work flow of the programs.

The program executes as follows:

1. The whole program starts from the main() function in the file main.cpp. The main() function is the entry point of the whole project; that is, no matter what test is executed, the program always starts from the main() function in the main.cpp file.

The main() function invokes the test_effectiverate() function in the file test_effectiverate.cpp to perform the details of the effective rate test. There are two major reasons for separating the detailed test content into another function in another file. First, this avoids too many lines in the main() function, keeping the program entry tidy and neat; by this means, it is easy for other developers to recognize what test is going to be executed. Second, this enables flexibility for further expansion: in case we need to perform another test, we just need to create another function in another file and have it called in main(). In fact, there is a series of different tests in this study, all coded in separate files and invoked by the main() function. Each time a test needs executing, we just replace the currently invoked function in main() with the function to be tested, and recompile the project.


Figure D.2: The work flow diagram for a common test (taking the effective rate test as an example).


2. In the test_effectiverate() function, an instance of GenericTester is created, as is an instance of GenericTesterArgs. The CaseScriptLoader is used to parse the test script file, which contains the necessary parameters and arguments for the test, such as the test file name and the result file name. These parsed arguments are then filled into the instance of GenericTesterArgs, so that when the instance of GenericTester is executed, it can access all the necessary parameters from the instance of GenericTesterArgs. In general, the preparation work of the test is completed in this step. After the preparation, the work flow of the test is started by invoking the TesterBase::execute() function in generictester.cpp.

Here, the TesterBase::execute() function is actually the GenericTester::execute() function, because the class GenericTester derives from and implements the abstract class TesterBase. However, we still use the function TesterBase::execute() to invoke the test because we want to decouple the scheme of the work flow from the detailed implementation. Supposing that one day we want to use a new test algorithm, following totally different test steps than GenericTester, to perform the effective rate sensitivity test, we may simply include this test algorithm in the project by creating a new class that realizes the algorithm while deriving from and implementing the abstract class TesterBase. This keeps the changes to the existing project to a minimum.
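A self-contained sketch of this preparation step follows; the class and function names track the text, but the members, script format, and signatures are stand-ins, not the project's actual declarations.

```cpp
#include <string>

// Stand-in argument holder (the real GenericTesterArgs carries more).
struct GenericTesterArgs {
    std::string testFile;    // full path of the input data file
    std::string resultFile;  // full path of the prediction result file
};

struct CaseScriptLoader {
    void load(const std::string& script, GenericTesterArgs& args) {
        // parse the script file and fill args; omitted in this sketch
        (void)script; (void)args;
    }
};

struct TesterBase {
    virtual ~TesterBase() {}
    virtual void execute() = 0;
};

struct GenericTester : TesterBase {
    explicit GenericTester(const GenericTesterArgs& a) : args(a) {}
    void execute() { /* the five steps described in step 3 below */ }
    GenericTesterArgs args;
};

void test_effectiverate() {
    GenericTesterArgs args;
    CaseScriptLoader().load("effectiverate.script", args);
    GenericTester tester(args);
    TesterBase& test = tester;  // invoke through the abstract interface
    test.execute();             // dispatches to GenericTester::execute()
}
```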

3. The function GenericTester::execute(), which implements TesterBase::execute() in the file generictester.cpp, is the detailed work flow of the test, with all test parameters ready (provided by the test_effectiverate() function in the last step). In GenericTester::execute(), five steps proceed as follows:

(a) ColLoader is used to load the permanent downhole gauge data from the hard disk.

(b) KernelVector_4F, which derives from and implements InputVectorBase, is called to create the input vectors for the training and prediction process.

(c) LearnerBase::trainandpredict() in the file conjugategradient_kernel.cpp is invoked to start the training and prediction process.

(d) Derivative_LogTime is used to calculate the pressure derivatives from the pressure prediction.

(e) ColWriter is used to save all predictions into the prediction result files.

Of these five steps, the first two are the data preparation for the training and prediction, and the last two are the post-processing after the prediction. The third step is the key step that performs the training and prediction process. This step is invoked by calling the function LearnerBase::trainandpredict() in the file conjugategradient_kernel.cpp. The function finally called is actually ConjugateGradient_ConvolutionKernel::trainandpredict(), because the class ConjugateGradient_ConvolutionKernel derives from the abstract class LearnerBase, and the function ConjugateGradient_ConvolutionKernel::trainandpredict() implements LearnerBase::trainandpredict(). However, we still use LearnerBase::trainandpredict() to invoke the training and prediction process; the reason is similar to the previous cases, namely to ensure more flexibility for future expansions.
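To summarize the five steps, a schematic version of GenericTester::execute() is sketched below; the helper types and member functions are illustrative stand-ins for the classes of Section D.1 (declarations only), and only the ordering of the steps follows the text.

```cpp
#include <string>
#include <vector>

typedef std::vector<double> Series;
struct TestData { Series t, q, p; };  // time, flow rate, pressure

struct GenericTesterSketch {
    std::string testFile, resultFile;

    void execute() {
        TestData data = loadColumns(testFile);             // (a) ColLoader
        std::vector<Series> x = createInputVectors(data);  // (b) InputVectorBase
        Series pPred = trainAndPredict(x, data.p);         // (c) LearnerBase
        Series dPred = logTimeDerivative(data.t, pPred);   // (d) DerivativeBase
        writeColumns(resultFile, data, pPred, dPred);      // (e) ColWriter
    }

    // Placeholders for the classes of Section D.1.
    TestData loadColumns(const std::string& path);
    std::vector<Series> createInputVectors(const TestData& data);
    Series trainAndPredict(const std::vector<Series>& x, const Series& p);
    Series logTimeDerivative(const Series& t, const Series& p);
    void writeColumns(const std::string& path, const TestData& data,
                      const Series& pPred, const Series& dPred);
};
```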

4. ConjugateGradient_ConvolutionKernel::trainandpredict(), which implements LearnerBase::trainandpredict() in the file conjugategradient_kernel.cpp, is the training and prediction process itself. This function proceeds in three steps:

(a) The function ConjugateGradient_ConvolutionKernel::generatecache() in the file conjugategradient_kernel.cpp is invoked to generate the kernel matrix (K in Eq. 4.18) using the training data set. The kernel matrix is filled element by element.

(b) The function ConjugateGradient_ConvolutionKernel::iterationcore() in the file conjugategradient_kernel.cpp is invoked to solve the training equation, Eq. 4.18. In this function, the singular value decomposition is applied to precondition the kernel matrix to a condition number of 10^6 or less, and then β is obtained by solving the training equation, Eq. 4.18.

(c) With the obtained β, the prediction is made using Eq. 4.22.
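A minimal self-contained sketch of steps (a) and (b) follows: the kernel matrix is filled element by element, and the training system K β = y is then solved. A plain conjugate gradient solve is shown in place of the SVD-preconditioned solver described above, and all names are illustrative.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

static Vec matvec(const Mat& K, const Vec& v) {
    Vec out(K.size(), 0.0);
    for (std::size_t i = 0; i < K.size(); ++i) out[i] = dot(K[i], v);
    return out;
}

// Step (a): fill the kernel matrix element by element, K_ij = K(x(i), x(j)).
Mat buildKernelMatrix(const std::vector<Vec>& x,
                      double (*kernel)(const Vec&, const Vec&)) {
    Mat K(x.size(), Vec(x.size()));
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            K[i][j] = kernel(x[i], x[j]);
    return K;
}

// Step (b), simplified: solve K beta = y by plain conjugate gradient
// (applicable because a kernel matrix is symmetric positive semidefinite).
Vec solveCG(const Mat& K, const Vec& y, int maxIter, double tol) {
    Vec beta(y.size(), 0.0), r = y, p = y;  // with beta = 0, residual r = y
    double rs = dot(r, r);
    for (int it = 0; it < maxIter && rs > tol; ++it) {
        Vec Kp = matvec(K, p);
        double a = rs / dot(p, Kp);
        for (std::size_t i = 0; i < y.size(); ++i) {
            beta[i] += a * p[i];
            r[i]    -= a * Kp[i];
        }
        double rsNew = dot(r, r);
        for (std::size_t i = 0; i < y.size(); ++i)
            p[i] = r[i] + (rsNew / rs) * p[i];
        rs = rsNew;
    }
    return beta;
}
```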


These steps demonstrate the work flow of the program execution. Although this demonstration is for the effective rate sensitivity test, it is generic for all the tests in the study: when executing other tests, all we need to do is replace the function test_effectiverate() in the file test_effectiverate.cpp with other test functions. By using the abstract classes as the interfaces in the function calls, this work flow decouples most of the dependencies between classes, so that future expansions will not affect the current code.

Nomenclature

〈·, ·〉 the inner product of two vectors

α the learning rate

∆p pressure drop (psi)

∆p0 pressure response kernel function to a constant flow rate

∆pw wellbore pressure drop

∆q flow rate drop (STB/d)

K(·, x(i)) half-evaluated kernel function; works as the basis function in HK

Φ (x) the transformation over x

HK reproducing kernel Hilbert space associated with kernel function K

hθ (x) the hypothesis function with parameters θ

λ transmissivity ratio

H the Hessian matrix

x x = (x_1, x_2, . . . , x_Nx)^T is the general form of the input vector of input values

Kx(i)(x) representer of evaluation at x(i), equal to K(x, x(i))

K kernel matrix; K is defined so that its (i, j)-entry is given by Kij = K(x(i), x(j))

µ viscosity (cp)



Niter the total number of the iterations before convergence

Ω storativity ratio

φ porosity

q(i) the flow rate at time t(i)

q(i)j the jth constant flow rate share of flow rate q(i)

σ the parameter in Gaussian kernel function to control the Gaussian curve’s

decay speed.

t(i) the time point at which ith pressure was measured

t(i)j the time elapsed from the start of the jth constant flow rate share to the time t(i)

β the coefficients in the linear combination of kernel basis functions used to approximate the true f in HK; equal to (β_1, . . . , β_m)^T, where m = Np in the context of this project

θ θ = (θ1, θ2, . . . , θNθ)T is a vector of model parameters

θ[m] the θ value in the mth iteration

xpred a given input which the prediction is required to make

x(i)k the general form of input vector of kth part of x(i)

y the general form of the observation vector, consisting of the observation at each sampling point; equal to (y(1), . . . , y(m))^T, where m = Np in the context of this project

ypred the general form of the prediction by the hypothesis hθ (x) at x

B formation volume factor (res vol/std vol)

C wellbore effect coefficient


Ct total compressibility (/psi)

d the power in the linear kernel, an integer no less than 1

h thickness (ft)

k permeability (md)

Nθ the number of model parameters

Nx the number of elements of each input vector

Ni the total number of flow rate change events before x(i)

Np the number of the observed (measured) pressures

pi reservoir initial pressure (psi)

Q(1) the cumulative oil production at t(1)

qeff the effective rate for incomplete production history

re investigated radius of the reservoir (ft)

rw wellbore radius (ft)

s skin factor

t time (hours)

teff the effective start time for incomplete production history

yobs the general form of the observed value y

yobs(i) the ith observed value y

Breakpoint a point where the flow rate change event happens. It usually indicates

the end of the previous transient and the beginning of the next transient.

CG Conjugate Gradient


Deconvolution the process that represents the pressure transient of a variable flow rate history in the form of a constant flow rate profile

FFT Fast Fourier Transform

KV3F a kernel input vector with three features

KV4FA first kind of kernel input vector with four features

KV4FB second kind of kernel input vector with four features

KV5F a kernel input vector with five features

LMS least-mean-square

MAP Maximum A Posteriori

PBU pressure buildup

PDG Permanent Downhole Gauge

RKHS reproducing kernel Hilbert space

SGD Steepest Gradient Descent

Bibliography

Ahn, S. and Horne, R. (2008). Analysis of permanent downhole gauge data by coint-

erpretation of simultaneous pressure and flow rate signals. SPE Annual Technical

Conference and Exhibition. SPE-115793-MS.

Alexandrov, O. (2007). Illustration of conjugate gradient method. A figure generated by Matlab code. Internet resource. Retrieved from http://en.wikipedia.org/wiki/Conjugate_gradient_method on Sept. 20, 2012.

Athichanagorn, S. (1999). Development of an Interpretation Methodology for Long-

Term Pressure Data from Permanent Downhole Gauges. PhD dissertation, Stan-

ford University.

Athichanagorn, S., Horne, R., and Kikani, J. (2002). Processing and interpretation

of long-term data acquired from permanent pressure gauges. SPE Reservoir Eval-

uation & Engineering, 3(3):384–391. SPE-80287-PA.

Berg, C., Christensen, J., and Ressel, P. (1984). Harmonic Analysis on Semigroups:

Theory of Positive Definite and Related Functions. Springer, Berlin.

Blanchard, G. and Kramer, N. (2010). Optimal learning rates for kernel conjugate

gradient regression. Advances in Neural Information Processing Systems (NIPS),

23:226–234.

Caers, J. (2009). Optimization and inverse modeling. Stanford University Energy

Resources Engineering Lecture Notes.



Chalaturnyk, R. and Moffatt, T. (1995). Permanent instrumentation for production

optimization and reservoir management. International Heavy Oil Symposium. SPE-

30274-MS.

Collins, M. and Duffy, N. (2002). Convolution kernels for natural language. Advances

in Neural Information Processing Systems, 14(1):625–632.

de Oliveira, S. and Kato, E. (2004). Reservoir management optimization using per-

manent downhole gauge data. SPE Annual Technical Conference and Exhibition.

SPE-90973-MS.

Donoho, D. and Johnstone, I. (1994). Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90(432):1200–1224.

Duru, O. (2011). Reservoir Analysis and Parameter Estimation Constrained to Temperature, Pressure and Flowrate Histories. PhD dissertation, Stanford University.

Duru, O. and Horne, R. (2010). Modeling reservoir temperature transients and

reservoir-parameter estimation constrained to the model. SPE Reservoir Evalu-

ation & Engineering, 13(4):873–883. SPE-115791-PA.

Duru, O. and Horne, R. (2011). Simultaneous interpretation of pressure, temperature,

and flow-rate data using Bayesian inversion methods. SPE Reservoir Evaluation &

Engineering, 14(2):226–238. SPE-124827-PA.

Eck, J., Ewherido, U., Mohammed, J., Ogunlowo, R., Ford, J., Fry, L., S., H., Osugo,

L., Simonian, S., Oyewole, T., and Veneruso, T. (2000). Downhole monitoring:

The story so far. Oilfield Review, pages 20–33.

Evgeniou, T., Pontil, M., and Poggio, T. (2000). Regularization networks and support

vector machines. Advances in Computational Mathematics, 13(1):1–50.

Grubbs, F. (1969). Procedures for detecting outlying observations in samples. Tech-

nometrics, 11(1):1–21.


Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical

Learning: Data Mining, Inference, and Prediction. Springer, Berlin.

Haussler, D. (1999). Convolution kernels on discrete structures. Research note, Uni-

versity of California at Santa Cruz.

Horne, R. (1995). Modern Well Test Analysis. Petroway, Palo Alto, CA, second

edition.

Horne, R. (2007). Listening to the reservoir – interpreting data from permanent

downhole gauges. JPT, 59(12):78–86. SPE-103513-MS.

Khong, K. (2001). Permanent downhole gauge data interpretation. Master report,

Stanford University.

Koller, D. and Friedman, N. (2009). Probabilistic Graphical Models: Principles and

Techniques (Adaptive Computation and Machine Learning). MIT Press, Cam-

bridge.

Konopczynski, M. and McKay, C. (2009). Closing the loop on intelligent completions.

Offshore, 69(9).

Kragas, T., Turnbull, B., and Francis, M. (2004). Permanent fiber-optic monitoring at Northstar: Pressure/temperature system and data overview. SPE Production and Facilities, 19(2):86–93. SPE-87681-PA.

Laskov, P. and Nelson, B. (2012). Theory of kernel functions. University Tubingen,

Germany. Lecture Notes for Advanced Topics in Machine Learning.

Lee, J. (2003). Analyzing rate data from permanent downhole gauges. Master report,

Stanford University.

Levitan, M., Crawford, G., and Hardwick, A. (2006). Practical considerations for

pressure-rate deconvolution of well-test data. SPE Journal, 11(1):35–47. SPE-

90680-PA.


Liu, Y. (2009). The cointerpretation of flow rate and pressure data from perma-

nent downhole gauges using wavelet and data mining approaches. Master report,

Stanford University.

Nestlerode, W. (1963). The use of pressure data from permanently installed bottom-

hole pressure gauges. SPE Rocky Mountain Joint Regional Meeting. SPE-590-MS.

Ng, A. (2009). Machine learning lecture notes. Stanford University Computer Science

Lecture Notes.

Nomura, M. (2006). Processing and Interpretation of Pressure Transient Data from

Permanent Downhole Gauges. PhD dissertation, Stanford University.

Ouyang, L. and Kikani, J. (2002). Improving permanent downhole gauge (PDG)

data processing via wavelet analysis. SPE 13th European Petroleum Conference.

SPE-78290-MS.

Ouyang, L. and Sawiris, R. (2003). Production and injection profiling: A novel appli-

cation of permanent downhole pressure gauges. SPE Annual Technical Conference

and Exhibition. SPE-84399-MS.

Rai, H. (2005). Analyzing rate data from permanent downhole gauges. Master report,

Stanford University.

Ramey, H. (1970). Approximate solutions for unsteady liquid flow in composite reser-

voirs. The Journal of Canadian Petroleum, 9(1):32–37.

Tan, P., Steinbach, M., and Kumar, V. (2005). Introduction to Data Mining. Addison

Wesley, Boston, Massachusetts.

Trefethen, L. and Bau, D. (1997). Numerical Linear Algebra. SIAM, Philadelphia,

PA.

Veneruso, A., Economides, C., and Akmansoy, A. (1992). Computer based downhole

data acquisition and transmission in well testing. SPE Annual Technical Conference

and Exhibition. SPE-24728-MS.


von Schroeter, T., Hollaender, F., and Gringarten, A. (2004). Deconvolution of well-test data as a nonlinear total least-squares problem. SPE Journal, 9(4):375–390. SPE-77688-PA.

Wahba, G. (1990). Spline Models for Observational Data. SIAM, Philadelphia, PA.

Wallace, C. and Boulton, D. (1968). An information measure for classification. Com-

puter Journal, 11(3):185–194.

Zheng, S. and Li, X. (2007). Analyzing transient pressure from permanent downhole

gauges (PDG) using wavelet method. SPE Europe/EAGE Annual Conference and

Exhibition. SPE-107521-MS.

Zheng, S. and Wang, F. (2011). Recovering flowing history from transient pres-

sure of permanent down-hole gauges (PDG) in oil and water two-phase flowing

reservoir. SPE/DGS Saudi Arabia Section Technical Symposium and Exhibition.

SPE-149100-MS.