Image recognition

IMAGE RECOGNITION IN SEISMIC INTERPRETATION: UTILIZING DEEP LEARNING WITH CONVOLUTIONAL NEURAL NETWORKS


HOW DOES IMAGE RECOGNITION WORK?

Before we look at how it can be used with seismic data, we will take some time to explain:

• the eye – foundation of image recognition

• how image recognition is performed on a computer, and what it does.

We will go into depth explaining Artificial Neural Networks: how algorithms try to mimic the human eye and its ability to recognize patterns and images in general.

Then we will explain the components of the Convolutional Neural Network (CNN) methodology and how it's implemented today in existing technologies.

References to other presentations made by the author on how it can be used in seismic interpretation are given on the final slide of this presentation.

FEED-FORWARD ARTIFICIAL NEURAL NETWORK

A convolutional neural network (CNN, or ConvNet) is a type of feed-forward artificial neural

network where the individual neurons are tiled in such a way that they respond to overlapping

regions in the visual field.

Convolutional networks were inspired by biological processes and are variations of multilayer

perceptrons which are designed to use minimal amounts of preprocessing. They are widely used

models for image and video recognition.

We think this technology and methodology could be used on seismic data to perform pattern recognition, and to train on data in order to reveal geometries resembling seismic facies, sequences, play types, trap types and so forth.

HUMAN EYE – THE NEURONS

Human eye and retina: a five-layer extension of the brain and portal to the outside world (from the article “Space-time wiring specificity supports direction selectivity in the retina”, Jinseop S. Kim et al., 2013).

NEURONS (BIOLOGICAL VS ARTIFICIAL)

An artificial neuron is a mathematical function conceived as a model of biological neurons. Artificial neurons

are the constitutive units in an artificial neural network. Depending on the specific model used they may be

called a semi-linear unit, Nv neuron, binary neuron, linear threshold function, or McCulloch–Pitts

(MCP) neuron. The artificial neuron receives one or more inputs (representing dendrites) and sums them to

produce an output, representing a neuron's axon. Usually the sums of each node are weighted, and the sum

is passed through a non-linear function known as an activation function or transfer function. The transfer

functions usually have a sigmoid shape, but they may also take the form of other non-linear functions,

piecewise linear functions, or step functions. They are also often monotonically increasing, continuous,

differentiable and bounded.

The artificial neuron transfer function should not be confused with a linear system's transfer function.
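The weighted sum and sigmoid transfer function described above can be sketched in a few lines of Python (a minimal illustration; the function name and values are hypothetical, not from the presentation):

```python
import math

def artificial_neuron(inputs, weights, bias):
    """Sum the weighted inputs (the "dendrites") and pass the result through
    a sigmoid transfer function: bounded, monotonically increasing,
    continuous and differentiable, as described above."""
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-s))
```

With all-zero input the neuron outputs 0.5, the midpoint of the sigmoid; large positive or negative sums saturate toward 1 or 0.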

[Figure: biological neuron vs. artificial neuron]

ARTIFICIAL NEURONS

Artificial neurons are designed to mimic aspects of their biological counterparts.

Dendrites – In a biological neuron, the dendrites act as the input vector. These dendrites allow the cell to receive signals from a large (>1000)

number of neighboring neurons. As in the above mathematical treatment, each dendrite is able to perform "multiplication" by that dendrite's

"weight value." The multiplication is accomplished by increasing or decreasing the ratio of synaptic neurotransmitters to signal chemicals

introduced into the dendrite in response to the synaptic neurotransmitter. A negative multiplication effect can be achieved by transmitting signal

inhibitors (i.e. oppositely charged ions) along the dendrite in response to the reception of synaptic neurotransmitters.

Soma – In a biological neuron, the soma acts as the summation function, seen in the previous slide mathematical description. As positive and

negative signals (exciting and inhibiting, respectively) arrive in the soma from the dendrites, the positive and negative ions are effectively added in

summation, by simple virtue of being mixed together in the solution inside the cell's body.

Axon – The axon gets its signal from the summation behavior which occurs inside the soma. The opening to the axon essentially samples the electrical potential of the solution inside the soma. Once the soma reaches a certain potential, the axon will transmit an all-or-nothing signal pulse down its length. In this regard, the axon corresponds to the output by which our artificial neuron connects to other artificial neurons.

Unlike most artificial neurons, however, biological neurons fire in discrete pulses. Each time the electrical potential inside the soma reaches a

certain threshold, a pulse is transmitted down the axon. This pulsing can be translated into continuous values. The rate (activations per second,

etc.) at which an axon fires converts directly into the rate at which neighboring cells get signal ions introduced into them. The faster a biological

neuron fires, the faster nearby neurons accumulate electrical potential (or lose electrical potential, depending on the "weighting" of the dendrite

that connects to the neuron that fired). It is this conversion that allows computer scientists and mathematicians to simulate biological neural

networks using artificial neurons which can output distinct values (often from −1 to 1).

CNN & IMAGE RECOGNITION

When used for image recognition, convolutional neural networks (CNNs) consist of multiple layers of small neuron collections

which look at small portions of the input image, called receptive fields. The results of these collections are then tiled so that

they overlap to obtain a better representation of the original image; this is repeated for every such layer. Because of this, they

are able to tolerate translation of the input image. Convolutional networks may include local or global pooling layers, which

combine the outputs of neuron clusters. They also consist of various combinations of convolutional layers and fully connected

layers, with pointwise nonlinearity applied at the end of or after each layer. The architecture is inspired by biological processes. To avoid the billions of parameters that would arise if all layers were fully connected, the idea of using a convolution operation on small regions was introduced. One major advantage of convolutional networks is the use of shared weights in

convolutional layers, which means that the same filter (weights bank) is used for each pixel in the layer; this both reduces

required memory size and improves performance.
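The shared-weight convolution over receptive fields, and the pooling that combines neighbouring outputs, can be sketched in plain Python (a minimal illustration with hypothetical function names, not a production implementation):

```python
def conv2d(image, kernel):
    """Slide one shared kernel (the "weights bank") over every receptive
    field of the image; the same weights are reused at each position."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]

def max_pool2(fmap):
    """2x2 max pooling: combine the outputs of neighbouring neuron clusters."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]
```

Because the kernel is shared across all positions, the parameter count is independent of the image size, which is the memory saving mentioned above.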

Some time delay neural networks also use a very similar architecture to convolutional neural networks, especially those for

image recognition and/or classification tasks, since the "tiling" of the neuron outputs can easily be carried out in timed stages

in a manner useful for analysis of images.

Compared to other image classification algorithms, convolutional neural networks use relatively little pre-processing. This means that the network is responsible for learning the filters that in traditional algorithms were hand-engineered. The lack of dependence on prior knowledge and on difficult-to-design hand-engineered features is a major advantage for CNNs.

CNN & IMAGE RECOGNITION – RECEPTIVE FIELDS

The receptive field of an individual sensory neuron is the particular region of the sensory space (e.g., the body surface, or

the retina) in which a stimulus will trigger the firing of that neuron. This region can be a hair in the cochlea or a piece of skin,

retina, tongue or other part of an animal's body. Additionally, it can be the space surrounding an animal, such as an area of

auditory space that is fixed in a reference system based on the ears but that moves with the animal as it moves (the space

inside the ears), or in a fixed location in space that is largely independent of the animal's location (place cells). Receptive

fields have been identified for neurons of the auditory system, the somatosensory system, and the visual system.

The term receptive field was first used by Sherrington (1906) to describe the area of skin from which a scratch reflex could be

elicited in a dog. According to Alonso and Chen (2008) it was Hartline (1938) who applied the terms to single neurons, in this

case from the retina of a frog.

The concept of receptive fields can be extended further up to the neural system; if many sensory receptors all form synapses

with a single cell further up, they collectively form the receptive field of that cell. For example, the receptive field of a ganglion

cell in the retina of the eye is composed of input from all of the photoreceptors which synapse with it, and a group of ganglion

cells in turn forms the receptive field for a cell in the brain. This process is called convergence.

Receptive field = center + surround

CNN & IMAGE RECOGNITION – CONVOLUTION

In mathematics and, in particular, functional analysis, convolution is a mathematical operation on two functions f and g,

producing a third function that is typically viewed as a modified version of one of the original functions, giving the area

overlap between the two functions as a function of the amount that one of the original functions is translated. Convolution is

similar to cross-correlation. It has applications that include probability, statistics, computer vision, natural language

processing, image and signal processing, engineering, and differential equations.

The convolution can be defined for functions on groups other than Euclidean space. For example, periodic functions, such as

the discrete-time Fourier transform, can be defined on a circle and convolved by periodic convolution. A discrete convolution

can be defined for functions on the set of integers. Generalizations of convolution have applications in the field of numerical

analysis and numerical linear algebra, and in the design and implementation of finite impulse response filters in signal

processing.

Computing the inverse of the convolution operation is known as deconvolution.
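The discrete convolution on the set of integers mentioned above can be written out directly (a minimal sketch; the function name is hypothetical):

```python
def convolve(f, g):
    """Discrete convolution of two finite sequences:
    (f * g)[n] = sum over m of f[m] * g[n - m]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj  # each term f[m] * g[n - m] lands at index n = i + j
    return out
```

For example, convolving a boxcar [1, 1, 1] with itself yields the triangle [1, 2, 3, 2, 1], the overlap area as one function slides past the other.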

CNN & IMAGE RECOGNITION – POINTWISE NONLINEARITY

From “Nonlinear Digital Filters: Principles and Applications”, by Ioannis Pitas and Anastasios N. Venetsanopoulos.

CNN & IMAGE RECOGNITION – TIME DELAY NEURAL NETWORK

Time delay neural network (TDNN) is an artificial neural network architecture whose primary purpose is to work on sequential data. The TDNN units recognize features independent of time-shift (i.e. sequence position) and usually form part of a larger pattern recognition system, for example converting continuous audio into a stream of classified phoneme labels for speech recognition.

An input signal is augmented with delayed copies as additional inputs; the neural network is time-shift invariant since it has no internal state.

The original paper presented a perceptron network whose connection weights were trained with the back-propagation algorithm; this may be done in batch or online. The Stuttgart Neural Network Simulator implements that version.
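The delayed-copies idea can be sketched as follows (a hypothetical helper, assuming a 1D input signal), showing how a stateless feed-forward network is given a sliding time window:

```python
def delayed_copies(signal, n_delays):
    """Augment each time step with n_delays delayed copies of the signal,
    so a network with no internal state sees a sliding window of the past."""
    return [signal[t - n_delays : t + 1]
            for t in range(n_delays, len(signal))]
```

Each window is then fed to an ordinary feed-forward network, which therefore responds to the same feature regardless of where it occurs in time.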

CNN & IMAGE RECOGNITION – CLASSIFICATION

Some examples of typical computer vision tasks are presented here.

Recognition
The classical problem in computer vision, image processing, and machine vision is that of determining whether or not the image data contains some specific object, feature, or activity. Different varieties of the recognition problem are described in the literature:

Object classification
One or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene. Google Goggles and LikeThat are stand-alone programs that illustrate this function.

Identification
An individual instance of an object is recognized. Examples include identification of a specific person's face or fingerprint, identification of handwritten digits, or identification of a specific vehicle.

Detection
The image data are scanned for a specific condition. Examples include detection of possible abnormal cells or tissues in medical images or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used for finding smaller regions of interesting image data which can be further analyzed by more computationally demanding techniques to produce a correct interpretation.

FIRST WE DO PATTERN RECOGNITION

• A pattern is an object, process or event.

• A class (or category) is a set of patterns that share common attributes (features), usually from the same information source.

• During recognition (or classification), classes are assigned to objects.

• A classifier is a machine that performs such a task.

“The assignment of a physical object or event to one of several pre-specified categories” -- Duda & Hart
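As one concrete (and hypothetical) example of such a machine, a nearest-centroid rule assigns an object to the class whose prototype it is closest to in feature space:

```python
def nearest_centroid_classify(x, centroids):
    """Assign feature vector x to the class whose centroid is nearest
    (squared Euclidean distance) -- a minimal classifier in the
    Duda & Hart sense."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))
```

Here `centroids` is a mapping from class label to a prototype feature vector; real classifiers learn such prototypes (or more general decision boundaries) from training samples.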

Armando Vieira & Bernardete Ribeiro (2008)

WHAT IS A PATTERN?

“A pattern is the opposite of a chaos; it is an entity vaguely defined, that could be given a name.”

Examples of Patterns

Patterns of Constellations

Patterns of constellations are represented by 2D planar graphs.

Human perception has a strong tendency to find patterns in anything. We see patterns even in random noise, and we are more likely to believe in a hidden pattern than to deny it, because the reward for discovering a pattern (or the risk of missing one) is often high.

EXAMPLES OF PATTERNS

Biological Patterns ---morphology

Landmarks are identified from biological forms, and these patterns are then represented by a list of points. But for other forms, like the roots of plants, points cannot be registered across instances.

Applications: Biometrics, computational anatomy, brain mapping, …

EXAMPLES OF PATTERNS

Music Patterns

EXAMPLES OF PATTERNS

Discovery and Association of Patterns

EXAMPLES OF PATTERNS

EXAMPLES OF PATTERNS

Discovery and Association of Patterns

A broad range of texture patterns are generated by stochastic processes.

EXAMPLES OF PATTERNS

Object Recognition

EXAMPLES OF PATTERNS

Maps Recognition

Patterns of environment

EXAMPLES OF PATTERNS

APPROACHES TO IMAGE RECOGNITION

• Statistical PR: based on underlying statistical model of patterns and pattern classes.

• Neural networks: classifier is represented as a network of cells modeling neurons of the human brain

(connectionist approach).

• Structural (or syntactic) PR: pattern classes represented by means of formal structures such as grammars, automata, strings, etc.

PROBLEM FORMULATION

Input object → Measurements → Preprocessing → Features → Classification → Class label

Basic ingredients:

• measurement space (e.g., image intensity, pressure)

• features (e.g., corners, spectral energy)

• classifier - soft and hard

• decision boundary

• training sample

• probability of error
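The measurement → preprocessing → features → classification chain above can be sketched as a single composition (the function names are hypothetical placeholders):

```python
def recognize(raw_measurement, preprocess, extract_features, classify):
    """Run the pattern recognition pipeline: preprocess the raw
    measurement, extract features, then map the features to a class
    label with a classifier."""
    return classify(extract_features(preprocess(raw_measurement)))
```

In a seismic setting the raw measurement would be amplitude data, the features might be textures or geometries, and the classifier would assign a facies or play-type label; here each stage is just a stand-in function.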

DESIGN CYCLE

1. feature selection and extraction

• What are good discriminative features?

2. modeling and learning

3. dimension reduction, model complexity

4. decisions and risks

5. error analysis and validation

6. performance bounds and capacity

7. algorithms


DATA COLLECTION

How do we know when we have collected an adequately large and

representative set of examples for training and testing the system?

FEATURE CHOICE

Depends on the characteristics of the problem domain.

Simple to extract, invariant to irrelevant transformation,

insensitive to noise.

MODEL CHOICE

When unsatisfied with the performance of our linear object classifier, we may want to jump to another class of model.

TRAINING

Use data to determine the classifier. There are many different procedures for training classifiers and choosing models.

EVALUATION

Measure the error rate (or performance) and switch

from one set of features & models to another one.

COMPUTATIONAL COMPLEXITY

What is the trade-off between computational ease and performance? (How does an algorithm scale as a function of the number of features, the number of training examples, and the number of patterns or categories?)

CNN IN SEISMIC INTERPRETATION

Below, you will find some links to previous presentations I have made with a focus on image recognition as a tool in seismic interpretation, with emphasis on play, trap and seismic stratigraphy:

http://www.slideshare.net/StigArneKristoffersen/future-trends-of-seismic-analysis?ref=https://www.linkedin.com/profile/preview?locale=en_US&trk=prof-0-sb-preview-primary-button

http://www.slideshare.net/StigArneKristoffersen/not-54557734?ref=https://www.linkedin.com/profile/preview?locale=en_US&trk=prof-0-sb-preview-primary-button

http://media.wix.com/ugd/c193bc_d9e608ed875d4e208db1a6e8e6b5bb77.pdf

http://media.wix.com/ugd/c193bc_4088f96c14b848a3a9f3c720f1e3445d.pdf