Oneiric Machine Learning
The Foundations of Dream Inspired Adaptive Systems
Julian Holley
Faculty of Computing, Engineering and Mathematical Sciences
University of the West of England
A thesis submitted in partial fulfilment of the requirements of the University of the West of England, Bristol for the degree of
Doctor of Philosophy
November 2008
Acknowledgements
Foremost I would like to thank my supervisory team, Director of Stud-
ies Dr. Tony Pipe and Dr. Brian Carse. I would especially like to
emphasise their enthusiasm, understanding, support and friendship
over a protracted and turbulent period in which I have worked on
this study.
Secondly I would like to thank the general manager Mr. Alan Bromley
& former manager and director Abdul Basharat of Clares Merchandise
Handling Equipment Ltd., my former employer and sponsor. I would
especially like to acknowledge how I have been able to restore the
balance between work commitments and my personal aspirations as
a researcher in recent months. I would also like to express my thanks
to Professor Larry Bull, responsible for my recent employment and
for his support during the final stages.
Thirdly, for the degree to which the semantic clarity and comprehension
of the thesis has been improved, I owe many thanks to Anthony Chandor.
Any remaining errors indicate work that did not pass under his critical
eye, and for these I alone am responsible.
Finally and with due humility for the space in her life that I have
selfishly stolen for the sake of my own personal quest for understanding,
I thank Sarah.
Copyright Notice
This copy has been supplied on the understanding that it
is copyright material and that no quotation from the thesis
may be published without proper acknowledgement.
Abstract
Artificial adaptive systems inspired or derived from neuro-biological
components and processes have shown great promise at several lev-
els. One behaviour required for the continuous functional operation
of advanced neuro-biological systems is sleep. A definitive function or
purpose for sleep, and for the associated phenomenology such as
dreaming, remains elusive. Correspondingly there remain many unresolved
issues within the domain of artificial learning systems. One such as-
pect that largely remains intractable is the management of experiences
once learned and encoded. This is the general problem of developing a
persuasive explanation or scalable strategy for the contiguous organi-
sation of internal representation and memory within finite resources;
it is from this parallel perspective that this research is set.
This research is an exploration into the cognition of sleep and dream-
ing in humans and animals. Positioned between sleep & dreaming
research and the machine learning domain, this thesis reports on an
approach to improve the latter by formulating theories emerging from
the former.
Recent research investigating the responsibility of sleep processes in
modifying memory has shown that for the avian and mammalian
brain, sleep plays an important role in long term cognitive
development. A set of observations is created from the current
understanding of both the benefits of sleep and the processes involved, including
dreaming. From these observations the first contribution of this thesis
is presented; several proposals for the cognitive benefits of sleep and
dreaming in aspects of perception, consolidation, scalability, general-
isation and representational conceptualisation.
Previous research has investigated some aspects of sleep and dreaming
in relation to machine learning. These have been positioned at two
extremes of the machine learning paradigm; low level, emergent be-
haviour of artificial neural networks or high level, directed behaviour
of symbolic artificial intelligence. This is the first report of direct
research into the translation of these benefits by mechanisms analogous
to sleep and dreaming, at a level in between earlier research. This
combination is characterised by creating a foundation for a new genre
of artificial learning strategies derived directly from sleep and dream
phenomenology, Oneiric Machine Learning.1
Anticipatory classifier systems (ACS) represent a niche group of ma-
chine learning systems derived from the established machine learning
field of learning classifier systems (LCS). ACS are capable of latent
learning: learning for the reward of learning, and subsequently creating
an internal generalised model of the environment. This feature,
aligned within the LCS framework, provides an ideal developmental
template. A review of the latent learning background and ACS al-
gorithmic detail sets the basis for several applications illustrative of
the Oneiric Machine Learning approach. Empirical evidence demon-
strates how an adapted ACS system can exploit a dreamlike emergent
thread based on an incomplete, generalised model of the environment
to reduce the number of real actions required to reach model compe-
tency. Conceptual solutions to restrictions limiting the role to which
ACS/LCS systems can represent some aspects advocated by Oneiric
Machine Learning are presented.
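As an illustrative aid only, the latent learning mechanism summarised above can be sketched as a toy condition-action-effect rule chained off-line. The fragment below is a hypothetical Python sketch: the bounded-walk states, the rule list and the function names are assumptions for illustration, not the ACS implementation used in this thesis.

```python
# Hypothetical sketch of an anticipatory classifier system (ACS) rule:
# condition -> action -> effect, with '#' as a pass-through symbol.
# The 3-cell bounded-walk world and all names are illustrative only.

def anticipate(classifier, state):
    """If the condition matches the state, return the anticipated next state."""
    cond, action, effect = classifier
    if all(c in ('#', s) for c, s in zip(cond, state)):
        # A pass-through '#' in the effect copies the attribute unchanged.
        return ''.join(s if e == '#' else e for e, s in zip(effect, state))
    return None  # rule does not apply in this state

# A latently learned internal model: rules acquired without external reward.
# States are one-hot agent positions in a 3-cell corridor.
model = [
    ('100', 'right', '010'),
    ('010', 'right', '001'),
]

# 'Dream' thread: chain anticipations through the internal model,
# substituting imagined transitions for real actions in the world.
state = '100'
trace = [state]
for rule in model:
    nxt = anticipate(rule, state)
    if nxt is None:
        break  # incomplete model: the dream thread ends here
    state = nxt
    trace.append(state)

print(trace)  # -> ['100', '010', '001']
```

The point of the sketch is only that an incomplete generalised model can be traversed internally, so that imagined transitions substitute for some real actions.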
In mitigation of these restrictions, two novel prototype systems are
described; the first introduces a method of implicitly managing state
generalisation by the building of concept links into the classifier rule.
The second illustrates automatic state alias triggered state
augmentation and off-line resolution. Although remaining under
development, results in these new directions present plausible systems
level architectures that are in part experimentally demonstrated. Novel
solutions are presented to structural and procedural problems that
promote the future development of cognitive systems within the LCS
framework, setting a direction for future studies.
1. Oneiric: of or relating to dreams or dreaming. Adapted from Oneiric Behaviour (Jouvet, 1979), used to describe rapid eye movement (REM) sleep re-animation.
Contents
1 Introduction 1
1.1 Paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Nature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Thesis Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.5.1 Part I: Sleep, Dreaming & Learning . . . . . . . . . . . . . 11
1.5.2 Part II: Latent & Machine Learning . . . . . . . . . . . . . 12
1.5.3 Part III: Heterogenesis . . . . . . . . . . . . . . . . . . . . 12
1.5.3.1 Hypothesis . . . . . . . . . . . . . . . . . . . . . 12
1.5.3.2 Consolidation . . . . . . . . . . . . . . . . . . . . 13
1.5.3.3 Generalisation . . . . . . . . . . . . . . . . . . . 13
1.5.3.4 State Augmentation . . . . . . . . . . . . . . . . 14
1.6 Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
I Sleep, Dreaming and Learning 16
2 Foundations of Sleep and Dreaming 17
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 Stages of Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Slow Wave Sleep Stage . . . . . . . . . . . . . . . . . . . . 23
2.2.2 REM Sleep Stage . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Physiology of Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.1 Endocrine System . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.2 Thermoregulation . . . . . . . . . . . . . . . . . . . . . . . 25
2.3.3 Respiratory System . . . . . . . . . . . . . . . . . . . . . . 25
2.3.4 Autonomic System . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Ontogenesis of Sleep . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 Neurophysiology of Sleep and Dreaming . . . . . . . . . . . . . . 27
2.5.1 State Transition . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5.1.1 Waking to Slow Wave Sleep . . . . . . . . . . . . 28
2.5.1.2 Slow Wave Sleep to REM Sleep . . . . . . . . . . 29
2.5.1.3 Sleep to Waking . . . . . . . . . . . . . . . . . . 30
2.5.2 Waking . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.5.3 Slow Wave Sleep . . . . . . . . . . . . . . . . . . . . . . . 32
2.5.4 REM Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.6 Neurological Summary of Sleep . . . . . . . . . . . . . . . . . . . 38
2.6.1 Hippocampus . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.6.2 Neurology of Memory . . . . . . . . . . . . . . . . . . . . . 50
2.7 Mentation During Sleep . . . . . . . . . . . . . . . . . . . . . . . 54
2.7.1 Hypnagogic Sleep . . . . . . . . . . . . . . . . . . . . . . . 54
2.7.2 Lucid Dreaming . . . . . . . . . . . . . . . . . . . . . . . . 55
2.7.3 Daydreaming . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.8 Sleep and Dreaming in Animals . . . . . . . . . . . . . . . . . . . 56
2.8.1 Dreaming Cats . . . . . . . . . . . . . . . . . . . . . . . . 56
2.8.2 Dreaming Rats . . . . . . . . . . . . . . . . . . . . . . . . 60
2.8.3 Dreaming Birds . . . . . . . . . . . . . . . . . . . . . . . . 65
2.9 Sleep and Dreaming in Humans . . . . . . . . . . . . . . . . . . . 70
2.9.1 Human Dreams . . . . . . . . . . . . . . . . . . . . . . . . 70
2.9.2 Correlates of Dreaming . . . . . . . . . . . . . . . . . . . . 71
2.10 Memory and Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.11 Isomorphism, Creativity and Insight . . . . . . . . . . . . . . . . . 85
2.11.1 Antithesis Sleep and Memory Adaptation . . . . . . . . . . 92
2.12 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
II Latent Learning 98
3 Latent Learning: A Review 99
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.2 Latent Learning: an Historical Road Map . . . . . . . . . . . . . 99
3.3 Selected Reviews . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.3.1 The Effect of Introduction of Reward Upon the Maze . . . 101
3.3.2 Purposive Behaviour in Animals and Men . . . . . . . . . 101
3.3.3 A Theoretical Derivation of Latent Learning . . . . . . . . 101
3.3.4 An Experimental Analysis of Latent Learning . . . . . . . 101
3.3.4.1 Review . . . . . . . . . . . . . . . . . . . . . . . 102
3.3.4.2 The Experiment . . . . . . . . . . . . . . . . . . 102
3.3.4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . 103
3.3.5 Unrewarded Exploration and Maze Learning . . . . . . . . 104
3.3.6 Latent Learning Impaired by REM Sleep Deprivation . . . 104
3.3.7 Lookahead Planning and Latent Learning in LCS . . . . . 105
3.3.8 Anticipatory Classifier Systems . . . . . . . . . . . . . . . 106
3.3.9 Latent Learning in Khepera Robots with the ACS . . . . . 107
3.4 Latent Learning: Learning Without Reward . . . . . . . . . . . . 108
3.4.1 Animal Psychology . . . . . . . . . . . . . . . . . . . . . . 108
4 Related Systems 110
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.2 The General Anticipatory Classifier System Framework . . . . . . 111
4.3 CFSC2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.3.2 Classifier System Architecture . . . . . . . . . . . . . . . . 115
4.3.3 Representing an Internal Model . . . . . . . . . . . . . . . 119
4.3.4 Learning a Model . . . . . . . . . . . . . . . . . . . . . . . 121
4.3.5 Rule Adaptation . . . . . . . . . . . . . . . . . . . . . . . 123
4.3.6 Summary and Review CFSC2 . . . . . . . . . . . . . . . . 123
4.4 Original ACS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.4.2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.4.3 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.4.4 Latent Learning . . . . . . . . . . . . . . . . . . . . . . . . 128
4.4.4.1 Prediction Quality Adjustment . . . . . . . . . . 131
4.4.4.2 The ‘Specification of Changing Components’ . . . 132
4.4.4.3 The ‘Specification of Un-changing Components’ . 133
4.4.5 Behavioural Sequencing ‘Chunking’ . . . . . . . . . . . . . 133
4.4.6 ACS Summary and Review . . . . . . . . . . . . . . . . . 135
4.5 YACS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.5.2 Latent Learning . . . . . . . . . . . . . . . . . . . . . . . . 138
4.5.2.1 Effect Covering . . . . . . . . . . . . . . . . . . . 138
4.5.2.2 Selection of Accurate Classifiers . . . . . . . . . . 139
4.5.2.3 Specification of Conditions . . . . . . . . . . . . . 139
4.5.2.4 Specialisation Process . . . . . . . . . . . . . . . 141
4.5.2.5 Condition Covering and Useless Classifiers . . . . 141
4.5.3 Policy Learning . . . . . . . . . . . . . . . . . . . . . . . . 141
4.5.3.1 YACS Notation . . . . . . . . . . . . . . . . . . . 142
4.6 Value Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
4.7 Modular Classifier System (MACS) . . . . . . . . . . . . . . . . . 144
4.7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.7.2 Latent Learning and Generalisation . . . . . . . . . . . . . 146
4.7.3 Exploration and Exploitation Policies . . . . . . . . . . . . 150
4.7.3.1 Active Exploration . . . . . . . . . . . . . . . . . 151
4.7.3.2 Exploitation . . . . . . . . . . . . . . . . . . . . . 154
4.7.3.3 Combining Exploration and Exploitation . . . . . 155
4.8 Model-based Reinforcement Learning . . . . . . . . . . . . . . . . 156
4.9 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
III Heterogenesis 160
5 Hypothesis 161
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.2 Aspects of Sleep and Dreaming . . . . . . . . . . . . . . . . . . . 162
5.2.1 General Aspects of Sleep . . . . . . . . . . . . . . . . . . . 162
5.2.2 Specific Aspects of Sleep . . . . . . . . . . . . . . . . . . . 164
5.2.3 Learning and Neural Adaptivity . . . . . . . . . . . . . . . 166
5.2.4 Dreaming and Delusion . . . . . . . . . . . . . . . . . . . . 166
5.2.5 Monotonicity of Cognition . . . . . . . . . . . . . . . . . . 168
5.2.6 Dreaming and Internal Representation . . . . . . . . . . . 169
5.2.7 A Closed System . . . . . . . . . . . . . . . . . . . . . . . 171
5.2.8 The Pressure of Sleep . . . . . . . . . . . . . . . . . . . . . 172
5.2.9 Emotions and Dreaming . . . . . . . . . . . . . . . . . . . 173
5.2.10 Periodicity of Waking and Sleep . . . . . . . . . . . . . . . 174
5.2.11 Hyperassociativity of Dreaming . . . . . . . . . . . . . . . 174
5.2.12 The Thread of Dreaming . . . . . . . . . . . . . . . . . . . 176
5.2.13 Temporal Relevancy of Dream Content . . . . . . . . . . . 177
5.2.14 The Progression of Sleep Mentation . . . . . . . . . . . . . 178
5.2.15 Development of Dreaming . . . . . . . . . . . . . . . . . . 179
5.2.16 Periodicity of SWS and REM sleep . . . . . . . . . . . . . 179
5.2.17 The Hippocampus Neocortex Dialogue During Sleep . . . . 181
5.2.18 Summary of Observations . . . . . . . . . . . . . . . . . . 186
5.3 Hypothesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
5.3.1 Suggestion . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
5.3.2 Reiteration . . . . . . . . . . . . . . . . . . . . . . . . . . 192
5.3.3 Capacity Constraint Theory . . . . . . . . . . . . . . . . . 194
5.3.4 Long and Short Memory Assimilation . . . . . . . . . . . . 196
5.3.4.1 Long Term Behaviour in Short Term Context . . 200
5.3.4.2 Short Term Behaviour in Long Term Context . . 201
5.3.5 Concept Building . . . . . . . . . . . . . . . . . . . . . . . 205
5.3.6 Generalisation: The Building of Concepts . . . . . . . . . 209
5.3.7 On-line or Off-line Adaptation? . . . . . . . . . . . . . . . 211
5.3.8 Structural and Temporal Credit Assignment . . . . . . . . 217
5.4 An Integrative View . . . . . . . . . . . . . . . . . . . . . . . . . 218
5.4.1 Formal Definition of Oneiric Machine Learning . . . . . . . 226
5.5 The Machine Learning Link . . . . . . . . . . . . . . . . . . . . . 229
6 Consolidation 231
6.1 Simulating Dreaming . . . . . . . . . . . . . . . . . . . . . . . . . 232
6.2 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
6.3 Bounded Walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
6.3.1 Results: Bounded Walk, Latent Learning Phase . . . . . . 247
6.3.2 Result Analysis: Bounded Walk . . . . . . . . . . . . . . . 257
6.3.3 Discussion: Bounded Walk . . . . . . . . . . . . . . . . . . 259
6.4 T-Maze . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
6.4.1 Result Analysis: T-maze . . . . . . . . . . . . . . . . . . . 266
6.5 Critique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
6.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
7 Generalisation 269
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
7.2 Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
7.2.1 Unit of Representation: Classifier Configuration . . . . . . 272
7.2.2 Unit of Representation: Classifier Structure . . . . . . . . 273
7.2.3 Operation: Action Selection . . . . . . . . . . . . . . . . . 273
7.3 Generalised Example . . . . . . . . . . . . . . . . . . . . . . . . . 276
7.3.1 Conceptual Example . . . . . . . . . . . . . . . . . . . . . 276
7.3.2 Detailed Example . . . . . . . . . . . . . . . . . . . . . . . 279
7.3.3 Sequence (Run 1) . . . . . . . . . . . . . . . . . . . . . . . 282
7.3.4 Commentary (Run 1) . . . . . . . . . . . . . . . . . . . . . 283
7.3.5 Sequence (Run 2) . . . . . . . . . . . . . . . . . . . . . . . 290
7.3.6 Commentary (Run 2) . . . . . . . . . . . . . . . . . . . . . 291
7.3.7 Example Notes . . . . . . . . . . . . . . . . . . . . . . . . 294
7.4 Implementation: System I . . . . . . . . . . . . . . . . . . . . . . 295
7.5 Implementation: System II . . . . . . . . . . . . . . . . . . . . . . 303
7.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
8 State Augmentation 305
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
8.1.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 306
8.2 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
8.2.1 Proposition Method . . . . . . . . . . . . . . . . . . . . . 319
8.3 Further State Adaptation . . . . . . . . . . . . . . . . . . . . . . 326
8.3.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
9 Conclusions 333
9.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
9.2 Related Systems and Context . . . . . . . . . . . . . . . . . . . . 335
9.2.1 Sleep and Dream Related Systems . . . . . . . . . . . . . . 336
9.2.2 Non Sleep Related Systems . . . . . . . . . . . . . . . . . 338
9.3 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
9.3.1 Foundation Contributions . . . . . . . . . . . . . . . . . . 341
9.3.2 Application Contributions . . . . . . . . . . . . . . . . . . 341
9.4 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
9.5 Closing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
A 348
A.1 Bounded Walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
A.2 T-maze . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
A.3 Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
References 388
List of Figures
1.1 Parallel research approach . . . . . . . . . . . . . . . . . . . . . . 10
2.1 Human stages of sleep . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 Principal lobes of the human cerebral cortex . . . . . . . . . . . . 40
2.3 Relative changes to cortical activity during REM sleep (lateral view) 41
2.4 Relative changes to cortical activity during REM sleep (medial view) 42
2.5 Relative changes to cortical activity during REM sleep (ventral view) 43
2.6 Mammalian memory taxonomy . . . . . . . . . . . . . . . . . . . 53
4.1 LCS: Environmental reward loop . . . . . . . . . . . . . . . . . . 112
4.2 LCS: State prediction reward loop . . . . . . . . . . . . . . . . . . 113
4.3 S → R → S Classifier system as a combination S → R . . . . . 114
4.4 LCS message and classifier cycle . . . . . . . . . . . . . . . . . . . 118
4.5 CFSC2 message frame . . . . . . . . . . . . . . . . . . . . . . . . 119
4.6 CFSC2 classifier frame . . . . . . . . . . . . . . . . . . . . . . . . 120
4.7 CFSC2 activation spreading within the message list. . . . . . . . . 121
4.8 ACS classifier configuration . . . . . . . . . . . . . . . . . . . . . 127
4.9 ACS Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.10 ACS expectation passthrough operation . . . . . . . . . . . . . . . 130
4.11 Specification of changing components . . . . . . . . . . . . . . . . 132
4.12 Specification of un-changing components . . . . . . . . . . . . . . 134
5.1 Simplified conceptual view of waking perception . . . . . . . . . . 169
5.2 Simplified conceptual view of dreaming perception . . . . . . . . . 170
5.3 Conceptual relationship between hippocampus & cortex . . . . . . 182
5.4 Conceptual formation of memory representation . . . . . . . . . . 183
5.5 Example of the concept of suggestive adaptation . . . . . . . . . . 190
5.6 Long and short term dream memory proposal: Real mode . . . . 198
5.7 Long term behaviour with the context of short term memory . . . 199
5.8 Short term behaviour with the context of long term memory . . . 201
5.9 Integration of new experiences in the existing memory . . . . . . . 203
5.10 Reactive agent; the environment encodes state. . . . . . . . . . . . 212
5.11 Partially reactive agent. . . . . . . . . . . . . . . . . . . . . . . . 213
5.12 Minority reactive agent . . . . . . . . . . . . . . . . . . . . . . . . 214
5.13 Building perception . . . . . . . . . . . . . . . . . . . . . . . . . . 215
5.14 Perceptual ambiguity concept . . . . . . . . . . . . . . . . . . . . 216
6.1 ACS operation under the influence of future expectations . . . . . 233
6.2 Agent switching between realities. . . . . . . . . . . . . . . . . . . 234
6.3 ACS agent switching between reality and dream worlds . . . . . . 237
6.4 Selection strategy relationship during simulated dreaming . . . . . 238
6.5 Distribution balance for each session . . . . . . . . . . . . . . . . 241
6.6 Experiment 1: Bounded walk, normalised results . . . . . . . . . . 248
6.7 Experiment 2: Bounded walk, normalised results . . . . . . . . . . 250
6.8 Experiment 3: Bounded walk, normalised results . . . . . . . . . . 252
6.9 Experiment 4: Bounded walk, normalised results . . . . . . . . . . 254
6.10 Model competency by experiment (Correct) . . . . . . . . . . . . 255
6.11 Bad model responses by experiment (M nc) . . . . . . . . . . . . 255
6.12 Total cycles by experiment (R+M) . . . . . . . . . . . . . . . . . 256
6.13 Experiment 5: T-maze, normalised results . . . . . . . . . . . . . 264
7.1 State and concept relationship . . . . . . . . . . . . . . . . . . . . 271
7.2 Generalised state relationships to concepts . . . . . . . . . . . . . 271
7.3 Contrast of classifier rule application . . . . . . . . . . . . . . . . 272
7.4 Concept to state relationship assumption . . . . . . . . . . . . . . 277
7.5 A retrospective association . . . . . . . . . . . . . . . . . . . . . . 277
7.6 A speculative association . . . . . . . . . . . . . . . . . . . . . . . 278
7.7 The general state association representation . . . . . . . . . . . . 278
7.8 T-maze state coding . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.1 Simple grid world . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
8.2 A simple 2 dimensional maze. . . . . . . . . . . . . . . . . . . . . 310
8.3 Agent maze ambiguity. . . . . . . . . . . . . . . . . . . . . . . . . 315
8.4 Repeated maze transitions . . . . . . . . . . . . . . . . . . . . . . 317
8.5 The interim agent system. . . . . . . . . . . . . . . . . . . . . . . 318
8.6 Simple maze & target cell. . . . . . . . . . . . . . . . . . . . . . . 320
8.7 Disembodiment of the implied meanings. . . . . . . . . . . . . . . 320
8.8 The possible internal representations. . . . . . . . . . . . . . . . . 321
8.9 Automatic state mapping schema . . . . . . . . . . . . . . . . . . 327
A.1 Bounded walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
A.2 Symbolic maze representation of the bounded walk problem . . . 349
A.3 Latent learning of the bounded walk problem . . . . . . . . . . . 350
A.4 Rat in the T-maze experiment . . . . . . . . . . . . . . . . . . . . 352
A.5 T-maze state coding . . . . . . . . . . . . . . . . . . . . . . . . . 352
A.6 T-maze state transition diagram . . . . . . . . . . . . . . . . . . . 353
A.7 T-maze state and agent position . . . . . . . . . . . . . . . . . . . 353
A.8 T-maze and latent learning . . . . . . . . . . . . . . . . . . . . . . 355
A.9 T-maze state alternative coding II . . . . . . . . . . . . . . . . . . 355
A.10 T-maze state transition diagram: Coding II . . . . . . . . . . . . 356
A.11 T-maze state alternative coding III . . . . . . . . . . . . . . . . . 356
A.12 T-maze state transition diagram: Coding III . . . . . . . . . . . . 357
Chapter 1
Introduction
1.1 Paradox
Imagine a future situation where brain imaging has made considerable techno-
logical advances, way beyond our current capabilities. The advance is so great
that a device exists whereby the neural electrical activity of a subject can be
observed by merely donning a pair of special eye glasses. The glasses are able
to superimpose regional neural activity on the outline of a subject’s head (for
the purposes of discussion call these N-glasses). Zero neural activity produces no
superimposed colouring, in contrast to the bright luminous colours of high activity,
producing an image similar to that rendered by contemporary PET scanners1 yet
dynamic, convenient and in real time.
By wearing N-glasses, merely walking down the high street or going to work
would make for a revealing experience. Despite the outwardly homogeneous
appearance of the brain, like the body it has a remarkably consistent functional
locality that holds across individuals within a species. The regional
variation of neural activity of people and animals freely going about the business of
the day, would make a significant contribution to neuro-psychology, in addition
to interesting social engagements of those wearing N-glasses. Waking behaviour
patterns would of course lead to some fascinating insights, but what then would
the observation of sleeping individuals reveal? Intuitively one would think this
would bring a period of brain inactivity concomitant with the physical state.
1. Positron Emission Tomography.
Apart from the occasional dream most people would assume the brain, controller
of the body and host to the mind, would largely remain inactive. Herein resides
the paradox that is the basis for the research presented in this thesis. Through
the futuristic vision of the N-glasses a sleeping individual, mammal or bird would
reveal a startling contradiction to the notion that the sleeping brain like the body
rests.
Extrapolating the capability of the N-glasses still further, consider a visiting
alien watching the Earth from a low orbit. Viewing the Earth’s surface in terms
of human neural activity, how would the picture be interpreted? The light, day-
time side of the planet would be characterised by millions of moving pin points of
light, consistent in their illumination but of variable intensity. On the dark,
nighttime side of the planet the same pin points of light largely remain motionless
and flicker brightly. Further investigation would only serve to confuse the situ-
ation. The light, daytime neural activity correlates with overt behaviour. The
dark, nighttime neural activity is also initially related to behaviour, but is later
confounded by long periodic bursts of neural activity close to the levels observed
during waking yet in the absence of associated waking behaviour. If any such
alien species were ever to witness this process, how would humanity explain this
phenomenon? Why, contrary to most logic, does the brain reactivate in apparent
isolation from the body?
Unfortunately the realisation of N-glasses is about as far away as the closest
likely source of intelligent life,1 but this fictional scenario serves to demonstrate
two important issues. Firstly, sleep is anything but rest for the brain; the
restorative bodily rest of sleep does not by association imply rest for the brain any more
than it does for the heart. Secondly, whilst there is no shortage of theories for
the need or reason for sleep and contribution of dreaming therein, none of these
are able to provide convincing supporting proof. The problems facing exposition
of a theory of sleep and dreaming closely parallel the equally confounded search
for an understanding of consciousness. The subjective experience of dreaming
during sleep is without doubt one of the most significant and tantalising clues to
understanding consciousness.
1. The recently discovered planet ‘Gliese 581c’ (also see Star Trek, Gene Roddenberry, 1966-69).
The subjective experience of a dream demonstrates
that above all else the experience of life is primarily an abstract one. Waking be-
haviour is a result of the mind synchronised, trapped by the physical constraints
of the natural world. Imprinted, yet free from physical connections, the mind
does not cease to function, in fact quite the reverse. The aim of this research is
in no way a back door presentation of yet another theory of consciousness but
rather a direct, albeit at a low level, hypothesis on the cognitive utility of sleep
& dreaming and its application within the field of machine learning embodied in
the concept of Oneiric Processing1.
The scientific evidence for the contribution of sleep for both animals and hu-
mans in terms of cognitive homeostasis2 and in particular with regard to memory
is strong. The underlying mechanisms responsible for this contribution are less
clear; this is especially true of one particularly intriguing feature of mammalian
sleep, that of the dreaming process. The consensus amongst researchers in the
sleep and dreaming scientific community is, in general, that dreaming does
represent some off-line cognitive function, but there is a less popular, yet strongly held
opinion that dreaming serves no off-line function and is merely an observation of
the mind over a certain functional brain state. Despite all the claims and counter
claims in regard to the function of dreaming, one aspect of neural adaptation
makes any negative claim against dreaming as adaptation hard to justify. This
is the simple and well established phenomenon, that neural adaptation occurs by
activation. Waking experience changes neural circuits merely by their activation
(or inactivity), therefore why should the experience during sleep of a dream not
also result in adaptation? Holding this position, how does the experience of the
content of dreams contribute to cognitive homeostasis in humans, animals and
machines?
The natural world has been the basis of much inspiration for machine learn-
ing systems, from classical animal reinforcement learning, through physiological
neural synthesis to emulation of natural selection, all in search of explanation of
natural strategies of intelligence which in turn can lead to better systems and
1Oneiric of or relating to dreams or dreaming. Adapted from Oneiric Behaviour used todescribe REM sleep re-animation (Jouvet, 1979).
2Cognitive Homeostasis refers to the intrinsic process responsible for development and main-tenance of the normal balance between experience, perception and behaviour in a changingenvironment.
machines. However, most machine learning mechanisms for the abstract manipulation of memory and agent structure are born of necessity and logical progression; this research instead takes its inspiration from dreaming directly, with the aim of improving machine learning systems.
1.2 Nature
Metaphorically, a master stroke of genius eloquently captures the emergence of cognition to the degree found in modern humankind.1 After all the millions upon millions of branches and deviations, dead ends, false starts and mass extinctions, it transpires that the most successful survival strategy is the ability to become detached from the reality that was its originator and host.2 Not merely blindly reacting to the surrounding environment, but casting a critical eye outward at that environment and inward at ourselves. Evolutionary pressure has stumbled into the domain of meta-evolution. The emergence of an organism that can consciously take evolution from a physical substrate to an abstract one is the context, the juncture, at which humankind currently finds itself.
Crucially, the stage is set for such a leap: the embodiment of mind in machine. Yet in spite of centuries of theorising, and very recent discoveries of both the processes leading to its formation and its resulting function and environmental interaction, the answer to how mind is born of brain feels tantalisingly close, but remains stubbornly elusive.
Several obvious paths are apparent in realising this goal. First, accept the proposition that it is not possible for humankind to understand the mechanisms that generate mind from brain. Two brute-force approaches are apparent in such a situation: copy the mechanisms (and conceivably the state) of the human brain exactly, or restart the evolutionary processes that led to its development. Mind hence appears to emerge from machine without the mind itself being understood, only the mechanisms leading to its creation.
1A more accurate description would be a master stroke of luck.
2Reminiscent of the final deduction of the WOPR computer (aka Joshua) in the film War Games (Warner Brothers, 1983): ‘the only winning move, is not to play’.
Secondly, propose that it is possible to understand how mind results from brain and then embody that understanding within a machine.
These propositions can be framed more generally: can a machine N construct another machine (N + 1) that is dimensionally superior? Depending on the definition of the unit of dimension, this process should result in ever-increasing meta-cognition that is hard for the originating machine and its creator to comprehend.
There is a very real possibility that the first proposition will succeed before the second.1 Released from the current ethical constraints on human experimentation, artificial embodiment may of course help towards a greater understanding, leading to the satisfaction of the second proposition. However, it remains unclear whether blindly replicating, and probably extending the run time and capacity of, the first will implicitly solve the second.
The second of these propositions is indicative of the author’s belief and reflects the approach taken in this study: through elucidation of the mechanism, not the processes that resulted in the mechanism, come understanding and exploitation. Looking to Nature for clues when designing artificial adaptive systems2 in no way guarantees that a better engineered solution does not exist, just that this is one example of a solution. This issue is mitigated in two regards: firstly, such exploration serves not only to inspire but also to further understanding of the Natural world; secondly, the target environment for some of the control systems is the Natural world, aligning environment and control system within the same domain.
1.3 Background
The genesis of this research was a simple solution to a practical experimental problem. A previous project involved designing an adaptive system, based on the concept of the Adaptive Heuristic Critic (AHC), learning to control an inverted
1The robot brain of Hector, depicted in the screenplay by Martin Amis for the 1980 science fiction film Saturn 3, was grown from a human fetus and developed by copying the thoughts and actions of a living human brain.
2‘Artificial’ as in a digital computer simulation of an adaptive system.
pendulum1 [(Cannon, 1967), page 703] in a failure-avoidance mode. The system was inspired by, and closely resembled, the similar systems of Barto et al. (1983) and Anderson (1989). In the AHC, one element of the system (the critic) learns the error function, the output of which is used to train another element (the controller). In such a system the critic can learn an error function guided only by a simple environmental reward or penalty. In the case of the cart-pole problem the critic is penalised in an identical manner both for failing to balance the pole and for running out of track.2 One of the problems of this approach is that it may take many trials for the critic to learn the error function, and only then does it become useful for training the controller. Training output from the naive critic during this phase can erroneously lead the controller away from the desired final mapping, further increasing overall training time. Although with this sequential learning the system eventually learns both to balance the pole and to avoid the ends of the track, the approach takes time.
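The critic-controller interplay described above can be sketched as a minimal actor-critic episode. The linear function approximators, bang-bang action selection, learning rates and environment interface below are illustrative assumptions rather than the original neural implementation:

```python
import numpy as np

def ahc_episode(env_step, env_reset, w_critic, w_actor,
                alpha=0.1, beta=0.01, gamma=0.95, max_steps=200):
    """One episode of a minimal AHC-style learner: the critic learns a
    value (error) function from the scalar failure signal, and its
    temporal-difference error is used to train the controller."""
    x = env_reset()
    for _ in range(max_steps):
        # Bang-bang action: a fixed push left or right, never zero force.
        p = 1.0 / (1.0 + np.exp(-np.dot(w_actor, x)))      # P(push right)
        a = 1 if np.random.rand() < p else -1
        x_next, r, failed = env_step(x, a)
        # TD error: how much better or worse than the critic predicted.
        v = np.dot(w_critic, x)
        v_next = 0.0 if failed else np.dot(w_critic, x_next)
        delta = r + gamma * v_next - v
        w_critic = w_critic + alpha * delta * x                        # critic update
        w_actor = w_actor + beta * delta * (a - (2.0 * p - 1.0)) * x   # controller update
        if failed:
            break
        x = x_next
    return w_critic, w_actor
```

Note how the same scalar `delta` drives both elements: while the critic is still naive, `delta` is unreliable and the controller is trained on misleading signals, which is precisely the slow-start problem described above.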
A single computer system hosted the neural network simulation, control and graphics, in addition to the simulation of the cart-pole dynamics. This provided a convenient development platform before an attempt to control a real cart-pole system3. The bang-bang servo control4 presented no problems during simulation, but caused significant mechanical stress to the experimental cart-pole construction, often resulting in its destruction at some point in the training cycle.
Previous success in simulation, and promising but as yet unfulfilled success on the physical plant, immediately led to one obvious solution. Much of the training time involved slowly adjusting the weight space by back-propagation (Rumelhart & McClelland, 1986) of the critic and then the controller, and much of this training could have been achieved off-line in simulation before moving to the real plant. This would lead to an overall reduction in mechanical stress and the prospect of the real plant surviving long enough to balance the pole. By normalising the simulation and real-plant control variables, the control system
1Also known as the cart-pole problem.
2This can lead to interesting controller behaviour; for example, the pole can be deliberately tipped in one direction, risking failure, in order that the controller can recover the pole balance and simultaneously shift to a new cart position in the same direction.
3The Plant.
4At every control interval a fixed force is always applied, either positive or negative, to the cart base. A variable or zero force is not permitted.
could seamlessly switch between a simulation of the cart pole and the real plant. Re-running the experiment, periodically alternating between the real plant and the simulation, allowed some of the real experience (including many non-linearities not represented in the model) to be carried through into the model-interaction phase. Learning was therefore facilitated by alternation between the model and the real plant, rather than by complete prior training in simulation. The physical plant was eventually balanced through training via this periodic switching between the simulation and the real plant (Holley, 1996).
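The switching scheme that emerged can be sketched as a simple training schedule; the `run_episode` interface and the episode counts are assumptions for illustration, not the original experimental code:

```python
def train_alternating(real_plant, simulation, agent, cycles=10,
                      real_episodes=1, sim_episodes=5):
    """Alternate short bursts on the real plant with longer off-line
    training in simulation, so that mechanical stress stays low while
    real-world non-linearities still reach the learner.  Both plants
    are assumed to expose the same (normalised) episode interface."""
    for _ in range(cycles):
        for _ in range(real_episodes):
            real_plant.run_episode(agent)   # brief, stressful, informative
        for _ in range(sim_episodes):
            simulation.run_episode(agent)   # cheap, safe bulk training
    return agent
```

The normalisation of control variables mentioned above is what makes the two `run_episode` calls interchangeable from the agent's point of view.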
The experimental fix was in reality a reaction to poorly constructed experimental apparatus, and it defeated the original intention of the system: learning in the absence of any detailed environmental reward, or simply avoiding failure. Nevertheless, the periodic nature of interacting with the external plant and the internal model, in order to reduce stressful real-plant interaction, was suggestive of an analogy with waking and sleep in humans and animals.1 The benefit of this approach in the experiment was clear, but in the natural world how do sleep and dreaming contribute to waking behaviour? This set the scene for an investigation into the science of sleep and dream research, with the aim of building better artificial adaptive systems.
This previous work was not aimed at investigating off-line training policies,
but it did give rise to questions about off-line adaptation and eventually to the
initial question:
Given a quantity of experience, how best can this knowledge be utilised
in order to become more successful in the future?
In the case of the previous example the motivation was limited to reducing learning time towards a preset user goal: minimising plant damage. Other cases may be more complex, where the environment is changing, goals are moving, or perhaps both. In those cases, simply reiterating sound previous examples will be detrimental. In the cart-pole problem, for example, if an extra jointed section were added to the pole on the real plant but not in the simulation, simulated experience would be detrimental. Conversely, there would be no benefit in exposing
1At the time, based on the subjective experience of dreaming: the incorporation of waking experience into dreams.
the system to a multi-jointed pole simulation if there is never a chance that this will become reality.
These thoughts led to investigations of model learning and planning, where various strategies are applied to improve agent performance, for example by reducing interaction with the real world or improving convergence time. Direct planning, for example, uses a model of the world to search out goals prior to interacting with the actual world, as in maze solving.
Any agent that interacts with a problem, irrespective of whether a model is being generated from that world, faces the exploration/exploitation trade-off. The policy depends heavily on the problem, but one solution may be to ‘exploit all the time and only explore when absolutely necessary’, or to ‘exploit mostly and occasionally explore’. Consider an agent that generates a model of the world. The agent can use the model in various ways, but suppose the agent interacts with the model as though the model were the real world. In this situation how does exploration occur, since the model cannot know the correct ‘real world’ response? Nevertheless, this may be one of the most useful off-line policies, since this is where exploration is safe. The dilemma is then clear: abstract exploration may be safe, but it may also be incorrect and thus eventually unsafe.
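This safe-but-possibly-incorrect off-line use of a learned model is closely related to Dyna-style planning. A tabular sketch, with illustrative states, actions and parameters, might interleave real steps with replayed model transitions as follows:

```python
import random

def dyna_q(real_step, start, n_real=500, n_planning=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, actions=(0, 1)):
    """Tabular Dyna-Q sketch: Q-values are learned from real experience,
    while a learned model of observed transitions is replayed off-line.
    Exploration inside the model is 'safe', but only as correct as the
    model itself."""
    Q, model = {}, {}              # Q[(s, a)] value; model[(s, a)] = (r, s')
    q = lambda s, a: Q.get((s, a), 0.0)
    s = start
    for _ in range(n_real):
        # epsilon-greedy in the real world: the only true exploration
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda b: q(s, b))
        r, s2 = real_step(s, a)
        Q[(s, a)] = q(s, a) + alpha * (r + gamma * max(q(s2, b) for b in actions) - q(s, a))
        model[(s, a)] = (r, s2)    # remember what the world did
        for _ in range(n_planning):    # off-line replay over the model
            (ps, pa), (pr, ps2) = random.choice(list(model.items()))
            Q[(ps, pa)] = q(ps, pa) + alpha * (pr + gamma * max(q(ps2, b) for b in actions) - q(ps, pa))
        s = s2
    return Q
```

The replay loop only ever revisits transitions the model has actually stored, which is exactly the dilemma noted above: nothing genuinely new can be discovered there, and whatever the model has got wrong is rehearsed just as faithfully as what it has got right.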
These ideas may have a natural parallel in dreaming, or at least in the concept of dreaming. The reiteration of coherent threads or stories, with bizarre deviations, as though real, may be a natural strategy to prepare animals for the future on the basis of the past.
1.4 Methodology
The size and complexity of the cognitive domain of which this work is a part are fully appreciated. The possibility of designing a large simulation of advanced cognitive functions is rejected in favour of a more focused and disparate approach. In this thesis the approach is based on investigation of core issues in separation, with a view to later integration. Speculation, justification, application, experimentation and analysis of isolated issues are pursued at a level that removes all but the essence of the respective issue. Although the applied
problems appear trivial, they are applied within a context of extrapolation and scalability to larger real-world problems. The primary aim is to produce insights that further cognitive machine learning systems. In conjunction, the discussion, design and operation of such systems contribute towards understanding of the cognitive faculties of sleep and dreaming within the natural world. Below is an overview of the approach taken to the problem:
1. Review of the facts surrounding sleep & dreaming
2. Selection of the salient aspects
3. Reduction of those aspects
4. Hypothesis, proposed models
5. Implementation
6. Review of approach
The approach to the problem is characterised by a broad exploration of sleep and dreaming in conjunction with a narrow exploration of a particular class of machine learning systems, specifically learning classifier systems (LCS). A comprehensive investigation of such a vast and dynamic field as sleep and dreaming research is difficult1, and the research review therefore tends towards memory and neurology at the expense of evolutionary and other aspects.
A large portion of the research has been dedicated to a deliberately broad investigation into the current state of the art in sleep and dream research. The analogy between the subjective experience of dreaming and off-line machine learning adaptation may well have been the catalyst for this research, but it cannot alone be its justification. Therefore a broad investigation into the nature of sleep and dreaming was required; the approach is presented graphically in Fig. 1.1.
1The standard medical text on sleep and dreaming divides into 125 subject areas over 1475 pages (Roth et al., 2005).
1.5 Thesis Layout
The research disseminated in this thesis is organised into three major sections. First, Part I examines the current state of the art in sleep and dream research, lays the foundation for the core hypothesis and provides the link into the learning classifier systems (LCS) machine learning domain. Second, Part II examines the background and implementation of related anticipation-based learning classifier systems. Finally, Part III presents the synergy of the dual-paradigm research: the hypothesis, the proposed systems and the experimental results, terminating with a research summary and conclusions.
1.5.1 Part I: Sleep, Dreaming & Learning
Sleep is an essential behavioural component of almost all animal life on Earth. In order to justify an argument for a cognitive adaptive component of sleep, especially with regard to dreaming, this section reviews the current state of knowledge from the perspective of the sleep and dreaming research field.
Sleep is the substrate of dreaming. To derive a system of adaptation based on dreaming without considering the broader field of sleep would be too narrow and naive an approach. Sleep, as with waking, is a catalyst for a vast range of biochemical, cellular and systemic changes, both physiological and psychological.
One aspect that requires special attention is establishing the existence of dreaming in species other than humans. Whilst anecdotal evidence for dreaming in higher animals is strong, communication of that subjective experience is of course not possible. Therefore alternative methods of investigation are required to support the circumstantial evidence against charges of anthropomorphism. Several sections describe research justifying the argument that rodents, birds and cats experience a sleeping mentation which closely resembles the structure, if not the complexity, of human dreaming.
1.5.2 Part II: Latent & Machine Learning
For any animal or artificial agent to operate abstractly from the environment there are two essential minimum requirements. First, there must be some method of associating or linking internal representations of environmentally derived experiences. Second, there must be an internal platform or space in which to replay these associations. Therefore, in order to plan, imagine and ultimately dream, an artificial cognitive system must be able to link, directly or indirectly, internal representations of environmental states (and behaviour), and must provide an arena, independent of the environment, in which to replay those internal representations in isolation.
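Under the simplifying assumption of discrete, symbolic states, the two requirements can be sketched as a store of linked experiences plus a replay loop that never touches the environment (all names here are illustrative):

```python
class InnerWorld:
    """Minimal sketch of the two prerequisites: (1) associating internal
    representations of experienced transitions, and (2) an arena in which
    to replay them in isolation from the environment."""
    def __init__(self):
        self.links = {}                              # (state, action) -> next state

    def experience(self, state, action, next_state):
        self.links[(state, action)] = next_state     # requirement 1: association

    def replay(self, state, actions):
        """Requirement 2: re-run a behaviour sequence purely internally."""
        trajectory = [state]
        for a in actions:
            if (state, a) not in self.links:
                break                                # the inner world is incomplete
            state = self.links[(state, a)]
            trajectory.append(state)
        return trajectory
```

The point of the sketch is the separation: `experience` is the only place the environment enters, and `replay` operates entirely on stored associations.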
Given such prerequisite abilities, an animal or agent must first learn the environmental dynamics (typically by analysing the environment's reaction to its own actions) in order that these cognitive facets can be put to use. Therefore, for an animal or agent to perform better in the future, learning for the sake of learning, learning in the absence of any obvious reward, is in itself rewarding. In animal psychology this is the concept of Latent Learning. A combination of latent learning and cognitive features has been expressed within the machine learning field of learning classifier systems.
This section of the thesis first explores the natural phenomenon of latent learning and then reviews the various LCS incarnations that both learn latently and present an architecture for cognitive behaviour.
1.5.3 Part III: Heterogenesis
1.5.3.1 Hypothesis
A three-step approach is taken in this chapter to form the bridge between the sleep and dreaming research described in Part I, the latent learning and latent learning systems of Part II, and the remainder of the thesis, which contains several complementary theories, experimental work, developmental adaptive systems and a concluding hypothesis.
The first step is concerned with distilling the sleep and dreaming research described in Part I into a set of observations relevant to adaptive systems. On the basis of this set of observations the second step develops several complementary
theories on the adaptive contribution of sleep and dreaming. The third step
presents an Integrative View; this orientates Oneiric Machine Learning within
the wider field of adaptive systems, presents a formal definition and concludes
with a discussion relating the oneiric approach to the general issues facing machine
learning.
1.5.3.2 Consolidation
This chapter reports on a modification to the Anticipatory Classifier System (ACS) (Stolzmann, 1996) that allows the system periodically to substitute, for real experience, experience artificially generated from the current and transitory information represented in the classifier list. The resulting ACS agent stores experience by creating generalised classifier rules on-line and then periodically switches from interacting with the environment to interacting with its current internal representation of the environment. The system incorporates some of the key aspects of dreaming explored in the previous section, namely experience-dependent reactivation, emergent self-perpetuating thread creation and off-line adaptation.
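The ACS itself is richer than can be reproduced here, but the core idea, generating artificial experience from condition-action-effect classifiers, can be sketched as follows (the '#' pass-through symbol follows ACS-style convention; the rule format and names are otherwise illustrative assumptions):

```python
import random

def dream_step(classifiers, state):
    """One 'oneiric' step: instead of acting in the environment, pick a
    matching classifier and let its effect part anticipate the next state,
    yielding an artificial (state, action, next_state) experience.
    A classifier is a (condition, action, effect) triple; '#' matches and
    passes through any attribute value."""
    def matches(cond, s):
        return all(c in ('#', a) for c, a in zip(cond, s))
    def anticipate(effect, s):
        return ''.join(a if e == '#' else e for e, a in zip(effect, s))
    candidates = [cl for cl in classifiers if matches(cl[0], state)]
    if not candidates:
        return None                  # nothing to dream about from this state
    cond, action, effect = random.choice(candidates)
    return state, action, anticipate(effect, state)
```

Chaining such steps, feeding each anticipated state back in as the next starting state, is what produces the self-perpetuating threads mentioned above.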
1.5.3.3 Generalisation
This chapter (Generalisation) and the following chapter (Augmentation) describe unfinished developmental work with anticipatory-classifier-style systems similar to those described in Related Systems and to the adapted ACS described in Consolidation. In both chapters, ideas based on theories developed in Hypothesis are used to create two prototype anticipatory-style classifier systems that present novel approaches to managing state generalisation and state augmentation. Effectively these are the first steps towards a complete oneiric-based classifier system.
In this chapter (Generalisation) a new concept classifier system, the Coupled Classifier System (CCS), is introduced. This system attempts to resolve simultaneously the issues of capacity and generalisation that the ACS (and other similarly structured systems) do not tackle. The CCS is a classifier system designed specifically as a platform for Oneiric Processing. The CCS is built from two key ideas born of the sleep and dream research, alongside issues arising from other anticipatory classifier systems. Firstly, the CCS is structurally different from other anticipatory classifier systems (see Sect. 4) in so much as matching experiences point, or
are coupled, to other experiences, negating the need for state, action, state triplets in favour of states and indirect associations with other states: for example, from S → A → S to S → A → L(S), where L represents a linkage, coupling or redirection to another existing matching state (condition part). Secondly, experience is not lost as a result of on-line generalisation but is resolved at periodic off-line intervals by a process dealing with capacity. Experiences are broken down into cognitive building blocks that (a) gracefully resolve capacity and, by their creation, (b) promote generalised representations and creative behaviour in the real world.
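Since the CCS is introduced here only at the concept level, the following is merely an illustrative sketch of the S → A → L(S) idea: an action couples a state to another stored condition part rather than to a literal copy of the successor state, so shared states are stored once:

```python
class CoupledStore:
    """Sketch of S -> A -> L(S): each stored state keeps, per action, a
    *link* to another stored state rather than a next-state copy, so
    experiences share structure instead of duplicating full triplets."""
    def __init__(self):
        self.states = []          # unique condition parts
        self.links = {}           # (state_index, action) -> state_index

    def index_of(self, state):
        """Return the index of the matching stored state, adding it if new."""
        if state not in self.states:
            self.states.append(state)
        return self.states.index(state)

    def couple(self, state, action, next_state):
        # L(S): redirect to the existing matching condition part
        self.links[(self.index_of(state), action)] = self.index_of(next_state)

    def follow(self, state, action):
        i = self.links.get((self.index_of(state), action))
        return None if i is None else self.states[i]
```

Because two couplings that lead to the same state share one stored entry, generalisation and capacity are handled in the same structure, which is the pairing of issues the CCS is described as addressing.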
1.5.3.4 State Augmentation
In broad terms, sleep not only aids the retention of memory but simultaneously changes the representation, promoting later (waking) generalisation. The contents of this chapter are the results of thoughts regarding the abstract (sleeping) resolution of problems and ambiguities introduced during (waking) interaction with the environment. It is obviously difficult to represent the sort of abstract human problems known to benefit from sleep. Rodent sleep studies, whilst still concerning very complex systems, have demonstrated that rats dream about past (and possibly future) maze tasks. Such studies employ simple mazes in order to relate neural activity to location (Sect. 2.8.2), detecting dreaming rather than the effect of sleep on solving the mazes. Nevertheless, the relationship between re-running mazes during sleep and the problems of maze solving in machine learning led to thoughts of maze resolution in machine learning, especially in respect of classifier representations. Resolution in this context is not solving the maze, but rather the successful mapping of it, which subsequently promotes solution. In particular, the issue of state aliasing in non-Markov mazes remains a difficult one for learning-classifier-style systems. This chapter describes one (online-offline, waking-sleeping) approach to the resolution of such maze environments: automatically resolving environmental ambiguities with additional perceptual tags, later resolved by repeated offline activation; that is, state augmentation and offline-mediated resolution.
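As a purely illustrative sketch of the waking-phase half of this idea (the tagging policy and names are assumptions, not the system described later), aliased percepts within a trajectory can be made distinguishable by appending perceptual tags, leaving the decision of which tags are genuinely needed to a later off-line resolution phase:

```python
def tag_aliased(percepts):
    """Waking-phase sketch: when the same raw percept recurs within a
    trajectory (state aliasing in a non-Markov maze), append a perceptual
    tag so the states become distinguishable.  Which tags are truly
    needed is left to a later off-line (sleep) resolution phase."""
    seen = {}
    augmented = []
    for p in percepts:
        n = seen.get(p, 0)
        seen[p] = n + 1
        augmented.append(p if n == 0 else f"{p}+{n}")   # e.g. 'corridor+1'
    return augmented
```

After augmentation the trajectory is Markov with respect to the tagged percepts; the cost is a surplus of tags, which is precisely what the off-line phase is then asked to prune.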
1.6 Audience
This thesis brings together aspects of two disparate and developing fields of research: sleep and dreaming, and machine learning. Inspiration from the former is used to advance the latter. Background reviews of both areas form the foundation for the work, linked by a central Hypothesis section (Sect. 5). In that section, ideas from sleep and dreaming are gathered into a set of observations and related to the issues facing machine learning. Finally, experimental work and prototype systems are presented using an anticipatory-classifier-style architecture. The work is mainly aimed at those working within the field of artificial adaptive systems and Natural Computation1 who may have no knowledge of sleep and dreaming. Nevertheless, this work is also likely to attract readers from sleep and dreaming and other closely related fields. The main contribution of the work, in the Hypothesis and Conclusion sections, is recommended to both audiences. The whole document is recommended for those approaching from the general field of artificial adaptive systems. Those approaching the document without any knowledge of machine learning can ignore the section Related Systems, the experimental work of Consolidation and the systems development work of the sections Generalisation and Augmentation without losing the sense of the main contributions.
1Artificial life, genetic algorithms, swarm behaviour and neural networks are some examples within this genre.