Robust learning—rich and poor


http://www.elsevier.com/locate/jcss

Journal of Computer and System Sciences 69 (2004) 123–165

Robust learning—rich and poor

John Case,a Sanjay Jain,b,1,* Frank Stephan,c,2 and Rolf Wiehagen d

a Computer and Information Sciences Department, University of Delaware, Newark, DE 19716, USA

b Department of Computer Science, School of Computing, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore

c National ICT Australia Ltd., Research Laboratory at Kensington, The University of New South Wales, Sydney, NSW 2052, Australia

d Department of Computer Science, University of Kaiserslautern, D-67653 Kaiserslautern, Germany

Received 19 November 2000; accepted in revised form 17 October 2003

Abstract

A class C of recursive functions is called robustly learnable in the sense I (where I is any success criterion of learning) if not only C itself but even all transformed classes Θ(C), where Θ is any general recursive operator, are learnable in the sense I. It was already shown before, see Fulk (in: 31st Annual IEEE Symposium on Foundations of Computer Science, IEEE Computer Soc. Press, Silver Spring, MD, 1990, pp. 405–410) and Jain et al. (J. Comput. System Sci. 62 (2001) 178), that for I = Ex (learning in the limit) robust learning is rich in that there are classes which are not contained in any recursively enumerable class of recursive functions and are, nevertheless, robustly learnable. For several criteria I, the present paper makes much more precise where we can hope for robustly learnable classes and where we cannot. This is achieved in two ways. First, for I = Ex, it is shown that only consistently learnable classes can be uniformly robustly learnable. Second, some other learning types I are classified as to whether or not they contain rich robustly learnable classes. Moreover, the first results separating robust learning from uniformly robust learning are derived.
© 2003 Elsevier Inc. All rights reserved.


* Corresponding author.

E-mail addresses: [email protected] (J. Case), [email protected] (S. Jain), [email protected] (F. Stephan), [email protected] (R. Wiehagen).

1 Supported in part by NUS Grant R252-000-127-112.

2 Supported by the Deutsche Forschungsgemeinschaft (DFG) under Heisenberg grant Ste 967/1-1 while F. Stephan was working at the Mathematical Institute, University of Heidelberg. National ICT Australia is funded by the Australian Government's Department of Communications, Information Technology and the Arts and the Australian Research Council through Backing Australia's Ability and the ICT Centre of Excellence Program.

0022-0000/$ - see front matter © 2003 Elsevier Inc. All rights reserved.

doi:10.1016/j.jcss.2003.10.005


1. Introduction

Robust learning has attracted much attention recently. Intuitively, a class of objects is robustly learnable if not only the class itself is learnable but all of its effective transformations remain learnable as well. In this sense, being robustly learnable seems to be a desirable property in all fields of learning. In inductive inference, i.e., informally, learning of recursive functions in the limit, a large collection of function classes was already known to be robustly learnable. Actually, in [Gol67] any recursively enumerable class of recursive functions was shown to be learnable. This was achieved even by one and the same learning algorithm, the so-called identification by enumeration, see [Gol67]. Moreover, any reasonable model of effective transformations maps any recursively enumerable class again to a recursively enumerable and, hence, learnable class. Consequently, all these classes are robustly learnable. Clearly, the same is true for all subclasses of recursively enumerable classes. Thus, the challenging remaining question was whether robust learning is even possible outside the world of the recursively enumerable classes. This question remained open for about 20 years, until it was answered positively. Fulk [Ful90] and Jain et al. [JSW01] showed that there are classes of recursive functions which are both "algorithmically rich" and robustly learnable, where algorithmically rich means not being contained in any recursively enumerable class of recursive functions. The earliest examples of (large) algorithmically rich classes featured direct self-referential coding. Though these direct codings ensured the learnability of the classes themselves, they could be destroyed already by simple effective transformations, thus proving that these classes are not robustly learnable. An early motivation of Bārzdiņš for studying robustness was just to examine what happens to learnability when at least the then known direct codings are destroyed (by the effective transformations). Later examples of algorithmically rich classes, including some indeed robustly learnable examples, featured more indirect, "topological" coding. Fulk [Ful90] and Jain et al. [JSW01] had mainly focussed on the existence of rich and robustly learnable classes; in the present paper, however, we want to make much more precise where we can hope for robustly learnable classes and where we cannot. In order to reach this goal we will follow two lines.
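The identification-by-enumeration strategy mentioned above is easy to make concrete. The following is a minimal illustrative sketch (ours, not code from the paper), with an enumeration of constant functions standing in for an arbitrary recursively enumerable class given as a list of total programs:

```python
def identify_by_enumeration(hypotheses, segment):
    """On each initial segment of the target, conjecture the least index
    whose function is consistent with all the data seen so far."""
    for i, h in enumerate(hypotheses):
        if all(h(x) == y for x, y in enumerate(segment)):
            return i
    return None  # cannot happen if the target is in the enumerated class

# Toy enumeration: the constant functions 0, 1, 2, ...
consts = [lambda x, k=k: k for k in range(10)]
```

On any function from the enumerated class, the conjectures stabilize on the least correct index after finitely many data points, which is exactly Ex-style convergence; moreover, each conjecture reproduces all the data seen so far, i.e., the strategy is consistent.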

The first line, outlined in Section 3, consists in exhibiting a "borderline" that separates the region where robustly learnable classes do exist from the region where robustly learnable classes provably cannot exist. More exactly, for the basic type Ex of learning in the limit, such a borderline is given just by the type Cons of learning in the limit consistently (i.e., each hypothesis correctly and completely reflects all the data seen so far). Actually, in Theorem 23 we show that all the uniformly robustly Ex-learnable classes must already be contained in Cons, and hence the complementary region Ex − Cons is free of any such classes. Notice that Ex − Cons is far from being empty, since it was shown before that Cons is a proper subset of Ex, see [Bar74a,BB75,Wie76]; the latter is also known as the inconsistency phenomenon, see [CF99,Ste98,WZ94], where it has been shown that this phenomenon is present in polynomial-time learning as well. We were surprised to find the "robustness phenomenon" and the inconsistency phenomenon so closely related in this way. There is another interpretation suggested by Theorem 23 which in a sense nicely contrasts with the results on robust learning from [JSW01]. All the robustly learnable classes exhibited in that paper were of some non-trivial topological complexity, see [JSW01] for details. On the other hand, Theorem 23 intuitively says that uniformly robustly learnable classes may not be "too complex", as they are all located in the "lower part" Cons of the type Ex. Finally, this location in Cons in turn seems useful


in that just consistent learning plays an important role not only in inductive inference, see [Ful88,JB81,Lan90,WZ95], but also in various other fields of learning such as PAC learning, machine learning and statistical learning, see the books [AB92,Mit97,Vap00], respectively.

In Section 4, we follow another line to solve the problem of where rich robustly learnable classes

can be found and where they cannot. To this end, let us call any type I of learning such as I = Ex, Cons, etc. robustly rich if I contains rich classes that are robustly learnable in the sense I (where "rich" is understood as above, i.e., a class is rich if it is not contained in any recursively enumerable class); otherwise, the type I is said to be robustly poor. For a few types, it was already known whether they are robustly rich or robustly poor. The first results in this direction are due to Zeugmann [Zeu86], where the types Ex₀ (Ex-learning without any mind change) and Reliable

(see Definition 8) were proved to be robustly poor. In [Ful90,JSW01] the type Ex was shown robustly rich. Below we classify several other types as to whether they are robustly rich or poor, respectively. We exhibit types of both categories, rich ones (hence, the first rich ones after Ex) and poor ones, thus making the whole picture noticeably more complete. This might even serve as an appropriate starting point for solving the currently open problem of deriving conditions (necessary, sufficient, or both) for when a type is of which category. Notice that in proving types robustly rich below, in general, we show some stronger results; namely, we robustly separate the corresponding types from some other "close" types, thereby strengthening known separations in a robust way. From these separations, the corresponding richness results follow easily.

In Section 5, we deal with a problem which was completely open up to now, namely, separating

robust learning from uniformly robust learning. While in robust learning any transformed class is only required to be learnable in the mere sense that there exists a learning machine for it, uniformly robust learning intuitively requires that such a learning machine for the transformed class be effectively at hand, see Definition 14. As our results show, this additional requirement can really lead to a difference. Actually, for a number of learning types, we show that uniformly robust learning is stronger than robust learning. Notice that fixing this difference is also interesting for the following reason. As said above, in Theorem 23 all the uniformly robustly Ex-learnable classes are shown to be learnable consistently. However, at present, it is open whether this result remains valid when uniform robustness is replaced by robustness only. Some results of Section 5 can be considered as first steps to attack this apparently difficult problem.

Recently, several papers have been published that deal with robustness in inductive inference, see

[CJO+00,Ful90,Jai99,JSW01,KS89a,KS89b,OS02,Zeu86]. Each of them has contributed interesting points to a better understanding of the challenging phenomenon of robust learning. [Ful90,JSW01,Zeu86] are already quoted above. In addition, notice that in [JSW01] the mind change hierarchy for Ex-type learning is proved to stand robustly. This contrasts with the result from [Ful90] that the anomaly hierarchy for Ex-type learning does not hold robustly. In [KS89a,KS89b], the authors were dealing with the so-called Bārzdiņš Conjecture which, intuitively, stated that the type Ex is robustly poor. In [CJO+00] robust learning has been studied for another specific learning scenario, namely learning aided by context. The intuition behind this model is to present the functions to be learned not in a pure fashion to the learner, but together with some "context" which is intended to help in learning. It is shown that within this scenario several results hold robustly as well. In [OS02], the notion of hyperrobust learning is introduced. A class of recursive functions is called hyperrobustly learnable if there is one and the same learner which learns not only this class itself but also all of its images under all primitive recursive operators.


Hence, this learner must be capable of learning the union of all these images. This definition is then justified by the following results. First, it is shown that the power of hyperrobust learning does not change if the class of primitive recursive operators is replaced by any larger, still recursively enumerable class of general recursive operators. Second, based on this stronger definition, Bārzdiņš' Conjecture is proved by showing that a class of recursive functions is hyperrobustly Ex-learnable iff this class is contained in a recursively enumerable class of recursive functions. Note that, by this equivalence and by Corollary 34 below, hyperrobust Ex-learning is stronger than uniformly robust Ex-learning. By [OS02], hyperrobustness destroys both direct and topological coding tricks. In [CJO+00], it is noted that hyperrobustness destroys any advantage of context, but, since, empirically, context does help, this provides evidence that the real world, in a sense, has codes for some things buried inside others. In [Jai99], another basic type of inductive inference, namely Bc, has been robustly separated from Ex, thus solving an open problem from [Ful90]. While in the present paper general recursive operators are taken in order to realize the transformations of the classes under consideration (the reason for this choice is mainly a technical one, namely, that these operators "automatically" map any class of recursive functions to a class of recursive functions again), in some of the papers above other types of operators are used, such as effective, recursive, and primitive recursive operators, respectively. At this moment, we do not see any choice to this end that seems to be superior to the others. Indeed, each approach appears justified if it yields interesting results.

For references surveying the theory of learning recursive functions, the reader is referred to

[AS83,BB75,CS83,Fre91,JORS99,KW80,OSW86].

2. Notation and preliminaries

Recursion-theoretic concepts not explained below are treated in [Rog67]. N denotes the set of natural numbers. "∗" denotes a non-member of N and is assumed to satisfy (∀n)[n < ∗ < ∞]. Let ∈, ⊆, ⊂, ⊇, ⊃, respectively, denote the membership, subset, proper subset, superset and proper superset relations for sets. The empty set is denoted by ∅. We let card(S) denote the cardinality of the set S; so "card(S) ≤ ∗" means that card(S) is finite. The minimum and maximum of a set S are denoted by min(S) and max(S), respectively. We take max(∅) to be 0 and min(∅) to be ∞. χ_A denotes the characteristic function of A; that is, χ_A(x) = 1 if x ∈ A, and 0 otherwise.

⟨·,·⟩ denotes a 1–1 computable mapping from pairs of natural numbers onto natural numbers. π₁, π₂ are the corresponding projection functions. ⟨·,·⟩ is extended to n-tuples of natural numbers in a natural way. Λ denotes the empty function. η, with or without subscripts, superscripts, primes and the like, ranges over partial functions. If η₁ and η₂ are both undefined on input x, then we take η₁(x) = η₂(x). We say that η₁ ⊆ η₂ iff for all x in the domain of η₁, η₁(x) = η₂(x). We let domain(η) and range(η), respectively, denote the domain and range of the partial function η. η(x)↓ denotes that η(x) is defined; η(x)↑ denotes that η(x) is undefined.

We say that a partial function η is conforming with η′ iff for all x ∈ domain(η) ∩ domain(η′), η(x) = η′(x). η ∼ η′ denotes that η is conforming with η′. η is non-conforming with η′ iff there exists an x such that η(x)↓ ≠ η′(x)↓. η ≁ η′ denotes that η is non-conforming with η′.
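The pairing function ⟨·,·⟩ is not fixed by the text; one standard concrete choice is the Cantor pairing function, sketched here purely for illustration:

```python
import math

def pair(x, y):
    """A 1-1 computable mapping from pairs of naturals onto naturals:
    the Cantor pairing function <x, y>."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """The corresponding projections pi_1, pi_2."""
    w = (math.isqrt(8 * z + 1) - 1) // 2   # index of the diagonal containing z
    y = z - w * (w + 1) // 2
    return w - y, y
```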


For r ∈ N, the r-extension of η denotes the function f defined as follows:

f(x) = η(x), if x ∈ domain(η);
     = r, otherwise.
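Modelling a finite partial function η as a Python dict, the r-extension can be sketched as:

```python
def r_extension(eta, r):
    """The r-extension of eta: agrees with eta on its domain and
    takes the constant value r everywhere else."""
    return lambda x: eta[x] if x in eta else r
```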

For a finite set S of programs, we let Union(S) denote the partial recursive function: Union(S)(x) = φ_p(x), for the first p ∈ S found such that φ_p(x)↓, using some standard dovetailing mechanism for computing the φ_p's. When programs q₁, q₂, …, q_n for partial recursive functions η₁, η₂, …, η_n are implicit, we sometimes abuse notation and use Union({η₁, η₂, …, η_n}) to denote Union({q₁, q₂, …, q_n}).
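The dovetailing behind Union(S) can be sketched as follows; here a "program" is modelled, for illustration only, as a step-indexed evaluator run(x, s) that returns None while it has not yet converged within s steps (our convention, not the paper's):

```python
def union(programs, x, step_bound=10_000):
    """Dovetailed Union(S)(x): run every program in S on input x for
    s = 1, 2, ... steps, returning the value of the first one found
    to converge."""
    for s in range(1, step_bound + 1):
        for run in programs:
            v = run(x, s)
            if v is not None:
                return v
    return None  # within the bound no program converged; the true
                 # Union(S)(x) may simply be undefined
```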

f, g, h, F and H, with or without subscripts, superscripts, primes and the like, range over total functions. R denotes the class of all recursive functions, i.e., total computable functions with arguments and values from N. T denotes the class of all total functions. R_{0,1} (T_{0,1}) denotes the class of all recursive functions (total functions) with range contained in {0, 1}. C and S, with or without subscripts, superscripts, primes and the like, range over subsets of R. P denotes the class of all partial recursive functions over N. φ denotes a fixed acceptable programming system. φ_i denotes the partial recursive function computed by program i in the φ-system. Note that in this paper all programs are interpreted with respect to the φ-system. We let Φ be an arbitrary Blum complexity measure [Blu67] associated with the acceptable programming system φ; many such measures exist for any acceptable programming system [Blu67]. We assume without loss of generality that Φ_i(x) ≥ x, for all i, x. φ_{i,s} is defined as follows:

φ_{i,s}(x) = φ_i(x), if x < s and Φ_i(x) < s;
           = ↑, otherwise.
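As an illustration (with finite toy tables standing in for φ and Φ, which are of course not finite objects in the text), the step-bounded approximation φ_{i,s} can be sketched as:

```python
INF = float("inf")

def phi_s(phi, Phi, i, x, s):
    """phi_{i,s}(x): defined, with value phi_i(x), iff x < s and
    Phi_i(x) < s; undefined (None here) otherwise.  phi and Phi are
    dict-of-dict tables modelling the programming system and its
    Blum complexity measure."""
    if x < s and Phi[i].get(x, INF) < s:
        return phi[i][x]
    return None
```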

For a given partial computable function η, we define MinProg(η) = min({i | φ_i = η}). A class C ⊆ R is said to be recursively enumerable iff there exists an r.e. set X such that C = {φ_i | i ∈ X}. For any non-empty recursively enumerable class C, there exists a recursive function f such that C = {φ_{f(i)} | i ∈ N}.

A class C is said to be h-bounded iff for all f ∈ C, for all but finitely many x, f(x) ≤ h(x). A class C is bounded iff it is h-bounded for some recursive h. A class C is unbounded iff it is not h-bounded for any recursive h.

The following functions and classes are commonly considered below. Zero is the everywhere-0 function, i.e., Zero(x) = 0, for all x ∈ N. CONST = {f | (∀x)[f(x) = f(0)]} denotes the class of constant functions. FINSUP = {f | (∀^∞ x)[f(x) = 0]} denotes the class of all recursive functions of finite support.

2.1. Function identification

We first describe inductive inference machines. We assume that the graph of a function is fed to a machine in canonical order.

For f ∈ R and n ∈ N, we let f[n] denote f(0)f(1)…f(n−1), the finite initial segment of f of length n. Clearly, f[0] denotes the empty segment. SEG denotes the set of all finite strings, {f[n] | f ∈ R ∧ n ∈ N}. SEG_{0,1} = {f[n] | f ∈ R_{0,1} ∧ n ∈ N}. We let σ, τ and γ, with or without subscripts, superscripts, primes and the like, range over SEG. Λ denotes the empty sequence. We assume


some computable ordering of the elements of SEG; σ < τ if σ appears before τ in this ordering. Similarly, one can talk about the least element of a subset of SEG.

For a finite string σ and a finite or infinite string β, we let σ ⋄ β denote the concatenation of σ and β. We identify σ = a₀a₁…a_{n−1} with the partial function

σ(x) = a_x, if x < n;
     = ↑, otherwise.

Similarly, a total function g is identified with the infinite sequence g(0)g(1)g(2)…. Thus, for example, 0^∞ = Zero.

Let |σ| denote the length of σ. If |σ| ≥ n, then we let σ[n] denote the prefix of σ of length n. σ ⊆ τ denotes that σ is a prefix of τ. An inductive inference machine (IIM) [Gol67] is an algorithmic device that computes a (possibly partial) mapping from SEG into N. Since the set of all finite strings, SEG, can be coded onto N, we can view these machines as taking natural numbers as input and emitting natural numbers as output. We say that M(f) converges to i (written: M(f)↓ = i) iff (∀^∞ n)[M(f[n]) = i]; M(f) is undefined if no such i exists. M₀, M₁, … denotes a recursive enumeration of all the IIMs. The next definitions describe several criteria of function identification.

Definition 1 (Gold [Gol67]). Let f ∈ R and S ⊆ R.

(a) M Ex-identifies f (written: f ∈ Ex(M)) just in case there exists a program i for f such that M(f)↓ = i.
(b) M Ex-identifies S iff M Ex-identifies each f ∈ S.
(c) Ex = {S ⊆ R | (∃M)[S ⊆ Ex(M)]}.

By the definition of convergence, only finitely many data points from a function f have been observed by an IIM M at the (unknown) point of convergence. Hence, some form of learning must take place in order for M to learn f. For this reason, hereafter the terms identify, learn and infer are used interchangeably.
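Convergence of M on f is not decidable in general, since it quantifies over all but finitely many n. The following sketch (ours, not the paper's) merely checks stabilization of the guesses within a finite horizon, using a toy machine for the constant functions:

```python
def stabilized_guess(M, f, horizon=50):
    """Report M's final conjecture on f if the guesses on
    f[1], ..., f[horizon] have stopped changing over the second half
    of the horizon; None otherwise.  Only a finite approximation of
    true M(f)-convergence."""
    guesses = [M([f(x) for x in range(n)]) for n in range(1, horizon + 1)]
    last = guesses[-1]
    if all(g == last for g in guesses[horizon // 2:]):
        return last
    return None

# A toy machine: conjecture the least constant consistent with the data.
M = lambda seg: next(k for k in range(100) if all(v == k for v in seg))
```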

Definition 2 (Bārzdiņš [Bar74b] and Case and Smith [CS83]). Let f ∈ R.

(a) M Bc-identifies f (written: f ∈ Bc(M)) iff, for all but finitely many n ∈ N, M(f[n]) is a program for f.
(b) M Bc-identifies S iff M Bc-identifies each f ∈ S.
(c) Bc = {S ⊆ R | (∃M)[S ⊆ Bc(M)]}.

Definition 3 (Bārzdiņš [Bar74a]). M is said to be consistent on f iff, for all n, M(f[n])↓ and f[n] ⊆ φ_{M(f[n])}.

Definition 4 (Wiehagen [Wie78]). M is said to be conforming on f iff, for all n, M(f[n])↓ and f[n] ∼ φ_{M(f[n])}.


Definition 5. (a) (Bārzdiņš [Bar74a]). M Cons-identifies f iff M is consistent on f, and M Ex-identifies f.
(b.1) (Bārzdiņš [Bar74a]). M Cons-identifies C iff M Cons-identifies each f ∈ C.
(b.2) Cons = {C | (∃M)[M Cons-identifies C]}.
(c.1) (Jantke and Beick [JB81]). M RCons-identifies C iff M is total, and M Cons-identifies C.
(c.2) RCons = {C | (∃M)[M RCons-identifies C]}.
(d.1) (Wiehagen and Liepe [WL76]). M TCons-identifies C iff M is consistent on each f ∈ T, and M Cons-identifies C.
(d.2) TCons = {C | (∃M)[M TCons-identifies C]}.

Note that for M to Cons-identify a function f, it must be defined on each initial segment of f. Similarly for Conf-identification below.

Definition 6. (a) (Wiehagen [Wie78]). M Conf-identifies f iff M is conforming on f, and M Ex-identifies f.
(b.1) (Wiehagen [Wie78]). M Conf-identifies C iff M Conf-identifies each f ∈ C.
(b.2) Conf = {C | (∃M)[M Conf-identifies C]}.
(c.1) M RConf-identifies C iff M is total, and M Conf-identifies C.
(c.2) RConf = {C | (∃M)[M RConf-identifies C]}.
(d.1) (Fulk [Ful88]). M TConf-identifies C iff M is conforming on each f ∈ T, and M Conf-identifies C.
(d.2) TConf = {C | (∃M)[M TConf-identifies C]}.

Definition 7 (Osherson et al. [OSW86]). M is confident iff for all total f, M(f)↓. M Confident-identifies C iff M is confident and M Ex-identifies C. Confident = {C | (∃M)[M Confident-identifies C]}.

Definition 8 (Minicozzi [Min76] and Blum and Blum [BB75]). M is reliable iff M is total, and for all total f, M(f)↓ ⇒ M Ex-identifies f. M Reliable-identifies C iff M is reliable and M Ex-identifies C. Reliable = {C | (∃M)[M Reliable-identifies C]}.

Definition 9. NUM = {C | (∃C′ | C ⊆ C′ ⊆ R)[C′ is recursively enumerable]}.

For references on inductive inference within NUM, the set of all recursively enumerable classes and their subclasses, the reader is referred to [BF74,FBP91,Gol67]. We let I and J range over the identification criteria defined above. The following theorem relates the criteria of inference discussed above.

Theorem 10 (Wiehagen and Liepe [WL76], Wiehagen and Zeugmann [WZ95], Bārzdiņš [Bar74a,Bar74b], Blum and Blum [BB75], Wiehagen [Wie76,Wie78], Fulk [Ful88], and Grabowski [Gra86]).

(a) NUM ⊂ TCons = TConf ⊂ RCons ⊂ RConf ⊂ Conf ⊂ Ex ⊂ Bc.


(b) RCons ⊂ Cons ⊂ Conf.
(c) RConf ⊄ Cons.
(d) Cons ⊄ RConf.
(e) TCons ⊂ Reliable ⊂ Ex.
(f) Reliable ⊄ Conf.
(g) RCons ⊄ Reliable.
(h) NUM ⊄ Confident.
(i) Confident ⊄ Conf.
(j) Confident ⊄ Reliable.

Proof. (a) and (b): NUM ⊂ TCons ⊂ Cons was shown by Wiehagen and Liepe [WL76]. TCons ⊂ RCons ⊂ Cons was shown by Wiehagen and Zeugmann [WZ95]. Cons ⊂ Ex was shown by Bārzdiņš [Bar74a], Blum and Blum [BB75], and Wiehagen [Wie76]. Cons ⊂ Conf ⊂ Ex was done by Wiehagen [Wie78]. TCons = TConf was shown by Fulk [Ful88]. Ex ⊂ Bc was done by Bārzdiņš [Bar74b] (see also [CS83]).

RCons ⊂ RConf and RConf ⊂ Conf follow from parts (c) and (d).

Part (c) follows from the proof of Conf − Cons ≠ ∅ in [Wie78]. We are not sure if anyone has explicitly shown part (d), but it follows as a corollary to Theorem 35.

TCons ⊆ Reliable follows from the definition. Reliable − TCons ≠ ∅ was shown by [Gra86]; it also follows from part (f). Reliable ⊂ Ex was shown by [BB75]. Thus, part (e) follows.

For part (f), consider the class C = FINSUP ∪ {f | φ_{f(0)} = f ∧ (∀x)[Φ_{f(0)}(x) ≤ f(x+1)]}. Clearly, C ∈ Reliable. Also, since FINSUP ⊆ C, if C ∈ Conf, then C ∈ TConf = TCons. Now suppose by way of contradiction that M TCons-identifies C. Then by the Kleene recursion theorem [Rog67] there exists an e such that φ_e may be described as follows:

φ_e(x) = e, if x = 0;
       = Φ_e(x−1), if x > 0 and M(φ_e[x] ⋄ Φ_e(x−1)) ≠ M(φ_e[x]);
       = Φ_e(x−1) + 1, otherwise.

It is easy to verify that φ_e is total. Since M is consistent on all inputs, for each f[x] there exists at most one y such that M(f[x]) = M(f[x] ⋄ y). Thus, by the definition of φ_e, for all x > 0, M(φ_e[x]) ≠ M(φ_e[x+1]); hence M does not converge on φ_e ∈ C, a contradiction. Thus, part (f) follows.

Let SD = {f ∈ R | φ_{f(0)} = f}. It was shown by [BB75] that SD ∉ Reliable. Since SD ∈ RCons ∩ Confident, parts (g) and (j) follow.

For (h), consider FINSUP. Clearly, FINSUP ∈ NUM. Now, suppose FINSUP ⊆ Ex(M). Then, for all σ, there exists τ such that σ ⊆ τ and M(σ) ≠ M(τ). It follows that there exists an infinite sequence σ_i, i ∈ N, such that σ_i ⊂ σ_{i+1} and M(σ_i) ≠ M(σ_{i+1}). Thus, M makes an infinite number of mind changes on ⋃_i σ_i. Thus, M is not confident. (h) follows.

For part (i), let C = {f | [φ_{f(0)} = f ∧ (∀x > 0)[f(x) ≠ 0]] or [(∃y)[(∀x > y)[f(x) = 0] ∧ (∀x | 0 < x ≤ y)[f(x) ≠ 0]]]}.

It is easy to verify that C ∈ Confident. Suppose by way of contradiction that M Conf-identifies C. We consider two cases:


Case 1: There exists an f[n] such that, for all x with 0 < x < n, f(x) ≠ 0, and [M(f[n])↑ or f[n] ≁ φ_{M(f[n])}].

In this case, let g = f[n] ⋄ 0^∞. Clearly, g ∈ C, but M does not Conf-identify g.

Case 2: Not Case 1.

In this case, by the Kleene recursion theorem [Rog67], there exists an e such that φ_e may be defined as follows:

φ_e(x) = e, if x = 0;
       = 1, if x > 0 and M(φ_e[x] ⋄ 1) ≠ M(φ_e[x]);
       = 2, otherwise.

By the hypothesis of the case, φ_e is total and a member of C. Suppose M(φ_e)↓. Let n be such that for all x > n, M(φ_e[x]) = M(φ_e[n]). But then, by the definition of φ_e, for all x > n, φ_e(x) = 2 and M(φ_e[x] ⋄ 1) = M(φ_e[x]). This, by the hypothesis of the case, implies that M does not Ex-identify φ_e. □

2.2. Operators

Definition 11 (Rogers [Rog67]). A recursive operator is an effective total mapping, Θ, from (possibly partial) functions to (possibly partial) functions, which satisfies the following properties:

(a) Monotonicity: For all functions η, η′, if η ⊆ η′ then Θ(η) ⊆ Θ(η′).
(b) Compactness: For all η, if (x, y) ∈ Θ(η), then there exists a finite function α ⊆ η such that (x, y) ∈ Θ(α).
(c) Recursiveness: For all finite functions α, one can effectively enumerate (in α) all (x, y) ∈ Θ(α).
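As a toy illustration (our example, not an operator used in the text), the operator Θ(η)(x) = η(x) + η(x+1) satisfies all three properties, and it is even general recursive in the sense of Definition 12 below, since it maps total functions to total functions:

```python
def theta(eta):
    """Theta(eta)(x) = eta(x) + eta(x+1), restricted to the part of the
    output that eta's finite graph (a dict) already determines.  This
    makes compactness visible, and eta ⊆ eta' implies
    theta(eta) ⊆ theta(eta'), i.e., monotonicity."""
    return {x: eta[x] + eta[x + 1] for x in eta if x + 1 in eta}
```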

Definition 12 (Rogers [Rog67]). A recursive operator Θ is called general recursive iff Θ maps all total functions to total functions.

For each recursive operator Θ, we can effectively (from Θ) find a recursive operator Θ′ such that:

(d) for each finite function α, Θ′(α) is finite, and its canonical index can be effectively determined from α; furthermore, if α ∈ SEG then Θ′(α) ∈ SEG; and
(e) for all total functions f, Θ′(f) = Θ(f).

This allows us to get a nice effective sequence of recursive operators.

Proposition 13 (Jain et al. [JSW01]). There exists an effective enumeration Θ₀, Θ₁, … of recursive operators satisfying condition (d) above such that, for all recursive operators Θ, there exists an i ∈ N satisfying:

for all total functions f, Θ(f) = Θ_i(f).


Since we will be mainly concerned with the properties of operators on total functions, for diagonalization purposes one can restrict attention to operators in the above enumeration Θ₀, Θ₁, ….

Definition 14 (Fulk [Ful90] and Jain et al. [JSW01]).

RobustEx = {C | (∀ general recursive operators Θ)[Θ(C) ∈ Ex]}.
UniformRobustEx = {C | (∃g ∈ R)(∀e | Θ_e is general recursive)[Θ_e(C) ⊆ Ex(M_{g(e)})]}.

One can similarly define RobustI and UniformRobustI for the other criteria I of learning considered in this paper.

2.3. Some useful propositions

Proposition 15 (Jain et al. [JSW01] and Zeugmann [Zeu86]). NUM = RobustNUM.

Corollary 16. (a) NUM ⊆ RobustTCons ⊆ RobustReliable.
(b) NUM ⊆ UniformRobustTCons.

Proof. (a) follows immediately from Proposition 15 and Theorem 10(a,e).

In order to show (b), recall that, by Proposition 15, for any recursively enumerable class C and any general recursive operator Θ_e, Θ_e(C) is again recursively enumerable and, hence, can be Ex-learned using identification by enumeration, see [Gol67]. Moreover, given C, this identification-by-enumeration strategy can be computed uniformly in e. Recall that identification by enumeration already works consistently, see [Gol67]. Finally, notice that this strategy can easily and uniformly be made to work consistently on all total functions. □

We say that f =* g iff card({x | f(x) ≠ g(x)}) is finite. We say that C ⊆ R is closed under finite variations iff for all f, g ∈ R such that f =* g, f ∈ C iff g ∈ C.

Proposition 17 (Ott and Stephan [OS02]). Suppose C is closed under finite variations. Then C ∈ RobustEx iff C ∈ NUM.

Proposition 18. TCons ⊄ RobustEx.

Proof. Let C = {f | (∃i)[Φ_i ∈ R ∧ f =* Φ_i]}. It is easy to verify that C ∈ TCons. However, as C is closed under finite variations, using Proposition 17, we have C ∈ RobustEx iff C ∈ NUM. As {Φ_i | Φ_i ∈ R} ∉ NUM, the proposition follows. □

The following result was shown in [Gra86].

Lemma 19 (Grabowski [Gra86]). Suppose S ∈ Reliable and S is bounded. Then S ∈ NUM.

Corollary 20. Suppose S ∈ TCons and S is bounded. Then S ∈ NUM.


Proof. This follows immediately from Lemma 19 and Theorem 10(e). □

Proposition 21 (Freivalds and Wiehagen [FW79]). Let $S_e = \{\varphi_i \in \mathcal{R} \mid i \leq e\}$. Then there exists a recursive $h$ such that, for any $e$, $S_e \subseteq \mathrm{Ex}(M_{h(e)})$. Moreover, this $M_{h(e)}$ also witnesses that $S_e \in$ Cons.

Corollary 22. Let $S_e = \{\varphi_i \in \mathcal{R} \mid i \leq e\}$. There exists a recursive $h$ such that, if $\Theta_j$ is a general recursive operator, then $\Theta_j(S_e) \subseteq \mathrm{Ex}(M_{h(e,j)})$. Moreover, this $M_{h(e,j)}$ also witnesses that $\Theta_j(S_e) \in$ Cons.

3. Robust learning and consistency

The main result of this section is UniformRobustEx $\subseteq$ Cons; see Theorem 23. Hence, all uniformly robustly Ex-learnable classes are contained in the "lower part" Cons of Ex; recall that Cons is a proper subset of Ex, see [Bar74a,BB75,Wie76]. This nicely relates two surprising phenomena of learning, namely the robustness phenomenon and the inconsistency phenomenon. On the other hand, despite the fact that every uniformly robustly Ex-learnable class is located in that lower part of Ex, UniformRobustEx contains "algorithmically rich" classes, since, by Corollary 34, UniformRobustEx $\supsetneq$ NUM.

Theorem 23. UniformRobustEx $\subseteq$ Cons.

Proof. Suppose $C \in$ UniformRobustEx, and $g$ is a recursive function such that, for all $i$, if $\Theta_i$ is general recursive, then $\Theta_i(C) \subseteq \mathrm{Ex}(M_{g(i)})$. Without loss of generality we may assume that each $M_{g(i)}$ is total.

Then, by the Kleene recursion theorem [Rog67], there exists an $e$ such that $\Theta_e$ may be described as follows. For ease of presentation, $\Theta_e(f[m])$ may be infinite for some $f, m$ (and thus it does not satisfy the invariant (d) we assumed on the enumeration $\Theta_0, \Theta_1, \ldots$, see the discussion around Proposition 13; this can easily be handled by appropriately slowing down $\Theta_e$):

$$\Theta_e(\Lambda) = \Lambda;$$

$$\Theta_e(f[n+1]) = \begin{cases} \Theta_e(f[n]), & \text{if } \Theta_e(f[n]) \text{ is infinite;} \\ \Theta_e(f[n]) \diamond f(n) \diamond 0^\infty, & \text{if for all } m,\ \Theta_e(f[n]) \diamond f(n) \not\subseteq \varphi_{M_{g(e)}(\Theta_e(f[n]) \diamond f(n) \diamond 0^m),\,m}; \\ \Theta_e(f[n]) \diamond f(n) \diamond 0^m, & \text{if } m \text{ is the least number such that} \\ & \quad \Theta_e(f[n]) \diamond f(n) \subseteq \varphi_{M_{g(e)}(\Theta_e(f[n]) \diamond f(n) \diamond 0^m),\,m}. \end{cases}$$

It is easy to verify that $\Theta_e$ is general recursive.

Claim 24. (a) If $\Theta_e(f[n])$ is finite and $a \neq b$, then $\Theta_e(f[n] \diamond a) \not\sim \Theta_e(f[n] \diamond b)$.
(b) If $\Theta_e(f[n])$ and $\Theta_e(h[m])$ are both finite, then $f[n] \sim h[m]$ iff $\Theta_e(f[n]) \sim \Theta_e(h[m])$.
(c) If $\Theta_e(f[n+1])$ is finite, then $\Theta_e(f[n]) \diamond f(n) \subseteq \varphi_{M_{g(e)}(\Theta_e(f[n+1]))}$.
(d) For all $p, n$, there exists at most one $f[n+1]$ such that $\Theta_e(f[n]) \diamond f(n) \subseteq \varphi_p$.


(e) If $\Theta_e(f[n+1])$ is infinite, then $M_{g(e)}$ does not Ex-identify $\Theta_e(f)$.
(f) For all $f \in C$, for all $n$, $\Theta_e(f[n])$ is finite.

Proof. (a) and (c) follow from the construction. (b) follows from part (a). Part (d) follows from part (b).

For part (e), suppose $\Theta_e(f[n+1])$ is infinite. Without loss of generality assume that $n$ is least such that $\Theta_e(f[n+1])$ is infinite. Then $\Theta_e(f) = \Theta_e(f[n+1]) = \Theta_e(f[n]) \diamond f(n) \diamond 0^\infty$. But, for all $m$, $\Theta_e(f[n]) \diamond f(n) \not\subseteq \varphi_{M_{g(e)}(\Theta_e(f[n]) \diamond f(n) \diamond 0^m),\,m}$. Thus, either $M_{g(e)}$ diverges on $\Theta_e(f)$, or $\Theta_e(f[n]) \diamond f(n) \not\subseteq \varphi_{M_{g(e)}(\Theta_e(f))}$. Thus, $M_{g(e)}$ does not Ex-identify $\Theta_e(f)$.

(f) follows as a corollary of part (e). $\Box$

Let $\mathrm{prog}$ be a recursive function such that $\varphi_{\mathrm{prog}(p)}$ is defined as follows.

$\varphi_{\mathrm{prog}(p)}(x)$:
1. Search for an $f[x+1]$ such that $\Theta_e(f[x+1])$ is finite and $\Theta_e(f[x]) \diamond f(x) \subseteq \varphi_p$. (* Note that, by Claim 24(d), there exists at most one $f[x+1]$ satisfying the above. Moreover, if $p \in \{M_{g(e)}(\Theta_e(h[x+1])) \mid \Theta_e(h[x+1]) \text{ is finite}\}$, then there exists one such $f[x+1]$, by Claim 24(c). *)
2. If and when such an $f[x+1]$ is found, output $f(x)$.
End

Claim 25. If $\Theta_e(f[n])$ is finite, then $f[n] \subseteq \varphi_{\mathrm{prog}(M_{g(e)}(\Theta_e(f[n])))}$.

Proof. If $n = 0$, then the claim trivially holds. If $n > 0$, then, by Claim 24(c), $\Theta_e(f[n-1]) \diamond f(n-1) \subseteq \varphi_{M_{g(e)}(\Theta_e(f[n]))}$. Thus, by the definition of $\mathrm{prog}$ and Claim 24(d), $f[n] \subseteq \varphi_{\mathrm{prog}(M_{g(e)}(\Theta_e(f[n])))}$. $\Box$

We now define $M$ as follows:

$$M(f[m]) = \begin{cases} \mathrm{prog}(M_{g(e)}(\Theta_e(f[m]))), & \text{if } \Theta_e(f[m]) \text{ is finite;} \\ m, & \text{otherwise.} \end{cases}$$

It follows from Claims 24(f) and 25 that $M$ is consistent on each $f \in C$. Since $M_{g(e)}$ converges on $\Theta_e(f)$ for each $f \in C$, it follows that $M$ converges on each $f \in C$. Thus, it follows by the consistency of $M$ that $M$ Ex-identifies each $f \in C$. It follows that $C \in$ Cons. $\Box$

Note that the proof of Theorem 23 shows that, given any recursive $g$, one can effectively construct an $M$ such that, for any $C$, if

for all $e$ such that $\Theta_e$ is general recursive, $\Theta_e(C) \subseteq \mathrm{Ex}(M_{g(e)})$,

then $M$ Cons-identifies $C$. Thus, we have that UniformRobustEx $\subseteq$ UniformRobustCons. Since the reverse inclusion holds by definition, it follows that:


Corollary 26. UniformRobustEx $=$ UniformRobustCons.

Now we investigate the links between boundedness and uniform robust learnability of classes of functions. As a consequence, we derive that UniformRobustEx contains algorithmically rich classes; see Corollary 34 below.

Theorem 27. Suppose $S \in$ UniformRobustCons. If there exists an $x$ such that $\max(\{f(x) \mid f \in S\}) = \infty$, then $S \in$ NUM.

Proof. There is an infinite recursive binary tree without an infinite recursive branch in which each node $\sigma$ is either branching (having successors $\sigma \diamond 0$ and $\sigma \diamond 1$) or a leaf (having no successors). Such a tree can be obtained by starting with the full binary tree and then removing $\sigma$ from this tree iff there exists a $j < |\sigma|/2$ such that $\sigma[2j] \subseteq \varphi_{j,|\sigma|}$.

Let this tree be named $T$ and let $t_0, t_1, \ldots$ be a one–one recursive enumeration of its leaves; note that the length of each $t_k$ is at least 1.

Claim 28. For any $g \in \mathcal{R}_{0,1}$, there exists a unique $f$ (which is recursive) such that $g = t_{f(0)} \diamond t_{f(1)} \diamond t_{f(2)} \diamond \cdots$. Furthermore, a program for $f$ can be computed effectively from $g$.

Proof. We inductively define $g_i$ and $n_i$ (each $g_i$ will be recursive). Let $g_0 = g$. Suppose $g_i$ has been defined. Then we define $n_i$ and $g_{i+1}$ as follows. Since $g_i$ is recursive, $g_i$ is not an infinite branch of $T$. Let $n_i$ be such that $t_{n_i} \subseteq g_i$ (note that $n_i$ is unique). Let $g_{i+1}$ be such that $g_i = t_{n_i} \diamond g_{i+1}$.

It is easy to verify that each $g_i$ is recursive, and that $n_i$ can be found effectively from $g$ and $i$. Let $f(i) = n_i$. The claim is now easy to verify. $\Box$

For any recursive $g$, let $\mathrm{Imag}(g)$ denote the unique $f$ such that $g = t_{f(0)} \diamond t_{f(1)} \diamond t_{f(2)} \diamond \cdots$. By the above claim, a program for $\mathrm{Imag}(g)$ can be found effectively from a program for $g$.

We now continue with the proof of the theorem. Let $S$ and $x$ be as given in the hypothesis.
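The factorisation underlying Claim 28 and $\mathrm{Imag}$ uses only the fact that the leaves $t_0, t_1, \ldots$ form a prefix-free set. The following sketch illustrates the uniqueness of the greedy splitting on hypothetical finite data; the function name, the toy leaf list, and the use of a finite string in place of a recursive $g$ are all illustrative assumptions of ours.

```python
# Factor a 0-1 string over a prefix-free set of "leaves": because no
# leaf is a prefix of another, at each position at most one leaf can
# match, so the factorisation, when it exists, is unique.

def factor_into_leaves(leaves, g):
    """Return the list of leaf indices f with
    g = leaves[f[0]] + leaves[f[1]] + ... (a finite analogue of Imag)."""
    f, rest = [], g
    while rest:
        for i, t in enumerate(leaves):
            if rest.startswith(t):
                f.append(i)
                rest = rest[len(t):]
                break
        else:
            raise ValueError("g does not factor over these leaves")
    return f

leaves = ["00", "01", "1"]                    # prefix-free
assert factor_into_leaves(leaves, "01100") == [1, 2, 0]
```

In the proof, the fact that a recursive $g$ cannot be an infinite branch of $T$ is what guarantees that the greedy search always finds the next leaf.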

Suppose $g$ is a recursive function such that, for all $e$ such that $\Theta_e$ is general recursive, $M_{g(e)}$ Cons-identifies $\Theta_e(S)$.

Let $U_i = \{\sigma \in \mathrm{SEG}_{0,1} \mid M_{g(i)}(\sigma)\!\downarrow \wedge\ \sigma \subseteq \varphi_{M_{g(i)}(\sigma)}\}$. Note that $U_i$ is r.e. Fix some uniform (in $i$) enumeration of $U_i$. Let $\sigma_{i,t}$ denote the least $\sigma$ not enumerated in $U_i$ within $t$ steps. Note that $\sigma_{i,t}$ can be computed effectively from $i$ and $t$.

By implicit use of the Kleene recursion theorem [Rog67], there exists an $e$ such that

$$\Theta_e(f) = \sigma_{e,f(x)} \diamond t_{f(0)} \diamond t_{f(1)} \diamond \cdots$$

Note that $\Theta_e$ is general recursive.

Claim 29. $U_e = \mathrm{SEG}_{0,1}$.


Proof. Suppose by way of contradiction that $\mathrm{SEG}_{0,1} - U_e \neq \emptyset$. Let $\sigma$ be the least element of $\mathrm{SEG}_{0,1} - U_e$. Thus, $M_{g(e)}(\sigma)\!\uparrow$ or $\varphi_{M_{g(e)}(\sigma)}$ does not extend $\sigma$. Thus, $M_{g(e)}$ does not Cons-identify any extension of $\sigma$.

Let $t$ be large enough so that all elements of $U_e$ which are less than $\sigma$ have been enumerated in $U_e$ within $t$ steps. Then, for any $f \in S$ such that $f(x) > t$, $\sigma_{e,f(x)} = \sigma$. Thus, $\Theta_e(f)$ extends $\sigma$. Therefore, $M_{g(e)}$ does not Cons-identify $\Theta_e(f)$ (since $M_{g(e)}$ does not Cons-identify any extension of $\sigma$), a contradiction to $M_{g(e)}$ Cons-identifying $\Theta_e(S)$. $\Box$

It follows that $M_{g(e)}$ TCons-identifies $\Theta_e(S)$. Since $\Theta_e(S) \subseteq \mathcal{R}_{0,1}$, it follows by Corollary 20 that $\Theta_e(S) \in$ NUM. Thus, there exists a recursively enumerable family $f_0, f_1, \ldots$ of recursive functions in $\mathcal{R}_{0,1}$ such that $\Theta_e(S) \subseteq \{f_i \mid i \in \mathbb{N}\}$.

Let $g_{i,n} = f_i(n) \diamond f_i(n+1) \diamond f_i(n+2) \diamond \cdots$. Thus, $\{g_{i,n} \mid i, n \in \mathbb{N}\}$ is recursively enumerable. Suppose $f \in S$ and $f_k = \Theta_e(f)$. Then, clearly, $\mathrm{Imag}(g_{k,n}) = f$ for $n = |\sigma_{e,f(x)}|$. It follows that $S \subseteq \{\mathrm{Imag}(g_{i,n}) \mid i, n \in \mathbb{N}\}$. Hence, $S \in$ NUM. $\Box$

Corollary 30. Suppose $S \in$ UniformRobustEx. If there exists an $x$ such that $\max(\{f(x) \mid f \in S\}) = \infty$, then $S \in$ NUM.

Proof. This follows immediately from Corollary 26 and Theorem 27. $\Box$

Definition 31. A function $F$ is 1-generic iff, for every recursively enumerable set $A \subseteq \mathrm{SEG}$, one of the following holds:
(a) there exists an $x$ such that $F[x] \in A$, or
(b) there exists an $x$ such that, for all $\sigma \in A$, $F[x] \not\subseteq \sigma$.

It can be shown that there exists a limiting recursive 1-generic $F$.

Theorem 32. There is an $S \in$ UniformRobustEx such that $S$ is unbounded.

Proof. Let $F$ be a limiting recursive 1-generic function.

Claim 33. For all general recursive operators $\Theta$, one of (a) or (b) is satisfied:

(a) there exists a $\sigma \subseteq F$ such that $(\forall f \supseteq \sigma)\,[\Theta(F) = \Theta(f)]$, or
(b) $\Theta(F)$ is not recursive.

Proof. Suppose $\Theta(F)$ is recursive. Let $A = \{\sigma \mid \Theta(\sigma) \not\sim \Theta(F)\}$. Then $A$ is recursively enumerable. Moreover, for all $\sigma \in A$, $\sigma \not\subseteq F$. Thus, by the definition of a 1-generic function, there exists an $x$ such that, for all $\sigma \in A$, $F[x] \not\subseteq \sigma$. It follows that, for any $f$ extending $F[x]$, $\Theta(f) = \Theta(F)$. Part (a) follows. $\Box$


Now define $S$ as follows. Let

$$\eta_{i,r}(x) = \begin{cases} F(x), & \text{if } x < r; \\ \varphi_i(x), & \text{otherwise.} \end{cases}$$

$$S = \{\eta_{i,r} \mid i < r \text{ and } \eta_{i,r} \in \mathcal{R}\}.$$

Clearly, $S$ is unbounded (since, for every recursive $h$, there exists an $i$ such that $\varphi_i \in \mathcal{R}$ and $\varphi_i(x) > h(x)$ infinitely often; thus $\eta_{i,r} \in S$ for all but finitely many $r$, but $\eta_{i,r}$ is not $h$-bounded).

We show that $S \in$ UniformRobustEx. Note that there is a recursively enumerable sequence $G_0, G_1, \ldots$ of total recursive functions approximating $F$ in the limit.

Let $h$ be as in Corollary 22. By the parameterized s-m-n theorem, let $g$ be a recursive function such that $M_{g(e)}$ may be defined as follows.

$M_{g(e)}(f[n])$:
1. If there is a $k \leq n$ such that $\Theta_e(G_k) \supseteq f[n]$, then output a canonical index for $\Theta_e(G_k)$, for the least such $k$.
2. Otherwise, let $m$ be the least number such that $\Theta_e(G_n[m]) \not\sim f[n]$. Let $\eta'_{i,r}$ denote the function

$$\eta'_{i,r}(x) = \begin{cases} G_n(x), & \text{if } x < r; \\ \varphi_i(x), & \text{otherwise.} \end{cases}$$

Let $m'$ be such that, for all $i, r \leq m$, there exists a program $j \leq m'$ for $\eta'_{i,r}$. Output $M_{h(m',e)}(f[n])$.

End

We claim that $M_{g(e)}$ Ex-identifies $\Theta_e(S)$. To see this, suppose $f \in \Theta_e(S)$.

Case 1: There exists a $k$ such that $\Theta_e(G_k) = f$.

In this case, by step 1 of the construction of $M_{g(e)}$, we have that $M_{g(e)}$ Ex-identifies $f$.

Case 2: Not Case 1.

In this case we first claim that $\Theta_e(F) \neq f$. If this were not the case, then, since $f$ is recursive, by Claim 33 there would exist an $m$ such that, for all $g \supseteq F[m]$, $\Theta_e(g) = f$. Since $F[m] \subseteq G_k$ for all but finitely many $k$, we would have Case 1 (that is, $\Theta_e(G_k) = f$).

Thus, for all but finitely many $n$, $M_{g(e)}$ executes step 2 on input $f[n]$, and the $m$ computed in step 2 is the least number such that $\Theta_e(F[m]) \not\sim f$ and $F[m] \subseteq G_n$. But then, by the construction of $S$, there exists an $i < r \leq m$ such that $\eta_{i,r} \in S$ and $f = \Theta_e(\eta_{i,r})$. But then, for all but finitely many $n$, the value $m'$ as defined in $M_{g(e)}(f[n])$ converges (as $n$ goes to $\infty$) and is a bound on a program for $\eta_{i,r}$. It follows that $M_{h(m',e)}$, and thus $M_{g(e)}$, Ex-identifies $f$.

The above cases prove the theorem. $\Box$


Corollary 34. UniformRobustEx $\supsetneq$ NUM.

Proof. The inclusion follows from Corollary 16(b) and Theorem 10(a). The proper inclusion now follows from Theorem 32, since every class in NUM is bounded. $\Box$

4. Robustly rich and robustly poor learning

In this section, we show for several learning types that they are "robustly rich"; that is, these types contain classes that are both robustly learnable and not contained in any recursively enumerable class. Notice that, in proving these types robustly rich below, we in general show some stronger results; namely, we robustly separate (or even uniformly robustly separate) the corresponding types from some other types, thereby strengthening known separations in a (uniformly) robust way. From these separations, the corresponding richness results follow easily by applying Theorem 10. On the other hand, there are also "robustly poor" learning types; that is, every robustly learnable class of these types is contained in a recursively enumerable class. Some further robustly poor types will be exhibited in Section 5.

A summary of results relating robust consistent and conforming learning is shown in Fig. 1. Note that the relationship between Cons and RobustConf is open at this point (along with whether RobustCons is a proper subset of RobustConf). Besides the results given in Fig. 1, we also give results involving confident learning in this section.

Theorem 35. UniformRobustCons $\not\subseteq$ RConf. (Moreover, the non-inclusion can be witnessed by a class from $\mathcal{R}_{0,1}$.)

Proof. Let $F$ be a limiting recursive function which dominates every recursive function and satisfies $F(x) < F(x+1)$ for all $x$. Let $F_s(\cdot)$ denote a recursive approximation of $F$.

Let $C = \{f \in \mathcal{R}_{0,1} \mid (\exists e)\,[f(e) = 1,\ (\forall x < e)\,[f(x) = 0],\ \text{and } (\exists j \leq F(e))\,[\varphi_j = f]]\}$.

Claim 36. $C \in$ UniformRobustEx.

Proof. Let $h$ be as in Corollary 22. Let $g$ be a recursive function such that $M_{g(e)}$ may be defined as follows.

$M_{g(e)}(f[n])$:
1. If $\Theta_e(\mathrm{Zero}[n]) \sim f[n]$, then output a standard program for $\Theta_e(\mathrm{Zero})$.
2. Else, let $m$ be the least number such that $\Theta_e(0^m) \not\sim f[n]$. Output $M_{h(F_n(m),e)}(f[n])$.
End

Suppose $\Theta_e$ is general recursive. Then we claim that $M_{g(e)}$ Ex-identifies $\Theta_e(C)$. To see this, suppose $f \in \Theta_e(C)$. If $f = \Theta_e(\mathrm{Zero})$, then clearly, by step 1, $M_{g(e)}$ Ex-identifies $f$.


If $f \neq \Theta_e(\mathrm{Zero})$, then let $m$ be the least number such that $f \not\sim \Theta_e(\mathrm{Zero}[m])$. Thus, if $f = \Theta_e(\eta)$, then $\eta(x) \neq 0$ for some $x < m$. Therefore, by the definition of $C$, there exists a $j \leq F(m)$ such that $f = \Theta_e(\varphi_j)$. Now it follows from the definition of $h$ in Corollary 22 that $M_{h(F(m),e)}$ Ex-identifies $f$, and thus $M_{g(e)}$ Ex-identifies $f$. $\Box$

Thus, $C \in$ UniformRobustCons by Corollary 26. We now show that $C \notin$ RConf. Assume by way of contradiction that $C \in$ RConf as witnessed by $M$.

Claim 37. For all but finitely many $j$, for all $\sigma$ extending $0^j 1$: if $M(\sigma \diamond 0) = M(\sigma \diamond 1)$, then $\varphi_{M(\sigma \diamond 0)}(|\sigma|)\!\uparrow$.

Proof. Let $f$ be such that, for $j \in \mathbb{N}$, $\varphi_{f(j)}$ is defined as follows:

Search for a $\sigma \in \mathrm{SEG}_{0,1}$ extending $0^j 1$ such that $M(\sigma \diamond 0) = M(\sigma \diamond 1)$ and $\varphi_{M(\sigma \diamond 0)}(|\sigma|)\!\downarrow$. If and when such a $\sigma$ is found, let $\varphi_{f(j)}$ be the $(1 \dot{-}\, \varphi_{M(\sigma \diamond 0)}(|\sigma|))$-extension of $\sigma$.


Fig. 1. Robust learning and consistency/conformity. (The diagram relates TCons $=$ TConf, RCons, RConf, Cons and Conf together with their robust counterparts RobustTCons $=$ RobustTConf $=$ NUM, RobustRCons, RobustRConf, RobustCons and RobustConf; its edges are marked "proper subset", "subset, maybe proper", or "relationship not yet known".)


Note that $f(j) \leq F(j)$ for all but finitely many $j$. Let $j_0$ be such that, for all $j \geq j_0$, $f(j) \leq F(j)$.

Thus, for all $j \geq j_0$: if there exists a $\sigma$ extending $0^j 1$ such that $M(\sigma \diamond 0) = M(\sigma \diamond 1)$ and $\varphi_{M(\sigma \diamond 0)}(|\sigma|)\!\downarrow$, then $\varphi_{f(j)}$ is total and belongs to $C$, and $M$ is not conforming on $\varphi_{f(j)}$, a contradiction to $M$ RConf-identifying $C$.

Thus, for all $j \geq j_0$, for all $\sigma$ extending $0^j 1$: if $M(\sigma \diamond 0) = M(\sigma \diamond 1)$, then $\varphi_{M(\sigma \diamond 0)}(|\sigma|)\!\uparrow$. $\Box$

Now, for $\sigma \in \mathrm{SEG}_{0,1}$, define $f_\sigma$ as follows:

$$f_\sigma(x) = \begin{cases} \sigma(x), & \text{if } x < |\sigma|; \\ 1, & \text{if } M(f_\sigma[x] \diamond 1) = M(f_\sigma[x]) \text{ and } M(f_\sigma[x] \diamond 0) \neq M(f_\sigma[x]); \\ 0, & \text{otherwise.} \end{cases}$$

Clearly, $f_\sigma$ is total for each $\sigma \in \mathrm{SEG}_{0,1}$.
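The functions $f_\sigma$ are uniformly computable relative to the machine $M$. A minimal sketch of the case distinction above, with $M$ modelled as an arbitrary total Python function on tuples (an illustrative stand-in assumption for the learning machine of the proof):

```python
# Compute an initial segment of f_sigma: copy sigma, then append 1
# exactly when M keeps its guess on the 1-extension but changes it on
# the 0-extension; otherwise append 0.

def f_sigma(M, sigma, n):
    vals = list(sigma)
    while len(vals) < n:
        cur = tuple(vals)
        if M(cur + (1,)) == M(cur) and M(cur + (0,)) != M(cur):
            vals.append(1)
        else:
            vals.append(0)
    return vals[:n]

# With a learner whose guess is the number of 1s seen so far, every
# 1-extension changes the guess, so f_sigma continues with 0s:
count_ones = lambda seg: sum(seg)
assert f_sigma(count_ones, (0, 1), 5) == [0, 1, 0, 0, 0]
```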

Claim 38. For all but finitely many $j$, for all $g \in C$ such that $0^j 1 \subseteq g$, there exists a $\tau$ such that $g = f_\tau$.

Proof. Consider a $j$ such that

for all $\sigma \in \mathrm{SEG}_{0,1}$ extending $0^j 1$: if $M(\sigma \diamond 0) = M(\sigma \diamond 1)$, then $\varphi_{M(\sigma \diamond 0)}(|\sigma|)\!\uparrow$.

Note that all but finitely many $j$ satisfy the above condition (by Claim 37). Now suppose $0^j 1 \subseteq g \in C$. Let $\tau \supseteq 0^j 1$ be such that $\tau \subseteq g$ and, for all $\tau'$ with $\tau \subseteq \tau' \subseteq g$, $M(\tau) = M(\tau')$. Then it follows that $f_\tau = g$. $\Box$

From the above claim it follows that $C$ is in NUM (since, for any $j$, there are only finitely many functions in $C$ with prefix $0^j 1$).

However, $C \notin$ NUM, since $\{f \in \mathcal{R} \mid \varphi_j = f \text{ and } 0^j 1 \subseteq f\} \subseteq C$, but $\{f \in \mathcal{R} \mid \varphi_j = f \text{ and } 0^j 1 \subseteq f\} \notin$ NUM. $\Box$

Corollary 39. UniformRobustCons $\supsetneq$ NUM.

Proof. The inclusion follows from Corollary 16(b) and Theorem 10(a,b). The proper inclusion now follows from Theorem 35 and Theorem 10(a). $\Box$

Note that the class $S$ used in the proof of Theorem 32 is also not in RConf. So one can give an alternative proof of the above theorem using the $S$ from the proof of Theorem 32. However, our current proof is useful for some other results in the paper.

We now consider the robust separation of RCons and TCons; see Theorem 44 below. We first show the following proposition and lemma.

Proposition 40. Suppose $\Theta_k$ is general recursive. Then, for all $n$, there exists an $m$ such that, if $\tau \in \mathrm{SEG}_{0,1}$ and $|\tau| \geq m$, then $\mathrm{domain}(\Theta_k(\tau)) \supseteq \{0, 1, \ldots, n-1\}$.


Proof. This follows from Konig’s Lemma and general recursiveness of Yk: &

Lemma 41. There exists a recursive $p$ such that (a) to (e) are satisfied.

(a) $\varphi_{p(i)}(i) = 1$ and $\varphi_{p(i)}(x) = 0$ for $x < i$.
(b) $\varphi_{p(i)} \in \mathcal{R}_{0,1}$ or $\varphi_{p(i)} \in \mathrm{SEG}_{0,1}$.
(c) For all $i$, for all $k \leq i$, either:
(i) $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero})$, or
(ii) for all $f \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$, $[\Theta_k(f) \sim \Theta_k(\mathrm{Zero})]$.
(d) For all $k$ such that $\Theta_k$ is general recursive, for all $i, j \geq k$ such that $i \neq j$, either:
(i) $\Theta_k(\varphi_{p(i)}) \sim \Theta_k(\mathrm{Zero})$, or
(ii) $\Theta_k(\varphi_{p(j)}) \sim \Theta_k(\mathrm{Zero})$, or
(iii) $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\varphi_{p(j)})$, or
(iv) for all $f, g \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$ and $\varphi_{p(j)} \subseteq g$, $[\Theta_k(f) \sim \Theta_k(g)]$.
(e) $C = \{\varphi_{p(i)} \mid i \in \mathbb{N}, \varphi_{p(i)} \in \mathcal{R}\} \notin$ NUM.

Proof. By the operator recursion theorem [Cas74], there exists a recursive $p$ such that $\varphi_{p(i)}$ may be described as follows.

Initially, we let $\varphi_{p(i)}(x) = 0$ for $x < i$, and $\varphi_{p(i)}(i) = 1$. Let $\varphi^s_{p(i)}$ denote the portion of $\varphi_{p(i)}$ defined before stage $s$. It will always be the case that the domain of $\varphi^s_{p(i)}$ is an initial segment of $\mathbb{N}$.

For all $i$, for all $k \leq i$, let $I(i,k,s)$ denote the predicate $\Theta_k(\varphi^s_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[s])$. Intuitively, $I(i,k,s)$ being true denotes that $\Theta_k(\varphi_{p(i)})$ has been made non-conforming with $\Theta_k(\mathrm{Zero})$. Clearly, if $I(i,k,s)$ is true, then $I(i,k,s+1)$ is also true.

Initially, let $\mathrm{Diag}(i) = \emptyset$ for each $i \in \mathbb{N}$. Intuitively, $\mathrm{Diag}(i)$ denotes a collection of $j$ such that in the construction we have made $\varphi_{p(i)} \neq \varphi_{\varphi_i(j)}$. Clearly, $\mathrm{Diag}(i)$ is monotonically non-decreasing.

Intuitively, in the stages below, if $s \bmod 3 = 0$, then we try to work on satisfying (c); if $s \bmod 3 = 1$, then we work on satisfying (d); and if $s \bmod 3 = 2$, then we work on satisfying (e). Let $x^s_i$ denote the least $x$ not in the domain of $\varphi^s_{p(i)}$. Go to stage 0.

Stage $s$

If $s \bmod 3 = 0$, then
1. For $i \leq s$ do:
2. If there exist a $k \leq i$ and a $\tau \in \mathrm{SEG}_{0,1}$ such that the following are satisfied:
 2.1. $\Theta_k(\varphi^s_{p(i)}) \sim \Theta_k(\mathrm{Zero}[s])$, and
 2.2. $\varphi^s_{p(i)} \subseteq \tau$ and $|\tau| \leq s$, and
 2.3. $\Theta_k(\tau) \not\sim \Theta_k(\mathrm{Zero}[s])$,
 then pick the least such $k$ and a corresponding $\tau$, and make $\varphi^{s+1}_{p(i)} = \tau$.
EndFor
3. Go to stage $s+1$.

If $s \bmod 3 = 1$, then
4. If there exist $k, i, j \in \mathbb{N}$ and $\tau, \gamma \in \mathrm{SEG}_{0,1}$ such that the following are satisfied:
 4.1. $k \leq i < j \leq s$, and
 4.2. $I(i,k,s)$ and $I(j,k,s)$ are both true, and
 4.3. $\Theta_k(\varphi^s_{p(i)}) \sim \Theta_k(\varphi^s_{p(j)})$, and
 4.4. $|\tau| \leq s$ and $|\gamma| \leq s$, and
 4.5. $\varphi^s_{p(i)} \subseteq \tau$ and $\varphi^s_{p(j)} \subseteq \gamma$, and
 4.6. $\Theta_k(\tau) \not\sim \Theta_k(\gamma)$,
 then, for the least such triple $(i,j,k)$ (in some ordering of all triples), pick one such corresponding $\tau$ and $\gamma$, and make $\varphi^{s+1}_{p(i)} = \tau$ and $\varphi^{s+1}_{p(j)} = \gamma$.
5. Go to stage $s+1$.

If $s \bmod 3 = 2$, then
6. For $i \leq s$ do:
7. Let $j$ be the least number not in $\mathrm{Diag}(i)$.
8. If $\Phi_i(j) \leq s$ and $\Phi_{\varphi_i(j)}(x^s_i) \leq s$, then let $\varphi_{p(i)}(x^s_i) = 1 \dot{-}\, \varphi_{\varphi_i(j)}(x^s_i)$, and $\mathrm{Diag}(i) = \mathrm{Diag}(i) \cup \{j\}$.
EndFor
9. Go to stage $s+1$.

End stage $s$

Clearly, (a) and (b) of the lemma hold.

We now consider (c). Consider any $i = i_0$ and $k = k_0 \leq i_0$. If $\varphi_{p(i_0)}$ is total, then (c) clearly holds. So suppose $\varphi_{p(i_0)}$ is in $\mathrm{SEG}_{0,1}$. Suppose by way of contradiction that $\Theta_{k_0}(\varphi_{p(i_0)}) \sim \Theta_{k_0}(\mathrm{Zero})$ and there exists an extension $f \in \mathcal{R}_{0,1}$ of $\varphi_{p(i_0)}$ such that $\Theta_{k_0}(f) \not\sim \Theta_{k_0}(\mathrm{Zero})$. Then there exist a finite $\tau$ with $\varphi_{p(i_0)} \subseteq \tau \subseteq f$ and an $n$ such that $\Theta_{k_0}(\tau) \not\sim \Theta_{k_0}(\mathrm{Zero}[n])$. Thus, for some stage $s > \max(|\tau|, n, i_0)$ with $s \bmod 3 = 0$, in step 2 of stage $s$, $\varphi_{p(i_0)}$ would be extended so as to make $\Theta_{k_0}(\varphi_{p(i_0)}) \not\sim \Theta_{k_0}(\mathrm{Zero})$, a contradiction. Thus (c) is satisfied.

We now consider (d). Consider $i = i_0$, $j = j_0$, and $k = k_0 \leq \min(i_0, j_0)$. Without loss of generality assume $i_0 < j_0$. Suppose $\lim_{s \to \infty} I(i_0,k_0,s)$ and $\lim_{s \to \infty} I(j_0,k_0,s)$ are both true (otherwise (d) trivially holds). Suppose by way of contradiction that $\Theta_{k_0}(\varphi_{p(i_0)}) \sim \Theta_{k_0}(\varphi_{p(j_0)})$ and there exist total $f, g \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i_0)} \subseteq f$, $\varphi_{p(j_0)} \subseteq g$, and $\Theta_{k_0}(f) \not\sim \Theta_{k_0}(g)$. Let $n$ be such that $\Theta_{k_0}(f[n]) \not\sim \Theta_{k_0}(g[n])$. Let $s$ be large enough such that $I(i_0,k_0,s)$ and $I(j_0,k_0,s)$ are both true, $[\varphi^s_{p(i_0)} = \varphi_{p(i_0)}$ or $\varphi^s_{p(i_0)} \supseteq \varphi_{p(i_0)}[n]]$, and $[\varphi^s_{p(j_0)} = \varphi_{p(j_0)}$ or $\varphi^s_{p(j_0)} \supseteq \varphi_{p(j_0)}[n]]$. Thus, for any $s' > s$ such that $s' \bmod 3 = 1$, $(i_0, j_0, k_0)$ would be a possible candidate in step 4, and for large enough $s'$ it will be the smallest triple satisfying 4.1–4.6. Thus, it would be ensured that $\Theta_{k_0}(\varphi_{p(i_0)}) \not\sim \Theta_{k_0}(\varphi_{p(j_0)})$. Thus (d) holds.


For (e), first note that $\varphi_{p(i)}$ may be modified in stages $s \bmod 3 = 0$ only finitely often, since there are only finitely many $k \leq i$. Similarly, for each $k \leq i$, $\varphi_{p(i)}$ may be modified in stages $s \bmod 3 = 1$ only finitely often, since, if $I(i,k,s)$, then there exists a finite $n$ such that $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$, and thus only $j < n$ can satisfy (4.3). It follows that, for all $i$, for all but finitely many stages: if $s \bmod 3 \neq 2$, then $\varphi^s_{p(i)} = \varphi^{s+1}_{p(i)}$.

Now suppose by way of contradiction that $C \in$ NUM. Let $i$ be such that $\varphi_i$ is total and $C \subseteq \{\varphi_{\varphi_i(j)} \mid j \in \mathbb{N}\} \subseteq \mathcal{R}$. Then consider $\varphi_{p(i)}$. It is clear that $\varphi_{p(i)}$ is total (due to steps 6–8). Also, $\mathrm{Diag}(i)$ eventually contains every $j$. Thus, for all $j$, $\varphi_{p(i)} \neq \varphi_{\varphi_i(j)}$, and hence $\varphi_{p(i)} \notin \{\varphi_{\varphi_i(j)} \mid j \in \mathbb{N}\}$, a contradiction. Thus (e) holds. $\Box$

Corollary 42. Suppose $p$ is as in Lemma 41 and $\Theta_k$ is a general recursive operator. Then, given any $i$ and $n$, exactly one of (a) and (b) below holds, and one can effectively determine which of (a) and (b) holds.

(a) $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$.
(b) For all $f \in \mathcal{R}_{0,1}$ extending $\varphi_{p(i)}$, $\Theta_k(f) \sim \Theta_k(\mathrm{Zero}[n])$.

Proof. By Lemma 41(c), we have:

(i) $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero})$, or
(ii) for all $f \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$, $[\Theta_k(f) \sim \Theta_k(\mathrm{Zero})]$.

If (ii) holds, then clearly (b) of the corollary holds. So suppose (i) holds. It follows that either $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$ or $\Theta_k(\varphi_{p(i)}) \supseteq \Theta_k(\mathrm{Zero}[n])$. If $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$, then (a) holds, and if $\Theta_k(\varphi_{p(i)}) \supseteq \Theta_k(\mathrm{Zero}[n])$, then (b) holds.

We now show how to effectively determine which of (a) and (b) holds.

Note that, if (a) holds, then there exists a $t$ such that $\Theta_k(\varphi_{p(i),t}) \not\sim \Theta_k(\mathrm{Zero}[n])$. If (b) holds, then either

(iii) $\Theta_k(\varphi_{p(i)}) \supseteq \Theta_k(\mathrm{Zero}[n])$, or
(iv) $\varphi_{p(i)}$ is finite and, for all $\tau \in \mathrm{SEG}_{0,1}$ such that $\varphi_{p(i)} \subseteq \tau$, $\Theta_k(\tau) \sim \Theta_k(\mathrm{Zero}[n])$.

Thus, if (b) holds, then there exists a $t$ such that

(iii′) $\Theta_k(\varphi_{p(i),t}) \supseteq \Theta_k(\mathrm{Zero}[n])$, or
(iv′) for all $\tau \in \mathrm{SEG}_{0,1}$ such that $|\tau| = t$ and $\varphi_{p(i),t} \sim \tau$, $\Theta_k(\tau) \supseteq \Theta_k(\mathrm{Zero}[n])$

(where (iv′) follows from (iv) using Proposition 40).


Thus, if (b) holds, we can effectively find a witness for it. It follows that one can effectively determine which of (a) and (b) holds. $\Box$

Corollary 43. Suppose $p$ is as in Lemma 41 and $\Theta_k$ is a general recursive operator. Then, given any $i$, $j$ and $n$, exactly one of (a), (b) and (c) holds, and one can effectively determine which of (a), (b) and (c) holds.

(a) [For all $f \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$, $\Theta_k(f) \sim \Theta_k(\mathrm{Zero}[n])$] or [for all $g \in \mathcal{R}_{0,1}$ such that $\varphi_{p(j)} \subseteq g$, $\Theta_k(g) \sim \Theta_k(\mathrm{Zero}[n])$];
(b) $[\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$ and $\Theta_k(\varphi_{p(j)}) \not\sim \Theta_k(\mathrm{Zero}[n])]$, and there exists an $x < n$ such that $\Theta_k(\varphi_{p(i)})(x) \neq \Theta_k(\varphi_{p(j)})(x)$;
(c) $[\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$ and $\Theta_k(\varphi_{p(j)}) \not\sim \Theta_k(\mathrm{Zero}[n])]$, and, for all $f \in \mathcal{R}_{0,1}$ extending $\varphi_{p(i)}$ and all $g \in \mathcal{R}_{0,1}$ extending $\varphi_{p(j)}$, $\Theta_k(f)[n] = \Theta_k(g)[n]$.

Proof. Note that, using Corollary 42, one can effectively determine whether (a) holds, or whether $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$ and $\Theta_k(\varphi_{p(j)}) \not\sim \Theta_k(\mathrm{Zero}[n])$. We now show that, if (a) does not hold, then one of (b) or (c) holds, and one can effectively determine which one.

By Lemma 41(d), if (a) does not hold, then we must have

(iii) $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\varphi_{p(j)})$, or
(iv) for all $f, g \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$ and $\varphi_{p(j)} \subseteq g$, $[\Theta_k(f) \sim \Theta_k(g)]$.

If (iii) holds, then clearly either (b) of the corollary holds, or both $\Theta_k(\varphi_{p(i)})$ and $\Theta_k(\varphi_{p(j)})$ have domain a superset of $\{0, 1, \ldots, n-1\}$ and $\Theta_k(\varphi_{p(i)})[n] = \Theta_k(\varphi_{p(j)})[n]$, and thus (c) is satisfied. Moreover, a witness for either of these cases can be found effectively (since in the first case there exist $x < n$ and $t$ such that $\Theta_k(\varphi_{p(i),t})(x) \neq \Theta_k(\varphi_{p(j),t})(x)$, and in the second case there exists a $t$ such that the domains of $\Theta_k(\varphi_{p(i),t})$ and $\Theta_k(\varphi_{p(j),t})$ are supersets of $\{0, 1, \ldots, n-1\}$ and $\Theta_k(\varphi_{p(i),t})[n] = \Theta_k(\varphi_{p(j),t})[n]$).

So suppose (iv) holds. Then clearly (c) of the corollary holds. Thus, by Proposition 40, there exists a $t$ such that

(iv′) for all $\tau$ such that $|\tau| = t$ and $\varphi_{p(i),t} \sim \tau$, and for all $\gamma$ such that $|\gamma| = t$ and $\varphi_{p(j),t} \sim \gamma$: $\Theta_k(\tau) \sim \Theta_k(\gamma)$, and the domains of both $\Theta_k(\tau)$ and $\Theta_k(\gamma)$ are supersets of $\{0, \ldots, n-1\}$.

Thus, we have a witness for (c) holding. It follows that one can effectively determine which of (a), (b) or (c) holds. $\Box$

Theorem 44. RobustRCons $\not\subseteq$ TCons. (Moreover, the non-inclusion can be witnessed by a class from $\mathcal{R}_{0,1}$.)

Proof. Let $p$ be as in Lemma 41. Let $C = \{\varphi_{p(i)} \mid i \in \mathbb{N} \wedge \varphi_{p(i)} \in \mathcal{R}\}$. Then, by part (b) of Lemma 41, $C \subseteq \mathcal{R}_{0,1}$, and, by part (e), $C \notin$ NUM. Thus, by Corollary 20, $C \notin$ TCons.


We now show that $C \in$ RobustRCons. To show this, suppose $\Theta_k$ is a general recursive operator. We need to show that $\Theta_k(C) \in$ RCons. Let $S = \{\varphi_{p(i)} \in C \mid i \leq k\}$. Define $M$ as follows:

$M(g[m])$:
1. If $g[m] \sim \Theta_k(\mathrm{Zero})$, then output a standard program for $\Theta_k(\mathrm{Zero})$.
2. Else, if $g[m] \sim \Theta_k(f)$ for some $f \in S$, then output a standard program for one such $\Theta_k(f)$.
3. Else, let $n$ be such that $\Theta_k(\mathrm{Zero}[n]) \not\sim g[m]$.
(* Note that, if $g \in \Theta_k(C)$ and we are at this point of the construction, then $g$ must be $\Theta_k(\varphi_{p(i)})$ for some $i$ such that $k \leq i < n$. *)
4. Using Corollary 42, determine $A = \{i \mid k \leq i < n \wedge \Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])\}$.
(* Note that, by Corollary 42, for $k \leq i < n$ and $i \notin A$, either $\varphi_{p(i)}$ is not total or $\Theta_k(\varphi_{p(i)}) \sim \Theta_k(\mathrm{Zero}[n]) \not\sim g[m]$; thus $g \neq \Theta_k(\varphi_{p(i)})$. *)
5. Let $B$ be a subset of $A$ such that:
for $i \in A - B$, there exists an $x < m$ such that $\Theta_k(\varphi_{p(i)})(x) \neq g(x)$;
for distinct $i, j \in B$, for all $f, h \in \mathcal{R}_{0,1}$ such that $f \supseteq \varphi_{p(i)}$ and $h \supseteq \varphi_{p(j)}$: $\Theta_k(f)[m] = \Theta_k(h)[m]$.
(* Note that, by Corollary 43, such a $B$ exists, and one such $B$ can be found effectively. *)
(* Note that elements in $A - B$ can safely be ignored. *)
6. Output a (standard) program for $\mathrm{Union}(\{\Theta_k(\varphi_{p(i)}) \mid i \in B\})$.
End

We claim that the above $M$ RCons-identifies $\Theta_k(C)$.

Clearly, $M$ is defined on each initial segment. Suppose $g \in \Theta_k(C)$. To see that $M$ Cons-identifies $g$, we argue as follows.

If $g = \Theta_k(\mathrm{Zero})$, or if $g = \Theta_k(\varphi_{p(i)})$ for some $i \leq k$, then clearly, by steps 1 and 2, $M$ is consistent on $g$ and Ex-identifies $g$.

If $g \neq \Theta_k(\mathrm{Zero})$, then let $n$ be the least value such that $g \not\sim \Theta_k(\mathrm{Zero}[n])$. Thus, $g$ must be one of $\Theta_k(\varphi_{p(i)})$ for $i < n$. We have already handled the case $i \leq k$, so assume $k < i < n$ in the following.

We first show that $M$ is consistent on these $g$. Note that steps 1 and 2 are clearly consistent with the input. So we only consider $m$ such that $M(g[m])$ in the above construction reaches step 3. For these $m$, by the construction of $A$: if $i \notin A$, then either $\varphi_{p(i)}$ is not total, or $\Theta_k(\varphi_{p(i)})$ is conforming with $\Theta_k(\mathrm{Zero}[n])$ and thus not equal to $g$. Similarly, if $i \in A - B$, then $\varphi_{p(i)}$ is not total or $\Theta_k(\varphi_{p(i)})$ is non-conforming with $g$.

Now, since $g \in \Theta_k(C)$, there exists an $i \in B$ such that $\Theta_k(\varphi_{p(i)}) = g$, and, since for all $i', j' \in B$ the functions $\Theta_k(\varphi_{p(i')})$ and $\Theta_k(\varphi_{p(j')})$ do not differ on $\{0, 1, \ldots, m-1\}$, we have that $g[m] \subseteq \mathrm{Union}(\{\Theta_k(\varphi_{p(i)}) \mid i \in B\})$. Thus, $M$ is consistent on $g$.


To show that $M$ Ex-identifies $g$, note that $B$ is monotonically non-increasing (with respect to $g[m]$). Also, the limiting (with respect to $m$) value of $B$ satisfies the property:

for any $i, j \in B$, for any $f, h \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$ and $\varphi_{p(j)} \subseteq h$: $[\Theta_k(f) \sim \Theta_k(h)]$.

Using the fact that $g = \Theta_k(\varphi_{p(i)})$ for some $i \in B$, it immediately follows that $M$ Ex-identifies $g$. $\Box$

Corollary 45. RobustRCons $\supsetneq$ NUM.

Proof. The inclusion follows from Corollary 16(b) and Theorem 10(a). The proper inclusion now follows from Theorems 44 and 10(a). $\Box$

We now show that RConf and RCons can be separated robustly; see Theorem 49 below. A portion of the proof is similar to that of Theorem 44. We need the following modification of Lemma 41.

Lemma 46. There exists a recursive $p$ such that (a)–(e) are satisfied.

(a) $\varphi_{p(i)}(i) = 1$ and $\varphi_{p(i)}(x) = 0$ for $x < i$.
(b) $\varphi_{p(i)} \in \mathcal{R}_{0,1}$ or $\varphi_{p(i)} \in \mathrm{SEG}_{0,1}$.
(c) For all $i$, for all $k \leq i$, either:
(i) $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero})$, or
(ii) for all $f \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$, $[\Theta_k(f) \sim \Theta_k(\mathrm{Zero})]$.
(d) For all $k$ such that $\Theta_k$ is general recursive, for all $i, j \geq k$ such that $i \neq j$, either:
(i) $\Theta_k(\varphi_{p(i)}) \sim \Theta_k(\mathrm{Zero})$, or
(ii) $\Theta_k(\varphi_{p(j)}) \sim \Theta_k(\mathrm{Zero})$, or
(iii) $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\varphi_{p(j)})$, or
(iv) for all $f, g \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i)} \subseteq f$ and $\varphi_{p(j)} \subseteq g$, $[\Theta_k(f) \sim \Theta_k(g)]$.
(e) Let $C = \{\varphi_{p(i)} \in \mathcal{R} \mid M_i \text{ is total}\} \cup \{r\text{-extension of } \varphi_{p(i)} \mid M_i \text{ is total and } \varphi_{p(i)} \text{ is not total and } r \in \{0,1\}\}$.

Then $C \notin$ RCons, and there exists a recursive $H$ such that the following two conditions are satisfied.

(e.1) If $M_i$ is total, then $\lim_{t \to \infty} H(i,t)\!\downarrow\, = 1$ iff $\varphi_{p(i)}$ is total, and $\lim_{t \to \infty} H(i,t)\!\downarrow\, = 0$ iff $\varphi_{p(i)}$ is not total.
(e.2) If $M_i$ is not total, then $\lim_{t \to \infty} H(i,t)$ is defined. Moreover, if $\lim_{t \to \infty} H(i,t) = 0$, then $\varphi_{p(i)}$ is not total.


Proof. The proof of this lemma is a modification of the proof of Lemma 41, where we make appropriate changes so that condition (e) is satisfied.

By the operator recursion theorem [Cas74], there exists a recursive $p$ such that $\varphi_{p(i)}$ may be described as follows.

Initially, we let $\varphi_{p(i)}(x) = 0$ for $x < i$, and $\varphi_{p(i)}(i) = 1$. Let $\varphi^s_{p(i)}$ denote the portion of $\varphi_{p(i)}$ defined before stage $s$. It will always be the case that the domain of $\varphi^s_{p(i)}$ is an initial segment of $\mathbb{N}$.

For all $i$, for all $k \leq i$, let $I(i,k,s)$ denote the predicate $\Theta_k(\varphi^s_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[s])$. Intuitively, $I(i,k,s)$ being true denotes that $\Theta_k(\varphi_{p(i)})$ has been made non-conforming with $\Theta_k(\mathrm{Zero})$. Clearly, if $I(i,k,s)$ is true, then $I(i,k,s+1)$ is also true.

Intuitively, in the stages below, if $s \bmod 3 = 0$, then we try to work on satisfying (c); if $s \bmod 3 = 1$, then we work on satisfying (d); and if $s \bmod 3 = 2$, then we work on satisfying (e). Let $x^s_i$ denote the least $x$ not in the domain of $\varphi^s_{p(i)}$. Go to stage 0.

Stage $s$

If $s \bmod 3 = 0$, then
1. For $i \leq s$ do:
2. If there exist a $k \leq i$ and a $\tau \in \mathrm{SEG}_{0,1}$ such that the following are satisfied:
 2.1. $\Theta_k(\varphi^s_{p(i)}) \sim \Theta_k(\mathrm{Zero}[s])$, and
 2.2. $\varphi^s_{p(i)} \subseteq \tau$ and $|\tau| \leq s$, and
 2.3. $\Theta_k(\tau) \not\sim \Theta_k(\mathrm{Zero}[s])$,
 then pick the least such $k$ and a corresponding $\tau$, and make $\varphi^{s+1}_{p(i)} = \tau$.
EndFor
3. Go to stage $s+1$.

If $s \bmod 3 = 1$, then
4. If there exist $k, i, j \in \mathbb{N}$ and $\tau, \gamma \in \mathrm{SEG}_{0,1}$ such that the following are satisfied:
 4.1. $k \leq i < j \leq s$, and
 4.2. $I(i,k,s)$ and $I(j,k,s)$ are both true, and
 4.3. $\Theta_k(\varphi^s_{p(i)}) \sim \Theta_k(\varphi^s_{p(j)})$, and
 4.4. $|\tau| \leq s$ and $|\gamma| \leq s$, and
 4.5. $\varphi^s_{p(i)} \subseteq \tau$ and $\varphi^s_{p(j)} \subseteq \gamma$, and
 4.6. $\Theta_k(\tau) \not\sim \Theta_k(\gamma)$,
 then, for the least such triple $(i,j,k)$ (in some ordering of all triples), pick one such corresponding $\tau$ and $\gamma$, and make $\varphi^{s+1}_{p(i)} = \tau$ and $\varphi^{s+1}_{p(j)} = \gamma$.
5. Go to stage $s+1$.

If $s \bmod 3 = 2$, then
6. If $M_i(\varphi^s_{p(i)} \diamond 0)$ and $M_i(\varphi^s_{p(i)} \diamond 1)$ both converge within $s$ steps, then
7. If $M_i(\varphi^s_{p(i)} \diamond 0) \neq M_i(\varphi^s_{p(i)} \diamond 1)$,
 7.1 then let $w \in \{0,1\}$ be such that $M_i(\varphi^s_{p(i)} \diamond w) \neq M_i(\varphi^s_{p(i)})$, and set $\varphi_{p(i)}(x^s_i) = w$ and go to stage $s+1$;
 7.2 else go to stage $s+1$.
Else go to stage $s+1$.

End stage $s$

Clearly, (a) and (b) of the lemma hold.

We now consider (c). Consider any $i = i_0$ and $k = k_0 \leq i_0$. If $\varphi_{p(i_0)}$ is total, then (c) clearly holds. So suppose $\varphi_{p(i_0)}$ is in $\mathrm{SEG}_{0,1}$. Suppose by way of contradiction that $\Theta_{k_0}(\varphi_{p(i_0)}) \sim \Theta_{k_0}(\mathrm{Zero})$ and there exists an extension $f \in \mathcal{R}_{0,1}$ of $\varphi_{p(i_0)}$ such that $\Theta_{k_0}(f) \not\sim \Theta_{k_0}(\mathrm{Zero})$. Then there exist a finite $\tau$ with $\varphi_{p(i_0)} \subseteq \tau \subseteq f$ and an $n$ such that $\Theta_{k_0}(\tau) \not\sim \Theta_{k_0}(\mathrm{Zero}[n])$. Thus, for some stage $s > \max(|\tau|, n, i_0)$ with $s \bmod 3 = 0$, in step 2 of stage $s$, $\varphi_{p(i_0)}$ would be extended so as to make $\Theta_{k_0}(\varphi_{p(i_0)}) \not\sim \Theta_{k_0}(\mathrm{Zero})$, a contradiction. Thus, (c) is satisfied.

We now consider (d). Consider $i = i_0$, $j = j_0$, and $k = k_0 \leq \min(i_0, j_0)$. Without loss of generality assume $i_0 < j_0$. Suppose $\lim_{s \to \infty} I(i_0,k_0,s)$ and $\lim_{s \to \infty} I(j_0,k_0,s)$ are both true (otherwise (d) trivially holds). Suppose by way of contradiction that $\Theta_{k_0}(\varphi_{p(i_0)}) \sim \Theta_{k_0}(\varphi_{p(j_0)})$ and there exist total $f, g \in \mathcal{R}_{0,1}$ such that $\varphi_{p(i_0)} \subseteq f$, $\varphi_{p(j_0)} \subseteq g$, and $\Theta_{k_0}(f) \not\sim \Theta_{k_0}(g)$. Let $n$ be such that $\Theta_{k_0}(f[n]) \not\sim \Theta_{k_0}(g[n])$. Let $s$ be large enough such that $I(i_0,k_0,s)$ and $I(j_0,k_0,s)$ are both true, $[\varphi^s_{p(i_0)} = \varphi_{p(i_0)}$ or $\varphi^s_{p(i_0)} \supseteq \varphi_{p(i_0)}[n]]$, and $[\varphi^s_{p(j_0)} = \varphi_{p(j_0)}$ or $\varphi^s_{p(j_0)} \supseteq \varphi_{p(j_0)}[n]]$. Thus, for any $s' > s$ such that $s' \bmod 3 = 1$, $(i_0, j_0, k_0)$ would be a possible candidate in step 4, and for large enough $s'$ it will be the smallest triple satisfying 4.1–4.6. Thus, it would be ensured that $\Theta_{k_0}(\varphi_{p(i_0)}) \not\sim \Theta_{k_0}(\varphi_{p(j_0)})$. Thus (d) holds.

For (e), first note that $\varphi_{p(i)}$ may be modified in stages $s \bmod 3 = 0$ only finitely often, since there are only finitely many $k \leq i$. Similarly, for each $k \leq i$, $\varphi_{p(i)}$ may be modified in stages $s \bmod 3 = 1$ only finitely often, since, if $I(i,k,s)$, then there exists a finite $n$ such that $\Theta_k(\varphi_{p(i)}) \not\sim \Theta_k(\mathrm{Zero}[n])$, and thus only $j < n$ can satisfy (4.3). It follows that, for all $i$, for all but finitely many stages: if $s \bmod 3 \neq 2$, then $\varphi^s_{p(i)} = \varphi^{s+1}_{p(i)}$.

We now show that C ∉ RCons. Suppose M_i is total. Let t be such that, for any stage s > t with s mod 3 ≠ 2, φ^s_{p(i)} = φ^{s+1}_{p(i)}. Note that there exists such a stage t. Now note that the condition in step 6 succeeds infinitely often. If 7.1 is executed infinitely often, then φ_{p(i)} is total but M_i makes infinitely many mind changes on φ_{p(i)}. On the other hand, if 7.2 is executed in any stage s > t with s mod 3 = 2, then step 7.2 would be executed in every stage s′ > s with s′ mod 3 = 2 (since φ_{p(i)} = φ^s_{p(i)} in this case), φ_{p(i)} is finite, and M_i is not consistent on at least one r-extension of φ_{p(i)}, for r ∈ {0, 1} (since M_i(φ^s_{p(i)} 0) = M_i(φ^s_{p(i)} 1)). Thus, again, M_i does not RCons-identify C. Thus, C ∉ RCons.

Let H(i, s) = 0 if in stage 3s + 2 the above construction executes step 7.2, and H(i, s) = 1 otherwise. It is easy to verify that H satisfies the requirements in (e). □


Corollary 47. Suppose p is as in Lemma 46 and Θ_k is a general recursive operator. Then, given any i and n, exactly one of (a) and (b) below holds, and one can effectively determine which of (a) and (b) holds.

(a) Θ_k(φ_{p(i)}) ≁ Θ_k(Zero[n]).
(b) For all f ∈ R_{0,1} extending φ_{p(i)}, Θ_k(f) ∼ Θ_k(Zero[n]).

Proof. By Lemma 46(c), we have:

(i) Θ_k(φ_{p(i)}) ≁ Θ_k(Zero), or
(ii) for all f ∈ R_{0,1} such that φ_{p(i)} ⊆ f, Θ_k(f) ∼ Θ_k(Zero).

If (ii) holds, then clearly (b) of the corollary holds. So suppose (i) holds. It follows that either Θ_k(φ_{p(i)}) ≁ Θ_k(Zero[n]) or Θ_k(φ_{p(i)}) ⊇ Θ_k(Zero[n]). If Θ_k(φ_{p(i)}) ≁ Θ_k(Zero[n]), then (a) holds, and if Θ_k(φ_{p(i)}) ⊇ Θ_k(Zero[n]), then (b) holds.

We now show how to effectively determine which of (a) and (b) holds. Note that if (a) holds, then there exists a t such that Θ_k(φ_{p(i),t}) ≁ Θ_k(Zero[n]). If (b) holds, then either

(iii) Θ_k(φ_{p(i)}) ⊇ Θ_k(Zero[n]), or
(iv) φ_{p(i)} is finite and, for all τ ∈ SEG_{0,1} such that φ_{p(i)} ⊆ τ, Θ_k(τ) ∼ Θ_k(Zero[n]).

Thus, if (b) holds, then there exists a t such that

(iii′) Θ_k(φ_{p(i),t}) ⊇ Θ_k(Zero[n]), or
(iv′) for all τ ∈ SEG_{0,1} such that |τ| = t and φ_{p(i),t} ∼ τ, Θ_k(τ) ⊇ Θ_k(Zero[n]) (where (iv′) follows from (iv) using Proposition 40).

Thus, if (b) holds, we can effectively find a witness for it. It follows that one can determine effectively which of (a) and (b) holds. □

Corollary 48. Suppose p is as in Lemma 46 and Θ_k is a general recursive operator. Then, given any i, j and n, exactly one of (a), (b) and (c) holds, and one can effectively determine which of (a), (b) and (c) holds.

(a) [for all f ∈ R_{0,1} such that φ_{p(i)} ⊆ f, Θ_k(f) ∼ Θ_k(Zero[n])] or [for all g ∈ R_{0,1} such that φ_{p(j)} ⊆ g, Θ_k(g) ∼ Θ_k(Zero[n])];
(b) [Θ_k(φ_{p(i)}) ≁ Θ_k(Zero[n]) and Θ_k(φ_{p(j)}) ≁ Θ_k(Zero[n])] and there exists an x < n such that Θ_k(φ_{p(i)})(x) ≠ Θ_k(φ_{p(j)})(x);
(c) [Θ_k(φ_{p(i)}) ≁ Θ_k(Zero[n]) and Θ_k(φ_{p(j)}) ≁ Θ_k(Zero[n])] and, for all f ∈ R_{0,1} extending φ_{p(i)} and all g ∈ R_{0,1} extending φ_{p(j)}, Θ_k(f)[n] = Θ_k(g)[n].


Proof. Note that using Corollary 47 one can effectively determine whether (a) holds, or Θ_k(φ_{p(i)}) ≁ Θ_k(Zero[n]) and Θ_k(φ_{p(j)}) ≁ Θ_k(Zero[n]). We now show that if (a) does not hold, then one of (b) or (c) holds, and one can effectively determine which one.

By Lemma 46(d), if (a) does not hold, then we must have

(iii) Θ_k(φ_{p(i)}) ≁ Θ_k(φ_{p(j)}), or
(iv) for all f, g ∈ R_{0,1} such that φ_{p(i)} ⊆ f and φ_{p(j)} ⊆ g, Θ_k(f) ∼ Θ_k(g).

If (iii) holds, then clearly either (b) of the corollary holds, or both Θ_k(φ_{p(i)}) and Θ_k(φ_{p(j)}) have a domain that is a superset of {0, 1, …, n−1} and Θ_k(φ_{p(i)})[n] = Θ_k(φ_{p(j)})[n], and thus (c) is satisfied. Moreover, a witness for either of the above cases can be found effectively (since in the first case there exist a t and an x < n such that Θ_k(φ_{p(i),t})(x) ≠ Θ_k(φ_{p(j),t})(x), and in the second case there exists a t such that the domains of Θ_k(φ_{p(i),t}) and Θ_k(φ_{p(j),t}) are supersets of {0, 1, …, n−1} and Θ_k(φ_{p(i),t})[n] = Θ_k(φ_{p(j),t})[n]).

So suppose (iv) holds. Then clearly (c) of the corollary holds. Thus, by Proposition 40, there exists a t such that

(iv′) for all τ such that |τ| = t and φ_{p(i),t} ∼ τ, and for all γ such that |γ| = t and φ_{p(j),t} ∼ γ, we have Θ_k(τ) ∼ Θ_k(γ), and the domains of both Θ_k(τ) and Θ_k(γ) are supersets of {0, …, n−1}.

Thus, we have a witness for (c) holding. It follows that one can determine effectively which of (a), (b) or (c) holds. □

Theorem 49. RobustRConf ⊄ RCons. (Moreover, the non-inclusion can be witnessed using a class from R_{0,1}.)

Proof. Let p, H be as in Lemma 46.

Let C = {φ_{p(i)} ∈ R | M_i is total} ∪ {r-extension of φ_{p(i)} | M_i is total, φ_{p(i)} is not total, and r ∈ {0, 1}}. Then, by part (e) of Lemma 46, C ∉ RCons.

We now show that C ∈ RobustRConf. To show this, suppose Θ_k is a general recursive operator. We need to show that Θ_k(C) ∈ RConf. Let S = {φ_{p(i)} ∈ C | i ≤ k} ∪ {r-extension of φ_{p(i)} | r ∈ {0, 1}, i ≤ k, and the r-extension of φ_{p(i)} is in C}. Define M as follows:

M(g[m])
1. If g[m] ∼ Θ_k(Zero), then output a standard program for Θ_k(Zero).
2. Else, if g[m] ∼ Θ_k(f) for some f ∈ S, then output a standard program for one such Θ_k(f).
3. Else, let n be such that Θ_k(Zero[n]) ≁ g[m].
   (∗ Note that if g ∈ Θ_k(C) and we are at this point of the construction, then g must be Θ_k(φ_{p(i)}), or Θ_k(r-extension of φ_{p(i)}) for some r ∈ {0, 1}, for some i such that k ≤ i < n. ∗)


4. Using Corollary 47, determine A = {i | k ≤ i < n ∧ Θ_k(φ_{p(i)}) ≁ Θ_k(Zero[n])}.
   (∗ Note that, by Corollary 42, for k ≤ i < n with i ∉ A, either φ_{p(i)} is not total or Θ_k(φ_{p(i)}) ∼ Θ_k(Zero[n]) ≁ g[m]; thus, g ≠ Θ_k(φ_{p(i)}). ∗)
5. Let B be a subset of A such that:
   for i ∈ A − B, there exists an x < m such that Θ_k(φ_{p(i)})(x) ≠ g(x);
   for distinct i, j ∈ B, for all f, h ∈ R_{0,1} such that f ⊇ φ_{p(i)} and h ⊇ φ_{p(j)}, Θ_k(f)[m] = Θ_k(h)[m].
   (∗ Note that, by Corollary 48, such a B exists, and one such B can be found effectively. ∗)
6. For each i ∈ B such that H(i, m) = 0, let h_i denote
   the 0-extension of φ_{p(i),m}, if Θ_k(0-extension of φ_{p(i),m}) ⊇ g[m];
   the 1-extension of φ_{p(i),m}, if Θ_k(1-extension of φ_{p(i),m}) ⊇ g[m] and Θ_k(0-extension of φ_{p(i),m}) ⊉ g[m];
   Λ, if Θ_k(1-extension of φ_{p(i),m}) ⊉ g[m] and Θ_k(0-extension of φ_{p(i),m}) ⊉ g[m].
7. Output a (standard) program for Union({Θ_k(φ_{p(i)}) | i ∈ B} ∪ {Θ_k(h_i) | i ∈ B, H(i, m) = 0}).
End

We claim that the above M RConf-identifies Θ_k(C). Clearly, M is defined on each initial segment. Suppose g ∈ Θ_k(C). To see that M Conf-identifies g, we argue as follows.

If g = Θ_k(Zero), or g = Θ_k(φ_{p(i)}) for some i ≤ k, or g = Θ_k(r-extension of φ_{p(i)}) for some r ∈ {0, 1} and some i ≤ k, then clearly, by steps 1 and 2, M is conforming on g and Ex-identifies g.

If g ≠ Θ_k(Zero), then let n be the least value such that g ≁ Θ_k(Zero[n]). Thus, g must be Θ_k(φ_{p(i)}) for some i < n, or Θ_k(r-extension of φ_{p(i)}), r ∈ {0, 1}, for some i < n. We have already handled the case i ≤ k, so assume k < i < n in the following.

We first show that M is conforming on these g. Note that steps 1 and 2 are clearly consistent with the input. So we only consider m such that M(g[m]) in the above construction reaches step 3. For these m, by the construction of A, if i ∉ A, then for all total f extending φ_{p(i)}, Θ_k(f) is conforming with Θ_k(Zero[n]) and thus not equal to g. Similarly, if i ∈ A − B, then for all total f extending φ_{p(i)}, Θ_k(f) is non-conforming with g.

Now, since g ∈ Θ_k(C), there exists an i ∈ B such that Θ_k(φ_{p(i)}) = g, or lim_{t→∞} H(i, t) = 0, φ_{p(i)} is not total, and g = Θ_k(r-extension of φ_{p(i)}) for some r ∈ {0, 1}. Fix one such i.

Now, for j ∈ B, j ≠ i, for any total extension f of φ_{p(i)} and any total extension h of φ_{p(j)}, Θ_k(f) and Θ_k(h) do not differ on {0, 1, …, m−1}; thus, for all j ∈ B, Θ_k(φ_{p(j)}) is conforming with g[m]. Also, by the definition of h_j in step 6, g[m] is conforming with Θ_k(h_j), for all j ∈ B such that H(j, m) = 0. Thus, M is conforming on g.

To show that M Ex-identifies g, first note that B is monotonically non-increasing (with respect to g[m]). Also, for all j ∈ B, lim_{t→∞} H(j, t) converges, and if lim_{t→∞} H(j, t) = 0, then h_j stabilizes as m goes to ∞.


Thus, if g = Θ_k(φ_{p(i)}), then clearly {Θ_k(φ_{p(j)}) | j ∈ B} contains g. On the other hand, if φ_{p(i)} is not total and g is Θ_k(r-extension of φ_{p(i)}) for some r ∈ {0, 1}, then g = Θ_k(h_i), for the limiting value of h_i (with respect to m). It follows that, for large enough m, in the definition of M(g[m]),

g ⊆ Union({Θ_k(φ_{p(i)}) | i ∈ B} ∪ {Θ_k(h_i) | i ∈ B, H(i, m) = 0}).

Along with the conformity of M, the above implies that M Ex-identifies g. □
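The amalgamation in step 7 can be illustrated with a minimal sketch. Here partial functions are modelled as Python dicts, and we assume (as the proof guarantees for the functions fed to Union) that the inputs are pairwise conforming; the function name and representation are illustrative, not the paper's formal definition of Union over programs.

```python
def union_of_functions(partials):
    """Sketch of the 'Union' amalgamation used in step 7 of the learner
    in Theorem 49: given finitely many partial functions (as dicts) that
    are pairwise conforming (they never disagree where both are defined),
    return their union as a single partial function."""
    out = {}
    for f in partials:
        for x, y in f.items():
            # Conformity assumption: no two inputs disagree at any x.
            assert out.get(x, y) == y, "inputs were not pairwise conforming"
            out[x] = y
    return out

print(union_of_functions([{0: 1, 2: 5}, {1: 3, 2: 5}]))
```

The union of conforming partial functions is again a partial function extending each of them, which is exactly what the learner's final conjecture needs.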

Corollary 50. RobustRConf ⊋ NUM.

Proof. The inclusion follows from Corollary 16(b) and Theorem 10(a). The proper inclusion now follows from Theorems 49 and 10(a). □

Proposition 51. UniformRobustConfident ⊄ RConf.

Proof. The class C used in the proof of Theorem 35 is also in UniformRobustConfident. □

Corollary 52. UniformRobustConfident ⊄ NUM.

Proof. This follows immediately from Proposition 51 and Theorem 10(a). □

Note that, by Theorem 10(h), NUM ⊄ Confident. Thus, using Corollary 52, NUM and Confident are incomparable. The following proposition strengthens Corollary 34.

Proposition 53. UniformRobustEx ⊄ Confident ∪ NUM.

Proof. Let F, S be as in the proof of Theorem 32. Thus, S ∈ UniformRobustEx and S ∉ NUM, by the proof of Corollary 34. Now we show that S ∉ Confident. Suppose that a total recursive and confident learner M Ex-identifies S. This learner M makes only finitely many mind changes on F; in particular, there exists an n such that, for all n′ > n, M(F[n′]) = M(F[n]). Now consider the following set B of strings:

B = {τ | τ extends F[n] and M(τ) ≠ M(F[n])}.

Since σ ⊄ F for all σ ∈ B, by the definition of a 1-generic function there exists an x > n such that, for all τ ⊇ F[x], M(τ) = M(F[x]). Thus, M can learn at most one function extending F[x]. However, by the definition of S, there are infinitely many functions extending F[x] in S. Thus, M cannot witness that S ∈ Confident. □

Zeugmann [Zeu86] already proved the following theorem.

Theorem 54 (Zeugmann [Zeu86]). RobustReliable = NUM.


Corollary 55. RobustTCons = NUM.

Proof. This follows immediately from Theorem 54 and Corollary 16(a). □

5. Robust learning versus uniformly robust learning

While robust learning only requires every transformed class to be learnable in the mere sense that some learning machine for it exists, uniformly robust learning requires that such a machine for the transformed class be obtained effectively. For a number of learning types, we now show that uniformly robust learning is indeed stronger than robust learning.

Definition 56. A function F is i-generic iff for all sets A ⊆ SEG in Σ_i, there exists an x such that either

(a) F[x] ∈ A, or
(b) for all σ ∈ A, F[x] ⊄ σ.

Proposition 57. There exist a recursive binary tree T and a limiting recursive function mapping τ ∈ SEG_{0,1} to μ_τ ∈ SEG_{0,1} such that the following four properties are satisfied.

(a) For all σ ∈ SEG_{0,1}, μ_σ ∈ T.
(b) For all σ, τ ∈ SEG_{0,1}, σ ⊂ τ iff μ_σ ⊂ μ_τ.
(c) The only infinite branches of T are the μ_g, where g ∈ T_{0,1} and μ_g = ⋃_{n∈ℕ} μ_{g[n]}.
(d) For all τ ∈ SEG_{0,1}, for all i, j ≤ |τ|, either
   (d.1) Θ_i(μ_τ) ≁ φ_j, or
   (d.2) for all σ ⊇ τ, σ ∈ SEG_{0,1}, Θ_i(μ_σ) ∼ φ_j.

Proof. We first define μ_τ, for τ ∈ SEG_{0,1}, in stages below. μ^s_τ denotes the value of μ_τ just before the start of stage s. It will be the case that μ^s_τ ∈ SEG_{0,1} and, for all s and all σ, γ ∈ SEG_{0,1}, σ ⊂ γ iff μ^s_σ ⊂ μ^s_γ. Also, μ_τ = lim_{s→∞} μ^s_τ ↓. For n ≤ |σ|, let σ[m, n] denote σ(m)σ(m+1)⋯σ(n−1). Initially, let μ^0_τ = τ. Go to stage 0.

Stage s
1. Search for τ, σ ∈ SEG_{0,1} and i, j ∈ ℕ such that
   |τ|, |σ| ≤ s;
   i, j ≤ |τ|;
   τ ⊆ σ;
   Θ_i(μ^s_τ) ∼ φ_{j,s};
   Θ_i(μ^s_σ) ≁ φ_{j,s}.


2. If there exist such τ, σ, i, j, then for the least such τ and a corresponding σ let
   μ^{s+1}_τ = μ^s_σ;
   for all γ ⊇ τ, μ^{s+1}_γ = μ^s_{σγ[|τ|,|γ|]}.
3. For all γ such that μ^{s+1}_γ is not defined in step 2, let μ^{s+1}_γ = μ^s_γ.
4. Go to stage s+1.
End stage s

Since each τ can be chosen in step 2 only finitely often—once for each pair i, j ≤ |τ|—after all μ_γ, γ ⊂ τ, have stabilized, it is easy to show by induction on |τ| that μ^{s+1}_τ can be defined via step 2 above only finitely often. Thus, μ_τ = lim_{s→∞} μ^s_τ ↓.

Let T = {σ | (∃τ)(∃t ≥ |σ|)[σ ⊆ μ^t_τ]}.

It is easy to verify that (a) and (b) are satisfied. To see (c), suppose h ∈ T_{0,1} is not of the form μ_g for any g ∈ T_{0,1}. Let X_h = {τ ∈ SEG_{0,1} | μ_τ ⊆ h}. Let Z = ⋃_{τ∈X_h} τ. By part (b), if X_h is infinite, then Z ∈ T_{0,1} and h = μ_Z. Thus, X_h is finite. It follows that, for all τ ∈ SEG_{0,1} such that |τ| = |Z| + 1, μ_τ ⊄ h. Let w = max({|μ_τ| | τ ∈ SEG_{0,1} ∧ |τ| = |Z| + 1}). Then it follows that h[w] is not a prefix of any extension of μ_τ, for any τ of length |Z| + 1. Let s be such that, for all τ ∈ SEG_{0,1} with |τ| ≤ |Z| + 1 and all t ≥ s, μ^t_τ = μ_τ. Now it follows that h[w] ⊄ μ^t_τ, for any t ≥ s and any τ ∈ SEG_{0,1}. Thus, it follows that h[max(w, s)] ⊄ μ^t_τ, for any t ≥ max(w, s) and any τ ∈ SEG_{0,1}. Thus, h[max(w, s)] ∉ T. This proves (c).

To see (d), suppose i, j ≤ |τ|, σ ⊇ τ, and Θ_i(μ_τ) ∼ φ_j but Θ_i(μ_σ) ≁ φ_j. Then let s be such that s ≥ max(i, j, |τ|, |σ|), Θ_i(μ_τ) ∼ φ_{j,s}, Θ_i(μ_σ) ≁ φ_{j,s}, and, for all t ≥ s, μ^t_τ = μ_τ and μ^t_σ = μ_σ. But then, for all t ≥ s, the tuple τ, σ, i, j will satisfy the conditions in step 1. Thus, for large enough t > s, τ will be the least candidate picked in stage t, step 2. Hence, μ^t_τ ≠ μ^{t+1}_τ, a contradiction. Consequently, (d) holds. □

Theorem 58. UniformRobustConfident ⊂ RobustConfident.

Proof. Clearly, UniformRobustConfident ⊆ RobustConfident. We thus only need to show that RobustConfident − UniformRobustConfident ≠ ∅.

Fix T, μ_τ for τ ∈ SEG_{0,1}, and μ_g for g ∈ T_{0,1}, as in Proposition 57. Let F be a {0, 1}-valued 3-generic function. Let f′ = μ_F. Let S = {f′[x] 0^∞ | x ∈ ℕ}. For the rest of the proof, let σ, τ range over SEG_{0,1}.

Claim 59. For all e such that Θ_e is general recursive, there exist x_e, j_e such that φ_{j_e} ∈ R and

(A) (∀τ ⊇ F[x_e])[Θ_e(μ_τ) ⊆ φ_{j_e}], or
(B) (∀φ_j ∈ R)(∀τ ⊇ F[x_e] with |τ| ≥ max(e, j))[Θ_e(μ_τ) ≁ φ_j].

Proof. Suppose Θ_e is general recursive. Let R_e = {τ | (∃j)[φ_j ∈ R ∧ (∀σ ⊇ τ)[Θ_e(μ_σ) ∼ φ_j]]}. Note that R_e ∈ Σ_3. Thus, since F is 3-generic, there exists an x_e such that either


(a) F[x_e] ∈ R_e, or
(b) for all τ ∈ R_e, F[x_e] ⊄ τ.

In other words, either
(a′) there exists a j such that φ_j ∈ R and, for all σ ⊇ F[x_e], Θ_e(μ_σ) ⊆ φ_j; or
(b′) for all φ_j ∈ R and all τ ⊇ F[x_e], there exists a σ ⊇ τ such that Θ_e(μ_σ) ≁ φ_j.

By Proposition 57, (b′) above is equivalent to
(b″) for all φ_j ∈ R and all τ ⊇ F[x_e] with |τ| ≥ max(e, j), Θ_e(μ_τ) ≁ φ_j.

Now (a′) is the same as (A) and (b″) is the same as (B). □

For Θ_e general recursive, fix x_e, j_e as in Claim 59. Let F_e = {σ 0^∞ | σ ⊆ μ_{F[x_e]}}. Let G_e = {σ 0^∞ | σ ⊇ μ_{F[x_e]}}. Note that S ⊆ F_e ∪ G_e.

Claim 60. Suppose Θ_e is general recursive. Then, for all f ∈ Θ_e(S) − [Θ_e(F_e) ∪ {φ_{j_e}}], there exists an n such that, for all σ ∈ T with σ ⊇ μ_{F[x_e]} and |σ| = n, f ≁ Θ_e(σ).

Proof. Suppose f ∈ Θ_e(S) − [Θ_e(F_e) ∪ {φ_{j_e}}]. We first claim that

(P1) for some m > x_e, for all τ ⊇ F[x_e] with |τ| ≥ m, Θ_e(μ_τ) ≁ f.

To see this, first note that, by Claim 59,
(A) for all τ ⊇ F[x_e], Θ_e(μ_τ) ⊆ φ_{j_e}; or
(B) (∀φ_j ∈ R)(∀τ ⊇ F[x_e] with |τ| ≥ max(e, j))[Θ_e(μ_τ) ≁ φ_j].

In case (B), (P1) immediately follows by taking m = max(MinProg(f), e). In case (A), since f ≠ φ_{j_e}, (P1) follows by König's Lemma and the general recursiveness of Θ_e.

Now, by Proposition 57(c) and König's Lemma, there exist only finitely many σ ∈ T such that σ ⊇ μ_{F[x_e]} but σ does not extend any μ_τ with τ ⊇ F[x_e], |τ| ≥ m. Thus, there exists an n > m such that, for all σ ∈ T with σ ⊇ μ_{F[x_e]} and |σ| = n, f ≁ Θ_e(σ). □

Claim 61. For all e such that Θ_e is general recursive, Θ_e(S) ∈ Confident.

Proof. Let h be as in Corollary 22. Consider the following M:

M(g[n])
1. If for some f ∈ Θ_e(F_e) ∪ {φ_{j_e}}, f[n] = g[n], then output a standard program for f.
2. Else, if there exists a σ ∈ T such that |σ| = n, σ ⊇ μ_{F[x_e]}, and Θ_e(σ) ∼ g[n], then output M(g[n−1]) (if n = 0, then M(g[n]) = 0).
3. If for all σ ∈ T such that |σ| = n and σ ⊇ μ_{F[x_e]}, Θ_e(σ) ≁ g[n], then
   let p be a bound on the programs for {σ 0^∞ | σ ∈ SEG_{0,1}, |σ| < m}, where m > x_e is the least number such that, for all σ ∈ T with |σ| = m and σ ⊇ μ_{F[x_e]}, Θ_e(σ) ≁ g[n].
   Output M_{h(p,e)}(g[n]).
End


We claim that M Confident-identifies Θ_e(S). To see this, first note that M Ex-identifies each g ∈ Θ_e(F_e) ∪ {φ_{j_e}}. If g ∉ Θ_e(F_e) ∪ {φ_{j_e}}, then we consider two cases. (Note that, if M(g[n]) executes step 3, then M(g[n+1]) also executes step 3.)

Case 1: For all n, M(g[n]) never reaches step 3.
In this case, M is trivially confident on g. Moreover, by Claim 60, g ∉ Θ_e(S) − [Θ_e(F_e) ∪ {φ_{j_e}}].

Case 2: For all but finitely many n, M(g[n]) executes step 3.
Then, for all n such that M(g[n]) reaches step 3, the m computed in step 3 of the algorithm for M would be the least number such that, for all σ ∈ T with |σ| = m and σ ⊇ μ_{F[x_e]}, Θ_e(σ) ≁ g. It follows that the values of m and p as computed by M converge. Thus, M is confident on g. Moreover, by the definition of S, if g ∈ Θ_e(S), then g must be from {Θ_e(σ 0^∞) | |σ| < m}. Thus, M_{h(p,e)}, and hence M, Ex-identifies g.

From the above cases it follows that M Confident-identifies Θ_e(S). □

Claim 62. S ∉ UniformRobustConfident.

Proof. Suppose by way of contradiction that there exists a recursive g such that, for all e, if Θ_e is general recursive, then M_{g(e)} Confident-identifies Θ_e(S). Without loss of generality assume that M_{g(e)} is total for all e.

Then, by implicit use of the Kleene recursion theorem [Rog67], there exists an e such that Θ_e may be defined as follows. For ease of presentation, Θ_e(f[m]) may be infinite for some f, m (and thus it does not satisfy the invariant (d) we assumed on the enumeration Θ_0, Θ_1, … — see the discussion around Proposition 13; this can be easily handled by appropriately slowing down Θ_e).

Θ_e(Λ) = Λ.

Θ_e(f[n+1]) =
   Θ_e(f[n]), if Θ_e(f[n]) is infinite;
   Θ_e(f[n]) f(n) 0^∞, if for all m, Θ_e(f[n]) f(n) ⊄ φ_{M_{g(e)}(Θ_e(f[n]) f(n) 0^m), m};
   Θ_e(f[n]) f(n) 0^m, if m is the least number such that Θ_e(f[n]) f(n) ⊆ φ_{M_{g(e)}(Θ_e(f[n]) f(n) 0^m), m}.

It is easy to verify that Θ_e is general recursive. Recall that f′ = μ_F. We now consider two cases.

Case 1: For some n, Θ_e(f′[n+1]) is infinite.
Let n be maximal such that Θ_e(f′[n]) is finite. Thus, Θ_e(f′[n+1]) is infinite, and Θ_e(f′) = Θ_e(f′[n]) f′(n) 0^∞ = Θ_e(f′[n+1] 0^∞) ∈ Θ_e(S). However, by the definition of Θ_e, Θ_e(f′[n]) f′(n) ⊄ φ_{M_{g(e)}(Θ_e(f′[n]) f′(n) 0^m), m} for all m. Thus, M_{g(e)}(Θ_e(f′[n+1] 0^∞)) diverges, or Θ_e(f′[n]) f′(n) ⊄ φ_{M_{g(e)}(Θ_e(f′[n] f′(n) 0^∞))}. It follows that M_{g(e)} does not Ex-identify Θ_e(f′[n+1] 0^∞) ∈ Θ_e(S). A contradiction.

Case 2: For all n, Θ_e(f′[n]) is finite.
In this case, first note that, for all h ≠ f′, Θ_e(h) ≠ Θ_e(f′). To see this, let y be the least number such that h(y) ≠ f′(y). Now it is easy to verify that Θ_e(f′[y] f′(y)) ⊆ Θ_e(f′) and Θ_e(f′[y] h(y)) ⊆ Θ_e(h). Thus, Θ_e(f′) ≠ Θ_e(h).


Thus, by Claim 59, it follows that Θ_e(f′) is non-recursive (otherwise, for all μ_h such that h extends F[x_e], we would have Θ_e(f′) = Θ_e(μ_h) — a contradiction to the above proof that, for all μ_h ≠ f′, Θ_e(f′) ≠ Θ_e(μ_h)).

Now, for all n, Θ_e(f′[n]) f′(n) ⊆ φ_{M_{g(e)}(Θ_e(f′[n+1]))}. Thus, either M_{g(e)} diverges on Θ_e(f′), or Θ_e(f′) is recursive. Since it was shown above that Θ_e(f′) is non-recursive, it follows that M_{g(e)} diverges on Θ_e(f′), a contradiction to M_{g(e)} being confident.

From the above cases it follows that S ∉ UniformRobustConfident. □

The theorem follows from Claims 61 and 62. □

For identification with mind changes, we assume M to be a mapping from SEG to ℕ ∪ {?}. This is to avoid biasing the number of mind changes made by the machine [CS83].

Definition 63 (Gold [Gol67], Blum and Blum [BB75], and Case and Smith [CS83]). Let b ∈ ℕ ∪ {∗}. Let f ∈ R.

(a) M Ex_b-identifies f (written: f ∈ Ex_b(M)) just in case M Ex-identifies f and card({n | ? ≠ M(f[n]) ≠ M(f[n+1])}) ≤ b (i.e., M makes no more than b mind changes on f).
(b) M Ex_b-identifies S iff M Ex_b-identifies each f ∈ S.
(c) Ex_b = {S ⊆ R | (∃M)[S ⊆ Ex_b(M)]}.
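The mind-change count in Definition 63(a) can be illustrated with a minimal sketch on finite data. The learner and the prefix below are hypothetical toy objects, not from the paper; '?' models the initial no-conjecture symbol.

```python
def mind_changes(learner, prefix):
    """Count the positions n with ? != learner(f[n]) != learner(f[n+1]),
    where 'learner' maps a tuple of values (an initial segment) to a
    hypothesis or '?' (no conjecture yet)."""
    changes = 0
    for n in range(len(prefix)):
        cur = learner(tuple(prefix[:n]))
        nxt = learner(tuple(prefix[:n + 1]))
        if cur != '?' and cur != nxt:
            changes += 1
    return changes

# Toy learner: conjectures '?' until it has seen 3 values, then
# conjectures the last value seen (so it may change its mind often).
toy = lambda seg: '?' if len(seg) < 3 else seg[-1]

print(mind_changes(toy, [0, 0, 1, 1, 0]))  # one change: 1 -> 0 at the end
```

Note that switching away from '?' to the first conjecture is not counted, which is exactly the bias the '?' symbol is introduced to avoid.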

In the following, we present characterizations for RobustEx_0 (see Proposition 64), for UniformRobustEx_n for any n ∈ ℕ (see Corollary 68), and for ⋃_{n∈ℕ} UniformRobustEx_n (see Corollary 70). As a consequence, we derive that all the types Ex_n, n ∈ ℕ, are uniformly robustly poor. Furthermore, for all the types Ex_n, n ∈ ℕ, uniformly robust learning turns out to be stronger than robust learning; see Corollary 71.

Proposition 64 (Zeugmann [Zeu86], Jain et al. [JSW01]). For any C ⊆ R, C ∈ RobustEx_0 iff C is finite.

Thus, the type Ex_0 of learning without any mind change is robustly poor, since every finite class is recursively enumerable.

Proposition 65. For any n ∈ ℕ and C ⊆ R with card(C) < 2^{n+1}, C ∈ UniformRobustEx_n.

Proof. Suppose C = {f_0, f_1, …, f_{m−1}} with m < 2^{n+1}. Then one can show C ∈ UniformRobustEx_n as follows. There exists a recursive g such that M_{g(e)} may be defined as follows.

For any f, let S(f[l]) = {i < m | f[l] ∼ Θ_e(f_i[l])}. For S ⊆ {i | i < m}, let Maj(S) denote a program such that

φ_{Maj(S)}(x) = y, if card({i ∈ S | Θ_e(f_i)(x) = y}) > card(S)/2; and φ_{Maj(S)}(x)↑ otherwise.


Now define M_{g(e)}(f[l]) as follows:

M_{g(e)}(f[l]) =
   M_{g(e)}(f[l−1]), if card(S(f[l])) = 0;
   Maj(S(f[s])), if 2^k ≤ card(S(f[l])) < 2^{k+1}, and s is the least number such that 2^k ≤ card(S(f[s])) < 2^{k+1}.

For any f and i ≤ n, let s_i be the minimal number, if any, such that card(S(f[s_i])) < 2^{i+1}. (If card(S(f[l])) ≥ 2^{i+1} for all l, then s_i = ∞.) Let s_{−1} = ∞.

Note that, for all l such that s_i ≤ l < s_{i−1}, M_{g(e)}(f[l]) = Maj(S(f[s_i])). Thus, M_{g(e)} makes at most n mind changes. Furthermore, for f ∈ C, let i be the minimal number such that s_i ≠ ∞. Then card({j | j ∈ S(f[s_i]) ∧ Θ_e(f_j) ∼ f}) ≥ 2^i. Thus, Maj(S(f[s_i])) is a program for Θ_e(f). Thus, M_{g(e)} Ex-identifies f. The proposition follows. □
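The idea behind M_{g(e)} can be sketched in a finite toy setting. Here the candidate sequences play the role of Θ_e(f_0), …, Θ_e(f_{m−1}), and a "hypothesis" is just the frozen voting set rather than a program; the function and its inputs are illustrative assumptions, not the paper's construction.

```python
def halving_learner(candidates, prefix):
    """Sketch of M_{g(e)} from Proposition 65 in a finite toy setting.
    The hypothesis is the set of indices frozen the first time the
    count of consistent candidates fell into a new dyadic interval
    [2^k, 2^{k+1}); predictions would then be by majority vote over
    that frozen set, so each halving costs at most one mind change."""
    hypothesis = None
    frozen_k = None
    for l in range(len(prefix) + 1):
        S = tuple(i for i, f in enumerate(candidates)
                  if f[:l] == prefix[:l])
        if not S:
            continue                       # keep the previous conjecture
        k = len(S).bit_length() - 1        # 2^k <= |S| < 2^(k+1)
        if frozen_k is None or k < frozen_k:
            hypothesis, frozen_k = S, k    # |S| crossed a power of two
    return hypothesis

# Toy run: 3 (< 2^2) candidate sequences, so at most 1 mind change.
cands = [[0, 0, 0, 0], [0, 1, 1, 1], [0, 1, 0, 0]]
print(halving_learner(cands, [0, 1, 0]))  # only candidate 2 survives
```

Since fewer than 2^{n+1} candidates can be halved at most n times, the dyadic freezing scheme enforces the Ex_n mind-change bound.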

The following proposition is a modification of a similar proposition in [JSW01].

Proposition 66. For any M_e and n ∈ ℕ, one can effectively define p^{e,n}_i, i < 2^{n+1}, such that M_e does not Ex_n-identify {φ_{p^{e,n}_i} | i < 2^{n+1}}.

Proof. We give below a description of total functions F^{e,n}_i, effectively in i, e, n. It will be the case that programs p^{e,n}_i for F^{e,n}_i can be obtained effectively from i, e, n. Without loss of generality assume that M_e(Λ) = ?.

For all binary strings α of length at most n, we define H_α as follows. H_Λ = 0^j, where j = min({x | M_e(0^x) ≠ ?}). For a binary string β of length less than n and b ∈ {0, 1}, let H_{βb} be defined as follows: if H_β is infinite, then H_{βb} = H_β; otherwise, let H_{βb} = H_β b 0^j, where j = min({x | M_e(H_β) ≠ M_e(H_β b 0^x)}). It is easy to verify that H_β ⊆ H_{βb}, and that if H_β is finite, then M_e on H_β has made at least |β| mind changes. Now suppose the (n+1)-bit binary representation of i is b_0 b_1 ⋯ b_n. Then define

F^{e,n}_i = H_{b_0 b_1 ⋯ b_{n−1}}, if H_{b_0 b_1 ⋯ b_{n−1}} is infinite; and F^{e,n}_i = H_{b_0 b_1 ⋯ b_{n−1}} b_n 0^∞ otherwise.

It is easy to verify that programs p^{e,n}_i for F^{e,n}_i can be obtained effectively from i, e, n.

We claim that M_e cannot Ex_n-identify {F^{e,n}_i | i < 2^{n+1}}. To see this, note that if H_Λ is infinite, then M_e does not identify any element of {F^{e,n}_i | i < 2^{n+1}}. Otherwise, let β be the longest string of length at most n such that H_β is finite. Then consider the following cases:

Case 1: |β| = n.


In this case, M_e on H_β has already made n mind changes. Suppose i and i′, respectively, have binary representations β 0 and β 1. Now M_e can Ex_n-identify at most one of F^{e,n}_i and F^{e,n}_{i′}, since both start with H_β.

Case 2: |β| < n.
In this case, M_e(H_β) = M_e(H_β 1 0^∞) = M_e(H_β 0 0^∞). Suppose i and i′, respectively, have (n+1)-bit binary representations β 0 0^{n−|β|} and β 1 0^{n−|β|}. Then F^{e,n}_i = H_β 0 0^∞ and F^{e,n}_{i′} = H_β 1 0^∞, and thus M_e does not Ex-identify at least one of F^{e,n}_i, F^{e,n}_{i′}.

The proposition follows from the above cases. □
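The segments H_α of Proposition 66 can be approximated in a finite sketch. The real construction uses unbounded searches; here a search is capped at 'cap' steps, so a failed search stands in for an infinite H_α (returned as None). The learner M and the cap are illustrative assumptions.

```python
def diagonal_family(M, n, cap=50):
    """Finite sketch of the H_alpha construction from Proposition 66,
    run against a learner M (mapping a tuple of values to a hypothesis
    or '?'), for all binary strings alpha of length <= n."""
    H = {(): None}
    # H_Lambda = 0^j for the least j with M(0^j) != '?'
    for j in range(cap):
        if M((0,) * j) != '?':
            H[()] = (0,) * j
            break
    for length in range(n):
        for alpha in [a for a in H if len(a) == length]:
            for b in (0, 1):
                if H[alpha] is None:
                    H[alpha + (b,)] = None       # stays "infinite"
                    continue
                H[alpha + (b,)] = None
                base = H[alpha] + (b,)
                for j in range(cap):             # search for a mind change
                    if M(base + (0,) * j) != M(H[alpha]):
                        H[alpha + (b,)] = base + (0,) * j
                        break
    return H

# Toy learner that conjectures the number of 1s seen so far: appending
# a 1 forces a mind change, padding with 0s never does.
M = lambda seg: '?' if not seg else sum(seg)
H = diagonal_family(M, 1)
print(H[(1,)])  # extending by 1 triggers a mind change immediately
```

On this toy learner, H_{(0)} stays None (no mind change is ever found on 0-padding), while H_{(1)} is closed off after one forced mind change, mirroring how the full binary tree of segments pins down 2^{n+1} functions that jointly defeat any Ex_n-learner.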

Proposition 67. For any n ∈ ℕ and C ⊆ R with card(C) ≥ 2^{n+1}, C ∉ UniformRobustEx_n.

Proof. Suppose f_0, f_1, …, f_{2^{n+1}−1} are 2^{n+1} distinct functions in C. Let m be such that, for all i, j with 0 ≤ i < j < 2^{n+1}, f_i[m] ≠ f_j[m]. Let p^{e,n}_i be as in Proposition 66.

Suppose by way of contradiction that g is such that, for all e, if Θ_e is general recursive, then M_{g(e)} Ex_n-identifies Θ_e(C). Then, by the Kleene recursion theorem [Rog67], there exists an e such that Θ_e may be described as follows:

Θ_e(f) = φ_{p^{g(e),n}_i}, if f[m] = f_i[m] for some i < 2^{n+1}; and Θ_e(f) = Zero otherwise.

Clearly, then, by Proposition 66, M_{g(e)} does not Ex_n-identify Θ_e(C). A contradiction. □

Corollary 68. For any n ∈ ℕ and C ⊆ R, C ∈ UniformRobustEx_n iff card(C) < 2^{n+1}.

Clearly, Corollary 68 yields that all the types Ex_n, n ∈ ℕ, are uniformly robustly poor.

Corollary 69. For any n ∈ ℕ, RobustEx_0 ⊄ UniformRobustEx_n.

Proof. This follows immediately from Propositions 64 and 67. □

Corollary 70. RobustEx_0 = ⋃_{n∈ℕ} UniformRobustEx_n.

Proof. This follows immediately from Proposition 64 and Corollary 68. □

Corollaries 69 and 70 above give a situation which is quite rare, in the sense that, in most cases, if one can diagonalize against each of the I_n, one would expect to be able to diagonalize against ⋃_{n∈ℕ} I_n.

Corollary 71. For any n ∈ ℕ, UniformRobustEx_n ⊂ RobustEx_n.

Proof. Since RobustEx_0 ⊆ RobustEx_n, we have RobustEx_n − UniformRobustEx_n ≠ ∅ by Corollary 69. □


The following Propositions 72 and 73 are needed in proving Theorem 75 below. From this theorem, we then derive that uniformly robust learning is stronger than robust learning for each of the types Cons, Ex and Bc.

Proposition 72. There exists a K-recursive sequence of initial segments σ_0, σ_1, … ∈ SEG_{0,1} such that, for all e ∈ ℕ, the following are satisfied.

(a) 0^e 1 ⊆ σ_e.
(b) For all e′ ≤ e, if Θ_{e′} is general recursive, then either Θ_{e′}(σ_e) ≁ Θ_{e′}(0^{|σ_e|}) or, for all f ∈ T_{0,1} extending σ_e, Θ_{e′}(f) = Θ_{e′}(0^∞).

Proof. We define σ_e (using an oracle for K) as follows. Initially, let σ^0_e = 0^e 1. For e′ ≤ e, define σ^{e′+1}_e as follows: if there exists an extension τ ∈ SEG_{0,1} of σ^{e′}_e such that Θ_{e′}(τ) ≁ Θ_{e′}(0^{|τ|}), then let σ^{e′+1}_e = τ; otherwise, let σ^{e′+1}_e = σ^{e′}_e.

Now let σ_e = σ^{e+1}_e as defined above. It is easy to verify that the proposition is satisfied. □

Proposition 73. There exists an infinite increasing sequence a_0, a_1, … of natural numbers such that, for A = {a_i | i ∈ ℕ}, the following properties are satisfied for all k ∈ ℕ:

(a) The complement of A is recursively enumerable relative to K.
(b) φ_{a_k} is total.
(c) For all e ≤ a_k such that φ_e is total, φ_e(x) ≤ φ_{a_{k+1}}(x) for all x ∈ ℕ.
(d) For σ_e as defined in Proposition 72, |σ_{a_k}| ≤ a_{k+1}.

Proof. The construction of the a_i is done using movable markers (and an oracle for K). Let a^s_i denote the value of a_i at the beginning of stage s of the construction. It will be the case that, for all s and i, either a^s_i = a^{s+1}_i or a^{s+1}_i > s. This allows us to ensure property (a). The construction itself directly implements properties (b)–(d). Let pad be a 1–1 padding function [Rog67] such that, for all i, j, φ_{pad(i,j)} = φ_i and pad(i, j) ≥ i + j.

We assume without loss of generality that φ_0 is total. Initially, let a^0_0 = 0 and a^0_{i+1} = pad(0, |σ_{a^0_i}|) (this ensures a^0_{i+1} ≥ |σ_{a^0_i}| > a^0_i). Go to stage 0.

Stage s
1. If there exist a k, 0 < k ≤ s, and an x ≤ s such that
   (i) φ_{a^s_k}(x)↑, or
   (ii) for some e ≤ a^s_{k−1}, (∀y ≤ s)[φ_e(y)↓] and φ_e(x) > φ_{a^s_k}(x),
2. then pick the least such k and go to step 3. If there is no such k, then for all i let a^{s+1}_i = a^s_i and go to stage s+1.
3. For i < k, let a^{s+1}_i = a^s_i.
4. Let j be the least number such that
   (i) (∀y ≤ s)[φ_j(y)↓], and
   (ii) for all e ≤ a^s_{k−1}, if φ_e(y)↓ for all y ≤ s, then φ_j(y) ≥ φ_e(y) for all y ≤ s.
   Let a^{s+1}_k = pad(j, |σ_{a^s_{k−1}}| + s + 1).


5. For i > k, let a^{s+1}_i = pad(0, |σ_{a^{s+1}_{i−1}}| + s + 1).
6. Go to stage s+1.
End stage s

We claim (by induction on k) that lim_{s→∞} a^s_k converges for each k. To see this, note that, once all the a_i, i < k, have stabilized, step 4 would eventually pick a j such that φ_j is total and, for all e ≤ a_{k−1}, if φ_e is total, then φ_e ≤ φ_j. Thereafter, a_k would not be changed.

We now show the various properties claimed in the proposition. One can enumerate the complement of A (using an oracle for K) via the following property: x ∉ A iff there exists a stage s > x such that, for all i ≤ x, a^s_i ≠ x. Thus (a) holds. (b) and (c) hold due to the check in step 1. (d) trivially holds due to the padding used in the definition of the a^s_i for all s. □

Definition 74. Suppose h ∈ R. Let B_h = {φ_e | φ_e ∈ R ∧ (∀^∞ x)[Φ_e(x) ≤ h(x)]}.

Intuitively, B_h denotes the class of recursive functions whose complexity is almost everywhere bounded by h. We assume without loss of generality that FINSUP ⊆ B_{φ_0}. Thus, for the a_i as in Proposition 73, FINSUP ⊆ B_{φ_{a_i}} for all i.

Theorem 75. RobustCons ⊄ UniformRobustBc.

Proof. Fix σ_0, σ_1, … as in Proposition 72, and a_0, a_1, … as in Proposition 73. Let G_k = B_{φ_{a_k}} ∩ {f | σ_{a_k} ⊆ f}.

The main idea of the construction is to build the diagonalizing class by taking at most finitely many functions from each G_k. Let Θ_{b_k} be defined by

Θ_{b_k}(f[n]) = Λ, if n < |σ_{a_k}|;
Θ_{b_k}(f[n]) = f(|σ_{a_k}|) f(|σ_{a_k}| + 1) ⋯ f(n−1), if σ_{a_k} ⊆ f[n];
Θ_{b_k}(f[n]) = f[n], otherwise.

Note that Θ_{b_k} is general recursive.
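On finite prefixes, the operator Θ_{b_k} admits a direct minimal sketch (sequences as Python lists, Λ as the empty list; the fixed segment sigma plays the role of σ_{a_k}):

```python
def theta_bk(sigma, prefix_f):
    """Sketch of the operator Theta_{b_k} from Theorem 75 on a finite
    prefix f[n]: it erases the fixed segment sigma from functions
    extending it, and acts as the identity on everything else."""
    if len(prefix_f) < len(sigma):
        return []                         # Lambda: nothing output yet
    if prefix_f[:len(sigma)] == list(sigma):
        return prefix_f[len(sigma):]      # strip the sigma_{a_k} prefix
    return prefix_f[:]                    # otherwise: identity

print(theta_bk([0, 0, 1], [0, 0, 1, 5, 7]))  # prefix stripped
```

In particular, on any class G_j with j ≠ k the operator acts as the identity (since σ_{a_j} does not extend σ_{a_k}), which is what makes the single operator Θ_{b_k} usable against the whole union over i ≥ k in Claim 76.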

Claim 76. ⋃_{i≥k} Θ_{b_k}(G_i) ∉ Bc.

Proof. Suppose by way of contradiction that M Bc-identifies ⋃_{i≥k} Θ_{b_k}(G_i). Then clearly M must Bc-identify FINSUP (since FINSUP ⊆ Θ_{b_k}(G_k)). Thus, for all τ ∈ SEG_{0,1}, there exists an n such that, for all m ≥ n, φ_{M(τ 0^m)}(|τ| + m) = 0. Thus, there exists a recursive function g such that, for all τ ∈ SEG_{0,1} satisfying |τ| ≤ n, φ_{M(τ 0^{g(n)})}(|τ| + g(n)) = 0. Now, for each τ, define f_τ inductively by letting Z_0 = τ, Z_{n+1} = Z_n 0^{g(|Z_n|)} 1, and f_τ = ⋃_n Z_n. Note that M does not


Bc-identify any ft: Also, ft is uniformly (in t) computable and thus fft j tASEG0;1gDBh; for somerecursive h: Thus, for sufficiently large j; fft j tASEG0;1gDBjaj

: Thus, for almost all j4k;

fsajAGj ¼ Ybk

ðGjÞ:Since M does not Bc-identify fsaj

; claim follows. &
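The inductive definition of the diagonal functions f_τ above is directly computable. A minimal sketch (the function and parameter names are ours) that produces the finite prefix η_rounds of f_τ:

```python
def f_tau_prefix(tau, g, rounds):
    """Compute eta_rounds in the chain eta_0 = tau,
    eta_{n+1} = eta_n 0^{g(|eta_n|)} 1, whose union is f_tau.

    tau    : list -- the starting segment
    g      : function length -> number of zeros to append
    rounds : how many extension steps to perform
    """
    eta = list(tau)
    for _ in range(rounds):
        # append g(|eta|) zeros followed by a single 1
        eta = eta + [0] * g(len(eta)) + [1]
    return eta
```

For instance, with the (hypothetical) bound g(n) = 2, `f_tau_prefix([1], lambda n: 2, 2)` yields `[1, 0, 0, 1, 0, 0, 1]`.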

Let g_0 be a function dominating all K′-recursive functions. For each k ∈ N and e ≤ g_0(k), let f_{k,e} denote a function in ∪_{i≥k} G_i such that M_e does not Bc-identify Θ_{b_k}(f_{k,e}).

Let S = {f_{k,e} | k ∈ N, e ≤ g_0(k)}. Let F_k = S ∩ G_k. It is easy to verify that F_k is finite (since f_{k,e} ∉ ∪_{i<k} G_i).

Claim 77. S ∉ UniformRobustBc.

Proof. Suppose by way of contradiction that h is a recursive function such that M_{h(e)} Bc-identifies Θ_e(S). Note that b_k can be computed recursively using an oracle for K. Thus, h(b_k) can be computed recursively using an oracle for K. Hence, for all but finitely many k, h(b_k) ≤ g_0(k). Consequently, M_{h(b_k)} does not Bc-identify Θ_{b_k}(f_{k,h(b_k)}) ∈ Θ_{b_k}(S). A contradiction. □

Claim 78. S ∈ RobustCons.

Proof. Suppose Θ = Θ_k is general recursive. We need to show that Θ_k(S) ∈ Cons. Let A = {a_i | i ∈ N}. Since A is r.e. in K, there exists a recursive sequence c_0, c_1, … such that each a ∈ A with a > a_k appears infinitely often in the sequence, and each a with a ∉ A or a ≤ a_k appears only finitely often in the sequence. Let σ_{e,t} ∈ SEG_{0,1} be such that σ_{e,t} ⊇ 0^e 1, σ_{e,t} can be obtained effectively from e and t, and lim_{t→∞} σ_{e,t} = σ_e. Note that such σ_{e,t} exist due to the K-recursiveness of the sequence σ_0, σ_1, ….

Note that there exists a recursive h such that, if φ_e is recursive, then M_{h(e)} Cons-identifies Θ(B_{φ_e}). Fix such a recursive h.
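The sequence c_0, c_1, … can be obtained from any enumeration of A by a standard fairness trick: at every stage, replay everything enumerated so far, so that each enumerated element recurs infinitely often. A minimal sketch of this trick only (names are ours; the actual construction additionally arranges that elements a ∉ A or a ≤ a_k occur only finitely often):

```python
def fair_sequence(enum, n):
    """First n terms of a sequence in which every element that
    enum ever produces occurs infinitely often in the full sequence.

    enum : function stage -> element (an enumeration; repeats allowed)
    n    : how many terms of the sequence to output
    """
    seen, out, s = [], [], 0
    while len(out) < n:
        x = enum(s)
        if x not in seen:
            seen.append(x)
        # replay everything seen so far, so earlier elements recur
        for y in seen:
            out.append(y)
            if len(out) == n:
                break
        s += 1
    return out
```

Every element, once seen, is re-emitted in each later stage, which is exactly the "appears infinitely often" property needed of the c_t.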

Let F = {0^∞} ∪ F_0 ∪ F_1 ∪ … ∪ F_k. F and Θ(F) are finite sets of total recursive functions. Define M as follows.

M(f[n]):
1. If for some g ∈ Θ(F), g[n] = f[n], then output a canonical program for one such g.
2. Else, let t ≤ n be the largest number such that Θ(σ_{c_t,n}) ⊆ f[n] and Θ(σ_{c_t,n}) ⊄ Θ(0^∞). Dovetail the following steps until one of them succeeds. If step 2.1 or 2.2 succeeds, then go to step 3. If step 2.3 succeeds, then go to step 4.
   2.1. There exists an s > n such that c_s ≠ c_t, Θ(σ_{c_s,s}) ⊆ f[n], and Θ(σ_{c_s,s}) ⊄ Θ(0^∞).
   2.2. There exists an s > n such that σ_{c_t,s} ≠ σ_{c_t,n}.
   2.3. M_{h(c_t)}(f[n])↓ and f[n] ⊆ φ_{M_{h(c_t)}(f[n])}.
3. Output a program for f[n]0^∞.
4. Output M_{h(c_t)}(f[n]).
End
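The dovetailing in step 2 can be sketched as a round-robin over the three checks with an increasing search bound. This is a sketch of the control structure only (names are ours): the real checks of steps 2.1–2.3 are unbounded simulations, so the loop below likewise diverges if no check ever succeeds, just as M(f[n]) may be undefined.

```python
def dovetail(checks):
    """Run the given checks in rounds with an increasing bound;
    return the index of the first check to succeed.

    checks : list of functions bound -> bool
             (each check inspects only finitely much within the bound)
    """
    bound = 0
    while True:
        for i, check in enumerate(checks):
            if check(bound):
                return i
        bound += 1
```

For M, a result of 0 or 1 (steps 2.1/2.2) sends control to step 3, and a result of 2 (step 2.3) sends control to step 4.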


It is easy to verify that whenever M(f[n]) is defined, f[n] ⊆ φ_{M(f[n])}. Also, if f ∈ Θ(F), then M Cons-identifies f.

Now consider any f ∈ Θ(S) − Θ(F). Note that there exists a unique i > k such that f ⊇ Θ(σ_{a_i}) and Θ(σ_{a_i}) ⊄ Θ(0^∞) (due to the definition of the σ_{a_j}). Fix such an i. Also, since f ≠ Θ(0^∞), there exist only finitely many e such that f ⊇ Θ(0^e 1).

We first claim that M(f[n]) is defined for all n. To see this, note that if c_t ≠ a_i or σ_{c_t,n} ≠ σ_{a_i}, then step 2.1 or step 2.2 would eventually succeed. Otherwise, since f ∈ Θ(F_i) ⊆ Θ(B_{φ_{a_i}}), step 2.3 would eventually succeed (since M_{h(a_i)} Cons-identifies Θ(B_{φ_{a_i}})).

Thus, it suffices to show that M Ex-identifies f. Let r be such that f ⊉ Θ(0^r). Let m and n > m be large enough such that (i)–(iv) hold.

(i) f[n] ⊉ Θ(0^r).
(ii) c_m = a_i, and for all s ≥ m, σ_{a_i,s} = σ_{a_i,m}.
(iii) For all e < r and t > m, if e ∉ A or e ≤ a_k, then c_t ≠ e.
(iv) For all e < r and t > m, if e ∈ A − {a_i} and e > a_k, then Θ(σ_{e,t}) ⊄ f[n] or Θ(σ_{e,t}) ⊆ Θ(0^∞).

Note that such m and n exist. Thus, for all n′ ≥ n, in the computation of M(f[n′]), c_t would be a_i, and steps 2.1 and 2.2 would not succeed. Thus, step 2.3 would succeed, and M would output M_{h(a_i)}(f[n′]). Hence M Ex-identifies f, since M_{h(a_i)} Ex-identifies f. □

The theorem follows from the above claims. □

Corollary 79. UniformRobustCons ⊂ RobustCons. UniformRobustEx ⊂ RobustEx. UniformRobustBc ⊂ RobustBc.

6. Conclusion

In this paper we investigated, for many learnability criteria, whether there are rich function classes which are robustly or uniformly robustly learnable under these criteria.

Furthermore, we showed that every uniformly robustly Ex-learnable class is also consistently learnable. It remains an open problem whether the adverb "uniformly" can be dropped: is every robustly Ex-learnable class also consistently learnable?

In addition, we explored the relationship between the robust and uniformly robust variants of several learnability criteria. Specifically, we separated robust Ex-learning from its uniformly robust counterpart, solving an open problem from [JSW01].

Acknowledgments

We thank the anonymous referees for several helpful comments.


References

[AS83] D. Angluin, C. Smith, Inductive inference: theory and methods, Comput. Surveys 15 (1983) 237–289.

[AB92] M. Anthony, N. Biggs, Computational Learning Theory, Cambridge University Press, Cambridge, 1992.

[Bar74a] J. Bārzdiņš, Inductive inference of automata, functions and programs, in: International Mathematics Congress, Vancouver, 1974, pp. 771–776.

[Bar74b] J. Bārzdiņš, Two theorems on the limiting synthesis of functions, in: Theory of Algorithms and Programs, Latvian State University, Vol. 1, 1974, pp. 82–88 (in Russian).

[BF74] J. Bārzdiņš, R. Freivalds, Prediction and limiting synthesis of recursively enumerable classes of functions, Latvijas Valsts Univ. Zinatn. Raksti 210 (1974) 101–111.

[BB75] L. Blum, M. Blum, Toward a mathematical theory of inductive inference, Inform. Control 28 (1975)

125–155.

[Blu67] M. Blum, A machine-independent theory of the complexity of recursive functions, J. ACM 14 (1967)

322–336.

[Cas74] J. Case, Periodicity in generations of automata, Mathematical Systems Theory 8 (1974) 15–32.

[CJOþ00] J. Case, S. Jain, M. Ott, A. Sharma, F. Stephan, Robust learning aided by context (Special Issue for

COLT’98), J. Comput. System Sci. 60 (2000) 234–257.

[CS83] J. Case, C. Smith, Comparison of identification criteria for machine inductive inference, Theoret. Comput.

Sci. 25 (1983) 193–220.

[CF99] C. Costa Florencio, Consistent identification in the limit of some Penn and Buszkowski’s classes is NP-hard,

in: P. Monachesi (Ed.), Computational Linguistics in the Netherlands, 1999, pp. 1–12.

[Fre91] R. Freivalds, Inductive inference of recursive functions: qualitative theory, in: J. Bārzdiņš, D. Bjørner (Eds.), Baltic Computer Science, Lecture Notes in Computer Science, Vol. 502, Springer, Berlin, 1991, pp. 77–110.

[FBP91] R. Freivalds, J. Bārzdiņš, K. Podnieks, Inductive inference of recursive functions: complexity bounds, in: J. Bārzdiņš, D. Bjørner (Eds.), Baltic Computer Science, Lecture Notes in Computer Science, Vol. 502, Springer, Berlin, 1991, pp. 111–155.

[FW79] R. Freivalds, R. Wiehagen, Inductive inference with additional information, J. Inform. Process. Cybernet.

(EIK) 15 (1979) 179–195.

[Ful88] M. Fulk, Saving the phenomenon: requirements that inductive machines not contradict known data, Inform.

Comput. 79 (1988) 193–209.

[Ful90] M. Fulk, Robust separations in inductive inference, in: 31st Annual IEEE Symposium on Foundations of

Computer Science, IEEE Computer Society Press, Silver Spring, MD, 1990, pp. 405–410.

[Gol67] E.M. Gold, Language identification in the limit, Inform. Control 10 (1967) 447–474.

[Gra86] J. Grabowski, Starke Erkennung, in: R. Lindner, H. Thiele (Eds.), Strukturerkennung diskreter

kybernetischer Systeme, Teil I, Seminarbericht Nr.82, Department of Mathematics, Humboldt University

of Berlin, 1986, pp. 168–184 (in German).

[Jai99] S. Jain, Robust behaviorally correct learning, Inform. Comput. 153 (2) (1999) 238–248.

[JORS99] S. Jain, D. Osherson, J. Royer, A. Sharma, Systems that Learn: An Introduction to Learning Theory, 2nd

Edition, MIT Press, Cambridge, MA, 1999.

[JSW01] S. Jain, C. Smith, R. Wiehagen, Robust learning is rich, J. Comput. Syst. Sci. 62 (2001) 178–212.

[JB81] K.P. Jantke, H.-R. Beick, Combining postulates of naturalness in inductive inference, J. Inform. Process.

Cybernet. (EIK) 17 (1981) 465–484.

[KW80] R. Klette, R. Wiehagen, Research in the theory of inductive inference by GDR mathematicians—a survey,

Inform. Sci. 22 (1980) 149–169.

[KS89a] S. Kurtz, C. Smith, On the role of search for learning, in: R. Rivest, D. Haussler, M. Warmuth (Eds.),

Proceedings of the Second Annual Workshop on Computational Learning Theory, Morgan Kaufmann, Los

Altos, CA, 1989, pp. 303–311.

[KS89b] S. Kurtz, C. Smith, A refutation of Bārzdiņš' conjecture, in: K.P. Jantke (Ed.), Analogical and Inductive Inference, Proceedings of the Second International Workshop (AII '89), Lecture Notes in Artificial Intelligence, Vol. 397, Springer, Berlin, 1989, pp. 171–176.


[Lan90] S. Lange, Consistent polynomial-time inference of k-variable pattern languages, in: J. Dix, K.P. Jantke,

P. Schmitt (Eds.), Nonmonotonic and Inductive Logic, First International Workshop, Karlsruhe, Germany,

Lecture Notes in Computer Science, Vol. 543, Springer, Berlin, 1990, pp. 178–183.

[Min76] E. Minicozzi, Some natural properties of strong identification in inductive inference, Theoret. Comput. Sci. 2

(1976) 345–360.

[Mit97] T. Mitchell, Machine Learning, McGraw-Hill, New York, 1997.

[OSW86] D. Osherson, M. Stob, S. Weinstein, Systems that Learn: An Introduction to Learning Theory for Cognitive

and Computer Scientists, MIT Press, Cambridge, MA, 1986.

[OS02] M. Ott, F. Stephan, Avoiding coding tricks by hyperrobust learning, Theoret. Comput. Sci. 284 (2002)

161–180.

[Rog67] H. Rogers, Theory of Recursive Functions and Effective Computability, McGraw-Hill, New York, 1967

(Reprinted by MIT Press in 1987).

[Ste98] W. Stein, Consistent polynomial identification in the limit, in: M.M. Richter, C.H. Smith, R. Wiehagen,

T. Zeugmann (Eds.), Algorithmic Learning Theory: Ninth International Conference (ALT’ 98), Lecture

Notes in Artificial Intelligence, Vol. 1501, Springer, Berlin, 1998, pp. 424–438.

[Vap00] V.N. Vapnik, The Nature of Statistical Learning Theory, 2nd Edition, Springer, Berlin, 2000.

[Wie76] R. Wiehagen, Limes-Erkennung rekursiver Funktionen durch spezielle Strategien, J. Inform. Process.

Cybernet. (EIK) 12 (1976) 93–99.

[Wie78] R. Wiehagen, Zur Theorie der Algorithmischen Erkennung, Dissertation B, Humboldt University of Berlin,

1978.

[WL76] R. Wiehagen, W. Liepe, Charakteristische Eigenschaften von erkennbaren Klassen rekursiver Funktionen,

J. Inform. Process. Cybernet. (EIK) 12 (1976) 421–438.

[WZ94] R. Wiehagen, T. Zeugmann, Ignoring data may be the only way to learn efficiently, J. Experiment. Theoret.

Artif. Intell. 6 (1994) 131–144.

[WZ95] R. Wiehagen, T. Zeugmann, Learning and consistency, in: K.P. Jantke, S. Lange (Eds.), Algorithmic

Learning for Knowledge-Based Systems, Lecture Notes in Artificial Intelligence, Vol. 961, Springer, Berlin,

1995, pp. 1–24.

[Zeu86] T. Zeugmann, On Bārzdiņš' conjecture, in: K.P. Jantke (Ed.), Analogical and Inductive Inference, Proceedings of the International Workshop, Lecture Notes in Computer Science, Vol. 265, Springer, Berlin, 1986, pp. 220–227.
