1

Computational Learning: Statistical & Soft-Computing Approaches: with focus on Language Structure using Fuzzy Similarity

Narendra S Chaudhari

School of Computer Engineering, Nanyang Technological University, Singapore

Emails: asnarendra@ntu.edu.sg; nsc183@gmail.com; nscmp@lycos.com

[Figure: parse tree of "the cat runs": sentence → noun phrase + predicate phrase; "the" = article, "cat" = noun, "runs" = verb (predicate)]

Alan Turing: son of (Ethel) Sara Turing; conceived 1911 in Chatrapur, India; born 23 June 1912, London.

Prof. Nikhil Pal, Editor-in-Chief, IEEE Trans. Fuzzy Systems & Professor, Indian Statistical Institute (ISI), Calcutta, India

2

Language Structure using Fuzzy Similarity: Language Structure – Some Approaches

• Model Construction
  – Grammar Learning
  – Grammar Models of Practical Interest
  – Regular Language Learning

• Context Free Grammar Learning
  – Alignment Based Learning (ABL) [Zaanen, 2000]
  – Alignment Profile
  – Profile Similarity
  – Indistinguishable Grammar Symbols
  – Profile Based Alignment Learning (PBAL)

• Concluding Remarks

• Research Interests

3

Human beings appear to be able to learn new concepts without needing to be programmed explicitly in any conventional sense.

[Leslie G. Valiant, 1984] Harvard University, Aiken Computation Laboratory, Cambridge, MA, USA

Grammar Learning: BIG PICTURE

Formal Language Learning (Grammar Learning): automating the learning of formal languages in the Chomsky hierarchy.

4

Grammar Learning: Model Construction

• Grammar Learning (also called Grammar Inference, GI) is the task of identifying a grammatical model from generated samples.

[Figure: sample strings 11, 111, 1111, … ? are used to infer a Grammar Model (S → …)]

5

Grammar Learning: Approach – Induction and Inductive Inference

Induction: reasoning from a part to the whole, from particulars to the general, or from the individual to the universal.

Inductive inference: process of hypothesizing a general rule from examples.

– Example:

100, 111100, 11000, 1110, 1100,…

– Guess:

“any number of 1’s followed by any number of 0’s”.
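
As a quick illustration of this inductive step, the guessed rule can be written as the regular expression 1*0* and checked against the given examples. A minimal Python sketch (the example strings are from the slide; everything else is illustrative):

    import re

    # Guessed rule: any number of 1's followed by any number of 0's.
    hypothesis = r"1*0*"

    examples = ["100", "111100", "11000", "1110", "1100"]
    for s in examples:
        consistent = re.fullmatch(hypothesis, s) is not None
        print(f"{s}: {'consistent with guess' if consistent else 'counterexample'}")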

6

E.M. Gold [1967] formulated the concept of “Identification in the Limit”. It views inductive inference as an infinite process.

PROVES THE “Negative” result:

Any “super-finite” language class (a class containing all finite languages and at least one infinite language) cannot be learnt from only “positive” examples.

E. M. Gold [1967] "Language identification in the limit", Information and Control, Vol. 10, pp. 447-474, 1967.

Grammar Learning: E.M. Gold’s “negative” result

7

E.M. Gold [1978] formulated the concept of “Language Learning in the Limit” (again viewing inductive inference as an infinite process).

PROVES THE “Positive” result:

“Regular” languages are “learnable in the limit”

“practicality problems” (good algorithms) still remain

E. M. Gold [1978] "Complexity of automaton identification from given data", Information and Control, Vol. 37, pp. 302-320, 1978.

Grammar Learning: E.M. Gold’s “positive” result

8

Grammar Models of practical interest

• Regular Languages
• Context-free Languages
• Stochastic Extensions, e.g. S → AB (80%), S → B (20%)
• Subclasses
• Extended Models
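
The stochastic extension can be made concrete with a tiny sampler. In the sketch below the two S-rules and their probabilities are from the slide, while the A and B rules are made-up placeholders added only so the grammar is complete and the code runs:

    import random

    # Stochastic CFG sketch: S -> A B (80%), S -> B (20%) as on the slide;
    # the A and B rules are hypothetical, added to make the grammar complete.
    rules = {
        "S": [(0.8, ["A", "B"]), (0.2, ["B"])],
        "A": [(1.0, ["a"])],
        "B": [(1.0, ["b"])],
    }

    def generate(symbol):
        if symbol not in rules:                    # terminal symbol
            return [symbol]
        probs, bodies = zip(*rules[symbol])
        body = random.choices(bodies, weights=probs)[0]
        return [tok for sym in body for tok in generate(sym)]

    print(" ".join(generate("S")))                 # "a b" about 80% of the time, "b" otherwise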

9

Regular Language (Finite Automaton) Learning

• Learning from Representative Samples and Membership Queries [Angluin 1981]

• Learning by Membership and Equivalence Queries [Angluin 1987b]
• Learning Reversible Languages [Angluin 1982]

[Angluin 1981] D. Angluin, "A Note on the Number of Queries Needed to Identify Regular Languages", Information and Control, 51, pp. 76-87, 1981.
[Angluin 1987b] D. Angluin, "Learning regular sets from queries and counterexamples", Information and Computation, 75, pp. 87-106, 1987.
[Angluin 1982] D. Angluin, "Inference of Reversible Languages", Journal of the ACM, 29, pp. 741-765, 1982.

10

Regular Language Learning: Some of Our Contributions

• We designed efficient algorithms for incremental learning of DFA.
  – A version-space-based framework (using the lattice of DFAs) is reported in:
  – Suresh Jain, Narendra S. Chaudhari, "An Incremental Algorithm for Learning DFA from Characteristic Sample," International Journal of Computational Intelligence Research (IJCIR), Vol. 3, No. 4, pp. 297-312, Dec 2007. (Online: http://www.ripublication.com/ijcirv3/ijcirv3n4_3.pdf)
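
For reference, the object being learned here is small. The sketch below is a hand-written DFA for the earlier 1*0* language (illustrative only; it is not the learned automaton or the version-space construction of the paper above), showing the kind of model such learners output:

    # Hand-written DFA for 1*0*: transitions are a partial function; a missing
    # edge (a "1" seen after a "0") means the word is rejected.
    transitions = {
        ("q0", "1"): "q0", ("q0", "0"): "q1",
        ("q1", "0"): "q1",
    }
    accepting = {"q0", "q1"}

    def accepts(word):
        state = "q0"
        for ch in word:
            state = transitions.get((state, ch))
            if state is None:
                return False
        return state in accepting

    print(accepts("1100"), accepts("1010"))     # True False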

11

Language Structure using Fuzzy Similarity

• Model Construction– Grammar Learning– Grammar Models of Practical Interest– Regular Language Learning

• Context Free Grammar Learning– Alignment Based Learning (ABL) [ Zaanen, 2000 ] – Alignment Profile

• Alignment Profile as a Fuzzy Set– Profile Similarity

• Profile Similarity formulated in terms of Fuzzy Operations– Indistinguishable Grammar Symbols

• Profile Similarity and Indistinguishable Grammar Symbols– Profile Based Alignment Learning (PBAL)

• Concluding Remarks

• Research Interests

13

Alignment Based Learning (ABL)

• Introduced by van Zaanen [Zaanen 2000, 2002a]
• For context-free languages
• Based on alignment information
• Unsupervised
• Does not require language details

[Zaanen 2000] Menno van Zaanen, "ABL: Alignment-Based Learning", Proceedings of the 18th International Conference on Computational Linguistics (COLING), Saarbrücken, Germany, pp. 961-967, 31 Jul-4 Aug 2000.

Induction of Linguistic Knowledge

ILK Research Group, Dept. of Communication and Information Sciences, Faculty of Humanities, Tilburg University, P.O. Box 90153, NL-5000 LE Tilburg, The Netherlands

14

ABL Step 1

• Get alignments between each pair of sentences from the samples.

Oscar sees the large, green apple.

Cookie monster sees the red apple.

15

ABL Step 2

• Extract context-free grammar

A → Oscar
A → Cookie monster
B → large, green
B → red
S → A sees the B apple.

Oscar sees the large, green apple.

Cookie monster sees the red apple.
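
A rough sketch of both steps is shown below, using Python's generic difflib aligner in place of van Zaanen's edit-distance alignment (so the details differ from ABL proper): equal spans are kept as shared context in the S-rule, and each unequal slot is given a fresh hypothesised nonterminal.

    from difflib import SequenceMatcher

    s1 = "Oscar sees the large , green apple .".split()
    s2 = "Cookie monster sees the red apple .".split()

    matcher = SequenceMatcher(a=s1, b=s2, autojunk=False)
    rules, skeleton = [], []
    for k, (tag, i1, i2, j1, j2) in enumerate(matcher.get_opcodes()):
        if tag == "equal":
            skeleton.extend(s1[i1:i2])            # shared context stays in the S-rule
        else:
            nt = f"N{k}"                          # hypothesised constituent slot
            skeleton.append(nt)
            rules.append((nt, " ".join(s1[i1:i2])))
            rules.append((nt, " ".join(s2[j1:j2])))

    print("S ->", " ".join(skeleton))             # S -> N0 sees the N2 apple .
    for lhs, rhs in rules:
        print(f"{lhs} -> {rhs}")                  # N0 -> Oscar, N0 -> Cookie monster, ...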

16

Problems with ABL

• Unable to identify the following alignment:
  – Book [a trip to Goa beach].
  – Show [me Big Bird’s house].

• Generates the following alignment, and hence generates incorrect grammatical rules:
  – Gopal eats [biscuits].
  – Gopal eats [well].

Book and Show are “verbs” but ABL cannot learn this concept of verbs.

biscuits and well should not be “clubbed” into a single concept (biscuits is a noun, well is an adverb), but ABL will derive them from the same non-terminal symbol (since the other parts, Gopal eats, are aligned in ABL).

17

OUR EXTENSION: Profile Based Alignment Learning [1,2]

• Objective:
  – Refine the learned model by reducing identified invalid rules
  – Improve the precision of context-free grammatical rules extracted from samples

1: X. Wang and N. S. Chaudhari, "Alignment Based Similarity Measure for Grammar Learning", 2006 IEEE International Conference on Fuzzy Systems, pp. 1902-1909, July 16-21, 2006.
2: X. Wang and N. S. Chaudhari, "Profile Based Alignment Learning System for Language Inference", in Proceedings of the 14th International Conference on Intelligent and Adaptive Systems and Software Engineering, pp. 94-99, Toronto, Canada, July 20-22, 2005.

biscuits and well should not be “clubbed” into a single concept … but ABL will derive them from the same non-terminal symbol (since the other parts, Gopal eats, are aligned in ABL).

Book and Show are of the “same category” (verbs), but ABL cannot learn this concept of “same category” (verb).

19

Alignment Profile

• Suppose
  – “apples” aligned with “pears” in one context:
      • I brought some [apples] this morning
      • I brought some [pears] this morning
  – “apples” aligned with “well” in another context:
      • Gopal eats [pears]
      • Gopal eats [apples]
      • Gopal eats [well]

• Then the alignment profile for “apples” is:
    apples: 2/5, pears: 2/5, well: 1/5
• The alignment profile for “pears” is:
    apples: 2/5, pears: 2/5, well: 1/5
• The alignment profile for “well” is:
    apples: 1/5, pears: 1/5, well: 1/5
(The profiles of “apples” and “pears” match; the profile of “well” does not match.)

20

Alignment Profile: as a Fuzzy Set

• The alignment profile for “apples” is:
    apples: 2/5, pears: 2/5, well: 1/5

We define the alignment profile as a fuzzy set AP on the set of symbols N ∪ Σ, denoted AP = { <v, μAP(v)> | v ∈ N ∪ Σ }, where μAP : N ∪ Σ → [0,1] gives the membership grade of each symbol v in the fuzzy set AP.
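
A minimal sketch of this definition in code: the membership grade of each symbol is taken as its relative alignment frequency. The counts below are illustrative, chosen to reproduce the 2/5, 2/5, 1/5 profile shown above.

    from collections import Counter

    # Illustrative alignment counts for the word "apples".
    align_counts = Counter({"apples": 2, "pears": 2, "well": 1})

    def alignment_profile(counts):
        total = sum(counts.values())
        # membership grade of each symbol v in the fuzzy set AP
        return {v: c / total for v, c in counts.items()}

    print(alignment_profile(align_counts))   # {'apples': 0.4, 'pears': 0.4, 'well': 0.2}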

22

Profile Similarity

• Our basic idea: words with higher “Profile similarity” tend to be generated from the same nonterminal in context-free languages

[Figure: similarity vs. dissimilarity (%)]

24

Profile Similarity: formulated in terms of Fuzzy operations

OUR approach to prove that words with higher “profile similarity” tend to be generated from the same nonterminal in context-free languages:
  – Formulate the “Alignment Profile” as a fuzzy set over grammar symbols
  – Formulate Profile Similarity in terms of fuzzy set operation(s)

The profile similarity of alignment profiles AP1 and AP2, denoted Ps(AP1, AP2), is formulated as:

Ps(AP1, AP2) = |AP1 ∩ AP2| / |AP1 ∪ AP2|
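
Reading AP1 and AP2 as fuzzy sets, the intersection and union can be taken pointwise as min and max, and |·| as the sigma-count (sum of membership grades); a small sketch with illustrative profiles:

    def profile_similarity(ap1, ap2):
        support = set(ap1) | set(ap2)
        inter = sum(min(ap1.get(v, 0.0), ap2.get(v, 0.0)) for v in support)  # |AP1 ∩ AP2|
        union = sum(max(ap1.get(v, 0.0), ap2.get(v, 0.0)) for v in support)  # |AP1 ∪ AP2|
        return inter / union if union else 0.0

    # Illustrative profiles (memberships as in the earlier apples/pears/well example).
    ap_apples = {"apples": 0.4, "pears": 0.4, "well": 0.2}
    ap_pears  = {"apples": 0.4, "pears": 0.4, "well": 0.2}
    ap_well   = {"apples": 0.2, "pears": 0.2, "well": 0.2}

    print(profile_similarity(ap_apples, ap_pears))   # 1.0  (candidates for the same nonterminal)
    print(profile_similarity(ap_apples, ap_well))    # 0.6  (kept apart)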

26

Indistinguishable Grammar Symbols: our definition †

Define indistinguishable (grammar) symbols: In a CFG G = (N, Σ, P, S), with A ∈ N and B ∈ N, if for every p ∈ P and every “surrounding context” α, β with RHS(p) = αAβ, there exists a q ∈ P with RHS(q) = αBβ and LHS(p) = LHS(q), then A and B are called indistinguishable symbols; otherwise, A and B are distinguishable.

p: X → αAβ
q: Y → αBβ   (with X = Y, i.e. the same left-hand side)

† : motivation: we introduce this definition to formalize the construction to “merge” grammar symbols (e.g. “apples”, “pears”) with high profile similarity
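
A hedged sketch of this definition as a direct check over a production list (applied in both directions; the example grammar fragment is hypothetical, in the spirit of the ABL example):

    def indistinguishable(productions, a, b):
        prods = set(productions)
        def covered(x, y):
            # every occurrence of x in a right-hand side needs a production with y
            # in the same surrounding context and with the same left-hand side
            for lhs, rhs in productions:
                for i, sym in enumerate(rhs):
                    if sym == x and (lhs, rhs[:i] + (y,) + rhs[i + 1:]) not in prods:
                        return False
            return True
        return covered(a, b) and covered(b, a)

    g = [("S", ("A", "sees", "the", "B", "apple")),
         ("S", ("C", "sees", "the", "B", "apple")),
         ("A", ("Oscar",)),
         ("C", ("Cookie", "monster"))]
    print(indistinguishable(g, "A", "C"))   # True: A and C occur in the same contexts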

27

Indistinguishable Grammar Symbols & Profile Similarity

Theorem 1. In a SCFG G = (N, Σ, P, S), with A ∈ N, w1 ∈ N ∪ Σ, w2 ∈ N ∪ Σ, {A → w1 (pr1), A → w2 (pr2)} ⊆ P, pr1 > 0, pr2 > 0, and assuming A can be generated from S: given enough generated sentences, Ps(AP(w1), AP(w2)) = 1.

Theorem 2. In a SCFG G = (N, Σ, P, S), with A ∈ N, B ∈ N, w1 ∈ N ∪ Σ, w2 ∈ N ∪ Σ, {A → w1 (pr1), B → w2 (pr2)} ⊆ P, pr1 > 0, pr2 > 0, and assuming A and B are distinguishable: the profile similarity Ps(AP(w1), AP(w2)) = 1 - σ, where σ is a positive real number less than 1.

Note: to represent the “frequency of occurrence” of a rule, we use the “stochastic” extension of the CFG, i.e. an SCFG.

28

Example

Sample text 1: London came out on top when the announcement was made by IOC President Jacques Rogge in Singapore.

Sample text 2: This was the fourth bid from Britain. London will become the first city to have hosted the Olympics three times.

Sample text 3: This was the fourth bid from Britain. London will become the first city to have hosted the Olympics three times.

Production rules and their counts are:

p1: Sn1 → [London] [came] [out] [on] [top] [when] [the] [announcement] [was] [made] [by] [IOC] [President] [Jacques] [Rogge] [in] [Singapore]   (C(p1) = 1)
p2: S → Sn1   (C(p2) = 1)
p3: Sn2 → [This] [was] [the] [fourth] [bid] [from] [Britain]   (C(p3) = 2)
p4: Sn3 → [London] [will] [become] [the] [first] [city] [to] [have] [hosted] [the] [Olympics] [three] [times]   (C(p4) = 2)
p5: S → Sn2 Sn3   (C(p5) = 2)
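
In the stochastic reading of the grammar, these counts become rule probabilities by normalising over rules that share a left-hand side. A small sketch using the two S-rules above (right-hand sides abbreviated to their nonterminals):

    from collections import defaultdict

    counts = {
        ("S", ("Sn1",)): 1,
        ("S", ("Sn2", "Sn3")): 2,
    }

    totals = defaultdict(int)
    for (lhs, _), c in counts.items():
        totals[lhs] += c           # total count per left-hand side

    for (lhs, rhs), c in counts.items():
        print(f"{lhs} -> {' '.join(rhs)}   (pr = {c / totals[lhs]:.2f})")
    # S -> Sn1       (pr = 0.33)
    # S -> Sn2 Sn3   (pr = 0.67)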

30

Profile Based Alignment Learning: PBAL Framework

Makes use of the following concepts:
  – Slot Alignment Score
  – Sentence Similarity
  – Sentence Similarity Threshold
  – Dynamic Sentence Similarity Threshold

STEPS in our PBAL Framework:
• Find a pairwise alignment for each sentence pair
• Calculate and accumulate alignment counts (slot alignment, sentence similarity)
• Calculate alignment profiles and profile similarities
• Redo the alignment until there is no further change
• Extract the grammar by extracting a non-terminal for each slot

31

Experiments, and Example Rules Generated

For experiments and testing, we used:
• the CHILDES database: a collection of child-related speech
• samples from the English-American corpora
• no editing / cleanup was done; the data was used directly for alignment
• number of sentences: more than 10,000

One sample rule is:

don't put that in the {1155}[dough]
(+) oh lets put that in the {1155}[middle]

where {1155} is the ID of a nonterminal of the CFG, with [dough] and [middle] identified as generated from {1155}.

(The number in parentheses is the number of times the rule is used.)

CHILDES Reference: B. MacWhinney, The CHILDES Project: Tools for analyzing talk, third ed., Lawrence Erlbaum Associates, Mahwah, NJ, 2000

32

Rules Generated: Example

\769 = \437 \510 \544 \757 .(1)

\769 = \470 \515 \508 \758 \534 \466 \759 \470 \515 .(1)
\769 = \578 \760 \439 \399 \761 .(1)
\769 = \478 \499 \617 \484 \510 \490 \762 \469 \490 \475 \527 \647 \465 .(1)
\769 = \385 \473 \763 \764 \445 \765 .(1)
\769 = \394 \768 \402 \533 \766 \765 \647 .(1)
\769 = \552 \442 \766 \765 \647 .(1)
\769 = \552 \606 \406 .(1)
\1073 = \594 .(1)
\1072 = \635 .(1)
\1073 = \679 .(1)
\1072 = \680 .(1)
\1078 = \419 .(1)
\1078 = \431 .(1)
\1079 = \450 \542 .(1)
\1080 = \675 .(1)
\1080 = \580 \507 \678 .(1)
\1081 = \450 .(1)
\1079 = \733 \734 .(1)
\1081 = \465 .(1)
\1090 = \386 \387 \388 \389 .(1)
\1092 = \497 .(1)
\1091 = \487 \498 .(1)
\1093 = \390 \516 .(1)
\1094 = \526 .(1)
\1094 = \527 \528 .(1)
\1090 = \536 \537 \537 \538 \521 .(1)
\1092 = \582 .(1)
\1091 = \599 \439 .(1)
\1095 = \435 \460 \545 \414 \530 .(1)
\1095 = \416 \622 \516 .(1)
\1093 = \400 \647 \646 .(1)
\1097 = \515 .(1)
\1096 = \503 .(1)
\1098 = \460 .(1)
\1097 = \563 \690 .(1)
\1096 = \691 .(1)
\1098 = \414 .(1)
\1099 = \385 \466 \437 .(1)
\1099 = \484 \401 .(1)
\1120 = \460 .(1)
\1120 = \1098 .(1)
\1124 = \400 \564 .(1)
\1123 = \565 .(1)
\1124 = \390 .(2)
\1123 = \1095 .(2)
\1129 = \668 \544 \440 \669 .(1)
\1129 = \536 \766 \765 \767 .(1)

33

Precision and Number of Rules Generated by Different PBAL Formulations

Formulation            Precision   # of Rules Generated
Dynamic SST PBAL       96.93%      99
R_Score based PBAL     94.97%      60
PBAL SA                91.47%      34
PBAL NSA               91.36%      22
ABL SST                85.82%      14

35

Language Structure Learning: Concluding Remarks

• Grammar Learning: computationally hard
  – More tractable with “additional information” given in terms of:
      • ‘negative’ examples
      • ‘structural’ information

• Learning of Regular Grammars:
  – Many approaches have been proposed by researchers
  – We have developed one approach based on “Version Spaces”: useful for incremental learning

• Learning of Context Free Grammars:
  – Researchers have investigated approaches based on soft-computing models
  – We extended Zaanen’s Alignment Based Learning (ABL) to make use of “Profile Based Alignment Learning”

• Applications:
  – One area: Games

36

Questions ? …

• More information:
  – Web: www.ntu.edu.sg/home/asnarendra
• Contact:
  – Email: asnarendra@ntu.edu.sg
  – Personal Emails: nsc183@gmail.com, nscmp@lycos.com

37

Research Interests

• Algorithms
  – Graph Isomorphism
  – Parsing of Context Free Grammars (CFGs)

• Soft Computing
  – Specialized Recurrent Neural Networks (RNNs) for Protein Secondary Structure (PSS) Prediction
      – Long Short-Term Memory (LSTM) Networks
      – Segmented Memory Recurrent Neural Networks (SMRNNs)
      – Bidirectional SMRNNs (BSMRNNs)
  – Binary Neural Networks (BNNs)
      – Construction Method(s)
  – Grammar Learning
      – For Regular Grammars: Use of Version Spaces
      – For CFGs: Use of Profile Based Alignment Learning

• Simulations and Games
  – BSP for Urban Terrain Modeling
  – Game AI: non-conventional

38

Non-conventional Game AI: Computational Learning, and Computer Science Models: Outline

• Non-conventional Game AI
  – Games with Computational Models
      • Computational Learning
      • Soft-computing
      • Neural Networks and Binary Neural Networks

• Concluding Remarks

39

Computational Models: Formal Language

• Language Structures
  – Verbal Explanation
  (Study of Forms of Language Structures)

• Formal Languages
• Chomsky’s classification:
  – Regular (Type 3) Languages
  – Context Free (Type 2) Languages
  – Context Sensitive (Type 1) Languages
  – Phrase Structured (Type 0) Languages

40

Automaton Model              Formal Languages they represent
Table (represents a mapping) … (phoneme – viseme)
Finite Automaton (FA)        Regular languages
Pushdown Automaton           Context-free languages
Linear Bounded Automaton     Context-sensitive languages
Turing Machine               Phrase-structured languages

41

Our Work: Computational Learning - I

• Learning of Regular Grammars
• Through “Reversible Automata”
• Construction through
  – Positive Examples, and
  – Negative Examples
• “Optimal Construction” remains NP-Complete
  – However, “good” algorithms are possible for construction
• Games with such learning models

Web: http://www.ntu.edu.sg/home/asnarendra

42

Our Work: Computational Learning - II

• Learning of Context Free Grammars
• Through “Structured Examples”
• Construction through
  – Positive “structured” Examples, and
  – Negative “structured” Examples

• Use of Soft-Computing Approaches for learning of Structures

• PBAL Based Approach for CFG Learning

• Games with such learning models

43

Soft Computing

• Current state of soft computing:
• A collection of the following techniques:
  – Fuzzy Logic
  – Neural Networks
  – Evolutionary Computing techniques inspired by behavioural studies, like
      • Genetic Algorithms
      • ant colony optimization
      • small world theory
      • theory of memes

44

Neural Networks

• Models cortical structures of the brain.
• Neurons: interconnected processing elements that work together to produce output.
• Co-operation: the output relies on the co-operation of the individual neurons within the network.
• Parallel Processing: processing of information is parallel (in contrast with the sequential nature of the Turing model).

45

Neural Networks
• Difficulties:
  – Initialization of the network's starting structure
  – Training parameters
  – Weight updates
• Other factors:
  – Number of hidden layers
  – Number of hidden nodes
  – Initial weights
• Some (famous) solutions:
  – Self Organizing Maps
  – Binary Neural Networks

46

Binary Neural Networks (BNNs)

• Hard-limiter nonlinearity
• Our Contribution: constructive methods for
  – Number of hidden layers
  – Number of hidden nodes
• Based on geometric (hypersphere) concepts, we have developed BNN construction methods.

[Figure: a binary neuron computing a weighted sum with w1 = +1, w2 = +2, threshold = +1, and hard-limiter output ±1]
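
A minimal sketch of the hard-limiter unit in the figure (weights and threshold as shown there; the bipolar test inputs are illustrative):

    def binary_neuron(x1, x2, w1=1, w2=2, threshold=1):
        s = w1 * x1 + w2 * x2
        return 1 if s >= threshold else -1        # hard-limiter: output is +1 or -1

    for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
        print((x1, x2), "->", binary_neuron(x1, x2))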

47

Future Game AI: Computational Learning, and Computer Science Models: Outline

• Non-conventional Game AI
  – Games with Computational Models
      • Computational Learning
      • Soft-computing
      • Neural Networks and Binary Neural Networks

• Concluding Remarks

48

Non-conventional Game AI: Concluding Remarks

• Two Computational Models
  – Chomsky’s Grammars: Formal Languages
  – Automata Models: FA, PDA, TMs

• Existing Computational Learning Techniques allow automatic construction of
  – Regular Languages, and
  – Part of Context Free Languages
  – Algorithms for such constructions remain compute-intensive, especially for Context Free Languages

49

Non-conventional Game AI: Concluding Remarks: Continued …

• Neural Network Construction: trial & error, and time consuming
  – “constructive methods”
      • Many “specialized” variants available
      • GA-based approaches: NEAT (2004), RT-NEAT (2006), etc.
  – Binary Neural Networks
      • Construction methods based on Geometric Space Expansion

• Future Game AI
  – would integrate these technologies

50

Questions ? …

Email: asnarendra@ntu.edu.sg; Personal Emails: nsc183@gmail.com, nscmp@lycos.com

Web: www.ntu.edu.sg/home/asnarendra