The Ability of Statistical Learning: An Individual Differences Study
Noam Siegelman (Hebrew U.) and Ram Frost (The Hebrew University, Haskins Laboratories & BCBL)

Starting point – SL and L2 learning (Frost et al., 2013)
Individual differences in a visual statistical learning task predict L2 literacy learning.
The theoretical stance regarding L2 learning
In any language, all linguistic domains (e.g., syntax, gender marking, writing system, morphological structure) display some level of quasi-regularity.
Language learning is primarily a process of picking up and assimilating the statistical regularities of a linguistic environment.
The fundamental cognitive faculty of implicit correlation-learning which underlies any form of learning plays a primary role in second language acquisition.
Second language literacy acquisition: basic theoretical assumptions
The organization of any reading system reflects the overall structure of the language (Frost, BBS, 2012).
Reading in L2 requires not only the acquisition of new graphemes and GTP correspondences, but mostly the implicit assimilation of deep linguistic structure (e.g., morphology).
L2 literacy eventually reflects the statistical properties of the language; the correlations of orthographic, phonologic, and semantic units. These are implicitly (or explicitly) picked up by the readers.
A statistical approach to writing systems
Spoken words in a given language are characterized by patterns of transitional probabilities that constrain their possible phonological structure.
Each writing system is characterized by a different level of consistency (i.e., high and low correlations) in GTP mapping.
Each writing system is characterized by a set of correlations that determine the possible co-occurrences of letter sequences.
Each language offers some systematic correlations between form and semantic meaning through morphological structure.
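A minimal sketch of how transitional probabilities (TPs) can be estimated from a stream of units, assuming the usual definition TP(B | A) = frequency of the pair AB divided by the frequency of A. The function name and the toy syllable stream are hypothetical and serve only as illustration.

```python
from collections import Counter

def transitional_probabilities(stream):
    """Return {(a, b): P(b | a)} estimated from adjacent pairs in `stream`."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count each element only where it has a successor (everything but the last item).
    first_counts = Counter(stream[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Hypothetical toy stream built from two made-up "words": tu-pi and go-la.
stream = ["tu", "pi", "go", "la", "tu", "pi", "tu", "pi",
          "go", "la", "go", "la", "tu", "pi"]
tps = transitional_probabilities(stream)
print(tps[("tu", "pi")])  # within-word transition
print(tps[("pi", "go")])  # transition across a word boundary
```

In this toy stream the within-word transitions are fully predictable, while transitions across word boundaries are not, which is the kind of contrast a learner could exploit.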
In a nutshell:
There is a general cognitive capacity for statistical learning.
Like any human capacity, it would have a normal distribution of individual differences.
This capacity predicts, at least to some extent, individual differences in the ease or difficulty in acquiring the new set of regularities that determine reading in L2.
S.L: a quick literature review
The amazing power of S.L (Endress & Mehler, 2009).
S.L in children with SLI (Evans et al., 2009).
Implicit learning of artificial grammars (Reber, 1967).
Implicit learning and tacit knowledge (Reber, 1993).
Implicit learning and statistical learning: One phenomenon two approaches (Perruchet & Pacton, 2006).
Past and current research on S.L
As Romberg and Saffran (2010) highlight:
Most research in S.L has demonstrated the ability of infants/adults (and even rodents!) to extract elements out of linguistic or non-linguistic input using the underlying statistics,
Focusing only on the group level,
Using only one S.L/I.L task.
This is not enough for our interest in individual differences!
The capacity for statistical learning- four theoretical questions:
Is S.L a unified capacity or a unified mechanism responsible for all possible detection of correlations, or is it a componential capacity?
If it is not (and probably it is not) a unified capacity, how are the different components of S.L interrelated?
How does S.L relate to other general cognitive capacities such as intelligence, memory, executive functions, etc.?
Is S.L a “stable” capacity of the individual that remains more or less constant across time, such as intelligence, etc?
Statistical Learning: A preliminary mapping sentence (Facet theory, Guttman 1959)
Statistical learning is the ability to implicitly pick up regularities of {verbal/non-verbal} information in the {visual/auditory} modality, when contingencies are {adjacent/non-adjacent}, thereby shaping behavior.
Modality:
Visual
Auditory
The present research project
48 students of the Hebrew University were tested in a series of experimental tasks that monitored general cognitive abilities and verbal abilities. Participants were also tested with various forms of statistical learning tasks that cover some of the S.L theoretical space; these tasks were administered twice, at T1 and T2.
Aims:
To examine in a within-subject design whether performance in a given S.L task predicts performance in another S.L task. This will tell us something about the unity/componentiality of S.L as a cognitive ability.
To examine how S.L is related to other general cognitive or verbal abilities. This will tell us whether S.L is a subset (nested) of a more general ability such as intelligence or memory.
To examine whether S.L is a stable (therefore reliable) capacity of individuals. Reliability is a necessary condition for the validity of predictions!
Investigated tasks
ASL (auditory modality, verbal, adjacent contingencies).
ANA (auditory, non-verbal, adjacent)
AVN (auditory, verbal, non-adjacent)
Cognitive abilities tasks
4 separate testing sessions.
Each participant is tested once in all tasks of cognitive abilities.
Each participant is tested twice in all tasks of statistical learning, test-retest.
Testing sessions of S.L are separated by at least three months.
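As a rough sketch, the three auditory tasks listed above can be placed in the facet space defined by the mapping sentence. The facet label `material` and the dictionary layout are illustrative choices, not part of the original design; only the tasks whose facet values are spelled out in the task list are included.

```python
from itertools import product

# Facets of the mapping sentence; the key names are hypothetical labels.
FACETS = {
    "material": ("verbal", "non-verbal"),
    "modality": ("visual", "auditory"),
    "contingency": ("adjacent", "non-adjacent"),
}

# Tasks as described in the task list: (material, modality, contingency).
TASKS = {
    "ASL": ("verbal", "auditory", "adjacent"),
    "ANA": ("non-verbal", "auditory", "adjacent"),
    "AVN": ("verbal", "auditory", "non-adjacent"),
}

cells = list(product(*FACETS.values()))          # all 2 x 2 x 2 combinations
covered = {cell: name for name, cell in TASKS.items()}
for cell in cells:
    print(cell, "->", covered.get(cell, "(no listed task)"))
```

Listing the cells this way makes it easy to see which parts of the theoretical space a given task battery leaves unsampled.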
24 shapes:
(adapted from Turk-Browne et al. 2005, by Glicksohn & Cohen, 2011)
8 Triplets


The 8 triplets are presented in a random order to create a 10-minute familiarization stream.
Q: Can subjects pick up the rules regarding the TPs of the visual shapes?
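A minimal sketch of how such a familiarization stream could be assembled, assuming the 24 shapes are grouped into 8 fixed triplets and the stream consists of repeated shuffled passes over the triplets. The number of repetitions, the shape IDs, and the function name are hypothetical; the actual ordering constraints of the task are not given here.

```python
import random

def build_stream(triplets, repetitions, rng):
    """Concatenate `repetitions` shuffled passes over the triplets."""
    stream = []
    for _ in range(repetitions):
        order = list(triplets)
        rng.shuffle(order)
        for triplet in order:
            stream.extend(triplet)
    return stream

shapes = list(range(24))                                   # hypothetical shape IDs 0..23
triplets = [tuple(shapes[i:i + 3]) for i in range(0, 24, 3)]
stream = build_stream(triplets, repetitions=24, rng=random.Random(0))

# Collect the set of shapes that ever follow each shape in the stream.
followers = {}
for a, b in zip(stream, stream[1:]):
    followers.setdefault(a, set()).add(b)

# Inside a triplet the next shape is fully predictable (TP = 1).
within = all(followers[t[0]] == {t[1]} and followers[t[1]] == {t[2]}
             for t in triplets)
print(len(stream), within)
```

Only the last shape of each triplet has more than one possible successor, so triplet boundaries are the only points of uncertainty in the stream.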
The experimental setting
Auditory Statistical Learning – Adjacent
The 12 words are presented in a random order to create a 10-minute familiarization stream.
[Figure: familiarization]
Statistical Learning: Non-Linguistic
Exactly the same as the adjacent linguistic statistical learning experiment; the only difference is that the syllables are replaced with 18 non-linguistic noises (A, B … R).
[Figure: familiarization; values shown on the slide: 0.187, 0.5]
Statistical Learning - Non Adjacent
Subjects can extract “words” based upon the non-adjacent statistics: words have consonant “roots”.
Adjacent TP between syllables: p = 0.5
Non-adjacent TP between consonants: p = 1 within words, p = 0.5 between words
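A minimal sketch of how the non-adjacent statistics could be checked, assuming three consonant roots, two freely varying vowels per syllable, and no immediate repetition of a root. The specific roots, the vowel set, and the function names are illustrative assumptions, not the actual stimulus list.

```python
import random
from collections import Counter

ROOTS = [("p", "v", "g"), ("d", "k", "b"), ("m", "t", "s")]  # illustrative consonant roots
VOWELS = "ai"                                                # hypothetical: two vowels per slot

def make_stream(n_words, rng):
    """Syllable stream in which a root is never immediately repeated."""
    syllables, previous = [], None
    for _ in range(n_words):
        root = rng.choice([r for r in ROOTS if r != previous])
        syllables.extend(c + rng.choice(VOWELS) for c in root)
        previous = root
    return syllables

def tp(seq):
    """Transitional probabilities P(b | a) over adjacent items of `seq`."""
    pairs = Counter(zip(seq, seq[1:]))
    firsts = Counter(seq[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}

stream = make_stream(3000, random.Random(1))
syllable_tp = tp(stream)                     # adjacent statistics over whole syllables
consonant_tp = tp([s[0] for s in stream])    # consonant tier: skips the intervening vowels

print(consonant_tp[("p", "v")])              # within a root: 1.0
print(round(consonant_tp[("g", "d")], 2))    # across a word boundary: close to 0.5
print(round(syllable_tp[("pa", "va")], 2))   # adjacent syllables within a word: close to 0.5
```

Under these assumptions the consonant tier carries a perfectly reliable cue to word membership, while adjacent syllables are only weakly predictive.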
Statistical Learning - Non Adjacent
During the test, participants hear legal “words” that are composed of a consonant root and a new vowel pattern, and nonwords that are assembled from “part-roots” and the same vowel patterns, and are asked to choose which one belongs to the “language”.
pu ve gi
po vi ga
di ku bo
du ke ba
me to sa
ma tu se
vu ge di
bo pi va
gi du ko
ku be ma
te so da
ga mu te
Probabilistic Serial Learning Task
Kaufman et al., 2010
In each trial, a stimulus appears at one of four locations on the screen, and subjects are asked to press the corresponding key.
Press ‘1’
Press ‘2’
Press ‘3’
Press ‘4’
Probabilistic Serial Learning Task
Kaufman et al., 2010
Subjects are unaware that the sequence of the successive stimuli is determined probabilistically by the last two presented stimuli:
P= 0.85 for sequence A (1-2-1-4-3-2-4-1-3-4-2-3).
P= 0.15 for sequence B (3-2-3-4-1-2-4-3-1-4-2-1).
Total of 960 trials in 8 blocks.
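A minimal sketch of how such a trial list could be generated, assuming both 12-element sequences are treated as cyclic so that every pair of consecutive locations determines a unique successor in each sequence. The function names and the choice of the first two seed trials are hypothetical.

```python
import random

SEQ_A = [1, 2, 1, 4, 3, 2, 4, 1, 3, 4, 2, 3]   # probable sequence (p = 0.85)
SEQ_B = [3, 2, 3, 4, 1, 2, 4, 3, 1, 4, 2, 1]   # improbable sequence (p = 0.15)

def successor_map(seq):
    """Map each pair of consecutive locations to the location that follows it (cyclically)."""
    n = len(seq)
    return {(seq[i], seq[(i + 1) % n]): seq[(i + 2) % n] for i in range(n)}

NEXT_A, NEXT_B = successor_map(SEQ_A), successor_map(SEQ_B)

def generate_trials(n_trials, p_probable=0.85, rng=None):
    """Return (locations, is_probable); the two seed trials are labelled None."""
    rng = rng or random.Random()
    locations = SEQ_A[:2]            # assumption: seed with the start of sequence A
    is_probable = [None, None]
    while len(locations) < n_trials:
        context = (locations[-2], locations[-1])
        probable = rng.random() < p_probable
        locations.append((NEXT_A if probable else NEXT_B)[context])
        is_probable.append(probable)
    return locations, is_probable

locations, is_probable = generate_trials(960, rng=random.Random(0))
print(locations[:12], is_probable[:12])
```

Because the two sequences always disagree on the successor of a given pair, each generated trial can be labelled unambiguously as probable or improbable.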
The dependent variable for this task is the difference between the average RT in trials taken from the 85% sequence (probable trials) and the average RT in trials taken from the 15% sequence (improbable trials) over the last 6 blocks (720 trials). This shows whether subjects implicitly learned the probability of the sequences.
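A minimal sketch of this score, assuming the difference is taken as improbable minus probable (so that a positive value indicates learning) and that blocks are numbered 1 to 8, with blocks 3 to 8 being the last six. The function name, the tuple layout, and the RT values are hypothetical.

```python
from statistics import mean

def srt_learning_score(trials, first_scored_block=3):
    """Mean RT on improbable trials minus mean RT on probable trials.

    `trials` is an iterable of (block, is_probable, rt_ms) tuples,
    with blocks numbered from 1; earlier blocks are excluded.
    """
    scored = [(p, rt) for block, p, rt in trials if block >= first_scored_block]
    probable = [rt for p, rt in scored if p]
    improbable = [rt for p, rt in scored if not p]
    return mean(improbable) - mean(probable)

# Hypothetical reaction times in milliseconds.
trials = [
    (1, True, 520), (2, False, 515),                    # early blocks: excluded
    (3, True, 400), (3, True, 420), (3, False, 450),
    (8, True, 390), (8, False, 430),
]
score = srt_learning_score(trials)
print(round(score, 1))
```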
Distribution of scores in the five S.L/I.L tasks
For a task to reliably predict cognitive abilities, it has to:
Have a level of difficulty that is not at floor or at ceiling.
Have a variance that is large enough to allow for a wide distribution of individual differences.
Visual Statistical Learning Distribution
Mean=22.2 (of 32, 69.4%), SD=5.56
Histogram (number of participants at each score, 1 to 32): 0 0 0 0 0 0 0 0 0 0 2 1 3 6 8 7 16 12 8 15 11 13 12 4 6 8 4 10 9 4 11 9
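The summary statistics can be recomputed from the histogram counts, assuming that the i-th count is the number of participants who scored i (scores 1 to 32) and that the reported SD is the sample SD. The variable names and the ceiling check are illustrative.

```python
from math import sqrt

counts = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 3, 6, 8, 7, 16, 12, 8, 15, 11,
          13, 12, 4, 6, 8, 4, 10, 9, 4, 11, 9]

# Assumption: counts[i] is the number of participants with score i + 1.
n = sum(counts)
mean = sum(score * c for score, c in enumerate(counts, start=1)) / n
variance = sum(c * (score - mean) ** 2
               for score, c in enumerate(counts, start=1)) / (n - 1)
sd = sqrt(variance)
at_ceiling = counts[-1] / n        # share of participants with a perfect score

print(n, round(mean, 1), round(sd, 2), round(at_ceiling, 3))
```

Under these assumptions the counts reproduce the reported mean of 22.2 and SD of 5.56, which supports reading the row as one count per score.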
Auditory Statistical Learning (Adjacent) Distribution
n=102
Mean=21.6 (of 36, 60.0%), SD=5.64
Histogram (number of participants at each score, 1 to 36): 0 0 0 0 0 0 0 0 1 2 2 1 2 3 3 5 5 7 3 9 9 3 7 10 6 4 7 5 2 0 0 2 1 1 1 1
Auditory Statistical Learning (Non-Linguistic, Adjacent) Distribution
n=103
Mean=20.56 (of 36, 57.1%), SD=3.33
Histogram (number of participants at each score, 1 to 36): 0 0 0 0 0 0 0 0 0 0 0 1 0 1 5 5 6 6 15 15 12 10 8 6 5 5 0 1 1 1 0 0 0 0 0 0
Auditory Statistical Learning (Linguistic, Non-Adjacent) Distribution
n=102
*but without the two extreme observations: SD = 3.52
Histogram (number of participants at each score, 1 to 36): 0 0 0 0 0 0 0 0 1 0 0 0 3 2 3 7 4 8 12 8 11 14 9 8 3 4 1 2 0 1 0 0 0 0 0 1
Serial Reaction Time (SRT) Distribution
Mean=17.54 ms, SD = 17.36
n=48
Histogram (bin in ms: number of participants): -15: 0, -10: 1, -5: 3, 0: 3, 5: 4, 10: 4, 15: 7, 20: 5, 25: 10, 30: 4, 35: 1, 40: 2, 45: 0, 50: 1, 55: 1, 60: 1, 65: 0, 70: 0, 75: 0, 80: 1
Summary – SL tasks Distributions
Q1: Is S.L a subset (nested) of general cognitive abilities?
What are the inter-correlations of scores in the various S.L tasks and the scores obtained in the cognitive tasks?
Cognitive Tasks
Digit span- WM
Verbal working memory
Syntactic processing- verbal abilities
Switch task- Executive functions
If S.L. is an independent theoretical construct, we should expect it not to be nested within a given faculty (very small correlations), with some correlation with intelligence.
Switch task (executive functions)
Participants are presented with letters or digits, in one of four different locations. Letters are always shown in the first and second positions, and digits in the third and fourth.
In each trial, one stimulus is presented (a letter or a digit).
When seeing a digit, participants are asked to decide whether the digit is odd or even.
When seeing a letter, they are asked to decide whether the letter is a consonant or a vowel.
Switch task (executive functions)
There are two kinds of trials:
‘Stay’ Trials: When the stimulus is of the same type as the previous stimulus (letter after a letter or a digit after a digit).
‘Switch’ Trials: When the stimulus is of a different kind from the previous stimulus (letter after a digit or a digit after a letter).
The dependent variable is the mean RT difference between ‘Switch’ trials and ‘Stay’ trials.
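A minimal sketch of the switch-cost computation, assuming each trial is classified by comparing its stimulus type with that of the preceding trial and that the first trial, which has no predecessor, is left out. The function name and the RT values are hypothetical.

```python
from statistics import mean

def switch_cost(trials):
    """Mean RT on 'switch' trials minus mean RT on 'stay' trials.

    `trials` is a list of (stimulus_type, rt_ms) in presentation order,
    where stimulus_type is "letter" or "digit".
    """
    stay, switch = [], []
    for (prev_type, _), (cur_type, rt) in zip(trials, trials[1:]):
        (stay if cur_type == prev_type else switch).append(rt)
    return mean(switch) - mean(stay)

# Hypothetical reaction times in milliseconds.
trials = [("letter", 600), ("letter", 550), ("digit", 700),
          ("digit", 560), ("letter", 720), ("letter", 540)]
cost = switch_cost(trials)
print(cost)
```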
Syntactic Processing
Participants read a syntactically complex sentence, followed by two short sentences, and are asked to choose which of the two sentences is semantically congruent with the target sentence.
The gardener who stood next to the teacher ran away.
The gardener ran away. / The teacher ran away.
The score is a standardized score based on the subject’s RT, provided by the Israeli National Institute of Testing and Evaluation (NITE).
Correlations of VSL and tests of general cognitive capacities
VWM
0.151 (n=103)
-0.163 (n=47)
-0.029 (n=47)
Correlations of ASL (adjacent) and tests of general cognitive capacities
VWM
Correlations of AVN (non-adjacent) and tests of general cognitive capacities
VWM
Correlations of ANA (non-linguistic) and tests of general cognitive capacities
VWM
VWM
Q2: Is S.L a unified capacity? What are the inter-correlations of scores in the various S.L tasks?
Correlations between the different Statistical Learning Tasks
ASL- non linguistic
ASL- non adjacent
ASL- Non linguistic
ASL- non adj
Q3:
Is there test-retest reliability?
Test-Retest Reliability - VSL
n=47, r=0.67
Chart data (T1 and T2 scores): 20 26 17 14 11 19 18 22 25 18 22 22 26 23 15 20 20 25 22 19 28 23 16 14 32 31 21 32 26 28 19 24 26 15 21 21 31 21 20 17 15 19 18 17 22 15 31 29 24 31 21 17 16 22 18 15 31 15 24 21 23 17 15 19 16 31 28 13 19 19 14 16 32 32 17 32 29 17 21 25 21 16 32 14 27 32 18 17 30 16 15 26 13 32 31
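A minimal sketch of a test-retest computation, assuming the reported r is Pearson's correlation between each participant's T1 and T2 scores. The paired scores below are hypothetical, since the pairing of the values on the slide is not available here.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for six participants at the two sessions.
t1 = [20, 26, 17, 14, 31, 22]
t2 = [22, 25, 19, 15, 29, 18]
r = pearson_r(t1, t2)
print(round(r, 2))
```

A high r means that participants keep their rank order across sessions, which is what a stable individual ability requires.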
Test-Retest Reliability – ASL
n=48, r=0.6
Chart data (T1 and T2 scores): 11 13 18 21 17 36 23 20 18 16 21 10 19 19 27 10 23 18 16 21 18 28 20 21 20 21 20 32 18 32 14 16 16 11 24 27 25 22 24 33 25 14 25 23 24 21 26 27 9 18 22 25 18 35 27 21 22 21 23 17 18 20 19 22 19 23 20 17 19 25 19 19 28 32 24 33 27 33 17 18 18 19 22 27 22 14 17 29 24 29 19 28 33 24 23 26
Test-Retest Reliability – ASL Non-Adjacent
n=46, r=0.49
Chart data (T1 and T2 scores): 23 16 21 28 16 22 27 25 20 26 19 23 15 16 19 36 22 16 23 23 15 19 21 20 16 9 24 19 23 24 17 24 17 21 26 20 18 22 20 19 22 16 19 23 21 13 24 25 21 26 24 16 20 17 18 25 20 13 23 17 22 18 36 15 19 20 18 16 20 18 17 16 13 20 17 23 10 17 26 22 18 20 23 22 27 16 21 16 12 25 15 20 22 26
Test-Retest Reliability – ASL Non-linguistic
n=48, r=0.1
Chart data (T1 and T2 scores): 18 23 21 19 22 14 21 19 18 17 15 19 15 21 23 18 12 20 19 21 30 28 21 21 20 20 18 16 15 17 16 16 15 20 25 23 18 19 17 20 15 20 19 22 17 19 22 22 18 21 20 16 17 32 22 26 16 22 17 26 25 25 18 20 22 20 24 20 26 25 20 20 12 24 23 17 20 24 19 20 21 21 18 27 24 21 23 20 15 18 26 19 22 22 24 29
Test-Retest Reliability – SRT
Is there a learning effect from T1 to T2?
T1 average
T2 average
# subs improved
Conclusions- S.L tasks
Not all tasks that monitor S.L as an individual ability are equally good in terms of performance distribution (variance) and test-retest reliability.
Of all the tasks examined in our study, VSL seems to provide the best fit. It has a normal distribution of performance, it is reliable, it has a small but significant correlation with intelligence, and it does seem to predict success or failure in L2 literacy (e.g., Frost et al. 2013, Psych Science).
Tasks that do not allow for test-retest reliability should be avoided when individual differences are concerned (SRT, non-linguistic auditory sounds).
General conclusions
S.L is not a unified ability. Individuals may be good in detecting correlations in one context and not as good in another.
S.L is not a subset of intelligence or WM, although some tasks correlate with intelligence.
Detection of correlations and regularities in a given context seems to be a stable and reliable individual ability.
Our research so far shows that individual differences in detecting transitional probabilities of adjacent shapes in the visual modality are a predictor of L2 literacy acquisition.
More questions to be answered:
S.L and language learning:
Is there something specific about the correlation of VSL with reading, or does VSL correlate with other aspects of L2 learning?
Do different aspects of S.L predict different aspects of L2 learning? Is a certain aspect of S.L particularly important in a specific language (e.g., non-adjacent S.L for Hebrew)?
Is there an influence of the native linguistic environment on the ability to pick-up statistics?
Will training in S.L improve L2 learning?
More questions to be answered:
The mechanism of S.L and methodology
How can our results (which point to a componential ability) co-exist with Arit’s model of S.L?
How can we improve the measurement of S.L?
Towards a normalized measurement of individual differences in S.L?
Thanks
Alona Narkiss
Henry Brice
Tali Ben-porat
Dana Yankelevich
Amit Elazar