An introduction to CHILDES
Rianne Schippers
Outline
• What is CHILDES?
• Where do you find CHILDES?
• Why would you use CHILDES?
• How do you use CHILDES?
What is CHILDES?
• Child Language Data Exchange System
• Brian MacWhinney
• Online database of first and second language acquisition in children.
– Written transcripts
– Audio
– Video
• Also contains data from children who are not typically developing.
What is CHILDES?
• Recorded, natural speech.
– Recorded in home setting
– Recorded at regular intervals
– Longitudinal data
• Typological variation.
– Germanic
– Romance
– Slavic
– Asian
Where is CHILDES?
Link: http://childes.psy.cmu.edu/
• “Databases”, i.e. the datasets.
• “Database manual” describing each dataset.
• Programs you can use to browse the databases
• Manuals that explain how to use the programs
Why use CHILDES?
• Answer questions about language acquisition
• Experimental studies
– Does a child at age X know Y?
   – Do 3-year-olds know passives?
   – Do 2-year-olds know inflectional morphology?
– What interpretation do children at age X assign to Y?
   – Do 4-year-olds understand binding?
   – Do 5-year-olds understand scope freezing?
Why use CHILDES?
• Questions experiments cannot easily answer:
– Role played by input
– Order of acquisition
– Manner of acquisition
– Causality
• Longitudinal study
• Big, universal questions
– Lexical categories
– Inflectional morphology
– Argument structure
Why use CHILDES?
• Does the interaction between language type and pronoun omission match the predictions of parameter-setting models?
• Are children with Down syndrome responsive to maternal requests?
• How do children first learn mental state verbs such as “remember” or “know”?
Why use CHILDES?
• Smaller, language-specific questions
– Verb second
– Subjects (EPP)
– Particle verbs
• Comparative studies
– Acquisition of determiners
• Exploration
– Mean Length of Utterance, frequencies
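To make the exploratory measures concrete, here is a minimal Python sketch of a mean length of utterance and a frequency count. It is only an illustration under simplifying assumptions: the utterances are invented, length is counted in whitespace-separated words, and the function names are hypothetical. CLAN's own mlu and freq commands may count differently (for instance in morphemes rather than words).

```python
from collections import Counter

# Invented (speaker, utterance) pairs; not taken from any CHILDES corpus.
UTTERANCES = [
    ("CHI", "I have a ball"),
    ("MOT", "do you want the ball"),
    ("CHI", "want ball"),
]

def mlu_in_words(utterances, speaker):
    """Mean number of whitespace-separated words per utterance for one speaker."""
    lengths = [len(text.split()) for who, text in utterances if who == speaker]
    return sum(lengths) / len(lengths) if lengths else 0.0

def word_frequencies(utterances, speaker):
    """Count how often each (lower-cased) word form occurs for one speaker."""
    counts = Counter()
    for who, text in utterances:
        if who == speaker:
            counts.update(text.lower().split())
    return counts

print(mlu_in_words(UTTERANCES, "CHI"))
print(word_frequencies(UTTERANCES, "CHI").most_common(3))
```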
How to use CHILDES?
• Download and install the dataset(s) you are interested in. The “database manual” describes each dataset's:
– Language
– Age(s)
– Number of children
• Download and install CLAN (Computerized Language Analysis):
– A search and statistics engine for CHILDES.
• Or use NLTK’s CHILDES module.
How to use CHILDES?
• All files are transcribed in CHAT format
– Codes for the Human Analysis of Transcripts
• Format
– Files start with @-headers: information about participants and setting
– The rest of the file contains *-tiers and %-tiers
– *-tiers: specify the speaker (*CHI = child)
– %-tiers: relate to the preceding *-tier and give extralinguistic information
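The file layout described above can be sketched in a few lines of Python. This is an illustration, not a full CHAT parser: the sample text is invented, the `Utterance` class and `parse_chat` function are hypothetical names, and continuation lines and other details of real transcripts are ignored.

```python
from dataclasses import dataclass, field

# Invented miniature transcript in the spirit of CHAT; tabs follow the tier names.
SAMPLE = """\
@Begin
@Participants:\tCHI Target_Child, MOT Mother
*MOT:\tdo you want the ball ?
*CHI:\tI have a ball .
%mor:\tPRO|I&1S V|have-PRES DET|a&INDEF N|ball .
@End
"""

@dataclass
class Utterance:
    speaker: str                                # e.g. "CHI"
    text: str                                   # content of the *-tier
    tiers: dict = field(default_factory=dict)   # dependent %-tiers by name

def parse_chat(text):
    """Split CHAT-like text into a list of @-headers and a list of utterances."""
    headers, utterances = [], []
    for line in text.splitlines():
        if line.startswith("@"):
            headers.append(line)
        elif line.startswith("*"):
            speaker, _, body = line[1:].partition(":")
            utterances.append(Utterance(speaker, body.strip()))
        elif line.startswith("%") and utterances:
            # A %-tier belongs to the most recent *-tier.
            name, _, body = line[1:].partition(":")
            utterances[-1].tiers[name] = body.strip()
    return headers, utterances

headers, utterances = parse_chat(SAMPLE)
for utterance in utterances:
    print(utterance.speaker, "|", utterance.text, "|", utterance.tiers)
```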
How to use CHILDES?
• %-tiers are also used for coding
• %pho for phonology
• %mor for morphology
*CHI: I have a ball
%mor: PRO|I&1S V|have-PRES DET|a&INDEF N|ball
• %syn for syntax
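As a rough illustration of how a %mor tier can be read by a program, the sketch below splits each item into part of speech, stem and feature codes. It assumes the notation of the example above (category before "|", features after "-" or "&"); `parse_mor` is a hypothetical helper, and the full %mor notation in the CHAT manual has more cases than this handles.

```python
import re

def parse_mor(mor_line):
    """Return (part_of_speech, stem, features) tuples for the items on a %mor line."""
    parsed = []
    for token in mor_line.split():
        if "|" not in token:
            continue  # skip anything that is not a category|word item
        pos, _, rest = token.partition("|")
        # The stem runs up to the first "-" or "&"; the remaining parts are feature codes.
        parts = re.split(r"[-&]", rest)
        parsed.append((pos, parts[0], parts[1:]))
    return parsed

for item in parse_mor("PRO|I&1S V|have-PRES DET|a&INDEF N|ball"):
    print(item)
```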
How to use CHILDES?
• Some more annotations
#       unfilled pause between words
6       schwa
&       phonological fragment
xxx     unintelligible speech
[/]     retracing without correction, e.g.: then [/] then
[//]    retracing with correction, e.g.: then [//] but
< >["]  quotation mark, used when the child literally repeats something
• All notation can be found in the CHAT manual
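A small Python sketch can show how some of these codes might be told apart from ordinary words when processing a transcript line. It covers only a handful of the symbols listed above, the example utterance is invented, and `label_tokens` is a hypothetical helper; the CHAT manual remains the authority on the notation.

```python
# Labels for a few of the codes listed above (not the full CHAT inventory).
MARKERS = {
    "#": "unfilled pause",
    "xxx": "unintelligible speech",
    "[/]": "retracing without correction",
    "[//]": "retracing with correction",
}

def label_tokens(utterance):
    """Pair every whitespace-separated token with a rough label."""
    labelled = []
    for token in utterance.split():
        if token in MARKERS:
            label = MARKERS[token]
        elif token.startswith("&"):
            label = "phonological fragment"
        else:
            label = "word"
        labelled.append((token, label))
    return labelled

for token, label in label_tokens("then [//] but xxx &ba ball"):
    print(token, "->", label)
```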
How to use CHILDES?
• Go to the command window
• Every search starts with a command
– kwal: word search
– combo: combined search for 2 or more words
– freq: frequency counts
– mlu: MLU counts
• A command is followed by search parameters
How to use CHILDES?
• Some standard CLAN parameters
+t  selects the utterances of a specified speaker
+s  selects a word to be searched
+u  specifies that all search results are stored in one file
+r  deals with the treatment of material between parentheses
+f  output is stored in the (specified) file(s)
• Not all commands have the same search parameters
– To see which parameters a command accepts, type the command on its own in the command window and hit enter
How to use CHILDES?
• Searching with kwal
– Speaker(s)
– Word
– File(s)
• The command must come first; the order of the search parameters is irrelevant
• The command and every search parameter must be separated from each other by a space
How to use CHILDES?
• Setting the speaker parameter
– Identify the speaker(s)
+t = look for that specific speaker
-t = look for everyone but that specific speaker
• We are interested in the child
– command parameter-speaker-child
kwal +t*CHI
How to use CHILDES?
• Setting the word parameter
– Decide what word you want to look for
+s = look for that specific word
-s = look for everything except that specific word
• Let’s say we want to know whether the child has used the verb ‘want’.
– command speaker parameter-word-want
kwal +t*CHI +s"want"
How to use CHILDES?
• Specifying the file
• Two ways:
– Using the ‘file in’ button
– Specifying the file in the command line
• Let’s say we want to start our search in file sarah023.cha
– Command speaker word file
kwal +t*CHI +s"want" sarah023.cha
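For readers who think in code, the logic of such a search (one speaker, one word, one set of utterances) can be imitated in Python. This is only a sketch of the idea, not how CLAN is implemented: the utterances are invented and `keyword_search` is a hypothetical function.

```python
# Invented (speaker, utterance) pairs standing in for the contents of one file.
UTTERANCES = [
    ("MOT", "do you want the ball"),
    ("CHI", "want ball"),
    ("CHI", "I have a ball"),
]

def keyword_search(utterances, speaker, word):
    """Return the utterances by `speaker` that contain `word` as a whole token."""
    hits = []
    for who, text in utterances:
        if who == speaker and word.lower() in text.lower().split():
            hits.append((who, text))
    return hits

print(keyword_search(UTTERANCES, "CHI", "want"))
```

Changing the speaker argument to "MOT" gives the mother's utterances instead, which mirrors swapping the speaker code after +t.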
How to use CHILDES?
Exercise:
– Discover whether the mother uses the verb ‘want’ in file sarah023.cha
Steps to take:
– Determine the command
– Identify the speaker
– Decide on the word
– Specify the file
kwal +t*MOT +s"want" sarah023.cha
How to use CHILDES?
• Searching for several words
– Make a list in .txt format
– Enter the list as the word you are looking for
• For example:
– A list with all auxiliaries
– Named auxiliary.txt
– Parameter: +s@auxiliary.txt
kwal +t*CHI +s@auxiliary.txt sarah023.cha
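The idea of searching with a word list can be imitated in Python as follows. The sketch assumes a plain-text list with one word per line; the list contents, the utterances and both function names are invented for illustration, and a temporary file stands in for auxiliary.txt.

```python
import os
import tempfile

def load_word_list(path):
    """Read one word per line, ignoring blank lines and case."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}

def utterances_with_any(utterances, words):
    """Keep the utterances that contain at least one word from the list."""
    return [text for text in utterances if words & set(text.lower().split())]

# A throwaway file stands in for auxiliary.txt.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "auxiliary.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("can\nwill\nmust\n")
    auxiliaries = load_word_list(path)

print(utterances_with_any(["I can do it", "want ball", "he will go"], auxiliaries))
```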
How to use CHILDES?
• Output screen is limited
• Store the data in a separate file
– Parameter: +f
– The three letters after +f become the extension of the output file
– For example: aux
• Command speaker word parameter-store-filename file
kwal +t*CHI +s"want" +faux sarah023.cha
How to use CHILDES?
• Retype the command: kwal +t*CHI +s"want" sarah023.cha
• Notice: some material is in between brackets
*CHI: wan(t) do (a)gain
• What does this mean?
– Child actually said ‘wan’ instead of ‘want’.
• By default, CLAN includes the material in between brackets.
– CLAN will look for ‘want’
How to use CHILDES?
• What does this mean?
– A search for ‘want’ will give you both ‘wan(t)’ and ‘want’.
• Control whether the search includes material in between brackets.
• +r parameter
+r1 = default, include material in brackets
+r2 = exclude material in brackets
+r5 = exclude rephrased material
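The two readings of a transcribed form such as ‘wan(t)’ can be made explicit in a short Python sketch. The function names are hypothetical and the code only illustrates the distinction that the +r options control; it makes no claim about how CLAN handles it internally.

```python
import re

def with_omitted_sounds(word):
    """Read 'wan(t)' as the full target word: keep the letters, drop the brackets."""
    return word.replace("(", "").replace(")", "")

def as_actually_said(word):
    """Read 'wan(t)' as what was pronounced: drop the bracketed letters as well."""
    return re.sub(r"\([^)]*\)", "", word)

for word in "wan(t) do (a)gain".split():
    print(word, "->", with_omitted_sounds(word), "/", as_actually_said(word))
```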
How to use CHILDES?
• Try out: kwal +t*CHI +s"want" +r2 sarah023.cha
• +r5 allows for exclusion of rephrased material
• What is rephrased material?
*CHI: I wanna [: want to] eat cereal
• In the default setting, CLAN searches the rephrased form, here ‘want to’
• The +r5 option allows you to look for what the child actually said, here ‘wanna’.
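The difference between searching the rephrased form and searching what was actually said can be sketched in Python. The two functions below are hypothetical helpers that assume the `word [: replacement]` notation of the example above, with a single word before the bracket.

```python
import re

def use_replacement(utterance):
    """'wanna [: want to]' becomes 'want to' (the rephrased form)."""
    return re.sub(r"\S+ \[: ([^\]]+)\]", r"\1", utterance)

def use_original(utterance):
    """'wanna [: want to]' becomes 'wanna' (what the child actually said)."""
    return re.sub(r" \[: [^\]]+\]", "", utterance)

line = "I wanna [: want to] eat cereal"
print(use_replacement(line))
print(use_original(line))
```

A search for ‘want’ would succeed on the first reading and fail on the second, which is the contrast the +r5 option is about.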
How to use CHILDES?
• Searching with both +s and -s
• CLAN only allows you to specify either +s or -s
• Imagine you want to look for all the forms of one verb, but are not interested in other words that merely look similar
• For example: all the verbal forms of ‘go’
• First of all: wild card
– The wild card * stands for any sequence of characters
How to use CHILDES?
• Adding the * to the word search
+s"go*"
• Words that this search will find are: go, gone, goes, going
• But also words such as: got, good, goat, god etc.
• Ideally, you want to specify both +s and -s
• Piping option
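The over-matching problem with ‘go*’ is easy to reproduce in Python. The sketch uses the standard library's shell-style pattern matching as a stand-in, on the assumption that it behaves like the CLAN wild card for this simple pattern; the word list is invented.

```python
from fnmatch import fnmatchcase

# Invented word list: forms of 'go' mixed with unrelated words.
WORDS = ["go", "goes", "going", "gone", "got", "good", "goat", "dog"]

matches = [word for word in WORDS if fnmatchcase(word, "go*")]
print(matches)
```

Everything that starts with ‘go’ is returned, including ‘got’, ‘good’ and ‘goat’; only ‘dog’ is left out.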
How to use CHILDES?
• Piping: the second command operates on the output of the first command
• First command: look for ‘go*’
• Second command: exclude ‘good’, ‘got’, etc.
• In order for the second command to be able to operate on the first, the first command must give an output in CHAT format
• +d option
How to use CHILDES?
• First command:
– Look for ‘go*’
– For the speaker *CHI
– Output must be in CHAT format
– In file sarah040.cha
kwal +t*CHI +s"go*" +d sarah040.cha
• Second command: exclude ‘got’
kwal -s"got"
How to use CHILDES?
• Piping the first and the second command
first command piping-operation second command
kwal +t*CHI +s"go*" +d sarah040.cha | kwal -s"got"
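The two-step logic of the pipe (first keep, then exclude) can be written out in Python. This is a sketch with invented utterances and hypothetical function names, using shell-style matching as a stand-in for the CLAN wild card.

```python
from fnmatch import fnmatchcase

def keep_matching(utterances, pattern):
    """First step: keep utterances in which some word matches the pattern."""
    return [u for u in utterances if any(fnmatchcase(w, pattern) for w in u.split())]

def drop_matching(utterances, pattern):
    """Second step: drop utterances in which some word matches the pattern."""
    return [u for u in utterances if not any(fnmatchcase(w, pattern) for w in u.split())]

UTTERANCES = ["I go home", "he got it", "that is good", "we going now", "no more"]

first_pass = keep_matching(UTTERANCES, "go*")   # output of the first command
second_pass = drop_matching(first_pass, "got")  # second command works on that output
print(second_pass)
```

Note that ‘that is good’ survives, because only ‘got’ was excluded. In this sketch an utterance containing both ‘go’ and ‘got’ would be dropped as a whole, so the results of the exclusion step are worth checking by eye.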
How to use CHILDES?
• Looking for more than one word at a time
• Searching with combo
– Speaker(s)
– Words
– File(s)
• Boolean operators:
^ = immediately followed by
* = any character
+ = or
! = not
How to use CHILDES?
• Setting the speaker parameter
combo +t*CHI
• Setting the word parameter
– Let’s look for the combination of ‘want’ and ‘to’
– ‘want’ immediately followed by ‘to’
combo +t*CHI +s"want^to"
How to use CHILDES?
• Specifying the file
– Let’s look in file sarah034.cha
combo +t*CHI +s"want^to" sarah034.cha
• Combo looks for the words in sequence by default
• The +x parameter allows you to look for two or more words in any order
How to use CHILDES?
• Searching for ‘want’ directly followed by ‘to’ without +x only gives ‘want to’
combo +t*CHI +s"want^to" sarah034.cha
• Searching for ‘want’ directly followed by ‘to’ with +x gives both ‘want to’ and ‘to want’
combo +t*CHI +s"want^to" +x sarah034.cha
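The contrast between the two searches above (fixed order versus either order) can be sketched in Python. The functions are hypothetical and the utterances invented; the code illustrates the logic of ‘immediately followed by’ rather than combo itself.

```python
def has_adjacent(tokens, first, second):
    """True if `first` is immediately followed by `second`."""
    return any(a == first and b == second for a, b in zip(tokens, tokens[1:]))

def has_adjacent_any_order(tokens, first, second):
    """True if the two words are adjacent in either order."""
    return has_adjacent(tokens, first, second) or has_adjacent(tokens, second, first)

for utterance in ["I want to go", "I like to want things", "want a cookie to eat"]:
    tokens = utterance.split()
    print(utterance, "|",
          has_adjacent(tokens, "want", "to"), "|",
          has_adjacent_any_order(tokens, "want", "to"))
```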
Pitfalls and limitations
• Cannot test for acceptability or ungrammaticality
• Be aware of:
– Routines
– Imitations
– Speech errors
– Mistranscriptions
Protocol
• CHILDES transcripts were collected with great effort and are now freely available. In return for using them, you reward the creators with citations.
• Cite the latest edition of MacWhinney’s book:
MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk. Third Edition. Mahwah, NJ: Lawrence Erlbaum Associates.
• Cite the publication selected by the creator(s) of the database(s) you have used.
– References can be found in the ‘database manuals’ on the site