An introduction to CHILDES

Rianne Schippers


Outline

• What is CHILDES?

• Where do you find CHILDES?

• Why would you use CHILDES?

• How do you use CHILDES?

What is CHILDES?

• Child Language Data Exchange System

• Brian MacWhinney

• Online database of first and second language acquisition in children.

– Written transcripts
– Audio
– Video

• Also contains data from children who are not typically developing.

What is CHILDES?

• Recorded, natural speech.

– Recorded in home setting
– Recorded at regular intervals
– Longitudinal data

• Typological variation.

– Germanic
– Romance
– Slavic
– Asian

Where is CHILDES?

Link: http://childes.psy.cmu.edu/

• “Databases”, i.e. the datasets.

• “Database manual” describing each dataset.

• Programs you can use to browse the databases

• Manuals that explain how to use the programs

Why use CHILDES?

• Answer questions about language acquisition

• Experimental studies

– Does a child at age X know Y?
   – Do 3-year-olds know passives?
   – Do 2-year-olds know inflectional morphology?

– What interpretation do children at age X assign to Y?
   – Do 4-year-olds understand binding?
   – Do 5-year-olds understand scope freezing?

Why use CHILDES?

• Questions experiments cannot easily answer:

– Role played by input
– Order of acquisition
– Manner of acquisition
– Causality

• Longitudinal study

• Big, universal questions

– Lexical categories
– Inflectional morphology
– Argument structure

Why use CHILDES?

• Does the interaction between language type and pronoun omission match the predictions of parameter-setting models?

• Are children with Down syndrome responsive to maternal requests?

• How do children first learn mental state verbs such as “remember” or “know”?

Why use CHILDES?

• Smaller, language-specific questions

– Verb second
– Subjects (EPP)
– Particle verbs

• Comparative studies

– Acquisition of determiners

• Exploration

– Mean Length of Utterance, frequencies

How to use CHILDES?

• Download and install the dataset(s) you are interested in. The “database manual” describes, for each dataset:

– Language
– Age(s)
– Number of children

• Download and install CLAN (Computerized Language Analysis):

– A search and statistics engine for CHILDES.

• Or use the CHILDES corpus reader in NLTK.

How to use CHILDES?

• All files are transcribed in CHAT format

– Codes for the Human Analysis of Transcripts

• Format

– Files start with @-headers: information about participants and setting

– The rest of the file contains *-tiers and %-tiers
– *-tiers: specify the speaker (*CHI = child)
– %-tiers: are related to the previous *-tier and give extralinguistic information
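The three line types described above can be told apart by their first character. The following is a minimal Python sketch of that idea, not part of CLAN: the function name `parse_chat` and the sample transcript are made up for illustration, and real CHAT files contain more conventions (such as continuation lines) than this handles.

```python
def parse_chat(text):
    """Split CHAT text into @-headers and *-tiers with their %-tiers."""
    headers, utterances = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("@"):
            # Header line: information about participants and setting.
            headers.append(line)
        elif line.startswith("*"):
            # Main tier: "*CHI:" names the speaker, the rest is the utterance.
            speaker, _, words = line[1:].partition(":")
            utterances.append({"speaker": speaker, "text": words.strip(), "tiers": {}})
        elif line.startswith("%") and utterances:
            # Dependent tier: attach it to the preceding main tier.
            name, _, content = line[1:].partition(":")
            utterances[-1]["tiers"][name] = content.strip()
    return headers, utterances


# Hypothetical sample, invented for this sketch.
sample = """@Begin
@Participants:\tCHI Target_Child, MOT Mother
*MOT:\twhat is that ?
*CHI:\tI have a ball .
%mor:\tPRO|I&1S V|have-PRES DET|a&INDEF N|ball
@End"""

headers, utterances = parse_chat(sample)
print(utterances[1]["speaker"], utterances[1]["tiers"])
```

Each utterance keeps its own dictionary of %-tiers, which mirrors the rule that a %-tier belongs to the *-tier directly before it.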

How to use CHILDES?

• %-tiers are also used for coding

• %pho for phonology

• %mor for morphology

*CHI: I have a ball
%mor: PRO|I&1S V|have-PRES DET|a&INDEF N|ball

• %syn for syntax
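In the %mor example, each word is coded as a category, a "|" separator, and a stem followed by optional markers. A small Python sketch of how such a tier could be taken apart is shown here; the function name `parse_mor` is hypothetical, and the assumption that "&" and "-" both introduce markers is read off the single example in the slide rather than taken from the CHAT manual.

```python
import re


def parse_mor(mor_tier):
    """Turn a %mor tier into (category, stem, markers) tuples."""
    parsed = []
    for token in mor_tier.split():
        category, _, rest = token.partition("|")
        # Assumption: the stem runs up to the first '&' or '-'.
        stem, markers = re.match(r"([^&-]*)(.*)", rest).groups()
        parsed.append((category, stem, markers))
    return parsed


analysis = parse_mor("PRO|I&1S V|have-PRES DET|a&INDEF N|ball")
for category, stem, markers in analysis:
    print(category, stem, markers)
```

Working on the %mor tier rather than the main tier makes it possible to search by category (for example all items coded V) instead of by word form.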

How to use CHILDES?

• Some more annotations

# unfilled pause between words
6 schwa
& phonological fragment
xxx unintelligible speech
[/] retracing without correction, e.g.: then [/] then
[//] retracing with correction, e.g.: then [//] but
< > ["] quotation mark, used when the child literally repeats something

• All notation can be found in the CHAT manual

How to use CHILDES?

• Go to the command window

• Every search starts with a command

– kwal: word search
– combo: combined search for 2 or more words
– freq: frequency counts
– mlu: mean length of utterance (MLU)

• A command is followed by search parameters
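To make the last two commands concrete, here is a rough Python sketch of what a frequency count and a mean length of utterance amount to. It is not CLAN's implementation: the function names and the toy utterances are invented, and this version counts words, whereas CLAN's mlu may count morphemes.

```python
from collections import Counter

# Toy data: (speaker, utterance) pairs invented for this sketch.
utterances = [
    ("CHI", "I want ball"),
    ("MOT", "do you want the ball"),
    ("CHI", "want ball"),
]


def freq(utterances, speaker):
    """Count how often each word occurs in one speaker's utterances."""
    return Counter(
        word
        for who, text in utterances
        if who == speaker
        for word in text.split()
    )


def mlu_in_words(utterances, speaker):
    """Average number of words per utterance for one speaker."""
    lengths = [len(text.split()) for who, text in utterances if who == speaker]
    return sum(lengths) / len(lengths) if lengths else 0.0


print(freq(utterances, "CHI"))
print(mlu_in_words(utterances, "CHI"))
```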

How to use CHILDES?

• Some standard CLAN parameters

+t selects the utterances of a specified speaker
+s selects a word to be searched
+u specifies that all search results are stored in one file
+r deals with the treatment of material between parentheses
+f output is stored in the (specified) file(s)

• Not all commands have the same search parameters

– To see the parameters of a command, type the command on its own in the command window and hit enter

How to use CHILDES?

• Searching with kwal

– Speaker(s)
– Word
– File(s)

• The command must come first; the order of the search parameters is irrelevant

• The command and every search parameter must be separated from each other by a space

How to use CHILDES?

• Setting the speaker parameter

– Identify the speaker(s)
+t = look for that specific speaker
-t = look for everyone but that specific speaker

• We are interested in the child

– command parameter-speaker-child

kwal +t*CHI

How to use CHILDES?

• Setting the word parameter

– Decide what word you want to look for
+s = look for that specific word
-s = look for everything except that specific word

• Let’s say we want to know whether the child has used the auxiliary ‘want’.

– command speaker parameter-word-want

kwal +t*CHI +s"want"

How to use CHILDES?

• Specifying the file

• Two ways:

– Using the ‘file in’ button
– Specifying the file in the command line

• Let’s say we want to start our search in file sarah023.cha

– Command speaker word file

kwal +t*CHI +s"want" sarah023.cha
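The logic of a speaker-plus-word search can be sketched in a few lines of Python. This illustrates the idea only, not how CLAN works internally; the function `kwal` below merely borrows the command's name, and the transcript lines are invented.

```python
def kwal(lines, speaker, word):
    """Return the main-tier lines of `speaker` that contain `word`."""
    prefix = f"*{speaker}:"
    hits = []
    for line in lines:
        if not line.startswith(prefix):
            continue  # corresponds to the +t speaker selection
        words = line[len(prefix):].lower().split()
        if word.lower() in words:  # corresponds to the +s word selection
            hits.append(line)
    return hits


# Invented lines standing in for the contents of a .cha file.
transcript = [
    "*MOT:\tdo you want some juice ?",
    "*CHI:\tI want juice .",
    "*CHI:\tmore juice .",
]

child_hits = kwal(transcript, "CHI", "want")
print(child_hits)
```

Changing the speaker argument is all that is needed to run the same search for another participant.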

How to use CHILDES?

Exercise:

– Discover whether the mother uses the auxiliary ‘want’ in file sarah023.cha

Steps to take:

– Determine the command
– Identify the speaker
– Decide on the word
– Specify the file

kwal +t*MOT +s"want" sarah023.cha

How to use CHILDES?

• Searching for several words

– Make a list in .txt format
– Enter the list as the word you are looking for

• For example:

– A list with all auxiliaries
– Named auxiliary.txt
– Parameter: +s@auxiliary.txt

kwal +t*CHI +s@auxiliary.txt sarah023.cha
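Searching with a word list comes down to reading one word per line from a text file and testing each utterance against the whole set. A Python sketch under those assumptions follows; the helper names, the three listed auxiliaries, and the utterances are all invented, and the one-word-per-line layout is an assumption about the list file.

```python
import os
import tempfile


def load_word_list(path):
    """Read one search word per line, ignoring blank lines."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def find_any(utterances, words):
    """Keep the utterances that contain at least one word from the list."""
    return [u for u in utterances if words & set(u.lower().split())]


# Write a small, hypothetical auxiliary.txt to a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "auxiliary.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("can\nwill\nmust\n")
    auxiliaries = load_word_list(path)

hits = find_any(["I can do it", "more juice", "you must go"], auxiliaries)
print(hits)
```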

How to use CHILDES?

• Output screen is limited

• Store the data in a separate file

– Parameter: +f
– The three letters after +f become the extension of the output file
– For example: aux

• Command speaker word parameter-store-filename file

kwal +t*CHI +s"want" +faux sarah023.cha

How to use CHILDES?

• Retype the command: kwal +t*CHI +s"want" sarah023.cha

• Notice: some material is in parentheses

*CHI: wan(t) do (a)gain

• What does this mean?

– The child actually said ‘wan’ instead of ‘want’.

• By default, CLAN includes the material in parentheses.

– CLAN will look for ‘want’

How to use CHILDES?

• What does this mean?

– A search for ‘want’ will give you both ‘wan(t)’ and ‘want’.

• Control whether the search includes material in parentheses.

• +r parameter

+r1 = default, include material in parentheses
+r2 = exclude material in parentheses
+r5 = exclude rephrased material

How to use CHILDES?

• Try out: kwal +t*CHI +s"want" +r2 sarah023.cha

• +r5 allows for exclusion of rephrased material

• What is rephrased material?

*CHI: I wanna [: want to] eat cereal

• By default, CLAN searches the rephrased form (here ‘want to’)

• The +r5 option lets you search for the form the child actually said (here ‘wanna’).
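The effect of these choices on what a search actually sees can be sketched in Python. The function `search_form` and its two keyword arguments are hypothetical stand-ins for the +r settings described in the slides, not CLAN options, and the regular expressions cover only the two notations shown here.

```python
import re


def search_form(utterance, keep_parens=False, use_replacement=True):
    """Return the text a word search would be run against."""
    if use_replacement:
        # Default behaviour: "wanna [: want to]" is searched as "want to".
        utterance = re.sub(r"\S+ \[: ([^\]]+)\]", r"\1", utterance)
    else:
        # Keep what was actually said and drop the replacement.
        utterance = re.sub(r" \[: [^\]]+\]", "", utterance)
    if not keep_parens:
        # Default behaviour: "wan(t)" is searched as "want".
        utterance = utterance.replace("(", "").replace(")", "")
    return utterance


print(search_form("wan(t) do (a)gain"))
print(search_form("I wanna [: want to] eat cereal"))
print(search_form("I wanna [: want to] eat cereal", use_replacement=False))
```

With the defaults, a search for ‘want’ matches both example utterances; with `use_replacement=False`, only a search for ‘wanna’ matches the second one.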

How to use CHILDES?

• Searching with both +s and -s

• A single CLAN command accepts either +s or -s, not both

• Imagine you want to look for all the inflected forms of one verb, but are not interested in other words that merely look similar

• For example: all the verbal forms of ‘go’

• First of all: the wild card

– The wild card * matches any sequence of characters

How to use CHILDES?

• Adding the * to the word search

+s"go*"

• Words that this search will find are: go, gone, goes, going

• But also words such as: got, good, goat, god etc.

• Ideally, you want to specify both +s and -s

• Piping option

How to use CHILDES?

• Piping: the second command operates on the output of the first command

• First command: look for ‘go*’; second command: exclude ‘good’, ‘got’, etc.

• In order for the second command to be able to operate on the first, the first command must give an output in CHAT format

• +d option

How to use CHILDES?

• First command:

– Look for ‘go*’
– For the speaker *CHI
– Output must be in CHAT format
– In file sarah040.cha

kwal +t*CHI +s"go*" +d sarah040.cha

• Second command: exclude ‘got’

kwal -s"got"

How to use CHILDES?

• Piping the first and the second command

first command piping-operation second command

kwal +t*CHI +s"go*" +d sarah040.cha | kwal -s"got"
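The two-step logic of the pipe (first keep, then exclude) can be sketched in Python as two filters applied one after the other. The function names and utterances are invented, and Python's `fnmatchcase` stands in for CLAN's wild card matching as an approximation.

```python
from fnmatch import fnmatchcase


def keep_matching(utterances, pattern):
    """First step: keep utterances with a word matching the pattern."""
    return [u for u in utterances if any(fnmatchcase(w, pattern) for w in u.split())]


def drop_matching(utterances, pattern):
    """Second step: drop utterances with a word matching the pattern."""
    return [u for u in utterances if not any(fnmatchcase(w, pattern) for w in u.split())]


# Invented child utterances.
child = ["I going home", "I got it", "he goes there", "that good", "no more"]

step1 = keep_matching(child, "go*")   # like +s"go*"
step2 = drop_matching(step1, "got")   # like -s"got" on the first output
print(step2)
```

The result still contains ‘that good’, which shows why each unwanted form (‘good’, ‘goat’, ‘god’) needs its own exclusion.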

How to use CHILDES?

• Looking for more than one word at a time

• Searching with combo

– Speaker(s)
– Words
– File(s)

• Boolean operators:

^ = immediately followed by
* = any character
+ = or
! = not

How to use CHILDES?

• Setting the speaker parameter

combo +t*CHI

• Setting the word parameter

– Let’s look for the combination of ‘want’ and ‘to’
– ‘want’ immediately followed by ‘to’

combo +t*CHI +s"want^to"

How to use CHILDES?

• Specifying the file

– Let’s look in file sarah034.cha

combo +t*CHI +s"want^to" sarah034.cha

• Combo looks for the words in sequence by default

• The +x parameter allows you to look for two or more words in any order

How to use CHILDES?

• Searching for ‘want’ directly followed by ‘to’ without +x only gives ‘want to’

combo +t*CHI +s"want^to" sarah034.cha

• Searching for ‘want’ directly followed by ‘to’ with +x gives both ‘want to’ and ‘to want’

combo +t*CHI +s"want^to" +x sarah034.cha
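The difference between the two searches can be sketched in Python by checking pairs of neighbouring words. The function `combo` below only borrows the command's name; its `any_order` argument is a hypothetical stand-in for +x, and the utterances are invented.

```python
def combo(utterances, first, second, any_order=False):
    """Find utterances where `first` is immediately followed by `second`."""
    hits = []
    for utterance in utterances:
        words = utterance.split()
        pairs = list(zip(words, words[1:]))  # all neighbouring word pairs
        in_sequence = (first, second) in pairs
        reversed_too = any_order and (second, first) in pairs
        if in_sequence or reversed_too:
            hits.append(utterance)
    return hits


# Invented child utterances.
child = ["I want to go", "to want is fine", "want it to"]

print(combo(child, "want", "to"))
print(combo(child, "want", "to", any_order=True))
```

The third utterance is never returned, because ‘want’ and ‘to’ are not next to each other in it.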

Pitfalls and limitations

• Cannot test for acceptability or ungrammaticality

• Be aware of:

– Routines
– Imitations
– Speech errors
– Mistranscriptions

Protocol

• CHILDES transcripts were collected with great effort and are now freely available. In return for using them, you reward the creators with citations.

• Cite the latest edition of MacWhinney’s book: MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk. Third Edition. Mahwah, NJ: Lawrence Erlbaum Associates.

• Cite the publication selected by the creator(s) of the database(s) you have used.

– References can be found in the ‘database manuals’ on the site