Our research

Statistical learning, learning biases and the nature of human languages

How do we learn language? In particular, how do young children end up as competent adult-like speakers of their native language(s)? I am interested in how the way in which children process the language input they receive leads them to build a language “system” which can produce and understand utterances they have never heard before. One approach to investigating this explores how the process of statistical learning over the language input identifies recurring patterns and leads to appropriate generalization.

Our research explores this question using Artificial Language Learning experiments, where participants learn and are tested on novel languages created by the experimenter. The languages can be very simple, but this provides a controlled methodology for exploring how the statistics of language input affect what is learned. Our experiments have explored how the structure of language input can affect the extent to which learners extract generalizations [Wonnacott et al., 2017, 2012, 2008; Wonnacott, 2011; Perfors et al., 2010]. A new collaboration with Ben Ambridge continues this work as part of an ERC-funded project. Ongoing experiments explore what types of input lead learners to avoid over-generalization of linguistic constructions (e.g. not to generalize the verb “carry” to the construction “*he carried the child the parcel”). There is evidence that frequently hearing utterances such as “he carried the parcel to the child” plays a role, but what about the learners’ more general experience of hearing “carry” in other constructions?

Other work uses similar artificial language learning methodology to look at learners’ biases and how these might shape human languages. For example, languages exhibit variation: in English the precise way in which we pronounce the plural marker -s varies (sometimes “s”, as in “cats”, and sometimes “z”, as in “dogs”). While it seems logically possible that this kind of variation could occur completely at random (e.g. randomly choosing to produce “s” or “z”), this type of behaviour very rarely (possibly never) occurs in human languages. Instead, linguistic variation is predictable: in the case of -s, the pronunciation is predictable from the last sound in the noun. Why do languages work like this? Seminal work by Hudson Kam & Newport suggests that this is due to strong learning biases in children which lead them to regularize inconsistent input. With collaborators Kenny Smith and Olga Feher, we have been exploring the extent to which these biases for regularization are also present to a weaker extent in adult learners, how they might be amplified by interactions between language users, and how this might influence language structure [Smith et al., 2017; Feher et al., 2016; Smith & Wonnacott, 2010]. With collaborator Anna Samara we have also looked at whether children and adults can learn probabilistic sociolinguistic conditioning (i.e. learning that certain speakers are more likely to use some forms than others) [Samara et al., 2017].

Researchers in the Language Learning Lab

Anna Samara (former postdoc, now collaborator)
Catriona Silvey
Eva Viviani
Maša Vujović
Elizabeth Wonnacott

Second language learning in children and adults

See our ESRC research grant page here, and click here for more details on our past Teachers Workshops on Second Language Learning in the Primary Years.

This research program explores how the statistical structure of the input can affect learning of a modern foreign language, and how this differs for learners of different ages. This has potential implications for language teaching in schools.

One set of experiments, with Anastasia Giannakopoulou, Helen Brown and Meghan Clayards, explores the learning of non-native speech contrasts (e.g. Greek speakers of English learning the difference between “sheep” and “ship”) [Giannakopoulou et al., 2017]. There is evidence that adults learn better when speech sounds are exemplified across multiple contexts. Is this also true for children? We are also looking at how input variability affects the learning of grammatical structures in children of different ages. One experiment focuses on the learning of grammatical gender classes (i.e. the division of words into masculine and feminine) in Italian. Seven-year-old children learned Italian words by playing a computerized training game. The words were “marked” as masculine or feminine: masculine words were preceded by the word “il” and ended in an “o”, while feminine words were preceded by the word “la” and ended in an “a”. We found that children showed strong learning of the gender markings for the trained words, but there was only weak generalization of the patterns to new words [Brown et al., 2016]. Ongoing experiments explore how manipulating the input boosts learning (for example, “staging” the input so that singulars are learned before plurals, or “skewing” the input so that one marker is more frequent than the other).

Ongoing experiments explore factors affecting vocabulary learning (using Lithuanian) and learning of “tones” in Mandarin Chinese.

This research is funded by a research grant from the ESRC held in collaboration with Dr Helen Brown (see the link at the top) as well as an SSHRC Insight Grant held by collaborator Meghan Clayards.

Researchers in the Language Learning Lab

Gwen Brekelmans
Hanyu Dong
Elizabeth Wonnacott

Literacy development

Some of our research explores the processes involved in spelling development, in collaboration with Anna Samara. Spelling is a complex and challenging task, particularly in orthographies where letters and sounds do not have a one-to-one correspondence (e.g., in English, vowel sounds can be spelled in as many as five different ways!). In line with increasing evidence that memorization and explicit learning skills do not suffice for competent spelling skill to develop, we investigate spellers’ frequency-based sensitivity. For example, can beginner spellers pick up on untaught orthographic conventions (e.g. gz and dz are illegal spellings of frequent word-final sound combinations in English; *bagz, *padz) from simple text exposure, and what are the computational mechanisms at play? We address these questions using artificial lexicons, i.e., novel words which exemplify spelling patterns akin to those seen in natural orthographies. We incidentally expose participants to these words and subsequently ask them to make judgments about unseen words which either follow or violate the novel spelling patterns. Using these methods, we have shown that frequency statistics do have an influence on children’s spelling preferences. For example, beginning spellers rapidly learn and generalize over novel orthographic conventions for permissible letter contexts (e.g., d and o cannot occur next to one another) [Samara & Caravolas, 2014], both when these are embedded within rime units (i.e., vowel-plus-final-consonant units) and body units (i.e., initial-consonant-plus-vowel units) [Samara, Singh, & Wonnacott, in preparation]. Our ongoing work compares children’s ability to learn different types of statistics in orthographic stimuli (e.g. co-occurrence frequency vs. conditional probability) and explores co-dependencies with the processes of extracting similar statistics from spoken input.

Other research in reading development has been conducted in collaboration with Prof Kate Nation (Oxford) and Dr Holly Joseph (Reading). For example, we have shown experimentally [Joseph et al., 2014] that so-called “Age of Acquisition” effects in word reading (i.e. the fact that the age at which a word is first encountered affects the way it is later processed by mature readers) can result from the order in which words are encountered during learning: if we teach participants new words via passive reading exposure, early exposed words are subsequently read differently from later exposed words. This speaks against accounts of reading development where Age-of-Acquisition effects result from changing brain plasticity in developing readers, or where these effects are an epiphenomenon due to other statistical properties (which were all controlled in the study). We have also conducted various eye-tracking experiments exploring children’s reading of syntactically ambiguous sentences [Wonnacott et al., 2016]. We were particularly interested in the relationship between children’s online processing (as revealed by eye movements) and offline comprehension.

Researchers in the Language Learning Lab

Daniela Singh
Elizabeth Wonnacott