Unit 1 - mini-lecture
Data-driven learning is a term that encompasses a number of different techniques that have been used over the past 20 years or so. What they all have in common is the use of large, searchable collections (corpora) of authentic material. For language learning this has generally meant text, though speech corpora exist as well. Language corpora allow students to explore aspects of the language being learned. The first implementations generally took the form of computer printouts of concordancer output that a teacher would prepare for a class. Nowadays some teachers use ordinary web searches in data-driven learning lessons, since the World Wide Web can be thought of as by far the largest corpus of authentic language use ever compiled.
The theoretical bases of data-driven language learning relate to two language learning truisms. First, the more input, especially authentic input, the better. Second, allowing and encouraging students to explore the language on their own leads to greater retention and transfer. Language learners interacting with corpora are engaged in a high-level search-and-discovery activity in which they work out the patterns of the language for themselves.
- "This method of using computers and technology in language learning is again very much in line with constructivist approaches to language learning which propose to provide learners with an opportunity to develop strategies which they can build on once the language class is finished." from this website about concordancers (we'll get to that):
There are at least two general types of lessons that can be created using data from a language corpus. In the first type, exemplified by the earliest uses but still suitable, especially for lower-level students, a teacher searches for particular usage data related to the content of the lesson and then prepares material for the students. The second type, which has been shown to be more useful for more advanced learners, involves students exploring the corpus themselves to make their own discoveries. This second type of lesson should still be carefully designed, so that students have clear goals for their activities.
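To make the first type of lesson concrete, here is a minimal sketch of the kind of keyword-in-context (KWIC) listing a teacher might prepare as a handout. It is only an illustration: the function name `kwic`, the `width` parameter, and the sample sentences are assumptions made for this example, not the output of any particular concordancer or corpus.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for each whole-word match of `keyword`.

    Hypothetical helper for illustration; real concordancers offer far more
    (sorting, lemmatization, part-of-speech filters, and so on).
    """
    # Collapse line breaks and repeated spaces so context windows are clean.
    flat = " ".join(text.split())
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Right-align the left context so every keyword lines up in one column.
        lines.append(f"{left:>{width}}[{m.group(0)}]{right}")
    return lines


# Invented sample text standing in for a real corpus.
sample = (
    "You can make a decision quickly. They make an effort every day. "
    "Do not make a mistake on the test. Makers of tools are not counted."
)

for line in kwic(sample, "make", width=25):
    print(line)
```

Because the keyword lands in the same column on every line, students can scan down the page and notice what tends to follow it (here, "a decision", "an effort", "a mistake"). In the second type of lesson, students would run searches like this themselves rather than receive the printout.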
Even without direct student interaction with the corpus, a data-driven lesson is learner-centered. The focus is not on the words, grammatical structures or concepts being taught, but rather on the students' experience of the language and on creating contexts that put students in direct contact with the actual speech community (or writing community).
- [http://llt.msu.edu/vol12num1/emerging/ Godwin-Jones, Robert (2008). Emerging Technologies of Elastic Clouds and Treebanks: New Opportunities for Content-Based and Data-Driven Language Learning. Language Learning & Technology, Vol. 12, No. 1, February 2008, pp. 12-18]