Project Proposal

My initial idea for a mini-course is a tutorial that helps language teachers learn to use certain web-based language tools that I have found useful. I want to focus in particular on reading authentic online material using software that supports Data-Driven Language Learning.

Needs Assessment

Context Assessment

This course will be presented entirely online, so it is important that it be accessible from as many different platforms as possible. It will be presented in the context of the KNILT wiki, so its formatting and design should be consistent with the other pages.

Learner Analysis

The learners will be language teachers who want to help students practice reading, using software to find and work through authentic material. Language teachers (and perhaps learners) are often motivated to find more ways of making authentic material learnable. They may, however, be put off by the idea of learning more software or adding another technology paradigm to their workflow. Data-driven learning itself would probably not be particularly controversial, since it is based on the notion of learning with authentic material and increasing input.

Learning Objectives

Learners will be able to:

  • state the general definition of data-driven language learning
  • choose an appropriate context in which to use data-driven learning
  • generate a lesson using the lextutor concordancer and language corpora
  • adopt active strategies for teaching with authentic language data

Task Analysis and Sequencing

Curriculum map


  • Entry skills
  1. Basic computing skills (web browsing, cut-and-paste, word processing) - IS
  2. Lesson design - IS
  3. determining the appropriateness of texts for learning-level, subject matter, etc.
  4. adaptability to new strategies and technologies in teaching - AT
  5. general understanding of/experience with language teaching - IS, VI
  • Elemental skills
  1. Selecting corpora - IS; requires IS of determining appropriateness of text, IS of web browsing/information search, VI - definition of a corpus
  2. Using the concordancer - IS; requires procedural skill of web browsing and word processing, VI - the definition of concordancing
  3. Understanding and appreciating DDLL - VI, AT; requires IS of using the concordancer, IS of selecting corpora, understanding of language learning
  • Integrated skills
  1. lesson design with concordanced text; requires - selecting corpora, using the concordancer, understanding and appreciating DDLL
  2. choosing an appropriate context in which to use data-driven learning; requires - understanding and appreciating DDLL, determining appropriateness of texts for learning-level, subject matter, etc.
  3. adopting active strategies for teaching with authentic language data; requires - adaptability to new strategies and technologies in teaching, understanding of language learning and teaching.


Course Sequence

  • Unit 0: Presentation of Objectives - What and Why
  • Unit 1: Introduction to Corpora
  1. Mini-lecture: What corpora are, their role in data-driven language learning, and how to find them
  2. Self-quiz
  3. Hands-dirty exercise: Looking at various available corpora.
  4. Media: Screen shots, graphs of statistical distributions
  • Unit 2: The Concordancer
  1. Mini-lecture: What a concordancer does
  2. Hands-dirty exercise: use the concordancer to generate various results
  3. Media: screen shots, links
  • Unit 3: Data-driven lesson
  1. Mini-lecture: More in-depth discussion of DDLL
  2. Hands-dirty exercise: making the lesson
  3. Media: links

Notes on the use of media and learning methods

Much of the learning in this mini-course will be concept learning and learning of software.

  • Concept learning

The idea that a language teacher can use large, searchable collections of authentic language is a fairly straightforward one. The details of how this can be done, and why it might be a good idea, can be presented convincingly with graphs showing word frequencies and collocation frequencies. The sooner the learner can actually poke at the corpora using the software, the better. Every time you use a concordancer you can learn something new about your language. Starting out with an engaging question like "What is the most common noun following the phrase 'in spite of'?" could get the student involved in the project of interrogating a corpus.
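To make the "in spite of" question concrete, here is a minimal sketch of how such a count could be made. It rests on several assumptions: the sample sentences are invented stand-ins for a real corpus, the function names are hypothetical, and it counts every word that follows the phrase rather than nouns only (restricting the count to nouns would need a part-of-speech tagger, which is left out here).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def following_word_counts(text, phrase):
    """Count the word that immediately follows each occurrence of phrase."""
    tokens = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    counts = Counter()
    # Slide a window of len(target) tokens over the text; on a match,
    # record the token that comes right after the window.
    for i in range(len(tokens) - n):
        if tokens[i:i + n] == target:
            counts[tokens[i + n]] += 1
    return counts


# Invented sample sentences, standing in for a real corpus (assumption).
sample = (
    "We went out in spite of the rain. "
    "In spite of the delay, she arrived on time. "
    "He smiled in spite of himself."
)

counts = following_word_counts(sample, "in spite of")
print(counts.most_common(3))
```

With a real corpus the same count would be read off a concordancer's sorted output; the sketch only shows the kind of question that is being asked of the data.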

  • Software learning

Learning to use a concordancer is a fairly simple matter. I'm going to use lextutor for this lesson. I'll begin with some simple, constrained tasks, then have the student explore the corpora more independently.
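For readers who have not seen one, the core of what a concordancer does can be sketched in a few lines. This is only an illustration under stated assumptions: it is not how lextutor is implemented, the function name `kwic` and the `width` parameter are hypothetical, and the sample text is invented.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of keyword.

    Each line shows up to `width` characters of context on either side,
    padded so that the keyword lines up in the same column.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context and left-align the right context.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Invented sample text, standing in for a real corpus (assumption).
sample = (
    "We went out in spite of the rain. "
    "In spite of the delay, she arrived on time. "
    "He smiled in spite of himself."
)

concordance = kwic(sample, "spite")
for line in concordance:
    print(line)
```

Lining up the keyword in one column is what lets a student scan down the page and notice which words tend to appear to its left and right.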


  • Godwin-Jones, Robert. Of Elastic Clouds and Treebanks: New Opportunities for Content-Based and Data-Driven Language Learning. Language Learning & Technology, v12 n1, p12-18, Feb 2008. Honolulu: University of Hawaii National Foreign Language Resource Center.


The course: Data-Driven Language Learning Using Corpora and Concordancing

Ben Blanchard

ETAP 623 Fall 2009