Unit 2 - mini-lecture



"You shall know a word by the company it keeps." - J. R. Firth

The large amount of data in a corpus can be used and analyzed in many ways. For example, you can count the instances of particular words and compare their frequencies with those of other words, or you can measure sentence lengths and calculate the average. But probably the most common type of information about words yielded by corpora is information about the contexts in which they appear. That is, which words immediately surround a given word in the texts?
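
As a rough illustration of the two simple measurements mentioned above, the following Python sketch counts word frequencies and computes an average sentence length. The function names, the sample text, and the very simple rules for splitting words and sentences are assumptions made for this example; real corpus tools tokenize far more carefully.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how many times each word token occurs in the text."""
    return Counter(tokenize(text))


def average_sentence_length(text):
    """Return the mean number of word tokens per sentence.

    Assumption: a sentence ends at '.', '!' or '?'. This is a crude rule
    that will mis-split abbreviations such as "J. R. Firth".
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


# Hypothetical three-sentence "corpus" used only for demonstration.
sample = "The crow flew away. The crow ate corn! Did the farmer see the crow?"
freqs = word_frequencies(sample)
print(freqs.most_common(2))
print(average_sentence_length(sample))
```

Comparing `freqs["crow"]` with `freqs["farmer"]` is the kind of frequency comparison described above, only on a toy scale.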

Simply put, a concordancer is a software application that searches a database of text for a single element, finds every instance of that element, be it a word or a phrase, and presents them all along with some information about their contexts. Some concordancers also offer more sophisticated functionality, such as measuring how often certain words co-occur. Concordancers were developed by people with an intense interest in words. Before computers, preparing a concordance for a single work, let alone a large collection, was a labor-intensive endeavor. Today concordancers are an important tool in linguistic research, usually used together with corpora.
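
The core behavior described here, finding every instance of a word or phrase and showing it with its context, can be sketched in a few lines of Python. This is a minimal keyword-in-context display under stated assumptions, not how any particular concordancer is implemented: the function name `concordance`, the `width` parameter, and the sample text are all invented for the example.

```python
import re


def concordance(text, keyword, width=30):
    """Return one keyword-in-context line for each match of `keyword`.

    Each line shows up to `width` characters on either side of the match,
    padded so that the keyword lines up in a single column.
    """
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Hypothetical sample text used only for demonstration.
sample = "The crow flew away. The crow ate corn. He had to eat crow."
lines = concordance(sample, "crow", width=10)
for line in lines:
    print(line)
```

Because the keyword is escaped and matched as a whole unit, a phrase such as "eat crow" can be searched for in the same way as a single word.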

You can learn many things about a word by observing the contexts in which it is used. Very specific functional information, such as the argument structure of a verb, can be demonstrated using concordancer output: Is the verb transitive or intransitive? Which prepositions typically introduce an object? The verb "think", for instance, is only really transitive with the word "thought" (as in "think a thought"); otherwise its objects are usually introduced with the preposition "about". This kind of information can probably be gleaned by looking at less than a page of authentic examples concordanced from a small corpus.
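
One way to approach the question of which prepositions introduce an object is to tally the word that immediately follows each form of the verb. The sketch below does this for "think". The function name, the hand-written list of verb forms, and the sample sentences are assumptions for illustration; the code also cannot tell the verb "thought" from the noun "thought", which a real study would need to handle.

```python
import re
from collections import Counter


def following_words(text, forms):
    """Count the word that comes directly after any of the given word forms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    forms = set(forms)
    return Counter(
        nxt for cur, nxt in zip(tokens, tokens[1:]) if cur in forms
    )


# Hypothetical examples; a real study would draw these from a corpus.
sample = (
    "I think about lunch often. She thinks about work. "
    "We thought about leaving. They think of home. "
    "He was thinking about it."
)
think_forms = {"think", "thinks", "thinking", "thought"}
after_think = following_words(sample, think_forms)
print(after_think.most_common())
```

In this toy sample "about" follows the verb far more often than "of", which is the pattern a page of real concordance lines would be expected to show.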

Similarly, you can learn about the meanings of words by looking at concordancer output. The words that appear near a word under examination can tell you a lot about what it means. For instance, if a word is frequently preceded by forms of "eat" ("eating", "eaten", etc.), we might reasonably conclude that it refers to a type of food. A concordancer also makes it possible to become familiar, efficiently, with the common collocations and idioms involving a word. We might learn that "crow" refers to a flying animal by observing forms of the verb "fly" appearing with it, but also that to "eat crow" means to experience humiliation.
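
The idea of looking at words in a near context can be made concrete by counting every word that falls within a few positions of the target word. The following Python sketch is one simple way to do that. The function name `collocates`, the `window` and `stopwords` parameters, and the sample text are assumptions for this example, and the window deliberately ignores sentence boundaries to keep the code short.

```python
import re
from collections import Counter


def collocates(text, target, window=2, stopwords=frozenset()):
    """Count words occurring within `window` tokens on either side of `target`.

    Assumption: the window may cross sentence boundaries, and words listed
    in `stopwords` are left out of the counts.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(w for w in neighbours if w not in stopwords)
    return counts


# Hypothetical sample text used only for demonstration.
sample = (
    "The crow can fly high. A crow will fly over the field. "
    "He had to eat crow after the game. Nobody likes to eat crow."
)
near_crow = collocates(sample, "crow", window=2, stopwords={"the", "a", "to"})
print(near_crow.most_common(2))
```

Here both "fly" and "eat" turn up as frequent neighbours of "crow", reflecting the literal sense and the idiom respectively.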

A word only exists in a language to the extent that it is used by speakers or writers. In a sense, words are like any other tools; we learn how to use them by example. Hearing or reading words in context is almost certainly the way we obtain most of our vocabulary, whether in our native language or a language we are learning later in life. Concordancers can be a way to make this process efficient for learners.


  • Gaskell, Delian & Thomas Cobb (2004). Can learners use concordance feedback for writing errors? Dept. de linguistique et de didactique des langues, Université du Québec à Montréal, Canada. Submitted to System, November 2003; revised April 12, 2004. Accessed at http://www.lextutor.ca/cv/conc_fb.htm
  • Dyck, Garry N. (1999). Concordancing for English Language Teachers. Paper presented at the annual session of TESL Manitoba. Accessed at http://home.cc.umanitoba.ca/~gdyck/conc.html

