April 11th, 2010 - Types, Tokens and Patterns: Beginning Corpus Linguistics

posted Jan 15, 2012, 11:07 PM by Hamamatsu JALT   [ updated Jan 15, 2012, 11:08 PM ]

Matt Smith

In this presentation, Matt showed how readily a computer corpus can help us see patterns in language. Some dictionaries now list, on the basis of corpus data, the most common patterns that a given word takes in discourse. The entry for “decide”, for example, gives “V wh-; V to V”, with V indicating a verb: “decide” often appears before interrogatives, as in “decide whether . . .”, or before to-infinitives, as in “decide to call . . .”. This information can help both teachers and students see how words pattern to express meaning. Matt gave the audience a number of example concordances (listings of the occurrences of a given word in context), from which we could identify patterns and their implications. He suggested three large, reliable corpora that can be accessed by computer as main sources: the Bank of English, the British National Corpus (BNC), and the Corpus of Contemporary American English (COCA). The presentation was well prepared, well informed and to the point.
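For readers who want to try the idea on a small scale, the following is a minimal sketch of a concordance and a type/token count in Python. It was not part of the presentation. The sample sentences are invented for illustration and come from no real corpus, and the function names (`tokenize`, `concordance`, `next_word_counts`) are hypothetical helpers, not part of any corpus tool.

```python
import re
from collections import Counter

# Invented sample sentences for illustration only; not taken from any real corpus.
SAMPLE = (
    "We must decide whether to go. "
    "They decide to call a meeting. "
    "She could not decide what to say. "
    "I decide to stay."
)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines with up to `width` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the keyword lines up in a column.
            lines.append(f"{left:>25} [{tok}] {right}")
    return lines


def next_word_counts(tokens, keyword):
    """Count the word that immediately follows each occurrence of the keyword."""
    return Counter(
        tokens[i + 1]
        for i, tok in enumerate(tokens[:-1])
        if tok == keyword
    )


tokens = tokenize(SAMPLE)

# Tokens are running words; types are distinct word forms.
print(f"tokens: {len(tokens)}, types: {len(set(tokens))}")

for line in concordance(tokens, "decide"):
    print(line)

# The most frequent right-hand neighbours hint at patterns such as "V to V" or "V wh-".
print(next_word_counts(tokens, "decide").most_common())
```

In this toy sample, “to” follows “decide” twice and “whether” and “what” once each, which mirrors the two patterns mentioned above. A real study would of course draw on a large corpus such as those Matt recommended, not on a handful of made-up sentences.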

Reported by Dan Frost