Sinclair's seminal work: the bible of corpus linguistics
Years ago, before I became familiar with corpus tools (corpus as in linguistic corpus = "collection of samples of real-world texts stored on computer"; plural = corpora), my colleagues and I had a fierce debate about whether to use the preposition to or for after the noun hint. We wanted to produce posters for the English learning centres we had set up for a number of high schools, and each poster was meant to provide "Hints for/to speaking / listening etc".
Emails were sent back and forth about what preposition should be used and the argument inevitably turned to the British/American distinction until somebody used Google Fight to compare hints to and hints for. Google Fight provided us with pseudo-scientific evidence that hints for is slightly more common than hints to – it was back in the days when I was still blissfully unaware that Google search yields different results for different people and sometimes even for the same person on different computers!
That was before I discovered the British National Corpus hosted on the Brigham Young University website. Had I discovered it earlier I would have searched for Hints + Preposition and found that hints on something is actually more common than the other two options we were vehemently debating.
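A "Hints + Preposition" search boils down to counting which prepositions directly follow the word. Here is a minimal sketch of that idea in Python. The sample sentences, the short preposition list and the function name `prepositions_after` are all invented for illustration; this is not BNC data and not how the BYU interface works, so the counts say nothing about real usage.

```python
import re
from collections import Counter

# Hypothetical mini-sample, invented for illustration only; NOT BNC data.
SAMPLE = """
She gave me some useful hints on revising for exams.
The leaflet offers hints on saving energy at home.
Here are a few hints for new teachers.
He dropped hints about his plans.
The book is full of hints on gardening.
"""

# A small, non-exhaustive set of prepositions to look for (an assumption).
PREPOSITIONS = {"on", "for", "to", "about", "of", "in", "at", "with"}


def prepositions_after(word, text):
    """Count the prepositions that immediately follow `word` in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Walk through adjacent token pairs: (current word, the word after it).
    for current, following in zip(tokens, tokens[1:]):
        if current == word and following in PREPOSITIONS:
            counts[following] += 1
    return counts


counts = prepositions_after("hints", SAMPLE)
for preposition, frequency in counts.most_common():
    print(preposition, frequency)
```

A real corpus query does the same thing over millions of words and relies on part-of-speech tags rather than a hand-made preposition list.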
In his recent article What have corpora ever done for us, Hugh Dellar raises doubts about the usefulness of corpora to the ELT field. While not completely dismissive of corpus research and its value, Hugh basically argues that its effect on the language teaching profession has been enslaving rather than liberating. I find Hugh's polemic surprising given that corpus linguistics is what gave impetus to the Lexical approach, of which Hugh is a staunch advocate (I used the corpus here to look up a "juicy" adjective for "advocate"!).
Objective view of language
In the past 30 years corpus research has provided irrefutable evidence about how language works, not least that language is highly patterned in that it largely consists of recurrent lexico-grammatical combinations.
Starting with the Collins COBUILD project, corpora have revolutionised lexicography and changed the face of the modern dictionary. These days most respectable dictionaries – an indispensable tool for learners and teachers alike – include examples drawn from the corpus, frequency information and often register variation (whether a word is more suitable for formal or informal contexts).
Corpora have shed light on many aspects of language which were previously described based on intuition. Instead of groping in the dark and relying on anecdotal evidence, we now have access to authentic language data. For example, in the past many grammar books presented "any" as a sort of transformation of "some" used in negatives and questions.
I have some time. – I don't have any time. – Do you have any time?
Corpus research has shown that any is more common in affirmative sentences (50% of all uses of "any") and not as frequent in questions (only 10%) as prescriptive grammarians would have you believe.
The frequency of lexical or grammatical items is useful for deciding what should be included in a syllabus. That said, frequency needs to be balanced against another consideration: relevance to the learner.
Corpora in the classroom: a boon or bane?
However, Hugh's main argument against corpora concerns their applicability to classroom teaching and their relevance to teachers themselves. As a teacher I find corpora invaluable. Just the other day a student asked me about the difference between classic and classical. I came up with classical music and classic mistake off the top of my head but had to consult a corpus to find further examples:
classic example / case / symptoms / mistake / movie
classical music / composer / tradition
Such puzzles with confusable words can be easily solved by using the Compare function in the BNC or COCA.
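The idea behind comparing two confusable words can be sketched in a few lines of Python: collect the words that follow each one and keep those that appear with one word but not the other. This is only a rough illustration, not how the BNC or COCA interface actually implements its Compare function. The sample text and the function names `right_collocates` and `compare` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for illustration only; not corpus data.
SAMPLE = """
It was a classic mistake. She loves classical music.
This is a classic example of the problem. He is a classical composer.
A classic case of bad timing. They listen to classical music every evening.
"""


def right_collocates(word, text):
    """Count the words that come immediately after `word` in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == word)


def compare(word_a, word_b, text):
    """Return the collocates found only with word_a and only with word_b."""
    a = right_collocates(word_a, text)
    b = right_collocates(word_b, text)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    return only_a, only_b


only_classic, only_classical = compare("classic", "classical", SAMPLE)
print("classic:", only_classic)
print("classical:", only_classical)
```

On a real corpus you would also want frequency thresholds, since a collocate that occurs only once with either word tells you very little.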
No doubt some people are walking dictionaries and can (off the cuff) rattle off examples of usage but I would look it up in a corpus. Very often I give my students an answer about how a word is used and then consult a corpus or (corpus-based) dictionary to confirm my hunch. I am often right but sometimes I overlook certain patterns. And why not get learners to look up the answers themselves? Although data-driven learning (DDL) hasn't gained much popularity, there is some evidence that getting students to study language data (concordances) by themselves is beneficial to vocabulary learning.
Finally, Hugh argues that corpora make English as a foreign language unnecessarily foreign for non-native teachers by emphasising certain dubious features of spoken grammar (e.g. "like" for reported speech) that we don't really need to teach learners. This is particularly ironic because Innovations and, to a lesser degree, Outcomes – coursebooks co-authored by Hugh Dellar – are packed with colloquialisms. Innovations Upper-Intermediate has a whole page devoted to vague language (sort of, kind of, -ish), an important feature of the spoken grammar of English. Perhaps I wouldn't teach "like" for productive use in an EFL context. But what about the determiner "this", which has a markedly different use in spoken language, as corpus studies have revealed? In contrast to written language, we often use "this" in spoken narratives to refer to things NOT previously mentioned, to make them more vivid.
I saw this weird guy on the train yesterday.
And then there was this loud pop, like something exploded.
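A concordance, the format learners work with in data-driven learning, simply lines up every occurrence of a word with a few words of context on either side. Here is a minimal sketch using the two sentences above. The function name `kwic` and the three-word context window are my own assumptions, not a feature of any particular corpus tool.

```python
import re

# The two example sentences from this post.
SENTENCES = [
    "I saw this weird guy on the train yesterday.",
    "And then there was this loud pop, like something exploded.",
]


def kwic(keyword, sentences, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    rows = []
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z']+", sentence)
        for i, token in enumerate(tokens):
            if token.lower() == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                rows.append((left, token, right))
    return rows


rows = kwic("this", SENTENCES)
for left, word, right in rows:
    # Right-align the left context so the keyword forms a column.
    print(f"{left:>20}  {word}  {right}")
```

Lining the keyword up in a column is what lets learners spot patterns, here the adjective + noun that follows "this" in both sentences.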
Corpora have provided us with more accurate language descriptions and informed dictionaries, grammar reference books and pedagogical materials. With various corpora and "corpus-light" tools (see here) now widely available online, corpora are no longer the preserve of linguists but a valuable resource for teachers and learners.
For another rebuttal of Hugh Dellar's argument, see Mura Nava's post here