Chapter 6. Results and discussion: Phase I. Corpus-based research
6.3. Discussion
Three interesting points arise from the corpus data on verb-noun collocations in the BNC, the TIME corpus and the English I textbook corpus.
First, high-frequency collocations were highly ranked in both the BNC and the TIME corpus. This was contrary to the present writer’s expectation because the two corpora draw on different sources: the BNC samples British English, while the TIME corpus samples mainly North American English. The total numbers of tokens and types also differed: the BNC contains about 100 million tokens and the TIME corpus about 453 thousand. Furthermore, and surprisingly, some collocations occurring in the English I textbook corpus overlapped with the high-frequency collocations ranked within the top 100 in the BNC and with those which occurred more than 10 times in the TIME corpus, even though the English I textbooks contain only about 6,000 to 8,000 tokens. These results are desirable for identifying basic collocations for Japanese learners of English at an early stage.
Second, the data extracted from the BNC and the TIME corpus show that high-frequency collocations consist of basic-level words. In fact, collocations composed of L1 and L2 verbs and nouns made up around 85% of all the collocations occurring in the TIME corpus. In the BNC, the coverage of L1 verb-noun collocations reached 78%, and the coverage of L1 and L2 verb-noun collocations reached 98% in the first 100 high-frequency collocations. These findings are also desirable for Japanese upper secondary school students because they are expected to develop their four skills comprehensively using textbooks with a very limited vocabulary. Thirteen hundred words are targeted for 10th graders, but because they are counted in the word-form system, in which headwords, inflectional forms, reduced forms and derivative forms are each counted separately, the number would in fact be smaller than 1,300 if all the forms of a word were counted as one word.
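The coverage figures above can be illustrated with a minimal sketch. It assumes that L1 and L2 denote word-list levels assigned to each verb and noun, and that coverage is weighted by collocation frequency; neither assumption is confirmed by the data reported here. The collocations, levels and frequencies in the sketch are invented toy values, not results from the BNC or the TIME corpus.

```python
# Hypothetical toy data: (verb, noun, verb_level, noun_level, frequency).
# All levels and frequencies are invented for illustration only.
collocations = [
    ("take", "place", 1, 1, 50),
    ("make", "decision", 1, 2, 30),
    ("play", "role", 1, 1, 15),
    ("pose", "threat", 3, 2, 5),
]

def coverage(items, max_level):
    """Share of collocation tokens whose verb and noun are both at or below max_level."""
    total = sum(freq for *_, freq in items)
    covered = sum(
        freq
        for _, _, verb_level, noun_level, freq in items
        if verb_level <= max_level and noun_level <= max_level
    )
    return covered / total if total else 0.0

l1_coverage = coverage(collocations, 1)      # both words at level 1
l1_l2_coverage = coverage(collocations, 2)   # both words at level 1 or 2
print(f"L1 only: {l1_coverage:.0%}, L1 and L2: {l1_l2_coverage:.0%}")
```

If coverage is instead meant as a share of collocation types, the same function applies with every frequency set to 1.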
Learning collocations is a good way to develop a better command of English with a limited number of words. This view is supported by many researchers who regard collocation as important in EFL learning. Bahns (1993) and Howarth (1998a, b) emphasize the importance of collocation teaching. So do Ellis (2001), Lexical Approach advocates (Lewis 1993, 2000; Hill 2000), McCarthy (1984), Yorio (1980) and many other researchers (see section 2.4). Gitsaki and Taylor (1999) claim that teachers in an EFL class should supply new lexical items together with their most frequent collocations when those items are taught. Hattori and Matsuhata (1980), Tono (2003) and Murata (2003) mention that new words are more firmly fixed in students’ minds when they are presented together with words which have already been learned.
English I textbooks do not present high-frequency collocations often enough, whether in the same context or in different contexts. Previous research suggests that, on average, words need to be repeated about six times to be learned effectively. Kachroo (1962) examined how many times certain words were repeated in a textbook and how well they were fixed in learners’ memory. His findings showed that words repeated more than seven times were acquired by almost all the informants, whereas more than half of the words repeated only once or twice were not memorized. Salling (1959) and Crothers and Suppes (1967) conducted similar experiments. Salling suggests that words must be repeated at least five times to be memorized. Crothers and Suppes (1967) claim that words should be repeated six or seven times. According to Rod (1999), six-time repetition of words in a reading textbook results in better acquisition of words than two-time and four-time repetition. Zahar et al. (2001) investigated the relationship between the repetition of target words and their acquisition by learners of different levels and found that learners at lower levels need to be exposed to the target words seven times on average to acquire them. Saragi et al. (1978) concluded that when students try to retain certain words through reading a text, they should read it more than 16 times. Shaughnessy, Zimmerman and Underwood (1970) claimed that certain words are more firmly fixed by repeating them at certain intervals than by repeating them intensively. Taken together, these studies indicate that collocations should be presented in textbooks much more often.
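The comparison between the repetition thresholds above and the exposure actually offered by a textbook can be sketched as follows. This is only an illustration under stated assumptions: the miniature text, the target list and the regular expressions are invented, the threshold of six is taken from the six-or-seven figure cited above, and a real analysis would need lemmatization rather than hand-written patterns.

```python
import re

# Hypothetical miniature "textbook" text, invented for illustration only.
textbook_text = (
    "We take a photo every morning. She will take a photo of the lake. "
    "They make a decision together. He rides a bicycle to school."
)

# Each target collocation maps to a simple pattern allowing an optional article.
targets = {
    "take photo": r"\btake (?:a |the )?photo\b",
    "make decision": r"\bmake (?:a |the )?decision\b",
    "ride bicycle": r"\brides? (?:a |the )?bicycle\b",
}

MIN_REPETITIONS = 6  # assumed threshold, not a value fixed by this study

# Count how often each target collocation occurs in the text.
counts = {
    name: len(re.findall(pattern, textbook_text.lower()))
    for name, pattern in targets.items()
}

# Collocations that fall short of the assumed repetition threshold.
under_exposed = sorted(name for name, n in counts.items() if n < MIN_REPETITIONS)

for name in under_exposed:
    print(f"{name}: {counts[name]} occurrence(s), below {MIN_REPETITIONS}")
```

A list produced in this way would show which collocations need supplementary exposure outside the textbook.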
Teachers therefore need to find ways to expose students to high-frequency collocations, because collocations cannot be presented repeatedly within the limited number of textbook pages. For example, students can read a text passage including target collocations several times or listen to a tape several times, and teachers can present example sentences including target collocations several times. Explicit learning is an especially effective way to make learners pay attention to collocations. Schmitt (2000) maintains that both explicit and incidental learning are necessary, and that certain important words, for example the most frequent words in a language and technical vocabulary, make excellent targets for explicit attention. Nation and Kyongho (1995) argue that we should consider vocabulary teaching in terms of costs and benefits: the value of learning such words is well worth the time required to teach them explicitly, whereas infrequent words in general English are probably best left to incidental learning. Zahar et al. (2001) conducted research on EFL students at lower secondary schools. The students acquired 2.16 words on average when they read a 2098-word text including 30 unknown words, and they were expected to learn about 70 words in a year if they read texts of this kind every day. These three studies suggest that frequently used collocations should be taught explicitly.
Third, some of the collocations extracted from the TIME corpus are related to specific topics and tend to be ranked lower in the BNC, although they occurred 10 times or more in the TIME corpus. For example, among the six collocations which occur 10 times or more in the TIME corpus but are not ranked within the top 100 in the BNC, fight war (275th) and use force (352nd) concern the Iraq issue, and write song (352nd) and make movie (590th) concern entertainment. The American edition of TIME tends to reflect current domestic issues such as the presidential election and US-related issues such as the Iraq war. This may explain why these collocations ranked lower in the more general corpus.
The collocations extracted from the English I textbook corpus showed the same tendency. For example, among the collocations occurring more than three times in the English I textbook corpus, run marathon occurred four times, the highest frequency of any collocation in the corpus, yet its rank in the BNC is only 838th. Thus, topic-oriented collocations tend to be ranked low in the general corpus.
Moreover, collocations in everyday use which appear frequently in lower and upper secondary English textbooks also tend to be ranked lower in the general corpus. For example, take photo (566th), blow whistle (689th) and ride bicycle (885th) are frequently used in daily life, but their ranks are low in the BNC.
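The check described in this third point, namely whether a collocation that is frequent in a specialized corpus falls outside the top 100 of the general corpus, can be expressed as a short sketch. The BNC ranks are those reported above; the final dictionary entry is a hypothetical placeholder with an invented rank, added only to show that the filter excludes collocations within the top 100.

```python
# BNC ranks as reported in this section.
bnc_rank = {
    "fight war": 275,
    "use force": 352,
    "write song": 352,
    "make movie": 590,
    "run marathon": 838,
    "take photo": 566,
    "blow whistle": 689,
    "ride bicycle": 885,
    # Hypothetical placeholder: name and rank are invented for illustration.
    "hypothetical top-100 collocation": 12,
}

TOP_N = 100  # cut-off for high-frequency collocations in the BNC

def outside_top_n(ranks, top_n=TOP_N):
    """Return collocations ranked below the cut-off, best-ranked first."""
    return sorted(
        (collocation for collocation, rank in ranks.items() if rank > top_n),
        key=lambda collocation: ranks[collocation],
    )

low_ranked = outside_top_n(bnc_rank)
for collocation in low_ranked:
    print(f"{collocation}: {bnc_rank[collocation]}")
```

Collocations flagged in this way are candidates for educational judgment, since a low rank in the general corpus does not by itself show that they are unimportant for learners.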
Finally, the purpose for which Japanese students learn English must be considered. They need to learn English for General Purposes (EGP) to develop basic English skills, not English for Specific Purposes (ESP), which involves technical or business terms. Therefore, basic collocations are requisite for them (see section 2.4). Leech, Rayson and Wilson (2001) support this idea and emphasize the use of frequency data for educational purposes as follows:
For the teaching of languages, whether as a mother tongue or as a foreign or second language, information about the frequencies of words is important for vocabulary grading and selection. Here frequency has applications to language learning in such areas as: syllabus design, materials writing, grading and simplification of readers, language testing and perhaps even at the ‘chalkface’ of classroom teaching (Leech et al. 2001: ix).
The editors of JACET 8000 also mention that they struggled with the challenging task of harmonizing scientific accuracy and educational effectiveness in compiling a word list for Japanese learners of English. They point out three problems with selecting high-frequency words from corpus data: (a) the selection includes words related to current affairs, such as political and economic issues, as well as many vulgar and slang words; (b) it excludes everyday words that are common in lower and upper secondary English textbooks; and (c) it does not cover words for the beginning level. Therefore, the ranks of words based on corpus data were modified with reference to those based on English textbooks for upper secondary students. In line with the viewpoints of Leech et al. and the editors of JACET 8000, collocations for Japanese learners of English should be identified in this research on both scientific and educational grounds.
6.4. Basic collocations determined by analyses of corpora