

Slips of the Tongue

in Spontaneous vs. Preplanned Speech

in Japanese

Kazuhiro Kawachi

An MA project submitted to the Faculty of the Graduate School of University at Buffalo, the State University of New York in partial fulfillment of the requirements for the degree of Master of Arts

Department of Linguistics

April, 2001


Table of Contents

Acknowledgments

Abstract

1. Introduction

1.1 Research Problems

1.2 Literature Review

1.2.1 Studies on Slips of the Tongue in Japanese

1.2.2 Studies on Methodologies of Slips of the Tongue Data Collection

1.2.3 Studies on Practice Effects on Speech Production

1.2.4 The Relationships between Contexts and Methods

1.3 Hypothesis

2. Study

2.1 Methodology

2.1.1 Data Sources

2.1.2 Subjects

2.1.3 Definitions Involved in a Slip of the Tongue

2.1.4 Procedure 1: Data Collection

2.1.5 Procedure 2: Classification

2.1.5.1 Jaeger’s (1992a, to appear) Classification System

2.1.5.2 Other Classification Principles

2.1.6 Speech Production Models

2.2 Results

2.2.1 General Results

2.2.2 Specific Results

3. Discussion

3.1 Findings on Practice Effects: those Aspects of Speech Production Planning which are Affected by Practice

3.2 Findings on those Aspects of Speech Production Planning which are not Affected by Practice

3.3 Practice Effects and Attention

3.4 Three Types of Practice

4. Conclusions

Notes

References


Acknowledgments

First and foremost, I would like to express my deepest thanks to my advisor, Dr. Jeri J. Jaeger, for all the constructive advice that she gave me on this paper. This paper developed out of a paper that I wrote for her psycholinguistics class in the Spring 1999 semester, and I have worked on the topic for two years and three months since then. Throughout this long period of research, she not only gave me highly insightful comments and suggestions but also shared with me the joy and pain of writing a paper. I would not have been able to complete this paper without her advice.

I would also like to express my sincere thanks to Dr. Jean-Pierre A. Koenig and Dr. Matthew S. Dryer for reading this paper and giving me very helpful comments on various portions of this study.

I would like to express thanks as well to all other professors in the Department of Linguistics at the University at Buffalo (Dr. Colleen Fitzgerald, Dr. Karin E. Michelson, Dr. Leonard Talmy, Dr. Robert D. Van Valin, Jr., Dr. Wolfgang Wölck, and Dr. David A. Zubin) for helping me gain a perspective as a linguist and develop originality and creativity.

I am also indebted to Dr. Sheri Wells-Jensen for discussing earlier versions of the paper with me and for providing me with constant encouragement in writing it.

My special thanks go to my mentor, Dr. Norimitsu Tosu, who introduced me to the world of linguistics. If I had not known him, I would not be studying linguistics. To him, I owe my deepest gratitude.

I was fortunate enough to be able to obtain hints from classes in the Department of Psychology at the University at Buffalo taught by Dr. Gail A. Mauner and Dr. William C. Schmidt. I would like to thank them.

I also thank my fellow students at the University at Buffalo, SOT research group members in particular (especially Ed Akiumi, Osamu Amazaki, Lilian Guerrero, Nuttanart Muansuwan, and Kimio Tanihara), for their helpful comments on my presentations of earlier versions of this paper.


Abstract

This paper addresses the question of how practice influences speech production planning processes. There are three types of practice in speaking: 1) practice required for the acquisition of speech production processing, 2) practice for improving verbatim, task-specific knowledge in a particular speech-production experiment, for example a tongue-twister experiment, 3) practice in expressing the content to be conveyed in a specific situation. The present study focuses on the third type of practice by means of slips of the tongue (SOTs) in Japanese. Studies of the first type (e.g. Stemberger 1989, Jaeger 1992, to appear, Poulisse 1999) and the second type (e.g. Schwartz et al. 1994, Dell et al. 1997), though they are still small in number, have been conducted, but almost no attention has been directed to the third type. The vast majority of previous studies on SOTs have assumed that errors follow the same pattern regardless of the context of speaking, and have treated data collected from various sources as being the same, regardless of 'practice' factors.

I collected 536 errors from spontaneous everyday conversation and 246 errors from live-broadcast TV programs such as talk shows and entertainment shows where the speech is preplanned to a large extent and no reading is involved. Errors were classified according to unit, form, and directionality (Jaeger 1992, to appear). Though there were those aspects of speech production planning that were unaffected by practice, the following practice effects were attested. The preplanned speech contained higher frequencies of phonological and syntactic errors, anticipatory phonological errors and syntagmatic lexical errors, phonological omission errors, especially telescopings, and lexical unit errors involving closed-class morphemes than the spontaneous speech. On the other hand, the spontaneous speech contained higher frequencies of paradigmatic lexical errors, perseveratory phonological errors and syntagmatic lexical errors, lexical unit errors involving open-class morphemes, especially verbs and adjectives, and lexical and phrasal blends than the preplanned speech. Moreover, the phonological error and its source tended to be located at a greater distance in the preplanned speech than in the spontaneous speech. These results are widely divergent from findings about both the first and second types of practice. This provides evidence that the three types of practice in speaking exert different effects on speech production planning processes. I will show how many of the effects of the third type of practice can be explained in terms of automatization, namely a transition from controlled processing to automatic processing, resulting in shifts in the loci of errors. The results also suggest that SOTs should be collected from the same type of source with respect to practice.


1. Introduction

Like many papers on slips of the tongue, the present paper takes as its starting point the premise that research into slips of the tongue gives insights into speech production processes.^[1] If something goes wrong with a speaker’s speech production, one can ask how and where it occurred in that speaker’s planning mechanism. Since utterances which deviate from the speaker’s intention tend to show regularities, such regularities can be used to explain the structure of the speech production planning mechanism.

However, the vast majority of previous slips of the tongue (SOT, henceforth: following Jaeger 1992a, to appear) studies have the following characteristics.^[2] First, as Jaeger (to appear) points out, most SOT research has been based on corpora collected from adult speakers of Germanic languages, chiefly English, Dutch and German. There are only a limited number of studies that have investigated SOTs in other languages (see Berg 1987a:5-6, Wells-Jensen 1999:33-34 for references). Second, although SOTs have been investigated for the purpose of the “construction of performance models” (Fromkin 1973b:13), few SOT studies have taken into account the discourse context in which the SOTs occur, with the result that almost no discourse or environmental components have been incorporated into speech production models except Levelt’s (1989) model. Finally, the bulk of SOT studies have assumed that SOTs follow the same pattern regardless of the setting where and the purpose for which the utterance occurs, and thus have treated data collected from various sources as reflecting the same speech production processes.

The present study addresses these issues as follows. First, the language that it deals with is Japanese, a language which shows typological characteristics distinct from those of Germanic languages. Second, this study takes into consideration the discourse context in which the speaker makes a SOT, and looks for differences in the types of SOTs which different contexts induce. Third, only those SOTs made in a uniform setting and purpose of speaking are regarded as constituting the same type of corpus in the present study. The overall focus of this study is the effect of practice on speech production planning, to which I will now turn.

1.1 Research Problems

This paper focuses on how practice influences speech production processes. “Practice” can be defined as activity undertaken for the purpose of improving a skill. Practice in speaking can be interpreted in one of the following three ways. One is practice that is required for the acquisition of fluent speech production. The second type is practice for improving verbatim, task-specific knowledge in a particular speech-production experiment, typically a tongue-twister experiment. The third type is practice that is done until one can express the content which is intended to be said in a specific situation (e.g., when giving a public speech, when speaking in a TV program). These three types of practice will be called “acquisition-practice,” “experimental-practice,” and “content-practice,” respectively, in this paper.

After practice, processing is said to become automatic (e.g., Allport et al. 1972, Schneider and Shiffrin 1977, Shiffrin and Schneider 1977). Automatic processing, which is characterized by fast processing speed, effortlessness, temporally parallel processing, and processing with low capacity demands, usually refers to processing after experimental-practice. However, if automatization is a transition from performance based on a slow algorithm to performance based on single-step, direct-access memory look-up (e.g., Logan 1988, Logan and Compton 1998) or from procedural-memory-based performance to declarative-memory-based performance (e.g., Strayer and Kramer 1990, Grant and Logan 1993), one can find a commonality among the three types of practice: like performers of experimental tasks, children, as well as a person making a public speech, build up memory bases and, after practice, come to be able to speak using fast memory retrieval. Thus, the automatization, or at least facilitation, of processing seems to occur in any of the three kinds of practice.

Nevertheless, the three types of practice exhibit differences. First, they are different from one another in what is learned after practice, i.e., in what kinds of information become available for look-up. What is learned after acquisition-practice is the use of the language system as a whole, that is, by means of which forms one can express a particular content in which discourse context. After experimental-practice, the subject learns how to pronounce given speech forms in the experimental context, and the speech content is minimally learned. What is learned after content-practice is what speech content should be expressed in a particular discourse context and also, to some extent, what speech form should be used for the content. Accordingly, the three differ as to the purpose of practice. The purpose of acquisition-practice is to become a fluent speaker of a particular language. That of experimental-practice is to improve task performance. The purpose of content-practice is to be able to say fluently what the speaker is supposed to say in a particular context. Acquisition-practice is unintentional as long as the language is the person’s first language, although it must be more intentional if the language is the second language of an adult learner. On the other hand, experimental-practice and content-practice are generally intentional. Acquisition-practice usually takes a long time compared to the last two types. Acquisition-practice is required universally, while experimental-practice is carried out only in experimental tasks, and content-practice occurs only in limited social contexts. Everyday conversation is usually free from experimental-practice and content-practice, and a normal speaker of a language can communicate effortlessly with other speakers in that language without intentional practice. If intentional practice were always required in everyday conversation, communication would be inefficient. The largest difference between acquisition-practice and content-practice on the one hand and experimental-practice on the other is that the latter is practice in producing sentences within an experiment but out of any discourse context, while the former types of practice are practice in speaking in discourse contexts, where speakers are making themselves understood. Although it may be controversial whether the cognitive processes involved in the three kinds of practice are the same or different, the present study makes a preliminary distinction among the three on the basis of the differences described above. I will discuss this issue in 3.4.

Content-practice is the focus of this study and will still be called “practice,” but speech after practice of this type will be called “preplanned speech” in this paper. Thus, preplanned speech for some particular public discourse will be placed in contrast to speech without practice in the sense under study here, namely spontaneous speech in daily life.

Although SOTs have been studied for more than one hundred years, it is just recently that practice effects on speech production have started to be looked at in SOT research. However, the kinds of practice studied by SOT research have been limited to acquisition-practice and experimental-practice, and almost no attention has been directed to content-practice. Even studies on acquisition-practice (e.g., Stemberger 1989, Jaeger 1992a, 1992b, to appear) and those on experimental-practice (e.g., Schwartz et al. 1994, Dell et al. 1997) are still small in number; furthermore, the language dealt with in such studies is, in most cases, English. The present study addresses the issue of effects of content-practice on speech production by comparing SOTs in Japanese made in spontaneous speech and preplanned speech.


In the rest of this section, SOT studies relevant to the present study will be reviewed and a hypothesis will be put forward. Section 2 first describes the methodology used in this study (2.1) and then presents the results (2.2). Section 3 discusses the findings in detail and compares them with findings about the effects of acquisition-practice and experimental-practice. Section 4 concludes the paper.

1.2 Literature Review

1.2.1 Studies on Slips of the Tongue in Japanese

There are only a small number of studies on SOTs in Japanese (e.g., Kamio and Tonoike 1979, Tabusa 1982, Kubozono 1985, 1989, Terao 1987, 1995). Those studies on Japanese SOTs based on naturalistic data tend to focus upon some specific planning units rather than speech production in a broad context. Most of them do not deal with the issue of where and how, including by whom, data should be collected.^[3]

1.2.2 Studies on Methodologies of Slips of the Tongue Data Collection

Like any empirical study, research on SOTs has followed two kinds of data collection methods, observational ones and experimental ones. The former, by which data are collected in an uncontrolled, naturalistic setting, consist of two types. One of them, which is the most traditional, is used to collect SOTs from everyday conversation (what is called the “pen-and-paper” method in Poulisse 1999:96): every time researchers encounter a SOT in their daily life, they write down that error and its relevant information (see 2.1.4; e.g., Meringer and Mayer 1895, Meringer 1908). The other is to take data from tape-recorded conversations (the “tape-recording” method). These observational methods have the advantage of allowing the researcher to collect all types of SOTs that can occur in natural conversation, and thus enable the researcher to probe into the speech production mechanism employed there, as long as data are collected with precision. An obvious disadvantage of these methods is their inefficiency. They require longitudinal data collection in view of the low frequency of SOTs that occur in everyday conversation (e.g., 191 errors in about 150,000 words in Garnham et al. 1982). In the other kind of data collection method, experimental methods, data are collected in a controlled setting by inducing SOTs experimentally and tape-recording them (e.g., Baars et al. 1975). Unlike observational methods, experimental methods make it possible for a researcher to collect a large number of SOTs in a short amount of time; yet these methods tend to yield a narrower range of SOT types than occurs in natural conversation. As far as the accuracy of data and the objective of study are concerned, the tape-recording method has been considered to be the ideal one, although this may come at the cost of efficiency. Despite this, the method has been employed by only a small number of studies (e.g., Boomer and Laver 1968; Garnham et al. 1982).
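The inefficiency of observational collection can be made concrete with the one figure cited above, 191 errors in about 150,000 words (Garnham et al. 1982). The following is a minimal sketch in Python, not part of the original study; the function names are hypothetical and are introduced here only for illustration.

```python
def errors_per_thousand_words(n_errors: int, n_words: int) -> float:
    """Return the number of errors per 1,000 words of recorded speech."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return 1000 * n_errors / n_words


def words_per_error(n_errors: int, n_words: int) -> float:
    """Return the average number of words between two recorded errors."""
    if n_errors <= 0:
        raise ValueError("n_errors must be positive")
    return n_words / n_errors


if __name__ == "__main__":
    # Figures as cited in the text: 191 errors in about 150,000 words.
    # The word count is approximate, so the results are approximate too.
    print(errors_per_thousand_words(191, 150_000))
    print(words_per_error(191, 150_000))
```

On these figures the rate is a little over one error per thousand words, which illustrates why a corpus of several hundred naturalistic SOTs requires a long period of collection.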

The weaknesses of the experimental methods and those of the pen-and-paper method have been pointed out in the literature (e.g., Baars 1992, Stemberger 1992, 1993, Poulisse 1999). As Fromkin (1980:5) and Garrett (1976:253-254) point out, the experimental methods might not be testing real production processes that are used in naturalistic settings, but only those laboratory-specific processes which occur in artificial experimental settings, with the result that the subjects may be more likely to produce certain types of SOTs. Moreover, as MacKay and Kempler (1984:437-438) argue, it is often the case that researchers cannot reconstruct the intended utterance by asking subjects about the target of the error, that is, what the speaker intended to say, after the experiment. Nevertheless, the experimental methods have been employed to test specific hypotheses set up by means of data collected by the pen-and-paper method.

Although the pen-and-paper method can test real production processes and allow the researcher to ask subjects what they had in mind immediately after they made a SOT, it has received much criticism. A major criticism of this method is that what are perceived and reported as SOTs are highly susceptible to the researcher’s perceptual biases: the researcher may be more likely to fail to perceive and report some types of SOTs than others. For example, Cutler (1982) claimed that great caution is needed when making “More Errors” arguments about SOTs, those arguments about SOTs based on relative frequencies of particular error types, because of the problem of “detectability,” i.e. the fact that some types of errors are more difficult to detect than others when the pen-and-paper method is used. She attempted to support this argument with evidence from experiments on slips of the ear and shadowing experiments.

However, criticism of this sort has the following problems. First, it is based upon the results of experiments on perceptual biases where the subjects are not those who are trained in detection of SOTs, and thus underestimates SOT researchers’ ability. For example, the subjects of the experiments, upon which Cutler (1982) bases her argument, are as follows.

(1) Cohen (1980): Dutch students of the English department of Utrecht University

Cole (1973): English undergraduate students from an introductory psychology course

Lackner (1980): English Brandeis students

Tent and Clark (1980): English adults who were randomly selected from various English and Linguistics courses offered at Macquarie University

Bond and Small (1984): English undergraduate students enrolled in Ohio University

It would be unfair to argue for perceptual biases of SOT researchers on the basis of perceptual biases of such subjects. As Stemberger (1993:62) states, “a single highly trained and highly motivated person can collect all the errors.”^[4] Second, the experimental environments are quite different from the environments in which SOT researchers collect data from everyday conversation. For example, no SOT researcher who uses the pen-and-paper method would collect as SOT data mispronunciations that are deliberately performed while reading a novel, as in Cole’s (1973) experiment.

Another fallacious argument against slip collection from everyday conversation was put forward by Ferber (1991, 1995). Ferber (1991) threw doubt on the quality of naturalistic data by comparing data taken by means of what seems to be an experimentally-devised pen-and-paper method, or rather a one-time tape-recording method (“on-line” collection), with those taken from the same source by means of the tape-recording method (“off-line” collection). She had her subjects listen only once to a tape on which radio conversations had been recorded and “write down all those utterances which are thought to be incorrect” (p.113) as they listened to the tape. She compared what her subjects reported as errors (on-line data) with the errors that she recorded after listening to the same tape several times (off-line data). She found many errors that her subjects could not detect but she was able to find, and concluded that it is impossible to record all the errors on-line; thus she claimed that naturalistic SOT data are not reliable. She made exactly the same mistakes as Cutler did. First of all, what she was testing is the difference between the number of SOTs that untrained subjects correctly noticed and wrote down (not “perceived” or “detected”) while listening to the tape, vs. the number of SOTs that a trained linguist detected after listening to the tape several times.

Ferber (1995) also questioned the quality of naturalistic data by examining the distribution of phonological errors and that of lexical errors in different corpora as shown below.

(2) on-line corpora

Abd-El-Jawad and Abu-Salim (1987): Arabic

Berg (1988), Wiedenmann (1992), Ferber (1993): German

del Viso et al. (1991): Spanish

Shattuck-Hufnagel (1987[1986]), Stemberger (1989): English

Söderpalm (1979): Swedish

off-line corpora

Garnham et al. (1982[1981]), Deese (1984): English

Marx (1984), Ferber (1993): German

She attempted to show that the ratio of phonological errors to the total of SOTs is less consistent across on-line corpora than across off-line corpora and claimed that on-line corpora are not reliable, that is, they show many random deviations from the truth, which is recorded by means of off-line SOT collection. She further stated that on-line corpora systematically deviate from off-line corpora and thus are not valid. However, it is unfair to compare the two groups of corpora like this for the following reasons. First, as shown above, the eight on-line corpora are made up of corpora from five languages, while the languages dealt with in the four off-line corpora are only German and English. Different languages in fact show different distributional patterns of SOTs (Wells-Jensen 1999). Second, there are more on-line corpora than off-line corpora, and it is not surprising that a larger set of corpora shows more variation. Third, as Ferber herself admitted (pp.1181, 1184), it is possible that the eight on-line corpora use different classification systems. Although she said that she had restricted the comparison to phonological and lexical errors “to overcome the difficulties caused by different classification schemes and terminology” across the on-line corpora and “by the lack of detailed information” on the on-line corpora (p.1181), she did not explain why she thought it was possible to overcome the difficulties. Fourth, some corpora were collected by a single researcher and others were collected by other people as well as the researchers (e.g., “75 % of the errors” in Söderpalm 1979 “were collected by the author and the rest were collected by friends and colleagues”: p.27). In the latter case, it is possible that variations in the training of SOT data contributors other than the researchers may affect SOT patterns.

On the other hand, some defenders of the pen-and-paper method have stated that there is no difference between the results of studies performed by the pen-and-paper method and those of studies using the tape-recording method; however, they have rarely demonstrated this statistically. Making a short mention of Boomer and Laver’s (1968[1973]) tape-recorded data, Fromkin (1973a[1971]:216) said, “there were no sharp discrepancies between the kinds of errors recorded by them and by myself.” Berg (1987b:282) made a similar statement: “Later comparison between the tape and the handwritten records did not reveal any significant differences, neither with regards to the frequency of errors, nor the identity of the essential components of slips (source, target, error).”

There are supporters of both the pen-and-paper method and the experimental methods. Comparing naturalistic data and experimental data, Stemberger (1992) argues that the two types of data share most of the characteristics of SOTs and that most of the differences are due to task differences, although there are a small number of cases where naturalistic data are not reliable. However, he did not provide any distributional data on individual SOT types in studies by the two types of methods to make comparisons between them. Besides, a full calibration of the pen-and-paper method would require comparing it to both the experimental and tape-recording methods.

As discussed earlier, the great majority of SOT studies have not addressed the issue of the settings where SOTs occur. Any argument about data collection methodologies based on such studies will be unconvincing if there is any difference in SOT types due to settings.

Such differences have been reported by Wells-Jensen (1999), who had speakers of five languages (English, Hindi, Japanese, Spanish, and Turkish) narrate a cartoon with its sound muted. She interpreted the results of her cartoon narration experiment, which showed distributions of SOTs different from naturalistic corpora in some aspects, as reflections of its methodological effects on the data. In the task of her experiment, where scenes on the screen to be described changed rapidly, the subject was required to describe the ongoing scene immediately or “on line,” thus to speak faster than normal. She found that the speed pressure imposed on the subjects, as well as the task which required description of visual stimuli:

(3) a. increased overall SOT rates,

b. raised the number of SOTs due to environmental contaminations, and thus increased the number of SOTs involving wrong lexical selections,

c. increased the number of SOTs involving closed-class morphemes, and

d. decreased the number of syntactic SOTs (pp.183-184).^[5]

Another property of her experimental task was that speech content was controlled. The subjects were instructed to “explain the action on the screen as if they were radio sports broadcasts” (p.78). They were therefore expected to report objectively what was happening and could not choose propositions of their own will. She found that although the propositional content in her experiment was fixed, an overload of information caused many lexical items to be activated and thus increased lexical selection errors (p.117). Most importantly, all of her findings about the methodological effects held in all five languages.

1.2.3 Studies on Practice Effects on Speech Production

As far as I know, there is only one study which has touched on the issue of the effects of content-practice. Hotopf (1983) looked at lexical SOT types in different settings, although he devoted most of his discussion to the differences between lexical SOTs and lexical slips of the pen (SOPs, henceforth) rather than the differences in error types among different settings for speech or for writing. The data that he analyzed consist of the following four samples.


(4) a. the Author’s speech sample: the author’s own SOTs

b. the Daily Life speech sample: SOTs made by people other than the author in daily conversation

c. the Meringer speech sample: SOTs from Meringer and Mayer 1895 and Meringer 1908

d. the Conference speech sample: SOTs collected at a psychology conference using the tape-recording method

One of his findings relevant here is that the Conference speech sample is more similar than the other three SOT samples to the SOP samples, in terms of the distribution of lexical SOTs. He argued that this is because it has “long monologues dealing with abstract questions in psychology” and “requires complex thought processes” for which a register similar to that in writing is employed (pp.156-157).

Another of his findings is that the Conference speech sample contains fewer “semantic group slips,” those lexical substitution SOTs where the SOT and the target are semantically similar to each other, than the other SOT samples (pp.160-161). He argued that “semantic group slips” are more likely to occur with concrete words, and thus are less likely to occur in the conference setting where more abstract words are used. He claimed that a piece of evidence for the abstractness of the words used in the conference setting is “the use of general words and phrases like “people,” “and so on,” “sort of” and “something,” as well as repetitions of the same phrases” (p.160).

He also found that what he calls “structural errors,” those lexical SOTs where the error and the target are similar to each other phonetically, are more frequent in the conference setting than in the other settings (pp.154-156). This may in fact be due to morphology: more abstract words, which would be frequent in a conference, tend to contain more derivational morphology, and this is often a source of malapropisms.

Although Hotopf presented his data collected in different settings separately, he was disinclined to apply statistical tests to the data (p.154) and to compare SOTs across settings, because, except in the Author’s speech sample, the people who made the SOTs were different from the people who made the SOPs in the samples of SOPs produced by people other than the author. It would nevertheless have been possible to compare lexical errors in the Conference speech sample and those in the Daily Life speech sample, since the data in each of the two samples were taken from a consistent setting. For example, he could have noticed that anticipatory lexical errors were more common in the conference setting (17.5%) than in the daily life setting (6.8%) (p.155). The difference might have been caused by content-practice, another variable that he did not take into consideration. Nevertheless, his study, which explored SOT and SOP patterns in different settings, made a significant contribution to the study of differences between speaking and writing.
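A comparison of this kind, between the proportion of one error type in two samples, could in principle be carried out with a Pearson chi-square test on a 2 x 2 table. The following is only a minimal sketch in Python under stated assumptions: the function name is hypothetical, no continuity correction is applied, and the counts in the usage lines are invented placeholders, since the sample sizes behind the percentages are not given here. They are not Hotopf’s data.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table

                      target type   other types
        sample 1           a             b
        sample 2           c             d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


if __name__ == "__main__":
    # Invented placeholder counts, for illustration only:
    # 20 of 100 errors of the target type in one sample, 10 of 100 in the other.
    statistic = chi_square_2x2(20, 80, 10, 90)
    # With one degree of freedom, the conventional critical value
    # at the .05 level is 3.841.
    print(statistic, statistic > 3.841)
```

Whether such a test is appropriate depends on the independence of the observations; SOTs contributed by the same speaker are not strictly independent, which is one reason to treat any such result with caution.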

As mentioned in 1.1, there are a limited number of SOT studies that have tested ways in which practice per se influences speech production. In these studies, the term “practice” has been used to refer only to experimental-practice. Nevertheless, since it has much in common with the type of practice under study here, it is worth reviewing studies on experimental-practice, some of which investigate its relationship with acquisition-practice.

MacKay and Bowman (1969) demonstrated how practice increases the rate of reading on different linguistic levels (the semantic, syntactic, and phonological levels) and reduces the number of reading errors in their experiments, using German-English bilingual subjects.


However, contrary to their claim, what they were testing is not speech but reading or translation from a written text. Consequently, what they treated as SOTs are in fact reading errors, including stutters. Their interest was in the frequencies of errors and they did not discuss the relationships between practice and error types. Nevertheless, they pioneered the study of “the practice effect in speech production,” an effect which they define as “the increase in maximum rate of producing a sentence as a result of practice” (p.38). This effect is also reported in other experimental studies (e.g., Dell and Repka 1992; Schwartz et al. 1994). From the viewpoint that practice facilitates skills in behavior, MacKay (1981, 1982, 1987) developed a node-structure-based theory not only of speech production but also of sequential behaviors in general. However, he did not talk about practice effects on individual error types, although he discussed frequency of errors and the regularity that errors show.

Schwartz et al. (1994) addressed the issue of practice effects on SOT types. They treat “practice” broadly and would regard it as a term that subsumes all three types of practice. Using the London-Lund corpus of normal adults’ SOTs in Garnham et al. (1982) and Bloch’s (1986) corpus of a jargon aphasic patient’s SOTs, they argued that the former follows a “good” error pattern and the latter a “bad” error pattern. In a “good” error pattern, there are fewer lexical and phonological errors, more lexical and phonological anticipations, and more errors that are actual words and form familiar strings of words than in a “bad” error pattern. Based on the results of a tongue twister experiment, they demonstrated that SOT types shift from a more “bad” error pattern to a more “good” error pattern as one practices a tongue twister. Furthermore, they found parallels between jargon aphasic speech and less practiced speech. They also pointed out that children tend to make more “bad” errors than adults, citing Stemberger’s (1989) study. They attributed such practice effects to strengthened connections in a spreading activation model of speech production. I will come back to this account in 2.1.6 below.

Their findings are innovative, but strictly speaking, there are problems with the tongue twister methodology. Production and practice of the tongue twisters in this experiment are somewhat different from production and practice of speech in natural conversation, and may deviate from real cognitive processes. Tongue twisters or tongue-twister-like utterances do not often occur in daily life, and this would be true of any language. Moreover, as mentioned in 1.1, the type of practice that was investigated by their experiment is task-specific. In their experiment, the subjects were asked to say the ten tongue twisters that had been visually presented to them and removed, each twice, for eight rounds. There is an experimental context but no discourse context in this experiment: the subject’s verbal behavior would be bizarre outside the experimental context. In a naturalistic setting, speakers frequently say a series of clauses that are related in the discourse and are in various forms; the hearer sometimes interrupts the speaker, but what the hearer says is related in the discourse to what the speaker has said. Speech is usually produced with meanings in mind in natural conversation. On the other hand, the meanings of the tongue twisters in this experiment are far less important to their production and practice. Since lexical or syntactic errors are much less likely to occur and phonological errors are more likely to occur in this type of experiment than in natural conversation, it is questionable to what extent these practice effects apply to lexical SOTs, even though the authors claimed that they do. Therefore, evidence from naturalistic data is necessary to confirm their findings.

Furthermore, when they attempted to extend their argument about “good” vs. “bad” error patterns to “naturalistic” SOT data outside of the results of their experiment, they actually conflated different settings. The data in Bloch (1986) were taken from interviews with the aphasic patient and should have been compared with a normal adult corpus where the setting was consistent with the interview context. However, the London-Lund corpus of normal adults’ SOTs, which Schwartz et al. (1994) collated with Bloch’s data, contains data that were collected not only from “spontaneous speech,” but also “prepared (but unscripted) oration” (Svartvik and Quirk 1980:12), which is regarded as preplanned in the present study.

Dell et al. (1997) replicated the practice effects found in Schwartz et al.’s (1994) tongue twister experiment. They also explored the relationship between SOT rate and anticipation SOTs. Using experiments similar to those in Schwartz et al. (1994), they found that the lower the error rate is, the more anticipation errors occur. Although they did not discuss practice effects on SOT types by comparing practiced and spontaneous naturalistic data, they attempted to apply the anticipatory practice effect to naturalistic data in terms of the relationship between SOT rate and anticipation SOTs, and claimed that the data provide evidence for the effect. Unfortunately, they took nonaphasic adult SOT data from miscellaneous sources, assuming that the error rate in the data taken from these sources would be similar to that of the London-Lund corpus. As mentioned above, the London-Lund corpus consists of data collected from “spontaneous speech” and “prepared (but unscripted) oration” (Svartvik and Quirk 1980:12). The proportion of one type of speech to the other in each of the four nonaphasic adult corpora needs to be the same as that in the London-Lund corpus for a fair comparison, but this is unlikely. Moreover, the language investigated by Meringer (1908) is German and that investigated by Nooteboom (1969) is Dutch, while the language dealt with in the London-Lund corpus, Shattuck-Hufnagel (1979), and Stemberger (1989) is English. Wells-Jensen (1999) has shown that language makes a difference, especially in terms of anticipations and perseverations. Anticipations predominate in English (and other Germanic languages), regardless of the context of speaking. However, Wells-Jensen showed that anticipations and perseverations occur with approximately equal frequency in Japanese, Hindi, and Spanish; this was also found for Mandarin by Wan (1999) and for Korean by Min (1998). Thus, any claims about ratios of anticipations and perseverations may need to take into consideration the normal patterns in individual languages.

1.2.4 The Relationships between Contexts and Methods

As mentioned earlier, most SOT data have been collected with discourse context left out of consideration. “Context” can be construed not only as discourse contexts where SOTs occur but also contexts in which SOT data are collected. In what follows, contexts in the latter sense, specifically naturalistic data collection contexts, will be at issue. There are two dimensions along which different types of naturalistic speech should be distinguished, when one determines whether it is the pen-and-paper method or the tape-recording method that should be employed for a particular naturalistic data collection context.

The first factor is whether the speech can be suspended by the speaker or not, that is, whether the researcher can interrupt the speech event in order to ask the speaker about the target utterance and to take notes.^[6] Natural conversation can be either suspendable or unsuspendable. Most everyday conversation is suspendable as long as it is allowed to be interrupted. Suspendability of speech is often determined by the researcher’s relationship with conversational partner(s), by cultural conventions, or by the conversational context (the researcher’s role in the speech event, the number of participants in the speech event, whether it is public speech, whether the speaker is busy or not, etc.). There are cases where conversation is physically unsuspendable owing to one-way communication (e.g., conversation on a TV or radio program). When the researcher can stop the conversation, the pen-and-paper method is better than the tape-recording method because the researcher can ask the subject about the target immediately after each SOT occurs. As long as the researcher possesses enough skill in detection of SOTs, the use of a tape recorder would be unnecessary. In fact, if a researcher was able to interrupt the speech but left the tape recorder running without doing so, the chance to quiz the subject about the error would be missed.

For cases where the speech cannot be suspended, tape-recording is optimal, while the pen-and-paper method would be less desirable because the researcher’s physical and memorial limitations as a human would make it difficult to record all the errors and their relevant discourse contexts accurately. If the pen-and-paper method is used for speech that is so fast that the researcher cannot write down successive SOTs, even a highly trained SOT detector would not necessarily be able to record all the errors produced. Although the use of the tape-recording method for unsuspendable speech may have the disadvantage of leaving the researcher to determine the target of an error that has not been corrected by the speaker, a highly trained researcher would be able to identify the target correctly with a high probability. In short, the optimal method for collecting SOTs in a suspendable speech context is the pen-and-paper method, and that for collecting SOTs in an unsuspendable speech context is the tape-recording method.^[7]
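The decision rule summarized in the preceding paragraph can be restated as a small piece of code. This is only an illustrative sketch and not part of the procedure of the study; the names `Method` and `choose_method` and the setting labels are hypothetical, and suspendability is reduced to a yes/no value although the text treats it as a tendency.

```python
from enum import Enum


class Method(Enum):
    PEN_AND_PAPER = "pen-and-paper"
    TAPE_RECORDING = "tape-recording"


def choose_method(suspendable: bool) -> Method:
    """Return the collection method argued to be optimal for a setting.

    Suspendable speech lets the researcher stop the conversation and ask
    the speaker for the target, so notes suffice. Unsuspendable speech
    has to be recorded so that it can be reviewed afterwards.
    """
    return Method.PEN_AND_PAPER if suspendable else Method.TAPE_RECORDING


# Hypothetical settings, labelled according to the tendencies described
# in the text (assumption: each setting is treated as all-or-nothing).
settings = {
    "everyday conversation": True,
    "conversation in a TV program": False,
    "public speech": False,
}

for name, suspendable in settings.items():
    print(f"{name}: {choose_method(suspendable).value}")
```

The sketch deliberately ignores spontaneity, since the choice of method is argued to depend on suspendability alone.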

Another factor to be considered is whether the speech is spontaneous or preplanned, a factor introduced in 1.1. As far as I know, virtually no SOT researchers have made this distinction for naturalistic settings. Everyday conversation is for the most part constituted of spontaneous speech, although there are a small number of situations in everyday conversation where practice is necessary, for example, when telling a joke or story or presenting a rehearsed excuse or apology.^[8] Prototypical preplanned speech includes conversation in TV programs and public speeches (e.g., class lectures, conference talks). Although participants in such speech events can speak spontaneously from time to time, for example when they have to “play it by ear” because they have not done enough preparation or they have to answer an unexpected question, most of such speech is more or less preplanned. The participants are conscious of the expected content of the speech in advance, so that they can say it during a certain time frame, whether they have memorized the exact linguistic forms that express the content or not.

Whether the speech is preplanned or not appears to have nothing to do with which of the methods, the pen-and-paper method or the tape-recording method, should be used. Nevertheless, the tape-recording method is difficult to adopt for unexpected spontaneous speech. For example, when a researcher is being spoken to or hears someone talk on the street unexpectedly, it would be difficult to start tape-recording immediately.

The two dimensions discussed so far seem to be distinct. Suspendability is a sociocultural, contextual, and physical notion. It has no direct relationship to the speech production itself. Spontaneity and practice are notions that pertain to speech production: they relate to how much the content and, to some extent, the form are established in advance of its production. However, the two factors show some interactional tendency. While spontaneous speech tends to be suspendable, preplanned speech tends to be unsuspendable, because preplanned speech is likely to occur in a context where the period of time for which the speech lasts is already arranged. Everyday conversation consists of much more suspendable, spontaneous speech than any other type of speech. Conversations in TV programs and public speech consist of much more unsuspendable, preplanned speech. This relationship is illustrated in Figure 1.1.^[9]

                      more spontaneous         more preplanned

more suspendable      everyday conversation

more unsuspendable                             conversation in TV programs,
                                               public speech

Figure 1.1: Different kinds of data collection contexts defined by suspendability and spontaneity

Therefore, since most preplanned speech readily available to SOT researchers is unsuspendable, the pen-and-paper method is unsuitable for such speech. The tape-recording method would allow the researcher to record and review all the SOTs that occur during a certain period of time. On the other hand, for spontaneous speech, a great portion of which is suspendable, the pen-and-paper method is much more efficient and more accurate with respect to recording of the target than the tape-recording method, as long as the researcher is skilled in detecting and recording SOTs.

1.3 Hypothesis

Since no effects of content-practice have been attested before, the hypothesis tested by this study is the null hypothesis, that is, the hypothesis that SOT types are the same regardless of whether the speech is preplanned or spontaneous.

One could conjecture, in light of the findings on practice effects on phonological SOTs found by means of the experimental method by Schwartz et al. (1994) and Dell et al. (1997), that SOTs in preplanned speech will show a more “good” error pattern than those in spontaneous speech in the present study. However, their findings about practice effects will not necessarily apply to data collected in naturalistic settings. First, as mentioned in 1.1, the kind of practice at issue here is different from practice in their sense. It is possible that the two types of practice may be different to the extent that there are no such practice effects at all in naturalistic settings where the speech does not involve mechanical repetitions of the same sentences. Second, their practice effects were found with only a small set of tongue-twister sentences, whereas speech in naturalistic settings is not made up of such a small set. Their studies were able to look at limited aspects of speech production, as seen in the characteristics of a “good” error pattern discussed earlier. One might be able to find practice effects concerning a greater variety of aspects of speech production in naturalistic settings. For these reasons, their findings about practice effects do not serve as hypotheses to be verified in this study, although they will be compared to my findings in section 3.

2. Study

2.1 Methodology

2.1.1 Data Sources

The present study was designed to maximize the quality of data with respect to different types of speech. The author collected SOTs from two types of sources. One was everyday conversation, where speech was, in most cases, spontaneous. The present research limited everyday conversation to suspendable, spontaneous face-to-face conversation: excluded were public speech, telephone conversations, and readings. The conversations in the data occurred between February 13, 1999 and May 15, 1999 and between January 23, 2000 and August 19, 2000.

The other source was live-broadcast TV programs, where speech was presumed to have been preplanned to a large extent.^[10] Since, unlike in everyday conversation, a researcher cannot stop a conversation taking place on TV, videotaped live-broadcast TV programs were employed. Most live TV programs in Japan are talk shows, entertainment shows, news, play-by-play sports broadcasting, or technical discussions on political or social issues. The programs dealt with in this research are mainly talk shows and entertainment shows.^[11]

Thus, the present study sets up a dichotomy between preplanned speech and spontaneous speech. Although practice is a gradient notion rather than an all-or-nothing one, in this study, I could not measure how much speech was preplanned before its production, unlike the experiments in Schwartz et al. (1994) and Dell et al. (1997), which used as an independent variable how many times the subject practiced a tongue twister. Moreover, I could not consider whether the speaker had overt practice or mental practice, unlike MacKay (1981) and Dell and Repka (1992). The two factors above are very difficult to take into consideration for a study of naturalistic data. One has to rely on the fact that the speaker is speaking in a setting where practice is required. Although the present study could not test how much the speakers practiced their speech or whether the speakers practiced their speech overtly or mentally, the presence of the speakers in a TV program attests that those speakers have had practice in their speech in advance to a large extent, whether it may be overt, mental, or both, because they have always been informed about the content of the program beforehand. Even in experimental studies, the two factors are very tricky. In addition to the variable of the number of trials of saying a tongue twister that each speaker has had, Schwartz et al. (1994:136) found “large individual differences in error probability, particularly with the slow speaking rate.” As for the other factor, overt practice vs. mental practice, the tongue twister experiments in Dell and Repka (1992), which handled this issue, asked the subjects to report the errors that they thought they had made after mental practice. The problem of “detectability” is very likely to arise because the subjects of such a study are typically unskilled in detecting SOTs. Moreover, it would be almost impossible to know exactly how the subject practiced the tongue twister mentally. Thus, these problems are problems for all studies of this type, not just the present study.

2.1.2 Subjects

The speakers who contributed to the data from everyday conversation were normal adult native speakers of Japanese. In all the conversations, the speakers spoke the Tokyo dialect, although some of them were not from Tokyo or its outskirts. Their ages ranged between 24 and 62.

The speakers in the data from the TV programs were also normal native speakers of Japanese. Their ages ranged between 16 and 72 by estimate. The speakers were people in show business such as comedians, actors, actresses, and singers, and people from the general public. Since all the TV programs examined in this study were broadcast in Tokyo, the speakers usually spoke the Tokyo dialect, whether they were from the Tokyo area or not. However, there were cases where the speaker spoke a dialect other than the Tokyo dialect in the program. Since Japanese dialects show variations of lexical items and pitch accents, it is possible that what might seem to be a SOT was actually a function of the speaker’s dialect. Thus, speakers of other dialects were excluded from the study.

2.1.3 Definitions Involved in a Slip of the Tongue

The present study treats SOTs as one type of speech error. As Boomer and Laver (1973[1968]:123) define it, a SOT is “an involuntary deviation in performance from the speaker’s current phonological, grammatical or lexical intention.” This definition of a SOT was elaborated by Dell (1986:284) with the additional notions of its one-time occurrence and its occurrence in speech production planning: “an unintended, nonhabitual deviation from a speech plan.” Thus, SOTs should be distinguished from other types of speech errors and other anomalous or disfluent utterances. Wells-Jensen (1999:84-88), who analyzed different kinds of self-monitoring discussed by Levelt (1989:460-463), established criteria for judging whether a speech error is a SOT or not. The present study basically adopts her criteria.^[12]

There are three notions that pertain to a SOT: “error,” “target” and “source.” The “target” of an error is what the speaker intended to say. An error may or may not be corrected with its target by the speaker. When it is not corrected, it may be that the speaker did not notice the error at all or that the speaker noticed it but did not bother to correct it. The “source” of an error is the linguistic element that has led the speaker to make the error. Very often, an error and its source are planned in the same linguistic context. Such an error is called a “contextual” error (Dell 1986, Stemberger 1989:169, Jaeger 1992a, to appear). If the error is “non-contextual,” there may be a source outside of the linguistic context (e.g., in the speaker’s physical environment, on the speaker’s mind; Muansuwan 2000), or there may be no source at all even outside of the linguistic context. In the present study, the SOT “error” is transcribed in bold italics, the “target” is in bold, and the “source” is in italics, as exemplified by the following examples, one with a phonological substitution and the other with a lexical substitution. In the SOT examples henceforth, EC means that the error occurred in everyday conversation and TV means that the error occurred in a TV program. E stands for an utterance containing an error and I for the intended utterance (I1 and I2 for the two intended utterances of a blend error). The intended utterance is listed only when the error is not corrected with the intended utterance.

(5)  ... chotto    yotei  tas-e ...
     ... a.little  plan   NONWORD-ADV

     tat-e-sokona-tte-sima-i-mas-i-te, ...
     build-ADV-fail-CNNCT-put.away-ADV-PLT-ADV-CNNCT

     ‘... I failed to *[meaningless] ... make a plan ...’

The consonant [s] from “sokona” was anticipated and substituted for the second [t] in “tat.”

(6)  E:  Amerikan-byuutyii  tte   eiga   omosiroi-no-ka-na?
         American-beauty    CMPL  movie  interesting-NML-Q-SF

         Sugu  eiga   de  yaru  kara     ii-ya.
         soon  movie  at  do    because  good-SFT

     I:  ... Sugu  terebi  de  yaru  kara     ii-ya.
         ... soon  TV      at  do    because  good-SFT

     E:  ‘I wonder if the movie American Beauty is interesting. Because it will be on movies, it’s OK even if I don’t go to see it.’

     I:  ‘... Because it will be on TV, it’s OK even if I don’t go to see it.’

The noun “eiga” in the first sentence was perseverated and substituted for “terebi.”

2.1.4 Procedure 1: Data Collection

The present study employed the ideal data collection methods for the two different settings, the pen-and-paper method for everyday conversation and the tape-recording method for TV programs.

When data were collected from everyday conversation, I detected SOTs and wrote them and their relevant information down as accurately as possible. The conversation was suspended when necessary for recording the SOT. The speaker who had made a SOT was asked to say the target when I could not identify it, or if the speaker was the author, I ascertained the target by introspection. I always had a pen and paper with me to be able to record SOTs anytime.^[13]

For data collection from TV programs, the author watched each program at least three times. In the first trial, notes were roughly taken right after a SOT was detected, without stopping the tape. In the second trial, the tape was stopped every time a SOT was detected; the SOT and its relevant information were written down as accurately as possible. The tape was rewound several times until the notes precisely reflected the utterance(s). In the final trial, the notes were reviewed.

As mentioned above, information relevant to each SOT was recorded, in whichever setting the SOT was made. The information consists of the utterance where the SOT occurred, the intended utterance, the discourse context where it occurred, the speaker’s age and gender, the date when the SOT occurred, and the name of the TV program in the case of data collection from TV programs. When the source of the SOT was in the linguistic context, at least the utterance that contained the source was recorded, whether this utterance was made by the same or a different speaker.
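The pieces of information listed above can be pictured as one record per SOT. The following is a minimal sketch of such a record, not the notation actually used in the study; the class name `SOTRecord`, the field names, and the validation rule are hypothetical. The utterances are taken from example (6), but its setting is assumed to be EC purely for illustration, and the speaker's age, gender, and the date are left empty because they are not given with the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SOTRecord:
    """One collected slip of the tongue and its accompanying information."""

    setting: str                            # "EC" or "TV"
    error_utterance: str                    # utterance containing the SOT
    intended_utterance: str                 # what the speaker meant to say
    discourse_context: str                  # where in the discourse it occurred
    speaker_age: Optional[int] = None
    speaker_gender: Optional[str] = None
    date: Optional[str] = None              # date when the SOT occurred
    program_name: Optional[str] = None      # only for TV data
    source_utterance: Optional[str] = None  # only when the source is contextual

    def __post_init__(self) -> None:
        if self.setting not in ("EC", "TV"):
            raise ValueError("setting must be 'EC' or 'TV'")
        if self.setting == "TV" and not self.program_name:
            raise ValueError("a TV record needs the name of the program")


# Utterances from example (6); the setting "EC" is an assumption made
# only for this illustration.
record = SOTRecord(
    setting="EC",
    error_utterance="Sugu eiga de yaru kara ii-ya.",
    intended_utterance="Sugu terebi de yaru kara ii-ya.",
    discourse_context="talking about the movie American Beauty",
    source_utterance="Amerikan-byuutyii tte eiga omosiroi-no-ka-na?",
)
print(record.setting, "|", record.error_utterance)
```

Requiring a program name only for TV records mirrors the statement that the name of the program was noted in the case of data collection from TV programs.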

2.1.5 Procedure 2: Classification

2.1.5.1 Jaeger’s (1992a, to appear) Classification System

Each error was classified according to the following classification system, which was based on that of Jaeger (1992a, to appear), who uses the three parameters, “unit,” “form,” and “directionality.” The Japanese SOT examples below, as well as (5) and (6) in 2.1.3, will be used to illustrate these parameters.

The parameter “unit” is used to describe a linguistic element that behaves as a unit in the SOT, that is, an element that is substituted, moved, added, or omitted. It is defined in terms of the formal property of the error. A unit may be phonological, lexical, or phrasal. A phonological unit, in the case of Japanese, may be a consonant, vowel, feature, mora, syllable, rhyme, or pitch accent. Some of these are specific to Japanese. A lexical unit is a morpheme (or morphemes; see below), whether it is free or bound. It may be an open-class morpheme (hereafter, OC) or a closed-class morpheme (hereafter, CC). Japanese OCs and CCs are made up of the following categories:

(7) OCs: verb roots, adjective roots, noun roots, adverbs (including mimetics), noun compounds, verb complexes

CCs: affixes, particles, conjunctions, pronouns, demonstrative adjectives

Note that the OC categories include morpheme composite structures such as noun compounds and verb complexes, which are larger than single morphemes. A phrasal unit is a phrase, a larger linguistic unit than a morpheme that is not stored as a single entry in the lexicon and that forms a constituent other than a noun compound and a verb complex. In (5) and (8), the unit is phonological: consonants in (5) and pitch accents in (8). In (6) and (9), the unit is lexical: OCs in (6) and CCs in (9). The unit of the error in (10) is phrasal. (In (12), (13), (17), and (18), the unit is phonological: consonants in (12) and (18), vowels in (17), and WU (see below) in (13). In (14), (15), and (19), the unit is lexical.)

(8)  E:  Nannimo  nomimono  ir-a-nai-no?    ...  are  okasii.  (laughs)
         nothing  NONWORD   need-IRR-NEG-Q       ah   funny
         LLLL     LLLL      LHHLH

     I:  Nannimo  nomimono  ir-a-nai-no?
         nothing  drink     need-IRR-NEG-Q
         LLLL     LHHL      LHHLH

     E:  ‘Don’t you need any *[meaningless]?... ah, that’s funny.’

     I:  ‘Don’t you need anything to drink?’

The pitch accent pattern of “nannimo” was perseverated and substituted for that of “nomimono.”

(9) E: Kuruma no soto kara ori-rare-nai-kara-ne.

car GEN outside from get.out-can-NEG-because-SFT

I: Kuruma no soto ni wa ori-rare-nai-kara-ne.

car GEN outside to TOP get.out-can-NEG-because-SFT

E: ‘You can’t get out from the outside of the car.’

I: ‘You can’t get out to the outside of the car.’

The particle “kara” was substituted for “ni.”

(10) E:  A!  sonna      ni      ame   toma-tte-nai-na      (laughs) ...
         oh  like.that  MANNER  rain  stop-CNNCT-NEG-AFFM

         kuruma  sonna      ni      toma-tte-nai-na.
         cars    like.that  MANNER  stop-CNNCT-NEG-AFFM

     I1: A!  sonna      ni      kuruma  toma-tte-nai-na.
         oh  like.that  MANNER  cars    stop-CNNCT-NEG-AFFM

     I2: A!  sonna      ni      ame   fu-tte-nai-na.
         oh  like.that  MANNER  rain  fall-CNNCT-NEG-AFFM

     E:  ‘Oh, there is not so much rain parking.’

     I1: ‘Oh, there are not so many cars parking.’

     I2: ‘Oh, it is not raining so much (lit, there is not so much rain falling).’

The two phrases “kuruma toma-tte-nai-na” and “ame fu-tte-nai-na” were blended.

The parameter “form” involves how the error has occurred in relation to the target. There are six forms as shown below (Jaeger to appear).

(11) a. substitution: One element (error) is substituted for another (target).

b. addition: An element (error) is added to the intended utterance (target: the original form).

c. omission: An element (error) is omitted from the intended utterance (target: the original form).

d. movement: An element (error) moves to a place in the intended utterance where it should not be, and is substituted or added (errors: the omitted form and the substituted or added form; target: the original form).

e. exchange: Two elements (targets) are exchanged with each other (error: the exchanged form).

f. blend: Two elements (targets) compete to enter the same syntagmatic slot and become blended (error: the blended form).

In the examples below, (12) is a movement, (13) is an omission, (14) is an exchange, and (15) is a blend. (Examples (5), (6), (8), (9), (17), and (19) are substitutions, (10) is a blend, and (18) is an addition.)
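The unit and form classifications stated so far for the examples shown in this section can be gathered in one small table. The sketch below is only an illustrative restatement of those classifications, with hypothetical names (`Classification`, `EXAMPLES`); examples (17) to (19) are left out. The OC label for (14) and (15) is an inference from the category list in (7), since the text states only that their unit is lexical.

```python
from collections import Counter
from typing import NamedTuple


class Classification(NamedTuple):
    level: str  # "phonological", "lexical", or "phrasal"
    unit: str   # e.g. "consonant", "pitch accent", "WU", "OC", "CC", "phrase"
    form: str   # one of the six forms in (11)


# Classifications as stated in the text for the examples shown here.
EXAMPLES = {
    5: Classification("phonological", "consonant", "substitution"),
    6: Classification("lexical", "OC", "substitution"),
    8: Classification("phonological", "pitch accent", "substitution"),
    9: Classification("lexical", "CC", "substitution"),
    10: Classification("phrasal", "phrase", "blend"),
    12: Classification("phonological", "consonant", "movement"),
    13: Classification("phonological", "WU", "omission"),
    # OC is inferred from (7): verb roots and noun roots are OCs.
    14: Classification("lexical", "OC", "exchange"),
    15: Classification("lexical", "OC", "blend"),
}

# Tally the illustrative examples by form and by unit level.
by_form = Counter(c.form for c in EXAMPLES.values())
by_level = Counter(c.level for c in EXAMPLES.values())

for form, count in by_form.most_common():
    print(f"{form}: {count}")
```

These counts describe only the illustrative examples quoted in this section, not the distribution of SOTs in the data.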

(12) Watching a hockey game.

     A,  kyankaku-seki   (laughs) ...  kankyaku-seki   ni
     ah  NONWORD-seat                  audience-seat   to
         [kjankakuseki]                [kankjakuseki]

     hai-ccha-tta-yo.
     enter-put.away-PAST-SFT

     ‘Ah, the puck got into the *[meaningless] ... stands.’

The glide [j] in “kyankaku-seki” moved from after the second [k] to after the first [k].

(13) A:  Keisatukan      no   minasan    no   doryoku  tte
         police.officer  GEN  everybody  GEN  effort   CMPL

         wakaru-yo-ne,       koo        ya-tte    mi-te-ru-to.
         understand-SFT-TAG  like.this  do-CNNCT  see-CNNCT-exist-if

     B:  Yasai      ...  yasasii  desu-ne.
         vegetable       kind     COP.PLT-SFT
         LLL             LLLL

     A:  ‘As we watch the documentary like this, we come to know the police officers’ efforts, don’t we?’

     B:  ‘They are vegetables ... kind.’

The weight bearing unit [ ] was deleted. Note that the error and its target have the same pitch accent pattern.

(14) E:  Nyuu-yooku  no   washoku        no   resutoran   de
         New-york    GEN  Japanese.food  GEN  restaurant  LOC

         yaki-zakana  ga   ur-eru     yoo     ni
         broil-fish   NOM  sell-ATTR  manner  MANNER

         tob-u-n-da        tte   ...  A,  gyaku    da.  (laughs)
         fly-ATTR-NML-COP  CMPL       Ah  reverse  COP

     I:  ... yaki-zakana  ga   tob-u     yoo     ni
             broil-fish   NOM  fly-ATTR  manner  MANNER

         ur-eru-n-da        tte.
         sell-ATTR-NML-COP  CMPL

     E:  ‘I hear that broiled fish fly in Japanese restaurant in New York City as if they were selling ... ah, the other way around.’

     I:  ‘I hear that broiled fish sell ... like hotcakes (lit, I hear that broiled fish sell ... as if they were flying).’

The two verbs “ur-” and “tob-” were exchanged.

(15) E: Moo byooin iku yonbi ... “yonbi” da

already hospital go NONWORD NONWORD COP

tte (laughs) ... junbi s-i-cha-oo.

CMPL preparation do-ADV-put.away-CHR

I1: Moo byooin iku junbi s-i-cha-oo.

already hospital go preparation do-ADV-put.away-CHR

I2: Moo byooin iku yooi s-i-cha-oo.

already hospital go preparation do-ADV-put.away-CHR

E: ‘Let’s get *[meaningless] ... I said *[meaningless] ... Let’s get ready to go to the hospital.’

I1: ‘Let’s get ready to go to the hospital.’

I2: ‘Let’s get ready to go to the hospital.’

The two nouns “junbi” and “yooi” were blended.

A telescoping (a “telescopic error” in Fromkin 1973a[1971] and a “haplology” in Levelt 1989) is a kind of omission error with no directionality. In a telescoping, “at least one weight-bearing unit (vowel, rhyme, mora, or syllable) is deleted, and the remaining segments are collapsed into a shorter utterance with the same intonational pattern as the planned longer string” (Jaeger to appear). Its unit is phonological, because the entire omitted element does not carry a meaning. Since the omitted element is not restricted to a particular phonological form, the phonological unit of a telescoping is labelled as “WU.” It should be noted that an omission error with no directionality where a meaningful lexical unit was omitted was categorized as a lexical omission, not as a telescoping. Example (13) is an example of a telescoping.

The “directionality” of an error is determined by two criteria. One is whether the error is “contextual” or “non-contextual,” namely whether there is an obvious source for the error in the linguistic context or not (2.1.3). The other factor involves the positional relationship between the error and its source in the case of a contextual error or the positional relationship between the error and the target in the case of a non-contextual error. There are two types of positional relations, “syntagmatic” and “paradigmatic (associative)” relations (Saussure 1972[1916]:121-125 [170-175]). A contextual error is always syntagmatic simply because the error and its source have a syntagmatic relation; the error originates from a source whose location is somewhere else in the utterance. One can tell from which direction a contextual error comes by examining its positional relationship with its source. If the error occurs before its source, one can judge that the source has been anticipated and thus it is an “anticipation.” If the error occurs after its source, it is a “perseveration.” There can be more than one source of an error, both before and after the error. In some cases, one of the potential sources is more likely to have actually caused the error than the other, but to be safe, I have labeled all such errors as “anticipation/perseveration,” which is abbreviated to “a/p.” An exchange is another type of contextual error, where two contextual elements exchange with each other. Note that exchange is also one type of SOT form, as mentioned above, whose directionality is also labelled as exchange.

Anticipations can be either complete or incomplete. In a complete anticipation, the speaker says both the error and its source before or without correcting it with its target. In an incomplete anticipation, the speaker notices the error and corrects it before saying the source. For a discussion of the difference between incomplete anticipations and exchanges, see Stemberger (1989) and Jaeger (1992a).

A non-contextual error is either paradigmatic or syntagmatic. Since there is no source in a non-contextual error, one needs to look at the relationship between the error and its target. Non-contextual syntagmatic errors are those where a syntagmatic string in the intended utterance is augmented or diminished without the addition or omission being affected by any element in the utterance. Thus, non-contextual additions and non-contextual omissions belong to this type of error. Such errors will be labeled as “no directionality.” If the error and its target have a paradigmatic relationship, that error is called paradigmatic. In a paradigmatic error, the error unit is incorrectly selected to occupy a slot planned for the target unit in a string, or two targets compete with each other for the same slot in a string and become blended. Paradigmatic errors include non-contextual substitutions and blends.

The different kinds of directionality of errors are summarized below. The bold-faced term will be used below to represent each category or subcategory.

(16) (A) contextual/syntagmatic
         (A1) anticipation
         (A2) perseveration
         (A3) a/p
         (A4) exchange
     (B) non-contextual/syntagmatic - no directionality
         non-contextual addition, non-contextual omission, telescoping
     (C) non-contextual/paradigmatic
         non-contextual substitution, blend
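The decision procedure summarized in (16) can be sketched as a small function. This is only an illustrative sketch, not part of the classification system itself: the function name, the parameter names, and the use of integer positions to stand for locations in the utterance are all assumptions introduced here. Whether an error counts as an exchange or as paradigmatic is taken as given by the analyst and passed in as a flag.

```python
from typing import Sequence


def directionality(
    error_pos: int,
    source_positions: Sequence[int] = (),
    *,
    is_exchange: bool = False,
    is_paradigmatic: bool = False,
) -> str:
    """Return a directionality label following the summary in (16).

    error_pos and source_positions are hypothetical integer indices into
    the utterance (assumption: smaller index = earlier in the utterance).
    An empty source_positions means the error is non-contextual.
    """
    # (A4) two contextual elements have traded places
    if is_exchange:
        return "exchange"

    # (A1)-(A3) contextual errors: compare the error with its source(s)
    if source_positions:
        if any(p == error_pos for p in source_positions):
            raise ValueError("a source cannot occupy the error's own position")
        source_follows = any(p > error_pos for p in source_positions)
        source_precedes = any(p < error_pos for p in source_positions)
        if source_follows and source_precedes:
            return "a/p"  # potential sources on both sides
        # error before its source = anticipation; after = perseveration
        return "anticipation" if source_follows else "perseveration"

    # (B)/(C) non-contextual errors: no source, so look at the target relation
    return "paradigmatic" if is_paradigmatic else "no directionality"
```

For instance, an error at position 5 with potential sources at positions 2 and 8 would be labelled "a/p" under this sketch, and a non-contextual omission (no sources, not paradigmatic) would be labelled "no directionality."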

In the following examples, (17) is a complete anticipation, (18) is a/p, and (19) is paradigmatic. (Among the earlier examples, (5) is an incomplete anticipation, (6) and (8) are perseverations, (9), (10), and (15) are paradigmatic, (12) and (17) are complete anticipations, (13) has no directionality, and (14) is an exchange.)

(17) E: De ano siso to iu no ga takabu-tta
        and FILL perilla CMPL say NML NOM get.excited-PERF
        mono o osae-masu kedo, osee-sugi-cha
        thing ACC restrain-PLT but NONWORD-overdo-CNNCT.TOP
        ik-e-mas-en node ...
        go-can-PLT-NEG because

     I: ... osae-sugi-cha ik-e-mas-en node ...
        restrain-overdo-CNNCT.TOP go-can-PLT-NEG because

     E: ‘And perilla restrains your feelings when you’re excited but it shouldn’t *[meaningless] them too much ...’

     I: ‘... it shouldn’t restrain them too much ...’

The vowel [e] in “osae-sugi-cha” was anticipated and was substituted for [a] in the same morpheme.

(18) Ano hito nyuukyoku (laughs) ...
     that person NONWORD
     [ : ]
     nyuukoku-kyohi-s-areru-n-ja-nai-no?
     entering.into.a.country-refusal-do-PASS-NML-COP.ADV.TOP-NEG-Q
     [ : ]

     ‘That person may not be *[meaningless] ... admitted into the country.’

[j] in “nyuukoku” was perseverated or [j] in “kyohi” was anticipated and was added to the first [k] in “nyuukoku.”

(19) Soo pi ... piza ... biza ga kire-masu-si ...
     FILL (pizza) pizza visa NOM expire-PLT-also

     ‘The (pizza) ... pizza ... visa was going to expire and ...’

The noun “piza” was substituted for “biza.”

The three parameters discussed so far can classify SOT errors into three broad TYPES, LEXICAL, SYNTACTIC, and PHONOLOGICAL errors, according to the stage of speech production planning at which the error has occurred (see 2.1.6 below). Notice that the names of the broad TYPES are written in capitals here to avoid confusion with the names of units. LEXICAL errors are those errors which occur when a wrong lexical item is selected from the lexicon. In errors of this TYPE, the unit is always a lexical unit. SYNTACTIC errors are those errors which are made due to a wrong assignment of a lexical item to a syntagmatic slot or to a wrong selection of a syntactic template. In SYNTACTIC errors, lexical items may be misarranged in the syntagmatic string, or two phrases may be blended and inserted in the same syntagmatic slot. The unit of such an error is a lexical unit in the former case and a phrasal unit in the latter case. PHONOLOGICAL errors occur when a wrong phonetic form is assigned to the phonological representation. The unit of this TYPE of error is invariably a phonological unit. Examples (5), (8), (12), (13), (17), and (18) are PHONOLOGICAL errors, (9), (15), and (19) are LEXICAL errors, and (6), (10), and (14) are SYNTACTIC errors. Note that the unit is lexical in (6) and (14), and phrasal in (10). The relationships between the TYPES of errors and the three parameters are simplified in Table 2.1.^[14]
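The mapping from unit and directionality to broad TYPE described above can be sketched as follows. This is a rough sketch under stated assumptions, not a restatement of Table 2.1: the function name and the string labels are hypothetical, and the rule used for lexical units (contextual directionality gives SYNTACTIC, paradigmatic gives LEXICAL) is inferred only from the examples cited in this section. Combinations that the text here does not settle, such as a lexical unit with no directionality, are rejected rather than guessed.

```python
# Directionality labels that imply a source elsewhere in the utterance
CONTEXTUAL = {"anticipation", "perseveration", "a/p", "exchange"}


def broad_type(unit: str, directionality: str) -> str:
    """Return the broad TYPE for an error, given its unit and directionality.

    unit is one of "phonological", "lexical", "phrasal" (hypothetical labels).
    """
    if unit == "phonological":
        # The unit of a PHONOLOGICAL error is invariably phonological
        return "PHONOLOGICAL"
    if unit == "phrasal":
        # Two phrases blended into one syntagmatic slot
        return "SYNTACTIC"
    if unit == "lexical":
        if directionality in CONTEXTUAL:
            # Assumption: a lexical item misassigned to a syntagmatic slot
            return "SYNTACTIC"
        if directionality == "paradigmatic":
            # A wrong lexical item selected from the lexicon
            return "LEXICAL"
        raise ValueError(
            "lexical unit with this directionality is not covered by this sketch"
        )
    raise ValueError(f"unknown unit: {unit!r}")
```

Under this sketch, example (19) (lexical unit, paradigmatic) comes out as LEXICAL, example (14) (lexical unit, exchange) as SYNTACTIC, and example (18) (phonological unit, a/p) as PHONOLOGICAL, matching the assignments given in the text.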