ABSTRACT. This paper presents results of a work in progress that investigates the features and discourse distribution of cleft constructions across languages as well as the factors at work in their choice. Specifically, we find that in English the choice of an it-cleft (e.g. It's a character defect that makes them go into politics) is determined mainly by three types of constraints, STRUCTURAL, SEMANTICO-PRAGMATIC, and DISCOURSE-COGNITIVE, involving such competing motivations as focus, topic, the contrast between given and new information, the staging of information, rhetorical structure, and text-type (Delin & Oberlander 1995:470, Perzanowski & Gurney 1997:209).
The analysis is based on data from the International Corpus of English-Great Britain (henceforth ICE-GB). It is argued that the core meaning of it-clefts is the positive identification of a discourse element, the ELEMENT IN FOCUS (EIF), usually a subject/agent NP, mostly by means of the declarative pattern it is/it was ... that in fundamentally spontaneous persuasive speech. Furthermore, it-clefts normally contain new information and implement three main discourse strategies: (a) CORRECTIVE when reformulating or correcting old topics, generally displaying the pattern new + given; (b) TRANSITIONAL when introducing new or reintroducing deactivated topics or spatio-temporal settings, usually with the pattern new + new; and (c) TOPICAL when continuing with a previous discourse topic, normally exhibiting the values given + new. Underlying these three strategies is the central function of it-clefts, namely to direct the addressee's attention towards the EIF, which specifies a relation of exhaustive topicality with respect to the preceding and/or subsequent discourse (Gundel 1977, Declerck 1983, 1984, Borkin 1984, Huddleston 1984, Collins 1985, Quirk, Greenbaum, Leech & Svartvik 1985, Declerck 1988, Geluykens 1988, Lambrecht 1988, Collins 1991, Fichtner 1993, Ball 1994, Declerck 1994, Lambrecht 1994, Delin & Oberlander 1995, Weinert & Miller 1996, Fawcett & Huang 1997, Biber, Johansson, Leech, Conrad & Finegan 1999, Davidse 2000, Hedberg 2000, Ward, Birner & Huddleston 2002).
INTRODUCTION. Traditionally, CLEFTS (1), such as those in 1, have been said to differ formally, semantically, and functionally, from corresponding more basic, frequent or canonical structures, such as those in 2, but to be otherwise equivalent in terms of truth-conditions and illocutionary force (Halliday 1967a,b, Borkin 1984, Huddleston 1984, Quirk et al. 1985, Halliday 1994, Aarts 1997, Biber et al. 1999, Davidse 2000, Ward, Birner & Huddleston 2002). The citation codes '(A1)' and '(A2)' in example 1 index full citations to items from The International Corpus of English, The British Component, which are listed in the appendix.
(1) a. It's a character defect that makes them go into politics (A1) b. it is in the least developed countries that they have suffered the most (A2)
(2) a. A character defect makes them go into politics b.i. in the least developed countries they have suffered the most b.ii. they have suffered the most in the least developed countries
Broadly, clefting identifies a discourse strategy whereby information is packaged or CLEFT into two units in order to fulfill a two-fold discourse effect: (a) to set up a relationship of identification of the specifying type between two discourse units; and (b) to give discourse prominence to (part of) one of the two units, for example, a character defect; in the least developed countries.
This investigation will be restricted to it-clefts, leaving aside such constructions as wh-clefts, or pseudoclefts, and other cognate sequences, exemplified in 3 and 4, which differ in information structure, function, and discourse effects (Prince 1978, 1981b, Hedberg 1990, Collins 1985, 1991, Haugland 1992, Delin & Oberlander 1995, Weinert & Miller 1996):
(3) a. What makes them go into politics is a character defect b. Where they have suffered the most is in the least developed countries
(4) a. The (only) thing/reason that makes them go into politics is a character defect b. The (only) place in which/where they have suffered the most is in the least developed countries
Adopting a three-dimensional approach, we shall focus on three types of constraints on the constructions under analysis: structural (Section 2), semantic (Section 3), and discourse-cognitive (Section 4). These interact with competing motivations such as: (a) FOCUS OF ATTENTION (FA), which refers to the speaker's attentional processes that shift from one attentional window to another as discourse progresses, building up the camera angle of discourse, (b) different directions of MENTAL SCANNING (or different paths of mental access), which are imposed by attentional processes and which determine the speaker's subjectivity and point of view (Langacker 2001a,b, Cornish 2004), (c) the semantico-pragmatic notion of FOCUS and the assumptions it leads to in terms of presupposition and assertion (Lambrecht 1994, Vallduvi & Engdahl 1996, Lambrecht 2001, Drubig 2003), (d) the mappings of GIVEN and NEW information, as well as the cohesive ties entailed by such mappings (Halliday & Hasan 1976, Clark & Haviland 1977, Gundel 1977, Martin 1992, Gundel 1999, Gomez-Gonzalez 2001, Gundel 2002), (e) TOPIC, that is, a prominent conceptualization that acts as a kind of cognitive anchoring point with respect to which other conceptualizations are brought into discourse (Kemmer 1995:58), at roughly two possible levels, a global discourse level (GLOBAL TOPIC) and a local discourse level (LOCAL TOPIC) (Hannay & Bolkestein 1998, Downing 1991, 1997), (f) the STAGING OF INFORMATION, which is how speakers and writers initiate discourse units and build them up incrementally as discourse unfolds, thereby guiding the conceptualizer's monitoring of the message at issue (see Mackenzie 2000, Gomez-Gonzalez 2001, 2004, Mackenzie 2004), and (g) TEXT-TYPE, characterized by a specific communicative function and such contextual variables as MODE (roughly the medium through which linguistic contact occurs, e.g. written, spoken), TENOR (kind of speaker-addressee relationship, e.g. formal, informal), and FIELD (subcategories of style referring to the social processes in which language plays a part) (e.g. Martin 1992, Halliday 1994).
Only a cursory description of the aforementioned categories is offered here because a more fine-grained account of the problems involved in their definition and application, as well as the inter-analyst disagreements regarding these issues, has been given elsewhere (Gomez-Gonzalez 2001, 2002, 2004). We shall see that the three-dimensional nature of it-clefts in combination with these six parameters explains their distribution across the ICE-GB text-types (Section 5). The paper closes with a summary of the main conclusions (Section 6).
The study is characterized as CORPUS-BASED rather than CORPUS-DRIVEN because it takes as its starting point theoretical assumptions made about it-clefts in the relevant literature, using the data extracted from ICE-GB to test the theory. By contrast, in CORPUS-DRIVEN approaches corpus evidence is used to discover linguistic insights and to make theoretical statements derived from these. (2) We now turn to the details of the corpus and the procedure used to filter the data from it.
1. MATERIAL AND METHODS. This study is based on the British Component (GB) of the International Corpus of English (ICE), a corpus compiled and annotated by the Survey of English Usage (SEU), University College London in 1998. ICE-GB compiles a sampling of the English used in the 1990s by adult (over 18) educated speakers and writers who are natives of England, Scotland, and Wales. The corpus contains a million words and displays a major mode classification into spoken language (300 texts, approximately 600,000 words) and written language (200 texts, approximately 400,000 words), comprising a wide variety of texts, as illustrated in Figure 1.
[FIGURE 1 OMITTED]
Nearly one-third of the samples consist of spontaneous dialogues, representing the most common type of speech. Each text has an average of 2,000 words. In spoken texts, there are 60,894 text units (text units correspond either to grammatical sentences or, in some instances of spoken discourse, to coherent utterances); while in written texts, there are 27,463 text units.
ICE-GB is fully tagged and parsed. The tags are of three different kinds, structural, part-of-speech, and grammatical. The parse trees of each text unit provide a visual representation of the part-of-speech of each word, the particular phrases and clauses that these words are members of, and the function that they serve. ICE-GB is released with the ICE Corpus Utility Program (ICECUP3), a text analysis program especially devised to work with the aforementioned tags and parse trees (3).
ICECUP3 selected 430 tokens of grammatical constructions tagged as CLEFTIT by the ICE-GB compilers. However, after close scrutiny, some of these tokens had to be excluded from the analysis for two main reasons. First, some turned out not to be it-clefts, as was the case with, for example, two that-complement clauses that were extraposed from subject position to the end of the sentence (e.g. Was it not the case (...) that he was (...) in the process of expansion (A3) vs. that he was (...) in the process of expansion was not the case; It is the belief that not only did Coleridge (...) put himself off his proper thought (...) (A4) vs. That not only did Coleridge (...) put himself off his proper thought (...) is the belief). Second, some were incomprehensible or incomplete sequences resulting from contextual or processing disruptions (e.g. it was the style of humour of the day which (A5); it's autosuggestion that if I might use that word (A6); it's really foreign taste is the (A7)).
Once these exclusions were made, our sample comprised a total of 422 it-clefts. Their frequencies and features will be discussed in the sections that follow. Frequencies result from dividing the number of examples by the number of words in ICE-GB, 600,000 words in the spoken part and 400,000 in the written part, and then multiplying the quotient by 10,000 words.
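The normalization just described can be illustrated with a minimal sketch. It assumes the approximate corpus sizes stated above; the function name per_10k is our own label, and since the spoken/written breakdown of the 422 tokens is not given at this point, only the overall rate over the whole corpus is computed here.

```python
# Approximate ICE-GB word counts as stated in the text.
WORDS_SPOKEN = 600_000
WORDS_WRITTEN = 400_000


def per_10k(count: int, corpus_words: int) -> float:
    """Normalized frequency: occurrences per 10,000 words (hypothetical helper)."""
    if corpus_words <= 0:
        raise ValueError("corpus_words must be positive")
    return count / corpus_words * 10_000


# Overall rate for the 422 it-clefts retained after the exclusions.
# The per-mode rates would use WORDS_SPOKEN or WORDS_WRITTEN as the divisor.
overall = per_10k(422, WORDS_SPOKEN + WORDS_WRITTEN)
print(round(overall, 2))  # 4.22
```

The same function applies to either mode once the corresponding token count is known, with the divisor switched to the size of that part of the corpus.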
2. FORMAL DIMENSION. It-clefts are two-part constructions that consist of a main clause (A) containing a copular or relational verb, usually be (It's a character defect), and a dependent clause (B) (that makes them go into politics). According to our data, unmarked declarative it-clefts are more frequent (388, 91.94%) than marked declaratives (5, 1.18%) or interrogatives (34, 34%). Marked declaratives and interrogatives will be addressed in Section 2.2.
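As a rough check on how such shares are obtained, the following sketch recomputes the percentages for the two declarative subtypes from their raw counts, taking the 422 retained it-clefts as the base. The function name share is our own label, not part of any corpus tool.

```python
TOTAL_IT_CLEFTS = 422  # sample size after exclusions, as stated in Section 1


def share(count: int, total: int = TOTAL_IT_CLEFTS) -> float:
    """Percentage of the sample accounted for by `count`, to two decimals."""
    return round(count / total * 100, 2)


unmarked = share(388)  # unmarked declaratives
marked = share(5)      # marked declaratives
print(unmarked, marked)  # 91.94 1.18
```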
2.1. THE PROTOTYPICAL TYPE: UNMARKED DECLARATIVE IT-CLEFTS. In this section we shall focus on the variations of unmarked declarative it-clefts found in ICE-GB. The variations are represented in decreasing frequency in Figure 2.
[FIGURE 2 OMITTED]
Figure 2 provides a graphical representation of three important assumptions made here about the it-cleft structural configuration: (a) 1 and 4 are presented as elements of a discontinuous description constituted by it + RDC; (b) the RDC is characterized as an immediate constituent of the it-cleft as a whole, dependent on the be-clause; and (c) the wh-form of the RDC refers back to the unit called EIF. Explanations for these three assumptions will be provided as we present our findings for each of the four prototypical elements of the construction.
The subject of the superordinate clause (A: 1), it, has been interpreted in the literature roughly in two different ways: (a) as a non-referential dummy element that acts as syntactic filler of the subject slot, or (b), the position endorsed here, as a referential element of some kind. Two main reasons are adduced by the supporters of the dummy-pronoun interpretation. One is the lack of agreement between it and the verb in B, the dependent clause. The assumption is that since the latter normally agrees with the EIF rather than with it (e.g. and it is such organisations that are more likely to be successful in periods of rapid change, A8), then it cannot be a referential pronoun but must be a dummy one (Collins 1985, Huddleston 1988, Collins 1991). Declerck and Seki (1990) reach the same conclusion on the grounds that neither can it always be formulated as a nominal, which in their view means that there is no antecedent for the pronoun, nor can it be replaced by s/he or other pronouns when referring to human antecedents (for similar conclusions see also Sornicola 1988:358-59, Weinert & Miller 1996).
We shall adopt the view that it is referential in the sense that it points both backwards and forwards in discourse. As a result of its backward-looking potential the it + be combination refers to what has come before. It specifies a relation of (dis)continuous topicality and therefore expresses the speaker's point of view.
Exploiting its forward-looking potential, it acts as a cataphoric pronoun that points to the variable represented by X or the GAP in the open proposition encoded in the dependent clause (B) (e.g. 'the X that makes them go into politics', in the example in Figure 2). The value of X is identified with the element EIF (a character defect) through the predicative construction it + be (further details on the semantic configuration of it-clefts will be given in Section 3).
This analysis has the advantage of explaining why it cannot be replaced by other pronouns, one of the arguments adduced by dummy-it supporters. The third person singular neuter pronoun is best suited to point forward to a variable (X), whose value is unknown (it may turn out to be animate or inanimate, feminine or masculine, singular or plural, etc.), in an open proposition, which crucially is not a referential expression proper.
Concomitant accounts supporting some degree of referentiality in the it of it-clefts are provided by, for example, Hedberg (2000:906, 819) and Gundel (2002:118), who assume that the cleft pronoun (it) together with the cleft clause (our dependent clause) functions as a discontinuous definite description; while Miller and Weinert (1998) similarly refer to it as a DEFINITE DEICTIC (see also e.g. Akmajian 1970, Borkin 1984, Davidse 2000).
Although absent from ICE-GB, (A: 1) realizations other than it have also been attested in the literature, as illustrated in 5.
(5) a. (No,) that was the doctor I was speaking to (Quirk et al. 1985:1386)
b. Those are my feet you're treading on (Quirk et al. 1985:1386)
It would seem, however, that these instances differ from canonical it-clefts in some respects, as also explained in, for example, Ball (1978) and Lambrecht (2001). First, the sequences in 5 are used SPECIFICATIONALLY, as opposed to PREDICATIONAL constructions (e.g. That's a nice dress you're wearing), but unlike other specificational clefts those in 5a,b cannot be uncleft, thereby resembling proverbial clefts (e.g. It's the rare linguist who can keep track of these distinctions vs. ? The rare linguist can keep track of these distinctions) (see e.g. Ball 1978, Declerck 1988, Hedberg 1990, 2000). And second, the (A: 1) elements in 5, that, those, seem to be more referential than the it of it-clefts. For one thing, only the former can be used exophorically thereby participating in an equation between two truly referential denotata. Thus, in the case of 5a, for example, we could utter that sequence pointing to someone in front of us meaning that 'that was the doctor, the person I was speaking to'. By contrast,...