Volume 2 Bilingualism Concessive
Bilingualism 1
Bilingualism
Li Wei, University of Newcastle upon Tyne, Newcastle upon Tyne, UK
© 2006 Elsevier Ltd. All rights reserved.
What Is Bilingualism? Bilingualism is a product of extensive language contact (i.e., contacts between people who speak different languages). There are many reasons for speakers of different languages to get into contact with one another. Some do so out of their own choosing, whereas others are forced by circumstances. Among the frequently cited factors that contribute to language contact are education, modern technology, economy, religion and culture, political or military acts, and natural disasters. One does not have to move to a different place to be in contact with people speaking a different language. There are plenty of opportunities for language contact in the same country, the same community, the same neighborhood, or even the same family. However, although language contact is a necessary condition for bilingualism at the societal level, it does not automatically lead to bilingualism at the individual level. For example, Belgium, Canada, Finland, India, Luxembourg, Paraguay, and Singapore, to name but a few countries, are bi- or multilingual, but the degree or extent of bilingualism among the residents of these countries varies significantly. There are large numbers of bilingual or multilingual individuals in Luxembourg, Paraguay, and Singapore, but considerably fewer in the other officially bi- or multilingual countries. Mackey (1962) claims that there are actually fewer bilingual people in bilingual countries than there are in the so-called ‘unilingual’ ones, because the main concerns of bi- or multilingual states are often the maintenance and use of two or more languages in the same nation, rather than the promotion of bilingualism among their citizens. It is therefore important to distinguish bilingualism as a social or societal phenomenon from bilingualism as an individual phenomenon.
Who Is Bilingual? People who are brought up in a society in which monolingualism and uniculturalism are promoted as the normal way of life often think that bilingualism is only for a few, ‘special’ people. In fact, one in three of the world’s population routinely uses two or more languages for work, family life, and leisure. There are even more people who make irregular use of languages other than their native one; for example, many people have learned foreign languages at school and only occasionally use them for specific purposes. If we count these people as bilinguals, then monolingual speakers would be a tiny minority in the world today. Yet the question of who is and who is not a bilingual is more difficult to answer than it first appears. Baker and Prys Jones (1998: 2) suggest that in defining a bilingual person, we may wish to consider the following questions:
• Should bilingualism be measured by how fluent people are in two languages?
• Should bilinguals be only those people who have equal competence in both languages?
• Is language proficiency the only criterion for assessing bilingualism, or should the use of two languages also be considered?
• Most people would define a bilingual as a person who can speak two languages. What about a person who can understand a second language perfectly but cannot speak it? What about a person who can speak a language but is not literate in it? What about an individual who cannot speak or understand speech in a second language but can read and write it? Should these categories of people be considered bilingual?
• Should self-perception and self-categorization be considered in defining who is bilingual?
• Are there different degrees of bilingualism that can vary over time and with circumstances? For instance, a person may learn a minority language as a child at home and then later acquire another, majority language in the community or at school. Over time, the second language may become the stronger or dominant language. If that person moves away from the neighborhood or area in which the minority language is spoken or loses contact with those who speak it, he or she may lose fluency in the minority language. Should bilingualism therefore be a relative term?
The word ‘bilingual’ primarily describes someone with the possession of two languages.
It can, however, also be taken to include the many people in the world who have varying degrees of proficiency in and interchangeably use three, four or even more languages. In many countries of Africa and Asia, several languages coexist and large sections of the population speak three or more languages. Individual multilingualism in these countries is a fact of life. Many people speak one or more local or ethnic languages, as well as another indigenous language which has become the medium of communication between different ethnic
groups or speech communities. Such individuals may also speak a foreign language – such as English, French or Spanish – which has been introduced into the community during the process of colonization. This latter language is often the language of education, bureaucracy and privilege. Multilingualism can also be the possession of individuals who do not live within a multilingual country or speech community. Families can be trilingual when the husband and wife each speak a different language as well as the common language of the place of residence. People with sufficient social and educational advantages can learn a second, third, or fourth language at school or university; at work; or in their leisure time. In many continental European countries, children learn two languages at school – such as English, German, or French – as well as being fluent in their home language – such as Danish, Dutch, or Luxembourgish. It is important to recognize that a multilingual speaker uses different languages for different purposes and does not typically possess the same level or type of proficiency in each language. In Morocco, for instance, a native speaker of Berber may also be fluent in colloquial Moroccan Arabic but not literate in either of these languages. This Berber speaker will be educated in Modern Standard Arabic and use that language for writing and formal purposes. Classical Arabic is the language of the mosque, used for prayers and reading the Qur’an. Many Moroccans also have some knowledge of French, the former colonial language.
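The point that a multilingual speaker has a different type of proficiency in each language, used for different purposes, can be made concrete with a small data structure. The following is a minimal sketch in Python, not an established model: the class names, field names, and purpose labels are all illustrative assumptions, and only the details stated above for the Moroccan example are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageProfile:
    """One language in a speaker's repertoire (field names are illustrative)."""
    name: str
    speaks: bool = False
    reads: bool = False
    writes: bool = False
    purposes: tuple = ()  # contexts in which the language is used


@dataclass(frozen=True)
class Speaker:
    label: str
    repertoire: tuple

    def languages_with(self, skill):
        """Names of the languages for which the given skill flag is set."""
        return [p.name for p in self.repertoire if getattr(p, skill)]

    def languages_for(self, purpose):
        """Names of the languages used for the given purpose."""
        return [p.name for p in self.repertoire if purpose in p.purposes]


# The Moroccan speaker described above. Only details stated in the text are
# encoded; a flag left at False means "not stated" unless the text says
# otherwise (the text does say the speaker is not literate in Berber or
# colloquial Moroccan Arabic).
speaker = Speaker(
    label="native speaker of Berber",
    repertoire=(
        LanguageProfile("Berber", speaks=True),
        LanguageProfile("Moroccan Arabic", speaks=True),
        LanguageProfile(
            "Modern Standard Arabic",
            writes=True,
            purposes=("writing", "formal"),
        ),
    ),
)

print(speaker.languages_with("speaks"))   # ['Berber', 'Moroccan Arabic']
print(speaker.languages_with("writes"))   # ['Modern Standard Arabic']
print(speaker.languages_for("formal"))    # ['Modern Standard Arabic']
```

Keeping the skills as separate flags rather than a single proficiency score reflects the questions raised earlier about whether speaking, understanding, reading, and writing should each count toward being bilingual.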
Theoretical Issues in Bilingualism Research Chomsky (1986) defined three basic questions for modern linguistics:
i. What constitutes knowledge of language?
ii. How is knowledge of language acquired?
iii. How is knowledge of language put to use?
For bilingualism research, these questions can be rephrased to take in knowledge of more than one language (see also Cook, 1993):
i. What is the nature of language, or grammar, in the bilingual person’s mind, and how do two systems of language knowledge coexist and interact?
ii. How is more than one grammatical system acquired, either simultaneously or sequentially? In what aspects does bilingual language acquisition differ from unilingual language acquisition?
iii. How is the knowledge of two or more languages used by the same speaker in bilingual speech production?
Taking the acquisition question first, earlier observers of bilingual children concentrated on documenting the stages of their language development. Volterra and Taeschner (1978), for example, proposed a three-stage model of early bilingual development. According to this model, the child initially possesses one lexical system composed of lexical items from both languages. In stage two, the child distinguishes two separate lexical codes but has one syntactic system at his or her disposal. Only when stage three is reached do the two linguistic codes become entirely separate. Volterra and Taeschner’s model gave rise to what is now known as the ‘unitary language system hypothesis.’ In its strongest version, the hypothesis supposes that the bilingual child has a single language system that he or she uses to process both languages in the repertoire. In the 1980s, the unitary language system hypothesis came under intense scrutiny; for instance, by Meisel (1989) and Genesee (1989). They argue that there is no conclusive evidence to support the existence of an initial undifferentiated language system, and they also point out certain methodological inconsistencies in the three-stage model. The phenomenon of language mixing, for instance, can be interpreted as a sign of two developing systems existing side by side, rather than as evidence of one fused system. Meisel’s and Genesee’s studies led to an alternative hypothesis, known as the ‘separate development hypothesis’ or ‘independent development hypothesis.’ More recently, researchers have investigated the possibility that different aspects of language (e.g., phonology, vocabulary, syntax, pragmatics) of the bilingual child’s language systems may develop at different rates (e.g., Li and Zhu, 2001). Care needs to be taken in interpreting research evidence using children at different developmental stages.
Although the ‘one-versus-two-systems’ debate (i.e., whether bilingual children have an initially differentiated or undifferentiated linguistic system) continues to attract new empirical studies, a more interesting question has emerged regarding the nature of bilingual development. More specifically, is bilingual acquisition the same as monolingual acquisition? Theoretically, separate development is possible without there being any similarity with monolingual acquisition. Most researchers argue that bilingual children’s language development is, by and large, the same as that of monolingual children. In very general terms, both bilingual and monolingual children go through an initial babbling stage, followed by the
one-word stage, the two-word stage, the multiword stage, and the multiclause stage. At the morphosyntactic level, a number of studies have reported similarities rather than differences between bilingual and monolingual acquisition. Garcia (1983), for example, compared the use of English morpheme categories by English monolingual children and bilingual children acquiring English and Spanish simultaneously and found no systematic difference at all. Pfaff and Savas (1988) found that their 4-year-old Turkish/German subject made the same errors in Turkish case marking as reported in the literature on monolingual Turkish children. Müller’s (1990) study of two French/German children indicates that their use of subject–verb agreement and finite verb placement in both languages is virtually identical to that of comparable monolingual children. De Houwer (1990) found that her Dutch/English bilingual subject, Kate, used exactly the same word orders in Dutch as monolingual Dutch-speaking children, both in terms of types and in proportional use. Furthermore, De Houwer found in Kate parallels to monolingual children for both Dutch and English in a range of structures, such as nonfinite verb placement, preposed elements in affirmative sentences, clause types, sentence types, conjunctions, and question inversion. Nevertheless, one needs to be careful in the kinds of conclusions one draws from such evidence. Similarities between bilingual and monolingual acquisition do not mean that the two languages a bilingual child is acquiring develop in the same way or at the same speed, or that they do not influence and interact with each other.
Paradis and Genesee (1996), for example, found that the 2–3-year-old French–English bilingual children they studied displayed patterns that characterize the performance of monolingual children acquiring these languages separately, and acquired these patterns within the same age range as monolingual children. However, they used finite verb forms earlier in French than in English; used subject pronouns in French exclusively with finite verbs, but subject pronouns in English with both finite and nonfinite verbs, in accordance with the status of subject pronouns in French as clitics (or agreement markers) but full NPs in English; and placed verbal negatives after lexical verbs in French (e.g., ‘n’aime pas’) but before lexical verbs in English (‘do not like’). Further evidence of cross-linguistic influence has been reported by Döpke (1992), for example, in her study of German–English bilingual children in Australia. These children tended to overgeneralize the VO word order of English to German, which instantiates both VO and OV word orders, depending on the clausal structure of the utterance. Döpke suggests that children learning English and German simultaneously are prone to overgeneralize SVO word order in their German because the VO order is reinforced on the surface of both the German and the English input they hear. Most of the studies that have examined cross-linguistic influences in bilingual acquisition focus on morphosyntactic features. One area that has hitherto been underexplored is the interface between phonetics and phonology in bilingual acquisition. Although most people seem to believe that the onset of speech by bilingual children is more or less the same as for monolingual children, there are indications that bilingual children develop differently from monolingual children in three respects: the overall rate of occurrence of developmental speech errors, the types of speech errors, and the quality of sounds (Zhu and Dodd, 2005). For example, studies on Cantonese/English (Holm and Dodd), Putonghua/Cantonese (So and Leung), Welsh/English (Ball et al.), Spanish/English (Yavas and Goldstein), and Punjabi/English (Stow and Pert) (also in Zhu and Dodd, 2006) bilingual children seem to indicate that bilingual children tend to make not only more speech errors but also different types of speech errors compared with monolingual children of the same age. These speech errors would be considered atypical if they had occurred in the speech of monolingual children. Moreover, although bilingual children seem to be able to acquire monolingual-like competence at the phonemic level, there are qualitative differences at the phonetic level in terms of production. For example, using instrumental analysis, Khattab (also in Zhu and Dodd, 2006) finds that although Arabic–English bilingual children have similar patterns of production and use of VOT, /l/, and /r/ in some respects to those of monolinguals from each language, they also show differences that are intricately related to age, input, and language context.
These studies and others are reported in Zhu and Dodd (2005). There is one area in which bilingual children clearly differ from monolingual children; namely, code-mixing. Studies show that bilingual children mix elements from both languages in the same utterance as soon as they can produce two-word utterances. Researchers generally agree that bilingual children’s mixing is highly structured and grammatically constrained, although there is no consensus on the nature of the specific constraints that organize their mixing. Vihman (1985), who studied her own son Raivo, who acquired English and Estonian simultaneously, argued, for example, that the language mixing by bilingual children is qualitatively different from that of more mature bilinguals. She invoked as evidence for this claim the fact that young bilingual children
show a propensity to mix function words over contentives (e.g., nouns, verbs, adjectives) – a type of mixing that is rare in older bilingual mixing. However, Lanza’s (1997) study, although finding similar patterns in the mixing produced by her two Norwegian–English bilingual subjects, argued that children’s mixing is qualitatively the same as that of adults; their relatively greater degree of mixing of function words is evidence of what Lanza called ‘dominance’ of one language over another rather than of a substantial difference from bilingual adults’ mixing. Both Vihman and Lanza, as well as other studies of children’s mixing, show that bilingual children mix their languages in accordance with constraints that operate on adult mixing. The operation of constraints based on surface features of grammar, such as word order, is evident from the two-word/two-morpheme stage onward, and the operation of constraints based on abstract notions of grammatical knowledge is most evident in bilingual children once they demonstrate such knowledge overtly (e.g., verb tense and agreement markings), usually at around 2 years and 6 months of age and older. As Genesee (2002) points out, these findings indicate that in addition to the linguistic competence needed to formulate correct monolingual strings, bilingual children have the added capacity to coordinate their two languages in accordance with the grammatical constraints of both languages during mixing. Although these studies provide further evidence for the separate development, or two-systems, argument, they also indicate that there are both quantitative and qualitative differences between bilingual acquisition and monolingual acquisition. Another area of interest in acquisitional studies of bilingual children is the role of input and social context in the rate and order of language acquisition.
Earlier assumptions were that the bilingual child would have half, or less, of the normal input in each of their two languages, compared with the monolingual child. More careful examinations of bilingual children show considerable variations in the quantity and quality of input, interactional styles of the parents, and environmental policies and attitudes toward bilingualism. On the basis of Harding and Riley’s work (1986), Romaine (1995) distinguished six types of early-childhood bilingualism according to the native language of the parents, the language of the community at large, and the parents’ strategy in speaking to the child.
Type 1: One Person, One Language
• Parents: The parents have different native languages, with each having some degree of competence in the other’s language.
• Community: The language of one of the parents is the dominant language of the community.
• Strategy: The parents each speak their own language to the child from birth.
Type 2: Nondominant Home Language/One Language, One Environment
• Parents: The parents have different native languages.
• Community: The language of one of the parents is the dominant language of the community.
• Strategy: Both parents speak the nondominant language to the child, who is fully exposed to the dominant language only when outside the home, and in particular in nursery school.
Type 3: Nondominant Home Language without Community Support
• Parents: The parents share the same native language.
• Community: The dominant language is not that of the parents.
• Strategy: The parents speak their own language to the child.
Type 4: Double Nondominant Home Language without Community Support
• Parents: The parents have different native languages.
• Community: The dominant language is different from the language of either parent.
• Strategy: The parents each speak their own language to the child from birth.
Type 5: Nonnative Parents
• Parents: The parents share the same native language.
• Community: The dominant language is the same as that of the parents.
• Strategy: One of the parents always addresses the child in a language that is not his or her native language.
Type 6: Mixed Languages
• Parents: The parents are bilingual.
• Community: Sectors of the community may also be bilingual.
• Strategy: Parents code-switch and mix languages.
The three headings Romaine used to classify the six types of childhood bilingualism – the languages of the parents, the sociolinguistic situation of the community, and the discourse strategies of the parents and other immediate carers – are critical factors not only in the process of bilingual acquisition but also in
Figure 1 Lexical association model.
Figure 2 Dual-store model.
the final product of that process (i.e., the type of bilingual speaker it produces). Arguably, the six types of bilingual children would grow up as different types of bilinguals with different mental representations of the languages and different patterns of language behavior. Research on the cognitive organization and representation of bilingual knowledge is inspired and influenced by the work of Weinreich. Focussing on the relationship between the linguistic sign (or signifier) and the semantic content (signified), Weinreich (1953) distinguished three types of bilinguals. In type A, the individual combines a signifier from each language with a separate unit of the signified. Weinreich called them ‘coordinative’ (later often called ‘coordinate’) bilinguals. In type B, the individual identifies two signifiers but regards them as a single compound, or composite, unit of signified; hence ‘compound’ bilinguals. Type C refers to people who learn a new language with the help of a previously acquired one. They are called ‘subordinative’ (or ‘subordinate’) bilinguals. Weinreich’s examples were from English and Russian:
(A) ‘book’ | /buk/     ‘kniga’ | /kn’iga/
(C) ‘book’ | /buk/ | /kn’iga/
Weinreich’s distinctions are often misinterpreted in the literature as referring to differences in the degree of proficiency in the languages, but in fact the relationship between language proficiency and cognitive organization of the bilingual individual, as conceptualized in Weinreich’s model, is far from clear. Some ‘subordinate’ bilinguals demonstrate a very high level of proficiency in processing both languages, as evidenced in grammaticality and fluency of speech, and some ‘coordinative’ bilinguals show difficulties in processing two languages simultaneously (i.e., in code-switching or in ‘foreign’ word identification tasks). It must also be stressed that, in terms of Weinreich’s distinctions, bilingual individuals are distributed along a continuum from a subordinate or compound end to a coordinate end, and that the same individual can be more subordinate or compound for certain concepts and more coordinate for others, depending on, among other things, the age and context of acquisition. Weinreich’s work influenced much of the psycholinguistic modelling of the bilingual lexicon. Potter et al. (1984) presented a reformulation of the manner in which bilingual lexical knowledge could be represented in the mind in terms of two competing models: the Concept Mediation Model and the Word Association (or Lexical Association) Model. In the Concept Mediation Model, words of both L1 and L2 are linked to amodal conceptual representations. In the Lexical Association Model, in contrast, words in a second language are understood through L1 lexical representations. As can be seen in Figure 1, the models are structurally equivalent to Weinreich’s distinction between coordinative and subordinative bilingualism. At the same time, several researchers (e.g., Kolers and Gonzalez [1980] and Hummel [1986]) presented evidence for the so-called dual-store model, as represented in Figure 2. This latter model has also generated considerable research on the existence of the putative ‘bilingual language switch’ postulated to account for the bilingual’s ability to switch between languages on the basis of environmental demands (e.g., MacNamara, 1967; MacNamara and Kushnir, 1971). Subsequent studies found conflicting evidence in favor of different models. Some of the conflicting evidence could be explained by the fact that different types of bilingual speakers were used in the experiments in terms of proficiency level, age, and context of acquisition. It is possible that lexical mediation is associated with low levels of proficiency, and concept mediation with higher levels, especially for those who have become bilingual in later childhood or adulthood. Some researchers called for a developmental dimension in the modelling of bilingual knowledge.
Figure 3 Revised hierarchical model.
Kroll and Stewart (1994), for example, proposed the Revised Hierarchical Model, which represents concept mediation and word association not as different models but as alternative routes within the same model (see Figure 3). An important distinctive feature of being bilingual is being able to make appropriate language choices. Bilingual speakers choose to use their different languages depending on a variety of factors, including the type of person addressed (e.g., members of the family, schoolmates, colleagues, superiors, friends, shopkeepers, officials, transport personnel, neighbors), the subject matter of the conversation (e.g., family concerns, schoolwork, politics, entertainment), location or social setting (e.g., at home, in the street, in church, in the office, having lunch, attending a lecture, negotiating business deals), and relationship with the addressee (e.g., kin, neighbors, colleagues, superior/inferior, strangers). However, even more complex are the many cases in which a bilingual talks to another bilingual with the same linguistic background and changes from one language to another in the course of conversation. This is what is known as code-switching. Figure 4 illustrates a decision-making process of the bilingual speaker in language choice and code-switching. There is a widespread impression that bilingual speakers code-switch because they cannot express themselves adequately in one language. This may be true to some extent when a bilingual is momentarily lost for words in one of his or her languages. However, code-switching is an extremely common practice among bilinguals and takes many forms. A long narrative may be divided into different parts expressed in different languages, sentences may begin in one language and finish in another, and words and phrases from different languages may succeed each other. Linguists have devoted much attention to the study of code-switching. 
It has been demonstrated that code-switching involves skilled manipulation of overlapping sections of two or more grammars and that there is virtually no instance of ungrammatical combination of two languages in code-switching, regardless of the bilingual ability of the speaker. Some suggest that code-switching is itself a discrete
Figure 4 Adapted from Grosjean, 1982: 129.
mode of speaking, emanating from a single code-switching grammar. One important aspect of the code-switching grammar is that the two languages involved do not play the same role in sentence making. Typically, one language sets the grammatical framework, with the other providing certain items to fit into the framework. Code-switching therefore is not a simple combination of two sets of grammatical rules but grammatical integration of one language in another. Bilingual speakers of different proficiency levels in their two languages or speaking two typologically different languages can engage in code-switching and, indeed, vary it according to their needs. The possible existence of a code-switching grammar calls into question the traditional view of the bilingual as two monolinguals in one person (for further discussions, see Grosjean, 1985). One consequence of the ‘two-in-one’ perspective is that bilingual speakers are often compared to monolinguals in terms of their language proficiency. For example, some researchers have suggested that bilingual children have smaller vocabularies and less-developed grammars than their monolingual peers, while their ability to exploit the similarities and differences in two sets of grammatical rules to accomplish rule-governed code-switching was not considered relevant. In some experimental psycholinguistic studies, tests are given without taking into account that bilingual speakers may have learned their two languages under different conditions for different purposes and that they use them only in different situations with different people. It is important to emphasize that bilingual speakers have a unique linguistic and psychological profile; their two languages are constantly in different states of activation, and they are able to call on their linguistic knowledge and resources according to the context and adapt their behavior to the task at hand.
Bilingualism as a Sociopolitical Issue Language choice is not a purely linguistic issue. In many countries of the world, much of the social identification of individuals, as well as of groups, is accomplished through language choice. By choosing one or another of the two or more languages in one’s linguistic repertoire, a speaker reveals and defines his or her social relationships with other people. At a societal level, whole groups of people, and in fact, entire nations, can be identified by the language or languages they use. Language, together with culture, religion, and history, is a major component of national identity. Multilingual countries are often thought to have certain problems that monolingual states do not. On the practical level, difficulties in communication within a country can act as an impediment to commerce and industry. More seriously, however, multilingualism is a problem for government. The process of governing requires communication both within the governing institutions and between the government and the people. This means that a language, or languages, must be selected as the language for use in governing. However, the selection of the ‘official language’ is not always easy, as it is not simply a pragmatic issue. For example, on pragmatic grounds, the best immediate choice for the language of government in a newly independent colony might be the old colonial language, as the colonial governing institutions and records are already in place in that language, and those nationals with the most government experience already know it. The old colonial language will not, however, be a good choice on nationalist grounds. For a people that has just acquired its own geographical territory, the language of the state that had denied it territorial control would not be a desirable candidate for a national symbol. 
Ireland has adopted a strategy in which both the national language, Irish, and the language of the deposed power, English, are declared as official; the colonial language is used for immediate, practical purposes, and the national language is promoted and developed. However, in many other multilingual countries that do not have a colonial past, such as China, deciding which language should be selected as the national language can sometimes lead to internal, ethnic conflicts. Similarly, selecting a language for education in a multilingual country is often problematic. In some respects, the best strategy for language in education is to use the various ethnic languages. After all, these are the languages the children already speak, and school instruction can begin immediately without waiting until the children learn the official language. Some would argue, however, that this strategy could
be damaging for nation-building efforts and disadvantage children by limiting their access to the wider world. It should be pointed out that there is no scientific evidence to show that multilingual countries are particularly disadvantaged, in socioeconomic terms, compared to monolingual ones. In fact, all the research that was carried out in the 1960s and 1970s on the relationship between the linguistic diversity and economic well-being of a nation came to the conclusion that a country can have any degree of language uniformity or fragmentation and still be underdeveloped, and a country whose entire population speaks the same language can be anywhere from very rich to very poor. It might be true, however, that linguistic uniformity and economic development reinforce each other; in other words, economic wellbeing promotes the reduction of linguistic diversity. It would be lopsided logic, though, to view multilingualism as the cause of the socioeconomic problems of a nation. Multilingualism is an important resource at both the societal and personal levels. For a linguistically diverse country to maintain ethnic group languages alongside the national or official languages can prove an effective way to motivate individuals while unifying the nation. In addition, a multiethnic society is arguably a richer, more exciting, and more stimulating place to live in than a community with only one dominant ethnic group. For the multilingual speaker, the availability of various languages in the community repertoire serves as a useful interactional resource. Typically, multilingual societies tend to assign different roles to different languages; one language may be used in informal contexts with family and friends, while another for the more formal situations of work, education, and government. Imagine two friends who are both bilingual in the same ‘home’ and ‘official’ languages. 
Suppose that one of them also works for the local government and that her friend has some official business with her. Suppose further that the government employee has two pieces of advice to give to her friend: one based on her official status as a government representative, and one based on their mutual friendship. If the official advice is given in the ‘government’ language and the friendly advice in the ‘home’ language, there is little chance that there would be any misunderstanding about which advice was which. The friend would not take the advice given in the ‘home’ language as official. There is a frequent debate in countries in which various languages coexist concerning which languages are a resource. The favored languages tend to be those that are both international and particularly valuable in international trade. A lower place is
given in the status ranking to minority languages, which are small, regional, and of less perceived value in the international marketplace. For example, French has traditionally been the number one modern language in the British school curriculum, followed by German and Spanish, and then a choice between Italian, Modern Greek, and Portuguese. One may notice that all of these are European languages. Despite large numbers of mother-tongue Bengali, Cantonese, Gujarati, Hakka, Hindi, Punjabi, Turkish, and Urdu speakers in England, these languages occupy a very low position in the school curriculum. In the British National Curriculum, the languages Arabic, Bengali, Chinese (Cantonese or Mandarin), Gujarati, Modern Hebrew, Hindi, Japanese, Punjabi, Russian, Turkish, and Urdu are initially only allowed in secondary schools (for 11–18 year olds) if a major European language such as French is taught first (Milroy and Milroy, 1985). Clearly, multilingualism as a national and personal resource requires careful planning, as would any other kind of resource. However, language planning has something that other kinds of economic planning do not usually have: language has its own unique cultural symbolic value. As has been discussed earlier, language is a major component of the identity of a nation and an individual. Often, strong emotions are evoked when talking about a certain language. Language planning is not simply a matter of standardizing or modernizing a corpus of linguistic materials, nor is it merely a reassignment of functions and status. It is also about power and influence. The dominance of some languages and the dominated status of other languages are partly understandable if we examine who holds positions of power and influence, who belongs to the elite groups that are in control of decision-making, and who is in the subordinate groups on whom decisions are implemented. 
It is more often than not the case that a given arrangement of languages benefits only those who have influence and privileges. For the multilingual speaker, language choice is not only an effective means of communication but also an act of identity (Le Page and Tabouret-Keller, 1985). Every time we say something in one language when we might just as easily have said it in another, we are reconnecting with people, situations, and power configurations from our history of past interactions and imprinting on that history our attitudes toward the people and languages concerned. Through language choice, we maintain and change ethnic group boundaries and personal relationships and construct and define ‘self’ and ‘other’ within a broader political economy and historical context.
Changes in Attitudes Toward Bilingualism From the early nineteenth century to about the 1960s, there was a widespread belief that bilingualism has a detrimental effect on human beings’ intellectual and spiritual growth. Stories of children who persisted in speaking two languages in school having had their mouths washed with soap and water or being beaten with a cane were not uncommon. The following is a quote from a professor at Cambridge University that illustrates the dominant belief of the time, even among academics and intellectuals: If it were possible for a child to live in two languages at once equally well, so much the worse. His intellectual and spiritual growth would not thereby be doubled, but halved. Unity of mind and character would have great difficulty in asserting itself in such circumstances. (Laurie, 1890: 15)
Professor Laurie’s view represented a commonly held belief throughout the twentieth century that bilingualism disadvantages rather than advantages one’s intellectual development. Early research on bilingualism and cognition tended to confirm this negative viewpoint, finding that monolinguals were superior to bilinguals on intelligence tests. One of the most widely cited studies was done by Saer (1923) who studied 1400 Welsh–English bilingual children between the ages of 7 and 14 years in five rural and two urban areas of Wales. A 10-point difference in IQ was found between the bilinguals and the monolingual English speakers from rural backgrounds. From this, Saer concluded that bilinguals were mentally confused and at a disadvantage in intelligence compared with monolinguals. It was further suggested, with a follow-up study of university students, that ‘‘the difference in mental ability as revealed by intelligence tests is of a permanent nature since it persists in students throughout their university career’’ (Saer, 1923: 53). Controversies regarding the early versions of IQ tests and the definition and measurement of intelligence aside, there were a number of problems with Saer’s study and its conclusions. First, it appeared to be only in the rural areas that the correlation between bilingualism and lower IQ held. In urban areas, monolinguals and bilinguals were virtually the same; in fact, the average IQ for urban Welsh–English bilingual children in Saer’s study was 100, whereas for monolingual, English-speaking children it was 99. The urban bilingual children had more contact with English both before beginning school and outside school hours than did the rural bilinguals. Thus, the depressed scores of the rural population were probably more a reflection of lack of opportunity
and contexts to use English and were not necessarily indicative of any sociopsychological problems. More important, however, is the issue of statistical inference in this and other studies of a similar type. Correlations do not allow us to infer cause-and-effect relationships, particularly when other variables – such as rural versus urban differences – may be mediating factors. Another major factor is the language in which such tests were administered, particularly tests of verbal intelligence. Many such studies measured bilinguals only in the second or nondominant language. At around the same time that Saer conducted studies on bilinguals’ intelligence, some well-known linguists expressed their doubts about bilingual speakers’ linguistic competence. The following is Bloomfield’s characterization of a Menomini Indian man in the United States, whom he believed to have ‘deficient’ knowledge of Menomini and English: White Thunder, a man around 40, speaks less English than Menomini, and that is a strong indictment, for his Menomini is atrocious. His vocabulary is small, his inflections are often barbarous, he constructs sentences of a few threadbare models. He may be said to speak no language tolerably. (Bloomfield, 1927: 395)
This is one of the early statements of a view that became fashionable in educational circles; namely, that it was possible for bilinguals not to acquire full competence in any of the languages they spoke. Such an individual was said to be ‘semilingual.’ These people were believed to have linguistic deficits in six areas of language (see Hansegard, 1975; Skutnabb-Kangas, 1981):
1. Size of vocabulary
2. Correctness of language
3. Unconscious processing of language
4. Language creation
5. Mastery of the functions of language
6. Meanings and imagery.
It is significant that the term ‘semilingualism’ emerged in connection with the study of language skills of people belonging to ethnic minority groups. Research that provided evidence in support of the notion of ‘semilingualism’ was conducted in Scandinavia and North America and was concerned with accounting for the educational outcomes of submersion programs in which minority children were taught through the medium of the majority language. However, these studies, similar to the ones conducted by Saer, had serious methodological flaws, and the conclusions reached by the researchers were misguided.
First, the educational tests used to measure language proficiencies and to differentiate between people were insensitive to the qualitative aspects of languages and to the great range of language competences. Language may be specific to a context; a person may be competent in some contexts but not in others. Second, bilingual children are still in the process of developing their languages. It is unfair to compare them to some idealized adults. Their language skills change over time. Third, the comparison with monolinguals is also unfair. It is important to distinguish whether bilinguals are ‘naturally’ qualitatively and quantitatively different from monolinguals in their use of the two languages (i.e., as a function of being bilingual). Fourth, if languages are relatively underdeveloped, the origins may not be in bilingualism per se but in the economic, political, and social conditions that evoke underdevelopment. The disparaging and belittling overtone of the term ‘semilingualism’ itself invokes expectations of underachievement in the bilingual speaker. Thus, rather than highlighting the apparent ‘deficits’ of bilingual speakers, the more positive approach is to emphasize that when suitable conditions are provided, languages are easily capable of development beyond the ‘semi’ state. One of the specific issues Bloomfield raised in his comments on the language behavior of members of the Menomini Indians in North America was the frequent mixing of their own language and English. It has been described as ‘verbal salad,’ not particularly appealing but nevertheless harmless, or ‘garbage’ that is definitively worthless and vulgar. Unfortunately, although switching and mixing of languages occurs in practically all bilingual communities and all bilingual speakers’ speech, it is stigmatized as an illegitimate mode of communication, even sometimes by the bilingual speakers themselves. 
Haugen (1977: 97), for example, reports that a visitor from Norway made the following comment on the speech of the Norwegians in the United States: ‘‘Strictly speaking, it is no language whatever, but a gruesome mixture of Norwegian and English, and often one does not know whether to take it humorously or seriously.’’ Gumperz (1982: 62–63) reports that some bilingual speakers who mixed languages regularly still believed such behavior was ‘‘bad manners’’ or a sign of ‘‘lack of education or improper control of language.’’ One of the Punjabi–English bilinguals Romaine interviewed said: ‘‘I’m guilty as well in the sense that we speak English more and more and then what happens is that when you speak your own language you get two or three English words in each sentence . . . but I think that’s ‘wrong’’’ (Romaine, 1995: 294).
Attitudes do not, of course, remain constant over time. At a personal level, changes in attitudes may occur when there is some personal reward involved. Speakers of minority languages will be more motivated to maintain and use their languages if they prove to be useful in increasing their employability or social mobility. In some cases, certain jobs are reserved for bilingual speakers only. At the societal level, attitudes toward bilingualism change when the political ideology changes. In California and elsewhere in the southwestern United States, for instance, pocho and calo used to serve as pejorative terms for the Spanish of local Chicanos. With a rise in ethnic consciousness, however, these speech styles have become symbolic of Chicano ethnicity and are now increasingly used in contemporary Chicano literature. Since the 1960s, there has been a political movement, particularly in the United States, advocating language rights. In the United States, questions about language rights are widely discussed not only in college classrooms and language communities but also in government and federal legislatures. Language rights have a history of being tested in U.S. courtrooms. From the early 1920s to the present, there has been a continuous debate in U.S. courts of law regarding the legal status of language minority rights. To gain short-term protection and a medium-term guarantee for minority languages, legal challenges have become an important part of the language rights movement. The legal battles concerned not just minority language vs. majority language contests, but also children vs. schools, parents vs. school boards, state vs. the federal authorities, and so on. Whereas minority language activists among the Basques in Spain and the Welsh in Britain have been taken to court by the central government for their actions, U.S. minority language activists have taken the central and regional government to court. 
The language rights movement has received some support from organizations such as the United Nations, UNESCO, the Council of Europe, and the European Union. Each of these four organizations has declared that minority language groups have the right to maintain their languages. In the European Union, a directive (77/486/EEC) stated that member states should promote the teaching of the mother tongue and the culture of the country of origin in the education of migrant workers’ children. The kind of rights, apart from language rights, that minority groups may claim include protection, membership of their ethnic group and separate existence, nondiscrimination and equal treatment, education and information in their ethnic language, freedom to worship, freedom of belief, freedom of movement, employment, peaceful
assembly and association, political representation and involvement, and administrative autonomy. However, real changes in attitudes toward bilingualism will not happen until people recognize or, better still, experience the advantages of being bilingual. Current research indicates that there are at least eight overlapping and interacting benefits for a bilingual person, encompassing communicative, cognitive, and cultural advantages (adapted from Baker and Prys Jones, 1998: 6–8):
Communicative advantages
Relationships with parents: Where parents have differing first languages, the advantage of children becoming bilingual is that they will be able to communicate in each parent’s preferred language. This may enable a subtler, finer texture of relationship with the parent. Alternatively, they will be able to communicate with parents in one language and with their friends and within the community in a different language.
Extended family relationships: Being a bilingual allows someone to bridge the generations. When grandparents, uncles, aunts, and other relatives in another region speak a language that is different from the local language, the monolingual may be unable to communicate with them. The bilingual has the chance to bridge that generation gap and build closer relationships with relatives in the extended family.
Community relationships: A bilingual has the chance to communicate with a wider variety of people than a monolingual. Bilingual children will be able to communicate in the wider community and with school and neighbourhood friends in different languages when necessary.
Transnational communication: One barrier between nations and ethnic groups tends to be language. Language is sometimes a barrier to communication and to creating friendly relationships of mutual respect. Bilinguals in the home, in the community, and in society have the potential for lowering such barriers. 
Bilinguals can act as bridges within the nuclear and extended family, within the community, and across societies.
Language sensitivity: Being able to move between two languages may lead to more sensitivity in communication. Because bilinguals are constantly monitoring which language to use in different situations, they may be more attuned to the communicative needs of those with whom they talk. Research suggests that bilinguals may be more empathic towards listeners’ needs in communication. When meeting those who do not speak their language particularly well, bilinguals may be more patient listeners than monolinguals.
Cultural advantages
Another advantage of being a bilingual is having two or more worlds of experience. Bilingualism provides the opportunity to experience two or more cultures. The monolingual may experience a variety of cultures; for example, from different neighbours and communities
that use the same language but have different ways of life. The monolingual can also travel to neighbouring countries and experience other cultures as a passive onlooker. However, to penetrate different cultures requires the language of that culture. To participate and become involved in the core of a culture requires a knowledge of the language of that culture. There are also potential economic advantages to being bilingual. A person with two languages may have a wider portfolio of jobs available. As economic trade barriers fall, as international relationships become closer, and as unions and partnerships across nations become more widespread, an increasing number of jobs are likely to require a person to be bilingual or multilingual. Jobs in multinational companies, jobs selling and exporting, and employment prospects generated by transnational contact make the future of employment more versatile for bilinguals than monolinguals.
Cognitive advantages
More recent research has shown that bilinguals may have some advantages in thinking, ranging from creative thinking to faster progress in early cognitive development and greater sensitivity in communication. For example, bilinguals may have two or more words for each object and idea; sometimes corresponding words in different languages have different connotations. Bilinguals are able to extend the range of meanings, associations, and images, and to think more flexibly and creatively. Therefore, a bilingual has the possibility of more awareness of language and more fluency, flexibility, and elaboration in thinking than a monolingual.
It would be misleading to suggest that there is no disadvantage to bilingualism. Some problems, both social and individual, may be falsely attributed to bilingualism. For instance, when bilingual children exhibit language or personality problems, bilingualism is sometimes blamed. Problems of social unrest may unfairly be attributed to the presence of two or more languages in a community. However, the real possible disadvantages of bilingualism tend to be temporary. For example, bilingual families may be spending significantly more of their time and making much greater efforts to maintain two languages and bring up children bilingually. Some bilingual children may find it difficult to cope with the school curriculum in either language for a short period of time. However, the individual, cognitive, cultural, intellectual, and economic advantages bilingualism brings to a person make all the effort worthwhile. A more complex problem associated with bilingualism is the question of identity of a bilingual. If a child has both a French and an English parent and speaks each language fluently, is he or she French, English, or Anglo-French? If a child speaks English and a minority language such as Welsh, is he or she Welsh, English, British, European, or what? It has
to be said that for many bilingual people, identity is not a problem. Although speaking two languages, they are resolutely identified with one ethnic or cultural group. For example, many bilinguals in Wales see themselves as Welsh first, and possibly British next, but not English. Others, however, find identity a real, problematic issue. Some immigrants, for instance, desperately want to lose the identity of their native country and become assimilated and identified with the new home country, while some others want to develop a new identity and feel more comfortable with being culturally hyphenated, such as Chinese-American, Italian-Australian, Swedish-Finn, or Anglo-French. Yet identity crises and conflicts are never static. Identities change and evolve over time, with varying experiences, interactions, and collaborations within and outside a language group. Bilingualism is not a static and unitary phenomenon; it is shaped in different ways, and it changes depending on a variety of historical, cultural, political, economic, environmental, linguistic, psychological, and other factors. Our understanding of bilingual speakers’ knowledge and skills will grow as research methodology is defined and refined and our attitudes toward bilingualism change to the positive. See also: Bilingual Education; Bilingual Language Development: Early Years; Bilingualism and Second Language Learning; Interlanguage; Lingua Francas as Second Languages; Society and Language: Overview.
Bibliography
Baker C & Prys Jones S (1998). Encyclopaedia of bilingualism and bilingual education. Clevedon: Multilingual Matters.
Bloomfield L (1927). ‘Literate and illiterate speech.’ American Speech 2, 432–439.
Chomsky N (1986). Knowledge of language: its nature, origin and use. New York: Praeger.
Cook V (1993). Linguistics and second language acquisition. London: Macmillan.
De Houwer A (1990). The acquisition of two languages from birth. Cambridge: Cambridge University Press.
Dopke S (1992). One parent, one language. Amsterdam: Benjamins.
Garcia E (1983). Early childhood bilingualism. Albuquerque: University of New Mexico Press.
Genesee F (1989). ‘Early bilingual language development: one language or two?’ Journal of Child Language 16, 161–179.
Genesee F (2002). ‘Rethinking bilingual acquisition.’ In Dewaele J-M, Housen A & Li W (eds.) Bilingualism: beyond basic principles. Clevedon: Multilingual Matters. 204–228.
Grosjean F (1985). ‘The bilingual as a competent but specific speaker-hearer.’ Journal of Multilingual and Multicultural Development 6, 467–477.
Gumperz J J (1982). Discourse strategies. Cambridge: Cambridge University Press.
Hansegard N E (1975). ‘Tvasprakighet eller havsprakighet?’ Invandrare och Minoriteter 3, 7–13.
Harding E & Riley P (1986). The bilingual family. Cambridge: Cambridge University Press.
Haugen E (1977). ‘Norm and deviation in bilingual communities.’ In Hornby P (ed.) Bilingualism: psychological, social and educational implications. New York: Academic Press.
Hummel K (1986). ‘Memory for bilingual prose.’ In Vaid J (ed.) Language processing in bilinguals: psycholinguistic and neurolinguistic perspectives. Hillsdale, NJ: Lawrence Erlbaum.
Kolers P & Gonzalez E (1980). ‘Memory for words, synonyms and translation.’ Journal of Experimental Psychology: Human Learning and Memory 6, 53–65.
Kroll J & Stewart E (1994). ‘Category interference in translation and picture naming: evidence for asymmetric connections between bilingual memory representations.’ Journal of Memory and Language 33, 149–174.
Lanza E (1997). Language mixing in infant bilingualism. Oxford: Oxford University Press.
Laurie S S (1890). Lectures on language and linguistic method in school. Cambridge: Cambridge University Press.
Le Page R & Tabouret-Keller A (1985). Acts of identity: Creole-based approaches to language and ethnicity. Cambridge: Cambridge University Press.
Li W & Zhu H (2001). ‘Development of code-switching and L1 attrition in L2 setting.’ In Almgren M, Barrena A, Ezeizabarrena M-J, Idiazabal I & MacWhinney B (eds.) Research on child language acquisition. Somerville, MA: Cascadilla Press. 174–187.
Mackey W F (1962). ‘The description of bilingualism.’ Canadian Journal of Linguistics 7, 51–85.
MacNamara J (1967). ‘The linguistic independence of bilinguals.’ Journal of Verbal Learning and Verbal Behaviour 6, 729–736.
MacNamara J & Kushnir S (1971). ‘The linguistic independence of bilinguals: the input switch.’ Journal of Verbal Learning and Verbal Behaviour 10, 480–487.
Meisel J M (1989). ‘Early differentiation of languages in bilingual children.’ In Hyltenstam K & Obler L (eds.) Bilingualism across the lifespan: aspects of acquisition, maturity and loss. Cambridge: Cambridge University Press. 13–40.
Milroy J & Milroy L (1985). Authority in language. London: Routledge.
Muller N (1990). ‘Developing two gender assignment systems simultaneously.’ In Meisel J (ed.) Two first languages. Dordrecht: Foris. 193–236.
Paradis J & Genesee F (1996). ‘Syntactic acquisition in bilingual children.’ Studies in Second Language Acquisition 18, 1–25.
Pfaff C & Savas T (1988). ‘Language development in a bilingual setting.’ Paper presented at the 4th Turkish Linguistics Conference, Ankara.
Potter M C, So K-F, Von Eckardt B & Feldman L B (1984). ‘Lexical and conceptual representation in beginning and more proficient bilinguals.’ Journal of Verbal Learning and Verbal Behaviour 23, 23–38.
Romaine S (1995). Bilingualism (2nd edn.). Oxford: Blackwell.
Saer D J (1923). ‘An inquiry into the effect of bilingualism upon the intelligence of young children.’ Journal of Experimental Psychology 6, 232–240, 266–274.
Skutnabb-Kangas T (1981). Bilingualism or not: the education of minorities. Clevedon: Multilingual Matters.
Vihman M (1985). ‘Language differentiation by the bilingual infant.’ Journal of Child Language 12, 297–324.
Volterra V & Taeschner T (1978). ‘The acquisition and development of language by bilingual children.’ Journal of Child Language 5, 311–326.
Weinreich U (1953). Languages in contact: findings and problems. New York: The Linguistic Circle of New York.
Zhu H & Dodd B (eds.) (2006). Phonological development and disorder: a multilingual perspective. Clevedon: Multilingual Matters.
Bilingualism and Aphasia P C M Wong, Northwestern University, Evanston, IL, USA © 2006 Elsevier Ltd. All rights reserved.
Bilingual individuals, sometimes referred to as multilinguals or polyglots, are broadly defined as individuals who know (and use) two or more languages. These individuals possibly acquire (or are still acquiring) the two or more languages at different times in their lives and use these languages at different levels of proficiency. Although the term ‘perfect bilingual’ has been used to refer to individuals who are equally
proficient in the languages they know, often proficiency and use depend on the social/functional situations (e.g., work vs. family settings). Thus, it has been argued that bilinguals are not truly ‘two monolinguals in one person’ but are holistic, unique, and specific speaker–hearers (Grosjean, 1989). In the case of aphasia (language deficits as a result of brain damage), the various languages can be affected and recovered differently. Consequently, assessing and rehabilitating bilingual aphasics warrant considerations that are different from (or additional to) those associated with monolingual aphasics.
12 Bilingualism Grosjean F (1985). ‘The bilingual as a competent but specific speaker-hearer.’ Journal of Multilingual and Multicultural Development 6, 467–477. Gumperz J J (1982). Discourse strategies. Cambridge: Cambridge University Press. Hansegard N E (1975). ‘Tvasprakighet eller havsprakighet?’ Invandrare och Minoriteter 3, 7–13. Harding E & Riley P (1986). The bilingual family. Cambridge: Cambridge University Press. Haugen E (1977). ‘Norm and deviation in bilingual communities.’ In Hornby P (ed.) Bilingualism: psychological, social and educational implications. New York: Academic Press. Hummel K (1986). ‘Memory for bilingual prose.’ In Vaid J (ed.) Language processing in bilinguals: psycholinguistic and neurolinguistic perspectives. Hillsdale, NJ: Lawrence Erlbaum. Kolers P & Gonzalez E (1980). ‘Memory for words, synonyms and translation.’ Journal of Experimental Psychology: Human Learning and Memory 6, 53–65. Kroll J & Stewart E (1994). ‘Category interference in translation and picture naming: evidence for asymmetric connections between bilingual memory representations.’ Journal of Memory and Language 33, 149–174. Lanza E (1997). Language mixing in infant bilingualism. Oxford: Oxford University Press. Laurie S S (1890). Lectures on language and linguistic method in school. Cambridge: Cambridge University Press. Le Page R & Tabouret-Keller A (1985). Acts of identity: Creole-based approaches to language and ethnicity. Cambridge: Cambridge University Press. Li W & Zhu H (2001). ‘Development of code-switching and L1 attrition in L2 setting.’ In Almgren M, Barrena A, Ezeizabarrena M-J, Idiazabal I & MacWhinney B (eds.) Research on child language acquisition. Somerville, MA: Cascadilla Press. 174–187. Mackey W F (1962). ‘The description of bilingualism.’ Canadian Journal of Linguistics 7, 51–85. MacNamara J (1967). ‘The linguistic independence of bilinguals.’ Journal of Verbal Leaning and Verbal Behaviour 6, 729–736.
MacNamara J & Kushnir S (1971). ‘The linguistic independence of bilinguals: the input switch.’ Journal of Verbal Leaning and Verbal Behaviour 10, 480–487. Meisel J M (1989). ‘Early differentiation of languages in bilingual children.’ In Hyltenstam K & Obler L (eds.) Bilingualism across the lifespan: aspects of acquisition, maturity and loss. Cambridge: Cambridge University Press. 13–40. Milroy J & Milroy L (1985). Authority in language. London: Routledge. Muller N (1990). ‘Developing two gender assignment systems simultaneously.’ In Meisel J (ed.) Two first languages. Dordrecht: Foris. 193–236. Paradis J & Gensee F (1996). ‘Syntactic acquisition in bilingual children.’ Studies in Second Language Acquisition 18, 1–25. Pfaff C & Savas T (1988). ‘Language development in a bilingual setting.’ Paper presented at the 4th Turkish Linguistics Conference, Ankara. Potter M C, So K-F, VonEchardt B & Feldman L B (1984). ‘Lexical and conceptual representation in beginning and more proficient bilinguals.’ Journal of Verbal Learning and Verbal Behaviour 23, 23–38. Romaine S (1995). Bilingualism (2nd edn.). Oxford: Blackwell. Saer D J (1923). ‘An inquiry into the effect of bilingualism upon the intelligence of young children.’ Journal of Experimental Psychology 6, 232–240, 266–274. Skutnabb-Kangas T (1981). Bilingualism or not: the education of minorities. Clevedon: Multilingual Matters. Vihman M (1985). ‘Language differentiation by the bilingual infant.’ Journal of Child Language 12, 297–324. Volterra V & Taeschner T (1978). ‘The acquisition and development of language by bilingual children.’ Journal of Child Language 5, 311–326. Weinreich U (1953). Languages in contact: findings and problems. New York: The Linguistic Circle of New York. Zhu H & Dodd B (eds.) (2006). Phonological development and disorder: a multilingual perspective. Clevedon: Multilingual Matters.
Bilingualism and Aphasia
P C M Wong, Northwestern University, Evanston, IL, USA
© 2006 Elsevier Ltd. All rights reserved.
Bilingual individuals, sometimes referred to as multilinguals or polyglots, are broadly defined as individuals who know (and use) two or more languages. These individuals possibly acquire (or are still acquiring) the two or more languages at different times in their lives and use these languages at different levels of proficiency. Although the term ‘perfect bilingual’ has been used to refer to individuals who are equally
proficient in the languages they know, often proficiency and use depend on the social/functional situations (e.g., work vs. family settings). Thus, it has been argued that bilinguals are not truly ‘two monolinguals in one person’ but are holistic, unique, and specific speaker–hearers (Grosjean, 1989). In the case of aphasia (language deficits as a result of brain damage), the various languages can be affected and recovered differently. Consequently, assessing and rehabilitating bilingual aphasics warrant considerations that are different from (or additional to) those associated with monolingual aphasics.
Bilingualism and the Brain
To better understand how neurological injuries may affect the linguistic abilities of individuals who speak more than one language, it is important to consider how multiple languages may be organized in the brain. Traditionally, the debate has centered on ‘language laterality’ or ‘hemispheric specialization’; that is, whether one side of the brain (the left side) is mostly responsible for both languages, whether the right hemisphere contributes more in bilinguals than in monolinguals, and whether one hemisphere contributes mostly to only one language (Paradis, 1990). Although the issue of laterality has some bearing on predicting the presence or absence of aphasia as a result of brain injury, it considers the brain only in very gross neuroanatomic terms (i.e., left and right hemispheres). Recently, the precise neuroanatomic circuits within and across cerebral hemispheres have been considered, as have other structures in the nervous system, along with factors such as language use, age of acquisition, proficiency, and level and medium of exposure, which potentially have more extensive clinical implications. Recent neuroimaging studies, although involving only isolated linguistic tasks, suggest that attained proficiency and the age of language acquisition may be determining factors in whether the two languages are subserved by the same neural circuits. Wong et al. (2004) found that even though both native Mandarin-speaking and English-speaking adults (who do not speak Mandarin) were able to discriminate Mandarin lexical tone patterns, a feature of the Mandarin language, the two groups used regions near the inferior frontal gyrus but in opposite hemispheres when doing so, presumably due to their corresponding attained proficiency or lack thereof in Mandarin. Kim et al.
(1997) found that early but not late bilinguals showed spatially overlapping brain activations in the left inferior frontal gyrus associated with sentence generation in first (L1) and second (L2) languages. Late bilinguals also showed activation in the left inferior frontal gyrus, but the centers of activation were further apart relative to the early bilinguals. However, since early bilinguals tend to have a higher level of proficiency in both languages, other studies have suggested that attained proficiency might be the most important factor in determining whether or not the two languages are subserved by the same neural circuit (Perani et al., 1998; for a review, see Abutalebi et al., 2001). Converging evidence on brain and bilingualism is being built and shows great promise for the effective assessment and rehabilitation of bilingual aphasics, especially when combined with existing
knowledge in the neurobiology of monolingual aphasia. For example, studies suggest that perilesional areas may be recruited in aphasia recovery (Warburton et al., 1999). If, as Kim et al. (1997) suggested, L1 and L2 in late bilinguals (who likely speak L2 with relatively low proficiency) are in the same gross neuroanatomic region but nonoverlapping, then one language may be associated with the perilesional areas, areas that surround the injured area, in certain instances of brain injury (i.e., one language might be more preserved). Consequently, relying on these perilesional areas (and the less disrupted language) in rehabilitation of these individuals might be more productive than rehabilitation of their early bilingual or even monolingual counterparts whose injury might have caused disruption of all language(s) they speak. It is important to note that although some ideas have been proposed (Green and Price, 2001), little evidence exists to support one rehabilitation strategy over another in bilingual aphasia.
Types of Bilingual Aphasias and Patterns of Recovery
Different types of bilingual aphasia, as well as different patterns of recovery, have been reported, involving not only speaking and understanding speech but also reading and writing (Streifler and Hofman, 1976). In addition to cases in which the two or more languages are equally impaired, it has been reported that some individuals showed selective aphasia, in which signs of aphasia were evident in one language but not the other (Paradis and Goldblum, 1989). Differential aphasia has also been reported, in which different types of aphasia were shown in different languages (Albert and Obler, 1978; Silverberg and Gordon, 1979) – for example, conduction aphasia in one language and global aphasia in another. In addition, some individuals showed involuntary blending of grammatical elements (e.g., syntactic and morphologic units) of two languages (Gloning and Gloning, 1965; Perecman, 1984) – for example, combining syllables of two languages, thus creating a new word (Paradis, 1998). This is different from ‘code switching,’ which involves the alternating use of two or more languages in the same conversation (Milroy and Muysken, 1995). Code switching can function to convey emotional content, to emphasize or clarify the references being made, and to quote (De Fina, 1989), and it is considered to be an important aspect of normal bilingual discourse in many communities (Heller, 1995). Patterns of code switching were also found to be different between bilingual aphasics and normal individuals (De Santi et al., 1995; Muñoz et al., 1999).
It has been suggested that the degree and type of linguistic impairments in bilingual aphasics may be specific to the structures of the language. For example, it has been found that although Mandarin–Cantonese bilinguals showed impairment in the production of lexical tones (pitch patterns used to contrast word meaning), a greater degree of deficit was found in Cantonese production, possibly because Cantonese contains six tonal contrasts, whereas Mandarin contains only four (Lim and Douglas, 2000). In Friulian–Italian bilingual aphasics, the most frequently made errors in Friulian but not Italian involved the omission of the second obligatory pronoun, which is a typical feature of Friulian but not Italian (Fabbro and Frau, 2001). In other words, a type of linguistic impairment may not be apparent in one language because it does not occur as often (or at all) in that language. This also reinforces the idea of assessing multiple languages in bilingual aphasic individuals because impairments in one language do not necessarily predict the same impairments in the other.
With regard to patterns of recovery, as well as improvements in both languages in terms of comparable rate and extent (parallel recovery), individuals show the following kinds of recovery: selective recovery, when only one language improves; successive recovery, when one language improves before the other language; or differential recovery, when one language improves more so than the other. Most interestingly, some individuals show antagonistic recovery, namely improvement in one language but deterioration in another (Paradis and Goldblum, 1989). Some even demonstrate alternating antagonism, in which the improvement–deterioration pattern of the two languages alternates (Paradis et al., 1982).
It has also been reported that some individuals showed paradoxical recovery, in which the patient recovered a ‘dead’ language – that is, a language the individual once had some knowledge of but had never used premorbidly for ordinary communicative purposes. For example, Grasset (1884) reported a case of a monolingual French-speaking Catholic woman who started to speak single Latin words and prayers (the language of the church) a few days following a left-hemisphere stroke but was unable to speak French. It is worth noting that it is not known what single factor influences the pattern of recovery (Paradis, 1998). For example, it is not always the case that the language spoken most proficiently premorbidly will be the language affected the most or the least by brain injury or the language that will be recovered first.
Bilingual Aphasia Assessment
When evaluating a bilingual aphasic individual, several important issues warrant special consideration. First, a ‘direct translation’ is not the same as cross-language equivalency. Different languages have different (nonoverlapping) grammatical structures and vocabulary that can potentially influence how thoughts are expressed; consequently, certain linguistic impairments may or may not manifest themselves depending on the language, as suggested previously in the Mandarin–Cantonese and Friulian–Italian bilingual cases. Furthermore, languages are used in different social and cultural contexts, resulting in context-dependent interpretations even for the same utterance. Second, because bilingual aphasics use the two or more languages in different social settings, and because the two or more languages can be affected and recovered differently, all languages the individuals speak premorbidly need to be assessed in order to gain a more complete picture of the aphasia. Third, in addition to any formal measures, a thorough case history detailing use and proficiency of each language needs to be taken because it can potentially affect the rehabilitation process.
Different formal/standardized test batteries are available for assessing aphasics who speak different languages. These include tests that were originally constructed in English but then translated into other languages with consideration of the appropriate linguistic and cultural contexts and/or normative data for the specific groups. For example, there is a Cantonese version of the Western Aphasia Battery (Yiu, 1992), a Spanish version of the Boston Naming Test (Taussig et al., 1992), and a Japanese version of the Communication Abilities in Daily Living (Sasanuma, 1991).
In addition, there are also tests designed for assessing bilingual individuals, including the Bilingual Aphasia Test developed by Paradis and colleagues for more than 65 languages and 170 specific language-pair combinations [e.g., an Urdu version (Paradis and Janjua, 1987) and a Bulgarian–French version (Paradis and Parcehian, 1991)] and the Multilingual Aphasia Examination in Chinese, French, German, Italian, Portuguese, and Spanish (Rey and Benton, 1991).
Rehabilitation
Traditional approaches employed in aphasia rehabilitation still apply to rehabilitating bilingual aphasic individuals, such as language stimulation approaches that emphasize individual linguistic units
and processes such as grammar and naming, as well as compensatory approaches that target the individual’s participation in vocational and social settings despite linguistic impairments. However, additional challenges exist when more than one language is present. For example, should rehabilitation focus on one or two languages? If one, which one? No one set of widely accepted guidelines exists for selecting one or all languages in aphasia rehabilitation, and evidence and arguments exist for either consideration (Bond, 1984; Chlenov, 1948; Linke, 1979; Wald, 1958). Similarly, it is still unclear whether skills acquired from the rehabilitation of one language can be transferred to another. Evidence suggests that skill transfer across affected languages may be optimal if the languages are closely related (e.g., Spanish and Italian) (Paradis, 1998). As stated previously, different individuals use their multiple languages in different social and vocational settings. In rehabilitation, the affected individual and her or his family should be counseled to consider the preponderating need of one language over another. For example, the social penalty of linguistic impairments in English may be greater for Spanish–English bilinguals whose immediate peers are English-speaking, even though Spanish might be the more proficient language.
Conclusion
Basic knowledge of how multiple languages are represented in the brain and what factors influence representation undoubtedly has bearing on the clinical process. Moreover, careful documentation of linguistic impairment characteristics and the course of recovery in the two languages can also inform us about how the brain is organized. With increasing interaction between individuals from diverse linguistic and cultural backgrounds, due to factors such as immigration, globalization, and state unionization, the number and proportion of individuals who know and use more than one language will most likely increase. The clinical population as well as clinical needs will likewise increase. Thus, a greater basic and clinical understanding of bilingualism and the brain is warranted.
Bibliography
Abutalebi J, Cappa F & Perani D (2001). ‘The bilingual brain as revealed by functional neuroimaging.’ Bilingualism: Language and Cognition 4(3), 179–190.
Albert M & Obler L (1978). The bilingual brain. New York: Academic Press.
Bond S (1984). Bilingualism and aphasia: word retrieval skills in a bilingual anomic aphasic. Unpublished master’s thesis, Denton: North Texas State University.
Chlenov L (1948). ‘Ob Afazii u Poliglotov.’ Izvestiia Akademii Pedagogicheskikh Nauk RSFSR 15, 783–790. [Translated version: Hervouet-Zieber T (1983). ‘On aphasia in polyglots.’ In Paradis M (ed.). 446–454.]
De Fina A (1989). ‘Code-switching: grammatical and functional explanations.’ Ressenga-Italiana-di-Linguistica 32, 107–140.
De Santi S, Obler L & Sabo-Abramson H (1995). ‘Discourse abilities and deficits in multilingual dementia.’ In Paradis M (ed.) Aspects of bilingual aphasia. San Diego: Singular. 224–235.
Fabbro F & Frau F (2001). ‘Manifestations of aphasia in Friulian.’ Journal of Neurolinguistics 14, 255–279.
Gloning I & Gloning K (1965). ‘Aphasien bei Polyglotten. Beitrag zur Dynamik des Sprachabbaus sowie zur Lokalisationsfrage dieser Störungen.’ Wiener Zeitschrift für Nervenheilkunde 22, 362–397. [Translated version: Greenwood A & Keller E (1983). ‘Aphasias in polyglots. Contribution to the dynamics of language disintegration as well as to the question of the localization of these impairments.’ In Paradis M (ed.). 681–716.]
Grasset J (1884). ‘Contribution clinique à l’étude des aphasies (cécité et surdité verbales).’ Montpellier Médical, January (Observation II), 33–34. [Translated version: Mitchell C (1983). ‘Clinical contribution to the study of aphasias.’ In Paradis M (ed.). 15.]
Green D & Price C (2001). ‘Functional imaging in the study of recovery patterns in the bilingual aphasia.’ Bilingualism: Language and Cognition 4(2), 191–201.
Grosjean F (1989). ‘Neurolinguists, beware! The bilingual is not two monolinguals in one person.’ Brain and Language 36, 3–15.
Heller M (1995). ‘Codeswitching and the politics of language.’ In Milroy L & Muysken P (eds.) One speaker, two languages. Cambridge: Cambridge University Press. 115–135.
Kim K, Relkin N & Lee K (1997). ‘Distinct cortical areas associated with native and second languages.’ Nature (London) 388, 171–174.
Lim V & Douglas J (2000). Impairment of lexical tone production in stroke patients with bilingual aphasia. Academy of Aphasia meeting at the School of Human Communication Sciences, Australia: La Trobe University.
Linke D (1979). ‘Zur Therapie polyglotter Aphasiker.’ In Peuser G (ed.) Studien zur Sprachtherapie. Munich: Wilhelm Fink Verlag.
Milroy L & Muysken P (1995). ‘Introduction: codeswitching and bilingualism research.’ In Milroy L & Muysken P (eds.) One speaker, two languages. Cambridge, UK: Cambridge University Press. 1–14.
Muñoz M, Marquardt T & Copeland G (1999). ‘A comparison of the codeswitching patterns in aphasic and neurologically normal bilingual speakers of English and Spanish.’ Brain and Language 66, 249–274.
Paradis M (ed.) (1983). Readings on aphasia in bilinguals and polyglots. Montreal: Didier.
Paradis M (1990). ‘Language lateralization in bilinguals: enough already!’ Brain and Language 39, 576–586.
Paradis M (1998). ‘Acquired aphasia in bilingual speakers.’ In Sarno M (ed.) Acquired aphasia, 3rd edn. New York: Academic Press. 531–549.
Paradis M & Goldblum M (1989). ‘Selective crossed aphasia followed by reciprocal antagonism in a trilingual patient.’ Brain and Language 36, 62–75.
Paradis M, Goldblum M & Abidi R (1982). ‘Alternate antagonism with paradoxical translation behavior in two bilingual aphasic patients.’ Brain and Language 15, 55–69.
Paradis M & Janjua N (1987). Bilingual Aphasia Test (Urdu version). Hillsdale, NJ: Lawrence Erlbaum.
Paradis M & Parcehian P (1991). Bilingual Aphasia Test (Bulgarian–French version). Hillsdale, NJ: Lawrence Erlbaum.
Perani D, Paulesu E, Galles N S et al. (1998). ‘The bilingual brain. Proficiency and age of acquisition of the second language.’ Brain 121(10), 1841–1852.
Perecman E (1984). ‘Spontaneous translation and language mixing in a polyglot aphasic.’ Brain and Language 23, 43–63.
Rey G & Benton A (1991). Examen de afasia multilingue: manual de instrucciones. Iowa City, IA: AJA Associates.
Sasanuma S (1991). ‘Aphasia rehabilitation in Japan.’ In Sarno M & Woods D (eds.) Aphasia rehabilitation: views from the Asian-Pacific region. San Diego: Academic Press.
Silverberg R & Gordon H (1979). ‘Different aphasia in two bilingual individuals.’ Neurology 29, 51–55.
Streifler M & Hofman S (1976). ‘Sinistrad mirror writing and reading after brain concussion in a by-systemic (oriento-occidental) polyglot.’ Cortex 12, 356–364.
Taussig I, Henderson V & Mack W (1988). Spanish translation and validation of a neuropsychological battery: performance of Spanish- and English-speaking Alzheimer’s disease patients and normal comparison subjects. Paper presented at the meeting of the Gerontological Society of America, San Francisco.
Wald I (1968). Problema afazii poliglotov. Voprosy Kliniki I Patofiziologii Afazii. 140–176.
Warburton E, Price C & Swinburn K (1999). ‘Mechanisms of recovery from aphasia: evidence from positron emission tomography studies.’ Journal of Neurology, Neurosurgery, and Psychiatry 66, 155–161.
Wong P C M, Parsons L M, Martinez M & Diehl R L (2004). ‘The role of the insula cortex in pitch pattern perception: the effect of linguistic contexts.’ Journal of Neuroscience 24, 9153–9160.
Yiu E M-L (1992). ‘Linguistic assessment of Chinese-speaking aphasics: development of a Cantonese aphasia battery.’ Journal of Neurolinguistics 7, 379–424.
Bilingualism and Second Language Learning
T K Bhatia, Syracuse University, Syracuse, NY, USA
© 2006 Elsevier Ltd. All rights reserved.
Introduction
There is a widespread perception in monolingual societies, particularly in the United States, that bilingualism is a rare and exceptional occurrence in communication. By contrast, from a global perspective, bilingualism is a world-wide phenomenon. In fact, global communication is often carried out through a speaker’s second, third, or even fourth language. According to David Crystal (1997), approximately two-thirds of the world’s children grow up in a bilingual environment, which, in turn, leads to adult bilingualism/multilingualism. However, childhood bilingualism is not the only reason for adult bilingualism. A host of different factors (such as marriage, religion, education, linguistic plurality of a particular region, migration, jobs, government policies, urbanization, etc.) also lead to adult bilingualism. How, then, do humans become bilingual? Is adult second-language learning different from child-language learning? Is bilingual-language acquisition different from monolingual-language acquisition? Is early bilingualism different from late bilingualism? Does second language learning have adverse cognitive effects on children? And how are two (or more) languages represented in the brain? This chapter attempts to answer these and other questions concerning bilingual language learning and use.
Key Concepts
Before discussing language development among bilinguals, it is crucial to give an overview of key fundamental concepts concerning language development in children and adults. Also, it should be mentioned that the term ‘second language learning’ is used in a wider sense to include the learning of any additional language during a period ranging from childhood to adulthood. An additional language may be a language of the country or spoken outside the country (i.e., a foreign language).
Acquisition vs. Learning
A child’s process of learning languages is different from an adult’s process. A child can learn any language relatively effortlessly, while the same task becomes rather challenging for adults. For this reason, some second language researchers (Krashen,
1985) distinguish between two types of mechanisms in language development: a subconscious process resulting in tacit knowledge of the language (i.e., ‘language acquisition’), and a more conscious process (i.e., ‘language learning’). While children go through the former process, adults undergo the latter in their quest to become bilingual.
The Critical Period Hypothesis and Its Biological Basis
In addition to degree of effort, it has been frequently observed that even very proficient bilinguals fall short of being perfect bilinguals. In spite of their complete mastery of syntax, their speech is marked by traces of a first-language accent. Similarly, it has been shown that, in spite of considerable effort and motivation, adults seldom achieve ultimate attainment of some grammatical structures. To explain these and other differences in language acquisition and recovery from aphasia, Lenneberg (1967) proposed the ‘critical period hypothesis,’ which is sensitive to age. This hypothesis claims that there is a period in the maturation of the human organism, lasting from two years to puberty, in which nearly effortless and complete language acquisition is possible. Afterwards, this hypothesis notes, language learning requires more effort and motivation, largely because of a loss of brain plasticity resulting in the completion of the lateralization of the language function in the left hemisphere. Recent research has additionally claimed that there are different critical periods for different grammatical structures of language. Since the accent (phonetics and phonology) of a second language is the most difficult to attain, the critical period for phonetics and phonology (approximately from five to seven years) is earlier than that for morphology and syntax. See Johnson and Newport (1991) and Bhatia and Ritchie (1999) for details.
Access to Universal Grammar (UG)
Children are born to acquire human languages. Regardless of gender, race, ethnicity, or nationality, every normal child is capable at birth of acquiring any human language. In theoretical studies following from the Chomskyan mentalistic framework, this innate ability is termed the access to universal grammar (UG). In this case, a child has full access to universal grammar, whereas an adult has either limited or no access. These and other universal principles of grammatical structures and principles of learning largely lead a child’s language development. The role of parental input then becomes to trigger an appropriate value for innately given or set parameters, specific to the language to which the child is exposed. One such parameter, called the ‘head parameter,’ describes
how a child does not even have to learn the specific word order of his/her language, but only has to choose between already specified values – head-initial or head-final – based on the nature of the input language. Children begin to learn to set parametric values even from the one-word stage. A Japanese child learns to choose the head-final system, whereas an English-speaking child chooses the head-initial value. These principles are generally referred to as a child’s language acquisition device (LAD).
Input and Learning Environment: Natural vs. Unnatural Settings
Usually children become bilingual or multilingual in a natural way. A normal child can become a fluent bilingual by the age of five, for instance, without any formal training. In the process of acquiring a language, the role of input (motherese, etc.) or imitation is important but limited. Children do not learn a language by mindlessly imitating the input provided by mothers or caretakers. That is, while the role of parental input cannot be ruled out, language acquisition studies show that neither motherese nor imitation plays a significant role in a child’s language development. Instead, this burden is carried by the child himself/herself. Research on child-language acquisition reveals that the child learns the language by using the ‘rule formulation strategy.’ For instance, an English-speaking child learns on his/her own that adding the inflection ‘-ed’ to a verbal stem generates the corresponding past tense form of the verb. In this process, the child over-generalizes and produces utterances such as ‘I go-ed’ [go-PAST]. Even after being corrected (i.e., provided negative evidence) by the mother or caretaker that the child meant ‘I went’ [go.PAST], the child still does not reject the rule s/he has formulated in his or her mind, and still applies it in utterances such as ‘I went-ed’ [go.PAST-PAST]. The role of the adult is thus to prevent the child’s grammar from overgeneralization. In other words, the child has an innate capacity to acquire languages in an environment which is termed a ‘natural’ environment, whereas, by contrast, adults and school-age children learn language in formal settings such as schools and colleges through a formal instructional method.
Defining and Measuring Bilingualism
What is bilingualism and who is bilingual? Defining and measuring bilingualism is a very complex task due to the number and types of input conditions, biological, socio-psychological, and other nonlinguistic factors that can lead to a varying degree
of bilingual competencies. In short, there is no widely accepted definition or measure of bilingualism. Instead, a rich range of scales, dichotomies, and categories are employed to characterize bilinguals. If a bilingual can understand but cannot speak a second language, such an individual is called a receptive bilingual, whereas a productive bilingual demonstrates a spoken proficiency in two languages. If the second language is acquired in a natural setting before the age of five, that individual is termed an early bilingual, in contrast with a late bilingual, who learns his or her second language after the age of five either at home or at school. Labels such as fluent vs. non-fluent, functional vs. non-functional, balanced vs. unbalanced, primary vs. secondary, and partial vs. complete refer either to a varying command of different types of language proficiency (e.g., speaking, listening, writing, etc.) or to an asymmetrical relationship (dominance) between two languages. The compound vs. coordinate distinction refers to the way two languages are processed in the brain. The list is by no means exhaustive. Other major distinctions such as simultaneous vs. sequential are discussed in the next section. Similarly, bilingualism can be viewed from individual, societal (attitudes towards bilingualism), and political (i.e., government policies toward bilingualism) perspectives. In general, a bilingual person demonstrates many complex attributes rarely seen in a monolingual person. For that reason, a bilingual is not equivalent to two monolinguals, but something entirely different. This working definition of bilingualism is offered by Bloomfield (1933), who claimed that a bilingual is one who has a native-like control of two languages, i.e., a balanced bilingual (see Grosjean, 1982 or Edwards, 2004 for more details).
Patterns and Mechanisms in Bilingual Language Development
Providing a natural environment or input in monolingual or dominant-language speech communities is not a challenging task. The same is true of societies whose social and political systems are conducive to bilingualism. For instance, in India, where bilingualism is viewed as natural, approved by society, and further nurtured by government language policies, linguistic groups and communities do not need to take any special measures to ensure that their children receive input from two languages. In sharp contrast, in societies where bilingualism is not valued or where the language of a minority is distinct, it becomes imperative for families to plan meaningful strategies to ensure smooth exposure to the family language. One such strategy that families employ in this second setting, described by Bhatia and Ritchie (1999) as “discourse allocation,” restricts the use of one language to one social agent or social setting and the other language to other social situations. The various manifestations of such strategies are the following: (a) one-parent/one-language (e.g., the child’s mother speaks one language and the child’s father speaks the other), a strategy employed by Leopold (1939–1949) in his classic study of the bilingual language development of his daughter, Hildegard; (b) one-place/one-language (e.g., speaking one language in the kitchen and the other elsewhere); (c) a language/time approach; and (d) a topic-related approach. Although the discourse allocation approach is better than providing no input and thus raising a monolingual child, it leads to patterns of bilingual language development that differ from those found in a natural setting. For instance, during the early stages of her bilingualism, Hildegard developed a rule that fathers speak German and mothers speak English.
Childhood Bilingualism
Other factors, such as age and amount of exposure to the two languages, also result in differences in the pattern of childhood bilingualism. The distinction between simultaneous and sequential bilinguals in research on bilingual language acquisition is based on age and the degree of exposure to two languages. When the child is exposed to two languages to more or less the same degree from birth onward, the pattern of language development is referred to as simultaneous, whereas sequential bilingualism describes the attainment of one language first and the second language later, preferably before the age of seven. Similarly, the term late bilingual is used for those sequential bilinguals who acquire their second language at a relatively younger age than adults learning a second language. Although researchers unanimously accept the validity of the distinction between simultaneous and sequential bilinguals, there is no consensus about the exact line of demarcation between the two. See McLaughlin (1984) and De Houwer (1995) for the theoretical and methodological grounds. One of the most intriguing aspects of childhood bilingualism is how children learn to separate the two languages in the initial stages, particularly in a natural setting (i.e., as simultaneous bilinguals). After all, when parents provide input, they do not tag or prime their input with a language identification label. Even if parents go to the absurd length of identifying the language of each word or sentence they use, these labels are semantically empty for children. Furthermore, bilingual parents unwittingly make the task of separating the two languages even harder for children because of their normal tendency to mix two languages. In short, a child is provided with three distinct types of linguistic input: two languages, each in an unmixed or pure form, and a mixture of the two. Given this state of affairs, how does the child learn to separate the two languages in question? This task does not arise for a monolingual child because only one language serves as a source of input. Two hypotheses attempt to shed light on this question: the unitary system hypothesis and the dual system hypothesis. According to the unitary system hypothesis (Volterra and Taeschner, 1978), the child undergoes three stages before s/he is able to separate the two input languages. During the first two stages, the child experiences confusion. During the first stage, s/he is unable to distinguish the two lexicons and grammars of the linguistic systems. At this stage, the child has a single lexicon made up of items drawn from the lexicons of both languages. Hence, no translational equivalents or synonyms are found in the child’s vocabulary. Volterra and Taeschner claim that their two bilingual subjects, at the ages of 1 year 10 months and 1 year 6 months, had a hybrid list of 137 words with no translational equivalents. During the second stage, the child slowly learns to separate the two lexicons, but is still unable to separate the grammatical systems. Cross-linguistic synonyms emerge, but the child applies the same set of syntactic rules to both languages. It is only during the third stage that the child becomes capable of separating the two sets of vocabularies and grammars.
Recent research shows that the unitary system hypothesis does not withstand scrutiny: the evidence motivating the three stages of bilingual language development is full of shortcomings and contradictions on both methodological and empirical grounds. The dual system hypothesis states that bilingual children, based on their access to Universal Grammar and language-specific parameter setting, have the capacity to separate the two grammars and lexical systems right from the beginning. A wide variety of cross-linguistic studies (e.g., studies of different input conditions, such as one-parent/one-language and mixed input, and of different word order types) lends support to this hypothesis. For instance, a study devoted to the language development of a Hindi-English bilingual child makes it clear that at age 2 the child is capable of developing two distinct lexicons using a syllabification strategy. At the age of 1 year 7 months, two different word orders develop: SVO [subject-verb-object] for English and SOV for Hindi.
For a more detailed treatment of the shortcomings of the unitary system hypothesis and the strengths of the dual system hypothesis, see Bhatia and Ritchie (1999: 591–614).
Another fascinating feature of bilingual speech is that bilinguals are not only capable of keeping the two linguistic systems separate but also often mix them, either within a sentence or between sentences. This behavior is often termed ‘code-mixing’ or ‘code-switching’ in the sociolinguistic literature. Depending upon the theoretical and empirical objectives of their research, some researchers do not distinguish between the two terms and use them interchangeably; for those who do distinguish them, code-mixing refers to intra-sentential mixing, while code-switching refers to inter-sentential mixing. Both bilingual children and adults show this behavior. What explains language mixing? Earlier research attempted to explain it in terms of the language deficiency hypothesis: it was claimed that bilinguals in general, and children in particular, have language gaps. As claimed by the unitary system hypothesis, the lack of synonyms compels them to mix the two lexical systems during stage I. Similarly, stage II yields the mixing of the two language systems due to confusion. In other words, the lack of proficiency in one language (i.e., the absence of balanced bilingualism) or in both languages (i.e., semi-bilingualism) leads to mixing. The language augmentation hypothesis offers deeper insights into bilingual mixing behavior. As shown earlier in the discussion of the dual system hypothesis, children do not go through initial stages of treating the two linguistic systems as if they were one system, but begin to distinguish them immediately. Considerations of optimization lead bilinguals to mix languages in order to get maximum mileage from the two linguistic systems at their disposal.
An analogy drawn from the beverage industry further explains this point. Keeping juices separate (e.g., apple vs. orange juice) yields two distinct tastes. However, if one mixes the two juices, the result is a new taste, distinct from the two pure juices. The same is true of bilingual language mixing. Research on the linguistic and sociolinguistic motivations for language mixing in both children and adults shows that such considerations as semantic domains and semantic complexity (an item being less complex or more salient in one language), stylistic effects, clarification, elaboration, relief strategy (i.e., a linguistic item is temporarily unavailable in one language), interlocutor’s identification, discourse strategies of participants/topics, addressee’s perceived linguistic capability and speaker’s own linguistic ability, and other complex socio-psychological reasons, such as attitudes, societal values, and personality, prompt bilinguals to mix two languages. The list of motivations is by no means exhaustive (see Bhatia and Ritchie, 1996, for more details).
Adult Bilingualism: Second Language Learning
In contrast to sequential childhood bilinguals, adults who learn a second language after they have learned their mother tongue experience the learning of a second language as a laborious and conscious task. As pointed out earlier, unlike children, who universally and uniformly acquire native competence in their mother tongue, adults rarely achieve native-like competence in their second language. Depending on their motivation and effort, adults can learn a second language with varying degrees of competence. However, there comes a point in second language learning at which even the most talented learner reaches the stage of ‘fossilization.’ This stage is marked by second language errors that no amount of training can correct. For these reasons, second language (L2) learning is viewed as fundamentally different from first language (L1) acquisition. The hypothesis that aims to account for these differences between child and adult language learning is termed the fundamental difference hypothesis. In spite of the asymmetrical relation between L1 and L2 learning, one should not conclude that the two have nothing in common. Like L1 learners, L2 learners undergo stages of development in the process of grammar construction: the intermediate stages of grammar development between the initial stage and the ultimate stage are termed interlanguage grammars. Take the case of the development of negation in English L1 and L2 learners. The grammar of negation in L2 learners of English shows the same stages of development as in L1 English learners. Stage I: sentence-initial placement of negation; Stage II: preverbal placement of negation with no auxiliary verb; and Stage III: preverbal placement of negation with an appropriate auxiliary verb.
Native Language Influence and Dominance
An important way in which L2 learning differs from L1 learning is the influence of the mother tongue on second-language learning. The mother tongue, or L1, plays an important role in the process of L2 acquisition. Research on the grammatical errors of L2 learners shows that they transfer the grammatical rules of L1 (phonetic, phonological, morphological, and syntactic rules) to their second language. An English-speaking learner of Hindi has difficulty hearing and producing the four-way Hindi contrast in aspiration and voicing (i.e., unvoiced unaspirates, unvoiced aspirates, voiced unaspirates, and voiced aspirates). It would, however, be a gross simplification to claim that L2 learners transfer all grammatical features of L1 to L2. Adult learners possess a relatively higher level of logical and cognitive ability than do children; therefore, these qualities color their second language learning. For instance, English-speaking learners of Hindi will not translate there in the following two sentences in an identical way (i.e., by choosing the remote locative adverb in both cases):
1. There is a chair in the room.
2. The chair is over there.
Similarly, it would be an oversimplification to claim that childhood bilingualism is free from the dominance relationship between the two languages. Not only does the mother tongue influence second language acquisition in children, it also affects their school achievement.
Approaches to Second Language Learning
In adult language acquisition research, the term second language is used in a wider sense to cover the acquisition of any additional language, whether or not that language is foreign to the country in which it is learned. However, in the context of language teaching the distinction between second and foreign languages is made to highlight major differences in the learning aims, teaching methods, and the achievement levels to be attained. A number of approaches have been developed to facilitate the learning of second/foreign languages. The following are notable:
1. Grammar-translation method: Following the tradition of teaching classical languages such as Greek, Latin, and Sanskrit, this method places emphasis on memorization and rote learning. Learners memorize nominal and verbal paradigms of the second language and translate L1 into L2 or vice versa. Very little emphasis is placed on developing spoken proficiency in the foreign language, while reading and written comprehension receive overwhelming importance. This is perhaps the oldest method of language teaching, dating back to the 19th century.
2. The direct method: Also known as the oral or natural method, it departs from the grammar-translation method in three important respects: one, memorization takes a back seat in the learning of the second language; two, special emphasis is placed on acquiring spoken and listening competencies; and three, the introduction of the target language is free from any reference to the native language of learners. The native language is never used as a tool to explain either grammar or other intricacies of target language usage. This model attempts to simulate the native speaker environment of the target language. However, in actual practice there are severe constraints on replicating the natural setting of the native speaker’s learning environment in a classroom.
3. The audio-lingual method: This method is a byproduct of World War II, during which the United States experienced an urgent need to train its troops quickly in foreign languages for overseas military operations. An emphasis is placed on spoken and listening competencies, rather than on written ones.
4. The structural method: In order to speed up the acquisition of foreign languages, insights of structural linguistics were applied to language teaching. This method exposes learners to different structural patterns and transformation drills.
Audio-lingual and structural models assume that L2 is acquired through imitation. The discussion in the key concept section shows the limitation of this model. A number of other methods, such as the natural approach and ‘suggestopedia,’ have been proposed, but the fact remains that no single method fully captures the complexity of learning a second language.
Bilingual Education: Additive vs. Subtractive Bilingualism
Teaching children a school language, particularly if the school language is different from the child’s home language, is one of the major challenges for bilingual education programs. Bilingual education programs in the United States are aimed at minority students learning English. Such programs have attracted a great deal of controversy over their merits and outcomes. While there is rapid growth of bilingual education programs in the United States, the aim of such programs is not always to introduce additive bilingualism, which ensures the maintenance of the child’s mother tongue while the child learns the school/dominant language. A large number of bilingual education programs in the United States aim at subtractive bilingualism. In other words, while they offer children a transition to learning the school/majority language, in that process they do not ensure the maintenance of the child’s mother tongue. In contrast, the language policies of bilingual nations such as India, Canada, and Switzerland are very conducive to the promotion of language rights for minority languages. The government of India, for instance, favors the advancement of linguistic diversity and pluralism through the Three Language Formula, which calls for trilingualism in education. In addition to learning two national languages, Hindi and English, students are expected to learn a third language beyond their native tongue. For example, in northern India, students are expected to learn one of the four Dravidian languages (Tamil, Telugu, Kannada, and Malayalam) from southern India. While bi- or multilingual education programs like India’s do not view bilingualism in general, and the maintenance of minority languages in particular, as a threat to national integration, this is not the case with bilingual education in the United States. U.S. educational policies are not conducive to linguistic and cultural diversity. A notable feature of Canadian bilingual education is the language immersion program. Introduced in the 1960s in Quebec at the request of the English-speaking minority, the program aimed to give their children a high level of proficiency in French, the dominant language of the region. Children were immersed at school in their second language (i.e., French): they used their mother tongue to communicate with a bilingual teacher, who would reply in French. This process leads children from what Cummins (1981) calls basic interpersonal communication skills (BICS) proficiency to cognitive-academic language proficiency (CALP) in the school language. BICS refers to the proficiency level of students with restricted vocabulary and simpler syntax, whereas CALP requires a type of proficiency suitable for academic pursuits: a developed vocabulary and sufficiently complex syntax suited to abstract and analytical thinking. The success of the Canadian language immersion model continues to generate enthusiasm and controversy in bilingual education in the United States.
Socio-Psychological Factors
Successful language learning depends not only on teaching methods but also on learners’ motivation, intelligence, opportunities, and other factors, such as their attitude toward the target language and culture. With respect to motivation and attitude, there are two types of learners: instrumental and integrative. Instrumental learners, who learn a language for the purpose of gaining external rewards (monetary gains, good jobs, etc.), tend to be less successful than integrative learners, who have a positive attitude toward the culture of the target language. Psychological factors such as the affective filter (Krashen, 1985) either inhibit or promote the learning of a second language: negative influences such as anxiety, lack of self-confidence, and inadequate motivation can create serious obstacles to successful language learning. Because of a lack of self-esteem and a higher level of performance anxiety, minority children tend to raise the affective filter, which reduces comprehensible input and consequently takes a toll on their progress in language acquisition. Similarly, since adults show more self-consciousness than children, they put themselves at a disadvantage in language acquisition.
Effects of Bilingualism
Does bilingualism have an adverse linguistic and cognitive effect, particularly on children? Earlier research in the United States pointed out that exposing children to more than one language during their childhood leads them to semi-bilingualism and confusion. Crowding their brain with two or more languages, this research suggested, not only leads children to linguistic deficiency, both in competence and performance levels (semi-lingualism, stuttering, etc.), but also to a wide variety of cognitive and psychological impairments such as low intelligence, mental retardation, left-handedness, and even schizophrenia. Research by Peal and Lambert (1962), however, put to rest such a negative view of bilingualism: their findings and the work of succeeding researchers provide ample evidence that these negative conclusions of earlier research were premature, misguided (biased toward immigrant communities), and unnecessarily pessimistic. Solid on methodological grounds, Peal and Lambert’s study revealed a positive view of bilingualism, including the conclusion that bilingual children demonstrate more cognitive flexibility than monolinguals. Contrary to previous studies, bilinguals performed better than monolinguals in both verbal and non-verbal measures. The study, which was conducted in Montreal, was revolutionary in its own right, changing the face of research on bilingualism forever (see Hakuta, 1986: Chap. 2 for details). This study has been replicated in a number of countries, confirming the positive effects of bilingualism.
Conclusions
A number of diverse and complex conditions and factors lead to life-long bilingualism. These factors (biological, social, psychological, and linguistic) account for the varied patterns witnessed among bilinguals around the world. Thus, a bilingual is not two monolinguals in one brain, nor are two bilinguals clones of each other. These complexities indicate why no theory of language learning and/or teaching is capable of explaining bilingual verbal behavior and the mechanisms leading to bilingual language development.
See also: Bilingualism; Bilingual Education; Bilingual Language Development: Early Years; Code Switching and Mixing; Foreign Language Teaching Policy; Interlanguage; Second and Foreign Language Learning and Teaching; Second Language Acquisition: Phonology, Morphology, Syntax.
Bibliography
Bhatia T & Ritchie W (1996). ‘Bilingual language mixing, Universal Grammar, and second language acquisition.’ In Ritchie W C & Bhatia T K (eds.) Handbook of second language acquisition. San Diego, CA: Academic Press. 627–688.
Bhatia T & Ritchie W (1999). ‘The bilingual child: Some issues and perspectives.’ In Ritchie W C & Bhatia T K (eds.) Handbook of child language acquisition. San Diego, CA: Academic Press. 569–643.
Bloomfield L (1933). Language. New York: Holt.
Crystal D (1997). English as a global language. Cambridge: Cambridge University Press.
Cummins J (1981). Schooling and minority language students: a theoretical framework. Los Angeles: California State University.
De Houwer A (1995). ‘Bilingual language acquisition.’ In Fletcher P & MacWhinney B (eds.) Handbook of child language. Oxford: Basil Blackwell Ltd. 219–250.
Edwards J (2004). ‘Foundations of bilingualism.’ In Bhatia T & Ritchie W (eds.) Handbook of bilingualism. Oxford: Blackwell Publishing. 7–31.
Grosjean F (1982). Life with two languages. Cambridge, MA: Harvard University Press.
Hakuta K (1986). Mirror of language. New York: Basic Books, Inc.
Johnson J & Newport E (1991). ‘Critical period effects on universal properties of language: The status of subjacency in the acquisition of a second language.’ Cognition 39, 215–258.
Krashen S (1985). The input hypothesis: issues and implications. London: Longman.
Lenneberg E (1967). Biological foundations of language. New York: Wiley Press.
Leopold W (1939–1949). Speech development of a bilingual child: A linguist’s record (4 vols). Evanston, IL: Northwestern University Press.
McLaughlin B (1984). ‘Early bilingualism: methodological and theoretical issues.’ In Paradis M & Lebrun Y (eds.) Early bilingualism and child development. Lisse, The Netherlands: Swets and Zeitlinger. 19–45.
Peal E & Lambert W E (1962). ‘Relation of bilingualism to intelligence.’ Psychological Monographs 76, 1–23.
Volterra V & Taeschner T (1978). ‘The acquisition and development of language by bilingual children.’ Journal of Child Language 5, 311–326.
Binbinka
See: Wambaya.
Binding Theory
A Asudeh, Carleton University, Ottawa, Canada
M Dalrymple, Oxford University, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.
What Is Binding? Binding theory concerns syntactic restrictions on nominal reference. It particularly focuses on the possible coreference relationships between a pronoun and its antecedent (the nominal that a nondeictic pronoun depends on for its reference). For instance, in (1a) himself must refer to the same individual as he. In contrast, in (1b) her cannot refer to the same individual as she. Instead, the sentence must mean that some person voted for some other person.
(1a) He voted for himself.
(1b) She voted for her.
Pronouns like himself or ourselves, which must corefer with some other noun phrase in the sentence, are called reflexive pronouns or reflexives. Pronouns like she, her, and us are called nonreflexive pronouns. Two nominal expressions that corefer, or refer to the same individual or individuals, are annotated by identical subscripts; if two nominals do not corefer, they are annotated with different subscripts:
(2a) He_i voted for himself_i.
(2b) She_i voted for her_j.
In an example like He_i voted for himself_i, we say that the reflexive pronoun himself is bound by he, and that he is the binder of himself. Reciprocals like each other and one another must also be bound by a local antecedent and are grouped in binding-theoretic terms with reflexives:
(3a) They_i voted for each other_i.
(3b) *I_i voted for each other_j.
Reflexives and reciprocals are together called anaphors. Some major works on binding are Faltz (1977), Wasow (1979), Chomsky (1981, 1986), Reinhart (1983), Dalrymple (1993), Reinhart and Reuland (1993), and Pollard and Sag (1994). Huang (2000) contains a rich cross-linguistic survey of pronominal systems. Büring (2004) provides a recent comprehensive overview of the syntax and semantics of binding and presents a new synthesis.
Binding Conditions
Binding theory is typically stated in terms of conditions that refer to three key aspects: the class of nominal involved, the syntactic region that constitutes the domain of binding, and a structural condition on the syntactic relation between a nominal and its potential binder.
Classes of Nominals
For the purposes of binding theory, nominals are traditionally partitioned into several classes, as shown here:
(4) [Diagram not reproduced: nominals divide into pronouns and nonpronouns; pronouns divide into anaphors (reflexives and reciprocals) and pronominals.]
The first major division is between pronouns and nonpronouns. Pronouns are then further subdivided into reflexives and reciprocals, which are collectively referred to as ‘anaphors,’ and nonreflexive pronouns, often simply called ‘pronominals’ or ‘pronouns’ (in opposition to anaphors). We will here refer to nonreflexive pronouns as ‘pronominals,’ reserving the term ‘pronoun’ for the class that includes anaphors and other pronouns. This yields three classes for the purposes of binding theory: anaphors, pronominals, and nonpronouns. Each class is governed by its own binding condition.
Binding Domains
Traditional definitions of binding domains distinguish local from nonlocal domains. Consider the following sentence:
(5) Bill_i said that [Gonzo_j voted for himself_{*i,j}]
The reflexive himself must be bound in its local domain, here the subordinate clause Gonzo voted for himself. The only appropriate binder in this domain is Gonzo. The reflexive cannot be bound by the higher subject Bill, which is outside the reflexive’s local domain. This is indicated by placing the marker of ungrammaticality (*) beside the illicit index. A pronominal in the same position must not be bound in its local domain:
(6) Bill_i said that [Gonzo_j voted for him_{i,*j}]
The local domain for the pronominal is also the subordinate clause, and it cannot be bound in this domain. It can, however, be bound by the matrix subject, which lies outside the local domain.
Command
Besides a syntactic domain condition, binding involves the requirement that the binding nominal be in a structurally dominant position. This required relation between a pronoun and its binder is called ‘command’ and is defined in different ways in different theories. The structural condition on binding means that certain elements cannot be binders, even if they fall within the correct syntactic domain:
(7) Gonzo_i’s friend_j voted for himself_{*i,j}.
The entire subject Gonzo’s friend can bind the reflexive, but the possessor Gonzo cannot, because the possessor does not command the reflexive. We have thus far seen that anaphors must be bound within some local domain and that pronominals cannot be bound within some local domain. Nonpronouns cannot be bound in any domain, whether local or nonlocal:
(8a) *He_i voted for Bill_i.
(8b) *He_i said that Gonzo voted for Bill_i.
(8c) When he_i voted for George, Gonzo_i was drunk.
In (8a) and (8b), the pronoun is in the proper structural relation to command the name. Since this results in the nonpronoun being bound, the sentences are ungrammatical on the indexation indicated. In (8c), by contrast, the pronoun is not in the proper structural relation to command the name, because the pronoun is too deeply embedded. Although the pronoun and the name corefer, as indicated by the coindexation, there is no binding relation, and the sentence is grammatical. Bringing these ideas together, a typical statement of binding conditions is as follows (based on Chomsky, 1981):
A. An anaphor (reflexive or reciprocal) must be bound in its local domain.
B. A pronominal (nonreflexive pronoun) must not be bound in its local domain.
C. A nonpronoun must not be bound.
Following Chomsky (1981), these binding principles are often referred to as Principle A, the condition on anaphors; Principle B, the condition on pronominals; and Principle C, the condition on nonpronouns. Principles A, B, and C are also called Conditions A, B, and C.
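The three conditions can be made concrete with a small Python sketch. It rests on simplifying assumptions that the conditions themselves leave open: the command relation is taken to be c-command in the sense of Reinhart (1983), under which A c-commands B if A does not dominate B and the first branching node dominating A also dominates B; the local domain is taken to be the smallest clause (a node labeled S) containing the nominal; and the `Node` class, the function names, and the toy tree are hypothetical illustrations rather than part of any established formalism.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # identity-based equality: distinct nodes never compare equal
class Node:
    label: str                                   # e.g. "S", "NP", "VP"
    children: List["Node"] = field(default_factory=list)
    kind: Optional[str] = None                   # "anaphor", "pronominal", "nonpronoun"
    index: Optional[str] = None                  # referential index such as "i"
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def ancestors(n: Node) -> Iterator[Node]:
    """Yield the nodes dominating n, nearest first."""
    while n.parent is not None:
        n = n.parent
        yield n


def walk(n: Node) -> Iterator[Node]:
    yield n
    for child in n.children:
        yield from walk(child)


def dominates(a: Node, b: Node) -> bool:
    return any(x is a for x in ancestors(b))


def c_commands(a: Node, b: Node) -> bool:
    """Assumed command relation: first-branching-node c-command."""
    if a is b or dominates(a, b):
        return False
    for anc in ancestors(a):
        if len(anc.children) > 1:                # first branching node above a
            return dominates(anc, b)
    return False


def binders(root: Node, n: Node) -> List[Node]:
    """Nominals that are coindexed with n and command it."""
    return [m for m in walk(root)
            if m.kind is not None and m.index == n.index and c_commands(m, n)]


def satisfies_binding(root: Node, n: Node) -> bool:
    """Check Principle A, B, or C for n, depending on its class."""
    all_binders = binders(root, n)
    # Assumption: the local domain is the nearest node labeled "S".
    domain = next((x for x in ancestors(n) if x.label == "S"), None)
    local = [b for b in all_binders if domain is not None and dominates(domain, b)]
    if n.kind == "anaphor":
        return bool(local)                       # Principle A
    if n.kind == "pronominal":
        return not local                         # Principle B
    return not all_binders                       # Principle C


def sentence(obj_kind: str, obj_index: str):
    """Toy tree for 'Bill_i said that [Gonzo_j voted for OBJ]'."""
    obj = Node("NP", kind=obj_kind, index=obj_index)
    inner = Node("S", [Node("NP", kind="nonpronoun", index="j"),
                       Node("VP", [Node("V"), Node("PP", [Node("P"), obj])])])
    root = Node("S", [Node("NP", kind="nonpronoun", index="i"),
                      Node("VP", [Node("V"), inner])])
    return root, obj


for kind in ("anaphor", "pronominal"):
    for idx in ("i", "j"):
        root, obj = sentence(kind, idx)
        print(kind, idx, satisfies_binding(root, obj))
```

On this toy tree the check accepts the reflexive only with index j and the pronominal only with index i, matching the judgments in (5) and (6). Swapping in a different command relation or a different notion of local domain only requires replacing `c_commands` or the `domain` line.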
Variation in Structural Relation
All versions of binding theory incorporate some notion of structural domination or superiority as a component of the binding relation. We referred to this relation above as command. One commonly assumed version of command is the tree-configurational relation of c-command (Reinhart, 1983):
(9a) A c-commands B if and only if A does not dominate B and the first branching node dominating A also dominates B.
(9b) [Tree diagram not reproduced: X is the first branching node dominating A; Y, the first branching node dominating B, lies below X and does not dominate A.]
In the tree in (9b), the first branching node dominating A, labeled X, also dominates B, and A does not dominate B. Therefore, A c-commands B. B does not c-command A, because the first branching node dominating B is Y, and Y does not dominate A. Other tree-based definitions of command have been proposed; in them, command is relativized to nodes other than the first branching node. For example, the similar relation of m-command makes reference to the first maximal projection dominating A. Thus, in diagram (9b), A m-commands B if X is a maximal projection (see X-Bar Theory). Notice also that if X is a maximal projection and Y is not a maximal projection, then B also m-commands A because the first maximal projection dominating B dominates A and B does not dominate A. Some literature on binding continues to use the term ‘c-command’ but defines it as m-command. Other theories define a command relation on linguistic structures other than trees. In lexical functional grammar (LFG), command is defined on f(unctional)-structures, which represent predicates and their adjuncts and subcategorized grammatical functions. The command relation relevant for binding in LFG is called ‘f-command’ and is defined as follows:
(10a) An f-structure A f-commands an f-structure B if and only if A does not contain B and every f-structure that contains A also contains B.
(10b) [F-structure diagram not reproduced: the f-structure X contains A and Y, and Y contains B.]
In the f-structure in (10b), the f-structure labeled A f-commands B: A does not contain B, and the f-structure X that contains A also contains B. B does not f-command A because there is an f-structure Y that contains B but not A. Notice that in (10), A and Y f-command each other, just as in a tree there
is mutual c-command between sisters. Since A can be the subject and Y the object, we need an additional principle to ensure that the subject binds the object but not vice versa. Otherwise a perfectly grammatical sentence like (11) would be a Principle B violation because the object reflexive would bind the subject pronominal.
(11) He_i injured himself_i.
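The definitions in (9a) and (10a) are precise enough to be checked mechanically. The following Python sketch is one possible rendering, under stated assumptions: trees are built from a hypothetical `Node` class, f-structures are modeled as nested dictionaries without structure sharing, and the attribute names `SUBJ`, `OBJ`, and `GF` as well as the extra node C are placeholders chosen only to reproduce the configurations described for (9b) and (10b).

```python
class Node:
    """A tree node; parents are set when children are attached."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    """(9a): a does not dominate b, and the first branching node
    dominating a also dominates b. A node is assumed not to c-command itself."""
    if a is b or dominates(a, b):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def contains(outer, inner):
    """True if f-structure inner is nested anywhere inside outer."""
    return any(
        isinstance(v, dict) and (v is inner or contains(v, inner))
        for v in outer.values()
    )


def all_fstructures(root):
    """Yield root and every f-structure nested inside it."""
    yield root
    for v in root.values():
        if isinstance(v, dict):
            yield from all_fstructures(v)


def f_commands(a, b, root):
    """(10a): a does not contain b, and every f-structure that
    contains a also contains b."""
    if a is b or contains(a, b):
        return False
    return all(contains(f, b) for f in all_fstructures(root) if contains(f, a))


# Tree as described for (9b); C is a hypothetical sister that makes Y branch.
a, b, c = Node("A"), Node("B"), Node("C")
y = Node("Y", [b, c])
x = Node("X", [a, y])
print(c_commands(a, b), c_commands(b, a))  # True False

# F-structure as described for (10b).
f_a, f_b = {"PRED": "A"}, {"PRED": "B"}
f_y = {"PRED": "Y", "GF": f_b}
f_x = {"SUBJ": f_a, "OBJ": f_y}
print(f_commands(f_a, f_b, f_x), f_commands(f_b, f_a, f_x))  # True False
print(f_commands(f_a, f_y, f_x), f_commands(f_y, f_a, f_x))  # True True
```

The last line reproduces the mutual f-command between A and Y, which is why a further ranking of grammatical functions is needed to decide which of the two may bind the other.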
Cases of mutual f-command like the above occur not just between subjects and objects but among all coarguments of a given predicate. Such cases are handled by an independently motivated relational hierarchy of grammatical functions based on the notion of obliqueness, in which the subject outranks the object, which in turn outranks the other arguments. In head-driven phrase structure grammar (HPSG), grammatical functions are encoded on SUBCAT (subcategorization) lists, which are ordered according to the aforementioned obliqueness hierarchy: the subject is the first member of SUBCAT, the object is the second, and so on. Early work in HPSG defined a version of command called o-command on the SUBCAT list, in terms of this obliqueness relation. A simplified definition of o-command follows:
(12a) A o-commands B if and only if A does not contain B and A precedes B on a SUBCAT list, or A o-commands X and X contains B.
(12b) [SUBCAT list diagram: A precedes X on the list, and X contains B.]
In the SUBCAT list in (12b), A o-commands B because A o-commands X and X contains B. B does not o-command A, on the other hand, because B does not precede A on a SUBCAT list and B does not o-command anything that contains A. The o-command relation in HPSG and LFG’s f-command relation are similar in that they are defined on structures that encode grammatical functions. The two theories are also similar in using the relational hierarchy to define binding constraints. More recent work in HPSG (Manning and Sag, 1999) defines binding on the ARG-ST (argument structure) list, a basic representation of argument structure, rather than on SUBCAT. The ARG-ST version of HPSG binding replaces o-command with a-command, where a-command can be defined by replacing all mention of o-command in (12) with a-command and all mention of SUBCAT with ARG-ST. To the extent that ARG-ST encodes thematic relations like agent (logical subject) and patient (logical object), the a-command version of HPSG binding is related to proposals that define the structural binding relation on a thematic hierarchy, such as Agent > Goal > Theme (Jackendoff, 1972; Wilkins, 1988).
Variation in Binding Domain
Some theories assume that the local domain for the anaphoric and pronominal binding conditions (Principles A and B) is the same: anaphors are required to be bound in exactly the same domains in which pronouns are required not to be bound. For example, Chomsky (1981) proposed that the local binding domain for both anaphors and pronominals is the governing category, where a governing category for an element is the minimal domain containing a subject and the head that selects the element. This predicts that anaphors and pronominals are in complementary distribution, a prediction that seems to be borne out by examples like the following:
(13a) Gonzo_i saw himself_i/*him_i.
(13b) Gonzo_i thought that George liked him_i/*himself_i.
Huang (1983) subsequently pointed out that the prediction above is incorrect, based on examples like the following:
(14a) They_i saw each other_i’s pictures.
(14b) They_i saw their_i pictures.
(15a) They_i saw pictures of each other_i/themselves_i.
(15b) They_i saw pictures of them_i.
In (14) and (15), the anaphors and pronominals occur in identical positions: there is no complementary distribution. Chomsky (1986) addressed this problem by proposing that the local domain for anaphoric and pronominal binding is the smallest domain in which the binding constraint in question could be satisfied. For the anaphoric cases in (14a) and (15a), there is no possibility of satisfying Principle A within the noun phrase that contains the anaphor. Therefore, the anaphor’s local domain becomes the domain of the containing NP; since the anaphors in (14a) and (15a) are bound in this slightly larger domain, the sentences are grammatical. In contrast, the local domain for the pronominals in (14b) and (15b) is the smaller domain constituted by just the NP containing the pronominal since Principle B is satisfiable within this domain. Although the English examples above are amenable to a treatment along these lines, data from other languages indicate that a unified notion of local binding domain for all anaphora is inadequate. Some languages have several anaphors, each with a different local domain. Consider the two Norwegian reflexives seg and seg selv:
(16a) Jon_i fortalte meg om seg selv_i/*seg_i
      J. told me about self
      ‘Jon told me about himself.’
(16b) Jon_i hørte oss snakke om seg_i/*seg selv_i
      J. heard us talk about self
      ‘Jon heard us talk about him.’
Based on data like the above, Manzini and Wexler (1987), Dalrymple (1993), and others argued that binding constraints must be parameterized as lexical properties of particular pronouns. Thus, part of the lexical entry for seg selv specifies that it must be bound to an argument of the same syntactic predicate, whereas the lexical entry for seg specifies that it must be bound in the minimal finite clause in which it is contained but cannot be bound by a coargument. Thus, a single language can have various anaphors, each with its own binding domain. Indeed, Norwegian has a third reflexive (ham selv) that has yet a different binding domain. Furthermore, many languages have long-distance reflexives that must be bound within the same sentence but place no further restrictions on their binding domain (Koster and Reuland, 1991; Cole et al., 2001). The possibility for a reflexive to allow long-distance binding has been claimed to correlate with its morphological form (Faltz, 1977; Pica, 1987): morphologically complex reflexives like English himself or Norwegian seg selv allow only local binding, whereas morphologically simple reflexives like Norwegian seg allow long-distance binding. A puzzle that has gone largely unaddressed in the literature on binding is the local nature of reciprocal binding. Although there are many examples of reflexive pronouns that need not be locally bound, there seem to be no comparable examples of long-distance reciprocals. Treating reflexives and reciprocals as anaphors that must obey the same binding principle does not lead us to expect this difference in behavior.
Defining the Binding Relation
In all of the examples we have examined so far, the relation between the pronoun and its potential antecedent has involved either coreference or noncoreference. In more complicated cases involving plurals, the possibility of partial overlap of reference arises. Lasnik (1981) discussed examples like (17), which he marked as ungrammatical:
(17) * We like me.
In this example, the speaker is included in the referent of the subject, leading to the impossibility of a pronoun referring to the speaker in object position. Lasnik also claimed that in (18), the group of people referred to as they cannot include the referent of him:
(18) They like him.
Examples such as these have prompted some researchers to revise the treatment of the binding relation by introducing a more complicated indexing system. Higginbotham (1983) proposed that the symmetrical coindexation mechanism be replaced with an antisymmetrical linking mechanism, represented by an arrow notation: (19)
This mechanism is particularly adept at representing split antecedents—cases in which a plural pronoun’s antecedent is made up of two syntactically separate nominals: (20)
The referential dependency of the pronoun on the two nominals is represented by linking it to both antecedents simultaneously. The most extensively explored revision to the standard coindexation mechanism is the proposal to represent the index for plural noun phrases as a set containing an index value for each individual in the set (Lasnik, 1981). In (21), they refers to two individuals, i and j. This index value is used to prevent the object him from referring to either individual i or individual j: (21) They{i,j} like him*{i}/*{j}/{k}.
This move necessitates a corresponding adjustment to the binding condition for pronominals, which must now refer to overlap of set-valued indices rather than simply to identity of atomic indices. For example, Principle B would be reformulated to require that the index of a pronominal must not overlap with the index of a commanding nominal in the pronominal’s local domain. Overlap is understood in set-theoretic terms: a set index A does not overlap with a set index B if and only if the intersection of A and B is empty. Notice that this treatment of indexation also blocks readings in which there is overlapping reference between plural pronouns:
(22) They{i,j} like them*{i,j}/*{i,k}/*{j,k}/{k,l}.
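The reformulated Principle B amounts to a disjointness test on sets. The following Python sketch illustrates this with the index assignments from (21) and (22); the function names `overlaps` and `principle_b_ok` are hypothetical, and the commanding nominal is simply assumed to be in the pronominal's local domain.

```python
def overlaps(index_a, index_b):
    """Two set-valued indices overlap iff their intersection is nonempty."""
    return bool(set(index_a) & set(index_b))


def principle_b_ok(pronominal_index, commander_index):
    """Set-based Principle B: the pronominal's index must be disjoint from
    the index of a commanding nominal in its local domain (locality and
    command are assumed here, not computed)."""
    return not overlaps(pronominal_index, commander_index)


they = {"i", "j"}

# (21): him cannot be i or j, but can be k.
for him in ({"i"}, {"j"}, {"k"}):
    print(sorted(him), principle_b_ok(him, they))

# (22): them is blocked for any index sharing a member with {i, j}.
for them in ({"i", "j"}, {"i", "k"}, {"j", "k"}, {"k", "l"}):
    print(sorted(them), principle_b_ok(them, they))
```

Only the indices {k} in (21) and {k, l} in (22) pass the check, matching the unstarred options in the examples.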
With the move to set-valued indices and a notion of overlap based on intersection, the binding relation no longer concerns coreference and noncoreference, but rather coreference and disjoint reference. Principle B requires disjoint reference, as discussed above, whereas Principle A still requires coreference, i.e., total overlap/equality of set indices:
(23a) They{i,j} like himself*{i}/*{j}.
(23b) They{i,j} like themselves{i,j}.
(23c) They{i,j} like themselves*{i,k}/*{i,j,k}.
Example (23a) is ungrammatical because there is no coindexation that can make the set index of the reflexive equal to the set index of the antecedent (himself cannot be plural). Example (23b) is, by contrast, grammatical: the set index of the reflexive and its antecedent are equal. Example (23c) illustrates that overlap of reference or intersection is not sufficient for reflexive binding, since the sentence cannot have an interpretation in which a group of people likes another group of people that includes only some of the first group. A problem for this approach is that there are grammatical examples that appear to be structurally identical to the ungrammatical examples above. Berman and Hestvik (1997) presented the following example, which, while syntactically similar to (18), is acceptable for many speakers:
(24) John and Mary often connive behind their colleagues’ backs to advance the position of one or the other. This time they got her a job in the main office.
Since they refers to John and Mary and her refers to Mary, the grammatical sequence they{i,j} got her{j} a job appears to be identical in binding-theoretic terms to the ungrammatical indexing they{i,j} like him{j} for (18). Reinhart and Reuland (1993) and Kiparsky (2002) proposed that the crucial difference between ungrammatical and grammatical instances of overlapping reference lies in whether the predicate taking the pronominal as an argument is interpreted collectively or distributively. If the predicate is a collective predicate, then overlapping reference is possible, but if it is a distributive predicate, then overlapping reference is impossible. This is meant to derive the difference between the grammatical (25a) and the putatively ungrammatical (25b): (25a) We elected me. (25b) * We voted for me.
The idea is that elect is a collective predicate and the overlapping reference is allowed, but vote for involves each individual voting separately and is therefore distributive, rendering the sentence ungrammatical. Similarly, the context of (24) makes it clear that John and Mary together got her a job – the predicate is interpreted collectively. However, many speakers find (25b) just as grammatical as (25a), even though vote for is presumably equally distributive for these speakers. In addition, certain grammatical
instances of overlapping reference do not obviously involve collective predication or do not involve predicates whose collective reading is logically distinct from their distributive reading (Büring, 2004), and certain ungrammatical instances of overlapping reference similarly do not involve obviously distributive predicates.
Semantic Approaches to Binding Theory
Bach and Partee (1980) provided a semantic alternative to syntactic binding theories, couched in Montague semantics. They argued that functional application in the semantics yields a sufficiently rich structural relation to model binding theory, provided that certain auxiliary assumptions are made. These assumptions can be thought of as analogous to binding constraints. Bach and Partee principally sought to show that a semantic binding theory achieves a coverage equal to syntactic binding theories (of the time), but they noted that one advantage of their semantic binding theory is that it generalizes readily to languages whose syntactic structure is less configurational. These languages nonetheless have rules of semantic composition similar to those of configurational languages, even if notions like subject and object in these languages are not defined configurationally. In this respect, their binding theory is similar to syntactic binding theories that define binding in terms of grammatical functions rather than on structural configurations, which only indirectly model grammatical functions. The HPSG and LFG binding theories discussed in an earlier part of this article are two such theories. Keenan (1988) also offered a semantic binding theory, but one based on his semantic case theory rather than on Montague semantics. His binding theory deals principally with reflexives and shares with the Bach and Partee theory (1980) the advantage of applying readily to nonconfigurational languages. The basic insight behind Keenan’s theory of reflexivization is that a reflexive denotes a function SELF that when applied to a binary relation R returns the set of x such that ⟨x, x⟩ is in R. The function SELF thus reduces the arity of the relation that it applies to. This treatment of reflexivization as an arity-reducing function is shared by Bach and Partee (1980).
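The arity-reducing effect of SELF can be made concrete with a small extensional model. In the following Python sketch a binary relation is represented as a set of ordered pairs; the function name `self_function` and the toy relation `injure` are assumptions introduced purely for illustration and are not part of Keenan's formalization.

```python
def self_function(relation):
    """Map a binary relation (a set of pairs) to the set of x
    such that (x, x) is in the relation."""
    return {x for (x, y) in relation if x == y}


# A toy extension for 'injure': who injured whom.
injure = {
    ("gonzo", "gonzo"),
    ("gonzo", "kate"),
    ("kate", "george"),
}

# The result is a one-place property: the set of self-injurers.
print(self_function(injure))  # {'gonzo'}
```

The input is a set of pairs and the output a set of individuals, which is the sense in which the reflexive turns a two-place relation into a one-place predicate.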
Reinhart and Reuland (1993) offered a mixed syntactic/semantic approach to binding theory. Their theory centers around the notion of predication, with syntactic predicates distinguished from semantic predicates. A semantic predicate is a predicate and its semantic arguments. A syntactic predicate is a head, all of its selected internal arguments, and, crucially,
an external argument (a subject). Reinhart and Reuland proposed the following two binding conditions:
1. A reflexive-marked syntactic predicate is reflexive.
2. A reflexive semantic predicate is reflexive-marked.
A predicate is reflexive-marked if and only if one of its arguments is a reflexive. A predicate is reflexive if and only if two of its arguments are coindexed. Given these conditions, a sentence like Gonzo_i injured himself_i is allowed since injured is a reflexive-marked predicate (marked by himself) that is reflexive (the arguments of the predicate are coindexed). The sentence *Gonzo_i injured him_i is disallowed because the predicate is reflexive but not reflexive-marked. And the sentence *Gonzo_i said Kate injured himself_i is unacceptable since injured is reflexive-marked but not reflexive (Kate and himself are not coindexed).
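The interaction of the two conditions can be traced on the three sentences just discussed. The following Python sketch is a deliberate simplification: it treats the syntactic and the semantic predicate as the same list of arguments, which happens to be harmless for the verb injured in these examples but is not true in general. The class `Arg` and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arg:
    form: str
    index: str
    reflexive: bool = False   # True for forms like 'himself'


def is_reflexive_marked(args):
    """A predicate is reflexive-marked iff one of its arguments is a reflexive."""
    return any(arg.reflexive for arg in args)


def is_reflexive(args):
    """A predicate is reflexive iff two of its arguments are coindexed."""
    indices = [arg.index for arg in args]
    return len(set(indices)) < len(indices)


def licensed(args):
    """Check both conditions on a single predicate's argument list."""
    marked, reflexive = is_reflexive_marked(args), is_reflexive(args)
    condition_1 = (not marked) or reflexive   # reflexive-marked -> reflexive
    condition_2 = (not reflexive) or marked   # reflexive -> reflexive-marked
    return condition_1 and condition_2


# Arguments of 'injured' in each of the three sentences.
ok = [Arg("Gonzo", "i"), Arg("himself", "i", reflexive=True)]
bad_pronoun = [Arg("Gonzo", "i"), Arg("him", "i")]
bad_embedded = [Arg("Kate", "j"), Arg("himself", "i", reflexive=True)]

print(licensed(ok), licensed(bad_pronoun), licensed(bad_embedded))
# True False False
```

In the third case only the embedded predicate is checked, since Gonzo is an argument of said rather than of injured.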
Exemption and Logophoricity
Certain formulations of binding theory allow some occurrences of anaphors to be excluded from the purview of binding constraints. For example, HPSG’s Principle A states that a locally commanded anaphor must be locally bound (where the command relation is either o-command or a-command, depending on the version of the theory, as discussed above). If an anaphor is not locally commanded, HPSG’s Principle A does not apply to it: the anaphor is exempt from binding (Pollard and Sag, 1994). For example, the reflexive in the following sentence is an exempt anaphor:
(26) Gonzo_i downloaded a picture of himself_i.
Similarly, in (27) the reflexive is in noncomplementary distribution with a pronoun and is treated as exempt from binding constraints:
(27) Gonzo_i saw a snake near him_i/himself_i.
The binding theory of Reinhart and Reuland (1993) is similar in treating some anaphors as exempt. Recall that their Principle A requires a reflexive-marked syntactic predicate to be reflexive. Crucially, a syntactic predicate must have a subject. Therefore, although the noun picture in (26) is reflexive-marked, it does not count as a syntactic predicate, and Reinhart and Reuland’s Principle A does not apply to it. Theories like these, in which some anaphors are exempt from binding constraints, contrast with approaches like that of Chomsky (1986), sketched earlier. In Chomsky’s view, reflexives in examples like (27) are not exempt from binding but rather must be bound in a slightly larger syntactic domain. The binding theory of LFG is similar in this regard.
Constraints on the distribution of exempt anaphors are often claimed to be defined in nonsyntactic terms. For example, Pollard and Sag (1994) argued that exempt anaphors are used to refer to an antecedent whose point of view is being reported. In this view, exempt anaphors are subject to discourse and pragmatic constraints, as discussed extensively by Kuno (1987). In cases of noncomplementary distribution, such as (27), Kuno argued that the reflexive indicates that the speaker has taken on the subject’s point of view but the pronoun does not. The encoding of point of view in pronominal systems is typically discussed under the rubric of logophoricity. Theories of exemption differ on the treatment of the specifier or possessor of a noun phrase. Reinhart and Reuland’s theory (1993), like Chomsky’s (1986), treats specifiers of noun phrases as subjects for purposes of binding theory. This predicts that sentences like (28) are ungrammatical:
(28) * Gonzo_i downloaded her picture of himself_i.
Since the specifier her is in the right structural position to count as a subject, the reflexive must be bound in the NP, either because it can be bound in this minimal domain (in Chomsky’s 1986 account) or because the head noun counts as a syntactic predicate and is reflexive-marked (in the Reinhart and Reuland account). Recent psycholinguistic evidence has been shown to bear on this issue; speakers in fact find sentences like (28) grammatical (Asudeh and Keller, 2001; Runner et al., 2003):
(29) Gonzo_i downloaded her picture of himself_i.
Asudeh and Keller (2001) argued that the result exemplified by (29) supports predication-based binding theories that do not treat possessors as subjects, such as certain versions of HPSG and LFG binding theory. They noted that the possessor in the noun phrase is not an argument of the head noun and concluded that if the possessor is not a semantic argument, then it is not a subject in predication-based theories. In an HPSG binding theory, the reflexive in (29) is exempt. In an LFG account, the reflexive is not exempt but must be bound in the minimal domain containing a subject, which corresponds to the matrix clause.
Pragmatic and Blocking Approaches to Binding
In the binding theories reviewed thus far, Principle A and Principle B derive a kind of blocking effect: pronouns are in general barred where reflexives are
required. Pronouns and reflexives are thus predicted to be in mostly complementary distribution, although the complementarity is relaxed in certain situations, using a variety of mechanisms. Kiparsky (2002) noted that this derivative notion of blocking has the conceptual disadvantage of lacking deep motivation: the general complementarity seems merely coincidental. He argued that the grammar should include blocking principles that explicitly compare structures containing pronouns to ones containing reflexives. He gave an overview of the issues involved and offered a hybrid binding theory that includes blocking principles. Huang (2000) presented an alternative sort of blocking account based on a theory of neo-Gricean pragmatics. Huang’s analysis followed in an established tradition of pragmatic approaches to binding, which he reviewed extensively. His account contrasts with that of Kiparsky (2002), in which the blocking constraints rely on notions of featural and morphological economy rather than on pragmatic principles. Although blocking accounts arguably provide an explanation of pronoun/reflexive complementarity that nonblocking accounts lack, they are by the same token seriously challenged when the complementarity breaks down. Reflexives and pronouns must be shown to give rise to different meanings or pragmatic effects in such environments, with the result that the blocking relation fails to apply since it chooses only between semantically or pragmatically equivalent options (Kiparsky, 2002; Huang, 2000).
Reflexives and Valence Reduction
Reflexive forms do not always fill a syntactic and semantic role of a predicate. In many languages, the same form can play two roles. It can be a reflexive pronoun with an independent syntactic and semantic role in some cases, and it can mark intransitivity or valence reduction, with no associated semantic role, in other cases. For example, the Swedish form sig serves as an argument long-distance reflexive in (30a). However, in (30b) it simply marks the verb as intransitive. Examples (30c) and (30d) show that the verb is intransitive, since the verb cannot take a full local reflexive or a free object.
(30a) Johan_i hörde oss prata om sig_i.
      J. heard us talk about self
      ‘Johan heard us talk about him.’
(30b) Johan skyndade sig.
      J. hurried self
      ‘Johan hurried up.’
(30c) * Johan_i skyndade sig själv_i.
      J. hurried self
(30d) * Johan skyndade Maria.
      J. hurried M.
A question raised by this pattern of data is why the long-distance reflexive is used for valence reduction. Reinhart and Reuland (1993) offered an explanation of these facts based on the observation that long-distance reflexives are morphologically simple (Faltz, 1977; Pica, 1987). However, in languages like English, which lack morphologically simple reflexives, full reflexives seem to serve a similar function:
(31a) Gonzo behaved himself.
(31b) * Gonzo behaved David.
A detailed study of reflexivization and its relation to syntactic and semantic valence reduction was presented by Sells et al. (1987).
Binding and Movement
Binding theory is invoked in certain treatments of A-movement (movement to an argument position) and A-bar movement (movement to a nonargument position) in transformational grammar. Such treatments assume that the passive example of A-movement in (32a) and the wh-question example of A-bar movement in (32b) involve transformations, in which the t represents the original position – the trace – of the coindexed element:
(32a) Gonzo_i was accosted t_i.
(32b) Who_i did someone accost t_i?
The fact that binding theory applies to these examples might initially appear puzzling since binding theory is about anaphors, pronominals, and nonpronouns, and traces do not seem to fit into any of these categories. However, Chomsky (1982) gave a featural breakdown of overt noun phrases in terms of the features [±a(naphor)] and [±p(ronominal)] and then applied the classification to covert noun phrases, i.e., empty categories. The passive trace is grouped with anaphors using the feature assignment [+a, −p]. The trace in wh-movement is grouped with nonpronouns using the feature assignment [−a, −p]. This classification enables the statement of locality relations on transformations in terms of binding requirements on traces of moved elements. The binding-theoretic treatment of empty categories has been considerably revised in more recent transformational work. Hornstein (2001) revived the connection by claiming that anaphors are the result of overt A-movement. In this view, pronominals and reflexives are both claimed to be grammatical formatives introduced during derivations, not by lexical insertion. This treatment of binding has the
advantage for transformational grammar of reducing binding to movement, which is independently motivated in transformational theory. However, it faces a number of challenges. The account does not readily extend to long-distance, intransitivizing, or exempt/logophoric reflexives. In addition, it treats deictic pronouns differently from anaphors and pronominals, as lexical items introduced through lexical insertion. This raises the question of why nondeictic personal pronouns, which are purely grammatical formatives, uniformly have the same morphological realization as deictic personal pronouns. Despite these challenges, further evidence for binding as movement apparently comes from resumptive pronouns, as in the following Swedish example:
(33) Vilken elev trodde Maria att han fuskade?
     which student thought M. that he cheated
     ‘Which student did Maria think cheated?’
This example seems to indicate that wh-movement has left a pronoun in the extraction site. This could be explained by treating resumptive pronouns as overt traces that result from a last-resort attempt to save a derivation. Boeckx (2003) offered an alternative movement-based account in which a resumptive pronoun is the result of spelling out a head whose complement has moved away to become the resumptive’s antecedent. However, resumptive pronouns do not obey standard constraints on movement and do not possess other characteristics of wh-traces. They therefore do not lend straightforward support to the binding-as-movement view. In a recent overview of resumption, Asudeh (2004) argued that resumptive pronouns are not last-resort grammatical devices, overt traces, or the result of movement but are rather ordinary, lexically inserted pronouns that are bound by the wh-phrase and whose distribution is explained on the basis of semantic composition. Lastly, binding is also relevant to movement as a diagnostic tool for the extraction site for movement. Reconstruction, as in (34a), and connectivity, as in (34b), are two particular phenomena in which binding has been crucial:
(34a) Which picture of himself_i does nobody_i like_i?
(34b) What nobody_i was was sure of himself_i.
The locality of reflexive binding has been used as evidence that the wh-phrase in (34a) must be reconstructed in its base position. Similarly, the free relative’s subject in its surface position in (34b) does not command, and therefore cannot bind, the reflexive. In order to bind the reflexive, the free relative’s subject must at some nonsurface level be the subject of the second copula. Büring (2004: chapter 12) gave an extensive overview of reconstruction and connectivity, as well as other issues concerning binding and movement.
See also: Anaphora, Cataphora, Exophora, Logophoricity; Anaphora: Philosophical Aspects; Command Relations; Coreference: Identity and Similarity; Deixis and Anaphora: Pragmatic Approaches; Pronouns; Scope and Binding: Semantic Aspects; X-Bar Theory.
Bibliography
Asudeh A (2004). ‘Resumption as resource management.’ Ph.D. diss., Stanford University.
Asudeh A & Keller F (2001). ‘Experimental evidence for a predication-based binding theory.’ In Andronis M, Ball C, Elston H & Neuvel S (eds.) Proceedings of the Chicago Linguistic Society 37. Chicago: Chicago Linguistic Society. 1–14.
Bach E & Partee B (1980). ‘Anaphora and semantic structure.’ In Kreiman J & Ojeda A E (eds.) Papers from the parasession on pronouns and anaphora. Chicago: Chicago Linguistic Society. 1–28. [Reprinted in Partee B H (ed.) Compositionality in formal semantics: selected papers of Barbara Partee. Oxford: Blackwell Publishers. 2003.]
Berman S & Hestvik A (1997). ‘Split antecedents, noncoreference and DRT.’ In Bennis H, Pica P & Rooryck J (eds.) Atomism and binding. Dordrecht: Foris. 1–29.
Boeckx C (2003). Islands and chains: resumption as derivational residue. Amsterdam: John Benjamins.
Büring D (2004). Binding theory. Cambridge: Cambridge University Press.
Chomsky N (1981). Lectures on government and binding. Dordrecht: Foris Publications.
Chomsky N (1982). Some concepts and consequences of the theory of government and binding. Cambridge, MA: MIT Press.
Chomsky N (1986). Knowledge of language: its nature, origin, and use. New York: Praeger.
Cole P, Hermon G & Huang C-T J (eds.) (2001). Long-distance reflexives, Syntax and semantics, vol. 33. San Diego: Academic Press.
Dalrymple M (1993). The syntax of anaphoric binding. [CSLI Lecture Notes, number 36.] Stanford, CA: CSLI Publications.
Faltz L M (1977). ‘Reflexivization: a study in universal syntax.’ Ph.D. diss., University of California, Berkeley. [Reprinted by Garland Press, New York, 1985.]
Higginbotham J (1983). ‘Logical form, binding, and nominals.’ Linguistic Inquiry 14, 395–420.
Hornstein N (2001). Move! a minimalist theory of construal. Oxford: Blackwell Publishers.
Huang C-T J (1983). ‘A note on the binding theory.’ Linguistic Inquiry 14, 554–560.
Huang Y (2000). Anaphora: a cross-linguistic study. Oxford: Oxford University Press.
Jackendoff R S (1972). Semantic interpretation in generative grammar. Cambridge, MA: MIT Press.
Keenan E L (1988). ‘On semantics and the binding theory.’ In Hawkins J A (ed.) Explaining language universals. Oxford: Blackwell Publishers. 105–144.
Kiparsky P (2002). ‘Disjoint reference and the typology of pronouns.’ In Kaufmann I & Stiebels B (eds.) More than words. [no. 53 in Studia Grammatica] Berlin: Akademie Verlag. 179–226.
Koster J & Reuland E (eds.) (1991). Long-distance anaphora. Cambridge: Cambridge University Press.
Kuno S (1987). Functional syntax: anaphora, discourse, and empathy. Chicago: University of Chicago Press.
Lasnik H (1981). ‘On two recent treatments of disjoint reference.’ Journal of Linguistic Research 1, 48–58. [Also in Lasnik H (1989). Essays on anaphora. Dordrecht: Kluwer Academic Publishers.]
Manning C D & Sag I A (1999). ‘Dissociations between argument structure and grammatical relations.’ In Kathol A, Koenig J-P & Webelhuth G (eds.) Lexical and constructional aspects of linguistic explanation. Stanford, CA: CSLI Publications. 63–78.
Manzini M R & Wexler K (1987). ‘Parameters, binding theory, and learnability.’ Linguistic Inquiry 18, 413–444.
Pica P (1987). ‘On the nature of the reflexivization cycle.’ In McDonough J & Plunkett B (eds.) Proceedings of the Seventeenth Annual Meeting of the North Eastern Linguistic Society, vol. 17. Amherst, MA: GLSA Publications/University of Massachusetts. 483–500.
Pollard C & Sag I A (1994). Head-driven phrase structure grammar. Chicago: University of Chicago Press.
Reinhart T (1983). Anaphora and semantic interpretation. London: Croom Helm.
Reinhart T & Reuland E (1993). ‘Reflexivity.’ Linguistic Inquiry 24, 657–720.
Runner J T, Sussman R S & Tanenhaus M K (2003). ‘Assignment of reference to reflexives and pronouns in picture noun phrases: evidence from eye movements.’ Cognition 89, B1–B13.
Sells P, Zaenen A & Zec D (1987). ‘Reflexivization variation: Relations between syntax, semantics, and lexical structure.’ In Iida M, Wechsler S & Zec D (eds.) Working papers in grammatical theory and discourse structure. Stanford, CA: CSLI Publications. 169–238. [CSLI Lecture Notes, number 11.]
Wasow T (1979). Anaphora in generative grammar. Ghent: E. Story.
Wilkins W (1988). ‘Thematic structure and reflexivization.’ In Wilkins W (ed.) Syntax and semantics: thematic relations, vol. 21. San Diego: Academic Press. 191–214.
Biosemiotics
S Brier, Copenhagen Business School, Copenhagen, Denmark
© 2006 Elsevier Ltd. All rights reserved.
Semiotics develops a general theory of all possible kinds of signs, their modes of signification and information, whole behavior and properties, but is usually restricted to human communication and culture. Biosemiotics (bios, life and semion, sign) is a growing field that studies the production, action, and interpretation of signs, such as sounds, objects, smells, and movements, as well as signs on molecular scales, in an attempt to integrate the findings of biology and semiotics to form a new view of life and meaning as immanent features of the natural world. Life and semiosis are seen as coexisting. The biology of recognition, memory, categorization, mimicry, learning, and communication are of interest for biosemiotic research, together with the analysis of the application of the tools and notions of semiotics such as interpretation, semiosis, types of sign, and meaning. The biosemiotic doctrine accepts nonconsciously intentional signs in humans, nonintentional signs, also between animals as well as between animals and humans, and signs between organs and cells in the body and between cells in the body or in nature. Thus the biological processes between and within living
beings transcend the conceptual foundation of the other natural sciences. In the tradition of Peirce, who founded semiotics as a logic and scientific study of dynamic sign action in human and nonhuman nature, biosemiotics attempts to use semiotic concepts to answer questions about the biological and evolutionary emergence of meaning, intentionality, and a psychic world. Peircian biosemiotics builds on Peirce’s unique triadic concept of semiosis, where the ‘interpretant’ is the sign concept in the organism that makes it see or recognize something as an object. The interpretant is the organism’s interpretation of what the outer sign vehicle stands for in a motivated context, made by relating it to a code that is connected to that specific functionality. This explains, for instance, why a cheetah sees a small gazelle, and not an elephant, as prey. Peirce’s semiotics is the only one that deals systematically with nonintentional signs of the body and of nature at large. It therefore accepts as signs involuntary body movements (such as instinctive motor patterns in animal courtship), patterns of and within the body (such as plumage for another bird and smallpox for a physician), and further patterns and differences in nature (such as the track of a tornado). It has thus become the main source for semiotic contemplations of the similarities and differences of signs of inorganic nature, signs of the living systems,
and the cultural and linguistic signs of humans living together in a society. Semiotics is also defined as the study – or doctrine – of signs and sign systems, where sign systems are most often understood as codes. The codes for the production of proteins from the information of the genome, and for the reception and effects of hormones and neurotransmitters, spring to mind as obvious biological sign systems. Marcello Barbieri (2001) has pointed to the importance of codes in living systems, such as the genetic code, signal codes for hormones, codes between nerve cells and between nerve cells and muscles, and codes for recognition of foreign substances and life forms in the immune system. He defines codes as rules of correspondence between two independent worlds, such as the Morse code standing for letters in the alphabet. A code gives meaning to differences or information in certain contexts. But information is not a code in itself. He also points to the peculiar fact that the proteins in the living cell are different from proteins created through external spontaneous chemical processes. Living systems are not natural in the same way as physical and chemical systems, because the protein molecules from which they construct themselves are manufactured by molecular machines (the ribosomes and connected processes). The ribosome, an organelle in the cell constructed from huge RNA molecules connected with several enzymes, is a system capable of assembling molecules by binding their subunits together in the order provided by a template. Cell proteins have the sequences of their amino acids determined by the internal code system in the cell connected to the genes in the nucleus’s DNA. To determine the amino acid sequence in the proteins, the ribosomal system uses the base sequence of messenger RNA, which comes out to the ribosome from inside the nucleus and is itself a template of the gene in the DNA.
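Barbieri’s notion of a code as a rule of correspondence between two independent worlds can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the table holds a handful of entries from the standard genetic code rather than all 64 codons, and the function name translate and the sample messenger-RNA string are hypothetical. It mimics the template-reading step attributed to the ribosome above, and it does not attempt to model the interpretant that biosemiotics regards as essential to genuine semiosis.

```python
# Illustrative subset of the standard genetic code (mRNA codon -> amino acid).
# Only a few of the 64 codons are listed; the table is deliberately partial.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "GAA": "Glu",
    "UGG": "Trp",
    "UAA": None,   # stop codon
    "UAG": None,   # stop codon
    "UGA": None,   # stop codon
}


def translate(mrna):
    """Read an mRNA string three bases at a time and look up each codon.

    Hypothetical helper: reading starts at the first AUG and ends at a
    stop codon or at the end of the string. A codon missing from the
    partial table above raises KeyError.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon reached
            break
        protein.append(amino_acid)
    return protein


print(translate("GGAUGUUUGGCAAAUAAGCU"))  # ['Met', 'Phe', 'Gly', 'Lys']
```

The point of the sketch is that the pairing of a base triplet with an amino acid is held in a lookup table and not derived from any property of the strings themselves, just as the Morse code pairs dots and dashes with letters by convention.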
Living systems are thus built out of artificially produced, code-based molecules from the cell’s molecular assembler machine. They are autopoietic (self-creating) – as pointed out by Maturana and Varela – as they produce their own elements and internal organization. A living system’s structure, organization, and processes are determined by internal codes and they are therefore in a certain way artificial. Thus a code is a set of process rules or habits (for instance, how the ribosome works) that connects elements in one area (e.g., genes) with another area (e.g., proteins) in a specific meaning context (here the creation, function, and survival of the cell). As the biosemiotician Kalevi Kull (1999) points out, codes are correspondences that cannot be inferred directly from natural laws. To most biosemioticians,
it is crucial that the correspondence is not a universal natural law but is motivated from a living signifying system. Thus machines do not make codes themselves. A sequence of differences such as the base pairs in DNA can be information for coding, but is not a code in itself. Biosemiotics argues that codes are triadic sign processes where an interpretant makes the motivated connection between objects and signs (representamens). Living systems function based on self-constructed codes. This differentiates them from physical, chemical, and technological systems (computers do not make their own codes, as they function causally according to the codes we have made and installed). As Alexei Sharov (1998) notes, a sign is an object that is a part of some self-reproducing system. A sign is always useful for the system, and its value can be determined by its contribution to the reproductive value of the entire system. Thus semiosis is a crucial part of those processes that make systems living and lift them out of the physical world’s efficient causality, through the informational realm of formal causality in chemistry, into the final causation of semiotic processes. Inspired by Peirce’s semiotic philosophy, biosemiotics thus works with more types of causation than the classical sciences. In Peirce’s philosophy, efficient causality works through the transfer of energy and is quantitatively measurable. Formal causality works through pattern fitting and difference, with signals as information in a dualistic proto-semiotic matter. Final causation is semiotic signification and interpretation. Semiosis, both in the form of signification and communication, is viewed as an important part of what makes living systems transcend purely physical, chemical, and even informational explanations of the kind that describe how computers function. Molecules are composed of sequences of atoms and make three-dimensional shapes. They interact informationally through formal causality.
The biological macromolecules are composed of minor molecules often put in sequences. Cells interpret the molecules as coded signs and interact with them through final causation in semiosis. Thus far, biosemiotics considers the living cell to be the simplest system possessing real semiotic competence. Biosemiotics sees the evolution of life and the evolution of semiotic systems as two aspects of the same process. The scientific approach to the origin and evolution of life has overlooked the inner qualitative aspects of sign action, leading to a reduced picture of causality. The evolution of life is based not only on physical, chemical, and even informational processes, but also on the development of semiotic possibilities, or semiotic freedom, as one of the founding biosemioticians, Jesper Hoffmeyer (1996), calls it. It is the evolution of semiotic freedom that creates the
Figure 1 The model classifies types of semiosis and proto-semiotic (informational) processes. On the left side is Luhmann’s theory of viewing the body, the psyche, and the linguistic system as autopoietic (closed and self-organized). The localization of the processes in this diagram is symbolic and not really related to actual physical locations; for example, the head is also part of biological autopoiesis and the location of endosemiotic processes. To simplify this model, I have placed all the cybernetic-autopoietic concepts on the left and all the biosemiotic ones on the right, although all concepts concern both persons. Each person is placed within a signification sphere (Umwelt). When these spheres are combined through sociocommunicative autopoietic language games, a common signification sphere of culture is created. One part of exosemiotic signification is based on the linguistic processes of conceptualization and classification. Underneath the language games is the biological level of instinctually based sign games, and under that is the informational exchange through structural couplings. Thus, exosemiotics also has a level of biopsychological, or empathic, signification, as well as a level of structural couplings that the organism, or rather the species, has developed through evolution. Endosemiotics is made up of the processes between cells and organs in the body. Phenosemiotics consists of prelinguistic sign processes in the mind such as emotions and imaging, whereas thought semiosis is conceptualized thinking. On the far left side are the signification processes toward the environment, which consists of nonintentional potential signs that become the signification sphere when they are interpreted as signs.
zoosemiotic system of sign games, as the bio- and cybersemiotician Søren Brier (1995) calls it. These sign games are the primary system behind the foundation of human language games and the tertiary system of culture, as Thomas Sebeok and Marcel Danesi (2000) have thoroughly shown in their Modeling System Theory. Multicellular living individual beings are then understood as swarms of communicatively organized semiotic cellular units. The human body is seen as organized in swarms of swarms of biological units and as layer upon layer of internal (endo) semiotic processes, as well as external (exo) signification processes building up a signification sphere (Umwelt), and finally exosemiotic social processes between individuals constructing language and first-person experiences (see Figure 1). Complex self-organized living systems are not only governed by physically efficient causation; they are also governed by formal and final causality. They
are governed by formal causality in the sense of the downward causation from a higher level structure (such as a tissue, an organ, or the entire organism) to its individual cells, constraining their action, but also endowing them with functional meanings in relation to the entire metabolism (as systems science has shown). Organisms are governed by final causality in the sense that they tend to take habits and generate future interpretants of the present sign actions, as in learning. In this sense (Brier, 1998), biosemiotics draws upon the insights of fields such as systems theory, theoretical biology, and the physics of complex self-organized systems. As Sharov (1998) points out, biosemiotics can be viewed as a root of biology and semiotics rather than a branch of semiotics (in its conventional limit to human languages). As such, biosemiotics also represents a suggestion for a deeper foundation that can connect biology with the humanities in another way than sociobiology and evolutionary psychology do.
Biological systems are then understood as being held together for communicative reasons and are therefore not natural in the physical–chemical sense. They are communicative structures, as Kull (2001) argues. One could also call them discursive material systems. Just as we can call humans language-cyborgs because our minds are artificially formed by language, we can call all other living systems sign-cyborgs because they are made of coded molecules and organized communicatively by semiotic processes. Computers, by contrast, only work on and are organized around differences or informational bits. Thus, they are dualistic and therefore proto-semiotic (Nöth, 2002), as genuine semiosis is triadic according to Peirce. The same goes for information in natural systems, for example dissipative structures such as tornadoes. Biosemiotics offers a rich field of exploration and ongoing research into the life of signs as they are found in the actual world’s ecological, mental, and artificial systems (Emmeche, 1998). Examples of relevant topics are sign functions in physical, chemical, biological, and computational systems, as studied in molecular biology, cognitive ethology, cognitive science, robotics, and neurobiology; communication of all living systems, including the area of ethology; the semiotics of cellular communication in the body among organs, the immune system, and the brain, as in psychoneuroimmunology; the representational dynamics of disease and its possible relevance for medical diagnosis and treatment; the study of the semiotics of complex systems, anticipatory systems, artificial life, and real life; the semiotics of collective biological phenomena such as emergent signs in swarm intelligence; and the metaphysics of Darwinism: can semiotics provide a foundation for a new evolutionary paradigm through Peirce’s idea of Thirdness and the emergence of interpretants in biotic evolution?
Biosemiotics can help develop the theory of the biological self and its relation to the emotional and sign-producing systems in animals as well as the linguistic thinking system in humans, and the theory of the embodiment of consciousness and language and of internal mental causation. Such may be a short and bold formulation of the biosemiotic view, combining several researchers’ contributions into a view that is as close to consensus as possible for the leading researchers in this still young research program. Apart from C. S. Peirce, early pioneers of biosemiotics are Jakob von Uexküll (1864–1944), Charles Morris (1901–1979), Heini Hediger (1908–1992), and Giorgio Prodi (1928–1987); the founding fathers are Thomas A. Sebeok (1920–2001) and Thure von Uexküll (1908–2004), and the founders of the
second wave are contemporary scholars such as Jesper Hoffmeyer and Claus Emmeche (who formed the biosemiotic group in Copenhagen in the 1980s), Kalevi Kull (the Jakob von Uexküll center), Alexei Sharov, Søren Brier, Marcello Barbieri, Anton Markoš, and Dario Martinelli (zoosemiotic musicology). Semioticians such as Floyd Merrell, John Deely, Myrdene Anderson, Lucia Santaella, Frederik Stjernfelt, Tommi Vehkavaara, and Winfried Nöth have also contributed as part of their more general work. In the following, we look into the foundations and specific theories. It is interesting, however, that F. S. Rothschild (1899–1995), who did not notably influence the development of biosemiotics, was the first to use the term, in 1962, in the Annals of the New York Academy of Sciences 96: 774–784.
Thomas Sebeok’s Development of Zoosemiotics and Biosemiotics
Ever since Umberto Eco formulated the problem of the semiotic threshold, Peircian semiotics has developed further into the realm of biology. The efforts of Thomas Sebeok (1920–2001) have led to the development of a biosemiotics encompassing all living systems, including plants and microorganisms as sign users (Petrilli and Ponzio, 2001). Sebeok’s name is associated most of all with the term ‘zoosemiotics,’ the study of animal sign use (Sebeok, 1972). The term was coined in 1963; zoosemiotics deals with species-specific communication systems and their signifying behavior. Zoosemiotics is concerned more with the synchronic perspective than the ethology of Lorenz and Tinbergen, which focuses more on the diachronic dimension. Sebeok’s research succeeded in broadening the definition of semiotics beyond human language and culture to a biosemiotics encompassing not only human nonverbal communication but also all sign processes between and within animals (Sebeok, 1990). He pointed out that we are living in a world of signs: a ‘semiosphere.’ Sebeok argued that the biosphere and the semiosphere are linked in a closed cybernetic loop where meaning itself powers creation in self-excited circuits. With Sebeok’s enthusiastic support as editor, the two large special volumes of Semiotica on biosemiotics (Sebeok et al., 1999) and on Jakob von Uexküll’s contribution to the foundation of biosemiotics (Kull, 2001) were edited by the Copenhagen and the Tartu schools of biosemiotics, respectively. Later, through the collaboration of these schools of biosemiotics, a series of annual conferences under the name Gatherings in Biosemiotics has been developed since 2000, now also in collaboration with biosemioticians in Prague. In 2004, through further cooperation with the Italian school
of semantic biology (Barbieri), work on starting a Journal of Biosemiotics has begun.
Hoffmeyer and Emmeche’s Theory of Code Duality
Later Sebeok decided that zoosemiotics rests on a more comprehensive science of biosemiotics. This global conception of semiotics, namely biosemiotics, equates life with sign interpretation and communication. It is inspired by Jakob von Uexküll’s theory that all living beings are the center of a phenomenal Umwelt (Sebeok, 1989). This idea was carried on through Thure von Uexküll, with whom Sebeok interacted in creating the foundations for a modern biosemiotics. In the late 1980s, these ideas merged with the Danish biochemist Jesper Hoffmeyer’s communicative view of life and with the theory that he and the biophilosopher Claus Emmeche developed (Emmeche and Hoffmeyer, 1991; Hoffmeyer and Emmeche, 1991) of the foundational code duality of living systems: they see living systems as defined by the interactions, through evolution, between a digital code in the gene or genotype and an analog code in the whole individual or phenotype. The gene is a code for memory and self-representation, and the individual living body is a code for action and interaction with the real world and its ecology. Thus life appears also to be an interplay of different types of self- and other-descriptions. The egg and the hen, as two interacting aspects of a living system evolving through time and space, are another example. Thus signs, and not molecules, are the basic units of the study of life, and the semiotic niche is the species’ home. Biological evolution is a development toward more semiotic freedom. Hoffmeyer’s contribution to biosemiotics is summarized in Emmeche et al. (2002).
The Roots from Uexküll and Ethology
Biosemiotics is already prefigured in Jakob von Uexküll’s Umweltlehre, although not in semiotic terms. Sebeok fruitfully combined the influences of von Uexküll and Charles S. Peirce, merging them into an original whole in an evolutionary perspective and arriving at the thesis that symbiosis and semiosis are one and the same (Sebeok, 1989). Biosemiotics finds its place as a master science that encompasses the parallel disciplines of ethology and comparative psychology. As von Uexküll was one of Konrad Lorenz’s most important teachers, the ethology that Lorenz and Tinbergen developed fitted nicely into biosemiotics as it developed from Sebeok’s studies of animal communication and ethology.
Figure 2 Jakob von Uexküll’s functional circle, which demonstrates his (phenomenal constructivistic) concept of objects (von Uexküll 1957: 10–11; referred to as ‘Figure 3.’) In cybernetic recursive processes between receptors and effectors, the perceptual object is created on the basis of a functional tone.
In J. and T. von Uexküll’s writings (J. von Uexküll, 1934; T. von Uexküll et al., 1982) on the species-specific and subjective Umwelt in animals, one finds the roots of important concepts such as sign stimuli and innate releasing mechanisms, as well as the ‘functional tones’ that are later utilized in Lorenz’s ethological research program as the concept of motivation. J. von Uexküll’s ‘tone’ concept is the root of Lorenz’s specific motivation, but it seems even more closely related to Gibson’s affordances, although it is unclear whether Gibson ever read von Uexküll. The functional tones are the functions an animal can distinguish in its surroundings; they create its functional image of a ‘thing,’ which thus becomes an ‘object’ in the animal’s Umwelt. Brier (1999) has coined the term ‘signification sphere’ to give a modern semiotic term to von Uexküll’s presemiotic concepts. Figure 2 shows the presemiotic von Uexküll model of object perception. As von Uexküll’s concept of ‘tone’ becomes Lorenz’s ‘motivation,’ the ‘subjectively defined object’ becomes the ‘sign stimuli’ in ethology, and finally the ‘functional relation between receptors and effectors’ becomes the ‘IRM’ (innate releasing mechanism). However, it is clear that von Uexküll’s biophenomenological concepts differ from the biocybernetic and partially mechanistic framework found in the theoretical foundation of Lorenz and Tinbergen’s articles from around 1950. Only in the new biosemiotics can this conceptual difference be resolved, using Peirce’s philosophy (Brier, 2001).
Animal Languages or Sign Games?
The empiricist and natural science readings Sebeok offers for communication were new to the semiotics field. References to animal models are made throughout his work in the context of ethology. The approaches of ethology and sociobiology have been controversial and, in their applicability to human culture and society, accused of reductionism. Sebeok shows that some of this controversy may find itself
played out in the new transdisciplinary framework of biosemiotics. In 1992, he and his wife Jean Umiker-Sebeok published ‘The semiotic web 1991’ as a volume titled Biosemiotics. This volume was predicated on a book they edited in 1980, Speaking of apes, which presented a detailed critical evaluation of current investigations of the ability of apes to learn language. In a profound critique of the way the experiments were constructed, Sebeok showed that it is very doubtful that apes have such capabilities. Thus biosemiotics does not entail that there are no significant differences between human and ape linguistic capabilities. But through biosemiotics, Sebeok and Danesi (2000) argued that a zoosemiotic system exists as the foundation of human language and has to be called the primary system; thus languages become secondary and culture tertiary, as already mentioned.
The Peircian Influence
The majority of biosemiotics builds on Peirce’s unique triadic concept of semiosis, where the interpretant is the sign concept in the organism’s mind, which is the interpretation of what the outer sign vehicle stands for: its object. For instance, the object of a raised fist is a physical threat. Peircian biosemiotics is based on Peirce’s theory of mind as a basic part of reality (in Firstness) existing in the material aspect of reality (in Secondness) as the inner aspect of matter, manifesting itself as awareness and experience in animals and finally as consciousness in humans. Peirce’s differentiation between the immediate object of semiosis and the dynamic object – that is, all we can get to know about it in time – is a differentiation between the object of the organism and the environment or universe outside it. Biosemiotics begins with the process of knowledge: how signification occurs within living systems, making perception and cognition possible.
Anthroposemiotics as Part of Biosemiotics
But biosemiotics does not only deal with animals in zoosemiotics; it also deals with signs in plants in phytosemiotics and with bacterial communication. According to one standard scheme for the broad classification of organisms, five super kingdoms are now distinguished: bacteria, protists (protozoa-like slime molds and primitive algae, all with a nucleus), plants, animals, and fungi. Thus the major classification categories in biosemiotics are bacteriosemiotics, protistosemiotics, phytosemiotics (Krampen, 1981), mycosemiotics, and zoosemiotics (Deely, 1990). Within zoosemiotics, anthroposemiotics encompasses the human race. There are two biosemiotic interpretations of anthroposemiotics. One is that it encompasses the traditional area of semiotics of language and culture plus the embodiment of human signification. The other, which leading biosemioticians share, is that it only deals with the human body and the biological parts of human cognition and communication. Going into the body of multicellular organisms, endosemiotics (T. von Uexküll et al., 1993) deals with communication between the cells in the body of all living systems, including human physiology. In the framework of endosemiotics, there is, for instance, a special area of immunosemiotics dealing with the immunological code, immunological memory, and recognition. The way in which, as we now know, the communicative codes of the nervous system, the hormone system, and the immunological system work on each other is considered to be the basis of the biological self: an endosemiotic self-organized cybernetic system with a homeostasis.
Peircian Biosemiotics
Modern Peircian biosemiotics is very different from the symbolic semiotics of human language that cyberneticians distanced themselves from many years ago. The theories of Heinz von Foerster on recursive functions in the nervous system, which establish perceptual objects as eigenfunctions of the recursive cognitive interplay between nervous system and environment, have supported von Uexküll’s older concept of object (Brier, 1996). Humberto Maturana and Francisco Varela’s concept and theory of autopoiesis, the cell as a self- and closure-organizing system recursively reproducing the closure and internal organization of living systems, have had a significant influence on the development of the Copenhagen school of biosemiotics (Brier, 1995). The interaction between the autopoiesis, the genome, and semiosis in an animal (here a small fish) as understood through biosemiotics can be modeled as shown in Figure 3. Peircian biosemiotics is distinct from other semiotic paradigms in that it not only deals with intentional signs of communication, but also encompasses nonintentional signs such as symptoms of the body and patterns of inanimate nature. Peircian semiotics breaks with the traditional dualistic epistemological problem of first-order science by framing its basic concept of cognition, signification, within a triadic semiotic philosophy. Triadic semiotics is integrated into a theory of continuity between mind and matter (Synechism) where the three basic categories (Firstness, Secondness, and Thirdness) are not only inside the perceiver’s mind, but also in the nature
Figure 3 Brier’s model showing two autopoietic systems (males) of the same species (gene pool) seeing the same sign in an object, creating the interpretant of a female of the same species. This occurs through the partially inherited structural coupling that ethology calls the innate releasing mechanism (IRM), which is tuned to anticipate certain differences as significant for survival and proliferation, i.e., as sign stimuli. The whole model is within one life form (naturalizing Wittgenstein’s concept), mating, which again generates the mating sign game or ground (Peirce). I have excluded here, for simplicity, the female’s point of view as a species-specific autopoietic system.
perceived. This is connected to the second important ontological belief in Peirce’s philosophy, namely Tychism, which sees chance and chaos as basic characteristics of Firstness. This is combined with an evolutionary theory of mind (Agapism), where mind has a tendency to form habits in nature. Chaos and chance are seen as a First, which is not to be explained further (for instance, by regularities). It is the basis of habit-forming and evolution. The chaos of Firstness is not seen as the lack of law, as it is in mechanicism and rationalism, but as something full of potential qualities to be manifested individually in Secondness and as general habits and knowledge in dynamic objects and semiosis in Thirdness. This is the deep foundation of Peirce’s pragmaticism (Brier, 2003).
Biosemiotics and Information in Computer and Physiosemiotics
The essential question for the current debate about the possibility of a transdisciplinary information/signification science is whether Peircian biosemiotics can comprise uninterpreted natural objects, dissipative structures, and other spontaneous generations of order and patterns in nature as signs. These objects were previously described in physical–chemical terms. Now some adherents of the paninformational paradigm want to explain them in purely
informational terms (Brier, 1992). From a Peircian point of view, these phenomena are proto-semiotic, or quasi-semiotic, when compared to the semiosis of living systems, because they are only displays of Secondness in the well-argued view of Winfried Nöth (2002). There is thus competition between the informational and the semiotic approaches in producing a new transdisciplinary framework that can unite the traditional views of nature held by the sciences with the new understandings of computers and cognition and, finally, with the social aspects of language and consciousness in communication. But some scholars are even willing to apply the sign concept to processes between nonliving entities in nature and machines: physiosemiotics. John Deely (1990) is one of the more prominent promoters of a Peircean view of semiotics as a transdisciplinary theory encompassing both the human mind and its text production, as seen from phenomenology and hermeneutics, and all of nature and life, as seen from a biosemiotic as well as a physiosemiotic viewpoint. The question here is not whether any natural thing can become a sign when placed in a meaningful context by a living system, but whether the objects and their processes are signs per se. It is interesting to see that semiotics has thus moved from the humanities into biology and from there even into the other natural sciences at the same time as the
formulation of objective informational concepts has been used as the basis of understanding all types of cognitive processes in animals, machines, humans, and organizations in the information processing paradigm. Information science is thus moving from computer science down into nature and up into cognitive systems, human intelligence, consciousness and social systems, and communication, in competition with semiotics, which is moving in the other direction (see Figure 4). Information theory is now an important part of the consciousness research program, but there is a great deal of work to do for serious philosophy, considering how many central philosophical topics of mind, language, epistemology, and metaphysics will be affected by the biosemiotic development. Peircian biosemiotics may contribute to a new transdisciplinary framework for understanding knowledge, consciousness, meaning, and communication. But to do this, new elements have to be integrated, making it possible to unite the functionalistic approaches to information and communication coming from cybernetics and computer science with the semantic pragmatic approaches coming from the linguistic point of view and semiotics, if we want to bridge this gap in our culture and knowledge. Concepts of closure, self-organization, and differentiation of biological, psychological, and social systems developed in second-order cybernetics and autopoiesis theory need to be integrated into theories of embodiment and Peircian biosemiotics.
Figure 4 The relevance of the bottom-up informational view and the top-down semiotic view in the area of the foundation of information science. On the left side is a hierarchy of sciences and their objects, from physics to humanities and vice versa. On the right is an illustration of the two most common scientific schemas for understanding and predicting communicative and organizational behavior: (1) the semiotic top-down paradigm of signification, cognition, and communication and (2) the informational bottom-up functionalistic view of organization, signal transmission, and AI. The width of the two paradigms in correlation with the various subject areas shows an estimate of how the relevance of the paradigm is generally considered, although both claim to encompass the entire spectrum.
Cyber(bio)semiotics
Søren Brier (2003) has developed such a framework for a philosophy of information, cognition, and communication science, which brings biosemiotics and information science as well as second-order cybernetics and autopoiesis into this transdisciplinary area, and which he calls Cybersemiotics. Peircean cybersemiotics is based on Peirce’s theory of mind as a basic part of reality (in Firstness) existing in the material aspect of reality (in Secondness) as the inner aspect of matter (hylozoism), manifesting itself as awareness and experience in animals and finally as consciousness in humans. Combined with a general systems theory of emergence, self-organization, and closure/autopoiesis, with a semiotized version of Luhmann’s triple autopoietic theory of communication (see Figure 1), and with pragmatic theories of embodied social meaning, it forms an explicit theory of how the inner world of an organism is constituted and, therefore, how first-person views are possible and are just as real as matter. Such a theory has been missing from the modern discussions of a science of consciousness. Through this foundation for semiosis, a theory of meaning and interpretation including mind (at least as immanent in nature) is possible, and cybernetic views of information as well as autopoietic views on structural couplings can be combined with pragmatic theories of language in the biosemiotic perspective. The term ‘proto- and quasi-semiotic objects’ recognizes that systems in nature and culture work with differences, often in the form of coding, instead of through either physical causality or meaningful semiosis. Systems of Secondness have established an information level above the energetic and causal level of nature. This area, delimited from a semiotic point of view, is part of what classical first-order cybernetics considers its subject area: goal-oriented machines and pattern-forming, self-organized processes in nature that are based on information.
The terms ‘informational,’ ‘coding,’ and ‘signal’ are used mainly in cybernetic contexts for these systems, before attempts, foreshadowed by Wiener, to create a
paninformational paradigm (Brier, 1992). In Peircean biosemiotic philosophy, these levels can be bound together by Synechism, Tychism, and Agapism, combined with an evolutionary view of the interactions between Firstness, Secondness, and Thirdness. The view of Firstness as a blend of qualities of mind and matter, containing qualia and living feeling and a tendency to form habits, is crucial for understanding the self-organizing capabilities of nature and how what seems to be dead matter can, through evolutionary self-organization, become autopoietic and alive with cognitive/semiotic and feeling abilities (Brier, 2003). To summarize, cybersemiotics develops a semiotic and informational theory accepting several levels of existence, such as a physical level and a conscious, social, linguistic level, now placed in the broader cybersemiotic framework that combines Peirce’s triadic semiotics with systemic and cybernetic views including autopoiesis and second-order cybernetics. When talking about reality, I think we should distinguish between the following levels:
1. The first level, of quantum vacuum fields with entangled causality, is not considered physically dead, as is usually the case in physicalistic physics. Cybersemiotics conceives it as a part of Firstness, which also holds qualia and pure feeling. Although physicists may be bothered by this new metaphysical understanding of this level of reality, they cannot claim that physics has a complete understanding of it and that there is therefore no room for new interpretations. On the contrary, this is one of the most mysterious levels of reality we have encountered. Its implications and its interaction with the observer’s consciousness have been discussed since the 1930s and were central in the disputes between Bohr and Einstein, and some researchers are now attempting to exploit entanglement to explain the possibility of teleportation.
2. The second level, of efficient causation, is clearly what Peirce describes as Secondness. This realm is ontologically dominated by physics as classical kinematics and thermodynamics. But for Peirce, it is also the willpower of the mind. It is mainly ruled by efficient causation. Thus Peircean cybersemiotics does not accept a level of pure mechanical physics; nor did Ilya Prigogine.
3. The third level, of information, is where formal causation manifests clearly and where regularities and Thirdness become crucial for interactions through stable patterns that are as yet only proto-semiotic. This level is ontologically dominated by the chemical sciences. This difference in ontological character may be one of the keys to understanding the differences between physics and chemistry. It is not only a matter of complexity but also of organization and type of predominant causality, which here is formal causation.
4. On the fourth level, where life has self-organized, the actual semiotic interactions emerge. They emerge first internally in multicellular organisms, as in endosemiotics, and then between organisms, as in sign games. This framework, based on biosemiotics, points out that the informational concept may be useful for analyzing life at the chemical level, but it is not sufficient to capture the communicative, dynamic organizational closure of living systems. This is one of the reasons why Maturana and Varela do not want to use the information concept in their explanations of the dynamics of life and cognition. But they do not use a semiotic concept either. Final causation dominates here, as in the next level, where it emerges as purpose.
5. Finally, on the fifth level, with syntactic language games, human self-consciousness emerges and with it rationality, logical thinking, and creative inferences (intelligence). Intelligence is closely connected to abduction and conscious finality. Abduction is crucial to signification. It is the ability to see something as a sign for something else. This something else has to be a habit of nature, mind, or society. Some kind of regularity or stability in nature that the mind can recognize as somewhat lawful is necessary for it to be a fairly stable eigenvalue in the mind (an interpretant) and be useful for conscious purposeful action and interaction in communication as well as in ethical social praxis (Phronesis).
See also: Barthes, Roland (1915–1980); Eco, Umberto (b. 1932); Information Theory; Jacobsen, Lis (1882–1961); Luhmann, Niklas (1927–1998); Morris, Charles (1901–1979); Peirce, Charles Sanders (1839–1914); Sebeok, Thomas Albert: Modeling Systems Theory; Semiology versus Semiotics.
Bibliography
Barbieri M (2001). The organic codes: the birth of semantic biology, PeQuod. Republished in 2003 as The organic codes: an introduction to semantic biology. Cambridge: Cambridge University Press.
Brier S (1992). ‘Information and consciousness: a critique of the mechanistic foundation of the concept of information.’ Cybernetics & Human Knowing 1(2/3), 71–94.
Brier S (1995). ‘Cyber-semiotics: on autopoiesis, code-duality and sign games in bio-semiotics.’ Cybernetics & Human Knowing 3(1), 3–14.
Brier S (1996). ‘From second order cybernetics to cybersemiotics: a semiotic reentry into the second order cybernetics of Heinz von Foerster.’ Systems Research 13(3), 229–244.
Brier S (1998). ‘The cybersemiotic explanation of the emergence of cognition: the explanation of cognition, signification and communication in a non-Cartesian cognitive biology.’ Evolution and Cognition 4(1), 90–102.
Brier S (1999). ‘Biosemiotics and the foundation of cybersemiotics. Reconceptualizing the insights of ethology, second order cybernetics and Peirce’s semiotics in biosemiotics to create a non-Cartesian information science.’ Semiotica 127(1/4), 169–198.
Brier S (2001). ‘Cybersemiotics and Umweltslehre.’ Semiotica 134(1/4), 779–814.
Brier S (2003). ‘The cybersemiotic model of communication: an evolutionary view on the threshold between semiosis and informational exchange.’ TripleC 1(1), 71–94. http://triplec.uti.at/articles/tripleC1(1)_Brier.pdf.
Deely J (1990). Basics of semiotics. Bloomington: Indiana University Press.
Emmeche C (1998). ‘Defining life as a semiotic phenomenon.’ Cybernetics & Human Knowing 5(1), 33–42.
Emmeche C & Hoffmeyer J (1991). ‘From language to nature: the semiotic metaphor in biology.’ Semiotica 84(1/2), 1–42.
Emmeche C, Kull K & Stjernfelt F (2002). Reading Hoffmeyer, rethinking biology. Tartu: Tartu University Press.
Hoffmeyer J (1996). Signs of meaning in the universe. Bloomington: Indiana University Press.
Hoffmeyer J & Emmeche C (1991). ‘Code-duality and the semiotics of nature.’ In Anderson M & Merrell F (eds.) On semiotic modeling. Berlin: Mouton de Gruyter. 117–166.
Krampen M (1981). ‘Phytosemiotics.’ Semiotica 36(3/4), 187–209.
Kull K (1999). ‘Biosemiotics in the twentieth century: a view from biology.’ Semiotica 127(1/4), 385–414.
Kull K (ed.) (2001). ‘Jakob von Uexküll: a paradigm for biology and semiotics.’ Semiotica 134(1/4), special issue, 1–60.
Nöth W (2002). ‘Semiotic Machine.’ Cybernetics and Human Knowing 9(1), 3–22.
Petrilli S & Ponzio A (2001). Thomas Sebeok and the signs of life. Icon Books.
Sebeok T A (1972). Perspectives in Zoosemiotics. The Hague: Mouton.
Sebeok T (1989). Sources in Semiotics VIII. The sign & its masters. New York: University Press of America.
Sebeok T A (1990). Essays in zoosemiotics. Toronto: Toronto Semiotic Circle.
Sebeok T A & Danesi M (2000). The forms of meaning: modeling systems theory and semiotic analysis. Berlin: Mouton de Gruyter.
Sebeok T A, Hoffmeyer J & Emmeche C (eds.) (1999). Biosemiotica. Berlin: Mouton de Gruyter.
Sebeok T A & Umiker-Sebeok J (eds.) (1980). Speaking of apes: a critical anthology of two-way communication with man. New York: Plenum Press.
Sebeok T A & Umiker-Sebeok J (eds.) (1992). Biosemiotics: the semiotic web 1991. Berlin: Mouton de Gruyter.
Sharov A (1998). ‘From cybernetics to semiotics in biology.’ Semiotica 120(3/4), 403–419.
Uexküll J von (1982). ‘The theory of meaning.’ Semiotica 42(1), 25–82.
Uexküll J von (1934). ‘A stroll through the worlds of animals and men. A picture book of invisible worlds.’ Reprinted in Schiller C H (ed.) (1957) Instinctive behavior. The development of a modern concept. New York: International Universities Press. 5–80.
Uexküll T von, Geigges W & Herrmann J M (1993). ‘Endosemiosis.’ Semiotica 96(1/2), 5–51.
Relevant Websites
http://www.ento.vt.edu – The international biosemiotics page.
http://www.nbi.dk – Gatherings in Biosemiotics.
http://www.zbi.ee – Jakob von Uexküll Centre.
http://www.zoosemiotics.helsinki.fi/ – Zoosemiotics home page.
http://triplec.uti.at – Brier’s article in TripleC.
Birdsong: a Key Model in Animal Communication
M Naguib, Universität Bielefeld, Bielefeld, Germany
K Riebel, Leiden University, Leiden, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.
Introduction
The melodious beauty and complexity of birdsong have long attracted amateurs, naturalists, and scientists alike. Despite the almost ubiquitous presence of birdsong in both natural and anthropogenous
environments, few people are aware that birdsong is one of the most elaborate acoustic communication systems in the animal kingdom. Birdsong shows some basic and almost unique similarities to human speech, an aspect that has attracted considerable interdisciplinary scientific attention from biologists, psychologists, and linguists. As in human speech acquisition, vocal learning by songbirds plays a prominent role in song development (Catchpole and Slater, 1995). There is a sensitive period in which the basic species-specific structure is acquired, in much
the same way that humans have to acquire the phonemes of their language in the first few years of life. The only other well-established examples of animal communication in which learning plays such a central role in signal acquisition are found in parrots, hummingbirds, bats, and marine mammals (Janik and Slater, 1997). Using birdsong as a model system allows us to study the complexity of animal behavior from both mechanistic and functional perspectives. Because it is the best studied vertebrate communication system on almost all levels of scientific investigation, from molecular biology to evolutionary ecology, birdsong development has become a textbook example for illustrating basic biological processes (Alcock, 2001; Campbell and Reece, 2001; Barnard, 2004). In most songbirds that breed in the temperate zones, only the males sing; their songs function to defend a territory against other males and to attract and stimulate females (Catchpole and Slater, 1995), but there is enormous variation in song structure and phenomenology, development, and delivery. The taxonomic order of perching birds (passerines) can be subdivided into two distinct groups: the oscines (over 4000 species), which in general learn their song, and the suboscines (about 1000 species), for which there is limited evidence that key structural components of the species-typical song are learned (Kroodsma, 2004). Many of the suboscines are tropical birds, and their song is often much simpler than the highly complex song common in oscine species. Songbirds in the tropics also differ from those in the temperate zones in how and when they sing: singing tends to occur all year round and often females also sing. Even more strikingly, mated pairs may combine their songs into highly coordinated duets (Hall, 2004). The speed and precision with which duets are coordinated in time result in a composite signal that, even for an experienced human listener, sounds like the song of a single individual.
This article will mainly focus on song by males in temperate zone passerines, as these are much better studied than tropical birds and are ideal to illustrate general principles of songbird vocal communication.
Birdsong versus Bird Calls
Birdsong is distinguished from the remainder of songbird vocalizations, which are generally referred to as calls. Calls have been defined based on both structural and functional criteria. Calls are given by both sexes, they are simple in structure, and in many cases they are highly context specific, such as begging calls or alarm calls (Marler, 2004). Unlike song, which is normally delivered only in the breeding season, calling occurs all year. Calls have long been thought to be affected little, if at all, by vocal learning.
However, with more studies addressing call learning, it has emerged that there is much more developmental plasticity than previously thought. Among the various calls given by birds, the alarm calls given in response to predators have received specific attention, because they can vary gradually with the urgency of the threat and even provide functionally referential information (see Alarm Calls), a trait that has long been viewed as specific to human language.
Singing Versatility
Birdsong structure and versatility vary enormously, from structurally simple songs with only one repeated element (e.g., grasshopper warblers, Locustella naevia) to highly complex songs (e.g., nightingales, Luscinia megarhynchos) in which each male sings around 200 different song types, each of which is composed of many different elements (Figure 1). For the purpose of comparative studies, it has proved useful to categorize birds into continuous and discontinuous singers (Hartshorne, 1973; Catchpole and Slater, 1995). Continuous singers such as reed warblers (Acrocephalus scirpaceus) produce long, almost continuous streams of elements (the basic units of vocal production). The elements in the song repertoire of a continuous singer are usually recombined in various ways, so that each new sequence is slightly different from the previous ones. Most male songbirds, however, are discontinuous singers, i.e., they alternate songs (which are a specific combination of song elements) with silent intervals (Figure 1). Among different species of discontinuous singers, there are two discrete singing styles. In some species, males repeat the same song type several times before switching to a song of a different type. Birds following this repetitive mode are generally said to be singing with ‘eventual variety.’ Examples are song sparrows (Melospiza melodia), yellowhammers (Emberiza citrinella), chaffinches (Fringilla coelebs), and great tits (Parus major). This way of singing is most characteristic of species in which males have a small to medium repertoire of different song types (i.e., a repertoire of 2 to 10 acoustically distinct songs per male). There are some exceptions to this rule, though; for example, Carolina wren (Thryothorus ludovicianus) males have a repertoire of about 40 distinctly different song types, but deliver their repertoire with eventual variety.
In other species, males hardly ever repeat the same song type in immediate succession but instead, after each song, switch to a different song type within their repertoire. This singing style is referred to as showing ‘immediate variety’ and is characteristic of species that have larger song repertoires, such as mockingbirds (Mimus polyglottos), European blackbirds (Turdus merula), or nightingales.
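The distinction between the two singing styles can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: a singing bout is represented as a list of song-type labels, the function names are hypothetical, and the threshold `min_mean_run` is an arbitrary value chosen for the example, not a criterion taken from the literature cited here.

```python
from itertools import groupby


def repetition_profile(song_sequence):
    """Return the lengths of consecutive runs of the same song type."""
    return [len(list(run)) for _, run in groupby(song_sequence)]


def singing_style(song_sequence, min_mean_run=2.0):
    """Label a bout as 'eventual variety' or 'immediate variety'.

    min_mean_run is an illustrative, arbitrary threshold (an assumption
    of this sketch): a bout whose song types are, on average, repeated
    at least this many times in a row counts as eventual variety.
    """
    runs = repetition_profile(song_sequence)
    if not runs:
        raise ValueError("empty song sequence")
    mean_run = sum(runs) / len(runs)
    return "eventual variety" if mean_run >= min_mean_run else "immediate variety"


# Hypothetical bouts: letters stand for acoustically distinct song types.
repetitive_bout = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
switching_bout = ["A", "B", "C", "D", "E", "A", "F", "G"]

print(singing_style(repetitive_bout))
print(singing_style(switching_bout))
```

A real analysis would first have to classify recorded songs into types, which is itself a nontrivial step; the sketch assumes that labeling has already been done.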
Figure 1 Sound spectrograms of 25-s singing sequences by males of five different species of songbirds. (A) Grasshopper warbler, Locustella naevia, (B) Carolina wren, Thryothorus ludovicianus, (C) song sparrow, Melospiza melodia, (D) yellowhammer, Emberiza citrinella, (E) nightingale, Luscinia megarhynchos; (B)–(D) show singers with eventual variety and (E) shows a species with immediate variety.
Song Development
Songbirds have an exceptional faculty for vocal learning (Figure 2). Song learning consists of a phase of acquisition (the sensory phase: memorization of song patterns) and a phase of production learning (the sensorimotor phase: learning of the complex motor pattern). The timing of these two processes during development varies across species, from tightly overlapping to completely separate in time. The acquisition process is often limited to a sensitive phase during the first year of life (which is the time to maturation in most songbird species), with no additional learning after the first breeding season (‘closed-ended learners’; e.g., chaffinches or zebra finches (Taeniopygia guttata)). In other species, learning might continue throughout life (‘open-ended learners’; e.g., canaries (Serinus canaria) or starlings (Sturnus vulgaris)). Often this entails repertoire size increasing with age.
Figure 2 Culturally transmitted song types in the zebra finch (Taeniopygia guttata). Columns show spectrograms of tutors’ songs in the top row (adult males w709 and o554, respectively) and their respective tutees. Young males were housed with their respective tutors throughout the sensitive phase for song learning (days 35–65 posthatching); as a result, songs of tutees resemble the song of their tutor and each other more than do those of full brothers.
Figure 3 An example of changes in one song motif in the course of ontogeny in a chaffinch, Fringilla coelebs. The crystallized song type was also in the final song type repertoire of this individual (illustrated in Figure 4, tutee song type 2).
Song acquisition learning seems to take place during a sensitive phase without apparent external reinforcement (‘channeled’ or ‘pre-programmed’ learning). Unlearned biases (varying in their specificity across species) guide what types of vocalizations are preferentially learned. Generally, the first auditory memories are laid down during the first weeks of life, often around the time when the young birds fledge from the nest, and the sensory learning phase precedes the motor learning phase. In seasonal species, the motor learning phase might not begin until months after the offspring heard adult birds sing. Early singing consists of quiet, amorphous warbling (subsong) that proceeds to more structured and phonologically varied song (plastic song). Whereas these first two phases may take several weeks, the last transition, to the fully crystallized song, often occurs rather rapidly, within a few days. After that, phonology, phonological syntax, and timing are fully those of adult song (Figures 3 and 4). The onset of motor practice and song crystallization correlates with changes in steroid hormone levels, which are triggered by photoperiod in temperate zones but exhibit less clear circannual patterns in tropical nonseasonal species. Where song
and testosterone titers are seasonal, a brief phase of subsong is observed before the onset of the breeding season even in adult birds. In the sensorimotor model of song learning, a crude early template sets the sensory predispositions that filter the types of acoustic stimuli that are laid down as specific song memories (the ‘template’) during the sensory learning phase. The template is adjusted
by learning and plays an important role in the development of full song in the subsequent sensorimotor phase. This is in line with observations that song developed by young birds deprived of adult song tutors contains species-specific characteristics (a song-deprived nightingale sounds different from a song-deprived starling) but lacks the fine detail of normal adult song. With the onset of the motor learning phase, auditory feedback is crucial to adjust the song output until it matches the template. Interrupting the auditory feedback by masking it with noise or by blocking the central nervous connections, thus making the bird unable to hear its own song, will result in the development of song that is even more impoverished than the song of isolate-raised birds. The original model of song learning has been updated and altered over the years, but both behavioral and neurobiological findings seem to support the principal underlying idea of a two-phase process (sensory and sensorimotor learning phases), and this still serves well as a description of the basic pattern observed in many species. One consequence of vocal learning is increased interindividual and geographic variation arising from imprecise song copying (see individual w83 in Figure 2 and differences between tutor’s and tutee’s songs in Figure 4). As in human speech, birds can have local dialects that are discretely different from
other dialects in the same species, with clear-cut dialect boundaries (see Dialects in Birdsongs). Population changes in time and space have been relatively well studied due to short avian generation times, and cultural changes in song can be easily observed and documented. Songbirds thus provide an important study system for nonhuman gene-culture co-evolution studies and for diachronic and geographic change such as dialect formation.
Figure 4 Four song types were played on tape to young fledgling chaffinches (tape tutor). The final repertoire of one of the respective tutees in the subsequent spring is shown (for song development, see Figure 3). Redrawn from Riebel K and Slater P J B (1999), Ibis 141, 680–683.
Development of Vocalizations in Non-oscine Birds
In contrast to the extensive vocal learning process in most songbirds, their closest relatives, the suboscines, seem to be able to develop species-specific song even when deprived of adult song or auditory feedback, although vocal learning has now also been shown to occur in some suboscines. Vocal learning in songbirds seems to have evolved independently several times and has also been reported for at least two other avian orders, parrots (Psittacidae) and hummingbirds (Trochilidae). Vocal learning has been little investigated in other avian taxa and may be even more widespread than reported (Kroodsma, 2004). Developmental changes during maturation also occur in taxa not described as vocal learners. For example, specialized juvenile vocalizations (such as begging calls) may disappear from the vocal repertoire, or the characteristics of the vocal tract may change during growth. An analogue to human ‘voice breaking’ has been described in a number of bird species. Even in bird species in which the sexes do not exhibit substantial morphological differences, adult males and females often show consistent differences in acoustic parameters such as fundamental frequencies and harmonic composition. These differences often seem to come about rather suddenly during sub-adult development and possibly coincide with steroid hormone-induced changes of the vocal tract (Ballintijn and ten Cate, 1997).
Song Production
Birds have a larynx located at the top of their trachea, but vocalize with the aid of a specialized organ, the syrinx, located much lower down where the two bronchi join to form the trachea (see Vocal Production in Birds). The tonal character of many bird vocalizations and the existence of a unique sound-producing organ have triggered a wealth of hypotheses as to possible fundamental differences in sound production mechanisms between birds and mammals. Recent findings suggest that the basic mechanism is the same: cyclic opening and closing of the gap between the vocal membranes leads to harmonic sound at the source, which undergoes filtering by the vocal tract. However, whereas a larynx consists of only one pair of vocal folds, there are two sets (one in each bronchus) of each of the several pairs of membranes involved in birdsong production (Goller and Larsen, 2002). The two halves of the syrinx are innervated
independently, creating two potential sound sources that can, within certain limits, be operated independently. In most songbirds, one side of the syrinx seems dominant over the other, and this lateral dominance might even differ from syllable to syllable and even within a syllable. As in mammal sound production, including human speech, the settings of the songbird vocal tract act as a vocal filter and movements of the neck, tongue, and beak contribute to changes in resonance properties.
Neurobiological Correlates of Singing and Song Learning
Songbird brains show special adaptations for the production and acquisition of song (Figure 5). A number of interconnected brain nuclei (the ‘song system’) are absent in non-vocal-learning bird species and are sexually dimorphic in those species in which producing song is a behavioral dimorphism. The brain areas involved are highly specialized and easy to distinguish from surrounding brain tissue using standard tissue staining techniques. Two main pathways are involved in sound production (Figure 5A). The posterior (or motor) pathway descends from cerebral areas to control the syrinx via the hypoglossal nerve (XII); two cerebral regions (HVC and RA; see Figure 5 for abbreviations) show neuronal activation synchronized with singing. The anterior pathway plays an important role in song learning, and lesions in either Area X or MAN in young birds disrupt song acquisition; such lesions do not affect singing in adult birds. The well-delineated sensitive phases of sensory learning in many songbird species allow controlled experimental assessment of the quantity and quality of the sensory input. Avian song learning is thus a prime model to study the neurobiological basis of vocal learning and adult neuronal plasticity (seasonal changes, neurogenesis). Insights from neurophysiology and anatomy and from studies on effects of differential gene expression mediating neuroanatomical and functional change have greatly advanced our understanding of the subtle neuroanatomical changes involved in learning (Jarvis, 2004).
Sex Differences
The avian song system has provided examples of the most extreme sex differences in functional brain anatomy in vertebrates documented so far. The pronounced sex differences in the song system and the pronounced seasonal changes (up to threefold) in the neuronal number and volume of the song nuclei (Tramontin and Brenowitz, 2000) provide interesting insights into the role of steroid hormones in neuronal
Figure 5 Song system. Schematic drawings of a parasagittal section of the songbird brain. Abbreviations are based on the revised nomenclature of Reiner et al. (2004), Journal of Comparative Neurology 473, 377–414: CMM, caudal medial mesopallium (former: caudal medial hyperstriatum ventrale, CMHV); DLM, medial part of the dorsolateral thalamus; HVC, high vocal center; L, Field L2; LaM, lamina mesopallialis (former: lamina hyperstriatica, LH); MAN, magnocellular nucleus of the anterior nidopallium; MLd, mesencephalic lateral dorsal nucleus (dashed lines indicate the nucleus is located more medially than the illustrated section); NCM, caudal medial nidopallium; nXIIts, nucleus hypoglossalis partis tracheosyringalis; Ov, nucleus ovoidalis; RA, magnocellular nucleus of the arcopallium; V, lateral ventricle. (A) Anterior and posterior pathway. Arrows connect nuclei of the conventional ‘song system’ that consists of the posterior (motor) pathway and the anterior forebrain pathway. Two main pathways are involved in sound production and learning; the posterior (motor) pathway is activated during singing and descends from the HVC (pallium): HVC → RA → nXIIts → syrinx. The anterior pathway, HVC → Area X → DLM → MAN → RA, is involved in vocal learning. (B) Auditory pathway: input from cochlea via auditory nerve (VIII) and brain stem nuclei (not shown) → MLd (mesencephalon) → Ov (in the thalamus) → L (with primary and secondary auditory cells of the pallium) → tertiary auditory areas of the nidopallium (NCM, CMM, HVC shelf, and RA cup). From the HVC shelf there is also a descending pathway via the RA cup to the auditory regions of the midbrain. The gray areas show neuronal activation when the bird is exposed to conspecific song. Figure kindly provided by Terpstra N and Brittijn M (2004), Journal of Neurosciences 24, 4971–4977.
development and differentiation. Large differences between closely related species, ranging from species in which females have never been observed to sing to those in which females sing as much as males, provide excellent opportunities for comparative studies in neuroethology (Brenowitz, 1997; MacDougall-Shackleton and Ball, 1999). They also provide a prime model for the study of hormonal and genetic effects in gender differentiation. When song is sexually dimorphic, it is possible to differentiate production and perception learning and to identify specialized adaptations of the brain. In a cross-species comparison across 20 or so species, sex differences in the neuronal song system were found to be correlated with sex differences in song output and repertoire size (MacDougall-Shackleton and Ball, 1999). However, it is unclear whether this is due to sex differences in song output or to vocal learning. Most studies so far have been based on sex differences related to quantity and quality of adult song output and not to song learning (Gahr et al., 1998). Though many species show clear sex differences in song usage, there have been few studies investigating female vocal learning abilities, but this is a rapidly growing field of research (Riebel, 2003). Evidence is quickly accumulating that early learning greatly influences adult female song and its perception. Future studies will thus have to show whether females differ from males in when and what they learn, or only in how much they sing.
Hearing and Perception
The Avian Ear and the Auditory Pathway
For any communication system, the study of physical properties of signals and their production needs to be paralleled by the study of the corresponding receptors. Bird ears are similar to mammal ears in many respects, but differ in a number of key features. The outer ear lacks an external pinna, its opening in the skull is covered by feathers, and there is only a single middle ear bone (the columella). Moreover, the basilar papilla is straight rather than coiled and shows a greater diversity of sensory hair cell types compared to mammal ears (Causey Whittow, 2000). These differences might explain why the range of audible frequencies seems little curtailed despite the remarkably short basilar papilla, which is only about 2–3 mm long (compared to up to 9 mm in owls and 30 mm in humans). Despite these differences, in general, birds’ ears work like those of mammals. Sound waves set the membrane separating the inner from the outer ear vibrating. This motion is transmitted via the columella to the fluid of the inner ear. The pressure changes and motions within the fluid excite the hair cells on the sensory epithelium; the hair cells act as transducers, leading to sound-specific patterns of discharge in the auditory nerve (nVIIIth). From nVIIIth, the auditory pathway (Figure 5B) continues, ascending via a number of nuclei in the brain stem, the mesencephalon, and the thalamus (ovoidalis) to primary and secondary auditory cells of the pallium. From there, auditory information is transmitted to tertiary auditory areas of the nidopallium. Thus, in line with songbirds’ sensory learning abilities, there is a full ascending sensory pathway to higher forebrain regions.
Hearing Range and Perception
The hearing ranges of birds have been determined using both electrophysiological methods (recording neuronal activities on sound playback) and behavioral methods (training birds to indicate behaviorally whether they can discriminate between two sounds). Bird hearing is remarkably acute both in the low- and high-frequency ranges, despite the short basilar papilla. Audiograms show species-specific peaks and troughs, with specialists such as night-hunting owls showing higher sensitivities. Inspection of avian audibility curves reveals no ultra- or infrasonic hearing (Figure 6). Though birds might hear from roughly 0.5 to 10 kHz, they generally hear best between 1 and 6 kHz, with absolute sensitivity approaching 0–10 dB SPL at the most sensitive frequency, which is usually at around 2–3 kHz (Dooling, 2004). Generally, the sounds that birds produce map
Figure 6 Avian and human audibility curves. Owls (Strigiformes) have a higher sensitivity compared to an average songbird and to humans. Redrawn from Dooling R J et al. (2000), in Dooling R J, Fay R R, and Popper A N (eds.) Comparative hearing: birds and reptiles, 308–359, New York: Springer Verlag; and Dooling (2004).
well onto the frequency range of their most sensitive hearing. Despite their small head size, songbirds also show good directional hearing. Instead of integrating the information of directionality using differential arrival times of a sound at both ears, songbirds’ ears are connected via the air cavities in the skull bones so that sound is incident on the inner surface of the tympanic membrane at the opposite ear. Two different pressures build up on either side of the membranes; by moving its head until the two pressures are equalized, the bird localizes the sound. The magnitude of spatial masking release is similar to that in humans (10–15 dB, with tone and masking noise 90° apart). Masking effects of noise are frequency specific and strongest when overlapping with the actual signal (Klump, 1996). Compared to humans, birds do less well in detecting changes in intensity, but when discriminating between complex sounds, birds demonstrate fine temporal resolution, exceeding that of humans. However, birds’ perception also shows some interesting parallels with human abilities, and there is good experimental evidence for auditory stream analysis (filtering of auditory objects from general background noise) and categorical perception (both for avian and non-avian vocalizations; i.e., birds show categorical perception of human phonemes). Birds also superficially show complex serial pattern recognition (for example, in the discrimination of musical tunes), but use different strategies for categorization than humans. Unlike humans, who focus on differences in relative pitch, in bird species tested so far, absolute pitch and absolute frequency range were more important in classification of complex sounds.
Development of Hearing and Perception
The development of hearing and perception has not been widely studied, compared to song production. However, even in species not known as vocal learners, perception is modulated by experiences during development.
In ducklings (Anas platyrhynchos), preferences for and recognition of the species-specific maternal call are greatly impaired in birds that are deprived of hearing their mother’s and their own calls while still in the egg (Gottlieb, 1978). Development and learning are of even greater impact when complex vocalizations, such as the learned songs in songbirds, are concerned. During the sensorimotor learning phase, auditory neurons develop specific responsiveness to elements of, first, the tutor’s song and, later, the bird’s own song. Song discrimination abilities are impaired in both males and females if they are deprived of species-specific song during development, suggesting that the fine tuning of song perception also depends on early
experiences in non-singing females. Moreover, evidence is accumulating that female preferences for specific variants of conspecific songs are also greatly influenced by social learning processes (Riebel, 2003).
Evolution and Functions of Birdsong
Functions of Birdsong
So far we have dealt with the proximate causation of song: its development, control and perception. But why do birds sing? And what kind of information do they signal and extract from a song that they hear? It is well documented that birdsong is an advertisement signal with a dual function: territory defense and mate attraction. However, the precise functions of song can differ among species. Moreover, within species, the function of song may differ with time of day or season and it may differ depending on how birds sing, i.e., which song patterns they sing and how they use them when interacting with each other. Song encodes information about the singer and such information can be relevant for other males and females. Nevertheless, females and males may attend to different aspects of song so that, even though song may be addressed to both sexes, the specific traits that are used to assess a singer may differ, depending on which sex is listening.
Birdsong as a Long-Range Signal
Unlike human speech, birdsong, in common with other advertisement signals in the animal kingdom, is used as a long-range signal, often over 100 or more meters. During transmission through the environment, acoustic signals inevitably attenuate and degrade (Wiley and Richards, 1982; Slabbekoorn, 2004) (Figure 7). Thus the structure of a song at the position at which a receiver makes a decision differs from its structure at its source. The nature of these environmentally induced changes in a song depends on habitat structure and weather conditions. The differences in the acoustic properties of a given habitat are of evolutionary significance, and certain signal structures will be more effective than others in long-range communication. As a consequence, songbirds in forests sing differently from those that live in open areas, such as woodlands or fields. The reflecting surfaces of the vegetation in forests are the main cause of sound degradation (signal reverberation); in contrast, open habitats cause negligible reverberation. Rapid repetitions of elements with the same frequency structure, i.e., trills, are particularly susceptible to being blurred by reverberation. Indeed, birds in closed habitats have been found to sing trills with slower repetition rates compared to birds in
Figure 7 Undegraded and degraded sound spectrograms and oscillograms of a chaffinch song. Upper panel: song as recorded from a singing male within a distance of 10 m is undegraded. Lower panel: song as recorded at a distance of 40 m in a deciduous forest. Here the oscillogram (top) and spectrogram (bottom) show temporal smearing of the sound.
open habitats. Because vegetation also causes additional attenuation of sound, and specifically of the higher frequencies, there should be strong selection to avoid higher frequencies for long-range communication in forests. Empirical findings show that birds in open habitats use, on average, more high frequencies than do birds in closed habitats. However, the environmental effects on song transmission not only mask information coded in the song but also provide additional relevant information. Degradation and attenuation with distance are to some extent predictable, so that birds, like humans, have been shown to use cues from degradation and attenuation as distance cues (Figure 8). This can be crucial for an effective defense of large territories against rival males (Naguib and Wiley, 2001). Because they can assess the distance to a singing rival, males need only invest time and energy in repelling a rival that is nearby and therefore is a likely threat; energy need not be wasted when the rival is far away and beyond the territorial boundary.
Territorial Function and Communication among Males
Figure 8 Response scores of Carolina wrens to playback of undegraded (clear) song and song with added distance cues. Scores on the principal component (shown on the Y axis) indicate strength of response. Birds, like humans, use reverberation and high-frequency attenuation as separate cues to distance. Reproduced from Naguib M (1995), Animal Behaviour 50, 1297–1307.
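As a rough illustration of why attenuation with distance is predictable enough to serve as a distance cue, the following sketch computes the drop in sound level expected from spherical spreading alone (about 6 dB per doubling of distance) and adds an optional linear excess-attenuation term standing in for vegetation and atmospheric absorption. The function names, the 90 dB source level, and the excess coefficient are hypothetical choices made for illustration and are not taken from the studies cited here; real habitats add frequency-dependent and reverberant effects that this simplification ignores.

```python
import math


def spreading_loss_db(distance_m, reference_m=1.0):
    """Level drop (dB) due to spherical spreading between reference_m and distance_m."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(distance_m / reference_m)


def received_level_db(source_level_db, distance_m, excess_db_per_m=0.0, reference_m=1.0):
    """Received level: source level minus spreading loss minus a linear excess term.

    excess_db_per_m is an assumed, habitat-dependent coefficient (hypothetical).
    """
    extra_path_m = max(distance_m - reference_m, 0.0)
    excess = excess_db_per_m * extra_path_m
    return source_level_db - spreading_loss_db(distance_m, reference_m) - excess


# Hypothetical source level of 90 dB at 1 m, evaluated at the two recording
# distances mentioned for Figure 7 (10 m and 40 m), without and with excess loss.
for d in (10.0, 40.0):
    open_level = received_level_db(90.0, d)
    forest_level = received_level_db(90.0, d, excess_db_per_m=0.05)
    print(f"{d:5.0f} m  spreading only: {open_level:5.1f} dB  with excess: {forest_level:5.1f} dB")
```

Because the spreading term depends only on distance, a listener that knows the typical source level could in principle invert it to estimate range; the excess term shows why the same song is quieter at a given distance in denser habitat.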
Song in most passerine birds is used as a territorial signal, i.e., to advertise an area that will be defended against rival males. In a classic study on the territorial function of birdsong, Krebs and colleagues (Krebs, 1977) removed male great tits from their territories and installed loudspeakers that played recorded conspecific song, played a control sound, or broadcast no sound (Figure 9). Territories in which no song or the control sound (a tune on a tin whistle) was broadcast were occupied by new males earlier than territories in which conspecific songs were broadcast. This and subsequent experiments provided convincing evidence that male song keeps out rival males. Moreover,
Figure 9 Schematic representation of a classic experiment on the territorial function of song in great tits. Males were removed from their territory and were replaced by loudspeakers either playing great tit songs (‘experimental’) or playing back a control stimulus, or no stimulus was broadcast. Shaded areas on the right indicate re-occupation of the territory by other males after 8 or 10 hours. Males settle only in those areas (‘control silent’, ‘control sound’) in which no great tit songs were broadcast. Redrawn from Krebs J R and Davies N B (1992), An introduction to behavioural ecology, Oxford: Blackwell Scientific.
playback experiments in the field and observations of undisturbed singing in different contexts have shown that males obtain important information from a rival’s song, on which they base their decision about how to respond to that rival. As in all social behavior, individual-specific information is of central relevance when repeated encounters occur. Birds can use such information to distinguish between familiar and unfamiliar individuals. Moreover, males not only discriminate between neighbors and strangers but also become more aggressive when they hear their neighbor’s song from the opposite side of their territory (Figure 10). Thus, information on familiarity with song is linked to the location from which it is usually heard. The reduced response to a neighbor’s song when received from the ‘correct’ direction is termed the ‘dear enemy effect’ (Stoddard, 1996). Neighbors are rivals in competition for space and matings, but, once a relation is established, neighboring males benefit by reduced aggression toward each other. In addition, neighbors also can act as an early warning system when a stranger starts singing somewhere
in the territorial neighborhood, an issue that has received specific attention in studies using birdsong as a model in investigating communication networks (Naguib, 2005; Peake, 2005). During territorial conflicts, males can signal their readiness to escalate a contest by a range of different singing strategies. There is variation within and among species as to which strategy has which signal value (Todt and Naguib, 2000). Males may time their songs during an interaction so that they overlap songs of their opponent. In almost all species studied to date, song overlapping is used and perceived as an agonistic signal. Another way of agonistically addressing a rival is to match his song type, i.e., to reply with the same song pattern the rival has just sung. Song rate and the rate of switching among different song types can likewise signal changing levels of arousal. In barn swallows (Hirundo rustica), the structure of the song can be correlated with levels of testosterone (Figure 11), and thus song may be used as a predictor of fighting vigor. The importance of song in territory defense also may vary with time of the season and
Figure 10 Response strength of male territorial song sparrows to playback, at different locations in their territory, of song of neighbors and strangers. Neighbor/stranger discrimination usually occurs only at the boundary toward the territory of the neighbor whose song is broadcast. At the center of a territory or at the opposite boundary, no discrimination is observed, suggesting that intrusions at these sites are assessed as equally threatening regardless of the identity of the intruder. Redrawn from Stoddard P K et al. (1991), Behavioral Ecology and Sociobiology 29, 211–215.
Figure 11 Relation between levels of plasma testosterone and number of impulses per rattle in barn swallow songs. Males with more impulses in the rattles of their song had higher testosterone levels, suggesting that song codes information on the physiological state of the singer. Redrawn from Galeotti P et al. (1997), Animal Behaviour 53, 687–700.
with time of the day. The dawn chorus, for instance, a marked peak of singing activity early in the morning in many temperate-zone songbirds, has a specific function in territory defense in some species (Staicer et al., 1996).
Function in Mate Attraction
Song provides information on male motivation and quality and there is now good evidence that females
Figure 12 Nocturnal singing activity of male nightingales. Bars indicate the period of the breeding cycle when males sing at night. Males cease nocturnal song after pairing but resume it when their females lay eggs. Males that remain unpaired (‘bachelors’) continue nocturnal song throughout the entire breeding season (bachelors, N = 12; mated males, N = 18). Modified from Amrhein et al. (2002), Animal Behaviour 64, 939–944.
use this information for pairing and mating decisions. Females may choose a male partner on the basis of his song and, once paired, still mate additionally with other males with more attractive song, in so-called extra-pair copulations. There are two lines of evidence showing the function of song in female choice. Field studies have shown that song traits are linked to mating success and to paternity, and laboratory studies have shown that females are more responsive to specific song traits. In many bird species, males change their singing behavior after pairing, suggesting that the function of song differs between the period of mate attraction and the period thereafter. Many warblers show a marked decrease in singing activity after pairing, and nocturnally singing birds such as the nightingale cease nocturnal song the day after a female has settled within their territory (Amrhein et al., 2002) (Figure 12). Sedge warbler (Acrocephalus schoenobaenus) males become paired earlier when they have large vocal repertoires (Figure 13), suggesting that repertoire size is a trait used by females in mating decisions. Great reed warbler (Acrocephalus arundinaceus) females exhibit more display behavior in response to complex songs than to simple ones (Figure 14) and have been shown to copulate only with those neighboring males that have a song repertoire larger than their social mate has (Figure 15). Dusky warblers (Phylloscopus fuscatus) that produce song elements at a higher relative amplitude gain more extra-pair matings than do
Figure 13 Pairing and song repertoire in sedge warblers. Males with larger song repertoires pair earlier, suggesting that song repertoire is used in female choice. Redrawn from Catchpole C (1980), Behaviour 74, 149–166.
males that sing their elements ‘less well’. Furthermore, studies have shown that males usually increase their song output when their mate disappears or is removed experimentally. In addition, studies under controlled laboratory conditions have shown that females show strong preferences for specific song traits. Females show more copulation solicitation displays (a specific posture females use to elicit copulations) when hearing large song repertoires than when hearing smaller, less complex song repertoires, as in great reed warblers (Figure 14). In canaries, a substructure of the song, a complex syllable category (a trill), has been identified as a ‘sexy syllable’ to which females pay specific attention. More recent studies have used operant techniques in which females were allowed to peck a key to release playback of songs of different complexity, and with this technique it is possible to test female preference for song in more detail (Riebel, 2003).
Comparison to Human Speech
Both human speech and birdsong consist of finite sets of smaller units (humans: phonemes; birds: elements or syllables) that are arranged by a species-specific combinatorial system into larger units (humans: words and sentences; birds: phrases and songs). Despite the very different functions fulfilled by birdsong (territorial and mate-attracting signal) and human speech (physical carrier of human language), there are many parallels. Both types of communication are acquired by a form of channeled social learning, whereby some sounds are more likely to be copied than others. Learning of speech by humans and song by birds takes place without obvious external reward, occurs at specific phases during development, and relies on auditory feedback and a prolonged phase of motor learning (birds:
Figure 14 Female copulation solicitation displays in response to playback of songs of different complexity in great reed warblers. Female displays last longer in response to larger song repertoires, suggesting that males with larger song repertoires are more attractive. Redrawn from Catchpole C et al. (1986), Ethology 73, 69–77.
subsong; humans: babbling). As in the (prelinguistic) acquisition of phonemes in humans, in birds a sensory learning phase precedes the first production attempts. Babbling babies, like young birds, undergo a long phase of motor practice during which initial phonological (over)production moves toward producing phonological units that become more and more similar to the phonologies that are heard. In human speech acquisition, learning to produce the phonetic units precedes the mapping of meaning onto these units. It is thus in the acquisition of auditory memories and in the first (prelinguistic) phase of motor learning that birdsong and speech development can perhaps best be seen as an analogue. Next to similarities on the behavioral level, highly specialized brain regions control vocal learning, memory, production, and perception, both in human speech and in birdsong. Songbirds’ vocal learning ability is mirrored in highly specialized forebrain areas solely dedicated to the acquisition and perception of vocalizations and to the control of the complex motor patterns underlying song. Both song and speech acquisition have sensitive periods during which learning is greatly enhanced and sensory experience leads to learned representations guiding vocal output via complex feedback mechanisms. Increasing experience and sub-adult hormonal changes later slow down or stop further acquisition learning. These similarities of the acquisition of vocal units suggest that similar neural mechanisms might underlie vocal learning in birds and in humans. In line with this, studies on functional morphology of the bird brain now suggest that avian forebrain areas are functionally much more equivalent to mammalian forebrain areas than previously thought. Moreover, central and peripheral control of both song and speech show lateralization, which is a clear indication of evolutionarily highly derived systems. Birdsong phonology is often highly complex and can show
more than one hierarchical level (elements show specific associations resulting in phrases and songs; these higher units will also show nonrandom sequential organization). However, in birdsong there is no evidence of the recursiveness (the embedding of units of the same hierarchical category within each other) that is found in human language. Moreover, alternative combinations of units do not normally create different semantic meanings, except in a very limited sense: different combinations of units may code for general information such as species, age, gender, and motivation. Song is thus best seen as an analogue to human speech (not language) and to nonverbal aspects of acoustic communication. In this respect, it is a valuable model for comparative studies (Doupe and Kuhl, 1999; Hauser et al., 2002) on mechanisms (behavioral, neurobiological, gene regulatory, and hormonal) as well as on the evolution of vocal learning (Fitch, 2000). In addition, how learning processes affect diachronic change and geographical variation of signaling provides interesting opportunities for comparative research into gene-culture co-evolutionary processes (see Dialects in Birdsongs).
Figure 15 Relation between male song repertoire and survival of their offspring in great reed warblers. Offspring survival (measured as recruits to the next year’s breeding population) is positively related to the father’s song repertoire size, suggesting that song repertoire is an indicator of male quality. Data from Hasselquist et al. (1996), Nature 381, 229–232.
See also: Alarm Calls; Animal Communication: Deception and Honest Signaling; Animal Communication: Dialogues; Animal Communication: Long-Distance Signaling; Animal Communication Networks; Animal Communication: Overview; Animal Communication: Parent–Offspring; Animal Communication: Signal Detection; Animal Communication: Vocal Learning; Communication in Grey Parrots; Communication in Marine Mammals; Development of Communication in Animals; Dialects in Birdsongs; Fish Communication; Frog and Toad Communication; Individual Recognition in Animal Species; Insect Communication; Non-human Primate Communication; Traditions in Animals; Vocal Production in Birds.
Bibliography
Alcock J (2001). Animal Behavior (7th edn.). Sinauer Associates, USA. Amrhein V, Korner P & Naguib M (2002). ‘Nocturnal and diurnal singing activity in the nightingale: correlations with mating status and breeding cycle.’ Animal Behaviour 64, 939–944. Ballintijn M R & ten Cate C (1997). ‘Sex differences in the vocalizations and syrinx of the collared dove (Streptopelia decaocto).’ The Auk 114, 445–479. Barnard C (2004). Animal behaviour: mechanisms, development, function and evolution. Harlow: Pearson, Prentice Hall. Brenowitz E A (1997). ‘Comparative approaches to the avian song system.’ Journal of Neurobiology 33, 517–531. Campbell N A & Reece J B (2001). Biology. San Francisco: Benjamin-Cummings. Catchpole C & Slater P J B (1995). Bird song: biological themes and variations. Cambridge: Cambridge University Press. Causey Whittow G (2000). Sturkie’s avian physiology (5th edn.). San Diego: Academic Press. Dooling R (2004). ‘Audition: can birds hear everything they sing?’ In Marler P & Slabbekoorn H (eds.) Nature’s music: the science of birdsong. San Diego: Elsevier Academic Press. 207–225. Doupe A J & Kuhl P K (1999). ‘Birdsong and human speech: common themes and mechanisms.’ Annual Reviews in Neurosciences 22, 567–631. Fitch W T (2000). ‘The evolution of speech: a comparative review.’ Trends in Cognitive Sciences 4, 258–267. Gahr M, Sonnenschein E & Wickler W (1998). ‘Sex differences in the size of the neural song control regions in a duetting songbird with similar song repertoire size of males and females.’ Journal of Neuroscience 18, 1124–1131. Goller F & Larsen O N (2002). ‘New perspectives on mechanisms of sound generation in songbirds.’ Journal of Comparative Physiology A 188, 841–850. Gottlieb G (1978). ‘Development of species identification in ducklings IV: Changes in species-specific perception caused by auditory deprivation.’ Journal of Comparative and Physiological Psychology 92, 375–387. Hall M L (2004). 
‘A review of hypotheses for the functions of avian duetting.’ Behavioral Ecology and Sociobiology 55, 415–430. Hartshorne C (1973). Born to sing. Bloomington: Indiana University Press. Hauser M D, Chomsky N & Fitch W T (2002). ‘The faculty of language: what is it, who has it, and how did it evolve?’ Science 298, 1569–1579. Janik V M & Slater P J B (1997). ‘Vocal learning in mammals.’ Advances in the Study of Behaviour 26, 59–99.
Jarvis E D (2004). ‘Brains and birdsong.’ In Marler P & Slabbekoorn H (eds.) Nature’s music: the science of birdsong. San Diego: Elsevier Academic Press. 226–271. Klump G (1996). ‘Bird communication in the noisy world.’ In Kroodsma D E & Miller E H (eds.) Ecology and evolution of acoustic communication in birds. Ithaca, New York: Cornell University Press. 321–338. Krebs J R (1977). ‘Song and territory in the great tit Parus major.’ In Stonehouse B & Perrins C (eds.) Evolutionary ecology. London: Macmillan. 47–62. Kroodsma D E (2004). ‘The diversity and plasticity of birdsong.’ In Marler P & Slabbekoorn H (eds.) Nature’s music: the science of birdsong. San Diego: Elsevier Academic Press. 108–131. MacDougall-Shackleton S A & Ball G F (1999). ‘Comparative studies of sex differences in the song-control system of songbirds.’ Trends in Neurosciences 22, 432–436. Marler P I E B (2004). ‘Bird calls: a cornucopia for communication.’ In Marler P & Slabbekoorn H (eds.) Nature’s music: the science of birdsong. San Diego: Elsevier Academic Press. 132–177. Naguib M (2005). ‘Singing interactions in song birds: implications for social relations, territoriality and territorial settlement.’ In McGregor P K (ed.) Communication networks. Cambridge: Cambridge University Press. 300–319. Naguib M & Wiley R H (2001). ‘Estimating the distance to a source of sound: mechanisms and adaptations for long-range communication.’ Animal Behaviour 62, 825–837.
Peake T M (2005). ‘Communication networks.’ In McGregor P K (ed.) Communication networks. Cambridge: Cambridge University Press. Riebel K (2003). ‘The “mute” sex revisited: vocal production and perception learning in female songbirds.’ Advances in the Study of Behavior 33, 49–86. Slabbekoorn H (2004). ‘Singing in the wild: the ecology of birdsong.’ In Marler P & Slabbekoorn H (eds.) Nature’s music: the science of birdsong. San Diego: Elsevier Academic Press. 181–208. Staicer C A, Spector D A & Horn A G (1996). ‘The dawn chorus and other diel patterns in acoustic signaling.’ In Kroodsma D E & Miller E H (eds.) Ecology and evolution of acoustic communication in birds. Ithaca, New York: Cornell University Press. Stoddard P K (1996). ‘Vocal recognition of neighbors by territorial passerines.’ In Kroodsma D E & Miller E H (eds.) Ecology and evolution of acoustic communication in birds. Ithaca, New York: Cornell University Press. 356–376. Todt D & Naguib M (2000). ‘Vocal interactions in birds: the use of song as a model in communication.’ Advances in the Study of Behaviour 29, 247–296. Tramontin A D & Brenowitz E A (2000). ‘Seasonal plasticity in the adult brain.’ Trends in Neurosciences 23, 251–258. Wiley R H & Richards D G (1982). ‘Adaptations for acoustic communication in birds: sound transmission and signal detection.’ In Kroodsma D E & Miller E H (eds.) Acoustic communication in birds, vol. 2. New York: Academic Press. 131–181.
Bislama
C Hyslop, La Trobe University, Bundoora, VIC, Australia
© 2006 Elsevier Ltd. All rights reserved.
Bislama, an English-lexifier pidgin-creole, is the national language of Vanuatu, a republic in the southwest Pacific within the region of Melanesia. Along with English and French, it is also one of the official languages of the country. As the national language, it is spoken by the majority of the population as either a first or second language. There are as many as 100 distinct languages spoken in Vanuatu (81 actively spoken languages according to Lynch and Crowley, 2001) for a population of only 186 678 (1999 census), and as a result Bislama is vital as a lingua franca between speakers of different language groups. In urban areas and even in some rural areas, it is fast becoming the main language used in daily life. According to the 1999 census, in urban areas, where there is a great deal of intermarriage, Bislama is the
main language used at home in 58% of households; in rural areas, this figure is considerably lower, at 13.3%. However, even in the most remote areas of the country only a minority of elderly people are not fluent in Bislama. Currently, English and French are the principal languages of education in Vanuatu and Bislama is generally banned in schools. However, Bislama is used for many other government and community services. For example, the majority of radio broadcasts are in Bislama, although only some of the content of newspapers is published in Bislama. Parliamentary debates are conducted in the language, as are local island court cases. Bislama is a dialect of Melanesian Pidgin, mutually intelligible with Solomons Pijin (Pijin), spoken in Solomon Islands, and Tok Pisin, spoken in Papua New Guinea. Thus, the language is not just an important lingua franca of Vanuatu, but also a common regional language that allows for communication among most peoples of Melanesia. Only in New Caledonia is Melanesian Pidgin not spoken.
The formation and development of Bislama, and of Melanesian Pidgin generally, took place within Vanuatu and other regions of Melanesia and also in Australia and other countries of the Pacific. A pidgin first started to emerge in Vanuatu (known as the New Hebrides at the time) in the mid-1800s as a result of the sandalwood and sea slug trade. Further development took place in the second half of the 19th century, with increasing numbers of Ni-Vanuatu being recruited to work on plantations both inside Vanuatu and in other areas of the Pacific, particularly in the sugarcane plantations of Queensland and Fiji (Crowley, 1990a). During the early decades of the 20th century, the language stabilized, such that its structure today is very close to what it was then. The status of and need for Bislama as a lingua franca within the country increased in the period leading up to independence in 1980, to the extent that today it has become the unifying language of the nation. The majority of the Bislama lexicon, approximately 84–90%, is derived from English, reflecting its history of development alongside English-speaking traders, plantation owners, and colonists. Only approximately 3.75% of the vocabulary originates from the vernacular languages and 6–12% derives from French (Crowley, 2004). Of those words that derive from local languages, the majority describe cultural artifacts and concepts and endemic floral and faunal species that have no common names in English, such as nasara ‘ceremonial ground,’ navele ‘Barringtonia edulis,’ and nambilak ‘buff-banded rail.’ Note that many of these words start with na-, the form of an article or noun marker in many Vanuatu languages. Although the majority of the lexicon is derived from English, the grammar of Bislama is greatly influenced by the vernacular languages. 
For example, in the pronominal system there is an inclusive-exclusive distinction in the first person: yumi ‘we (inclusive)’ is distinguished from mifala ‘we (exclusive).’ Dual and trial number are also distinguished from the plural, as in yutufala ‘you (two),’ yutrifala ‘you (three),’ and yufala ‘you (pl.).’ Another feature that Bislama inherits from the substratum languages is reduplication. Reduplication is a productive process for both verbs and adjectives, but it is rarer for nouns. In verbs, reduplication can mark an action as being continuous, habitual, reciprocal, or random. It can mark intensity in both verbs and adjectives, and it also marks plurality in adjectives. Like English and many Vanuatu languages, Bislama is characterized by AVO/SV word order, and this is the only means of recognizing the subject
and object of the clause. Peripheral arguments are marked by prepositions. The preposition long has a wide general use; it marks the locative, allative, ablative, and dative. It can also mark the object of comparison in a comparative construction, the instrumental, and a number of other less easily defined functions. The preposition blong also has a number of functions, marking the possessor in a possessive construction, a part-whole relationship, and a purposive role. Prepositions marking other semantic roles are wetem ‘with’ (instrumental and comitative), from ‘for, because of’ (reason), and olsem ‘like’ (similitive). As is true of most pidgin languages, there is little marking of tense, aspect, and mood. The preverbal markers bin and bae mark the past and future tense, respectively. However, it is possible for an unmarked verb, preceded only by its subject, to indicate either past, present, or future tense, depending on the context. A number of auxiliaries also occur, with aspectual or modal functions, such as stap, marking a continuous or habitual action; mas ‘must’; save ‘be able’; and wantem ‘want.’ Verb serialization is a productive process in Bislama, encoding various meanings and functions such as a cause-effect relationship; a causative; or direction, position, or manner of action. See also: Central Solomon Languages; Papua New Guinea: Language Situation; Pidgins and Creoles: Overview; Solomon Islands: Language Situation; Tok Pisin; Vanuatu: Language Situation.
Bibliography
Crowley T (1990a). Beach-la-mar to Bislama: the emergence of a national language in Vanuatu. Oxford Studies in Language Contact. Oxford: Clarendon Press. Crowley T (1990b). An illustrated Bislama-English and English-Bislama dictionary. Port Vila: University of the South Pacific, Pacific Languages Unit. Crowley T (2004). Bislama reference grammar. Honolulu: University of Hawai’i Press. Lynch J & Crowley T (2001). Languages of Vanuatu: a new survey and bibliography. Canberra, Australia: Pacific Linguistics. Tryon D T (1987). Bislama: an introduction to the national language of Vanuatu. Canberra, Australia: Pacific Linguistics. Tryon D T & Charpentier J-M (2004). Pacific pidgins and creoles: origins, growth and development. Trends in linguistics studies and monographs 132. Berlin: Mouton de Gruyter.
Black Islam
R Turner, University of Iowa, Iowa City, IA, USA
© 2006 Elsevier Ltd. All rights reserved.
The involvement of black Americans with Islam reaches back to the earliest days of the African presence in North America. The history of black Islam in the United States includes successive and varied presentations of the religion that document black Americans’ struggles to define themselves independently in the context of global Islam. This article is a historical sketch of black Islam that focuses on the following topics: Islam and transatlantic slavery, early 20th-century mainstream communities, early 20th-century racial separatist communities, and mainstream Islam in contemporary black America.
Islam and Transatlantic Slavery
Muslim slaves, involuntary immigrants who had been the urban ruling elite in West Africa, constituted at least 15% of the slave population in the United States in the 18th and 19th centuries. Their religious and ethnic roots could be traced to ancient black kingdoms in Ghana, Mali, and Songhay. Some of these West African Muslim slaves brought the first mainstream Islamic beliefs and practices to America by keeping Islamic names, writing in Arabic, fasting during the month of Ramadan, praying five times a day, wearing Muslim clothing, and writing and reciting the Qur’an. One fascinating portrait of a West African Muslim slave in the United States who retained mainstream Islamic practices is that of a Georgia Sea Island slave, Bilali. He was one of at least 20 black Muslims who are reported to have lived and practiced their religion on Sapelo and St. Simon’s Islands during the antebellum period. This area provided fertile ground for mainstream Islamic continuities because of its relative isolation from Euro-American influences. Bilali was noted for his religious devotion: for wearing Islamic clothing, for his Muslim name, and for his ability to write and speak Arabic. Islamic traditions in his family were retained for at least three generations. Other portraits of outstanding African Muslim slaves in the United States that exist in the historical literature include Job Ben Solomon (1700–1773), a Maryland slave of Fulbe Muslim origins; Georgetown, Virginia, slave Yarrow Mamout, who was close to 100 years old when his portrait was painted by Charles Willson Peale; Abd al-Rahman Ibrahima (1762–1825), a Muslim prince in Futa Jallon,
who was enslaved in Mississippi; Omar Ibn Said (1770–1864), a Fulbe Muslim scholar who was a slave in North Carolina and pretended a conversion to Christianity; and numerous others. By the eve of the Civil War, the black Islam of the West African Muslim slaves was, for all practical purposes, defunct, because these Muslims were not able to develop community institutions to perpetuate their religion. When they died, their presentation of Islam, which was West African and private, with mainstream practices, disappeared. But they were important nonetheless, because they brought black Islam to America.
Early 20th-Century Mainstream Communities
In the late 19th century, the Pan-Africanist ideas of a Presbyterian minister in Liberia, Edward Wilmot Blyden (1832–1912), which critiqued Christianity for its racism and suggested Islam as a viable religious alternative for black Americans, provided the political framework for Islam’s appeal to black Americans in the early 20th century. Moreover, the internationalist perspective of Marcus Garvey’s Universal Negro Improvement Association and the Great Migration of more than one million black southerners to northern and midwestern cities during the World War I era provided the social and political environment for the rise of black American mainstream communities from the 1920s to the 1940s. The Ahmadiyya Movement in Islam, a heterodox missionary community from India, laid the groundwork for mainstream Islam in black America by providing black Americans with their first Qur’ans, important Islamic literature and education, and linkages to the world of Islam. Mufti Muhammad Sadiq, the first Ahmadiyya missionary to the United States, established the American headquarters of the community in Chicago in 1920. He recruited many of his earliest black American converts from the ranks of Marcus Garvey’s Universal Negro Improvement Association. By the mid-1920s, Sadiq and black American converts, such as Brother Ahmad Din and Sister Noor, had established The Muslim Sunrise, the first Islamic newspaper in the United States, and thriving multiracial communities in Detroit, Michigan; Gary, Indiana; and St. Louis, Missouri. There were several dynamic early 20th-century communities to which black American Sunni Muslims can trace their roots. These communities (the Islamic Mission to America, Jabul Arabiyya, and the First Cleveland Mosque) were influenced by Muslim
immigrants and their own constructed presentations of mainstream Islam in black communities. Four things influenced the Islamic Mission to America in New York City: the local Muslim immigrant community; Muslim sailors from Yemen, Somalia, and Madagascar; the Ahmadi translation of the Qur’an; and the black American community. Sheik Daoud was born in Morocco and came to the United States from Trinidad. Daoud’s wife, ‘Mother’ Sayeda Kadija, who had Pakistani Muslim and Barbadian roots, became president of the Muslim Ladies Cultural Society. The Islamic Mission to America published its own literature about mainstream Islam. Sheik Daoud believed that black American Muslims should change themselves not only spiritually, but also in “language, dress, and customs” to connect them to Islamic civilization and revivalism in Asia and Africa. Daoud immersed himself in the complex experiences of, and boundaries between, Muslim immigrants and black converts to Islam in New York City and Brooklyn from the 1920s to the 1960s. Muhammad Ezaldeen, an English teacher and principal, was a Moorish Science Temple member in Newark, New Jersey, in the 1920s. After several years of Arabic and Islamic studies in Egypt, he returned to the United States to promote the Islamic connections between Arab and black American culture in the Adenu Allahe Universal Arabic Association. In 1938, he and his followers established Jabul Arabiyya, a Sunni Muslim community ruled by Islamic law in rural West Valley, New York. Communities of this association were founded in New Jersey (Ezaldeen Village); Jacksonville, Florida; Rochester, New York; Philadelphia, Pennsylvania; and Detroit, Michigan. These communities emphasized the hijra (the movement of early Arabian Muslims from Mecca to Medina in 622 C.E.) as the centerpiece of their spiritual philosophy. 
Tensions between black American and immigrant leaders in the Ahmadiyya Movement in Islam resulted in the establishment of the Sunni First Cleveland Mosque by Imam Wali Akram in 1936 and the First Muslim Mosque in Pittsburgh by Nasir Ahmad and Saeed Akmal in the same period. Wali Akram was one of the first black American Muslim converts to sever all ties with the immigrant community in order to establish mainstream Islam in a black American community. The imam and his wife, Kareema, learned Arabic and taught the language and the recitation of the Qur’an to black converts. One of Akram’s unique contributions to the black American community was the Muslim Ten Year Plan, which utilized the faith and discipline of Sunni Islam to get black people off welfare and to make black American
Muslim communities economically and socially self-sufficient. In 1943, Wali Akram conducted the first session of the Uniting Islamic Society of America in Philadelphia. This national group was established to unify disparate black American mainstream organizations against the agenda of foreign Muslims. The Uniting Islamic Society of America met several times from 1943 to 1947 to develop a united platform on doctrine, politics, women’s issues, leadership, and relations with the immigrant community. Ultimately, this organization failed because of personality conflicts and different visions of the black American mainstream Islamic community. The grassroots work of these mainstream groups, with their emphasis on study of the Arabic language and the Qur’an, the transformation of domestic space and community life, adoption of Islamic dress and customs, and cosmopolitan travels to Egypt, Morocco, Trinidad, India, Barbados, Jamaica, and New York City, is key to understanding the Muslim lifestyles of these early Sunni black American converts as expressions of global Islam. These early black American Sunni communities were overshadowed by the successful missionary work of the heterodox Ahmadiyya movement and later by the ascendancy of the Nation of Islam in the 1950s. Mainstream Islam did not become a popular option for black American Muslims until the 1960s.
Early 20th-Century Racial Separatist Communities
Noble Drew Ali (1886–1929) was the founder of the Moorish Science Temple of America in Newark, New Jersey, in 1913. This was the first mass religious community in the history of black American Islam and the black nationalist model for the Nation of Islam. In the late 1920s, the Moorish American community in the United States grew to approximately 30 000 members and was the largest Islamic community in the United States before the ascendancy of the Nation of Islam in the 1950s. The Moorish Americans, who established branches of their community in several northern cities and made their headquarters in Chicago in the 1920s, claimed to be descendants of Moroccan Muslims and constructed a nationalist identity by changing their names, nationality, religion, diet, and dress. Their esoteric spiritual philosophy was constructed from Islam, Christianity, and black Freemasonry. In 1927, Ali wrote their sacred text, the Holy Koran of the Moorish Science Temple, also called the Circle Seven Koran, to teach his followers their preslavery religion, nationality, and genealogy. To support his
case for a Moorish American identity, he emphasized two important points: first, black Americans were really ‘Asiatics,’ the descendants of Jesus; and second, the destiny of western civilization was linked to the rise of the ‘Asiatic’ nation (Asians, Africans, Native Americans, and black Americans). In the Holy Koran of the Moorish Science Temple, Noble Drew Ali also argued that truth, peace, freedom, justice, and love were the Islamic ideals that his followers should emulate. The Moorish Science Temple survived in factions after Noble Drew Ali’s mysterious death in 1929 and received official recognition for its Islamic linkages to Morocco from the Moroccan ambassador to the United States in 1986. Major communities exist today in Baltimore, Pittsburgh, and Los Angeles. The Nation of Islam began in Detroit, Michigan, in 1930 as the Allah Temple of Islam, a small black nationalist Islamic movement founded by W. D. Fard, an immigrant Muslim missionary, who preached a philosophy of political self-determination and racial separatism to the newly arrived black southerners of the Great Migration. Fard believed that Western civilization would soon end in a race war, and he established an institutional framework (the Fruit of Islam, the Muslim Girls Training Corps, and the University of Islam) to separate black Muslims from white Christian America. Although his ethnic and Islamic identity remains undocumented, Fard might have been a Druze, a sectarian branch of the Ismaili Shii Muslims, who have a long documented tradition of human divinity and esoteric interpretations of the Qur’an. A victim of police brutality, he disappeared mysteriously in 1934, after he assigned leadership of his community to Elijah Muhammad (1897–1975), who led the Nation of Islam from 1934 to 1975 from its Chicago headquarters and was an important figure in the development of black nationalism and Islam among black Americans in the 20th century. 
The members of the Nation of Islam believed that their ancestors were the Asiatics, who were the original Muslims and the first inhabitants of the earth, and they claimed a divine identity for their founder, W. D. Fard, and prophetic status for Elijah Muhammad. During World War II, the Nation of Islam’s membership decreased dramatically as Elijah Muhammad and his son, Herbert, became involved politically with Satokata Takahashi, a Japanese national organizer among black Americans, and were held as prisoners in the federal penitentiary in Milan, Michigan, from 1943 to 1946. In the 1950s and 1960s, as black Americans and Africans cracked the political power of white supremacy in the United States and abroad,
Elijah Muhammad’s institutional quest for economic power made the Nation of Islam into the wealthiest black organization in American history. In this era, the Nation of Islam provided a community model and political inspiration for the black power movement. Malcolm X’s phenomenal organizing efforts among young lower-class black men and women in the northern cities created powerful constituencies for the Nation of Islam across the United States, and the Muhammad Speaks newspaper, which was edited by a leftward-leaning staff, provided exemplary coverage of international news and anticolonial struggles in Asia and Africa. Malcolm X provided a powerful message of racial separatism, self-discipline, and black community development in the midst of the integrationist strategies and nonviolent demonstrations of the civil rights movement. However, as the political tactics and strategies of the civil rights and the black power movements became more sophisticated, Elijah Muhammad’s economic agenda for his community resulted in a conservative vision regarding political activism; this was one of the primary factors that led to Malcolm X’s departure from the Nation of Islam. In the wake of President Kennedy’s assassination in 1963, a public controversy between Elijah Muhammad and Malcolm X evolved into a permanent separation. Establishing a new spiritual and political identity, Malcolm abandoned the heterodox, racial-separatist philosophy of the Nation of Islam and converted to multiracial Sunni Islam during the last year of his life. In March 1964, he founded the Sunni Muslim Mosque, Inc. in Harlem as the base for a spiritual program to eliminate economic and social oppression against black Americans. Then, Malcolm made the hajj, the Islamic pilgrimage to Mecca, Saudi Arabia, in April 1964. There, he changed his name from Malcolm X to El Hajj Malik El-Shabazz, which signified the adoption of a new identity that was linked to mainstream Islam. 
Malcolm’s Sunni Islamic identity became a significant model for many black Americans who have converted to mainstream Islam since the 1960s. After Mecca, Malcolm traveled extensively through North and West Africa establishing important religious and political linkages with Third World nations. These profound international experiences deepened his Pan-African political perspective. When Malcolm returned to the United States, he founded the Organization of Afro-American Unity in New York City on June 29, 1964, to promote his political perspective, which linked the black American struggle for social justice to global human rights issues in Africa, Asia, Latin America, and the Caribbean.
During the final weeks of Malcolm’s life in 1965, he began to talk about the black American freedom struggle as an aspect of ‘‘a worldwide revolution’’ against racism, corporate racism, classism, and sexism. Because of his potential (if he had lived) to unite many black Muslims and black Christians in America and abroad in a global liberation struggle that could have involved the United Nations, there is no question that the American intelligence community had the incentive to be involved in Malcolm X’s murder. Since 1978, Louis Farrakhan has led the revived Nation of Islam and published the Final Call newspaper. Farrakhan speaks fluent Arabic and travels frequently to the Middle East and West Africa to promote the issues of black American Muslims. His greatest achievement as leader of the Nation of Islam was the Million Man March in 1995, which brought the healing spirit of Islam to more than one million black men who gathered in Washington, D.C. This was the largest political gathering of black Americans in American history. On Saviours’ Day in Chicago in February 2000, Farrakhan announced changes in the Nation of Islam’s theology and ritual practices that will bring his community closer to the center of mainstream Islam in North America. Major factions of the Nation of Islam are led by John Muhammad in Highland Park, Michigan; Silis Muhammad in Atlanta, Georgia; and Emmanuel Muhammad in Baltimore, Maryland. The Five Percenters, also called the Nation of Gods and Earths, are popular among rap musicians and the hip-hop community; they were founded by Clarence 13X in New York City in 1964.
Mainstream Islam in Contemporary Black America Large numbers of black Americans have turned to mainstream Islamic practices and communities since Malcolm X’s conversion to Sunni Islam in 1964. Like Malcolm X, black American Sunni Muslims see themselves as part of the mainstream Muslim community in the world of Islam and study Arabic, fast during the month of Ramadan, pray five times a day, make the hajj to Mecca, practice charity and social justice, and believe in one God and Muhammad as his last prophet. The dramatic growth of mainstream Islam in black America is also related to the arrival of more than three million Muslims in the United States after the American immigration laws were reformed in 1965. Elijah Muhammad’s son, Warith Deen Mohammed, has played an important role within mainstream Islam in the United States. He became the Supreme Minister of the Nation of Islam after his father’s
death in 1975. During the first years of his leadership, he mandated sweeping changes, which he called the ‘‘Second Resurrection’’ of black Americans, in order to align his community with mainstream Islam. He refuted the Nation of Islam’s racial-separatist teachings and praised his father for achieving the ‘‘First Resurrection’’ of black Americans by introducing them to Islam. But now the community’s mission was directed not only at black Americans, but also at the entire American environment. The new leader renamed the Nation of Islam the ‘‘World Community of Al-Islam in the West’’ in 1976; the ‘‘American Muslim Mission’’ in 1980; and the ‘‘American Society of Muslims’’ in the 1990s. Ministers of Islam were renamed ‘imams’, and temples were renamed ‘mosques’ and ‘masjids’. The community’s lucrative financial holdings were liquidated, and mainstream rituals and customs were adopted. Although Warith Deen Mohammed’s positive relationships with immigrant Muslims, the world of Islam, and the American government are important developments in the history of mainstream Islam in the United States, his group has diminished in members since the 1980s, and he resigned as the leader of the American Society of Muslims in 2003. In the wake of Mohammed’s departure, Mustafa El-Amin, a black American imam in Newark, New Jersey, has attempted to revive this black mainstream Islamic community. Darul Islam, founded in Brooklyn, New York, in 1962 and having branches in many major American cities, is probably the largest and most influential community of black American Sunni Muslims. Prestige and leadership are based on knowledge of the Qur’an, the hadith, and the Arabic language. Darul Islam is a private decentralized community, which did not allow immigrants in its midst until the mid-1970s.
The Hanafi Madh-hab Center, founded by Hammas Abdul Khalis in the 1960s, is a black American Sunni group that made headlines in the 1970s because of its conversion of the basketball star Kareem Abdul-Jabbar and the assassination of Khalis’s family in their Washington, D.C., headquarters. Siraj Wahhaj leads an important black Sunni community in Bedford-Stuyvesant in Brooklyn, New York. Although black American Muslims populate multiethnic Sunni masjids and organizations across the United States, reportedly there are subtle racial and ethnic tensions between black American and immigrant Muslims. Immigrant Muslims talk about ‘a color- and race-blind Islam’ and the American dream, whereas black American Muslims continue to place Islam at the forefront of the struggles for social justice, as the United States has entered a new century of frightening racial profiling and violence
in a post–September 11 world. Certainly, black American and immigrant Muslims have a lot to learn from each other and need to present a united front on social justice issues, as mainstream Islam’s appeal and ascendancy in the United States in this century may depend on American Muslims’ ability to claim a moral and political high ground on social justice and racial issues that have historically divided the American Christian population. In the wake of post–September 11 legislation, such as the USA PATRIOT Act, which has enabled the detention of Muslim immigrants and Muslim Americans, black American Muslims are probably in the strongest position to refute arguments that claim there is a clash of civilizations between Islam and the West because of the ethnic group’s history of contributions to the American experience. Although there are no conclusive statistics, some observers estimate that there are six to seven million Muslims in the United States and that black American Muslims comprise 42% of the total population. Finally, the future of American Muslim communities in the 21st century may be determined significantly by the conversion experiences and social-political perspectives of young black Americans. According to a report from the Mosque Study Project 2000, published by the Council on American-Islamic Relations, black Americans constitute the largest share of the yearly converts to mainstream Islam, and many of these converts are young black men and women who reside in urban locations.
See also: Islam in Africa; Islam in East Asia; Islam in Southeast Asia; Islam in the Near East; New Religious Movements; Religion: Overview.
Bibliography
Austin A D (1997). African Muslims in antebellum America: transatlantic stories and spiritual journeys. New York: Routledge.
Clegg C A III (1997). An original man: the life and times of Elijah Muhammad. New York: St. Martin’s.
Dannin R (2002). Black pilgrimage to Islam. New York: Oxford University Press.
Diouf S A (1998). Servants of Allah: African Muslims enslaved in the Americas. New York: New York University Press.
Essien-Udom E U (1962). Black nationalism: a search for identity in America. Chicago: University of Chicago Press.
Haddad Y Y (ed.) (1991). The Muslims of America. New York: Oxford University Press.
Haley A (1965). The autobiography of Malcolm X. New York: Ballantine Books.
Lincoln C E (1994). The black Muslims in America (3rd edn.). Trenton, NJ: Africa World Press.
McCloud A B (1995). African-American Islam. New York: Routledge.
Nimer M (2002). The North American Muslim resource guide: life in the United States and Canada. New York: Routledge.
Turner R B (2003). Islam in the African-American experience (2nd edn.). Bloomington: Indiana University Press.
Blaming and Denying: Pragmatics R Wodak, University of Vienna, Vienna, Austria, and Lancaster University, Lancaster, UK © 2006 Elsevier Ltd. All rights reserved.
Definition of Terms Blaming and denying, frequent and constitutive features of conflict talk, are expressed in many different direct or indirect linguistic modes, depending on the specific broad and narrow contexts of the conversations, on the functions of the utterances, and on the formality of the interactions. Moreover, the usages and functions of blaming and denying are dealt with in many disciplines (psychoanalysis, sociopsychology, political sciences, sociology, anthropology, psychiatry, linguistics, argumentation studies, history, and so forth). For example, the specifics of blaming and
denying can be related to psychological and psychiatric syndromes, wherein certain patterns are viewed as compulsive and out of control, and to political debates and persuasive discourses, in which blaming and denying, by serving to promote one group and to debase or attack the opposition, are carefully and strategically planned and serve positive self-presentation and negative other-presentation. Thus, the linguistic analysis of those verbal practices that construct a dynamic of ‘justification discourses’ requires methodologies that are adequate for the specific genre and context (speech act theory, conversation analysis, discourse analysis, text linguistics, argumentation analysis, rhetoric, and so forth) (for overviews of some important features of conflict talk in specific domains from varying perspectives, see Austin, 1956/1957; Gruber, 1996; Kopperschmidt, 2000) (see also Discourse Markers; Psychoanalysis and Language).
The Use of Blaming and Denying: Domains and Genres
Blaming and denying occur both in private, intimate conversations and in the domains of politics, the law, and the media. Linguistic manifestations depend on the choice of genre and on the formality/informality of the settings. For example, studies on racist or anti-Semitic discourses show that the more informal the setting (anonymous conversations, conversations with friends, or e-mail postings), the more likely the use of abusive language, derogatory terms, and discriminatory language. If the setting is more formal (for example, a televised debate or political speech), the wording of ‘blaming’ is mitigated, more indirect, and often introduced by disclaimers (Some of my best friends are Jewish/Turks, but ...; I love all people, but ...; and so forth), after which the ‘other’ is attacked, often by a projection of guilt or by a turning of the tables (van Dijk, 1993; Wodak, 2004) (see Mitigation). Justification discourses have been analyzed in studies dealing with court trials (Scott and Lyman, 1976; Alexy, 1996), relationships between parents and children (Wodak and Schulz, 1986), intimate relationships (Jacobson and Kettelhack, 1995; Dejudicibus and McCabe, 2001), media debates (Lamb and Keon, 1995; Dickerson, 1998), and the speeches, print media, slogans, and debates of election campaigns (Chilton, 2004); they have also been focused on in the police environment and other bureaucratic settings (Ehlich and Rehbein, 1986) and during proceedings in which official bodies have attempted to come to terms with traumatic past events (Ensink and Sauer, 2003; Martin and Wodak, 2003). One of the most significant manifestations of denial is ‘Holocaust denial,’ in which speakers and writers suggest evidence or arguments for their claim that the Holocaust never happened, being – in their opinion – invented by a (supposedly Jewish) conspiracy (Lipstadt, 1993).
There is no doubt that such a denial serves many functions, probably primarily to reject (individual and/or collective) guilt by counterattacking an imaginary opponent. Justification discourses are not restricted to oral, spontaneous texts; the same types of blaming and denying are also manifest in many written genres, reflecting the intentions and aims of the authors of newspaper articles, letters, party programs, election materials, or legal documents. The visual genres, especially caricature, lend themselves to justification discourses through the presentation of, and debate about, visual evidence (e.g., photos representing war crimes; see later).
The Linguistic/Pragmatic Analysis of Blaming and Denying
Depending on the genre, different linguistic and/or pragmatic approaches are used in analysis. Most obviously, speech act theory allows for the categorization of direct and indirect forms of blaming and denying in conversations or debates (see Speech Acts). In conversation analytic terms, blaming consists of two parts: on the one hand, a specific action is presented; on the other hand, there is the negative evaluation of this action, often an accusation. Gruber (1996) listed several important forms of these so-called ‘adjacency pairs’ (see Conversation Analysis). Accusations can either relate to situational factors or to factors that are outside of the specific setting. Either way, perceived violations of rules and norms may trigger the speech act of blaming. Moreover, accusations can be formulated either directly or indirectly, depending on the knowledge that the participants in the debate or conflict are supposed to possess. Reacting to aggressive behavior, a defendant can either apologize and try to legitimize her/his actions through accounts, anecdotes, various kinds of evidence, and so forth (Scott and Lyman, 1968/1976), or the accusation can be rejected. Conversation analysts propose that rejection is the preferred mode of reaction (Pomerantz, 1978). Silence can also occur; this is usually interpreted as the accused acknowledging the legitimacy of the accusation. Sometimes, a counteraccusation may follow, or the accusation may be partially or completely denied. These patterns of speech acting can create a conversational dynamic that is very difficult to overcome. Argumentation analysis focuses on typical modes of arguments that are used in conflict talk. Certain topoi characterize blaming as well as denying; both the topoi and the fallacies are difficult to deconstruct, such that a rational debate becomes almost impossible.
Many argumentative moves can be made while blaming an opponent, ranging from attacking the opponent personally (argumentum ad hominem) or threatening the opponent and his/her freedom of expression (argumentum ad baculum), to undermining the credibility of the opponent by showing that he/she does not adhere to the point of view that he/she publicly defends (tu quoque, a variant of the ad hominem argument) (for typical fallacies in conflict talk, see van Eemeren and Grootendorst, 1992; Reisigl and Wodak, 2001) (see also Argument Structure). What holds for argumentation is also true of denials. Denials can occur as disclaimers (I am not a racist, sexist, etc., but ...) or as direct rejections of
certain accusations; they can be formulated as counterattacks (identification with the aggressor), or as ‘straw man’ fallacies (when a fictitious standpoint is attributed to the opponent, or the opponent’s actual standpoint is being distorted). Some of these fallacies have already been described in classical rhetoric (as in Aristotle’s De sophisticis elenchis), wherein fallacies are defined as incorrect moves adopted in dispute to refute a thesis (see van Eemeren and Grootendorst, 2004) (see also Rhetoric, Classical). Discourse analysis focuses on the strategies employed in blaming and denying. These strategies are realized linguistically in various, predictable ways, depending on the context. Moreover, mitigation and intensification markers are of obvious interest, because they serve to open or close options for debate and argument. Discursive strategies such as scapegoating, blaming the victim, blaming the messenger, victim–perpetrator reversal, the straw man fallacy, turning the tables, and so forth have been studied extensively; they all belong to the category of ‘discourses of justification’ (Wodak et al., 1990; Van Leeuwen and Wodak, 1999). ‘Strategy’ is defined as a more or less detailed and directed plan of practices (including discursive practices), adopted to achieve a particular social, political, psychological, or linguistic aim. As far as discursive strategies, i.e., systematic ways of using language, are concerned, they are located at different levels of linguistic organization and complexity. Strategies, realized as macroconversational patterns or moves, are often used to structure public debates, such as on AIDS, poverty, economic problems, the welfare state, racism, xenophobia, and anti-Semitism; as well as on sexism and the representation of rape (Carlson, 1996; Maynard, 1998; Anderson et al., 2001).
An Example: The War-Crimes Debate Between 1995 and 2004, the Hamburg Institute for Social Research created and presented to the public two itinerant exhibitions, under the common denomination Crimes of the German Wehrmacht (see Heer et al. (2003); for an extensive analysis of the debates surrounding the exhibitions, as well as an analysis of the historical narratives in Germany and Austria around the discursively constructed images of the German Wehrmacht, see also Wodak (2005)). The first exhibition was shown from March, 1995, through the end of 1999, at a total of 33 venues in the Federal Republic of Germany and in Austria. The second exhibition was shown to the public for the first time in Berlin in November, 2001; the new exhibition upheld the main statement of the former exhibition (which had been hotly debated and
often criticized, both in the press and in other fora of discussion): viz., that during World War II, the Wehrmacht was extensively involved, as an institution, in planning and implementing an unprecedented war of annihilation. However, the second exhibition had shifted to a focus on texts, whereas the first exhibition had presented mainly photographs. The exhibitions demonstrated the at times passive, at times active, role of the Wehrmacht in German war crimes. From November 2001 through March 2004, this second exhibition was displayed in 11 German cities, as well as in Vienna and in Luxembourg, attracting more than 420,000 visitors (the Hamburg Institute’s first exhibition on the same subject had attracted about 800,000 visitors). Both exhibitions triggered a discussion throughout the Federal Republic of Germany and Austria about the crimes committed during the war waged by the National Socialist regime and about how postwar German society dealt with this part of its past. Never before had the West German and Austrian publics discussed their past with such intensity and for such a long period. In the debates surrounding the two exhibitions (1995 and 2001) on war crimes committed by the German Wehrmacht in World War II, typical discursive strategies of blaming and denying become apparent. Interviews with visitors to the exhibition emphasized, on the one hand, the fact of ‘‘not having seen, known, or heard anything’’ about the deportation and extermination of prisoners of war as well as of racial and ethnic groups such as Jews, Roma, and other civilians. On the other hand, the blame was projected onto ‘a few soldiers,’ who were labeled as ‘exceptions’; in this way, any explicit involvement of the Wehrmacht as an institution was denied (Heer et al., 2003).
The same patterns are found in the reports on hearings of the South African Truth and Reconciliation Commission (TRC) and in the debates about the pictures of tortured Iraqi prisoners that first appeared in 2004, during the Iraq war. Figure 1 summarizes the most important strategies of denial (i.e., discursive reactions to blaming). The main distinction shown in the diagram is between people orienting themselves toward the context, i.e., acknowledging the fact that they are watching an exhibition about the German army’s war crimes, and taking a stance toward that fact (the left side of the diagram), and people who do not orient themselves toward the context (the right side of the diagram). The first three strategies negate the very context, at least at the explicit level: 1. People do not position themselves with respect to their belief in the existence of war crimes. This may be done by (a) refusing to deal with the issue
Figure 1 Array of discursive strategies (see Benke and Wodak, 2003: 124). Abbreviations: NS, Nazi state; SS, Schutzstaffel (Hitler’s ‘protection guard’ unit); SD, Sicherheitsdienst (security police). From Benke G & Wodak R (2003). ‘The discursive construction of individual memories: how Austrian ‘‘German Wehrmacht’’ soldiers remember WW II.’ In Wodak R & Martin J R (eds.) Re/reading the past. Amsterdam: Benjamins. 115–138. With kind permission by John Benjamins Publishing Company, Amsterdam/Philadelphia.
at all, (b) claiming ignorance, combined with a refusal to take a stance (people using this strategy claim that they do not/did not know anything about what happened), or (c) claiming victimhood (people adopting this strategy may offer elaborate stories about all sorts of terrible things that happened to them during and after the war; in this way, they are able to avoid having to deal with the issue of war crimes committed by the Wehrmacht). 2. People lift the discussion up to a more general level. Using the strategy of scientific rationalization, some people launch into extensive analyses of the Nazi state, aiming to explain how National Socialism came to be successful, why people were in favor of the Nazis, and so on. (This strategy was found among all of the visitors to the exhibitions, both in Germany and in Austria.) 3. People engage in ‘positive-self’ presentation: the interviewee tells stories that portray him/her as having performed good and praiseworthy deeds. War crimes are acknowledged, yet the actor claims to have had no part in them (or fails to mention
any relation to war crimes); the interviewees declare themselves to have acted responsibly, in such a way that they are morally without blame. The following strategies acknowledge the fact of the exhibition at some level, either by acceptance or refutation: 1. In a strategy of acceptance, some people try to understand what happened. 2. For the most part, however, people try not to deal with the past; instead, they use several strategies to justify, and/or deny, the existence of the war crimes, either by (a) relativizing the facts (people using this strategy will start to enumerate crimes of other nations, or use clichés, such as ‘‘every war is horrible’’) or by (b) adopting two further strategies seeking to provide a (pseudo-) rational causal explanation for the war crimes. The first is characterized by the interviewees’ continuing the unmitigated and undisguised use of Nazi ideology and Nazi propaganda of the kind that was promoted during that time to justify the war: ‘‘If we hadn’t fought them, the Russians would be at
the Atlantic Ocean today.’’ The second of these strategies similarly stems from the Nazi period, but at least it acknowledges, however implicitly, that the war’s moral status is questionable: ‘‘Others forced us.’’ 3. Another strategy acknowledges that crimes indeed did happen, and that the army should perhaps be held responsible, yet it attributes the responsibility to someone higher up, possibly within the army: ‘‘I only did my duty.’’ 4. Yet another strategy is the ‘‘Not ‘we,’ but ‘them’ ’’ strategy, which attributes the crimes to units of the army other than the one in which the interviewee served. A variant is: ‘‘Not ‘this,’ but ‘that’’’ (e.g., ‘‘We didn’t bomb Copenhagen, only Rotterdam’’). 5. Finally, there is a strategy that simply denies the fact that war crimes happened at all. In this strategy, people often turn the focus of their memory on their particular Wehrmacht unit, in which horrors of the kind shown in the exhibitions simply were said to be unthinkable. These discursive strategies are all strategies of responding to an interview situation following the interviewees’ presence at an exhibition where thousands of photos of war crimes are shown. Though people employ a number of strategies throughout an interview, their answers can usually be grouped into subsets, each of which serves primarily one of the strategic functions mentioned herein. Some of the strategies are mutually exclusive, i.e., people who completely deny the existence of war crimes would not try to relativize them. This appears to be a logical necessity, but as Billig et al. (1988) pointed out, logic or logical consistency is not necessarily prevalent in official texts; neither is it in everyday conversation, and even less so in emotionally charged debates or conflicts. See also: Argument Structure; Conversation Analysis; Discourse Markers; Mitigation; Psychoanalysis and Language; Rhetoric, Classical; Speech Acts.
Bibliography
Alexy R (1996). Theorie der juristischen Argumentation. Die Theorie des rationalen Diskurses als Theorie der juristischen Begründung. Frankfurt am Main: Suhrkamp.
Anderson I, Beattie G & Spencer C (2001). ‘Can blaming victims of rape be logical? Attribution theory and discourse – analytic perspectives.’ Human Relations 54/4, 445–467.
Aristotle (1928). Sophistical refutations. Ross W D (ed.). Oxford: Clarendon Press, [350 B.C.].
Austin J L (1956/1957). ‘A plea for excuses.’ In Proceedings of the Aristotelian Society.
Benke G & Wodak R (2003). ‘The discursive construction of individual memories: how Austrian ‘‘German Wehrmacht’’ soldiers remember WW II.’ In Wodak R & Martin J R (eds.) Re/reading the past. Amsterdam: Benjamins. 115–138.
Billig M, Condor S, Edwards D, Gane M, Middleton D & Radley A (1988). Ideological dilemmas. A social psychology of everyday thinking. London: Sage.
Carlson R G (1996). ‘The political-economy of AIDS among drug-users in the United-States: beyond blaming the victim or powerful others.’ American Anthropologist 98(2), 266.
Chilton P A (2004). Analyzing political discourse. London: Routledge.
Dejudicibus M & McCabe M P (2001). ‘Blaming the target of sexual harassment: impact of gender-role, sexist attitudes, and work role.’ Sex Roles 44(7–8), 401–417.
Dickerson P (1998). ‘‘‘I did it for the nation’’: repertoires of intent in televised political discourse.’ British Journal of Social Psychology 37/4, 477–494.
Ehlich K & Rehbein J (1986). ‘Begründen.’ In Ehlich K & Rehbein J (eds.) Muster und Institution. Untersuchungen zur schulischen Kommunikation. Tübingen: Narr. 88–132.
Ensink T & Sauer C (eds.) (2003). The art of commemoration. Amsterdam: Benjamins.
Gruber H (1996). Streitgespräche. Zur Pragmatik einer Diskursform. Opladen: Westdeutscher Verlag.
Heer H, Manoschek W, Pollak A & Wodak R (eds.) (2003). Wie Geschichte gemacht wird. Erinnerungen an Wehrmacht und Zweiten Weltkrieg. Vienna: Czernin.
Jacobson B & Kettelhack G (1995). If only you would listen. How to stop blaming his or her gender and start communicating with the one you love. New York: St. Martin’s Press.
Kopperschmidt J (2000). Argumentationstheorie zur Einführung. Hamburg: Junius.
Lamb S & Keon S (1995). ‘Blaming the perpetrator: language that distorts reality in newspaper articles on men battering women.’ Psychology of Women Quarterly 19(2), 209–220.
Lipstadt D E (1993). Denying the Holocaust. The growing assault on truth and memory. New York: Plume.
Martin J & Wodak R (eds.) (2003). Re/reading the past. Amsterdam: Benjamins.
Maynard D W (1998). ‘Praising versus blaming the messenger: moral issues in deliveries of good and bad news.’ Research on Language and Social Interaction 31(3–4), 359–395.
Pomerantz A M (1978). ‘Attributions of responsibility: blamings.’ Sociology 12, 115–133.
Reisigl M & Wodak R (2001). Discourse and discrimination. Rhetoric of racism and antisemitism. London: Routledge.
Scott M B & Lyman S (1968). ‘Accounts.’ American Sociological Review 33.
Van Dijk T A (1993). ‘Denying racism: elite discourse and racism.’ In Solomos J & Wrench J (eds.) Racism and migration in Western Europe. Oxford: Berg. 179–193.
Van Eemeren F H & Grootendorst R (1992). Argumentation, communication, and fallacies. A pragma-dialectical perspective. Hillsdale, NJ: Erlbaum.
Van Eemeren F H & Grootendorst R (2004). A systematic theory of argumentation. Cambridge: Cambridge University Press.
Van Leeuwen T & Wodak R (1999). ‘Legitimizing immigration control.’ Discourse Studies 1/1, 83–118.
Wodak R (2004). ‘Discourse of silence: anti-semitic discourse in post-war Austria.’ In Thiesmeyer L (ed.) Discourse and silencing. Representation and the language of displacement. Amsterdam: Benjamins. 179–210.
Wodak R & Schulz M (1986). The language of love and guilt. Amsterdam: Benjamins.
Wodak R, Nowak P, Pelikan J, Gruber H, de Cillia R & Mitten R (1990). ‘Wir sind alle unschuldige Täter’. Diskurshistorische Studien zum Nachkriegsantisemitismus. Frankfurt am Main: Suhrkamp.
Bleek, Wilhelm Heinrich Immanuel (1827–1875), and Family
E Hültenschmidt, University of Bielefeld, Bielefeld, Germany
© 2006 Elsevier Ltd. All rights reserved.
Wilhelm Heinrich Immanuel Bleek was born March 8, 1827, in Berlin, in what was then Prussia; he died in Cape Town, in the Cape Colony in South Africa, on August 17, 1875. He was the son of the famous theologian and specialist in New Testament exegesis Friedrich Bleek, professor of theology at the University of Bonn. His mother was Augusta Charlotte Marianne Henriette, née Sethe, originating from a prominent family of Prussian civil servants. In 1862 in Cape Town, Wilhelm H. I. Bleek married Jemima C. Lloyd, daughter of an archdeacon. They had four children. Bleek is recognized as the founder of German African Studies. He attended the Gymnasium in Bonn and then studied classics and theology at the University of Bonn from 1845 to 1848 and from 1849 to 1851. He chose as his main subject Old Testament studies. Like all researchers in the Textwissenschaft of the Old Testament, he compared several Semitic languages to clarify some linguistic points; in this way, he extended his interest to North African (Hamitic) languages. As a consequence, he studied in Berlin in 1848 and 1849 with the famous specialist in Egyptological research, Richard Carl Lepsius. Here Bleek had to transcribe manuscripts of southern African languages, sent mostly by missionaries, into Lepsius’s phonetic alphabet. In 1851, Bleek submitted his doctoral thesis at the University of Bonn. From this time on, he propagated the hypothesis that the ‘Hottentot’ (Khoekhoe) language was typologically and genetically linked to the North African (Hamitic) languages: like the Hamitic languages, it was a gender language, unlike the Bantu languages, which lack nominal gender. Later, it was Bleek who created the classificatory term ‘Bantu-languages.’ From 1855 on, Bleek worked as an explorer-linguist in southern Africa, though he had to break off his first attempt to explore Africa from the Guinea coast
because of fever. In the salon of the Prussian ambassador in London, C. C. J. von Bunsen, who was an aristocratic historian, a friend of Bleek’s family, a promoter of Sanskrit and Oriental Studies, and a correspondent of Alexander von Humboldt, Bleek got to know Sir George Grey, governor of the Cape province (a British colony at this time) and J. W. Colenso, bishop of Natal. Colenso engaged Bleek formally to accompany him to compile a Zulu grammar, and Bleek arrived in 1855 in Natal. He had great plans for doing extended field work and thus becoming a sort of Livingstone of linguistics, but the only concrete result was a stay at the court of the famous Zulu king Mpanda. All other plans had to be abandoned due to financial and health problems. The only institutions in the world where scientific research was professionalized and thus constantly remunerated at this time were the Prussian universities; but Bleek was never a member of the staff of a Prussian university. What helped him to survive and to carry on his work, on a more limited scale, was the patronage of Sir George. In 1856, Bleek became the curator and bibliographer of Sir George’s enormous collection of documents concerning the languages and the ethnology of southern Africa, and he constantly extended this collection, which was intended to become the most complete collection of material on aboriginal languages from all over the world. So Bleek spent the rest of his life in Cape Town; but here, at least, he had the opportunity in 1858 to meet Livingstone on his way to Mozambique. In 1859, when Sir George was appointed governor of New Zealand, he donated his collection to the South African Public Library at Cape Town, with Bleek as its curator (1862). In 1870, through the influence of Sir George, Bleek’s name was placed on Gladstone’s Civil List, ensuring him a royal pension like other persons such as Charles Darwin or Charles Lyell. 
Only then, for the first time in his life, did he enjoy financial independence. As a bibliographer, Bleek’s main work was The library of H. E. Sir George Grey, K. C. B. (1857–1867), but his main scientific
work was A comparative grammar of South African languages (1862–1869). In his Comparative grammar, Bleek wanted not only to prove, by means of the ‘science of language,’ the kinship between the Hottentot and the North African languages, but also to make a definitive contribution to a question already posed in Sanskrit and Oriental linguistics: what are the very first, the primitive forms of human language (after the full natural evolution of man and language), and can they be found in the Hottentot and ‘Kafir’ (Zulu) languages? An adherent of evolutionism, he was convinced that in southern Africa the most primitive state of mankind was preserved. This was the immediate goal of his research, but he also pursued another, more distant goal: to understand the causes of the specific cultural difference between populations adhering to a primitive or natural religion and those adhering to a transcendental religion. For this son of a Protestant theologian, culture, mind, and religion were the same ‘thing.’ In this he refers to Max Müller, whom he probably met in Bunsen’s house in London, but without agreeing with him on every point. Bleek seeks the cause of religious or mental differences in linguistic differences concerning the ‘forms’ and ‘elements’ of language, which he compares by analogy to certain nonmathematical and nonlogical sciences: to organic chemistry (phonology as the science of the ‘elements’ of language) and to comparative anatomy (the ‘forms’ as the skeleton of language). So in his main work as elsewhere, Bleek works not only as a comparative linguist, but as a linguistic researcher who has his intellectual background in Spinoza’s philosophy, as transmitted among certain Lutheran theologians and elsewhere in German intellectual culture.
Bleek’s debt to Spinoza’s philosophy is manifest mainly in his explicitly speculative work The origin of language, submitted in 1853 for the Volney Prize (which he did not win), prefaced for publication in 1867 by himself and by his cousin, Ernst Haeckel, a researcher on human evolution and a Darwinist. This work advanced the thesis that there is no opposition, no essential difference between sciences and humanities, between natural sciences and the sciences of the mind (Geisteswissenschaften). Spinoza’s philosophy implies epistemological naturalism, a continuity between man and nature. To this naturalistic conception of history were opposed the post-Kantian and Hegelian idealistic German historicism and ‘Geisteswissenschaft.’ Bleek’s last great scientific enterprise was his Bushman dictionary, begun in about 1870 and completed by his daughter Dorothea Frances Bleek in the 1940s, published in the American Oriental Society series in 1956. His many works on Bushman tales, studied because they give access to the religion, were published in
1911 by his sister-in-law, Lucy C. Lloyd. Here, as in his other works, the languages of the ‘negroes’ are legitimate subjects of scientific research, not inferior to the classical languages: each ‘race’ has a place in the history of the evolution of man and is equally interesting. The more primitive ‘races’ may even be more interesting. The Bushman dictionary constitutes an enormous compendium of information about languages that have become in the meantime extinct. Bleek’s main hypothesis concerning the kinship of the Hottentot and the North African languages survived up to the work of the Hamburg Africanist Carl Meinhof; when he tried to prove this kinship definitively by means of comparative philology, Meinhof found that it did not exist. Comparative philology, or the science of language, was and is a modern research science capable of revising its own hypotheses. Bleek’s belief in the existence of a causal relation between language and mind in the sense of the structures of religious systems is no longer accepted. Comparative research into civilizations understands the difference between primitive or natural and transcendent religions in a different way. Dorothea Frances Bleek, born March 26, 1873, in Mowbray, Cape Colony, died June 27, 1948, in Plumstead, South Africa. The youngest daughter of W. H. I. Bleek, she was an eminent researcher in the Hottentot (Khoekhoe) and Bushman (Khoisan) languages. In 1904, she was a student of African languages in Berlin, Germany; after 1908, she concentrated on research in the Bushman languages and cultures. She was introduced to these studies by her father’s sister-in-law, Lucy C. Lloyd. Miss Lloyd continued and edited the work of W. H. I. Bleek, encountering many difficulties, since she was ‘only’ a woman in Victorian times. Dorothea F. Bleek continued and edited the work of both W. H. I. Bleek and Lucy C. Lloyd. From 1910 to 1930 she did extensive fieldwork among Bushman populations. 
The results are documented in a series of publications, the most important of which is the Bushman dictionary, begun by her father about 1870, continued by Lucy C. Lloyd, but mainly established by Dorothea F. Bleek and published by the American Oriental Society in 1956. She was also active in other domains, such as Bushman anthropology, for the Africa Museum in Cape Town, and the study of Bushman rock paintings. While Dorothea Bleek’s father was the inventor of the term ‘Bantu-languages,’ the daughter established the distinction of three main regional groups of the Khoisan languages: southern, northern, and central Khoisan, with the Hottentot (Khoekhoe) language being a part of the central Khoisan group. Her father’s hypothesis of a typological-genetic link between the Hottentot and the Hamitic languages is no
longer accepted, but the main classificatory result of the daughter’s work still holds. From 1923 to 1948, Dorothea Bleek was Honorary Reader in the Bushman Languages at the University of Cape Town. But she refused the title of an Honorary Doctor, regarding herself simply as her father’s humble disciple.
See also: Africa as a Linguistic Area; Bantu Languages; Lepsius, Carl Richard (1810–1884); Meinhof, Carl Friedrich Michael (1857–1944); Müller, Friedrich Max (1823–1900); South Africa: Language Situation.
Bibliography
Bleek D F (1927). ‘The distribution of Bushman languages in South Africa.’ In Festschrift Meinhof. Hamburg: Augustin. 55–64.
Bleek D F (1929). Comparative vocabularies of Bushman languages. Cambridge: Cambridge University Press.
Bleek D F (1953). Cave artists of South Africa. Cape Town: Balkema.
Bleek D F (1956). A Bushman dictionary. New Haven, CT: American Oriental Society.
Bleek W H I (1851). De nominum linguarum Africae Australis, Copticae, Semiticarum aliarumque sexualium. Bonn: A. Marcus.
Bleek W H I (1858–1867). The library of H. E. Sir George Grey, K. C. B. Philology (8 vols). London: Trübner.
Bleek W H I (1862 and 1869). A comparative grammar of South African languages (2 vols). London: Trübner.
Bleek W H I (1868). Über den Ursprung der Sprache, als erstes Kapitel einer Entwicklungsgeschichte der Menschheit. Weimar: Böhlau.
Engelbrecht J A (1956). ‘Introduction.’ In Bleek D F (ed.) A Bushman dictionary. New Haven, CT: American Oriental Society.
Lloyd L C (ed.) (1911). Specimens of Bushman folklore. London: Allen & Co.
Spohr O H (1962). Wilhelm Heinrich Immanuel Bleek: a bio-bibliographical sketch. Cape Town: University of Cape Town Libraries.
Velten C (1903). ‘Bleek.’ In Allgemeine Deutsche Biographie 47. Berlin: Duncker & Humblot. 15–17.
Blend
O Bat-El, Tel-Aviv University, Tel-Aviv, Israel
© 2006 Elsevier Ltd. All rights reserved.
Introduction
The word Oxbridge is composed of a string of segments corresponding to segments at the left edge of Oxford and the right edge of Cambridge. This is a blend, and so are vodkatini (vodka + martini), jazzercise (jazz + exercise), and maridelic (marijuana + psychedelic). Blends (also called portmanteau words) exhibit some sort of structural fusion, in which a single word is formed from two words (and in a handful of cases from three). The byproduct of this fusion is the truncation of segmental material from the inner edges of the two words or only one of them (i.e., the material not underlined in the examples above). Note that blends refer only to cases where the inner edges are truncated. Forms in which the right edges of the two (or more) words are truncated, such as sitcom (situation + comedy), modem (modulator + demodulator), and fortran (formula + translation), are called clipped compounds. Blends in which only the first word undergoes truncation could also be considered clipped compounds (mocamp from motor + camp), especially when each word contributes only one syllable to the surface form, which is a characteristic of clipped compounds.
A blend is one word that delivers the concept of its two base words and its meaning is thus contingent on the semantic relation between the two base words. In skinoe (ski + canoe), the word canoe functions as the semantic head, since skinoe is a type of canoe. In snazzy, however, neither snappy nor jazzy functions as a head and the meaning of the blend is thus a hybrid of the meaning of the two (sometimes near-synonymous) base words. The most intriguing question with respect to blends is whether their phonological structure, i.e., their size, syllable structure, and segmental makeup, is predictable on the basis of the base words (Bauer, 1983). For example, why do we get beefalo from beef and buffalo, rather than *beelo or *beebuffalo? And since the order of the base words affects the phonological shape of the blend, we may also ask why the order is not buffalo + beef, which would result in *buffabeef or *bubeef? In most cases, two base words provide only one possible blend (there is a handful of cases where both orders are available, e.g., tigon (tiger + lion) versus liger (lion + tiger), absotively (absolutely + positively) versus posilutely (positively + absolutely), and moorth (moon + earth) versus earthoon (earth + moon)). Therefore, we may suspect that the formation of blends is not accidental, but rather governed by some general principles. The principles reflect two competing tendencies: (i) to truncate segments from
the base in order to allow the blend to have the length of a single word, preferably one of the base words, and (ii) to preserve as many segments from the base words as possible and thus maximize the semantic transparency of the blend. The principles proposed in the following sections take English blends as the empirical basis (the data are drawn mostly from Adams (1973) and Bryant (1974)). However, these principles should be applicable to blends from other languages, though some parameter settings might be required (see Kubozono (1990) for English and Japanese; Bat-El (1996) for Hebrew; Fradin (2000) for French; and Piñeros (2004) for Spanish).

Table 1 Types of semantic relations between the base words
Base words                     Blend          Gloss
(a) Endocentric relation: one of the words functions as a semantic head (the second word in each pair below) and the other as a modifier
klan + koran                   kloran         ‘a bible used by the members of KKK’
education + entertainment      edutainment    ‘educational entertainment’
key + container                keytainer      ‘a container for keys’
(b) Exocentric relation: both words have the same semantic status, and thus neither of them serves as a head
alphabetic + numeric           alphameric     ‘consisting of both letters and numbers’
escalator + lift               escalift       ‘a hybrid device with the advantage of both an escalator and a lift’
tangerine + lemon              tangemon       ‘a hybrid of tangerine and lemon’

Table 2 The number of syllables in a blend equals the number of syllables in its longer base word
The Semantic Relation between the Base Words
The meaning of a blend is composed of the meaning of its base words, which exhibit two types of semantic relation, endocentric and exocentric (Table 1) (see Adams (1973) and Algeo (1977) for other types of relation). In some cases, it is not clear whether the semantic relation is endo- or exocentric. The blend smog (smoke + fog), for example, has two meanings, ‘a mixture of fog and smoke’ (exocentric) and ‘an airborne pollution’ (endocentric). The same is true for brunch (breakfast + lunch), which means either ‘lunch with some characteristics of breakfast’ (endocentric) or ‘a mixture of breakfast and lunch’ (exocentric). These two types of relations also appear in compounds (Bauer, 1988; Spencer, 1991), but blends are much more permissive in this respect. Blends allow any possible combination of lexical categories, including some that do not appear in compounds (e.g., verb–verb, as in baffound, from baffle + confound). In addition, blends do not show a preference for either the endocentric or the exocentric relation, whereas compounds are mostly endocentric. Finally, in endocentric compounds the order of the head and the modifier is fixed and this is also true for most endocentric blends in English (Kubozono, 1990), which are right-headed, like compounds. In Hebrew, however, whose compounds are left-headed, blends can be either right- or left-headed (Bat-El, 1996).
The Size of the Blend
The formation of a blend aims toward two competing goals. On the one hand, it must have the structure of a single word, unlike compounds, in which the two base words are accessible. For this purpose, the blend often adopts the number of syllables in one of its base words, thus truncating some segmental material. On the other hand, a blend must preserve as much of the structure from its base words as possible. To accommodate the first goal and maximize the fulfillment of the second, the number of syllables in a blend is often identical to the number of syllables in the longer base word (number of syllables in parentheses) (see Table 2). By adopting the number of syllables from the longer rather than the shorter base word, the blend obtains the structure of one word and maximizes its size. Maximization facilitates the semantic recoverability of the base words, since the more segmental material from the base words there is, the easier it is to identify them.
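As a minimal sketch of the size generalization above, the Python snippet below checks whether a blend has as many syllables as its longer base word. The function name `obeys_size_rule` is hypothetical, and the syllable counts are supplied by hand rather than computed. The counts for beefalo are added here for illustration; the counts for brunch are those the article gives when it lists exceptions.

```python
def obeys_size_rule(blend_syllables: int, first_syllables: int, second_syllables: int) -> bool:
    """True if the blend has as many syllables as its longer base word."""
    return blend_syllables == max(first_syllables, second_syllables)

# beefalo (3) from beef (1) + buffalo (3): conforms to the generalization
print(obeys_size_rule(3, 1, 3))
# brunch (1) from breakfast (2) + lunch (1): one of the exceptions
print(obeys_size_rule(1, 2, 1))
```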
Table 3 Segmental maximization also determines the order of the base words in exocentric blends (column heading: A + B, maximizing order)
There are, however, some exceptions, for example, plumcot (2) from plum (1) + apricot (3); brunch (1) from breakfast (2) + lunch (1); goon (1) from gorilla (3) + baboon (2); and bionic (3) from biology (4) + electronic (4). It should be noted that Kubozono (1990) claims that the number of syllables in a blend is identical to the number of syllables in the rightmost word, but some of the exceptions above (bionic, plumcot, goon) do not obey this generalization either.
When the two base words have an identical number of syllables, the number of segments often plays a role. Here again, in order to facilitate recoverability, blends tend to preserve as many base segments as possible, given the restriction on the number of syllables noted above. This tendency affects the order of the base words in exocentric blends, in which the order is not determined by a head–modifier relation. For example, a word with a complex onset will be first and a word with a complex coda second. That is, the order of the base words is determined by the principle requiring the maximization of the number of segments (see Table 3). In some cases, segmental maximization is blocked by the phonotactics of the language. For example, from bang + smash we obtain bash, rather than the segmentally richer form *smang (smash + bang), since English does not allow monomorphemic sCVC words where the two Cs are nasal (Davis, 1988). The fact that blends are subject to stem phonotactics supports the claim that blends are monomorphemic despite their polymorphemic base.
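The ordering effect of segmental maximization can be illustrated with a rough Python sketch. It assumes monosyllabic base words, approximates segments by letters, and builds each candidate from the onset of the first word plus the rhyme of the second (the division the article describes for monosyllabic blends in a later section). All function names are hypothetical, and `blocked_by_phonotactics` encodes only the single sCVC restriction mentioned above, not English phonotactics in general.

```python
VOWELS = set("aeiou")

def split_onset_rhyme(word):
    """Split a monosyllabic word before its first vowel letter (spelling-based approximation)."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i], word[i:]
    return word, ""

def candidate(first, second):
    """Onset of the first word followed by the rhyme of the second."""
    return split_onset_rhyme(first)[0] + split_onset_rhyme(second)[1]

def blocked_by_phonotactics(form):
    """Crude check for sCVC forms whose two Cs are both nasal (e.g. *smang)."""
    onset, rhyme = split_onset_rhyme(form)
    return onset in ("sm", "sn") and rhyme.endswith(("m", "n", "ng"))

def best_order(a, b):
    """Return the longest unblocked candidate over both orders, or None if both are blocked."""
    options = [candidate(a, b), candidate(b, a)]
    allowed = [form for form in options if not blocked_by_phonotactics(form)]
    return max(allowed, key=len) if allowed else None

print(candidate("smash", "bang"))   # smang: the segmentally richer order
print(best_order("bang", "smash"))  # bash: the order that survives the filter
```

Counting letters instead of segments is a simplification (sh and ng are single segments), so the length comparison should be read as indicative only.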
Figure 1 Segmental overlap.
The Switch Point at Segmental Overlap
Contrary to the principle given above, there are blends consisting of more, and sometimes fewer, syllables than the longer base word. In many cases, this is due to the presence of one or more segments (shown in boldface below) shared by the two base words. In such cases, the position of the shared segments determines the ‘switch point’ of the blend, i.e., where the first base word ends and the second begins (see Table 4). The selection of the position of the shared segment(s) as the switch point contributes to segmental maximization. The shared segments overlap and thus correspond to segments in both base words, allowing more segments from each word to be preserved in the blend. For example, diabesity preserves diabe from diabetes and besity from obesity. Notice that in Chicagorilla all segments of the base words appear in the blend. Of course, the more segments of the base words in the blend there are, the more transparent the base words are (see Figure 1). Segmental overlap by the shared segments may also determine the order of the base words in exocentric blends (in which the order of the base words is not determined by the head–modifier relation) (see Table 5). There are cases where only one order of the two words allows a segmental overlap of the shared segments. The requirement to have the switch point at the segmental overlap usually overrides the requirement to maintain the same number of syllables in the blend as in the longer base word (see Table 4). In a few cases, such as Bisquick ‘quick biscuit,’ it also overrides the order imposed by the head–modifier relation (Algeo, 1977). However, there are plenty of blends that meet all the requirements (see Table 6).
Table 4 The switch point at the overlap of the identical segments shared by the base words
Base words (number of syllables)
advertisement (4) + entertainment (4)
dynamic (3) + magnetic (4)
narcotic (3) + coma (2)
shame (1) + amateur (3)
snob (1) + problem (2)
velocity (4) + tone (1)
west (1) + Australia (4)
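The overlap principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the analysis itself: spelling stands in for segments, the shared sequence is supplied by hand rather than detected, and the function names `preserved_parts` and `blend_at_overlap` are hypothetical.

```python
from typing import Tuple


def preserved_parts(first: str, second: str, shared: str) -> Tuple[str, str]:
    """Return the stretch of each base word that survives in the blend.

    Assumption: letters stand in for segments, and `shared` is the letter
    sequence common to both words that serves as the switch point.
    """
    i = first.find(shared)
    j = second.find(shared)
    if i < 0 or j < 0:
        raise ValueError(f"{shared!r} must occur in both base words")
    # The first word is kept up to the end of the shared stretch;
    # the second word is kept from the start of the shared stretch.
    return first[: i + len(shared)], second[j:]


def blend_at_overlap(first: str, second: str, shared: str) -> str:
    """Join two base words so that the shared letters appear only once."""
    head, tail = preserved_parts(first, second, shared)
    # Drop the shared letters from the tail so they are not doubled.
    return head + tail[len(shared):]


print(blend_at_overlap("diabetes", "obesity", "be"))  # diabesity
print(preserved_parts("diabetes", "obesity", "be"))   # ('diabe', 'besity')
print(blend_at_overlap("Chicago", "gorilla", "go"))   # Chicagorilla
```

Because the shared letters count toward both base words, the two preserved parts together are longer than the blend, which is the sense in which overlap serves segmental maximization.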
The Switch Point at Syllable Constituency
When the two base words do not have a shared segment, the syllable structure plays a role in determining the switch point. In monosyllabic blends, derived from two monosyllabic base words, the switch point (marked with !) must be at the onset–nucleus boundary (see Table 7). The question is: which word contributes its nucleus, the first (CV!C) or the second (C!VC)? It appears that there is a preference for the latter option; that is, the first word contributes only its onset and the second contributes its nucleus and coda, i.e., its entire rhyme (Kubozono, 1990). Since the onset and the nucleus are perceptually more salient than the coda, this division allows the blend to preserve one perceptually salient element from each base word, i.e., the onset from the first word and the nucleus from the second. There are, however, several exceptions, some of which are due to lexical blocking, for example, slosh (*slush – lexical blocking) from slop + slush; boost (*boist) from boom + hoist; and moorth (*mearth – lexical blocking) from moon + earth. In polysyllabic blends, there is a preference for the switch point to be at the syllable boundary in the blend, which allows maximization of the segmental material (see Table 8). That is, camera + recorder yields cam!corder rather than *cam!order. However, there is a restriction on the type of coda–onset contact at the switch point. This restriction, known as the Syllable Contact Law (Vennemann, 1988), requires the coda to be more sonorous than the adjacent onset. When this requirement is not met, or when the distance in sonority between the coda and the onset is insufficient, the switch point is at the onset–nucleus boundary of the second word (as in monosyllabic blends). Thus, rocket + balloon does not yield *rock!lloon, due to the offending kl contact, and therefore the surface form is rock!oon.
Conclusion
The discussion above suggests that the formation of blends is governed by several principles that together determine the order of the base words, the size of the blend, and the switch point. The order of the base words is determined by the head–modifier relation, requiring the head to follow its modifier (see Table 1a). In the absence of such a relation, i.e., in an exocentric relation, the phonology plays a role. When the two base words have one or more shared segments, the order of the base words is such that these segments overlap (Table 6). In the absence of shared segments, segmental maximization determines the order (Table 3). The number of syllables in the blend is also determined by the overlap of the shared segments, which demarcate the switch point (Table 4). In the absence of a shared segment, the number of syllables in the blend is identical to that in the longer base word (Table 2). If the two base words have an identical number of syllables, then segmental maximization plays a role (Table 3). The switch point is determined by the shared segments, which overlap in the blend (Tables 4 and 5). In the absence of a shared segment, the switch point is determined by syllabic constituency. In monosyllabic blends, the switch point is at the onset–nucleus boundary, such that the blend preserves the onset of the first word and the nucleus plus the coda of the second (Table 7). In polysyllabic blends, the switch point is at the syllable boundary, in cases where the coda–onset contact respects the Syllable Contact Law; otherwise, it is at the onset–nucleus boundary (Table 8). The principles governing the formation of blends are not always obeyed. The few exceptions found reflect a natural state of affairs in derivational morphology, where exceptions are often due to some extragrammatical factors. There is, however, intergrammatical (nonexceptional) violation of principles, in cases of conflict (e.g., switch point at syllable constituency and the Syllable Contact Law (Table 8)). In such cases, one principle has a (language-specific) priority over the other, allowing a deterministic selection of the surface form. A model of conflicting principles and violation under conflict is provided by Optimality Theory (Prince and Smolensky, 1993).
Table 8 The switch point in polysyllabic blends
See also: Complex Segments; Compound; Head/Depen-
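The two ways of cutting a monosyllabic blend at the onset–nucleus boundary can be sketched as follows. This is a rough illustration under stated assumptions rather than a model of the analysis: spelling stands in for segments, runs of the vowel letters a, e, i, o, u approximate the nucleus, and the function names are hypothetical.

```python
import re

# Assumption: a monosyllabic word is spelled as consonant letters, then
# vowel letters, then consonant letters; the letter "y" is not handled.
_SYLLABLE = re.compile(r"^([^aeiou]*)([aeiou]+)([^aeiou]*)$")


def split_syllable(word):
    """Split a monosyllabic word into (onset, nucleus, coda) by spelling."""
    match = _SYLLABLE.match(word.lower())
    if match is None:
        raise ValueError(f"{word!r} does not fit the one-syllable pattern")
    return match.groups()


def onset_plus_rhyme(first, second):
    """Preferred cut (C!VC): onset of the first word, rhyme of the second."""
    onset, _, _ = split_syllable(first)
    _, nucleus, coda = split_syllable(second)
    return onset + nucleus + coda


def body_plus_coda(first, second):
    """Alternative cut (CV!C): onset and nucleus of the first word, coda of the second."""
    onset, nucleus, _ = split_syllable(first)
    _, _, coda = split_syllable(second)
    return onset + nucleus + coda


print(onset_plus_rhyme("bang", "smash"))  # bash
print(onset_plus_rhyme("moon", "earth"))  # mearth
print(body_plus_coda("moon", "earth"))    # moorth
print(body_plus_coda("boom", "hoist"))    # boost
```

The preferred cut reproduces bash but also generates the starred forms *mearth, *boist, and *slush; the attested exceptions moorth, boost, and slosh match the alternative cut. The sketch does not model lexical blocking, so choosing between the two outputs is left to the reader.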
Bibliography
Adams V (1973). An introduction to Modern English word-formation. London: Longman.
Algeo J (1977). ‘Blends, a structural and systemic view.’ American Speech 52, 47–64.
Bat-El O (1996). ‘Selecting the best of the worst: The grammar of Hebrew blends.’ Phonology 13, 283–328.
Bauer L (1983). English word formation. Cambridge: Cambridge University Press.
Bauer L (1988). Introducing linguistic morphology. Edinburgh: Edinburgh University Press.
Bryant M M (1974). ‘Blends are increasing.’ American Speech 49, 163–184.
Fradin B (2000). ‘Combining forms, blends and related phenomena.’ In Doleschal U & Thornton A M (eds.) Extragrammatical and marginal morphology. Munich: Lincom Europa. 11–59.
Kubozono H (1990). ‘Phonological constraints on blending in English as a case for phonology–morphology interface.’ Yearbook of Morphology 3, 1–20.
Piñeros C E (2004). ‘The creation of portmanteaus in the extragrammatical morphology of Spanish.’ Probus 16, 201–238.
Prince A & Smolensky P (1993). Optimality theory: Constraint interaction in generative grammar. Technical report RuCCS-TR-2. Rutgers Center for Cognitive Science.
Spencer A (1991). Morphological theory. Oxford: Blackwell.
Vennemann T (1988). Preference laws for syllable structure. Berlin: Mouton de Gruyter.
Blessings
B G Szuchewycz
© 2006 Elsevier Ltd. All rights reserved.
This article is reproduced from the previous edition, volume 1, pp. 370–371, © 1994, Elsevier Ltd.
Blessings are utterances associated primarily with the sphere of religious activity, but they also appear with varying frequency in the politeness formulas and parenthetical expressions of everyday conversation. In both contexts, the dominant linguistic feature is the use of formal and/or formulaic language. Blessings, particularly in religious ritual, may also be accompanied by specific nonlinguistic features including gestures (e.g., laying on of hands, the sign of the cross) and the use of special objects (e.g., a crucifix) or substances (e.g., water, oil). Concern with such patterned relationships between linguistic form, on the one hand, and social context and function, on the other, is central to the study of the role of language in social life. Linguistically, blessings (and their opposite, curses) are marked by the use of a special language, which may be either a highly formal or archaic variety of the dominant language (e.g., Classical Arabic) or a different code entirely (e.g., Latin). In addition to their specific content, linguistic features such as repetition, special form (e.g., parallel couplets), special prosody (e.g., chant), and fixity of pattern distinguish blessings from other types of speech and contribute to their formal and formulaic character. The concept of blessing in Jewish, Christian, and Muslim thought, as in many other traditions, is concerned with the bestowal of divine favor or benediction through the utterance of prescribed words. As such, blessings represent an example of the belief in the magical power of words, other manifestations of which include the use of spells, incantations, and curses. As an aspect of religious behavior, blessings are associated with essential components of public and private ritual activity. They are performed by religious specialists in situations of communal worship as, for example, in rituals where a general blessing of those present marks the end of the event. 
Blessings are also used by nonspecialists to solemnize, sacralize, and/or mark the boundaries of social events. In traditional Judaism, for example, brokhe ‘blessings’ include short formulaic expressions used in a wide variety of situations as well as longer texts associated with domestic ceremonies (e.g., a grace after a meal) and specific occasions or rites (e.g., Passover, weddings, funerals). Common to all is a fixity of form and
the strict association of specific texts to specific occasions. In the Bible, the Hebrew root brk ‘blessing’ is associated with a number of meanings. A blessing may be an expression of praise or adoration of God, a divine bestowal of spiritual, material, or social prosperity, or an act of consecration that renders objects holy. The Greek eulogia of the New Testament stresses the spiritual benefits that are obtainable through Christ, the gospels, and the institution of the church (e.g., liturgical blessings). Each instance – praise, benediction, and consecration – represents a social and religious act accomplished through the use of a highly conventionalized form of language. Blessings often function as ‘performatives.’ A performative is a speech act that, when uttered, alters some state of affairs in the world. Under the appropriate conditions, if a minister states, ‘I pronounce you man and wife,’ then a marriage has been socially established. If someone says, ‘I promise,’ then a promise has been made. Similarly, blessings function as religious performatives, in that the utterance of the requisite expression precipitates a change in spiritual state. Mastery of the linguistic formulas, however, is not sufficient for the successful realization of blessings (and other performatives). The existence of an extralinguistic institution (e.g., family, descent group, religious institution) with differentiated social roles and statuses for the blessor and blessee(s) is a necessary precondition to an authentic and valid performance of the act. Only certain individuals may pronounce a couple man and wife and create a legally binding marriage. The same is true of blessings. Catholicism, for example, distinguishes those blessings exchanged between lay persons, the spiritual value of which depends on the personal sanctity of the blessor, from liturgical blessings, which carry the force of the ecclesiastical institution. 
As the institution itself is hierarchically organized, so too is the right to confer particular blessings. Some may be performed by the pontiff alone, some only by a bishop, others by a parish priest, and yet others by a member of a religious order. Similarly, and in a very different ethnographic context, among the Merina of Madagascar the tsodrano is a ritual blessing in which seniors act as intermediaries between ancestors and those being blessed, their juniors. A father bestows fertility and wealth on his son through a ceremonial public blessing that transfers to the son the power of the ancestors in a ritual stressing the continuity and reproduction of the descent group.
Like other performatives, blessings operate properly only within a context of social and cultural norms and institutions, which are necessary for their realization and to legitimate and maintain their force. Much of human face-to-face interaction is ritualistic in nature, and it has been argued that the use of formalized and prepatterned linguistic and nonlinguistic behavior in everyday life is evidence of a link between interpersonal rituals of politeness, on the one hand, and ritual behavior in the sacred sphere, on the other (Brown and Levinson, 1987). Blessings are an example of a specific linguistic routine common to both. In nonreligious contexts, blessings are evident in the politeness formulas and parenthetical expressions of everyday conversation: for example, the English ‘Bless you!’ as a conventional response to a sneeze. Similarly, in greetings, thanks, and leave-takings, blessings are exchanged between interlocutors and, although they may literally express a wish for supernatural benefits, their primary communicative function is as highly conventionalized markers of social and/or interactional status. In both their religious and secular uses, blessings thus function as expressions of solidarity, approval, and good will.
When embedded parenthetically within larger sentences or longer texts, blessings may also function as semantically and interactionally significant units. In oral narratives, the use of a blessing (or curse) serves to communicate directly the emotional state or attitude of the speaker toward the topic, providing a means of internal evaluation and signaling speaker involvement in the text. Yiddish speakers, for example, make extensive use of a large set of fixed expressions, many of which are blessings, for just such a purpose (Matisoff, 1979).
Bibliography
Brown P & Levinson S C (1987). Politeness: Some universals in language usage. Cambridge: Cambridge University Press.
Matisoff J A (1979). Blessings, curses, hopes and fears: Psycho-ostensive expressions in Yiddish. Philadelphia, PA: Institute for the Study of Human Issues.
Ries J (1987). ‘Blessing.’ In Eliade M (ed.) The encyclopedia of religion. New York: Macmillan.
Westermann C (1978). Blessing: In the Bible and the life of the church. Philadelphia, PA: Fortress Press.
Bloch, Bernard (1907–1965)
J G Fought, Diamond Bar, CA, USA
© 2006 Elsevier Ltd. All rights reserved.
Bernard Bloch studied English and German (A.B. 1928, M.A. 1929) at the University of Kansas, where his father Albert taught art. Continuing at Northwestern University, he took a course in linguistics with Werner F. Leopold in 1931. That same year, he was chosen as a field worker on the Linguistic Atlas of New England, directed by Hans Kurath. In 1933 he followed Kurath and the Atlas project to Brown University. Bernard (and his wife Julia) did much exacting editorial work on the Atlas. He completed his Ph.D. at Brown in 1935, in English and linguistics, teaching English and modern languages there until moving to Yale’s linguistics department in 1943. His character, his intelligent and disciplined scholarship, and his extraordinary writing and editorial skills soon made Bloch an influential presence within the Linguistic Society of America. In 1940 he became the second editor of its journal, Language, and continued as editor until his death. His insistence on clarifying each point in a manuscript made it no idle jest when he later remarked that he had published
many papers each year, most of them under famous pseudonyms. Bloch and Leonard Bloomfield shared intensely demanding applied linguistic work during the war. Although he was Bloomfield’s junior colleague at Yale for only a few years, Bloomfield’s influence on him was profound (Bloch, 1949). The austere modernist intellectual architecture of their work is very similar (Bloch, 1948); Bloch’s writing is much friendlier to readers. His wartime work on Japanese was published as a basic course, and later in a series of descriptive publications capped by the article on phonemics (Bloch, 1950), all meant to illustrate the application of the principles of linguistic description. His ‘English verb inflection’ (Bloch, 1947) is an exemplar of distributionalist structural morphology, compactly presenting a remarkably complete solution together with its rationale. Bloch was an extraordinary teacher, delivering beautifully composed informal lectures as lightly as one might carry on a conversation, sustaining an easy exchange of statements, questions, and answers. He would sometimes read a few sentences from some unidentified publication, extracts chosen for their comic value in illustrating various rhetorical or
factual blunders. It transpired that all of these examples were drawn from his own published work. Students in his introductory course wrote a two-page essay each week on a topic relevant to the readings. These were returned at the next class, edited with the same fierce devotion to clarity and professionalism that he brought to all papers sent to the editor of Language. They came back folded lengthwise with his unsparing comments typed in a narrow column on the back. Many of us kept those papers as treasures.
See also: Bloomfield, Leonard (1887–1949); Japanese; Kurath, Hans (1891–1992); Phoneme; Structuralism.
Bibliography
Bloch B (1947). ‘English verb inflection.’ Language 23, 399–418.
Bloch B (1948). ‘A set of postulates for phonemic analysis.’ Language 24, 3–46.
Bloch B (1949). ‘Leonard Bloomfield.’ Language 25, 87–98.
Bloch B (1950). ‘Studies in colloquial Japanese: IV. Phonemics.’ Language 26, 86–125.
Joos M (1967). ‘Bernard Bloch.’ Language 43, 3–19.
Bloch, Jules (1880–1953)
M McCaskey, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.
Jules Bloch was born in Paris on May 1, 1880, and attended the Lycée Louis-le-Grand as a scholarship student. He completed his Licence ès Lettres, and subsequently became a graduate student in Sanskrit and ancient literature and culture in the École des Hautes Études at the University of Paris. In 1905, he undertook his first major academic project toward the end of his graduate training. He and two other researchers were given the task of translating large portions of the monumental three-volume Kurze vergleichende Grammatik der indogermanischen Sprachen (1902–1904) by the Indo-European linguists Karl Brugmann and Berthold Delbrück. Bloch then helped edit an abridged version of the translated text, Abrégé de grammaire comparée des langues indo-européennes (1905), supervised by Antoine Meillet, a specialist in Indo-European linguistics at the École des Hautes Études. In 1906, Bloch published his own diploma thesis on Sanskrit, La phrase nominale en sanscrit, and went on to pursue the study of Hindi and Tamil. He performed research in the field in India, later moving to Vietnam, where he served on the faculty of the École Française d’Extrême-Orient in Hanoi. In 1914, Bloch completed and submitted his doctoral thesis, La formation de la langue marathe, a diachronic study of Marathi; for this he received the Prix Volney, a prestigious linguistic prize awarded annually by the Institut de France since 1822. His research was soon interrupted by infantry service for four years in World War I, during which he rose from
sergeant to lieutenant and was awarded the Croix de Guerre for bravery. Bloch returned to the École des Hautes Études in 1919, and was made Director of Studies there in 1920. He also served as Professor of Sanskrit at the Sorbonne, and in 1937 became a professor at the Collège de France, where he remained until his retirement in 1951. Bloch also served as the secretary of the Société Linguistique in France for close to a quarter of a century (1920–1944), keeping in close touch with other leading linguists in Europe and India throughout his career. He also guided and assisted many Indian students in Paris, and a number of them subsequently distinguished themselves in the field of Indo-European linguistic studies. Bloch developed proficiency in and did research on a number of languages of India, ancient and modern, including Sanskrit, Pali, Vedic language, Hindi, and Marathi, an Indo-European language spoken by over 65 000 000 people. Bloch also did research on Tamil, a Dravidian language spoken by more than 50 000 000 people in India, Sri Lanka, Malaysia, and elsewhere in Southeast Asia. His Structure grammaticale des langues dravidiennes (1946) was one of the first modern linguistic studies of the Dravidian family of languages. Bloch also began a project to translate the Pali Buddhist Canon, with his inaugural volume of the Canon bouddhique Pāli (1949), but this work was unfortunately not continued by others. In the last year of his life, Bloch published one of the first modern scholarly studies of the Romany-speaking people, Les Tsiganes (1953). Romany, spoken by an estimated 2 000 000 people, is an Indo-European language with origins in India and grammatical affinities with Sanskrit. Bloch was one of the first Indo-European
linguists to undertake the systematic study of Romany language and culture.
See also: Brugmann, Karl (1849–1919); Delbrück, Berthold (1842–1922); Dravidian Languages; Indo–Aryan Languages; Indo–European Languages; Meillet, Antoine (Paul Jules) (1866–1936).
Bibliography
Bloch J (1905). Abrégé de grammaire comparée des langues indo-européennes, d’après le Précis de grammaire comparée de K. Brugmann et B. Delbrück. Tr. par J. Bloch, A. Cuny et A. Ernout, sous la direction de A. Meillet et R. Gauthiot. Paris: C. Klincksieck.
Bloch J (1906). ‘La phrase nominale en sanscrit.’ Mémoires de la Société de Linguistique de Paris, vol. XIV, 27–96.
Bloch J (1920). La formation de la langue marathe. Paris: É. Champion.
Bloch J (1934). L’indo-aryen du Veda aux temps modernes. Paris: Adrien-Maisonneuve.
Bloch J (1946). Structure grammaticale des langues dravidiennes. Publications du Musée Guimet. Bibliothèque d’études, t. 56. Paris: A. Maisonneuve.
Bloch J (1949). Canon bouddhique Pāli (Tripitaka). Texte et traduction par Jules Bloch, Jean Filliozat, Louis Renou. Paris: Adrien-Maisonneuve.
Bloch J (1950). Les inscriptions d’Asoka; traduites et commentées par Jules Bloch. Paris: Les Belles Lettres.
Bloch J (1953). Les Tsiganes. Paris: Presses universitaires de France.
Bloch J (1970). The formation of the Marāthī language, translated by Dev Raj Chanana. Delhi: Motilal Banarsidass.
Bloch J (1985). Recueil d’articles de Jules Bloch, 1906–1955: textes rassemblés par Colette Caillat. Paris: Collège de France, Institut de Civilisation Indienne.
Bloomfield, Leonard (1887–1949)
J G Fought, Diamond Bar, CA, USA
© 2006 Elsevier Ltd. All rights reserved.
Leonard Bloomfield was born in Chicago; his family moved to rural Wisconsin when he was nine. He graduated from Harvard in 1906. When he sought an assistantship in German at the University of Wisconsin that summer, he met the Germanist Eduard Prokosch (1876–1938), who introduced him to linguistics. Bloomfield took his doctorate in Germanic philology at the University of Chicago in 1909. He taught German for one year at the University of Cincinnati as an instructor, later moving to the University of Illinois. In 1913–1914 he studied with the Neogrammarians Karl Brugmann, August Leskien, and Hermann Oldenberg at the Universities of Leipzig and Göttingen and then returned to Illinois, only then becoming an assistant professor, his rank until 1921. During his stay at Illinois he also published his first work on a non-Indo-European language, Tagalog texts with grammatical analysis (1917), whose conception and organization were very probably influenced by his friend Franz Boas (1858–1942). In 1919, he began his work on the Algonquian languages (1928, 1930, 1934, 1946), some of which was edited and published posthumously (1957, 1962, 1975). In 1921, he moved to Ohio State University as a full professor. While there, he and the behavioral psychologist Albert Paul Weiss (1879–1931) became friends, and Bloomfield adopted some of the idiom of that
approach, though its role in his work has been greatly exaggerated. There Bloomfield also took part, with George Melville Bolling (1871–1963), in founding the Linguistic Society of America in 1925. Esper (1968) was an invaluable eyewitness report on this period in Bloomfield’s career. In 1927, Bloomfield returned to the University of Chicago, where he and Edward Sapir (1884–1939) were briefly colleagues. His years at the University of Chicago seem to have been the most pleasant and productive of his working life. In 1940 he went to Yale, as Sterling Professor, the successor of Prokosch and to some degree also of Sapir. Bloomfield led the linguistics program and took an active role in war-related work on practical language-learning materials, writing and editing a number of manuals. A stroke ended his working life in 1946; he died in 1949. His family life was darkened by tragedies. Bernard Bloch, who knew and admired him, described his personality as ‘‘not strongly magnetic’’ (1949: 91). Anecdotes show his readiness to use highly refined sarcasm in dealing with critics, colleagues, and students alike. For an extended example, see Bloomfield, 1944; in a more typical instance, he claimed that his introductory textbook Language (1933) could be understood by any bright high-school student. This remark has often been cited as evidence of Bloomfield’s innocence by scholars who have struggled with this formidable book. It is not. He supervised only a handful of dissertations, and he sometimes tried to discourage students from specializing in linguistics. It was through his publications,
linguists to undertake the systematic study of Romany language and culture.
See also: Brugmann, Karl (1849–1919); Delbrück, Berthold (1842–1922); Dravidian Languages; Indo–Aryan Languages; Indo–European Languages; Meillet, Antoine (Paul Jules) (1866–1936).
Bibliography
Bloch J (1905). Abrégé de grammaire comparée des langues indo-européennes, d’après le Précis de grammaire comparée de K. Brugmann et B. Delbrück. Tr. par J. Bloch, A. Cuny et A. Ernout, sous la direction de A. Meillet et R. Gauthiot. Paris: C. Klincksieck.
Bloch J (1906). ‘La phrase nominale en sanscrit.’ Mémoires de la Société de Linguistique de Paris, vol. XIV, 27–96.
Bloch J (1920). La formation de la langue marathe. Paris: É. Champion.
Bloch J (1934). L’indo-aryen du Veda aux temps modernes. Paris: Adrien-Maisonneuve.
Bloch J (1946). Structure grammaticale des langues dravidiennes. Publications du Musée Guimet. Bibliothèque d’études, t. 56. Paris: A. Maisonneuve.
Bloch J (1949). Canon bouddhique Pāli (Tripitaka) Texte et traduction par Jules Bloch, Jean Filliozat, Louis Renou. Paris: Adrien-Maisonneuve.
Bloch J (1950). Les inscriptions d’Asoka; traduites et commentées par Jules Bloch. Paris: Les Belles Lettres.
Bloch J (1953). Les Tsiganes. Paris: Presses universitaires de France.
Bloch J (1970). The formation of the Marāthī language, translated by Dev Raj Chanana. Delhi: Motilal Banarsidass.
Bloch J (1985). Recueil d’articles de Jules Bloch, 1906–1955: textes rassemblés par Colette Caillat. Paris: Collège de France, Institut de Civilisation Indienne.
Bloomfield, Leonard (1887–1949) J G Fought, Diamond Bar, CA, USA © 2006 Elsevier Ltd. All rights reserved.
Leonard Bloomfield was born in Chicago; his family moved to rural Wisconsin when he was nine. He graduated from Harvard in 1906. When he sought an assistantship in German at the University of Wisconsin that summer, he met the Germanist Eduard Prokosch (1876–1938), who introduced him to linguistics. Bloomfield took his doctorate in Germanic philology at the University of Chicago in 1909. He taught German for one year at the University of Cincinnati as an instructor, later moving to the University of Illinois. In 1913–1914 he studied with the Neogrammarians Karl Brugmann, August Leskien, and Hermann Oldenberg at the Universities of Leipzig and Göttingen and then returned to Illinois, only then becoming an assistant professor, his rank until 1921. During his stay at Illinois he also published his first work on a non-Indo-European language, Tagalog texts with grammatical analysis (1917), whose conception and organization were very probably influenced by his friend Franz Boas (1858–1942). In 1919, he began his work on the Algonquian languages (1928, 1930, 1934, 1946), some of which was edited and published posthumously (1957, 1962, 1975). In 1921, he moved to Ohio State University as a full professor. While there, he and the behavioral psychologist Albert Paul Weiss (1879–1931) became friends, and Bloomfield adopted some of the idiom of that
approach, though its role in his work has been greatly exaggerated. There Bloomfield also took part, with George Melville Bolling (1871–1963), in founding the Linguistic Society of America in 1925. Esper (1968) is an invaluable eyewitness report on this period in Bloomfield’s career. In 1927, Bloomfield returned to the University of Chicago, where he and Edward Sapir (1884–1939) were briefly colleagues. His years at the University of Chicago seem to have been the most pleasant and productive of his working life. In 1940 he went to Yale, as Sterling Professor, the successor of Prokosch and to some degree also of Sapir. Bloomfield led the linguistics program and took an active role in war-related work on practical language-learning materials, writing and editing a number of manuals. A stroke ended his working life in 1946; he died in 1949. His family life was darkened by tragedies. Bernard Bloch, who knew and admired him, described his personality as “not strongly magnetic” (1949: 91). Anecdotes show his readiness to use highly refined sarcasm in dealing with critics, colleagues, and students alike. For an extended example, see Bloomfield, 1944; in a more typical instance, he claimed that his introductory textbook Language (1933) could be understood by any bright high-school student. This remark has often been cited as evidence of Bloomfield’s innocence by scholars who have struggled with this formidable book. It is not. He supervised only a handful of dissertations, and he sometimes tried to discourage students from specializing in linguistics. It was through his publications,
especially Language, that he shaped American descriptive linguistics as a discipline during its structuralist period. Bloomfield began as a Germanist and Indo-Europeanist in the Neogrammarian tradition. These disciplines, and his rigorous cast of mind, provided the foundation for his austere approach to language description. The contrastive comparison of linguistic forms through the construction of textual concordances, the logic of textual variants, and many other analytical techniques and concepts of the classical comparative method, all became basic tools of descriptive and pedagogical applications of linguistics. Bloomfield’s Outline guide for the practical study of foreign languages (1942) described this toolkit and explained its use. His method was based on the notion of the linguistic sign; it called for comparing linguistic forms that are partly alike and partly different, and then looking for other examples of each part so as to understand how they are alike and how they are different in form and function. In a letter to Truman Michelson written in 1919, Bloomfield, then 32, had already condensed his method of analysis into one sentence: “No preconceptions; find out which sound variations are distinctive (as to meaning) and then analyze morphology and syntax by putting together everything that is alike” (Hockett, 1987: 41). When compiling a descriptive grammar, however, Bloomfield selected among variants in his data to build and then describe a community norm of usage. Such a norm was implicit in his account of usage differences among Menomini speakers (1927); the details of its construction were brilliantly illuminated by Goddard (1987).
See also: Algonquian and Ritwan Languages; Bloch, Bernard (1907–1965); Boas, Franz (1858–1942); Brugmann, Karl (1849–1919); Germanic Languages; Hockett, Charles Francis (1916–2000); Leskien, August (1840–1916); Linguistics as a Science; Sapir, Edward (1884–1939); Structuralism; Tagalog.
Bibliography
Bloch B (1949). ‘Leonard Bloomfield.’ Language 25, 87–94.
Bloomfield L (1917). Tagalog texts with grammatical analysis. University of Illinois Studies in Language and Literature (vol. 3, Nos. 2–4). Urbana: University of Illinois Press.
Bloomfield L (1926). ‘A set of postulates for the science of language.’ Language 2, 152–164.
Bloomfield L (1927). ‘Literate and illiterate speech.’ American Speech 2, 432–439.
Bloomfield L (1928). Menomini texts (Publications of the American Ethnological Society, vol. 12). New York: G. E. Stechert, agents.
Bloomfield L (1930). Sacred stories of the Sweet Grass Cree (National Museum of Canada, Bulletin No. 6). Ottawa: F. A. Acland.
Bloomfield L (1933). Language. New York: Holt.
Bloomfield L (1934). Plains Cree texts (Publications of the American Ethnological Society, vol. 16). New York: G. E. Stechert, agents.
Bloomfield L (1942). Outline guide for the practical study of foreign languages. Baltimore: Linguistic Society of America.
Bloomfield L (1944). ‘Secondary and tertiary responses to language.’ Language 20, 45–55.
Bloomfield L (1946). ‘Algonquian.’ In Hoijer H et al. (eds.) Linguistic structures of native America (Viking Fund publications in anthropology, 6, 85–129). New York: Wenner-Gren Foundation.
Bloomfield L (1957). Eastern Ojibwa: grammatical sketch, texts, and word list. Ann Arbor: University of Michigan Press.
Bloomfield L (1962). The Menomini language. New Haven & London: Yale University Press.
Bloomfield L ed. by Charles F Hockett (1975). Menomini lexicon. Milwaukee, WI: Milwaukee Public Museum Press.
Esper E A (1968). Mentalism and objectivism in linguistics: the sources of Leonard Bloomfield’s psychology of language. New York: American Elsevier.
Goddard I (1987). ‘Leonard Bloomfield’s descriptive and comparative studies of Algonquian.’ In Hall R A (ed.) Leonard Bloomfield: essays on his life and work. Amsterdam: John Benjamins. 179–217.
Hockett C F (ed.) (1970). A Leonard Bloomfield anthology. Bloomington: Indiana University Press.
Hockett C F (1987). ‘Letters from Bloomfield to Michelson and Sapir.’ In Hall R A (ed.) Leonard Bloomfield: essays on his life and work. Amsterdam: John Benjamins. 39–60.
Blumer, Herbert (1900–1987) N Denzin, University of Illinois at Urbana–Champaign, Urbana, IL, USA © 2006 Elsevier Ltd. All rights reserved.
Herbert Blumer is the founding father of the unique social psychological perspective called ‘symbolic interactionism.’ The foremost student of George Herbert Mead (see Mead, George Herbert (1863–1931)), he translated Mead’s philosophy into a theory of self, society, and interaction that has come to be known as the ‘symbolic interactionist perspective’ in contemporary U.S. sociology. Blumer received his bachelor’s and master’s degrees from the University of Missouri in 1921 and 1922, respectively. He taught there until 1925, when he left to enter the doctoral program of the department of sociology at the University of Chicago, where he received his Ph.D. in 1928. He became an instructor in sociology at Chicago in 1925, was an associate professor from 1931 to 1947, and was a professor from 1947 to 1952. When Mead died in 1931, Blumer took over his social psychology course. From 1930 to 1935, Blumer was secretary–treasurer of the American Sociological Association, and he was elected its president in 1955. In 1954, he was elected president of the Society for the Study of Social Problems. He also served as president of the Pacific Sociological Association and as vice president of the International Sociological Association. From 1941 to 1952, he was editor of the American Journal of Sociology. During World War II, he served as liaison officer between the Office of War Information and the Bureau of Economic Warfare and as a public panel chairman of the War Labor Board. In 1952, he left Chicago to chair the Department of Sociology at the University of California at Berkeley, where he remained as a faculty member until his death in 1987.
Blumer was the author of approximately 60 articles, dozens of book reviews (in the American Journal of Sociology), two monographs [The rationale of labor–management relations (1958), and The world of youthful drug use (1967)], at least three review essays, three obituaries (Louis Wirth, Ernest Burgess, and Joseph Lohman), and four books [Movies and conduct (1933), Movies, delinquency, and crime (with P. M. Hauser, 1933), Critiques of research in the social sciences, I. An appraisal of Thomas and Znaniecki’s The Polish Peasant in Europe and America (1939), and Symbolic interactionism (1969)]. Posthumous publications include a collection of his
papers on Industrialization as an agent of social change: a critical analysis, as well as The collected papers of Herbert Blumer: George Herbert Mead and human conduct and Selected works of Herbert Blumer: a public philosophy for mass society. Blumer is remembered for his athletic prowess, his warmth as a person, his capacity as a sympathetic and understanding listener, and his acute memory and critical mind. He was a powerful and effective teacher of several generations of students who “found themselves and their careers while sitting in his classes” (Shibutani, 1970: viii). Blumer’s impact on U.S. sociology has been substantial. A loyal opponent of functionalism, positivism in sociology, and behavioral and cognitive psychology, he long championed the interpretive, naturalistic approach to human experience, social theory, and social research. Many of the ideas he put forth early in his career have since, as Shibutani noted, become generally accepted. His studies of the movies, fashion, collective behavior, racism and prejudice, the industrialization process, and social problems have become sociological classics and models of research for other scholars. He was the chief systematizer of the sociological implications of Mead’s thought, and his writings on symbolic interaction have served to define this perspective within the international sociological community. Blumer’s sociology involved the following assumptions. Human beings act toward things on the basis of the meanings things have for them. Meanings arise out of, and are modified in, the process of social interaction. Society consists of the joint interactions of individuals. These joint actions describe recurrent patterns of collective activity, complex networks of institutional relations, and historical processes and forces. The proper study of society is at the intergroup, interactional level.
Society is a framework for the operation of social, symbolic, economic, political, religious, kinship, and legal interactions. The notion of structure as process is central to Blumer’s argument. Social structures are composed of interacting units “caught up in the interplay of opposing processes of persistence and change” (Morrione, 2004: xvi). Social reality is situated in these sites of interaction. Blumer put in motion a methodological project that assumed an obdurate natural social world that could be studied scientifically – that is, mapped, reproduced, and made sense of through the careful work of the naturalistic researcher who gets close to the phenomenon under investigation. He sought a processual, interpretive social science that would utilize sensitizing concepts grounded in subjective
human experience. The empirical materials of this science would be valid and reliable, and they would permit the testing of hypotheses and the formulation of theoretical generalizations. Interpretive theory would confront the obdurate features of human group life and be shaped around the previously mentioned kinds of materials. When the Society for the Study of Symbolic Interaction formed in 1974, Blumer was an immediate supporter. His impact on symbolic interactionism has been permanently recognized by the society with its annual Herbert Blumer Award, which is given to the outstanding graduate student paper best representing the tradition associated with Blumer’s scholarship.
See also: Mead, George Herbert (1863–1931).
Bibliography
Blumer H (1969). Symbolic interactionism: perspective and method. Englewood Cliffs, NJ: Prentice-Hall.
Blumer H (1990). Industrialization as an agent of social change: a critical analysis. In Maines D R & Morrione T J (eds.). New York: DeGruyter.
Blumer H (2004). Herbert Blumer: George Herbert Mead and human conduct. In Morrione T J (ed.). Walnut Creek, CA: AltaMira.
Lyman S M & Vidich A J (1988). Social order and the public philosophy: an analysis and interpretation of the work of Herbert Blumer. Fayetteville: University of Arkansas Press.
Lyman S M & Vidich A J (eds.) (2000). Selected works of Herbert Blumer: a public philosophy for mass society. Urbana: University of Illinois Press.
Morrione T J (2004). ‘Preface.’ In Morrione T J (ed.) Herbert Blumer: George Herbert Mead and human conduct. Walnut Creek, CA: AltaMira. ix–xviii.
Shibutani T (ed.) (1970). Human nature and collective behavior: papers in honor of Herbert Blumer. Englewood Cliffs, NJ: Prentice-Hall.
Symbolic Interaction 11(1) (1988, Spring). Entire issue on Herbert Blumer’s legacy.
Wiseman J P (1987). ‘In memoriam: Herbert Blumer (1900–87).’ Journal of Contemporary Ethnography 16, 243–249.
Boas, Franz (1858–1942) J G Fought, Pomona College, Claremont, CA, USA © 2006 Elsevier Ltd. All rights reserved.
Franz Boas was born in Minden, Germany to a family of merchants. He graduated from the University of Kiel (Ph.D., 1881), specializing in psychophysics and geography. His first field work was conducted in Baffin Land in 1883; apparently this is when the focus of his interests began to shift from geography to anthropology. He came to the United States in 1886, working for a time at assorted jobs, including teaching, and managing anthropology exhibits at the Chicago World’s Fair (1892–1895). In these years he also began his long examination of Kwakiutl, Tsimshian, and other Northwest Coast languages and cultures. In 1899 he secured an appointment at Columbia University, an affiliation he retained for the rest of his life. He was a master of administration and fund raising. From his secure academic position, he soon made Columbia the source from which the professionalization of American anthropology would spread, shifting its focus from museums of artifacts to academic and field research, with linguistics as a core discipline. He strove always to reorient the field away from racism, whether overt or tacit.
As the developer and impresario of modern American anthropology and the mentor of many of its leading figures, he made an immensely significant contribution to American linguistics. Further, his contribution as a linguist in his own right was highly respectable. Boas was self-taught in linguistics. He was more successful in establishing standards for linguistic field work than in re-inventing historical and comparative linguistics as a tool of culture history. His background in perceptual psychology led him to publish (1889) an insight into naïve impressions of foreign language sounds that is a very early and independent expression of what became the phonemic principle. The magnitude of his overall contribution to the development of field linguistics and the study of Native American languages, even after making allowances for the personal contributions of Edward Sapir, his brilliant student, and Leonard Bloomfield, his friend, is only slightly exaggerated in Bloomfield’s memorial statement (1943: 198): “Boas amassed a tremendous body of observation, including much carefully recorded text, and forged, almost single-handed, the tools of phonetic and structural description.”
See also: Bloomfield, Leonard (1887–1949); Canada: Language Situation; Cultural Evolution of Language; Linguistic Anthropology; Primitive Languages; Relativism; Sapir,
Edward (1884–1939); Structuralism; United States of America: Language Situation.
Bibliography
Bloomfield L (1943). ‘Franz Boas.’ Language 19, 198.
Boas F (1889). ‘On alternating sounds.’ American Anthropologist 2, 47–53.
Boas F (ed.) (1911). Handbook of American Indian languages. Bulletin 40. Washington, DC: Bureau of American Ethnology.
Boas F (1940). Race, language, and culture. New York: Macmillan (reprinted 1966, New York: Free Press).
Boas F (1860–1942). Papers. Philadelphia: American Philosophical Society.
Cole D (1999). Franz Boas: the early years, 1858–1906. Seattle and London: University of Washington Press.
Mackert M (1993). ‘The roots of Franz Boas’ view of linguistic categories as a window to the human mind.’ Historiographia Linguistica 20, 331–351.
Stocking G W (1974). The shaping of American anthropology, 1883–1911: a Franz Boas reader. New York: Basic Books.
Body Language A Ponzio, Università di Bari, Bari, Italy © 2006 Elsevier Ltd. All rights reserved.
Body Language as Human Semiosis
Body language belongs to the sphere of anthroposemiosis, the object of anthroposemiotics (see Anthroposemiotics). In fact, the term ‘language’ in today’s semiotics is specific to human semiosis (i.e., human sign behavior). Following Charles Morris’s and Thomas Sebeok’s terminological specifications, semiotics describes sign behavior with general reference to the organism (i.e., it identifies semiosis and life), and distinguishes between ‘signs in human animals’ and ‘signs in nonhuman animals,’ reserving the term language as a special term for the former. In other words, language is specific to man as a semiotic animal – that is, as a living being not only able to use signs (capable of semiosis) but also able to reflect on signs through signs (capable of semiotics). In this acceptation, language is not verbal language alone: Language refers to both verbal and nonverbal human signs. In this view – that is, from a semiotic and not a linguistic perspective (pertaining to linguistics) – language is not reduced to speech but speech is a specification of language. Language is acoustic language as much as the gestural or the tactile, etc., depending on the kind of sign vehicle that intervenes, which is not necessarily limited to the verbal in a strict sense. Following Morris (1946/1971a: 112–114), there are five criteria for the definition of language:
1. Language is composed of a plurality of signs.
2. In a language each sign has a signification common to a number of interpretants: this is linguistic signification, common to members of the interpreter-family, whereas there may, of course, be differences of signification for individual interpreters, but such differences are not then regarded as linguistic.
3. The signs constituting a language must be ‘comsigns’ – that is, producible by the members of the interpreter-family. Comsigns are either activities of the organisms (e.g., gestures) or the products of such activities (e.g., sounds, traces left on a material medium, or constructed objects).
4. The signs that constitute a language are plurisituational signs – that is, signs with a relative constancy of signification in every situation in which a sign of the sign-family in question appears.
5. The signs in a language must constitute a system of interconnected signs combinable in some ways and not in others in order to form a variety of complex sign-processes.
If language is considered as synonymous with ‘communication,’ animals no doubt also possess language. If, on the contrary, language is distinguished from communication and determined by the five criteria mentioned previously, then animals certainly do not have language, although they do communicate. Even if some of the conditions that enable us to speak of language would seem to occur in animals, they do not occur together. On this subject, the following statement by Morris (1946/1971a: 130) seems important:
But even if these conditions were met [i.e., if all the other requirements were met in nonhuman animal communication], the fifth requirement is a harder hurdle. For though animal signs may be interconnected, and interconnected in such a way that animals may be said to infer, there is no evidence that these signs are combined by animals which produce them according to limitations of combinations necessary for the signs to form a language system. Such considerations strongly favor the hypothesis that language – as here defined – is unique to man.
78 Boas, Franz (1858–1942) Edward (1884–1939); Structuralism; United States of America: Language Situation.
Bibliography Bloomfield L (1943). ‘Franz Boas.’ Language 19, 198. Boas F (1889). ‘On alternating sounds.’ American Anthropologist 2, 47–53. Boas F (ed.) (1911). Handbook of American Indian languages. Bulletin 40. Washington, DC: Bureau of American Ethnology.
Boas F (1940). Race, language, and culture. New York: Macmillan (reprinted 1966, New York: Free Press). Boas F (1860–1942). Papers. Philadelphia: American Philosophical Society. Cole D (1999). Franz Boas: The early years, 1858–1906. Seattle and London: University of Washington Press. Mackert M (1993). ‘The roots of Franz Boas’ view of linguistic categories as a window to the human mind.’ Historiographia Linguistica 20, 331–351. Stocking G W (1974). The shaping of American anthropology, 1883–1911: A Franz Boas reader. New York: Basic Books.
Body Language A Ponzio, Universita` di Bari, Bari, Italy ! 2006 Elsevier Ltd. All rights reserved.
Body Language as Human Semiosis Body language belongs to the sphere of anthroposemiosis, the object of anthroposemiotics (see Anthroposemiotics). In fact, the term ‘language’ in today’s semiotics is specific to human semiosis (i.e., human sign behavior). Following Charles Morris’s and Thomas Sebeok’s terminological specifications, semiotics describes sign behavior with general reference to the organism (i.e., it identifies semiosis and life), and distinguishes between ‘signs in human animals’ and ‘signs in nonhuman animals,’ reserving the term language as a special term for the former. In other words, language is specific to man as a semiotic animal – that is, as a living being not only able to use signs (capable of semiosis) but also able to reflect on signs through signs (capable of semiotics). In this acceptation, language is not verbal language alone: Language refers to both verbal and nonverbal human signs. In this view – that is, from a semiotic and not a linguistic perspective (pertaining to linguistics) – language is not reduced to speech; rather, speech is a specification of language. Language is acoustic language as much as gestural or tactile language, etc., depending on the kind of sign vehicle that intervenes, which is not necessarily limited to the verbal in a strict sense. Following Morris (1946/1971a: 112–114), there are five criteria for the definition of language:
1. Language is composed of a plurality of signs.
2. In a language each sign has a signification common to a number of interpretants: this is linguistic signification, common to members of the interpreter-family, whereas there may, of course, be differences of signification for individual interpreters, but such differences are not then regarded as linguistic.
3. The signs constituting a language must be ‘comsigns’ – that is, producible by the members of the interpreter-family. Comsigns are either activities of the organisms (e.g., gestures) or the products of such activities (e.g., sounds, traces left on a material medium, or constructed objects).
4. The signs that constitute a language are plurisituational signs – that is, signs with a relative constancy of signification in every situation in which a sign of the sign-family in question appears.
5. The signs in a language must constitute a system of interconnected signs combinable in some ways and not in others in order to form a variety of complex sign-processes.
If language is considered as synonymous with ‘communication,’ animals no doubt also possess language. If, on the contrary, language is distinguished from communication and determined by the five criteria mentioned previously, then animals certainly do not have language, although they do communicate. Even if some of the conditions that enable us to speak of language would seem to occur in animals, they do not occur together. On this subject, the following statement by Morris (1946/1971a: 130) seems important:
But even if these conditions were met [i.e., if all the other requirements were met in nonhuman animal communication], the fifth requirement is a harder hurdle. For though animal signs may be interconnected, and interconnected in such a way that animals may be said to infer, there is no evidence that these signs are combined by animals which produce them according to limitations of combinations necessary for the signs to form a language system. Such considerations strongly favor the hypothesis that language – as here defined – is unique to man.
This means that by comparison with animal signs, human language is characterized by the fact that its signs can be combined to form compound signs. It would seem, therefore, that in the last analysis, this ‘capacity for combination’ is the most distinctive element. This conception is very close to Sebeok’s when he states that language (he too distinguishing it from the communicative function) is characterized by ‘syntax’ – that is, the possibility of using a finite number of signs to produce an infinite number of combinations through recourse to given rules. Body language includes different sign systems. Common to these sign systems is their foundation in language intended as a specific human modeling device (Sebeok, 1991, 2001b). All animal species have models to construct their world, and language is the model belonging to human beings. However, the distinctive feature of language with respect to other zoosemiotic systems (although this feature is present in endosemiotic systems, such as the genetic code, the immune code, the metabolic code, and the neural code) is syntax, through which the same construction pieces may be assembled in an infinite number of ways. Consequently, the human primary modeling system can produce an indefinite number of models and worlds. All species communicate in a world peculiar to that species alone ensuing from the type of modeling characteristic of that species. In the early stages of its development, the hominid was endowed with a modeling device able to produce an infinite number of worlds. This explains the evolution of hominids into Homo sapiens sapiens. The reason why it is possible for such animals to produce a limitless number of worlds is that the human modeling device, or language, functions in terms of syntax – that is, in terms of construction, deconstruction, and reconstruction with a finite number of elements that may be composed and recomposed in an infinitely great variety of different forms. 
We are referring to the human ability to reflect on sign materials, means, and models (i.e., on that which has already been modeled), to the end of using such materials in new modeling processes. This is what is intended by specific human semiosis – that is, ‘semiotics.’ Body languages are semiotical.
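The notion of syntax invoked here (a finite stock of elements, combinable in some ways and not in others, yielding an open-ended number of combinations) can be illustrated with a toy computation. What follows is only a minimal sketch under stated assumptions: the three sign labels and the combination rule are hypothetical, chosen for illustration, and are not drawn from Morris or Sebeok.

```python
from itertools import product

# Hypothetical finite inventory of signs; the labels are illustrative only.
SIGNS = ("point", "nod", "wave")


def is_well_formed(sequence):
    """Toy combination rule (an assumption): no sign may immediately follow itself."""
    return all(a != b for a, b in zip(sequence, sequence[1:]))


def well_formed_sequences(max_length):
    """Yield every rule-abiding sequence of 1..max_length signs."""
    for length in range(1, max_length + 1):
        for sequence in product(SIGNS, repeat=length):
            if is_well_formed(sequence):
                yield sequence


def count_well_formed(max_length):
    """Count the permitted sequences up to the given length bound."""
    return sum(1 for _ in well_formed_sequences(max_length))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, count_well_formed(n))
```

With three signs and this rule, the number of permitted sequences keeps growing as the length bound is raised (3, 9, 21, and 45 for bounds 1 to 4), while combinations such as ('nod', 'nod') remain excluded. The inventory stays finite and the rule stays fixed; only the bound on length limits the count.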
Body Language and the Sign–Body General Connection The previous discussion demonstrated the connection between body language and human semiosis. However, body language belongs to the general connection between signs and bodies that is found throughout the universe of life (i.e., in all planetary semiosis). This implies continuity from nonhuman animal signs to human signs. As Morris (1946/1971b: 13) concludes in his discussion of the distinction between nonhuman animal signs and human signs, human language (and the postlanguage symbols it makes possible) goes far beyond the sign-behavior of animals. On this subject, the following observation is similar to Sebeok’s conception of human signs:
But language-behavior is still sign-behavior, and language signs rest upon, and never completely take the place of [italics added], the simpler signs which they presuppose. The continuity is as real as the discontinuity, and the similarity of human and animal sign-behavior as genuine as the difference.
All sign processes include the body in some sense because the entire sign process takes place in a biological, social, or technical medium; it must have a channel of access to the object interpreted. Channels and media are different and consequently have different ways to connect sign and body. The source may be (1) an inorganic body, such as a natural inorganic object or manufactured inorganic object, and in this case, the interpreted may be a sign only because it receives an interpretation from the interpreter (‘semiosis of information’), or (2) an organic substance or a living being (organism or components) belonging to H. sapiens or speechless creatures (‘semiosis of symptomatization,’ in which the sign is unintentional, and ‘semiosis of communication,’ in which the sign is intentional). In body signs of symptomatization semiosis (symptoms, clues, and traces) the interpreted sign is already an interpretant response before being interpreted as a sign by an interpretant. However, this response is not oriented to being interpreted as a sign; that is, it does not come to life for the purpose of being interpreted. On the contrary, in semiosis of communication where too the interpreted is already an interpretant response before being interpreted as a sign by the interpretant, this interpretant response is intended to be interpreted as a sign. When an organism or a machine takes an object as a sign of another object, it must have a ‘channel,’ a passageway to access it. Possible channels are gases, liquids, and solids with regard to matter; they are chemical and physical with regard to energy. Concerning the latter, channels may be acoustic (air, water, and solids) or optical (reflected daylight or bioluminescence; Sebeok, 1991: 27–28), tactile, chemical, electric, magnetic, or thermal. Semiosis may engage several channels and also a simultaneous use of more than one channel, as is frequently the case in human communication. 
‘Medium’ can be used as a synonym of channel (Sebeok, 1991: 27), but medium is also the world in
which semiosis takes place. It may be a biological, social, or technical medium. In this double sense that connects medium to model and modeling, we may refer to semiosis in the world of technical instruments and social institutions. In any type of semiosis there is a connection between signs and bodies, signata and signantia, media/channels and significata, semiosis and materiality. Materiality of the signans (Petrilli, 1990: 365–401; Rossi-Landi, 1992: 271–299) is not limited to extrasign materiality, physical materiality (the body of the signans and its channel), and instrumental materiality (nonsign bodily residues of nonverbal signs, i.e., their nonsign uses and functions; Rossi-Landi, 1985: 65–82). More than this, materiality of the signans is ‘semiosic materiality,’ and in the sphere of anthroposemiosis it is also ‘semiotic materiality.’ Semiotic materiality is historicosocial materiality at more or less high levels of complexity, elaboration, and/or articulation (elaboration materiality). It is ideological materiality, extraintentional materiality (i.e., objectivity independent from consciousness and volition), as well as signifying otherness materiality (i.e., the possibility of engendering other signata than the signatum of any specific interpretive route) (Ponzio, 1990: 15–61, 1994: 42–45). Signs are bodies. However, the physical object may be transformed into a sign while still belonging to the world of physical matter due to ‘sign work,’ to use Rossi-Landi’s terminology. As a sign, the physical body acquires meaning engendered in the relation to something else, it defers to something external to itself, and it reflects and refracts another reality from itself (Voloshinov, 1929/1973: 10): Signs also are particular, material things; and . . . any item of nature, technology, or consumption can become a sign acquiring in the process a meaning that goes beyond its given particularity. 
A sign does not simply exist as a part of reality – it reflects and refracts another reality.
The following distinction is proposed: The expression ‘semiosic corporeality’ is used for bodies that have become signs in a world modeled by living beings where sign processes are languageless, and ‘semiotic corporeality’ is used where bodies that are signs presuppose a world modeled by language (i.e., a human world). As Marx suggested (Marx and Engels, 1845/1968: 42), “From the start the ‘spirit’ is afflicted with the curse of being ‘burdened’ with matter, which here makes its appearance in the form of agitated layers of air, sounds, in short, of language.” Here, language is “agitated layers of air, sounds”: This is about its physical materiality. However, language is also human consciousness and the organization of human life: This is about the semiotic materiality of language as human primary modeling. “Language is as old as consciousness, language is practical consciousness that exists also for other men, and for that reason alone it exists for me personally as well” (Marx and Engels, 1845/1968: 42). Language is “the immediate actuality of thought. . . . Neither the thought, nor the language exist in an independent realm from life” (Marx and Engels, 1845/1968: 503–504). As a body, the sign is material in a physical sense; as sign, it is material in a semiosic sense; and as human historicosocial matter, it is material in a semiotic sense. In human worlds modeled by language, a body is a sign because of its historicosocial materiality. It is this kind of materiality that interests us when a body is taken into consideration and studied as a human sign (i.e., in a semiotic framework).
The Body in the Sign In contemporary general semiotics, of which the most holistic expression is Sebeok’s ‘global semiotics,’ the criterion of life (i.e., of living body) is semiosis. Using the formula employed by Marcel Danesi to sum up Sebeok’s conception of the semiosic character of living beings, we may say that the body is in the sign (i.e., life is defined by semiosis). In the human animal, or ‘semiotic animal,’ this means that semiosis is the bond that links together body, mind, and culture (Danesi, 1998: 16). Studies on the manifestation patterns of semiosis in nature and culture show persuasively that in anthroposemiosis there exists an inextricable nexus among sign, body, and culture. The type of sign (according to Charles S. Peirce’s typology of signs), in which the body lives and organizes its world on the basis of its species-specific modeling device, is first and foremost the ‘icon.’ In other words, iconicity is a basic signifying strategy in various life-forms. The iconic mode of representation is the relation of the sign with its referent through replication, simulation, imitation, or resemblance. Iconicity is the default form of semiosis, as Sebeok demonstrated by documenting that in vastly different species the manifestation of the capacity to produce signs stands in some direct simulative relation to their referents. In his works, Sebeok showed the variety of manifestations of iconicity in different species. Iconic signs can thus be vocal, visual, olfactory, gustatory, or tactile in their form. It may be that in humans too all signs start out as a simulative relation to their referential domains. Like Peirce, Sebeok viewed iconicity as the primordial representational strategy in the human species. Danesi (1998: 10) considers iconicity as an aspect of utmost relevance in the study of signs.
He emphasizes the important role of iconicity – documented by Sebeok especially in the final three chapters of his 1986 book – in the bond that links semiosis, body, mind, and culture. This inextricable nexus manifests itself in the form of iconic representational behavior. “Iconicity is, in effect, evidence of this nexus” (Danesi, 1998: 37). Danesi (1998: 18–20) refers to the conception that the iconic mode of representation is the primary means of bodily semiosis as the ‘iconicity hypothesis.’ Consequently, another principle of global semiotics or semiotics of life is the ‘sense-implication hypothesis’ (Danesi, 1998: 17), which suggests that semiosis is grounded in the experiential realm of sense. This principle has a philosophical antecedent in John Locke, according to whom all ideas come from sensation first and reflection later, but it is connected with modeling theory: what is acquired through the body is modeled differently through the innate modeling system possessed by different species. In fact, a species perceives according to its own particular anatomical structure and to its own particular kind of modeling system. Due to its species-specific modeling system, called language by Sebeok, Homo, the semiotic animal, not only is a sophisticated modeler of the world but also has a remarkable ability to re-create his world in an infinite number of forms. The living body is initially an iconic sign – that is, it lives in a world iconically modeled. This also holds for the human species on the ontogenetic and phylogenetic levels. Natural learning flow (i.e., the semiosic process in which children acquire knowledge) takes place through the body and the human primary modeling system and proceeds from iconicity to the forms of modeling that children learn in the cultural context.
To recognize that the body is, lives, in the sign with reference to human ontogenetic development in the body–sign–culture relation implies, as Danesi (1998: 61) states, that the semiosic capacities of the learner and the determination of his or her semiosic stage – rather than the subject matter to be learned – should therefore be the focus of education. The main implication of the formula ‘the body in the sign’ and modeling theory for education is of a methodological nature. If the teacher is familiar with the forms of the semiosic process in human learning, he or she will be in a better position to help the learner acquire knowledge and skill more effectively and efficiently. In fact, the key to successful learning, states Danesi, lies, arguably, in determining at what point one learning phase is ready to be overtaken by the next – that is, what the Russian psychologist Vygotsky (1934/1962) called the ‘proximal zone’ of learning. The semiotic approach to education, as the psychologist and semiotician Vygotsky claimed, is indispensable for an appropriate foresight of the ‘zones of proximal development’ of each particular learner.
The Body in the Languages of Globalization and ‘Grotesque Realism’ Here, another argument is added to those proposed by Danesi in order to consider the implications of the formula ‘the body in the sign’ for education. Included as goals in education are the capacity for criticism, social conscience, and responsible behavior. On this subject, the previously mentioned formula has implications for an adequate consciousness and comprehensive interpretation of communication under present-day semiosis conditions (i.e., in the phase named ‘globalization’). In the current age, characterized by the automated industrial revolution, the global market, consumerism, and the pervasiveness of communication through the whole production cycle (communication – production, communicative exchange, and consumption of goods that are messages), ‘the body in the sign’ highlights that globalization and therefore languages of globalized communication incorporate human life in all its manifestations. ‘Life in all its manifestations’ refers to life in the form of development (well-being and consumerism) as well as in the form of underdevelopment (poverty and impossibility of survival); in the form of health and of disease; in the form of normality and deviation; in the form of integration and marginalization; in the form of employment and unemployment; in the form of the functional transfer of the workforce, characteristic of emigration and migration, which expresses the denied request of hospitality; and in the form of exposure to war disseminated at a worldwide level, and planned as infinite. Again, incorporation of the body in the languages of globalized communication is not limited to human life alone. Life over the whole planet is now involved (even compromised and put at risk).
The planetary perspective of global semiotics allows for the necessary distance and indeclinable responsibility (a responsibility without alibis) for an approach to contemporaneousness that does not remain imprisoned within the confines of contemporaneity itself. The controlled insertion of bodies into languages of the production apparatus of global communication goes hand in hand with the spread of the concept of the individual as a separate and self-sufficient entity. The body is understood and experienced as an isolated biological entity, as belonging to the individual, as an autonomous sphere of private interests. Such an attitude has led to the almost total extinction of cultural practices and worldviews based on
intercorporeality (i.e., reciprocal interdependency), exposure and opening of the living body. Think of the ways the body is perceived in popular culture, discussed by Bakhtin (1965) in the forms of carnival and grotesque realism, where the body and corporeal life generally are conceived neither individualistically nor separately from the rest of terrestrial life and, indeed, from the world. We refer to verbal and nonverbal languages of the grotesque body that we may find in all cultures on the planet and in the literary carnivalized genres of all national literatures. Grotesque realism presents the body as something that is not defined once and for all, that is not confined to itself, but as flourishing in symbiosis with other bodies, in relations of transformation and renewal that far exceed the limits of individual life. Globalization, in which communication is exploited for profit, does not weaken the individualistic, private, and static conception of the body, connected with the rise of the bourgeoisie, but, on the contrary, reinforces it. Division and separatism among the sciences are functional to the ideological–social necessities of the ‘recent new canon of the individualized body’ (Bakhtin, 1965). This in turn is functional to the controlled insertion of bodies into the languages of the reproduction cycle of today’s production system. The interdisciplinary focus of global semiotics and its attention to the signs of the interconnection between living bodies, human and nonhuman, are the presuppositions of an education that is free from stereotyped, limited, and distorted ideas and practices of communication under present-day conditions. This is another implication of the semiotic global approach for education and another possible meaning of the proposition chosen by Danesi to sum up what Sebeok said: ‘The body is in the sign’ – that is, semiosis is the bond that links the body, the mind, and culture.
Body Language and Speech in Human Phylogenesis It appears virtually certain that early hominid forms that evolved to Homo erectus had language as an interior modeling device, although not speech. As previously mentioned, a modeling system is a tool with which an organism analyzes its surroundings. Language as a modeling system seems to have always been an exclusive property of the species Homo. It is an original lingua mutola (a mute, speechless language) described by Giambattista Vico in La scienza nuova, and which consists in the inventive, ‘poetic’ capacity to model different possible worlds at the basis of communication among members of the early hominid species.
According to Sebeok’s (2001a: 17–30) reconstruction, hominids up to and including H. erectus communicated with each other by nonverbal means, in the manner of all other primates. However, unlike the latter, their body signs were already body languages because they were founded on a specific human primary modeling device. Homo habilis (‘handy man,’ 2.4–2.0 million years ago) and H. erectus (‘upright man,’ more than 1.5 million years ago), with a brain volume of 800–1200 cm³ and a far more elaborate tool kit (including fire), had language, but not speech, and communicated with mute body languages (i.e., in an articulate and organized world on the basis of syntax inherent to human primary modeling). Speech did not appear until our own immediate archaic sapiens (‘wise man’) ancestors appeared (approximately 300,000 years ago), who, as indicated by evidence from rule-governed behavior, not only had language but also manifested it in the form of speech. Thus, although language as a specific human primary modeling system emerged on the scene perhaps 2.5 or 3.0 million years ago, verbal language or speech appeared solely in H. sapiens as a communication system and developed slowly in H. sapiens sapiens also as a cognitive system, namely as a secondary modeling system. However, the human nonverbal system had body languages as communicative devices implicating, similarly to future speech, language not reducible to a communicative device: The specific function of language in the evolution of anthroposemiosis was not to transmit messages and give information but to model species-specific human worlds. Following Sebeok, we may say that language is essentially ‘mind work,’ whereas speech is ‘ear and mouth work.’ The relatively simple, nonverbal models that nonhuman animals live by, that hominids used to communicate, and that normal human infants (in-fans) likewise employ are indeed kinds of primary modeling.
Consequently, the sign systems of nonhuman animals are merely body sign systems, whereas sign systems of the human animal (semiotic animal) including hominids and today’s normal infants are body languages. However, as a type of primary modeling, all these models are more or less pliable representations that must fit ‘reality’ sufficiently to tend to secure survival in one’s Umwelt. Such ‘top-down’ modeling (to use a current jargon borrowed from the cognitive sciences) can persist and become very sophisticated indeed in the adult life of exceptionally gifted individuals, as borne out by Einstein’s testimonial or by what we know about Mozart’s and Picasso’s abilities to model intricate auditory or visual compositions in their heads in
anticipation of transcribing this onto paper or canvas. This kind of nonverbal modeling is indeed primary, in both a phylogenetic and an ontogenetic sense. Syntax makes it possible for hominids not only to represent immediate ‘reality’ (in the sense discussed previously) but also, uniquely among animals, to frame an indefinite number of possible worlds in the sense of Leibniz (Sebeok, 1991: 57–58).
Dialogism of Body Language In Bakhtin’s view, dialogue consists of the fact that one’s own word alludes always and in spite of itself, whether it knows it or not, to the word of the other. Dialogue is not an initiative taken by self. As clearly emerges from Bakhtin’s analysis of novels by Dostoevsky, the human person does not enter into dialogue with the other out of respect for the other but, rather, in spite of oneself. Both word and self are dialogic in the sense that they are passively involved with the word and self of the other. Internal and external verbal discourse is implied dialogically in otherness, just as the ‘grotesque body’ (Bakhtin, 1965) is implied in the body of the other. In fact, dialogue and body are closely interconnected. Bakhtin’s dialogism cannot be understood separately from his biosemiotic conception of sign. On this basis, he criticized both subjective individualism and objective abstraction. According to Bakhtin, there cannot be dialogism among disembodied minds. Unlike Platonic dialogue, and similarly to Dostoevsky, for Bakhtin, dialogue is not only cognitive and functional to abstract truth, but it is also a life need grounded in the inevitable interconnection of the self’s body with the body of the other. For Bakhtin, dialogue is the embodied, intercorporeal expression of the involvement of one’s body with the body of the other. The concept of the body as an individual, separate, and autonomous body is only an illusion. The image that most adequately expresses the condition of intercorporeity is the grotesque body (Bakhtin, 1965) in popular culture, in vulgar language of the public place, and in the masks of carnival. This is the body in its vital and indissoluble relation to the world and to the body of others. In 1926, Bakhtin published an article on a biological and philosophical subject titled ‘Contemporary vitalism’ (signed by the biologist I. I. Kanaev, who subsequently declared that Bakhtin was the author).
In his description of the interaction between living body and environment and opposing the dualism of life force and physical–chemical processes, Bakhtin maintained that the organism forms a monistic unit with the surrounding world. In his works of the 1920s, Bakhtin criticized both the vitalists and
the reflexologists, as well as both Freudianism and mechanistic materialism (e.g., the mechanistic view of the relation between base and superstructure). In Bakhtin’s view, each of these different trends is vitiated by false scientific claims that underestimate the dialogic relation between body and world. Such approaches either dematerialize the living body or physicalize it in terms of mechanistic relations. Bakhtin formulated the category of ‘carnivalesque’ in his study on Rabelais, which he extended to culture at a world level insofar as it is human and not just Western culture. The carnivalesque participates in ‘great experience,’ understood as offering a global view of the complex and intricate life of bodies, signs, and languages. As Bakhtin shows in the 1963 edition of his book on Dostoevsky, dialogue in the polyphonic novel has its roots in the carnivalesque language of the grotesque body. Plurivocality, ductility, and ambiguity of sense in verbal language (the expression of centrifugal forces in linguistic life) are also connected with the grotesque body. This is especially evident in the double character of verbal and gestural ‘language of the public place,’ of vulgar expression that is simultaneously laudatory and offensive. Most interesting on this subject is Bakhtin’s reference (in Voloshinov, 1929/1973) to Dostoevsky’s notes on an animated conversation formed of a single vulgar bodily word used with different meanings.
Foremost Expressions of Body Language On the basis of the discussion of an issue that is essentially methodological and that also concerns body language (which coincides with the human semiosphere; i.e., the special semioses characteristic of the semiotic animal, the sole animal gifted with the primary modeling device called language by Sebeok), we may now consider some exemplars of body language. As expressions of body language, we have already discussed such human signs as gesture, facial expression, vocal songs, and bodily movements used to communicate in phases antecedent to verbal language (i.e., speech) on both the phylogenetic and the ontogenetic level. These are nonverbal signs used by infants and hominids before the advent of H. sapiens. Body language includes signs studied by physiognomics – the discipline that studies the relations between bodily characteristics, especially facial features, and psychic characters of the human individual. In semiotics, an important work on the bond between body and temperament is The open self by Charles Morris (1948), who used the typology (‘endomorphy,’ ‘mesomorphy,’ and ‘ectomorphy’) proposed by psychologist William H. Sheldon in
The varieties of human physique and Varieties of temperament from a semiotic perspective. Body language involves modifications of the cultural body, which belong to some complex sign system or merely to the binary presence/absence system, in a wide range of cultural alterations operated on the body from brands, tattoos, the stripping of the flesh, and piercing to maquillage, including the use of belladonna to dilate the pupils. Body language also includes dance, especially ritual dances, in which any small body movement can have a precise meaning. We have also mentioned cultural modifications in the distinctive pheromonal function of the human chemical signature now studied by semiochemistry (Sebeok, 2001b: 96). On this subject, Sebeok cited both the novel Das Parfum by Patrick Süskind, based entirely on the indexical facets of human semiochemistry, and a passage from Peirce concerning the study of odors as signs, with special reference to women’s favorite perfumes. Human odors are classified by Sebeok as indexical signs, but this body language also has an iconic aspect (i.e., it also signifies on the basis of similarity). In the passage cited by Sebeok, Peirce’s comment is the following: “Surely there must be some subtle resemblance between the odor and the impression I get of this or that woman’s nature” (Sebeok, 2001b: 313). Signs of body language are also signs that relate to phrenology, anthropometry, palmistry, and graphology or practices such as handwriting authentication and identification by fingerprinting or by individual unique sequences of DNA molecules. Moreover, body language is studied by the branch of semiotics called proxemics – that is, the semiotics of interpersonal space, originally developed by Edward T. Hall in the context of cultural anthropology.
Finally, body language includes such human sign systems as the ‘sign language’ of the American Indians (Sebeok, 1979), monastic signs (Sebeok and Umiker-Sebeok, 1987), and the language of deaf-mutes. The latter is further proof of the fact that man as a semiotic animal is not the speaking animal but the animal that is endowed with language, the primary modeling device. It is not true that dogs only lack speech. Dogs and other nonhuman animals lack language. By contrast, the deaf-mute lacks only speech, as a pathology. This means that other nonverbal systems, such as the gestural, can be grafted onto the human primary modeling device. Also, due to these sign systems the deaf-mute is able to accomplish the same inventive and creative mental functions as any other human animal. It must be emphasized that the connection between verbal language and body language largely
depends on their common participation in language understood as human primary modeling. Concerning verbal intonation, and specifically the important phenomenon of language creativity called ‘intonational metaphor,’ Bakhtin (1926/1983) observed that an intimate kinship binds the intonational metaphor in real-life speech with the ‘metaphor of gesticulation.’ In fact, the word itself was originally a ‘linguistic gesture,’ a ‘component of a complex body gesture,’ understanding gesture broadly to include facial expression, gesticulation of the face. Intonation and gesture belong to body language, and they express a living, dynamic relationship with the outside world and social environment. By using intonation and gesticulation, stated Bakhtin (1926/1983), an individual takes up an active social position with regard to certain values. Of course, this position is conditioned by social instances. Verbal intonation and gesture participate in the creative modeling of human language. In this sense, they belong to the anthroposemiotic bond relating sign–mind–culture. In this bond also reside the aesthetic–creative forces of body language that create and organize artistic forms.
See also: Anthroposemiotics; Biosemiotics; Gesture: Sociocultural Analysis; Gestures: Pragmatic Aspects; Indexicality: Theory; Kinesics; Performance in Culture; Semiotic Anthropology; Sign Language: Overview; Significs: Theory; Silence: Cultural Aspects; Social Semiotics; Structuralism.
Bibliography Bakhtin M M (1965). Rabelais and his world. Cambridge: MIT Press. Bakhtin M M (1983). ‘Discourse in life and discourse in poetry.’ In Shukman A (ed.) Bakhtin school papers, Russian Poetics in Translation No. 10. Oxford: RPT. (Original work published 1926.) Danesi M (1998). The body in the sign: Thomas A. Sebeok and semiotics. Toronto: Legas. Fano G (1992). Origins and nature of language. Petrilli S (trans.). Bloomington: Indiana University Press. (Original work published 1972.) Kanaev I I (1926). ‘Sovremennyj vitalizm.’ Chelovek i priroda 1, 33–42; 9–23. (New edn. (1993) in Dialog, Karnaval, Chronotop 4, 99–115.) Marx K & Engels F (1968). Selected works in one volume. London: Lawrence & Wishart. (Original work published 1845.) Marx K & Rayzankaya S (eds.) (1968). The German ideology. Moscow: Progress Publishers. (Original work published 1845–1846.) Morris C (1948). The open self. New York: Prentice Hall. Morris C (1971a). ‘Signs, language and behavior.’ In Morris C (ed.). 73–398. (Original work published 1946.)
Morris C (1971b). Writings on the general theory of signs. Sebeok T A (ed.). The Hague, The Netherlands: Mouton. (Original work published 1946.) Peirce C S (1931–1958). Collected papers (8 vols). Cambridge, MA: Belknap Press of Harvard University Press. Petrilli S (1990). ‘On the materiality of signs.’ In Ponzio A. 365–401. Petrilli S (1998). Teoria dei segni e del linguaggio. Bari, Italy: Graphis. Petrilli S (ed.) (2003). Linguaggi. Bari, Italy: Laterza. Petrilli S (2005a). Percorsi della semiotica. Bari, Italy: Graphis. Petrilli S (ed.) (2005b). Communication and its semiotic bases: studies in global communication. Madison, WI: Atwood. Petrilli S (2005c). ‘Bodies, signs and values in global communication.’ In Petrilli S (ed.). Petrilli S & Calefato P (2003). Logica, dialogica, ideologica. I segni fra funzionalità ed eccedenza. Milan: Mimesis. Ponzio A (1990). Man as a sign. Essays on the philosophy of language. Petrilli S (trans. & ed.). Berlin: de Gruyter. Ponzio A, Calefato P & Petrilli S (1994). Fondamenti di filosofia del linguaggio. Rome: Laterza. Ponzio A & Petrilli S (2000). Il sentire nella comunicazione globale. Rome: Meltemi. Ponzio A & Petrilli S (2001). Sebeok and the signs of life. London: Icon Books. Ponzio A & Petrilli S (2005). Semiotics unbounded. Interpretive routes through the open network of signs. Toronto: Toronto University Press. Posner R, Robering K & Sebeok T A (eds.) (1997–2004). Semiotik/Semiotics. A handbook on the sign-theoretic
foundations of nature and culture (3 vols). Berlin: de Gruyter. Rossi-Landi F (1985). Metodica filosofica e scienza dei segni. Milan: Bompiani. Rossi-Landi F (1992). Between signs and non-signs. Petrilli S (ed.). Amsterdam: Benjamins. Sebeok T A (1976). Contributions to the doctrine of signs. Lisse: Peter de Ridder Press. (2nd edn. Lanham: University Press of America.) Sebeok T A (1979). The sign & its masters. Austin: University of Texas Press. Sebeok T A (1981). The play of musement. Bloomington: Indiana University Press. Sebeok T A (1986). I think I am a verb. More contributions to the doctrine of signs. New York: Plenum. Sebeok T A (1991). A sign is just a sign. Bloomington: Indiana University Press. Sebeok T A (2001a). Global semiotics. Bloomington: Indiana University Press. Sebeok T A (2001b). Signs. An introduction to semiotics. Toronto: Toronto University Press. Sebeok T A & Danesi M (2000). The forms of meanings. Modelling systems theory and semiotic analysis. Berlin: de Gruyter. Sebeok T A & Umiker-Sebeok J (eds.) (1987). Monastic sign languages. Berlin: de Gruyter. Voloshinov V N (1973). Marxism and the philosophy of language. Matejka L & Titunik I R (trans.). Cambridge, MA: Harvard University Press. (Original work published 1929.) Vygotsky L S (1962). Thought and language. Cambridge: MIT Press. (Original work published 1934.)
Boeckh, August (1785–1867) S Fornaro, University of Sassari, Italy © 2006 Elsevier Ltd. All rights reserved.
August Boeckh (Figure 1) was born in Karlsruhe on November 24, 1785, as the son of court secretary and notary Georg Matthäus Boeckh (1735–1790). Following the advice of his mother, he attended the well-known ‘Gymnasium illustre’ in Karlsruhe, where he received a special education under the supervision of mathematician and physicist Johannes Lorenz Böckmann (1741–1802), graduating as Candidatus theologicus. The influence of Schleiermacher and Friedrich August Wolf (1759–1824) led Boeckh to break off his theological studies in 1805 and devote himself to the study of Greek antiquity. Completing his studies in 1806, Boeckh went to Berlin to attend the ‘Seminar für gelehrte Schulen,’ directed by J. J. Bellermann, then headmaster of the Gymnasium
‘Zum Grauen Kloster.’ As a member of the seminar, Boeckh taught Latin, French, and history. He soon developed a friendship with Professors Buttmann and Heindorf, with whom he founded the Berliner Griechische Gesellschaft, also known as Graeca. After finishing his dissertation at Halle University, he moved to Heidelberg. He immediately passed his Habilitation, thereby obtaining an Extraordinariat, which was raised to an Ordinariat für Klassische Philologie in 1809, in the seminar founded by Friedrich Creuzer (1771–1858). Through cordial relations with Clemens Brentano (1778–1842) and Achim von Arnim (1781–1831), Boeckh introduced in detail Schleiermacher’s Plato translations in the Heidelbergische Jahrbücher. Two years later, W. von Humboldt offered him a professorship in Berlin, where he earned high praise in the organization of teaching and research at the newly founded university. In 1812, the philological seminar, developed
and directed by Boeckh, was raised to university level. Along with Schleiermacher, Savigny, and the anatomist Carl Asmund Rudolphi (1771–1832), Boeckh joined a commission charged with evaluating the university statutes that were introduced at the Alma mater Berolinensis in 1817. A large part of Boeckh’s scientific lifework emerged within the context of the Prussian Academy of Sciences, to which he was admitted in 1814. As successor of his friend Schleiermacher, Boeckh was secretary of the humanities section for 27 years (1834–1861). In 1815, he initiated on behalf of the Academy the four-volume Corpus Inscriptionum Graecarum (CIG), published between 1825 and 1859. The ambitious enterprise of collecting all antique inscriptions led to Boeckh’s reputation as the father of epigraphy and initiated the monumental academy projects successfully implemented by his successors Mommsen, Harnack, Wilamowitz, and Diels. Boeckh was no armchair philologist. Besides lecturing and his academy work, he took on a growing number of administrative tasks within the framework of building and extending the university. He was dean for the first time in 1814/1815, and was first elected Rektor in 1825. He held this office five times consecutively, last in 1860 at the age of nearly 75, when Berlin University celebrated its 50th birthday. Boeckh’s commitment reached far beyond the university. Not only did he remain interested throughout his life in political issues, but he also participated actively on a regular basis. This is illustrated, e.g., by his commitment to the reform of the Prussian teacher
education program and becomes even clearer by his dedication to German unification and academic freedom. Boeckh’s high offices at the university and the Academy, combined with his indisputable intellectual authority as a scholar, made him an important contact person for both court and state. He was careful, however, to preserve his independence, merely accepting the title of Geheimer Regierungsrat. In 1832, he ostentatiously declined to work for the censorship agency, and in 1848 he refused to become Kultusminister. Even without a political office, Boeckh exerted considerable influence over the intellectual life of his time, transcending the university and the academy. By accepting the philology chair, Boeckh had become Professor eloquentiae et poeseos. This position involved not only formulating a foreword for the lecture timetable each semester and composing all Latin university documents, but also serving as the university’s main speaker on festive occasions, a task he conscientiously fulfilled until shortly before his death. Boeckh’s personal correspondence provides evidence that limitations on freedom of speech made this by no means easy for him. Yet Boeckh, who called himself a ‘Protestant’ in the actual sense of the word, never deviated from his personal opinion. His numerous speeches, which focused on the concept of academic freedom, profess a liberal point of view and a pugnacious humanism. Academic freedom found in him one of its most eloquent and persistent defenders. Boeckh was married twice. In 1809 he married Dorothea Wagermann, the daughter of superintendent general Gottfried Wagermann. After her early death, Boeckh married Anna Taube in 1830. On August 3, 1867, August Boeckh died at the age of 81 as a result of lung disease. Boeckh began with studies on Plato (especially ‘Timaios’) and the Pythagorean Philolaos, using his thorough mathematical education. Through Greek musical studies he discovered the field of Greek metrics.
In Berlin, Boeckh developed a special interest in rhetorical-antiquarian matters, due to B. G. Niebuhr’s influence. In 1817, he published Die Staatshaushaltung der Athener, the first Attic economic history. In the foreword, he articulates his wish that science should expand from a one-sided linguistic approach to an all-comprehensive exploration of Greek life. Boeckh did theoretically design and practically implement an extensive science of classical antiquity, comprising as equal components of a complex whole all areas of life and all of its cultural expressions. The over-enthusiastic plan of his youth to create a cultural-historical oeuvre entitled ‘Hellen,’ – intended to present an overall picture of Greek life in all of
its political, economic, religious, and intellectual facets – remained beyond his reach, mainly because preparatory work for too many sections of his envisioned composition was insufficient or lacking altogether. He never discarded his central idea of an interdisciplinary, cultural-study-based approach to classical antiquity. Instead, he came to head the realistic philological school in opposition to the linguistic-text-critical school, or so-called ‘Wortphilologie,’ of Gottfried Hermann (1772–1848). Hermann and his supporters argued that only through language could “everything else that characterizes a people be comprehended and understood.” The dispute, which began with a review by Hermann of the first issue of the CIG, continued for several years. Besides his interdisciplinary emphasis, it is especially Boeckh’s insistence on a solid methodological basis for all research that casts him in such a modern light. His famous lecture on Encyklopädie und Methodologie der Wissenschaften, given regularly between 1809 and 1865, should be required reading for every philologist even today. See also: Greek, Ancient; Humboldt, Wilhelm von (1767–
1835); Paleography, Greek and Latin; Wolf, Friedrich August (1759–1824).
Bibliography Augusti Boeckhii Commentatio Academica de Platonica corporis mundani fabrica conflati ex elementis geometrica ratione concinnatis. Heidelbergae, 1810. Boeckh, August. Gesammelte Kleine Schriften, Bd. 1–7. Leipzig: Teubner, 1858–1874. Boeckh, August. Encyclopädie und Methodologie der Philologischen Wissenschaften. Bratuscheck E & Klussmann R (eds.), 2nd edn. Leipzig: Teubner. (Repr. Darmstadt: Wissenschaftliche Buchgesellschaft, 1966.) Corpus Inscriptionum Graecarum, Auctoritate et impensis Academiae Litterarum Regiae Borussicae. vol. 2. Boeckhius, Augustus (ed.) Berolini ex Officina Academica, 1828–1843. Die Staatshaushaltung der Athener. Berlin: Realschulbuchhandlung, 1817; Berlin: Reimer, 1886. Metrologische Untersuchungen über Gewichte, Münzfüße und Maße des Altertums. Berlin: Veit, 1838. (Repr. Karlsruhe: Badenia Verlag, 1978.) Pindari carmina quae supersunt cum deperditorum fragmentis selectis. Rec. Augustus Boeckhius. Editio secunda correctior. Lipsiae: Weisel, 1825. Schneider B. August Boeckh, Altertumsforscher, Universitätslehrer und Wissenschaftsorganisator im Berlin des 19. Jahrhunderts: Ausstellung zum 200. Geburtstag, 22. November 1985–18. Januar 1986, Berlin, Staatsbibliothek Preussischer Kulturbesitz. (Ausstellung und Katalog, Bernd Schneider.)
Boethius of Dacia (fl. 1275) E Bell Canon, University of Georgia, Athens, GA, USA © 2006 Elsevier Ltd. All rights reserved.
Boethius of Dacia, also known as Boethius the Dane and Boethius of Sweden, was born in the early 13th century. He was associated with the University of Paris as a teacher of philosophy and grammar, and his theory of language and grammar was based in the Averroist tradition of Aristotelian philosophy. Also called a ‘radical Aristotelian,’ Boethius found many of his philosophical writings condemned in 1270 and again in 1277 by the Bishop of Paris. It is possible that later in life, Boethius joined the Dominican Order and probably served in Dacia, Romania. As a grammarian, Boethius was part of a group of like-minded thinkers called the ‘Modistae.’ The Modistae produced written works on the nature of language based on the then-recently rediscovered philosophies of the ancient Greeks, particularly Aristotle. They developed the notion of ‘speculative grammar,’ or the function of language as a mirror of what is real in the world. Boethius wrote on the nature and origin
of grammar, including parts of speech in Modi Significandi sive Quaestiones Super Priscianum Maiorem (1980). In this work, he broke with the linguistic philosophy of Priscian by establishing grammar as a science: Quia ergo ea, de quibus est grammatica, sunt comprehensibilia ab intellectu et habent causas per se, ideo grammatica est scientia. (‘Because, therefore, those things with which grammar is concerned are comprehensible by the intellect and have causes per se, it follows that grammar is a science.’) (Quote and translation from McDermott, 1980.)
Boethius believed that philosophy and grammar were intertwined: One ought to be grammarian, in order that he might consider modes of signifying; a philosopher, so as to consider the properties of objects, and a philosopher-grammarian so as to derive the modes of signifying from the properties of objects. (Translation from McDermott, 1980.)
His belief that the human soul was not immortal, that the world was eternal, as well as his association
with other Averroists such as Siger of Brabant, ultimately resulted in the condemnation of his writings by Etienne Tempier, bishop of Paris, in 1270 and again in 1277. Many of his writings are either lost or remain unedited. His three best-known works are De summo bono (‘On the supreme good’), De aeternitate mundi (‘On the eternity of the world’), and De somniis (‘On dreams’). Although he professed his faith in Christ as a Christian and may have joined the Dominican Order, his philosophical theories kept him at odds with the church for the remainder of his life. The exact date and place of his death are unknown.
Bibliography Bursill-Hall G L (1971). Speculative grammars of the middle ages, the doctrine of Partes Orationis of the Modistae (Approaches to Semiotics 11). The Hague: Mouton. Maurer A (1967). ‘Boethius of Dacia.’ In The Catholic University of America (ed.) New Catholic Encyclopedia, 19 vols. New York: McGraw-Hill. McDermott A & Senape C (eds.) (1980). Godfrey of Fontaine’s Abridgement of Boethius of Dacia’s Modi Significandi Sive Quaestiones Super Priscianum Maiorem. (Amsterdam Studies in the Theory and History of Linguistic Science 3) (Vol. 22). Amsterdam: John Benjamins B. V.
See also: Aristotle and Linguistics; Aristotle and the Stoics on Language; Priscianus Caesariensis (d. ca. 530).
Böhtlingk, Otto Nikolaus (1815–1904) S A Romashko, Moscow, Russia © 2006 Elsevier Ltd. All rights reserved.
Born into the family of a German merchant in St Petersburg, Russia, Otto von Boehtlingk studied Oriental Languages at the university of his native city, but in 1835 he moved to Germany, where he felt that his interest in Sanskrit could be satisfied. After a short time in Berlin, he finished his studies in Bonn as a pupil of August Wilhelm von Schlegel and Chr. Lassen. In Bonn he published his first work, the Sanskrit grammar of Pānini with Indian scholia and his own commentary (Boehtlingk, 1839–1840). In 1842 Boehtlingk returned to Russia to enter the Imperial Academy of Sciences in St Petersburg as a research fellow (he became a full member of the Academy in 1852). He published a series of articles on Sanskrit grammar, but the announced plan of an integral Sanskrit grammar never came into being. Instead, for a time he interrupted his work on Sanskrit and approached a new, pioneering task: the Academy commissioned him to systematize the Yakut data that had been collected by A. Th. von Middendorff’s Siberian expedition. At that time, this unwritten peripheral Turkic language from Eastern Siberia was hardly known. Analyzing the received data and working with an informant he found in St Petersburg, Boehtlingk provided a descriptive work (Boehtlingk, 1851), which is still considered a classic in the field of Altaic studies. Boehtlingk adapted the ideas of early European typological theory (from W. von Humboldt, A. F. Pott, and H. Steinthal) for the practical analysis of an agglutinating language and used the methods of comparative
and historical philology to distinguish the inherited Turkic vocabulary of Yakut from Mongolian and other borrowings. The main work of Boehtlingk was the Sanskrit dictionary (Boehtlingk and Roth, 1855–1875), also known as the St Petersburg dictionary, which was compiled with the assistance of Rudolf von Roth and other sanskritologists. It was the first European Sanskrit dictionary based not on Indian lexicographic works, but on the thorough study of primary texts. It was also a historical dictionary, representing the development of Sanskrit from the Vedic hymns through the late stages of the language. To complete his dictionary, Boehtlingk moved in 1868, with the permission of the Russian authorities, to Germany, where copious Sanskrit resources were available. He stayed in Germany until the end of his life, first in Jena and later in Leipzig. The so-called ‘shorter version’ of his Sanskrit dictionary (Boehtlingk, 1879–1889, also prepared with the assistance of many sanskritologists) in fact includes more entries than his earlier work; however, most of the examples were omitted from this version. An offshoot of Boehtlingk’s lexicographical work was a collection of Indian sayings (Boehtlingk, 1863–1865). During his life Boehtlingk published a number of Indian texts; his second edition of Pānini’s grammar (Boehtlingk, 1887) contains not only the text and a German translation but also indices, word and root lists, grammatical commentaries, and other useful supplements, which make up almost half of the book. See also: Panini; Sanskrit; Schlegel, August Wilhelm von (1767–1845); Turkic Languages; Yakut.
88 Boethius of Dacia (fl. 1275)
with other Averroists such as Siger of Brabant, ultimately resulted in the condemnation of his writings by Etienne Tempier, bishop of Paris, in 1270 and again in 1277. Many of his writings are either lost or remain unedited. His three best-known works are De summo bono (‘On the supreme good’), De aeternitate mundi (‘On the eternity of the world’), and De somniis (‘On dreams’). Although he professed his faith in Christ as a Christian and may have joined the Dominican Order, his philosophical theories kept him at odds with the church for the remainder of his life. The exact date and place of his death are unknown.
Bibliography Bursill-Hall G L (1971). Speculative grammars of the middle ages, the doctrine of Partes Orationis of the Modistae (Approaches to Semiotics 11). The Hague: Mouton. Maurer A (1967). ‘Boethius of Dacia.’ In The Catholic University of America (ed.) New Catholic Encyclopedia, 19 vols. New York: McGraw-Hill. McDermott A & Senape C (eds.) (1980). Godfrey of Fontaine’s Abridgement of Boethius of Dacia’s Modi Significandi Sive Quaestiones Super Priscianum Maiorem. (Amsterdam Studies in the Theory and History of Linguistic Science 3) (Vol. 22). Amsterdam: John Benjamins B. V.
See also: Aristotle and Linguistics; Aristotle and the Stoics on Language; Priscianus Caesariensis (d. ca. 530).
Bo¨htlingk, Otto Nikolaus (1815–1904) S A Romashko, Moscow, Russia ! 2006 Elsevier Ltd. All rights reserved.
Born into a family of a German merchant in St Petersburg, Russia, Otto von Boehtlingk studied Oriental Languages at the university of his native city, but in 1835 he moved to Germany, where he felt that his interest in Sanskrit could be satisfied. After a short time in Berlin, he finished his studies in Bonn as a pupil of August Wilhelm von Schlegel and Chr. Lassen. In Bonn he published his first work, the Sanskrit grammar of Pa¯nini with Indian scholia and his own commentary (Boehtlingk, 1839–1840). In 1842 Boehtlingk returned to Russia to enter the Imperial Academy of Sciences in St Petersburg as a research fellow (he became a full member of the Academy in 1852). He published a series of articles on Sanskrit grammar, but the announced plan of an integral Sanskrit grammar never came into being. Instead, for a time he interrupted his work on Sanskrit and approached a new, pioneering task; the Academy commissioned him to systematize the Yakut data that had been collected by A. Th. von Middendorff’s Siberian expedition. At that time, this unwritten peripheral Turkic language from Eastern Siberia was hardly known. Analyzing the received data and working with an informant he found in St Petersburg, Boehtlingk provided a descriptive work (Boehtlingk, 1851), which is still considered a classic in the field of Altaic studies. Boehtlingk adapted the ideas of early European typological theory (from W. von Humboldt, A. F. Pott, and H. Steinthal) for the practical analysis of an agglutinating language and used the methods of comparative
and historical philology to distinguish the inherited Turkic vocabulary of Yakut from Mongolian and other borrowings. The main work of Boehtlingk was the Sanskrit dictionary (Boehtlingk and Roth, 1855–1875), also known as the St Petersburg dictionary, which was compiled with assistance of Rudolf von Roth and other sanskritologists. It was the first European Sanskrit dictionary based not on Indian lexicographic works, but on the thorough study of primary texts. It was also a historical dictionary, representing the development of Sanskrit from the Vedic hymns through the late stages of the language. To complete his dictionary, Boehtlingk moved to Germany in 1868, with the permission of Russian authorities, where copious Sanskrit resources were available. He stayed in Germany until the end of his life, first in Jena and later in Leipzig. The so-called ‘shorter version’ of his Sanskrit dictionary (Boehtlingk, 1879–1889, also prepared with assistance of many sanskritologists) in fact includes an enlarged number of entries versus his earlier work; however, most of the examples were omitted from this version. An offspring of Boehtlingk’s lexicographical work was a collection of Indian sayings (Boehtlingk, 1863–1865). During his life Boehtlingk published a number of Indian texts; his second edition of Pa¯nini’s grammar (Boehtlingk, 1887) contains not only the text and a German translation, but almost the half of the book consists of indices, word and root lists, grammatical commentaries, and other useful supplements. See also: Panini; Sanskrit; Schlegel, August Wilhelm von (1767–1845); Turkic Languages; Yakut.
Bibliography
Boehtlingk O N (1839–1840). Pânini’s acht Bücher grammatischer Regeln (2 vols). Bonn: König.
Boehtlingk O N (1845). Sanskrit-Chrestomatie. St Petersburg: Kaiserliche Akademie der Wissenschaften. [2nd edn. 1877.]
Boehtlingk O N (1851). Über die Sprache der Jakuten. St Petersburg: Kaiserliche Akademie der Wissenschaften. [Reprinted: The Hague: Mouton, 1964.]
Boehtlingk O N & Roth R (1855–1875). Sanskrit-Wörterbuch (7 vols). St Petersburg: Kaiserliche Akademie der Wissenschaften. [Reprint: Osnabrück: Zeller/Wiesbaden: Harrassowitz, 1966.]
Boehtlingk O N (1863–1865). Indische Sprüche (3 vols). St Petersburg: Kaiserliche Akademie der Wissenschaften. [2nd edn., 1870–1873; reprint of the 2nd edn.: Osnabrück: Zeller/Wiesbaden: Harrassowitz, 1966.]
Boehtlingk O N (1879–1889). Sanskrit-Wörterbuch in kürzerer Fassung (7 vols). St Petersburg: Kaiserliche Akademie der Wissenschaften. [Reprint: Osnabrück: Zeller/Wiesbaden: Harrassowitz, 1966.]
Boehtlingk O N (1887). Pânini’s Grammatik. Leipzig: Haessel. [Reprints: Hildesheim: Olms, 1964/Delhi: Motilal Banarsidass, 1998.]
Bulich S K (1904). ‘Pamjati O. N. f. Betlinga.’ Izvestija Otdelenija russkogo jazyka i slovesnosti Imperatorskoj Akademii nauk 9, 187–200.
Kirfel W (1955). ‘Boehtlingk, Otto Nikolaus von.’ In Neue Deutsche Biographie, vol. 2. Berlin: Duncker & Humblot. 396–397.
Salemann K & Oldenburg S von (1892). ‘Boehtlingk’s Druckschriften.’ Mélange Asiatique 10, 247–256.
Windisch E (1920). Geschichte der Sanskrit-Philologie und indischen Altertumskunde (vol. 2). Strassburg: Trübner.
Bolivia: Language Situation
M Crowhurst, University of Texas, Austin, TX, USA
© 2006 Elsevier Ltd. All rights reserved.
Bolivia is home to approximately 40 indigenous languages representing four distinct Amerindian stocks, an impressive degree of linguistic diversity (see Figure 1). Two European languages are also spoken: in addition to Spanish, Plautdietsch (Low German) is spoken in eastern Bolivia by Mennonites who emigrated from Canada (possibly via Mexico) to avoid conscription during World War I. The best represented of the Amerindian stocks, in terms of number of living speakers, is Andean: Aymara and Quechua are spoken natively by millions of Bolivians. These languages are spoken primarily in the mountainous southwestern third of Bolivia. In recent years, the presence of Quechua and Aymara in urban centers further to the east has increased dramatically as speakers have migrated in search
of better economic opportunities. A third Andean language, Leco, is nearly extinct, according to data from Bolivia’s Rural Indigenous Census of 1994 (the source for all numerical figures in this article). Finally, Callahuaya (Callawalla), which blends Quechua morphosyntax with roots from Puquina, an extinct language of Peru, was a specialized (nonnative) language used by Incan herb doctors, and is still used by a few herb doctors today. The great majority of Bolivia’s languages spring from the Equatorial-Tucanoan and Macro-Panoan stocks (see Figures 2 and 3). A final group of three varieties – Besiro, as well as the now extinct Moncoca and Churapa – belongs to the Chiquitano family, a linguistic isolate. (Note: the Ethnologue classifies Chiquitano as Macro-Ge. This is probably an oversimplification: Díez Astete and Murillo (1998: 75–76) indicated that Chiquitano is an artificial family made up of more than 40 languages spoken by ethnolinguistic groups who were forcibly relocated in Jesuit missions
Figure 1 Macro-linguistic affiliation of Bolivian languages (References: Ruhlen, 1991; Ethnologue).
Figure 2 Equatorial-Tucanoan languages spoken in Bolivia (More detailed information concerning classification can be found in Ruhlen, 1991; Jensen, 1999; and the Ethnologue).
Figure 3 Macro-Panoan languages spoken in Bolivia (References: Ruhlen, 1991; Ethnologue).
in the Chiquitos region beginning in 1550. The relationships among these languages are not known. Besiro is thought to have resulted from contact among several languages in this group.)

Bolivia’s Equatorial-Tucanoan, Macro-Panoan, and Chiquitano languages, along with Itonama (Paezan), are (or were) spoken in the Tierras Bajas, or Lowlands, in the zones known as Amazonía (in the north), Oriente, and the Chaco (south, adjacent to Paraguay and Argentina). All of the lowland languages are endangered to a greater or lesser extent. Many, including Canichana, Cayubaba, and Reyesano, will become extinct once the few remaining, elderly speakers have passed away. Some lowland languages, for example, Guaraní and the Moxo varieties, are relatively stable. Still other languages, at greater risk of extinction, represent two general situations. Some are robust within their heritage communities, but the futures of the groups themselves are uncertain because their members are too few to guarantee sustainability (for example, Araona, Ayoreo, and Sirionó). In other cases, the ethnolinguistic group itself faces no risk of imminent collapse but is undergoing a process of language shift in which the heritage language is gradually replaced by a regionally dominant language in all spheres of life. Examples are Guarayu, and especially Besiro, which is being passed on at a rate of only one child learner per eight adult speakers.

The displacing language in Bolivia has generally been Spanish, but this has not always been the case: Chané (Arawakan), a language of the Chaco, was displaced by northwardly migrating Guaraní who conquered and enslaved the Chané people before the arrival of the Spaniards in the 16th century (Pifarré, 1989; Díez Astete and Murillo, 1998). Contact between Guaraní and Chané produced the antecedent of what is now Izoceño, one of three main dialects of Bolivian Guaraní (see Figure 2). Detailed demographic information concerning the linguistic status of Bolivia’s lowland languages is provided in Table 1.

Table 1 Population and language statistics for the indigenous groups of Bolivia’s Lowland Region
Column headings: Linguistic family; Heritage language (a); Total population of ethnolinguistic group (all ages)
a Churapa, Moncoca, Jorá, Paunaca, Saraveca, Toromona, and Pauserna are not included in Table 1 because no data are available for these languages (which are extinct or nearly extinct). Callahuaya is not included because it is not spoken as a first language.
b Figures accompanied by the abbreviation “abs” represent absolute numbers, not percentages. (Source: the Rural Indigenous Census of 1994, reported in Díez Astete & Murillo, 1998.)
Language Maps (Appendix 1): Map 50.
Bibliography
Albo X (1976). Lengua y sociedad en Bolivia. La Paz: Republica de Bolivia, Ministerio de Planeamiento y Coordinacion, Instituto Nacional de Estadistica.
Albo X (1995). Bolivia plurilingüe: guía para planificadores y educadores, vols 1 and 2. Cuadernos de Investigación 44. La Paz: Imprenta Publicidad Papiro.
Díez Astete A & Murillo D (1998). Pueblos indígenas de Tierras Bajas: características principales. La Paz: Talleres Gráficos.
Dietrich W (1986). El idioma Chiriguano: gramatica, textos, vocabulario. Madrid: Ediciones Cultural Hispanica, Instituto de Cooperación Iberoamericana.
Hardman M, Vásquez J & Yapita J D (1988). Aymara: compendio de estructura fonológica y gramatical. La Paz: Gramma Impresión.
Hoeller A P (1932a). Grammatik der Guarayo Sprache. Hall im Tirol: Verlag der Missionsprokura der Franziskaner.
Hoeller A P (1932b). Guarayo-Deutsches Wörterbuch. Hall im Tirol: Verlag der Missionsprokura der Franziskaner.
Ibarra Grasso D E (1982). Las lenguas indigenas en Bolivia. La Paz: Libreria Editorial Juventud.
Instituto Nacional de Estudios Lingüísticos (1984). Atlas etnolingüístico de Bolivia. La Paz: Instituto Nacional de Antropología.
Jensen C (1999). ‘Tupi-Guarani.’ In Dixon R M W & Aikhenvald A Y (eds.) The Amazonian languages. Cambridge: Cambridge University Press. 125–164.
Lema A M (1998). Pueblos indígenas de la Amazonía Boliviana. La Paz: AIP FIDA-CAF.
Melià B (1989). Los Guaraní-Chiriguano 1: Ñande Reko: nuestro modo de ser. La Paz: Librería Editorial Popular.
Métraux A (1927). Migrations historiques des Tupí-Guaraní. Paris: Maisonneuve frères.
Métraux A (1942). ‘The native tribes of eastern Bolivia and western Matto Grosso.’ Bureau of American Ethnology, bulletin no. 134. Smithsonian Institution.
Montaño Aragon M (1987). Guia etnografica linguistica de Bolivia: tribus de la selva. La Paz: Editorial Don Bosco.
Pifarré F (1989). Los Guaraní-Chiriguano 2: historia de un pueblo. La Paz: Librería Editorial Popular.
Ruhlen M (1991). A guide to the world’s languages, vol. 1: Classification. Stanford, CA: Stanford University Press.
Summer Institute of Linguistics (1965). Gramaticas estructurales de lenguas bolivianas. Riberalta, Bolivia: Summer Institute of Linguistics.
Boole and Algebraic Semantics
E L Keenan, University of California, Los Angeles, CA, USA
A Szabolcsi, New York University, New York, NY, USA
© 2006 Elsevier Ltd. All rights reserved.
In 1854 George Boole, a largely self-educated British mathematician, published a remarkable book, The laws of thought, in which he presented an algebraic formulation of “those operations of the mind by which reasoning is performed” (Bell, 1965: 1). Since then, boolean algebra has become a rich subbranch of mathematics (Koppelberg, 1989), with extensive applications in computer science and, to a lesser extent, linguistics (Keenan and Faltz, 1985). Here we illustrate the core boolean notions currently used in the study of natural language semantics. Most such applications postdate Boole’s work by more than a century, though Boole (1952: 59) anticipated some of the linguistic observations, pointing out, for example, that Animals are either rational or irrational does not mean the same as Either animals are rational or animals are irrational; similarly, Men are, if wise, then temperate does not mean If all men are wise then all men are temperate. Generative grammarians rediscovered such truths in the latter third of the 20th century.

We begin with the basic notion of a partially ordered set (poset) and characterize richer structures with linguistic applications as posets satisfying additional conditions (Szabolcsi, 1997; Landman, 1991). A poset consists of a domain D of objects on which is defined a binary relation R, called a partial order relation, which is reflexive (for all x in D, xRx), transitive (xRy and yRz implies xRz), and antisymmetric (xRy and yRx implies x = y). For example, the ordinary arithmetical ≤ relation is a partial order: n ≤ n, for any natural number n; if n ≤ m and m ≤ p, then n ≤ p; and if n ≤ m and m ≤ n, then n = m. Similarly, the subset relation ⊆ is reflexive: any set A is a subset of itself. And if A ⊆ B and B ⊆ C, then A ⊆ C, so ⊆ is transitive. And finally, if A ⊆ B and B ⊆ A, then A = B, that is, A and B are the same set, since they have the same members. So partial order relations are quite familiar from elementary mathematics.

A case of interest to us is the arithmetical ≤ restricted to {0, 1}. Here 0 ≤ 1, 0 ≤ 0, and 1 ≤ 1, but it is not the case that 1 ≤ 0. Representing the truth value ‘False’ as 0 and ‘True’ as 1, we can say that a conditional sentence ‘if P then Q’ is True if and only if TV(P) ≤ TV(Q), where TV(P) is the truth value of P, etc. Thus we think of sentences of the True/False sort as denoting in a set {0, 1} on which is defined a partial order, ≤. The denotations of expressions in other categories defined in terms of {0, 1} inherit this order. For example, one-place predicates (P1s), such as is even or lives in Brooklyn, can be presented as properties of the elements of the set E of objects under discussion. Such a property p looks at each entity x in E and says ‘True’ or ‘False’ depending on whether x has p or not. So we represent properties p, q as functions from E into {0, 1}, and we define p ≤ q if and only if (iff) for all x in E, p(x) ≤ q(x), which just means if p is True of x, then so is q. The ≤ relation just defined on functions (from E into {0, 1}) is provably a partial order. Other expressions similarly find their denotations in a set with a natural partial order (often denoted with a symbol like ‘≤’). A crucial example for linguists concerns the denotations of count NPs (Noun Phrases), such as some poets, most poets, etc., as they occur in sentences (Ss) like Some poets
daydream. We interpret this S as True iff there is an entity x that both the ‘poet’ property p and the ‘daydreams’ property d map to 1. Similarly, No poets daydream is True iff there is no such x. And Most poets daydream is True iff the set of x such that p(x) = 1 and d(x) = 1 outnumbers the set of x such that p(x) = 1 and d(x) = 0. That is, the set of poets that daydream is larger than the set that don’t. And for F, G possible NP denotations (called generalized quantifiers), we define F ≤ G iff for all properties p, F(p) ≤ G(p). This relation is again a partial order.

As NP denotations map one poset (properties) to another (truth values), it makes sense to ask whether a given function F preserves the order (if p ≤ q, then F(p) ≤ F(q)), reverses it (if p ≤ q, then F(q) ≤ F(p)), or does neither. Some/all/most poets preserve the order, since, for example, is laughing loudly ≤ is laughing and Some poet is laughing loudly ≤ Some poet is laughing, which just means, recall, that if the first sentence is True, then the second is. In contrast, no poet reverses the order, since, in the same conditions, No poet is laughing implies No poet is laughing loudly. The reader can verify that fewer than five poets, neither poet, at most six poets, and neither John nor Bill are all order reversing. And here is an unexpected linguistic correlation: reversing order correlates well with those subject NPs that license negative-polarity items, such as ever:
(1a) No student here has ever been to Pinsk.
(1b) *Some student here has ever been to Pinsk.
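The order-preservation and order-reversal claims above can be checked by brute force on a small model. The following is a minimal sketch, not part of the original article: the three-element universe, the POET property, and all function names are assumptions made for illustration, and properties are encoded as 0/1 tuples aligned with E.

```python
from itertools import product

# Hypothetical three-element universe; the names are illustrative only.
E = ("a", "b", "c")

# A property is a function from E into {0, 1}, encoded as a 0/1 tuple
# aligned with E, so all 2**len(E) properties can be enumerated.
PROPERTIES = list(product((0, 1), repeat=len(E)))

def leq(p, q):
    """Pointwise order on properties: p <= q iff p(x) <= q(x) for all x."""
    return all(px <= qx for px, qx in zip(p, q))

def count_both(n, d):
    """Number of entities that have both property n and property d."""
    return sum(1 for nx, dx in zip(n, d) if nx and dx)

def some(n):
    return lambda d: int(count_both(n, d) > 0)

def no(n):
    return lambda d: int(count_both(n, d) == 0)

def most(n):
    # True iff the n's that have d outnumber the n's that lack d.
    return lambda d: int(count_both(n, d) > sum(n) - count_both(n, d))

def exactly_one(n):
    return lambda d: int(count_both(n, d) == 1)

def preserves_order(F):
    return all(F(p) <= F(q)
               for p in PROPERTIES for q in PROPERTIES if leq(p, q))

def reverses_order(F):
    return all(F(q) <= F(p)
               for p in PROPERTIES for q in PROPERTIES if leq(p, q))

POET = (1, 1, 0)  # assumption: a and b are the poets
for name, det in (("some", some), ("no", no),
                  ("most", most), ("exactly one", exactly_one)):
    F = det(POET)
    print(name, preserves_order(F), reverses_order(F))
```

In this toy model, some poet and most poets come out order preserving, no poet order reversing, and exactly one poet neither, which matches the three-way classification in the text.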
As a second linguistic application, observe that modifying adjectives combine with property-denoting expressions (nouns) to form property-denoting expressions and can be represented semantically by functions f from properties to properties. For example, tall combines with student to form tall student, and semantically it maps the property of being a student to that of being a tall student. And overwhelmingly, when f is an adjective function and p a property, f(p) ≤ p: all tall students are students, etc.

In fact, the denotation sets for the expressions we have discussed possess a structure much richer than a mere partial order: they are (boolean) lattices. A lattice is a poset in which for all elements x, y of the domain, the set {x, y} has a least upper bound (lub), noted (x ∨ y) and read as ‘x join y,’ and a greatest lower bound (glb), noted (x ∧ y) and read as ‘x meet y.’ An upper bound (ub) for a subset K of a poset is an element z such that every element of K is ≤ z. A ub z for K is a lub for K iff z ≤ every ub for K. Dually, a lower bound (lb) for K is an element w ≤ every element of K; such a w is a glb for K iff every lb for K is ≤ w. For example, in the truth value lattice {0, 1}, lubs are given by the standard truth table for disjunction: 1 ∨ 1 = 1, 1 ∨ 0 = 1, 0 ∨ 1 = 1, and 0 ∨ 0 = 0. That is, a disjunction is False if both disjuncts are False, and True otherwise. Similarly, glbs are given by the truth table for conjunction: a conjunction of Ss is True iff each conjunct is, and False otherwise. So here the denotation of or is given by ∨, and that for and by ∧. And this is quite generally the case. In our lattices of functions, for example, f ∨ g, the lub of {f, g}, is that function mapping each argument x to f(x) ∨ g(x). Similarly, f ∧ g maps each x to f(x) ∧ g(x). So, for example, in the lattice of properties, the glb of {POET, DOCTOR} is that property which an entity x has iff POET(x) = 1 and DOCTOR(x) = 1, that is, x is both a poet and a doctor.

So, in general, we see that the lattice structure provides denotations for the operations of conjunction and disjunction, regardless of the category of expression we are combining. We might emphasize that the kinds of objects denoted by Ss, P1s, Adjectives, NPs, etc., are quite different, but in each category conjunctions and disjunctions are generally interpreted by glbs and lubs of the conjuncts and disjuncts. So Boole’s original intuition that these operations represent properties of mind (how we look at things) rather than properties specific to any one of these categories, is supported.

And we are not done: boolean lattices present an additional operation, complement, which provides a denotation for negation. Note that negation does combine with expressions in a variety of categories: with Adjectives in a bright but not very diligent student, with P1s in Most of the students drink but don’t smoke, etc. Formally, a lattice is said to be bounded if its domain has a glb (noted 0) and a lub (noted 1). Such a lattice is complemented if for every x there is a y such that x ∧ y = 0 and x ∨ y = 1. If for each x there is exactly one such y, it is noted ¬x and called the complement of x. In {0, 1}, for example, ¬0 = 1 and ¬1 = 0.
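The claim that lubs and glbs in a lattice of functions are computed pointwise can be checked against the order-theoretic definitions. The sketch below is illustrative only: the universe and the POET and DOCTOR tuples are assumed values, and `lub` and `glb` are hypothetical helper names that search the whole poset for a least upper bound or greatest lower bound.

```python
from itertools import product

E = ("x1", "x2", "x3")  # hypothetical universe of entities
PROPERTIES = list(product((0, 1), repeat=len(E)))  # functions E -> {0, 1}

def leq(p, q):
    """Pointwise partial order on properties."""
    return all(a <= b for a, b in zip(p, q))

def lub(K):
    """Least upper bound by definition: an upper bound below every upper bound."""
    ubs = [z for z in PROPERTIES if all(leq(k, z) for k in K)]
    return [z for z in ubs if all(leq(z, u) for u in ubs)][0]

def glb(K):
    """Greatest lower bound by definition: a lower bound above every lower bound."""
    lbs = [w for w in PROPERTIES if all(leq(w, k) for k in K)]
    return [w for w in lbs if all(leq(v, w) for v in lbs)][0]

def join(p, q):
    """Pointwise disjunction: on {0, 1}, max is the truth table for 'or'."""
    return tuple(max(a, b) for a, b in zip(p, q))

def meet(p, q):
    """Pointwise conjunction: on {0, 1}, min is the truth table for 'and'."""
    return tuple(min(a, b) for a, b in zip(p, q))

POET = (1, 1, 0)    # assumption: x1 and x2 are poets
DOCTOR = (0, 1, 1)  # assumption: x2 and x3 are doctors

print(glb([POET, DOCTOR]), meet(POET, DOCTOR))
print(lub([POET, DOCTOR]), join(POET, DOCTOR))
```

Under these assumptions the glb of {POET, DOCTOR} is the property held by x2 alone, the only entity that is both a poet and a doctor, and the definitional and pointwise computations agree for every pair of properties.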
In our function lattices, ¬f is that function mapping each x to ¬(f(x)). In distributive lattices (ones satisfying x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z) and x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)), each x has at most one complement. A lattice is called boolean if it is a complemented distributive lattice. And, again, a linguistic generalization: the negation of an expression d in general denotes the complement of the denotation of d. Given uniqueness of complements, ¬ is a function from the lattice to itself, one that reverses the order: if x ≤ y, then ¬y ≤ ¬x. We therefore expect negation to license negative-polarity items in the predicate, and it does: He hasn’t ever been to Pinsk is natural, *He has ever been to Pinsk is not. Reversing the order on denotations, then, is what ordinary negation has in common with NPs such as no poet, neither John nor Bill, etc., which as we saw earlier also license negative-polarity items.
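The defining conditions of a boolean lattice, and the order-reversing behavior of complement, can be verified exhaustively on a small property lattice. This is a sketch under assumed names (a three-element universe, tuple-encoded properties), not a construction taken from the article.

```python
from itertools import product

E = ("x1", "x2", "x3")  # hypothetical universe
PROPERTIES = list(product((0, 1), repeat=len(E)))
BOTTOM = (0,) * len(E)  # the property true of nothing (lattice 0)
TOP = (1,) * len(E)     # the property true of everything (lattice 1)

def leq(p, q):
    return all(a <= b for a, b in zip(p, q))

def join(p, q):
    return tuple(max(a, b) for a, b in zip(p, q))

def meet(p, q):
    return tuple(min(a, b) for a, b in zip(p, q))

def complement(p):
    """Pointwise negation: maps each x to the complement of p(x) in {0, 1}."""
    return tuple(1 - a for a in p)

def complements_of(p):
    """Every y with p meet y = 0 and p join y = 1, found by search."""
    return [y for y in PROPERTIES
            if meet(p, y) == BOTTOM and join(p, y) == TOP]

# Each property has exactly one complement, the pointwise one.
unique_complements = all(complements_of(p) == [complement(p)]
                         for p in PROPERTIES)

# Both distributive laws hold for all triples.
distributive = all(
    meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
    and join(x, meet(y, z)) == meet(join(x, y), join(x, z))
    for x in PROPERTIES for y in PROPERTIES for z in PROPERTIES)

# Complement reverses the order: if p <= q, then not-q <= not-p.
negation_reverses = all(leq(complement(q), complement(p))
                        for p in PROPERTIES for q in PROPERTIES if leq(p, q))

print(unique_complements, distributive, negation_reverses)
```

All three checks succeed on this model, so the property lattice is complemented and distributive, hence boolean, and negation is order reversing in the sense relevant to negative-polarity licensing.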
The boolean lattices we have so far invoked have further common properties. They are, for example, complete, meaning that each subset, not just ones of the form {x, y}, has a glb and a lub. They are also atomic (Keenan and Faltz, 1985: 56). In addition, different categories have some distinctive properties, which, with one exception, space limitations prevent us from reviewing (see also Keenan, 1983). The exception is the lattice of count NP denotations, needed for expressions such as most poets and five of John’s students. This lattice has the property of having a set of complete, independent (free) generators, called individuals (denotable by definite singular NPs, such as John, Mary, this poet). This means that any function from properties to truth values is in fact a boolean function (meet, join, complement) of individuals (Keenan and Faltz, 1985: 92). And this implies that the truth value of an S of the form [[Det N] + P1], for P1 noncollective, is booleanly computable if we know which individuals have the N and the P1 properties. The truth of Ss like Most of the students laughed, No students laughed, etc., is determined once that information is given. This semantic reduction to individuals is a major simplification, in that the number of individuals is the number of elements in E, whereas the number of possible NP denotations is that of the power set of the power set of E. So speaking of an E with just four elements, we find there are just four individuals but 65 536 NP denotations.

These freely generated algebras show up in another, unexpected syntactic way. Szabolcsi and Zwarts (1993) observed that negation determines a context that limits the class of questions (relative clauses, etc.) we can grammatically form. Thus, the questions in (2) are natural, but those in (3), in which the predicates are negated, are not:
(2) How tall is John?
    How much did the car cost?
(3) *How tall isn’t John?
    *How much didn’t the car cost?
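Returning briefly to the counting claim made above for a four-element E, the figures can be reproduced mechanically, together with a few NP denotations written as boolean combinations of individuals. This is a minimal sketch: the entity names, the POET property, and the helper names are assumptions introduced for illustration.

```python
from itertools import product

E = ("e1", "e2", "e3", "e4")  # hypothetical four-element universe

# Properties: functions from E into {0, 1}, encoded as 0/1 tuples.
properties = list(product((0, 1), repeat=len(E)))

# NP denotations: functions from properties into {0, 1}.
num_np_denotations = 2 ** len(properties)

def individual(i):
    """The individual generated by the i-th entity: maps p to p(e_i)."""
    return lambda p: p[i]

individuals = [individual(i) for i in range(len(E))]
print(len(individuals), len(properties), num_np_denotations)

POET = (1, 1, 0, 0)  # assumption: e1 and e2 are the poets
poets = [individuals[i] for i, is_poet in enumerate(POET) if is_poet]

def some_poet(d):
    """Join (lub) of the individuals that are poets."""
    return max(I(d) for I in poets)

def every_poet(d):
    """Meet (glb) of the individuals that are poets."""
    return min(I(d) for I in poets)

def no_poet(d):
    """Complement of the join of the individuals that are poets."""
    return 1 - some_poet(d)

def some_poet_direct(d):
    """Direct truth conditions: some entity is both a poet and a d."""
    return int(any(n and x for n, x in zip(POET, d)))

print(all(some_poet(d) == some_poet_direct(d) for d in properties))
```

With four entities there are 16 properties and 2 to the power 16, that is 65 536, functions from properties to truth values, yet the three sample denotations are built from just two individuals using join, meet, and complement.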
It is tempting to say simply that we cannot question out of negative contexts, but that is not correct. Both questions in (4) are acceptable: (4) How many of the books on the list did/didn’t you read?
A more accurate statement is that negation blocks questioning from domains that lack individuals (free generators), such as amounts and degrees. So, as with the distribution of negative-polarity items, we find an unexpected grammatical sensitivity to boolean structure. Much ongoing work in algebraic semantics focuses on NPs (and their predicates) that are not boolean
compounds of individuals. The predicates in the Ss in (5) force us to interpret their subjects as groups.
(5a) John and Mary respect each other/are a nice couple.
(5b) Russell and Whitehead wrote Principia mathematica together.
(5c) The students gathered in the courtyard/surrounded the building.
(5d) Six teaching assistants graded 120 papers between them.
Respecting each other (being a nice couple, etc.) holds of a group of individuals if certain conditions among them obtain. But it does not make sense to say *John respects each other (*He is a nice couple, etc.), so we must interpret ‘and’ somewhat differently from the glb operator discussed earlier. We note that the other boolean connectives, such as either . . . or . . . and neither . . . nor . . ., do not admit of a reinterpretation in the way that ‘and’ does (Winter, 2001). *Either John or Mary respect each other is nonsense: the disjunctive subject still forces a lub interpretation in which respect each other would hold of at least one of the disjuncts.

First attempts to provide denotations for the subject NPs in (5) involve enriching the understood domain E of entities with a partial order relation called part-of, to capture the sense in which the individual John is part of the denotation of John and Mary in (5a) or some individual student is part of the group of students in (5c), etc. The group itself is a new type of object, one that is the lub of its parts. And new types of predicates, such as those in (5), can select these new objects as arguments. Thus, the domain of a model is no longer a mere set E but is a join semi-lattice, a set equipped with a part-of partial order in which each nonempty subset has a lub (see Link, 1983, 1998; Landman, 1991). Yet other new types of arguments are mass terms (6a) and event nominals (6b).
(6a) Water and alcohol don’t mix.
(6b) 4000 ships passed through the lock last year. (Krifka, 1991)
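The join semi-lattice of individuals and groups described above can be modeled concretely. The sketch below makes a simplifying assumption that is not claimed by the article: groups are encoded as nonempty sets of atoms, with part-of as the subset relation. The atom names and helper functions are hypothetical.

```python
from itertools import combinations

ATOMS = ("john", "mary", "bill")  # hypothetical individuals

# Domain: every nonempty set of atoms. Singletons stand for individuals,
# larger sets for groups. The empty set is deliberately excluded.
DOMAIN = [frozenset(c)
          for size in range(1, len(ATOMS) + 1)
          for c in combinations(ATOMS, size)]

def part_of(x, y):
    """Part-of modeled as the subset relation (an assumption of this sketch)."""
    return x <= y

def lub(K):
    """Least upper bound of K under part-of, or None if there is none."""
    K = list(K)
    upper = [z for z in DOMAIN if all(part_of(k, z) for k in K)]
    least = [z for z in upper if all(part_of(z, u) for u in upper)]
    return least[0] if least else None

def glb(K):
    """Greatest lower bound of K under part-of, or None if there is none."""
    K = list(K)
    lower = [w for w in DOMAIN if all(part_of(w, k) for k in K)]
    greatest = [w for w in lower if all(part_of(v, w) for v in lower)]
    return greatest[0] if greatest else None

john, mary = frozenset({"john"}), frozenset({"mary"})
john_and_mary = lub([john, mary])  # the group denoted by 'John and Mary'

# Every nonempty subset of the domain has a lub, as a join semi-lattice requires.
every_subset_has_lub = all(
    lub(K) is not None
    for size in range(1, len(DOMAIN) + 1)
    for K in combinations(DOMAIN, size))

print(sorted(john_and_mary), part_of(john, john_and_mary))
print(every_subset_has_lub, glb([john, mary]))
```

In this model John is part of the group John and Mary, every nonempty subset has a lub, and two distinct individuals have no glb because there is no empty object. The structure is therefore a join semi-lattice and not a full lattice.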
Mass term denotations have a natural part-of relation: if I pour a cup of coffee from a full pot, the coffee that remains, as well as that in my cup, is part of the original coffee. So mass term denotations are in some way ontologically uniform, with the result that definitional properties of a whole also apply to its parts: the coffee I poured and the coffee that remains are both coffee. This contrasts with the predicates in (5), where respect each other, gather in the courtyard, etc., do not make sense even when applied to the proper parts of their arguments. In general, mass
terms are much less well understood than count terms (see Pelletier and Schubert, 1989; Link, 1998). Last, observe that (6b) is ambiguous. It has a count reading, on which there are 4000 ships each of which passed through the lock (at least once) last year. But it also has an event reading, of interest here, on which it means that there were 4000 events of ships passing through the lock. If, for example, each ship in our fleet of 2000 did so twice, then there were 4000 passings but only 2000 ships that passed. Now, the event in (6b) has the individual passing events as parts, so such complex events exhibit something of the ontological uniformity of mass terms. But there are limits. The subevents of a single passing (throwing lines to the tugboats, etc.) are not themselves passings. So events present a part-of partial order with limited uniformity, and at least some events can be represented as the lubs of their parts. But in distinction to pure mass terms, events are ontologically complex, requiring time and place coordinates, Agent and Patient participants, etc., resulting in a considerable enrichment of our naïve ontology (see Parsons, 1990; Schein, 1993; and Landman, 2000).
See also: Formal Semantics; Monotonicity and Generalized Quantifiers; Negation: Semantic Aspects; Operators in Semantics and Typed Logics; Plurality; Polarity Items; Quantifiers: Semantics.
Bibliography
Bell E (1937). Men of mathematics. New York, NY: Simon and Schuster.
Boole G (1854). The laws of thought. Reprinted (1952) as vol. 2 in George Boole’s collected logical works. La Salle, IL: Open Court.
Carlson G (1977). ‘A unified analysis of the English bare plural.’ Linguistics and Philosophy 1, 413–456.
Keenan E L (1983). ‘Facing the truth: some advantages of direct interpretation.’ Linguistics and Philosophy 6, 335–371.
Keenan E L & Faltz L M (1985). Boolean semantics for natural language. Dordrecht: D. Reidel.
Koppelberg S (1989). Monk J D & Bonnet R (eds.) Handbook of boolean algebras, vol. 1. Amsterdam: North-Holland.
Krifka M (1991). ‘Four thousand ships passed through the lock: object-induced measure functions on events.’ Linguistics and Philosophy 13, 487–520.
Krifka M (1992). ‘Thematic relations as links between nominal reference and temporal constitution.’ In Sag I A & Szabolcsi A (eds.) Lexical matters. Chicago: CSLI Publications, Chicago University Press. 29–53.
Landman F (1991). Structures for semantics. Dordrecht: Kluwer.
Landman F (2000). Events and plurality. Dordrecht: Kluwer.
Link G (1983). ‘A logical analysis of plurals and mass terms: a lattice-theoretic approach.’ In Bäuerle R et al. (eds.) Meaning, use and interpretation in language. Berlin: de Gruyter. 302–323.
Link G (1998). Algebraic semantics in language and philosophy. Stanford: CSLI.
Parsons T (1990). Events in the semantics of English: a study in subatomic semantics. Cambridge, MA: MIT Press.
Pelletier F J & Schubert L K (1989). ‘Mass expressions.’ In Gabbay D & Guenthner F (eds.) Handbook of philosophical logic, vol. IV. Dordrecht: D. Reidel. 327–407.
Schein B (1993). Plurals and events. Cambridge, MA: MIT Press.
Szabolcsi A (ed.) (1997). Ways of scope taking. Dordrecht: Kluwer.
Szabolcsi A & Zwarts F (1993). ‘Weak islands and an algebraic semantics for scope taking.’ Natural Language Semantics 1, 235–284.
Winter Y (2001). Flexibility principles in boolean semantics. Cambridge, MA: MIT Press.
Boole, George (1815–1864)
E Shay, University of Colorado, Boulder, CO, USA
© 2006 Elsevier Ltd. All rights reserved.
George Boole, a mathematician who might have been a shoemaker, was born in Lincoln, UK, on November 2, 1815, to a lady’s maid and a shoemaker who could have been a mathematician. The younger Boole, who acquired an early love of mathematics from his brilliant father John, studied Latin with a tutor and by his late teens had taught himself Greek, French, and German. Finances did not allow him an elite education,
but through local schools, tutoring, and self-study he grew well versed in mathematics, languages, and literature. In 1831 he began teaching school, opening his own boarding school in 1835 while pursuing independent study of applied mathematics. Four years later he published his first professional paper. Despite his non-standard education, Boole in 1849 received a professorship in mathematics at the new Queen’s College, Cork, partly on the strength of testimonials from his hometown. In 1851 he was elected Dean of Science, the position he held until his death. At Cork he published the works for which he is best
known, including An investigation into the laws of thought (1854). The fundamental assumption of this work is that human language and reasoning can be expressed in algebraic terms and that the truth of a proposition can be examined without reference to the meaning of its components. Boolean logic is based on Boolean algebra, which is founded on the notions of sets, variables, and operators. If variables in an equation are replaced by propositions, and if operators are replaced by connectives such as ‘and,’ ‘or,’ ‘not,’ or ‘if . . . then,’ the truth of a proposition may be evaluated in the same way as the truth of an algebraic statement. The results of such an evaluation are binary: a proposition is held to be either true or not true.
Boolean logic emerges in several subdisciplines of linguistics. The notion that the truth of a proposition may be understood without reference to its meaning is crucial to formal semantics, to the ‘predicate calculus’ of Frege and others, and to Chomsky’s attempts to analyze grammar in mathematical terms. The binary nature of Boolean logic is fundamental to neuroscience, artificial intelligence, software design, and most notably to all digital and electronic devices that rely on binary switching circuits.
In addition to his seminal work on logic, Boole published roughly 50 papers on mathematics. He earned the Medal of the Royal Society in 1844 and was named a Fellow of the Society in 1857. In 1855 he married Mary Everest, niece of the famous explorer. He died of pneumonia on December 8, 1864.
See also: Chomsky, Noam (b. 1928); Formal Semantics; Frege, Gottlob (1848–1925).
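The claim that the truth of a compound proposition can be computed from its connectives alone, without reference to what the component propositions mean, can be illustrated with a minimal sketch. The helper names `implies` and `truth_table` and the two example formulas are assumptions chosen for illustration; they are not taken from Boole's text.

```python
from itertools import product

def implies(p, q):
    """Material conditional 'if p then q'."""
    return (not p) or q

def truth_table(formula, n_vars):
    """Evaluate a formula under every assignment of True/False to its variables."""
    return {values: formula(*values)
            for values in product((True, False), repeat=n_vars)}

# 'p and (if p then q)' compared with 'p and q'.
lhs = truth_table(lambda p, q: p and implies(p, q), 2)
rhs = truth_table(lambda p, q: p and q, 2)

# The two formulas agree under every assignment, whatever p and q stand for.
equivalent = lhs == rhs
```

Each evaluation yields one of exactly two values, which is the binary character of Boolean logic described in the article.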
Bibliography
Boole G (1854). An investigation into the laws of thought. London: Walton and Maberley (reprinted 1973, New York: Dover).
Keenan E L & Faltz L M (1985). Boolean semantics for natural language. Dordrecht: Kluwer.
MacHale D (1985). George Boole, his life and work. Dublin: Boole Press.
Bopp, Franz (1791–1867)
E F K Koerner, Zentrum für Allgemeine Sprachwissenschaft, Berlin, Germany
© 2006 Elsevier Ltd. All rights reserved.
Bopp (Figure 1) was born on September 14, 1791 in Mainz, and died on October 23, 1867 in Berlin. After one year studying classical as well as modern languages at the newly created University of Aschaffenburg, he went to Paris, inspired by Friedrich Schlegel’s (see Schlegel, Friedrich von (1772–1829)) Ueber die Sprache und Weisheit der Indier (1808), with the encouragement of his mentor Karl Joseph Windischmann, and through contacts established with the Orientalist Antoine Léonard de Chézy. There he studied Sanskrit (largely on his own), Arabic, and Persian with Antoine Isaac Silvestre de Sacy (see Silvestre de Sacy, Baron Antoine-Isaac (1758–1838)). In 1814, he received a grant from the King of Bavaria that allowed him to continue his research. This culminated in the book whose publication date – 1816 – is generally regarded as marking the beginning of comparative Indo-European linguistics. Bopp spent two more years in Paris until a grant from the Munich Academy of Sciences allowed him to move to London to add to his knowledge of Sanskrit through contacts with the most
distinguished scholars in the field, Henry Thomas Colebrooke and especially Charles Wilkins, both of whom had published grammars of the language. During his stay in Britain, Bopp produced a revised English version of the linguistic portion of his Conjugationssystem (1820) (the remainder was devoted to translations from Sanskrit literature). While in Paris, Bopp had introduced Friedrich Schlegel’s elder brother, August Wilhelm (see Schlegel, August Wilhelm von (1767–1845)) to the study of the classical Indic language and literature; in London, he tutored Wilhelm von Humboldt (see Humboldt, Wilhelm von (1767–1835)), who at the time was Prussian ambassador. In order to round off his studies to prepare himself for an academic career, Bopp asked the Bavarian Academy for permission to enroll at the University of Göttingen. Instead, the authorities there granted him a doctorate honoris causa in recognition of work already done. Soon afterwards, in the summer of 1821, he arrived in Berlin and (through the intervention of Wilhelm von Humboldt and his brother Alexander) was appointed extraordinary professor of Oriental languages and general linguistics. In 1825 he was made a full professor and a member of the Prussian Academy, in whose Proceedings he published a large number of his comparative linguistic works. From 1824 onward he published his own
Figure 1 Franz Bopp.
grammars of Sanskrit, and his comparative grammar of the major Indo-European languages appeared between 1833 and 1852. Although he had a number of distinguished students, including August Friedrich Pott (see Pott, August Friedrich (1802–1887)), Adalbert Kuhn, William Dwight Whitney (see Whitney, William Dwight (1827–1894)), and Michel Bréal (see Bréal, Michel Jules Alfred (1832–1915)), Bopp’s enormous impact on Sanskrit studies and on the field of comparative philology was largely produced – apart from his voluminous comparative grammar – by the vast number of his empirical studies of individual branches of the Indo-European language family. However, as Bréal (1991) pointed out, another reason for his success was that he did not slavishly follow the Indic grammatical tradition in his treatment of Sanskrit but introduced his own perspective to the analysis of this language in conjunction with Greek, Latin, Persian, and other Indo-European languages. Thus he developed a method of showing their basic structural identity, which provided the framework for several generations of comparative-historical linguists.
See also: Arabic; Bréal, Michel Jules Alfred (1832–1915); Humboldt, Wilhelm von (1767–1835); Persian, Old; Pott, August Friedrich (1802–1887); Sanskrit; Schlegel, August Wilhelm von (1767–1845); Schlegel, Friedrich von (1772–1829); Silvestre de Sacy, Baron Antoine-Isaac (1758–1838); Whitney, William Dwight (1827–1894).
Bibliography
Bopp F (1825). Vergleichende Zergliederung der Sanskrita-Sprache und der mit ihm verwandten Sprachen. Erste Abhandlung: Von den Wurzeln und Pronomen erster und zweiter Person. Abhandlungen der Königlichen Akademie der Wissenschaften zu Berlin; Philos.-historische Klasse 1825: 117–148. Repr. in Bopp 1972.
Bopp F (1827). Ausführliches Lehrgebäude der Sanskrita-Sprache (2nd edn.). Berlin: F. Dümmler.
Bopp F (1833–1852). Vergleichende Grammatik des Sanskrit, Send [Armenischen], Griechischen, Lateinischen, Litthauischen, [Altslawischen], Gothischen und Deutschen. 6 Abtheilungen. F. Dümmler, Berlin. 2nd edn. 1857–1861, 3 vols, repr. 1971, F. Dümmler, Bonn.
Bopp F (1972). Kleine Schriften zur vergleichenden Sprachwissenschaft: Gesammelte Berliner Abhandlungen 1824–54. Leipzig: Zentralantiquariat der DDR.
Bopp F & Windischmann K J (eds.) (1816). Über das Conjugationssystem der Sanskritsprache in Vergleichung mit jenem der griechischen, lateinischen, persischen und germanischen Sprache. Frankfurt: Andreäische Buchhandlung. Repr. 1975, Georg Olms, Hildesheim.
Bopp F & Koerner E F K (eds.) (1820). ‘Analytical comparison of the Sanskrit, Greek, Latin, and Teutonic languages, showing the original identity of their grammatical structure.’ Annals of Oriental Literature 1, 1–64. 1974 Benjamins, Amsterdam; 2nd edn. 1989 with detailed biography of Bopp.
Bréal M (1991). ‘Introduction to the French translation of Bopp’s Comparative Grammar.’ In Wolf G (ed.) The beginnings of semantics. Stanford, CA: Stanford University Press.
Koerner K (1989). ‘Franz Bopp.’ In Practicing linguistic historiography; Selected essays by K. Koerner. Amsterdam: Benjamins.
Lefmann S (1891–1895). Franz Bopp, sein Leben und seine Wissenschaft. Mit dem Bildnis Franz Bopps und einem Anhang: Aus Briefen und anderen Schriften (Parts I–II). Berlin: Georg Reimer.
Lefmann S (1897). ‘Franz Bopp.’ Nachtrag. Mit einer Einleitung und einem vollständigen Register. Berlin: Georg Reimer.
Lehmann W P (1991). ‘Franz Bopp’s use of typology.’ Z Phon 44(3), 275–284.
Morpurgo Davies A (1987). ‘“Organic” and “Organism” in Franz Bopp.’ In Hoenigswald H M & Wiener L F (eds.) Biological metaphor and cladistic classification: An interdisciplinary perspective. Philadelphia, PA: University of Pennsylvania Press.
Paustian P R (1978). ‘Bopp and the nineteenth-century distrust of the Indian grammatical tradition.’ Indogermanische Forschungen 82, 39–49.
Timpanaro S (1973). ‘Il contrasto tra i fratelli Schlegel e Franz Bopp sulla struttura e la genesi delle lingue indoeuropee.’ Critica Storica 10, 1–38.
Verburg P A (1950). ‘The background to the linguistic conceptions of Bopp.’ Lingua 2, 438–468. Repr. in Sebeok T A (ed.) 1966 Portraits of Linguists, vol. I. Bloomington, IN: Indiana University Press.
Borgstrøm, Carl Hjalmar (1909–1986)
E Hovdhaugen, University of Oslo, Oslo, Norway
© 2006 Elsevier Ltd. All rights reserved.
Borgstrøm was born in Kristiania (Oslo), Norway. When he began his studies of Celtic languages, his teacher, professor Carl Marstrander, encouraged him to choose Scottish Gaelic dialects as his speciality. Borgstrøm’s later studies of dialects on the Hebrides (Borgstrøm, 1940, 1941) were pioneer works and laid the foundation of subsequent investigations of Gaelic dialects. Borgstrøm also studied comparative Indo-European philology and from 1932 to 1935 was Lecturer in Comparative Philology at Trinity College in Dublin. In 1936–1937 he was Visiting Professor of Sanskrit in Ankara. During his stay in Turkey he learned Turkish and consequently offered courses in Turkish at the University of Oslo. During the war Borgstrøm went to Sweden and in 1945 he was a lecturer of linguistics in Lund. In 1947 he was appointed Professor of Comparative Indo-European Philology at the University of Oslo. However, during his entire career he published only a few articles in this field that were mainly overlooked or negatively received. His main interest (besides Celtic studies) was general linguistics, and he produced some discerning structural studies on Norwegian phonology (e.g., Borgstrøm, 1938). However, his most important publication was his introductory textbook on general linguistics, first published in 1958. It was a successful symbiosis of American and European structuralism interspersed
with Borgstrøm’s own ideas on language analysis. For almost two decades it was the basic textbook in linguistics in Norway and also to some extent in the other Nordic countries. Borgstrøm thus had an important influence on the emergence of structural linguistics in the Nordic countries. Borgstrøm was a shy and formal person, but as a teacher and supervisor he was unique. He stimulated, encouraged, and supported his students in a challenging way. A whole generation of Norwegian linguists was influenced by his broad theoretical orientation, his penetrating and constructive way of analyzing linguistic data, and his scholarly and human generosity.
See also: Marstrander, Carl J. S. (1883–1965).
Bibliography
Borgstrøm C Hj (1938). ‘Zur Phonologie der norwegischen Schriftsprache (nach der ostnorwegischen Aussprache).’ NTS 9, 250–273.
Borgstrøm C Hj (1940). The dialects of the Outer Hebrides. A Linguistic Survey of the Gaelic Dialects of Scotland, vol. 1. NTS Suppl. bind 1. Oslo: Aschehoug.
Borgstrøm C Hj (1941). The dialects of Skye and Ross-shire. A Linguistic Survey of the Gaelic Dialects of Scotland, vol. 2. NTS Suppl. bind 2. Oslo: Aschehoug.
Borgstrøm C Hj (1958). Innføring i sprogvidenskap. Oslo: Universitetsforlaget.
Simonsen H G (1999). ‘Carl Hjalmar Borgstrøm.’ In Arntzen J G (ed.) Norsk biografisk leksikon 1. Oslo: Kunnskapsforlaget. 421–422.
Bosnia and Herzegovina: Language Situation
Editorial Team
© 2006 Elsevier Ltd. All rights reserved.
After the break-up of the Republic of Yugoslavia, and the following war in 1992–1995, Bosnia and Herzegovina was administratively divided into two entities: the Federation of Bosnia and Herzegovina and the Republika Srpska. The population of Bosnia and Herzegovina is about 4 007 608 (estimated, July 2004). There are three official languages: Bosnian, spoken by 48% of the population (2000 census), Serbian (37.1%), and Croatian (14.3%). Other languages spoken are German, Italian, Macedo-Romanian, Vlax Romani, Turkish, and Albanian.
The term ‘Bosnian’ refers to the languages spoken by Bosnian Serbs, Bosnian Croats, and Bosnian Bosniacs (formerly referred to as Bosnian Muslims), although the Croats and the Serbs in Bosnia and Herzegovina call their language Croatian and Serbian, respectively. Bosnian is used to refer to the language of the Bosniac group. All three languages – Bosnian, Serbian, and Croatian – are dialects of the standard version of Central-South Slavic, formerly and still frequently called Serbo-Croatian. Bosnian and Croatian use a Latin alphabet. Serbian uses both Latin and Cyrillic alphabets.
See also: Serbian–Croatian–Bosnian Linguistic Complex.
Bosnian
See: Serbian–Croatian–Bosnian Linguistic Complex.
Botswana: Language Situation
H M Batibo, University of Botswana, Gaborone, Botswana
© 2006 Elsevier Ltd. All rights reserved.
Botswana is a medium-sized country located in southern Africa. It is completely landlocked, surrounded by South Africa to its south, Zimbabwe to its east, Namibia to its west, and Zambia and Namibia to its north. It is largely composed of the Kalahari Basin of the southern Africa Plateau. Apart from the Limpopo and Chobe rivers, drainage is internal, and largely to the Okavango Swamp in the northwest. The Okavango Delta, which has resulted from this inland drainage, as well as the surrounding area as far east as the Chobe Basin, is a rich habitat of both fauna and flora and has attracted many different communities, who live in the areas as farmers, fishermen, herders, or hunters.
The country has a population of over 1.7 million people (according to the 2001 census report), giving a density of about 3 people per square kilometer. However, the population is concentrated on the eastern and northern fringes of the country where the land is more fertile. On the other hand, the Kalahari desert of the central, west, and southwest area of the country is home to numerous groups of San and Khoe people, commonly known as Khoesan or Bushman, who traditionally live in scattered bands of no more than 30 people each. In fact, the San were the first inhabitants of the area, having lived there for at least 20 000 years as hunters and foragers. The Khoe arrived in the area about 4000 years ago, followed by the Bantu groups more than 2000 years later. The territory, often attributed to Khoesan in language maps such as Greenberg’s (see Greenberg, 1963), is misleadingly extensive, as settled areas are better considered to be Bantu, who constitute more than 96% of the Botswana population. In fact the Khoesan groups, whose population is about 39 000 in Botswana, are fast vanishing due to integration into the more dominant and socioeconomically prestigious Bantu communities.
Figure 1 The distribution of the 28 Botswana languages (after Batibo et al., 2003).
Table 1 The estimated number of speakers of the Botswana languages
Linguistic Relationships
The country has 28 languages (see Figure 1), belonging to three main language families: Bantu (a subbranch of Niger–Congo), Khoesan (Khoisan), and Germanic (a subbranch of Indo-European). There are 14 Bantu languages, of which five belong to the Sotho branch of Southern Bantu: Setswana (Tswana), Shekgalagari, Sebirwa, Setswapong (Tswapong), and Silozi (Lozi). Three languages belong to the Sala–Shona branch of Southeastern Bantu: Ikalanga, Zezuru, and Nambya (Najwa). The southern branch of Central Bantu includes Chikuhane (Sesubiya), Shiyeyi (Yeyi) (erroneously classified by Guthrie [1948] in Zone R), Thimbukushu (Mbukushu), and Rugciriku (Rumanyo) [Diriku]. The only language that belongs to Western Bantu is Otjiherero (Herero) (classified by Guthrie in Zone R). Lastly, Sindebele (Ndebele), which is extensively spoken along the eastern borders of the country, belongs to the Nguni group of languages, together with IsiZulu (Zulu) and IsiXhosa (Xhosa) in South Africa. There are 12 Khoesan languages which belong to three distinct groups: Northern Khoesan, with three languages: Ju|’hoan, Kx’au||’ein, and Hua (formerly thought to belong to Southern Khoesan); Central Khoesan, with eight languages: Nama, Naro, |Gwi, ||Gana, Kua (Hietshware), Shua, Tshwa, and Khwedam (the last comprising Bugakhwe, ||Anikhwe, |Anda, and various Kxoe dialects). The last group, known as Southern Khoesan, has only one member in Botswana, !Xóõ. The other members of Southern Khoesan, formerly spoken mainly in what is now South Africa, have largely become extinct. There are two Germanic languages: Afrikaans, spoken by about 7500 people, mainly Afrikaner settlers in farms and ranches in the Ghanzi and Kgalagadi districts (Grimes, 2000), and English, which is mainly spoken as a second language. The figures presented in Table 1 are mere estimates, as no census involving language or ethnicity has taken place in Botswana since independence in 1966.
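The classification above can be cross-checked with a short tally. The sketch below simply transcribes the language names and groupings from the preceding paragraph; the dictionary layout and the function name `family_totals` are illustrative assumptions, not part of any published classification.

```python
# Language names and groupings are transcribed from the paragraph above.
# The nested-dictionary layout is an illustrative assumption.
FAMILIES = {
    "Bantu": {
        "Sotho": ["Setswana", "Shekgalagari", "Sebirwa", "Setswapong", "Silozi"],
        "Sala-Shona": ["Ikalanga", "Zezuru", "Nambya"],
        "Central Bantu (southern branch)": [
            "Chikuhane", "Shiyeyi", "Thimbukushu", "Rugciriku",
        ],
        "Western Bantu": ["Otjiherero"],
        "Nguni": ["Sindebele"],
    },
    "Khoesan": {
        "Northern": ["Ju|'hoan", "Kx'au||'ein", "Hua"],
        "Central": [
            "Nama", "Naro", "|Gwi", "||Gana", "Kua", "Shua", "Tshwa", "Khwedam",
        ],
        "Southern": ["!Xóõ"],
    },
    "Germanic": {
        "Germanic": ["Afrikaans", "English"],
    },
}


def family_totals(families):
    """Return the number of languages listed under each family."""
    return {
        family: sum(len(languages) for languages in groups.values())
        for family, groups in families.items()
    }


totals = family_totals(FAMILIES)
grand_total = sum(totals.values())
```

Under this transcription the per-family counts come to 14 Bantu, 12 Khoesan, and 2 Germanic languages, which add up to the 28 languages stated in the text.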
It is difficult to come up with accurate figures regarding the speakers of the various languages, as many people tend to equate language with ethnicity or may want to identify themselves with the majority languages. Setswana is found in three main dialects: the northern dialect, spoken in the northern areas by the Ngwato, Tawana, and part of the Kwena groups; the southern dialect, spoken in southern areas by the Ngwaketse, Rolong, Tlhaping, Tlharo and part of the Kwena groups; and the eastern dialect, spoken in the eastern areas by the Kgatla, Tlokwa and Lete groups. Setswana is the most dominant language both demographically and in terms of status and prestige. It is spoken by 78.6 percent of the population as first language, and is understood and used by over 90 percent of the population. It is the national language and the main lingua franca of the country. The only other widely used language is Ikalanga (Kalanga), which is spoken by over 150 000 people. Leaving aside the very early literacy traditions of the Coptic, Nubian, Ethiopic, Vai, and Arabic-speaking communities, the Setlhaping variety of Setswana in Botswana has the distinction of being the first African language known to develop an orthography and a literature, with the publication by Robert Moffat of the Christian New Testament in 1839.
Language Policy, Use, and Literacy
English is the official language of Botswana, while Setswana is the national language. Both are used in the administration and mass media. English is used in the formal business of government, while Setswana is generally used in semiofficial interactions, particularly in the oral mode. Setswana is used in lower primary education and English in upper primary and all the subsequent levels of education. The official literacy rate is estimated at about 60 percent, although independent estimates of literacy in Setswana are lower. The enrollment for secondary level schooling is reported to be 21 percent. On the other hand, over 2000 students per year enroll in the University, giving Botswana the highest rate of university admission, proportionate to its population, in Africa. Botswana is one of the countries in Africa where the smaller languages have very few speakers. Most of the small Botswana languages, especially those of Khoesan origin, are spoken by fewer than 10 000 people, most of whom are bilingual in the major languages, particularly Setswana. Hence, the processes of language shift and death are a great concern to both linguists and the general public.
Bibliography
Anderson G & Janson T (1997). The languages of Botswana. Gaborone: Longman Botswana. Batibo H M (1997). ‘The fate of the minority languages of Botswana.’ In Smieja B & Tasch M (eds.) Human contact through language and linguistics. Frankfurt: Peter Lang. 243–252. Batibo H M (1998). ‘The fate of the Khoesan languages of Botswana.’ In Brenzinger M (ed.) Endangered languages in Africa. Koeln: Ruediger Koeppe. 267–284. Batibo H M & Smieja B (eds.) (2000). Botswana: the future of the minority languages. Frankfurt: Peter Lang. Batibo H M & Tsonope J (eds.) (2000). The state of Khoesan languages in Botswana. Gaborone: Tasalls. Batibo H M, Mathangwane J T & Tsonope J (2003). A study of the third language teaching in Botswana (preliminary report). Gaborone: Associated Printers. Europa Publications (1991). Africa south of the Sahara (20th edn.). London: Europa Publications. Government of Botswana (1994). Revised national policy on education (white paper). Gaborone: Government Printers. Greenberg J H (1963). The languages of Africa. Bloomington: Indiana University Press. Grimes B (2000). Ethnologue (14th edn.). Dallas: S K Publications. Guthrie M (1948). The classification of the Bantu languages. London: International African Institute. Janson T & Tsonope J (1991). Birth of a national language: the history of Setswana. Gaborone: Heinemann Botswana. Mazonde I N (ed.) (2002). Minorities in the millennium: perspectives from Botswana. Gaborone: Light Books Publishers for the University of Botswana. Nyati-Ramahobo L (1999). The national language: a resource or a problem? Gaborone: Pula Press. Smieja B (2003). Language pluralism in Botswana: hope or hurdle? Frankfurt: Peter Lang. Vossen R (1988). Bayreuth African studies 13: patterns of language knowledge and use in Ngamiland in Botswana. Bayreuth: Bayreuth University.
See also: Bilingualism and Second Language Learning; Indo–European Languages; Khoesan Languages; Language Maintenance and Shift; Language Policy in Multilingual Educational Contexts; Languages of Wider Communication; Lingua Francas as Second Languages; Minorities and Language; Multiculturalism and Language; Namibia: Language Situation; Proto-Bantu; South Africa: Language Situation; Xhosa; Zambia: Language Situation; Zimbabwe: Language Situation; Zulu.
Bouvet Island: Language Situation Editorial Team © 2006 Elsevier Ltd. All rights reserved.
Bouvet Island, so named after the French naval officer who discovered it in 1739, is a volcanic island situated in the southern section of the Atlantic Ocean, southwest of South Africa’s Cape of Good Hope. It is the most isolated island on Earth; the nearest land, the Antarctic Continent, is more than 1600 km away. The first territorial claim came from the British in 1825, but in 1928 the claim was waived in favor of the Norwegian Crown. Bouvet Island was declared a nature reserve in 1971 and, since 1977, Norway has run an automated meteorological station on the island. With no native population, the island is considered a territory of Norway and is administered by the Polar Department of the Ministry of Justice and Police in Oslo. The official language is Norwegian, and the few researchers who are occasionally present on Bouvet Island are subject to Norwegian law.
Bovelles, Charles de (1479–1567) N Lioce, IVO Sint-Andries, Belgium P Swiggers, Katholieke Universiteit Leuven, Leuven, Belgium © 2006 Elsevier Ltd. All rights reserved.
Born in Saint-Quentin in Picardy (before March 28, 1479; according to some sources in 1475), Charles de Bovelles (Bouvelles/Bouelles; Latinized: Bovillus) studied in Paris (1495–1503) with J. Lefèvre d’Étaples and started writing his first philosophical works there. He then traveled through Switzerland, Germany, the Low Countries, and Spain. He received instruction in astronomy while in Rome in 1507. Upon his return to Picardy in 1508, he devoted himself to his ecclesiastic functions as a canon in Saint-Quentin and a priest in Noyon, combining these with a scholarly career. He died in Ham (Vermandois) on February 24, 1567 (some sources give 1553 or 1556 as date of his death). His writings (and extensive scholarly correspondence) cover various domains such as theology, metaphysics, arithmetic, and geometry, but they primarily involved biblical studies, theology, ethics, and metaphysics. His philosophical work was inspired by Ramon Llull, Nicolas of Cusa, Marsilio Ficino, Giovanni Pico della Mirandola, and neo-Platonism in general, which was highly popular in 16th century humanist circles. In his classification of the sciences (in his Metaphysicum introductorium, 1503–1504), the liberal arts (grammar, dialectic, and rhetoric) are classified on the lower level; Bovelles, however, took a keen interest in language matters. As a typical Renaissance scholar he wrote almost all his works in Latin, except his poetry (published in 1529) and a manual of geometry (1511, the first geometry handbook in French). He also showed interest in popular sayings and proverbs; his collection of Latin sentences was translated into French in 1557 (Proverbes et dicts sententieux avec l’interpretation d’iceux). Bovelles’s main linguistic work is his study of dialect differences in northern France (1533), which also includes a valuable etymological dictionary (in which Bovelles also used Late Latin sources), and a less useful onomasticon.
Like many humanists, he saw the relationship between Latin and the Romance languages as that between a regularized, fixed language and various vernacular offshoots, the latter characterized by irregularity and incapable of being laid down into rules (he denied the possibility of writing a grammar of French). He explains language evolution as due to astral determinism and human intervention (arbitrium hominum). In his analysis of dialect differences he shows himself as a keen observer of lexical and phonetic data; his work constitutes an important source for French diachronic lexicology. In his explanation of the diversification of Latin, Bovelles gave much weight to substratal and superstratal influences.
See also: Renaissance Linguistics: French Tradition.
Bibliography
Bovelles C de (1533). Liber de differentia vulgarium linguarum, & Gallici sermonis varietate. Quae voces apud Gallos sint factitiae & arbitrariae vel barbarae: quae item ab origine Latina manarint. De hallucinatione Gallicanorum nominum. Paris: R. Estienne. [Reedition, with French translation and notes, by C Dumont-Demaizière: Sur les langues vulgaires et la variété de la langue française. Paris: Klincksieck, 1973.] Charles de Bovelles en son cinquième centenaire 1479–1979 (1982). Actes du colloque international tenu à Noyon. Paris: Trédaniel. Demaizière C (1983). La grammaire française au XVIe siècle. Les grammairiens picards. Lille: Atelier national de reproduction des thèses. Magnard P (1997). ‘Bovelles (Charles de) (1475–1556).’ In Centuriae Latinae. ed. by Colette Nativel. Geneva: Droz. 169–174. Margolin J-C (1985). ‘Science et nationalisme linguistique ou la bataille pour l’étymologie au XVIe siècle. Bovelles et sa postérité critique.’ In The fairest flower. The emergence of linguistic national consciousness in Renaissance Europe. Firenze: Accademia della Crusca. 139–165. Margolin J-C (ed.) (2002). Lettres et poèmes de Charles de Bovelles. Paris: Champion. Schmitt C (1976). ‘Charles de Bovelles. Sur les langues vulgaires (. . .). Une source importante pour l’histoire du vocabulaire français.’ Travaux de Linguistique et de Littérature 14(1), 129–156. Schmitt C (1977). ‘La grammaire française des XVIe et XVIIe siècles et les langues régionales.’ Travaux de Linguistique et de Littérature 15(1), 215–225. Victor J M (1978). Charles de Bovelles, 1479–1553. An intellectual biography. Geneva: Droz.
Boxhorn, Marcus Zuerius (1602/12–1653) D Droixhe © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 1, p. 395, © 1994, Elsevier Ltd.
With C. Saumaise’s De hellenistica (1643) and G. K. Kirchmayer’s school in Wittenberg (see Metcalf, 1974), Boxhorn’s work represents one of the most accomplished efforts in pre-comparativism, in its search for a European prototype called ‘Scythian.’ Born in Bergen op Zoom (Netherlands) in 1602 or 1612, Marcus Boxhorn studied at Leiden, where he became professor of rhetoric and history, until his untimely death in 1653. As a young teacher, he submitted to his famous colleague Claude Saumaise linguistic comparisons, for example, between Greek hudor ‘sweat’, Latin sudor, and ‘Celtic’ sud. A strong Flemish tradition pushed him to look for the key of such a ‘harmony’ in his national language. The latter had been set among the oldest mother-tongues, on the basis of a relation between the Cimmerians of the Black Sea and the Dutch-Cimbrians (see Swiggers, 1984). Correspondences joining Persian and the Germanic languages had also been recently popularized by Justus Lipsius. Boxhorn undertook a systematic exploration of the analogies that united the European languages, including the Celtic and Slavic ones. He drew profit from the discovery of Anglo-Saxon, but died before Franciscus Junius’s edition of the Gospels in Anglo-Saxon and Gothic (1664–1665). His ideas were expounded in a Dutch Antwoord of 1647, concerning the sensational discovery of stone images of the goddess Nehalennia, whose name he interpreted as a ‘Scythian root.’ That 100-page essay set out to demonstrate the common origin of Greek, Latin, and Dutch. In a rather traditional way he puts forward various lexical analogies, focusing on the ‘basic vocabulary’ (esp. names for body parts). His rudimentary search for phonetic rules must be compared to the universal equivalences previously established by Cruciger, Besold, Nirmutanus, Hayne, and others.
But, contrary to most of these authors, he vigorously broke with the theory of Hebrew mother tongue, according to the secularization propagated in Leiden by Joseph Juste Scaliger (see Scaliger, Joseph Justus (1540–1609)) or Grotius. So, his Originum gallicarum liber (1654) thrashed John Davies (1632) for having linked Welsh with Hebrew. He was also said to have “planted the seed of Celtic philology in the fertile soil of the mind of Leibniz” (see Leibniz, Gottfried Wilhelm (1646–1716)). Boxhorn, remarkably, extended the comparison into morphology: declension of Latin unus and German ein; likeness of the infinitive endings in Greek and Dutch; similitudes with Latin in the formation of present participles, comparatives, or diminutives. “It is obvious that those nations have learned their tongue from one mother, as can be seen from their ordinary manner of varying words and names, in the declensions, the conjugations, etc.; and even in the anomalies.” By laying the stress on his native language, sometimes awkwardly assimilated to ‘Scythian’ in declamatory statements, he already betrayed his own quest for a real prototype, and his Originum gallicarum liber finally fostered the Celtic fever. Despised by Saumaise – who arrived at the same historical conclusions (!) – Boxhorn was considered a monomaniac. One year before he died, he wrote to Huygens that he nevertheless maintained his ideas with ‘pride’ and ‘joy,’ being confident that he had “understood something true and important.”
See also: Leibniz, Gottfried Wilhelm (1646–1716); Scaliger, Joseph Justus (1540–1609).
Bibliography
Bonfante G (1953/54). ‘Ideas on the kinship of the European languages from 1200 to 1800.’ Cahiers d’histoire mondiale 1, 679–699. Droixhe D (1989). ‘Boxhorn’s bad reputation. A chapter in academic linguistics.’ In Dutz K D (ed.) Speculum historiographiae linguisticae. Münster: Nodus Publikationen. Metcalf G J (1974). ‘The Indo–European hypothesis in the sixteenth and seventeenth centuries.’ In Hymes D (ed.) Studies in the history of linguistics. Traditions and paradigms. Bloomington, IN: Indiana University Press. Muller J C (1986). ‘Early stages of language comparison from Sassetti to Sir William Jones (1786).’ Kratylos 31, 1–31. Swiggers P (1984). ‘Adrianus Schrieckius: de la langue des Scythes à l’Europe linguistique.’ Histoire, Épistémologie, Langage 6, 17–35.
Brahui P S Subrahmanyam, Annamalai University, Bangalore, India © 2006 Elsevier Ltd. All rights reserved.
The word ‘Brahui’ designates both a language and its speakers. Brahui is the conventional spelling for the phonetically more correct Brāhōī/Brāhūī. The language is a member of the Dravidian family; more specifically, it belongs to the North Dravidian subgroup, of which the other two members are Kuṛux and Malto. The Brahuis live mainly in the Baluchistan and Sind provinces of Pakistan, but some are found also in Afghanistan (Šōrāwāk desert) and Iran (Sistan area). It is estimated that there are about 700 000 Brahui tribesmen, of whom only about 300 000 speak the language. Even those who speak Brahui are bilinguals in either Balochi or Siraki. There are two views current among the scholars to explain the location of Brahui, which is far away from the main Dravidian area. Whereas one view maintains that the Brahuis lived where they are now located from the earliest times, the other holds that they migrated to the current locations from that part of the main area that is occupied by the speakers of Kuṛux and Malto.
Phonology
The Brahui phonological system contains eight vowels and 28 consonants (see Tables 1 and 2). Proto-Dravidian short *e and short *o have been removed from the Brahui vowel system under the influence of Balochi; *e developed into i/a and *o developed into u/a/ō (the exact conditionings are not known). The ē and ō have shorter (and somewhat lower) allophones before a consonant cluster. The voiceless stops p, t, and k may optionally be accompanied by aspiration in all positions (pōk/phōk/phōkh ‘wasted’); however, aspirated stops in Indo-Aryan loans sometimes lose their aspiration in the south (dhōbī/dōbī ‘washerman’). The voiceless lateral L is the most characteristic sound of Brahui since it does not occur either in Proto-Dravidian (PDr) or in the neighboring languages of Brahui. It comes from two sources, PDr (alveolar) *l and (retroflex) *ḷ; both of these also show the reflex l in some words, the conditioning being unclear because of the paucity of the data (pāL ‘milk’ < PDr *pāl, tēL ‘scorpion’ < PDr *tēḷ). The contrast between L and l is illustrated in pāL ‘milk’ and pāl ‘omen.’ One major dialectal division in Brahui involves the voiceless glottal fricative h; it appears in all positions in the northern dialects but is replaced in the south by the glottal stop in initial and intervocalic positions, and is lost before a consonant or in final position; the following examples illustrate the variation in the northern and southern dialects, respectively: hust, ʔust ‘heart’; sahi affaṭ, saʔī affaṭ ‘I don’t know’; šahd, šad ‘honey’; and pōh, pō ‘intelligence.’

Table 1 Vowels of Brahui
        Front          Central        Back
        Short  Long    Short  Long    Short  Long
High    i      ī                      u      ū
Mid            ē                             ō
Low                    a      ā

Syntax
Word Classes
The following word classes may be recognized for Brahui: nouns (including pronouns and numerals), verbs, adjectives, adverbs (including expressives), particles, and interjections. An adjective normally occurs before the noun it qualifies but may be shifted to the postnominal position for the sake of emphasis:
jwān-ō hullī-as
good-INDEF horse-INDEF
‘good horse’
hullī-as jwān-ō
horse-INDEF good-INDEF
‘good horse’
Nouns and adjectives characteristically distinguish between definite and indefinite forms. The basic forms are definite and the corresponding indefinite ones are derived by adding -ō to the adjective base and -as to the nominal base, as illustrated in the preceding examples. A definite adjective that is monosyllabic is often strengthened by the addition of -ā/-angā:
sun-angā šahr
deserted village
‘deserted village’
An indefinite adjective can function also as a noun:
ball-ō
big-INDEF
‘big (one)’
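The derivation of indefinite forms described under Word Classes (-ō on the adjective base, -as on the nominal base, with optional postnominal placement of the adjective for emphasis) can be sketched as a pair of string operations. The function names and the `emphatic` flag below are hypothetical, and the sketch ignores any phonological adjustments the language may require.

```python
def indefinite_adjective(base: str) -> str:
    """Derive the indefinite adjective by adding -ō to the adjective base."""
    return f"{base}-ō"


def indefinite_noun(base: str) -> str:
    """Derive the indefinite noun by adding -as to the nominal base."""
    return f"{base}-as"


def indefinite_phrase(adjective: str, noun: str, emphatic: bool = False) -> str:
    """Build an indefinite adjective + noun phrase.

    The adjective normally precedes the noun; with emphatic=True it is
    shifted to the postnominal position, as the text describes.
    """
    parts = [indefinite_adjective(adjective), indefinite_noun(noun)]
    if emphatic:
        parts.reverse()
    return " ".join(parts)


normal_order = indefinite_phrase("jwān", "hullī")
emphatic_order = indefinite_phrase("jwān", "hullī", emphatic=True)
```

With the bases from the examples, `normal_order` is the prenominal phrase and `emphatic_order` the postnominal one, both glossed ‘good horse’ in the text.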
An adverb occurs before the verb. Adverbs may be divided into those of (1) time (e.g., dāsā ‘now,’ darō ‘yesterday’, aynō ‘today’, pagga ‘tomorrow’), (2) place (e.g., monaṭī ‘forward’), and (3) manner (e.g., dawn ‘thus’). For particles, the enclitic pronouns are very commonly used in Brahui. Whereas those for the third person are used in dialects throughout the Brahui area, those for the first and the second persons are more common in the Jahlawān dialect. They are suffixed to nouns or verbs. When added to a noun, they carry the sense of a pronoun in the genitive case; when added to a verb, they signal the direct or indirect object. The forms are: 1SG +ka ‘my’, 2SG +nē ‘your,’ 3SG +ta ‘his/her/its’, 3PL +tā ‘their’ (there are no plurals in the first and second persons):
maL-ē+ka
son-ACC/DAT+1ENCL
‘my son (accus.)/to my son’
xalkus+ka
strike-PAST-2SG+1ENCL
‘You struck me.’
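The enclitic paradigm listed above is small enough to capture in a lookup table. The following is a minimal sketch; the table layout and the helper name `attach_enclitic` are assumptions made for illustration, and the enclitic boundary is written with a plus sign.

```python
# Enclitic pronoun forms as listed in the text, keyed by (person, number).
# First- and second-person plural enclitics are absent because the text
# states that there are no plurals in those persons.
ENCLITICS = {
    ("1", "sg"): "ka",
    ("2", "sg"): "nē",
    ("3", "sg"): "ta",
    ("3", "pl"): "tā",
}


def attach_enclitic(host: str, person: str, number: str) -> str:
    """Attach the enclitic pronoun for (person, number) to a noun or verb form."""
    key = (person, number)
    if key not in ENCLITICS:
        raise ValueError(f"no enclitic pronoun listed for {person}{number.upper()}")
    return f"{host}+{ENCLITICS[key]}"


my_son = attach_enclitic("maL-ē", "1", "sg")
struck_me = attach_enclitic("xalkus", "1", "sg")
```

The same function serves both hosts: on the noun the enclitic reads as a genitive pronoun (‘my son’), on the verb as an object (‘You struck me’).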
Agreement
A finite verb shows agreement with the subject pronoun for person and number (see Table 3).

Table 3
Tense               Person  Singular                        Plural
Past                1       tix-ā+ṭ ‘I put’                 tix-ā+n
                    2       tix-ā+s                         tix-ā+re
                    3       tix-ā                           tix-ā+r
Imperfect           1       tix-ā+ṭ-a ‘I was putting’       tix-ā+n-a
                    2       tix-ā+s-a                       tix-ā+re
                    3       tix-āk-a                        tix-ā+r-a
Pluperfect          1       tix-ā+sut ‘I had put’           tix-ā+sun
                    2       tix-ā+sus                       tix-ā+sure
                    3       tix-ā+sas                       tix-ā+sur
Perfect             1       tix-ā-n+uṭ ‘I have put’         tix-ā-n+un
                    2       tix-ā-n+us                      tix-ā-n+ure
                    3       tix-ā-n+e                       tix-ā-n+a
Present indefinite  1       tix-i-v ‘I may put’             tix-i-n
                    2       tix-i-s                         tix-i-re
                    3       tix-e                           tix-i-r
Future              1       tix-o-ṭ ‘I will put’            tix-o-n
                    2       tix-o-s                         tix-o-re
                    3       tix-o-e                         tix-o-r
Nonpast negative    1       tix-pa-r ‘I will not put’       tix-pa-n
                    2       tix-p-ēs                        tix-p-ēre
                    3       tix-p                           tix-pa-s

Word Order
The favored word order in Brahui is subject-object-verb:
ī dā kārēmē kar-ōī uṭ
I this work do-NOM be.1SG
‘I must do this work.’

Sentences Without the Copular Verb
Like most of the other Dravidian languages (especially the southern ones), Brahui contains sentences without the copula in certain contexts:
numā šahr-aṭī aṭ urā/ō
your village-LOC how many house
‘How many houses are there in your village?’

Noun Morphology
A nominal base is followed by the plural suffix when plurality has to be expressed and then by a case suffix; a postposition is normally attached to the genitive form of a noun.

Gender and Number
Brahui, like Toda of South Dravidian, has no gender distinction, but number (singular versus plural) is distinguished (see later, Plural Suffixes). The original neuter forms (both singular and plural) of the third person are retained to refer to all categories: ō(d) ‘he/she/it’ (cf. Ta(mil). atu ‘it’, Te(lugu). adi ‘she, it’) and ōfk ‘they’ (cf. Ta. av(ay), Te. avi ‘they (NEUT)’).

Plural Suffix
The plural suffix is -k (variant -āk) in the nominative but -tē- before a nonnominative case suffix (see Table 4); as in the South Dravidian languages, use of the plural suffix is optional when plurality is understood from the context:
irā mār/mā-k (<*mār-k)
two son/son-PL
‘two sons’

Case Suffixes and Postpositions
The nominative is unmarked; locative I means ‘in’ and locative II means ‘on, by’ (Table 4 shows all of the case forms of xal ‘stone’). The following example shows postpositions:
ka-nā nēmaGāī
my towards
‘towards me’
There are also a few prepositions, such as bē(d) ‘without,’ of Perso-Arabic origin that have entered Brahui through Balochi.

Table 4 Case forms of xal ‘stone’ (columns: Case, Singular, Plural; rows: Nominative, Accusative-dative, Instrumental, Comitative, Ablative, Genitive, Locative I, Locative II)

Pronouns
All of the pronouns are of Dravidian origin; however, Brahui developed postclitic forms of personal and demonstrative pronouns under the influence of Balochi (see preceding discussion, Word Classes). The first-person personal pronouns are ī ‘I’ and nan ‘we’; the second-person personal pronouns are nī ‘you (singular)’ and num ‘you (plural).’ There is only the singular reflexive pronoun, tēn ‘self’. The interrogative pronouns are dēr ‘who?’ and ant ‘what?’. The third-person forms show a threefold deictic distinction: proximal dā(d) ‘(one) who is here’ (plural dāfk), medial ē(d) ‘(one) who is at some distance’ (plural ēfk), and distal ō(d) ‘(one) who is far off’ (plural ōfk).

Numerals
Only the cardinal numbers for one, two, and three are of Dravidian origin (the forms without the final ṭ of these function as adjectives); all others are borrowed from Balochi. The number ‘1’ is asi(ṭ), ‘2’ is ira(ṭ), and ‘3’ is musi(ṭ).

Verb Morphology
Verb Bases
A verb base in Brahui may be simple or complex. The complex base is formed from the simple one by the addition of the transitive-causative suffix -if (conditioned variant: -f). This suffix converts an intransitive into a transitive and an underived transitive into the corresponding causative; it is, therefore, possible to use the suffix twice in a sequence, e.g., bin- ‘to hear,’ bin-if- ‘to cause to hear,’ ka - ‘to die,’ kas-f- ‘to kill,’ and kas-f-if- ‘to cause (someone) to kill.’

Finite Verbs
There are four kinds of past tense (past, imperfect, pluperfect, and perfect), each with different shades of meaning, and all of them are periphrastic constructions involving the ‘be’ verb. The past stem, which is the basis for all of these, is formed by adding to the base -b- (conditioned variants: -ē-, -k-, -g-, -is-, -s-, -ss-). The following formulas give the structures of these tenses:
1. Past: past stem + present of ann- ‘to be.’
2. Imperfect: past + a.
3. Pluperfect: past stem + past of ann- ‘to be.’
4. Perfect: past stem + (u)n + present of ann- ‘to be.’
The present indefinite, the future, and the nonpast negative are morphological constructions with the following structures (these and the previously mentioned tenses are illustrated in Table 3 with the verb base tix- ‘to put’):
1. Present indefinite: verb base + i + personal suffix.
2. Future: verb base + o + personal suffix.
3. Nonpast negative: verb base + pa + personal suffix.
There are some other syntactic constructions involving ann- ‘to be’ that need not be mentioned here. One noteworthy feature of Brahui is the strategy of suffixing -a to form one type of finite verb from another. The imperfect, present-future, and the negative present-future are thus formed from the past, present-indefinite, and the nonpast negative, respectively. The imperative suffixes are 2SG -ø, 2PL -bo (conditioned variant: -ibo).
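Of the three morphological formulas given above, the future (verb base + o + personal suffix) is the one whose Table 3 forms follow the formula without exception, so it makes a convenient worked example. In the sketch below the personal suffixes are read off the future rows of Table 3 for tix- ‘to put’; the function name `future_form` is hypothetical, and applying it to other verb bases is an assumption the source does not test.

```python
# Personal suffixes read off the future rows of Table 3 (verb base tix- 'to put').
# "ṭ" transcribes the dotted t of the source.
FUTURE_SUFFIXES = {
    ("1", "sg"): "ṭ",
    ("2", "sg"): "s",
    ("3", "sg"): "e",
    ("1", "pl"): "n",
    ("2", "pl"): "re",
    ("3", "pl"): "r",
}


def future_form(base: str, person: str, number: str) -> str:
    """Build a future form as verb base + o + personal suffix."""
    return f"{base}-o-{FUTURE_SUFFIXES[(person, number)]}"


# Reproduce the six future forms of tix- in the order 1SG, 2SG, 3SG, 1PL, 2PL, 3PL.
paradigm = [
    future_form("tix", person, number)
    for number in ("sg", "pl")
    for person in ("1", "2", "3")
]
```

The present indefinite and the nonpast negative are deliberately left out: their Table 3 forms include shapes such as tix-e and tix-p-ēs that a plain concatenation of base, tense marker, and suffix would not produce.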
The corresponding negative imperative has the negative suffix -pa- (conditioned variant: -fa-) between the base and the imperative suffix:

tix-pa
put-NEG-2SG
‘Don’t put (singular)!’

tix-pa-bo
put-NEG-2PL
‘Don’t put (plural)!’

Nonfinite Verbs
The present adverb has the suffix -isa:

bis-isa
bake-PRES ADV
‘baking’

The present adjective has the suffix -ok:

bin-ok
hear-PRES ADJ
‘that hear(s)’

The infinitive-cum-action noun is formed by adding -ing (conditioned variant: -ēng) to the verb base:

bin-ing
hear-INF/VN
‘to hear, hearing’
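The suffixation patterns described above can be sketched as simple string concatenation. The following is only an illustrative sketch, not a morphological analyzer: the names `SUFFIXES` and `build` are hypothetical, only the default suffix shapes quoted in the text are used, and the conditioned variants (-f, -fa-, -ibo, -ēng) are deliberately ignored because the conditioning environments are not stated here.

```python
# Default suffix shapes quoted in the text. Conditioned variants
# (-f, -fa-, -ibo, -ēng) are NOT modeled; this table is an assumption
# made for illustration only.
SUFFIXES = {
    "causative": "if",          # transitive-causative
    "negative": "pa",           # negative (imperative / nonpast)
    "imperative_2pl": "bo",     # 2PL imperative (2SG is zero)
    "present_adverb": "isa",
    "present_adjective": "ok",
    "infinitive": "ing",        # infinitive-cum-action noun
}


def build(base, *labels):
    """Join a verb base and the named suffixes with hyphens, in order."""
    return "-".join([base] + [SUFFIXES[label] for label in labels])


print(build("bin", "causative"))                   # bin-if
print(build("tix", "negative"))                    # tix-pa
print(build("tix", "negative", "imperative_2pl"))  # tix-pa-bo
print(build("bis", "present_adverb"))              # bis-isa
print(build("bin", "present_adjective"))           # bin-ok
print(build("bin", "infinitive"))                  # bin-ing
```

The hyphenated output mirrors the segmented citation forms used in the examples; a fuller treatment would need rules for choosing among the conditioned variants.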
See also: Afghanistan: Language Situation; Dravidian Languages; Iran: Language Situation; Pakistan: Language Situation.
Braille, Louis (1809–1852)
A Bowers
© 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 1, pp. 396–397, © 1994, Elsevier Ltd.
Louis Braille’s system is used worldwide to enable the blind to read and write by means of raised dots that are impressed from the reverse side of a page and read with the fingertips. The original system has been adapted to facilitate the notation of many other written forms, including non-European languages, shorthand, mathematical and scientific characters, and music.
Louis Braille was born in 1809 at Coupvray (Seine-et-Marne), the son of a saddler. At age 3, playing in his father’s workshop, he injured one eye with a sharp instrument; complications resulted that led to total blindness. In time his father attempted to teach him to read the Roman letter shapes from wooden blocks studded with nails, but this could only serve as a beginning. At 10, in 1819, Braille was sent to an institution for blind children, in Paris (the Institution des Jeunes Aveugles), which had been founded five years earlier by Professor Valentin Haüy. Here, the teaching of reading seems to have been only slightly more advanced than that which the boy had experienced at home: again it relied on raised Roman letter shapes, though Professor Haüy is credited with embossing them on paper for his students. At this time a former military officer, Charles Barbier, had researched a method for night-communication in battle by a code of dots embossed on cardboard, which he maintained would also benefit blind students. In 1824, however, at the age of 15, Braille adapted the existing ideas into a system whereby as many as 63 letters, abbreviations, and numbers could be embossed, requiring only a 6-dot code. The code for each symbol or word was embossed from the back of the page using an awl and a perforated ruler, and arranged in a ‘cell’ or domino formation offering six possible dots, with extra guide symbols to multiply their possibilities. The columns were set about 3 mm apart, slightly more widely spaced than in present-day Braille. The inventor published this revolutionary system in 1829, by which time he was a teacher at the Institution. An accomplished musician, he also adapted his system to enable musical notation to be embossed. Figure 1 is an example of the Standard English Braille dots representing letters and letter groups, together with some of the wide range of contractions, punctuation, and mathematical signs.

Figure 1 Standard English Braille. (Reproduced by kind permission of the Royal National Institute for the Blind, London, UK.)
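The figure of 63 follows from the six-dot cell: each dot is either raised or not, giving 2^6 = 64 arrangements, of which the one with no raised dots is a blank, leaving 63 usable patterns. The short sketch below enumerates them. The function names are hypothetical, and the mapping to characters uses the modern Unicode Braille Patterns block (an assumption brought in for illustration, not something the article discusses), in which dot n corresponds to bit n-1 above U+2800.

```python
from itertools import combinations

DOTS = (1, 2, 3, 4, 5, 6)  # dot positions in one six-dot cell


def cell_patterns():
    """Return every non-empty set of raised dots in a six-dot cell."""
    return [combo for size in range(1, 7) for combo in combinations(DOTS, size)]


def to_unicode(dots):
    """Map a set of raised dots to its Unicode Braille Patterns character.

    Assumes the Unicode convention: dot n sets bit n-1 above U+2800.
    """
    return chr(0x2800 + sum(1 << (d - 1) for d in dots))


patterns = cell_patterns()
print(len(patterns))  # 63, i.e. 2**6 - 1
print(" ".join(to_unicode(p) for p in patterns[:6]))  # the six single-dot cells
```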
The Braille system revolutionized the speed of students’ reading, and would have opened the way to large-scale embossed printing of books if the support of the authorities and funding had been available. In fact throughout his life Braille faced stubborn resistance from sighted teachers who insisted that the Roman letter shapes must be used, and several decades passed before his system was adopted widely in Europe. Its simplicity, compactness, and the speed with which it could be used would finally ensure its adoption. Braille died at 43, however, from tuberculosis, before his system was in common use even in his native France. The only viable alternative that does not use encoded print is Moon Type, which was perfected in 1845 by William Moon, a printer from Brighton, UK. Moon Type uses Roman capital letters, and is valuable for users who have lost their sight late in life.
Brands and Logos
M Danesi, University of Toronto, Toronto, Canada
© 2006 Elsevier Ltd. All rights reserved.
Introduction
The technique of promoting products by identifying them with the name of the manufacturer or with some invented name, known as the brand name, has been a primary marketing strategy since the turn of the 20th century. It was (and continues to be) based on the premise that the appeal of a product, called the ‘brand,’ increases if it can be linked to socially significant trends and values that the name evokes subconsciously. Turning a product into a brand thus transforms it into a sign – something that stands for something other than itself – that taps into social meaning systems that govern lifestyle, values, beliefs, and the like. For this reason, the semiotic study of brand creation as a central strategy of consumerist cultures has become widespread. Its origin can be traced to the general study of popular culture as a ‘mythological sign system’ by Roland Barthes (1915–1980) in 1957. It now constitutes a branch of semiotics, generally called marketing semiotics (e.g., Wolfe, 1989; Umiker-Sebeok, 1987; Berger, 2000; Beasley and Danesi, 2002; Danesi, 2002).
Brands and Advertising
The modern histories of brands and of ‘persuasion’ advertising overlap considerably (see Marketing and Semiotics: From Transaction to Relation). It is impossible to advertise ‘nameless’ products with any degree of persuasion. Brand names imbue products with identities in the same way that names given to human beings give them a distinct identity. It was in the 20th century that advertising evolved into a science of persuasion intended to influence people to perceive objects of consumption as ‘necessary’ accouterments of life, leading to a widespread, insatiable appetite for new objects of consumption in general ‘groupthink.’ Roland Barthes (1957) coined the term ‘neomania’ to characterize this type of groupthink (see Mythologies in Pop Culture).
The dawn of advertising as a science of persuasion was signaled by the establishment and rapid growth of ‘advertising agencies’ at the end of the 19th century. These started composing newspaper ads, posters, and billboards for clients that presented the qualities of a product not in themselves, but in relation to specific social and lifestyle trends. By the 1920s, such agencies had themselves become large business enterprises, continuously turning to psychologists to help them develop techniques and methods designed to influence the ‘typical consumer’ of the product. Business and psychology had clearly joined forces, broadening the attempts of their predecessors to build an unconscious bridge between a product and the consumer by playing on his or her emotional needs, fears, and expectations. With the entrenchment of electronic media (radio and television) in the 1940s and 1950s as mass communication outlets, advertising itself became a mass communication strategy, imprinting in groupthink the perception that objects of consumption were necessarily intertwined with the style and content of everyday life – a perception reinforced today through Internet advertising.

The influence of brand advertising on society is unmistakable. Its language has become the language of virtually everyone – even of those who are critical of it. This is because of its omnipresence in the social landscape. As Twitchell (2000: 1) aptly puts it, “Language about products and services has pretty much replaced language about all other subjects.” The objective of brand advertising is, in fact, to get people to assimilate and react to advertising discourse unwittingly and in ways that parallel how individuals and groups have responded in the past to other kinds of ‘authoritative’ social discourse (such as religious discourse). Advertising language has become one of the most ubiquitous and persuasive forms of social discourse of the modern era. There are now even websites, such as AdCritic.com, that feature ads for their own sake, so that audiences can view them for their rhetorical and aesthetically pleasing qualities alone.
Creating Brand Identity
The main techniques that go into the creation of brand identity are called ‘positioning,’ ‘image-creation,’ and ‘mythologization.’ Positioning is the placing or targeting of a brand for the right market segment. For example, ads for the Mercedes Benz automobile are aimed typically at socially upscale car buyers, whereas ads for Dodge vans are designed to appeal (make sense) to middle-class individuals. The language of the ad, the lifestyle characteristics displayed in it, the overall ‘look’ of the personages in it, and so on are tailored to reflect class and appurtenant lifestyle distinctions. The register of the language used by the characters in Mercedes Benz ads, for instance, is sociolinguistically higher than that used by characters in Dodge van ads.

Creating an ‘image’ for a brand inheres in fashioning a recognizable ‘personality’ for it so that it can be positioned for specific market populations. Personality in this case refers to the traits and qualities that a potential consumer of the brand unconsciously possesses or aspires to have. The image is a sign constructed with an amalgam of signifiers (actual forms) – the brand name, design, logo, price, and overall presentation of the product. This amalgam is fashioned to appeal to specific consumer types – hence the term personality (as mentioned). Take alcohol brands as an example. What kinds of people drink beer? And what kinds drink aperitifs? In current American culture, answers to these questions would typically include remarks about the educational level, class, social attitudes, etc., of the consumer. The one who drinks beer is portrayed in ads as a down-to-earth character who simply wants to ‘hang out’ with friends; the one who drinks an aperitif is portrayed instead as a smooth, sophisticated type. The idea behind creating an image for the brand is, clearly, to speak directly to particular types of individuals, not to everyone, so that these individuals can see their own personalities mirrored in the lifestyle images created by the appurtenant advertising.

Brand identity is often also created by the technique of mythologization. This is the strategy of imbuing a brand with some mythic meaning, such as the quest for eternal beauty, the conquest of death, and so on. This is especially widespread in the case of cosmetic and beauty products. The eternal beauty myth can be seen in the images that advertisers create for such products. The characters in the relevant ads are, typically, attractive people with a deified, mythic quality about them. They are not unlike the statues of ancient Greek gods like Apollo and Aphrodite. In effect, through positioning, image-creation, and mythologization, the modern advertiser stresses not the brand’s qualities as a product or service, but rather the personality image that can be associated with it.
Brand Names
Creating an identity for a product is tantamount to creating a ‘signification system’ for it – a system of meanings that are relevant to specific kinds of individuals. This is achieved, first and foremost, by giving it a ‘brand name.’ The product, like a person, can then be easily differentiated from other products.
The legal term for brand name is ‘trademark.’ It is little wonder that trademarks are so fiercely protected by corporations and manufacturers. So powerful are they as identifiers that some have gained widespread currency, becoming general terms for the product type in common discourse. Examples include ‘aspirin,’ ‘scotch tape,’ ‘cellophane,’ and ‘escalator.’ Most brand names appear on the product, on its container, and in advertisements for the product. These provide, in effect, an easy way to determine who makes a certain product, helping consumers easily identify what they like about it so that they can purchase it again. A brand represents not only a certain social meaning, but also the manufacturer’s reputation and good will. A so-called ‘strong brand’ is a product name that has no recognizable meaning, such as Kodak. Strong brands receive broad protection from being used by other companies who might play on the name in order to cause confusion among consumers. ‘Weak brands,’ on the other hand, are product names created with common words, such as Premier, some of which refer to a characteristic of the product (e.g., Wet ’n Wash). These receive less protection, unless the public identifies them with a certain manufacturer as a result of extensive advertising and long, continuous use.

A brand was, originally, a recognizable mark made on the flesh of animals with a hot iron so as to identify ownership and qualities of the animals. The ancient Egyptians branded livestock as early as 2000 B.C.E. In the late medieval period, tradespeople and guild members posted characteristic visual ‘marks’ outside their shops for the same basic reasons – to identify the owner and quality of the product or service. Visual signs were used because most people were not literate at the time. Such signs became, a little later, the ‘marks of the trade’ or ‘trademarks.’ Shops selling medieval swords and ancient Chinese pottery, for instance, bore visual signs that buyers could identify and use, when put on the products themselves, to ascertain their origin and determine their quality. Among the best-known trademarks surviving from that era are the striped pole of the barbershop and the three-ball sign of the pawnbroker shop.

Names for common products, such as household ones, were first used towards the end of the 19th century. Previously, everyday household products were sold in neighborhood stores from large bulk containers. Around 1880, soap manufacturers started naming their products so that they could be identified and differentiated for their qualities. The first of these were Ivory, Pears’, Sapolio, and Colgate. The concept of the brand name thus came into being, spreading rapidly because, as Naomi Klein (2000: 6) aptly observes, the market was starting to be flooded by uniform mass-produced and, thus, indistinguishable products: “Competitive branding became a necessity of the machine age.” By the early 1950s, it became obvious that branding was not just a simple strategy for product differentiation but the very semiotic fuel that propelled corporate identity and product recognizability. Even the advent of no-name products, designed to cut down the cost of purchase, has had little countervailing effect on the power that branding has over the consciousness of people. Names such as Nike, Apple, Body Shop, Calvin Klein, Levi’s, Coke, Pepsi, among many others, have become ‘culture-wide signs’ recognized by virtually anyone living in a modern consumerist society. As Klein (2000: 16) goes on to remark, for such brands, the name constitutes “the very fabric of their companies.”

To continue to be effective, however, brands must keep in step with the times. In the early 2000s, some carmakers, for instance, started looking at naming trends that were designed to appeal to a new generation of customers accustomed to an Internet style of communication and representation. Cadillac, for instance, announced a new model with the monogram name CTS in 2001 and STS in 2005. Acura also transformed its line of models with names such as TL, RL, MDX, RSX. Such ‘alphabetic names’ evoke images of accuracy, technology, and sleekness in an analogy with similar abbreviating tendencies in science at large – e.g., ‘laser’ for ‘l(ight) a(mplification) by s(timulated) e(mission of) r(adiation).’ They are also consistent with ‘Internet style,’ a telegraphic form of language that spawns monogrammatic and alphanumeric signifiers on a daily basis. Hyundai’s XG300 model, for instance, sounds perfect for Internet times. On the other side of the naming equation, such abbreviations are hard to remember, especially for older customers who have not yet tapped into Internetese.

As the above examples show, brand names are devised intentionally to create a signification system for products. At a practical informational level, naming a product has, of course, a denotative function; that is, it allows consumers to identify what product they desire to purchase (or not). But at a connotative level, the product’s name generates images that go well beyond this simple identifier function (see Denotation versus Connotation). In the world of fashion, for instance, designer names such as Gucci, Armani, and Calvin Klein evoke connotations of the clothes as objets d’art rather than images of mere clothing items, shoes, or jewelry; so too do names such as Ferrari, Lamborghini, and Maserati in the domain of automobiles. The manufacturer’s name, in such cases, extends the meaning of the product considerably.
When people buy an Armani or a Gucci product, they feel that they are buying a work of art to be displayed on the body; when they buy Poison, by Christian Dior, they sense instead that they are buying a dangerous, but alluring, love potion; when they buy Moondrops, Natural Wonder, Rainflower, Sunsilk, or Skin Dew cosmetics they feel that they are acquiring some of nature’s beauty resources; when they buy Eterna 27, Clinique, Endocil, or Equalia beauty products they sense that they are getting products made with scientific precision; and so on.

Another common brand naming strategy involves iconicity – the strategy of creating names that resemble or assign some sensory property or social meaning to a product (see Iconicity: Theory). Iconicity is an effective strategy, because it renders the products highly memorable. A name such as Ritz Crackers, for example, assigns a sonority to the product that simulates the sounds that crackers make as they are being eaten. Another example of an iconic brand name is Drakkar Noir, chosen by Guy Laroche for a cologne product. Together with the dark bottle, the name conveys images of ‘fear,’ the ‘forbidden,’ and the ‘unknown.’ Forbidden things take place under the cloak of the night; hence the name noir (French for ‘black’). The sepulchral name Drakkar Noir is clearly iconic with the bottle’s design at a connotative level, reinforcing the idea that something desirable in the ‘dark’ will happen by splashing on the cologne. The word Drakkar is obviously suggestive of Dracula, the deadly vampire who came out at night to mesmerize his sexual prey with a mere glance.

The name of the Acura automobile, to give another example of the use of iconicity, was likely designed to be imitative of both Italian and Japanese words. Italian feminine nouns end in -a and certain Japanese words end in the suffix -ura (e.g., tempura). The brand name is thus linked iconically to Italian and Japanese words and, by extension, the perceived qualities of the respective cultures at once. Carmakers have used the same strategy of creating car names ending in the vowel -a which, given the inbuilt melodious quality of such a word, makes it not only easier to remember but also suggestive of specific qualities. Here are a few examples: Achieva, Altima, Asuna, Aurora, Corsica, Elantra, Festiva, Integra, Lumina, Maxima, Precidia, Samara, Sentra, Serenia, Sonata.

Brand names are clearly powerful signs, because they are suggestive of various qualities or attributes, either explicitly or implicitly. Here are examples of some of the strategies that are used to bring about overt or implicit suggestion.
1. Brand names that are the names of the actual manufacturers imply ‘tradition,’ ‘reliability,’ and, in the case of lifestyle products such as clothes, ‘artistry,’ ‘sophistication,’ and ‘beauty’: Armani, Bell, Benetton, Calvin Klein, Folger’s, Gillette, Gucci, Kraft, etc.
2. Brand names referring to real or fictitious people elicit images built culturally into the bearers of the actual name (e.g., Wendy’s evokes the image of a friendly young girl), or else suggest qualities that the name itself is designed to emphasize (e.g., Mr. Clean): Aunt Jemima, Barbie (the doll), Ken (the doll), McDonald’s, Mr. Clean, Wendy’s, etc.
3. Names identifying the geographical location of a product or of a company suggest stability and tradition: American Bell, Southern Bell, Western Union, etc.
4. Names designed to refer to some aspect of nature bestow upon the product the meanings that the particular aspect evokes: Aqua Velva, Cascade, Mountain Dew, Surf, Tide, etc.
5. Names indicating the kinds of things that can be done with the product, such as a vehicle, or the kinds of places that can be visited with it, evoke connotations of lifestyle such as ‘country living,’ ‘back-to-nature living,’ ‘wild-west lifestyle,’ ‘city life,’ and so on: Dodge Durango, Ford Escape, Ford Expedition, Ford Explorer, Hyundai Santa Fe, Jeep Grand Cherokee, Jeep Renegade, Jeep Wrangler, Mercury Mountaineer, etc.
6. Names constructed as hyperboles emphasize product ‘superiority’ and ‘excellence’: MaxiLight, SuperFresh, UltraLite, etc.
7. Names created as combinations of words describe a product in a ‘poetic’ way: Frogurt (= Frozen + Yogurt), Fruitopia (= Fruit + Utopia), Yogourt (= Yogurt + Gourmet), etc.
8. Names designed to indicate what the product can do set off images of ‘user-friendliness’: Easy On, Easy Wipe, Kleenex, Lestoil, One Wipe, Quick Flow, etc.
9. Brand names designed to indicate what can be accomplished with the product are also suggestive of ‘user-friendliness’ and ‘goal-achievement’: Air Fresh, Bug Off, Close-Up Toothpaste, No Sweat, etc.

Even in relaying straightforward information, such as identifying the manufacturer (Bell, Kraft, etc.), indicating the geographical location of the company (Southern Bell, American Bell, etc.), describing what the product can do (Easy On, Quick Flow, etc.), and so on, brand names nevertheless create signification systems. The name Bell, for instance, evokes meanings of ‘tradition’ and ‘reliance’ that familiarity with the name kindles. In effect, every brand name entails an unconscious signification system – a set of connotations – of one kind or other. It is this system that is used and reused for various advertising purposes. Indeed, the more connotations a name evokes, the more powerful it is and, as a consequence, the more possibilities it offers to the advertiser for creating truly effective ads and commercials. The higher the ‘connotative index’ of a signification system, as it has been called (Beasley and Danesi, 2002), the greater its market appeal. Table 1 shows just a few examples of how signification systems are generated by brand names.

Table 1 Brand names and the signification systems they evoke
Brand names | Signification systems
Superpower, Multicorp, Future Now, Quantum Health Resources, PowerAde, etc. | ‘big picture,’ ‘forward-looking,’ ‘strong,’ ‘powerful,’ etc.
People’s Choice, Advantage Plus, Light N’ Easy, Viewer’s Choice, etc. | ‘free-spirited,’ ‘advantageous,’ ‘egalitarian,’ ‘common,’ ‘friendly,’ etc.
Biogenical, Technics, Panasonic, Vagisil, Anusol, Proof Positive, Timex, etc. | ‘scientific,’ ‘methodical,’ ‘fool-proof,’ ‘accurate,’ ‘reliable,’ etc.
Coronation, Morning Glory, Burger King, Monarch’s Flour, etc. | ‘conquering,’ ‘regal,’ ‘majesty,’ ‘nobility,’ ‘blue-blooded,’ etc.
Wash ’N Wear, Drip-Dry, Easy Clean, Okay Plus, etc. | ‘user-friendly,’ ‘simple,’ ‘uncomplicated,’ ‘basic,’ etc.
General Electric, General Mills, General Dynamics, General Foods, etc. | ‘all-encompassing,’ ‘widespread,’ ‘popular,’ etc.
Cheer, Joy, etc. | ‘happy,’ ‘bright,’ ‘friendly,’ ‘smiling,’ etc.
Pledge, Promise, etc. | ‘trustworthy,’ ‘reliant,’ ‘secure,’ etc.

Naming a product makes it possible to refer to it as if it had a distinctive character or quality – ‘I don’t trust Colgate products; they’re useless’; ‘I will only buy Quaker Oats; it suits me perfectly’; etc. It is meaningless to say something like ‘I don’t trust the toothpaste that has blue stripes in it’; or ‘I will buy only the cereal that has an oat-like taste to it.’ Moreover, a product with a name has the capacity, by its very nature, to tap into the brain’s capacity to store meaningful categories in the form of language. A word classifies something, keeps it distinct from other things, and, above all else, allows it to have meaning over and above itself. The name Ivory, for example, evokes an image of something ‘ultrawhite,’ Royal Baking Powder of something ‘regal’ and ‘splendid,’ Bon Ami of ‘a good friend,’ and so on. Such suggestive images stick in the mind in the same way that the meanings of words do. They become an unconscious part of our semantic memory system.
It is little wonder that the term ‘brand’ is no longer used today to refer just to a specific product line, but also to the company that manufactures it, to the image that the company wishes to impart of itself and of its products, and to the ‘personality structure’ that is perceived in users of the product. Thus, the name Coca-Cola now refers not only to the actual soft drink, but also to the company itself, the social meanings that drinking Coke entails, and so on and so forth. Coca-Cola went on sale as a headache and hangover remedy on May 8, 1886 at Jacob’s Pharmacy in Atlanta. It was created by local pharmacist John S. Pemberton from South American coca shrub leaves, an extract of African kola nuts, and fruit syrup. It was Pemberton’s bookkeeper who named the product ‘Coca-Cola’ and who suggested writing its name with the familiar flowing script that virtually everyone recognizes. The drink was subsequently promoted with such slogans as “Wonderful nerve and brain tonic and remarkable therapeutic agent” and “Its beneficial effects upon diseases of the vocal chords are wonderful.” In 1891, Atlanta pharmacist Asa G. Candler acquired ownership of Coca-Cola, changing its image from a ‘tonic’ to that of a popular 5¢ soft drink that could be drunk together with family and friends – an image that has persisted to this day and is the basis of Coca-Cola’s continued commercial success. That image was created at first by imprinting the Coca-Cola name on drinking glasses, providing them to diners and other eateries that featured ‘pop’ and foods meant to be eaten quickly and cheaply. Since then, Coca-Cola has co-opted socially significant themes, from the brotherly love and peace espoused during the counterculture era of the late 1960s and early 1970s with its “I’d like to teach the world to sing in perfect harmony” campaign, to the “Coke is the real thing” campaign shortly thereafter, to 2000’s campaigns showing Coke as the drink of Olympic athletes.
Logos
Logos (an abbreviation of ‘logogriphs’) are the pictorial counterparts of brand names. They are designed to reinforce the signification system for a product through the visual channel. Consider the apple logo adopted by the Apple Computer Company. As a visual iconic sign suffused with latent religious symbolism, it strongly suggests the story of Adam and Eve in the Western Bible, which revolves around the eating of forbidden fruit (probably the apple) that contained forbidden knowledge. The logo reinforces this symbolic association because it shows an apple that has had a bite taken from it. The creator of the logo, Rob Janoff of Regis McKenna Advertising, has consistently denied any intent to connect the logo to the Genesis story, claiming instead that he put the bite there to ensure that the figure would not be confused with a tomato. Whatever the truth, the bite in the logo evokes the Genesis story nonetheless. As another example, consider the Playboy logo of a bunny wearing a bow tie. Its ambiguous design opens up at least two interpretive chains:
1. rabbit = ‘female’ = ‘highly fertile’ = ‘sexually active’ = ‘promiscuous’ = etc.
2. bow tie = ‘elegance’ = ‘night club scene’ = ‘finesse’ = etc.
The appeal and staying power of this logo are due, arguably, to this inbuilt dual signification system. Because we cannot pin down what the actual meaning of the logo is, we start experiencing the sign holistically and, thus, as an artistic text or mysterious pictograph. Logos have now become part of a culture-wide visual symbolism that interconnects products with daily life. Until the 1970s, logos on clothes, for instance, were concealed discreetly inside a collar or on a pocket. Today, they can be seen conspicuously on all kinds of products, indicating that society has become ‘logo conscious.’ Ralph Lauren’s polo horseman, Lacoste’s alligator, and Nike’s ‘swoosh’ symbol, to mention but three, are now shown prominently on clothing items, evoking images of heraldry and, thus, nobility. They constitute symbols of ‘cool’ (Klein, 2000: 69) that legions of people are seemingly eager to put on view in order to convey an aura of high class ‘blue-blooded’ fashionableness. To see why logos are so powerful psychologically, consider briefly the Nike symbol, which is found on the shoe brand. As a visual sign suggesting speed, it works on several levels, from the iconic to the mythical.
At the iconic level, it implies the activity of running at top speed with the Nike shoe; at the mythic level, it taps into the idea of speed as symbolic of power and conquest (such as in the Olympic races). The combination of these two signifying levels creates a perception of the logo, and thus the product, as having a connection to both reality and narrative history. Nike was the goddess of victory in Greek mythology. An ancient statue of Nike shows a winged female figure alighting on the prow of a ship, presumably to crown the ship’s commander. Her garments, wet with spray and blown by her flight, whip about her body. Given their psychological power, it is little wonder that logos are used as well by noncommercial enterprises and organizations. One of the most widely known ones is the peace sign, often worn on chains and necklaces. Derived from an ancient runic symbol
of despair and grief, it became the logo for philosopher Bertrand Russell’s (1872–1970) ‘Campaign for Nuclear Disarmament’ in the 1950s. The logo’s first widespread exposure came when it surfaced in the 1962 sci-fi film The Day the Earth Caught Fire, leading to its adoption by the counterculture youth of the era. In a fundamental sense, a logo is a pictograph – a picture used to express ideas. The inbuilt emotional appeal of pictography is likely the reason why the alphabet character X has become a kind of ersatz logo for everything from movies to sports names: e.g., Nissan’s X-Terra model, X-treme sports, the movie action hero ‘Triple X,’ XXX movies, and so on. The letter X has become a kind of ‘macrologo’ that is synonymous with youth, danger, and excitement, even though it has been around for centuries as the mathematical variable par excellence, as a signature used by those who cannot write, as a blasphemous letter assigned to cartoon bottles of alcohol and boxes of dynamite, and as a symbol marking a secret treasure on a pirate’s map. ‘X’ has always constituted a pictography of various meanings that predate X-treme sports and X-File TV programs. X is powerful because it conjures up images of things that are just beyond the realm of information, or beyond decency and righteousness. In today’s sexually charged culture, ‘X’ on a product means ‘Buy me, I’m X-rated and X-citing.’ ‘X’ is, in a phrase, one of the most provocative symbols of contemporary logo culture, characterizing it in a compact yet accurate way. The reason, ultimately, is that it reverberates with mythical symbolism that reaches back to the origin of pictography as a craft controlled by those in power. It is a modern-day hieroglyph. The only way to explain why we extract so much meaning from a simple letter is, in fact, to see it as a product of an unconscious pattern of pictorial symbolism that continues to have an emotional hold on the modern mind.
Its particular design – a cross symbol that has been rotated 45 degrees – reverberates with contradiction and opposition. No wonder that advertisers, manufacturers, Hollywood moguls, and all the other image-makers of contemporary pop culture have adopted it as a symbol of ‘cool.’
Conclusion
As mentioned, brands and logos are now created to name not just products, but entire corporations (IBM, Ford, etc.) and even specific characters that represent, in some way, a corporation. Take, for example, the Disney Corporation cartoon character Mickey Mouse. In 1929, Disney allowed Mickey Mouse to be reproduced on school slates, effectively transforming the character into a logo. A year later Mickey Mouse dolls went into production and throughout the 1930s the Mickey Mouse brand name and image were licensed with huge success. In 1955, The Mickey Mouse Club premiered on US network television, further entrenching the brand and image – and by association all Disney products – into the cultural mainstream. Analogous ‘branding events’ have repeated themselves throughout modern society. The idea is to get the brand to become intertwined with cultural spectacles (movies, TV programs, etc.) and thus indistinguishable as a sign from other culturally meaningful signs and sign systems. Because of the Disney Corporation, toys, children’s TV programming, childhood films, videos, DVDs, theme parks, and the like have become part of the modern perception of childhood as a Fantasyland world. This is why children now experience their childhood through such products.
See also: Barthes, Roland: Theory of the Sign; Denotation versus Connotation; Iconicity: Theory; Marketing and Semiotics: From Transaction to Relation; Media: Semiotics; Mythologies in Pop Culture.
Bibliography
Barthes R (1957). Mythologies. Paris: Seuil.
Beasley R & Danesi M (2002). Persuasive signs: the semiotics of advertising. Berlin: Mouton de Gruyter.
Berger A A (2000). Ads, fads, and consumer culture: advertising’s impact on American character and society. Lanham: Rowman & Littlefield.
Danesi M (2002). Understanding media semiotics. London: Arnold.
Danna S R (1992). Advertising and popular culture: studies in variety and versatility. Bowling Green, OH: Bowling Green State University Popular Press.
Dyer G (1982). Advertising as communication. London: Routledge.
Forceville C (1996). Pictorial metaphor in advertising. London: Routledge.
Goffman E (1979). Gender advertisements. New York: Harper and Row.
Goldman R & Papson R (1996). Sign wars: the cluttered landscape of advertising. New York: Guilford.
Jhally S (1987). The codes of advertising. New York: St Martin’s Press.
Jones J P (ed.) (1999). How to use advertising to build strong brands. London: Sage.
Key W B (1972). Subliminal seduction. New York: Signet.
Key W B (1976). Media sexploitation. New York: Signet.
Key W B (1980). The clam-plate orgy. New York: Signet.
Key W B (1989). The age of manipulation. New York: Henry Holt.
Klein N (2000). No logo: taking aim at the brand bullies. Toronto: Alfred A. Knopf.
Leymore V (1975). Hidden myth: structure and symbolism in advertising. London: Heinemann.
Packard V (1957). The hidden persuaders. New York: McKay.
Twitchell J B (2000). Twenty ads that shook the world. New York: Crown.
Umiker-Sebeok J (ed.) (1987). Marketing signs: new directions in the study of signs for sale. Berlin: Mouton.
Wolfe O (1989). ‘Sociosemiology and cross-cultural branding strategies.’ Marketing Signs 3, 3–10.
Braune, Wilhelm (1850–1926)
E Einhauser, University of Cologne, Cologne, Germany
© 2006 Elsevier Ltd. All rights reserved.
Wilhelm Braune belonged to the so-called ‘Neogrammarians,’ a group of linguists with quite a strong influence on linguistic research in the last third of the 19th century and at the beginning of the 20th century. But whereas other Neogrammarians such as Hermann Paul and Karl Brugmann became rather famous, Wilhelm Braune more or less took on the role of the unassuming working linguist in the background. The obituaries by his friend Eduard Sievers (1927) and by his successor Friedrich Panzer (1927) give evidence of his calm and peaceable character and of his conscientious working attitude. The most famous results of his diligence are his Gotische Grammatik and his Althochdeutsche Grammatik. The newest (20th) edition of the Gotische Grammatik was recently revised by Frank Heidermanns, and so it remains of value today as a reliable working tool. The Althochdeutsche Grammatik, too, has seen many editions, and even the Althochdeutsches Lesebuch is still in print. Furthermore, a great part of Braune’s working energy was absorbed by the Beiträge zur Geschichte der deutschen Sprache und Literatur, a journal he founded together with Hermann Paul in 1874 and which quickly became one of the central periodicals of the Neogrammarians (the common abbreviation PBB refers to the initials of the co-founders: Paul and Braune’s Beiträge). Many of Braune’s works were published there, among them an important essay on the history of the German language titled Zur Kenntnis des Fränkischen und zur hochdeutschen Lautverschiebung (1874) and a voluminous analysis of the Handschriftenverhältnisse des Nibelungenliedes (1900). These two titles, as well as the Beiträge, are representative of Braune’s scientific position: he saw himself not only as a linguist but also as a Germanist, a philologist who is not interested in linguistic questions alone (see also Panzer, 1927: 159). That he was still open to new scientific trends in his later years can be seen in his essay Althochdeutsch und Angelsächsisch (1918), in which he took into account geographical and cultural aspects. Braune’s university career was quite unspectacular, fitting for his role as an unassuming worker. He studied in Leipzig, where he was mostly influenced by his teachers August Leskien, Friedrich Zarncke, and Rudolf Hildebrandt. In Leipzig he also met Hermann Paul, Eduard Sievers, Hermann Osthoff, and Karl Brugmann and hence became part of the new development in linguistics, which quite soon was named Neogrammarian. After his graduation Braune first worked as an assistant at the library at the University of Leipzig, then he gave lectures, and finally he was offered a chair at Gießen in 1880. Mainly due to the influence of Hermann Osthoff, who already taught there, he was offered a chair at the University of Heidelberg in 1888, where he lived until he died in 1926.
See also: Brugmann, Karl (1849–1919); English, Old English; German; Gothic; Leskien, August (1840–1916); Osthoff, Hermann (1847–1909); Paul, Hermann (1846–1921); Sievers, Eduard (1850–1932); Zarncke, Friedrich (1825–1891).
Bibliography
Braune W (1918). ‘Althochdeutsch und Angelsächsisch.’ Beiträge zur Geschichte der deutschen Sprache und Literatur 43, 361–445.
Braune W (1994). Althochdeutsches Lesebuch (17th edn.). Bearbeitet von Ernst A. Ebbinghaus. Beitrag von Karl Helm. Tübingen: Niemeyer.
Braune W (2004). Gotische Grammatik. Mit Lesestücken und Wörterverzeichnis (20th edn.). Bearbeitet von Frank Heidermanns. Tübingen: Niemeyer. (1st edn., 1880.)
Einhauser E (1989). Die Junggrammatiker. Ein Problem für die Sprachwissenschaftsgeschichtsschreibung. Trier: Wissenschaftlicher Verlag Trier.
Panzer F (1927). ‘Wilhelm Braune [Nachruf].’ Zeitschrift für deutsche Philologie 52, 158–164.
Sievers E (1927). ‘Wilhelm Braune [Nachruf].’ Beiträge zur Geschichte der deutschen Sprache und Literatur 5, I–VI.
Wunderlich H (1910). ‘Wilhelm Braune.’ Germanisch-Romanische Monatshefte 2, 81–91.
Brazil: Language Situation
D Moore, Museu Goeldi, Belém, Brazil
© 2006 Elsevier Ltd. All rights reserved.
Background
The indigenous population in what is now Brazil was much higher in the past, with a multiplicity of societies and languages. According to Roosevelt (1994), the oldest pottery in the New World (6000–8000 years old) is found in Brazilian Amazonia, on whose flood plains dense populations, divided into chiefdoms, lived at the time of European contact. Other regions of Brazil, such as the central highlands, the semi-arid northeast, and the more temperate southern region, were likewise home to sizeable indigenous populations, most of which were destroyed or absorbed. Over 40% of the modern Brazilian gene pool is of indigenous origin. European contact began with the arrival of the ships of the Portuguese explorer Pedro Álvares Cabral in 1500. He encountered some Tupinambá on the eastern coast of Brazil. European immigration was relatively slight for the first two centuries. European men frequently took indigenous wives, and a class of mestizos was produced, which was important in the colonizing process, during which large numbers of native people were relocated and obliged to learn the language of the mestizos, Língua Geral, or Nheengatu (Nhengatu), a Tupí-Guaraní language originally spoken on the coast that was modified by substratum effects and borrowings from Portuguese. Several dialects of Nheengatu still persist in Amazonia. With the expulsion of the Jesuits in the mid-18th century, the state assumed control over the communities of resettled native peoples (reduções), where the population was already declining from Western diseases. The regions of Brazil that have been occupied the longest have the fewest indigenous societies and languages, especially eastern Brazil, where few indigenous groups still speak their language. Rodrigues (1993) estimates that 75% of the indigenous languages have become extinct.
The surviving native groups are mostly in remote areas, especially in Amazonia, where contact with national society has been more recent and less intensive. There are still native groups living out of contact with the outside world. Newly contacted groups still commonly lose two-thirds of their population to Western diseases – an unnecessary loss, since the diseases responsible for this loss of life and language are preventable or treatable. A number of native political organizations exist in Brazil (for example, the Coordenação de Organizações Indígenas da Amazônia Brasileira – COIAB, and the Federação de Organizações Indígenas do Rio Negro – FOIRN)
and are active in debating policy and defending the interests of the communities that they represent. Indigenous affairs are under the control of the National Indian Foundation (FUNAI), and all researchers must obtain authorization from that governmental entity to enter indigenous areas, as well as approval from the National Council for Scientific and Technological Development (CNPq).
The Study of Native Brazilian Languages
Some of the earliest descriptive studies of the native languages of the New World were conducted by Jesuits in Brazil, for example, Anchieta (1595). This tradition did not take hold, however. In the 19th century and the first half of the 20th century, a number of nonspecialists, especially members of scientific expeditions, accomplished a certain amount of linguistic description. These include, notably, Karl von den Steinen, General Couto de Magalhães, Theodor Koch-Grünberg, Curt Nimuendajú, Emilie Snethlage, and João Capistrano de Abreu. Modern scientific studies of native Brazilian languages only began in the second half of the 20th century. Mattoso Câmara established the Setor de Lingüística at the Museu Nacional in 1961 and also authored a book about indigenous languages (1965), though he was not a fieldworker. For a number of years, Brazilian research on indigenous languages was mainly done at the Museu Nacional and at the State University of Campinas (UNICAMP). However, in the second half of the 1980s the study of native languages spread to other centers, especially the Federal Universities of Brasília (UnB), Goiás (UFG), Pernambuco (UFPE), and Pará (UFPA), as well as the University of São Paulo (USP) and the Museu Goeldi, which is a federal research institute in Belém. The anthropologist Darcy Ribeiro established a cooperation agreement between the Summer Institute of Linguistics (SIL) and the Museu Nacional in 1956. This agreement was terminated in 1981, and there are now no formal ties between Brazilian academic centers and missionary organizations. Foreign missionaries have become less influential in the study of indigenous languages as their place is being taken to a certain extent by Brazilian missionaries and increasingly by the growing number of professional Brazilian scientific linguists.
A number of the latter have studied abroad in recent years, and upon the completion of their studies, they are strengthening the national capacity in scientific linguistics, especially in diachronic linguistics (see, for example, Meira and Franchetto, forthcoming), in recent theory and methodology, and
in overall descriptions of individual languages. The first complete grammar of a native language in decades authored by a Brazilian linguist was the description of Kamayurá by Seki (2000). More such general descriptions have been undertaken by young Brazilian linguists. There is, unfortunately, no national program for identifying and describing endangered languages in Brazil. However, a number of recent documentation projects with international funding have improved the level of documentation efforts. These are very popular with native groups. The small number of foreign nonmissionary linguists studying Brazilian indigenous languages has increased considerably in recent years. Some modern information about Brazilian native languages appeared in a general work on South American languages edited by Klein and Stark (1985). Amazonia became identified as a distinct research area in linguistics with the publication of the Handbook of Amazonian languages series, edited by Derbyshire and Pullum (1986–1998), and the compendium edited by Payne (1990). Later useful general works with the same regional focus are those edited by Queixalós and Renault-Lescure (2000) and by Dixon and Aikhenvald (1999). These typically include languages outside of what is, strictly speaking, Amazonia, for example, the languages of the central highlands of Brazil. In recent years, volumes of the ILLA series have included many Brazilian languages, for example, the volumes edited by van der Voort and van de Kerke (2000) and by Crevels, van de Kerke, Meira, and van der Voort (2002). In Portuguese, a general treatment of Brazilian languages is that by Rodrigues (1986). Rodrigues (1993) presents information on the situation of Brazilian native languages, but suffers from confusion between the number of speakers and the population size, which results in underestimating the degree of endangerment. Seki (1999) and Franchetto (2000) describe the study of indigenous languages in Brazil.
Wetzels (1995) presents a collection of phonological studies. A recent collection of articles is that by Cabral and Rodrigues (2002). One Brazilian periodical dedicated exclusively to indigenous languages is Línguas Indígenas Americanas (LIAMES), of UNICAMP. The Boletim do Museu Paraense Emílio Goeldi contains linguistics articles in its Anthropology issues. Articles likewise appear in the journals Revista de Documentação de Estudos em Lingüística Teórica e Aplicada (D.E.L.T.A.) of the Pontifícia Universidade Católica de São Paulo, the Boletim da ABRALIN, and the Cadernos de Estudos Lingüísticos of UNICAMP. Of the many NGOs working with indigenous groups, the largest and most concerned with documentation is the Instituto Sócio Ambiental (ISA), whose website is a valuable source of information and also of publications (including maps) that can be purchased via the Internet. There is also a website and a listserv run by the Museu Antropológico, Universidade Federal de Goiás.
The Situation of the Native Brazilian Languages
Portuguese is the official language of Brazil. Impressionistically, Brazilian Portuguese is about as different from the Portuguese dialects in Portugal as American English is from the English dialects in Great Britain. There are many other languages spoken by immigrant communities in Brazil, especially German, Italian, and Japanese. We will focus attention here on the situation of the native languages. It must be emphasized that the information presented below is approximate, due to the lack of systematic data gathering about the situation of the native languages of Brazil. Even when population size is known, the number of effective speakers and the degree of transmission are often not known with certainty. What are considered to be different languages sometimes turn out to be dialects of the same language, often reflecting ethnic or political divisions. Much of the information is a revised version of information presented in an overview article about endangered languages in lowland South America by Moore (forthcoming), which is based on a number of sources, including Queixalós and Renault-Lescure (2000), Rodrigues (1993), Dixon and Aikhenvald (1999), the map of the Centro de Documentação Indígena (1987), the website of the Instituto Sócio Ambiental, the author’s own knowledge of several regions, and personal communications from many linguists actively studying indigenous languages in various geographical areas. Language names and the genetic classification are adapted from those of the Instituto Sócio Ambiental website, which are a 1997 adaptation of information from Rodrigues (1986). Names used by Ethnologue, if different, appear in parentheses afterward (note that Ethnologue’s family names and categorization sometimes differ from those used here). Population figures are normally from this same website; numbers from other sources are put in brackets.
Speaker estimates are from various sources; when more than one source is used, the second estimate is set off in brackets. Where no real information is available, the space is left blank. Since many tribal groups span national boundaries, it is important to note that all estimates are specific to Brazil and exclude speakers of those groups living in, say, Colombia or Venezuela. Likewise, the estimate of the amount of study refers to
studies carried out among speakers in Brazil, not in other countries. These estimates are very rough and can change quickly with the publication of new work. Languages with little or no significant scientific description are rated 0; those with an M.A. thesis or several articles are rated 1; those with a good overall sketch or doctoral thesis on some aspects of the language are rated 2; and those with reasonably complete descriptions are rated 3. In the terminology used here for genetic groupings, ‘family’ means a group of related but different languages whose genetic relation is reasonably obvious, and ‘stock’ refers to a group of families whose relation is not so obvious. Because of the small size of the surviving speech communities and the precarious conditions in which they live, all might be considered to be in danger of extinction. However, it is more useful to distinguish those that are in serious, imminent danger of disappearing because of a low number of speakers, low transmission, or both. Some languages listed may already be extinct, but are listed anyway because a careful search sometimes finds remaining speakers somewhere, and that search may be abandoned if they are not listed. Languages are not considered urgently endangered if there are a reasonable number of speakers of at least one dialect or a reasonable number of speakers in another country. Larger groupings are considered first, following alphabetical order within the grouping.
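The 0 to 3 scale for the amount of study can be written down as a small lookup, which may help when reading the ‘Studies’ column of the tables. The following is only a minimal Python sketch and is not part of the survey itself: the names StudyLevel, DESCRIPTIONS, and describe are hypothetical, and None is assumed to stand for a cell left blank because no real information is available.

```python
from enum import IntEnum
from typing import Optional


class StudyLevel(IntEnum):
    """Hypothetical names for the 0-3 'amount of study' ratings."""
    LITTLE_OR_NONE = 0      # little or no significant scientific description
    THESIS_OR_ARTICLES = 1  # an M.A. thesis or several articles
    SKETCH_OR_PARTIAL = 2   # a good overall sketch, or a doctoral thesis on some aspects
    NEAR_COMPLETE = 3       # a reasonably complete description


DESCRIPTIONS = {
    StudyLevel.LITTLE_OR_NONE: "little or no significant scientific description",
    StudyLevel.THESIS_OR_ARTICLES: "an M.A. thesis or several articles",
    StudyLevel.SKETCH_OR_PARTIAL: "a good overall sketch or a doctoral thesis on some aspects",
    StudyLevel.NEAR_COMPLETE: "a reasonably complete description",
}


def describe(rating: Optional[int]) -> str:
    """Return the wording behind a rating; None stands for a blank table cell."""
    if rating is None:
        return "no information available"
    return DESCRIPTIONS[StudyLevel(rating)]  # raises ValueError outside 0-3


print(describe(2))
```

Because the ratings are ordered, an integer-valued enumeration lets two languages be compared directly (for example, a rating of 3 is greater than a rating of 1), which matches the way the scale is described.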
Major Language Families
Arawak
The languages of the Arawak family, in its restricted sense, also designated Maipurean, have long been recognized as related, though proposed genetic links to other linguistic groups are more doubtful. The supposed link with the Arawá languages, for example, has no linguistic basis. The work of Noble (1965) influenced archeology, but is dubious in its conclusions. The Arawak languages are amazingly widespread, from the Caribbean to Bolivia. In Brazil, they occur in northern Pará state, on the tributaries of the Rio Negro in the northwest, along the Purus River in the west, on the tributaries of the Juruena River in Mato Grosso, and along the Upper Xingu River. The relatively numerous Terêna live in Mato Grosso do Sul. It is not certain whether or not there are still speakers of Mandawáka (Mandahuaca) in the region of the Upper Rio Negro. The Arawak languages are polysynthetic and often have gender and nominal classification (Table 2).
Carib
The Carib family is centered on northern South America. The Carib languages of northern Brazil are rather similar, though Waimiri-Atroari (Atruahí) is more distant. The language called Galibi do Oiapoque is intrusive from French Guiana, where it is called Kali’na (or Carib in Surinam and Guyana). The Carib languages on or near the Upper Xingu are quite different from the northern languages and also do not constitute a single consistent subgroup (Table 3).
Hypothetical Linguistic Stock
Macro-Jê
Various authors have, on one basis or another, proposed groupings of languages often considered today as Macro-Jê. It is important to confirm or disconfirm each of the proposed genetic affiliations, some of which are not obvious. The Jê family of languages, the largest of the stock, is concentrated in the savanna regions of Brazil from the southern parts of the states of Pará and Maranhão south to Santa Catarina and Rio Grande do Sul. The other families of this hypothesized stock generally occur outside of Amazonia, mainly in eastern and northeastern Brazil, but with some in central Brazil and farther west. Rikbaktsa has been held to be the exception, its speakers apparently having lived for a long time in an Amazonian environment in northern Mato Grosso. Recent research, however, indicates that the Jabutí languages are probably Macro-Jê, as was speculated by some authors, indicating a wider and older presence in Amazonia as well. Because of their early contact with Europeans, many of the Macro-Jê languages in the east and northeast of Brazil are extinct, with or without some documentation. The last speaker of Umotína died recently (Table 1).
Pano
The Pano linguistic family is not highly differentiated internally. It occurs in Peru, Bolivia, and Brazil, and is usually considered to be related to the Tacana family of Bolivia. The Brazilian Pano languages occur in the states of Acre and Amazonas, except for the Kaxararí in Rondônia, and have received relatively little study. Sources are contradictory as to whether Amawáka is spoken in Brazil (Table 4).

Tucano
Of the divisions of the Tucano family, Western, Eastern, and (for some authors) Central, it is mainly the Eastern branch that occurs in Brazil, though Kubewa (Cubeo), of the putative Central branch, also occurs there. Except for Arapaso, each of the Tucano languages of Brazil is also spoken in Colombia, where they have generally received more study. More recent sources doubt that Yuruti (Juriti) is spoken in Brazil. These languages are noted for tone or pitch accent, morpheme-intrinsic nasality, and complex obligatory coding of evidentiality. The languages are spoken in the region of the Vaupés, Tiquié, and Papurí Rivers. The speakers of several of them refer to themselves as Yebá-masã (Yepá-masã).
Table 1 Macro-Jê (Macro-Ge) stock
Columns: Linguistic unit; Dialects, groups; No. of speakers; Population; Transmission; Studies; Endangered
Boróro Family: Boróro (Borôro)
Guató Family: Guató
Jê Family: Akwén (Xakriabá, Xavánte, Xerénte); Apinayé; Kaingáng (Kaingáng do Paraná, Kaingáng Central, Kaingáng do Sudoeste, Kaingáng do Sudeste); Kayapó (Gorotire, Kararaô, Kokraimoro, Kubenkrankegn, Menkrangnoti, Mentuktíre (Txukahamãe), Xikrin); Panará (Kreen-akore, Krenakarore); Suyá; Timbíra; Xokléng (Xokleng)
Karajá Family: Karajá
Krenák Family: Krenák (Krenak)
Maxakalí Family: Maxakalí
Ofayé Family: Ofayé (Opayé, Ofayé-Xavante)
Rikbaktsá Family: Rikbaktsá (Erikpaksá, Rikbaktsa)
Yathé Family: Yathê (Fulniô, Carnijó)
Many of these languages are quite robust, but have received little study in Brazil (Table 5).

Tupí
The Tupí family consists of 10 branches, one of which, Tupí-Guaraní, spreads over a vast area, with extensions into Argentina, Paraguay, Bolivia, Peru, and French Guiana. Languages of this branch have been studied for centuries, but with particular fascination for the Tupí-Guaraní dialects on the coast studied by the Jesuits, which contributed many loanwords to Portuguese and which achieved an almost classical status in Brazil, where the word ‘Tupí’ is sometimes used to refer to these dialects. Though Tupí-Guaraní is often thought to be somehow more central in the family, it is actually rather atypical. Awetí is apparently the branch most closely related to Tupí-Guaraní, and these two together with Mawé form a subgroup within the family. The Ramarama and Puruborá branches also form a subgroup; the other relations are not obvious and are still being worked out. Research on the Tupí families in the western state of Rondônia, often considered the original location of the Tupí peoples, is rather recent. A number of languages important for comparative Tupí studies are urgently endangered (Table 6).
Medium-sized Language Families
Arawá
The Arawá languages are spoken in a relatively circumscribed region centered on the upper and middle Purus and Juruá rivers. Their maintenance is generally good (Table 7).

Katukina
The Katukina family of languages (not to be confused with Katukina do Acre, a Pano language) is spoken by groups on the Javari, Juruá, and Jutaí rivers in southern Amazonas. Recently, Adelaar (2000) presented evidence that the Peruvian family Harakmbut is genetically related to the Katukina family of languages. Their study is urgent (Table 8).

Makú
The Makú languages (not to be confused with the Máku language of Roraima) are spoken by hunter-gatherer groups mainly in the region of the Vaupés, though the Nadëb live lower on the Rio Negro. The Bará (Kakua, Kakwa) language (not to be confused with the Bará (Barasana) language of the Tucano family) is spoken on the border with Colombia, and it is not clear how many of its speakers live in Brazil (Table 9).
Table 3 Carib (Karib) family
Columns: Linguistic unit; Dialects, groups; No. of speakers; Population; Transmission; Studies
Linguistic units: Aparaí (Apalaí); Arara do Pará (Ukarãgmã, Arára, Pará); Bakairí; Galibí do Oiapoque (Kali’na, Carib); Hixkaryána; Ingarikó (Kapóng, Akwaio, Patamona); Kalapálo (Kuikúro-kalapálo)
Notes: Kalapálo, Kuikúru, Matipú, Nahukwá are dialects of one language. Shikuyana is dialect
Nambikwara
The Nambikwara languages occur in western Mato Grosso and southeastern Rondônia, in a region that includes both tropical forest and savanna, centered on tributaries of the Guaporé and the Juruena rivers (Table 10).

Chapakura (Txapakúra)
The extant Chapakura languages are spoken in the state of Rondônia (and in Bolivia). Torá, in the state of Amazonas, is described by recent visitors as having been extinct for many years. Recent ethnographers state that Urupá is also extinct. The Moré live in Bolivia, though there may be a few in Brazil (Table 11).

Yanomami
The languages of the Yanomami family are spoken in Brazil and in Venezuela, by rather unacculturated groups. In Brazil these languages occur in the northern state of Roraima, near the Venezuelan border (Table 12).

Smaller Language Families
Bora
Some speakers of the Miranha dialect of Bora reportedly live along the Solimões River in Brazil.

Guaikurú
Kadiwéu, one of the Guaikurú languages (which tend to occur in the Chaco region of Paraguay and Argentina), is spoken in Mato Grosso do Sul in Brazil.

Jabutí
The name of this family is a corruption of Djeoromitxi, one of its component languages. The languages are found in southern Rondônia.
Mura
The language of the Mura and that of the Pirahã appear to have been quite close; often they are grouped under one name (Múra-Pirahã). There are occasional reports of elderly Mura speakers, though the Mura generally speak Portuguese or a dialect of Nheengatu (Table 13).

Isolated Languages
Seven languages are not known to be genetically affiliated with others. Of these, Aikanã (Tubarão), Kanoê (Kanoé), and Kwazá are in the same region in southern Rondônia. The language of the Iranxe (Irântxe) and Mynky is spoken near the headwaters of the Juruena River, in Mato Grosso. The Trumái are thought to have been relative latecomers to the Upper Xingu regional system. There is said to be only one Máku speaker, in the state of Roraima. The Tikuna (Ticuna) are numerous, living along the Solimões River and extending into Colombia and Peru. It is a sign of progress that, of these isolated languages, Kanoê, Kwazá, Mynky, Trumái, and Tikuna have received intensive modern study in recent years (Table 14).

Creole Languages
There are two groups in the northern state of Amapá, the Galibi-Marworno (Carib) and the Karipuna do Norte (Karipúna Creole French), both of whom lived for some time in French Guiana and speak creoles heavily influenced by the French-based creole of that country (Table 15).
Table 6 Tupí family

Table 15 linguistic units: Galibi Marworno (Carib); Karipuna do Norte (Karipúna Creole French)
See also: Arawak Languages; Benveniste, Emile (1902–1976); Cariban Languages; Endangered Languages; Evolution of Semantics; Guarani; Meaning: Pre-20th Century Theories; Polysemy and Homonymy; Tupian Languages. Language Maps (Appendix 1): Map 51.
Bibliography
Adelaar W (2000). ‘Propuesta de un nuevo vínculo genético entre dos grupos lingüísticos indígenas de la Amazonia occidental: harakmbut y katukina.’ In Miranda L (ed.) Actas I Congresso de lenguas indígenas de Sudamérica II. Peru: Lima. 219–236.
Anchieta J de (1595). Arte e grammatica da lingua mais usada na costa do Brasil. Coimbra: Antônio Mariz.
Cabral A S A C & Rodrigues A D (eds.) (2002). Línguas indígenas brasileiras: fonologia, gramática e história; atas do I Encontro Internacional do Grupo de Trabalho sobre Línguas Indígenas da ANPOLL I and II. Belém: Editora Universitária UFPA.
Centro de Documentação, Indígena. (1987). Povos indígenas do Brasil. (Map). São Paulo: CEDI.
Crevels M, van de Kerke S, Meira S & van der Voort H (eds.) (2002). Selected papers from the 50th International Congress of Americanists in Warsaw and the Spinoza Workshop on Amerindian Languages in Leiden, 2000. CNWS Publications, 114. Leiden: Research School of Asian, African, and Amerindian Studies.
Derbyshire C D & Pullum G K (eds.) (1986–1998). Handbook of Amazonian languages. 4 vols. Berlin: Mouton de Gruyter.
Dixon R M W & Aikhenvald A Y (eds.) (1999). The Amazonian languages. Cambridge: Cambridge University Press.
Franchetto B (2000). ‘O conhecimento científico das línguas indígenas da Amazônia no Brasil.’ In Queixalós F & Renault-Lescure O (eds.). 165–182.
Klein H E & Stark L R (eds.) (1985). South American Indian languages: retrospect and prospect. Austin: University of Texas Press.
Mattoso Câmara J Jr (1965). Introdução às línguas indígenas brasileiras. Rio de Janeiro: Livraria Acadêmica.
Meira S & Franchetto B (2005). ‘The Southern Cariban languages and the Cariban family.’ International Journal of American Linguistics 71(2), 127–190.
Moore D (forthcoming). ‘Endangered languages of lowland tropical South America.’ In Brenzinger M (ed.) Language diversity endangered. Berlin: Mouton de Gruyter.
Noble G K (1965). Proto-Arawakan and its descendents. Bloomington: Indiana University Research Center in Anthropology, Folklore, and Linguistics.
Payne D L (ed.) (1990). Amazonian linguistics: studies in lowland South American languages. Austin: University of Texas Press.
Queixalós F & Renault-Lescure O (eds.) (2000). As línguas Amazônicas hoje. São Paulo: Instituto Sócio Ambiental.
Rodrigues A D (1986). Línguas brasileiras: para o conhecimento das línguas indígenas. São Paulo: Edições Loyola.
Rodrigues A D (1993). ‘Endangered languages in Brazil.’ Unpublished manuscript from the Symposium on endangered languages of South America. Leiden: Rijks Universiteit.
Roosevelt A C (1994). ‘Amazonian anthropology: strategy for a new synthesis.’ In Amazonian Indians from prehistory to the present. Tucson: The University of Arizona Press. 1–29.
Seki L (1999). ‘A lingüística indígena no Brasil.’ Revista de Documentação de Estudos em Lingüística Teórica e Aplicada 15, 257–290.
Seki L (2000). Gramática do Kamaiurá. Campinas: Editora da UNICAMP.
van der Voort H & van de Kerke S (eds.) (2000). Indigenous languages of lowland South America. Indigenous languages of Latin America, vol. 1, CNWS publications 90. Leiden: Research School of Asian, African, and Amerindian Studies.
Wetzels L (1995). Estudos fonológicos das línguas indígenas brasileiras. Rio de Janeiro: Editora UFRJ.
Relevant Websites
http://www.socioambiental.org – Instituto Sócio Ambiental (ISA).
http://www.geocities.com/linguasindigenas/ – Listserv about indigenous languages in Brazil.
Bréal, Michel Jules Alfred (1832–1915)
E Guimarães, Institute of Language Studies–Unicamp, São Paulo Campinas, Brazil
© 2006 Elsevier Ltd. All rights reserved.

Bréal, French linguist and one of the founders of semantic linguistics, studied Sanskrit in Berlin with Bopp and Albrecht Weber. He received his Ph.D. in 1863, defending the theses Hercule et Cacus. Étude de mythologie comparée and Des noms perses dans les écrivains grecs. In 1864, he became a professor of comparative grammar at the Collège de France. In 1868, he joined the group that founded the École des Hautes Études, where he became director and was, for a time, Ferdinand de Saussure’s professor. From 1879 to 1888, he was Inspector General of French Public Instruction. His work was dedicated to three domains: the study of ancient inscriptions and myths, the study of historical and comparative linguistics, and reflection on questions related to teaching. He himself named his work in linguistics ‘semantics,’ having been the first to use this word for a linguistic discipline (Bréal, 1883). In these studies, Bréal placed himself within the historical perspective of the 19th century and considered that semantics deals with change in the signification of words (Delesalle, 1988). He differed from the comparativists of his time (Aarsleff, 1981; Delesalle, 1980), as he considered that language cannot be reduced to forms and that its study must necessarily include meaning (Bréal, 1866). Changes in language are not natural, ruled by inevitable laws, but occur through man’s willful action and intelligence. Willful action, which is not conscious, is constituted by the slow and groping agreement of the will of many, a collective will. Intelligence is a faculty of knowledge and has its origin in the functioning of the sign. Language represents an accumulation of intellectual work. Therefore, the science of language is not a natural science; it is historical and cultural (Bréal, 1897). In this domain, Bréal established a fundamental concept in semantic studies, that of polysemy, and this aspect can be found in the work that synthesizes the principal points of his production (1897). Willful action and intelligence change the signification of a word that, not losing its previous signification, takes on more than one meaning. Polysemy is the result of history and is one of the places that represent the accumulation of the intellectual work of the language. Another important aspect, also present in the Essai de sémantique, is what he called the subjective element. He who speaks is marked in what he says. In languages there are forms that, when used, mark this presence. Personal pronouns are one example of these forms, which would later be crucial in the work of Émile Benveniste.

See also: Bopp, Franz (1791–1867); Weber, Albrecht Friedrich (1825–1901).
Bibliography
Aarsleff H (1981). ‘Bréal, la Sémantique et Saussure.’ Histoire, Épistémologie, Langage III 2, 115–134.
Bréal M (1863). Hercule et Cacus, étude de mythologie comparée. Paris: A. Durand.
Bréal M (1866). ‘De la forme et de la fonction des mots.’ Revue des cours littéraires de la France et de l’étranger, fascicle dated December 29, 1866. In Desmet P & Swiggers P (eds.) (1995) De la grammaire comparée à la sémantique. Textes de Michel Bréal publiés entre 1864 et 1898. Paris: Peeters.
Bréal M (1883). ‘Les lois intellectuelles du langage. Fragment de sémantique.’ Annuaire de l’Association pour l’Encouragement des Études Grecques en France, 17. In Desmet P & Swiggers P (eds.) (1995) De la grammaire comparée à la sémantique. Textes de Michel Bréal publiés entre 1864 et 1898. Paris: Peeters.
Bréal M (1897). Essai de sémantique. Paris: Hachette.
Delesalle S (1980). ‘L’analogie: d’un arbitraire à l’autre.’ Langue Française 46, 90–108.
Delesalle S (1988). ‘L’Essai de Sémantique de Bréal, du ‘transformisme’ à la diachronie.’ In La Linguistique génétique. Histoire et théories. Paris: PUF.
Bredsdorff, Jakob Hornemann (1790–1841)
J van Pottelberge, Ghent University, Ghent, Belgium
© 2006 Elsevier Ltd. All rights reserved.
Though trained as a natural scientist, Jakob Hornemann Bredsdorff is remembered most of all as one of the first scientific runologists and historical linguists. He was born on April 3, 1790, in Vester Skerninge (on the island of Fyn, Denmark) into a line of highly educated Lutheran priests. After thorough preparation at home, Bredsdorff entered Nykøbing Cathedral School in 1807. Here, his language teacher was S. N. J. Bloch, a renowned philologist and pedagogue who also taught Rasmus Rask. In 1809 Bredsdorff enrolled at the University of Copenhagen, where he received a first degree in Divinity in 1814 and a doctoral degree in natural sciences in 1817. Early on in Copenhagen, if not before, Bredsdorff met Rask, with whom he remained friends until the latter’s death in 1832. He spent most of his career as a reader in geology and botany at the prestigious private school Sorø Academy, where he died on June 16, 1841.

Bredsdorff left a small but remarkable body of linguistic writings. He owes his special place in the history of linguistics most of all to his highly original paper On the causes of linguistic change (published in Danish in 1821), which provides a genuine theory of language change, with the speaker as the central locus of change. It differs fundamentally from the ideas of Rask, who treated language history in terms of natural history. Being at least 40 years ahead of its time and published in an examination program of Roskilde Cathedral School, the paper passed unnoticed. It was rediscovered and republished by Vilhelm Thomsen in 1886. In the long-standing Scandinavian tradition of runology, Bredsdorff gave the first more-or-less correct interpretation of the runic inscription on the famous Golden Horn of Gallehus (1839), which ultimately led to today’s standard transliteration by Ludvig Wimmer. He was also the first to realize that the 24-character runic alphabet was older than the more common 16-character alphabet, though he erroneously derived both from Ulfilas’s Gothic alphabet. His data-oriented analysis of the relationships between the Germanic languages also foreshadowed modern views, arguing that Gothic should be considered a separate branch of Germanic and not the ancestor of High German or of all Germanic languages together. Being well aware of the gap between the rough classification of sounds in orthography and the more sophisticated differentiations in pronunciation, Bredsdorff tried to develop an alphabet to represent pronunciation more accurately, which he applied to both standard and colloquial Danish in 1817. The sources of Bredsdorff’s linguistic insights have not yet been investigated; preliminary information can be found in Andersen (1982).

See also: Gothic; Phonetic Transcription: History; Rask, Rasmus Kristian (1787–1832); Runes; Thomsen, Vilhelm Ludvig Peter (1824–1927).
Bibliography
Andersen H (1982). ‘On the causes of linguistic change (1821) by Jakob Hornemann Bredsdorff. English translation with commentary and an essay on J. H. Bredsdorff.’ Historiographia Linguistica 9, 1–41.
Glahder J (ed.) (1933). J. H. Bredsdorffs udvalgte afhandlinger inden for sprogvidenskab og runologi. Copenhagen: Levin & Munksgaard.
Sandfeld K (1979). ‘Bredsdorff, Jakob Hornemann.’ In Cedergreen Bech S (ed.) Dansk biografisk leksikon (3rd edn.), vol. 2. Copenhagen: Gyldendal. 497–498.
Breton
J Le Dû, Université de Bretagne Occidentale, Brest, France
© 2006 Elsevier Ltd. All rights reserved.
Breton (brezoneg, brezhoneg) belongs to the Brythonic branch of the Celtic languages. It is spoken in Lower Brittany, and its linguistic border is the westernmost limit of the withdrawal of Celtic before Roman expansion.
Breton was long considered the continuation of Gaulish. Linguistic studies in the 19th century rejected any purported genetic connection between Breton and French and also any close relationship to Gaulish. Some historians argued that Breton had been imported whole by immigrants from Britain into a thoroughly romanized Armorica. Modern Celtic studies confirmed the view that Breton was a late offshoot of British Celtic. We now know that emigration from Britain began before the Saxon invasions,
so that most scholars acknowledge that Breton is rooted in Armorican Gaulish, absorbing different varieties of British Celtic. A traditional view of the language posits the existence of a unified Old Breton, supposed to have split into four dialects, named after the dioceses as they existed before the 1789 French Revolution: Léonais for the diocese of Léon, Trégorrois for Tréguier, Cornouaillais for Cornouaille, and Vannetais for Vannes. There are, in fact, two major dialect groups: (1) KLT: Cornouaille (Kerne), Léon, and Trégor; and (2) Vannetais, the western border of which is the river Ellé. Falc’hun (1962, 1981) has reported the existence of an intermediate dialect centered on Carhaix, the meeting point of all the major roads, and constituting a bridge between remote linguistic forms, like the reflexes of the dental spirants from old Celtic *tt and *d. Léon deiz ‘day’ and dervez ‘duration of a day’ (Welsh dydd and dyddwaith) are far removed from Vannetais de and deùeh. The central forms are de and devez, dropping z from *d as in Vannetais, but keeping z from *tt as in Léon. The primitive twofold partition could reflect the difference between Osismii and Venetes Gaulish, the latter keeping closer to Armorican.

An intensity stress generally falls on the penultimate syllable in the northwest, whereas in the southeast a pitch stress affects the last syllable, not unlike French. Voiceless consonants and /m/ are fortes, voiced spirants are lenes, and voiced stops and /l/, /n/, and /r/ can be either. Stressed vowels are short before fortes and long before lenes. One can thus oppose ar zal ‘the room’ (long [a:], weak [l]) and zall ‘salted’ (short [a], strong [l]). There can be up to eight phonemic nasal vowels, which are not borrowings from French, but archaic features, as in hañv ‘summer’.

Primitive consonants were weakened, especially between vowels. These changes survived the loss of final syllables, turning a simple phonetic mechanism into a grammatical device called ‘lenition,’ so that the initial consonants of feminine words are lenited after the article (which originally ended in a vowel), as is the initial consonant of the following adjective: mamm ‘mother,’ mad ‘good,’ ar vamm vad ‘the good mother.’ The geminate voiceless fortes became voiceless spirants, giving rise to the spirant mutation: penn ‘head,’ he fenn ‘her head.’ Another sandhi phenomenon caused the so-called provective mutation: a final -h in hoh ‘your’ devoices a following voiced initial consonant, as in bugel ‘child,’ ho(h) pugel ‘your child.’ Final consonants are devoiced before a pause. Ma zad ‘my father’ keeps a long [a:], but has a devoiced -d when final, the voicing being restored when the word is followed by a vowel, as in ma zad eo ‘(he) is my father’. All voiceless consonants are voiced before a vowel or l, m, n, and r. Native Breton speakers are readily recognizable in French when they pronounce toud’ la z’maine for toute la semaine ‘during the whole week.’

English and Breton grammars show striking similarities; for example, both use a compulsory periphrastic progressive in opposition to a simple present: Ma breur ne gan ket ‘my brother does not sing’ vs. ma breur n’ema ket o kana ‘my brother is not singing.’ The lexis is basically Celtic (dorn ‘fist, hand’, Welsh dwrn, Gaelic dorn; den ‘person’, Welsh dyn, Gaelic duine). About 500 common words are Latin borrowings (taol
1% of Breton children benefit from this bilingual education. See also: Celtic; France: Language Situation; Welsh.
Bibliography
Balcou J & Le Gallo Y (eds.) (1987). Histoire littéraire et culturelle de la Bretagne. Paris-Genève: Champion-Slatkine.
Broudic F (1995). La pratique du breton de l’ancien régime à nos jours. Rennes: PUR.
Falc’hun F (1962). ‘Le Breton, forme moderne du gaulois, Rennes.’ Annales de Bretagne 64(4).
Falc’hun F (1981). Perspectives nouvelles sur l’histoire de la langue bretonne. Paris: UGE.
Fleuriot L (1980). Les origines de la Bretagne. Paris: Payot.
Guiomar J-Y (1987). Le Bretonisme: les historiens bretons au XIXe siècle. Société d’Histoire et d’archéologie de Bretagne.
Hélias P-J (1979). The horse of pride – life in a Breton village. London/New Haven: Yale University Press.
Hemon R (1975). A historical morphology and syntax of Breton. Dublin: The Dublin Institute for Advanced Studies.
Hersart de La Villemarqué T (1867). Barzaz Breiz – chants populaires de la Bretagne. Paris: Perrin.
Humphreys L l H (1995). Phonologie et morphosyntaxe du parler breton de Bothoa en Saint-Nicolas-du-Pelem. Brest: Emgleo Breiz-Brud Nevez.
Jackson K (1967). A historical phonology of Breton. Dublin: Dublin Institute for Advanced Studies.
Le Berre Y & Le Dû J (1997). ‘Nommer le breton.’ In Tabouret-Keller (ed.) Le nom des langues I: les enjeux de la nomination des langues. Louvain-La-Neuve: BCILL, Peeters.
Le Berre Y & Le Dû J (1999). ‘Le qui pro quo des langues régionales: sauver la langue ou éduquer l’enfant?’ In Clairis C, Costaouec D & Coyos J-B (eds.) Langues et cultures régionales de France – Etat des lieux, enseignement, politiques. Paris: L’Harmattan.
Le Dû J (2001). Nouvel atlas linguistique de la Basse-Bretagne. Brest: CRBC.
Le Roux P (1924–1963). Atlas linguistique de la Basse-Bretagne. Rennes and Paris.
Loth J (1883). L’immigration bretonne en Armorique du Vème au VIIème siècles de notre ère. Paris.
McKenna M (1988). A handbook of modern spoken Breton. Tübingen: Niemeyer.
Piette J F R (1973). French loanwords in Middle Breton. Cardiff: University of Wales Press.
Ploneis J-M (1983). Au carrefour des dialectes breton: le parler de Berrien. Paris: SELAF.
Sommerfelt A (1978). Le breton parlé à Saint-Pol-de-Léon: phonétique et morphologie (2ème éd.). Oslo: Universitetsforlaget.
Tanguy B (1977). Aux origines du nationalisme breton-le renouveau des études bretonnes au XIXème siècle. Paris: UGE.
Ternes E (1970). Grammaire structurale du breton de l’île de Groix. Heidelberg: Carl Winter.
British Indian Ocean Territory: Language Situation Editorial Team © 2006 Elsevier Ltd. All rights reserved.
The British Indian Ocean Territory is an overseas territory of the United Kingdom. It consists of more than 2000 islands in the Indian Ocean to the south of India, midway between Africa and Indonesia. The territory was established in 1965, when it was slightly larger than today; in 1976 a number of islands became part of the newly independent Seychelles. Currently the British Indian Ocean Territory comprises the six main island groups that make up the Chagos Archipelago. The largest of these islands is Diego Garcia, which houses a joint U.K.–U.S. naval support facility. It is the only island that is inhabited, by approximately 1500 (U.K. and U.S.) military personnel and 2000 civilian contractors. During the establishment of the naval base (1967–1973), the local population of Ilois, mainly agricultural workers, were relocated to Mauritius and the Seychelles. There is a legal campaign to gain the right of return, but so far this has been unsuccessful, largely due to the special military status of Diego Garcia. Thus, the official (and only) language of the British Indian Ocean Territory is English.
Brøndal, Rasmus Viggo (1887–1942) F Gregersen, University of Copenhagen, Copenhagen, Denmark © 2006 Elsevier Ltd. All rights reserved.
Viggo Brøndal, born Rasmus Viggo Hansen on October 13, 1887, in Copenhagen, was a Danish linguist and Romance philologist. He changed his name from Hansen to Brøndal in 1912. Brøndal studied with Sandfeld and Nyrop at the University of Copenhagen and graduated as magister artium in Romance Philology in 1912. As a postgraduate, Brøndal studied with Bédier and Meillet in Paris for a year. In 1917 he was awarded the degree of Doctor of Philosophy by the University of Copenhagen for a dissertation on loans and substratum influences in Romance and Germanic languages. The book is heavily influenced by the sociological Meillet school and thereby atypical of Brøndal’s later work. From 1917 to 1925 he worked as an assistant to the Place-Name Committee, and then returned to Paris for a three-year period where he was a reader of Danish at the Sorbonne. In 1928 he was appointed Professor of Romance Philology at the University of Copenhagen, a position that he held until his death on December 14, 1942. Viggo Brøndal was among the founding members of the Linguistic Circle of Copenhagen (1931), and, until his premature death, he was a major force in its endeavors to further structural linguistics. In particular, he and Louis Hjelmslev together edited the Acta Linguistica (later Acta Linguistica Hafniensia) from its start (1937). Brøndal formed strong bonds with other structuralists, such as Roman Jakobson, and he functioned as the secretary general of the 1936 Copenhagen Congress of Linguists, which was presided over by his former teacher, Otto Jespersen. Brøndal’s particular kind of structuralism is of an idealist transcendent type, based on his reception of the Aristotelian categories.
From his work on word classes (1928), arguably his best book, until his theory of the prepositions (1940), the vision is refined, but the outlines remain the same: The four generic concepts of descriptum (D), descriptor (d), relatum (R), and relator (r) may be combined to form what are seen as the possible linguistic categories in any specific system. Brøndal uses his categories to analyze the relation between morphology and syntax in his 1932 book on the subject and later wrote a number of
papers, collected in the posthumously published Essais de linguistique générale (1943). Viggo Brøndal was a fascinating orator, and the debates between him and Hjelmslev were both fierce and singularly gratifying for a whole generation of Danish linguists. Brøndal deeply influenced his immediate pupils, the Nordic philologist Paul Diderichsen, the Romance philologist Knud Togeby, and the literary historian Hans Sørensen, but none of them are seen to adhere to his theory, strictly speaking, in their later works. The theory is justly characterized by Eli Fischer-Jørgensen as a great intellectual achievement in that it seeks to capture everything linguistic with rather few but very abstract concepts and strives to treat all linguistic levels from phonology through word classes (morphology) to syntax and semantics using only these same concepts to arrive finally at a characteristic of a language – and through this – a culture. But unlike Hjelmslev’s theory, Brøndal’s impressive structure has remained outside the mainstream of linguistics. See also: Copenhagen School; Scandinavia: History of Linguistics.
Bibliography
Brandt P A (ed.) (1989). Linguistique et sémiotique: Actualité de Brøndal. Travaux du Cercle Linguistique de Copenhague, Vol. 22, C.A. Reitzel: Cph.
Brøndal V (1943). Essais de linguistique générale. Munksgaard: Cph.
Brøndal V (1948). Les Parties du discours. Munksgaard: Cph. [Original Danish ed., 1928.]
Brøndal V (1948). Substrat et emprunt. Munksgaard: Cph./Institutul de linguistica româna. [Original Danish ed., 1917.]
Brøndal V (1950). Théorie des prépositions. Munksgaard: Cph. [Original Danish ed., 1940.]
Fischer-Jørgensen E (1979). ‘Viggo Brøndal.’ In Cedergreen Bech (ed.) Dansk biografisk leksikon, Vol. 1. Gyldendal: Cph.
Larsen S E (1987). ‘A semiotician in disguise: semiotic aspects of the work of Viggo Brøndal.’ In Sebeok T & Umiker-Sebeok J (eds.) The semiotic web ’86. An international yearbook. Berlin: Mouton de Gruyter.
Larsen S E (1986). Sprogets geometri (vols 1–2). Odense: Odense Universitets forlag.
Larsen S E (ed.) (1987). Langages 86: Actualité de Brøndal. Paris: Larousse.
Brockelmann, Carl (1868–1956) M V McDonald © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 1, pp. 415–416, © 1994, Elsevier Ltd.
The German Orientalist Brockelmann was born in Rostock on September 17, 1868. While still at school, his imagination was fired by the great geographical discoveries then being made, and he showed an early interest in exotic languages. However, on entering Rostock University in 1886 he began with the study of classics as offering more secure career prospects. The award of a scholarship allowed him to move shortly to Breslau (now Wrocław) where he studied Semitic and Indo-European philology. At this time he also taught himself Turkish, a language which was to remain an abiding interest. In 1888 he went to Strasbourg (then in Germany) in order to study under Nöldeke (see Nöldeke, Theodor (1836–1930)). Here he occupied himself with Sanskrit, Armenian, and Ancient Egyptian. By 1892 he was once again in Breslau where he completed his Habilitation (on Ibn al-Jawzi), wrote his Lexicon Syriacum, and traveled for the first time to Turkey. While in Constantinople he made the acquaintance of Jahn, to whose chair in Königsberg he was later to succeed. These years in Breslau saw in particular the preparation of his edition of Ibn Qutayba’s ‘Uyūn al-Akhbār (Berlin/Strasbourg, 1900–1908) and the first edition of his Geschichte der Arabischen Litteratur (Weimar/Berlin, 1898–1902). In 1903 he was appointed to a chair in Königsberg, where he completed his own best-loved work, the Grundriss der Vergleichenden Grammatik der Semitischen Sprachen (Berlin, 1907–1913). In 1910 he accepted a chair in Halle, which became his hometown for the rest of his life. In 1923 he went back to Breslau, from which he finally retired to Halle in 1935. His years of retirement were overshadowed
by World War II and its ensuing miseries. His son was taken prisoner at Stalingrad, only returning in the early 1950s, and his wife died in 1945. His savings became worthless, and a period of extreme difficulty was finally alleviated when in 1947 he was made librarian to the Deutsche Morgenländische Gesellschaft in Halle. It is said to be due to his efforts that this library was not shipped to the Soviet Union as war reparations. Even during these years, he continued to teach a number of languages, and his scholarly energies remained undimmed. He retired again in 1953 and was still working on his Hebräische Syntax at his death (May 6, 1956). During his lifetime, Brockelmann produced a mass of articles and studies covering an enormous range of Semitic and Turkish studies (see Fück, 1958), centering mainly on grammar, syntax, and lexicography. He is best remembered, however, by Arabists at least, for his Geschichte der arabischen Literatur, the final version of which appeared in Leiden (1943–1949). This is an invaluable but unwieldy work; to some extent it has been superseded by later works, but it is still indispensable, even though it is occasionally impossible to identify his references. He is one of the select band of Orientalists whose name has become a household word in the field, and a copy of ‘Brockelmann’ is one of the most-used works in any Arabic library. See also: Nöldeke, Theodor (1836–1930); Semitic Languages.
Bibliography
Fück J (1958). Obituary in Wissenschaftliche Zeitschrift der Martin-Luther-Universität Halle–Wittenberg. 7, 4. Ges. Sprachw., Halle/Saale.
Sellheim R (1981). ‘Autobiographische Aufzeichnungen und Erinnerungen von Carl Brockelmann.’ Oriens 27–28, 1–65.
Brosses, Charles de (1709–1777) G Haßler, University of Potsdam, Potsdam, Germany © 2006 Elsevier Ltd. All rights reserved.
Charles de Brosses was born in Dijon, France, on February 7, 1709, and died in Paris, on May 7, 1777. He was a French magistrate and scholar, who came from a family of judges and studied in his home town. A classmate of Georges Louis Leclerc de Buffon (1707–1788) at Godrans Collège in Dijon, he was appointed judge at the Burgundian Parlement des États (1730), later becoming a conseiller (1741) and finally first president (1775). He also followed his inclination towards literature and science, and, during a visit to Italy in 1739–1740, he wrote his letters on Italy which were published posthumously. His friend Buffon solicited him to undertake the composition of his Histoire des navigations aux terres australes (1756). This work included many word lists and was translated into English and German. De Brosses was the first to lay down the geographical divisions of Australasia and Polynesia, which were afterwards adopted by succeeding geographers. His works on the history and origins of language earned him a reputation as a theorist in this field. He was elected to the Académie des Inscriptions in 1758. In 1760 de Brosses published a dissertation, Du culte des dieux fétiches, which was afterwards inserted into the Encyclopédie méthodique. In this work, secretly smuggled into France after having been rejected by the Académie des Inscriptions, he developed the hypothesis that all divinities had a physical origin and that they were initially material objects adored for their own sake. In 1765 appeared his work on the origin of language, Le Traité de la formation mécanique des langues. De Brosses aimed at giving a naturalistic interpretation to symbolic functioning. Language is for him primarily an organic phenomenon. ‘Mechanical’ was not a neutral term, but was taken from the works of Jean Baptiste le Rond d’Alembert (1717–1783), who used it to designate a part of applied mathematics which tried to explain movement and its forces.
The linguistic use of the term mechanical was not invented by de Brosses, but he took it up from Noël Antoine Pluche (1688–1761), who used it in a work on the acquisition of languages by children (La Mécanique des langues et l’art de les enseigner, 1751). De Brosses goes further in his explanation, adopting materialist connotations of the mechanical description of the first languages. Their elements should be natural and necessary and can be found in any language. His aim was to observe the operation
of the expressive movement of the body and the iconic links between the first sounds and the objects they represent. Nature is the author of the germination of sound and the first true words. The gradual evolution of languages towards arbitrariness does not eliminate this link. He developed a phonetic theory which allowed him to link words of different languages with their organic root. Nevertheless, de Brosses’s emphasis on etymology and the study of regularities of sound change does not make him a forerunner of historical linguistics. In this respect, the Traité was more likely a failure. De Brosses thought that we would be able to compare all languages on the basis of their organic roots and that the forms of unknown languages would fit easily into this scheme. He stressed the word in itself, independent of its relation to specific languages. Together with his thesis of a universal family of languages this emphasis produces an unbridgeable gap which separates his theory from the comparative grammar of the 19th century. De Brosses was occupied throughout most of his life with a translation of Sallust, attempting to supply the lost chapters of that celebrated historian’s work. These literary occupations did not prevent the author from efficiently executing his official duties as first president of the parliament of Burgundy, nor from carrying on a constant and extensive correspondence with the most distinguished literary figures of his time. Presenting himself as a candidate for the Académie Française in 1770, de Brosses was rejected due to the opposition of Voltaire (1694–1778) on personal grounds. See also: 18th Century Linguistic Thought; Origin of Language Debate.
Bibliography
Auroux S (1979). La sémiotique des encyclopédistes: essai d’épistémologie historique des sciences du langage. Paris: Payot.
Bézard Y ([1937] 1939). Le Président de Brosses et ses amis de Genève, d’après les correspondances inédites échangés entre Charles de Brosses, Bénigne Legouz de Gerland, Charles Bonnet, Pierre Pictet, Jean Jallabert. Annales de Bourgogne, January–March 1937. Paris: Ancienne Librairie Furne, Boivin et Cie.
Brosses C de (1765). Traité de la formation mécanique des langues et des principes physiques de l’étymologie. Paris: Saillant.
Brosses C de ([1756] 1967). Histoire des navigations aux terres australes. Amsterdam: Nico Israel.
Brosses C de (1995). Lettres familières d’Italie: lettres écrites d’Italie en 1739 et 1740. Brussels: Editions Complexe.
Garreta J-C (ed.) (1981). Charles de Brosses 1777–1977: actes du colloque organisé à Dijon du 3 au 7 mai 1977 pour le deuxième centenaire de la mort du président de Brosses, par l’Académie des Sciences, Arts et Belles Lettres de Dijon et le Centre de Recherche sur le XVIIIe Siècle de l’Université de Dijon. Geneva: Slatkine.
Sautebin H ([1899] 1971). Un linguiste français du XVIIIe siècle, le président de Brosses: étude historique et analytique du Traité de la formation mécanique des langues. Geneva: Slatkine.
Brown, Gillian G Yule, Kaaawa, HI, USA © 2006 Elsevier Ltd. All rights reserved.
Gillian (Gill) Brown, with a Cambridge M.A., had already taught in Ghana (1962–1964) before becoming an assistant lecturer in Phonetics and Linguistics at Edinburgh University in 1965. After fieldwork in Uganda, Gill published an early paper in generative phonology (Brown, 1970), which became part of her Edinburgh doctoral dissertation on the phonology of Lumasaaba in 1971, and the basis of a scholarly monograph (Brown, 1972). In the 1970s, Gill’s work on the practical applications of phonetics and phonology led to her widely acclaimed book on listening to spoken English (Brown, 1977/1990). Later, Gill’s intonation project (1975–1979), funded by the first of many research grants, developed innovative methods of eliciting and analyzing spoken data (Brown et al., 1980). Gill then combined linguistics, cognitive psychology, and the study of discourse structure to create a book that helped define the field of discourse analysis for many linguists (Brown and Yule, 1983a). Further research projects resulted in more books, two on the teaching and testing of spoken language (Brown and Yule, 1983b and Brown et al., 1984), one on language understanding (Brown et al., 1994) and another on referential communication (Brown, 1995). In subsequent research, Gill has focused on the ways in which context is created in discourse understanding (Brown, 1998). While the research projects continued, Gill moved from Edinburgh (1965–1983) to become Professor of Applied Linguistics at the University of Essex (1983–1988), then to serve as the founding Director of the Research Centre for English and Applied Linguistics at Cambridge University (1988–2004), where she created a stimulating intellectual environment for graduate study in many areas at the intersection of linguistics and cognitive psychology. As one of the few women professors in these institutions at the time, Gill was increasingly involved in administration, serving as Dean of Social Sciences at Essex, in
committee work, such as the University Grants Committee, and in public service, as a member of the Kingman Inquiry into English language teaching in British schools. In recognition of her outstanding work, Gill received a CBE in 1992. See also: Assimilation; Cognitive Pragmatics; Cohesion and Coherence: Linguistic Approaches; Communication, Understanding, and Interpretation: Philosophical Aspects; Discourse Processing; Elicitation Techniques for Spoken Discourse; Generative Phonology; Human Reasoning and Language Interpretation; Information Structure in Spoken Discourse; Intonation; Language Education: Grammar; Narrative: Cognitive Approaches; Phonetic Processes in Discourse; Phonetic Transcription: Analysis; Phonology: Overview; Second Language Listening; Spoken Discourse: Types.
Bibliography
Brown G (1970). ‘Syllables and redundancy rules in generative phonology.’ Journal of Linguistics 6, 1–17.
Brown G (1972). Phonological rules and dialect variation: the phonology of Lumasaaba. Cambridge: Cambridge University Press.
Brown G (1977/1990). Listening to spoken English (2nd edn.). Harlow: Longman.
Brown G (1995). Speakers, listeners and communication. Cambridge: Cambridge University Press.
Brown G (1998). ‘Context creation in discourse understanding.’ In Malmkjaer K & Williams J (eds.) Context in language learning and language understanding. Cambridge: Cambridge University Press.
Brown G & Yule G (1983a). Discourse analysis. Cambridge: Cambridge University Press.
Brown G & Yule G (1983b). Teaching the spoken language. Cambridge: Cambridge University Press.
Brown G, Anderson A, Shillcock R & Yule G (1984). Teaching talk: strategies for production and assessment. Cambridge: Cambridge University Press.
Brown G, Currie K & Kenworthy J (1980). Questions of intonation. London: Croom Helm.
Brown G, Malmkjaer K, Pollitt A & Williams J (eds.) (1994). Language and understanding. Oxford: Oxford University Press.
Brown, Roger William (1925–1997) M Thomas, Boston College, Chestnut Hill, MA, USA © 2006 Elsevier Ltd. All rights reserved.
Roger Brown was a social psychologist trained at the University of Michigan. During a 40-year-long career teaching at Harvard University and, briefly, MIT, he made at least three substantial contributions to late twentieth-century American linguistics. First, Brown is probably best known for his 1973 study of the acquisition of English by three preschoolers. Capitalizing on the recent invention of the portable tape recorder, Brown or his graduate students visited the homes of ‘Adam,’ ‘Eve,’ and ‘Sarah’ weekly or every other week to record at least two hours a month of spontaneous speech. Data were collected from Eve for eleven months; from Adam and Sarah, for five years. The recordings were then transcribed and meticulously analyzed. Brown developed the technique of calculating a child’s Mean Length of Utterance (MLU), measured in morphemes. Using MLU as a basis for calibrating the three children’s emerging grammars, he found commonalities in how they expressed semantic relations, and in their gradual building up of grammatical and morphological complexity. Brown’s transcripts and tapes were later deposited in the CHILDES online database, where they have had lasting influence on the study of child language acquisition. Second, Brown was extraordinarily effective as a teacher. He trained a cohort of students who have had distinguished careers in diverse subfields of linguistics and psycholinguistics, including Jean Berko Gleason, Ursula Bellugi, Melissa Bowerman, Courtney Cazden, Kenji Hakuta, Howard Gardner, Michael Maratsos, Steven Pinker, Eleanor Rosch, and Dan Slobin. Kessel’s (1988) festschrift provides a fuller list of Brown’s students and showcases their writings. A third contribution Brown made is related. Not all of Brown’s students inherited his distinctive working style (self-described as “phenomenon-centered,” “low-tech, minimally mathematical” with “an almost Talmudic taste for poring over data . . . involving nothing but the free exercise of the principles of induction” [1989: 49–50]), but it set a particular tone to the emerging discipline of psycholinguistics. Moreover, in person and in his best-selling textbooks (1958, 1965), Brown modeled open-mindedness, intellectual playfulness, and an unfailing sense of wonder that attracted many to the study of language. Brown’s work on Adam, Eve, and Sarah achieved such prominence that it is surprising to learn that he published nothing on child language after the midpoint of his career. However, he left behind diverse work on linguistic relativity, music and language, tip-of-the-tongue phenomena, and the sociolinguistics of politeness. He also left behind a painful memoir (Brown, 1997) that meditates on the afflicted personal life of a man celebrated as much for his geniality as for his professional success. See also: Language Acquisition Research Methods; Psycholinguistics: Overview.
Bibliography
Brown R (1958). Words and things. Glencoe, IL: The Free Press.
Brown R (1965). Social psychology. New York: The Free Press.
Brown R (1973). A first language: the early stages. Cambridge, MA: Harvard University Press.
Brown R (1989). ‘Roger Brown.’ In Lindzey G (ed.) A history of psychology in autobiography. Stanford, CA: Stanford University. 34–60.
Brown R (1997). Against my better judgment: an intimate memoir of an eminent gay psychologist. Binghamton, NY: Haworth Press.
Kessel F S (1988). The development of language and language researchers: essays in honor of Roger Brown. Hillsdale, NJ: Lawrence Erlbaum.
Brücke, Ernst (1819–1891) P C Sutcliffe, Colgate University, Hamilton, NY, USA © 2006 Elsevier Ltd. All rights reserved.
Ernst Wilhelm Ritter von Brücke (b. June 6, 1819, d. January 7, 1892), not to be confused with his grandson and biographer, Ernst Theodor Brücke, was a physiologist who taught for 46 years, 41 of them as Professor of Physiology at the University of Vienna (1849–1890). Brücke was classically educated, which gave him a thorough foundation in Greek and Latin and a broad humanistic interest in languages and learning. Then he studied medicine, primarily in Berlin with physiologist Johannes Müller, who encouraged him to apply his physiological expertise to developing a natural system of speech sounds with which all the world’s languages could be described (Jankowsky, 1999: 247). It is his groundbreaking works in this field, most especially Grundzüge der Physiologie und Systematik der Sprachlaute für Linguisten und Taubstummenlehrer (1856), for which he is remembered in linguistic, and particularly phonetic, history, although he had 140 publications in all. In his first work on speech physiology, Untersuchungen über die Lautbildung und das natürliche System der Sprachlaute (1849), Brücke laid the foundation for Grundzüge, painstakingly describing various pronunciations of all the vowels and consonants and arranging them in a system according to genetic criteria, introducing terms such as alveolar and dental still in use today. Apparently, Brücke was unaware of Ellis’s Essentials of phonetics (1848) as he wrote it (see Ellis, Alexander John (né Sharpe) (1814–1890)), so he essentially developed a natural system of speech sounds with no help from predecessors, except for some observations by Kempelen (see Kempelen, Wolfgang von (1734–1804)). In the 7 years between Untersuchungen and Grundzüge, Brücke deepened his observation of languages with help from colleagues in Vienna, including Miklosich (see Miklošič, Franc (1813–1891)) for Slavic and Anton Hassan for Arabic languages. He also realized a further practical application for his now ‘fine-tuned’ system: as a tool for teachers of the hearing-impaired (Jankowsky, 1999: 246). Although Brücke’s Grundzüge was superseded by Sievers’s Grundzüge der Lautphysiologie in 1876 (see Sievers, Eduard (1850–1932)), Brücke’s work provided the physiological description linguists had lacked and laid the foundation for this later work. He was extremely influential on Sievers and his generation, and also on Sigmund Freud. Scherer (see Scherer, Wilhelm (1841–1886)) and Sweet (see Sweet, Henry (1845–1912)) are among those who used and praised his work.
See also: Ellis, Alexander John (né Sharpe) (1814–1890); Freud, Sigmund (1856–1939); Kempelen, Wolfgang von (1734–1804); Miklošič, Franc (1813–1891); Scherer, Wilhelm (1841–1886); Sievers, Eduard (1850–1932); Sweet, Henry (1845–1912).
Bibliography
Brücke E W (1849). ‘Untersuchungen über die Lautbildung und das natürliche System der Sprachlaute.’ Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften Wien, Math.-Naturwiss. Klasse. 2, 181–208.
Brücke E W (1856). Grundzüge der Physiologie und Systematik der Sprachlaute für Linguisten und Taubstummenlehrer. Vienna: C. Gerold & Sohn (2nd rev. edn., 1876).
Brücke E T (1928). Ernst Brücke. Vienna: Julius Springer.
Jankowsky K (1999). ‘The works of Ernst Wilhelm Brücke (1819–1892) and Johann N Czermak (1828–1873): Landmarks in the history of phonetics.’ In Cram D et al. (eds.) History of linguistics 1996, vol. II: From classical to contemporary linguistics. Amsterdam: Benjamins. 241–255.
Brugmann, Karl (1849–1919) K R Jankowsky, Georgetown University, Washington, DC, USA © 2006 Elsevier Ltd. All rights reserved.
Karl Friedrich Christian Brugmann (see Figure 1) was born on March 16, 1849 in Wiesbaden. He graduated from high school in his home town in 1867, studied philology for one year in Jena, then decided to move on to Leipzig, where he selected as his major subject comparative philology, with Georg Curtius (see Curtius, Georg (1820–1885)) becoming his principal teacher. His doctoral thesis of 1871, entitled De Graecae linguae productione suppletoria, was followed in 1877 by his Habilitationsschrift, Zur Geschichte der Nominalsuffixe -as-, -jas- und -vas- (published in Zeitschrift für
vergleichende Sprachforschung 24: 1–99), qualifying him as a university teacher. Before that he had obtained his Staatsexamen in 1872 and taught high school, for one year in Wiesbaden, then at the Nicolai-Schule in Leipzig. In 1877 he started his university career, first lecturing on Sanskrit and comparative philology in Leipzig, where he was appointed associate professor (Extra-Ordinarius) in 1882. Two years later he went to Freiburg as full professor (Ordinarius) and stayed there for three years, only to return to Leipzig for good in 1887 to occupy the newly established chair for Indo-European philology. He died in Leipzig on June 29, 1919. Brugmann was the most productive of the Neogrammarians and undoubtedly also the one who commanded the greatest influence on language students who streamed to Leipzig from all over the world. Apart from the enormous impact through the substance of his writings, his unparalleled success was to a large extent also due to his unique personal style and courage. As a young man of 27, he broke with his mentor Curtius, who could not approve of Brugmann’s impatient zest for exploring even less conventional avenues if he thought that it furthered the advancement of his science. And from then on he embarked on a course of quiet but determined confrontation. This frame of mind was the basis for the second feature that aided his climb to unprecedented prominence. Within a short time after he entered the teaching profession, Brugmann managed to establish himself as an arbiter of what should be acceptable as solid linguistic achievement and what would have to be discarded as insignificant. When he sat in judgment, his criteria were derived from solid research that he and his Neogrammarian comrades-in-arms kept supplying in abundance. Even before Brugmann had started teaching at the university level, his first two major studies, written in 1876, brought him recognition and fame that continued to grow rapidly with every work he produced. Of the approximately 400 titles in his list of publications, two above all others were instrumental in solidifying his reputation as the unchallenged leader in the field, and even today those works remain achievements that need to be consulted. The first is the Griechische Grammatik of 1885, a model treatment of all components of grammar from a comparative point of view; the second, his Grundriss der vergleichenden Grammatik der indogermanischen Sprachen (vols. 1, 2, 6). Following Franz Bopp (see Bopp, Franz (1791–1867)) and August Schleicher (see Schleicher, August (1821–1868)), he was the third scholar to attempt a comprehensive documentation of what comparative linguistics had accomplished at his particular time. To achieve this monumental task, he had to restrict himself to phonology and morphology of the eight principal Indo-European languages, leaving it to Berthold Delbrück (see Delbrück, Berthold (1842–1922)) to deal with syntax (vols 3–5). As was characteristic of all Neogrammarian scholars, Brugmann was truly fascinated by discovering and securing as many relevant linguistic facts as he possibly could. But he, as did most of his Neogrammarian friends, went far beyond the mere amassing of facts in that he successfully attempted to arrive at the formulation of the basic principles that govern those facts and place them within a coherent system.
Figure 1 Photograph of Karl Brugmann from Idg. Jahrbuch IV (1918). Retrieved from http://titus.uni-frankfurt.de/personal/galeria/brugmann.htm.
See also: Bopp, Franz (1791–1867); Curtius, Georg (1820–1885); Delbrück, Berthold (1842–1922); Neogrammarians; Schleicher, August (1821–1868).
Bibliography
Brugmann K (1876). Nasalis sonans in der indogermanischen Grundsprache. In Curtius G (ed.) Studien zur griechischen und lateinischen Grammatik 9. Leipzig: S. Hirzel. 285–338.
Brugmann K (1876). Zur Geschichte der stammabstufenden Deklinationen. In Curtius G (ed.) Studien zur griechischen und lateinischen Grammatik 9. Leipzig: S. Hirzel. 361–406.
Brugmann K (1885). Zum heutigen Stand der Sprachwissenschaft. Strasbourg: Trübner.
Brugmann K (1885). Griechische Grammatik (Lautlehre, Flexionslehre und Syntax). In Müller I (ed.) Handbuch der klassischen Altertumswissenschaft 2. 1–126.
Brugmann K (1886–1893). Grundriss der vergleichenden Grammatik der indogermanischen Sprachen, vol. 1: Einleitung und Lautlehre, vol. 2: Wortbildungslehre, vols 3–5: (1893–1900). Vergleichende Syntax der indogermanischen Sprachen, Parts 1–3. 6. Indices (Wort-, Sach- und Autorenindex). Strasbourg: Trübner.
Brugmann K (1894). Die Ausdrücke für den Begriff der Totalität in den indogermanischen Sprachen, eine semasiologisch-etymologische Untersuchung. Leipzig: Edelmann.
Brugmann K (1902–1904). Kurze vergleichende Grammatik der indogermanischen Sprachen. Auf Grund des fünfbändigen Grundrisses der vergleichenden Grammatik der indogermanischen Sprachen von Karl Brugmann und Berthold Delbrück, verfasst von Karl Brugmann (3 vols). Strasbourg: Trübner.
Brugmann K & Osthoff H (1878–1910). Morphologische Untersuchungen auf dem Gebiete der indogermanischen Sprachen (6 vols). Leipzig: S. Hirzel.
Brugmann K & Streitberg W (eds.) (1891–). Indogermanische Forschungen. Zeitschrift für indogermanische Sprach- und Altertumskunde. Strasbourg: Trübner.
Förster M (1919). ‘Worte der Erinnerung an Karl Brugmann.’ Indogermanisches Jahrbuch [für 1918] 6, vii–x.
Jankowsky K R (1972). The Neogrammarians: a re-evaluation of their place in the development of linguistic science. The Hague: Mouton.
Sommer F (1955). ‘Karl Friedrich Christian Brugmann.’ In Neue Deutsche Biographie, vol. 2. Berlin: Duncker and Humblot. 667.
Streitberg W (1919). ‘Karl Brugmanns Schriften, 1871–1909.’ Indogermanische Forschungen 26, 425–440.
Streitberg W (1919). ‘Karl Brugmann.’ Indogermanisches Jahrbuch 7, 143–148. (Repr. in Sebeok T A (ed.) Portraits of linguists, vol. 1. Bloomington, IN: Indiana University Press).
Streitberg W (1919). ‘Karl Brugmanns Schriften, 1909–1919.’ Indogermanisches Jahrbuch 7, 148–152.
Bruneau, Charles (1883–1969) D Candel, CNRS and University of Paris, Paris 7, France © 2006 Elsevier Ltd. All rights reserved.
Charles Bruneau graduated from the Sorbonne and then followed Gilliéron’s classes on dialectology at Ecole des Hautes Etudes. He received a position at the University of Nancy in 1913 and at the Sorbonne in 1933. In 1934 he succeeded his former professor, Ferdinand Brunot, as chair of History of French Language and remained there until 1954. Born in the French Ardennes, Bruneau focused his thesis on the local dialects in 93 villages of this region (Etude phonétique des patois d’Ardenne). In 1912 he was called by Ferdinand Brunot to participate in the ‘Archives de la parole,’ the first institutional oral survey. This was based on using a Pathé phonograph to collect phonograms. An automobile to transport their 500 kilograms of recording equipment was even made available to them, quite an innovation at that time. Bruneau first used the method of questionnaires learned from Edmont and Gilliéron for the Atlas linguistique de la France, asking for translations of words and sentences into the local dialect. This yielded a large body of speech data. As he explained in several letters written to Brunot (available at the Archives du département de l’audiovisuel from the Bibliothèque Nationale de Paris), he then changed his method, preferring to analyze his speakers’ free speech and accents instead of comparing the words or phrases he solicited from them. Brunot and Bruneau finally gathered 166 recordings that they filed following the model of the Viennese ‘Phonogrammarchiv,’ marking the geographical particularities of the samples, as well as some biographical data describing the speakers, and giving a phonetic transcription of the records. After he settled in Paris, Bruneau contributed to Ferdinand Brunot’s monumental Histoire de la langue française des origines à nos jours by writing ‘L’Epoque romantique’ (covering the period 1815 to 1852) and ‘L’Epoque réaliste’ (covering the period 1852 to 1886). These two volumes mostly describe the history of literary language through stylistic monographs, unlike the rest of Ferdinand Brunot’s work. Around 1952 Bruneau, arguing against Spitzer’s stylistic criticism, made a distinction between pure (or scientific) stylistics, and stylistics applied to literature (or authors’ stylistics), the former being part of language science.
See also: Brunot, Ferdinand (1860–1938); Dialect Atlases; Gilliéron, Jules (1854–1926); Spitzer, Leo (1887–1960).
Bibliography
Bruneau C (1912). La Conservation des patois ardennais. Paris: Champion.
Bruneau C (1913). Etude phonétique des patois d’Ardenne. Paris: Champion.
Bruneau C (1914–1926). Enquête linguistique sur les patois d’Ardenne (2 vols). Paris: Champion.
Bruneau C (1948). ‘L’Epoque romantique (1815–1852).’ In Brunot F (ed.) Histoire de la langue française des origines à 1900, part 12. Paris: Armand Colin.
Bruneau C (1951). ‘La Stylistique.’ Romance Philology V–1, 1–14.
Bruneau C (1953). ‘L’Epoque réaliste (1852–1886).’ In Brunot F (ed.) Histoire de la langue française des origines à 1900, part 13. Paris: Armand Colin.
Bruneau C (1955–1958). Petite histoire de la langue française (2 vols). Paris: Armand Colin.
Chevalier J-C (1994). ‘F. Brunot (1860–1937), la fabrication d’une mémoire de la langue.’ Langages 114, 54–68.
Brunei Darussalam: Language Situation C H Gallop © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 1, pp. 419–420, © 1994, Elsevier Ltd.
Brunei Darussalam is an independent sultanate on the northwest coast of Borneo. There are four districts: Brunei Muara (where the capital is situated), Belait, Tutong, and Temburong (an enclave in the adjoining Malaysian state of Sarawak). The population is 75% Malay, with the largest minority group being Chinese (15%). The official language is Bahasa Melayu (Standard Malay), the variety jointly accepted by Malaysia, Indonesia, and Brunei Darussalam. It is widely used in all areas of public life, including government, the printed media, and broadcasting. English is also understood and spoken in various domains of public and private life. The education system is bilingual, with academic subjects, including the sciences, taught in English, while culturally based subjects and Islamic religion are in Malay. Malay is now written in romanized form, but learning of the original Jawi script, an adaptation from Arabic, is currently being revived in schools. Three Islamic religious institutions use a combination of Arabic, Malay, and English. The majority of students at the University of Brunei Darussalam (opened in 1985) study in English. Seven communities are accepted by the 1959 constitution as being indigenous groups of the Malay
race. Brunei Malay is the dialect of the numerically and politically dominant people of that name who have traditionally lived on water. Kedayan is the dialect of the land-dwelling farmers. These two variant dialects of the Brunei Muara district are each about 80% cognate with Standard Malay. The other five communities, the Tutong, Belait, Dusun, Bisaya, and Murut, speak the languages of their names. The Bisaya and Murut reside only in Temburong. All five languages are less than 40% cognate with Brunei Malay. Their use is declining in favor of Brunei Malay due to population mobility, intermarriage, and, for Dusun, Bisaya, and Murut, conversion to Islam. Iban, Penan, and Mukah are Sarawak languages spoken only by small groups of settled immigrants. The urban Chinese use mainly the Hokkien, Cantonese, and Hakka dialects. A national ideology, the Malay Islamic Monarchy concept, is being promoted in education and public life. It characterizes Brunei Darussalam as a monocultural, Malay-speaking, Islamic society, and is contributing to the decline of minority languages. See also: Malay.
Bibliography
Jones G, Martin P & Ożóg C (in press). Bilingualism and national development in Brunei Darussalam. In Jones G & Ożóg C (eds.) Bilingualism and national development. Clevedon: Multilingual Matters.
Nothofer B (1991). The languages of Brunei Darussalam. Pacific Linguistics.
Bruner, Jerome Seymour (b. 1915) W A Hass, St Peter, MN, USA © 2006 Elsevier Ltd. All rights reserved.
Jerome Bruner is a pre-eminent, powerfully influential, and multiply pioneering psychologist whose studies have touched on language in many ways. Born in New York City of nominally observant Jewish parents, he was blind for the first two years of life. His father, a watchmaker, died when he was 12, after which his family moved about, and he developed his interest in sailing. He attended Duke University (B.A., 1937), and went to Harvard for graduate study in psychology (Ph.D., 1941). He served in the U.S. Army Intelligence Corps in World War II, applying his knowledge of propaganda and public opinion. He then joined the Harvard faculty, doing experimental psychological research on the dependence of perception on motivation (the ‘New Look’ studies). He carried out an enormously influential series of studies on the acquisition of structured cognitive categories, involving the grouping of instances on the basis of values of defining attributes. He founded and directed the Center for Cognitive Studies in the 1960s, leading the way toward the field’s mentally realistic engagement with information processing. He left Harvard for the Watts Professorship of Experimental Psychology, at Wolfson College, Oxford University, in 1972, personally sailing across the Atlantic to assume his new role, and then returning to the United States in 1979. He has subsequently taught at the New School for Social Research (New York City) and the New York University School of Law. His research interests progressed from cognitive to developmental and educational psychology, where his thinking has galvanized our understanding of pedagogical practice, and then on to cultural and narrative psychology, providing specimen studies exemplifying another novel and productive paradigm for human studies. 
His abiding interest in language began with his work on wartime propaganda and public opinion, more or less by-passed his perceptual experiments, and continued with his analyses of conceptual categorization. Despite personal contacts with George A. Miller, Noam Chomsky, Roman Jakobson, and Roger Brown, he never became enamored of the structural properties of syntactic or phonological grammars. For instance, it was Brown who added a 65-page appendix to A study of thinking, indicating how attributes and categorizations are involved in speech and linguistic meaning. He became interested in language development, as epitomizing a symbolic, in contrast to iconic or enactive, system of representation and cultural ‘amplifier’ of reflective, in contrast to sensory or motoric, capacities. From language as an instrument of thought, based on semantic representation, he went on to treat language development in terms of communicative (or pragmatic) function. This perspective allowed him to combine prior American pragmatic and generative influences with Oxfordian ‘ordinary language’ work on Austinian illocutionary acts, as well as Tinbergen’s ethology, in his delineation of the ontogenesis of speech acts. He went on to the formulation of the Language Acquisition Support System, through which speakers became participants in particular linguistic communities or subcommunities. In this way, he could emphasize not only cultural influences, but also narrative structures and processes. He based his account of stories on Kenneth Burke’s pentad (Agents, Actions, Goals, Instruments, Settings – plus Trouble!), as well as other structural accounts and hermeneutic modes of human intentionality. Narratives, thus conceived, have properties that allow construction of personal selves/identities, as well as human institutions (such as schools and legal systems), and provide for the dynamics of individual development/aging and of human history (the passing forward of culture). His prolific career has provided contributions sufficient to suggest several approaches to the role of language in the human sciences, but reflects a general trend from the application of his original experimental psychology to his more recent anthropological/interpretive stance. He has managed to combine an engagement in current political, educational, and legal issues, as they have become crucial over the decades, with a principled perspective stemming from meaning-based psychological research and theory, which he has helped shape and vitalize. His has not been a ‘school of psycholinguistics,’ but rather a schooling of psychological engagement in human language.
See also: Brown, Roger William (b. 1925); Chomsky, Noam (b. 1928); Jakobson, Roman (1896–1982); Psycholinguistics: Overview; Tinbergen, Niko (1917–1988).
Bibliography Bruner J (1964). ‘The course of cognitive growth.’ American Psychologist 19, 1–15. Bruner J (1975). ‘From communication to language: a psychological perspective.’ Cognition 3, 255–287. Bruner J (1983). Child’s talk: learning to use language. New York: Norton.
Bruner J (1986). Actual minds, possible worlds. Cambridge, MA: Harvard University Press. Bruner J (1991a). Acts of meaning. Cambridge, MA: Harvard University Press. Bruner J (1991b). ‘The narrative construction of reality.’ Critical Inquiry 18, 1–21.
Bruner J (1996). The culture of education. Cambridge, MA: Harvard University Press. Bruner J (2002). Making stories: law, literature, life. New York: Farrar, Straus & Giroux. Bruner J, Goodnow J J & Austin G A (1956). A study of thinking. New York: Wiley.
Brunot, Ferdinand (1860–1938) D Candel, CNRS and University of Paris 7, Paris, France © 2006 Elsevier Ltd. All rights reserved.
Alumnus of Ecole Normale Supérieure (1879), Ferdinand Brunot ranked first in the ‘Agrégation de grammaire’ competition of 1882, received his doctorate in 1891, and obtained a tenured position at the Sorbonne in 1900, in a new chair in ‘History of French Language’ specifically created for him. At Ecole Normale he was trained by the philologists Gaston Paris and Arsène Darmesteter. Under the influence of Clédat he worked on ‘patois’ and on spelling. His friends were Baudrillart, Bergson, Jaurès, and the sociologist Durkheim. Brunot became politically engaged and was a convinced ‘republican.’ He was on the side of Dreyfus, helped found the Human Rights League and later became a mayor of Paris. He created the monumental Histoire de la langue française. He wrote the first 10 parts in 18 volumes, over 10 000 pages: the emphasis is placed on language as a human and social phenomenon, on historical philology, and on both the internal history and the external geography of language, while proposing the first steps of a social lexicology. This work reached 26 volumes; the latest ones, covering the period 1870 to 2000, were published by CNRS. Brunot’s other substantial work is La Pensée et la langue (1922), a French grammar including oral usages that are based on a classification of ideas more than of signs, forms, and grammatical categories. This well-recognized philologist, who was a member of the ‘Académie des Inscriptions et Belles Lettres’ and of the ‘Académie Royale de Belgique,’ and who also inspired the teaching of stylistics in France, got some fresh reactions from linguists: Bourciez and
Meillet criticized the large number of facts and the absence of a system; Meillet and Bally, the process of starting from concepts to get to the language; Bally, the mixing of syntagm and paradigm, and of synchrony and diachrony. As a polemist, Brunot published a critique of the 1932 French Grammar of the Académie Française. As an innovator, he strongly encouraged spelling reform. He also became involved in training teachers of French as a foreign language. Brunot created the Institut de Phonétique of the University of Paris and the ‘Archives de la parole’ (1911), in order to record and store samples of ‘patois’ (from Ardennes, with Bruneau, and from Berry Limousin). He also recorded and stored samples of average Parisians’ speech and politicians’, poets’, and actors’ voices (Barrès, Dreyfus, and Apollinaire). This was the beginning of the Département de l’audiovisuel at the Bibliothèque Nationale de France. See also: Bruneau, Charles (1883–1969); Meillet, Antoine (Paul Jules) (1866–1936).
Bibliography Brunot F (1905–1968). Histoire de la langue française des origines à nos jours (13 parts, 23 vols). Paris: Armand Colin (1985–2000) 3 vols. Paris: CNRS Editions. Brunot F (1922). La Pensée et la langue. Paris: Masson. Brunot F (1932). Observations sur la grammaire de l’Académie française. Paris: Droz. Chevalier J-C (1992). ‘L’Histoire de la langue française de F. Brunot.’ In Nora P (ed.) Les Lieux de mémoire. Paris: Gallimard. 420–459. Chevalier J-C (1994). ‘F. Brunot (1860–1937), la fabrication d’une mémoire de la langue.’ Langages 114, 54–68.
Buck, Carl Darling (1866–1955) R D Greenberg, Yale University, New Haven, CT, USA © 2006 Elsevier Ltd. All rights reserved.
Born in Bucksport, Maine in 1866, Carl Darling Buck was an expert in historical linguistics, especially the history of the Italic branch of Indo–European. Having completed his undergraduate studies at Yale in 1886, he continued his graduate studies there for another 3 years. He then moved to Europe, where he studied classical languages in Athens (1887–1889) and Leipzig (1889–1892). Buck joined the University of Chicago in the year of its founding, 1892, as a professor of Sanskrit and Indo-European comparative philology. He remained at this post until his retirement in 1933, when he became professor emeritus. In his retirement, Buck remained active in his scholarship, working on his reverse dictionary of Greek nouns and adjectives co-authored with Walter Petersen (1945) and his dictionary of synonyms in the Indo–European languages (1949). In 1954, the University of Chicago honored him as one of three surviving individuals who had been at the University since its founding. Buck died the following year. During his remarkable career, Buck made significant contributions in the areas of the Oscan and Umbrian languages, including a study of these languages’ vocalic systems (1892) and verbal forms (1895). Together with William Gardner Hale, he produced a grammar of Latin (1903), and in the next year his grammar of Oscan and Umbrian appeared. Before the end of the first decade of the 20th century, Buck had also produced studies of ancient Greek dialects. However, many students of Indo–European
linguistics may know Buck best through his comparative grammar of Greek and Latin (1933) and his dictionary of Indo–European synonyms (1949). These works have endured as classic reference books in historical linguistics. See also: Greek, Ancient; Indo–European Languages;
Proto-Indo–European Syntax.
Bibliography Buck C D (1892). Der Vocalismus der oskischen Sprache. Leipzig: K. F. Koehler’s Antiquarium. Buck C D (1895). The Oscan-Umbrian verb-system. Chicago: University of Chicago Press. Buck C D (1903). A sketch of the linguistic conditions of Chicago. Chicago: The University of Chicago Press. Buck C D (1904). Grammar of Oscan and Umbrian. Boston: Ginn and Company. Buck C D (1905). Elementarbuch der oskisch-umbrischen Dialekte. Heidelberg: C. Winter. Buck C D (1910). Introduction to the study of the Greek dialects: grammar, selected inscriptions, glossary. Boston: Ginn and Company. Buck C D (1933). Comparative grammar of Greek and Latin. Chicago: University of Chicago Press. Buck C D (1949). A dictionary of selected synonyms in the principal Indo–European languages: a contribution to the history of ideas. Chicago: University of Chicago Press. Buck C D & Hale W G (1903). A Latin grammar. New York: Mentzer, Bush. Buck C D & Petersen W (1945). A reverse index of Greek nouns and adjectives, arranged by terminations with brief historical introductions. Chicago: University of Chicago Press.
Buddhism, Japanese I Reader © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 1, p. 423–424. © 1994, Elsevier Ltd.
Buddhism has had major linguistic influences in Japan, ranging from the development of classical Japanese literary forms and new syllabaries to the use of a wide vocabulary that has, in the course of time, become part of standard Japanese usage. It entered Japan along with many other facets of continental Asiatic culture from the 6th century onward. Along with Buddhism, the most significant cultural influence that entered Japan in this period was the Chinese
writing system, which the Japanese adopted. It was through the medium of Buddhist texts written in Chinese that the Japanese originally encountered and studied Buddhist thought. This Chinese orientation provided the predominant lens through which the Japanese viewed and learned about Buddhism. Until the 19th century, no systematic attempts were made to study such root languages of Buddhism as Sanskrit or Pali or to study early pre-Chinese Buddhist texts. There was, however, in the work of the 18th century writer Tominaga Nakamoto, a recognition of the conditional nature of language, dependent on the time in which it was used, the form of expression used, and the intent of the user. Tominaga’s studies of Buddhist texts and of the varying ways
that Sanskrit terms had been translated, in different eras and by different authors, into Chinese enabled him to come to an understanding of the relative nature of language and of religious forms and to develop the foundations of a critical scholarship of religion and of Buddhist texts. The entry of Buddhism into Japan led the Japanese to study Chinese culture in depth and, especially, to embark on an intensive study of the Chinese language and its writing system so as to facilitate study of the new religion. Consequently, the entry of Buddhism was a powerful spur toward the Japanese adoption of the Chinese writing system. It also proved to be a major medium for its gradual transformation into a Japanese system, for in order to make the texts more readily comprehensible they were read in a Japanese syntactical style that applied Japanese pronunciations to the ideograms. Even in the early 1990s, Japanese Buddhist priests intoned, with Japanese pronunciations, texts written in Chinese. Because few understand Buddhist Chinese, this has led to a situation in which most priests and worshippers do not understand the meanings of the texts they recite. Large numbers of contemporary Japanese translations and commentaries on Buddhist texts are, however, available to explain such texts. Buddhism played a major part in the development and use of the phonetic syllabary katakana, which forms an intrinsic part of the written language along with Chinese ideograms. Katakana developed as a mnemonic device for providing Japanese readings and pronunciations of Chinese Buddhist texts and was widely used in the temples of Nara, the ancient capital, by the 8th century. Its use was subsequently diversified into the world of literature and eventually into general use. Although the earliest usages and development of the other phonetic syllabary, hiragana, occurred outside of Buddhist temples, it is clear that this syllabary was also widely used in the
Buddhist world by the 10th century, and this further stimulated the emergence of an authentic written language combining ideograms and phonetic script. Many Buddhistic words still in use in the late 20th century (e.g., the term issai shujô, ‘all sentient beings’) derive from the earliest wave of assimilation in the 6th century: although changes in the dominant form of Chinese in the 7th century affected the ways that the Japanese pronounced most Chinese ideograms, the Buddhist temples resisted this change and continued for the most part to preserve the earlier forms. For example, the ideogram meaning ‘being, existence’ is more commonly pronounced sei, but in Buddhist contexts generally retains the earlier pronunciation jô. To this extent, Buddhist Japanese as used in rituals has a rather archaic feeling compared to standard Japanese. Nonetheless, many standard Buddhist terms based on these earlier forms of pronunciation, for example, jigoku (‘hell’) and gokuraku (‘heaven’), have become everyday terms still extant in contemporary Japanese and not limited to Buddhistic usage. Because of its close relationship with the Chinese language, which formed the basis of the Japanese writing system and which added immeasurably to the Japanese vocabulary of the time, Buddhism has thus played an instrumental role in the evolution of the Japanese language and in augmenting its scope, both orally and in written forms. See also: Buddhism, Tibetan; South Asia: Religions.
Bibliography Matsunaga D & Matsunaga A (1978). Foundation of Japanese Buddhism (2 vols). Los Angeles: Buddhist Books International. Miller R A (1967). The Japanese language. Tokyo: Charles Tuttle. Nakamoto T (1990). Emerging from meditation, trans. with an introduction by Pye M. London: Duckworth.
Buddhism, Tibetan J Powers, Australian National University, Canberra, Australia © 2006 Elsevier Ltd. All rights reserved.
During the 7th century, Tibet was an expanding military power. Its armies conquered large parts of China’s Central Asian territories, but as it came into contact with more advanced civilizations, Tibet became increasingly aware of its cultural backwardness. One consequence of this was the decision by
King Songtsen Gampo (ca. 618–650) to send his advisor Tönmi Sambhota to India to develop a written script for the Tibetan language. With the help of Indian pandits, he developed a script based on north Indian models that is still used today. Although Tibetan histories written centuries later by Buddhist clerics present Songtsen Gampo as an emanation of the Buddha Avalokiteśvara whose primary goal in life was to spread Buddhism, there is little evidence that he had any real interest in the religion. Some of his successors, however, became
devout Buddhists, and they began the process of translating the Indian Buddhist canon into Tibetan. The Tibetan language is part of the Tibeto-Burman language family, and aside from the codifications of Tönmi Sambhota (who adopted case endings and other features of Sanskrit), there is no linguistic similarity between it and Indic languages. As a result, Tibetan translators of Sanskrit texts decided to create a specialized vocabulary, which was codified in standard lexicons, as well as an artificial grammar and sentence structure for their translations that reflected those of Sanskrit Buddhist texts. Contemporary Tibetan scholars make a distinction between two forms of Tibetan: (1) ‘dharma language’ (chos skad), which is the canonical language devised for scriptural translation, and (2) ‘colloquial language’ (phal skad), which is used for day-to-day conversation among Tibetans. Both share the same grammar and syntax, but differ in their vocabulary. Largely due to the sponsorship of Tibetan governments, a vast corpus of Indic Buddhist texts was translated into Tibetan, and in the 14th century the scholar–monk Pudön (1290–1364) compiled a standard canon, which is still normative today. The ‘first dissemination’ (snga dar) of Buddhism to Tibet began in the 7th century under the patronage of Tibet’s kings, but stalled after the early dynasty fell. In the 11th century, the Indian master Atiśa (982–1054) traveled to Tibet and initiated the ‘second dissemination’ (phyi dar) of Buddhism. His mission was so successful that from this point forward, Buddhism became the dominant religion in the region.
Tibetan Buddhism historically derived from two primary sources: (1) the monastic universities of northern India and (2) tantric lineages based mainly in Bengal and Bihar. One of the most important Tibetan innovations was the system of reincarnating lamas (sprul sku), the most important of which is the Dalai Lama lineage. This began in the 16th century, when Sönam Gyatso (1543–1588) was given the title ‘Ta le’ (‘Ocean’) by the Mongol chieftain Altan Khan. The fifth Dalai Lama, Ngawang Losang Gyatso (1617–1682), was made ruler of Tibet in 1642 with the help of Mongol supporters, and his successors remained the temporal and religious leaders of the country until 1950, when troops of the Chinese People’s Liberation Army invaded the country. In 1959, the 14th Dalai Lama, Tenzin Gyatso (1935–), fled into exile in India. Since that time, Chinese control over religion has intensified, and currently there is no real religious freedom in Tibet. Buddhism has been revived in exile, however, and study and practice thrive in Tibetan exile communities in South Asia. See also: Buddhism, Japanese; Sanskrit; South Asia: Religions; Tibetan.
Bibliography Powers J (1995). Introduction to Tibetan Buddhism. Ithaca: Snow Lion. Smith E G (2001). Among Tibetan texts: History and literature of the Himalayan Plateau. Boston: Wisdom. Wilson J B (1992). Translating Buddhism from Tibetan. Ithaca: Snow Lion.
Bu¨hler, Karl (1879–1963) M Bednarek, University of Augsburg, Augsburg, Germany ! 2006 Elsevier Ltd. All rights reserved.
Karl Bu¨ hler (see Figure 1) was born in Germany (Meckesheim) in 1879 and, after gaining a doctorate in philosophy and medicine, started work as an assistant to Oswald Ku¨ lpe, a psychologist in Wu¨ rzburg. Later he worked as a professor of psychology in Dresden and Vienna (1922–1938), before having to emigrate to the United States in 1938, where he lived in Los Angeles from 1945 until his death in 1963. In linguistics Bu¨ hler is most famous for his work on deixis and on language function, but he also
published work on developmental psychology, language comprehension, and human cognition as well as on other linguistic phenomena such as phonology, syntax, morphology, and stylistics. Bu¨ hler postulated four axioms for linguistics, which were concerned with (1) the organon model of language, (2) the sign status of language (by virtue of abstract features), (3) the field structure of language (Bu¨ hler united von Humboldt’s dichotomy of ergon and energeia and de Saussure’s distinction between langue and parole in his Vierfelderschema of language as Sprechhandlung, Sprachwerk, Sprechakt, and Sprachgebilde), and (4) the fact that language is a system of two ‘classes’ (Zweiklassensystem), namely semantics and syntax. Of these four axioms, it was especially the organon model that was influential.
Bu¨hler, Karl (1879–1963) 145
devout Buddhists, and they began the process of translating the Indian Buddhist canon into Tibetan. The Tibetan language is part of the Tibeto-Burman language family, and aside from the codifications of Tönmi Sambhota (who adopted case endings and other features of Sanskrit), there is no linguistic similarity between it and Indic languages. As a result, Tibetan translators of Sanskrit texts decided to create a specialized vocabulary, which was codified in standard lexicons, as well as an artificial grammar and sentence structure for their translations that reflected those of Sanskrit Buddhist texts. Contemporary Tibetan scholars make a distinction between two forms of Tibetan: (1) ‘dharma language’ (chos skad), which is the canonical language devised for scriptural translation, and (2) ‘colloquial language’ (phal skad), which is used for day-to-day conversation among Tibetans. Both share the same grammar and syntax, but differ in their vocabulary. Largely due to the sponsorship of Tibetan governments, a vast corpus of Indic Buddhist texts was translated into Tibetan, and in the 14th century the scholar–monk Pudön (1290–1364) compiled a standard canon, which is still normative today. The ‘first dissemination’ (snga dar) of Buddhism to Tibet began in the 7th century under the patronage of Tibet’s kings, but stalled after the early dynasty fell. In the 11th century, the Indian master Atiśa (982–1054) traveled to Tibet and initiated the ‘second dissemination’ (phyi dar) of Buddhism. His mission was so successful that from this point forward, Buddhism became the dominant religion in the region.
Tibetan Buddhism historically derived from two primary sources: (1) the monastic universities of northern India and (2) tantric lineages based mainly in Bengal and Bihar. One of the most important Tibetan innovations was the system of reincarnating lamas (sprul sku), the most important of which is the Dalai Lama lineage. This began in the 16th century, when Sönam Gyatso (1543–1588) was given the title “Ta le” (‘Ocean’) by the Mongol chieftain Altan Khan. The fifth Dalai Lama, Ngawang Losang Gyatso (1617–1682), was made ruler of Tibet in 1642 with the help of Mongol supporters, and his successors remained the temporal and religious leaders of the country until 1950, when troops of the Chinese People’s Liberation Army invaded the country. In 1959, the 14th Dalai Lama, Tenzin Gyatso (1935–), fled into exile in India. Since that time, Chinese control over religion has intensified, and currently there is no real religious freedom in Tibet. Buddhism has been revived in exile, however, and study and practice thrive in Tibetan exile communities in South Asia.
See also: Buddhism, Japanese; Sanskrit; South Asia: Religions; Tibetan.
Bibliography
Powers J (1995). Introduction to Tibetan Buddhism. Ithaca: Snow Lion.
Smith E G (2001). Among Tibetan texts: History and literature of the Himalayan Plateau. Boston: Wisdom.
Wilson J B (1992). Translating Buddhism from Tibetan. Ithaca: Snow Lion.
Bühler, Karl (1879–1963)
M Bednarek, University of Augsburg, Augsburg, Germany
© 2006 Elsevier Ltd. All rights reserved.
Karl Bühler (see Figure 1) was born in Germany (Meckesheim) in 1879 and, after gaining a doctorate in philosophy and medicine, started work as an assistant to Oswald Külpe, a psychologist in Würzburg. Later he worked as a professor of psychology in Dresden and Vienna (1922–1938), before having to emigrate to the United States in 1938, where he lived in Los Angeles from 1945 until his death in 1963. In linguistics Bühler is most famous for his work on deixis and on language function, but he also published work on developmental psychology, language comprehension, and human cognition as well as on other linguistic phenomena such as phonology, syntax, morphology, and stylistics. Bühler postulated four axioms for linguistics, which were concerned with (1) the organon model of language, (2) the sign status of language (by virtue of abstract features), (3) the field structure of language (Bühler united von Humboldt’s dichotomy of ergon and energeia and de Saussure’s distinction between langue and parole in his Vierfelderschema of language as Sprechhandlung, Sprachwerk, Sprechakt, and Sprachgebilde), and (4) the fact that language is a system of two ‘classes’ (Zweiklassensystem), namely semantics and syntax. Of these four axioms, it was especially the organon model that was influential.
Figure 1 Karl Bühler. With permission of the Archiv der Universität Wien.
The Organon Model
In his Organonmodell Bühler took up Plato’s proposal to explain language by virtue of the metaphor of language as a tool (Greek órganon) that serves distinct functions in society. He distinguished between three functions: Darstellung (‘representation’), Ausdruck (‘expression’), and Appell (‘vocative’, ‘appellative’). In very simple terms, language may be used for the representation of things (Dinge: Darstellung), for the expression of the speaker’s inner feelings/states (Sender: Ausdruck), and for influencing the hearer’s behavior (Empfänger: Appell). The linguistic sign is hence functionally very complex: it is a symbol by virtue of its representational function, a symptom (sign, index) by virtue of its dependence on the sender, and a signal by virtue of its appeal to the hearer. In a given instance, one of these functions may dominate the others: Darstellung dominates in scientific language, Ausdruck in poetic language, and Appell in military language, but this does not mean that the other functions are not present as well (Bühler speaks of Dominanzphänomene). Thus, an expression such as es regnet (‘it’s raining’) denotes a meteorological event (Darstellung). But by uttering it with different intonations, the speaker can also express his/her feelings (Ausdruck) or appeal to someone not to forget to bring an umbrella (Appell) (1982: 46). Because of Bühler’s occasionally imprecise style of writing, aspects of this model remain much-discussed in European linguistics. For instance, it is not clear to what extent means of Ausdruck are employed intentionally or subconsciously and to what extent they are conventionalized. Although some of his examples include conventional and intentional expressions, Ausdruck is defined by him as “freie oder gehemmte Entladung von Affekten” (1982: 352), i.e., the free or inhibited discharge of emotions, which seems to more or less subconsciously ‘mirror’ the speaker’s mental state or personality traits (Auer, 1999: 33). From a strict viewpoint, this function thus does not concern linguistics (Konstantinidou, 1997: 36; cf. also Péter, 1984: 245; Stankiewicz, 1964: 239f.); however, Bühler’s expressive function has also been interpreted as referring to intentional linguistic communication.
The ‘Field Theory’ of Language
Bühler also developed his own contextual language theory, the Zweifelderlehre (‘two-field theory’). Signs, he said, are not isolated entities but always occur in context, in a ‘field’ that can be deictic (Zeigfeld) or symbolic (Symbolfeld). As such, linguistic signs function as Feldgeräte. Representational symbols can be interpreted solely with the help of the Symbolfeld, but in order to assign meaning to utterances containing deictic signals (Zeigwörter), hearers need the extralinguistic context (the Zeigfeld). What is designated by expressions such as here, there, I, you changes according to the position of the speaker. Thus, it is the I, now, and here that establishes the deictic center (Bühler calls it the Ich-jetzt-hier-Origo). The matter is more complicated because there are several modi of pointing toward the context of utterance: ad oculos (reference to components of the current context), anaphorisch (reference to textual components), and am Phantasma (reference to fictional worlds). Bühler’s contribution to linguistics is enormous: apart from crucially influencing linguistic research (especially the Prague School) before World War II, his work has been of great importance in linguistics since the ‘pragmatic turn’ in the 1970s, although he failed to gain the deserved attention upon his emigration to the United States. His Organonmodell has been immensely significant in that it influenced the establishment of derived functional models from Roman Jakobson to Dell Hymes and M.A.K. Halliday. He is still being reinterpreted and analyzed within modern linguistics (e.g., Kubczak, 1984; Péter, 1984; Auer, 1999). Similarly, his comments on deixis, morphology, and metaphor have provided a stepping stone for modern approaches in these fields. His estate is currently being reorganized by the FDÖP (Forschungsstelle und Dokumentationszentrum für österreichische Philosophie).
See also: Functionalist Theories of Language; Halliday, Michael A. K. (b. 1925); Jakobson, Roman (1896–1982); Prague School.
Bibliography
Auer P (1999). Sprachliche Interaktion. Eine Einführung anhand von 22 Klassikern. Tübingen: Niemeyer.
Bühler K (1918). Die Geistige Entwicklung des Kindes. Jena: Fischer.
Bühler K (1933). Ausdruckstheorie. Das System an der Geschichte aufgezeigt. Jena: Fischer.
Bühler K (1969). Die Axiomatik der Sprachwissenschaften. Frankfurt a. M.: Vittorio Klostermann (shortened version of the original contribution to Kant-Studien 38, 1933).
Bühler K (1982). ‘The axiomatization of the language sciences.’ In Innis R E (ed.) Karl Bühler. Semiotic foundations of language theory. New York/London: Plenum. 91–164.
Bühler K (1982). Sprachtheorie. Die Darstellungsfunktion der Sprache (1st edn.: Jena, 1934). Stuttgart: Fischer.
Bühler K (1990). Theory of language: the representational function of language. (Foundations of semiotics 25) (Goodwin D F, trans.). Amsterdam: Benjamins.
Eschbach A (ed.) (1988). Karl Bühler’s theory of language. Amsterdam: Benjamins.
Konstantinidou M (1997). Sprache und Gefühl. Hamburg: Helmut Buske Verlag.
Kubczak H (1984). ‘Bühlers “Symptomfunktion”.’ Zeitschrift für Romanische Philologie 100, 1–25.
Musolff A (1990). Kommunikative Kreativität. Karl Bühlers Zweifelderlehre als Ansatz zu einer Theorie innovativen Sprachgebrauchs. Aachen: Alano.
Péter M (1984). ‘Das Problem des sprachlichen Gefühlsausdrucks in besonderem Hinblick auf das Bühlersche Organon-Modell.’ In Eschbach A (ed.) Bühler-Studien, vol. 1. Frankfurt/Main: Suhrkamp. 239–260.
Stankiewicz E (1964). ‘Problems of emotive language.’ In Sebeok T A (ed.) Approaches to semiotics. The Hague: Mouton. 239–264.
Bulgaria: Language Situation
A G Angelov, University of Sofia, Sofia, Bulgaria
© 2006 Elsevier Ltd. All rights reserved.
The Bulgarian language is spoken mainly in the central and eastern Balkans. There are also Bulgarian-speaking minorities or language islands in the regions of Banat (Catholic settlers on the territory northwest of the Balkans, cf. Stojkov, 1967), Bessarabia and Tavria (emigrants in Moldova and Ukraine, Stojanova, 1997), Albania (Hristova, 2003), northern Greece (Shklifov and Shklivova, 2003), and Turkey (Bojadzhiev, 1991). If, to this population, which identifies itself as Bulgarian, one adds the speakers of the Macedonian language – the majority of whom also considered themselves Bulgarians until 1944, when the Macedonian language was created (cf. Kočev et al., 1994; Bozhinov and Panayotov, 1978; Angelov, 2000) – the number of Bulgarian speakers may reach 10.5 million. Bulgarian is also used as a first language for educational purposes by various minorities in Bulgaria: Turks, Roma, Russians, Armenians, and Jews (Angelov and Marshall, 2005), although after signing the European Framework Convention for the Protection of National Minorities in 1999, the Bulgarian Ministry of Education started to include some minority languages in its educational programs. According to the last census (March 2001), Turks are the largest minority in Bulgaria (758 000 or 9.5% of the whole population). They are mainly rural (63%), concentrated in the northeast and southeast of the country, and speak Turkish dialects that probably differ from Standard Turkish. During the last 15 years, since the democratic changes, Turks have had their own media and a role in public administration. Roma (the last census records 365 797, but other sources give 650 000) are heterogeneous, differing in origin, religion, language, and dialect. Some Roma are Christians, others are Muslims (some of them speakers of Turkish); they live across the country in urban ghettoes or village neighborhoods. They have a huge number of nongovernment organizations (NGOs) but lack coordination and, unlike the Turks, do not have their own media. Russians (30 000), Armenians (14 000), and Jews (3000) are mainly urban populations, each with specific cultural traditions and religious institutions. The sociolinguistic situation in the country also includes some alleged smaller minority groups and confessional communities. The Pomaks, also called Bulgarian Mohammedans (Muslim Bulgarians), live in the Rhodopi Mountains. They speak an archaic Bulgarian dialect, although they started to learn Turkish in the 1990s as a step-mother tongue. The Gagaouz (a few villages in north-east Bulgaria), counter to the
Pomaks, are Christians who speak a Turkish dialect. Catholics (found in small towns and villages such as Chiprovtsi, Rakovski, and Bardarski Geran) are related to the Bulgarian minority of Banat in Romania. They use a specific archaic Bulgarian, which has a written norm based on the Latin alphabet and influenced by the Croatian liturgical tradition. Modern Bulgarian, spoken by 99% of the population of the country, including the bilingual minorities (10–15%), is a South Slavic language which stems from Old Church Slavonic (Old Bulgarian, according to Leskien, 1919; Mladenov, 1929; Vaillant, 1948; Duridanov, 1993). The beginning of the Bulgarian literary tradition was laid by the brothers SS Cyril and Methodius, who are now recognized as the inventors of the Glagolitic, an alphabet that corresponds accurately to the phonetic peculiarities of the Slavic languages. The New Testament was translated into this new literary language as early as the 9th century, along with a number of other liturgical books and mediaeval ecclesiastical writings. The Cyrillic alphabet was named after St. Constantine-Cyril the Philosopher, and its invention is ascribed to St. Clement of Ohrid (c. 830–916), the most prominent disciple of the two brothers, who was ordained by Bulgaria’s Tsar Simeon (893–927) ‘first bishop of the Bulgarian language’ in 893. Today Cyrillic is used in Bulgaria as well as Russia, Ukraine, and Serbia; it was also used until recently in Mongolia and some ex-Soviet republics. In the 16th, 17th, and 18th centuries the Bulgarian language changed significantly (Mirchev, 1963), although it preserved its rich vocabulary (Gerov, 1895) and script. Modern Bulgarian has a somewhat different grammar (Weigand, 1907; Beaulieux, 1933; Scatton, 1984; Hauge, 1999; Kotova and Yanakiev, 2001) and an analytic structure.
Bulgarian has nine tenses, a developed system of aspect and moods, intensive use of prepositions and impersonal constructions, but no case declension in the noun; the three main cases are preserved in the personal pronouns. Thus the grammatical structure of Bulgarian differs from that of the other Slavic languages (Ivanchev, 1988). There were probably significant dialectal differences on Bulgarian language territory as early as the Middle Ages, but the varied relief of the Balkans produced a considerable number of dialects within a comparatively small geographical region (Stojkov, 1993). There are two main spoken variants of Bulgarian, Eastern and Western, whose boundary can be traced along the northeast axis from the Danube River to the Aegean, i.e., parallel to the Black Sea coast but around 300 to 350 km inland in the central Balkans. Bulgarian dialects are still used in the highland villages of the Rhodopi Mountains and the Central
Balkan Range, as well as in the Rila, Pirin, and Strandja mountains. In urban conditions the traditional dialects have evolved into interdialects, influenced by the urban social environment and Standard Bulgarian (Videnov, 1990; Videnov and Angelov, 1999; Dimitrova, 2004). The modern Bulgarian literary standard was established in the late 19th century (Gyllin, 1991; Hill, 1992; Georgieva et al., 1989) on the basis of the Northeastern Bulgarian dialects, although the capital city, Sofia, lies in the western part of the country. This gap has resulted in some cultural and linguistic disproportions; however, they do not disrupt the continuum in communication. Actually there are no language barriers between the Slavic peoples even on a far larger geographical territory that covers the South and East Slavic languages. Various genres of fiction and poetry have developed in Modern Bulgarian, especially in the 19th and 20th centuries, as well as stylistic registers used in public administration, science, and education. Specific urban slang, professional jargons, and sociolects have also emerged naturally; e.g., ‘tarikatski jargon’ of Sofia (Armjanov, 1993) is still typical of the young generation, in the same way it was used in some urban folksongs during the 1920s.
See also: Analytic/Synthetic, Necessary/Contingent, and a priori/a posteriori; Balkans as a Linguistic Area; Bulgarian Lexicography; Bulgarian; Europe as a Linguistic Area; Language and Dialect: Linguistic Varieties; Language Education Policy in Europe; Leskien, August (1840–1916); Lingua Francas as Second Languages; Macedonia: Language Situation; Migration and Language; Minorities and Language; Moldova: Language Situation; Nationalism and Linguistics; Old Church Slavonic; Proto-Indo-European Morphology; Proto-Indo-European Phonology; Proto-Indo-European Syntax; Romania: Language Situation; Slavic Languages; Standard and Dialect Vocabulary; Teaching of Minority Languages; Ukraine: Language Situation.
Bibliography
Angelov A G (2000). ‘The political border as a factor for language divergence.’ In Zybatov L N (ed.) Sprachwandel in der Slavia. Frankfurt am Main: Peter Lang. 611–633.
Angelov A G & Marshall D F (eds.) (2005). ‘Overcoming minority language policy failure: The case for Bulgaria and the Balkans.’ International journal of the sociology of language (special issue, forthcoming).
Armjanov G L (1993). Rechnik na bâlgarskija zhargon. Sofia: 7M+Logis.
Beaulieux L (1933). Grammaire de la langue bulgare. Paris: Librairie ancienne Honoré Champion.
Bozhinov V & Panayotov L (eds.) (1978). Macedonia. Documents and Material. Sofia: Izdatelstvo na Bâlgarskata akademija na naukite.
Bojadzhiev T A (1991). Bâlgarskite govori v Zapadna (Belomorska) i Iztochna (Odrinska) Trakija. Sofia: Universitetsko izdatelstvo “Sv. Kliment Ohridski.”
Dimitrova E (2004). Diglosijata v grad Krivodol. Sofia: Heron Press.
Duridanov I (ed.) (1993). Gramatika na starobâlgarskija ezik. Sofia: Izdatelstvo na Bâlgarskata akademija na naukite.
Georgieva E, Zherev St & Stankov V (eds.) (1989). Istorija na novobâlgarskija knizhoven ezik. Sofia: Izdatelstvo na Bâlgarskata akademija na naukite.
Gerov N (1895). Rechnikâ na Blâgarskyj jazykâ. Plovdiv: Druzhestvena Pechjatnica “Sâglasie.”
Gyllin R (1991). The genesis of the modern Bulgarian literary language. Uppsala: Uppsala University.
Hauge K R (1999). A short grammar of contemporary Bulgarian. Bloomington: Slavica.
Hill P (1992). ‘Language standardization in the South Slavonic area.’ In Ammon U, Mattheier K J, Nelde P H (eds. of Sociolinguistica 6), Mattheier K J & Panzer B (eds. of the special issue, ‘The Rise of National Languages in Eastern Europe’). Sociolinguistica, 6. Tuebingen: Max Niemeyer Verlag. 108–150.
Hristova E (2003). Bâlgarska rech ot Albanija. Govorât na selo Vrâbnik. Blagoevgrad: Universitetsko Izdatelstvo “Neofit Rilski.”
Ivanchev Sv T (1988). Bâlgarskijat ezik – klasicheski i ekzotichen. Sofia: Narodna prosveta.
Kočev I, Kronshteiner O & Alexandrov I (1994). The fathering of what is known as the Macedonian literary language. Sofia: Macedonian Scientific Institute.
Kotova N & Yanakiev M (2001). Grammatika bolgarskogo jazyka. Moscow: Izdateljstvo Moskovskogo Universiteta.
Leskien A (1919). Grammatik der altbulgarischen (altkirchenslavischen) Sprache. Heidelberg, Germany: Carl Winter’s Universitätsbuchhandlung.
Mirchev K (1963). Istoricheska gramatika na bâlgarskija ezik. Sofia: Nauka i iskustvo.
Mladenov St (1929). Geschichte der bulgarischen Sprache. Berlin: Walter de Gruyter & Co.
Scatton E A (1984). A reference grammar of modern Bulgarian. Columbus, OH: Slavica.
Shklifov Bl & Shklivova E (2003). Bâlgarski dialektni tekstove ot Egejska Makedonija. Sofia: Akademichno izdatelstvo “Marin Drinov.”
Stojanova E P (1997). Istorija odnogo jazykovogo ostrova. Sofia, Veliko Tarnovo: Znak’94.
Stojkov St (1967). Banatskijat govor. Sofia: Izdatelstvo na Bâlgarskata akademija na naukite.
Stojkov St (1993). Bâlgarska dialektologija. Sofia: Izdatelstvo na Bâlgarskata akademija na naukite.
Vaillant A (1948). Manuel du vieux slave. Paris: Institut d’études slaves.
Videnov M G (1990). Savremennata balgarska gradska ezikova situacija. Sofia: Universitetsko izdatelstvo “Sv. Kliment Ohridski.”
Videnov M G & Angelov A G (eds.) (1999). ‘Sociolinguistics in Bulgaria.’ International Journal of the Sociology of Language (special issue, 135).
Weigand G (1907). Bulgarische Grammatik. Leipzig, Germany: Johann Ambrosius Barth.
Bulgarian J Miller, University of Auckland, Auckland, New Zealand © 2006 Elsevier Ltd. All rights reserved.
Bulgarian is a South Slavic language, along with Slovene (Slovenian), Macedonian, and the Serbo-Croatian linguistic complex. Geographically Bulgarian is also a Balkan language and shares a number of phonetic, grammatical, and lexical features with Rumanian (Romanian), Greek, and Albanian. For instance, Rumanian and Albanian have schwa in stressed syllables and so does Bulgarian, the only Slav language with this property. Bulgarian has two sets of dialects, Eastern and Western (further subdivisions are recognized). A major difference is in the reflexes of the Common Slavic jat vowel, roughly equivalent to ‘ye’ as in English yet. In the North Eastern dialects the jat vowel became ‘ja’ in a stressed syllable followed by a syllable with a back vowel. Elsewhere it became ‘e.’ Standard Bulgarian, based on the North Eastern dialects, has the ‘ja’ – ‘e’ alternation, in, e.g., adjectives: bjalo ‘white’ (neuter singular) versus beli (plural). The Common Slavic ‘l’ and ‘r’ plus jer (extra-short vowel) and syllabic ‘l’ and ‘r’ became ‘ŭr’ and ‘ŭl’ in polysyllabic words before two consonants and ‘rŭ’ and ‘lŭ’ elsewhere: skŭrben ‘sorrowful’; prŭv ‘first’ (masculine) versus pŭrva ‘first’ (feminine). Consonants are palatalized or non-palatalized, as in other Slav languages. Bulgarian has lost the Slavic case-suffixes but has developed definite articles, attached to the first word in noun phrases: Bulgarian knigata ‘the book,’ kniga ‘a book,’ novata kniga ‘the new book,’ nova kniga ‘a new book.’ In written Bulgarian masculine nouns
take different subject and oblique forms of the article: (j)at and (j)a. In spoken Bulgarian (j)at is typically not used. Bulgarian has preserved the Indo-European tense-aspect system of imperfect and aorist alongside the newer perfective-imperfective system. Typically, imperfect suffixes are added to imperfective stems and aorist suffixes to perfective stems. Bulgarian does offer examples of perfective stems with imperfect suffixes in subordinate clauses introduced by, e.g., shtom ‘as soon as’ and in main clauses; they express a completed action that is repeated. The following example (1) is from Feuillet (1995: 36).
(1) Vecher sedneshe na chardaka
    Evening sit-down-3SG on verandah-DO
    ‘In the evening he would sit down on the verandah’
Sedn is perfective and -eshe is imperfect. There are two future constructions, one for assertions and the other for denials. The former structure uses the particle shte, derived from the verb xoshtõ ‘I want/wish.’ The meaning ‘want’ is now expressed by iskam, cognate with the Russian iskat’ ‘search for.’ Compare (2) and (3).
(2a) Dimo shte dojde utre
     Dimo particle come-PERF-3SG tomorrow
     ‘Dimo will come tomorrow’
(2b) az shte dojda utre
     I particle come-PERF-1SG tomorrow
     ‘I will come tomorrow’
(3) az iskam da dojda
    I want-IMPERF-1SG conjunction come-PERF-1SG
    ‘I want to come’
The future-conditional still consists of a verb (originally the imperfect of xoshtõ) plus a da complement clause: shtjax da dojda ‘I would come,’ shteshe da dojdesh ‘you would come.’ The negative future construction consists of the invariable njama, originally a negative form of imam ‘have,’ plus a da clause, as in (4).
(4a) Donka njama da dojde
     Donka not-have-IMPERF-3SG conjunction come-PERF-3SG
     ‘Donka won’t come’
(4b) az njama da dojda
     I not-have-IMPERF-1SG conjunction come-PERF-1SG
     ‘I won’t come’
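The two future patterns in (2) and (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `future` is hypothetical, the verb must be supplied already inflected (the forms used are those cited in the examples), and no morphology is generated.

```python
def future(subject: str, verb_form: str, negative: bool = False) -> str:
    """Assemble a Bulgarian future clause from an already-inflected verb.

    Affirmative: subject + particle 'shte' + verb
    Negative:    subject + invariable 'njama' + conjunction 'da' + verb
    """
    if negative:
        # 'njama' is invariable; only the verb after 'da' shows person.
        return f"{subject} njama da {verb_form}"
    return f"{subject} shte {verb_form}"


print(future("Dimo", "dojde"))               # Dimo shte dojde
print(future("az", "dojda", negative=True))  # az njama da dojda
```

The sketch makes one point visible: in both patterns person is marked only on the lexical verb, while shte and njama stay the same.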
Ne shte occurs, but the njama construction is the norm. Bulgarian has a perfect as well as a perfective: Bulgarian chetox ‘I read’ (last week) versus chel sŭm ‘I have read.’ Chel is the perfect participle (originally resultative) and sŭm is the copula. Both Bulgarian and Macedonian have developed another perfect, with a passive (resultative) participle and imam ‘I have’: compare angazhiral sŭm masa ‘I have booked a table,’ where angazhiral expresses a property of the speaker, and imam angazhirana masa ‘I-have booked a-table,’ where angazhirana expresses a property of table. Bulgarian has what Bulgarian linguists call a renarrative construction. It is based on the perfect and past perfect. De Bray (1980: 123) talks of the past perfect as used in renarration; Feuillet talks of the use of the perfect and past perfect to signal distance or inference. That is, neither recognizes a separate renarrative tense. Examples are in (5); see Feuillet (1995: 41).
(5a) Kazal na Bozhura, che shtjal da se vurne
     He-supposedly-said to Bozhura that he-would conjunction self return
     ‘He is supposed to have told Bozhura that he would return’
(5b) Kaza na Bozhura, che shtjal da se vurne
     He-said to Bozhura that he-would conjunction self return
     ‘He told Bozhura that he would return’
(3) demonstrates a Balkan feature, a lack of infinitives. Where Russian, for example, has an infinitive, Bulgarian has a finite clause. Bulgarian has two principal subordinating conjunctions, da and che. Da is used for irrealis clauses; in (4a) the event of Donka coming is not a fact but a possibility. In (6) (from Feuillet, 1995) the event of his looking at the traffic is irrealis; he is not doing it. In (7), in contrast, the event of Donka coming is presented as fact, and the clause is introduced by che.
(6) Toj varveshe, bez da obrashta vnimanie na dvizhenie-to
    He was-walking without conjunction turns attention to traffic-the
    ‘He was walking without paying attention to the traffic’
(7) Tja kaza, che Donka shte dojde
    ‘She said that Donka will come’
Da was originally a marker of irrealis main clauses, a function which it still has in modern Bulgarian. Bulgarian has a relativizer kojto (masculine), kojato (feminine), and koeto (neuter), with the plural koito. It is used as a free relative: kojto pie tazi rakija e glupak ‘whoever drinks this rakija is an idiot,’ and as a relativizer in relative clauses, as in (8).
(8) knigata, kojato kupix
    book-the which I-bought
    ‘the book which I bought’
The structure preposition plus relativizer is used: knigata, v kojato chetox tezi dumi ‘the book in which I read these words.’ Spoken Bulgarian has a relative clause introduced by the invariable deto ‘where’: knigata deto ja kupix ‘the book that I bought,’ momcheto deto dojde ‘the boy that came.’ It also has a relative clause structure with shto (‘what’) and resumptive pronoun: kniga, shto ja kupix ‘the book that it I-bought.’ Despite the lack of case suffixes Bulgarian has flexible word order because of clitic personal pronouns (see Feuillet, 1995: 52–55). The personal pronouns have long and short (clitic) forms: mene me (me-accusative), mene mi (me-dative), nego go (him-accusative), and so on. Consider the question–answer pair in (9).
(9) Chete li ja Dimo novata kniga?
    Read Q it Dimo new-the book?
    ‘Did Dimo read the new book?’
    Dimo ja chete novata kniga
    Dimo it read new-the book
    ‘Dimo read the new book’
(9) is neutral; it asks simply if this event took place, not whether it was Dimo doing it or someone else, or if it was the new book that was read or something else. The order novata kniga ja chete
Dimo highlights ja chete Dimo; the pronoun ja signals that novata kniga is the direct object of chete. The order novata kniga, Dimo ja chete, with focal stress on Dimo, puts contrastive highlighting on Dimo: ‘As for the book, it was Dimo who read it and not anyone else.’
See also: Balto-Slavic Languages; Bulgaria: Language Situation; Bulgarian Lexicography; Clitics; Macedonia: Language Situation; Macedonian; Old Church Slavonic; Relative Clauses; Spoken Discourse: Word Order; Tense.
Bibliography Aronson H (1968). Bulgarian inflectional morphophonology. The Hague: Mouton de Gruyter. De Bray R G A (1980). Guide to the Slavonic languages (3rd edn.). Columbus, OH: Slavica. Feuillet J (1995). Bulgare. Munich: Lincom Europa. Holman M & Kovacheva M (1993). Teach yourself Bulgarian. London: Hodder and Stoughton. Hubenova M, Dzhumadanova A & Marinova M (1983). A course in modern Bulgarian, parts 1 and 2. Columbus, OH: Slavica. Rudin C (1985). Aspects of Bulgarian syntax: complementizers and wh constructions. Columbus, OH: Slavica.
Bulgarian Lexicography D Stantcheva, Berlin-Brandenburg Academy of Sciences, Berlin, Germany © 2006 Elsevier Ltd. All rights reserved.
Bulgarian lexicography began much later than that of most other European languages. This was largely due to the decline of the early Bulgarian literary culture, after the Turkish destruction of the second Bulgarian empire at the end of the 14th century. The literary culture did not begin to revive until 1762, the beginning of a period known as the Bulgarian Renaissance which lasted until 1878 (the end of the Russian–Turkish war). Pioneer dictionaries in Bulgarian include a manuscript of a Bulgarian–Greek word index from Bogasko, near Kastoria (16th century); Chetiriezichen rechnik, by Daniil of Moschopolis (1802); Dodatak k Sanktpeterburgskim sravnitel’nim rechnitsima sviiu eziki, by Vuk Karadzhich (1822); and the manuscripts Gr‘‘tsko-b‘‘lgarski razgovornik and Gr‘‘tsko-b‘‘lgarski rechnik, by Zakhariĭ Krusha (1828). However, these works had no influence
on subsequent Bulgarian lexicographical tradition, which developed as a part of the Bulgarian Renaissance, or on the development of New Bulgarian as a literary language. Works from the Renaissance period can be classified as follows: 1. Purist compilations: these include the word indexes in Bolgarska gramatika by Neofit Rilski (1835) and P‘‘rvichka b‘‘lgarska gramatika by Ivan Bogorov (1844), and a dictionary entitled Rechnik na dumi turski i gr‘‘tski v iazika b‘‘lgarskiĭ (1855) by Mikhail Pavlev and Aleksandur T. Zhivkov. 2. Dictionaries of foreign words that have enriched the Bulgarian vocabulary: these include Kratkiĭ rechnik za chuzhdestrannite rechi, koito sia nakhozhdat v b‘‘lgarskiĭ iazik (1863), by Teodor Khrulev, and thesauri as appendixes to various books by Anastas Kipilovski, Petko R. Slaveĭkov, Sava Radulov, and Mikhail Popovich. 3. Bilingual dictionaries: chief among these are An English and Bulgarian vocabulary in two parts (1860) by Charles Morse and Konstantin Vasiliev, and Frensko-b‘‘lgarski rechnik (1869)
Table 1 The main Bulgarian dictionaries from 1945 to the present day
Type of dictionary, followed by the titles of that type

Dictionary of foreign words
  Stefan Ilchev et al. (1982): Rechnik na chuzhdite dumi v b‘‘lgarskiia ezik

Explanatory monolingual dictionaries; dictionaries of record
  Liubomir Andreĭchin et al. (1st edition 1955, 4th edition 1994 revised and completed by Dimit‘‘r Popov): B‘‘lgarski t‘‘lkoven rechnik
  Stoian Romanski et al. (1955–1959): Rechnik na s‘‘vremenniia b‘‘lgarski knizhoven ezik
  Kristalina Cholakova et al. (1977–, 11 volumes published by 2002, up to oiam se): Rechnik na b‘‘lgarskiia ezik

Spelling dictionaries and pronouncing dictionaries
  Ivan Khadzhov et al. (1945): Pravopisen i pravogovoren nar‘‘chnik
  Liubomir Andreĭchin et al. (1st edition 1945, 10th edition 1984): Pravopisen rechnik na b‘‘lgarskiia knizhoven ezik
  Stoian Romanski (1951): Pravopisen rechnik na b‘‘lgarskiia knizhoven ezik s posochvane izgovora na dumite i poiasnenie na chuzhdite dumi
  Pet‘‘r Pashov/Khristo P‘‘rvev (1975): Pravogovoren rechnik na b‘‘lgarskiia ezik
  Elena Georgieva/Valentin Stankov (1983): Pravopisen rechnik na s‘‘vremenniia b‘‘lgarski knizhoven ezik
  Dimit‘‘r Popov et al. (1998): Rechnik za pravogovor, pravopis, punktuatsiia
  Valentin Stankov et al. (2002): Nov pravopisen rechnik na b‘‘lgarskiia ezik

Frequency dictionaries
  Tsvetanka Nikolova (1987): Chestoten rechnik na b‘‘lgarskata razgovorna rech
  Elena Todorova (1995): Chestoten rechnik na b‘‘lgarskata publitsistika 1944–1989g.

Historical dictionaries
  Dora Ivanova-Mircheva et al. (1999–, 1 volume published by 1999, up to N): Starob‘‘lgarski rechnik
  Dora Ivanova-Mircheva/Angel Davidov (2001): Mal‘‘k rechnik na starob‘‘lgarskiia ezik
  Vladimir Georgiev et al. (1971–, 6 volumes published by 2002, up to slovar): B‘‘lgarski etimologichen rechnik

Dictionaries of personal names and place names
  Stefan Ilchev (1969): Rechnik na lichnite i familnite imena u b‘‘lgarite
  Ĭordan Zaimov (1988): B‘‘lgarski imennik
  Nikolaĭ Michev (1989): Rechnik na selishchata i selishchnite imena v B‘‘lgariia: 1878–1987

Phraseological dictionaries
  Keti Nicheva et al. (1974–1975): Frazeologichen rechnik na b‘‘lgarskiia ezik
  Keti Nicheva (1993): Nov frazeologichen rechnik na b‘‘lgarskiia ezik

Dictionaries of archaisms and neologisms
  Stefan Ilchev et al. (1974): Rechnik na redki, ostareli i dialektni dumi v literaturata ni ot XIX i XX vek
  Diana Blagoeva et al. (2001): Rechnik na novite dumi i znacheniia

Paradigmatic dictionaries
  Liubomir Andreĭchin et al. (1975): Obraten rechnik na s‘‘vremenniia b‘‘lgarski ezik
  Stefka Vasileva (1988): Rechnik na paronimite v b‘‘lgarskiia ezik
  Emiliia Pernishka/Stefka Vasileva (1998): Rechnik na antonimite v b‘‘lgarskiia ezik
  Liuben Nanov (1st edition 1950, 2nd revised edition 1968): B‘‘lgarski sinonimen rechnik
  Milka Dimitrova/Ana Spasova (1980): Sinonimen rechnik na s‘‘vremenniia b‘‘lgarski knizhoven ezik
  Ani Nanova/Liuben Nanov (1987): B‘‘lgarski sinonimen rechnik

Dictionary of acronyms
  Lidiia Krumova/Mariia Choroleeva (1983): Rechnik na s‘‘krashcheniiata v b‘‘lgarskiia ezik

Valency dictionary
  Mariia Popova (1987): Krat‘‘k valenten rechnik na glagolite v b‘‘lgarskiia ezik

Dictionaries of jargon and slang
  Georgi Armianov (1993): Rechnik na b‘‘lgarskiia zhargon
  Gancho Ganchev/Albena Georgieva (1994): Rechnik na obidnite dumi i izrazi v b‘‘lgarskiia ezik
and B‘‘lgarsko-frenski rechnik (1871), both by Ivan Bogorov. 4. Unfinished projects of monolingual explanatory dictionaries: these include Bulgarski rechnik (1856) by Naĭden Gerov, with a Russian translation section (only part A–vleka was completed); Slovar na b‘‘lgarskiia ezik, izt‘‘lkuvan na cherkovnoslavianski i gr‘‘tski ezik (1875) by Neofit Rilski, with a Greek translation section (only letters A–B were published); B‘‘lgarski rechnik. S‘‘birane sichkite nashi dumi, posreshchnati s frenski i ist‘‘lkuvani d‘‘lgo i shiroko b‘‘lgarski (1881) by Ivan Bogorov, with a French translation section (only the first five letters were completed); B‘‘lgarski
rechnik s t‘‘lkuvanea i primeri (1871) by Ivan Bogorov (only part A–vdl‘‘bnat was published); and a manuscript of a monolingual explanatory dictionary by Petko R. Slaveĭkov (1827–1895), in which only parts of the letters A, B, V, and G are preserved. Two pioneering monolingual dictionaries with Russian translation sections appeared at the end of the 19th century: Slovar’ bolgarskogo iazyka po pamiatnikam’ narodnoĭ slovesnosti i proizvedeniiam’ noveĭseĭ pechati (1885–1889) by Aleksand‘‘r Diuvernua, and the five-volume Rechnik na b‘‘lgarskyĭ ezyk’ s’ t‘‘lkuvanie rechi-ty na b‘‘lgarsky i na
russky (1895–1904) by Naĭden Gerov, with a supplementary volume (1908) by Todor Panchev (a facsimile of the six volumes was published in 1978). Both works concentrate on colloquial vocabulary and are important lexicographical records of the formation of New Bulgarian. Based on these two dictionaries is the unfinished B‘‘lgarski t‘‘lkoven rechnik (1927–1951) founded by Stoian Argirov, Stefan Mladenov, Aleksand‘‘r Teodorov-Balan, and Ben’o Tsonev; this work was continued by Stefan Mladenov and renamed B‘‘lgarski t‘‘lkoven rechnik s ogled k‘‘m narodnite govori. Only letters A–K were completed. A number of monolingual dictionaries were published in the early 20th century; these include the first dictionary of synonyms by Mariana Dabeva (1930–1934); Entsiklopedichen rechnik na chuzhdite dumi (1939) by Georgi Bakalov, a dictionary of foreign words; and Stefan Mladenov’s Etimologicheski i pravopisen rechnik na b‘‘lgarskiia knizhoven ezik (1941), the first etymological dictionary and the first codification of Bulgarian orthography. In the second half of the 20th century, Bulgarian lexicography began to expand rapidly and is currently thriving, supported by the Bulgarian Academy of Sciences (BAS) in Sofia and by the universities of Sofia, Veliko Turnovo, Plovdiv, and Shumen. During the past 60 years, a great number of modern dictionaries have appeared, each with a different linguistic approach and editorial aim (see Table 1). The major scholarly record of the Bulgarian language is Rechnik na b‘‘lgarskiia ezik by Kristalina Cholakova et al. (this work, in progress since 1977, had 11 volumes published by 2002, up to oiam se). Since 1998, the BAS has worked to create an electronic corpus of the Bulgarian language. The corpus is representative of all major registers of the Bulgarian language and is being continually enlarged and updated. It can be expected to have an important effect on the future of Bulgarian lexicography.
In addition to practical dictionary making, numerous theoretical metalexicographical works have been published (the most important of which are included in the following bibliography), in which lexicography is generally viewed as a branch of lexicology. The Department for Bulgarian Lexicology and Lexicography at BAS is an institutional member of the European Association of Lexicography (Euralex) and publishes the lexicographic journal Leksikograficheski pregled.
See also: Bilingual Lexicography; Bulgarian; Bulgaria: Language Situation; Dictionaries; Lexicography: Overview; Lexicology; Macedonian; Old Church Slavonic; Slavic Languages.
Bibliography Boiadzhiev T (1986). B‘‘lgarska leksikologiia. Sofia: Nauka i izkustvo. Cholakova K (1972). ‘Trideset godini b‘‘lgarska leksikografiia v Instituta za b‘‘lgarski ezik.’ In B‘‘lgarski ezik 5. Sofia: Izdatelstvo na B‘‘lgarskata akademiia na naukite. 456–458. Cholakova K (1978). ‘B‘‘lgarskata leksikografiia v minaloto i dnes.’ In Pashov P (ed.) V‘‘prosi na b‘‘lgarskata leksikologiia. Sofia: D‘‘rzhavno izdatelstvo ‘Narodna prosveta.’ 159–179. Cholakova K (1984). ‘S‘‘vremenna b‘‘lgarska leksikografiia.’ In S‘‘vremenna B‘‘lgariia, vol. 5. Sofia: Izdatelstvo na B‘‘lgarskata akademiia na naukite. 78–83. Cholakova K (1985). ‘Chetirideset godini b‘‘lgarska leksikografiia i leksikologiia.’ In B‘‘lgarski ezik 1. Sofia: Izdatelstvo na B‘‘lgarskata akademiia na naukite. 21–25. Cholakova K (ed.) (1986). V‘‘prosi na s‘‘vremennata b‘‘lgarska leksikologiia i leksikografiia. Sofia: Izdatelstvo na B‘‘lgarskata akademiia na naukite. Dimova A & Pavlova M (1973). ‘Pogled v‘‘rkhu razvoia na nashata leksikografiia (t‘‘lkovni rechnitsi).’ In B‘‘lgarski ezik 6. Sofia: Izdatelstvo na B‘‘lgarskata akademiia na naukite. 583–589. Kiuvlieva-Mishaĭkova V (1997). B‘‘lgarskoto rechnikovo delo prez V‘‘zrazhdaneto. Sofia: Akademichno izdatelstvo, ‘Prof. Marin Drinov.’ Kiuvlieva-Mishaĭkova V (ed.) (2002). Problemi na b‘‘lgarskata leksikologiia, frazeologiia i leksikografiia. Sofia: Akademichno izdatelstvo, ‘Prof. Marin Drinov.’ [B‘‘lgarsko ezikoznanie 3.] Lewanski R C (1973). A bibliography of Slavic dictionaries. (vol. II) (2nd edn.). Bologna: Editrice Compositori. Rusinov R & Georgiev S (1996). Leksikologiia na b‘‘lgarskiia knizhoven ezik. Veliko T‘‘rnovo: Abagar. Stankiewicz E (1984). Grammars and dictionaries of the Slavic languages from the Middle Ages up to 1850. Includes indexes. Berlin: Walter de Gruyter. Steinke K (1990). ‘Bulgarische Lexikographie.’ In Hausmann F J, Reichmann O, Wiegand H E & Zgusta L (eds.) Wörterbücher: Ein internationales Handbuch zur Lexikographie, vol. 2. Berlin and New York: Walter de Gruyter. 2304–2308. V‘‘tov V (1995). Fonetika i leksikologiia na b‘‘lgarskiia ezik. Veliko T‘‘rnovo: Abagar. Zidarova V (1998). Ocherk po b‘‘lgarska leksikologiia. Plovdiv: Plovdivsko universitetsko izdatelstvo, ‘Paisiĭ Khilendarski.’
Relevant Website http://www.ibl.bas.bg – electronic corpus of the Bulgarian language.
Bullokar, William (c. 1531–1609) W Viereck, Universität Bamberg, Bamberg, Germany © 2006 Elsevier Ltd. All rights reserved.
William Bullokar was a persistent spelling reformer who provided evidence about the pronunciation of English toward the end of the 16th century. From his books we can glean some information about his life. Born of a Sussex family about 1530, he spent at least one period of military service abroad, namely in Le Havre, which had been occupied by the English in 1562 and 1563. Several years earlier, Bullokar had been a teacher. In 1570 he married the daughter of an alderman of Chichester and, 4 years later, moved to the Chichester parish of St. Andrews, where he died in early 1609. His son John was the author of the 1616 book An English expositor: teaching the interpretation of the hardest words used in our language (reprinted in 1971). In 1585 his father had published for John the book The short sentences of the wyz Cate in his spelling system. About 1574, Bullokar started to devise a remedy for the state of English orthography. He did not try to replace traditional spelling with a phonetic one. Rather, he essentially kept the historical orthography and attempted to indicate the pronunciation by means of diacritical marks above and below the graphemes. However, he did not completely succeed in this and often allowed himself to be influenced by the traditional writing system. He laid down his orthographic system of 37 letters in his Amendment of orthographie for English speech of 1580. Bullokar had several of his books printed in his script, namely a collection of fables (Aesops´ fablz´, 1585), as well as his grammatical treatises. His system was, however, too complex to be adopted for English generally. The pronunciation mirrored in his works seems basically to have been that of the London middle class, that is, of Standard English, which lower-class people strove for. On the other hand, Bullokar retained a good many dialectisms and vulgarisms, some of which were imported into the standard language during the 17th century. Bullokar also wrote the first English grammar in English. However, he followed William Lily’s Grammatica Latina very closely. Still, some of his observations are noteworthy, such as those on phrasal verbs.
See also: English, Early Modern; Spelling Reform.
Bibliography Bullokar W (1580–1581). The works, vol. 1: a short introduction or guiding to print, write, and reade Inglish speech. [Reprinted in 1966 by Danielsson, B. & Alston, R. C. (eds.). Leeds, UK: University, School of English, Texts and Monographs N.S.] Bullokar W (1586). The works, vol. 2: pamphlet for grammar. [Reprinted in 1980 by Turner, J. R. (ed.). Leeds, UK: University, School of English, Texts and Monographs N.S.] Bullokar W (1580). The works, vol. 3: booke at large, for the amendment of orthographie for English speech. [Reprinted in 1970 by Turner, J. R. (ed.). Leeds, UK: University, School of English, Texts and Monographs N.S.] Bullokar W (1585). The works, vol. 4: Aesops´ fablz´. [Reprinted in 1969 by Turner, J. R. (ed.). Leeds, UK: University, School of English, Texts and Monographs N.S.] Bullokar W (1580, 1586). Booke at large: and bref grammar for English. [Reprinted in 1977, with an introduction by Diane Borstein. Delmar, NY: Scholars’ Facsimiles and Reprints.] Dobson E J (1968). William Bullokar, in English pronunciation, 1500–1700, vol. I: survey of the sources (2nd edn.). Oxford: Clarendon Press. 93–117. Poldauf I (1948). ‘On the history of some problems of English grammar before 1800.’ Prague Studies in English 7.
Burji
See: Highland East Cushitic Languages.
Burkina Faso: Language Situation B Coulibaly © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 1, pp. 430–432, © 1994, Elsevier Ltd.
Burkina Faso is a multilingual and multiethnic country. A distinction must be made between the official language, French, and the national languages. Indeed, the constitution of the Fourth Republic stipulates under Section II, article 35: ‘French is the official language. The methods by which the national languages are to be promoted shall be laid down by law.’ Among the national languages, there is a distinction between majority languages, which are used as lingua francas, and local, minority languages.
French
French, as stated, is the official language of administration, foreign affairs, the judicial system (Supreme Court and High Court of Justice, etc.), and of formal education (both primary and secondary schools and universities). French is also the language that allows further access to other foreign languages (mainly English, with German in second place). French is even the means of access to other national languages. It is the ‘Strait Gate’ through which all desirous of state employment must pass. Thus, French is the language of initial education for most of those who are literate. According to the 1985 census, 73.37% of literate people (i.e., 12.1% of the population) are literate in French. All this demonstrates just how privileged a position the French language occupies, and it doubtless explains why the national languages are flooded by French expressions. In this respect, no area of lexis is immune. But this influence concerns only lexis: the syntax, phonology, and morphology of the respective languages are hardly affected, if at all. Those who attain mastery of the French language are, however, limited in number. At a rough estimate they make up some 10% of the population. They live mainly in the large towns. The remaining 90% use only the national languages as their means of expression.
The National Languages
The main issues to be considered are the number of national languages, their geographical distribution, the division into language families, and the extent to which people are bi- or multilingual.
How Many National Languages Are There?
Tiedrebeogo and Yago (1983) speak of 60 languages used by a population then estimated at 5 600 000 people. Of these 60 languages, only 36 have been studied linguistically, and 18 are the object of National Subcommissions. The aim of these Subcommissions, under the general direction of the National Commission (set up on January 17, 1969), is to undertake, carry out, and promote the study of the various national languages and to teach adult literacy in them. The languages concerned are, in alphabetical order: Bisa (Lala: Bisa), Bobo, Bwamu, Cerma, Dagara (Dagara Northern), Fulfulde, Gulmancema (Gourmancéma), Jula, Kar (Karaboro, Eastern), Kasem, Lobiri (Lobi), Lyele (Lyélé), Moore (Mòoré), Nuni, San, Senufo (Sénufo, Senara), Songhai (Songhay), and Tamasheq (Tamasheq, Kidal).
Language Families and Geographical Distribution
The languages found in Burkina Faso mostly belong to three main groups: the Gur languages group or Voltaic group (Moore, Gulmancema, Kasem, Dagara, etc.); the Mande group (Jula, Dafing (also known as Marka), Bobo, San, Bisa) and the West Atlantic group represented by Fulfulde. These three groups correspond to the three language types accepted by Houis as representing the totality of African languages, i.e., the ‘economic’ type, the type with differentiated morphology, and the intermediate type. The Mande languages are examples of the first type, with their open syllable (CV) structure, productive compound word morphology, lack of nominal classes, etc.; the languages with differentiated morphology are represented by the only West Atlantic language, namely Fulfulde, which has closed syllable structure, nonproductive compound word morphology, extensive nominal classes, etc., and, finally, there are the Gur or Voltaic languages that represent the intermediate type with both open and closed syllable structure, fairly extensive nominal classes, etc. Tiedrebeogo and Yago (1983) list the following as languages used for interethnic communication: Moore, Jula, and Fulfulde. In general terms, Moore is used in the center of the country, Jula in the west, and Fulfulde in the northeast, but each of the three languages extends well beyond these areas and they thus behave, given their general use, as true lingua francas.
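The grouping just described can be written down as a small lookup table. The sketch below is illustrative only: the group names, member languages, type labels, and regions are copied from the paragraph above, while the dictionary layout and the helper name `group_of` are assumptions made for the example. The Gur list is not exhaustive, since the text ends it with ‘etc.’

```python
from typing import Optional

# Language groups, their type in Houis's three-way classification, and the
# member languages named in the text (the Gur list is explicitly partial).
LANGUAGE_GROUPS = {
    "Gur (Voltaic)": {
        "type": "intermediate",
        "languages": ["Moore", "Gulmancema", "Kasem", "Dagara"],
    },
    "Mande": {
        "type": "economic",
        "languages": ["Jula", "Dafing", "Bobo", "San", "Bisa"],
    },
    "West Atlantic": {
        "type": "differentiated morphology",
        "languages": ["Fulfulde"],
    },
}

# Languages of interethnic communication and the area where each is centred.
LINGUA_FRANCAS = {"center": "Moore", "west": "Jula", "northeast": "Fulfulde"}


def group_of(language: str) -> Optional[str]:
    """Return the group a language is listed under, or None if not listed."""
    for group, info in LANGUAGE_GROUPS.items():
        if language in info["languages"]:
            return group
    return None


print(group_of("Jula"))  # Mande
print(LANGUAGE_GROUPS[group_of("Fulfulde")]["type"])  # differentiated morphology
```

Read this way, each of the three lingua francas belongs to a different group and a different type.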
Bilingual and Multilingual Areas
According to I. Nacro (Plurilingualism and education in Africa: a sociolinguistic approach to the situation in Upper Volta, 1984), there are four bilingual areas in Burkina Faso:

1. The west is notable for the existence of a whole host of minority languages. Indeed, 71.42% of the national languages are located in this area. This state of affairs has benefited Jula, which is used as a second language by several ethnic groups, such as the Bobo, Bwaba (Bwamu), Senufo, Dagara, Lobiri, Samo, Gouin (Cerma), Turka, Toussian, Siamu (Siamou), etc. Each of these groups automatically uses Jula when unable to communicate with anyone in their native language.
2. The central southern area where Moore is used as a second language by the Bisa.
3. The central northern area where Moore is used as a second language by the Peul.
4. Finally, the north where the Tamasheq and the Bela use Fulfulde as their second language.

The large towns, and especially Ouagadougou and Bobo-Dioulasso, form multilingual areas. Apart from the three languages listed above, French and Bobo (in Bobo-Dioulasso) are spoken there.

For What Purposes Are the Various Languages Used?
Despite the low prestige they enjoy, the national languages are the subject of some attention on the part of the government authorities. Since 1969, a certain number of official measures have been taken and instructions issued in their support. It is worth mentioning the successive setting-up of the DAFS, the ONEPAFS, the INAFA, and the INA; the translation of the national anthem into some 10 languages in 1985; the fact that civil servants are taught to read and write in national languages; the publication of a circular obliging ministers and senior managers to use the national languages for preference when addressing the people at large; the ‘Commando’ literacy program and the ‘Bantaré’ program aimed at making 10 000 women literate; and so on. All these measures will be given concrete expression by the use of the national languages on television and radio, in written materials of all types, in the courts of first instance, and in religious services.

On television
The three most widely used languages are employed daily to broadcast news that has already been broadcast in French, to debate certain problems in magazine programs broadcast in the relevant languages, and to give lessons in transcription through courses organized by the officials of the INA.

On radio
Some 20 national languages have access to the radio where they are given generous time allocations. They use these to broadcast news and to organize programs containing folktales, proverbs, and riddles, thus encouraging oral literature.

In written and oral literature
Oral literature exists in all the languages. It has its own means of preservation and is not dependent on being written down. Written literature is practically nonexistent, however, being limited to a few INA publications, to essays by some authors in the context of the national Grand Prix for Art and Literature (GPNAL), and to religious writings. Nonetheless, it is worth noting the existence of some newspapers that appear at irregular intervals in four of the country’s languages, namely Moore, Jula, Fulfulde, and Gulmancema.

In the courts of first instance
In these courts, and especially in local arbitration and reconciliation tribunals, all the national languages are used. However, in the large towns, the three majority languages enjoy a privileged position, with the other languages being mainly employed in the villages.

In religious services
This area is where the national languages are used the most. The Christians teach their congregations to read and write in these languages. In this context, the SIL (Summer Institute of Linguistics) organizes literacy courses.
Conclusion

In Burkina Faso, several so-called national languages live peacefully side by side with the official language, French. Briefly, the use that the typical citizen of the country makes of languages may be described as follows: the official language is used to communicate with the outside world, while the national languages are devoted to internal purposes.

Language Maps (Appendix 1): Map 2.
Bibliography

Tiedrebeogo G & Yago Z (1983). The language situation in Upper Volta. Ouagadougou: National Council for Scientific and Technical Research.
Burma: Language Situation

J Watkins, University of London, London, UK
© 2006 Elsevier Ltd. All rights reserved.
First, a note on alternative names. The names of places, ethnolinguistic groups and their languages may in some cases be rendered in English in two ways: one introduced by the Burmese government in 1989 and one in use before then that may still have general currency. Such pairs are separated with an oblique stroke, the older name preceding the newer one, thus: Burma/Myanmar.
Burma, like the entire Southeast Asian region, is an area of extreme linguistic diversity. Most languages indigenous to the territory of the Union of Myanmar (here referred to simply as Burma) belong to the Tibeto-Burman, Tai-Kadai, and Mon-Khmer language families. The task of enumerating the languages spoken in Burma is confounded by three factors that may cause the number of languages to be overestimated or underestimated. Firstly, there is a general dearth of accurate and up-to-date demographic data describing the population of Burma and the languages they speak. There has never been a formal linguistic survey of the country, many of the data available are patchy and unreliable, and the speaker numbers presented here include provisional and rough estimates. Secondly, languages and dialects spoken in Burma may be referred to by multiple names, which may be ethnonyms and/or language names, both autonymic and exonymic. Conversely, one name may be used to refer to multiple languages and dialects. Lastly, the perennial problem of how to define distinct languages as opposed to dialects of the same language is frequently encountered in Burma.

The geographical distribution of Burma’s languages is obviously a complex affair. In general, the Burmese-speaking Burman/Bamar majority live in the central plains, occupying about half the area of the country, with other languages found in the more mountainous areas nearer the borders in all directions. These areas coincide mostly with the area of the administrative non-Burman States: Arakan/Rakhine, Chin, Kachin, Shan, Karenni/Kayah, Karen/Kayin, and Mon.

The most comprehensive reliable listings of the languages spoken in Burma are Bradley’s map of the country in the Atlas of the world’s languages (Bradley, 1994) and SIL International’s Ethnologue. The total number of languages spoken in Burma is put at 107 by SIL International. The population of the country has recently been estimated to be approximately 53 million, some 20% more than the figures used by SIL. In official Burmese government sources, the number of ‘national races’ in Burma is put at 135. The ethnic groups in this list are often categorized geographically according to the state in which they reside, with no regard for their linguistic relationships. This tally appears to stem from the 1931 British Census of India, and bears little resemblance to the actual ethnolinguistic situation.
Tibeto-Burman Languages

Tibeto-Burman languages are spoken by about four-fifths of the population of Burma. Burmese is the Tibeto-Burman language with more speakers than all the rest of the Tibeto-Burman languages combined. It is the first language of about two-thirds of the population and is spoken nonnatively by several million speakers of other languages. The dialects Arakanese (Arakan/Rakhine State), Tavoyan (Tenasserim/Tanintharyi Division), and Intha (Shan State) may be argued to be separate languages. Other closely related Burmish languages/dialects include Taungyo, Yaw, Hpun, Achang, Lashi, Maru, and Danu.

Languages of the Loloish branch of Tibeto-Burman are spoken in the eastern part of Shan State, including Lisu (roughly 125 000 speakers), Lahu (approximately 125 000 speakers), and Akha (approximately 200 000 speakers).

The western side of Burma is home to the languages in the diverse Kuki-Chin branch of Tibeto-Burman. In northern Arakan/Rakhine State and in Chin State, some two dozen Chin languages are spoken, typically with thousands or tens of thousands of speakers at most: Southern Chin (including Daai, Khumi, Mro, and Asho), Mro, Central Chin (including Haka/Lai and Lushai/Mizo), and Northern Chin (including Tedim, Falam, and Thado).

The languages of northern Burma’s Kachin State include languages for which the classification within Tibeto-Burman is disputed or unclear. The half-dozen Nungish languages include Nung and Rawang; the Sal group (comprising Baric and Luish languages and Jinghpaw, or classified alternatively as Jinghpaw-Konyak-Garo) includes languages such as Jinghpaw, Kado, Khienmungan, Chang, and Tase.

About 20 languages of the Karen branch of Tibeto-Burman are found in the areas of eastern Burma bordering Thailand, in Karen/Kayin and Karenni/Kayah States, spoken by 3–4 million people. The larger languages include Sgaw and Pwo (Pho) (approximately 1.25 million speakers each), Padaung (Kayah) (roughly 300 000 speakers) and Pa’O (approximately 500 000 speakers). Other Karen languages include Yintale, Yinbaw, Paku, Geko, Geba, and Manu.
Mon-Khmer Languages

Mon-Khmer languages account for only about 7% of the population. The major Mon-Khmer language spoken in Burma is Mon, a literary language with a rich history, spoken by some 800 000 in the Mon State in southeastern Burma. Palaungic languages, a part of the Northern branch of Mon-Khmer, are spoken by scattered communities in Shan State and in northern central Burma. The larger of these include Wa, a diverse group of several dozen dialects including Paraok, En, Son, Va, and Vo. Wa and the closely related language Plang are spoken in Shan State by perhaps 600 000–700 000 people. About half that number speak other Palaungic languages: Palaung (Pale, Shwe, and Rumai), Riang, Loi, Samtao, and related dialects.
Tai-Kadai Languages

Burma’s Tai-Kadai languages are spoken in the northeastern part of Shan State, with speakers of Shan found also in some areas of Kachin State. Tai-Kadai languages account for about one-tenth of the population. The major Tai languages spoken in Shan State are Shan (approximately 6% of the population, over 3 million speakers) and the smaller Tai languages Khamti, Khün, Lü, and Tai Nüa.
Other Languages

As a result of Burma’s colonization by the British, some South-Asian Indo-Aryan languages not indigenous to Burma – principally Hindi/Urdu, Bengali, and Panjabi – are spoken, mainly in urban centers. Their speakers are the descendants of people brought to Burma as part of the colonial administration established by the British. Chinese is both spoken natively and used as a major lingua franca in areas near the Chinese border, in particular the Kokang area. In Arakan/Rakhine State, a variety of the Chittagonian dialect of Bengali is spoken by the Muslim Rohingya population, numbering in the hundreds of thousands. The Rohingya’s status as Burmese nationals has been problematic; in recent years refugees from Burma have fled to Bangladesh, from where a number have been repatriated to Burma. There are a few thousand speakers of Hmong Njua, a Hmong-Mien language, in the northeast of the country.
Burmese is the official and national language. It is the sole language of all the official business and administration of the military government, including all broadcast media and state education, though there is a state-run University for the Development of National Races, established in 1990, with the aim of training teachers in Kachin, Karenni/Kayah, Sgaw and Po Karen/Kayin, Shan, Mon, Arakanese/Rakhine, and Kachin (possibly meaning Jinghpaw), and Chin (possibly meaning Haka/Lai Chin). Other languages may be taught as part of political movements associated with a particular ethnic group, for example the Mon-language education system promoted by the New Mon State Party in the 1990s.

The use of English in Burma has been a contentious issue over the last century, when at times Burmese took second place in the education system developed under British colonial rule. The status of Burmese was reasserted in the 1930s in connection with nationalist anticolonialist political movements such as the Dobama a-Si a-Youn. The status of English was drastically reduced under all-Burmese education policies of the U Ne Win socialist government established in 1962, though the teaching of English has resumed and flourished in the last two decades. At present, private schools, typically the preserve of relatively wealthy city-dwellers, may provide part of their curriculum in English or in Chinese. Such schools may be a popular choice for parents who want to equip their children for commercial success in the future.

While Burmese remains the language of communication at the national level, a number of languages function as regional lingua francas, such as Arakanese in Arakan/Rakhine State and southern Chin State. Shan, Chinese, and Lahu are all used between speakers of other languages in various parts of Shan State.
It is relatively rare for ethnic Burmans/Bamar to speak languages other than Burmese, but most people whose first language is a language other than Burmese speak Burmese to some degree, and frequently other languages besides. A simple straw poll revealed that a quarter of a group of about 25 speakers of the Mon-Khmer language Wa in Shan State spoke five or more languages in their everyday lives. Pāli, no longer a living language, remains culturally prominent in Burma as the language of the Buddhist scriptures, which are routinely studied and chanted as part of Buddhist religious practice. Pāli is also an important source of loanwords – typically learned and religious vocabulary – in the written languages with predominantly Buddhist speaker populations, namely Burmese, Mon, and Shan.
Languages spoken in Burma have also borrowed vocabulary from English and from the major languages spoken in the countries neighboring Burma: Chinese, Thai, Bengali, and Hindi. Of course, many of the languages spoken in areas near Burma’s borders with neighboring countries are spoken on both sides of the border by speech communities that may have a common sense of identity despite the political divisions imposed by the border. Examples of such communities are Wa in Burma and China, Karen/Kayin in Burma and Thailand, and Naga in Burma and India.

The general lack of research on Burma’s linguistic landscape means that there is an incomplete picture of the extent and scope of language endangerment in Burma. Undoubtedly, certain languages may be losing ground to Burmese or regional languages, or may be losing internal dialectal diversity. There is a critical need for a systematic language survey of Burma.

See also: Burmese; Endangered Languages; Mon; Multilingualism: Pragmatic Aspects; Wa.
Bibliography

Allott A (1985). ‘Language policy and language planning in Burma.’ In Bradley D (ed.) Papers in Southeast Asian linguistics: language policy, language planning and sociolinguistics in Southeast Asia. Canberra: Pacific Linguistics. 131–154.
Bradley D (1994). ‘East and South-East Asia.’ In Moseley C & Asher R (eds.) Atlas of the world’s languages. London: Routledge.
Bradley D (1995). Papers in Southeast Asian linguistics No. 13: Studies in Burmese languages. Canberra: Pacific Linguistics, Australian National University.
Callahan M P (2003). ‘Language policy in modern Burma.’ In Brown M & Ganguly Š (eds.) Fighting words: language policy and ethnic relations in Asia. Cambridge: MIT Press.
Lintner B (1990). Land of jade: a journey through insurgent Burma. Edinburgh: Kiscadale/Bangkok: White Lotus.
Myanmar Language Commission (1993). Myanmar-English dictionary. Yangon, Myanmar: Myanmar Language Commission.
Myanmar Language Commission (2001). English-Myanmar dictionary. Yangon, Myanmar: Myanmar Language Commission.
Okell J (1969). A reference grammar of colloquial Burmese. London: Oxford University Press.
Okell J (1995). ‘Three Burmese dialects.’ In Bradley D (ed.) Papers in Southeast Asian linguistics No. 13: studies in Burmese languages. Canberra: Pacific Linguistics, Research School of Pacific and Asian Studies, Australian National University.
Rajah A (1990). ‘Ethnicity, nationalism and the Nation-state: the Karen in Burma and Thailand.’ In Wijeyewardene G (ed.) Ethnic groups across national boundaries in mainland Southeast Asia. Singapore: Institute of Southeast Asian Studies. 102–133.
Sakhong L H (2003). In search of Chin identity: a study on religion, politics and ethnic identity in Burma. Copenhagen: Nordic Institute of Asian Studies Press.
SIL International (2002). Ethnologue: languages of Myanmar. http://www.ethnologue.com/show_country.asp?name=Myanmar accessed 27 November 2004.
Smith M (1991). Burma: insurgency and the politics of ethnicity. London: Zed Books.
Smith M (1994). Ethnic groups in Burma: development, democracy and human rights. London: Anti-Slavery International.
South A (2003). Mon nationalism and civil war in Burma: the golden Sheldrake. London: Routledge Curzon.
Taylor R H (1982). ‘Perception of ethnicity in the politics of Burma.’ Southeast Asian Journal of Social Science 10(1), 7–22.
Thein Lwin (2000). The teaching of ethnic language and the role of education in the Context of Mon ethnic nationality in Burma. http://www/mrc-usa.org/school-research.htm accessed 27 November 2004.
Tin Htway (1972). ‘The role of literature in nation building in Burma.’ In Grossmann B (ed.) Southeast Asia in the Modern World. Wiesbaden: Otto Harrassowitz. 35–60.
Wheatley J K (1990). ‘Burmese.’ In Comrie B (ed.) The major languages of East and Southeast Asia. London: Routledge. 106–126.
Wheatley J K (2003). ‘Burmese.’ In Thurgood G & LaPolla R J (eds.) The Sino-Tibetan languages. London, New York: Routledge. 195–207.
Yêkhaung M L (1966). Modernisation of Burmese. Prague: Oriental Institute, Czechoslovak Academy of Sciences.
Burmese

J Watkins, School of Oriental and African Studies, London, UK
© 2006 Elsevier Ltd. All rights reserved.
Introduction

Burmese is the national language of Burma/Myanmar and is the mother tongue of the Burman (Bamar) ethnic majority, who make up approximately two-thirds of Burma’s population of slightly over 50 million. The rest of the country’s indigenous population is diverse, speaking between 60 and 100 other languages among them, depending on the criteria used to distinguish languages from one another. Most non-Burmans live in the areas near Burma’s borders with Thailand, Laos, China, India, and Bangladesh, although many live interspersed with Burmans and speak Burmese and other languages in addition to their native language. Burmese is little spoken outside Burma, but widely dispersed and fragmented communities of Burmese expatriates may be found in Asia and around the world.

Burmese belongs to the Tibeto-Burman language family, which comprises approximately 350 languages spoken across a vast territory stretching from the Himalayas to mainland Southeast Asia. Burmese has by far the largest number of speakers of any of the Tibeto-Burman languages, most of which have only a few thousand speakers and many of which may disappear during the 21st century. Most of the other languages spoken in Burma also belong to the Tibeto-Burman language family. Some, such as Arakanese (Rakhine), Intha, and Danu, are so similar to Burmese as to be considered by some to be dialects of Burmese rather than separate languages.
History and Script

The Burmese have been in the area of modern Burma/Myanmar from approximately 850 C.E. onward, founding their capital at Pagan (Bagan). Despite extensive contact over the following two centuries with the Pyu, who occupied the area and spoke a now-dead Tibeto-Burman language, the first inscriptions in Burmese date from the 11th century, with no extant examples of Burmese writing before then. Burmese script is a close cousin of the Mon script, which was adapted from a southern Indian script, a descendant of the Brāhmī script that was the ancestor of many Indic scripts found in South and Southeast Asia. It is thought that the Burmese adapted the script from Mon after Mon scribes were brought to the city of Pagan after the Burmese king Anawratha, in 1057 C.E., defeated the Mon, although this theory has been disputed in recent research.

Aside from the rounding of the originally square characters into the distinctive round-shaped letters of Burmese today, the alphabet has remained largely unchanged to the present day. It is widely believed that the round shapes of Burmese letters evolved because texts were traditionally written on palm leaves, which would split easily if angled shapes were scratched on them. Whether or not this is true, Burmese writing retains its distinctive round shapes, and handwriting with consistent, even circles is praised.

The writing system evolved between the period of the early inscriptions and the 16th century C.E., when it assumed a form similar to its present-day state. The spoken language has changed considerably since that time, with the result that a faithful transliteration of written Burmese (such as the one approved by the American Library Association and the Library of Congress used here) gives little impression of the way letters or words are pronounced in the language today. Sound changes have applied to certain initial consonants. Final consonants have disappeared. A glottal stop is all that remains of final stop consonants, whereas the place contrasts of written final stops are realized as vowel changes in the syllable. Final nasal consonants have been replaced by a parallel series of nasalized vowels. In general, many combinations of symbols are pronounced differently from the sounds represented by the symbols individually. The phonetic transcription used here is faithful to the principles of the IPA, although several others have been devised. A transliteration and transcription are compared in the following example.

Burmese script:   [not reproduced]
Transliteration:  RUP‘MRAṄ‘SAṀ KRĀ
Transcription:    jou .mjìN.yàN. á
Gloss:            picture.see.sound.hear
Translation:      ‘television’ (more commonly tì.vì ‘T.V.’)
Burmese script is basically alphabetic. There are separate symbols to represent consonants (Table 1) and vowels (Table 2), but the symbols are organized in syllabic clusters, which are written from left to right. Within each cluster, however, the symbols do not necessarily appear in left-to-right order. For example, to write the syllable tì ‘worm,’ the vowel -ì is placed on top of the consonant t, but to write tù ‘nephew,’ the ù must hang below the initial t. Certain sounds in Burmese, namely affricates, voiceless sonorants, and initial consonant clusters, are written using medial forms of four consonants, shown in Table 3.
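As a rough illustration of the syllabic clustering described above, the following Python sketch inspects the two example syllables as they might be stored in the Myanmar block of Unicode. Both the use of Unicode and the spellings chosen for tì and tù are assumptions made here for illustration; the article itself does not give them. In each case the stored sequence is simply consonant followed by vowel sign, and placing the vowel above or below the consonant is left to the rendering software.

```python
import unicodedata

# Assumed Unicode spellings of the two example syllables (not given in the article).
SYLLABLES = {
    "tì 'worm'": "\u1010\u102E",    # consonant TA + a vowel sign drawn above it
    "tù 'nephew'": "\u1010\u1030",  # consonant TA + a vowel sign drawn below it
}


def describe(cluster: str) -> list[tuple[str, str, str]]:
    """Return (code point, Unicode name, general category) for each stored character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in cluster
    ]


if __name__ == "__main__":
    for label, cluster in SYLLABLES.items():
        print(label)
        for code_point, name, category in describe(cluster):
            print(f"  {code_point}  {name}  ({category})")
```

The general category distinguishes the base letter from the vowel signs, which are combining marks attached to it, so a program can treat each consonant-plus-marks sequence as one cluster even though the marks are drawn in different positions.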
Table 1 Consonants of Burmese, transliterated and transcribed
Table 2 Burmese word-initial and word-internal vowel symbols
Burmese script has retained the features and symbols needed for writing the South Asian languages for which its parent scripts were originally designed, such as Pāli, the language of the Buddhist scriptures and the source of many loans in Burmese, which can easily be identified because of phonological features such as doubled consonants and retroflex consonants that do not occur in Burmese words. A Pāli phrase and its rendition in Burmese are shown next.

Burmese script:   [not reproduced]
Transliteration:  BUDDHAṀ SARAṆAṀ GACCHĀMI
Transcription:    bou dàN yerenàN gji shàmḭ
Translation:      ‘I go to the Buddha for refuge’
Phonetics and Phonology

Some of the sounds used in Burmese are considered unusual because they occur relatively rarely in the world’s languages. These are the so-called voiceless nasals, which include the sound of air escaping through the nose. The Burmese word for ‘investment,’ jíN.n̥ì.m̥jou .n̥àN.m̥ṵ, contains examples of two such sounds: /m̥/ and /n̥/. The consonants of Burmese are set out in Table 4.

For reasons of historical phonology, vowels in orthographically open syllables (Table 5), which are written with no final consonant letter, can be distinguished from those found in orthographically closed syllables (Table 6), namely those ending in a glottal stop or with a nasal vowel (transcribed here with /N/, which does not represent a final nasal consonant), both of which are written as final consonant letters in the writing system.

Like the majority of the languages spoken in mainland East and Southeast Asia, Burmese is a tone language. The tonal contrasts involve not only the commonly observed differences in pitch and vowel length but also differences in phonation type – whether the voice is breathy or sharp in character. The presence or absence of a glottal stop at the end of the syllable may also be considered to be part of the tonal system. Table 7 gives a basic description of the tonal contrasts on a syllable consisting of a bilabial nasal and an open vowel.
Table 3 Medial forms of Burmese consonants
Table 4 The consonants of Burmese
Burmese morphemes in phrases and compounds display varying degrees of phonological juncture, principally voicing assimilation and reduction of the first syllable, as shown in the following examples.

• Voicing assimilation on internal morpheme boundaries in compounds.
Table 5 Vowels of Burmese in orthographically open syllables
Morphology

Morphemes in Burmese are predominantly monosyllabic. With the exception of Indo-European loans, typically from Pāli or English, compounding is the major source of polymorphemic words. In the television example above, four morphemes (N + V) (N + V) combine to form a noun. Derivational morphology by prefixation is common, in particular noun-formation from verbs using the prefix e-.

pjàiNshàiN   >  epjàiN eshàiN
compete      >  competition

jáuN / wEA   >  ejáuN ewEA
sell / buy   >  trade
The verbal complex, typically occurring at the end of a Burmese sentence, may comprise one or more head verbs in series followed by a string of auxiliary verbs, verbal particles, and markers.

NP                 NP          VP
khi mì.zé.dwè      hòtEA.dwè   phji .pOA.là.zè.bjàN.bà.dEA
modern.market.PL   hotel.PL    become emerge.begin.CAUS.also.POLITE.REALIS
‘. . . caused modern markets and hotels to begin to appear as well’

Table 6 Vowels of Burmese in orthographically closed syllables: killed tone or nasal vowel

Table 7 Burmese tones
a Syllables with one of these tones may in some contexts become reduced to a short, unstressed schwa which is counted as a fifth tonal category in some analyses.
Burmese has a system of noun case markers, which in many contexts are not obligatorily present, and postpositions, as illustrated next.

ú.ba .ga    máNdelé.gò    emè.nE        ywá.dEA
U Ba.SUBJ   Mandalay.to   mother.with   go.REALIS

Burmese, like other languages of the region, encodes power and solidarity in personal relationships using a rich system of pronouns and forms of address. Pronouns may be true pronouns, such as Nà 1SING ‘I’ and nìN 2SING ‘you’ (both familiar, not polite), or grammaticalized from other sources, such as enOA 1SING (male, polite; literally ‘royal slave’). Other forms of address include titles, personal relationships, and names or a combination of all three, such as shejáma .dOA.khìNkhìN hOB ‘Teacher (FEM) Aunt (= Mrs.) Khin Khin Chaw.’

Literacy and Literary Burmese

The literacy rate in Burma has often been said to be high compared to other countries in the region, but accurate data are extremely difficult to obtain. One recent source suggests that nearly 80% of Burmese people over the age of 15 are literate, but other sources have put the figure much lower. The Burmese language exists in a colloquial style used in spoken informal contexts and a literary style used in official formal settings. The main difference between the two is that they have separate sets of grammar words and some other vocabulary. A colloquial-style sentence is compared to its literary-style equivalent in the next example.

Spoken     ú.ba .ga    máNdelé.gò    emè.nE        là.dEA
Literary   ú.ba .ðì    máNdelé.ðo    emè.niN       là. i
           U Ba.SUBJ   Mandalay.to   mother.with   come.REALIS
‘U Ba came to Mandalay with his mother’

Given the large number of speakers of Burmese and the existence of a large diaspora community scattered around the world, Burmese has an inevitable presence on the Web, although at the time of writing standardized encoding has yet to be widely adopted and so text is usually displayed on the Internet as graphics. For ease of use, computer users often render Burmese in romanized form in Internet chat rooms or e-mail.

See also: Burma: Language Situation; Sino-Tibetan Languages.

Bibliography
Allott A (1985). ‘Language policy and language planning in Burma.’ In Bradley D (ed.) Papers in Southeast Asian linguistics: language policy, language planning and sociolinguistics in Southeast Asia. Canberra, Australia: Pacific Linguistics. 131–154.
Armstrong L E & Pe Maung Tin (1925). A Burmese phonetic reader. London: University of London Press.
Bradley D (1982). ‘Register in Burmese.’ In Bradley D (ed.) Pacific Linguistics Series A-62: Tonation. Canberra, Australia: Pacific Linguistics, Australian National University.
Bradley D (1995). Papers in Southeast Asian linguistics 13: Studies in Burmese languages. Canberra, Australia: Pacific Linguistics, Australian National University.
Myanmar Language Commission (1993). Myanmar–English dictionary. Yangon, Myanmar: Myanmar Language Commission.
Myanmar Language Commission (2001). English–Myanmar dictionary. Yangon, Myanmar: Myanmar Language Commission.
Okell J (1965). ‘Nissaya Burmese, a case of systematic adaptation to a foreign grammar and syntax.’ In Milner G B & Henderson E J A (eds.) Indo–Pacific linguistic studies, vol. 2: Descriptive linguistics (Lingua 14–15). Amsterdam: North Holland. 186–230.
Okell J (1969). A reference grammar of colloquial Burmese. London: Oxford University Press.
Okell J (1984). Burmese: an introduction (4 vols). DeKalb, IL: Northern Illinois University.
Okell J & Allott A (2001). Burmese/Myanmar: a dictionary of grammatical forms. Richmond, UK: Curzon Press.
Roop D H (1972). An introduction to the Burmese writing system. New Haven, CT: Yale University Press.
Sprigg R K (1957). ‘Studies in linguistics analysis.’ Transactions of the Philological Society (Special volume). 104–138.
Sprigg R K (1977). ‘Tonal units and tonal classification: Panjabi, Tibetan and Burmese.’ In Gill H S (ed.) Pàkha Sanjam 8: Parole and langue. Patiala: Punjabi University. 1–21.
Thurgood G W (1981). Monumenta Serindica 9: Notes on the origins of Burmese creaky tone. Tokyo: Tōkyō gaikokugo daigaku.
Wheatley J K (1990). ‘Burmese.’ In Comrie B (ed.) The major languages of East and Southeast Asia. London: Routledge. 106–126.
Wheatley J K (1996). ‘Burmese writing.’ In Daniels P T & Bright W (eds.) The world’s writing systems. Oxford: Oxford University Press. 450–456.
Wheatley J K (2003). ‘Burmese.’ In Thurgood G & LaPolla R J (eds.) The Sino-Tibetan languages. London & New York: Routledge. 195–207.

Relevant Websites

Sino-Tibetan etymological dictionary and thesaurus (STEDT) (2002). University of California at Berkeley. http://linguistics.berkeley.edu.
SIL International (2002). Ethnologue: Languages of Myanmar. http://www.ethnologue.com.
Burnett, James, Monboddo, Lord (1714–1799)

P C Sutcliffe, Colgate University, Hamilton, NY, USA
© 2006 Elsevier Ltd. All rights reserved.
James Burnett, born in Monboddo in Scotland on October 14, 1714, was a judge, classics scholar, and Scottish Enlightenment philosopher. After a classical education, Burnett studied law and eventually became a judge, Lord of Session, taking the title Lord Monboddo in 1767, and remained in this post until his death on May 26, 1799. An eccentric and controversial figure, Monboddo was not afraid to expound unpopular views, both on the bench and in his scholarship, especially as a member of the Select Society of Edinburgh, a group of prominent citizens, David Hume and Adam Smith (see Smith, Adam (1723–1790)) among them, that gathered weekly to share ideas. In his two anonymously published six-volume works, Of the origin and progress of language (OPL) (1773–1792) and Antient metaphysics (AM) (1779–1792), he opposed Locke’s popular empiricism, favoring the idealist metaphysics and authority of the ancient Greeks, especially Aristotle.
Of the two works, OPL deals more specifically with language and was the more popular and successful, though the basic arguments are reiterated in Book III of AM. The first two volumes of OPL are most frequently discussed. In Monboddo’s own words, the ‘‘three heads’’ of Book I were ‘‘that Language is not natural to man . . . that it may have been invented . . . and . . . to show how it was invented’’ (Cloyd, 1972: 45). Though all humans have a faculty for language founded upon their ability to abstract meaning, language only arose where humans lived communally, gradually evolving from animal cries as men purposefully attached meaning to sounds. That the natural, primitive state of man is without language, Monboddo revealed in credulous accounts of travelers’ tales of primitive societies and by maintaining that orangutans, because of their social behavior, were actually the lowest form of humans without language. This notorious claim discredited Monboddo’s work to many of his contemporaries as well as to posterity, even as it classified him as a pre-Darwinian. Book II is a universal grammar, influenced by the work of Monboddo’s friend, Harris (see Harris, James (1709–1780)).
Bradley D (1995). Papers in South Asian linguistics 13: Studies in Burmese linguistics. Canberra, Australia: Pacific Linguistics, Australian National University. Myanmar Language Commission (1993). Myanmar–English dictionary. Yangon, Myanmar: Myanmar Language Commission. Myanmar Language Commission (2001). English–Myanmar dictionary. Yangon, Myanmar: Myanmar Language Commission. Okell J (1965). ‘Nissaya Burmese, a case of systematic adaptation to a foreign grammar and syntax.’ In Milner G B & Henderson E J A (eds.) Indo–Pacific linguistic studies, vol. 2: Descriptive linguistics (Lingua 14–15). Amsterdam: North Holland. 186–230. Okell J (1969). A reference grammar of colloquial Burmese. London: Oxford University Press. Okell J (1984). Burmese: an introduction (4 vols). DeKalb, IL: Northern Illinois University. Okell J & Allott A (2001). Burmese/Myanmar: a dictionary of grammatical forms. Richmond, UK: Curzon Press. Roop D H (1972). An introduction to the Burmese writing system. New Haven, CT: Yale University Press. Sprigg R K (1957). ‘Studies in linguistics analysis.’ Transactions of the Philological Society (Special volume). 104–138.
In 1784, parts of the first three volumes of OPL were translated into German and published with a foreword by Herder (see Herder, Johann Gottfried (1744–1803)), who praised Monboddo for his first attempts to use a comparison of languages and races to develop a philosophy of mankind. OPL influenced Herder’s Ideen zur Philosophie der Geschichte der Menschheit. Monboddo can also be linked to Jones (see Jones, William, Sir (1746–1794)), with whom he corresponded. Monboddo postulated a connection between Greek and Sanskrit in Book I of OPL in 1774, and this, perhaps, deserves to be considered the starting point of comparative linguistics rather than Jones’s statement of 1786. Certainly, Monboddo’s tremendous influence on his contemporaries makes him worthy of more consideration than he has traditionally received.
See also: Harris, James (1709–1780); Herder, Johann Gottfried (1744–1803); Jones, William, Sir (1746–1794); Locke, John (1632–1704); Origin and Evolution of Language; Smith, Adam (1723–1790).
Bibliography Arnold G (2002). ‘Monboddo die Palme? Zur Monboddo-Rezeption J. G. Herders.’ Herder Yearbook 6, 7–19. Burnett J, Lord Monboddo (1773–1792). Of the origin and progress of language (6 vols). London and Edinburgh: AMS Press. Cloyd E L (1972). James Burnett, Lord Monboddo. Oxford: Clarendon Press. Plank F (1993). ‘Des Lord Monboddo Ansichten von Ursprung und Entwicklung der Sprache.’ Linguistische Berichte 144, 154–166.
Burrow, Thomas (1909–1986) R Chatterjee, Lado International College, Silver Spring, MD, USA © 2006 Elsevier Ltd. All rights reserved.
Thomas Burrow was born on June 29, 1909 in the Lancashire village of Leck. He studied classics at Cambridge University. He became interested in Sanskrit through a course in comparative philology and received his Ph.D. for his studies on the Kharosthi documents from Chinese Turkestan (now Xinjiang, land of the Uighur people). In 1944 he was appointed to the Boden Chair in Sanskrit at Oxford. He retired in 1976 and died 10 years later. Burrow’s first book on the Kharosthi documents analyzes them as related to a Prakrit of Northwest India, now in the Peshawar region. He provides a grammar of the language and a combined index and vocabulary. Burrow’s work in Sanskrit itself is well represented by The Sanskrit language (1955, 1966, 1973). Burrow’s focus here is the description of Sanskrit in its relation to Indo-European. He masterfully lays out the Indo-European neighbors of the language and their overlapping characteristics, quoting at the beginning the famous words of Sir William Jones in his address to the Royal Asiatic Society of Bengal in 1786: “The Sanscrit language, whatever be its antiquity, is of wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either . . .”. Burrow writes of Sanskrit as “a form of language which in most respects is more archaic and less altered from original Indo-European than any other member of the family.” He emphasizes that the importance of Sanskrit grammarians is unequalled anywhere in the world, and that Panini’s work regulated the language of the classical literature in the language “to the last detail.”
In the 1960s, Burrow, with Murray B. Emeneau, made a signal contribution to the study of the other great language family of India, Dravidian. Their Dravidian etymological dictionary (DED) first appeared in 1961. It has been called a landmark event in Dravidian linguistics. Data from almost 30 languages are taken into account. The dictionary itself runs to some 500 pages. There are indexes of Dravidian, Indo-Aryan, Munda and other languages, including Hobson-Jobson. There is also an index of English meanings and of flora. The dictionary does not contain proto-Dravidian reconstructions; Burrow and Emeneau decided that the time required was not warranted by the state of Dravidian studies at the time. When the DED was published, the compilers decided to restrict it to Dravidian material alone. However, Indo-Aryan material had been collected and was readied for publication in the University of California Publications in Linguistics in 1962 under the title Dravidian borrowings from Indo-Aryan. In 1968 Burrow published a collection of papers in India, Collected papers on Dravidian linguistics. Notable here is an excursus into the further relationships of Dravidian languages to geographically distant families such as Ural-Altaic, specifically
Finno-Ugric. Burrow reviews previous work by Caldwell, Schrader, and others and presents “as a first instalment of evidence supporting the theory of Dravidian-Uralian relationship” a list of words applying to the body and its parts.
See also: Caldwell, Robert (1814–1891); Emeneau, Murray Barnson (b. 1904); Jones, William, Sir (1746–1794); Panini; Sanskrit.
Bibliography Burrow T (1937). The language of the Kharosthi documents from Chinese Turkestan. Cambridge: The University Press. Burrow T (1955, 1966, 1973). The Sanskrit language. London: Faber. Burrow T & Emeneau M B (1961, 1984). Dravidian etymological dictionary. Oxford: Clarendon Press. Burrow T & Emeneau M B (1962). Dravidian borrowings from Indo-Aryan. University of California Publications in Linguistics (vol. 27). Berkeley, CA: University of California Press. Burrow T (1968). Collected papers on Dravidian linguistics. Annamalainagar: Annamalai University Department of Linguistics, Publication no. 13.
Burundi: Language Situation Editorial Team © 2006 Elsevier Ltd. All rights reserved.
Burundi is bordered by the Democratic Republic of the Congo in the west, Rwanda in the north, and Tanzania in the east. In the southwest, Burundi borders Lake Tanganyika. The comparatively small country has about 6.2 million inhabitants divided into three main ethnic groups: Hutus (approx. 85%), Tutsis (approx. 14%), and Twa (1%). The Twa pygmies are an original hunter-gatherer community and are now mainly engaged in hunting, pottery, and ironworking. They are assumed to be the original inhabitants of the area, with Hutus and Tutsis arriving later. The Urundi kingdom became part of German East Africa in 1890, together with neighboring Rwanda. After World War I both territories were administered by Belgium under a League of Nations mandate. In 1962 Burundi became an independent kingdom, and in 1966, after the overthrow of the monarchy, a republic. Burundi has a long history of internal unrest and ethnic violence brought about by conflicts between Hutus and Tutsis. Ironically, the country is a counter-example to the claim that monolingualism brings internal stability, as all ethnic groups in Burundi speak one language, Rundi. Rundi (Kirundi) is a Bantu language closely related to Kinyarwanda, the language of Rwanda, as well as to Ha of Tanzania. All three are largely mutually intelligible, although the varieties are distinct enough to serve for ethnic and national identification. Within Burundi, Hutus, Tutsis, and Twa speak different dialects of Rundi. The last two groups are assumed to have originally been speakers of non-Bantu languages, and to have shifted to Rundi. Communities of Rundi speakers, including refugees, are also found in Rwanda, Uganda, and Tanzania. In addition to Rundi, the former colonial language, French, is used in Burundi, especially for formal and official purposes, in education, and for international communication. Both Rundi and French are official languages. The third important language in Burundi is Swahili, which is spoken by the Muslim, Asian, and Congolese communities, and is used as a contact language by others, mainly in the capital, Bujumbura, and along Lake Tanganyika.
See also: Rwanda: Language Situation; Tanzania: Language Situation.
Bibliography Ntahokaja J-B (1994). Grammaire structurale du Kirundi. Bujumbura: L’Université du Burundi. Sommers M (2001). Fear in Bongoland: Burundi refugees in urban Tanzania. New York, Oxford: Berghahn.
Burushaski G D S Anderson, Salem, OR, USA © 2006 Elsevier Ltd. All rights reserved.
Burushaski is a language isolate spoken in the Northern Areas, Pakistan, primarily in the Hunza, Nagar, and Yasin valleys. A small enclave of Burushaski speakers is also found over the border in Kashmir, India. The Hunza and Nagar varieties differ only slightly from each other; both stand at a relative distance from the Yasin variety of Burushaski, sometimes also considered to be a close sister language, Werchikwar. There are approximately 80 000 speakers of Burushaski, including somewhere in the area of 15 000–20 000 people speaking the Yasin dialect, with an additional 20 000–30 000 speakers of both Hunza Burushaski and Nagar Burushaski. In all communities where Burushaski is spoken, the language remains vital, with many women and children still monolingual speakers. The first comprehensive study of Burushaski was Lorimer (1935–1938). The most recent is Berger’s three-volume grammar, dictionary, and text collection (1998). Bilingualism among Burushaski speakers is common primarily in the two Dardic Indo-European languages Shina (Nagar Burushaski speakers) and Khowar (the Burusho of Yasin valley). In Hunza, especially in the village of Mominabad, the Indo-Aryan-speaking Dúumaki (Domaaki) live in close contact with Burushaski speakers; nearly all Dúumaki speakers appear to be bilingual in Burushaski. Burushaski itself may previously have been spoken in a wider area than the one in which it is currently found: for example, in Dras, in Baltistan, there is a group of people known as the Brokpa or Brusa; also, in Ponjal, there are the so-called Burushken, who are now Shina speaking.
Burushaski has a basic five-vowel system, with two series of contrastive long vowels, bearing stress or higher pitch on the first or the second mora, respectively:
(1) i íi ií    u úu uú
    e ée eé    o óo oó
    a áa aá
There is some dispute among Burushaski specialists as to the exact nature of these long vowels. Varma (1941: 133) described the suprasegmental or intonational contrasts of Burushaski long vowels as representing a rising and a falling tone; modern investigators, however, e.g., Tiffou (1993), Berger (1998), and Morin and Tiffou (1989), considered this to be a difference of moraic stress: that is, Burushaski long vowels may receive stress on either the first mora or the second, corresponding to Varma’s falling and rising tones, respectively. These phenomena are phonemic in Burushaski. A comprehensive instrumental analysis of Burushaski vocalism remains to be done. A lowered pitch on the first mora is sometimes heard with the former (initial-mora prominent) forms. (Note that expressive diminutives are generally associated with this intonational pattern, e.g., šon ‘blind’ vs. šóon ‘somewhat blind’ or ṭak ‘attached’ vs. ṭáak ‘somewhat attached.’) Yasin exhibits the same intonational phenomena as the standard Hunza and Nagar varieties, although the moraic stress difference seems to be less pronounced, and in some speakers, this contrast has been neutralized. Examples of phonemic vowel contrasts in Burushaski include bat ‘flat stone’ vs. baát ‘porridge’ (as in bras-e baát ‘cooked rice,’ aalu-e baát ‘mashed potatoes’); ḍir ‘boundary, water ditch between fields, small irrigation canal; hostility’ vs. ḍíir ‘overhanging rock’; Xun ‘wooden block in door lock, stocks (for prisoner)’ vs. Xúun ‘quail’; men ‘who’ vs. meén ‘old, venerable; fallow field’; gon ‘dawn’ vs. goón ‘like, as.’ Note that these length contrasts appear only in stressed syllables in Burushaski. Three-way contrasts between short, first-mora-prominent, and second-mora-prominent vowels are found in a small number of lexical items in Burushaski. Such triplets include bo ‘grain, seed, sperm/semen’ vs. bóo et- ‘low, bellow’ vs. boó (cf. nupáu → nupoón in the converb form) ‘sit down, lower self,’ don ‘large herd’ vs. dóon (→ dóon ke) ‘still, yet, nevertheless’ vs. doón ‘woman’s head scarf; open’ (Berger, 1998: vol. 3, pp. 121–122). Two-way length contrasts, such as báak ‘punishment, torture’ vs. baák ‘generosity,’ are relatively common.
Burushaski has an extensive system of consonants. In fact, there are eight different stop/affricate series attested in the language: labial, dental, alveolar, retroflex, palatal, palatal-retroflex, velar, and uvular. Each of these series may be found in voiceless unaspirated, voiceless aspirated, and voiced variants (see Table 1).
Table 1 The consonantal inventory of Burushaski
Voiceless unaspirated: p t c ṭ č č̣ k q
Voiceless aspirated: ph th ch ṭh čh č̣h kh qh
Voiced: b d z ḍ g X
Fricatives: (f)ᵃ s š ṣ̌ (x)ᵃ h
Nasals: m n N
Approximants and liquids: w l r y ỵ
ᵃ [f] and [x] occur only in loan words, or as a variant of the aspirated stops [ph] and [kh] or [qh], respectively.
Table 2 Plural formation in Burushaski
While retroflexion is common throughout the languages of south Asia, Burushaski has one of the largest inventories of nonsonorant retroflex sounds among the languages of the region, with no fewer than seven such sounds. In addition, the Hunza and Nagar varieties possess a curious retroflex, a spirantized palatal, symbolized /ỵ/, with a range of local or idiolectal realizations. This sound is lacking in the Yasin Burushaski dialect.
Burushaski possesses four noun classes, based on real-world semantic categorization. Thus, male humans belong to class I, female humans to class II, nonhuman animates to class III, and inanimates to class IV (2). These classes are formally realized not in the nouns themselves but through the selection of case allomorphs and verb agreement morphology.
(2) I: male human, hir ‘man’
II: female human, dasín ‘girl’
III: animate nonhuman, haXúr ‘horse’
IV: inanimate, Xaténč̣ ‘sword’
Another salient feature of the nominal system of Burushaski is the wide range of plural formations attested in the language. There are literally dozens of plural markers in the language, each often found with only a small number of nouns. Some of these are found only with nouns of a particular class, but others crosscut this categorization (see Table 2). Burushaski has a highly developed system of grammatical cases as well as an elaborate system of local/directional and instrumental/comitative cases (see Table 3). The exact number is difficult to determine, as new elements enter this system through the grammaticalization (and phonological fusion) of relational nouns/postpositions. There are at least the following grammatical cases (i.e., ones assigned by structural position or verbal subcategorization): ergative, genitive, dative, ablative. With class II nouns, the latter two cases are built on the genitive (or oblique) stem.
Numerals agree in class with their nominal complement in Burushaski (note that class I and class III are conflated here; see Table 4). Numbers 20 and above are based on a clear vigesimal system, 30 being literally ‘20–10’ and 40 being (etymologically) ‘2–20,’ etc.
(3) aalter(an) 20
aalter toorumo 30
aaltuwalter 40
aaltuwalter toorumo 50
iiski aalter 60
iiski aalter toorumo 70
waalti aalter(an) 80
waalti aalter toorumo 90
tha 100
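The vigesimal pattern in (3) can be illustrated with a short sketch. It is not a description of Burushaski numeral grammar as such: it simply looks up the score words listed in (3) and appends toorumo ‘10’ where a ten remains. The function name and the restriction to multiples of ten are assumptions made for the illustration, and the optional (an) ending shown in (3) is left out.

```python
# Score words (multiples of twenty) copied from example (3).
# The optional "(an)" ending given there for 20 and 80 is omitted.
SCORES = {1: "aalter", 2: "aaltuwalter", 3: "iiski aalter", 4: "waalti aalter"}
TEN = "toorumo"   # 'ten', placed after the score word
HUNDRED = "tha"   # 'hundred'


def burushaski_tens(n: int) -> str:
    """Return the form from (3) for a multiple of ten between 20 and 100.

    Hypothetical helper for illustration only; other numbers are rejected.
    """
    if n == 100:
        return HUNDRED
    if n % 10 != 0 or not 20 <= n <= 90:
        raise ValueError("only multiples of ten from 20 to 100 are covered")
    scores, remainder = divmod(n, 20)  # e.g. 70 -> 3 scores, remainder 10
    form = SCORES[scores]
    if remainder == 10:
        form += " " + TEN
    return form
```

For instance, `burushaski_tens(70)` splits 70 into three twenties and a remaining ten and returns the two parts in that order, mirroring the ‘20–10’ structure described for 30.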
The verbal system of Burushaski stands out for its morphological complexity among south Asian languages. There are two basic sets of inflections, depending in part on the stem allomorph. These two broad categories are as follows: (4) I past perfect pluperfect aorist (conative)
II future present imperfect
The maximal template of the Burushaski simplex verb is given by Tikkanen (1995: 91) as:
(5) −4 NEG-
−3 D-
−2 PERSON/CLASS/NUMBER-
−1 CAUS-
Ø √-
+1 PL.SUBJ-
+2 DUR-
+3 1SG.SUBJ-
+4 PRTCPL/OPT/COND/AUX-
+5 SUBJ.SFX-
+6 Q
Some examples of verbs reflecting this template are given in (6). Note the curious and morphologically triggered (and phonologically unmotivated) devoicing of obstruents following the negative allomorph a- (but not oó-).
(6) oó-min-im-i
NEG-drink-AP-I
‘he didn’t drink (it)’ (Berger, 1998: 106)
a-túru-m-i
NEG-work-AP-I
‘he didn’t work’ (Berger, 1998: 105)
a-mí-kač-ič-a-i
NEG-1PL-enclose-DUR-AUX-I
‘he doesn’t enclose us’ (Berger, 1998: 105)
a-tu-ququ-m-i
NEG-D-be.confused-AP-I
‘he was not confused’ (Berger, 1998: 105)
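A position-class template like (5) can be modeled as a mapping from slot numbers to morphemes, with the surface order read off by sorting the slots. The sketch below does only that. The function name is hypothetical, and the assignment of the glosses 1PL, AUX, and I in the third example of (6) to slots −2, +4, and +5 is an assumption read off the template labels, not something the source states. The devoicing after a- is not modeled.

```python
from typing import Dict

# Slot numbers and labels follow template (5); slot 0 is the root.
SLOTS = {
    -4: "NEG", -3: "D", -2: "PERSON/CLASS/NUMBER", -1: "CAUS",
    0: "ROOT",
    1: "PL.SUBJ", 2: "DUR", 3: "1SG.SUBJ",
    4: "PRTCPL/OPT/COND/AUX", 5: "SUBJ.SFX", 6: "Q",
}


def build_verb(morphemes: Dict[int, str]) -> str:
    """Join the filled slots in template order, separated by hyphens.

    Hypothetical helper: it checks only that every slot exists in the
    template and that the root slot is filled.
    """
    unknown = set(morphemes) - set(SLOTS)
    if unknown:
        raise ValueError(f"slots not in template: {sorted(unknown)}")
    if 0 not in morphemes:
        raise ValueError("the root slot (0) must be filled")
    return "-".join(morphemes[slot] for slot in sorted(morphemes))


# 'he doesn't enclose us' from (6), with assumed slot assignments.
form = build_verb({-4: "a", -2: "mí", 0: "kač", 2: "ič", 4: "a", 5: "i"})
```

Keying the morphemes by slot number rather than listing them in order keeps the template itself as the single source of ordering, which is the point of a position-class analysis.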
I-neck-SUPERABL
‘from on his neck’
Instrumental/Comitative Cases
uskó yáṭ-umuc-ane hin jinzaat-an
three head-PL-INSTR.B one.I demon-SG.ART
‘a three-headed demon’
day-o-k d-l
stone-PL-INSTR hit
‘pelt with stones’
-me-ke gaṭ
tooth-INSTR bite
‘bite with teeth’
mé-k d-l
bow-INSTR hit
‘shoot with bow’
amé-k-aṭe bišá-
bow-INSTR-SUPERESS throw
‘shoot with bow’
animate possessor of a logical argument as an argument morphologically in the verb-word (7).
(7a) khakháay-umuc phaṣ̌Ú mée-t-aa
walnut-PL gobble.up 1PL-AUX-2
‘you gobbled up our walnuts’ (Berger, 1998: 162)
(7b) hiles-e dasin-mo mo-miṣ̌ moo-skarc-im-i
boy-ERG girl-GEN II-finger II-cut-AP-I
‘the boy cut off the girl’s finger’ (Willson, 1990: 5)
Another characteristic feature of the Burushaski verbal system is the grammaticalized use of double argument indexing with intransitive verbs. This single vs. double marking appears within two separate functional subsystems. In the first one, presence vs. absence of double marking implies degree of control of the subject over the action: less control is indexed through double marking (8a). In the second such subsystem, class-IV nouns receive single marking while class-III nouns receive double marking with the same predicate (8b).
(8a) Xurc-ím-i
sink-AP-I
‘he dove under’ (Berger, 1998: 118)
i-Xúrc-im-i
I-sink-AP-I
‘he drowned’ (Berger, 1998: 118)
(8b) ha Xulú-m-i
house burn-AP-IV
‘the house burned’ (Berger, 1998: 118)
hun i-Xúl-im-i
wood III-burn-AP-III
‘the wood burned’ (Berger, 1998: 118)
Syntactically, Burushaski is a fairly rigid SOV language. In narrative texts, head-tail linkage, a common narrative device among south Asian languages, is frequently found (clauses are linked by rote repetition of the finite verb of a preceding sentence in a nonfinite form in an immediately following sentence). Further, some cases appear only on the leftmost of two (conjunctively or disjunctively) conjoined nouns, while others appear on both. There thus appear to be both phrasal and word-level case forms in Burushaski.
Table 4 Numerals
      I/III        II           IV
1     hin          han          hi(k)
2     aaltan       aala/aalto   aalti/aalto
3     iisken       usko         iiski
4     waalto       waalto       waal(ti)
5     cundo        cundo        cindi
6     mišindo      mišindo      mišin(di)
7     talo         talo         tale
8     aaltambo     aaltambo     aaltam(bi)
9     hunčo        hunčo        hunti
10    toorumo      toorumo      toorimi
11    turma hin    turma han    turma hik
See also: Pakistan: Language Situation.
A further curious aspect of Yasin Burushaski is the highly atypical semantic (plural) agreement seen with disjunctively conjoined NPs (Anderson and Eggert, 2001). Most of these features can be seen in the following examples.
(9a) gus ya hir-e dasen a-mu-yeec-en
     woman or man-ERG girl NEG-II-see-PL
     ‘the woman or the man didn’t see the girl’ (Anderson et al., 1998)
(9b) hir ya guse-e dasen a-mu-yeec-en
     man or woman-ERG girl NEG-II-see-PL
     ‘the man or the woman didn’t see the girl’ (Anderson et al., 1998)
Another characteristic feature of Burushaski syntax is the extensive use of case forms to mark a wide range of subordinate clause functions (Anderson, 2002).
(10) ma ma-ír-áṭe e tan a-máy-a-m
     y’all 2PL-die-SUPERESS I sad 1-become.dur-1-AP
     ‘when you all die I will be sad’ (Berger, 1998: 140)
Burushaski includes loans from a range of local languages, including Urdu, Khowar, and Shina, and even (perhaps indirectly) from Turkic languages. In some instances, loan affixes may be found as well, e.g., ḍaḍan-ci ‘big-drum drummer’ (Berger, 1998: 209). More tenuous lexical connections have been proposed with Northeast Caucasian languages and Paleo-Balkanic Indo-European languages (Casule, 1998). There is a small body of indigenous literature in Burushaski written in a modified Urdu script. In addition, various texts in transcription have appeared, including Skyhawk et al. (1996) and Skyhawk (2003).
Bibliography
Anderson G D S (1997). ‘Burushaski phonology.’ In Kaye A S & Daniels P T (eds.) Phonologies of Asia and Africa (including the Caucasus). Winona Lake, IN: Eisenbrauns. 1021–1041.
Anderson G D S (2002). ‘Case marked clausal subordination in Burushaski complex sentence structure.’ Studies in Language 26(3), 547–571.
Anderson G D S & Eggert R H (2001). ‘A typology of verb agreement in Burushaski.’ Linguistics of the Tibeto-Burman Area 24(2), 235–254.
Anderson G D S, Eggert R H, Zide N H & Ramat F (1998). Burushaski language materials. Chicago: University of Chicago Language Laboratories and Archives.
Bashir E (1985). ‘Towards a semantics of the Burushaski verb.’ In Zide A, Magier R K D & Schiller E (eds.). Proceedings of the Conference on Participant Roles: South Asia and Adjacent Areas. Bloomington: Indiana University Linguistics Club. 1–32.
Benveniste E (1949). ‘Remarques sur la classification nominale en Burusaski.’ Bulletin de la Société Linguistique de Paris 44, 64–71.
Berger H (1956). ‘Mittelmeerische Kulturpflanzennamen aus dem Burushaski.’ Münchener Studien zur Sprachwissenschaft 9, 4–33.
Berger H (1959). ‘Die Burushaski-Lehnwörter in der Zigeunersprache.’ Indo-Iranian Journal 3, 17–43.
Berger H (1974). Das Yasin-Burushaski (Werchikwar): Grammatik, Texte, Wörterbuch. Wiesbaden: Otto Harrassowitz.
Berger H (1994). ‘Kombinatorischer Lautwandel im Burushaski.’ Studien zur Indologie und Iranistik 19, 1–9.
Berger H (1998). Die Burushaski-Sprache von Hunza und Nager (3 vols). Wiesbaden: Otto Harrassowitz.
Bleichsteiner R (1930). ‘Die werschikisch-burischkische Sprache im Pamir-Gebiet und ihre Stellung zu den Japhetitensprachen des Kaukasus.’ Wiener Beiträge zur Kulturgeschichte und Linguistik 1, 289–331.
Casule I (1998). Basic Burushaski etymologies: the Indo-European and Paleo-Balkanic affinities of Burushaski. Munich: Lincom Europa.
Klimov G A & Edel’man D I (1970). Iazyk burushaski. Moscow: Akademia Nauk SSSR.
Leitner G W (1889). The Hunza and Nagyr hand-book: being an introduction to a knowledge of the language, race, and countries of Hunza, Nagyr, and a part of Yasin. Calcutta.
Lorimer D L R (1932). ‘A Burushaski text from Hunza.’ Bulletin of the School of Oriental Studies 4, 505–531.
Lorimer D L R (1935–1938). The Burushaski language (3 vols). Oslo: H. Aschehoug.
Morgenstierne G (1945). ‘Notes on Burushaski phonology.’ Norsk Tidsskrift for Sprogvidenskap 13, 59–95.
Morgenstierne G, Vogt H & Borstrøm C J (1945). ‘A triplet of Burushaski studies.’ Norsk Tidsskrift for Sprogvidenskap 13, 61–147.
Morin Y-C & Tiffou E (1988). ‘Passive in Burushaski.’ In Shibatani M (ed.) Passive and voice. Amsterdam: John Benjamins. 493–525.
Morin Y-C & Tiffou E (1989). Dictionnaire complémentaire du Bourouchaski du Yasin. Paris: Peeters/SELAF.
Skyhawk H van (2003). Burushaski-Texte aus Hispar: Materialien zum Verständnis einer archaischen Bergkultur in Nordpakistan. Wiesbaden: Otto Harrassowitz.
Skyhawk H van, Berger H & Jettmar K (1996). Libi Kisar: ein Volksepos im Burushaski von Nager. Wiesbaden: Otto Harrassowitz.
Tiffou E (1977). ‘L’Effacement de l’ergatif en bourouchaski.’ Studia Linguistica 31, 18–37.
Tiffou E (1993). Hunza proverbs. Calgary: University of Calgary Press.
Tiffou E & Patry R (1995). ‘La Notion de pluralité verbale: le cas du bourouchaski du Yasin.’ Journal Asiatique 283(2), 407–444.
Tiffou E & Pesot J (1988). Contes du Yasin. Paris: Peeters.
Tikkanen B (1995). ‘Burushaski converbs in their areal context.’ In Haspelmath M & König E (eds.) Converbs in cross-linguistic perspective: structure and meaning of adverbial verb forms – adverbial participles, gerunds. Berlin: Mouton de Gruyter. 487–528.
Toporov V N (1970). ‘About the phonological typology of Burushaski.’ In Jakobson R & Kawamoto S (eds.) Studies in general and Oriental linguistics presented to Shiro Hattori on the occasion of his sixtieth birthday. Tokyo: TEC Corporation for Language and Educational Research. 632–647.
Toporov V N (1971). ‘Burushaski and Yeniseian languages: some parallels.’ In van Poldauf I (ed.) Etudes de la phonologie, typologie et de la linguistique générale. Prague: Académie Tchécoslovaque des Sciences. 107–125.
Varma S (1941). ‘Studies in Burushaski dialectology.’ Journal of the Royal Asiatic Society of Bengal Letters 7, 133–173.
Willson S R (1990). Verb agreement and case marking in Burushaski. M.A. thesis, University of North Dakota.
C

Çabej, Eqrem (1908–1980)
Z Wąsik, Adam Mickiewicz University, Poznań, Poland
© 2006 Elsevier Ltd. All rights reserved.
Eqrem Çabej was born in Gjirokastra (at that time Turkey) on August 6, 1908, and died on August 13, 1980, in Tirana (Albania). He received his elementary education in the place of his birth and then was sent to Austria to attend a high school at Klagenfurt. Subsequently, he went to Graz and Vienna to study comparative Indo-European linguistics and Albanian philology. In 1933, he defended his doctoral dissertation ‘Italoalbanische Studien’ before the commission of Paul Kretschmer and Norbert Jokl. After graduating from Vienna University, he worked as a teacher of Albanian in secondary schools and other educational institutions, first in Gjirokastra and then in Shkodër. Spending the interwar period in Italy, which occupied Albania, Çabej studied archival documents from the Albanian past preserved there in libraries. After the end of World War II, when a two-year Pedagogical Institute in Gjirokastra had been created in 1946, he was nominated a ‘pedagogue’ in linguistics and Albanology. In 1947, he became a member of the Institute of Sciences, and in 1957 he was offered a professorial position at Tirana University. For some years he worked in the Institute of Language and Literature, and when the Academy of Sciences was formed, he was elected a member of its presidium. In 1959, Çabej defended a thesis on ‘Some aspects of historical phonetics of Albanian in the light of the language of Gjon Buzuku’ [Disa aspekte të fonetikës historike të shqipes në dritën e gjuhës së Gjon Buzukut], securing him the degree ‘candidate of philological sciences’; and, in the same year, he was given the title ‘professor’ for his theoretical and practical achievements. He had also prepared a dissertation for a doctor’s degree devoted to ‘Etymological studies in the domain of Albanian’ [Studime etimologjike në fushë të shqipes], but, meanwhile, this degree was abolished.
Etymology and history of language were the domains in which he worked until the last days of his life, taking part in all professional sessions of national and international
character, in Albania and other research centers of the Balkans and of central Europe, and publishing in journals all over the world. The scientific activity of E. Çabej embraces two phases. In the first phase, 1929–1945, he was a philologist, folklorist, dialectologist, and ethnographer; in the second, 1945–1980, the focus of his interest shifted to linguistics, etymology and historical phonetics, lexicology, and lexicography. Çabej’s first step in describing his own native language in terms of comparative linguistics was a dissertation devoted to Italian and Albanian, a copy of which is available in Vienna University. He paid particular attention to the roots and the place of Albanian in the Balkans. His chrestomathy for high school pupils, ‘Elements of linguistics and Albanian literature’ [Elemente të gjuhësisë e të literaturës shqipe] (1936), comprised, in addition to literary texts, material on linguistic classifications and the distribution of Albanian dialects. In it he defended his hypothesis concerning the Illyrian ancestry of his native tongue, drawing on the opinions of philosophers such as G. W. Leibniz, J. E. Thunmann, and J. P. Fallmerayer, and linguists such as G. Meyer and F. X. von Miklosic, as well as P. Kretschmer and N. Jokl. His next monograph, ‘On the genesis of Albanian literature’ [Për gjenezën e literaturës shqipe] (1939), is characterized by its etymological explorations of the ethnonyms Arbën, Arbër, and Arbëresh, and by the first historical periodization of Albanian literature. Between 1935 and 1942, Çabej published several works on linguistics, folklore studies, and mythology, chiefly in Revue internationale des études balkaniques, Knjige o Balkanu, and Leipziger Vierteljahresschrift für Südosteuropa. In the 1940s, he collaborated with Hrvatska Enciklopedija (1941) and later prepared ‘The linguistic Atlas of Albanian’ (1943).
Decisive for Çabej’s linguistic reorientation was the study of the ‘Missal of Gjon Buzuku’ (Meshari i Gjon Buzukut) from 1555, preserved in the Vatican Library. The results of his studies of this translation of the Catholic missal and his acquaintance with texts of other Albanian authors led him to write both a series of articles on historical morphology and phonetics published in the Bulletin
for Social Sciences at Tirana and a critical edition of the work of Gjon Buzuku (1968) with philological elaboration and explanation throwing light on literary traditions of earlier times. On the basis of findings presented in his earlier works he wrote a treatise, ‘On some basic problems of the ancient history of Albanian,’ which was translated into Italian, French, and English. In some other articles published for international organizations Çabej pointed to the role of Albanian in relation to historically cognate and geographically adjacent languages. The results of his historical studies are reflected in two monographs edited as textbooks for students of Albanian language and literature, ‘Introduction to the history of Albanian’ and ‘Historical phonetics of Albanian’ (published in one volume in 1970). The core of numerous publications of the 1960s and 1970s, however, was Çabej’s answers to unresolved questions: whether Albanians are descendants of Illyrians and whether they had always lived in the territories they occupy at present. He provided counterarguments to the claims of G. L. Weigand and other researchers regarding the non-autochthonous character of Albanians. Çabej’s studies on agricultural terminology beginning in antiquity show the sedentary character of the Albanian tribes.
Çabej’s opus vitae is his ‘Etymological studies in the domain of Albanian’ (Studime etimologjike në fushë të shqipes), discussed and interpreted successively in parts between 1969 and 1979. With his historical experience he contributed decisively to the codification and standardization of the Albanian language, serving on the editorial boards of practically all dictionaries, both monolingual and multilingual, published in Albania after the War, as well as in working groups concerned with orthography, including the Congress of 1972.

See also: Albanian; Leibniz, Gottfried Wilhelm (1646–1716); Miklošič, Franc (1813–1891).

Bibliography
Blaku M (1980). ‘In memoriam: Prof. Eqrem Çabej, Nestor i gjuhësisë shqiptare.’ Fjala XIII 15/1, 8–9.
Kastrati J (1981). ‘Bibliografia e Prof. Eqrem Çabejt (1929–1981).’ Studime filologjike 3, 219–254.
Wąsik Z (1985). ‘Profesor Dr Eqrem Çabej (1908–1980), wybitny filolog, twórca współczesnego językoznawstwa albańskiego.’ Acta Universitatis Wratislaviensis 777. Studia Linguistica IX, 99–114.

Cacaopera
See: Misumalpan.

Caddoan Languages
D Rood, University of Colorado, Boulder, CO, USA
© 2006 Elsevier Ltd. All rights reserved.
Caddoan is a family of North American languages consisting of two branches: Caddo, formerly spoken in Texas and Louisiana, and now spoken only in Oklahoma; and North Caddoan, found in the central Plains from Oklahoma to North Dakota. The North Caddoan languages include Arikara, Pawnee, Kitsai, and Wichita. Arikara and Pawnee are linguistically very close, while Kitsai falls between them and Wichita.
Language Structure
The Caddoan languages have extremely small phoneme inventories, but complex morphophonemics. They are morphologically and syntactically prototypical examples of polysynthetic structure. The proposed phoneme inventory for the family is */p, t, k, c (= [ts]), s, w, n, r, y, ʔ, h, i, a, u/ (Chafe, 1979: 218–219). Caddo has a somewhat larger set, which appears to result from relatively recent expansion. Caddoan verbs consist of 30 or more positional slots into which bound morphemes may be inserted; the verb root occurs near the end. In addition to expected categories like tense, modality, aspect,
pronoun, number, evidential, and verb root, there are slots for certain adverbs, incorporated objects, patient definiteness (in Wichita and possibly others), and derivational stem-forming elements. All the languages have a bipartite verb stem for many verbs; a class of ‘preverbs’ occurs separated from the root by several slots. Nouns generally may take only one of two or three suffixes: an ‘absolutive’ (which occurs only when the noun is used alone), a locative, or, in some of the languages, an instrumental. Noun compounds are frequent and productively formed. All the languages lack adpositions and most adjectives. Sentential argument structure (subject, object, indirect object, possessor) is marked entirely in the verbal complex; word order in clauses has strictly pragmatic functions. Intransitive verbs fall into two classes depending on whether their subjects are marked by transitive object pronouns or transitive agent pronouns.
History and Scholarship
Europeans first encountered speakers of Caddoan languages during the 16th-century Spanish expeditions from Mexico searching for Quivira (the land supposed to have included El Dorado, a rumored but non-existent city with streets of gold). Maps from those expeditions record a few (now largely uninterpretable) place names, but beyond that most information on the languages has been collected since the 1960s. Kitsai was recorded as spoken by its last monolingual speaker in the early 20th century, but none of the data has been published. The other languages continued to have a few speakers at the beginning of the 21st century, but all will probably be extinct by 2025, despite language preservation and revival efforts. Large text collections and good grammars are available for two of the languages, Arikara and Pawnee, thanks to the work of Douglas R. Parks. Parks has also coauthored a series of Arikara teaching grammars and a dictionary for elementary school students.
Wichita is documented in a grammar, several articles about grammatical phenomena, and a few texts by David S. Rood, as well as audio and video documentation archived at the Max Planck Institute for Psycholinguistics in Nijmegen, the Netherlands. For Caddo, see the texts by Wallace L. Chafe and the detailed description of verb morphology by Lynette Melnar. Allan R. Taylor and W. L. Chafe have published on the history of the Caddoan language family (see Chafe, 1979, for further reading). See also: Adpositions; Endangered Languages; Polysynthetic Language: Central Siberian Yupik; United States of America: Language Situation.
Bibliography
Chafe W L (1979). ‘Caddoan.’ In Campbell L & Mithun M (eds.) The languages of native America: Historical and comparative assessment. Austin, TX: University of Texas Press.
Chafe W L (2005). ‘Caddo.’ In Hardy H K & Scancarelli J (eds.) The native languages of the southeastern United States. Lincoln, NE: University of Nebraska Press.
Melnar L R (2004). Caddo verb morphology. Lincoln, NE: University of Nebraska Press.
Parks D R (1976). A grammar of Pawnee. New York: Garland.
Parks D R (ed.) (1977). Native American texts series, vol. 2, no. 1: Caddoan texts. Chicago: University of Chicago Press.
Parks D R (1991). Traditional narratives of the Arikara Indians (4 vols). Lincoln, NE: University of Nebraska Press.
Parks D R (2005). An elementary dictionary of Skiri Pawnee. Lincoln, NE: University of Nebraska Press.
Parks D R, Beltran J & Waters E P (1998–2001). An introduction to the Arikara language: Sahniš Wakuunu’ (2 vols). Roseglen, ND: White Shield School. [Multimedia versions on CD are available from the American Indian Research Institute, Bloomington, IN.]
Rood D S (1976). Wichita grammar. New York: Garland.
Rood D S & Lamar D J (1992). Wichita language lessons (manual and tape recordings). Anadarko, OK: Wichita and Affiliated Tribes.
Caldwell, Robert (1814–1891)
J-L Chevillard, CNRS – Université Paris 7, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Among all the Europeans who have studied the languages of south India, Bishop Caldwell (see Anonymous, IJDL XVIII–1, 1989 for his biography) is probably one of the most famous. He was born in Ireland and arrived in India in 1838 as a Protestant missionary. In his Comparative grammar of the Dravidian or south-Indian family of languages, first published in 1856 in London, he is credited with demonstrating what had been until then hypothesized by earlier writers (see Ellis, Francis Whyte (ca. 1778–1819)), namely that several languages of south India are related and belong to one and the same family. Caldwell called this family ‘Dravidian,’ from the Sanskrit drāviḍa, which had sometimes been used to refer to the Tamil language and people, and sometimes more vaguely to south Indian peoples (see Krishnamurti, 2003: 1–2). From Caldwell onwards, the word ‘Dravidian’ has been used mainly in two contexts: (1) comparative Dravidian linguistics, where Caldwell’s (1875) lists of “six cultivated dialects” (Tamil, Malayalam, Telugu, Canarese (Kannada), Tulu, and Kudagu) and “six uncultivated dialects” (Tuda (Toda), Kôta (Kota), Goṇḍ, Khond (Kui), Orâon, and Râjmahâl) have now been extended to “23 modern languages plus three ancient ones” (Steever, 1998); and (2) politics, with the success of parties such as the DMK, or Tirāviṭa Munnērrak Kalakam ‘Dravidian Progress Association’ (see Ramaswamy, 1997). Some of the most significant continuators of Caldwell’s theories in the field of Dravidian linguistics in the 20th century have been Jules Bloch and M. B. Emeneau, the latter being responsible, along with T. Burrow, for the important Dravidian etymological dictionary (1984). Caldwell is also known for his 1881 A political and general history of the district of Tinnevelly, in the presidency of Madras, from the earliest period to its cession to the English government in A.D. 1801. He is one of three Europeans (along with Beschi and Pope) to have his statue near the Marina Beach in Chennai.

See also: Beschi, Constanzo Guiseppe (1680–1747); Bloch, Jules (1880–1953); Burrow, Thomas (1909–1986); Dravidian Languages; Ellis, Francis Whyte (ca. 1778–1819); Emeneau, Murray Barnson (b. 1904).
Bibliography
Andronov M S (1999). Dravidian historical linguistics. Moscow: The Russian Academy of Sciences, Institute of Oriental Studies.
Anonymous (1989). ‘Bishop Caldwell.’ In International Journal of Dravidian Linguistics (IJDL) XVIII–1. 42–66.
Bloch J (1954). The grammatical structure of Dravidian languages. Tr. from the 1946 original French [La Structure Grammaticale des Langues Dravidiennes, Librairie d’Amérique et d’Orient, Adrien-Maisonneuve, Paris], Poona: Deccan College Hand-Book Series.
Burrow T & Emeneau M B (1984). A Dravidian etymological dictionary (2nd edn.). Oxford: Clarendon Press.
Caldwell R (1st edn. 1856, 2nd edn. 1875, 3rd edn. 1913, 1974). A comparative grammar of the Dravidian or South-Indian family of languages (3rd edn.), rev. and ed. by Wyatt J L & Pillai R R. [Reprint of the 3rd edn., originally printed by K. Paul, Trench, Trübner & Co., Ltd., London]. New Delhi: Oriental Books Reprint Corporation.
Caldwell R (1881). A political and general history of the district of Tinnevelly, in the presidency of Madras, from the earliest period to its cession to the English government in A.D. 1801. Madras: Government Press. Reprinted (1982). New Delhi: Asian Educational Services.
Krishnamurti B (2003). The Dravidian languages (Cambridge language surveys). Cambridge: Cambridge University Press.
Ramaswamy S (1997). Passions of the tongue. Berkeley, CA: University of California Press.
Steever S B (ed.) (1998). The Dravidian languages (Routledge language family descriptions). London and New York: Routledge.
Subrahmanyam P S (1983). Dravidian comparative phonology. Annamalai Nagar: Annamalai University.
Zvelebil K V (1990). Dravidian linguistics, an introduction. Pondicherry: Pondicherry Institute of Linguistics and Culture.
Calligraphy, East Asian
A Gaur, Surbiton, Surrey, UK
© 2006 Elsevier Ltd. All rights reserved.
Chinese calligraphy depends on the brush, paper (created in the 2nd century A.D.), and the multiple forms of Chinese characters. Chinese painters and calligraphers use the same instruments and the same material, but most artists would rather be remembered as calligraphers than as painters. In China, calligraphy is art, perhaps the highest form of art possible. Another distinguishing point is that it was practiced amongst equals; it was never simply commissioned. In calligraphy, it is not only the hand that writes, but the whole arm, the whole body and, above all, the whole mind. The earliest known examples of Chinese writing go back to the Shang period (ca. 1766–1122 B.C.), to a script called Jiaguwen. The next script, Jinwen zhongdingwen, was used during the Zhou period (late 11th century B.C.) but did not yet exhibit many signs of calligraphic distinction. It was supplanted by the Great Seal script, Dazhuan, which flourished between 1700 and 800 B.C. In the 3rd century B.C., China was finally united under the first Qin Emperor Shi Huang Di (259–210 B.C.), and (we are told) on his instructions a new script, Xiaozhuan, the Small Seal script, was created to meet the growing demand for documents and records. Though it was the basis for later calligraphic developments, it was still only written with the tip of a long-haired brush, mainly on bamboo slips or wood. In addition, Xiaozhuan could not be written with speed, a serious shortcoming for a script specially designed to serve an increasing bureaucracy. However, Lishu (the ‘clerical script’), a simplified version of the Small Seal script, which allowed the brush to move swiftly over paper, was designed. Between 200 and 400 A.D., three more variations of Lishu came into existence: Caoshu (fl. 200–400 A.D.), Xingshu (fl. from the 3rd century A.D.
to the present) and the most important variation, Kaishu, the ‘proper style of Chinese writing.’ Kaishu was used for public documents and private correspondence, and eventually also for block-printed books. It also served as an examination subject in the Civil Service examination, which started during the Tang period (618–907 A.D.) and was abolished only in 1905. Kaishu allowed for a maximum of individuality. Its greatest exponents were ‘the two Wangs’ (father and son), who lived in the 4th century A.D. and influenced not only Chinese but also Korean and Japanese calligraphy. Wang Xizhi relaxed the tension in the arrangements of strokes and by doing so furthered
the two other styles: Xingshu, the ‘running script,’ and Caoshu, the ‘grass script.’ Chinese script, paper, ink, and the Chinese way of writing were brought to Korea and Japan during the earlier part of the first millennium A.D. But the Korean and the Japanese languages are ill suited to being written in Chinese characters, and fairly soon attempts at simplification were made. In 1446, the Korean King Sejong promulgated an alphabetic script called Hangul, which consisted of only 11 basic vowels and 17 consonant signs. Korean could have been written in this script at this time, but the hostility of the Chinese-educated elite relegated it mainly to the use of women authors and people of low rank. Overall, Korean calligraphers relied on copying the great Chinese masters. During the Koryo dynasty (918–1392), a square angular form was used; this was eventually followed by the zhao style, again copied from the Chinese calligrapher Zhao Mengfu (1254–1322). The most famous calligrapher of the Choson period (1392–1910) was Kim Ch’ong hui (1786–1856), a member of the School of Practical Learning. After World War II, Korean calligraphy lost its importance, but today there is, for the first time, an attempt to use calligraphy based almost exclusively on King Sejong’s Korean alphabet. In Japan, the situation was similar. Between the 8th and the 10th centuries, two syllabaries, katakana and hiragana, appeared, provoking similar reactions from the Chinese-speaking elite as in Korea. Proper Japanese calligraphy begins in the Nara period (710–794 A.D.), written in kanji (Chinese characters) mostly based on Chinese Tang models and the ‘two Wangs.’ Buddhist sutra literature preferred kanji styles such as Kaisho, Gyosho, and Sosho. In the 9th century, during the Heian period (794–1185), Japan terminated the embassies to China, and Japanese calligraphers began to interpret, not just copy, Chinese models.
The Heian period also saw new and more sophisticated trends, such as novels written by women entirely in an elegant hiragana style known as onnade (women’s writing). The 16th century once more encouraged close contacts with China, and new styles of Ming calligraphy were taken up, but a century later the pendulum swung back to Japanese, largely through the masters of the Kan’ei period (1624–1644). Apart from the wayo (Japanese) tradition, there developed another highly original style of calligraphy, which traces its origin back to Chinese Chan Buddhism. It reaches as far back as the 13th century, when the Zen Sect was formed by the monks Eisai (1141–1215) and Dogen (1200–1253). In the newly founded Zen monasteries,
a special type of calligraphy developed, referred to, especially after the 14th century, as Bokuseki (traces of ink). This is a highly distinctive form of calligraphic writing, and it eventually became connected with the aesthetics of the tea ceremony. Today calligraphy is still held in the highest esteem in Japan, and the work of good calligraphers sells at adequate (or, as we would call them, exorbitant) prices.
Bibliography
Earnshaw C J (1988). Sho: Japanese calligraphy. An in-depth introduction to the art of writing characters. Tokyo: Charles E. Tuttle Company.
Gaur A (1994). A history of calligraphy. London: British Library.
Kim Y-Y (1959). Hanguk sohwa immyong saso: bibliographical dictionary of Korean artists and calligraphers. Seoul: KOIS.
Mote F W & Hung-lam C (1988). Calligraphy and the East Asian book. Boston: Horticultural Hall.
Nakata Y (1983). The art of Japanese calligraphy (3rd edn.). Alan Woodhull (trans.). New York/Tokyo: Weatherhill/Heibonsha.
Robinson A (1995). The story of writing. London: Thames and Hudson.
Yee C (1973). Chinese calligraphy, an introduction to its aesthetic and technique. Cambridge, Boston.
Calligraphy, Islamic A Gaur, Surbiton, Surrey, UK © 2006 Elsevier Ltd. All rights reserved.
Islamic calligraphy begins with the Qur’an and the need for its precise and appropriate transmission. The sacred text had been revealed in Arabic over a period of some 23 years to the Prophet Muhammad, which gave both language and script a new status. At first the various parts were preserved, either through oral traditions or recorded on different materials (wood, paper, parchment, bone, leather, etc.). In 633 A.D., during the battles following Muhammad’s death, many of the story tellers (huffaz) were killed, and fearing for the safety of the revelation, Abu Bakr (r. 632–634 A.D.), the first Caliph, instructed one of the Prophet’s secretaries to compile the full text into one book. The book that appeared in 651 A.D. still forms the authentic version of every Qur’an. In the 7th century, the Arabs possessed a script of their own, a stiff angular development of Nabataean called Jazm, which was mainly used for commercial purposes. The earliest copies of the Qur’an were written in variations of Jazm named after the towns where they had originated: Anbari (after Anbar), Hiri (after Hirah), Makki (after Mecca), Madani (after Medina), and so on. None of them was well defined and only two achieved a measure of prominence: a round form used in Medina called Mudawwar and a more angular form under the name of Mabsut. Finally, after a number of experiments, a style developed named after the city of its origin: Kufah. This was a bold, elongated, and straight-lined script, which for the next 300 years became the main script for copying the Qur’an. By the late 10th century, two distinct forms
of Kufic emerged: eastern Kufic that developed in Persia and western Kufic, eventually called Maghribi. Maghribi originated around Tunis and became the source of various scripts of North and West Africa, and of Andalusia. After the 13th century, Kufic went out of general use and was from then on mainly used for decorations. Besides the elongated Kufic, a number of more rounded, cursive scripts had been used for personal use and for administration. Early attempts at improvements had led to the creation of some 20 different styles, many short-lived, all lacking elegance and discipline. In the 10th century, Ibn Muqlah (886–940 A.D.), an accomplished Baghdad calligrapher, set out to redesign them so as to make them suitable for writing the Qur’an. His system of calligraphy rested on three mathematical measurements: the rhombic dot, the standard alif, and the standard circle. The rhombic dot is formed by pressing the pen diagonally on paper so that each of the dot’s equal sides is as long as the pen is wide; the standard alif is a straight vertical line measuring a specific number of dots (mostly between five and seven); and the standard circle has a diameter equal to the length of the standard alif. Thus the various cursive styles were ultimately dependent on the width of the pen and the number of dots fashioning the standard alif. Ibn Muqlah’s reform (known as al-Khatt al-Mansub) was successfully applied to the sittah, the six major styles known as Thuluth, Naskhi (the most popular form of writing in the Arab world and, after 1000 A.D., the standard script for copying the Qur’an), Muhaqqaq, Rayhani (another popular Qur’an script), Riqa (favored by the Ottoman calligraphers), and Tawqi. Under Ibn Muqlah’s influence, four more styles
Calligraphy, South Asian and Tibetan 179
were eventually accorded similar status: Ghubar, Tumar, Ta’liq, and Nasta’liq. Those cursive styles were eventually further perfected by two other famous calligraphers: Ibn al-Bawwab (d. 1022 A.D.) and Yaqut (d. 1298). Ibn Muqlah’s reform had not been accepted in the Maghrib, the western part of the quickly extending Muslim empire, where copying acknowledged masters preserved the purity of the style. Between 800 and 1200 A.D., the city of Kairouan (in present-day Tunisia) was an important religious and cultural center. The Maghribi style introduced a rounding of rectangular curves into semi-circles, while the final flourishes of letters are often extended, sometimes touching other letters in the adjoining word. Maghribi became the main script in Northwest Africa and Spain and was responsible for the creation of important substyles such as Qayrawani, Fasi, Andalusi, and Sudani. After the extension of Islam to Persia, Turkey, and further east to Malaysia (even China), the Arabic script had to be adapted to languages belonging to different linguistic families. This meant some changes in the script, but it also opened the possibility for new forms of calligraphy. In the 16th century, Persian calligraphers developed Ta’liq, an already existing style, which became influential in the eastern part of the Islamic world, gaining special favor in Turkey and India. A later development of the same style, Nasta’liq, was mainly used for secular literature. In the middle of the 17th century, a style called Shikasteh (‘broken form’) developed in Herat. Characterized by exaggerated density and closely connected ligatures, it became the preferred script for Persian and Urdu correspondence. Persian calligraphers and Persian influence brought Nasta’liq to India and Afghanistan. During the 14th century, a minor Indian style called Bihari arose, which was characterized by the use of colors.
Chinese Muslims generally used the style prevalent in Afghanistan, but a special script called Sini, sometimes written with a brush, was used for writing on ceramics and china. Soon after the defeat of the Mamluks in 1517, Turkish dominion extended over most of the Arab world. From then on, Islamic art and calligraphy became increasingly associated with the Ottoman Turks, who not only excelled in most calligraphic styles but also created some highly effective scripts of their own. The two most important are Diwani and Jali. The Turks also excelled in the art of mirror writing, where the left side reflects the writing on the right. Another style, Siyaqat, combines complexity of line with elements of cryptography and was used to communicate important political information. An impressive calligraphic device is the Tughra, an ornamental design based on the name and titles of the reigning Sultan that served as a signature legitimizing official decrees.
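Ibn Muqlah’s three measures, as described earlier in this article, all follow from the width of the pen and the number of dots chosen for the alif. The short Python sketch below works through that arithmetic. The function and class names are hypothetical, and it assumes for simplicity that each dot adds one pen width to the height of the alif; a rhombic dot measured corner to corner would be somewhat taller, so the figures are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proportions:
    dot_side: float         # side of the rhombic dot
    alif_height: float      # height of the standard alif
    circle_diameter: float  # diameter of the standard circle


def ibn_muqlah_proportions(pen_width: float, dots_per_alif: int = 7) -> Proportions:
    """Derive the three measures from the pen width (hypothetical helper).

    Assumption: one dot contributes one pen width to the alif's height.
    """
    if pen_width <= 0:
        raise ValueError("pen_width must be positive")
    if dots_per_alif < 1:
        raise ValueError("dots_per_alif must be at least 1")
    dot_side = pen_width                     # dot sides equal the pen width
    alif_height = dots_per_alif * dot_side   # alif measured in dots
    circle_diameter = alif_height            # circle spans one alif
    return Proportions(dot_side, alif_height, circle_diameter)
```

Calling `ibn_muqlah_proportions(2.0, 5)`, for instance, gives a dot side of 2.0 and an alif and circle of 10.0 in the same unit, which shows how a wider pen or a longer alif rescales the whole script.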
Bibliography Brend B (1991). Islamic art. London: British Museum Publications. Edgu F (1980). Turkish calligraphy. Engl. (trans.). Istanbul: Ada. Gaur A (1994). A history of calligraphy. London: British Library. Lings M & Safadi Y H (1976). The Qur’an. An exhibition catalogue. London: British Library. Rice D S (1955). The unique Ibn al-Bawwab manuscript in the Chester Beatty Library. Dublin: Emery Walker. Safadi Y H (1978). Islamic calligraphy. London: Thames and Hudson. Schimmel A (1970). Islamic calligraphy. Iconography of religion. Leiden: Brill. Schimmel A (1990). Calligraphy and Islamic culture (2nd edn.). London: I. B. Tauris and Co.
Calligraphy, South Asian and Tibetan A Gaur, Surbiton, Surrey, UK © 2006 Elsevier Ltd. All rights reserved.
Although the term ‘calligraphy’ derives from graphein (to write) and kallos (beautiful), beautiful writing in itself is not calligraphy. Fine writing, even the development of distinct styles, is not necessarily calligraphy. Calligraphy makes a statement about a particular society, a statement about the sum total of its cultural and historical heritage. As such, it
results from the interaction between several essential elements: the attitude of society to writing; the importance and function of the text; definite, often mathematically based rules about the correct interaction between lines and space and their relationship to each other; and mastery and understanding of the script, the writing material, and the tools used for writing. Calligraphy is to a large extent an expression of harmony, as perceived by a particular society. Calligraphy also encourages a certain amount of individuality, though within strictly confined
circles. Only three civilizations have produced true calligraphy: the Arabs (and those who use the Arab script), the Chinese (and those who use the Chinese script), and Western civilization based on Roman letters, Roman laws, and the Christian Church. India, and with it the scripts of South and Southeast Asia that developed from Indian prototypes, did not create calligraphy in the strictest sense of the word, mainly for two reasons. First, there was the lack of writing materials and writing tools suitable for calligraphy: palm leaves into which the letters had to be incised with a metal stylus (or in the north written with a reed pen). Second, there was the attitude to writing. Though writing, as is generally assumed, had been introduced by Semitic traders in the 6th or 7th century B.C., Hinduism, the religion of the area, was decisively hostile to it. The memorizing and the recital of the Vedic hymns was predominantly the property of certain Brahmanical subcastes whose status depended on maintaining this monopoly. Buddhism too, though not overtly hostile to writing, placed the importance of the text above its visual representation. Monks should not take delight in visual beauty. In consequence, the vast majority of South Indian and Sri Lankan palm-leaf manuscripts are at best only adequately, and indeed often indifferently, written. Indian manuscripts, and with them Indian scripts, are predominantly meant to provide information. Only a few surviving manuscripts from India predate the 11th century, and those come mostly from the north or from the Jain area. Though some have beautiful illustrations, the script (whether Siddhamatrika or Kutila) is well done but uninspiring. In Tibet, where writing was introduced together with Buddhism in the 7th century A.D., writing was taught in the monasteries as part of the curriculum.
Only about three styles developed: a book hand (dbu-can), a more cursive script for everyday life (dbu-med) or for official documents (bam-yig), and decorative scripts (bru-tsha). None of them displayed any calligraphic traditions. Fine writing did, however, play a major part in the complex and esoteric world of Hindu Tantras, popular Daoism, and most of all Tantric Buddhism. It was in Tantric Buddhism that beautiful writing, combined with other elements, eventually moved toward calligraphy. The script that underwent this transformation was siddham, an Indian syllabic script that goes back to the Indian Gupta period (320–647 A.D.).
According to the legends associated with Tantric Buddhism, the siddham letters ‘exploded’ out of emptiness and were taught by the Buddha but kept secret until the Indian saint Nagarjuna revealed them to his disciples. From the 7th century onward, siddham letters were mostly used for the representation of ‘seed syllables’ within mandalas (sacred diagrams), each letter personifying a different cosmic force of the Buddha. Awareness of emptiness, so the teaching goes, is transformed into a seed syllable; from the seed develops the Buddha, who may be portrayed by an icon (in this case the seed syllable); and contemplation of the icon unites the devotee with the seed and returns him to emptiness. Buddhism brought Sanskrit texts, mainly written in siddham script, to China. Unlike India, China had always given much importance to the written word, since the large number of different dialects made oral communication difficult. In keeping with this attitude, Chinese Buddhists paid great attention to the form and the correct construction of siddham characters. Once the pen was replaced by the Chinese brush, siddham became a special branch of Chinese calligraphy connected with sacred writing. From China, Buddhism brought the siddham script to Korea, and in the 9th century two Japanese monks, Kukai (773–835 A.D.) and Saicho (767–822 A.D.), who had both studied in China, introduced it to Japan, where it soon gained considerable popularity within certain circles. Both the Heian (794–1185) and the Kamakura period (1185–1333) produced a number of siddham masters. After a period of decline, siddham calligraphy re-emerged in the 17th century. It is still an important calligraphic tradition and has indeed experienced something of a renaissance. There are today prominent modern siddham masters whose work is much valued, aesthetically as well as financially.
Bibliography Gaur A (1994). A history of calligraphy. London: British Library. Lauf D I (1976). Tibetan sacred art: the heritage of Tantra. London: Shambhala Publication Inc. Legeza L (1975). Tao magic, the sacred language of diagrams and calligraphy. London: Thames and Hudson. Losty J P (1982). The art of the book in India. London: British Library. Nakata Y (1983). Chinese calligraphy. New York: Weatherhill.
Calligraphy, Western, Modern 181
Calligraphy, Western, Modern A Gaur, Surbiton, Surrey, UK © 2006 Elsevier Ltd. All rights reserved.
The 20th century saw a remarkable revival of Western calligraphy. The motivation for it lay partly in a growing unease about some of the more ugly aspects of the Industrial Revolution. Life now seemed increasingly dominated by shoddy, machine-made objects, which no longer had any direct connection with their users. This brought about a nostalgic yearning for the past and with it a growing interest in medieval art and craftsmanship. Such sentiments were intellectually underwritten by the philosophy of John Ruskin (1819–1900), by artistic movements such as the Pre-Raphaelite Brotherhood and to some extent the Gothic Revival. The eventual re-emergence of calligraphy was, however, largely rooted in the stimuli created by the Arts and Crafts Movement of the 1880s and 1890s, the work of William Morris and, most of all, Edward Johnston.
Calligraphy in Britain
Between 1870 and 1876, the poet, writer, and (greatly idealistic) Socialist William Morris (1834–1896), who until then had been much occupied with creating designs for wallpapers, glass, textiles, tapestry, and print, turned his attention to medieval and humanistic-style manuscripts. He experimented with various scripts, studying scribal techniques and using quill and parchment to achieve results. His calligraphy shows good rhythmic quality but lacks an understanding of the shape of letters and their inner relationship. Nevertheless, his manuscripts, and the research and patronage connected with his work, created great interest and opened the way for calligraphic reforms. In 1890, Morris founded the Kelmscott Press and successfully tried his hand at engraving, type designing, and high-quality printing. In the same year, T. J. Cobden-Sanderson’s Doves Press was created. Both presses considerably increased the status of the book, eventually commissioning calligraphers to design the type. In due course, Morris became one of the moving spirits behind the Central School of Arts and Crafts, which had originally been founded in 1896 by William Richard Lethaby (1857–1931). There, eventually, Sir Sydney Carlyle Cockerell (1867–1962), one of Morris’s secretaries, taught calligraphy and lettering. It was, however, Edward Johnston (1872–1944) who was most decisively responsible for the renewal of Western calligraphy. Impressed by Morris’s ideas,
he abandoned his study of medicine and turned his attention to the manuscripts in the British Museum. In the process, Johnston rediscovered the lost technique of writing. He realized that the nature and form of a script were determined by the way the pen was held, and that the proportions of a letter stood in direct ratio to the breadth of the pen’s edge, which, trimmed chisel-wise, could produce that range of graduation from the thickest strokes to the finest of hairlines that characterized the best medieval works. He also taught himself how to cut and sharpen reeds, bamboo, and quills. In 1898, Johnston began to teach, first at the Central School of Arts and Crafts and, later, at the Royal College of Art (where Lethaby worked as Professor for Ornament and Design). As a teacher, Johnston had a decisive influence on calligraphy and typography, particularly in England and Germany. His pupils included Eric Gill (who later became a well-known sculptor, engraver, and letterer), Noel Rooke (who engraved illustrations for Johnston’s later works), William Graily Hewitt, Percy J. Delf Smith (who became the honorary Secretary of the short-lived Society of Calligraphers), and most of all the highly gifted Anna Simons (who introduced Johnston’s method to Germany, Austria, Switzerland, and the Netherlands). Other art schools followed the example of the Royal College of Art, offering courses in lettering and writing. The first was Birmingham; Leicester College of Art came next, and eventually the subject became part of the curriculum in most art schools throughout the country. Type design, which had for so long been in the hands of engineers, passed into the hands of artists and calligraphers such as Stanley Morison, Jan van Krimpen, Bruce Rogers, and Victor Hammer. By selecting fine alphabets for font material, they ensured that those alphabets were used for books (printed as well as manuscript) and book covers, and they began to influence the private market.
Johnston himself had done some of his best work for church service books, wedding gifts, presentations and the like as a result of commissions from private patrons and public bodies. Calligraphy had always been used for such purposes, but now it received a new impetus and new quality. In 1906, Johnston published his first book, Writing & illuminating & lettering. It consisted of 500 pages, illustrated with his own and Rooke’s drawings, and reproductions from historic manuscripts. It was instructive, stimulating, technically helpful, and in due course it became an important handbook for calligraphers, not only in Britain but also in Germany,
the United States, and Australia. Three years later, Johnston’s second book, Manuscript and inscription letters, appeared, including a number of plates by Eric Gill; it was specially meant for schools and craftsmen. Other writing manuals followed. In 1916, Graily Hewitt’s Handwriting manual was published. Hewitt had replaced Johnston at the Central School of Arts and Crafts four years earlier. He admired the Humanist manuscripts of the 15th century, but his great achievement was the recovery of the craft of laying and burnishing gold leaf. Alfred Fairbank, one of his students, turned his interest to italic handwriting. One of his aims was the improvement of everybody’s handwriting. In 1932, his A handwriting manual came out. It was a forerunner of books on italic script. In 1952, Fairbank became President of the Society of Scribes and Illuminators (SSI) and encouraged the formation of a Society for Italic Handwriting. Finally, in 1955, J. H. Benson’s The first writing book: Arrighi’s ‘La Operina’ appeared, going back to the early copybooks of the Italian masters. In 1915, the London Transport Services commissioned Johnston to design a new alphabet for publicity and signs; the letters can still be seen all over London, especially on the underground. The result was a sans serif block letter alphabet based on classical Roman proportion, which, during the coming decades, exerted considerable influence on the choice of letterforms used in advertising. By reaching commerce, calligraphy began to play a role in the everyday life and the everyday business of people. Newspapers, journals, and magazines began to display more lavish and in many cases better-written and better-composed advertisements. It was (and is) indeed in the sphere of advertising that many calligraphers found a new and lucrative outlet. The Society of Scribes and Illuminators (SSI) was founded in 1921.
The idea came originally from Graily Hewitt and Laurence Christie, who had both been students of Johnston and were now teaching at the Central School of Arts and Crafts. The Society held its first exhibition a year later in Brook Street Gallery: it showed 106 works by 31 members of the Society. In the beginning, it was fairly easy to be accepted by the Society as a Fellow, but as time passed higher and higher standards were required and the reputation of the Society rose. In 1924, the Society set up small research groups to study particular problems and techniques: writing on skins, quality of pigments, preparation of inks, methods of gilding, and styles of cursive handwriting. The results of this research provided the basis for the first Calligrapher’s handbook, compiled during the 1950s. The Society also produced an excellent, worldwide, and still active journal: The Scribe. Several members
(such as Alfred Fairbank and Joan Kingsford) wrote manuscripts for private clients; some of them can still be seen in national museums and libraries. Already in 1931, the Society of Scribes and Illuminators had arranged, together with the Victoria and Albert Museum, an exhibition of Three centuries of illuminated addresses, diplomas and honorary freedom scrolls. The exhibition included five Freedom Scrolls made for City Livery Companies by Edward Johnston and a good number of presentation addresses executed by Graily Hewitt, Ida Henstock, Laurence Christie, Daisy Alcock, and others. With the increase in calligraphic activities in Great Britain, some exhibitions were (in 1930 and 1938) sent to the United States, the first at the invitation of the American Institute of Graphic Arts. The second exhibition was shown in New York, Boston, Chicago, as well as the Universities of Yale and Pittsburgh. Other exhibitions followed. They gave contemporary calligraphy an opportunity to come to the attention of a wider public. The years following World War II created a need for Rolls of Honor and provided the Society of Scribes and Illuminators with new opportunities. The manipulation of letterforms has always been at the center of Western calligraphy. The 20th century’s revival of the craft was closely connected with a reform of letter carving. In England, this reform, largely promoted by Eric Gill, based itself to a considerable extent on the Roman lettering of Trajan’s column. Analyzed in detail, such letters were soon taught in every art school and became models for sign writing, street names, memorials, foundation stones, and so on.
Calligraphy and 20th-Century Art
The new use of letterforms touched other aspects of life as well. Especially on the Continent, graphic artists, painters, and (mostly) politically motivated groups of artists such as the Dadaists, the Constructivists, and the more moderate Bauhaus, began to involve lettering in their publications. The Dadaists, founded in 1917 in Zurich, were nihilistic groups of artists, who aimed at demolishing current aesthetic standards that they linked with bourgeois values. Seeing letters as the normal expression of a conventional society, they began to turn them into instruments of attack. The chaos of typefaces used for their magazine Dada illustrates this point. The Constructivists used the disposition, the size, and the weight of the components of individual letters to create unique abstract patterns, which they saw as representation of the contemporary machine age and the new revolutionary order in Russia, which had replaced the previous decadence. Their
work included posters, advertisements, letters, and newspaper headings; their preferred letterform was the sans serif, a functional letterform without historical commitment. Most important, however, was the Bauhaus, which in many ways reacted more positively than the others. It flourished in Germany between 1919 and 1933, and its aim was to end the schism between art and technically expert craftsmanship. Though its interest centered on architecture, it soon began to teach typography in order to find new and positive letterforms. Painters too began to treat letters as an important part of their visual vocabulary. Cubists, Surrealists, and the Collagists began to include single letters, or fragments of newspapers, in their paintings. The secret writing pictures of Paul Klee (1879–1940) and Max Ernst (1891–1976) at first used still-legible writing, which soon, however, turned toward more abstract brush movements in the hands of Mark Tobey and Hans Hartung (1904–1989). Letters in a painting were used to underline themes or add a message, and they thus became an integral part of the picture itself; or they could simply provide a visual effect by using the idea of layout linked to meaning. A good many artists have used (and are using) lettering in this way, from Pablo Picasso (1881–1973) and Joan Miró (1893–1983) to Franz Kline (1910–1962), who under De Kooning’s influence developed his characteristic action painting of slashing black and white calligraphy, and eventually Andy Warhol (1928–1987), Roy Lichtenstein (1923–1997), and the Pop Art movement as such. Pop Art, which emerged in the 1950s, set out to challenge conventional ideas of good taste and the hermetic inviolability of art itself; its use of letters is often in the form of advertisements and billboards, reminiscent, at times, of the early Constructivists. The first half of the 20th century saw a good deal of success in revitalizing calligraphy and related crafts.
This success depended mainly on three elements: 1. teaching lettering and calligraphy in art schools, polytechnics, and similar institutions; 2. the growing number of exhibitions, many of them in connection with the United States and the Continent; 3. the foundation of societies and the publication of journals and books, which greatly encouraged the raising of standards. From well before World War II, and for quite some time afterward, calligraphy was taught in almost every school. It was one of the subjects included in the National Diploma, which was a B.A. equivalent course. It was also taught at the Royal College of Art, a postgraduate college where the diploma is equal to
an M.A. But in 1953, calligraphy was discontinued at the Royal College of Art. In the early 1960s, it received a second blow: the National Diploma was replaced by the Diploma in Art and Design (a B.A.), and calligraphy was no longer included in the new courses and was almost totally phased out as an examination subject. The only exception was the Reigate School of Art, where it is still taught. Roehampton Institute of Higher Education started a 1-year Diploma course in calligraphy and bookbinding in 1979, and also an advanced Diploma in Calligraphy. Around 2003, bookbinding was dropped. Reigate and Roehampton are (it seems) now the two main institutions that still teach calligraphy in Great Britain. Now that calligraphy is no longer taught officially at university level, adult education institutes throughout Great Britain are putting on courses, often of widely varying value. The Society of Scribes and Illuminators also runs a number of workshops and some residential courses, which are advertised in its journal. Still flourishing as well is the Society for Italic Handwriting, founded in 1952 under the direction of Alfred Fairbank. The year 1971 saw the establishment of the government-funded Crafts Council, which provided grants. In 1994, CLAS (Calligraphy and Lettering Arts Society) was founded. CLAS has its own website, and it runs Diploma and Advanced Diploma courses, Certificates, annual festivals, and exhibitions. It has accredited tutors and keeps in contact with American calligraphers. The first major exhibition was held in 2002 and the society is now preparing for its Tenth Anniversary Exhibition. A magazine, The Edge, is regularly published and free to all members. CLAS provides a variety of courses and a Certificate of Competence. It operates an annual examination and awards diplomas on three progressive levels. Its main advantage is that it is in principle open to everybody but carefully tutors and examines those who are allowed to teach.
There is, in fact, a good deal of enthusiasm for calligraphy at the moment in Britain. An often-voiced complaint is the lack of excellent teachers. This, however, does not mean that there are no longer any impressive calligraphers. We only have to think of Ann Camp, Donald Jackson, Heather Child, Sheila Waters, and Ann Hechle, to name but a few.
Calligraphy in Europe The 20th-century revival in the art of lettering and writing was not restricted to Britain. Parallel (and not unconnected) movements occurred elsewhere, most notably in Austria and Germany. The traditional alphabets in Germany and Austria had proceeded
along different lines, keeping the Gothic script until the 1930s. In Austria, the main exponent was Rudolf von Larisch (1856–1934), who worked in the Imperial Chancellery in Vienna, where he had ample opportunity to study historical manuscripts and compare the various hands he found there with the (far less impressive) contemporary standards. His Zierschriften im Dienst der Kunst (Decorative lettering and writing in the service of art) was published in 1899. It led, 3 years later, to a teaching appointment at the Vienna School of Art. His publication appeared 7 years before Johnston’s work and soon rivaled its standing in Austria. Unlike Johnston, von Larisch believed that calligraphy did not rest on the study of historic hands but was a natural vehicle for creative self-expression. Though the two men held different views and encouraged different teaching methods, when Larisch and Johnston met in London in 1909 they found themselves in mutual sympathy. Larisch’s most important work, Unterricht in Ornamentaler Schrift (Instruction in decorative writing and lettering), published in 1906, further extended the scope of his studies and had considerable influence in German-speaking countries. Applying calligraphy on glass, metal, textiles, wood, and pottery fascinated him. He believed that calligraphers should express intuitive feelings in their work and that the pattern of letters on the page should be in harmony with the rhythm of writing and the material used. In Germany, it was Anna Simons (1871–1951), Johnston’s favorite pupil, who became instrumental in strengthening the link between German and English calligraphers. From a Prussian legal family, she began to study with Johnston in 1901 and became his best student. After returning to Germany, she translated Johnston’s two books (Writing & illuminating & lettering in 1910 and later Manuscript and inscriptional letters) and helped with exhibitions. It was, however, mainly Rudolf Koch (1874–1934) who initiated the movement.
He was a skilled calligrapher who had close ties with type and type design. He worked at the Klingspor type foundry and taught lettering at the School of Arts and Crafts in Offenbach. In 1918, under his leadership, a group called the Offenbach Penmen was founded. It later became a workshop community, where people worked on lettering, woodcuts, embroidery, weaving, and books written on Japanese paper. Many of the people there became leading teachers in Germany, Austria, the United States, and England. Whereas Johnston had seen writing as the central discipline of his craft, Koch gave this place to lettering in the broadest sense. In Europe, the link between art schools, printing houses, and the workshops of craftsmen had always
been much closer than in Britain; most early pioneers in calligraphy were also type designers of some note. This dual tradition was kept alive in the work of calligrapher/type designers such as Friedrich Poppl (1893–1982), who said that “calligraphy will always remain the starting point for script design.” Poppl was a member of the Arts and Crafts School at Wiesbaden and later professor at the Technical College there. He specialized in designing alphabets for typesetting and photo printing. Another important German calligrapher was Walter Kaech (d. 1970), who taught lettering for many years. Imre Reiner studied graphic arts in Stuttgart and was well known for his lively calligraphic inventions and type designs. The same can be said of Karl Georg Hoefer and, most prominently, Hermann Zapf (b. 1918). Zapf enjoyed a great reputation as a calligrapher, book designer, and typographer. In the Netherlands, this claim goes to Jan van Krimpen (1892–1958), in Czechoslovakia to the book artist and calligrapher Oldrich Menhart (1897–1962), in Estonia to Villu Toots (b. 1916), an outstanding teacher and exponent of lettering. There are now in fact several hundred graduates from the lettering school he founded in 1965, including his own grandchildren. The most important penman in Scandinavia is Erik Lindegren, whose survey of Lettering and printing types was published in 1975. All of them looked for new ways to link tradition with new means of expressing letterforms. In the last few decades, the influence of Austria and Germany has been strongly felt in the United States. The effects of exhibitions such as those produced by Zapf and Friedrich Neugebauer during the 1980s, the development of intensive workshops, and the resulting meetings of craftsmen have enabled gifted teachers to kindle an enduring enthusiasm.
Calligraphy in the United States American calligraphy (or interest in writing) had, during the 17th and 18th centuries, mostly been concerned with practical considerations, namely how to improve everybody’s handwriting. Some English manuals (William Mather’s Young man’s companion, London 1681; and Edward Cocker’s The pen’s triumph, London 1660) were well known; they taught an English version of Italian Humanistic mixed with remnants of older Gothic hands. The first known American printed manual for handwriting appeared in Philadelphia in 1748 under the imprint of Franklin & Hall. It was George Fisher’s The instructor, or American young man’s best companion containing instructions in reading, writing and arithmetic and many other
things beside the art of making several sorts of wines. It gave examples of Round Hand, Flourishing Alphabets, Italian Hand, and Gothic Secretary. Most of these books had been pirated from English models. There were also tentative attempts to involve the teaching of women, as for example John Jenkins’s The art of writing, reduced to a plain and easy system, on a plan entirely new, Boston, 1791, which taught an orthodox version of the English unlooped Round Hand to the “Gentlemen and Ladies and to the Young Masters and Mistresses throughout the United States.” During the 19th century, such attempts led to various systems (methods of teaching) and colleges where they could be taught. In the first half of the century, more than 100 writing masters were distributing copybooks that in the main taught rapid writing (a Running Round Hand) to men of business in the form of self-instructors. Among the first manuals were Henry Dean’s Analytical guide to the art of penmanship (Salem, 1894) and Benjamin Howard Rand’s A new and complete system of mercantile penmanship (Philadelphia, 1814). Well known and commercially successful was the Spencerian College of Penmanship and Business, which dominated the market for some 35 years. Founded by Platt Rogers Spencer (1800–1864) in Ohio, it propagated a sloping, semiangular style, which was rapid and legible, while at the same time lending itself easily to embellishment. Spencer had begun teaching handwriting at the age of 15, and he and his five sons ran the college (and eventually a chain of such colleges in some 44 cities) from a log cabin at the family farm, while at the same time traveling around the country to teach at various academies.
As the 19th century progressed, competition mounted between those who emphasized a plain practical business hand and others who delighted in flourishes, which could occasionally lead to such extravagances as quill-written pen pictures of animals and humans; however, as time passed the ‘flourishers’ grew increasingly defensive. Another successful writer/entrepreneur was Charles Paxton Zaner, who in 1888 founded the Zanerian College of Penmanship, also in Columbus, Ohio, which eventually produced a ‘commercial cursive’ or ‘business hand’ that, like copperplate in the Old Country, soon found favor among those anxious to advance their career prospects. Modern American handwriting derives largely from the teachings of H. Dean, B. F. Foster, P. R. Spencer, and A. R. Dunton (who was involved in lengthy disputes with Spencer). At the beginning of the 20th century, the Italic style and the use of the broad-edged pen were greatly advanced by Frances M. Moore, who after having studied in London under
G. Hewitt, published her manual in 1926. Since then, it has been mainly the formal and semiformal Italic hand that has made headway in the United States, finding favor not only as a model for everyday handwriting, but also among those actively engaged in the pursuit of calligraphy. In the beginning, it took some effort to convert teachers and pupils to this style; more recently, such books as Fred Eager’s Italic way to beautiful writing (1974) have given further impetus in this direction. The usefulness of calligraphy in America and in Britain rests on different considerations. In Britain, emphasis has been placed on formal purpose, such as ceremonial occasions. In consequence, British scribes have shown a more formal approach to layout and letter style, even when designing for commercial use. In America, the predominant use for calligraphy has been in the commercial field. Calligraphers were also designers who produced a constant flow of lively work to serve a wider commercial field. At the beginning of the 20th century, several attempts were made to reform not only handwriting, but also lettering and type design. Such reforms centered mainly on men like Frederic W. Goudy (1865–1945), Bruce Rogers (1870–1957) and, most of all, William A. Dwiggins (1880–1956). Dwiggins was a well-known type designer whose calligraphy owed little to European influence and showed great gaiety, character, and originality. In 1925, he founded the (wholly imaginary) Society of Calligraphers and issued beautiful certificates of honorary membership to people who worked in publishing and in the graphic arts whom he considered worthy of such distinction. The contact with Britain did, however, continue. In 1913, Ernst Frederick Detterer (d. 1947) of Chicago came to London to take private lessons from Edward Johnston. After his return to America, he began to establish a calligraphic tradition of formal penmanship, especially in the Midwest. In 1931, he became Curator of the John M.
Wing Foundation at the Newberry Library in Chicago, where he founded a Calligraphic Study Group, which greatly influenced the development of American calligraphy. A versatile calligrapher was John Howard Benson (1901–1956) from Rhode Island, who studied in New York at the National Academy of Design at a time when lettering had not yet attained a recognized place in art education. In 1950, he published a manual (Elements of lettering) and 5 years later he produced the first English translation of Arrighi’s La Operina. Other influential teachers and calligraphers were Arnold Bank (b. 1908), Paul Standard (b. 1896) and Lloyd Reynolds (Italic calligraphy and handwriting; 1969). In 1958, Reynolds went a step further and
mounted an exhibition at the Portland (Oregon) Museum of Art entitled Calligraphy: The Golden Age and its Modern Revival, the result of many years of historical study, research, and practical work. Another influential exhibition (mostly works of British calligraphers), called Calligraphy and Illumination, was organized by P. W. Filby in 1959 at the Peabody Institute Library in Baltimore; it was followed 2 years later by Calligraphy and Handwriting in America, 1710–1961. Filby also became involved in an exhibition on Two Thousand Years of Calligraphy held in 1965 at the Walters Art Gallery in Baltimore, which produced a detailed and scholarly catalogue. Today calligraphy-related activities are concentrated mainly around well-known teaching centers (New York; Rhode Island; Chicago; Portland, Oregon; Boston; California), a wide use of fine writing in commerce (much more pronounced and more positive than in Europe), and individual circles where calligraphy is practiced and taught both as an art form and a traditional craft. On the whole, calligraphy is increasingly alive, widely practiced, and appreciated; there are now more groups, more conferences, more exhibitions, and more periodicals produced by influential societies, such as Alphabet (for the Friends of Calligraphy, San Francisco), Calligraphy Idea Exchange (a quarterly magazine), and Calligraphy Review. There are also more courses at art schools or run by private individuals and groups (some of them formal and structured, others less so), more national and international conferences, and a good deal more general awareness of calligraphy than in Europe. One of the reasons lies perhaps in the fact that in the United States there is less of a divide between calligraphers, artists, designers, and amateurs.
After the 1950s, which saw a general regrouping of ideas and resources, an additional stimulus was provided by prominent calligraphers such as Sheila Waters, David Howells, and (most of all) Donald Jackson, who took up teaching appointments at American centers, stimulating workshops and the foundation of new societies, which in turn created a further need for tutors. There has also been an increase in media coverage, a large number of books covering special aspects of calligraphy, and periodicals that both promote an interest in the formal historical scripts and introduce new trends and new practices to their audience.
See also: Asia, Ancient Southwest: Scripts, Earliest; Asia,
Ancient Southwest: Scripts, Epigraphic West Semitic; Asia, Ancient Southwest: Scripts, Middle Aramaic; Asia, Ancient Southwest: Scripts, Modern Semitic; Asia, Inner: Scripts; China: Scripts, Non-Chinese; Japan: Writing
System; Korean Script: History and Description; Paleography, Greek and Latin; South and Southeast Asia: Scripts; Tibet: Scripts; Typography; Writing Materials.
Bibliography Anderson D M (1967). The art of written forms: the theory and practice of calligraphy. New York: Holt, Rinehart and Winston. Angel M (1984). The art of calligraphy. London: Pelham Books. Backemeyer S & Gronberg T (eds.) (1984). W R Lethaby 1857–1931: architecture, design and education. Catalogue for the exhibition at Central School of Art and Design. London: Lund Humphries. Benson J H (1955). The first writing book: Arrighi’s ‘La Operina.’ Oxford: University Press. Brinkley J (ed.) (1964). Lettering today. London: Studio Vista. Camp A (1984). Pen lettering (Revised edition). London: A & C Black. Child H (ed.) (1987). More than fine writing: the life and calligraphy of Irene Wellington. With contributions by Heather Collins, Ann Hechle, and Donald Jackson. New York: The Overlook Press. Child H (1988). Calligraphy today; twentieth century tradition and practise (3rd edn.). London: A & C Black. Dreyfus J (1952). The work of Jan van Krimpen. London: Sylvan Press. Fairbank A (1975). A handwriting manual (Revised edn.). London: Faber. Filby P W (1963). Calligraphy and handwriting in America 1710–1961 assembled and shown by the Peabody Institute Library, Baltimore, Maryland, November 1961– January 1962. New York. Folsom R (1990). The calligraphers’ dictionary. With an introduction by Hermann Zapf. London: Thames and Hudson. Gaur A (1994). The history of calligraphy. London: British Library. Gray N (1986). A history of lettering, creative experiment and lettering identity. Oxford: University Press. Jackson D (1987). The story of writing (2nd edn.). London: Trefoil Books. Johnston E (1906). Writing & illuminating & lettering. London: John Hogg. Kaech W (1956). Rhythm and proportion in lettering. Switzerland: Otto Walter, Olten-Verlag. Knight S (1984). Historical scripts: a handbook for calligraphers. Taplinger: A & C Black. Livingston M (1992). Pop art catalogue of and exhibition held at the Royal Academy of Arts, London 13 September – 19 April 1992. London. 
Macdonald B J (1973). The art of lettering with a broad pen. New York: Pentalic. Mahoney D (1981). The craft of calligraphy. London: Pelham Books. Reynolds L J (1969). Italic calligraphy and handwriting. New York: Pentalic.
Smith P & Delf C (1946). Civic and memorial lettering. London: A & C Black. Whalley J I & Kaden V C (1980). The universal penman. A survey of western calligraphy from the Roman period to 1980. London: HMSO.
Zapf H (1960). About alphabets, some marginal notes on type design. New York: The Typophiles.
Câmara Júnior, Joaquim Mattoso (1904–1970) E Guimarães, Unicamp, Sao Paulo Campinas, Brazil
© 2006 Elsevier Ltd. All rights reserved.
Mattoso Câmara, Brazilian linguist, is responsible for the introduction of linguistic structuralism in Brazil. He was a graduate in Architecture and Law and began his career in linguistics in the 1930s, taking courses given by Georges Millardet in Rio de Janeiro. Later on, he went to the United States, where he studied under Jakobson. He was a professor of linguistics at the Federal District University, in Rio de Janeiro, from 1937 to 1939. In 1950, he became a professor of general linguistics at the University of Brazil’s National College of Philosophy, also in Rio de Janeiro. He was a visiting professor in the United States, Portugal, Mexico, and Uruguay (Uchôa, 1972). Câmara Jr. is the author of the first work on general linguistics published in Brazil (Princípios de Lingüística Geral [Principles of General Linguistics, 1941]). Its second edition, revised and enlarged, was published in 1954. In it, Mattoso already shows his intellectual formation, marked by the structuralism of the School of Prague (fundamentally Jakobson), Saussure, and Sapir. His work covers a wide range of concerns: stylistics, phonemics, grammar, the study of indigenous Brazilian languages, and the history of language and general linguistics. In the area of general linguistics, his reflections on the relationship of language and culture are very important. From Princípios de Lingüística Geral onward, he held the position that language is part of the culture, but a part that can be detached. Therefore, on the one hand, a language is capable of speaking of the culture itself and, on the other, it signifies the culture of which it is part. It was in this way that, due to his functionalist mentalism, he forever marked the position that the study of language is of interest because it is significant.
In the area of general linguistics, he also published the Dicionário de Fatos Gramaticais (Dictionary of grammatical facts, 1956), renamed Dicionário de Filologia e Gramática (Dictionary of philology and grammar, 1963).
The Portuguese Language
His work describing Portuguese was especially dedicated to phonology and morphology. In the phonological area, aside from an important overall view of the Portuguese phonological system, produced on rigorously structural bases, Câmara formulated a number of hypotheses that caused discussion. One of them is the nonexistence of nasal vowels in Portuguese. In his opinion, there is a nasal archiphoneme closing the syllable, as in canto /kaNtu/. In his morphological studies, also of a structuralist nature, he presents a rigorous understanding of the structure of nominals in Portuguese, but his most interesting contribution regards the morphology of verbs. Aside from his description of the verbal system, he left an indispensable analysis of the so-called irregular Portuguese verbs. According to him, these are other verbal paradigms and not exactly irregularities of the regular paradigms (Câmara, 1969, 1970, 1972, 1975). In the study of the Portuguese phrase, we can call attention to his description of the functioning of the pronoun ele, in colloquial Brazilian Portuguese, as a verbal complement instead of the atonic pronoun lhe (Câmara, 1957). In European Portuguese, the pronoun ele, as well as eu, tu, nós, vós, and eles, functions only as the subject. The Portuguese language (Câmara, 1972) is perhaps his most complete analytic work on the Portuguese language (Naro, 1976). Initially, it gives an extremely acute presentation of the history of Portuguese and its fixation in Brazil. Following this presentation, the descriptions made by the author during the 1950s and 1960s regarding Portuguese phonology and morphology appear. These are followed by a study of the lexicon and also the Portuguese phrase.
Part of this work is published in Estrutura da língua portuguesa (Structure of the Portuguese language) (Câmara, 1970), surely the first descriptive (rather than normative) grammar produced in Brazil. Regarding stylistic studies, he published a specific work (Câmara, 1953a) and produced a large number
of articles during his life, notably on one of the most important authors of literature in the Portuguese language, Machado de Assis. These works were later compiled into a book (Câmara, 1962). These stylistic studies had an important impact on his grammatical description of verbs.
The Indigenous Languages and Other Interests On the study of indigenous languages in Brazil, he published Introdução às Línguas Indígenas Brasileiras (Introduction to the indigenous Brazilian languages, 1965). In this work, in addition to providing an overall view of the problems of studying indigenous languages in Brazil, he brings up interesting discussions on the question of linguistic borrowing. Also, as a professor in the National Museum’s Department of Anthropology, in Rio de Janeiro, he was responsible for the presence of linguistics when the postgraduate course in Anthropology was created. In this program, the question of indigenous languages has always been of great importance. Aside from these aspects, he also dedicated himself to the study of linguistic history, having published an interesting work in this domain (Câmara, 1975b). He was also dedicated to teaching and produced works for this purpose. He was a rigorous and important translator of Sapir’s and Jakobson’s texts into Portuguese. See also: Jakobson, Roman (1896–1982); Sapir, Edward (1884–1939); Saussure, Ferdinand (-Mongin) de (1857–1913).
Bibliography
Câmara J M Jr (1953a). Contribuição à Estilística Portuguesa. Rio de Janeiro: Simões.
Câmara J M Jr (1953b). Para o Estudo da Fonêmica Portuguesa. Rio de Janeiro: Simões.
Câmara J M Jr (1954). Princípios de Lingüística Geral (1st edn.). Rio de Janeiro: Acadêmica Briguiet, 1941.
Câmara J M Jr (1956). Dicionário de Fatos Gramaticais (2nd edn.). Rio de Janeiro: MEC/Casa de Rui Barbosa. Dicionário de Filologia e Gramática, Rio de Janeiro: Ozon, 1963.
Câmara J M Jr (1957). ‘Ele como acusativo no Português do Brasil.’ In Dispersos. 1st edn. Rio de Janeiro: FGV, 1972. Miscelânea Homenaje a André Martinet. Estruturalismo y Historia. Univ. de la Laguna.
Câmara J M Jr (1962). Ensaios Machadianos. Rio de Janeiro: Acadêmica.
Câmara J M Jr (1965). Introdução às Línguas Indígenas Brasileiras. Rio de Janeiro: Acadêmica.
Câmara J M Jr (1969). Problemas de Lingüística Descritiva. Petrópolis: Vozes.
Câmara J M Jr (1970). Estrutura da Língua Portuguesa. Petrópolis: Vozes.
Câmara J M Jr (1972a). Dispersos. Rio de Janeiro: Fundação Getúlio Vargas.
Câmara J M Jr (1972b). The Portuguese language: history and structure. Chicago: University Chicago Press.
Câmara J M Jr (1975a). História e Estrutura da Língua Portuguesa. Rio de Janeiro: Acadêmica.
Câmara J M Jr (1975b). História da Lingüística. Rio de Janeiro: Vozes.
Naro A J & Reighard J (1976). Tendências Atuais da Lingüística e da Filologia no Brasil. Rio de Janeiro: Francisco Alves.
Uchôa C E F (1972). ‘Os Estudos e a Carreira de Joaquim Mattoso Câmara Jr.’ In Dispersos. Rio de Janeiro: FGV.
Cambodia: Language Situation
G Chigas, University of Massachusetts Lowell, Lowell, MA, USA
© 2006 Elsevier Ltd. All rights reserved.
Ninety-five per cent of Cambodia’s current population of approximately 12 million speaks Khmer or Cambodian. While the majority of the population is ethnic Khmer, there are substantial numbers of ethnic Vietnamese and Chinese who maintain their respective language and customs in addition to Khmer. There are also various indigenous minorities, such as the Cham (or Khmer-Islam) and Khmer Loeur (Upland Khmer), who speak various dialects of Mon-Khmer languages. Foreign languages such
as Sanskrit, Pali, French, Thai, and English have also had a strong influence on Khmer vocabulary and usage.
Literacy
Literacy rates among men and women have varied considerably during the 19th and 20th centuries. Prior to the establishment of modern public education, reading and writing were primarily taught at temple schools and were generally limited to boys ordained as novice monks. Under the French (1863–1953), the traditional temple-based system was maintained until the early 1900s, when a French-styled system of public education was introduced. By 1925
there were about 160 primary schools with 10 000 students. However, enrollment remained relatively small until late in the colonial period. Even by 1944, for example, only 500 out of the approximately 80 000 students enrolled in primary schools went on to the secondary level. After the nation gained independence in 1953, Prince Norodom Sihanouk accelerated Cambodia’s educational reforms, and by the late 1960s, Cambodia enjoyed one of the highest literacy rates in Southeast Asia. This rapid progress came to an abrupt halt in the 1970s under the genocidal regime of Pol Pot (1975–1979), when many schools were converted into torture centers and approximately 75% of Cambodia’s teachers died of starvation, overwork, or execution. After the 1993 UN-sponsored elections and the end of two decades of civil war, Cambodia’s literacy rates began to recover. A recent study by the Cambodian Ministry of Education, Youth, and Sports states that approximately 55% of women and 75% of men are functionally literate.
Foreign Influence
Historically, foreign languages and ideas have had a significant influence on Khmer vocabulary and usage. Contact with the literature and social institutions of India, Thailand, and France and the current widespread use of English have expanded the Khmer lexicon with foreign loanwords, especially for vocational purposes. For centuries prior to the Angkor period (9th to 15th centuries), Indian influence had already led to the use of many Sanskrit loanwords. With the establishment of Hinayana Buddhism in the 15th century, Pali loanwords were added. After the fall of the Angkor Empire, Thai influence increased as Cambodian kings and monks went to live and study in Thailand. From the middle of the 19th century, the use of French for official, educational, and recreational purposes rivaled the use of Khmer. However, unlike in Vietnam, the French were never successful at romanizing the Khmer script. On the contrary, after Cambodia gained independence in 1953, there was a growing impetus to affirm Cambodian national and cultural identity, and a concerted effort was made to expunge French loanwords and replace them with Khmer terms. During the Khmer Rouge period (1975–1979), a new vocabulary, including politicized metaphors, appeared that reflected the regime’s radical ideology. Finally, over the last ten years, the influence of English as the language of international business and development has had an impact on Khmer similar to the previous use of French.
Phonology and Grammar
Most Khmer words are monosyllabic or disyllabic, while polysyllabic words are generally neologisms or loanwords from Sanskrit and Pali. Another distinctive feature of Khmer, and one that distinguishes it from Thai, Lao, and Vietnamese, is the fact that it is non-tonal. There are a total of 33 basic consonants in the Khmer alphabet, comprising two distinct series or registers. The register (whether voiced or voiceless) determines the pronunciation of the vowel that follows. In addition, there are 12 independent vowels, 16 vowel symbols, and 31 subscript consonant symbols, which are used in combination with the basic consonant symbols. Khmer also has 10 diacritical marks that modify the sounds of the dependent and independent symbols. Although Khmer nouns and verbs are not inflected, number and verb tense are indicated by syntax and time markers as needed.
See also: French; Pali; Sanskrit; Thailand: Language Situation.
Language Maps (Appendix 1): Map 78.
Bibliography
Henderson E J A (1976). ‘Vestiges of morphology in modern standard Khasi.’ In Jenner P N, Thompson L C & Starosta S (eds.) Austroasiatic studies. Honolulu: University Press of Hawaii. 1:477–522.
Jacob J (1993). Cambodian linguistics, literature and history. London: School of Oriental and African Studies, University of London.
Jenner P N (1969). ‘Affixation in modern Khmer.’ Ph.D. diss., Hawaii University.
Marston J (1994). ‘Metaphors of the Khmer Rouge.’ In Ebihara M M, Mortland C A & Ledgerwood J (eds.) Cambodian culture since 1975. Ithaca: Cornell University Press.
Pou S (1982). ‘Du Sanskrit kīrti au khmer kerti: une tradition littéraire du Cambodge.’ Seksa Khmer 5, 33–54.
Cambodian
See: Khmer.
Cameron, Deborah (b. 1958)
B McElhinny, University of Toronto, Toronto, Ontario, Canada
© 2006 Elsevier Ltd. All rights reserved.
Deborah Cameron is Rupert Murdoch Professor of Language and Communication at the University of Oxford. She has degrees from the University of Newcastle upon Tyne (B.A., 1980) and the University of Oxford (M.Litt., 1985). She is a sociolinguist whose work focuses on language, gender, and sexuality; feminist theory; language ideologies; and media language. She is one of the principal scholars to show the implications of feminist theory for linguistics, as well as to demonstrate the contributions that sociolinguistic research can make to interdisciplinary feminist theory and research. Her works in this area include Feminism and linguistic theory (1985), Women in their speech communities: new perspectives on language and sex (with Coates, J., 1998), The feminist critique of language: a reader (ed., 1990), and ‘Gender, language, and discourse’ (1998). She is a key contributor to the emerging and rapidly growing body of scholarship on language, sexuality, and desire (see Language and Sexuality, with Kulick, D., 2003; and ‘Performing gender identity: young men’s talk and the construction of heterosexual masculinity,’ 1997). She has argued that many sociolinguistic researchers take a ‘merely’ ethical approach to their research, and has (with Elizabeth Frazer, Penelope Harvey, Ben Rampton, and Kay Richardson) raised questions about, and tried to develop examples of, what sociolinguistic research that is devoted to advocacy for, or even empowerment of, disenfranchised communities might look like in the collaborative book Researching language: issues of power and method (1992). In a similar vein, Verbal hygiene (1995) takes up a number of case studies that challenge the truism that linguists should, and do, describe rather than prescribe linguistic practices; it received the 1996 Book Award from the British Association of Applied Linguistics. She has recently begun writing about the implications of globalization for communication in such works as Good to talk? 
Living and working in a communication culture (2000b), Globalization and language teaching (ed., with Block, D., 2002), and ‘Styling the worker: gender and the commodification of language in the globalized service economy’ (2000c). She has also published six other books,
including The lust to kill: a feminist investigation of sexual murder (with Frazer, E., 1987), and Working with spoken discourse (2001). Professor Cameron’s speaking style is funny without being flip, and blunt without being rude. Her writings, even as they lay out complex theoretical insights, are always lucid, in ways consistent with her arguments that if sociolinguists fail to find ways to educate wider audiences in sophisticated ways about language, others will supply more stereotypical, problematic perspectives (see ‘A self off the shelf?: consuming women’s empowerment,’ 2000a). This, in combination with her knack for identifying cutting-edge research questions, makes her widely sought after as a plenary speaker and as a public commentator on sociolinguistic issues.
Bibliography
Cameron D (1985). Feminism and linguistic theory. [Repr. 1992.] London: Macmillan.
Cameron D (ed.) (1990). The feminist critique of language: a reader. London: Routledge.
Cameron D (1995). Verbal hygiene. London: Routledge.
Cameron D (1997). ‘Performing gender identity: young men’s talk and the construction of heterosexual masculinity.’ In Johnson S & Meinhof U (eds.) Language and masculinity. Oxford: Blackwell.
Cameron D (1998). ‘Gender, language, and discourse.’ Signs.
Cameron D (2000a). ‘A self off the shelf?: consuming women’s empowerment.’ In Andrews M & Talbot M (eds.) All the world and her husband: women in twentieth-century consumer culture.
Cameron D (2000b). Good to talk? Living and working in a communication culture. London: Sage.
Cameron D (2000c). ‘Styling the worker: gender and the commodification of language in the globalized service economy.’ Journal of Sociolinguistics.
Cameron D (2001). Working with spoken discourse. Sage.
Cameron D & Block D (2002). Globalization and language teaching. London: Routledge.
Cameron D & Coates J (eds.) (1998). Women in their speech communities: new perspectives on language and sex. Essex: Longman.
Cameron D & Frazer E (1987). The lust to kill: a feminist investigation of sexual murder. Cambridge: Polity.
Cameron D & Kulick D (2003). Language and sexuality. Cambridge: Cambridge University Press.
Cameron D, Frazer E, Harvey P, Rampton B & Richardson K (1992). Researching language: issues of power and method. London: Routledge.
Cameroon: Language Situation
B Connell, York University, Toronto, Ontario, Canada
© 2006 Elsevier Ltd. All rights reserved.
The Republic of Cameroon has a population of approximately 14.5 million people, speaking almost 300 languages. Like its neighbor Nigeria to the west, it has an extremely complex linguistic setting, with a high ratio of languages relative to overall population, as well as the additional complication of colonial languages and their legacy. In Cameroon this legacy is more complex than elsewhere, as the country has inherited languages from two colonial administrations, British and French, whose policies or attitudes towards indigenous languages were diametrically opposed. No one indigenous language dominates, as none has a substantially disproportionate number of speakers. Three, however, approach this status: Fula, or Fulfulde, spoken in the northern part of the country, has approximately 668 700 first-language speakers (population figures for individual languages are taken from Grimes, 2000) and functions as a lingua franca in that region, with up to 5 000 000 others using it as a second language, though there is substantial dialect variation. Ewondo has 577 700 speakers and serves as a lingua franca in the central region; its status is bolstered as a result of being the language of the capital, Yaoundé. Duala, despite a relatively low number of first-language speakers (87 700), is a lingua franca in the western region, due largely to its status as the language of Douala, the financial heart of Cameroon. In addition to these, Cameroon Pidgin English (sometimes referred to as ‘Wes Cos’) is spoken predominantly as a second language by approximately 2 000 000 people in the South West and North West provinces (see Pidgins and Creoles: Overview). Its use ranges far beyond these areas, however, and in practice it is the most widely used lingua franca in Cameroon.
It should be added that Cameroon Pidgin English is not the only pidgin spoken in the country; Ewondo Populaire is a pidginized version of Ewondo spoken around Yaoundé, and the variety of Fulfulde used as a second language is also arguably pidginized. Of all Cameroonian languages, Fulfulde is the largest in terms of first-language speakers, and only a few others boast more than 100 000 speakers. The average number of speakers per language is 51 000, and as many as 31 languages are listed (Connell, in press) as having fewer than 1000 speakers. Cameroon has a complex colonial history, with the French dominating the bulk of the country, but first the Germans and then the British controlling the
western region. It was only in 1961 that the former British Cameroon decided by referendum to leave the newly independent Nigeria to join French Cameroon. This mottled history has had a significant impact on the linguistic situation in the country. Following the tradition of the French, little or no importance has been attached by the government to the use of indigenous languages in education, and to date there is no official policy in this respect. Attitudes among the people, however, do seem to vary somewhat between the so-called francophone and anglophone zones. In the former British colony there is greater importance attached to mother tongue education, and although it is still not to be found in state schools, there is a greater tendency for private schools (typically, but not only, mission schools) to offer at least the first years of primary education in the language of the community. An increasing tendency, particularly in the anglophone zone of the southwest, is the use of pidgin in primary education, although this practice is not at present recognized by the government. Only French and English are acknowledged as official languages, and the use of two colonial languages as official languages has led to intergroup conflict. As has happened elsewhere (Canada, Belgium), a minority language group has perceived itself as being discriminated against, and in Cameroon those who are of the English zone claim difficulty in obtaining civil service employment where, despite the policy of two official languages, the language of the workplace is French. Despite the lack of official policy or status regarding indigenous languages, a certain degree of attention has been devoted to their documentation and development. The Atlas Linguistique du Cameroun was undertaken as part of a larger effort, the Atlas Linguistique d’Afrique Centrale (1983), sponsored in part by the French government and in part by Cameroonian government agencies. 
In addition to this, a standardized orthography has been developed suitable for the writing of all Cameroonian languages. The great number of languages found in Cameroon, their substantial diversity, the low average number of speakers, and the high number of languages with fewer than 1000 speakers have important implications for linguistic studies. At a very basic level, most of these languages remain only partially described at best, and considerable work of importance remains to be done to rectify this situation. There is a very real threat that, in the face of globalization and modernization, many of these languages will disappear in the near future before they can be documented. The implications of the linguistic diversity found in
Cameroon, and particularly the Nigeria-Cameroon borderland, are of great interest for historical studies, both of a linguistic and general nature. This region is now generally accepted by historical linguists as being the ultimate homeland of the Bantu languages (see Bantu Languages), as it is here where the older relatives of Bantu, Bantoid language groups such as Mambiloid, and the apparent isolates Dakoid and Fam are found. The considerable amount of work to be done on these from the historical and ethnological perspective will eventually reveal much about the prehistory of the peoples of West and Central Africa.
Bibliography
Connell B (in press). ‘Language endangerment in Central Africa.’ To appear in Brenzinger M (ed.) Language diversity endangered. Berlin: Mouton de Gruyter.
Dieu M & Renaud P (1983). ‘Situation linguistique en Afrique Centrale – Inventaire préliminaire: le Cameroun.’ In Dieu M & Renaud P (eds.) Atlas de L’Afrique Centrale (ALAC), Atlas Linguistique du Cameroun (ALCAM). Paris: ACCT.
Grimes B F (ed.) (2000). Ethnologue (14th edn.). Dallas: SIL International. CD-ROM edition.
See also: Bantu Languages; Niger-Congo Languages; Nigeria: Language Situation; Pidgins and Creoles: Overview.
Campanella, Thomas (1568–1639)
C Massai
© 2006 Elsevier Ltd. All rights reserved.
This article is reproduced from the previous edition, volume, pp. 442–443, © 1994, Elsevier Ltd.
Campanella was one of the most important philosophers of the Italian Renaissance. Besides theology, poetry, and astrology, his interests also included linguistics. He was the author of a Latin Grammar, and several of his works discuss language reforms of both an orthographical and semantic nature. Campanella was born on September 5, 1568 at Stilo in Calabria. While still very young he entered the Dominican order, but, irked by the discipline, in 1589 fled the monastery for Naples, Rome, Florence, and Padua, where he studied at the University. In 1599 he returned to Calabria, where he was involved in a plot against the Spanish. The plot was discovered and Campanella passed the next 27 years in prison. A late summons to the French court, due to his fame as an astrologer, brought him the tranquillity he had never known and enabled him to dedicate himself to the revision and publication of works written in prison. He died in Paris on May 21, 1639. The most important of Campanella’s writings to deal with linguistics are his Poetica (Italian edition, 1596, Latin, 1612) and the Grammatica (1618, but published with Poetica in 1638; see Firpo, 1940). The latter is somewhat traditional in outlook, similar to the grammars of the Modistae. Indeed, according to Padley (1976), “Campanella’s work forms an important part of . . . Scholastic reaction.” However, other features of the Grammatica are typical of its day, such as its interest in the creation of a philosophical language. Campanella outlines its theoretical basis in the last pages of the grammar, the Appendix de philosophicae linguae institutione. The new language, he states, must be clear and unambiguous, reflecting an absolute correspondence between words and things. Similarly, its orthography should show a marked relation between sounds and letters. In the Grammatica, but still more so in the Poetica, Campanella suggests that letters should be represented as they are articulated (e.g., a shape representing lip closure for b, etc.) (see Phonetic Transcription: History). Campanella’s theories were to play an important role in 17th-century debates on the subject of artificial language, and considerably influenced the work of J. Wilkins, who in his Essay of 1668 cites him among his sources (see Wilkins, John (1614–1672)).
See also: Phonetic Transcription: History; Wilkins, John (1614–1672).
Bibliography
Crahay R (1973). ‘Pratique du latin et théorie du langage chez Campanella.’ In Ijsewijn J & Kessler E (eds.) Acta Conventus Neo-Latini Lovaniensis. Louvain August 23–28, 1971. Louvain/Munich: Leuven University Press/W. Fink.
Firpo L (1940). Bibliografia delle opere di Tommaso Campanella. Turin: Bona.
Formigari L (1970). Linguistica ed empirismo nel Seicento inglese. Bari: Laterza.
Padley G A (1976). Grammatical theory in Western Europe 1500–1700: The Latin tradition. Cambridge: Cambridge University Press.
Padley G A (1985–1988). Grammatical theory in Western Europe 1500–1700: Trends in vernacular grammar. Cambridge: Cambridge University Press.
Salmon V (1979). The study of language in 17th-century England. Amsterdam: Benjamins.
Campe, Joachim Heinrich (1746–1818)
K R Jankowsky, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.
Joachim Heinrich Campe, born in 1746 in Deensen near Holzminden, Germany, studied Protestant theology and philosophy at the universities of Helmstedt and Halle. After graduating in 1769, he spent four years as private tutor in the house of Alexander Georg von Humboldt in Berlin, then two years as military chaplain in Potsdam, only to return in 1775 to the Berlin-Tegel castle of the Humboldts, this time as educator of the two sons, Wilhelm (b. 1767) and Alexander (b. 1769). Wilhelm von Humboldt later reminisced on this, for him, unforgettable period of his life: “[Campe] showed even then a most appropriate, natural gift of vividly stimulating a child’s intellect” (cf. Hallier, 1862: 17; transl. by K. R. Jankowsky). Rousseau’s new theory of education, advanced in his monumental 4-volume Émile, ou De l’éducation (published 1762), significantly strengthened Campe’s interest in pedagogy. He welcomed his appointment in 1776 by Count Franz von Dessau to the board of directors of the Dessau Philanthropin, a prestigious educational institution, founded in 1772 and directed by Johann Bernhard Basedow (1723–1790). After a few months, he succeeded Basedow as the Philanthropin’s director but resigned the following year, due to irreconcilable differences with the institution’s founder. From then on he devoted most of his time to writing. Campe gained widespread recognition among his contemporaries and for a long time thereafter in three major areas:
1. He produced a substantial number of highly influential educational writings, most prominent among them his Robinson der Jüngere: Zur angenehmen und nützlichen Unterhaltung für Kinder (Campe, 1779–1780), based on Daniel Defoe’s Robinson Crusoe and translated into numerous languages, which saw 90 editions within about 100 years. Of comparable importance as an educational tool was his Kleine Kinderbibliothek (Campe, 1790a), originally comprising more than 20 volumes and likewise translated into several languages. By 1815, it had gone through 11 editions of varying size.
2. Campe believed in the need for ‘purifying’ the German (German, Standard) language of non-German ingredients. He tried to achieve this objective by theoretical discussions as well as practical illustrations (cf., e.g., Campe, 1790b, 1794, 1804). Of his approximately 11 000 newly coined German words, about 3000 were there to stay, not necessarily as replacements, but certainly as well-liked variants of their foreign originals. They include ‘Hochschule’ for ‘Universität’ (university), ‘Einzahl, Mehrzahl’ for ‘Singular, Plural’ (singular, plural), ‘Stelldichein’ for ‘Rendezvous’ (rendezvous), ‘Feingefühl’ for ‘Delikatesse’ (tact, delicacy). But the majority – like ‘Zitterweh’ for ‘Fieber’ (fever), ‘Geistesanbau’ for ‘Kultur’ (culture), ‘Haarkräusler’ for ‘Friseur’ (hairdresser) – were short-lived, their demise being quickened by ironic, even sarcastic criticism from highly placed sources (cf., e.g., Xenien by Goethe and Schiller).
3. Tied to his ‘purification campaign’ was his effort to present to native speakers the richness of their mother tongue by compiling a comprehensive Wörterbuch der Deutschen Sprache. The first fruit of his labors was the 2-volume supplement of 1801 to Adelung’s Wörterbuch der hochdeutschen Mundart (1774–1786). But his own Wörterbuch goes well beyond that of Adelung. He aims at the entire Deutsche Sprache, not restrictively at the hochdeutsche Mundart only. And whereas Adelung lists just ca. 55 000 words, Campe’s dictionary comprises almost three times that amount. Even though he counts derivations as separate entries, the advancement is still considerable.
Campe’s significance for historical linguistics is still being examined. Publications like Orgeldinger (1999) and the exhibition at the Wolfenbüttel Library as documented in Schmitt (1996) provided just a glimpse of proof that the discussion is far from being over.
Formigari L (1970). Linguistica ed empirismo nel Seicento inglese. Bari: Laterza. Padley G A (1976). Grammatical theory in Western Europe 1500–1700: The Latin tradition. Cambridge: Cambridge University Press.
Padley G A (1985–1988). Grammatical theory in Western Europe 1500–1700: Trends in vernacular grammar. Cambridge: Cambridge University Press. Salmon V (1979). The study of language in 17th-century England. Amsterdam: Benjamins.
See also: Humboldt, Wilhelm von (1767–1835); Rousseau, Jean-Jacques (1712–1778).
Bibliography
Campe J H (1779–1780). Robinson der Jüngere: Zur angenehmen und nützlichen Unterhaltung für Kinder. Hamburg: Carl Ernst Bohn.
Campe J H (1790a). Kleine Kinderbibliothek. Braunschweig: Schulbuchhandlung.
Campe J H (1790b [1791]). Proben einiger Versuche von deutscher Sprachbereicherung. Braunschweig: Schulbuchhandlung.
Campe J H (1794). Ueber die Reinigung und Bereicherung der deutschen Sprache. Braunschweig: Schulbuchhandlung.
Campe J H (1795–1797). Beiträge zur weiteren Ausbildung der deutschen Sprache. Braunschweig: Schulbuchhandlung.
Campe J H (1801). Wörterbuch zur Erklärung und Verdeutschung der unserer Sprache aufgedrungenen fremden Ausdrücke: Ein Ergänzungsband zu Adelung’s Wörterbuche. In zwei Bänden (ed.). Braunschweig: Schulbuchhandlung.
Campe J H (1804 [2nd edn. 1813]). Versuch einer genauern Bestimmung und Verdeutschung der für unsere Sprachlehre gehörigen Kunstwörter. Braunschweig: Schulbuchhandlung.
Campe J H (ed.) (1807–1811). Wörterbuch der Deutschen Sprache. Braunschweig: Schulbuchhandlung. Repr. Hildesheim: Georg Olms, 1969.
Campe J H (1813). Wörterbuch zur Erklärung und Verdeutschung der unserer Sprache aufgedrungenen fremden Ausdrücke: Ein Ergänzungsband zu Adelung’s und Campe’s Wörterbüchern. Braunschweig: Schulbuchhandlung.
Hallier E (1862). Joachim Heinrich Campe’s Leben und Wirken: Bausteine zu einer Biographie. Liegnitz: Krumbhaar.
Jankowsky K R (1999). ‘Joachim Heinrich Campe (1746–1818) und sein Wörterbuch im Vergleich zu Johann Leo Weisgerbers sprachtheoretische Arbeiten.’ In Klaus D Dutz (ed.) Interpretation und Re-Interpretation. Münster: Nodus. 67–86.
Leyser J A (1877 [2nd edn. 1896]). Joachim Heinrich Campe: Ein Lebensbild aus dem Zeitalter der Aufklärung (2 vols). Braunschweig: Vieweg.
Orgeldinger S (1999). Standardisierung und Purismus bei Joachim Heinrich Campe. Berlin; New York: De Gruyter.
Schmitt H et al. (eds.) (1996). Visionäre Lebensklugheit: Joachim Heinrich Campe in seiner Zeit (1746–1818) [Exhibition and Catalogue]. Wiesbaden: Harrassowitz.
Canada: Language Situation G J Rowicka, Leiden University, Leiden, The Netherlands © 2006 Elsevier Ltd. All rights reserved.
Canada has a multilingual population of 29.6 million. Since passage of the Official Languages Act in 1969, it has two official languages on the federal level: English and French. However, only 23% of the Canadian population, predominantly inhabitants of Quebec, speak French as their sole or principal language, versus 68% who mainly speak English. English-French bilingualism is not very widespread (17% of the population), but it is increasing. The issue of reconciling Quebec’s francophones with the majority anglophone Canadian population seems to have been moved to the back burner since the Quebec government’s referendum on independence failed to pass in October 1995.
Canadian English resembles American English in many ways. Like Americans, but unlike speakers of (Southern) British English, Canadians pronounce [r] in car and farm. Yet Canadian English also has some characteristics of its own in its vocabulary, spelling, pronunciation, and grammar, some of which are seen as ‘Britishisms.’ Many Canadians still use serviettes at the table, rather than napkins, as Americans do. They apologize by saying sorry [sori], rather than [sari]. Typical is ‘Canadian Raising’, which makes the vowels [aw] in house and [ay] in knife (before voiceless consonants) sound quite different – ‘higher’ – than the vowels in houses and knives (before voiced consonants). A well-known Canadian trait is eh, as in You like it, eh?, where Americans would rather use huh. There are, however, regional and social differences in these and other features.
Canadian French also differs from European French. It developed out of 17th-century French and other languages spoken in France at that time and has preserved some archaic features long since lost in European French. For instance, Quebec French has a distinction between long and short vowels, such as fête ‘anniversary’ [fɛːt] and faite ‘done, FEM’ [fɛt], while most European French dialects have only short vowels. In several French varieties, word-final consonant clusters can be simplified, for instance, table [tab] ‘table’. In Quebec French, however, more complex groups are also simplified, as in astre [as] ‘aster’, even in formal contexts. Some words have a different meaning in Quebec and in European French.
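The conditioning described for Canadian Raising (a diphthong is raised only when a voiceless consonant follows) can be illustrated with a small sketch. This is a toy model, not a phonological analysis: the segmentations of the four words, the inventory of voiceless consonants, and the notation [ʌw], [ʌy] for the raised variants are all assumptions introduced here for illustration.

```python
# Toy model of Canadian Raising. All transcriptions are simplified and
# illustrative; they are assumptions, not taken from the article.
VOICELESS = {"p", "t", "k", "f", "s", "θ", "ʃ", "tʃ"}

# Assumed notation for the raised variants of [aw] and [ay].
RAISED = {"aw": "ʌw", "ay": "ʌy"}


def apply_canadian_raising(segments):
    """Return a copy of segments in which [aw]/[ay] are raised
    when the immediately following segment is a voiceless consonant."""
    result = list(segments)
    for i in range(len(segments) - 1):
        if segments[i] in RAISED and segments[i + 1] in VOICELESS:
            result[i] = RAISED[segments[i]]
    return result


# Hypothetical broad segmentations of the words cited in the text.
WORDS = {
    "house": ["h", "aw", "s"],
    "houses": ["h", "aw", "z", "ə", "z"],
    "knife": ["n", "ay", "f"],
    "knives": ["n", "ay", "v", "z"],
}

if __name__ == "__main__":
    for word, segs in WORDS.items():
        print(word, "".join(apply_canadian_raising(segs)))
```

In this sketch the diphthong changes in house and knife, where [s] and [f] follow, and stays unchanged in houses and knives, where the following consonant is voiced.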
The other main variety of Canadian French is Acadian, which is spoken along the Atlantic coast. Cajun, the French dialect of Louisiana, United States, derives from Acadian.
Canada’s linguistic wealth extends far beyond the two largest languages. About 17% of all Canadians speak a language other than English or French as their mother tongue. These are Aboriginal Canadian languages or immigrant languages.
According to the 2001 Census, only 21 languages indigenous to Canada are still spoken (although other sources still mention 50). They can be grouped into several language families. Most linguistic diversity is concentrated in the west of the country. The majority of Aboriginal language families are as distinct from each other as, for instance, Indo-European is from Sino-Tibetan. They exhibit remarkable structural diversity and characteristics unlike those familiar from Indo-European languages. For instance, in Nuxalk (Bella Coola, a Salish language) there are words without a single vowel, e.g., skw|’ L p ‘seed’. A property of most Aboriginal languages is polysynthesis. Words in polysynthetic languages can contain a large number of meaningful parts (morphemes). For instance, in Mohawk (an Iroquoian language) the following is a single word:
s-a-h wa-nho-t -kw-ahs-e
again-PAST-she/him-door-close-un-for-PERF
‘she opened the door for him again’
Numerous Aboriginal words have been adopted into Canadian English. The country’s name, Canada, comes from the Laurentian (extinct Iroquoian language) word for ‘settlement’. In the Northwest Territories, since 1993 several Aboriginal languages have enjoyed an official status equal to that of English and French. Inuktitut (the language of the Inuit) also has official status in Nunavut, a Canadian territory that was part of the Northwest Territories until 1999 and where 80% of the population are Inuit. However, most Aboriginal languages are seriously endangered as a result of, among other factors, repressive education policies practiced in the past, and are only spoken fluently by the oldest generation. Only Cree (80 000 speakers), Ojibwa (45 000 speakers), and Inuktitut (20 000 speakers) are estimated to have good chances of long-term survival. There is a growing involvement of universities in language preservation efforts.
Among the Aboriginal Canadian languages on the verge of extinction is Michif, a unique mixed language of Canada’s Métis, most of whom are descendants of Cree or Ojibwa women and French Canadian fur trappers. Michif combines Cree verbs and French nouns. French noun phrases retain lexical gender and adjective agreement, while Cree and Ojibwa verbs retain much of their polysynthetic structure. This makes Michif unlike other contact languages, which usually exhibit simplified grammar.
Among new (immigrant) Canadian languages, Chinese, Italian, and German are each spoken by more than 400 000 people, with Chinese speakers constituting the largest linguistic group in Canada after English and French. Seven other languages (Spanish, Portuguese, Polish, Panjabi, Ukrainian, Arabic, and Tagalog) have between 150 000 and 228 000 speakers each. In some cities there are such large ethnic populations that it is possible to live, work, and shop there without using any of the official languages. In Toronto, 40% of the population speak a mother tongue that is neither English nor French. In Vancouver this figure is 27%, in Winnipeg 21%, and in Montréal 17%. The strength of nonofficial languages is part of a deliberate policy on the part of the Canadian government.
It is precisely the rejection of uniformity, the refusal to accept a homogeneous view of themselves and their country, that constitutes the most authentic and widely shared experience of Canadians. The affirmation and preservation of differences, personal, social, local, regional, cultural, linguistic, has consumed the minds and hearts of Canadians all through their history. It is the Canadian response to the question of identity. Our unity – and it is a real and profound unity if we will only bring ourselves to see it – arises from the determination to preserve the identity of each of us.
– From A national understanding (government report, 1977)
See also: American Lexicography; Isolated Language Varieties; Language Families and Linguistic Diversity; Michif.
Language Maps (Appendix 1): Maps 52–54.
Bibliography
Chambers J K (ed.) (1979). The languages of Canada. Montreal: Didier.
Edwards J (ed.) (1998). Language in Canada. Cambridge: Cambridge University Press.
Grimes B F (ed.) (2000). The ethnologue: languages of the world. (14th edn. + CD-ROM). Dallas, TX: Summer Institute of Linguistics. Also available at: www.ethnologue.com.
Mithun M (1999). The languages of Native North America. Cambridge: Cambridge University Press.
Statistics Canada (2001). Census of Canada. Available at: http://www.statcan.ca.
Canadian Lexicography K Barber, Oxford University Press, Toronto, Ontario © 2006 Elsevier Ltd. All rights reserved.
Dictionaries used in English-speaking Canada have all too often been reprints of British or American works, with little or no revision. It was not until the late 1950s that Canadians began to seriously research the history of Canadian English and words that originated in Canada, that have meanings peculiar to Canada, or that have special significance in Canada. This resulted in the publication of the Dictionary of Canadianisms on historical principles in 1967, on which the Canadian content in dictionaries, chiefly those published by Gage, was based for the next 25 years. Dictionaries of regional varieties of Canadian English, such as the Dictionary of Newfoundland English (1982) and the Dictionary of Prince Edward Island English (1988), expanded on the coverage provided by the Dictionary of Canadianisms.
In 1992, Oxford University Press Canada established a permanent dictionary department in Toronto, with the aim of producing a thoroughly researched dictionary of current Canadian English based on corpus analysis and a vast reading program. The first edition of the Canadian Oxford dictionary appeared in 1998, followed by a number of spin-offs and a second edition in 2004. This project also provides Canadian quotations to the OED.
Like Anglophone Canadians, Francophones in Canada have long had to make do with dictionaries reflecting a linguistic reality different from their own, a phenomenon compounded by the European French trend toward marginalizing varieties of the language found outside of France. There has been an ongoing tension in Québécois dictionaries between attempts to align Québécois French with the standards of France on the one hand and the desire to assert and legitimate usages particular to Quebec on the other. In the late 1980s, the first serious dictionaries of Canadian French began to appear, based upon the belief that French–Canadian (both Québécois and Acadian) usages are both valid and standard rather than marginal compared with the French spoken in France. These dictionaries, such as the Dictionnaire du français plus à l’usage des francophones d’Amérique (1988), the Dictionnaire des canadianismes (1989), and the Dictionnaire québécois d’aujourd’hui (1992), drew on the vast research compiled by the Université Laval, which published its own historical dictionary of Canadian French, the Dictionnaire du français québécois, in 1998.
A small French–English dictionary was first published in Canada in 1962. Researchers at the University of Ottawa, the Université de Montréal, and the Université Laval have been collaborating on a Canadian French–English dictionary since 1988.
A number of dictionaries of Canadian Aboriginal languages also exist, and more are in preparation. This work was started by missionaries in the 19th century and was taken up more recently by the Canadian Museum of Civilization, which has produced bilingual dictionaries of Western Abenaki, Heiltsuk, Kwakwala (Kwakiutl), Mohawk, and Mi’kmaq (Micmac).
See also: Bilingual Lexicography; Canada: Language Situation; English in the Present Day (since ca. 1900); French.
Canary Islands: Language Situation Editorial Team © 2006 Elsevier Ltd. All rights reserved.
The Canary Islands are a group of seven main islands about 100 km off the African coast opposite Morocco and Western Sahara. They are an autonomous region of Spain. The islands have been inhabited since at least 200 B.C., and they were mentioned in classical
sources. However, there is no further evidence of European knowledge of them until the 13th century, when they were ‘re-discovered’ by a Genoese fleet. At that time, the people living on the islands were speakers of Guanche, most probably a Berber language. However, the only traces of the language are a few place names, as its speakers had been converted to Christianity, and to Castilian Spanish, by the end of the 15th century, when the islands became part of Castile. Spanish is the official language of the
Canary Islands today. During the 19th and early 20th centuries, a mixed Spanish-English trade variety called Pichingli was used on the islands (Armistead, 1995). In the 20th century, the Canary Islands developed a major tourist industry, and there is now at any one time a substantial contingent of more or less short-term visitors, who are catered to with mainly English and German print media, shop signs, menus, and so forth.
In addition to spoken language, the Canary island of La Gomera is home to the ‘whistling language’ Silbo Gomero (from Spanish silbar, ‘whistle’). The language is said to have been used as a means of long-distance communication on the mountainous island since before the arrival of the Spanish, but knowledge of it decreased with the advent of modern communication. Seen as a part of the island’s cultural heritage, Silbo Gomero is now taught in schools on the island.
See also: Spanish.
Bibliography
Armistead A G (1995). Sobre la lengua de los cambulloneros: El pichingli. Revista de Filología de la Universidad de la Laguna 14, 245–252.
Böhm G (1996). Sprache und Geschichte im Kanarischen Archipel. Vol. 1: Kulturgeschichte. Wien: Afro-Pub.
López J M & Díaz D C (eds.) (1996). El español de Canarias hoy: análisis y perspectivas. Frankfurt: Vervuert/Madrid: Iberoamericana.
Cantonese See: Chinese.
Cape Verde Islands: Language Situation Editorial Team © 2006 Elsevier Ltd. All rights reserved.
The Republic of Cape Verde consists of 10 islands and five islets off the west coast of Africa, about 600 km west of Senegal. The islands were uninhabited until the 15th century, when they were colonized by the Portuguese, who used them as a supply and trading post for the slave trade. Many speakers of African (mainly West-Atlantic) languages were brought from the then-Portuguese territory of Guinea-Bissau. Cape Verde became independent in 1975.
The official language of Cape Verde is Portuguese. However, the majority of the 415 000 (July 2004 estimate) residents of the islands speak the Portuguese-based Cape Verdean creole Kabuverdianu (Crioulo/Kriolu) as their first language. Kabuverdianu falls into two main dialect groups, Sotavento and Barlavento. The former is spoken on the southern (Sotavento) islands, which include São Tiago, with the capital, Praia, and site of the earliest settlements. Sotavento is spoken by about 65% of Kabuverdianu speakers. The dialect of the northern Barlavento islands, which were settled only in the late 17th and 18th centuries, is spoken by the remaining 35% of speakers.
Since independence, the role and status of Kabuverdianu have increased, and the language is used in domains previously reserved for Portuguese, e.g., formal religious and political discourse. Portuguese is used on television and radio, as well as in education, where it is the primary language of instruction throughout all levels.
Because of harsh economic conditions and high unemployment, many Cape Verdeans have left the islands and work abroad, so the majority of speakers of Kabuverdianu (about 934 000) do not live in Cape Verde. There are large Kabuverdianu communities in Guinea-Bissau, Senegal, several western European countries, and the United States.
See also: Pidgins and Creoles: Overview.
Bibliography Chabal P (2002). A history of postcolonial Lusophone Africa. London: Hurst. Holm J (1989). Pidgin and creoles. Vol. 2: Reference survey. Cambridge: Cambridge University Press.
Cape Verde Islands: Language Situation 197
Canary Islands today. During the 19th and early 20th centuries, a mixed Spanish-English trade variety called Pichingli was used on the islands (Armistead, 1995). In the 20th century, the Canary Islands developed a major tourist industry, and there is now at any one time a substantive contingent of more or less short-term visitors, who are catered to with mainly English and German print media, shop signs, menus, and so forth. In addition to spoken language, the Canary island of La Gomera is home to the ‘whistling language’ Silbo Gomero (from Spanish silbar, ‘whistle’). The language is said to have been used as a means of long-distance communication on the mountainous island since before the arrival of the Spanish, but knowledge of it decreased with the advent of modern communication. Seen as a part of the island’s cultural heritage, Silbo Gomero is now taught in schools on the island. See also: Spanish.
Bibliography
Armistead A G (1995). Sobre la lengua de los cambulloneros: El pichingli. Revista de Filología de la Universidad de la Laguna 14, 245–252.
Böhm G (1996). Sprache und Geschichte im Kanarischen Archipel. Vol. 1: Kulturgeschichte. Wien: Afro-Pub.
López J M & Díaz D C (eds.) (1996). El español de Canarias hoy: análisis y perspectivas. Frankfurt: Vervuert/Madrid: Iberoamericana.
Cantonese
See: Chinese.
Cape Verde Islands: Language Situation Editorial Team © 2006 Elsevier Ltd. All rights reserved.
The Republic of Cape Verde consists of 10 islands and five islets off the west coast of Africa, about 600 km west of Senegal. The islands were uninhabited until the 15th century, when they were colonized by the Portuguese, who used them as a supply and trading post for the slave trade. Many speakers of African (mainly West-Atlantic) languages were brought from the then-Portuguese territory of Guinea-Bissau. Cape Verde became independent in 1975. The official language of Cape Verde is Portuguese. However, the majority of the 415 000 (July 2004 estimate) residents of the islands speak the Portuguese-based Cape Verdean creole Kabuverdianu (Crioulo/Kriolu) as their first language. Kabuverdianu falls into two main dialect groups, Sotavento and Barlavento. The former is spoken on the southern (Sotavento) islands, which include São Tiago, with the capital, Praia, and site of the earliest settlements. Sotavento is spoken by about 65% of Kabuverdianu speakers. The dialect of the northern Barlavento islands, which were settled only in the late 17th and 18th centuries, is spoken by the remaining 35% of speakers. Since independence, the role and status of Kabuverdianu have increased, and the language is used in domains previously reserved for Portuguese, e.g., formal religious and political discourse. Portuguese is used on television and radio, as well as in education, where it is the primary language of instruction throughout all levels. Because of harsh economic conditions and high unemployment, many Cape Verdeans have left the islands and work abroad, so the majority of speakers of Kabuverdianu (about 934 000) do not live in Cape Verde. There are large Kabuverdianu communities in Guinea-Bissau, Senegal, several western European countries, and the United States. See also: Pidgins and Creoles: Overview.
Bibliography
Chabal P (2002). A history of postcolonial Lusophone Africa. London: Hurst.
Holm J (1989). Pidgins and creoles. Vol. 2: Reference survey. Cambridge: Cambridge University Press.
Cape Verdean Creole M Baptista, University of Georgia, Athens, GA, USA © 2006 Elsevier Ltd. All rights reserved.
Cape Verdean Creole (henceforth CVC) is spoken in the Cape Verde Islands, an archipelago located in the Atlantic Ocean off the northwestern coast of Africa, approximately 450 kilometers from Senegal. The archipelago is divided into two main clusters: the windward islands (locally known as Barlavento) and the leeward islands (Sotavento). Barlavento includes Boavista, Sal, São Nicolau, Santa Luzia, São Vicente, and Santo Antão. Sotavento consists of Brava, Fogo, Santiago, and Maio. Given the strategic location of the archipelago at the crossroads of Europe, Africa, and America, the Portuguese settled the islands from 1462 onward, and the islands came to play a critical role in the slave trade from the 15th to the 19th centuries. As a result, many view CVC as the oldest creole alive today. Historical sources (Brásio, 1962) state that the tribes of Mandingues, Balantes, Bijagos, Feloupes, Beafadas, Pepels, Quissis, Brames, Banhuns, Peuls, Jalofos, Bambaras, Bololas, and Manjakus provided most of the human contingent to the slave trade in Cape Verde. The white settlers came from Algarve and Alentejo in Portugal and also included Jews, Spaniards, Italians, and French (Martinus, 1996). Because the islands were settled at different times by different populations, it is not surprising that a number of morphophonological and syntactic features distinguish Barlavento varieties (closer to Portuguese) from their Sotavento counterparts (more Africanized), resulting in a fairly complex sociolinguistic situation.
Although earlier descriptions of the language viewed CVC as a mere dialect of Portuguese, recent studies have shed new light on the hybrid nature of CVC by focusing on the African contributions to the formation of the language. Baptista (2003b) studied specifically reduplication, a morphological process found in African languages whereby a reduplicated adjective or adverb expresses emphasis, as in moku moku ‘very drunk’ or faxi faxi ‘very quickly’. Noun reduplication may yield a distributive interpretation, as in dia dia ‘every day’, or may simply lead to a change in meaning, as in boka ‘mouth’, boka boka ‘in secret’. Lexical categories such as adjectives, once reduplicated, may shift category (i.e., adjective to noun), as in mansu ‘quiet’, mansu mansu ‘secrecy’. Other scholars such as Rougé (2004) and Quint (2000) have examined the possible African etymology of some of the Cape Verdean linguistic items that have found their way into the grammatical and lexical components of the language. Lang (2004) has investigated how some grammatical morphemes inherited from Portuguese may also take on new functions passed down from substrates like Wolof. In a similar vein, Baptista (2003a) has examined how the plural suffix -s in Cape Verdean, inherited from Portuguese, is sensitive to conditions such as the animacy hierarchy and definiteness, two variables that play a role in the African languages that contributed to the genesis of CVC. Such studies demonstrate the genuinely hybrid nature of CVC by examining how various elements from all source languages involved in its genesis interact and at what level. This gives us valuable insights into the cognitive processes at play when languages come abruptly into contact.
See also: Cape Verde Islands: Language Situation; Pidgins and Creoles: Overview. Language Maps (Appendix 1): Maps 47, 48.
Bibliography
Baptista M (2002). The syntax of Cape Verdean Creole: the Sotavento varieties. Amsterdam/Philadelphia: John Benjamins.
Baptista M (2003a). ‘Number inflection in creole languages.’ Interface 6, 3–26.
Baptista M (2003b). ‘Reduplication in Cape Verdean Creole.’ In Kouwenberg S (ed.) Twice as meaningful: reduplication in pidgins and creoles. London: Battlebridge. 177–184.
Brásio A (1962). ‘Descobrimento, povoamento, evangelização do archipélago de Cabo Verde.’ Studia 10, 49–97.
Lang J (2004). Dicionário do crioulo da ilha de Santiago (Cabo Verde). Tübingen: Gunter Narr Verlag.
Martinus F (1996). The kiss of a slave: Papiamentu’s West-African connections. Ph.D. diss., University of Amsterdam.
Quint N (2000). Grammaire de la langue cap-verdienne. Paris: L’Harmattan.
Rougé J L (2004). Dictionnaire étymologique des créoles portugais d’Afrique. Paris: Karthala.
Veiga M (1998). Le Créole du Cap-Vert: etude grammaticale descriptive et contrastive. Ph.D. diss., Université Aix-Marseille.
Cariban Languages S Meira, Leiden University, Leiden, The Netherlands © 2006 Elsevier Ltd. All rights reserved.
The Cariban family is one of the largest genetic groups in South America, with more than 25 languages (see Figure 1) spoken mostly north of the Amazon, from Colombia to the Guianas and from northern Venezuela to Central Brazil (see Figure 2). Despite the long history of their studies, most Cariban languages are still insufficiently described. The best descriptive works published so far are Hoff (1968, on Karinya) and Derbyshire (1979, 1985, on Hishkaryana). There are good descriptive works on Apalai, Makushi, and Waiwai in Derbyshire and Pullum (1986–1998); Jackson (1972) gives a brief, but detailed, overview of Wayana. Muller (1994) is a very informative Panare dictionary. Meira (2005) and Carlin (2004) are full descriptions of Tiriyo; Meira (2000), mostly a historical study, contains some descriptive work on Tiriyo, Akuriyo, and Karihona. Gildea (1998) and Derbyshire (1999) contain surveys of the family.
Comparative Studies and Classification
First recognized by the Jesuit priest Filippo Salvadore Gilij in the 18th century (Gilij, 1780–1783), the Cariban family was subsequently studied by L. Adam (1893) and C. H. de Goeje (1909, 1946). After some initial tentative proposals within larger South American classifications (the last of which is Loukotka, 1968), the first detailed classification was published by V. Girard (1971), followed by M. Durbin (1977) and T. Kaufman (1994). Durbin’s classification – unfortunately used in the Ethnologue (SIL) – is, as Gildea (1998) pointed out, seriously flawed; Girard’s classification is limited (14 low-level subgroups); Kaufman’s classification is probably the best, although it is based not on firsthand sources but on the comparison of other classifications. The proposal in Figure 1 is the preliminary result of ongoing comparative research. There is some good evidence that Cariban and Tupian languages are distantly related (Rodrigues, 1985); other hypotheses (e.g., Ge-Pano-Carib and Macro-Carib, from Greenberg, 1987) remain mostly unsupported and are not accepted by specialists. Shafer (1963) was the first attempt at reconstructing Proto-Cariban phonology, but its many flaws make Girard (1971) the real first proposal in this area. The most up-to-date study is Meira and Franchetto (2005). Meira (2000) reconstructs the phonology and morphology of the intermediate proto-language of the Taranoan subgroup.
Figure 1 A tentative classification of Cariban languages. (?) = difficult to classify; (†) = extinct (not all listed here). Different names or spellings for the same language are given in parentheses. Dialects are indented under the language name. (Demographic data refer to speakers, not ethnic members of the group; sources: Ethnologue and author’s own work).
Main Linguistic Features
Phonology
Cariban languages have small segmental inventories: usually only voiceless stops (p, t, k, ʔ), one or two fricatives/affricates (h or ɸ, s or ʃ or tʃ), two nasals (m, n), a vibrant (&, often or ), glides (w, j), and six vowels (a, e, i, o, u, ɨ). Some languages have distinctive voiced obstruents (Bakairi, Ikpeng, Karihona), more than one vibrant or lateral (Bakairi, Kuikuro, Ikpeng, Hishkaryana, Waiwai, Kashuyana), or more fricatives or affricates (Bakairi, Waimiri-Atroari, Kashuyana, Waiwai); others have an extra vowel ə (Wayana, Tiriyo, Panare, Bakairi, Pemong, Kapong). Vowel length is often distinctive, whereas nasality usually is not, with few exceptions (Apalai, Bakairi, Kuikuro). Many languages have weight-sensitive rhythmic (iambic) stress (Table 1; Meira, 1998); some, however, have simple cumulative, usually penultimate, stress (Panare, Bakairi, Kuikuro, Yukpa). Morphophonological phenomena include stem-initial ablaut in verbs and nouns and the systematic reduction of stem-final syllables within paradigms (Gildea, 1995; Meira, 1999).
Figure 2 Map of the current distribution of Cariban languages. Living languages in bold, extinct languages in normal type. Ak, Akuriyo; Ar, Arara; Bk, Bakairi; Ch, Chayma†; Dk, De’kwana; Hk, Hishkaryana; Ik, Ikpeng; Ka, Karinya; Kh, Karihona; Kk, Kuikuro; Km, Kumanakoto†; Kp, Kapong; Ks, Kashuyana; Mk, Makushi; Mp, Mapoyo; Pe, Pemong; Pi, Pimenteria†; Pm, Palmella†; Pn, Panare; Ti, Tiriyo; Tm, Tamanaku; Yu, Yukpa; Yw, Yawarana; Wm, Waimiri-Atroari; Ww, Waiwai; Wy, Wayana.
Morphology
Cariban languages are mostly suffixal; prefixes also exist, marking person and valency (the latter on verbs). Some languages (Tiriyo, Wayana, Apalai) have reduplication. The complexity of the morphology is comparable to that of Romance languages. There are usually nouns, verbs, postpositions, adverbs (a class that includes most adjectival notions), and particles. Possessed nouns take possession-marking suffixes that define subclasses (-ri, -ti, -ni, -Ø) and person-marking prefixes that indicate the possessor (e.g., Ikpeng o-megum-ri ‘your wrist’, o-muj-n ‘your boat’, o-egi-Ø ‘your pet’). With overt nominal possessors, some languages have a linking morpheme j- (e.g., Panare Toman j-uwe ‘Tom’s house, place’). Nouns can also be marked for past (‘ex-N,’ ‘no longer N’) with special suffixes (-tpo, -tpi, -bi, -tpe, -hpe, -npe, etc.; e.g., Bakairi ũw -bi-ri ‘my late father’).
Pronouns distinguish five persons (1, 2, 3, 1+2 = dual inclusive = ‘you and I,’ 1+3 = exclusive; the 1+3 pronoun functions syntactically as a third-person form) and two numbers (singular, or noncollective, and plural, or collective). The third-person forms also have gender (animate vs. inanimate) and several deictic distinctions (Table 2). Each pronoun usually has a corresponding person-marking prefix (except 1+3, which takes simple third-person markers). In some languages, the 1+2 prefixes were lost (Kapong, Pemong, Makushi); in others, the prefixes are replaced by pronouns as overt possessors (Yukpa, Waimiri-Atroari).
In more conservative languages, verbs have a complex inflectional system, with prefixes marking person and suffixes marking various tense-aspect-mood and number distinctions. The person-marking prefixes form what Gildea termed the Set I system (Table 3), variously analyzed as split-S or active-stative (e.g., by Gildea) or as cross-referencing both A (Agent) and P (Patient) (Hoff, 1968). In most languages, however, innovative systems have arisen from the reanalysis of older deverbal nominalizations or participials, and are now in competition with the Set I system. Most of the new systems follow ergative patterns, thus creating various cases of ergative splits and even a couple of fully ergative languages (Makushi, Kuikuro, in which the Set I system has been entirely lost). Gildea (1998) provides a detailed account of this diachronic development. Underived adverbs usually take no morphology other than one nominalizing suffix.
Table 1 Rhythmic (iambic) stress: Tiriyo (a)
1. Words with only light (CV) syllables, based on the stem apoto ‘helper, servant’
apoto                 [(a.po:).to]                       ‘helper’
m-apoto-ma            [(ma.po:).to.ma]                   ‘you helped him’
kit-apoto-ma          [(kɨ.ta:).(po.to:).ma]             ‘the two of us helped him’
m-apoto-ma-ti         [(ma.po:).(to.ma:).ti]             ‘you all helped him’
kit-apoto-ma-ti       [(kɨ.ta:).(po.to:).ma.ti]          ‘we all helped him’
m-apoto-ma-po-ti      [(ma.po:).(to.ma:).po.ti]          ‘you all had him helped’
kit-apoto-ma-po-ti    [(kɨ.ta:).(po.to:).(ma.po:).ti]    ‘we all had him helped’
2. Words with at least one heavy (non-CV) syllable
kin-erahte-po-ti      [(kɨ.ne:).(rah).(te.po:).ti]       ‘he made them all be found’
mi-repente-te-ne      [(mi.re:).(pen).(te.te:).ne]       ‘you all paid/rewarded him’
m-aite-po-te-ne       [(mai).(te.po:).te.ne]             ‘you all had it pushed’
(a) Iambic feet are enclosed in parentheses. Dots = syllable boundaries; hyphens = morpheme boundaries.
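The weight-sensitive iambic footing illustrated in Table 1 can be approximated with a short procedure. The following is only a sketch under stated assumptions, not the analysis of Meira (1998): the input is taken to be already syllabified, vowels are written in simplified ASCII (so kɨ appears as ki), the final syllable is assumed never to be footed, and a light syllable that cannot pair with a following light syllable is assumed to stay unfooted (a case the table does not illustrate for light-before-heavy sequences). The function names are hypothetical.

```python
VOWELS = set("aeiou")


def is_heavy(syllable):
    """Return True for any syllable that is not plain (C)V."""
    vowel_count = sum(ch in VOWELS for ch in syllable)
    has_coda = syllable[-1] not in VOWELS
    return vowel_count > 1 or has_coda


def parse_feet(syllables):
    """Group pre-syllabified input into iambic feet, left to right."""
    if not syllables:
        return "[]"
    last = len(syllables) - 1  # assumption: the final syllable is never footed
    parts = []
    i = 0
    while i < last:
        current = syllables[i]
        if is_heavy(current):
            # A heavy syllable forms a foot on its own.
            parts.append(f"({current})")
            i += 1
        elif i + 1 < last and not is_heavy(syllables[i + 1]):
            # Two light syllables form a foot; the second one is lengthened.
            parts.append(f"({current}.{syllables[i + 1]}:)")
            i += 2
        else:
            # Assumption: a stranded light syllable stays unfooted.
            parts.append(current)
            i += 1
    parts.append(syllables[last])
    return "[" + ".".join(parts) + "]"


examples = [
    ["a", "po", "to"],
    ["ma", "po", "to", "ma"],
    ["ki", "ne", "rah", "te", "po", "ti"],
    ["mai", "te", "po", "te", "ne"],
]
for word in examples:
    print("".join(word), parse_feet(word))
```

Run on the syllabified forms of Table 1, the sketch reproduces the bracketed footings given there; whether it extends to other Tiriyo words, or to other Cariban languages with rhythmic stress, is not something the table alone can confirm.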
There are many postpositions, often formed with smaller locative or directional elements; they can take the same person-marking prefixes as nouns, and (usually) the same nominalizing suffix as adverbs. There are many particles in several syntactic subclasses and with various semantic and pragmatic contents (diminutives, evidentials, modals, etc.; cf. Hoff, 1986, 1990, for the Karinya case). Class-changing morphology is quite rich. Verbs have many nominalizing affixes (‘actual’ vs. ‘habitual’ or ‘potential’ A, P, S; circumstance; action) and also adverbialized forms (participial, temporal, modal, etc.). There also are affixes for intransitivizing, transitivizing and causativizing verb stems (according to their valency). There are several noun verbalizers (inchoative: ‘to produce/have N’; privative: ‘to de-N X’; dative: ‘to provide X with N’).
Table 2 A typical Cariban pronominal system: Kashuyana
Third person
                   Inanimate                Animate
                   Sing.     Pl.            Sing.     Pl.
Anaphoric          iro       iro-tomu       noro      norojami
Demonstrative
  Proximal         soro      soro-tomu      mosoro    mo tsari
  Medial           moro      moro-tomu      moki      mokjari
  Distal           moni      mon-tomu       mokiro    mokjari
Other persons
                   Sing.     Pl.
1                  owi
2                  omoro     omjari
1+2                kumoro    kimjari
1+3                amna
Table 3 Cariban person-marking systems
Conservative (Set I) system: Karinya
          1P       2P       1+2P     3P         (SA)
1A                 k-                s(i)-      Ø-
2A        k-                         m(i)-      m-
1+2A                                 kis(i)-    kit-
3A        Ø-/j-    a(j)-    k-       n(i)-      n(i)-
(SP)      Ø-/j-    a(j)-    k-       n(i)-
Innovative system: Makushi
          S        P            A
1         u-       u(j)-        -u-ja
2         a-       a(j)-        -Ø-ja
1+2
3         i-       i(t)-/Ø-     -i-ja
Refl      ti-      t(i)-        -ti(u)-ja
Syntax
Cariban languages are famous as examples of the rare OVS word order (Derbyshire, 1977), with Hishkaryana as the first case study.
(1) Hishkaryana
    toto j-oska-je okoje
    man LINKER-bite-PAST snake
    ‘The snake bit the man.’ (Derbyshire, 1979: 87)
Tight syntactic constituents are few: most languages have only OV-phrases (only with third-person A and P), possessive phrases (possessor-possessed), and postpositional phrases. There are no modifier slots: ‘modification’ is carried out by the apposition of syntactically independent but pragmatically coreferential nominals (e.g., the woman, that one, the tall one, the one with beads instead of that tall woman with beads). Equative clauses can have a copula, but verbless clauses also occur:
(2) Bakairi
    tuhu ire
    stone this
    ‘This is a stone.’ (author’s data)
Negation is based on a special adverbial form of the verb, derived with a negative suffix (usually -pira, -pra, -hra, -ra, etc.), in a copular clause:
(3) Apalai
    isapokara on-ene-pira aken
    lizard.sp 3NEG-see-NEG 1:be:PAST
    ‘I did not see a jacuraru lizard.’ (Lit. lizard not-seeing-it I-was) (Koehn and Koehn, 1986: 64)
Subordinate clauses are usually based on deverbal nominals or adverbials. In some languages, there are finite subordinate clauses (Panare, Tamanaku, Yukpa, Tiriyo). The sentences below exemplify relative clauses (in brackets): nominalizations (4) and finite clauses with relativizing particles (5).
(4) Tiriyo
    kaikui e-wa:re, [pahko i-n-tu:ka-hpe]?
    dog 2-known.to father 3-PAT.NMLZR-beat-PAST
    ‘Do you know the dog that my father beat?’ (author’s data)
(5) a. Tamanaku
    t onkai pe it-et eti pare [n-epu-i net i]?
    which 3-name priest 3-come-PAST RELAT
    ‘What is the name of the priest who has (just) come?’ (Gilij, 1782: III, 176)
    b. Yukpa
    ake peru [kat amo=n woneta] sa=ne siiw
    that dog RELAT you=DAT 1.talk thus=3.be white
    ‘The dog that I talked to you about was white.’ (author’s data)
With verbs of motion, a special deverbal (supine) form is used to indicate the purpose of the displacement.
(6) Wayana
    epi-he wi-te-jai
    bathe-SUPINE 1-go-PRESENT
    ‘I am going (somewhere) to bathe.’ (Jackson, 1972: 60)
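The interlinear examples in this section pair each word with a gloss and each hyphen-separated morpheme with a label. As an illustration only, the sketch below aligns the two lines of example (1) and reads off the constituent order; the helper names `align` and `constituent_order` are hypothetical, and the order heuristic (any word that is neither the subject nor the object counts as the verb) is an assumption that suits only simple three-word clauses of this kind.

```python
def align(words_line, gloss_line):
    """Pair each word with its gloss, then each morpheme with its label."""
    words = words_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        raise ValueError("word and gloss counts differ")
    aligned = []
    for word, gloss in zip(words, glosses):
        morphs = word.split("-")
        labels = gloss.split("-")
        if len(morphs) != len(labels):
            raise ValueError(f"morpheme mismatch in {word!r} / {gloss!r}")
        aligned.append(list(zip(morphs, labels)))
    return aligned


def constituent_order(aligned, subject, obj):
    """Label each word S, O, or V (V = any word that is neither S nor O)."""
    order = []
    for word in aligned:
        labels = {label for _, label in word}
        if subject in labels:
            order.append("S")
        elif obj in labels:
            order.append("O")
        else:
            order.append("V")
    return "".join(order)


# Example (1), Hishkaryana: 'The snake bit the man.'
hishkaryana = align("toto j-oska-je okoje", "man LINKER-bite-PAST snake")
print(hishkaryana)
print(constituent_order(hishkaryana, subject="snake", obj="man"))
```

For example (1), the subject ‘snake’ is the last word and the object ‘man’ the first, so the sketch reports the object-verb-subject order discussed above. The same alignment step accepts example (6), whose two words carry two and three morphemes respectively.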
Lexicon and Semantics
Cariban languages have few number words, usually not specifically numerical (one = alone, lonely; two = a pair, together; three = a few); higher numbers are expressed with (often not fully conventionalized) expressions based on words for hand, foot, person or body, or are borrowings. Spatial postpositions often distinguish: vertical support (‘on’), containment (‘in’), attachment/adhesion, Ground properties (‘in open space,’ ‘on summit of,’ ‘in water’), and complex spatial configurations (‘astraddle,’ ‘parallel to,’ ‘piercing’). Some languages have ‘mental state’ postpositions (desiderative: want; cognoscitive: know; protective: protective toward; etc.). There are different verbs for eating, depending on what is eaten; to every verb corresponds a noun designating the kind of food in question (e.g., Tiriyo ene ‘eat meat,’ oti ‘meat food’; enapi ‘eat fruits, vegetables’, nnapi ‘fruit, vegetable food’; eku ‘eat bread’, uru ‘bread food’; aku ‘eat nuts,’ mme ‘nut food’). See also: Brazil: Language Situation; Colombia: Language Situation; Ergativity; French Guiana: Language Situation; Guyana: Language Situation; Rhythm; Rhythmic Alternations; Suriname: Language Situation; Venezuela: Language Situation; Word Stress.
Bibliography
Adam L (1893). Matériaux pour servir à l’établissement d’une grammaire comparée des dialectes de la famille caribe. Bibliothèque Linguistique Américaine, vol. 17. Paris: J. Maisonneuve.
Carlin E B (2004). A grammar of Trio. Duisburger Arbeiten zur Sprach- und Kulturwissenschaft, vol. 55. Frankfurt am Main: Peter Lang (Europäischer Verlag der Wissenschaften).
Derbyshire D C (1977). ‘Word order universals and the existence of OVS languages.’ Linguistic Inquiry 8, 590–599.
Derbyshire D C (1979). Hixkaryana. Lingua Descriptive Series, vol. 1. Amsterdam: North-Holland.
Derbyshire D C (1985). Hixkaryana and linguistic typology. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Derbyshire D C (1999). ‘Carib.’ In Dixon R M W & Aikhenvald A Y (eds.) The Amazonian languages. Cambridge Language Surveys. Cambridge: Cambridge University Press. 23–64.
Derbyshire D C & Pullum G K (eds.) (1986–1998). Handbook of Amazonian languages (4 vols). Berlin: Mouton de Gruyter.
Durbin M (1977). ‘A survey of the Cariban language family.’ In Basso E (ed.) Carib speaking Indians, culture, and society. Tucson: University of Arizona Press.
Gildea S (1995). ‘A comparative description of syllable reduction in the Cariban language family.’ International Journal of American Linguistics 61, 62–102.
Gildea S (1998). On reconstructing grammar: comparative Cariban morphosyntax. Oxford Studies in Anthropological Linguistics, vol. 18. Oxford: Oxford University Press.
Gilij, Filippo Salvadore (1780–1783). Saggio di storia americana (4 vols). Rome: Luigi Salvioni (Stampator Vaticano).
Girard V (1971). ‘Proto-Carib phonology.’ Ph.D. diss., University of California, Berkeley.
de Goeje C H (1909). Études linguistiques caraïbes. Verhandelingen der Koninklijke Akademie van Wetenschappen, Letterkunde, nieuwe reeks, deel X, no. 3. Amsterdam: Johannes Müller.
de Goeje C H (1946). Études linguistiques caraïbes, vol. 2. Verhandelingen der Koninklijke Nederlandsche Akademie van Wetenschappen, Letterkunde, nieuwe reeks, deel IL, no. 2. Amsterdam: N. V. Noord-Hollandsche Uitgeversmaatschappij.
Greenberg J H (1987). Language in the Americas. Stanford: Stanford University Press.
Hawkins R E (1998). ‘Wai Wai.’ In Derbyshire & Pullum (eds.). 25–224.
Hoff B J (1968). The Carib language. The Hague: Martinus Nijhoff.
Hoff B J (1986). ‘Evidentiality in Carib: particles, affixes, and a variant of Wackernagel’s law.’ Lingua 69, 49–103.
Hoff B J (1990). ‘The non-modal particles of the Carib language of Surinam and their influence on constituent order.’ In Payne D L (ed.) Amazonian linguistics: studies in lowland South American languages. Austin: University of Texas Press. 495–541.
Jackson W S (1972). ‘A Wayana grammar.’ In Grimes J E (ed.) Languages of the Guianas. Norman: Summer Institute of Linguistics and University of Oklahoma Press. 47–77.
Kaufman T K (1994). ‘The native languages of South America.’ In Moseley C & Asher R E (eds.) Atlas of the world’s languages. New York: Routledge. 46–76.
Koehn E & Koehn S (1986). ‘Apalai.’ In Derbyshire & Pullum (eds.). 33–127.
Loukotka Č (1968). Classification of South American Indian languages. Los Angeles: Latin American Center, University of California.
Meira S (1998). ‘Rhythmic stress in Tiriyó.’ International Journal of American Linguistics 64, 352–378.
Meira S (1999). ‘Syllable reduction and ghost syllables in Tiriyó.’ In Hwang S J & Lommel A R (eds.) XXV LACUS Forum. Fullerton, CA: The Linguistic Association of Canada and the United States (LACUS). 125–131.
Meira S (2000). A reconstruction of Proto-Taranoan: phonology and morphology. Munich: LINCOM Europa.
Meira S (2005). A grammar of Tiriyó. Berlin: Mouton de Gruyter.
Meira S & Franchetto B (2005). ‘The southern Cariban languages and the Cariban family.’ International Journal of American Linguistics 71, 127–192.
Muller M C M (1994). Diccionario ilustrado panare-español, español-panare. Caracas: Comisión Quinto Centenario, Gráficas Armitano.
Rodrigues A D (1985). ‘Evidence for Tupi-Cariban relationship.’ In Klein H & Stark L (eds.) South American languages: retrospect and prospect. Austin: University of Texas Press. 371–404.
Shafer R (1963). Vergleichende Phonetik der karaibischen Sprachen. Verhandelingen der Koninklijke Nederlandse Akademie van Wetenschappen, Letterkunde, nieuwe reeks, deel LXIX, no. 2. Amsterdam: N. V. Noord-Hollandsche Uitgeversmaatschappij.
Caribbean Lexicography L Winer, McGill University, Montreal, Quebec, Canada © 2006 Elsevier Ltd. All rights reserved.
Glossaries of ‘creolisms’ of the Caribbean are found in works ranging from travelogues, novels, and cookbooks to scientific studies, particularly of local flora and fauna. Amateur word lists of varying lengths have also been written for several of the Caribbean territories. These have words and definitions, sometimes with proposed historical derivations, and often include proverbs. The best known are Ottley’s series on ‘Trinibagianese’ (1965–1967) and Mendes’s Cote ci, cote la (1986, 2000), both for Trinidad and Tobago (see also Baptiste, 1993; Haynes, 1987), but such lists are found for other territories, including Jamaica (Maxwell, 1981; Rosen, 1987), Barbados (Collymore, 1955–1970), Belize (McKesey, 1974), Anguilla (Christian, 1993), and the Virgin Islands (Seaman, 1967–1976; Valls, 1981). Some word lists focus on contributing languages, for example, Hindi (Mahabir and Mahabir, 1990) and French (Ryan, 1985), or on specific domains, for example, dancehall (Francis-Jackson, 1995). These works of ‘local slang’ are generally intended for a popular, sometimes tourist, audience, often more to amuse than to inform. They are of very limited scope, have no standardized or consistent orthography, and though valuable, are often inaccurate, especially for derivations.
Scholarly lexicography in the English Caribbean began with the landmark Dictionary of Jamaican English (Cassidy and Le Page, 1967, 1980). This was the first regional dictionary to be prepared on the historical principles set down by the Oxford English Dictionary; it remains a valuable resource and model. It was followed by the Dictionary of Bahamian English (Holm with Shilling, 1982) and the Dictionary of Caribbean English Usage (Allsopp, 1996); the latter is designed to cover the entire English Caribbean beyond Jamaica. Both include illustrative citations and regional designations where known; the latter includes some historical information and guidelines for ‘correct’ usage.
A historical dictionary for Trinidad and Tobago is in preparation (Winer, forthcoming). These dictionaries are all intended to have popular appeal and educational applications, as well as providing information for scholars and readers, especially in linguistics and literature.
Three concerns of current lexicography in the English Caribbean are of particular importance. The first is boundaries of inclusion (Winer, 1993: 48–57), which are often difficult to determine when so many words are shared by a regional English creole and an international standard English. Where words have diverged in meaning, for example, miserable E. ‘unhappy’ vs. CE ‘badly behaved,’ or are commonly used in the Caribbean but are now archaic in SE, for example, pappyshow ‘object of ridicule,’ it is reasonable to include them. A common problem of amateur works is the inclusion of words as ‘local’ that are in fact informal or colloquial forms of standard English, for example, jack up prices, bamboozle. The second problem is the lack of an agreed-upon standardized orthography, either within or between countries, with some people favoring a more phonetic approach, and some a more historical one (hampered by frequent uncertainty as to origin) (Winer, 1990). Finally, although all lexicographers may wish for better sources of etymologies, there is a particular lack of appropriate linguistic resources for a number of the major Amerindian and African languages especially relevant to the development of language in the Caribbean. See also: Barbados: Language Situation; Belize: Language Situation; Jamaica: Language Situation; Trinidad and Tobago: Language Situation.
Bibliography
Allsopp R (1996). Dictionary of Caribbean English usage. Oxford: Oxford University Press.
Baptiste R (1993). Trini talk: a dictionary of words and proverbs of Trinidad & Tobago. Port of Spain, Trinidad: Caribbean Information Systems & Services.
Cassidy F G & Le Page R (1967, rev. ed. 1980). Dictionary of Jamaican English. Oxford: Oxford University Press.
Christian I (ed.) (1993). Dictionary of Anguillian language. The Anguilla Printers: Government of Anguilla, Adult and Continuing Education Unit.
Collymore F A (1955–1970). Barbadian dialect (5 edns.). The Barbados National Trust.
Francis-Jackson C (1995). The official dancehall dictionary: a guide to Jamaican dialect and dancehall slang. Kingston: Kingston Publishers.
Haynes M (1987). Trinidad and Tobago dialect (plus). San Fernando, Trinidad: Haynes.
Holm J & Shilling A (1982). Dictionary of Bahamian English. Cold Spring, NY: Lexik House.
Mahabir K & Mahabir S (1990). A dictionary of common Trinidad Hindi. El Dorado, Trinidad: Chakra Publishing.
Maxwell K (1981). How to speak Jamaican. Kingston: Jamrite Publications.
McKesey G (1974). The Belizean lingo. Belize: National Printers Ltd.
Mendes J (1986, rev. ed. 2000). Cote ci, cote la: Trinidad & Tobago dictionary. Port of Spain, Trinidad: Medianet.
Ottley C R (1965–1967). Creole talk (Trinibagianese) of Trinidad and Tobago: words, phrases and sayings peculiar to the country (4 vols). (rev. 1-vol. ed. 1971) Trinidad: Ottley.
Rosen B (1987). Speak Jamaican. Kingston, Jamaica: Newmarket Investment Co.
Ryan P (1985). Macafouchette. Trinidad: Ryan.
Seaman G A (1967–1976). Virgin Islands dictionary. St. Croix: Seaman.
Valls L (1981). What a pistarckle: A glossary of Virgin Islands English Creole. St. John USVI: Valls.
Winer L (1990). ‘Standardization of orthography of the English Creole of Trinidad and Tobago.’ Language Problems & Language Planning 14(3), 237–268.
Winer L (1993). Trinidad and Tobago, vol. 6: Varieties of English around the world. Amsterdam: John Benjamins.
Winer L (forthcoming). Dictionary of the English/Creole of Trinidad & Tobago. Toronto: University of Toronto Press.
Carnap, Rudolf (1891–1970) T P Górski, University of Wrocław, Poland © 2006 Elsevier Ltd. All rights reserved.
Rudolf Carnap, born on May 18, 1891, in Wuppertal (Germany), was a philosopher, logician, and mathematician. From 1910 to 1914 he studied philosophy, mathematics, and physics at the Universities of Jena and Freiburg, and took part in Frege’s courses on the system of logic. Carnap planned to complete his dissertation in physics on thermionic emission, but the advent of World War I interrupted his studies. In 1917 he returned from the war and began to study the theory of relativity in Berlin. The new dissertation he developed dealt with an axiomatic system for the physical theory of time and space (greatly inspired by Kant’s Critique of Pure Reason). It was issued in 1922 under the title Der Raum. In 1925 Carnap moved to Vienna to accept the post of Assistant Professor at the University of Vienna, and within the next few years he became one of the leaders of the Vienna Circle. In 1931 he moved to Prague to become Professor of Natural Philosophy and four years later emigrated to the United States. He died on September 14, 1970, in Santa Monica, California.
Carnap’s works deal mainly with semantics and formal logic, and their application to the methodology of sciences, and also the philosophy of sciences. He also investigated the foundations of mathematics, the theory of probability, logical induction, and the theory of time and space. As his philosophical ideas developed, he underwent a profound change from positivism to neopositivism. In his early works he claimed that philosophical research should be limited to the logical analysis of scientific language, to which he wanted to apply traditional philosophy. At the same time, influenced by Wittgenstein, Carnap criticized all kinds of metaphysics, especially realism and idealism, which he called scientific pseudo-problems. Metaphysical statements, he claimed, are neither true nor false, but simply devoid of sense; they are statements only from the grammatical point of view, but logically, they are not statements. Carnap’s classification of statements (formulas) in scientific languages may be seen in Table 1. The distinction between observational and theoretical formulas, as presented in Table 1, led Carnap to distinguish between two kinds of scientific laws: empirical and theoretical.
Carnap’s radical ideas are strongly connected with his view on the verification of sentences, and with the need to construct a common language for all empirical sciences. Later, however, the idea of a common language was replaced by a postulate of transforming (by either reducing or eliminating) the general scientific terms into the language of classical physics. In his last years his views were less categorical. He formulated a kind of basis for the construction of a scientific language, allowing for the use of scientific languages constructed differently. Carnap tried to combine his empirical attitude (connected with the science of natural history) with phenomenalism (a tendency toward the subjective treatment of experience). Thus, the change in his views also contributed to the decline of the phenomenological approach.
Table 1 Carnap’s classification of statements (formulas) in scientific languages
Type of statement                 Observational terms    Theoretical terms
Logical statements                No                     No
Purely theoretical statements     No                     Yes
Observational sentences           Yes                    No
Rules of correspondence           Yes                    Yes
Bibliography
Carnap R (1922). ‘Der Raum: Ein Beitrag zur Wissenschaftslehre.’ Dissertation. In Kant-Studien, Ergänzungshefte, n. 56.
Carnap R (1934). Logische Syntax der Sprache (The logical syntax of language). New York: Humanities Press, 1937.
Carnap R (1935). Philosophy and logical syntax. London: Kegan Paul.
Carnap R (1942). Introduction to semantics. Cambridge, MA: Harvard University Press.
Carnap R (1943). Formalization of logics. Cambridge, MA: Harvard University Press.
Carnap R (1947). Meaning and necessity: a study in semantics and modal logic. Chicago: University of Chicago Press.
Carnap R (1950). Logical foundations of probability. Chicago: University of Chicago Press.
Carnap R (1952). The continuum of inductive methods. Chicago: University of Chicago Press.
Carnap R (1966). Philosophical foundations of physics. Chicago: University of Chicago Press.
Creath R (ed.) (1990). Dear Carnap, Dear Van: the Quine–Carnap correspondence and related work. Berkeley: University of California Press.
Logic, language, and the structure of scientific theories: Proceedings of the Carnap–Reichenbach Centennial, University of Konstanz, May 21–24, 1991 (Pittsburgh, PA: University of Pittsburgh Press/[Konstanz]: Universitätsverlag Konstanz, 1991).
Pasquinelli A (ed.) (1995). L’eredità di Rudolf Carnap: Epistemologia, Filosofia delle Scienze, Filosofia del Linguaggio. Bologna: CLUEB.
PSA 1970: Proceedings of the 1970 Biennial Meeting of the Philosophy of Science Association: In Memory of Rudolf Carnap. Dordrecht: D. Reidel.
Cartography: Semiotics
C De Sousa, University of Wisconsin at Milwaukee, Milwaukee, WI, USA
© 2006 Elsevier Ltd. All rights reserved.
Introduction

Maps suggest an intrinsic and largely unconscious link between knowledge and visual representation (pictures, diagrams, etc.), with one implying the other. A map is a type of diagram that allows a user to find places on terra firma through a drawing of those places. Figure 1, for example, shows how to go from one location (A) to another (B). The locations are represented as points and the streets as lines meeting at right angles. Getting to B involves traveling west two blocks and north three blocks from location A. Compass directions are specified as N = north, S = south, E = east, and W = west, and blocks are marked as equally calibrated units on the lines. With such simple diagrammatic elements (points, lines, etc.), it is possible to represent all kinds of actual topographical spaces in outline form.

Figure 1 Map of how to get from A to B.

A map can thus be defined, semiotically, as a diagrammatic text constructed with elemental visual signifiers (see Visual Semiotics) that are designed to indicate where a topographical object (a place, a river, a mountain, etc.) is located on terra firma, by using signifiers that resemble the features they represent in schematic or, in some cases, actual pictographic form. For example, a small tree might stand for a forest, an orchard, or a state park. But many signifiers have little resemblance to the features they represent, as when a circle stands for a city. The same sign may represent different features on different maps. For example, a triangle might represent a mobile home park on one map and an eagle’s nest on another. Such differences make it important to read the map ‘legend,’ as it is called, to find out what each sign means on a particular map. The relation of the elements to each other involves ‘scaling.’ A scale shows the mathematical relationship by which distances on a map reduce actual distances on Earth. Many maps illustrate scale by marking off distances on a straight line. Each mark shows how distance on the line corresponds to miles, kilometers, or other units of measurement on Earth. Other maps state the scale in words and figures. Such a scale might appear as 1 inch: 16 miles. In this relationship, 1 inch (2.5 cm) on the map represents a distance of 16 miles (26 km). Representative fractions are also used to show scale. These indicate the number of distance units on Earth represented by one unit on the map. In the example above, where the scale is 1 inch: 16 miles, the representative fraction would be approximately 1:1 000 000 or 1/1 000 000, because there are roughly 1 000 000 inches (1 013 760, to be exact) in 16 miles. The relationship remains the same for inches,
centimeters, miles, kilometers, or any other units of measurement. Given the obvious relevance of maps to semiotics, it is somewhat surprising to find that genuine interest on the part of semioticians in maps goes back only to 1967, with the appearance of Bertin’s Sémiologie graphique (1967). However, since then, interest has burgeoned, as has interest in the use of semiotic theory among cartographers (e.g., Casti, 2000; Foote, 1985, 1988; Hsu, 1979; Ljungsberg, 2002, 2004; Palek, 1991; Pravda, 1993, 1994; Schlichtmann, 1985, 1999a; Wood and Fels, 1986), leading to the materialization of a branch that is now called ‘cartosemiotics’ (Wolodtschenko, 1999). A general survey of cartosemiotic literature can be found in Schlichtmann (1999b).
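The representative fraction for a verbal scale such as 1 inch: 16 miles can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function name rf_denominator and the constant INCHES_PER_MILE are introduced here for the example and do not come from any cartographic library. It expresses both sides of the scale in inches and returns the denominator N of the fraction 1:N.

```python
INCHES_PER_MILE = 5_280 * 12  # 63,360 inches in one statute mile


def rf_denominator(map_inches: float, ground_miles: float) -> float:
    """Return N in the representative fraction 1:N for a verbal scale.

    Hypothetical helper for illustration: both distances are converted
    to the same unit (inches) before the ratio is taken.
    """
    if map_inches <= 0 or ground_miles <= 0:
        raise ValueError("both distances must be positive")
    return ground_miles * INCHES_PER_MILE / map_inches


if __name__ == "__main__":
    n = rf_denominator(map_inches=1, ground_miles=16)
    print(f"1 inch : 16 miles corresponds to 1:{n:,.0f}")
```

For 1 inch: 16 miles the exact denominator is 1 013 760, which is conventionally rounded to 1:1 000 000. Because the fraction is a pure ratio, the same value applies whichever unit is used on both sides.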
Historical Background

As with any socially functional text, maps tend to condition how groups perceive and interpret territories. To illustrate how a map can do this, consider the technique of cylindrical projection in Western mapmaking. Developed by the Flemish geographer Gerardus Mercator (1512–1594), it consists of wrapping a cylinder around the globe, making it touch the equator, and then projecting (1) the lines of latitude outward from the globe onto the cylinder as lines parallel to the equator, and (2) the lines of longitude outward onto the cylinder as lines parallel to the prime meridian (the line that is designated 0° longitude, passing through the original site of the Royal Greenwich Observatory in England). The resulting two-dimensional map can be made to represent the world’s surface as a two-dimensional plane figure such as a rectangle or an ellipse. Figure 2 is an example of the latter.

Figure 2 The Mercator projection.

Because of the curvature of the globe, a flat map of this kind cannot preserve true spacing: on a Mercator map the lines of latitude are drawn progressively farther apart toward the poles. This distortion makes landmasses at high latitudes appear larger than they are relative to those near the equator. Indeed, the very concept of ‘worldview’ derives from the fact that the ways in which we come to ‘view the world’ are, in part, a consequence of how that world is represented for viewing by the maps we make of it. Although modern technology now makes it easy to construct three-dimensional and, thus, nondistorting maps, traditionally the term ‘map’ has designated a two-dimensional representation of an area; three-dimensional maps are more accurately known as ‘models.’

All civilizations have developed mapmaking techniques to meet a host of social needs. In many cultures, these were elaborated and refined in tandem with the rise and growth of the mathematical sciences. Since Mercator invented the cylindrical projection method, most mapmaking techniques have been devised in accordance with the principles of Cartesian coordinate geometry. This consists, essentially, of two perpendicular number lines in a plane. Points of a geometric figure are located in the plane by assigning each point two coordinates (numbers) on the number lines x and y. The x-coordinate gives the location of the point along the horizontal number line, which corresponds in cartography to a line of latitude (so that x plays the role of longitude). The y-coordinate locates the point along the vertical number line, which corresponds to a line of longitude (so that y plays the role of latitude).
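The cylindrical projection described above is usually stated today as a pair of formulas rather than as a geometric construction. The sketch below uses the standard spherical Mercator equations, x = R(λ − λ0) and y = R ln tan(π/4 + φ/2), which are not given in the text and are supplied here as an assumption about what a reader would want to compute. The function names mercator and linear_scale_factor are hypothetical.

```python
import math


def mercator(lat_deg: float, lon_deg: float, radius: float = 1.0,
             central_meridian_deg: float = 0.0) -> tuple[float, float]:
    """Project a latitude/longitude pair with the spherical Mercator formulas.

    Illustrative only: a sphere of the given radius is assumed, and the
    poles are excluded because y grows without bound there.
    """
    if not -90.0 < lat_deg < 90.0:
        raise ValueError("latitude must lie strictly between -90 and 90 degrees")
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg - central_meridian_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def linear_scale_factor(lat_deg: float) -> float:
    """Local stretching of lengths at a given latitude (1 / cos of latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg))


if __name__ == "__main__":
    for lat in (0, 20, 40, 60, 80):
        _, y = mercator(lat, 0.0)
        print(f"latitude {lat:2d}: y = {y:.3f}, scale factor = {linear_scale_factor(lat):.2f}")
```

Running the loop shows that equal steps in latitude produce ever larger steps in y, and that the local scale factor rises away from the equator. This is the numerical counterpart of the size distortion that makes the choice of projection a matter of worldview as well as of geometry.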
By convention, longitude is marked 180° east and 180° west from 0° at Greenwich, England. Latitude is marked 90° north and 90° south from the 0° parallel of the equator. Points on a map can be accurately defined by giving degrees, minutes, and seconds for both latitude and longitude. As mentioned, distances are represented with the technique of ‘scaling,’ whereby two points on the earth are converted to two corresponding points on the map by means of a scale: for example, a scale of 1:100 000 means that one unit measured on the map (say 1 cm) represents 100 000 of the same units on the earth’s surface. The varying heights of hills and mountains, and the depths of valleys, are portrayed instead with the technique known as ‘relief.’ In earlier maps, this consisted in making small drawings of mountains and valleys on the maps. But this was extremely imprecise and thus came eventually to be supplanted by the use of ‘contour lines.’ The shapes of these lines provide accurate representations of the shapes of hills and depressions, and the lines themselves show actual elevations, so that closely spaced contour lines indicate steep slopes. Other methods of indicating elevation include the use of colors, tints, hachures (short parallel lines), and shadings. When colors are used for this purpose, a graded series of tones is selected for coloring areas of similar elevations. Shadings or hachures, neither of which show actual elevations, are more easily interpreted than contour lines and are sometimes used in conjunction with them for achieving greater fidelity in representation.
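The two conventions just described, positions given in degrees, minutes, and seconds and distances read off through a scale, can be illustrated with a short sketch. It is a minimal example under stated assumptions: the function names dms_to_decimal and ground_distance_km are hypothetical, south and west are encoded as negative values (a common convention, though not one stated in the text), and the sample coordinate is chosen only for illustration.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds plus N, S, E, or W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if degrees < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("degrees must be non-negative; minutes and seconds must be in [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    limit = 90 if hemisphere in ("N", "S") else 180  # latitude vs. longitude range
    if value > limit:
        raise ValueError(f"value exceeds {limit} degrees")
    # Assumed sign convention: south and west are negative.
    return -value if hemisphere in ("S", "W") else value


def ground_distance_km(map_cm: float, scale_denominator: float) -> float:
    """Ground distance in km for a length measured on a 1:scale_denominator map."""
    ground_cm = map_cm * scale_denominator
    return ground_cm / 100_000  # 100,000 cm in one kilometer


if __name__ == "__main__":
    print(dms_to_decimal(51, 28, 38, "N"))   # a latitude in decimal degrees
    print(ground_distance_km(1, 100_000))    # 1 cm on a 1:100 000 map
```

On a 1:100 000 map, 1 cm therefore stands for 1 km on the ground, which is the relationship given in the text.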
Figure 3 Ptolemy’s map of the world.
The first known maps were made by the Babylonians around 2300 B.C. Carved on clay tablets, they consisted largely of land surveys made for the purposes of taxation. More extensive regional maps, drawn on silk and dating from the 2nd century B.C., have been found in China. The precursor of the modern map, however, is believed to have been devised by the Greek philosopher Anaximander (ca. 611–ca. 547 B.C.). It was circular and showed the known lands of the world grouped around the Aegean Sea at the center and surrounded by the ocean. Anaximander’s map constituted one of the first attempts to think beyond the immediate territorial boundaries of a particular society – Greece – even though Anaximander located the center of the universe in the Aegean Sea. Then, around 200 B.C., the Greek geometer and geographer Eratosthenes (276?–195? B.C.) introduced the technique of parallel lines to indicate latitude and longitude, although they were not evenly and accurately spaced. Eratosthenes’s map represented the known world from present-day England in the northwest to the mouth of the Ganges River in the east and to Libya in the south. About 150 A.D., the Egyptian scholar Ptolemy (ca. 100–ca. 170 A.D.) published the first textbook in cartographic science, entitled Geographia. Even though they contained a number of errors, his were among the first maps of the world to be made with mathematical principles. At about the same time in China, mapmakers were also beginning to use mathematically accurate grids for making maps. Figure 3 is Ptolemy’s map of the world, which may have been made by Ptolemy himself or by
cartographers who rediscovered his work after it had been lost for many centuries. The next step forward in cartography came in the medieval era, when Arab seamen made highly accurate navigational charts, with lines indicating the bearings between ports. In the 15th century, influenced by the publication of Ptolemy’s maps, European mapmakers laid the foundations for the modern science of cartography. In 1507, for instance, the German cartographer Martin Waldseemüller (ca. 1470–ca. 1522) became the first to apply the name America to the newly identified trans-Atlantic lands, separating America into North and South – a cartographic tradition that continues to this day – and differentiating the Americas from Asia. In 1570, the first modern atlas – a collection of maps of the world – was put together by the Flemish cartographer Abraham Ortelius (1527–1598). The atlas, titled Theatrum Orbis Terrarum, contained 70 maps. Undoubtedly, the most important development in the 16th century came when Mercator developed the technique of cylindrical projection in 1569, as mentioned above (Crane, 2002). This allowed cartographers to portray compass directions as straight lines, at the expense, however, of the accurate representation of relative size. By the 18th century, the modern-day scientific principles of mapmaking were well established. With the rise of nationalism in the 19th century, a number of European countries conducted topographic surveys to determine political boundaries. In 1891, the International Geographical Congress proposed the political mapping of the entire world on a scale of 1:1 000 000, a task that occupied cartographers for over a century. Throughout the 20th century, advances in aerial and satellite photography, and in computer modeling of topographic surfaces, greatly enhanced the versatility, functionality, accuracy, and fidelity of mapmaking.
Today, the so-called Geographic Information System (GIS) consists of computers, computer programs, and extremely large amounts of information, which is stored as computer code and can include measurements or photographs taken from land, sea, or space. Cartographers can use GIS to produce many different maps from the stored data. These are easily stored on computer software or devices, such as CD-ROMs, which enable people to choose exactly the area they want to view, then print a map. There are now also in-vehicle navigation systems that create maps to guide drivers of moving vehicles. These systems constantly track a vehicle’s location by using signals from a group of space satellites called the Global Positioning System (GPS). A computer in the vehicle combines the position data with stored street map data and produces maps of the route to a destination. The maps
change as the vehicle moves. Some in-vehicle systems show the map on a small screen. Other systems produce spoken directions. To navigate airplanes, aeronautical charts are used. Depending on their level of certification, pilots use Visual Flight Rules (VFR) or Instrument Flight Rules (IFR) charts. VFR charts show landmarks that pilots can see as they fly, such as roads, bridges, and towns. These also show airports and indicate the heights of mountains and other obstacles. IFR charts are designed for radio navigation. These show the location of transmitters of high-frequency radio signals. Pilots use these signals to determine their position and plot their course. Some airplanes are equipped with computer systems that produce heads-up display maps. These are projected near eye level where the pilot can see them without looking down.
General Semiotic Considerations

How do we interpret a map? To say ‘I am here, but I want to get to there’ on a map involves two levels of interpretation: (1) that here and there are indexes (signs indicating location) in map space standing for points in real space, and (2) that the movement from here to there on a map stands for the corresponding movement between two points in real space through scaling. Modern mapmaking is based, as mentioned, on the principles of Cartesian geometry, which segment the map space into determinable points and calculable scaled distances. The traditional maps of North American aboriginal peoples, on the other hand, are designed to show the interconnectedness among the parts within the map space through a distortion of distance, angulation, and shape, not through segmentation and scaling. Western maps represent the world as an agglomeration of points, lines, and parts, related to each other in terms of the mathematics of the Cartesian plane; aboriginal maps represent it instead as a holistic, unsegmentable entity. The two types of mapmaking systems thus reveal different worldviews. These have had specific ‘consequences,’ such as village and city design. Cartesian mapmaking has clearly influenced the design of modern cities. Not only does the layout of the city of New York, for example, mirror a Cartesian map, but the city also names its streets largely in terms of the grid system: for example, 52nd and 4th refers to the intersection point of two perpendicular lines in the city grid. In a fundamental semiotic sense, such cities are the ‘iconic byproducts’ of the worldview that has been enshrined into groupthink by the widespread use of grid maps since the early 16th century.
As representations of terra firma, maps are also ‘intellectual codes’ and can thus be used as both navigational and exploratory models of the world. In the same way that the sciences of geometry and trigonometry have allowed human beings to solve engineering problems since ancient times, the science of cartography has allowed explorers to solve navigation and exploration problems with amazing accuracy. Exploration involves determining position and direction. Position is a point on the earth’s surface that can be identified in terms of a grid or coordinate system. Direction is the position of one point relative to another within the system. The shortest distance between two points in the plane is a straight line, and since any such line segment can be treated as the hypotenuse of a right triangle whose other sides run along the grid, its length can be determined easily by the Pythagorean theorem. In this way, maps have allowed navigators to fix points and determine distances to regions of the plane (i.e., the earth’s surface). Explorers setting out on a journey may not know what they will encounter along the way, nor will they know in advance if they will reach a land mass or a body of water. But they can still take that journey with a high degree of assurance that they will be able to predict where they are on terra firma. Exploration is ancient. According to many archaeologists and historians, it began approximately 3000 years ago in the area of the eastern Mediterranean Sea. Since then nearly every portion of the earth’s land surface has been explored and mapped. Space photography and advanced measurement technology, including a laser reflector placed on the moon, have made possible extremely precise measurements of the earth’s surfaces. Considerable work is now being carried out to investigate the vast regions that are under the seas. What is even more remarkable is that cartography has permitted us to describe the positions of heavenly bodies and to calculate their distances from Earth with accuracy.
Suffice it to say here that mapping outer space involves the use of techniques that correspond to terrestrial point fixing in terms of latitude and longitude lines. Simply put, the positions of stars relative to one another are regarded as points on a celestial map; the motion of the sun, the moon, and the planets is then indicated as a mean rate of progression across the celestial space. It is truly mind-boggling to think that with the aid of a simple representational device (the map), we have already been able to set foot on the moon and will no doubt be able to visit other places in the skies in the not-too-distant future.

As a final word on the navigational uses of maps, the recent development of computer systems that are used in advanced traffic management to improve traffic control merits some consideration here. Traffic along major highways in some cities is monitored by remote cameras, radar, or sensors in the roadway. A central computer system analyzes the information. If roads are congested, traffic flow can be improved by automatically adjusting traffic-signal timing, controlling traffic flow on freeway ramps, or providing information to drivers by means of electronic signs along the roads. Advanced traveler-information systems are also currently available in some automobiles. These are navigational systems into which drivers enter their destination. An electronic map then displays the best route on a small screen, or a synthesized voice provides directions along the route, including directions on when to turn. These systems use a transponder, or a transmitting and receiving device, in the vehicle and a satellite-based GPS to pinpoint the exact location of the vehicle along its route. When this navigation system is coupled with cellular-radio technology, it can be used to signal a central dispatcher in case of an emergency.
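The point made earlier in this section, that a distance on a planar grid follows from the Pythagorean theorem, can be made concrete with the route of Figure 1 (two blocks west and three blocks north). The sketch below is illustrative only: the function name grid_distance is hypothetical, coordinates are measured in blocks with east and north taken as positive (an assumption), and the calculation holds for a flat grid, not for the curved surface of the earth.

```python
import math


def grid_distance(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Straight-line distance between two grid points via the Pythagorean theorem.

    The segment from a to b is the hypotenuse of a right triangle whose
    legs are the differences in the x and y coordinates.
    """
    dx = b[0] - a[0]
    dy = b[1] - a[1]
    return math.hypot(dx, dy)  # sqrt(dx**2 + dy**2)


if __name__ == "__main__":
    start = (0.0, 0.0)      # location A
    end = (-2.0, 3.0)       # two blocks west, three blocks north: location B
    walked = abs(end[0] - start[0]) + abs(end[1] - start[1])
    print(f"distance along the streets: {walked:.0f} blocks")
    print(f"straight-line distance: {grid_distance(start, end):.2f} blocks")
```

The route along the streets covers five blocks, whereas the straight line from A to B is the square root of 13, a little over three and a half blocks. The difference is a simple reminder that the grid prescribes movement as well as measuring it.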
Maps as Texts

Reading maps constitutes a culture-specific form of text-interpretation. A map of New York City would probably not be interpreted as a map by a nomad from the north of Iraq or by an Inuit hunter in Nunavut. A map is identified as such because the visual signifiers that compose it (lines, colors, shapes, and so on) are understood as topographical elements. As with any text (see Texts: Semiotic Theory), understanding that these elements are part of a whole implies understanding the cartographic code – which in the case of Western maps is Cartesian in nature. Map reading is thus a culture-based text decipherment process, in which the reader constructs the meaning of the map out of the elements that have been assigned specific roles and positions on the text according to the position and relationships between them in the real world. As Wood and Fels (1986) argued, every map is a synthesis of signs and a sign in itself – an artifact of depiction of referents and an instrument of promoting worldview. Like any other kind of text, it is a product of a specific code – a set of conventions that prescribe relations of content and expression in given contexts. As noted above, some of the elements that constitute a map text include combinations of lines, shapes, and colors to denote road types, green spaces, lakes, and other water bodies, together with miscellaneous types of buildings. The relationships between these elements directly correspond to relationships between the objects and spaces in the real world. A map reflects the real world’s structure in the way that the signifiers are combined together – buildings don’t
overlap, they are not built in the middle of roads, there is no parkland in the middle of a lake, and so on. Like any text, moreover, a map is created with an audience in mind, unless it is a personal map. Thus, once the map is finished, the author relinquishes his or her rights to the interpretation of the map, and the text belongs to the audience, which ‘re-writes it,’ ‘refashions’ it, or ‘adds to’ it to suit its specific interpretive needs. If we know the audience for which the map was made, and we compare it to maps made for other audiences, the differences between the two can tell us a lot about the author’s intentions, the social situation of the audiences, and the type of power discourse that the map supports or undermines. A map is supposed to be denotative in that it must resemble the area it represents as closely as possible. Yet, as Derrida (1976) cogently argued in reference to any text, the actual social meanings of maps are constantly ‘slipping away’ from each other, constantly shifting and changing so that they can never be exactly determined. The slippage in this case is due to the fact that a map is the simplification of a complex topographical object, either drawn on paper or modeled on computer with the aid of photography (which provides perspective). The modern history of scientific cartography has, in fact, revolved around attempts to solve the slippage problem. To represent the entire surface of the earth without distortion requires a spherical globe. A flat map cannot accurately represent the earth’s rounded surface except for very small areas where the curvature is negligible. To accurately display large or medium sized parts of the earth’s surface a map must be drawn with distortions of areas, distances, and directions. Various projection techniques are used to prepare flat maps of the earth’s surface. These are classified as geometric or analytic depending on the technique used to develop them. 
Geometric projections are classified based on the type of surface on which the map is assumed to be developed (i.e., cylinders, cones, or planes), while analytical projections are developed by mathematical computation. Solving the slippage problem has been greatly assisted, needless to say, by technological innovations since World War II. Perhaps most important has been the use of remote-sensing techniques that gather information about an area from far above the ground via aerial and satellite photography. Improvements in satellite technology, computer software, and the use of satellite triangulation have substantially improved the accuracy of remote-sensing techniques and of today’s maps. The foundation of a modern map is a careful survey giving geographical locations and relations of many points in the area being mapped. Nearly all maps
developed today make use of both remote-sensing and traditional land-surveying information. Once the information is collected, the map is carefully planned with regard to its final use, so that the information can be rendered clearly and accurately. The collected surveys and remote-sensed data are then used to enter a large number of points on a grid of crossed lines corresponding to the projection chosen for the map. Elevations are determined and contour lines, roads, and rivers are drawn. Final preparation of a map for printing begins by making a series of sheets, one for each color used on the map, that are then scribed onto the surface by a sharp etching tool. Each of these sheets is then used as a negative from which a lithographic plate is made. But despite all the technological innovations, a map is still a text and, thus, subject to slippage, albeit of a different kind. Reading precisely made maps still requires knowing that the elements on them are signifiers that cohere into an overall representation of space, even if the representation has largely eliminated scale and angle distortions. A device such as a continuous-curve plotter enables a computer to draw accurate maps from the stored data. Computer-generated maps can also be displayed on a video screen, where an operator can easily make alterations in the content. Because such maps, and each incorporated change, can be stored in the computer, they are useful in furnishing an animated picture of a change over a period of time.
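The claim that a flat map of a large area must distort areas, distances, or directions can be made concrete with a small numerical sketch. The example below uses the Mercator projection purely as an illustration (the article does not single out any one projection); the function names `mercator_xy` and `linear_scale` and the unit sphere radius are assumptions introduced here, not part of the source.

```python
import math

def mercator_xy(lat_deg, lon_deg, radius=1.0):
    """Map latitude/longitude in degrees to flat Mercator coordinates."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

def linear_scale(lat_deg):
    """Local stretching of lengths relative to the equator: 1 / cos(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg))

if __name__ == "__main__":
    # The further from the equator, the more the flat map exaggerates lengths.
    for lat in (0, 30, 60, 80):
        print(f"latitude {lat:2d}: lengths stretched x{linear_scale(lat):.2f}")
```

On this projection the stretching factor grows without bound toward the poles, which is one way of seeing why only small areas, where curvature is negligible, can be drawn without noticeable distortion.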
Conclusion
The map is an important tool, not only for navigation and exploration purposes, but also for cultural identification purposes. In addition to providing a wealth of factual information, the map permits visual comparison between areas because it may be designed to indicate, by means of symbols, not only the location but also the characteristics of geographic features of an area and, thus, to give it a representation. Like any ‘memory code,’ such as a history book, this can then be stored for preservation. No wonder, then, that geographers have developed a standard pattern of map symbols for identifying such cultural features as homes, factories, churches, dams, bridges, tunnels, railways, highways, travel routes, mines, farms, and grazing lands. As map viewers, we think that we have ‘topographical reality’ laid bare before us. But, as cartosemiotics has shown (and as has been argued in this article), understanding maps involves a process of text decipherment on the part of the reader – even if the ‘reader’ is a computer (which still has to be programmed by a human being). Studying maps from a
semiotic point of view leads to a complex picture of the possibilities and the limitations that the map offers (Foote, 1985, 1988). Technologically made maps belong to a contemporary code of mapmaking that involves the use of informatics. But informatics itself is a code of its own. In effect, the lesson to be learned from studying maps semiotically is that no matter how accurate we try to make our scientific texts, they are inevitably subject to human interpretation in psychological, historical, and cultural terms. See also: Iconicity; Indexicality: Theory; Sapir, Edward (1884–1939); Texts: Semiotic Theory; Visual Semiotics; Whorf, Benjamin Lee (1897–1941).
Bibliography
Bertin J (1967). Sémiologie graphique. The Hague: Mouton.
Casti E (2000). Reality as representation: the semiotics of cartography and the generation of meaning. Bergamo: Bergamo University Press.
Crane N (2002). Mercator: the man who mapped the planet. New York: Weidenfeld & Nicolson.
Derrida J (1976). Of grammatology. Spivak G C (trans.). Baltimore: Johns Hopkins Press.
Foote K E (1985). ‘Space, territory, and landscape: the borderlands of geography and semiotics.’ Semiotic Inquiry 5, 159–174.
Foote K E (1988). ‘Object as memory: the material foundations of human semiosis.’ Semiotica 69, 243–268.
Hsu M-L (1979). ‘The cartographer’s conceptual process and thematic symbolization.’ The American Cartographer 6, 117–127.
Ljungberg C (2002). ‘City maps: the cartosemiotic connection.’ In Simpkins S & Deely J (eds.) Semiotics 2001. 193–205.
Ljungberg C (2004). ‘Logical aspects of maps.’ Semiotica 148, 413–437.
Palek B (1991). ‘Semiotics and cartography.’ In Sebeok T A & Umiker-Sebeok J (eds.) Recent developments in theory and history. Berlin: Mouton de Gruyter. 465–491.
Pravda J (1993). ‘Map language.’ Cartographica 30, 12–14.
Pravda J (ed.) (1994). Cartographic thinking and map semiotics. Special issue of Geographia Slovaca (5). Bratislava: Slovenská Akademia vied Geograficky Ústav.
Robinson A H & Petchenik B B (1976). The nature of maps. Chicago: University of Chicago Press.
Schlichtmann H (1985). ‘Characteristic traits of the semiotic system: map symbolism.’ The Cartographic Journal 22, 23–30.
Schlichtmann H (1999a). ‘Map symbolism revisited: units, order and contexts.’ Geographia Slovaca 5, 47–62.
Schlichtmann H (ed.) (1999b). Map semiotics around the world. International Cartographic Association.
Turnbull D (1989). Maps are territories. Chicago: University of Chicago Press.
Wolodtschenko A (1999). ‘Cartosemiotics: component of theoretical cartography.’ Geographia Slovaca 5, 63–85.
Wood D & Fels J (1986). ‘Design on signs: myth and meaning in maps.’ Cartographica 23, 54–103.
Case
B J Blake, LaTrobe University, Bundoora, Victoria, Australia
© 2006 Elsevier Ltd. All rights reserved.
Case Marking
Case is essentially a system of marking dependent nouns for the type of relationship they bear to their heads. Traditionally, the term refers to inflectional marking, and, typically, case marks the relationship of a noun to a verb at the clause level or of a noun to a preposition, postposition, or another noun at the phrase level. Straightforward examples of case systems can be found in the Dravidian languages. Table 1 shows the set of case forms for the noun makan ‘son’ in Malayalam. The nominative is the citation form and is used for the subject of a clause. The accusative is used for the
direct object and the dative for the indirect object (the recipient of a verb of giving). The genitive expresses the possessor (makanre peena ‘son’s pen’) and the sociative (alternatively comitative) expresses the notion of ‘being in the company of’. The locative expresses location, and the instrumental expresses the instrument, as in ‘cut with a knife’ and the agent of the passive. The ablative expresses ‘from’. It is built
on the locative, and some linguists would take a form like makanilninne to consist of the locative plus the postposition ninne.

Table 1 Malayalam case system
Nominative     makan
Accusative     makane
Dative         makanne
Genitive       makanre
Sociative      makanoo
Locative       …
Instrumental   …
Ablative       …

Malayalam and the other Dravidian languages provide good examples of case systems since these languages are agglutinative and the case marking (-e accusative, -il locative, etc.) can easily be isolated. Moreover, the case marking is consistent across singular and plural. Plural is marked as a first-order suffix between the stem and the case marking, as in the following:
(1) Kappal  tiramaala-kaḷ-e  bheedicu.
    ship    wave-PL-ACC      split-PT
    ‘The ship broke through the waves.’
In Malayalam, the accusative case is generally used for the direct object only for human nouns, the nominative being used for other nouns. However, where both subject and object are inanimate as in (1), the accusative is used. Case systems are a feature of conservative Indo-European languages such as Russian and Greek, and much of our framework for describing case comes from the study of the classical languages Ancient Greek, Latin, and Sanskrit. However, there is a complication in these languages in that number marking and case marking are never separate. This means separate paradigms for the two number categories of singular and plural. Moreover, there are different case/number forms for different stem classes. Traditionally five such classes are recognized, and there are also variations within the classes. Three of these classes, or declensions as they are usually referred to, are illustrated in Table 2: the first declension (ā-stems), second declension (o-stems), and third declension (consonant stems and i-stems). The designations ā-stems, o-stems, and so forth are not synchronically transparent and reflect the product of historical reconstruction. In Latin, there is also a three-way gender distinction: masculine, feminine, and neuter. With a few exceptions male creatures are masculine and females feminine, but inanimates are scattered over all three genders (though almost all neuter nouns are inanimate). There is a partial association of form and gender in that ā-stems are almost all feminine and o-stems mostly masculine (except for a subclass of neuters, represented by bellum in Table 2). This means that there can be fusion of gender, number, and case. The point is illustrated in Table 2, where we have domina ‘mistress (of a household)’ illustrating feminine ā-stems and dominus ‘master (of a household)’, which is based on the same root, representing masculine o-stems. As can be seen from Table 2, the word form domina simultaneously represents nominative case, feminine gender, and singular number; dominum represents accusative case, masculine gender, and singular number; and similarly with other word forms. In Latin, adjectives decline like nouns, and there is concord between a noun and an attributive or predicative adjective. This concord is sensitive to case and number, and those adjectives that belong to the first and second declension are sensitive to gender. So we find domina bona ‘good mistress’ and agricola bonus ‘good farmer’, where agricola is one of the few nouns of masculine gender in the first declension. With adjectives of the first and second declensions, the inflections simultaneously represent case, number, and gender without exception.

Table 2 Latin case paradigms
Six cases are recognized: nominative, vocative, accusative, genitive, dative, and ablative; however, no paradigm exhibits six different forms. In the traditional descriptions, a case is established wherever there is a distinction for any single class of nominals, since this facilitates the description of the functions. The vocative, the case used in forms of address, has a distinctive form only in the singular of the second declension. Elsewhere there is a common form for the nominative and vocative; however, distinct nominative and vocative cases are recognized for all paradigms. The utility of this approach can be seen in a phrase such as O sōl laudande ‘Oh, praiseworthy sun’. Here the adjective has a distinctive vocative form (masculine, singular) whereas the noun sōl does not, yet we can still say there is concord.
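The contrast drawn above between separable (agglutinative) marking in Malayalam and fused case/number marking in Latin can be sketched in a few lines of code. This is only an illustration under stated assumptions: the Malayalam suffixes -e and -il and the plural suffix of example (1) come from the text (the plural is written here as -kaḷ, an assumed transliteration), while the four Latin endings are the standard first-declension endings supplied from general knowledge. The function and table names are hypothetical.

```python
# Agglutinative pattern: number and case are separate, stackable suffixes.
MALAYALAM_PLURAL = "kaḷ"                                # assumed transliteration of the plural suffix
MALAYALAM_CASE = {"NOM": "", "ACC": "e", "LOC": "il"}   # suffixes named in the text

def malayalam_form(stem, case, plural=False):
    """Build stem + (plural) + case; each slot can be isolated."""
    return stem + (MALAYALAM_PLURAL if plural else "") + MALAYALAM_CASE[case]

# Fusional pattern: one ending expresses case and number at once,
# so the ending must be looked up for the combination.
LATIN_A_STEM_ENDINGS = {                                # standard endings, not taken from Table 2
    ("NOM", "SG"): "a",
    ("ACC", "SG"): "am",
    ("NOM", "PL"): "ae",
    ("ACC", "PL"): "ās",
}

def latin_form(root, case, number):
    """Build root + fused case/number ending for a first-declension noun."""
    return root + LATIN_A_STEM_ENDINGS[(case, number)]

if __name__ == "__main__":
    print(malayalam_form("makan", "ACC"))                   # singular accusative
    print(malayalam_form("tiramaala", "ACC", plural=True))  # plural accusative, as in (1)
    print(latin_form("domin", "ACC", "PL"))                 # no separable plural marker
```

The design difference is the point: the Malayalam function concatenates independent pieces, whereas the Latin function cannot do without a table keyed on the whole combination of categories.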
Types of Case
The term case is from Latin cāsus, which is in turn a translation of the Greek ptōsis ‘fall’. The nominative was considered the basic form of nominals, and the other cases ‘fell away’ from this form and were referred to as the oblique cases. In some languages there is a formal difference between the nominative and the oblique, inasmuch as the oblique cases are built on a common oblique stem. In Malayalam, for instance, some nouns such as maram ‘tree’ have an oblique stem maratt- so the accusative is maratte, locative marattil, and so on (cf. Table 1). In the description of cases, a distinction is often made between syntactic or grammatical cases and semantic cases. The nominative and accusative are grammatical cases in that they encode the grammatical relations of subject and object respectively, whereas a case such as the locative in Malayalam is semantic in that it expresses a specific semantic role, namely, the notion of position. It can also be said that nominative and accusative function to distinguish arguments of a predicate, whereas a semantic case has content and is a predicate. A further distinction is sometimes made among the semantic cases between local and nonlocal cases. Local cases are those referring to place such as locative (‘at’), allative (‘to’), and ablative (‘from’). However, the distinction between grammatical and semantic case is blurred somewhat by the fact that a primarily grammatical case can have a semantic function and a semantic case can have a grammatical function. In Latin the accusative, a grammatical case, is used to express destination, as in Vādō Rōmam ‘I go to Rome’, and extent or duration, as in XXV annōs ‘for 25 years’, and the ablative, a semantic case, is used for the logical subject in the passive (occīsus ā cōnsule ‘killed by the consul’).
The genitive case is distinct from the others in that it is typically adnominal, marking the dependent of a nominal, whereas the other cases typically mark the dependents of verbs. The genitive has a semantic function, namely that of expressing possession, as in the Latin phrase cōnsulis equus ‘the consul’s horse’, but it can have a variety of other functions. In Latin its grammatical character can be seen particularly in a phrase such as amor patris. This phrase is ambiguous: it can mean either the love felt by a father or the love directed towards the father. The former is called the subjective genitive since pater corresponds to the subject in the verbal expression: Pater amat ‘Father loves’. The latter is called the objective genitive since pater corresponds to the object in Amat patrem ‘He or she loves father’. The dative case takes its name from the Latin verb dare ‘to give’ since it expresses the recipient of verbs of giving. In Latin the dative also expresses the complement of a handful of two-place verbs such as fidere ‘to trust’ and parēre ‘to obey’. It additionally expresses the experiencer of a few verbs such as placēre ‘to please’: Dominō non placet bellum ‘War is not pleasing to the master’. A similar range of functions can be found in a number of languages. However, in a few languages, and regularly across the Indian subcontinent, the dative encodes the actor in certain aspects. In Malayalam, for instance, the dative in conjunction with the potential marker aam on the verb signals physical ability or permission.
(2) Avalkku  avane    itikkaam.
    her.DAT  him.ACC  can/may.hit
    ‘She can hit him.’
The Latin ablative also illustrates a further complication. Although nominally a semantic case expressing ‘from’, it represents a syncretism of an ablative, locative, and instrumental, which were distinguished in earlier stages of the language, and therefore it expresses a variety of semantic roles. A more useful distinction than between syntactic and semantic cases is between core and peripheral cases, where the core refers to subject and object.
Types of Case System
Accusative system
The most common system of core cases is one that opposes nominative for subject and accusative for object. This system is found in various language families, including Indo-European, Uralic, Turkic, Mongolian, Tungusic, and Dravidian (see (1) above), as well as in Korean and Japanese (see (15) below).
Ergative system
A sizable minority of languages have one case for the agent of a transitive verb (A) and another for the subject of an intransitive predicate (S) and direct object (O). The former is called the ergative case, and the latter is called either nominative or absolutive. The latter label is also used for a grammatical relation covering S and O. The absolutive is usually unmarked, as in the following illustration from Yalarnnga (Australian).
(3) Yirri    tjala     ngani-mi  yimarta-ta.
    man.ABS  this.ABS  go-FU     fish-PURP
    ‘The man will go for fish.’
(4) Kilawurru  tjala     yirri-nthu  wala-mu  payarla-yu.
    galah.ABS  this.ABS  man-ERG     hit-PT   boomerang-ERG
    ‘A man killed the galah with a boomerang.’
Note that in (4) the ergative also encodes the role of instrument. It is very common for ergatives to have non-core functions, another example of grammatical cases having semantic functions. Ergative systems are often considered rare and remote, but in fact they are found in at least 20% of the world’s languages. Ergative systems are to be found in Basque, all families of the Caucasian phylum, in the Tibeto-Burman languages, in Austronesian, in most Australian languages, in some languages of the Papuan families, in Eskimo-Aleut, in Tsimshian and Chinook in North America, in the Mayan languages of Central America and several families of South America (Dixon, 1994:5), and in Hurrian and several other languages of the ancient Near East.
Split-intransitive system
Some languages, perhaps no more than a few score, organize their core grammar so that the argument of some one-place predicates is marked like the A of a two-place verb, while the argument of the other one-place predicates is marked like the O of a two-place verb. Such languages have been called split-intransitive languages or split-S languages (Dixon, 1994:70ff). Examples can be found in the Kartvelian (South Caucasian) languages of the Caucasus. The following sentences are from Laz. Note that the suffix -k, glossed ergative on the basis of its appearance on A in transitive clauses like (7), also appears on the ‘agent’ of the intransitive verb in (5). On the other hand, the subject of the intransitive verb in (6) is unmarked like the O of (7).
(5) Bere-k     imgars.
    child-ERG  3SG.cry
    ‘The child cries.’
(6) Bere       oxori-s    doskidu.
    child.ABS  house-DAT  3SG.stay
    ‘The child stayed in the house.’
(7) Baba-k      meçaps            skiri-s    cxeni.
    father-ERG  3SG.give.3SG.3SG  child-DAT  horse.ABS
    ‘The father gives a horse to his child.’
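The three core-marking patterns described so far (accusative, ergative, and split-intransitive) differ only in how they group S, A, and O. A minimal sketch of that grouping follows, assuming a hypothetical function `core_case` and a simplified boolean `agentive` flag standing in for whatever lexical or semantic property splits the intransitive verbs of a given language.

```python
def core_case(system, role, agentive=None):
    """Return the case label a core argument receives.

    system:   'accusative', 'ergative', or 'split-intransitive'
    role:     'S' (intransitive subject), 'A' (transitive agent), 'O' (object)
    agentive: only for S in a split-intransitive system; True if S patterns with A
    """
    if role not in ("S", "A", "O"):
        raise ValueError(f"unknown role: {role}")
    if system == "accusative":
        # S and A share the nominative; O stands apart.
        return "ACC" if role == "O" else "NOM"
    if system == "ergative":
        # S and O share the absolutive; A stands apart.
        return "ERG" if role == "A" else "ABS"
    if system == "split-intransitive":
        if role == "S":
            if agentive is None:
                raise ValueError("split-intransitive S needs the agentive flag")
            return "ERG" if agentive else "ABS"
        return "ERG" if role == "A" else "ABS"
    raise ValueError(f"unknown system: {system}")

if __name__ == "__main__":
    # Laz (5): the crying child patterns with A; Laz (6): the staying child patterns with O.
    print(core_case("split-intransitive", "S", agentive=True))
    print(core_case("split-intransitive", "S", agentive=False))
```

The labels ERG and ABS for the split-intransitive rows simply follow the glossing used in the Laz examples; other descriptions use different labels for the same pattern.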
This pattern also occurs in Georgian, but it applies only to certain classes of verbs in the aorist tense group. In the present tense, all subjects are in the nominative case and the direct objects in the dative. The split-intransitive (or active) system is also found in the Americas, where it usually shows up in the bound pronouns on the verb. It has been reported from Guaraní (Andean); Lakhota (Lakota) and other Siouan languages; the Pomoan (Pomo) languages; Caddo, Arikara, and other Caddoan languages; and Mohawk, Seneca, and other Iroquoian languages. It also occurs in Acehnese (Austronesian).
Direct-inverse system
Another system of marking the core relations is the direct-inverse system. In this system, which is characteristic of the Algonquian languages, the marking on the verb indicates whether an activity is direct or inverse. If the action proceeds from first or second person to third it is direct, but if it proceeds from third to first or second it is inverse. In transitive clauses with two third-person participants, the direct and inverse markers distinguish whether a more topical participant (proximate) is A, which gives a direct combination, or a less topical participant (obviative) is A, which gives an inverse combination. A ‘more topical participant’ will be chosen on the basis of discourse principles and will tend to be the last-mentioned person or the discourse topic. The ‘less topical person’ is marked by the obviative suffix -wa.
(8) a. Nāpēw      atim-wa   wāpam-ē-w.
       man.PROX   dog-OBV   see-DIRECT-3SG
       ‘The man saw the dog.’
    b. Nāpēw-(w)a  atim      wāpam-ik.
       man-OBV     dog.PROX  see-INVERSE.3SG
       ‘The man saw the dog.’
(9) a. Atim      nāpēw-(w)a  wāpam-ē-w.
       dog.PROX  man-OBV     see-DIRECT-3SG
       ‘The dog saw the man.’
    b. Atim-wa   nāpēw      wāpam-ik.
       dog-OBV   man.PROX   see-INVERSE.3SG
       ‘The dog saw the man.’
As can be seen, there are two ways of expressing the same propositional content according to which participant is chosen as topic.
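The choice between direct and inverse marking described in this section is essentially a small decision procedure, which can be sketched as follows. The participant labels ("1", "2", "3PROX", "3OBV") and the function name `direction` are conventions invented for this sketch; combinations the description does not cover (for example first person acting on second) are deliberately rejected rather than guessed.

```python
LOCAL = {"1", "2"}               # first and second person
THIRD = {"3PROX", "3OBV"}        # proximate and obviative third persons

def direction(agent, patient):
    """Return 'DIRECT' or 'INVERSE' for a transitive clause."""
    for p in (agent, patient):
        if p not in LOCAL | THIRD:
            raise ValueError(f"unknown participant label: {p}")
    if agent in LOCAL and patient in THIRD:
        return "DIRECT"          # action proceeds from 1st/2nd person to 3rd
    if agent in THIRD and patient in LOCAL:
        return "INVERSE"         # action proceeds from 3rd person to 1st/2nd
    if agent == "3PROX" and patient == "3OBV":
        return "DIRECT"          # the more topical participant is A, as in (8a) and (9a)
    if agent == "3OBV" and patient == "3PROX":
        return "INVERSE"         # the less topical participant is A, as in (8b) and (9b)
    raise ValueError("combination not covered by the description above")

if __name__ == "__main__":
    # (8a): man is proximate and A -> direct; (8b): man is obviative and A -> inverse.
    print(direction("3PROX", "3OBV"))
    print(direction("3OBV", "3PROX"))
```

Note that the same event ('the man saw the dog') yields either output depending only on which participant the speaker treats as topic, which is the point of the paired examples.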
Factors Affecting Marking
It is not common to find languages in which the nominative-accusative or absolutive-ergative distinction holds for all nominals in every context. The ‘older’ Indo-European languages such as Latin might be thought to provide good examples of a nominative-accusative distinction, but this distinction is neutralized for nouns of the neuter gender. This is illustrated by the word bellum ‘war’ in Table 2. Almost all neuter nouns in Latin are inanimate, though there are numerous inanimates that have masculine or feminine gender. Even in languages like Japanese and Korean, in which the subject and object are marked by postpositions (see (15) below), there is a tendency to drop the postpositions in colloquial speech; the subject marker is more likely to be lost if the subject is pronominal or human, and the object marker is more likely to be lost if the object is inanimate or indefinite. In fact, there is a very strong cross-language tendency for object marking to be confined to objects that are pronominal, human, or definite. In Spanish, for instance, subject and object are distinguished in the clitic pronouns, but with free nominals the preposition a, which otherwise means ‘to’, is used with specific human objects: Busco un empleado ‘I’m looking for an employee’ (anyone will do), Busco a un empleado ‘I’m looking for an employee’ (the one who was here a minute ago). A large number of languages mix accusative and ergative marking, but the two types of marking tend to complement one another. If we take the hierarchy in Table 3, we can say that where accusative and ergative marking co-occur, accusative marking covers a continuous segment of the hierarchy from the top and ergative from the bottom. In the Pama-Nyungan languages of Australia, for instance, ergative marking is found on all nouns. It may extend to third-person pronouns, and in a few languages it covers all nominals. Accusative marking in these languages is generally found on all pronouns. In some, it extends to cover kin terms and personal names, or all humans as well. The principle that seems to underlie these restrictions on marking reflects a view that the most natural transitive predication is one with an agent at the top of the hierarchy and a patient at the bottom. Marking tends to be confined to deviations from this ideal; that is, pronominal objects are marked with the accusative, and nouns, as opposed to pronouns, with the ergative.

Table 3 Nominal hierarchy
1st person
2nd person
3rd person
kin terms and personal names
human
animate
inanimate

In some languages there is a split between accusative and ergative on tense or aspect lines. A number of Indo-Aryan languages, including Hindi-Urdu (Literary Hindi), Marathi, and Punjabi (Panjabi), and some Iranian languages, such as Pashto and Kurdish, are described as having an ergative construction only in the perfect. Typical Indo-Aryan languages are described as having a direct/oblique case system where the direct case encodes S, A, and O and the oblique is governed by postpositions. However, if O is animate and specific, it is usually marked by a postposition. There is also subject-verb agreement, as in the following examples from Marathi:
(10) Ti   keeḷ    khaa-t-e.
     she  banana  eat-PRES-3SG.F
     ‘She eats a banana.’
(11) Ti   Ravi  laa  chaḷ-ḷ-a.
     she  Ravi  ACC  torture-PRES-3SG.F
     ‘She tortures Ravi.’
In the perfect, however, A is marked by a postposition. The verb agreement is with O unless O is marked by the postposition for specific, animate nouns, with the verb then remaining in its neutral form. In Marathi, the ergative postposition is ni.
(12) Ti   ni   keḷi       …
     she  ERG  banana.PL  …
     ‘She ate bananas.’
(13) Ti   ni   Ravi  laa  …
     she  ERG  Ravi  ACC  …
     ‘She tortured Ravi.’
The postposition laa, glossed as ACCusative, marks indirect as well as direct objects.
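The hierarchy-based split described earlier in this section (accusative marking covering a continuous stretch from the top of Table 3, ergative marking a continuous stretch from the bottom) can be expressed as two cut-off points on an ordered list. The sketch below is a simplification under stated assumptions: the function `marking` and its two cut-off parameters are hypothetical, and the settings used in the usage lines are an illustrative configuration loosely modeled on the Pama-Nyungan pattern mentioned above, not data for any particular language.

```python
# Table 3, from the top of the hierarchy to the bottom.
HIERARCHY = [
    "1st person",
    "2nd person",
    "3rd person",
    "kin terms and personal names",
    "human",
    "animate",
    "inanimate",
]

def marking(nominal_class, role, acc_down_to, erg_up_to):
    """Return 'ACC', 'ERG', or 'unmarked' for a nominal in role 'A' or 'O'.

    acc_down_to: lowest class that still takes accusative marking as O
    erg_up_to:   highest class that still takes ergative marking as A
    """
    rank = HIERARCHY.index(nominal_class)
    if role == "O" and rank <= HIERARCHY.index(acc_down_to):
        return "ACC"     # accusative covers a segment from the top
    if role == "A" and rank >= HIERARCHY.index(erg_up_to):
        return "ERG"     # ergative covers a segment from the bottom
    return "unmarked"

if __name__ == "__main__":
    # Illustrative setting: accusative on all pronouns, ergative on all nouns.
    settings = dict(acc_down_to="3rd person", erg_up_to="kin terms and personal names")
    for cls in ("1st person", "human", "inanimate"):
        print(cls, marking(cls, "A", **settings), marking(cls, "O", **settings))
```

With these cut-offs a pronoun is marked only when it is an object and a noun only when it is a transitive agent, which mirrors the idea that marking is reserved for departures from the most natural transitive predication.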
Size of Case Systems
Some languages have no case system at all, and this is possible since there are alternative mechanisms. For the core relations, one alternative is to use word order. Subject and object are distinguished by the use of the unmarked word order subject-verb-object in a number of languages, including Thai, Cambodian (Central Khmer), and Vietnamese. This is also true of English, but English has a vestigial two-way case system marked on most personal pronouns (I/me, she/her, he/him, we/us, and they/them but no distinction with you and it). The other alternative for marking the core relations is to use some form of cross-referencing pronominal representation, usually on the verb or auxiliary
verb. The following example is from the northern Australian language Gunwinygu (Gunwinggu), in which the first person singular is represented as a prefix nga- on the verb and the third person plural by -di-. The -n- indicates that the first person is the object.
(14) Daluk  ngaye  nga-n-di-ma-ng.
     woman  me     1SG-OBJ-3PL-get-NONPAST
     ‘The women will get me.’
In this example, the subject and object are represented independently of the verb, but they can be omitted, leaving a sentence meaning, ‘They will get me.’ The common situation in languages with cross-referencing bound pronouns is for the free pronouns to be used only for emphasis. Where the function of bound pronominal markers is indicated by a change of form (affix or suppletion) as in Gunwinygu, the system seems to be case-like, and such systems certainly derive from the use of case marking. But there is an important difference. Bound pronominal systems represent grammatical relations, each set being in a one-for-one relationship with a grammatical relation. Cases are not always in a one-for-one correspondence with grammatical relations. If that were the situation, we would deal only in relations and the word forms or markers that express these relations. If, for example, the nominative forms in Latin expressed only subject, the accusative only direct object, the dative only indirect object, then we would talk of subject forms, direct object forms, and indirect object forms. There would be no need for the notion of case, just as there is no need for notional categories between tense markers and tense categories or aspect markers and aspect categories. For the peripheral functions, the common alternative is the use of adpositions. Case systems, like other systems of grammatical forms, are normally relatively small. They range from two as in the Northwest Caucasian language Kabardian to 15 as in Finnish, but the number of relations a dependent noun can bear to a head will exceed these limits, mainly in the area of relative position (‘under’, ‘over’, ‘behind’, ‘between’, etc.). As a result, almost all languages use adpositions, whether they have peripheral cases or not. Where adpositions are used in addition to case markers, they form a kind of secondary system. 
However, in some languages, such as Japanese, postpositions are used to the exclusion of case affixes, even covering core functions. In the following Japanese example, the postposition ga marks the subject, ni marks the indirect object, and o marks the direct object:
(15) Sensei ga Tasaku ni hon o yat-ta.
     teacher SUBJ Tasaku IO book DO give-PAST
     ‘The teacher gave Tasaku a book.’
Adpositions can be considered to be analytic case markers as opposed to synthetic case markers. In Latin, which is fairly typical of languages having analytic as well as synthetic case markers, prepositions are like verbs in that they govern cases, and combinations of preposition and case suffix can serve to mark the relations of nouns to the verb. In English, all prepositions govern the accusative (with me, from her), but in some languages, different prepositions govern different cases. In Latin, some prepositions govern the ablative and others the accusative. The preposition in can govern both: in casā ‘in the cottage’, in casam ‘into the cottage’. In Malayalam, too, different postpositions govern different cases. Mutal ‘from’ governs the nominative in mala mutal ‘from the mountain’, poole ‘like’ governs the accusative as in ammaye poole ‘be like mother’, and kuu
Table 4 Finnish local cases
by counting combinations of orientation markers and case markers as members of one system. This is justified in Finnish where the markers cannot be identified consistently, but not, for the most part, in the Northeast Caucasian languages.
Types of Marking Case marking is usually via suffixation. The only other mechanism that is at all common is suppletion, as with English pronouns (I/me, etc.). Case suffixes follow number marking; when pronominal possessors are marked on the noun, these usually appear before the case marking, as in Turkish, where they appear between the number marking and the case marking: adam-lar-ım-la (man-PL-1SG.POSS-COM) ‘with my men’; similarly in Hungarian: hajó-i-m-on (ship-PL-1SG.POSS-LOC) ‘on my ships’. In the Balto-Finnic languages, however, the possessor marking usually follows the case marking. In Finnish, for instance, we find: kirkolla-mme (church-ADESSIVE-1PL.POSS) ‘at our church’. The adessive case expresses the sense of ‘near’ or ‘at’. In some languages, including Indo-European case languages like Latin and Ancient Greek, case marking (actually case/number marking in these languages, as explained above) appears not only on nouns but also on certain dependents of the noun such as adjectives and determiners. The following example is from Plato. Bios is a nominative singular form of a second declension masculine noun, the nominative indicating that bios is the subject of the predicate. The definite article and the adjective are in the nominative singular masculine form, their concord in case, number, and gender indicating that they are dependents of bios.
(16) Ho anexetastos bios ou biōtos anthrōpō.
     the.NOM.SG unexamined.NOM.SG life.NOM.SG not livable.NOM.SG man.DAT.SG
     ‘The unexamined life is not livable for man.’
This example also illustrates concord between a predicative adjective (biōtos) and the subject (bios).
Besides concord within the noun phrase and concord exhibited by predicative nominals and adjectives, there is also apparent concord between what look like separated parts of a noun phrase. In Latin, it is possible to take a word that would appear to modify a noun and express it in a phrase separate from the noun. The following example is from Virgil (Aeneid II:3):
(17) Infāndum, rēgīna, iubēs renovāre dolōrem.
     unspeakable.ACC queen.VOC order.2SG renew.INF sorrow.ACC
     ‘Unspeakable, [O] queen, [is] the sorrow you order [me] to rekindle.’
Here the gerundive adjective infāndum is displaced from the word it might be thought to modify, namely, dolōrem. In Australian languages, the noncontiguous expression of words that would normally appear within a single noun phrase in most languages is commonplace. The other common pattern of case marking is for it to be found only on the final word in the noun phrase. We can distinguish two subtypes. In the first, the final word is the head noun in the noun phrase. This type is widespread. It is found, for instance, in Quechua. It is common among the Papuan languages, and there is a concentration of the type in Asia, including the Turkic, Mongolian, and Tungusic (Tungus) families north of the Himalayas, as well as the languages of the subcontinent, whether they be Dravidian, Munda, or Indo-Aryan (though a number of Indo-Aryan languages have vestigial concord). In this area, most languages are consistent modifier-head languages with SOV order at clause level and determiner-noun, adjective-noun order at phrase level. The following example is from the Dravidian language Kannada:
(18) Naanu ellaa maanava janaangavannu priitisutteene.
     I.NOM all human community.ACC love.1SG
     ‘I love all mankind.’
In the other subtype, the final word in the noun phrase is not always the head. This is the situation in various Australian and Amazonian languages. The phenomenon also occurs in Basque:
(19) etxe zaharr-etan
     house old-PL.LOC
     ‘in old houses’
In a few languages, a nominal can carry more than one case, each case with different scope, and often with different functions. Most but not all instances involve an inner layer of adnominal case plus an outer layer of concordial adverbal case, as in the following examples from Old Georgian:
However, some Australian languages evince double adverbal case. In Warlpiri, for instance, a locally marked adjunct may take ergative case marking in a transitive clause. Consider the contrast between the following sentences. In (21a), the noun phrase in the role of destination is marked for allative case, as one would expect. However, if a verb for ‘carry’ is substituted for a verb meaning ‘send’, then it is possible to further mark the allative-marked phrase for ergative:
(21a) Ngarrka-ngku ka maliki ngurra-kurra yilya-mi.
      man-ERG PRES dog.ABS camp-ALL send-NONPST
      ‘The man is sending the dog to the camp.’
(21b) Ngarrka-ngku ka kuyu ka-nyi ngurra-kurra(-rlu).
      man-ERG PRES meat.ABS carry-NONPST camp-ALL(-ERG)
      ‘The man is carrying the meat to the camp.’
An inner local phrase normally has the patient as its scope (and this is normally encoded in the absolutive relation, i.e., as S or O). In Warlpiri, the use of the ergative on the locally marked phrase is to indicate that the agent (A) is also within its scope. With carrying, the agent moves to the same destination as the patient. Where a clause rather than a noun phrase is a dependent, the same possibilities for the distribution of case marking arise. In most instances the case marker appears only on the head of the clause, namely the verb, as in the following example from Finnish, where the translative case is found on the infinitive. The translative means ‘into’, mainly metaphorically, as in ‘You’ll turn into a pumpkin’, and purpose as in mi-ksi ‘what for’. With a nominalised verb, it indicates purpose. The actor of the nominalised verb is expressed by the possessive pronominal suffix.
(22) Osti-n karttakirja-n suunnitella-kse-ni automatka-n.
     bought-1SG atlas-ACC plan-TRANS-1SG.POSS car.trip-ACC
     ‘I bought an atlas in order to plan a car journey.’
Another possibility is for the case marking of a dependent verb to spread to its dependents by concord. The following example is from Yukulta (Ganggalida) (Northern Australian). Note that the dative, which is appropriate to the verb warratj-, spreads to the allative-marked complement to yield a second layer of case marking.
(23) Taamitya-ngandi tangka natha-rul-ngkurlu warratj-urlu.
     ask-1SG.3SG.FU.AUX man.ABS camp-ALL-DAT go-DAT
     ‘I’ll ask the man to go to the camp.’
In some languages there are different principles of case marking operating in subordinate clauses, particularly if the verb is non-finite. A well-known example is the use of the accusative for the subject as well as the object in Ancient Greek and Latin ‘accusative and infinitive constructions’. The following example is from Latin, where dominum, the subject of the verb vīdisse, is in the accusative, as is the object cōnsulem.
(24) Dicunt dominum cōnsulem vīdisse.
     say.3PL master.ACC consul.ACC see.PERF.INFIN
     ‘They say the master has seen [lit. ‘to have seen’] the consul.’
In Thalantji (Pama-Nyungan), the dative, which is the main adnominal case in this language, is used to mark the complement of transitive verbs in non-finite relative clauses. Contrast the accusative on kanyara in the main clause and the dative on murla in the subordinate clause in (25):
(25) Ngatha nhaku-nha kanyara-nha murla-ku warni-lkitha.
     I see-PAST man-ACC meat-DAT cut-REL.DS
     ‘I saw the man (who was) cutting meat.’
The verbal suffix -lkitha in this example, glossed ‘relative, different subject’, marks a qualifying clause, the covert subject of which must be interpreted as being distinct from the main clause subject. The marker is a case marker in origin. This is an example of a derived function of case marking. In Turkish, the genitive is used to mark the subject of a nominalised verb. The object of such a verb, if present, takes the normal case marking.
(26) Ahmed-in ben-i sev-diğ-in-i bil-iyor-um.
     Ahmed-GEN 1SG-ACC love-NM-3SG.POSS-ACC know-PRES-1SG
     ‘I know that Ahmed loves me.’
The form -in is a third singular possessive form in cross-reference with Ahmed-in. In Turkish, noun possessors are cross-referenced on possessed nouns: Biz-im heykel-imiz (we-GEN statue-1PL.POSS) ‘our statue’. The accusative on the nominalised verb marks it as the complement of biliyorum, and the accusative on ben marks it as the complement of sevmek ‘to love’.
See also: Ergativity; Head/Dependent Marking; Inflection and Derivation.
Bibliography
Blake B J (2001). Case (2nd edn.). Cambridge: Cambridge University Press.
Brecht R D & Levine J S (eds.) (1986). Case in Slavic. Columbus, OH: Slavica Publishers.
Comrie B (1986). ‘On delimiting cases.’ In Brecht R D & Levine J S (eds.) 86–105.
Comrie B (ed.) (1987). The world’s major languages. London: Croom Helm.
Comrie B (1989). Language universals and linguistic typology. Oxford: Blackwell.
Delancey S (1981). ‘An interpretation of split ergativity and related patterns.’ Language 57, 626–657.
Dixon R M W (1994). Ergativity. Cambridge: Cambridge University Press.
Dixon R M W (2002). Australian languages. Cambridge: Cambridge University Press.
Mel’cuk I A (1986). ‘Toward a definition of case.’ In Brecht R D & Levine J S (eds.). 35–85.
Plank F (ed.) (1991). Paradigms: the economy of inflection. Berlin: Mouton de Gruyter.
Silverstein M (1976). ‘Hierarchy of features and ergativity.’ In Dixon R M W (ed.) Grammatical categories in Australian languages. Canberra: Australian Institute of Aboriginal Studies/New Jersey: Humanities Press. 112–171.
Case Grammar
J M Anderson, Methoni Messinias, Greece
© 2006 Elsevier Ltd. All rights reserved.
‘Case Grammar’ is a label used for various developments in grammatical theory originating in the mid-to-late 1960s that are associated more or less closely with a certain hypothesis concerning the organization of the grammar: the hypothesis concerns the status of semantic functions – or relations or roles – such as Agentive or Locative; functions that label the mode of participation of the denotata of arguments in the situation described by the predication in which they occur. The terms ‘various’ and ‘more or less closely’ are used advisedly. Since Case Grammar is a partial hypothesis, it is compatible with a variety of hypotheses concerning other aspects of the grammar, though it will, of course, interact with them. Since, too, the hypothesis can be formulated in more and less strong forms, not all variants of Case Grammar are as distinct in their claims from what is embodied in other frameworks that are not usually termed Case Grammars. The minimum Case Grammar hypothesis is that semantic functions are relevant to the expression of syntactic (as well as semantic) generalizations; in a stronger, more interesting and distinctive form it involves the claim that they are basic to the syntax and that many other aspects of syntactic structure are derivative of them. The name itself is in part a recognition that what is involved is, again in part, a return to traditional concerns with the semantics and syntax of Case (see Case), concerns that were neglected by those structuralist (including early transformational) frameworks that abolished the morphology/syntax division and were reticent about semantics. The term ‘Case Grammar’
specifically devolves from Fillmore’s (1965, 1966, 1968a) use of Case Relation, or simply Case, for semantic function, the status of which is fundamental to the Case Grammar hypothesis. In what follows, the first section outlines some of the basic notions central to the main tradition associated with this hypothesis; while the second and third sections give some idea of, respectively, the variety of interpretation to which it has been subjected and the range of attempts to arrive at a definition of Case and Cases.
Some Fundamentals The use of the term ‘Case (Relation)’ for semantic function is based on the familiar observation that in a number of languages semantic functions are distinguished by differences in nominal inflexion, as in the Old English sentence of (1):
(1) Him ofhreow þæs mannes
    ‘He/they+dat pitied the+gen man+gen’
(see again Case), wherein the dative inflexion signals the locus of the emotion denoted by the verb (discussed below as the Experiencer Case Relation) and the genitive inflexion marks the Source of the emotion. However, it must be acknowledged, as is once more familiar from earlier studies (cf. e.g., Welte, 1987), that in many instances the correlation between case inflexion and semantic function is not simple. Most notoriously, case inflexions can correlate more closely with grammatical functions or relations. Notably, most grammarians would not include Subject among the set of semantic functions. At the very least, it is of a rather different character from Agentive, etc.; hence its differentiation, along with (for many grammarians) Object, etc. as a grammatical relation. The
inflexions identified by the label nominative in various languages are labeled thus precisely because their occurrence correlates most closely with the nominal identified on other grounds as a Subject. Further, apart from the fact that case inflexions typically express other categories (such as gender, number, and dimensionality) simultaneously with case (or grammatical) relations, distinctions in Case Relation are frequently expressed otherwise than by nominal inflexion. Typically, as well as or instead of noun inflexions, adpositions, verbal auxiliaries, word order, verb morphology, and combinations of these may be involved, often in combination in the same language. Consider, for example, the sentences from Eastern Pomo in (2):
(2a) mí! békal du!léya
     he them killed
(2b) békh mí!pal ša!akiya
     they him killed
(2c) mí! kaluhuya
     he went-home
(2d) békh kálphi!líya
     they went-home
(2e) békal e!xéka
     they slipped
The shape of the pronoun reflects whether it is Agentive or not: mí! and békh are Agentive, they mark the Source of the Action, both in the transitive examples (2a, b) and the intransitive (2c, d); but mí!pal and békal signal the entity undergoing the action or process, what I will describe below as an instance of the Neutral Case Relation, both in the transitive examples (2a, b) and in the intransitive (2e). Typically, too, the suppletion illustrated in (2a, b) – the ‘kill’ verb changes its shape – is in response to the number of the Neutral (‘them’ vs. ‘him’). However, in (2c, d) the (partial) suppletion is triggered by the number of the argument we have already (on the basis of the correlation between semantic function and pronoun morphology) labeled Agentive. This would seem to confirm that the pronouns in (2c) and (2d) are simultaneously Agentive and Neutral, as both are Source of the Action and undergoer of it (and thus moving), whereas that in (2e) is Neutral simply. We return to the association between argument and Case Relation in ‘Defining Case Relations.’ What is most relevant at this point is illustrating the interaction of different means of expressing Case relations. There is typically, too, with respect to a specific language, no one-to-one correlation between a particular Case Relation and a particular expression, whether by inflexion, adposition or whatever. Despite occasional claims to the contrary, the English preposition by, for instance, is not associated uniquely with
Agentives, or to with Goals: both can also mark what is often labeled Experiencer (known by/to). Nevertheless, given that the expression of distinctions in Case Relation may be regarded as the prototypical function of case inflexions, this ‘case’ terminology is not unjustified. And its appropriateness is reinforced to the extent that grammatical relations can be regarded as neutralized Case Relations. Such a notion is crucial to the strong Case Grammar framework developed by Charles Fillmore in the late 1960s. This framework replaced the Deep Structure of a grammar of the type envisaged in Chomsky (1965) with basic representations that include nodes associated with a set of Case Relations. Both Surface Structures of (3), for example, are derived from the (unordered) Deep (Case) Structure of (4).
The Proposition consists of a verb and a set of Case Phrases, each of which includes a Case ‘flag’ (Kasus) and a Noun Phrase. The Agentive is the source of the action; the Neutral is the least specific Case, whose precise relation to the predicate is most obviously dependent on the type of the predicate: it labels the
entity that undergoes processes and movements and actions and has locations and states attributed to it. ‘Defining Case Relations’ discusses more fully the definition of the Cases; but notice at this point that N(eutral) appears under a number of different labels: Ergative (an acknowledged misnomer), Object(ive), Nominative (again, unfortunate), Absolutive, Patient (usually interpreted more narrowly than Neutral). Modality consists of elements of tense, mood, and aspect, including Modal Cases (such as Manner Phrases), which are interpreted as modalities of the sentence as a whole. Predicates, such as the verb in (3) and (4), are subcategorized not in terms of their functionally unlabeled complements, but with respect to the set of Case Phrases they require, obligatorily, or optionally, as exemplified in (5a) and (5b), respectively:
(5a) + [ N(eutral) A(gentive)]
(5b) + [ N (A)]
The verb of (5a) (perhaps kill, if we ignore Instrumentals, such as with a handbag, for the moment: see again ‘Defining Case Relations’) is marked as taking a Neutral and an Agentive. That in (5b) takes a N and an optional A, perhaps appropriate for melt: see (8) below. Assignment of the status of Subject is based on the set of Case Phrases present in a sentence, the Case Frame. Each entry in (5) includes a set of Case Frame Features. Unmarked assignment in a particular predication is specifically in accord with a Hierarchy of Case Relations, as exemplified by (6) (Fillmore, 1968a): (6) Agentive > Instrumental > Objective
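Read procedurally, a hierarchy such as (6) amounts to choosing, as unmarked Subject, the highest-ranked Case present in the Case Frame. The Python sketch below illustrates that reading only. The function name select_subject and the representation of a Case Frame as a set of labels are assumptions made for illustration, not part of Fillmore's formalism, and marked choices such as the passive are not modelled.

```python
# Hierarchy (6), highest priority first (after Fillmore, 1968a).
CASE_HIERARCHY = ["Agentive", "Instrumental", "Objective"]


def select_subject(case_frame):
    """Return the highest-ranked Case present in the frame.

    `case_frame` is a set of Case labels (a hypothetical representation).
    Only the unmarked choice of Subject is modelled.
    """
    for case in CASE_HIERARCHY:
        if case in case_frame:
            return case
    raise ValueError("no Case in the frame is ranked by the hierarchy")


print(select_subject({"Agentive", "Objective"}))      # Agentive
print(select_subject({"Instrumental", "Objective"}))  # Instrumental
print(select_subject({"Objective"}))                  # Objective
```

A list is used for the hierarchy because only the relative order of the Cases matters here; a different ranking can be tried by reordering it.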
In terms of (6), an Agentive, if present, will be Subject; in the absence of an A, then an Instrumental, if present, will be; and so on. With kill, the Agentive thus takes priority, as in (3a). The Surface Structure of the passive sentence in (3b) contains signals (the auxiliary construction) that here the Hierarchy has been overridden, a marked selection of Subject has been made, and the ‘rejected’ subject is distinguished by an appropriate marker (here by). There has been some controversy over the character of the Hierarchy, partly as a result of different views concerning the set of Cases, as discussed in ‘Defining Case Relations’ below. It may be too that lexical exceptions to the Hierarchy have to be acknowledged. For instance, if like and please in English are associated with the same set of Case Frame Features, then one of them will apparently have to be marked as exceptional in Subject selection: (7a) Jemima liked the play (7b) The play pleased Jemima
whatever the hierarchy, if it is determinate. Anderson (1977: Sect. 2.1.5) and others have denied that this is necessary, in that the two verbs require different sets. Grimshaw (1990) suggests that such pairs as (7) differ ‘aspectually.’ But even if such exceptions have to be countenanced, the viability of some such hierarchy is the basis for the strong Case Grammar hypothesis, which asserts the basicness of Cases and the dependence thereon of other syntactic phenomena (including Subject selection). It has also been argued that conventional Deep Structures (as introduced in Chomsky, 1965), as well as being derivable, and indeed dispensable, also form a poor basis from which to project (Deep) grammatical relations, one of their claimed roles: configurational definitions are difficult to maintain across languages; they allow for spurious (or at least never utilized) grammatical relations; and some categories seem themselves to be relational (e.g., Place). Moreover, the role of Deep relations in the grammar (pace Katz, 1972) remains obscure. See here, for example, Anderson, 1977: Sect. 1.2, 1982; Starosta, 1987. And, that argument proceeds, neither Deep Structures nor grammatical relations are relevant to lexical relationships. Fillmore’s work on Case Grammar and beyond has involved a strong interest in lexical semantics (see particularly Fillmore, 1987; also, in the present context, Fillmore, 1968b, 1970, 1971b, 1971c, 1972, 1977a, 1977b). Much effort within Case Grammar has been devoted to showing that subcategorization and lexical relations are sensitive to Case Relations rather than the configurations and grammatical relations that are derivative in CG. So-called ergative verbs (Lyons, 1968) appear to provide a straightforward example with respect to subcategorization. 
Case Grammarians have argued that the basic distributional potential of verbs such as melt in (8) is most transparently described in terms of the Case Features of (9) rather than, say, the conventional frame of (10):
(8a) Burt melted the ice
(8b) The ice melted
(9) N (A)
(10) + [ ___ ([NP])
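The Case Features in (9) can be made concrete with a small Python sketch. It is an illustration under stated assumptions: the class CaseFrame, its admits method, and the one-letter labels N and A are hypothetical conveniences, and the ranking A > N simply transposes the hierarchy in (6), taking Neutral to be the Objective of that hierarchy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseFrame:
    """Required Cases plus optional ones, as in the frames N A and N (A)."""
    required: frozenset
    optional: frozenset = frozenset()

    def admits(self, cases):
        """True if every required Case is present and no unlisted Case occurs."""
        present = frozenset(cases)
        return self.required <= present <= (self.required | self.optional)


# Frames assumed for illustration: N A (perhaps 'kill') and N (A) ('melt').
KILL = CaseFrame(required=frozenset({"N", "A"}))
MELT = CaseFrame(required=frozenset({"N"}), optional=frozenset({"A"}))

RANKING = ("A", "N")  # assumed ranking: Agentive outranks Neutral


def subject(arguments):
    """Return the phrase bearing the highest-ranked Case present."""
    for case in RANKING:
        if case in arguments:
            return arguments[case]
    raise ValueError("no ranked Case present")


melt_8a = {"A": "Burt", "N": "the ice"}  # Burt melted the ice
melt_8b = {"N": "the ice"}               # The ice melted

for args in (melt_8a, melt_8b):
    print(MELT.admits(args), KILL.admits(args), subject(args), args["N"])
```

In both uses of melt the Neutral argument is the same phrase, while the phrase chosen as Subject differs; the frame for the kill-type verb rejects the argument set that lacks an Agentive.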
Example (10) can be said to obscure what is constant in the two basic occurrences, the Neutral (Objective) argument, with selection of Subject being determined by the Hierarchy. Anderson (1984) attempts to show that a wide range of (lexical) derivational relationships in English make no reference to grammatical relations or the configurations in terms of which they may be defined.
Typical here is -able formation, whose central function is to form adjectives of a certain semantic character on the basis of verbs, such that the argument of the adjective corresponds to the Neutral argument of the verb (whatever its grammatical relation); we again have the familiar ‘ergative’ relation, illustrated in (11) and highlighted therein by the derivational pattern associated with a verb such as change: (11a) The cover is removable (cf. Beppo has removed the cover) The settings are changeable (cf. Beppo has changed the settings) (11b) The material is perishable (cf. The material has perished) The weather is changeable (cf. The weather has changed)
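The pattern in (11) can be stated as a lookup on Case Relations alone. The sketch below is a hypothetical illustration in Python: the table VERB_ARGUMENTS, its Case labels, and the function able_argument are assumptions introduced here, and the grammatical relations are entered by hand from the ‘cf.’ sentences in (11).

```python
# Each argument of the base verb: (phrase, Case Relation, grammatical relation).
# The labels are assumptions for illustration, read off the 'cf.' sentences.
VERB_ARGUMENTS = {
    "remove": [("Beppo", "A", "Subject"), ("the cover", "N", "Object")],
    "change (transitive)": [("Beppo", "A", "Subject"), ("the settings", "N", "Object")],
    "perish": [("the material", "N", "Subject")],
    "change (intransitive)": [("the weather", "N", "Subject")],
}


def able_argument(verb):
    """Return (phrase, grammatical relation) for the verb's Neutral argument."""
    for phrase, case, relation in VERB_ARGUMENTS[verb]:
        if case == "N":
            return phrase, relation
    raise ValueError(f"{verb!r} has no Neutral argument")


for verb in VERB_ARGUMENTS:
    phrase, relation = able_argument(verb)
    print(f"{verb}: the -able adjective is predicated of {phrase!r} "
          f"(base-verb {relation})")
```

The lookup never consults the third field: the argument picked out is an Object for some base verbs and a Subject for others, which is why a rule stated over grammatical relations would need two clauses where the Case-based rule needs one.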
Reference to grammatical relations again obscures the generalization. The import of this is that if the lexicon (via subcategorization and derivational relationships) has access only to basic syntactic structure, then grammatical relations and the division into Subject and Objects do not appear to be basic: this is exactly what is claimed by the Case Grammar hypothesis. This conclusion can be avoided, with respect to examples such as (8) and (11), at least, if one adopts the Unaccusative hypothesis associated with developments of Relational Grammar and other frameworks. In terms of this hypothesis, the Surface Subjects of (8a) and (11a) are initial Direct Objects; their subjecthood is derivative only. (Other intransitive verbs [typically Agentive] do have initial Subjects; they are Unergative.) The relationships in (8) and (11) can then be described without reference to Case Relations: for example, ‘ergative’ verbs are transitives with an optionally empty subject; the derivational relationship illustrated by (11) involves uniformly the Direct Object of the base verb. Case Grammarians have argued that such a strategy lacks independent motivation, is unnecessary given the availability of Case Relations, and leads to incorrect predictions (cf. e.g., Anderson, 1980). Such arguments have also been concerned to show not only that grammatical relations are derivative (‘Surface’) but also that it is inappropriate to attribute subjecthood at all to certain language types, if a definition of Subject more restrictive than ‘obligatory derived relation’ is to be maintained. In most languages, Subject is a syntactically motivated grouping (or neutralization) of transitive Agentives and intransitive Neutrals (cf. [8] again). In the (‘ergative’) Dyirbal language of Australia, such a grouping is at most only marginally relevant from a syntactic point of view. (Direct) Objecthood is also notoriously
difficult to identify; and it has been argued that Indirect Object is an incoherent relation, even derivatively (Anderson, 1978, and see more recently S. R. Anderson, 1988). Throughout the late 1960s and the early 1970s, various different versions of Case Grammar were proposed, to some extent independently (e.g., apart from the work of Fillmore: Anderson, 1968, 1971; Platt, 1971; Cook, 1971, 1972a, 1972b, 1973, 1978; Nilsen, 1972; Longacre, 1976; cf. also Chafe, 1970), and the framework inspired a number of descriptive and applied studies. However, little agreement on the set and nature of the set of Case Relations emerged (see further ‘Defining Case Relations’ below). It also became clear that Case Grammar was vulnerable to the charge that, apart from the claimed prediction of the distribution of grammatical relations and associated word order properties, little evidence of a syntactic role for Case Relations was adduced. Thus, for example, the Case-based lexicon of Stockwell et al. (1973) plays almost no role in the syntactic descriptions that occupy the rest of the volume. Even more seriously, perhaps, evidence was put forward, on the other hand, for the semantic and syntactic relevance of Deep grammatical relations and for the necessity of positing the kind of relation between active and passive sentences prescribed by the Passive transformation, which was eliminated from the syntax by Fillmore’s (1968a) proposals. Crucial here is the discussion of the role of grammatical relations by S. R. Anderson (1971). Pairs of sentences such as those in (12): (12a) Ernie loaded the pickup with packs of amaretti (12b) Ernie loaded packs of amaretti onto the truck
differ in interpretation: the former has been described as ‘holistic,’ the action exhausts the relevant dimensions of space denoted by the Direct Object; the latter is ‘partitive.’ If (12a) and (12b) share Case Features (say Agentive, Neutral, Place), then the basis for the difference must be located elsewhere. It is not a property of Surface Structure; the difference remains constant under various transformational movements, as illustrated by (13) and (14): (13a) The pickup was loaded with packs of amaretti (13b) Packs of amaretti were loaded onto the truck (14a) It was packs of amaretti that Ernie loaded the truck with It was the pickup that Ernie loaded with packs of amaretti (14b) It was packs of amaretti that Ernie loaded onto the pickup It was the pickup that Ernie loaded packs of amaretti onto
What the (a) variants share is Deep Structure association of the Place Case Relation with Direct Object function. Their derivations must include a stage at which this association is made, motivating both Passive as a transformation and Deep Object. (This is, of course, not to deny that, given appropriate assumptions, this association could be read off derived structures.) Fillmore (1977a) concedes the force of this argument, and essentially reinstates Deep Structure, thereby effectively abandoning the Case Grammar hypothesis in its strongest form. Others have disagreed. It is possible, for instance, that Passive may be accommodated constructionally, as bi-clausal (see ‘Varieties of Case Grammar’ below). But even if that possibility is laid aside, it can be argued that the Case Feature assignments assumed above are inappropriate, that they differ between the (a) and (b) examples. Miller (1985) suggests that the association of nonsubjective arguments to Case Relations is reversed between (a) and (b); J. M. Anderson (1977: Sect. 1.8) accepts that the (a) and (b) examples share the same Case Features and the same associations, but suggests that in addition the Place argument in the (a) examples is simultaneously Neutral: i.e., it bears two Case Relations. (On this see ‘Defining Case Relations’ below.) Under either proposal, we can associate holisticness, as elsewhere, with the Neutral Case Relation: unless this expectation is canceled in some way, Neutrals are normally understood as participating as a whole in the process being represented. This is true of the Neutrals in (1b) and (12b), for instance. J. M. Anderson also points out that the generalization based on Deep Object is inadequate. The Subjective Place argument in (15a) but not (15b) is also associated with a holistic interpretation: (15a) The garden is swarming with bees (15b) Bees are swarming in the garden
He attributes this to the Place in (15a) being again simultaneously Neutral, which is also associated with its selection as Subject. Once more, (12) and (15) show the ‘ergative’ (or, more misleadingly, ‘unaccusative’) pattern that cuts across grammatical relations. The syntactic role of Case Relations has been variously addressed since the mid-1970s. The force of objections based on the paucity of reference by transformations to Case Relations has been considerably weakened by the apparent demise of individual transformations. For instance, the motivations for a syntactic relationship between pairs such as that in (16), involving a putative Dative Movement, are disputable, where evident: (16a) Anna taught Helen Greek (16b) Anna taught Greek to Helen
These pairs can be argued to show a partially shared Case Frame, with the difference in word order being attributable to the differences in Case Relations present (cf. again Anderson, 1978; also Anderson, 1987). The difference in Case Frame is reflected in the contrary acceptabilities of (17a) and (17b): (17a) *Anna taught an empty room Greek (17b) Anna taught Greek to an empty room
The Helen/empty room argument is involved in the action in a different way in the (a) and (b) examples, suggesting a difference in Case Relation (on the possible character of this difference, once more involving an argument being assigned more than one Case Relation, see again ‘Defining Case Relations’ below). More generally, the development of frameworks in which transformations are reduced to a small number, including one, or from which they are eliminated altogether, poses the question of the syntactic relevance of Case Relations in a different way – or rather different possible ways. This is one respect in which different variants of Case Grammar have evolved in response to decisions about other aspects of the grammar than are encompassed by the basic Case Grammar hypothesis.
Varieties of Case Grammar
Fillmorean Case Grammar evolved as an alternative to the kind of transformational grammar expounded in Chomsky (1965), with subject formation instated as one of a number of transformational rules. Anderson (1968, 1971) talks of rules of realization (including ‘sequencing’), implying nontransformational mapping of (unordered) Case structures onto surface structures enriched with sequence and configurations; but only a very limited range of constructions is taken into consideration there. And Anderson (1977) envisages complex (transformational) derivational relationships between initial unordered structures and Surface, including prelexical application of syntactic rules (cf. similar developments over the same period within Generative Semantics; see Generative Semantics). Partly in reaction to such developments, Starosta (1973, 1978, 1987, 1988) and others have formulated a framework, Lexicase (see Lexicase), which eschews transformations, and, indeed, syntactic derivations altogether. Syntax is monostratal and Case Relations and Case Forms (including case inflexions) are marked as features on lexical items, the former on nouns, the latter on nouns, verbs, and prepositions. Thus, corresponding to (3) and (4) above we might have (18), with no Deep/Surface distinction (modeled on Starosta, 1987: 65):
AGT and PAT are Case Relation features, roughly corresponding to A and N; Nom, Acc, Sorc, and Goal are Case Form features. The relationship between the [+AGT] noun in (18a) and the Means ([+MNS]) noun in (18b) is expressed by the rule of lexical derivation that forms the adjectival verb in (18b) from the transitive verb in (18a). Corresponding to the Case Frame Features discussed above, we have Contextual Features such as [+[+AGT]] and [+[+PAT]] associated lexically with predicates. Contextual Features may be either inherent or given by lexical redundancy: e.g., all verbs are redundantly [+[+PAT]] (what Starosta refers to as Patient Centrality: see ‘Defining Case Relations’ below). The Subject Selection Hierarchy is also expressed by lexical redundancies such as (19) (Starosta, 1987: 67):
Example (19) requires that an item that occurs with a [+AGT] must also occur with a [+PAT] (generalizable as Patient Centrality), and also that the [+AGT] is [+Nom], Subject, and the [+PAT] is [+Acc], Object. In general, the grammar is reduced to what can be expressed via relationships between individual lexical items, though the status of long-distance dependencies remains uncertain. We should note too that some critics have found it difficult to see how basic word order settings (such as Head-Modifier vs. Modifier-Head) can be regarded as part of the lexicon. It is still possible within such a restrictive framework to reconstruct the fundamental basis for the Case Grammar hypothesis, the primacy of Case Relations vis-à-vis Case Forms; specifically, in terms of redundancies such as that in (19). But lexical primacy of Case Relations is a property shared by a number of other proposals, such as those made within the tradition of Functional Grammar (see Functional Grammar: Martinet), as developed by Dik (1978; also 1987) and others. Are these also Case Grammars? Starosta himself describes various Valency Grammars (see Valency Grammar) as such. This is to some extent a terminological matter. However, the central thrust of work in Case Grammar (what was called above the strong hypothesis) has involved the syntactic primacy of Case Relations and the exclusion of grammatical relations, etc. from the lexicon (cf. ‘Some Fundamentals’ above). Other developments suggest that this can be maintained within a restrictive theory that nevertheless does not abandon the syntax/lexicon division. The descriptions proposed by Anderson (1977) involve complex syntactic derivations. But they are conceived of as resulting from the interaction of a small number of universal syntactic rules, and possibly only one involving structure change (Raising), regulated by universal constraints, such as Strict Cyclicity. Roughly, in terms of this last, cyclic structure-changing rules must apply only in derived (nonmonoclausal) environments or in an environment resulting from the application of a cyclic rule.
Thus, a derivation whereby the Subject of the finite verb in (20) is Raised out of a subordinate clause conforms to Strict Cyclicity: (20) Fran seems to like cheese
whereas monoclausal rules such as Dative and the traditional Passive, whether interpreted as movements or relation changes, are illegitimate. Such a requirement outlaws all the Advancements of Relational Grammar, for instance, if conceived in derivational terms. Raising, indeed, it has been argued (e.g., Anderson, 1982, 1984, 1986), provides a paradigm instance of the syntactic role of Case Relations and of the appropriateness of the strong Case Grammar hypothesis.
In (20), the Raised Subject of the (nonfinite) subordinate clause assumes the Subject function in the main (cyclic) clause. In (21) the Raised Subject becomes the Object of the finite verb: (21) Nobody expected Fran to like cheese
(Such a formulation clearly rejects attempts to deny such a [derived] status to the Fran argument in [21].) The derived relations involved present us with a familiar pattern of distribution, that labeled ‘ergative.’ The status of Subject in (20) and Object in (21) is exactly what one would expect of a Neutral argument; in (21), it is denied subjecthood by the Experiencer argument of the finite verb. Raising confers the Neutral Case Relation in the cyclic clause on the Subject argument of the subordinate verb. Formulations of Raising invoking grammatical relations or configurations, once again, obscure this generalization. More interestingly still, this interaction of relations is as predicted by the Case Grammar hypothesis. The identification of grammatical relations is derived, determined by the Hierarchy; thus, the identity of the Subject of the cyclic clause is not available on the cycle of rules applying to that clause (in the absence of an arbitrary ordering of rules); but, if the Subject is identified cyclically, then its identity is available on the next cycle. This is exactly what the formulation of Raising requires: the argument involved is identified as the Subject of the lower clause (now available, on the Raising cycle) and as Neutral (a Case Relation) in the cyclic clause; its grammatical relation in the upper clause is given by the Hierarchy. Indeed, if this argument had to be identified in the formulation of Raising by its grammatical relation in the cyclic clause, such a formulation would stand as a counter-example to the predictions of the Case Grammar hypothesis. Dative and Passive, in so far as they involve manipulation of derived properties within a single clause, also violate the Case Grammar hypothesis, as they do Strict Cyclicity. The two hypotheses converge in excluding them as monoclausal rules. The Dative relationship with examples such as (16) can be allowed for in terms of partially shared Case Features.
Such an account is not available, however, for Passive. But as an alternative to the lexical proposal described above, a cue can be taken (as it has been in a number of approaches) from the overtly two-verb structure involved. Say (Passive) be is a Raising verb, and its Subject is Raised out of the subordinate clause associated with the accompanying nonfinite (e.g., Anderson, 1991). It differs from a Raising verb like seem only in that the Raised argument is not the Subject of the nonfinite verb; rather, it is the argument next
down on the Subject Selection Hierarchy. This stipulation allows for all of (22) (with the boundaries of the subordinate clause marked by square brackets): (22a) Albert was [killed by Emma] (22b) That was [known by/to everybody] (22c) The bed was [slept in by a parrot]
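The difference between a Raising verb like seem and Passive be can be sketched procedurally. The following is a minimal illustration, not a formalization proposed in the literature: the ranking in `HIERARCHY` is a simplified assumption, the function names are hypothetical, and the assignment of Place to the bed in (22c) is likewise an assumption made only for the example.

```python
# Assumed, simplified Subject Selection Hierarchy (highest first).
# The ordering is illustrative only.
HIERARCHY = ("Agentive", "Experiencer", "Neutral", "Place")


def rank(cases):
    """Rank an argument by the highest-ranked Case Relation it bears."""
    return min(HIERARCHY.index(case) for case in cases)


def by_hierarchy(frame):
    """Order (noun phrase, Case Relations) pairs from highest to lowest."""
    return sorted(frame, key=lambda argument: rank(argument[1]))


def raised_by_seem(frame):
    """A verb like 'seem' Raises the lower-clause Subject (highest argument)."""
    return by_hierarchy(frame)[0][0]


def raised_by_passive_be(frame):
    """Passive 'be' ignores the highest argument and Raises the next one down."""
    return by_hierarchy(frame)[1][0]


# Arguments of the subordinate (nonfinite) verb in (20) and (22a-c).
LIKE = [("cheese", ("Neutral",)), ("Fran", ("Experiencer",))]
KILL = [("Albert", ("Neutral",)), ("Emma", ("Agentive",))]
KNOW = [("that", ("Neutral",)), ("everybody", ("Experiencer",))]
# Assumption: 'the bed' is treated as Place; 'a parrot' is Agentive/Neutral.
SLEEP_IN = [("the bed", ("Place",)), ("a parrot", ("Agentive", "Neutral"))]

print(raised_by_seem(LIKE))            # Fran, as in (20)
print(raised_by_passive_be(KILL))      # Albert, as in (22a)
print(raised_by_passive_be(KNOW))      # that, as in (22b)
print(raised_by_passive_be(SLEEP_IN))  # the bed, as in (22c)
```

On this sketch the two verbs share a single Raising operation and differ only in which position on the ranked list they target.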
In each instance, the Hierarchically highest argument (Agentive, Experiencer, Agentive/Neutral, respectively) is ignored by Raising, and is marked with a preposition characteristic of the Case Relation involved. (On the Agentive/Neutral assigned to (22c) see ‘Defining Case Relations’ below.) More recently, it has been claimed (cf. again Anderson, 1991) that Raising (including Passive) is projectively structure-building; more generally, that syntactic structure is built up monotonically on the basis of the valencies (specified crucially in terms of Case Relations) of individual lexical items. Such developments enable us to bring into sharper focus the differences between the lexical Case Grammar advocated by Starosta and the tradition within which Case Relations are seen as syntactically basic. Crucially involved is the validity of a syntactic/lexical distinction. Advocates of the distinction (e.g., Anderson, 1984) argue, for instance, for a differentiation between syntactic and lexical passives (derived adjectives); it is unclear how, if legitimate, this distinction is to be reconstructed in purely lexical terms. Another central question for the lexical approach (apart from those mentioned above) is: how are the alleged asymmetries between the roles of Case Relations and Case Forms to be accounted for? This involves both the syntactic asymmetry associated with Strict Cyclicity (deployed in the description of Raising given above) and the lexical asymmetry associated with the claimed failure of lexical rules to make reference to Case Forms. Case Grammar approaches (including perhaps Functional Grammar) are united in highlighting the Contrastive status of Case Relations: they have to be stipulated lexically. Many other aspects of syntactic and lexical structure are redundant, derivative of Case specifications and of, e.g., parametric settings for basic word order. The Case Relations are also semantically identified, though distributionally distinctive.
Such a view has been influential in the evolution of other, distinctively labeled frameworks, such as Role and Reference Grammar (see Semantics in Role and Reference Grammar). In the main line of development in transformational grammar (epitomized by Chomsky, 1972, 1981, 1995), the Case Grammar hypothesis was initially rejected, but over time transformational syntax came to embrace Case
Relations (under the unfortunate name of Thematic Relations, or Theta Roles), and eventually to abandon Deep Structure. In these respects, the Minimalist program can be identified as a variant of Case Grammar. On the other hand, Anderson (1989, 1991, 1997) and Böhm (1998), in view of the semanticity of the Case Relations, see Case Grammar as a subpart of Notional Grammar, wherein all the basic elements of the syntax, including word classes, though distributionally motivated, are identified on the basis of semantic properties displayed by their prototypical members. As ‘Varieties of Case Grammar’ has in part illustrated, other aspects of Case Grammar have been much more contentious. To some extent, this reflects choice among theoretical alternatives relatively independent of the Case Grammar hypothesis. Independence of hypotheses within grammatical systems, however, is only ever relative. Thus, whereas Fillmore (1968a, 1971a) remains rather undecided concerning the appropriate representation of syntactic structures including Cases, others (e.g., Robinson, 1970; Anderson, 1971; Tarvainen, 1987, and cf. Tesnière, 1959) have argued that Case Relations, specifically, are most appropriately expressed by Dependency structures (see Dependency Grammar). In terms of the framework of Anderson, 1971, for instance, (23) would correspond to (4), with Case nodes dependent on the V(erb) and governing the N(oun); Erg = Ergative (roughly, Agentive; see ‘Defining Case Relations’) and Nom = Nominative (Neutral):
The correlation between the dependency hypothesis, and other aspects of syntax, and the Case Grammar hypothesis itself is controversial. But perhaps the least agreed-on aspect of Case Grammar has been the constitution of the set of Case Relations themselves, an issue that remains unresolved. And the characterization of these in turn has consequences for these other issues concerning the parameters of a Case Grammar.
Defining Case Relations
The arguments of a predicate identify the entities involved in the situation described by the predicate. The nature of the roles played by these entities can be discriminated in more or less detail. In a sense, each predicate prescribes a unique set of roles. But it is possible to generalize over various ‘fields’ of the vocabulary; and in certain institutionalized situations, in particular, these roles are made lexically explicit in the language. Thus, in the situations described by English verbs such as sell, buy, barter, etc., one role is occupied by a Customer; in those described by concede, justify, accuse, etc., there occurs a Defendant, perhaps (see e.g., Fillmore, 1971c, 1972: Sect. 42, 1977b). More generally still, we can recognize that in many situations it is possible to attribute to a particular entity the role of Source of the Action, the Agentive role. The Customer in (24a) and the Defendant in (24b) are both Agentive, as Sources of the immediate Action described by the verb: (24a) Algernon bought a Lada from Bert (24b) Algernon justified his decision unconvincingly
However, there is no simple mapping between such generalized roles and the more specific institutionalized functions. The Customer in (25), for instance, is not presented as the Source of the immediate Action: (25) Bert sold a Lada to Algernon
even though the same ‘real-world’ event may be being referred to by (24a) and (25). This observation undermines such critiques as that offered by Dowty (1989). Dowty’s program for eliminating Case Relations from the syntax has been pursued, however, in various ways, as in the tradition exemplified by Hale and Keyser (2002), generally at the cost of acceptance of a very abstract view of syntax and of the syntacticization of lexical structure. Much of Fillmore’s work since the early 1970s has been concerned with the cognitive structures, Frames (see Frame Analysis), within which detailed role specifications are articulated. However, it is generalized roles such as Agentive that Case Grammarians identify as Case Relations, those semantic functions that are basic to the lexicon and/or syntax. There has been some agreement, and much disagreement, concerning the set of roles that fulfill this function in the grammar. (For a survey of early work see Somers, 1987.) The set of Case Relations offered in Fillmore (1968a) was tentative and not intended as necessarily exhaustive; he suggests (Fillmore, 1968a: 24–25):
• Agentive (A), the case of the typically animate perceived instigator of the action identified by the verb;
• Instrumental (I), the case of the inanimate force or object causally involved in the action or state identified by the verb;
• Dative (D), the case of the animate being affected by the state or action identified by the verb;
• Factitive (F), the case of the object or being resulting from the action or state identified by the verb, or understood as a part of the meaning of the verb;
• Locative (L), the case that identifies the location or spatial orientation of the state or action identified by the verb;
• Objective (O), the semantically most neutral case, the case of anything representable by a noun whose role in the action or state identified by the verb is identified by the semantic interpretation of the verb itself; conceivably the concept should be limited to things that are affected by the action or state identified by the verb.
Other possibilities are mentioned elsewhere. Fillmore (1971b: Sect. 4) proposes a rather different set as “the case notions that are most relevant to the subclassification of verb types”: it includes a Counter-Agent, and splits the Dative into various other Cases, including a new one, the Experiencer (see further below). Fillmore (1971a) contains a slightly different list, in particular lacking Result (regarded there as a Goal) and Counter-Agent and including Location, Time, and Path. Starosta (1988) suggests a set somewhat reduced compared to his and Fillmore’s earlier proposals. Cook (1978: 299), for instance, proposes only Agent, Experiencer, Benefactive, Object, and Locative, while Longacre (1976: 27–34) has Experiencer, Patient, Agent, Range, Measure, Instrument, Locative, Source, Goal, and Path. These sets at least show some overlap; much less of this is evident in, for example, the rather more exotic set proposed by Tarvainen (1987).
Such uncertainty over the set of Cases, also well illustrated by the introductory discussion to the sample lexicon of Stockwell et al. (1973: Chap. 12, Sect. II.A), has often been cited in criticism of the Case Grammar framework (e.g., Chapin, 1972). It might, indeed, be regarded as a sign of lack of responsibility to describe a particular Case as a ‘wastebasket’ (Fillmore, 1971a: 42). Unless a theory of Case is properly constrained, new Cases are liable to crawl out of (or be rescued from; Radden, 1978) the ‘wastebasket’ or from even less desirable spots. It should, however, be pointed out that the same challenge, or problem, confronts any framework that
includes semantic functions or thematic roles (Gruber, 1965; Jackendoff, 1983, and descendants), whose status (as already implied) is now seen by a number of non-Case Grammarians as central to the grammar (cf. e.g., the discussions and references in Wilkins, 1988). Moreover, the determination of the set of syntactic categories as a whole remains contentious. How, for instance, does one group and hierarchize the classes indicated by the labels universal quantifier (e.g., all), distributive quantifier (every), existential quantifier (e.g., some), adjectival quantifier (e.g., many), quantificational adjective (e.g., numerous, various), cardinal numeral, ordinal numeral, superlative adjective, . . . (see e.g., Anderson, 1989)? On the other hand, this Defendant has to concede that the resolution of the question of the constitution of the set of semantic functions is rather more crucial for a framework in which they play such a fundamental role as is advocated by Case Grammarians. Thus, much effort has been devoted within Case Grammar not merely to the establishing of definitions and semantic/syntactic properties of the individual Cases (cf. e.g., Fillmore’s [1972: Sect. 32] discussion of Experiencer and personally), but also to the formulation of general principles governing the distribution of Cases and (less commonly) of a general substantive theory of the category of Case. Some Case Relations seem to be well established: Agentives and Neutrals, for instance, are generally invoked, with agreement over the central instances of such; and their status as Cases is supported by many of the phenomena alluded to in ‘Some Fundamentals’ and ‘Defining Case Relations.’ And a Place or Locative Case is generally acknowledged, though its relation to Source and Goal (and Path) is contentious. 
It is also generally agreed that predicates of experience such as like involve a distinct Case, which Fillmore (1968a) dubbed Dative; but the scope of this Case is controversial, with Fillmore (1971a, 1971b), as seen above, for instance, reassigning some former Datives to Goal or Neutral and relabeling the rest Experiencer (Fillmore, 1971a: 42): . . . where there is a genuine psychological event or mental state verb, we have the Experiencer; where there is a non-psychological verb which indicates a change of state, such as one of dying or growing, we have the Object; where there is a transfer or movement of something to a person, the receiver as destination is taken as the Goal.
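The three-way reassignment in the quoted passage amounts to a simple decision rule. The sketch below is only an illustration of that rule: the verb-type labels, the function name, and the sample verb give are assumptions introduced here, not part of Fillmore's formulation.

```python
def reassign_former_dative(verb_type):
    """Map a former Dative onto the Case suggested by Fillmore (1971a: 42).

    The verb-type labels are hypothetical names for the three situations
    the quotation distinguishes.
    """
    if verb_type == "psychological":
        # genuine psychological event or mental state verb
        return "Experiencer"
    if verb_type == "change_of_state":
        # non-psychological change of state, e.g. dying or growing
        return "Object"
    if verb_type == "transfer":
        # transfer or movement of something to a person
        return "Goal"
    raise ValueError(f"unclassified verb type: {verb_type!r}")


# Illustrative classification; 'give' is an assumed example of transfer.
VERB_TYPES = {"like": "psychological", "die": "change_of_state", "give": "transfer"}

for verb, verb_type in VERB_TYPES.items():
    print(verb, "->", reassign_former_dative(verb_type))
```

The rule presupposes that every relevant verb falls cleanly into one of the three types, which is exactly where the classification becomes contentious.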
However, as we will see, there are also problems with this revised version. Many investigators recognize an Instrumental Case Relation, as illustrated by the a hammer argument in (26):
(26a) The vandals dented the BMW with a hammer (26b) A hammer dented the BMW
And some (cf. e.g., Fillmore, 1971a: Sect. 9) have recognized a Path: (27) Henry traveled through Celle
Anderson (1971, 1977), however, rejects both of these as Cases, as well as most of the others that have been proposed, in favor of a very restricted set of Case Relations. Much of this disagreement can be understood in terms of diverse interpretations of the distributional and substantive constraints that Case Relations conform to. The nature of these constraints and the differences in their application can perhaps best be appreciated in terms of an examination of a Case on whose validity most researchers seem to be agreed, the Agentive. We can provide Agentive with a distinctive semantic definition, perhaps along the lines of one of those given above, and we can associate phrases thus defined with a distinctive distribution, particularly in relation to their role in the Subject Selection Hierarchy. Occurrence of Agentive also correlates with other semanticosyntactic properties: zero manifestation in imperatives (Kill Albert! etc.) and adverbial selection: (28a) Emma killed Albert in cold blood/deliberately (28b) *Albert died in cold blood/deliberately
where interpretation of (28b) requires some extension of our normal understanding of the meaning and the argument type normally associated with die. Agentives are also, perhaps, “typically animate” (Fillmore, 1968a), even human, as in these examples. Some phrases that share their basic distribution with Agentives like Emma in (28a) are not human, or even animate, however: (29a) Lightning killed Albert (29b) Albert was killed by lightning
and they lack many of the associated properties. This can perhaps be allowed for in terms of Fillmore’s ‘typically animate’; perhaps, to elaborate on this somewhat, lightning in (29) is a nonprototypical Agentive. But what of (30) and the like? (30a) The poison killed Albert (30b) Albert was killed by the poison
For Fillmore (1968a), the poison is an Instrumental, which in the presence of an Agentive (even if not overt) is necessarily marked by with: (31a) Emma killed Albert with the poison (31b) Albert was killed with the poison (by Emma)
But the Case of the poison in (30a) is in fact indeterminate, given Fillmore’s definitions (either A or I), or a nonexistent ambiguity is predicted for its role here. One solution to this dilemma is to suggest that Instrumentals only ever occur with predicates that also take an Agentive (an instrument presupposes an agent) and to regard the poison in (30) as Agentive: it is again a nonprototypical Agentive that we interpret as normally fitting into a frame or scene that includes an (unspecified) ultimate agent. Likewise, there is no distributional reason to recognize a distinct Force Case (Huddleston, 1970) associated with lightning in (29): volition and the capacity to wield an instrument are not necessarily to be attributed to nonprototypical Agentives. The fact that the poison in (30) and (31) is now interpreted as bearing two different Cases is analogous to the situation we associated with the Customer role above: an entity bearing the same role, say Instrument, in a ‘real-world’ situation may be represented linguistically in different ways (cf. [24a] and [25]), in this instance as Agentive or Instrumental. (See Anderson, 1977: Chap. 1; and for a more general description of Instrumentals, Nilsen, 1973.) If the subjects of (29a) and (30a) are not Instrumentals, then prototypical Instrumentals such as those realized in (31) do not participate in Subject and Object Selection; indeed, arguably, they are not an independent component of the subcategorization requirements of any predicate. To put it in its strongest form, it is being claimed that Instrumentals are available with any Agentive predicate. Of course, the class of Instrumental will vary with the class of Agentive predicate: one travels by car rather than with (a) car (unless one is merely accompanying it). 
If the availability of Instrumentals can be allowed for by redundancy, i.e., their occurrence need never be stipulated by the Case Features of any predicate, then they can be removed from the set of primary Cases. They are not Participant but Circumstantial (Halliday, 1967/1968; Anderson, 1977, 1986); not Actants (Tesnière, 1959). The drawing of a distinction between Participant and Circumstantial is the first step in arriving at a delimitation of the set of (Participant) Case Relations. It is only to potential Participants that distributional criteria such as Fillmore’s (1971a) principles of ‘contrast’ and ‘complementarity’ can be fully applied (see below); Circumstantials require a rather different approach. Circumstantials largely correspond to what many Case Grammarians have dubbed Outer Cases. (It is less clear to what extent Circumstantials as characterized here also correspond in general to Fillmore’s Modal Cases.) And perhaps a less controversial
Circumstantial is what Fillmore calls the Outer Locative, as in (32): (32a) Nigel made lots of money in London (32b) In London Nigel made lots of money
The Outer Locative (given appropriate choice of lexical items) can appear with any predicate, and, for example, is readily fronted. Contrast the (inner) Locative in (33): (33a) Nigel kept lots of money in a sock (33b) ?In a sock Nigel kept lots of money
The (inner) Locative is part of the Case Frame (in its absence the lexical item keep has a different sense: ‘retain’), and it can be accompanied by an Outer Locative: (34a) Nigel kept lots of money in a sock in London (34b) In London Nigel kept lots of money in a sock
Example (34b) lacks the additional interpretation, available with (34a), on which in London modifies sock. Anderson (1977: Chap. 1) argues that a number of proposed (Propositional) Cases are Circumstantial. Thus, Time, for instance, as in (35), is always Circumstantial: (35) Brenda left on Tuesday
For Anderson, the temporal phrases in (36) and (37) are, respectively, Source and Goal, and Neutral: (36) The concert lasted from seven to eleven (37) A long period elapsed
These arguments are associated here with verbs that normally require that their Source-and-Goal or Neutral argument involve temporal reference. And Tuesday in (38) is a nonprototypical Experiencer: (38) Tuesday saw Brenda’s departure
Such a proposal is based on a strategy of eliminating, as distinctions in Case, contrasts that are basically signaled elsewhere. As such, it can be said to represent an implementation of Fillmore’s (1971a: 40–41) principle of ‘complementarity.’ Fillmore (1971a: Sect. 3) also offers two assumptions that he terms principles of ‘contrast.’ The second principle, not explicitly formulated as such, is concerned with the establishment of contrasts in Case Relation associated with a single (syntactic) position. Fillmore illustrates this with comparative constructions and with the Subjects of the same predicate in (39): (39a) I am warm (39b) This jacket is warm
(39c) Summer is warm (39d) The room is warm
On one interpretation, I in (39a) is an Experiencer (and warm a psychological predicate); this jacket in (39b) is an Instrument; summer in (39c) is a Time; the room in (39d) is a Locative/Location. These assignments are supported by the recurrence of such distinctions in Subject position with other predicates (Fillmore discusses sad), but it is unclear what the syntactic consequences of some of the posited distinctions might be. Fillmore’s discussion of comparatives (only Noun Phrases of identical Case can be compared) is also inconclusive in this respect, in that it is apparent that many other semantic factors are involved in determining the wellformedness of such constructions. And implementation of ‘complementarity’ would suggest that (39c) and (39d) (at least) do not involve a distinction in Case (rather, of referential domain). Application of a ‘contrast’ principle here necessitates that the rest of the environment be kept constant, and so the possibility of ‘complementarity’ is eliminated. The other principle of ‘contrast’ discussed by Fillmore (1971a) is the one-instance-per-clause principle: a single clause will contain at most one (possibly compound) Noun Phrase associated with a particular Case. This has been generally accepted even outside Case Grammar, as in Chomsky’s Theta Criterion; and it seems to be well supported. The principal area of dissent concerns Neutral, which Anderson (1971; also 1977: Chap. 1) suggests occurs twice in equatives (The guy over there is the man she loves, etc.). He associates this with a further property that has been attributed to Neutral; that it is obligatory with every predicate – what Starosta (1987; also 1978) refers to as Patient Centrality. This principle of ‘contrast’ has been frequently coupled with a companion principle requiring that each NP bear at most one Case Relation (Fillmore, 1968a: 24; cf. again the Theta Criterion). 
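The two constraints just described can be stated as checks over a clause represented as a list of noun phrases paired with the Case Relations they bear. This is a minimal sketch under that assumed representation; the function names are hypothetical, and for (15a) only the Subject argument is shown, with Anderson's double assignment taken from the earlier discussion.

```python
from collections import Counter


def repeated_cases(clause):
    """Case Relations borne by more than one noun phrase in the clause
    (violations of the one-instance-per-clause principle)."""
    counts = Counter(case for _, cases in clause for case in cases)
    return sorted(case for case, n in counts.items() if n > 1)


def multi_case_nps(clause):
    """Noun phrases bearing more than one Case Relation
    (violations of the companion one-Case-per-NP principle)."""
    return [np for np, cases in clause if len(cases) > 1]


# 'Emma killed Albert': one Agentive, one Neutral.
KILL = [("Emma", ("Agentive",)), ("Albert", ("Neutral",))]
# Equative, on Anderson's analysis: Neutral occurs twice.
EQUATIVE = [("the guy over there", ("Neutral",)),
            ("the man she loves", ("Neutral",))]
# (15a), Subject only: Place and simultaneously Neutral (Anderson, 1977).
SWARM_SUBJECT = [("the garden", ("Place", "Neutral"))]

print(repeated_cases(KILL))           # []
print(repeated_cases(EQUATIVE))       # ['Neutral']
print(multi_case_nps(SWARM_SUBJECT))  # ['the garden']
```

Analyses that assign two Case Relations to a single argument reject the second check while typically retaining the first.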
However, it has been argued that this is quite generally inappropriate (not just with respect to specific Cases). At issue are, among other things, sentences like those in (40):
(40a) Bert has moved the bookcase
(40b) The bookcase has moved
(40c) Bert has moved
On this and other areas, such as the see/look Case Frames, see also Huddleston, 1970; Anderson, 1968: App., 1971: Chap. 1. In (40a) Bert and the bookcase are fairly uncontroversially A and N, respectively; and in (40b) the bookcase is once more Neutral. But what of Bert in (40c)? It is clearly Neutral: the referent of Bert undergoes the movement; and its status as such is required by Patient Centrality. But on one interpretation at least, Bert is also the source of the action, Agentive. And there are semantic and syntactic consequences (adverb modification, Unergativity, etc.); some of these are also illustrated by the sentences from Eastern Pomo given in (2) above. For a contrary view, however, see Starosta, 1987, for instance, where this distinction is not represented. Somers (1987: Chap. 8) surveys some of the arguments involved. Starosta (1988: Sect. 4.3) introduces a third ‘case-like’ category (besides Case Forms and Case Relations), namely Macroroles (cf. Foley and van Valin, 1984), of which there are two: Actor and Undergoer. The Actor is the Agent of a transitive clause or the Patient of an intransitive one. These are “established to account primarily for morphosyntactic rather than situational generalizations” (Starosta, 1988: 145). But one might have expected them to allow for such phenomena as others have associated with attribution of more than one Case Relation to a single argument, while maintaining the one-instance-per-NP constraint. However, “it appears that Actor, like Patient, is present in every clause” (Starosta, 1988: 146). Thus, both the bookcase in (40b) and Bert in (40c) would apparently be [+actr, +PAT]. The semantic distinction remains uncaptured. Actor also does not seem to accord well with the syntactic functions Starosta attributes to it. Thus, “the actant which may be omissible in imperatives . . . is the Actor” (Starosta, 1988: 151). But not all intransitive Patients show unmarked imperativization: this is true of (40b) as well as of the Patients associated with verbs such as stumble, blister, etc. under their normal (nonmetaphorical, non-playacting) interpretation. Anderson (1977: Chap. 2, in particular) deploys ‘complementarity’ and nonunary Case assignments for NPs to argue, without recourse to Macroroles, for a very reduced set of (Participant) Cases.
For instance, he suggests that Path is a combination of Source and Goal; that Goal is a variant of Locative (with a predicate that also takes a Source, a directional predicate). Experiencers are interpreted as a combination of Locative with a Case Relation which, uncombined with Locative, characterizes Agentives: a Case Relation he calls Ergative. Thus, the Case Relations in (40) are respectively represented as in (41a–c) and those in (7a) and (39a) as, respectively, (41d, 41e):
(41a) Erg Abs
(41b) Abs
(41c) Erg+Abs
(41d) Erg+Loc Abs
(41e) Erg+Loc+Abs
(where Anderson’s Abs(olutive) is Neutral, the Nom of J. M. Anderson, 1971). He proposes (Anderson, 1977: 115) a set of four Cases, given in (42), with each Case characterized in terms of combinations of the notional features Place and Source, such that Abs is unmarked and Erg is a non-Place Source, source of the event or situation, physical or mental, potentially in control of it:
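As a rough illustration of how four Cases can be characterized by two binary notional features, the sketch below enumerates the combinations of Place and Source. Only the values for Abs (unmarked) and Erg (a non-Place Source) come from the text; the assignments for Loc and Abl, and all function names, are assumptions made here for illustration. The `render` helper writes a clause in the style of (41), on the assumption that `+` joins Cases borne by one argument and a space separates arguments.

```python
from itertools import product

FEATURES = ("place", "source")

CASES = {
    frozenset(): "Abs",                     # unmarked (from the text)
    frozenset({"source"}): "Erg",           # non-Place Source (from the text)
    frozenset({"place"}): "Loc",            # ASSUMPTION: Place only
    frozenset({"place", "source"}): "Abl",  # ASSUMPTION: Place and Source
}

def feature_bundles():
    """Yield every combination of the two binary features."""
    for values in product((False, True), repeat=len(FEATURES)):
        yield frozenset(f for f, on in zip(FEATURES, values) if on)

def case_label(bundle):
    """Look up the Case label for a set of features."""
    return CASES[frozenset(bundle)]

def render(arguments):
    """Render a clause: '+' joins Cases on one argument, spaces separate arguments."""
    return " ".join("+".join(cases) for cases in arguments)

for bundle in feature_bundles():
    print(sorted(bundle), "->", case_label(bundle))

# An Experiencer argument (Erg+Loc) alongside an Abs argument, as in (41d).
print(render([("Erg", "Loc"), ("Abs",)]))
```

Two binary features yield exactly four bundles, which is why a two-feature system fits a four-Case inventory with nothing left over.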
See also, more recently, Ostler, 1980. These characterizations incorporate a general substantive principle determining the character of Case Relations, to complement the distributional and individual properties mentioned above. They instantiate one articulation of the Localist Hypothesis (cf. Localism), whose earlier history is charted by Hjelmslev (1935/1937), in terms of which the domain of Case is structured by components utilized in our perception of spatial situations: there are no necessarily ‘abstract’ Case Relations. The hypothesis is one attempt to provide a general definition of Case, avoiding the problems of uncertainty and overlap associated with notional definitions particular to individual Cases. The importance of a nonparticularist approach to Case is forcibly expressed by Hjelmslev (1935: 4): Délimiter exactement une catégorie est impossible sans une idée précise sur les faits de signification. Il ne suffit pas d’avoir des idées sur les significations de chacune des formes entrant dans la catégorie. Il faut pouvoir indiquer la signification de la catégorie prise dans son ensemble. [It is impossible to delimit a category exactly without a precise idea of the facts of meaning. It is not enough to have ideas about the meanings of each of the forms that belong to the category. One must be able to state the meaning of the category taken as a whole.]
Apart from within the Localist tradition (including its partial, unwitting adoption by Gruber, 1965 and Jackendoff, 1976, 1983), it is only recently, with developments in Cognitive Grammar (see Cognitive Grammar) and elsewhere, that such a viewpoint has come to the forefront of the mainstream structuralist linguistic tradition. The postulation of a universal theory of Case is, of course, not to say that the ‘same’ situation will be expressed in terms of the same Case Frame in different languages, or that what can be an Erg in language X will necessarily correspond to an Erg in language Y (cf. e.g., Dahl, 1987). The English Experiencer (Erg+Loc) in (42) is alien to a large number of languages, for instance. Rather, these Relations form the basis for constructing clause structures in any language, and their applicability is limited primarily,
within the Localist tradition, by the spatial prototypes with which they are associated. Among the many uncertainties and contentious issues surrounding notions of Case, some of which are surveyed here, perhaps the least explored is the character and status of Circumstantials. Anderson (1986) suggests that, despite their apparent diversity, the set of Circumstantials can be described using the same set of (combinations of) Case Relations as are appropriate to distinguishing Participants, and that the hierarchy of Circumstantials in terms of their closeness of relation to the central proposition is associated with the specificity of the verb class with which they are compatible. Thus, Instrumentals, in so far as they are compatible only with Agentive verbs, are more tightly integrated than Outer Locatives or circumstantial time phrases. But here in particular, much research remains to be done (in any framework). For some discussion, see again, e.g., Somers, 1987: Chap. 1. See also: Case; Cognitive Grammar; Dependency Grammar; Frame Analysis; Functional Grammar: Martinet; Generative Semantics; Lexicase; Minimalism; Semantics in Role and Reference Grammar; Valency Grammar.
Bibliography
Abraham W (ed.) (1971). Kasustheorie. Frankfurt: Athenäum.
Abraham W (ed.) (1978). Valence, semantic case and grammatical relations. Amsterdam: John Benjamins.
Anderson J M (1968). ‘Ergative and nominative in English.’ Journal of Linguistics 4, 1–32.
Anderson J M (1971). The grammar of case: towards a localistic theory. Cambridge: Cambridge University Press.
Anderson J M (1977). On case grammar: prolegomena to a theory of grammatical relations. London: Croom Helm.
Anderson J M (1978). ‘On the derivative status of grammatical relations.’ In Abraham W (ed.). 661–694.
Anderson J M (1980). ‘Anti-unaccusative, or: relational grammar is case grammar.’ Revue roumaine de linguistique 25, 193–225.
Anderson J M (1982). ‘Analysis and levels of linguistic description.’ In Siciliani E, Barone R & Aston G (eds.) La lingua inglese nell’università. Bari: Adriatica. 3–26.
Anderson J M (1984). Case grammar and the lexicon. University of Ulster occasional papers in linguistics and language learning. Coleraine.
Anderson J M (1986). ‘Structural analogy and case grammar.’ Lingua 70, 79–129.
Anderson J M (1987). ‘Case grammar and the localist hypothesis.’ In Dirven R & Radden G (eds.). 103–121.
Anderson J M (1989). ‘Reflexions on notional grammar, with some remarks on its relevance to issues in the analysis of English and its history.’ In Arnold D J et al. (eds.) Essays on grammatical theory and universal grammar. Oxford: Oxford University Press. 13–36.
Anderson J M (1991). ‘Notional grammar and the redundancy of syntax.’ Studies in Language 15, 301–333.
Anderson J M (1997). A notional theory of syntactic categories. Cambridge: Cambridge University Press.
Anderson J M & Dubois-Charlier F (eds.) (1975). La grammaire des cas (Langages 38). Paris: Didier-Larousse.
Anderson S R (1971). ‘On the role of deep structure in semantic interpretation.’ Foundations of Language 7, 387–396.
Anderson S R (1988). ‘Objects (direct and not so direct) in English and elsewhere.’ In Duncan-Rose C & Vennemann T (eds.) On language: rhetorica, phonologica, syntactica. A festschrift for Robert P. Stockwell from his friends and colleagues. London: Routledge. 287–314.
Böhm R (1998). ‘De-activated participants: notional grammar, dependency and (anti)passives.’ In Boeder W, Schroeder C, Heinz Wagner K & Wildgen W (eds.) Sprache in Raum und Zeit: In memoriam Johannes Bechert, 2. Tübingen: Gunter Narr. 19–49.
Chafe W (1970). Meaning and the structure of language. Chicago: University of Chicago Press.
Chapin P G (1972). ‘Review of R. P. Stockwell, P Schachter & B H Partee Integration of Transformational Theories on English Syntax, Los Angeles: University of California, Los Angeles, 1968.’ Language 48, 645–667.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge: MIT Press.
Chomsky N (1972). ‘Some empirical issues in the theory of transformational grammar.’ In Peters S (ed.) Goals of linguistic theory. New York: Holt, Rinehart & Winston. 63–130 [Reprinted in Studies on semantics in generative grammar. The Hague: Mouton. 1972].
Chomsky N (1981). Lectures on government and binding: The Pisa lectures. Dordrecht: Foris.
Chomsky N (1995). The minimalist program. Cambridge: MIT Press.
Cook W A (1971). ‘Case grammar as deep structure in tagmemic analysis.’ Languages and Linguistics Working Papers, Georgetown University 2, 1–9 [reprinted in Cook (1979), 28–35].
Cook W A (1972a). ‘A set of postulates for case grammar analysis.’ Languages and Linguistics Working Papers, Georgetown University 4, 35–49 [reprinted in Cook (1979), 36–49].
Cook W A (1972b). ‘A case grammar matrix.’ Languages and Linguistics Working Papers, Georgetown University 6, 15–47 [reprinted in Cook (1979), 50–81].
Cook W A (1973). ‘Covert case roles.’ Languages and Linguistics Working Papers, Georgetown University 7, 52–81 [reprinted in Cook (1979), 82–108].
Cook W A (1978). ‘A case grammar matrix model (and its application to a Hemingway text).’ In Abraham W (ed.). 296–309.
Cook W A (1979). Case grammar: development of the matrix model (1970–1978). Washington DC: Georgetown University Press.
Dahl Ö (1987). ‘Case grammar and prototypes.’ In Dirven R & Radden G (eds.). 147–161.
Dik S C (1978). Functional grammar. Amsterdam: North-Holland [3rd printing 1981, Dordrecht: Foris].
Dik S C (1987). ‘Some principles of functional grammar.’ In Dirven R & Radden G (eds.). 37–53.
Dirven R & Radden G (eds.) (1987). Concepts of case. Tübingen: Gunter Narr.
Dowty D (1989). ‘On the semantic content of the notion of “thematic role”.’ In Chierchia G, Partee B H & Turner R (eds.) Properties, types and meanings 2, semantic issues. Dordrecht: Kluwer. 69–129.
Fillmore C J (1965). ‘Toward a modern theory of case.’ Project on Linguistic Analysis, Ohio State University 13, 1–24 [reprinted in Reibel D A & Schane S A (eds.) Modern studies in English. Englewood Cliffs, NJ: Prentice-Hall, 1969, 361–375].
Fillmore C J (1966). A proposal concerning English prepositions. Monograph Series on Languages and Linguistics, Georgetown University 19, 19–33.
Fillmore C J (1968a). ‘The case for case.’ In Bach E & Harms R T (eds.) Universals in linguistic theory. New York: Holt, Rinehart & Winston. 1–88 [reprinted, in German translation, in Abraham (ed.) 1971].
Fillmore C J (1968b). ‘Lexical entries for verbs.’ Foundations of Language 4, 373–393.
Fillmore C J (1970). ‘The grammar of hitting and breaking.’ In Jacobs R A & Rosenbaum P S (eds.) Readings in English transformational grammar. Waltham, MA: Ginn. 120–133.
Fillmore C J (1971a). Some problems for case grammar. Monograph Series on Languages and Linguistics, Georgetown University 23, 35–56 [reprinted, in French translation, in Anderson & Dubois (eds.) 1975. 65–80].
Fillmore C J (1971b). ‘Types of lexical information.’ In Steinberg D D & Jacobovitz L A (eds.) Semantics: an interdisciplinary reader. Cambridge: Cambridge University Press. 370–392 [Also in Kiefer F (ed.) Studies in Syntax and Semantics. Reidel, Dordrecht. 109–137].
Fillmore C J (1971c). ‘Verbs of judging.’ In Fillmore C J & Langendoen D T (eds.) Studies in linguistic semantics. New York: Holt, Rinehart & Winston. 273–289.
Fillmore C J (1972). ‘Subjects, speakers and roles.’ In Davidson D A & Harman G H (eds.) Semantics of natural language. Dordrecht: Reidel. 273–289.
Fillmore C J (1977a). ‘The case for case reopened.’ In Cole P & Sadock J (eds.) Syntax and semantics 8, grammatical relations. New York: Academic Press. 3–26.
Fillmore C J (1977b). ‘Topics in lexical semantics.’ In Cole R W (ed.) Current issues in linguistic theory. Bloomington: Indiana University Press. 76–138.
Fillmore C J (1987). ‘A private history of the concept “frame”.’ In Dirven R & Radden G (eds.). 28–36.
Foley W A & van Valin R D (1984). Functional syntax and universal grammar. Cambridge: Cambridge University Press.
Grimshaw J (1990). Argument structure. Cambridge: MIT Press.
Gruber J S (1965). ‘Studies in lexical relations.’ Ph.D. diss. MIT.
Hale K & Keyser S J (2002). Prolegomenon to a theory of argument structure. Cambridge: MIT Press.
Halliday M A K (1967/1968). ‘Notes on transitivity and theme.’ Journal of Linguistics 3, 37–81; 4, 179–215.
Hjelmslev L (1935/1937). ‘La catégorie des cas.’ Acta Jutlandica 7, i–xii, 1–184; 9, i–vii, 1–78 [reprinted 1972, Munich: Fink].
Huddleston R D (1970). ‘Some remarks on case grammar.’ Linguistic Inquiry 1, 501–511.
Jackendoff R S (1976). ‘Toward an explanatory semantic representation.’ Linguistic Inquiry 7, 89–150.
Jackendoff R S (1983). Semantics and cognition. Cambridge: MIT Press.
Katz J J (1972). Semantic theory. New York: Harper & Row.
Longacre R E (1976). An anatomy of speech notions. Lisse: Peter de Ridder.
Lyons J (1968). Introduction to theoretical linguistics. Cambridge: Cambridge University Press.
Miller J E (1985). Semantics and syntax. Cambridge: Cambridge University Press.
Nilsen D L F (1972). Toward a semantic specification of deep case. The Hague: Mouton.
Nilsen D L F (1973). The instrumental case in English: syntactic and semantic considerations. The Hague: Mouton.
Ostler N (1980). A theory of case linking and agreement. Indiana University Linguistics Club.
Platt J T (1971). Grammatical form and grammatical meaning: a tagmemic view of Fillmore’s deep structure case concepts. Amsterdam: North-Holland.
Radden G (1978). ‘Can “area” be taken out of the wastebasket?’ In Abraham W (ed.). 327–338.
Robinson J J (1970). ‘Case, category and configuration.’ Journal of Linguistics 6, 57–80.
Somers H L (1987). Valency and case in computational linguistics. Edinburgh: Edinburgh University Press.
Starosta S (1973). ‘The faces of case.’ Language Sciences 25, 1–14 [reprinted, in French translation, in Anderson J M & Dubois-Charlier F (eds.) 1975. 104–128].
Starosta S (1978). ‘The one per Sent solution.’ In Abraham W (ed.). 459–576.
Starosta S (1987). ‘A place for (lexi-)case.’ In Dirven R & Radden G (eds.). 54–74.
Starosta S (1988). The case for lexicase: an outline of lexicase grammatical theory. London, New York: Pinter.
Stockwell R P, Schachter P & Partee B (1973). The major syntactic structures of English. New York: Holt, Rinehart & Winston.
Tarvainen K (1987). ‘Semantic cases in the framework of dependency theory.’ In Dirven R & Radden G (eds.). 75–102.
Tesnière L (1959). Éléments de syntaxe structurale. Paris: Klincksieck.
Welte W (1987). ‘On the concept of case in traditional grammars.’ In Dirven R & Radden G (eds.). 15–27.
Wilkins W (ed.) (1988). Syntax and semantics 21, thematic relations. New York: Academic Press.
Castrén, Matthias Alexander (1813–1852)
F Karlsson, University of Helsinki, Helsinki, Finland
© 2006 Elsevier Ltd. All rights reserved.
Matthias Alexander Castrén (1813–1852) was the founder of scholarly Uralic (Finno-Ugric and Samoyed) studies and an early pioneer of linguistic field work and ethnolinguistics. Castrén was born in Finland in southern Lapland and went to school in Oulu. He studied classical and Oriental languages at the University of Helsinki and obtained his Cand. Phil. degree in 1836. He was a good friend of the central characters of the national awakening movement in Finland, Johan Ludvig Runeberg, Elias Lönnrot, and Zacharias Topelius, and this affected his course of life. But the person who influenced him the most was A. J. Sjögren (1794–1855), an academician in the Imperial Academy in St. Petersburg. Sjögren had himself made long field trips during most of the 1820s to the western Finno-Ugric populations in Russia, and he inspired Castrén to extend the explorations eastward. Sjögren also was instrumental in arranging funding from the Academy.
Castrén had already done fieldwork in 1838–1839 in Lapland and Karelia, mainly to collect mythology and folklore. In 1841–1844 he conducted his first long field trip, to western and central Russia. Castrén studied Yurak Samoyed (Nenets) in Arkhangel’sk and then continued along the Pechora to the Komis (Zyryans). His grammar of Komi (Komi-Zyrian) (Elementa grammatices Syrjaenae) was published in 1844. Castrén crossed the Urals and ended up in Obdorsk, where his health started to falter and he was diagnosed with tuberculosis of the lungs. During this long trip he became convinced that Finnish and Samoyed were related. In 1845 his basic grammar of Mari, Low (Cheremis) appeared.
Castrén’s decisive journey lasted from 1845 to 1849 and covered an area extending over Siberia beyond the Yenissei and Ob’, from the coastal tundra of the Arctic Sea all the way to the Altay and Sayan mountains, even transgressing the Chinese border. In addition to extensive archeological and ethnographic material, he collected data on all Samoyed languages, including Kamassian (Kamas), of which he found the last 150 speakers east of Krasnoyarsk, Khanty (Ostyak), the isolated and highly complex Ket (Yenissei-Ostyak) language and the related Kott language, two dialects of Turkic, Buryat Mongolian (Buriat, Mongolia), and Evenki (Tungus). All in all, Castrén traveled more than 50 000 km.
Castrén’s first comparative study (1839) treated nominal inflection in Finnish, Estonian, and Sámi. He formulated a program for further studies in the field. Castrén rejects the old speculations concerning the biblical origin of Finnish and proceeds to a concrete comparison of morphological categories. He deals with morphophonological alternations like vowel harmony and consonant gradation (see Finnish) as well as number and case inflection. He was the first to demonstrate that the stop alternations in Finnish, Estonian, and Sámi depend in part upon the structure of the following syllable (open or closed).
Castrén’s most important contribution is his comparative grammar of the Samoyed languages. This 600-page posthumous work contains descriptions of all the five Samoyed languages investigated by Castrén, especially their sounds and morphological properties, and provides outlines of the development of the languages and their hypothesized common source. This work constitutes the foundation of comparative Uralic studies. Castrén’s study (1850) of pronominal suffixes in Finno-Ugric, Samoyed, Turkic, Mongolian, and Tungusic languages attempts to show that these languages are related and form the Altaic language family. However, posterity has not accepted this hypothesis.
In 1851, Castrén was appointed the first professor of Finnish at the University of Helsinki, but after only one year in office he died of complications related to his tuberculosis. Even upon his deathbed he struggled with his comparative Samoyed grammar, which (along with many of his other writings) was published posthumously by the academician Anton Schiefner in St. Petersburg (Castrén, 1853–1862).
See also: Anthropological Linguistics: Overview; Finnish; Uralic Languages.
Bibliography
Castrén M A (1853–1862). Nordische Reisen und Forschungen. Im Auftrage der Kaiserlichen Akademie der Wissenschaften herausgegeben von Anton Schiefner. St. Petersburg. 1–7.
Estlander B (1929). Mathias Aleksanteri Castrén. Hänen matkansa ja tutkimuksensa. Helsinki: Otava. [Also in Swedish.]
Hovdhaugen E, Karlsson F, Henriksen C & Sigurd B (2000). The history of linguistics in the Nordic countries. Jyväskylä: Societas Scientiarum Fennica.
Korhonen M (1986). Finno-Ugrian language studies in Finland 1828–1918. The history of learning and science in Finland 1828–1918. Helsinki: Societas Scientiarum Fennica.
Catalan
M W Wheeler, University of Sussex, Brighton, UK
© 2006 Elsevier Ltd. All rights reserved.
Geography and Demography
The territories where Catalan is natively spoken cover 68 730 km², of which 93% lies within Spain (see Figure 1). They are:
1. The Principality of Andorra
2. In France: North Catalonia – almost all of the département of Pyrénées-Orientales
3. In Spain: Catalonia, except for the Gascon-speaking Vall d’Aran; the eastern fringe of Aragon; most of Valencia (the Comunitat Valenciana), excepting some regions in the west and south that have been Aragonese/Spanish-speaking since at least the 18th century; El Carxe, a small area of the province of Murcia, settled in the 19th century; and the Balearic Islands
4. In Italy: the port of Alghero (Catalan L’Alguer) in Sardinia
Table 1 shows the population of these territories (those over 2 years of age in Spain) and the percentages of the inhabitants who can understand, speak, and write Catalan. Information is derived from the 2001 census in Spain together with surveys and other estimates; the latter are the only sources of language data in France and Italy. The total number of speakers of Catalan is a little under 7.5 million. Partly as a result of the incorporation of Catalan locally into the education system, there are within Spain a significant number of second-language speakers who are included in this total. Virtually all speakers of Catalan are bilingual, using also the major language of the state they live in. (Andorrans are bilingual in Spanish or French, or are trilingual.)
Genetic Relationship and Typological Features
Catalan is a member of the Romance family and a fairly prototypical one, as befits its geographically central position in the European Romance area. Some particularly noteworthy characteristics are pointed out here (for more details see Wheeler, 1988). In historical phonology, note the palatalization of initial /l-/ and loss of stem-final /n/ that became word final, for example, LEONEM > lleó [ʎeˈo] ‘lion.’ Original intervocalic -C0 -, -TJ-, -D- became /w/ in word-final position and were lost elsewhere, for example, PLACET > plau [ˈplaw] ‘please.3.SING,’ PLACEMUS > plaem [pleˈem] ‘please.1.PL.’ As the previous examples also illustrate, posttonic nonlow vowels were lost, so that a dominant pattern of phonological words is of consonant-final oxytones. The full range of common Romance verbal inflection is retained, including inflected future (sentirà ‘hear.3.SING.FUT’), widely used subjunctives, and a contrast between present perfect (ha sentit ‘has heard’) and past perfective (sentí ‘heard.3.SING.PERF’). In addition to the inherited past perfective form, now largely literary, Catalan developed a periphrastic past perfective using an auxiliary that was originally the present of ‘go’ (va sentir ‘AUX.PERF.3.SING hear.INF’). In some varieties of Catalan, this construction has developed a subjunctive (vagi sentir ‘AUX.PERF.SUBJ.3.SING hear.INF’), introducing, uniquely in Romance, a perfective/imperfective aspect distinction in the subjunctive. Considerable use is made of pronominal and adverbial clitics that attach to verb forms in direct and indirect object functions or partitive or adverbial functions, quite often in clusters of two or three, as in (1).
(1) us n’hi envi-en
    2.PL.OBJ PART.LOC send-3.PL
    ‘they send some to you (PL) there’
Most of the pronominal/adverbial clitics have several contextually conditioned forms; thus, the partitive clitic shows variants en ~ n’ ~ -ne. Clitic climbing is commonly found with a pronominal complement of a verb that is itself the complement of a (semantic) modal, as in (2). This example also shows the (optional) gender agreement of a perfect participle with a preceding direct object clitic.
(2) no l’he sab-ud-a agafa-r
    not DO.3.SING.F.have.1.SING know-PART-F catch-INF
    ‘I haven’t been able to catch it (FEM)’
Figure 1 Catalan-speaking areas and dialects.
A fair number of items in the basic vocabulary are etymologically distinct from the corresponding terms in neighboring Romance languages, for example, estimar ‘to love,’ ganivet ‘knife,’ gens ‘not at all,’ massa ‘too,’ pujar ‘to go up,’ tardor ‘autumn,’ and tou ‘soft.’
Table 1 Catalan language demography and competences (columns: Territory; Population; Understand Catalan (%); Speak Catalan (%); Write Catalan (%). Rows: Andorra; North Catalonia; Catalonia; Aragon fringe; Valencia; Balearics; Alghero/L’Alguer; Total)

Dialects
Although there are significant dialect differences in Catalan, the dialects are to a high degree mutually intelligible. They are conventionally divided into two groups, on the basis of differences in phonology as well as some significant features of verb morphology; there are some interesting lexical differences, too. The eastern dialect group (see Figure 1) includes North Catalan or rossellonès (in France), central Catalan (in the eastern part of Catalonia), Balearic, and alguerès (in Alghero/L’Alguer). The western group consists of Northwestern Catalan (western and southern Catalonia and eastern Aragon) and Valencian. The main diagnostic heterogloss distinguishing the two major dialect groups involves vowel reduction in unstressed syllables: In the eastern dialects /a/ is pronounced [ə] in unstressed syllables and, with some exceptions, /e/ and /ɛ/ are also reduced to [ə], whereas /o/ and /ɔ/ are reduced to [u].
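The eastern vowel reduction pattern can be sketched as a simple mapping applied to unstressed syllables, writing the reduced central vowel as [ə] and the open mid vowels as /ɛ/ and /ɔ/. This is a minimal illustration, not a full phonological model: the input format (a list of syllables plus the index of the stressed one) and the function name are assumptions made here, the sample transcriptions are simplified, and the exceptions mentioned in the text are ignored.

```python
# Unstressed-vowel reduction in the eastern dialect group (exceptions ignored).
EASTERN_REDUCTION = {"a": "ə", "e": "ə", "ɛ": "ə", "o": "u", "ɔ": "u"}

def reduce_unstressed(syllables, stressed_index, eastern=True):
    """Return the syllables with unstressed vowels reduced.

    syllables: list of strings of phoneme symbols (hypothetical input format).
    stressed_index: position of the stressed syllable, which is left intact.
    eastern: if False, no reduction applies (western group).
    """
    if not eastern:
        return list(syllables)
    result = []
    for i, syllable in enumerate(syllables):
        if i == stressed_index:
            result.append(syllable)
        else:
            result.append("".join(EASTERN_REDUCTION.get(ch, ch) for ch in syllable))
    return result

# Illustrative inputs in simplified phonemic form.
print(reduce_unstressed(["ka", "za"], 0))                 # final /a/ is unstressed
print(reduce_unstressed(["po", "za"], 1))                 # initial /o/ is unstressed
print(reduce_unstressed(["po", "za"], 1, eastern=False))  # western: unchanged
```

Keeping the rule as a lookup table makes the diagnostic contrast between the two dialect groups a single switch: the western group simply skips the mapping.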
History
Catalan is a variety of Latin that developed originally on a small territory on either side of the eastern Pyrenees. Expansion of this territory, the Marca Hispanica of the Carolingian empire, is associated with a process of developing political independence, beginning with the separation (A.D. 988) of the county of Barcelona from the trunk of the Carolingian domain. Eventual fusion with the crown of Aragon (1162) gave new momentum to this projection. In 1151, a treaty between the kings of Aragon and Castile had carved up the future conquest of territories then under Arab control, so that Valencia would fall to the crown of Aragon while lands further west would be attached to Castile. The kingdom of Valencia was captured in the 1230s and was populated by speakers from various parts of Catalonia and Aragon, although a numerous subordinate population of Arabic-speaking moriscos, as they were called, remained until their expulsion in 1609. The Balearic Islands were conquered between 1229 and 1287 and were resettled by speakers largely from eastern Catalonia. Sicily was also captured for the house of Barcelona (1282), as was Sardinia (1323–1327); Catalan was widely used as an official language in Sicily until the 15th century and in Sardinia until the 17th century. In
Sardinia, only the port of Alghero was subject to Catalan resettlement, and it has remained Catalan-speaking to the present day. The original expansion southward of Catalan following the reconquest extended as far as Murcia and Cartagena, although the kingdom of Murcia became Spanish-speaking during the 15th century. The chancellery of the kingdom of Aragon was trilingual, using Latin, Catalan, and Aragonese as the occasion required. A substantial body of Catalan literature in various prose and verse genres was produced before decline set in during the 16th century. In 15th-century Valencia the court was already bilingual, and after the merger of the Aragonese and Castilian crowns in 1479 Spanish (Castilian) gradually increased in prestige throughout the Catalan territories, with the urban and literate classes becoming bilingual. From the 16th century, Catalan came increasingly under Spanish influence in vocabulary, syntax, pronunciation, and orthography as a result of the social and cultural prestige of Castile. It was not until the 19th century that a substantial Catalan literary and cultural revival took place, which continues to the present. Standardization of the modern language was achieved in the early 20th century. Since the Second World War, most of the Catalan-speaking territories have experienced a substantial immigration of non-Catalan speakers. In France, these have been pieds noirs resettled from Algeria and retired people from various parts of France. In Catalonia and Valencia, the population almost doubled between 1950 and 1975 as people from less-developed southern Spain sought employment in the manufacturing and service industries. Majorca and Ibiza (Eivissa) have attracted a workforce from many parts of Spain, feeding the tourist industry. Many immigrants have wished to acquire Catalan, or at least have wished their children to do so, as an aid to integration, but until the late 1970s there were few opportunities to realize this. These large Spanish-speaking communities have added to the institutional
and cultural pressures in favor of the use of Spanish in the Catalan territories. In 1659, Philip IV of Spain ceded the northern part of Catalonia (essentially the modern département of Pyrénées-Orientales) to the French crown. From that point, North Catalonia became subject to the linguistic unification policies of the French state. French became the official language in 1700 and has had a marked influence on the vocabulary of North Catalan and, in recent times, on its phonology as well. Minorca was under British rule during most of the 18th century, and there is a handful of Minorcan Anglicisms in the vocabulary dating from that period. The dialect of Alghero is, not surprisingly, heavily influenced by Sardinian and even more so by Italian in all components of the language.
Present Sociolinguistic Situation The status, situation, and prospects of the Catalan language are significantly different in each of the territories in which it is spoken, although each of those in Spain shares, in some way, the consequences of Catalan’s having been for centuries an oppressed minority language. The cultural decline and loss of prestige affecting Catalan from the 16th century onward has already been mentioned. The defeat of the Catalans in the war of the Spanish Succession (1714) initiated a series of measures, extending throughout the 18th and 19th centuries, that imposed the use of Spanish in public life, for example, in accounts, in preaching, in the theater, in the criminal courts, in education, in legal documents, in the civil registers, and on the telephone. In the 20th century, these measures were mostly repeated and supplemented by the imposition of Spanish in catechism, by the prohibition of the teaching of Catalan, and by sanctions against people refusing to use Spanish. The Second Republic (1931–1939) to a large extent removed these restrictions, but Franco’s victory in the Spanish Civil War was followed in 1940 by a total ban on the public use of Catalan. Despite a gradual relaxation allowing some publication of books and magazines, Catalan remained excluded from nearly all public institutions until Spain’s adoption of a democratic constitution in 1978. In the early 1980s, Catalonia, Valencia, and the Balearics obtained their statutes of autonomy, involving co-official status for Spanish and Catalan. All of these statutes promote language normalization, the goal of which is universal bilingualism without diglossia. In Catalonia, the expressed aim of the Generalitat (the autonomous government) goes further than this: It seeks to make the local language the normal medium of public life, with Spanish having a secondary
role as an auxiliary language or a home language for its native speakers. In Catalonia, the teaching of Catalan is obligatory in all schools, and primary and secondary education through the medium of Catalan now reaches at least 60% of the population. In Valencia and the Balearics, the de facto policy has been to promote effective knowledge of Catalan through education and to enhance its status while largely preserving a diglossic relationship between Spanish and Catalan. In Valencia, significant political forces reject the name Catalan for the local language and insist on the term Valencian. The Balearic Islands Council passed a linguistic normalization law in 1986, but progress has been inconsistent, although Catalan is widely available in the education system, which includes some Catalan-medium education. In Andorra, Catalan has always been the sole official language. In 1993, Andorra adopted a new constitution, and the government has been pursuing an active Andorranization policy, involving Catalan-medium education. The status of Catalan in North Catalonia is parallel to that of the other traditional minority languages in France. Language shift was all but universal after the Second World War, so that most native speakers are (as of 2004) over 60 years old. Catalan has at best an occasional, decorative role in public life. In primary schools, some 30% of pupils study Catalan (as a foreign language) and, in secondary schools, some 15%. The current trend is for intergenerational language shift away from Catalan in French Catalonia, in Alghero, in southern Valencia around Alicante (Alacant), and possibly in Palma (Majorca). Elsewhere, Catalan is holding its own, with some evidence of intergenerational shift toward Catalan in Catalonia.
See also: Andorra: Language Situation; France: Language Situation; Indo–European Languages; Italy: Language Situation; Romance Languages; Spain: Language Situation; Spanish.
Bibliography
Badia i Margarit A M (1951). Gramática histórica catalana. Barcelona, Spain: Noguer. [Catalan translation Gramàtica històrica catalana. Valencia: 3 i 4, 1981.]
Moll F de B (1952). Gramática histórica catalana. Madrid: Gredos. [Catalan translation Gramàtica històrica catalana. Valencia: Universitat, 1991.]
Nadal J M & Prats M (1982–1996). Història de la llengua catalana (2 vols.). Barcelona, Spain: Edicions 62.
Pradilla M À (ed.) (1999). La llengua catalana al tombant del mil·lenni: aproximació sociolingüística. Barcelona, Spain: Empúries.
Solà J, Lloret M R, Mascaró J & Pérez Saldanya M (eds.) (2002). Gramàtica del català contemporani (3 vols.). Barcelona, Spain: Empúries.
Wheeler M W (1988). ‘Catalan.’ In Harris M & Vincent N (eds.) The Romance languages. London: Routledge. 170–208.
Wheeler M W (in press). The phonology of Catalan. Oxford: Oxford University Press.
Wheeler M W, Yates A & Dols N (1999). Catalan: a comprehensive grammar. London: Routledge.
Categorial Grammars: Deductive Approaches
G Morrill, Polytechnic University of Catalonia, Barcelona, Spain
© 2006 Elsevier Ltd. All rights reserved.

Introduction
According to Frege, it is certain possibly complex expressions, and not in general the words, which are the primary bearers of meaning. Thus, while phrase structure grammar classifies words and phrases by atomic categories or types, what Bar-Hillel (1964) dubbed ‘categorial grammar’ is characterized by the classification of words and phrases into atomic and complex fractional types according to their completeness or incompleteness as expressions (Husserl, 1913; Ajdukiewicz, 1935; Bar-Hillel, 1953). Lambek (1958) gave a calculus in this spirit for which was provided a Gentzen-style sequent system. Deductive approaches to categorial grammar were thus born. In the next section we present an introduction to categorial grammar. In the section ‘Deductive Systems’ we review categorical calculus, sequent calculus, natural deduction, and proof nets. The technical appendix provides a contemporary definition of categorial formalism.

Categorial Grammar
In Categorial Grammar (see also Combinatory Categorial Grammar) the categories or types by which linguistic expressions are classified are defined recursively on the basis of a small set of atomic types by means of two operators, / (over) and \ (under). Atomic types (for example, S for declarative sentence, N for referring nominal and CN for count noun) are types, and if A and B are types, so are the functor types B/A and A\B. Expressions of type B/A are those which concatenate with argument As on the right to form Bs and expressions of type A\B are those which concatenate with argument As on the left to form Bs. (Some authors write B\A for A\B; we keep to the original notation by which cancellation is under adjacency.) Thus, for example, whereas phrase structure grammar might have a rule N → DefArt CN showing that a definite article combines with a count noun to form a referring nominal, categorial grammar may express the same information by assigning a definite article the functor type N/CN, showing that it combines with a count noun on the right to form a referring nominal. Let us write α: A to indicate that the expression α is of type A. Then a categorial lexicon might include the following type assignments:
(1) cat: CN
    Mary: N
    likes: (N\S)/N
    sleeps: N\S
    that: (CN\CN)/(S/N)
    the: N/CN
What type assignments follow from what? It is easy to see that the following are valid, where concatenation is indicated by +:
(2)
Furthermore, where a is a variable and coindexed overline indicates the withdrawal of a type assignment statement, the following are valid:
(3)
In the rules E stands for elimination, because the operator is eliminated reading from premises to conclusion, and I stands for introduction, because the operator is introduced reading from premises to conclusion. There are the following derivations of the sentence ‘the cat sleeps’ and the relative clause ‘that Mary likes.’
(4)
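The elimination rules described above (a B/A combines with an A on its right, an A\B with an A on its left) can be illustrated with a small recognizer. This is a minimal sketch, not the article's formal system: the class names `Atom`, `Over`, `Under` and the functions `combine` and `types_of` are hypothetical, and the chart uses elimination only, so expressions that need the introduction rules (such as the relative clause ‘that Mary likes’) are deliberately out of its reach.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Over:       # B/A: combines with an A on its right to form a B
    result: "Type"
    arg: "Type"


@dataclass(frozen=True)
class Under:      # A\B: combines with an A on its left to form a B
    arg: "Type"
    result: "Type"


Type = Union[Atom, Over, Under]

S, N, CN = Atom("S"), Atom("N"), Atom("CN")

# The lexicon of (1).
LEXICON = {
    "cat": CN,
    "Mary": N,
    "likes": Over(Under(N, S), N),
    "sleeps": Under(N, S),
    "that": Over(Under(CN, CN), Over(S, N)),
    "the": Over(N, CN),
}


def combine(left, right):
    """Types obtainable from two adjacent types by slash or backslash elimination."""
    out = []
    if isinstance(left, Over) and left.arg == right:
        out.append(left.result)       # B/A + A gives B
    if isinstance(right, Under) and right.arg == left:
        out.append(right.result)      # A + A\B gives B
    return out


def types_of(words):
    """CKY-style chart: every type derivable for the word string by elimination only."""
    n = len(words)
    chart = {(i, i + 1): {LEXICON[w]} for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = set()
            for j in range(i + 1, k):
                for left in chart[(i, j)]:
                    for right in chart[(j, k)]:
                        cell.update(combine(left, right))
            chart[(i, k)] = cell
    return chart[(0, n)]
```

With this sketch, `types_of("the cat sleeps".split())` yields the single type S, whereas `types_of("that Mary likes".split())` is empty, which is why the introduction rules and the deductive systems of the following sections are needed.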
(5)
These will be our running examples in the presentation of deductive systems in the next section. By way of motivation of categorial grammar, consider right node raising and left node raising coordination:
(6a) John likes and Mary dislikes London.
(6b) John showed Mary Paris and Suzy Berlin.
The conjuncts are nonconstituents on a phrase structure view of grammar; however, in categorial grammar the conjuncts can be analyzed as units so that the node raising can be treated as coordination of like-type constituents:
(7a)
(7b)

Deductive Systems
In addition to the division operators \ and / there is a product operator • such that A•B signifies the concatenation of an A and a B. The interpretation of the categorical operators is summarized:
(8) $[[A \bullet B]] = \{s_1 + s_2 \mid s_1 \in [[A]] \text{ and } s_2 \in [[B]]\}$
    $[[A \backslash C]] = \{s \mid \text{for all } s' \in [[A]],\ s' + s \in [[C]]\}$
    $[[C / B]] = \{s \mid \text{for all } s' \in [[B]],\ s + s' \in [[C]]\}$
The purpose of deductive systems is to provide sound and complete calculi for this interpretation (Buszkowski, 1986; Pentus, 1994). We consider in turn categorial calculus, sequent calculus, natural deduction, and proof nets.

Categorical Calculus
An arrow $A \to B$ comprises a source syntactic type A and a target syntactic type B. An arrow is valid if and only if in every interpretation, $[[A]]$ is a subset of $[[B]]$. There is the following calculus of valid arrows (Lambek, 1958):
(9)
For example, the following shows that the types for ‘the cat sleeps’ in order yield a sentence:
(10)
Where R abbreviates CN\CN and TV abbreviates (N\S)/N, the following shows that ‘that Mary likes’ yields an R:
(11)
The rules of the categorical calculus are elegant but, as we see, the proofs are not very economical. Furthermore, given an arrow to be proved there is no obvious strategy to search for a proof because in the rule Trans of (9b) the type B is an unknown reading from conclusion to premises. This situation is improved in the sequent calculus.

Sequent Calculus
A sequent $\Gamma \Rightarrow A$ comprises a nonempty sequence of antecedent types $\Gamma$ and a succedent type A. A sequent $A_0, \ldots, A_n \Rightarrow A$ is valid if and only if in every interpretation, if $s_0 \in [[A_0]], \ldots, s_n \in [[A_n]]$ then $s_0 + \ldots + s_n \in [[A]]$. There is the following calculus of valid sequents (Lambek, 1958):
(12)
(13)
(14)
These divide into the identity rules (12) and the logical rules (13) and (14). For each operator there is a left (L) logical rule in which the operator appears in the antecedent of the conclusion and a right (R) logical rule in which the operator appears in the succedent of the conclusion. With the exception of Cut, which introduces the unknown type A reading from conclusion to premises, every rule contains one less operator in the premises than in the conclusion. Now the calculus enjoys Cut elimination, that is to say that every provable sequent can be proved without the use of Cut (Lambek, 1958). Hence, the calculus provides a decision procedure, backward-chaining from the sequent to be proved in the finite Cut-free search space. For example: (15)
(16)
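The backward-chaining decision procedure described above can be sketched in a few lines. This is a minimal illustration rather than the article's own formulation: the rule shapes follow the standard Cut-free presentation of the Lambek calculus with nonempty antecedents, and the names `Atom`, `Over`, `Under`, `Prod`, and `provable` are hypothetical. Termination follows from the property noted in the text that every premise contains fewer operators than its conclusion.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Over:       # B/A
    result: object
    arg: object


@dataclass(frozen=True)
class Under:      # A\B
    arg: object
    result: object


@dataclass(frozen=True)
class Prod:       # A•B
    left: object
    right: object


@lru_cache(maxsize=None)
def provable(ants, succ):
    """True iff the sequent ants => succ has a Cut-free proof (ants: nonempty tuple)."""
    if len(ants) == 1 and ants[0] == succ:
        return True                                            # identity axiom
    # Right rules: decompose the succedent.
    if isinstance(succ, Under) and provable((succ.arg,) + ants, succ.result):
        return True
    if isinstance(succ, Over) and provable(ants + (succ.arg,), succ.result):
        return True
    if isinstance(succ, Prod):
        for i in range(1, len(ants)):                          # split into two nonempty parts
            if provable(ants[:i], succ.left) and provable(ants[i:], succ.right):
                return True
    # Left rules: decompose one antecedent type.
    for i, t in enumerate(ants):
        if isinstance(t, Prod):
            if provable(ants[:i] + (t.left, t.right) + ants[i + 1:], succ):
                return True
        elif isinstance(t, Under):
            for j in range(i):                                 # argument ants[j:i] lies to the left
                if provable(ants[j:i], t.arg) and \
                        provable(ants[:j] + (t.result,) + ants[i + 1:], succ):
                    return True
        elif isinstance(t, Over):
            for j in range(i + 2, len(ants) + 1):              # argument ants[i+1:j] lies to the right
                if provable(ants[i + 1:j], t.arg) and \
                        provable(ants[:i] + (t.result,) + ants[j:], succ):
                    return True
    return False
```

For instance, with `S, N, CN = Atom("S"), Atom("N"), Atom("CN")`, the call `provable((Over(N, CN), CN, Under(N, S)), S)` checks the running example ‘the cat sleeps’, and the relative clause is checked by `provable((Over(Under(CN, CN), Over(S, N)), N, Over(Under(N, S), N)), Under(CN, CN))`.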
However, the Cut-free search space still contains proofs which differ in inessential orderings of rules. For example, the following is equivalent to (15):
(17)

Natural Deduction
In natural deduction (Barry et al., 1991), proofs are ordered trees with hypothesis types at leaves and a conclusion type at the root, with coindexation indicating the closing of hypotheses under hypothetical reasoning. Proofs are combined and extended at the roots starting from single types. A proof with left-to-right unclosed leaves $A_0, \ldots, A_n$ asserts that in every interpretation, if $s_0 \in [[A_0]], \ldots, s_n \in [[A_n]]$ then $s_0 + \ldots + s_n$ belongs to the root type.
(18)
(19)
As with the calculus of the ‘Categorial Grammar’ section, which is a labeled form of natural deduction, for each operator there is a rule of elimination (E) such that the operator is eliminated reading from premise to conclusion and a rule of introduction (I) such that the operator is introduced reading from premise to conclusion. In \I, A must be the leftmost hitherto unclosed hypothesis and cannot be the last such; in /I, B must be the rightmost hitherto unclosed hypothesis and cannot be the last such. In •E, A and B must be the only unclosed hypotheses in the indicated subderivation. For example:
(20)
(21)
Natural deduction provides quite an economical proof syntax because it does not iterate the contexts of sequent calculus inferences. However, it does not provide an obvious proof search procedure because on the one hand, working from leaves to root, it is not clear which hypotheses to make and later close, and on the other hand, working from root to leaves, /E and \E introduce an unknown. In the next section we present proof nets, which combine the representational and computational advantages of sequent calculus and natural deduction.

Proof Nets
When we inspect the sequent calculus we see that it is predictable which sequent rule will apply to an operator in a proof of a sequent. To the main operator * in an antecedent type will apply the rule *L, to the main operator * in a succedent type will apply the rule *R, and to a subordinate operator * will apply *L or *R according to the propagation of subtypes in the unfolding of a proof. This means we can anticipate the antecedent or succedent position of a type in a proof. Let there be two polarities: input (•) and output (°), corresponding respectively to antecedent (L) and succedent (R) position. A polar type $A^p$ is a type together with a polarity p. A polar type tree is the result of unfolding a polar type up to its atomic leaves according to the following logical links:
(22)
(23)
In each link, the premises above the line are the immediate subtypes of the conclusion below the line, marked with polarity according to the propagation to antecedent or succedent position. The links (22a–d) and (23a, b) correspond respectively to \L, \R, /L, /R, •L, and •R, showing just the active types without the iterated contexts. They are marked i or ii according as the rule is unary (the premises belong to the same subproof) or binary (the premises belong to different subproofs). Note that in the output unfoldings the left-to-right order of the subtypes is switched. The intuition behind this is that output polarity hides an implicit negation, and in a noncommutative system the negation/reverse of A first and then B second is the negation/reverse of B first and the negation/reverse of A second. A proof frame is a list of polar type trees comprising an output type followed by at least one input type. A proof structure is the result of connecting in a proof frame every leaf with one other with the same atomic type and complementary polarity. These connections are called axiom links. A proof structure is a proof net if and only if:
(24a) (Acyclicity) Every cycle crosses both edges of some i-link.
(24b) (Planarity) The axiom linking is planar in the ordering of the leaves induced by the list ordering of the frame, i.e., the axiom linking can be drawn in the half-plane without crossing lines.
(24c) (No subtending) No axiom link connects the left and right descendant leaves of an output division node.
(For acyclicity see Danos and Regnier, 1989; for planarity see Roorda, 1991; for no subtending see de Groote and Retoré, 2003.) A proof net over a proof frame $A^{\circ}, A_0^{\bullet}, \ldots, A_n^{\bullet}$ asserts that in every interpretation, if $s_0 \in [[A_0]], \ldots, s_n \in [[A_n]]$ then $s_0 + \ldots + s_n \in [[A]]$. For example:
(25)
(26)
Notice that the cycle in (26) does indeed cross both edges of an i-link so that it complies with acyclicity. Introducing now the semantic dimension, a proof net analysis contains implicitly the semantic reading of a proof (de Groote and Retoré, 1996). The semantic form is recovered following a deterministic semantic trip through the net. The semantic trip starts upwards at the unique output root and proceeds according to the instructions in (27)–(28), generating the successive characters of the semantic form as indicated. The trip bounces at input roots inserting
the associated lexical semantics. It ends downwards back at the origin having crossed each edge twice, once in each direction.
(27)
(28)
For example, the result of performing the semantic trip on (25) is (29) and the result of performing the semantic trip on (26) is (30a), which is equivalent to (30b).
(29)

Conclusion
We have illustrated deductive approaches to the Lambek calculus, the nucleus of categorial grammar. We have seen a variety of calculi leading up to proof nets, which for their parsimony and economy can claim to be the syntactic structures of deductive categorial grammar. Recent work has concentrated on generalizations of the basic calculus, extending its linguistic coverage while trying to preserve its attractive logical properties (Morrill, 1994; Carpenter, 1997; Moortgat, 1997). Perhaps the most challenging contemporary aspect is the development of corresponding theories of proof nets (see e.g., Moot and Puite, 2002; Fadda, 2004; Fadda and Morrill, 2005).

Technical Appendix
We are interested in modeling the two dimensions of language: form and meaning. Categorial grammar classifies expressions simultaneously with respect to these two dimensions. We refer to the first as prosodics, signifying word order abstracted from mode of articulation, e.g., verbalization or signing, but potentially including intonational contours or their analogue. We refer to the second as semantics, signifying logical semantics abstracted from illocution, but potentially including focus or other discourse semantic notions. Syntax is seen as the bridge between these two dimensions. In the first section we present prosodic representation and interpretation and in the second section semantic representation and interpretation. In the third section we present categorial syntactic types and their bidimensional interpretation.

Prosodics
A prosodic structure is a semigroup, i.e., an algebra $(L, +)$ of arity (2) such that + is associative:
(31) $s_1 + (s_2 + s_3) = (s_1 + s_2) + s_3$
Let there be a set $B$ of prosodic constants. Then the set $C$ of prosodic forms is defined by:
(32) $C ::= B \mid C + C$
That is, the prosodic forms are the terms of a prosodic algebra. A prosodic interpretation comprises a prosodic structure $(L, +)$ and a prosodic valuation mapping from $B$ into $L$. The prosodic value $[\alpha]_w \in L$ of a prosodic form $\alpha$ with respect to a prosodic interpretation with prosodic valuation $w$ is defined by:
(33) $[a]_w = w(a)$ for $a \in B$
     $[\alpha + \beta]_w = [\alpha]_w + [\beta]_w$
Two prosodic forms $\alpha$ and $\beta$ are equivalent, $\alpha \cong \beta$, if and only if $[\alpha]_w = [\beta]_w$ in every prosodic interpretation. Since prosodic structures are associative we have:
(34) $\alpha + (\beta + \gamma) \cong (\alpha + \beta) + \gamma$
Hence, we may omit parentheses in prosodic forms.

Semantics
The functional exponentiation of a set X to a set Y, $X^Y$, is the set of all functions mapping from Y into X. The cartesian product of a set X with a set Y, $X \times Y$, is the set of all ordered pairs with first element in X and second element in Y. The set $\mathcal{T}$ of semantic types is defined by:
(35) $\mathcal{T} ::= e \mid t \mid \mathcal{T} \to \mathcal{T} \mid \mathcal{T} \,\&\, \mathcal{T}$
A semantic structure is a $\mathcal{T}$-indexed family of sets $\{D_\tau\}_{\tau \in \mathcal{T}}$ such that $D_e$ is a nonempty set of entities, $D_t$ is the set $\{\emptyset, \{\emptyset\}\}$ of truth values, and
(36) $D_{\tau_1 \to \tau_2} = D_{\tau_2}^{D_{\tau_1}}$
     $D_{\tau_1 \& \tau_2} = D_{\tau_1} \times D_{\tau_2}$
Let there be a set $V_\tau$ of semantic variables for each semantic type $\tau$ and a set $C_\tau$ of semantic constants for each semantic type $\tau$, including the logical semantic constants:
(37)
The sets $\Phi_\tau$ of semantic terms for each semantic type $\tau$ are defined by:
(38) $\Phi_\tau ::= V_\tau \mid C_\tau \mid (\Phi_{\tau' \to \tau}\ \Phi_{\tau'}) \mid \pi_1 \Phi_{\tau \& \tau'} \mid \pi_2 \Phi_{\tau' \& \tau}$
     $\Phi_{\tau' \to \tau} ::= \lambda V_{\tau'} \Phi_\tau$
     $\Phi_{\tau \& \tau'} ::= (\Phi_\tau, \Phi_{\tau'})$
An occurrence of a semantic variable $x$ in a semantic term is bound if and only if it falls within a subterm of the form $\lambda x \phi$; otherwise it is free. The result $\phi\{\psi/x\}$ of substituting semantic variable $x$ (of semantic type $\tau$) by semantic term $\psi$ (of semantic type $\tau$) in semantic term $\phi$ is the result of replacing by $\psi$ every free occurrence of $x$ in $\phi$; the substitution is free if and only if no semantic variable becomes bound in the process of replacement. A semantic form is a semantic term with no free variables. A semantic interpretation comprises a semantic structure $\{D_\tau\}_{\tau \in \mathcal{T}}$, a semantic assignment $g$ mapping from each $V_\tau$ into $D_\tau$, and a semantic valuation $f$ mapping from each $C_\tau$ into $D_\tau$ such that:
(39) $f(\vee)(m) = m' \mapsto m \cup m'$
     $f(\&)(m) = m' \mapsto m \cap m'$
     $f(\to)(m) = m' \mapsto (\{\emptyset\} \setminus m) \cup m'$
     $f(\neg)(m) = \{\emptyset\} \setminus m$
     $f(\forall)(m) = \bigcap_{m' \in D_e} m(m')$
     $f(\exists)(m) = \bigcup_{m' \in D_e} m(m')$
     $f(\iota)(\{m\}) = m$
The semantic value $[\phi]^g_f \in D_\tau$ of a semantic term $\phi \in \Phi_\tau$ with respect to a semantic interpretation with semantic assignment $g$ and semantic valuation $f$ is defined by:
(40) $[x]^g_f = g(x)$ for $x \in V_\tau$
     $[c]^g_f = f(c)$ for $c \in C_\tau$
     $[(\phi\ \psi)]^g_f = [\phi]^g_f([\psi]^g_f)$ (functional application)
     $[\pi_1 \phi]^g_f = \mathrm{fst}([\phi]^g_f)$ (first projection)
     $[\pi_2 \phi]^g_f = \mathrm{snd}([\phi]^g_f)$ (second projection)
     $[\lambda x_\tau \phi]^g_f = D_\tau \ni m \mapsto [\phi]^{(g - \{(x, g(x))\}) \cup \{(x, m)\}}_f$ (functional abstraction)
     $[(\phi, \psi)]^g_f = \langle [\phi]^g_f, [\psi]^g_f \rangle$ (pair formation)
Note that the semantic value of a semantic form is invariant with respect to semantic assignment. Two semantic forms $\phi$ and $\psi$ are equivalent, $\phi \cong \psi$, if and only if $[\phi]^g_f = [\psi]^g_f$ in every semantic interpretation. We have:
(41) $\lambda x \phi \cong \lambda y(\phi\{y/x\})$ (α-conversion), provided $y$ is not free in $\phi$ and $\phi\{y/x\}$ is free
     $(\lambda x \phi\ \psi) \cong \phi\{\psi/x\}$ (β-conversion), provided $\phi\{\psi/x\}$ is free
     $\pi_1(\phi, \psi) \cong \phi$
     $\pi_2(\phi, \psi) \cong \psi$
     $\lambda x(\phi\ x) \cong \phi$ (η-conversion), provided $x$ is not free in $\phi$
     $(\pi_1 \phi, \pi_2 \phi) \cong \phi$
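The set-theoretic encoding of truth values used in (39), with Ø as falsity and {Ø} as truth, can be checked mechanically. The sketch below is illustrative only: it models the two truth values as Python frozensets, and the function names (`f_or`, `f_and`, `f_imp`, `f_not`, `f_forall`, `f_exists`) are hypothetical stand-ins for the values that the valuation f assigns to the corresponding logical constants. The quantifiers take the set of entities as an explicit argument, an assumption made here so that the example is self-contained.

```python
FALSE = frozenset()            # the empty set, Ø
TRUE = frozenset({FALSE})      # the singleton {Ø}


def f_or(m):
    return lambda m2: m | m2               # union


def f_and(m):
    return lambda m2: m & m2               # intersection


def f_imp(m):
    return lambda m2: (TRUE - m) | m2      # ({Ø} minus m) union m'


def f_not(m):
    return TRUE - m                        # {Ø} minus m


def f_forall(pred, entities):
    """Intersection of pred(e) over all entities (entities must be nonempty)."""
    result = TRUE
    for e in entities:
        result = result & pred(e)
    return result


def f_exists(pred, entities):
    """Union of pred(e) over all entities."""
    result = FALSE
    for e in entities:
        result = result | pred(e)
    return result
```

Under this encoding, union, intersection, and relative complement of the two sets behave exactly as disjunction, conjunction, and negation on truth values, which is the point of choosing {Ø, {Ø}} for the domain of type t.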
Syntax
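The set $\mathcal{F}$ of syntactic types is defined on the basis of a set $\mathcal{A}$ of atomic syntactic types as follows:
(42) $\mathcal{F} ::= \mathcal{A} \mid \mathcal{F} \bullet \mathcal{F} \mid \mathcal{F} \backslash \mathcal{F} \mid \mathcal{F} / \mathcal{F}$
Let there be a basic type map $t$ mapping from $\mathcal{A}$ into $\mathcal{T}$. This induces the type map $T$ from $\mathcal{F}$ into $\mathcal{T}$ such that:
(43) $T(P) = t(P)$ for $P \in \mathcal{A}$
     $T(A \bullet B) = T(A) \,\&\, T(B)$
     $T(A \backslash C) = T(A) \to T(C)$
     $T(C / B) = T(B) \to T(C)$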
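The induced type map of (43) is a straightforward structural recursion, sketched below. The class names and the tuple encoding of semantic types are hypothetical, and the basic type map `BASIC` (S to t, N to e, CN to e→t) is an assumption chosen for illustration; the article leaves the basic type map unspecified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Prod:       # A•B
    left: object
    right: object


@dataclass(frozen=True)
class Under:      # A\C
    arg: object
    result: object


@dataclass(frozen=True)
class Over:       # C/B
    result: object
    arg: object


# Semantic types are encoded as "e", "t", ("->", a, b), or ("&", a, b).
# Assumed basic type map (not given in the article).
BASIC = {"S": "t", "N": "e", "CN": ("->", "e", "t")}


def type_map(a):
    """Semantic type of syntactic type a, following the clauses of (43)."""
    if isinstance(a, Atom):
        return BASIC[a.name]
    if isinstance(a, Prod):
        return ("&", type_map(a.left), type_map(a.right))
    if isinstance(a, (Under, Over)):
        # Both divisions map to a function from the argument type to the result type.
        return ("->", type_map(a.arg), type_map(a.result))
    raise TypeError(f"not a syntactic type: {a!r}")
```

Note that the two division operators receive the same semantic type: directionality matters only in the prosodic dimension, not in the semantic one.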
A syntactic interpretation comprises a prosodic structure $(L, +)$, a semantic structure $\{D_\tau\}_{\tau \in \mathcal{T}}$, and a syntactic valuation $F$ mapping each $P \in \mathcal{A}$ into a subset of $L \times D_{t(P)}$. Then the syntactic value $[[A]]_F$, a subset of $L \times D_{T(A)}$, for each syntactic type A is given by:
(44) $[[P]]_F = F(P)$ for $P \in \mathcal{A}$
     $[[A \bullet B]]_F = \{(s_1 + s_2, \langle m_1, m_2 \rangle) \mid (s_1, m_1) \in [[A]]_F \text{ and } (s_2, m_2) \in [[B]]_F\}$
     $[[A \backslash C]]_F = \{(s, m) \mid \text{for all } (s', m') \in [[A]]_F,\ (s' + s, m(m')) \in [[C]]_F\}$
     $[[C / B]]_F = \{(s, m) \mid \text{for all } (s', m') \in [[B]]_F,\ (s + s', m(m')) \in [[C]]_F\}$
A semiotic interpretation comprises a prosodic interpretation, a semantic interpretation, and a syntactic interpretation, with the same prosodic and semantic structures. A type assignment statement $\alpha$–$\phi$: A comprises a syntactic type A, a semantic form $\phi$ of type $T(A)$ and a prosodic form $\alpha$. A semiotic interpretation satisfies a type assignment statement $\alpha$–$\phi$: A if and only if $\langle [\alpha], [\phi] \rangle \in [[A]]$. A semiotic interpretation satisfies a set $\Sigma$ of type assignment statements if and only if it satisfies every type assignment statement $\sigma \in \Sigma$. A set $\Sigma$ of type assignment statements models a type assignment statement $\sigma$, $\Sigma \models \sigma$, if and only if every semiotic interpretation that satisfies $\Sigma$ satisfies $\sigma$. A lexicon comprises a set of type assignment statements. The language model $L(\Sigma)$ defined by a lexicon $\Sigma$ is the set of all type assignment statements that $\Sigma$ models:
(45) $L(\Sigma) = \{\sigma \mid \Sigma \models \sigma\}$
This is like the declarative semantics of logic programs wherein the meaning of a program is the set of all ground atoms which it entails, or a logical theory which is the set of all consequences of an axiomatization. For example, the language model defined by the lexicon (46) includes the type assignment statements in (47):
(46) cat–cat: CN
     likes–like: (N\S)/N
     Mary–m: N
     sleeps–sleep: N\S
     that–λxλyλz((& (y z)) (x z)): (CN\CN)/(S/N)
     the–ι: N/CN
(47a) the + cat + sleeps–(sleep (ι cat)): S
(47b) that + Mary + likes–λyλz((& (y z)) ((like z) m)): CN\CN
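As a check on (47b), the semantic form can be obtained by applying the lexical semantics of ‘that’ from (46) to the semantics of ‘Mary likes’ and simplifying by β-conversion as in (41). The sketch below assumes that hypothetical reasoning assigns ‘Mary likes’ the type S/N with semantics λw((like w) m); the variable name w is chosen here purely for illustration.

```latex
\begin{align*}
& (\lambda x \lambda y \lambda z((\&\ (y\ z))\ (x\ z))\ \ \lambda w((\mathit{like}\ w)\ m)) \\
\cong{} & \lambda y \lambda z((\&\ (y\ z))\ (\lambda w((\mathit{like}\ w)\ m)\ z)) \\
\cong{} & \lambda y \lambda z((\&\ (y\ z))\ ((\mathit{like}\ z)\ m))
\end{align*}
```

The first step substitutes the argument for x, and the second substitutes z for w; both substitutions are free, so the provisos on β-conversion are met and the result is the form given in (47b).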
See also: Combinatory Categorial Grammar.
Bibliography
Ajdukiewicz K (1935). ‘Die Syntaktische Konnexität.’ Studia Philosophica 1, 1–27.
Bar-Hillel Y (1953). ‘A quasi-arithmetical notation of syntactic description.’ Language 19, 47–58.
Bar-Hillel Y (1964). Language and information. Reading, MA: Addison-Wesley.
Barry G, Hepple M, Leslie N & Morrill G (1991). ‘Proof figures and structural operators.’ In Fifth Conference of the European Chapter of the Association for Computational Linguistics, Berlin.
Buszkowski W (1986). ‘Completeness results for Lambek syntactic calculus.’ Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 32, 13–28.
Carpenter B (1997). Type-logical semantics. Cambridge, MA: MIT Press.
Danos V & Regnier L (1989). ‘The structure of multiplicatives.’ Archive for Mathematical Logic 28, 181–203.
de Groote P & Retoré C (1996). ‘On the semantic readings of proof-nets.’ In Kruijff G-J, Morrill G & Oehrle R T (eds.) Proceedings of Formal Grammar 1996. Prague. 57–70.
de Groote P & Retoré C (2003). Proof-theoretic methods in computational linguistics. Lecture notes of the 15th European Summer School in Logic, Language and Information, Vienna.
Fadda M (2004). ‘Non-associativity and balanced proof nets.’ In Proceedings of Categorial Grammars: an Efficient Tool for Natural Language Processing, Montpellier, France. 46–58.
Fadda M & Morrill G (2005). ‘The Lambek calculus with brackets.’ In Scott P, Casadio C & Seely R (eds.) Language and grammar: studies in mathematical linguistics and natural language. Stanford, CA: CSLI.
Husserl E (1913). Logische Untersuchungen (2nd edn.). Halle, Germany: Max Niemeyer.
Lambek J (1958). ‘The mathematics of sentence structure.’ American Mathematical Monthly 65, 154–170.
Moortgat M (1997). ‘Categorial type logics.’ In van Benthem J & ter Meulen A (eds.) Handbook of logic and language. Amsterdam/New York: Elsevier/Cambridge, MA: MIT Press. 93–177.
Moot R & Puite Q (2002). ‘Proof nets for the multimodal Lambek calculus.’ Studia Logica 71(3), 415–442.
Morrill G (1994). Type logical grammar: categorial logic of signs. Dordrecht: Kluwer Academic.
Pentus M (1994). ‘Language completeness of the Lambek calculus.’ In Proceedings of the Eighth Annual IEEE Symposium of Logic in Computer Science. 487–496.
Roorda D (1991). Resource logics: proof-theoretical investigations. Ph.D. thesis, University of Amsterdam.
Categorical Perception in Animals
J Fischer, German Primate Center, Goettingen, Germany
© 2006 Elsevier Ltd. All rights reserved.
The label ‘Categorical Perception’ (CP) is commonly used to describe the observation that continuous variation in a sensory stimulus is recoded into discrete categories. The classic example is the distinction of voiced and voiceless plosive consonants, such as /da/ and /ta/. These phonemes are mainly but not exclusively distinguished by the time lag between the plosive burst and the onset of voicing, i.e., the so-called ‘voice onset time’ (VOT). Although VOT may vary continuously from negative values (the voicing begins before the plosive burst) to positive values (the voicing sets in after the plosive burst), listeners typically sort these phonemes into one category or another. Such effects have become known as Categorical Perception, although CP may involve not only perceptual categorization, but also categorization of mental representations, and decision-making processes. Over time, the operational definition of CP has changed from a restrictive view to a more general one, and this has led to some dispute over which findings constitute examples of CP. A conservative definition of CP requires the fulfillment of four criteria: (1) distinct labeling of stimulus categories; (2) failure to discriminate within categories; (3) a
248 Categorial Grammars: Deductive Approaches
of all ground atoms which it entails, or a logical theory which is the set of all consequences of an axiomatization. For example, the language model defined by the lexicon (46) includes the type assignment statements in (47): (46) cat–cat: CN likes–like: (N\S)/N Mary–m: N. sleeps–sleep: N\S that–lxlylz((& (y z)) (x z)): (CN\CN)/(S/N) that–i: N/CN (47a) the þ cat þ sleeps–(sleep (i cat)): S (47b) that þ Mary þ likes–lylz((& (y z)) ((like z) m)): CN\CN
See also: Combinatory Categorial Grammar.
Bibliography Ajdukiewicz K (1935). ‘Die Syntaktische Konnexita¨t.’ Studia Philosophica 1, 1–27. Bar-Hillel Y (1953). ‘A quasi-arithmetical notation of syntactic description.’ Language 19, 47–58. Bar-Hillel Y (1964). Language and information. Reading, MA: Addison-Wesley. Barry G, Hepple M, Leslie N & Morrill G (1991). ‘Proof figures and structural operators.’ In Fifth Conference of the European Chapter of the Association for Computational Linguistics, Berlin. Buszkowski W (1986). ‘Completeness results for Lambek syntactic calculus.’ Zeitschrift fu¨r Mathematische Logik und Grundlagen der Mathematik 32, 13–28.
Carpenter B (1997). Type-logical semantics. Cambridge, MA: MIT Press. Danos V & Regnier L (1989). ‘The structure of multiplicatives.’ Archive for Mathematical Logic 28, 181–203. de Groote P & Retore´ C (1996). ‘On the semantic readings of proof-nets.’ In Kruijff G-J, Morrill G & Oehrle R T (eds.) Proceedings of Formal Grammar 1996. Prague. 57–70. de Groote P & Retore´ C (2003). Proof-theoretic methods in computational linguistics. Lecture notes of the 15th European Summer School in Logic, Language and Information, Vienna. Fadda M (2004). ‘Non-associativity and balanced proof nets.’ In Proceedings of Categorial Grammars: an Efficient Tool for Natural Language Processing, Montpellier, France. 46–58. Fadda M & Morrill G (2005). ‘The Lambek calculus with brackets.’ In Scott P, Casadio C & Seely R (eds.) Language and grammar: studies in mathematical linguistics and natural language. Stanford, CA: CSLI. Husserl E (1913). Logische Untersuchungen (2nd edn.). Halle, Germany: Max Niemeyer. Lambek J (1958). ‘The mathematics of sentence structure.’ American Mathematical Monthly 65, 154–170. Moortgat M (1997). ‘Categorial type logics.’ In van Benthem J & ter Meulen A (eds.) Handbook of logic and language. Amsterdam/New York: Elsevier/Cambridge, MA: MIT Press. 93–177. Moot R & Puite Q (2002). ‘Proof nets for the multimodal Lambek calculus.’ Studia Logica 71(3), 415–442. Morrill G (1994). Type logical grammar: categorial logic of signs. Dordrecht: Kluwer Academic. Pentus M (1994). ‘Language completeness of the Lambek calculus.’ In Proceedings of the Eighth Annual IEEE Symposium of Logic in Computer Science. 487–496. Roorda D (1991). Resource logics: proof-theoretical investigations. Ph.D. thesis, University of Amsterdam.
Categorical Perception in Animals
J Fischer, German Primate Center, Goettingen, Germany
© 2006 Elsevier Ltd. All rights reserved.
The label ‘Categorical Perception’ (CP) is commonly used to describe the observation that continuous variation in a sensory stimulus is recoded into discrete categories. The classic example is the distinction of voiced and voiceless plosive consonants, such as /da/ and /ta/. These phonemes are mainly but not exclusively distinguished by the time lag between the plosive burst and the onset of voicing, i.e., the so-called ‘voice onset time’ (VOT). Although VOT may vary continuously from negative values (voicing begins before the plosive burst) to positive values (voicing sets in after the plosive burst), listeners typically sort these phonemes into one category or the other. Such effects have become known as Categorical Perception, although CP may involve not only perceptual categorization, but also categorization of mental representations, and decision-making processes. Over time, the operational definition of CP has changed from a restrictive view to a more general one, and this has led to some dispute over which findings constitute examples of CP. A conservative definition of CP requires the fulfillment of four criteria: (1) distinct labeling of stimulus categories; (2) failure to discriminate within categories; (3) a
discrimination peak at the category boundary; and (4) a close agreement between labeling and discrimination functions (Studdert-Kennedy et al., 1970; see Figure 1). More loosely, CP has been defined as a compression of within-category and/or a separation of between-category differences (Harnad, 1987). Correspondingly, so-called ‘perceptual anchors’ or ‘prototypes’ refer to the compressed region within a category, whereas ‘boundary effects’ occur when a given variation of a stimulus is reported as the ‘same’ when it lies within a category and is reported as ‘different’ when it straddles the boundary between two categories (Kuhl, 1991). In the auditory domain, CP was initially believed to be restricted to the perception of speech sounds and considered to be special to speech (Liberman, 1957). This claim sparked interest in the question of whether animals would exhibit CP of human speech tokens (Kuhl, 1987). In an influential study, Kuhl and Miller (1975) trained chinchillas (Chinchilla chinchilla) to discriminate between different human speech tokens. Subjects were trained to distinguish the end-points of the voiced–voiceless continuum between /da/ and /ta/.
Figure 1 Idealized labeling and discrimination functions. (Top) The graded continuum between the two end-points 0 and 10 is partitioned into two categories, A and B. The labeling function is nonlinear. (Bottom) Discrimination function for discrimination of stimuli that fall within a category (e.g., 2 and 3) and across categories (e.g., 5 and 6). The same physical variation may be difficult to distinguish when it falls within a category and easy to distinguish when it straddles the category boundary.
Figure 2 Mean percentage of /d/ responses by chinchilla and human subjects to synthetic speech sounds simulating a continuum ranging from /da/ to /ta/. The animals had been trained on the end-points of the continuum (0 and +80 ms VOT) and then tested with stimuli ranging from +10 to +70 ms. Reprinted with permission from Kuhl P & Miller J D (1975). Science 190, 69–72. © 1975 AAAS. Permission from AAAS is required for all other uses.
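The relation between criteria (1) to (4) and the idealized functions of Figure 1 can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the studies cited here: the logistic shape of the labeling function, the boundary placed at 5.5 on the 0 to 10 continuum, the slope value, and the use of a classic prediction for a two-category ABX task, P(correct) = 0.5 × [1 + (p1 − p2)²], to derive discrimination from labeling alone.

```python
import math

def labeling_prob(x, boundary=5.5, slope=2.0):
    """Probability that stimulus x is labeled as category B.
    The logistic form, boundary and slope are illustrative assumptions."""
    return 1.0 / (1.0 + math.exp(-slope * (x - boundary)))

def predicted_discrimination(x1, x2, boundary=5.5, slope=2.0):
    """Predicted proportion correct in an ABX task if listeners could rely
    only on category labels: 0.5 * (1 + (p1 - p2)**2). Chance level is 0.5."""
    p1 = labeling_prob(x1, boundary, slope)
    p2 = labeling_prob(x2, boundary, slope)
    return 0.5 * (1.0 + (p1 - p2) ** 2)

# Same physical step size (one unit), different positions on the continuum.
within = predicted_discrimination(2, 3)   # both stimuli fall in category A
across = predicted_discrimination(5, 6)   # the pair straddles the boundary

print(f"within-category pair (2, 3): {within:.3f}")
print(f"across-category pair (5, 6): {across:.3f}")
```

Under these assumptions the within-category pair is predicted to be discriminated at about chance, whereas the pair that straddles the boundary is discriminated clearly above chance. This is the pattern the conservative definition requires: a discrimination peak at the boundary that agrees with the labeling function.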
VOT in these experiments ranged between 0 and 80 ms. In the test trials, animals placed the phonetic boundary at approximately 40 ms (Figure 2) and they also extended their generalization to other consonants differing with regard to VOT (Kuhl and Miller, 1978). Similarly, Morse and Snowdon (1975) demonstrated CP of speech tokens in rhesus monkeys (Macaca mulatta). The finding that animals perceived the phonetic boundaries in much the same places as English-speaking people initially led to the hypothesis that the observed boundaries may be innate and linked to the mammalian auditory system. However, category boundaries have been shown to be flexible and variable across different languages (Repp and Liberman, 1987). More importantly, a number of studies indicated that animals may exhibit CP of their own species’ sounds. These findings support the view that categorical boundaries are not innate, but are established through experience. Several studies employing operant training procedures revealed that nonhuman primates show CP of certain features of their own species’ vocalizations (e.g., May et al., 1989). Moreover, a number of studies investigated animals’ natural responses to graded variations of their own sounds. Female mice reliably responded with retrieval behavior to variations of ultrasonic pup vocalizations that fell within the natural range of the frequency bandwidth, but there was a distinct drop in the propensity to respond to calls whose bandwidth exceeded the category boundary (Ehret and Haack, 1981). Snowdon and Pola (1978) showed that the pygmy marmoset (Cebuella pygmaea), a New World monkey, responded in a categorical fashion to the playback of synthetic
modifications of single acoustic parameters in their trills (Figure 3).
Figure 3 Proportion of stimulus presentations that were followed by a pygmy marmoset emitting a closed mouth trill (CMT) within 5 s of the playback of a closed mouth trill, in relation to trill duration. NULL represents the response when no auditory stimulus was presented. Reprinted from Animal Behaviour, 26, Snowdon C T & Pola Y V, Interspecific and intraspecific responses to synthesized pygmy marmoset vocalizations, 192–206, Copyright (1978), with permission from Elsevier.
A further set of studies adopted a specific playback technique, the ‘habituation–dishabituation’ paradigm (also ‘habituation–recovery’; Fischer et al., 2001) previously used in human infant research (Fantz, 1964; Eimas et al., 1971). With this technique, a series of stimuli is presented until the subject ceases to respond. Subsequently, a putatively distinct stimulus is presented. A recovery in response suggests that this stimulus is placed in a different category than those used for habituation, whereas a failure to respond to this test stimulus suggests that it is placed in the same category as those used for habituation. Using this method, Nelson and Marler (1989) studied swamp sparrow (Melospiza georgiana) responses to variation in note duration, a feature characteristic of different populations of this species. In these experiments, animals showed renewed territorial responses only when the note duration was switched to a length of the other category, whereas they failed to do so when the same absolute variation fell within a given category. Crickets (Teleogryllus oceanicus) exhibit categorical perception of the frequency of tones, depending on whether they fall in the species-specific range or whether they simulate the presence of bats, one of their main predators (Wyttenbach et al., 1996). Fischer (1998) also adopted the habituation–dishabituation paradigm and demonstrated that Barbary macaques (Macaca sylvanus) responded in a categorical fashion to continuous variation between
two subtypes of alarm calls (Figure 4).
Figure 4 Looking time after playback of a series of Barbary macaque shrill barks given to two different disturbances, human observers and dogs. Graphs depict habituation in response to repeated presentation of calls given in response to observers. In the test, either a call given in response to a dog (A) or a novel call given in response to the observer (B) was played. Test stimuli differed from habituation stimuli by similar acoustic amounts, measured in terms of scores derived from a multivariate acoustic analysis. Unpublished material from Fischer (1996).
These calls varied with regard to a suite of variables. Experience with the population-typical variants of calls appeared to influence the categorization of sounds, supporting the view that experience with the stimuli can influence the location of category boundaries. Interestingly, baboons showed continuous responses to the graded variation between two subtypes of their loud calls (Fischer et al., 2001). Both methodological approaches to the study of CP in animals – operant conditioning and observation of natural responses – have been criticized for methodological shortcomings: Studies that employed operant conditioning may have established categories through the training, and therefore the observed categorization may simply be an outcome of generalization of the training stimuli. On the other hand, those studies that relied on natural responses could not demonstrate that subjects were unable to distinguish between categories (Snowdon, 1979). Accordingly,
Nelson and Marler (1990) concluded that studies involving operant conditioning were aimed at identifying the ‘just noticeable difference’ (jnd), whereas those relying on the animals’ natural responses identified the ‘just meaningful difference’ (jmd), and it has been suggested that the term ‘categorical responses’ be used for the latter and the term ‘categorical perception’ be reserved for the former. Irrespective of the actual label used, however, it seems warranted to conclude that nonlinear responses to continuous variation in sound features are common among species from a variety of taxa, including insects, rodents, birds, and nonhuman primates. This finding supports the view that CP in the broad sense is an expression of categorical effects in the perception and representation of biologically meaningful stimulus variation.
See also: Animal Communication: Deception and Honest Signaling; Animal Communication: Dialogues; Animal Communication: Overview; Animal Communication: Signal Detection; Animal Communication: Vocal Learning; Cognitive Basis for Language Evolution in Non-human Primates; Communication in Grey Parrots; Development of Communication in Animals; Non-human Primate Communication; Traditions in Animals.
Bibliography Ehret G & Haack B (1981). ‘Categorical perception of mouse pup ultrasounds by lactating females.’ Naturwissenschaften 68, 208. Eimas P D, Siqueland E R, Jusczyk P & Vigorito J (1971). ‘Speech perception in infants.’ Science 171, 303–306. Fantz R L (1964). ‘Visual experience in infants: Decreased attention to familiar patterns relative to novel ones.’ Science 146, 668–670. Fischer J (1996). Perzeption von Lautkategorien bei Berberaffen. Dissertation. Berlin: Free University Berlin. Fischer J (1998). ‘Barbary macaques categorize shrill barks into two call types.’ Animal Behaviour 55, 799–807. Fischer J, Metz M, Cheney D L & Seyfarth R M (2001). ‘Baboon responses to graded bark variants.’ Animal Behaviour 61, 925–931. Harnad S (1987). Categorical perception. Cambridge: Cambridge University Press.
Kuhl P K (1987). ‘Categorization by animals and infants.’ In Harnad S (ed.) Categorical perception. Cambridge: Cambridge University Press. 355–386. Kuhl P K (1991). ‘Human adults and human infants show a ‘‘perceptual magnet effect’’ for the prototypes of speech categories, monkeys do not.’ Perception and Psychophysics 50, 93–107. Kuhl P K & Miller J D (1975). ‘Speech perception by the chinchilla: Voiced-voiceless distinction in alveolar plosive consonants.’ Science 190, 69–72. Kuhl P K & Miller J D (1978). ‘Speech perception by the chinchilla: Identification functions for synthetic VOT stimuli.’ Journal of the Acoustical Society of America 63, 905–917. Liberman A M (1957). ‘Some results of research on speech perception.’ Journal of the Acoustical Society of America 29, 117–123. May B, Moody D B & Stebbins W C (1989). ‘Categorical perception of conspecific communication sounds by Japanese macaques, Macaca fuscata.’ Journal of the Acoustical Society of America 85, 837–847. Morse P A & Snowdon C T (1975). ‘An investigation of categorical speech discrimination by rhesus monkeys.’ Perception and Psychophysics 17, 9–16. Nelson D A & Marler P (1989). ‘Categorical perception of a natural stimulus continuum – Birdsong.’ Science 244, 976–978. Nelson D A & Marler P (1990). ‘The perception of birdsong and an ecological concept of signal space.’ In Stebbins W C & Berkley M A (eds.) Comparative perception 2: Complex signals. New York: Wiley. 443–478. Repp B H & Liberman A M (1987). ‘Phonetic category boundaries are flexible.’ In Harnad S (ed.) Categorical perception. Cambridge: Cambridge University Press. 89–112. Snowdon C T (1979). ‘Response of nonhuman animals to speech and to species-specific sounds.’ Brain Behaviour and Evolution 16, 409–429. Snowdon C T & Pola Y V (1978). ‘Interspecific and intraspecific responses to synthesized pygmy marmoset vocalizations.’ Animal Behaviour 26, 192–206. Studdert-Kennedy M, Liberman A M, Harris K S & Cooper F S (1970). 
‘Motor theory of speech perception.’ Psychological Review 173, 16–43. Wyttenbach R A, May M L & Hoy R R (1996). ‘Categorical perception of sound frequency by crickets.’ Science 273, 1542–1544.
Categorizing Percepts: Vantage Theory
K Allan, Monash University, Victoria, Australia
© 2006 Elsevier Ltd. All rights reserved.
Vantage theory (VT) is a theory of cognitive categorization in terms of point of view or ‘vantage.’ The underlying assumption is that categorization reflects human needs and motives. VT was created by the late Robert E. MacLaury as a way of explaining the meanings and development of color terms across languages when he found prototype theory and fuzzy-set logic inadequate to the task (see MacLaury 1986, 1987, 1991, 1995, 1997, 2002). VT explains
• how people construct categories by analogy to the way they form points of view in space–time;
• how categories are organized;
• how categories divide; and
• the relations between categories.
In VT, cognition consists of selective attention to perception. To form a category, selected perceptions and reciprocal emphases on similarity and difference must be integrated in a principled way. A vantage is a point of view constructed by analogy to physical experience as though it were one or more ‘space–motion coordinates’ on a spatial terrain. Reminiscent of gestalt theory is MacLaury’s claim that a category is the sum of its coordinates, plus their arrangement into one or more vantages by selective emphasis. ‘‘The maker of the category, in effect, names the ways he constructs it rather than the set of its components as detached from himself’’ (1997: 153). The categorizer’s perspectives can be illustrated by an ornithologist ‘zooming in’ to see a mallard among the ducks on a lake, or alternatively ‘zooming out’ to see the assembled mallards, widgeon, and pintails as ducks. The mallard is the ‘fixed coordinate’; the rest a ‘mobile coordinate.’ In both views, there is a pair of coordinates that we can loosely differentiate as ‘species’ and ‘genus.’
Figure 1 Red focus in the composite ‘warm’ category; cf. MacLaury (1997: 145).
Imagine mapping warm-category colors (red, yellow) in an array of colored blocks representing the entire color spectrum. If each of the terms ‘red’ and ‘yellow’ is mapped differently, there is a single vantage. If there is coextensive mapping (evidence of a composite ‘warm’ color) with red focus [see Color Terms], red will dominate at the primary level of concentration, Level 1 in Figure 1, and attention is on ‘similarity,’ S, as the mobile coordinate. At Level 2 concentration, attention to the mobile coordinate yellow notes its similarity to red (as a warm color). At Level 3, there is attention to D, the ‘difference’ of fixed coordinate yellow from red. Here, yellow is recessive. Thus does VT model the dominant–recessive pattern of coextensive naming. The dominant vantage includes reinforced attention to similarity; the recessive vantage reinforces attention to difference. Thus, a category is composed of
• selected perceptions;
• reciprocal and mutable emphases on similarity and difference; and
• at least one arrangement of these coordinates into levels of concentration, which is the vantage.
VT has been applied to many cognitive fields: the category of person in 16th-century Aztec; literacy choices for Yaquis in Arizona; choice of orthography in Japan; semantic extensions in English, French, Spanish, and Zapotec; lexical choices in French; varieties of Japanese women’s speech; terms of address in Japanese; the process of argumentation; and foreign language learning.
See also: Cognitive Linguistics; Cognitive Semantics; Color Terms.
Bibliography MacLaury R E (1986). Color in Mesoamerica, vol. 1. Ph.D. diss., UCB. No. 8718073. Ann Arbor: UMI University Microfilms. MacLaury R E (1987). ‘Coextensive semantic ranges: Different names for distinct vantages of one category.’ In Need B, Schiller E & Bosch A (eds.) Papers from the Twenty-Third Annual Meeting of the Chicago Linguistics Society. Chicago: Chicago Linguistics Society. 268–282. MacLaury R E (1991). ‘Social and cognitive motivations of change: Measuring variability in color semantics.’ Language 67, 34–62.
MacLaury R E (1995). ‘Vantage theory.’ In Taylor J R & MacLaury R E (eds.) Language and the cognitive construal of the world. Berlin: Mouton de Gruyter. 231–276. MacLaury R E (1997). Color and cognition in Mesoamerica: Constructing categories as vantages. Austin: University of Texas Press. MacLaury R E (ed.) (2002). Language Sciences 24. Special Edition on Vantage Theory.
Taylor J R & MacLaury R E (eds.) (1995). Language and the cognitive construal of the world. Berlin: Mouton de Gruyter.
Category-Specific Knowledge
B Z Mahon and A Caramazza, Harvard University, Cambridge, MA, USA
© 2006 Elsevier Ltd. All rights reserved.
Principles of Organization
Theories of the organization of conceptual knowledge in the brain can be distinguished according to their underlying principles. One class of theories, based on the neural structure principle, assumes that the organization of conceptual knowledge is governed by representational constraints internal to the brain itself. Two types of neural constraints have been invoked: modality-specificity and domain-specificity. The second class of theories, based on the correlated structure principle, assumes that the organization of conceptual knowledge in the brain is a reflection of the statistical co-occurrence of object properties in the world. Neuropsychological evidence, and more recently findings from functional neuroimaging, have figured centrally in attempts to evaluate extant theories of the organization of conceptual knowledge. Here we outline the main theoretical perspectives as well as the empirical phenomena that have been used to inform these perspectives.
Modality-Specific Hypotheses
The first class of theories based on the neural structure principle assumes that the principal determinant of the organization of conceptual knowledge is the sensory-motor modality (e.g., visual, motor, verbal) through which the information was acquired or is typically processed. For instance, the knowledge that hammers are shaped like a T would be stored in a semantic subsystem dedicated to representing the visual structure of objects, while the information that hammers are used to pound nails would be represented in a semantic subsystem dedicated to functional knowledge of objects. There have been many proposals based on the modality-specific assumption (Beauvois, 1982; Warrington and McCarthy, 1983,
1987; Warrington and Shallice, 1984; Allport, 1985; Martin et al., 2000; Humphreys and Forde, 2001; Barsalou et al., 2003; Cree and McRae, 2003; Crutch and Warrington, 2003; Gallese and Lakoff, in press). One way to distinguish between these proposals concerns whether, and to what extent, conceptual knowledge is assumed to be represented independently of sensory-motor processes. At one extreme are theories that assume conceptual content reduces to (i.e., actually is) sensory-motor content (e.g., Allport, 1985; Pulvermüller, 2001; Barsalou et al., 2003; Gallese and Lakoff, in press). Central to such proposals is the notion of simulation, or the automatic reactivation of sensory-motor information in the course of conceptual processing. Toward the other end of the continuum are modality-based hypotheses of the organization of conceptual knowledge that assume that sensory-motor systems may be damaged without compromising the integrity of conceptual knowledge (Martin et al., 2000; Plaut, 2002; Crutch and Warrington, 2003; for discussion, see Mahon and Caramazza, in press).
Domain-Specific Hypotheses
A second class of proposals based on the neural structure principle assumes that the principal determinant of the organization of conceptual knowledge is semantic category (e.g., Gelman, 1990; Carey and Spelke, 1994; Caramazza and Shelton, 1998; Kanwisher, 2000). For instance, in this view, it may be argued that conceptual knowledge of conspecifics and conceptual knowledge of animals are represented and processed by functionally dissociable processes/systems. Crucially, in this view, the first-order principle of organization of conceptual processing is semantic category and not the modality through which that information is typically processed. One proposal along these lines, the Domain-Specific Hypothesis (Caramazza and Shelton, 1998), argues that conceptual knowledge is organized by specialized (and functionally dissociable) neural
circuits innately dedicated to the conceptual processing of different categories of objects. However, not all Domain-Specific theories assume that the organization of the adult semantic system is driven by innate parameters (e.g., Kanwisher, 2000).
Feature-Based Hypotheses
The class of hypotheses based on the correlated structure principle has focused on articulating the structure of semantic memory at the level of semantic features. There are many and sometimes diverging proposals along these lines; common to all of them is the assumption that the relative susceptibility to impairment (under conditions of neurological damage) of different concepts is a function of statistical properties of the semantic features that comprise those concepts. For instance, on some models, the degree to which features are shared by a number of concepts is contrasted with their relative distinctiveness (Devlin et al., 1998; Garrard et al., 2001; Tyler and Moss, 2001). Another dimension that is introduced by some theorists concerns dynamical properties of damage in the system; for instance, Tyler and Moss assume that features that are more correlated with other features will be more resistant to damage, due to greater reciprocal activation (or support) from those features with which they are correlated (but see Caramazza et al., 1990). Distinctive features, on the other hand, will not receive as much reciprocal support, and will thus be more susceptible to damage. More recently, theorists have expanded on the original proposal of Tyler and colleagues, adding dimensions such as familiarity, typicality, and relevance (e.g., Cree and McRae, 2003; Sartori and Lombardi, 2004). Feature-based models of semantic memory have in general emphasized an empirical, bottom up, approach to modeling the organization of semantic memory, usually drawing on feature generation tasks (e.g., Garrard et al., 2001; Tyler and Moss, 2001; Cree and McRae, 2003; Sartori and Lombardi, 2004). For this reason, feature-based models have been useful in generating hypotheses about the types of parameters that may contribute to the organization of conceptual knowledge.
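The contrast between shared and distinctive features can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: the concepts and feature lists are invented, and distinctiveness is operationalized as one divided by the number of concepts that carry a feature, which is one simple choice and not necessarily the measure used by any of the models cited above.

```python
from collections import defaultdict

# Hypothetical feature lists, of the kind a feature generation task might yield.
concepts = {
    "dog":    {"has_legs", "has_fur", "barks"},
    "cat":    {"has_legs", "has_fur", "meows"},
    "horse":  {"has_legs", "has_fur", "is_ridden"},
    "hammer": {"has_handle", "pounds_nails"},
    "saw":    {"has_handle", "cuts_wood"},
}

def distinctiveness(concept_features):
    """Map each feature to 1 / (number of concepts that have it).
    A score of 1.0 marks a fully distinctive feature; lower scores
    mark features shared across more concepts (assumed measure)."""
    counts = defaultdict(int)
    for features in concept_features.values():
        for feature in features:
            counts[feature] += 1
    return {feature: 1.0 / n for feature, n in counts.items()}

scores = distinctiveness(concepts)
for feature in sorted(scores, key=scores.get):
    print(f"{feature:14s} {scores[feature]:.2f}")
```

In this toy set, ‘has_legs’ and ‘has_fur’ are shared by all three animals and score lowest, while ‘barks’ or ‘pounds_nails’ each belong to a single concept and score 1.0. On an account such as that of Tyler and Moss, the latter kind of feature would receive less reciprocal support and so be more susceptible to damage.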
Clues from Cognitive Neuropsychology
Neuropsychological studies of patients with semantic impairments have figured centrally in developing and evaluating the hypotheses outlined above. Of particular importance has been a clinical profile described as category-specific semantic deficit. Patients with category-specific semantic deficits present with disproportionate or even selective difficulty for
conceptual knowledge of stimuli from one semantic category compared to other semantic categories. For instance, the reports of category-specific impairment by Warrington and her collaborators (e.g., Warrington and McCarthy, 1983, 1987; Warrington and Shallice, 1984) documented patients who were impaired for living things compared to nonliving things, or the reverse: greater difficulty with nonliving things than living things. Since those seminal reports, the phenomenon of category-specific semantic deficit has been documented by a number of investigators (for recent reviews of the clinical evidence, see Humphreys and Forde, 2001; Tyler and Moss, 2001; Capitani et al., 2003). The clinical profile of category-specific semantic deficits is in itself quite remarkable, and can be striking. Consider some aspects of the following case of category-specific semantic deficit for living animate things. Patient EW (Caramazza and Shelton, 1998) was 41% correct (7/16) for naming pictures of animals but was in the normal range for naming pictures of non-animals (e.g., artifacts, fruit/vegetables) when the pictures from the different semantic categories were matched jointly for familiarity and visual complexity. EW was also severely impaired for animals (60%; 36/60 correct) in a task in which the patient was asked to decide, yes or no, whether the depicted stimulus was a real object or not. In contrast, EW performed within the normal range for making the same types of judgments about nonanimals. On another task, EW was asked to decide whether a given attribute was true of a given item (e.g., Is it true that eagles lay eggs?). EW was severely impaired for attributes pertaining to animals (65% correct) but within the normal range for non-animals. EW was equivalently impaired for both visual/perceptual and functional/associative knowledge of living things (65% correct for both types of knowledge) but was within the normal range for both types of knowledge for non-animals. 
The phenomenon of category-specific semantic deficits frames what has proven to be a rich question: How could the conceptual system be organized such that various conditions of damage can give rise to conceptual impairments that disproportionately affect specific semantic categories? There is emerging consensus that any viable answer to this question must be able to account for the following three facts (for discussion, see Caramazza and Shelton, 1998; Tyler and Moss, 2001; Capitani et al., 2003; Cree and McRae, 2003; Samson and Pillon, 2003).
Fact I: The grain of the phenomenon: Patients can be disproportionately impaired for either living animate things (i.e., animals) compared to living inanimate things (i.e., fruit/vegetables) (e.g., Hart and Gordon, 1992; Caramazza and Shelton, 1998) or living inanimate things compared to living animate things (e.g., Hart et al., 1985; Crutch and Warrington, 2003; Samson and Pillon, 2003). Patients can also be impaired for nonliving things compared to living things (Hillis and Caramazza, 1991).
Fact II: The profile of the phenomenon: Category-specific semantic deficits are not associated with disproportionate impairments for modalities or types of information (e.g., Caramazza and Shelton, 1998; Laiacona and Capitani, 2001; Farah and Rabinowitz, 2003; Samson and Pillon, 2003). Conversely, disproportionate impairments for modalities or types of information are not necessarily associated with category-specific semantic deficits (e.g., Lambon-Ralph et al., 1998; Miceli et al., 2001).
Fact III: The severity of overall impairment: The direction of category-specific semantic deficits (i.e., living things worse than nonliving things, or vice versa) is not related to the overall severity of semantic impairment (Garrard et al., 1998; Zannino et al., 2002).
Explaining Category-Specific Semantic Deficits
Most of the empirical and theoretical work in category-specific semantic deficits has been driven by an attempt to evaluate a theoretical proposal first advanced by Warrington, Shallice, and McCarthy (Warrington and McCarthy, 1983, 1987; Warrington and Shallice, 1984): the Sensory/Functional Theory. The Sensory/Functional Theory is an extension of the modality-specific semantic hypothesis (Beauvois, 1982) discussed above. In addition to assuming that the semantic system is functionally organized by modality or type of information, the Sensory/Functional Theory assumes that the recognition/identification of items from different semantic categories (e.g., living things compared to nonliving things) differentially depends on different modality-specific semantic subsystems. In general, Sensory/Functional theories assume that the ability to identify/recognize living things differentially depends on visual/perceptual knowledge, while the ability to identify/recognize nonliving things differentially depends on functional/associative knowledge (for data and/or discussion of the assumption that different types or modalities of information are differentially important for different semantic categories, see Farah and McClelland, 1991; Caramazza and Shelton, 1998; Garrard et al., 2001; Tyler and Moss, 2001; Cree and McRae, 2003). There are several versions of the Sensory/Functional Theory, each of which has emphasized a different correspondence between the type or modality of information and the category of items that differentially depends on that type of information.
For instance, it has been proposed that color information is more important for fruit/vegetables than animals (e.g., Humphreys and Forde, 2001; Cree and McRae, 2003; Crutch and Warrington, 2003) while biological motion information is more important for animals than for fruit/vegetables (e.g., Cree and McRae, 2003). Another version of the Sensory/Functional Theory (Humphreys and Forde, 2001) holds that there is greater perceptual crowding (due to greater perceptual overlap) at a modality-specific input level for living things than for nonliving things. Thus, damage to this visual modality-specific input system will disproportionately affect processing of living things compared to nonliving things (see also Tranel et al., 1997; Dixon, 2000; Laws et al., 2002). Common to theories based on the Sensory/Functional Assumption is that at least some category-specific semantic deficits can be explained by assuming damage to the modality or type of information upon which recognition/identification of items from the impaired category differentially depends (for discussion see Humphreys and Forde, 2001). Other authors have argued that the fact that category-specific semantic deficits are not necessarily associated with deficits to a modality or type of knowledge (see Fact II above) indicates that the phenomenon does not provide support for Sensory/Functional theories (for discussion, see Caramazza and Shelton, 1998; Tyler and Moss, 2001; Capitani et al., 2003; Cree and McRae, 2003; Samson and Pillon, 2003). Caramazza and Shelton (1998) argued for a Domain-Specific interpretation of category-specific semantic deficits that emphasized the hypothesis that the grain of category-specific semantic deficits will be restricted to a limited set of categories. 
Specifically, because the Domain-Specific Hypothesis (Caramazza and Shelton, 1998) assumes that the organization of conceptual and perceptual processing is determined by innate constraints, the plausible categories of category-specific semantic impairment are ‘animals,’ ‘fruit/vegetables,’ ‘conspecifics,’ and possibly tools. Recent discussion of this proposal (Caramazza and Mahon, in press; see also Shelton et al., 1998) has capitalized on using the category ‘conspecifics’ as a test case. Consistent with expectations that follow from the Domain-Specific Hypothesis, patients have been reported who are relatively impaired for knowledge of conspecifics but not for animals or objects (e.g., Kay and Hanley, 1999; Miceli et al., 2000) as well as the reverse: equivalent impairment for animals and objects but spared knowledge of conspecifics (Thompson et al., 2004). Thus, the domain of conspecifics can be spared or impaired independently of both objects and other
living things, and importantly, an impairment for conspecifics is not necessarily associated with a general impairment for living things compared to nonliving things. Another line of research has sought an account of category-specific semantic deficits in terms of feature-based models of semantic memory organization. For instance, the Organized Unitary Content Hypothesis (OUCH) (Caramazza et al., 1990) makes two principal assumptions. First, conceptual features corresponding to object properties that often co-occur will be stored close together in semantic space; and second, focal brain damage can give rise to category-specific semantic deficits either because the conceptual knowledge corresponding to objects with similar properties is stored in adjacent neural areas, or because damage to a given property will propagate damage to highly correlated properties. While the original OUCH model is not inconsistent with the currently available data from category-specific semantic deficits, it is too unconstrained to provide a principled answer to the question of why the various facts are as they are. Other feature-based models have emphasized the differential susceptibility to impairment of different types of semantic features. These models often assume random (or diffuse) damage to a conceptual system that is not organized by modality or object domain. For instance, in order to account for category-specific semantic deficits, the semantic memory model advanced by Tyler and Moss (2001) makes three assumptions bearing on the relative susceptibility to impairment of different classes of semantic features: (a) Living things have more shared features than nonliving things, or put differently, nonliving things have more distinctive/informative features than living things; (b) For living things, biological function information is highly correlated with shared perceptual properties (e.g., can see/has eyes).
For artifacts, function information is highly correlated with distinctive perceptual properties (e.g., used for spearing/has tines). (c) Features that are highly correlated with other features will be more resistant to damage than features that are not highly correlated (see also Devlin et al., 1998; Garrard et al., 2001; Cree and McRae, 2003). This proposal, termed the Conceptual Structure Account, predicts that a disproportionate deficit for living things will be observed when damage is relatively mild, while a disproportionate deficit for nonliving things will only arise when damage is so severe that all that is left in the system are the highly correlated shared perceptual and function features of living things. Recent work investigating the central prediction of the theory through cross-sectional analyses of patients at varying stages of Alzheimer’s
disease has not found support for this prediction (Garrard et al., 1998; Zannino et al., 2002).
Clues from Functional Neuroimaging
Increasingly, the neuropsychological approach is being complemented by functional neuroimaging studies of category-specificity. There is a large body of evidence from functional neuroimaging that demonstrates differentiation by semantic domain within modality-specific systems specialized for processing object form and object-associated motion. Specifically, within the ventral object processing system, areas on the inferior surface of the temporal lobes process object-associated form and texture, while areas on the lateral surfaces of the temporal lobes process object-associated movement (Kourtzi and Kanwisher, 2000; Beauchamp et al., 2002, 2003). Within both form/texture- and motion-specific areas of the ventral object processing system, there is differentiation by semantic category. On the inferior surface of the temporal lobe (e.g., fusiform gyrus), more lateral areas are differentially involved in the processing of living things, while more medial regions are differentially involved in the processing of nonliving things. Furthermore, human face stimuli, in comparison to non-face stimuli (including animals without faces), differentially activate distinct regions of the inferior temporal cortex (Kanwisher et al., 1999). On the lateral surface of the temporal lobes, more superior regions (e.g., superior temporal sulcus) are differentially involved in the processing of motion associated with living things, while more inferior regions (e.g., middle temporal gyrus) are differentially involved in the processing of motion associated with nonliving things (for review, see Kanwisher, 2000; Martin and Chao, 2001; Beauchamp et al., 2002, 2003; Bookheimer, 2002; Caramazza and Mahon, 2003, in press). All of the theoretical frameworks outlined above have been applied to the data from functional neuroimaging.
One widely received view, the Sensory/Motor Theory, developed by Martin, Wiggs, Ungerleider, and Haxby (1996; see also Martin et al., 2000), assumes that conceptual knowledge of different categories of objects is stored close to the modality-specific input/output areas that are active when we learn about and interact with those objects. Other authors have interpreted these patterns of activation within a Domain-Specific Framework (e.g., Kanwisher, 2000; Caramazza and Mahon, 2003, in press), while still others have interpreted these findings within a distributed semantic memory model that emphasizes experience-dependent and/or feature-based properties of concepts (e.g., Tarr and Gauthier, 2000; Levy et al.,
2001; Martin and Chao, 2001; Bookheimer, 2002; Devlin et al., 2002). Regardless of what the correct interpretation of these functional neuroimaging data turns out to be, they suggest a theoretical approach in which multiple dimensions of organization can be distinguished. In particular, whether the category-specific foci of activation are interpreted within the Domain-Specific Framework or within a feature-based framework, these data suggest that the organization of conceptual knowledge in the cortex is driven both by the type or modality of the information and by its content-defined semantic category.
Conclusion
The three proposals that we have reviewed (the Sensory/Functional Theory, the Domain-Specific Hypothesis, and the Conceptual Structure Account) are contrary hypotheses of the causes of category-specific semantic deficits. However, the individual assumptions that comprise each account are not necessarily mutually contrary as proposals about the organization of semantic memory. In this context, it is important to note that each of the hypotheses discussed above makes assumptions at a different level in a hierarchy of questions about the organization of conceptual knowledge. At the broadest level is the question of whether or not conceptual knowledge is organized by Domain-Specific constraints. The second question is whether conceptual knowledge is represented in modality-specific semantic stores specialized for processing/storing a specific type of information, or is represented in an amodal, unitary system. The third level in this hierarchy of questions concerns the organization of conceptual knowledge within any given object domain (and/or modality-specific semantic store): the principles invoked by feature-based models may prove useful for articulating answers to this question (for further discussion of the various levels at which specific hypotheses have been articulated, see Caramazza and Mahon, 2003). Different hypotheses of the organization of conceptual knowledge are more or less successful at accounting for different types of facts. Thus, it is important to consider the specific assumptions made by each hypothesis in the context of a broad range of empirical phenomena. The combination of neuropsychology and functional neuroimaging is beginning to provide promising grounds for raising theoretically motivated questions concerning the organization of conceptual knowledge in the human brain.
Acknowledgments
Preparation of this manuscript was supported in part by NIH grant DC04542 to A. C., and by an NSF Graduate Research Fellowship to B. Z. M. Portions of this article were adapted from Caramazza and Mahon (2003) and Caramazza and Mahon (in press).
Bibliography
Allport D A (1985). ‘Distributed memory, modular subsystems and dysphasia.’ In Newman & Epstein (eds.) Current perspectives in dysphasia. New York: Churchill Livingstone.
Barsalou L W, Simmons W K, Barbey A K & Wilson C D (2003). ‘Grounding conceptual knowledge in the modality-specific systems.’ Trends in Cognitive Sciences 7, 84–91.
Beauchamp M S, Lee K E, Haxby J V & Martin A (2002). ‘Parallel visual motion processing streams for manipulable objects and human movements.’ Neuron 34, 149–159.
Beauchamp M S, Lee K E, Haxby J V & Martin A (2003). ‘FMRI responses to video and point-light displays of moving humans and manipulable objects.’ Journal of Cognitive Neuroscience 15, 991–1001.
Beauvois M F (1982). ‘Optic aphasia: a process of interaction between vision and language.’ Proceedings of the Royal Society (London) B298, 35–47.
Bookheimer S (2002). ‘Functional MRI of language: new approaches to understanding the cortical organization of semantic processing.’ Annual Review of Neuroscience 25, 151–188.
Capitani E, Laiacona M, Mahon B & Caramazza A (2003). ‘What are the facts of category-specific deficits? A critical review of the clinical evidence.’ Cognitive Neuropsychology 20, 213–262.
Caramazza A, Hillis A E, Rapp B C & Romani C (1990). ‘The multiple semantics hypothesis: Multiple confusions?’ Cognitive Neuropsychology 7, 161–189.
Caramazza A & Shelton J R (1998). ‘Domain specific knowledge systems in the brain: the animate-inanimate distinction.’ Journal of Cognitive Neuroscience 10, 1–34.
Caramazza A & Mahon B Z (2003). ‘The organization of conceptual knowledge: the evidence from category-specific semantic deficits.’ Trends in Cognitive Sciences 7, 325–374.
Caramazza A & Mahon B Z (in press). ‘The organization of conceptual knowledge in the brain: the future’s past and some future directions.’ Cognitive Neuropsychology.
Carey S & Spelke E (1994). ‘Domain-specific knowledge and conceptual change.’ In Hirschfeld L A & Gelman S A (eds.) Mapping the mind: domain-specificity in cognition and culture. New York: Cambridge University Press. 169–200.
Cree G S & McRae K (2003). ‘Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello (and many other such concrete nouns).’ Journal of Experimental Psychology: General 132, 163–201.
Crutch S J & Warrington E K (2003). ‘The selective impairment of fruit and vegetable knowledge: a multiple processing channels account of fine-grain category specificity.’ Cognitive Neuropsychology 20, 355–373.
Devlin J T, Gonnerman L M, Anderson E S & Seidenberg M S (1998). ‘Category-specific semantic deficits in focal and widespread brain damage: a computational account.’ Journal of Cognitive Neuroscience 10, 77–94.
Devlin J T, Russell R P, Davis M H, Price C J, Moss H E, Fadili M J & Tyler L K (2002). ‘Is there an anatomical basis for category-specificity? Semantic memory studies in PET and fMRI.’ Neuropsychologia 40, 54–75.
Dixon M J (2000). ‘A new paradigm for investigating category-specific agnosia in the new millennium.’ Brain and Cognition 42, 142–145.
Farah M J & McClelland J L (1991). ‘A computational model of semantic memory impairment: modality specific and emergent category specificity.’ Journal of Experimental Psychology: General 120, 339–357.
Farah M J & Rabinowitz C (2003). ‘Genetic and environmental influences on the organization of semantic memory in the brain: is “living things” an innate category?’ Cognitive Neuropsychology 20, 401–408.
Gallese V & Lakoff G (in press). ‘The brain’s concepts: the role of the sensory-motor system in conceptual knowledge.’ Cognitive Neuropsychology.
Garrard P, Patterson K, Watson P C & Hodges J R (1998). ‘Category specific semantic loss in dementia of Alzheimer’s type. Functional-anatomical correlations from cross sectional analyses.’ Brain 121, 633–646.
Garrard P, Lambon-Ralph M A, Hodges J R & Patterson K (2001). ‘Prototypicality, distinctiveness, and intercorrelation: analyses of semantic attributes of living and nonliving concepts.’ Cognitive Neuropsychology 18, 125–174.
Gelman R (1990). ‘First principles organize attention to and learning about relevant data: number and the animate-inanimate distinction as examples.’ Cognitive Science 14, 79–106.
Hart J, Berndt R S & Caramazza A (1985). ‘Category-specific naming deficit following cerebral infarction.’ Nature 316, 439–440.
Hart J & Gordon B (1992). ‘Neural subsystems for object knowledge.’ Nature 359, 60–64.
Hillis A E & Caramazza A (1991). ‘Category-specific naming and comprehension impairment: a double dissociation.’ Brain 114, 2081–2094.
Humphreys G W & Forde E M (2001). ‘Hierarchies, similarity, and interactivity in object recognition: “Category-specific” neuropsychological deficits.’ Behavioral and Brain Sciences 24, 453–509.
Kanwisher N (2000). ‘Domain specificity in face perception.’ Nature Neuroscience 3, 759–763.
Kanwisher N, Stanley D & Harris A (1999). ‘The fusiform face area is selective for faces, not animals.’ Neuroreport 10, 183–187.
Kay J & Hanley J R (1999). ‘Person-specific knowledge and knowledge of biological categories.’ Cognitive Neuropsychology 16, 171–180.
Kourtzi Z & Kanwisher N (2000). ‘Activation in human MT/MST by static images with implied motion.’ Journal of Cognitive Neuroscience 12, 48–55.
Laiacona M & Capitani E (2001). ‘A case of prevailing deficit for non-living categories or a case of prevailing sparing of living categories?’ Cognitive Neuropsychology 18, 39–70.
Lambon-Ralph M A, Howard D, Nightingale G & Ellis A W (1998). ‘Are living and non-living category-specific deficits causally linked to impaired perceptual or associative knowledge? Evidence from a category-specific double dissociation.’ Neurocase 4, 311–338.
Laws K R, Gale T M, Frank R & Davey N (2002). ‘Visual similarity is greater for line drawings of nonliving than living thing: the importance of musical instruments and body parts.’ Brain and Cognition 48, 421–423.
Levy I, Hasson U, Avidan G, Hendler T & Malach R (2001). ‘Center-periphery organization of human object areas.’ Nature Neuroscience 4, 533–539.
Mahon B Z & Caramazza A (in press). ‘The orchestration of the sensory-motor systems: clues from neuropsychology.’ Cognitive Neuropsychology.
Martin A & Chao L L (2001). ‘Semantic memory and the brain: structure and processes.’ Current Opinion in Neurobiology 11, 194–201.
Martin A & Weisberg J (2003). ‘Neural foundations for understanding social and mechanical concepts.’ Cognitive Neuropsychology 20, 575–587.
Martin A, Ungerleider L G & Haxby J V (2000). ‘Category specificity and the brain: the sensory/motor model of semantic representations of objects.’ In Gazzaniga M S (ed.) The new cognitive neurosciences. Cambridge, MA: MIT Press.
Martin A, Wiggs C L, Ungerleider L G & Haxby J V (1996). ‘Neural correlates of category-specific knowledge.’ Nature 379, 649–652.
Miceli G, Capasso R, Daniele A, Esposito T, Magarelli M & Tomaiuolo F (2000). ‘Selective deficit for people’s names following left temporal damage: an impairment of domain-specific conceptual knowledge.’ Cognitive Neuropsychology 17, 489–516.
Miceli G, Fouch E, Capasso R, Shelton J R, Tomaiuolo F & Caramazza A (2001). ‘The dissociation of color from form and function knowledge.’ Nature Neuroscience 4, 662–667.
Plaut D C (2002). ‘Graded modality-specific specialization in semantics: a computational account of optic aphasia.’ Cognitive Neuropsychology 19, 603–639.
Pulvermüller F (2001). ‘Brain reflections of words and their meaning.’ Trends in Cognitive Sciences 5, 517–524.
Samson D & Pillon A (2003). ‘A case of impaired knowledge for fruit and vegetables.’ Cognitive Neuropsychology 20, 373–401.
Sartori G & Lombardi L (2004). ‘Semantic relevance and semantic disorders.’ Journal of Cognitive Neuroscience 16, 439–452.
Shelton J R, Fouch E & Caramazza A (1998). ‘The selective sparing of body part knowledge: a case study.’ Neurocase 4, 339–351.
Tarr M J & Gauthier I (2000). ‘FFA: a flexible fusiform area for subordinate-level visual processing automatized by expertise.’ Nature Neuroscience 3, 764–769.
Thompson S A, Graham K S, Williams G, Patterson K, Kapur N & Hodges J R (2004). ‘Dissociating person-specific from general semantic knowledge: roles of the left and right temporal lobes.’ Neuropsychologia 42, 359–370.
Tranel D, Logan C G, Frank R J & Damasio A R (1997). ‘Explaining category-related effects in the retrieval of conceptual and lexical knowledge for concrete entities.’ Neuropsychologia 35, 1329–1339.
Tyler L K & Moss H E (2001). ‘Towards a distributed account of conceptual knowledge.’ Trends in Cognitive Sciences 5, 244–252.
Warrington E K & McCarthy R (1983). ‘Category specific access dysphasia.’ Brain 106, 859–878.
Warrington E K & McCarthy R (1987). ‘Categories of knowledge: further fractionations and an attempted integration.’ Brain 110, 1273–1296.
Warrington E K & Shallice T (1984). ‘Category-specific semantic impairment.’ Brain 107, 829–854.
Zannino G D, Perri R, Carlesimo G A, Pasqualetti P & Caltagirone C (2002). ‘Category-specific impairment in patients with Alzheimer’s disease as a function of disease severity: a cross-sectional investigation.’ Neuropsychologia 40, 2268–2279.
Catford, John C. (b. 1917)
J G Harris, Kirkland, WA, USA
© 2006 Elsevier Ltd. All rights reserved.
J. C. Catford, Professor Emeritus of linguistics at the University of Michigan, USA, was born in Edinburgh, Scotland, in 1917. He studied at the Universities of Edinburgh, Paris, and London. He is, in the opinion of many, one of the greatest living linguists of the 20th and 21st centuries. At age 14, inspired by Bernard Shaw’s Pygmalion, he became deeply interested in phonetics, which he studied in Sweet’s Primer of phonetics, and with encouragement from Daniel Jones (the leading British phonetician of the time). As a schoolboy he became competent in phonetic analysis and production, applying this skill to many English dialects and foreign languages. Having had an audition at the British Broadcasting Corporation, at 17 he began a long association with the BBC and parallel careers as a phonetician/linguist and a radio actor. At this time, his enthusiasm for phonetics broadened into a general interest in linguistics, on which he read widely in the works of Sweet, Jespersen, Sapir, Bloomfield, and others. Specializing in French at Edinburgh University, he passed an academic year in France as an “assistant d’anglais” in a French lycée. During this time, he earned the Diplôme de Phonétique Générale of the Institut de Phonétique of the University of Paris, where he also attended lectures by Marcel Cohen and André Martinet. In 1939, he interrupted his studies to accept an invitation to teach at the British Council’s Institute of English Studies in Athens for one year. The start of World War II prevented his
return to Britain, so the one year became seven, during which he applied phonetics and linguistics in teaching English in Greece, Egypt, and Palestine, acquiring knowledge of Modern Greek, Arabic, and Hebrew. He also met speakers of Caucasian languages and was fascinated by their phonetics and grammar. Returning to the UK in 1946, he studied general linguistics (with J. R. Firth) and Slavonic linguistics at London University, earning his living as a radio actor, specializing in ‘exotic’ dialects and foreign accents, i.e., doing applied phonetics, including the analysis of the sound systems of numerous languages, dialects, and even individuals, and then synthesizing approximately the same sounds in his own vocal tract. In 1952, he returned to Edinburgh University to work full time on the Linguistic Survey of Scotland, where he designed a phonological, rather than phonetic, questionnaire for field work. In 1957, he created and became Director of the Edinburgh University School of Applied Linguistics – believed to be the first academic institution to specialize in the application of linguistic theory and data to practical problems such as language teaching and translation. In 1964, he was invited to the University of Michigan as a professor of linguistics and Director of the English Language Institute, subsequently Chairman of the Department of Linguistics, and Director of the Phonetics Laboratory. He taught phonetics and phonology, applied linguistics, translation theory, comparative-historical linguistics, and several other topics. He also developed his interest in Caucasian languages in two field trips to the USSR. In 1973, he conducted a seminar in Israel for Circassian
teachers, on the Cyrillic orthography and the grammar of Adyghe, so that Circassian children in Israel could become literate in their own language. After his retirement in 1986, he was Visiting Professor at the University of the Bosphorus, Istanbul, at the Hebrew University, Jerusalem, and at the University of California, Los Angeles. In 1988–1993, he was Executive Editor (translation) for the Encyclopedia of language and linguistics (Oxford, Pergamon Press, 1994), and wrote the encyclopedia articles ‘Caucasian languages,’ ‘Articulatory phonetics,’ and ‘Translation, overview.’ His major contributions have been in phonetic taxonomy, aerodynamic phonetics, phonation types, Scots dialectology, Caucasian phonetics, applied linguistics, and translation theory. His Fundamental problems in phonetics (1977), A practical introduction to phonetics (1988), and articles on ‘Phonation types’ (1964) and ‘The articulatory possibilities of man’ (1968) are classics in the field.
See also: Applied Linguistics: Overview and History; Bloomfield, Leonard (1887–1949); Jespersen, Otto (1860–1943); Johnson, Samuel (1709–1784); Jones, Daniel (1881–1967); Martinet, André (1908–1999); Sapir, Edward (1884–1939); Shaw, George Bernard (1856–1950); Sweet, Henry (1845–1912).
Bibliography
Catford J C (1939). ‘On the classification of stop consonants.’ Le Maître Phonétique 3d ser. 65, 2–5. [Republished in Jones W & Laver J (eds.) Phonetics in Linguistics. London: Longman, 1973. 43–46.]
Catford J C (1957). ‘Vowel systems of Scots dialects.’ Transactions of the Philological Society 107–117 [For application see Linguistic Atlas of Scotland, vol. 3].
Catford J C (1964). ‘Phonation types.’ In Abercrombie D et al. (eds.) In honour of Daniel Jones. Longmans. 26–37.
Catford J C (1965). A linguistic theory of translation. London: Oxford University Press.
Catford J C (1968). ‘The Articulatory possibilities of man.’ In Malmberg B (ed.) Manual of phonetics. Amsterdam: North Holland Publishing Co. 309–333.
Catford J C (1976). ‘Ergativity in Caucasian languages.’ In Actes du 6e Congrès de l’association linguistique du nord-est. Montreal: Univ. de Montréal. 1–57.
Catford J C (1977). Fundamental problems in phonetics. Edinburgh: Edinburgh University Press.
Catford J C (1981). ‘Observations on the recent history of vowel classification.’ In Asher & Henderson (eds.) Towards a history of phonetics. Edinburgh: Edinburgh University Press.
Catford J C (1982). ‘Marking and frequency in the English verb.’ In Language form and linguistic variation; current issues in linguistic theory, vol. 15. Amsterdam: Benjamins. 11–27.
Catford J C (1988a). ‘Notes on the phonetics of Nias.’ In McGinn R (ed.) Studies in Austronesian linguistics. Athens, OH: Ohio University. 151–172.
Catford J C (1988b). A practical introduction to phonetics. Oxford: Clarendon Press. [2nd edn., 2001.]
Catford J C (1988c). ‘Functional load and diachronic phonology.’ In Tobin Y (ed.) The Prague School and its legacy. Amsterdam: Benjamins. 3–19.
Catford J C (1991). ‘The classification of Caucasian languages.’ In Lamb S et al. (eds.) Sprung from some common source. Stanford, CA: Stanford University Press. 232–268.
Catford J C (1992). ‘Caucasian phonetics and general phonetics.’ In Paris C (ed.) Caucasologie et mythologie comparée. Actes du colloque international du CNRS, IVe Colloque de Caucasologie. Paris: Peeters. 193–216.
Catford J C (1998). ‘Sixty years in linguistics.’ In Koerner E F K (ed.) First person singular III, autobiographies by North American scholars in the language sciences. Amsterdam: Benjamins. 3–38.
Caucasian Languages B G Hewitt, SOAS, Doncaster, UK ! 2006 Elsevier Ltd. All rights reserved.
Around 38 languages are deemed to be indigenous to the Caucasus; often difficult demarcation between language and dialect explains the uncertainty. The ancestral homelands are currently divided between: 1. Russia’s north Caucasian provinces (Circassian, Abaza, Ingush, Chechen, Avaro-Ando-Tsezic, Lako-Dargic, northern Lezgic);
2. de facto independent Abkhazia (Abkhaz, Mingrelian, Svan, Georgian, Laz); 3. Georgia (Georgian, Mingrelian, Svan, Laz, Bats, Chechen, Avar, Udi); 4. Azerbaijan (Lezgi, Budukh, Kryts’, Khinalugh, Rutul, Ts’akhur, Avar, Udi) Turkey (Laz, Georgian). Diaspora-communities of North (especially northwest) Caucasians can be found across former Ottoman territories, particularly Turkey, where the majority Circassian and Abkhazian populations reside and where the term ‘Cherkess’ often
teachers, on the Cyrillic orthography and the grammar of Adyghe, so that Circassian children in Israel could become literate in their own language. After his retirement in 1986, he was Visiting Professor at the University of the Bosphorus, Istanbul, at the Hebrew University, Jerusalem, and at the University of California, Los Angeles. In 1988–1993, he was Executive Editor (translation) for the Encyclopedia of language and linguistics (Oxford, Pergamon Press, 1994), and wrote the encyclopedia articles ‘Caucasian languages,’ ‘Articulatory phonetics,’ and ‘Translation, overview.’ His major contributions have been in phonetic taxonomy, aerodynamic phonetics, phonation types, Scots dialectology, Caucasian phonetics, applied linguistics, and translation theory. His Fundamental problems in phonetics (1977), A practical introduction to phonetics (1988), and articles on ‘Phonation types’ (1964) and ‘The articulatory possibilities of man’ (1968) are classics in the field.
See also: Applied Linguistics: Overview and History; Bloomfield, Leonard (1887–1949); Jespersen, Otto (1860–1943); Johnson, Samuel (1709–1784); Jones, Daniel (1881–1967); Martinet, André (1908–1999); Sapir, Edward (1884–1939); Shaw, George Bernard (1856–1950); Sweet, Henry (1845–1912).
Bibliography
Catford J C (1939). ‘On the classification of stop consonants.’ Le Maître Phonétique 3d ser. 65, 2–5. [Republished in Jones W & Laver J (eds.) Phonetics in Linguistics. London: Longman, 1973. 43–46.]
Catford J C (1957). ‘Vowel systems of Scots dialects.’ Transactions of the Philological Society 107–117. [For application see Linguistics Atlas of Scotland, vol. 3.]
Catford J C (1964). ‘Phonation types.’ In Abercrombie D et al. (eds.) In honour of Daniel Jones. Longmans. 26–37.
Catford J C (1965). A linguistic theory of translation. London: Oxford University Press.
Catford J C (1968). ‘The articulatory possibilities of man.’ In Malmberg B (ed.) Manual of phonetics. Amsterdam: North Holland Publishing Co. 309–333.
Catford J C (1976). ‘Ergativity in Caucasian languages.’ In Actes du 6e Congrès de l’association linguistique du nord-est. Montreal: Univ. de Montréal. 1–57.
Catford J C (1977). Fundamental problems in phonetics. Edinburgh: Edinburgh University Press.
Catford J C (1981). ‘Observations on the recent history of vowel classification.’ In Asher & Henderson (eds.) Towards a history of phonetics. Edinburgh: Edinburgh University Press.
Catford J C (1982). ‘Marking and frequency in the English verb.’ In Language form and linguistic variation; current issues in linguistic theory, vol. 15. Amsterdam: Benjamins. 11–27.
Catford J C (1988a). ‘Notes on the phonetics of Nias.’ In McGinn R (ed.) Studies in Austronesian linguistics. Athens, OH: Ohio University. 151–172.
Catford J C (1988b). A practical introduction to phonetics. Oxford: Clarendon Press. [2nd edn., 2001.]
Catford J C (1988c). ‘Functional load and diachronic phonology.’ In Tobin Y (ed.) The Prague School and its legacy. Amsterdam: Benjamins. 3–19.
Catford J C (1991). ‘The classification of Caucasian languages.’ In Lamb S et al. (eds.) Sprung from some common source. Stanford, CA: Stanford University Press. 232–268.
Catford J C (1992). ‘Caucasian phonetics and general phonetics.’ In Paris C (ed.) Caucasologie et mythologie comparée. Actes du colloque international du CNRS, IVe Colloque de Caucasologie. Paris: Peeters. 193–216.
Catford J C (1998). ‘Sixty years in linguistics.’ In Koerner E F K (ed.) First person singular III, autobiographies by North American scholars in the language science. Amsterdam: Benjamins. 3–38.
Caucasian Languages B G Hewitt, SOAS, Doncaster, UK © 2006 Elsevier Ltd. All rights reserved.
Around 38 languages are deemed to be indigenous to the Caucasus; the often difficult demarcation between language and dialect explains the uncertainty. The ancestral homelands are currently divided between:
1. Russia’s north Caucasian provinces (Circassian, Abaza, Ingush, Chechen, Avaro-Ando-Tsezic, Lako-Dargic, northern Lezgic);
2. de facto independent Abkhazia (Abkhaz, Mingrelian, Svan, Georgian, Laz);
3. Georgia (Georgian, Mingrelian, Svan, Laz, Bats, Chechen, Avar, Udi);
4. Azerbaijan (Lezgi, Budukh, Kryts’, Khinalugh, Rutul, Ts’akhur, Avar, Udi);
5. Turkey (Laz, Georgian).
Diaspora-communities of North (especially northwest) Caucasians can be found across former Ottoman territories, particularly Turkey, where the majority Circassian and Abkhazian populations reside and where the term ‘Cherkess’ often
indiscriminately applies to any North Caucasian. Circassians are found in Syria, Israel, and Jordan, home also to a significant Chechen population. Speaker numbers range from 500 (Hinukh) to 3–4 million (Georgian). Many of the languages are endangered. Three families are usually recognized:
A. South Caucasian (Kartvelian)
   Georgian
   Svan
   Mingrelian (Megrelian)
   Laz (Ch’an)
   [Scholars in Georgia regard Mingrelian and Laz as codialects of Zan]
B. North West Caucasian
   Abkhaz
   Abaza
   Ubykh (extinct from 1992)
   West Circassian (Adyghe)
   East Circassian (Kabardian)
C. Nakh-Daghestanian
   (a) Nakh (North Central Caucasian)
       Chechen
       Ingush
       Bats (Ts’ova Tush)
   (b) Daghestanian (North East Caucasian)
       1. Avaro-Ando-Tsezic(/Didoic):
          Avaric: Avar
          Andic: Andi, Botlikh, Godoberi, K’arat’a (Karata), Akhvakh, Bagvalal, T’indi (Tindi), Ch’amalal (Chamalal)
          Tsezic: Tsez (Dido), Khvarshi, Hinukh, Bezht’a (Bezhta) (K’ap’uch’a), Hunzib (these last two are sometimes regarded as codialects)
       2. Lako-Dargic:
          Lakic: Lak
          Dargic: Dargwa (Dargi(n)) – some treat K’ubachi, Chiragh, and Megeb as full languages
       3. Lezgic: Lezgi(an), Tabasaran (Tabassaran), Rutul (Mukhad), Ts’akhur (Tsakhur), Aghul, Udi, Archi, Budukh, Khinalugh, Kryts’ (Kryts)
Some challenge the Lezgic status of Archi, Khinalugh, Budukh, and Kryts’. Mutual intelligibility basically exists between Laz and Mingrelian, between Abkhaz and Abaza, and between West and East Circassian. Only Georgian has an ancient tradition of writing, but during the Soviet period the languages in bold all enjoyed literary status. Publishing in Mingrelian, Laz, Ts’akhur, Aghul, Rutul, and Udi was tried in the 1930s but discontinued, though there have been some post-Soviet attempts to publish more widely (including Dido).
Phonetics and Phonology All Caucasian languages have voiced vs. voiceless aspirate vs. voiceless ejective plosives, affricates, and occasionally fricatives, to which some add a fortis series (voiceless unaspirated or geminate). North West Caucasian is characterized by large consonantal inventories coupled with minimal vowel systems, consisting of at least the vertical opposition open /A/ vs. closed /e/. Ubykh possessed 80 phonemes (83 if the plain velar plosives attested only in loans are admitted), with every point of articulation between lips and larynx utilized and displaying the secondary features of palatalization, labialization, and pharyngalization – Daghestanian pharyngalization is normally assigned to vowels (Table 1). Some recent analyses of Daghestanian languages have produced inventories rivaling those of the North West Caucasian, though no parallel minimality among the vowels is posited. One analysis of Archi assigns it 70 consonants (Table 2). Noticeable here is the presence of 10 laterals, though some specialists recognize no more than three or four.
Table 1 Consonantal phonemes for Ubykh
Table 2 Consonantal system of Archi
Table 3 Georgian-Avar-Andi vowel system: i, E, A, O, u
Table 4 Svan’s upper Bal vowel system: i, i:, y, y:, e, e:, E, E:, a, a:, œ, œ:, u, u:, O, O:, A, A:
Table 5 Bezht’a basic vowel system: i, y, E, œ, a, A, O, u
Table 6 Hunzib basic vowel system
Table 7 Chiragh Dargwa vowel system: i(:), E(:), A(:), u(:)
Table 8 Udi vowel system: i, i¿, (y), E, E¿, (œ), (a), e, u, u¿, O, O¿, A, A¿
Table 9 Chechen vowel system: i, i:, y, y:, je, ie, Hœ, yœ, e, e:, a, a:, u, u:, wo, uo, o, o:, A, A:
Kartvelian occupies a mid-position with between 28 and 30 consonants (see Georgian). Georgian shares with Avar and Andi the simple five-vowel triangle (Table 3). Schwa is added to this in the other Kartvelian languages, while the various Svan dialects have length and/or umlaut, Upper Bal having the richest system (Table 4). Triangular or quadrilateral vowel systems are attested in Nakh-Daghestanian (Table 5). All but /y, E, œ/ possess long counterparts, and the nasalized vowels: / , , , , , :, , :/ have also been recognized. Table 6 shows the Hunzib basic vowels. All these Hunzib vowels have long counterparts, and fluctuating nasalization on short vowels has been observed. The simplest (near-)quadrilateral system is attested in Chiragh Dargwa, with four pairs distinguished by
length (Table 7). Udi has been analyzed as shown in Table 8, whilst Chechen presents a complicated system (Table 9). Most, if not all, of these can be nasalized as a result of the weakening of a following /n/. Stress is sometimes distinctive (Abkhaz-Abaza) but usually not. Tonal distinctions have been proposed for some of the Daghestanian languages (Andi, Akhvakh, Ch’amalal, Khvarshi, Hinukh, Bezht’a, Tabasaran, Ts’akhur, Ingush, and Budukh).
Morphology North West Caucasian sememes are typically C(C)(V), and minimal case systems combine with highly polysynthetic verbs, which may contain up to four agreement prefixes, locational preverbs, orientational preverbs and/or suffixes, interrogative and conjunctional elements, and markers of tense-modality, (non-)finiteness, causation, potentiality, involuntariness, polarity, reflexivity, and reciprocality (see Abkhaz). Kartvelian balances a moderate total of cases with reasonably complex verbs, which may contain: agreement with two or three (rarely four) arguments via two sets of agreement affixes, directional/perfectivizing preverbs (the large total in Mingrelian-Laz suggests North West Caucasian influence), and markers of tense-aspect-modality, causation, potentiality, version (vocalic prefixes indicating certain relations between arguments), and voice – Kartvelian is the only family to have a full active–passive diathetic opposition.
Nakh-Daghestanian has complex nominal systems with both grammatical and sometimes large numbers of locative cases; Lezgi(an), Aghul, and Udi apart, nouns fall into one of between two and (depending on the analysis) five or eight (largely covert) classes. Verbs are correspondingly simple: agreement is totally absent from Lezgi(an) and Aghul; elsewhere, verbs with an agreement slot typically allow only class agreement (Andic), though some languages (Bats, Lak-Dargwa, Tabasaran, Akhvakh, Archi, Hunzib, and Avar dialects) have added perhaps rudimentary person agreement, whilst Udi has person agreement only. Some languages have a small selection of preverbs. Some distinguish perfective from imperfective roots. Some North Caucasian verbs can be construed transitively or intransitively (?passively), depending on the clausal structure. Antipassives are also attested. Avar illustrates a typical system of locative cases (Table 10).

Table 10 Avar locative case endings
Series   Essive                 Allative   Ablative
1.       -d(.)A                 -d.E       -d(.)A.s:A    ‘on’
2.       -q:                    -q:.E      -q:.A         ‘near’
3.       - :’                   - :’.E     - :’.A        ‘under’
4.       - :                    - :.E      - :.A         ‘in (mass)’
5.       -D (= class-marker)    -D-E       -s:A          ‘in (space)’

Ergativity and some other oblique case functions are often merged in a single morph. Deictic systems range from two-term (Mingrelian, Ubykh, Kryts’), through three-term (Georgian, Abkhaz, Circassian), to five-term in a swathe of Daghestanian, and even six-term (Lezgi(an), Godoberi). Counting systems are predominantly vigesimal, at least up to ‘99’ (though Bats is vigesimal throughout), but some systems are decimal.
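The arithmetic behind a vigesimal count up to ‘99’ can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the glosses are schematic English-style renderings, and nothing here reproduces the actual numeral morphology of any Caucasian language.

```python
def vigesimal_gloss(n):
    """Gloss n (0-99) as a count of twenties plus a remainder below twenty."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    scores, rest = divmod(n, 20)      # how many twenties, and what is left over
    parts = []
    if scores:
        parts.append(f"{scores} x 20")
    if rest or not scores:            # keep a bare remainder, including plain 0
        parts.append(str(rest))
    return " + ".join(parts)


def decimal_gloss(n):
    """Gloss n (0-99) as a count of tens plus units, for contrast."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    tens, units = divmod(n, 10)
    parts = []
    if tens:
        parts.append(f"{tens} x 10")
    if units or not tens:
        parts.append(str(units))
    return " + ".join(parts)


if __name__ == "__main__":
    for number in (20, 40, 77, 99):
        print(number, "|", vigesimal_gloss(number), "|", decimal_gloss(number))
```

Under this scheme 77 is built as three twenties plus seventeen, whereas a decimal system builds it as seven tens plus seven.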
Syntax Word orders are: Kartvelian and Nakh-Daghestanian AN, GN, N-Postposition, SOV, though Old Georgian was rather NA and NG; North West Caucasian GN, predominantly NA, N-Postposition, SOV. Some degree of ergativity characterizes all the languages, but in Mingrelian, where the system was originally as illustrated for Georgian (q.v.), the ergative case marker was extended vertically to replace the original nominative for intransitive (including indirect) verbs in Series II (aorist indicative and subjunctive), where it functions as a Series II nominative allomorph, the original nominative effectively becoming an accusative just for Series II. Laz has extended the case marker horizontally across its three series for all transitive subjects. Active–inactive alignment plays a role in some languages (Bats). A nominative/absolutive argument is the obligatory minimum in a clause, and where verbs have class agreement, this is the determiner for the class marker (which in some languages also appears on adverbs and as part of a locative case exponent); the determiner for person agreement in languages with class agreement might be this same or a different argument (e.g., the logical subject), depending on a variety of factors. Verbs such as want, have, hear are construed indirectly with the logical subject in an oblique case, but, if Kartvelian and North West Caucasian employ just the dative/general oblique case for this argument, greater distinctions can apply in Nakh-Daghestanian: Avar employs its dative case with verbs of emotion (love), a locative (Series I essive) with verbs of perception (see), and the genitive for the possessor in conjunction with the copula. Only Kartvelian has the category of subordinating conjunctions, naturally associated with full clauses containing indicative or subjunctive finite verbs. Such structures are rare in North Caucasian, where one finds a variety of nonfinite (nominalized) verb forms fulfilling the subordinate role. Examples:

ilu-di        ri :’i         b-EZ-A
mother-Erg    meat.Absol3    3-fry-Past
vs.
ri :’i         b-EZ-A
meat.Absol3    3-fry-Past
‘Mother fried the meat’ vs. ‘The meat (was) fried’ (Andi)
is-t’i         s:i            RArt:Ol- hA
brother-Erg    water.Absol    boil-Pres
‘Brother is boiling the water’ (Bezht’a)
vs.
is               s:i-d          RArt:Ol-dA:- h
brother.Absol    water-Instr    boil-AntiPass-Pres
‘Brother is regularly engaged in boiling water’ (Bezht’a)

k’Ots-k     RAb-i        kO-ø- ir-u
man-NomA    girl-AccB    Prev-herB-see-he.AorA
vs.
RAb-k        dO-Rur-u
girl-NomA    Prev-die-she.AorA
‘The man saw the girl’ vs. ‘The girl died’ (Mingrelian)

k’O -s      RAb-i        ø-A- ir-E
man-DatB    girl-NomA    heB-Pot-see-her.PresA
‘The man can see the girl’ (Mingrelian)
vs.
k’O -s      RAb-k        k-ø-A- ir-u
man-DatB    girl-NomA    Prev-heB-Pot-see-her.AorA
‘The man could see the girl’ (Mingrelian)

ins:-u-jE         j.As               j-O :’-u-lA
father-Obl-Dat    daughter2.Absol    2-love-TV-Pres
‘Father loves (his) daughter’ (Avar)

ins:-u-d.A         w.As-ul         r-ix:-u-lA
father-Obl-LocI    son-Pl.Absol    Pl-see-TV-Pres
‘Father sees (his) sons’ (Avar)

ins:-u-l          tsu             b-ugO
father-Obl-Gen    horse3.Absol    3-be.Pres
‘Father has a horse’ (Avar)

lAmsgEd-wEn-is    bikw-d       sgA
shade-from-Gen    wind-ErgA    Prev
la-ø-j-k’wis-ø,              ErE
Prev-itB-SV-admit-it.AorA    that
minE     uswwAr            nEnsgA
their    each.other.Dat    between
w.O-l.qmAs-A                 miZ
CompPref-strong-CompSuff     sun.NomA
le.m.ar-ø
apparently.be-itA
‘The north wind admitted that the sun was apparently the stronger of them’ (Lower Bal Svan)

teRA-Ze-m                   teRA-r           jAZ     nAh.re.j    nAh
sun-wind-the.Erg/OblIII     sun-the.AbsolI   self    much        more
ø-zA.re- A§e-r                           ø-qe-gwe.re-ø-me- wA-mA
itI-how-strong-Absol.N/F.Stat.PresI      itI-Prev-Prev-itIII-not-admit.N/F-ifI
ø-me-wwe-n-Aw                 ø-wwe-RA
itI-not-happen-Fut-AbsI       itI-happen-Aor.Fin
‘It became impossible for the north wind not to admit how/that the sun is stronger than it’ (Temirgoi West Circassian)

Kinship Kartvelian is unrelated to any known language or language family, but the debate continues concerning the relationship between the northern families. Linkage to Hattic is postulated for northwestern Caucasian and to Hurrian for Nakh-Daghestanian. Udi has recently been conclusively demonstrated to descend from Caucasian Albanian.

See also: Abkhaz; Georgian.
Causatives: Semantics J J Song, University of Otago, Dunedin, New Zealand © 2006 Elsevier Ltd. All rights reserved.
Defining Causative Constructions The causative construction is a linguistic expression that denotes a complex situation consisting of two events: (1) the causing event in which the causer does something, and (2) the caused event in which the causee carries out an action or undergoes a change of condition or state as a result of the causer’s action. The following example is such a linguistic expression.
(1) The teacher made Matthew paint the house
In (1), the causer (the teacher) did something, and as a result of that action the causee (Matthew) in turn carried out the action of painting the house. The causative construction has two main characteristics. First, the causer noun phrase and the expression of cause must be foregrounded, with the causee noun phrase and the expression of effect backgrounded. The foregrounding of the causer noun phrase and the expression of cause is achieved by putting these two expressions in grammatically more prominent positions in the sentence than the causee noun phrase and the expression of effect. Second, the expression of the causer’s action must be without specific meaning; all that is encoded by that expression is the pure notion of cause. For instance, the sentence in (2), although denoting a causative situation similar to (1), is not regarded as an example of the causative construction but rather as an example of what may be referred to broadly as the causal construction.
(2) Matthew painted the house because the teacher instructed him to do so
There are two clear differences between (1) and (2). First, in (1) the causer noun phrase, the teacher, and the expression of cause, made, are the subject and the main predicate of the sentence, respectively (i.e., they are foregrounded). The causee noun phrase and the predicate of effect, on the other hand, appear as a nonsubject noun phrase and a subordinate predicate, respectively (i.e., they are backgrounded). This situation is reversed in (2); the causee noun phrase and the expression of effect appear as the subject and the predicate of the main clause, respectively, with both the causer noun phrase and the expression of cause located in the subordinate clause. Second, in (1) the expression of the causer’s action, made, lacks specific lexical content. In (2), on the other hand, the expression of the causer’s action, instructed, has specific lexical content.
Types of Causative Constructions The most widely known classification of causatives is based on the formal fusion between the predicate of cause and that of effect. In this classification, three different types of causative are recognized: (1) lexical, (2) morphological, and (3) syntactic. The lexical causative type involves suppletion (no formal similarity between the noncausative verb and its causative counterpart). In this type, the formal fusion of the expression of cause and of effect is maximal, with the effect that the causative verb cannot be analyzed into two morphemes. Examples of
Causal Theories of Reference and Meaning A Sullivan, Memorial University of Newfoundland, St. John’s NL, Canada © 2006 Elsevier Ltd. All rights reserved.
Reference, Meaning, and Causal Theories The theory of reference and the theory of meaning are two closely related, fundamental strains within the study of mind and language. The aim of a theory of meaning is to explain what it is that competent speakers of a given language know, or are able to do, in virtue of which they are able to use the language to communicate. The aim of the theory of reference is to explain what it is in virtue of which words refer to what they do, how it is that utterances can hook onto and express information about particular things. The exact relation between meaning and reference is a controversial matter (in large part because of the wide variety of theoretical approaches to meaning). According to some views, the meaning of an expression is precisely its referent, and so theories of meaning and of reference are just slightly different roads in to what is essentially the same task. Opponents of this notion point to co-referential expressions that differ in meaning (such as ‘Portugal’ and ‘the country immediately west of Spain’), or to meaningful expressions that do not seem to refer to anything (‘of’, or ‘for the sake of’), to show that meaning is distinct from reference. Or again, many theorists hold that proper names refer but cannot really be said to have a meaning, or that complete sentences have a determinate meaning but do not refer to anything. In any case, the causal theory of reference (i.e., words refer to what they do by virtue of a certain sort of causal relation between word and referent) and the causal theory of meaning (i.e., words mean what they do by virtue of a certain sort of causal relation between word and meaning) are, historically and conceptually, distinct views. 
To help avoid confusion, I will distinguish the relevant approach to reference by calling it ‘the causal-historical theory.’ (‘Historical’ is an appropriate distinguishing mark because the history of how a word is transmitted from its original inception to the current speaker is much more important on the causal approach to reference, as compared with the causal approach to meaning.)
The Causal-Historical Theory of Reference The causal-historical theory of reference was developed in the 1960s and 1970s. It is explicitly
developed only for proper names (cf. Donnellan, 1970; Kripke, 1972) and natural kind terms (cf. Kripke, 1972; Putnam, 1975). However, Kaplan (1977) raises some related points about indexical expressions, and there have been attempts to fashion a fully general approach to reference along these lines (for discussion, see Stalnaker, 1997; Devitt and Sterelny, 1999). The theory has replaced the descriptivist approach to reference, different versions of which were defended by Frege and Russell, as the orthodox approach to reference. (See Proper Names: Philosophical Aspects for discussion.) According to the causal-historical theorists, descriptivists are wrong to demand that, in order to significantly use a term, speakers need to have a uniquely identifying description of its referent. Rather, once a convention is in place, linking a term to its referent, a deferential intention to comply with this practice – i.e., to use ‘X’ to refer to what others have used ‘X’ to refer to – is all that is required in order to use the term to refer. The view has it that certain expressions refer to certain things in virtue of a causal-historical relation between word and object, initially fixed during a dubbing or baptism and propagated from there to subsequent speakers, who implicitly defer to that initial dubbing in using the expression to refer. The notion of a causal-historical chain as that which is criterial in determining reference is developed more or less independently by Donnellan and Kripke. Donnellan (1970: 277) concludes an argument against descriptivism with the claim that ‘‘. . . in some way the referent must be historically, or, we might say, causally connected to the speech act.’’ Donnellan (1974: 17) articulates the point at a bit more length: ‘‘Suppose someone says ‘Socrates was snub-nosed’, and we ask to whom he is referring. . . .[T]his calls for a historical explanation; we search not for an individual who might best fit the speaker’s descriptions . . .
but rather for an individual historically related to his use of the name.’’ Kripke (1972: 94–95) uses similar terms to describe his approach: ‘‘. . . It’s in virtue of our connection with other speakers in the community, going back to the referent himself, that we refer to a certain man . . . In general, our reference depends not just on what we think ourselves, but on other people in the community, the history of how the name reached one, and things like that. It is by following such a history that one gets to the reference.’’ And again Kripke (1972: 106): ‘‘. . . reference actually seems to be determined by the fact that the speaker is a member of a community of speakers who use the name.
The name has been passed to him by tradition from link to link.’’ The causal-historical theory is an externalist approach to reference, in that reference depends largely on factors external to the speaker’s head – factors pertaining to the speaker’s linguistic community and to the environment in which the expression in question evolved. (Descriptivists tend to be internalists, insofar as they hold that reference is fully determined by the speaker’s beliefs and discriminative abilities.) On the causal-historical view, the criteria for the correct application of a word are not, in general, introspectively accessible to competent speakers; one can competently use ‘gold’ or ‘Aristotle’ without knowing anything that would distinguish Aristotle from Plato, or gold from fool’s gold. Mistaken or ignorant speakers can still single out specific referents via these complex, communal, causal-historical mechanisms. (See Externalism about Content for more on this.) Contra the descriptivists, the causal-historical theorists argue that the meaning of a proper name is not some kind of descriptive sense (see Direct Reference; Proper Names: Philosophical Aspects; Reference: Philosophical Theories for discussion). From here, the conclusion that the semantic contribution of a name is just its referent looks compelling. This is why the theory has led to a resurgence of interest in the Millian view of proper names (i.e., the meaning of a name is just its referent) and in the Russellian approach to singular propositions (i.e., the proposition expressed by a sentence containing a name – say, ‘Kaplan is in California’ – is individuated solely in terms of the individual and property that it is about, as opposed to being individuated in terms of more finely-grained concepts or meanings). Many think that the causal-historical chain of transmission story about how a word refers to something in particular nicely complements, and fleshes out, these doctrines of Mill and Russell.
The causal-historical theory does not aim to give a reductive analysis of reference. For example, Kripke (1972: 96) says: ‘‘When the name is ‘passed from link to link,’ the receiver of the name must, I think, intend to use it with the same reference as the man from whom he heard it . . . [T]he preceding account hardly eliminates the notion of reference; on the contrary, it takes the notion of intending to use the same reference as a given.’’ (Cf. Kaplan’s [1990] discussion of the point that the intention to preserve reference is not itself a causal notion.) Thus, those who seek to naturalize reference, by reducing the relation of reference to something more scientifically respectable, must either significantly alter the causal-historical view, or look elsewhere.
The Causal Theory of Meaning In contrast, the causal theory of meaning (also called the ‘information-theoretic’ approach to meaning) is explicitly in the business of explaining semantic phenomena in non-semantic terms. The general aim here is a naturalistic account of the phenomenon of meaning, and the thought is that the notion of causation is the most promising place from which to start. Dretske (1981) is a seminal proponent of this approach, and Fodor (1987, 1990) develops related accounts. Stampe (1977), another influential proponent, gives the following programmatic sketch: ‘‘We have causal theories . . . of knowledge and memory, of belief, of evidence, of proper and common names, and of reference. If . . . these phenomena should turn out to have causal analyses, it will be no mere coincidence. Only their having something in common would make it so . . . [The root of this convergence] is that representation is essentially a causal phenomenon’’ (1977: 81). The general idea behind the causal theory of meaning is that linguistic meaning is a species of causal co-variance. Roughly, the goal is to show that ‘means’ means (more or less) the same thing in (1) and (2), that both cases are, at root, cases of reliable correlation: 1. Smoke means fire. 2. ‘Fire’ means fire. For a word to mean something in particular is for the word to reliably indicate that thing. Alternatively, a word ‘W’ means M if M tends to cause or bring about tokens of ‘W.’ (The account is intended to apply not only to tokens of ‘W’ that are actually uttered, but also, and more fundamentally, to occurrences of the word in thought.) If a satisfactory account of meaning were forthcoming down this avenue, this would be a monumental leap forward for the human and cognitive sciences. 
As yet, there is nothing remotely resembling a satisfactory scientific treatment of meaning; given the fundamental and pervasive roles that meaningful thoughts and utterances play in our lives, that is a rather large gap in our scientific understanding of human beings. (Note that Grice [1957] criticizes a view that he calls ‘the causal theory of meaning’ – the core of which is the idea that the meaning of an expression ‘E’ is (roughly) the content of the attitude that is prone to cause a speaker to utter ‘E,’ and that hearing ‘E’ is prone to cause in listeners. This view has not played a major role in the philosophy of language; but nonetheless some of Grice’s arguments against it are echoed in the criticisms, described in the next section, of the above information-theoretic causal theory.)
Problems and Prospects There are many problems with the causal-historical theory of reference (which are discussed at more length in Reference: Philosophical Theories). Evans (1973) and Searle (1983) develop counterexamples to the theory, cases where it seems to be committed to unwelcome consequences. Furthermore, many of the semantic views with which the theory has been allied (such as those of Mill and Russell mentioned earlier in the second section of this article) are controversial (see Direct Reference; Proper Names: Philosophical Aspects for discussion). More generally, the causal-historical view is just a sketchy picture – it does not offer anything like specific necessary or sufficient causal-historical conditions for identifying the referent of an utterance or inscription. Any utterance stands in an awful lot of causal relations to an indefinite range of things; to single out precisely which subset of these ubiquitous causal relations are semantically relevant – let alone precisely which of them are relevant to determining the referent of a particular use of a particular expression – is a daunting task that is as yet barely begun. The situation is worse for the (more reductionist, and so more ambitious) causal theory of meaning. It not only falls prey to the problems that befall the causal-historical approach to reference but also gives rise to some distinctive problems of its own. Basically, for almost any word-meaning pair ‘W’-M, it is not difficult to come up with conditions in which things distinct from M tend to cause ‘W’s, and conditions in which M does not tend to cause ‘W’s. For instance, in various sorts of suboptimal conditions, cows might tend to cause tokens of ‘horse,’ but nonetheless – regardless of how dark it is (or how far away they are, what they are disguised as, etc.)
– these cows are distinct from the meaning of ‘horse.’ In the other direction, if a horse was to ‘baa’ like a sheep, or was painted with zebra-stripes, or what have you, these misleading factors would affect its tendency to cause ‘horse’-tokens, but would not have the slightest effect on the fact that the term ‘horse’ correctly applies to it. In short, causation is a much more undiscriminating relation than meaning; and this is the source of all manner of problems for the project of using causation to build an account of meaning. There are many refinements of the basic causal-theoretic view, intended to skirt these elementary problems and their many variants. However, the consensus seems to be that this type of causal theory can only succeed in delivering an account of meaning that accommodates our intuitions about the normativity and determinacy of meaning (i.e., respectively, it is possible to misapply a term, and the terms ‘A’ and ‘B’
can differ in meaning even if all As are Bs) if it smuggles in semantic notions, and thus helps itself to meaning, as opposed to offering an account of meaning (for discussion, see Loewer, 1997). To sum up: the causal theory of reference is the view that a word refers to that to which it stands in the right sort of causal-historical relation. Since the 1970s, it has become the orthodox approach to reference. However, many problems remain to be worked out, for this general picture to yield a satisfactory, comprehensive account of reference. The causal theory of meaning is the view that the meaning of a word is that which reliably causes tokens of the word to be thought or uttered. Many take this to be the most promising avenue for a naturalistic account of meaning. However, there are reasons to think that the approach is too crude to yield an adequate account of linguistic meaning. At best, there are counterexamples that have yet to be satisfactorily addressed. See also: Direct Reference; Externalism about Content; Proper Names: Philosophical Aspects; Reference: Philosophical Theories; Sense and Reference: Philosophical Aspects.
Bibliography Devitt M & Sterelny K (1999). Language and reality (2nd edn.). Cambridge, MA: MIT Press. Donnellan K (1970). ‘Proper names and identifying descriptions.’ Synthese 21, 256–280. Donnellan K (1974). ‘Speaking of nothing.’ Philosophical Review 83, 3–32. Dretske F (1981). Knowledge and the flow of information. Cambridge, MA: MIT Press. Evans G (1973). ‘The causal theory of names.’ Proceedings of the Aristotelian Society 47, 187–208. Fodor J (1987). Psychosemantics. Cambridge, MA: MIT Press. Fodor J (1990). ‘A theory of content’ and other essays. Cambridge, MA: MIT Press. Grice H P (1957). ‘Meaning.’ Philosophical Review 66, 377–388. Kaplan D (1977). ‘Demonstratives.’ In Almog J, Perry J & Wettstein H (eds.) (1989) Themes from Kaplan. Oxford: Oxford University Press. 481–564. Kaplan D (1990). ‘Words.’ Proceedings of the Aristotelian Society 64, 93–120. Kripke S (1972). Naming and necessity. Cambridge, MA: Harvard University Press. Loewer B (1997). ‘A guide to naturalizing semantics.’ In Hale B & Wright C (eds.) A companion to the philosophy of language. Oxford: Blackwell. 108–126. Putnam H (1975). ‘The meaning of ‘‘meaning’’.’ In Gunderson K (ed.) Mind, language, and reality. Cambridge: Cambridge University Press. 131–193.
Searle J (1983). Intentionality. Cambridge: Cambridge University Press. Stalnaker R (1997). ‘Reference and necessity.’ In Hale B & Wright C (eds.) A companion to the philosophy of language. Oxford: Blackwell. 534–553.
Stampe D (1977). ‘Toward a causal theory of linguistic representation.’ Midwest Studies in Philosophy 2, 42–63.
Catalan M W Wheeler, University of Sussex, Brighton, UK © 2006 Elsevier Ltd. All rights reserved.
Geography and Demography The territories where Catalan is natively spoken cover 68 730 km², of which 93% lies within Spain (see Figure 1). They are:
1. The Principality of Andorra
2. In France: North Catalonia – almost all of the département of Pyrénées-Orientales
3. In Spain: Catalonia, except for the Gascon-speaking Vall d’Aran; the eastern fringe of Aragon; most of Valencia (the Comunitat Valenciana), excepting some regions in the west and south that have been Aragonese/Spanish-speaking since at least the 18th century; El Carxe, a small area of the province of Murcia, settled in the 19th century; and the Balearic Islands
4. In Italy: the port of Alghero (Catalan L’Alguer) in Sardinia
Table 1 shows the population of these territories (those over 2 years of age in Spain) and the percentages of the inhabitants who can understand, speak, and write Catalan. Information is derived from the 2001 census in Spain together with surveys and other estimates; the latter are the only sources of language data in France and Italy. The total number of speakers of Catalan is a little under 7.5 million. Partly as a result of the incorporation of Catalan locally into the education system, there are within Spain a significant number of second-language speakers who are included in this total. Virtually all speakers of Catalan are bilingual, using also the major language of the state they live in. (Andorrans are bilingual in Spanish or French, or are trilingual.)
Genetic Relationship and Typological Features Catalan is a member of the Romance family and a fairly prototypical one, as befits its geographically central position in the European Romance area. Some particularly noteworthy characteristics are pointed out here (for more details see Wheeler, 1988). In historical phonology, note the palatalization of initial /l-/ and loss of stem-final /n/ that became word final, for example, LEONEM > lleó [ʎeˈo] ‘lion.’ Original intervocalic -C0 -, -TJ-, -D- became /w/ in word-final position and were lost elsewhere, for example, PLACET > plau [ˈplaw] ‘please.3.SING,’ PLACEMUS > plaem [pleˈem] ‘please.1.PL.’ As the previous examples also illustrate, posttonic nonlow vowels were lost, so that a dominant pattern of phonological words is of consonant-final oxytones. The full range of common Romance verbal inflection is retained, including inflected future (sentirà ‘hear.3.SING.FUT’), widely used subjunctives, and a contrast between present perfect (ha sentit ‘has heard’) and past perfective (sentí ‘heard.3.SING.PERF’). In addition to the inherited past perfective form, now largely literary, Catalan developed a periphrastic past perfective using an auxiliary that was originally the present of ‘go’ (va sentir ‘AUX.PERF.3.SING hear.INF’). In some varieties of Catalan, this construction has developed a subjunctive (vagi sentir ‘AUX.PERF.SUBJ.3.SING hear.INF’), introducing, uniquely in Romance, a perfective/imperfective aspect distinction in the subjunctive. Considerable use is made of pronominal and adverbial clitics that attach to verb forms in direct and indirect object functions or partitive or adverbial functions, quite often in clusters of two or three, as in (1).
(1) us n’hi envi-en
2.PL.OBJ PART.LOC send-3.PL
‘‘they send some to you (PL) there’’
Most of the pronominal/adverbial clitics have several contextually conditioned forms; thus, the partitive clitic shows variants en ~ n’ ~ -ne. Clitic climbing is commonly found with a pronominal complement of a verb that is itself the complement of a (semantic) modal, as in (2). This example also shows the (optional) gender agreement of a perfect participle with a preceding direct object clitic.
(2) no l’he sab-ud-a agafa-r
not DO.3.SING.F.have.1.SING know-PART-F catch-INF
‘‘I haven’t been able to catch it (FEM)’’
Causatives: Semantics J J Song, University of Otago, Dunedin, New Zealand © 2006 Elsevier Ltd. All rights reserved.
Defining Causative Constructions The causative construction is a linguistic expression that denotes a complex situation consisting of two events: (1) the causing event in which the causer does something, and (2) the caused event in which the causee carries out an action or undergoes a change of condition or state as a result of the causer’s action. The following example is such a linguistic expression. (1) The teacher made Matthew paint the house
In (1), the causer (the teacher) did something, and as a result of that action the causee (Matthew) in turn carried out the action of painting the house. The causative construction has two main characteristics. First, the causer noun phrase and the expression of cause must be foregrounded, with the causee noun phrase and the expression of effect backgrounded. The foregrounding of the causer noun phrase and the expression of cause is achieved by putting these two expressions in grammatically more prominent positions in the sentence than the causee noun phrase and the expression of effect. Second, the expression of the causer’s action must be without specific meaning; all that is encoded by that expression is the pure notion of cause. For instance, the sentence in (2), although denoting a causative situation similar to (1), is not regarded as an example of the causative construction but rather as an example of what may be referred to broadly as the causal construction.
(2) Matthew painted the house because the teacher instructed him to do so
There are two clear differences between (1) and (2). First, in (1) the causer noun phrase, the teacher, and the expression of cause, made, are the subject and the main predicate of the sentence, respectively (i.e., they are foregrounded). The causee noun phrase and the predicate of effect, on the other hand, appear as a nonsubject noun phrase and a subordinate predicate, respectively (i.e., they are backgrounded). This situation is reversed in (2); the causee noun phrase and the expression of effect appear as the subject and the predicate of the main clause, respectively, with both the causer noun phrase and the expression of cause located in the subordinate clause. Second, in (1) the expression of the causer’s action, made, lacks specific lexical content. In (2), on the other hand, the expression of the causer’s action, instructed, has specific lexical content.
Types of Causative Constructions The most widely known classification of causatives is based on the formal fusion between the predicate of cause and that of effect. In this classification, three different types of causative are recognized: (1) lexical, (2) morphological, and (3) syntactic. The lexical causative type involves suppletion (no formal similarity between the noncausative verb and its causative counterpart). In this type, the formal fusion of the expression of cause and of effect is maximal, with the effect that the causative verb cannot be analyzed into two morphemes. Examples of
this type include English die vs. kill and German sterben ‘to die’ vs. töten ‘to kill.’ In the morphological type, the expression of cause is in the form of a derivational affix, with the expression of effect realized by a basic verb to which that affix is attached. In Japanese, for example, the suffix -(s)ase can apply to basic verbs to derive causative verbs, for example, ik- ‘[X] to go’ vs. ik-ase- ‘to cause [X] to go.’ The causative morpheme can be in the form of not only suffixes but also prefixes, infixes, and circumfixes. In the syntactic type, the expression of cause and of effect are separate verbs, and they occur in different clauses. This type has already been exemplified by (1). Swahili provides another good example (Vitale, 1981: 153).
(3) Ahmed a-li-m-fanya mbwa a-l-e samaki mkubwa
Ahmed he-PAST-him-make dog he-eat-SUBJ fish large
‘Ahmed made the dog eat a large fish’
The three causative types must be understood to serve only as reference points. There are languages that fall somewhere between any two of the ideal types. For instance, Japanese lexical causative verbs lie between the lexical type and the morphological type because they exhibit degrees of physical resemblance – from almost identical to totally different – to their corresponding noncausative verbs, for example, tome- ‘to cause [X] to stop’ vs. tomar- ‘[X] to stop,’ oros- ‘to bring down’ vs. ori- ‘to come down,’ age- ‘to raise’ vs. agar- ‘to rise,’ and koros- ‘to kill’ vs. sin- ‘to die.’
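The morphological type described above can be illustrated with a short Python sketch of the Japanese suffix -(s)ase. It assumes romanized stems and reads the parenthesized (s) as surfacing only after vowel-final stems; the function name causative_stem and the stem tabe- ‘to eat’ are illustrative additions, not taken from the article, and irregular verbs are not handled.

```python
VOWELS = set("aeiou")


def causative_stem(stem: str) -> str:
    """Derive a causative stem by attaching -(s)ase to a romanized verb stem.

    Assumption: -sase follows vowel-final stems and -ase follows
    consonant-final stems. Irregular verbs are outside this sketch.
    """
    bare = stem.rstrip("-")  # accept the citation form "ik-" as well as "ik"
    suffix = "sase" if bare[-1] in VOWELS else "ase"
    return f"{bare}-{suffix}-"


print(causative_stem("ik-"))    # consonant-final stem, as in the article's example
print(causative_stem("tabe-"))  # vowel-final stem (illustrative)
```

Because the derived form is still segmentable into stem and affix, the fusion of cause and effect here is weaker than in a suppletive pair such as die vs. kill, but stronger than in a two-verb syntactic causative.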
The Semantics of Causatives: Two Major Types of Causation As previously described, the causative construction is a linguistic expression that denotes a situation consisting of two events: (1) the causing event in which the causer does something, and (2) the caused event in which the causee carries out an action or undergoes a change of condition or state as a result of the causer’s action. There are two mixed but distinct levels of description contained in this definition: the level of events and the level of participants. The first level is where the relationship between the causing event and the caused event is captured. The second level concerns the interaction between the causer and the causee. Most descriptions of the semantics of causatives revolve around these two levels of description. Two major causation types – the distinction between direct and indirect causation, and the distinction between manipulative and directive causation – are discussed in this article because they are
most highly relevant to the three causative types (lexical, morphological, and syntactic) previously described. The first semantic type of causation is based on the level of events; and the second is based on the level of participants. The distinction between direct and indirect causation hinges on the temporal distance between the causing event and the caused event. If the caused event is temporally adjacent to the causing event, without any other event intervening between them, the overall causative situation may be regarded as direct. For example, if X makes Y fall into the river by pushing Y, the causing event of X pushing Y immediately precedes the caused event of Y’s falling into the river. There is no intervening or intermediary event that plays a role in the realization of the caused event; in direct causation, the caused event is immediately temporally adjacent to the causing event. As a matter of fact, the temporal distance between cause and effect in direct causation may be so close that it sometimes becomes difficult perceptually, if not conceptually, to divide the whole causative situation into the causing event and the caused event (e.g., the cat jumped as John slammed the door). Thus, direct causation represents a causative situation in which the causing event and the caused event abut temporally on one another, the former immediately preceding the latter. Indirect causation, on the other hand, involves a situation in which the caused event may not immediately follow the causing event in temporal terms. There will be at least one event intervening between the causing and caused events. In order for this to be the case, however, the temporal distance between the two events must be great enough for the whole causative situation to be divided clearly into the causing event and the caused event. For example, X fiddles with Y’s car, and days later Y is injured in a car accident due to the failure of the car. 
In this situation, the causing event is X’s fiddling with Y’s car and the caused event is Y’s getting injured in the accident. But these events are separated temporally from one another by the intermediary event (the failure of the car). The intervening event plays an important role in bringing about the caused event. Note that, although this causative situation is indirect, the caused event is connected temporally with the causing event in an inevitable flow or chain of events: Y’s accident caused by the failure of the car and the failure of the car in turn caused by X’s fiddling with it (e.g., Croft, 1991). There can potentially be more than one event intervening between the causing event and the caused event in indirect causation. The other level of description involves the major participants of the causative situation, namely the causer and the causee. Depending on the nature and
extent of the causer’s relationship with the causee in the realization of the caused event, the causative situation may be either manipulative or directive. If the causer acts physically on the causee, then the causative situation is regarded as manipulative. The causer manipulates the causee in bringing about the caused event. The situation used previously to exemplify direct causation is also manipulative because the causer physically pushes the causee into the river. In other words, this particular causative situation represents direct and manipulative causation. The causer may rely on an intermediary physical process or means in effecting the caused event. For example, if X causes Y to fall by pushing a shopping trolley straight into Y, the causer effects the caused event through some physical means, as in the case of direct manipulative causation already discussed. But this intermediary physical process also represents an independent event intervening between the causing event and the caused event – in fact, this intermediary event itself constitutes a causative situation consisting of a causing event (X exerting physical force directly on the shopping trolley) and a caused event (the shopping trolley rolling straight into Y). The causative situation in question may thus be regarded as indirect and manipulative causation. The causer may also draw on a nonphysical (e.g., verbal or social) means in causing the causee to carry out the required action or to undergo the required change of condition or state. For example, if medical doctor X causes patient Y to lie down for a medical examination by giving Y an instruction or direction to do so, the causative situation is directive causation. This particular situation is also direct in that there is no other event intervening between the causing event and the caused event – Y’s lying down is immediately temporally adjacent to X’s uttering the instruction. 
Again, directive causation may also be indirect rather than direct. For example, if X causes Y to type a letter by giving Z an instruction to cause Y to do the typing, then we are dealing with indirect directive causation (e.g., I had the letter typed by Tim by asking Mary to tell him to do so). The caused event is separated from the causing event by the intervening event of Z asking Y to comply with X’s original instruction.
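The two levels of description discussed above cross-classify, giving four combinations. A minimal Python sketch of that cross-classification follows; the class CausativeSituation, its field names, and the function classify are hypothetical labels introduced only for illustration, and the event counts are a simplification of the prose examples.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CausativeSituation:
    description: str
    intervening_events: int  # events between the causing and the caused event
    physical_means: bool     # True if the causer acts physically, False if verbally or socially


def classify(situation: CausativeSituation) -> Tuple[str, str]:
    """Return (event-level type, participant-level type)."""
    directness = "direct" if situation.intervening_events == 0 else "indirect"
    means = "manipulative" if situation.physical_means else "directive"
    return directness, means


examples = [
    CausativeSituation("X pushes Y into the river", 0, True),
    CausativeSituation("X pushes a shopping trolley into Y", 1, True),
    CausativeSituation("doctor X tells patient Y to lie down", 0, False),
    CausativeSituation("X asks Z to tell Y to type a letter", 1, False),
]

for s in examples:
    print(s.description, "->", " ".join(classify(s)))
```

The sketch treats directness as a matter of counting intervening events only; the article's further condition, that the temporal distance be large enough to separate the two events clearly, is not modeled.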
Causative Continuum and Causation Types There is a strong correlation between the causative and the causation types. The three causative types – lexical, morphological, and syntactic – can be interpreted as forming a continuum of formal fusion or
Figure 1 Continuum of formal fusion.
physical propinquity between the expressions of cause and of effect, as schematized in Figure 1. There is a strong tendency for manipulative or direct causation to be mapped onto the causative types on the left of the continuum in preference to those on the right of the continuum. Directive or indirect causation, on the other hand, is far more likely to be expressed by the causative types on the right of the continuum than by those on the left of the continuum. This is often cited in the literature as an excellent example in support of iconic motivations in language. Iconic motivation (or iconicity) is the principle that the structure of language should, as closely as possible, reflect the structure of what is expressed by language (e.g., Haiman, 1985). Recently, the correlation between the causative and causation types has been reinterpreted as that between the degree of difficulty in bringing about the caused event and the degree of transparency in expressing the notion of causation (Shibatani, 2002). For example, directive (as opposed to manipulative) causation involves a nonphysical (verbal or social) means of causing the causee to carry out the required action or to undergo the required change of condition or state. Directive causation entails a higher degree of difficulty in bringing about the caused event than manipulative causation. For one thing, in directive causation the causer relies on the causee’s cooperation; the (prospective) causee can refuse to comply with the (prospective) causer’s wish or demand. This higher degree of difficulty in bringing about the caused event is then claimed to be reflected by the tendency for directive causation to be expressed by the causative types to the right, rather than the left, on the continuum. 
The notion of causation is much more transparently encoded in the syntactic causative (i.e., a separate lexical verb of cause) than in the lexical causative, where the notion of causation is not expressed by a separate morpheme, let alone by a separate verb. Moreover, there is a large amount of crosslinguistic evidence in support of the case marking of the causee being determined by semantic factors relating to the agency, control, affectedness, or even topicality of the main participants of the causative situation (e.g., Cole, 1983). In Bolivian Quechua, for example, the causee noun phrase is marked by the accusative case if the causee is directly under the causer’s authority and has no control over his or
her action. If, however, the causee has control over his or her action but complies voluntarily with the causer’s wish, the causee noun phrase appears in the instrumental case. Some linguists have made an attempt to reinterpret such variable case marking to reflect the conceptual integration of the causee in the causative event as a whole (Kemmer and Verhagen, 1994). This fits in well with the view that the simple noncausative clause pattern serves as a structural model for morphological causatives (Song, 1996). The causative of intransitive verbs is based on the transitive clause pattern, and the causative of transitive verbs is based on either the ditransitive clause pattern or the transitive clause pattern with an adjunct. See also: Affixation; Iconicity; Iconicity: Theory; Inflection and Derivation; Morphological Typology.
Bibliography Cole P (1983). ‘The grammatical role of the causee in universal grammar.’ International Journal of American Linguistics 49, 115–133. Comrie B (1976). ‘The syntax of causative constructions: cross-language similarities and divergences.’ In Shibatani M (ed.) Syntax and semantics 6: the grammar of causative constructions. New York: Academic Press. 261–312. Comrie B (1989). Language universals and linguistic typology (2nd edn.). Oxford: Blackwell. Comrie B & Polinsky M (eds.) (1993). Causatives and transitivity. Amsterdam & Philadelphia: John Benjamins.
Croft W (1991). Syntactic categories and grammatical relations: the cognitive organization of information. Chicago: University of Chicago Press. Dixon R M W (2000). ‘A typology of causatives: form, syntax and meaning.’ In Dixon R M W & Aikhenvald A Y (eds.) Changing valency: case studies in transitivity. Cambridge, UK: Cambridge University Press. 30–83. Haiman J (1985). Natural syntax: iconicity and erosion. Cambridge, UK: Cambridge University Press. Kemmer S & Verhagen A (1994). ‘The grammar of causatives and the conceptual structure of events.’ Cognitive Linguistics 5, 115–156. Saksena A (1982). ‘Contact in causation.’ Language 58, 820–831. Shibatani M (1976). ‘The grammar of causative constructions: A conspectus.’ In Shibatani M (ed.) Syntax and semantics 6: the grammar of causative constructions. New York: Academic Press. 1–40. Shibatani M (2002). ‘Introduction: some basic issues in the grammar of causation.’ In Shibatani M (ed.) The grammar of causation and interpersonal manipulation. Amsterdam & Philadelphia: John Benjamins. 1–22. Song J J (1995). ‘Review of B. Comrie, and M. Polinsky (ed.) Causatives and transitivity.’ Lingua 97, 211–232. Song J J (1996). Causatives and causation: a universal-typological perspective. London & New York: Addison Wesley Longman. Song J J (2001). Linguistic typology: morphology and syntax. Harlow and London: Pearson Education. Talmy L (1976). ‘Semantic causative types.’ In Shibatani M (ed.) Syntax and semantics 6: the grammar of causative constructions. New York: Academic Press. 43–116. Vitale A J (1981). Swahili syntax. Dordrecht & Cinnaminson: Foris Publications.
Caxton, William (ca. 1415–1491) W Hüllen, Dusseldorf, Germany © 2006 Elsevier Ltd. All rights reserved.
William Caxton (Cauxton, Causton) was born in Tenterden, Kent, some time between 1411 and 1422, and died in Westminster in 1491. After an apprenticeship as a mercer in London, he left for Bruges in 1446, where he went into business on his own. He stayed there for 30 years acting as a governor of the Merchant Adventurers between 1462 and 1465, a post that gave him considerable influence in the supervision of trade between the Low Countries and England. After 1468, he was able to establish close contact with the Duke of Burgundy, who had married Edward IV’s sister. But after 1470 he
relinquished all his commercial and political offices for good. Caxton’s interests in printing and in translating went hand in hand because, in addition to other titles, he was eager to print his own works. Between 1471 and 1474, he informed himself in Cologne about printing techniques but published his translation of The recuyell of the histories of Troy, which he had begun as a preventive against idleness (Dictionary of National Biography), in Bruges in the latter year at a press owned by Colard Mansion. It was the first book printed with movable letters in the English language. He then moved to London, where he stayed for the rest of his life. In 1477, he issued The dictes and sayings of the philosophers from his own press. It was the first English book printed in England.
268 Causatives: Semantics
her action. If, however, the causee has control over his or her action but complies voluntarily with the causer’s wish, the causee noun phrase appears in the instrumental case. Some linguists have made an attempt to reinterpret such variable case marking to reflect the conceptual integration of the causee in the causative event as a whole (Kemmer and Verhagen, 1994). This fits in well with the view that the simple noncausative clause pattern serves as a structural model for morphological causatives (Song, 1996). The causative of intransitive verbs is based on the transitive clause pattern, and the causative of transitive verbs is based on either the ditransitive clause pattern or the transitive clause pattern with an adjunct. See also: Affixation; Iconicity; Iconicity: Theory; Inflection and Derivation; Morphological Typology.
Bibliography
Cole P (1983). ‘The grammatical role of the causee in universal grammar.’ International Journal of American Linguistics 49, 115–133.
Comrie B (1976). ‘The syntax of causative constructions: cross-language similarities and divergences.’ In Shibatani M (ed.) Syntax and semantics 6: the grammar of causative constructions. New York: Academic Press. 261–312.
Comrie B (1989). Language universals and linguistic typology (2nd edn.). Oxford: Blackwell.
Comrie B & Polinsky M (eds.) (1993). Causatives and transitivity. Amsterdam & Philadelphia: John Benjamins.
Croft W (1991). Syntactic categories and grammatical relations: the cognitive organization of information. Chicago: University of Chicago Press.
Dixon R M W (2000). ‘A typology of causatives: form, syntax and meaning.’ In Dixon R M W & Aikhenvald A Y (eds.) Changing valency: case studies in transitivity. Cambridge, UK: Cambridge University Press. 30–83.
Haiman J (1985). Natural syntax: iconicity and erosion. Cambridge, UK: Cambridge University Press.
Kemmer S & Verhagen A (1994). ‘The grammar of causatives and the conceptual structure of events.’ Cognitive Linguistics 5, 115–156.
Saksena A (1982). ‘Contact in causation.’ Language 58, 820–831.
Shibatani M (1976). ‘The grammar of causative constructions: a conspectus.’ In Shibatani M (ed.) Syntax and semantics 6: the grammar of causative constructions. New York: Academic Press. 1–40.
Shibatani M (2002). ‘Introduction: some basic issues in the grammar of causation.’ In Shibatani M (ed.) The grammar of causation and interpersonal manipulation. Amsterdam & Philadelphia: John Benjamins. 1–22.
Song J J (1995). ‘Review of B. Comrie and M. Polinsky (eds.) Causatives and transitivity.’ Lingua 97, 211–232.
Song J J (1996). Causatives and causation: a universal-typological perspective. London & New York: Addison Wesley Longman.
Song J J (2001). Linguistic typology: morphology and syntax. Harlow and London: Pearson Education.
Talmy L (1976). ‘Semantic causative types.’ In Shibatani M (ed.) Syntax and semantics 6: the grammar of causative constructions. New York: Academic Press. 43–116.
Vitale A J (1981). Swahili syntax. Dordrecht & Cinnaminson: Foris Publications.
Caxton, William (ca. 1415–1491)
W Hüllen, Düsseldorf, Germany
© 2006 Elsevier Ltd. All rights reserved.
William Caxton (Cauxton, Causton) was born in Tenterden, Kent, some time between 1411 and 1422, and died in Westminster in 1491. After an apprenticeship as a mercer in London, he left for Bruges in 1446, where he went into business on his own. He stayed there for 30 years, acting as a governor of the Merchant Adventurers between 1462 and 1465, a post that gave him considerable influence in the supervision of trade between the Low Countries and England. After 1468, he was able to establish close contact with the Duke of Burgundy, who had married Edward IV’s sister. But after 1470 he relinquished all his commercial and political offices for good.

Caxton’s interests in printing and in translating went hand in hand because, in addition to other titles, he was eager to print his own works. Between 1471 and 1474, he studied printing techniques in Cologne, and in 1474 he published his translation of The recuyell of the histories of Troy, which he had begun as a preventive against idleness (Dictionary of National Biography), in Bruges at a press owned by Colard Mansion. It was the first book printed with movable letters in the English language. He then moved to London, where he stayed for the rest of his life. In 1477, he issued The dictes and sayings of the philosophers from his own press. It was the first English book printed in England.
Between then and his death he produced the remarkable output of about 70 books, almost all of them in folio, 21 being his own translations. He edited Chaucer (providing the editio princeps of the Canterbury tales), Lydgate, Gower, The chronicle of Brut, and also pamphlets, horae, and speeches. He translated French versions of Latin classical literature and of the philosophers and issued similar translations by others. For experts in printing techniques, his works are recognizable by the founts, which, however, he changed six times, and by such conspicuous signs as the absence of title pages, of ordinary commas and full stops, and of catchwords at the foot of each page. He was the first printer to include woodcuts.

Although he ushered in the new era of printed culture, his influence on the history of English is somewhat indirect. He did not contribute to the standardization of spelling, which was achieved only a century later. His own personal style, as visible in the prologues and epilogues of his editions, was quite traditional, with Germanic lexis and syntax, including alliteration. In his translations, however, he conformed to the style of his mostly French authors, which led to a massive acceptance of French words, the abundant use of synonyms, elaborate forms of address, rhetorical figures, etc. Caxton’s connection with the court of Burgundy may have been the personal background for this. In doing so, and by the sheer mass of books he produced, he supported and reinforced the development of a ‘curial’ (or ‘clergial,’ ‘aureate’) style that was typical of the development of Middle English and brought the language into its own.

His edition of a book of French and English conversations, which he probably translated himself, shows a new way of teaching foreign languages in schools, in which the old habit of printing topically ordered vocabulary is embedded in a method of presenting natural dialogues and role play.

See also: Classroom Talk; English, Early Modern; English Spelling: Rationale; Translation: Pragmatics; Western Linguistic Thought Before 1800.
Bibliography
Blades W (1971). The biography and typography of William Caxton, England’s first printer. Totowa, NJ: Rowman and Littlefield.
Blake N F (1969). Caxton and his world. London: Deutsch.
Crotch W J B (1928). The prologues and epilogues of William Caxton. London: EETS, Oxford University Press (Reprinted 1973, Millwood, NY: Kraus).
Hogg R M (ed.) (1992, 1999). The Cambridge history of the English language, vol. II, Blake N F (ed.): 1066–1476; vol. III, Lass R (ed.): 1476–1776.
Hüllen W (1999). English dictionaries 800–1700: the topical tradition. Oxford: Clarendon.
Cayman Islands: Language Situation
Editorial Team
© 2006 Elsevier Ltd. All rights reserved.
The official language of the Cayman Islands is English, with 36 000 speakers. Literacy is 98%. There are also sizable minority languages: Haitian Creole French, French, and Spanish. Cayman Islands English, although structurally similar to a creole, seems to have borrowed some creole features of Jamaican without having undergone creolization. Unlike many of the other islands in the Caribbean, the Cayman Islands were not directly subject to the pressures exerted by slavery and the plantation system. Instead, early settlers were turtle fishers and wreck salvagers. Population growth forced some emigration to islands off the coast of Central America, the Bay Islands and Corn Island. Strong trading ties also exist with Belize and Jamaica. These population movements and commercial links probably contributed to language contact and borrowing; however, the lack of a large non-English population seems to have inhibited the development of a true creole.

See also: Haiti: Language Situation.
Bibliography Holm J (1989). Pidgins and creoles 2: reference survey. Cambridge: Cambridge University Press (esp. pp. 479–480).
Cebuano
J U Wolff, Cornell University, Ithaca, NY, USA
© 2006 Elsevier Ltd. All rights reserved.
Cebuano is spoken in the central and southern Philippines. It is a member of the Austronesian family of languages, the group of languages spoken throughout most of Indonesia, northward into the Philippines and Taiwan, and eastward through much of Papua New Guinea and over the Pacific as far as Hawaii and Easter Island. The languages of the Philippines, with the exceptions of the Spanish Creoles, Chabacano and Chavacano, are closely related and typologically similar to one another. In particular, Cebuano is subgrouped with Tagalog and is similar to Tagalog in much the same way as Italian and Spanish are similar to each other (see Tagalog). Cebuano is called Sinugba anun or Sinibuwanú natively, and is sometimes referred to as ‘Sugbuanon’ in the literature about the language. Cebuano is also commonly called ‘Visayan’ (Binisaya natively), after the name of the region of the central Philippines. However, there are in fact more than 30 languages spoken in this area, all of which are referred to as ‘Visayan,’ so that many publications referring to ‘Visayan’ have to do with languages other than Cebuano.

Cebuano is spoken by roughly a fifth of the population of the Philippines. It is thus second only to Tagalog in number of speakers. Throughout the 20th century Cebuano was widely used as a lingua franca in Mindanao and was almost universally known as a second language by those in Mindanao who were not native speakers of Cebuano. At the present time Tagalog is gaining as the lingua franca at the expense of Cebuano, and in Mindanao, as throughout the Cebuano speech area, native speakers of Cebuano are increasingly learning Tagalog as a second language.

Cebuano is considered a language of the home and social intercourse, and as such enjoys little prestige and is excluded from settings that are considered official or involve people of high rank. For these settings English is used. Further, the educated classes use English as a code together with Cebuano in social settings. Church services that aim at a lower-class audience are in Cebuano, but those aiming at an upper-class congregation are held in English. Books are in English, and English is the official medium of instruction, although for practical reasons teachers frequently resort to Cebuano at the primary and even secondary levels (the children do not understand English). As a result of the emphasis given to English in the educational system and Cebuano’s lack of prestige, the elite know Cebuano only poorly and speak a kind of basic Cebuano mixed with English, which does not make full use of the rich vocabulary and grammatical apparatus that would allow for eloquence. The best knowledge of Cebuano and its most eloquent use are found among low-status groups, people with little education and little access to English.

Cebuano was widely used in mass media until the middle of the 20th century, but in recent years Tagalog has become more and more widespread. There are still radio programs in Cebuano, and there is one weekly, Bisaya, distributed throughout the Cebuano-speaking area, which is aimed at a readership with little education.

Cebuano was first recorded in 1521 in a word list written down by Pigafetta, Magellan’s chronicler, when Magellan’s expedition made its ill-fated stop in Cebu. Catechisms in Cebuano were composed in the years shortly after the first Spanish colonization in 1564, and the translations made at this time are still in use. The earliest dictionaries and grammatical sketches were composed during the 17th century, although none of these were published until the 18th century. Otherwise no literature antedating the 20th century survives, but the beginning of the 20th century saw a surge of interest in Cebuano and the beginnings of a rich literary production, which gradually diminished from the 1920s and 1930s to the point that now very little is being written.

The early dictionaries and catechisms of Cebuano show that the language has changed considerably since the 17th century. Many of the verb forms used in the catechisms and cited in the earliest dictionary are no longer used (although remnants are found in rural dialects), and others are confined to ceremonious or particularly fancy styles and are absent from normal speech. In vocabulary, too, the language has changed considerably. At least one-third of the listings in the major Cebuano dictionary by Fr. Juan Felix de la Encarnación, which dates from the middle of the 19th century, were unknown to more than 100 informants queried during the 1960s and 1970s.
What Cebuano Is Like in Comparison with Tagalog

Cebuano is typologically like the other languages of the Philippines, and most similar to Tagalog (see Tagalog). The sound systems of the two languages are similar, but they have a very different rhythm, for two reasons. First, Tagalog loses the glottal stop in any position except before pause, whereas Cebuano pronounces the glottal stop with a sharp clear break, giving a staccato effect to the language. Second, Tagalog has short and long vowels, with no limit on the number of long vowels within a word or on the syllable on which length occurs. Cebuano has few long vowels, and only on the final syllable.

The Tagalog and Cebuano consonant inventories are exactly the same. The vowels are different, however. Cebuano has only three vowels, /i/, /a/, and /u/. (Some dialects retain a fourth central vowel, schwa, inherited from Proto-Austronesian, but this has merged with /u/ in the Cebuano of Cebu City.) The vowels /a/ and /u/ may occur lengthened in the final syllable. Stress is contrastive and occurs on the final or the penult. There can be no more than one long vowel in a word.

The Cebuano verb system is similar to Tagalog’s but not commensurate with it: the Cebuano verb expresses tense (action started or not), and also has special tenseless forms, which are used when the verb is preceded by an adverb or phrase that expresses tense. Each of these three verb forms is either durative or nondurative (punctual), as exemplified below:

(1) Action started, punctual vs. action started, durative:
misulʔub siya ug pula
put-on she OBJ red
‘she put something red on’

nagsulʔub siya ug pula
is-wearing she OBJ red
‘she is (was) wearing something red’

(2) Action not started, punctual vs. durative:
musulʔub siya ug pula
put-on she OBJ red
‘she will put on something red’

magsulʔub siya ug pula
is-wearing she OBJ red
‘she will be wearing something red’

(3) Tenseless verb, punctual vs. durative:
wa siya musulʔub ug pula
not she put-on OBJ red
‘she didn’t put something red on’

wa siya magsulʔub ug pula
not she is-wearing OBJ red
‘she wasn’t wearing red’
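The prefixes seen in examples (1)–(3) can be read as a small lookup table keyed by tense and aspect. The following Python sketch is only an illustration under stated assumptions: the names ACTIVE_PREFIXES and active_form and the category labels are introduced here, the root is written without marking its glottal stop, and the table covers nothing beyond the six active forms exemplified above.

```python
# Illustrative only: prefixes are read off examples (1)-(3) in the text.
# The table name, function name, and labels are assumptions of this sketch.
ACTIVE_PREFIXES = {
    ("started", "punctual"): "mi",
    ("started", "durative"): "nag",
    ("not_started", "punctual"): "mu",
    ("not_started", "durative"): "mag",
    # In the examples, the tenseless forms coincide with the
    # 'action not started' forms.
    ("tenseless", "punctual"): "mu",
    ("tenseless", "durative"): "mag",
}


def active_form(root: str, tense: str, aspect: str) -> str:
    """Return prefix + root for one of the six exemplified combinations."""
    return ACTIVE_PREFIXES[(tense, aspect)] + root


if __name__ == "__main__":
    # Root for 'put on / wear', written here without the glottal stop.
    for (tense, aspect) in ACTIVE_PREFIXES:
        print(tense, aspect, active_form("sulub", tense, aspect))
```

Keeping the tenseless row separate, even though its prefixes match the ‘not started’ row in these examples, keeps the three-way tense distinction described in the text visible.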
A system of affixes which show prepositionlike relationships, analogous to that shown by the Tagalog verb, cuts across this tense–aspect system of Cebuano: the Cebuano verbs contain morphemes which express the relation between the verb and a word it refers to. The verb may refer to the agent (active voice), the patient of the action (direct passive), the thing moved or said (conveyance passive), the instrument of the action, the place of the action, the beneficiary of the action, or (peculiarly for Cebuano) time of the action:
(4) (Active)
Mipalit siya ug ságing
bought he/she OBJ bananas
‘he bought some bananas [that’s what he did]’

(5) (Patient)
Gipalit níya ang ságing
bought-it by-him the bananas
‘he bought the bananas [that’s what happened to the bananas]’

(6) (Place)
bálik ta sa gipalitan nímu ug ságing
let’s-go-back we to was-bought-at by-you OBJ bananas
‘let’s go back to the place you bought some bananas’

(7) (Instrument)
Maʔu na y ipalit nímu ug ságing
is-the-one that the-one-that will-buy-with-it by-you OBJ bananas
‘that is the thing [money] you will use to buy bananas with’

(8) (Beneficiary)
Putling Maríya igʔampu mu kami
Virgin Mary pray-for by-you us
‘Virgin Mary pray for us’
These verbal inflections are added to roots. In addition, new stems can be formed by adding one or more derivational affixes that have meanings similar to those found in Tagalog (see Tagalog).

Cebuano has a system of deictics and demonstrative pronouns that is a good deal more complex than that of Tagalog. The deictics in Cebuano distinguish tense when initial in the clause: e.g., dinhi ‘was here’, níʔa ‘is here’, anhi ‘will be here’. They distinguish four distances: díʔa ‘is here near me (but not near you)’, níʔa ‘is here (near you and me)’, náʔa ‘is there (near you but not near me)’, túʔa ‘is there (far from both of us)’. When final in the clause, the deictics distinguish motion from nonmotion: didtu ‘there (far away)’, ngadtu ‘going there (far away)’. The interrogative forms for ‘when’ and ‘where’ also distinguish tense.

The changes that Cebuano has undergone since the earliest attestations amount to the loss of distinctions. This can be accounted for partly by the fact that Cebuano has been brought to new areas and has spread to populations formerly speaking other languages, and partly by the fact that there has never been a prescriptive tradition that derogates deviant forms. The four-vowel system, which Cebuano inherited from the protolanguage, has been reduced to three, except in the case of rural dialects. Further, the category durative vs. punctual, which characterizes the verbal system, has in historical times been lost in the passive verbs except in ceremonial styles. Many of the derivational affixes forming verb stems that were productive in pre-19th-century attestations of the language are now confined to petrified forms.

In the past two generations Tagalog has influenced an important component of the verbal system, causing the loss of the tenseless forms, although in rural speech this part of the system is still intact. Further, the system of deictics has been simplified among speakers influenced by Tagalog: tense has been lost, the four-way distance distinction has been reduced to two (i.e., ‘here’ vs. ‘there’), and the distinction between deictics expressing motion and those that do not has been lost. These changes are most strongly observed in areas that, or among groups who, have contact with Tagalog speech, and from this population these simplifications spread elsewhere in the Cebuano speech community.

Cebuano morphology differs in type from that of Tagalog in two ways. First, affixational patterns are regular and predictable in Tagalog, but in Cebuano they are not: whereas in Tagalog the paradigms are normally filled out for all roots with a given meaning type, in Cebuano many affixes are capriciously distributed, quite irrespective of the semantic qualities of the root. Second, there are numerous variations in affixation and in some of the interrogatives, distributed by areas and individual speakers. Tagalog has much less variation.

See also: Affixation; Austronesian Languages: Overview; Deixis and Anaphora: Pragmatic Approaches; Demonstratives; Philippines: Language Situation; Tagalog.
Bibliography
Cabonce R (1983). An English-Cebuano Visayan dictionary. Manila: National Book Store.
Encarnación Fr J F de la (1885). Diccionario Bisaya-Español (3rd edn.). Manila.
Mojares R B (1977). Bibliography of Cebuano linguistics. Cebu City: University of San Carlos.
Wolff J U (1961). Cebuano texts with glossary and grammar. Mimeographed. Cebu City.
Wolff J U (1966–7). Beginning Cebuano (2 vols). New Haven: Yale University Press.
Wolff J U (1972). A Cebuano Visayan dictionary. Ithaca, NY: Southeast Asia Program, Cornell University/Manila: Linguistic Society of the Philippines 72–81.
Wolff J U (1973). ‘The character of borrowings from Spanish and English in the languages of the Philippines.’ Journal of Philippine Linguistics 4(1).
Zorc D (1977). The Bisayan dialects of the Philippines: subgrouping and reconstruction. Canberra: Pacific Linguistics.
Celtic
C Ó Baoill, University of Aberdeen, Aberdeen, UK
© 2006 Elsevier Ltd. All rights reserved.
The Celts get their name from Keltoi, a name of unknown origin applied by the Greeks from around 500 B.C. to a widespread people who lived mainly to the north and west of them. They have long been identified with the archaeological cultures known as Hallstatt and La Tène, named from type-sites in central Europe and dating from the period following 600 B.C., but linking a language to an archaeological culture can be unreliable, and this link and others concerned with the Celts have been queried, notably in James (1999).

The languages understood to belong to these people are of the Indo-European family, the most westerly branch of it, and one important feature thought to mark Celtic out from the rest is the loss (or reduction in some contexts) of the sound p. For example, the Indo-European word for ‘father,’ which began with p- (whence, e.g., Greek and Latin pater), gives modern Gaelic (Gaelic, Irish) athair. This development predates all the evidence we have for the languages. Another early development was the change in some branches of Celtic whereby the Indo-European /kʷ/ (or ‘Q’) became /p/, whence the well-known division between P-Celtic and Q-Celtic languages. In the later (insular Q-Celtic) languages this q has developed to a /k/ sound, written c, and so we get oppositions like Gaelic cenn and Welsh pen, ‘head’ (from an original stem *qen-).

The languages may be classified as Continental Celtic and Insular Celtic, the former group dating from the earliest period of Celtic history up till about 500 A.D., by which time all the continental languages had probably disappeared. Three main continental languages are identifiable, Gaulish, Lepontic, and Celtiberian, and we know all three principally from inscriptions (on stones or on coins),
names (place-names and personal names) and quotations on record in other languages. Verbs, and therefore sentences, are extremely rare, so that our knowledge of all three languages really is minimal. Gaulish and Lepontic are P-Celtic languages, the former belonging to the general area of Gaul (France, but including also parts of Switzerland, Belgium, and Italy) and the latter to parts of the southern Alps. Celtiberian is the name favored, over the alternative Hispano-Celtic, by de Hoz (1988) for the Q-Celtic language, which has, since the mid-20th century, come to be reasonably well attested by inscriptions in north central Spain; a relevant opposition here is between the form used for ‘and’ (Latin -que), appearing as pe in Lepontic and as cue in Celtiberian. Archaeology indicates movement of features of the Hallstatt and La Te`ne cultures from the continent to Britain and Ireland from about 500 B.C., and it is assumed that Celtic languages came with them. Jackson (1953: 4) used the term Gallo-Brittonic to cover both Gaulish and the first P-Celtic languages in Britain. A Q-Celtic language appeared in Ireland, but there is much disagreement as to when, whence, and by what route. There is also much discussion of criteria for assessing relationships between the Celtic languages in this early period, and opinions change frequently (see Evans, 1995); evidence for dating expansion and change in the languages is inevitably scarce. The Insular Celtic languages are divided into Brythonic and Goidelic groups, the former denoting the descendants of the P-Celtic, which reached Britain from the continent, namely Welsh, Cornish, Breton, Pictish, and Cumbric. Cumbric (or Cumbrian) is used to denote the early language(s) of what are now the northern part of England and the southern part of Scotland, but little is really known about the language(s) apart from what can be gathered from names (see Price 1984: 146–154). 
The surviving languages in the Brythonic group are Welsh and Breton, Cornish having gone out of general use in the 18th century, though it is still in use among enthusiasts. Sims-Williams (1990: 260; see also Russell, 1995: 132–134) argued that the main linguistic developments from (the theoretical) Brittonic, leading toward the modern insular languages, were in place by 500 A.D., and divergences between Cornish and Breton followed shortly afterward. Goidelic is the term used by linguists for the Q-Celtic language that appeared in Ireland before the 1st century B.C. and for its descendants. The theory has long been that the original Goidelic language in Ireland spread to western Britain when the power of the Romans waned around 400 A.D., and that Scottish Gaelic (Gaelic, Scots) and Manx eventually
developed there. But while the simple theory of a major Irish migration bringing Gaelic to Scotland is widely accepted, even in Scotland, Ewan Campbell has recently shown (Campbell, 2001) that archaeology provides no evidence in support of any such invasion. The earliest written form of the Gaelic language is that found in Ogam, the alphabet used for inscriptions on stone, dating from about the 4th century till the 7th (McManus, 1991 is a detailed study). Thereafter the language, as attested in the literature, is divided into Old (till 900 A.D.), Middle (900–1200), Early Modern (till c. 1650), and Modern periods. The distinctive Scottish and Manx forms only become clearly visible in the Early Modern period. The linguistic theory in Jackson (1951: 78–93) envisaged a historical period, c. 1000–1300 A.D., during which Irish (as Western Gaelic) became clearly distinct from Eastern Gaelic (Scottish Gaelic and Manx), but this has come under attack by those (such as Ó Buachalla, 2002) who see the significant historical division within Goidelic as a north/south one, with Scotland, Man, and Ulster in opposition to the rest of Ireland on many points. On similar grounds, the three Gaelic languages may be seen rather as what Hockett (1958: 323–325) called an L-complex, a single linguistic continuum within which national and even geographical boundaries are ignored by dialectal isoglosses. This suggestion (cf. Ó Buachalla, 1977: 95–96) is supported (a) by the fact that all three ‘languages’ identify themselves by variants of the same name, Gaeilge, Gàidhlig, Gaelck, and others, whence the English term Gaelic; and (b) by the strong evidence that, while Gaelic survived (until the early 20th century) in the interface area between north-eastern Ireland and the southern Highlands, speakers on both sides of the North Channel were able to converse with little difficulty.
See also: Breton; Cornish; Isle of Man: Language Situation; Scots Gaelic; United Kingdom: Language Situation; Wales: Language Situation; Welsh.
Bibliography
Campbell E (2001). ‘Were the Scots Irish?’ Antiquity 75, 285–292.
de Hoz J (1988). ‘Hispano-Celtic and Celtiberian.’ In Maclennan G (ed.) Proceedings of the First North American Congress of Celtic Studies. Ottawa: University of Ottawa. 191–207.
Evans D E (1995). ‘The early Celts: the evidence of language.’ In Green M J (ed.) The Celtic world. London: Routledge. 8–20.
Hockett C F (1958). A course in modern linguistics. New York: Macmillan.
Jackson K (1951). ‘“Common Gaelic”: the evolution of the Goedelic languages.’ In Proceedings of the British Academy XXXVII, 71–97.
Jackson K (1953). Language and history in early Britain. Cambridge: Cambridge University Press.
James S (1999). The Atlantic Celts: ancient people or modern invention? London: British Museum Press.
Maier B (2003). The Celts: a history from earliest times to the present. Edinburgh: Edinburgh University Press.
McManus D (1991). A guide to Ogam. Maynooth: An Sagart.
Ó Buachalla B (1977). ‘Ní and cha in Ulster Irish.’ Ériu 28, 92–141.
Ó Buachalla B (2002). ‘“Common Gaelic” revisited.’ In Ó Baoill C & McGuire N R (eds.) Rannsachadh na Gàidhlig 2000. Obar Dheathain: An Clò Gaidhealach. 1–12.
Price G (1984). The languages of Britain. London: E. Arnold.
Russell P (1995). An introduction to the Celtic languages. London: Longman.
Sims-Williams P (1990). ‘Dating the transition to Neo-Brittonic: phonology and history, 400–600.’ In Bammesberger A & Wollmann A (eds.) Britain 400–600: language and history. Heidelberg: C. Winter. 217–261.
Celtic Religion
B Maier, University of Aberdeen, Aberdeen, UK
© 2006 Elsevier Ltd. All rights reserved.
Celtic religion is used as a convenient umbrella term to refer to the religious beliefs, myths, rites, and cults of all the Celtic-speaking peoples before the advent of Christianity. The designation Celtic may be justified on linguistic, archaeological, and historical grounds, but it should be noted that these criteria do not always converge. Moreover, one might just as well talk of Celtic religions (in the plural), as there are marked regional and chronological differences in a continuum that stretches from Ireland to Asia Minor and from the 5th century B.C. to the 5th century A.D. What we know about Celtic religion is mainly based on archaeological findings, information provided by Greek and Roman authors, and inferences drawn from the medieval vernacular traditions of the Celtic-speaking countries. On this evidence, it is assumed that the Celts worshipped a multitude of gods and goddesses, but the names of these are known only from the Roman imperial period onwards. Classical authors and Latin dedicatory inscriptions from Gaul and Britain usually equate Celtic deities with their Roman counterparts, whereas medieval Irish and Welsh texts tend to treat them on the principle of Euhemerism as mortal beings who were supposed to have lived in a distant past. To judge from the inscriptions, most of the Celtic deities appear to have been of purely local or regional significance. As there are hardly any consecutive Celtic texts from the pagan period, we are ignorant of many basic features of Celtic religion. It is to be stressed, however, that several popular ideas about Celtic religion, such as the concept of a ‘Celtic calendar’ or the belief in a subterranean ‘otherworld,’ are based exclusively on medieval and modern insular sources and should not be projected back on the Continental Celts of classical antiquity. Recent archaeological investigations have provided ample information on pre-Roman cult sites and sacrificial practices, but the well-known Celtic priesthood of the druids continues to be known from literary sources only. Here the most detailed information is provided by the Stoic philosopher Posidonius and by Julius Caesar, but the different pieces of information provided by these two authors are sometimes contradictory and generally cannot be verified by reference to other, independent witnesses. Druids are also mentioned in medieval Irish works of literature set in the pre-Christian period, but the descriptions given of them appear to be modeled on those of Christian priests, so that their source value for the history of religions appears rather limited.
See also: Early Irish Linguistics; Welsh.
Central African Republic: Language Situation
C Thornell, Göteborg University, Göteborg, Sweden
© 2006 Elsevier Ltd. All rights reserved.
The Central African Republic, with a population of 3.7 million people (annual growth rate = 1.56%), is multilingual, as most African countries are (Figure 1). As a rule, urban centers are multilingual, whereas rural areas are more or less monolingual. The language of the dominant ethnic group prevails. Over the past decades, contact between urban centers and rural areas has increased, with the result that more than one language is now used in rural areas. This is particularly the case in places where economic activities are going on. In this multilingual setting, French and the Central African language Sango have the status of official languages. The number of languages spoken in the Central African Republic is not clear, and the figures vary between sources. The Atlas linguistique de L’Afrique centrale: Centrafrique (ALC), published in 1984, indicates 43 languages, whereas Ethnologue in 2000 mentions 69. One reason for the variance is that different definitions of the term ‘language’ are used. The individual languages included in these numbers are, in general, associated with ethnic groups and subgroups, which traditionally are found in specific geographical areas. Thus, the languages are not defined according to linguistic criteria. From a linguistic point of view, it would in many cases be more appropriate to speak about dialect clusters instead of languages (Figure 2).
Figure 1 Location of the Central African Republic.
Language Classification
The Central African languages are classified into the Niger-Congo, the Nilo-Saharan, and the Afro-Asiatic phyla (Figure 3). A small number, spoken in the northern and eastern parts of the country, are grouped into the Nilo-Saharan phylum (i.e., Runga and Sara), and the Afro-Asiatic languages include Hausa and Arabic. Most national languages are affiliated to the Niger-Congo phylum. More precisely, they belong to the subgroups of the Ubangi and Bantu languages. The Ubangi languages dominate both in terms of number of languages and of speakers. Important clusters include:
1. The Gbaya cluster, including Manza: mainly spoken by the Gbaya people either as a mother tongue or a second language. The Gbaya live in the western part of the country, and the number of Gbaya is estimated at 30% of the total Central African Republic population (1996).
2. The Banda cluster: spoken by people belonging to the Banda groups. These people inhabit the central parts of the country, and they make up about 20% of the Central African Republic population.
3. The Ngbandi cluster, with its main dialects Sango Riverain, Yakoma, and Dendi: The dialects are predominantly spoken by the Ngbandi ethnic group, which lives along the Ubangi River. The group represents about 5% of the Central African Republic population. The official language, Sango, is also classified as belonging to the Ngbandi cluster, although the language originated as a pidgin. The language came into being in the nineteenth century as a trade language. Today, it has the character of an extended pidgin/creole language. Its core vocabulary, and parts of its peripheral vocabulary as well, come from the Ngbandi cluster, other Ubangi languages, and Bantu languages. Concepts typical of Western civilization have been encoded in French words, but the grammatical structures developed out of the languages of the area. All but 11% of Central Africans have at least some proficiency in the language, according to the 1988 census. Approximately 10% of the inhabitants of the Central African Republic speak Sango as their mother tongue.
4. The Zande-Nzakara cluster: spoken by the Zande and Nzakara peoples, who reside along the Mbomou River. Ethnologue also includes the Kpatili variety in the Zande-Nzakara cluster. This variety could also be included in the Ngbandi cluster.
The Zande and Nzakara peoples are estimated to make up 3% of the Central African Republic population.
The other subgroup of the Niger-Congo phylum, the Bantu languages, is represented by small languages, such as Mpiemo and Pande, spoken in the southwestern part of the country. There are also varieties that can be characterized as mixed. For instance, Yangere-Gbaya, spoken south of Bania in the southwest, has emerged out of both the Gbaya dialect spoken in the area and Yangere, belonging to the Banda cluster. In addition to Sango, there are lingua francas used at the regional level, such as Gbaya-Biyanda, used in the southwestern part of the country, and Zande, used in the eastern part of the country.
Figure 2 Language map of the Central African Republic, adapted from Atlas linguistique de L’Afrique centrale: Centrafrique (1984).
Language Policy
The governments in power since the Central African Republic achieved independence in 1960 have kept the former colonial language, French, as an official language, but the Sango language has progressively been promoted. In 1964, it was declared the only national language, and in 1991 it was declared an official language alongside French. Having the status of a national language, Sango was an important unifying symbol in the building of the independent Central African Republic. Simultaneously with the promotion of Sango as an official language in 1991, the other Central African languages received the status of national languages. This meant that they became recognized by the government, which they had not been earlier. They were now allowed to be used in some formal domains, such as at the lower administration levels. Measures have been taken for Sango to function optimally as an official language. The institution mainly responsible for these matters is the Institut de Linguistique Appliquée in Bangui. An orthography has been elaborated, and a terminology adapted to a modern society has been progressively developed. In the making of terminology, a purist approach is applied, which implies that the Sango lexical stock and Sango’s capacity for word formation are widely used for the creation of new words. French borrowings are avoided, and established French loan words are replaced. A further step in implementing Sango as an official language is the translation into Sango of the constitution, the laws, and government documents, which hitherto have all been in French. In addition, other Central African languages, mainly used in oral communication, are now subject to language-planning activities in terms of elaboration of orthographies, dictionary compilations, and grammar writings based on linguistic research. These languages are focused on by the Association Centrafricaine de Traductions de la Bible et de l’Alphabétisation and SIL International.
Language Use
Despite the language policy determining which language must be used in formal domains, language use in many situations depends on the speakers’ language proficiency. Sango is increasingly being used at the expense of French. French is mostly used at higher levels of administration, in the media, and at school. Nevertheless, French is still considered to be the more prestigious language. The increasing use of Sango is also occurring at the expense of the other national languages, which in the long run will lead to a country-wide language shift to Sango.
Figure 3 Classification of the Central African languages, adapted from ALC (1984).
Executive Domain
In the executive domain, the two official languages are used. In the legislative domain, for instance, the National Assembly holds some of its sessions in Sango. In the provinces, Sango is used in situations in which French has traditionally been used, such as in speeches by government officials. In the judicial domain, some laws, such as Law No. 98.004 of 27 March 1998 on the electoral code of the country, exist in Sango, and the remainder of the laws are in the process of being translated. In courts, Sango is used when the defendants do not know French, and translation into national languages is given when needed. The traditional leaders, in their role as state representatives, often speak in a national language, or Sango, in their official communications. In international contacts, however, French has been the undisputed language, though today English is a competing alternative.
School
French is the medium of instruction and learning in school at all levels, despite the 1984 decree stipulating that instruction should be given in Sango as well. For the time being, explanations in Sango are allowed in lower grades when needed. Although it is said that other Central African languages are not used for these explanations, they probably are, particularly in rural areas. Despite primary school being compulsory, fewer than half the children (males, 47%, and females, 39%) attended primary school, and only 24% of these reached grade 5, according to Central African administration data (UNICEF). The enrolment rate is high in the capital and low in the country’s rural areas. Approximately half of the adult population (51%) was literate in 2003 (World Factbook, 2005). Proficiency in French and length of school attendance, both in urban and rural areas, are related, because it is in school that students get their greatest exposure to French. This means that more people in urban areas tend to have proficiency in French than in rural areas.
Media and Literature
The state broadcasting company ‘Radio Bangui,’ which does both radio and TV broadcasting, broadcasts most radio programs in Sango (70%) and the rest in French. French-speaking people frequently listen to international radio channels in French. The television programs shown by Radio Bangui are mainly in French. Programs from abroad are also accessible via satellite dishes, and most of these programs, which are shown to the public by private persons, are in French. In the capital, people have access to the Internet. There is even a home page in Sango (http://sango.free.fr), of which Marcel Diki-Kidiri at CNRS in Paris is the webmaster. Central African journalists broadcasting in Sango tend to use new terminology, which sometimes makes understanding difficult for ordinary people. Central African writers write in French and publish in France. An example is Etienne Goyémidé, who published his book Le silence de la forêt at the publishing house Hatier, in Paris (1984). Most literature and printed material in the country comprises translations from either English or French and is published by religious denominations. The major part of the literature is in Sango, but in recent years printed materials in other Central African languages have been published. Muslims read publications in Arabic, imported from Muslim countries. One reason for the low production of printed materials is that the demand is low, since oral communication is much more important.
Religious Activities
Religious activities play an important role in Central African society because half the population is Christian and 15% is Muslim. In Christian services, Sango is the common language, although a revitalization of national languages is taking place. Moreover, the mixing of languages is not rare in services, nor are translations, such as of the sermon into the dominant national language of the area. In urban areas, services are held in French, and in the capital they are even held in English. The Muslim services are in Arabic, and in some cases explanations are made in Sango or a national language.
Communication in Everyday Life
The language used in daily life depends on the situation. In a predominantly monolingual geographical area, the national language of the area is, for obvious reasons, used in most domains. Only in communication with visitors who do not know the national language in question is another language used, which in many cases today would mean Sango or a regional lingua franca. In a multilingual area, the use of national languages is roughly limited to communication with people belonging to the same speech community. In other situations, Sango and, to a lesser extent, the other official language, French, are commonly used. Even within families in which the parents belong to the same ethnic group, Sango may be spoken. It is not rare that the parents speak to their children in their national language and get the answer in Sango. In these cases, the children understand but do not speak the national language. Between themselves, they always speak Sango. In everyday conversations, switching between languages is a frequent phenomenon, at the phrase level and beyond. For instance, people speaking Sango and national languages switch very often into French. Furthermore, they frequently use French loan words. This means that people in everyday life speak a Sango variety that differs from the variety modeled by the language-planning institutions, in that this latter variety is free from French expressions.
See also: Arabic; Bantu Languages; French; Hausa; Lingua Francas as Second Languages; Nilo-Saharan Languages; Sango.
Language Maps (Appendix 1): Map 3.
Bibliography
Bendor-Samuel J (ed.) (1989). The Niger-Congo languages. Lanham, NY: University Press of America.
Bouquiaux L, Kobozo J M, Diki-Kidiri M, Vallet J & Behaghel A (1978). Dictionnaire Sango-Français, Lexique Français-Sango. Paris: Société d’études linguistiques et anthropologiques de France.
Boyd R (1989). ‘Adamawa-Ubangi.’ In Bendor-Samuel J (ed.) The Niger-Congo languages. Lanham, NY: University Press of America. 178–216.
Boyeldieu P & Diki-Kidiri J M (1982). Le domaine Ngbandi. Paris: SELAF.
Bradshaw R & Bombo-Konghozaud J (1999). The Sango language and Central African culture. München: Lincom.
Census of the CAR (1988). Bangui: Department of Statistics and Census.
Diki-Kidiri M (1998). Dictionnaire orthographique du sängö. Reading: BBA Editions.
Goyémidé E (1984). Le silence de la forêt. Paris: Hatier.
Karan M (2001). The dynamics of Sango language spread. Dallas, TX: SIL International Publications in Sociolinguistics.
Lim F (1998). Lexiques des termes juridiques et administratifs (Français-Sango et Sango-Français). Bangui: Institut de Linguistique Appliquée, Université de Bangui.
Lim F (1998). Ndïä No 98.004 tî 27 mbângö, 1998, sô alü ndïä tî vôte na Ködörösêse tî Bêafrîka/Loi No 98.004 du 27 mars 1998 portant code electoral de la République Centrafricaine. Bangui: Institut de Linguistique Appliquée, Université de Bangui.
Lim F (2000). Language clusters of Central African Republic on the basis of mutual intelligibility. Cape Town: The Centre of Advanced Studies of African Society.
Moñino Y (1995). Le Proto-Gbaya, essai de linguistique comparative historique sur vingt-et-une langues d’Afrique centrale. Paris: Peeters.
Moñino Y (ed.) (1988). Lexique comparatif des langues oubanguiennes. Paris: Geuthner.
Morill C H (1997). ‘Language, culture, and society in the Central African Republic: the emergence and development of Sango.’ Ph.D. diss., Bloomington: Indiana University.
Moser R (1992). Sociolinguistic dynamics of Sango. Bundoora: La Trobe University.
Pasch H (1997). ‘Sango.’ In Thomason S G (ed.) Contact languages: A wider perspective. Amsterdam/Philadelphia: John Benjamins Publishing Company. 209–270.
Pasch H (ed.) (1992). Sango: the national official language of the Central African Republic; proceedings of the colloquium: the status and uses of Sango in the Central African Republic. Cologne, September, 3–4. Köln: Köppe.
Queffélec A, Daloba J & Wenezoui-Dechamps M (1997). Le français en Centrafrique: lexique et société. Vanves: AUF/EDICEF.
Samarin W (1989). The black man’s burden, African colonial labor on the Congo and Ubangi Rivers, 1880–1900. Boulder, CO: Westview.
Samarin W J (1967). A grammar of Sango. The Hague: Mouton.
Sammy-Mackfoy P (ed.) (1984). Atlas linguistique de L’Afrique centrale: Centrafrique. Paris: Agence de coopération culturelle et technique, and Yaoundé: Centre régional de recherche et de documentation sur les traditions orales et pour le développement des langues africaines.
Thornell C (1997). The Sango language and its lexicon (sêndâ-yângâ tî sängö). Lund: Lund University Press, and Bromley: Chartwell-Bratt.
Thornell C (2005). ‘Minoritetsspråket mpiemos sociolingvistiska kontext.’ In Maho J M (ed.) Africa & Asia 5, Göteborg working papers on Asian and African languages and literatures. Göteborg: The department of Oriental and African languages. 175–200.
Thornell C & Olivestam C E (2005) (to be published). Kulturmöte i centralafrikansk kontext med kyrkan som arena. (Cross culture encounter in Central Africa). Göteborg: Acta Universitatis Gothenburgensis.
Vuarchex F (ed.) (1989). Littérature Centrafricaine (97). Paris: Clef.
Relevant Websites
http://www.cia.gov/cia/publications/factbook/geos/ct.html – Central Intelligence Agency, 2005 World Factbook: The Central African Republic website.
http://www.ethnologue.org – Ethnologue: languages of the world website.
http://sango.free.fr – YSB, Yângâ tî Sängö tî Bêafrîka.
http://www.unicef.org/infobycountry/car_statistics.html#5 – UNICEF, Information by country: The CAR website.
Central Solomon Languages
A Terrill, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.
There are four or possibly five Papuan languages in the central Solomon Islands: Bilua, spoken on the island of Vella Lavella; Touo (known more commonly in the literature as Baniata, after one of the villages where it is spoken), spoken on Rendova Island; Lavukaleve, spoken in the Russell Islands; Savosavo,
spoken on Savo Island; and possibly Kazukuru, an extinct and barely documented language of New Georgia.
Relationships Among the Languages
By the time of Ray (1926, 1928), there was already an established list of non-Austronesian languages of the Solomon Islands, consisting of Bilua, Baniata (here referred to as Touo), Savo, and Laumbe (now called Lavukaleve). Waterhouse and Ray (1931) later
Central Solomon Languages 279 Lexique Franc¸ais-Sango. Paris: Socie´te´ d’e´tudes linguistiques et anthropologiques de France. Boyd R (1989). ‘Adamawa-Ubangi.’ In Bendor-Samuel J (ed.) The Niger-Congo languages. Lanham, NY: University Press of America. 178–216. Boyeldieu P & Diki-Kidiri J M (1982). Le domaine Ngbandi. Paris: SELAF. Bradshaw R & Bombo-Konghozaud J (1999). The Sango language and Central African culture. Mu¨nchen: Lincom. Census of the CAR (1988). Bangui: Department of Statistics and Census. Diki-Kidiri M (1998). Dictionnaire orthographique du sa¨ngo¨. Reading: BBA Editions. Goye´mide´ E (1984). Le silence de la foreˆt. Paris: Hatier. Karan M (2001). The dynamics of Sango language spread. Dallas, TX: SIL International Publications in Sociolinguistics. Lim F (1998). Lexiques des termes juridiques et administratifs (Franc¸ais-Sango et Sango-Franc¸ais). Bangui: Institut de Linguistique Applique´e, Universite´ de Bangui. Lim F (1998). Ndı¨a¨ No 98.004 tıˆ 27 mbaˆngo¨, 1998, soˆ alu¨ ndı¨a¨ tıˆ voˆte na Ko¨do¨ro¨seˆse tıˆ Beˆafrıˆka/Loi No 98.004 du 27 mars 1998 portant code electoral de la Re´publique Centrafricaine. Bangui: Institut de Linguistique Applique´e, Universite´ de Bangui. Lim F (2000). Language clusters of Central African Republic on the basis of mutual intelligibility. Cape Town: The Centre of Advanced Studies of African Society. Mon˜ino Y (1995). Le Proto-Gbaya, essai de linguistis comparative historique sur vingt-et-une langue d’Afrique centrale. Paris: Peeters. Mon˜ino Y (ed.) (1988). Lexique comparatif des langues oubanguienne. Paris: Geuthner. Morill C H (1997). ‘Language, culture, and society in the Central African Republic: the emergence and development of Sango.’ Ph.D. diss., Bloomington: Indiana University. Moser R (1992). Sociolinguistic dynamics of Sango. Bundoora: La Trobe University. Pasch H (1997). ‘Sango’ In Thomason S G (ed.) Contact languages: A wider perspective. Amsterdam/Philadelphia: John Benjamins Publishing Company. 
209–270. Pasch H (ed.) (1992). Sango: the national official language of the Central African Republic; proceedings of the
colloquium: the status and uses of Sango in the Central African Republic. Cologne, September, 3–4. Ko¨ln: Ko¨ppe. Queffe´lec A, Daloba J & Wenezoui-Dechamps M (1997). Le franc¸ais en Centrafricque: lexique et socie´te´. Vanves: AUF/EDICEF. Samarin W (1989). The black man’s burden, African colonial labor on the Congo and Ubangi Rivers, 1880–1900. Boulder, CO: Westview. Samarin W J (1967). A grammar of Sango. The Hague: Mouton. Sammy-Mackfoy P (ed.) (1984). Atlas linguistique de L’Afrique centrale: Centrafrique. Paris: Agence de coope´ration culturelle et technique, and Yaounde´: Centre re´gional de recherche et de documentation sur les traditions orales et pour le de´veloppement des langues africaines. Thornell C (1997). The Sango language and its lexicon (seˆndaˆ-yaˆngaˆ tıˆ sa¨ngo¨). Lund: Lund University Press, and Bromley: Chartwell-Bratt. Thornell C (2005). ‘Minoritetsspra˚ket mpiemos sociolingvistiska kontext.’ In Maho J M (ed.) Africa & Asia 5, Go¨teborg working papers on Asian and African languages and literatures. Go¨teborg: The department of Oriental and African languages. 175–200. Thornell C & Olivestam C E (2005) (to be published). Kulturmo¨te i centralafrikansk kontext med kyrkan som arena. (Cross culture encounter in Central Africa). Go¨teborg: Acta Universitatis Gothenburgensis. Vuarchex F (ed.) (1989). Litte´rature Centrafricaine (97). Paris: Clef.
Relevant Websites
http://www.cia.gov/cia/publications/factbook/geos/ct.html – Central Intelligence Agency, 2005 World Factbook: The Central African Republic website.
http://www.ethnologue.org – Ethnologue: languages of the world website.
http://sango.free.fr – YSB, Yângâ tî Sängö tî Bêafrîka.
http://www.unicef.org/infobycountry/car_statistics.html#5 – UNICEF, Information by country: The CAR website.
Central Solomon Languages
A Terrill, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.
There are four or possibly five Papuan languages in the central Solomon Islands: Bilua, spoken on the island of Vella Lavella; Touo (known more commonly in the literature as Baniata, after one of the villages where it is spoken), spoken on Rendova Island; Lavukaleve, spoken in the Russell Islands; Savosavo,
spoken on Savo Island; and possibly Kazukuru, an extinct and barely documented language of New Georgia.
Relationships Among the Languages
By the time of Ray (1926, 1928), there was already an established list of non-Austronesian languages of the Solomon Islands, consisting of Bilua, Baniata (here referred to as Touo), Savo, and Laumbe (now called Lavukaleve). Waterhouse and Ray (1931) later
discovered Kazukuru, a language of New Georgia, identifying it as unlike both the Melanesian (i.e., Austronesian) and Papuan languages of the Solomon Islands. Much later, Lanyon-Orgill (1953) claimed Kazukuru and two further varieties, Guliguli and Dororo, to be Papuan languages; however, the data are so scant as to make classification uncertain. Greenberg (1971) was the first to make an explicit claim for the genetic unity of these languages, as part of his Indo-Pacific family. This claim was shortly followed by Wurm’s (1972, 1975, 1982) proposal of an East Papuan phylum, linking all the Papuan languages of the islands off the coast of New Guinea into one genetic grouping. Both claims have been firmly rejected by specialists in the region, and recent views have been much more cautious: Ross (2001) suggested, on the basis of similarities in pronouns, that Bilua, Touo (Baniata), Savosavo, and Lavukaleve formed a family, unrelated to other island and mainland Papuan languages. Terrill (2002) found limited evidence of similarities in gender morphology among these languages. In lexical comparisons using an extended Swadesh list of roughly 333 items (with obvious Austronesian loans removed), Bilua, Lavukaleve, Touo, and Savosavo share only 3–5% resemblant forms (i.e., within the realm of chance). In short, at this stage of knowledge, a genetic relationship among any or all of these languages still remains to be proven.
Typological Characteristics
A typological overview of these and other Papuan languages of island Melanesia provided by Dunn et al. (2002) showed that, but for a few striking exceptions, the only grammatical features shared by the central Solomon Islands Papuan languages are also held in common with surrounding Oceanic Austronesian languages. These common features include an inclusive/exclusive distinction in pronouns, dual number (actually, there are four number categories in Touo), reduplication for various purposes, nominative/accusative alignment (although Lavukaleve has ergative/absolutive alignment in certain types of subordinate clauses), and serial verb constructions (absent in Bilua). The two most notable departures from Oceanic grammatical patterns are SOV constituent order in three of the languages (Bilua has SVO with some variation) and the presence of gender; there are three genders in Lavukaleve, four in Touo, and two in Bilua and Savosavo. Gender in Bilua is contextually determined: the masculine–feminine distinction applies only to human nouns, but for inanimate nouns there is a distinction, marked by the same morphology as marks gender in human nouns,
between ‘singulative’ (=masculine) and ‘unspecified number’ (=feminine) (Obata, 2003). Savosavo has two genders, masculine and feminine, and it is not clear whether they are contextually determined as in Bilua or permanently assigned as in Touo and Lavukaleve (Todd, 1975). Touo has some very unusual features for the region, including a phonological distinction between breathy/creaky vs. modal vowels, as well as six vowel positions instead of the usual five for the region. Touo sources include Todd (1975), Frahm (1999), and Terrill and Dunn (2003). Lavukaleve too has many unusual features, including focus markers that show agreement in person, gender, and number of the head of the constituent on which they mark focus; and a very complex participant marking system depending on factors to do with predicate type and clause type (Terrill, 2003).
See also: Papuan Languages; Solomon Islands: Language Situation.
Bibliography
Dunn M, Reesink G & Terrill A (2002). ‘The East Papuan languages: a preliminary typological appraisal.’ Oceanic Linguistics 41, 28–62. Frahm R M (1999). Baniata serial verb constructions. M.A. thesis, University of Auckland. Greenberg J H (1971). ‘The Indo-Pacific hypothesis.’ In Sebeok T A (ed.) Current trends in linguistics, vol. 8: Linguistics in Oceania. The Hague: Mouton and Co. 807–871. Lanyon-Orgill P A (1953). ‘The Papuan languages of the New Georgian Archipelago, Solomon Islands.’ Journal of Austronesian Studies 1, 122–138. Obata K (2003). A grammar of Bilua: a Papuan language of the Solomon Islands. Canberra: Pacific Linguistics 540. Ray S H (1926). A comparative study of the Melanesian Island languages. London: Cambridge University Press. Ray S H (1928). ‘The non-Melanesian languages of the Solomon Islands.’ In Koppers W (ed.) Festschrift publication d’hommage offerte au P. W. Schmidt. Vienna: Mechitharisten-Congregations-Buchdruckerei. 123–126. Ross M (2001). ‘Is there an East Papuan phylum? Evidence from pronouns.’ In Pawley A, Ross M & Tryon D (eds.) The boy from Bundaberg: studies in Melanesian linguistics in honour of Tom Dutton. Canberra: Pacific Linguistics. 301–321. Terrill A (2002). ‘Systems of nominal classification in East Papuan languages.’ Oceanic Linguistics 41, 63–88. Terrill A (2003). A grammar of Lavukaleve. Berlin: Mouton de Gruyter. Terrill A & Dunn M (2003). ‘Orthographic design in the Solomon Islands: the social, historical, and linguistic situation of Touo (Baniata).’ Written Languages and Literacy 6, 177–192. Todd E (1975). ‘The Solomon Language family.’ In Wurm S A (ed.) Papuan languages and the New Guinea linguistic scene. Canberra: Pacific Linguistics C-38. 805–846. Waterhouse W H L & Ray S H (1931). ‘The Kazukuru language of New Georgia.’ Man xxxi, 123–126. Wurm S A (1972). ‘The classification of Papuan languages and its problems.’ Linguistic Communications 6, 118–178. Wurm S A (1975). ‘The East Papuan phylum in general.’ In Wurm S A (ed.) Papuan languages and the New Guinea linguistic scene. Canberra: Pacific Linguistics C-38. 783–804. Wurm S A (1982). Papuan languages of Oceania. Tübingen: Gunter Narr Verlag.
Čeremisina, Maja Ivanovna (b. 1924)
O Molchanova, Uniwersytet Szczecinski, Szczecin, Poland
© 2006 Elsevier Ltd. All rights reserved.
Maja Ivanovna Čeremisina was born in Kiev (the Ukrainian republic) in 1924. She is a Russian scholar who, after Ubrjatova’s death in Novosibirsk, took on the responsibility of continuing research on the syntax of Siberian indigenous peoples’ languages. Under her guidance, 33 scholars have investigated the syntactic structures of their mother tongues (Altai, Alutor, Buryat (Buriat), Kazakh, Ket, Khakas, Khanty, Kirghiz, Nganasan, Selkup, Shor, Tuva (Tuvin), and others). Most of them have undertaken 3-year postgraduate courses at the university in Novosibirsk. Čeremisina received her secondary and higher education in Moscow. Her first years after secondary school fell during World War II. On the first day of aerial bombardment in Moscow, her parents’ house was completely destroyed, and her mother was killed. Much later, Čeremisina was educated at the University of Moscow, where she mastered literature and the Russian language and later undertook 3-year postgraduate courses at Moscow University. After graduation, she taught many subjects in Russian philology at university departments in Tomsk, Tula, Beijing (China), and Novosibirsk. Čeremisina obtained her M.A. in 1960 and her Ph.D. in 1974. Her doctoral thesis was entitled ‘Complex comparative constructions in the Russian language.’ Before Čeremisina’s doctoral defense, Ubrjatova asked her to read the manuscript of a book devoted to the analysis of complex sentences in the Yakut language. Čeremisina read the manuscript three times, trying to comprehend Yakut, the frame of mind of its speakers, and their way of expressing themselves, and also trying to penetrate into Ubrjatova’s way of thinking, which gradually opened itself up to her. Her main field of endeavor thereafter became Siberian indigenous languages.
In 1975, Čeremisina took charge of a project based on comparative and typological research into the structure of complex sentences in the languages of Siberian indigenous peoples. The starting point of the investigation was one of the postulates propounded by Ubrjatova in her monograph on Yakut syntax – that Turkic languages employ similar language means to establish links between both words and units of higher levels (phrases and sentences). Testing the postulate on other Altaic languages became the goal of Čeremisina and her disciples. Čeremisina founded a new Department of Languages and Folklore of the Indigenous Siberian Peoples at the university in Novosibirsk. At present, Čeremisina and her team are working on the typology of the simple sentence in Altaic languages. She has published five monographs, nine textbooks, and 183 papers.
See also: Altaic Languages; Turkic Languages; Yakut.
Bibliography
Čeremisina M I (1976). Sravnitel’nyje konstrukcii russkogo jazyka. Novosibirsk: Nauka. Čeremisina M I (2002). Jazyk i ego otraženije v nauke o jazyke. Novosibirsk: Trudy gumanitarnogo fakul’teta NGU. Čeremisina M I & Kolosova T A (1987). Očerki po teorii složnogo predloženija. Novosibirsk: Nauka. Čeremisina M I, Brodskaja L M, Gorelova L M, Skribnik E K, Borgojakova T N & Šamina L A (1984). Predikativnoje sklonenije pričastij v altajskikh jazykakh. Novosibirsk: Nauka. Čeremisina M I, Skribnik E K, Brodskaja L M, Sorokina I P, Šamina L A, Kovalenko N N & Ojun M V (1986). Strukturnyje tipy sintaktičeskikh polipredikativnykh konstrukcij v jazykakh raznykh system. Novosibirsk: Nauka.
Einstein, Albert (1879–1955)
The following list includes the works cited here and a selection of Einstein’s most relevant and accessible writings that deal directly or indirectly with language and linguistics. See also: Chomsky, Noam (b. 1928); Peirce, Charles Sanders (1839–1914); Piaget, Jean (1896–1980); Reichenbach, Hans (1891–1953); Russell, Bertrand (1872–1970); Strawson, Peter Frederick (b. 1919).
Bibliography
Chomsky N A (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Clark R W (1971). Einstein: the life and times. New York: Avon. Einstein A (1922). The meaning of relativity. Princeton, NJ: Princeton University Press. Einstein A (1936). ‘Physics and reality.’ In Out of my later years. Secaucus, NJ: Citadel. 59–97. Einstein A (1944). ‘Remarks on Bertrand Russell’s theory of knowledge.’ In Schilpp P A (ed.) The philosophy of Bertrand Russell. Evanston, IL: Northwestern University. 279–291. Einstein A (1956). ‘The common language of science.’ In Out of my later years. Secaucus, NJ: Citadel (Originally a radio talk in 1941). 111–113. Oller J W Jr, Chen L, Oller S D & Pan N (2005). ‘Empirical predictions from a general theory of signs.’ Discourse Processes 40(2), 115–144.
18th Century Linguistic Thought
G Haßler, University of Potsdam, Potsdam, Germany
© 2006 Elsevier Ltd. All rights reserved.
The 18th century has been characterized as a century of debate on language, leading to the formation of new conceptions in this field. The focus on language has also been supported by the role of public opinion and by profound changes in society. But the most important discussions were about the role of language in thought. In the field of language description, 18th-century authors followed for the most part traditional views and contributed to their further development.
Discussions on Language in the Early 18th Century
The inclusion of language in the philosophical systems of René Descartes (1596–1650), Antoine Arnauld (1612–1694), and Baruch de Spinoza (1632–1677) was based on the supposition that there was an analogy between the dualism of language and thought and that of body and mind. The basis of the doctrine, which supposed an incorporeal thinking no longer accessible to humans after original sin, had already been developed by St Augustine (354–430). From a historical perspective, thinking required signs to render collections of simple ideas, as well as memory, possible. Signs gave rise to the manipulation of ideas, and they did not depend on the presence of things or the enumeration of all the simple ideas included in them. A main tenet of the Augustinian–rationalist doctrine was the merely spiritual nature of all notions
and of the relations between them. The denotation of a term was regarded as a mental object that could only have a representational relation to the word and could not depend on linguistic signs and their corporeal nature. The form words obtained in different languages was regarded as arbitrary, whereas the composition of the concept was universal and did not depend on sensations. For the rationalist thinkers, the indispensability of language consisted only in communication between people when the transmission of pure incorporeal notions was impossible. But linguistic signs met the necessities of communication in a very insufficient way because intuitive conceptions overwhelmed the human mind while their linguistic signs distracted from their content and slowed down the process of thinking. Descartes’s vision of the relation between language and thought was a dualistic one, in that he did not attribute to linguistic signs any influence on ideas. He saw a confirmation of this opinion in the fact that animals with highly developed speech organs did not develop human thinking even if they were able to produce some speech sounds. This idea was further developed by Gerauld de Cordemoy (1626–1684), who claimed that the substantial difference between language and thought allowed the development of different languages with arbitrary sound patterns. The authors of the Port-Royal Grammar (1660) and the Logic (1662) took up the distinction between language-independent thought, communicated thought, and language-dependent thinking, and they subordinated linguistic signs to conceptual notions. Had ideas depended on names, people would not have had the same ideas about even the simplest things
because languages use different names to designate them. Arnauld and Pierre Nicole (1625–1695), the authors of the Logic, criticized Thomas Hobbes’s (1588–1679) remark that reasoning consisted of comparing names and uniting them by the copula. For the authors of the Port-Royal Grammar, human thinking operated with the designated ideas and not with their names. The arbitrariness of the linguistic sign played an important part in this argumentation. The fact that Arabs and French people could communicate proved that thought was independent of language. But this language-independent thought had to make use of signs if it was to be communicated. The use of signs that was restricted to the communication of ideas could become a habit and finally result in a situation where people could no longer imagine ideas without words. For example, many philosophical treatises aimed at satisfying people with words. These arguments led to a critical discussion of language that played an important part in the 18th-century debate. The diversity of languages had been regarded just as a proof of the secondary role of linguistic signs, but it could not be denied as a practical and an empirical phenomenon that had to be taken into account by grammarians. The contrast between a general way of thinking and the language-specific way of expression was a widely accepted position in 17th- and early 18th-century grammar. A direct linear word order, followed most closely by the French language, for example, was declared to be a universal feature. The Latin deviation from this subject–verb–object order was declared to have had no influence on thought; Romans must have thought like French people before they rearranged their ideas using inversions and ponderous constructions (for the debate on word order, see Ricken, 1978). The existence of different expressions for the same conceptual structure had already been discussed by the grammarians of Port-Royal, who saw no obstacle to universal ideas in it.
The system of language at a given moment was always determined by use, and this meant that people had to rely on the elements of their own language, even if they were less developed or less differentiated than in other languages. For example, all languages had to express relations between nouns and whole syntactic units. In languages with case systems these relations were expressed by morphological elements, whereas in French prepositions and word order served the same purpose. This kind of explanation appeared often in 18th-century grammars, which associated the special genius of a language (génie de la langue) with the function of formal elements of language and not with their relation to thoughts. But the notion of a génie de la langue had already become very fashionable at the end
of the 17th century in systematizations of Vaugelas’s Remarques sur la langue française by Louis Du Truc (1668), Jean Menudier (1681), and Jean d’Aisy (1685). A temporary pinnacle in the description of the particular means of a language was attained with Claude Buffier’s Grammaire françoise sur un plan nouveau (1709). According to Buffier, the reality of language disproved the opinion that a grammatical theory must be prior to all languages. In Buffier’s view, it was a crucial error to write a French grammar following the principles of Latin. Regarding the signification of words primarily as a function of individual representation, Buffier warned against confusing signs with designated notions and denied the exact explicability of significations. Following John Locke (1632–1704) in his nominalist positions, he overcame the rationalist theory of language. The Essay concerning human understanding (1690) by Locke gave a new answer to the question of how thought could be influenced by language. According to Locke, linguistic signs did not represent the objects of knowledge but the ideas that the human subject created. The nominalist explanation of complex ideas led to a denial of the existence of innate ideas and to the supposition of a voluntary imposition of signs onto a collection of simple ideas for which there was no pattern in reality. Universal language had been a tempting issue for a long time. Even Descartes proposed to base his reflections on this matter on the nature of thought and not on actually existing languages. Taking advantage of such a universal language, peasants would be able to think more conveniently than philosophers using any existing language. It was the same idea that led to the characteristica universalis by Gottfried Wilhelm Leibniz (1646–1716). His doctrine of the harmonious structure of the world and its perception made him reject the nominalist supposition that signs denoted arbitrarily a collection of simple ideas.
The form of signs was not naturally determined but historically motivated and not the result of an arbitrary imposition. Imperfections that could be explained by the nature of language, the fuzziness of significations, the polysemy of words, and the conscious abuse of language, might lead human thinking through a tortuous path. Although in the customary use of language such insufficiency was unavoidable, the philosophical use of language could elude it by creating its own philosophical sign system.
Contradictory Results of Empirical–Sensationalist Thought
The empirical and sensationalist theory of knowledge developed by Locke had a great influence in many
European countries and was taken up in France by César Chesneau de Du Marsais (1676–1756) and Etienne Bonnot de Condillac (1714–1780), among others, whose thoughts led to different conclusions. Du Marsais, in his “Essay on Tropes” (1730), developed a sensationalist theory of metaphor, apart from tackling grammatical issues in his articles in the French Encyclopédie. But, in spite of his views on lexical signification rooted in a sensationalist theory, he held a rationalist opinion on syntax, considering subject–verb–object word order as natural and corresponding to the order of thought. However, he modified this theory, recognizing the communicative and stylistic function of different constructions as a counterpart to the figurative use of words. Du Marsais is an author who is not easily classified as either a sensationalist or a rationalist. Delesalle and Chevalier (1986: 88) have characterized him as building a system of empirisme raisonné in which he finally looked for a new strategy of grammatical operations. He tried to deduce a set of principles that would be transferable to new situations from the use of language. Following this concept of analysis, he did not arrive at new conclusions through the application of theorems but by a kind of arithmetic that consisted of composition and decomposition. What were regarded as reliable in this context were not definitions but only explanations of the appearance of an idea. The fundamental attitude about the acquisition of language that stemmed from his theory was routine. Du Marsais’s aim was to find an order, a general principle in the multitude of texts. The following example (Du Marsais, 1729/1797: II, 215) shows how a manual of Latin would look following this approach.
Minóis filiam Ariádnen, cujus ope labyrínthi ambáges explicúerat, secum ábstulit: eam tamen, immemor beneficii, deséruit in insulá Naxo: destitútam Bacchus duxit.
ábstulit secum Ariádnen, filiam Minóis; ab ope cujus explicúerat ambáges labyrinthi. Tamen, immemor . . .
il emmena avec lui Ariane, fille de Minos par le secours de laquelle il avoit démélé les détours du labirinthe. Cependant, ne se ressouvenant point . . .
‘et enleva Ariane, fille de Minos. Cète princesse avoit donné à Thésée un peloton de fil qui aida ce héros à sortir du labirinte. Cependant, oubliant . . . .’
The first line is the Latin original in all its complexity and with its inversions. The second line shows the Latin sentence transformed into the natural order, and the third and fourth lines show an interlinear translation into French and a current French translation. It is easy to suspect that such a procedure would be criticized. Du Marsais introduced routine as a basis for the acquisition of language, but what provided this routine? The bad Latin in the second line of our example was certainly not an appropriate means. One of the protagonists of the critical discussion of Du Marsais’s method was the Abbé Noël Antoine Pluche (1688–1761), who picked out as a central theme the function of translation in language teaching. Pluche did not reject the use of interlinear translation for the clarification of the structure of the original text when this structure was maintained on the syntactic level. But, for the initial stage of language learning, he rejected translation made on the basis of rules and vocabularies because it would lead to unusual and awkward expressions that were far from the génie de la langue latine. The development of the issue of a génie de la langue by Condillac was different in kind. Condillac formulated a coherent sensationalist theory of cognition by substituting for Locke’s dualist explication of sensation and reflection the concept of transformed sensation (sensation transformée), which helped to explain even complex thought as made up of simple sensations. The instrument allowing this transformation was language, to which Condillac attributed an important role in human thought. Human language arose from a language of gestures (langage d’action), which, gradually and stimulated by the needs of communication, developed into a language of arbitrary (artificial) signs. The signs of human language operated according to the principle of analogy, which corresponded to a motivated relation between signs of analogous content. It was this analogy of signs that made up the genius of a language.
The sensationalist discussion of signs and their influence on thought gave rise to several applied themes of discussion (the abuse of words, grammar teaching, construction of a philosophical language, and synonymy). The sensationalist theory was nevertheless only one solution, and it was not generally accepted by all 18th-century language theorists. Authors such as James Harris (1709–1780) continued to suppose the existence of innate powers that produced mental operations such as thinking and reasoning. Although corporeal entities could always be subdivided, the unities of the mind could not. So it was important to discover the elementary principles of the dispositions of the mind that acted in combination. As an important principle for this analysis, Harris (1786/1993: 307) endorsed the distinction between Matter and Form, which he tried to find in language as well:
“Now if Matter and Form are among these Elements, and deserve perhaps to be esteemed as the principal among them, it may not be foreign to the Design of this Treatise, to seek whether these, or any things analogous to them, may be found in Speech or Language.” The genius of a language is given an important role in this context: “[. . .] many words, possessing their Significations (as it were) under the same Compact, unite in constituting a particular Language” (Harris 1786/1993: 328). For Harris (1786/1993: 398), human cognition did not derive from sensations but from mental archetypes that anticipated perceptible things. All communication consisted for him in the transmission of ideas and words:
For what is Conversation between Man and Man? – It is a mutual intercourse of Speaking and Hearing. – To the Speaker, it is to teach; to the Hearer, it is to learn. – To the Speaker, it is to descend from Ideas to Words; to the Hearer, it is to ascend from Words to Ideas. – If the Hearer, in this ascent, can arrive at no Ideas, then is he said not to understand; if he ascends to Ideas dissimilar and heterogeneous, then is he said to misunderstand. – What then is requisite, that he may be said to understand? – That he should ascend to certain Ideas, treasured up within himself, correspondent and similar to those within the Speaker. The same may be said of a Writer and a Reader; as when any one reads to-day or tomorrow, or here or in Italy, what Euclid wrote in Greece two thousand years ago.
Another example of a productive survival of rationalist theory in 18th-century grammar is Nicolas Beauzée (1717–1789), who defined general grammar as a science and opposed it to the grammars of particular languages, which he called art. According to him, the rules of scientific grammar should be universal and not depend on the arbitrary properties of languages. The relation of language and thought was explained in a dualistic way: Thought was independent of language, whereas language was an instrument of analysis and at the same time a reflection of thought.
Prize Topics on Language Theory
In the second half of the 18th century, the debate on linguistic subjects was especially intense in the Berlin Academy. Several themes were proposed for academic competitions, and these can be regarded as a reflection of the general European language debate.
Topic I: Relativity and Subjectivity of Languages as a Means of Cognition and Communication
The president of the Academy, Pierre Louis Moreau de Maupertuis (1698–1759), associated the diversity
of languages with different forms of thought (plans d’idées): Translation among distant languages was considered virtually impossible, and signs had no strict philosophical validity in regard to reality. Reporting in the Lettre sur le progrès des sciences (1752) that travelers to the Pacific Islands had seen savages there, Maupertuis concluded that he would rather have an hour’s conversation with them than with the most refined mind of Europe. It was also Maupertuis who introduced the origin of language as the topic of an academic prize contest. The signs by which people designated their first ideas, he argued, have so much influence on all our knowledge that research into the origin of languages and the manner in which they were formed deserved as much attention and could be as useful in the study of philosophy as other methods that build systems on words with meanings that had not been thoroughly examined. We might therefore expect to learn a great deal from the comparison of distant languages because in the construction of languages we could discover the vestiges of the first steps taken by the human mind. The diversity of human languages and universals of thought may be regarded as one of the main themes of the discussion. The search for language origins was an attempt to arrive at basic principles, to examine what was natural as opposed to artificial. The empirical reality of the diversity of languages was a challenge to the universalistic principle. From this challenge arose the question for 1759, whose prize was awarded to an essay submitted by the distinguished Göttingen professor of Semitic languages, Johann David Michaelis (1717–1791). Michaelis’s essay, which was translated into English and Dutch after being published in German and French, became well known in Europe (cf. Michaelis, 1974).
Usually only the successful essay was published at the expense of the Academy and all the others were kept as anonymous manuscripts in the archive, but in this case all the texts were published. It is easy to explain why the Academy decided to do this. The relativistic view suggested by the Academy for 1759 was not really taken up, except by Michaelis’s prize essay. Obviously, it did not follow the trend of the contemporary language discussion, which was much more occupied with the general foundations of language. For Michaelis, languages were the results of the work of whole peoples, and this democratic development led to the conservation of prejudices in words. On the other hand, people could contribute to the improvement of languages by the exclusion of etymologies that might mislead human thinking. The French translation contained some important changes suggested by Johann Bernhard Merian
(1723–1807) and André Pierre Le Guay de Prémontval (1716–1764). Michaelis pointed out that no one particular language had a general advantage over others – richness and poverty had always been relative and depended on the purpose that languages were used for. Changes in the text show that there was an attempt to adapt Michaelis’s text to the debate, mainly dominated by French texts and terminology. But the chief supplement dealt with the possibility of framing a successful universal language of learning. He now answered a question that had been considered very important by the Academy since the days of Leibniz. In this context, the antiuniversalistic view of Michaelis appears most clearly: A universal language of science would be exposed to even more arbitrary intervention than a natural language because every scientist or philosopher would be able to determine the significations of the words he used (Michaelis, 1762/1974: 167).
Topic II: The Anthropological Foundations and the Origin of Language
If we look at the context of the discussion, it is not difficult to understand why, as early as 1759, the origin of language, and not the philological or philosophical study of different languages, was the fundamental question. In 1756, Jean-Jacques Rousseau’s (1712–1778) Discours sur l’origine et les fondemens de l’inégalité parmi les hommes was published in German. In the same year, Johann Peter Süßmilch (1707–1767) read two papers on the divine origin of language. Rousseau presented the dilemma succinctly: If people needed speech in order to learn how to think, they must have been in even greater need of thinking in order to invent the art of speaking. The Academy was interested in the anthropological foundations of language as well as in an explanation of language variety. The question of the origin of language did not aim at historical and factual explanation of previous phases of language development. It was as hypothetical as the state of nature in political philosophy and, like the latter, its aim was to understand people in the present. But the authors of the 31 papers submitted to the contest did not always respond exactly to the question emphasized by the Academy text. There was, for instance, a manuscript entitled Rêveries sur le langage that reduced the problem to the question of whether language was innate or not. Also, without quoting the great texts on the origin of language, many authors just copied them. So we find Condillac’s description of the development of human sound language from a language of gestures; traces of this development were to be found in Hebrew, which by its antiquity could be regarded as a natural language. This was another
way to empiricize the search for the foundations of language: One of the classical languages, usually Hebrew, was declared to be so close to the origin that it could be regarded as bearing all necessary traces that languages must have. The issue of another essay (I-667) was the invention of language by human beings. The invention of language was said to be, first of all, due to necessity and danger. If a couple of human beings were exposed to the danger of being devoured by a hungry wolf, the woman, who represented the weaker but more inventive companion, would imitate the wolf’s cry to warn her partner. The author discussed 10 simple steps that led from this origin to the present state of development. It is interesting that for the author of this essay there was no essential difference between the structures of different languages. The languages of Greenland, Japan, the Hottentots, the Oronocs, the Tartars, and the Caribbean were regarded as regular and analogous. They expressed the same anthropological foundations as the European languages, in a very simple way, which could be proved by the forms of their inflexions. According to the author, declension, conjugation, and syntax were common features of all languages. This attitude was typical of the kind of hypothetical empiricism used to study language diversity and its consequences or to explain universals without any regard to real languages. The few examples quoted by any author using this kind of hypothetical empiricism were always very close to the everyday experience of the author. In the case of this essay, the author quoted mainly German examples, to which he added some Polish, French, Greek, and Latin ones. There was also a second kind of hypothetical empiricism, apart from the one we have just discussed. This is obvious in an essay (I-666) that received the accessit by the Academy, written in Latin by Francesco Soave (1743–1806). 
For the author of this contribution, evidence about the origin of language was to be obtained by observing children who grew up outside society. He repeated a hypothetical experiment already discussed by Bernard de Mandeville (1670–1733), Condillac, and others. Contemporary authors were aware that ethical concerns would never allow them to carry out this experiment. They referred, however, to cases of real savage children who had been found and whose intellectual and communication habits had been studied. As we have seen, there was a wide range of thinking about the diversity of human languages and its relation to the mental development of humans. Why, in spite of this, has Johann Gottfried Herder (1744–1803) gone down in the annals of history as a so-called forerunner of linguistic relativism? Herder
offered a solution to the main anthropological problem without leaving aside the empirical reality of languages: Endowed with the capacity for thought characteristic of humans and for the first time freely exercising this capacity, humans invented language. The doctrine of the intimate connection between Volksgeist and language has generally been regarded as one of Herder’s most important contributions to the thought of that age. Nevertheless, it is easy to find such statements on a reciprocal dependency between the genius of the languages and the characters of the peoples in other texts of the language discussion (cf. Haßler, 1984; Neis, 2003). Linguistic questions were discussed repeatedly with reference to the great 18th-century texts that were published and distributed. This seems to have contributed to a rather monolithic picture of the ideas brought up by the contests. It is quite natural that Jacob Grimm (1785–1863) should refer respectfully to Herder in his own treatise on the origin of language, even though he noted a lack of depth and erudition in the reflections of this author, who could not compete with the scientific view of language imposed by the development of comparative linguistics in the 19th century. In the 19th century, Herder became one of the intellectual heroes of Romanticism. The references to Herder constitute an impressive corpus, which is sufficient to explain the preferences of modern historiographers. His contribution to the classical heritage seems to be evident. But did it really consist in having found out that languages were different and related to human thought, that every language expressed the soul of the people who used it? This idea was much more explicit in the texts of other contributors of the 1771 question. What really made up the outstanding quality of Herder’s essay was the fact that he explained the general foundations of language and, in this way, the origin of language. 
The papers submitted for the contest on the origin of language show that the authors did not always answer exactly the question that seemed to be important to the Academy. In contrast to the question of 1759, this time the Academy invited them to examine the foundations of language, but many of the contributors wrote about the presumed history of languages and about their differences and grammatical categories.

Topic III: Comparing and Assessing Languages
It is justified to say that the Academy was always a little late, inviting authors to answer questions that had already been under discussion for several years. But in the case of the relative value of languages with respect to communication and thought, the Academy
just changed the focus. After having turned from diversity, which was accentuated in 1759, to the anthropological foundations in 1771, it returned to diversity for another question in the last decade of the century: Vergleichung der Hauptsprachen Europas, lebender und todter, in Bezug auf Reichthum, Regelmäßigkeit, Kraft, Harmonie und andere Vorzüge; in welchen Beziehungen ist die eine der anderen überlegen, welche kommen der Vollkommenheit menschlicher Sprache am nächsten? (1792–1794). But the context of the debate had changed. It was the time of the great language collections, which either had already appeared or were under preparation. Nevertheless, it seems to have been difficult to get contributions. In 1792, when the topic was first brought up, only two essays arrived, one of which was by a Goettingen professor who had worked in accordance with the ideas of Michaelis, comparing mainly classical languages. This author asserted that Greek was the ideal language for scientific communication and that German was a poor language that would never be suited to learned purposes. It is evident that this excluded him from winning the prize, especially at a time when the Academy was looking for a substitute for French, which had been in decline since the death of Frederic the Great. The second manuscript has been lost, but a remark by the secretary of the Academy ascribes it to Johann Christoph Schwab (1743–1821), who had shared with Antoine de Rivarol (1753–1801) the prize on the universality of French. The Academy had to wait 2 more years to receive a contribution that deserved the prize. It was written by the Berlin preacher Daniel Jenisch (1762–1804) and was published in 1796. Jenisch’s book has usually been considered one of the forerunners of 19th-century historical–comparative linguistics. However, it was written in the context of an empiricism that was merely hypothetical.
All his considerations about languages are guided by epistemological observations and second-hand testimonies about language facts. He acknowledged this himself, nevertheless feeling a certain need for another kind of empirical studies. What Jenisch promised was not a philological comparison of languages – in this respect, the competing Goettingen manuscript was much more consistent – but an assessment of languages on the basis of a constructed ideal. This ideal language consisted of the familiar properties found in the Renaissance discussion of language. Richness, lucidity, insistence, certainty, and euphony were such properties, and they were not to be found to the same degree in all languages. Thus, the advantages or disadvantages of any language depended on the purpose and the field the language would be used for. By asserting that
languages were equal in rank and could not be reduced to a universal grammar, Jenisch envisaged an impartial comparison of languages. But his idea of empiricism was merely hypothetical and based on literature. So the prize-winning topic on the comparison of languages did not prepare or even open a new epoch of language studies but concluded the discussion on universals and the relativity of languages in the context of the Enlightenment debate. At the end of the 18th century, new kinds of empirical questions were emerging, but many authors still tried to address them through the old epistemological framework.

See also: Academies: Dictionaries and Standards; Adelung, Johann Christoph (1732–1806); Beattie, James (1735–1803); Beauzee, Nicolas (1717–1789); Brosses, Charles de (1709–1777); Burnett, James, Monboddo, Lord (1714–1799); Condillac, Etienne Bonnot de (1714–1780); Diderot, Denis (1713–1784); Early Historical and Comparative Studies; Hamann, Johann Georg (1730–1788); Harris, James (1709–1780); Jones, William, Sir (1746–1794); Language Teaching: History; Lomonosov, Mikhail Vasilyevich (1711–1765); Modern Linguistics: 1800 to the Present Day; Murray, Alexander (1775–1813); Origin of Language Debate; Rhetoric: History; Sign Language: History of Research; Sign Theories; Smith, Adam (1723–1790); Western Linguistic Thought Before 1800.
Bibliography

Aarsleff H (1982). From Locke to Saussure: Essays on the study of language and intellectual history. Minneapolis, MN: University of Minnesota Press.
Delesalle S & Chevalier J-C (eds.) (1986). La linguistique, la grammaire et l’école, 1750–1914. Paris: Armand Colin.
Du Marsais C C de (1729/1797). ‘Véritables principes de la grammaire, ou nouvelle grammaire raisonnée pour apprendre la langue latine.’ In Duchosal M E G & Millon C (eds.) Œuvres de Dumarsais. Paris: Pougin. II, 215.
Formigari L (1994). La sémiotique empiriste face au kantisme. Anquetil M (trans.). Liège: Mardaga.
Harris J (1986/1993). Hermes or a philosophical inquiry concerning universal grammar (4th edn., reprint). London: Routledge/Thoemmes Press.
Haßler G (1984). Sprachtheorien der Aufklärung zur Rolle der Sprache im Erkenntnisprozeß. Berlin: Akademie-Verlag.
Haßler G (1992). ‘Sprachphilosophie in der Aufklärung.’ In Dascal M, Gerhardus D, Kuno L & Meggle G (eds.) Sprachphilosophie – philosophy of language – La philosophie du langage. Ein internationales Handbuch zeitgenössischer Forschung. Berlin/New York: de Gruyter. 116–144.
Haßler G & Schmitter P (eds.) (1999). Sprachdiskussion und Beschreibung von Sprachen im 17. und 18. Jahrhundert. Münster: Nodus.
Michaelis J D (1762/1974). De l’influence des opinions sur le langage et du langage sur les opinions (reprint). Stuttgart/Bad Cannstatt: Friedrich Frommann Verlag.
Neis C (2003). Anthropologie im Sprachdenken des 18. Jahrhunderts. Die Berliner Preisfrage nach dem Ursprung der Sprache (1771). Berlin/New York: Walter de Gruyter.
Ricken U (1978). Grammaire et philosophie au Siècle des Lumières. Les controverses sur l’ordre naturel et la clarté du français. Lille: Presses Universitaires.
Ricken U (1984). Sprache, Anthropologie, Philosophie in der französischen Aufklärung. Ein Beitrag zur Geschichte des Verhältnisses von Sprachtheorie und Weltanschauung. Berlin: Akademie-Verlag.
Ricken U (ed.) (1990). Sprachtheorie und Weltanschauung in der europäischen Aufklärung. Zur Geschichte der Sprachtheorien des 18. Jahrhunderts und ihrer europäischen Rezeption nach der Französischen Revolution. Berlin: Akademie-Verlag.
Storost J (1994). Langue française – langue universelle? Die Diskussion über die Universalität des Französischen an der Berliner Akademie der Wissenschaften. Zum Geltungsanspruch des Deutschen und Französischen im 18. Jahrhundert. Bonn: Romanistischer Verlag.
El Salvador: Language Situation
J DeChicchis, Kwansei Gakuin University, Sanda, Japan
© 2006 J DeChicchis. Published by Elsevier Ltd. All rights reserved.

The República de El Salvador is linguistically more Spanish than Spain. Nine tenths of this Spanish-speaking Central American culture is ethnically mestizo (of mixed American and European heritage).
Almost none of the resident Amerindians can speak anything but Spanish, which is also the dominant language of most of the European Salvadoreños. Other languages are spoken primarily as foreign languages of cultural and commercial interaction, with the possible exception of some Mayan-speaking immigrants from Guatemala. Spanish usage is so strong that, although knowledge of English is widespread in San Salvador business circles, even foreigners
Todd E (1975). ‘The Solomon Language family.’ In Wurm S A (ed.) Papuan languages and the New Guinea linguistic scene. Canberra: Pacific Linguistics C-38. 805–846.
Waterhouse W H L & Ray S H (1931). ‘The Kazukuru language of New Georgia.’ Man xxxi, 123–126.
Wurm S A (1972). ‘The classification of Papuan languages and its problems.’ Linguistic Communications 6, 118–178.
Wurm S A (1975). ‘The East Papuan phylum in general.’ In Wurm S A (ed.) Papuan languages and the New Guinea linguistic scene. Canberra: Pacific Linguistics C-38. 783–804.
Wurm S A (1982). Papuan languages of Oceania. Tübingen: Gunter Narr Verlag.
Čeremisina, Maja Ivanovna (b. 1924)
O Molchanova, Uniwersytet Szczecinski, Szczecin, Poland
© 2006 Elsevier Ltd. All rights reserved.

Maja Ivanovna Čeremisina was born in Kiev (the Ukrainian republic) in 1924. She is a Russian scholar who, after Ubrjatova’s death in Novosibirsk, took on the responsibility of continuing research on the syntax of Siberian indigenous peoples’ languages. Under her guidance, 33 scholars have investigated the syntactic structures of their mother tongues (Altai, Alutor, Buryat (Buriat), Kazakh, Ket, Khakas, Khanty, Kirghiz, Nganasan, Selkup, Shor, Tuva (Tuvin), and others). Most of them have undertaken 3-year postgraduate courses at the university in Novosibirsk. Čeremisina received her secondary and higher education in Moscow. Her first years after secondary school were during World War II. On the first day of aerial bombardment in Moscow, her parents’ house was completely destroyed, and her mother was killed. Much later, Čeremisina was educated at the University of Moscow, where she mastered literature and the Russian language and later undertook 3-year postgraduate courses at Moscow University. After graduation, she taught many subjects in Russian philology at university departments in Tomsk, Tula, Beijing (China), and Novosibirsk. Čeremisina obtained her M.A. in 1960 and her Ph.D. in 1974. Her doctoral thesis was entitled ‘Complex comparative constructions in the Russian language.’ Before Čeremisina’s doctoral defense, Ubrjatova asked her to read the manuscript of a book devoted to the analysis of complex sentences in the Yakut language. Čeremisina read the manuscript three times, trying to comprehend Yakut, the frame of mind of its speakers, and their way of expressing themselves, and also trying to penetrate into Ubrjatova’s way of thinking, which gradually opened itself up to her. Her main field of endeavor thereafter became Siberian indigenous languages.
In 1975, Čeremisina took charge of a project based on comparative and typological research into the structure of complex sentences in the languages of Siberian indigenous peoples. The starting point of the investigation was one of the postulates propounded by Ubrjatova in her monograph on Yakut syntax – that Turkic languages employ similar language means to establish links between both words and units of higher levels (phrases and sentences). Testing the postulate on other Altaic languages became the goal of Čeremisina and her disciples. Čeremisina founded a new Department of Languages and Folklore of the Indigenous Siberian Peoples at the university in Novosibirsk. At present, Čeremisina and her team are working on the typology of the simple sentence in Altaic languages. She has published five monographs, nine textbooks, and 183 papers.
See also: Altaic Languages; Turkic Languages; Yakut.
Bibliography

Čeremisina M I (1976). Sravnitel’nyje konstrukcii russkogo jazyka. Novosibirsk: Nauka.
Čeremisina M I (2002). Jazyk i ego otraženije v nauke o jazyke. Novosibirsk: Trudy gumanitarnogo fakul’teta NGU.
Čeremisina M I & Kolosova T A (1987). Očerki po teorii složnogo predloženija. Novosibirsk: Nauka.
Čeremisina M I, Brodskaja L M, Gorelova L M, Skribnik E K, Borgojakova T N & Šamina L A (1984). Predikativnoje sklonenije pričastij v altajskikh jazykakh. Novosibirsk: Nauka.
Čeremisina M I, Skribnik E K, Brodskaja L M, Sorokina I P, Šamina L A, Kovalenko N N & Ojun M V (1986). Strukturnyje tipy sintaktičeskikh polipredikativnykh konstrukcij v jazykakh raznykh system. Novosibirsk: Nauka.
Cerron-Palomino, Rodolfo (b. 1940)
C Parodi, University of California, Los Angeles, CA, USA
© 2006 Elsevier Ltd. All rights reserved.
Rodolfo Cerrón-Palomino, an emeritus professor at the Universidad Nacional Mayor de San Marcos, Lima (1970–1991), and an active professor at the Pontificia Universidad Católica del Perú (1998 to date), is a prominent figure in Andean linguistics. He received his Bachelor of Arts from the Universidad Nacional Mayor de San Marcos, Lima, and his master’s degree in linguistics from Cornell University. Professor Cerrón-Palomino has two Ph.D. degrees in linguistics, one from the Universidad Nacional Mayor de San Marcos and a second from the University of Illinois at Urbana-Champaign. He was awarded a Guggenheim fellowship, in addition to earning several other honors in Germany, Holland, and the United States. Cerrón-Palomino’s linguistic work focuses on the indigenous languages of the Andes: Quechua, Aimara, Chipaya, and Mochica. He has researched bilingualism in Peru, the linguistic interference of Spanish in Indian languages, and the influence of Quechua on Spanish. Within the Quechua linguistic family, he has produced historical, sociolinguistic, and descriptive work on several dialects of Central or Junin-Wanka Quechua. He spearheaded the standardization of modern Quechua, which has been implemented in Peruvian rural public schools to provide the students with bilingual education. Cerrón-Palomino argued against the Quechumara hypothesis championed by Orr and Longacre. He showed that Proto-Quechua and Proto-Aimara are languages that do not share a common origin. His historical work on Quechua continued in his book Lingüística Quechua (1987), a broad survey of Quechua geographical distribution, including all the main dialectal variations, origins, development, classification into dialects, phonology, and grammar. He studied the Aimara linguistic family from Peru, Bolivia, and Chile in his book Lingüística Aimara (2000).
In his detailed work on Andean languages, he uses colonial sources such as Ludovico Bertonio, Diego Gonçalez Holguin, and Inca Garcilaso de la Vega. In his article Lenguas de la costa norte peruana (2004), he sheds light on prehispanic language contact
in Peru due to the ‘mitmas.’ The mitmas were Inca institutions that caused demographic movements of different human linguistic groups across the Inca Empire and that aimed to control newly conquered people. Recently, he has been working with Andean Spanish and Chipaya, the only living language of the Uru family. In his book on Andean Spanish, Castellano andino (2003), he addresses the variation of rural Spanish due to the interference of Quechua and Aimara bilingualism in Peru. He exposes the prejudice the speakers of Andean Spanish have to face, since their speech, called ‘motoso Spanish,’ is highly stigmatized. Andean Spanish has a noncanonical word order and different agreement patterns than standard Spanish. In addition, Andean Spanish incorporates Quechua and Aimara loanwords that are not used in standard Peruvian Spanish. Cerrón-Palomino traces the origins of Andean Spanish in colonial texts, such as Guaman Poma’s El primer nueva coronica y buen gobierno. He has also analyzed Quechua loanwords in local Peruvian Spanish and Spanish loanwords in Quechua.
See also: Bilingual Education; Peru: Language Situation; Quechua; Spanish; Standardization.
Bibliography

Cerrón-Palomino R (1976). Gramática quechua. Lima: Instituto de Estudios Peruanos.
Cerrón-Palomino R (1987 [2003]). Lingüística quechua. Cuzco: Centro de Estudios Regionales Andinos Bartolomé de las Casas.
Cerrón-Palomino R (1994). Quechúmara: estructuras paralelas del quechua y del aimara. La Paz: CIPCA.
Cerrón-Palomino R (1995). La lengua del Nailamp: reconstrucción y obsolescencia del mochica. Lima: Fondo Editorial de la Pontificia Universidad Católica del Perú.
Cerrón-Palomino R (2000). Lingüística aimara. Lima: Instituto Francés de Estudios Andinos.
Cerrón-Palomino R (2003). Castellano andino. Lima: Fondo Editorial de la Pontificia Universidad Católica del Perú.
Cerrón-Palomino R (2004). ‘Lenguas de la costa norte peruana.’ In Estrada Fernández Z et al. (eds.) Estudios en lenguas amerindias: homenaje a Ken L. Hale. Hermosillo: Universidad de Sonora.
Cerulli, Enrico (1898–1988)
P D Fallon, Howard University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.
Enrico Cerulli, born on February 15, 1898 in Naples, was a diplomat, anthropologist, and linguist who specialized in the languages, history, and cultures of Ethiopia, Somalia, and Eritrea. After attending the University of Naples and earning a law degree, in 1920 Cerulli began an impressive diplomatic career. He became Secretary and later Director of Political Affairs in Italian Somaliland from 1920 to 1925. A Counselor for the Italian Legation to Ethiopia from 1926 to 1929, Cerulli also participated in an exploratory trip to Western Ethiopia and a scientific expedition to the sources of the Wabē Shabelē River. He was the Italian representative to the Anglo-Italian Boundary Commission for Somaliland (1930–1931). He became the Secretary-General of the Ministry for Italian Africa from 1932 to 1935, then a delegate to the League of Nations (1935–1937). He married Lina Ciotola in 1936 and had two sons with her. Favored by Mussolini, Cerulli was appointed Deputy Governor-General of Italian East Africa in 1938. Cerulli is widely recognized as a scholar of considerable repute. Yet Bahru (2001: 162) has cited Cerulli as “the perfect example of scholarship being put at the service of colonial administration.” Sbacchi (1985) wrote that many Italians in Africa thought that Cerulli was incapable of carrying out responsibilities of high office and that he favored Somalis over other ethnic groups. Yet he was one of only a few officials with extensive colonial experience, and he was highly knowledgeable about the local environment. Because of incompatibility with the Governor-General, the Duke of Aosta, he was transferred to Harar as Governor. Cerulli retired from public service (1940–1944). Later, he was an Italian delegate to the Peace Conference (1944–1947), the Four-Power Conferences in London from 1947 to 1949, and delegate to the United Nations regarding Italian African Territories.
From 1950 to 1954 he was ambassador to Iran, and from 1955 to 1968 he was Councilor of State. Cerulli wrote many grammatical sketches (based on fieldwork) on languages around the Horn of Africa, including Harari (Ethiopian Semitic); Somali, Sidamo, Arbore, Komso, and Daasanach (Cushitic); Janjero (Yemsa), Chara, Basketto, and Koorete (Omotic); and Berta (Nilo-Saharan). Some of these works were the first linguistic documentation of the
languages and are generally quite accurate and useful to this day. In addition, Cerulli showed great interest in Somali poetry and songs. Islam was a driving interest of Cerulli’s, from his university studies through his later writings on Islamic connections to medieval Western culture (1949) and Islam in general (1971). Yet Cerulli was also interested in Ethiopian Christianity, its links to Eastern Christianity in Palestine, and its Ge’ez texts on Ethiopian saints. Cerulli received honorary doctorates from the Universities of Brussels, Rome, and Manchester. He was a member of many academies and societies in Europe, including the Accademia Nazionale dei Lincei (also its Vice-President). He served as President of the Italian Anthropological Institute. Cerulli was a “man of science and a man of action” (Ricci, 1990) who successfully combined two careers in public service and in scholarship. Cerulli died on September 19, 1988 in Rome.

See also: African Linguistics: History; Cushitic Languages; Eritrea: Language Situation; Ethiopia: Language Situation; Ethiopian Semitic Languages; Nilo-Saharan Languages; Omotic Languages; Somali; Somalia: Language Situation.
Bibliography

Bahru Z (2001). A history of modern Ethiopia, 1855–1991 (2nd edn.). Athens, OH: Ohio University Press.
Cerulli E (1922). The folk literature of the Galla of Southern Abyssinia. (Harvard African Studies 3). Cambridge: Harvard University Press.
Cerulli E (1933). Etiopia occidentale (2 vols). Rome: Sindacato Italiano Arti Grafiche.
Cerulli E (1936–1951). Studi etiopici (4 vols). Vol. I: La lingua e la storia di Harar (1936); vol. II: La lingua e la storia dei Sidamo (1938); vol. III: Il linguaggio dei Giangerò ed alcune lingue Sidama dell’Omo (Basketo, Ciara, Zaissè) (1938); vol. IV: La lingua caffina (1951). Rome: Istituto per l’Oriente.
Cerulli E (ed.) (1949). Il Libro della scala e la questione delle fonti arabo-spagnole della Divina commedia. Vatican City: Biblioteca Apostolica Vaticana.
Cerulli E (1957–1964). Somalia: Scritti vari editi ed inediti (3 vols). Rome: Istituto Poligrafico dello Stato P. V.
Cerulli E (1971). L’Islam di ieri e di oggi. Rome: Istituto per l’Oriente.
Ricci L (1990). ‘Enrico Cerulli e l’Istituto per l’Oriente.’ Oriente Moderno nuova serie 9(70), 1–6.
Sbacchi A (1985). Ethiopia under Mussolini: Fascism and the colonial experience. London: Zed Books.
Chad: Language Situation
J Roberts, SIL-Chad and Université de N’Djaména, Chad
© 2006 Elsevier Ltd. All rights reserved.
Introduction and History

The Republic of Chad is located in the Sahel of north-central Africa, at the meeting place of three of the four major phyla of African languages. As a result, Chad is characterized by great linguistic variety. Today Chad is a checkerboard of numerous language groups, most of them small and each of which is located in a limited geographic area (see Appendix 1). The Ethnologue (Gordon, 2004) lists a total of 131 living languages in Chad. Only 21 of these languages boast more than 100 000 speakers (see Table 1); nonetheless, these 21 represent about three-fourths of the population of Chad, which current estimates place at about 9 million. More than 50 Chadian languages, on the other hand, have fewer than 10 000 speakers. The present linguistic situation is the result of movements of peoples, contact between ethnic groups, and the dominance of certain ethnic groups over the centuries. Most of Chad’s peoples claim to have come from northeastern Africa or the Middle East, but the details of their origins have been largely lost in the oral history. The Arabs came in several waves, beginning in the 14th century; their language has had a heavy influence on the indigenous languages of Chad because of its continued contact with them over the years. The Kanem-Borno empire, the Bagirmi kingdom, and the Wadai kingdom, which enjoyed ascendancy at different periods from the 9th to the 19th centuries, are in part responsible for the present-day use of Kanembu, Bagirmi, and Maba as vehicular languages. Finally, the French language arrived in Chad at the end of the 19th century as the colonizers’ language of administration and education.
French, Arabic, and Other Languages of Wider Communication

Since Chad gained its independence in 1960, French has been the official language; Arabic was early proposed as a second official language, but this proposal has been a source of controversy. The Constitution of 1996 declared that French and Arabic are the official languages of Chad. But there has been some ambivalence as to whether the Arabic referred to is Modern Standard Arabic or vernacular Chadian Arabic. If Chad fit the pattern of diglossia common throughout the Arabic-speaking world, Standard Arabic would
be deemed the only variety worthy of official status, the local vernacular (Chadian Arabic) being suitable only for oral informal communication. Nonetheless, Chadian Arabic is used in ways that would otherwise be considered inappropriate: it is used in formal settings (speeches, news broadcasts), and various efforts have been made to give it a standardized written form, even using Latin characters. A growing body of literature exists in Chadian Arabic. Further, the association of Arabic with Islam has made the Arabic question a bone of contention in Chad’s officially secular society. It is true that Arabic does not enjoy widespread use in public life (e.g., in education, on signs and billboards, in newspapers) as compared with French, which is still seen as the language
Table 1 Chadian languages with more than 100 000 speakersa Language (with classificatory affiliation)
Number of native speakers (in thousands)
Chadian Arabic
1100
Nilo-Saharan languages (Sara subgroup)
Ngambay Gor/Mango Sar Gulay Kaba Na/Kaba Deme/Kuifa Mbay
1200 350 270 240 170 140
Nilo-Saharan languages (Bagirmi subgroup)
Naba (Bilala, Kuka, Medogo)
340
Nilo-Saharan languages (Saharan branch)
Kanembu Kanuri Dazaga Zaghawa
570 130 420 120
Nilo-Saharan languages (Maban branch)
Maba
470
Chadic languages (Chari-Logone group)
Nancere Lele Gabri
110 100 100
Chadic languages (Masa branch)
Musey Marba Masana
260 180 160
Niger-Congo languages (Adamawa family)
Mundang Tupuri
240 130
a Estimated numbers of native speakers in 2004, in thousands, based on projections of ethnic group population from the 1993 census (cf. Bureau Central du Recensement, 1994).
of prestige, the principal language of education and administration. The issue of a ‘bilingual’ Chad, where French and Arabic are used as equals on the national level, has engendered much debate (cf. Coudray, 1998). In a country with about 130 languages, bilingualism is a necessity for interethnic communication, but the lingua francas used are not limited to the official languages. Some estimates reckon that only 30% of Chadians understand French, and only a very small minority of Chadians master Standard Arabic. On the other hand, the proportion who understand vernacular Chadian Arabic may be as high as 70%. Chadian Arabic is widely used as a second language throughout the north and east of the country, and it is also gaining ground in certain areas of the south; ‘Bongor Arabic’ is the pejorative term used to describe the pidginized Arabic used in southwest Chad. But other Chadian languages are also used in specific geographic areas for interethnic communication: Fulfulde in the southwest near the Cameroon border; Sar in the far south, and/or a Sara-Ngambay mixture used across the south; Bagirmi along the Chari River; Kanembu in the area around Lake Chad; Dazaga in the north from Lake Chad across to the Sudan border; and Maba throughout much of eastern Chad. In urban centers, such as the capital N’Djaména, speakers of diverse languages live side by side, and language use is adapted accordingly. It is common to hear frequent code switching between French, Chadian Arabic, and a local language, depending on the constraints of a given communication situation. Another urban phenomenon is ‘common Sara,’ forged by speakers of a diversity of Sara languages to enable them to communicate freely.
Use of Chadian Languages in Writing and Literacy
Only 33% of Chadians can read or write French or Arabic, according to a Census Bureau survey in 1998. But there are small numbers of literates in national languages, too. The government’s Direction de l’Alphabétisation et de Promotion des Langues Nationales (DAPLAN) coordinates efforts in nonformal education, efforts to provide writing systems for Chadian languages, and efforts to develop pedagogical materials so that Chadian languages can be used in literacy programs. A number of Chadian languages have been reduced to writing for specific practical purposes. Christian missionaries were the first to do so: translations of Bible portions first appeared in the Mbay language in 1932, in Mundang in 1933, and in Ngambay in 1936. This effort grew in subsequent years, so that today
about 50 Chadian languages have some published religious materials, whether Bible portions, liturgical materials, or New Testaments and Bibles. In recent years, nonreligious development agencies and nongovernmental organizations have worked to disseminate health and agricultural materials in a variety of Chadian languages. There are also grassroots efforts to further the use of Chadian languages. Numerous language committees and cultural associations have been formed in recent years with a view to promoting the local languages, developing local literacy programs on the village level, and developing a body of literature in these languages.
Classification and Linguistic Characteristics of Chadian Languages
The three phyla of African languages represented in Chad are Afro-Asiatic (Afrasian), Nilo-Saharan, and Niger-Congo.
Nilo-Saharan Languages
The Nilo-Saharan phylum is represented in Chad by the following branches (cf. Bender, 1996):
1. Central Sudanic, and specifically the Sara-Bagirmi group. More than one third of Chad’s population speak languages from the Sara subgroup, which constitutes a dialect continuum in a broad band along the southern border with the Central African Republic. The principal Sara languages are Ngambay, Kaba, Gulay, Mbay, Sar, and Kaba Na, as well as a number of other, intermediate varieties, such as Gor, that have been dubbed dialects or languages according to different classifications. The Bagirmi subgroup, located further to the north, includes Bagirmi, Naba, and Kenga.
2. Maban, located in the Wadai near the Sudanese border: principal languages are Maba, Masalit, and Runga.
3. Saharan (located throughout the northern desert): its languages are Tedaga, Dazaga, Zaghawa (Beria), Kanembu, and Kanuri.
4. Eastern Sudanic, represented by the Tama group and Daju, spoken in eastern and central Chad.
5. For, represented by the Amdang language in the area of Biltine.
Linguistically, the Nilo-Saharan languages show the most diversity among Chadian languages. The Sara languages are quite simple morphologically and syntactically, while languages of the Saharan group show great complexity; the other subgroups show intermediate degrees of complexity.
Afro-Asiatic Languages: Chadic Family
More than one-third of Chadian languages belong to the Chadic family, one division of the Afro-Asiatic (or Afrasian) phylum. (Note that the adjective Chadic refers to this particular language family; the term Chadian refers to the country of Chad.) The following branches of the Chadic family are found in Chad (cf. Barreteau and Newman, 1978):
1. East Chadic. All the languages of this branch are spoken in Chad; it divides into two subgroups: the Guéra group (Newman’s group ‘B’), located in the Guéra region of central Chad, whose major languages are Dangaléat, Migaama, Bidiyo, Mukulu, and Mubi; and the Chari-Logone group (Newman’s group ‘A’), located between Chad’s two major rivers, and whose major languages are Kera, Kwang, Nancere, Lele, Gabri, Somrai, and Tumak.
2. Masa. All languages of the Masa branch of Chadic are spoken in Chad, along its border with Cameroon. The main languages are Masana, Marba, Musey, and several varieties of Zime.
3. Central Chadic (Biu-Mandara). Only a few languages of this branch are spoken in Chad, all located along the Cameroon border, notably Kotoko (several varieties), Buduma, Musgu, and Gidar.
These Chadic languages are normally characterized by grammatical gender in their nominal systems; their verbs may be marked for directionality and/or plurality of action. Vowels are often limited in number; some languages have even been analyzed as having only one underlying vowel, a. Many Chadic languages exploit labialization (lip-rounding) and palatalization (fronting) to a greater or lesser degree. Vowel length and consonant gemination are common phenomena in the languages of the Guéra subgroup.
Niger-Congo Languages: Adamawa Family
The Niger-Congo languages are represented in Chad essentially by the Adamawa family (cf. Boyd, 1989). Apart from a couple of isolate languages within Adamawa (Day, Laal), the languages spoken in Chad fall into two groups: 1. Mbum group, along the border with Cameroon; its principal languages are Mundang, Tupuri, and the Eastern Mbum cluster of Nzakambay, Kuo, and Karang. The languages of the Kim cluster, spoken along the Logone River, were originally recognized as a separate group but are now joined to the Mbum group.
2. Bua group, a cluster of languages spoken from the Chari River north to the Guéra; it includes Bua, Niellim, Tunia, Bolgo, and Gula varieties.
Adamawa languages are notable within the Niger-Congo phylum for their lack of a functioning noun class system; in general, the morphology and syntax of Adamawa languages are not greatly complex.
Common Features
A number of distinctive features of Chadian languages are worthy of mention because they can be found in languages of all three phyla. In the sound systems, many languages have phonemic implosives ɓ and ɗ (sometimes also ); many have a series of prenasalized plosives (mb, nd, nj, ng); some have the retroflex flap ɽ; and the labial flap v̆ occurs with some regularity, especially in ideophones. A few languages have ATR vowel harmony (in certain Saharan, Maban, and Bua languages). Most Chadian languages are tonal, although accent also seems to play an important role in some of them. Most languages have three register tones, which are relatively stable (i.e., not subject to spreading or downstep) but which can combine into tonal contours. Tone is exploited to make both lexical and grammatical distinctions. SVO word order predominates in Chadian languages of all families; the only notable exception is the SOV order found in Saharan and Maban languages. Morphology is also relatively simple in most of the languages of southern Chad, regardless of classificatory affiliation. The Eastern Chadic languages have a somewhat richer morphology, but the greatest degree of morphological complexity is undoubtedly to be found in the verbal systems of the Saharan languages.
Prospects
Chadian languages have not received much attention from linguists. In the 1960s and 1970s, Europeans such as Jean-Pierre Caprile and Herrmann Jungraithmayr did active research (especially in the Sara languages and the Chadic languages, respectively). In recent years, Chadians have done research and description, especially through the National Institute of Human Sciences or the linguistics department of the University of N’Djaména. Other scholars and organizations, such as SIL, continue to engage in on-site linguistic research and description of Chadian languages. But much remains to be done. The very existence of certain languages, such as Zerenkel or Mabire in the Guéra, has been discovered or confirmed only in
the past few years, because it is still difficult to reach many areas of Chad even today. And since so many of Chad’s language groups are small, it is questionable how long the languages can continue. Indeed, several languages are already moribund, with only a handful of older speakers still living (e.g., Berakou, Mabire, Goundo), and a few others are seriously endangered because of language shift. Nonetheless, bilingualism of the mother tongue with another major language (such as Chadian Arabic) is stable among most groups; many remain relatively isolated from the mainstream; and in most groups children still learn the language of their parents. So most Chadian languages should continue to be spoken for at least two or three generations to come.
See also: Bilingualism; Code Switching and Mixing; Endangered Languages; Niger-Congo Languages; Nilo-Saharan Languages.
Language Maps (Appendix 1): Map 4.
Bibliography
Alio K (1997). ‘Langues, démocratie et développement.’ Travaux de linguistique tchadienne 1, 5–31.
Barreteau D (ed.) (1978). Inventaire des études linguistiques sur les pays d’Afrique noire d’expression française et sur Madagascar. Paris: Conseil International de la langue française.
Barreteau D & Newman P (1978). ‘Les langues tchadiques.’ In Barreteau (ed.). 291–329.
Bender M L & Doornbos P (1983). ‘Languages of Wadai-Darfur.’ In Bender M L (ed.) Nilo-Saharan language studies. East Lansing, MI: African Studies Center. 43–79.
Bender M L (1996). The Nilo-Saharan languages. München: Lincom Europa.
Boyd R (1989). ‘Adamawa-Ubangi.’ In Bendor-Samuel J (ed.) The Niger-Congo languages. Lanham, MD: University Press of America. 178–215.
Bureau Central du Recensement (1994). Recensement général de la population et de l’habitat. N’Djaména: Ministère du Plan et de la Coopération.
Caprile J-P (1972). ‘Carte linguistique du Tchad.’ In Cabot J (ed.) Atlas pratique du Tchad. Paris and N’Djaména: Institut Géographique National and Institut National des Sciences Humaines. 36–37.
Caprile J-P (1977). ‘Introduction.’ In Caprile J-P (ed.) Etudes phonologiques tchadiennes. Paris: SELAF. 11–21.
Caprile J-P (1978). ‘Le Tchad.’ In Barreteau (ed.). 449–463.
Caprile J-P (1981). ‘Les langues sara-bongo-baguirmiennes et leur classification.’ In Perrot (ed.). 237–242.
Collelo T (ed.) (1990). Chad: a country study. Washington, DC: U.S. Government Printing Office.
Coudray H (1998). ‘Langue, religion, identité, pouvoir: le contentieux linguistique franco-arabe au Tchad.’ In Centre Al-Mouna Contentieux linguistique arabe-français. N’Djaména: Centre Al-Mouna. 19–69.
Gordon R (2004). Ethnologue: languages of the world (15th edn.). Dallas: SIL.
Jouannet F (1978). ‘Situation sociolinguistique du Tchad: approches.’ In Caprile J-P (ed.) Contacts de langues et contacts de cultures, vol. 2: La situation du Tchad: approche globale au niveau national. Paris: SELAF. 11–121.
Jullien de Pommerol P (1997). L’arabe tchadien: émergence d’une langue véhiculaire. Paris: Karthala.
Jungraithmayr H (1981). ‘Les langues tchadiques: Généralités’ and ‘Inventaire des langues tchadiques.’ In Perrot (ed.). 401–413.
Perrot J (ed.) (1981). Les langues dans le monde ancien et moderne, Première partie: Les langues de l’Afrique subsaharienne. Paris: Centre National de la Recherche Scientifique.
Zeltner J-C (1970). ‘Histoire des Arabes sur les rives du lac Tchad.’ In Annales de l’Université d’Abidjan F. 2–2.
Chadic Languages
P J Jaggar, University of London, London, UK
© 2006 Elsevier Ltd. All rights reserved.
Introduction
The Chadic language family comprises an estimated 140 to 150 languages spoken in areas to the west, south, and east of Lake Chad (west Africa). The best-known and most widespread Chadic language is Hausa, with upwards of 30 million first-language speakers, more than any other language in Africa south of the Sahara. The remaining languages, some of which are rapidly dying out (often due to pressure from Hausa), probably number little more than several million speakers in total, varying in size from fewer than half a million to just a handful of speakers, and new languages continue to be reported. Written descriptions of varying length and quality are available for only about one-third of the total, although for some – e.g., Bidiya (Bidiyo), Guruntum, Kanakuru (Dera), Kera, Kwami, Lamang, Margi (Marghi Central), Miya, and Mupun – good descriptive grammars have been produced, and several dictionaries have appeared, e.g., Dangaléat, Lamé, Ngizim, and Tangale. Hausa has four recent comprehensive reference grammars, in addition to two
high-quality dictionaries, making it the best-documented language in sub-Saharan Africa. Chadic is a constituent of the Afroasiatic phylum, which also includes Semitic (e.g., Amharic, Arabic, [Standard] Hebrew), Cushitic (e.g., Oromo, Somali), Omotic (e.g., Dime, Wolaytta), Berber (e.g., Tamahaq and Tamajeq [Tamajeq, Tayart] [spoken by the Tuareg], Tamazight [Central Atlas]), and (extinct) Ancient Egyptian/Coptic. The phylogenetic membership of Chadic within Afroasiatic was first proposed almost 150 years ago, but did not receive wide acceptance until Greenberg’s (1963) major (re)classification of African languages. The standard internal classification divides Chadic languages into three major branches: West (e.g., Hausa, Bole, Angas, Ron, Bade), Central = Biu-Mandara (e.g., Tera, Mandara, Bachama-Bata [Bacama], Kotoko [Afade]), and East (e.g., Somrai, Kera, Dangaléat), in addition to an isolated Masa cluster (with subbranches and smaller groupings).
Phonology
Laryngealized implosive stops, e.g., /ɓ ɗ/, and ejective stops, e.g., /p’ t’/, are widespread throughout Chadic, together with prenasalized obstruents, e.g., /mb nd/. A characteristic pattern, therefore, is for a language to present a four-way phonation contrast, e.g., coronal /t d ɗ nd/ and/or labial /p b ɓ mb/. The voiceless and voiced lateral fricatives /ɬ ɮ/ are also commonplace, in addition to palatal and velar (including labialized velar) consonants. Vowel systems generally vary from two (monophthongal) vowels, high /ə/ (with various phonetic values) and low /a/, as in Bachama-Bata and Mandara, to seven vowels, e.g., [Dangaléat] /i e ɛ a ɔ o u/, with /i (e) a ə (o) u/ a common inventory, and the diphthongs /ai/ and /au/ are attested. Tangale has a nine-vowel ATR pattern. Contrastive vowel length, especially in medial position, is also widespread throughout the family. Chadic languages are tonal, and two level (High/Low) tones, e.g., Hausa, or three (High/Mid/Low), e.g., Angas, are typical. Downstep is also common (e.g., Ga’anda, Miya, Tera). Although tone can be lexically contrastive, its primary function is normally grammatical, e.g., in distinguishing tense/aspect/mood categories. [Transcription: aa = long vowel, a = short; à(a) = L(ow) tone, â(a) = F(alling) tone, H(igh) tone is unmarked.]
Morphology and Syntax
Many Chadic languages have masculine/feminine grammatical gender (an inherited Afroasiatic feature), with no distinction in the plural, and typically distinguish gender in second and third person singular pronouns, e.g., [Miya] fiy/mace ‘you (MASC/FEM)’, te/nje ‘he/she’. Some also preserve the characteristic n/t/n (MASC/FEM/PL) marking pattern in grammatical formatives (and the masculine and plural markers often fall together phonologically), cf. [Masa] vèt-na ‘rabbit’, vèt-ta ‘female rabbit’, vèdai-na ‘rabbits’. Noun pluralization is complex, and some widespread plural suffixes are reconstructable for Proto-Chadic, e.g., *-Vn, *-aki, *-i, and *-ai. Examples: (-Vn) kùmen/kùmenen ‘mouse/mice’ [Bade], miyò/mishan ‘co-wife/co-wives’ [Kanakuru], (-aki) goonaa/gòonàkii ‘farm(s)’ [Hausa], (-i) duwimà/dùwìmi ‘guineafowl(s)’ [Gera], (-ai) mùtù/mutai ‘sore(s)’ [Dangaléat]. Other plurals entail infixation of internal -a-, e.g., [Ron] sàkur/sakwâar ‘leg(s)’. Some languages restrict overt plural marking to a narrow range of nouns (typically humans and animals). Verbs in many Chadic languages have retained the lexically arbitrary Proto-Chadic distinction between final –a and final –ə verbs (where the final schwa vowel is often pronounced as [i], [e], or [u]), cf. [Tera] na ‘see’ and dlə ‘get’, [Guruntum] daa ‘sit’ and shi ‘eat’. Verbal semantics and valency are modified by the addition of one or more derivational extensions (often fused suffixes). These extensions encode such notions as action in the direction of (centripetal) or away from (centrifugal) a deictic center (often the speaker), or action partially or totally completed, e.g., (totality) sà-nyà ‘drink up’ < sà ‘drink’ [Margi]. Some extensions also have a syntactic function, denoting, inter alia, transitivization or perfectivity, e.g., (transitivization) yàw-tu ‘take down’ < yàwwu ‘go down’ [Bole], kàta-naa ‘return’ (TRANS) < kàtee ‘return’ (INTRANS) [Ngizim]. Verb stems can be overtly inflected for tense-aspect-mood by segmental and/or tone changes.
Many languages also have so-called ‘pluractional’ verbs, which express an action repeated many times or affecting a plurality of subjects (if intransitive) or objects (if transitive), and are formed via prefixal reduplication, ablaut, or gemination, e.g., [Guruntum] pàni/pàppàni ‘take’, [Angas] fwin/fwan ‘untie’, [Pero] lofò/loffò ‘beat’. In some languages, pluractional stems occur with plural subjects of intransitive verbs and plural objects of transitive verbs, producing ergative-type agreement. In a number of languages, intransitive verbs are followed by an ‘intransitive copy pronoun’, which maps the person, number, and gender of the coreferential subject, e.g., [Kanakuru] nà pòrò-no ‘I went out’ (literally I went out-I). Derivational and inflectional reduplication is widespread throughout the family (often signaling semantic intensification), ranging from (a) copying of a single segment, e.g., [Miya] pluractional verb tlyaaɗe
‘to hoe repeatedly’ < tlyaɗe ‘to hoe’, [Bidiya] tàttuk ‘very large’ < tàtuk ‘large’; (b) reduplication of a syllable, e.g., [Hausa] prefixal reduplication of the initial CVC syllable of a sensory noun to form an intensive sensory adjective, as in zùzzurfaa ‘very deep’ (< zur-zurf-aa) < zurfii ‘depth’ (with gemination/assimilation of the coda /r/); (c) full reduplication (exact copy), e.g., [Guruntum] kìnì-kìnì ‘just like this’ < kìnì ‘like this’, [Kwami] kayò-kayò ‘a gallop’ < kayò ‘a ride’, [Tangale] sàŋ-sàŋ ‘very bright’
‘exceed, surpass, be more than’, i.e., exceed object X in relation to manner Y. In noun phrase syntax, the normative order for constituents is head-initial, i.e., head noun followed by definite determiners, possessives, numerals, relative clauses, etc. The linear order in genitive constructions is possessee X (+ ‘of’ linker) + possessor Y, e.g., [Margi] tagu ge Haman ‘Haman’s horse’ (literally horse of Haman). Many Chadic languages also make an overt distinction between alienable and inalienable possession whereby inalienable possession is expressed by direct juxtaposition (i.e., with no overt linker), cf. (inalienable) menda Miyim ‘Miyim’s wife’ (literally wife Miyim), and (alienable) gam ma tamnoi ‘the woman’s ram’ (literally ram of woman) [Kanakuru]. Reflexive pronouns and reciprocals (phrasal anaphors) are typically formed with the body-part nouns ‘head’ and ‘body’ respectively, e.g., [Kwami] kuu-nì ‘himself’ (literally head-his), [Miya] tuwatùw-àamà ‘each other (we)’ (literally body-our). See also: Africa as a Linguistic Area; African Linguistics:
History; Afroasiatic Languages; Hausa; Nigeria: Language Situation; Phonology: Overview; Syntax of Words.
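The three reduplication patterns described under Morphology and Syntax, (a) copying a single segment, (b) prefixing a copy of the initial CVC syllable, and (c) full reduplication, can be illustrated with a small Python sketch. This is only an illustration under strong simplifying assumptions, not an analysis of any one language: the function names are hypothetical, each letter is treated as one segment, tone marks are omitted, and in pattern (b) the copied coda is assumed always to assimilate fully to the stem-initial consonant, as it does in the Hausa example zurfii > zuzzurfaa.

```python
def copy_segment(word: str, index: int) -> str:
    """Pattern (a): double the single segment found at position `index`."""
    return word[: index + 1] + word[index] + word[index + 1 :]


def cvc_prefix_reduplication(stem: str, suffix: str) -> str:
    """Pattern (b): prefix a copy of the stem-initial CVC syllable.

    Simplifying assumption: the coda of the copied syllable assimilates
    completely to the stem-initial consonant, which yields a geminate.
    """
    consonant, vowel = stem[0], stem[1]
    assimilated_copy = consonant + vowel + consonant
    return assimilated_copy + stem + suffix


def full_reduplication(word: str) -> str:
    """Pattern (c): exact copy of the whole word, joined by a hyphen."""
    return f"{word}-{word}"


if __name__ == "__main__":
    # Forms are taken from the article, with tone marks left out.
    print(copy_segment("tatuk", 2))                # Bidiya: tattuk
    print(cvc_prefix_reduplication("zurf", "aa"))  # Hausa: zuzzurfaa
    print(full_reduplication("kayo"))              # Kwami: kayo-kayo
```

A real morphological model would work on segments and tones rather than letters, and would state the assimilation rule for each language separately.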
Bibliography
Alio K (1986). Essai de description de la langue bidiya du Guéra (Tchad). Phonologie – Grammaire. Marburger Studien zur Afrika- und Asienkunde, Serie A, Afrika, 45. Berlin: Dietrich Reimer.
Ebert K H (1979). Sprache und Tradition der Kera (Tschad), Teil 3: Grammatik. Marburger Studien zur Afrika- und Asienkunde, Serie A, Afrika, 15. Berlin: Dietrich Reimer.
Fédry J (1971). Dictionnaire dangaléat (Tchad). Paris: Afrique et Langage.
Frajzyngier Z (1993). A grammar of Mupun. Sprache und Oralität in Afrika, 14. Berlin: Dietrich Reimer.
Frajzyngier Z (1996). Grammaticalization of the complex sentence: a case study in Chadic. Amsterdam: John Benjamins.
Greenberg J H (1963). The languages of Africa. Indiana University Research Center in Anthropology, Folklore, and Linguistics, publication 25. International Journal of American Linguistics, 29(1), part 2. Bloomington: Indiana University.
Haruna A (2003). A grammatical outline of Gùrdùn/Gùrùntùm (southern Bauchi, Nigeria). Westafrikanische Studien, Frankfurter Beiträge zur Sprach- und Kulturgeschichte, vol. 25. Cologne: Rüdiger Köppe.
Hoffmann C (1963). A grammar of the Margi language. London: Oxford University Press.
Jaggar P J (2005). ‘Hausa and Chadic.’ In Strazny P (ed.) Encyclopedia of linguistics. New York/London: Routledge. 445–447.
Jaggar P J & Wolff H E (2002). Chadic and Hausa linguistics: the selected papers of Paul Newman with commentaries. Cologne: Rüdiger Köppe.
Jungraithmayr H & Ibriszimow D (1994). Chadic lexical roots, vol. 1: Tentative reconstruction, grading, distribution and comments; vol. 2: Documentation. Sprache und Oralität in Afrika, 20. Berlin: Dietrich Reimer.
Jungraithmayr H, Galadima N A & Kleinewillinghöfer U (1991). A dictionary of the Tangale language (Kaltungo, northern Nigeria). Sprache und Oralität in Afrika, 12. Berlin: Dietrich Reimer.
Newman P (2000). The Hausa language: an encyclopedic reference grammar. New Haven and London: Yale University Press.
Newman P (2003). ‘Chadic languages.’ In Frawley W J (ed.) International Encyclopedia of Linguistics, 2nd ed., vol. 1. Oxford and New York: Oxford University Press. 304–307.
Newman P & Ma R (1966). ‘Comparative Chadic: phonology and lexicon.’ Journal of African Languages 5, 218–251.
Pawlak N (1994). Syntactic markers in Chadic: a study on development of grammatical morphems [sic]. Warsaw: Instytut Orientalistyczny, Uniwersytetu Warszawskiego.
Sachnine M (1982). Dictionnaire lamé-français. Lexique français-lamé. Paris: SELAF.
Schuh R G (1981). A dictionary of Ngizim. University of California Publications in Linguistics 99. Berkeley and Los Angeles: University of California Press.
Schuh R G (1998). A grammar of Miya. University of California Publications in Linguistics 130. Berkeley and Los Angeles: University of California Press.
Stolbova O V (1996). Studies in Chadic comparative phonology. Moscow: Diaphragma Publishers.
Wolff H E (1983). A Grammar of the Lamang language (Gwa’d Lamang). Afrikanistische Forschungen, 10. Glückstadt: J. J. Augustin.
Chamberlain, Basil Hall (1850–1935)
M McCaskey, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.
Basil Hall Chamberlain (1850–1935) was born in Southampton, England, but educated in France and Switzerland. His family wanted him to become a banker in London, but he went to Japan instead, on the grounds that travel would be better for his health. He arrived in Japan in 1873 to begin intensive study of the Japanese language. In 1876, he became a lecturer in English language and literature at Tokyo University. Having independent means, however, he decided to resume his own studies a year later. He resumed teaching English at the Japanese Naval Academy from 1874 to 1882, since this post paid well and allowed him time for his own language studies. In 1886, he became professor of Japanese and philology at Tokyo Imperial University, specializing in historical linguistics. He was among the first Westerners to do serious advanced research on Japanese academic subjects, and one of his advanced students at Tokyo University, Ueda Kazutoshi (1867–1937), became a pioneer in developing modern Japanese lexicography and the study of linguistics in Japan. In addition to his work on standard Japanese, Chamberlain studied the Japanese language variants used in the Ryukyu Islands, as well as the Ainu language, then still spoken widely in northern
Hokkaido. He began studying Korean in 1881 and subsequently did comparative linguistic studies of Japanese, Ryukyu dialect, Ainu, and Korean, initiating research carried on and developed by Hattori Shirō and other Japanese linguists two generations later. In 1880, Chamberlain began his research on the Kojiki, the oldest Japanese history text written in the Japanese language. In 1883 he published a translation, the Ko-ji-ki or records of ancient matters (Chamberlain, 1982), with notes by William George Aston (1841–1911), the Tokyo British embassy secretary. Aston was a Japanese language specialist who published a companion translation of the Nihon shoki (1896), the oldest Japanese history text written in classical Chinese. Chamberlain continued his Kojiki research after leaving Japan, publishing his final version in 1932. The next full English Kojiki translation did not appear until 1969. Though Chamberlain’s professional focus was on classical literature and linguistic analysis, he also wrote a series of popular books introducing Westerners to Japanese language and culture. A handbook of colloquial Japanese (1888), designed for use by learners at a basic level, was widely used for more than half a century, until it was superseded by more complete references in English after World War II. Chamberlain’s Things Japanese (1890), one of the few books available in this area then, was a popular introduction to Japanese social life and customs and is still in print today.
Jaggar P J & Wolff H E (2002). Chadic and Hausa linguistics: the selected papers of Paul Newman with commentaries. Cologne: Rüdiger Köppe.
Jungraithmayr H & Ibriszimow D (1994). Chadic lexical roots, vol. 1: Tentative reconstruction, grading, distribution and comments; vol. 2: Documentation. Sprache und Oralität in Afrika, 20. Berlin: Dietrich Reimer.
Jungraithmayr H, Galadima N A & Kleinewillinghöfer U (1991). A dictionary of the Tangale language (Kaltungo, northern Nigeria). Sprache und Oralität in Afrika, 12. Berlin: Dietrich Reimer.
Newman P (2000). The Hausa language: an encyclopedic reference grammar. New Haven and London: Yale University Press.
Newman P (2003). ‘Chadic languages.’ In Frawley W J (ed.) International Encyclopedia of Linguistics, 2nd ed., vol. 1. Oxford and New York: Oxford University Press. 304–307.
Newman P & Ma R (1966). ‘Comparative Chadic: phonology and lexicon.’ Journal of African Languages 5, 218–251.
Pawlak N (1994). Syntactic markers in Chadic: a study on development of grammatical morphems [sic]. Warsaw: Instytut Orientalistyczny, Uniwersytetu Warszawskiego.
Sachnine M (1982). Dictionnaire lamé-français. Lexique français-lamé. Paris: SELAF.
Schuh R G (1981). A dictionary of Ngizim. University of California Publications in Linguistics 99. Berkeley and Los Angeles: University of California Press.
Schuh R G (1998). A grammar of Miya. University of California Publications in Linguistics 130. Berkeley and Los Angeles: University of California Press.
Stolbova O V (1996). Studies in Chadic comparative phonology. Moscow: Diaphragma Publishers.
Wolff H E (1983). A Grammar of the Lamang language (Gwa’d Lamang). Afrikanistische Forschungen, 10. Glückstadt: J. J. Augustin.
Chamberlain, Basil Hall (1850–1935)
M McCaskey, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.
Basil Hall Chamberlain (1850–1935) was born in Southampton, England, but educated in France and Switzerland. His family wanted him to become a banker in London, but he went to Japan instead, on the grounds that travel would be better for his health. He arrived in Japan in 1873 to begin intensive study of the Japanese language. In 1876, he became a lecturer in English language and literature at Tokyo University. Having independent means, however, he decided to resume his own studies a year later. He resumed teaching English at the Japanese Naval Academy from 1874 to 1882, since this post paid well and also left him time for his own language studies. In 1886, he became professor of Japanese and philology at Tokyo Imperial University, specializing in historical linguistics. He was among the first Westerners to do serious advanced research on Japanese academic subjects, and one of his advanced students at Tokyo University, Ueda Kazutoshi (1867–1937), became a pioneer in developing modern Japanese lexicography and the study of linguistics in Japan. In addition to his work on standard Japanese, Chamberlain studied the Japanese language variants used in the Ryukyu Islands, as well as the Ainu language, then still spoken widely in northern
Hokkaido. He began studying Korean in 1881 and subsequently did comparative linguistic studies of Japanese, Ryukyu dialect, Ainu, and Korean, initiating research carried on and developed by Hattori Shirō and other Japanese linguists two generations later. In 1880, Chamberlain began his research on the Kojiki, the oldest Japanese history text written in the Japanese language. In 1883 he published a translation, the Ko-ji-ki or records of ancient matters (Chamberlain, 1982), with notes by William George Aston (1841–1911), the Tokyo British embassy secretary. Aston was a Japanese language specialist who published a companion translation of the Nihon shoki (1896), the oldest Japanese history text written in classical Chinese. Chamberlain continued his Kojiki research after leaving Japan, publishing his final version in 1932. The next full English Kojiki translation did not appear until 1969. Though Chamberlain’s professional focus was on classical literature and linguistic analysis, he also wrote a series of popular books introducing Westerners to Japanese language and culture. A handbook of colloquial Japanese (1888), designed for use by learners at a basic level, was widely used for more than half a century, until it was superseded by more complete references in English after World War II. Chamberlain’s Things Japanese (1890), one of the few books available in this area then, was a popular introduction to Japanese social life and customs and is still in print today.
He returned to Europe permanently in 1911, after almost 40 years in Japan, and spent the remaining two dozen years of his life in Switzerland. He continued to research and write about Japan and revise his earlier works for definitive publication. He spent time visiting his younger brother, the controversial author Houston Stewart Chamberlain (1855–1927), a prominent ultranationalist and activist in Germany who later became a German citizen. Unlike Houston, Basil spent his life immersed in studying the culture of another country without losing a scholarly perspective. The fact that Basil’s complete works were republished in 2000 (Chamberlain, 2000) testifies to the value of his contributions to the study of Japanese language and culture.
Bibliography
Chamberlain B H (1886). A simplified grammar of the Japanese language. London: Trübner.
Chamberlain B H (1895). Essay in aid of a grammar and dictionary of the Luchuan language. Yokohama: Kelly & Walsh.
Chamberlain B H (1907). A handbook of colloquial Japanese. London: C. Lockwood & Son.
Chamberlain B H (1971). Japanese things (Previous editions have title: Things Japanese). Rutland, VT: C. E. Tuttle Co.
Chamberlain B H (trans.) (1982). The Kojiki: records of ancient matters. Rutland, VT: C. E. Tuttle Co.
Chamberlain B H (2000). Collected works of Basil Hall Chamberlain: major works (8 vols). Tokyo: Edition Synapse.
Ōta Y (1998). Basil Hall Chamberlain: portrait of a Japanologist. Richmond, Surrey: Japan Library.
Champollion, Jean-François (1790–1832)
F S O’Rourke, Somerset County Library System, Princess Anne, MD, USA
S C O’Rourke, Yale University, New Haven, CT, USA
© 2006 Elsevier Ltd. All rights reserved.
Jean-François Champollion, born December 23, 1790, was an Orientalist most remembered for his decipherment of ancient Egyptian hieroglyphics such as those found on the storied Rosetta Stone. Others, including English physicist Thomas Young, had labored for many years to reveal what the hieroglyphics symbolized, but it was Champollion – with knowledge of these foregoing efforts, his expertise in many languages, and his research on the cartouches of the pharaohs – who was the first to successfully unravel the mystery, beginning late in 1821 and into 1822. Champollion’s breakthrough furthered Young’s own suspicions that some hieroglyphics had phonetic values. His discoveries enabled scholars to accurately translate hieroglyphic texts and were therefore crucial to the current understanding of ancient Egyptian civilization. The Rosetta Stone, a dark granite stela bearing the same text in three different scripts (Egyptian demotic, Egyptian hieroglyphic, and Greek – the first two representing a single language), was unearthed in 1799 at a construction site near the town of el-Rashid, or Rosetta, in the Nile River delta during Napoleon Bonaparte’s ill-fated campaign in Egypt. It fell to the English after the French were defeated in Egypt and has been on exhibit at the British Museum in London almost continuously since 1802.
Born in Figeac, France, during the unrest of the French Revolution, Champollion received virtually no formal schooling at an early age, although he taught himself to read and became well-versed in Greek and Latin. In 1801 he was sent to Grenoble to live with his older brother, Jacques-Joseph Champollion-Figeac, a great source of inspiration for Champollion and a noted archaeologist and paleographer; Champollion became known as le jeune. Champollion’s propensity for languages flourished under the influence of his brother and Joseph Fourier, a mathematician, physicist, and prominent member of Napoleon’s mission in Egypt who first met Champollion in 1802 while holding office in Grenoble. Champollion eventually studied Arabic, Avestan, Chinese, Coptic, Hebrew, Pahlavi, Persian, Sanskrit, and Syriac, among other languages. In 1807, at the age of 16, he addressed the Société des Sciences et des Arts de Grenoble (later the Académie delphinale) and soon afterwards became its youngest member. He continued to follow his conviction that the more modern Coptic language would play an important role in the decipherment of the ancient Egyptian inscriptions. In 1809, after studying in Paris for several years under such figures as Silvestre de Sacy, Champollion was appointed to an academic post in history in Grenoble. In the following year he was named docteur ès lettres. He later fell victim to the political hostilities at the onset of the Bourbon Restoration and lost his position at Grenoble in 1816, although he was given a chair in history and geography there in 1818.
In 1822 Champollion stunned the academic world with his celebrated Lettre à M. Dacier, in which he announced his success in identifying the values of a number of hieroglyphics from Egyptian texts. His achievements came to the attention of the French kings Louis XVIII and Charles X, and Champollion first reaped the rewards of royal favor when he was sent on a tour of Italian museums to study their collections of Egyptian antiquities. He was also honored in 1826 with an appointment as curator of the soon-to-open Egyptian and Oriental collections at the Louvre in Paris. Champollion continually refined his analysis of the hieroglyphic system over the years, and by the end of his short life he had concluded that it consisted of three types of elements: those with phonetic values (be they syllabic or purely alphabetic), ideograms (figurative or symbolic in nature), and determinatives (providing information as to semantic sense and pronunciation). Others before him had failed to realize that the system was so complex. Champollion’s work on the hieroglyphic system also clarified the relation between it and two descendant scripts of ancient Egypt, demotic and hieratic. Champollion mounted a joint expedition to Egypt in 1828 with his Italian colleague Ippolito Rosellini, during which Champollion conducted the first systematic survey of the monuments and hieroglyphic inscriptions found there. Although he has been criticized by some for removing such inscriptions, many of which would later appear in museums abroad, others feel that in doing so Champollion and others may have saved them from destruction. In 1830 he was elected to the Académie des Inscriptions et Belles-Lettres; the following year the Collège Royal (later Collège de France) in Paris established a chair in archaeology for him. He died after a series of strokes on March 4, 1832, at the age of 41.
Many of his manuscripts, including his Egyptian dictionary and grammar, were published posthumously by his brother and others. Champollion is widely considered the father of the modern discipline of Egyptology. For his sheer brilliance, professional accomplishments, and devotion to all aspects of the field, Champollion was often referred to as the Egyptian. See also: Afroasiatic Languages; Ancient Egyptian and Coptic; Decipherment; Egypt: Scripts, Ancient; Lepsius, Carl Richard (1810–1884); Scripts, Undeciphered; Young, Thomas (1773–1829).
Channel Islands: Language Situation
M C Jones, University of Cambridge, Cambridge, UK
© 2006 Elsevier Ltd. All rights reserved.
The Channel Islands (see Figure 1) constitute a small archipelago off the west coast of the Cotentin peninsula of Normandy. The eight islands, in descending order of size, are Jersey, Guernsey, Alderney, Sark, Herm, Jethou, Lihou, and Brecqhou. The precise date of the Roman occupation of the islands and hence of the introduction of Latin is unknown but it can be assumed that Latin and, later, some form of Romance speech have been spoken there for approximately two millennia. Despite the fact that the archipelago was separated from the Duchy of Normandy in 1204 and has, from that time on, formed part of the British Isles (although the islands are self-governing: Jersey forms its own bailiwick and the other islands form the bailiwick of Guernsey), until relatively recently the majority of the inhabitants spoke Norman French dialects closely related to those of the neighboring mainland.
Figure 1 The Channel Islands.
English was introduced to the islands in the Middle Ages, when garrisons were established there, but the local dialects remained as the everyday variety of the majority of the Islanders until well into the nineteenth century, when the growth of trade and tourism led to progressive Anglicization. The evacuation of many of the inhabitants to the British mainland in the days preceding the German occupation of the islands during the Second World War had severe linguistic repercussions. Alderney, where the local dialect (Auregnais) was already moribund, was totally evacuated and the dialect is now extinct. Since the standard language of all the Channel Islands has been French and never the indigenous dialects, their presence in ‘official’ domains has been virtually nonexistent. Standard French (and now English) has been the language of religion and legislation and English has always been dominant in the education system. Today, Standard French is reserved largely for ceremonial functions. Official statistics for speakers of the local dialects have been gathered twice for Jersey (in the 1989 and 2001 censuses) and once for Guernsey (in the 2001 census). A comparison of the Jersey results from the two censuses provides us with a clear indication of the decline of the dialect in recent years. In 1989, there were 5720 speakers of Jersey French (Jèrriais), representing 6.9% of the total resident population. By 2001, this had gone down to 2874 speakers, or 3.2% of the total resident population, with some two-thirds of these speakers aged over 60. The 2001 census also recorded that only 113 speakers declared Jèrriais to be their usual everyday language. On Guernsey, the number of people able to speak Guernsey French (Guernesiais) fluently in 2001 was 1327, or 2.2% of the total resident population, with nearly 70% of these speakers aged over 64.
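The scale of the Jersey decline can be checked with a little arithmetic on the census figures quoted above. The sketch below is illustrative only: the function names are hypothetical, and because the published percentages are rounded, the resident populations it backs out are approximations rather than census totals.

```python
# Speaker counts and population shares for Jersey, as quoted in the text.
# Maps census year -> (number of speakers, share of resident population in %).
JERSEY = {1989: (5720, 6.9), 2001: (2874, 3.2)}


def implied_population(speakers, share_percent):
    """Approximate resident population implied by a count and its rounded share."""
    return speakers / (share_percent / 100)


def relative_decline(earlier, later):
    """Fractional drop from the earlier count to the later count."""
    return (earlier - later) / earlier


if __name__ == "__main__":
    for year, (speakers, share) in sorted(JERSEY.items()):
        print(year, "implied population:", round(implied_population(speakers, share)))
    drop = relative_decline(JERSEY[1989][0], JERSEY[2001][0])
    print(f"decline in speakers, 1989 to 2001: {drop:.1%}")
```

On these figures the number of speakers roughly halved between the two censuses while the implied resident population grew, which is why the share fell by more than half.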
No such information has been gathered for Sark, although it is estimated that fewer than 20 out of the 600 permanent inhabitants can speak Sark French (Sercquiais). Despite their linguistic affinity with mainland Norman, the dialects of the Channel Islands show salient differences from the mainland varieties – and from
each other. The dialects spoken in Jersey and Guernsey also display considerable regional variation. An embryonic language support campaign currently exists on Jersey (where the dialect has recently been standardized), including limited teaching of Jèrriais in school (from September 1999). Since January 2004, Guernesiais has been taught to children of infant school age in three of Guernsey’s primary schools. There is nothing comparable in Sark. The dialects have left a substrate imprint on the distinctive variety of English spoken in the Channel Islands. However, with increasing immigration from the UK, this variety is not as widespread as it once was. A limited amount of literary output has been produced in the local dialects of Jersey and Guernsey. See also: Jèrriais.
Chao Yuen Ren (1892–1982)
R J LaPolla, La Trobe University, Bundoora, Australia
© 2006 Elsevier Ltd. All rights reserved.
Y. R. Chao is easily the most famous linguist to have come out of China. Born before the end of the last dynasty in China, he received a traditional Confucian education, but was also one of the first Chinese people to be sent to the West for training in modern Western science (under the Boxer Indemnity Fund). The remarkable breadth and scope of his studies included physics, mathematics, linguistics, musical and literary composition, and translation, and he was a pioneer in many of these fields. Chao received a B.A. in mathematics from Cornell University in 1914, and a Ph.D. in philosophy from Harvard in 1918, but began his career teaching physics at Cornell and Harvard. He returned to China in 1920 to take a position at Tsing Hua University; there he acted as interpreter for Bertrand Russell, Dora Black, and John Dewey. He was involved in the vernacular literature movement, in 1922 translating Alice in Wonderland into vernacular Chinese, and also in the national language and standardization movement. He developed the system of National Romanization that was adopted by the government and was one of the key members of the group that created the national language, with his voice being the one on the recordings of the new national and standard language. In the late 1920s, when the Academia Sinica was founded, he became the head of the linguistics section of the Institute of History and Philology. There he organized teams to go to the field to systematically record the different Chinese dialects, and published his Studies of the modern Wu dialects, the earliest descriptive work of its kind in China. It was also in those years that he wrote his famous article ‘On the non-uniqueness of phonemic solutions of phonetic systems.’ During the war years he was back in the U.S., in Hawaii, then Yale and Harvard. He was recruited to lead the U.S. Army Chinese language program at Harvard, where he wrote his Cantonese primer (1947) and Concise dictionary of spoken Chinese (1946). 
These publications also led to his Mandarin primer (1961), Readings in sayable Chinese (1968), and Grammar of spoken Chinese (1968), still the best grammar of Mandarin Chinese available. It is an amazing piece of scholarship, not only for its thoroughness but also for its insightful inductive analysis, which takes Chinese on its own terms (in fine structuralist tradition) rather than forcing it into preconceived categories. In 1947 he was on his way back to China, but stopped off at the University of California at Berkeley, where he was offered a position; he remained there until he retired in 1960, becoming Agassiz Professor of Oriental Languages and Literatures in 1952. His contributions were also recognized by his election as president of the Linguistic Society of America in 1945 and as president of the American Oriental Society in 1960. Among his well-known students are Wang Li (see Wang Li (1900–1986)), Kun Chang (who replaced Chao at Berkeley upon his retirement), and Jerry Norman.
See also: Chinese; Phoneme; Wang Li (1900–1986).
Bibliography
Chao B Y (1999). Za ji Zhao jia (Notes on the Chao family). Beijing: Zhongguo Wenlian Chubanshe.
Chao Y R (1928). Xiandai Wuyu yanjiu (Studies of the modern Wu dialects). Peking: Tsing Hua College Research Institute, Monograph No. 4.
Chao Y R (1934). ‘On the non-uniqueness of phonemic solutions of phonetic systems.’ Bulletin of the Institute of History and Philology, Academia Sinica 4, 363–397.
Chao Y R (1961). Mandarin primer. Cambridge: Cambridge University Press. [Originally published 1956.]
Chao Y R (1968). A grammar of spoken Chinese. Berkeley & Los Angeles: University of California Press.
Chao Y R (1968). Language and symbolic systems. Cambridge: Cambridge University Press.
Chao Y R (1976). Aspects of Chinese sociolinguistics: essays by Yuen Ren Chao. Dil A S (ed.) Stanford: Stanford University Press.
Chao Y R (1977). Yuen Ren Chao: Chinese linguist, phonologist, composer, and author. [Oral history based on interviews by Rosemary Levenson.] Berkeley: Regional Oral History Office, Bancroft Library, UC Berkeley.
Su J (1999). Zhao Yuanren xueshu sixiang pingzhuan (Critical summary of Yuen Ren Chao’s ideas). Beijing: Beijing Tushuguan Chubanshe.
Zhang S (1999). Yao yao chang lu: Zhao Yuanren (A long road: Yuen Ren Chao). Hong Kong: Zhonghua Shuju Youxian Gongsi. [Biography of Y. R. Chao.]
Zhao X & Huang P (eds.) (1998). Zhao Yuanren nian pu (Chronology of the life of Yuen Ren Chao). Beijing: Shangwu Yinshuguan.
Character Sets
J H Jenkins, Apple Computer, Inc., Cupertino, CA, USA
© 2006 Elsevier Ltd. All rights reserved.
Everything is a number to a computer, and so, in order to represent text, it is necessary to create a numerical representation for it. The most common technique for doing so is to assign numbers to individual characters in a writing system; the term ‘character set’ is used to refer to such a mapping. More specifically, everything is a binary number of a fixed size to a computer. A single binary digit is referred to as a bit. Because the binary representation of a number is difficult for humans to use, computer scientists usually use base 8 (octal) or base 16 (hexadecimal) numbers. In octal numerals, only the digits 0 through 7 are required; hexadecimal uses 0 through 9 and A through F as digits. When it is unclear which base is being used, octal numerals are written with a leading 0, and hexadecimal numerals with a leading 0x. The smallest-size number a computer can readily manipulate is called a byte. Earlier systems used bytes that varied in size from computer to computer; today, an eight-bit byte or octet is standard. Bytes are the basis of data interchange between different computers. We can therefore say that a character set is a mapping from a set of text elements, such as characters, to a series of bytes. Conceptually, this process can be broken down into stages (see Figure 1):
1. the abstract character repertoire, which is the collection of characters to be encoded;
2. the coded character set, which maps the elements of an abstract character repertoire to a set of nonnegative integers;
3. the character encoding form, which maps the elements of a coded character set to code units of a specific size; and
4. the character encoding scheme, which maps code units from one or more character encoding forms to an actual sequence of bytes.
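The four stages can be traced for a handful of characters in Python. This is only a sketch under stated assumptions: the three-character repertoire is made up, Python's `ord()` supplies Unicode numbers for the coded character set, and a 16-bit big-endian layout is chosen arbitrarily for the last two stages. All variable names are hypothetical.

```python
import struct

# Stage 1: abstract character repertoire (a tiny, made-up one for illustration).
repertoire = ["A", "é", "€"]

# Stage 2: coded character set: each character maps to a nonnegative integer.
# Using the Unicode numbers returned by ord() is an assumption of this sketch.
code_points = {ch: ord(ch) for ch in repertoire}

# Stage 3: character encoding form: integers become code units of a fixed size.
# All three numbers fit in 16 bits, so each code unit equals its number.
code_units = [code_points[ch] for ch in repertoire]

# Stage 4: character encoding scheme: code units become an actual byte sequence.
# ">H" packs each unit as a big-endian unsigned 16-bit integer.
encoded = b"".join(struct.pack(">H", unit) for unit in code_units)

# One number in binary, octal, and hexadecimal. Python marks octal with "0o"
# rather than the bare leading 0 described in the text.
notations = (bin(65), oct(65), hex(65))

print(code_points)
print(encoded.hex())
print(notations)
```

Keeping the stages separate makes it visible that the same numbers from stage 2 could be fed into a different encoding form or byte order without changing the repertoire.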
History
The first efforts to create numerical encodings for text date from the 19th century with the advent of telegraphy. As the technology advanced, increasingly sophisticated techniques were developed to coordinate the transmission of textual information via numerical or other codes with the technology used for transmission. This dovetailed with the pioneering work in computers to create the first computer character sets. The most direct descendant of the telegraphy codes is EBCDIC (Extended Binary Coded Decimal Interchange Code), used in IBM mainframe computers from the early 1960s onwards. Its chief rival was the American Standard Code for Information Interchange (ASCII), developed at about the same time by what is now the American National Standards Institute (ANSI). ASCII has proven far more popular and influential; use of EBCDIC has been restricted largely to IBM mainframes and closely related systems. ASCII represents characters using units of seven bits, which provides for a set of 2^7 = 128 characters. The first 32 and the last one are used for control codes, that is, numbers that do not represent text but are used to give instructions to the machines displaying or printing it. (Character 0x07, for example, was frequently used to instruct a terminal to ring its bell.) The remaining 95 are the characters for display. In its current form, ASCII includes the upper- and lowercase letters of the Latin alphabet as used for English, the numerals 0 through 9, basic punctuation, and some accents. The accents were used for non-English languages. A French ‘é’ would be represented by instructing the terminal to type ‘e’, physically move backwards one place, and then type the accent. ASCII is sufficient for use with standard modern English as typed on typewriters, but not for anything else. Nor is it actually sufficient even for English typesetting, as it is missing important symbols such as em and en dashes. On systems based on the eight-bit byte, the use of ASCII seems wasteful, since the eighth bit is unused. Originally, the extra bit was intended to serve as a parity bit for redundancy checking. By the 1980s, however, eight-bit extensions to ASCII had become common. The first 128 characters, 0x00 through 0x7F, were as in ASCII, and the remaining 128 were used for extensions such as dingbats or additional letters.
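The arithmetic behind the seven-bit layout, and the use of the eighth bit for parity, can be checked with a few lines of Python. This is a minimal sketch: the function name `with_even_parity` is hypothetical, and even parity is an assumption, since the text does not say which parity convention was intended.

```python
# Seven bits give 2**7 possible values.
TOTAL = 2 ** 7

# Control codes: the first 32 values (0x00-0x1F) plus the last one (0x7F).
control = [n for n in range(TOTAL) if n < 0x20 or n == 0x7F]
# Everything else is a character for display.
printable = [n for n in range(TOTAL) if n not in control]


def with_even_parity(code: int) -> int:
    """Return an 8-bit value whose top bit makes the count of 1 bits even.

    Even parity is an assumption of this sketch; odd parity is also possible.
    """
    ones = bin(code).count("1")
    return code | 0x80 if ones % 2 else code


print(TOTAL, len(control), len(printable))
print(hex(with_even_parity(ord("A"))), hex(with_even_parity(ord("C"))))
```

A receiver that counts an odd number of 1 bits in such a byte knows that at least one bit was corrupted in transmission, which is the redundancy check the spare bit was meant to provide.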
Most notable of the eight-bit character sets are MacRoman, developed by Apple Computer for use on its Macintosh computers, and the ISO 8859 series of standards developed by the International Organization for Standardization. Eight-bit character sets are inadequate for languages requiring more than 256 characters. Extensions were developed, therefore, for use in East Asia. The most common are referred to as double-byte character sets. In these character sets, the bytes 0x00 through 0x7F are used as ASCII. Bytes 0x80 through 0xFF are available to signal the start of
Figure 1 Stages in the character-encoding process.
two-byte units. The Latin letter ‘A’ would therefore be represented by the single byte 0x41, and the Chinese character by the two-byte sequence 0xA4AB. Such techniques provide enough room for the daily needs of modern East Asian languages. By 1990, there were dozens of character sets in common use, with varying repertoires and architectures. This created problems for software companies, whose products were originally written with ASCII or one of its eight-bit extensions in mind. Getting software to work with multiple character sets was a long and expensive process. This led to the desire for a universal character set, adequate for the representation of all languages. Two efforts to produce a universal character set converged in the early 1990s. One is known as ISO/IEC 10646, produced by the international standardization community. The other is Unicode, produced by the Unicode Consortium, an industry consortium consisting largely (but not entirely) of companies headquartered in the United States. The convergence of the two character sets means that the terms ‘Unicode’ and ‘ISO/IEC 10646’ can be used almost interchangeably in most circumstances. We will generally use the term ‘Unicode’, as Unicode is formally an implementation of ISO/IEC 10646 with additional specifications. The existence of a universal character set has proven vital in the rise of the Internet. So long as documents are created and printed on a single computer, it doesn’t make much difference which character set is used. When a user in Nome, Alaska is using his Windows system to access a Japanese Web page hosted by a computer in Calcutta running Linux, however, it becomes important to make sure that both systems can handle the same character set.
And when that computer in Calcutta is used to host the records of a multinational corporation requiring support for English, French, Chinese, Japanese, Hindi, and Thai, it becomes even more important to avoid the accounting headache of dealing with multiple character sets. By the turn of the 21st century, Unicode and ISO/IEC 10646 had become the focus of all efforts to extend the computer representation of human languages. They are also increasingly supported by modern operating systems and software.
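The double-byte layout described in this section, and the trouble caused by competing eight-bit sets, can both be sketched in Python. The function `split_double_byte` is hypothetical and follows only the simplified rule given in the text; real double-byte character sets restrict the ranges of lead and trail bytes further. The codec names `latin-1` (ISO 8859-1) and `mac_roman` are those of Python's standard library.

```python
def split_double_byte(data: bytes) -> list:
    """Split a byte string into one- and two-byte units.

    Simplified rule assumed from the text: a byte in 0x00-0x7F stands alone
    as ASCII, and a byte in 0x80-0xFF starts a two-byte unit.
    """
    units = []
    i = 0
    while i < len(data):
        width = 1 if data[i] <= 0x7F else 2
        units.append(data[i:i + width])
        i += width
    return units


# 'A', one two-byte unit, then 'B'.
units = split_double_byte(b"A\xa4\xabB")
print(units)

# The same letter receives different bytes in two eight-bit character sets,
# so software has to know which set a piece of text was written in.
latin1_bytes = "é".encode("latin-1")
macroman_bytes = "é".encode("mac_roman")
print(latin1_bytes, macroman_bytes)
```

Reading the MacRoman byte as if it were ISO 8859-1, or the reverse, yields the wrong character, which is a small-scale version of the interoperability problem that motivated a universal character set.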
Characteristics of Unicode
Unicode was originally based on a simple 16-bit architecture: there were 2^16 = 65,536 code points available, from 0x0000 through 0xFFFF. The first 128 code points are identical with ASCII; indeed, the first 256 code points are identical with the popular eight-bit standard, ISO 8859-1. The remaining space included characters of well over a dozen other scripts ranging from Greek to Thai. Numerals and punctuation marks were included, together with mathematical symbols and popular dingbats. It soon became clear that this was insufficient room for the full set of characters people actually wanted to use. Sixteen additional planes of 65,536 code points each were added, and multiple encoding forms were created, as described below, to accommodate them. The number assigned to a character is referred to as its Unicode scalar value (USV), and is usually written as ‘U+’ followed by four or more hexadecimal digits. Each character is also assigned a name. The Latin letter ‘A’ has U+0041 for its USV and is named LATIN CAPITAL LETTER A. In addition to standalone characters, Unicode includes a number of combining marks. These are included to allow support for on-the-fly generation of new accented forms. This makes it unnecessary to actually catalogue every accented Latin letter, for example, before it can be represented on computers. Several thousand code points are reserved for private use; that is, the standard does not define what they are to represent but leaves that to private agreements between different users. This allows users to interchange data using unencoded characters or characters inappropriate for encoding, such as corporate logos. One of the most controversial aspects of Unicode is its inclusion of a unified set of ideographs for all of East Asia: Chinese (both its simplified and traditional forms), Japanese, and Korean. (Unicode and ISO/IEC 10646 use the linguistically incorrect term ‘ideograph’ for historical reasons. The authors of both standards are aware that other terms better describe the function of these characters in the languages that use them.) The process of producing a single set of characters for East Asian languages is referred to as
Figure 3 Characters and glyphs.
Figure 2 Han unification.
Han unification (see Figure 2). Because individual characters are often written in visually distinct ways in different parts of East Asia, there was some concern that this would force, for example, a Japanese user to see their name written with culturally inappropriate Chinese glyphs; and because the Unicode Consortium itself and the bulk of its officers are American, it was also felt that Americans were being insensitive to the needs of people in East Asia. Both fears proved to be overstated. In point of fact, most Japanese users use systems with fonts designed specifically for Japanese and only rarely are confronted by Japanese text written using Chinese glyphs. Even where they are, the glyphs they see would generally be perfectly acceptable, or, at most, about as odd-looking as the spelling ‘colour’ is to an American or ‘color’ to a Briton. The actual work of Han unification, moreover, has been done by an international group. This group, now called the Ideographic Rapporteur Group or IRG, currently has delegations representing the People’s Republic of China (PRC), the Special Administrative Regions of Hong Kong and Macao, Taiwan, North and South Korea, Japan, Vietnam, Singapore, and the United States, with a liaison representing the Unicode Consortium. The current head of the IRG is Dr Lu Qin of Hong Kong Polytechnic University, and its remaining officers are all from the PRC. The issue of Han unification serves to illustrate a fundamental distinction made by Unicode and ISO/IEC 10646, that between character and glyph. A character is a unit of meaning in a writing system, while a glyph is what one actually sees on the page or on the screen. Thus ‘a’, ‘b’, and ‘c’ are different characters, while an ‘a’ set in one typeface and an ‘a’ set in another are different glyphs for the same character (see Figure 3). The distinction is an important one.
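The scalar values, character names, combining marks, planes, and private-use code points described at the start of this section can be inspected with Python's standard `unicodedata` module. This is an illustrative sketch: the helper names `usv` and `plane` are hypothetical, and normalization form NFC, which the text does not discuss, is used only to show that a base letter plus a combining mark corresponds to a precomposed letter.

```python
import unicodedata


def usv(ch: str) -> str:
    """Format a scalar value as 'U+' followed by at least four hex digits."""
    return f"U+{ord(ch):04X}"


def plane(ch: str) -> int:
    """Return the plane number: the scalar value divided by 65,536."""
    return ord(ch) >> 16


# Scalar value and name of the example used in the text.
name_of_a = unicodedata.name("A")
print(usv("A"), name_of_a)

# A combining mark: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed), usv(composed))

# The original plane plus sixteen additional ones.
total_code_points = 17 * 65536
print(total_code_points, plane("A"), plane("\U00020000"))

# U+E000 lies in a private-use range; its general category is 'Co'.
private_category = unicodedata.category("\ue000")
print(private_category)
```

None of these queries say anything about how a character is drawn; that is left to fonts, in line with the character and glyph distinction.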
Characters are the fundamental units where meaning is concerned, and are what are used in processes such as searching, sorting, pattern matching, or text to speech. They are insufficient for more visually oriented processes such as printing or optical character recognition. Scripts such as Arabic, for example, make extensive use of contextual forms, where a letter changes its shape depending on its position in a word. If each of the different forms for each letter were separately encoded, not only would that make text input and
Figure 4 Unicode encoding forms.
spelling checking more complex, it would also lock the visual representation of Arabic into one calligraphic style. By separating the encoding of the characters from the drawing of the glyphs, it’s possible for the same text to have substantially different appearances, depending on additional information such as font and point size. Unicode is thus aimed at plain text rendering; its goal is minimal legibility, that is, the minimum amount of information needed to guarantee legibility. Pure Unicode should be legible to users regardless of the details of the rendering process. Unicode defines three character encoding forms, named after the size of the code unit they use: UTF-8, UTF-16, and UTF-32 (see Figure 4). (‘UTF’ can be taken to stand either for ‘universal transformation format’ or ‘Unicode transformation format.’) In UTF-8, each character is represented by one to four bytes. The 128 characters from ASCII are, in fact, represented by the same bytes as they are for ASCII itself. This is an enormous advantage for software originally written for ASCII or other character sets sharing its basic architecture. UTF-16 corresponds to the original definition of Unicode. Characters U+0000 through U+D7FF and U+E000 through U+FFFF are represented by one 16-bit unit. Characters above U+FFFF are represented by two 16-bit units, both taken from the range 0xD800 through 0xDFFF. (The scalar values U+D800 through U+DFFF are therefore not used for character encoding.) In UTF-32, each character is represented by its Unicode scalar value, padded to 32 bits. Finally, Unicode defines seven character-encoding schemes based on these three encoding forms (UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and UTF-32LE). These are needed because computers vary in the order in which they combine bytes into 16-bit and 32-bit units.
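The three encoding forms, and the byte-order question that the encoding schemes settle, can be compared directly in Python. In this sketch the function `utf16_units` is a hypothetical helper written out by hand to show how a character above U+FFFF is split into two 16-bit units; the sample character U+1D11E (MUSICAL SYMBOL G CLEF) is chosen only because it lies outside the original 16-bit range.

```python
def utf16_units(code_point: int) -> list:
    """Return the 16-bit code unit(s) for one Unicode scalar value."""
    if code_point <= 0xFFFF:
        return [code_point]
    offset = code_point - 0x10000        # leaves a 20-bit number
    high = 0xD800 + (offset >> 10)       # top 10 bits select the first unit
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits select the second
    return [high, low]


clef = "\U0001D11E"  # sample character above U+FFFF (an arbitrary choice)

# UTF-16: two units, both inside 0xD800-0xDFFF.
units = utf16_units(ord(clef))
print([hex(u) for u in units])

# UTF-8: one to four bytes, depending on the character.
utf8_lengths = [len(ch.encode("utf-8")) for ch in ("A", "é", "€", clef)]
print(utf8_lengths)

# UTF-32: the scalar value itself, padded to 32 bits.
utf32 = clef.encode("utf-32-be")
print(utf32.hex())

# Encoding schemes: the same 16-bit unit in the two possible byte orders.
big, little = "A".encode("utf-16-be"), "A".encode("utf-16-le")
print(big, little)
```

The hand-written split agrees with Python's own UTF-16 encoder, which serves as a check that the sketch follows the standard scheme.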
The Future
Unicode and ISO/IEC 10646 can be considered to have rightfully won their title of the universal
character set. Although other character sets continue in use, all work on computer representation of text is now done either in Unicode or in a fashion compatible with it. All major operating systems are being written around Unicode, and most software in development supports it. Major issues still remain. One is the distinction between character and glyph. Although in most cases, the line between the two is clear, in some it is not. Many of the ideographs being considered for encoding by the IRG, for example, are so obscure that even the experts are not sure which ones are distinct characters. In scripts that are yet to be fully deciphered, such as the Indus Valley script, there is the same problem. Another issue is diachronicity. With a few exceptions, the scripts encoded in Unicode are modern scripts in current use, which means that the repertoires are relatively well defined, as are the ranges of acceptable glyphs for the characters. When an effort is made to represent a script in use over the course of centuries or millennia, things are not always clear cut. Oracle-bone Chinese is not fully understood, for example, and it’s unclear how to coordinate it with modern Chinese. And there is also the issue of scripts that do not follow the linear approach to layout common to all modern writing systems. Unicode has already
expressly disavowed any intention of fully encoding music or mathematics because they are so two-dimensional in layout. Egyptian hieroglyphics, however, are a written representation of human language and should be encoded, although the details are far from clear. But in the meantime, computer technology has advanced considerably in its ability to represent human languages, over the course of just the past decade. Prior to 1990, it would have been unthinkable to exchange data in virtually any human language virtually anywhere in the world. That, however, is now becoming a reality.
See also: Asia, Ancient Southwest: Scripts, Modern Semitic; China: Writing System; Digital Fonts and Typography; Japan: Writing System.
Bibliography
Gillam R (2003). Unicode demystified: a practical programmer’s guide to the encoding standard. Boston: Addison-Wesley.
Graham T (2000). Unicode: a primer. Foster City, CA: M & T Books.
Lunde K (1999). CJKV information processing. Beijing: O’Reilly & Associates.
Character versus Content
C Spencer, Howard University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.
David Kaplan introduced the content/character distinction in his monograph Demonstratives (1989a) to distinguish between two aspects of the meaning of (1) indexical and demonstrative pronouns (e.g., ‘I’, ‘here,’ ‘now,’ ‘this,’ and ‘that’) and (2) sentences containing them. Roughly, the content of an occurrence of an indexical or demonstrative is the individual to which it refers, and its character is the rule that determines its referent as a function of context. Thus, an indexical has different contents in different contexts, but its character is the same in all contexts. For instance, the character of ‘I’ is the rule, or function, that maps a context of utterance to the speaker of that context. This function determines that the content of Sally’s utterance of ‘I’ is Sally.
Content/Character Distinction and Semantics
Sentences containing indexicals or demonstratives are context-dependent in two ways. First, contexts help to determine what these sentences say. Second, contexts determine whether what is said is true or false. For instance, suppose Sally says, ‘I’m cold now’ at time t. The context supplies Sally as the referent for ‘I’ and time t as the referent for ‘now,’ so it helps to determine what Sally said. Other facts about the context, specifically whether Sally is cold at time t, determine whether she said something true or false. Different contexts can play these different roles, as they do when we ask whether what Sally said in one context would be true in a slightly different context. A central virtue of Kaplan’s semantics is that it distinguishes between these two roles of context. For Kaplan, a context of use plays the first role, of
Figure 1 Two-dimensional matrix.
supplying contents for indexical expressions, and a circumstance of evaluation plays the second. A context of use is just a context in which an indexical expression may be used, and which supplies a content for the indexical expression. A circumstance of evaluation is an actual or merely possible situation in which the content of an utterance is evaluated for truth or falsehood. A semantic framework like Kaplan’s, which captures the double-dependence of meaning on context, is sometimes called a two-dimensional semantics. In the two-dimensional framework, a meaningful entity such as a linguistic expression or an utterance determines not a single semantic value but a two-dimensional matrix of semantic values. Figure 1 represents Kaplan’s semantics in this way. In Figure 1, the vertical axis of the matrix displays contexts of use (u1-u3) and the horizontal axis displays circumstances of evaluation (c1-c3). Each cell in the matrix gives the extension of the linguistic expression e as used in the specified context of use and evaluated in the specified circumstance of evaluation. In this matrix, the cell in row n and column m gives the semantic value of e in the context of use specified at the beginning of row n and evaluated in the circumstance of evaluation specified at the top of column m. If e is a sentence, cells will be filled in with truth values as illustrated. Kaplan offers a syntax and semantics for a formal language containing indexicals, demonstratives, and a variety of modal operators. In this formal system, a context of use is an ordered n-tuple of contextual features to which indexicals or demonstratives are sensitive, such as the speaker, time, world, and location of the context. A circumstance of evaluation is an ordered n-tuple of a possible world-state or world-history, a time, and perhaps other elements as would be required given the sentential operators in the language.
For Kaplan, all contexts of use are proper, which means that the speaker of the context must be located at the time, place, and world of the context. Circumstances of evaluation, however, need not be proper. Contexts of use and circumstances of evaluation play a role in the specification of the character and content of an expression. The character of any linguistic expression e is a function from contexts of use
to contents appropriate for e, i.e., an individual if e is a singular term, a proposition if e is a sentence, and sets of n-tuples of individuals if e is an n-place predicate. Indexical expressions only have contents relative to a context of use. So Kaplan speaks of the content of an occurrence of an expression rather than the content of the expression itself. Contents are evaluated in circumstances of evaluation, and these evaluations yield extensions appropriate to the kind of content under evaluation. So we also can characterize the content of an occurrence of e as a function from circumstances of evaluation to extensions of a type appropriate to e. For instance, the extensions for sentences are truth values, for indexicals, individuals, and for n-place predicates, n-tuples of individuals. For individuals and n-place predicates, these will be constant functions (i.e. the function delivers the same extension in every circumstance of evaluation). It is often simpler to think of contents as individuals (for singular terms), propositions (for sentences) and sets of n-tuples of individuals (for n-place predicates), and Kaplan typically talks about contents in this way. Both ways of thinking of contents are semantically equivalent. For Kaplan, indexicals and demonstratives are both directly referential and rigidly designating. They are directly referential because they contribute only their referents to the propositions expressed by sentences containing them. They are rigidly designating because, once they secure a content in a context of use, they retain that content in every circumstance of evaluation. Indexicals and demonstratives contrast with the typical definite description in both respects. Definite descriptions typically contribute a descriptive condition to a proposition rather than an individual, and this descriptive condition is typically satisfied by different individuals in different worlds of evaluation. 
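The picture of character as a function from contexts of use to contents, and of content as a function from circumstances of evaluation to extensions, can be mirrored in a toy Python model. Everything here is a hypothetical illustration rather than Kaplan's formal system: the classes `Context` and `Circumstance`, the made-up facts in `COLD`, and the three sample contexts and circumstances are all assumptions, and location is left out for brevity.

```python
from typing import Callable, NamedTuple


class Context(NamedTuple):
    speaker: str
    time: int
    world: str


class Circumstance(NamedTuple):
    world: str
    time: int


# Hypothetical facts: who is cold, at which time, in which world.
COLD = {("Sally", 1, "w1"), ("Tom", 2, "w1"), ("Sally", 1, "w2")}


def character_I(ctx: Context) -> str:
    """Character of 'I': maps a context of use to its speaker (the content)."""
    return ctx.speaker


def character_im_cold_now(ctx: Context) -> Callable[[Circumstance], bool]:
    """Character of 'I'm cold now': maps a context of use to a content.

    The content is a function from circumstances to truth values. 'I' and
    'now' are fixed by the context, so only the world of the circumstance
    matters; the time of the circumstance is ignored.
    """
    person, time = ctx.speaker, ctx.time

    def content(circ: Circumstance) -> bool:
        return (person, time, circ.world) in COLD

    return content


contexts = [Context("Sally", 1, "w1"), Context("Tom", 1, "w1"), Context("Tom", 2, "w1")]
circumstances = [Circumstance("w1", 1), Circumstance("w2", 1), Circumstance("w3", 1)]

# Two-dimensional matrix: one row per context of use, one column per circumstance.
matrix = [[character_im_cold_now(u)(c) for c in circumstances] for u in contexts]
for row in matrix:
    print(row)
```

Each row is the content fixed by one context, read across circumstances. The first row stays about Sally at time 1 in every column, which is the rigidity of ‘I’ and ‘now’ described above.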
Although Kaplan’s view that demonstratives are directly referential is widely accepted, some recent discussions of complex demonstratives (i.e. expressions of the form ‘that F’) have defended a quantificational approach, and some considerations in favor of such an approach may apply to the pure demonstratives ‘this’ and ‘that’ (King, 2001). Kaplan’s semantics has technical virtues lacking in earlier treatments of natural language indexicality. It shares with other double-indexing accounts (Kamp, 1971) a technical superiority to single-index theories, which evaluate sentences relative to a single index, which is an ordered n-tuple of features of a context, such as a speaker, time, location, and world. Such theories cannot account for the interaction of indexicals and certain sentence operators. To evaluate the sentence (1), for instance, we need to consider the truth value of the constituent sentence,
‘the man who is now President of the United States no longer hold[s] that office’ in situations occurring after the sentence is uttered. (1) Someday, the man who is now President of the United States will no longer hold that office.
But the indexical ‘now’ in that constituent sentence must still refer to the time (1) is used, and not the time at which the constituent sentence is evaluated. As Hans Kamp has argued, only a double-indexing theory will correctly predict the truth conditions for (1) (Kamp, 1971).
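The point about sentence (1) can be made concrete with a toy model of times and office-holders. This is only a sketch: the names in `PRESIDENT` are invented, time is reduced to three integer points, and the four functions are hypothetical stand-ins for a double-index and a single-index evaluation of the same sentence.

```python
# Hypothetical office-holders at three successive times.
PRESIDENT = {0: "Adams", 1: "Adams", 2: "Baker"}
TIMES = sorted(PRESIDENT)


def constituent_double(context_time: int, eval_time: int) -> bool:
    """'The man who is now President no longer holds that office.'

    'now' looks at the context time; holding office is checked at the
    evaluation time.
    """
    return PRESIDENT[context_time] != PRESIDENT[eval_time]


def constituent_single(index_time: int) -> bool:
    """The same sentence with one index: 'now' can only see the shifted time."""
    return PRESIDENT[index_time] != PRESIDENT[index_time]


def someday_double(context_time: int) -> bool:
    """'Someday' shifts only the evaluation time to some later time."""
    return any(constituent_double(context_time, t) for t in TIMES if t > context_time)


def someday_single(index_time: int) -> bool:
    """'Someday' shifts the one and only index to some later time."""
    return any(constituent_single(t) for t in TIMES if t > index_time)


print(someday_double(0))   # sentence (1) uttered at time 0, two indices
print(someday_single(0))   # the same utterance with a single index
```

With a single index the constituent sentence compares a president with himself and can never come out true, whereas the double-index version keeps ‘now’ tied to the time of utterance, as the argument attributed to Kamp requires.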
Content/Character Distinction and Philosophy
The content/character distinction sheds light on some specifically philosophical issues involving context-sensitivity in thought and language. These applications involve philosophically significant assumptions, and are more controversial than the applications to the semantics of indexicals and demonstratives. First, content and character play two roles that Gottlob Frege initially envisioned for the meaning, or sense, of a sentence, one semantic and the other more broadly psychological (Frege, 1892). Frege thought that the sense of a sentence should both determine its truth condition and provide the cognitive significance of beliefs expressible with that sentence. Although Frege expected that one entity, the sense, could play both roles, indexical and demonstrative belief undermines this expectation, since it appears to require two different entities to play the two roles. Different people who have a belief they could express by saying ‘I’m cold’ will be in the same psychological/functional state. They will all be shivering and trying to get warmer. But because each person who thinks, ‘I’m cold’ is a constituent of the content of that thought, all of these thoughts will differ in content. The psychological role of an indexical belief appears to be more closely tied to the character of the sentence the thinker would use to express that belief than to the content of the belief. But the content, rather than the character, is more directly relevant to the truth condition of an occurrence of a sentence containing an indexical. Second, Kaplan has suggested that the content/character distinction helps to explain the relation between the epistemological notions of logical truth and the a priori, on the one hand, and the metaphysical notions of necessity and contingency on the other. Other philosophers have put broadly similar applications of the two-dimensional framework into service to the same end (Stalnaker, 1978, cf.
Stalnaker, 2004; Chalmers, 1996; Jackson, 1998).
As is evident to anyone who understands sentence (2), it cannot be uttered falsely. Therefore, sentence (2) is in a certain sense a logical or a priori truth. Yet it does not express a necessary truth, since occurrences of (3) will typically be false.
(2) I am here now.
(3) Necessarily, I am here now.
Kaplan has suggested that we explain the special status of (2) as follows: metaphysically speaking, (2) is contingently true in virtue of its content. But it has its special epistemic status in virtue of its character: the character of (2) requires that it express a truth in every context of use. Other sentences that may express the same content as a particular occurrence of (2) but have a different character, such as (4), do not have the same special epistemic status.
(4) GWB is in Washington, DC on June 16, 2004.
Because (4) and some occurrences of (2) share a content but differ in their epistemic status, it is natural to conclude that contents cannot be the bearers of this special epistemic property. Critics of this account of the a priori (Soames, 2005) say that the content/character distinction cannot underwrite the general account of a priori knowledge that some of its defenders (Chalmers, 1996; Jackson, 1998) have claimed. Third, some philosophers have used Kaplan’s content/character distinction to distinguish narrow content (i.e. content determined by the internal state of the thinker) from wide content (i.e. content determined by the internal state of the thinker and his or her environment) (Fodor, 1987; see also Chalmers, 1996; Jackson, 1998, for a related application of two-dimensional semantics to these ends). They suggest that narrow content is loosely modeled on Kaplan’s characters, and wide content on Kaplan’s contents. That characters seem to capture something important about the psychological roles of belief makes them particularly attractive candidates to model the purely internal aspects of thought. Critics of the approach contend that although characters help to characterize internal states of thinkers, they are not themselves determined by such states (Stalnaker, 1989). See also: Analytic/Synthetic, Necessary/Contingent, and a
Bibliography
Almog J, Perry J & Wettstein H (1989). Themes from Kaplan. New York: Oxford University Press.
Chalmers D (1996). The conscious mind. New York: Oxford University Press.
Fodor J A (1987). Psychosemantics: the problem of meaning in the philosophy of mind. Cambridge, MA: MIT Press.
Frege G (1892). ‘Ueber sinn und bedeutung.’ Zeitschr. für Philos. und Philos. Kritik 100. Feigl H (trans.). 190–202.
Jackson F (1998). From metaphysics to ethics. New York: Oxford University Press.
Kamp H (1971). ‘Formal properties of ‘‘now’’.’ Theoria 37, 227–273.
Kaplan D (1989a). ‘Demonstratives.’ In Almog, Perry, & Wettstein (eds.). 481–564.
Kaplan D (1989b). ‘Afterthoughts.’ In Almog, Perry, & Wettstein (eds.). 565–614.
King J (2001). Complex demonstratives. Cambridge, MA: MIT Press.
Kripke S (1980). Naming and necessity. Cambridge, MA: Harvard University Press.
Lewis D K (1980). ‘Index, context, and content.’ In Kanger S & Öhman S (eds.) Philosophy and grammar. Dordrecht: Reidel.
Soames S (2005). Reference and description: the case against two-dimensionalism. Princeton, NJ: Princeton University Press.
Stalnaker R C (1978). ‘Assertion.’ In Cole P (ed.) Syntax and semantics, vol. 9: pragmatics. New York: Academic Press, Inc. 315–322.
Stalnaker R C (1989). ‘On what’s in the head.’ Philosophical Perspectives 3, Philosophy of Mind and Action Theory. 287–316.
Stalnaker R C (2004). ‘Assertion revisited: on the interpretation of two-dimensional modal semantics.’ Philosophical Studies 118(1–2), 299–322.
Chart Parsing and Well-Formed Substring Tables S G Pulman © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 2, pp. 505–508, © 1994, Elsevier Ltd.
For efficiency, most practical parsing algorithms (see Parsing: Statistical Methods) are implemented using a ‘well-formed substring table’ (wfsst) to record parses of complete subconstituents. This means that duplicate paths through the search space defined by the grammar are avoided. Since there may be many such paths, use of such a table can sometimes mean the difference between processing time of seconds, as opposed to minutes or even hours.
Well-Formed Substring Tables To illustrate the notion of a well-formed substring table, consider the process of parsing a sequence analyzable as:
(1) . . . ₁ V ₂ NP ₃ PP ₄
(e.g., ‘. . .saw the man in the park’), given rules in the grammar like:
(2) VP → V NP
    VP → VP PP
    NP → NP PP
In a top-down parsing regime, there will be two points at which the first of these rules will be used to predict an NP beginning at position 2. One should
assume that the algorithm first finds the nonrecursive NP from 2 to 3. Now, via the third rule, it will again look for an NP, this time followed by a PP. The naive algorithm might repeat the same sequence of actions it has just performed in recognizing the original NP from 2 to 3. However, if it is assumed that, whenever a complete constituent is recognized, it is recorded in a table, then the algorithm can be changed to consult this table whenever it is looking for a constituent of a particular type at a given point. In this instance, there is already an NP from 2 to 3; therefore, it is possible to proceed directly to look for a PP beginning at position 3. Having found this PP, it is possible to assemble two VPs: one from 1 to 3, of the form [VP [V NP]], and one from 1 to 4 of the form [VP [V [NP [NP PP]]]]. Now the situation is reached where the second rule predicts a PP beginning at position 3. The naive algorithm will again go ahead and recompute the sequence of actions leading to the recognition of the PP. However, using the wfsst, the fact is already recorded that there is a PP from 3 to 4, and instead the algorithm can simply use that information to build a VP from 1 to 4 of the form [VP [VP PP]]. Thus, at least two recomputations have been avoided. If this does not sound very impressive, the reader is invited to work through the steps involved in parsing a sequence like: (3) V NP P NP P NP P NP
(e.g., ‘saw the man in the park with a telescope on Friday’) given the above rules, along with PP → P NP, and to check how many times the same constituents
are reparsed when a wfsst is not being used. For constructions like this, the numbers can grow very rapidly indeed. Use of a wfsst here is essential if parsing algorithms are to be implemented in a practically usable form. Another technique is to keep a record of subconstituents that have not been found beginning at a particular point in the input. For example, in the earlier ‘saw the man in the park’ example, when the VP of the form [VP V [NP [NP PP]]] is found, it will, via the second rule, cause a prediction of a PP beginning at position 4. This prediction fails, because it is at the end of a sentence. When the VP of the form [VP [VP [V NP]] PP] is found, it too will cause the same prediction, via the same rule. Although in this case discovering that there are no more PPs is fairly trivial, this will not always be so, and a lot of recomputation can be saved by also checking that, when looking for a constituent C at position P, one has not already tried and failed to find C at P.
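The effect of the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's algorithm: the function name `count_analyses`, the dictionary `table`, and the choice to count analyses rather than build trees are all hypothetical, and the rule PP → P NP is included so that the longer example can be analyzed. The table is keyed on (category, start, end), uses the article's vertex numbering from 1, and stores a count of 0 for a constituent that was sought and not found, so failures are never recomputed either.

```python
# Binary rules from the article's example grammar, plus PP -> P NP (assumed).
RULES = [
    ("VP", ("V", "NP")),
    ("VP", ("VP", "PP")),
    ("NP", ("NP", "PP")),
    ("PP", ("P", "NP")),
]


def count_analyses(cats, goal="VP"):
    """Count analyses of `goal` over the whole sequence of categories `cats`.

    `table` plays the role of the well-formed substring table: each
    (category, start, end) triple is computed at most once. A stored 0
    records that the constituent was looked for and not found.
    """
    table = {}

    def count(cat, start, end):
        key = (cat, start, end)
        if key in table:              # consult the table before doing any work
            return table[key]
        if end - start == 1:          # a single input item between two vertices
            result = 1 if cats[start - 1] == cat else 0
        else:                         # try every rule and every split point
            result = sum(
                count(left, start, mid) * count(right, mid, end)
                for lhs, (left, right) in RULES if lhs == cat
                for mid in range(start + 1, end)
            )
        table[key] = result
        return result

    return count(goal, 1, len(cats) + 1), table


short_total, short_table = count_analyses("V NP P NP".split())
long_total, long_table = count_analyses("V NP P NP P NP P NP".split())
print(short_total, long_total, len(long_table))
```

Because every split point lies strictly inside the span, the left-recursive rules always recurse on a shorter span, so this sketch terminates without special treatment of left recursion. Comparing `long_total` with `len(long_table)` gives a rough sense of how many analyses share how few table entries.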
Basic Chart Parsing A ‘chart’ is a generalization of a wfsst in which incomplete constituents are also represented. Charts are the basis on which many parsing algorithms are implemented, and they provide a flexible framework within which different types of processing regimes and different types of grammatical formalism can be handled. A chart consists of a set of ‘edges’ and ‘vertices.’ Vertices represent the positions between words in an input sentence, and edges represent partial or complete analyses. An edge, at least when context-free rules or related formalisms are involved, can be thought of as something derived from a rule and having the following structure: (4) edge(Id, LeftVertex, RightVertex, MotherCat, DaughtersFound, DaughtersSought)
In describing charts it is convenient to use a PROLOG-like notation: words in lower case are constants, words beginning with upper case are variables, an underscore is a variable whose value is not of interest, and a list of items is enclosed in square brackets, with the convention that an expression like [Head | Tail] represents a list whose first member is ‘Head’ and whose remaining members (a list) are ‘Tail.’ An edge has an identifier and connects vertices. DaughtersFound will usually be represented in terms of a list of other edges representing the analysis of those daughters. DaughtersSought is a list of categories. An edge is complete if DaughtersSought is empty.
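For readers more used to a general-purpose language than to PROLOG terms, the edge structure in (4) might be rendered roughly as follows. The class name `Edge` and its field names are illustrative choices, not part of the article; tuples stand in for the PROLOG lists.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Edge:
    """One chart edge, mirroring the six arguments of edge(...) in (4)."""
    id: int
    left: int                 # LeftVertex
    right: int                # RightVertex
    mother: str               # MotherCat
    found: Tuple[int, ...]    # DaughtersFound: identifiers of daughter edges
    sought: Tuple[str, ...]   # DaughtersSought: categories still needed

    def is_complete(self) -> bool:
        # An edge is complete when nothing remains to be found.
        return not self.sought


# A complete NP edge from vertex 1 to 2, and an empty S edge that still
# needs both of its daughters.
complete_np = Edge(1, 1, 2, "NP", (), ())
empty_s = Edge(2, 1, 1, "S", (), ("NP", "VP"))
print(complete_np.is_complete(), empty_s.is_complete())
```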
To specify a particular instantiation of a chart framework, it is necessary to state the following:
1. a regime for creating new edges
2. a way of combining two old edges to form a new edge
3. access to an ‘agenda’ of edges created by (1) or (2) that are waiting for further processing when they are entered into the chart
A Bottom-Up Chart Parser One particular chart-based algorithm can be specified as follows. It proceeds bottom-up, one word at a time.
(5) New edges: whenever there is a complete edge put in the chart of the form
    edge(Id, From, To, Category, _, _)
then for each rule in the grammar of the form Lhs → Rhs where Category is the first member of Rhs, put a new edge on the agenda of the form
    edge(NewId, From, From, Lhs, [], Rhs)
Not all rules in the grammar meeting this criterion will lead to a complete parse. This step of the procedure can be made sensitive to information precomputed from the grammar so as to select only rules that are, say, compatible with the next word in the input, or alternatively, compatible with the next category sought of at least one incomplete edge ending at the point where the current word starts (see Aho and Ullman, 1977 for various types of relevant grammar relations).
(6) Combine Edges: Whenever a new edge is put into the chart of the form:
    edge(Id1, B, C, Cat1, Found1, [])
then for each edge in the chart of the form:
    edge(Id2, A, B, Cat2, Found2, [Cat1 | OtherDaughtersSought])
create a new edge
    edge(Id3, A, C, Cat2, [Id1 | Found2], OtherDaughtersSought)
Whenever a new edge is put into the chart of the form:
    edge(Id1, A, B, Cat1, Found1, [Cat2 | RestSought])
then for each edge in the chart of the form:
    edge(Id2, B, C, Cat2, Found2, [])
create a new edge
    edge(Id3, A, C, Cat1, [Id2 | Found1], RestSought)
The first part of ‘combine edges’ is triggered by the addition of a complete edge to the chart, and produces a new edge for each incomplete edge ending where
the complete edge begins that can combine with it. These incomplete edges are already in the chart. The new edges are put on the agenda. The second part is triggered by the addition of an incomplete edge to the chart, and produces a new edge for each complete edge beginning where the incomplete edge ends. These new edges are put on the agenda. Looking at things from the point of view of a complete edge, the first part of ‘combine edges’ ensures that it is combined with whatever is already in the chart that it can be combined with, whereas the second part ensures that it will be combined with any future incomplete edges entering the chart. Thus, no opportunity for combination will be missed, at whatever stage of parsing it arises. All that has to be done now to specify a complete chart-parsing procedure is to define access to the agenda: either treat the agenda as a stack (last in, first out), in which case the general search strategy will be depth first, or as a queue (first in, first out), in which case the search order of hypotheses will be breadth first. There could also be more complex heuristics ordering edges on the agenda according to some weighting function: this would mean that the highest scoring hypothesis was explored first, independently of the order in which they were generated. The procedure must also be embedded in some kind of top-level driver, so as to start off the process and to check for complete analyses when the process is ended. The final program, then, might have the following structure:
(7) Until no more words:
        create new edges for next word
        do New Edges for these edges.
        Until agenda is empty:
            pop next edge off agenda and put in chart
            do New Edges
            do Combine Edges
    Check for complete edges of desired category spanning start to finish
Given the following grammar, the algorithm would proceed as follows on the input sentence ‘they can fish’ (with the words occupying chart positions 1 to 2, 2 to 3, and 3 to 4, respectively):
    S → NP VP
    NP → they | fish
    VP → Aux VP
    VP → Vi
    VP → Vt NP
    Aux → can
    Vi → fish
    Vt → can
[Table tracing each operation (new word edge, pop, and so on) and the edges it creates: not recoverable from this copy.]
At this point, no more processing can take place. Inspecting the chart, it can be found that there are two complete edges (edges 15 and 18) of the desired category, S, spanning the input. By recursively tracing through their contained edges, the syntactic structure of the analyses implicit in the chart can be recovered. It can be noticed that, as well as sharing some complete subconstituents (edge 1), the final analyses were built up using some of the same partial constituents (edge 3).
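The bottom-up procedure of (5) to (7) can be sketched compactly in Python. The sketch below is a simplification under several assumptions rather than a transcription of the article's program: the names `parse`, `RULES`, and `LEXICON` are hypothetical; lexical edges are simply placed on the agenda and receive their ‘new edges’ step when popped; daughters are appended in left-to-right order instead of being prepended as in (6); the lexicon treats transitive ‘can’ as Vt so that the rule VP → Vt NP can apply; and edge identifiers are not expected to match the numbers cited in the text. It also assumes a grammar without cyclic unary rules, since nothing here removes duplicate edges.

```python
from collections import deque, namedtuple

Edge = namedtuple("Edge", "id left right mother found sought")

RULES = [("S", ("NP", "VP")), ("VP", ("Aux", "VP")),
         ("VP", ("Vi",)), ("VP", ("Vt", "NP"))]
LEXICON = {"they": ["NP"], "can": ["Aux", "Vt"], "fish": ["NP", "Vi"]}


def parse(words, goal="S", depth_first=True):
    chart, agenda, by_id = [], deque(), {}

    def add(left, right, mother, found, sought):
        # Every newly created edge goes on the agenda first.
        edge = Edge(len(by_id) + 1, left, right, mother, found, sought)
        by_id[edge.id] = edge
        agenda.append(edge)

    def new_edges(edge):
        # (5): a complete edge proposes each rule it could begin.
        if not edge.sought:
            for lhs, rhs in RULES:
                if rhs[0] == edge.mother:
                    add(edge.left, edge.left, lhs, (), rhs)

    def combine(edge):
        # (6): pair the new edge with every compatible edge in the chart.
        for other in chart:
            active, passive = (edge, other) if edge.sought else (other, edge)
            if (active.sought and not passive.sought
                    and active.right == passive.left
                    and active.sought[0] == passive.mother):
                add(active.left, passive.right, active.mother,
                    active.found + (passive.id,), active.sought[1:])

    for position, word in enumerate(words, start=1):
        for category in LEXICON[word]:
            add(position, position + 1, category, (word,), ())
        while agenda:
            # Stack access gives depth-first search, queue access breadth-first.
            edge = agenda.pop() if depth_first else agenda.popleft()
            chart.append(edge)
            new_edges(edge)
            combine(edge)

    def tree(edge_id):
        edge = by_id[edge_id]
        parts = [d if isinstance(d, str) else tree(d) for d in edge.found]
        return "[" + " ".join([edge.mother] + parts) + "]"

    last = len(words) + 1
    return sorted(tree(e.id) for e in chart
                  if e.mother == goal and not e.sought
                  and e.left == 1 and e.right == last)


for analysis in parse("they can fish".split()):
    print(analysis)
```

Switching `depth_first` changes only the order in which hypotheses are explored; because the agenda is emptied after each word, both settings find the same set of complete S edges.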
Other Chart-Based Parsing Algorithms It is easy to formulate different processing regimes within the chart framework. By redefining the ‘new
edges’ routine to operate on incomplete edges, one can implement a top-down, ‘Earley’ style of algorithm (Thompson and Ritchie, 1984). Alternatively, it is possible to define ‘new edges’ so that any arbitrary constituent, rather than the first, is used to build new edges, and with corresponding changes to ‘combine edges,’ constituents can be built right to left, as well as left to right (Steel and de Roeck, 1987). This can be useful for practical efficiency when some construction is ‘keyed’ by an item that is not the leftmost one (e.g., conjunctions in English or verbal complements in subordinate clauses in languages like German). A left-to-right strategy would have to allow for the possibility of many types of constituents before there was any evidence for them, possibly leading to a lot of wasted computation most of the time. This strategy can also be generalized to ‘head-driven’ parsing, using information from the head of a phrase to guide the search for its sister constituents to the left or right (Kay, 1990). For all of these different types of parsing algorithms, appropriate search strategies can be imposed by different methods of manipulating the agenda, as mentioned earlier. Chart parsing also offers a degree of robustness in the case of a failure to find a complete parse, either because the input is ungrammatical or (what amounts to the same thing from the point of view of the parser) because it is not within the coverage of the grammar. Even if the overall parse fails, an algorithm like the one above will result in all well-formed subconstituents that are present in the input being found. From these, different types of heuristic strategy can be used to do something useful. Mellish (1989) described one such technique, which attempted to massage an ill-formed input into a well-formed one by inserting or deleting constituents after the ordinary parsing process has failed.
It is also worth pointing out that many of the virtues of charts as a basis for parsing are also desiderata for the reverse process of generation. In generation as well as in parsing, it is important to avoid unnecessary recomputation. Shieber (1988) describes a chart-based framework that was intended to accommodate a variety of generation and parsing algorithms.
Packing Although charts, as so far described, offer considerable economy of representation and processing, they still take exponential time and space when there is an exponential number of parse trees to be assigned to a sentence. This can be improved on. Notice that in the example above, two VP constituents were built
spanning 2 to 4, and these in turn gave rise to two S constituents, each containing the same NP. This has the advantage that all analyses of the sentence are explicitly represented: there is an S edge for each one. However, it has the disadvantage that the combination is done redundantly on the second and any further occasions. (This is where the exponential behavior arises.) If an NP can combine with one VP from 2 to 4 to form a sentence, then it is obvious that it can also combine with any other VP from 2 to 4. At the point at which this move is made, the fact that the two VPs have a different internal structure is irrelevant. At a cost of extra work in spelling out explicit parse trees when other processing has been completed, this redundancy can be eliminated. In the chart framework developed here, this is most easily achieved by generalizing the representation of complete edges so that the ‘constituents’ or DaughtersFound field is a disjunction of lists of edge identifiers, rather than a single list. Then the algorithm must incorporate the extra step, whenever it is building a complete edge, of checking to see whether there is already in the chart a complete edge of the same category and with the same ‘from’ and ‘to’ labels. If there is, it can simply make a disjunction of the constituents field of the edge being built with that of the existing edge; no new edge is needed, and no further processing is triggered. Any analyses that the existing edge already takes part in will also involve the new edge, and the new edge will be included in any further analyses involving the existing edge. Thus, in the case of the second VP edge, at the point where edge 9 can combine with edge 11 to form edge 17, edge 14 would have been extended to look like: e14(2, 4, VP, [ [e4, e13], [e5, e11] ], [])
Then the combination of edge 3 with edge 14 to make a sentence would represent both analyses, without needing to repeat the combination explicitly. In this case, only one set of operations is saved, but it is easy to see that in cases like the multiple PP sentence encountered earlier, the savings can be considerable. This extension is essentially the notion of ‘packing,’ as described in, for example, Tomita (1987). If explicit parse trees are required, then there is an extra cost in a post-processing phase, for each disjunction will need to be unpacked into a separate analysis. This step may be exponential even where parsing time was polynomial: The complexity is moved, not eliminated. However, the number of constituents created and the number of computational operations performed while actually parsing will usually be far smaller if packing is used than if the simple chart scheme is used. Moreover, in many practical NLP
systems, parse trees do not need to be enumerated explicitly, except perhaps for debugging purposes, as they are merely a preliminary to semantic interpretation. In some of these systems (e.g., that described in Alshawi et al., 1988), many aspects of semantic interpretation can be done on packed structures directly, saving yet more computational work. See also: Parsing: Statistical Methods.
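The packing step and the later unpacking phase described in this section can be sketched as follows. This is a minimal illustration, assuming a representation that the article does not itself specify: complete constituents are stored in a dictionary keyed on (category, from, to), each holding a list of alternative daughter lists, and the helper names `add_complete` and `unpack` are hypothetical. The analyses entered are those of ‘they can fish’, with transitive ‘can’ labeled Vt by assumption.

```python
from itertools import product


def add_complete(forest, category, start, end, daughters):
    """Record a complete constituent, packing it into an existing entry.

    Returns True if this is a new (category, start, end) entry that would
    need further processing, or False if it was merged into an old one.
    """
    key = (category, start, end)
    is_new = key not in forest
    forest.setdefault(key, []).append(list(daughters))
    return is_new


def unpack(forest, node):
    """Spell out every tree below `node` (the potentially costly phase)."""
    if isinstance(node, str):          # a word
        return [node]
    trees = []
    for daughters in forest[node]:     # one alternative per packed analysis
        for combo in product(*(unpack(forest, d) for d in daughters)):
            trees.append("[" + " ".join((node[0],) + combo) + "]")
    return trees


forest = {}
add_complete(forest, "NP", 1, 2, ["they"])
add_complete(forest, "Aux", 2, 3, ["can"])
add_complete(forest, "Vt", 2, 3, ["can"])
add_complete(forest, "NP", 3, 4, ["fish"])
add_complete(forest, "Vi", 3, 4, ["fish"])
add_complete(forest, "VP", 3, 4, [("Vi", 3, 4)])
first_vp = add_complete(forest, "VP", 2, 4, [("Aux", 2, 3), ("VP", 3, 4)])
second_vp = add_complete(forest, "VP", 2, 4, [("Vt", 2, 3), ("NP", 3, 4)])
# The S constituent is built once, whichever VP analysis is intended.
add_complete(forest, "S", 1, 4, [("NP", 1, 2), ("VP", 2, 4)])

trees = unpack(forest, ("S", 1, 4))
print(first_vp, second_vp, len(trees))
```

The second VP is merged rather than added, so the combination with the subject NP happens only once; the two analyses reappear only when `unpack` is called.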
Bibliography
Aho A V & Ullman J D (1977). Principles of compiler design. Reading, MA: Addison Wesley.
Alshawi H et al. (1988). ‘Overview of the core language engine.’ Proc Int Conf 5th Generation Computer Systems, Tokyo.
Kay M (1990). ‘Head driven parsing.’ In Parsing Technologies. Proceedings of an International Workshop. Carnegie–Mellon University, Pittsburgh, PA.
Mellish C (1989). ‘Some chart-based techniques for parsing ill-formed input.’ Proceedings of 27th ACL, Vancouver.
Shieber S (1988). ‘A uniform architecture for parsing and generation.’ COLING 88, Budapest.
Steel S & de Roeck A (1987). ‘Bidirectional chart parsing.’ In Mellish C S & Hallam J (eds.) Advances in artificial intelligence. Proceedings of AISB-87. Chichester: Ellis Horwood.
Thompson H S & Ritchie G D (1984). ‘Implementing natural language parsers.’ In O’Shea T & Eisenstadt M (eds.) Artificial intelligence: tools, techniques, and applications. New York: Harper and Row.
Tomita M (1987). ‘An efficient augmented context-free parsing algorithm.’ Computational Linguistics 13, 31–46.
Chatino
See: Zapotecan.
Chatterji, Suniti Kumar (1890–1977) K Karttunen, University of Helsinki, Helsinki, Finland © 2006 Elsevier Ltd. All rights reserved.
Suniti Kumar Chatterji was born in Sibpur, Howrah on November 26, 1890, and died in Calcutta on May 29, 1977. He was born in a traditional Sāmaveda Brahman family of Kaśyapa clan originating in Eastern Bengal. He was educated at Motilal Sil’s Free School, Scottish Church College (B.A. 1911), and Calcutta University (M.A. 1913 in English, then also Sanskrit studies until 1919). In 1913 he became Professor of English at Vidyasagar College in Calcutta, and in 1914–1919 he taught English as an Assistant Professor at Calcutta University. Chatterji undertook further studies of Indo-Aryan and Indo-European in London in 1919–1921 (D.Litt. 1921 in Indo-Aryan philology, under L. D. Barnett, F. W. Thomas et al.) and in Paris in 1921–1922 (under Bloch, Meillet, Przyluski, Pelliot et al.). In 1922 he became Khaira Professor of Indian Linguistics and Phonetics at Calcutta University, and in 1952 he was named Emeritus Professor of comparative philology. As a friend of Tagore, he served many years as a member of the governing body of Visva-Bharati
University. In the 1950s and 1960s he was Chairman of the Upper House of the West Bengal Legislature. He attended numerous congresses and traveled widely, e.g., around 1951 acting as visiting lecturer at Pennsylvania University for 6 months. He won many honors, e.g., Dr.h.c. in 1960 from Rome and D.Litt.h.c. in 1965 from Delhi. From 1964 on he was National Research Professor of Humanities. Chatterji’s London dissertation, Origin and development of Bengali, published in 1926, made him famous as the foremost linguist of India. He was no theoretician but had a good grasp of historical linguistics and discussed questions of Old and Middle Indo-Aryan, Bengali, and Hindi and contacts of Indo-Aryan with other Indo-European and non-Indo-European languages. He was one of the first phoneticians in India. In addition to linguistic works he edited Bengali and Maithili texts and wrote a number of books and essays in English, Bengali, and Hindi on such topics as languages, literature, history, and national questions, as well as travel books. Among his students were M. M. Ghosh and S. Sen. See also: Bloch, Bernard (1907–1965); Meillet, Antoine (Paul Jules) (1866–1936); Sen, Sukumar (1900–1992).
Bibliography
Bhattacharji S, Banerji P K & Kanjilal A K (1970). Suniti Kumar Chatterji. The Scholar and the Man. Calcutta: Jijnasa.
Chatterji S K (1926). Origin and development of Bengali language. Calcutta: Calcutta University Press. 1–2 (2nd edn., 1970; Part 3. Additions and corrections, index. London, 1978).
Chatterji S K (1928). A Bengali phonetic reader. London: University of London Press.
Chatterji S K (1942). Indo-Aryan and Hindi. Ahmedabad: Gujarat Vernacular Society (2nd rev. and enl. edn., Calcutta, 1960).
Chatterji S K (1963). Languages and literatures of modern India. Calcutta: Prakash Bhavan.
Chatterji S K (1968). Balts and Aryans in their Indo-European Background. Simla: Indian Institute of Advanced Study.
Chatterji S K (1972). Select papers. New Delhi: People’s Pub. House.
Chatterji S K (1983). On the development of Middle Indo-Aryan. Calcutta Sanskrit College Research Series 132. Calcutta: Sanskrit College.
Chatterji S K & Sen S (1957). A Middle Indo-Aryan Reader. Calcutta: Calcutta University Press.
Singh U N (ed.) (1997). S. K. Ch. A centenary tribute. Papers from the proceedings of the National Seminar on ‘Suniti Kumar Chatterji: An End-Century Assessment’ held November 1989. New Delhi: Sahitya Akademi.
Chavée, Honoré (1815–1877) J van Pottelberge, Ghent University, Ghent, Belgium © 2006 Elsevier Ltd. All rights reserved.
Honoré Chavée, a Belgian comparative linguist who specialized in Indo-European and Semitic languages, initiated the school of naturalist linguistics in France and became one of its main representatives. He was born in Namur (Belgium) on June 3, 1815, and trained as a Roman Catholic priest at the Namur Diocesan Seminary (1833–1838). In 1840 he became parish priest of Floriffoux, but quit in 1843, moving to Brussels and eventually to Paris, teaching classes at Stanislas College (1846–1848) that formed the basis for his early magnum opus, Lexiologie indo-européenne (1849). Soon afterward he abandoned the priesthood and the Christian faith, embraced Auguste Comte’s positivism, and became a Freemason. From 1848 on Chavée taught his students at his home; he gave guest lectures in Pisa and Bologna, and taught German at the École Polytechnique in Versailles in 1871–1872. He died after a lingering illness on July 16, 1877. Apart from his studies of Semitic languages at Leuven under Jan Theodoor Beelen (a biblical exegete and orientalist) from 1838 until 1840 and classes with Eugène Burnouf in Paris, Chavée was essentially self-taught, though influenced by the writings of Frédéric Eichhoff, Friedrich Diez, Franz Bopp, and others. Very much like August Schleicher, with whose work he did not, however, become acquainted until later, Chavée adopted the view that language is a living organism (and that the loss of transparent morphological structure represents ‘illness’ or decay), and took a historical and glottogonic approach to comparativism. Throughout his work, he reconstructed primitive monosyllabic roots of language, which consisted of only two classes: syllables expressing sensations (verbs) and demonstrative syllables (pronouns). He considered both kinds of etymological roots to be spontaneous creations of the brain and, consequently, regarded linguistics as a branch of anthropology and ultimately as a natural science. Given that comparative research demonstrated the distinct origins of Semitic and Indo-European roots, Chavée drew the conclusion from the polygenetism of the languages that the races who speak them also have a polygenetic origin. Unlike other comparative work of his time, Chavée’s research included (lexical) meaning, as he tried to establish primitive semantic kernels and classified all verbal roots into onomasiological ‘natural families’ after the model of biological taxonomy. Because he was over-eager to schematize, however, most of his reconstructions are speculative. Together with Abel Hovelacque, his most famous pupil, Chavée founded the Revue de Linguistique et de Philologie Comparée in 1867 (which appeared until 1916), the first French journal devoted to linguistics and the main dissemination channel of the naturalist school. Apart from his comparative work, Chavée wrote on language education; his descriptions of his native Walloon language are still valuable.
See also: Bopp, Franz (1791–1867); Diez, Friedrich (1794–1876); Indo-European Languages; Naturalism; Schleicher, August (1821–1868); Semitic Languages.
Bibliography Chavée H (1849). Lexiologie indo-européenne. Paris: Franck.
Chavée H (1857). Français et wallon, parallèle linguistique. Paris: Truchy. Desmet P (1996). La linguistique naturaliste en France (1867–1922). Leuven: Peeters.
Leroy M (1985). ‘Chavée (Honoré-Joseph).’ In Biographie nationale publiée par l’Académie royale des sciences, des lettres et des beaux-arts de Belgique, vol. 44. Brussels: Bruylant. 197–206.
Chibchan A Constenla Umaña, University of Costa Rica, San José, Costa Rica © 2006 Elsevier Ltd. All rights reserved.
The Chibchan stock is currently composed of the 16 languages from Central America and northwestern South America listed below with their main current alternate names, approximate number of speakers, and location: Pech (Paya; 900; Olancho Department, eastern Honduras), Rama (20; Rama Cay and other localities south of Río Escondido, southeastern Nicaragua), Maléku Jaíka (Guatuso; 300; Guatuso County, northern plains of Costa Rica), Cabécar (8500; Atlantic watershed and southern Pacific slope of the Talamanca Range, southern Costa Rica), Bribri (6000; southern Atlantic and Pacific slopes of the Talamanca Range), Boruca (Brunka; 2 speakers and 20 semi-speakers with a passive command of the language; Térraba Valley, southwestern Costa Rica), Teribe (a dialect of Naso; 3000; Teribe and Changuinola rivers area, northwestern Panama; Térraba, the Costa Rican dialect, is extinct), Buglere (Bocotá, Guaymí Sabanero; 3700; Bocas del Toro, Veraguas, Chiriquí Provinces, western Panama), Ngäbere (Guaymí; 110 000 in the Bocas del Toro, Chiriquí, and Veraguas provinces, western Panama, and 2172 in the bordering area of southwestern Costa Rica), Kuna (70 000 on the eastern Atlantic coast and in the southeastern Paya and Pucuro localities of Panama, and 800 in Arquía and Caimán Nuevo in the Urabá Gulf, Colombia), Chimila (450; lowlands to the south of the Fundación River, Magdalena Department, Colombia), Cogui (Cágaba; 6000; northern, eastern, and western slopes of the Sierra Nevada de Santa Marta, Colombia), Damana (Malayo; 1500; southern and eastern slopes of the Sierra Nevada de Santa Marta), Ica (Bíntucua; 8000; southern slopes of the Sierra Nevada de Santa Marta), Barí (Motilón; 1500 in Colombia, 850 in Venezuela; Serranía de Motilones), and Tunebo (Uwa; 3500, mostly in Colombia, a few in Venezuela; eastern slopes of the Sierra Nevada de Cocuy).
Formerly, the stock included at least eight more languages, which are listed with their original location and approximate time of extinction: Huetar (central Costa Rica, 18th century), Chánguena, Dorasque (both in western Panama, Chiriquí Lagoon area, beginning of the 20th century), Antioquian (central and northeastern Department of Antioquia, Colombia, 18th century), Tairona (the coast to the north of the Sierra Nevada de Santa Marta, 18th century or before), Kankuama (eastern slopes of the Sierra Nevada de Santa Marta, first half of the 20th century), Duit (Boyacá Department, Colombia, 18th century), and Muisca (Cundinamarca Department, Colombia, 18th century).
Subgrouping The following subgrouping is based on both lexicostatistical and comparative evidence (Constenla, 1995: 42):
I. Pech.
II. Core Chibchan:
  IIA. Votic: Rama, Guatuso.
  IIB. Isthmic:
    B1. Viceitic: Cabécar, Bribri.
    B2. Boruca.
    B3. Teribe.
    B4. Guaymiic: Ngäbere, Buglere.
    B5. Doracic: Dorasque, Chánguena.
    B6. Kuna.
  IIC. Magdalenic:
    C1. Core Magdalenic:
      C1.1. Southern Magdalenic:
        C1.1a. Chibcha: Muisca, Duit.
        C1.1b. Tunebo.
      C1.2. Arhuacic:
        C1.2a. Cogui.
        C1.2b. Eastern-southern Arhuacic:
          C1.2b.1. Eastern Arhuacic: Damana, Kankuama.
          C1.2b.2. Ica.
    C2. Chimila.
    C3. Barí.
There are some indications that (a) the Isthmic group could be divided into two branches: Viceitic-Boruca and Teribe-Guaymiic-Doracic-Kuna, (b) the
Magdalenic group could also be divided into two branches: Southern Magdalenic-Barí and Arhuacic-Chimila, (c) Huetar might belong to Votic, and (d) Tairona to Eastern-southern Arhuacic (Jackson, 1995: 67–68). The split of Proto-Chibchan into the ancestors of Pech and Core Chibchan occurred, according to glottochronology, around 6550 years BP, at the time of the beginning of the transition from the hunter-gatherer way of life to the agricultural one. The greatest diversity between the languages is found to the west and north, in Central America, which suggests that the Chibchan people’s homeland must have been there, probably in Costa Rica and Panama, where archeology has found the oldest sites related to them.
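For readers unfamiliar with the method, glottochronological dates of this kind are conventionally derived from the proportion of basic vocabulary that two related languages still share. The standard dating formula is sketched below; the retention rate and the worked figure are illustrative assumptions only, since the values behind the 6550-year estimate are not reported here.

```latex
% t: time depth in millennia since the two languages diverged
% c: proportion of cognates shared on the basic-vocabulary test list
% r: assumed retention rate per millennium (about 0.86 for a 100-item list)
t = \frac{\ln c}{2 \ln r}
% Illustration only (assumed r = 0.86): a time depth of t = 6.55 millennia
% corresponds to c = r^{2t} = 0.86^{13.1} \approx 0.14
```

Under these assumed values, a split of roughly 6550 years would correspond to about 14% shared cognates, which gives a sense of how distant Pech is from the rest of the stock.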
External Relationships There have been proposals of relationships between Chibchan and at least a score of other Amerindian language groups and isolates from Florida in the United States to northern Chile and Argentina (such as Timucua, Tarascan, Cuitlatec, Xincan, Lencan, Misumalpan, Chocoan, Andaquí, Betoy, Warao, Yanomama, Paez, Barbacoan, Mochica, Kunza, Allentiac), which together would constitute a Macro-Chibchan phylum. None of these have been proved, and the quality of the supposed evidence in their favor is extremely poor (Constenla, 1993: 81–95).
Typology The Chibchan languages belong to the Lower Central American Linguistic Area, characterized by features such as SOV order, postpositions, prepositive genitive, postpositive numerals and adjectives, lack of gender contrasts, and contrasts between voiced and voiceless stops. The Chibchan languages of southern Costa Rica and western Panama, together with the Chocoan languages, constitute a Central Subarea characterized by the predominance of features such as distinctive vowel nasality, tense/lax vocalic contrasts, ergative or active case systems, and absence of person inflections. Most Chibchan languages in this subarea present numeral classifiers, postpositive demonstratives, and tone contrasts. Pech, Rama, and Maléku Jaíka are part of a Northern Subarea, and the Magdalenic languages, of an Eastern Subarea. Although each of these subareas possesses its own characteristics, they share the predominance of features, both positive and negative, opposed to those of the Central Subarea such as accusative-nominative case systems (Maléku Jaíka and Tunebo are exceptions to this), person inflection for possession in nouns and for agent and patient in verbs, prepositive demonstratives, and lack of numeral classifiers, distinctive vowel nasality, and tense/lax vocalic contrasts.
See also: Choco Languages; Colombia: Language Situation; Costa Rica: Language Situation; Honduras: Language Situation; Misumalpan; Nicaragua: Language Situation; Panama: Language Situation; Venezuela: Language Situation.
Bibliography Constenla A (1991). Las lenguas del Area Intermedia: introducción a su estudio areal. San José: Editorial de la Universidad de Costa Rica. Constenla A (1993). ‘La familia chibcha.’ In Rodríguez de Montes M L (ed.) Estado actual de la clasificación de las lenguas indígenas de Colombia. Bogotá: Instituto Caro y Cuervo. 75–125. Constenla A (1995). ‘Sobre el estudio de las lenguas chibchenses y su contribución al conocimiento del pasado de sus hablantes.’ Boletín Museo del Oro 38–39, 13–55. Holt D (1986). ‘The development of the Paya sound system.’ Ph.D. diss. University of California, Los Angeles. Jackson R (1995). ‘Fonología comparativa de los idiomas chibchas de la Sierra Nevada de Santa Marta.’ Boletín Museo del Oro 38–39, 57–69.
Chickasaw See: Mobilian Jargon; Muskogean Languages.
CHILDES Database A Theakston, University of Manchester, Manchester, UK © 2006 Elsevier Ltd. All rights reserved.
Overview of the System The Child Language Data Exchange System (CHILDES) has revolutionized the way that research is carried out in the child language research community in the 20 years since its inception. The system was first developed by Brian MacWhinney and Catherine Snow and put in place at Carnegie Mellon University in the United States. MacWhinney has remained the driving force behind the development of the system. The CHILDES system consists of three main components: a system of transcription for linguistic data (CHAT format), a set of tools for analyzing those data when transcribed according to the principles of the system (the CLAN programs), and a database of linguistic corpora provided by members of the language research community for use by the wider research community. All aspects of the system are available online and can be downloaded from the CHILDES homepage. In addition, the CHILDES system supports online access to membership lists and e-mail distribution lists (info-childes) that have encouraged discussion and debate among eminent researchers and new researchers alike. From its humble beginnings, the system has grown to include corpora from more than 30 languages, covering first-language acquisition, bilingual acquisition, language disorders, and narrative. The transcripts are generated from case studies and from groups of children, and include both longitudinal and cross-sectional data. Although American English and British English are well represented, other languages such as Sesotho, Estonian, Spanish, and Cantonese are also included in the database. A key benefit of the CHILDES system is in providing researchers with a wealth of data on which to base their analyses of children’s early language. Due to time and financial constraints, researchers wishing to use longitudinal data have typically made recordings (traditionally approximately one hour every week) from small numbers of children or even from case studies.
This means that it is sometimes difficult to generalize research findings to other children. Researchers are now able to compare the results from their own data with those from other comparable children from the CHILDES database. Indeed, many researchers use the database as the sole source of data for their analyses. However, to use the CHILDES system properly, it is essential to understand the basics of transcription. In addition, as a courtesy to those who donate data, it is necessary to cite an appropriate reference to recognize the researchers who contributed the data used in an analysis. Researchers should also cite the CHILDES handbook (MacWhinney, 2000a, 2000b) to acknowledge use of the system and to recognize the work of Brian MacWhinney in developing and maintaining the system.
Transcription and CHAT Format The first major tool of the CHILDES system is the CHAT format for transcription. This is a set of guidelines for transcription that, if followed correctly, enables all researchers who understand the system to accurately interpret any transcripts contributed to the database and to use the data in their analyses. The system is relatively straightforward but takes a bit of time to master. A handbook written by Brian MacWhinney entitled The CHILDES project: tools for analyzing talk, vol. 1 (2000a) provides details of the various aspects of transcription, but readers are advised to consult the online transcription manual for up-to-date information, as the format for transcription is sometimes modified in line with developments in the overall system. A simple transcript includes header tiers that provide background information, for example the participants, the location or setting, the date, and so on. The header tiers are then followed by the transcript. Each main line of the transcript represents a separate utterance and is identified by a speaker ID followed by the utterance. Each main line can then be accompanied by any number of dependent tiers that appear directly beneath the utterance and allow more detailed coding, for example speech act coding, glosses of second languages, error coding, and any general comments that assist the researcher. All transcripts must finish with a final ‘End’ line. Figure 1 shows a sample transcript in the CLAN window. The transcript shows headers, main lines, and a few dependent tiers (%err for errors and %mor for morphological analysis), and also includes examples of coding for retracings [/], imitations [+ I], unintelligible speech (xxx, [+ PI]), incomplete utterances (+//., [+ IN]), and omitted material (prefaced with 0). When transcribing data, transcripts can be made as simple or as complex as is required by the individual researcher.
Figure 1 A sample transcript (adapted from the Manchester corpus available on CHILDES, Theakston et al., 2001).
However, as transcription is a lengthy process (often up to 10 hours per hour of data), forward planning is essential to make the most of the time available. It is possible to save a lot of time if sufficient thought is given in advance to the kinds of questions the researcher might want to ask of the data and the transcript is coded accordingly. Many aspects of linguistic interactions are easy to code as transcription is taking place, but are somewhat harder to locate at a later stage should it be necessary for a given analysis. For example, argument structure overgeneralizations (e.g., Don’t giggle me) are difficult to locate unless coded as errors, and the identity of omitted words or morphemes may be evident in the context of the recording when intonation patterns are available, but may be unclear when we are faced with the bare transcript. However, the vast range of coding possibilities available and the necessity of selecting among these to address the needs of the immediate research mean that when using data available on CHILDES that has been contributed by other researchers, it is essential to ascertain whether the coding carried out is sufficient to support your own specific analyses.
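The layering of header tiers, main lines, and dependent tiers described in this section can be made concrete with a small parsing sketch. It assumes the usual CHAT line prefixes ('@' for headers, '*' for main lines, '%' for dependent tiers); the sample transcript, the `Utterance` class, and the `parse_chat` function are invented for illustration and are not part of CLAN, and the content of the %err tier is not meant to follow the official error-coding conventions.

```python
from dataclasses import dataclass, field

# Invented sample in the spirit of CHAT: '@' header lines, '*' main lines
# (speaker ID, colon, utterance) and '%' dependent tiers under an utterance.
SAMPLE = """\
@Begin
@Participants:\tCHI Target_Child, MOT Mother
*MOT:\twhat do you want ?
*CHI:\twant cookie .
%mor:\tv|want n|cookie .
*CHI:\tdon't giggle me .
%err:\tgiggle = tickle
@End
"""


@dataclass
class Utterance:
    speaker: str
    text: str
    tiers: dict = field(default_factory=dict)  # e.g. {"mor": "...", "err": "..."}


def parse_chat(source):
    """Split a CHAT-style transcript into header lines and utterances."""
    lines = [line for line in source.splitlines() if line.strip()]
    if not lines or lines[-1].strip() != "@End":
        raise ValueError("transcript must finish with a final @End line")
    headers, utterances = [], []
    for line in lines:
        if line.startswith("@"):
            headers.append(line)
        elif line.startswith("*"):
            speaker, _, text = line[1:].partition(":")
            utterances.append(Utterance(speaker.strip(), text.strip()))
        elif line.startswith("%") and utterances:
            # A dependent tier belongs to the main line directly above it.
            name, _, content = line[1:].partition(":")
            utterances[-1].tiers[name.strip()] = content.strip()
    return headers, utterances


headers, utterances = parse_chat(SAMPLE)
for u in utterances:
    print(u.speaker, "|", u.text, "|", u.tiers)
```

Attaching each dependent tier to the preceding main line mirrors the point made above: an error such as Don’t giggle me is only easy to find later if it was coded on its own tier at transcription time.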
Analysis and the CLAN Programs Once the researcher has transcripts available for analysis, either through the transcription of new data or by accessing data online, the second component of the CHILDES system comes into play. The CLAN programs provide a powerful computerized system of analysis for transcripts in CHAT format. There are about 40 programs, although most researchers will find themselves using just a handful. To run the programs, it is necessary to type fairly basic commands into the command window (see Figure 2), and to ensure that this is done accurately, as programs will either not run or will generate inaccurate results if the command is flawed. The most commonly used programs allow automated calculation of standard measures such as mean length of utterance or turn (MLU, MLT), type-token ratios and measures of lexical diversity (VOCD, FREQ), and enable researchers to conduct basic frequency counts on words or phonemes quickly and accurately (FREQ, PHONFREQ). More advanced programs allow researchers to isolate utterances containing specific words, word types, or word combinations either alone or with preceding and following linguistic context (KWAL and COMBO). It is also possible to generate co-occurrence data showing the frequency with which individual combinations of words occur together (COOCCUR).
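To make the measures named above concrete, the following sketch computes simplified versions of them over a handful of invented utterances. It is not the CLAN implementation: the function names are hypothetical, the data are made up, and the MLU here counts words, whereas MLU is traditionally computed in morphemes.

```python
from collections import Counter

# Invented child utterances, already split into words.
utterances = [
    ["want", "cookie"],
    ["want", "more", "cookie"],
    ["mummy", "want", "juice"],
    ["more"],
]


def mean_length_of_utterance(utts):
    """Average number of words per utterance (a word-based MLU)."""
    return sum(len(u) for u in utts) / len(utts)


def type_token_ratio(utts):
    """Number of distinct word types divided by the number of word tokens."""
    tokens = [w for u in utts for w in u]
    return len(set(tokens)) / len(tokens)


def word_frequencies(utts):
    """Frequency count over all words, in the spirit of FREQ."""
    return Counter(w for u in utts for w in u)


def adjacent_pairs(utts):
    """Counts of adjacent word pairs, in the spirit of COOCCUR."""
    return Counter(pair for u in utts for pair in zip(u, u[1:]))


def utterances_containing(utts, keyword):
    """Utterances that contain a keyword, in the spirit of KWAL."""
    return [u for u in utts if keyword in u]


print(mean_length_of_utterance(utterances))
print(round(type_token_ratio(utterances), 3))
print(word_frequencies(utterances).most_common(2))
print(adjacent_pairs(utterances))
print(utterances_containing(utterances, "more"))
```

Even at this toy scale the dependence on transcription decisions is visible: whether a form counts as one word or two changes both the MLU and the type-token ratio.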
Figure 2 The CLAN commands window.
Those researchers who take the time to fully understand the CLAN programs will be rewarded, as many more complex research questions require the use of a combination of different programs in sequence, using the output of one program as input to another to generate results. Using the CLAN programs can be confusing for the novice, as all programs have a number of switches that produce different output. For example, it is possible to output information with or without line numbers, in alphabetical order, preserving details of file names, or merging all output from several files together. As with transcription, there is a published manual (MacWhinney, 2000a, 2000b) providing guidance on the use of the programs, but readers are advised to consult the online manual, as many programs are regularly updated as the capabilities of the system improve. The manual also provides basic tutorials in the use of the programs. At the most straightforward level, analyses are carried out on all speakers and on the main line in the transcript. However, it is possible to fine-tune analyses by specifying individual speakers or individual tiers over which to conduct these analyses. Moreover, it is possible to run the programs on single transcript files, or over many files simultaneously, and to generate output file by file, or to merge output together across files and/or speakers. Search commands can be modified to include wildcards to allow more complex analyses. Anyone carrying out analyses using the CLAN programs should check the output against the original files, at least in the first instance, to confirm that the search is operating as expected. Although the programs are powerful and can perform searches quickly and efficiently, the accuracy of the output relies crucially on the accuracy of the transcription, coding, and the search command. In addition to the analysis programs, the MOR and POST programs allow researchers to add a line of morphological coding to their transcripts.
For researchers interested in the development of grammar, these programs provide an extremely valuable tool. Once a transcript has been coded, it is then possible to search for specific word types, for example verbs, pronouns, and for combinations of these items, allowing much more powerful analyses. To date, almost all of the English data in CHILDES has been run through the MOR program by Brian MacWhinney and colleagues, as well as some Spanish, Japanese, and Cantonese corpora. For English, the POST program provides almost fully automated coding (the MOR program alone generates options for grammatical coding, but the choice of the correct coding is largely done by hand). There are currently MOR grammars available for Cantonese, Chinese, Dutch, English, French, German, Italian, Japanese, and Spanish, and grammars are being developed for a number of other languages. Of course, the accuracy of any analysis based on the line of morphological coding depends on the accuracy of the coding itself. Although the grammars are continually being updated, in the past there have been a few problems with the accuracy of the coding, so researchers are advised to check transcripts carefully prior to running any analyses to ensure that the coding is correct.
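The kinds of fine-tuning described in this section (restricting an analysis to one speaker, searching with wildcards, and searching the morphological tier for a word type) can be sketched as follows. The records, the function names, and the simplified part-of-speech|stem layout of the %mor strings are assumptions made for illustration; real MOR output is richer, and this is not how CLAN itself is implemented.

```python
from fnmatch import fnmatchcase

# Invented records: (speaker, main line, %mor tier) for each utterance.
# The %mor strings use a part-of-speech|stem layout chosen for illustration.
records = [
    ("MOT", "what do you want ?", "pro|what v|do pro|you v|want ?"),
    ("CHI", "want cookie .", "v|want n|cookie ."),
    ("CHI", "he wants juice .", "pro|he v|want n|juice ."),
    ("CHI", "more cookie .", "qn|more n|cookie ."),
]


def by_speaker(recs, speaker):
    """Keep only the utterances produced by one speaker."""
    return [r for r in recs if r[0] == speaker]


def main_line_matches(recs, pattern):
    """Utterances whose main line has a word matching a wildcard pattern."""
    return [r for r in recs if any(fnmatchcase(w, pattern) for w in r[1].split())]


def stems_with_pos(recs, pos):
    """Stems on the %mor tier whose part-of-speech tag equals `pos`."""
    stems = []
    for _, _, mor in recs:
        for item in mor.split():
            tag, sep, stem = item.partition("|")
            if sep and tag == pos:
                stems.append(stem)
    return stems


child = by_speaker(records, "CHI")
print(main_line_matches(child, "want*"))  # finds both "want" and "wants"
print(stems_with_pos(child, "v"))         # verb stems used by the child
```

Searching the coded tier rather than the main line is what allows a query for all verbs without listing every verb form, which is also why errors in the morphological coding carry straight through to the results.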
New Directions One of the more recent developments of the CHILDES system is the ability to link transcripts to digitized audio and video data. It is necessary to create a digitized sound or video file, but it is then possible to create a direct link between an individual utterance in a transcript and the corresponding section of audio or video recording. This has the potential to revolutionize child language research, as it will allow us to investigate issues that were not considered by the original researchers, by recoding the audio or video data. For example, this may be useful to people interested in phonological development, as few recordings are transcribed phonetically by researchers interested in later language development or the acquisition of grammar. It will also allow researchers to double-check utterances that they are unsure of, for example, children’s errors or utterances marked unintelligible by the original transcriber. It is now possible to access linked transcripts for a number of datasets from the CHILDES website and either download these or view them online. New improvements in technology bring with them potentially difficult issues with respect to participant consent and confidentiality. When parents and children agree to participate in studies of child language acquisition, we have a responsibility to ensure that we obtain consent for all uses of the resulting
data. In the past, parents would not have envisaged transcripts of their interactions with their children appearing on the Internet, available to anyone who cares to look and listen. Traditionally, researchers have dealt with this problem by giving children pseudonyms and removing any potentially identifying information from the transcripts. Of course, this is easy to do with written transcripts, but much more difficult if we intend to make audio and video recordings available online to the wider research community. The availability of audio and video recordings also means that a much wider range of questions may be asked of the data than the original researchers intended, again raising issues of participant consent. A second ongoing development is the transformation of the CHILDES database into XML format through the TalkBank project. This will eventually allow researchers to access the CHILDES database online and to run analyses directly without having to download the data to their own computers. Of course, many researchers prefer to recode data before analysis, but the facility to run analyses online will doubtless prove useful to many people.
Conclusion The CHILDES system has led to significant improvements in research in child language acquisition over the last 20 years, and will continue to do so with the new developments taking place. It is a huge benefit to the research community that there are now large amounts of data for English, and increasing amounts of data from a wide range of other languages. However, it is necessary for researchers to exercise caution. First, over-reliance on the CHILDES database can lead to the extensive analysis of data from just a few children, which may bias our findings in various ways. Second, when comparing children using data from CHILDES, it is important to carefully control for the age and linguistic stage of the children concerned, and to ensure that the corpora are comparable
in terms of transcription procedures and coding before carrying out any analyses. Third, the increasing trend toward analyses at the lexical level means that although the CHILDES system provides data from many children, relatively few of these children have enough data in a dense enough time period to support such analyses. These issues mean that although the CHILDES database is a valuable resource, it is essential that researchers continue to donate new data to CHILDES to promote rigorous cross-linguistic research into language acquisition. See also: Corpora of Spoken Discourse; Corpora; Corpus
Linguistics; Language Development: Overview; Parsing and Grammar Description, Corpus-Based.
Bibliography
Berko Gleason J & Thomson R B (2002). ‘Out of the baby book and into the computer: child language research comes of age.’ APA Review of Books 47, 4.
MacWhinney B (1999). ‘The CHILDES system.’ In Ritchie W & Bhatia T (eds.) Handbook of child language acquisition. San Diego, CA, US: Academic Press, Inc. 457–494.
MacWhinney B (2000a). The CHILDES project: tools for analyzing talk, vol. 1: Transcription format and programs. Mahwah, NJ: Lawrence Erlbaum Associates.
MacWhinney B (2000b). The CHILDES project: tools for analyzing talk, vol. 2: The database. Mahwah, NJ: Lawrence Erlbaum Associates.
MacWhinney B (2001). ‘From CHILDES to TalkBank.’ In Almgren M, Barrena A & Ezeizaberrena M (eds.) Research on child language acquisition. Somerville, MA: Cascadilla. 17–34.
Sokolov J & Snow C (eds.) (1994). Handbook of research in language development using CHILDES. Hillsdale, NJ: Erlbaum.
Children’s Literature: Translation of T Puurtinen, University of Joensuu, Savonlinna, Finland © 2006 Elsevier Ltd. All rights reserved.
Role and Readership Translation of children’s literature poses particular challenges owing to some special characteristics of children’s books and qualities of child readers. The fact that children’s literature tends to have a peripheral position in cultures (Shavit, 1986) and suffer from lack of prestige makes it possible to manipulate texts translated for children in various ways to make them accord with the expectations of the receiving culture. Furthermore, children are not expected to tolerate as much strangeness and foreignness as adult readers, and therefore, modification of the content and language of source texts is often considered necessary. Instead of being innovative, translated children’s books thus tend to conform to conventional, accepted forms, models, and language. However, children’s literature plays an important part as a tool for education, socialization, development of linguistic skills, and spreading world knowledge. Especially in minor language cultures, where translations account for a large proportion of published children’s literature, children are likely to come into contact with literature and its educative and entertaining functions mainly through translations. Therefore, translations may have a key role in introducing child readers to characters, events, and language typical of fiction. The term ‘children’s literature’ usually refers to fiction targeted at readers from preliterate children to young teenagers; nonfiction, such as school textbooks, is excluded. Children’s fiction is, in fact, not a uniform genre either; its various subgenres, e.g., fairy tales and fantasy stories, detective novels, realistic stories, differ in terms of purpose and language (see Puurtinen, 2003: 402), which is likely to affect the choice of translation methods. Here, however, children’s fiction is treated as one, albeit very heterogeneous, genre. 
Although children are the primary readership, children’s books actually have an important secondary target group – adult readers, whose preferences and literary tastes must be taken into account by both authors and translators. However, Oittinen (1993, 2000) advocates translating for children, rather than translating children’s literature, and emphasizes the significance of children’s culture and their magical world, as well as society’s image of childhood and the translator’s own child image (Oittinen, 2000: 41–60).
In addition to the existence of two target groups, children’s literature has a number of other special qualities, which have an effect on both the content and language of translations: strong ideological, didactic, ethical, and moral norms, ambivalence, aim at high readability and speakability, and text–picture relationship. (Nikolajeva, 1996, is an extensive discussion of features and forms of children’s literature.) Translation problems and their solutions made at the level of language tend to reflect, and result from, these hierarchically higher levels. (For a comprehensive review of studies on the translation of children’s literature, see Tabbert, 2002.)
Cultural Norms Various norms regulating the translation of children’s literature can be subsumed under the more extensive concept of culture, or ideology in a neutral sense, referring to taken-for-granted assumptions, beliefs, and values shared by a particular society or culture. In fact, ideology is the overriding constraint, an umbrella concept, dictating what is acceptable children’s literature (see Stephens, 1992; Knowles and Malmkjær, 1996). In general, children’s books are expected to be in some way beneficial to children and sufficiently easy in terms of plot, characterization, and language to be comprehensible (Shavit, 1981: 172, 1986: 112–128). These two requirements may sometimes be contradictory. For instance, a maximally understandable text may be regarded as too simple to teach anything new and, in that respect, benefit the child reader. Moreover, notions of what is beneficial and comprehensible vary from culture to culture and change with time, which often leads to manipulation of source texts in translation (see Shavit, 1981, for examples of manipulation, and Desmidt, 2003, for didactic norms and readability). A good example of the effect of strong didactic norms on language is the French translation of Pippi Longstocking, the originally Swedish Pippi Långstrump by Astrid Lindgren. While the original Pippi shows disrespect towards adults in her anarchic, norm-breaking language use, the French translation has turned her into a more obedient, well-behaved girl by, e.g., toning down impolite expressions (Heldner, 1993). Another example of didacticism is provided by the older German translations of Tove Jansson’s originally Swedish Moomintroll books, which have, e.g., omitted references to kissing, poker games, and American dance music, because these were considered unsuitable for German children’s literature in the 1950s, at the time
of publishing the translations (Bode, 1996; for censorship in East Germany, see Thomson-Wohlgemuth, 2003).
Ambivalence The need to appeal to adult readers leads to the creation of ambivalent texts, which can be read and interpreted differently by children and adults (Shavit, 1986: 63–91). Well-known classics, such as Lewis Carroll’s Alice’s adventures in wonderland and Kenneth Grahame’s The wind in the willows, are examples of ambivalent texts, which children are supposed to read on only one level, as simple fairy tales, whereas adults are expected to be aware of the ironical or satirical levels as well. Intertextuality, such as allusions to other literary texts, films and real-life or fictional characters, is a typical element of ambivalence intended to please the more knowledgeable adult audience. When translated, an ambivalent children’s book may be simplified and directed merely to children by omitting all ingredients of ambivalence; sometimes two different translations can be made, one for children and the other, retaining ambivalence, primarily for adults.
Readability and Speakability All special features and norms typical of children’s literature are naturally to some extent manifest in the language used, but the feature which is the most directly related to linguistic details is readability, or comprehensibility. Although readability is also influenced by other factors than purely linguistic ones (e.g., such text-external factors as the reading situation, the reader’s subject and world knowledge, interest, and motivation), it is usually defined as a textual quality determined by the level of linguistic difficulty. The main determinants of readability include the length and complexity of sentences and the length and familiarity of words (Puurtinen, 1995: 104–115, 135–164; see also Puurtinen, 1997, for syntactic norms). Readability can be understood to cover speakability, the ease of reading aloud, which is an important quality in children’s books read aloud by adults to small children (see Puurtinen, 1995: 164–178; Dollerup, 2003). In addition to lexical and syntactic features relevant to readability, speakability is also affected by, e.g., rhyme, rhythm, and alliteration, which make reading aloud fluent and pleasant. If readability requirements in the source and target literatures are different, the translator may, e.g., simplify sentence structures, decrease sentence length, or use more familiar, concrete vocabulary. The age of
the potential readers is crucial, as the reading skills of different age groups vary extensively. The educative function of children’s literature and aim at high readability in translation may sometimes be contradictory: Culture-specific elements, such as references to foreign places and customs, convey information about foreign cultures if retained in translation, but they also reduce readability. If, on the other hand, such elements are replaced with familiar, domestic ones, i.e., if cultural adaptation is carried out, readability is increased, but the opportunity to improve knowledge is simultaneously lost. (For preservation, neutralization, and adaptation of culture-specific elements, see, e.g., Klingberg, 1986; Nord, 1993.) In Israel, the desire to teach children the highly valued literary form of Hebrew through literature resulted in a translation policy which forced translators to turn even colloquial source text dialogues into standard, formal Hebrew (Even-Zohar, 1992). Easily understandable, everyday source text language was thus replaced with the traditional, varied, but less readable Hebrew variant. Since the 1980s, didactic intentions have been giving way to considerations of readability in Hebrew children’s literature (Du-Nour, 1995). In some cases high readability may clash with the aim to create an inspiring, enjoyable translation. The numerous descriptive personal names in the Harry Potter books (e.g., Neville Longbottom, Minerva McGonagall, Vindictus Viridian) are complicated enough to have a negative effect on readability, but as they often refer to some special qualities of the characters and require inferences and interpretation from the reader, they are likely to make the reading process more active and enjoyable. Therefore, most translators tend to choose such target language equivalents for the names that also trigger inferences (Davies, 2003; for functions and translation of proper names in children’s literature, see Nord, 2003).
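The two most easily measured determinants of readability named in this section, sentence length and word length, can be illustrated with a minimal sketch. It is not a method taken from the studies cited: the function name `readability_profile`, the regex-based splitting, and the sample sentences are assumptions made purely for illustration, and word familiarity and syntactic complexity are not modelled.

```python
import re

def readability_profile(text):
    """Return (mean words per sentence, mean letters per word) for text."""
    # Naive sentence split on ., ! or ? (an assumption; real texts need more care).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters; digits and punctuation are ignored.
    words = [w for s in sentences for w in re.findall(r"[^\W\d_]+", s)]
    if not sentences or not words:
        return 0.0, 0.0
    mean_sentence_len = len(words) / len(sentences)
    mean_word_len = sum(len(w) for w in words) / len(words)
    return mean_sentence_len, mean_word_len

# Hypothetical sample texts: short, familiar wording versus one long sentence.
simple_text = "The cat sat. It was warm."
complex_text = ("Notwithstanding considerable meteorological uncertainty, "
                "the expedition proceeded.")

print(readability_profile(simple_text))
print(readability_profile(complex_text))
```

A translator who simplifies sentence structure or chooses shorter, more familiar words would lower both figures. Such counts are only a rough proxy, since readability is also shaped by the text-external factors mentioned above.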
Relationship between Text and Illustrations The relationship between text and illustrations is particularly important in picture books for small children, where pictures are the dominant element. In illustrated books for older children, pictures may also have a central role in supporting the text. There is interaction between words and pictures. Pictures influence the interpretation of the content of the story, and words create a point of view to the pictures (Oittinen, 1993: 113–139, 2000: 100–114, 2001, 2003). Thus, the translator’s interpretation of the story is affected by the pictures, and this interpretation is reflected in the formulation of the translation. Although translators try to make the text and
illustrations cohere, discrepancies are sometimes created if, e.g., the translator has to work on the basis of the verbal text only, or if the translator is provided with new pictures different from the original ones. To improve correspondence between text and pictures, the translator may also deviate from the text of the original (Dollerup, 2003: 88). However, the original balance between text and pictures can be disturbed, if the translator adds such information into the target text which in the original is only conveyed by pictures (see Tabbert, 1991; O’Sullivan, 2000: 287–291). Translating illustrated texts requires specialized knowledge of the text–picture relationship, child readers’ picture reading abilities, and potential cultural differences in the conventions of illustrated books.
Bibliography
Bode A (1996). ‘Vieraskielisiä sanoja taikurin hatusta. Muumikirjojen käännökset saksaan ja slaavilaisiin kieliin.’ In Kurhela V (ed.) Muumien taikaa. Tutkimusretkiä Tove Janssonin maailmaan. Suomen Nuorisokirjallisuuden Instituutin julkaisuja 20. Helsinki: BTJ kirjastopalvelu. 110–134.
Davies E E (2003). ‘A goblin or a dirty nose? The treatment of culture-specific references in translations of the Harry Potter books.’ The Translator. Studies in Intercultural Communication 9(1), 65–100.
Desmidt I (2003). ‘“Jetzt bist du in Deutschland, Däumling.” Nils Holgersson on foreign soil – subject to new norms.’ Meta. Translators’ Journal 48(1–2), 165–181.
Dollerup C (2003). ‘Translation for reading aloud.’ Meta. Translators’ Journal 48(1–2), 81–103.
Du-Nour M (1995). ‘Retranslation of children’s books as evidence of changes of norms.’ Target. International Journal of Translation Studies 7(2), 327–346.
Even-Zohar B (1992). ‘Translation policy in Hebrew children’s literature: the case of Astrid Lindgren.’ Poetics Today 13(1), 231–245.
Heldner C (1993). ‘Une anarchiste en camisole de force. Fifi Brindacier ou la métamorphose française de Pippi Långstrump.’ Moderna språk 87(1), 37–43.
Klingberg G (1986). Children’s fiction in the hands of the translators. Lund: CWK Gleerup.
Knowles M & Malmkjær K (1996). Language and control in children’s literature. London: Routledge.
Nikolajeva M (1996). Children’s literature comes of age: toward a new aesthetic. New York: Garland.
Nord C (1993). ‘Alice im Niemandsland: die Bedeutung von Kultursignalen für die Wirkung von literarischen Übersetzungen.’ In Holz-Mänttäri J & Nord C (eds.) Traducere navem: Festschrift für Katharina Reiss zum 70. Geburtstag. Studia translatologica ser A, vol. 3. Tampere: University of Tampere. 395–416.
Nord C (2003). ‘Proper names in translations for children: Alice in Wonderland as a case in point.’ Meta. Translators’ Journal 48(1–2), 182–196.
Oittinen R (1993). I am me – I am other: on the dialogics of translating for children. Acta Universitatis Tamperensis, series A, (vol. 386). Tampere: University of Tampere.
Oittinen R (2000). Translating for children. New York: Garland.
Oittinen R (2001). ‘On translating picture books.’ Perspectives. Studies in Translatology 9(2), 109–125.
Oittinen R (2003). ‘Where the wild things are: translating picture books.’ Meta. Translators’ Journal 48(1–2), 128–141.
O’Sullivan E (2000). Kinderliterarische Komparatistik. Heidelberg: C. Winter.
Puurtinen T (1995). Linguistic acceptability in translated children’s literature. University of Joensuu Publications in the Humanities 15. Joensuu: University of Joensuu.
Puurtinen T (1997). ‘Syntactic norms in Finnish children’s literature.’ Target. International Journal of Translation Studies 9(2), 321–334.
Puurtinen T (2003). ‘Genre-specific features of translationese? Linguistic differences between translated and nontranslated Finnish children’s literature.’ Literary and Linguistic Computing 18(4), 389–406.
Shavit Z (1981). ‘Translation of children’s literature as a function of its position in the literary polysystem.’ Poetics Today 4(2), 171–179.
Shavit Z (1986). Poetics of children’s literature. Athens, GA: University of Georgia Press.
Stephens J (1992). Language and ideology in children’s fiction. London: Longman.
Tabbert R (1991). ‘Bilderbücher zwischen zwei Kulturen.’ In Tabbert R (ed.) Kinderbuchanalysen II. Frankfurt: Dipa. 130–148.
Tabbert R (2002). ‘Approaches to the translation of children’s literature.’ Target. International Journal of Translation Studies 14(2), 305–351.
Thomson-Wohlgemuth G (2003). ‘Children’s literature and translation under the East German regime.’ Meta. Translators’ Journal 48(1–2), 241–249.
Chile: Language Situation A Valencia, University of Playa Ancha, Valparaiso, Chile © 2006 Elsevier Ltd. All rights reserved.
Introduction Chile is situated in the southwest of South America. After gaining independence from Spain, it became an independent republic in 1810. Its geography is uneven and mountainous with a surface of 756 626 sq km, of which 379.9 sq km are islands. With 15 116 435 inhabitants (census of 2002), Chile is a country with a stable economy and has become one of the most powerful countries of South America, with great human potential as well as potential natural resources and natural beauty. The official language is Spanish, but there are also minority languages. Autochthonous languages include Mapudungún, Aymara, Rapanui, and Qawásqar, the language of the last representatives of the ethnic group known as Alacalufe. A number of foreign languages are spoken, resulting from successive migrations during the 19th century that were facilitated by the structure of the territory: 4300 km of coastline and innumerable geographical irregularities that generate relatively isolated spaces. All this has influenced the linguistic evolution of the country and, as a consequence, Chile is today a multilingual and pluricultural country. Education is in Spanish; however, second languages are taught, notably English, but also German, French, Italian, Hebrew, and Japanese.
Spanish Spoken in Chile The Spanish spoken in Chile is a dialect of the Spanish language and includes characteristics that are also present in other dialects of Spanish, but it is set apart by the frequency in the use of some forms. Phonologically, Chilean Spanish is, like other South American dialects of Spanish, characterized by seseo (that is, European Spanish /θ/ is pronounced /s/). For example, zapato, ‘shoe,’ is pronounced [sapáto], as opposed to European Spanish [θapáto]. It is also characterized by yeísmo (/ʎ/ is pronounced /ʝ/): caballo, ‘horse,’ is pronounced [kaβáʝo]. Other features found in varieties of Chilean Spanish are the aspiration or elision of /s/ in coda position (e.g., [áhno] for asno, ‘donkey’), and lenition or elision of intervocalic /d/ (e.g., [aláðo] → [alá:o] for alado, ‘winged’).
In morphology and syntax, a distinct feature of Chilean Spanish is the use of definite articles before first names, especially with female names (e.g., la Maria). These are used less frequently with male names (e.g., el Juan), and not with family names (e.g., #la González). In tense-aspect morphology, future tense is often formed with a periphrastic construction with ir a (‘go to’) plus infinitive, e.g., voy a amar, ‘I will love,’ instead of the synthetic amaré. Another relevant phenomenon related to the forms of address is the one known as ‘voseo,’ a phenomenon that, in Chile, has a verbal character: VERB2psing + PRON tú. In this way, the canonical forms “tú cantas, tú quieres, tú ves” [kántas, kjéres, bes] ‘you sing, you like, you see’ are pronounced “tú cantai, tú querí(h), tú veí(h)” [kantái, keríh, beíh]. The use of the pronoun vos is stigmatized. Although voseo does not occur in formal speech, it is quite common in colloquial speech as a sign of ‘familiarity’ between speaker and hearer. Similarly, ustedes replaces vosotros, and os is not used at all.
On the lexical level, Chilean Spanish includes several loanwords from Amerindian languages, e.g., guata, ‘belly,’ chuncho, ‘owl,’ laucha, ‘little mouse,’ and pololo, ‘boy friend’ from Mapudungún; and camanchaca, ‘mist,’ and calato, ‘naked,’ from Aymara. In addition, there are loanwords from European languages such as kuchen, ‘cake,’ pizza, and sandwich. There is an increasing influence of English, especially in sport-slang, economy, and business. Specific uses in informal speech include huevón [weβón], ‘fool,’ used as an emphatic term of address, as well as gallo, ‘cock,’ for ‘man,’ cabro, ‘goat,’ for ‘boy,’ and cabra, ‘she-goat,’ for ‘girl.’ This is why it is not strange that, as a humorist once said, “En Chile los gallos se casan con las cabras” (‘In Chile roosters marry goats’).
Autochthonous Languages Mapudungún is the language of the about 1 000 000-strong Mapuche ethnic group. The smaller groups of Pehuenches and Huilliches also belong to this ethnic group and speak languages similar to Mapudungún. Mapudungún (also called Mapuche or Araucanian) is an SVO language. The Mapuche population is predominantly bilingual Mapudungún-Spanish, but has managed to maintain its linguistic identity, despite the violent process of acculturation and economic control suffered from the 16th century onwards. Mapudungún has been studied since the 17th century. In 1992, Adalberto Salas published Mapuche or Araucanian: phonology, grammar and analysis of texts, the most complete and modern work on this language.
Chilean Aymara is spoken by approximately 40 000 people in small villages in valleys of the mountain range and high plateau strip in the northern part of Chile, at altitudes of 3000–3800 m. Little work has been done on Chilean Aymara. It is an agglutinative language with a large number of affixes. The phonological and lexical variations detected until now indicate that it is a variety of Bolivian Aymara. The Chilean group has developed a form of Aymara-Spanish bilingualism of its own kind. Rapanui, or Pascuense, is an Austronesian VSO language spoken by approximately 2400 speakers on Easter Island, an island about 3700 km off the Chilean coast. The important and enigmatic archaeological history of this island is responsible for a great flow of tourism and multinational research, which favors the multilingual activity of the island’s inhabitants. Bilingual Spanish-Rapanui education is offered in primary and secondary education. Many students continue their studies in universities in continental Chile. There are a number of studies on Rapanui, and there are descriptions of the current language being developed.
Fuegian Languages Of all the languages once spoken in the inhospitable Chilean Patagonia, the languages of the Ona (or Selk’nam) became extinct in 1928, and of the Yámana or Yagan, in 2003; only a small community of about 20 Qawásqar or Alacalufes persists. Some linguistic studies have been completed and others are still in progress, before the inevitable extinction of these languages.
Conclusion The Spanish spoken in Chile is a peripheral dialect in relation to the rest of the Spanish-speaking world. It has evolved by preserving archaisms and developing a creativity with its own rules. It is in contact with two vernacular Indoamerican languages and one Austronesian language. With current globalization, it is also experiencing a notable increase of anglicisms. As for the autochthonous languages, there are initiatives with the Department of Education and university researchers for their study and conservation. It is interesting to note the active participation of the young people of these groups. The promulgation of the Indigenous Law – though a great part of these declarations remain effective only on paper – in some way has given place to the recognition of the autochthonous people. In contact areas, some schools give primary education in minority languages. Alphabets have been created for these languages and several
universities are preparing teachers for a bicultural and bilingual education.
See also: Austronesian Languages: Overview; Mapudungan; Spanish. Language Maps (Appendix 1): Map 49.
Bibliography
Araya G, Contreras C, Wagner C & Bernales M (1973). Atlas lingüístico y etnográfico del Sur de Chile (ALESUCH), I. Valdivia: Universidad Austral de Chile.
Catrileo M (1995). Diccionario lingüístico etnográfico de la lengua mapuche. Santiago: Editorial Andrés Bello.
Catrileo M (2003). ‘El mapudungun de Chile.’ In Valencia A (ed.) Desde el Cono Sur. Homenaje a Juan M. Lope Blanch. Santiago: Sociedad Chilena de Lingüística. 39–48.
Clairis Ch (1976). ‘Esquisse phonologique de l’aymara parlé au Chili.’ La linguistique 3(2), 143–152.
Clairis Ch (1987). El qawasqar. Lingüística fueguina. Teoría y descripción. Valdivia: Estudios Filológicos (Anejo 12).
Dannemann M & Valencia A (1989). Grupos aborígenes chilenos. Su situación actual y distribución territorial. Santiago: Editorial Universitaria.
Gallardo A (1986). ‘Lenguas vernáculas y planificación lingüística.’ Lenguas Modernas [Santiago] 13, 7–16.
Guerra A M, Lagos D, Riffo A & Villalón C (1995). ‘El sintagma nominal del rapanui: Inventario de clases.’ Nueva Revista del Pacífico [Valparaíso] 40, 63–77.
Morales F (1999). ‘Panorama del voseo chileno y rioplatense.’ Boletín de Filología de la Universidad de Chile. Estudios en honor de Ambrosio Rabanales XXXVII, 835–848.
Morales F & Quiroz O (1984–1987). Diccionario ejemplificado de chilenismos y de otros usos diferenciales del español de Chile (4 vols). Santiago: Editorial Universitaria.
Oroz R (1966). La lengua castellana en Chile. Santiago: Editorial Universitaria.
Ortiz H & Saavedra E (2003). La fonética en Chile. Bibliografía analítica 1829–2000. Santiago: Phoné Libros.
Rabanales A (1981). ‘Perfil lingüístico de Chile.’ In Schlieben-Lange B (ed.) Logos semantikos. Studia in honorem Eugenio Coseriu, vol. 5. Madrid: Gredos. 447–464.
Rabanales A (1998). ‘La política lingüística en Chile.’ In Matluck J & Solé C (eds.) La lengua española: pasado, presente y futuro. Austin: University of Texas. 111–120.
Rabanales A & Contreras L (1979, 1990). El habla culta de Santiago de Chile. Materiales para su estudio (vol. I). In Boletín de Filología de la Universidad de Chile, Anejo 2; (vol. II). Bogotá: Instituto Caro y Cuervo.
Rabanales A & Contreras L (1987). Léxico del habla culta de Santiago de Chile. México: Centro de Lingüística Hispánica.
Sáez L (2000). Cómo hablamos en Chile. Ocho aproximaciones. Santiago: Editorial BACH-SOCHIL.
Sáez L (2002). El español de Chile. La creatividad lingüística de los chilenos. Santiago: Editorial BACH-Colección IDEA.
Saez L, Tassara G & Valencia A (eds.) (1996). ‘El pluralismo lingüístico, la educación y el desarrollo nacional.’ In Lingüística y Literatura. Anejo 1.
Salas A (1973). ‘The phonemes of the language of Easter Island.’ RLA [Concepción] 11, 61–66.
Salas A (1992). El mapuche o araucano. Fonología, gramática y antología de cuentos. Madrid: MAPFRE.
Salas A & Valencia A (1988). ‘Fonología del aymara altiplánico chileno.’ Filología y Lingüística [San José de Costa Rica] XIV 2, 119–122.
Salas A & Valencia A (1990). ‘El fonetismo del yámana o yagán. Una nota en lingüística de salvataje.’ RLA [Concepción] 28, 147–169.
Valencia A (1995). ‘Chile.’ In Lopez H (dir.) ALFAL, El español de América. Cuadernos bibliográficos, vol. 6. Madrid: Arco Libros [Contains information from 1843 to 1994.]
Valencia A (2002). ‘Aspectos del habla femenina de Santiago de Chile.’ In Parodi G (ed.) Lingüística e interdisciplinariedad: Desafíos del nuevo milenio. Ensayos en honor a Marianne Peronard. Valparaíso: Ediciones Universitarias. 439–456.
Valencia A (2003). ‘Algunos fraseologismos chilenos.’ In Sánchez J P & Werner R (eds.) Lexicografía y lexicología en Europa y América. Homenaje a Günther Haensch. Madrid: Gredos. 663–681.
Wagner C (1997). ‘Las construcciones con que en el español formal de Chile.’ Estudios Filológicos [Valdivia] 30, 19–27.
China: Language Situation D Bradley, La Trobe University, Victoria, Australia © 2006 Elsevier Ltd. All rights reserved.
China is known in Chinese as Zhongguo ‘middle nation,’ and indeed the cultural influence of China and the Chinese on all its neighbors has been profound. Chinese has been the dominant language of China for millennia, but many other languages are spoken in China. The languages of China fall into ten main groups:
1. Sino-Tibetan, including Sinitic (Han Chinese, the majority nationality) and Tibeto-Burman (17 nationalities, more than 100 languages) throughout the country;
2. Manchu-Tungus (Manchu, Xibo, Ewenk, Oroqen and Hezhe nationalities, seven languages) mainly in the northeast;
3. Mongol in the north central region (5.5 nationalities, including half of the Yugur, seven languages);
4. Turkic (6.5 nationalities and seven languages, with the other half of the Yugur);
5. Austro-Asiatic or Mon-Khmer in the far southwest (Wa, Bulang, De’ang, and Jing nationalities, plus some small unclassified groups, more than 12 languages);
6. Tai-Kadai in the southwest (nine nationalities, more than 20 languages);
7. Miao-Yao in the southwest central area (Miao, Yao, and She nationalities, 27 languages);
8. Indo-European (two nationalities, Tajik and Russian, plus creole Portuguese in Macao);
9. Korean; and
10. Austronesian (the Gaoshan nationality, a dozen languages indigenous to Taiwan, with few speakers on the mainland).
The historical linguistic connection between Chinese and the Tibeto-Burman languages is fully established (Benedict, 1972; Coblin, 1986; Thurgood and LaPolla, 2003). The relationships within the Tibeto-Burman language family are also widely researched (Bradley, 1979; Matisoff, 2003). Chinese scholars suggest a close link of Sino-Tibetan languages with the Tai-Kadai languages, but this has been disproven by Benedict (1975), who instead linked the Tai-Kadai, Austronesian, and Miao-Yao families. The ‘Altaic hypothesis’ links Turkic, Mongol, Manchu-Tungus, and Japanese-Korean, although this is dubious. Marginal to China are the Mon-Khmer or Austro-Asiatic groups in the southwest and the Indo-European groups. Conversely, Japan, Korea, and Vietnam were long under the cultural influence of China.
The Chinese Language There is a long tradition of philological and epigraphic work on Chinese, notably Karlgren (1957); the best summary is Sagart (1999). There are also many excellent studies of Chinese syntax, especially Mandarin, notably Li and Thompson (1981). Chinese linguists have worked particularly on phonology and lexicon.
Sáez L (2000). Cómo hablamos en Chile. Ocho aproximaciones. Santiago: Editorial BACH-SOCHIL.
Sáez L (2002). El español de Chile. La creatividad lingüística de los chilenos. Santiago: Editorial BACH-Colección IDEA.
Sáez L, Tassara G & Valencia A (eds.) (1996). ‘El pluralismo lingüístico, la educación y el desarrollo nacional.’ In Lingüística y Literatura. Anejo 1.
Salas A (1973). ‘The phonemes of the language of Easter Island.’ RLA [Concepción] 11, 61–66.
Salas A (1992). El mapuche o araucano. Fonología, gramática y antología de cuentos. Madrid: MAPFRE.
Salas A & Valencia A (1988). ‘Fonología del aymara altiplánico chileno.’ Filología y Lingüística [San José de Costa Rica] XIV 2, 119–122.
Salas A & Valencia A (1990). ‘El fonetismo del yámana o yagán. Una nota en lingüística de salvataje.’ RLA [Concepción] 28, 147–169.
Valencia A (1995). ‘Chile.’ In Lopez H (dir.) ALFAL, El español de América. Cuadernos bibliográficos, vol. 6. Madrid: Arco Libros [Contiene información de 1843 a 1994.]
Valencia A (2002). ‘Aspectos del habla femenina de Santiago de Chile.’ In Parodi G (ed.) Lingüística e interdisciplinariedad: Desafíos del nuevo milenio. Ensayos en honor a Marianne Peronard. Valparaíso: Ediciones Universitarias. 439–456.
Valencia A (2003). ‘Algunos fraseologismos chilenos.’ In Sánchez J P & Werner R (eds.) Lexicografía y lexicología en Europa y América. Homenaje a Günther Haensch. Madrid: Gredos. 663–681.
Wagner C (1997). ‘Las construcciones con que en el español formal de Chile.’ Estudios Filológicos [Valdivia] 30, 19–27.
China: Language Situation
D Bradley, La Trobe University, Victoria, Australia
© 2006 Elsevier Ltd. All rights reserved.
China is known in Chinese as Zhongguo ‘middle nation,’ and indeed the cultural influence of China and the Chinese on all its neighbors has been profound. Chinese has been the dominant language of China for millennia, but many other languages are spoken in China. The languages of China fall into ten main groups:
1. Sino-Tibetan, including Sinitic (Han Chinese, the majority nationality) and Tibeto-Burman (17 nationalities, more than 100 languages) throughout the country;
2. Manchu-Tungus (Manchu, Xibo, Ewenk, Oroqen and Hezhe nationalities, seven languages) mainly in the northeast;
3. Mongol in the north central region (5.5 nationalities, including half of the Yugur, seven languages);
4. Turkic (6.5 nationalities and seven languages, with the other half of the Yugur);
5. Austro-Asiatic or Mon-Khmer in the far southwest (Wa, Bulang, De’ang, and Jing nationalities, plus some small unclassified groups, more than 12 languages);
6. Tai-Kadai in the southwest (nine nationalities, more than 20 languages);
7. Miao-Yao in the southwest central area (Miao, Yao, and She nationalities, 27 languages);
8. Indo-European (two nationalities, Tajik and Russian, plus creole Portuguese in Macao);
9. Korean; and
10. Austronesian (the Gaoshan nationality, a dozen languages indigenous to Taiwan, with few speakers on the mainland).
The historical linguistic connection between Chinese and the Tibeto-Burman languages is fully established (Benedict, 1972; Coblin, 1986; Thurgood and LaPolla, 2003). The relationships within the Tibeto-Burman language family are also widely researched (Bradley, 1979; Matisoff, 2003). Chinese scholars suggest a close link of Sino-Tibetan languages with the Tai-Kadai languages, but this has been disproven by Benedict (1975), who instead linked the Tai-Kadai, Austronesian, and Miao-Yao families. The ‘Altaic hypothesis’ links Turkic, Mongol, Manchu-Tungus, and Japanese-Korean, although this is dubious. Marginal to China are the Mon-Khmer or Austro-Asiatic groups in the southwest and the Indo-European groups. Conversely, Japan, Korea, and Vietnam were long under the cultural influence of China.
The Chinese Language There is a long tradition of philological and epigraphic work on Chinese, notably Karlgren (1957); the best summary is Sagart (1999). There are also many excellent studies of Chinese syntax, especially Mandarin, notably Li and Thompson (1981). Chinese linguists have worked particularly on phonology and lexicon.
Outsider linguists often say that the Han Chinese speak seven distinct, mutually unintelligible languages:
• Beifanghua (‘northern speech’) (known as Mandarin in English) in the north and west;
• Wu around Shanghai;
• Min in and around Fujian and in Taiwan;
• Yue (known as Cantonese in English, from the name of Guangdong Province) around Hong Kong, Guangzhou city, and most of Guangdong;
• Hakka (also known as Gejia in Mandarin), widely scattered across the southeast; and
• the inland varieties Gan and Xiang.
Some scholars further subdivide Mandarin, and there may be some additional varieties of Chinese, such as Waxianghua in southwestern Hunan. From a Chinese perspective, these varieties are all historically part of the Han Chinese majority group, all share the same zhongwen ‘middle writing’ Chinese character writing system, and all speak some fangyan of the Han Chinese yuyan. Fangyan is usually translated as ‘dialect’ and yuyan as ‘language’, but their meanings are broader. Among the minority languages, many historically related and structurally similar mutually unintelligible speech varieties are also classified together. Norman (1988) and Ramsey (1987) provide an excellent introduction to the differences between the various varieties of Han Chinese. These are particularly great in phonology and lexicon, with lesser morphosyntactic differences. Many scholars suggest that the Han Chinese varieties diverged in the 7th century A.D., but some separated considerably earlier. The non-Mandarin speech varieties of many cities also continue to be written, especially for local opera and folklore; additional local characters are used for words that do not exist in Mandarin. Lexically, every non-Mandarin speech variety is full of borrowings from other varieties, especially from Mandarin, including several strata from Mandarin of different periods.
Sociolinguistics of Chinese
In the long history of China, diglossia developed gradually, with Confucian and other texts from the mid-first millennium B.C. remaining the standard for official literary use up to 1911; this came to be called wenyan ‘writing sound’. Alongside this, popular spoken-language literature, known as baihua ‘white speech’, came into use over the last millennium. During most dynasties, the spoken language of daily administration has been a Mandarin dialect, and this came to be known as guanhua ‘official speech’; however, other varieties of Chinese were
used locally. During the Republic period (1911–1949), the Mandarin of Beijing was standardized and made the official written language as well, under the name guoyu ‘nation language’. In the 1950s, Beijing Mandarin was again codified, and in 1958 it was made the national language, under the name putonghua ‘common speech’. The lexical differences between guoyu, which is still the official language in Taiwan, and putonghua, which has constitutional status as the national language in the People’s Republic of China (PRC), are substantial and growing. The PRC has made major educational achievements in spreading putonghua alongside other spoken varieties of Chinese and minority languages and literacy in Chinese characters. There is also a standard romanization for putonghua called pinyin ‘phonetic writing’, introduced in 1958. Pinyin was originally intended to replace characters, but this plan was soon dropped. The sociolinguistic situation in Hong Kong is interesting. Cantonese is diglossic, with a literary and formal spoken high version that is quite different from the spoken low version. Since mid-1997 Cantonese, Mandarin, and English have had co-official status. Knowledge and use of Mandarin have spread widely since the late 1980s, but are still far behind most of China; traditional rather than simplified characters and Cantonese continue in general use.
Writing Systems The Chinese character writing system, with nearly four millennia of history, is a strong unifying characteristic for Han Chinese of all spoken varieties. Because it is logographic, sound change does not require orthographic reform. However, in 1958 a major reform was introduced in China; this simplified more than 36% of the frequently used characters and restricted the use of all but 6196 characters (Bradley, 1991: 310). The traditional full-form characters are still used in Hong Kong and Taiwan and by many overseas Chinese, but simplified characters are now used in Singapore and Malaysia. Conversely, full-form characters have been returning to use in China since the 1990s, especially for contacts with overseas Chinese, study of ancient literature, and greater status and formality. Some of the other languages of China also have logographic scripts; some are directly derived from traditional Chinese characters, with additions following the same principle of combining a radical (semantic element) plus a phonetic. These include the Zhuang orthography as well as the better-known Japanese kanji ‘Chinese characters’, Vietnamese chu nôm, pre-hangul Korean, and so on. Distinct logographic
systems include Naxi and four separate traditions of Yi. Mongol (Peripheral Mongolian) and Manchu were long written with Sogdian scripts. Tibetan is written with an Indic-derived script; one Mongol emperor tried unsuccessfully to replace Chinese characters with a modified Tibetan script in 1269. Turkic and other language groups in western China switched from Sogdian to Arabic scripts in the 13th century A.D. Some minority languages had scripts (mainly romanizations) devised by Christian missionaries. The missionary Samuel Pollard used roman letters, letters from Pitman shorthand, and invented letters in his script, mainly used for Miao and some Yi languages. In Pollard scripts, consonants are large, vowels are small, and the position of the vowel relative to the consonant indicates the tone: above for high, and so on. From the 1950s onwards, new romanizations for many minority languages were produced following the principles of pinyin. Some were created to replace existing orthographies (for Lisu, Lahu, Miao, Wa, and other Christian scripts, along with various Arabic scripts); others were created for then-unwritten languages. When Russian linguists were active in China, some Cyrillic letters were used; but after 1958 most were removed. Since the 1980s, some groups have gone back to their pre-1950 scripts, especially Arabic in Xinjiang and Christian romanizations and others in the southwest (Lisu, Miao, etc.). Conversely, the Yao pinyin-based script of the 1950s became the basis of a unified Yao orthography also used by Yao outside China.
Minority Languages The classification of non-Han ethnic groups was a gradual process. Four historically important groups – the Mongols, Manchu, Tibetans, and Hui – were recognized during the Republic period (1911–1949); the PRC flag has one large star representing the Han majority and four smaller stars representing these four groups. During the 1950s, a further fifty national minorities were recognized, and in 1978 the Jinuo nationality was added. Like the Han Chinese majority, some of the national minorities are linguistically highly composite; the Yi nationality, for example, includes six language clusters and many more distinct languages. In some cases, the ethnic classification links unrelated or distantly related languages. For example, the Lajia of eastern Guangxi, who speak a Tai-Kadai language (Lakkia), are included in the Yao nationality; the Yao of Hainan are included in the Miao nationality; and the Nu nationality includes speakers of four languages – two of which, Raorou (Zauzou) and Nusu, are Burmese-Yi languages, one of
which, Dulong (Drung), is also the language of the Dulong nationality, and the last of which, Anong (Nung), is related to Dulong and to Rawang in Burma (Myanmar). For maps and linguistic classification, see Wurm et al. (1987/1991) or Moseley and Asher (1994). Ramsey (1987) and Bradley (1994, 2001) have discussed language policy for minorities. Briefly, each recognized nationality has the constitutional right (but not the obligation) to maintain and develop its language and culture, and for each nationality one or more ‘standard’ varieties has been selected and codified. The overall policy is transitional bilingual education in the first few years of primary school. In some minority autonomous areas, local government has decided to maintain their language even up to university level, but this is the exception. With the transition to a market economy, education in and use of minority languages is decreasing. Table 1 gives census population figures for all recognized groups in China. The large minority population increases between 1982 and 2000 reflect minority re-identification of Sinicized people who do not speak their traditional minority languages. The Unclassified category includes many very small groups who have not been recognized as separate national minorities and has gradually decreased as some such groups have been assigned to existing nationalities. For example, the 40 000 Kucong of south central Yunnan, who speak a language quite closely related to Lahu, were amalgamated into the Lahu nationality in 1987; and the 5000 Laomian in southwestern Yunnan, whose language is only distantly related to Lahu but who live in Lahu areas and are mostly bilingual in Lahu, were amalgamated into the Lahu nationality in 1990. The Hui nationality is mainly composed of Muslim speakers of local varieties of Chinese and is usually said not to have a language. 
However, there is one group of Hui in Gansu who speak a distinctive variety of verb-final Chinese, presumably as a result of influence from languages of the area. Another group of Hui in Tongren County, Qinghai Province, are locally called Kangjia and speak a Mongol language closely related to Bao’an (Bonan). A third group of Hui in southern Hainan speak a Chamic Austronesian language, Tsat. The Mongol nationality also shows how ethnic identity and language are not necessarily linked. This group includes the Mongols, whose language is similar to the speech of Mongolia and of the Buriats in Russia. It also includes several southwestern groups who claim descent from Mongol armies sent to China during the Yuan (Mongol) Dynasty. The ‘Mongols’ of southwestern Sichuan are in fact Moso, speakers of a language classified as eastern Naxi, whose speakers are included in the Naxi nationality in adjacent Yunnan. The ‘Mongols’ of Tonghai County in south central Yunnan speak a Yi language. The Yugur nationality is another composite group; about half speak a Turkic language and half speak a Mongolic language.
Table 1 Population of China by ethnic group (numbers shown in thousands; 1982, 1990, and 2000 census data). Row labels: Han (Chinese); Zhuang; Manchu; Hui; Miao; Uighur; Tujia; Yi; Mongol; Tibetan; Buyi; Dong (Kam); Yao; Korean; Bai; Hani; Kazakh; Li; Dai; She; Lisu; Gelao; Dongxiang (Santa); Lahu; Shui; Wa; Naxi; Qiang; Tu (Monguor); Mulao; Xibo; Kirgiz; Daur; Jingpo; Maonan; Salar; Bulang; Tajik; Achang; Pumi; Ewenk; Nu; Jing (Vietnamese); Jinuo; De’ang; Bao’an; Russian; Yugur; Uzbek; Menba (Monpa); Oroqen; Dulong; Tatar; Hezhe (Nanai); Gaoshan; Luoba; Foreign-born citizens; Unclassified.
Contact Languages Since the 1950s putonghua has increasingly become the lingua franca throughout China. It is used as a second Mandarin variety by those whose first speech variety is another kind of Mandarin, as a second Chinese variety by 300 million whose first speech variety is a non-Mandarin variety, and by an increasing proportion of minorities. The trend is for the urbanized or educated minority group members not to speak their traditional languages at all. Several contact languages have developed in China. One such language is spoken in Wutun and two nearby villages in northeastern Qinghai; it is structurally mainly Mongol, with extensive postpositional agglutinative morphology. However, the speakers follow Tibetan Buddhism (as most Mongols did from the Yuan Dynasty onward) and have long been in close contact with Tibetans and Han Chinese; therefore, the speech of Wutun contains a very large proportion of Tibetan and Chinese lexicon and also shows some influence from the syntactic structures of Tibetan and Chinese. Another is the Portuguese creole of Macao, now moribund; it has extensive Cantonese lexicon, in addition to the various strata of lexicon from previous contact in Malacca (now in Malaysia) and India.
Language Endangerment Several large nationalities are rapidly losing their languages. This includes the very large Manchu nationality, former rulers of the Jin and Qing dynasties, with about 20 very old speakers left in remote areas. In the southeast, the language of the She nationality in Guangdong Province now has fewer than a thousand speakers, or less than 0.1% of this group. The Gelao nationality, mainly in Guizhou, has fewer than 1% speaking any of the various Gelao languages. Similarly, less than 1% of the Tujia nationality in Hunan and nearby speak either of the two traditional Tujia languages (Northern and Southern Tujia). The Xixia language, formerly spoken in what is now Ningxia and Gansu, has been extinct for over 700 years; many other languages have doubtless disappeared, their descendants now speaking Chinese.
Many of the smaller groups included within composite national minorities speak endangered languages. Bradley (forthcoming) lists 90, including 49 Tibeto-Burman, 16 Tai-Kadai, 7 Miao-Yao, 7 Tungusic, 6 Mon-Khmer, two Turkic, two Mongol, and one Indo-European creole. One example is the Sanie just west of Kunming in Yunnan (Bradley, 2005; Bradley and Bradley, 2002; Bradley et al., 1999). In 76 villages there are 17 320 Sanie, classified as Yi nationality; some of the Han Chinese population of Kunming doubtless also have unremembered Sanie ancestry. All 8000 remaining speakers are bilingual in Yunnanese Mandarin, and most of these are adults. The long-term prognosis for many non-Han languages of China is bleak. This is especially so for those that are spoken only in less remote areas of China, are not designated as the standard language for their nationality, and do not extend into adjacent countries where they also have official status (Mongol, Korean, Russian, Vietnamese, Tajik, Kazakh, Kirghiz, and Northern Uzbek). See also: Chinese; Language Education Policy in China; Minorities and Language; Sino-Tibetan Languages. Language Maps (Appendix 1): Map 81–83.
Bibliography
Benedict P K (1972). Sino-Tibetan: a conspectus. Cambridge: Cambridge University Press.
Benedict P K (1975). Austro-Thai language and culture. New Haven, CT: HRAF Press.
Bradley D (1979). Scandinavian Institute of Asian Studies monograph series 39: Proto-Loloish. London/Malmö: Curzon Press.
Bradley D (1991). ‘Chinese as a pluricentric language.’ In Clyne M G (ed.) Pluricentric languages. Berlin: Mouton de Gruyter. 305–324.
Bradley D (1994). ‘Building identity and the modernisation of language: minority language policy in Thailand and China.’ In Gomes A (ed.) Modernity and identity: Asian illustrations. Bundoora: Institute of Asian Studies, La Trobe University for Asian Studies Association of Australia. 192–205.
Bradley D (2001). ‘Language policy for the Yi.’ In Harrell S (ed.) Perspectives on the Yi of southwest China. Berkeley/Los Angeles/London: University of California Press. 195–214.
Bradley D (2005). ‘Sanie and language loss in China.’ International Journal of the Sociology of Language 173.
Bradley D (Forthcoming). ‘East and Southeast Asia.’ In Moseley C (ed.) Encyclopedia of endangered languages. London: Routledge.
Bradley D & Bradley M (eds.) (2002). Language endangerment and language maintenance. London: Routledge Curzon.
Bradley D, Bradley M & Li Y X (1999). ‘Language maintenance of endangered languages in central Yunnan, China.’ In Ostler N (ed.) Endangered languages and education. Bath: Foundation for Endangered Languages. 13–20.
Coblin W S (1986). A Sinologist’s handlist of Sino-Tibetan lexical comparison. Nettetal: Steyler.
Karlgren B (1957). ‘Grammata serica recensa.’ Bulletin of the Museum of Far Eastern Antiquities 29.
Li C & Thompson S A (1981). Mandarin Chinese: a functional reference grammar. Berkeley/Los Angeles/London: University of California Press.
Matisoff J A (2003). Handbook of Proto-Tibeto-Burman. Berkeley/Los Angeles/London: University of California Press.
Moseley C & Asher R (eds.) (1994). Atlas of the world’s languages. London: Routledge.
Norman J (1988). Chinese. Cambridge: Cambridge University Press.
Ramsey S R (1987). The languages of China. Princeton, NJ: Princeton University Press.
Sagart L (1999). The roots of Old Chinese. Amsterdam/Philadelphia: John Benjamins.
Thurgood G & LaPolla R (eds.) (2003). Sino-Tibetan languages. London: Routledge.
Wurm S A W, T’sou B K & Bradley D (eds.) (1987/1991). Language atlas of China (2 vols). Hong Kong: Longmans.
China: Religions
V Goossaert, CNRS-EPHE, Paris, France
© 2006 Elsevier Ltd. All rights reserved.
Chinese Religion Scholars have devised several models to describe the Chinese religious situation at various periods of history. One model that well describes the situation since the Song dynasty (960–1279) considers that – with the exception of religions (Islam, Christianity) that arrived in China from the outside and could not become fully integrated because of exclusive claims of truth – all religious practices, beliefs, and organizations in China belong to a single system, best called Chinese religion. This organic, pluralist, and nonhierarchical system integrates traditions of individual salvation (meditation and bodily techniques, morality, and spirit-possession techniques, including spirit writing), communal celebration (cults of local saints and ancestors), and death rituals together with three institutionalized religions, Buddhism, Taoism, and Confucianism. The sectarian tradition that formed around the 15th century has sometimes been described as China’s fourth religion; it has distinctive scriptures and a theology of its own, but is actually well integrated socially into the larger religious system. The basic principles of Chinese religion were already attested in late antiquity (before the advent of the Qin empire in 221 B.C.E.). They include a cosmology describing an organic universe without either notions of separate matter and spirit or an external creative force that evolves continuously according to objective rules explained by symbolical models (yin and yang, five phases, trigrams, ...); a sacrificial liturgy for ancestors and local territorial gods as well as dead heroes and related purification rules (zhaijie); practice of spirit possession; and exorcisms to cure illnesses and improper possessions.
During the Han dynasty (206 B.C.E.–220 C.E.), Confucianism, the heir of the sacrificial religion of the nobility, was constituted as state religion; Taoism was fully formed as a religion with a distinctive theology, liturgy, clergy, and churchlike organization; and Buddhism began to arrive from Central Asia. During the medieval and Tang periods (3rd to 9th centuries), Taoism, Buddhism, and Confucianism alternated between competition and cooperation before the affirmation of their coexistence and equal orthodoxy was firmly established as an imperial doctrine. All three institutionalized religions attempted to take control of numerous local cults and their spirit mediums with limited success. During the Song period, within the context of a strong demographic and
economic expansion, local communities constituted around the cults of their saints, as well as corporate lineages devoted to the cult of their ancestors, emerged as the most common and powerful religious institutions and ushered in a tremendous rise of vernacular religious arts (notably opera and other performing arts in the framework of temple festivals) and techniques (spirit writing, which allows an unmediated direct access to deities and saints). Meanwhile, the clerical institutions of the three religions remained extremely prestigious and respected as they provided scriptural (elaborating theological justifications for local cults) and liturgical services to such local, lay religious communities. This pluralist system remained in place until a sudden shift in religious policies during the 20th century, inspired by the introduction of Western notions, restrained the extent of officially tolerated religious activities. The Republican (1912–) and Communist (1949–) regimes distinguished five authorized although controlled religions (Buddhism, Taoism, Islam, Catholicism, and Protestantism) but actively suppressed superstitions, notably local cults (the Nationalist government in Taiwan has strongly liberalized its stance since the 1970s). This suppression, though, has been much less effective than expected, as the strong renewal throughout the Chinese world, notably on the mainland since the 1980s, shows. The three institutionalized religions are each precisely defined by a distinctive clergy, a canon (scriptures, which define orthodoxy), a liturgy, and training centers (monasteries, academies, where the canon is kept and the clergy is trained and ordained). The institutions defined by these four characteristics can be properly named Buddhism, Taoism, and Confucianism. They do not exist in isolation but serve Chinese religion as a whole. 
This whole is not syncretism as it is too often described; the three institutionalized religions are expected to coexist but not mingle, and people do not confuse them. It is an organic, pluralistic religious system. Also, it might prove very useful to dispense totally with the much-abused notion of ‘popular religion’ or ‘folk religion’ and indeed with the very word popular. The many independent communities that form the social structure of Chinese religion choose within the shared repertoire of beliefs and practices and the services offered by the three religions those that give them relevant meaning, and their choices hinge on socioeconomic, ideological, and theological considerations much more complex than an elite/popular dichotomy can suggest. The role of Confucianism, Buddhism, and Taoism within Chinese religion then is not to exist as exclusive
institutions providing their members a way to salvation, as the 19th-century Western concept of religion would imply, but rather to transmit their tradition of practice and make it available to all, either as individual spiritual techniques or liturgical services to entire communities. In late imperial times and well into the 20th century, only clerics and a small number of retired laymen (jushi) would declare themselves Buddhist or Taoist, but very few Chinese indeed would never engage in Buddhist or Taoist practices. The wide acceptance and official status of the doctrine of the three religions’ coexistence made them complementary. The services offered by the three religions and their clerics were comparable in many respects, sometimes causing collaboration and/or competition, while each of the three also provided unique services.
The Three Religions’ Approach to Languages and Texts The birth of Chinese writing is closely associated with religious practices, as the earliest available instances of Chinese writing are divination records, the jiagu wen. These texts were written on tortoise breastplates and mammal bones that had been heated after a question was asked of ancestors or other gods, and the answer was read in cracks caused by the heating: both question and answer were subsequently noted on the used breastplate or bone and kept in royal archives. Written language served to keep track of divinations, then, but was not instrumental in the divination process itself. Among the various institutionalized religions that developed in China, whether indigenous (Confucianism, Taoism) or imported (Buddhism) (see Buddhism, Tibetan), we might contrast those essentially based on oral teaching and liturgy, that is, Confucianism and Buddhism, with Taoism, which is fundamentally predicated on written texts. This might come as a surprise since both Confucianism and Buddhism are well known for their extensive canons, exegetical tradition, and high valuation of reading and study. However, both these religions describe their own history as initially characterized by oral transmission before reaching a stage at which writing texts became necessary, and only then as a way to safeguard the tradition. Confucius himself claimed to have merely written down the texts that he, like any gentleman, had learned as a child. These texts, which became the scriptures/classics (jing), that is, authoritative standards for orthodoxy within Confucianism, are fundamentally linked to liturgy. They include the Book of Odes (Shijing), essentially odes and hymns to be sung during rituals; the Book of Documents (Shujing) recording the grand
rituals performed by past kings; the Yijing, a divination manual; and liturgical manuals. A broadly similar process, but under very different circumstances in India, occurred in Buddhism, in which the teachings of the Buddha were reportedly transmitted orally and only noted down during a council when various versions began to diverge too widely. It was only much later, within the early so-called Mahayana (great vehicle) networks, that a devotion to texts as objects formed, linked to a divinization of texts themselves identified as Buddhas and also to specific revelation techniques. On the other hand, Taoism very early developed a specific theology of writing, whereby written characters in their primordial form (of which the commonly used characters were merely imitations) are considered fundamental elements in the structure of the universe, embodying primordial truths, hence the name zhenwen (transcendent writs) for scripts used in the construction of Taoist ritual altars. Similarly, talismans (fu) are construed as the true names of various cosmic forces, and the proper writing of talismans gives the priest command over such forces. Talismans and transcendent writs are not pronounced but only written. Thus, in contrast to Confucian and Buddhist liturgies in which the crucial act is the proper recitation of texts and hymns, Taoist liturgy hinges on written communication with forces of the Dao. The priest, after purification, writes the various memorials, requests, and other communications according to precise forms, rules, and models (writing mistakes are punished by the gods); burns them (fire is the main channel of communication between this and the other world); and silently in meditation rises to submit the requests himself in audience with Heaven. This liturgy of the sacrifice of scriptures (which within Taoism replaces the sacrifice of meats of common religion, see Schipper, 1974) has been well attested since the 2nd century C.E.
and remains central to Taoist liturgy in the 21st century. Devotional practices linked to written characters found their first known expression in Taoist community rules (the ‘Hundred eighty rules of the Supreme lord’, probably 2nd or 3rd century C.E.) and much later developed into a practice known as xizi, ‘Care for written characters’, embedded into common religious ethics of the early modern period (16th to 19th centuries). Xizi adepts consider throwing away a paper or other material with any written characters on it (not only scriptures, but any text) as a sin; they treat all such material with care, pick up those discarded by others, and regularly burn them in a clean furnace before bringing the ashes to the sea or a river. Buddhist influence, notably through the Lingbao revelations (early 5th century C.E.), has caused Taoism
to adopt the oral recitation of scriptures (different modes of chanting exist according to the exact nature of the text, see Boltz, 1996) as a part of rituals and to adopt notions (of Indian origin) of sacred and efficacious sounds proper to the Buddhist tradition. The Buddhist tantric tradition of dharani, texts whose recitation provokes a spiritual effect, was also adopted in China well beyond Buddhist circles, and dharani texts are found engraved on many supports and are recited in case of danger (Strickmann, 1996). As a consequence of the adoption of Buddhist notions of efficacious sounds, Buddhist and Taoist clerics have to pronounce (in some cases orally, in others silently) incantations/spells, zhou (see Sawada, 1984), that ensure purity and avoid demonic contamination in many precise circumstances (eating, urinating, getting to sleep). Moreover, Chinese religious practice, including the sectarian tradition, values mantras, that is, secretly transmitted efficacious words, either in proper Chinese or in (pseudo-)Sanskrit (see Sanskrit). On the other hand, the Taoist focus on the written word has not diminished because of Buddhist influence, and Chinese Buddhism has even adopted the Taoist practice of written communication with the otherworld. Large Buddhist rituals since at least the Song period have also included sending written petitions to the Buddhas. Writing and speaking are then two distinct, parallel modes of communication with the divine, the former more Taoist oriented and the latter more Buddhist and Confucian oriented, although all traditions and communities use both media. Ordinary devotees address the deities they pray to either orally in the vernacular (no specific language of prayer has developed) or, with the help of a clerical specialist, by writing and burning a formal request (shu).
Scriptures and Revelations
The language of the Confucian and Taoist scriptures is not fundamentally different from that of other contemporary texts. Confucian scriptures were used in educating children (who learned them by rote) and formed the curriculum of the state and private schools that developed gradually, notably under the Song, in relation to the state civil service examinations. The style of these scriptures was thus widely imitated, and citations from them are found throughout oral and written discourse to the present day. Taoist scriptures, many of which were transmitted only to initiates, did not enjoy such a status as models of writing and speech, yet they also exerted a deep influence on poetry, notably the Shangqing texts (revealed during the late 4th century C.E.), whose rich, flowery, and often difficult metaphors and mystic vocabulary informed later secular poetry. Aesthetic judgments on poetic and calligraphic style and on doctrinal content are deeply interconnected in the transmission and diffusion of all Chinese religious texts.
The status of Buddhist scriptures is different because these had to be translated from very different languages, most of them Indo-European. Translation strategies varied, from using the existing Taoist lexicon (which facilitated adoption but caused confusion), to transliterating original words, to eventually creating, mostly from existing Chinese words, a new Chinese Buddhist vocabulary (Zürcher, 1991). Part of this vocabulary, including a few transliterations from Sanskrit and other Indo-European languages, has found its way into everyday language. The translation of Buddhist texts was a huge enterprise lasting from the 2nd to the 13th centuries, with the imperial state and rich patrons sponsoring whole translation institutes where native speakers of Indian or Central Asian languages and Chinese monks collaborated to produce word-for-word translations and then readable final texts. Major texts were translated several times by different teams, and the best, most readable versions eclipsed the others. Incidentally, in the process of making technical Buddhist dictionaries and transliterating names and words, Chinese scholars developed ways to classify syllables according to initials and finals. At the same time as thousands of Buddhist texts were translated, others, the apocrypha, were written in Chinese while pretending to be translations, and a good deal of the scholastic activity of the Buddhist establishment consisted in compiling canons and sorting apocrypha from true scriptures according to philological as well as theological criteria. 
The history of translation of Islamic and Christian texts into Chinese, at later periods, bears comparison with the great Buddhist translation enterprise, although the more limited impact of these exclusive religions did not allow new words coined by translators to become everyday vocabulary. While the Buddhist and Confucian canons were basically closed during the Song period, scriptures of Taoism, local cults, and sectarian traditions continued to appear in large numbers. One major feature of Chinese religion is the openness of revelation. Spirit possession is extremely common, and it is widely accepted that although certain forms of possession by certain spirits are dangerous or not desirable, possession is an authentic means of communication with deities and therefore of accessing truth and obtaining grace. Official religion during antiquity institutionalized possession, but this was no longer the case for Confucianism after the Han or for Buddhism. On the other hand, possession continues to this day to be practiced within the confines of local cults,
devotional groups, and sectarian communities; spirit mediums are not organized as a clergy but are trained by Taoists and work for local communities. Temple spirit mediums are usually regarded with disdain and sometimes hostility by clerics, but they play a crucial role throughout China in answering personal queries (difficult choices, righting wrongs) and healing illnesses. When possessed, they may write talismans to help, protect, or cure the patient, but they also often act as oracles, speaking with the voice of the god to answer the question asked. The spirit mediums working for spirit-writing groups are different and are often considered to have a higher status. When possessed, they hold a forked object, usually made of wood, over a tray of sand or ashes, or simply over a table, and write characters that are then interpreted and noted by an assistant. Books revealed by such means are extremely numerous; they are edited, printed, and freely distributed (as a pious act) by the groups who received them. They include morality books, shanshu, in which deities lay out the common ethics of Chinese religion and tell tales of retribution for good or evil acts (on the language aspects of shanshu, see Bell, 1996); manuals of immortality techniques, in which immortals directly guide their adepts in their quest for self-cultivation; medicinal recipes; and poetry and other narratives. Such books have been produced by the thousands since the Song, with a sharp increase around the 17th century.
Classical and Vernacular Language in Religious Practice
In China, as in many other cultures, the use of texts and language style differentiates the religious practice of various social classes and groups. As a general rule, an explicit contrast is drawn between two kinds of specialist. On one side are the clergies (Buddhist, Taoist, Confucian), that is, specialists sharing a nationwide training and ordination system and practicing a (supposedly unified) liturgy, with the same texts, music, and chanting/reciting techniques throughout the country. On the other are vernacular specialists, totally embedded in local village culture and not related to larger clerical institutions. The former speak (during the course of the ritual) the national, classical language, whereas the latter use only the vernacular local dialect (see Chinese). Such an opposition, best known through the Taoist case, where specialists are divided into classical-speaking and vernacular-speaking priests (see notably Schipper, 1985), is similarly observed in the Buddhist and Confucian contexts. It should be emphasized that the classical/vernacular dichotomy is both real in the sense that it is an explicit category for both specialists and laypeople (it is crucial in this context that people do not
understand what priests say when speaking in classical language) and an ideological construct hiding much more complex realities. First, the classical language used, supposedly unique throughout the Chinese world, actually varies very much from one region to the next and is often a classical/archaic form of the local language, rather than a language understandable in other parts of China. Second, rituals actually mix classical and vernacular language; Buddhist and Taoist rituals are composed of many different rites, some very solemn with hymns that have not changed for centuries and some in which improvisation is possible, and the style is much closer to spoken language. Clerics also frequently sing ballads or tell devotional stories in the vernacular, fully understood by the audience. Largely, the same rhetorical dialectics and complementarity between the more prestigious classical and the more vulgar vernacular can be observed in the opera. Chinese opera, which formed under the Song (on the basis of more ancient and not well known antecedents), is intimately linked to Chinese religion and local cults, since temple festivals were the major venue for opera shows. Large temple festivals included both clerical (Buddhist, Taoist, Confucian) rituals within the temple and opera shows outside, both being offered to the local gods. Many opera plays have religious themes (although there is no distinction between profane/sacred themes), and some performing arts have a direct ritual efficacy, such as exorcist plays or puppet shows. Opera also mixed songs in classical language with more vernacular dialogues.
Preaching
Even though the classical/vernacular dialectics valorize the former, it would be wrong to think that vernacular language was universally considered a debased medium for communication with deities or transmitting religious truths in Chinese religious circles throughout history. First, the Song period witnessed a rise, which has not abated since, of the genre of recorded conversations, yulu, of spiritual masters, Confucian, Buddhist, and Taoist. Yulu are supposed to be verbatim records of question-and-answer exchanges between a master and his/her disciples as actually spoken, that is, in the vernacular. Some of them have actually been edited and cannot truly reflect oral usage, yet such texts are among our best witnesses of spoken Chinese language for the early modern period. The valorization of vernacular language in such texts is linked to a spiritual movement to enhance the living master, his/her actions, and everyday words, rather than the scriptures, as the main guide to truth and transcendence: such
ideas were formulated by Quanzhen Taoism, Neo-Confucianism, and Chan Buddhism, the latter having expressed the most radical statements to this effect, with a (rhetorical) rejection of scriptures as useless and a glorification of ‘teaching without words’ (Gardner, 1991; Berling, 1987). The late Ming (16th and early 17th centuries) move to use vernacular language in literature (notably novels) was also a religious movement with similar motivations, aiming at retrieving the language Confucius spoke. A second reason why some religious specialists strove to use the vernacular both in print and in speech was the need to preach. Because the three religions do not have exclusive lay organizations and share the status of orthodox teachings, they do little proselytizing. Clerics and activists, however, have been active throughout history in spreading their message in both print and speech. Late imperial Confucian activists wrote manuals aimed at helping those who were required by law to recite an imperially written text on morality (the ‘Sacred edicts’, shengyu) to villagers twice a month. As the sacred edicts, in terse written classical style, were hardly intelligible when recited aloud, these activists wrote texts in various shades of vernacular aimed at explaining them and providing simple examples (Mair, 1985). Buddhist and Taoist preaching (see Preaching) was usually somewhat different, being less tersely didactic and relying more on performing arts techniques. Medieval (5th to 10th centuries C.E.) manuscripts found in Dunhuang include such Buddhist preaching texts (bianwen), alternating vernacular speech, rhymed songs, and large images to be shown to the audience. Taoists also developed songs (notably daoqing, ‘Taoist feelings’ ballads) to be sung by clerics and/or professional performers during festivals, promising salvation and calling for conversion. 
Devotional literature, telling the stories of the saints and the retribution of evil in the prosimetric style alternating prose and rhymed songs and known as baojuan, has been extremely widespread since the 16th century. Baojuan performers are invited during temple festivals or in the houses of the large families to recite their texts, a meritorious, pious, and at the same time enjoyable performance that can last for nights on end.
See also: Buddhism, Tibetan; Chinese; Preaching; Sanskrit; South Asia: Religions.
Bibliography
Bell C (1996). ‘A precious raft to save the world: The interaction of scriptural traditions and printing in a Chinese morality book.’ Late Imperial China 17(1), 158–200.
Berling J A (1987). ‘Bringing the Buddha down to earth: notes on the emergence of Yü-lu as a Buddhist genre.’ History of Religions 27, 56–88.
Boltz J M (1996). ‘Singing to the spirits of the dead: a Daoist ritual of salvation.’ In Yung B, Rawski E S & Watson R S (eds.) Harmony and counterpoint, ritual music in Chinese context. Stanford, CA: Stanford University Press. 177–225.
Dean K (1993). Daoist ritual and popular cults of Southeast China. Princeton, NJ: Princeton University Press.
Gardner D K (1991). ‘Modes of thinking and modes of discourse in the Sung: some thoughts on the Yü-lu (‘recorded conversations’) texts.’ Journal of Asian Studies 50(3), 574–603.
Lopez D S (ed.) (1996). Religions of China in practice. Princeton, NJ: Princeton University Press.
Mair V (1985). ‘Language and ideology in the written popularizations of the sacred edict.’ In Johnson D, Nathan A J & Rawski E S (eds.) Popular culture in late imperial China. Berkeley: University of California Press. 325–359.
Sawada M (1984). Chūgoku no juhō. Tokyo: Hirakawa.
Schipper K (1974). ‘The written memorial in Taoist ceremonies.’ In Wolf A P (ed.) Religion and ritual in Chinese society. Stanford, CA: Stanford University Press. 309–324.
Schipper K (1985). ‘Vernacular and classical ritual in Taoism.’ Journal of Asian Studies 45(1), 21–51.
Strickmann M (1996). Mantras et mandarins. Le bouddhisme tantrique en Chine. Paris: Gallimard.
Yang C K (1961). Religion in Chinese society. Berkeley: University of California Press.
Yü C (2001). Kuan-yin: the Chinese transformation of Avalokitesvara. New York: Columbia University Press.
Zürcher E (1991). ‘A new look at the earliest Chinese Buddhist texts.’ In Shinohara K & Schopen G (eds.) From Benares to Beijing: essays on Buddhism and Chinese religion in honour of Prof. Jan Yün-hua. Oakville, Ontario: Mosaic Press. 277–304.
China: Scripts, Non-Chinese
M Bender, The Ohio State University, Columbus, OH, USA
© 2006 Elsevier Ltd. All rights reserved.
Soon after 1949, influenced by policies already instituted in the Soviet Union, the new government of the People’s Republic of China decided to promote the recognition of ethnic minority peoples on a scale unprecedented in Chinese history. Groups with fully functional, widely used traditional scripts include the Uygur, Kazak, Mongols, Koreans, and Tibetans (Zhou, 2003: 280). The Yi, Naxi, and Dai had writing systems used only by shamans, priests, or monks. In the pre-1949 era, several scripts were created by Western missionaries (such as Samuel Pollard, Alfred Liétard, and Père Paul Vial) for some divisions of the Miao, Jingpo, Lisu, Hani, Lahu, and Wa (Va) and for the Yi subgroup known as Sani. Other scripts, used by the Zhuang, Shui, and Dong, were based on vernacular usages of Chinese characters and had very limited implementation. Miao epic poems from Guizhou Province mention an ancient, but lost, script. An obsolete form of phonetic script, different in form from both Chinese and Yi scripts (see below) and consisting of approximately 180 graphs, dates to the Qing dynasty (1644–1911). A traditional ‘women’s script’ (nüshu) of unknown origin (some have suggested a minority link) was in recent use in Jiangyong County, Hunan Province, for writing letters and stories in verse. One goal of the early era of minority recognition was the creation of scripts for those groups that did not have a tradition of writing. By the early 1950s, Chinese linguists, often working with local representatives, had begun to devise scripts for a number of the groups, especially those living in the border areas of the southwest. Initially, Cyrillic and roman alphabets were utilized in creating new scripts. In some cases, the International Phonetic Alphabet (IPA) was also used, sometimes in combination with the other systems. Ultimately, roman letters (sometimes with IPA symbols) prevailed in the official scripts, which numbered approximately 13. 
The trend toward roman letters was strongly influenced by the adoption of the Pinyin romanization system, developed between 1956 and 1958, for writing the sounds of Standard Chinese. Over the ensuing years, script creation was affected by policy revisions, national and local budget allocations, the rate of publication and distribution of script materials, and the degree and effectiveness of local implementation and use. In a few cases, the created scripts crossed borders,
as in the case of the Jingpho (Jingpo) script used in Yunnan and contiguous ethnic areas in Myanmar. Among the first languages targeted for the creation of a new script was Northern Zhuang (Zhuang). The Zhuang, China’s largest ethnic minority group, number approximately 15 million and live in the Guangxi Zhuang Autonomous Region and surrounding areas in the southwest. A major challenge was finding a system that could represent the sounds of the various Zhuang dialects, some of which are mutually unintelligible. The key question, which also arose in the creation of many of the other minority scripts, was which dialect would be the basis for the writing system. At the urging of a Russian specialist, the Wuming dialect was chosen as the standard. A romanized script was first proposed in the early 1950s, but, again on Soviet advice, a system combining roman letters, Cyrillic letters, and certain IPA symbols was adopted in the mid-1950s. This script was put to limited use until it was displaced by an all-roman system in 1982. As with several other southern minority scripts, consonant letters placed at the end of a morpheme were used to represent the tones. Examples of the Pinyin-based script are as follows: vunz (‘person’), employing the second tone (31); menh (‘slow’), employing tone six (33); and laeuj ndok guk (‘tiger bone wine’), employing an alternate usage of the third tone (55) and tone seven (35). Dictionaries, school texts, signs on government offices, and folk literature utilizing the script appeared throughout the 1980s, though Chinese continues to be the primary script for most Zhuang speakers. By the mid-1950s, Russian linguists were also advising Chinese scholars either to adapt Cyrillic systems already developed in the Soviet Union as scripts for small, seminomadic transborder groups such as the Ewenki and Oroqen, or to create new Cyrillic-based scripts for larger or more sedentary groups of Mongolian-related speakers in northern and northeastern China. 
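The tone-letter convention described above for the all-roman Zhuang script can be illustrated with a minimal Python sketch. It covers only the final letters attested in the examples quoted in the text (z, h, j, and the final k of ndok and guk); reading that final k as the tone-seven spelling is an inference from the example, and the function names are hypothetical. The official script has further tone spellings that are not listed here.

```python
# Tone letters attested in the examples quoted in the text. The table is
# deliberately incomplete: the official script has more tone spellings.
ZHUANG_TONE_LETTERS = {
    "z": 2,  # vunz
    "j": 3,  # laeuj, in 'laeuj ndok guk'
    "h": 6,  # menh
    "k": 7,  # ndok, guk (assumed reading of the example)
}


def zhuang_tone(syllable):
    """Return the tone number signalled by the final letter, or None if the
    final letter is not one of the tone letters listed above."""
    if not syllable:
        return None
    return ZHUANG_TONE_LETTERS.get(syllable[-1].lower())


def tag_tones(phrase):
    """Pair each space-separated syllable with its tone number."""
    return [(syllable, zhuang_tone(syllable)) for syllable in phrase.split()]


if __name__ == "__main__":
    for phrase in ("vunz", "menh", "laeuj ndok guk"):
        print(tag_tones(phrase))
```

Because the tone is carried by an ordinary letter, the script needs no diacritics, which is one practical consequence of the design the text describes.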
One case was Daur, spoken by approximately 100 000 people in eastern Inner Mongolia and Heilongjiang Province (with several thousand speakers in Xinjiang as well). Traditionally, the Daur had no script of their own, and a Cyrillic-based script was created for them. One concern was the representation of vowel chronemes (contrastive vowel length), which was solved by doubling the vowel letter (for example, the Russian ‘a’ was written twice, as ‘aa’, for a long IPA ‘a’) (Zhou, 2003: 179–180). This script, however, was never widely implemented. By 1980, a new romanized Daur script was in limited use: it was tried out in schools and used in a Daur/Chinese dictionary and in the publication of some folk songs
and stories. Some features of the earlier Cyrillic-based script were carried over into the Pinyin-based romanization, including the representation of chronemes, as exemplified by the words baatur (‘hero’), maamaa (child’s term for ‘horse’), and taawudaar (‘number five’ in a series). Similar conventions are used in some other scripts, including a mixed IPA and Pinyin script later developed for representing Oroqen, as with axaaj (‘elder brother’) and axaxaan (‘little brother’).
The case of the Yi nationality of the Liangshan Yi Autonomous Prefecture, Sichuan, is unique in that a whole new syllabary was created out of traditional logographic and syllabic graphs. Beginning in the 1950s, a romanized script was created to write the sounds of a variety of Sichuan Yi (Northern Yi) (Bradley, 2001: 207). Three of the four tones in the Shynra dialect are represented by final consonants: t (55), x (44), and p (21). The mid-level tone (33) is unmarked. An example of the script from a textbook (Li and Ma, 1981: 2) is as follows:
Speaker A: Nop jiet nyi zzyrmuo bat?
Your home also well (particle)?
Is your home well also?
Speaker B: Zzyrmuo ggehni jjoddu ap jjo.
happy healthy worries not have.
No worries, all happy and healthy.
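As a rough illustration of the romanized Yi tone marking just described, the following Python sketch strips a final tone letter from a single syllable and reports the tone contour given in the text. The function name is hypothetical, and the sketch assumes its input is one syllable at a time; unsegmented polysyllabic words such as zzyrmuo would need to be split first, which is not attempted here.

```python
# Final tone letters and contours as given in the text for the Shynra dialect.
YI_TONE_LETTERS = {"t": "55", "x": "44", "p": "21"}
YI_UNMARKED_TONE = "33"  # the mid-level tone carries no letter


def yi_tone(syllable):
    """Return (base, contour) for one romanized Northern Yi syllable.

    Sentence punctuation is stripped and the syllable is lower-cased before
    the final letter is inspected.
    """
    s = syllable.strip("?.,!").lower()
    if s and s[-1] in YI_TONE_LETTERS:
        return s[:-1], YI_TONE_LETTERS[s[-1]]
    return s, YI_UNMARKED_TONE


if __name__ == "__main__":
    # Monosyllabic words taken from the textbook dialogue quoted above.
    for word in ("Nop", "jiet", "nyi", "bat?", "ap", "jjo."):
        print(word, yi_tone(word))
```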
The romanized script, however, failed to gain popularity, in part because of local attachment to the native writing system (once used by approximately 2% of the population, mostly ritual specialists). In the 1970s, efforts were made to develop a syllabary based on traditional graphs. By 1980, a new syllabic script was being taught in many schools throughout the Liangshan Prefecture and neighboring Ninglang Prefecture in northern Yunnan. The syllabic script was based on 819 traditional graphs and the Shynra dialect pronunciation (as spoken in Xide County). Newspapers, journals, textbooks, poetry collections, folklore, and other writings have been published in the script, though most modern Nuosu authors write and publish in Chinese, the medium of officialdom. Although the script was welcomed in Northern Yi-speaking areas, it was inadequate to meet the needs of speakers from other dialect areas, where the traditional scripts vary greatly. Thus, attempts have been made in Yunnan and Guizhou to create a ‘supra-X’ system based on approximately 2000 traditional characters. It has yet to be fully developed, and local conventions are in use in several areas. For instance, scholars of the Sani subgroup in Yunnan have published a number of multilinear folk literature texts combining local versions of traditional Yi characters, IPA symbols to represent Yi sounds, and Chinese graphs. Another factor working against script
acceptance in some areas of Yunnan and Guizhou is a decline in speakers of local Yi dialects.
Around 1905, Samuel Pollard created a nonroman alphabetic script for a Miao subgroup (Hua Miao) in eastern Yunnan. The script was modified several times in the 1950s, but is still in use in some areas today (Zhou, 2003: 317–323). By the late 1950s, Chinese linguists had devised romanized scripts for several Miao groups in Guizhou and contiguous areas. As with Yi, the varying degree of intelligibility between dialects has made it difficult for researchers to achieve the elusive goal of a universal Miao romanization system. Although written Chinese is the major script in all the Hmong- (Miao-) speaking areas, in some places the romanized Miao scripts have been implemented in limited ways. In the case of the Eastern dialect (which represents a group of three subdialects in eastern Guizhou Province), dictionaries, grammars, and folk literature texts have appeared in bilingual editions. In a grammar designed to introduce the Eastern romanized script, eight speech tones are represented by consonants attached to each morpheme: b (33), x (55), d (24), l (22), t (44), s (23), k (53), f (21). An example of first tone usage is dab (‘reply’), of the second tone dax (‘come’), of the third dad (‘length’), and so forth. The work also includes several long folk stories written in the script. The following sample is from Dail Daib Pik Vongx (The dragon maiden) (Wang, 1985: 224). The text was presented in a multilinear format with a word-for-word Chinese gloss (here replaced by English) followed by a translation:
nongd nangx ngax hek jud, nenx ghax dad laib khangb lol ket tob tob:
need eat meat drink wine, need then grab a calabash come shake lightly lightly:
When he needed to drink wine, he just took the calabash and shook it lightly, saying:
“lol ngax khangb! lol ngax dit ob bad daib nangx khangb!
“come meat calabash! come meat for us father–son eat!
Calabash, bring on the meat, bring on the meat for this father and son to eat!
lol jud khangb! lol jud dit ob bad daib hek khangb!”
come wine calabash! come wine for us father–son drink calabash!”
Calabash, bring on the wine! Bring on the wine for this father and son to drink!”
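The eight tone letters of the Eastern Miao romanization lend themselves to a small lookup table. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the tone numbers four to eight are assumed to follow the order in which the grammar lists the letters (the text confirms this ordering only for the first three tones, through dab, dax, and dad).

```python
# Tone letters and contours in the order listed in the text. Numbering the
# tones by list position is an assumption for tones four to eight.
MIAO_TONES = [("b", "33"), ("x", "55"), ("d", "24"), ("l", "22"),
              ("t", "44"), ("s", "23"), ("k", "53"), ("f", "21")]
TONE_BY_LETTER = {letter: (number, contour)
                  for number, (letter, contour) in enumerate(MIAO_TONES, start=1)}
PUNCTUATION = ".,:;!?\"'“”‘’"


def split_tone(token):
    """Return (base, tone_number, contour) for one written syllable.

    Surrounding punctuation is removed first; if the last letter is not a
    tone letter, the number and contour are None.
    """
    word = token.strip(PUNCTUATION).lower()
    if word and word[-1] in TONE_BY_LETTER:
        number, contour = TONE_BY_LETTER[word[-1]]
        return word[:-1], number, contour
    return word, None, None


def annotate(line):
    """Apply split_tone to every space-separated token of a line."""
    return [split_tone(token) for token in line.split()]


if __name__ == "__main__":
    print(annotate("dab dax dad"))
    print(annotate("lol jud khangb!"))
```

Run on a line of the sample text, the function shows how every written syllable ends in a letter that is read as tone rather than as a consonant.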
Since 1949, revisions have been made to a number of the traditional scripts, and over a dozen official scripts (and numerous unofficial ones) have been created for various ethnic groups in China. However, social and economic pressures will be a determining factor in the continued use of the minority scripts. The needs of information technology offer new
avenues for the development and use of romanized scripts. By the late 1990s, computerized script programs had been developed for all the major traditional written scripts and several of the newer ones. For example, researchers have developed a romanized Uyghur (Uygur) script that is more compatible with on-line usage than the Arabic-based traditional script (Zhou, 2003: 138). Nevertheless, without increased promotion, it is likely that many of the newer scripts will eventually exist primarily as tools for scholars and symbolic markers of ethnicity.
See also: China: Language Situation.
Bibliography
Bradley D (2001). ‘Language policy for the Yi.’ In Harrell S (ed.) Perspectives on the Yi of Southwest China. Berkeley: University of California Press. 195–213.
Chiang W W (1995). We two know the script; we have become good friends: Linguistic and social aspects of the women’s script literacy in southern Hunan, China. New York: Rowman and Littlefield Publishers, Inc.
Enhebatu (1983). Da Han xiao cidian (DAOR NIAKAN BULKU BITEG). Hohhot: Neimenggu renmin chubanshe.
Han Y & Meng S (1993). Olunchunyu Hanyu duizhao duben (Oroqen language and Han language bilingual reader). Beijing: Zhongyang minzu xueyuan chubanshe.
He L & Xiong Y (eds.) (1999). Yunnan shaoshu minzu wenzi gaiyao (Introduction to the scripts of the Yunnan ethnic minorities). Kunming: Yunnan minzu chubanshe.
Huang J M (2003). Yiwen wenzi xue (Study of Yi writing). Beijing: Minzu chubanshe.
Huang Y (1983). Zhuangzu geyao gailun (Introduction to Zhuang nationality folksongs). Nanning: Guangxi minzu chubanshe.
Li M & Ma M (1981). Liangshan Yiyu huihua liubaiju (Six hundred conversation sentences in Liangshan Yi language). Chengdu: Sichuan minzu chubanshe.
Mackerras C (1995). China’s minority cultures: Identities and integration since 1912. New York: St. Martin’s Press.
Ramsey S R (1981). The languages of China. Princeton, NJ: Princeton University Press.
Wang C (1985). Miaoyu yufa, Qiandong fangyan/Benx wix faf hveb hmub, wangf cunb deef hxad (Miao language grammar, Eastern Guizhou dialect). Beijing: Guangming ribao chubanshe.
Yunnan minzu xueyuan minzu yuyan wenxuexi (Yunnan Nationalities Institute, Ethnic Languages Literatures Department) (eds.) (1997). Yunnan minzu yuyan wenxue lunwenji (Collected articles on Yunnan ethnic languages and literatures). Kunming: Yunnan minzu chubanshe.
Zhou M (2003). Multilingualism in China: The politics of writing reforms for minority languages 1949–2002. Berlin: Mouton de Gruyter.
China: Writing System
D P Branner, University of Maryland, College Park, MD, USA
© 2006 Elsevier Ltd. All rights reserved.
The Chinese characters are a unique form of writing in the modern, integrated world. They have been the primary script across East Asia, and they have proven to be fascinating to the entire world. In China, their two major groups of forms are those of ancient inscriptions:
1. bronze (jīnwén)
2. oracle bone or ‘OBI’ (jiǎgǔwén, qìwén)
3. seal (zhuànwén, zhuànshū, xiǎozhuàn)
and the newer ink-brush styles:
1. clerical (lìshū)
2. ‘running’ and ‘grass’ (xíngshū, cǎoshū)
3. modern square script (kǎishū)
In Chinese society today there is considerable knowledge about ancient scripts outside of the community of scholars of paleography. All known styles of Chinese script, including those recently rediscovered through archaeological excavation, are the subject of modern calligraphy practice, one of the prime vehicles of self-cultivation in Chinese traditional secular culture.
Ancient Inscribed Styles
Seal
Seal script has been used on signature seals since ‘Warring States’ times (475–221 B.C.E.), and was the main form used in public inscriptions in the Qín (221–206 B.C.E.) and Hàn (206 B.C.E.–220 C.E.) dynasties. The Qín standardization of the script is thought to have involved seal script, although it is less well known that Qín itself cultivated a highly conservative character structure in an effort to show that it was the worthy successor of the Zhōu. Seal script has long been used on official seals, and so is associated with political legitimacy.
Figure 1 A page from a Sòng dynasty dictionary of seal forms, with text in square script.
Seal script has been the focus of traditional paleography and graphology, because it is documented in a 1st-century dictionary, the Shuōwén jiězì, the oldest native work of its kind. But seal script represents a highly evolved stage of development and is not ideal for research on the early history of Chinese writing. It is formalistic and most of its graphs are broadly isomorphic with modern square script, even though seal script differs in having lines of even weight, rounded corners, and rounded line-ends (see the example in Figure 1, from a Sòng dynasty (960–1279) compendium). Characters are written so as to conform to an invisible box and look balanced and neat.
Bronze
Some of the most structurally conservative ancient graphs tend to be those cast on early bronze vessels, which abound from the period of the Shāng (1766–1122) and Western Zhōu dynasties (1122–221). Bronze script was a monumental form; the sacrificial vessels on which it appears were often costly and had ritual functions in the ancestral temples of important families. Many short inscriptions (5–20 characters) exist, recording little more than personal names and a few formal phrases associated with the commissioning of the vessel. An example appears in Figure 2, from a Western Zhōu guǐ vessel. But some longer texts (as long as several hundred characters) also exist and are a major source for historical and linguistic study of this period. Their formal nature, however, must never be forgotten; this was not a medium or a context that lent itself to casual writing or to the recording of natural speech.
Bronze script graphs tend to have both regular angles and smooth curves, with lines of varying thickness and separated by irregular spaces. Their size and shape may vary considerably within a single inscription, in contrast to more standardized graphs of later periods. Texts are generally written in columns, top to bottom, continuing right to left, and, in the best inscriptions, the columns have a largely linear appearance, with the graphs about equally spaced. But graphs themselves vary a great deal in form; this is true of both the appearance of primary elements and the structure of compound graphs.
Ancient bronzes have been known and handled continuously since their own time (there is a tradition in the received literature that possession of certain vessels of state betokens the possession of the throne). The collection and study of their inscriptions continues in a tradition unbroken since the Sòng dynasty, although modern archaeological finds and linguistic method have increased our knowledge incalculably in the past half century. The form shown here is taken from a modern rubbing (tàpiàn).
Figure 2 A Western Zhōu bronze inscription.
Figure 3 A fine oracle bone inscription on scapula.
Bone
The oldest Chinese writing now known and definitively identified as writing is the oracle bone inscriptions (OBI), dating mainly from the time of the last 12 kings of the Shāng dynasty, or the period ca. mid-14th century to 1122 B.C.E. Tortoise plastrons and carabao scapulas were used in ritual divination by or for the kings (some were also inscribed with related information), and finally they were either stored or discarded. They were first identified around the turn of the 20th century, and so their study is entirely modern. The great majority of authentic Shāng bones come from an area near Ānyáng in China’s Hénán Province, the site of the later Shāng capital; there are also smaller finds from other parts of north China, as well as pieces dating from the early Western Zhōu. Inscriptions are generally classified into five very rough stylistic periods. Bone and scapula are harder materials than the soft clay on which bronze inscriptions were actually prepared, and OBI are rougher in appearance, with simpler structure and lines that are more jagged (see Figure 3 for an unusually legible and large inscription). Most OBI finds are fragmentary, and even complete texts are usually short and highly compact. For these reasons, the identification of graphs by linguistic context is often much more tentative than with bronze inscriptions.
Ink-Brush Styles
Although there is internal evidence that OBI and bronze graphs must have developed from earlier ink or painted forms, no undisputed specimens have yet been unearthed. From the 4th and 3rd centuries B.C.E., however, there are a number of examples of cursive ink writing on bamboo splints (zhújiǎn) and silk (bóshū). Important recent finds come from sites at Bāoshān and Guōdiàn in Húběi Province. The texts represented in these documents tend to vary a good deal from received versions, and character structure seems to have been highly variable. The Shuōwén records a tradition that, in the fragmentation of ‘China’ prior to the Qín unification, each politically independent region developed its own style of writing. However, the variation is not simply geographical; we sometimes see the same character written quite differently even within the same document. Materials of this type were not well known in traditional times, although there is a Sòng dynasty dictionary, the Hànjiǎn, that contains some related forms.
Cursive brush forms tend to look much more abstract than the older bone and bronze forms. Horizontal strokes are often greatly thickened, and vertical strokes tend to trail off unless they turn and continue horizontally. A sample of a recent find at Guōdiàn appears in Figure 4. In all brush styles, whether cursive or formal, the brush allows control of the amount of ink released, and the thickening or narrowing of lines becomes an essential part of the aesthetics of the script, in a way that was not seen in the older inscriptional styles. However, from the Hàn onward, we see inscriptions using ink-brush styles, because paper (first widely used in the Hàn) could be pasted on stone and the ink characters on it carved there. That provided a way for medieval Chinese governments to promulgate standard texts in standard script: they erected official editions of the classical texts that were a part of their intellectual basis for rule, and visitors could make rubbings and take them home for study and emulation.
Clerical
True, so-called ‘clerical’ script is the term applied to cursive writing from the Qín and Hàn, but these are clearly a development from Warring States cursive tradition. From the Hàn we have a number of stone steles carved from ink models. The Cáo Quán Stele, dated 185 C.E. and unearthed in the 16th century, is one of the most striking examples of the style; see Figure 5. The characters are square, with many of the basic stroke types formed in a regular and distinctive way. Writers are evidently paying close attention to control of the brush-tip. Structurally, most of the characters are already very close to modern square script, in spite of stylistic differences.
Figure 4 A fragment of Warring States brushwork.
‘Running’ and ‘Grass’ Styles
The ‘running’ (literally, ‘walking’) and ‘grass’ styles are both characterized by ellipsis of discrete structural elements and the flowing connection between various remaining lines of the graph. Grass style is the more elliptical of the two and, contrary to popular belief, is the older, dating from the Western Hàn (206 B.C.E.–8 C.E.); it was originally a cursive development of clerical script. It remains a favored style for calligraphy, and the elegance of fine specimens arouses a visceral aesthetic response in some aficionados, although the effect is often lost on viewers who cannot read Chinese, and is even baffling to them. In more extreme examples, even people who appreciate calligraphy cannot always make out every character. The (relatively moderate) work of the 14th-century Kānglǐ Zǐshān is shown in Figure 6. Running style is generally much more legible for the modern reader; its forms bear a much closer relationship to modern square script, and it developed later, in the 3rd century, probably as a casual style for use in letters. A rubbing of the running script of the 8th-century Lǐ Yōng is shown in Figure 7. When the running and grass styles are compared with modern square script, the viewer becomes aware that a kind of systematic vocabulary of ellipsis is involved, and that vocabulary has been made the subject of a number of books, including one in English by Fred Fang-yü Wang (1958).
Figure 5 From a Hàn stele inscription in clerical script.
Modern Square Script and the Simplification Movement
Modern square script is the style in common use today. It seems to have developed by the 3rd century, and rapidly became standard for official writing. No later than the end of the 6th century there were dictionaries circulating that exhibited standard square script forms, no doubt for emulation. Square script does not flow; lines are discrete and generally straight. Vertical lines are supposed to go straight up and down, and horizontal lines are supposed to be almost level, or rising very slightly from left to right. Changes of line direction are clearly expressed and the narrowing and flaring of the different parts of each line are exactly prescribed. Figure 8 shows an example from the 7th-century tomb inscription of Sū Cí. Even though it is obviously not contemporary, almost all the graphs are instantly readable by any literate person today.
Figure 6 A sample of Mongol-era ‘grass’ script.
Figure 7 A sample of 8th-century ‘running’ script.
Figure 8 A sample of 7th-century square script.
In the 20th century, both the Nationalist (KMT) and Communist governments of China promulgated official lists of ‘simplified’ characters (jiǎntǐzì). Many of these graphs are essentially running script forms given a square script cast, and almost all of them have been in popular handwritten use for a very long time, at least regionally and often universally. When the Chinese civil war wound down to a stalemate in the 1950s, the Communist-governed mainland was promoting a simplified script as an aid to proletarian literacy and as a step toward the eventual eradication of the characters, while the KMT-governed island of Taiwan was promoting traditional script, as commonly seen in printed texts from the previous several centuries. In some cases, systematic principles are involved in how characters are simplified, but those principles were never fully implemented. For example, the traditional form is officially simplified to , to , to , to , to , and so on. The example of is paralleled by , , and being simplified to , , and , and the case of is paralleled by , , and being simplified to , , and . But there are many other uncompleted sets: , , and remain unchanged in the official simplified character set, as do , , and .
As of the turn of the 21st century, the simplified and traditional character sets have not been fully integrated in popular usage, even though the actual differences between the two are relatively superficial. Typically, a week of review is all that is needed for someone literate in one character set to master the other set. But many native speakers of Chinese still claim only to be able to read one or the other set, a declaration that the neutral observer suspects is really about political or regional allegiance rather than any intrinsic incompatibility between the two. Software developers have quietly made it possible for the two sets to be integrated in most computer applications.
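The integration of the two sets in software can be illustrated with a short sketch in Python. It is a minimal example under stated assumptions, not a description of how any particular application works: the five character pairs in the hypothetical table `TRAD_TO_SIMP` are well-known correspondences picked by hand, and the function names are invented for illustration. Real converters rely on far larger tables and must also deal with cases in which one simplified graph corresponds to more than one traditional graph, which a one-to-one table like this cannot represent.

```python
# Hypothetical, deliberately tiny mapping (traditional -> simplified).
# It is NOT an official conversion table; it only illustrates the idea.
TRAD_TO_SIMP = {
    "馬": "马",  # 'horse'
    "門": "门",  # 'gate'
    "語": "语",  # 'speech, language'
    "國": "国",  # 'country'
    "學": "学",  # 'to study'
}

# The reverse table is safe to build here only because this sample
# mapping is one-to-one.
SIMP_TO_TRAD = {simp: trad for trad, simp in TRAD_TO_SIMP.items()}


def to_simplified(text: str) -> str:
    """Replace each traditional graph found in the table; keep all others."""
    return text.translate(str.maketrans(TRAD_TO_SIMP))


def to_traditional(text: str) -> str:
    """Replace each simplified graph found in the table; keep all others."""
    return text.translate(str.maketrans(SIMP_TO_TRAD))


if __name__ == "__main__":
    print(to_simplified("國語"))
    print(to_traditional("学"))
```

Graphs that are identical in both sets, or that are simply absent from the table, pass through unchanged, which reflects the point made above that the two sets overlap for most of their inventory.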
As for full alphabetization and the elimination of the characters, as of this writing it remains official Communist policy, but seems to this writer unlikely ever to take place because of the dominating place of the script in Chinese identity. Enthusiasts of alphabetization include a number of prominent Western scholars. DeFrancis (1950) and Serruys (1962) give good resumes of the early history of the simplification and alphabetization movement.
Other Matters of Form and Punctuation
Form of Connected Writing
Until recent decades, Chinese has conventionally been written in straight columns, top to bottom and continuing right to left. (Bronze inscriptions occasionally run in columns from left to right, often in only one of a pair of inscriptions facing each other on the lid and body of a single vessel.) Over the course of the 20th century, it became usual to write in rows from left to right. Taiwan and conservative overseas newspapers made the switch in the late 1990s. In Taiwan and Hong Kong, much literature continues to use the traditional format. In the mainland, only scholarly works of classical literature and philology still frequently follow the traditional format. Chinese characters in modern print are generally all the same size and occupy an (invisible) square, with the same amount of space between them. Unstressed syllables and the one important Mandarin subsyllabic morpheme (the rhotacizing ‘suffix’ -r 兒/儿) always take up the same amount of space as ‘full’ syllables.
Punctuation
Modern Chinese, since the early 20th century, has been punctuated with a set of symbols derived from Western usage: the period (。 or .), the comma, colon, semicolon, question mark, quotation marks, and exclamation point (, : ; ? !). There is also a special ‘half-comma’ (、) used for separating items in a list (and not rendered orally as a pause, as a comma is in English). The half-comma is derived from the shape of a simple dot as conventionally written with a brush. There are traditional quotation marks 「 and 」 (with rotated forms in columnar format) and brackets, including 〈 and 〉 (again with rotated forms in columnar format). It is usual for all of these punctuation marks to be given the same amount of space as a normal character, although with the advent of computers and wider exposure to other written languages, many typographical practices are in rapid coevolution. There was historically no such thing as italicization, but its availability in word processors has introduced it to the Chinese printed world since the late 1980s.
Traditional printed books usually had no punctuation, although individual readers might ‘point’ the text by hand, as needed, using a circle or half-comma. However, Warring States bamboo manuscripts often have ‘half-comma’-like marks at the ends of what we can identify as phrases and sentences. Traditional teachers would sometimes mark alternate tonal readings of a given character by putting a dot or a small circle in one or another of its corners (so-called quānpò, ‘circling the “broken” reading’). Ancient bronze inscriptions and bamboo texts made frequent use of a ‘doubling mark’ (chóngwén), which is the numeral ‘two’ (二) added to the bottom right corner of a character to indicate that the character was to be read twice – sometimes a series of characters could be so marked, meaning that a phrase was to be read twice. Figure 2 contains one such example in the upper left corner. In later texts, broadly speaking from the Hàn onward, the doubling mark is given its own space as a character, fitting into its own invisible square. By the 6th century we see a more elaborate doubling mark used, which survives in handwriting today.
Ligatures
Ligatures (héwén) are very common in ancient texts. We find two common characters pressed into the space of one, often written in such a way as to share strokes, and the doubling mark added at bottom right. There is no firm evidence as to how these were to be read; possibly two morphemes were to be read, or possibly there was some kind of spoken contraction involved. In Warring States bamboo texts there is a second kind of ligature, where a single compound character has the doubling mark but is meant to be read as two different morphemes. For example, the graph bìng ‘together’ with the doubling mark added is understood to be read as bìnglì ‘standing together’; the graph itself is historically two 立 written side by side, and the use of the doubling mark to indicate bìnglì requires that fact to be known to the reader.
The Internal Linguistic Structure of the Chinese Script
The Chinese script is famous for being nonphonetic. Although phonetic principles have plainly determined its development, it remains highly ‘defective’ in the technical sense: the sounds of speech are represented inconsistently, and are often totally concealed.
Figure 9 Pictographs: 馬, 立, 大 (top) and 口, 目, 月 (bottom).
Figure 10 Symbol-graphs: 一, 二 (top) and 五, 上 (bottom).
Primary Elements
Chinese characters are of two basic types: primary elements and compound characters made from two or more primary elements. A primary element is one we recognize as being indivisible, so that no discrete linguistic meaning attaches to smaller components. Most primary elements are derived from ‘pictographs’ (xiàngxíng, ‘form-depicting’ characters), which stylize some physical object. Examples (from ca. 10th-century B.C.E. bronze inscriptions) are shown in Figure 9: the characters shown represent common Chinese words for ‘horse,’ ‘to stand,’ ‘large’ (in the top row) and ‘mouth,’ ‘eye,’ and ‘moon, month’ (bottom row). In their modern square script forms, the graphs are 馬, 立, 大 (top) and 口, 目, 月 (bottom). Another traditional category of primary elements is the zhǐshì, graphs that ‘indicate a matter.’ They are abstract and nondepictive. Figure 10 shows a few examples: ‘one’ 一, ‘two’ 二 (top) and ‘five’ 五, ‘on top; to ascend’ 上 (bottom). Relatively few primary elements are of this type.
Compounding: Extension of Primary Elements Through Polyvalence
Primary elements often appear as discrete graphs, but frequently we do not see them used in their primary meaning. Rather, the earliest attested usage is often as a loan for a near homophone or synonym standing at some distance from what we believe to be the primary sense. Such ‘polyvalent’ usage is known in Chinese as jiǎjiè (‘loan’) usage. Its best known form is semantic polyvalence – ‘polysemy,’ or the ‘rebus’ principle, in which one graph represents two words with different meanings but very similar sounds. Polysemy allows us to write words for which no obvious physical symbol can be found, by using any other word that has a sound that is the same or similar. Polysemy is a phonetic principle, and is the most fundamental linguistic principle involved in grasping the structure of Chinese script.
As an example, 立 represents an intransitive verb meaning ‘to stand,’ but we often see it used in other senses. One such usage is the transitive ‘to erect,’ although that is not usually considered to be jiǎjiè because the meaning ‘to erect’ is not now associated with a different graph or word. But 立 also stands for the word we now write 位 ‘place, position.’ The ancient spoken word originally written 立 is reconstructed *g-rep (Mandarin lì), and 位 is reconstructed *reps (wèi); we believe the two words were alike enough in sound and meaning that the use of one to stand for the other would not have been a big jump for a literate reader. A third word sometimes written with 立 is *g-reps (qì) ‘to arrive at one’s place.’ The three words, each now standardly written with its own graph, are close enough that we can believe they may have been cognate, and so the use of a single form to stand for all three does not surprise us. But to date we have found no explicit native statement about these relationships, and so are forced to make deductions about the range of loangraph relationships. We also have no explicit word lists – lexicons showing sound, meaning, and a conventional graphic representation – older than about the 6th century C.E., and those from the 6th century are filled with archaisms and late standardizations. Consequently, our ability to reconstruct the early language and identify the words represented by graphs in excavated materials is still very limited.
A second form of polyvalence is ‘polyphony’ – writing a graph usually associated with one word to represent another word with a totally different sound but similar in meaning. Relatively few examples of this kind have survived the standardizing tendencies of the past 2200 years, but we do see occasional examples such as 屮, associated both with *hngrat ‘sprout’ (a word long obsolete) and *’tshu 艸/草 ‘plant’ in the received literature. In excavated ancient documents, however, there appear to be a great many examples of this kind.
Table 1 Phonetic compounds based on 刀
Graph | Reconstruction | Meaning | Structure of graph
刀 | *’taw | ‘blade, knife’ | pictograph
到 | *’taws | ‘to arrive’ | disambiguated by the addition of semantic determinative 至 *tits ‘to arrive’ (the element 刂 is an altered form of 刀)
召 | *draws | ‘to summon’ | disambiguated by the addition of semantic determinative 口 *kho ‘mouth; oral’
Extension of Primary Elements Through Compounding
Primary elements are only a minute proportion of the total number of Chinese graphs. In most cases, a typical Chinese graph is a compound of two or more of them. Compounding is thought to have come about at an early date because of the intrinsic ambiguity of polyvalence. To resolve the ambiguity, ‘determinative elements’ were added, forming a compound graph. Determinatives may be either phonetic or semantic: they may give the reader a clue to the intended word through either sound or meaning. The overwhelming majority of compounds involve the addition of a semantic determinative to a primary phonetic element that may once have been used polyphonically. Such ‘phonetic compounds’ (xíngshēng, ‘form and sound’ graphs) make up the vast proportion of all of the Chinese characters attested from antiquity.
Table 1 shows three graphs, two of which are compounds, differentiated semantically from the first, which was a polysemous primary graph. From 到, another graph 倒 was derived for the word *’taw ‘to fall down (said of a person)’. 倒 is two derivational steps away from 刀: in it, the compound 到 (associated with *’taws ‘to arrive’ and *’taw ‘to fall down’) has been disambiguated by the addition of semantic determinative 人 *nin ‘person’ for the word *’taw but not for *’taws. From 召 have sprung a whole series of secondary compound graphs, shown in Table 2. And from 昭 another (tertiary) compound 照 has been derived, for *taws ‘to shine on’ by the addition of the semantic determinative 火 (灬) *’hmej ‘fire’. None of the elements 至, 口, 人, 日, 手, 水, 言, 走, 音, 糸, 豸, or 火 appears to contribute to the overall phonetic value of these graphs; in every case, our ability to read a character stems from the original association of the element 刀 with the sound *’taw. Many scholars have proposed etymologizing these words so as to link them to the basic meaning of *’taw, ‘blade or knife’: perhaps ‘bright’ and ‘to shine’ suggest the sharpness or glint of a blade, or perhaps ‘summon’ and ‘command’ suggest compulsion by force of arms. But that is speculation; there is no reason to assume that 刀 was used as anything except a token of sound in these graphs. If ‘blade’ and ‘bright’ are related, that is mainly a matter of the etymology of spoken words.
Our ability to describe early graphic evolution in Chinese is greatly complicated by two problems. First, compound graphs are rarely seen in their earliest meanings, and, second, primary graphs are rarely seen in the loan usages that we believe to have been disambiguated later as distinct phonetic compounds. Full documentation may never be possible, because it seems likely that many phonetic compound graphs known in post-Hàn times were formed directly as compounds without having gone through the process of disambiguation of a primary polyphonic pictograph.
A minority of compound graphs were formed by the addition of a phonetic element to an older pictograph. An example is the word zhù ‘to cast in metal’, now written 鑄 *tus (?). The modern square-script form is a phonetic compound, with 壽 *du as the phonetic element and 金 ‘metal’ *k(r)em as the semantic determinative. Figure 11 shows several ancient forms of 鑄: the three in the top row have in common the primary elements 皿 ‘vessel’ at the bottom and what appear to be two hands holding some sort of inverted vessel at the top. Between these two parts are variously 火 *’hmej ‘fire’, 金 ‘metal’ *k(r)em, and a third element, whose reading and meaning are disputed but which appears to be the phonetic element in modern 壽. Of these three graphs, only the one containing this third element is recognizable to us as a phonetic compound, in which that element specifies that a word resembling *du is intended; the other two must be understood as compounds of some sort. In the bottom row of Figure 11, we see a form containing , and , followed by another form containing only and with *’kho ‘mouth’ (an element of modern ). Both of these must be forebears of modern . Last is a form containing only and *stu (?; likely variant of ‘hand’, phonetic in *stu ‘to protect’); this is also a phonetic compound graph, using a different phonetic element from the received form. In sum, it appears that early ways of writing ‘to cast’ were not phonetic, but that was first added as a phonetic determinative and the complex pictographic elements gradually reduced to 金, which looks like a semantic determinative. But we can see that simple phonetic-plus-semantic structure is actually late in this graph, and developed following a different path than the 刀 compounds.
Figure 11 Six bronze forms of zhù 鑄 ‘to cast in metal’.
Table 2 Phonetic compounds based on 召
Graph | Reconstruction | Meaning | Structure of graph
昭 | *taw | ‘bright’ | disambiguated by the addition of semantic determinative 日 *nit ‘sun’
招 | *taw | ‘to summon with the hand’ | disambiguated by the addition of semantic determinative 手 *hu ‘hand’
沼 | *taw | ‘pond’ | disambiguated by the addition of semantic determinative 水 (氵) *h(l)uj ‘water’
詔 | *taws | ‘to command’ | disambiguated by the addition of semantic determinative 言 *ngan ‘to say’
超 | *thraw | ‘to exceed, surpass’ | disambiguated by the addition of semantic determinative 走 *’tso(k) ‘to run’
韶 | *daw | ‘kind of ritual music’ | disambiguated by the addition of semantic determinative 音 *(r)əm ‘music’
紹 | *daw | ‘to connect’ | disambiguated by the addition of semantic determinative 糸 *sə ‘thread’
貂 | *tew | ‘marten: kind of weasel’ | disambiguated by the addition of semantic determinative 豸 *lre (?) ‘any legless wild creature’ (!)
Semantic Compounds
Because compound characters are made of two or more primary elements, it is often thought that it is the meanings of those elements that combine to indicate the meaning of the whole compound. That is a misconception, one widespread in China as well as abroad. Chinese traditionally call such ‘semantic compounds’ huìyì, graphs with ‘meanings conjoined’. There are some graphs that (at present) are almost always explained as semantic compounds; for instance 林 lín ‘forest’ is apparently composed of the element 木 mù ‘tree’ doubled. But by far the overwhelming majority of compound graphs contain at least one patently phonetic element. Moreover, modern scholars continue to study those compounds that are ambiguous, and, in some cases, they have been plausibly explained as originally phonetic compounds in which the phonetic element has ceased to be recognized. A fine example is 鳴 *mreng ‘to crow,’ apparently a compound of the elements 口 *’kho ‘mouth’ and 鳥 *’tiw ‘bird,’ neither of which can be phonetic. However, it has been proposed by modern scholars that 口 is actually a polyphonic element, and that in addition to ‘mouth’ it also stands for a word meaning ‘to call.’ In addition to 鳴, common words using 口 phonetically for this sound are 名 *meng ‘name, to name’ and 命 *mrings ‘to command.’ Presumably, one of the three words *mreng ‘to crow,’ *meng ‘name,’ and *mrings ‘to command’ is what was originally intended by the use of the primary element 口 in all three graphs. Findings like this are disputed in many circles, however; the huìyì-type compound is part of the traditional lore of the Chinese script, whether or not it turns out to have a historical basis.
Character Structure and the Dictionary
Chinese dictionaries have been arranged by sound since at least the 6th century C.E., but how does one find a character if one doesn’t know how it is pronounced? To solve this problem, Chinese dictionaries are also arranged by character structure. Usually, the semantic determinative, if there is one, is identified as the classifying element of a graph, and all the characters that have the same semantic determinative are placed together. For example, the characters , , , , , , and are all placed under in the dictionary, because is identified as a recurring determinative element in all of them. Westerners generally call this element the ‘radical’ of the character, but the Chinese name (bùshǒu) is better translated as ‘classifier.’ ‘Radical’ suggests that the determinative element is somehow the etymological ‘root’ of the graph, when in fact most semantic determinatives are late additions in compound characters; it is the original polysemous phonetic element that really deserves the name ‘root.’
See also: Calligraphy, East Asian; Chinese; Printing and Typewriting.
Bibliography
Boltz W G (1993). ‘Shuo wen chieh tzu.’ In Loewe M (ed.) Early Chinese Texts: A Bibliographical Guide. Berkeley: Society of the Study of Early China and Institute of East Asian Studies. 429–442.
Boltz W G (1999). ‘Language and writing.’ In Loewe M & Shaughnessy E L (eds.) The Cambridge history of ancient China. Cambridge: Cambridge University Press. 74–123.
Boltz W G (2003). Origin and early development of the Chinese writing system (Vol. 78). New Haven, CT: American Oriental Society Monograph Series.
Boodberg P (1940). ‘“Ideography” or Iconolatry?’ T’oung Pao 35(4), 266–288.
Bottéro F (2004). ‘Chinese characters versus other writing systems: the Song origins of the distinction between “non-compound characters” (wen) and “compound characters” (zi).’ In Takashima K & Jiang Shaoyu J (eds.) Meaning and form: essays in pre-modern Chinese grammar. Munich: LINCOM Europa, Studies in Asian Linguistics. 1–17.
DeFrancis J (1950). Nationalism and language reform in China. Princeton, NJ: Princeton University Press.
Hànjiǎn (1983). Lǐ Ling (ed.). Pub. with Gǔwén sìshēng yùn. Běijīng: Zhōnghuá Shūjú.
Keightley D N (1978). Sources of Shang history. Berkeley, Los Angeles, and Oxford: University of California Press.
Kern M (2000). The stele inscriptions of Ch`in Shih-huang: text and ritual in early Chinese imperial representation. New Haven, CT: American Oriental Society.
Qiú Xíguī (2000). Chinese writing. Mattos G L & Norman J (trans.). Berkeley: Society of the Study of Early China and Institute of East Asian Studies.
Serruys P L-M (1962). Survey of the Chinese language reform and the anti-illiteracy movement in Communist China. Studies in Chinese Communist Terminology, No. 8. Berkeley: Center for Chinese Studies.
Shaughnessy E L (1991). Sources of Western Zhou history. Berkeley, Los Angeles, and Oxford: University of California Press.
Wang F Fang-yü (1958). Introduction to Chinese cursive script. New Haven, CT: Far Eastern Publications, Yale University.
Further Reading
The best Western-language overview of the modern native tradition of paleography is Qiú (2000). The best presentation of modern linguistic analysis of the structure of the script is Boltz (2003), which, however, remains controversial among traditionalists.
Chinantec: Phonology D Silverman, Nanuet, NY, USA © 2006 Elsevier Ltd. All rights reserved.
Chinantecan is a group of about 14 VSO languages within the Otomanguean family, spoken by approximately 90 000 people in northeastern Oaxaca, Mexico, having branched from the Otomanguean tree more than 16 centuries ago. The 14 major languages (where ‘language’ is defined as a speech community with mutual intelligibility not in excess of 80% with other communities) are Ojitlán, Usila, Tlacoatzintepec, Chiltepec, Sochiapan, Tepetotutla, Tlatepusco, Palantla, Valle Nacional, Ozumacín, Lalana, Lealao, Quiotepec, and Comaltepec. The first seven are northern languages and tend to be more innovative phonologically; the second seven southern languages are more conservative. Syllables are usually CV, with only a few post-vocalic elements, among them a nasal and/or laryngeals. Proto-Chinantec is reconstructed as possessing consonants *p, *t, *k, *kw, *b, *z, *g, *gw, *s, *m, *n, *ŋ, *w, *l, *r, and *j. Laryngeals *h and *ʔ could stand alone prevocalically, or could precede any of the voiced consonants. Additional consonant-glide clusters are reconstructed as well. The reconstructed tonal inventory includes *H, *L, *HL, *LH, and *HLH. Vowels included *i, *e, *a, *u, *ɨ, and *ə, as well as several diphthongs. The vowels may be augmented in a bewildering number of ways, however. In modern Comaltepec – the most conservative Chinantecan language – eight vowel qualities (i, e, æ, a, o, ʌ, ɨ, u) may be combined with five tonal qualities (L, M, H, LM, LH), two voice qualities (plain and aspirated), a nasality contrast, as well as a binary length contrast. The cross-classification of these 5 independent systems results in 320 possible nucleus qualities (8 × 5 × 2 × 2 × 2). Thus, a single vowel quality may possess up to 40 contrastive values.
Chinantec roots and words are usually monosyllabic. The rich inflectional system normally involves modification of root vowels, resulting in monosyllabic stems that bear a particularly high informational load. In Comaltepec, for example, a single syllable may contain not only the root but also (in the case of verb complexes) active/stative markers, gender markers (animate/inanimate), transitivity markers (intransitive/transitive/ditransitive), aspect (progressive/intentive/completive), and possibly subject pronoun clitics (two subsyllabic classes). Methods of stem modification involve nasalization, tone, length, phonation augmentation, and sometimes consonant changes. Additionally, certain irregular patterns are marked by ablaut. Due to their inherent inflection,
Society of the Study of Early China and Institute of East Asian Studies. 429–442.
Boltz W G (1999). ‘Language and writing.’ In Loewe M & Shaughnessy E L (eds.) The Cambridge history of ancient China. Cambridge: Cambridge University Press. 74–123.
Boltz W G (2003). Origin and early development of the Chinese writing system (Vol. 78). New Haven, CT: American Oriental Society Monograph Series.
Boodberg P (1940). ‘“Ideography” or Iconolatry?’ T’oung Pao 35(4), 266–288.
Bottéro F (2004). ‘Chinese characters versus other writing systems: the Song origins of the distinction between “non-compound characters” (wen) and “compound characters” (zi).’ In Takashima K & Jiang Shaoyu J (eds.) Meaning and form: essays in pre-modern Chinese grammar. Munich: LINCOM Europa, Studies in Asian Linguistics. 1–17.
DeFrancis J (1950). Nationalism and language reform in China. Princeton, NJ: Princeton University Press.
Hànjiǎn (1983). Lǐ Ling (ed.). Pub. with Gǔwén sìshēng yùn. Běijīng: Zhōnghuá Shūjú.
Keightley D N (1978). Sources of Shang history. Berkeley, Los Angeles, and Oxford: University of California Press.
Kern M (2000). The stele inscriptions of Ch’in Shih-huang: text and ritual in early Chinese imperial representation. New Haven, CT: American Oriental Society.
Qiú Xíguī (2000). Chinese writing. Mattos G L & Norman J (trans.). Berkeley: Society of the Study of Early China and Institute of East Asian Studies.
Serruys P L-M (1962). Survey of the Chinese language reform and the anti-illiteracy movement in Communist China. Studies in Chinese Communist Terminology, No. 8. Berkeley: Center for Chinese Studies.
Shaughnessy E L (1991). Sources of Western Zhou history. Berkeley, Los Angeles, and Oxford: University of California Press.
Wang F Fang-yü (1958). Introduction to Chinese cursive script. New Haven, CT: Far Eastern Publications, Yale University.
Further Reading
The best Western-language overview of the modern native tradition of paleography is Qiú (2000). The best presentation of modern linguistic analysis of the structure of the script is Boltz (2003), which, however, remains controversial among traditionalists.
Chinantec: Phonology
D Silverman, Nanuet, NY, USA
© 2006 Elsevier Ltd. All rights reserved.
Chinantecan is a group of about 14 VSO languages within the Otomanguean family, spoken by approximately 90 000 people in northeastern Oaxaca, Mexico; the group branched from the rest of Otomanguean more than 16 centuries ago. The 14 major languages (where ‘language’ is defined as a speech community with mutual intelligibility not in excess of 80% with other communities) are Ojitlán, Usila, Tlacoatzintepec, Chiltepec, Sochiapan, Tepetotutla, Tlatepusco, Palantla, Valle Nacional, Ozumacín, Lalana, Lealao, Quiotepec, and Comaltepec. The first seven are northern languages and tend to be more innovative phonologically; the remaining seven, the southern languages, are more conservative.
Syllables are usually CV, with only a few postvocalic elements, among them a nasal and/or laryngeals. Proto-Chinantec is reconstructed as possessing the consonants *p, *t, *k, *kw, *b, *z, *g, *gw, *s, *m, *n, *ŋ, *w, *l, *r, and *j. The laryngeals *h and *ʔ could stand alone prevocalically, or could precede any of the voiced consonants. Additional consonant-glide clusters are reconstructed as well. The reconstructed tonal inventory includes *H, *L, *HL, *LH, and *HLH. Vowels included *i, *e, *a, *u, *ɨ, and *ə, as well as several diphthongs.
The vowels may be augmented in a bewildering number of ways, however. In modern Comaltepec, the most conservative Chinantecan language, eight vowel qualities (i, e, æ, a, o, ʌ, ɨ, u) may be combined with five tonal qualities (L, M, H, LM, LH), two voice qualities (plain and aspirated), a nasality contrast, and a binary length contrast. The cross-classification of these five independent systems yields 320 possible nucleus qualities (8 × 5 × 2 × 2 × 2). Thus, a single vowel quality may possess up to 40 contrastive values (320 / 8).
Chinantec roots and words are usually monosyllabic. The rich inflectional system normally involves modification of root vowels, resulting in monosyllabic stems that bear a particularly high informational load. In Comaltepec, for example, a single syllable may contain not only the root but also (in the case of verb complexes) active/stative markers, gender markers (animate/inanimate), transitivity markers (intransitive/transitive/ditransitive), aspect (progressive/intentive/completive), and possibly subject pronoun clitics (two subsyllabic classes). Methods of stem modification involve nasalization, tone, length, phonation augmentation, and sometimes consonant changes. Additionally, certain irregular patterns are marked by ablaut. Due to their inherent inflection,
bare verbal roots do not exist as such in Chinantecan. All Chinantecan languages have a large number of verb classes, along with many lexical exceptions. Classes are differentiated by patterns of identity or nonidentity across aspect/person combinations. For example, in the partial paradigm for the verb ‘to hit’ shown in Table 1, some complexes are identical to others, while others are different. Verbs in this class tend to show a similar pattern of identity and nonidentity across cells, while verbs in other classes show a different pattern. Table 2 provides examples of stem inflection from Quiotepec (Robbins, 1968).

Table 1 Partial verb paradigm from Comaltepec
                              1s      1p      2       3
hit (transitive/inanimate)
  progressive                 bah¥    ba¥     bah¥    bah¥
  intentive                   bah¡    bah¡    bah¡    bah¥
  completive                  bah¥    bah¡    bah     bah¥
hit (transitive/animate)
  progressive                 bV:£    bV:£    bV£     bV:£
  intentive                   bV:¡    bV:¡    bV¡     bV:£
  completive                  bV:£    bV:¡    bV:     bV:£

In at least some Chinantecan languages, the verb may be prefixed by a subject agreement marker for intransitive verbs, or by an object agreement marker for transitive verbs. Additional verbal prefixes include a negation marker, and tense and aspect markers (imperfect, past, hodiernal past, perfect, past imperfect, etc.).
Unlike verbs, nouns do not typically display internal inflection, instead showing stability across inflectional augmentation. In Tepetotutla, for example, noun roots may concatenate with a quantifier, a gender-inflected numeral, a classifier, etc. In Lealao, constituents of the noun phrase may include a quantifier, the head, a modifier, a possessor, and a deictic marker, in that order, as well as a classifier prefix in some cases.
Stem complexes are obligatorily stressed. Posttonic and pretonic syllables are not stressed. Stressed syllables may possess greater phonological and morphological complexity than do unstressed syllables. In Sochiapan, unstressed syllables differ from stressed ones in displaying a more limited distribution of phonemes. Posttonic syllables in Palantla consist of a small list of words that do not contrast for tonal features. Pretonic syllables, while maintaining tonal contrasts, do not possess postvocalic elements, except in very careful speech.
In Comaltepec, posttonic syllables consist of a limited set of clitics, person-of-subject inflectors (in verbs), and possessors (in nouns). Pretonic syllables consist of only several verbal prefixes and a few proclitics, and possess a smaller inventory of tone values. These syllables are not a site for further inflection, and thus do not possess morphological complexity. In Quiotepec, too, stress falls on the major lexical classes (verbs, nouns, etc.); most pretonic syllables consist of inflectional material. Pretonic syllables only occur with single tones, never with tonal contours. In several Chinantecan languages, the vocalism of posttonic syllables is harmonically determined by the stem vowel. Tone may spread from stem to suffix as well.
Regarding Chinantecan stress, several languages are traditionally characterized as possessing either ‘ballistic’ stress or ‘controlled’ stress on stem syllables. In Palantla, Tepetotutla, Sochiapan, and Comaltepec, ballistic syllables have been characterized by an initial surge and rapid decay of intensity, and a loss of voicing of postvocalic elements; controlled syllables exhibit no such initial surge of intensity, displaying a more evenly controlled decrease of intensity, and a lack of postvocalic devoicing. Ballistic syllables tend to be shorter in duration than controlled syllables, and may possess a smaller inventory of tonal patterns. In several Chinantecan languages, ballistic syllables cross-classify with almost every other syllable type: oral and nasal vowels, long and short vowels, preaspirated, preglottalized, and plain onsets, open and checked syllables, and nasally closed syllables may all possess ballistic stress.
Ballistic stress interacts most significantly with tone, tending to raise high tones and lower low tones. In Lalana, ballistic stress (considered postvocalic h in some analyses) may not occur with glottal checking, and may occur with only H, L, and HL tones, whereas controlled syllables reportedly also possess MH, LH, and HLH, and may be checked. In Lealao, only level tones (L, M, H, VH) may occur with ballistic stress, whereas controlled syllables may also occur with tonal contours (LM, LH). In Comaltepec, ballistic syllables may occur with almost any tonal pattern.

Table 2 Stem inflection in Quiotepec (Robbins, 1968): glosses
I give (something); I gave (something); thou givest (something); thou gavest (something)
I give (something to someone); I gave (something to someone); thou givest (something to someone); thou gavest (something to someone)
I give (something animate); I gave (something animate); thou givest (something animate); thou gavest (something animate)
I give (something animate to someone); I gave (something animate to someone); thou givest, gavest (something animate to someone)
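The count of Comaltepec nucleus qualities given earlier in this article (eight vowel qualities, five tones, two voice qualities, a nasality contrast, and a length contrast) can be checked by brute-force enumeration. The Python sketch below is only an illustration: the list and function names are hypothetical, and the symbols ʌ and ɨ are assumed readings of two vowel symbols that are unclear in the source. Only the size of each system (8, 5, 2, 2, 2) matters for the arithmetic.

```python
from itertools import product

# Five independent systems reported for Comaltepec nuclei.
# Labels are illustrative; only the number of members per system matters.
VOWEL_QUALITIES = ["i", "e", "æ", "a", "o", "ʌ", "ɨ", "u"]  # ʌ and ɨ are assumed symbols
TONES = ["L", "M", "H", "LM", "LH"]
VOICE_QUALITIES = ["plain", "aspirated"]
NASALITY = ["oral", "nasal"]
LENGTH = ["short", "long"]


def enumerate_nuclei():
    """Return every combination of vowel quality, tone, voice, nasality, and length."""
    return list(product(VOWEL_QUALITIES, TONES, VOICE_QUALITIES, NASALITY, LENGTH))


nuclei = enumerate_nuclei()
total = len(nuclei)                                 # 8 * 5 * 2 * 2 * 2
per_vowel = sum(1 for n in nuclei if n[0] == "a")   # contrasts carried by one vowel quality

print(total, per_vowel)  # 320 40
```

Enumerating the combinations rather than multiplying the sizes makes it easy to filter the list later, for example to drop combinations that a given language does not attest.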
Table 3 Tone sandhi in Comaltepec
Non-sandhi context    Sandhi context    Gloss
to:D                  kwa to:           give a banana
Ni$hD                 kwa Ni$h          give a chayote
ku:£                  kwa ku:           give money
hi£                   mi$:£ hids        I ask for a book
moh £                 mi$:£ moh         I ask for squash
The ballistic stress found in some Chinantec languages corresponds to tonal lowering in Ojitlán and Usila. Quiotepec is variously characterized as possessing ballistic accent or raised tones in these same contexts, often accompanied by postvocalic aspiration. The Chinantecan ballistic syllable corresponds to postvocalic aspiration in related Mixtecan and Otopamean languages, to prevocalic aspiration in related Popolocan languages, and to glottally ‘interrupted’ (CVʔV) syllables in the Chatino, Zapotec, and Tlapanec languages. Chinantecan ballistic syllables may derive from Proto-Otomanguean *CVh syllables (which may or may not have been phonetically realized as interrupted vowels). Indeed, recent phonetic and phonological investigations have recharacterized the ballistic phenomenon as largely laryngeally based, involving postvocalic aspiration.
Segmental sandhi is rather limited in Chinantecan, although tone sandhi is widespread, being both phonologically and morphologically conditioned. The best-studied tone sandhi system is that of Comaltepec. Here, LH tones spread their H component onto a following vowel. Furthermore, M tones on unchecked controlled syllables (deriving from Proto-Chinantec H) trigger the presence of an H tone on the following syllable. Examples are shown in Table 3.
See also: Oto-Manguean Languages.
Bibliography
Anderson J L (ed.) (1989). Comaltepec Chinantec syntax. Studies in Chinantec languages (vol. 3). Dallas: Summer Institute of Linguistics.
Merrifield W R (1968). Palantla Chinantec grammar. Papeles de la Chinantla V. Serie Científica 9. Mexico: Museo Nacional de Antropología.
Merrifield W R & Rensch C R (eds.) (1990). Syllables, tone, and verb paradigms. Studies in Chinantec languages (vol. 4). Dallas: Summer Institute of Linguistics.
Rensch C R (1968). Proto Chinantec phonology. Papeles de la Chinantla VI. Serie Científica 10. Mexico: Museo Nacional de Antropología.
Rensch C R (1976). Comparative Otomanguean phonology. Bloomington: Indiana University.
Rensch C R (ed.) (1989). An etymological dictionary of the Chinantec languages. Studies in Chinantec languages (vol. 1). Dallas: Summer Institute of Linguistics.
Robbins F E (1968). Quiotepec Chinantec grammar. Papeles de la Chinantla IV. Serie Científica 8. Mexico: Museo Nacional de Antropología.
Rupp J E (ed.) (1989). Lealao Chinantec syntax. Studies in Chinantec languages (vol. 2). Dallas: Summer Institute of Linguistics.
Silverman D (1997). ‘Tone sandhi in Comaltepec Chinantec.’ Language 73, 473–492.
Silverman D (1997). Phasing and recoverability. New York: Garland.
Westley D O (1991). Tepetotutla Chinantec syntax. Studies in Chinantec languages (vol. 5). Dallas: Summer Institute of Linguistics.
Chinese
Y Gu, Chinese Academy of Social Sciences, Beijing, China
© 2006 Elsevier Ltd. All rights reserved.
The State of the Art
If language is ultimately seated in the minds of individual speakers, as some linguists claim, then Chinese can be described as a collection of over 1.3 billion idiolects scattered around the world, in Mainland China, Taiwan, Hong Kong, and Singapore in particular. If, on the other hand, language is held to be the property of a speech community, as many linguists believe, then Chinese is an assemblage of numerous ‘dialects’ spread over different continents and time zones, some of which are so different that their speakers cannot communicate with one another. In spite of this vast diversity, and even mutual unintelligibility in speech, all literate speakers can overcome the barrier through reading (not aloud!) and writing. The writing script partly enables its users to transcend the differences among idiolects and dialects, and bridges the past and the present.
In this article, Chinese will be discussed within its two natural divisions: spoken Chinese and written Chinese. The former includes (1) the classification of dialects and their geographic and demographic
distributions; (2) Putonghua as a lingua franca; and (3) a brief discussion plus sound illustrations of three major dialects. The latter includes (1) the writing script, and (2) the historical evolution of written Chinese from archaic Chinese to modern Chinese. The article concludes with a summative account of how Chinese, both spoken and written, is electronically processed.

Figure 1 Classification of Chinese dialects.
Spoken Chinese
Although Chinese, like any other language, is instantiated in idiolects (parole, in Saussure’s terms), these are set aside once they have served as evidence for constructing the language system, i.e., the Chinese language. In other words, in talking about Chinese, the more than 1.3 billion idiolects are generally ignored. What linguists are interested in is the various dialects that have evolved from them. The number of dialects depends on how fine-grained the researcher’s scheme is intended to be. It is hardly rare for people in two villages only a dozen miles apart to be unable to communicate intelligibly through speech.

Dialect Classification and Distribution
Chinese dialects can be classified in a treelike structure. The first branching from the trunk yields two major supergroups: Mandarin and non-Mandarin. Mandarin includes eight subgroups: Northeastern, Beijing, Beifang, Jiaoliao, Zhongyuan, Lanyin, Southwestern, and Jianghuai. The non-Mandarin group comprises nine subgroups: Jin, Wu, Hui, Gan, Xiang, Min, Yue, Pinghua, and Hakka. Each of the subgroups has its own clusters, each of which in turn encompasses local dialects (see Figure 1).
Geographically speaking, Mandarin is spoken in the following provinces and major cities: Heilongjiang, Jilin, Liaoning, the eastern part of the Inner Mongolia Autonomous Region, Shandong, Beijing, Tianjin, Hebei, Shanxi, Gansu, Qinghai, Ningxia Hui Autonomous Region, Sichuan, Yunnan, Guizhou, Guangxi Zhuang Autonomous Region, the western part of Hubei, Chongqing, and the northern parts of Jiangsu and Anhui. The total Mandarin-speaking population, based on the 1982 census, was about 662.23 million. Table 1 shows the demographic distributions among the subgroups of Mandarin. The demographic distributions of the non-Mandarin dialects are shown in Table 2.
Mandarin Chinese is often nontechnically regarded as equivalent to Chinese, which was historically the language of the Han nationality. Thanks to massive immigration and frequent contact, Mandarin Chinese is spoken by non-Han ethnic peoples as well. Some members of the Hui nationality, for instance, who are of Mohammedan origin, have adopted Mandarin as their mother tongue. Almost all members of the She and Manchu nationalities speak Mandarin Chinese. Conversely, some people of Han origin in Hainan Province speak the Be language instead of Mandarin.

Putonghua as Lingua Franca
Dialects create diversity and local identity, and at the same time impose constraints on communication and social interaction. A tension always exists between diversification and standardization of the language. Many campaigns have been launched in the long history of China in favor of standardizing both spoken and written Chinese. The policy of shu tong wen zi (‘writing according to the same script’) adopted in the Qin Dynasty (248–207 B.C.) was in fact a systematic reform undertaken by the imperial court to standardize the writing script. In the Sui Dynasty, Lu Fayan’s (fl. 600 A.D.) Qieyun (‘Guide to poetic rhyming’) became a standard reference on pronunciation for the generations to come, as well as for the reconstruction of ancient phonological systems.
The campaign for the standardization of modern Chinese started as early as the final years of the Qing Dynasty (1616–1911 A.D.), when the National Language Movement was vigorously launched as part of the measures to revitalize the shattered country. It was argued that the nation could not be unified without a unified language. Guoyu (‘national language’) was initially envisaged as a language artificially constructed on the basis of some major dialects. This proved to be untenable, for it was next to impossible to promote such a language without native speakers. New Guoyu (‘new national language’), with the Beijing dialect as its base, was proposed and eventually adopted.
Immediately after the founding of the People’s Republic of China in 1949, language reform was put high on the government’s agenda. Modern Standard Chinese, officially called Putonghua, was adopted as the national language. It uses the Beijing dialect for its standard pronunciation and northern dialects as its base input. Putonghua is officially stipulated to be the language of instruction at all levels of education, and of the mass media. The term Guoyu is still used in Taiwan, while in Singapore the language is called Huayu (i.e., Chinese). Putonghua, Guoyu, and Huayu are three different terms for more or less the same Modern Standard Chinese.

Table 1 Mandarin-speaking population by 1982 (rows: Northeastern, Beijing, Beifang, Jiaoliao, Zhongyuan, Lanyin, Southwestern, Jianghuai, yet to be grouped, total)
Table 2 Demographic distributions of other non-Mandarin dialects by 1982 (rows: Jin, Wu, Hui, Gan, Xiang, Min, Yue, Pinghua, Hakka, yet to be grouped, total)

Modern Standard Spoken Chinese
Phonology
The phonological structure of Modern Standard Chinese is conceptualized more in traditional Chinese terms than otherwise. A syllabic structure has three essential components: initials, finals, and tones. The initials and finals are two segments of a syllable, while the tones are suprasegmental, i.e., features superimposed on the segments. The initials are the sounds known as consonants in Western literature. The finals, i.e., vowels, have internal structures of their own: the medial and the root of the final,
which is further decomposed into two: the main vowel and the syllabic terminal (see Figure 2). The initial, the medial, and the syllabic terminal are not obligatory in a Chinese syllable. A simple syllable can consist of a main vowel plus a tone only. The possible initials, finals, and tones of Modern Standard Chinese are summarized in Tables 3, 4, and 5, respectively.
It is perhaps well known by now outside the Chinese-speaking world that Chinese tones are phonemic, that is, the same phonetic syllable pronounced in different tones will produce different words. The syllable /ma/ is the classic example: ma55 (mother), ma35 (hemp), ma214 (horse), ma51 (scold), and ma0 (a functional particle without a fixed lexical meaning). While tones are properties of words, there are also intonations of utterances. The relation between the tone and the intonation is often metaphorized as small ripples (cf. word tones) riding on large waves (cf. utterance intonations). The interaction between the tone and the intonation results in an algebraic sum of the two kinds of waves.
Grammar
It is generally held that, although Chinese dialects are so diversified that mutual unintelligibility in speech is not uncommon, they are amazingly unified in matters of grammar. There are some minor divergences between dialects, for example, with regard to the order of direct and indirect objects, where the Wu dialects and Cantonese differ from Mandarin Chinese. Cases like this, however, are extremely limited. It is quite valid to hold that there is one universal Chinese grammar.
At the risk of oversimplification, which is unavoidable in such a short essay as the present one, Chinese grammar, in comparison with English and other European languages, is pragmatically oriented. The subject and predicate in the grammar of Western languages are best viewed as the topic and comment in Chinese. The subject/actor and the predicate/action are treated as a special case of topic and comment.
Figure 2 Syllabic structure of Modern Standard Chinese.

Table 3 Initials of modern standard Chinese
Description         Pinyin        IPA
Bilabials           b p m f       p p‘ m f
Alveolars           d t n l       t t‘ n l
Dental sibilants    z c s         ts ts‘ s
Retroflexes         zh ch sh r    tʂ tʂ‘ ʂ r
Palatals            j q x         tɕ tɕ‘ ɕ
Velars              g k h         k k‘ x
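The division of a syllable into an optional initial and a final can be illustrated with the Pinyin initials of Table 3. The Python sketch below is an assumption-laden illustration rather than a standard algorithm: the function name is hypothetical, tones are ignored, and the syllable is assumed to be written in toneless Pinyin. It tries the two-letter initials (zh, ch, sh) before the one-letter ones so that, for example, zhong is not split after z.

```python
from typing import Tuple

# Initials of Modern Standard Chinese in Pinyin spelling, as listed in Table 3.
INITIALS = [
    "b", "p", "m", "f", "d", "t", "n", "l", "z", "c", "s",
    "zh", "ch", "sh", "r", "j", "q", "x", "g", "k", "h",
]


def split_syllable(syllable: str) -> Tuple[str, str]:
    """Split a toneless Pinyin syllable into (initial, final).

    The initial is optional, so it may be the empty string.
    """
    text = syllable.lower()
    # Try two-letter initials (zh, ch, sh) before one-letter ones (z, c, s).
    for initial in sorted(INITIALS, key=len, reverse=True):
        if text.startswith(initial):
            return initial, text[len(initial):]
    return "", text


for example in ["ma", "zhong", "shuang", "an"]:
    print(example, split_syllable(example))
```

The sketch deliberately leaves out Pinyin spelling conventions such as the letters y and w at the start of a syllable, or ü written as u after j, q, and x; syllables spelled that way come back with an empty initial or an unadjusted final.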
Table 4 Finals of modern standard Chinese
Pinyin  a  o  e  ê  ai  ei  ao  ou  an  en  ang  ong  er
IPA     a  o  ɤ  ɛ  ai  ei  au  ou  an  en  aŋ   uŋ   ɚ

Pinyin  i  ia  ie  iao  iou  ian  in  ing  iong
IPA     i  ia  iɛ  iau  iou  iɛn  in  iŋ   ŋ

Pinyin  u  ua  uo  uai  uei  uan  uen  ueng
IPA     u  ua  uo  uai  uei  uan  un   ueŋ

Pinyin  ü  üe  üan  ün
IPA     y  yɛ  yɛn  yn

Table 5 Tones of modern standard Chinese
Chinese terms in Pinyin

For instance, jiu bu he, yan chou (word-for-word rendering: wine not drink, cigarette smoke) is understood as ‘Talking about wine, I don’t drink; but as for cigarettes, I do smoke.’ The topic-comment structure has something to do with the complaint often made by Westerners about Chinese saying ‘no’ but actually meaning ‘yes.’ Responding to the utterance zhe shu bu hao (word-for-word rendering: ‘this book not good’), if the speaker also thinks that the book is not good, he will say shi (‘yes’), meaning that he agrees with what the first speaker said about the book. While the English mind checks the statement against the fact, the Chinese mind expresses agreement or disagreement with the speaker. In other words, the Chinese mind tends to treat the speaker’s utterance as setting up a topic, and the responder’s job is to comment on the topic. The issue of the truth or falsehood of the statement becomes secondary.
Pragmatics
One of the Chinese politeness maxims dictates that the speaker should denigrate him or herself, while elevating the other. This maxim has been codified in a range of lexical items. All the self-related expressions, including those referring to one’s family members, relatives, properties, writings, and so on, are marked with denigration, whereas the other-referring expressions, including those referring to the other’s family members, relatives, properties, writings, and so on, carry the force of elevation. For instance, a man referring to his own house will politely use han she (‘cold living place’), but fushang (‘mansion’) to refer to the other’s residence. The self-denigration and other-elevation maxim also operates in compliment-taking. Complaints are made about Chinese failing to take a compliment gracefully. Hearing a compliment like ni de yifu hen piaoliang (‘your dress is beautiful’), a Chinese lady will vigorously insist that it is very ugly indeed: bu bu bu, chou shile (‘no no no, deadly ugly’).

Written Chinese
In most languages, spoken and written forms are generally regarded as two functional varieties of one and the same language. The relation between spoken and written Chinese, however, cannot be dealt with so readily in the same way. Inscriptions incised on oracle bones, dated 1400–1100 B.C., are the earliest extant written records of Chinese. The inscriptions were not transcripts of the speeches of emperors or tribal kings. They can be regarded, at best, as setting up some of the earliest instances of a particular genre of written Chinese. By the late years of the Qing Dynasty (1616–1911), some 3000 years later, archaic written Chinese had become so different from contemporary spoken Chinese that it took years of dedicated study before one could read and write it. To make things even worse, archaic written Chinese was prescribed as the medium of education. It was, and still is, no easier for students to learn it than it would be to learn a foreign tongue. Some language reform activists in the 1910s went on record arguing that archaic written Chinese
was partially to blame for the humiliating decline of Chinese civilization following the Tang dynasty (618–907). Attempts to reform written Chinese thus had two aspects: the reform of the writing script and the reform of archaic written Chinese as the medium of education.
Writing Script Reform: Alphabetization Versus Simplification
The nature of the Chinese writing script has been disputed for years, as can be seen in the variety of English terms used to designate the marks on paper known in Chinese as hanzi (i.e., Chinese characters): pictographs, pictograms, ideographs, ideograms, phonograms, logographs, ideophonographs, lexigraphs, morphographs, sinographs, and so on. The evidence for the claim that Chinese writing originated from picture-drawing is substantial. Table 6 shows four instances of pictographs taken from oracle bone inscriptions with their corresponding present-day characters. It is apparent that the pictographs have evolved, through orthographic reforms, to such an extent that even those characters with highly iconic origins, as shown in the table, have lost their picturesqueness.
Chinese characters are constructed from five basic strokes (see Table 7) in a square space. Picture-based character creation is only one of the many ways in which Chinese characters are constructed. Some philologists in the Han dynasty (206 B.C.–220 A.D.), on the basis of the writings then extant, abstracted six principles of character formation. Later studies show that only four of them are genuine: (1) zhi shi, the simple indicative principle; (2) xiang xing, the pictographic principle; (3) hui yi, the compound indicative principle; and (4) xing sheng, the semantic–phonetic principle. The pictographic method of character formation had ceased to be productive by the Han dynasty. Semantic–phonetic character formation has been the most productive of all, and the majority of Chinese characters are constructed in this way. It is on this account that the Chinese writing system can appropriately be designated as morphosyllabic.

A Chinese character can be as simple as one stroke (e.g., 一 ‘one’), or as complex as dozens of strokes (e.g., 齉 ‘snuffling’). Given a set of 11 834 characters, the average number of strokes per character is 11.5516, and 63 percent of the set is made of 12-stroke characters. Since it is quite a challenging task to learn to write such characters, there has been no shortage of appeals to reform the writing script. As early as the 1910s, some language reform activists argued for abolishing the characters altogether and replacing them with a new alphabetic script. This proved to be completely infeasible. The PRC government eventually adopted three reform measures: (1) a romanization alphabet known as Pinyin, used to mark the pronunciations of characters; (2) a simplification scheme under which 1754 characters would be simplified; and (3) the abolition of a total of 1055 duplicate characters.
Table 6 Instances of Chinese pictographs (columns: pictographs found in oracle bone inscriptions; corresponding present-day characters; English translation). Entries: tiger, deer, horse, elephant.

Table 7 Strokes and character writing
The Reform of Archaic Written Chinese

Archaic written Chinese models the writings prevalent from the Spring and Autumn (770–476 B.C.) to the Later Han (25–220 A.D.) periods. Partially because the characters were immune to the dynamic changes of actual speech sounds over space and time, archaic written Chinese achieved, as it were, an independent symbolic existence. By the 1900s, it had no natural speakers. It did, however, have several potential rivals under the name baihua wen, literally meaning ‘unadorned speech writing,’ which was far closer to the contemporary vernacular speech. The reform movement basically dethroned archaic written Chinese and replaced it with the baihua wen that had been formerly much despised. The reform proved to be an uphill task, however, as it met with fierce resistance from die-hard adherents.

Three Major Dialects

As graphically shown in Figure 1, the non-Mandarin supergroup falls into nine subgroups of dialects, which of course can be further divided into smaller groups. In this section three dialects, Hong Kong Cantonese, Shanghainese, and the Fuzhou dialect, representing the Yue group, the Wu group, and the Min group respectively, are examined as a window to show what the non-Mandarin dialects look like. They are highlighted here thanks to their demographic size (see Table 2) and the relatively prestigious status they enjoy.

The Yue Group: Hong Kong Cantonese
Hong Kong Cantonese is one of the important varieties of the Yue group. It is spoken by 89 percent of Hong Kong’s 6.4 million population (by the 1996 census) in family discourse. It is also used in some radio and TV programs, and as an instructional language in schools and university classrooms. English was the main official language in the former British colony, but its use actually was, and still is, quite limited. Since the return of sovereignty to China in 1997, Putonghua has become increasingly popular. Having said this, Hong Kong Cantonese still remains a true vernacular of the local people. The term ‘Cantonese’ is derived from Guangzhou, the most influential city in southern China, which is known as Canton in English. Hong Kong and Guangzhou Cantonese are not noticeably different except that the former’s lexicon has more English loan words than the latter’s. In speech Cantonese and Mandarin or Putonghua are mutually unintelligible. Educated Cantonese speakers, however, use the standard form of written Putonghua. There are some spoken Cantonese words that have no corresponding Putonghua characters. Some Cantonese written words coined by local newspapers and in advertisements in Hong Kong are unintelligible to Putonghua readers. Backed up by the economic and financial strength and influence of Hong Kong and Guangzhou, Cantonese is enjoying a prestige that is unprecedented for any regional dialect in China, and is the most studied of all the dialects. Grammars, dictionaries, and textbooks have been written to render it more like a language than a regional dialect. Cantonese has 16 initial consonants. Unlike Mandarin, it has completely nasal syllables with m and ng
functioning as vowels. For instance, the Cantonese counterpart of the Mandarin word wu (‘five’) is ng, a sound that in Mandarin can occur only as the nasal ending of a final. Cantonese has eight vowels and two sets of consonants that can end a syllable: (1) nasals: -m, -n, -ng; (2) unreleased consonants: -p, -t, -k. Its tone system is far more complex than that of Putonghua. The exact number of tones is not without controversy. Some hold that only six tones are clearly distinctive in Hong Kong Cantonese, although there can be up to nine tones in the Yue group.

The Wu Group: Shanghainese
The Wu group is spoken mainly in Shanghai, southern Jiangsu Province, and a large part of Zhejiang Province. Historically the Suzhou Wu dialect enjoyed more prestige and esteem than the other regional varieties. When Shanghai established itself as an industrial and commercial center in China, the Suzhou dialect lost its glory and was replaced by Shanghainese, whose speakers seem eager to establish their own identity. Shanghainese speakers, who may speak fluent Putonghua, will lose no opportunity to code-switch to Shanghainese if they can be understood by an interlocutor, even at the risk of rudely shutting non-Shanghainese speakers out of the conversation. In comparison with Cantonese, Shanghainese is very much under-studied. The existing literature on it consists mainly of academic research papers. Like educated Cantonese speakers, educated Shanghainese speakers write in written Putonghua, although there exist lexical items that are unique to the dialect.

The term Shanghainese refers to the majority speech of downtown Shanghai. It has 28 initials (i.e., consonants) and 43 finals (i.e., vowels). One of its hallmark features (and also of the Wu group) is a three-way distinction in the initial consonants p, p‘, and b, which becomes a two-way distinction, p and p‘, in Putonghua. Although Wu dialects have seven or eight tones, tones 4, 5, and 6 have been lost as separate categories, which results in five tones in Shanghainese: (1) high level (53), (2) level high (35), (3) low level (13), (4) high + a glottal stop (5), and (5) low + a glottal stop (1).

The Min Group: Fuzhou Dialect
The Min group is mainly spoken in Fujian, Taiwan, and Hainan, as well as some areas in Guangdong, Zhejiang, Guangxi, and Jiangxi. It is by no means a homogeneous group. On the contrary, even within Fujian Province six subgroups can be identified, one of which is known as the Min eastern subgroup, with the Fuzhou dialect as its prototype. Mutual intelligibility within this eastern subgroup is quite low.
Table 8 Sample usage in Fujian dialect (columns: Putonghua; Fujian dialect; English translation). Putonghua entries in Pinyin with their English translations: shuidao ‘rice’, shuxin ‘letter’, leng ‘cold’, ku ‘cry’, taopao ‘escape’, zou ‘walk’.
Historically the Fuzhou dialect was understood to cover an area of 11 counties. The present-day use of the term is largely restricted to the speech of the locals in downtown Fuzhou. Phonologically it has 15 consonants, 46 vowels including diphthongs, and 7 tones. One of the striking features of the Fuzhou dialect in comparison with Mandarin is that it has preserved a great many archaic words or usages. For instance, the Fujian dialect word for ‘rice’ is one that is totally obsolete in Putonghua. For another instance, the Fujian dialect word meaning ‘letter’ reflects a usage found only in archaic Chinese. Table 8 lists some more instances.

Sound Illustrations
The phonological differences between Putonghua, Hong Kong Cantonese, Shanghainese, and Fujian dialects can be illustrated by the ways four natural objects – the sun, the moon, stars, and thunder – are lexicalized and pronounced (see Table 9).
Table 9 The phonological differences between Putonghua, Hong Kong Cantonese, Shanghainese, and Fujian dialects
Note: The characters are transcribed in IPA symbols. The superscripted numbers are tone types with 1–5 values.

Chinese Information Processing

At the early stage of computer technology, processing Chinese characters seemed such a forbidding task that calls for the romanization of the Chinese writing system were made again, but initial conceptions of the problem proved to be exaggerated. The national standard GB 2312–80, established on the basis of ISO 646 and officially coming into effect in 1981, provides a standard scheme for coding 6763 characters, which are subdivided into two groups according to frequency of usage: the most frequent set and the less frequent set. The most frequent set of 3755 characters is assumed to be 99.9% adequate for general usage (based on a statistical study of lexical frequency made in 1974). The GB 2312–80 standard met the demands of hardware and software development and exchange of information for general purposes, but it soon had to be amended as new demands arose. In 1994, a standard coding scheme for two supplementary sets consisting of 7237 and 7039 characters was officially announced. As GB 2312–80 was designed to accommodate simplified characters, the new GB 12345–90 was introduced for nonsimplified characters that are maintained in Taiwan and Hong Kong.

Nowadays, character recognition for both print fonts and handwriting is commercially available. Text-to-speech synthesis and production in the genre of journalistic texts has achieved a high degree of naturalness. The character script and lexical tones, which were thought to be two major obstacles for Chinese information processing, are no longer condemned, but appreciated as features with a flavor of real Chinese.

See also: China: Language Situation; China: Religions; China: Scripts, Non-Chinese; China: Writing System; Chinese as an Isolating Language; Chinese Lexicography; Chinese Linguistic Tradition; Chinese (Mandarin): Phonology.
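The GB 2312–80 coding scheme discussed above can be explored with Python's built-in `gb2312` codec. What follows is a minimal sketch, assuming the conventional layout in which each two-byte code identifies a row and a cell in a 94 × 94 grid, with rows 16–55 holding the most frequent set and rows 56–87 the less frequent set. These row ranges come from general knowledge of the standard rather than from the text above, and the function names are hypothetical.

```python
# Illustrative sketch: locate a character in the GB 2312 grid.
# Assumption (not from the text): rows 16-55 hold the most frequent set,
# rows 56-87 the less frequent set.

def gb2312_position(char: str) -> tuple:
    """Return the (row, cell) position of one character in the 94 x 94 grid."""
    # encode() raises UnicodeEncodeError if the character is not in GB 2312
    encoded = char.encode("gb2312")
    if len(encoded) != 2:
        raise ValueError("not a two-byte GB 2312 character")
    # Each byte is offset by 0xA0 from its row or cell number
    return encoded[0] - 0xA0, encoded[1] - 0xA0


def frequency_set(char: str) -> str:
    """Classify a character by the row it occupies."""
    row, _ = gb2312_position(char)
    if 16 <= row <= 55:
        return "most frequent set"
    if 56 <= row <= 87:
        return "less frequent set"
    return "non-character symbol"


print(gb2312_position("中"))
print(frequency_set("中"))
```

A character outside the standard, such as a nonsimplified form, would be expected to raise an encoding error here, which mirrors the need for the supplementary standards mentioned in the text.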
Bibliography
Chao Y R (1968). A grammar of spoken Chinese. Berkeley, CA: University of California Press.
Chen P (1999). Modern Chinese. Cambridge: Cambridge University Press.
DeFrancis J (1986). The Chinese language. Hawaii: University of Hawaii Press.
Matthews S & Yip V (1994). Cantonese: a comprehensive grammar. London: Routledge.
Ramsey S R (1987). The languages of China. Princeton University Press.
Chinese (Mandarin): Phonology
S Duanmu, University of Michigan, Ann Arbor, MI, USA
© 2006 Elsevier Ltd. All rights reserved.
Chinese is the first language of over 1 billion speakers. There are several dialect families of Chinese (each in turn consisting of many dialects), which are often mutually unintelligible. However, there are systematic correspondences among the dialects, and speakers of one dialect can pick up another fairly quickly. The largest dialect family is the northern family (also called the Mandarin family), which accounts for over 70% of all Chinese speakers. Standard Chinese (also called Mandarin Chinese) is a member of the northern family; it is based on the pronunciation of the Beijing dialect. There are, therefore, two meanings of the term Mandarin Chinese, one referring to the northern dialect family and one referring to the standard dialect. To avoid the ambiguity, I use Standard Chinese (SC) for the latter meaning. SC is spoken by most of those whose first tongue is another dialect. In principle, over 1000 million people speak SC, but in fact less than 1% of them do so without some accent. This is because even Beijing natives do not all speak SC.

SC has five vowels, shown in Table 1 in IPA symbols (Chao, 1968; Cheng, 1973; Duanmu, 2002; Lin, 1989). [y] is a front rounded vowel. When high vowels occur before another vowel, they behave as glides [j, ɥ, w]. [i] and [u] can also follow a nonhigh vowel to form a diphthong. The mid vowel can change frontness and rounding depending on the environment. The low vowel can change frontness but not rounding.

The consonants of SC are listed in Table 2; sounds with limited distribution are in parentheses. For most speakers, [ŋ] cannot occur in syllable-initial position. In syllable-coda position, only [n] and [ŋ] can occur. The palatals do not contrast with [ts, tsʰ, s], [tʂ, tʂʰ, ʂ], or [k, kʰ, x]; the palatals only occur with front vowels or front glides, but the other three sets do not. Because some speakers pronounce the palatals as [tsj, tsjʰ, sj], it is possible to analyze a palatal as a combination of a dental and a front vowel. The retroflex liquid (sometimes written as a fricative [ʐ]) is not a trill but an approximant (and has no lip rounding, unlike that in English); because SC does not have a trill, I transcribe it as [r] instead of [ɻ]. The retroflex series [tʂ, tʂʰ, ʂ, r] is a major characteristic of SC speakers from Beijing. SC speakers from other areas often replace [tʂ, tʂʰ, ʂ, r] with the dentals [ts, tsʰ, s, z]. The unaspirated stops and affricates [p, t, k, tʂ, ts] can
become voiced [b, d, g, dʐ, dz] when they occur in an unstressed syllable. SC also has two syllabic consonants [z] and [r] (or [ʐ]). They were previously thought to be special vowels, probably because it was believed that every syllable must have a vowel. [z] is used when a syllable starts with [ts], [tsʰ], or [s] and when there is no vowel in the rime; it can be seen as the extension of the [s] element into the rime, where it becomes voiced. [r] is used when a syllable starts with [tʂ], [tʂʰ], [ʂ], or [r] and when there is no vowel in the rime; it can be seen as the extension of the retroflex element into the rime. SC also has a couple of syllabic nasals, which are usually interjections.

Table 1 Standard Chinese vowels
High  i  y  u
Mid   e
Low   a

Table 2 Standard Chinese consonants
Labial     p  pʰ  f  m
Dental     t  tʰ  ts  tsʰ  s  n  l
Palatal    (tɕ)  (tɕʰ)  (ɕ)
Retroflex  tʂ  tʂʰ  ʂ  r
Velar      k  kʰ  x  (ŋ)

The SC syllable can be made up of as many as four sounds, CGVX, where C is a consonant, G is a glide, V is a vowel, X is a nasal or an offglide of a diphthong, and VX is the rime. When both C and G are present, they are realized as one sound CG, where G is the secondary articulation. Thus, the SC word [swei] ‘age’ is phonetically quite different from the English word [swei] sway. SC also has a suffix [r], which changes the rime of the syllable it is attached to from VX to Vr and adds a retroflex quality to the nuclear vowel. In other words, the [r] suffix can lead to loss of contrast in the original coda.

SC has two kinds of syllables, which can be called full (or regular) syllables and weak syllables. Full syllables (mostly monosyllabic content words) have tones and are long. Weak syllables (mostly grammatical words) are short and do not have their own tones. A full syllable can sometimes change to a weak syllable, in which case it loses its underlying tone and undergoes rime reduction and shortening. In syllable theory, full syllables are heavy and have two moras each, whereas weak syllables are light and have one mora each. In other words, in weak syllables the vowel is short. In full syllables, the vowel is short when the rime is VC or VG and long when the rime is V.

In the electronic dictionary CMUDICT, English has approximately 10 000 monosyllables (excluding homophones). In contrast, SC has a very small syllable inventory, only approximately 400 syllables excluding tones (or approximately 1300 syllables including tones). It is a puzzle why SC uses so few syllables, especially when many times more seem to be available. For example, given about 20 Cs, three Gs, five Vs, and five Xs, there are approximately 2000 possible CGVX combinations (excluding tones), yet just 400 are used. It turns out that two-thirds of the unused forms are ruled out by two requirements. The first is that C and G cannot have the same place of articulation, which follows from the analysis that CG is a single sound, because in a single sound each place feature can be used just once. The second requirement is that V and X cannot have opposite values for [round] or [back].

There are four distinctive tones on full SC syllables. Weak syllables may get tone from certain intonational environments; otherwise, they remain toneless, which is phonetically a low pitch. The four distinctive tones are high, rise, low, and fall. The pitch range of the tones can vary according to stress, whereby a syllable with greater stress has a wider pitch range (Shen, 1985). Using standard tonal features, according to which contour tones are made of two (or more) level tones, the four tones are represented in (1), where the vowel is long because it is in a V rime.
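The combinatorial estimate in the syllable inventory discussion above can be checked with a few lines of arithmetic. The sketch below uses the rounded figures from the text (about 20 Cs, three Gs, five Vs, five Xs). How the slots are counted is an assumption, since the text does not say whether empty C, G, and X positions are included, so both readings are computed.

```python
# Rounded slot inventories quoted in the text
n_c, n_g, n_v, n_x = 20, 3, 5, 5

# Reading 1 (assumption): every slot must be filled
all_slots_filled = n_c * n_g * n_v * n_x

# Reading 2 (assumption): C, G, and X may each be empty, V is obligatory
optional_cgx = (n_c + 1) * (n_g + 1) * n_v * (n_x + 1)

# Figures quoted directly in the text
quoted_possible = 2000
used = 400
unused = quoted_possible - used

# "Two-thirds of the unused forms" are said to be ruled out by two requirements
ruled_out = unused * 2 // 3

print(all_slots_filled, optional_cgx)
print(unused, ruled_out)
```

The quoted figure of roughly 2000 falls between the two readings, which is consistent with its being an approximation rather than an exact count.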
In the Pinyin romanization system, the words in (1) can be written ma1, ma2, ma3, and ma4, where vowel length is not represented and tones are represented by the digits 1–4. The first, second, and fourth tones have normal voice quality, but the third tone has a murmured voice quality. The third tone may also begin with a slight dip in pitch, which does not seem to be phonologically relevant. In final position, the third tone can optionally end with a rise, in which case it is phonetically extra long. Phonologically we may represent it with three moras, exemplified in (2). Such a syllable often has an amplitude break or a glottal stop in the middle, and to some people it sounds like two syllables.
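The Pinyin tone digits can be related to the level-tone features mentioned above. The following is a minimal sketch, assuming the feature values H, LH, L, and HL for the four tones (high, rise, low, fall). The function name is hypothetical, and the optional final rise of the third tone is not modeled.

```python
# Assumed mapping from Pinyin tone digits to level-tone sequences:
# tone 1 = high, tone 2 = rise, tone 3 = low, tone 4 = fall
TONE_FEATURES = {"1": "H", "2": "LH", "3": "L", "4": "HL"}


def parse_pinyin(syllable: str) -> tuple:
    """Split a digit-marked Pinyin syllable into (segments, tone features).

    A syllable without a tone digit is treated as a toneless weak syllable
    and gets an empty tone string.
    """
    if syllable and syllable[-1] in TONE_FEATURES:
        return syllable[:-1], TONE_FEATURES[syllable[-1]]
    return syllable, ""


for word in ["ma1", "ma2", "ma3", "ma4", "ma"]:
    print(word, parse_pinyin(word))
```

Keeping the tone as a string of level tones makes it straightforward to append a boundary tone later, in line with the view that contour tones are sequences of level tones.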
Of the 1300 or so SC syllables (including tones), most are full syllables, in which the four tones are fairly evenly distributed, as shown in Tables 3 and 4. Table 3 shows that there are slightly fewer second tones than other tones, but not by much. Table 4 shows that most syllables occur with four or three tones each, and only a small number occur with two tones or one.

Table 3 Frequency of tones
Tone    First  Second  Third  Fourth  All
Number  337    255     316    347     1255

Table 4 Frequency of tones
Tones per syllable        4    3    2   1   All
Number of such syllables  178  130  59  35  402

According to a text corpus of over 45 million Chinese character tokens (Da, 2000), there are over 6000 different Chinese characters, most of which represent monosyllabic words. This means that each SC syllable represents approximately 15 words excluding tones or five words including tones. The homophone load is not distributed evenly, as Figure 1 shows. The top 15 SC syllables are shown in (3), where the number of words a syllable represents (ignoring tones) is shown in parentheses. One might think that the most frequent syllables are the most natural or unmarked, namely, those that children learn first or those that are most common in the world’s languages, such as [ba], [ma], or [ta]. However, many of the syllables in (3) do not seem to be unmarked syllables.

Figure 1 Homophone density in Standard Chinese (ignoring tones), based on the analysis of 6000 characters listed in Da (2000). Most syllables represent fewer than 20 words each, but the syllable [ji] represents over 100 words.

(3) pi (51), tɕy (51), wei (53), wu (55), ʂ (58), tɕan (61), ɕi (64), ɥan (64), tɕʰi (66), li (71), tʂ (72), fu (73), ɥy (90), tɕi (93), ji (106)
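The figures in Tables 3 and 4 and the homophone estimates can be cross-checked with simple arithmetic. The sketch below is only an illustration: the counts are copied from the two tables, the rounded totals (6000 characters, 400 and 1300 syllables) are the approximations quoted in the text, and the variable names are hypothetical.

```python
# Table 3: number of syllables carrying each tone
tone_counts = {"first": 337, "second": 255, "third": 316, "fourth": 347}

# Table 4: number of toneless syllables that occur with 4, 3, 2, or 1 tones
syllables_by_tone_count = {4: 178, 3: 130, 2: 59, 1: 35}

# Total toned syllables according to Table 3
toned_total = sum(tone_counts.values())

# Total toneless syllables according to Table 4
toneless_total = sum(syllables_by_tone_count.values())

# Toned syllables implied by Table 4: each syllable counted once per tone
toned_from_table4 = sum(k * n for k, n in syllables_by_tone_count.items())

# Average homophone load, using the rounded totals quoted in the text
characters = 6000
words_per_toneless_syllable = characters / 400
words_per_toned_syllable = characters / 1300

print(toned_total, toneless_total, toned_from_table4)
print(words_per_toneless_syllable, round(words_per_toned_syllable, 1))
```

The two tables agree with each other: weighting the Table 4 counts by the number of tones gives the same total as summing Table 3.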
Most English monosyllables represent just one word each. Because Chinese has so many homophones, a natural question is: How does Chinese avoid ambiguity in speech? The answer seems to be that most ambiguities are clarified by context. For example, although sun and son are homophones in English, there are few contexts in which they would cause ambiguity. Despite the large number of homophones, the syllable inventory of Chinese continues to decrease. SC no longer allows [p, t, k] or [m] in syllable-final position, although some other dialects do. Shanghai has lost all diphthongs, and its tonal inventory has reduced to just two. In all likelihood, SC is moving in the direction of further reduction. For example, SC does not make use of such contrasts as [wi] vs. [wei] or [ji] vs. [i], which English does (consider we vs. way and yeast vs. east). In addition, about 200 of the 1300 syllables are now rarely used. From a functional point of view, it is a mystery why the high homophone density has not prevented syllable loss in SC or at least slowed it down. A possible answer, paradoxically, is that high homophone density may in fact speed up syllable loss. Studies on frequency effects show that frequent words (Bybee, 2001) are more likely to undergo reduction than infrequent words. Because Chinese has fewer syllables than English, Chinese syllables are used more frequently and so they are more likely to undergo reduction and loss of contrasts. In disyllabic English words and phrases three kinds of stress differences can be distinguished. In words such as Peter, Anna, and panda, the first is clearly stressed and the other not. In such cases, the stressed syllable is longer, has an unreduced vowel, and has a pitch accent. In contrast, the unstressed syllable is short, has a reduced vowel, and has no pitch accent. In words such as blackboard and pancake, the stress difference is also clear. 
Although both syllables are heavy and have an unreduced vowel, one syllable has a pitch accent and the other does not. In expressions such as Red Cross, real deal, and red-hot (adjective), the stress difference is no longer obvious; in each case, both syllables are heavy, have an unreduced vowel, and have a pitch accent. As a result, they are sometimes thought to have equal stress. When a full SC syllable occurs next to a weak one, their stress difference is like the first English case (Peter, Anna, and panda). When two (or more) full
SC syllables occur together, they all have tones and so their stress difference is not obvious; this is similar to the third English case (Red Cross and real deal). Now, in English the first two cases are quite common, and so stressed syllables often stand out. In Chinese, on the other hand, full syllables often occur together, and so stressed syllables often do not stand out. This may have contributed to a common view that there is no stress in Chinese.

The difference between full and weak syllables in SC can be explained in terms of the moraic trochee: a full syllable has two moras, so it forms a foot and has stress. However, Chinese also uses disyllabic feet. For example, the disyllabic foot is used in poetic templates, and it is also a domain for certain kinds of tone sandhi. In addition, a minimal expression should be disyllabic. If a noun is monosyllabic, a semantically redundant syllable is often added. Thus, an SC speaker usually cannot say Fa ‘France’ or Wang ‘Wang’, but must say Fa Guo ‘France country’, Lao Wang ‘old Wang’, or Xiao Wang ‘little Wang’. In contrast, Sudan ‘Sudan’ and Yindu ‘India’ can be said by themselves (and adding ‘country’ to them would be odd).

The disyllabic requirement has created a large dual vocabulary whereby many words have two forms, a monosyllabic form and a disyllabic form. The disyllabic form is a compound in structure but a single noun in meaning, and it can be called a pseudo-compound. Some examples are shown in Table 5. It is commonly thought that the creation of disyllabic words is triggered by homophone density. However, the common view cannot explain why monosyllabic names need another syllable even when there is no ambiguity, such as when you address someone in person. A more likely reason for the creation of pseudo-compounds is to fill a disyllabic foot (Duanmu, 1999b).

A disyllabic noun (or compound) can be heavy-heavy (two full syllables) or heavy-light (a full syllable and a weak syllable), but not light-heavy.
This suggests that the disyllabic foot is trochaic. However, when a heavy-heavy noun is spoken in isolation, the second syllable is often longer and appears to have slightly more stress. This has led to the view that Chinese has final stress (Chao, 1968; Hoa, 1983).
But when a heavy-heavy noun is in nonfinal position, its second syllable no longer has extra duration (Feng, 1985; Wang and Wang, 1993). It seems, therefore, that the extra duration of a final full syllable is due to prepause lengthening and the trochaic analysis fares better overall. The discussion so far suggests that a disyllabic word contains both moraic trochee and syllabic trochee (Duanmu, 1999a). The structure is shown in (4), which can be called a dual-trochee.
The dual-trochee distinguishes three degrees of stress: (1) a heavy syllable that heads a syllabic foot, (2) a heavy syllable that does not head a syllabic foot, and (3) a light syllable. The cases are easy to distinguish in English: (1) an unreduced vowel and a pitch accent (first syllable in Peter, pancake, or even city, if the last word is syllabified as cit.y), (2) an unreduced vowel but no pitch accent (second syllable in pancake), and (3) a reduced vowel and no pitch accent (second syllable in Peter or Anna). In Chinese, (1) and (2) are hard to distinguish because they both have tones and unreduced rimes; however, (3) is easy to distinguish from (1) and (2) because it has a reduced rime and no tone.

Because Chinese uses pitch contour (tones) to contrast word meanings, intonation is often expressed not as pitch variation on lexical words themselves, but as boundary tones that are added after lexical tones. Two examples are shown in (5) and (6).

      Tone             Intonation
(5)   LH           +   L                →   LHL
      nan                                   nan
      ‘difficult’      ‘affirmation’        ‘Surely difficult!’
(6)   HL           +   H                →   HLH
      mai                                   mai
      ‘sell’           ‘question’           ‘Sell?’

The boundary tones can also occur on what might be called intonation syllables. For example, the boundary tone in (5) can occur on [a] or [ou], and that in (6) can occur on [ma].

Many Chinese dialects have tone sandhi, whereby syllable tones change in context. The best-known tone sandhi in SC is Third Tone Sandhi, by which a third tone changes to a second tone when another third tone follows, or T3 T3 → T2 T3. In an expression made up of many third tones, the resulting change can be quite complicated. To understand the change we must first understand the formation of syllabic feet, which in turn depends on syntax (Shih, 1986; Shen, 1994; Chen, 2000; Duanmu, 2002). Thus, Third Tone Sandhi offers an excellent case for the study of the interaction between phonology and syntax.

In summary, Chinese differs from languages such as English in a number of ways (such as a lack of polysyllabic words, a small inventory of syllables, high homophone density, a dual vocabulary, and the use of distinctive tones). However, the difference is only apparent. Under careful analysis, Chinese also observes linguistic principles similar to those in other languages (such as foot structure, the behavior of heavy vs. light syllables, and the effect of frequency on syllable reduction).

See also: Chinese; Chinese as an Isolating Language; Chinese Linguistic Tradition; Foot; Phrasal Stress; Word Stress; Tone: Phonology.
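The basic Third Tone Sandhi rule described in this article, T3 T3 → T2 T3, can be sketched for a two-syllable expression as follows. This is a minimal illustration under stated assumptions: syllables are written in Pinyin with tone digits, the function name is hypothetical, and the example ni3 hao3 is supplied for illustration rather than taken from the text. Longer strings of third tones depend on foot formation and syntax and are deliberately not handled.

```python
def apply_third_tone_sandhi(first: str, second: str) -> tuple:
    """Apply T3 T3 -> T2 T3 to a pair of digit-marked Pinyin syllables.

    Only the two-syllable case is covered; the outcome for longer
    sequences of third tones depends on foot structure, which this
    sketch does not model.
    """
    if first.endswith("3") and second.endswith("3"):
        # The first third tone surfaces as a second tone
        return first[:-1] + "2", second
    return first, second


print(apply_third_tone_sandhi("ni3", "hao3"))
print(apply_third_tone_sandhi("ma1", "ma3"))
```

Restricting the function to pairs avoids implying a single fixed direction of application for longer expressions, where more than one output pattern is possible.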
Bibliography
Bybee J (2001). Cambridge studies in linguistics 94: Phonology and language use. Cambridge, UK: Cambridge University Press.
Chao Y R (1930). ‘A system of tone letters.’ Le Maître Phonétique 45, 24–27.
Chao Y R (1933). ‘Tone and intonation in Chinese.’ Bulletin of the Institute of History and Philology (Academia Sinica) 4, 121–134.
Chao Y R (1968). A grammar of spoken Chinese. Berkeley and Los Angeles: University of California Press.
Chen M (2000). Cambridge studies in linguistics 92: Tone sandhi: Patterns across Chinese dialects. Cambridge, UK: Cambridge University Press.
Cheng C C (1973). A synchronic phonology of Mandarin Chinese (Monographs on linguistic analysis no. 4). The Hague: Mouton.
Da J (2000). ‘Chinese text computing.’ Department of Foreign Languages and Literatures, Middle Tennessee State University, Murfreesboro, TN. Available at: http://lingua.mtsu.edu/chinese-computing/.
Duanmu S (1999a). ‘Metrical structure and tone: Evidence from Mandarin and Shanghai.’ Journal of East Asian Linguistics 8, 1–38.
Duanmu S (1999b). ‘Stress and the development of disyllabic vocabulary in Chinese.’ Diachronica 16, 1–35.
Duanmu S (2002). The phonology of Standard Chinese. Oxford: Oxford University Press.
Duanmu S (2004). ‘A corpus study of Chinese regulated verse: Phrasal stress and the analysis of variability.’ Phonology 21, 43–89.
Feng L (1985). ‘Beijinghua yuliu zhong shengyundiao de shichang’ [Duration of initials, finals, and tones in Beijing dialect]. In Lin T & Wang L J (eds.) 131–195.
Hoa M (1983). L’accentuation en pékinois. Paris: Editions Langages Croisés (distributed by Centre de Recherches Linguistiques sur l’Asie Orientale, Paris).
Lin M C & Yan J Z (1988). ‘The characteristic features of the final reduction in the neutral-tone syllable of Beijing Mandarin.’ In Phonetic Laboratory Annual Report of Phonetic Research. Beijing: Phonetic Laboratory, Institute of Linguistics, Chinese Academy of Social Sciences. 37–51.
Lin T & Shen J (1995). ‘Beijinghua er hua yun de yuyin fenqi’ [Variations in the [er]-suffixed rimes in the Beijing dialect]. Zhongguo Yuwen 1995, 170–179.
Lin T & Wang L J (eds.) (1985). Beijing yuyin shiyanlu [Working papers in experimental phonetics]. Beijing: Beijing University Press.
Lin Y H (1989). ‘Autosegmental treatment of segmental processes in Chinese phonology.’ Ph.D. diss., University of Texas, Austin.
Shen J (1985). ‘Beijinghua shengdiao de yinyu he yudiao’ [Pitch range of tone and intonation in Beijing dialect]. In Lin T & Wang L J (eds.) 73–130.
Shen J (1994). ‘Beijinghua shangsheng liandu de diaoxing zuhe he jiezou xingshi’ [Tonal patterns and rhythmic structure in successive third tones in the Beijing dialect]. Zhongguo Yuwen 1994, 274–281.
Shih C L (1986). ‘The prosodic domain of tone sandhi in Chinese.’ Ph.D. diss., University of California, San Diego.
Wang J & Wang L J (1993). ‘Putonghua duo yinjie ci yinjie shi chang fenbu moshi’ [The types of relative lengths of syllables in polysyllabic words in Putonghua]. Zhongguo Yuwen 1993, 112–116.
Wang L J & He N J (1985). ‘Beijinghua er-huayun de tingbian shiyan he shengxue fenxi’ [Auditory discrimination experiments and acoustic analysis of Mandarin retroflex endings]. In Lin T & Wang L J (eds.) 27–72.
Yan J Z & Lin M C (1988). ‘Beijinghua sanzizu zhongyin de shengxue biaoxian’ [Acoustic characteristics of the stress in Beijing trisyllables]. Fangyan 1988, 227–237.
Yang S A (1992). ‘Beijinghua duoyinjie zuhe yunlu tezheng de shiyan yanjiu’ [An experiment on the prosody of polysyllables in the Beijing dialect]. Fangyan 1992, 128–137.
Yip M (1980). ‘Tonal phonology of Chinese.’ Ph.D. diss., MIT, Cambridge, MA.
Yip M (2002). Tone. Cambridge, UK: Cambridge University Press.
Chinese as an Isolating Language
J L Packard, University of Illinois, Urbana, IL, USA
© 2006 Elsevier Ltd. All rights reserved.
If we use the term ‘isolating’ in what is perhaps its simplest and most often used sense, referring to whether the words of a language are mostly monomorphemic (see Classification of Languages), then Chinese can be considered only a moderately isolating language, because Chinese has at least as many multimorphemic words as it has monomorphemic ones. The term isolating, however, has also been used to refer to whether the morphemes of a language are clearly identifiable, as defined by the following properties: (1) whether morpheme boundaries in the language are sharply defined, (2) whether there is only a single distinct morphemic identity represented within a defined morpheme boundary space (i.e., the extent to which there is no overlapping exponence; see Classification of Languages), and (3) whether morphemes in the language have a single, invariant phonological form. If we define an isolating language by this identifiable-morpheme criterion, then Chinese scores relatively high on the ‘isolating language’ scale. Chinese can be profitably studied using both definitions of the term.
Isolating Defined as Having Monomorphemic Words The definition of isolating language as monomorphemic relies on whether words in a language appear without the obligatory affixation of grammatical morphemic information. This property was intended to contrast with languages such as Russian and Latin in which word roots are generally bound content forms that require affixation of grammatical morphemic information (indicating such properties as case, number, or gender) when they occur in context. For example the Russian root for ‘book’ (knig-) must be augmented with an inflectional ending that reflects case or number (knig-u book-ACC.SING; knig-i book-NOM.PL), and cannot appear as a bare stem in isolation. Languages like Chinese whose words occur without such obligatory grammatical marking are considered isolating because the words in such languages may appear in bare form without the necessity of adding morphemic information. The absence of obligatory affixation means that words in such languages will tend to contain fewer morphemes on average, giving rise to the monomorphemic word definition of isolating language.
Lin M C & Yan J Z (1988). ‘The characteristic features of the final reduction in the neutral-tone syllable of Beijing Mandarin.’ In Phonetic Laboratory Annual Report of Phonetic Research. Beijing: Phonetic Laboratory, Institute of Linguistics, Chinese Academy of Social Sciences. 37–51.
Lin T & Shen J (1995). ‘Beijinghua er hua yun de yuyin fenqi’ [Variations in the [er]-suffixed rimes in the Beijing dialect]. Zhongguo Yuwen 1995, 170–179.
Lin T & Wang L J (eds.) (1985). Beijing yuyin shiyanlu [Working papers in experimental phonetics]. Beijing: Beijing University Press.
Lin Y H (1989). ‘Autosegmental treatment of segmental processes in Chinese phonology.’ Ph.D. diss., University of Texas, Austin.
Shen J (1985). ‘Beijinghua shengdiao de yinyu he yudiao’ [Pitch range of tone and intonation in Beijing dialect]. In Lin T & Wang L J (eds.) 73–130.
Shen J (1994). ‘Beijinghua shangsheng liandu de diaoxing zuhe he jiezou xingshi’ [Tonal patterns and rhythmic structure in successive third tones in the Beijing dialect]. Zhongguo Yuwen 1994, 274–281.
Shih C L (1986). ‘The prosodic domain of tone sandhi in Chinese.’ Ph.D. diss., University of California, San Diego.
Wang J & Wang L J (1993). ‘Putonghua duo yinjie ci yinjie shi chang fenbu moshi’ [The types of relative lengths of syllables in polysyllabic words in Putonghua]. Zhongguo Yuwen 1993, 112–116.
Wang L J & He N J (1985). ‘Beijinghua er-huayun de tingbian shiyan he shengxue fenxi’ [Auditory discrimination experiments and acoustic analysis of Mandarin retroflex endings]. In Lin T & Wang L J (eds.) 27–72.
Yan J Z & Lin M C (1988). ‘Beijinghua sanzizu zhongyin de shengxue biaoxian’ [Acoustic characteristics of the stress in Beijing trisyllables]. Fangyan 1988, 227–237.
Yang S A (1992). ‘Beijinghua duoyinjie zuhe yunlu tezheng de shiyan yanjiu’ [An experiment on the prosody of polysyllables in the Beijing dialect]. Fangyan 1992, 128–137.
Yip M (1980). ‘Tonal phonology of Chinese.’ Ph.D. diss., MIT, Cambridge, MA.
Yip M (2002). Tone. Cambridge, UK: Cambridge University Press.
Chinese as an Isolating Language
J L Packard, University of Illinois, Urbana, IL, USA
© 2006 Elsevier Ltd. All rights reserved.
If we use the term ‘isolating’ in what is perhaps its simplest and most often used sense – referring to whether the words of a language are mostly monomorphemic (see Classification of Languages) – then Chinese can be considered only a moderately isolating language, because Chinese has at least as many multimorphemic as it has monomorphemic words. The term isolating, however, has also been used to refer to whether the morphemes of a language are clearly identifiable, defined by the following properties: (1) whether morpheme boundaries in the language are sharply defined, (2) whether there is only a single distinct morphemic identity represented within a defined morpheme boundary space (i.e., the extent to which there is no overlapping exponence; see Classification of Languages), and (3) whether morphemes in the language have a single, invariant phonological form. If we define an isolating language by this identifiable-morpheme criterion, then Chinese scores relatively high on the ‘isolating language’ scale. Chinese can be profitably studied using both definitions of the term.
Isolating Defined as Having Monomorphemic Words
The definition of isolating language as monomorphemic relies on whether words in a language appear without the obligatory affixation of grammatical morphemic information. This property was intended to contrast with languages such as Russian and Latin in which word roots are generally bound content forms that require affixation of grammatical morphemic information (indicating such properties as case, number, or gender) when they occur in context. For example, the Russian root for ‘book’ (knig-) must be augmented with an inflectional ending that reflects case or number (knig-u book-ACC.SING; knig-i book-NOM.PL), and cannot appear as a bare stem in isolation. Languages like Chinese whose words occur without such obligatory grammatical marking are considered isolating because the words in such languages may appear in bare form without the necessity of adding morphemic information. The absence of obligatory affixation means that words in such languages will tend to contain fewer morphemes on average, giving rise to the monomorphemic word definition of isolating language.
As it turns out, many (if not most) Chinese words are in fact dimorphemic, consisting of either (1) two free content morphemes (compound word), (2) one free and one bound content morpheme or two bound content morphemes (bound root word), (3) a free or bound content morpheme plus a word-forming affix (derived word), or (4) a free content morpheme plus an inflectional affix (grammatical word; see Packard, 2000 for further details). However, most dimorphemic Chinese words are either compound words or bound root words, and so the multimorphemic status of Chinese words is generally not due to the presence of affixation. Moreover, when Chinese words do contain affixes, they are never obligatory in the sense that they are required in the default case, as seen in the Russian example above. Chinese affixes are, nonetheless, sometimes obligatory in an alternative sense: if a property in question is selected to be expressed by the speaker, then the use of the affix concomitant with that property is a required element. Some common examples of this obligatory marking of an optionally selected property in Chinese are the use of classifiers with nouns, the marking of plural number on human pronouns, and the use of aspect marking on verbs. Classifiers are word-forming morphemes that are required when nouns are modified by a number and/or a determiner. For example, the noun shu ‘book’ generally occurs in context in bare form with no grammatical marking whatsoever. But when shu is modified by a number such as san ‘three’ or a determiner such as na ‘that,’ the classifier ben ‘volume’ must occur between the modifying element and the noun, yielding san-ben shu and na-ben shu for ‘three books’ and ‘that book’ respectively.
In the case of human pronouns, the personal pronouns wo ‘I/me,’ ni ‘you,’ and ta ‘he, she’ are obligatorily marked with the plural suffix -men when the referent is plural in number, to yield women ‘we, us,’ nimen ‘you (pl),’ and tamen ‘they, them.’ Verbs in Chinese may occur with inflectional suffixes that express various forms of grammatical aspect, that is, that refer to the activity profile of the event represented by the verb. For example, the verbal aspect marker -le (note that this is the -le that affixes to and has scope over the verb, and not the le that occurs in sentence-final position and has scope over the sentence) indicates that the event associated with the verb has been completed, the verbal aspect marker -guo indicates that the event associated with the verb has occurred at least once, and the verbal aspect marker -zhe indicates that the action represented by the verb is ongoing or continuous. In Chinese, the obligatory marking of a selected property as seen in classifiers, human plural
pronouns, and verbal aspect contrasts with cases in which the marking of a selected property is optional, as with plural marking on regular human nouns. When a human noun is transparently plural in number, the addition of the suffix -men, which would explicitly represent a plural number, is optional. For example, in both of the following examples the Chinese noun that translates into English as ‘teachers’ refers unambiguously to a set that contains multiple members.
(1) laoshi dou you shu
    teacher all have book
    ‘the teachers all have books’
(2) laoshimen dou you shu
    teacher-PL all have book
    ‘the teachers all have books’
Both examples refer to ‘teachers’ as a plural concept and are identical in meaning, but only the second overtly marks the plural number with the suffix -men. If Chinese is examined as an isolating language based on its use of monomorphemic words, it is worthwhile to consider in concrete terms where Chinese should be located on the monomorphemic word scale. The contemporary Chinese novel Shui Ru Da Di by Wen Fan (2004; Beijing: People’s Literature Publishing House) provides a typical sampling. If we examine the first 100 words in the third paragraph on page 16, we find that 51 of the words are monomorphemic (51% by token; 35 words or 47.3% by type), 45 are dimorphemic (45% by token; 35 words or 47.3% by type), and 4 contain more than two morphemes (4% by token; 5.4% by type). If counted by type, then, 47.3% of the words are monomorphemic and 52.7% are multimorphemic. In addition, the average number of morphemes per word token for that hundred-word sample is 1.54. This figure may be compared with the 1.06 morphemes-per-word cited for Vietnamese (perhaps the most purely isolating language using this criterion), 1.68 for modern English, and 3.72 for Eskimo (see Classification of Languages). In sum, if the concept of monomorphemic words is used as the defining criterion, Chinese must be considered only moderately isolating.
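The arithmetic behind these figures can be checked with a short calculation. The sketch below is only an illustration in Python: the variable names are invented here, the count of four long-word types is inferred from the reported 5.4%, and the total morpheme count is back-calculated from the reported mean of 1.54 rather than taken from the novel itself.

```python
# Token counts reported for the 100-word sample.
tokens_mono, tokens_di, tokens_longer = 51, 45, 4
total_tokens = tokens_mono + tokens_di + tokens_longer

# The text reports the mean directly; the total is back-calculated from it.
reported_mean = 1.54
total_morphemes = round(reported_mean * total_tokens)

# Morphemes that the four longer words must contain between them.
morphemes_in_longer = total_morphemes - (tokens_mono * 1 + tokens_di * 2)

# Type counts (the 4 long-word types are an inference from the reported 5.4%).
types_mono, types_di, types_longer = 35, 35, 4
total_types = types_mono + types_di + types_longer


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


mono_by_type = pct(types_mono, total_types)
multi_by_type = pct(types_di + types_longer, total_types)
longer_by_type = pct(types_longer, total_types)

print(total_morphemes, morphemes_in_longer)
print(mono_by_type, multi_by_type, longer_by_type)
```

On these assumptions the four longer words share 13 morphemes, which is consistent with, for instance, three trimorphemic words and one word of four morphemes.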
Isolating Defined as Having Clearly Identifiable Morphemes
To determine where Chinese belongs on the isolating language scale using the ‘identifiable morpheme’ criterion, the first property to consider is sharply
defined morpheme boundaries. In Chinese, morpheme boundaries are nothing if not clearly defined. There is generally no question where one morpheme ends and another one begins in any Chinese utterance. Even in cases of affixation in which the phonological form of the stem is affected, it is quite clear which part of the affixed word belongs to the stem and which part belongs to the affix. To illustrate, consider the following examples of -er (phonetically [er]) diminutive suffixation (data from Cheng, 1973; in IPA, tones not marked). The -er suffix often makes only a negligible semantic contribution to the derived word, but it is the affixation operation that has the greatest phonological effect in (Mandarin) Chinese. The -er suffix attaches to words with varying degrees of phonological effect on the stem and on the affix itself. In examples (1)–(3) of Table 1, the -er suffix is appended to the stem with the [e] vowel of the suffix dropped in favor of stem vocalic elements, and with no effect on the phonological form of the stem. In (4), the [e] vowel of the suffix is dropped and the stem final velar nasal [ŋ] is lost, but its nasality is retained in the form of nasalization on the stem nuclear vowel, that is, [ɑ̃]. In (5), the [e] vowel of the suffix is dropped and the stem final apical nasal [n] is lost, but its nasality is not retained as in (4). In (6), we see a stronger contribution from the suffix, since it retains its [e] vowel. In (7), the suffix is appended in unaltered form, and the stem final [n] is displaced. In (8)–(10), the suffix is appended in unaltered form, replacing various parts of the stem final, including its complete replacement in (9) and (10).
The examples in Table 1 demonstrate that even though suffixation of -er results in a good deal of phonological variability on both stem and affix, in all cases the resulting derived words contain phonological strings that can be unambiguously attributed to either the stem or the affix, and the phonological identities of the participating morphemes remain clear.
Table 1 Some phonological effects of -er suffixation
Thus, the sharply defined morpheme boundary aspect of the identifiable morpheme criterion for isolating language makes Chinese appear quite isolating indeed.
The second criterion for identifiable morphemes concerns overlapping exponence. ‘Overlapping exponence’ refers to the occurrence of more than one grammatical property within a single affix. For example, in the case of the -us ending on the Latin word lupus ‘wolf’, where the -us encodes both nominative case and singular number, there is no way to confer an independent phonological identity upon a portion of the -us suffix that encodes the nominative and a part that encodes the singular. In Chinese, there are no affixes that do such double duty by systematically encoding more than one grammatical meaning in a single affix. Therefore, Chinese is clearly an isolating language in view of this property.
The third necessary property of identifiable morphemes is invariance of phonological form. Chinese morphemes do commonly change from their citation phonological forms when they appear in context. Such phonological variation, however, is virtually always completely determined by phonological environment. This is in contrast with languages such as Russian and Latin, where allomorphic variation in general is grammatically conditioned, and generally occurs independent of phonological context. In Chinese, the shift from citation form usually involves tone sandhi, a phonologically conditioned change in lexical tone. Two tone sandhi rules from Mandarin, the L tone sandhi rule and the MH tone sandhi rule, provide an illustration (from Chen, 2000: 20, 27). Mandarin Chinese has four lexical tones: a high (H) tone, a mid-rising (MH) tone, a low (L) tone, and a high-falling (HL) tone. The L tone sandhi rule changes an L into an MH when the L precedes (i.e., occurs to the left of) another L. The MH tone sandhi rule changes a nonfinal MH into an H when it follows (i.e., occurs to the right of) an H or an MH.
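The two rules can be sketched as a pair of ordered rewrite steps. The Python below is a minimal illustration and not Chen's formalization: the function names are invented here, each rule is applied once over the whole tone string (every position is checked against the rule's input), and prosodic structure, which matters for longer strings of L tones, is ignored.

```python
def l_tone_sandhi(tones):
    """L becomes MH when the next tone is also L."""
    out = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == "L" and tones[i + 1] == "L":
            out[i] = "MH"
    return out


def mh_tone_sandhi(tones):
    """A nonfinal MH becomes H when the preceding tone is H or MH."""
    out = list(tones)
    # Start at 1 (a preceding tone is needed) and stop before the final tone.
    for i in range(1, len(tones) - 1):
        if tones[i] == "MH" and tones[i - 1] in ("H", "MH"):
            out[i] = "H"
    return out


def surface_tones(citation):
    """Apply L tone sandhi first, then feed its output to MH tone sandhi."""
    return mh_tone_sandhi(l_tone_sandhi(citation))


print(surface_tones(["MH", "L"]))      # citation tones MH L
print(surface_tones(["L", "L"]))       # citation tones L L
print(surface_tones(["H", "L", "L"]))  # citation tones H L L
```

Ordering the L rule before the MH rule is what lets an MH created by the first rule serve as input to the second.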
In (3), the citation tones for ‘to bury a horse’ are MH and L, and their surface realizations are the same as their citation forms. In (4), the tone on the word ‘buy’ in ‘to buy a horse’ changes from citation L to sandhi MH following the L tone sandhi rule, making utterances (3) and (4) completely homophonous.
(3) mai  ma
    MH   L     citation tones
    MH   L     sandhi tones
    bury horse
    ‘to bury a horse’
(4) mai  ma
    L    L     citation tones
    MH   L     sandhi tones
    buy  horse
    ‘to buy a horse’
(5) fen    shui  ling
    H      L     L      citation tones
    H      MH    L      sandhi tone forms (intermediate, nonrealized forms)
    H      H     L      sandhi tone forms (final surface forms)
    divide water mountain-ridge
    ‘watershed’
In (5), the citation L tone on shui changes to an intermediate, nonrealized sandhi MH tone in accord with L tone sandhi, and that intermediate sandhi MH value for shui acts as input into the MH tone sandhi rule, changing the nonrealized sandhi MH tone to a final surface H tone. From these examples it is clear that the phonological shape of Chinese morphemes does undergo considerable variation, but such variation is entirely a function of phonological context. To conclude, the reputation of Chinese as an isolating language is perhaps not so well-deserved if we rely merely on the monomorphemic word criterion, since
the preponderance of Chinese words are multimorphemic. But if our criterion is how easy the morphemes of a language are to identify and individuate, then Chinese scores rather high on the isolating language scale.
See also: Arabic as an Introflecting Language; Chinese; Chinese (Mandarin): Phonology; Chinese Lexicography; Classification of Languages; Finnish as an Agglutinating Language; Italian as a Fusional Language; Morphological Typology.
Bibliography
Chen M Y (2000). Tone sandhi: Patterns across Chinese dialects. New York/London: Cambridge University Press.
Cheng C (1973). A synchronic phonology of Mandarin Chinese. The Hague: Mouton.
Packard J L (2000). The morphology of Chinese. New York/London: Cambridge University Press.
Chinese Linguistic Tradition
G Casacchia, Università degli studi di Napoli ‘L’Orientale,’ Napoli, Italy
© 2006 Elsevier Ltd. All rights reserved.
In China, linguistic research started very early. Theoretical analysis, lexicography, and dialectology were cultivated by Chinese scholars before the foundation of the Empire (221 B.C.). On the other hand, some fields had to wait until they received a hint from foreign cultures. Phonology was born under the Indian influence in the 3rd century A.D., and grammatical studies began following some Western examples only in the 19th century.
The Beginnings
The idea that language is largely based on an agreement among human beings was established by Xunzi [Xunzi (Hsün-tzu) was one of the most outstanding Confucian philosophers and the author of a book of the same name] and the Moists [Moism is the doctrine founded by Mozi (Mo-tzu)]. According to the former, ‘names are attached to things once for ever, but this link is based on an agreement,’ and the latter added, ‘names are like painted tigers’ (i.e., just a pale image of the real thing). With the foundation of the Empire, things changed. The major philosopher of the Han dynasty, Dong Zhongshu [Dong Zhongshu (Tung Chung-shu) played a key role in establishing Confucian orthodoxy], adopted the opposite idea of linguistic realism: a natural bond exists between names and things. The Han penchant for the cosmological theory of the ‘five elements’ (everything in the world is connected with five principles and changes in accordance with them) kept them from adopting a nominalistic theory of language. It is impossible to overestimate the imprint left on linguistic studies by the ideographic writing system of Chinese. The ideograms (it is well known that ‘ideogram’ is far from ideal as a term for Chinese characters, but more precise terms, such as sinograms or logograms, are not yet widely used), which almost completely hide the phonological and morphological aspects of the language, and their importance as a tool of state administration produced a sort of ‘fetishism of the ideograms’ in the minds of cultivated people, who seldom dared to look at them more closely or to ‘open’ them to better understand their nature. Therefore, from the very beginning, the pioneering linguistic works in China were devoted not to analyzing speech and its component, the word, but to collecting and comparing ideograms. First, some dictionaries of difficult words appeared, including the Ji jiu pian (‘Quick performances’) and some others, among which the most important was the Er Ya (‘Perfection attained’), a list of 2016 words
grammatical system threw a bright light over most of the specific features of Chinese.
Current Situation
Currently, all branches of linguistics are fully developed in China due to the efforts of many scholars at the international level, including Lü Shuxiang, Zhu Dexi, and many others. The first stage, in the 1950s, was largely dominated by the influence of the Soviet Union. For example, in a major debate over ‘are there or are there not word classes in Chinese?’ the core question for a long time was actually ‘are there or are there not in Chinese the same word classes as in Russian?’ However, this stage was also open to influences from the West such as structuralism. During the second stage, in the 1960s and 1970s, sociolinguistics (how to simplify the writing system and how to teach the national language) dominated the field. The third stage, after the opening to the world and the development of social sciences, gave fresh impetus to modern linguistics. Nevertheless, the tradition still plays a role, and the strong points of Chinese linguistics remain
paleography, etymology, dialectology, grammar, and lexicography – much more than general linguistics.
See also: China: Writing System; Chinese; Chinese as an Isolating Language; Chinese Lexicography; Chinese (Mandarin): Phonology.
Bibliography
Baxter W (1992). A handbook of old Chinese phonology. New York.
Boltz W (1994). The origin and early development of the Chinese writing system. New Haven, CT.
Bottéro F (1996). Sémantisme et classification dans l’écriture chinoise. Paris.
Harbsmeier C (1998). ‘Language and logic.’ In Needham J (ed.) Science and civilisation in China, vol. 7, part 1. Cambridge.
Masini F (1993). ‘The formation of modern Chinese lexicon and its evolution toward a national language.’ Journal of Chinese Linguistics, monograph series No. 6.
Qiu Xigui (1990). Wenzixue gaoyao. Peking. [Mattos G L & Norman J (trans.) (2000). Chinese writing. Berkeley.]
Yip Po-ching (2000). The Chinese lexicon: a comprehensive survey. Hong Kong.
Chinese Lexicography
Li Ming, Soochow University, Suzhou, China
© 2006 Elsevier Ltd. All rights reserved.
A Brief Historical Survey
Lexicographical development in China can be divided into three stages. The rudimentary period followed a long and tortuous course from 200 B.C. to the 14th century. The inception of Chinese lexicography was closely related to the study of exegesis, the critical interpretation of ancient texts, which inevitably involved the analysis of meaning. It is in this connection that some scholars regard Yi ching (The book of changes) as the first dictionary in the Chinese language. However, the book hardly has any features typical of a modern dictionary. It is generally agreed that Erya (Near correctness), which came out in the Han Dynasty (206 B.C.–220 A.D.), consisting of 30 books about the accepted meanings of classic texts, was the earliest quasi-dictionary in China. The first Chinese dictionary in the modern sense of the word was Shuowen jiezi (The origin of Chinese characters) by Xu Shen in the Eastern Han Dynasty (25–220 A.D.). The book contains some 10 000 Chinese characters with
pronunciations and definitions. A few other books having characteristics of a dictionary or an encyclopedia in this stage paved the way for the advancement of Chinese lexicography, including books on Buddhism in the Tang Dynasty (618–907). The intermediate phase (from 1368 to 1700) was marked by the compilation of the handwritten Yongle dadian (The Yongle canon), a 60-volume encyclopedia compiled in 1408 in the Ming Dynasty (1368–1644). The most important lexicographical publications in this period are encyclopedic in nature, focusing on particular subjects. These include Bencao gangmu (A compendium of herbal medicine), 1578; Nongzheng quanshu (The encyclopedia of agriculture), a 60-volume work published in 1639; and Tiangong kaiwu (The exploitation of the works of nature), a three-volume encyclopedia about agriculture and handicrafts published in 1637. The last of these was later translated into French, Japanese, German, and English. The modern phase (from 1700 to the present) witnessed the publication in 1716 of a remarkable dictionary, Kangxi zidian (The Kangxi lexicon), with 47 035 Chinese characters. Since then,
thousands of dictionaries have been produced, most of them published after the 1900s. Some of the notable works are Siku quanshu (The complete library in four divisions), compiled between 1772 and 1782; Zhonghua da zidian (The great China character dictionary), with 48 000 characters, compiled in 1915; Ciyuan, an encyclopedic dictionary with almost 100 000 references, published in 1915; Guoyu zhengyin zidian (A Chinese character dictionary with revised pronunciations), published in 1926; Xinhua zidian (Xinhua dictionary of Chinese characters), published in 1957; Xiandai hanyu cidian (A dictionary of modern Chinese), published in 1978; Hanyu da zidian (An unabridged dictionary of Chinese characters), published in 1988–1990; Hanyu da cidian (An unabridged Chinese dictionary), compiled 1986–1993; and Zhongguo da baike quanshu (The encyclopedia of China), compiled 1980–1993. Robert Morrison published the first Chinese–English dictionary in 1815. A few other bilingual dictionaries came into being in the first two decades of the 20th century, and this category expanded rapidly from 1970 to the end of the century.
Recent Developments
Since the 1990s, a number of electronic dictionaries have been produced. Chinese dictionaries in the electronic media are chiefly focused on the ‘personal digital assistant’ (PDA) and other hand-held devices. There are a few online dictionaries and encyclopedias on CD-ROM, but they generally lack the multimedia functions that are the principal advantage of such dictionaries. For example, The encyclopedia of China has a four-CD version that is even larger in scope than Microsoft’s Encarta. It boasts over 50 000 pictures. The other media formats (sounds, video clips, and animations), however, are yet to be incorporated. Pocket e-dictionaries in China have their own advantages, though. Most of them offer two-way translation and can be updated easily. As for online dictionaries, they invariably present a much more extensive vocabulary than is found in paper dictionaries, and are revised and updated at irregular intervals. At one time, plagiarism was rampant, but after a celebrated lawsuit in 1993, Chinese copyright laws and intellectual property rights began to be respected. For several years, however, in the wake of this lawsuit, there were heated intellectual discussions in Chinese lexicographical circles about what constitutes a breach of copyright in dictionary-making and how to eliminate the problem.
Another significant development in Chinese lexicography has been a shift from prescriptivism to descriptivism. Take Xiandai hanyu cidian (A dictionary of modern Chinese), for example. This dictionary used to be (and in many ways still is) considered the authority on the Chinese language. It is a prescriptive model of how Chinese should be written. The new edition published in 1996, however, contains a number of colloquialisms and new borrowings from foreign languages. In recent years, corpus linguistics has been given more attention in China. Quite a few corpora, ranging in size from 1 million to 70 million words, have been established in Guangzhou, Shanghai, Beijing, and Nanjing. For instance, the general-purpose Chinese corpus at Tsinghua University contains some 50 million words and the Chinese Academy of Social Sciences has a corpus with 10 million words. As a result, corpus-based dictionaries have been compiled and are available to the general public.
Retrieval Systems
In Chinese, the character is the basic semantic unit, so that conventionally a distinction can be made between zidian, a dictionary of Chinese characters, and cidian, a dictionary of Chinese words comprising one, two, or more characters. The generic term for dictionaries and encyclopedias in Chinese is cishu, which also denotes any reference work. Since Chinese uses an ideographic writing system in which symbols represent ideas rather than sounds, the macrostructure of a Chinese dictionary is traditionally organized in brush-stroke order rather than alphabetical order. The retrieval of characters in Chinese dictionaries is achieved by means of radicals. The user checks the radical index for the page number of the character to be consulted. Then, in the microstructure, the user looks for the word or words grouped under that character. The radicals are arranged in ascending order of the number of strokes. The basic rules for brush-stroke order are as follows: (1) horizontal strokes precede vertical strokes; (2) downward–left curved strokes go before downward–right curved strokes; (3) from top to bottom; (4) from left to right; (5) from outside to inside; (6) middle strokes go before strokes on the sides. Due to the complicated nature of Chinese characters, in 1997 the National Working Committee on Chinese Speech and Writing issued a set of guidelines called the Standard Brush-Stroke Order of Modern Chinese Characters. In modern Chinese dictionaries, however, pinyin (a scheme for the Chinese phonetic alphabet in Roman letters) is adopted in addition to the radical index, so that the user can look up characters in the
alphabetical order of pinyin. As Chinese is a tonal language, the flat tone is conventionally given first for words having the same pronunciation but different tone, followed by the rising, falling-rising, and falling tones. In the case of homonyms, brush-stroke order applies. In the 1920s, Wang Yunwu (Y. W. Wong), the editor-in-chief of the Commercial Press Ltd. in Shanghai, invented his sijiao haoma jianzi fa (the four-corner system). Each Chinese character is given a four-digit number according to the brush strokes in the system. The user has to identify the elements of the four corners of the character in question, starting from the upper left-hand corner, then the upper right, next the lower left, and, finally, the lower right-hand corner. When the four digits have been determined, the user checks the four-corner index to find the page number for the character to be consulted. The system is difficult to handle at first, but once the rules are mastered it becomes easy. For various reasons, the four-corner system is out of fashion nowadays, at least in mainland China.
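The pinyin ordering described above (alphabetical by syllable, then by tone, then by brush strokes for homonyms) amounts to sorting on a three-part key. The Python sketch below is an illustration only: the Entry type and its field names are invented here, the stroke counts are supplied by hand for a few simplified characters, and stroke count stands in for full brush-stroke order, which a real dictionary would refine further.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    char: str
    syllable: str  # pinyin spelling without the tone mark
    tone: int      # 1 flat, 2 rising, 3 falling-rising, 4 falling
    strokes: int   # stroke count, used here as a proxy for brush-stroke order


def sort_key(entry):
    """Alphabetical by syllable, then tone order, then stroke count."""
    return (entry.syllable, entry.tone, entry.strokes)


entries = [
    Entry("骂", "ma", 4, 9),
    Entry("码", "ma", 3, 8),
    Entry("麻", "ma", 2, 11),
    Entry("马", "ma", 3, 3),
    Entry("妈", "ma", 1, 6),
    Entry("来", "lai", 2, 7),
]

ordered = sorted(entries, key=sort_key)
for e in ordered:
    print(e.char, e.syllable, e.tone, e.strokes)
```

Because Python compares tuples element by element, the stroke count only comes into play for entries such as 马 and 码 that share both syllable and tone.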
Standardization
The plethora of dialects in China makes it necessary to employ putonghua (Mandarin) as the official language and to standardize the spoken and written forms. A number of academic organizations in China are directly or indirectly involved with language standardization, for example the National Working Committee on Chinese Speech and Writing, the National Committee on the Standardization of Chinese, the National Committee on the Standardization of Scientific and Technical Terminology, the Standardization Administration of China, the China Language Modernization Society, the China Language Society, and the China Lexicographical Society. Three Chinese publishing houses specialize in dictionaries: Shanghai Lexicographical Publishing House, Hubei Lexicographical Publishing House, and Sichuan Lexicographical Publishing House. There used to be authoritarian control over which Chinese presses could publish reference works, but now such restrictions have been lifted. The Chinese journal Cishu yanjiu (Lexicographical studies) is devoted to issues related to all reference works.
Dictionary Types
Chinese lexicography in modern days covers every type in the field, ranging from desk dictionaries to pocket versions in size, from paper to electronic editions in media, and from monolingual to bilingual in language. Although written Chinese is homogeneous, the spoken form varies geographically. There are scores of such variations. To cope with this problem, and partly as an effort to preserve the cultural heritage, dictionaries have been compiled for the major regional dialects. It should be noted that hanyu (Chinese or Mandarin), the official language in China, literally means ‘a language spoken by the Han nationality.’ There are other languages used by minority ethnic groups in the country, such as Tibetan, Mongolian, and Weiwuer (Uyghur). Bilingual dictionaries in China therefore may be foreign language (FL)-Chinese, Chinese-FL, minority language (ML)-Chinese, or Chinese-ML. In fact, many ML-Chinese and Chinese-ML dictionaries have been published since 1950.
Markets Given the large population of China (1.29 billion in 2003), the market for dictionaries, both monolingual and bilingual, is vast and continues to expand. Over 400 million copies of the Xinhua dictionary of Chinese characters were printed between the first edition in 1953 and the 10th edition in 2004. Because there are tens of millions of English learners in the country, it is not surprising that most bilingual dictionaries sell readily. Xin yinghan cidian (A new English–Chinese dictionary), for example, has sold over 10 million copies since the 1970s. International publishing houses have been trying to enter the market since China joined the World Trade Organization in 2001. It can reasonably be expected that, with the improvement of literacy among the young generation and the expansion of college enrollments, the dictionary market will grow still further. See also: Bilingual Lexicography; China: Writing System; Chinese; Chinese as an Isolating Language; Chinese Linguistic Tradition; Chinese (Mandarin): Phonology; Corpora; Corpus Linguistics; Corpus Lexicography; Language Education Policy in China; Lexicography: Overview; Lexicology; Plagiarism; Thesauruses; Tone: Phonology.
Bibliography
Chen B (1991). [An introduction to dictionary compilation.] Shanghai: Fudan University Press.
Hu M et al. (1982). [An introduction to lexicography.] Beijing: Renmin University of China Press.
Huang J (1987). [On dictionaries.] Shanghai: Shanghai Lexicographical Publishing House.
Li K (1990). [A course in modern lexicography.] Nanjing: Nanjing University Press.
Lin Y (1992). [A brief history of Chinese lexicography.] Zhengzhou: Zhongzhou Ancient-Text Publishing House.
Yang Z et al. (1992). [Dictionary of lexicography.] Shanghai: Xuelin Press.
Zhang Y (2004). [A bibliography of papers on Chinese lexicography.] Shanghai: Shanghai Lexicographical Publishing House.
Bolton K & Hutton C (eds.) (2002). Western linguists and the languages of China, vols. 1–7: Chinese dictionaries, first series. Bristol: Ganesha Publishing Ltd.
Chien D (1986). Lexicography in China: bibliography of dictionaries and related literature. Exeter: University of Exeter Press.
Hartmann R R K & James G (1998). Dictionary of lexicography. London: Routledge.
Mathias J, Creamer T & Hixson S (1982). Chinese dictionaries: an extensive bibliography of dictionaries in Chinese and other languages. Greenwood Publishing Group, Inc.
Wilder G D (1987). Analysis of Chinese characters. New York: Dover Publications Inc.
Yang P F (1985). Chinese lexicology and lexicography: a selected and classified bibliography. Hong Kong: The Chinese University Press.

Relevant Websites
http://www.omniglot.com – Provides a guide to the Chinese writing system.
http://www.camsociety.org – For discussion of Chinese brush strokes.
http://www.pinyin.info – For information about Chinese phonology.
(5)  fen      shui     ling
     H        L        L        citation tones
     H        MH       L        sandhi tone forms (intermediate, nonrealized forms)
     H        H        L        sandhi tone forms (final surface forms)
     divide   water    mountain-ridge
     ‘watershed’
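The feeding relation in example (5) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the conditioning environments of the two rules are not spelled out in this passage, so the code assumes that L tone sandhi turns L into MH before another citation L, and that MH tone sandhi turns MH into H when an H or MH tone precedes and another syllable follows. The function names are hypothetical.

```python
def l_tone_sandhi(tones):
    """Assumed rule: L becomes MH when the next syllable has citation tone L."""
    out = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == "L" and tones[i + 1] == "L":
            out[i] = "MH"
    return out


def mh_tone_sandhi(tones):
    """Assumed rule: MH becomes H between a preceding H/MH and a following syllable."""
    out = list(tones)
    for i in range(1, len(tones) - 1):
        if tones[i] == "MH" and tones[i - 1] in ("H", "MH"):
            out[i] = "H"
    return out


# fen shui ling 'watershed', citation tones as given in example (5)
citation = ["H", "L", "L"]
intermediate = l_tone_sandhi(citation)   # output of the first rule, never pronounced
surface = mh_tone_sandhi(intermediate)   # first rule's output feeds the second rule

print(citation, intermediate, surface)
```

The point of the sketch is the rule ordering: the second function takes the output of the first as its input, so the MH on shui exists only as an intermediate value.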
In (5), the citation L tone on shui changes to an intermediate, nonrealized sandhi MH tone in accord with L tone sandhi, and that intermediate sandhi MH value for shui acts as input into the MH tone sandhi rule, changing the nonrealized sandhi MH tone to a final surface H tone. From these examples it is clear that the phonological shape of Chinese morphemes does undergo considerable variation, but such variation is entirely a function of phonological context. To conclude, the reputation of Chinese as an isolating language is perhaps not so well-deserved if we rely merely on the monomorphemic word criterion, since
the preponderance of Chinese words are multimorphemic. But if our criterion is how easy the morphemes of a language are to identify and individuate, then Chinese scores rather high on the isolating language scale. See also: Arabic as an Introflecting Language; Chinese; Chinese (Mandarin): Phonology; Chinese Lexicography; Classification of Languages; Finnish as an Agglutinating Language; Italian as a Fusional Language; Morphological Typology.
Bibliography Chen M Y (2000). Tone sandhi: Patterns across Chinese dialects. New York/London: Cambridge University Press. Cheng C (1973). A synchronic phonology of Mandarin Chinese. The Hague: Mouton. Packard J L (2000). The morphology of Chinese. New York/London: Cambridge University Press.
Chinese Linguistic Tradition G Casacchia, Università degli studi di Napoli ‘L’Orientale,’ Napoli, Italy © 2006 Elsevier Ltd. All rights reserved.
In China, linguistic research started very early. Theoretical analysis, lexicography, and dialectology were cultivated by Chinese scholars before the foundation of the Empire (221 B.C.). Other fields, however, developed only after a stimulus from foreign cultures: phonology arose under Indian influence in the 3rd century A.D., and grammatical studies began, following Western models, only in the 19th century.
The Beginnings The idea that language is largely based on an agreement among human beings was established by Xunzi [Xunzi (Hsün-tzu) was one of the most outstanding Confucian philosophers and the author of a book of the same name] and the Moists [Moism is the doctrine founded by Mozi (Mo-tzu)]. According to the former, ‘names are attached to things once for ever, but this link is based on an agreement,’ and the latter added, ‘names are like painted tigers’ (i.e., just a pale image of the real thing). With the foundation of the Empire, things changed. The major philosopher of the Han dynasty, Dong Zhongshu [Dong Zhongshu (Tung Chung-shu) played a key role in establishing Confucian orthodoxy], adopted the opposite idea of linguistic realism: a natural bond exists between names and things. The Han penchant for the cosmological theory of the ‘five elements’ (everything in the world is connected with five principles and changes in accordance with them) kept Han thinkers from a nominalistic theory of language.

It is hard to overestimate the imprint left on linguistic studies by the ideographic writing system of Chinese. The ideograms (it is well known that ‘ideogram’ is far from ideal when talking about Chinese characters, but more precise terms, such as sinograms or logograms, are not yet widely used) almost completely hide the phonological and morphological aspects of the language. This, together with their importance as a tool of state administration, produced a sort of ‘fetishism of the ideograms’ in the minds of cultivated people, who seldom dared to look at them more closely or to ‘open’ them to better understand their nature. Therefore, from the very beginning, the pioneering linguistic works in China were devoted not to analyzing speech and its component, the word, but to collecting and comparing ideograms. First, some dictionaries of difficult words appeared, including the Ji jiu pian (‘Quick performances’) and some others, among which the most important was the Er Ya (‘Perfection attained’), a list of 2016 words classified into 19 sections, without pronunciation, in the following way:

xu (wait)/si (wait)/ti (stop)/li (reach)/di (attain)/zhi (stop)/xi (wait): They mean dai (wait).

The logic of the Er Ya was to list difficult words together with simpler ones in order to explain the former by the latter and to provide the writer with many synonyms. For this reason, this kind of lexicographical work had great success, and approximately 150 similar books were compiled during the Empire, up to the past century. The first dialectological works were clearly inspired by the Er Ya. The Fang Yan (‘Local speeches’) collected dialectal words together with more widespread words to attain the same results as the Er Ya:

dang/xiao/zhe mean zhi (know). In Chu they say dang or xiao, and between Qi and Song they say zhe.

In the book, the key notion of ‘common speech’ is found for the first time, as a reflection of the unified Empire in the minds of its inhabitants. However, the main work of Han dynasty linguistics is another dictionary, the Shuo wen jie zi [‘Explanation of words’; wen refers to the simple ideograms (e.g., ‘tree’) and zi to the complex ideograms (e.g., ‘forest’, made of two trees to suggest plurality); shuo means ‘explain’ and jie ‘dissect’ or ‘analyze’], written by Xu Shen between 100 and 121 A.D. With nearly 10 000 monosyllabic words, it is a huge work, but its importance lies in several lexicographic inventions so well suited to the description of the Chinese language that they have lasted up to the present. First, all of the ideograms are divided into six categories: pictograms, symbols, associated logic elements, associated semio- and phono-elements, and two less important categories. The definitions are highly standardized. For instance, the first category is explained as ‘x means y. Pictogram,’ the third as ‘x means y. From a, from b,’ and the fourth as ‘x means y. From a, sounds as b’:

Ming niao sheng ye. Cong niao, cong kou.
Ming (to sing, of a bird) means voice of a bird. From niao ‘bird,’ from kou ‘mouth.’

Bing ming ye. Cong huo, bing sheng.
Bing (bright) means bright. From huo ‘fire,’ sounds as bing.

Furthermore, for the first time the pronunciation of the items is also given, by means of a homophonous ideogram and the formula x du ruo y ‘x is read as y.’ Moreover, a list of 540 graphic components was established, also for the first time (e.g., wood, water, word, ship, worm, and roof), in order to classify all the 10 000 graphs of the texts. These graphic components, reduced during the last dynasty to 214 [in Kangxi zidian, ‘Dictionary of the emperor Kangxi (K’ang-hsi),’ by Zhang Yushu, 1716], are still used in the dictionaries of today under the name ‘radicals.’ Besides the previously mentioned works, based on solid scientific grounds, the realist tradition of the Han also produced some lexicographical works worth mentioning. The best among them is the Shi ming (‘Names explained’) by Liu Xi:

Jing (‘view’) means jing (‘territory’): what eyes embrace within a certain territory.
Jin (‘brocade’) means jin (‘gold’): what is as precious as gold.

The theoretical basis of the work was the so-called sheng xun (‘phonological exegesis’), according to which any definition must be based on synonymy and homophony.

Figure 1 The 36 initial sounds divided into nine categories and printed on a hand as a memory aid. From Qie yun zhi zhang tu (‘Tables to understand the rhymes’) by Sima Guang, 10th century.
The Flourishing Age Contact with Sanskrit, brought about by the introduction of Buddhism into China from the 1st century A.D., finally prompted Chinese scholars to split the syllable represented by an ideogram into two parts, in order to begin a phonological analysis of their own language and to establish a better way of recording readings. Sun Yan (a scholar who flourished during the Wei dynasty, 220–265) was probably the first, in his Er Ya yin yi (‘Sounds and meanings in Er Ya’), to adopt a new system called fanqie (lit. ‘rotate and cut’) to analyze phonologically the words in the Er Ya. The fanqie system is one of the most outstanding achievements of the old linguistics. According to it, the reading of an ideogram is given through two other ideograms with no connection in meaning, the first giving the reading of the initial sound (sheng), and the second giving the reading of the final sound (yun):

dong = d(u) + (h)ong
dong (east) = d(u) (capital) + (h)ong (red)

Figure 2 A rhyme book, Yunjing (‘A mirror of rhymes’), by Zhang Lizhi (1161). The column at the top provides the initial sounds. The four columns below the first provide the final sounds, divided according to the ‘tones’ (the four black circles) and the ‘degrees’ (/-a-/, /-u-/, /-i-/, and /-y-/).
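The fanqie principle, taking the initial of the first ideogram and the final of the second, can be illustrated with a small Python sketch. It is a simplification under explicit assumptions: it works on toneless modern pinyin spellings rather than on the historical readings the system was designed for, and the inventory of initials and the function names are hypothetical choices made for this illustration only.

```python
# Assumed inventory of pinyin initials; two-letter initials come first so that
# "zh", "ch", "sh" are matched before "z", "c", "s".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_syllable(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # syllable with no initial consonant


def fanqie(upper, lower):
    """Combine the initial of the first speller with the final of the second."""
    initial, _ = split_syllable(upper)
    _, final = split_syllable(lower)
    return initial + final


# du 'capital' + hong 'red' spell dong 'east', as in the example in the text
print(fanqie("du", "hong"))
```

Because the spellers contribute only sound, any pair of ideograms with the right initial and final would serve equally well, which is why the two spellers need have no connection in meaning with the word they spell.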
This system was not only a brilliant invention for an ideographic writing system but also a very productive one: it gave birth to a new branch of research that spanned the whole imperial era and lasted until the last century. First, the yun shu (‘books of rhymes’), a large number of dictionaries arranged phonologically and grouping together words that share the same final sound, were written, mainly to help students in the imperial exams (in which one of the most important tests was writing poems) but also giving a new impulse to linguistics. The first book of rhymes was the Qie Yun (‘Rhymes according to the fanqie’) by Lu Fayan, written in 601 A.D. Second, by grouping together the words sharing the same initial sound, Chinese scholars derived 36 words that worked like the letters of the Latin alphabet: kiem (see) for /k-/, k’i (creek) for /k’-/, and so on. From a technical standpoint, the time was ripe to adopt a more rational system of writing than the ideographic one. Nevertheless, a reform that could have given Chinese an alphabet was never undertaken, for the good reason that a highly complicated system was one of the raisons d’être of the powerful officials who ruled the Empire.
Third, not only the initial sounds but also the final sounds were identified and classified, with slightly more than 100 ideograms distributed into four ‘degrees’ according to the openness of the first syllable: /-a-/ (‘open mouth degree’) as in ma (horse), /-u-/ (‘close mouth degree’) as in hu (lake), /-i-/ (‘teeth at the same level degree’) as in di (earth), and /-y-/ (‘round mouth degree’) as in lü (donkey). As a dignified conclusion to the classical linguistics period, in 1899 some chance discoveries gave birth to modern paleography. A huge number of ‘oracle bones’ (jiaguwen, at first called ‘dragon bones’) with inscriptions dating back to 1500 B.C. thoroughly revised the traditional ideas of paleography and etymology, which throughout the imperial era had been based on more recent graphs. For instance, for centuries the character for wang (‘king’) was considered ‘a trait d’union between heaven, earth and man,’ although it is simply the picture of an axe plunged into the ground. The study of jiaguwen is a developing science; to date, no more than approximately 60% of the old inscriptions have been deciphered.
The Reform The 20th century saw the birth of one more linguistic branch: grammar and syntax. The clashes and contacts with the West persuaded the Chinese that learning Western sciences was fundamental to saving and developing the country. The idea arose that Western techniques (i.e., grammar and syntax, mainly English, but also Latin and French) could save much of the time traditionally devoted to learning thousands of Chinese characters by heart, leaving more time to study the natural sciences and engineering. In 1898, the first grammar of classical Chinese, Ma shi wen tong (‘A grammar by Dr. Ma’) by Ma Jianzhong, was published, followed in 1921 by the first grammar of modern Chinese, Xinzhu guoyu wenfa (‘A new grammar of national language’) by Li Jinxi. Before then, no systematic description of the language could be found, apart from some general ideas such as the distinction between shizi (‘full words’) and xuzi (‘empty words’) or some research on particles. Later, many comprehensive grammars were produced, which despite their heavy debt to the Western grammatical system threw a bright light on most of the specific features of Chinese.

Figure 3 One page of the Kang xi zi dian (‘A dictionary of the emperor Kangxi’). Outside the frame are the words in the old writing system called xiao zhuan (‘small seal’). Within the page, the words are followed by the definitions and the sources quoted (in the circles).
Current Situation Currently, all branches of linguistics are fully developed in China, thanks to the efforts of many scholars of international standing, including Lü Shuxiang, Zhu Dexi, and many others. The first stage, in the 1950s, was largely dominated by the influence of the Soviet Union. For example, in a major debate over the question ‘are there or are there not word classes in Chinese?’ the real issue for a long time was ‘are there or are there not in Chinese the same word classes as in Russian?’ However, this stage was also open to influences from the West, such as structuralism. During the second stage, in the 1960s and 1970s, sociolinguistics (how to simplify the writing system and how to teach the national language) held the field. The third stage, after the opening to the world and the development of the social sciences, succeeded in giving an impulse to modern linguistics. Nevertheless, tradition still plays a role, and the strong points of Chinese linguistics remain paleography, etymology, dialectology, grammar, and lexicography, much more than general linguistics. See also: China: Writing System; Chinese; Chinese as an Isolating Language; Chinese Lexicography; Chinese (Mandarin): Phonology.
Bibliography
Baxter W (1992). A handbook of old Chinese phonology. New York.
Boltz W (1994). The origin and early development of the Chinese writing system. New Haven, CT.
Bottéro F (1996). Sémantisme et classification dans l’écriture chinoise. Paris.
Harbsmeier C (1998). ‘Language and logic.’ In Needham J (ed.) Science and civilisation in China, vol. 7, part 1. Cambridge.
Masini F (1993). ‘The formation of modern Chinese lexicon and its evolution toward a national language.’ Journal of Chinese Linguistics, monograph series No. 6.
Qiu Xigui (1990). Wenzixue gaoyao. Peking. [Mattos G L & Norman J (trans.) (2000). Chinese writing. Berkeley.]
Yip Po-ching (2000). The Chinese lexicon: a comprehensive survey. Hong Kong.
Chinese Lexicography Li Ming, Soochow University, Suzhou, China © 2006 Elsevier Ltd. All rights reserved.
A Brief Historical Survey Lexicographical development in China can be divided into three stages. The rudimentary period followed a long and tortuous course from 200 B.C. to the 14th century. The inception of Chinese lexicography was closely related to exegesis, the critical interpretation of ancient texts, which inevitably involved the analysis of meaning. It is in this connection that some scholars regard the Yi ching (The book of changes) as the first dictionary in the Chinese language. However, the book has hardly any features typical of a modern dictionary. It is generally agreed that the Erya (Near correctness), which came out in the Han Dynasty (206 B.C.–220 A.D.) and consists of 30 books about the accepted meanings of classic texts, was the earliest quasi-dictionary in China. The first Chinese dictionary in the modern sense of the word was the Shuowen jiezi (The origin of Chinese characters) by Xu Shen in the Eastern Han Dynasty (25–220 A.D.). The book contains some 10 000 Chinese characters with pronunciations and definitions. A few other books from this stage with characteristics of a dictionary or an encyclopedia, including books on Buddhism in the Tang Dynasty (618–907), paved the way for the advancement of Chinese lexicography. The intermediate phase (from 1368 to 1700) was marked by the compilation of the handwritten Yongle dadian (The yongle canon), a 60-volume encyclopedia compiled in 1408 in the Ming Dynasty (1368–1644). The most important lexicographical publications of this period are encyclopedic in nature, focusing on particular subjects. These include Bencao gangmu (A compendium of herbal medicine), 1578; Nongzheng quanshu (The encyclopedia of agriculture), a 60-volume work published in 1639; and Tiangong kaiwu (The exploitation of the works of nature), a three-volume encyclopedia about agriculture and handicrafts published in 1637. The last of these was later translated into French, Japanese, German, and English. The modern phase (from 1700 to the present) witnessed the publication in 1716 of a remarkable dictionary, the Kangxi zidian (The kangxi lexicon), with 47 035 Chinese characters. Since then,
Chino, Eiichi (1932–2002) J Tárnyiková, Palacký University, Olomouc, Czech Republic © 2006 Elsevier Ltd. All rights reserved.
Eiichi Chino, who taught at the Tokyo University of Foreign Studies (Gaikokugo daigaku) and was a former president of Wakō University, was known above all as a Japanese Bohemicist, a specialist in Czech philology, who studied and later taught the Czech language, wrote textbooks (co-author of Chekogo no nyūmon [Introduction to the Czech language], 1976, Tokyo: Hakusuisha), lectured on Czech language and literature, and published a series of essays on various aspects of Czech cultural and political life (Chino, 1990, 1997). Eiichi Chino translated into Japanese a variety of text genres, ranging from books for children (cf. František Hrubín, Kuřátko a obilí – Hiyoko to mugibatake [The little chicken in a field of grain]) to film scripts (the Oscar-nominated film Kolya), libretti (Leoš Janáček’s opera Osud – Unmei [Fate]), and a collection of essays by the leading Czech structuralist Jan Mukařovský (co-translator, see Mukařovský, 1982). What brought Chino fame and admiration, however, were above all his translations of canonical Czech literature. He had a lifelong affection for Karel Čapek, whose unique style found in Chino not only a sensitive and creative interpreter (Karel Čapek’s play ‘R.U.R.’ – Robotto, Kareru Chapekku. Tokyo: Iwanami Shoten, 1989) but also a keen director, who engaged his Tokyo students of Czech (at Gaikokugo daigaku) in staging the play. Preceding ‘R.U.R.’ were Chino’s own essays on Karel Čapek (Chino, 1975). Chino wrote epilogues to many of his translations, familiarizing Japanese readers with the Czech cultural setting. Among other Czech authors translated by Chino were Franz Kafka, Viktor Fischl, Milan Kundera (1992, 1998), and Ota Pavel (2000). In 2000, for rendering outstanding service to Czech culture (see also his articles and essays in Asahi Shimbun and a series of university lectures in the 1980s on the Prague Linguistic Circle), Eiichi Chino was awarded the Medal of Merit, First Grade, by President Václav Havel. His interest in Czech philology, though deep and emotional, always remained part of a broader interest in Slavic philology (including Old Church Slavonic, Russian, Polish, Slovak, Serbo-Croatian, and Bulgarian) and comparative linguistics (‘Studies of Non-Indo-European Languages’ in Miyaoka, O. (ed.), Languages of the North Pacific Rim: types and history. 1992. Tokyo: Sanseidō). Eiichi Chino was a co-editor of a 6-volume encyclopedia known as The Sanseidō encyclopaedia of linguistics (Chino, 1988–1996) and an author of many enlightening books on linguistics (Chino, 1980a, 1980b, 1994, 1999). These were preceded by studies on the Japanese language (Chino, 1977). Eiichi Chino’s multifarious activities reflect the variety of his studies, completed at three universities: the Tokyo University of Foreign Studies (Russian), the University of Tokyo (general linguistics), and Charles University, Prague (Slavic philology, Czech and general linguistics). As a professor at Wakō University, Chino wrote an essay entitled ‘Japanese – a treasure trove’ (Nihongo Kyoiku Tsushin, 16). The title is symbolic: Eiichi Chino’s legacy is a treasure trove of many and varied contributions to linguistics, literature, foreign language teaching (Chino, 1986), and translatology, and a challenge for those interested in this charismatic polyglot and his research (cf. the obituary by Vlasta Winkelhöferová (2002) ‘Odešel japonský bohemista Eiiči Čino.’ Dokořán 22, Obec Spisovatelů. [Bulletin of the union of writers.] Prague). See also: Japanese; Translation and Genre: Literary.
Bibliography
Chino E (1975). Poketto no naka no Chapekku [Čapek in the pocket]. Tokyo: Shōbunsha.
Chino E (1980a). Gengogaku no tanoshimi [The joy of linguistics]. Tokyo: Taishukan Shoten.
Chino E (1980b). Gengo no geijutsu [The art of words]. Tokyo: Taishukan Shoten.
Chino E (1986). Gaikokugo jōtacuhō [Methods of improvement in a foreign language]. Tokyo: Iwanami Shoten.
Chino E (1988–1996). Gengogaku daijiten [The Sanseido encyclopedia of linguistics] (6 vols). Takashi K, Rokurō K & Chino E (eds.). Tokyo: Sanseidō.
Chino E (1990). Birōdo kakumei no kokoro. Václav Havel [The heart of the velvet revolution. Václav Havel]. Tokyo: Iwanami Shoten.
Chino E (1994). Gengogaku e no hirakareta tobira [The door opened to linguistics]. Janua linguisticae reserata. Tokyo: Sanseidō.
Chino E (1997). Bīru to kohon no Puraha [The Prague of beer and old books]. Tokyo: Hakusuisha.
Chino E (1999). Kotobano jukaī [Through the thickets of words]. Tokyo: Seidosha.
Chino E et al. (1977). Kokugo kokuji mondai [Problems of Japanese characters]. Tokyo: Iwanami Shoten.
Kundera M (1992). Směšné lásky – Bisho o sasou ai no monogatari [Laughable loves]. Eiichi Chino, Mitsuyoshi Numano & Yoshinari Nishinaga (trans.). Tokyo: Shueisha.
Kundera M (1998). Nesnesitelná lehkost bytí – Sonzai no taerarenai karusa [The unbearable lightness of being]. Chino E (trans.). Tokyo: Shueisha.
Mukařovský J (1982). Cheko kōzō bigaku ronshū: biteki kinō no geijutsu shakaigaku/Jan Mukajofusuki [A collection of essays on the structure and aesthetic function of Czech/Jan Mukařovský]. Chino E & Hirai T (trans. and eds.). Tokyo: Shobo.
Pavel O (2000). Smrt krásných srnců – Utsukushii shika no shi [The death of beautiful roebucks]. Chino E (trans.). Tokyo: Kinokuniyashoten.

Chi-Nyanja
See: Nyanja.
Chiri, Mashiho (1909–1961) A Pérez Pereiro, Arizona State University, Tempe, AZ, USA © 2006 Elsevier Ltd. All rights reserved.
Mashiho Chiri was the first ethnic Ainu person to attend Tokyo Imperial University and was hailed as the Ainu genius, countering claims of Ainu inferiority to the Japanese. He was born in Hokkaido, Japan, in 1909 to an Ainu family and grew up in the town of Muroran. Despite his Ainu heritage, he grew up speaking Japanese rather than Ainu, because of the stigma then attached to the Ainu language and culture. He remarked that he later had to learn Ainu as a foreign language, as he might French or German, and that this pained him greatly.
Chiri’s aunt, Matsu Kannari, and his sister, Yukie Chiri, were Ainu speakers and informants of Kyosuke Kindaichi, the famous Ainu scholar. Kindaichi encouraged Mashiho to study and helped him enter a prestigious high school and, later, Tokyo Imperial University. Initially, Chiri entered the English literature department, but later transferred to the department of linguistics where Kindaichi was teaching at the time. Together with Kindaichi, he published his dissertation Ainu goho gaisetsu, where he provided a description of Ainu grammar. Chiri and Kindaichi had different interests with regard to their studies. While Kindaichi was primarily interested in the classical language of the epic yukar, Chiri, although also a scholar of the yukar, chose to focus on the colloquial varieties of the spoken
366 Chino, Eiichi (1932–2002)
(Slavic philology, Czech and general linguistics). As a professor of Wako¯ University, Chino wrote an essay entitled ‘Japanese – a treasure trove.’ (Nihongo Kyoiku Tsushin, 16). The title is symbolic: Eiichi Chino’s legacy is a treasure trove of many and varied contributions to linguistics, literature, foreign language teaching (Chino, 1986), and translatology – and a challenge for those interested in this charismatic polyglot and his research (cf. the obituary by Vlasta Winkelho¨ferova´ (2002) ‘Odesˇel japonsky´ bohemista Eiicˇi Cˇino.’ Dokorˇa´n 22, Obec Spisovatelu˚. [Bulletin of the union of writers.] Prague). See also: Japanese; Translation and Genre: Literary.
Bibliography Chino E (1975). Poketto no naka no Chapekku [Čapek in the pocket]. Tokyo: Shōbunsha. Chino E (1980a). Gengogaku no tanoshimi [The joy of linguistics]. Tokyo: Taishukan Shoten. Chino E (1980b). Gengo no geijutsu [The art of words]. Tokyo: Taishukan Shoten. Chino E (1986). Gaikokugo jōtacuhō [Methods of improvement in a foreign language]. Tokyo: Iwanami Shoten. Chino E (1988–1996). Gengogaku daijiten [The Sanseido encyclopedia of linguistics] (6 vols). Takashi K, Rokurō K & Chino E (eds.). Tokyo: Sanseidō.
Chino E (1990). Birōdo kakumei no kokoro. Václav Havel [The heart of the velvet revolution. Václav Havel]. Tokyo: Iwanami Shoten. Chino E (1994). Gengogaku e no hirakareta tobira [The door opened to linguistics]. Janua linguisticae reserata. Tokyo: Sanseidō. Chino E (1997). Bīru to kohon no Puraha [The Prague of beer and old books]. Tokyo: Hakusuisha. Chino E (1999). Kotobano jukaī [Through the thickets of words]. Tokyo: Seidosha. Chino E et al. (1977). Kokugo kokuji mondai [Problems of Japanese characters]. Tokyo: Iwanami Shoten. Kundera M (1992). Směšné lásky – Bisho o sasou ai no monogatari [Laughable loves]. Eiichi Chino, Mitsuyoshi Numano & Yoshinari Nishinaga (trans.). Tokyo: Shueisha. Kundera M (1998). Nesnesitelná lehkost bytí – Sonzai no taerarenai karusa [The unbearable lightness of being]. Chino E (trans.). Tokyo: Shueisha. Mukařovský J (1982). Cheko kōzō bigaku ronshū: biteki kinō no geijutsu shakaigaku/Jan Mukajofusuki [A collection of essays on the structure and aesthetic function of Czech/Jan Mukařovský]. Chino E & Hirai T (trans. and eds.). Tokyo: Shobo. Pavel O (2000). Smrt krásných srnců – Utsukushii shika no shi [The death of beautiful roebucks]. Chino E (trans.). Tokyo: Kinokuniyashoten.
Chi-Nyanja
See: Nyanja.
Chiri, Mashiho (1909–1961) A Pérez Pereiro, Arizona State University, Tempe, AZ, USA © 2006 Elsevier Ltd. All rights reserved.
Mashiho Chiri was the first ethnic Ainu person to attend Tokyo Imperial University and was hailed as the Ainu genius, countering claims of Ainu inferiority to the Japanese. He was born in Hokkaido, Japan, in 1909 to an Ainu family and grew up in the town of Muroran. Despite his Ainu heritage, the stigma associated with the Ainu language and culture at the time being what it was, he did not grow up speaking Ainu, but Japanese instead. He remarked that he would later have to learn Ainu as a foreign language, as he might French or German (German, Standard), and that this pained him greatly.
Chiri’s aunt, Matsu Kannari, and his sister, Yukie Chiri, were Ainu speakers and informants of Kyosuke Kindaichi, the famous Ainu scholar. Kindaichi encouraged Mashiho to study and helped him enter a prestigious high school and, later, Tokyo Imperial University. Initially, Chiri entered the English literature department, but later transferred to the department of linguistics where Kindaichi was teaching at the time. Together with Kindaichi, he published his dissertation Ainu goho gaisetsu, where he provided a description of Ainu grammar. Chiri and Kindaichi had different interests with regard to their studies. While Kindaichi was primarily interested in the classical language of the epic yukar, Chiri, although also a scholar of the yukar, chose to focus on the colloquial varieties of the spoken
language. After graduation, Chiri started teaching in Sakhalin, then a Japanese territory, and began studying the Sakhalin dialect of Ainu. The result was Ainu goho kenkyu, a grammar of this dialect. He was convinced of the importance of distinguishing between the different dialects and using them to understand the local culture where each dialect was spoken. He also made significant contributions to the study of geographical names, sometimes in collaboration with Hidezo Yamada, a renowned scholar of Ainu place names. In the course of his studies, Chiri also proposed that Ainu, still believed to be a language isolate, exhibits vowel harmony, a property associated with the Altaic languages. Chiri resented the subordination of the Ainu in Japan and the discrimination which they endured. He himself was also teased at school, being called ‘inu’ or dog, a common slur against the Ainu. He was also critical, perhaps excessively so, of what he considered to be ‘bad scholarship’ on the Ainu people. Although he was raised in a Japanized home, he sometimes claimed that non-Ainu scholars, especially those with whom he disagreed, did not understand the language and culture the way he did. In particular, he attacked John Batchelor and his famous dictionary, referring to it as a collection of errors. Chiri began what would be his own
magnum opus, the Classified Ainu dictionary, a complete dictionary of the Ainu language accounting for dialectal variations. In June of 1961, at the age of 52, Mashiho Chiri died after having written only three of the planned ten volumes. See also: Ainu; Batchelor, John (1853–1944); Japan: Language Situation; Japanese; Kindaichi, Kyosuke (1882–1971); Naert, Pierre (1916–1971).
Bibliography Chiri M & Kindaichi K (1936). Ainu goho gaisetsu (Outline of Ainu grammar). Tokyo: Iwanami Shoten. Chiri M (1942). Ainu goho kenkyu: Karafuto hogen o chushin to shite (A study of Ainu grammar with an emphasis on the Sakhalin dialect). In Reports from the museum of Sakhalin No. 4: Toyohara. Chiri M (1952). ‘Ainugo ni okeru boin chowa. (Vowel harmony in Ainu).’ Annual Reports on Culture and Science 1, 101–118. Chiri M (1953, 1954, 1962). Bunrui Ainugo jiten (Classified Ainu dictionary) (vols 1, 2, 3). Tokyo: Nihon jomin bunka kenkyujo. Chiri M (1956a). Ainugo nyumon (Introduction to the Ainu language). Sapporo, Japan: Nire Shobo. Chiri M (1956b). Chimei Ainugo shojiten (Small dictionary of Ainu placenames). Tokyo: Nire Shobo.
Choco Languages D Aguirre Licht, Universidad de los Andes, Bogotá, Colombia © 2006 Elsevier Ltd. All rights reserved.
Present Indians of Western Colombia
Colombia, a basically Spanish-speaking nation in the northwestern corner of South America, conserves a considerable number of Indian languages. Their speakers survived the colonization of the subcontinent by isolating themselves in rather desolate places far from the urban centers, where there were no mestizos and only the black population dared to enter. Of the immense mosaic of aboriginal languages thought to have existed when the Europeans arrived in what is today Colombian territory (a privileged crossroads for peoples moving from north to south and from south to north of the continent), about 90 peoples still survive. They are characterized as different from the Spanish-speaking majority because they maintain particular sociocultural characteristics, among them
a language of their own, as is the case for 65 of these 90 peoples. These languages cover the most diverse range of linguistic types, i.e., isolating, agglutinating, and inflecting languages, in keeping with the highly varied regions in which their speakers are found: desert zones, grasslands, jungles, coastal littorals, river littorals, foothills, and mountainous zones of both temperate and cold climates. Four of these Indian languages still survive in western Colombia, corresponding to four ethnic groups that continue to preserve their own cultural characteristics, such as their language and, therefore, their particular way of thinking, or worldview. The first of these four languages is Tule, of the Chibchan linguistic family, the speakers of which are known as ‘Cunas.’ They occupy the extreme northwestern part of the country, in the Golfo de Urabá (where there are no more than 1000 individuals), but the majority are found in the neighboring country of Panama, in the San Blas Islands (around 40 000 individuals), to which they have been migrating for more than half a century. The second language is Awa or Awa-Cuaiquer,
classified as an independent language, whose speakers are thought to number about 4000 and are located in the extreme southwestern part of the country (in the department of Nariño) and in smaller numbers in the neighboring country of Ecuador. The third and fourth languages are Waunméu (Woun Meu) and Embera, which belong to the so-called Choco language group, which has only recently been classified as an independent linguistic family. Waunméu is spoken by the Waunanas, who number around 4000 individuals in Colombia, along the lower San Juan River, in the south of the department of Choco, plus no more than 2000 individuals who have migrated to the Province of Darién, in Panama. Embera is spoken by the Indians who call themselves Emberas but who are known by different names in the literature because they constitute a much larger number of speakers (around 60 000) divided among various dialects. The Emberas are dispersed throughout the western part of Colombia and even in the frontier zones of Panama and Ecuador, and some of these dialects have grown so far apart that they are now mutually unintelligible. Of course, this is only a modest sample of the much greater number of ethnic groups that inhabited the western part of Colombia when the Europeans arrived, among which we can recall the names of the Idabáez, Ingarás, Birus, Surrucos, Poromeas, and the present-day Kunas, Waunanas, Katíos, or Emberas. Some of the denominations applied to the Indians then generally known as ‘Choco’ or ‘Chocoes’ were the Andáguedas, Baudó, Chamís, Dabeibas, Dariens, Katíos, Noanamás, and Saijas. Nowadays it is known that these names are derived from the names of the regions inhabited by these groups, which generally took the name of the main river that crossed their territory; the name ‘Katío’ derives from the fact that the Embera Indians eventually occupied the region of the Katío Indians, a brave warrior tribe that succumbed to the Spanish.
The Embera Indians occupy a much greater territory today than they did at the time of the arrival of the Europeans, but their coverage is highly fragmented, i.e., limited to scattered points of small extent. Mestizo settlers displaced them to these Indian reservations, called ‘Resguardos’ or ‘Cabildos,’ which were very effective in the colonial period in preventing the extinction of these peoples, by impeding their occupation by outsiders but obliging the Emberas to give up the extensive territories in which they had once freely roamed. In this article we see the different dialects into which the Embera language is presently divided. These dialects are a product of the different regions in which the Embera Indians have settled since the
arrival of the Spanish, in different latitudes of the continent but always limited to a fringe that extends from the western littoral (the Pacific coast of Colombia, from north to south) to the Cauca River, which separates the western and central cordilleras. These stretch from north to south along the country, together with the eastern cordillera, the final branches of which disappear as they enter the Caribbean region of Colombia. Thus, the territory that the Choco Indians occupy consists of the Pacific coast of Colombia, with its jungle plains; the Province of Darién, in Panama; and the spurs of the western cordillera and its terminal branches to the west of the Cauca River.
Retrospective of Linguistic Studies and Attempts to Classify the Emberas
The inclusion within a single linguistic family of the speech of the different Choco groups (the Waunana language and the different Embera dialects) that survived the Conquest and the colonial period is a recent fact. Their classification within any one of the great variety of American linguistic families is still open to discussion. In the literature on the country’s Indians, there is abundant documentation on population and migrations of the Choco, from chroniclers like Fray Pedro Simón, Bartolomé de Las Casas, Jorge Robledo, Juan de Castellanos, and Pedro Cieza de León, to recent researchers like Henry Wassen, Kathleen Romoli, Reina Torres de Arauz, Sven Isacsson, Mauricio Pardo, and Patricia Vargas. The last two, who are Colombian authors, have advanced research on the Emberas, having reviewed all previous studies. In his article ‘Bibliografía sobre indígenas Choco’ (1981), for example, Pardo did an excellent review of the ethnohistoric literature available to date, and in ‘Regionalización de indígenas Choco’ (1987), he updated the discussion of the ethnohistoric panorama. Vargas (1986), on the other hand, found that the incursion of the Emberas into the territories of the Katío Indians did not mean the total extinction of the latter, because the two peoples intermingled, which is why the present Emberas of the region present particular characteristics that could be assigned to the Katíos. The term Choco was already used in the 17th century to designate the Emberas of the upper San Juan and Atrato rivers and the Waunanas of the lower San Juan River. The earliest report known about the Emberas is found in the diary of the missionary Father Joseph Palacios de la Vega, around 1787, in San Cipriano, on the San Jorge River.
This linguistic material, consisting of 37 phrases and 107 morphemes, fundamentally corresponds to the speech of the present Emberas of the northeast (Reichel-Dolmatoff, 1955).
A series of vocabularies was later collected by travelers, mostly foreigners, in different Choco Indian localities (Mollien, 1824; Cullen, 1851; Seeman, 1851; Bastian, 1876; Greiffenstein, 1878; Collins, 1879; White, 1884; Peláez, 1885; Etiene, 1887; Simons, 1887; Pinart, 1887; Velásquez, 1916; Robledo, 1922). These materials fundamentally served as the basis for analysis and classification until the middle of the 20th century. But there have also been comparative studies since the 19th century: Bollaert (1860) proposed affinities between the Choco and Mesoamerican groups; Adam (1888) compared vocabularies obtained by Cullen, Seeman, and Uribe; Brinton (1891) observed the territorial extension of the speech of the Choco; Chamberlain (1907) determined that the geographical limits of the Choco were between 8 and 4 degrees north latitude, between the Golfo de Urabá and the Golfo de San Miguel, and proposed Choco as an independent linguistic group; Lehmann (1910, 1920) suggested kinship with the Chibcha dialects of the Barbacoas and Talamanca groups; Loukotka (1968 [1942]) reaffirmed the separation of these languages as an independent linguistic family and recognized nine extant languages and five extinct languages; Rivet (1912, 1924, 1943) compared elements of the Choco vocabulary with 56 Caribe dialects, 34 Chibcha dialects, and 29 Arawak dialects and concluded that there was a strong Caribe influence and, to a much lesser degree, Chibcha and Arawak influence; Ortiz (1937, 1940, 1954, 1965), Mason (1950), Meillet and Cohen (1952), and Tovar (1961) followed Loukotka’s regionalization and Rivet’s affiliation. The first attempts to classify the native languages of America were made in the second half of the 20th century.
At the beginning of the 20th century, 19 independent language families were mentioned for the Pacific coast, including the Choco family (see, for example, the classifications of Alexander Chamberlain [1913] for the linguistic families of South America). Later researchers, such as Paul Rivet (1944), reduced this number and proposed the inclusion of the Choco family within other macrofamilies, like the Chibcha or the Caribe. At present, in light of recent linguistic explorations, the thesis of the independence of this family seems to be the most reliable, vindicating its defenders, among whom, in addition to Chamberlain, we can name Nordenskiold (1928), Loukotka (1968 [1942]), Tovar and Larrucea (1984), and Pardo and Aguirre (1993). The cultural unity and the common origin of the Choco Indians were the subject of controversy for a long time. Mason’s classification (1950) (broadened with that of Greenberg, 1960), for example, divided the Choco languages into Empera, with 3
variants; Catío (Embera-Catío), with 14 variants; and Noanamá (with 1 variant). Loukotka (1968 [1942]) had already spoken of 9 extant and 5 extinct Choco languages, and later Loukotka and Rivet proposed 10 extant variants for the Choco group (which they call the Emperá division) and 2 extinct ones (see Ortiz, 1965: 197–200). Jacob Loewen (1960) confirmed Erland Nordenskiold’s statement, testifying that linguistically only two languages (Waunana and Embera), mutually unintelligible but nonetheless related, belonged within the Choco family, and proposed, with a phonological criterion, four large dialectal areas, one Waunana and three Embera, with lexical variations within the Embera areas. Among the main bibliographical compilations on the Choco languages were those of Adam (1888), with 7 references; Lehmann (1920), with 30 references; Reichel-Dolmatoff (1945), with 38 references; Ortiz (1954), with 60 references; Loewen (1963), with 191 references, which were not only linguistic but historical as well; Ortega (1978), with 67 references; and Pardo (1981), with 72 references, and (1986), a survey of everything written on the subject to date, with 135 references, including everything from academic studies to simple lists of words. There have been grammatical studies of the Embera language since 1881, when José Vicente Uribe published a brief article in which he presented the different types of Embera words in a general way. In 1936, Fray Pablo del Santísimo Sacramento published a grammatical essay on the speech of the Embera-Catíos of the Apostolic Prefecture of Urabá, as well as a classification of Embera, in which he dedicated a small portion to the syntax of the language. In 1918, an anonymous Catío-Spanish catechism appeared for missionaries of Antioquia.
There is also an undated Catía grammar by María Betania (quoted in Pinto, 1974), and the Claretian priest Constancio Pinto published a Catío-Español dictionary (1950), as well as another extensive dictionary with grammar (1974). Scientific studies based on fieldwork began with the research of Jacob Loewen, an American Mennonite missionary who did a study for a master’s degree (1954) among the Waunana Indians of the lower San Juan River, and a doctoral thesis (1958) on the speech of the Emberas of the Sambú River, in the province of Darién in Panama. Loewen also wrote numerous articles on Embera phonology and dialectology, on comments on traditional stories, on loans from Spanish, on problems of bilingual literacy programs, and on basic readers in Indian languages. Jean Caudmont (1955) elaborated notes on phonological and grammatical generalities through the use of field notes of Reichel-Dolmatoff (1945), taken 10 years earlier among the Embera group in Riofrío
in the department of Valle, who had emigrated from the region of the Chamí. The Claretian missionary Constancio Pinto, who lived with the Emberas of the region of the Chamí (headwaters of the San Juan River) for more than 40 years, published a dictionary (1950) of the Embera language, as well as a book with a much more extensive vocabulary and with a section on grammar (1974). Despite being based on methodical fieldwork, these studies suffered from two weaknesses: they were transcribed using Spanish-language phonetics, and they mixed together, especially in the work of Pinto, words from zones like the Chamí, the Andágueda, the Sinú, and the Atrato without taking the dialectal variations into account. With more linguistic precision, the Swedish researcher Nils Holmer (1963), in one of the publications of the Gothenburg Ethnographic Museum, occupied himself extensively with phonological and morphological aspects of Waunana. The Summer Institute of Linguistics (SIL), which arrived in the country in 1962, carried out linguistic studies in distinct zones inhabited by Embera Indians. Francés Gralow (1976) elaborated a phonological description for the Chamí zone. In the 1970s, Eileen Rex and Mareike Schotlenndreyer traveled throughout the municipalities of Dabeiba, Frontino, and Chigorodó, in the department of Antioquia, and the upper Sinú, in the department of Córdoba, and published a phonology (1973) of the speech of the Emberas of the upper Sinú and northwestern Antioquia. Schotlenndreyer developed a literacy primer for the zone of Chigorodó (1973) and a structural analysis of her and Rex’s stories (1977). Eileen Rex wrote her master’s thesis on the Catía grammar (1975).
Phillip Harms developed basic readers on the Embera language and tales and stories in the company of the natives from 1981 to 1985, for the Emberas of the Saija River on the coast of the department of Cauca, to the south of the department of Choco, and carried out a phonological description with Judy Powell (1984) and a grammatical study of the speech of these Emberas (Harms, 1987). Powell also mimeographed some Embera stories and biblical passages. David Stansell, who lived for more than 10 years among the Emberas of the Bojayá River in the department of Choco, wrote about them in Aspectos de la cultura material de grupos étnicos de Colombia (1973). Michael and Nellis (1984) produced primers in Chamí for the Emberas of the Valle de Garrapatas in the department of Valle del Cauca. Gordon Horton worked with the Emberas of the upper Sinú in the 1960s and 1970s, mainly on the morphology of the language, and developed a series of primers and other didactic materials. Miguel Loboguerrero carried out a linguistic study (1976) on
the dialect of the Chamí region; his resulting work included a phonological description, a grammatical description, and a corresponding lexicon. Nelly Mercedes Prado did an analysis of the ‘Epera’ variant (‘Embera,’ according to the phonology of this dialect) of the Saija River (1982) as her master’s thesis, the presentation of which included phonology, morphology, and an appendix titled ‘Un estudio inicial,’ along with a lexicon of 845 items, each with its respective phonetic transcription. She continued her work with the publication of didactic materials (1985), further explored aspects of the language, such as nasality (1991), and later worked on ethnolinguistic conflicts between blacks and Indians (1992) within the broader project known as ‘Cada río tiene su decir.’ Many missionaries working in Embera territories have concerned themselves with the language. One primer on the language of the Emberas of the upper San Juan, with an alphabet, was developed by G. Manzini (1973); another primer, on the Katía variety, was designed by Martínez and Guisao (1980). There is a catechism in the Baudó dialect (1981) and a primer by Livia Correa (1982), as well as one, by María L. Picón (1985), on the Istmina region. For the Waunanas, in addition to Holmer’s studies, there was a phonological and grammatical study done by the Sacred Heart missionaries Sánchez and Castro (1977), with the advice of Reinaldo Binder of the SIL, and a monograph by Luz Lotero (1972). The Embera Waunana Regional Organization (OREWA) of Choco wrote a manual for indigenous teachers (1987), within the framework of its newly initiated ethnoeducation program. Mauricio Pardo has done phonological and grammatical descriptions of the Embera language in northwestern Antioquia and the zone of the upper Baudó River in the department of Choco.
With his participation in workshops with teachers from Baudó and in 1983 in northeastern Antioquia, an era of studies began that committed both Indians and researchers to a common cause in the application of the results of linguistic studies. In 1986, Pardo proposed, together with the author of this article, a revision of the Choco dialectology established by Loewen (1963) (see next section of this article), and has done an extensive compilation of the publication of linguistic data on this language up to 1986. This author has also concerned himself with the elaboration of language primers and sociolinguistic aspects of the ethnic group.
Present Regionalization of the Embera Indians
The first Indians denominated ‘Chocos’ by the Spanish were the Emberas of the upper San Juan River,
who were then known as the Simas or the Tatamás. These Indians today call themselves Chamí. This denomination would later be applied to all indigenous groups of the upper Atrato River, in the department of Choco, then known as ‘Citará’ or ‘Citarambirá,’ and to the Indians of the middle and lower San Juan, respectively called ‘Poya’ and ‘Noanama’ in the 17th century. Based on these points, registered in colonial papers, and respecting the linguistic data obtained from present settlements, one can attempt to reconstruct the dispersion of the Chocos (see Figures 1 and 2). Most of the Chamí are located along the upper San Juan River, in the Risaralda municipalities of Mistrató and Pueblorico, on the border with Choco. They have moved northward and southward along the cordillera to places like the upper Andágueda River, in southeastern Choco, to the southwestern part of the department of Antioquia in the municipalities of Jardín, Valparaíso, and Bolívar, and to the northern part of the department of the Valle del Cauca along the Garrapatas and Sanguininí rivers. Small groups are also located in other parts of Antioquia and Valle and have even moved down into the departments of Caquetá and Putumayo. Those who were called Citarás or Citarambirás during colonial times, then located along the upper Atrato River, on the Capá River, in Lloró, and along the lower Andágueda River, have moved northward along the river to the upper Baudó River, toward the coastal tributaries to the north of Cabo Corrientes and the Panamanian portion of Darién. These river-dwelling Indians are known as ‘Cholos’ on the Pacific coast of Colombia.
Because these people form a distinct dialectal zone and because they are generally considered a mountain group, researchers believe the Indians who presently occupy territories in northeastern Antioquia (in Dabeiba, Frontino, Ituango, and Murrí, among other places) and in the department of Córdoba (in the upper Sinú, San Jorge River, Rioverde, etc.) must descend from Emberas who, after the Conquest, settled along the eastern tributaries of the middle course of the Atrato River, a group different from the Citarás. These Indians are erroneously known as ‘Katíos,’ but colonial documents imply that the real Katíos succumbed toward the end of the 17th century, after a terrible struggle with the Spanish. Vargas (1990) postulated, based on archival documents, that many Katíos united both in alliance and in war with the Emberas. The Indians encountered by the Spanish in the middle San Juan River, whom the Spanish called ‘Poyá,’ are believed to be the ancestors of the present dwellers of the middle Baudó River, along the tributaries Catrú and Dubasa and their surroundings. The Poyá presented a dialectal
Figure 1 Current Choco Dialectology. Reproduced from Pardo M (1987). ‘Regionalización de indígenas choco.’ In Revista del Museo del Oro, Boletín 18, January–April. Bogotá: Museo del Oro. 46–63.
difference with the ones from the upper Baudo´ River. These people called themselves Emberas to differentiate themselves from the mountain people, who were called Katı´os. The Indians presently located to the south of Buenaventura also descended from the Poya´ s, whose main settlements are along the Saija River (department of Cauca), and the Satinga and Saquianga rivers
372 Choco Languages
Figure 2 Choco Dispersion. Reproduced from Pardo M (1987). ‘Regionalizacio´n de indı´ genas choco.’ In Revista del Museo del Oro, Boletı´ n 18, January–April. Bogota: Musco del Oro. 46–63.
(department of Narin˜ o) (Pardo, 1987). They call themselves ‘Eperas,’ in accordance with the phonology of their dialect. In the department of Caldas, there are settlements of Embera Indians, known to the rest of the popula-
tion as ‘Memes.’ They live in municipalities such as Belalca´ zar, Vitervo, and Riosucio, in places like La Betulia, La Tesalia, and the Indian reservations of San Lorenzo and Nuestra Sen˜ ora de la Montan˜ a. Some are Indian reservations with reserved territory, while others such as Can˜ amomo and Lomaprieta are in the process of becoming reservations (these are called ‘partialities’). In addition to the problem of vindicating their own identity as a separate ethnic group, they have encountered major difficulties for having lost their native tongue, but nonetheless they are at present actively committed to carrying out programs to recover their language with the help of native speakers from other regions. The Emberas who settled along the lower San Juan River and its tributaries, along the Jurado´ , Jampavado´ , Docampado´ , and Siguirisu´ a in southern Choco, and along the San Juan de Micay River in Cauca were called ‘Nonama´ ’ or ‘Noanama´ ’ ever since the invasion, but they call themselves ‘Waunana’ or ‘Wauna´ n.’ Over the course of a century they have migrated to the province of Darie´ n in Panama, where 2000 now reside, and to the Chintado´ River along the lower Atrato, where there are several hundred who migrated some 20 years ago. There are estimated to be about 4000 native speakers of Waunana in Colombia. Like the Emberas, they are known as ‘Cholos.’ The Waunanas and the Emberas are the only two ethnic groups that can clearly be identified as presently forming part of the Choco family. In 1988, the author of this article, together with the anthropologist/researcher Mauricio Pardo, presented a proposal for the regional classification of the Choco Indians – a revision of that proposed by J. Loewen – based on the different dialects encountered during fieldwork in the different zones with Choco Indians in Colombia. Some samples to support this proposal are presented below. 
These are taken from personal fieldwork notes and first appeared in an article entitled ‘Dialectología Choco’ in the proceedings of the seminar-workshop ‘Estado actual de la clasificación de las lenguas indígenas de Colombia,’ held in February 1988 at the Instituto Caro y Cuervo in Bogotá (Pardo and Aguirre, 1993). To begin, a diagram showing the present linguistic varieties and their local denominations is presented (Figure 3). The proposal of Jacob Loewen is then presented (Table 1), followed by the Pardo-Aguirre proposal (Table 2). After that, the zones and specific places identified by Pardo and Aguirre are presented in detail (Table 3), along with a global diagram of said zones (Figure 4). Finally, phonological and grammatical comparisons of the Waunana language and the different dialects proposed for the Embera language (as well as among the latter) are shown (Tables 4–6).
Figure 3 Phylogenetic tree of the Choco linguistic varieties (local denominations).
Table 1 Choco phonological systems (according to Jacob Loewen)
Note: As can be seen, Loewen proposed 4 phonological systems and dialect subdivisions at the lexical level within them. Nonetheless, the recent data show that at least 6 different systems can be identified: 1 for Waunana and 5 for the different Embera dialects.
Present State of Studies on the Embera Language
Colombia, like the other Latin American countries, with all the richness that multiculturalism and plurilingualism represent, has only recently given attention to its aboriginal languages. There is still no official position on the defense of these languages and their speakers, who have survived thanks to their own struggle and the support of a sector of the civil population. Only during the last 20 years of the 20th century were the Indian and Afro-Colombian languages still alive in the national panorama taken seriously by academia. In 1984, the Anthropology Department of Andes University instituted a master's program in ethnolinguistics, with the sponsorship of the Centre National de la Recherche Scientifique (CNRS) of France. In the program, researchers are prepared for the study of the native and Afro-Colombian languages, their eventual goals being publication, conservation, and strengthening of these languages. The program’s students and professors constitute the Centro Colombiano de Estudios de Lenguas Aborígenes (CCELA), through which they carry out the scientific work of rescuing and strengthening these languages. With these student linguists, a new era in the research and promotion of the aboriginal and creole languages of the country has begun; they cover the entire national territory, doing fieldwork and linguistic data analysis in situ. This has raised awareness of these communities among the rest of the Colombian population and even among the communities themselves.
Table 2 Choco phonological systems (based on recent data)
a e i o u u
For the South Coast, there is a sixth vowel, which is the e (oral only).
a Data from Mejía (2000b). b Data from Prado (1991). c Data from Pardo (1985a). d Data from Aguirre (1995a).
Table 3 Details regarding the zones and specific places of the Choco proposed by Pardo and Aguirre
Waunana
Lower Baudó
Upper San Juan
Choco West Atrato. Antioquia East Atrato
PLOSIVE
ph th kh p t k b d g
ph th kh p t k b d K F
ph th kh b d
ph th kh b d g
K F
K F/ð z !"
AFFRICATE v FRICATIVE TRILL SOUNDING APPROXIMANT
č s h r rr l m n w j
Note: According to this scheme, at the strictly phonological level, Saija and Waunana have identical systems, even though they are very different at the lexical level.
Several students from the program have done research on the Embera language:
Rito Llerena Villalobos. A student from the program’s first cohort, he finished in 1987. He is now a professor in the Department of Linguistics at the Universidad de Antioquia. From 1989 to 1992, he worked on the comparative phonology of the Amerindian languages of Antioquia, including the Tule language (of the Cuna Indians), the subject of his degree thesis (1987). He has lately worked on the Embera language: creating didactic materials for the Indian teachers of Alto Andágueda, carrying out phonological and morphological research in the Embera Reservation of Jaidukamá, department of Antioquia, and collaborating in ethnoeducation among the Emberas of Tierralta, upper Sinú River, in the department of Córdoba, where he is working at present. In 2000 he wrote a report on the grammar and phonology of the Tule language for the Instituto Caro y Cuervo.
Mario Hoyos Benites. He, too, was a student from the program’s first cohort and finished in 1987. At present he is a professor at the Universidad de la Guajira. He worked in 1984 on the Napipí and middle Atrato rivers and in other places in the region. His research has ranged from the design of didactic material for Indian teachers throughout the country (1991) to interdialectal phonology. He presented a report on the Embera language for the Atlas Etnolingüístico de Colombia of the Instituto Caro y Cuervo in 1997, and wrote a report (2000) on the Embera language of the Napipí River for the institute.
Figure 4 Choco dialectology.
Table 4 Phonological variation according to lexicon. Representative sample
Glosses, in order: I, You, He, We, You, They, Who, Person, Man, Woman, Father, Mother, Son, Daughter, Spouse, Head, Eye, Tooth, Mouth, Stomach, Hand, Foot, Blood, Meat, Water, Ground, Stone, River, Mountain, Sun, Tree, Leaf, Root, Dog, Bird, Fish, One, Two, Three
Waunana: mu puh ič mač paan hak n khai waunán emkhoi ui ai at/tata ieuá kha huu!"a puru dau khier i/ihure bi húa bui bak nemekmót du hẽp mok du duursi edau pab khiri pak̂hare saak nemčai ãwárr a pai daunumí tharhũp
South Coast: m pu iči tai pará ãči khai ẽpẽrã́ m k rã ) awera ãkõrẽ nãvẽ oarra khau khima poro tau khida ithai bi húa buru/hĩ r wáa/iwá čier panía joró mãũ to ee ãkõrẽhĩrù pakhuru khiru kharrá usa ipana cikho aba ome õpé
Lower Baudó: m pu iči tači mãrã ãči khai ẽpẽrã́ m khĩrã ũẽrã/vẽrã tata nana uarra khau khimá boro tau khiFá ithae Ki húa hĩrũ va čikho panía/paitó joró mõkará to e!"a umãdau pakurú khitúa kharra usa ĩpaná Ketá aKa õmẽ õpea
Upper San Juan: m bu iči dači mači/mãrã ãči kai ẽbẽrã mũkĩrã ũẽrã čača/dadá dana/nãvẽ varr/oarra kau kima boro dau kiFa i/itae Ki húa h r /hẽrũ oa kiuru banía éoro mokara do ea umãda bakuru kidúa karr usa ibana Keda aKa ome õbea
Antioquia/Córdoba: m bu i!"i dai mãrã ã!"i kai ẽbẽrã́ mãkĩrã ũẽrã zeze papa vuarra kau kima buru dabú kiðá/čiðá itae Ki huwá hĩrṹ va !"iko banía egoró mõgará do katumá ĩmãdau bakuru kitúa karrá usá ĩbaná Kedá aKa ume ũbea
Atrato: m bu i!"i dai mãrã/pãrã ã!"i kai ẽbẽrã́ mã́kĩrã ũẽrã́ zeze papa oarra kau kima boró dau kiFa itae Ki huwá hẽrṹ oá !"iko baidó egoró mõgará do e!"á umãdau bakuru keduá karrá usá ĩbaná Kedá aKa umé ubea
Notes: Details regarding the zones and specific places of the Choco proposed by Pardo and Aguirre.
Waunana: lower San Juan River, Docampadó, coastal rivers, Juradó, Panama, Chintadó.
South Coast: Saija, Satinga, Saquianga, Naya, Cajambre, south of Buenaventura.
Lower Baudó: Catrú, Dubasa, coastal rivers, Purricha, Pavaja.
Upper San Juan: Chamí, Tadó, upper Andágueda River, southwest of Antioquia Department, Garrapatas River (north of Valle Department).
Antioquia/Córdoba: Dabeiba, Murrí, Riosucio, upper Sinú and San Jorge Rivers.
Atrato: upper Atrato River, Capá, Bojayá, upper Baudó River, Panamá.
In fact, the phonological inventories differ mainly in their plosive systems and in the voicing of the sibilant /s/ and the palatal affricate. Hence, the global scheme outlined in Table 3 can be suggested.
Table 5 Grammatical similarities and differences
ber č ‘tatabro’ (peccary); ethérre ‘gallina’ (hen); mũ-buda ‘mi pelo’ (my hair); ũãũã-ra ‘niño-ac’ (boy-ac); kauzake-da ‘niñita-ac’ (little girl-ac)
khaahí m ‘mordió’ (bit); peehí ‘mató’ (killed); kõsí ‘cortó’ (cut); ubeasía ‘pegó’ (hit); uratusía ‘frotó’ (rubbed)
Common characteristics:
1. Predominant suffixing.
2. Occasional prefixing: integration in nominals, some verbal aspectualizing.
3. Variants of: number, gender, affection, position, permanence in the auxiliary.
4. Tactical order variation for focalization.
5. Verbalized lexical determination (adjectival verbs).
6. Actancy, opposition: agent, attributive, instrumental versus intransitive subject, accusative.
7. Great variation in prenominal suffixing.
8. Basic S O V order.
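The voicing difference in the plosive series noted under Table 4 can be made concrete with a small script. The following is only an illustrative sketch in Python: it takes four hand-picked items whose forms are copied from the Lower Baudó and Upper San Juan columns of Table 4 (with diacritics omitted) and tallies the word-initial consonant pairs. The function name `initial_correspondences` and the choice of items are assumptions made for this illustration, not part of the Pardo-Aguirre analysis, and so small a sample cannot establish a regular correspondence (both columns show boro for ‘head’, for instance).

```python
# Forms copied from Table 4, diacritics omitted. The selection of glosses and
# the helper below are hypothetical, chosen only to illustrate the comparison.
LOWER_BAUDO = {"eye": "tau", "river": "to", "tree": "pakuru", "water": "pania"}
UPPER_SAN_JUAN = {"eye": "dau", "river": "do", "tree": "bakuru", "water": "bania"}


def initial_correspondences(a, b):
    """Count pairs of word-initial segments for the glosses present in both lists."""
    counts = {}
    for gloss in a.keys() & b.keys():
        pair = (a[gloss][0], b[gloss][0])
        counts[pair] = counts.get(pair, 0) + 1
    return counts


corr = initial_correspondences(LOWER_BAUDO, UPPER_SAN_JUAN)
for (x, y), n in sorted(corr.items()):
    print(f"{x} : {y}  ({n} items)")
```

On these four items the voiceless initials p and t of Lower Baudó line up with voiced b and d in Upper San Juan. A fuller comparison would need the complete word lists and a proper segment-by-segment alignment rather than initial letters only.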
Ernesto Llerena García. He completed the program in 2001, with a dissertation titled ‘La predicación de la oración simple en la lengua embera del Alto Sinú’ (Simple-sentence predication in the Embera language of Alto Sinú). He has been a professor of linguistics at the Antioquia and Córdoba universities, where he is working at the moment. With his father, Rito Llerena, and the Emberas of the upper Sinú River, he wrote the Diccionario etnolingüístico de la lengua Embera (2003) for the Normal Superior de Montería (capital of the department of Córdoba).
Daniel Aguirre Licht. A student from the second cohort, he finished in 1989. In 1985, he began phonological studies of Chamí, in the southwest of the department of Antioquia. He continued with morphological studies in 1987, and then morphophonological and grammatical studies in 1998. In 1988, he collaborated with the anthropologist Mauricio Pardo in research on Choco dialectology; included in the resulting article (Pardo and Aguirre, 1993) was an answer to Paul Rivet’s hypothesis about the Karib origin of the Choco languages. Aguirre Licht also worked in the department of Risaralda, the location of the Embera-Chamí Indian reservation of Purembará (from puru = ‘town’ and embera), a possible place of the dispersion of the different Embera groups at the Spanish arrival, and he has also worked with the Emberas of Garrapatas Valley, in the department of Valle del Cauca.
On Waunana there are also the works of Gustavo Mejía (1987), another student from the first cohort of the master’s in ethnolinguistics of the Universidad de los Andes. He carried out a grammatical investigation in 1987 as his thesis. He also prepared, for the Instituto Caro y Cuervo, a phonological and morphosyntactic description of Waunana (2000b) and a presentation of the aboriginal languages of the Pacific coast of Colombia (2000a).
Edel Rasmussen, who worked at the Universidad Nacional de Panamá, studied the Embera language of the Panama area and published research on phonology (1986) and on grammar (1985).
The Technological University of Pereira (UTP), located in the capital of the department of Risaralda, has paid attention to the great number of Embera-Chamí Indians who live in the department, both in studies of their language and in projects on other matters. Fernando Romero L. of the Psycho-Pedagogy Department in the School of Education does research on linguistic and pedagogical problems of the teaching of Spanish as a second language with bilingual Chamí and Nasa (Páez) teachers, as well as on discourse analysis of this variety of the Embera language, including studies in which the author of this article has participated. Linguist Olga L. Bedoya works with him, and as Director of the Ethnoeducation and Community Development Program in the same school she does research on the interference of Spanish in different Embera dialects, problems of orality versus writing, and other aspects of Embera language and culture.
See also: Applied Linguistics in South America; Bilingualism and Second Language Learning; Colombia: Language Situation; Educational Linguistics; History of Linguistics in Central and South America; Identity: Second Language; Language Policy in Multilingual Educational Contexts; Pedagogical Grammars: Second Language; Teaching of Minority Languages.
Bibliography
Adam M (1888). ‘Bibliographie des récentes conquetes de la linguistique Sud-Americane.’ In Congres internacional des Americanistes, Berlín, 7th session. 497–520.
Aguirre D (1990). El sintagma nominal en embera chamí de Cristianía. Master’s thes., Universidad de los Andes.
Aguirre D (1992). ‘Previsibilidad del acento en embera chamí.’ In Lenguas aborígenes de Colombia, memorias 2. Bogotá: CCELA – Universidad de los Andes. 31–62.
Aguirre D (1993). ‘Lenguas sobrevivientes del Pacífico colombiano.’ In Pacífico, vol. 1: Colombia. Ed. Plegra Bogotá: FEN – Universidad Nacional de Colombia. 310–325.
Aguirre D (1994). ‘“’parú de bú” o “Fiesta de la pubertad” entre los embera: investigación-acción participante en etnolingüística.’ Boletín de Antropología 8(24), 43–64.
Aguirre D (1995a). ‘Fonología del embera-chamí de Cristianía (departamento de Antioquia).’ In Estudios fonológicos del grupo Choco, vol. 8: Lenguas aborígenes de Colombia. Bogotá: Universidad de Los Andes – CCELA. 9–86.
Aguirre D (1995b). ‘Recuperación cultural y problemas prácticos de la traducción: al rescate de la tradición cultural entre los embera-chamí de Antioquia.’ In Lenguas Aborígenes de Colombia, memorias 3: La recuperación de lenguas nativas como búsqueda de identidad étnica symposium. 7th congress of anthropology, Medellín, Colombia, 1994. Bogotá: Universidad de Los Andes – CCELA. 19–38.
Aguirre D (1998a). ‘Cosmos, naturaleza y cuerpo humano entre los embera.’ In Lenguas aborígenes de Colombia, memorias 5: El léxico del cuerpo humano a través de la gramática y la semántica. Bogotá: Universidad de Los Andes – CCELA. 113–120.
Aguirre D (1998b). ‘Experiencias etnoeducativas con los Embera-Chamí.’ In Lenguas aborígenes de Colombia, memorias 4: Educación endógena frente a educación formal. Bogotá: Universidad de Los Andes – CCELA. 143–155.
Aguirre D (1998c). ‘Fundamentos morfosintácticos para una gramática embera.’ In Lenguas aborígenes de Colombia, descripciones 12. Bogotá: Universidad de Los Andes – CCELA.
Aguirre D (1999a). ‘Ergatividad, focalizacion, tematizacion y topicalizacion en embera.’ In Lenguas aborígenes de Colombia, memorias 6: Congreso de lingüística Amerindia y Criolla. Bogotá: Universidad de Los Andes – CCELA. 317–330.
Aguirre D (1999b). La lengua embera. Languages of the World materials. LINCOM’s Descriptive Grammar Series, no. 208. Munich: Alemania.
Aguirre D (1999c). ‘Textos y comentarios sobre vocabularios embera-chamí y embera-catío.’ In Documentos sobre lenguas aborígenes de Colombia, vol. 4: Lenguas del occidente de Colombia [Edns. Uniandes, Colciencias.] Del archivo de Rivet P & Landaburu P (comp.). Bogotá: Universidad de Los Andes – CCELA. 83–114.
Aguirre D, Romero F, Duque A & Gallego J (2000). Oralidad y escritura entre los embera-chami de Risaralda. Pereira: Universidad Tecnológica de Pereira.
Anonymous (1918). ‘Catecismo Catío-Español para uso de los misioneros de María Inmaculada y Santa Catalina de Siena y sus neófitos catecúmenos.’ In Revista Departamental de Instrucción Pública (2nd edn., no. 16). Medellín: Secretaría de Instrucción Pública de Antioquia. 494–513.
Bastian A (1876). ‘Bericht über die Sprache welche die Chamíes–Andáguedas–Murindoes–Cañas Gordas–Rioverdes–Caramantas–Tadocitos–Patoes–Curasambas–Indianer Sprechen.’ Zeitschrift Für Ethnologie 8, 359–377.
Bedoya O L (1997). ‘Los nombres compuestos en la lengua Epera de Oriente y su motivación cultural.’ Revista de Ciencias Humanas 4(14), 86–94.
Bedoya O L & Restrepo M (1998). Interferencias lingüística entre la lengua y el español hablado en el choco. Montería: Universidad de Córdoba.
Bedoya O L & Zuluaga V (1996). ‘El discurso oral y escrito entre los embera-chamí.’ Revista de Ciencias Humanas 3(8), 45–51.
Bollaert W (1860). Antiquarian, ethnological and other researches in New Granada, Equador, Perú and Chile. Londres: Trübner. 65–67.
Brinton D (1891). The American race. New York. 175–177, 275.
Brinton D (1894). ‘Some words from the Andágueda dialect of the Choco stock.’ In Proceedings of the Philosophical Society, vol. 34. Philadelphia. 401–402.
Brinton D (1895). ‘Vocabulary of the Noanamá dialect of the Choco stock.’ In Proceedings of the Philosophical Society, vol. 35. Philadelphia. 202–204.
Casas B de las (1951). Historia de las Indias (vols 3). México: Fondo de Cultura Económica.
Casas B de las (1973). ‘Núñez de Balboa descubre el Mar Pacífico.’ In Historiadores de las Indias. México: Los Clásicos; New York and Panama City: W. M. Jackson.
Casas B de las (1977). Brevísima relación de la destrucción de las Indias. La Habana: Ed. Ciencias Sociales.
Castellanos J de (1942). Historia de la gobernación de Antioquia y la del Choco. Medellín: Biblioteca Popular de Cultura Colombiana.
Caudmont J (1955). ‘La lengua Chamí.’ Revista Colombiana de Antropología 4, 273–293.
Caudmont J (1956). ‘La lengua Chamí’ (part 2). Revista Colombiana de Antropología 5, 53–108.
Chamberlain A (1907). ‘South American Linguistics stocks.’ Congress international des Americanistes, 15th session, vol. 2, Quebec. 187–204.
Chamberlain A (1913). ‘Linguistic stocks of South American Indians, with distribution map.’ American Anthropologist 15.
Cieza de León P de (1924). La crónica general del Perú (vol. 7). Lima: Colección Urteaga.
Collins F (1879). Vocabulary of the language of the Indians of the Cantón of Choco, State of Cauca, United States of Colombia. [Napipí Expedition.] Washington, DC: 118–121.
Correa L (1981). Breve catecismo Español y Embera Baudó. Medellín: Cadavid Restrepo Print.
Correa L (1982). Cartilla Embera Baudó. Medellín: Cadavid Restrepo Print.
Cullen E (1851). ‘Vocabulary of the language of the Cholo of Choco Indians of the Isthmus of Darién.’ Journal of the Royal Geographic Society 20, 189–190.
Cullen E (1866). ‘The Darien Indians.’ Transactions of the Ethnological Society of London 4, 264–268.
Cullen E (1875). ‘Description of the Choco Indians.’ New York Herald Tribune, 167–171.
Embera Waunana Regional Organization (OREWA) (1987). Experiencia Educativa José Melanio Tunay. Indian Teachers Course. Alto Buey. Quibdó: OREWA.
Etiene C P (1887). ‘Nouvelle-Grenade.’ In Apercu Géneral sur la Colombia: recits de voyages en Amerique. Géneve. 39–41.
Gralow F (1976). Sistemas fonológicos de idiomas Colombianos, vol. 3: Fonología del Chamí. Lomalinda and Meta. Instituto Lingüístico de Verano. Editorial Townsend. 29–44.
Greenberg J (1960). ‘The general classification of Central and South American languages.’ In Wallace A (ed.) Men and cultures: selected papers of the 5th international congress of anthropological and ethnographic sciences. Philadelphia: University of Pennsylvania Press. 791–794.
Greiffenstein C (1878). ‘Vokabular der indier des Chamí.’ Zeitschrift Für Ethnologie 10, 135–138.
Harms P (1985). Ne-animalará: animales en epena pedee (Saija), cartilla ILV. Lomalinda and Meta: Ed. Townsend.
Harms P (1987). Typological grammar of epena pedee (Saija). Typewritten.
Harms P & Powel J (1984). Sistemas Fonológicos de idiomas colombianos, vol. 5: Fonología del epena pedee del río Saija ILV. Lomalinda, Meta: Ed. Townsend. 155–201.
Holmer N (1963). Gramática comparada de un dialecto del Choco con texto, índice y vocabulario. Etnologiskka Studier 26. Etnografiska Museet: Göteborg.
Horton G (undated). Trabajo sobre los Catíos del Alto Sinú. Handwritten. Tierralta and Córdoba: Ethnographic and Linguistic Study.
Horton G (1964). Libro I de Epera Pedea. Córdoba: Tierralta.
Horton G (1965). Tama! Culebra en Epera Pedea y Español. Medellín: Tipografía Unión.
Hoyos M (1987a). Informe sobre la lengua embera para el atlas etnolingüístico de Colombia. Bogotá: Instituto Caro y Cuervo.
Hoyos M (1987b). Morfología de la palabra verbal del embera-napipí. Master’s thes., Universidad de los Andes.
Hoyos M (1991). Rudimentos de lingüística, pedagogía y didáctica para el maestro indígena y para la enseñanza de la lectoescritura en el marco de la educación bilingüe. In Ethnoeducational Program. Plan de Universalización Básica Primaria. Bogotá: Ministerio de Educación Nacional.
Hoyos M (2000). ‘Informe sobre la lengua emberá del río Napipí.’ In González M S & Rodríguez M L (comps.) Lenguas indígenas de Colombia: una visión descriptiva. Bogotá: Instituto Caro y Cuervo. 75–83.
Isacsson S (1973). ‘Indios Cimarrones del Choco (Colombia).’ Etnografiska Museet Göteborg Arstrick. Göteborg.
Isacsson S (1974). ‘An enigmatic colonization in the XVII century in Colombia.’ Etnografiska Museet Göteborg Arstrick. Göteborg: Göteborg Etnografiska Museum.
Isacsson S (1975). Indiana, vol 3: Biografía Atrateña. Berlín: Ibero-Amerikanisches Institut. Preußischer Kulturbesitz.
Isacsson S (1976). ‘Embera: territorio y régimen agrario de una tribu Selvática bajo la Dominación Española.’ In Friedemann N S (ed.) Tierra, tradición y poder en Colombia, vol. 12. Bogotá: Biblioteca Básica Colombiana.
Lehmann W (1910). ‘Ergebnisse Einer Forschungsreise in Mittelamerika und México.’ Zeitschrift für Ethnologie 42, 695.
Lehmann W (1920). ‘Zentral Amerika.’ Zeitschrift für Ethnologie, 1. 69–142.
Llerena G E (2001). La predicación de la oración simple en la lengua embera del Alto Sinú. Master’s thes., Universidad de Los Andes.
Llerena G E & Llerena R (2003). Diccionario etnolingüístico de la lengua embera. Written for the Normal Superior de Montería. Córdoba: Montería.
Llerena V R (1987). ‘Relación y determinación en el predicado de la lengua Kuna.’ In Lenguas Aborígenes de Colombia, descripciones no. 1. Centro Colombiano de Estudios de Lenguas Aborígenes – CCELA. Bogotá: Universidad de los Andes.
Llerena V R (1992a). ‘Estructura y variación en las fonologías de las lenguas eperá de Oriente y Occidente.’ In Lenguas Aborígenes de Colombia, memorias no. 2. Centro Colombiano de Estudios de Lenguas Aborígenes – CCELA. Bogotá: Universidad de los Andes.
Llerena V R (1992b). ‘Guia de materiales para la enseñanza de la lectoescritura en la lengua epera del Alto Andágueda.’ Organización Regional Embera Waunana (OREWA) CIP. Quibdó: Universidad de Antioquia.
Llerena V R (2000). ‘Elementos de gramática y fonología de la lengua Cuna.’ In González M S & Rodríguez M L (comps.) Lenguas indígenas de Colombia: una visión descriptiva. Bogotá: Instituto Caro y Cuervo.
Llerena V R & Gallego H (1989). ‘Fonología y morfofonología de las lenguas epera-chamí de Cristianía, eperá ’Catío de Jaidukama, y Kuna de Caimán Nuevo.’ Centro de Documentación. Facultad de Comunicaciones. Medellín: Universidad de Antioquia.
Llerena V R & Gallego H (1990). ‘Resultados de la investigación fonológica de las lenguas Amerindias de Antioquia.’ Cultura Embera. Proceedings of the 4th congress of anthropology in Colombia. Medellín: Organización Indígena de Antioquia (OIA). 68–90.
Loboguerrero M (1976). Lingüística del Chamí. Fieldwork. Bogotá: Universidad Nacional de Colombia.
Loewen J (1954). Waunana grammar: a descriptive analysis. Thes., University of Washington.
Loewen J (1958). An introduction to Epera speech: Sambu dialect. Ph.D. diss., University of Michigan.
Loewen J (1960). ‘Dialectología de la familia lingüística Choco.’ Revista Colombiana de Antropología 9, 8–22.
Loewen J (1963). ‘Choco: phonological problems.’ International Journal of American Linguistics 29, 357–371.
Loewen J (1964). ‘The Choco and their Spirit World.’ Practical Anthropology 11(3), 97–104.
Loewen J (1972). ‘El cambio cultural entre los Choco de Panamá.’ América Indígena 32(1).
Lotero L (1972). Monografía de los Indígenas Noanamá. Medellín: Editorial Servigráficas.
Loukotka C (1968 [1942]). Classification of South American Indian languages. Los Angeles: University of California.
Manzini G (1973). Abecedario Embera del alto río San Juan. Vicariato Apostólico de Itsmina. Medellín: Yepes Print.
Manzini G (1974). Indígenas e indigenismo en el Choco. Medellín: Editorial Universidad de Antioquia.
Manzini G (1976). ‘El más antiguo testimonio sobre el lenguaje emberá.’ Revista de la Universidad del Choco 1, 101–108.
Martínez E & Guisao E (1980). Mi cartilla Catía: texto de lectura bilingüe Katío-Español. Secretaría de Educación y Cultura de Antioquia. Medellín: Talleres Albón.
Mason A (1950). Handbook of South American Indians, vol. 6: The languages of South American Indians. Washington, D.C: Bureau of American Ethnology. 157–317.
Meillet A & Cohen M (1952). Les langues du monde. Paris: Centre National de la Recherche Scientifique, 1152–1160.
Mejía G (1987). Relaciones actanciales y de persona en Waunana. Master’s thes., Universidad de los Andes.
Mejía G (2000a). ‘Lenguas aborígenes de la costa pacífica de Colombia.’ In González M S & Rodríguez M L (comps.) Lenguas indígenas de Colombia: una visión descriptiva. Bogotá: Instituto Caro y Cuervo.
Mejía G (2000b). ‘Presentación y descripción fonológica y morfosintáctica del waunana.’ In González M S & Rodríguez M L (comps.) Lenguas indígenas de Colombia: una visión descriptiva. Bogotá: Instituto Caro y Cuervo.
Michael R & Nellis M J (1984). ¿Na Karema? El Abecedario ebera (chamí), cartilla 1. Lomalinda and Meta: L. V. Editorial Townsend.
Mollien G (1824). Travels in the Republic of Colombia in the years 1822 and 1823. London. ([1944]. ‘Viaje por la República de Colombia.’ Biblioteca Popular de Cultura Indiana, vol. 8. Bogotá.)
Mollien G (1979). ‘Recorrido por la Tierra del Oro.’ In Las Maravillas de Colombia, vol. 1. Bogotá: Editorial Forja. 113–134.
Nordenskiold E (1927). ‘The Choco Indians of Colombia and Panama.’ Discovery 8(95), 350–377.
Nordenskiold E (1928). ‘Les Indiens de l’Isthmus de Panamá.’ In La Geographie. París. 229–319.
Ortega R C (1978). Los estudios sobre lenguas indígenas en Colombia. Bogotá: Imprenta del Instituto Caro y Cuervo.
Ortiz S E (1937). ‘Clasificación de las lenguas indígenas de Colombia.’ Idearium 1, 76.
Ortiz S E (1940). ‘Lingüística Colombiana: familia Choco.’ Universidad Católica Bolivariana 6, 46–77.
Ortiz S E (1954). Estudio sobre lingüística aborigen de Colombia. Bogotá: Ed. Kelly. 271–310.
Ortiz S E (1965). Historia extensa de Colombia, vol. 1, book 3: Lenguas y dialectos indígenas de Colombia. Bogotá: Ediciones Lerner.
Pardo M (1981). ‘Bibliografía sobre indígenas Choco.’ Revista Colombiana de Antropología 23, 464–528.
Pardo M (1983). ‘Transformaciones históricas en los indígenas Choco.’ Boletín de Antropología 5, 611–628.
Pardo M (1984). Etnolingüística de indígenas Choco. Investigation report presented to the Fondo para la Promoción de la Investigación y la Técnica (FPIT) del Banco de la República, Bogotá.
Pardo M (1985a). ‘Las lenguas Choco en Antioquia: aspectos fonológicos.’ Revista de Ciencias Sociales 1, 53–68.
Pardo M (1985b). ‘Los indígenas chocoanos, 450 años de resistencia.’ Revista Codechocó 1, 45–50.
Pardo M (1986). ‘La situación de la lingüística sobre el grupo Choco.’ In Memorias de eventos científicos Colombianos, no. 42: Memorias III congreso de antropología en Colombia. Instituto Colombiano para el Fomento de la Educación Superior. Bogotá. 87–117.
Pardo M (1987). ‘Regionalización de indígenas Choco.’ In Revista del Museo del Oro Boletín 18, January–April. Bogotá: Museo del Oro. 46–63.
Pardo M (1989). ‘Lengua y sociedad: indígenas Choco.’ Arqueología: Revista de los Estudiantes de Antropología de la Universidad Nacional 10, 69–74.
Pardo M (1997). ‘Aspectos sociales de las lenguas Choco.’ In Pachón X & Correa F (comps.) Lenguas Amerindias: condiciones sociolingüísticas en Colombia. Bogotá: Instituto Caro y Cuervo–Instituto Colombiano de Antropología (ICAN). 321–381.
Pardo M & Aguirre D (1993). ‘Dialectología Choco.’ In Rodríguez M L (comp.) Estado actual de la clasificación de las lenguas indígenas de Colombia. Collection Ezequiel Uricoechea, no. 11. Bogotá: Instituto Caro y Cuervo. 269–312.
Pardo M & Orozco L A (1987). Cartilla 1: ebera be’dea. karta jarabada ’buubada. Cartilla 2: ebera be’dea. oarrará karta jaradiá badaa. Codechoco-Diar-Orewa.
Peláez T (1885). ‘Vocabulario de los Indios del Río Verde, Mutatá, Dabeiba, Frontino, Cañasgordas.’ In Uribe M (ed.) Geografía general y compendio histórico del Estado de Antioquia en Colombia. París: Goupy et Jourdan Press. 535–541.
Picón M L (1985). Maach meua waunana esap gaai k’augtarrau. Cartilla Vicariato Apostólico de Itsmina. Quibdó: Educación Contratada Quebrada de Pichimá.
Pinart A (1887). ‘Les Indiens de l’etat de Panamá.’ In Revue D’Etnographie, vol. 6. Paris. 33–56.
Pinart A (1897). ‘Vocabulario Castellano-Choco’ (Baudó Citaràe). Petite Bibliotheque Americaine vol. V.
Pinto C (1950). Diccionario Catío-Español y Español-Catío. Manizales: Imprenta Departamental.
Pinto C (1974). La cultura Catía: su lengua y su cultura, vol. 2: La lengua: gramática y diccionario. Medellín: Editorial Granamérica.
Pinto C (1977). ‘Los Indios Chamí.’ Aleph, 21–35.
Pinto C (1978). La cultura Catía: su lengua y su cultura, vol. 1: Cultura y mitología. Medellín: Ed. Compás.
Prado M (1982). El epera de Saija: un estudio inicial. Master’s thes., Universidad del Valle.
Prado M (1985). Epera pedee páda: lengua nativa escrita. Vocabulary and Alphabet. Cartillas de Lecto-escritura. Popayán: Universidad del Cauca.
Prado M (1990). ‘La decisión de escribir una lengua y sus implicaciones: el epera de Saija, un caso.’ In Cultura Embera, proceedings of the 4th congress of anthropology in Colombia. Medellín: Organización Indígena de Antioquia (OIA). 58–67.
Prado M (1991). Aproximaciones a la nasalidad del eperá de Saija. Speech presented to the Segundo Congreso del CCELA, Villeta, Cundinamarca.
Prado M (1992). Problemas lingüísticos y conflictos interétnicos entre comunidades negras e indígenas eperá de Saija. Cali: Universidad del Valle.
Rasmussen E (1985). Nociones gramaticales de Eberá. Panamá: Universidad Nacional de Panamá.
Rasmussen E (1986). Principales rasgos fonológicos del Embera. Arosemena M (trans.). Panamá: Universidad Nacional de Panamá.
Reichel-Dolmatoff G (1945). ‘Bibliografía lingüística del grupo Choco.’ Boletín de Arqueología 1(6).
Reichel-Dolmatoff G (1955). Diario de viaje por las antiguas provincias de Cartagena del Padre Joseph Palacios de la Vega. Bogotá: Ministerio de Educación.
Reichel-Dolmatoff G (1964). Bibliografía del Choco. Universities of Cali and Tulane.
Rex E (1975). On Catio grammar. Thes., University of Texas at Arlington.
Rex E & Schotlendreyer M (1973). ‘Sistema fonológico del Catío.’ In Sistemas fonológicos Colombianos, vol. 2. Lomalinda and Meta: Ed. Townsend. Instituto Lingüístico de Verano. 73–88.
Choco Languages 381 Rivet P (1912). ‘Les familias linguistiques du nor-ouest de L’Amerique du Sud.’ L’Anne Linguistique 4, 123–126. Rivet P (1924). ‘Langues de L’Amerique du sud et des Antilles.’ In Meillet et al. (eds.) Les langues du monde, vol. 16. Parı´s. 639–712. Rivet P (1943). ‘La Influencia Karib en Colombia.’ Revista del Instituto Etnolo´ gico Nacional 1, 13–96. Rivet P (1944). ‘La Lengua Choco.’ Revista del Instituto Etnolo´ gico Nacional 1, 297–349. Rivet P (1946). ‘Groupe Catı´o.’ Lingu¨ istique: Journal de la Societe´ des Americanistes 34, 25–29. Robledo E (1922). ‘Vocabulario de los Chamı´es.’ Repertorio Histo´ rico de Antioquia 4(5–8), 603–607. Romero Loaiza F & Bedoya O L (1997). ‘La ensen˜ anza del espan˜ ol como segunda lengua en los embera-chamı´ y nasa, una propuesta lingu¨ ı´stica y pedago´ gica.’ Boletı´n de Antropologı´a 11(29), 11–19. Romoli K (1961). Pascual de Andagoya y el descubrimiento de la Costa Pacı´fica. Speech presented to Congreso Internacional de Historia Hispanoamericana, Cartagena. Romoli K (1962). ‘El sureste del Cauca y sus indios al tiempo de la Conquista Espan˜ ola, segu´ n documentos contempora´neos del Distrito de Almaguer.’ Revista Colombiana de Antropologı´a 11, 239–297. Romoli K (1975). ‘El Alto Choco en el siglo XVI.’ Revista Colombiana de Antropologı´a 19, 9–38. Romoli K (1976). ‘El Alto Choco en el siglo XVI (pt.2): ‘las gentes.’ Revista Colombiana de Antropologı´a 20, 25–78. Sa´ nchez M & Castro C (1977). ‘Una grama´ tica pedago´ gica del Waunana’ (pt. 1). In Lenguas de Panama´ , vol. 3. Panama´ : Instituto Nacional de Cultura. I. L. V. Santı´simo Sacramento F P del (1936). El idioma Katı´o. Essay. Medellı´n. Schotlendreyer M (1973). El abecedario ebena (Catı´o) ¿Cane busia Cobu´ a? Lomalinda and Meta: I. L. V. Editorial Townsed. Schotlendreyer M (1977). ‘Estudios en Camsa´ y Catı´o: la narracio´ n folclo´ rica catı´a como un drama en actos y escenas.’ In Serie sinta´ ctica. Lomalinda and Meta: I. L. V. 
Editorial Townsed. 97–153. Seeman B (1851). ‘The aborigens of the Isthmus of Panama´ .’ Transactions of the American Etnological Society 3, 179–181. Simo´ n F P (1882). Noticias historiales de las Conquistas de tierra firme en las Indias occidentales (5 vols). Bogota´ : Medardo Rivas Press. Simons F A (1887). ‘Vokabular des Tukura´ .’ Zeitschrift fu¨ r Ethnologie 19, 302.
Stansell D (1973). ‘Embera.’ In Aspectos de la cultura material de grupos étnicos de Colombia, vol. 1. Lomalinda and Meta: I. L. V. Editorial Townsend. 179–194. Torres de Arauz R (1960). ‘Los grupos humanos del Darién Panameño.’ Report and annexes of subcommittees of the Darién, 8th congreso Panamericano, Carreteras, Panamá. Torres de Arauz R (1966). La cultura Choco: estudio etnológico e histórico. Panamá: Universidad de Panamá. Torres de Arauz R (1971). ‘Culturas Prehispánicas del Darién.’ Hombre y Cultura 2(2), 7–40. Torres de Arauz R (1972). ‘Panorama actual de las culturas indígenas panameñas.’ América Indígena 32(1), 77–94. Tovar A (1961). Catálogo de lenguas de América del Sur. Buenos Aires: Ediciones Suramericana. Tovar A & Larrucea C (1984). Catálogo de las lenguas de América del Sur. Madrid: Editorial Gredos. Uribe J V (1881). ‘Gramática y vocabulario de la lengua que hablan los indios darienes que habitan la región comprendida entre las desembocaduras del Atrato, en el Atlántico, y del San Juan, en el Pacífico, y la cordillera en que limitan las antiguas provincias del Choco y Antioquia.’ Congreso internacional de Americanistas, session 4, vol. 2. Madrid. 297–309. Uribe J V (1883). ‘Gramática y vocabulario de la lengua que hablan los indios darienes.’ Congreso internacional de Americanistas, session 4, vol. 2. Madrid. 269–309. Vargas P (1984). La Conquista tardía de un territorio aurífero: la reacción de los Embera de la cuenca del Atrato a la Conquista Española. Thes., Universidad de los Andes. Vargas P (1986). Los habitantes del Alto Río Sinú y sus fronteras, siglos XVI–XVIII. Report to Instituto Colombiano de Antropología (ICAN), Bogotá. Vargas P (1990). Los Embera y los Kuna: impacto y reacción ante la ocupación española, siglos XVI y XVII. Master’s thes., Universidad del Valle. Velásquez R (1916).
‘Vocabulario de los Chamíes.’ Boletín de la Sociedad de Ciencias Naturales del Instituto de la Salle 4, 147–150. Wassen H (1935). ‘Notes on the southern groups of Choco Indians in Colombia’ (Waunana and Saija). Etnologiska Studier 1, 35–182. White R B (1884). ‘Notes on the aboriginal races of the north western provinces of South America.’ Journal of the Royal Anthropological Institute of Great Britain and Ireland 13, 240–456.
Choctaw
See: Muskogean Languages.
Choctaw Trade Language
See: Mobilian Jargon.
Chomsky, Noam (b. 1928) J B Walmsley, Universität Bielefeld, Bielefeld, Germany © 2006 Elsevier Ltd. All rights reserved.
Figure 1 Noam Chomsky.
Noam Avram Chomsky (Figure 1) was born in 1928 in Philadelphia, Pennsylvania, and studied at the University of Pennsylvania. From 1951 to 1955 he was a junior fellow of the Harvard University Society of Fellows, completing his Ph.D. in linguistics in 1955. In that year he took up a post in the Department of Modern Languages and Linguistics at Massachusetts Institute of Technology. From 1958 to 1959 he worked at the Institute for Advanced Study at Princeton. From 1966 to 1975 he held the Ferrari P. Ward Professorship of Modern Languages and Linguistics, and was appointed Institute Professor in 1976. Chomsky’s thinking has had a radical influence on grammar theory and the philosophy of grammar, syntax, morphology, language acquisition, phonology, and the historiography of linguistics. His linguistic work (the other half of his life is devoted to political writing) can be divided into four main phases: an early phase (the 1950s); a classical phase (1960s and 1970s); a transitional phase, in the 1980s; and a new phase in the 1990s, in which the ‘minimalist program’ was introduced. The first work of Chomsky’s to make a wider impact was Syntactic structures (1957), which introduced a new way of looking at grammar and language that opened up exciting avenues of enquiry. In this book Chomsky circumvented the debate as to
what constituted a sentence by postulating it as a primitive in his system. According to Syntactic structures, a language was ‘‘a set . . . of sentences, each finite in length and constructed out of a finite set of elements’’ (Chomsky, 1957: 13), and ‘‘the fundamental aim in the linguistic analysis of a language L is to separate the grammatical sequences which are the sentences of L from the ungrammatical sequences which are not sentences of L . . . The grammar of L will thus be a device that generates all the grammatical sequences of L and none of the ungrammatical ones.’’ ‘Grammatical’ did not mean ‘meaningful,’ as the sentence ‘Colorless green ideas sleep furiously’ showed, which, while not being uncontroversially meaningful, is syntactically well-formed. In Syntactic structures Chomsky also introduced the concept ‘generative’ to mean not only productive, but also formally explicit. Chomsky argued that neither finite-state nor phrase-structure (PS) grammars were adequate as a means of generating the sentences of a natural language, and introduced a transformational component by means of which further operations could be performed on the output of a PS-component to generate the strings that underlie sentences. Finally, this study emphasized the evaluative function of grammar theory (in selecting the best grammar) as opposed to the discovery procedures favored by the preceding generation. Aspects of the theory of syntax (Chomsky, 1965) marked an increasing interest in grammar as reflecting the structure of the human mind. This work introduced the notions of competence (a speaker’s ability to produce and understand correct sentences) and performance (the externally verifiable product of such competence), and also deep- and surface-structure. 
This study is the source of his classic statement, ‘‘Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors . . . in applying his knowledge of the language in actual performance’’ (Chomsky, 1965: 3). This approach to grammar became known as ‘standard theory’ (ST). Keen to distance himself from his immediate predecessors, Chomsky stressed the affinities between transformational grammar and so-called traditional grammar: ‘‘It would not be inaccurate to regard the transformational model as a formalization of features implicit in traditional grammars, and to regard these grammars as inexplicit transformational generative grammars’’ (Chomsky, 1964: 918).
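The Syntactic structures idea of a grammar as ‘‘a device that generates all the grammatical sequences of L and none of the ungrammatical ones’’ can be pictured with a toy example. The sketch below is only an illustration under stated assumptions: the rule set, the four-word vocabulary, and the names RULES, expand, and is_grammatical are invented here, and the resulting language is finite, unlike any natural language. It shows what ‘formally explicit’ generation means; it is not a reconstruction of Chomsky’s own grammars, which add a transformational component to the phrase-structure rules.

```python
from itertools import product

# Hypothetical toy phrase-structure rules (not taken from Chomsky's work).
# Each non-terminal maps to a list of possible right-hand sides.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["ideas"], ["linguists"]],
    "VP": [["sleep"], ["V", "NP"]],
    "V": [["study"]],
}


def expand(symbol):
    """Return every terminal word sequence derivable from `symbol`."""
    if symbol not in RULES:
        # A symbol without a rule is a terminal word.
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Expand each symbol on the right-hand side, then combine the options.
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


# The "language" is exactly the set of strings the grammar generates.
LANGUAGE = {" ".join(words) for words in expand("S")}


def is_grammatical(sentence):
    """A sequence is grammatical if and only if the grammar generates it."""
    return sentence in LANGUAGE
```

With these assumed rules, is_grammatical("linguists study ideas") holds while is_grammatical("sleep ideas") does not; whether a generated string is meaningful plays no part in the test.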
Standard theory raised questions about the nature of a deep structure that postulated identical representations for synonymous sentences, about the nature of global and cross-derivational constraints, and about how illocutionary force could or should be incorporated into the syntax. Differences on these issues led to a split in the movement, with ‘generative semantics’ being rejected by Chomsky and some of his associates in favor of an ‘interpretive semantics’ (Huck and Goldsmith, 1995). After Aspects Chomsky published successive revisions in what came to be known as ‘the generative enterprise,’ each of which is characterized by a key work. ‘Standard theory’ was followed by ‘extended standard theory’ (Chomsky, 1973) and ‘revised extended standard theory’ (Bach, 1977), which introduced the concept of ‘core grammar’ and periphery, and devoted much attention to conditions and filters. It then became ‘government-(and-)binding theory’ (Chomsky, 1980, 1981), which introduced θ-roles (‘semantic’ or ‘thematic relations’), and replaced deep- and surface-structure by ‘D-structure’ and ‘S-structure,’ though not with identical meanings. Up to this point the goal of Chomskyan theory had been the writing of grammars of natural languages. The next – transitional – phase saw a move toward a different goal – exploring the principles of universal grammar. This – Chomsky’s ‘second conceptual shift’ (Chomsky, 1986: 6, 145) – was pursued within the theory of ‘principles and parameters.’ Competence and performance were replaced by the (partly overlapping) ‘I-language’ (internalized language as a property of the human mind) and ‘E-(externalized) language’ (speech written or oral). Principles and parameters are properties of all human languages (part of innate UG), but the principles are invariable whereas the parameters can have different settings.
Thus the ‘projection principle’ (‘‘lexical structure must be represented categorially at every syntactic level’’ – Chomsky, 1986: 84) is assumed to apply to all human languages. But the ‘head parameter’ will specify by its setting for a given language whether the head typically comes first in a construction (followed by its complements, as in bought the book) or last (as in Japanese). Principles and parameters theory was succeeded in the 1990s by the ‘minimalist program’ (Chomsky, 1995). The change in terminology from ‘theory’ to ‘program’ is significant. The question had now become, ‘How perfect is human language?’ The minimalist program is driven by the view that human language is maximally simple, and by a striving for economy – economy of representation, economy of principles, and economy of derivation.
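The contrast between an invariable principle and a settable parameter can be made concrete with a small sketch of the head parameter. Everything in it is an assumption made for illustration: the function name linearize, the boolean flag head_initial, and the use of English words for the head-final order (a schematic stand-in, not real Japanese data). It shows only that one binary setting fixes whether a head precedes or follows its complements.

```python
def linearize(head, complements, head_initial):
    """Order a head and its complements according to a binary setting.

    head_initial=True  -> head first, complements after (as in 'bought the book')
    head_initial=False -> complements first, head last (the head-final pattern)
    """
    if head_initial:
        return [head] + list(complements)
    return list(complements) + [head]


# Head-initial setting: the verb precedes its object.
english_like = linearize("bought", ["the", "book"], head_initial=True)

# Head-final setting, shown schematically with the same English words.
head_final = linearize("bought", ["the", "book"], head_initial=False)
```

Here english_like is ['bought', 'the', 'book'] and head_final is ['the', 'book', 'bought']; the same construction is built both times, and only the parameter value differs.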
Although Syntactic structures contained little evidence of the degree to which grammar theory and language acquisition theory were to merge, there was evidence as early as the 1950s of the challenge that Chomsky’s ideas were to pose in psycholinguistics. In his review of Skinner’s Verbal behavior (Chomsky, 1959), Chomsky both attacked behaviorist theories of language acquisition and opened up new avenues for research: ‘‘The fact that all normal children acquire essentially comparable grammars with remarkable rapidity suggests that human beings are somehow specially designed to do this, with data-handling or ‘hypothesis-formulating’ ability of unknown character and complexity’’ (Chomsky, 1959: 57). The question was how, on the basis of the available data, the child manages to construct a grammar of its mother tongue: ‘‘A consideration of the character of the grammar that is acquired, the degenerate quality and narrowly limited extent of the available data, the striking uniformity of the resulting grammars, and their independence of intelligence, motivation, and emotional state, over wide ranges of variation, leave little hope that much of the structure of the language can be learned by an organism originally uninformed as to its general character’’ (Chomsky, 1965: 58). The information that ‘the organism’ possessed to deal with this task was characterized as a ‘language acquisition device,’ a ‘black box’ that, while one had no direct access to its contents, nevertheless permitted hypotheses to be made about its workings on the basis of its output. A search for universals of language acquisition was thus under way that ultimately turned into the task of describing the properties of I-language. In the mid-1960s Chomsky concentrated on grammar theory, phonology, and the history of ideas. The position adopted in The sound pattern of English (Chomsky and Halle, 1968) illustrates a general move away from structuralist taxonomic description toward the generative paradigm.
The sound pattern of English made use of a set of primitive terms together with a set of rules for combining them and for linking the morphophonemic, phonological, and phonetic levels by means of transformations. The system was symbol-based and formally defined, with rules of the form A → B / X __ Y (to be read as: ‘interpret – or re-write – A as B when it appears in a context preceded by X and followed by Y’). In the early stages, interest focused on the nature of phonological rules, rule ordering, and the cyclic nature of rules. Though – or perhaps because – The sound pattern of English presented a theory that was essentially one-dimensional, not fully fleshed out, and language specific, it too stimulated further research.
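A rule of the form A → B / X __ Y can be read as a small procedure over a string of segments. The sketch below is a minimal illustration, assuming single-symbol segments and a single-segment context on each side; the function name apply_rule and the toy rule t → d / a __ a are hypothetical and are not taken from The sound pattern of English, whose rules operate on feature bundles rather than plain letters.

```python
def apply_rule(segments, target, replacement, left, right):
    """Apply the rewrite rule  target -> replacement / left __ right.

    Every occurrence of `target` that is immediately preceded by `left`
    and immediately followed by `right` is rewritten. Contexts are checked
    against the input string, so all matches are rewritten simultaneously.
    """
    output = list(segments)
    for i in range(1, len(segments) - 1):
        if (
            segments[i] == target
            and segments[i - 1] == left
            and segments[i + 1] == right
        ):
            output[i] = replacement
    return output


# Hypothetical toy rule: t -> d / a __ a
voiced = "".join(apply_rule(list("atata"), "t", "d", "a", "a"))

# No 't' stands between two 'a's here, so nothing changes.
unchanged = "".join(apply_rule(list("tat"), "t", "d", "a", "a"))
```

Under these assumptions voiced is "adada" and unchanged is "tat". Rule ordering, which the text mentions as an early focus of interest, would correspond to feeding the output of one such call into the next.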
In that volume and elsewhere Chomsky preferred to look back to the work of earlier grammarians as opposed to his structuralist predecessors. In 1966 he published Cartesian linguistics, in which he argued that his innateness theory was not simply the product of a backlash against behaviorist views, but had a much longer pedigree. This work was sharply reviewed, notably by Salmon, who showed that even in Descartes’ time proponents of innateness theories were locked in combat with empirical-sensualist theorists. Despite Chomsky’s disclaimers in the book, the consensus view seemed to be that Cartesian linguistics was not a model of how to write the history of linguistics. It says something for Chomsky’s stature, though, that his excursion into the history of linguistics acted as a powerful stimulus to younger scholars (Walmsley, 2000). Chomsky’s work dominated the second half of 20th-century linguistics like that of no other linguist. The switch from a taxonomic, inductive, descriptive approach to a generative, hypothetico-deductive, ‘explanatory’ approach left an indelible mark on how we do syntax, morphology, and phonology today. It increased precision by committing itself to symbolic representation and formalization. The increase in precision, though, came at a price: the scope of ‘linguistics’ narrowed. Since a language was said to be ‘‘a set of structural descriptions of sentences, where a full structural description determines (in particular) the sound and meaning of a linguistic expression’’ (Chomsky, 1977: 81), the focus of attention moved in effect from ‘language’ to ‘grammar.’ Chomsky, of course, was not the only source of linguistic ideas in the 20th century, but it would be difficult to grasp much of modern linguistics without some understanding of his ideas and techniques. His ideas have proved fruitful not only in themselves, but also in the reactions they provoked within morphology, syntax, and elsewhere. 
Underlying almost all these theories however are frequently untested assumptions about the fundamental categories of language – word classes, attributes, and their values. The energy invested in the Chomskyan paradigm has in part deflected attention from these issues. As Lyons put it: ‘‘in Aspects . . ., as generally, Chomsky was . . . content to operate, uncritically, with the categories and subcategories of traditional grammar’’ (Lyons, 1989: 167). These categories have hardly been questioned by the big commercial grammars, either. Chomsky’s influence has left its mark on mathematical linguistics, historical linguistics, theories of language acquisition, anthropology, the study of human cognition, biology, philosophy and the philosophy of science, artificial intelligence, logic,
music theory, literary theory, law and theology, among other fields (Otero, 1994). See also: Constituent Structure; Generative Semantics; Grammar; Language Acquisition Research Methods; Principles and Parameters Framework of Generative Grammar; Phonology: Overview; Syntax of Words; Traditional Grammar; Transformational Grammar: Evolution; X-Bar Theory.
Bibliography Bach E (1977). ‘Comments on the paper by Chomsky.’ In Culicover P W, Wasow T & Akmajian A (eds.) Formal syntax. New York: Academic Press. 133–155. Chomsky N (1957). Syntactic structures. The Hague: Mouton. Chomsky N (1959). ‘Review of Verbal behavior by B. F. Skinner.’ Language 35, 26–58. Chomsky N (1964). ‘The logical basis of linguistic theory.’ In Lunt H G (ed.) Proceedings of the ninth international congress of linguists. Cambridge, Mass, August 27–31, 1962. The Hague: Mouton. 914–978. Chomsky N (1965). Aspects of the theory of syntax. Cambridge, Mass: MIT Press. Chomsky N (1966). Cartesian linguistics: a chapter in the history of rationalist thought. New York: Harper & Row. Chomsky N (1973). ‘Conditions on transformations.’ In Anderson S R & Kiparsky P (eds.) A Festschrift for Morris Halle. New York: Holt, Rinehart & Winston. 232–286. [Reprinted in Chomsky N Essays on form and interpretation. New York, Amsterdam/Oxford: North-Holland, 1977, 80–160.] Chomsky N (1977). Essays on form and interpretation. New York, Amsterdam/Oxford: North-Holland. Chomsky N (1980). Rules and representations. New York: Columbia University Press. Chomsky N (1981). Lectures on government and binding. Dordrecht: Foris. Chomsky N (1986). Knowledge of language: its nature, origin and use. New York: Praeger. Chomsky N (1995). The minimalist program. Cambridge, MA: MIT Press. Chomsky N & Halle M (1968). The sound pattern of English. New York: Harper & Row. Huck G J & Goldsmith J A (1995). Ideology and linguistic theory: Noam Chomsky and the deep structure debates. London & New York: Routledge. Lyons J (1989). ‘Semantic ascent: a neglected aspect of syntactic typology.’ In Arnold D et al. (ed.) Essays on grammatical theory and universal grammar. Oxford: Clarendon Press. 153–186. Otero C P (ed.) (1994). Noam Chomsky. Critical assessments (4 vols). London & New York: Routledge. Salmon V (1969). ‘Review of N. Chomsky Cartesian linguistics.’ Journal of Linguistics 5, 165–187. 
Walmsley J (2000). ‘Review of A science in the making. The Regensburg symposia on European linguistic historiography.’ Henry Sweet Society for the History of Linguistic Ideas Bulletin 35, 60–70.
Chorasmian P O Skjærvø, Harvard University, Cambridge, MA, USA © 2006 Elsevier Ltd. All rights reserved.
Chorasmian was an Iranian language spoken in medieval Chorasmia, a state on the Oxus/Amu Darja south of the Aral Sea. The name is first mentioned in the Avesta and the Achaemenid inscriptions (see Avestan; Persian, Old) but the language is known only from much later times. Several words pertaining to the calendar and astronomy were cited by Abu Rayhan Biruni in his Athar al-baqiya (comp. 1000). Since then archeological excavations have uncovered inscriptions and documents on parchment and wood from ca. 200–700 A.D.; also, a number of manuscripts of Arabic works containing interlinear glosses in Chorasmian have been found in libraries in Turkey, notably Abu’l-Qasim Zamakhshari’s Muqaddimat al-adab (ms. from ca. 1200) and several 13th-century Arabic law books.
The Chorasmian glosses are written in Arabic script, with several modified letters. Those in the Muqaddima are often underpointed or not pointed at all, which makes them hard to interpret. Some Arabo-Persian letters were modified to express special Chorasmian sounds. Triple superscript dots were placed over c to indicate ts and dz, and over f to indicate β. Triple subscript dots were used under s to indicate s, not š, and a single subscript dot under d to indicate d, not δ.
Chorasmian historical phonology is characterized by extensive affrication of dentals, palatalization, and a variety of often unpredictable simplifications of consonant groups. For instance, t and d > c [ts] and j [dz] before and after i, y: pc < pati- (preverb) and *pitā ‘father’; pzy ‘sinew’, cf. Av. paidiiā-. Intervocalic š developed variously: ʾmh ‘ewe’, cf. Av. maēšī-; mwf ‘mouse’, cf. Av. mūš; gwx ‘ear’, cf. MPers. gōš; etc. The Chorasmian vowel system is characterized by the reappearance (in the script) of final vowels before suffixes, pc = pica ‘father’, but pcʾm = picā-mi ‘my father’. Contraction of final vowels with vowels of suffixes is common, e.g., hābir-īna ‘give.IMPERF-1ST.SING’ = ‘I gave’, but hābir-ina-hi-di ‘give.IMPERF-1ST.SING-he/she/it.ENCL.OBL-you.ENCL.OBL’ = ‘I gave her to you’ > hābirnā-hī-di > hābir-n-ī-di. Such final vowels are sometimes indicated by the Arabic vowel marks.
Masculine and feminine gender are distinguished in the definite article (ī̆, ya; -ī̆, -ā̆ after prepositions) and in declension (nom. sing. masc. no ending, but fem. -a). Five cases are distinguished in masculine nouns: nominative-accusative, vocative (-a), possessive (-ān), dative (-i), and ablative-locative (-a), but feminine nouns have only two forms: nominative and locative (-a) contrasting with the other cases (-iya). The plural endings are -i or -ina, possessive -in-ān. A final -k becomes -c before -i. The direct object can be marked by -dār attached to the dative (presumably < *rād, cf. Pers. -rā). Examples: ī kām-hi ‘DEF mouth.MASC-he.ENCL.OBL’ = ‘his mouth’; f-ī kāmā-hi ‘in DEF mouth’; yā i camā-h ‘DEF eye.FEM-he.ENCL.OBL’; yā cam-yā-hi dār ‘DEF eye.FEM-DAT-he.ENCL.OBL’ = ‘his eye’ (DO); ī bandik ‘the servant’, ī bandic-i ‘the servants’, f-ī bandic-ī-hi ‘with-DEF servant-PL-he.ENCL.OBL’ = ‘with his servants’; ī bfin-ēnik ī būm-in-ān ‘DEF create.AGT DEF earth-PL-POSS’ = ‘the creator of the earths’.
When several enclitic personal and local pronouns are added to a verb, the order is strict, e.g., gēr-īdā-hī-nā-bir ‘turn-IMPERF.3RD.SING-he.ENCL.OBL-they.ENCL.OBL-upon’ = ‘he made them go around him’, where -bir goes not with the preceding -nā-, but with -hī-. When personal and local complements follow the verb, they must be anticipated as enclitics, e.g., m-uxwās-idā-nā-wa f-ī razik-a ī cūb ‘IMPERF-let-PAST.3RD.SING-they.ENCL.OBL-there in-DEF-vineyard-LOC DEF-water.PL’ = ‘he let the water into the vineyard’; hīd-idā-hī-nā-dā-bir ī salām ‘read.IMPERF.3RD.SING-it.ENCL.OBL-they.ENCL.OBL-there-on DEF-greeting.PL’ = ‘he recited the greetings upon him’.
The verbal system is of the Eastern Middle Iranian type. There are three stems: present, past, and perfect (perfect participle = past stem + suffix -ik, FEM -ica). There are numerous modal forms (indicative, imperative, subjunctive, optative, injunctive); an imperfect formed with prefixes (m-ikk- ‘did’) or lengthening of the vowel of the first syllable (h-ā-bir- ‘gave’), both reflexes of the Old Iranian augment; a form ending in -ī(n) added to personal endings, the function of which is not completely clear but which is referred to as ‘permansive’; a (present) perfect formed with the perfect participle and the verb dār- (transitive verbs) or ‘be’ (intransitive verbs), e.g., akt-ik dāriy-āyī ‘do.PERF.PART.MASC have.PRES-1ST.SING-PERMANSIVE’ = ‘I may have done’; purāca-ihi > purācīhi ‘divorce.PERF.PART.FEM-be.PRES.2ND.SING’ = ‘you are divorced’. See also: Iranian Languages.
Bibliography Benzing J (1968). Das Chwaresmische Sprachmaterial einer Handschrift der »Muqaddimat al-adab« von Zamaxšarī I: Text. Wiesbaden: Steiner. Benzing J (1983). Chwaresmischer Wortindex. Taraf Z (ed.). Wiesbaden: Harrassowitz.
Henning W B (1971). A fragment of a Khwarezmian dictionary. MacKenzie D N (ed.). London: Lund Humphries. Humbach H (1989). ‘Choresmian.’ In Schmitt R (ed.) Compendium Linguarum Iranicarum. Wiesbaden: Reichert. 193–203.
MacKenzie D N (1990). The Khwarezmian element in the Qunyat al-Munya. Amarat H & MacKenzie D N (trans.). London: School of Oriental and African Studies. Samadi M (1986). Das Chwarezmische Verbum. Wiesbaden: Harrassowitz.
Christianity and Language in the Middle Ages N Adkin, University of Nebraska, Lincoln, NE, USA © 2006 Elsevier Ltd. All rights reserved.
Christianity was founded by a carpenter’s son and spread by fishermen. Since the first Latin versions of the Bible were produced for very humble folk, their language was correspondingly humble; it was also marked by lexical and syntactic Hebraisms and Hellenisms. For educated pagans of late antiquity, the problem with Christianity was not theological, but linguistic: the language of its scriptures was simply considered too crude. While, moreover, the case made by the ‘Nijmegen School’ for a specifically Christian sociolect would seem to be untenable, the works of Christian writers were sufficiently unclassical to make educated pagans turn up their noses. When the highly educated Cyprian embraced Christianity, a pagan dubbed him instead Coprian (‘Mr. Shit’). The educational system that produced such people consisted of two parts. Broadly speaking, the role of the ‘grammaticus’ was to teach correct Latin and to expound the classics. The ‘rhetor’ then coached the student in the five parts of rhetoric: invention, disposition, style, memory, and delivery. Style (elocutio) comprised figures of language (e.g., adiunctio, anaphora, anastrophe, annominatio, asyndeton) and figures of thought (e.g., aetiologia, allegoria, aposiopesis, apostrophe, aversio). The problems posed by such an educational system for Christians are conveniently illustrated by the two greatest of the doctors of the Church: Jerome and Augustine. Jerome had been the star student of Donatus, the doyen of grammatici. When as a hermit Jerome preferred the stylistic finesse of Cicero to the rebarbative language of the Old Latin Bible, he had a famous dream, in which he was hauled before God’s throne and told: ‘You are a Ciceronian, not a Christian’. Augustine had been a professional rhetor himself. When as a bishop he then wrote the De doctrina christiana, the only use he could find for the old educational system was as an ancillary to the better understanding of the Bible.
Both Jerome and Augustine lived to see the fall of Rome. The barbarization and ruralization resulting from the influx of hordes of illiterate Germans led ultimately to the collapse of the old educational system. With the end of Roman rule, private and public libraries began to disappear. The books that survived usually found refuge in the monasteries, which had developed in the fifth century, when the appeal of a life of ascetic withdrawal grew. Such monasteries were also in a position to provide education. A similar function could be performed by the schools that came to be attached to cathedrals. Education accordingly became the preserve of the Church: a clerk is now a ‘cleric’. If Christians like Jerome and Augustine had been ambivalent in their attitude toward classical education, 6th-century monks like Benedict and Gregory the Great rejected it decisively. Benedict’s Rule makes his monks copy books, but Benedict himself had run away from school and composed his Rule in Vulgar Latin; the books he had in mind were naturally Christian. The prefatory letter to Gregory’s Moralia in Iob proclaims its author’s resolute refusal to trammel the Word of God with the ‘‘rules of Donatus.’’ His friend Gregory of Tours wrote a particularly solecistic Latin that showed how far the language had advanced toward Proto-Romance and how close to each other its written and spoken forms had become. Throughout the Dark Ages the light of learning still shone in Ireland, which had received Christianity in the 5th century. At the same time the Irish also received Latin, which as the language of the Bible, the Liturgy, and the Church Fathers was the indispensable concomitant of the new religion. The Irish also did not share the unease of their continental coreligionists with the pagan classics, since Ireland had never known Roman paganism. 
In addition, because in Ireland Latin was the preserve of an educated elite, the language was not exposed to the evolutionary tendencies that affected it on the increasingly Romance-speaking continent. Irish monks were therefore able to bring Christianity and a fairly classical form of Latin to their neighbors, the English.
386 Chorasmian Henning W B (1971). A fragment of a Khwarezmian dictionary. In MacKenzie D N (ed.). London: Lund Humphries. Humbach H (1989). ‘Choresmian.’ In Schmitt R (ed.) Compendium Linguarum Iranicarum. Wiesbaden: Reichert. 193–203.
MacKenzie D N (1990). The Khwarezmian element in the Qunyat al-Munya. Amarat H & MacKenzie D N (trans.). London: School of Oriental and African Studies. Samadi M (1986). Das Chwarezmische Verbum. Wiesbaden: Harrassowitz.
Christianity and Language in the Middle Ages N Adkin, University of Nebraska, Lincoln, NE, USA ! 2006 Elsevier Ltd. All rights reserved.
Christianity was founded by a carpenter’s son and spread by fishermen. Since the first Latin versions of the Bible were produced for very humble folk, their language was correspondingly humble; it was also marked by lexical and syntactic Hebraisms and Hellenisms. For educated pagans of late antiquity, the problem with Christianity was not theological, but linguistic: the language of its scriptures was simply considered too crude. While, moreover, the case made by the ‘Nijmegen School’ for a specifically Christian sociolect would seem to be untenable, the works of Christian writers were sufficiently unclassical to make educated pagans turn up their noses. When the highly educated Cyprian embraced Christianity, a pagan dubbed him instead Coprian (‘Mr. Shit’). The educational system that produced such people consisted of two parts. Broadly speaking, the role of the ‘grammaticus’ was to teach correct Latin and to expound the classics. The ‘rhetor’ then coached the student in the five parts of rhetoric: invention, disposition, style, memory, and delivery. Style (elocutio) comprised figures of language (e.g., adiunctio, anaphora, anastrophe, annominatio, asyndeton) and figures of thought (e.g., aetiologia, allegoria, aposiopesis, apostrophe, aversio). The problems posed by such an educational system for Christians are conveniently illustrated by the two greatest of the doctors of the Church: Jerome and Augustine. Jerome had been the star student of Donatus, the doyen of grammatici. When as a hermit Jerome preferred the stylistic finesse of Cicero to the rebarbative language of the Old Latin Bible, he had a famous dream, in which he was hauled before God’s throne and told: ‘You are a Ciceronian, not a Christian’. Augustine had been a professional rhetor himself. When as a bishop he then wrote the De doctrina christiana, the only use he could find for the old educational system was as an ancillary to the better understanding of the Bible.
Both Jerome and Augustine lived to see the fall of Rome. The barbarization and ruralization resulting from the influx of hordes of illiterate Germans led ultimately to the collapse of the old educational system. With the end of Roman rule, private and public libraries began to disappear. The books that survived usually found refuge in the monasteries, which had developed in the 5th century, when the appeal of a life of ascetic withdrawal grew. Such monasteries were also in a position to provide education. A similar function could be performed by the schools that came to be attached to cathedrals. Education accordingly became the preserve of the Church: a clerk is now a ‘cleric’. If Christians like Jerome and Augustine had been ambivalent in their attitude toward classical education, 6th-century monks like Benedict and Gregory the Great rejected it decisively. Benedict’s Rule makes his monks copy books, but Benedict himself had run away from school and composed his Rule in Vulgar Latin; the books he had in mind were naturally Christian. The prefatory letter to Gregory’s Moralia in Iob proclaims its author’s resolute refusal to trammel the Word of God with the “rules of Donatus.” His friend Gregory of Tours wrote a particularly solecistic Latin that showed how far the language had advanced toward Proto-Romance and how close to each other its written and spoken forms had become. Throughout the Dark Ages the light of learning still shone in Ireland, which had received Christianity in the 5th century. At the same time the Irish also received Latin, which as the language of the Bible, the Liturgy, and the Church Fathers was the indispensable concomitant of the new religion. The Irish also did not share the unease of their continental coreligionists with the pagan classics, since Ireland had never known Roman paganism.
In addition, because in Ireland Latin was the preserve of an educated elite, the language was not exposed to the evolutionary tendencies that affected it on the increasingly Romance-speaking continent. Irish monks were therefore able to bring Christianity and a fairly classical form of Latin to their neighbors, the English.
Accordingly, Alcuin was in turn summoned by Charlemagne from the cathedral school in York to his capital Aachen, where this well-educated Englishman masterminded the Carolingian renaissance. This renaissance entailed a revival of classical education. As a species of Carolingian ‘minister of culture’, Alcuin established the trivium of ‘grammar’, ‘rhetoric’, and ‘dialectic’. He thereby bequeathed to the Middle Ages a tradition of writing correct and cultivated Latin, which were the objectives of grammar and rhetoric, respectively, while dialectic taught correct thinking. Texts of the pagan authors became once again the subject of diligent study instead of being ignored or destroyed; their dissemination was also facilitated by the new Carolingian minuscule (writing style). Donatus was reinstated as the arbiter of acceptable usage. Zealous attention was now paid to his two Artes, both of which set out the eight parts of speech. While the longer treatment dealt with the vitia et virtutes orationis, the shorter one, with its congenial question-and-answer layout, became the favorite grammar for beginners throughout the rest of the Middle Ages. Henceforth, Latin was treated as a learned language separate from Romance. However, some qualifications must be noted to this rosy picture of linguistic and literary renaissance. The unclassical features that had marked the style of the Church Fathers persisted; here particular mention may be made of the use of quod, quia, and quoniam rather than the accusative and infinitive, of the use of the infinitive to express purpose, of a tendency to employ prepositions indiscriminately, and of a greater license in the handling of tense and mood. Similar divergencies from the classical language characterized Jerome’s massively influential Vulgate version of the Bible, which by the Carolingian period had generally replaced the stylistically hair-raising Old Latin translations that had so repelled him in his youth. 
The Bible and writings of the Church Fathers alone seemed an appropriate object of study to some religious leaders, who disapproved of the use of pagan texts; Alcuin himself had felt qualms, although Christian allegories were always an option. Even when such reservations were absent, the goal was always a proper understanding of Holy Writ. Charlemagne’s own motivation had been largely pragmatic, since he wished to improve the quality of officials for Church and State. Moreover, there is evidence that in the decades following Charlemagne’s death, his educational reforms were not being systematically implemented. Such as it was, this Carolingian renaissance was effected through the schools that were attached either to monasteries (e.g., Fulda, St. Gall, Tours) or to cathedrals (e.g., Metz, Orléans, Reims); such establishments were the forerunners of the great monastic and cathedral schools of the High Middle Ages. Charlemagne’s own biographer Einhard was himself an alumnus of Fulda; his masterpiece, the Vita Caroli, illustrates the new enthusiasm for the classics by its studious imitation of Suetonius’s Vita Augusti. If in matters of grammar the Carolingian age preferred to rely on the handily succinct Donatus, he was eclipsed in the 11th century by Priscian’s much ampler Institutiones grammaticae, which likewise deals with the eight parts of speech. For rhetorical instruction the main textbooks continued to be the pseudo-Ciceronian Rhetorica ad Herennium and Cicero’s juvenile De inventione; the latter had inspired Alcuin’s own De rhetorica et de virtutibus. Medieval authors produced numerous commentaries on these four key manuals of grammar and rhetoric as well as textbooks of their own that derive from them; here, passages from Christian works could be used for exemplification in place of pagan ones. Of the rhetorical handbooks, the Rhetorica ad Herennium in particular sets out its subject with admirable clarity and comprehensiveness. Its fourth book, which often circulated separately, is devoted to a very full treatment of the 64 figures of language and thought. Lavish use of these figures in Walter of Châtillon’s 12th-century Alexandreis enabled this epic to outrank even Virgil’s supposedly incomparable Aeneid in the favor of the schools. This same 12th century saw a weakening of the Church’s role in the teaching of Latin. Here a number of factors were involved. First, the new orders of Cistercians and Carthusians admitted only adults, so they did not maintain schools. Second, this period was marked by the emergence of so-called private schools, some of which developed in the following century into the universities. Last, a large number of ‘grammar schools’ appeared.
While the old monastic schools continued to exist, their importance accordingly declined in the face of this competition from urban centers. The final point to be made in this connection is that the role of these monastic schools in education had for the first time made education available at a higher level to women, who learned Latin as nuns. If the 12th century had managed to equal the auctores, in the 13th the study of them succumbed to dialectic. Rhetoric itself now tended to be reduced to the ‘ars dictaminis’, which was principally concerned with imparting a good prose style for the composition of letters. Here the Ciceronian arrangement of a speech into ‘exordium’, ‘divisio’, ‘narratio’, ‘confirmatio’, ‘refutatio’, and ‘peroratio’ was condensed to ‘salutatio’, ‘narratio’, ‘petitio’, and ‘conclusio’. Careful attention was also given to ‘cursus’, whereby three
principal types of accentual cadence were employed at the ends of sentences: ‘planus’ (x́ x x x́ x), ‘tardus’ (x́ x x x́ x x), and ‘velox’ (x́ x x x x x́ x), where x́ marks a stressed and x an unstressed syllable. In connection with such ‘artes dictaminis’, reference may also be made to the ‘artes poetriae’ and to the specifically Christian ‘artes praedicandi’. These two kinds of manual issued precepts for the writing of poetry and sermons, respectively: while the ‘ars praedicandi’ dealt chiefly with the division and amplification of a theme, the ‘ars poetriae’ laid particular stress on the figures of language and thought expounded in the fourth book of the Rhetorica ad Herennium. In the sphere of grammar this period saw the rise of the Modistae, whose ‘grammatica speculativa’ set about applying to Priscian the new methods of scholastic logic; it now went beyond his morphology to embrace syntactic analysis. Grammar accordingly came to be perceived as possessing a universal character that transcended individual languages: the grammatical ‘modi significandi’ reflected the mind’s ‘modi intellegendi’, which in turn reflected the ‘modi essendi’ of reality itself. While moreover Priscian’s examples had been taken from the soigné language of the classics, the scholastically minded Modistae devised their own, which in comparison were often bizarrely utilitarian. A word should also be said about the effect of scholasticism on the Latin language itself. While Medieval Latin as a whole had always been prone to admit neologisms, this tendency was particularly characteristic of scholastic Latin. Here, particular importance was attached to the use of suffixes (e.g., certitudinaliter) and to the substantivization of adjectives (e.g., haecceitas), participles (e.g., ens rationis), and infinitives (e.g., pro posse). The result was a Latin that was precise but unclassically inelegant.
Classical elegance returned with the Renaissance humanists, whose obsession with it killed Latin as a general means of communication; even the grammars of Donatus and Priscian were rejected in favor of observing the usage of classical authors themselves. Such aspirations to linguistic classicism could moreover entail an intermittent neo-paganism that tended to disown Christianity itself. A brief mention must also be made of languages other than Latin. Educated Romans of the classical period had been bilingual, since they were equally proficient in Latin and Greek. By late antiquity, however, knowledge of Greek in the West was on the wane. During the Middle Ages, Greek was largely unknown, even though it was the language of both the New Testament and of Aristotle, who exercised a vast influence on scholasticism. Likewise, Hebrew, the language of the Hebrew Bible, was hardly known. In both cases the Middle Ages relied instead
on Latin translations. The vernaculars also were largely unimportant until the late Middle Ages, since they lacked the prestige of Latin, which was felt to be the language of theology, the most important of the sciences. See also: Latin.
Christianity in Africa A Hastings © 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 2, pp. 542–544, © 1994, Elsevier Ltd.
The involvement of the Christian churches in issues of language has been central both to the modern linguistic history of sub-Saharan Africa and to the life of the churches themselves. Apart from Swahili in the east and Fulfulde and two or three others in the more Islamized west, none of the languages of sub-Saharan Africa had been given a written form prior to the coming of missionaries.
Early Missionary Translation Work The earliest extant Bantu text is a lengthy catechism in Portuguese and Kongo produced by the Jesuit Mattheus Cardoso and printed in Lisbon in 1624. This, a subsequent Kongo dictionary by a Capuchin, Georges de Geel, and a grammar produced by another Capuchin, Giacinto Brugiotti da Vetralla in 1659, are immensely valuable for Bantu linguistic history. The last includes the Bantu noun-class and concord system. Unfortunately, the Catholic missionaries of that era produced little work in any other language. It was the 19th-century Protestant missionaries, with their far higher conviction of the necessity of Bible translation, who produced a
mass of linguistic work – dictionaries, grammars, and biblical texts for scores of languages in every part of Africa. The pioneering work of Krapf in the east, van der Kemp, Moffat, Appleyard, and Colenso in the south, and Raban, Schön, and Crowther in the west – to name but a few from the early and middle years of the 19th century – stands for a vastly larger undertaking that has as yet received no adequate historical survey. In almost every African language with a written literature, missionaries have been responsible for the basic work and indeed for most subsequent published literature as well, except in a handful of the larger languages. The vast multiplicity of African languages and the policy of most colonial and postcolonial governments to use English, French, or Portuguese for educational purposes mean that (apart from some score of languages, such as Yoruba, Swahili, Shona, and Ganda) there is still little, if anything, of any extent published in most languages except for church purposes.
Missionary Imposition of a Dialect as a Language Missionaries could not, of course, have done this work without African collaborators who had themselves first learned English. Inevitably the precise language canonized by missionaries, in the first dictionaries and New Testaments, was the dialect used by their assistants. As all languages inevitably varied geographically, so that it is indeed open to argument how far any
specific language actually existed prior to its precise, missionary-constructed form, the language of missionary literature represents in every case the particular local form in which the missionary’s assistants were at home. Once a language was printed, it was, however, essential that it be used across a wider area in which people were much less at home with it. The missionary approach had mistakenly assumed a degree of linguistic uniformity that in fact it had rather to impose. This produced inevitable tensions and at times rebellions. Moreover, other missionaries working in the same general language area but some distance away inevitably absorbed different dialectal forms. Missionaries certainly revised their translations as their language knowledge improved, but Africans had also to some extent to relearn their languages when they studied the Scriptures or other works in mission schools. Moreover, tensions between different forms of a language as fixed by a range of different missionary translations (so that, for instance, Protestant and Catholic Baganda could divide linguistically as well as theologically) continued for decades, at least until the 1930s, or later, when colonial governments insisted on standardized forms for state-assisted schools. Even then, imposed uniformity, such as ‘Union Igbo,’ was not easily accepted. The missionary preoccupation with the vernacular written word was quickly taken over by leading African Christians. The Reverend John Raban’s initial work on Yoruba was continued by the Yoruba school teacher, and later bishop, Samuel Ajayi Crowther, who made a major linguistic contribution in his recognition of the essential role of tone in Yoruba, Nupe, and other Nigerian languages. In southern Africa, the Presbyterian minister Tiyo Soga, the translator of Bunyan’s Pilgrim’s Progress into Xhosa, died in 1871 while at work on the Acts of the Apostles.
While some missionaries and African Christians, such as Schön and Crowther, were undoubtedly outstanding precisely as students of language, their primary preoccupation – to translate as quickly as possible as much biblical and Christian literature as they could – often, it must be said, weakened their achievement precisely as linguists.
The Name of God For missionaries, the great question was a practical and pedagogical one: how far could they find suitable words already in existence for the special things they wished to teach about, how far had they instead to invent words in which to do so? In the latter case, how were they to arrive at the words they needed? It is striking that in the large majority of African languages, missionaries were content to use an existing vernacular name for the all-important name of God.
In some places, at least at first, they felt unable to do this and imported foreign names like Dio or Godi. But these were exceptional and mostly short-lived. For the greater part, the missionaries became convinced that Africans had already sufficient belief in a single almighty spirit for it to be possible to adopt an African name for such a spirit to use for the biblical and Christian Yahweh, the Father of Jesus. Thus, in East Africa, Mungu, Mulungu, or Ruhanga; in central Africa Lesa or Nzambi; in southern Africa Molimo all came – among many other names – to be used. In Zimbabwe, there were long hesitations among some over the use of Mwari – in this case because of its localized cult. It is noticeable that in many cases these names had already obtained a considerable degree of intertribal currency, in some cases overshadowing more local names. It seems clear that missionaries preferred words with the wider usage.
Other Key Christian Terms This seems true for a wide range of religious terms. As missionaries advanced from one people to another, they inevitably carried words, especially key words, across intertribal boundaries. They did this particularly in eastern Africa in the early years with regard to Swahili. While Swahili is basically a Bantu language, it has incorporated a very large number of Arabic words, including especially religious words. Despite their Muslim origin, these seemed, with their theistic character and biblical links, ideal for missionary use. Hence words like eddini, essala, injili, kanisa (‘religion,’ ‘prayer,’ ‘gospel,’ ‘church’) were imported into many other inland Bantu languages from Swahili. They seemed to fill a gap while avoiding the importation of purely European words to which many missionaries were driven in other circumstances, when no suitable vernacular word was discovered for some major Christian concept. However, a later generation of missionaries found this dependence upon Swahili regrettable, perhaps because of its apparent Muslim connotations, and in several languages such words were systematically eliminated if alternatives could be found. ‘Gospel’ and ‘baptize’ are typical cases. The root meaning of the one is ‘good news,’ of the other ‘wash’ or ‘sprinkle,’ but each has turned in Christian usage almost into a proper name. Many early missionaries (Catholics especially) were unwilling simply to find a vernacular phrase meaning ‘good news’ or ‘wash’ and render the words in this way. Hence, vernacular Christian doctrinal texts could be peppered with strange-sounding terms, transliterations of Greek, Latin, or English. The tendency now is to eliminate these in favor of genuinely local words.
Muzimu and Nganga In other cases, traditional words with a religious connotation were available but were avoided just because it seemed the wrong connotation. Words for ‘spirit’ are probably the clearest case. Missionaries were most anxious not to let the ‘Holy Spirit’ become identified with, or understood in terms of, the ‘spirits’ of traditional religion, especially spirits of the dead, which frequently possessed the living. Hence, nearly everywhere the typical Bantu spirit word muzimu was rejected. Again, the almost universally used term throughout Bantu Africa for a priest, diviner, or medium, nganga, was regarded as unusable for a Christian minister in the 19th century and since. The nganga became, instead, the stereotype of the pagan ‘witch doctor’ against whose influence Christianity was battling. This is interesting because missionaries in the 16th and 17th centuries willingly described themselves as ‘nganga.’ These are two cases in which the linguistic challenge of assimilation still remains to be met.
The Fluidity of Word Meanings What seems clear is that language usage in nonliterate societies is far more fluid than might be imagined. It is less a matter of finding the right word than of making it right by regular usage. Existing words had the flexibility of all language, and anyway missionaries could not know how reliable were their informants. Words (and not only the name of God) moved easily across language groups in precolonial Africa. Missionary importations and adaptations were nothing new. Once a word was adopted, used in a certain way in scriptures, hymns, and sermons, it easily acquired the meaning now given it, whether it had it before or not. Within a generation or so, non-Christians, too, could be using it the missionary way. Thus, for instance, even where missionaries mistakenly adopted for ‘God’ a local name that had really belonged to a culture hero of quite limited importance (e.g., the Nyakyusa Kyali), the new missionary content for the word quickly became a normative one, recognized by all but the antiquarian.
The Lasting Language Impact of Missionary Translations In the scores of African languages that have possessed a Bible, a hymn book, a catechism, for 50 to 100 years, but still very little more in the way of written literature, these books, with their specific vocabulary and the meanings of the words that the content of the
Bible and Christian tradition impose upon them, may now be near the heart of living language usage. Rather little in that vocabulary was formally imported, but the conversion of traditional words to new meanings has effectively taken place. Nevertheless, the new meanings have not simply obliterated old meanings. African religion in most places is now an integrated mix of the traditional with the Christian (or the Islamic). That mix is one of concept, of ritual, and also, inevitably, of linguistic meaning. Moreover, the process described in this article has not proceeded everywhere at the same rate or gone so far. In some languages it began considerably later than in others. In many smaller languages, little of scripture has even now been printed. Much has depended, too, on how linguistically skilled early missionaries were. In some places they did indeed master the language, translate intelligibly, preach eloquently, and in a way impose their meanings upon it. In others, missionaries were unable to do this. Their linguistic ability was simply inadequate. They preached through poorly trained interpreters and remained so marginal to the vernacular culture that what translations they produced had little impact upon its world of meaning. The very fact that schooling was in English or French might have actually protected vernacular meanings from the impact of Western Christian verbal imperialism. Nevertheless, in most rural areas where Christian churches have been actively at work and (as in the case of most countries of Africa south of the Equator) now have a majority of the population considering itself Christian, biblical literature and a Christian interpretation of religious words may be almost as central to vernacular culture as was the King James version and its vocabulary specific to the culture of preindustrial Britain.
Bibliography Bontinck F & Ndembe Nsasi D (1978). Le catéchisme Kikongo de 1624. Réédition critique. Brussels. Doke C M (1967). The southern Bantu languages. London: International African Institute. Fabian J (1986). Language and colonial power: the appropriation of Swahili in the former Belgian Congo, 1880–1938. Cambridge: Cambridge University Press. Hair P E H (1967). The early study of Nigerian languages. Cambridge: Cambridge University Press. Hastings A (1989). ‘The choice of words for Christian meanings in eastern Africa.’ In African Catholicism. London: SCM Press. Samarin W (1986). ‘Protestant missions and the history of Lingala.’ Journal of Religion in Africa 16(2), 138–163.
Christianity in Central Asia and the Near East E Hunter, University of Cambridge, Cambridge, UK © 2006 Elsevier Ltd. All rights reserved.
During the 2nd century A.D., the kingdom of Osrohoene, with its capital city Edessa (Urfa in Turkey), emerged as a major center of Syriac Christianity, which quickly spread throughout the Syrian territories of the east Roman Empire and penetrated Mesopotamia. When the Sassanians conquered the Parthians in 225 A.D., the incoming Zoroastrian king Ardashir found substantial Christian communities in his territories, stretching from the northern regions of Mesopotamia to Mesene (Basra) in southern Iraq. Syriac Christianity also expanded along the coastline of the Arabian Gulf and was transmitted to India, as well as establishing itself in Iran and Central Asia. The Christian population of the Sassanid Empire was supplemented by deportees from Byzantium following the incursions of Shapur I into Syria in 256 A.D. and 260 A.D. This pattern was repeated throughout the centuries of Sassanid rule. It culminated with the influx of thousands of Monophysite captives following the defeat of Antioch by Chosroes II, whose campaigns ventured deep within the Byzantine empire, taking Jerusalem in 614 A.D. Some of the deportees were settled in Herat in northern Afghanistan, where Syrian Orthodox bishops still resided as late as the 9th century. In Mesopotamia, where the Sassanian royal house and the nobility were Zoroastrian, the burgeoning Christian communities lived alongside Jews, Manichaeans, and Mandaeans. The Council of Nicaea was convened in 325 A.D. by Emperor Constantine (324–337 A.D.) to discuss the Arian interpretation of the Trinity, resulting in the Niceno-Constantinopolitan Creed of 381 A.D. that was adopted universally by Christian communities in both the Byzantine and Sassanid territories. This unity was shattered at the Council of Ephesus in 431 A.D. when Nestorius, bishop of Constantinople, was excommunicated because of his Diophysite theology. The Church of the East, which followed the teachings of Nestorius, separated from Christianity in the Byzantine territories.
In reality, in 424 A.D. the Synod of Dādīšō, held at the religious center of Hira in southwest Iraq, had declared the Christian communities east of the Euphrates autocephalous of the five pentarchies of Alexandria, Antioch, Constantinople, Jerusalem, and Rome, effectively severing relations with the West. By the 5th century, the Church of the East had consolidated itself in eastern Iran and Turkestan,
although Syrian Orthodox, Armenian, and Byzantine (principally Melkite) missionaries were also active. The Synodicon Orientale, a listing of the synods held by the Church of the East between 410 A.D. and 790 A.D., recorded that bishops of Nišapur, Marv, and Herat were represented at the Synod of Dādīšō in 424 A.D. The Syriac History of Mar Aba recounts the consecration of a Hephthalite bishop in 549 A.D., when a delegation traveled to Seleucia-Ctesiphon to present their candidate and rendered obeisance both to the Sassanid monarch and the Patriarch. The Hephthalite see may have been the bishopric of Badghis-Qadistan, near Herat. Herat was elevated to a metropolitanate of the Church of the East at the 585 A.D. Synod of Išo’yāb, but, as mentioned, there was also a Syrian Orthodox presence. Marv, the administrative seat of the Sassanid province of Margiana, was strategically located on the Silk Route to be the starting point for missions in Central Asia among the Turkic nomadic tribes that were progressively moving westward and had infiltrated the regions around the Syr and Amu Darya rivers. Reflecting this role, the Synod of Mar Aba in 544 A.D. ranked Marv as the seventh metropolitanate, after Beth Lapat, Nisibis, Perath dhe Maysan (Basra), Arbel (Erbil), Karkha de Beth Selokh (Kirkuk), and Rew Ardashir, all of which were located in Mesopotamia or the nearby Iranian territories. The missionary activities of the metropolitans of Marv are recorded in several Syriac sources. The anonymous 6th-century Chronica Minora narrates that the metropolitan converted a Turkic kinglet by a quasi-shamanic ceremony, making the sign of the cross to still a storm that had been conjured by pagan priests. Samarkand was also a base for proselytism among both the Sogdian-speaking Iranian communities settled in east Turkestan and the westward-moving nomadic groups whose Turkic languages became the lingua franca.
Syriac sources do not comment directly on Samarkand’s role in missionary enterprises, but there are three Nestorian crosses and inscriptions engraved on a rock on the main route from Bactria to Lhasa. A single Sogdian word may be interpreted as ‘Jesus’, while a longer inscription, also written in Sogdian, reads “in the year 210 . . . came Nôš-farn from Samarkand as emissary to the Khan of Tibet” (Gillman and Klimkeit, 1999: 108). The metropolitan of Samarkand may have helped consolidate the Church of the East in Tibet where, in the late 8th century, Patriarch Timothy I made reference to the consecration of a metropolitan of the bēt tûptāyê’, possibly the Tibetans.
The metropolitanate of Kashgar probably emerged during the dynamic incumbency of Patriarch Timothy I (787/8–823 A.D.). An important trade station, Kashgar was located where the Silk Route to China bifurcated. Its jurisdiction extended over the bishoprics of Yarkand, Urumtsi, and various sites on the rim of the inhospitable Tarim basin, including Khotan. On the northern side, the Turfan oasis was dotted with Christian sites. The most renowned was Buläyiq, where a monastery of the Church of the East and its library of Syriac, Sogdian, and Christian Turkic manuscripts were discovered by the 1905 expedition of Albert von Le Coq. Kashgar was still a metropolitanate in the 12th century A.D., while a century later Marco Polo noted that near the city were “some Turks who are Nestorian Christians.” The metropolitanate was well located for ministering conversions among the eastern Turco-Mongol Naiman, Merkit, Ongut, and Kerait tribes. The precise details of the conversions of these tribes are not known, but the influence of Christianity probably was felt during the 9th century when the dioceses of the Church of the East stretched to Chang’an in western China. Gregory Bar Hebræus, writing during the Mongol Il-Khanate, narrated the conversion of the Kerait in 1007 A.D. in his two Syriac historiographies, Chronicon ecclesiasticum and Chronicon Syriacum, but the term ‘Kerait’ may have been inserted. Thomas of Marga’s Historia monastica (Book of the governors), written c. 840 A.D., narrates the preparation undertaken by missionaries, which included instruction in Syriac, the liturgical language of the Church of the East, as well as vernacular languages. Missionaries built churches, baptized, and organized dioceses, selecting the lower ranks of the clergy, priests and deacons, from the indigenous population. Bishops and metropolitans were drawn from Mesopotamian clergy, thus maintaining a direct link with Seleucia-Ctesiphon and later Baghdad.
During the Abbassid period (850–1256 A.D.), Christians assumed a prominent profile as physicians to the caliphs and the transmitters of Greek scientific and medical works. One of the principals of the House of Knowledge, an institution of higher learning in Baghdad founded by Caliph Mamun (d. 833), was the ‘Nestorian’ Hunain ibn Ishaq (d. 873), who translated over 100 books from Greek into Syriac and Arabic, including Aristotle’s Organon, Analyticon, and Hermeneutics. Despite the increasingly Islamic tenor of the law and bureaucracy, financial exactions, and forced conversion of some Christian Arab tribes, Patriarch Timothy I was able to win the respect of Caliph al-Mahdi (775–785 A.D.) by emphasizing the traditional enmity between the Church of the East and the Orthodox church of
Byzantium. A staunch defender of the faith, Timothy also debated with the caliph on the respective claims of Christianity and Islam but sought to stress the common monotheism of Muslims and Christians. The Church of the East reached its zenith during the incumbency of Timothy I. Its dioceses encompassed Arabia, where a bishop was ordained for the communities in Sanaa (Yemen); India, where the church was elevated to a metropolitanate; Tibet; Central Asia; and China. Details are found in Timothy’s extant correspondence. Writing to the monks of Mar Maron, he revealed the conversion of “the King of the Turks with all his people.” In a letter sent to the Metropolitan of Elam, Timothy announced: “in these days the Spirit anointed a metropolitan for the bēt ṭûrkāyê.” This event was also chronicled in Arabic during the 12th century by the Christian historian Mari ibn Suleiman in Kitab’ul mijdal: “Henceforth, Timothy led into faith the Khaqan, the king of the Turks and other nations.” The metropolitanate was probably located near the Syr Darya river in western Turkestan, where the Oghuz settled around 600 A.D. During the incumbency of Timothy I, the T’ang dynasty in China accepted the Church of the East as a ‘Persian religion,’ together with Manichaeism and Buddhism. The legacy of the Church of the East’s missions in Central Asia came to fruition in 1258 A.D. when Hulagu Khan sacked Baghdad. Several Christian Kerait princesses, including Sorghotani, the mother and wife of Hulagu Khan, were in the royal Mongol household. Gregory Bar Hebræus’s Chronography notes that in 1279 A.D. the mother of the third Il-Khan, Tegudar Ahmad (1282–1284 A.D.), revived the Christian procession of the Epiphany, which had ceased due to conflicts between the Christians and Muslims. The Il-Khans were disposed to Christianity and considered an alliance with Western Europe against the Muslims, sending the Uighur monk Rabban Sauma to Europe as an ambassador.
There he met Philip, king of France, and Edward I of England, as well as Pope Nicholas IV. In 1281 A.D. an Ongut monk, enthroned as Mar Yabhallaha III, became Patriarch of the Church of the East. Ten years later, the Il-Khans embraced Islam and the tide turned against the Christian communities in their domains. The Mongolian script, a derivative of Syriac, is a lasting legacy of the Christian influence. The Franciscan friars John de Plan Carpin and William of Rubruck, who visited in the 13th century, provide information about the ‘Nestorians’ at the Mongol court in Karakorum. These medieval Latin commentaries are the last eyewitness accounts of the Mongol Christians of Central Asia. At Tokmek and Pishpek, in the region of Semirech’e, between
Lake Balkash and Issy-kol (Kirghizia), cemeteries with hundreds of Old Turkic headstones written in the Syriac script indicate the presence of sizeable Christian Turkic communities between the 9th and 14th centuries. Faced with the ravages of Timur Lang, the Christian communities of Central Asia disappeared, and Mesopotamian Christians were forced northward into the Hakkari region of Kurdistan. The Church of the East existed there as an enclave until the end of the Ottoman Empire. The Syrian Orthodox community also survived in pockets around Tekrit in Iraq, in northeast Syria, and in the Tur ‘Abdin region of southeast Turkey. Under Ottoman jurisdiction, the Syriac communities were governed under the ‘Millet i-Rum’, being granted a degree of semi-independence in return for allegiance to the sultan and payment of taxes. The activities of Roman Catholic missionaries during the 17th and 18th centuries led to the creation of Uniate churches, which seceded from the Church of the East and the Syrian Orthodox. At first the links between the ancient Syriac churches and Rome were tenuous, with many schisms and counterclaims, but the Chaldaeans and the Syrian Catholics were permitted to retain their liturgy and the use of Syriac. An influx of Protestant missionaries arrived in the early 19th century, following the discovery of the ancient Christian communities. The American Presbyterian Mission established its base at Urmia (in Azerbaijan, Iran) in 1830, building schools, hospitals, and welfare
centers and establishing the first printing press. The Anglican mission to the Church of the East was led by the Rev. George Percy Badger, who was the representative of the Archbishop of Canterbury. On his two journeys in 1842–1844 and 1850 to the homeland of the Church of the East in Kurdistan, Badger also met Syrian Orthodox communities at Aleppo, Urfa, Diyarbekir, Mardin, and Mosul. The Uniate churches are now dominant in the Near East, with congregations in Iran, Iraq, Syria, Jordan, and Istanbul. The Chaldaeans are the largest denomination in Iraq, but the ancient churches also survive. Both the Church of the East, otherwise known as Assyrian, and Syrian Orthodox communities are still found in Syria, Iran, and Iraq, with sizeable diaspora communities in Europe, the United States, and Australia.
See also: Arabic; Aramaic and Syriac; Armenian; Greek, Ancient; Islam and Arabic; Mongolic Languages; Syriac; Turkic Languages; Turkish; Uyghur.
Bibliography Gillman I & Klimkeit H-J (1999). Christians in Asia before 1500. London: Curzon. Moffett S (1998). A history of Christianity in Asia. Vol I: beginnings to 1500 (2nd rev. edn.). Maryknoll, NY: Orbis.
Christianity In Latin America J L Klaiber, Pontifical Catholic University of Peru, Lima, Peru © 2006 Elsevier Ltd. All rights reserved.
With the Spanish conquest and the Portuguese settlement of Brazil, Spanish and Portuguese became the principal languages of those two New World empires: Portuguese in Brazil, and Spanish in the rest of South America, Mexico, and large areas of the southwestern United States. But European missionaries who followed the conquest used the Native American languages as their main instrument for evangelization. Thanks largely to their efforts, the myriad Indian languages of Latin America were preserved throughout colonial times, and some of them, such as Quechua, Aymara, and Guaraní, continue to be spoken today by millions of Latin Americans. The religious also transformed the oral traditions of the
pre-Columbian peoples into written literature in Spanish and Portuguese, as well as the native languages. Spain and Portugal established Catholic Christianity as the official religion of all Latin America. Under the concept of royal patronage, a special concession made by different popes, the Spanish and Portuguese monarchs enjoyed the right to establish the church, name bishops, collect tithes, and authorize missionaries to go to the New World. The initial evangelization of Spanish and Portuguese America was carried out by missionaries belonging to religious orders: Franciscans, Dominicans, Augustinians, and later on, the Jesuits. The secular or diocesan clergy followed later in order to serve the pastoral needs of the white settlers, although many of them also did mission work among the Indians and black slaves. The first phase of evangelization, which corresponded to the conquest of the Caribbean and the
first years of Spanish rule in Mexico, was somewhat disorganized: The missionaries did not know the native languages and practiced mass baptisms without knowing whether the Indians understood them. At the same time, in reaction to the atrocities committed by the European settlers, many churchmen raised their voices in defense of the Indians. Antonio de Montesinos, a Dominican, denounced his fellow Spaniards in a sermon in Santo Domingo in 1511, and soon afterward Bartolomé de las Casas, who originally had gone to the New World to exploit the Indians, experienced a major conversion, became a Dominican, and emerged as champion of the rights of the Indians. He influenced Emperor Charles V, who was simultaneously King of Spain (1516–1556), to make laws to protect the Indians.
Mexico: The Utopian Phase
This first phase was characterized by utopian and even apocalyptic concepts. Many religious dreamed of forging a church in which they and the Indians would live together in peaceable harmony, free from the contamination of European civilization. Vasco de Quiroga, a lay humanist who was a member of the ruling board of Mexico, founded two colonies for the Indians, both based on Thomas More’s Utopia. The Franciscan bishop of Mexico City, Juan de Zumárraga, founded several schools for Indian boys and girls, the most famous of which was Santa Cruz of Tlatelolco (1536). In time, the missionaries finally began mastering the native languages, usually by having recourse to newly baptized Indians, especially children, many of whom studied in the mission schools. The Franciscan friar Toribio de Benavente (known by his Nahuatl name of Motolinía) wrote a history of the Indians of New Spain (Mexico), and later another Franciscan, Bernardino de Sahagún, with the aid of his own Indian students at Santa Cruz of Tlatelolco, wrote the multivolume History of the things of New Spain (finished around 1580) in Nahuatl and Spanish. These early histories were based upon oral testimony and codices that contained pictures and symbolic expressions of Aztec history and culture. The missionaries also wrote grammars, catechisms, and books of sermons in Nahuatl and other native languages. In 1544 Pedro de Córdoba, a Franciscan, published the first catechism printed in Mexico, ‘in the Mexican and Castilian Languages.’ Between 1524 and 1572, 109 works were written in native languages in Mexico by missionaries, 80 of which were done by Franciscans.
Peru
These same religious orders set about evangelizing Peru following Pizarro’s conquest of the Incas in 1532. They founded Indian parishes and wrote grammars and sermons in Quechua, the language that the Incas had used as the lingua franca throughout their empire. In southern Peru and Bolivia the principal language used was Aymara. Viceroy Francisco Toledo (1569–1581) created a system of reducciones (territories reserved only for Indians) in order to protect the Indian population from unbridled exploitation by the Spanish. At the same time, the reducciones served to control the Indian population and facilitate the process of evangelization. For each reducción a parish, known as a doctrina, was established.
Methods of Evangelization
Following the Council of Trent (1545–1563), similar church councils were held in Spanish and Portuguese America with the aim of reorganizing the evangelization process and standardizing methods. The Third Lima Council (1582–1583), presided over by Archbishop Toribio de Mogrovejo, produced a catechism. It was the first book printed in Peru, written in three languages: Spanish, Quechua (or Quichua), and Aymara. This catechism was used in Indian parishes until the 18th century throughout the Andes, in Peru, Bolivia, Chile, and Ecuador. Priests who worked in these parishes were expected to preach in the native languages. If they could not, they had recourse to Indian interpreters and catechists. But evangelization was not limited to catechism classes; Christianity was also communicated through the liturgy and sermons, through Biblical plays, and through art and music. In Bolivia and Peru a school of native artists painted images of the Trinity, Christ, Mary, the angels, and many Biblical themes. Popular religiosity in Latin America, which centered on certain devotions, was in large part a creation of the newly Christianized Indians, mestizos, and blacks.
The Jesuits
Arriving early in Brazil (1549), the Jesuits soon became the dominant missionary order there until the 18th century. They arrived later in Spanish America (Peru in 1568; Mexico in 1572), where the other orders had already carried out the initial evangelization. But the Jesuits came for another purpose: to found schools for the Creoles, the sons of the Spanish born in the New World. They also worked with the Indians, but in a
more specialized way than the first religious. For example, in Peru they were entrusted with running two schools in Lima and in Cuzco for the sons of Indian chiefs. José de Acosta, a Spanish Jesuit, wrote two important works: De procuranda Indorum salute (‘On bringing about the salvation of the Indians,’ 1588), which became a standard manual for missionaries on how to treat the Indians; and A moral and natural history of the Indies (1590), a wide-ranging description of the religion and cultural achievements of the Peruvian and Mexican Indians. Other Jesuits became notable linguists, especially Alonso de Barzana and Blas Valera. Blas Valera was a mestizo (born of Spanish and Indian blood) who knew Quechua as a child and later as a Jesuit wrote extensively on Inca history and culture. Though most of his writings were lost, they did reach the hands of Garcilaso de la Vega, the Cuzco-born mestizo who wrote the Royal commentaries of the Incas, in which he cites Valera extensively. Garcilaso and Blas Valera may be considered among the creators of a distinct American and Peruvian literature because they represented a non-Spanish mentality that esteemed (and at times romanticized) the positive values of Indian culture and history. For the Jesuits especially, language was the key to evangelization. Particularly important was their mission in Juli, by Lake Titicaca, where they ran four parishes for Indians. Many Jesuits learned Quechua or Aymara in Juli before going to their permanent assignments in Lima, Bolivia, or Chile. Juli also became a connecting link between Lima and the Jesuit missions in Paraguay. Finally, after Charles V, who allowed Flemish friars to work in the New World, the door to foreign (non-Iberian) missionaries was closed. The Jesuits alone had the privilege of having a percentage of foreign missionaries: Germans, Bohemians, Swiss, Italians, Tyroleans (like Eusebius Kino in Arizona), and Flemish.
The Missions
In colonial Latin America, the ‘missions’ referred to the nomadic and warlike Indians living near the frontiers of the two empires: the deserts of northern Mexico and the Amazon jungle in South America. These Indians had not been touched by the initial conquest and resisted efforts to subdue them until the 17th, 18th, and even the 19th century. But the missionaries were able to do what the soldiers could not: approach the Indians and persuade them to build mission towns. The soldiers followed later in order to protect the missions from non-Christian Indians and the encroachment of other Europeans. The two principal missionary orders were the Franciscans and the Jesuits. Both established missions in California
(the Franciscans in present-day California and the Jesuits in Lower California), northern Mexico, remote regions of the Andes (in Peru and Bolivia), the Amazon jungle (in Peru, Bolivia, Ecuador, Colombia, Venezuela, and Brazil), Paraguay, and the Chaco region of Paraguay and Argentina. Missionaries of both orders learned the new languages and wrote grammars, dictionaries, and catechisms. Gonzalo de Tapia, an exceptionally gifted Jesuit in northern Mexico, was able to give a sermon in Tarascan 15 days after he began studying the language. In Brazil, José de Anchieta, a Portuguese Jesuit, wrote a grammar in Tupí (1556) that became a standard for other missionaries. In the 17th century, another Jesuit, Antonio de Vieira, founded numerous villages to protect the Indians in the Maranhão region of northern Brazil. Vieira exhorted fellow Jesuits to learn the native tongues of the Indians because the Holy Spirit would speak through those new tongues. In Chile, Luis de Valdivia wrote a grammar for the Mapuche language. This intense linguistic activity made it possible for Lorenzo Hervás, the 18th-century Spanish Jesuit philologist, to include a great number of native American languages in his survey of world languages (1784).
Paraguay
Paraguay is the only truly bilingual country of Latin America. This is due in large measure to the Jesuit missionaries, who separated the Guaraní Indians from the white settlers and gathered them into 30 reducciones (mission towns), and for over 150 years (1609–1768) spoke to them in their native Guaraní. Before the Jesuits, two Franciscans had prepared the way: Luis de Bolaños, who mastered Guaraní and wrote the first grammar that later missionaries would use, and Francisco Solano, who worked among the tribes in the Argentinian Chaco. The Jesuits themselves created a unified ‘mission Guaraní’ that became the lingua franca for all the missions. Although the Indians learned a rudimentary Spanish, Guaraní was the principal language used in the missions, which covered present-day southern Paraguay, Uruguay, and the northern department of Argentina, Misiones. Printing presses in the missions produced grammars, dictionaries, and catechisms. The standard grammar used was Arte de la lengua guaraní (Grammar of the Guaraní language, 1640) by Antonio Ruiz de Montoya, a Peruvian Jesuit who was superior of the missions. Two Jesuits who worked among the Chaco tribes wrote lengthy descriptions of their customs and beliefs: Martin Dobrizhoffer, a Bohemian, and Florian Paucke, a Swiss.
After the expulsion of the Jesuits in 1768, the missions fell into decline. The mission Indians drifted into neighboring Spanish and Portuguese cities, where many used the crafts they had learned in the missions to obtain work. The music played in the mission towns of Chiquitos in eastern Bolivia lives on today, where Baroque music festivals are held yearly.
Creole Authors
In later colonial times, native-born religious made significant contributions to Latin American literature. In Mexico, Sor Juana Inés de la Cruz became the leading poet of America and the only female literary figure of note. The exiled Jesuits, especially Francisco Clavijero of Mexico and Juan Ignacio Molina of Chile, wrote histories of their respective countries that impressed Europeans with the material and cultural wealth of the Indian civilizations of preconquest America.
Contemporary Christianity
Protestantism appeared in the 19th century but only experienced significant growth in the 20th century, especially in Chile, Brazil, and Guatemala. In 1936, William Cameron Townsend founded the Summer Institute of Linguistics with sections in many Latin American countries. Using the Bible as its principal instrument, the Institute aims to convert the languages of the native peoples of the Amazon and other remote areas into written languages. Affiliated to the Wycliffe Bible Translators, based in the United States, the Institute has translated the Bible into numerous native languages. In general, Protestants of the historical confessions and progressive Catholics have achieved a close ecumenical relationship.
See also: Aymara; Hervás y Panduro, Lorenzo (1735–1809); Mapudungan; Missionary Linguistics; Nahuatl; Portuguese; Quechua; Society and Language: Overview; Spanish.
Bibliography
Abad P A (1992). Los franciscanos en América. Madrid: Editorial Mapfre.
Albó X (1966a). ‘Jesuitas y culturas indígenas.’ América Indígena 26(July), 249–308; (October), 395–445.
Albó X (1996b). ‘Notas sobre jesuitas y lengua aymara.’ Anuario de la Academia Boliviana de Historia Eclesiástica 2, 94–114.
Burgaleta C M (1999). José de Acosta, S. J. (1540–1600): his life and thought. Chicago: Loyola University.
Caraman P (1990). The lost paradise. New York: Dorset Press.
Cohen T M (1998). The fire of tongues: Antonio Vieira and the missionary church in Brazil and Portugal. Stanford: Stanford University Press.
Cushner N P (2002). Soldiers of God: the Jesuits in colonial America, 1565–1767. Buffalo, NY: Language Communications/Digital@batesjackson.
Dussel E (ed.) (1992). The church in Latin America, 1492–1992. Maryknoll, NY: Orbis Books/Tunbridge Wells, UK: Burns & Oates.
Ganson B A (2003). The Guaraní under Spanish rule in the Río de la Plata. Stanford: Stanford University Press.
Martin D (1990). Tongues of fire: the explosion of protestantism in Latin America. Oxford: Basil Blackwell.
Martín L (1968). The intellectual conquest of Peru: the Jesuit college of San Pablo. New York: Fordham University Press.
Métraux A (1944). ‘The contributions of the Jesuits to the exploration and anthropology of South America.’ Mid-Century: An Historical Review 26(July), 183–191.
Ricard R (1966). The spiritual conquest of Mexico. Simpson L B (trans.). Berkeley, CA: University of California Press.
Ronan C E (1977). Francisco Clavijero, s. j. (1731–1787), figure of the Mexican enlightenment: his life and works. Chicago: Loyola University Press.
Ronan C E (2002). Juan Ignacio Molina: the world’s window on Chile. New York: Peter Lang Publishing.
Santos A (1992). Los jesuitas en América. Madrid: Editorial Mapfre.
Stoll D (1982). Fishers of men or founders of empire? The Wycliffe bible translators in Latin America. London: Zed Press.
Stoll D (1990). Is Latin America turning protestant? The politics of evangelical growth. Berkeley: University of California Press.
Christianity in South Asia
S Kim, York St John, College of the University of Leeds, UK
© 2006 Elsevier Ltd. All rights reserved.
Christianity in South Asia has developed in the midst of South Asia’s rich historical traditions, diversities of cultural expressions, and multitudes of languages – 1650 or more. It also has faced major religious traditions of Hinduism, Buddhism, and Islam as well as other local religious practices. Interaction with churches and missionaries from the West has been a prime factor in shaping Christianity in this region, but the indigenization of Christian faith in local sociocultural forms by South Asian Christians has played an equally important role in this process.
The Historical Formation of Christianity in South Asia
Three different influxes form the background to Christianity in South Asia: St. Thomas Christianity, Roman Catholic Christianity, and Protestant Christianity. Each forms a distinctive aspect of South Asian Christianity.
Mar Thoma Christianity
The arrival of the Apostle Thomas in the first century is regarded as the beginning of Christianity in South Asia. Thomas is believed to have arrived on the southwest coast of Malabar by sea, or perhaps came over land from the north, and was later martyred on the southeast coast at Mylapore. Though this tradition lacks material support, nevertheless the existence of Christian and Jewish communities in Malabar can be traced back to the second century. The oldest literature describing the work of St. Thomas is the Acts of Thomas, which originated in the second century, as is evidenced by the existence of a Syriac version. Other Christian traders and settlers moved into the western coast of India. Later, as a result of persecutions of Christians in the Persian Empire, there were increasing influxes of Christians to India, and eventually the patriarch of Babylon claimed ecclesiastical authority over these Christians. Syriac became the literary and liturgical language for all Thomas Christians and the doctrine was Nestorian, that is, of the belief that there are two separate natures, divine and human, in Christ (contrary to the Creed of Chalcedon). Though their allegiance was to the Syrian patriarch of Babylon, due to Muslim expansion in the 7th century this close relationship became more difficult to maintain. In spite of enjoying higher status than other local people, Mar Thoma Christians faced a major
challenge when the Portuguese authority established the Padroado (royal patronage) in Goa in 1533, and during their subsequent missionary work toward the south. The Syrian Church of the East suffered its major schism in the 19th century, when the Mar Thoma Church of Malabar broke away. Furthermore, as a result of the intervention of outside ecclesiastical authorities, missionary work by other groups, and various internal divisions, there are currently at least six different groups in Kerala who claim their origin in the Thomas tradition.
Catholic Christianity
Catholic Christianity was introduced with the arrival of the Portuguese, and the ecclesiastical hierarchy was established in Goa. Through intermarriage with the local community and the efforts of monastic priests and friars, the Christian population of Goa increased. The encounters between Roman Catholic and Thomas Christians were long and complicated. They began when the fishermen of Cochin asked for help from the authorities in Goa in 1502 to protect them from Muslim threats, which resulted in the acceptance of Roman ecclesiastical authority over the Thomas Christians. However, the relationship between Roman Catholics and Thomas Christians was soured by differences over doctrine and ritual and over ecclesiastical hegemony in Kerala. When the Propaganda Fide encouraged mission to areas beyond Portuguese control, Catholic missionary activity concentrated on new territories, for example St. Francis Xavier on the Parava coast. Jesuits worked in northwestern Sri Lanka, where they formed the Tamil- and Sinhala-speaking Christian communities, and in other parts of India, including the areas controlled by the Mughal empire; although some were invited to the court of the Grand Mughal, their work was generally limited to the cultural and diplomatic realm. Most Catholics are of the Latin rite but, following Vatican II, there have been various changes, including the indigenization of liturgy in the local language and the employment of Hindu symbols and philosophy.
Protestant Christianity
Protestant missions began their work with the arrival in 1706 of Lutheran missionaries in Tranquebar, which was a Danish trading port. As a result, the Tamil Evangelical Christian community was established. From 1792 onward, William Carey and other missionaries from England arrived in Serampore, from where the various modes of Protestant mission steadily developed. Some of the East India Company’s
chaplains, such as Henry Martyn, were also involved in mission work beyond their role of ministering to the British officials and soldiers and, in fact, many of the company’s officers were sympathetic toward missionary work among Indians. One of the distinctive aspects of Protestant missions was their emphasis on the translation of the Bible into local languages. The missionaries in both Tranquebar and Serampore concentrated their efforts in Bible translation, and this had a great impact, not only in terms of the church expansion but also in the indigenization of Christianity in India. During the late 19th and early 20th centuries the conversion of lower and outcastes in various parts of India brought changes in the demography of the Indian religious setting. Mass conversion along the lines of caste group or community was a general pattern in the history of Christianity in India, but this became a very sensitive issue, especially as the British government in India moved in 1932 to provide a separate electorate for the depressed classes, and it thus became a political issue. Toward Independence there were active discussions for unity among the Protestant denominations in South India, and as a result the Church of South India was formed in 1947 as an amalgamation of the South India United Church, Anglican, and Methodist Churches. This was an unprecedented case of the union of Episcopal and non-Episcopal churches, which became a basis of schemes of union in North India, Pakistan, and Sri Lanka. Furthermore, in 1978 a joint council was established by the Church of South India, Church of North India, and the Mar Thoma Church, reflecting a high degree of union: intercommunion, doctrinal unity, episcopal polity, mutual recognition of one another’s ministry, and some joint activities.
Various Christian Expressions and the Demography of Christianity
Christianity was introduced to the northeast of India by the Baptists of Serampore during the early part of the 19th century; they were soon followed by American Baptists and Welsh Presbyterian missionaries during the rest of the 19th and early 20th centuries. Baptist missions were particularly successful in terms of numerical growth, resulting in a remarkably high proportion of Christians in these states (up to 90% in places). The churches in northeast India face considerable challenges in terms of sociopolitical and economic tensions with the central government, but continue to maintain Christian activities both within the northeast and beyond. Recently Pentecostal and charismatic movements, which have conservative theology, organized missionary apparatus, and an emphasis on various spiritual gifts mentioned in the Bible, have been strong among the lower and outcastes, and this seems set to continue. There has also been movement among dalit (outcaste) Christian groups for their social uplift in society as well as equality within the Christian churches. Christianity in India is the second largest religious minority, after Islam. According to the 2001 census there were 24 million Christians, 2.3% of the total population. Looking at the states, the northeast states have a very high proportion of Christians – for example, Nagaland, 90%; Mizoram, 87%; Meghalaya, 70.3%. Goa has 26.7% and Kerala 19.0% whereas in many of the north and central states Christians are less than 1% of the population. The Christian churches have been actively involved in medical, educational, and other social work in India, for example, the work of Mother Teresa. Christianity in Sri Lanka largely consists of Catholics who are descendants of converts during the Portuguese time, Anglicans, and some others. Most of the 1.7 million Christians (9.3% of the population) live on the west coast and over 70% are Catholics. Pakistan has 3.8 million Christians (2.4% of the population), of which 80% are from the Punjabi-speaking ethnic group as a result of an early 20th-century mass conversion movement. In 1970 the United Church of Pakistan was formed. In recent years Christians have been the subject of intimidation by some militant Muslim groups. As in other South Asian nations, Christians are a minority in Bangladesh and Nepal, and their religious worship and activities are limited by their disadvantaged position in their societies.
Issues in Christianity in South Asia
Christianity in South Asia, with the exception of the Mar Thoma tradition, has been introduced in association with foreign imperial authorities, and this historical attachment has always been a problem for the Christians in this region. There are two major areas of concern for Christianity in South Asia: ‘inculturation’ of Christianity (making Christianity acceptable and relevant to South Asian culture and society) and overcoming the problem of conversion (the relationship between the Christian community and the wider religious community).
‘Inculturation’ of Christianity
One of the earliest and best known examples of Christian attempts at inculturation was by the Jesuit missionary Roberto de Nobili, who arrived in Madurai in 1605. He mastered Sanskrit in order
to convey the Christian faith by means of Indian philosophy, and also translated catechisms into the Tamil language. He lived like a sannyasi (Hindu holy man) and worked among the Brahmin caste. He even suggested to the Catholic authorities that Sanskrit instead of Latin should be taken as the liturgical language for the Indian church. Since then there have been various attempts to bridge the gap between Christian doctrines and Hindu philosophy by actively incorporating Sanskritic religious concepts into Christian thinking. In late 19th-century Bengal, Brahmabandhab Upadhyay advocated the idea of Indian Christians as ‘Hindu-Catholic’ in the sense of being Hindu in culture and Christian in faith, and suggested that the Christian doctrine of the Trinity could be related to the definition of Brahman as Being, Consciousness, and Bliss. Also following classical advaita philosophy, in 1964 Raymond Panikkar published his book The Unknown Christ of Hinduism, in which he argued that Christ is already present in Hinduism as Ishvara (the Lord). On the other hand, following the tradition of bhakti devotion, in the early 20th century A. J. Appasamy interpreted Jesus Christ as the unique Avatar (descent) or incarnation of the deity.
The Problem of Conversion and Communal Relations
The most contentious issue between Christian and Hindu communities has been the problem of conversion. The traditional understanding of conversion as manifested in joining the Christian community leads to serious difficulties in the life of the converts in South Asia, particularly in India, where change of religious community has major implications for relations with the wider Hindu community. Hindu leaders oppose Christian conversion as incompatible with Indian philosophies and social practices, and have countered it by legislation and by the reconversion of Christian converts. In the 1930s in particular, M. K. Gandhi made his strong objection to Christian conversion activities a part of his political agenda in his struggle against the British Raj because he feared mass conversions would increase communal disturbances. During and after Independence, the discussion about conversion focused on the inclusion of the freedom to ‘propagate’ as one of the fundamental rights debated in the Indian Constituent Assembly (1947–1949). Hindu objections to Christian missionary activities led to a public inquiry
into missionary activities by the government of Madhya Pradesh in 1954. The resulting Niyogi Report, completed in 1956, was highly critical of converting activities, particularly the conversion of tribals, and of the activities of foreign missionaries. Subsequently, Hindu objections to conversion were made concrete in three main ways: by the introduction of Hindu ‘personal laws,’ which were disadvantageous for caste Hindus who converted to another religion (1955–1956); by the limitation of social benefits for converts from Scheduled Caste backgrounds (1950s); and by the passing of the Freedom of Religion Acts in various states (1960s and 1970s). The election of 2004 was won by the Congress Party, which promotes a secular approach to religious issues in the central government, and the tensions over the conversion issue seem to have relaxed. Christianity in South Asia, despite being a minority in numbers, has made major contributions to nation-building, especially in education, medical work, and other social areas. Christian communities in India try to live side by side with their neighbors by fully integrating into the languages and cultures of India, while at the same time affirming their Christian faith and practices, which may be distinctive from others. The future shape of Christianity in South Asia may lie in the balancing of these two aspects of integration and distinctiveness as Christians strive to exhibit their integrity and identity in both public and private life in the diverse yet often rigid cultures and societies of this region.
See also: Aramaic and Syriac; Christianity, Catholic; Christianity, Protestant; Dravidian Languages; Hindi; Islam in Southeast Asia; Sanskrit; South Asia: Religions; Tamil.
Bibliography
Boyd R (1975). An introduction to Indian Christian theology. Delhi: ISPCK.
Brown L (1982). The Indian Christians of St Thomas (2nd edn.). Cambridge: Cambridge University Press.
C.H.A.I. (1988–1997). History of Christianity in India (vols I–V). Bangalore: The Church History Association of India.
Moffett S H (1998). A history of Christianity in Asia (vol. 1). Maryknoll, New York: Orbis Books.
Neill S (1985). A history of Christianity in India, 1707–1858. Cambridge: Cambridge University Press.
Christianity in Southeast Asia
B Watson Andaya, University of Hawai‘i, Honolulu, Hawaii
© 2006 Elsevier Ltd. All rights reserved.
The Early Spread of Christianity
The history of Christianity in Southeast Asia can be dated from 1494, when the Treaty of Tordesillas divided the global missionary project between Europe’s two great Catholic powers. What is now the Philippines fell in the Spanish sphere, with the Portuguese given responsibility for the Malay-Indonesian archipelago. A Spanish expedition reached the Philippines in 1521, but a full commitment did not come until 1570, when Manila was captured. Over the next century Spanish control and the Christian religion spread over the entire archipelago except for the Muslim south. From the outset, the Catholic orders were committed to mastering local languages. Using romanized phonetic writing rather than local scripts, they compiled dictionaries, catechisms, confessions, and sermons in the major Filipino languages. Latin or Castilian words were retained for terms such as ‘God,’ ‘heaven,’ ‘hell,’ ‘holy spirit,’ ‘sin,’ ‘limbo,’ etc., so that new concepts would not carry the baggage of animist beliefs. The vocabulary and traditions of Christianity were also adapted by Southeast Asians, with Filipino versions of the sung Pasyon, the life of Christ, assuming a central position in religious ritual. Though Church schools were a major vehicle for imparting the new teachings, ethnolinguistic boundaries were reinforced because each order developed expertise in the cultural area to which it was assigned. The impact of Portugal’s Christianizing effort was more limited than that of Spain. The conquest of Melaka on the Malay Peninsula in 1511 was expected to provide a base for missionary work, but Islam was already entrenched in the western archipelago. There was less opposition in the islands east of Java, but despite initial successes the lack of priests made it difficult to prevent apostasy. Numerous Portuguese words were nonetheless adopted into local languages and Portuguese became widely used in diplomacy and trade.
The creole known as Kristang is still spoken by the Portuguese-descended community in Melaka. The Portuguese missionary effort suffered a major blow when the Dutch East India Company (VOC) captured Melaka in 1641. The Dutch harbored an intense dislike of Papism and actively promoted Protestantism, especially in the spice-producing islands of eastern Indonesia where VOC interests
were concentrated. VOC ministers used Portuguese or more commonly Malay for Christian instruction, since both were well established as lingua francas. The Bible was printed in both romanized Malay and the Arabic-based Jawi script. Christian teachings were prominent in the schools established in areas under VOC control, with teaching delegated to Christian schoolmasters familiar with Malay in its new romanized form. Although older Catholic communities such as the one in Melaka survived, Portuguese missionary activity was mostly restricted to coastal areas in Flores and eastern Timor. The British settlements of Penang (1786) and Singapore (1819), intended primarily to further trade, had only limited influence in Christian proselytizing. In mainland Southeast Asia, Catholic missions were more active. A romanized form of Vietnamese known as quoc ngu (‘national language’) was developed to help missionaries prepare sermons and translate texts. In 1662, the Missions Étrangères de Paris (MEP) established a base in Siam, but their ‘converts’ were generally infants, the dying, or marginalized groups such as the Mon or Lao. The MEP had more success in neighboring Vietnam, especially among women, but suffered periodic persecution, and during the 18th century missionaries were expelled or withdrew.
Christianity, Colonialism and Nationalism
During the 19th century, all Southeast Asia except Thailand came under European control. Colonial administrations typically used a European and a selected local language for administrative purposes, but always showed a preference for employing Christians. Roman Catholicism thus revived in French-controlled Vietnam, and French became the language of the elite. Quoc ngu was promoted and eventually displaced Chinese and the Chinese-based Vietnamese script known as nom. French-speaking Vietnamese were also employed in Laos and Cambodia. In the Netherlands Indies, Malay was the medium of communication between colonial officials and indigenous administrators, although English was somewhat more common in British Malaya. Because Burma was part of British India, Indians fluent in English were commonly hired for the colonial service, and Burmese never developed into an administrative language. Colonial administrations were generally wary of encouraging Christian missions in societies already committed to other world religions. However, remote areas were opened up to missionaries, who saw the
compilation of dictionaries and grammars for unwritten and little-known languages as a primary task. New as well as revised translations of the Bible and other Christian books were accorded a high priority. Fonts were developed to print the gospels, Christian tracts, and catechisms in Burmese and Thai characters. In 1813, American Baptist missionaries began working in Burma, concentrating on non-Buddhist societies such as the Chins, Karens, and Kachins. The Netherlands East Indies government looked favorably on the Bible Society and other missionary groups because conversion of animist groups was seen as a way of stemming the Muslim advance. From the mid-19th century, Catholic missions were permitted entry, with German Lutherans prominent among the Toba Batak of Sumatra. In Vietnam the French ‘civilizing mission’ encouraged Church penetration into the highlands, and in 1917 the first extensive dictionary of a Miao dialect was published. This linguistic codification had far-reaching effects because the standardized form of favored languages was endorsed by government usage, mission education, school texts, and vernacular newspapers. While this affirmed the position of well-known languages such as Malay, others such as Iban became more widely known as a medium for mission teaching. Tetun Prasa (one of 16 languages in Portuguese Timor) became a lingua franca primarily because it was promoted by the Catholic Church. Often a new sense of identity was forged between previously separate groups, as in the Kachin areas, where Baptist missionaries created a common language, Jinghpaw, for all the Kachin tribes. For the most part, indigenous Christian communities saw themselves as allies of colonial regimes, and it is not surprising that migrant Asians, especially Chinese, frequently adopted Christianity. In the Philippines, however, the tensions associated with colonialism were fueled because the Catholic Church never welcomed the ordination of indigenous priests.
The vocabulary of the Pasyon, with its images of heaven, illumination, and saintly leadership, proved highly effective in motivating and inspiring peasant-based anticolonial movements. Resistance also drew on a growing body of nationalist literature written in Spanish by elite Filipinos, the most notable of whom was Dr. José Rizal. When a revolution against Spanish rule finally erupted in 1896, he was condemned as a traitor and executed. With America’s acquisition of the Philippines in 1898 after the Spanish-American war, the Catholic Church lost some of its former wealth, but it remained a formidable influence in Philippine society. However, the door was now open to Protestant
missionary activity, notably in the highlands and in the Muslim island of Mindanao. English became the accepted medium for Christian preaching and instruction among a new generation of Protestant and Catholic leaders. Tagalog was more common in a Catholic breakaway group, the Iglesia Filipina Independiente, and in the Protestant Iglesia ni Kristo, both of which developed during the early years of the American occupation. The outbreak of the Second World War and the Japanese invasion of Southeast Asia in 1941 mark the end of a chapter. All Europeans were imprisoned, and although the French gained a short reprieve in Vietnam, the Japanese eventually assumed control here as well. Those who followed Christianity were regarded as supporters of the colonial powers, and European languages were forbidden.
Christianity After Independence
All the colonized countries of Southeast Asia eventually attained their independence after the Second World War, and indigenous control over practice and policy has increased as more local men and women have become ministers, priests, and high-ranking church officials. In 1960, for instance, the first Filipino cardinal was appointed. Although foreign missionaries have continued to work in the region, despite occasional government opposition, there have been pronounced linguistic shifts. In Catholic areas, this was especially evident after Vatican II (1962–1965) sanctioned indigenous languages for celebrating Mass. Within the formal church structure, some developments are uniquely local. In the Philippines, the Pentecostal movement known as El Shaddai has remained within the Catholic Church, but its charismatic leadership, belief in miracles, and use of symbols invoke earlier peasant movements. Because preaching is largely in Tagalog rather than English or the ‘Taglish’ associated with intellectuals, El Shaddai has a strong working-class base and has moved overseas via the Filipino diaspora. At the same time, evangelical churches in the Philippines have also proliferated, drawing many adherents away from mainstream Catholicism. In Indonesia and Malaysia, most churches still display their origins in European missionizing, but there is considerable room for local transformations because they were initially built on an ethnic base, such as the Gereja Batak Karo Protestan (the Karo Batak Protestant Church) and the Gereja Toraja, both founded in 1941. This allows for local expressions of Christian identity such as the huge monuments that Toba Bataks erect in honor of the dead. After 1965, when communism was made illegal, Church
membership increased everywhere in Indonesia. Because Indonesian rather than a regional language is increasingly common in city churches and among a younger generation of Christians, earlier ethnic explicitness has been muted. An example of a greater inclusiveness comes from Borneo, where the Sidang Injil Borneo (Borneo Gospel Council, established by Westerners in 1928) is now under complete local control. It is very strong among some interior communities such as the Kelabit, but has also established a firm base in urban centers by conducting services in both Malay and Iban. Annual pilgrimages to revered mountain sites, faith healing, and a belief in miracles are all reminiscent of older animist practices. As elsewhere in Southeast Asia, this localization can generate tensions. While there are few difficulties in employing indigenous languages, questions always arise about the adoption of specific cultural practices, especially in regard to marriage and burial. Throughout contemporary Southeast Asia, connections with overseas Christian organizations remain influential. Because the region is so linguistically diverse, translation of Christian texts is a high priority. In 1971, the Summer Institute of Linguistics was established to translate the Bible into little-known languages. It has been very active in Indonesia, where Catholicism and Protestantism are two of the five officially sanctioned religions and where missionary work is thought to assist development programs. The Indonesian government has permitted numerous Christian missions to teach and proselytize, with most using Indonesian as a medium. In the Philippines, evangelization has also continued in Mindanao, and among interior groups. 
Apart from Thailand, where the presence of missionaries as teachers and health workers among the hill tribes is again seen as compatible with the government’s goal of modernization, Christian proselytization on the mainland has faced more obstacles because there is always the potential that Christian churches will become critics of the government. The authorities have only to look towards East Timor, which was invaded by Indonesia in 1975 and where the Catholic Church emerged as a focus for dissent, particularly after 1981 when Mass was conducted in Tetun Prasa in place of the banned Portuguese. In Burma, ethnic groups associated with Christianity such as the Karens have also been strong opponents of the government. Such examples explain why Christian congregations in Vietnam are subject to closer surveillance, especially in the non-Vietnamese highlands. In the Philippines, the Catholic Church initially supported President Ferdinand Marcos in his declaration of Martial Law in 1972, but Church leadership was critical in the ‘people’s power’
movement that ended the Marcos regime. Much of this opposition was generated by activist priests, nuns, and laity, who argued that liberation theology and an alliance with the Communist Party were the best means for dealing with entrenched poverty. For the most part, this left-wing movement has been retained within the Catholic Church. At the beginning of the 21st century, a disturbing development is the simmering Christian-Muslim hostility in eastern Indonesia and the southern Philippines. Although the issues must be analyzed within a historical context, hostilities have been fueled by global developments. Internet sites have become a battleground as advocates from both sides seek to make their causes more widely known, mostly in English, but also in local languages. Despite some internal divisions, Christianity maintains an unassailable position in the Philippines (91% of the population) and East Timor (93%), with the majority Roman Catholic. In all the other countries of Southeast Asia, Christians are a minority, ranging from an estimated 10% in Brunei and 8% in Indonesia to 4% in Burma and 0.5% in Thailand. In these places, Christianity is more vulnerable if it is thought to provide an umbrella for opposition to government policies.
See also: Applied Linguistics in Southeast Asia; Bateson, Gregory (1904–1980); Burma: Language Situation; Christianity in South Asia; Indonesia: Language Situation; Islam in Southeast Asia; Language Education Policies in Southeast Asia; Malay; Malaysia: Language Situation; Philippines: Language Situation; Tai Languages; Thailand: Language Situation; Vietnam: Language Situation.
Bibliography
Collins J T (2004). ‘A book and a chapter in the history of Malay.’ Archipel 67, 77–127.
DeFrancis J (1977). Colonialism and language policy in Viet-Nam. The Hague: Mouton.
Fernandez P (1979). History of the Church in the Philippines, 1521–1898. Manila: National Book Store.
Fox J J (2000). ‘Tracing the path, recounting the past: historical perspectives on Timor.’ In Fox J J & Soares D B (eds.) Out of the ashes: destruction and reconstruction of East Timor. Adelaide: Crawford House.
Herbert P & Milner A (1989). South-East Asia. Languages and literatures: a select guide. Honolulu: University of Hawai‘i Press.
Ileto R C (1979). Pasyon and revolution: popular movements in the Philippines, 1840–1910. Quezon City: University de Ateneo Press.
Keyes C (ed.) (1996). ‘Symposium: Protestants and tradition in Southeast Asia.’ Journal of Southeast Asian History 27(2), 280–386.
Rafael V L (1993). Contracting colonialism: translation and Christian conversion in Tagalog society under early Spanish rule. Durham: Duke University Press.
Steenbrink K (2002). Catholics in Indonesia, 1808–1900. Leiden: KITLV Press.
Swellengrebel J L (1974). In Leijdeckers voetspoor. Anderhalve eeuw Bijbelvertaling en taalkunde in de Indonesische talen. I, 1820–1900. The Hague: Martinus Nijhoff.
Christianity in the Far East
B Vermander, Taipei Ricci Institute, Taipei, Taiwan, ROC
© 2006 Elsevier Ltd. All rights reserved.
The encounter between Christianity and East Asian languages, cultures, and nations took place mainly in the larger context of the confrontation between Western expansionism and societies meeting with a number of crises. The original conditions of the encounter still partly determine the relationship between Christianity and East Asian languages (the focus of this article being on Chinese, Japanese, and Korean). However, this relationship was shaped not by historical factors only but also by the intrinsic difficulties encountered in translating the Christian worldview as elaborated in Europe throughout centuries with words, concepts, and linguistic structures proper to East Asia. Four successive questions will help us to circumscribe the issues here at stake:
• How were the historical circumstances of the encounter between Christianity and East Asia reflected in religious terminology and linguistic choices made up to the present day?
• How did translations of the Bible in East Asian languages contribute to this religious–linguistic encounter?
• How to translate religious terms and concepts dependent on Greco-Latin vocabulary and philosophy into languages of the Far East, especially when taking into account the fact that these languages rely on the present use or past inheritance of Chinese characters loaded with specific cultural meanings?
• Do lexical, syntactic, and cultural characteristics of East Asian languages provide Christianity with new resources for expressing anew its dogma, worldview, and spiritual experience?
Historical Encounter and Language Issues
Christianity as shaped by European tradition encountered the civilizations of Japan, China, and Korea from 1550 on. Until the beginning of the 19th century, East Asian nations witnessed the arrival of mostly Catholic missionaries, whose lingua franca was Latin, although Portuguese (due to the patronage delegated by the Pope to the King of Portugal) and other European languages were also used as communication and translation tools. Protestant missionaries arrived in the region at the beginning of the 19th century. Their linguistic policies had much to do with efforts developed for translating the Bible into vernacular languages and will therefore be sketched later. Jesuit missionaries in particular dealt directly with a variety of linguistic problems. Matteo Ricci (in China from 1583 until his death in 1610) took pains to write his apologetic works in elegant literary Chinese. In 1615, the Jesuits received from the Pope permission to use vernacular language in the liturgy and to translate the Bible into classical Chinese. However, the development of the Rites Controversy prevented them from making use of this permission. Attempts made in Japan during the same period were also aborted. More generally, although apologetic and catechetical treatises in East Asian languages were numerous, the authoritative sources of Catholicism were still controlled by the use of Latin until the middle of the 20th century. Making use of Chinese as a language for theological teaching and research is a quite recent enterprise, at least when one speaks about ‘professional’ theologians contrasted with the lay persons who, from the 17th century on, expressed their understanding of the faith in their own language. For example, the Jesuit faculty of theology in Shanghai, which was first transferred to the Philippines from 1952 to 1967, kept Latin as the only teaching language until 1964, later shifting to English. In the Catholic world, it is only with the foundation of the Fu Jen Faculty of Theology, in Taipei, that teaching and research were conducted in Chinese, starting in 1968.
From that time on, the shift has been swift and complete. Similar remarks could be made for Japan and Korea. It is no wonder that the Catholic Japanese novelist Shusaku Endo made one of his characters declare to a Christian having studied abroad, “Your Latin is good. But your faith is rotten.”
Besides the large-scale encounter between European Christianity and East Asia that occurred from 1550 onward, more limited exchanges took place throughout history, which brought into East Asia Christian traditions shaped through other languages and cultures. The most interesting of these attempts is the one that developed after 635 when a few monks from the Chaldean (Nestorian) Church arrived at Xi’an, capital of the Chinese Empire. The Xi’an Stele (erected in 781) remains the most complete testimony of this early presentation of Christianity to China. Based on Syriac documents translated into Chinese, it makes a creative use of Buddhist and Taoist terminology in order to express aspects of the Christian mysteries. Approximately 70 years after the completion of the stele, this Christian community was included within the large-scale persecution that decimated Buddhism in China, and its fragile roots were probably extirpated from the Chinese soil. However, the questions that this first attempt at inculturation raised have remained acute. With regard to the relationship between the spreading of the faith and sociolinguistics, the cases of Korea and Taiwan deserve special mention. In Korea, Christian use of the Hangul script enabled the spreading of the faith. As early as the end of the 18th century, portions of the Gospels, doctrinal books, and a hymnary appeared in this script. This was a challenge to the perceived cultural superiority of Chinese and a factor in the rise of literacy. In Taiwan, the influence of the Presbyterian Church is strongly linked to its early advocacy of the Taiwanese (Minnan) language and romanization.
Translating the Bible into East Asian Languages
Protestant missionaries in the Far East viewed the encounter between Christianity and Eastern languages mainly through the prism of Bible translation. An exploratory stage took place from 1820 to 1890, a time during which full translations in Japanese and Chinese of both testaments were completed. In approximately 1910–1920, revised, reliable, and well-polished translations of the Bible in Far Eastern languages were published and are still in use today. In 1919, the publication of the Mandarin Union Version, coinciding with the May Fourth Movement, was a lasting cultural and literary event. In Japan, authoritative Catholic and Protestant versions of the New Testament were published in approximately 1910–1917. From the 1920s onward, the translated Bible was not only a religious but also a literary text influencing the intellectual life of China, Korea, and Japan. Biblical narratives and stylistic features entered into Far Eastern languages. After 1960, new publications appeared, based on renewed scholarship. The first complete Chinese Catholic Bible was published in 1967 in Taiwan. In Korea, the Bible was newly translated for common use by both the Catholic and the Protestant churches – the New Testament in 1971 and the Old Testament in 1977. Questions regarding the reliability of these two translations have been raised; a Catholic Bible was completed only in 2002.
East Asian Languages and Christian Dogmas
East Asian Christian theology might have yet to develop a body of assumptions and research that would make it comparable in size and importance to other ‘regional’ theologies such as those found in India, Africa, or Latin America. Furthermore, with regard to the global theological field, Chinese, Korean, and Japanese are still minority, if not marginal, languages. At the same time, regarding elaborate theological discourse in East Asian languages, the reference to Chinese is somewhat analogous to what the reference to Latin was in classical Christianity. Linguistic–cultural concepts embodied in characters such as dao (the Way), li (Reason), and xing (Nature) are the indispensable tools used not only by Chinese but also by Japanese and Korean Christian thinkers. The challenge is that the Chinese intellectual tradition (at the same time a corpus of wisdom, a philosophical worldview, and a religious way of life) does not easily accommodate the conceptual framework through which Christian theology is currently expressed. Chinese as a linguistic tool deeply differs from languages such as Greek, Latin, or Sanskrit. Its morphology does not distinguish between clear-cut grammatical categories. On the other hand, Chinese characters have a concrete flavor and a suggestiveness of their own and constitute a framework for expressing perception and thought that closely associates form and meaning. Consequently, basic Western concepts such as soul, substance, and modality have often been translated in a rather clumsy way, whereas finding equivalents for some basic Chinese categories is a painstaking endeavor. Furthermore, Chinese terminology is rooted in the canon of classical writings that constituted the basis for the development of Chinese culture and philosophy.
This explains the fact that throughout East Asia, for Christian theology, Confucianism was always perceived as an “embedded cultural–linguistic matrix” (the expression is due to the Protestant Korean theologian Heup Young Kim). With regard to such an inheritance, it is not surprising that theological inculturation has much to do with semantics. In this respect, a few terms deserve special attention.
First, the word ‘God’ has no immediate equivalent in the Chinese language. A choice had to be made between the terms ‘Heaven’ (tian) and ‘Lord (or Emperor) from above’ (shangdi). Forming a pure neologism (as had been the case in Japan) or using expressions such as ‘Supreme Principle’ (taiji) or ‘Spirit’ (shen) were less plausible alternatives. Later, the Catholics adopted the term tianzhu (‘Lord of Heaven’), and the ecclesiastical authorities prohibited the use of other names after 1704. Today, most Protestant denominations still prefer the term shangdi, which is more common in Chinese. Another field open for semantic inculturation is that formed by words such as ‘virtue’ (daode), ‘law’ (fa), and ‘rites’ (li). Virtue is traditionally seen as an internal principle that governs one’s conduct and deeply influences one’s surroundings. In contrast, law is often accused of being an artificial construct that goes against the natural and virtuous flow of life. Some Chinese theologians have contrasted Moses, who gives his people a law, with Confucius, who gives the Chinese cultural world a set of internal principles of conduct. The previous examples illustrate how terminological problems are intrinsically linked with theological debates that determine the understanding and development of Christianity in East Asian societies and cultures.
East Asian Languages and the Reshaping of the Christian Narrative
During the last 30 years of the 20th century, Korean minjung theology has provided East Asia with an example of the importance of language issues in the crafting of a new Christian expression. Although minjung roughly means ‘people,’ the word is usually not translated in order to preserve the specificity of the historical experience it represents. Korean minjung theology pioneered extratextual hermeneutics, insisting on popular rituals and expressions of feeling as a source of inspiration. Of special importance has been the stress put on kut, a shaman-like rite in which the community as a whole gathers, resurrects, and offers sacrifice. Similarly, much writing has been devoted to han, the dominant popular feeling arising from “the suppressed, amassed, and condensed experience of oppression” (Suh Nan-dong). Such a journey allowed, for instance, the Korean feminist Chung Hyun Kyung to write, “I discovered my bowels are shamanistic bowels, my heart is a Buddhist heart, and my head is a Christian head.” (On the blurring of religious identities, see Christianity in Southeast Asia.) Although not as vibrant as was the case in the 1980s, minjung theology still provides a set of questions for East Asian Christianity as a whole. The linguistic and cultural rooting of Christianity in the East Asian context is a multivalent endeavor. It means coming to terms with the East Asian worldview embedded into words, concepts, and linguistic structures. It requires remaining open to the plurality of experiences as translated into linguistic forms. The various texts and cultural–linguistic expressions that East Asian Christian thinkers have to deal with are to be taken and analyzed according to their hermeneutical status: a closer look at the ambivalence of Chinese classics enriches biblical exegesis, individual narrations of the spiritual experiences undergone in East Asian contexts challenge Western categories of the spiritual or mystic path, and Buddhist and Taoist scriptures obey hermeneutic models that challenge the usual categorization of theological discourse (a good example is provided by the debate between the Buddhist philosopher Masao Abe and the Catholic theologian David Tracy). Interpreting anew the quest for ‘harmony’ typical of East Asian culture, recalling stories of hardships, traumas, forgiveness, survival, and hopes, and being attentive to the style of storytelling found in various East Asian cultures, all contribute to the writing of the Christian East Asian narrative. The appropriation of Christianity by East Asian languages is not a mere lexicographic endeavor: words and concepts take blood and flesh within the flow of a story told in many tongues.
See also: Bible; Christianity in Southeast Asia; Sacred Texts: Hermeneutics.
Bibliography
Abe M (1990). ‘Kenotic God and dynamic Sunyata.’ In Cobb J B & Ives C (eds.) The emptying god. Maryknoll, NY: Orbis. 3–65.
Chandrakanthan A J V (1990). ‘Emerging trends in Asia theology.’ East Asian Pastoral Review 27(3/4), 271–280.
Costelloe J (ed.) (1992). The letters and instructions of Francis Xavier. St Louis: Institute of Jesuit Sources.
DeVido E A & Vermander B (eds.) (2004). Creeds, rites and videotapes, narrating religious experience in East Asia. Variétés Sinologiques New Series No. 93. Taipei: Taipei Ricci Institute.
Eber I, Wan S K & Walf K (eds.) (1999). Bible in modern China, the literary and intellectual impact. Sankt Augustin, Germany: Institut Monumenta Serica.
England J C, Kuttiaimattathil J, Prior J M, Quintos L A, Suh D K S & Wickeri J (eds.) (2003). Asian Christian theologies, a research guide to authors, movements (vol. 3): Northeast Asia. Delhi: ISPCK.
Kim H Y (1994). ‘Jen and Agape: towards a Confucian Christology.’ Asia Journal of Theology 8(2), 335–363.
Koyama K (1974). Waterbuffalo theology. London: SCM.
Küster V R (1995). Theologie im Kontext, zugleich ein Versuch über die Minjung-Theologie. Studia Instituti Missiologici Societatis Verbi Divini No. 62. Nettetal, Germany: Steyler Verlag.
Phan P C (1996). ‘Jesus the Christ with an Asian face.’ Theological Studies 57, 399–430.
Song C S (1984). Tell us our names: story theology from an Asian perspective. Maryknoll, NY: Orbis.
Standaert N (ed.) (2001). Handbook of Christianity in China (vol. 1): 635–1800. Leiden: Brill.
Sugirtharajah R S (ed.) (1994). Frontiers in Asian Christian theology. Maryknoll, NY: Orbis.
Vermander B (1996). ‘Theologizing in the Chinese context.’ Studia Missionalia 45, 119–134.
Vermander B (2000). ‘Le monde sinisé: Chine, Taiwan, Corée, Japon.’ In Doré J (ed.) Le devenir de la théologie catholique mondiale depuis Vatican II, 1965–1999. Paris: Beauchesne. 397–427.
Christianity, Catholic
D Sheerin, University of Notre Dame, Notre Dame, IN, USA
© 2006 Elsevier Ltd. All rights reserved.
Latin Christianity
Christianity first developed in the West in Greek-speaking Jewish communities, thus the earliest surviving documents are in Greek; from the first half of the 3rd century, however, there are sophisticated Latin texts from Christian communities in Africa and in Rome (although use of Greek as well continued in Rome (see Lafferty, 2003)). The religious idiom of the Latin texts owed much to the Greek Christian authors who had built on the linguistic achievements of Hellenistic Judaism (see Loi, 1992a, 1992b), and owed much as well to the anonymous translations of the scriptures from Greek into Latin (known collectively as the Vetus Latina, by way of distinction from translations and revisions made later by the biblical scholar Jerome (d. 420)). The Latin of these translations is quite distinctive due to the large number of lexicographic, syntactic, and stylistic Grecisms and mediated Hebraisms created by the literalism of the translators, and due as well to the many vulgarisms (some coincident with Grecisms and Hebraisms) that the translators admitted to the written register, whether consciously, with a view to accessibility, or unconsciously, because of their limited control of written Latin (see Sheerin, 1996a).
Christian Latin
Christian Latin has been studied extensively (see Sanders and Van Uytfanghe, 1989) and its peculiar features are well known (see Sheerin, 1996a). There are mostly literary remains, but the language of less educated Christians is known from inscriptions and
graffiti, from reports by literary figures, and from usage in the many sermons that survive from stenographic records. What has proved controversial is the exact linguistic category into which ‘Christian Latin’ should be placed. Christian Latin could be called a sociolect, but not as conventionally understood, for its use was not restricted along class lines. Yet it was at once the creation and the linguistic identifier of a particular group, and might even be called a ‘restricted code’ (in the sense of sociolinguistic scholar Basil Bernstein), for it would have been confusing and exclusionary to non-Christians. But it was employed only in connection with religion, thus it was, in effect, a Fachsprache, a ‘language for special purposes,’ or technolect. However, Christian Latin was employed by and helped to define an ever-growing and more varied community, a community that went far toward realizing its ambition of universality. The Christian Latin community developed its own terminology through loanwords from Greek and Hebrew, neologisms, and polysemous use of common terms as technical terms. Writers from late antiquity through the early modern period have shown an awareness of the difference and peculiarity of Christian terminology, as when Isidore of Seville (d. 636) contrasted Christian Grecisms to standard Latin terms: ‘‘On prophets. Those that the pagans call vates, our people call prophetae . . .’’, ‘‘Martyres in the Greek language are called testes (witnesses) in Latin . . .’’ (Etymologiae 7: 8.1, 11.1), and Erasmus provided a very comical (and satirical) specimen of an attempt to present Christian discourse in Ciceronian Latin (Ciceronianus, Amsterdam edition of Erasmus’s Opera omnia (ASD) I-2: 640–642; Leyden edition (LB) 1: 995–996). Phonology and morphology were unaffected save by the many foreign-sounding terms and names
that came with Christianity and by a toleration of – indeed, a preference for – vulgarisms (see, e.g., Augustine, De doctrina christiana IV). Syntax was not greatly affected save that the prestige of the scriptures made written registers of Christian Latin permeable to the Grecisms, mediated Hebraisms, and vulgarisms that characterize the Latin biblical translations. Rusticity became for a time the Christian ideal – often real, sometimes pretended as a humility topos.
The Latin Church
Latin became the official language of Western Christianity, so much so that Western Christianity was sometimes called Latinitas (see Sheerin, 1987). Latin dominance in Romanized regions is to be expected, but Latin primacy was also resolutely maintained against non-Romance vernacular alternatives. Thus, although Eastern Christianity early produced scripture, liturgy, and literature in Greek, Syriac, Coptic, Ge‘ez, Armenian, and Georgian, and later in Church Slavonic and Arabic, Latin remained the unique vehicle for the higher end of ecclesiastical discourse in the West, the official language of the Bible and liturgy (see Sheerin, 1996b), and over the centuries distinctive styles and idioms developed in the schools for philosophy, theology, and canon law (see articles in Mantello and Rigg (1996)). Even when religious texts became available in the vernaculars, a kind of ecclesiastical diglossia continued through the Middle Ages and well beyond (see ‘The “Latin Stronghold”: the Church,’ in Waquet (2001: chap. 2)). Of course, vernacular vocabulary, syntax, and, to a far greater degree, pronunciation affected ecclesiastical Latin (see Erasmus, De correcta latini graecique sermonis pronuntiatione). Not until the early 20th century did anything like a ‘standard’ pronunciation of ecclesiastical Latin come about, with the general adoption of ‘Roman’ or ‘Italian’ pronunciation as a consequence of the revival of Gregorian chant, and even then not without resistance from traditionalists in France (see Brittain, 1955) and, later, from ‘nationalists’ in Spain. But even in the post-Vatican II era, Latin has remained the official language of the Roman Catholic Church (see the Vatican’s official publication, Acta apostolicae sedis).
How well Latin has served may be seen in the fact that for 1700 years Latin has been the vehicle of administration, regulation, and education of a polyglot religious imperium more extensive than any political empire the world has seen. Clerical education had to begin with training in Latin. Even in late antiquity, clerical ignorance of
Latin was a problem; Augustine (De catechizandis rudibus 9.13) remarked on the scandal caused by clerics’ garbled Latin, and clerical inability to compose ex tempore contributed to the abandonment of improvised liturgical prayers for fixed ones. Complaints about clerical ignorance of Latin are found in the literature through the Middle Ages and well beyond (see Waquet, 2001: 60–63). Training in Latin, rudimentary or advanced, was the preserve of males, of clergy alone for centuries, but later young ‘gentlemen’ as well (see Ong, 1959). Though some women, mostly in religious life, were taught Latin and became proficient scribes and composers of Latin texts (see Churchill et al., 2002), women in religious life, in degree varying with time and place, had to depend on oral instruction in the vernacular or on versions of theological and spiritual texts in the vernacular. The vernacular languages of the Catholic peoples, however, were the medium for unwritten communication with the vast majority who did not know Latin. In the later Middle Ages, multiple factors evoked an increase in the composition of religious texts in the vernacular languages and pari passu an elaboration of vernacular resources for the presentation of religious thought (a view of these resources can be had in a glance at the work of Foster and Carey (2002)). Printing, the Reformation, and the Counterreformation accelerated the vernacularization of religious language. But the Catholic church barely allowed vernacular translations of the Latin Vulgate Bible and refused translation of the Latin liturgy. Latin remained the medium for theological education and discourse, but vernacular theological language had to be employed in vernacular controversial literature and in vernacular catechisms for the instruction of the laity.
The Roman Catholic Church in the United States

Here focus shifts abruptly from the pre-modern to the modern period, and narrows from the official Latin language to the vernacular of American Catholics. Until at least the early 1960s, the speech of American Catholics was characterized, to a limited degree, by pronunciation and the use of Latin tags (for their residue, see Wills (1972: 16–17) and Bretzke (1998)), but far more by a latinate English terminology, a technolect, largely derived from the Baltimore catechism, which had been promulgated by the U.S. Catholic bishops in 1891 (the American equivalent of the Anglo-Irish ‘penny catechism,’ A catechism of Christian doctrine).
Catholic elementary school students were taught about things divine and how to define perfection, unity, nature, and substance; they were taught about the church and how to define its attributes of authority, infallibility, and indefectibility, and how to describe the various ranks of the officers of the church, those in minor orders and major orders, and the prelates who ruled the church, from monsignor through the Pope (who could teach ex cathedra) and his vicars apostolic and cardinals. They were taught a taxonomy of sins: original and actual, mortal and venial, and capital and material, and about predominant sin or ruling passion, which might determine for each what constituted a near occasion of sin. But grace was available – actual grace and sanctifying grace – which came through the sacraments (the sacraments of the living and sacraments of the dead), with their outward sign, matter, and form, and the requisite right intention and right dispositions. Students learned that confession and general confession (not recommended for those with scruples), with its absolution and penance, could remit the two punishments due to actual sins, the eternal punishment, but only part of the temporal punishment in purgatory. For help with the latter, one sought a plenary indulgence or a partial indulgence. Catholics knew the end of man, that after death there would be a particular judgment that would send him to hell, purgatory, or heaven, but that there was to be a general resurrection followed by a general judgment, after which the souls of the just would live on in a glorified body, with its qualities of brilliancy, agility, subtility, and impassibility. 
For some practices, Catholics had expressions ranging from the top to the bottom of Catholic idiom (see Wills (1972: 40), on the hypercorrect usage of the Catholic intelligentsia of the 1950s), e.g., from the formal ‘receive the Eucharist’ or ‘receive Holy Communion’ to the informal ‘go to Communion,’ to the antiformal ‘hit the rail’ (the communion, or altar, rail that separated the sanctuary from the nave of the church). The intelligentsia employed a reserved code of compressed, implicit expressions, such as when they advised one another to ‘offer it up’ (gladly to offer personal discomfort or frustration in place of the pains of the church suffering in purgatory) or excused themselves to ‘make a visit’ (to the Blessed Sacrament reserved in church). Even Catholic profanity was distinctive, an apt topic for a study such as that of Beccaria (2001). Catholic communities were distinguished also by onomastics. Religion restricted the choice of personal names given in the sacraments of baptism and confirmation to ‘saints’ names’ broadly understood, and affected even the toponymy of areas where Catholics
lived. Spanish-speaking Catholics had left the mark of their religion on the toponymy of the western states in names that are still intact or that have been abbreviated, e.g., in the United States, Ventura (California), from San Buenaventura, and the Animas River (northern Colorado), from Rio de las Animas Perdidas. Later waves of Catholic immigrants used parish boundaries to create their own toponymy for the great cities in which they settled. Thus, when asked where they or another lived, they would reply with the name of the parish. This could have ethnic significance, e.g., St. Laurence O’Toole (Irish), St. Liborius (German), or St. John Nepomuk (Czech), or might simply indicate in which Catholic community they lived, with all of the socioeconomic and cultural implications of any other community designation. This toponymy was confusing to non-Catholics or even to Catholics from out of town, and confusion was compounded when the term ‘parish’ was dropped from the complete or abbreviated parish name, as when one would say that he was from ‘Blessed Sacrament’ or ‘Holy Rosary’ or from ‘Sorrows’ (Our Lady of Sorrows parish) or ‘Immaculate’ (Immaculate Conception parish). Roman Catholic culture has undergone fundamental changes since the Second Vatican Council, and the American Catholic idiom described here is now obsolescent. Some of its features endure, but many have been displaced by new formulations, and, in the interests of interfaith relations, gender equity, etc., many peculiar usages have been eliminated altogether. It would be, perhaps, inexact to apply the term ‘language death’ to this case, but the fact remains that cultural changes are bringing about the extinction of a peculiarly Catholic mode of communication.

See also: Bible; Ethnicity; Gender, Grammatical; Languages for Specific Purposes.
Bibliography

Beccaria G L (2001). Sicuterat: il latino di chi non lo sa: Bibbia e liturgia nell’italiano e nei dialetti. Milano: Garzanti.
Bretzke J T (1998). Consecrated phrases: a Latin theological dictionary: Latin expressions found in theological writings. Collegeville, MN: Liturgical Press.
Brittain F (1955). Latin in church: the history of its pronunciation. London: A. R. Mowbray.
Churchill L J, Brown P R & Jeffrey J E (eds.) (2002). Women writing Latin: from Roman antiquity to early modern Europe. New York: Routledge.
Foster E E & Carey D H (2002). Chaucer’s church: a dictionary of religious terms in Chaucer. Aldershot: Ashgate.
Lafferty M K (2003). ‘Translating faith from Greek to Latin: romanitas and christianitas in late fourth-century Rome.’ Journal of Early Christian Studies 11, 21–62.
Loi V (1992a). ‘Greek, Christian.’ In Di Berardino A (ed.) Encyclopedia of the early church, vol. 2. New York: Oxford University Press. 360–361.
Loi V (1992b). ‘Latin, Christian.’ In Di Berardino A (ed.) Encyclopedia of the early church, vol. 2. New York: Oxford University Press. 474.
Mantello F A C & Rigg A G (eds.) (1996). Medieval Latin: an introduction and bibliographical guide. Washington, DC: Catholic University of America Press.
Ong W (1959). ‘Latin language study as a Renaissance puberty rite.’ Studies in Philology 56, 103–124.
Sanders G & Van Uytfanghe M (1989). Bibliographie signalétique du Latin des chrétiens. Turnhout: Brepols.
Sheerin D (1987). ‘In media latinitate.’ Helios 14, 51–67.
Sheerin D (1996a). ‘Christian and biblical Latin.’ In Mantello & Rigg (eds.) 137–156.
Sheerin D (1996b). ‘The liturgy.’ In Mantello & Rigg (eds.) 157–182.
Waquet F (2001). Latin, or, the empire of a sign: from the thirteenth to the twentieth century. Howe J (trans.). London: Verso.
Wills G (1972). Bare ruined choirs: doubt, prophecy, and radical religion. New York: Doubleday.
Christianity, Protestant
H Hillerbrand, Duke University, Durham, NC, USA
© 2006 Elsevier Ltd. All rights reserved.
Protestantism is one of the three major Christian traditions, alongside Roman Catholicism and Greek and Russian Orthodoxy. Its number of adherents worldwide is calculated to be 342 000 000 Core Protestants, to which should be added another 491 000 000 Wider Protestants, members of so-called independent traditions, which are referred to as Protestant by others. Statistics of adherents of religions are not always fully reliable, however, since they are often based on self-reported figures by the churches themselves. European countries with vestiges of the historical state-church system count baptized members (for example, in both Sweden and Finland approximately 82% of the respective populations), many of whom have no meaningful relationship to Christianity or Protestantism. Nonetheless, Protestants comprise about 44% of the adherents of world Christianity. Much like Roman Catholicism, Protestantism is not confined to Europe and North America but has increasingly been, since the 18th century, a global phenomenon. The importance of Protestantism can be seen in the fact that worldwide the overwhelming number of Christian publications come from Protestant sources and that a Christian ‘megacensus,’ which measures everything empirically measurable about Christianity, has found that 75% stem from Protestant initiatives. Unlike Roman Catholic and Orthodox Christianity, however, Protestant Christianity is divided not only geographically and culturally, but also theologically and ecclesiastically. There is no single Protestant Church as such, as there is – despite various diversities – a single Roman Catholic Church. Quite the contrary, there are dozens upon dozens of Protestant
Churches. Some of these, such as the Anglican Communion, are worldwide in scope and distribution of membership; others, such as the Church of the Prussian Union, are confined to a single country or area; still others, such as the independent snake-handling congregations of Appalachia in the United States, are solitary congregations. Despite such diversity, which Catholics in the past used to buttress their own truth claims (since truth – as Bishop Bossuet noted in the 17th century – must be one, not many), virtually all of these traditions had their origin in staking out the same absolutist truth claims as have the Roman Catholic and Orthodox churches. Until the modern era, all Protestant churches insisted on their exclusive possession of Christian truth and, each in its own way, echoed the ancient notion, extra ecclesiam nulla salus – outside the church there is no salvation – except that Protestants had a way of defining ‘church’ rather subjectively. The term ‘Protestant’ itself comes from the ‘protest’ which the German territorial rulers, supportive of Luther’s movement of reform, lodged at the diet (parliament) at Speyer in 1529 against the decision of the Catholic rulers to carry out the stipulations of the Edict of Worms against Martin Luther. Thus, one may define Protestant as all those individuals and churches that affirm the Bible as the sole norm of truth and, in so doing, protest the authority of the Roman pontiff and the Catholic Church. The term itself is, therefore, a negative one, even though some interpreters of the 1529 action have pointed to the root meaning of the Latin protestari as denoting ‘to bear witness,’ as is still found in the word ‘protestation,’ meaning strong declaration or affirmation. Protestantism may thus be defined with a number of positive assertions as well.
Historically, Protestantism had its beginnings early in the 16th century in the various efforts to engage the Roman Catholic Church in reform of practice and theology. These efforts were spearheaded by Martin Luther, professor at the University of Wittenberg in Central Germany, and his 95 Theses against the practice of indulgences. The prompt verdict of excommunication against Luther and his supporters issued in 1521 meant a parting of the ways and, before long, the establishment of new churches separate from the Roman Church. It is a truism that, once the break had occurred, subsequent theological reflection convinced the Protestant reformers that their understanding of text and message differed categorically from that of the old church. There surely should be no doubt – the exceptions seem to be systematic theologians who tend to view the past from the perspective of the present – that from a certain time onward in the Reformation controversy, the reformers and their successors would not have returned to the Catholic Church even if they had been welcomed with open arms. Soon it became obvious, however, that common opposition against the Roman church and its theology did not mean common belief. Dissension arose within the movement of reform, first among those who wished for a more selective church and repudiated the baptism of infants, later called Anabaptists, then among those who held different interpretations of the meaning of the Lord’s Supper. Indeed, this particular issue became the cause for a far-reaching split among the ranks of the reformers, with the line drawn most sharply between the Lutheran and the Calvinist traditions, the latter tracing its origin to John Calvin, a Frenchman whose reforming activities centered in Geneva, Switzerland. Calvin understood the bread and wine of the Communion service to signify the spiritual union between the believer and Christ, while Luther held that these elements were his true body and blood. 
Calvin’s tradition has had a particularly strong importance in the Anglo-Saxon world, while Lutheranism tended to be found in central and northern Europe. In the England of King Henry VIII, religious agitation together with the strong will of the king led to the establishment of a new church that initially focused on the repudiation of papal authority but subsequently, under Edward VI and Elizabeth I, took on a more Protestant ring. Even as on the Continent earlier in the century, there had been those for whom the reforms undertaken did not go far enough, so there were those in England during Elizabeth’s reign who called for reform that was more extensive and a purer church. Dubbed Puritans by their opponents, these ardent reformers owed a great debt to
Calvin, and became a lasting element in English Protestantism. Mainstream Protestantism in England, soon dubbed Anglican, saw its distinctive feature in the affirmation of a via media between Catholicism and Protestantism. The story of Protestantism in England gained in excitement in the course of the 17th century with the turbulence of the civil war and its mixture of political and religious concerns. This brought the emergence of several dissenting churches that, while small in numbers, have lastingly influenced English Protestantism to this day: Baptists, Quakers, and Congregationalists. Similar discontent with mainstream Protestantism in England in the 18th century led to the emergence of the Wesleyan movement, named after the Oxford don and Church of England clergyman John Wesley. Eventually, this movement separated from the Church of England and subsequently became known as Methodist. Continental Protestantism was characterized in the 17th century by an emphasis on proper doctrine (thus, the label Orthodoxy), which later gave it the unmerited reputation of having little empathy for the praxis of Christianity. Orthodox theology focused on delineating the distinctive doctrines of the respective traditions, which led, among other affirmations, to the delineation of the notion of the verbal inspiration of the Bible. Late in the century, Philipp Jakob Spener, a German Lutheran clergyman, influenced by such English devotional writers as William Perkins and William Ames, offered a proposal for renewing the church in a book entitled Pious Desires. This book triggered the movement of Pietism, which can be seen as a Continental parallel to the Wesleyan movement in England. Of equal, if not greater, importance for the course of Protestantism were the intellectual changes emanating from the European Enlightenment in the late 17th and 18th centuries. 
The searching reexamination of long-held scholarly, philosophical, and religious views led to an increasing attack on the conventional understanding of the Bible as well as on traditional theology. A group of English thinkers labeled Christian Deists challenged the time-honored understanding of the Bible as a revealed book, repudiating not only the concept of revelation but miracles and prophecy as well. Protestantism became divided into two camps: those that affirmed traditional Christian doctrines and those who reinterpreted these doctrines. In the 19th and even the 20th century, the history of Protestantism was less the story of great church leaders or institutions than it was the story of theologians, such as Friedrich Schleiermacher, Albrecht Ritschl, David Friedrich Strauss, or Adolf Harnack. These theologians undertook to reinterpret the
Christian faith in the face of the challenges of modernity. Others, often less in the public eye, affirmed traditional notions. Twentieth-century Protestantism was marked by several theological waves. Some, such as Neo-Orthodoxy, were theologically conservative, while others, such as Liberation Theology, tended to be more liberal. The great theme of 20th-century Protestantism, however, was its globalization and its eroding numbers and importance in Europe. Protestantism was successfully transplanted from Old to New England in the 17th century. Many immigrants who came to North America were religious dissenters, such as the Mennonites, the Amish, the Independents, or the Baptists, for whom the new continent was the biblical city on a hill, where they could practice their faith unhindered by government, though they often manifested the same kind of intolerance that they themselves had experienced in their European homelands. Theologically, several affirmations characterize Protestantism, of which the centrality of the Bible as sole authority for life and faith holds the primacy of place and importance. The Latin phrase sola scriptura (‘Scripture alone’) expresses this centrality, which implies the rejection of the church and its traditions as normative authority, as is variously the case in Roman Catholicism and Greek and Russian Orthodoxy. Since it is the presupposition of this Protestant premise that the meaning of the Bible is clear and self-evident, Protestantism has no teaching office (magisterium) as does the Roman Catholic Church. Thus, the distinctive Protestant hallmark has to do with authority. A second Protestant assertion relates to the doctrine of justification, which explains how humans are reconciled with God. Catholic teaching has held in varying ways that such justification entails a process of divine and human cooperation, where humans will marshal their moral prowess to which divine grace is added.
This Catholic notion of cooperation, often mislabeled as work righteousness by Protestants, was countered in Protestant theology with the insistence that justification is solely attributable to divine grace, which humans appropriate solely by faith. Here, too, Protestant nomenclature invoked the Latin sola (alone) to insist that justification is sola gratia, sola fide, solely by grace, solely by faith. Beyond these basic affirmations, Protestantism is marked by theological diversity. Protestant diversity finds its explanation in the absence of a central authoritative entity – either person or structure – in Protestantism that would exercise normative authority (and power). The Protestant recourse to the Bible, or the Word of God as the ultimate authority,
produced multiple divergent interpretations, and new theological or biblical interpretations frequently assumed structural concreteness. Yet it seems neither fair nor theologically accurate to contrast the relatively homogeneous Catholic and Orthodox churches with the diversity of Protestant denominations – and to find in this diversity proof positive for the nonviability of Protestant truth claims. The Roman Catholic Church sustains its theological homogeneity through excommunication or the voluntary separation of dissenting members, indicating that it is not itself able to maintain unity of interpretation. The very existence of Orthodox and Protestant traditions suggests that the Roman Catholic Church has not been able to sustain its truth claims universally but has sloughed off dissent within its ranks. In Protestant churches, excommunication and dissent have likewise led to separation, but with a difference: the establishment of new groupings and churches. The phenomenon of new ecclesial structures has been particularly prominent in places where the legal freedom to form them existed. The absence of established churches in North America and the non-European world has allowed dissent from the mainstream to express itself organizationally and sociologically in the form of new churches, each of which claims its own truth. The diversity of Protestant churches, especially pronounced in the United States, entails two consequences. One is the difficulty of speaking of the Protestant understanding of almost any topic, be it worship, doctrine, ethics, etc. Even as regards the traditional hallmark of Protestantism, the priority of grace in salvation, there are diverse Protestant notions as to exactly how divine grace and human effort are to be related. Second, there is also the increasingly popular (at least among scholars) tendency to use the plural and speak of Protestantisms to denote the empirical reality of Protestant diversity.
This view, which simultaneously speaks of Catholicisms or Christianities, ignores the fundamental unity in diversity by favoring the latter rather than the former. The closing decades of the 20th century saw a striking vitality of theologically conservative Protestant churches, especially in places such as Korea, Central America, and Africa. Pentecostal churches have seen dramatic increases in membership. This is in contrast to stable membership numbers and a more liberal theological outlook in the European and North American Protestant churches. The important question, as regards the future of Protestantism, will be whether these recent developments must be seen as harbingers of the identity and place of Protestantism in the 21st century.
See also: Bible; Christianity, Catholic; Christianity in Africa; Christianity in Latin America; Christianity in the Far East; Luther, Martin (1483–1546); Quakerism; Reformation, Northern European; Religious Language; Sacred Texts: Hermeneutics.
Christmas Island: Language Situation
Editorial Team
© 2006 Elsevier Ltd. All rights reserved.
An Australian Territory since 1958, Christmas Island is a small tropical isle in the eastern sector of the Indian Ocean, 2600 km northwest of Perth, Western Australia. Its closest neighbor is Java, 360 km away. The uninhabited island was discovered and named on Christmas Day of the year 1643 by Captain William Mynors of the East India Ship Company, but it was not until 1888 that Christmas Island was annexed by Britain and settled. The Clunies-Ross brothers from the neighboring Cocos (Keeling) Islands (some 900 km to the southwest) established a settlement at Flying Fish Cove to collect timber and supplies for the growing industry on Cocos, and when phosphate mining began in the 1890s, Christmas Island became increasingly populated.

The earliest settlers spoke English and Cocos Malay, a unique version of Malay that has been isolated from the mainstream language for over 150 years. Early arrivals from China mainly spoke Cantonese. In fact, many early place names on Christmas Island are Cantonese words, such as Poon Saan, which means ‘halfway up the hill’. Postwar arrivals who came from Penang introduced other Chinese languages including Hakka, Hainanese, Hokkien, and Teochew, while those from Singapore introduced Mandarin.

Malay is widely spoken by the Malay community on the island and, even though English is the official language of Christmas Island, many residents generally communicate in Malay or one of the four Chinese dialects. Because English was not a prerequisite for employment, a sizable proportion of today’s community is not fluent in English, and many residents still converse in their native tongue. The Australian Bureau of Statistics census of 2001 recorded a total population of 1508 people on Christmas Island, with an ethnic composition of approximately 60% Chinese, 10–15% European, and 25–30% Malay.

The influx of tourists has also had an impact on the island’s language. Indonesian is frequently spoken along with many of the Chinese languages. Thai, Japanese, German, and a few other European languages are sometimes also heard.

See also: Cocos (Keeling) Islands: Language Situation; Malay.
Bibliography
Adelaar S (1996). ‘Malay in the Cocos (Keeling) Islands.’ In Nothofer B (ed.) Reconstruction, classification, description. Festschrift in honor of Isidore Dyen. Hamburg: Abera. 167–198.
Chrysippos (ca. 282–208 B.C.)
P Swiggers and A Wouters, Katholieke Universiteit Leuven, Leuven, Belgium
© 2006 Elsevier Ltd. All rights reserved.
Born in Soli (Cilicia), Chrysippus went to study in Athens, where he received his philosophical training in the Platonist academy with Arcesilaus and subsequently in the Stoic school led by Cleanthes of Assos. He succeeded his master, Cleanthes, and became the third head of the Stoic school. He died in Athens around 205 B.C. Chrysippus was an extremely influential thinker and highly prolific writer, but unfortunately very little is left of the approximately 700 titles attributed to him by Diogenes Laertius, whose Lives of the philosophers remains our main source of information on Chrysippus and other Stoic philosophers. Chrysippus’s main fields of research were dialectics and language study (i.e., philosophy of language and
grammar), and he seems to have laid the foundations of Stoic linguistic thought. For Chrysippus, the term dialektikē (‘dialectics’) referred to the study of signs and ‘things signified.’ From the catalogue of his lost writings it is possible to get an idea of the extent and technicality of his philosophical-linguistic work (more than 100 titles). The areas of dialectics and language study investigated by Chrysippus covered the following topics: (a) judgments and other types of sentences (disjunctive judgments, hypothetical judgments, consequents; questions, queries, answers, orders); (b) predicates and classes (species, genera, contrary terms, and relative terms); (c) words, word classes, and their morphological and syntactic behavior (this domain includes writings on cases, on proper names, on the elements of speech, on the arrangement of expressions); and (d) the origin of words (etymology). The first three kinds of topics fall within Stoic logic and theory of meaning: the composition, the nature, the content, and the function of sentences
(and their constituents, such as subject terms, predicates) were the essential prerequisite for a correct analysis of syllogisms, ambiguities, fallacies, aporiai, and other philosophical puzzles. Chrysippus played a major role in showing the importance of linguistic-grammatical study (focusing on the structure of the proposition, on propositional content, and on the meaning of its constituent parts) for logical analysis and for a correct understanding of statements (as referring to states of affairs).
Bibliography
Arnim H von (1903). Stoicorum veterum fragmenta (vol. I). Leipzig: Teubner.
Blank D & Atherton C (2003). ‘The Stoic contribution to traditional grammar.’ In Inwood B (ed.) The Cambridge Companion to the Stoics. Cambridge: Cambridge University Press. 310–327.
Brunschwig J (1986). ‘Remarques sur la classification des propositions simples dans les logiques hellénistiques.’ In Philosophie du langage et grammaire dans l’antiquité. Brussels: OUSIA. 287–310.
Gould J B (1970). The Philosophy of Chrysippus. Leiden: Brill.
Hagius H (1979). ‘The Stoic theory of the parts of speech.’ Ph.D. thesis, Columbus University, New York.
Hülser K (1987–1988). Die Fragmente zur Dialektik der Stoiker (4 vols). Stuttgart: Frommann/Bad Cannstatt: Holzboog.
Ildefonse F (2000). Les stoïciens. I. Zénon, Cléanthe, Chrysippe. Paris: Les Belles Lettres.
Luhtala A (2000). On the origin of syntactical description in Stoic logic. Münster: Nodus.
Mansfeld J (1986). ‘Diogenes Laertius on Stoic philosophy.’ Elenchos 7, 297–382.
Marrone L (1984). ‘Proposizione e predicato in Crisippo.’ Cronache Ercolanesi 14, 135–146.
Swiggers P & Wouters A (1997). ‘Philosophical aspects of the Technē grammatikē of Dionysius Thrax.’ In Berrettoni P & Lorenzi F (eds.) Grammatica e ideologia nella storia della linguistica. Perugia: Margiacchi-Galeno. 35–83.
Chukotko-Kamchatkan Languages
G D S Anderson, Max Planck Institute, Leipzig, Germany, and University of Oregon, Eugene, OR, USA
© 2006 Elsevier Ltd. All rights reserved.
Chukotko-Kamchatkan
Chukotko-Kamchatkan, formerly also known as Luor[a]vetlan, is a small family of languages spoken in extreme northeastern Siberia on the Chukotka Peninsula, opposite Alaska, and on the large Kamchatka Peninsula in far eastern Siberia. The family consists of four remaining languages: Alutor, Chukchi, Itelmen, and Koryak. All of the languages in the group, excluding Chukchi, are endangered; Kerek became extinct in the 1990s.

Alutor (Ethnologue code ALR), also known as Alyutor or Palana Koryak, is spoken by some 200 people in the villages of Vyvenka and Rekinniki in the Koryak National District, in the northeast of the Kamchatka Peninsula.

Chukchi (Ethnologue code CKT) is spoken by some 10 000 people, primarily on the Chukchi Peninsula of northeastern Siberia. In English-language literature, especially older works, the name is sometimes spelled Chukchee. Several local variants exist, but the differences are relatively minor. More celebrated were the once active phonological differences between men’s and women’s speech, seen in the following word pair: (men) reqerken = (women) tzeqetzen ‘what is s/he making/doing?’ (Kämpfe and Volodin, 1995: 8).

Itelmen (Ethnologue code ITL) is also known as Kamchadal. Itelmen is currently moribund, with fewer than 100 speakers, who are found primarily in the Tigil region, in Kovran, and in the Upper Khairiuzovo villages on the Kamchatka Peninsula. There were originally at least three Itelmen languages, two of which gradually gave way to Russian over the past two centuries and are now extinct. Only the Western dialect remains; it is sometimes divided into separate Kovran and Sedanka varieties.

Kerek (Ethnologue code KRK) became extinct in the late 1990s. It was closely related to Koryak (Ethnologue code KPY), which has some 3500 speakers scattered across the Koryak National Okrug, on the northern half of Kamchatka. An alternate name for Koryak is Nymylan. There are several divergent varieties, some now considered separate languages (Alutor). Dialects include Chavchuven, Apukin, and Kamen.

Genetically, Itelmen stands apart from the northern languages and represents a southern branch of the family. It is sometimes debated whether Itelmen is related at all to Northern Chukotko-Kamchatkan, and it is indeed different in numerous ways, but these differences are attributable rather to different substrate populations and various locally defined internal developments within Northern and Southern Chukotko-Kamchatkan, and their ultimate genetic unity seems clear. The northern branch in many
interpretations has further subgroups of Alutor and Koryak (and Kerek), in opposition to Chukchi.

Along the coasts, Chukchi people live as sea mammal hunters, like the local Yup’ik populations, but they live as reindeer herders in the interior. Approximately three-quarters of the Chukchi live as reindeer herders. Northern Kamchatkan groups mainly practice reindeer-oriented economies, with fishing and sea mammal hunting along the coasts. The Itelmen live primarily as subsistence fishers.

Chukotko-Kamchatkan languages in general, and the northern ones in particular, are characterized by a range of features that set them apart from many indigenous Siberian languages but also reflect a number of areally common features. First, many words in Chukotko-Kamchatkan languages are very long (e.g., Chukchi ga-npenačg-ergena-qora-ma ‘with the old men’s reindeer’ (Skorik, 1986: 107)), and initial ŋ- is common, as is typical of northern and eastern Siberian languages (Anderson, 2003). Clusters of stop + ŋ are also found. Example (1) is from Skorik (1986: 79, 85) (cf. Itelmen ŋosx ‘tail’ and ŋeyŋe ‘mountain’ (Volodin, 1976: 31)):

(1)  Chukchi   Koryak    Alutor    Kerek     gloss
     ŋoyŋen    ŋoyŋen    ŋoyŋen    ŋuyŋen    ‘tail’
     ŋeron-    ŋeyon-    ŋerun-    ŋeyuq-    ‘3 together’
     laŋ-o     laŋ       laŋ       laŋu
Compare Kerek tŋivek ‘to send’ (Skorik, 1986: 89) with Itelmen pŋilpŋel ‘root’ (Skorik, 1986: 78). Itelmen shows an unusual tolerance of word-initial consonant clusters, and it has ejective consonants, which the northern languages lack. Thus, words such as klfknan ‘it fell out’ and kstk’lknan ‘he jumped’ may be found in Itelmen.

Northern Chukotko-Kamchatkan languages stand out for their areally atypical system of vowel harmony. Vowels belong to one of two harmonic classes, strong/dominant and weak/recessive. A strong vowel triggers strong allophones throughout the word, and therefore a vowel in an affix may trigger alternation in stem vowels, as shown in Example (2) for Koryak:

(2)  weyem    ‘river’    >  wayamen        ‘river-DAT’
     mil’ut   ‘hare’     >  mel’otan       ‘hare-DAT’
     eñpič    ‘father’   >  añpečena-nan   ‘father-AUGM-DAT’
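The dominant/recessive pattern in Example (2) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not claims of the source: the recessive-to-dominant correspondences (i > e, u > o, e > a) are simply read off the forms in Example (2); the dative suffix is entered in its surface shape (-en, -an) and flagged as dominant by hand; and the function name `harmonize` and the segmentation of each word are hypothetical conveniences, not an analysis of Koryak phonology.

```python
# Recessive -> dominant correspondences, read off Example (2) (an assumption).
# str.translate maps each character once, so i > e is never re-mapped to a.
RECESSIVE_TO_DOMINANT = str.maketrans({"i": "e", "u": "o", "e": "a"})


def harmonize(morphemes):
    """Join (form, is_dominant) pairs, applying dominant/recessive harmony.

    If any morpheme is flagged dominant, the vowels of every recessive
    morpheme are shifted to their dominant counterparts; morphemes flagged
    dominant are left exactly as given.
    """
    if not any(is_dominant for _, is_dominant in morphemes):
        return "".join(form for form, _ in morphemes)
    return "".join(
        form if is_dominant else form.translate(RECESSIVE_TO_DOMINANT)
        for form, is_dominant in morphemes
    )


# Hypothetical segmentations: recessive stem plus a suffix marked dominant.
river_dat = harmonize([("weyem", False), ("en", True)])
hare_dat = harmonize([("mil'ut", False), ("an", True)])
bare_stem = harmonize([("weyem", False)])
```

Under these assumptions, `river_dat` and `hare_dat` reproduce the dative forms wayamen and mel'otan from Example (2), while a stem with no dominant affix is returned unchanged.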
Note: geyqe-miml-e ‘with water’ vs. gawen-memlema ‘with water’ (Zhukova, 1972: 111–112; 120).

Among the most characteristic features of Chukotko-Kamchatkan morphology is the frequent use of circumfixes (prefix + suffix combinations) to encode a variety of inflectional categories, both nominal and verbal, some of which appear to be very old in the family. In Koryak, for example, the circumfix ga-...-ma yields ga-čol’-ma ‘with salt’ (Zhukova, 1972: 120), and in Chukchi it yields ga-npenačg-ergena-qora-ma ‘with the old men’s reindeer’ (Skorik, 1986: 107).

In Koryak, the desiderative circumfix is y(A)-...-n (Zhukova, 1972: 202):

(3)  y-ačačgañ-n-ek             ye-lqeñ-n-ek
     DESID-laugh-DESID-INF      DESID-leave-DESID-INF
     ‘want to laugh’            ‘want to leave’

In Chukchi, it is re-...-n (Kämpfe and Volodin, 1995: 88):

(4)  vinrete-rken            >  re-vinrete-ne-rken
     help-IMPERF.REALIS         DESID-help-DESID-IMPERF.REALIS
     ‘he helps’                 ‘he wants to help’
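As a rough illustration of how a circumfix wraps a stem, the following Python sketch rebuilds two of the forms cited above. The helper name `circumfix` is hypothetical, the hyphens only mark morpheme boundaries as in the glossed examples, and the surface shape -ne- of the Chukchi desiderative suffix is copied from Example (4) rather than derived by any rule.

```python
def circumfix(stem, prefix, suffix):
    """Wrap a stem in a prefix and a suffix, hyphenating the boundaries."""
    return "-".join([prefix, stem, suffix])


# Koryak ga-...-ma on the stem for 'salt' (cited from Zhukova, 1972: 120).
with_salt = circumfix("čol’", "ga", "ma")

# Chukchi desiderative re-...-n(e), followed by the imperfective realis
# ending -rken that appears in Example (4).
wants_to_help = circumfix("vinrete", "re", "ne") + "-rken"
```

The point of the sketch is only that the two halves of the circumfix are supplied together and enclose the stem, with any further inflection (here -rken) attached outside the suffixal half.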
Among the wider relationships that have been proposed for Chukotko-Kamchatkan languages, none widely accepted by specialists, are connections with Uralic, Eskimo-Aleut, and ‘Eurasian,’ among others.

See also: Endangered Languages; Russian Federation: Language Situation.
Bibliography
Anderson G D S (2003). ‘Towards a phonological typology of Native Siberia.’ In Holisky D A & Tuite K (eds.) Current trends in Caucasian, East European and Inner Asian linguistics. Papers in honor of Howard I. Aronson. Amsterdam & Philadelphia: John Benjamins. 1–22.
Angere J (1951). ‘Das Verhältnis des tschuktschischen Sprachgruppe zu dem uralischern Sprachstamme.’ Språkvetenskapliga Sällskapets i Uppsala Förhandlingar 1949–1951, 109–150.
Asinovskij A S (1991). Konsonantizm chukotskogo jazyka. Leningrad: Nauka.
Bobaljik J (1998). ‘Pseudo-ergativity in Chukotko-Kamchatkan agreement systems.’ Recherches Linguistiques de Vincennes 27, 21–44.
Bogoraz W (1922). ‘Chukchee.’ In Boas F (ed.) Handbook of American Indian languages. Washington: Government Printing Office. 631–903.
Bogoraz V G (1937). ‘Luoravetlanskij (chukotskij) jazyk.’ In Krejnovich E A & Koshkin J P (eds.) Jazyki i pis’mennost’ narodov Severa. Chast’ III. Jazyki i pis’mennosti paleoaziatskix narodov. Moscow: Gosudarstvennoe uchebno-pedagogicheskoe izdatel’stvo.
Comrie B (1979). ‘Degrees of ergativity: some Chukchee evidence.’ In Plank F (ed.) Ergativity. New York: Academic Press. 219–240.
Comrie B (1980). ‘Inverse verb forms in Siberia: evidence from Chukchee, Koryak, Kamchadal.’ Folia Linguistica I(1), 61–74.
Dürr M, Kasten E & Khalojmova K N (2001). Itelmen language and culture. Münster/Berlin: Waxmann [multimedia CD-ROM].
Fortescue M (1998). Language relations across Bering Strait: reappraising the archaeological and linguistic evidence. London: Cassell.
Georg R-S & Volodin A P (1999). Die itelmenische Sprache. Wiesbaden: Harrassowitz.
Inenlikej P I (1987). Slovar’ chukotsko-russkij i russko-chukotskij (2nd edn.). Leningrad: Prosveshchenie.
Kämpfe H R & Volodin A P (1995). Abriss der tschuktschischen Grammatik. Wiesbaden: Harrassowitz.
Kibrik A E, Kodzasov S V & Muravyova I A (2000). Aljutorskii iazyk i folklor. Moscow: IMLI RAN Nasledie.
Kozinsky I S, Nedjalkov V P & Polinskaja M S (1988). ‘Antipassive in Chukchee: oblique object, object incorporation, zero object.’ In Shibatani M (ed.) Passive and voice. Amsterdam: John Benjamins. 651–706.
Krejnovich E A (1979). ‘Nejtral’nyj glasnyj i konsonantizm v chukotsko-kamchatskikh jazykakh.’ In Zvukovoj stroj jazykov. Leningrad: Nauka. 157–166.
Mudrak O A (2000). Etimologicheskij slovar’ Chukotsko-Kamchatskix jazykov. Moscow: Jazyki Russkoj Kul’turi.
Nedjalkov V P (1979). ‘Degrees of ergativity in Chukchee.’ In Plank F (ed.) Ergativity. New York: Academic Press. 241–262.
Nedjalkov V P, Inenlikej P I & Rachtilin V G (1983). ‘Rezul’tativ i perfekt v Chukotskom jazyke.’ In Tipologija rezul’tativnyx konstruktsij. Leningrad: Nauka. 101–109.
Radloff L (1861). Über die Sprache der Tschuktschen und ihr Verhältnis zum Korjakischen. St. Petersburg: Commissionäre der Kaiserlichen Akademie der Wissenschaften.
Skorik P J (1948). Ocherki po sintaksisu chukotskogo jazyka. Inkorporatsija. Moscow: Gosudarstvennoe uchebno-pedagogicheskoe izdatel’stvo.
Skorik P J (1961). Grammatika chukotskogo jazyka. Chast’ I. Moscow: Izdatel’stvo Akademii Nauk SSSR.
Skorik P J (1977). Grammatika chukotskogo jazyka. Chast’ II. Moscow: Izdatel’stvo Akademii Nauk SSSR.
Skorik P J (1986). ‘Kategorii imeni sushchestvitel’nogo v Chukotsko-Kamchatskikh jazykakh.’ In Skorik P J (ed.) Paleoaziatskie jazyki. Novosibirsk: Akademija Nauk SSSR. 76–111.
Spencer A (1995). ‘Incorporation in Chukchi.’ Language 71(3), 439–489.
Stebnitskij S N (1937). ‘Osnovnye foneticheskie razlichija dialektov nymylanskogo (korjakskogo) jazyka.’ In Pamjati V. G. Bogoraza. Moscow: Izdatel’stvo Akademii Nauk SSSR.
Stebnitskij S N (1938). ‘Aljutorskij dialekt nymylanskogo jazyka.’ Sovetskij Sever I, 65–102.
Volodin A P (1976). Itel’menskij jazyk. Leningrad: Nauka.
Volodin A P (1991). ‘Prospekt opisanija grammatiki kerekskogo jazyka (Chukotsko-Kamchatskaja gruppa).’ In Jazyki narodov Sibiri. Grammaticheskie issledovanija. Novosibirsk: Nauka Sibirskoe Otdelenie.
Volodin A P (1997). ‘Itel’menskij jazyk.’ In Jazyki Mira: Paleoaziatskie jazyki. Moscow: Indrik. 60–72.
Volodin A P & Khalojmova K N (1989). Itel’mensko-russkij russko-itel’menskij slovar’. Leningrad: Prosveshchenie.
Worth D S (1962). ‘La place du Kamtchadal parmi les langues soi-disant paléosibériennes.’ Orbis XI(2).
Zhukova A N (1968). ‘Aljutorskij jazyk.’ In Jazyki narodov SSSR, V, Paleoaziatskie jazyki. Moscow: Akademija Nauk SSSR. 294–309.
Zhukova A N (1972). Grammatika korjakskogo jazyka. Leningrad: Nauka.
Zhukova A N (1980). Jazyk palanskikh korjakov. Leningrad: Nauka.
Church Slavonic
C M MacRobert, Oxford University, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.
Church Slavonic is a generic term for the closely related, highly conservative varieties of Slavic language used for liturgical purposes by the Eastern Orthodox Slavs (Belorussian, Bulgarian, Macedonian, Russian, Serbian, Ukrainian) and the Ukrainian Uniates, and also by the Romanians until the 16th century and, until the introduction of services in the vernacular, the Roman Catholic Croats of the Slavonic rite. In the medieval period, Church Slavonic also had the wider functions of a literary language among most of these peoples. Church Slavonic originated in the translations of Scripture and liturgy made mainly from Greek by SS Cyril and Methodius and their associates in the late
9th and early 10th centuries (see Old Church Slavonic). The basic vocabulary, grammatical forms, and pronunciation of these texts predominantly followed the usage of Slavs in the southeast Balkans, while syntax and word-formation were to a large extent modeled on Greek. Two developments signal the transition, by the end of the 11th century, from Old Church Slavonic to Church Slavonic. One was the emergence of local varieties, such as Croatian, Russian, and Serbian Church Slavonic, which compromised between traditional pronunciation and grammatical forms and the vernacular usage of the area. Initially unsystematic, these modified varieties rapidly stabilized to local norms that in the hands of competent scribes attained a high degree of regularity. The other development consisted in revisions of syntax and vocabulary, which seem to have been motivated partly by the
Church Slavonic 417 Fortescue M (1998). Language relations across Bering Strait: reappraising the archaeological and linguistic evidence. London: Cassel. Georg R-S & Volodin A P (1999). Die itelmenische Sprache. Wiesbaden: Harrassowitz. Inenlikej P I (1987). Slovar’ chukotsko-russkij i russkochukotskij (2nd edn.). Leningrad: Proveshchenie. Ka¨mpfe H R & Volodin A P (1995). Abriss der tschuktschischen grammatik. Wiesbaden: Harrassowitz. Kibrik A E, Kodzasov S V & Muravyova I A (2000). Aljutorskii iazyk i folklor. Moscow: IMLI RAN Nasledie. Kozinsky I S, Nedjalkov V P & Polinskaja M S (1988). ‘Antipassive in Chukchee: oblique object, object incorporation, zero object.’ In Shibatani M (ed.) Passive and voice. Amsterdam: John Benjamins. 651–706. Krejnovich E A (1979). ‘Nejtral’nyj glasnyj i konsonatizm v chukotsko-kamchatskikh jazykakh.’ In Zvukovoj stroj jazykov. Leningrad: Nauka. 157–166. Mudrak O A (2000). Etimologicheskij slovar’ ChukotskoKamchatskix jazykov. Moscow: Jazyki Russkoj Kul’turi. Nedjalkov V P (1979). ‘Degrees of ergativity in Chukchee.’ In Plank F (ed.) Ergativity. New York: Academic Press. 241–262. Nedjalkov V P, Inenlikej P I & Rachtilin V G (1983). ‘Rezul’tativ i perfekt v Chukotskom jazyke.’ In Tipologija rezul’tativnyx konstruktsij. Leningrad: Nauka. 101–109. ¨ ber die Sprache der Tschuktschen Radloff L (1861). U und ihr Verha¨ltnis zum Korjakischen. St. Petersburg: Comissiona¨re der Kaiserlichen Akademie der Wissenschaften. Skorik P J (1948). Ocherki po sintaksisu chukotskogo jazyka. Inkorporatsija. Moscow: Gosudarstvennoe uchebno-pedagogicheskoe izdatel’stvo.
Skorik P J (1961). Grammatika chukotskogo jazyka. Chast’ I. Moscow: Izdatel’stvo Akodemii Nauk SSSR. Skorik P J (1977). Grammatika chukotskogo jazyka. Chast’ II. Moscow: Izdatel’stvo Akodemii Nauk SSSR. Skorik P J (1986). ‘Kategorii imeni sushchestvitel’nogo v Chukotsko-Kamchatskikh jazykakh.’ In Skorik P J (ed.) Paleoaziatskie jazyki. Novosibirsk: Akademija Nauk SSSR. 76–111. Spencer A (1995). ‘Incorporation in Chukchi.’ Language 71(3), 439–489. Stebnitskij S N (1937). ‘Osnovnye foneticheskie razlichija dialektov nymylanskogo (korjakskogo) jazyka.’ In Pamjati V. G. Bogoraza. Moscow: Izdatel’stvo Akademii Nauk SSSR. Stebnitskij S N (1938). ‘Aljutorskij dialekt nymylanskogo jazyka.’ Sovetskij Sever I, 65–102. Volodin A P (1976). Itel’menskij jazyk. Leningrad: Nauka. Volodin A P (1991). ‘Prospekt opisanija grammatiki kerekskogo jazyka (Chukotsko-Kamchatskaja gruppa).’ In Jazyki narodov sibiri. grammaticheskie issledovanija. Novosibirsk: Nauka Sibirskoe Otdelenie. Volodin A P (1997). Itel’menskij jazyk.’ In Jazyki Mira: Paleoaziatskie jazyki. Moscow: Indrik. 60–72. Volodin A P & Khalojmova K N (1989). Itel’mensko- russkij russko-itel’menskij slovar’. Leningrad: Proveshchenie. Worth D S (1962). ‘La place du Kamtchadal parmi les langues soi-disant pale´osibe´riennes.’ Orbis XI(2). Zhukova A N (1968). ‘Aljutorskij jazyk.’ In Jazyki narodov SSSR, V, Paleoaziatskie jazyki. Moscow: Akademija Nauk SSSR. 294–309. Zhukova A N (1972). Grammatika korjakskogo jazyka. Leningrad: Nauka. Zhukova A N (1980). Jazyk palanskikh korjakov. Leningrad: Nauka.
Church Slavonic C M MacRobert, Oxford University, Oxford, UK ! 2006 Elsevier Ltd. All rights reserved.
Church Slavonic is a generic term for the closely related, highly conservative varieties of Slavic language used for liturgical purposes by the Eastern Orthodox Slavs (Belorussian, Bulgarian, Macedonian, Russian, Serbian, Ukrainian) and the Ukrainian Uniates, and also by the Romanians until the 16th century and, until the introduction of services in the vernacular, the Roman Catholic Croats of the Slavonic rite. In the medieval period, Church Slavonic also had the wider functions of a literary language among most of these peoples. Church Slavonic originated in the translations of Scripture and liturgy made mainly from Greek by SS Cyril and Methodius and their associates in the late
9th and early 10th centuries (see Old Church Slavonic). The basic vocabulary, grammatical forms, and pronunciation of these texts predominantly followed the usage of Slavs in the southeast Balkans, while syntax and word-formation were to a large extent modeled on Greek. Two developments signal the transition, by the end of the 11th century, from Old Church Slavonic to Church Slavonic. One was the emergence of local varieties, such as Croatian, Russian, and Serbian Church Slavonic, which compromised between traditional pronunciation and grammatical forms and the vernacular usage of the area. Initially unsystematic, these modified varieties rapidly stabilized to local norms that in the hands of competent scribes attained a high degree of regularity. The other development consisted in revisions of syntax and vocabulary, which seem to have been motivated partly by the
desire to eliminate outdated or unfamiliar linguistic material, but also aimed to make texts conform to a received Greek version and to produce more closely literal translations. The earliest systematic revisions are associated with Preslav, the capital of Bulgaria in the 10th century, when a number of early Church Slavonic revisions, new translations, and original compositions came into existence. There also appears to have been a revision of Croatian Church Slavonic texts on the basis of Latin sources in the 12th century. Revisionist tendencies culminated by the 14th century in comprehensive reform of scriptural and liturgical translations into Bulgarian and Serbian Church Slavonic. This development has been associated with the Bulgarian patriarch Euthymius (elected patriarch in 1375; exiled by the Turks in 1393), though more recent research suggests it began in the early part of the century, perhaps on Mount Athos. The resulting standardized orthography, conservatism in grammatical forms and vocabulary, and highly literalistic translational practice were introduced among the East Slavs from the end of the 14th century, albeit with some adjustments to pre-existing local usage. The late 16th and early 17th centuries saw attempts in the Ukraine at systematic description of this late and composite type of Church Slavonic, on the model of Greek and Latin grammars; the most comprehensive of these, compiled by Meletij Smotryc’kyj in the early 17th century and subsequently modified to conform to Muscovite practice, remained the fullest description of Russian Church Slavonic until the 19th century. Further revisions of Church Slavonic texts initiated in Muscovy or the Ukraine in the 16th and 17th centuries, though controversial in their time, dealt with minor textual discrepancies or the detail of grammatical and orthographical norms. A final standardization was effected in the publications approved by the Synod of the Russian Orthodox Church in the 18th century. 
Thanks to the dissemination of these printed books in the Balkans, the Orthodox Bulgarians, Macedonians, and Serbs came to use ‘Synodal’ Russian Church Slavonic, albeit with their own pronunciations. Modern Church Slavonic does not stand in a simple genetic relationship to other Slavic languages. Its texts may be understood in different ways and to varying degrees by Slavs of differing linguistic background and, as a result of literalistic translational practices aiming at morpheme for morpheme equivalence, some of them are intelligible only with the help of their Greek originals. It is virtually a closed system, for though new texts can be created if need arises, they are acceptable as Church Slavonic only insofar
as they reproduce traditional constructions and phraseology. While its liturgical use still prevails in the Russian Orthodox Church, among the Orthodox South Slavs, Church Slavonic tends increasingly to be supplanted by modern vernacular translations, and survives mainly as a vehicle for the traditional corpus of hymns. See also: Balto-Slavic Languages; Bulgaria: Language Situation; Bulgarian; Macedonia: Language Situation; Macedonian; Old Church Slavonic.
Chuvash
L Johanson, Mainz University, Mainz, Germany
© 2006 Elsevier Ltd. All rights reserved.
Location and Speakers
Chuvash (čăvaš čĕlxi, čăvašla) is the only modern representative of the Oghur (or Bulgar) branch of the Turkic language family. It is spoken in the Volga-Ural region, partly in the Chuvash Republic (Čăvaš Respubliki) at the ‘Great Bend’ of the Volga River. The Chuvash Republic (the capital is Cheboksary, Šupaškar) was established in 1990 within the Russian Federation; its forerunner was the Chuvash Autonomous Soviet Socialist Republic, created in 1925. The Chuvash have majority status in the Republic, forming nearly 70% of the population. Over three-fourths of the population regard Chuvash as their native language. More than half of the speakers of Chuvash live outside the Republic, especially in the south and southwest parts of Tatarstan, in the central and west parts of Bashkortostan, and in the Kuybyshev, Ulyanovsk, and Samar provinces. Speakers of Chuvash also live in other parts of Russia, in West and East Siberia, in the Far East, and in some Central Asian republics. The total number of Chuvash-speaking people is nearly 2 million. According to a law adopted in 1991, Chuvash and Russian are the official languages of the Republic. Russian is the medium of communication between nationalities and the main language of instruction. However, the efforts to maintain Chuvash are strong, even in the younger generation.
Origin and History
Parts of the old Oghur tribal confederation, originally based in the Baikal Lake region, moved west and arrived, in the mid-5th century, in the European steppe, where they established states on the Kuban, Danube, and Volga rivers. They mostly assimilated linguistically, a well-known example being the Slavicization of Bulgar groups in the Balkans. At the end of the 9th century or earlier, Oghur groups settled in the Volga-Kama region, where they established the Volga Bulgar kingdom, with its center on the middle and lower course of the Volga River. They accepted Islam as early as 922. After the destruction of this state by the Mongols in the 13th century, the Volga Bulgars and other groups of the region became subject to the Golden Horde. Early Oghur is unknown except for the evidence found in some proper names and old loanwords.
Chuvash, which was recorded for the first time in the 18th century in word lists, texts, and one grammar, is considered closely related to Volga Bulgar and other old varieties of the Oghur type. Volga Bulgar is partly known from tombstone inscriptions found on the left bank of the Volga River, dating to the 13th and 14th centuries. Several linguistic features recorded in these epitaphs do not, however, fit very well with the known features of Chuvash. It is thus still not quite clear that Chuvash is a direct descendant of Volga Bulgar. It is also unknown whether the ancestors of the Chuvash took part in the written culture of the Bulgars. There are no Volga Bulgar epitaphs on the territory of the Chuvash Republic. The fact that Chuvash is one of the very few Turkic languages that is not strongly influenced by Islam may indicate that the ancestors of the Chuvash were not affected by the Volga Bulgar Islamic culture.
Related Languages and Language Contacts
Chuvash is the result of the oldest known split within the Turkic family. Its origins reside in the language of the Oghur Turkic group. Chuvash has played a key role in comparative Turkic linguistics, especially in discussions about a possible genealogical relationship of Turkic, Mongolic, and Tungusic within an Altaic language family. According to an older view, Chuvash constitutes an independent Altaic language. The hypothesis of an Altaic protolanguage relies on reconstructions on the basis of words shared by Turkic, Mongolic, Tungusic, and sometimes other languages, such as Korean and Japanese. Deviant Chuvash consonant representations have been used to reconstruct a Proto-Altaic phonology. Chuvash words with r and l sometimes correspond to Common Turkic words with z and š (e.g., Chuvash čul ‘stone’ vs. Common Turkic ta:š). This is an archaic Oghur feature. Two Samoyed words that can be traced back to *yür ‘hundred’ and *kil" ‘winter’ have obviously been copied from Oghur words containing the same final consonants. The corresponding Chuvash words are śĕr and xĕl, whereas other Turkic languages display forms ending in -z and -š, respectively (e.g., Turkish yüz, kış). Chuvash words with r and l sometimes have Mongolic equivalents with r and l (e.g., čul ‘stone’ vs. čilagun). Cases such as these have been used to reconstruct the special Proto-Altaic elements r2 and l2, which are thought to be represented by r and l in Mongolic and Chuvash, whereas they have developed into z and š in Common Turkic. Scholars who do not accept the Altaic hypothesis
explain these and other correspondences by contact relationship. In this case, the assumption is that an Oghur language of the Chuvash type, with certain features, was the source of the oldest layer of Turkic loanwords in Mongolic. Tungusic, in turn, is thought to have borrowed words with these features from Mongolic. Complex processes of linguistic assimilation have taken place in the Volga-Kama region since the 10th century. The Bulgar influence on East Finnic, Slavic, and early Kipchak Turkic was considerable. Ancestors of the Chuvash assimilated speakers of Udmurt (Votyak) and Meadow Mari (Cheremis). The assimilation of local populations led to strong substrate influences, especially from Mari. The term ‘Chuvash,’ first documented in Russian chronicles of the 16th century, originally referred to groups that also included speakers of Mari. On the other hand, the designation ‘Cheremis’ was also applied to Chuvash. After the Mongol conquest, from the 14th century on, Kipchak-speaking newcomers played an important role in the area. Speakers of Volga Bulgar were linguistically influenced by them. Parts of them assimilated Volga Bulgars and other Oghur-speaking groups, which led to substratum influence. What is known as Chuvash today remained relatively uninfluenced by the Kipchak wave. In its more recent linguistic history, however, Chuvash has been closely connected with Kipchak Turkic through massive Tatar impact.
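The r/l versus z/š correspondences described in this section can be made concrete with a small check over the word pairs cited in the text. The following Python sketch is only an illustration: the function name and the data layout are assumptions made here, and Turkish orthographic ş is treated as equivalent to š.

```python
import unicodedata

# Word pairs cited in the text: (Chuvash form, Common Turkic or Turkish form, gloss).
PAIRS = [
    ("čul", "ta:š", "stone"),
    ("śĕr", "yüz", "hundred"),
    ("xĕl", "kış", "winter"),
]

# Chuvash final consonant -> expected Common Turkic final consonants.
# "ş" is the Turkish spelling of š (an assumption of this sketch).
CORRESPONDENCES = {"r": {"z"}, "l": {"š", "ş"}}


def shows_correspondence(chuvash: str, common_turkic: str) -> bool:
    """Return True if the final consonants fit the r~z or l~š pattern."""
    chuvash = unicodedata.normalize("NFC", chuvash)
    common_turkic = unicodedata.normalize("NFC", common_turkic)
    return common_turkic[-1] in CORRESPONDENCES.get(chuvash[-1], set())


for chuvash, common_turkic, gloss in PAIRS:
    print(gloss, chuvash, common_turkic, shows_correspondence(chuvash, common_turkic))
```

A check of this kind records only the surface pattern. It says nothing about whether the pattern reflects common inheritance or borrowing, which is the point at issue between the two views.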
The Written Language
Standard Chuvash is written with a Cyrillic-based alphabet that includes a few special letters. It goes back to a script system established by Ivan Jakovlev (1848–1930), which mirrors the pronunciation of the Anatri dialect. The alphabet was reformed in 1938 and has remained unchanged since then. It basically represents phonemes, and few allophones.
Distinctive Features
Chuvash shares basic linguistic features with other Turkic languages, preserving numerous so-called Common Turkic traits. It exhibits most linguistic features typical of the Turkic family (see Turkic Languages). It is, for example, an agglutinative language with suffixing morphology, sound harmony, and a head-final constituent order. On the other hand, it strongly deviates from Common Turkic in some respects, particularly in its phonology. In the following suffix notations, capital letters indicate phonetic variation (e.g., A = ă/ĕ). Hyphens are used to indicate morpheme boundaries.
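The capital-letter notation and the notion of sound harmony can be illustrated with a short Python sketch. It takes the A = ă/ĕ example at face value and applies it to the suffix notation -sAr and to word forms cited elsewhere in this article. The back/front vowel classes and both function names are assumptions of this sketch, not something the article spells out.

```python
import unicodedata

# Assumed vowel classes for the transliteration used in this article.
BACK_VOWELS = set("aăuï")
FRONT_VOWELS = set("eĕiü")

# Capital-letter notation from the text: A = ă/ĕ, stored as (back, front).
ARCHIPHONEMES = {"A": ("ă", "ĕ")}


def expand(notation: str) -> tuple:
    """Return the (back, front) variants of a suffix notation such as '-sAr'."""
    back = "".join(ARCHIPHONEMES.get(ch, (ch, ch))[0] for ch in notation)
    front = "".join(ARCHIPHONEMES.get(ch, (ch, ch))[1] for ch in notation)
    return back, front


def harmony_class(word: str) -> str:
    """Classify a word as 'back', 'front', 'mixed', or 'none' by its vowels."""
    letters = unicodedata.normalize("NFC", word.lower())
    has_back = any(ch in BACK_VOWELS for ch in letters)
    has_front = any(ch in FRONT_VOWELS for ch in letters)
    if has_back and has_front:
        return "mixed"
    if has_back:
        return "back"
    if has_front:
        return "front"
    return "none"


print(expand("-sAr"))            # ('-săr', '-sĕr')
print(harmony_class("ača"))      # back
print(harmony_class("pĕr"))      # front
print(harmony_class("ača-sem"))  # mixed: -sem has only a front-vowel variant
```

The sketch deliberately stops short of generating inflected forms, since consonant alternations and invariant suffixes would need separate treatment.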
Phonology
Chuvash phonology displays many irregular and complicated sound changes. This is especially true of the vowels, whose correspondences with Common Turkic vowels are far from unequivocal. For instance, the Common Turkic vowel a is represented by u in words such as ut ‘horse’ (cf. Tatar at), but by ï in words such as pïr- ‘to go’ (cf. Tatar bar-). Chuvash possesses the reduced vowels ă and ĕ (e.g., tăr- ‘to stand’, pĕr ‘one’), which have their counterparts in neighboring languages, including Tatar, without corresponding to them in a systematic way. Originally long vowels are generally not preserved in Chuvash. In some cases, however, they are represented by diphthongs (e.g., kĕvak ‘blue’ < kö:k). Chuvash has a rather reduced consonant inventory in comparison with most other Turkic languages. Under Slavic influence, palatalized and nonpalatalized consonants are distinguished, the palatalized ones occurring before and after front vowels. Chuvash r sometimes corresponds to an Old Turkic interdental d, as in ura ‘foot’ vs. adaq. This is not necessarily an archaic feature. In cases such as this, early Bulgar d seems to have changed into z, which then developed into r in late Bulgar. Chuvash words are, as a rule, subject to sound harmony. The vowels of a word normally belong either to the front or to the back class. Most suffixes have a back vowel and a front vowel variant. However, some suffixes of standard Chuvash exist only in a front vowel variant: the plural suffix -sem, as in ačasem ‘children’ (of ača ‘child’), and the third-person possessive suffix, as in ïvăl-ĕ ‘her/his son’ (of ïvăl ‘son’).
Grammar
The morphology exhibits certain deviations from Common Turkic patterns. There are thus exceptions to the agglutinative principles generally valid for Turkic languages (e.g., tu ‘mountain’ vs. tăv-a [mountain-DAT] ‘to the mountain’) (cf. Turkish dağ [mountain], dağ-a [mountain-DAT]). Eight cases are normally distinguished for the standard language. As a result of phonetic development, the dative and accusative case markers have fused into one marker, -A. Besides the suffixless nominative, the dative–accusative, the genitive in -An, the locative in -rA, and the ablative in -rAn, Chuvash grammarians reckon with an instrumental-comitative case in -pA[lA], a privative (or abessive) case in -sAr, and a causal or purposive case in -šAn. Some scholars distinguish still more cases, e.g., a directive in -AllA. The plural suffix -sem is of unknown origin; other Turkic languages use plural suffixes of the type -lAr. The plural
marker -sem follows possessive suffixes and precedes case markers (e.g., kil-ĕm-sen-čen [house-POSS.1.SG-PL-ABL] ‘from my houses’). In other Turkic languages, the plural suffix precedes the possessive markers, as in Turkish ev-ler-im-den [house-PL-POSS.1.SG-ABL]. The nominative forms of the personal pronouns of the first and second persons contain a proclitic deictic element e-, lacking in other Turkic languages: epĕ ‘I’, esĕ ‘you (singular)’, epir ‘we’, esir ‘you (plural)’ (cf. Turkish ben, sen, biz, siz). The reflexive pronouns of the type xa- plus possessive suffixes (e.g., xam ‘I myself’) are unknown in other Turkic languages. Three degrees of proximity are expressed with the demonstrative pronouns ku ‘this’, śak(ă) ‘this there’, and śav(ă) ‘that there’. The numerals 1–10 display, besides their normal forms (pĕr(e) ‘one’, tăxăr ‘nine’), emphatic variants with long consonants for use in isolated syntagmatic positions (pĕrre ‘one’, tăxxăr ‘nine’). Ordinals are formed with the suffix -mĕš, otherwise unknown in Turkic (ikkĕ-mĕš [two-ORD] ‘second’). The Chuvash verb system does not exhibit such important deviations from the common Turkic system as has been assumed by some researchers. For example, the so-called ‘aorist’ (e.g., Turkish gel-ir [come-AOR.3.SG] ‘comes, will come’) is not lacking in Chuvash, but has survived as the so-called future, as in kil-ĕ [come-FUT.3.SG] ‘will come’. The negated imperative is formed with a preposed particle an (an pïr [NEG go.IMP] ‘do not go’), whereas other Turkic languages use the negation suffix -mA with imperatives as well (e.g., Turkish git-me [go-NEG.IMP] ‘do not go’). It has been suggested that some of these idiosyncratic Chuvash features – the deictic element e-, the pronouns śakă and śavă, the negative particle an, and the plural suffix -sem – have been copied from Mari or other Volga Finnic languages.
Lexicon
Most basic words in the Chuvash lexicon belong to the common Turkic vocabulary. Many elements have, however, been copied from other languages, mostly from Tatar, neighboring Finnic languages, and Russian. An old layer indicates contacts with Samoyeds in southwestern Siberia. Later loans reflect the contacts with Mari in the Volga region, e.g., pürt ‘house’ < pört. Tatar dialects have exerted strong influence on the lexicon. Words of Arabic and Persian origin have mostly entered Chuvash via Tatar, but certain words were borrowed already in the Volga Bulgar period. Words of Mongolic origin have also mostly been copied from Tatar, such as tăxta- ‘to wait’ < tuqta- ‘to stop’. There are numerous Russian loans, including xaśat ‘newspaper’ < gazeta and kĕneke ‘book’ < kniga. There are also many lexical elements of unknown origin.
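Returning briefly to the suffix order described under Grammar (possessive before plural in Chuvash, plural before possessive in Turkish), the contrast can be sketched as a slot template. The slot names and the helper function are hypothetical labels chosen for this illustration; the morphemes are those of the two examples cited in that section, taken as surface forms without any modelling of allomorphy.

```python
# Order of suffix slots after the stem; slot names are labels invented for this sketch.
SLOT_ORDER = {
    "Chuvash": ["POSS", "PL", "CASE"],
    "Turkish": ["PL", "POSS", "CASE"],
}


def build_form(language: str, stem: str, suffixes: dict) -> str:
    """Join stem and suffixes with hyphens in the slot order of the language."""
    ordered = [suffixes[slot] for slot in SLOT_ORDER[language] if slot in suffixes]
    return "-".join([stem] + ordered)


chuvash = build_form("Chuvash", "kil", {"PL": "sen", "POSS": "ĕm", "CASE": "čen"})
turkish = build_form("Turkish", "ev", {"PL": "ler", "POSS": "im", "CASE": "den"})
print(chuvash)  # kil-ĕm-sen-čen
print(turkish)  # ev-ler-im-den
```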
Dialects
Modern Chuvash has two main dialects. Viryal, the ‘upper’ dialect, is spoken in the northern and northwestern parts of the Republic. Anatri, the ‘lower’ one, is spoken in the south. In the center and northeast, there is found a transitional dialect that is rather close to the lower dialect. The differences between the dialects are small. Standard Chuvash is based on Anatri dialects. Chuvash speakers living outside the Republic also speak Anatri dialects. Vowel harmony is less consistent in Standard Chuvash and Anatri than in Viryal. Tatar loans are more common in Anatri, whereas Mari and Russian loans are more common in Viryal.
See also: Altaic Languages; Russian Federation: Language Situation; Turkic Languages.
Relevant Website
http://www.turkiclanguages.com – Website with many Turkic language resources.
Circum-Baltic Area
M Koptjevskaja-Tamm, Stockholm University, Stockholm, Sweden
© 2006 Elsevier Ltd. All rights reserved.
The Circum-Baltic area comprises primarily Baltic, Germanic, and Slavic languages within Indo-European; Finnic and Saami within Uralic/Finno-Ugrian; as well as the Indo-Aryan language(s) Romani and the Turkic languages Tatar and Karaim. Since time immemorial it has been an arena for intensive linguistic contacts, but it has never been united, either linguistically, politically, economically, or culturally. Following Jakobson, linguists have suggested various partly overlapping Sprachbünde here. However, the notion of Sprachbund is hardly applicable to an area of such historical and linguistic complexity, with many layers of micro- and macrocontacts and mutual influences superimposed on each other over a long period of time.
The Circum-Baltic Area: The Historical, Geographical and Sociocultural Background
The region around the Baltic Sea has been inhabited by man at least since the end of the last glacial era (around 8000 B.C.). In historical times, it has mainly been the home for languages from two linguistic stocks: Indo-European (Baltic, Germanic and Slavic languages) and Uralic/Finno-Ugrian (Finnic and Saami). In addition, three ‘exotic’ languages have been used in the area for a considerable time: the Indo-Aryan language(s) Romani and the Turkic languages Tatar and Karaim. The exact delineation of the ‘Circum-Baltic area’ (CB area) and list of the ‘Circum-Baltic languages’ (CB languages; both terms launched in Dahl and Koptjevskaja-Tamm, 1992 and further developed in Dahl and Koptjevskaja-Tamm, 2001) are open to discussion. At least the following languages can count here (see also Figure 1):
Indo-European
  Germanic
    West: High German, Low German, Eastern Yiddish
    North: Danish, Swedish, Dalecarlian, Norwegian
  Baltic: Latvian, Lithuanian
  Slavic
    West: Polish, Kashubian
    East: Belarusan, Russian
  Indo-Aryan: Romani
Finno-Ugrian
  Finnic: Estonian, Finnish, Ingrian, Karelian, Ludian, Olonetsian, Veps
Figure 1 The Circum-Baltic languages. Nonterritorial languages (Romani, Yiddish, Tatar) not shown.
The list does not contain extinct languages – e.g., Polabian (Slavic); Old Prussian, Jatvingian, Curonian, and Galindian (Baltic) – and ignores the dialectal variation. Dalecarlian (egentligt dalmål), however, is treated here as a separate language, or a chain of languages, as opposed to its traditional inclusion among the Eastern Swedish dialects. ‘Dalecarlian’ refers to the highly conservative Scandinavian vernaculars that are spoken in the Swedish province of Dalarna (Dalecarlia) and are not comprehensible to speakers of Standard Swedish. The CB area was settled via numerous migrations; among others, the northward expansion of groups of Indo-Europeans, and much later the Slavic northward expansion to the former Baltic and Finnic-speaking regions. Archaeologists, geneticists, and linguists all claim to trace back the two language stocks in the area at least to the 2nd millennium B.C., even though they disagree on which group, the Indo-Europeans or the Finno-Ugrians, arrived first in which part of the area. Finno-Ugrians and Indo-Europeans were engaged in multiple contacts already in prehistoric times, both outside and within the CB area. Some of these contacts must have taken place in extensive bilingual areas; otherwise it is hardly possible to explain the huge number of Baltic and Germanic loanwords in Finnic, including even kinship terms and names for body parts. Although the CB area has been an arena for intensive contacts since time immemorial, it has never been economically, politically, culturally, or linguistically united. The contacts, both peaceful and military, have been achieved both across the sea and by land. Movements across the sea normally involve fewer people than those on land, either tradesmen, missionaries, warriors, or colonizers, and there is a smaller chance that the newcomers will ‘sweep’ through the area. Also, coastlines and numerous islands often serve as refuges for the languages pushed from the inland by expanding ones.
The CB area has been constantly divided and redivided among different spheres of influence. Thus, the period of 800–1000 meant expansive activities of the Scandinavian Vikings and the emergence of the Scandinavian, Polish, and Russian states, each with its own sphere of dominance. The period of 1100–1500 witnessed Denmark’s expansion, the crusades and the establishment of the Teutonic Order states in Northern Baltikum, the dominance of the Hanseatic league, and the expansion of the Polish and Lithuanian states, later of the Polish-Lithuanian state. In later times, the area went on being repartitioned among powers such as Sweden, the Polish-Lithuanian
Commonwealth, Prussia (later Germany), and Russia (later the Soviet Union). Each of the dominant powers brought with it a new prestige language (Danish, Low German, the Eastern Slavic variety used in the Grand Duchy of Lithuania, Polish, Russian, Swedish and German) that expanded over a large area and influenced the local languages. The three main religions in the area – Catholicism, Lutheranism, and Greek Orthodoxy – have also considerably shifted their spheres of dominance over time.
Contact Phenomena within the CB Area
There is an old tradition of studying contacts among the languages around the Baltic Sea which mostly concentrated on loanwords. R. Jakobson (1931/1971) was the first to apply the term ‘Sprachbund’ to the languages of the CB area. He suggested that several languages spoken around the Baltic Sea together built a ‘phonetic’ Sprachbund. These included Baltic, two South Finnic languages (Livonian, Estonian), Mainland Scandinavian (except for an area in the west of Norway and for most of the Swedish dialects in Finland and in Estonia), Northern Kashubian, and the Germanic varieties spoken in the Cologne-Trier area (Central Franconian and Limburgian), i.e., outside of the CB area proper. Their common property is lexical, or pitch, accent (i.e., tonal oppositions in addition to stress), which in Danish corresponds to the opposition between syllables with and without a glottal closure, stød. However, although the CB languages show a very high concentration of lexical accents in Europe, and probably also globally, there is no evidence for any real connections here, primarily, between the Scandinavian languages, on the one hand, and Baltic and South Finnic, on the other hand (Lehiste, 1988; Koptjevskaja-Tamm and Wälchli, 2001). Later research has suggested several partly overlapping Sprachbünde in the CB area, with two main hotbeds of areal phenomena. The first is the Latvian–Livonian–Estonian zone stretching in different directions over the Baltikum and further – the ‘Peipus-Bund’ (Décsy, 1973); the ‘Baltischer Sprachbund’ (Haarmann, 1970; 1976); the ‘Convergence zone in the Baltikum’ (Stolz, 1991; see also Falkenhahn, 1963); and the second is the Eastern Finnic–Northwestern Russian (–Baltic) core – the ‘Eastern Baltic Sprachbund’ (Mathiassen, 1985a, 1985b) and the ‘Karelian Sprachbund’ (Sarhimaa, 1999).
The core in each zone has been a mixed bilingual or multilingual area over long periods of time with complex local contacts leading to assimilation, acculturation, language convergence and/or language shifts. Thus, a strong Finnic substrate is generally
recognized in Latvian, especially in its northeastern dialects, covering the area originally inhabited by Livonians. Livonian itself is at present spoken by a few dozen speakers and is largely influenced by Latvian. Northern Russian has a number of features generally attributed to the Finnic substrate and, primarily, to contacts with the smaller Finnic languages (Ingrian, Karelian, Ludian, Veps, and Votian). The latter languages are now on the verge of extinction or, in the case of Karelian, are mainly used as one part in bilingual mixed codes (Sarhimaa, 1999). In addition, the CB area partially overlaps with two other suggested convergence zones including, first, Scandinavian–Celtic–Northern Finnic–Saami – the ‘Wikinger-Bund’ (Haarmann, 1976), and second, Polish–Kashubian–Belarusan–Ukrainian–Lithuanian – the ‘Rokytno-Bund’ (Haarmann, 1976), or the ‘Baltic-Slavic contact area’ (Wiemer, 2004; see also Falkenhahn, 1963). The abundance of different Sprachbünde in the CB area rests partly on a fairly vague understanding of the term. The CB area shows many layers of micro- and macrocontacts and mutual influences superimposed on each other over a long period of time, and the term ‘Sprachbund’ hardly does justice to an area of such historical and linguistic complexity (Nau, 1996; Koptjevskaja-Tamm and Wälchli, 2001). The crosslinguistically most striking isoglosses in the CB area are mainly found in its eastern part and include the following (for details, see Koptjevskaja-Tamm and Wälchli, 2001).
1. The alternation between the accusative and the genitive (in Baltic and Slavic) or the partitive (Finnic) case marking differentiates between high transitivity and low transitivity objects, or ‘total’ vs. ‘partial’ objects. The details and the relevance of this distinction are subject to a considerable crosslinguistic variation, where Finnic and Standard Russian show the most vs. the least grammaticalized system.
The factors underlying this distinction include polarity of the clause (affirmative vs. negative), as in example (1) from Polish, aspect, affectedness of the object, etc. (see also Partitives).
(1) Polish
    a. Widzi-my      gwiazd-ę.
       see.PRES-1PL  star-ACC
       ‘We see a/the star.’
    b. Nie  widzi-my      gwiazd-y.
       NEG  see.PRES-1PL  star-GEN
       ‘We do not see a/the star.’
2. Less canonical subjects can sometimes be marked with the genitive (in Baltic and Slavic) or the partitive (Finnic) case. Again, the most grammaticalized system is found in Finnic, where the partitive case consistently marks subjects in existential clauses, as in example (2) from Finnish, and in semantically related clause types. The partitive case is particularly preferred if the clause is negated and the subject refers to a quantitatively nondelimited entity. Roughly the same two conditions govern the choice between the nominative and the genitive case marking on Baltic and Slavic existential subjects, but on a significantly more restricted scale, with the most grammaticalized distinction found in northwestern Russian.
(2) Finnish
    a. Kirj-at      o-vat        pöydä-llä.
       book-NOM.PL  be.PRES-3PL  table-ADESS
       ‘The books are on the table.’
    b. Pöydä-llä    o-n          kirj-oja.
       table-ADESS  be.PRES-3SG  book-PART.PL
       ‘There are (some) books on the table.’
3. In several constructions in Finnic, Baltic and Northern Russian (both Old Northern Russian and modern northwestern dialects), the object appears in the nominative case and not in the accusative. One common context, illustrated in example (3) from northwestern Russian dialects, is provided by an infinitival clause functioning as the subject of a necessitive matrix predicate. The Latvian correspondence to such clauses involves the so-called Debitive Mood, also with the object in the nominative. The set of nominative–object constructions differs greatly across the languages. Thus, while nominative objects in Northern Russian and Lithuanian are normally restricted to clauses with clearly nonfinite predicates (infinitives and converbs), Finnic requires, in addition, nominative objects in imperative clauses and clauses with impersonal passives. It has been suggested that the common denominator of all these contexts is their systematic lack of an overt personal subject (Timberlake, 1974).
(3) Northwestern Russian dialects
    a. Topim         pečk-u.
       heat.PRES.1PL oven-ACC
       ‘We are heating the oven.’
    b. Nado/             Pora        topit’    pečk-a.
       it-is-necessary/  it-is-time  heat.INF  oven-NOM
       ‘It is necessary/It is time to heat the oven.’
4. Finnic, Baltic, and Slavic display double (or sometimes multiple) options in the case marking of predicate adjectives and nominals. The choice between the nominative and some oblique case, e.g., the instrumental in example (4) from Lithuanian, can roughly be described as correlating with the distinction between time-stable and temporary situations respectively (Stassen, 2001).
(4) Lithuanian
    a. Jis     yra  mokytoj-as.
       he.NOM  is   teacher-NOM
       ‘He is a teacher.’
    b. Jis     buvo     mokytoj-u.
       he.NOM  was.3SG  teacher-INSTR
       ‘He was (working as) a teacher.’
5. In Finnic and most Slavic, most cardinal numerals higher than one alternate between case-governing and agreeing with their complements under well-defined syntactic conditions. As example (5) from Russian shows, when the numeral is in one of the direct cases (nominative or accusative), the complement appears in the genitive (or in the partitive in Finnic). Otherwise, both the numeral and the complement are in the same case (here, in the instrumental). In Baltic, the two properties are primarily associated with different sets of numerals, which is more common both crosslinguistically and within Indo-European.
(5) Russian
a. Ja vižu pjat’ stakan-ov.
   I.NOM see.PRES.1SG five.NOM/ACC glass-GEN.PL
   ‘I see five glasses.’
b. Ja priš-l-a s pjat’-ju stakan-ami.
   I.NOM come-PAST-F.SG with five-INSTR glass-INSTR.PL
   ‘I came with five glasses.’
6. Estonian, Livonian, Latvian and Lithuanian use special ‘evidential’ verb forms for marking that the speaker’s factual claims are based on indirect evidence, rather than on direct, or attested, evidence (evidential, quotative, relative or oblique mood). These are basically nonfinite verb forms, primarily participles, sometimes ‘frozen’ in an oblique case: thus, -vat in example (6) from Estonian is historically the partitive case form of the present participle.
(6) Estonian
Sina rääki-vat saksa keel-t.
you(SG).NOM speak-QUOT German language-PART
‘You are said to speak German; they say you can speak German.’
An isogloss connecting Scandinavian, Baltic, and East Slavic languages is the expression of certain verbal voice functions (reflexive, reciprocal, anticausative, passive) by means of verbal postfixes, i.e., affixes in the last position of a word, following, e.g., tense/ aspect and agreement markers. These affixes have all developed due to coalescence between the main verb and permutable reflexive and reciprocal pronouns (-s/-st in Scandinavian, -s in Baltic, and -s’/-sja in East Slavic).
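The numeral alternation described in point 5 is rule-like enough to be sketched in a few lines of Python. The sketch below encodes only the simplified statement given there: a numeral in a direct case governs the genitive (partitive in Finnic) on its complement, and otherwise numeral and complement agree. The function name `complement_case`, the case labels, and the `family` parameter are hypothetical illustrations, and the sketch deliberately ignores complications such as lower numerals, the number of the complement, and the Baltic pattern.

```python
# Illustrative sketch only: `complement_case`, the case labels, and the
# `family` argument are hypothetical names chosen for this example.
DIRECT_CASES = {"NOM", "ACC"}


def complement_case(numeral_case: str, family: str = "Slavic") -> str:
    """Return the case of a numeral's complement under the simplified rule."""
    if numeral_case in DIRECT_CASES:
        # Direct case: the numeral governs its complement.
        return "PART" if family == "Finnic" else "GEN"
    # Oblique case: numeral and complement agree.
    return numeral_case


print(complement_case("ACC"))            # government, the pattern of (5a)
print(complement_case("DAT"))            # agreement in an oblique case
print(complement_case("NOM", "Finnic"))  # Finnic uses the partitive
```

Keeping the rule as a single conditional mirrors the way the text states it: the only input that matters is whether the numeral itself stands in a direct case.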
Conclusion
Significantly, there are no isoglosses covering all the CB languages; moreover, the isoglosses pick up different subsets of the languages, in many cases also extending outside of the CB area proper. In the CB area, convergence works primarily on a microlevel and reflects language contacts of groups of people and, maximally, of two or three languages. Convergence comprising more than two or three languages seems always to be the result of the overlapping and superposition of different language contacts. In some respects, the CB region forms a border zone between the Central Eurasian languages in the east and the Standard Average European languages in the west.
See also: Belarus: Language Situation; Case; Contact-Induced Convergence: Typology and Areality; Denmark: Language Situation; Evidentiality in Grammar; Finland: Language Situation; Germany: Language Situation; Jakobson, Roman (1896–1982); Latvia: Language Situation; Lithuania: Language Situation; Norway: Language Situation; Numerals; Partitives; Poland: Language Situation; Sweden: Language Situation; Tone: Phonology.
Bibliography
Dahl Ö & Koptjevskaja-Tamm M (1992). Language typology around the Baltic sea: a problem inventory. Stockholm: Papers from the Institute of Linguistics University of Stockholm (PILUS).
Dahl Ö & Koptjevskaja-Tamm M (eds.) (2001). The Circum-Baltic languages: typology and contact. Studies in Language Companion Series (SLCS). Amsterdam/Philadelphia: John Benjamins.
Décsy G (1973). Die linguistische Struktur Europas. Vergangenheit–Gegenwart–Zukunft. Wiesbaden: Otto Harrassowitz.
Falkenhahn V (1963). ‘Die Bedeutung der Verbalrektion für das Problem eines litauisch-polnischen Sprachbundes.’ Zeitschrift für Slawistik 6, 893–907.
Haarmann H (1970). Die indirekte Erlebnisform als grammatische Kategorie. Eine eurasische Isoglosse. Wiesbaden: Otto Harrassowitz.
Haarmann H (1976). Aspekte der Arealtypologie. Die Problematik der europäischen Sprachbünde. Tübingen: Narr.
Jakobson R (1931 (1971)). ‘Über die phonologischen Sprachbünde.’ In Jakobson R (ed.) Selected writings, 1. The Hague: Mouton. 137–143.
Koptjevskaja-Tamm M & Wälchli B (2001). ‘The Circum-Baltic languages: An areal-typological approach.’ In Dahl & Koptjevskaja-Tamm (eds.). v.2. 615–750.
Lehiste I (1988). Lectures on language contact. Cambridge, MA: MIT Press.
Mathiassen T (1985a). ‘A discussion of the notion “Sprachbund” and its application in the case of the languages in the eastern Baltic area.’ International Journal of Slavic Philology 21/22, 273–281.
Mathiassen T (1985b). Slavisk-baltisk-østersjöfinske syntaktiske isoglosser og spørsmålet om et Sprachbund i den østlige del av østersjøområdet. X Nordiska Slavostmötet 13-17 augusti 1984. Åbo: Research Institute of the Åbo Akademi Foundation.
Nau N (1996). ‘Ein Beitrag zur Arealtypologie der Ostseeanrainersprachen.’ In Boretzky N (ed.) Areale, Kontakte, Dialekte, Sprachen und ihre Dynamik in mehrsprachigen Situationen. Bochum: Brockmeyer. 51–67.
Sarhimaa A (1999). Syntactic transfer, contact-induced change, and the evolution of bilingual mixed codes. Focus on Karelian-Russian language alternation. Helsinki: Finnish Literature Society.
Stassen L (2001). ‘Nonverbal predication in the Circum-Baltic languages.’ In Dahl & Koptjevskaja-Tamm (eds.). 569–590.
Stolz T (1991). Sprachbund im Baltikum? Estnisch und Lettisch im Zentrum einer sprachlichen Konvergenzlandschaft. Bochum: Brockmeyer.
Timberlake A (1974). The nominative object in Slavic, Baltic, and West Finnic. München: Sagner.
Wiemer B (2004). ‘Population linguistics on a micro-scale: lessons to be learned from Baltic and Slavic dialects in contact.’ In Kortmann B (ed.) Dialectology meets typology. Berlin/New York: Mouton de Gruyter. 497–526.
Cladistics
A McMahon, University of Edinburgh, Edinburgh, UK
R McMahon, Western General Hospital, Edinburgh, UK
© 2006 Elsevier Ltd. All rights reserved.
Cladistics is an approach to classification initially introduced for animals and plants. When we consider how close two or more species are to one another, ‘close’ can mean two different things. In a phenetic classification, what matters is surface similarity: this might mean what animals look like, or how they behave. Phenetic methods seek to measure the distance between species, regardless of cause. Cladistic classifications, however, are based on identifying natural groups, which share characters derived from a common ancestor. Cladistics therefore seeks to recover the shared history of a group of organisms. Three dangers occur in cladistic classification that are less problematic for phenetics.
• The same character might arise in two or more species by convergent evolution: often, this means the same environmental pressures have led to parallel solutions. Animals with white coats in winter do not necessarily share a single, recent common ancestor; but they probably live in similarly cold conditions for part of the year. Several different mutations may also lead to the same endpoint in different species, especially if systems are intrinsically limited in permitted variability: in DNA, there are only four bases (A, C, G, and T), so that over a long period, multiple mutations at the same site may well result in a C, regardless of starting states and histories. Both forms of convergence are known as homoplasy.
• Contemporary species may also exhibit shared retentions from a distant common ancestor. If several groups have lost a feature, it may be taken as indicating greater closeness than is historically the case between the groups that happen to retain it. For example, egg-laying has been retained in both monotremes (like the duck-billed platypus) and birds; but this tells us nothing about the closeness of the historical relationship between the two groups. Monotremes, historically speaking, are definitely mammals, though other mammals no longer lay eggs.
• The final difficulty does not generally hold at the species level, but at the population level. Within a species, interbreeding between groups and consequent genetic interchange can lead to shared features, though these are a result of recent history, not signals of common ancestry.
Cladistics, because it seeks explicitly to recover the histories of groups, cannot treat all features as equal, since it would then be calculating distance only, and would fall prey to the difficulties outlined above. Instead, it is essential to identify and prioritize shared, derived features, or synapomorphies.
Both cladistic and phenetic methods can be relevant to language. In constructing family trees, we are doing cladistics, since here we attempt to prioritize shared, derived linguistic features. Typological comparison, on the other hand, is essentially phenetic, if the distribution rather than the source of features is the main issue. When conducting cladistic language classifications, it is important to remember the three problems raised above for biological cladistics, since in language too we face homoplasy, or parallel developments; shared retentions in some languages that have been lost in others; and borrowing. It follows that linguistics may also benefit from the investigation and adoption of computational methods
from biology that seek to assist in determining best trees, and in diagnosing features that are inconsistent with such trees (Ringe et al., 2002; McMahon and McMahon, 2003).
See also: Contact-Induced Convergence: Typology and Areality; Historical and Comparative Linguistics in the 19th Century; Language Change and Language Contact.
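The contrast drawn in this entry between phenetic distance and shared derived features can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the three taxa (or languages) `A`, `B`, and `C`, their binary character codings, and the convention that `0` marks the ancestral state and `1` a derived state. In real work, deciding which state is ancestral is itself a hard analytical step, and the count of shared derived states remains vulnerable to homoplasy and borrowing.

```python
# Toy illustration with invented data: 0 = ancestral state, 1 = derived state.
ANCESTRAL = 0

taxa = {
    "A": [1, 0, 0, 0, 0],  # shares one innovation with B, otherwise conservative
    "B": [1, 1, 1, 1, 1],  # shares that innovation, plus four of its own
    "C": [0, 0, 0, 0, 0],  # retains the ancestral state throughout
}


def phenetic_distance(x, y):
    """Count characters on which two taxa differ, whatever the cause."""
    return sum(a != b for a, b in zip(x, y))


def shared_derived(x, y):
    """Count characters on which both taxa show the same derived state."""
    return sum(a == b and a != ANCESTRAL for a, b in zip(x, y))


for left, right in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(left, right,
          "distance:", phenetic_distance(taxa[left], taxa[right]),
          "shared derived:", shared_derived(taxa[left], taxa[right]))
```

On this toy data the distance measure places A nearest to C, because the two agree on four shared retentions, whereas the only shared innovation links A with B. That is the sense in which a cladistic grouping cannot treat all features as equal.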
Bibliography
Lass R (1997). Historical linguistics and language change. Cambridge: Cambridge University Press.
McMahon A & McMahon R (2003). ‘Finding families: Quantitative methods in language classification.’ Transactions of the Philological Society 101, 7–55.
Ridley M (1986). Evolution and classification: The reformation of cladism. London: Longman.
Ringe D, Warnow T & Taylor A (2002). ‘Indo-European and computational cladistics.’ Transactions of the Philological Society 100, 59–129.
Skelton P & Smith A (2002). Cladistics: A practical primer on CD-ROM. Cambridge: Cambridge University Press.
Clark, Herbert H. (b. 1940)
P H Portner, Georgetown University, Washington, DC, USA
© 2006 Elsevier Ltd. All rights reserved.
Herbert H. Clark has been one of the most important psychologists within the community of linguists, psychologists, and other cognitive scientists interested in language use. His influential research, pursued with a number of colleagues over the years, has focused on the role of shared knowledge in linguistic communication and has been extremely important within his core fields of psychology of language and psycholinguistics. In addition, connections with Stalnaker’s notion of Common Ground and analysis of presupposition (Stalnaker, 1974, 1978) helped his work to become influential in semantics and pragmatics (e.g., Heim, 1982); meanwhile, his interest in detailed analyses of interaction brought his work into close contact with discourse analysis.
Clark did his undergraduate work at Stanford and his graduate work at Johns Hopkins, receiving his Ph.D. in 1966. He briefly taught at Carnegie Mellon before returning to Stanford in 1969, where he has been Professor of Psychology since 1969. He has spent time as a visiting scholar at University College, London, and the Max Planck Institute for Psycholinguistics in Nijmegen.
See also: Pragmatics: Overview; Psycholinguistics: Overview.
Bibliography
Clark H H (1977). ‘Bridging.’ In Johnson-Laird P N & Wason P C (eds.) Thinking: readings in cognitive science. London, New York: Cambridge University Press.
Clark H H (1979). ‘Responding to indirect speech acts.’ Cognitive Psychology 11, 430–477.
Clark H H (1992). Arenas of language use. Chicago: University of Chicago Press.
Clark H H & Brennan S A (1991). ‘Grounding in communication.’ In Resnick L B, Levine J M & Teasley S D (eds.) Perspectives on socially shared cognition. Washington: APA Books.
Clark H H & Clark E V (1977). Psychology and language: an introduction to psycholinguistics. New York: Harcourt, Brace, Jovanovich.
Clark H H & Fox Tree J E (2002). ‘Using uh and um in spontaneous speech.’ Cognition 84, 73–111.
Clark H H & Haviland S E (1977). ‘Comprehension and the given-new contrast.’ In Freedle R (ed.) Discourse production and comprehension. Norwood, NJ: Ablex.
Clark H H & Marshall C R (1981). ‘Definite reference and mutual knowledge.’ In Joshi A K, Webber B L & Sag I A (eds.) Elements of Discourse. Cambridge: Cambridge University Press.
Clark H H & Schaefer E F (1987). ‘Collaborating on contributions to conversation.’ Language and Cognitive Processes 2, 19–41. [Reprinted in Dietrich R & Graumann C F (eds.) (1989). Language processing in a social context. Amsterdam: North Holland.]
Clark H H & Wilkes-Gibbs D (1986). ‘Referring as a collaborative process.’ Cognition 22, 1–39. [Reprinted in Cohen P R, Morgan J L & Pollack M E (eds.) (1990). Intentions in communication. Cambridge: MIT Press.]
Heim I (1982). The semantics of definite and indefinite noun phrases. Ph.D. diss., University of Massachusetts at Amherst. [Published 1988, New York: Garland.]
Stalnaker R (1974). ‘Pragmatic presupposition.’ In Munitz M & Unger P (eds.) Semantics and philosophy. New York: New York University Press. 197–213.
Stalnaker R (1978). ‘Assertion.’ In Cole P (ed.) Syntax and semantics 9: pragmatics. New York: Academic Press. 315–332.
Class Language
A Luke, Nanyang Technological University, Singapore
P Graham, University of Waterloo, Canada
© 2006 Elsevier Ltd. All rights reserved.
The relationship between language and social class is a key theoretical and empirical issue in critical discourse studies, ethnography of communication, and sociolinguistic research. It has been a focal point for postwar and current policy in language planning, and language and literacy education. The central questions of a class analysis of language were stated in Mey’s (1985) proposal for Marxian pragmatics: ‘Whose language’ counts? With what material and social consequences? For which communities and social groups? Central concerns are how language factors into the intergenerational reproduction of social and economic stratification, and how communities, families, schools, the media, and governments contribute to ‘‘linguistic inequality’’ (Hymes, 1996). Yet current research continues to table and debate contending definitions of language and social class as social and economic phenomena. Marx viewed language as an intrinsic characteristic of human ‘species being,’ as a form of mental and material labor. The ‘‘language of real life,’’ he argued, is ‘‘directly interwoven’’ with ‘‘material activity and. . . mental intercourse’’ (Marx and Engels, 1845/1970: 118). This ‘‘mental production’’ is ‘‘expressed in the language of politics, laws, morality, religion, metaphysics etc. of a people.’’ ‘Sense experience,’ the work of the eye and ear, was the basis not only of science, but of communal and social life (Marx, 1844/1964: 160–166). At the same time, Marx’s (Marx and Engels, 1845/1970: 37) classical definition of ideology as a ‘camera obscura’ established the centrality of language in the distortion and misrepresentation of social and economic reality in social class interests (see Marxist Theories of Language). Marxist theory establishes three critical traditions in the analysis of language and class. 
These are: (1) the analysis of language as a form of class-based social action and consciousness; (2) the analysis of social class and linguistic variation; and (3) the analysis of language as the medium for power and control, ideology, and truth in specific linguistic and capital markets (see Power and Pragmatics).
Language as Social Action and Class Consciousness
The prototypical class analysis of language was undertaken by Voloshinov (1973) (see Voloshinov, V. N. (ca. 1884/5–1936)). Language was conceptualized as a marker of class consciousness and a medium of class struggle. According to models of heteroglossia and ‘multivocality,’ each utterance and text is a revoicing of previous historical speakers and writers. The ideological content and social functions of each speech act or speech genre bear their own material historical origins. That is, they are produced and reproduced by and through social and economic ‘‘conditions of production’’ (Fairclough, 1992). By this account, face-to-face language exchanges are instances of class conflict and ideological difference, where class-located social actors bring to bear distinctive material interests and discourse positions. The point of such analysis is to extend the notion of the situated speaking and writing subject to a closer sociological and economic analysis of that positionality. Contemporary work in critical discourse analysis supplements class analysis with attention to the linguistic construction of gender, race, sexual preference, and other forms of social identity, ideology, and position (e.g., Lemke, 1995). If utterances and their use are indexical of ideology and social class consciousness, what might this mean for differing cultural groups, communities, and their historical practices? Following Vygotsky (see Vygotskii, Lev Semenovich (1896–1934)), Luria (1982) argued that the cognitive uses of the ‘tool’ of language were mediated by one’s social relations, cultural practices, and material conditions. In his studies of the Uzbeks, Luria made the claim that particular forms of cognition and consciousness, what Marx referred to as capacity for the ‘‘production of ideas,’’ were linked to cultural practices and material conditions of tool use. The cognitive affordances of language and literacy are mediated by material economic and social conditions, including class location and cultural history.
In contemporary literacy theory, Paulo Freire also argued for the direct links between language and social class consciousness. Freire’s (1972) prototypical work was concerned with the effects of literacy education upon the language and consciousness of the indigenous population and peasantry of postwar Brazil. Bringing together Marxist dialectics with liberation theology, he argued that autocratic governments and education systems constituted ‘‘cultures of silence’’ where marginalized populations were educated in ways that misnamed and misrecognized the world. Freire’s work views ideologically distorted language as a mode of class-based false consciousness. For Freire, critique of class consciousness was achieved through an educational process of ‘renaming’
the world in ways that demystified power, consciousness, and life, a similar agenda to that of Mey (1985), Chouliaraki and Fairclough (2001) and other contemporary critical linguists. Current agendas for the teaching of ‘critical literacy’ and critical discourse analysis stand in this Marxist tradition, focusing on the demystification, critique, and reconstruction of ideological language (Luke, 2004).
Linguistic Variation and Social Class
A further concern in the analysis of language and social class is how language variation acts as a marker and instrument of social class, and of racial and other forms of social stratification. A principal concern of sociolinguists in the postwar period has been the effects of the differential and inequitable spread of economic and social capital on language-minority, postcolonial, and economically marginal communities (Hymes, 1996) (see Minority Languages: Oppression). The postwar origins of language planning reflect the impact of colonization, decolonization, migration, and geopolitical conflict upon linguistic retention and stability. The flow of global, regional, and national capital visibly impacts upon language loss, use, and retention (Pennycook, 1998). In the postwar period, sociolinguistic and language planning research has engaged with the effects of the unequal spread and distribution of economic capital upon language loss. Yet attempts to theorize and empirically describe the complex reproductive relationships of language and social class have been debated and contested (see Linguistic Decolonization). The sociological and sociolinguistic research on U.K. children by Basil Bernstein and colleagues (e.g., Bernstein, 1975) took up this challenge. This work provided an account of the role of language in the institutional production of stratified levels of educational achievement. Bernstein’s argument was that working-class students spoke a ‘‘restricted code,’’ characterized by embedded and literal meanings, limited command of deixis, and thresholds in technical complexity. Middle-class children, he argued, mastered an ‘‘elaborated code’’ which was fitted for educational success and mastery of academic and scientific discourses. These, he argued, were tied to particular forms of early childhood language socialization and family structure (Bourdieu and Passeron, 1992).
Bernstein’s work was the object of several decades of controversy. Labov’s (2001) studies of urban African-American language registers and nonstandard dialects, and Heath’s (1983) studies of early class-based language socialization made the case against models of linguistic deficit. Bernstein’s model has been defended by systemic functional
linguists, who argue that there are indeed elaborated technical registers and contents, specific language domains affiliated with power, some of which particular social classes make explicitly available in early language socialization and educational training (Hasan and Williams, 1996) (see Codes, Elaborated and Restricted (Bernstein)). As distinctive sociodemographic speech communities, particular social classes may indeed have different speech patterns, varying in lingua franca, register, dialect, accent, and practices of diglossia (see Multilingualism: Pragmatic Aspects; Register: Overview). These, further, are affiliated with class-based social ideologies and cultural practices (Fishman, 1991). Ethnographic studies have shown how these variations are made to count in local social networks and institutions (Milroy, 1987). But the social and cultural bases and material consequences of such differences remain localized and contentious. To move past descriptive claims requires a broader sociological theory of social class, of ‘‘linguistic markets’’ (Mey, 1985; Bourdieu, 1992), and of changing media and modes of production and information.
Language as Capital in Linguistic Markets
Classical sociological definitions of social class begin from conceptions of structural economic location and material position. They attempt to define position and power vis-à-vis dominant means of production. The tendency of Marxist models is to further affiliate social class with particular ‘class consciousness,’ of which language, its use and its expression, is a constitutive speech marker. Bourdieu’s (1992) sociology began from a view that class position is at least in part structurally determined. But it is also embodied by human subjects in their ‘habitus,’ the sum total of socially acquired dispositions. By this account, the bodily performance of linguistic competence is an element of cultural capital. This capital and affiliated forms of embodied taste, style, and ideology, constitute a key marker of one’s social class position and mobility. Linguistic capital is deployed in specific social fields, which constitute ‘linguistic markets.’ Each market, each institutional context, in turn has variable rules and conventions of exchange whereby linguistic competence and literate proficiency in specific languages is valued or not. There, language use – as class marker and tool – has exchange value and power only in relation to other forms of capital, including social capital (e.g., networks, institutional access), economic capital, formal institutional credentials, artifacts, and so forth.
430 Class Language
This is a more complex view of the relationship of language and social class. Language matters, as a primary marker of class, gender, culture, training, education, networks, traditions, and ideological consciousness. Yet like post-structuralist theories of discourse, Bourdieu’s model viewed language not just as an index or marker of class position, but as reflexively constituting position and identity, power, and categorical social status. In this way, how language marks class, capital, and power is sociologically contingent, rather than determined by the characteristics of linguistic code or class position per se, or the ostensive power of any given utterance, genre, or text (Luke, 1996). This contingency is dependent on the availability of other forms of capital, and the variable, historically shifting norms, rules, and conventions of particular social institutions, fields of knowledge, and linguistic markets (see Discourse, Foucauldian Approach; Poststructuralism and Deconstruction; Structuralism).
Current Issues
One of the principal critiques of postwar sociolinguistics, ethnography of communication, and functional linguistics was that they lacked a sufficient analysis of power, capital, and conflict. Indeed, sociolinguistic models of ‘social context,’ ‘context of situation,’ and ‘social network’ are often based on structural functionalist models of society and culture. The study of language and social class requires the rigorous analysis of social and economic relations within and between speech communities. Current work on language and social class continues to examine how language represents class consciousness, how it is implicated in ongoing issues of class conflict and cohesion, and how its acquisition and use are central to intergenerational production of social stratification of material and discourse resources. Language variation, diversity, change, and ideology can be systematically linked to social class. Linguistic performance in text and discourse production does indeed have both symbolic and material exchange value, particularly in service- and information-based economies. But this depends upon the complex local economic and institutional formations of particular speech communities. Bourdieu offered a stronger analytic frame for analyzing how language ‘counts’ in specific institutional, disciplinary, and knowledge fields, and everyday social contexts. He suggested that issues around ‘whose language’ counts, which classes have power, require a rigorous socioeconomic analysis. Particularly under conditions of late capitalism and globalization, these sociological and sociolinguistic contexts and conditions are under considerable historical transition and challenge.
As the medium of consciousness and labor, language is entailed in the production of ideology, material goods, and social relations. The move in globalized economies towards information- and discourse-based forms of labor raises a number of key challenges to linguistic and ethnographic studies. First, linguistic, semiotic, and discourse competence will have increased significance in productive labor and consumption, shifting social class relations to means of production. Second, social class location and membership is determined by relations to dominant modes of communication, semiosis, and information (Castells, 2000) as much as it might be defined in classical Marxist terms. Finally, the formation of social class identity, ideology, and speech community have become more complex. They are now strongly influenced by forces of mass culture, media, and globalized information flows. One of the principal claims of post-structuralist and postmodern theory of the past decade has been a breakdown of essential relationships between discourse and social class as a primary analytic category. It is increasingly difficult to analyze social class formation without due consideration of the complexity of cultural, racial, gender, and religious identity and position. Any analysis of language and social class must engage with this complexity. But indeed, any contemporary analysis of class and intersecting categories must engage with the constitutive place of language, text, and discourse. See also: Codes, Elaborated and Restricted (Bernstein); Critical Discourse Analysis; Discourse, Foucauldian Approach; Language Planning and Policy: Models; Linguistic Decolonization; Marxist Theories of Language; Minority Languages: Oppression; Multilingualism: Pragmatic Aspects; Poststructuralism and Deconstruction; Power and Pragmatics; Structuralism; Voloshinov, V. N. (ca. 1884/5– 1936); Vygotskii, Lev Semenovich (1896–1934).
Bibliography
Bernstein B (1971). Class, codes and control: theoretical studies towards sociology of language (Vol. 1). London: Routledge.
Bourdieu P (1991). Language and symbolic power. Cambridge, MA: Harvard University Press.
Bourdieu P & Passeron J C (1992). Reproduction in education, culture and society (2nd edn.). Beverly Hills, CA: Sage.
Castells M (2000). The rise of the network society (2nd edn.). Oxford: Blackwell Publishers.
Chouliaraki L & Fairclough N (1999). Discourse in late modernity: rethinking critical discourse analysis. Edinburgh: Edinburgh University Press.
Fairclough N (1992). Discourse and social change. Cambridge, UK: Polity Press.
Fishman J A (1991). Reversing language shift: theoretical and empirical foundations of assistance to threatened languages. Clevedon, UK: Multilingual Matters.
Freire P (1972). Pedagogy of the oppressed. Ramos M B (trans.). London: Penguin Books.
Hasan R & Williams G (eds.) (1996). Literacy in society. London: Longman.
Heath S B (1983). Ways with words: language, life, and work in communities and classrooms. Cambridge/New York: Cambridge University Press.
Hymes D (1996). Ethnography, linguistics, narrative inequality: toward an understanding of voice. London: Taylor & Francis.
Labov W (2001). Principles of linguistic change: social factors (Vol. 2). Oxford: Blackwell Publishers.
Lemke J (1995). Textual politics: discourse and social dynamics. London: Taylor & Francis.
Luria A R (1982). Language and cognition. New York: John Wiley.
Luke A (1996). ‘Genres of power? Literacy education and the production of capital.’ In Hasan & Williams (eds.) Literacy in society. London: Longman. 308–338.
Luke A (2004). ‘Two takes on the critical.’ In Norton B & Toohey K (eds.) Critical pedagogies and language learning. Cambridge: Cambridge University Press. 21–29.
Marx K (1844/1964). Karl Marx: early writings. Bottomore T B (trans. & ed.). New York: McGraw-Hill.
Marx K & Engels F (1845/1970). The German ideology. London: Lawrence & Wishart.
Mey J L (1985). Whose language: a study in linguistic pragmatics. Amsterdam: John Benjamins.
Milroy L (1987). Language and social networks (2nd edn.). Oxford: Blackwell.
Pennycook A (1998). English and the discourses of colonialism. London: Routledge.
Voloshinov V N (1973). Marxism and the philosophy of language. Matejka L & Titunik I R (trans.). Cambridge, MA: Harvard University Press.
Classical Antiquity: Language Study
D J Taylor, Lawrence University, Appleton, WI, USA
© 2006 Elsevier Ltd. All rights reserved.
As is the case with so many other Western intellectual endeavors, the formal study of language begins in ancient Greece and is transmitted to the modern world by the Romans, but only after the latter have creatively adapted and transformed the linguistic legacy bequeathed to them by their Greek predecessors. The original historical context is the dynamic symbiosis of philosophical speculation and literary experimentation that characterized and helped to define the golden age of classical Greece, but that context changed dramatically over the centuries and ranged from the scholarly ambience of Alexandria to the bilingual reality of Constantinople and from the fledgling philology of early Rome to the monumental tradition of the late Latin artes grammaticae. Graeco–Roman language science features numerous noteworthy accomplishments: the obligatory inclusion of vowels in the alphabetic inventory, an achievement of momentous import for the development of literature; the enumeration of four illocutionary forces, what they called sentence-types; the invention of and reliance upon the four pathē or transformations (addition, deletion, substitution, permutation) as both heuristic procedures and explanatory devices; the coining of an entire metalanguage (much of which is still in use) to refer to the hundreds of grammatical phenomena they discovered; the canonical definitions of the parts of speech; carefully composed and cogent arguments on the arbitrariness of language and the relationships between signifiant and signifié; meticulous and copious descriptions of accidence including, in the case of the Romans, the discovery and identification of the declensions and conjugations that form the keystone of all subsequent Latin grammatical treatises and textbooks; the publication and widespread dissemination of linguistic knowledge throughout both the Greek and Roman worlds via grammatical manuals (technai) and tomes (artes); and in general the conscious acknowledgment of, and emphasis upon, the central and ubiquitous role of grammar in education and intellectual discourse as well as an unremitting insistence upon scientific rigor in linguistic exegesis. So classical linguistics is much more familiar to modern language scientists than might at first be supposed.
Historiographical Problems
The lamentable loss of so much of ancient grammatical literature has made it difficult to chronicle accurately the course of classical language science: no Stoic linguistic treatise is extant; the dating and authorship of the only surviving Alexandrian grammatical manual have been questioned for centuries; almost all of ancient Rome’s early scholarly forays into the study of the Latin language, i.e., Aelius Stilo’s
linguistic analyses of archaic documents, Varro’s voluminous corpus, and Palaemon’s grammar, are lost; and in extant Roman artes, references to otherwise unknown grammars and grammarians are all too frequent. The tendency to ascribe various notions and developments to nonexistent texts and anonymous authors is therefore one of several historiographical problems posed by the vast lacunae in our knowledge. Dichotomies of one sort or another for which there is little evidence abound in the secondary literature – nature and convention, analogy and anomaly, philosophical and technical grammar – and are adduced to provide theoretical links between extant and lost texts in order to chart a cumulative course for the history of grammatical thought in antiquity. More recent research, however, has concentrated exclusively on extant texts and what can be known for sure, has uniformly eschewed weighty and sweeping generalizations, and has concluded that the history of Graeco–Roman linguistics is for the most part a discontinuous one. Episodic such accounts may be, but they are at least faithful to the historical record. Despite the many problems, readable and informative surveys of classical linguistics do exist, e.g., Della Casa (1973), Pinborg (1975), Baratin & Desbordes (1981), Hovdhaugen (1982), Robins (1990), Schmitter (1991), Law (2003). Such surveys readily reveal that linguistic information can be found almost anywhere, for philosophers, logicians, rhetoricians, historians, philologists, literary critics, even poets, as well as bona fide grammarians, contribute to formulating ancient language science. Moreover, Greeks and Romans neither compartmentalize knowledge nor always broker out grammatical phenomena as we do. 
They do, however, create a vast array of technical terms, but that terminology is, for much of the time, in the process of evolving and does not become fixed until late in the tradition; consequently many authors employ the language of everyday intellectual discourse, often embellished by notable metaphors, rather than an established metalanguage when articulating their linguistic theories and practices. So both the availability of texts and the contents of those texts pose historiographical problems of some magnitude. What the Greeks and Romans have to say about language and the study of language in those texts is, however, significant and ranks as one of the major intellectual legacies bequeathed to us by classical antiquity. They do, after all, construct the classical foundations of most subsequent language science, and so we can observe firsthand the Western world’s earliest attempts to describe scientifically and to study formally what is uniquely human, namely, language. We can also, of course, find striking anticipations of and
parallels with later, even modern, linguistics, but we can also encounter equally striking differences in both intellectual attitudes and approaches to the analysis of linguistic phenomena.
From the Origins to Plato and Aristotle
Including vowels in the Semitic alphabet that the Greeks imported and adapted was a stroke of genius and, inter alia, allowed the Greeks to commit Homer’s orally composed Iliad and Odyssey to writing. The two processes established the foundation of language science in Greek antiquity, and as a consequence literature enjoys forever a privileged status in the scientific study of language. As literature proliferates, 5th-century Greeks attend consciously to differences in dialect and meter and to errors in morphology (barbarisms) and syntax (solecisms). Protagoras initiates what becomes a hallmark of Greek linguistics, namely, taxonomy: he distinguishes three genders, three numbers, and probably the four pathē or transformations; he also identifies four sentence-types (wish or prayer, question, answer, command). Herodotus informs us of an Egyptian experiment to determine the original language, but both monogenetic and polygenetic theories on the origin of language(s) are known. Socrates addresses the existential and equational functions of the copula and the arbitrariness of the sign. The scores of etymologies and the explicit confrontation of conventional and natural explanations for names in the Cratylus have attracted far more attention than one would suppose, and the dialogue has its fair share of both critics and admirers. The term ptōsis (‘inflected form’ and later ‘case’) also appears, but the relationship between its etymological meaning (‘a fall or falling’) and its grammatical meaning is still puzzling. Plato analyzes and classifies the sounds of Greek reasonably effectively and also establishes once and for all the roles of NP and VP in linguistic thought, though the key passage (Sophist 261D–263D) is quite vexing. He begins by referring to onomata ‘nouns’ and rhēmata ‘verbs’ as individual entities but then abruptly treats them in combination as necessary sentential constituents, presumably therefore as ‘subjects’ and ‘predicates’; yet nowhere does he employ onoma as ‘subject’ or rhēma as ‘predicate.’ Nonetheless, he has both defined noun and verb and postulated a fundamental, binary, syntactic structure that necessarily includes the former in one part and the latter in the other. So by the mid-4th century, grammatical knowledge is fairly extensive but also fairly eclectic. The grammatical sketch in Aristotle’s Poetics (21) takes us into the realm of more formal linguistic analysis and probably into the schoolroom as well.
Aristotle reprises much of what we already know from Plato but organizes and augments it. He divides letters into vowels, semivowels, and consonants (mutes), and he correctly identifies aspirated stops for the first time. He then defines and describes syllable, conjunction, article, noun, verb, cases (inflected forms of both nouns and verbs), and discourse or speech. These contents clearly manifest an organizational scheme that ultimately becomes traditional in both grammar and education: a hierarchical arrangement of linguistic items (letters, syllables, words, sentences), an embryonic account of selected parts of speech (although, conspicuously, Aristotle does not use and presumably does not know that term), and a rudimentary description of nominal and verbal accidence. Aristotle affirms the conventional view of language, the uniqueness of human language (as opposed to animal sounds), and the priority of speech over writing; formulates more precise definitions by including the feature ‘minus tense’ in his definition of noun and ‘plus tense’ in that of the verb; defines metaphor as a semantic analogia ‘proportion, analogy’; provides a fairly sophisticated and semisyntactic treatment of solecisms; offers better-developed notions of subject and predicate; and formulates a list of ten predicate-types that is as exhaustive as it is ignored by later grammarians. Aristotle’s analysis of predicates may not be unrelated to his four parts of speech, for an Aristotelian categorical syllogism essentially requires only nouns, articles, verbs, and conjunctions (his rather open-ended category that includes logical connectors). Aristotle’s remarks on innateness strikingly anticipate modern theories that postulate a universal mental language common to all humankind. Greek language science has obviously progressed and expanded its scope of inquiry during the classical period.
Hellenistic Linguistics
The Hellenistic Age (3rd–1st centuries B.C.) features Stoic logic and Alexandrian philology: the former introduces what passes for syntax throughout most of classical antiquity, and the latter establishes literary criticism as the ‘noblest part’ of grammatical inquiry. Our major sources for Stoic logic, Diogenes Laertius and Sextus Empiricus, are both late (also confused and polemical respectively), and the Alexandrian Technē ascribed to Dionysius Thrax may not be completely authentic. It was, however, a remarkably productive period in the history of classical linguistics. Stoics divide their epistemology into logic (dialectic) and rhetoric and the former into the study of phōnē ‘sound’ and to sēmainomenon ‘the being signified.’ Stoics therefore define language science properly as the relationship between sound and meaning. Under phōnē they analyze sounds, letters, parts of speech, grammatical errors (barbarisms and solecisms), poetic diction, verbal ambiguities, euphony, and music; under to sēmainomenon they study case, voice, mood, and lekta ‘sayables.’ They increase the number of parts of speech by adding articles and maybe adverbs and by differentiating common and proper nouns, + or − case becomes a defining feature of the major parts of speech, the cases are identified and named, and grammatical terminology proliferates. Arguably the Stoics’ most sophisticated linguistic achievement is their analysis of aspect and tense in the Greek verb system, but where and exactly how they do so are still sub iudice. Stoic logic privileges to sēmainomenon, and the increase in the number of parts of speech is due to the emphasis on logic and lekta rather than to any cumulative process or more refined taxonomy. Lekta, the things signified, are to be distinguished from both the things signifying (sounds) and the things existing (referents); the former are incorporeal, but the latter two are corporeal. Referents aside, logicians often distinguish declarative statements, which can be written or spoken and read or heard and which are therefore corporeal, from propositions, which are what the statements mean or express and which are consequently incorporeal. Lekta are either complete or incomplete. Complete lekta are yes–no questions, questions requiring a declarative answer, commands, oaths, and, most significantly, propositions (axiōmata), i.e., statements that can be affirmed or denied. Stoic logic is therefore propositional logic, and Stoic syntax is the syntax of propositions. Our sources discuss the several types of axiōmata and their syntax at length. Incomplete or deficient lekta include predicates and, although no text explicitly says so, probably subjects. The Stoic analysis of predicates distinguishes clearly and unambiguously between the incorporeal predicate (katēgorēma) and the corporeal verb (rhēma) and accordingly treats the former under to sēmainomenon and the latter under phōnē. Both the structure and the metalanguage of Stoic logicolinguistic thought are remarkably consistent. The complete absence of original Stoic linguistic texts from the historical record is one of the most lamentable losses in intellectual history.
Alexandrian scholars are first and foremost philologists and literary critics: Aristophanes of Byzantium produces epoch-making editions of the Iliad and Odyssey; Aristarchus invents critical symbols for indicating doubts about the authenticity, repetition, or order of verses, and introduces respect for the authority of manuscript traditions into the history of
scholarship; and Dionysius Thrax becomes famous for his lectures on Nestor’s cup (Iliad 11.632–37). Numerous commentaries, lexicographical treatises, etc., facilitate and enhance the reading of the now critically edited texts of classical Greece’s monumental authors. Though no textbooks are extant, the study of language and literature is obviously at the core of the educational curriculum. Critical editions and ancillary scholarly texts presuppose a vast reservoir of grammatical knowledge, but whether such knowledge is ever pursued for its own sake is still in dispute. For example, Aristophanes supposedly authors a treatise entitled Peri Analogias and adduces five principles, to which Aristarchus adds a sixth, for determining analogy, and so some scholars argue that the Alexandrians actually formulate rules for inflection, declensions, conjugations, paradigms, etc. Yet no extant text contains anything of the sort, and so other scholars argue that the criteria for analogy are much more readily understandable as criteria for emending texts. As Aristotle uses analogia as a means of identifying and defining metaphor, so the Alexandrians employ it as a heuristic procedure for correcting poorly transmitted manuscripts. Moreover, no extant Hellenistic text, grammatical or otherwise, testifies to any sort of analogy/anomaly quarrel between Alexandrians and Stoics, as was once supposed. The Alexandrians do, however, dramatically expand the scope of grammatical inquiry in their drive to edit and comprehend literary texts, but unfortunately the one extant grammatical Technē that purportedly summarizes their accomplishments in language science raises more questions than it answers. Both the authorship and the dating of the Technē Grammatikē have been questioned for centuries (see Di Benedetto, 1958–59, 1973; and Law and Sluiter, 1995). Moreover, neither the Technē nor its author receives contemporary acclaim, and both are more influential later and elsewhere.
The text begins as follows (translation from Kemp, 1986): Grammar is the practical study of the normal usages of poets and prose writers. Its six divisions comprise: (1) skill in reading (aloud) with due attention to prosodic features; (2) interpretation, taking note of the tropes of literary composition found in the text; (3) the ready explanation of obscure words and historical references; (4) discovery of the origins of words; (5) a detailed account of regular patterns; and (6) a critical assessment of poems; of all that the art includes this is the noblest part.
Dionysius or whoever is clearly thinking of grammar in philological and literary terms. After four brief entries on the first topic, the text switches abruptly to an analysis of letters and syllables, defines word and sentence, then enumerates eight parts of speech, and thereafter devotes itself wholly to defining and exemplifying those parts of speech; none of this (except the four items on reading aloud) follows from any of the six divisions listed in the introductory paragraph. Scholars account for this inconcinnity of contents by assuming that the first five paragraphs are genuine and the remainder spurious. In any case, the contents of the slim volume are hardly exceptional, offering little or nothing beyond the ken of any competent Hellenistic grammarian. Aristotelian categorical and Stoic propositional syllogisms no longer determine the enumeration of the parts of speech; so Alexandrians are free to focus on the parts of speech (or, more properly, parts of the sentence) in normal literary discourse and therefore arrive at eight such items. The Technē’s parts of speech are noun (including both common and proper nouns), verb, participle, article, pronoun, preposition, adverb, conjunction, and this list becomes canonical. The author is addicted to semantic taxonomy – he classifies nouns into 24 subtypes, adverbs into 26, and conjunctions into eight, all on the basis of meaning – but he describes grammatical accidence rather skillfully. The plethora of grammatical terms is impressive: the tiny volume contains about 150 technical terms that will comprise almost the entire Western linguistic vocabulary until well into the 20th century. The definitions of the parts of speech include morphological, semantic, and, at least implicitly, syntactic criteria, but they suffice and so endure for centuries without much elaboration or alteration. In fine, Alexandrian scholarship privileges textual and literary criticism and the manifold linguistic issues ancillary to such endeavors, and if grammar is not yet autonomous, it is nearly so.
Language science now possesses an extensive metalanguage, linguistic levels are clearly demarcated, and the phonological and morphological parameters of linguistic inquiry are well established even if syntax still remains hidden in the dark domains of Stoic logic and/or solecisms (but see Swiggers and Wouters, 2003). The word-class system of grammar instituted by Stoic and Alexandrian language science readily transfers to the Roman world.
Linguistic Theory and Practice in Rome: Varro
Intellectual activity, along with military, political, and economic power, moves westward in the late 1st and 2nd centuries B.C. to Rome. Educated and bilingual Romans are aware of their literary and linguistic
debts to Greece, as early Latin literature abundantly attests. In 168 B.C. or so, Crates of Mallos, a Stoic philosopher, introduces Romans to the formal study of literature and linguistics. (Bits and pieces of his unconventional poetics are beginning to emerge from a papyrus roll discovered in the mid-18th century but only recently edited.) Early Roman language science, however, is decidedly Alexandrian in mode (editions of poetry, commentaries, etymological analyses, interpretations of hymns and legal documents, etc.) but also demonstrably practical and quintessentially Roman in purpose (orthography and spelling reform). Lucius Aelius Stilo (fl. 100 B.C.), Rome’s first philologist of note, teaches both Cicero and Marcus Terentius Varro (116–27 B.C.), and the latter becomes Rome’s most famous scholar. Varro’s intellectual curiosity is all-consuming and his learning vast – he authors at least 74 works – but dearest to his soul is the Latin language. Of the dozen or so major works he authors on language science, however, only six of the original 25 books, i.e., chapters, of his magnum opus, the De lingua Latina, remain. His eclectic work combines Greek and Latin linguistic thought into a distinctively Roman blend of language science. The books on etymological theory, morphological practice, and syntax are lost, but fragments suggest that Varro’s syntax is Stoic, i.e., it deals with propositions. Books 5–6 on etymological practice are an enormous reservoir of cultural artifacts testifying to what Romans think about their own language, and books 8–10 on morphological theory are replete with penetrating observations on the nature of language and linguistic inquiry. 
Varro’s accomplishments in the De lingua Latina are numerous: he addresses etymology from a distinctly scientific perspective; distinguishes, for the first time in ancient linguistic thought, between derivational and inflectional morphology; formulates the first embryonic declensions and conjugations in ancient language science; is the first and only grammarian in ancient times to apply abstract models to the articulation and solution of linguistic problems; so creatively adapts the Stoic analysis of the Greek verb to the Latin verb system that he becomes the only Latin grammarian prior to the Renaissance to recognize the future perfect indicative; and even includes syntax in his account of the Latin language. Such achievements arguably establish that Varro is classical antiquity’s premier linguistic theoretician. According to Varro’s word-based theory (cf. Taylor, 1974 and 1996), words are of two sorts: those that vary in form, and those that do not. Declinatio ‘morphological variation’ is a linguistic universal and is likewise of two sorts: declinatio voluntaria is derivational morphology, and declinatio naturalis is
inflectional morphology. Varro determines that Latin has four partes orationis ‘parts of speech’: words with case, words with tense, words with both, words with neither. How words vary inflectionally is determined by both their figura ‘phonological form’ and their materia ‘grammatical substance,’ and thus similitudo ‘linguistic similarity’ is a matter of both form and substance (but not meaning). The linguist’s task is to identify similitudo wherever he can, and analogia ‘proportion’ is his most useful heuristic procedure. Varro utilizes four arithmetical proportions as models of inflection, which allow him to discover both declensions and conjugations. Because he ignores vowel length, his five declensions are not the same as our five, and his conjugations are only three in number; nonetheless he succeeds in identifying, perfectly in theory and almost so in practice, the declensions and conjugations of Latin, and these form the core of Latin grammar forever. Varro’s status as a grammarian par excellence and the vir Romanorum eruditissimus is thus guaranteed, and it is no wonder that he is the most frequently cited source in the Roman grammatical tradition and ergo the first authorial figure in the history of Latin linguistics.
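Varro’s four partes orationis follow from crossing two binary features, case and tense. A minimal Python sketch of that cross-classification follows; the function name and the feature values assigned to the sample words are illustrative assumptions, not a reconstruction of Varro’s own analysis.

```python
def pars_orationis(has_case: bool, has_tense: bool) -> str:
    """Return the class of a word given two binary features (case, tense)."""
    if has_case and has_tense:
        return "words with both case and tense"
    if has_case:
        return "words with case"
    if has_tense:
        return "words with tense"
    return "words with neither"


# Hypothetical feature assignments for four Latin words (illustration only).
samples = {
    "docilis": (True, False),   # adjective: varies for case
    "docet": (False, True),     # finite verb: varies for tense
    "docens": (True, True),     # participle: varies for both
    "docte": (False, False),    # adverb: varies for neither
}

for word, (case, tense) in samples.items():
    print(word, "->", pars_orationis(case, tense))
```

Two yes/no features yield exactly four classes, which is why the scheme needs no further criteria such as meaning.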
Early Roman Imperial Grammar and the Alexandrian Renaissance
During the first two centuries A.D., Rome attracts scholars from all over its far-flung empire who publish extensively on lexicography, orthography, and metrics, but most such texts are lost. The pace of language science accelerates, grammars assume their canonical form, and the now prominent discipline garners attention from emperors and critics alike. Grammarians like Palaemon and Probus are able to specialize in the uninflected parts of speech and the verb, respectively, and to become sufficiently famous that later works are spuriously attributed to them. Palaemon substitutes the interjection for the article, which Latin lacks, in the Alexandrian list of partes orationis and is also credited with authoring the first Roman ars. Meanwhile, Greek scholars like Apollonius Dyscolus (cf. Blank, 1981 and Householder, 1981), his son Herodian, and Sextus Empiricus produce influential and innovative works. Apollonian language science dominates Byzantine linguistics, and Priscian’s Latin grammar (see below) is heavily dependent upon Apollonius and Herodian. The latter’s works have not survived independently but were copiously quoted by later grammarians. Moreover, Apollonius is the first grammarian in ancient times, Greek or Roman, whose book-length works, or at least four of them, have been preserved more or less in toto and also the first
formal syntactician in Graeco–Roman linguistics. Sextus, a physician and philosopher, is a skeptical critic not only of grammar, the first of the liberal arts, but of all the arts and sciences. Quintilian’s description (Institutio oratoria I, 4–8) of grammatical education in 1st-century A.D. Rome is the most extensive extant Latin linguistic text of the period. Quintilian is not himself a grammarian, but his text is our best source for early imperial grammar. His organizational scheme accords nicely with that of the ars in subsequent centuries as well as with what little we know of Palaemon’s ars, whose pupil he is reputed to have been. Grammarians and their students first study sounds, then word derivation, then the parts of speech, next declension and conjugation, and finally the virtues and vices of speech, particularly barbarisms and solecisms. The tripartite arrangement of the later monumental artes – phonology, morphology, vitia virtutesque orationis – is therefore present, at least implicitly, in Quintilian’s account. The study of phonology is important for reading aloud (the norm in classical antiquity), spelling, and scanning poetry, and the stylistic component presupposes at least some normative grammar. The largest section in Quintilian’s educational schema is that on morphology, and it emphasizes the parts of speech, declined and conjugated forms in particular, although Quintilian nowhere classifies nouns and verbs into particular declensions or conjugations and presumably does not know them. Ambiguity interests him, and he adduces (6.1) and discusses at length four criteria for resolving ambiguities and other doubtful grammatical issues: reason (analogy or etymology), antiquity, authority, usage. He criticizes excessive reliance on any one criterion, however, and inveighs mightily against prescriptive analogy and conjectural etymology. Examples are ubiquitous throughout Quintilian’s entire account of 1st-century A.D. 
grammatical education, and they provide fascinating glimpses into the many linguistic questions with which Roman grammarians and their students were grappling at the time. What the Romans do best, however, is to systematize and organize, and as Quintilian’s sketch and subsequent artes suggest, that is what it appears they do most of all for language study in the early empire. Apollonius Dyscolus is a rationalist and an analogist: linguistic problems have reasonable solutions, and similar problems have analogous solutions. Language is by nature logical, orderly, and rule governed; any and all deviations from linguistic logic, order, and rule are explicable by reason and analogy; and grammatical analysis is therefore principled behavior designed to discover rules and to explain exceptions. Apollonius assumes an underlying structure for all
levels of language. So there is an order to the letters of the alphabet, to the parts of speech, to the cases, and to the sentence. Letters combine into syllables, syllables into words, and words into sentences, but constraints obtain; therefore none of these processes is random. Change (pathos) affects each level, however, and Apollonius analyzes change as addition, deletion, substitution, and transposition, i.e., the four transformations or the quadripartita ratio. Just as words are subject to pathos, i.e., are misspelled, and are corrected by the theory of spelling, so too sentences may consist of improperly combined noeta ‘meanings’ (Apollonius is also a mentalist) and are to be corrected by syntactic theory. He decomposes complex syntactic structures with subordination into synonymous but coordinate simple structures and analyzes elliptical constructions on the basis of complete ones. Apollonius’s insistence on an underlying structure to phonology, morphology, syntax, and semantics is remarkably forward-looking and allows him to stress katallēlotēs ‘grammaticality’ and to seek such structures where they are not overtly apparent. Sextus Empiricus could not disagree more. For him, a science of grammar is simply not possible, and he therefore debunks Stoic and Alexandrian grammatical doctrines alike with impartial glee. In so doing, however, Sextus provides precious information, albeit negatively expressed, about both Stoic and Alexandrian linguistic thought. He also proves himself a keen observer of both diachronic change and synchronic variation. He reports quite accurately that oi, ei, ou, and ai are no longer diphthongs, that zeta is now simply [z], and that the aspirated stops phi, theta, and chi have become continuants. He also stresses the role of style, register, or idiolect and how it varies from speaker to speaker or from speech act to speech act.
Of course, Sextus adduces such observations in order to argue, pace his predecessors, that neither analogy nor usage, separately or in tandem, can establish correctness in speech. We know little about either Apollonius (e.g., why he was called Dyscolus ‘ill tempered’) or Sextus (e.g., where he lived), but together they constitute the last chapter in the history of ancient Greek linguistics.
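The four operations by which Apollonius analyzes change (addition, deletion, substitution, transposition) can be pictured as elementary operations on a sequence of units. The Python sketch below is only an illustration: the function names and the English example word are assumptions and come from no ancient source.

```python
def add(units, index, unit):
    """Addition: insert a unit at the given position."""
    return units[:index] + [unit] + units[index:]


def delete(units, index):
    """Deletion: remove the unit at the given position."""
    return units[:index] + units[index + 1:]


def substitute(units, index, unit):
    """Substitution: replace the unit at the given position."""
    return units[:index] + [unit] + units[index + 1:]


def transpose(units, i, j):
    """Transposition: swap the units at positions i and j."""
    result = list(units)
    result[i], result[j] = result[j], result[i]
    return result


word = list("form")  # letters stand in for any linguistic unit
print("".join(add(word, 4, "s")))         # forms
print("".join(delete(word, 1)))           # frm
print("".join(substitute(word, 0, "n")))  # norm
print("".join(transpose(word, 1, 2)))     # from
```

The same four functions apply unchanged to a list of words, which mirrors the parallel Apollonius draws between spelling and syntax.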
The Roman Ars Grammatica
In late antiquity, grammar is the undisputed queen of the sciences: the sheer quantity of grammatical texts dwarfs that of the preceding centuries taken collectively; numerous grammarians are represented in the corpus of extant texts; many others who are otherwise unknown are mentioned in those texts; and grammarians everywhere enjoy a status in society that guarantees them respect and remuneration.
Unfortunately, few texts have been edited properly, and we must therefore depend for the most part upon Keil’s eight mammoth but outdated volumes (1857–1880). Worse, prosopography is often chaotic, footnoting either casual or conspicuous by absence, and plagiarism rampant. Extant artes range from textbooks to reference grammars, their authors include amateurs as well as professionals, and even a new genre, the grammatical commentary, emerges. Thus, contents vary widely from favorite themes to areas of expertise, methods comprise the purely formal and the heavily semantic (sometimes both in the same text), and elaborate taxonomies abound. Despite such diversity these artes nonetheless manifest an apparent uniformity that obfuscates their authors’ occasional forays into independent thought. Therefore determining with precision the historical course of progress, dependence, influence, and innovation is virtually impossible, but the wealth of information in these copious artes is undeniably vast. Phonology, spelling, orthography, metrics, the partes orationis and their accidence, declensions, conjugations, comparison, vitia virtutesque orationis, even syntax, and countless other topics all receive their due and more, and grammarians ruthlessly mine both pagan and Christian texts for relevant literary examples. Sacerdos (3rd century A.D.) authors the first extant ars, but it is in the next century that grammar proliferates and rises to prominence. Charisius regularly exercises his own judgment as he reviews and critiques Roman grammatical literature; Diomedes can be unduly influenced by Greek sources one moment but quite independent, even radical, the next; and Donatus, the teacher of St Jerome and the most famous and influential contemporary grammarian, produces two classics, his Ars major and Ars minor.
The latter is a series of questions and answers concentrating exclusively on the parts of speech and testifying to the preeminent place of morphology in Latin grammar, whereas St Augustine reprises much of Stoic linguistics in his De dialectica. Contrastive linguistics looms larger as both the Roman empire and grammar become bilingual, a fact much in evidence in the grammatical works of Marius Victorinus, Macrobius, and Dositheus. The 5th century features equally frenetic grammatical activity. Consentius and Phocas limit their artes to the noun and verb: both take a deductive approach to Latin grammar, but the former emphasizes definitions and explanations in the manner of schoolbooks while the latter produces a regula-type grammar heavy on rules, paradigms, and examples. Servius, Cledonius, and Pompeius inaugurate the new genre of grammatical literature by commenting on the texts of Donatus. Martianus Capella’s bizarre fantasy on the
marriage of Mercury and Philology ranks the ars grammatica as the first of the artes liberales just as Varro’s Disciplinae had done more than a half-millennium previously. Priscian’s Institutiones grammaticae (5th–6th century A.D.) constitutes the stunning and surprising culmination to the history of Graeco–Roman language science. Priscian is a data-oriented grammarian who devotes his first 16 books to morphology; examples and literary quotations by the thousands enhance some of the most extensive morphological analyses in the entire history of linguistics. (His abridged treatise on the noun, pronoun, and verb becomes a pedagogical classic.) Semantics reigns supreme in his linguistic theory, however, for Priscian is heavily dependent upon Apollonius Dyscolus, as befits a grammarian writing in Constantinople. Priscian devotes books 17 and 18 to syntax, the first such study in all of Roman language science. He relies on Apollonius for theory, form, content, and methodology, and thus reason, analogy, order, and, obviously, semantics figure prominently everywhere. He cites Apollonius repeatedly and even recasts in Latin Apollonius’s famous analogy between spelling and syntax. Like his Greek predecessor, Priscian decomposes complex structures into simpler ones and analyzes ellipses on the basis of complete constructions. Especially noteworthy are Priscian’s analyses of the ablative absolute, gapping, the passive transformation, and impersonal verbs. Priscian’s work is so significant that it is transmitted in more than a thousand manuscripts, and his syntax so striking that books 1–16 and books 17–18 are often separately entitled Priscianus major and Priscianus minor, respectively.
Priscian’s supreme achievements in morphology and syntax are difficult to assess in detail, for even now we still “have no extensive study of him nor even a satisfactory philological edition of his work” (Hovdhaugen, 1982: 105); nonetheless his magnum opus stands as the consummate swan song to ancient language study.
Conclusion
Western intellectual history would not be the same without the contributions of classical antiquity’s language scientists, for individually and collectively the grammarians of Greece and Rome lay the foundations and chart the course for the formal study of language in the future. Their claims to originality are considerable: the four transformations, parts of speech, grammatical terms by the score, the word-class system, word-and-paradigm grammar, sign and referent, and dozens of other linguistic matters first articulated in Greece or Rome are still with us
today. Plato and Aristotle begin the lengthy process of identifying the fundamental constituents of language, Stoic logic remains au courant, the Techne’s taxonomic and terminological influence is indisputable, Varro’s revised declensions and conjugations are the permanent heart and soul of Latin grammar the world over, Donatus’s artes are centerpieces of grammatical literature and educational curricula, and Priscian’s syntactic analyses have found their way into modern textbooks. The history of Graeco–Roman language science is not always as well documented as we would like, but what is extant not only testifies to one of the longest-lasting intellectual legacies in the Western world but also continues to inspire and inform its modern practitioners.
See also: Apollonius Dyscolus and Herodian; Architecture of Grammar; Aristotle and Linguistics; Aristotle and the Stoics on Language; Diogenes the Babylonian (ca. 240–152 B.C.); Dionysius Thrax and Hellenistic Language Scholarship; Ellipsis; Europe Alphabets, Ancient Classical; Grammar; Greek, Ancient; Hippocrates: Theory of the Sign; Historiography of Linguistics; Latin; Linguistic Terminology; Logic and Language: Philosophical Aspects; Plato and His Predecessors; Pliny the Elder (23–79 A.D.); Priscianus Caesariensis (d. ca. 530); Quintilian (ca. 30–98 A.D.); Roman Ars Grammatica; Sextus Empiricus (fl. 200 A.D.); Varro, Marcus Terentius (116–27 B.C.); Word Classes/Parts of Speech: Overview.
Bibliography
Baratin M & Desbordes F (1981). L’Analyse linguistique dans l’antiquité classique. I: Les Théories. Paris: Klincksieck.
Blank D (1981). Ancient philosophy and grammar: the syntax of Apollonius Dyscolus. Chico, CA: Scholars Press.
Della Casa A (1973). ‘La grammatica.’ In Introduzione allo studio della cultura classica. II: Linguistica e filologia. Milan: Marzorati. 41–91.
Di Benedetto V (1958–59). ‘Dionisio Trace e la techne a lui attribuita.’ Annali della Scuola Normale Superiore di Pisa, Series II(27), 169–210; (28), 87–118.
Di Benedetto V (1973). ‘La techne spuria.’ Annali della Scuola Normale Superiore di Pisa, Series III(3), 797–814.
Holtz L (1981). Donat et la tradition de l’enseignement grammatical: Étude sur l’Ars Donati et sa diffusion (IVe–IXe siècle) et édition critique. Paris: Centre National de la Recherche Scientifique.
Householder F (1981). The syntax of Apollonius Dyscolus. Amsterdam: Benjamins.
Hovdhaugen E (1982). Foundations of western linguistics: from the beginning to the end of the first millennium A.D. Oslo: Universitetsforlaget.
Kaster R (1988). Guardians of language: the grammarian and society in late antiquity. Berkeley: University of California.
Keil H (ed.) (1857–80). Grammatici latini, 7 vols & supplement. Leipzig: Teubner.
Kemp A (1986). ‘The Technē Grammatikē of Dionysius Thrax: translated into English.’ Historiographia Linguistica 13, 343–363. Also in Taylor D (ed.) (1987) The history of linguistics in the classical period. Amsterdam: Benjamins. 169–189.
Law V (2003). The history of linguistics in Europe: from Plato to 1600. Cambridge: Cambridge University Press.
Law V & Sluiter I (eds.) (1995). Dionysius Thrax and the Technē Grammatikē. Münster: Nodus.
Luhtala A (2000). On the origin of syntactical description in Stoic logic. Münster: Nodus.
Pinborg J (1975). ‘Classical antiquity: Greece.’ In Sebeok T (ed.) Current trends in linguistics, XIII: historiography of linguistics. The Hague: Mouton. 69–126.
Rawson E (1985). Intellectual life in the late Roman republic. Baltimore: The Johns Hopkins University.
Robins R (1990). A short history of linguistics (3rd edn.). London: Longman.
Schmitter P (1991). Geschichte der sprachtheorie 2: sprachtheorien der abendländischen antike. Tübingen: Narr.
Sluiter I (1990). Ancient grammar in context: contributions to the study of ancient linguistic thought. Amsterdam: VU University Press.
Swiggers P & Wouters A (eds.) (2003). Syntax in antiquity. Leuven: Peeters.
Taylor D (1974). Declinatio: A study of the linguistic theory of Marcus Terentius Varro. Amsterdam: Benjamins.
Taylor D (1996). Varro de lingua Latina X. Amsterdam: Benjamins.
Classical Tests for Speech and Language Disorders
J Macoir, Laval University, Quebec, QC, Canada
A Sylvestre, Laval University, Quebec, QC, Canada
Y Turgeon, Campbellton Regional Hospital, Campbellton, NB, Canada
© 2006 Elsevier Ltd. All rights reserved.
Introduction
The evaluation of speech and language is one of the most important tasks of speech-language pathologists and professionals from a variety of disciplines and backgrounds (neuropsychologists, physicians, nurses, etc.). The assessment session is often the first contact with clients and also constitutes the starting point of all clinical interventions. Because of the absence of biological markers or simple assessment methods, the early detection or diagnosis of speech and language problems remains dependent on various indirect assessments (i.e., speech or language functioning must be inferred from the client’s performance in various tasks devised to explore the different areas of this functioning) performed to identify specific impairments and eliminate other possible causes. Speech-language assessments are conducted for various purposes. The main goal of screening is to determine whether a client has a problem or not. The output of this type of assessment is a ‘pass’ or ‘fail’ result, based on an established criterion that could lead to a more extensive or a follow-up assessment. Diagnosis and differential diagnosis assessments are usually performed to label the communication problem and/or to differentiate it from other disorders in which similar characteristics are usually reported. Another important purpose of evaluation is to provide clinicians with a detailed description of the client’s baseline level of functioning in all areas of communication in order to identify affected and preserved components, to plan for treatment, to establish treatment effectiveness, or to track progress over time through periodic re-evaluations.
These types of assessment require the clinician to consider all aspects of communication, including the different areas of speech (e.g., articulation, voice, resonance) and language (e.g., lexical access, comprehension, written spelling), but also important related abilities and components such as pragmatics, cognitive functions (e.g., attention, memory, visual perception), emotions, awareness of deficits, etc. The selection of evaluation tools is also conditioned by the specific objectives of assessments. Screening for a speech or language disorder is usually performed with standardized screening measures whereas standardized norm-referenced tests are used for diagnosis and differential diagnosis
assessments as well as for clinical treatment purposes (baseline, effectiveness, progress).
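The pass/fail logic of screening described above amounts to comparing a score against an established criterion. A minimal Python sketch follows, assuming a hypothetical cutoff and score scale that correspond to no published instrument; the wording of the follow-up recommendation is likewise an assumption.

```python
def screen(score: float, cutoff: float) -> str:
    """Return 'pass' when the score meets the criterion, otherwise 'fail'."""
    return "pass" if score >= cutoff else "fail"


def next_step(result: str) -> str:
    """Map a screening result to a follow-up action (hypothetical wording)."""
    if result == "fail":
        return "refer for a more extensive or follow-up assessment"
    return "no further assessment indicated"


HYPOTHETICAL_CUTOFF = 20  # illustrative value only

for score in (25, 14):
    result = screen(score, HYPOTHETICAL_CUTOFF)
    print(score, result, "->", next_step(result))
```

A screening result of this kind says only whether further assessment is warranted; it does not label or describe the problem.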
Reference Models for the Assessment of Speech and Language Impairment
The choice of a particular method of assessment, the selection of evaluation tools, and the interpretation of results are highly dependent not only on the clinician’s own conception of speech and language functioning but also on the reference to a clinicopathological or cognitive model of assessment. In the clinicopathological model, speech and language problems are considered as essential characteristics of clinical syndromes. These clinical syndromes are organized and classified according to neurological-neuropathological characteristics (e.g., deterioration of cortical tissue in a specific brain area) and according to semiology (e.g., sensory and motor deficits, visuospatial deficits, language deficits, etc.). For the purpose of assessment, the emphasis is put on the precise identification of the diagnostic label that best corresponds to the observed deficits as well as on the identification of the possible etiology. For example, within this model, the general assessment process of an aphasic person essentially consists of (1) gathering case history data (e.g., cerebrovascular accident in the left frontal area), (2) administering a specific test battery (e.g., the Boston Diagnostic Aphasia Examination [BDAE]; Goodglass et al., 2000), (3) comparing the results and description of behavior (e.g., impaired fluency, impaired articulatory agility, relatively good auditory comprehension, agrammatism) with the classification of neurogenic acquired deficits of language, and (4) specifying the precise aphasic label (Broca’s aphasia) that best fits these characteristics. If screening or labeling is the main goal of the assessment, the clinicopathological model is probably the best option. It is, however, certainly not so if the purpose of the evaluation is to localize the functional origin of deficits or to guide clinical practice.
Knowing that a person presents with Broca’s aphasia may not be much help in identifying the specific components of language that are totally or partially affected or preserved. Nor does it tell the clinician what intervention goals are appropriate or what treatment approaches will succeed best. Instead of resorting to a medical assessment model, clinicians may use cognitive neuropsychological models, directly derived from information-processing theories, to evaluate language. In these models, cognitive functions, including language, are sustained by specialized interconnected processing components,
represented in functional architecture models. For example, as shown in Figure 1, the ability to orally produce a word in picture naming is conceived as a staged process in which the activation flow is initiated in a conceptual-semantic component and ends with the execution of articulation mechanisms. An assessment process based on cognitive neuropsychological models consists of the localization of the impaired and preserved processing components for each language modality. This localization is performed through the administration of specific tasks or test batteries (e.g., Psycholinguistic Assessments of Language Processing in Aphasia [PALPA]; Kay et al., 1992) aiming at the evaluation of each component and route of the model. For example, the evaluation of naming abilities in an aphasic person could be performed by the administration of tasks exploring the conceptual-semantic (e.g., semantic questionnaire), phonological output lexicon (e.g., picture-naming task controlled for frequency, familiarity, etc.), and phonological output buffer (e.g., repetition of words and nonwords controlled for length) components. Important information regarding the level of impairments also arises from error analysis. With the same example, an anomic behavior could arise from distinct underlying deficits (e.g., in the activation of conceptual-semantic representations or in retrieving phonological forms of words in the output lexicon), leading to distinct types of errors (e.g., semantic substitutions, phonemic errors). The complete cognitive assessment process should allow the clinician to understand the client’s deficits (i.e., surface manifestations, underlying origins, affected components) as well as to identify the strengths and weaknesses in his communication
abilities. When recommended, the treatment may focus on the impaired levels of processing (i.e., function restoration) or on alternative processing routes (i.e., function reorganization) that will allow the client to communicate successfully.
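The component-by-component logic of the cognitive approach can be sketched in Python. The component and task names follow the picture-naming example in the text; the pass/fail representation of task results and the one-task-per-component mapping are simplifying assumptions, since real assessments rely on graded scores and on error analysis.

```python
# Components of spoken picture naming in processing order, each paired with
# an example task used to probe it (names taken from the naming example).
NAMING_MODEL = [
    ("conceptual-semantic component", "semantic questionnaire"),
    ("phonological output lexicon", "picture-naming task"),
    ("phonological output buffer", "repetition of words and nonwords"),
]


def localize(task_results: dict[str, bool]) -> dict[str, list[str]]:
    """Sort components into impaired and preserved from pass/fail results."""
    profile: dict[str, list[str]] = {"impaired": [], "preserved": []}
    for component, task in NAMING_MODEL:
        key = "preserved" if task_results[task] else "impaired"
        profile[key].append(component)
    return profile


# Hypothetical client: semantics and repetition intact, picture naming failed.
results = {
    "semantic questionnaire": True,
    "picture-naming task": False,
    "repetition of words and nonwords": True,
}
print(localize(results))
```

In this hypothetical profile the impairment is localized to the phonological output lexicon, which would point treatment either toward restoring that component or toward an alternative processing route.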
Classical Tests for the Assessment of Aphasia
Aphasia is the most common disorder of communication resulting from brain damage (e.g., stroke, brain tumor, head trauma, infections). This condition mainly involves problems of language production and comprehension as well as disturbances in reading and spelling.
Bedside and Screening Tests
The patient’s symptoms change rapidly during the first days and weeks following the brain damage. Moreover, patients are often too ill to complete an exhaustive aphasia examination, and bedside or screening instruments may be useful to advise relatives and health care professionals about the global communication profile and the best means to communicate in functional situations. These instruments are also useful to help clinicians to determine the necessity of performing a more thorough and extensive assessment of language or to establish the priority of patients on a waiting list. In addition to actual screening tests (e.g., Aphasia Screening Test; Reitan, 1991; for an extensive list see Murray and Chapey, 2001 and Spreen and Risser, 2003), clinicians also may administer shortened versions of comprehensive tests of aphasia (e.g., short form of the Token Test; Spellacy and Spreen, 1969; for an extensive list see Murray and Chapey, 2001). As pointed out by Spreen and Risser (2003), although bedside and screening tests may be used to identify language impairments in moderate and severe aphasics (language is obviously affected, even in simple and natural communication situations), they are inappropriate or of little use to distinguish the responses of individuals with mild deficits from those with normal language skills.
Comprehensive Examinations and Aphasia Batteries
Figure 1 Schematic depiction of the cognitive neuropsychological model of spoken picture naming.
As compared to bedside and screening tests, the main purpose of comprehensive examinations of aphasia is to provide an extensive description of language skills through the administration of tests designed to explore the different areas of language (i.e., spontaneous speech, naming, oral expression, auditory and written comprehension, repetition, reading, and writing). According to the reference model of assessment, the
output of a comprehensive examination may consist of the identification of a particular diagnosis of aphasia with the description of severity of deficits in each language area (clinicopathological approach), or of the localization of specific impairments affecting functional processing components of language skills (cognitive neuropsychological approach). There are several classical comprehensive examinations and aphasia batteries. The most widely used in clinical and research settings in English are the BDAE (Goodglass et al., 2000), the Western Aphasia Battery (Kertesz, 1982), and the Aphasia Diagnostic Profiles (Helm-Estabrooks, 1992). All these standardized test batteries comprise different subtests (e.g., the BDAE has 27 subtests) that assess all the dimensions of language in order to diagnose and classify aphasic syndromes according to clinical localization-based classifications (i.e., Broca’s, Wernicke’s aphasia, etc.). For a complete description and a critical review of these instruments, and others not reported here, see Spreen and Strauss (1998), Murray and Chapey (2001), and Spreen and Risser (2003). PALPA (Kay et al., 1992) is a comprehensive test battery directly derived from the cognitive neuropsychology approach to assessment. This aphasia battery, commonly used in the United Kingdom, consists of a set of resource materials comprising 60 rigorously controlled tests that enable the user to select tasks “that can be tailored to the investigation of an individual patient’s impaired and intact abilities.” The scoring and analysis of errors give the clinician a detailed profile of language abilities, including reading and written spelling, which can be interpreted within current cognitive models of language. As compared to classical batteries of aphasia, the versatile and flexible nature of PALPA is, however, lessened by the lack of standardization and validity/reliability measures.
Tests for the Assessment of Specific Aspects of Language
Specific aspects of language behavior can also be assessed through the administration of several tests. These are often used to complete aphasia batteries but some of them also are used as screening tests. Clinicians may select these tests according to the different aspects of language they want to explore in depth, but also according to the underlying theoretical model of assessment. For example, comprehension may be tested through the administration of specific tests aiming at the discrimination of phonemic sounds (Phoneme Discrimination Test; Benton et al., 1994), semantics (Pyramids and Palm Trees Test; Howard and Patterson, 1992), sentence length and syntactic complexity (Auditory Comprehension Test for Sentences; Shewan, 1979), commands (Token Test; De Renzi and Vignolo, 1962), or narrative discourse (Discourse
Comprehension Test; Brookshire and Nichols, 1993). Other tests are available for measuring verbal expression, spoken and written naming, verbal fluency, reading, writing, gestural abilities, etc. An extensive list of specific language function tests can be found in Spreen and Strauss (1998), Murray and Chapey (2001), and Spreen and Risser (2003).
The Assessment of Functional Communication
Traditional tests provide useful information on linguistic abilities and language impairments in aphasia. However, performance on these tests does not necessarily predict how a person will communicate in more naturalistic settings and everyday life. Instead of focusing on the importance and the nature of deficits, the functional communication approach to assessment addresses the impact of these deficits on the person’s activities and participation in society. Functional communication skills may be assessed with specific structured tests or by rating scales and inventories of communication profiles. Structured tests such as Communication Activities of Daily Living 2 (Holland et al., 1999) and the Amsterdam–Nijmegen Everyday Language Test (Blomert et al., 1994) have been devised to explore functional communication skills using role-play in daily life activities (shopping, dealing with a receptionist, etc.) and have shown themselves to be useful to track progress over time. However, while they are certainly more ecologically valid than comprehensive examinations and tests for specific aspects of language, structured tests of functional communication do not necessarily give reliable views of the communication skills of a person in real-life situations. In this respect, rating scales and inventories of communication profiles are closer to functional situations. For example, the Functional Assessment of Communication Skills for Adults (Frattali et al., 1995) is a rating protocol of 43 items, on a seven-point scale, based on the observations made by the speech-language pathologist or other significant person in the following four domains: social communication (e.g., ‘refers to familiar people by name’); communication of basic needs (e.g., ‘makes needs to eat’); reading, writing, and number concepts (e.g., ‘writes messages’); and daily planning (e.g., ‘tells time’).
For a more extensive description of these functional communication tools, and others not described here, see Murray and Chapey (2001) and Spreen and Risser (2003).
Classical Tests for the Assessment of Speech and Language Impairment in Children The assessment of language and communication in children can take place from infancy through
adolescence, when cognitive abilities are developing. Therefore, the language assessment process must not only inform on current specific abilities but must also capture changes over time in the level, sequence, and rate of acquisition. The interrelationship between language and other cognitive and social skills is also of primary importance. As a part of a larger process, usually performed by different professionals, the evaluation of language in children should be completed by an assessment of nonverbal communication, play and social skills, perception, attention and memory, behavior, etc. Moreover, because of the major influence it has on child development, the evaluation also has to consider the familial and social environment, especially with respect to adult–child interaction. The different components (e.g., sensitivity, promptness) as well as the context (e.g., physical settings, types of play, activities) in which this interaction takes place should be analyzed through specific assessment tools or through direct observation. The assessment of preschool children (children aged 2 to 5 years) and school-age children (5 to 10 or 12 years) is usually based upon a combination of parent interviews, standardized tests, criterion-referenced instruments, developmental scales, and observations. All these tools and methods aim to explore both receptive and expressive language abilities in semantics, morphology, syntax, phonology, and pragmatics. Collecting a communication sample is also a frequently used method to analyze communication in terms of sentence length, intelligibility of speech, vocabulary, and conversational strengths and weaknesses. Like tests for aphasia, preschool and school-age tests can be divided into two major categories: screening and diagnostic tests. The purpose of screening tests is to determine if the child’s communication should be explored more extensively for the presence of a possible impairment.
On the other hand, the main purposes of diagnostic tests are to establish the presence or absence of a deficit in one or more areas of language, to identify a possible difference in language development, to determine the child’s eligibility for clinical services, and to identify the targets for intervention. These instruments are devised to assess language development by reference to the parameters of the normal range. Screening Tests
Screening tests are usually inexpensive and require minimal time for administration and interpretation of results. Many norm-referenced standardized instruments may be used to establish the child’s general level of expressive and receptive language functioning as well as other areas of functioning. For example, the
Denver Developmental Screening Test II (Frankenburg et al., 1990), a standardized screening battery for children from birth to age 6, is designed to test the child’s abilities in the following four sectors: personal-social, fine motor, gross motor, and language (including expressive-receptive vocabulary). Screening tests may also consist of large batteries exploring language and cognitive functions through tasks of general verbal and nonverbal intellectual abilities. For example, the Wechsler Intelligence Scale for Children IV (Wechsler, 2004) is the most widely used measure of verbal and nonverbal intelligence in individuals from age 6 years 0 months to 16 years 11 months. As a screening tool, this battery consists of 16 subtests of verbal comprehension, perceptual reasoning, working memory, and processing speed skills. For school-age children, some large screening batteries specifically concern academic achievement. That is, for example, the case with the Peabody Individual Achievement Test – Revised (PIAT-R; Markwardt, 1998), which provides a screening measure of achievement in the areas of mathematics, reading recognition and comprehension, spelling, and general information. However, most of the tasks of these large screening batteries are multifactorial and are therefore not appropriate to assess specific language or cognitive processes. For this purpose, clinicians may select among various specific screening tests for preschool and school-age children that focus only on language. Most of these instruments are designed to explore the different language components. That is the case, for example, with the Fluharty Preschool Speech and Language Screening Test II (Fluharty, 2001), which explores articulation, expressive and receptive vocabulary, and composite language in children from 2 to 6 years old. An exhaustive list of norm-referenced standardized screening tests of language can be found in Paul (2001). Comprehensive Examinations and Batteries
As for screening, some diagnostic tools are designed to explore language skills as well as other aspects of development. That is the case, for example, with the Communication and Symbolic Behavior Scales Developmental Profile (Wetherby and Prizant, 1998), which includes tasks exploring expressive and receptive language, symbolic play, and nonverbal communication in children from 6 to 24 months old. Another example is the Rossetti Infant-Toddler Language Scale (Rossetti, 1990), which is used to assess attachment, play, gestures, and pragmatics, as well as language comprehension and expression in children from birth to 3 years old. There are also several
standardized comprehensive batteries of language processing that comprise tests exploring exclusively some or all of the language areas. That is the case, for example, with the Preschool Language Scale 4 (Zimmerman et al., 2002), which is used to identify specific strengths and weaknesses in receptive and expressive language skills in children from birth to 6 years 11 months. The Clinical Evaluation of Language Fundamentals 4 (Semel et al., 2003) is a multidimensional battery that can be used in individuals between the ages of 5 and 21 years to explore semantics, expressive and receptive language, and syntax, as well as working memory. The Comprehensive Assessment of Spoken Language (Carrow-Woolfolk, 1999), designed for children from age 3 to 21, is another comprehensive battery of language skills, comprising 15 tests that provide an assessment of expressive and receptive skills in four language categories: lexical/semantic, syntactic, supralinguistic, and pragmatics. An extensive list of available comprehensive examinations and batteries of language for children can be found in McCauley (2001), Paul (2001), Mattis and Luck (2002), and Haynes and Pindzola (2003). Tests for Specific Aspects of Language
Different components of language can be affected with more or less intensity in children according to the origin of developmental disorders. Therefore, the in-depth assessment of language and communication disorders in children is a critical component in the clinical process. Core tests can be used to evaluate each of the language areas in order to identify specific impairments, establish baselines, and identify precise therapeutic and intervention goals. For example, there are several core tests and instruments for the evaluation of word retrieval (e.g., naming and verbal fluency tests), phonology (e.g., word and nonword repetition tests), receptive and expressive vocabulary (e.g., word definition tests), receptive and expressive syntax and morphology (e.g., sentence-to-picture matching tests), and pragmatic skills (e.g., narrative production, story comprehension tests). A combination of different tests, each focusing on specific language components, may also be used to establish such a language and communication profile. For example, to assess vocabulary, clinicians may select the following standardized norm-referenced specific tests: the Expressive One-Word Picture Vocabulary Test Revised (Gardner, 2000), to exclusively explore expressive vocabulary in individuals ages 2 years 0 months through 18 years 11 months; or the Peabody Picture Vocabulary Test (Dunn and Dunn, 1997), to exclusively explore receptive
vocabulary in individuals from age 2 years 6 months to adult. A more complete description of available diagnostic tests adapted to preschool and school-age children can be found in McCauley (2001), Paul (2001), Mattis and Luck (2002), and Haynes and Pindzola (2003). The Assessment of Reading and Writing
The relationship between language acquisition and academic achievement is well established. Developmental disorders of language in preschool children are frequently associated with later difficulties in learning to read and write. The most common referral for a speech-language pathology assessment concerns school-age children who encounter problems in progressing beyond the developing language phase and present with difficulties in learning and acquiring communicative and academic skills. As for other populations, but especially at this stage of development, a significant difficulty in assessing school-age children arises because of important comorbidity between language and learning disorders and other cognitive and clinical pathological profiles, such as attention deficit/hyperactivity or executive function disorders. Therefore, the assessment process should include specific tests of language and communication but also instruments designed for exploring other cognitive functions, such as attention, working memory, and executive functions. In addition to formal tests, another important source of information is structured interviews with the child, the parents, and the teacher. With respect to language, phonological processing deficits are considered an underlying cause of dyslexia and also play a role in developmental disorders of spelling. For example, dyslexic children often show problems with word and nonword repetition tasks, phonological awareness tasks (e.g., word and nonword segmentation tasks, phoneme manipulation, etc.), and working memory tasks for verbal material (i.e., word or digit span tasks). Semantic processing is another cognitive area highly related to reading and writing. School-age children usually learn new words through reading and writing. Those who encounter problems in reading and writing often present with poor vocabulary as well as with difficulty in word association and comprehension.
Therefore, the assessment procedure for written language problems should be part of a more exhaustive evaluation of language and cognition. It should also include a close control of psycholinguistic parameters (e.g., orthographic regularity, lexical frequency) that are known to play an important role in written and spoken word recognition, reading comprehension, phoneme–grapheme
conversion, decoding, etc. However, very few standardized assessment tools fulfill these conditions. As an exception, French-speaking clinicians may use the Batterie d’Évaluation du Langage Écrit et de Ses Troubles (Mousty et al., 1994), a written-language testing battery based on current models of reading and writing, to assess children between the ages of 7 and 12 years. In addition to experimental tasks, one can resort to standardized achievement or specific tests of reading and writing skills. Among the most used of achievement tests are PIAT-R (Markwardt, 1998), which comprises subtests of reading comprehension, reading recognition, and spelling, and the Wide Range Achievement Test (Wilkinson, 1993), a brief test measuring reading recognition, spelling, and arithmetic computation. For a description of specific clinical tests of reading and writing, see Spreen and Strauss (1998) and Bailet (2001).
Classical Tests for the Assessment of Speech and Language Impairment in Special Populations Referral for speech-language assessment not only concerns aphasia and developmental deficits of language but also involves individuals of different age groups presenting with various language and communication problems. In children and adolescents, these referrals include language deficits in pervasive developmental disorders (e.g., autism, Asperger’s disorder), mental retardation, attention deficit/hyperactivity disorder, specific language impairment, sensory deficits (hearing loss, blindness), acquired disorders (e.g., traumatic brain injury), stuttering, etc. In adults, referral for a speech-language evaluation may be required for language and communication deficits following right hemisphere damage, traumatic brain injury, Alzheimer’s disease and other forms of dementia (e.g., primary progressive aphasia, semantic dementia), stuttering and other fluency problems, etc. In children, adolescents, and adults, clinical assessments may also concern such speech problems as dysarthria, following a stroke and neurodegenerative illnesses or accompanying cerebral palsy, acquired or developmental apraxia of speech, etc. In addition to the conventional evaluation of basic language (or speech) skills, the assessment procedure in all these special populations involves specific aspects and particularities of speech and language. For example, because of the absence of biological markers or simple diagnosis methods, the early detection of dementia often depends on various assessment tools, including
language and cognitive tests performed to exclude other possible disease processes or identify specific forms of a given disease. In that particular domain, the assessment of speech and language usually includes tests allowing for differential diagnosis. For example, tests that specifically tap either semantic processing or written spelling can contribute to differentiating common disease processes in the elderly population. Semantic deficits are prominent characteristics of individuals diagnosed with Alzheimer’s disease and these individuals usually differ from patients diagnosed with vascular dementia or frontotemporal dementia because of the presence of surface dysgraphia, a specific spelling disorder. It is obviously not possible to exhaustively describe here the various tests adapted to special populations. The reader will find a more complete description of such tests in McCauley (2001), Paul (2001), Haynes and Pindzola (2003), and Spreen and Risser (2003).
Conclusion Language production and comprehension are complex cognitive skills that should not be considered in isolation in assessment procedures. The interrelation between language and other cognitive functions has to be captured, particularly with respect to the possible influence of attention, working memory, and executive functions on linguistic abilities. If possible, clinicians should always select valid and reliable norm-referenced tests to assess language and communication. Resorting to theoretical models of language functioning also appears of primary importance and may sometimes condition the utilization of experimental, well-controlled, assessment tasks. A comprehensive assessment of language and communication is more than just an evaluation of specific skills in terms of preservation or impairment of processing components and surface structures. The scope of assessment should be widened in order to provide information about physical, social, and emotional contexts of communication, cultural differences, and economic factors. The combination of these data, obtained through assessment tools and direct observations, should then allow the clinician to establish a complete portrait of functional communication abilities. See also: Dementia and Language; Impairments of Proper and Common Names; Phonological Impairments, Sublexical; Phonological, Lexical, Syntactic, and Semantic Disorders in Children; Primary Progressive Aphasia in Nondementing Adults; Speech Impairments in Neurodegenerative Diseases/Psychiatric Illnesses.
Bibliography Bailet L L (2001). ‘Written language test reviews.’ In Bain A M, Bailet L L & Moats L C (eds.) Written language disorders. Austin: Pro-Ed. 221–248. Benton A L, Hamsher K, Varney N R & Spreen O (1994). Contributions to neuropsychological assessment (2nd edn.). New York: Oxford University Press. Blomert L, Kean M L, Koster C & Schokker J (1994). ‘Amsterdam–Nijmegen Everyday Language Test: construction, reliability, and validity.’ Aphasiology 8, 381–407. Brookshire R & Nichols L E (1993). The Discourse Comprehension Test. Minneapolis: BRK. Carrow-Woolfolk E (1999). Comprehensive Assessment of Spoken Language. Circle Pines, MN: American Guidance Service. DeRenzi E & Vignolo L (1962). ‘The token test: a sensitive test to detect receptive disturbances in aphasics.’ Brain 85, 665–678. Dunn L M & Dunn L M (1997). Peabody Picture Vocabulary Test. Circle Pines, MN: American Guidance Service. Fluharty N B (2001). Fluharty Preschool Speech and Language Screening Test II. Austin: Pro-Ed. Frankenburg W, Dodds J, Archer P, Bresnick B, Maschka P, Edelman N & Shapiro H (1990). Denver II: screening manual. Denver: Denver Developmental Materials. Frattali C M, Thompson C K, Holland A L, Wohl C B & Ferketic M M (1995). Functional Assessment of Communication Skills for Adults. Rockville, MD: American Speech-Language-Hearing Association. Gardner M F (2000). Expressive One-Word Picture Vocabulary Test – Revised. Novato, CA: Academic Therapy. Goodglass H, Kaplan E & Barresi B (2000). Boston Diagnostic Aphasia Examination (3rd edn.). Philadelphia: Lippincott Williams & Wilkins. Haynes W O & Pindzola H R (2003). Diagnosis and evaluation in speech pathology (6th edn.). Boston: Allyn & Bacon. Helm-Estabrooks N (1992). Aphasia Diagnostic Profiles. Chicago: Riverside Publishing. Holland A L, Frattali C M & Fromm D (1999). Communication Activities of Daily Living (2nd edn.). Austin: Pro-Ed. Howard D & Patterson K E (1992). The Pyramids and Palm Trees Test. 
Oxford: Harcourt Assessment. Kay J, Lesser R & Coltheart M (1992). Psycholinguistic Assessments of Language Processing in Aphasia (PALPA). Hove, England: Lawrence Erlbaum Associates. Kertesz A (1982). Western Aphasia Battery. New York: Grune & Stratton.
Markwardt F C (1998). Peabody Individual Achievement Test Revised. Circle Pines, MN: American Guidance Service. Mattis S & Luck D Z (2002). ‘Neuropsychological assessment of school-aged children.’ In Segalowitz S J & Rapin I (eds.) Handbook of neuropsychology: child neuropsychology 1. New York: Elsevier Science. McCauley R J (2001). Assessment of language disorders in children. Mahwah, NJ: Lawrence Erlbaum Associates. Mousty P, Leybaert J, Alégria J, Content A & Morais J (1994). Batterie d’Évaluation du Langage Écrit et de ses Troubles. Brussels: Laboratoire de Psychologie Expérimentale, Université Libre de Bruxelles. Murray L L & Chapey R (2001). ‘Assessment of language disorders in adults.’ In Chapey R (ed.) Language intervention strategies in aphasia and related neurogenic communication disorders (4th edn.). Philadelphia: Lippincott Williams & Wilkins. 55–126. Paul R (2001). Language disorders from infancy through adolescence: Assessment and intervention (2nd edn.). Toronto: Mosby. Reitan R M (1991). Aphasia Screening Test. Tucson: Reitan Neuropsychology Laboratory. Rossetti L (1990). Rossetti Infant-Toddler Language Scale. East Moline, IL: LinguiSystems. Semel E, Wiig E H & Secord W A (2003). Clinical Evaluation of Language Fundamentals (4th edn.). Toronto: Psychological Corporation. Shewan C M (1979). Auditory Comprehension Test for Sentences (ACTS). Chicago: Biolinguistics Clinical Institutes. Spellacy F & Spreen O (1969). ‘A short form of the Token Test.’ Cortex 5, 390–397. Spreen O & Risser A H (2003). Assessment of aphasia. New York: Oxford University Press. Spreen O & Strauss E (1998). A compendium of neuropsychological tests (2nd edn.). New York: Oxford University Press. Wechsler D (2004). The Wechsler Intelligence Scale for Children: Fourth edition integrated. San Antonio, TX: Harcourt Assessment. Wetherby A M & Prizant B M (1998). Communication and Symbolic Behavior Scales-Developmental Profile. Chicago: Applied Symbolix. Wilkinson G S (1993).
Wide Range Achievement Test 3. San Antonio, TX: Harcourt Assessment. Zimmerman I L, Steiner V G & Pond R E (2002). Preschool Language Scale (4th edn.). San Antonio, TX: Harcourt Assessment.
446 Classification of Languages
Classification of Languages B J Blake, La Trobe University, Bundoora, VIC, Australia © 2006 Elsevier Ltd. All rights reserved.
Classification of Languages This article describes the principles underlying the classification of languages in this encyclopedia. Classification may be based on genetics, diffusion, lexicostatistics, or other relationships. A map (Figure 1) showing locations of major language groupings worldwide is provided.
Genetic Classification Both professional linguists and general readers find a genetic classification the most satisfying way to group languages. This approach is one in which languages are classified into families on the basis of descent from a common ancestor. A good example is the Indo-European family of languages, which includes most of the languages of Europe, Iran, Afghanistan, and the northern part of South Asia. These languages can be shown to descend from a common ancestor, a common protolanguage. There are no records of the ancestral language, but it can be reconstructed from records of daughter languages such as Sanskrit, Ancient Greek, and Latin by using what is known as the ‘comparative method’. Consider the following words for ‘father’: Sanskrit pitár, Greek paté:r, and Latin pater. It is possible to align the initial ps, the medial ts, and the final rs and reconstruct a root with the consonants p-t-r (the vowels require a little further examination). English is also a related language, so the word father should show the same consonants, but in fact the expected p shows as an f and the t shows up as a th (representing a voiced dental fricative). However, a consideration of further words shows that the f/p correspondence also appears in many other words, such as English foot against Sanskrit pád-, Greek pod-, and Latin ped-, and the th/t correspondence also shows up in other words, such as English mother against Sanskrit ma:tá:r, Greek má:te:r, and Latin ma:ter. We still reconstruct p-t-r and conclude that English has systematically changed the original stop consonants into fricatives. In fact, all the Germanic languages have done so. Inflections as well as roots can be reconstructed. A common genitive ending in -s can be seen in Greek pod-ós, Latin ped-is, and English foot’s. Proceeding in this way, we can reconstruct a good deal of the protolanguage and we can demonstrate that these languages and a score or so others are
related as members of one family, which we call Indo-European. A language family can be represented by a tree diagram, with the branches representing subgroups. Subgroups are characterized by shared innovations, which set them apart from other languages in the family. The Germanic branch of Indo-European (English, German, Dutch, etc.) is characterized by various consonant shifts such as p → f and t → th, as just mentioned, and by a past tense marked by a dental (or alveolar) stop, as in English answered or German antwortete. Other branches of Indo-European that can be reconstructed include Armenian, Anatolian, Celtic, Tocharian, and Italic. The Italic branch contained languages located in Italy, such as Oscan, Umbrian, and Latin. Latin was spread by conquest from Rome to a large area around the Mediterranean. It is no longer a spoken language, but it survives through its daughters, namely, French, Portuguese, Spanish, Italian, and Romanian, to mention only national languages. These languages, collectively called the Romance languages, form a sub-branch of the Italic branch of the Indo-European family. In this instance, we have records of Latin, which serves as a check on what we might reconstruct as proto-Romance. All of the Indo-European languages treated in this encyclopedia are included in the alphabetic list of families and other large groupings in the last section of this article (‘Status of the Groupings Used in the Classification’). It is common in studying languages to find among them resemblances that are insufficient for the reconstruction of a protolanguage. This can be because there are insufficient data or because the languages have diverged so far that only a little evidence remains of their genetic affiliation. Where there is insufficient evidence for establishing a family or grouping families into a wider family, so that they become branches of the larger family, we can describe the languages in question as belonging to a particular stock.
There can be degrees of resemblance among languages. If languages are grouped into stocks on the basis of sharing 10–20% of vocabulary, and some stocks are found to share between 5 and 10%, then these stocks can be said to belong to the one phylum.
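The alignment step of the comparative method described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the consonant skeletons in COGNATES are hand-simplified from the words cited above (vowels, accents, and length marks are dropped), and the names COGNATES and correspondences are hypothetical, not part of any established tool.

```python
from collections import Counter

# Consonant skeletons of the cognates cited in the text, simplified by hand
# for illustration only (an assumption of this sketch, not a full analysis).
COGNATES = {
    "father": {"Sanskrit": ["p", "t", "r"], "Greek": ["p", "t", "r"],
               "Latin": ["p", "t", "r"], "English": ["f", "th", "r"]},
    "mother": {"Sanskrit": ["m", "t", "r"], "Greek": ["m", "t", "r"],
               "Latin": ["m", "t", "r"], "English": ["m", "th", "r"]},
    "foot": {"Sanskrit": ["p", "d"], "Greek": ["p", "d"],
             "Latin": ["p", "d"], "English": ["f", "t"]},
}


def correspondences(cognates, lang_a, lang_b):
    """Count position-by-position consonant pairs between two languages."""
    counts = Counter()
    for forms in cognates.values():
        # zip() pairs the first consonant with the first, and so on.
        for a, b in zip(forms[lang_a], forms[lang_b]):
            counts[(a, b)] += 1
    return counts


latin_english = correspondences(COGNATES, "Latin", "English")
for (a, b), n in sorted(latin_english.items()):
    print(f"Latin {a} : English {b}  (seen {n} times)")
```

A pair such as Latin p : English f that recurs across several words is the kind of regular correspondence the method relies on; a real study would of course use far more data and a proper phonological alignment.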
Diffusion In the ideal case, a number of innovations will coincide, as with the Germanic innovations mentioned previously, and a branch can be added to a tree diagram. However, all innovations, whether they
Figure 1 Locations of the major language groupings of the world, excluding the large-scale expansion of European languages such as English and Spanish over the past 500 years. The approximate locations of major concentrations are shown. In the Americas, there are many families, often with discontinuous and interlocking distributions, so the labels, indicated by name, are very approximately located.
are new pronunciations, new affixes, new words, or new constructions, must start at a particular location and then spread, and different innovations can have different starting points, and the spreads can overlap. This can happen within a particular language or between languages in contact, with the result that linguists cannot always present a neat, noncontroversial tree diagram. The diffusion of language features can be massive and widespread. Vocabulary can be borrowed from one language to another. ‘Borrow’ is the conventional term for the adoption of language features from another language, but no paying back is implied. Words to do with culture are most easily borrowed. English, for instance, has borrowed almost the entire learned stratum of its lexicon from French, Latin, and Greek. Similarly, Thai, Lao, and Khmer (Cambodian) have borrowed their learned stratum from Pali, a language of the Indo-Aryan branch of Indo-European. Pali is the language of Buddhism. In areas where Islam is found, languages exhibit various degrees of borrowing from Arabic. Common vocabulary is not immune from borrowing. English, for example, has borrowed heavily from French, and it has borrowed some hundreds of fairly basic words from Old Norse, including the pronominal forms they, their, and them. The standard tree diagram shows English as part of the West Germanic sub-branch of Germanic and Old Norse (ancestral to the modern Scandinavian languages) as representing North Germanic, but it is more realistic to think of English as a mixture, predominantly West Germanic, but with an admixture of North Germanic. And there is also the learned stratum of vocabulary already mentioned. Though grammatical forms, particularly bound forms such as plural markers or past tense markers, are not normally borrowed, grammatical structure or patterns are relatively diffusible.
It is interesting to note that most of the languages of South Asia have subject-object-verb (SOV) word order even though they belong to different language families: the Indo-Aryan branch of Indo-European, Dravidian, and the Munda branch of Austro-Asiatic. Burushaski, a language isolate spoken in northern Pakistan, is also SOV. In China, and in Laos, Thailand, and Vietnam to the south, a number of genetically diverse languages have assimilated to Chinese in having monosyllabic roots and tones. When languages converge in this way, we have a Sprachbund (German for ‘language union’), or linguistic area. If languages were classified typologically, then various languages of different genetic provenience would be classified together because of diffusion. Vietnamese is a good example. Historically, it belongs to the Mon-Khmer branch of Austro-Asiatic, but it has been so
influenced by Chinese that not only has it adopted numerous Chinese words, but it has also reduced its own roots to conform to Chinese patterns and it has developed tones as in Chinese. Word order is subject-verb-object, as in Chinese.
Lexicostatistics Linguists are not always in a position to reconstruct the relationship between languages as has been done in the case of Indo-European. Where linguists have been confronted with a number of languages that have not been studied in detail, a common situation outside Europe over the past century, they have resorted to lexicostatistics. The method is very simple. The percentages of common roots are counted using a list of ‘basic’ words. The theory is that basic vocabulary is resistant to borrowing, so that the percentage will give a guide to how closely languages are related. Although it is true that everyday words are less easily borrowed compared to words to do with culture (in the broadest sense), the difference is one of degree. One of the 200-word lists of basic vocabulary that has been used contains the numerals ‘one’, ‘two’, ‘three’, ‘four’, and ‘five’, but these can be borrowed, as in the case of the Tai languages, which have borrowed them from Chinese. The same list also contains ‘animal’, ‘lake’, and ‘mountain’, all of which are borrowings in English, ultimately from Latin. The problem of distinguishing roots that have been borrowed as opposed to those that have been inherited from a protolanguage is even greater when dealing with languages for which no detailed descriptions are available. Nevertheless, lexicostatistics has been widely used in the classification of the languages of various areas, including Africa, the Americas, Australia, and New Guinea. Lexicostatistics does give a good guide to the degree of similarity between languages, and on the basis of the percentages obtained it is possible to draw a hierarchical tree diagram and classify languages in terms of phylum, stock, family, branch, sub-branch, language, and dialect. 
However, there is no guarantee that such a tree diagram reflects the successive breaking up of protolanguages, and the terms family, branch, and sub-branch do not have the same meaning as these terms do when based on the comparative method. Greenberg classified the languages of Africa and the Americas using a form of lexicostatistics. Although his classification of African languages is widely accepted and in general use, his classification of the languages of the Americas is rejected by most scholars. In this classification, all of the languages of the Americas are united in one vast Amerind family, except for Na Dene (mainly in the northwestern part of
North America) and Eskimo-Aleut in the Arctic (Greenberg, 1987).
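The counting step of lexicostatistics described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function name shared_root_percentage is hypothetical, and the match judgements in toy_list are invented for a made-up language pair purely to show the arithmetic; a real study would use a list of about 200 meanings and judgements made by a linguist.

```python
def shared_root_percentage(matches, exclude=()):
    """Percentage of compared meanings whose roots were judged to match.

    matches: dict mapping a meaning (e.g. 'hand') to True or False.
    exclude: meanings to leave out, e.g. items known to be borrowed.
    """
    kept = {m: hit for m, hit in matches.items() if m not in exclude}
    if not kept:
        raise ValueError("no meanings left to compare")
    # True counts as 1 and False as 0 when summed.
    return 100.0 * sum(kept.values()) / len(kept)


# Hypothetical judgements for an invented pair of languages (not real data).
toy_list = {
    "one": True, "two": True, "hand": False, "water": False,
    "animal": True, "lake": True, "mountain": False,
    "eye": False, "stone": False, "fire": False,
}

raw = shared_root_percentage(toy_list)
# Leaving out items suspected of being borrowed changes the figure,
# which is the weakness of the method noted in the text.
filtered = shared_root_percentage(
    toy_list, exclude=("animal", "lake", "mountain")
)
print(f"all items: {raw:.1f}%  without suspected loans: {filtered:.1f}%")
```

The gap between the two figures shows why undetected borrowing matters: the same word list can suggest quite different degrees of relatedness depending on which matches are treated as inherited.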
Beyond the Language Family As mentioned previously, there can be various degrees of resemblance between language families and the levels of relationship can be quantified lexicostatistically and described in terms of stock and phylum. But besides hypotheses of wider relationships based purely on lexicostatistics, there are hypotheses about possible relationships between families using standard techniques of reconstruction or mixtures of standard methodology and lexicostatistics. The Nostratic hypothesis is one of the boldest and most controversial approaches; largely the work of Aharon Dolgopolsky and Vladimir Illich-Svitych, the hypothesis claims that there is a macrofamily consisting of Indo-European, Semitic, Berber, Kartvelian, Uralic, Altaic, Korean, Japanese, and Dravidian (Dolgopolsky, 1998). Other work includes that of Paul Benedict, who proposed an Austro-Tai family combining Hmong-Mien (Miao-Yao), the Tai-Kadai (or Daic) family, and Austronesian. Joseph Greenberg considered that these three recognized families plus Austroasiatic form an Austric family (Ruhlen, 1991: 152–156).
Pidgins and Creoles
Where people find themselves in contact but without a common language, a ‘pidgin’ develops, which is a simplified form of language. The pidgin usually combines elements from more than one language, but in most cases the bulk of the lexicon comes from one particular language. A number of pidgins developed in the context of European colonial expansion from the 15th to the 19th centuries in places where workers, often slaves, from different language backgrounds were faced with an unfamiliar European language and in many cases unfamiliar languages of fellow workers. Where later generations learned these pidgins as their native language, the pidgins expanded into full languages. Such languages are known as ‘creoles’. In terms of classification, pidgins and creoles do not lend themselves to the hierarchical taxonomy wherein each language has a single ancestor. However, they tend to be identified in terms of which language supplies most of the vocabulary. The list of the pidgins and creoles included in this work, given in Table 1, shows the main source of the lexicon and where the pidgin or creole is, or was, spoken.
Isolates
A number of languages appear to belong to no family, though in many cases they are presumably remnants of families. The following languages are examples:
• Ainu (spoken in Japan)
• Burushaski (spoken in northern Pakistan)
• Basque (spoken in the Pyrenees)
• Elamite (an extinct language of southwestern Iran; it has been claimed to be related to the Dravidian languages of southern India)
• Japanese and the Ryukyuan dialects (the latter spoken in the Ryukyu Islands of Japan)
• Ket (spoken in the Yenisei Basin, Siberia)
• Korean
• Nivkh (spoken in eastern Siberia, including Sakhalin Island)
• Sumerian (extinct language of Mesopotamia with records from the 3rd millennium B.C.)
• Yukaghir (spoken in eastern Siberia).
For most of these languages, hypotheses are put forward from time to time linking them with other languages. A number of scholars include Japanese or Korean, or both, in the Altaic family, and some would include Yukaghir in the Uralic family.
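Table 1 Pidgins and creoles: main source of the lexicon and where spoken
Pidgin or creole | Main source of lexicon | Where spoken
Gullah | English | Sea Islands of South Carolina
Hawaiian Creole | English | Hawaii, United States
Krio | English | Sierra Leone
Louisiana Creole | French | Louisiana, United States
Mobilian Jargon | Choctaw, Chickasaw | Southeastern United States (extinct)
Palenquero | Spanish | Colombia
Papiamento | Portuguese, Spanish | Aruba, Bonaire, Curaçao
Russenorsk | Russian, Norwegian | Arctic (extinct)
Sango | Ngbandi, French | Central African Republic
Sranan | English | Surinam
Tok Pisin | English | New Guinea
Yanito | English, Spanish | Gibraltar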
Status of the Groupings Used in the Classification
This section lists language families and other groupings in alphabetical order, with an indication of the status of each grouping, i.e., whether the label represents a generally accepted family, a controversial family, or a larger entity. It should be noted that while the list covers most of the language families of the world, it is not a complete catalogue of the world’s languages, which total somewhere near 5000.
Afroasiatic Languages
There are various classifications of Afroasiatic languages. The one used here recognizes six families: Ancient Egyptian and its successor, Coptic; Berber (northwest corner of Africa); Chadic (Niger and Chad); Cushitic (Somalia and eastern Sudan); Omotic (southern Ethiopia); and Semitic. Semitic has three branches. The eastern branch is represented by Akkadian, which was spoken in Mesopotamia from the 3rd to the 1st millennium B.C. The southern branch is represented by the Ethiopian languages (Amharic, Tigrinya, and the extinct Ge’ez). The central branch, which is centered around the eastern end of the Mediterranean, includes the dead languages Phoenician, Syriac, and Ugaritic, plus Aramaic, a language in which parts of the Bible are written and which is still spoken; Hebrew, which has been brought back to life as the language of Israel; and Arabic, which, as the language of Islam, has spread over northern Africa and the Middle East.
Altaic Languages
Altaic is a widely, though not universally, accepted language family comprising three branches: Turkic, Mongolic, and Tungusic (the last represented in this work by Evenki). The Turkic languages, which include Turkish, extend from the Balkans through Turkey and across central Asia to Siberia. The Mongolic languages are centered on Mongolia and the Tungusic languages on Siberia and northern China. If Altaic is rejected as a family, then we have three separate families rather than three branches of a family. These languages are typologically similar in that they are agglutinative, and they represent the classic SOV type, with SOV word order, postpositions, and preposed genitives. Some linguists would include Japanese and/or Korean in the Altaic family.
Australian Languages
The languages of the Australian mainland look as if they are related, but no detailed reconstruction of a protolanguage has been undertaken and it is unlikely that such a reconstruction will be possible. These languages have been classified lexicostatistically, i.e., by counting percentages of common vocabulary. This classification currently recognizes about a score of lexicostatistical families, with one of them, Pama-Nyungan, covering most of the mainland. Some genetic groupings are recognizable within Pama-Nyungan, and some of the other lexicostatistical families can be shown to be true families, such as the Tangkic family, which includes Kayardild, and West Barkly, which includes Wambaya. Tiwi is the sole member of the Tiwian family. Dixon (2002: 674) suggests that the similar-looking Daly group of languages (represented in this work by Ngan’gityemerri) is an areal group rather than a genetic one. Records of the extinct Tasmanian languages consist almost entirely of amateur word lists. These show very few resemblances to the languages of the mainland. Joseph Greenberg classified the Tasmanian languages, the Papuan languages, and the languages of the Andaman Islands in an Indo-Pacific phylum (Ruhlen, 1991). This grouping has been disregarded by almost all other linguists.
Austroasiatic Languages
The Austroasiatic classification comprises two branches: the Munda languages of northeast India, which include Santali, and the more scattered Mon-Khmer branch, which includes Mon (southeastern Myanmar (Burma)), Khmer (or Cambodian, the official language of Cambodia), Khasi (northeast India), Wa (southwest Yunnan, China), and Vietnamese. Vietnamese is interesting from the point of view of classification. It has been so influenced by Chinese that, as well as borrowing large numbers of Chinese words, it has reduced the form of roots and developed tones, so that the language looks like a Chinese language.
Austronesian Languages
The Austronesian language family contains over 1000 languages. In the most widely used classification, there are four branches: Paiwanic, Tsouic, Atayalic, and Malayo-Polynesian. The first three are the indigenous languages of Taiwan and are collectively known as the Formosan languages. The extra-Formosan languages, which are assumed to have emanated from Taiwan, make up the Malayo-Polynesian branch, which is spread from Madagascar in the western Indian Ocean, where Malagasy is spoken, to Easter Island in the eastern Pacific. Oversimplifying somewhat, we can consider there are three sub-branches: western, which takes in the languages of the Philippines, Indonesia, and Malaysia as well as Malagasy and Hawaiian; central, represented in this work by the Flores languages and Malukan languages; and Oceanic, which covers languages such as Fijian, Maori, Samoan, Tahitian, and Tongan.
Caucasian Languages
The languages of the Caucasus comprise a South Caucasian or Kartvelian family, represented here by Georgian, and the North Caucasian languages, with a northwestern branch, represented here by Abkhaz, and a northeastern branch, represented by Lak. It is not quite certain that the northwestern and northeastern branches are branches of a single family, and it is even more uncertain that the South Caucasian and North Caucasian families form a genetic group, but the label ‘Caucasian languages’ is useful since the two groups share some features and are all quite distinct from surrounding languages.
Chukotko-Kamchatkan Languages
This is a small family of languages spoken on the Chukotka and Kamchatka peninsulas of Siberia.
Dravidian Languages
This language family is concentrated in southern India. Some branches are recognizable. Dravidian proper includes Gondi, Kurukh, and Telugu; the southern branch includes Kannada, Malayalam, Tamil, and Toda; and the northwestern branch includes Brahui.
Eskimo-Aleut
The Eskimo-Aleut language family has two primary branches. The Aleut branch is spoken in the Aleutian Islands and the Eskimo languages are found in Siberia, Alaska, Canada, and Greenland. The latter branch is represented here by Inupiaq and West Greenlandic.
Indo-European
Indo-European is the most widely studied of all language families and has a well-articulated sub-grouping based on the comparative method, though details of the classification are subject to dispute from time to time. This family of languages contains a number of branches containing a single language (or group of dialects), namely, Albanian, Armenian, Hellenic (Greek), and two dead languages, records of which came to light only in the 20th century. One dead language is Hittite, which was spoken in Anatolia (modern Turkey). There are records of Hittite from the latter part of the second millennium B.C. The other dead language, Tocharian, the easternmost Indo-European language, was spoken in what is now the Xinjiang province of western China. There are records of Tocharian from the period 500–700 A.D. Among other branches are the following Indo-European languages:
• Baltic contains Lithuanian and Latvian. The earliest records of Slavic are in Old Church Slavonic and date from the 11th and 12th centuries. Modern Slavic languages include Polish, Sorbian, Czech, and Slovak (western sub-branch); Bulgarian, Macedonian, Slovene, and the ‘Serbian-Croatian-Bosnian complex’ (southern sub-branch); and Russian, Belorussian, and Ukrainian (eastern sub-branch). Some linguists would classify Baltic and Slavic as sub-branches of a Balto-Slavic branch.
• Celtic is usually divided into two sub-branches: the Brythonic sub-branch, which contains Breton, Cornish, Welsh, and possibly Pictish, about which little is known, and the Goidelic sub-branch, which contains Scots Gaelic.
• Germanic contains three sub-branches. The eastern sub-branch is represented by the extinct Gothic; the northern sub-branch, by the Scandinavian languages (Danish, Icelandic, Norwegian, Swedish); and the western sub-branch, by German (including High German, Yiddish, and Low German), Frisian, Dutch and its South African derivative, Afrikaans, and various forms of English, including Scots.
• Indo-Iranian is a large branch containing two large sub-branches, Indo-Aryan (or Indic) and Iranian. Indo-Aryan covers Sanskrit, the language of the Hindu sacred texts; Pali, the language of the Hinayana Buddhist canon; plus Bengali, the Dardic languages, Dhivehi, Domari, Gujarati, Hindi, Hindustani, Kashmiri, Lahnda, Marathi, Nepali, Punjabi, Sindhi, Sinhala, and Urdu, all of which are spoken in India, Pakistan, and Bangladesh, plus Romani, the language of scattered Gypsy communities. Iranian covers Avestan, the language of the Zoroastrian scriptures, plus Bactrian, Baluchi, Chorasmian, Khotanese, Kurdish, Ossetic, Pahlavi (Middle Persian), Pashto, Persian, Sogdian, and Tajik.
• Italic contains a number of extinct languages of Italy, one of which, Latin, was spread via the political dominance of Rome. The descendants of Latin, known collectively as the Romance Languages, include several national languages (French, Italian, Portuguese, Romanian, and Spanish) as well as Catalan (northeastern Spain), Galician (northwestern Spain), Jerriais (Jersey), Occitan (southern France), Rhaeto-Romance (eastern Switzerland and northeastern Italy), and Sardinian.
Khoesan Languages
The Khoesan group of languages is spoken by the Khoekhoe and San peoples of southern Africa. The group is often described as having three branches, but the branches are probably separate families. Two languages of northern Tanzania, Hadza and Sandawe, are also included in the group in most reference works, but it is not clear that they are genetically related to any of the southern families.
Languages of the Americas
As mentioned in the preceding section on lexicostatistics, Joseph Greenberg classified all of the languages of the Americas in one vast Amerind family, except for Na-Dene (mainly in the northwestern part of North America) and Eskimo-Aleut in the Arctic. This classification is generally rejected and most scholars would recognize some scores of separate families in Greenberg’s Amerind, though allowing that some of these can be grouped into stocks. We have followed a widespread convention of breaking up the languages of the Americas into three geographical regions: North America, Central America, and South America. This is largely to reduce a very large area to manageable chunks. We have considered Eskimo-Aleut separately from the languages of the Americas since it is not confined to North America.
Languages of North America
• The Algonquian languages are found in the eastern part of North America and westward into Alberta and Montana, and the Ritwan languages (Wiyot and Yurok) are found in northern California. Mithun (1999: 327) recognizes Eastern Algonquian, Central and Plains Algonquian, and Ritwan as branches of an Algic family. Algonquian is represented in this work by Cree and Mitchif. Mitchif is a creole, but, unlike most creoles, it did not arise from a pidgin. It retains the complex verbal morphology of Cree, and noun phrases show distinctions of number, gender, and definiteness, as in French.
• The Caddoan language family belongs to the Great Plains of the midwestern United States.
• The Hokan group of languages is centered in California. It is not established that these languages form a family. Among the Hokan languages is the Pomoan family of northern California.
• The Iroquoian language family of southeastern Canada and the eastern United States is represented in this work by Oneida (Northern Iroquoian) and Cherokee (Southern Iroquoian).
• The Keres language consists of a number of dialects spoken in New Mexico.
• The Muskogean language family of the southeastern United States includes Choctaw (Mississippi) and Creek (Alabama and Georgia).
• The Na-Dene language family includes Tlingit, Eyak, and the large Athapaskan branch. Most of these languages belong to Alaska and western Canada, but there is an enclave of Athapaskan in the southwest of the United States. Navajo (Navaho) is spoken in Arizona, New Mexico, and Utah.
• The Penutian group of languages or stock belongs to the west of North America, from British Columbia to California.
• Languages of the Salishan family are spoken in British Columbia and the northwest of the United States.
• The Siouan family of languages covered a vast area of the Great Plains and included Crow, Lakota, and Omaha-Ponca.
• The Wakashan language family is mainly from Vancouver Island, British Columbia, and is represented by Nuuchahnulth (Nootka).
Languages of Central America
• Languages of the Chibchan family are spoken in Nicaragua, Costa Rica, Panama, western Colombia, and Ecuador, and the Paezan languages are spoken in Colombia.
• The Mayan family of languages is spoken in southeastern Mexico and Guatemala.
• The Misumalpan language family is found in western Honduras and western Nicaragua.
• The Mixe-Zoquean language family is found in southern Mexico.
• The Oto-Manguean language family, represented here by Zapotecan, is found in southern Mexico.
• The Uto-Aztecan language family is found mainly in the southwest of the United States and Mexico, but extends as far north as Idaho. This family includes Cupeño, Hopi, Tohono O’odham, and Nahuatl, the language of the Aztec civilization.
Languages of South America
• The most widely spoken native language of South America is Quechua. It is spoken in Peru, Ecuador, and Bolivia, extending north into Colombia and south into northern Chile and northwestern Argentina. It shares similarities with Aymará and the two are sometimes grouped in an Andean family, but this is not generally accepted, since it is not agreed whether the resemblances are genetic or arise from contact.
• The large Arawak language family is widespread, ranging from Honduras in Central America to Brazil in South America, and formerly to Paraguay and Argentina. The Arawak language in this work is Tariana, of Brazil.
• The large Carib language family is found in Brazil and the countries of South America north of Brazil.
• The Choco language family is found in Colombia and Panama.
• The languages of the Panoan family are found in Peru and neighboring parts of Bolivia and Brazil.
• Macro-Jê is a grouping of languages that have been considered to be related to the Jê family. These languages are located in Brazil.
• The Mapudungan language is spoken in Chile and Argentina. It has no clear genetic affiliation.
• The Tucanoan language family is found in western Brazil and neighboring parts of Colombia, Ecuador, and Peru.
• The Tupian language family is located in Brazil. The Tupí-Guaraní sub-group is also found in Brazil, but various members of the sub-group are found in Bolivia, Paraguay, and Argentina. Guaraní is an official language of Paraguay, along with Spanish.
Niger-Congo Languages
This is a very large language family, with about 1000 members. It is spread over southern Africa. There are various classifications, including some that are hierarchical with several levels. We have adopted a flat classification with eight branches:
• The Kordofanian group of languages is spoken in Sudan. In some classifications, a Niger-Kordofanian family is recognized, with Kordofanian and Niger-Congo as the primary branches.
• The Atlantic Congo language sub-group is located in the far west of Africa from Liberia to Senegal. It includes Fula and Wolof.
• Languages of the Kru sub-group are spoken in Ivory Coast and Liberia.
• The Mande language sub-group is found from Senegal to Burkina Faso (Upper Volta) and Ivory Coast.
• The Gur (Voltaic) language sub-group is spoken in Mali, Burkina Faso, and Ghana, and extends east into Nigeria. In some classifications, Dogon is not assigned to any branch; in others, it is assigned to the Gur sub-branch.
• The Kwa sub-group of languages extends from Liberia to Nigeria.
• The Benue-Congo language sub-group covers a very large part of southern Africa. This branch includes Efik, Yukuben, and Mambila. The very large Bantu language group, which includes Kikuyu, Kinyarwanda, Nyanja, Shona, Swahili, Xhosa, and Zulu, is a sub-group of Benue-Congo and hence a sub-sub-group of Niger-Congo.
• The Adamawa-Ubangi language sub-group is spoken in a band running across Africa from Nigeria to Sudan.
Nilo-Saharan
The languages of the Nilo-Saharan family are found mainly in northeastern and north-central Africa. They include Dinka, Kanuri, Luo, and the Songhay languages.
Papuan Languages
The label ‘Papuan’ has no genetic significance. It is defined negatively as the non-Austronesian languages of New Guinea and surrounding islands. It covers about 750 languages in New Guinea and another 50 or so on neighboring islands from Timor to the Solomons. These languages can be classified into 23 families and 10 isolates. One very large family, the Trans-New Guinea family, covers most of New Guinea and is also found on some of the neighboring islands. It contains a number of branches, including the Madang languages. Other families include Sepik, represented in this work by Manambu of the Ndu subgroup, Skou, Torricelli, and West Papuan. Also included in this work is an article on several of the Papuan languages of the central Solomons.
Sino-Tibetan
The Sino-Tibetan languages include the Sinitic family and Tibeto-Burman. Sinitic can be equated with Chinese, but Chinese is popularly understood to be a single language, whereas in fact it is more like a family of languages, one of which, Mandarin Chinese, is the standard, based largely on the Beijing dialect. Tibeto-Burman takes in a number of genetically related languages, including Tibetan and Burmese, but there is no consensus about the details of the classification. Whether Tibeto-Burman and Sinitic are genetically related is not agreed, but there are some apparent cognates.
Tai Languages
The Tai, or Daic, language family is centered in Laos and Thailand and includes the national languages of these two countries, Lao (or Laotian) and Thai. The family is also represented in Burma, southern China, northern Vietnam, and on Hainan Island in the Gulf of Tonkin. Lao and Thai are mutually comprehensible. A purely linguistic classification would recognize a chain of Tai dialects across the two countries that included the national languages.
Uralic Languages
The Uralic languages are a family of languages spoken in northeastern Europe, extending across northern Russia into northwestern Siberia. There are two major branches, the Samoyed branch, represented in this work by Nenets, spoken in northern Russia, and Finno-Ugric, which includes Estonian, Finnish, and Saami (spoken in northern Norway, Sweden, and Finland), as well as Hungarian, the national language of Hungary, which is separated from the rest of the family. Some would include Yukaghir in the Uralic family; others would combine Uralic and Altaic into a larger family.
See also: There are separate articles on each of the languages and language families shown in the classification appended to this article. There is also an alphabetical list of all the language articles in the classified index.
Language Classification
Afroasiatic Languages
  Ancient Egyptian and Coptic
  Berber Languages
  Chadic Languages: Hausa
  Cushitic Languages: Highland East Cushitic Languages; Oromo; Somali
  Omotic Languages: Wolaitta
  Semitic Languages
    Eblaite
    Eastern: Akkadian
    Central: Arabic; Aramaic; Hebrew, Biblical and Jewish; Hebrew, Israeli; Jewish languages; Maltese; Phoenician and Punic; Syriac; Ugaritic
    Southern: Ethiopian Semitic Languages; Amharic; Ge’ez; Tigrinya
Altaic Languages
  Mongolic Languages
  Tungusic Languages: Evenki
  Turkic Languages: Azerbaijanian; Bashkir; Chuvash; Kazakh; Kirghiz; Tatar; Turkish; Turkmen; Uygur; Uzbek; Yakut
Australian Languages
  Pama-Nyungan: Arrernte; Gamilaraay; Guugu Yimidhirr; Jiwarli; Kalkutungu; Kaytetj; Morrobalama; Pitjantjatjara; Warlpiri
  Daly: Ngan’gityemerri
  Tangkic: Kayardild
  Tiwian: Tiwi
  West Barkly: Wambaya
Austroasiatic Languages
  Mon-Khmer Languages
    Northern: Khasi; Vietnamese; Wa
    Eastern: Khmer
    Southern: Mon
  Munda Languages: Santali
Austronesian Languages
  Formosan Languages
  Malayo-Polynesian Languages
    Western: Balinese; Bikol; Cebuano; Hawaiian; Hiligaynon; Ilocano; Javanese; Kapampangan; Madurese; Malagasy; Malay (Malaysian and Indonesian); Niuean; North Philippine Languages; Riau Indonesian; Samar-Leyte; South-Philippine Languages
    Central: Flores Languages; Malukan Languages
    Oceanic: Fijian; Maori; Tahitian; Tamambo; Tongan; Vures
Caucasian Languages: Abkhaz; Georgian; Lak
Chukotko-Kamchatkan Languages
Dravidian Languages: Brahui; Kannada; Kurukh; Malayalam; Tamil; Telugu; Toda
Eskimo-Aleut: Inupiaq; West Greenlandic
Indo-European Languages
  Albanian
  Anatolian Languages: Hittite
  Armenian
  Balto-Slavic Languages
    Baltic Languages: Latvian; Lithuanian
    Slavic Languages: Belorussian; Bulgarian; Church Slavonic; Czech; Macedonian; Old Church Slavonic; Polish; Russian; ‘Serbian-Croatian-Bosnian Linguistic Complex’; Slovak; Slovene; Sorbian; Ukrainian
  Celtic Languages: Breton; Cornish; Pictish; Scots Gaelic; Welsh
  Germanic Languages: Afrikaans; Danish; Dutch; English, Early Modern; English: African American Vernacular; English: Middle English; English, Later Modern; English: in the present day; English: Old English; English, World; German; Germanic Languages; Gothic; Luxembourgish; Norwegian; Old Icelandic; Scots; Swedish; Yiddish
  Hellenic: Greek, Ancient; Greek, Modern
  Indo-Iranian Languages
    Indo-Aryan Languages: Bengali; Dardic (Kashmiri); Dhivehi; Domari; Gujarati; Hindi; Hindustani; Lahnda; Marathi; Nepali; Pali; Punjabi; Romani; Sanskrit; Sindhi; Sinhala; Urdu
    Iranian Languages: Avestan; Bactrian; Baluchi; Chorasmian; Khotanese; Kurdish; Ossetic; Pahlavi; Pashto; Persian, Modern; Persian, Old; Sogdian; Tajik
  Italic Languages
    Latin
    Romance Languages: Catalan; Franglais; French; Galician; Italian; Jerriais; Occitan; Portuguese; Rhaeto-Romance; Romanian; Spanish
  Tocharian
Khoesan Languages: Khoesan Languages
Languages of the Americas
  Languages of North America
    Algonquian and Ritwan Languages: Cree; Mitchif
    Caddoan Languages
    Hokan Languages: Pomoan Languages
    Iroquoian Languages: Oneida; Cherokee
    Keres
    Muskogean Languages: Choctaw; Creek
    Na-Dene Languages: Navaho
    Penutian Languages
    Salishan Languages
    Siouan Languages: Crow; Lakota; Omaha-Ponca
    Wakashan Languages: Nuuchahnulth
  Languages of Central America
    Chibchan and Paezan Languages
    Mayan Languages
    Misumalpan
    Mixe-Zoquean Languages
    Oto-Manguean Languages: Zapotecan
    Totonacan Languages
    Uto-Aztecan Languages: Cupeño; Hopi; Nahuatl; Tohono O’odham
  Languages of South America
    Andean Languages: Aymará; Quechua
    Arawak Languages: Tariana
    Cariban Languages
    Choco Languages
    Chibchan (see Central America)
    Macro-Jê Languages
    Mapudungan
    Panoan
    Piraha
    Tucanoan Languages
    Tupian Languages: Guarani
Niger-Congo Languages
  Kordofanian Languages
  Mande Languages
  Atlantic Congo Languages: Fula; Wolof
  Dogon
  Gur Languages
  Kru Languages
  Adamawa-Ubangi
  Kwa Languages: Akan; Ewe; Yoruba
  Benue-Congo Languages
    Efik
    Mambila
    Bantu Languages: Gikuyu; Kinyarwanda; Luganda; Nyanja; Shona; Swahili; Xhosa; Zulu; Southern Bantu Languages
Nilo-Saharan Languages: Dinka; Kanuri; Luo
Papuan Languages
  Central Solomon Languages
  Sepik Languages: Manambu
  Skou Languages
  Torricelli Languages
  Trans New Guinea Languages: Madang Languages
  West Papuan Languages
Pidgins and Creoles: Bislama; Cape Verdean Creole; Fanagalo; Gullah; Hawaiian Creole; Krio; Louisiana Creole; Mobilian Jargon; Palenquero; Papiamento; Russenorsk; Sango; Tok Pisin; Tsotsi Taal; Yanito
Sino-Tibetan Languages
  Sinitic Languages: Chinese
  Tibeto-Burman Languages: Burmese; Tibetan
Tai Languages: Lao; Thai
Uralic Languages: Estonian; Finnish; Hungarian; Nenets; Saami
Language Isolates and Languages of Disputed Affiliation: Ainu; Basque; Burushaski; Elamite; Japanese; Ryukyuan; Ket; Korean; Nivkh; Sumerian; Yukaghir
Artificial Languages: Esperanto
Language Classification: The Languages of the World; Austric hypothesis; Austro-Tai hypothesis; Ethnologue; SIL
Bibliography
Comrie B (ed.) (1987). The world’s major languages. London: Croom Helm.
Comrie B (1981). The languages of the Soviet Union. Cambridge: Cambridge University Press.
Dixon R M W (2002). The languages of Australia. Cambridge: Cambridge University Press.
Dixon R M W & Aikhenvald A Y (eds.) (1999). The Amazonian languages. Cambridge: Cambridge University Press.
Dolgopolsky A (1998). The Nostratic macrofamily and linguistic paleontology. Cambridge: McDonald Institute for Archaeological Research.
Greenberg J H (1987). Language in the Americas. Stanford: Stanford University Press.
Grimes B F (2000). Ethnologue: languages of the world (14th edn.). Dallas: Summer Institute of Linguistics (http://www.ethnologue.com).
Mithun M (1999). The languages of native North America. Cambridge: Cambridge University Press.
Ruhlen M (1991). A guide to the world’s languages. Volume 1: classification (rev. edn.). Stanford: Stanford University Press.
Suárez J A (1983). The Mesoamerican Indian languages. Cambridge: Cambridge University Press.
Classification of Text, Automatic
F Sebastiani, Università di Padova, Padova, Italy
© 2006 Elsevier Ltd. All rights reserved.
Introduction
In the last two decades, the production of textual documents in digital form has increased exponentially, due to the increased availability of inexpensive hardware and software for generating digital text (e.g., personal computers, word processors) and for digitizing textual data not in digital form (e.g., scanners, optical character recognition software). As a consequence, there is an ever-increasing need for mechanized solutions for organizing the vast quantity of digital texts that are being produced, with an eye toward their future use. The design of such solutions has traditionally been the object of study of information retrieval (IR), the discipline that, broadly speaking, is concerned with the computer-mediated access to data with poorly specified semantics.
There are two main directions for providing convenient access to a large, unstructured repository of text:
• Providing powerful tools for searching for relevant documents within this repository. This is the aim of text search (see Document Retrieval, Automatic), a subdiscipline of IR concerned with building systems that take as input a natural language query and return, as a result, a list of documents ranked according to their estimated degree of relevance to the user’s information need. Nowadays the tip of the iceberg of text search is represented by Web search engines (see Web Searching), but commercial solutions for the text search problem were being delivered decades before the very birth of the Web.
• Providing powerful tools for turning this unstructured repository into a structured one, thereby easing storage, search, and browsing. This is the aim of text classification (TC), a discipline at the crossroads of IR, machine learning (ML), and (statistical)
Classification of Text, Automatic 457 Language Classification (cont.) Tai Languages Lao Thai Uralic Languages Estonian Finnish Hungarian Nenets Saami Language isolates and Languages of disputed affiliation Ainu Basque Burushaski Elamite Japanese Ryukyuan Ket Korean Nivkh Sumerian Yukaghir Artifical Languages Esperanto Language Classification The Languages of the World Austric hypothesis Austro-Tai hypothesis Ethnologue SIL
Bibliography Comrie B (ed.) (1987). The world’s major languages. London: Croom Helm. Comrie B (1981). The languages of the Soviet Union. Cambridge: Cambridge University Press. Dixon R M W (2002). The languages of Australia. Cambridge: Cambridge University Press. Dixon R M W & Aikhenvald A Y (eds.) (1999). The Amazonian languages. Cambridge: Cambridge University Press. Dolgopolsky A (1998). The Nostratic macrofamily and linguistic paleontology. Cambridge: McDonald Institute for Archaeological Research. Greenberg J H (1987). Language in the Americas. Stanford: Stanford University Press. Grimes B F (2000). Ethnologue: languages of the world (14th edn.). Dallas: Summer Institute of Linguistics (http://www.ethnologue.com). Mithun M (1999). The languages of native North America. Cambridge: Cambridge University Press. Ruhlen M (1991). A guide to the world’s languages. Volume 1: classification (rev. edn.). Stanford: Stanford University Press. Sua´rez J A (1983). The Mesoamerican Indian languages. Cambridge: Cambridge University Press.
Classification of Text, Automatic F Sebastiani, Universita` di Padova, Padova, Italy ! 2006 Elsevier Ltd. All rights reserved.
Introduction In the last two decades, the production of textual documents in digital form has increased exponentially, due to the increased availability of inexpensive hardware and software for generating digital text (e.g., personal computers, word processors) and for digitizing textual data not in digital form (e.g., scanners, optical character recognition software). As a consequence, there is an ever-increasing need for mechanized solutions for organizing the vast quantity of digital texts that are being produced, with an eye toward their future use. The design of such solutions has traditionally been the object of study of information retrieval (IR), the discipline that, broadly speaking, is concerned with the computer-mediated access to data with poorly specified semantics.
There are two main directions for providing convenient access to a large, unstructured repository of text:

• Providing powerful tools for searching for relevant documents within this repository. This is the aim of text search (see Document Retrieval, Automatic), a subdiscipline of IR concerned with building systems that take as input a natural language query and return, as a result, a list of documents ranked according to their estimated degree of relevance to the user’s information need. Nowadays the tip of the iceberg of text search is represented by Web search engines (see Web Searching), but commercial solutions for the text search problem were being delivered decades before the very birth of the Web.
• Providing powerful tools for turning this unstructured repository into a structured one, thereby easing storage, search, and browsing. This is the aim of text classification (TC), a discipline at the crossroads of IR, machine learning (ML), and (statistical) natural language processing, concerned with building systems that partition an unstructured collection of documents into meaningful groups (Sebastiani, 2002).

Text Clustering and Text Categorization
There are two main variants of TC. The first is text clustering, which is characterized by the fact that only the desired number of groups (or clusters) is known in advance: no indication as to the semantics of these groups is instead given as input. The second variant is text categorization, whereby the input to the system consists not only of the number of categories (or classes), but also of some specification of their semantics. In the most frequent case, this specification consists in a set of labels, one for each category and usually consisting of a noun or other short natural language expression, and in a set of example labeled texts, i.e., texts whose membership or nonmembership in each of the categories is known. Clustering may thus be seen as the task of finding a latent but as yet undetected group structure in the repository, while categorization can be seen as the task of structuring the repository according to a group structure known in advance. In logical-philosophical terms, we can see clustering as the task of determining both the extensional and intensional level (see Extensionality and Intensionality) of a previously unknown group structure, and categorization as determining the extensional level only of a group structure whose intensional level is known. It is the latter task that will be the focus of this article (text clustering is covered elsewhere in this volume – see Text Mining). From now on we will thus use the expressions ‘text classification’ and ‘text categorization’ interchangeably (abbreviated as TC), and the expression ‘(text) classifier’ to denote a system capable of performing automatic TC. Note that the central notion of TC, that of membership of a document dj in a class ci based on the semantics of dj and ci, is an inherently subjective notion, since the semantics of dj and ci cannot be formally specified. Different classifiers (be they humans or machines) might thus disagree on whether dj belongs to ci. 
This means that membership cannot be determined with certainty, which in turn means that any classifier (be it human or machine) will be prone to misclassification errors. As a consequence, it is customary to evaluate automatic text classifiers by applying them to a set of labeled (i.e., preclassified) documents (a set that here plays the role of a gold standard), so that the accuracy (or effectiveness) of the classifier can be measured by the degree of coincidence between its classification decisions and
the labels originally attached to the preclassified documents.

Single-Label and Multi-Label Text Categorization
TC itself admits of two important variants: single-label TC and multi-label TC. Given as input the set of categories C = {c1, ..., cm}, single-label TC is the task of attributing, to each document dj in the repository, the category to which it belongs. Multi-label TC, instead, deals with the case in which each document dj may in principle belong to zero, one, or more than one category; it thus comes down to deciding, for each category ci in C, whether a given document dj belongs or does not belong to ci. The technologies for coping with single-label and multi-label TC are slightly different (the former problem often being somewhat more challenging), especially concerning the phases of feature selection, classifier learning, and classifier evaluation (see below). In a real application, it is thus of fundamental importance to identify from the beginning whether the application requires single-label or multi-label TC.

Hard or Soft Text Categorization
Taking a binary decision, yes or no, as to whether a document dj belongs to a category ci is sometimes referred to as a ‘hard’ categorization decision. This is the kind of decision taken by autonomous text classifiers, i.e., software systems that need to decide and act accordingly without human supervision. A different type of decision, sometimes referred to as a ‘soft’ categorization decision, consists of attributing a numeric score (e.g., between 0 and 1) to the pair (dj,ci), reflecting the classifier’s degree of confidence that dj belongs to ci. This allows, for instance, ranking a set of documents in terms of their estimated appropriateness for category ci, or ranking a set of categories in terms of their estimated appropriateness for dj. Such rankings are often useful for nonautonomous, interactive classifiers, i.e., systems whose goal is to recommend a categorization decision to a human expert, who is responsible for making the final decision. For instance, in a single-label TC task a human expert in charge of the final classification decision may take advantage of a system that preranks the categories in terms of their estimated appropriateness to a given document dj. Again, the technologies for coping with soft and hard categorization decisions are slightly different, especially concerning the phases of classifier learning and classifier evaluation (see below). In any real-world application, it is thus important to establish from the beginning whether the task requires soft or hard decisions.
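The distinctions drawn in the last two sections can be made concrete with a small sketch. It assumes that a classifier has already produced a soft score between 0 and 1 for each pair (dj,ci); the scores, the category names, and the 0.5 threshold below are invented for illustration and are not taken from any particular system.

```python
from typing import Dict, List

# Hypothetical soft scores for one document dj: the classifier's confidence
# that dj belongs to each category ci. All values are invented.
scores: Dict[str, float] = {"SPORTS": 0.82, "POLITICS": 0.35, "ECONOMY": 0.61}


def rank_categories(scores: Dict[str, float]) -> List[str]:
    """Soft output: categories ordered by decreasing confidence,
    e.g. to be shown to a human expert who makes the final decision."""
    return sorted(scores, key=lambda c: scores[c], reverse=True)


def hard_multi_label(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Hard multi-label decision: an independent yes/no for each category.
    The result may contain zero, one, or several categories.
    The default threshold is an arbitrary assumption."""
    return sorted(c for c, s in scores.items() if s >= threshold)


def hard_single_label(scores: Dict[str, float]) -> str:
    """Hard single-label decision: exactly one category, the top-scoring one."""
    return max(scores, key=lambda c: scores[c])


ranking = rank_categories(scores)      # soft: a ranking for an interactive system
multi = hard_multi_label(scores)       # hard, multi-label
single = hard_single_label(scores)     # hard, single-label
```

Thresholding each category separately and taking the top-scoring category are only two simple ways of turning scores into decisions; how the threshold is chosen is itself an application-dependent design choice.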
Applications

Maron’s seminal paper (Maron, 1961) is usually taken to mark the official birth date of TC, which at the time was called automatic indexing; this name reflected that the main (or only) application that was then envisaged for TC was automatically indexing (i.e., generating internal representations for) scientific articles for Boolean IR systems (see Indexing, Automatic). In fact, since index terms for these representations were drawn from a fixed, predefined set of such terms, we can regard this type of indexing as an instance of TC (where index terms play the role of categories).

The importance of TC increased in the late 1980s and early 1990s with the need to organize the increasingly larger quantities of digital text being handled in organizations at all levels. Since then, frequently pursued applications of TC technology have been newswire filtering, i.e., the grouping, according to thematic classes of interest, of news stories produced by news agencies, thus allowing personalized delivery of information to customers according to their profiles of interest (Hayes and Weinstein, 1990); patent classification, i.e., the organization of patents and patent applications into specialized taxonomies, so as to ease the detection of existing patents related to a new patent application (Fall et al., 2003); and Web page classification, i.e., the grouping of Web pages (or sites) according to the taxonomic classification schemes typical of Web portals (Dumais and Chen, 2000).

The applications above all have a certain thematic flavor, in the sense that categories tend to coincide with topics, or themes. However, TC technology has been applied to real-world problems that are not thematic in nature, among which spam filtering, i.e., the grouping of personal e-mail messages into the two classes LEGITIMATE and SPAM, so as to provide effective user shields against unsolicited bulk mailings (Drucker et al., 1999); authorship attribution, i.e., the automatic identification of the author of a text among a predefined set of candidates (Diederich et al., 2003) (see Authorship Attribution: Statistical and Computational Methods); author gender detection, i.e., a special case of the previous task in which the issue is deciding whether the author of the text is a MALE or a FEMALE (Koppel et al., 2002); genre classification, i.e., the identification of the nontopical communicative goal of the text (such as determining if a product description is a PRODUCTREVIEW or an ADVERTISEMENT) (Stamatatos et al., 2000); survey coding, i.e., the classification of respondents to a survey based on the textual answers they have returned to an open-ended question (Giorgetti and Sebastiani, 2003); or even sentiment classification, as in deciding if a product review is a THUMBSUP or a THUMBSDOWN (Turney and Littman, 2003).

Techniques

Approaches
In the 1980s, the most popular approach to TC was one based on knowledge engineering, whereby a knowledge engineer and a domain expert working together would build an expert system capable of automatically classifying text. Typically, such an expert system would consist of a set of ‘if . . . then . . .’ rules, to the effect that a document was assigned to the class specified in the ‘then’ clause only if the linguistic expressions (typically: words) specified in the ‘if’ part occurred in the document. The drawback of this approach was the high cost in terms of the human effort required for (i) defining the rule set and (ii) maintaining it, i.e., updating the rule set as a result of possible subsequent additions or deletions of classes or as a result of shifts in the meaning of the existing classes. In the 1990s, this approach was superseded by the machine-learning approach, whereby a general inductive process (the learner) is fed a set of example (training) documents preclassified according to the categories of interest. By observing the characteristics of the training documents, the learner may generate a model (the classifier) of the conditions that are satisfied by the documents belonging to the categories considered. This model can subsequently be applied to new, unlabeled documents for classifying them according to these categories. This approach has several advantages over the knowledge engineering approach. First of all, a higher degree of automation is introduced: the engineer needs to build not a text classifier, but an automatic builder of text classifiers (the learner). Once built, the learner can then be used to generate many different classifiers, for many different domains and applications: one only needs to feed it with the appropriate sets of training documents. By the same token, the above-mentioned problem of maintaining a classifier is solved by feeding new training documents appropriate for the revised set of classes.
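The contrast between the two approaches can be sketched in a few lines. The following is a minimal illustration, not a reconstruction of any system cited in this article: the rule format, the toy documents, and the category names are invented, and the learner is a multinomial naive Bayes with add-one smoothing, chosen here only because it is short. The last function measures accuracy as the degree of coincidence between a classifier's decisions and the labels of preclassified documents.

```python
import math
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Set, Tuple

Classifier = Callable[[str], str]   # maps a document to one category label
Labeled = List[Tuple[str, str]]     # (document text, label assigned by a human)


def tokenize(text: str) -> List[str]:
    """Deliberately crude tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def rule_based_classifier(rules: Dict[str, Set[str]], default: str) -> Classifier:
    """Knowledge engineering: hand-written 'if these words occur then class' rules."""
    def classify(text: str) -> str:
        words = set(tokenize(text))
        for label, required in rules.items():
            if required <= words:   # every word of the 'if' part occurs
                return label
        return default
    return classify


def learn(training: Labeled) -> Classifier:
    """Machine learning: the learner induces a classifier from training documents."""
    class_docs = Counter(label for _, label in training)
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    for text, label in training:
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}

    def classify(text: str) -> str:
        best_label, best_score = "", -math.inf
        for label in class_docs:
            total = sum(word_counts[label].values())
            score = math.log(class_docs[label] / len(training))
            for w in tokenize(text):
                if w in vocab:      # words never seen in training are ignored
                    score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
    return classify


def accuracy(classifier: Classifier, labeled: Labeled) -> float:
    """Fraction of documents on which the classifier agrees with the human label."""
    return sum(1 for text, gold in labeled if classifier(text) == gold) / len(labeled)


# Invented toy data.
training = [
    ("wheat prices rise on export demand", "GRAIN"),
    ("farmers expect record wheat harvest", "GRAIN"),
    ("central bank raises interest rates", "MONEY"),
    ("bank lending rates fall again", "MONEY"),
]
held_out = [("wheat export demand", "GRAIN"), ("interest rates fall", "MONEY")]

expert_system = rule_based_classifier(
    {"GRAIN": {"wheat"}, "MONEY": {"bank", "rates"}}, default="OTHER")
learned = learn(training)
```

Changing the set of classes means rewriting the rule dictionary by hand in the first case, but only calling `learn` again on relabeled training documents in the second.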
Many inductive learners are available off the shelf; if one of these is used, the only human effort needed in setting up a TC system is that of manually classifying the documents to be used for training. This task requires less skilled labor than building an expert system, which is also advantageous. It should also be noted that if an organization has previously relied on manual work for classifying documents, then many preclassified documents are already available to be used as training documents when the organization decides to automate the process. Most importantly, the accuracy of classifiers built by ML techniques now often rivals that of human
professionals, and usually exceeds that of classifiers built by knowledge engineering methods. This has brought about an ever wider acceptance of learning methods even outside academia. While for certain applications such as spam filtering a combination of ML and knowledge engineering still lies at the basis of several commercial systems, it is fair to say that in most other TC applications (especially of the thematic type), the adoption of ML technology has been widespread.

Note that the ML approach is especially suited to the case in which no additional knowledge (of a procedural or declarative nature) of the meaning of the categories is available, since in this case the classification rules can be determined only on the basis of knowledge extracted from the training documents. This case is the most frequent one, and is thus the usual focus of TC research. Solutions devised for the case in which no additional knowledge is available are extremely general, since they do not presuppose the existence of, e.g., additional lexicosemantic resources that, in real-life situations, might be either unavailable or expensive to create (see Computational Lexicons and Dictionaries). A further reason why TC research rarely tackles the case of additionally available external knowledge is that these sources of knowledge may vary widely in type and format, thereby making each instance of their application to TC a case of its own, from which any lesson learned can hardly be exported to different application contexts. When in a given application external knowledge of some kind is available, heuristic techniques of any nature may be adopted in order to leverage these data, either in combination with or in isolation from the IR and ML techniques we will discuss here. However, it should be noted that past research has not been able to show any substantial benefit from the use of external resources (such as lexicons, thesauri, or ontologies) in TC.
As previously noted, the meaning of categories is subjective. The ML techniques used for TC, rather than trying to learn a supposedly perfect classifier (a gold standard of dubious existence), strive to reproduce the subjective judgment of the expert who has labeled the training documents, and do this by examining the manifestations of this judgment, i.e., the documents that the expert has manually classified. The kind of learning that these ML techniques engage in is usually called supervised learning, since it is supervised, or facilitated, by the knowledge of the preclassified data.

Learning Text Classifiers
Many different types of supervised learners have been used in TC (Sebastiani, 2002), including probabilistic ‘naive Bayesian’ methods, Bayesian networks,
regression methods, decision trees, Boolean decision rules, neural networks, incremental or batch methods for learning linear classifiers, example-based methods, classifier ensembles (including boosting methods), and support vector machines. While all of these techniques still retain their popularity, it is fair to say that in recent years support vector machines (Joachims, 1998) and boosting (Schapire and Singer, 2000) have been the two dominant learning methods in TC. This seems attributable to a combination of two factors: (i) these two methods have strong justifications in terms of computational learning theory, and (ii) in comparative experiments on widely accepted benchmarks, they have outperformed all other competing approaches. An additional factor that has determined their success is the free availability, at least for research purposes, of well-known software packages based on these methods, such as SVMlight and BoosTexter.

Building Internal Representations for Documents
The learners discussed above cannot operate on the documents as they are, but require the documents to be given internal representations that the learners can make sense of. The same is true of the classifiers, once they have been built by learners. It is thus customary to transform all the documents (i.e., those that are used in the training phase, the testing phase, or the operational phase of the classifier) into internal representations by means of methods used in text search, where the same need is also present (see Indexing, Automatic). Accordingly, a document is usually represented by a vector lying in a vector space whose dimensions correspond to the terms that occur in the training set, and the value of each individual entry corresponds to the weight that the term in question has for the document. In TC applications of the thematic kind, the set of terms is usually made to coincide with the set of content-bearing words (which means all words but topic-neutral ones such as articles, prepositions, etc.), possibly reduced to their morphological roots (stems – see Stemming) so as to avoid excessive stochastic dependence among different dimensions of the vector. Weights for these words are meant to reflect the importance that the word has in determining the semantics of the document it occurs in, and are automatically computed by weighting functions. These functions usually rely on intuitions of a statistical kind, such as (i) the more often a term occurs in a document, the more important it is for that document; and (ii) the more documents a term appears in, the less important it is in characterizing the semantics of a document it occurs in. In TC applications of a nonthematic nature, the opposite is often true. For instance, it is the frequency
of use of articles, prepositions, and punctuation (together with many other stylistic features) that may be a helpful clue in authorship attribution, while it is less likely that the frequencies of use of content-bearing words can be of help (see Computational Stylistics). This shows that choosing the right dimensions of the vector space for the right classification task requires a deep understanding, on the part of the engineer, of the nature of the task.

It is fairly evident from the above discussion that internal representations used in TC applications are, from the standpoint of linguistic analysis, extremely primitive: with the possible exception of applications in sentiment classification (Turney and Littman, 2003), hardly any sophisticated linguistic analysis is usually attempted in order to provide a more faithful rendition of the semantics of the text. This is because previous attempts at applying state-of-the-art natural language processing techniques (including techniques for parsing text robustly (Moschitti and Basili, 2004), extracting collocations (Koster and Seutter, 2003), performing word sense disambiguation (Kehagias et al., 2003), etc.) have not shown any substantial benefit with respect to the basic representations outlined above.

Reducing the Dimensionality of the Vectors
The techniques described in the previous section tend to generate very large vectors, with sizes in the tens of thousands. This situation is problematic in TC, since the efficiency of many learning devices (e.g., neural networks) tends to degrade rapidly with the size of the vectors. In TC applications, it is thus customary to run a dimensionality reduction pass before starting to build the internal representations of the documents. Basically, this means identifying a new vector space in which to represent the documents, with a much smaller number of dimensions than the original one. Several techniques for dimensionality reduction have been devised within TC (or, more often, borrowed from the fields of ML and pattern recognition). An important class of such techniques is that of feature extraction methods (examples of which are term clustering methods and latent semantic indexing – see Latent Semantic Analysis). Feature extraction methods define a new vector space in which each dimension is a combination of some (or all) of the original dimensions; their effect is usually a reduction of both the dimensionality of the vectors and the overall stochastic dependence among dimensions. An even more important class of dimensionality reduction techniques is that of feature selection methods, which do not attempt to generate new terms, but try to select the best ones from the original set. The measure of quality for a term is its expected impact on the accuracy of the resulting classifier. To measure
this, feature selection functions are employed for scoring each term according to this expected impact, so that the highest scoring ones can be retained for the new vector space. These functions mostly come from statistics (e.g., chi-square) or information theory (e.g., mutual information, also known as information gain), and tend to encode (each in its own way) the intuition that the best terms for classification purposes are the ones that are distributed most differently across the different categories.
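The two steps described in this section and the previous one, selecting terms and then weighting them, can be sketched as follows. This is a minimal illustration under stated assumptions: the chi-square score is the standard statistic for a 2x2 contingency table, while the weighting function is the common tf-idf formula (term frequency times the logarithm of inverse document frequency), which is one way of encoding intuitions (i) and (ii) above but is not named in this article. The toy documents and category names are invented.

```python
import math
from typing import List, Tuple

Doc = Tuple[List[str], str]   # (tokens of the document, category label)


def chi_square(term: str, category: str, docs: List[Doc]) -> float:
    """Chi-square statistic of the 2x2 table
    (term present / absent) x (document in category / not in category)."""
    present_in = sum(1 for toks, c in docs if term in toks and c == category)
    present_out = sum(1 for toks, c in docs if term in toks and c != category)
    absent_in = sum(1 for toks, c in docs if term not in toks and c == category)
    absent_out = sum(1 for toks, c in docs if term not in toks and c != category)
    n = len(docs)
    denom = ((present_in + present_out) * (absent_in + absent_out)
             * (present_in + absent_in) * (present_out + absent_out))
    if denom == 0:            # term occurs in every document or in none
        return 0.0
    return n * (present_in * absent_out - present_out * absent_in) ** 2 / denom


def select_terms(docs: List[Doc], category: str, k: int) -> List[str]:
    """Feature selection: keep the k highest-scoring original terms."""
    vocab = {t for toks, _ in docs for t in toks}
    return sorted(vocab, key=lambda t: (-chi_square(t, category, docs), t))[:k]


def tfidf_vector(tokens: List[str], terms: List[str], docs: List[Doc]) -> List[float]:
    """Represent one document as a vector of weights over the retained terms."""
    n = len(docs)
    vector = []
    for t in terms:
        tf = tokens.count(t)                              # intuition (i)
        df = sum(1 for toks, _ in docs if t in toks)      # intuition (ii)
        vector.append(tf * math.log(n / df) if df else 0.0)
    return vector


# Invented toy collection.
docs = [
    (["wheat", "price", "wheat", "the"], "GRAIN"),
    (["wheat", "harvest", "the"], "GRAIN"),
    (["bank", "rate", "the"], "MONEY"),
    (["bank", "price", "the"], "MONEY"),
]
kept = select_terms(docs, "GRAIN", k=2)
vector = tfidf_vector(docs[0][0], kept, docs)
```

On this toy collection the terms retained are the two that separate the categories perfectly, whereas a term occurring in every document scores zero under both functions.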
Challenges

TC, especially in its ML incarnation, is today a fairly mature technology that has delivered working solutions in a number of application contexts. Interest in TC has grown exponentially in the last 10 years, from researchers and developers alike.

For IR researchers, this interest is one particular aspect of a general movement toward leveraging user data for taming the inherent subjectivity of the IR task, i.e., taming the fact that it is the user, and only the user, who can say whether a given item of information is relevant to a query she has issued to a Web search engine, or relevant to a private folder of hers in which documents should be filed according to content. Wherever there are predefined classes, documents previously (and manually) classified by the user are often available; as a consequence, these latter data can be exploited for automatically learning the (extensional) meaning that the user attributes to the categories, thereby reaching accuracy levels that would be unthinkable if these data were unavailable.

For ML researchers, this interest arises because TC applications prove to be a challenging benchmark for their newly developed techniques, since these applications usually feature extremely high-dimensional vector spaces and provide large quantities of test data. In the last 5 years, this has resulted in more and more ML researchers adopting TC as one of their benchmark applications of choice, which means that cutting-edge ML techniques are being applied to TC with minimal delay after their original invention.

For application developers, this interest is mainly due to the enormously increased need to handle larger and larger quantities of documents, a need emphasized by increased connectivity and availability of document bases of all types at all levels in the information chain.
But this interest also results from TC techniques having reached accuracy levels that often rival the performance of trained professionals, levels that can be achieved with high efficiency on standard hardware and software resources. This means that more and more organizations are automating all their activities that can be cast as TC tasks. Still, a number of challenges remain for TC research.
The first and foremost challenge is to deliver high accuracy in all application contexts. While highly effective classifiers have been produced for application domains such as the thematic classification of professionally authored text (such as newswire stories), in other domains reported accuracies are far from satisfactory. Such contexts include the classification of Web pages (where the use of text is more varied and obeys rules different from the ones of linear verbal communication), spam filtering (a task which has an adversarial nature, in that spammers adapt their spamming strategies so as to circumvent the latest spam filtering technologies), authorship attribution (where current technology is not yet able to tackle the inherent stylistic variability among texts written by the same author), and sentiment classification (which requires much more sophisticated linguistic analysis than classification by topic).

A second important challenge is to bypass the document labeling bottleneck, i.e., to tackle the facts that labeled documents for use in the training phase are not always available, and that labeling (i.e., manually classifying) documents is costly. To this end, semisupervised methods have been proposed that allow building classifiers from a small sample of labeled documents and a (usually larger) sample of unlabeled documents (Nigam and Ghani, 2000; Nigam et al., 2000). However, the problem of learning text classifiers mainly from unlabeled data is unfortunately still open.

See also: Authorship Attribution: Statistical and Computational Methods; Computational Lexicons and Dictionaries; Computational Stylistics; Document Retrieval, Automatic; Extensionality and Intensionality; Indexing, Automatic; Latent Semantic Analysis; Stemming; Text Mining; Web Searching.
Bibliography

Diederich J, Kindermann J, Leopold E & Paaß G (2003). ‘Authorship attribution with support vector machines.’ Applied Intelligence 19(1/2), 109–123.
Drucker H, Vapnik V & Wu D (1999). ‘Support vector machines for spam categorization.’ IEEE Transactions on Neural Networks 10(5), 1048–1054.
Dumais ST & Chen H (2000). ‘Hierarchical classification of Web content.’ In Belkin NJ, Ingwersen P & Leong M-K (eds.) Proceedings of SIGIR-00, 23rd ACM International Conference on Research and Development in Information Retrieval. Athens: ACM Press. 256–263.
Fall CJ, Törcsvári A, Benzineb K & Karetka G (2003). ‘Automated categorization in the International Patent Classification.’ SIGIR Forum 37(1), 10–25.
Giorgetti D & Sebastiani F (2003). ‘Automating survey coding by multiclass text categorization techniques.’ Journal of the American Society for Information Science and Technology 54(12), 1269–1277.
Hayes PJ & Weinstein SP (1990). ‘CONSTRUE/TIS: a system for content-based indexing of a database of news stories.’ In Rappaport A & Smith R (eds.) Proceedings of IAAI-90, 2nd Conference on Innovative Applications of Artificial Intelligence. Menlo Park: AAAI Press. 49–66.
Joachims T (1998). ‘Text categorization with support vector machines: learning with many relevant features.’ In Nédellec C & Rouveirol C (eds.) Proceedings of ECML-98, 10th European Conference on Machine Learning. Lecture Notes in Computer Science series, no. 1398. Heidelberg: Springer Verlag. 137–142.
Kehagias A, Petridis V, Kaburlasos VG & Fragkou P (2003). ‘A comparison of word- and sense-based text categorization using several classification algorithms.’ Journal of Intelligent Information Systems 21(3), 227–247.
Koppel M, Argamon S & Shimoni AR (2002). ‘Automatically categorizing written texts by author gender.’ Literary and Linguistic Computing 17(4), 401–412.
Koster CH & Seutter M (2003). ‘Taming wild phrases.’ In Sebastiani F (ed.) Proceedings of ECIR-03, 25th European Conference on Information Retrieval. Pisa: Springer Verlag. 161–176.
Maron M (1961). ‘Automatic indexing: an experimental inquiry.’ Journal of the Association for Computing Machinery 8(3), 404–417.
Moschitti A & Basili R (2004). ‘Complex linguistic features for text classification: a comprehensive study.’ In McDonald S & Tait J (eds.) Proceedings of ECIR-04, 26th European Conference on Information Retrieval Research. Lecture Notes in Computer Science series, no. 2997. Heidelberg: Springer Verlag. 181–196.
Nigam K & Ghani R (2000). ‘Analyzing the applicability and effectiveness of co-training.’ In Agah A, Callan J & Rundensteiner E (eds.) Proceedings of CIKM-00, 9th ACM International Conference on Information and Knowledge Management. McLean: ACM Press. 86–93.
Nigam K, McCallum AK, Thrun S & Mitchell TM (2000). ‘Text classification from labeled and unlabeled documents using EM.’ Machine Learning 39(2/3), 103–134.
Schapire RE & Singer Y (2000). ‘BoosTexter: a boosting-based system for text categorization.’ Machine Learning 39(2/3), 135–168.
Sebastiani F (2002). ‘Machine learning in automated text categorization.’ ACM Computing Surveys 34(1), 1–47.
Stamatatos E, Fakotakis N & Kokkinakis G (2000). ‘Automatic text categorization in terms of genre and author.’ Computational Linguistics 26(4), 471–495.
Turney PD & Littman ML (2003). ‘Measuring praise and criticism: inference of semantic orientation from association.’ ACM Transactions on Information Systems 21(4), 315–346.
Relevant Websites

http://svmlight.joachims.org – SVMlight website.
http://www.research.att.com/~schapire/BoosTexter/ – BoosTexter website.
http://www.math.unipd.it/~fabseb60 – F. Sebastiani’s website.
Classifiers and Noun Classes: Semantics
A Y Aikhenvald, La Trobe University, Bundoora, Australia
© 2006 Elsevier Ltd. All rights reserved.
Almost all languages have some grammatical means for the linguistic categorization of nouns and nominals. The continuum of noun categorization devices covers a range of devices, from the lexical numeral classifiers of Southeast Asia to the highly grammaticalized gender agreement classes of Indo-European languages. They have a similar semantic basis, and one can develop from the other. They provide a unique insight into how people categorize the world through their language in terms of universal semantic parameters involving humanness, animacy, sex, shape, form, consistency, and functional properties. Noun categorization devices are morphemes that occur in surface structures under specifiable conditions, and denote some salient perceived or imputed characteristics of the entity to which an associated noun refers (Allan, 1977: 285). They are restricted to classifier constructions, morphosyntactic units (e.g., noun phrases of different kinds, verb phrases, or clauses) that require the presence of a particular kind of morpheme, the choice of which is dictated by the semantic characteristics of the referent of the nominal head of a noun phrase. Noun categorization devices come in various guises. We distinguish noun classes, noun classifiers, numeral classifiers, classifiers in possessive constructions, and verbal classifiers. Two relatively rare types are locative and deictic classifiers. They share a common semantic core and differ in the morphosyntactic contexts of their use and in their preferred semantic features.
Noun Classes

Some languages have grammatical agreement classes based on such core semantic properties as animacy, sex, and humanness, and sometimes also shape. The number of noun classes (also known as genders, or gender classes) varies – from two, as in Portuguese or French, to 10 or so, as in Bantu, or even to several dozen, as in some languages of South America. Noun classes can to a greater or lesser extent be semantically transparent, and their assignment can be based on semantic, morphological, and/or phonological criteria. They are realized through agreement with a modifier or the predicate outside the noun itself. Examples (1) and (2), from Portuguese, illustrate masculine and feminine genders, which are marked on the noun itself and on the accompanying article and adjective.

(1) o                  menin-o          bonit-o
    ARTICLE:MASC.SG    child-MASC.SG    beautiful-MASC.SG
    ‘the beautiful boy’

(2) a                  menin-a          bonit-a
    ARTICLE:FEM.SG     child-FEM.SG     beautiful-FEM.SG
    ‘the beautiful girl’
The cross-linguistic properties of noun classes are the following:

1. There is a limited, countable number of classes.
2. Each noun in the language belongs to one (or sometimes more than one) class.
3. There is always some semantic basis to the grouping of nouns into gender classes, but languages vary in how much semantic basis there is. This usually includes animacy, humanness and sex, and sometimes also shape and size.
4. Some constituent outside the noun itself must agree in gender with a noun. Agreement can be with other words in the noun phrase (adjectives, numbers, demonstratives, articles, etc.) and/or with the predicate of the clause, or an adverb.

In some languages there is a marker of noun class on every noun; in some languages nouns bear no marker. Noun class systems are typically found in languages with a fusional or agglutinating (not an isolating) profile. Languages often have portmanteau morphemes combining information about noun class with number, person, case, etc.

The semantics of noun classes in the languages of the world involves the following parameters:

• Sex: feminine vs. masculine, as in many Afroasiatic languages, in East-Nilotic, and in Central Khoisan
• Human vs. nonhuman, as in some Dravidian languages of India
• Rational (humans, gods, demons) vs. nonrational, as in Tamil and other Dravidian languages
• Animate vs. inanimate, as in Siouan, from North America

The term neuter is often used to refer to irrational, inanimate gender or to a residue gender with no clear semantic basis. Languages can combine these parameters. Zande and Ma (Ubangi, Niger-Congo) distinguish masculine, feminine, nonhuman animate, and inanimate. Godoberi (Ghodoberi) (Northeast-Caucasian) has feminine, masculine, and nonrational genders.
464 Classifiers and Noun Classes: Semantics
Primarily sex-based genders can have additional shape- and size-related meanings. In languages of the Sepik region of New Guinea, feminine is associated with short, wide, and round, and masculine with long, tall, and narrow objects (e.g., Ndu family; Alamblak). Feminine is associated with small size and diminutives in Afroasiatic and East-Nilotic languages; masculine includes long, thick, solid objects. Hollow, round, deep, flat, and thin objects are feminine in Kordofanian and Central Khoisan languages (Heine, 1982: 190–191). Unusually large objects are feminine in Dumo, a Sko language from New Guinea (see the summary in Aikhenvald, 2000: 277). In some languages, most nouns are assigned to just one noun class; in other languages, different noun classes can be chosen to highlight a particular property of a referent. Manambu, a Ndu language from the Sepik area, has two genders. The masculine gender includes male referents, and feminine gender includes females. But the gender choice depends on other factors and can vary: if the referent is exceptionally long, or large, it is assigned masculine gender; if it is small and round, it is feminine. Rules for the semantic assignment of noun classes can be more complex. The Australian language Dyirbal (Dixon, 1972: 308–312) has four noun classes. Three are associated with one or more basic concepts: Class I – male humans, nonhuman animates; Class II – female humans, water, fire, fighting; Class III – nonflesh food. Class IV is a residue class covering everything else. There are also two rules for transferring gender membership. By the first, an object can be assigned to a gender by its mythological association rather than by its actual semantics. Birds are classed as feminine by mythological association, since women’s souls are believed to enter birds after death. 
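The Dyirbal principles described so far (basic concepts per class, a residue class, and transfer by mythological association) can be read as a small rule system. The following Python sketch is only an illustration under stated assumptions: the concept labels, the function name `assign_class`, and the `myth_concept` parameter are hypothetical and introduced here; they are not Dixon's terminology, and the sketch covers only the rules mentioned up to this point.

```python
from typing import Optional

# Basic concepts associated with Dyirbal Classes I-III, as listed in the text.
# The string labels are illustrative assumptions, not Dixon's own terms.
BASIC_CONCEPTS = {
    "I": {"male human", "nonhuman animate"},
    "II": {"female human", "water", "fire", "fighting"},
    "III": {"nonflesh food"},
}


def assign_class(concept: str, myth_concept: Optional[str] = None) -> str:
    """Return a noun class label for a referent.

    concept: the referent's actual semantic concept.
    myth_concept: hypothetical parameter; when given, the mythological
    association overrides the actual semantics (first transfer rule).
    """
    effective = myth_concept if myth_concept is not None else concept
    for noun_class, concepts in BASIC_CONCEPTS.items():
        if effective in concepts:
            return noun_class
    return "IV"  # residue class: everything not covered above


# A man is Class I by actual semantics.
print(assign_class("male human"))
# Birds are animate, but their mythological association with women's
# souls transfers them to Class II.
print(assign_class("nonhuman animate", myth_concept="female human"))
# Anything without a listed basic concept falls into the residue class.
print(assign_class("unlisted concept"))
```

The design choice worth noting is that the mythological association is checked before the actual semantics, which mirrors the statement that an object can be assigned "by its mythological association rather than by its actual semantics."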
The second transfer rule is that if a subset of a certain group of objects has a particular important property, e.g., being dangerous, it can be assigned to a different class from the other nouns in that group. Most trees without edible parts belong to Class IV, but stinging trees are placed in Class II. A typical gender system in Australian languages contains four terms that can be broadly labeled as masculine, feminine, vegetable, and residual (Dixon, 2002: 449–514). Andian (Northeast Caucasian) languages have a special noun class for insects, and Bantu languages for places (also see Corbett, 1991).

The degree of semantic motivation for noun classes varies from language to language. Noun classes in Bantu languages constitute an example of a semantically opaque system. Table 1 summarizes a basic semantic grid common to Bantu noun class systems (Spitulnik, 1989: 207) based on the interaction of shape, size, and humanness.

Table 1 Noun classes in Bantu

Class    Semantics
1/2      Humans, a few other animates
3/4      Plants, plant parts, foods, nonpaired body parts, miscellaneous
5/6      Fruits, paired body parts, miscellaneous inanimates
7/8      Miscellaneous inanimates
9/10     Animals, miscellaneous inanimates, a few humans
11/10    Long objects, abstract entities, miscellaneous inanimates
12/13    Small objects, birds
6        Masses
14       Abstract qualities, states, masses, collectives
15       Infinitives

However, these
parameters provide only a partial semantic motivation for the noun classes in individual Bantu languages. (In the Bantuist tradition, every countable noun is assigned to two classes: one singular and one plural.) In modern Bantu languages, however, noun class assignment is often much less semantically motivated, though the semantic nucleus is still discernible. Thus, in Babungo, Class 1/2 is basically human; however, it is a much bigger class than it was in Proto-Bantu, and also contains many animals, some birds and insects, body parts, plants, and household and other objects, e.g., necklace, pot, book, rainbow (Schaub, 1985: 175). Shape and size also appear as semantic parameters: in ChiBemba, class 7/8 is associated with large size and carries pejorative overtones, while class 12/13 includes small objects and has overtones of endearment (also see Denny, 1976; Aikhenvald, 2000: 281–283).

In a seminal study, Zubin and Köpcke (1986) provided a semantic rationale for the gender assignment of nouns of different semantic groups in German. Masculine and feminine genders mark the terms for male and female adults of each species of domestic and game animals (following the natural sex principle), and neuter is assigned to non-sex-specific generic and juvenile terms. Masculine gender is used for types of cloth, for precipitation and wind, and for minerals. Disciplines and types of knowledge have feminine gender, and games and types of metal (with the exception of alloys) have neuter gender. This is contrary to a common assumption that there is no real semantic basis for gender assignment in the well-known Indo-European languages.

Noun class assignment is typically more opaque for inanimates and for nonhuman animates than for humans and high animates. In the Australian language Bininj Gun-Wok (Evans, 2003: 185–199) masculine class includes male humans, the names of certain malevolent beings mostly associated with
the sky, items associated with painting (a male activity), and also some mammals, some snakes, and some birds and fish. Feminine class includes female humans, and also some reptiles, fish, and birds. Vegetable class includes all terms for nonflesh foods, but also a few bird names. Finally, the neuter, or residue, class is the most semantically heterogeneous: it includes items that do not fit into other classes, e.g., most body parts, generic terms for plants, and terms for various inanimate objects.

In Jingulu (Pensalfini, 2003: 159–168) nouns divide into four classes, only some of which are more or less semantically transparent. The vegetable class mostly includes objects that are long, thin, or pointed. This class happens to include most vegetables, as well as body parts such as the colon, penis, and neck; instruments such as spears, fire drills, and barbed wire; natural phenomena such as lightning and rainbows; and roads and trenches. The feminine class includes female humans and higher animates, and also words for axes, the sun, and most smaller songbirds. The semantic content of the remaining two classes, masculine and neuter, is much harder to define: masculine is mostly used for the rest of animates and neuter for the rest of inanimates, except that flat and/or rounded inanimates (such as most trees and eggs, and body parts such as the liver and the brow) are masculine.
Noun Classifiers

Noun classifiers categorize the noun with which they co-occur and are independent of any other element in a noun phrase or in a clause. They are often independent words with generic semantics. Thus, in Yidiny, an Australian language, one would not generally say: ‘the girl dug up the yam’; it is more felicitous to include generics and say ‘the person girl dug up the vegetable yam’ (Dixon, 1982: 185), as in (3). Classifier constructions are in square brackets.

(3) [mayi           jimirr]   [bama-al        yaburu-Ngu]  julaal
    vegetable+ABS  yam+ABS   CL:PERSON-ERG   girl-ERG     dig-PAST
    ‘The person girl dug up the vegetable yam’
Not every noun in a language necessarily takes a noun classifier, and a noun may occur with more than one classifier. In Minangkabau, a Western Austronesian language from Sumatra, different noun classifiers may be used with the same noun to express different meanings, e.g., batang limau (CL:TREE lemon) ‘lemon-tree’, buah limau (CL:FRUIT lemon) ‘lemon-fruit.’ In this respect they are similar to derivation-like devices. The choice of a noun classifier is predominantly semantic, based on social status, function, and nature, and also on physical properties, e.g., shape. But in some cases the semantic link between a noun classifier and a noun is not obvious. In most languages of the Daly area in Australia, honey takes the noun classifier for flesh food. The choice of noun classifier in Jacaltec, a Mayan language from Guatemala, is often obscured by extension through perceptual analogy; for instance, ice is assigned to the rock class (see Craig, 1986: 275–276).

Noun classifiers are found in numerous Australian languages, in Western Austronesian languages, in Tai languages, and in Mayan languages (Aikhenvald, 2000). In Yidiny (Australian) (Dixon, 1977: 480 ff.; 1982: 192 ff.), a language with 20 noun classifiers, these are of two kinds:

• Inherent nature classifiers divide into humans (waguja ‘man,’ bunya ‘woman,’ and a superordinate bama ‘person,’ as in [3]); fauna (jarruy ‘bird,’ man gum ‘frog,’ munyimunyi ‘ant’); flora (jugi ‘tree,’ narra ‘vine’); natural objects (buri ‘fire,’ walba ‘stone,’ jabu ‘earth’); and artefacts (gala ‘spear,’ bundu ‘bag,’ baji ‘canoe’).
• Function classifiers are minya ‘edible flesh food,’ mayi ‘edible nonflesh food,’ bulmba ‘habitable,’ bana ‘drinkable,’ wirra ‘movable,’ gugu ‘purposeful noise.’

A distinction between flesh and nonflesh food is typical for Australian languages with noun classifiers (Dixon, 2002: 454–459).

Noun classifiers for humans often involve social functions. In Mayan languages of the Kanjobalan branch, as in Jacaltec, humans are classified according to their social status, kinship relation, or age. Mam has classifiers for men and women; for young and old men and women; for old men and women to whom respect is due; and for someone of the same status as the speaker. There is also a classifier for babies, and just one nonhuman classifier. In Australian languages, noun classifiers that refer to social status include such distinctions as initiated man.
Murrinh-Patha (Australian) (Walsh, 1997: 256) has a classifier for Aboriginal people (which also covers human spirits) and another for non-Aboriginal people, which includes all other animates. Nouns with nonhuman, or inanimate, referents are classified in terms of inherent nature-based properties from the natural domains of human interaction: animals, birds, fish, plants, water, fire, minerals, and artefacts. Individual systems may vary. There is often a general term for birds and fish, as in Minangkabau (Western Austronesian); while Ngan'gityemerri (Australian) and Akatek (Mayan) have a generic noun classifier for animals. Classifiers in
Murrinh-Patha, from Australia, cover fresh water and associated concepts, flowers and fruits of plants, spears, offensive weapons, fire and things associated with fire, time and space, and speech and language, and there is a residue classifier.

There is usually a noun classifier for culturally important concepts. Mayan languages have a noun classifier for corn, a traditionally important crop, and for domesticated dogs, while Daly languages, in northern Australia, have classifiers for spears, digging sticks, and spear throwers.

Noun classifiers often have to be distinguished from generic nouns. In Yidiny, a test for what can be used as a classifier is provided by the way interrogative-indefinite pronouns are used: there is one that means ‘what generic?’ and another meaning ‘generic being known, what specific?’ Another decisive criterion is how obligatory the classifiers are, and whether it is possible to formulate explicit rules for their omission.

Incipient structures superficially similar to noun classifiers can be found in Indo-European languages. In English it is possible to use a proper name together with a descriptive noun phrase, such as that evil man Adolf Hitler, but this type of apposition is rather marked and used to achieve rhetorical effect. Lexicosyntactic mechanisms of this kind may well be a historical source of noun categorization devices. Noun classifiers should be distinguished from derivational components in class nouns, such as berry in English strawberry, blackberry, etc., which show limited productivity and a high degree of lexicalization, and which are restricted to a closed subclass of noun roots.
Numeral Classifiers

Numeral classifiers are morphemes that only appear next to a numeral, or a quantifier; they may categorize the referent of a noun in terms of its animacy, shape, and other inherent properties. Uzbek, a Turkic language, has 14 numeral classifiers. A classifier for humans is shown in (4). Inanimate objects are classified by their form, as shown in (5) (Beckwith, 1998).

(4) bir  nafar     âdam
    one  CL:HUMAN  person
    ‘one person’

(5) bir  bâs             karâm
    one  CL:HEAD.SHAPED  cabbage
    ‘one (head of) cabbage’
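The numeral-classifier-noun pattern in (4) and (5) can be pictured as filling a fixed slot between the numeral and the noun. The sketch below is a toy illustration only: the two dictionaries contain nothing beyond the forms in examples (4) and (5), the category labels are assumptions made for the example, and the function name `numeral_phrase` is hypothetical. It does not describe the other Uzbek classifiers.

```python
# Toy lexicon limited to the forms in examples (4) and (5).
# Category labels ("human", "head-shaped") are illustrative assumptions.
CLASSIFIER_BY_CATEGORY = {
    "human": "nafar",        # CL:HUMAN
    "head-shaped": "bâs",    # CL:HEAD.SHAPED
}

NOUN_CATEGORY = {
    "âdam": "human",         # 'person'
    "karâm": "head-shaped",  # 'cabbage'
}


def numeral_phrase(numeral: str, noun: str) -> str:
    """Build a numeral + classifier + noun sequence.

    The classifier is chosen from the noun's semantic category and
    placed in the slot between numeral and noun, as in (4) and (5).
    """
    category = NOUN_CATEGORY[noun]
    classifier = CLASSIFIER_BY_CATEGORY[category]
    return f"{numeral} {classifier} {noun}"


print(numeral_phrase("bir", "âdam"))
print(numeral_phrase("bir", "karâm"))
```

The two-step lookup (noun to category, category to classifier) reflects the point that the classifier is selected by a property of the referent rather than listed separately for each noun.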
Numeral classifiers are relatively frequent in isolating languages of Southeast Asia; in the agglutinating North Amazonian languages of South America; in Japanese, Korean, and Turkic; and in the fusional Dravidian and Indic languages.
In a language with a large set of numeral classifiers, the way they are used often varies from speaker to speaker, depending on the speaker’s social status and competence (Adams, 1989). In this (and in the ways they are acquired by children), they are much more similar to the use of lexical items than to a limited set of noun classes. Each noun in the language does not have to be associated with a numeral classifier. Some nouns take no classifier at all; and some nouns take more than one classifier, depending on which property of the noun’s referent is in focus. Numeral classifiers are always determined by the semantics of the noun referent. Typical semantic parameters are animacy, physical properties (such as dimensionality, shape, consistency, nature), functional properties (e.g., object with a handle), and arrangement (e.g., bunch). There can also be specific classifiers for culturally important items, e.g., canoe, house. A few languages (e.g., Kana, a Cross-River language from Nigeria, and a number of New Guinea languages) (Aikhenvald, 2000: 287–288) have no classifier for animates or humans: when counted, these are classified by shape or by function. For instance, a human is assigned to a class of vertically positioned or elongated objects. A typical problem with numeral classifiers concerns differentiating between sortal classifiers, which just characterize a referent, and mensural classifiers, which contain information about how the referent is measured. As Ahrens (1994: 204) put it, classifiers can classify only a limited and specific group of nouns, while measure words can be used as a measure for a wide variety of nouns. Almost every language, whether it has numeral classifiers or not, has quantifiers, the choice of which may depend on the semantics of the noun. This often depends on whether the noun referent is countable or not. 
For instance, in English much is used with noncountable nouns, and many with countable nouns; other languages have just one word covering ‘much’ and ‘many.’ The choice of quantifying expressions may also depend on the properties of the referent noun; for instance, in English we include head in five head of cattle, stack in three stacks of books, flock in two flocks of birds, and so on. These quantifying expressions are not numeral classifiers, because they do not fill an obligatory slot in the numeral-noun construction, but are instead used in a type of construction that is also employed for other purposes. For instance, quantifier constructions in English three head of cattle are in fact a subtype of genitive constructions. This is the main reason that English is not a numeral classifier language. The quantifiers also have a lexical meaning of their own.
Classifiers in Possessive Constructions
Classifiers in possessive constructions are of three kinds. Relational classifiers categorize the ways in which noun referents relate to, or can be manipulated by, the possessor: whether they are to be eaten, drunk, worn, etc. They tend to occur in languages that distinguish alienable and inalienable possession. In Fijian (Lichtenberk, 1983: 157–158), different classifiers are used to categorize kava as something one is going to drink, as in (6), or as something one has grown or is going to sell, as in (7).

(6) na       me-qu            yaqona
    ARTICLE  CL:DRINKABLE-my  kava
    ‘my kava (which I intend to drink)’

(7) na       no-qu          yaqona
    ARTICLE  CL:GENERAL-my  kava
    ‘my kava (that I grew, or that I will sell)’

Oceanic languages typically have from two to five relational classifiers, while Kipeá-Karirí, an extinct Macro-Jê language from Brazil, had 12. Categorization of the possessive relationship via a relational classifier is based on functional interaction between possessor and possessed. The primary semantic division of referents is into consumable and nonconsumable, as in Fijian, or general and alimentary, as in Manam (Lichtenberk, 1983; Dixon, 1988: 136). Consumable objects can be further classified according to the way in which they are consumed (eaten, drunk, chewed), or prepared (e.g., cooked or roasted). Nonconsumable objects are classified according to how they have been acquired (e.g., found, or received as a gift, as in Kipeá-Karirí). Value is a semantic parameter used in relational classifiers in Oceanic languages. Humans can be classified by their social function, that is, social status or kinship relationship, as in Ponapean, a Micronesian language.

Possessed classifiers characterize a possessed noun itself, based on the physical properties (shape, form, consistency, function) or animacy of its referent, as in Panare (a South American language from the Carib family) (Aikhenvald, 2000: 128), shown in (8).

(8) y-uku-n                 wanë
    1sg-CL:LIQUID-GENITIVE  honey
    ‘my honey (mixed with water for drinking)’

Possessed classifiers can also be in a generic-specific relationship with the noun they categorize (this is similar to the noun classifiers discussed earlier in this article). In some Carib languages, ‘my papaya’ can only be phrased as ‘my fruit papaya,’ as in (9), from Macushí:

(9) u-yekkari          ma'pîya
    1sg-CL:FRUIT.FOOD  papaya
    ‘my papaya’

Generic possessed classifiers are often function-based. Uto-Aztecan languages have possessed classifiers for pets and domesticated plants. Only one language, Dâw (from the Makú family in South America), has possessor classifiers characterizing the possessor in possessive constructions in terms of animacy.

Verbal Classifiers

Verbal classifiers, also called verb-incorporated classifiers, appear on the verb, categorizing a noun, which is typically in S (intransitive subject) or O (direct object) function, in terms of its animacy, shape, size, structure, and position. Example (10), from Waris, a Papuan language of the Border family (Brown, 1981: 96), shows how the classifier put- ‘round object’ is used with the verb ‘get’ to characterize its O argument, coconut, as a round object.

(10) sa       ka-m    put-ra-ho-o
     coconut  1sg-to  VERBAL.CL:ROUND-get-BENEFACTIVE-IMPERATIVE
     ‘Give me a coconut (literally coconut to-me round.one-give)’

Suppletive (or partly analyzable) classificatory verbs are a subtype of verbal classifiers. Classificatory verbs can categorize the S/O argument in terms of its inherent properties (e.g., animacy, shape, form, and consistency), as in Athapascan languages of North America, such as Mescalero Apache, shown in Table 2. Different arrangements of tobacco are reflected in the form of a classificatory verb whose basic meaning is ‘give’ (Rushforth, 1991):

Table 2 Examples of the use of ‘give’ in Mescalero Apache

1. Nát'uhí shán'aa     ‘Give me (a plug of) tobacco’
2. Nát'uhí shánkaa     ‘Give me (a can, box, pack) of tobacco’
3. Nát'uhí shánłtįį    ‘Give me (a bag) of tobacco’
4. Nát'uhí shántįį     ‘Give me (a stick) of tobacco’
5. Nát'uhí shánjaash   ‘Give me (loose, plural) tobacco’

Alternatively, classificatory existential verbs can categorize the S/O argument in terms of its orientation or stance in space, and also in terms of its inherent properties, as in Dakota and Nevome, from North America, and in Papuan languages of the Engan family in the Highlands of New Guinea. In Enga, a verb meaning ‘stand’ is used with referents judged to be tall, large, strong, powerful, standing, or supporting, e.g., men, houses, trees; and ‘sit’ is used with referents judged to be small, squat, horizontal, or weak, e.g., women, possums, ponds.
Cross-linguistically, classificatory verbs tend to belong to the semantic groups of handling, motion, and existence/location. That classificatory verbs should combine reference to inherent properties of referents, and to their orientation, is not surprising. Shape, form, and other inherent properties of objects correlate with their stance in space. Certain positions and states are only applicable for objects of particular kinds; for instance, a tree usually stands, and only liquids can flow. However, classificatory verbs differ from the lexical selection of a verb in terms of physical properties or the position of an object. Most languages have lexical items similar to English drink (which implies a liquid O), or chew (which implies an O of chewable consistency). Unlike these verbs, classificatory verbs make consistent paradigmatic distinctions in the choice of semantic features for their S/O argument throughout the verbal lexicon. In other words, while English distinguishes liquid and nonliquid objects only for verbs of drinking, classificatory verbs provide a set of paradigmatic oppositions for the choice of verb sets depending on the physical properties of all kinds of S/O. Similarly, posture verbs in many languages tend to occur with objects of a certain shape. For instance, in Russian, long, vertical objects usually stand, and long, horizontal ones lie. However, the correlations between the choice of the verb and the physical properties of the object are not paradigmatic; these verbs cannot be considered classificatory.
Locative Classifiers

Locative classifiers occur with locative prepositions and postpositions, and categorize the head noun in terms of its animacy or physical properties, including form and shape. These are found in South American Indian languages of the Carib family, and in Palikur, an Arawak language from Brazil: e.g., pi-wan min (2sg-arm LOC.CL+VERTICAL) ‘on your (vertical) arm’; ah peu (tree LOC.CL+BRANCH.LIKE) ‘on (branchlike) tree’.
Deictic Classifiers

Deictic classifiers occur on deictics within a noun phrase and categorize the noun referent in terms of its inherent properties and position in space, such as horizontal or vertical. They are found in Siouan languages from North America, e.g., Mandan dE-mãk ‘this one (lying)’; dE-nak ‘this one (sitting).’ Nouns are typically classified by their canonical position, which correlates with their shape and extendedness; for instance, in Pilagá (a Guaicuruan language, from Argentina), fire and stones are classified as horizontal, and buildings and animals as sitting.

All noun categorization devices use the same set of core parameters, which include:

• animacy;
• physical properties covering shape and dimensionality (one-, two-, or three-dimensional objects, including long, flat, and round referents) and direction; size; consistency (flexible, hard or rigid, liquid); material (what the object is made of, e.g., clothlike);
• functional properties (to do with specific uses of objects or kinds of action typically performed on them), including social status, which can be considered a subtype of functional categorization;
• arrangement (that is, configuration of objects, e.g., a coil of rope or a bunch).

Different kinds of noun categorization devices prefer different semantic parameters: animacy and humanness are predominant in noun classes, while noun classifiers often categorize referents in terms of their function and social status. Numeral classifiers typically categorize referents by shape (e.g., round or vertical), while verbal classifiers may also involve orientation (vertical or horizontal). Semantic parameters employed in noun categorization systems follow some tendencies. If a language has classifiers for three-dimensional objects, it is likely to also have classifiers for two-dimensional ones. A summary of the preferred semantic parameters for each type of noun categorization device is given in Table 3 (for their cognitive correlates, see also Bisang, 2002). These preferences represent only tendencies. Generic-specific relations are characteristic of noun classifiers, verbal classifiers, and sometimes possessed classifiers, but not of other types (they are rare in numeral classifiers).

The semantic complexity of an individual noun class or classifier varies. Some are semantically simple, e.g., the classifier ‘person’ in Malay and Minangkabau used with all humans. Others undergo semantic extensions, and their choice is less straightforward.
[Table 3: Preferred semantic parameters in classifiers]

Consider the semantic structure of the classifier -hon in Japanese (Matsumoto, 1993: 676–681). In its most common use, it covers saliently one-dimensional objects, e.g., long, thin, rigid objects such as sticks, canes, pencils, candles, trees, dead snakes, and dried fish. It also covers martial arts contests with swords (which are long and rigid), hits in baseball, shots in basketball, Judo matches, rolls of tape, telephone calls, radio and TV programs, letters, movies, medical injections, bananas, carrots, pants, guitars, and teeth. This heterogeneity results from various processes of semantic extension and metonymy.

Extensions can be based on certain rules for transferring class membership, as in Dyirbal (see the section “Noun Classes”). According to these principles, idealized models of the world (for instance, myths and beliefs) can account for other chaining links within the structure of a class. In Dyirbal, birds belong to feminine Class II, because they are believed to be the spirits of dead human females. A further type of extension is the Domain of Experience Principle, which links members thought to be associated with the same experience domain. Thus, fish in Dyirbal belong to Class I, since they are animate, and so do fishing implements, because they are associated with the same activity. These domains are often culture-specific, and subject to change with sociocultural changes. The numeral classifier tay in Korean was originally used with reference to traditional vehicles, and then was extended to introduced European artifacts with wheels. It was further extended to any electric machinery, and to other kinds of machines or instruments, including even the piano.

In Austroasiatic languages, shape parameters in inanimate categorization account for typical semantic extensions of terms for plants and their component parts when employed as classifiers, such as small and roundish (from the word for ‘seed’), round (from ‘fruit’), bulky (from ‘tuber’), flat and sheetlike (from ‘flower,’ ‘leaf,’ ‘fiber’), and long (from ‘stalk,’ ‘stick,’ ‘sprout’) (Conklin, 1981: 341).

An instructive example of prototype-and-extension in a multiple classifier system comes from the classifier tua in Thai (used with numerals, demonstratives, and adjectives). The structure of the category is shown in Figure 1. Arrows indicate extensions from a prototypical member to a less prototypical one (Carpenter, 1987: 45–46). The prototypical referent classified with tua is a four-legged animal, such as a dog or a water buffalo. The classifier extends to include trousers and shirts, due to their shape: trousers are leglike, and shirts have armlike sleeves.
Because of shared function, and the bodylike shape, this classifier also applies to jackets and skirts and even to dresses, underwear, and bathing suits. The general four-legged shape of items of furniture, such as tables and chairs, accounts for their inclusion in the category covered by the classifier tua. Other kinds of furniture were then added because of their shared function with tables and chairs. ‘Letter (of the alphabet)’ in Thai is a compound tua nangseu ‘body book’, so a combination of shape and repetition of the generic compound head caused letters to be classified with tua. Numbers were included either on the basis of shape or by their shared function with letters. Ghosts were included because of their similarity with the two-limbed shape of a human body.

[Figure 1: Structure of the tua category in Thai.]

Semantic extensions of classifiers can be manipulated by language planners. Following an order of King Mongkut issued in 1854, ‘noble’ animals, such as elephants and horses, should be counted without any classifier; the classifier tua could be used only for animals of a ‘lower’ status. In Setswana, a Bantu language with a large set of noun classes, it is now considered politically incorrect to refer to ethnic minorities, such as the Chinese or the Bushmen, using noun class 5/6 (which includes substances, such as dirt or clay, and abstract nouns); all humans have to be referred to with the ‘human’ class 1/2 (see Table 1).

Noun categorization devices are hardly ever semantically redundant. They are often used to distinguish what can be encoded with different lexemes in some languages. For instance, in Burmese a river can be viewed as a place, as a line (on a map), as a section, as a sacred object, or as a connection. These meanings are distinguished through the use of different
numeral classifiers, as shown in Table 4 (Becker, 1975: 113).

Table 4 Categorization of an inanimate noun in Burmese with a classifier

Noun   Numeral   Classifier   Translation
miyi   te        ya           river one place (e.g., destination for a picnic)
miyi   te        tan          river one line (e.g., on a map)
miyi   te        hmwa         river one section (e.g., a fishing area)
miyi   te        sin          river one distant arc (e.g., a path to the sea)
miyi   te        ywE          river one connection (e.g., connecting two villages)
miyi   te        pa           river one sacred object (e.g., in mythology)
miyi   te        khu          river one conceptual unit (e.g., in a discussion of rivers in general)
miyi   te        miyi         river one river (the unmarked case)

In Apache, a plug, a box, a stick, and a bag of tobacco are distinguished through the use of different classificatory verbs. In languages with overt noun class marking, variability in marking noun class on the same root is a way of creating new words. In Bantu languages, such as Swahili, most stems usually occur with a prefix of one class. Prefixes can be substituted to mark a characteristic of an object. M-zee means ‘old person’ and has the human class prefix m-. It can be replaced by ki- (inanimate class) to yield ki-zee ‘scruffy old person’. In Dyirbal, the word ‘man’ can be used with the feminine class marker, instead of masculine, to point out the female characteristics of a hermaphrodite. In Manambu, ‘head’ is usually feminine because of its round shape, but it is treated as masculine when a person has a headache, since then the head feels heavy and unusually big.

We have seen that semantically noun categorization devices are heterogeneous, nonhierarchically organized systems that employ both universal and culture-specific parameters. The ways these parameters work are conditioned and restricted by cognitive mechanisms and the sociocultural environment. Among universal parameters are animacy, humanness, and physical properties, e.g., shape, dimensionality, consistency. Culture-specific parameters can cover certain functional properties and social organization. Classificatory parameters associated with function rather than physical properties are more sensitive to cultural and other nonlinguistic factors. Human categorization, as a sort of ‘social’ function, depends entirely on social structure. Functional categorization of inanimate and nonhuman objects is directly related to cultural notions. Animacy and sex, when extended metaphorically, are influenced by social stereotypes and beliefs. Correlations between the choice of physical properties encoded in classifiers and nonlinguistic parameters are much less obvious. They may relate to the cultural salience of certain shapes or forms, and they may ultimately be based on typical metaphorical extensions.

See also: Cognitive Semantics; Gender, Grammatical; Metaphor and Conceptual Blending; Metaphor: Psychological Aspects; Possession, Adnominal.

Bibliography
Adams K L (1989). Systems of numeral classification in the Mon-Khmer, Nicobarese and Asian subfamilies of Austroasiatic. Canberra: Pacific Linguistics. Ahrens K (1994). ‘Classifier production in normals and aphasics.’ Journal of Chinese Linguistics 22, 203–246. Aikhenvald A Y (2000). Classifiers: a typology of noun categorization devices. Oxford: Oxford University Press. Allan K (1977). ‘Classifiers.’ Language 53, 284–310. Becker A J (1975). ‘A linguistic image of nature: the Burmese numerative classifier system.’ Linguistics 165, 109–121. Beckwith C I (1998). ‘Noun specification and classification in Uzbek.’ Anthropological Linguistics 40, 124–140. Bisang W (2002). ‘Classification and the evolution of grammatical structures: a universal perspective.’ Sprachtypologie und Universalienforschung 55, 289–308. Brown R (1981). ‘Semantic aspects of some Waris predications.’ In Franklin K (ed.) Syntax and semantics in Papua New Guinea languages. Ukarumpa: Summer Institute of Linguistics. 93–123. Carpenter K (1987). How children learn to classify nouns in Thai. Ph.D. diss., Stanford University. Conklin N F (1981). The semantics and syntax in numeral classification in Tai and Austronesian. Ph.D. diss., University of Michigan. Corbett G (1991). Gender. Cambridge: Cambridge University Press. Craig C G (1986). ‘Jacaltec noun classifiers: a study in language and culture.’ In Craig C G (ed.) Noun classes and categorization. Amsterdam: John Benjamins. 263–294. Denny J P (1976). ‘What are noun classifiers good for?’ Papers from the annual regional meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society. 12, 122–132. Dixon R M W (1972). The Dyirbal language of North Queensland. Cambridge: Cambridge University Press. Dixon R M W (1977). A grammar of Yidiny. Cambridge: Cambridge University Press.
Dixon R M W (1982). Where have all the adjectives gone? and other essays in semantics and syntax. Berlin: Mouton. Dixon R M W (1988). A grammar of Boumaa Fijian. Chicago: University of Chicago Press. Dixon R M W (2002). Australian languages: their nature and development. Cambridge: Cambridge University Press. Evans N (2003). Bininj Gun-Wok: a pan-dialectal grammar of Mayali, Kunwinjku and Kune. Canberra: Pacific Linguistics. Heine B (1982). ‘African noun class systems.’ In Seiler H & Lehmann C (eds.) Apprehension: Das sprachliche Erfassen von Gegenständen, Teil I: Bereich und Ordnung der Phänomene. Tübingen: Narr Language Universals Series 1/I. 189–216. Lichtenberk F (1983). ‘Relational classifiers.’ Lingua 60, 147–176. Matsumoto Y (1993). ‘Japanese numeral classifiers: a study on semantic categories and lexical organisation.’ Linguistics 31, 667–713.
Pensalfini R (2003). A grammar of Jingulu, an Aboriginal language of the Northern Territory. Canberra: Pacific Linguistics. Rushforth S (1991). ‘Uses of Bearlake and Mescalero (Athapaskan) classificatory verbs.’ International Journal of American Linguistics 57, 251–266. Schaub W (1985). Babungo. London: Croom Helm. Spitulnik D (1989). ‘Levels of semantic restructuring in Bantu noun classification.’ In Newman P & Botne R D (eds.) Current approaches to African linguistics, vol. 5. Dordrecht: Foris. 207–220. Walsh M (1997). ‘Nominal classification and generics in Murrinhpatha.’ In Harvey M & Reed N (eds.) Nominal classification in Aboriginal Australia. Amsterdam: John Benjamins. 255–292. Zubin D & Köpcke K M (1986). ‘Gender and folk taxonomy: the indexical relation between grammatical and lexical categorization.’ In Craig C G (ed.) Noun classes and categorization. Amsterdam: John Benjamins. 139–180.
Classroom Talk
E Hinkel, Seattle University, Seattle, WA, USA
© 2006 Elsevier Ltd. All rights reserved.
Much classroom activity takes the form of talk. In recent decades, studies of teacher and student spoken language in the classroom have been undertaken from a variety of perspectives in applied linguistics, education, ethnography, and ethnomethodology. In particular, the analyses of talk between the teacher and the students, as well as among students, seek to understand how the spoken language and the discourse of the classroom affect learning (including language learning) and the development of sociocultural affiliation and identity (e.g., Watson-Gegeo, 1997). To a great extent, spoken language and face-to-face interaction constitute the foundational aspects of both teaching and learning at school. Specialists in education and teaching first became interested in the impact of classroom discourse and interaction on students’ learning and the development of cognitive skills in the 1930s and 1940s; since that time, research on classroom talk has moved forward in a number of directions. In the study of language and applied linguistics, classroom talk has been the subject of considerable exploration in discourse, conversation, and text analyses, as well as sociolinguistic and sociocultural features of interaction. The linguistic features of classroom talk were studied intensively in the 1970s and 1980s, when
the uses of language and forms of interaction at school became an important venue in discourse, pragmatic, and literacy studies. Many of the early discourse analyses focused on the linguistic features of talk, narrative structure, common speech acts, their sequences, and the contexts in which they occurred, as well as the flow of classroom speech (e.g., Sinclair and Coulthard, 1975; Stubbs, 1983). As a matter of course, these studies approached classroom talk as occurrences of conversational discourse, without attempting to discern the effect of the language spoken in the classroom on student learning and the educational processes. The analyses of the discourse flow and the language of interaction revealed that classroom talk is highly structured and routinized. Building on the discourse-analytic foundation, the influential work of such sociolinguists and cognitive linguists as Cazden (2001), Gumperz (1982, 1986), Edwards and Mercer (1987), and Edwards and Westgate (1994) employed a combination of methodological perspectives in their explorations of the spoken discourse, language, and the structure of interaction in schooling. In general terms, sociolinguistics takes into account the social contexts and the structure of interaction to determine how they shape the spoken language. Sociolinguistic research methods in the classroom are usually complemented by ethnographic and pragmatic perspectives. Taken together, the findings of these studies have brought to the foreground issues of power, socioeconomic class, culture, and the social construction of experience in
the classroom. Many, if not most, of these investigations point to the common and frequent mismatches between the normative properties of the school language and the language used in students’ families. A number of important and congruent findings have emerged from the study of classroom discourse and spoken language. One prominent thread in research is that a large majority of classroom interactions occur between the teacher and the students, individually or in groups, although some student–student interactions also take place during group or collaborative activities. Investigations carried out in different locations and countries around the world have shown that in classroom interactions, teachers talk approximately 75% of the time, with the remainder divided among the students. This pattern of talk seems to be comparatively consistent and, on the whole, resistant to change, despite the calls for its modification or attempted educational reforms (e.g., van Lier, 1988, 1996; Dysthe, 1996; Nystrand, 1997). Another strand that runs through practically all studies is that classroom talk includes a number of predictable and observable sequences. Much of the classroom spoken language centers around knowledge and information elicitation turns between the teacher and the students, cohesive topical stretches of talk, or exchanges motivated by instructional activity in the classroom. In general terms, teacher–student exchanges reflect the unequal and hierarchical relationship of their participants in teacher-fronted classrooms (Edwards and Westgate, 1994). The typical conversational patterns in such dyadic exchanges proceed along the lines of what has become known as Initiation-Response-Feedback (IRF) (also called Initiation-Response-Evaluation or Question-Answer-Comment), e.g.:
Teacher: So, why did Peter run to the village?
Student: For a joke.
Teacher: Right!
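For readers who annotate transcripts, the three-part pattern in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Turn` record, the `label_irf` helper, and the plain "Teacher"/"Student" role labels are hypothetical names introduced here, not a coding scheme taken from the studies cited, and real transcripts would need far richer criteria than speaker order alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    speaker: str  # role label; "Teacher" or "Student" is assumed for this sketch
    text: str


def label_irf(turns: List[Turn]) -> List[Optional[str]]:
    """Label each Teacher-Student-Teacher run as Initiation, Response, Feedback.

    Turns that do not fall inside such a run are left as None.
    """
    labels: List[Optional[str]] = [None] * len(turns)
    i = 0
    while i + 2 < len(turns):
        roles = [t.speaker for t in turns[i:i + 3]]
        if roles == ["Teacher", "Student", "Teacher"]:
            labels[i:i + 3] = ["Initiation", "Response", "Feedback"]
            i += 3  # move past the completed exchange
        else:
            i += 1  # slide the window by one turn
    return labels


# The exchange quoted in the text.
exchange = [
    Turn("Teacher", "So, why did Peter run to the village?"),
    Turn("Student", "For a joke."),
    Turn("Teacher", "Right!"),
]

for turn, label in zip(exchange, label_irf(exchange)):
    print(f"{label}: {turn.speaker}: {turn.text}")
```

Matching on speaker order alone is a deliberate simplification; it shows only how rigidly the sequence is structured, not how an analyst would decide whether a given teacher turn actually functions as feedback.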
In such routine classroom sequences, the teacher initiates the interaction or asks a question, the student responds or answers the question, and the teacher takes the concluding turn that provides a commentary (e.g., So, Peter was bored) or an evaluation (e.g., Good/Great answer). Spoken language in the classroom is fundamentally different from many other types of talk, such as conversations among peers, coworkers, or family members. Some researchers, such as Mehan (1979) and van Lier (1996), have pointed out that IRF interactions are, by their nature, artificial and constrained and, for this reason, they cannot be analyzed as ordinary conversational discourse that
follows ordinary interactional conventions. In their view, the institutionalized structure of classroom talk is crucially distinct when the teacher nominates topics and speakers, and controls turn-taking and the amount of participant talk. The decades of investigating talk in the classroom have also identified the social, cultural, and behavioral practices that predominate in classroom discourse. Numerous studies carried out in such locations as the United States, the United Kingdom, and Scandinavia have demonstrated that complex systems of sociocultural prescriptions and expectations that exist in the wider society are strongly reflected in the norms of speaking and behaving in the classroom. Influential works by, for example, Heath (1983) and Gee (1990) highlight the pervasive discontinuities between the middle-class linguistic and interactional practices widely adopted in schooling and those in children’s homes. The disparities between the rigidly prescribed, traditional rules of classroom talk and children’s home practices extend to the learning, socialization, and literacy development of the children in racially and linguistically diverse schools. Examples of language and interaction mismatches abound (e.g., Scollon and Scollon, 1981; Brock et al., 1998):
• Spanish-speaking students in the United States are not always familiar with the predominant norms of classroom behavior when students are expected to be quiet while the teacher or another student is speaking.
• Native American students often participate in class conversations collectively, but not individually, as is usually expected in U.S. and Canadian schools.
• Ethnic Chinese students in U.K. and Australian schools rarely speak during requisite classroom activities and strongly prefer to work alone instead of working in groups, where much conversation is required.
In all, a large number of sociolinguistic and ethnographic studies have shown conclusively that the practice of classroom talk and the rigid norms of interaction in schooling represent culturally bound contexts for learning. As an outcome, the learning and literacy development of racial and linguistic minority students can be constrained in the classrooms where the structure of talk and discourse follows sociocultural prescriptions different from those in the students’ communities outside the school. From a different vantage point, research in discourse and conversation analysis, as well as language acquisition, has also shown that classroom talk has numerous important learning, cognitive, and social functions. The most common of these include exposure to language and linguistic input in the form
of, for example, direct instruction, questions and answers, orientations to topics, information elicitations, explanations, hypothesis-making, and using evidence. In the following example of a story-circle discussion, the teacher attempts to elicit more elaborate explanations and evidential support for the students’ essentially accurate appraisal of the story events:
Teacher: Ok, so Laura got a pretty new dress ... A very nice dress. She must have liked it. So, did she like it?
Several students together: Nooooooo.
Teacher: She didn’t? Well, no, she didn’t ... Eh, ok, ... so how do we know that she didn’t?
Sam: She said ... I ... I don’t need it ... it ... the new one. So, she didn’t.
Teacher: Good job, Sam, good thinking ... Laura really didn’t need this dress? Ok, or maybe, she didn’t like it? Can we tell? How can we tell?
In addition to guiding the students to support their conclusion by means of the information in the story, the teacher also uses relatively advanced syntactic constructions, such as must have liked it and a number of complex sentences with noun clauses and negation. More recently, with the increased understanding of learners’ cognitive and linguistic development, investigations of classroom talk have continued to gain importance in language teaching and education of second language and minority students. In many cases, discourse and conversation analyses of classroom talk have also shown that language uses and interactions in educational contexts play an important role in learner language and cognitive development (see, e.g., Edwards and Westgate, 1994; Dysthe, 1996; Seedhouse, 2004). For instance, the uses of lexical and grammatical features in classroom talk have allowed researchers to assess the value of classroom language exposure and input in language learning and the growth of first and second language literacy skills. Among other areas, the uses of display and referential questions in classroom talk have been extensively researched. The purpose of display questions is to elicit information already known to the participant who asks the question, so that the respondent displays knowledge of or familiarity with that information, e.g., “What do we call this thing?” On the other hand, referential questions elicit information that is not known to the speaker, e.g., “Why did you and Mary put this picture before that one?” Studies of referential questions have shown that their educational uses lead to different classroom exchanges that result in significantly longer speech events, higher rates of lexically and syntactically complex responses, and greater opportunities for learner
language use (e.g., Edwards and Mercer, 1987; van Lier, 1996). In-depth investigations of classroom talk have undertaken to gain insight into a large number of sociocultural and linguistic properties of interaction, such as equal and unequal power relationships, some aspects of turn-taking, talk management, and the timing and length of speech events (e.g., Markee, 2000). From the perspective of conversation analysis, classroom interactions have provided a fertile ground for examinations of repair, correction, self-correction, discourse, and face-saving markers in equal and unequal power educational contexts. At present, sociologists, educators, and linguists almost universally recognize that social and cultural institutions of schooling are inseparable from how language and discourse are employed to transmit knowledge and socialize learners (e.g., Watson-Gegeo, 1997; Cazden, 2001).
See also: Conversation Analysis; Identity and Language; Institutional Talk; Language Education: Teacher Preparation; Socialization.
Bibliography
Brock C, McVee M, Shojgreen-Downer A & Dueñas L (1998). ‘No habla inglés: Exploring a bilingual child’s literacy learning opportunities in a predominantly English-speaking classroom.’ Bilingual Research Journal 22, 175–200. Cazden C (2001). Classroom discourse: The language of teaching and learning (2nd edn.). Portsmouth, NH: Heinemann. Dysthe O (1996). ‘The multivoiced classroom: interaction of writing and classroom discourse.’ Written Communication 13(3), 385–425. Edwards A & Westgate D (1994). Investigating classroom talk (2nd edn.). London: Falmer Press. Edwards D & Mercer N (1987). Common knowledge: The development of understanding in the classroom. London: Methuen. Gee J (1990). Social linguistics and literacies: ideology in discourses. New York: Falmer Press. Gumperz J (1982). Language and social identity. Cambridge: Cambridge University Press. Gumperz J (1986). ‘Interactional sociolinguistics in the study of schooling.’ In Cook-Gumperz J (ed.) The social construction of literacy. Cambridge: Cambridge University Press. 229–252. Heath S B (1983). Ways with words: language, life, and work in communities and classrooms. New York: McGraw-Hill. Markee N (2000). Conversation analysis. Mahwah, NJ: Erlbaum. Mehan H (1979). Learning lessons. Cambridge, MA: Harvard University Press.
Nystrand M, Gamoran A, Kachur R & Prendergast C (1997). Opening dialogue: understanding the dynamics of language and learning in the English classroom. New York: Teachers College Press. Scollon R & Scollon S (1981). Narrative, literacy and face in interethnic communication. Norwood, NJ: Ablex. Seedhouse P (2004). The interactional architecture of the language classroom: a conversation analysis perspective. Malden, MA: Blackwell. Sinclair J & Coulthard R M (1975). Towards an analysis of discourse: the English used by teachers and pupils. Oxford: Oxford University Press.
Stubbs M (1983). Discourse analysis: the sociolinguistic analysis of natural language. Chicago: University of Chicago Press. van Lier L (1988). The classroom and the language learner: ethnography and second-language classroom research. London: Longman. van Lier L (1996). Interaction in the language curriculum: awareness, autonomy, and authenticity. London: Longman. Watson-Gegeo K A (1997). ‘Classroom ethnography.’ In Hornberger N H & Corson D (eds.) Encyclopedia of language and education: Research methods in language and education, vol. 8. Dordrecht, The Netherlands: Kluwer Academic. 135–144.
Clause Relations
M Hoey, University of Liverpool, Liverpool, UK
© 2006 Elsevier Ltd. All rights reserved.
The concept of the ‘clause relation’ grew up in a number of places at the same time at the end of the 1960s and at the beginning of the 1970s. Given that there was limited contact between the linguists responsible for the notion, it is unsurprising that each linguist or group of linguists labeled and defined it slightly differently, but a basic shared element in the definitions was that a clause relation was a regularly recurring semantic relationship holding between parts of a text, minimally clauses, that helped account for the organization of the text. One of the places where the concept of the clause relation was developed was at Hatfield Polytechnic, Hatfield, England, where Eugene Winter headed a small group of researchers interested in the ways in which written text may be organized. Winter had worked with M. A. K. Halliday, Richard Hudson, Rodney Huddleston, and Alec Henrici on a corpus-based study of scientific writing, funded by the Office of Scientific and Technical Information and completed in 1968, and it had become apparent to him that some aspects of the ways clauses and sentences interconnected in scientific text could not be explained in terms of the scale-and-category model (an early version of systemic grammar) which they were using. He therefore posited that clauses were systematically related to each other (Huddleston et al., 1968; Winter, 1971) in ways that could be described independently of the grammar (though he was always insistent on the close relationship between grammar and clause relations, and felt that grammars needed to be adapted to take account of this relationship). He defined clause relations as follows:
A clause relation is the cognitive process whereby we interpret the meaning of a sentence or group of sentences in the light of its adjoining sentence or group of sentences. (Winter, 1971 and elsewhere)
It will be noted that, despite its label, a clause relation may hold between sentences and groups of clauses as well as, of course, between clauses. It will also be noted that the definition treats the clause relation as a property of the reader’s processing of the text rather than a property of the text itself, an important difference from the position adopted by others working with the notion. These points will be returned to later. Winter quickly moved to the position that there are two basic kinds of clause relation (Winter, 1971 et seq.): matching relations and logical sequence relations (relabeled sequence relations in Hoey, 1983, in order to accommodate spatial and temporal sequence relationships). Matching relations are characterized by two clauses or groups of clauses being matched for points of similarity and difference in the content. Such relations include compatibility (e.g., I like Mozart and so does my wife), contrast (e.g., I like Mozart but my wife does not), generalization–exemplification (e.g., My wife doesn’t like classical music. For example, she hates Mozart), and preview–detail (e.g., There are three composers I especially like. I really enjoy Mozart, I love Bach and I adore Sibelius). The simplest kind of matching relation is topic maintenance (e.g., Mozart is one of my favorite composers. He was born in Salzburg). Authentic examples of matching relations can be found in the following passage from a 1907 biology primer (A primer of biology and nature study by Randal Mundy):
2.1 A Simple Classification of Plants. – Plants may be divided into two great groups:
Figure 1 A simple analysis of the repetition/replacement pattern in a contrast relation between two pieces of text.
I. Flowerless Plants (Cryptogams).
II. Flowering Plants (Phanerogams).
I. Flowerless Plants comprise the Thallophytes, Bryophytes and Pteridophytes.
1. THALLOPHYTES have a body commonly in the form of a flattened shoot, termed a thallus. They have no true roots and are reproduced by spores. They are thus subdivided:
(a) Algae, e.g., sea-weeds and many fresh water plants, such as Chara.
(b) Funghi, e.g., moulds and mushrooms. No chlorophyll is present.
(c) Lichens, peculiar plants each consisting of an alga and a fungus, living together for their mutual advantage.
2. BRYOPHYTES or moss-like plants consist of a stem and leaves, but have no true roots or fibro-vascular bundles. They are reproduced by spores. Bryophytes are sub-divided into:
(a) Hepatics or Liverworts, e.g., Marchantia and Jungermannia.
(b) Mosses. (See Chapter XXII)
3. PTERIDOPHYTES or fern-like plants possess stem, leaves, roots and fibro-vascular bundles, and are reproduced by spores. They comprise:
(a) Ferns.
(b) Equisetums (Horsetails).
(c) Lycopodiums (Club mosses).
II. Flowering Plants have root, stem, leaves, flowers and fibro-vascular bundles, and are reproduced by seed. They are sub-divided into:
(a) GYMNOSPERMS, with seeds naked, i.e., not enclosed in a cavity (ovary); e.g., Cycads, and Conifers such as the pine, fir and yew.
(b) ANGIOSPERMS, with seeds enclosed in an ovary. They comprise:
i. Monocotyledons (one seed-leaf or lobe), e.g., grasses, rushes, palms and lilies.
ii. Dicotyledons (two seed-lobes), e.g., most trees, shrubs and herbs.
This passage is more or less entirely made up of different kinds of Matching relations. At the uppermost level, the whole passage is organized by a preview-detail relation, with 2.1 A Simple Classification of Plants. – Plants may be divided into two great groups:- I Flowerless Plants (Cryptogams). II Flowering Plants (Phanerogams)
functioning as the preview and the remainder of the passage serving as the detail. Within the paragraphs there are also instances of generalization–example, such as: Algae, e.g., sea-weeds and many fresh water plants, such as Chara.
In this case neither the generalization nor the examples are expressed as full clauses. The contrast relation can be illustrated in the following two pieces of text: BRYOPHYTES or moss-like plants consist of a stem and leaves, but have no true roots or fibro-vascular bundles. PTERIDOPHYTES or fern-like plants possess stem, leaves, roots and fibro-vascular bundles . . .
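The comparison of these two pieces of text can be made concrete with a rough sketch. The code below is only a crude lexical approximation, not the procedure of any analyst named in this article: it treats each piece of text as a set of word forms and reports which forms are shared and which occur in one piece only. The function names `tokens` and `repetition_and_replacement` are hypothetical, and a purely lexical comparison cannot recognize that consist of and possess are near-equivalents.

```python
import re

def tokens(text):
    """Lowercase word forms; hyphenated words such as 'fibro-vascular' stay whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def repetition_and_replacement(first, second):
    """Split the vocabulary of two pieces of text into shared and unshared forms."""
    a, b = set(tokens(first)), set(tokens(second))
    return {
        "repeated": sorted(a & b),        # forms found in both pieces
        "only_in_first": sorted(a - b),   # forms the second piece does not keep
        "only_in_second": sorted(b - a),  # forms new in the second piece
    }

bryophytes = ("BRYOPHYTES or moss-like plants consist of a stem and leaves, "
              "but have no true roots or fibro-vascular bundles.")
pteridophytes = ("PTERIDOPHYTES or fern-like plants possess stem, leaves, "
                 "roots and fibro-vascular bundles")

result = repetition_and_replacement(bryophytes, pteridophytes)
for kind, words in result.items():
    print(kind, words)
```

Even this crude comparison picks out the point of contrast: roots and fibro-vascular bundles are shared by both pieces, while but, have, no and true occur only in the first.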
Winter (1979) noted that pieces of text such as these can be analyzed in terms of what is repeated and what is replaced (see Figure 1). The focus here is on what is replaced rather than on what is repeated, because it is the differences that provide the basis for the classification. In the two pieces of text in Figure 2, however, the focus is on the sameness; this, then, is an instance of a compatibility relation, and it will be seen that the only replacement here is of the topic. Clause relations may function between large chunks of text (the preview–detail relation) or between bits of clauses (the generalization–example relation) or between one or more clauses (the contrast and compatibility relations). In this respect, the name is misleading, implying as it does that the relationships are only or primarily between clauses. Relations may be inferred by the reader or can be signaled by
Figure 2 A simple analysis of the repetition/replacement pattern in a compatibility relation between two pieces of text.
the writer, either by parallel structures and repetition as in the compatibility example or by a special vocabulary of signals (signals in the passage include subdivided, e.g., thus, but, comprise, consist of, i.e., such as), which serve to label both prospectively and retrospectively the relations the writer sees between the chunks of text, as well as by typographical features such as listing and capitalization. As noted above, the other great class of clause relations, according to Winter, is that of the sequence relations. Sequence relations include time sequence (e.g., she washed her face and then put her coat on), cause–effect (e.g., it was cold, so she put her coat on), instrument–purpose (e.g., she put her coat on to protect herself from the cold), instrument–achievement (e.g., she put her coat on and protected herself against the cold), and spatial sequence (e.g., Her coat hung on a clothes-stand in the hall. An old umbrella lay on a table next to it). It will be noticed from the fabricated examples that, as with the matching relations, the relationship may be between clauses that are syntactically integrated, between clauses that are loosely coordinated or between self-standing sentences. Some of the sequence relations are illustrated in the following extract from a biography of Shakespeare published in 1908 (A life of William Shakespeare by Sidney Lee). At the beginning of 1616 Shakespeare’s health was failing. He directed Francis Collins, a solicitor of Warwick, to draft his will, but, though it was prepared for signature on January 25, it was for the time laid aside. On February 10, 1616, Shakespeare’s younger daughter, Judith, married, at Stratford parish church, Thomas Quiney, four years her junior, a son of an old friend of the poet. The ceremony took place apparently without public asking of the banns and before a license was procured. 
The irregularity led to the summons of the bride and bridegroom to the ecclesiastical court at Worcester and the imposition of a fine.
This passage contains a number of sequence relations. First, we can identify a cause–effect (or cause–consequence) relation between the first sentence and the first clause of the second. Likewise we can see a time sequence relation between the second and third sentences, and another between the third sentence, summarized in the first clause of the fourth (the ceremony
took place), and the final clause of the fourth (a license was procured). The fourth sentence, summarized as the noun phrase the irregularity, is the cause and the summons . . . to the ecclesiastical court and the imposition of a fine are its effects (or consequences), themselves in a time sequence relation. As before, relations may be inferred by the reader or can be signaled by the writer; the special vocabulary of signals used in this passage comprises before and led to, as well as the use of dates in the second and third sentences. Winter’s work was picked up and developed by a number of linguists, including Winifred Crombie (see below), Michael Hoey, and Michael Jordan (1983, 1985, 1988, 1990, 1992); Jordan’s work seeks to integrate clause relational theory with cohesive analysis in interesting ways. Although Winter himself never attempted to itemize or classify clause relations (indeed, in his later work such as Winter, 1982, 1986, 1992, and 1994 he retreated from classificatory systems), they quickly turned into a classificatory system, and here the second place where the idea turned up was influential. Working in conjunction with the Summer Institute of Linguistics, John Beekman and John and Katherine Callow were involved in efforts to translate the Bible into languages in which it had hitherto been unknown. They found that translations that only concerned themselves with the transference of sentence meanings from one language to the other resulted often in stilted and even unintelligible translations. Only if a textual dimension was built in was there a chance of a translation being accepted in the community for which it was intended. More specifically, they argued in a series of articles (Beekman, 1970a, etc.) 
and a book (Beekman and Callow, 1974) that texts are organized in terms of regular semantic relations the world over and that the first step in undertaking any translation is to analyze the source text into its component relations; the translation may need to configure these relations differently in accordance with the expectations of the target language community but the relations themselves would not be altered. A similar position was argued by Robert Longacre and his colleagues in a range of papers and books (Ballard et al., 1971; Longacre, 1972, 1976 [substantially revised as Longacre, 1983], 1979, et seq.), though the
attention in the earlier of these works was on the relationship of clauses only. The Callows (Callow and Callow, 1992: 6) emphasized the perspective of the writer or speaker rather than that of the reader or listener (in contrast to Winter, see above): “The speaker is central.” My own adaptation of Winter’s definition (Hoey, 1983) also foregrounded the writer/speaker, though not at the expense of the reader/listener. Despite the sender/receiver focus of both these perspectives, the practice of clause relation analysts in both the United States and the United Kingdom was message centered. Beekman and the Callows argued that messages are structured. For them, the largest unit of communication is the message, roughly equivalent to the text, and the smallest unit is the proposition, roughly equivalent to the content of a clause. A message consists of ‘units-in-relation’ and their term for the relation holding between units is ‘coherence relation’:
Related propositions constitute a configuration. The clauses Mary blushed and Simon laughed do not, as they stand, realize a configuration, because their relationship is not clear. But Mary blushed because Simon laughed does realize a configuration. When Mary blushed Simon laughed realizes a different configuration: the units are the same but the coherence relation is different. (Callow and Callow, 1992: 9)
This analysis is, it will be noted, text centered. The writer is only here in the connectives that make the configuration interpretable. The reader is nowhere. Despite the fact that Beekman and the Callows appear to be talking about the same things as Winter, their starting point was at the other end of the interaction, which is unsurprising, given that their background was in translation. The terms and concepts, though, are very similar. They did not divide relations into matching and sequence relations, but they did have relations such as purpose, reason, result, and identification and they showed how such relations can account for the organization of one of the shorter Epistles, that of Jude (Beekman, 1970b). Like Beekman and the Callows, Winifred Crombie (1985a, 1985b) had a practical goal – the improvement of ELT materials. Drawing on Winter’s work, with a knowledge also of the work of Beekman and his colleagues, she provided one of the fullest classifications of clause relations, which she termed ‘semantic relations.’ She grouped these relations in a more complex way than Winter or Hoey, her broad categories being temporal, matching, cause–effect, truth and validity, alternation, bonding, paraphrase, amplification, and setting/conduct. It would go beyond the
scope of an article such as this to give details of all of these, but bonding, for example, subdivides into: 1. Coupling (two or more juxtaposed members, without comparison or sequence being in focus, e.g., Achilles wore a robe and carried a shield) 2. Contrast 3. Statement–exemplification 4. Statement–exception. Crombie’s categories are perceptive and thorough and as a starting point for analysis they comprise the best classification within the British tradition (and arguably elsewhere as well). They are, however, even more text centered than those of Beekman and the Callows, and Crombie sought to represent them formally in quasi-logical and partly abstract fashion. Winter’s focus on the reader is not lost, though – she introduced the notion of semantic relations in terms of the questions that might be asked by a reader to elicit particular relations – and she also picked up from Winter an attention to the way relations are signaled, a facet of clause relation analysis that will be returned to later in the article. Crombie is not the only linguist to have attempted to classify clause relations, though she is probably the only one to have linked such a classification to the ways in which writers may signal to the reader the semantic relations they intend between pieces of text. One pair of linguists who went further down the classificatory road and at the same time further from the interactive starting point is William Mann and Sandra Thompson (1986, 1987; Mann et al., 1992), whose rhetorical structure theory (RST) grew explicitly out of the work of Winter, Jordan (1984), Hoey, and Beekman and the Callows. Like Crombie, Mann and Thompson took great care to formalize and categorize relations and in common with all the linguists so far mentioned, they were driven by a practical goal, in this case the automatic generation of text as well as the automatic processing of text within a systemic-functional tradition. 
One of the major features of Mann and Thompson’s adaptation of clause relations is that relations are typically asymmetric, with one member being central (the nucleus) and the other being more marginal (the satellite). The relation operates between spans of text and within spans of text. White’s (1998) account of the orbital structure of newspaper texts would at first sight be supportive of their position, in that he argued that news stories are organized in terms of a series of satellites all relating back to an initial sentence or sentences that report the core news. White’s analysis was not, however, apparently compatible with RST in another respect, as will be explained below.
Mann and Thompson, with Matthiessen, argued for a hierarchical analysis of texts: Texts are organized such that elementary parts are composed into larger parts, which in turn are composed into yet larger parts up to the scale of the whole text. (Mann et al., 1992)
This assumption, which they noted is not the only possible assumption one might make, is helpful because it reminds us that chunks of text may relate to other chunks. It also forces a fuller and more systematic examination of the relations in a text. A hierarchical view of clause relations is present also in the work of Gottfried Graustein and Wolfgang Thiele (1981, 1987, etc.), East German linguists who independently developed the notion of configurations of clause relations but were apparently unknown in the American tradition, though in contact from the late 1970s with Winter and his colleagues. Like Mann and Thompson, Graustein and Thiele saw text as hierarchically organized, but they made greater, and more clearly separate, use than Mann and Thompson of culturally popular patterns such as the problem–solution pattern (see Problem-Solution Patterns). To show the way a hierarchical analysis can work (though without representing it in accordance with the terms and diagrammatic style of either Mann and Thompson or Graustein and Thiele), it is worth considering again the biology extract we analyzed earlier, which, though it has an apparently straightforward hierarchical organization, permits demonstration of the problems that a strict hierarchical view of the organization of text brings. The first division into flowerless and flowering plants seems to license a first analysis into two blocks, the first block corresponding to the account of the flowerless plants, i.e., the thallophytes, bryophytes, and pteridophytes, and the second of course the account of the flowering plants (see Figure 3). What Figure 3 claims is that the passage is divisible into two uneven chunks, the first comprising a general overview of the topic (the preview), the second the particulars of the topic (the details); Winter would have labeled this relation the general–particular relation. 
The details in turn are divisible into two (again uneven) chunks in a matching contrast relation between the three chunks of text concerned with the description of thallophytes, bryophytes, and pteridophytes and the block of text concerned with the description of flowering plants. As already intimated, the first chunk is further divisible into three subsections relating to three types of flowerless plants, in which the focus is equally on what they share (their compatibility) and where they differ (contrast). Each
of these subsections is analyzable on similar lines until all the clauses are accounted for. All these relations are justifiable in terms of the signals that the writer has incorporated into his text. The initial division into preview and details is established by the repetition of the headings with their Roman numerals and typographical distinctiveness. The contrast is explicitly signaled in advance by the word divided and the phrase two great groups, and also in the chunks themselves by two parallel questions being answered: what subclasses of flowerless/flowering plants are there, and how do they reproduce? The compatibility relations amongst the three sections on the flowerless plants are signaled by the kinds of repetition patterning that were being illustrated earlier. In short, such an analysis works and indicates both the way a text may be hierarchically organized and the way in which relations may occur between blocks of language. The problem that the strictly hierarchical position has is that Figure 3 does not reflect all the relations in the text. While the flowering plants section undoubtedly stands in a contrast relation with the preceding three flowerless plants sections, it also stands in a separate relation of compatibility with the last of the three flowerless plants sections, the pteridophytes. This can be shown in a similar fashion as previously in Figure 4. The relations of compatibility and contrast in Figure 4 are no different from those we find between the sections devoted to the thallophytes and bryophytes, and indeed it would be possible to argue on linguistic evidence for a hierarchical analysis that represented the second stage as in Figure 5.
Figure 3 A partial hierarchical analysis of a passage of text.
Figure 4 Patterns of repetition and replacement in two sentences from the botany passage.
Figure 5 An alternative (partial) hierarchical analysis of the botany passage.
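The difficulty with a strictly hierarchical analysis can be sketched as a small data structure. The code below is an illustrative assumption, not the notation of Mann and Thompson or of Graustein and Thiele: the class `Span`, the helper functions, and the node labels are all hypothetical shorthand for the blocks described in the prose. It encodes the analysis described for Figure 3 as a tree in which a relation can be stated only among the immediate parts of one block, and then checks which pairs of blocks the tree is able to relate.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """A block of text, optionally divided into parts joined by one relation."""
    label: str
    relation: str = ""                      # relation holding among the parts
    children: list = field(default_factory=list)

def find_parent(node, label):
    """Return the block whose immediate parts include `label`, or None."""
    for child in node.children:
        if child.label == label:
            return node
        hit = find_parent(child, label)
        if hit is not None:
            return hit
    return None

def are_sisters(root, a, b):
    """True if the tree can state a relation directly between blocks a and b."""
    pa, pb = find_parent(root, a), find_parent(root, b)
    return pa is not None and pa is pb

# Hypothetical encoding of the analysis described for Figure 3.
flowerless = Span("flowerless", "compatibility and contrast",
                  [Span("thallophytes"), Span("bryophytes"), Span("pteridophytes")])
details = Span("details", "contrast", [flowerless, Span("flowering")])
passage = Span("passage", "preview-detail", [Span("preview"), details])

print(are_sisters(passage, "bryophytes", "pteridophytes"))  # True
print(are_sisters(passage, "pteridophytes", "flowering"))   # False
```

In this encoding the compatibility between the pteridophytes section and the flowering plants section has nowhere to be recorded, since the two blocks sit at different levels of the tree. Regrouping the blocks would make them sisters, but only at the cost of separating the pteridophytes from the other flowerless plants.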
The hierarchical model of clause relations also assumes that all texts conform to a particular monologue model. White’s work on news stories, referred to above, would seem fundamentally incompatible with the hierarchical assumption of RST; my account of colony texts such as dictionaries and encyclopedias (Hoey, 1986, 2001) also challenged the assumption, though it is possible to represent such texts as a shallow hierarchy. All of this work has been influential and RST in particular still has many adherents, but the study of clause relations, and their use in analysis (as opposed to the use of culturally popular patterns, which are related to clause relations in a variety of ways [see Problem-Solution Patterns]), is perhaps no longer central to text linguistics. Except in a crude and limited fashion, they are not part of the armory of tools used in systemic-functional linguistics despite RST’s close relationship to that model, which has after all a textual dimension. Nor are they used in critical discourse analysis, perhaps the most productive branch of written discourse analysis at the present time; Fairclough (1989) provided a helpful list of analytical tools that might be brought to bear upon a text, and clause relational analysis is not among them. Clause relations are not much used in contemporary stylistics, although Winter, Hoey (e.g., Hoey and Winter, 1982), Coulthard (1990), and other linguists have made effective use of the concept in the past. It is not that the concept has been rejected but that it has slipped out of use.
The reasons for this relative current neglect are probably fourfold. In the first place, clause relational analysis is time consuming. Secondly, it is often hard to allocate a particular pair of clauses (or groups of clauses) to a particular category of relationship with confidence. Thirdly, it is often sufficient to know that an analysis can be done without needing to do it; a detailed analysis of a text may be true but unrevealing. Finally, and most importantly, clause relational analysis is, as currently normally conceived, text centered rather than writer/reader centered. As we have seen, Winter’s original definition placed clause relations in the interpretation of text by a reader, not in the text itself, though a later definition by Hoey and Winter (1986: 188) tried, perhaps unsuccessfully (and certainly clumsily), to define clause relations both in terms of the reader and in terms of the textual product: A clause relation is the cognitive process, and the products of that process, whereby the reader interprets the meaning of a clause, sentence, or group of sentences in the context of one or more preceding clauses, sentences, or groups of sentences in the same discourse. It is also the cognitive process, and the product of that process, whereby the choices the writer makes from grammar, lexis, and intonation in the creation of a clause, sentence or group of sentences are made in the context of the other clauses, sentences, or groups of sentences in the discourse.
Widdowson (2004) emphasized that text does not have meaning on its own but that readers find meaning in texts on the basis of context and what they bring to the text. In such terms, any attempt to find the semantic structure of a text is inherently misguided. This is not to say that a text never contains clause relations independently of the reader’s interpretation – the writer may have chosen to signal explicitly the relation that he or she perceived between two pieces of text – but it is to say that there is no single analysis possible of any text unless, improbably, every possible semantic relation has been explicitly signaled within it. Readers may differ both in the relations they find in a text and in the inventory of relations they have available to them for processing the text. Consider
again the first two sentences of the passage about Shakespeare’s final year: “At the beginning of 1616 Shakespeare’s health was failing. He directed Francis Collins, a solicitor of Warwick, to draft his will, but, though it was prepared for signature on January 25, it was for the time laid aside.” They were represented as standing in a cause–consequence relation. But that is one reader’s interpretation and is not explicitly signaled in the text. It would be open to another reader to read these as in the relation of simultaneity (a special sequence relation in which time sequence is denied), in which Shakespeare’s decision to draft a will may have been unconnected with his failing health. I prefer my reading but the other reading has to be accounted for. The future of clause relational analysis must therefore lie in a closer integration with reading theory. Attempts to determine a single semantic structure may be misguided but attempts to find possible semantic structures are not. It would be unfortunate if the insights clause relational analysis offers were lost for lack of any attempt to bring it up to date.
See also: Causatives: Semantics; Coherence: Psycholinguistic Approach; Cohesion and Coherence: Linguistic Approaches; Connectives in Text; Discourse Markers; Discourse Processing; Discourse Semantics; Functional Discourse Grammar; Macrostructure; Popularizations; Problem-Solution Patterns; Reading Processes in Adults; Rhetorical Structure Theory; Systemic Theory; Tagmemics; Text and Text Analysis; Text World Theory.
Bibliography
Ballard D L, Conrad R J & Longacre R E (1971). ‘The deep and surface grammar of interclausal relations.’ Foundations of Language 7, 70–118.
Beekman J (1970a). ‘Propositions and their relations within a discourse.’ Notes on Translation 37, 6–23.
Beekman J (1970b). ‘A structural display of propositions in Jude.’ Notes on Translation 37, 27–31.
Beekman J & Callow J (1974). Translating the Word of God. Grand Rapids, MI: Zondervan.
Callow K & Callow J (1992). ‘Text as purposive communication: a meaning-based analysis.’ In Mann & Thompson (eds.). 5–37.
Coulthard M (ed.) (1986). Talking about text: studies presented to David Brazil on his retirement. Birmingham: English Language Research.
Coulthard M (1990). ‘Matching relations in Borges’ “La Muerte y la Brujula”: an exercise in literary stylistics.’ Lenguas Modernas 17, 57–62.
Crombie W (1985a). Discourse and language learning: a relational approach to syllabus design. Oxford: Oxford University Press.
Crombie W (1985b). Process and relation in discourse and language learning. Oxford: Oxford University Press.
Fairclough N (1989). Language and power. London: Longman.
Graustein G & Thiele W (1981). ‘Principles of text analysis.’ Linguistische Arbeitsberichte 31, 3–29.
Graustein G & Thiele W (1987). Properties of English texts. Leipzig: VEB Verlag.
Hoey M (1983). On the surface of discourse. London: George Allen & Unwin.
Hoey M (1986). ‘The discourse colony: a preliminary study of a neglected discourse type.’ In Coulthard (ed.). 1–26.
Hoey M (2001). Textual interaction. London: Routledge.
Hoey M & Winter E O (1982). ‘Believe me for mine honour: a stylistic analysis of the speeches of Brutus and Mark Antony at Caesar’s funeral in Julius Caesar, Act III, Scene 2, from the point of view of discourse construction.’ Language & Style 14(4), 315–339.
Hoey M & Winter E O (1986). ‘Clause relations and the writer’s communicative task.’ In Couture B (ed.) Functional approaches to writing: research perspectives. London: Frances Pinter. 120–141.
Huddleston R, Hudson R, Winter E & Henrici A (1968). Sentence and clause in scientific English, report of the research project ‘The Linguistic Properties of Scientific English.’ Unpublished report, Department of General Linguistics, University College London.
Jordan M P (1983). ‘Complex lexical cohesion in the English clause and sentence.’ In Manning A, Martin P & McCalla K (eds.) The tenth LACUS forum. Columbia, SC: Hornbeam Press. 224–234.
Jordan M P (1984). Rhetoric of everyday English texts. London: George Allen & Unwin.
Jordan M P (1985). ‘Some relations of surprise and expectation in English.’ In Hall B (ed.) The eleventh LACUS forum. Columbia, SC: Hornbeam Press. 263–273.
Jordan M P (1988). ‘Some advances in clause relational theory.’ In Benson J D & Greaves W S (eds.) Systemic functional approaches to discourse. Norwood, NJ: Ablex. 282–301.
Jordan M P (1990). ‘Clause relations within the anaphoric nominal group.’ In Jordan M P (ed.) The 16th LACUS forum. Lake Bluff, IL: LACUS. 409–419.
Jordan M P (1992). ‘An integrated three-pronged analysis of a fund-raising letter.’ In Mann & Thompson (eds.). 171–226.
Longacre R (1972). Hierarchy and universality of discourse constituents in New Guinea languages: discussion and texts. Washington, DC: Georgetown University Press.
Longacre R (1976). An anatomy of speech notions. Lisse, Belgium: Peter de Ridder Press.
Longacre R (1979). ‘The paragraph as a grammatical unit.’ In Givón T (ed.) Syntax and semantics 12: Discourse and syntax. New York: Academic Press. 115–134.
Longacre R (1983). The grammar of discourse. New York: Plenum Press.
Mann W C & Thompson S A (1986). ‘Relational propositions in discourse.’ Discourse Processes 9(1), 57–90.
Mann W C & Thompson S A (1987). Rhetorical structure theory: a theory of text organization. Technical report, Information Sciences Institute, University of Southern California.
Mann W C & Thompson S A (eds.) (1992). Discourse description: diverse linguistic analyses of a fund-raising text. Amsterdam: John Benjamins.
Mann W C, Matthiessen C M I M & Thompson S A (1992). ‘Rhetorical structure theory and text analysis.’ In Mann & Thompson (eds.). 39–78.
White P (1998). Telling media tales: the news story as rhetoric. Unpublished Ph.D., University of Sydney.
Widdowson H (2004). Text, context, pretext. Oxford: Blackwell.
Winter E (1971). ‘Connection in science material: a proposition about the semantics of clause relations.’ In Science and technology in a second language: papers from a seminar held at the University of Birmingham from 27th to 29th March 1971. London: Centre for Information on Language Teaching. 41–52.
Winter E (1979). ‘Replacement as a fundamental function of the sentence in context.’ Forum Linguisticum 4, 95–133.
Winter E (1982). Towards a contextual grammar of English. London: George Allen & Unwin.
Winter E (1986). ‘Clause relations as information structure: two basic text structures in English.’ In Coulthard (ed.). 88–108.
Winter E (1992). ‘The notion of unspecific versus specific as one way of analysing the information of a fund-raising letter.’ In Mann & Thompson (eds.). 131–170.
Winter E (1994). ‘Clause relations as information structure: two basic text structures in English’ (revised reprint). In Coulthard M (ed.) Advances in written text analysis. London: Routledge. 46–68.
Clause Structure in Spoken Discourse J Miller, University of Auckland, Auckland, New Zealand © 2006 Elsevier Ltd. All rights reserved.
Clauses in unplanned speech are simpler than clauses in planned writing, which can be edited. The differences lie in the simplicity of noun phrases (NPs) and in where less-than-simple NPs are positioned in clauses; in the lack of to-clauses, ing-clauses, and complement clauses in subject position and their abundance at the end of clauses; in the absence of sentences; and in constructions that are typical of unplanned speech or planned writing (although a good number of constructions are typical of both). The constructions typical of unplanned speech are regarded as reflecting its general characteristics, especially the lack of editing time, the limitations of short-term memory, and the fact that information is signaled not just by the verbal component of language but also by nonverbal components such as gesture, pitch and amplitude, and voice quality. Not all analysts agree that unplanned speech has simpler clause structure (e.g., Halliday, 1989), but the disagreements can be explained in terms of formality (setting, topic, and participants). Greater formality generally leads to more complex language. Speakers also differ from one another. Speakers with long exposure to written texts produce complex language in unplanned speech. And the more experience speakers have of unplanned speaking in formal situations, the more likely they are to produce complex language.
NPs as a Measure of Complexity Complexity is measured by two properties: the number of words in a phrase and of phrases in a clause,
and the depth of embedding. NPs provide a good illustration. Miller and Weinert (1998: 146) found that in a sample of monologue, 50% of the NPs consisted of a pronoun and another 7% consisted of a single nonpronominal word. When NPs consisting only of a numeral (give me two please) or a quantifier (I’d like more) were counted, the percentage of one-word NPs rose to 64. Few NPs contained other constituents: 5.6% of the NPs contained an adjective, 6% contained a prepositional phrase (the car outside the house), and 3.2% contained a relative clause. There were no complex NPs, that is, NPs containing a combination of these constituents, as in a new proposal from the agency, which is likely to be rejected. In contrast, an analysis of the NPs in the letters to a newspaper (see Miller and Weinert 1998: 154) showed that 19.7% contained adjectives and 18.8% contained prepositional phrases (every telephone exchange in the country). Only 3% contained relative clauses, but 3% were complex, as in a rigorous and valid examination on applied economics that consists of three papers. Counting alone is not sufficient; where types of NPs occur in clauses is also important. The main tendency is clear: in subject position speakers use simple NPs. In Thompson’s data (1988), the subject NPs of transitive clauses did not have adjectives, although some subject NPs of intransitive clauses did. (Transitive clauses have object NPs and are more complex; intransitive clauses do not have object NPs.) In Miller and Weinert’s spoken data (1998), no adjectives occurred in subject NPs; Crystal (1979: 164) found that in the conversations in the London-Lund Corpus, 77% of the clauses had as subject a pronoun or an ‘empty’ word such as it or there. The pattern is confirmed in Biber et al. (1999: 235–237).
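The percentages above can be set side by side with a little arithmetic. The following is only a worked illustration: the figures are copied from the text as reported there, while the variable names, the derived share of numeral-only and quantifier-only NPs, and the written-to-spoken ratios are inferences made here, not values stated in the sources cited.

```python
# Percentages of NPs containing each constituent, as reported in the text.
spoken = {"adjective": 5.6, "prepositional phrase": 6.0,
          "relative clause": 3.2, "complex": 0.0}
written = {"adjective": 19.7, "prepositional phrase": 18.8,
           "relative clause": 3.0, "complex": 3.0}

# One-word NPs in the spoken sample.
pronoun_only = 50        # NPs consisting of a pronoun
single_other_word = 7    # NPs consisting of a single nonpronominal word
one_word_total = 64      # after numeral-only and quantifier-only NPs are added

# Inferred: the share contributed by numeral-only and quantifier-only NPs.
numeral_or_quantifier = one_word_total - (pronoun_only + single_other_word)

def written_to_spoken_ratio(feature):
    """How many times more frequent a constituent is in the written sample.

    Returns None when the constituent is absent from the spoken sample,
    because no ratio can then be formed.
    """
    if spoken[feature] == 0:
        return None
    return written[feature] / spoken[feature]

print("numeral/quantifier share:", numeral_or_quantifier)
for feature in spoken:
    print(feature, written_to_spoken_ratio(feature))
```

On these figures, adjectives and prepositional phrases are each more than three times as frequent in the written sample, relative clauses occur at nearly the same rate in both, and complex NPs appear only in writing.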
Clause Structure in Spoken Discourse 481 Mann W C, Matthiessen C M I M & Thompson S A (1992). ‘Rhetorical structure theory and text analysis.’ In Mann & Thompson (eds.). 39–78. White P (1998). Telling media tales: the news story as rhetoric. Unpublished Ph. D., University of Sydney. Widdowson H (2004). Text, context, pretext. Oxford: Blackwell. Winter E (1971). ‘Connection in science material: a proposition about the semantics of clause relations.’ In Science and technology in a second language: papers from a seminar held at the University of Birmingham from 27th to 29th March 1971. London: Centre for Information on Language Teaching. 41–52.
Winter E (1979). ‘Replacement as a fundamental function of the sentence in context.’ Forum Linguisticum 4, 95–133. Winter E (1982). Towards a contextual grammar of English. London: George Allen & Unwin. Winter E (1986). ‘Clause relations as information structure: two basic text structures in English.’ In Coulthard (eds.). 88–108. Winter E (1992). ‘The notion of unspecific versus specific as one way of analysing the information of a fund-raising letter.’ In Mann & Thompson (eds.). 131–170. Winter E (1994). ‘Clause relations as information structure: two basic text structures in English’ (revised reprint). In Coulthard M (ed.) Advances in written text analysis. London: Routledge. 46–68.
Clause Structure in Spoken Discourse J Miller, University of Auckland, Auckland, New Zealand © 2006 Elsevier Ltd. All rights reserved.
Clauses in unplanned speech are simpler than clauses in planned writing, which can be edited. The differences lie in the simplicity of noun phrases (NPs) and in where less-than-simple NPs are positioned in clauses; in the lack of to-clauses, ing-clauses, and complement clauses in subject position and their abundance at the end of clauses; in the absence of sentences; and in constructions that are typical of unplanned speech or planned writing (although a good number of constructions are typical of both). The constructions typical of unplanned speech are regarded as reflecting its general characteristics, especially the lack of editing time, the limitations of short-term memory, and the fact that information is signaled not just by the verbal component of language but also by nonverbal components such as gesture, pitch and amplitude, and voice quality. Not all analysts agree that unplanned speech has simpler clause structure (e.g., Halliday, 1989), but the disagreements can be explained in terms of formality (setting, topic, and participants). Greater formality generally leads to more complex language. Speakers also differ from one another. Speakers with long exposure to written texts produce complex language in unplanned speech. And the more experience speakers have of unplanned speaking in formal situations, the more likely they are to produce complex language.
NPs as a Measure of Complexity

Complexity is measured by two properties: the number of words in a phrase and of phrases in a clause, and the depth of embedding. NPs provide a good illustration. Miller and Weinert (1998: 146) found that in a sample of monologue, 50% of the NPs consisted of a pronoun and another 7% consisted of a single nonpronominal word. When NPs consisting only of a numeral (give me two please) or a quantifier (I’d like more) were counted, the percentage of one-word NPs rose to 64%. Few NPs contained other constituents: 5.6% of the NPs contained an adjective, 6% contained a prepositional phrase (the car outside the house), and 3.2% contained a relative clause. There were no complex NPs, that is, NPs containing a combination of these constituents, as in a new proposal from the agency, which is likely to be rejected. In contrast, an analysis of the NPs in the letters to a newspaper (see Miller and Weinert 1998: 154) showed that 19.7% contained adjectives and 18.8% contained prepositional phrases (every telephone exchange in the country). Only 3% contained relative clauses, but 3% were complex, as in a rigorous and valid examination on applied economics that consists of three papers.

Counting alone is not sufficient; where types of NPs occur in clauses is also important. The main tendency is clear: in subject position speakers use simple NPs. In Thompson’s data (1988), the subject NPs of transitive clauses did not have adjectives, although some subject NPs of intransitive clauses did. (Transitive clauses have object NPs and are more complex; intransitive clauses do not have object NPs.) In Miller and Weinert’s spoken data (1998), no adjectives occurred in subject NPs; Crystal (1979: 164) found that in the conversations in the London-Lund Corpus, 77% of the clauses had as subject a pronoun or an ‘empty’ word such as it or there. The pattern is confirmed in Biber et al. (1999: 235–237).
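The two properties used here as a measure of complexity, size in words and depth of embedding, can be made concrete with a small sketch. It assumes a phrase is written as a nested tuple whose first item is a category label; the function names and the bracketings of the two example NPs are illustrative assumptions and are not taken from the studies cited.

```python
# Toy illustration: a phrase is a tuple (label, child, child, ...);
# a child is either a word (str) or another phrase tuple.
# The bracketings below are illustrative assumptions, not analyses from the source.

def word_count(node):
    """Number of words dominated by a node."""
    if isinstance(node, str):
        return 1
    return sum(word_count(child) for child in node[1:])


def embedding_depth(node):
    """Number of nested phrase levels, counting the node itself."""
    if isinstance(node, str):
        return 0
    return 1 + max((embedding_depth(child) for child in node[1:]), default=0)


# A one-word pronominal NP and an NP containing a prepositional phrase.
pronoun_np = ("NP", "it")
pp_np = ("NP", "the", "car", ("PP", "outside", ("NP", "the", "house")))

for np in (pronoun_np, pp_np):
    print(word_count(np), embedding_depth(np))
```

On this representation the pronominal NP has one word and one level, while the car outside the house has five words and three levels, which is the kind of contrast the percentages above summarize.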
In unplanned speech subject NPs do not normally contain infinitives (e.g., to see Naples and die would be idiotic) or gerunds, whether simple gerunds such as seeing Vesuvius erupting was the highlight of the trip or complex gerunds such as their missing the eruption of Vesuvius was unfortunate. Biber et al. (1999: 754) found that to-clauses and ing-clauses (infinitives and gerunds) are relatively rare in conversation. Complement clauses occur in unplanned speech, but not in subject NPs; examples such as that she was leaving the company came as a surprise to everyone are rare. Instead, in spoken English we find the construction called extraposition, which allows speakers to place infinitives, gerunds, and complement clauses at the end of clauses. It is called extraposition, the name reflecting the idea that the infinitives, gerunds, and complement clauses are really the subject but are out of position, as in it would be idiotic to see Naples and die, the highlight of the trip was seeing Vesuvius erupt, it came as a surprise to everyone that she was leaving the company. The construction allows speakers to produce the main predication, as in it would be idiotic and it came as a surprise, and to add the phrase specifying what it refers to. It is often analyzed as a dummy or empty subject pronoun; for unplanned speech it is tempting to analyze it as having a referent, one which has to be specified by the infinitive, and so on. In many languages speakers avoid complex subject NPs by means of a construction consisting of an NP followed by a clause, as in the guy that tried to lift it he nearly dropped it on his foot. The speaker produces a complex NP which is not part of a clause – the guy that tried to lift it – and then produces the clause conveying information about the referent of the NP – he nearly dropped it. The initial NP need not be very complex; compare the driver you get a laugh with him, from the Edinburgh Corpus of Spoken Scottish English. 
A very common construction in unplanned speech is it’s unfair what they’re doing to the union and it is unreasonable what she suggests. This construction allows the speaker to produce a clause containing the information that something is unfair or unreasonable and then to produce an NP specifying the something. As in the examples above, the complex NP is not integrated into a clause.
Number of Modifiers in Clauses

In unplanned speech there are severe limitations on the number of phrases modifying a verb in a single clause. Highly literate speakers can produce clauses such as the duckling was killed by the farmer for his wife at dawn near the duck pond with a .357 Magnum. In fact, clauses in unplanned spoken
English typically contain just one or two NPs, subject and direct object, with at most three NPs: he killed the duckling and he killed the duckling with his .357 Magnum. An account of the event in unplanned spoken English will be along the lines of the farmer went to the duck pond at dawn; he took his .357 Magnum and shot a duckling for his wife. The number of modifiers in clauses is affected by the distinction between given and new information. The constituents conveying given information (that is, information handled as being available to the listener) are regularly ellipted. Consider the following sequence from the Edinburgh Corpus of Spoken Scottish English. One participant asks what’s he going to do anyway that boy? The reply is play golf; both he and ‘s going to have been ellipted. What is merely typical in unplanned spoken English is obligatory in some languages that are spoken only. Munro and Gordon (1982: 111) suggested that in the Native American language Chickasaw, clauses can have only two modifiers per verb, and Lichtenberk made a similar comment with respect to the Austronesian language Toqabaqita (To’abaita) (Solomon Islands). The number of modifiers is relevant to constructions expressing different perspectives on transitivity. Most clauses in unplanned speech and in formal writing are active, although the proportion of passive clauses increases in dense informative texts such as academic monographs and textbooks. Such texts do contain long passives such as this scientist was criticized by theologians; unplanned speech contains very few long passives but does offer short passives, such as this scientist was criticized.
Clause Structures (Un)Typical of Unplanned Speech

Certain clause constructions are quite untypical of spontaneous speech. These include gapping (Jim washed and Margaret dried the dishes), accusative and infinitive (we consider her to be the best candidate), possessive gerunds (his having resigned before he even took up the post astonished everyone), free participles (browsing in the bookshop, I came across a book on Peter the Great), and participial phrases/reduced relative clauses (the book rejected by the publisher, the plane sitting on the runway at Heathrow). In general there are fewer finite subordinate clauses in unplanned speech than in planned writing. Miller and Weinert (1998: 90–93) found that 25% of all the finite clauses in their conversational data were subordinate. Almost the same percentage occurred in fiction (26%), but a broadsheet newspaper had 41%, and a semiacademic journal 45%.
Finite adverbial clauses present a complex pattern. Thompson (1988) found the highest proportion of finite adverbial clauses in informal speech but looked only at monologues. Greenbaum and Nelson (1995: 186) found a low percentage of finite adverbial clauses in spoken English, a higher percentage in informal written texts, and the highest in formal written texts, but analyzed monologues, broadcast discussions, and conversation. Adverbial clauses of condition, reason, and time are most frequent in speech, but what appear to be clauses of condition and reason are regularly used as main clauses. Commands are issued via ‘conditional’ clauses: if you just put the parcel on the table. Reason clauses behave like main clauses, even allowing tag questions, which are excluded from subordinate clauses: cos it’s too dear, isn’t it. In Miller and Weinert’s data, concessions are expressed by means of main clauses ending in though: you won’t find many dogs here though. Another construction currently in regular use in unplanned speech looks like an adverbial clause of concession, e.g., although you won’t find many dogs here. Such clauses may be separated from surrounding clauses by a very long pause and do not combine with a main clause to form a ‘sentence’ or clause combination. The essential property of the construction is that although is uttered with a separate intonation and separated by a long pause from the main part of the clause. These occurrences of although look more like discourse particles than subordinating conjunctions.
See also: Coordination; Influence of Literacy on Language Development; Relative Clauses; Sentence Fragmentation: Stylistic Aspects; Subordination.
Bibliography

Beaman K (1984). ‘Coordination and subordination revisited: syntactic complexity in spoken and written narrative discourse.’ In Tannen D (ed.) Coherence in spoken and written discourse. Norwood, NJ: Ablex. 45–80.
Biber D, Johansson S, Leech G et al. (1999). Longman grammar of spoken and written English. London: Longman.
Crystal D (1979). ‘Neglected grammatical factors in conversational English.’ In Greenbaum S, Leech G & Svartvik J (eds.) Studies in English linguistics. London: Longman. 153–166.
Greenbaum S & Nelson G (1995). ‘Nuclear and peripheral clauses in speech and writing.’ In Melchers G & Warren B (eds.) Studies in Anglistics. Stockholm: Almqvist and Wiksell International. 181–190.
Halliday M A K (1989). Spoken and written language. Oxford: Oxford University Press.
Huddleston R & Pullum G K (2002). The Cambridge grammar of the English language. Cambridge: Cambridge University Press.
Miller J & Weinert R (1998). Spontaneous spoken language: syntax and discourse. Oxford: Clarendon Press.
Munro P & Gordon L (1982). ‘Syntactic relations in Western Muskogean: a typological perspective.’ Language 58, 81–115.
Thompson S A (1988). ‘A discourse approach to the crosslinguistic category “adjective.”’ In Hawkins J A (ed.) Explaining linguistic universals. Oxford: Basil Blackwell. 167–185.
Clefting in Spoken Discourse Á Di Tullio, Universidad del Comahue, Neuquén, Argentina © 2006 Elsevier Ltd. All rights reserved.
Syntactic Structure of Cleft Constructions

Among sentences that overtly exhibit the informational functions of their constituents, cleft constructions are special in that they achieve this effect by means of syntactic devices: the copula and a subordinator. As the informational functions of a constituent are concerned with its contribution to the insertion of the sentence in a given discourse, the information supplied by the context may be considered from two different angles: ground/focus and presupposition/assertion. The information that the speaker regards as active in the mental world of the listener is distinguished from that introduced as new, on the one hand, and what is presented as granted or assumed from what is presented as prominent, on the other. In cleft constructions, the copula be and the subordinator are employed to focus a constituent, while placing the others in the background. Although the focused constituent is syntactically highlighted, it does not always provide new information. ‘Cleft’ is a polysemous word: as a cover term it will be used to refer to all constructions that involve clefting; they will be called ‘cleft constructions.’ It is also used to refer to a specific type of construction, ‘it-cleft sentence.’ The following are samples of cleft constructions:

(1a) It was the paper that Jeremy left on the desk yesterday.
(1b) What Jeremy left on the desk yesterday was the paper.
(1c) The paper was what Jeremy left on the desk yesterday.
The sentences in (1) exemplify the two main types of cleft constructions: (1a) is a cleft sentence (also named ‘it-cleft’ or ‘true cleft’), and (1b) and (1c) are pseudocleft sentences (also called ‘wh-cleft’), in their basic and reversed version, respectively. Cleft and pseudocleft sentences may be connected to a simpler, undivided sentence (2):

(2) Jeremy left the paper on the desk yesterday.
Syntactically, cleft constructions are biclausal copulative sentences. The two clauses are linked by a shared constituent, tacit in the subordinate clause, as direct object of left, but explicit as complement of the copula in the matrix clause: the noun phrase (NP) the paper.

(3) It was the paper that Jeremy left on the desk yesterday.
From a semantic perspective, cleft constructions are specificational copulative sentences. A description that predicates on a certain kind of object – Jeremy left something on the desk – is specified by the focused constituent, whose referent appears as the only object that renders it true, namely the paper. From a pragmatic point of view, cleft constructions are emphatic copulative constructions, for they contain a focused constituent that corrects a piece of information in the preceding discourse (an actual sentence or presupposed information). Such focus attains contrastive meaning because it rejects all other alternatives of the same category. The structure of the subordinate clause reproduces that of the original sentence:

(4a) Jeremy left the book on the desk yesterday.
(4b) No, it was the paper that Jeremy left on the desk yesterday.
The complexity of these constructions is evident at each of the levels, as well as in the intricate relationships holding between them. As a consequence, they have received continued attention from grammarians, who are interested in accounting for their syntactic anomalies, and from pragmaticians, who focus more on their informative functions. Clefts are also sensitive to differences in register: in colloquial English, for example, a pleonastic construction, such as What Jeremy left on the desk is Jeremy left a paper, is common in all sociolects; in spoken Spanish, the most frequent cleft is (Lo que pasa) es que . . . (‘What happens is that . . .’ or ‘it is that’), an explicative resource for the preceding sentence. The general syntactic characteristics of cleft constructions will be analyzed in the next section, starting from their components: the copula, the subordinate clause, and the focused constituent. Then the types of cleft constructions, namely cleft sentences, pseudocleft sentences, and some other minor types, will be differentiated in English and Spanish. Finally, the discursive value of cleft constructions in spoken discourse will be examined.

Cleft Constituents: Configurational Definition and Process Definition
Cleft constructions can be described in two alternative ways. In configurational terms, each class denotes a sentence type consisting of a certain sequence of constituents in a particular order, which takes the following form in English:

(a) It-cleft: It + copula + focused constituent + ‘relative’ clause.
(b) Basic pseudocleft: Free relative clause + copula + focused constituent.
(c) Inverted pseudocleft: Focused constituent + copula + free relative clause.

The it-cleft sentence is characterized by its first segment: a dummy subject, it, followed by the copula, and a noncanonical relative. In pseudocleft sentences the copula occupies an intermediate position between a free relative clause and the focused constituent. However, ‘cleft’ is associated with ‘clefting,’ a process term. In this sense, it may involve the operation of splitting an undivided simpler sentence into two clauses. The three cleft constructions in (1) are different expansions of the basic sentence (2). In effect, sentence (2) describes the same state of things as the three sentences in (1), expressed via an identical argument structure, with the same illocutionary force and truth value. This equivalence holds, regardless of any commitment to an actual psychological operation or to a derivational analysis of these constructions. In fact, the process may be understood as an expository device to depict a relationship holding between sentences. Each of the complements or adjuncts of the undivided sentence (2) can be cleft. The focused phrase may be an NP (the direct object, or DO, in (1); the subject in (5a)); a prepositional phrase, PP, or an adverbial phrase, AdvP (5b); or a nontensed (5c) or tensed clause (5d):

(5a) It was Jeremy who left the paper on the desk yesterday.
(5b) It was on the desk/yesterday that Jeremy left the paper.
(5c) What Jeremy did was to leave the paper on the desk yesterday.
(5d) What happened was that Jeremy left the paper on the desk yesterday.
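As a rough illustration of the configurational definition, the three English patterns (it-cleft, basic pseudocleft, inverted pseudocleft) can be treated as string templates filled with a focused constituent and the remaining open clause. The sketch below is only a template filler under that assumption: the function and parameter names are hypothetical, and no parsing or agreement checking is attempted.

```python
# Toy template filler for the three English cleft patterns.
# Function and parameter names are hypothetical; no parsing is attempted.

def it_cleft(focus, open_clause, copula="was", subordinator="that"):
    """It + copula + focused constituent + 'relative' clause."""
    return f"It {copula} {focus} {subordinator} {open_clause}."


def basic_pseudocleft(focus, open_clause, copula="was", wh="what"):
    """Free relative clause + copula + focused constituent."""
    return f"{wh.capitalize()} {open_clause} {copula} {focus}."


def inverted_pseudocleft(focus, open_clause, copula="was", wh="what"):
    """Focused constituent + copula + free relative clause."""
    return f"{focus[0].upper()}{focus[1:]} {copula} {wh} {open_clause}."


focus = "the paper"
open_clause = "Jeremy left on the desk yesterday"

print(it_cleft(focus, open_clause))
print(basic_pseudocleft(focus, open_clause))
print(inverted_pseudocleft(focus, open_clause))
```

With these inputs the three calls reproduce (1a), (1b), and (1c). Passing subordinator="who" with focus "Jeremy" and open clause "left the paper on the desk yesterday" yields (5a), which shows that the choice of subordinator has to be supplied by hand in this sketch.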
In contrast, neither atonic elements (determiners, clitic pronouns, auxiliaries) nor peripheral constituents
(supplement modifiers, disjuncts, vocatives) are cleft. When the focused constituent is an NP, as in (1), all three types of cleft constructions are possible. Clauses, in contrast, are nearly exclusively foregrounded by means of basic pseudocleft sentences. In these, free relatives may introduce lexical units that are not present in the simpler sentence: the verb do in (5c) signals that the focused nonfinite clause is headed by an action verb; the verb happen in (5d) does not distinguish between types of predicates. Contrary to the case of pseudocleft sentences, it-clefts always have undivided correlates. It is possible to cleave only one constituent (6a), usually lighter in ‘it-clefts’ or ‘reversed-pseudoclefts’; heavier in ‘basic pseudoclefts,’ with the exception of the combination of a temporal and a locative expression, which functions as a single constituent, indicating a spatiotemporal setting:

(6a) *It was the regional paper on the desk that Jeremy left.
(6b) It was in Malta in 1945 that the treaty was signed.

Value of the Copula: Specificational Copular Sentences
The copula be is a relational word that conjoins the subject and a predicative expression. It provides inflectional information, both manifesting subject/ predicate agreement features and indicating the temporal setting of the state or relationship. Three types of copular sentences are usually identified, as follows. First, in predicational copular sentences, the predicative complement assigns a property to the referent of the subject or ascribes it to a class: the grammar professor is funny/vegetarian/my mother’s neighbor. Secondly, identificational copulative sentences establish an identity between two potential referring expressions. Thus, in the grammar professor is Julia’s husband/Ignacio/that one, the referent of the subject is identified as the same person as the referent of the copular complement. Although the resulting sentence is also grammatical when the order is reversed – Julia’s husband/Ignacio/that one is the grammar professor – the change in order entails different interpretations, which are pragmatically justified. If the listener is able to match both referring expressions, the complement of the copula becomes referential. In the third type, specificational sentences, the identification proceeds in the opposite direction (7a). It is the one to which cleft constructions belong. The variable contained in an open proposition is specified by the referential expression in focus position (7b). As a consequence of this relation variable/value, the specification provides two pieces of information that
the noncopular sentence (2) lacks: the referent of the focused phrase (thing, person, property, proposition), whose existence is presupposed, is the only one that satisfies the description (exhaustivity, [7c]), thus rejecting all other possible referents (contrastiveness, [7d]):

(7a) Jeremy left something on the desk: the paper.
(7b) The x such that Jeremy left x on the desk = the paper.
(7c) The paper is the only thing that Jeremy left on the desk.
(7d) The only thing that Jeremy left on the desk is the paper, not the book.
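The variable/value relation in (7b), together with exhaustivity and contrastiveness, can be mimicked in a toy set-based model. This is a minimal sketch under invented assumptions: the domain of objects, the set of things left on the desk, and the helper name specifies are all hypothetical and serve only to show how the three readings hang together.

```python
# Toy model of specification: the focused value must be the ONLY member of
# the domain that satisfies the open description.
# The domain and the facts below are invented for illustration.

def specifies(description, domain, value):
    """True if `value` is the only member of `domain` satisfying `description`."""
    satisfiers = {x for x in domain if description(x)}
    return satisfiers == {value}


domain = {"the paper", "the book", "the keys"}
left_on_desk = {"the paper"}  # assumed state of affairs


def jeremy_left_x_on_the_desk(x):
    """Open proposition 'Jeremy left x on the desk' evaluated in the toy model."""
    return x in left_on_desk


print(specifies(jeremy_left_x_on_the_desk, domain, "the paper"))
print(specifies(jeremy_left_x_on_the_desk, domain, "the book"))
```

In this model the check succeeds for the paper and fails for the book. Requiring the set of satisfiers to equal the singleton containing the focused value captures existence (the set is not empty), exhaustivity (nothing else is in it), and contrastiveness (every other candidate is excluded) in a single condition.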
Although all cleft constructions are specificational, pseudoclefts form part of a broader paradigm. The initial description may be a (semi)free relative clause or else a definite NP, a general noun perhaps modified by a superlative expression. The demarcation line between free relative clauses (or even semifree relatives with demonstrative pronouns) and NPs with a restrictive modifier is not always neatly drawn crosslinguistically. In pseudoclefts the relative pronoun provides only grammatical information on the missing phrase, but the nominal elements in (8b) add lexical information: the copular complement may be an instantiation or hyponym of the subject head: (8a) The one who left the paper on the desk was Jeremy. (8b) The person who left the paper on the desk was Jeremy. (8c) The person who read the most papers is Jeremy.
On the contrary, as the three types of copulative sentences can have free relative clauses as subjects, ambiguity may arise as regards their predicative or specificational character, as seen in (9): (9a) What he told you is fiction. (9b) What John is is necessary.
The complement of the copula in (9a) may be interpreted as evaluative (‘an invention’) or else as a work of literature; only in the latter case is the sentence specificational. Sentence (9b) is ambiguous as well: in its specificational reading, necessary picks up the defining feature of John, and not that of his profession, as in the predicational interpretation. This type of ambiguity, frequent in pseudoclefts, is rare in cleft sentences. In predicational and identificational copular sentences, the copula has independent temporal information: tense situates the predicate in a point in time. In cleft constructions, copular tense is not an informative grammatical unit but rather a dependent category: it may be a default tense, the present, or a tense copied from the inflection of the subordinate clause.
The directionality of this influence also shows that the copula is not the main verb of the sentence but a defective verb, almost an auxiliary. However, modality markers such as auxiliaries, tenses, or adverbs can modify the copula independently of the principal verb: it will/would be the paper that Jeremy left on the desk. In the syntactic structure of pseudocleft sentences, the copulative verb is placed in the habitual position between the subject and its complement. The position of the copula is less canonical in it-clefts. In fact, the copular segment – it is – appears to be a focalizing device rather than the matrix clause. The initial copula does not permit changes in order. This rigidity is accounted for in terms of the restrictions on copula position: because it must be adjacent to the focused constituent, it cannot be final; the ‘relative clause’ can appear only in final position.

Subordinate Clauses: Relative Pronouns or Conjunctions
The free relative clauses in pseudoclefts are semantically versatile constructions, able to represent the different denotations listed in (10):

(10a) NP nonpersonal: What I need is money.
(10b) PP place: Where he found it was on the desk.
(10c) AdvP time: When I met you was spring.
(10d) AdvP manner: How he fought was courageously.
(10e) NP or AdjP quality: What he is is a bastard/is important.
(10f) Verb phrase (VP) action: What he is doing is hitting the door.
(10g) Clause: What he said is that he is tired of the noise.
Relative pronouns anticipate the semantic type of the focused constituent. But in it-clefts, the complementizer that lacks semantic information; in fact, it may even be empty: it was this paper Jeremy read yesterday; when the relativized unit is a human noun, the relative pronouns who, whose, or whom carry grammatical information about the case. It-clefts are not susceptible to a wholly satisfactory binary analysis: neither It is the paper/that Jeremy left on the table nor It is/the paper that Jeremy left on the table is an adequate analysis. In the first segmentation, it is not clear what function the subordinate clause performs. The problem with the second one is that the focused constituent and the ‘relative’ clause do not form a constituent, as the contrasts in (11) show: (11a) It was the paper that Jeremy left on the desk. (11b) I didn’t find the paper that Jeremy left on the desk.
(12a) It was Jeremy who left the paper on the desk.
(12b) Jeremy, who left the paper on the desk, has been looking for it.
Unlike the relative clause in (11b), the ‘relative clause’ in (11a) does not restrict the antecedent paper; rather, it is an open clause to be specified. In (12b), the relative clause is nonrestrictive, as the commas indicate, whereas in (12a) the relative clause after Jeremy is not set off in a separate tonal group, as the nonrestrictive relative clause in (12b) is. Nor is the subordinate clause a complement clause: there is no selector word, and it is not a complete sentence, for it contains a missing phrase specified by the focused constituent, as the contrast with an extraposed complement clause shows (compare 13a and 13b with 13c and 13d):

(13a) It is unlikely that my son will return this week.
(13b) That my son will return this week is unlikely.
(13c) It is the paper that John left on the desk yesterday.
(13d) *That John left on the desk yesterday is the paper.
The subordinate clause in it-clefts is not canonically relative, nor is it a complement clause, but rather a focusing modifier, and so a special ‘relative clause’ (thus the use of quotation marks).

The Focused Constituent
It-clefts and wh-clefts differ in the categories they allow for the focused constituent. Because the it-cleft permits only referential phrases (NPs or PPs), it rejects negative, universally quantified, and existentially quantified phrases, as well as phrases preceded by a focalizer such as even or also:

(14a) *It was nobody/somebody/everybody that came.
(14b) *It is even/also money I need.
The rejected phrases contradict the semantic conditions of cleft constructions: existence, exhaustivity, and contrastivity. Yet, adverbs like precisely, exactly, and justly, are perfectly compatible with such contents. Wh-clefts are less restrictive: the focused constituent may be not only a referential expression but also a predicative one: an AdjP or usually a clause. It must be semantically congruent with the relative pronoun. On the other hand, the focused constituent of reversed pseudoclefts differs as much in categorial restrictions as in their ‘weight’: the cleft constituent, topicalized as subject, is generally an NP, specifically a pronoun: (15a) That is what I said. (15b) Here is where I live.
The category and weight of the focalized constituent, then, are similar in clefts and reversed pseudoclefts: a light phrase such as an NP, generally pronominal (that), or else a deictic or anaphoric adverb (there, then, so). Basic pseudoclefts usually focalize heavy constituents, typically clauses. A paradigm such as (1) is possible, then, only with nominal constituents. In conclusion, the paradigm in (1) can lead to the misguided idea of a perfect symmetry among the three members. Nevertheless, the differences regarding the focused constituent displayed by the two versions of pseudoclefts do not depend only on word order. The most outstanding distinctions are to be found between it-clefts and the rest of the cleft constructions: the nonreversible order of their constituents and their peculiar ‘relative clause.’ These sentences prove unsuitable for dichotomous categories as regards constituency, syntactic functions, nature of the subordinate clause, and value of the copula. Most of these anomalies arise from their high degree of grammaticalization.
Types of Cleft Sentences
The structural realization of informative functions reflects crosslinguistic diversity. These differences are connected to the structural features of languages. In languages with strict word order, intonation is exploited for informational purposes: in Germanic languages, in general, nuclear stress falls on the focalized constituent, regardless of the position it occupies. Cleft constructions are more sophisticated alternative strategies, for they trigger syntactic mechanisms. Strict word orders render them useful compensatory devices. In languages with more flexible word order, the movement of the focused constituent to initial position (obligatory in Hungarian, optional in others, such as Spanish, Italian, or Portuguese) does not need clefting; yet cleft constructions reinforce the focal status of the displaced constituent. Some nonconfigurational languages, e.g., polysynthetic languages such as Quichua, Toba, and Mapuche (Mapudungan), lack clefts. In Spanish or Portuguese, word order on its own does not make it possible to differentiate clefts from pseudoclefts, since the latter present three possible orderings. In Italian it-clefts, the ‘relative clause’ (16a) alternates with the infinitive verb preceded by the preposition a (16b) when the focused constituent is an agentive subject; in Portuguese and American Spanish, only the first option is available. Both Italian and Portuguese admit focused NPs in it-clefts, while in American Spanish only PPs or adverbials are possible:
(16a) è (stato) Gianni che lo ha detto
is (been) Gianni that has said it
(16b) è stato Gianni a dirlo
is been Gianni a to say it
(16c) foi o bolo que comeu João
was the cake that ate João
(16d) *Fue la torta que comió Juan
*Was the cake that ate Juan
Genetically and typologically related languages, then, deploy different strategies (Smits, 1989). Furthermore, cleft constructions diverge not only across languages but also across dialects, as is the case in Spanish. Additionally, there are marginal varieties of canonical clefts and pseudoclefts, particularly in spoken discourse: reduced clefts, conditional clefts, and inferential or eventive clefts, which will be presented below. Other possibilities are also available, such as this-cleft, all-cleft, and there-cleft. In spoken French, the avoir-cleft, or presentational cleft (Lambrecht, 1988), makes use of segmentation in order to introduce a new referent into the discourse or to make an event-reporting statement:
(17a) j’ai ma voiture qui est en panne
I have my car that is breakdown
‘my car broke down’
(17b) y’a le téléphone qui sonne
there-has the telephone that rings
‘the phone’s ringing’
All these varieties are specificational but, as some of the typical constituents are missing, they imply a broader definition of cleft constructions, which not all linguists are open to accepting.
Spanish Clefts and Pseudoclefts
Spanish pseudocleft sentences admit three possible orders; grammars call them ‘relative periphrasis’:
(18a) fue eso lo que dijo Juan
was that what say-PAST 3PERS John
‘that was what John said’
(18b) lo que dijo Juan fue eso
what say-PAST 3PERS John was that
‘that was what John said’
(18c) eso fue lo que dijo Juan
that was what say-PAST 3PERS John
‘that was what John said’
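The three orderings in (18) can be enumerated mechanically. The following is only a minimal Python sketch, not an analysis proposed in the article: the function name pseudocleft_orders and the flat three-part representation (copula, focus, free relative) are assumptions made for illustration, and agreement and preposition copying are ignored.

```python
def pseudocleft_orders(copula, focus, relative):
    """Return the three linear orders of a Spanish pseudocleft.

    The three arguments are plain strings; this hypothetical helper
    only permutes them and does no morphological work.
    """
    return [
        f"{copula} {focus} {relative}",   # copula-initial order, as in (18a)
        f"{relative} {copula} {focus}",   # relative-initial order, as in (18b)
        f"{focus} {copula} {relative}",   # focus-initial order, as in (18c)
    ]


orders = pseudocleft_orders("fue", "eso", "lo que dijo Juan")
for sentence in orders:
    print(sentence)
```

Run on the parts of (18), the sketch prints the three strings given in (18a) to (18c), in that order.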
When the focused constituent is a personal pronoun, it determines the inflectional features of the copula but not those of the verb in the relative clause. In any of the orders, the relative pronoun imposes third person agreement:
(19a) soy yo la que ha sido acusada
be.PRE.1SING I the-FEM that has been accuse-PAST.FEM
‘I’m the one who’s been accused’
(19b) la que te miras en el espejo eres tú
the-FEM that look yourself at mirror are you
‘the one who is looking at herself in the mirror is you’
If the focused constituent is a PP, the relative pronoun reproduces the preposition in all three orders:
(20a) de la que (quien) todos hablan es de la mujer del presidente
about the one who everybody talks is about the president’s wife
‘the one everybody talks about is the president’s wife’
(20b) de la mujer del presidente es de la que (quien) todos hablan
about the president’s wife is about the one who everybody talks
‘the president’s wife is the one everybody talks about’
(20c) es de la mujer del presidente de la que (quien) todos hablan
is about the president’s wife about the one who everybody talks
‘it is about the president’s wife that everybody talks’
Both European Spanish and American Spanish possess pseudoclefts, yet only American Spanish has real clefts, which normative grammars reject. These (it-)clefts, characteristic of spoken discourse, are nonreversible constructions in which the relative clause is introduced by the complementizer que alone. In these sentences the focused constituent, a PP or AdvP, can be pre- or post-copular:
(21a) es de la mujer del presidente que todos hablan
is about the president’s wife that everybody talks
‘it is about the president’s wife that everybody talks’
(21b) de la mujer del presidente es que todos hablan
about the president’s wife is that everybody talks
‘it is about the president’s wife that everybody talks’
A third variety, limited both dialectally and sociolinguistically, is found in nonstandard registers of Caribbean Spanish, as well as in Portuguese, especially Brazilian: in it, the copula becomes a focalizing device, practically frozen in tense and agreement:
(22a) todos hablan es de la mujer del presidente
everybody talks is about the president’s wife
‘it is the president’s wife everybody talks about’
(22b) Juan come es papa/papas
John eats is potato/potatoes
‘what John eats is potato(es)’
Truncated Clefts
In spoken discourse, truncated cleft constructions, as in (23), lack subordinate clauses, although the omitted clause is still recoverable from the context. The missing information is a cohesive resource and functions as a means of segmenting the flow of information into shorter units:
(23a) A. Who left the paper on the desk? B. It was Jeremy.
(23b) Yesterday two cars crashed head-on. It was on the bridge.
Spanish and other null subject languages have not only cleft but also pseudocleft reduced variants. In (24a) and (24b), the omitted part is the focused constituent, but as it is a pronominal, this or that, or an adverbial unit, here or there, it is inferable from the context or situation. In (24c) the omitted clause is recoverable from the situational context. The speaker wants to identify himself as the person who is speaking, on the telephone, for example. Although the verb is in the first person, the insertion of the personal pronoun in preverbal position would be inappropriate in such a context:
(24a) es lo que yo digo
is that I say
‘that is what I say’
(24b) es donde nos encontramos siempre
is where we meet always
‘that is where we always meet’
(24c) soy Juan
am Juan
‘I am Juan’
Some constructions that have been analyzed as a peculiar type of cleft are those that are reduced to the two characteristic syntactic devices of clefts, es que (it is that), in strict adjacency in (25) and interrupted by the anticipated subject in (26). As reduced versions of basic pseudoclefts, the focused constituent is the entire clause; the lack of a variable guiding the interpretation is compensated for by a specific relationship with the context: as an explanation, in terms of causes, reasons, results, or conclusions (Delahunty, 2001; Declerck, 1992a), and as the identification of a perception. It is precisely by virtue of this relation that they are termed ‘inferential’ (25) and ‘eventive’ (26), respectively.
(25a) ¿por qué no te quedas a comer? – Es que estoy apurada
‘why don’t you stay for lunch? – it’s because I’m in a hurry’
(25b) no es que me disguste, sino que no tengo hambre
‘it is not that I don’t like it, rather I’m not hungry’
(26a) ¿qué pasa? – Es Juan que acaba de llegar
‘what’s happening? – it’s Juan who’s just arrived’
(26b) ¿Qué ruido es ese? – Es el teléfono que está sonando
‘what’s that noise? – it’s the phone ringing’
The sequence es que, which corresponds to è che, c’est que, it is that, has been analyzed in Spanish as a conversational marker resulting from a reanalysis produced by grammaticalization. In fact, a subject may be added, such as eso ‘that,’ la verdad ‘the truth,’ or lo que pasa ‘what happens’: these expressions indicate the pertinence of the explanation to a previous sentence. However, the cleft status of these sentences is shown by the possible inflectional features of the copula, by modification by means of modal auxiliaries, and by the presence of negation, which induces the subjunctive.
Conditional Clefts
The structure of conditionals also provides a segmented syntactic scheme: the copula in the if-clause foregrounds a particular constituent (27) or else the whole sentence (28), placed in the main clause:
(27a) si quiere a alguien, es a su hijo
if loves anybody, is to his/her son
‘if there is anyone he/she loves, that is his/her son’
(27b) si se entusiasma con algo, es con la computadora
if is enthusiastic with anything, is with the computer
‘if he/she gets enthusiastic about anything, it is the computer’
(28a) si dijo eso, es (por)que está resentida
if said that, is because is resented (FEM)
‘if she said that, it is because she feels resentful’
(28b) si lo acepta, es para que la vuelvan a invitar
if accepts it, is for to be invited again
‘if he/she accepts, it is because he/she wants to be invited again’
These are also specificational sentences, in which the focused constituent specifies the value of a variable, the indefinite pronoun situated in the if-clause, but they differ from clefts in that they are interpreted as high degree quantified expressions. On the other hand, when the variable that is specified is an adjunct, it does not have to be explicit, as can be seen in (28), where the clause is understood as an explanation, causal in (28a) and final in (28b).
In conclusion, the differences between cleft constructions in Spanish and English stem from idiosyncratic characteristics of these languages. In European Spanish, cleft strategies show a high degree of redundancy in pseudoclefts, contrasting with the economical device found in Caribbean Spanish. The copula is grammaticalized in American Spanish clefts, merely reinforcing the displacement of focus. The analysis of the three groups of cleft constructions has allowed us to see that the syntactic device copula-subordinator foregrounds two types of constituents: phrasal referential expressions on the one hand and clausal units on the other. It-clefts and reversed pseudoclefts foreground referential constituents. Both types constitute a natural class, as evidenced by the lightness of the constituents and the inverted variable-value order. Basic pseudoclefts are specialized in focusing clauses, heavy constituents that occupy the final position according to a fixed order: first variable, then value. Reduced versions of each type follow this pattern. The conditional structure presents both kinds as well. Yet es que is specialized in focalizing clauses: the relationship with the context (linguistic or nonlinguistic) is necessary to infer the missing link so as to build up the explanation. The specification of the value of an entire clause is, then, performed directly when the variable is supplied in the free relative clause, or indirectly when it is necessary to infer it from context. The vagueness and indeterminacy of texts mean that this specification is not always an easy task (Sornicola, 1989). On the other hand, it-cleft sentences are more frequent in written discourse, not only in English but also in Spanish (Sedano, 1990). The more canonical character of pseudoclefts explains the wider paradigm, which includes colloquial constructions such as What he looked up was he looked up a linguistic term, semifree relatives (8a), and hyperonym nouns modified by relatives (8b).
Cleft Constructions in Spoken Discourse
The focused constituent has been considered so far as a syntactic constituent of cleft constructions. In this section, in contrast, it will be analyzed from the perspective of the informational functions it performs. The focus of cleft sentences, intonationally and syntactically marked, is involved in two information partitions: focus/ground and assertion/presupposition. The first partition is realized through word order, from what is anchored in the preceding discourse or in the situational context to the new information. The second partition is expressed in clefts by means of the biclausal structure. The subordinate clause contains the information regarded as presupposed, while in the matrix clause, which conveys the assertion, the focus is the lexical element. These two information partitions may coincide or intertwine. In the first case, the subordinate clause indicates old-discourse information, GROUND, while the focused constituent is in charge of the new-discourse information, FOCUS. That is what generally occurs in pseudoclefts, as (29) shows:
(29) “¿Me desvío otra vez? No creo, no soy yo el que se desvía, la que se desvía es la historia. . . . Lo que me lleva también a desviarme es la intensidad, la tragicidad de los hechos que narro, desbocados todos excesivos.”
‘Am I diverting again? I don’t think so, it’s not me who diverts, it is history that diverts. . . . what leads me to divert is the intensity, the tragic nature of the events I narrate, bolted, all excessive’ (Feinman, 2003: 163).
In the succession of cleft constructions that answers the initial question, the speaker picks up already uttered words, correcting some (a nominal as well as a verbal one) and reproducing the rest. The question subject is first negated and then taken up in a semifree relative clause, la que se desvía, whose variable is specified with the referent of the focused constituent as the value that exhaustively and exclusively satisfies it. In the next sentence, Lo que me lleva también a desviarme, it is the verb, modified by a modal, that is reintroduced with contrastive meaning. In it-clefts, on the contrary, the focus may also introduce new information (30a) or, in most cases, establish a relationship with the context, as information already introduced in the listener’s ‘mental world’ but syntactically highlighted (30b); in reversed pseudoclefts (30c), the focus is topicalized as the subject; thus it is nearly always thematic:
(30a) A. Tiene el doctorado hecho en Estados Unidos. B. ¿Pero es en Lingüística que era el doctorado? (Barrenechea, 1971: 2, 13)
A. He/she has finished the Ph.D. in the United States. B. But, is in linguistics that the Ph.D. was?
(30b) Buenos Aires era la ciudad centralizadora. Es esa la imagen que yo tenía de Buenos Aires hasta hace poco
‘Buenos Aires was the centralizing city. It was the image that I had of Buenos Aires . . .’ (Barrenechea, 1971: 1. III, 59)
(30c) eso fue lo que dijo
‘that was what he said’
In fact, presupposition does not necessarily mean old discursive information; it can be either old or new information depending on the order of the constituents and on the type of cleft construction. The clause usually contains the new information both in clefts and in reversed pseudoclefts, their foci generally transmitting already known or inferable information, as frequently shown by their pronominal realization. The free relative clause in basic pseudoclefts, in contrast, tends to be the cohesive portion. Thus, focus may be either strongly cohesive or strongly novel. When it is cohesive, as in clefts and reversed pseudoclefts, it may lose its contrastive value, becoming a mere emphatic resource: in fact, (31a) does not necessarily indicate that the event took place in that particular year as opposed to another, but rather it is normally interpreted without a contrastive sense. When the focalized constituent carries discourse-old information, the relative clause may contain the contrastive value, shown by the parallelism in (31b):
(31a) It was in 1932 when my family immigrated to Argentina
(31b) It was then when you gave me the paper, not the keys
The focused constituent, then, is available not only as a contrastive resource but also as a cohesive or an emphasizing device. The importance of the two information partitions comes from the way in which cleft constructions establish their particular relationship with the context: when the speaker specifies the value of a variable, he/she foregrounds a piece of information, contrasting it with another explicit or inferred textual piece or acting as if the contrast actually existed. Indeed, the metalinguistic nature of cleft constructions can be explained through this conversational dynamic, which accounts for most of the syntactic and semantic anomalies detected.
See also: Focus; Foregrounding; Functional Categories; Information Structure in Spoken Discourse; Relative Clauses; Syntactic Variation; Thematic Structure.
Bibliography
Baker C L (1989). English syntax. Cambridge: MIT Press.
Barrenechea A (1971). El habla culta de la ciudad de Buenos Aires: materiales (I y II). Buenos Aires: Instituto de Filología y Literatura Hispánica Dr Amado Alonso.
Bošković Ž (1997). ‘Pseudoclefts.’ Studia linguistica 51(3), 235–277.
Bosque I (1999). ‘Sobre la estructura sintáctica de una construcción focalizadora.’ In Homenaje a D. Ambrosio Rabanales. Santiago: Universidad de Chile.
Declerck R (1988). Studies on copular sentences, clefts and pseudo-clefts. Leuven: Foris.
Declerck R (1992a). ‘The inferential it is that construction and its congeners.’ Lingua 87, 203–230.
Declerck R (1992b). ‘The taxonomy and interpretation of clefts and pseudoclefts.’ Lingua 93, 183–220.
Delahunty G (1984). ‘The analysis of English cleft sentences.’ Linguistic Analysis 13, 63–113.
Delahunty G (2001). ‘Discourse functions of inferential sentences.’ Linguistics 39(3), 517–545.
Delahunty G & Gatzkiewicz L (2000). ‘On the Spanish inferential construction ser que.’ Pragmatics 10(3), 301–322.
Di Tullio A (1990). ‘Sobre hendidas y pseudohendidas.’ Revista de lengua y literatura 7, 3–16.
Doherty M (2001). ‘Discourse functions and language-specific conditions for the use of cleft(-like) sentences: a prelude.’ Linguistics 39(3), 457–462.
Feinman J P (2003). La crítica de las armas. Buenos Aires: Grupo Editorial Norma.
Fernández Leborans M J (2001). ‘Sobre formas de ambigüedad de las oraciones escindidas: sintaxis y discurso.’ E. L. U. Alicante 15, 285–305.
Grevisse M (1993). Le bon usage (13th edn.). Paris-Louvain-la-Neuve: Duculot.
Guitart J (1989). ‘On Spanish cleft sentences.’ In Kirschner C & DeCesaris J (eds.) Studies in Romance linguistics. Amsterdam-Philadelphia: Benjamins. 129–137.
Huddleston R & Pullum G (2002). The Cambridge grammar of the English language. Cambridge: Cambridge University Press.
Kany C (1972). Sintaxis hispanoamericana. Madrid: Gredos.
Kiss E (1999). ‘The English cleft construction as a focus phrase.’ In Mereu L (ed.) Boundaries of morphology and syntax. Amsterdam: John Benjamins. 217–230.
Knowles J (1986). ‘The cleft sentence: a base-generated perspective.’ Lingua 69, 295–317.
Kovacci O (1991). ‘Sobre la estructura de la forma de relieve con ser y proposición relativa.’ Voz y letra II 1, 39–49.
Lambrecht K (1988). ‘Presentational cleft constructions in spoken French.’ In Haiman J & Thompson S A (eds.) Clause combining in grammar and discourse. Amsterdam: Benjamins. 135–179.
Lambrecht K (2001). ‘A framework for the analysis of cleft constructions.’ Linguistics 39(3), 463–516.
Longobardi G (1987). ‘Las oraciones copulativas en la teoría sintáctica actual.’ In Demonte V & Fernández Lagunilla M (eds.) Sintaxis de las lenguas románicas. Madrid: El Arquero.
Moreno Cabrera J C (1999). ‘Las funciones informativas: las perífrasis de relativo y otras construcciones perifrásticas.’ In Bosque I & Demonte V (eds.) Gramática descriptiva de la lengua española 3. Madrid: Espasa-Calpe. 4245–4301.
Sedano M (1990). Hendidas y otras construcciones con ser en el habla de Caracas. Caracas: Universidad Central de Venezuela.
Sedano M (2003). ‘Seudohendidas y oraciones con verbo ser localizador en dos corpus del español hablado de Caracas.’ Revista Internacional de Lingüística Iberoamericana I(1), 175–204.
Smits R J C (1989). Eurogrammar: the relative and cleft constructions of the Germanic and Romance languages. Dordrecht: Foris.
Sornicola R (1989). ‘It-clefts and Wh-clefts: two awkward sentence types.’ Journal of Linguistics 24, 343–378.
Vallduví E & Engdahl E (1996). ‘The linguistic realization of information packaging.’ Linguistics 34, 459–519.
Clitics
A D Caink, University of Westminster, London, UK
© 2006 Elsevier Ltd. All rights reserved.
Term and Definitions
The term clitic is used in traditional grammar for a word or particle that cannot bear accent or stress and leans on an adjacent accented word (from Greek kli:no ‘lean’). It includes both enclitic and proclitic elements. An enclitic morph joins to the end of an adjacent word, such as the reduced form of the English auxiliary verb have in (1), which is cliticized to the subject pronoun. It cannot appear in the question in (2) because it lacks a host to its left. A proclitic element attaches to the beginning of a word, such as the reduced form of the English auxiliary verb do in (3):
(1) they’ve decided against it.
(2) *’ve they decided against it?
(3) d’you need to decide today?
Some clitics may be proclitic in one context and enclitic in another, such as in Macedonian, where pronominal clitics are proclitic on the finite verb (4a) and enclitic on the gerund (4b) (see Franks and King, 2000).
(4a) mi ja dadoa smetka-ta
me.DAT it.ACC gave.3.PL bill-DEF
‘They gave me the bill’
(4b) davajki mi ja smetka-ta, . . .
give.GER me.DAT it.ACC bill-DEF
‘Giving me the bill, . . .’
Items that are clitic vary across languages but are always grammatical (or functional) words and thus members of closed classes in that they cannot be coined (Emonds, 1985: Chap. 4); they may include auxiliary verbs, pronouns (as in many Indo-European languages), question particles (Slavic -li in (9) or Finnish -ko in Nevis, 1988: 9), negative particles (Slavic verbal negation ne in (9)), and conjunctions
(Latin -que ‘and’). There are no clitic forms for open-class items such as the lexical noun wood, despite the homophony with the modal auxiliary would (cf. I’d really like that). In formal research, the term clitic is often used to refer only to pronominal clitics as in (4) that have proved to be of particular interest in terms of phonology, morphology, and syntax and the interfaces between these components of the grammar. A clitic cluster is formed when clitic elements in a language appear adjacent to one another, usually in a strict order and often in a fixed position in the clause, even in apparently free word order languages. In some formal approaches, the clitic cluster may be a primitive of the system with its own rules of clitic order (e.g., Perlmutter, 1971). For many (particularly syntactic) approaches to clitic placement, clitic cluster is a descriptive term, an epiphenomenon of the prosodic properties of clitics and their placement. In some languages, the position of the clitic cluster is restricted to the second position in the sentence, termed the Wackernagel position after the German 19th-century philologist (Wackernagel, 1892). The second position varies across languages between ‘following the first phonological word’ (e.g., Ancient Greek; Kaisse, 1985: 80) and ‘following the first constituent’ (e.g., Finnish; see Nevis, 1988: 19). Other languages allow for both second positions (e.g., Luiseño (Uto-Aztecan); see Kaisse, 1985: 85). To some extent, this is true of Serbo-Croatian, a language whose second position has attracted much debate in recent years (Bošković, 2001: 12):
(5a) taj čovek je volio Milenu.
that man aux.3.SG loved Milena
‘that man loved Milena’
(5b) taj je čovek volio Milenu.
that aux.3.SG man loved Milena
‘that man loved Milena’
The clitic auxiliary je follows the first constituent in (5a) and the first phonological word in (5b). Recent research on Serbo-Croatian demonstrates that the clitic cluster is restricted from intervening within some initial constituents, that there is substantial native speaker variation, and that the size of the clitic cluster is a factor (see Bošković, 2001: Chap. 2). For cross-linguistic debate on second-position phenomena see articles in Halpern and Zwicky (1996) and for Slavic generally see Franks and King (2000). A distinction is made between syntactic cliticization and prosodic cliticization. Hence, in the sequence A – b – C where b is a clitic, it may be the case that b is syntactically proclitic on C but prosodically enclitic on A, as in the Australian language Nganhcara (Klavans, 1985: 104–105) and Bulgarian (see (9)) (Franks and King, 2000: 63).
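The two second-position options illustrated in (5) can be made concrete with a small Python sketch. It is an illustration only, under loud assumptions: the function name place_second_position and the list-of-constituents input are hypothetical, each orthographic word is treated as one phonological word, and none of the restrictions reported for Serbo-Croatian (islands within initial constituents, speaker variation, cluster size) are modeled.

```python
def place_second_position(constituents, clitic, after="constituent"):
    """Insert a clitic in second position.

    constituents: list of constituents, each a list of word strings.
    after: "constituent" puts the clitic after the first constituent,
           "word" puts it after the first word (assumed here to be a
           phonological word).
    """
    if after not in ("constituent", "word"):
        raise ValueError("after must be 'constituent' or 'word'")
    if not constituents or not constituents[0]:
        raise ValueError("a second-position clitic needs a host to its left")
    first, rest = constituents[0], constituents[1:]
    cut = len(first) if after == "constituent" else 1
    words = first[:cut] + [clitic] + first[cut:]
    for constituent in rest:
        words.extend(constituent)
    return " ".join(words)


# The clause in (5), without the auxiliary clitic je.
clause = [["taj", "čovek"], ["volio"], ["Milenu"]]
after_constituent = place_second_position(clause, "je", after="constituent")
after_word = place_second_position(clause, "je", after="word")
print(after_constituent)
print(after_word)
```

With the input above, the first call yields the order in (5a) and the second the order in (5b); an empty clause raises an error, mirroring the requirement that the clitic have a host to its left.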
Affixes, Words, and Clitics The fundamental characteristic of clitics is that in some ways they behave like affixes (e.g., the past tense affix –d on the verb in (1)) and in some ways they behave like independent words (Sapir, 1930: 70–71), with much cross-linguistic variation. For example, affixes are highly restricted as to their host, whereas words tend to subcategorize for phrases, if at all. Pronominal clitics in many Indo-European languages are similarly restricted to appearing on a verb (as in Bulgarian (9) and French (10)), but this contrasts with Serbo-Croatian pronominal clitics, which do not distinguish between the category of host (see (5) and (7)). Similarly, the English possessive ’s is dependent on whatever the last word of the preceding noun phrase happens to be (the man I saw’s hat; that man I gave a dollar to’s hat). As a result, it can be termed a phrasal affix. Also, affixes are subject to word-internal phonological processes to which adjacent words are not subject. In Macedonian, the antepenultimate stress rule exemplified in zˇ e´ nata ‘the wife’ is followed even when the possessive clitic ti ‘your’ is added to form zˇ ena´ ta-ti ‘your wife’; the clitic appears to be part of the phonological word in the same way as a suffix, and the stress shifts accordingly (see Spencer, 1991: 360). Yet, in Latin, the enclitic -que ‘and’ is not subject to the rule that assigns main stress in a word. This rule does not ordinarily assign primary stress to the penultimate syllable if it is strong, but primary stress still appears on the strong penultimate syllable in the word þ clitic rosa´ que ‘and the rose (NOM)’ (Nespor and Vogel, 1986: 115–116; for other word-internal rules in relation to clitics, see Kayne, 1975: 85, and Nespor and Vogel, 1986: Chap. 5). Affixes generally appear in a strict order, and, if this order varies at all, then there is a concomitant change in meaning. Most languages with a clitic cluster have an equally rigid ordering. 
However, Catalan allows freedom in the ordering of some pronominal clitics (Bonet, 1991: 67):

(6) te ’m van recomanar per a aquesta feina
    you.DAT/ACC me.ACC/DAT PAST recommend for to this job
    ‘they recommended me to you for this job’ or ‘they recommended you to me for this job’
Syntactic rules of ellipsis never apply to affixes. For example, in Amali played baseball and Yasmin soccer, the missing verb is understood through identity with the first verb, but the affix -d cannot similarly be recovered: *Amali danced the tango and Yasmin play soccer. This holds true of most clitics, but recent evidence suggests that, for some speakers at least, pronominal clitics in Serbo-Croatian may undergo ellipsis:

(7) ona mu ga je dala a i ja sam mu ga dala
    she him.DAT it.ACC AUX.3SG gave and also I AUX.1.SG him.DAT it.ACC gave
    ‘she gave it to him, and I did too’

Here both the pronominal clitic mu ‘him’ and the lexical verb dala ‘gave’ may undergo ellipsis (Franks and King, 2000: 336). It is generally accepted that, if a morpheme comes between a stem and an affix, it must itself be an affix (Zwicky, 1977), yet Portuguese pronominal clitics appear between the verb stem and the tense-affix morpheme (Spencer, 1991: 366):

(8) levá-lo-ei
    raise-it.ACC-AUX.FUT
    ‘I will raise it’

In Portuguese, the lexical phonological rules treat the future ending like an inflectional affix, and it only appears on the verb. This may be related to the fact that historically the future ending was a clitic auxiliary that has become an affix (cf. French j’aimerai ‘I will love’; see Spencer, 1991: Chap. 9). The inability to bear accent or stress, a common feature of affixes, is not always a feature of clitics. In Bulgarian, the pronominal clitic mu ‘him’ bears stress (Hauge, 1976: 30):

(9) ne mú li go kaza?
    not him.DAT Q it.ACC said.3.SG
    ‘didn’t he tell him it?’

The proclitic ne ‘not’ forms a prosodic word with the enclitic mu ‘him’ that bears stress. This word then serves as a host for both the question particle li and go ‘it’ (see Klavans, 1982, for further examples of accented clitics). Indeed, it has been suggested that some clitics do not require a host at all, for example, the Dutch prepositional er ‘there’ (Riemsdijk, 1978, 1999). For diagnostics for distinguishing clitics from affixes, see Zwicky and Pullum (1983); for the clitic versus word distinction, see Zwicky (1985). For phonology and morphology, a principal interest is therefore in the definition of a word and the status of clitics. Some argue that there is a distinct category of clitic group within the prosodic hierarchy between the phonological word and the phonological phrase (Nespor and Vogel, 1986). Much recent prosodic literature rejects the need for a special category, some arguing that a given language may have more than one representation of its clitics and distinct treatments for proclitics and enclitics (see references in Gerlach and Grijzenhout, 2000: 2–8).

Typology of Clitics
The clitics in (1)–(3) appear in a subset of the set of positions in which their corresponding full forms appear (see Kaisse, 1985: 43 for a further discussion of English auxiliaries). In other words, they exhibit no distinct syntactic behavior; their position may be derived from the syntax. These are termed simple clitics in Zwicky (1977). In contrast, the pronominal clitics in examples (4)–(10) appear in a different syntactic position than their full-form equivalents. That is, whereas the clitic l’ ‘him’ in the French example (10a) precedes the verb, the full form lui ‘him’ appears in the subcategorized argument position following the verb (10b). Clitics with such idiosyncratic syntax and phonological forms distinct from their full-form alternants are termed special clitics in Zwicky (1977). (A third distinction in Zwicky, 1977, that has not been widely adopted is the bound word, a clitic without any apparent full-form alternant, such as Latin -que ‘and’.)

Special Clitics
The principal interest in special clitics and particularly pronominal clitics has been to account for their syntactic position and relation to the full-form positions. Kayne (1975) first argued that the relation was one of movement in French:

(10a) Jean l’ aime e
      John him.ACC loves
      ‘John loves him’

(10b) Jean n’ aime que lui
      Jean NEG loves only him.ACC
      ‘it’s only him that Jean loves’
In (10a), the clitic is base-generated in argument position e and moved to its surface position. (10b) has the full-form alternant lui ‘him’ in the post-verbal argument position. Issues addressed within the movement approach include the questions of why the movement is generally clause-bound and why, in some Romance and Balkan languages, clitics associated with the argument structure of a subordinate
predicate may appear in the main clause (clitic climbing). Consider the Italian sentence:

(11) Piero ti verrà a parlare di parapsicologia
     Piero you.DAT come.FUT to speak.INF of parapsychology
     ‘Piero will come to speak to you about parapsychology’
Approaches have tended either to focus on the typology of movements (head vs. phrasal movement; A vs. A′ movement) or to argue that the structure in (11) is formally monoclausal in some way (see Rizzi, 1982, and articles in Riemsdijk, 1999). In Bare Phrase Structure (Chomsky, 1995), when a phrase dominates only a pronominal clitic, the clitic is formally both a maximal and minimal projection and thus able to move as either. Clitic movement differs from more traditional examples of movement because the form of the moved element differs from that of the unmoved element. Also, in some South American Spanish dialects (Borer, 1984: 16), the clitic duplicates a phrase in argument position (called clitic doubling in Jaeggli, 1982):

(12) lo vimos a Juan.
     him.ACC see.PRET.1.PL to Juan
     ‘we saw Juan’
The clitic has been seen as absorbing the case and theta-assigning properties of the verb, forcing the need for a preposition (the Kayne-Jaeggli generalization; see Jaeggli, 1982; Borer 1984, 1986). However, there is no preposition in other cases of clitic doubling such as Macedonian example (4) (clitic doubling is common in the Balkan languages, Romanian, Greek, and Albanian, and in South Slavic; see articles in Riemsdijk, 1999; Beukema and Dikken, 2000). One recent reformulation suggests that the clitic is generated as part of the functional structure of the subcategorized phrase (e.g., as D) and moves independently of the overt phrase (e.g., Torrego, 1998). Alternatively, pronominal clitics are argued to be base-generated in the surface position with an appropriate mechanism relating the clitic to the empty category in argument position (e.g., Borer, 1984; articles in Riemsdijk, 1999). With the rise of a more articulated functional hierarchy, a popular account has been to see clitics as agreement morphemes heading functional projections with arguments raising to check features (e.g., Sportiche, 1995). Attempts to account for second-position clitics are many and varied, from the purely phonological treatment through to the purely syntactic. Increasingly, accounts involve a combination of components, such
as modification of the syntactic output at the phonological or morphological form (either through movement or the selective spell-out of movement copies; see articles in Halpern and Zwicky, 1996; for variable spell-out in, mainly, Slavic, see Bošković, 2001). Morphological and Optimality-theoretic approaches have often treated clitics as phrasal affixes, usually with a wider cross-linguistic coverage. Klavans (1982, 1985) and Anderson (1992) position clitics via three parameters: (1) the choice of host as an initial or final element in a phrase, (2) whether the clitic precedes or follows the host, and (3) whether the item is proclitic or enclitic. Optimality-theoretic approaches have proven particularly effective in capturing the various competing constraints with respect to clitic placement cross-linguistically (see Anderson, 1996; Franks and King, 2000; articles in Gerlach and Grijzenhout, 2000). For an introduction focusing on morphology, see Spencer (1991: Chap. 9); for significant reviews of the field and key approaches, see Borer (1986), Riemsdijk (1999), and Gerlach and Grijzenhout (2000).

See also: Affixation; A-Morphous Morphology; Ellipsis; Functional Categories; Lexical Functional Grammar; Minimalism; Morpheme; Morphology: Optimality Theory; Noun Phrases; Phonology: Optimality Theory; Rule Ordering and Derivation in Phonology; Transformational Grammar: Evolution; Uto-Aztecan Languages; Wackernagel, Wilhelm (1806–1869); Word; Word Classes/Parts of Speech: Overview; Word Formation; Word Stress.
Bibliography

Anderson S R (1992). A-morphous morphology. Cambridge: Cambridge University Press.
Anderson S R (1996). ‘How to get your clitics in place or why the best account of second-position phenomena may be something like the optimal one.’ Linguistic Review 13, 165–191.
Beukema F & Dikken M den (eds.) (2000). Clitic phenomena in European languages. Amsterdam: John Benjamins.
Bonet E M (1991). Morphology after syntax: Pronominal clitics in Romance. Ph.D. diss., MIT.
Borer H (1984). Studies in generative grammar 13: Parametric syntax: Case studies in Semitic and Romance languages. Dordrecht: Foris.
Borer H (1986). Syntax and semantics 19: The syntax of pronominal clitics. Orlando, FL: Academic Press.
Bošković Ž (2001). Linguistic variations 60: On the nature of the syntax-phonology interface: Cliticization and related phenomena. Amsterdam: Elsevier.
Chomsky N (1995). The minimalist program. Cambridge, MA: MIT Press.
Emonds J E (1985). A unified theory of syntactic categories. Dordrecht: Foris.
Franks S & King T H (2000). A handbook of Slavic clitics. Oxford: Oxford University Press.
Gerlach B & Grijzenhout J (eds.) (2000). Linguistics today 36: Clitics in phonology, morphology and syntax. Amsterdam: John Benjamins.
Halpern A & Zwicky A (1996). Approaching second position: Second position clitics and related phenomena. Stanford, CA: CSLI Publications.
Hauge K R (1976). The word order of predicate clitics in Bulgarian. Oslo: Universitetet i Oslo, Slavisk-Baltisk Institutt.
Jaeggli O (1982). Topics in Romance syntax. Dordrecht: Foris.
Kaisse E (1985). Connected speech: The interaction of syntax and phonology. Orlando, FL: Academic Press.
Kayne R (1975). Current studies in linguistics 6: French syntax: The transformational cycle. Cambridge, MA: MIT Press.
Klavans J (1982). Some problems in a theory of clitics. Bloomington, IN: Indiana University Linguistics Club.
Klavans J (1985). ‘The independence of syntax and phonology in cliticization.’ Language 61, 95–120.
Nespor M & Vogel I (1986). Prosodic phonology. Dordrecht: Foris.
Nevis J A (1988). Finnish particle clitics and general clitic theory. New York: Garland Publishing.
Perlmutter D (1971). Deep and surface structure constraints in syntax. New York: Holt, Rinehart & Winston.
Riemsdijk H C van (1978). A case study in syntactic markedness. Dordrecht: Foris.
Riemsdijk H C van (ed.) (1999). Empirical approaches to language typology: Clitics in the languages of Europe. The Hague: Mouton de Gruyter.
Rizzi L (1982). Issues in Italian syntax. Dordrecht: Foris.
Sapir E (1930). ‘Southern Paiute, a Shoshonean language.’ Proceedings of the American Academy of Arts and Sciences 65, 1.
Spencer A (1991). Morphological theory. Oxford: Blackwell.
Sportiche D (1995). ‘Clitic constructions.’ In Rooryck J & Zaring L (eds.) Phrase structure and the lexicon. Dordrecht: Kluwer. 213–276.
Torrego E (1998). Linguistic Inquiry monographs 34: The dependencies of objects. Cambridge, MA: MIT Press.
Wackernagel J (1892). ‘Über ein Gesetz der indogermanischen Wortstellung.’ Indogermanische Forschungen 1, 333–436.
Zwicky A (1977). On clitics. Bloomington, IN: Indiana University Linguistics Club.
Zwicky A (1985). ‘Clitics and particles.’ Language 61, 283–305.
Zwicky A & Pullum G (1983). ‘Cliticization vs. inflection: English n’t.’ Language 59, 502–513.
Clothing: Semiotics
M Danesi, University of Toronto, Toronto, Canada
© 2006 Elsevier Ltd. All rights reserved.
Introduction

Like any other object or artifact, clothes are interpreted as signs that stand for such things as the personality, the social status, and the overall character of the wearer. This is why semioticians talk of ‘dress codes’ as particular types of social codes – systems of signs that cohere to provide information about how to dress for an occasion. Clothing is more than just bodily covering for protection. It is a sign system that is interconnected with the other sign systems of a society through which such emotions, states, and variables as attitudes, gender, age, social status, political beliefs, etc. can be encoded. This is why uniforms are required by special groups such as sports teams, military organizations, religious institutions, and the like. These encode specific kinds of meanings. Since Roland Barthes’ (1915–1980) study of fashion as a sign system in Système de la mode (1967), ‘clothing semiotics,’ as it is sometimes called, has become a popular area of inquiry both within
semiotics (e.g., Rubinstein, 1995; Danesi, 2004) and in cognate disciplines such as anthropology and psychology (e.g., Davis, 1992; Enninger, 1993; Craik, 1993; Hollander, 1988, 1994; McRobbie, 1988; Steele, 1995; Luciano, 2000).
Bodies, Clothes, and Dress

People do not perceive bodies merely as biological substances; they also perceive them as signs of Selfhood (Goffman, 1959). For this reason, the human body has been subject to varying interpretations across history and across cultures. In ancient Greece, for example, the body was glorified as a source of pleasure; in ancient Rome, on the other hand, it was viewed as a source of moral corruption. The Christian church has always played on the duality of the body as a temple and as an enemy of the spirit. Because clothes are worn on bodies, they are perceived as extensions of bodily meanings and are thus tied to varying cultural interpretations. This does not imply that clothes lack basic social functions that hold across societies. As the anthropologist Helen Fisher (1992: 253–254) observes, even in the jungle of Amazonia, Yanomamo men and women wear clothes for sexual
modesty. A Yanomamo woman would feel as much discomfort and agony at removing her vaginal string belt as would a North American woman if one were to ask her to remove her underwear. Similarly, a Yanomamo man would feel just as much embarrassment at his penis accidentally falling out of its encasement, as would a North American male caught literally ‘with his pants down.’ As these cross-cultural comparisons bring out, clothing the body for social presentation is fundamentally a reflex of basic human functions. It is, in fact, intrinsically intertwined with sexual, romantic, and courtship functions throughout the world. When a young Zulu woman falls in love, she is expected to make a beaded necklace resembling a close-fitting collar with a flat panel attached, which she then gives to her boyfriend. Depending on the combination of colors and bead pattern, the necklace will convey a specific type of romantic message: e.g., a combination of pink and white beads in a certain pattern would convey the message You are poor, but I love you just the same (Dubin, 1987: 134).

At a biological level, clothes have a very important function indeed – they enhance survivability considerably. This is the level of denotation in semiotic theory – the level at which a referent is tied to its biological function. Clothes are, denotatively, human-made extensions of the body’s protective resources, perceived as additions to our protective bodily hair and skin thickness. As Werner Enninger (1992: 215) aptly points out, this is why clothing styles vary according to geography: “The distribution of types of clothing in relation to different climatic zones and the variation in clothes worn with changes in weather conditions show their practical, protective function.” But it is also a fact that clothes take on a whole range of connotations in specific social settings that have little to do with survival.
Connotations are meanings that accrue in cultural context over time, leading to the formation of ‘dress codes’ (from Old French dresser ‘to arrange, set up’) that inform people about how to clothe themselves in social situations. To someone who knows nothing about Amish culture, the blue or charcoal Mutze of the Amish male is just a jacket. But to the Amish the blue Mutze signals that the wearer is between 16 and 35 years of age, and the charcoal one signals that he is over 35. Similarly, to an outsider the Russian kalbak appears to be a brimless red hat. To a rural Russian, however, it means that the wearer is a medical doctor. It is interesting to note, too, that dress codes, like other types of codes, can be used to lie about oneself. Con artists and criminals can dress in three-piece suits to look trustworthy, a crook can dress like a police officer to gain a victim’s confidence, and so on. To discourage people from deceiving others through
clothing, some societies have even enacted laws that prohibit misleading dressing, defining strictly who can dress in certain ways. In ancient Rome, for instance, only aristocrats were allowed to wear purple-colored clothes, and in many religiously oriented cultures, differentiated dress codes for males and females are regularly enforced.
Dress Codes

The broad range of connotations associated with dress codes is inextricably interconnected with social trends and political movements. Until the early 1950s, females in Western culture rarely wore pants. To say that someone ‘wore the pants’ in a family meant, denotatively and connotatively, that the wearer was male. With the change in social role structures during the 1950s and 1960s, women began to wear pants regularly, sending out the new social messages that this entailed. The reverse situation has never transpired. Except in special ritualistic circumstances – such as the wearing of a Scottish kilt – men have never adopted wearing women’s skirts in modern-day Western society. If they ever do, it would probably be labeled an act of ‘transvestism.’

Dressing for social reasons is a universal feature of human cultures. Even in cold climates, some people seem more interested in decorating their bodies than in protecting them. In the 1830s, British biologist Charles R. Darwin (1809–1882) traveled to the islands of Tierra del Fuego, off the southern tip of South America. There he saw people who wore only a little paint and a small cloak made of animal skin, in spite of the cold rain and the sleet. Darwin gave the people scarlet cloth, which they took and wrapped around their necks, rather than wear it around the lower body for warmth. Even in the cold weather, the people wore clothing more for decoration than for protection.

No one knows exactly why or when people first wore clothes. Estimates trace the origin of clothing to 100 000 years ago. Archeological research suggests that prehistoric hunters may have worn the skin of a bear or a reindeer in order to keep warm or as a sign of personal skill, bravery, and strength. By the end of the Old Stone Age – about 25 000 years ago – people had invented the needle, which enabled them to sew skins together.
They had also learned to make yarn from the threadlike parts of some plants or from the fur or hair of some animals. In addition, they had learned to weave yarn into cloth. At the same time, people had begun to raise plants that gave them a steady supply of materials for making yarn. They had also started to herd sheep and other animals that gave them wool.
From the outset, it appears that clothes were worn not only for protection, but also for various social reasons. Shamans, for example, have always worn special clothing to indicate who they are. This continues to be the case for all kinds of clerics today. Dress also conveys people’s beliefs, feelings, and general approach to life. Confident people often show more independence in choosing their style of dress than do those who are shy or unsure of themselves. The confident individual is likely to try new clothing styles. A shy person may seek security by following current styles. Others may be unconcerned about their dress and care little whether they dress in what others consider attractive clothing. Some people wear plain clothes because of strong beliefs about personal behavior. Such people believe it is wrong to care about wearing clothes as decoration. They believe that, instead, people should be concerned with other matters. Members of the Amish religious group have this kind of belief system. Amish men wear plain, dark clothes, and Amish women wear long, plain dresses. The hippies, on the other hand, dressed to emphasize ‘love’ and ‘freedom’ in the 1960s. Motorcycle gang members wear leather jackets, boots, and various items such as brass knuckles to convey toughness. Like language, a dress code can be endearing, offensive, controversial, delightful, disgusting, foolish, or charming. In all societies, certain items of dress have special meanings. For example, the meanings of headgear vary widely, depending not only on climate, but also on customs. For instance, a Russian farmer wears a fur hat to protect himself from the cold. A South American cowboy wears a felt gaucho hat as part of his traditional costume. The American cowboy wears a wide-brimmed hat for protection from the sun. The members of a nation’s armed services wear a hat as part of their uniform. The hats of coal miners, fire fighters, and matadors indicate the wearer’s occupation. 
Clowns wear colorful, ridiculous hats to express fun and happiness. To the Amish, the width of the hat brim and the height of the crown can communicate whether the wearer is married or not. It is interesting to note that throughout the centuries, the desire of people to be fashionable has resulted in many kinds of unusual hats. During the 1400s, many European women wore a tall, cone-shaped hat called a hennin. This hat measured from 3 to 4 feet (0.9 to 1.2 meters) high and had a long, floating veil. The Gainsborough hat became popular with both men and women in the late 1700s. It had a wide brim and was decorated with feathers and ribbons. Hats are, and have always been, props in dress codes, communicating various things about the people who wear them. Most people wear a hat that they
believe makes them look attractive. This is why much protective headgear today, such as fur hoods and rain hats, is both attractive and stylish. Even the caps of police officers and military personnel are designed to improve the wearer’s appearance. No one knows when people first wore hats. People in various cold climates may have worn fur hoods as far back as 100 000 years ago. Through the centuries, people have worn headgear to indicate their social status. In ancient Egypt, the nobility wore crowns as early as 3100 B.C. They have also worn them to be fashionable. Some ancient Greeks wore hats known as pelos for fashion. These were usually made from wool fibers. Pelos can still be found in parts of southern Siberia today. They are similar to the brimless, tasseled hat known as a fez. By the 14th century, people wore hats increasingly for fashion, resulting in the development of a large variety of hats and frequent changes in hat styles. People in one area often adopted the hat styles worn in another. During that century, for example, women in western Europe wore a type of hat that resembled a turban. They adopted this style from the headgear worn by people who lived in the Middle East and the Orient. During the 20th century, hat styles varied more widely than ever before. In the 1920s, women wore a drooping, bell-shaped hat called the cloche. In the 1930s, they wore the harlequin hat, which had a wide, upturned brim. A variety of hats were worn in the 1940s and 1950s. The cap became a central accouterment of male teen style, during the heyday of the rap movement in the mid to late 1990s, when it symbolized clique solidarity.
Nudity

The human being is the only animal that cannot ‘go nude,’ so to speak, without social repercussions (unless, of course, the social ambiance is that of a nudist camp). Nudity is the counterpart of clothing, and thus can only be interpreted culturally. What is considered ‘exposable’ of the body will vary significantly from culture to culture, even though the covering of genitalia seems, for the most part, to cross cultural boundaries. Semiotically, nudity assumes significance because it is in binary opposition with clothing – i.e., it forms a counterpart to it, with one entailing the other. Acts of ‘clothing-removal,’ such as strip-tease performances, have appeal because of this unconscious semiotic dualism. In an audience setting, these have, first and foremost, something of a pagan ritualistic quality to them, based on mimetic portrayals of sexual activities and sexual emotions. As the psychoanalyst Sigmund Freud (1856–1939) suggested in many of his writings,
clothing the body has, paradoxically, stimulated curiosity and desire for the nude body. In a word, what makes nudity appealing in such situations is clothing. This is why certain types of clothing items, such as shoes, are perceived to have sexual significance. They allude to bodily parts that have become desirable, engaging viewers in a communal ritual similar to the many carnivals put on throughout the world. The nude body is, thus, a sign. This is why visual artists have always had a fascination with the nude figure. The ancient Greek and Roman nude statues of male warriors, Michelangelo’s (1475–1564) powerful David sculpture, and Rodin’s (1840–1917) nude sculpture The Thinker are all suggestive of the potency of the male body. It is this ‘iconography’ of nudity that enhances the attractiveness of the male in our society. A male with a ‘weakling’ body is hardly ever perceived as sexually attractive. On the other side of this semiotic paradigm, paintings and sculptures of female nude figures have tended to portray the female body ambiguously as either (1) soft and submissive, as can be seen in the famous ancient Greek statue known as the Venus de Milo, which represents Aphrodite, the Greek goddess of love and beauty (Venus in Roman mythology), or (2) feral and powerful, as can be seen in the sculptures of Diana of Roman mythology. It is (2) that came to the forefront again in the 1990s. In what became known as the ‘girl power’ movement, representations of women in pop culture now emphasize the second of the two iconographic traditions. The interplay between clothing and nudity as sign systems is part of a culture’s historical iconography. This is largely unconscious, conditioning representations of bodies in virtually all areas of human social life, from advertising and erotica to religious dress.
Fashion

Until the Renaissance, following trends in dress, known as ‘fashion,’ was the privilege of the rich in most parts of the world. Since the early decades of the 20th century, however, it has become an intrinsic component of the lifestyle of common people in many parts of the world. ‘Fashion statement’ has become personal statement. Fashion can be defined as the prevailing style or custom of dress. Although fashion usually refers to dress, it does not mean the same thing as clothing. People have always worn clothes that reflected the long-standing customs of their communities, and clothing styles changed extremely slowly in the past. Fashion, however, causes styles to change rapidly for a variety of historical, psychological, and sociological reasons. A clothing style may be introduced as a fashion, but the style becomes a custom if it is handed
down from generation to generation. A fashion that quickly comes and goes is called a fad. As Barthes (1967) argued, it constitutes a kind of ‘macro’ dress code that sets standards according to age, gender, class, etc. To understand how fashion codes emerge, it is instructive to consider the male business suit. The connotative message underlying the apparel text is, of course, ‘dress for success.’ How did this subtext crystallize in our culture? A look at the history of the business suit provides an interesting answer to this question. In 17th-century England, there existed a bitter conflict in social ideology between two forces – the Royalist ‘Cavaliers,’ who were faithful to King Charles I, and the Puritans, who were followers of Oliver Cromwell (1599–1658), the military, political, and religious figure who led the Parliamentarian victory in the English Civil War (1642–1649). This conflict was a battle of lifestyles, as the two warring camps sought to gain political, religious, and cultural control of English society. The Cavaliers were aristocrats who only superficially followed the teachings of the Anglican Church. Their main penchant was for a life of indulgence (at least as the Puritans perceived it). They wore colorful clothes, flamboyant feathered hats, beards, and long flowing hair. This image of the Cavalier as a ‘swashbuckler’ has been immortalized by literary works such as The Three Musketeers (Alexandre Dumas, 1844) and Cyrano de Bergerac (Edmond Rostand, 1897). The Puritans, on the other hand, frowned precisely upon this type of fashion, because of the ‘degenerate lifestyle’ that they perceived it to represent. Known as the ‘Roundheads,’ Cromwell’s followers cropped their hair very short, forbade all carnal pleasures, and prohibited the wearing of frivolous clothing. They wore dark suits and dresses with white shirts and collars. Their clothes conveyed sobriety, plainness, and rigid moralism. 
The Cavaliers were in power throughout the 1620s and the 1630s. During this period the Puritans escaped from England and emigrated to America, bringing with them their lifestyle, rigid codes of conduct, and clothing styles. In 1645 the Puritans, led by Cromwell, defeated the Royalist forces, and in 1649 they executed the king. Subsequently, many Cavaliers also emigrated to America. Since the Puritans had set up colonies in the northeast, the Cavaliers decided to set up colonies in the south. The king’s son, Charles II, escaped to France to set up a court in exile. For a decade, England was ruled by the Puritans. Frowning upon all sorts of pleasure-seeking recreations, they closed down theaters, censored books, enforced Sunday laws, and forbade the wearing of flashy clothing. With Cromwell’s death in 1658, the Puritans were eventually thrown out of power and England
welcomed the exiled king, Charles II, back in 1660. Known as the Restoration, the subsequent 25-year period saw a return to the lifestyle and fashions of the Cavaliers. For two centuries the Puritans had to bide their time. They were excluded from holding political office, from attending a university, from engaging in any socially vital enterprise. Nevertheless, throughout the years they maintained their severe lifestyle and dress codes. By the time of the Industrial Revolution, the Puritans had their final and lasting revenge. Their lifestyle – based on thrift, diligence, temperance, and industriousness, which some have called the ‘Protestant work ethic’ – allowed them to take advantage of the economic conditions in the new industrialized world. In America and in England, Cromwell’s descendants became rich and eventually took over the reins of economic power. Ever since, Puritan ethics and fashion in the work force have influenced British and North American business culture, not to mention social mores and values at large. The origins of modern corporate capitalism are to be found in those values. The belief that hard work and ‘clean living’ are necessarily interrelated, and that this combination leads to wealth and prosperity, had become a widespread one by the turn of the present century. To this day, there is a deeply felt conviction in capitalist culture that hard work and strict living codes will lead to success in both this life and the afterlife. The business suit is a contemporary version of Puritan dress. The toned-down colors (blues, browns, grays) that the business world demands are the contemporary reflexes of the Puritans’ fear and dislike of color and ornament. During the ‘hippie’ 1960s and early 1970s, the office scene came briefly under the influence of a new form of Cavalierism, with the wearing of colorful suits, turtleneck sweaters rather than white shirts, longer hair, sideburns, Nehru jackets, medallions, and beards.
This new ‘fashion dare’ made a serious pitch to take over the world of corporate capitalism. But it was bound to fail, as the hippie movement of the 1960s was defeated and subsequently overtaken by conservative neo-puritanical forces in the late 1970s and 1980s. The ‘business suit’ model became once again the dress code for all of corporate North America, with only minor variations in detail. The business suit somehow endures – perhaps because it is intrinsically intertwined with the history of capitalism. But, nowadays, even this fashion code has become rather eclectic, not to say fragmented. Take, for example, the length of the skirt in the female business suit code. The mini, maxi, and normal length skirts are alternatively in and out of fashion. Evidently, a detail such as length of skirt is, in itself, meaningless. What appears to count is what it implies as a
signifier about the ever-fluctuating perceptions of women in the workplace and in society at large. When the mini is ‘in,’ it might imply an increased emphasis on sexual freedom in the culture. When it is ‘out,’ then it might imply the opposite – a decreased emphasis on sexuality. Whatever the case may be, the point to be made here is that the specific elements and features of a fashion code will invariably have connotative value that is a derivative of larger social codes within the culture. True fashions began to appear in northern Europe and Italy when a system of social classes developed in the late Middle Ages with the rise of the bourgeois class. At this time, the people of Europe began to classify one another into groups based on such factors as wealth, ancestry, and occupation. The clothes people wore helped identify them as members of a particular social class. Before the late Middle Ages, only wealthy and powerful individuals concerned themselves with the style of their clothes. But when the class system developed, the general population began to compete for positions within society. Fashion was one means by which people did this. One of the first true fashions appeared among young bourgeois Italian men during the Renaissance. While their elders dressed in long traditional robes, the young males wore tights and short, close-fitting jackets called doublets. This was one of the first examples of youth-based clothing that intentionally set itself apart from the adult dress code. German soldiers set another early fashion when they slashed their luxurious silk clothes with knives to reveal another colorful garment underneath. Theirs too was a youth-based fashion trend, probably intended to influence their appeal to the opposite sex. Before the 1800s, many countries controlled fashion with regulations called sumptuary laws. These limited the amount of money people could spend on private luxuries, being obviously designed to preserve divisions among the classes.
They thus regulated fashion according to a person’s rank in society. In some countries, only the ruling class could legally wear silk, fur, and the colors red and purple. In Paris in the 1300s, middle-class women were forbidden by law to wear high headdresses, wide sleeves, and fur trimmings. Other sumptuary laws forced people to buy products manufactured in their own country to help the country’s economy. For example, an English law in the 1700s prohibited people of all classes from wearing cotton cloth produced outside of England. But the lure of fashion caused many people to break this law. The cloth was so popular that people risked arrest to wear it. Ordinary people have always hoped to raise their social position by following the fashions of privileged
people. Fashions have also emerged to accompany differing perceptions of gender. Until the late 1700s, upper-class European men dressed as elaborately as women did. It was acceptable for men to wear bright-colored or pastel suits trimmed with gold and lace, hats decorated with feathers, high-heeled shoes, and fancy jewelry. But by the mid-1800s, men had abandoned such flamboyance in favor of plain, dark-colored wool suits. Society came to view this new fashion style as democratic, businesslike, and masculine. Until the early 1900s, European and American women rarely wore trousers, and their skirts almost always covered their ankles. By the 1920s, however, standards of feminine modesty had changed to the point that women began to wear both trousers and shorter skirts. Contrary to popular belief, political events seldom cause fashions to change. However, political events do sometimes speed up changes that have already begun, as we saw in the case of the business suit. For example, during the French Revolution (1789–1799), simple clothing replaced the extravagant costumes made fashionable by French aristocrats. But simple styles had become popular years earlier when men in England started wearing practical, dark suits instead of elegant, colorful clothes. English people identified these plain suits with political and personal liberty. Because many French people admired English liberty, this style was already becoming fashionable in France before the revolution. In the 19th century, the invention of mechanical looms, chemical dyes, artificial fabrics, and methods of mass production made fashions affordable to many more people. In addition, new means of mass communication spread European and American fashions throughout the rest of the world. The Industrial Revolution created a ‘fashion global village.’ Since then, fashion shows and fashion magazines have proliferated.
And, as Barthes (1967) pointed out, they change constantly because rapid turnover guarantees economic success. It is the only constant in contemporary fashion trends.
Conclusion Like all social codes, clothing is interconnected with the other codes of a culture. For instance, it is intertwined with religious ceremonies and rituals – the clothing worn at a religious service, during certain religious feasts and festivals, for example, is designed to send out specific kinds of religious messages. It is also intertwined with daily life routines, whereby dress codes guide or are sensitive to lifestyle and social options. The semiotic study of clothing thus shows that clothing is a sign system. This is why,
arguably, people are so interested in fashion trends (Steele, 1995). Today, attractive-looking celebrities, rather than aristocrats, set trends. People tend to follow fashion primarily to make themselves similarly attractive. When the standard of beauty changes, fashion changes with it. For example, when physical fitness became a popular standard of good looks in the 1980s, people began to wear exercise and athletic clothing more often. A clothing style may become fashionable over time with many different groups. For example, people began wearing blue jeans during the mid-1800s as ordinary work clothes. For decades, they were worn chiefly by outdoor laborers, such as farmers and cowboys. In the 1940s and 1950s, American teenagers adopted blue jeans as a comfortable, casual youth fashion. Young people during the 1960s wore blue jeans as a symbol of rebellious political and social beliefs. By the 1970s, people no longer considered jeans rebellious, and expensive designer jeans had become fashionable. In a fundamental sense, culture can be characterized as a huge system of connotative meanings that cohere into a ‘macro-code’ that allows members of the culture to interact purposefully and to represent and think about the world in specific ways. This is why some semioticians prefer to call it the ‘semiosphere,’ a term coined by the great Estonian semiotician Juri Lotman (1922–1993). In biology, a region that sustains life is called the ‘biosphere.’ By analogy, the semiosphere is the region of social life that sustains knowledge-making and representational activities (Lotman, 1991). Clothing is one of those sign systems that provides a direct route to the study of the semiosphere. See also: Body Language; Denotation versus Connotation; Nonverbal Communication.
Bibliography Barthes R (1957). Mythologies. Paris: Seuil. Barthes R (1967). Système de la mode. Paris: Seuil. Craik J (1993). The face of fashion: cultural studies in fashion. London: Routledge. Danesi M (2004). Messages, signs, and meanings: an introduction to semiotics and communication theory. Toronto: Canadian Scholars’ Press. Davis F (1992). Fashion, culture, and identity. Chicago: University of Chicago Press. Dubin L S (1987). The history of beads. New York: Abrams. Enninger W (1992). ‘Clothing.’ In Bauman R (ed.) Folklore, cultural performances, and popular entertainments. Oxford: Oxford University Press. 123–145.
Fisher H E (1992). Anatomy of love. New York: Norton. Goffman E (1959). The presentation of self in everyday life. Garden City: Doubleday. Gottdiener M (1995). Postmodern semiotics: material culture and the forms of postmodern life. London: Blackwell. Holbrook M B & Hirschman E C (1993). The semiotics of consumption: interpreting symbolic consumer behavior in popular culture and works of art. Berlin: Mouton de Gruyter. Hollander A (1988). Seeing through clothes. Harmondsworth: Penguin.
Hollander A (1994). Sex and suits: the evolution of modern dress. New York: Knopf. Lotman Y (1991). Universe of the mind: a semiotic theory of culture. Bloomington: Indiana University Press. Luciano L (2000). Looking good: male body image in modern America. New York: Hill & Wang. McRobbie A (1988). Zoot suits and second-hand dresses. Boston: Unwin Hyman. Rubinstein R P (1995). Dress codes: meanings and messages in American culture. Boulder: Westview. Steele V (1995). Fetish: fashion, sex, and power. Oxford: Oxford University Press.
Coarticulation W Hardcastle, Queen Margaret University College, Edinburgh, UK © 2006 Elsevier Ltd. All rights reserved.
A stretch of speech is often represented in terms of discrete elements (phonemes, segments, letters, etc.) arranged in a linear sequence on the page. However, when the movements of speech organs that produce this stretch of speech, such as the tongue, lips, and soft palate, are tracked instrumentally, it can be seen that these organs are continuously moving and that movements associated with different segments overlap in time. A consequence of this dynamic overlapping is that there is virtually no one-to-one correspondence between aspects of the speech signal and the discrete units of representation. The attempt to reconcile abstract elements in the phonological representation, which are discrete and timeless, with the time-varying physiological realizations as articulatory movements (and their acoustic consequences) has remained one of the central issues in modern experimental phonetics and has led to the development of many different theories and models. The ubiquitous overlapping of articulatory movements associated with separate sound segments is referred to as coarticulation. The term appears to have originated in the 1930s with Menzerath and de Lacerda (1933), building on the work of early experimental phoneticians such as Scripture, Rousselot, and Laclotte. In demonstrating that speech organs are continuously moving and overlapping in time, these early investigators argued against a prevailing view of their time that speech consisted of a series of relatively steady-state postures of the speech organs, linked by rapid transitional glides. Scripture, for example, concluded that ‘‘the tongue is never still and never occupies exactly the same position for
any period of time’’ (Scripture, 1902: 325). In addition, he established that the character of any articulatory speech movement depends on other movements occurring at the same time: thus there are no static characteristic postures. (For discussions of the early history of coarticulation, see Hardcastle, 1981 and Kühnert and Nolan, 1999.) One of the consequences of coarticulation is that speech sounds vary (both physiologically and acoustically) according to the nature of neighboring sounds, and ‘coarticulation’ is often used these days in its broader sense to refer to this variation. Since the 1960s, coarticulation has developed into a major area of phonetic research, and many theories have been devised to account for coarticulatory effects. The volume edited by Hardcastle and Hewlett (1999) offers a comprehensive overview of theories, data, and experimental techniques pertaining to coarticulation. Other critical reviews of models and theories of coarticulation can be found in Farnetani (1990; 1997), Kent and Minifie (1977), Sharf and Ohde (1981), Kent (1983), and Fowler (1985). The phenomenon of coarticulation can be illustrated by a graphical representation of movements of the speech organs (and different parts of the same organ) (see Figure 1). Figure 1 shows an instrumental record of movements of the jaw, lips, and tongue (tip and dorsum) during production of the sentence ‘‘say schooner again’’ by a Scottish English speaker. This speaker rounds the vowel [u] in the word ‘schooner,’ and the rounding is indicated in the instrumental record by horizontal protrusion of the lower lip. The lower lip is seen to be protruding (marked as a downward direction of the lip trace) well before the [u] vowel is articulated, in fact as early as the beginning of the [s] (indicated by the fricative noise on the speech waveform and the increase in anterior contact on the EPG trace).
Figure 1 Computer printout of instrumental records of the sentence ‘‘say schooner again’’ spoken by a speaker of Scottish English. The top trace shows the speech waveform and time scale in tenths of seconds. The next four traces show kinematic data from a Carstens electromagnetic articulograph (EMA), which plots movement in the x-y plane of miniature coils attached to the midline of articulatory structures. The EMA traces are, from the top: vertical up-down movement of the jaw; horizontal front-back movement of the lower lip (negative = forward); vertical up-down movement of the tongue dorsum; and velocity of tongue dorsum movement. The lower traces show information from electropalatography synchronized with EMA: total tongue–palate contacts activated in the anterior region of the palate (ANT trace) and total number activated in the posterior region of the palate (PST trace). Reprinted from Durward B, Baer G & Rowe P (eds.) (1999). Functional Human Movement: Measurement and Analysis. Oxford, UK: Butterworth Heinemann.
It reaches its maximum forward movement during the articulatory closure for the [k] (indicated by maximum raising of the tongue dorsum trace). The illustration in Figure 1 shows an example of labial coarticulation with the lips moving forward in anticipation of the rounded vowel at the same time as the tongue articulation for the [s]. The [s] in this environment would therefore be different from the [s] in a word such as ‘skill,’ in which no such lip rounding would be involved. Similar contextual effects involving other articulations can be seen in the [k] in ‘key,’ which is produced farther forward in the oral cavity than the [k] in ‘car’ because the body of the tongue is anticipating the more fronted position required for the [i] vowel. Another example is the difference between the vowel sounds in ‘mad’ versus ‘bad.’ The vowel in ‘mad’ will usually be more nasalized than that in ‘bad’ because the soft palate
lowering for the [m] in ‘mad’ coarticulates or overlaps with the tongue movement for the vowel, unlike in ‘bad.’ The term ‘coarticulation’ in its broadest sense is often used interchangeably with ‘assimilation,’ which also refers to the influence of context on speech sounds. Thus, we find place assimilation in words like ‘meat pie,’ in which the alveolar stop [t] ‘assimilates’ into the place of articulation of the following bilabial [p], or voice assimilation in a phrase such as ‘I have to,’ in which the voicing of the [v] assimilates into the following voiceless stop [t] and becomes the perceived voiceless [f]. Some investigators (including the originator of the term, P. Menzerath) restrict ‘coarticulation’ to the physiological mechanisms underlying speech production and reserve ‘assimilation’ for audible change to specific sounds in context, often resulting in perception of a different phoneme.
Coarticulatory effects such as those above can be described in terms of their type and extent. Type may refer to the speech organs predominantly involved, e.g., labial coarticulation (or more specifically, lingual/labial), nasal coarticulation, etc. Type may also refer to the direction of coarticulatory influence, whether anticipatory (sometimes called ‘right-to-left’) or perseverative (carryover, or ‘left-to-right’). Designation of coarticulatory influences as anticipatory or carryover depends on the theoretical premise that there is an underlying linear abstract segmental representation at some level in the speech production process. Thus, anticipatory coarticulation occurs if a sound segment is influenced by (or even becomes more like) a following sound (such as in the ‘schooner’ example in Figure 1). If a sound shows influence of a preceding sound, this is an example of carryover or perseverative coarticulation. Some theories claim that carryover coarticulation occurs because of inherent physiological characteristics of the speech articulators. For example, in the word ‘mad,’ coarticulated nasality is said to occur on the vowel because the soft palate is relatively slow moving and takes time to rise from its maximally lowered position for [m]. Anticipatory coarticulation is rather more difficult to explain. At one level it can be seen as a production strategy to enable articulatory movements to occur at the rate necessary to deliver up to five syllables per second, as in normal spontaneous speech. It is also probably the most economical strategy to employ, particularly in the face of increasing demands that occur in fast colloquial speech, and the concept of ‘economy of effort’ has frequently been linked to coarticulation. (For further discussion of competing demands on the articulators during a communication situation, see Lindblom, 1990). 
For rapid productions, some degree of parallel processing is inevitable and is seen also in the skeletal motor system (such as in sign language interpreters engaged in rapid finger spelling). At the cognitive level, anticipatory coarticulation can be seen as a further example of the universal tendency for the brain to ‘scan ahead of time’ (cf. early work on the serial ordering of behavior by Lashley, 1951). There may also be some perceptual motivation for anticipatory coarticulation. For example, as a result of anticipatory coarticulation, acoustic information on an upcoming segment is available to the listeners before that segment is articulated, and this prior knowledge may facilitate more accurate perception than would be the case if all acoustic cues were confined within the temporal boundaries of that segment (Kühnert and Nolan, 1999). Coarticulation can also be described with reference to the temporal domain of its influence. This can be expressed in terms of time or as numbers of segments.
In the example in Figure 1, the labial coarticulatory influence extended two segments in advance of the vowel. Some early theories of anticipatory coarticulation claimed that articulations began as early as possible (e.g., a study by Benguerel and Cowan, 1974, showed lip rounding influence extending up to six segments in advance). Other models using the notion of distinctive features involved a ‘feature-spreading’ approach following Henke’s work (1966) on computer modeling. For example, the feature [+rounding] spreads across all [-rounded] segments in a string. An earlier influential theory (Kozhevnikov and Chistovich, 1965) proposed a higher-level structure, the articulatory syllable, defined with reference to the temporal domain of coarticulatory spread. Based on measurements of lip protrusion for the vowel [u], the model stated that this protrusion begins at the same time as the first consonant in any string of consonants preceding the vowel, providing these consonants did not involve ‘contradictory’ articulatory gestures. Later work was to show, however, that coarticulatory influences could in fact extend across articulatory syllable boundaries so that in a vowel-consonant-vowel sequence, for example, the two vowels influence each other across a consonant boundary, a result not predicted by Kozhevnikov and Chistovich’s model (see, for example, Öhman, 1966, and for the many studies since, see the review in Farnetani and Recasens, 1999). These early theories proposed that coarticulation begins as early as possible. An alternative view is that coarticulatory influences are time locked and that the component gestures of a segment begin at a fixed interval before the phonetic target is achieved (see, e.g., Bell-Berti and Harris, 1982). Closely related to the idea of time locking is a radical approach to the phenomenon of coarticulation based on action theory using the concept of coordinative structures (see e.g., Kelso et al., 1986).
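The contrast between the look-ahead (feature-spreading) view and the time-locked view can be made concrete with a small Python sketch. It is an illustration only, not a reconstruction of Henke's program or of Bell-Berti and Harris's analysis. The function names, the representation of a segment as a (label, rounding) pair, and the timing values are all hypothetical. In particular, the sketch assumes that rounding spreads leftward only onto segments that are unspecified for rounding (marked None) and is blocked by a segment carrying its own rounding value.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (label, rounding), where rounding is
# True (+round), False (-round), or None (unspecified for rounding).
Segment = Tuple[str, Optional[bool]]


def spread_rounding(segments: List[Segment]) -> List[Optional[bool]]:
    """Look-ahead sketch: scan right to left and copy the rounding value
    of the nearest following specified segment onto each unspecified one."""
    values = [rounding for _, rounding in segments]
    upcoming: Optional[bool] = None  # nearest specified value to the right
    for i in range(len(values) - 1, -1, -1):
        if values[i] is None:
            values[i] = upcoming      # unspecified segment inherits the value
        else:
            upcoming = values[i]      # specified segment keeps its own value
    return values


def time_locked_onset(target_ms: float, lead_ms: float) -> float:
    """Time-locked sketch: the gesture starts a fixed interval before its
    target, however many segments precede the vowel."""
    return target_ms - lead_ms


if __name__ == "__main__":
    # [s k u n] as in 'schooner': rounding reaches back to [s].
    print(spread_rounding([("s", None), ("k", None), ("u", True), ("n", None)]))
    # Assumed values: vowel target at 300 ms, fixed lead of 120 ms.
    print(time_locked_onset(300.0, 120.0))
```

Under the look-ahead sketch, the onset of rounding moves earlier as more unspecified consonants precede the vowel; under the time-locked sketch it does not, which is the empirical difference the two accounts predict.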
In this approach the underlying units of speech production are not segment-like units such as is assumed in most of the above accounts but ‘gestures,’ which are speech-relevant goals containing spatial-temporal information about speech articulations. An example of an articulatory gesture in this framework would be a bilabial closure, which involves a specific and unique combination of coordinated upper lip, lower lip, and jaw movements regardless of the context. Contextual effects are said to arise as a result of temporal overlapping (‘coproduction’) with other gestures. In this coordinative structure framework, coarticulation is viewed as an automatic consequence of the inherent kinematic properties of the production mechanism. (For discussions of the gestural approach and its formulation in the articulatory phonology framework,
see Browman and Goldstein, 1992, Fowler and Saltzman, 1993, and Nolan, 1982.) Much research has been devoted to identifying the types of constraints that affect coarticulatory processes. The notion of ‘coarticulatory resistance’ (Bladon and Al-Bamerni, 1976; Recasens, 1985) attempts to identify some of the articulatory characteristics that may affect the spread of coarticulatory influences. There is evidence that coarticulation is gradual and varies between different segments. Recasens (1991, 2002) has developed an articulatory constraint model that relates coarticulatory resistance to the degree of tongue dorsum raising required for the consonant. Other constraints on coarticulatory processes pertain to prosodic and related aspects of the language, such as stress patterns, suprasegmental boundaries, syntactic structures, rate of articulation, clarity, and speech style (see, e.g., Engstrand, 1988; Lindblom, 1963; Hardcastle, 1985; and Matthies et al., 2001), and all these factors have been found to influence coarticulation both temporally and spatially. The phonological structure of a particular language may also constrain coarticulatory patterns. For example, there is some evidence that the extent of coarticulatory nasalization of vowels preceding a nasal consonant will tend to be more restricted in those languages that have a nasal/oral phonological contrast in vowels (e.g., French) compared with those that do not (e.g., English; see, e.g., Clumeck, 1976). Various models have attempted to account for these language-specific influences (see, for example, the ‘window model of coarticulation’ of Keating, 1990, and for a comprehensive account of cross-language studies of coarticulation, see Manuel, 1999). Coarticulation remains a productive area of phonetic research, and we can expect to see more-refined models being developed as instrumental techniques for investigating the kinematics of speech production improve.
See also: Distinctive Features; Experimental and Instru-
Bibliography
Beddor P S, Harnsberger J D & Lindemann S (2002). ‘Language-specific patterns for vowel-to-vowel coarticulation: acoustic structures and their perceptual correlates.’ Journal of Phonetics 30(4), 591–627.
Bell-Berti F & Harris K S (1982). ‘Temporal patterns of coarticulation: lip rounding.’ Journal of the Acoustical Society of America 71, 449–454.
Benguerel A P & Cowan H (1974). ‘Coarticulation of upper lip protrusion in French.’ Phonetica 30, 41–55.
Bladon R A W & Al-Bamerni A (1976). ‘Coarticulation resistance in English /l/.’ Journal of Phonetics 4, 137–150.
Boyce S E (1990). ‘Coarticulatory organisation for lip rounding in Turkish and English.’ Journal of the Acoustical Society of America 88, 2584–2595.
Browman C P & Goldstein L (1992). ‘Articulatory phonology: an overview.’ Phonetica 49, 155–180.
Clumeck H (1976). ‘Patterns of soft palate movement in six languages.’ Journal of Phonetics 4, 337–351.
Daniloff R & Moll K (1968). ‘Coarticulation of lip rounding.’ Journal of Speech and Hearing Research 11, 707–721.
Engstrand O (1988). ‘Articulatory correlates of stress and speaking rate in Swedish VCV utterances.’ Journal of the Acoustical Society of America 83, 1863–1875.
Farnetani E (1990). ‘V-C-V lingual coarticulation and its spatio-temporal domain.’ In Hardcastle W J & Marchal A (eds.) Speech production and speech modelling. Netherlands: Kluwer Academic. 93–110.
Farnetani E (1997). ‘Coarticulation and connected speech processes.’ In Hardcastle W J & Laver J (eds.) A handbook of phonetic science. Oxford: Blackwell. 371–404.
Farnetani E & Recasens D (1999). ‘Coarticulation models in recent speech production theories.’ In Hardcastle W J & Hewlett N (eds.) Coarticulation: theory, data and techniques. Cambridge: Cambridge University Press. 31–65.
Flege J (1988). ‘Anticipatory and carryover nasal coarticulation in the speech of children and adults.’ Journal of Speech and Hearing Research 31, 525–536.
Fowler C A (1980). ‘Coarticulation and theories of extrinsic timing.’ Journal of Phonetics 8, 113–133.
Fowler C A (1985). ‘Current perspective on language and speech production: a critical overview.’ In Daniloff R (ed.) Speech science. London: Taylor & Francis. 193–278.
Fowler C A & Saltzman E (1993). ‘Coordination and coarticulation in speech production.’ Language and Speech 36, 171–195.
Gay T J (1979). ‘Coarticulation in some consonant-vowel and consonant cluster-vowel syllables.’ In Lindblom B & Öhman S (eds.) Frontiers of Speech Communication Research. London: Academic Press. 69–76.
Gelfer C, Bell-Berti F & Harris K (1989). ‘Determining the extent of coarticulation: effects of experimental design.’ Journal of the Acoustical Society of America 86, 2443–2445.
Guenther F H (1994). ‘Skill acquisition, coarticulation and rate effects in a neural network model of speech production.’ Journal of the Acoustical Society of America 95, 2924.
Hardcastle W J (1981). ‘Experimental studies in lingual coarticulation.’ In Asher R & Henderson E (eds.) Towards a history of phonetics. Edinburgh: Edinburgh University Press. 50–66.
Hardcastle W J (1985). ‘Some phonetic and syntactic constraints on lingual coarticulation during /kl/ sequences.’ Speech Communication 4, 247–263.
Hardcastle W J & Hewlett N (1999). Coarticulation: theory, data and techniques. Cambridge: Cambridge University Press.
Henke W L (1966). Dynamic articulatory model of speech production using computer simulation. Ph.D. diss., MIT.
Hoole P, Nguyen-Trong N & Hardcastle W J (1993). ‘A comparative investigation of coarticulation in fricatives: electropalatographic, electromagnetic and acoustic data.’ Language and Speech 36, 235–260.
Katz W, Kripke C & Tallal P (1991). ‘Anticipatory coarticulation in the speech of adults and young children: acoustic, perceptual and video data.’ Journal of Speech and Hearing Research 34, 1222–1232.
Keating P A (1990). ‘The window model of coarticulation: articulatory evidence.’ In Kingston J & Beckman M E (eds.) Papers in laboratory phonology I: between the grammar and the physics of speech. Cambridge: Cambridge University Press. 451–470.
Kelso J A S, Saltzman E L & Tuller B (1986). ‘The dynamical perspective on speech production: data, and theory.’ Journal of Phonetics 14, 29–59.
Kent R (1983). ‘The segmental organization of speech.’ In MacNeilage P (ed.) The production of speech. New York: Springer. 57–89.
Kent R & Minifie F (1977). ‘Coarticulation in recent speech production models.’ Journal of Phonetics 5, 115–133.
Kozhevnikov V & Chistovich L (1965). Speech: articulation and perception. Washington, DC: Joint Publications Research Service.
Kühnert B & Nolan F (1999). ‘The origin of coarticulation.’ In Hardcastle W J & Hewlett N (eds.) Coarticulation: theory, data and techniques. Cambridge: Cambridge University Press. 7–30.
Lashley K S (1951). ‘The problem of serial order in behavior.’ In Jeffress L A (ed.) Cerebral mechanisms in behavior. New York: Wiley. 112–136.
Lindblom B (1963). ‘Spectrographic study of vowel reduction.’ Journal of the Acoustical Society of America 35, 1773–1781.
Lindblom B (1990). ‘Explaining phonetic variation: a sketch of the H&H theory.’ In Hardcastle W J & Marchal A (eds.) Speech production and speech modelling. Dordrecht: Kluwer Academic Publishers. 403–439.
Lubker J F & Gay T (1982). ‘Anticipatory labial coarticulation: experimental, biological and linguistic variables.’ Journal of the Acoustical Society of America 71, 437–448.
Manuel S (1999). ‘Cross-language studies: relating language-particular coarticulation patterns to other language-particular facts.’ In Hardcastle W J & Hewlett N (eds.) Coarticulation: theory, data and techniques. Cambridge: Cambridge University Press. 179–198.
Matthies M et al. (2001). ‘Variation in anticipatory coarticulation with changes in clarity and rate.’ Journal of Speech, Language, and Hearing Research 44(2), 340–353.
Menzerath P & de Lacerda A (1933). Koartikulation, Steuerung und Lautabgrenzung. Berlin: Ferd. Dümmlers.
Nolan F (1982). ‘The role of action theory in the description of speech production.’ Linguistics 20, 287–308.
Ohala J J (1993). ‘Coarticulation and phonology.’ Language and Speech 36, 155–171.
Öhman S (1966). ‘Coarticulation in VCV utterances: spectrographic measurements.’ Journal of the Acoustical Society of America 39, 151–168.
Parush A, Ostry D & Munhall G (1983). ‘A kinematic study of lingual coarticulation in VCV sequences.’ Journal of the Acoustical Society of America 74, 1115–1125.
Perkell J S & Matthies M (1992). ‘Temporal measures of anticipatory labial coarticulation for the vowel /u/: within- and cross-subject variability.’ Journal of the Acoustical Society of America 91, 2911–2925.
Recasens D (1985). ‘Coarticulatory patterns and degrees of coarticulatory resistance in Catalan CV sequences.’ Language and Speech 28, 97–114.
Recasens D (1991). ‘An electropalatographic and acoustic study of consonant-to-vowel coarticulation.’ Journal of Phonetics 19, 177–192.
Recasens D (2002). ‘An EMA study of VCV coarticulatory direction.’ Journal of the Acoustical Society of America 111(6), 2828–2841.
Recasens D, Pallarès M D & Fontdevila J (1997). ‘A model of lingual coarticulation based on articulatory constraints.’ Journal of the Acoustical Society of America 102, 544–561.
Scripture E (1902). The elements of experimental phonetics. New York: Charles Scribner’s Sons.
Sharf D J & Ohde R N (1981). ‘Physiologic, acoustic and perceptual aspects of coarticulation: implications for the remediation of articulatory disorders.’ In Lass N J (ed.) Speech and language: advances in basic research and practice V. New York: Academic Press. 153–247.
Sussman H M & Westbury J (1981). ‘The effects of antagonistic gestures on temporal and amplitude parameters of anticipatory labial coarticulation.’ Journal of Speech and Hearing Research 24, 16–24.
Cobbett, William (1763–1835)
M Miyawaki, Senshu University, Kanagawa, Japan
© 2006 Elsevier Ltd. All rights reserved.
William Cobbett, essayist, politician, agriculturist, and grammarian, was born in Farnham, Surrey, England, on March 9, 1763. Cobbett grew up as a farm
boy with virtually no formal education. In 1784, at the age of 21, he joined the army, where he managed to find time to teach himself the rules of grammar by reading and copying Robert Lowth’s A short introduction to English grammar (1762). One day he received an eloquent but ungrammatical letter from a Nottingham stocking weaver, which impelled him
to write an English grammar. His Grammar of the English language, set out in the form of letters to his son James, first appeared in New York in 1818. It was an immediate success. The second and third editions were published in London in 1819, and the fourth edition followed in 1820. A revised edition, to which were added “Six lessons, intended to prevent statesmen from using false grammar, and from writing in an awkward manner,” appeared in 1823. In 1832 Cobbett was elected an M.P. representing Oldham, Lancashire, in which seat he remained until his death on June 18, 1835. Although the overall framework of Cobbett’s grammar is traditional, being based on the nine parts of speech and their categories, the readership he has in mind is unique. As the subtitle indicates, his grammar is “intended for the Use of Schools and of Young Persons in general; but, more especially for the Use of Soldiers, Sailors, Apprentices, and Ploughboys.” Thus Cobbett’s aim was to make young people of the working class competent speakers and writers of English so that they would be able to “assert with effect [their] rights and liberties” (edn. by Nickerson and Osborne, 1983: 32). Accordingly, his statements are prescriptive rather than descriptive, with reason as the criterion of correctness: “It is reason [that] is to be your sole guide” (93). Cobbett warned against the use of me in “It was me,” which should be “It was I” (93). Similarly, he criticized the double negative in “Do not give him none of your money,” which should be “Do not give him any of your money” (141). Cobbett also advised on matters of style, emphasizing the clearness and strength of meaning. He warned that “one of the greatest of all faults in writing and in speaking is [...] the using of many words to say little” (150). Thus his grammar can be seen as a forerunner of such manuals of usage and style as H. W. Fowler’s Modern English usage (1926).
See also: Fowler, Henry Watson (1858–1933); Lowth, Robert (1710–1787).
Bibliography
Aarts F G A M (1986). ‘William Cobbett: radical, reactionary and poor man’s grammarian.’ Neophilologus 70, 603–614.
Aarts F (1994). ‘William Cobbett’s Grammar of the English language.’ Neuphilologische Mitteilungen 95, 319–332.
Cobbett W (1818). A grammar of the English language, in a series of letters. New York: Clayton and Kingsland (modern edn. by C C Nickerson and J W Osborne 1983, Amsterdam: Rodopi; reissue of the 1923 edn. with an introduction by Roy Hattersley, 2002, Oxford: Oxford University Press).
Vallins G H (1954). ‘Cobbett’s grammar.’ English 10, 48–53.
Cocos (Keeling) Islands: Language Situation
U Ansaldo, Universiteit van Amsterdam, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.
The Cocos (Keeling) Islands are a coral atoll of 27 islands that constitute the westernmost outpost of Australia, positioned in the middle of the Indian Ocean 1000 km southwest of Java. First settled by agents of the East India Company, in 1829 the original population consisted of a total of 98 people, most of whom were of Indonesian/Malay origin, with a few British sailors as well as, possibly, a few individuals of Papuan, Indian, and even Asian provenance. They were portrayed as speaking a form of Trade Malay or a Malay-based lingua franca. As of 2001, the population of the Cocos-Keeling consists of 618 individuals who reside on the two inhabited islands of the atoll. West Island is the base for roughly 100 Australians, mostly administrators and schoolteachers on 2- to 3-year postings. They
speak Australian English and, with few exceptions, have no knowledge of Malay. Home Island, the original settlement of the first settlers, still houses 80% of the population, predominantly of Indonesian provenance. These are the Cocos Malays, descendants of laborers brought to the islands during the 150-year period in which the islands were an estate of the Clunies-Ross clan, which lasted until the Australian government took over in 1978. The community is Muslim. Cocos Malay is the dominant language on Home Island; it can be described as a contact variety of colloquial Malay with strong Javanese influences. It was the dominant language until the Australian takeover and is still very much alive today. Knowledge of English is restricted in most individuals of advanced age; it is functional in most individuals of middle age, and the younger generations have near-native fluency. The school operates a bilingual program in which English is the language of immersion and Cocos Malay is also used to aid in instruction, particularly at primary
level. Code-switching is common in the young generation. This generation is also particularly exposed to standard Indonesian and standard Malay. The exposure to Indonesian comes from the media (Indonesian TV) as well as from Bahasa Indonesia, which is offered in school as a second language choice. The influence of Malay comes less directly, from the status that the language enjoys for literary and religious reasons: in 2001 there were clearly normative voices to be heard advocating a more standard form of Malay to be spoken on the island. All these facts considered, it is quite likely that Cocos Malay may lose many of its particular traits in the near future. Cocos Malay communities can also be found in Western Australia (children have to move there to complete their education after grade 10) as well as in Sabah, Malaysia. Relocated from the original settlements in different waves as the islands can only sustain a limited population, these communities are described as having lower proficiency in Cocos Malay than that on Home Island. In particular, in Sabah it is reported that convergence towards standard Malay has taken place.
See also: Australia: Language Situation; Bilingualism; Code Switching and Mixing; Language Change and Language Contact; Malaysia: Language Situation.
Bibliography
Adelaar K A (1996). ‘Malay in the Cocos (Keeling) Islands.’ In Nothofer B (ed.) Reconstruction, classification, description. Festschrift in honour of Isidore Dyen. Hamburg: Abera Network Asia Pacific. 23–37.
Bunce P (1988). The Cocos (Keeling) Islands. Singapore: John Wiley & Sons Australia Ltd.
Gibson-Hill M A (1947). ‘Notes on the Cocos-Keeling Islands.’ Journal of the Malayan Branch of the Royal Asiatic Society XX(2), 140–202.
Hunt J G (1989). ‘The revenge of the Bantamese. Factors for change in the Cocos (Keeling) Islands.’ Master’s thesis, Australian National University.
Lapsley A D (1983). ‘Cocos Malay Syntax.’ Master’s thesis, Monash University.
Lim L & Ansaldo U (2003). ‘Sounds Cocos.’ In Solé M J, Recasens D & Romero J (eds.) Proceedings of the 15th International Conference of Phonetic Sciences. Barcelona: The 15th ICPhS Organizing Committee. 803–806.
Codes, Elaborated and Restricted (Bernstein)
A Capone, Barcellona, Italy
© 2006 Elsevier Ltd. All rights reserved.
Bernstein was among the first scholars to focus on the correlation between the scholastic success (or failure) of a learner and the social class he or she belonged to. A student living in a well-off family, having many cultural stimuli (books, newspapers, periodicals, films, etc.), is bound to develop a rich and fully articulated language (a so-called ‘elaborated code’), whereas a student who belongs to a working-class family and is exposed to poor linguistic and cultural stimuli develops a fragmented, poor, syntactically deficient language (called a ‘restricted code’ by Bernstein) (see Bernstein, 1971–1975). Chomskyan theory holds that language develops naturally in the brain through the interaction of biologically innate structures and the environment to which the child is exposed, and that the data to which the child is exposed may be so poor and confused that it is easy to demonstrate that the innate learning program prevails over the environmental stimuli. Bernstein, by contrast, emphasizes the predominant role played by the environment in shaping the learning process. Of course, the correlation between social class and scholastic success is not, strictly speaking, without exceptions, because much
of the learning process depends on the lifestyle of the family under consideration. There are exceptional working-class families where parents, contrary to all expectations, have good knowledge of the language and place great importance on culture, but the norm is that, within working-class families, cultural stimuli are less predominant than in more well-off families. The problem, for sociologists, is how to offset the disadvantages of the pupils belonging to working-class families and how pedagogues (and teachers) can have an antideterministic effect on such children. A possible solution to the problem is to ensure that the school (or the class) becomes another miniature family and that the negative effects of the families are compensated for by the pedagogical action of the school. The school should, therefore, be a positive environment in which pupils are exposed to positive cultural and affective stimuli that help their personalities grow and come to maturity. In such a model of the school, teachers lose their primary function of being transmitters of notions (knowledge, in general) and are required to take the roles of educators or pedagogues who act as models and provisionally replace (at least within the boundaries of the school) the family by setting good examples for the students, and, in particular, exposing them to the positive aspects of culture, intended as knowledge that interacts with the
individual to make him or her grow up intellectually and emotionally. To compensate for the negative effects of families, within which dialogue and conversation have died, or are confined to adjacency pairs consisting of questions/answers or orders/replies, teachers have to play the role of communicators and have to stimulate communication. It is, in my view, impossible for a student to make progress in his or her language (to develop a more articulated written or oral mode of expression) unless he or she understands the function of communication, which is that of transmitting knowledge, but also of enhancing the expressive as well as the interpersonal function. To communicate is not only to express propositions
(concerning others), but also to express propositions concerning what we really are and feel, and, by so doing, to interact with others, creating an intersubjective dimension in which social life is possible (see Capone, 2003).
Bibliography
Bernstein B (1971–1975). Class, codes and control (3 vols). London: Routledge & Kegan Paul.
Bernstein B (1990). The structuring of pedagogical discourse. London/New York: Routledge.
Capone A (2003). Pragmemes. Messina: Minerva.
Code Switching
S Gross, East Tennessee State University, Johnson City, TN, USA
© 2006 Elsevier Ltd. All rights reserved.
In many bi- and multilingual communities around the world, speakers need to choose, often at an unconscious level, which language to use in their interactions with other members of the community. One of the choices that bilingual speakers often make is to code-switch: that is, speakers switch back and forth between languages (or varieties of the same language), sometimes within the same utterance (see Bilingualism; Code Switching and Mixing). The motivations for code switching have often been treated simply as lists of possible functions for code switching. For example, Appel and Muysken (1987) cite five such functions. First, code switching may serve a referential function by compensating for the speaker’s lack of knowledge in one language, perhaps on a certain subject. Second, it may serve a directive function by including or excluding the listener. Third, code switching may have an expressive function by identifying the speaker as someone having a mixed cultural identity. Fourth, it may have a phatic function indicating a change in tone in the conversation. And fifth, it may serve a metalinguistic function when code switching is used to comment on the languages involved. While such lists are useful places to start, and no one would deny that code switching can certainly serve these functions, these types of lists fail to answer the question of what motivates speakers to make the choices they do at a particular point in a conversation. This article reviews the major proposals that have been advanced regarding the following question:
why do speakers choose to engage in code switching in the first place?
Code Switching as a Research Topic
The current interest in code switching can be dated to a 1972 study of language use in Hemnesberget, a small village in northern Norway, conducted by Jan Blom and John Gumperz and described in a volume on sociolinguistics edited by Gumperz and Hymes (1972). In Hemnesberget, two varieties of Norwegian are used: Ranamål, a local dialect, and Bokmål, the standard variety (see Language and Dialect: Linguistic Varieties). However, speakers’ decisions regarding which variety to use are by no means arbitrary or haphazard. In general, Ranamål, the local variety, is used in local activities and relationships, reflecting shared identities with the local culture. In contrast, Bokmål is used in official settings such as school, church, and the media, communicating an individual’s dissociation from the local group, i.e., not stressing his or her local ties. Blom and Gumperz distinguish between two main functions of code switching: situational and metaphorical. In situational code switching, which seems to be similar to the notion of diglossia, the speaker’s choice of language is constrained by factors external to her/his own motivations, for example, the status of the interlocutor, the setting of the conversation, or the topic of conversation. So, in Hemnesberget, Blom and Gumperz observed that when an outsider joins a group of locals engaged in a conversation, the locals will often switch from the local variety, Ranamål, to the standard variety, Bokmål. In a later work, Gumperz (1982) introduces the distinction between ‘we’ and ‘they’ codes, which further
amplifies the kind of linguistic alternation that occurs in situational code switching. ‘We’ codes are associated with home and family, while ‘they’ codes are associated with public discourse. In contrast to situational code switching, speakers may engage in a more complex type of code switching to create a ‘metaphoric’ effect. In his 1982 book, Gumperz explains that this metaphoric effect is a way for speakers to communicate “information about how they intend their words to be understood” (1982: 61). The classic example of metaphorical code switching from the Blom and Gumperz 1972 article is from a conversation at the local community administration office, where two villagers switch from the standard variety of Norwegian, in which they had been discussing official business, to the local variety to discuss family and other private affairs. The Blom and Gumperz study is important because it illustrates that code switching is a complex, skilled linguistic strategy used by bilinguals to convey important social meanings above and beyond the referential content of an utterance. The article has nonetheless drawn criticism over the years. In addition to some overlap and lack of clarity in the definitions of situational versus metaphorical switching, this model of code switching implies a sharp boundary between the two types of switching. In fact, Myers-Scotton (1993: 55) argues that the metaphorical meaning of a switched utterance is derived from its situationally based meaning. In any case, it is fair to say that the Blom and Gumperz study sparked an interest in studying code switching data in terms of a dynamic, interactional model that focuses on individual choices rather than static factors related to an individual’s social status (see Social Class and Status).
The Audience-Centered Approach to Code Switching
Using a social psychological theory of language use, Howard Giles and his associates have developed a model of interpersonal communication that considers how speakers change the way they speak according to their audience. Giles refers to this type of strategy as accommodation (see Speech Accommodation Theory and Audience Design). Within Giles’s speech accommodation theory, speakers are motivated by their desire for approval vis-à-vis their desire to dissociate themselves from the hearer. These concerns are cognitively salient and are realized by speech convergence (similar styles of speaking) or divergence (different styles of speaking). In fact, Giles predicted from these assumptions that the greater the effort in
converging, the more favorably the individual will be evaluated by the listener. Thus, convergence and divergence are linguistic strategies to either decrease or increase social distance between participants in a conversation. Although the premises behind speech accommodation theory have not been rigorously tested using code switching data in any comprehensive way, it is not difficult to see how the model could be used to explain speakers’ motivations for code switching. For example, to test the prediction regarding how listeners will evaluate a speaker based on the speaker’s perceived effort at converging, Giles et al. (1973) conducted an experiment involving bilingual English Canadian students who were asked to rate their reactions to a set of taped descriptions of a simple harbor scene given by bilingual French Canadian students. Different versions of this description – reflecting different levels of linguistic convergence to monolingual English – were presented to the English Canadian raters. The results of this experiment supported the prediction that the greater the effort in converging, the more favorably the speaker would be perceived. More specifically, the most convergent bilingual French Canadian student was viewed as the most considerate and the most concerned about bridging the cultural gap between French and English Canadians. Speech accommodation theory has been successfully applied mainly in the contexts of dialect or style switching (see Style and Style Shifting). However, the theory and its predictions still await further testing on code switching in bilingual settings.
The Conversation Analytic Approach to Code Switching
Both the Gumperz and the Giles models of code switching attend to extralinguistic factors such as topic, setting, and participants as influencing speakers’ linguistic choices in conversations. Peter Auer (1984) questioned these assumptions and specifically questioned the way ‘situation’ was defined. For Auer, situation was not a static set of contextual features that constrain linguistic choices. Rather, situation was seen as a dynamic phenomenon, emerging from the sequential nature of a conversational interaction. Using the terminology and the techniques of ethnomethodology and conversation analysis, Auer argued that the meaning behind code switching must be interpreted on the basis of the linguistic choices made by the participants themselves in the preceding and following turns in a conversation. In other words, for Auer, social meaning is constituted locally rather than at a societal level (see Conversation Analysis; Ethnomethodology).
Studies that use the technique of conversation analysis typically assign no independent semantic value to either of the languages involved. Instead, the conversational meaning of code switching results from the mere juxtaposition of the two languages, which generates contextualization cues whereby participants signal various contextual presuppositions. Thus, this language switching has a value of its own, independent of the direction of code alternation. Some have argued that an approach that focuses on the negotiation of meaning as locally constructed cannot generalize across interactions in order to build explanatory theories. However, Auer (1995) has identified several basic code switching patterns that correspond to identifiable meanings, such as participant-related alternations, which reflect language competence or preference on the part of the speaker, and discourse-related alternations, which signal, for example, topic change (among other functions). Nevertheless, most researchers still recognize that in multilingual communities, each language available in the community indexes specific social and interactional meanings, and listeners tend to attribute consistent interpretations to the particular language choices that speakers make (see Pragmatic Indexing).
The Markedness Model: A Speaker-Centered Approach to Code Switching
One of the more richly developed models designed to explain the sociopragmatic motivations for code switching is Carol Myers-Scotton’s markedness model, which developed out of her field research in East Africa. The central premise of the markedness model is that speakers are rational actors who make code selections in such a way as to minimize costs and maximize rewards; that is, speakers are concerned with optimizing the outcomes of an interaction in their own favor (see Markedness). This notion of the speaker as a rational actor making certain decisions about code choice, albeit at a largely unconscious level, is also evident in speech accommodation theory as well as in Gumperz’s approach to code switching. However, unlike these other models, which consign the primary motivation for code switching to the addressee or to some other factor external to the speaker (e.g., topic or social setting), the markedness model is primarily a speaker-centered approach to communication.
Markedness and Communicative Competence
Within the markedness model, all code choices fall along a continuum, as more or less marked or unmarked. With respect to bilingual speech, the unmarked choice simply refers to the linguistic variety that is expected, given the societal norms for that interaction. In contrast, marked choices fall at the other end of this continuum; that is, they are in some sense unusual or unexpected for the particular social interaction. Furthermore, all speakers possess what Myers-Scotton calls a markedness evaluator, or the capacity to evaluate linguistic choices in terms of markedness, as part of their innate communicative competence (see Communicative Competence). While the capacity to assign markedness readings to linguistic choices is innate, the markedness continuum is established through exposure to the range of linguistic options used in the community.
Explaining Speakers’ Choices
All code choices can ultimately be explained in terms of speakers’ motivations to optimize the outcomes of the interaction. Most choices that speakers make affirm the norms that are in place for the particular exchange. These are unmarked choices, and they are usually the safest choices to make. The particular code used by the speaker is important only insofar as the participants view its status as marked or unmarked for that type of interaction. Thus, in a multilingual setting such as Nairobi, Kenya, the unmarked choice for most business transactions between strangers is Swahili. However, if the participants discover during the course of their conversation that they are members of the Luyia ethnic group, they will often switch into Luyia to continue the conversation. Luyia, then, becomes the unmarked choice, given the ethnic identity of the participants. However, code switching between Swahili and English within the same turn, within the same sentence, and even within words is typically the unmarked choice for informal social gatherings between educated, middle-class peers in Nairobi. Hence, the markedness of linguistic choices must be evaluated in terms of the norms for a particular exchange; the unmarked setting may even change within a conversation. In contrast to the relative safety of unmarked choices, marked choices carry with them some element of risk for someone who wishes to defy the norms. Importantly, the interpretation that a marked choice receives derives from its contrast with the unmarked choice for that exchange. That is, the unexpectedness, the ‘otherness’ of a marked choice carries significant social meaning. Marked choices are typically used to redefine the relationship between the speaker and the addressee, often as an expression of the speaker’s authority or power, to indicate anger, or to assert one’s ethnic identity (see Identity and Language; Power and Pragmatics). All these
strategies can be subsumed under a single general principle: speakers make marked choices to negotiate a change in the expected social distance between the participants, either increasing or decreasing it.
The Significance of the Markedness Model
Since its formulation, the markedness model has been successfully used to explain code switching between languages, between dialects and registers, and even between stylistic choices in literary contexts. The strength of the markedness model is its ability to explain not only unmarked choices, which other models do as well, but also marked choices. Furthermore, the markedness model addresses the universal aspects of communicative competence in terms of the cognitive abilities that use readings of markedness to assess speakers’ intentions. For these reasons, the markedness model is a powerful tool not only for code switching research, but also for any examination of the ways in which speakers use language to achieve interactional goals.
See also: Bilingualism; Code Switching and Mixing; Communicative Competence; Conversation Analysis; Ethnomethodology; Identity and Language; Language and Dialect: Linguistic Varieties; Markedness; Power and Pragmatics; Social Class and Status; Speech Accommodation Theory and Audience Design; Style and Style Shifting.
Bibliography
Appel R & Muysken P (1987). Language contact and bilingualism. London: Arnold.
Auer P (1984). Bilingual conversation. Amsterdam: Benjamins.
Auer P (ed.) (1998). Code-switching in conversation: language, interaction and identity. London: Routledge.
Blom J P & Gumperz J J (1972). ‘Social meaning in structure: code switching in Norway.’ In Gumperz J J & Hymes D (eds.) Directions in sociolinguistics: the ethnography of communication. New York: Holt, Rinehart, Winston. 407–434.
Bourhis R G, Giles H, Leyens J P & Tajfel H (1979). ‘Psycholinguistic distinctiveness: language divergence in Belgium.’ In Giles H & St Clair R (eds.) Language and social psychology. Oxford: Blackwell. 158–185.
Giles H, Taylor D M & Bourhis R Y (1973). ‘Towards a theory of interpersonal accommodation through language: some Canadian data.’ Language in Society 2, 177–192.
Gumperz J J (1982). Discourse Strategies. Cambridge: Cambridge University Press.
Heller M (ed.) (1988). Codeswitching: anthropological and sociolinguistic perspectives. Berlin: Mouton de Gruyter.
Jacobson R (ed.) (1990). Codeswitching as a worldwide phenomenon. New York: Peter Lang.
Milroy L & Muysken P (eds.) (1995). One speaker, two languages: cross disciplinary perspectives on codeswitching. Cambridge: Cambridge University Press.
Myers-Scotton C (1993). Social motivations for codeswitching: evidence from Africa. Oxford: Clarendon Press.
Myers-Scotton C (1998). ‘A theoretical introduction to the markedness model.’ In Myers-Scotton C (ed.) Codes and consequences. New York: Oxford University Press. 18–38.
Thakerar J N, Giles H & Cheshire J (1982). ‘Psychological and linguistic parameters of speech accommodation theory.’ In Fraser C & Scherer K R (eds.) Advances in the social psychology of language. Cambridge: Cambridge University Press. 205–255.
Wei L (ed.) (2000). The bilingualism reader. London: Routledge.
Code Switching and Mixing S Mahootian, Northeastern Illinois University, Chicago, IL, USA © 2006 Elsevier Ltd. All rights reserved.
Code switching is a linguistic phenomenon commonly occurring in bi- and multilingual speech communities. The term, which also appears as ‘codeswitching’ and ‘code-switching’ in the literature, broadly refers to the systematic use of two or more languages or varieties of the same language during oral or written discourse. One of the earliest definitions of code switching
was provided by Weinreich’s (1953) description of bilingualism as “the practice of alternately using two languages.” Gumperz (1982: 59) highlights the structural aspects of code switching that have dominated the last three decades of research in this area, by defining code switching “as the juxtaposition within the same speech exchange of passages of speech belonging to two different grammatical systems or subsystems.” Although the conversation patterns observed between bilinguals of the same language backgrounds indicate a predominantly unconscious switching back
and forth between the two languages, this is not to say that switching occurs randomly. In fact, over the last three decades, researchers in the area of language contact have come to a consensus that code switching is a systematic rule-governed linguistic behavior. Switching may be conscious and intentional. Intentional switching may be used to indicate shifts in topic, change in interlocutor, and change in interpersonal or social relationships. Much of the time, however, switching between languages is unintentional. It is the result of psycho- and sociolinguistic variables that the speaker is not consciously aware of, involving processing issues and the tendency of speakers to adapt their speech style to the interlocutor’s style and/or community norms and expectations. Although switching has at times been associated with language attrition, indicative of weakness in one of the bilingual’s languages, many researchers believe that code switching is in fact a natural consequence of competence in more than one language and that it should not be mistaken for a language deficit. The following are some examples of code switching between (1) Farsi-English, (2) Irish-English, (3) Japanese-English, and (4) Arabic-French.
(1) gofte bud ke she wanted to get revenge.
said was that
‘She had said that she wanted to get revenge.’ (Mahootian, 1993)
(2) Ta carr light green aige
be car at him
‘He has a light green car.’ (Stenson, 1990)
(3) one algebra question o mark shite
ACC. do
‘you mark one algebra question’ (Nishimura, 1991)
(4) un professeur aDim
a professor excellent
‘an excellent professor’ (Bentahila and Davies, 1983)
Types of Code Switching
Two main types of code switching can be identified. Switching between languages at sentence or clause boundaries is called intersentential. Switches within a clause involving a phrase, a single word or across morpheme boundaries are intrasentential switches. Some researchers identify tag switching as a third type of switching, separate from intersentential switching. Tag switches involve the insertion of tag forms such as I mean, you know, isn’t it?, etc., from one language into an utterance of another language. Example (5) shows an instance of intersentential switching between Puerto Rican Spanish and English; (6) and (7) are examples of intrasentential switching, and (8) is an example of a tag switch between Farsi and English.
(5) Sometimes I’ll start a sentence in English y termino en español
and finish it in Spanish
‘Sometimes I’ll start a sentence in English and finish it in Spanish.’ (Poplack, 1980)
(6) I’m shuxi-ing with you.
joke
‘I’m joking with you.’ (Mahootian, 1993)
(7) Your bag is zir-e miz
under of table
‘Your bag is under the table.’
(8) It was a good performance, nae?
no?
‘It was a good performance, wasn’t it?’
Code Mixing
Some researchers have used the term ‘code mixing’ (also ‘codemixing’ and ‘code-mixing’) to refer specifically to intrasentential switching, and code switching to refer to intersentential switching. In most current literature, however, the term ‘code mixing’ is used interchangeably with ‘code switching,’ with both terms referring to both types of language mixing. Recently, a few researchers have made finer distinctions between the two terms, using ‘code mixing’ and ‘mixed code’ to distinguish the use of two or more languages at the discourse level from switches within clauses/words. In studies of child bilingualism, however, ‘code mixing’ carries additional implications. Depending on the researcher’s view vis-à-vis children’s ability to keep their two languages separate, code mixing can be seen as either a sign of the child’s display of two differentiated code systems or an underlying unified system. Mixing in the former case refers to the same phenomenon found in adult mixed speech. In the latter, mixing refers to the use of the ‘wrong’ language in a monolingual context. More on this topic can be found in ‘Bilingual Language Acquisition and Code Switching.’
Borrowings, Nonce Borrowings, and Loanwords Adequately defining bilingualism and distinguishing between two of its related features, code switching and borrowing, has been an ongoing challenge for researchers. Borrowing is not a new phenomenon and can be seen as part of the development and lexical expansion of all languages. The most significant reason that languages borrow from one another is also the most obvious. Borrowing is motivated primarily
by cultural contact, whether through trade or war. Along with new ways, styles, foods, religions, forms of government, etc., new words for these items are introduced into the community. For example, looking at the history of the development of the English language we see the influence of Latin and French in everyday words such as plant, pear, organ, bishop, heretic, pot, cook, from Latin, and cardinal, duke, court, abbey, beef, mutton, joy, poor, fruit, from French. Traditionally, the term ‘borrowing’ has been used to refer to any word or phrase taken from one language and used by the monolingual speakers of another language. Usually, borrowings fill lexical gaps arising from imported concepts, such as telephone, television, fax, pizza, etc. Before a word or phrase becomes a fully legitimate borrowing and becomes fully integrated into the host language, it goes through a few stages. Usually the target word is introduced into a language through bilinguals. At this stage, the word is not phonologically or morphologically integrated into the host language and its usage is more or less limited to bilinguals. Once a foreign word becomes part of the monolingual speech of a host language, most researchers agree that it has become part of the host language and hence a language borrowing. At this stage, the borrowed word will also show signs of adaptation to the morphology and phonology of the host language. However, since borrowing is typically a gradual process, there are a number of factors that have been identified as relevant to distinguishing code switches from borrowings. First among these is the length of a borrowed utterance and the degree of morphological and phonological integration of the utterance into the host language. 
Opinions on the degree of integration range from fully integrated single words that have been completely adapted to the host language phonology and morphology systems, such as the Japanese word takshi [takuši] borrowed from the English taxi [tæksi], to phrases of any length showing partial integration (Reyes, 1974; Pfaff, 1979; Haugen, 1956; Hasselmo, 1970; Grosjean, 1982; Mahootian, 1993). Frequency of occurrence in the host language has also been identified as a factor, with the idea that borrowings occur more frequently than code switches (Poplack et al., 1988; Myers-Scotton, 1993).
Loanwords
Based on degree of integration, borrowings are further classified as either loanwords or nonce borrowings. Words fully integrated and used by monolinguals, usually without any knowledge of the words’ origins, are loanwords. Loanwords can be further categorized as necessary loans and unnecessary loans. Necessary loans fill lexical gaps or accompany specific items brought into the host culture. Some examples of necessary loanwords in English are pajamas, whiskey, chili, croissant, robot, orange, and a host of other words that most monolingual English speakers probably do not recognize as borrowings. Food, fashion, technology, etc. are typical candidates for necessary loans. Unnecessary loans, as their label implies, do not fill in gaps. In fact, they coexist with the native analog but usually in a semantically altered fashion. For example, the French word veal, meaning ‘yearling’ or ‘calf’, did not simply replace the English word ‘calf,’ which also referred to the young animal and, therefore, blocked the French word from applying to that meaning. Veal could, however, be used for some other related meaning and was used to refer to the meat from the animal rather than the animal itself. One of the reasons often cited for the unnecessary loans is that the guest language is associated with prestige by speakers of the host language, but there are many cases where prestige cannot account for such borrowings. A case in point is English borrowings in Japanese. For example, the word maketo (
Distinguishing Borrowings from Code Switches
As mentioned earlier, one of the major problems confronting researchers in the area of language contact is how to distinguish between code switching and borrowing. These two have generally been considered different phenomena that produce mixed sentences. Most researchers agree that switching can be of any length. It can occur at the word, phrase, or sentence
level as long as it is a complete shift into the other language, as in example (9) below:
(9) Ça m’étonnerait qu’on ait code-switched autant que ça.
‘I can’t believe that we code-switched as often as that.’ (Grosjean, 1982)
In this example the word ‘code-switched’ has been code switched, because the speaker has used an English term with an English pronunciation and past participle form in an otherwise French sentence. In contrast, borrowings tend to be short and phonologically and morphologically adapted to the host language. For example, the word ‘code-switché’ in sentence (10) is an instance of borrowing (Grosjean, 1982). The English term ‘code switch’ is pronounced and inflected in French.
(10) Ça m’étonnerait qu’on ait CODE-SWITCHÉ autant que ça.
‘I can’t believe that we code-switched as often as that.’
The distinction between code switching and borrowing is important in sociolinguistics, where the decision to code switch or not is an integral part of the dynamics of a community (see ‘Functions of Code Switching’). Until the early 1990s, linguists interested in the structure of code switching had also honored the distinction between switching and borrowing, with the implication that borrowings behave differently syntactically. However, for the distinction to be useful, it would have to rest on three interrelated assumptions: (1) borrowing and code switching are two conceptually distinct linguistic phenomena, (2) we can systematically and consistently separate the two phenomena, and (3) the distinction will hold across languages in a way that allows us to make generalizations about each phenomenon. The criteria set thus far for distinguishing the two are not infallible. Since code switching can occur at the single-word level as well as in longer utterances, length of utterance does not offer a clear-cut distinction. Phonological adaptation also fails to be foolproof. For example, if a bilingual speaker has transferred the phonological system of L1 to L2 while acquiring L2 (in other words, the speaker has an accent), it will be difficult to evaluate whether word X from L2 has been borrowed into an L1 sentence or whether the speaker has code switched into L2. Most researchers acknowledge the shortcoming of using morphological adaptation as a guideline in cases where the switch may be only one word such as an adverb or an uninflected free morpheme. Some researchers claim to
have developed operational criteria for distinguishing borrowings from code switches. Their operational criteria rest on establishing a parallel between nonce borrowings and established loanwords. They assert that nonce borrowings, like established loans, are morphologically and syntactically integrated into the host language but code switches are not.
Nonce Borrowings
Despite some evidence that nonces are loanwords, there is more evidence that they are not. First, nonce borrowings, like code switches, are only used by bilinguals, not monolinguals. Second, loanwords are established, recognized by the community as part of the native lexicon, and used as frequently and naturally as other native lexical items. Nonce borrowings are spontaneous usages, not established, with no guarantee of recurrence, just as code switches are spontaneous, not established, with no guarantee of recurrence. Third, loanwords are phonologically adapted to the host language but nonce borrowings are not. Here too, nonce borrowings behave like code switches. Fourth, morphological adaptation, as mentioned earlier, is not a foolproof criterion. It may be true that all words morphologically adapted into the host language are borrowings. However, what about words that have no overt morphology? Are they automatically code switches by default? Fifth, syntactic adaptation also has its limits. For example, in the case of two languages with the same word order, such as Spanish and English, would the word slowly in sentence (11) be considered a code switch or a borrowing? In cases such as (11), syntactic adaptation fails to be a clear criterion.
(11) Lo hizo slowly.
‘He did it slowly.’ (Woolford, 1983)
According to proponents of the nonce borrowing hypothesis, slowly could be either a nonce borrowing or a code switch.
Matrix and Embedded Languages
The distinction between matrix or host language and embedded or guest language is a crucial part of a number of code switching accounts. However, no feasible criteria have been established that systematically distinguish the matrix language from the embedded language in all instances of code switching. Some have proposed the base language should be whichever language the syntax of the utterance belongs to or whichever language provides the inflectional and
derivational morphemes. Myers-Scotton distinguishes between system morphemes, or grammatical morphemes, and content morphemes, the morphemes that assign thematic roles. System morphemes come from the matrix language. Joshi uses speaker intuition as the basis for distinguishing matrix from embedded language. He claims that in mixed discourse, speaker and hearer ‘usually agree’ on which language is the matrix language. However, as the following example indicates, not all bilingual speakers seem to be conscious of their switches, let alone which language frame they are using.
(12) I mean I’m guilty in that sense ke ziada wsi English i bolde fer ode nal eda . . . wsi mix kerde rene ã. I mean, unconsciously, subconsciously, keri jane e . . .
‘I mean I’m guilty in that sense that we speak English more and more . . . we keep mixing. I mean unconsciously, subconsciously, we keep doing it. . .’ (Romaine, 1995)
Not all researchers agree with these criteria. For example, Romaine (1995) notes that the syntax and morphology criteria would not work for Punjabi-English. Nor, as she and others have pointed out, would they yield unequivocal results in all instances. The mixed utterances in (13)–(15) underscore the problematic nature of the criteria proposed for the matrix/embedded distinction. In (13), for example, Japanese structure overlaps with English structure; therefore, it will not be possible to assign matrix language based on syntax. In cases such as (14) and (15), where a single system morpheme is in one language and the remaining utterance is in the other language, it seems counterintuitive, or at least problematic, to designate the language of the grammatical morpheme as the matrix.
(13) Dakedo I don’t like New York
but
‘But I don’t like New York.’ (Jap-Eng; Nishimura, 1985)
(14) E baguette, s’il vous plaît . . .
a baguette please
‘A baguette please . . .’ (Alsatian-Fr; Gardner-Chloros, 1991)
(15) Lawyer-et will tell you what to do.
-2P-POSS
‘Your lawyer will tell you what to do.’ (Farsi-Eng; Mahootian, 1993)
In addition to the unreliability of the proposed criteria for distinguishing matrix language from embedded, some researchers have pointed out that syntactic analyses of code switching, by virtue of their claim to be based on principles of universal grammar, should not need to resort to such a distinction.
Social and Pragmatic Functions of Code Switching
Early studies of code switching were embedded in studies of language contact and bilingualism. The most significant of these studies were carried out by Weinreich (1953), Haugen (1953), Hasselmo (1961), and Clyne (1972). By the early 1970s, code choice and code mixing were taking more of the spotlight as researchers became interested in the conversational functions and the social motivations of the byproducts of bilingualism. A number of functions were identified and associated with code switching, with most, if not all, directly or indirectly related to a complex of interconnected social and contextual variables or domains such as situation, interlocutor, and topic of discourse. Bilinguals’ code switching was likened to monolinguals’ style shifting. It was argued that having more than one language gave bilingual speakers a choice of an additional discourse mode. Giles et al. (1977) proposed that in addition to these domains, code switching and language choice are also influenced by sociopsychological forces. This approach, known as accommodation theory, would, for example, predict that speakers’ needs or desires to associate with or disassociate from a group will direct their language choice.
Referential and Expressive Switching
The two functions most discussed in the literature until the early 1980s were the referential and the expressive functions of code switching. The referential function of code switching refers to the types of code switches that are primarily motivated by lexical gaps, or lack of fluency about a topic in one language, or simple failure of lexical retrieval. This function of code switching highlights the fact that the notion of balanced bilingualism is more an issue of theoretical competence rather than a reality. It is often the case that a bilingual can discuss some subjects more easily in one language than the other. For example, individuals whose home language (L1) differs from the language in which they have been educated (L2) are more at ease discussing academic concepts in the L2 and will therefore switch to the L2 to do so. This is true with regard to technical or culture-specific concepts as well. Tuning in to non-English radio stations in the United States reveals an interesting pattern of code switching. The native-language broadcasts are peppered with English terms for traffic flow, weather conditions, and local events. The expressive function of code switching is associated with a meta-level act of communication where the form itself, meaning mixed speech discourse, is a comment about the speaker rather than the speech.
The relationship between language and identity has long been established and documented. Language is both co-constructor and a reflection of social identity. This relationship becomes more complex in bilingual communities where the languages and cultural/ethnic values and identities bear unequal social prestige. Studies of Spanish-English code switching among Puerto Ricans in New York and of code switching in the media conclude that speakers choose mixed code as a way to emphasize their bilingual/bicultural identity.
Metaphorical and Transactional Switching
The concepts of metaphorical or nonsituational and transactional or situational code switching were introduced to capture two functions of code switching (Blom and Gumperz, 1972). Transactional switching is motivated by variables such as topic and interlocutors. Metaphorical switching, on the other hand, accounts for the extralinguistic message the speaker wishes to express, the effect the speaker wants to have on the hearer. Metaphorical code switching is viewed as an indication of the speaker’s momentary attitudes and emotions with social variables (class, situation, speakers, topic, etc.) and ideological variables (identity, group affiliation, etc.) as important contributors to the form of the message (Gumperz, 1982). In his seminal work on discourse strategies, Gumperz characterizes bilingual speakers and bilingual speech communities as ‘‘marked both by diversity of norms and attitudes and by diversity of communicative conventions’’ (Gumperz, 1982: 71). In an attempt to capture uniformity across different bilingual language exchanges, he posits six conversational functions of code switching (Gumperz, 1982: 77ff):
1. Quotation: to distinguish between direct speech and quotations or reported speech. When quoting or reporting someone else’s discourse, speakers will often switch into that person’s language, as in the following Farsi-English example:
unvaeqt jan be maen mige, ‘‘I don’t think I can make it.’’
then John to me says
‘Then John says to me, ‘‘I don’t think I can make it.’’’
2. Addressee specification: speakers may code switch in order to ‘‘direct a message to one of several addressees’’ (Gumperz, 1982: 77):
‘Well I don’t know how to describe it but it just doesn’t feel like home to me’ (directed to Hearers 1 and 2; H1 is monolingual, H2 bilingual), to miduni maenzuraem chie, doroste? (‘you know what I mean, right?’ directed to H2). (Farsi-Eng)
3. Interjections: speakers may code switch as a way to mark an interjection as in the phrase ‘you know’ in the Panjabi-English sentence below:
I wish, you know ke m3 pure Panjabi bol s3ka
‘I wish, you know, that I could speak pure Panjabi’ (Romaine, 1995: 122)
4. Reiteration: speakers may switch languages to emphasize or clarify a message:
Ven acá, ven acá, come here, you.
Come here, come here,
(Sp-Eng; Gumperz, 1982: 78)
5. Message qualification: speakers may switch to add more information in order to qualify the main message.
6. Personalization versus objectification: in this category, switching marks a number of related functions that reflect the degree of speaker involvement or distancing vis-à-vis the message, the interlocutors, etc.
Related to the sixth category, Grosjean (1982: 152) identifies a number of additional discourse functions such as marking group identity, emphasizing solidarity, excluding others from a conversation, raising the status of the speaker, and adding authority or expertise to a message. Though not comprehensive and by his own admission limited, Gumperz’s view of the sociopragmatic functions of code switching captured the dynamic nature of language choice and was an important first step to a fresh look at an old phenomenon. Many of the same ideas and observations concerning the discourse functions of code switching were recast by Auer (1984) in a conversational analysis framework and by Appel and Muysken (1987) in the functional framework of Jakobson (1960) and Halliday et al. (1964). Based on her work in Africa, Myers-Scotton also suggested that code switching ‘‘serves the same general socio-psychological functions everywhere’’ (Myers-Scotton, 1988: 3) and proposed the markedness model of code switching to ‘‘explain the socio-psychological motivations behind CS [code switching]’’ (Myers-Scotton, 1988: 3).
Attitudes Toward Code Switching
Despite the pragmatic functions that code switching/language mixing serves and clear evidence that code mixed utterances are structurally rule-governed and systematic, in most communities, code switching usually has a stigmatized status. This is not to say that all bilinguals or all bilingual communities consider code switching to be negative. In fact, in all bilingual
communities one finds that attitudes toward mixed language range from positive to negative, depending on class, age, education, profession, and other social factors. For example, older generations of speakers in a bilingual community typically have a negative response to code switching and assert that it shows a loss of pride in the home culture and disrespect to the community elders, not to mention ignorance and laziness. They avoid code switching and usually expect the languages to be kept separate and hence ‘pure.’ Younger generations are expected to do the same, at least in the presence of their elders. This ideal of language purity is a strong social constraint on language mixing and can be found across many bilingual communities. Who holds what attitude is not necessarily predictable, however, and can vary from community to community and from language pair to language pair. In part, attitude is tied to a group’s position on bilingualism, to the social status of each language, as well as to the status of the immigrant group in the host country. For example, Puerto Ricans in New York have a positive attitude toward bilingualism and code switching. For them, code switching is frequent and occurs in both directions, i.e., from Spanish into English and from English into Spanish. Mixing two languages is an important way to show the speakers’ affiliation and connection to both their cultures. In the same vein, in some social contexts, using mixed language discourse is considered the unmarked or expected form, and not mixing is considered marked or unexpected. Conversely, until recently, among Mexican-Americans, code switching was highly stigmatized and referred to pejoratively. It has been noted that when political ideology changes and a group becomes more conscious of their ethnicity, attitudes toward code mixing change. For example, Romaine (1995) points to the reversal of the previously pejorative use of pocho and caló used in California and the southwestern United States to refer to the variety of Spanish-English spoken by Chicanos. She ties this shift to a heightened ethnic awareness among Chicanos.
Structural Accounts of Code Switching
Studies of the structure of code switching fall into four categories: descriptive accounts (Timm, 1975; Pfaff, 1979); accounts involving surface constraints and a third grammar for code switching (Poplack, 1980; Sankoff and Poplack, 1981); principle-based accounts involving special mechanisms (Belazi et al., 1994; Bentahila and Davies, 1983; Di Sciullo et al., 1986; Joshi, 1985; Woolford, 1983); and principle-based accounts without code switching-specific mechanisms or constraints (Mahootian, 1993, 1996a, 1996b; Myers-Scotton, 1993). Overall, these approaches show the progression of code switching accounts moving from a view of code switching as only a dynamic social phenomenon to code switching as a language phenomenon that can inform linguistic theory at other levels such as syntax, morphology, and language acquisition.
Descriptive Accounts of Code Switching
Until the early 1970s, linguists generally considered the grammatical implications of code switching in one of two ways. On the one hand, grammatical implications were ignored because it was assumed that the internal structures of sentences have no bearing on the motivations for switching. On the other hand, some researchers concluded that the failure to find rules or constraints indicated that code switching was syntactically unrestricted and irregular. However, in the course of the 1970s, some linguists suggested that code switching does not occur at syntactically random points. A number of descriptive accounts, mostly about Spanish and English, observed that switching between languages seemed to follow certain patterns. For example, based on their data of Spanish-English code switches, Timm (1975) and Gumperz (1976) noted that switching did not occur between the subject pronoun and the finite verb. They proposed that switching was in fact prohibited between those elements, shown in examples (16) and (17) from Timm (1975: 477).
(16)–(17) [examples not recoverable; surviving labels: Pron Subj; Sp, Eng; Eng, Sp]
Pfaff (1979) described further constraints on Spanish-English code switching. She noted in the case of switches between adjective and noun that the resulting mix must not violate the surface word order of either language. Hence switches between English nouns and postnominal Spanish adjectives are prohibited, as shown in example (18). However, switching between determiners and their NPs, shown in example (19), is allowed.
(18) *I went to the house chiquita
small
‘I went to the small house.’ (Pfaff, 1979)
(19) el same day
the
‘the same day’ (Pfaff, 1979)
A Three-Grammar Approach
Results such as those obtained by Timm and Pfaff led researchers to look for ways to formulate the apparent constraints on code switching. The first effort at formalizing constraints was made by Sankoff and Poplack (1981). They suggested that bilinguals have a code switching grammar separate from the grammar of their two languages. That is, they claim bilinguals have a third, code switching grammar along with their two monolingual grammars. Their code switching grammar comprises the lexicon of the two languages plus the grammatical categories of the two languages. Poplack (1980) and Sankoff and Poplack (1981) also proposed two universal constraints on code switching. The Free Morpheme Constraint prohibits a switch between a stem and an affix unless the affix has been phonologically integrated into the language of the stem. Accordingly, mixed utterances such as in example (20) are disallowed. (20) *eat-iendo ‘eating’ (Poplack, 1980)
Their second principle, the Equivalence Constraint, operates on surface structures. Its simplest form states that the word order before and after a switch must be possible in both languages. In the example below, (21c) is the speaker’s utterance, and (21a) and (21b) are the corresponding English and Spanish forms. The switch falls at a point where the English and Spanish word orders coincide, and therefore the switch in (21c) is possible (Poplack, 1980: 586):
(21a) Eng: I told him that so that he would bring it fast
(21b) Sp: (Yo) lo dije eso pa’ que (el) la trajero ligero
I him told that so that he it bring fast
(21c) CS: I told him that PA’ QUE LA TRAJERO LIGERO
However, the switch between the noun and adjective in example (22) is ruled out since the noun-adjective order is exclusive to Spanish:
(22) *the casa big
house
‘the big house’ (Woolford, 1983)
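As a rough illustration of how the Equivalence Constraint sorts examples such as (18) and (22), the following Python sketch approximates "word order possible in both languages" by sets of permitted adjacent category pairs. The category labels, the toy order sets, and the function name `equivalence_violations` are assumptions made for this illustration; they are not part of Poplack's formulation and cover only a fragment of either language.

```python
# Toy surface grammars: the adjacent category pairs each language allows.
# These sets are illustrative assumptions, not full grammars.
ALLOWED_ORDERS = {
    "en": {("D", "N"), ("D", "A"), ("A", "N")},  # the house, the big ..., big house
    "es": {("D", "N"), ("N", "A")},              # la casa, casa grande
}


def equivalence_violations(tokens):
    """Return the switch points whose surrounding order is not shared.

    tokens is a list of (word, category, language) triples. A switch point
    is any boundary where the language changes. It is flagged when the
    category pair straddling it is not allowed in both languages involved.
    """
    violations = []
    for (w1, c1, l1), (w2, c2, l2) in zip(tokens, tokens[1:]):
        if l1 != l2:
            shared = all((c1, c2) in ALLOWED_ORDERS[lang] for lang in (l1, l2))
            if not shared:
                violations.append((w1, w2))
    return violations


# The noun phrases from examples (22) and (18) in the text.
example_22 = [("the", "D", "en"), ("casa", "N", "es"), ("big", "A", "en")]
example_18 = [("the", "D", "en"), ("house", "N", "en"), ("chiquita", "A", "es")]

print(equivalence_violations(example_22))
print(equivalence_violations(example_18))
```

In both examples the flagged boundary is the one between noun and adjective, while the determiner-noun boundary in (22) passes because that order is shared. Because the check looks only at linear adjacency, it inherits the weakness noted in the criticism of the constraint: it says nothing about hierarchical structure.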
Despite their seminal work showing that code switching is syntactically systematic and rule-governed, Sankoff and Poplack’s model has been criticized for falling short on two levels. On the conceptual level, since every combination of languages requires its own code mixed grammar in addition to the monolingual grammars, a trilingual speaker would be carrying around seven grammars. This number jumps to 15 for the quadrilingual. Such an increase in the number of grammars seems unlikely.
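The figures of seven and 15 grammars are consistent with counting one grammar for every non-empty combination of a speaker's languages, that is, 2^n − 1 grammars for n languages. This counting scheme is an inference from the numbers given, not something the model's critics state explicitly. A minimal Python sketch, with the hypothetical function name `grammars_needed`, makes the arithmetic concrete:

```python
from itertools import combinations


def grammars_needed(languages):
    """List every grammar required under the third-grammar view.

    Assumption: one grammar per non-empty combination of languages.
    Singletons are the monolingual grammars; larger combinations are
    the code switching grammars for that set of languages.
    """
    return [
        combo
        for size in range(1, len(languages) + 1)
        for combo in combinations(languages, size)
    ]


for langs in (["A", "B"], ["A", "B", "C"], ["A", "B", "C", "D"]):
    print(len(langs), "languages:", len(grammars_needed(langs)), "grammars")
```

For a bilingual the count is three (two monolingual grammars plus the code switching grammar), which matches the "third grammar" of the original proposal.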
Second, on the empirical level, both of their proposed constraints have been shown to be too strong. The equivalence constraint that operates on linear order disregards the hierarchical relationship among categories and constituents. Consequently, although the constraint adequately predicts switches between Spanish and English, it fails to predict possible switches in the case of typologically dissimilar languages, such as English (SVO) and Japanese (SOV). In example (23), an English adverb is used in Hindi word order and precedes the verb. Example (24) shows a Farsi object DP (determiner phrase) following the verb. In example (25), an English adjective follows an Irish noun and in example (26) a Japanese postposition follows an English DP.
(23) pušpa bahut quickly bat karh hai
Pushpa very word do-PROG
‘Pushpa talks very quickly.’ (Hindi-Eng; Di Sciullo et al., 1986)
(24) I love xormalu.
persimmon
‘I love persimmon.’ (Farsi-Eng; Mahootian, 1996a)
(25) Ta carr light green aige
be car at him
‘He has a light green car.’ (Ir-Eng; Stenson, 1990)
(26) I slept with her basement de.
in
‘I slept with her in the basement.’ (Jap-Eng; Nishimura, 1985)
The Free Morpheme Constraint, proposed as a general constraint on code switching, is likewise too strong. It predicts no switching between a bound morpheme of one language and a phonologically unassimilated lexical item of another language. This prediction is not borne out, as seen in the following examples from Arabic-English (Prince and Pintzuk, 1984: 2), Irish-English (Stenson, 1990: 180), and Japanese-English (Nishimura, 1991: 31).
(27) inta hang-ha up
you -it
‘you hang it up’
(28) job-anna
-PL
‘jobs’
(29) one algebra question-o mark-shite
ACC -do
‘(you) mark one algebra question, and . . .’
The Sankoff and Poplack model accounts for examples such as (23)–(28) by designating the switches as borrowings.
Principle-Based Approaches with Special Mechanisms
The limitations of Sankoff and Poplack’s surface level constraints led other researchers to look for more principle-based constraints to account for code switching. Code switching was seen as an additional process that needed to be accounted for by special rules/constraints outside the bilingual speaker’s two monolingual grammars but motivated by universal principles of grammar. Toward this end, a number of code switching models were proposed: Woolford’s phrase structure congruence model (1983), Bentahila and Davies’ subcategorization restrictions model (1983), Joshi’s closed-class item constraint (1985), Di Sciullo et al.’s government constraint (1986), and Belazi et al.’s Functional Head Constraint (1994).
Phrase Structure Congruence Model
Woolford (1983) argues for an account of code switching that follows from ‘‘the manner which monolingual grammars cooperate to produce hybrid sentences’’ (Woolford, 1983: 519). Woolford argues against an autonomous code switching grammar, citing as evidence the fact that we do not find utterances such as (30b) where English lexical items in a Spanish structure produce an unacceptable utterance (Woolford, 1983: 521):
(30a) No estoy terca. (Spanish)
not am stubborn
‘I’m not stubborn.’
(30b) *Not am stubborn.
In this model, the lexicon and word formation rules of the two languages are assumed to be separate and cannot interact. Instead she proposes that the monolingual grammars interact at D-structure at the level of phrase structure rules, where ‘‘the two grammars operate during code-switching just as they do during monolingual speech’’ (Woolford, 1983: 522). Therefore, while dropping the subject pronoun is acceptable in Spanish (30a), pro-drop is not permitted in English (30b). Woolford’s model, like her predecessors’, also prohibits morphologically mixed French-English lexical items such as *savoir-ing. Terminal nodes of phrase structures can only be filled from the lexicon of the grammar that generated them. Code switches result when a phrase structure is identical in both languages and ‘‘both lexicons have equal access to terminal nodes created by common rules’’ (Woolford, 1983: 524). For example, in both Spanish and English the determiner precedes the noun. Therefore, we should expect to get the following four combinations of the phrase ‘the old man’:
(31) the old man
(32) el hombre viejo
the man old
‘the old man’
(33) the hombre viejo
(34) el old man
On the other hand, the rule that adjectives follow nouns is unique to Spanish. Woolford’s model predicts we will get the utterance in (35) but not the ones in (36)–(39):
(35) el hombre viejo ‘the old man’
(36) *el man viejo ‘the old man’
(37) *el hombre old
(38) *the old hombre
(39) *the man old
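The logic behind the predictions in (31)–(39) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the two-rule grammar, the category labels, and the function name `predicted_phrases` are hypothetical and are not Woolford's own formalism. The determiner position comes from a rule shared by both languages, so either lexicon may fill it, while the noun-adjective order comes from a language-specific rule, so only that language's lexicon may fill it.

```python
from itertools import product

# Toy lexicons for the phrase 'the old man' (illustrative assumption).
LEXICON = {
    "en": {"D": "the", "A": "old", "N": "man"},
    "es": {"D": "el", "A": "viejo", "N": "hombre"},
}

# NP-internal order is language-specific: A N in English, N A in Spanish.
NP_RULE = {"en": ("A", "N"), "es": ("N", "A")}
# The rule placing D before NP is common to both grammars,
# so the determiner may be drawn from either lexicon.


def predicted_phrases():
    """Enumerate the determiner phrases the toy model licenses.

    Terminals under a shared rule may come from either lexicon.
    Terminals under a language-specific rule come only from the
    lexicon of the grammar that generated them.
    """
    phrases = set()
    for d_lang, np_lang in product(LEXICON, repeat=2):
        determiner = LEXICON[d_lang]["D"]
        np_words = [LEXICON[np_lang][cat] for cat in NP_RULE[np_lang]]
        phrases.add(" ".join([determiner] + np_words))
    return phrases


print(sorted(predicted_phrases()))
```

The sketch yields exactly the four combinations in (31)–(34) and none of the starred forms, since a mixed noun-adjective pair would require a terminal filled from the wrong lexicon.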
Although accepted and incorporated into others’ models of code switching, the noun-adjective prediction was shown to be wrong by Mahootian and Santorini (1994) and Santorini and Mahootian (1995). Their account of switching between nouns and adjectives uses the distinction between complements and adjuncts on the one hand and attributive and predicative adjectives on the other, to account for and predict noun-adjective switches between languages with dissimilar NP structures such as Spanish and English, Italian and English, and Irish and English:
(40) I want a motorcycle verde
green
‘I want a green motorcycle.’ (Sp-Eng; McClure, 1981: 87)
(41) Ma ci stanno dei smart italiani.
‘But there are smart Italians.’ (It-Eng; Di Sciullo et al., 1986: 15)
(42) do gheansai deas pink
your sweater nice
‘your nice pink sweater’ (Ir-Eng; Stenson, 1990: 171)
The main criticism brought against Woolford’s model is its inability to account for switches involving structurally dissimilar languages. With the exception of noun-adjective switches, Woolford’s approach successfully accounts for Spanish-English, but like Sankoff and Poplack’s equivalence constraint, its reliance on the order of constituents in a phrase renders it too strong when dealing with languages whose word orders differ such as Hindi-English, Farsi-English, Arabic-French, Irish-English, etc. Although Woolford’s model does not account for the full range of code-switching data, it captures
two significant generalizations about the process of code switching. First, code switching is an ordinary language-use phenomenon that is part of our grammar system and does not require a third grammar, and second, switches involve cross-linguistically corresponding categories. These two insights figure prominently in models that followed a decade later.
Subcategorization Restrictions
Intrasentential code switches between Moroccan Arabic and French led Bentahila and Davies (1983) to conclude that there are no surface constraints on switching and that, for at least Arabic-French, switching is constrained only by language-specific subcategorization restrictions. Accordingly, they were able to account for switches within VP even where the structures of the two languages are dissimilar, as is the case of Moroccan Arabic (VSO) and French (SVO). Although they do not explicitly define it as such, in application their notion of subcategorization restrictions is equal to sisterhood. Accordingly, they allow for the switch in (43), where an Arabic adjective (which subcategorizes as postnominal) follows a French noun (Bentahila and Davies, 1983: 321):
(43) un professeur aDim
a professor excellent
‘an excellent professor’
However, they wrongly block switches like the one in example (44), where an Arabic adjective precedes a French noun, thus violating the subcategorization rule for Arabic adjectives. As mentioned earlier, these types of switches were later addressed and accounted for by Mahootian and Santorini.
(44) un aDim professeur
a excellent professor
‘an excellent professor’
The subcategorization model also prohibits switching at word-internal morpheme boundaries, for which Bentahila and Davies postulate a constraint similar to Sankoff and Poplack’s Free Morpheme Constraint. Bentahila and Davies’ model of switching has an advantage over other models. Although they cannot account for word-internal switches or the range of possible noun-adjective switches, their model accounts for a wide range of intrasentential switches by simply invoking the universal principle of subcategorization restrictions.
The Closed-Class Item Constraint
Joshi (1985) also proposed a system for code switching where the grammars of the two languages are kept separate. There is no merging of the phrase structure rules or lexicon of the two languages. Rather the proposed
switching constraints are grounded in basic principles of language and grammar. The system he presents involves a matrix language Lm with a corresponding grammar Gm and an embedded language Le with grammar Ge. He assumes a correspondence between grammatical categories across the two grammars so that the structure of a DP in one corresponds to the structure of DP in the other. Joshi places three constraints on switching. The first is asymmetry, which specifies that switching can only occur in one direction: from matrix language to embedded language. The second constraint prohibits switches from occurring at the root node of the matrix language (Sm). The third constraint, the Closed-Class Item Constraint, prohibits switching of closed-class items including determiners, quantifiers, prepositions, possessive, Aux, Tense, etc., meaning that all closed-class items must be in the matrix language. Two related problems have been noted with this model. In the first place, the viability of the asymmetry constraint is questioned. Asymmetry depends on the distinction between matrix language and embedded language. Joshi claims ‘‘speakers and hearers usually agree on which language the mixed sentence is ‘coming from’’’ (Joshi, 1985: 193). However, as noted earlier, this criterion, along with others proposed to distinguish matrix language from embedded language, is unreliable. Consequently, since matrix and embedded languages are not systematically discernible from each other, the closed-class item constraint loses its viability. The following are some examples of switches that violate the predictions of the Closed-Class Item Constraint.
(45) Anyway, I figured ke if I worked hard enough, that I would be finished by summer.
‘I figured that if I worked hard enough I would be finished by summer.’ (Farsi-Eng; Mahootian, 1993)
(46) elle te pique min fuq le drap
it bites you through the sheet
‘it bites through the sheet’ (Mor. Ar-Fr; Bentahila and Davies, 1983)
(47) Where are they, los language things?
the
‘Where are they, the language things?’ (Sp-Eng; Poplack, 1980)
In sentence (45), a Farsi complementizer appears between an English main clause and an English embedded clause. In (46), a lone Arabic preposition appears in the midst of a French sentence. In (47), a Spanish determiner occurs in an all-English sentence.
In each case, the model would designate the language of the closed-class item, i.e., the complementizer/preposition/determiner, as the matrix language, regardless of the fact that the rest of the sentence is syntactically and lexically in another language. Despite these problems, the main insights motivating this model remain valuable: that a third grammar for switching is unnecessary; and that switching results from the interaction between two grammatical systems with comparable categories.
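The dilemma the Closed-Class Item Constraint faces with an example such as (47) can be made concrete with a small Python sketch. The category labels, the contents of `CLOSED_CLASS`, and the function name `closed_class_violations` are assumptions introduced for illustration and are not Joshi's own notation. The check simply reports every closed-class item that is not in the language assumed to be the matrix.

```python
# Closed-class categories named in the text, plus COMP, which the
# discussion of (45) treats as closed class. Labels are assumptions.
CLOSED_CLASS = {"D", "Q", "P", "POSS", "AUX", "TENSE", "COMP"}


def closed_class_violations(tokens, matrix_lang):
    """Return closed-class words that are not in the matrix language.

    tokens is a list of (word, category, language) triples.
    """
    return [
        word
        for word, category, lang in tokens
        if category in CLOSED_CLASS and lang != matrix_lang
    ]


# The noun phrase from example (47): 'los language things'.
example_47 = [("los", "D", "es"), ("language", "N", "en"), ("things", "N", "en")]

print(closed_class_violations(example_47, matrix_lang="en"))
print(closed_class_violations(example_47, matrix_lang="es"))
```

With English as the matrix language the determiner is flagged. The constraint is satisfied only if Spanish is declared the matrix language, even though every other word is English, which is the counterintuitive designation described above.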
The Government Constraint
Di Sciullo et al.’s Government Constraint Model (1986) was an attempt to account for switches found between languages with dissimilar phrase structure rules, such as Hindi, Farsi, Japanese (SOV), and English (SVO). In this model, switching is prohibited where there is a government relation at S-structure between two elements. They propose that elements in a government relation must be in the same language, designated by the language index q such that: if X governs Y, . . . Xq . . . Yq . . . Y is a maximal projection, not a terminal node. The language index is assigned to the highest lexical element in a maximal projection. Government is defined through immediate c-command. For example, in the prepositional phrase to school, the preposition and its complement would both bear the index q, and would have to be in the same language:
(48) [tree diagram for the phrase to school]
In the phrase to the school, however, the determiner the is the highest lexical element in the governed DP the school, and will therefore be the language index carrier: to and the must be drawn from the lexicon of the same language, while school may or may not be in another language. Example (49) illustrates:
(49) [tree diagram for the phrase to the school]
This model successfully eliminates word order equivalence as a criterion for switching and allows for code switching between languages with dissimilar word orders. However, a number of conceptual and empirical problems have been noted. First, the notion of language index that the model rests on is, from a processing point of view, redundant and pointless for the monolingual speaker who may never be exposed to another language. More importantly, the Government Constraint proves to be too strong.
The model predicts that switches between a verb (V) and COMP are prohibited, as shown in the Hindi/English examples (50) and (51) from Di Sciullo et al. (1986: 17):
(50) I told him that ram bahut bimar hai.
Ram very sick AUX
‘I told him that Ram was very sick.’
(51) *I told him ki ram bahut bimar hai.
that Ram very sick AUX
‘I told him that Ram was very sick.’
However, as the Farsi-English examples in (52) and (53) illustrate, switches do occur between V and COMP. Farsi, like Hindi, is an SOV language.
(52) I was implying ke in kar dorost nist.
that this act correct isn’t
‘I was implying that this isn’t right.’ (Mahootian, 1993: 104)
(53) They don’t know ke noh noh-ta haeštad-o yek-e.
that nine nine-CLASS eighty-and one-is
‘They don’t know that nine times nine is eighty-one.’
They also predict that conjunctions such as and or but will be in the same language as the clause they introduce, so we should get CPq CONJp CPp, but we would not get *CPq CONJq CPp. However, this prediction does not hold true. Examples (54) and (55) show switching between a Farsi conjunction and the following English CP conjunct, and an English conjunction followed by a Farsi CP, respectively.
(54) fekr kaerd-aem daest-aem daerd begire vaeli it was fine.
thought did-1st hand-1stPOS pain get-3rd but
‘I thought my hand would hurt but it was fine.’
(55) They are boys and morteza xeyli adaem-e smart-i-ye.
morteza very person-EZ IND-is
‘They are boys and Morteza is a very smart person.’
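The core check of the Government Constraint, that a governor and the language index carrier of the phrase it governs must share a language, can be sketched as follows. This is a simplified illustration, not Di Sciullo et al.'s formalism: the `Node` class, the function names, and the French words are hypothetical, and "highest lexical element" is approximated as the shallowest terminal node found by breadth-first search, taking the leftmost one on ties.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # category label, e.g. "DP", "D", "N"
    word: str = ""                  # filled only on terminal nodes
    lang: str = ""                  # language of the word (terminals only)
    children: list = field(default_factory=list)


def index_carrier(phrase):
    """Return the highest lexical element of a maximal projection.

    Assumption: 'highest' means the shallowest terminal in the tree,
    leftmost on ties, located by breadth-first search.
    """
    queue = deque([phrase])
    while queue:
        node = queue.popleft()
        if not node.children:
            return node
        queue.extend(node.children)
    raise ValueError("empty phrase")


def government_ok(governor, governed_phrase):
    """True if the governor and the index carrier share a language."""
    return governor.lang == index_carrier(governed_phrase).lang


to = Node("P", "to", "en")
# Hypothetical mixed DPs built on the text's 'to the school' example.
dp_english_d = Node("DP", children=[
    Node("D", "the", "en"),
    Node("NP", children=[Node("N", "école", "fr")]),
])
dp_french_d = Node("DP", children=[
    Node("D", "la", "fr"),
    Node("NP", children=[Node("N", "école", "fr")]),
])

print(government_ok(to, dp_english_d))  # determiner matches the preposition
print(government_ok(to, dp_french_d))   # determiner is in the other language
```

As in the discussion of to the school, only the determiner's language matters for the check, so the noun is free to come from either lexicon.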
Halmari (1997) modifies and expands the Di Sciullo et al. model to accommodate Finnish-English code switches. In doing so, she strengthens the model and makes the apt observation that the Government model is likely one of a number of models that can account for some of the structural outputs of code switching.
The Functional Head Constraint
Belazi et al. (1994) propose a model that restricts switching between a functional head and its complement: a functional head f-selects its complement with regard to
522 Code Switching and Mixing
language as a feature, thereby ensuring a match between the language of the functional head and its complement. Accordingly, switching is disallowed between C and IP, I and VP, Neg and VP, D and NP, and Q and NP. Additionally, to account for noun/adjective switches, they posit the Word-Grammar Integrity Corollary, according to which words from a given language must obey the grammar rules of that language and no other. So, for example, postnominal modifiers cannot occur pronominally in a mixed utterance. The Functional Head Constraint model is similar to the Government Constraint model. Both rest upon and exploit a special relationship between certain elements (government/f-selection) and stipulate a language-sensitive component (language index/ language feature) as part of the relationship. From this point, the two analyses turn in opposite directions from each other, resulting in a striking (near-) complementary distribution between Di Sciullo et al. and Belazi et al.’s analyses. What the government constraint rejects as switch points becomes fertile switching grounds for the functional head constraint. Examples (56)–(59) show reported switches that violate the Functional Head Constraint. C and IP (56) il croyait bi?ana je faisais c¸ a expre`s. he thought that I was doing that on purpose ‘He thought that I was doing that on purpose.’ (Mor. Ar.-Fr; Bentahila and Davies, 1983) I and VP (57) Nı´ dispute-alfaidh me´ leat. neg -FUT I with-you ‘I won’t dispute with you.’ (Ir-Eng; Stenson, 1990: 180) Neg and VP (58) Pa´ ra que no talfene´ en a la misˇ tara´ . so that NEG. phone they would to the police ‘So that they wouldn’t phone the police.’ (Hebr-Sp; Berk-Seligson, 1986) D and NP (59) Ha portato un cadeau. (he) brought a present ‘He brought a present.’ (It-Fr; Di Sciullo et al., 1986) Q and NP (60) Daban unos steaks tan sabrosos. 
(they) served some so tasty ‘They served some steaks so tasty.’ (Sp-Eng; Pfaff, 1979) Code-Switching Models Based on Universal Principles with No Special Mechanisms
In the early 1990s two new models emerged. One makes use of a fundamental principle of grammar, the Head-Complement Principle, to account for intrasentential switches as well as word-internal switches (Mahootian, 1993, 1996a,b; Mahootian and Santorini, 1996). The other, the Matrix Language Frame model, relies on a matrix/embedded relationship between the speaker’s two languages and on the distinction between system and content morphemes (Myers-Scotton, 1993, 2002).

The MLF Model

The Matrix Language Frame (MLF) model for code switching (Myers-Scotton, 1993, 2002) explains code switching in terms of two types of asymmetry: (a) the Matrix Language (ML) vs. the Embedded Language (EL) and (b) content vs. system morphemes. System morphemes overlap with the members of the category traditionally referred to as closed-class items. They carry the feature [+Quantification]. System morphemes can be specifiers, quantifiers, possessive adjectives, and inflectional morphemes. Content morphemes have the feature [−Quantification] and are the morphemes that assign or receive thematic roles. They may be either in the ML or the EL. Content morphemes can be verbs, nouns, descriptive adjectives, or prepositions. The MLF model rests on four hypotheses (Myers-Scotton, 1993): the Matrix Language Hypothesis, the Blocking Hypothesis, the Embedded Language Island Trigger Hypothesis, and the Embedded Language Implicational Hierarchy Hypothesis.
1. The Matrix Language Hypothesis: the grammatical frame in intrasentential code switching is set by the matrix language. This hypothesis is guided by two principles. The Morpheme Order Principle states that, in a mixed constituent, morpheme order cannot violate the order of morphemes in the ML. The System Morpheme Principle states that the ML supplies the system morphemes in mixed constituents.
2. The Blocking Hypothesis: switching is blocked when content morphemes from the embedded language do not meet ML congruency requirements.
3. The Embedded Language Island Trigger Hypothesis: switching into a nonpermissible EL morpheme forces an obligatory EL island, i.e., the constituent must be completed in the EL.
4. The EL Implicational Hierarchy Hypothesis: idiomatic, formulaic, or peripheral EL constituents may occur and will not be barred by hypotheses 1, 2, or 3.
In the mixed constituents in examples (61)–(62), Irish and Finnish are considered the matrix languages.
(61) Ni disputalfaidh me leat
     neg dispute-FUT I with-you
     ‘I won’t dispute with you’ (Ir-Eng; Stenson, 1990)
(62) Se oli semmosesta landistä.
     it was such +ELAT +ELA
     ‘It was about a land.’ (Finn-Eng; Halmari, 1997)
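The System Morpheme Principle lends itself to a small mechanical check. The following is a minimal sketch, not part of the MLF model's own formalism: the `Morpheme` class, the function name, and the system/content tags assigned to part of example (61) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str
    lang: str
    kind: str  # "system" or "content" (tagging is assumed, not given by the article)


def system_morpheme_violations(constituent, matrix_lang):
    """Return the system morphemes that are not supplied by the matrix language."""
    return [m for m in constituent if m.kind == "system" and m.lang != matrix_lang]


# Part of example (61); the tags below are an illustrative assumption.
example_61 = [
    Morpheme("Ni", "Irish", "system"),          # negation
    Morpheme("dispute", "English", "content"),  # verb stem
    Morpheme("-alfaidh", "Irish", "system"),    # future inflection
]

# With Irish as the matrix language, no system morpheme is out of place.
print(system_morpheme_violations(example_61, "Irish"))
# With English assumed as the matrix language, both Irish system morphemes are flagged.
print([m.form for m in system_morpheme_violations(example_61, "English")])
```

The check takes the matrix language as an input. It therefore says nothing about how the matrix language is identified in the first place, which is a separate question for the model.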
In her revised MLF model, the 4-M model, Myers-Scotton (2002) makes finer distinctions between system and content morphemes. The 4-M model defines subtypes of system morphemes, arguing that while this basic dichotomy holds, the subtypes of morphemes have both explanatory and predictive power. The general objection brought against the MLF model has to do with its reliance on the matrix-embedded distinction. As discussed earlier, this distinction is not clearly defined. Although the MLF’s first hypothesis is intended to systematize this distinction, it has been noted that the description contains a certain amount of circularity: on the one hand, the ML is recognized by the fact that system morphemes are in the ML. On the other hand, the System Morpheme Principle specifies that system morphemes are supplied by the ML. Thus, examples such as (63)–(65) would either have to be noted as counterevidence to the model or one must in each instance assign ML status to the entire utterance based on a single morpheme.
(63) Anyway I figured ke if I worked hard enough . . .
     that
     ‘Anyway I figured that if I worked hard enough . . .’ (Farsi-English; Mahootian, 1993)
(64) there wasn’t an item vos we didn’t have
     that
     ‘there wasn’t an item that we didn’t have’ (Eng-Yidd; Prince and Pintzuk, 1984)
(65) Lawyer-et will tell you what to do.
     2P-POSS
     ‘Your lawyer will tell you what to do.’ (Farsi-Eng; Mahootian, 1993)
The Head-Complement Principle Model

The Head-Complement Principle model followed in the footsteps of earlier researchers such as Bentahila and Davies and Woolford in looking for a principle-based approach. The model sets out to account for the multitude of switches recorded between typologically dissimilar languages and for switches found between bound and free morphemes. The model builds on three basic assumptions:
1. The same principles and derivational constraints that produce monolingual utterances produce mixed utterances, whether the switch involves a bound morpheme, a single word, or an entire phrase. General principles of phrase structure, rather than constraints specific to code switching, produce code switched utterances.
2. No special third grammar is needed to account for code switched utterances. A bilingual’s two grammars are sufficient to generate code switching. The two grammars remain distinct, both sociolinguistically and syntactically. Their interaction is accounted for by (1).
3. From a syntactic perspective, there is no formal distinction between ‘code switching’ and ‘borrowing’: both can be accounted for in the same way by the same principles of grammar.
The principle itself is summed up below (Mahootian, 1994, 1996a,b):
The Head-Complement Principle (HCP): heads determine the syntactic properties of their complements in code switching and monolingual contexts alike: heads determine the phrase structure position, syntactic category, and feature content of their arguments.
The HCP applies to both lexical and functional heads. Consequently, switches are permissible between determiners (including quantifiers) and NP, Comp and IP, Infl and VP, V and PP, P and DP, and other heads and their complements. Furthermore, single-word switches and within-word switches, often treated as borrowings and therefore left outside of syntactic analysis, are accounted for in the HCP model. In the HCP model, maximal projection nodes are the points at which switching occurs: switching may occur where there is phrase structure label congruence between languages. This is different from previous approaches that rely on word-order congruence between the switching languages.

In line with numerous theories of grammar where the lexicon plays a central role, this model adopts a view of grammar where the lexicon projects phrase structures. Lexical items of a language are represented as trees reflecting their language-particular syntactic requirements in accordance with X-bar theory. This view of language as a combination of structures rather than strings allows for switches between languages with dissimilar word order that other approaches have not been able to address. The HCP model predicts:
• switches at all maximal projections (possible but not obligatory),
• L1 word order with mixed L1/L2 words,
• L2 word order with mixed L1/L2 words,
• L1 word order with all L1 words,
• L2 word order with all L2 words.
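The way a head fixes the position and category of its complement, whatever the complement's language, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than Mahootian's tree-based formalism: the `Head` and `Phrase` classes, the `project` function, and the lexical entries are hypothetical, modelled on the Farsi and English forms discussed in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phrase:
    words: tuple    # surface words, in order
    category: str   # phrase structure label, e.g. "DP"


@dataclass(frozen=True)
class Head:
    form: str            # the head word itself
    lang: str            # language whose lexicon supplies the head
    category: str        # category of the head, e.g. "V"
    comp_category: str   # category the head selects, e.g. "DP"
    comp_side: str       # "left" or "right": where the head places its complement


def project(head: Head, comp: Phrase) -> Phrase:
    """Combine a head with its complement; only the head's requirements are checked."""
    if comp.category != head.comp_category:
        raise ValueError(f"{head.form} selects {head.comp_category}, got {comp.category}")
    if head.comp_side == "left":
        words = comp.words + (head.form,)
    else:
        words = (head.form,) + comp.words
    return Phrase(words, head.category + "P")


# Hypothetical lexical entries (assumed for illustration).
dad = Head("dad", "Farsi", "V", "DP", "left")       # verb-final: complement precedes
gave = Head("gave", "English", "V", "DP", "right")  # complement follows

ten_dollars = Phrase(("ten", "dollars"), "DP")  # English DP
daeh_dolar = Phrase(("daeh", "dolar"), "DP")    # Farsi DP

for verb in (dad, gave):
    for dp in (ten_dollars, daeh_dolar):
        print(" ".join(project(verb, dp).words))
```

Because `project` never inspects the language of the complement, the same two verb entries yield both monolingual and mixed verb phrases, with word order following the language of the verb in each case.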
Application of the model is illustrated with the Farsi-English object-verb switch in (66). Farsi is an SOV language.
(66) ten dollars dad
     gave
     ‘she gave ten dollars’
Based on the assumptions of the model, the utterance in (66) is derived from the Farsi structures shown in (67) and the English structures shown in (68).
(67) [Farsi lexical structures (67a)–(67d); tree diagrams not reproduced]
(68) [English lexical structures (68a)–(68d); tree diagrams not reproduced]
Lexical items project the structures in (67a)–(67d) and (68a)–(68d). The combination of the tree in (68b) + (68c) + (68d) yields the DP structure in (69). This DP combines with the lexical structure for dad-e ‘gave’, to yield a Farsi left-branching VP, ten dollars dad, shown in (70).
(69) [DP structure; tree diagram not reproduced]
(70) [VP structure; tree diagram not reproduced]
Given the assumptions of this model, the same lexical structures generate comparable monolingual utterances in Farsi and English. By combining phrase structures (67a)–(67d), we get the Farsi utterance in (71).
(71) daeh dolar dad
     ten dollar gave
     ‘gave ten dollars’
A combination of (68a)–(68d) yields the English phrase in (72):
(72) gave ten dollars
A number of other attested types of mixed utterances can also result from the phrase structures in (67) and (68), for example, gave daeh dolar, gave daeh dollars, ten dolar dad. Switches across morpheme boundaries are accounted for in the same manner. The HCP model has some clear advantages over models that require mechanisms and rules specific to code switching. By applying principles of universal grammar, the model accurately predicts switch points at clause, phrase, and within word levels. No additional mechanisms or assumptions are required. MacSwan (1999), inspired by the simplicity and broad scope of application inherent in the HCP approach, recast the same insights and principles into a minimalist framework.

Bilingual Language Acquisition and Code Switching
‘Bilingual language acquisition’ refers to the simultaneous acquisition of more than one language during the developmental stages of language acquisition. One of the first documented studies of this process was conducted by Ronjat (1913), who examined his son’s simultaneous acquisition of German and French. This was followed by a number of other longitudinal studies. Of particular significance is Leopold’s seminal four-volume detailed diary of his daughter’s acquisition of German and English (Leopold, 1939–1949). Leopold’s study marked the beginning of focused interest in child bilingualism. Since then, numerous other studies have been conducted in this area. One of the issues arising from research into child bilingualism is the status of code switching. The main theoretical question for researchers in this area is whether children start out with a unitary language system that gradually splits into two separate systems or whether they start out with two separate systems. Proponents of the unitary system contend that the words of both languages form a single lexicon for the child and that mixed utterances at the two- and three-word stage support a nondifferentiated system. Researchers supporting the separate system, also called the Independent Language Hypothesis, maintain that children start out with two separate systems and keep them differentiated. They show that children typically keep their languages separated when they are in monolingual contexts by using the appropriate language for the context. To show an undifferentiated system, they assert, a child would need to use the two languages randomly in either context.
Another question regarding language mixing in early acquisition stages concerns the similarity between the types of switches found in children’s utterances and those found in adult switches. Structurally, the question is whether children mix their two languages in the same way adults do. The answer to this question has theoretical implications on a number of fronts. One much debated issue in first language acquisition is whether adult grammars are a continuation of children’s grammars or if the two are separate systems. The similarity between adults’ and children’s code switches may bear on this question in particular, and on the notion of language universals in general. A further insight gained from child code-switching data concerns existing models of code switching. How well do these models hold up when they set out to account for children’s code switches? Of the existing models, only a few, Mahootian’s HCP model and Myers-Scotton’s MLF model, have been successfully applied to children’s code-switching data. Examples (73)–(77) show mixed utterances produced by children. Children’s ages are noted in parentheses in years followed by months. Languages are abbreviated as follows: Fr = French, Germ = German, Russ = Russian, Norw = Norwegian.
(73) ça ça sonne
     this this sun
     (1;11, Fr-Germ; Köppe and Meisel, 1995)
(74) das dort ne?
     this sleeps, right?
     (2;0, Germ-Fr; Köppe and Meisel, 1995)
(75) mer cookie
     more
     ‘more cookie’ (2;2, Norw-Eng; Lanza, 1997)
The following is a comparison between adult mixed utterances and children’s mixed utterances showing switches occurring at the same grammatical junctures (Mahootian, 1999).
• Switches between Determiner and NP
(76) el same night
     the
     ‘the same night’ (adult: Sp-Eng; Pfaff, 1979)
(77) That’s my kino.
     cinema (movie)
     ‘That’s my movie.’ (child: 3;4, Russ-Eng; O’Neill, 1998)
• Switches between Verb and DP
(78) il ne faut pas changer ttwSe:l
     the receipt
     ‘You must not change the receipt.’ (adult: Arab.-Fr; Bentahila and Davies, 1983)
• Switches between Subject DP and I0
(80) No you govorish, chto eto vozmozhno.
     But say that that possible
     ‘But you say that it’s possible.’ (adult: Russ-Eng; O’Neill, 1998)
(81) Jeg give it to Daddy
     I
     ‘I give it to daddy’ (child: 2;7, Norw-Eng; Lanza, 1997)
• Switches between V and PP
(82) Nu vidish, Steven King zhil in a trailer.
     well you see lived
     ‘So you see, Steven King lived in a trailer.’ (adult: Russ-Eng; O’Neill, 1998)
(83) on va maintenant zum krankenwagen
     we go now to the ambulance
     ‘We now go to the ambulance.’ (child: 3;7,2, Germ.-Fr; Meisel, 1994)
• Switches between Roots and Affixes
(84) Eto na dresser-e.
     That on -LOC-Sg
     ‘It’s on the dresser.’ (adult: Russ-Eng; O’Neill, 1998)
(85) It’s lop -en.
     break-
     ‘It’s broken.’ (child: 3;10, Russ-Eng; O’Neill, 1998)

See also: Bilingual Education; Bilingual Language Development: Early Years; Bilingualism and Aphasia; Bilingualism and Second Language Learning; Bilingualism; Code Switching; Codes, Elaborated and Restricted (Bernstein); Identity and Language; Identity: Second Language; Language Attitudes; Language Change and Language Contact; Language Maintenance and Shift; Multiculturalism and Language; Relevance Theory; Rule Borrowing.
Bibliography

Appel R & Muysken P (1987). Language contact and bilingualism. London: Edward Arnold.
Auer P (1984). On the meaning of conversational codeswitching. In Auer P & di Luzio A (eds.) Interpretive Sociolinguistics: Migrants, Children, Migrant Children. Tübingen, Germany: Gunter Narr Verlag. 87–112.
Backus A & Eversteijn N (2004). ‘Pragmatic functions and their outcomes: language choice, code-switching, and non-switching.’ In Lorenzo Suárez A M, Ramallo F & Rodríguez-Yáñez P (eds.). Bilingual socialization and bilingual language acquisition. Proceedings from the Second International Symposium on Bilingualism, University of Vigo, Galicia-Spain, October 23–26, 2002. Vigo: Servizo de publicacións da Universidade de Vigo.
Belazi H M, Rubin E J & Toribio J A (1994). ‘Code switching and X-bar theory: the Functional Head Constraint.’ Linguistic Inquiry 25, 221–237.
Bentahila A & Davies E E (1983). ‘The syntax of Arabic-French code-switching.’ Lingua 59, 301–330.
Berk-Seligson S (1986). Linguistic constraints on intrasentential code-switching: A study of Spanish/Hebrew bilingualism. Language in Society 15, 313–348.
Blom J-P & Gumperz J J (1972). ‘Social meaning in linguistic structures: code-switching in Norway.’ In Gumperz J J & Hymes D (eds.) Directions in sociolinguistics. New York: Holt, Rinehart and Winston. Reprinted in Wei L (ed.).
Burling R (1959). ‘Language development of a Garo and English-speaking child.’ Word 15, 45–68. Reprinted in Bar-Adon A & Leopold W (eds.) 1971. Child language: a book of readings. Englewood Cliffs, NJ: Prentice-Hall. 170–185.
Clyne M (1972). Perspectives on language contact. Melbourne: Hawthorne Press.
De Houwer A (1995). ‘Bilingual language acquisition.’ In Fletcher P & MacWhinney B (eds.) The handbook of child language. Oxford: Blackwell. 219–250.
Di Sciullo A, Muysken P & Singh R (1986). ‘Government and code-mixing.’ Linguistics 22, 1–24.
Fantini A (1985). Language acquisition of a bilingual child: a sociolinguistic perspective. San Diego: College Hill Press.
Fishman J (1972). Language in socio-cultural change. In Dil A S (ed.). Stanford: Stanford University Press.
Genesee F (1989). ‘Early bilingual language development: one language or two?’ Journal of Child Language 16, 161–179.
Genesee F, Nicoladis E & Paradis J (1995). ‘Language differentiation in early bilingual development.’ Journal of Child Language 22, 611–631.
Giles H, Bourhis R & Taylor D (1977). Towards a theory of language in ethnic group relations. In Giles H (ed.) Language, ethnicity and intergroup relations. London: Academic Press.
Grosjean F (1982). Life with two languages. Cambridge: Harvard University Press.
Gumperz J J (1982). Discourse strategies. Cambridge: Harvard University Press.
Gumperz J J & Cook-Gumperz J (1982). ‘Introduction: language and the communication of social identity.’ In Gumperz J J (ed.) Language and social identity. Cambridge: Cambridge University Press.
Halliday M A K, McIntosh A & Strevens P (1964). The linguistic sciences and language teaching. London: Longman.
Halmari H (1997). Government and codeswitching: explaining American Finnish. Amsterdam, Philadelphia: John Benjamins.
Hasselmo N (1961). American Swedish. Ph.D. thesis, Harvard University.
Hasselmo N (1970). ‘Codeswitching and modes of speaking.’ In Gilbert G (ed.) Texas studies in bilingualism. Berlin: de Gruyter. 179–209.
Haugen E (1953). The Norwegian language in America: a study in bilingual behaviour. Philadelphia: University of Pennsylvania Press [Reprinted in 1969, Bloomington: Indiana University Press].
Haugen E (1956). Bilingualism in the Americas: a bibliography and research guide. Alabama: University of Alabama Press.
Jakobson R (1960). Linguistics and poetics. In Sebeok T (ed.) Style in language. New York: Wiley. 350–377.
Joshi A K (1985). ‘Processing of sentences with intrasentential code switching.’ In Dowty D R, Karttunen L & Zwicky A M (eds.) Natural language parsing: psychological, computational, and theoretical perspectives. Cambridge: Cambridge University Press. 190–205.
Köppe R (1996). ‘Language differentiation in bilingual children: the development of grammatical and pragmatic competence.’ Linguistics 34(5), 927–954.
Köppe R & Meisel J (1995). ‘Code-switching in bilingual first language acquisition.’ In Milroy L & Muysken P (eds.) One speaker, two languages: Cross-disciplinary perspective on code-switching. Cambridge, UK: Cambridge University Press. 276–301.
Lanza E (1997). ‘Language contact in bilingual two-year olds and codeswitching: language encounters of a different kind?’ The International Journal of Bilingualism 1(2), 135–162.
Leopold W (1939–1949). Speech development of a bilingual child: a linguist’s record (vols I–IV). Evanston: Northwestern University Press.
MacSwan J (1999). A minimalist approach to intrasentential code switching. Outstanding Dissertations in Linguistics. New York: Garland.
Mahootian S (1993). A null theory of codeswitching. Ph.D. diss., Northwestern University.
Mahootian S (1996a). ‘Codeswitching and universal constraints: evidence from Farsi/English.’ World Englishes 15(3), 337–384.
Mahootian S (1996b). ‘A competence model of codeswitching.’ In Arnold J, Blake R, Davidson B, Schwenter S & Solomon J (eds.) Sociolinguistic variation: data, theory, and analysis: selected papers from NWAV23 at Stanford University. Stanford: CSLI Publications.
Mahootian S (1999). Codeswitching and Universal Grammar. Paper presented at the Second International Symposium on Bilingualism, Newcastle upon Tyne, UK.
Mahootian S (2005). ‘Linguistic change and social meaning: codeswitching in the media.’ International Journal of Bilingualism, Blackwell Publishers, Vol. 9.
Mahootian S & Santorini B (1994). Adnominal adjectives, codeswitching and lexicalized TAG. In 3e Colloque International sur les Grammaire d’Arbre Adjoint (TAG+3), rapport Technique TALANA-RT-94-01.
Mahootian S & Santorini B (1996). ‘Codeswitching and the complement adjunct distinction.’ Linguistic Inquiry 27(3), 464–479.
McClure E (1981). Formal and functional aspects of the codeswitched discourse of bilingual children. In Duran R P (ed.) Latino language and communication behavior. Norwood, NJ: ABLEX Publishing Corporation. 69–94.
Meisel J M (1994). ‘Code-switching in young bilingual children: the acquisition of grammatical constraints.’ Studies in Second Language Acquisition 16, 413–439.
Myers-Scotton C (1993). Social motivations for codeswitching: evidence from Africa. Oxford: Oxford University Press.
Myers-Scotton C (2002). Contact linguistics: Bilingual encounters and grammatical outcomes. Oxford: Oxford University Press.
Nishimura M (1985). Intrasentential codeswitching in Japanese and English. Ph.D. diss., University of Pennsylvania.
Nishimura M (1991). Varieties of Japanese/English bilingual speech: Implications for theories of codeswitching and borrowing. Unpublished manuscript, Georgetown University.
O’Neill M (1998). ‘Support for the independent development hypothesis: evidence from a study of Russian-English bilinguals.’ BUCLD 22 Proceedings, 586–597.
Pfaff C (1979). ‘Constraints on language mixing.’ Language 55, 291–319.
Poplack S (1980). ‘Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of codeswitching.’ Linguistics 18, 581–616.
Poplack S, Sankoff D & Miller C (1988). ‘The social correlates and linguistic processes of lexical borrowing and assimilation.’ Linguistics 26, 47–104.
Prince E & Pintzuk S (1984). Bilingual code-switching and the open/closed class distinction. Unpublished paper, University of Pennsylvania.
Reyes R (1974). Studies in Chicano Spanish. Ph.D. thesis, Harvard University.
Romaine S (1995). Bilingualism. New York: Basil Blackwell, Inc.
Ronjat J (1913). Le développement du langage observé chez un enfant bilingue. Paris: Champion.
Sankoff D & Poplack S (1981). A formal grammar for codeswitching. Papers in Linguistics 14, 3–46.
Santorini B & Mahootian S (1995). ‘Codeswitching and the syntactic status of adnominal adjectives.’ Lingua 96, 1–27.
Stenson N (1990). ‘Phrase structure congruence, government, and Irish-English code-switching.’ In Hendrick R (ed.) Syntax and semantics 23. New York: Academic Press, Inc. 167–197.
Timm L (1975). ‘Spanish-English code-switching: el porque y how-not-to.’ Romance Philology 28, 473–482.
Vihman M (1985). ‘Language differentiation by the bilingual infant.’ Journal of Child Language 12, 297–324.
Volterra V & Taeschner T (1978). ‘The acquisition and development of language by bilingual children.’ Journal of Child Language 5, 311–326.
Weinreich U (1953). Languages in contact. New York: Linguistic Circle of New York Publication No. 2.
Woolford E (1983). ‘Bilingual code-switching and syntactic theory.’ Linguistic Inquiry 14(3), 520–535.
Coelho, Francisco Adolpho (1847–1919)
R Cavaliere, Universidade Federal Fluminense, Rio de Janeiro, Brazil
© 2006 Elsevier Ltd. All rights reserved.
Francisco Adolpho Coelho is the most important 19th-century name in the field of Portuguese linguistic studies. He was born in Coimbra on January 15, 1847, and died in Carcavelos on February 9, 1919. Coelho distinguished himself as an extraordinary, self-taught person during his basic education, although it is known that he attended classes at the University of Coimbra between 1862 and 1864. In Lisbon, Coelho entered the university as a student of the Superior Course of Letters, in which he studied at least from 1865 to 1866. In 1878, he began his career as an assistant professor in this same course, teaching general Indo-European linguistics. Due to Coelho’s comparative research on Romance languages, the thesis presented to the academic world by Friedrich Diez in his Grammatik der romanischen Sprachen was applied in modern Portuguese linguistic studies, and that is the reason why philology began to be studied by more qualified people in Portugal from that time on. His first published work, entitled A língua portuguesa: fonologia, etimologia, morfologia e sintaxe (1868), is thus considered the initial step of the historical-comparative school in Portugal. Some years later, Coelho managed to publish two other important works about descriptive grammar based on the comparative method: Theoria da conjugação em latim e portuguez. Estudo de gramática comparativa (1870) and Questões da língua portuguesa (the first part in 1874 and the second in 1889). His bibliography, nevertheless, is not absolutely restricted to the field of linguistics, since he had written
level. Code-switching is common in the young generation. This generation is also particularly exposed to standard Indonesian and standard Malay. The exposure to Indonesian comes from the media (Indonesian TV) as well as from Bahasa Indonesia, which is offered in school as a second language choice. The influence of Malay comes less directly, from the status that the language enjoys for literary and religious reasons: in 2001 there were clearly normative voices to be heard advocating a more standard form of Malay to be spoken on the island. All these facts considered, it is quite likely that Cocos Malay may lose many of its particular traits in the near future. Cocos Malay communities can also be found in Western Australia (children have to move there to complete their education after grade 10) as well as in Sabah, Malaysia. Relocated from the original settlements in different waves as the islands can only sustain a limited population, these communities are described as having lower proficiency in Cocos Malay than that on Home Island. In particular, in Sabah it is reported that convergence towards standard Malay has taken place.
See also: Australia: Language Situation; Bilingualism; Code Switching and Mixing; Language Change and Language Contact; Malaysia: Language Situation.
Bibliography

Adelaar K A (1996). ‘Malay in the Cocos (Keeling) Islands.’ In Nothofer B (ed.) Reconstruction, classification, description. Festschrift in honour of Isodore Dyen. Hamburg: Abera Network Asia Pacific. 23–37.
Bunce P (1988). The Cocos (Keeling) Islands. Singapore: John Wiley & Sons Australia Ltd.
Gibson-Hill M A (1947). ‘Notes on the Cocos-Keeling Islands.’ Journal of the Malayan Branch of the Royal Asiatic Society XX(2), 140–202.
Hunt J G (1989). ‘The revenge of the Bantamese. Factors for change in the Cocos (Keeling) Islands.’ Master’s thesis, Australian National University.
Lapsley A D (1983). ‘Cocos Malay Syntax.’ Master’s thesis, Monash University.
Lim L & Ansaldo U (2003). ‘Sounds Cocos.’ In Solé M J, Recasens D & Romero J (eds.) Proceedings of the 15th International Conference of Phonetic Sciences. Barcelona: The 15th ICPhS Organizing Committee. 803–806.
Codes, Elaborated and Restricted (Bernstein)
A Capone, Barcellona, Italy
© 2006 Elsevier Ltd. All rights reserved.
Bernstein was among the first scholars to focus on the correlation between the scholastic success (or failure) of a learner and the social class he or she belonged to. A student living in a well-off family, with many cultural stimuli (books, newspapers, periodicals, films, etc.), is bound to develop a rich and fully articulated language (a so-called ‘elaborated code’), whereas a student who belongs to a working-class family and is exposed to poor linguistic and cultural stimuli develops a fragmented, poor, syntactically deficient language (called a ‘restricted code’ by Bernstein) (see Bernstein, 1971–1975). In contrast to the Chomskyan theory that language develops naturally in the brain through the interaction of biologically innate structures and the environment to which the child is exposed (a theory on which the data available to the child may be so poor and confused that it is easy to demonstrate that the innate learning program prevails over the environmental stimuli), Bernstein emphasizes the predominant role played by the environment in shaping the learning process. Of course, the correlation between social class and language is not, strictly speaking, without exceptions, because much
of the learning process depends on the lifestyle of the family under consideration. There are exceptional working-class families where parents, contrary to all expectations, have good knowledge of the language and place great importance on culture, but the norm is that, within working-class families, cultural stimuli are less predominant than in more well-off families. The problem, for sociologists, is how to offset the disadvantages of the pupils belonging to working-class families and how pedagogues (and teachers) can have an antideterministic effect on such children. A possible solution to the problem is to ensure that the school (or the class) becomes another miniature family and that the negative effects of the families are compensated for by the pedagogical action of the school. The school should, therefore, be a positive environment in which pupils are exposed to positive cultural and affective stimuli that help their personalities grow and come to maturity. In such a model of the school, teachers lose their primary function of being transmitters of notions (knowledge, in general) and are required to take the roles of educators or pedagogues who act as models and provisionally replace (at least within the boundaries of the school) the family by setting good examples for the students, and, in particular, exposing them to the positive aspects of culture, intended as knowledge that interacts with the
individual to make him or her grow up intellectually and emotionally. To compensate for the negative effects of families, within which dialogue and conversation have died, or are confined to adjacency pairs consisting of questions/answers or orders/replies, teachers have to play the role of communicators and have to stimulate communication. It is, in my view, impossible for a student to make progress in his or her language (to develop a more articulated written or oral mode of expression) unless he or she understands the function of communication, which is that of transmitting knowledge, but also of enhancing the expressive as well as the interpersonal function. To communicate is not only to express propositions
(concerning others), but also to express propositions concerning what we really are and feel, and, by so doing, to interact with others, creating an intersubjective dimension in which social life is possible (see Capone, 2003).
Bibliography
Bernstein B (1971–1975). Class, codes and control (3 vols). London: Routledge & Kegan Paul.
Bernstein B (1990). The structuring of pedagogical discourse. London/New York: Routledge.
Capone A (2003). Pragmemes. Messina: Minerva.
Code Switching
S Gross, East Tennessee State University, Johnson City, TN, USA
© 2006 Elsevier Ltd. All rights reserved.
In many bi- and multilingual communities around the world, speakers need to choose, often at an unconscious level, which language to use in their interactions with other members of the community. One of the choices that bilingual speakers often make is to code-switch: that is, speakers switch back and forth between languages (or varieties of the same language), sometimes within the same utterance (see Bilingualism; Code Switching and Mixing). The motivations for code switching have often been treated simply as lists of possible functions for code switching. For example, Appel and Muysken (1987) cite five such functions. First, code switching may serve a referential function by compensating for the speaker’s lack of knowledge in one language, perhaps on a certain subject. Second, it may serve a directive function by including or excluding the listener. Third, code switching may have an expressive function by identifying the speaker as someone having a mixed cultural identity. Fourth, it may have a phatic function indicating a change in tone in the conversation. And fifth, it may serve a metalinguistic function when code switching is used to comment on the languages involved. While such lists are useful places to start, and no one would deny that code switching can certainly serve these functions, these types of lists fail to answer the question of what motivates speakers to make the choices they do at a particular point in a conversation. This article reviews the major proposals that have been advanced regarding the following question:
why do speakers choose to engage in code switching in the first place?
Code Switching as a Research Topic
The current interest in code switching can be dated to a 1972 study of language use in Hemnesberget, a small village in northern Norway, conducted by Jan Blom and John Gumperz and described in a volume on sociolinguistics edited by Gumperz and Hymes (1972). In Hemnesberget, two varieties of Norwegian are used: Ranamål, a local dialect, and Bokmål, the standard variety (see Language and Dialect: Linguistic Varieties). However, speakers’ decisions regarding which variety to use are by no means arbitrary or haphazard. In general, Ranamål, the local variety, is used in local activities and relationships, reflecting shared identities with the local culture. In contrast, Bokmål is used in official settings such as school, church, and the media, communicating an individual’s dissociation from the local group, i.e., not stressing his or her local ties. Blom and Gumperz distinguish between two main functions of code switching: situational and metaphorical. In situational code switching, which seems to be similar to the notion of diglossia, the speaker’s choice of language is constrained by factors external to her/his own motivations, for example, the status of the interlocutor, the setting of the conversation, or the topic of conversation. So, in Hemnesberget, Blom and Gumperz observed that when an outsider joins a group of locals engaged in a conversation, the locals will often switch from the local variety, Ranamål, to the standard variety, Bokmål. In a later work, Gumperz (1982) introduces the distinction between ‘we’ and ‘they’ codes, which further
Meisel J M (1994). ‘Code-switching in young bilingual children: the acquisition of grammatical constraints.’ Studies in Second Language Acquisition 16, 413–439.
Myers-Scotton C (1993). Social motivations for codeswitching: evidence from Africa. Oxford: Oxford University Press.
Myers-Scotton C (2002). Contact linguistics: Bilingual encounters and grammatical outcomes. Oxford: Oxford University Press.
Nishimura M (1985). Intrasentential codeswitching in Japanese and English. Ph.D. diss., University of Pennsylvania.
Nishimura M (1991). Varieties of Japanese/English bilingual speech: Implications for theories of codeswitching and borrowing. Unpublished manuscript, Georgetown University.
O’Neill M (1998). ‘Support for the independent development hypothesis: evidence from a study of Russian-English bilinguals.’ BUCLD 22 Proceedings, 586–597.
Pfaff C (1979). ‘Constraints on language mixing.’ Language 55, 291–319.
Poplack S (1980). ‘Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of codeswitching.’ Linguistics 18, 581–616.
Poplack S, Sankoff D & Miller C (1988). ‘The social correlates and linguistic processes of lexical borrowing and assimilation.’ Linguistics 26, 47–104.
Prince E & Pintzuk S (1984). Bilingual code-switching and the open/closed class distinction. Unpublished paper, University of Pennsylvania.
Reyes R (1974). Studies in Chicano Spanish. Ph.D. thesis, Harvard University.
Romaine S (1995). Bilingualism. New York: Basil Blackwell, Inc.
Ronjat J (1913). Le développement du langage observé chez un enfant bilingue. Paris: Champion.
Sankoff D & Poplack S (1981). ‘A formal grammar for codeswitching.’ Papers in Linguistics 14, 3–46.
Santorini B & Mahootian S (1995). ‘Codeswitching and the syntactic status of adnominal adjectives.’ Lingua 96, 1–27.
Stenson N (1990). ‘Phrase structure congruence, government, and Irish-English code-switching.’ In Hendrick R (ed.) Syntax and semantics 23. New York: Academic Press, Inc. 167–197.
Timm L (1975). ‘Spanish-English code-switching: el porque y how-not-to.’ Romance Philology 28, 473–482.
Vihman M (1985). ‘Language differentiation by the bilingual infant.’ Journal of Child Language 12, 297–324.
Volterra V & Taeschner T (1978). ‘The acquisition and development of language by bilingual children.’ Journal of Child Language 5, 311–326.
Weinreich U (1953). Languages in contact. New York: Linguistic Circle of New York Publication No. 2.
Woolford E (1983). ‘Bilingual code-switching and syntactic theory.’ Linguistic Inquiry 14(3), 520–535.
Coelho, Francisco Adolpho (1847–1919)
R Cavaliere, Universidade Federal Fluminense, Rio de Janeiro, Brazil
© 2006 Elsevier Ltd. All rights reserved.
Francisco Adolpho Coelho is the most important 19th-century name in the field of Portuguese linguistic studies. He was born in Coimbra on January 15, 1847, and died in Carcavelos on February 9, 1919. Coelho distinguished himself as an extraordinary, largely self-taught student during his basic education, although he is known to have attended classes at the University of Coimbra between 1862 and 1864. In Lisbon, Coelho entered the university as a student of the Superior Course of Letters, in which he studied at least from 1865 to 1866. In 1878, he began his career as an assistant professor in this same course, teaching general Indo-European linguistics. Owing to Coelho’s comparative research on Romance languages, the thesis presented to the academic world by Friedrich Diez in his Grammatik der romanischen Sprachen was applied in modern Portuguese linguistic studies, which is why philology began to be studied by more qualified people in Portugal from that time on. His first published work, entitled A língua portuguesa: fonologia, etimologia, morfologia e sintaxe (1868), is thus considered the initial step of the historical-comparative school in Portugal. Some years later, Coelho published two other important works of descriptive grammar based on the comparative method: Theoria da conjugação em latim e portuguez. Estudo de gramática comparativa (1870) and Questões da língua portuguesa (the first part in 1874 and the second in 1889). His bibliography, nevertheless, is not restricted to the field of linguistics, since he also wrote
many well-regarded texts about literature and popular traditions. In this area, Coelho organized several editions of popular Portuguese short stories and texts for young readers. His name is linked as well to ethnographic and anthropological research, with texts written in the light of the scholarship of 19th-century ethnolinguists, such as Exposição Etnográfica Portuguesa (1896) and Os Ciganos de Portugal; com um estudo sobre o calão (1892). Inspired by Hugo Schuchardt’s ideas about dialects and creole languages, Coelho carried out an exhaustive investigation of Portuguese-based African and Asiatic creoles, in addition to publishing some studies on Brazilian Portuguese (which, in his conception, should be called the Brazilian dialect). This segment of his bibliography was severely criticized by some philologists of his time, such as José Leite de Vasconcelos, who reproached these papers for being the product of desk work alone, since Coelho made no effort to develop field research methods. Coelho must nevertheless be credited with renewing linguistic thought in his country, whose foundations were still, by the middle of the 19th century, immersed in the rational ideas of philosophical grammar. This revolutionary trend contributed decisively to a new order in the teaching of the vernacular language at elementary and middle levels. According to the Brazilian linguist Serafim da Silva Neto, Coelho “has been a philosopher-philologist, with the faithful purpose of ascending to general ideas.” Undoubtedly, his activity in the 19th-century linguistic scene opened the doors of Glottology and Ethnolinguistics to those who dedicated themselves to the study of the Portuguese language, and even to the study of the Romance languages in general, creating a great legion of disciples not only in Portugal but also in Brazil.
See also: Diez, Friedrich (1794–1876); Indo-European Languages; Portuguese; Romance Languages; Schuchardt, Hugo (1842–1927).
Bibliography
Cabral J P (1991). ‘A Antropologia em Portugal Hoje.’ In Difel (ed.) Os Contextos da Antropologia. Lisboa. 11–41.
Coelho F A (1868). A lingua portugueza: phonologia, etymologia, morphologia e syntaxe. Coimbra: Imprensa da Universidade.
Coelho F A (1870). Theoria da conjugação em latim e portuguez. Estudo de grammatica comparativa. Lisboa: Typ. Universal.
Coelho F A (1874). Questões da lingua portugueza. Porto: Ernesto Chardron; Braga: Eugenio Chardron.
Coelho F A (1875). Bibliographia critica de historia e litteratura. Porto: Imp. Litterario-Commercial.
Coelho F A (1879). Contos Populares Portugueses. Lisboa: F. Plantier.
Coelho F A (1881). A lingua portugueza: noções de glottologia geral e especial portugueza. Porto: Magalhães & Moniz.
Coelho F A (1881). Os dialectos romanicos ou neo-latinos na África, Ásia e América. Lisboa: Casa da Sociedade de Geografia.
Coelho F A (1884). ‘Les Ciganos. À propos de la communication de M. P. Bataillard “Les Gitanos d’Espagne et les Ciganos de Portugal.”’ Congrès International d’Anthropologie et d’Archéologie Pré-Historique. Compte-rendu de la Neuvième Session à Lisbonne. Lisboa: Academia das Ciências.
Coelho F A (1885). Tales of Old Lusitania, from folklore of Portugal. London: Swan Sonnenschein, trans. Henriqueta Monteiro.
Coelho F A (1885). ‘Tradições relativas às sereias e outros mitos similares.’ Archivio per lo Studio delle Tradizioni Popolari IV, 325–360.
Coelho F A (1890). Esboço de um programa para o estudo antropológico, patológico e demográfico do povo português. Lisboa: Tip. do Comércio de Portugal.
Coelho F A (1892). Os ciganos de Portugal; com um estudo sobre o calão. Lisboa: Imprensa Nacional.
Fernandes R (1973). As ideias pedagógicas de F. Adolfo Coelho. Lisboa: Instituto Gulbenkian de Ciência/Centro de Investigações Pedagógicas.
Gonçalves M J L (1947). ‘Contribuição para a Bibliografia de Adolfo Coelho.’ Biblos. v. XXIII, 801–834.
Silva A C (1994). F. Adolfo Coelho e a gramática portuguesa. Funchal: [s. n.].
Silva Neto S da (1957). Manual de filologia portuguesa. Rio de Janeiro: Livraria Acadêmica.
Vasconcelos J L de (1920). ‘Adolfo Coelho e a etnografia portuguesa.’ Lusa Jan-Mar, 98.
Cœurdoux, Gaston-Laurent (1691–1779)
K Karttunen, University of Helsinki, Helsinki, Finland
© 2006 Elsevier Ltd. All rights reserved.
The French missionary and pioneer of Indology Gaston-Laurent Cœurdoux was born in Bourges, France, in 1691 and died in Pondicherry in the colony of French India in 1779. He joined the Jesuit order in 1715, departed for South India in 1732, and served as a member of the Madurai Mission until his death. In 1744 he was named superior of the mission, a role he filled until 1751. Besides his missionary work, he showed keen interest both in science and in language studies. In addition to local Dravidian languages (Tamil and Telugu), he learned Sanskrit and was one of the first to note its resemblance to classical Greek and Latin (and German). In a well-known memoir, he demonstrated the relationship between Sanskrit and European languages well before Sir William Jones; he sent the text to Anquetil-Duperron in Paris for publication, but it remained in manuscript form and was printed only at the beginning of the 19th century (in Histoire et Mémoires de l’Académie des Inscriptions et Belles-Lettres 49, 1808: 647–667). Another work by Father Cœurdoux, a detailed account of South Indian customs, remained unpublished and was to a large extent copied by Abbé Dubois in the book he then published under his own name (published in French in 1825, in English translation in 1817, rev. ed. Hindu manners, customs and ceremonies 1897). The original version is preserved in an abridged version that was published in 1987 by Sylvia Murr, who also showed Dubois’s complete dependence on it. In addition, there are some letters by Cœurdoux published in Lettres édifiantes et curieuses, a manuscript dictionary ‘télougou–français–samskroutam [i.e. Sanskrit],’ and some further manuscript works.
See also: Missionary Linguistics.
Bibliography
Cœurdoux G-L (1987). Moeurs et coutumes des indiens (1777). Un inédit du Père G.-L. Coeurdoux S. J. dans la version de N.-J. Desvaulx. Texte établi et annoté par Sylvia Murr. Publications de l’École française d’Extrême-Orient 146, L’Inde philosophique entre Bossuet et Voltaire 1. Paris: École française d’Extrême-Orient.
Dehergne J (1961). ‘Cœurdoux, Gaston Laurent.’ In Dictionnaire de biographie française 9, 121.
Dubois A J A. Hindu manners, customs and ceremonies. Translated from the author’s later French manuscript and edited with notes, corrections and biography by Henry K. Beauchamp. 3rd rev. edition, repr. by Oxford University Press, Delhi, 1978.
Godfrey J J (1967). ‘Sir William Jones and Père C.: A philological footnote.’ Journal of the American Oriental Society 87, 57–59 (see also 89, 1969, 416f.).
Murr S (1987). L’indologie du père Cœurdoux: stratégies, apologétique et scientificité. Publications de l’École française d’Extrême-Orient 146, L’Inde philosophique entre Bossuet et Voltaire 2. Paris: École française d’Extrême-Orient.
Cognitive Anthropology
C Strauss, Pitzer College, Claremont, CA, USA
© 2006 Elsevier Ltd. All rights reserved.
Cognitive anthropology is the study of cultural knowledge and processes of cognition in a sociocultural context. While most cultural anthropologists study cultural knowledge, cognitive anthropologists do so distinctively by asking how specific people in a society think. Cognitive anthropology dates from the late 1950s, although there were many earlier precursors (e.g., the work of Evans-Pritchard).
In cognitive anthropology today, major schools can be distinguished by their answers to two questions. First, does the cultural part of cognition (D’Andrade, 1981) consist of mental representations or situated practices? Second, how significant are panhuman cognitive universals?
Cultural Cognition as Culturally Variable Mental Representations
An early, influential statement of this approach, cultural cognition as cross-culturally variable mental
representations, is Goodenough’s definition of a society’s culture as “whatever it is one has to know or believe in order to operate in a manner acceptable to its members” (Goodenough, 1957: 167). From Goodenough on, language has been central to work in this approach, as both evidence for and the content of cultural knowledge, and linguistics has been an ongoing source of theoretical inspiration. From the 1950s through the 1970s, a major focus of research was lexical semantics (‘ethnosemantics’). Typically, all of the terms in a domain (e.g., kin terms, kinds of talk) are elicited, then an underlying mental structure of defining or salient features is inferred from contrasting patterns of use or judgments of similarity. Influenced by phonologists’ decomposition of sounds into distinctive features, componential analysts similarly analyzed word meanings as sets of semantic features (e.g., BACHELOR = +MALE, −MARRIED). Systematic methods were developed for eliciting terms and their relations (e.g., frame and slot elicitation: What words can fill the blank in a sentence like “____ is a kind of sport”?; or pile sorts and triads tests, analyzed with multidimensional scaling; Weller and Romney, 1988). Work in this tradition sometimes focused on biological classifications and was called ‘ethnoscience’; however, that term covers later folk science studies using other methods as well. One current development of such work is the study of cultural knowledge as consensus in responses to standardized questions (Romney and Moore, 1998; Romney et al., 1986). Cultural consensus analysis can be used to determine patterns of disagreement as well as consensus, which can highlight the social distribution of knowledge (see, e.g., Boster’s 1985 study of the effect of kinship and residence on Aguaruna women’s knowledge of manioc varieties).
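The componential notation described above (word meanings as bundles of binary semantic features) can be illustrated with a minimal Python sketch. Only the BACHELOR example with its two features comes from the text; the other lexical entries and the names `LEXICON`, `format_features`, and `contrasting_features` are hypothetical, chosen purely for illustration, and do not reproduce any published analysis.

```python
# Toy componential analysis: each term is a bundle of binary semantic features.
# Only "bachelor" (+MALE, -MARRIED) comes from the text; the other entries
# are assumptions added for illustration.
LEXICON = {
    "bachelor": {"MALE": True, "MARRIED": False},
    "husband": {"MALE": True, "MARRIED": True},
    "wife": {"MALE": False, "MARRIED": True},
}


def format_features(term):
    """Render a term's features in the +FEATURE / -FEATURE notation."""
    features = LEXICON[term]
    return ", ".join(
        ("+" if value else "-") + name
        for name, value in sorted(features.items())
    )


def contrasting_features(term_a, term_b):
    """List the features on which two terms take opposite values."""
    a, b = LEXICON[term_a], LEXICON[term_b]
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])


print(format_features("bachelor"))
print(contrasting_features("bachelor", "husband"))
```

In this toy lexicon, two terms that differ on a single feature form a minimal contrast, which is the kind of patterning componential analysts inferred from elicited usage and similarity judgments.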
While these methods have been extremely productive, a number of concerns arose about the older, lexically focused methods (these objections do not all apply to cultural consensus modeling). It was not always clear that the analyst’s structures corresponded to the mental representations of members of the society in question; in some domains, a study of key terms reveals only problematic cases, not the taken-for-granted normal scenario (Holland and Skinner, 1987); in many cases, lexical semantics omits crucial elements of cultural knowledge (e.g., eliciting meanings of disease terms in English will not reveal a germ theory of disease, D’Andrade, 1995); and distinctive features are not random assortments of attributes, but co-occur in patterned wholes, as discussed by Rosch and others (see summary in D’Andrade, 1995). Schema theory was adopted
from the work of linguists such as Fillmore and Langacker and psychologists (e.g., Schank and Abelson) to address these issues. Schemas (‘frames’) are mental representations of an underlying, typical pattern of features or connected set of ideas understood holistically. For example, my MINIVAN schema includes not only the features that distinguish it from other vehicles, but also a mental image and ideas about what sort of person is likely to drive one. Shared schemas are cultural models (Holland and Quinn, 1987). A speaker’s reference to one part of a cultural model will activate the whole schema, as well as related schemas, in a listener’s mind; hence, much can be left unsaid when culture is assumed to be shared. Cultural model researchers typically conduct in-depth, semistructured interviews, inferring the implicit understandings that are assumed in what is said (Holland and Quinn, 1987; Quinn, in press). While the focus of work in this approach has been on contents of cultural knowledge, some researchers have addressed cognitive processes and the way knowledge is used: e.g., in memory and personal narratives (e.g., Garro, 2000; see also memory research in studies of linguistic relativity), metaphor choice (Quinn, 1991), the pragmatics of opinion expression (Strauss, 2004), reasoning (Hutchins, 1980; Quinn, 1996), and motivation (D’Andrade and Strauss, 1992). Connectionist models have been proposed to address how cultural knowledge, including nonpropositional knowledge, is learned, changes, and is sensitive to the context of its use (Strauss and Quinn, 1997).
Cultural Cognition as Situated, Distributed Practices
Cognitive processes, rather than cultural knowledge, dominate the work of cognitive anthropologists who study distributed cognition or ‘cognition in practice.’ They argue that studies in the first school neglect emergent cognitive effects as people interact with each other and the objects and structures in their environment. Thus, the primary method employed here is observation of ongoing activities (e.g., arithmetic while shopping, Lave, 1988; or navigation on board a ship, Hutchins, 1995), perhaps asking the person being observed to think out loud while they are in the midst of their activity. Drawing on Marxian psychology (e.g., Vygotsky, and Leontiev’s Activity Theory), some argue that a key way cognition is ‘cultural’ is that it depends on tools or ‘mediating devices’: concrete objects, symbols, and activities that are socially produced, and which evoke and
direct thought (Holland and Valsiner, 1988). One example is Goody’s argument that the invention of writing, especially lists and tables, created the possibility for new structures of thought (but see Scribner and Cole, 1981 on the cognitive effects of the Vai script in Liberia). The means and social relations of communication are also relevant ‘structuring resources’ (Lave, 1988). Hutchins (1995), for example, considered how the bandwidth of communications media, the social ranking of who talks to whom, and the timing of their talk affect the propagation of cognitive representations among the members of a navigation team on a large ship (see Nardi, 1996 for a useful comparison of different cognition-in-practice approaches).
Universals in Human Cognition
Other cognitive anthropologists agree with the first school in its focus on mental representations but disagree with its emphasis on cross-cultural variability. Thus, for example, Berlin and Kay (1969; Kay et al., 1997) analyzed basic color terms (ones that are monolexemic like ‘red,’ not subtypes of another color, etc.) first in 20, then in 110 languages. They found a restricted set of basic color terms with cross-culturally stable focal referents, and comparable composite color categories across all languages. In addition, there is a fairly predictable relation between the number of basic color terms in a language and which colors are named. Atran (1990) and Berlin (1992) observed cross-cultural and historical similarities in classification of plants and animals. Lately, empirical studies have been joined by theories prioritizing cognitive universals. These draw on Chomsky and Fodor’s claims that human brains process much information with specialized modules, and on those of evolutionary psychologists that human brains are the result of evolutionary adaptation during the Pleistocene and thus include innate knowledge and processing rules that are useful for social cooperation and categorization, mate selection, foraging, and predator avoidance (Cosmides, 1989; Hirschfeld and Gelman, 1994; but see criticisms by Donald, Karmiloff-Smith and others). The research on cross-cultural cognitive universals challenges extreme claims of linguistic relativity.
Directions for the Future
Theoretical bridges have been proposed between cultural models and practice theory; thus, Holland and
Valsiner (1988) argued that learned schemas determine the purposes to which mediating devices are put, Keller and Keller (1993) studied feedback between external events and internal representations over the course of an extended activity, and Quinn (1996) argued that cultural models are internal mediating devices. Similarly, cognitive anthropologists in the first and last schools would agree that there are both universals and variation in cultural knowledge (see, e.g., D’Andrade, 1995; Romney and Moore, 1998; Sperber, 1985; Strauss and Quinn, 1997); the only question is of what sorts. An underexplored issue is differences in ways of knowing (Sperber, 1985; Strauss and Quinn, 1997: Chap. 8). Finally, an interest that cuts across all three approaches is what Sperber (1996) called the ‘‘epidemiology of representations’’: how ideas spread among people in groups, how they are transformed, and how they are used as they move between internal and external forms (Hutchins, 1995; Keller and Keller, 1993; Strauss and Quinn, 1997).
See also: Activity Theory; Ethnoscience; Fillmore, Charles J. (b. 1929); Frame Semantics; Human Language Processing: Connectionist Models; Langacker, Ronald (b. 1942); Leont’iev, Aleksei Alekseevich (1936–2004); Modularity of Mind and Language; Sapir, Edward (1884–1939); Schank, Roger C. (b. 1946); Semantic Primitives; Vygotskii, Lev Semenovich (1896–1934); Whorf, Benjamin Lee (1897–1941); Wierzbicka, Anna (b. 1938).
Bibliography
Atran S (1990). Cognitive foundations of natural history: towards an anthropology of science. Cambridge: Cambridge University Press.
Berlin B (1992). Ethnobiological classification: principles of categorization of plants and animals in traditional societies. Princeton, NJ: Princeton University Press.
Berlin B & Kay P (1969). Basic color terms: their universality and evolution. Berkeley: University of California Press.
Boster J (1985). ‘“Requiem for the omniscient informant”: there’s life in the old girl yet.’ In Dougherty J (ed.) Directions in cognitive anthropology. Urbana: University of Illinois Press. 177–197.
Cosmides L (1989). ‘The logic of social exchange: has natural selection shaped how humans reason? Studies with the Wason selection task.’ Cognition 31, 187–276.
D’Andrade R (1981). ‘The cultural part of cognition.’ Cognitive Science 5, 179–195.
D’Andrade R (1995). The development of cognitive anthropology. Cambridge: Cambridge University Press.
D’Andrade R & Strauss C (1992). Human motives and cultural models. Cambridge: Cambridge University Press.
Garro L (2000). ‘Cultural knowledge as resource in illness narratives: remembering through accounts of illness.’ In Mattingly C & Garro L (eds.) Narrative and the cultural construction of illness and healing. Berkeley: University of California Press. 70–87.
Goodenough W (1957). ‘Cultural anthropology and linguistics.’ In Garvin P (ed.) Report of the seventh annual round table meeting in linguistics and language study, monograph series on language and linguistics, no. 9. Washington, DC: Georgetown University. 167–173.
Hirschfeld L & Gelman S (eds.) (1994). Mapping the mind: domain specificity in cognition and culture. Cambridge: Cambridge University Press.
Holland D & Quinn N (eds.) (1987). Cultural models in language and thought. Cambridge: Cambridge University Press.
Holland D & Skinner D (1987). ‘Prestige and intimacy: the cultural models behind Americans’ talk about gender types.’ In Holland & Quinn (eds.). 78–111.
Holland D & Valsiner J (1988). ‘Cognition, symbols, and Vygotsky’s developmental psychology.’ Ethos 16, 247–272.
Hutchins E (1980). Culture and inference: a Trobriand case study. Cambridge, MA: Harvard University Press.
Hutchins E (1995). Cognition in the wild. Cambridge, MA: MIT Press.
Kay P, Berlin B, Maffi L & Merrifield W (1997). ‘Color naming across languages.’ In Hardin C & Maffi L (eds.) Color categories in thought and language. Cambridge: Cambridge University Press. 21–56.
Keller C & Keller J D (1993). ‘Thinking and acting with iron.’ In Chaiklin S & Lave J (eds.) Understanding practice: perspectives on activity and context. Cambridge: Cambridge University Press. 125–143.
Lave J (1988). Cognition in practice: mind, mathematics and culture in everyday life. Cambridge: Cambridge University Press.
Nardi B (1996). ‘Studying context: a comparison of activity theory, situated action models, and distributed cognition.’ In Nardi B (ed.) Context and consciousness: activity theory and human-computer interaction. Cambridge, MA: MIT Press. 69–102.
Quinn N (1991). ‘The cultural basis of metaphor.’ In Fernandez J (ed.) Beyond metaphor: the theory of tropes in anthropology. Stanford: Stanford University Press. 56–93.
Quinn N (1996). ‘Culture and contradiction: the case of Americans reasoning about marriage.’ Ethos 24, 391–425.
Quinn N (in press). Finding culture in talk: a collection of methods. New York: Palgrave.
Romney A K & Moore C (1998). ‘Toward a theory of culture as shared cognitive structures.’ Ethos 26, 314–337.
Romney A K, Weller S & Batchelder W (1986). ‘Culture as consensus: a theory of culture and informant accuracy.’ American Anthropologist 88, 313–338.
Scribner S & Cole M (1981). The psychology of literacy. Cambridge, MA: Harvard University Press.
Sperber D (1985). ‘Apparently irrational beliefs.’ In Sperber D (ed.) On anthropological knowledge. Cambridge: Cambridge University Press. 35–63.
Sperber D (1996). ‘Anthropology and psychology: towards an epidemiology of representations.’ In Sperber D (ed.) Explaining culture: a naturalistic approach. Oxford: Blackwell. 56–76.
Strauss C (2004). ‘Cultural standing in expression of opinion.’ Language in Society 33, 161–194.
Strauss C & Quinn N (1997). A cognitive theory of cultural meaning. Cambridge: Cambridge University Press.
Weller S & Romney A K (1988). Systematic data collection. Newbury Park, CA: Sage.
Cognitive Basis for Language Evolution in Nonhuman Primates
R Tincoff and M D Hauser, Harvard University, Cambridge, MA, USA
© 2006 Elsevier Ltd. All rights reserved.
A defining feature of the human language faculty is its limitless capacity for expressing ideas. We can produce an infinite number of new sentences, and we can compose sentences of infinite length. This ability reveals a powerful computational mechanism, a form of recursion often termed discrete infinity. This recursive mechanism entails a procedure or rule that calls itself, specifically, combining the discrete store of words or sentence constituents (e.g., Noun, Verb) into hierarchical phrase structures that can be further embedded within other hierarchically arranged phrase structures. Computational mechanisms are also involved in generating phonological and semantic representations, as well as in the interfaces among syntax, phonology, and semantics. Here we focus on the language faculty and its evolution in terms of these internal computational mechanisms instantiated in the mind/brain of each language user. This focus is conceptually separate from concerns about the actual production and social use of language and leads to a particular set of questions about language evolution. The significance of recursion for generating linguistic representations and mapping between interfaces leads to the proposal that such a device must be universal to all human languages and innately endowed in human infants (Lenneberg, 1967; Chomsky, 1988). These features, universal and innate, raise an important question: how did our language faculty evolve? Specifically, do phylogenetic analyses reveal precursors to the language faculty in other animals? This specific question is distinct from, but complementary to, other approaches to studying language evolution that consider historical or cultural factors or employ mathematical models of potential evolutionary scenarios (for a variety of approaches see chapters in Christiansen and Kirby, 2003). 
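The idea of a rule that calls itself, embedding one clause inside another, can be made concrete with a minimal Python sketch. It is an illustration only and not a model proposed in the literature cited here: the function `embed`, the list `SPEAKERS`, and the sentence frame are hypothetical choices made for the example.

```python
# A toy recursive rule: S -> NP "said that" S | base clause.
# All names and the sentence frame are assumptions for illustration.
SPEAKERS = ["Mary", "John", "Sue"]


def embed(depth, base="it rained"):
    """Build a sentence with `depth` levels of clausal embedding."""
    if depth == 0:
        # Base case: a simple clause with no embedding.
        return base
    speaker = SPEAKERS[(depth - 1) % len(SPEAKERS)]
    # Recursive case: the rule calls itself to fill the embedded clause.
    return f"{speaker} said that {embed(depth - 1, base)}"


for depth in range(3):
    print(embed(depth))
```

Because the rule can reapply to its own output, a finite set of words and one rule yields a distinct well-formed sentence for every depth, with no upper bound on length other than practical limits on memory.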
In addition, this question reflects a starting assumption that, in the absence of hearing deficits, spoken language is the default output form of the language faculty. Comparative studies complement this starting assumption as our knowledge of other species’ communication systems is largely based on ones using the vocal-auditory channel (Hauser and Konishi, 1999). Additional methods will be necessary to explore the language faculty as expressed by signed languages, and to explore questions about why human language has flexibility in its expressive form, an additional
defining feature that is lacking in nonhuman communication systems. Below, we describe why data from nonhuman animals, particularly nonhuman primates (hereafter ‘primates’), are critical for explaining the patterns and processes of language evolution. The general conclusion we draw is that once the language faculty is separated into its component systems, a wealth of opportunities for comparative analysis opens up and yields interesting insights into the origins of human language (Hauser et al., 2002).
Applying the Comparative Method of Evolutionary Biology to Language Evolution
Evolutionary biology provides a method for determining the phylogenetic origins of traits and their functions (for a detailed survey see Ridley, 2004). This method selects species that share a trait and then examines their evolutionary relationship. Homologies are traits with similar structure or function that have descended from a common ancestor; for example mammary glands used for lactation are shared by humans and rats, and by our common mammalian ancestor. Homologous traits are therefore derived from a common ancestor, tending to arise most frequently among closely related species. Homoplasies are traits shared by a pair of species, but not present in their common ancestor; classic examples include the eye, which has evolved independently in many different lineages, and the wing, which has evolved independently in birds and mammals. Homoplasies provide the signature of convergent evolution, and reveal the powerful role of selection in creating adaptive solutions to common social and ecological problems. Further comparisons of species will also reveal phylogenetically novel traits that appear in one species, or a cluster of species within a genus, and exhibit a particular structure or function. The comparative method allows researchers to lay out the terrain of shared and unique traits among selected species and then draw inferences about how those traits evolved (for applications to cognition and language see Gallistel, 1990; Shettleworth, 1998; Hauser, 2000; Hauser, 2003; Fitch, in press).
The evolution of language presents a particular set of problems to the comparative approach. First, there are no other living human species and thus we cannot follow the usual form of the comparative method, comparing living, closely related species sharing similar behaviors. Second, the record of early hominids is limited to fossil bits of vocal tract that provide few
insights into the computational machinery underlying what our hominid ancestors produced, while ancient written records essentially represent language in its modern form. Finally, no single nonhuman species is equipped with a sufficiently inclusive set of abilities to constitute a precursor to our full-blown language faculty, and some traits appear to be absent altogether (Hockett, 1960; Hauser, 1996; Hauser, 2000). How do we study language evolution when the human species is a poor model for the comparative method and other species do not offer an equivalent repertoire?
Distinguishing the Shared and Unique Components of the Language Faculty
In exploring language evolution from a comparative perspective, there are three possible outcomes for any given trait:
1. Trait X is not uniquely human, is a homology evolved for an earlier function, directly inherited, and therefore did not evolve for language;
2. Trait X is not uniquely human, but is a homoplasy that evolved completely unrelated to language or evolved for other purposes and was redesigned for language;
3. Trait X is uniquely human, is a phylogenetically novel trait that evolved only in hominid species, possibly for language or possibly for another cognitive capacity.
Hauser et al. (2002) lay out a comparative research program for explaining the evolution of the language faculty, separating it into a broad and a narrow sense. The faculty of language in the broad sense (FLB) encompasses the sensory-motor (SM) system responsible for perceiving and producing the sound patterns of spoken language, the conceptual-intentional (CI) system involving conceptual representations and the capacity to refer, and the faculty of language in the narrow sense (FLN). FLN is the recursive system responsible for the computations involved in narrow syntax that generate internal representations and it maps them into the systems of phonology (SM) and semantics (CI); importantly, then, FLN entails recursion and its interfaces with phonology and semantics. The strict definition of FLN and its separation from FLB, along with the available comparative evidence, motivated the proposal that “most if not all of FLB is based on mechanisms shared with nonhuman animals . . . [but] . . . FLN – the computational mechanism of recursion – is recently evolved and unique to our species” (Hauser et al., 2002: 1573). This proposal represents “a tentative, testable hypothesis in need of further empirical investigation” (Hauser et al., 2002: 1578).
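As a compact restatement of the three outcomes listed earlier in this section, the comparative logic can be sketched as a small decision function. This is an illustrative simplification rather than a procedure from the article: the names `Origin` and `classify_trait` and the two yes/no inputs are assumptions, and real phylogenetic inference relies on far richer evidence than two Boolean judgments.

```python
from enum import Enum


class Origin(Enum):
    HOMOLOGY = "homology"
    HOMOPLASY = "homoplasy"
    NOVEL = "phylogenetically novel"


def classify_trait(shared_with_other_species: bool,
                   present_in_common_ancestor: bool) -> Origin:
    """Map two comparative observations onto the three outcomes.

    - not shared with any other species        -> phylogenetically novel
    - shared and inherited from the ancestor   -> homology
    - shared but absent in the common ancestor -> homoplasy (convergence)
    """
    if not shared_with_other_species:
        return Origin.NOVEL
    if present_in_common_ancestor:
        return Origin.HOMOLOGY
    return Origin.HOMOPLASY


print(classify_trait(True, True).value)
print(classify_trait(True, False).value)
print(classify_trait(False, False).value)
```

The ordering of the checks mirrors the text: whether a trait is shared at all is settled first, and only then does the question of common ancestry separate homology from homoplasy.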
This empirical program targets comparative studies aimed at distinguishing homologies, homoplasies, and phylogenetically novel mechanisms among the components of the language faculty. The data sets include nonhuman animals’ natural communication, their noncommunicative problem-solving abilities, and their perceptual sensitivities to speech stimuli and the computational abilities that underlie language organization. The methodologies include field studies of natural behavior, laboratory studies building on those field studies, and, in the case of speech stimuli, combining the two such that researchers employ spontaneous behavioral measures borrowed from the field with carefully controlled laboratory methods. The goal of using speech stimuli, by definition artificial for primate subjects, is to assess whether they have the perceptual and computational abilities to extract the relevant dimensions such as phonetic categories, prosodic rhythms, word-like units, and grammatical structures. By employing the comparative approach, researchers can triangulate on the critical traits that contribute to the uniquely human form of language.
Evidence of Homologous Mechanisms
The sensory-motor system includes abilities used in vocal production, imitation, and auditory perception. Here we highlight studies of auditory perception as they provide strong evidence for homologous mechanisms shared by humans and primates (Trout, 2001; Hauser et al., 2002). Vocal production and imitation reveal a more complicated pattern of similarities and differences across species (Rizzolatti and Arbib, 1998; Fitch, 2000). A classic set of findings reveals that categorical perception, required for detecting the articulatory-acoustic cues to phonetic boundaries, is shared between humans and primates, along with other mammalian and nonmammalian species such as chinchilla and quail (for reviews see Harnad, 1987; Hauser, 1996). These experiments entail carefully controlled laboratory methods with trained responses (e.g., shock avoidance, button pressing) and natural or synthetic speech stimuli to test the animals’ discrimination of cues signaling the boundaries between phonetic categories. The general conclusion is that the animals’ discrimination and labeling functions follow those detailed in studies of human adults and infants. For example, rhesus monkeys tested on a voice onset time contrast showed, similarly to humans, that they were most sensitive to differences close to the human phonetic boundaries and less sensitive to differences within a phonetic category. Laboratory and field studies of categorical perception using species-specific vocalizations, as opposed to speech, confirm
the general nature of this perceptual ability. These findings lead to the conclusion that categorical perception is shared among a broad range of species and represents a general as opposed to a speech-specific perceptual capacity. Comparative analysis thus reveals that categorical perception is not unique to humans, nor did it evolve for language. More recent research has extended comparative study of the language faculty in two significant directions: the implementation of nontraining methods and the presentation of speech stimuli other than phonetic categories. In a series of studies comparing human neonates and adult cotton-top tamarins, results suggest that these species share a perceptual mechanism for discriminating the rhythmic classes of spoken languages (for recent results and detailed references see Tincoff et al., 2005). Early descriptions of language rhythms separated them into three major classes reflecting linguistic metrical timing units: stress patterns, syllable length, and a subsyllabic unit length termed the mora. More recently, a quantitative analysis has been conducted of the acoustic characteristics of consonant and vowel units in several spoken languages. This analysis largely confirmed the early descriptive categories and revealed rhythmic clusters such as English and Dutch (stress timed), Spanish and Italian (syllable timed), and Japanese (mora timed). Perceptual experiments with human adults and newborns further confirmed this rhythmic organization. Most recently, human newborns and cotton-top tamarin species were tested in a habituation–discrimination paradigm using the same set of human speech utterances drawn from languages with different rhythmic structures (e.g., stress-timed Dutch vs. mora-timed Japanese). The discrimination response of human newborns was assessed with a high-amplitude sucking measure that capitalized on newborns’ natural sucking reflex. 
The discrimination response of adult tamarin monkeys was assessed by a head orientation measure that capitalized on tamarins’ spontaneous tendency to orient to a sound source. Results from both species showed that following habituation to one language (e.g., Dutch), individuals responded more to new sentences from a different rhythmic-class language (e.g., Japanese) than they did to new sentences from the habituation language. Both subject groups, however, failed to discriminate within a rhythmic class (e.g., Dutch vs. English or Spanish vs. Italian) and when the utterances were played backwards, a manipulation that arguably disrupts the rhythmic cues. Figure 1 presents the results of the tamarin subjects across these test conditions. The use of these natural spontaneous responses, compared to arbitrary trained responses, lends stronger support to the conclusion that the shared discrimination ability is based on a homologous perceptual mechanism.
Figure 1 Response to test utterances from New language compared to test utterances from Same, habituated language. Stippled bars indicate responses to different rhythmic class comparisons, Dutch vs. Japanese and Polish vs. Japanese. Solid bars indicate responses to same rhythmic class comparison, Dutch vs. American English. The top panel presents responses to utterances played forward; the bottom panel presents responses to utterances played backwards. Adult tamarins responded significantly more to New language utterances than to Same language utterances when the utterances were played forward and were from different rhythmic classes (*p < 0.05). Tamarins failed to discriminate New language utterances when they were played backwards, and when they were played forward but came from the same rhythmic class. (a: percentage of subjects responding; b: mean percentage of responses across subjects.)
Another perceptual domain that extends our knowledge of the sensory-motor system is cross-modal integration. Natural spoken language in face-to-face interactions is both an auditory and a visual/articulatory signal. Much research has documented that humans integrate the acoustic and articulatory sources of spoken syllables into a single percept. Research testing human infants’ perception of multimodal stimuli such as sound–object matches, face–voice matches, and auditory-visual vowels shows that cross-modal integration is present from a young age (for reviews see chapters in Lewkowicz and Lickliter, 1994). Experiments with nonhuman subjects, especially primates, are critical for explaining the origins of this fundamental perceptual mechanism and identifying possible homologues or precursors. Recently, a laboratory experiment revealed that rhesus monkeys can spontaneously match the auditory signal of two of their natural vocalizations with the video of the appropriate articulatory gesture: coos, produced by a rhesus monkey caller with a small mouth opening and protruding
lips, and aggressive threats, produced with a larger mouth opening and no lip protrusion (Ghazanfar and Logothetis, 2003). This finding provides the first evidence that the perceptual mechanism for cross-modal matching of acoustic-articulatory cues may be homologous between humans and primates and inherited from a common primate ancestor. It remains an open question whether this mechanism can be used by animals to detect cross-modal matches and mismatches with speech and human faces, and similarly, whether humans could detect correspondences between rhesus faces and voices.
The above findings provide support for the conclusion that at least three perceptual mechanisms utilized by the sensory-motor system of the language faculty evolved before humans, and are homologous with mechanisms present in nonhuman primates and, in the case of categorical perception, other vertebrates. Importantly, these mechanisms evolved to solve more general perceptual problems rather than those that humans specifically encounter in processing speech. These findings are consistent with the first part of the language evolution hypothesis in Hauser et al. (2002), emphasizing commonalities within FLB.
Evidence of Phylogenetically Unique Mechanisms
The conceptual-intentional system includes conceptual representations (e.g., number, animacy, color, spatial referents), the capacity to attribute mental states (e.g., beliefs, desires, intentions), and the ability to form and extend a lexicon of words or word-like signals. A long history of comparative research has led to the general conclusion that nonhuman animals, and perhaps especially chimpanzees, have a rich conceptual system, including elements of a theory of mind (see chapters in Heyes and Huber, 2000; Bekoff et al., 2002; Tomasello et al., 2003). These animals lack, however, the full-blown systems observed in humans, and certainly lack the capacity to express in sounds or gestures what they are thinking about (Hauser et al., 2002). Here we focus further on the lexical aspects of the conceptual-intentional system, as they reveal striking differences. The classic studies of vervet monkeys show that they produce acoustically distinct behavioral and vocal responses to different predator classes (leopards, eagles, and snakes) (reviewed in Hauser, 1996; Cheney and Seyfarth, in press). In particular, field playback experiments demonstrated that vervets could infer the context from the call alone. These findings motivated the conclusion that animals might produce signals that function like symbols, labeling particular external referents or designating instructions for action. A strong claim building on this
conclusion proposes that signals such as vervet alarm calls are evidence of precursors to human words. A more cautious claim allows that these signals carry significant information that can be decoded by the listener in order to mediate adaptive responses in the context of food, predation, and social relationships, but that information is not intentionally encoded by the sender to accommodate the listener’s beliefs or desires (Cheney and Seyfarth, in press). We, along with most in the field, consider this latter interpretation to be correct. Additionally, nonhuman vocalizations differ from human words in at least three important ways: meaning, as reflected by acoustically distinct calls produced in particular situations, is tied to a narrow range of topics, survival and mating; the calls are never produced in the absence of the referent; and the lexicon is small and finite with no apparent capacity to generate novel sounds with new meaning (Hauser et al., 2002). The implication for the evolution of the language faculty is that our lexical capacity likely stands as a phylogenetically novel trait; if correct, then models and simulations of language evolution that begin with the capacity for reference have skipped over a fundamental evolutionary change. We therefore need a better understanding of when and how our species alone acquired the remarkable capacity to learn words as young children, to generate new words on the fly, and to use our words to refer to absent, or even nonexistent and imagined, entities or events.
FLN – and especially the mechanism of recursion – was defined by Hauser et al. (2002) as a computational process that is responsible for the generative and hierarchical properties of narrow syntax. In discussing this system, they leave open the possibility that although the recursive machinery is essential to language, it is also deployed, at least in some form, in other domains.
On this view, therefore, a much broader array of investigations is likely to be useful to our understanding of language evolution. For example, comparative studies have explored the generative or hierarchical mechanisms for number and serial-order learning (Biro and Matsuzawa, 2001; Terrace et al., 2003), tool use (Matsuzawa, 2001), foraging and navigation (Shettleworth, 1998), and human grammar (Fitch and Hauser, 2004; Newport et al., 2004). We focus here on studies testing primates’ processing of grammar as they provide the most direct test of the FLN system used for narrow syntax. These laboratory studies build on the perceptual studies reviewed above using a familiarization–discrimination test paradigm with natural or synthesized speech stimuli and the same spontaneous orienting measure. The findings show that cotton-top tamarin monkeys share with humans the basic computational mechanisms for calculating transitional probabilities (a cue to word boundaries) and extracting algebraic rules (AAB vs. ABB), as well as other finite-state grammars ((AB)ⁿ patterns). Unlike humans, however, tamarins fail to extract phrase structure grammars (AⁿBⁿ patterns) (Fitch and Hauser, 2004). In this experiment, tamarins were first familiarized to sets of syllables following an AⁿBⁿ grammar. They were then tested with novel items, half violating the grammar, half consistent with it. Tamarins responded equally to the test items, failing to discriminate violations from grammar-consistent items (see Figure 2). This failure cannot be explained by task demands (identical to those used in the finite-state grammar experiment), working memory, capacity to extract the relevant units for the computation (i.e., the syllables were previously used in experiments testing finite-state grammar and algebraic rules), or number. These results suggest that the mechanism for generating hierarchically embedded structures is a phylogenetically novel trait unique to humans.
Figure 2 Mean response across subjects to test patterns that violated familiarized grammar compared to test patterns consistent with grammar. The top panel shows that tamarins familiarized to the finite state grammar (FSG) responded significantly more to violations than to grammar consistent patterns, but tamarins familiarized to the phrase structure grammar (PSG) did not discriminate violations from consistent patterns.
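The formal difference between the two pattern types can be illustrated with a short sketch. It is offered only as an illustration of the two string languages, (AB)^n and A^nB^n, and not as a model of the experimental procedure or of the animals' processing. The letters "A" and "B" stand in for the two syllable classes, and the function names `matches_fsg` and `matches_psg` are assumptions.

```python
import re


def matches_fsg(seq: str) -> bool:
    """(AB)^n with n >= 1: a regular pattern.

    A regular expression suffices because only local A-to-B
    alternation has to be checked; nothing needs to be counted.
    """
    return re.fullmatch(r"(AB)+", seq) is not None


def matches_psg(seq: str) -> bool:
    """A^n B^n with n >= 1: requires matching each A with a B.

    The check is recursive: strip one A from the left and one B
    from the right, then test the embedded remainder.
    """
    if seq == "AB":
        return True
    return (
        len(seq) > 2
        and seq[0] == "A"
        and seq[-1] == "B"
        and matches_psg(seq[1:-1])
    )


for item in ["ABAB", "AABB", "AAABBB", "AABBB"]:
    print(item, matches_fsg(item), matches_psg(item))
```

The second recognizer has to relate elements across an embedded middle, which is why no finite-state device can recognize A^nB^n for unbounded n, whereas (AB)^n needs only adjacent transitions.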
Conclusions and Open Questions
Although comparative studies related to the language faculty have a long history, most of the core issues associated with understanding the evolution of the language faculty have only recently been approached. One useful approach may be the theoretical framework articulated by Hauser et al. (2002) in which the language faculty is carved at several relevant joints, including at the simplest level, the broad and narrow sense with the associated component systems and their interfaces. Comparative studies with nonhuman primates and other animals offer initial support that the sensory-motor component of FLB is homologous. In contrast, the available data suggest that the conceptual-intentional system of FLB is characterized by a pattern of homologous, homoplasic, and phylogenetically unique traits. Finally, no strong evidence has been gathered to argue for a nonhuman homologue or homoplasy for FLN; FLN, as defined for narrow syntax, is a strong candidate for a phylogenetically unique human trait within the language faculty.
Despite these positive developments, the conclusions presently rest on a sparse foundation due to the paucity of species tested and the range of unrelated methods implemented. Furthermore, important caveats must be raised for the identified behavioral homologies. We know relatively little about the neurophysiological substrates for any of the behavioral phenomena described, or the genetic mechanisms that form these processes. Further probing could reveal that the similar behavioral responses are generated by phylogenetically different underlying systems that yield convergent behavior. Alternatively, further probing could reveal homologies throughout the relevant neural, hormonal, or genetic mechanisms. An additional question is whether the target mechanisms serve an adaptive function related to language or if they are byproducts of other adaptations that were co-opted by the language faculty. Studies using neurophysiological preparations with natural vocalizations and human speech, together with studies involving molecular techniques, provide important avenues for the future (for a perspective on the recent mapping of the FOXP2 complex see Marcus and Fisher, 2003). The study of language evolution is an exciting and growing area of linguistics and language behavior. Data from nonhuman animals will play a pivotal role in constraining theories and designing experiments.
See also: Alarm Calls; Animal Communication: Overview; Animal Communication: Vocal Learning; Apes: Gesture Communication; Categorical Perception in Animals; Communication in Grey Parrots; Communication in Marine Mammals; Development of Communication in Animals; Evolution of Phonetics and Phonology; Evolution of Pragmatics; Evolution of Semantics; Evolution of Syntax; Individual Recognition in Animal Species; Non-human Primate Communication; Production of Vocalizations in Mammals; Traditions in Animals.
Bibliography
Bekoff M, Allen C & Burghardt G (eds.) (2002). The cognitive animal. Cambridge, MA: MIT Press.
Biro D & Matsuzawa T (2001). ‘Chimpanzee numerical competence: cardinal and ordinal skills.’ In Matsuzawa (ed.). 199–225.
Cheney D L & Seyfarth R M (in press). ‘Constraints and preadaptations in the earliest stages of language evolution.’ Linguistic Review.
Chomsky N (1988). Language and problems of knowledge: the Managua lectures. Cambridge, MA: MIT Press.
Christiansen M H & Kirby S (eds.) (2003). Language evolution. Oxford: Oxford University Press.
Fitch W T (2000). ‘The evolution of speech: a comparative review.’ Trends in Cognitive Sciences 4(3), 258–267.
Fitch W T (in press). ‘The evolution of language: a comparative review.’ Biology and Philosophy.
Fitch W T & Hauser M D (2004). ‘Computational constraints on syntactic processing in a nonhuman primate.’ Science 303, 377–380.
Gallistel C R (1990). The organization of learning. Cambridge, MA: Bradford Books/MIT Press.
Ghazanfar A A & Logothetis N K (2003). ‘Facial expressions linked to monkey calls.’ Nature 423, 937–938.
Harnad S (ed.) (1987). Categorical perception: the groundwork of cognition. New York: Cambridge University Press.
Hauser M D (1996). The evolution of communication. Cambridge, MA: MIT Press.
Hauser M D (2000). Wild minds: what animals really think. New York: Henry Holt.
Hauser M D (2003). ‘Primate cognition.’ In Gallagher M & Nelson R J (eds.) Handbook of psychology 3: biological psychology. New York: John Wiley. 561–574.
Hauser M D & Konishi M (eds.) (1999). The design of animal communication. Cambridge, MA: MIT Press.
Hauser M D, Chomsky N & Fitch W T (2002). ‘The faculty of language: what is it, who has it, and how did it evolve.’ Science 298, 1569–1579 (this article provided important background for the present article).
Heyes C M & Huber L (eds.) (2000). The evolution of cognition. Cambridge, MA: MIT Press.
Hockett C F (1960). ‘The origin of speech.’ Scientific American 203, 88–96.
Lenneberg E H (1967). Biological foundations of language. New York: John Wiley.
Lewkowicz D J & Lickliter R (eds.) (1994). The development of intersensory perception: comparative perspectives. Hillsdale, NJ: Lawrence Erlbaum Associates.
Marcus G F & Fisher S E (2003). ‘FOXP2 in focus: what can genes tell us about speech and language?’ Trends in Cognitive Sciences 7(6), 257–262.
Matsuzawa T (2001). ‘Primate foundations of human intelligence: a view of tool use in nonhuman primates and fossil hominids.’ In Matsuzawa (ed.). 3–28.
Matsuzawa T (ed.) (2001). Primate origins of human cognition and behavior. Tokyo: Springer-Verlag.
Newport E L, Hauser M D, Spaepen G & Aslin R A (2004). ‘Learning at a distance II: statistical learning of non-adjacent dependencies in a non-human primate.’ Cognitive Psychology 49(2), 85–117.
Ridley M (2004). Evolution. Malden, MA: Blackwell Science.
Rizzolatti G & Arbib M A (1998). ‘Language within our grasp.’ Trends in Neurosciences 21, 188–194.
Shettleworth S J (1998). Cognition, evolution, and behavior. New York: Oxford University Press.
Terrace H S, Son L K & Brannon E M (2003). ‘Serial expertise of rhesus macaques.’ Psychological Science 14(1), 66–73.
Tincoff R, Hauser M, Tsao F, Spaepen G, Ramus F & Mehler J (2005). ‘The role of speech rhythm in language discrimination: further tests with a nonhuman primate.’ Developmental Science 8(1), 26–35.
Tomasello M, Call J & Hare B (2003). ‘Chimpanzees understand psychological states – the question is which ones and to what extent.’ Trends in Cognitive Sciences 7(4), 153–156.
Trout J D (2001). ‘The biological basis of speech: what to infer from talking to the animals.’ Psychological Review 108(3), 523–549.
Cognitive Grammar
R W Langacker
© 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 2, pp. 590–593, © 1994, Elsevier Ltd.
‘Cognitive grammar’ (originally called ‘space grammar’) is a highly innovative theory of linguistic structure that has been developed and progressively articulated since 1976. In stark contrast to modular approaches, it regards language as an integral facet of cognition, and grammar as being inherently meaningful. It presupposes a ‘conceptualist’ account of
linguistic semantics that properly recognizes our capacity for construing the same conceived situation in alternate ways. With an appropriate view of meaning, all grammatical elements are reasonably attributed some kind of conceptual import. Grammar is thus considered ‘symbolic’ in nature: it reduces to the structuring and symbolization of conceptual content.
Linguistic Organization
The ultimate goal of linguistic research is to characterize language as a cognitive entity. As envisaged in cognitive grammar, linguistic structure ultimately
538 Cognitive Basis for Language Evolution in Nonhuman Primates Biro D & Matsuzawa T (2001). ‘Chimpanzee numerical competence: cardinal and ordinal skills.’ In Matsuzawa (ed.). 199–225. Cheney D L & Seyfarth R M (in press). ‘Constraints and preadaptations in the earliest stages of language evolution.’ Linguistic Review. Chomsky N (1988). Language and problems of knowledge: the Managua lectures. Cambridge, MA: MIT Press. Christiansen M H & Kirby S (eds.) (2003). Language evolution. Oxford: Oxford University Press. Fitch W T (2000). ‘The evolution of speech: a comparative review.’ Trends in Cognitive Sciences 4(3), 258–267. Fitch W T (in press). ‘The evolution of language: a comparative review.’ Biology and Philosophy. Fitch W T & Hauser M D (2004). ‘Computational constraints on syntactic processing in a nonhuman primate.’ Science 303, 377–380. Gallistel C R (1990). The organization of learning. Cambridge, MA: Bradford Books/MIT Press. Ghazanfar A A & Logothetis N K (2003). ‘Facial expressions linked to monkey calls.’ Nature 423, 937–938. Harnad S (ed.) (1987). Categorical perception: the groundwork of cognition. New York: Cambridge University Press. Hauser M D (1996). The evolution of communication. Cambridge, MA: MIT Press. Hauser M D (2000). Wild minds: what animals really think. New York: Henry Holt. Hauser M D (2003). ‘Primate cognition.’ In Gallagher M & Nelson R J (eds.) Handbook of psychology 3: biological psychology. New York: John Wiley. 561–574. Hauser M D & Konishi M (eds.) (1999). The design of animal communication. Cambridge, MA: MIT Press. Hauser M D, Chomsky N & Fitch W T (2002). ‘The faculty of language: what is it, who has it, and how did it evolve.’ Science 298, 1569–1579 (this article provided important background for the present article). Heyes C M & Huber L (eds.) (2000). The evolution of cognition. Cambridge, MA: MIT Press.
Hockett C F (1960). The origin of speech. Scientific American 203, 88–96. Lenneberg E H (1967). Biological foundations of language. New York: John Wiley. Lewkowicz D J & Lickliter R (eds.) (1994). The development of intersensory perception: comparative perspectives. Hillsdale, NJ: Lawrence Erlbaum Associates. Marcus G F & Fisher S E (2003). ‘FOXP2 in focus: what can genes tell us about speech and language?’ Trends in Cognitive Sciences 7(6), 257–262. Matsuzawa T (2001). ‘Primate foundations of human intelligence: a view of tool use in nonhuman primates and fossil hominids.’ In Matsuzawa (ed.). 3–28. Matsuzawa T (ed.) (2001). Primate origins of human cognition and behavior. Tokyo: Springer-Verlag. Newport E L, Hauser M D, Spaepen G & Aslin R A (2004). ‘Learning at a distance II: statistical learning of non-adjacent dependencies in a non-human primate.’ Cognitive Psychology 49(2), 85–117. Ridley M (2004). Evolution. Malden, MA: Blackwell Science. Rizzolatti G & Arbib M A (1998). ‘Language within our grasp.’ Trends in Neurosciences 21, 188–194. Shettleworth S J (1998). Cognition, evolution, and behavior. New York: Oxford University Press. Terrace H S, Son L K & Brannon E M (2003). ‘Serial expertise of rhesus macaques.’ Psychological Science 14(1), 66–73. Tincoff R, Hauser M, Tsao F, Spaepen G, Ramus F & Mehler J (2005). ‘The role of speech rhythm in language discrimination: further tests with a nonhuman primate.’ Developmental Science 8(1), 26–35. Tomasello M, Call J & Hare B (2003). ‘Chimpanzees understand psychological states—the question is which ones and to what extent.’ Trends in Cognitive Sciences 7(4), 153–156. Trout J D (2001). ‘The biological basis of speech: what to infer from talking to the animals.’ Psychological Review 108(3), 523–549.
Cognitive Grammar R W Langacker ! 2006 Elsevier Ltd. All rights reserved. This article is reproduced from the previous edition, volume 2, pp. 590–593, ! 1994, Elsevier Ltd.
‘Cognitive grammar ’ (originally called ‘space grammar’) is a highly innovative theory of linguistic structure that has been developed and progressively articulated since 1976. In stark contrast to modular approaches, it regards language as an integral facet of cognition, and grammar as being inherently meaningful. It presupposes a ‘conceptualist’ account of
linguistic semantics that properly recognizes our capacity for construing the same conceived situation in alternate ways. With an appropriate view of meaning, all grammatical elements are reasonably attributed some kind of conceptual import. Grammar is thus considered ‘symbolic’ in nature: it reduces to the structuring and symbolization of conceptual content.
Linguistic Organization
The ultimate goal of linguistic research is to characterize language as a cognitive entity. As envisaged in cognitive grammar, linguistic structure ultimately
reduces to recurrent patterns of neurological activity, and owing to its multifaceted complexity, a language is more aptly likened metaphorically to a biological organism than to a computer program or a logical deductive system. Thus it is not presumed that any single formalism can capture all aspects of a given phenomenon, or that any particular notation translates directly into specific psychological claims. The various notations and descriptive formats devised in cognitive grammar are meant to be precise within the limits of our understanding and revelatory for particular descriptive and analytical purposes. They do not however constitute a strict or uniquely privileged formalization, nor is the expectation of such a formalism considered appropriate. To the extent that a pattern of neurological activity is ‘entrenched’ and readily elicited as a pre-established whole, it is referred to as a ‘unit.’ Linguistic knowledge or ability (i.e., a speaker’s grasp of linguistic convention) comprises a vast array of such units, which is structured in the sense that units participate in excitatory and inhibitory relationships, and that some units include others as components. This knowledge – the ‘internal grammar’ – is not conceived as a generative or constructive device. The function of linguistic units is rather to serve as templates for the ‘categorization’ of expressions. An expression is simultaneously categorized by a multitude of units, each of which corresponds to a particular aspect of its structure and represents a constraint on its possible well-formedness (conventionality). Units compete for activation and the privilege of categorization on the basis of entrenchment and their degree of overlap with the target expression. Cognitive grammar imposes severe limitations on the kinds of units ascribable to a linguistic system. 
On the one hand, it posits only (a) semantic units, (b) phonological units, and (c) symbolic units (in which semantic and phonological units are linked by symbolic relationships). This is the bare minimum needed to accommodate the basic semiological function of language, namely the symbolization of conceptualizations by means of phonological sequences. Symbolic units are held sufficient for the description of lexicon, morphology, and syntax, which form a continuum (rather than discrete components). On the other hand, cognitive grammar observes the ‘content requirement,’ which restricts linguistic units to (a) semantic, phonological, and symbolic structures that occur overtly as (parts of) expressions, (b) ‘schematizations’ of permitted structures, and (c) ‘categorizing relationships’ between permitted structures (including ‘instantiation’ of a schema and ‘extension’ from a prototype). By virtue of these restrictions, the theory achieves naturalness, conceptual unification, and theoretical austerity.
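As an informal illustration only (cognitive grammar expressly declines to privilege any single formalism), the inventory of units described above can be sketched as a small Python data structure. All class and field names are hypothetical, and the strings standing in for semantic and phonological structures are placeholder labels rather than analyses.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class SemanticUnit:
    """A conceptualization; the string is only a placeholder label."""
    label: str


@dataclass(frozen=True)
class PhonologicalUnit:
    """A phonological sequence; the string is only a placeholder label."""
    label: str


@dataclass(frozen=True)
class SymbolicUnit:
    """A semantic unit and a phonological unit linked symbolically."""
    semantic: SemanticUnit
    phonological: PhonologicalUnit


class Relation(Enum):
    """The two categorizing relationships named in the text."""
    INSTANTIATION = "instantiates a schema"
    EXTENSION = "extends from a prototype"


@dataclass(frozen=True)
class Categorization:
    """A categorizing relationship between two permitted structures."""
    standard: SymbolicUnit  # the schema or prototype
    target: SymbolicUnit    # the structure being categorized
    relation: Relation


# Hypothetical example: a schematic noun unit and a specific unit instantiating it.
# "..." marks phonological content left schematic (unspecified).
noun_schema = SymbolicUnit(SemanticUnit("THING"), PhonologicalUnit("..."))
lid = SymbolicUnit(SemanticUnit("LID"), PhonologicalUnit("lid"))
link = Categorization(standard=noun_schema, target=lid, relation=Relation.INSTANTIATION)
```

The sketch has only three unit types and one relationship type, mirroring the restriction to semantic, phonological, and symbolic structures plus categorizing relationships; it does not model entrenchment or the competition among units for activation.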
Semantic Structure
Cognitive grammar maintains that grammatical structure is ‘symbolic’ in nature, being fully describable in terms of symbolic links between semantic and phonological structures. The viability of this conception of grammar depends on a particular view of linguistic semantics.
Basic Tenets
Cognitive semantics rests on several fundamental notions. First, meaning is not identified with truth conditions, but with ‘mental experience’ or ‘conceptualization’ in the broadest sense of that term. Included are novel conceptions (as well as established concepts), all facets of sensorimotor experience, and cognizance of the social, linguistic, and cultural context. Second, a linguistic category is typically ‘complex’: its adequate description requires not just a single structure, but a set of structures linked by relationships of instantiation and extension to form a network. As a special case of this phenomenon, lexical items are typically ‘polysemous.’ A lexeme’s meaning comprises a network of related senses, some being schematic relative to others, and some constituting extensions vis-à-vis more prototypical values. Third, linguistic semantics is ‘encyclopedic’ in scope. The meaning of a lexical item (even a single sense) cannot in general be captured by a limited, dictionary-type definition. Everything we know about an entity can in principle be regarded as contributing to the meaning of an expression that designates it, even though certain specifications are far more central and linguistically important than others. One cannot motivate any sharp distinction (only one of degree) between semantics and pragmatics, or between ‘linguistic’ and ‘extralinguistic’ knowledge. Finally, an expression’s meaning does not consist solely in the conceptual content it evokes (let alone in truth conditions or the objective situation it describes) – equally significant is how that content is ‘construed.’ Two expressions may invoke the same conceptual content yet differ semantically by virtue of the construals they impose.
Construal
Numerous aspects of construal have been identified. They are conveniently grouped under several broad rubrics: specificity, scope, background, perspective, and prominence. We have the capacity to conceive an entity or situation at varying levels of specificity and detail, as witnessed by such hierarchies as thing > creature > insect > fly > fruit fly. Each term in the hierarchy is ‘schematic’ for (and ‘elaborated’ by) the one that
follows, which characterizes the designated entity with greater precision (finer resolution). An expression’s ‘scope’ comprises the full array of conceptual content that it specifically evokes and relies upon for its characterization. The term ‘lid,’ for instance, evokes the schematic conception of a container, as well as the notion of one entity covering another. A conception of any type or any degree of complexity is capable of being invoked as part of an expression’s meaning. Numerous conceptions – called ‘cognitive domains’ – typically figure in the meaning of a given expression, which may evoke them in a flexible and open-ended manner (as determined by context). Hence the starting point for semantic description is not a set of semantic features or conceptual primitives, but rather an appropriate array of integrated conceptions, among them higher-order structures representing any level of conceptual organization. At the lowest level, presumably, are cognitively irreducible ‘basic domains’ such as space, time, and the domains associated with the various senses (e.g., color space). Another aspect of construal is our ability to conceive of one structure against the ‘background’ provided by another. Categorization is perhaps the most fundamental and pervasive manifestation of this ability. Another is the relationship between the source and target domains of a metaphor. Words like even, only, many, few, more, and less compare an actual value to some norm or expectation, and the contrast between the truth-functionally equivalent half-empty and half-full is well known. More generally, such phenomena as presupposition, anaphora, and the given/new distinction all involve construal against a certain kind of background. Perspective subsumes such factors as vantage point, orientation, and the subjectivity or objectivity with which an entity is construed.
Vantage point and orientation both figure in the two interpretations of Jack is to the left of Jill, where Jack’s position may be reckoned from either the speaker’s perspective or from Jill’s. By subjectivity or objectivity is meant the degree to which an entity functions asymmetrically as the ‘subject’ or the ‘object of conception.’ The conceptualizers (i.e., the speaker and addressee) are construed subjectively in There’s a mailbox across the street, where they remain implicit as ‘offstage’ reference points. They construe themselves more objectively in There’s a mailbox across the street from us. The final aspect of construal is the relative ‘prominence’ accorded to the different substructures of a conception. Various kinds of prominence need to be distinguished. One is the salience that comes with objective construal and explicit mention, as in the previous example. A second type of prominence is
called ‘profiling’: within the conception it evokes, every expression singles out some substructure as a kind of focus of attention; this substructure – the ‘profile’ – is the one that the expression ‘designates.’ For example, hypotenuse evokes the conception of a right-angled triangle (its scope) and profiles (designates) the side lying opposite the right angle. Above profiles the spatial ‘relationship’ between two entities. A third type of prominence pertains to the participants in a profiled relationship. One participant, termed the ‘trajector,’ is analyzed as the ‘figure’ within the profiled relationship; an additional salient entity is referred to as a ‘landmark.’ For instance, because above and below evoke the same conceptual content and profile the same relationship, their semantic contrast can only reside in figure/ground alignment. X is above Y is concerned with locating X, which is thus the trajector (relational figure), whereas Y is below X uses X as a landmark to specify the location of Y.
Grammatical Structure
Grammar is claimed to be ‘symbolic’ in nature. Only symbolic units (form-meaning pairings) are held necessary for the description of grammatical structure. Thus all valid grammatical constructs are attributed some kind of conceptual import. Rather than being autonomous in regard to semantics, grammar reduces to the structuring and symbolization of conceptual content.
Grammatical Classes
An expression’s grammatical class is determined by the nature of its profile. The most fundamental distinction is between a ‘nominal’ and a ‘relational’ expression, which respectively profile a ‘thing’ and a ‘relationship.’ Both terms are defined abstractly. A thing is characterized schematically as a ‘region in some domain,’ where a ‘region’ can be established from any set of entities (e.g., the stars in a constellation) just by conceiving of them in relation to one another. While physical objects occupy bounded regions in space and are prototypical instances of the thing category, the schematic characterization also accommodates such entities as unbounded substances (e.g., water), geographical areas (Wisconsin), regions in abstract domains (stanza), collections of entities (alphabet), points on a scale (F-sharp; 30° C), conceptual reifications (termination), and even the absence of some entity (intermission). The term ‘relationship’ is also broadly interpreted. It applies to any assessment of entities in relation to one another, regardless of their nature and status; in particular, they need not be distinct, salient, or individually recognized. Expressions classified as relational are
therefore not limited to those (like above) traditionally considered two-place predicates. For instance, the adjective blue profiles the relationship between an object and a certain region in color space. When used as a noun, square profiles the region comprising a set of line segments arranged in a particular fashion (or else the area they enclose). When used as an adjective, however, it profiles the complex relationship among subparts of this geometrical figure (involving perpendicularity, equal length of sides, and so on). Expressions that profile things include such traditional classes as noun, pronoun, and noun phrase (for which the term ‘nominal’ is adopted in cognitive grammar). Relational expressions subsume those traditionally recognized as adjectives, prepositions, adverbs, infinitives, participles, verbs, clauses, and full sentences. On the basis of the intrinsic complexity of their profiles, relational expressions can be divided into those which designate ‘simple atemporal relations,’ ‘complex atemporal relations,’ and ‘processes.’ A simple atemporal relation is one that comprises a single consistent configuration (or ‘state’ – hence it is also called a ‘stative relation’). For example, adjectives and many prepositions have this character. A complex atemporal relation cannot be reduced to a single configuration but can only be represented as an ordered series of states. In She walked across the field, for instance, the preposition across designates a series of locative configurations defining the trajector’s path with reference to the landmark. A process is a complex ‘temporal’ relation, i.e., one whose component states are saliently conceived as being distributed through a continuous span of time, and whose temporal evolution is viewed sequentially (rather than holistically). Verbs and finite clauses designate processes, whereas participles and infinitives impose a holistic view on the process specified by a verb stem and are thus atemporal. 
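The classification by profile described above can be read as a short decision procedure. The following Python sketch is offered only as a reading aid: the function name, its parameters, and the reduction of each criterion to a yes/no answer are simplifying assumptions, not part of the theory.

```python
from enum import Enum


class Profile(Enum):
    THING = "thing"
    SIMPLE_ATEMPORAL = "simple atemporal (stative) relation"
    COMPLEX_ATEMPORAL = "complex atemporal relation"
    PROCESS = "process"


def classify_profile(is_relationship: bool,
                     single_configuration: bool = False,
                     viewed_sequentially: bool = False) -> Profile:
    """Return the profile type implied by three (hypothetical) yes/no answers."""
    if not is_relationship:
        return Profile.THING              # nominal expressions
    if single_configuration:
        return Profile.SIMPLE_ATEMPORAL   # one consistent configuration (a state)
    if viewed_sequentially:
        return Profile.PROCESS            # states scanned sequentially through time
    return Profile.COMPLEX_ATEMPORAL      # series of states viewed holistically


# Examples taken from the text above.
examples = {
    "square (noun)": classify_profile(is_relationship=False),
    "blue": classify_profile(True, single_configuration=True),
    "across (She walked across the field)": classify_profile(True),
    "walk (verb)": classify_profile(True, viewed_sequentially=True),
}
for expression, profile in examples.items():
    print(f"{expression}: {profile.value}")
```

On this reading, participles and infinitives fall under the last branch, since they impose a holistic view on the process specified by a verb stem.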
Rules and Constructions
Grammar consists of patterns for combining simpler symbolic structures into symbolic structures of progressively greater complexity. A symbolically complex expression, e.g., cracked, represents a ‘grammatical construction’ wherein two ‘component structures’ (crack and -ed) are ‘integrated’ to form a ‘composite structure.’ Such integration, both phonological and semantic, is effected by ‘correspondences’ established between subparts of the component expressions, and by the superimposition of corresponding entities. Typically, one component structure corresponds to, and serves to ‘elaborate,’ a schematic substructure within the other. Thus -ed, being a suffix, makes schematic phonological reference to a stem, which crack elaborates to yield cracked. Semantically, the adjectival
sense of the past participial morpheme profiles the final, resultant state of a schematically characterized process, which corresponds to the specific process profiled by crack. By superimposing the corresponding processes, and adopting the profiling of the participial morpheme, one obtains the composite semantic structure of cracked, which profiles a stative relation identified as the final state of crack. It is usual for the composite structure to inherit its profiling from one of the component structures, which thereby constitutes the construction’s ‘head.’ The suffix -ed is thus the head within the participial construction cracked. A component that elaborates the head is a ‘complement’; hence crack is the complement of -ed by virtue of elaborating the schematic process it invokes. Conversely, a component that the head elaborates is a ‘modifier.’ In blue square, for instance, blue modifies square because the latter – the head – elaborates blue’s schematic trajector (the entity located in the blue region of color space). Grammatical rules take the form of schematized constructions. A ‘constructional schema’ is a symbolically complex structure whose internal organization is exactly analogous to that of a set of constructions (complex expressions), but which abstracts away from their points of divergence to reveal their schematic commonality. For instance, the rule for adjective + noun combinations in English is a symbolic structure parallel to blue square, cracked sidewalk, playful kitten, etc., except that the adjective and noun are schematic rather than specific: semantically, the constructional schema specifies that the trajector of the adjective corresponds to the profile of the noun, which lends its profile to the composite structure; phonologically, it specifies that the adjective directly precedes the noun as a separate word.
A constructional schema may be characterized at any appropriate level of abstraction, and represents the conventionality of a particular pattern of integration. It is available for the categorization of novel complex expressions and can also be thought of as a template used in their assembly.
Other Grammatical Elements
The foregoing remarks indicate that grammatical classes, rules, and such notions as head, complement, and modifier can all be characterized in terms of configurations of symbolic structures. The same is true of other grammatical elements. For instance, a ‘nominal’ (noun phrase) profiles a thing and further incorporates a specification of its relationship to the ‘ground’ (i.e., the speech event and its participants) with respect to fundamental, ‘epistemic’ cognitive domains; demonstratives, articles, and certain quantifiers serve this function in English. Similarly, a
‘finite clause’ profiles a process grounded (in the case of English) by tense and the modals. A ‘subject’ can then be characterized as a nominal which elaborates the trajector of a process profiled at the clausal level of organization, and a ‘direct object’ as a nominal that elaborates its primary landmark. Grammatical markers are all attributed semantic values, often quite schematic. For example, the derivational morpheme -er (as in complainer) profiles a thing characterized only as the trajector of a schematic process; like most derivational morphemes, it is semantically schematic for the class it derives, its primary semantic value residing in the profile it imposes on the specific conceptual content provided by the stem it combines with. Besides schematicity, factors considered compatible with a grammatical marker’s meaningfulness include semantic overlap with other elements (e.g., redundant marking, as in agreement), the lack of any option (as in government), and failure to exhibit a single sense in all its uses (polysemy being characteristic of both lexical and grammatical elements – e.g., -ed has distinct but related meanings in its adjectival, perfect, and passive uses).
Assessment and Outlook
Cognitive grammar has been revealingly applied to a steadily widening array of phenomena in a diverse set of languages. It is rapidly being established as a viable model of language structure, and in view of the restrictiveness and conceptual unification it achieves, it merits serious attention from linguistic theorists. It is fully compatible with ‘functional’ approaches to linguistic structure, which help explain why certain of the structures it permits have the status of prototypes, or represent language universals or universal tendencies. It also has a natural affinity to ‘connectionist’ (or ‘parallel distributed processing’) models of cognition, both because the distinction between rules and data is only one of degree, and also because grammatical structure reduces to form–meaning pairings. The possibility of achieving this reduction has extensive implications for language acquisition, models of language processing, and our conception of the human mind.
See also: Cognitive Linguistics; Cognitive Pragmatics; Cognitive Semantics; Construction Grammar; Grammaticalization.
Bibliography
Haiman J (1980). ‘Dictionaries and encyclopedias.’ Lingua 50, 329–357.
Lakoff G (1987). Women, fire, and dangerous things: what categories reveal about the mind. Chicago, IL: University of Chicago Press.
Langacker R W (1986). ‘An introduction to cognitive grammar.’ Cognitive Science 10, 1–40.
Langacker R W (1987a). Foundations of cognitive grammar, vol. 1: theoretical prerequisites. Stanford, CA: Stanford University Press.
Langacker R W (1987b). ‘Nouns and verbs.’ Language 63, 53–94.
Langacker R W (1988). ‘Autonomy, agreement, and cognitive grammar.’ In Brentari D et al. (eds.) Agreement in grammatical theory. Chicago, IL: Chicago Linguistic Society.
Langacker R W (1990a). Concept, image, and symbol: the cognitive basis of grammar. Berlin: Mouton de Gruyter.
Langacker R W (1990b). ‘Subjectification.’ Cognitive Linguistics 1, 5–38.
Langacker R W (1991). Foundations of cognitive grammar, vol. 2: descriptive application. Stanford, CA: Stanford University Press.
Rudzka-Ostyn B (ed.) (1988). Topics in cognitive linguistics. Amsterdam: Benjamins.
Cognitive Linguistics L Talmy, State University of New York, Buffalo, NY, USA © 2006 Elsevier Ltd. All rights reserved.
Developing over the past two to three decades, cognitive linguistics has as its central concern the representation of conceptual structure in language. This relatively new field can initially be characterized through a contrast of its conceptual approach with two other familiar approaches, the formal and the psychological. The formal approach focuses on the
overt structural patterns exhibited by linguistic forms, largely abstracted away from any associated meaning. The tradition of generative grammar has been centered here, but has had limited involvement with the other two approaches. Its formal semantics has largely included only enough about meaning to correlate with its formal categories and operations. And its reach to psychology has largely considered only the kinds of cognitive structure and processing needed to account for its formal categories and operations. The psychological approach regards language from the perspective of general cognitive systems such
as perception, memory, attention, and reasoning. Centered here, the field of psychology has also addressed the other two approaches. Its conceptual concerns have included semantic memory, the associativity of concepts, the structure of categories, inference generation, and contextual knowledge. But it has insufficiently considered systematic conceptual structuring. By contrast, the conceptual approach of cognitive linguistics is concerned with the patterns in which and processes by which conceptual content is organized in language. It has thus addressed the linguistic structuring of such basic conceptual categories as space and time, scenes and events, entities and processes, motion and location, and force and causation. To these it adds the basic ideational and affective categories attributed to cognitive agents, such as attention and perspective, volition and intention, and expectation and affect. It addresses the semantic structure of morphological and lexical forms, as well as of syntactic patterns. And it addresses the interrelationships of conceptual structures, such as those in metaphoric mapping (see Metaphor: Psychological Aspects), those within a semantic frame, those between text and context, and those in the grouping of conceptual categories into large structuring systems. Overall, the aim of cognitive linguistics is to ascertain the global integrated system of conceptual structuring in language. Further, cognitive linguistics addresses the formal properties of language, accounting for grammatical structure in terms of its representation of conceptual structure. And, distinguishing it from earlier semantics, cognitive linguistics relates its findings to the cognitive structures of the psychological approach (see Psycholinguistics: Overview). Its long-range trajectory is to integrate the linguistic and the psychological perspectives on cognitive organization in a unified understanding of human conceptual structure. 
Many of the major themes of cognitive linguistics can be related in a way that shows the overall structure of the field. A beginning observation is that language consists of two subsystems – the open-class or lexical, and the closed-class or grammatical – that have different semantic and functional properties. Closed-class, but not open-class forms, exhibit great semantic constraint, and do so at two levels. First, their referents can belong to certain semantic categories, such as number, gender, and tense, but not to others such as color or material. For example, inflections on a noun indicate its number in many languages, but never its color. Second, they can refer only to certain concepts even within an acceptable category like number – e.g., ‘singular,’ ‘dual,’ ‘plural,’ and ‘paucal,’ but never ‘even,’ ‘odd,’ or ‘dozen.’ Certain principles govern this semantic constraint, e.g.,
the exclusion of reference to Euclidean properties such as specificity of magnitude or shape. What largely remain are topological properties such as the magnitude-neutral distance represented by the deictics (see Deixis and Anaphora: Pragmatic Approaches) in This speck/planet is smaller than that speck/planet, or the shape-neutral path represented by the preposition in I circled/zigzagged through the forest. The two subsystems differ also in their basic functions, with conceptual content represented by open-class forms and conceptual structure by closed-class forms. For example, in the overall conception evoked by the sentence A rustler lassoed the steers, the three semantically rich open-class forms – rustle, lasso, steer – contribute most of the content, while most of the structure is determined by the remaining closed-class forms. Shifts in all the closed-class forms – as in Will the lassoers rustle a steer? – restructure the conception but leave the cowboy-landscape content largely intact, whereas a shift in the open-class forms – as in A machine stamped the envelopes – changes content while leaving the structure intact. The basic finding in this “semantics of grammar” portion of cognitive linguistics is that the closed-class subsystem is the fundamental conceptual structuring system of language (Talmy, 2000). Such conceptual structure is understood in cognitive linguistics as ‘schematic’, with particular ‘schemas’ or ‘image-schemas’ represented in individual linguistic forms – whether alone in closed-class forms or with additional material in open-class forms. The idea is that the structural specifications of linguistic forms are regularly conceptualized in terms of abstracted, idealized, and sometimes virtually geometric delineations. Such schemas fall into conceptual categories that join in extensive ‘schematic systems.’ Many of the substantive findings about conceptual organization made by cognitive linguists can be placed within these schematic systems.
One schematic system is ‘configurational structure,’ covering the structure of objects in space and events in time – often with parallels between the two. For example, in its category of ‘plexity’ – a term covering both number and aspect – the object referent of bird and the event referent of (to) sigh are intrinsically ‘uniplex’, but the addition of the extra forms in birds and keep sighing triggers a cognitive operation of ‘multiplexing’ that yields multiplex referents. And in the category ‘state of boundedness,’ the intrinsically unbounded object and event referents of water and (to) sleep can undergo ‘bounding’ through the additional form in some water and (to) sleep some to yield bounded referents. The second schematic system of ‘perspective’ covers the location or path of the point at which one
places one’s ‘mental eyes’ to regard a represented scene. For example, in There are some houses in the valley, the closed-class forms together represent a distal stationary perspective point with global scope of attention. But the substituted forms in There is a house every now and then through the valley represent a proximal moving perspective point with local scope of attention. The third schematic system of ‘attention’ covers the patterns in which different aspects of a linguistic reference are foregrounded or backgrounded. For example, the word hypotenuse ‘profiles’ – foregrounds in attention – its direct reference to a line segment against an attentionally backgrounded ‘base’ of the conception of a right triangle (Langacker, 1987). The verb bite in The dog bit the cat foregrounds the ‘active zone’ of the dog’s teeth. And over an expression of a certain kind, the ‘Figure’ or ‘trajector’ is the most salient constituent whose path or site is characterized in terms of a secondarily salient constituent, the ‘Ground’ or ‘landmark.’ These functional assignments accord with convention in The bike is near the house, but their reversal yields the odd ?The house is near the bike. A fourth schematic system of ‘force dynamics’ covers such relations between entities as opposition, resistance, overcoming, and blockage, and places causation alongside permitting and preventing, helping and hindering. To illustrate, the sentence The ball rolled along the green is force-dynamically neutral, but in The ball kept rolling along the green, either the ball’s tendency toward rest is overcome by something like the wind, or its tendency toward motion overcomes something such as stiff grass (Talmy, 2000). Schemas from all the schematic systems, and the cognitive operations they trigger, can be nested to form intricate structural patterns.
To illustrate with events in time, the uniplex event in The beacon flashed can be multiplexed as in The beacon kept flashing; this can be bounded as in The beacon flashed 5 times in a row; this can be treated as a new uniplexity and remultiplexed as in The beacon kept flashing 5 times at a stretch; and this can in turn be rebounded, as in The beacon flashed 5 times at a stretch for 3 hours. Further conceptual structuring is seen within the meanings of morphemes. A morpheme’s meaning is generally a prototype category whose members differ in privilege, whose properties can vary in number and strength, and whose boundary can vary in scope (Lakoff, 1987). For example, the meaning of breakfast prototypically refers to eating certain foods in the morning, but can extend to other foods at that time or the same foods at other times (Fillmore, 1982). For a polysemous morpheme, one sense can
function as the prototype to which the other senses are progressively linked by conceptual increments within a ‘radial category.’ Thus, for the preposition over, the prototype sense may be ‘horizontal motion above an object’ as in The bird flew over the hill, but linked to this by ‘endpoint focus’ is the sense in Sam lives over the hill (Brugmann, 1981). These findings have led cognitive linguists to certain stances on the properties of conceptualization. The conceptual structuring found in language is largely held to be a product of human cognition and imposed on external phenomena (where it pertains to them), rather than arising from putative structure intrinsic to such external phenomena and veridically taken up by language. For example, in one type of ‘fictive motion,’ motion can be imputed to a shadow – cross-linguistically always from an object to its silhouette – as in The pole threw its shadow on the wall, even though a distinct evaluative part of our cognition may judge the situation to lack physical motion. An important consequence is that alternatives of conceptualization or ‘construal’ can be applied to the same phenomena. Thus, a person standing 5 feet from and pointing to a bicycle can use either deictic in Take away that/this bicycle, in effect imputing the presence of a spatial boundary either between herself and the bicycle or on the far side of the bicycle. The notion of ‘embodiment’ extends the idea of conceptual imposition and bases the imposed concepts largely on experiences humans have of their bodies interacting with environments or on psychological or neural structure (Lakoff and Johnson, 1999). As one tenet of this view, the ‘objectivist’ notion of the autonomous existence of logic and reason is replaced by experiential or cognitive structure.
For example, our sense of the meaning of the word angle is not derived from some independent ideal mathematical realm, but is rather built up from our experience, e.g., from perceptions of a static forking branch, from moving two sticks until their ends touch, or from rotating one stick while its end touches that of another. The cognitive process of conceptual imposition – more general than going from mental to external phenomena or from experiential to ideal realms – also covers directed mappings from any one conceptual domain to another. An extensive form of such imputation is metaphor, mainly studied in cognitive linguistics not for its familiar salient form in literature but, under the term ‘conceptual metaphor,’ for its largely unconscious pervasive structuring of everyday expression. In it, certain structural elements of a conceptual ‘source domain’ are mapped onto the content of a conceptual ‘target domain.’ The embodiment-based directionality of the imputational
mapping is from a more concrete domain, one grounded in bodily experience, to a more abstract domain – much as in the Piagetian theory of cognitive development. Thus, the more palpable domain of physical motion through space can be mapped onto the more abstract domain of progression through time – in fact, in two different ways – as in We’re approaching Christmas and Christmas is approaching – whereas mappings in the reverse direction are minimal (Lakoff, 1992). Generally, mappings between domains are implicit in metaphor, but are explicitly established by linguistic forms in the area of ‘mental spaces.’ The mapping here is again directional, going from a ‘base’ space – a conceptual domain generally factual for the speaker – to a subordinate space that can be counterfactual, representational, at a different time, etc. Elements in the former space connect to corresponding elements in the latter. Thus, in Max thinks Harry’s name is Joe, the speaker’s base space includes ‘Max’ and ‘Harry’ as elements; the word thinks sets up a subordinate space for a portion of Max’s belief system; and this contains an element ‘Joe’ that corresponds to ‘Harry’ (Fauconnier, 1985). Further, two separate mental spaces can map elements of their content and structure into a third mental space that constitutes a ‘blend’ or ‘conceptual integration’ of the two inputs, with potentially novel structure. Thus, in referring to a modern catamaran reenacting a century-old voyage by an early clipper, a speaker can say At this point, the catamaran is barely maintaining a 4 day lead over the clipper, thereby conceptually superimposing the two treks and generating the apparency of a race (Fauconnier and Turner, 2002). In terms of the sociology of the field, there is considerable consensus across cognitive linguists on the assumptions of the field and on the body of work basic to it.
No competing schools of thought have arisen, and cognitive linguists engage in relatively little critiquing of each other’s work, which mainly differs only in the phenomena focused on. See also: Cognitive Semantics; Componential Analysis; Deixis and Anaphora: Pragmatic Approaches; Metaphor: Psychological Aspects; Orality; Prototype Semantics; Psycholinguistics: Overview; Spatiality and Language.
Bibliography
Bowerman M (1996). ‘Learning how to structure space for language: a crosslinguistic perspective.’ In Bloom P, Peterson M, Nadel L & Garrett M F (eds.) Language and space. Cambridge, MA: MIT Press. 385–436.
Brugmann C (1981). The story of “over.” M.A. thesis, University of California, Berkeley.
Fauconnier G (1985). Mental spaces: aspects of meaning construction in natural language. Cambridge, MA/London: MIT Press/Bradford.
Fauconnier G & Turner M (2002). The way we think: conceptual blending and the mind’s hidden complexities. NY: Basic Books.
Fillmore C (1975). ‘An alternative to checklist theories of meaning.’ Berkeley Linguistics Society 1, 155–159.
Fillmore C (1982). ‘Frame semantics.’ In Linguistic Society of Korea (ed.) Linguistics in the Morning Calm. Seoul: Hanshin Publishing Co. 111–137.
Fillmore C (1997). Lectures on deixis. Stanford, CA: CSLI Publications.
Geeraerts D & Cuyckens H (eds.) (forthcoming). Handbook of Cognitive Linguistics. Oxford: Oxford University Press.
Herskovits A (1986). Language and spatial cognition: an interdisciplinary study of the prepositions in English. Cambridge: Cambridge University Press.
Kemmer S (1993). The middle voice. Amsterdam: John Benjamins.
Lakoff G (1987). Women, fire, and dangerous things: what categories reveal about the mind. Chicago/London: University of Chicago Press.
Lakoff G (1992). ‘The contemporary theory of metaphor.’ In Ortony A (ed.) Metaphor and thought, 2nd edn. Cambridge: Cambridge University Press.
Lakoff G & Johnson M (1999). Philosophy in the flesh: the embodied mind and its challenge to western thought. NY: Basic Books.
Langacker R (1987). Foundations of cognitive grammar: theoretical prerequisites (vol. 1). Stanford: Stanford University Press.
Langacker R (1991). Foundations of cognitive grammar: descriptive application (vol. 2). Stanford: Stanford University Press.
Langacker R (2002). Concept, image, and symbol: the cognitive basis of grammar. Berlin/NY: Mouton de Gruyter.
Rudzka-Ostyn B (ed.) (1988). Topics in cognitive linguistics. Amsterdam/Philadelphia: John Benjamins.
Slobin D I (1997). ‘Mind, code, and text.’ In Bybee J, Haiman J & Thompson S A (eds.) Essays on language function and language type: dedicated to T. Givon. Amsterdam: John Benjamins. 437–467.
Slobin D I (2003). ‘Language and thought online: cognitive consequences of linguistic relativity.’ In Gentner D & Goldin-Meadow S (eds.) Language in mind: advances in the study of language and thought. Cambridge, MA: MIT Press. 157–192.
Sweetser E (1990). From etymology to pragmatics. Cambridge: Cambridge University Press.
Sweetser E (1999). ‘Compositionality and blending: semantic composition in a cognitively realistic framework.’ In Redeker G & Janssen T (eds.) Cognitive linguistics: foundations, scope and methodology. Berlin: Mouton de Gruyter. 129–162.
Talmy L (2000). Toward a cognitive semantics (2 vols). Cambridge: MIT Press.
Talmy L (2003). ‘The representation of spatial structure in spoken and signed language.’ In Emmorey K (ed.) Perspectives on Classifier Constructions in Sign Language. Mahwah, NJ: Lawrence Erlbaum. 169–195.
Talmy L (forthcoming). The attention system of language. Cambridge: MIT Press.
Tomasello M (ed.) (1998). The new psychology of language: cognitive and functional approaches to language structure (vol. 1). Mahwah, NJ: Lawrence Erlbaum.
Tomasello M (ed.) (2003). The new psychology of language: cognitive and functional approaches to language structure (vol. 2). Mahwah, NJ: Lawrence Erlbaum.
Traugott E (1989). ‘On the rise of epistemic meanings in English: an example of subjectification in semantic change.’ Language 57, 33–65.
Verhagen A (2002). ‘From parts to wholes and back again.’ In van Wolde E (ed.) Job 28. Cognition in Context. Leiden: Brill. 231–252.
Zubin D A & Kopcke K M (1986). ‘The gender marking of superordinate and basic level concepts in German: an analogist apology.’ In Craig C (ed.) Categorization and Noun Classification. Philadelphia: Benjamins North America. 139–180.
Cognitive Pragmatics F M Bosco, University of Torino, Torino, Italy © 2006 Elsevier Ltd. All rights reserved.
Introduction Cognitive pragmatics is concerned with the mental processes involved in intentional communication. Typically, studies within this area focus on cognitive processes underlying the comprehension of a linguistic speech act and overlook linguistic production or extralinguistic communication. As far as cognitive processes are concerned, authors in this field are interested in both the inferential chains necessary to understand a communicator’s intention starting from the utterance he proffered and the different mental representations underlying the comprehension of various communicative phenomena as cognitive processes. Thus, a theory in cognitive pragmatics aims to explain what mental processes a person actually engages in during a communicative interaction (see Shared Knowledge). Relevance theory (Sperber and Wilson, 1986/1995) is usually identified as the principal theoretical framework in the area of cognitive pragmatics (see Relevance Theory). Nonetheless, in the last decade, other theories have been developed. These include a far-reaching theory of the cognitive processes underlying human communication, known as the Cognitive Pragmatics theory (Airenti et al., 1993a, 1993b; Bara, 2005), and the Graded Salience Hypothesis (Giora, 2003), a theory which focuses on mental inferences underlying the comprehension of literal vs. figurative language (see Cognitive Linguistics; Metaphor: Psychological Aspects). Describing the cognitive processes involved in communicative interaction is interesting not only for the study of such processes as fixed states – an approach
that takes into consideration exclusively the final stage in healthy adult subjects – but also for the consideration of how a given function develops from infancy, through childhood, and to adulthood, and how it eventually decays in subjects with brain injuries (Bara, 1995). Such an approach makes it possible to better comprehend, from a cognitive perspective, how pragmatic competence develops and what neurocognitive structures might cause deficits in people’s performance if damaged. A closely related topic is the identification of the cognitive components that contribute to the realization of a complete pragmatic competence. From this perspective, it is important to consider the role played by a person’s Theory of Mind and by the Executive Function (see below) during a communicative interaction.
Cognitive Pragmatics Theory Airenti et al. (1993a, 1993b) presented a theory of the cognitive processes underlying human communication, aiming to provide a unified theoretical framework for the explanation of different communicative phenomena (Bara, 2005). The authors proposed that their theoretical analysis holds for both linguistic and extralinguistic communication, and thus introduced, with reference to the interlocutors, the terms ‘actor’ and ‘partner’ instead of the classical ‘speaker’ and ‘hearer.’ The theory assumes that the literal meaning of an utterance is necessary but not sufficient for the partner to reconstruct the meaning conveyed by the actor, and that in order to understand the actor’s communicative intention, the partner has to recognize a ‘behavior game’ the actor is proposing for him (the partner) to play. The behavior game is a social structure mutually shared by the participants of the communicative interaction.
Suppose, for example, that while you are working in your office, a colleague walks in and says: [1] It’s snowing outside. Although the literal meaning of the utterance is completely clear, you probably are utterly bewildered about how to respond. Only if [1] is understood as an invitation not to go outside, a request to close the window, or a proposal to go skiing next weekend (that is, only if, in some way, the reason or reasons for uttering the expression are evident), will you be able to make the necessary inferences and answer appropriately. The utterance, pure and simple, without a game to refer to, has in itself no communicative significance whatsoever. Thus, an utterance extrapolated from its context of reference has no communicative meaning and cannot have any communicative effect on the partner. Starting from the assumption that the communicative meaning of an utterance is intrinsically linked to the context within which it is proffered, Bosco et al. (2004a) defined a taxonomy of six categories of context: Access, Space, Time, Discourse, Behavioral Move, and Status. Using contextual information, the partner can identify the behavior game bid by the actor, which allows him to fully comprehend the actor’s communicative intention. Following the tenets of the Cognitive Pragmatics theory, Bucciarelli et al. (2003) proposed that two cognitive factors affect comprehension of various kinds of pragmatic phenomena: the ‘inferential load’ and the ‘complexity of mental representations’ underlying the comprehension of a communicative act.
Inferential Load: Simple and Complex Speech Acts
Searle (1975) claimed that in speech act comprehension, the literal interpretation of an utterance always has priority with respect to any other interpretations derived from it. According to Searle, understanding an indirect speech act, e.g., [2] Would you mind passing me the salt?, is harder than understanding a direct speech act, e.g., [3] Please pass me the salt, because it requires a longer inferential process. Bara and Bucciarelli (1998) provided empirical evidence that, beginning at two-and-a-half years of age, children find direct speech acts such as [4] Please sit down, and conventional indirects such as [5] Would you mind closing the door? equally easy to comprehend. In a further study, Bucciarelli et al. (2003) found that starting at age two-and-a-half years, children find both direct and conventional indirect speech acts easier to understand than nonconventional indirect speech acts, such as the utterance [6] Excuse me, I’m studying when it is a request to a partner who is hammering in a nail to stop making noise.
Using the tenets of Cognitive Pragmatics theory, it is possible to abandon the distinction between direct and indirect speech acts and adopt a new one based on the difference between inferential processes involved in comprehending simple as against complex communicative acts (Bara and Bucciarelli, 1998). According to the theory, the partner’s understanding of any kind of speech act depends on the comprehension of the behavior game bid by the actor; an agent will interpret an interlocutor’s utterance based on the grounds that are assumed to be shared. In this perspective, the partner’s difficulty in understanding a communicative act depends on the inferential chain necessary to refer the utterance to the game intended by the actor. Direct and conventional indirect speech acts make immediate reference to the game, and thus are defined as ‘simple speech acts.’ On the other hand, nonconventional indirect speech acts can be referred to as ‘complex speech acts,’ because they require a chain of inferential steps due to the fact that the specific behavior game of which they are a move is not immediately identifiable. For example, to understand [4] and [5], it is sufficient for the partner to refer to the ‘Ask for Something’ game. In order to understand [6], a more complex inferential process is necessary: the partner needs to share with the actor the belief that when a person is studying, he needs silence, and that, since hammering is noisy, [6] is a request to stop. Only then can the partner attribute to the utterance the value of a move in the ‘Ask for Something’ game. Thus, if the problem is how to access the game, the distinction between direct and indirect speech acts is not relevant. It is the complexity of the inferential steps necessary to refer the utterance to the game bid by the actor that accounts for the difficulties in speech act comprehension.
This distinction applies not only to standard communicative acts such as direct, conventional indirect, and nonconventional indirect speech acts, but also to nonstandard ones such as ironic and deceitful acts (Bara et al., 1999a). The same distinction between simple and complex standard, ironic, and deceitful communicative acts holds for extralinguistic communication acts as well (see Irony). That is, the distinction holds also when the actor communicates with the partner only through gestures (Bosco et al., 2004b) (see Gestures: Pragmatic Aspects). The inferential load underlying a communicative act may explain the difference in difficulty that exists in the comprehension of different communicative acts pertaining to the same pragmatic category, such as between simple and complex standard communicative acts. To explain the difference in difficulty that might occur among communicative acts pertaining to different pragmatic categories, such as between a
direct communicative act and a deceitful communicative act, it is necessary to consider the complexity of the mental representations involved in their comprehension.
Complexity of Mental Representations
Still within the framework of Cognitive Pragmatics theory, and holding the complexity of the inferential load constant, Bucciarelli et al. (2003) described an increasing difficulty in comprehending simple communicative acts of different sorts: simple standard, simple deceitful, and simple ironic communicative acts. According to the theory, in standard communication, default rules of inference are used to understand another person’s mental states; default rules are always valid unless their consequences are explicitly denied. Indeed, in standard communication, what the actor says is in line with his private beliefs. Direct, conventional indirect, and nonconventional indirect speech acts are all examples of standard communication. In terms of mental representations, to comprehend a standard communicative act, the partner simply has to refer the utterance proffered by the actor to the behavior game he bids. On the other hand, nonstandard communication such as irony and deceit involves the comprehension of communicative acts through the blocking of default rules and the occurrence of more complex inferential processes that involve conflicts between the beliefs the actor has shared with the partner and the actor’s private beliefs. In the comprehension of irony and deceit, the mental representations involved produce a difference between what the actor communicates and what he privately entertains. It follows that, for the same inferential load, standard communicative acts are easier to deal with than nonstandard pragmatic phenomena. According to Bucciarelli et al. (2003), in the case of the comprehension of deceit, the partner has to recognize the difference between the mental states that are expressed and those the actor privately entertains. Consider for instance the following example: Mark and Ann share the belief that the lecture they just attended was incredibly boring.
Later Ann meets John and tells him that Mark and she attended a tedious lecture. In the afternoon, Mark also meets John, who asks him about the lecture. Actually, Mark is annoyed with John because John did not go to the lecture, and he does not want John to know that he feels he wasted the whole morning. Mark does not know that John has already met Ann; thus he answers: [7] It was really interesting! John can understand that Mark is trying
to deceive him because he recognizes the difference between the mental state that Mark is expressing and the one that he truly and privately entertains. A statement, instead, becomes ironic when, in addition to the awareness of this difference, the partner also recognizes that the mental states expressed contrast with the scenario that he shares with the actor. For example, some months later, during a chat with Mark, Ann asks: Do you remember the lecture that we attended some months ago? Mark answers: [8] It was really interesting! What makes this utterance ironic is the fact that both interlocutors share that the lecture had actually been boring. Thus, the difference between irony and deceit lies not in the partner’s awareness of the difference between the mental states that the actor expressed and those that he actually entertains, but in his awareness that he does or does not share this difference with the actor. In the case of irony, the partner has to represent not only the discrepancy between the mental states that the actor expressed and those that he privately entertains, but also that such awareness is shared with the actor. This makes an ironic communicative act more difficult to comprehend than a deceitful one. Bucciarelli et al. (2003) showed the existence of an increase in difficulty in the comprehension of simple standard communicative acts, simple deceits, and simple ironies with an experiment carried out on children from two-and-a-half to seven years of age. The authors also pointed out that the same children show a similar predicted gradation of difficulty in understanding the same pragmatic phenomena, both when these are expressed by linguistic speech acts and when these are expressed by communicative gestures. 
Regardless of the communicative channel used by the actor, linguistic or extralinguistic, children find simple standard speech acts easier to comprehend than simple deceits, which are, in turn, easier to comprehend than simple ironic communicative acts. Finally, an overall consideration of the mentioned results makes it possible to conclude that all of the theoretical predictions (both derived from the Cognitive Pragmatics theory and grounded on a person’s cognitive processes underlying the communicative comprehension) hold true for the same pragmatic phenomena whether expressed by linguistic speech acts or by gestures. These results seem to indicate that linguistic and extralinguistic communicative acts share the most relevant mental processes in each of the specific pragmatic phenomena investigated and suggest that pragmatic competence shares the same cognitive faculty – regardless of the input processed – be it linguistic or extralinguistic. It is possible to interpret such empirical evidence as being in favor of a unified theoretical framework of
human communication in which linguistic and extralinguistic communication develop in parallel, being different aspects of a unique communicative competence (see Bara and Tirassa, 1999; Bara, 2005) (see Communicative Principle and Communication).
Cognitive Pragmatics and Development In this section, we shall examine the empirical evidence in favor of the existence of cognitive processes of increasing complexity that underlie different pragmatic phenomena. The developmental domain is particularly interesting for this aim because it makes it possible to observe errors in the comprehension of different kinds of pragmatic tasks that allow us to falsify our hypotheses regarding the complexity of the mental processes involved in specific phenomena. Adult subjects, by contrast, possess a fully developed cognitive system and communicative competence, and thus they do not show any interesting errors in comprehending or producing different kinds of communicative acts; it is only possible to analyze their reaction times in solving such tasks. If inferential processes and mental representations of increasing complexity underlie the comprehension of various kinds of pragmatic phenomena, then it is possible to explain why, during the development of children’s communicative competence, some communicative acts are understood and produced before others are. For example, children initially only understand sincere communicative acts and only later on in their development do they start comprehending deceit and irony. Children’s ability to deal with mental representations and inferential chains of increasing complexity develops with age, and this fact helps explain the development of their pragmatic competence. From this perspective, the increasing capacity to construct and manipulate complex mental representations is involved in the emergence of preschoolers’ and kindergarten students’ capacity to deceive. A deceptive task could be made easier to comprehend by reducing the number of characters, episodes, and scenes involved in the task, and by including a deceptive context (Sullivan et al., 1994).
Likewise, the ability to comprehend and produce different forms of irony involves an increasingly sophisticated inferential ability. Lucariello and Mindolovich (1995) carried out a study on the ability of 6- and 8-year-old children to provide ironic endings to unfinished stories. The authors claimed that the recognition and the construction of (situational) ironic events involve the ability to manipulate the representations of events. These representations have to be critically viewed, and disassembled in order to create
new, different, and ironic event structures. Also, different forms of irony behave in different ways, as the authors’ experiments show. Their results show that older children construct more complex ironic derivations from the representational base than younger children do. Just as it is possible to better understand the development of pragmatic competence by considering the cognitive processes involved in a specific communicative act, it also is possible to explain deficits in performance in cases of brain damage. The ability of children with closed head injury to solve pragmatic tasks is closely dependent (for a review, see Bara et al., 1999b). These subjects performed worse than did their normal peers in specific pragmatic tasks such as bridging the inferential gap between events in stereotypical social situations and tasks such as comprehending utterances that require inferential processes because of their use of idiomatic and figurative language (Dennis and Barnes, 1990).
Cognitive Pragmatics and Brain Damage Neuropsychological diseases affect communicative performance in various ways, depending on which relevant cognitive subsystem is damaged. The information obtained by studying these abnormal processes provides us with an opportunity to better understand the architecture of the brain/mind and its relationship to pragmatic competence (Tirassa, 1999; Bara and Tirassa, 2000). Acquired brain damage impairs certain cognitive processes while leaving others unaffected. For example, it is well-documented in the literature that aphasic patients with left-brain damage have residual pragmatic competence despite their language impairment (see Language in the Nondominant Hemisphere). On the other hand, what different cerebral injuries have in common is a damaged capacity to deal with phenomena that require complex mental processes in order to be understood. In particular, if the tasks require more complex inferences, then this capacity seems to be more damaged than in other cases, as we will show later in this section. Results like these seem to confirm the assumption that different pragmatic phenomena require the activation of increasingly complex cognitive processes. McDonald and Pearce (1996) found that traumatic brain injured patients (TBI) do not have difficulty in the comprehension of written sincere exchanges such as [9] Mark: What a great football game!; Wayne: So you are glad I asked you?, but they have several problems, compared to the normal control subjects, in comprehending ironic exchanges such as [10] Mark: What a great football game!; Wayne: Sorry
I made you come. The authors gave the subjects the same experimental material in auditory form and found that the patients’ performance did not improve. The authors concluded that TBI patients have difficulty in comprehending irony and that, even if the tone of voice usually facilitates the comprehension of ironic remarks, it is not sufficient on its own. Furthermore, McDonald (1999) found that, surprisingly, TBI patients have no problem understanding written ironic utterances such as [11] Tom: That’s a big dog; Monica: Yes, it’s a miniature poodle. The author suggested that [11] might require a shorter inferential chain compared to [10] in order to be understood. Indeed, in comprehending [11], it is sufficient to understand what Monica answers as meaning that Tom’s statement meant the opposite of what it said. In [10], however, Wayne’s response is not only a rejection of the original comment, but an allusion to Mark’s actual reaction to the game. Thus, there were at least two necessary inferential steps in the comprehension process. Such findings are in line with the proposal that different kinds of irony may vary in their difficulty of being understood, according to the complexity of the required inferential load (Bara et al., 1999a). Particularly interesting from our perspective are studies that showed that the decay of pragmatic competence in closed head injured subjects (CHI) reflects the same type of development that is observed in normal children, i.e., the capacities acquired later in the development of the pragmatic ability are the most damaged. Using a linguistic experimental protocol, Bara et al. (1997) tested a group of CHI subjects and found that specific pragmatic tasks such as the comprehension of nonstandard communication, e.g., deceit and irony, are more difficult than tasks requiring only simple mental representations, such as the comprehension of standard communication involving only direct, conventional, and nonconventional indirect speech acts. 
In addition, the authors found no differences in patients’ comprehension of direct and conventional indirect speech acts. The same results were observed in the performance of children aged 2 to 6 years old who were tested by the same experimental protocol (Bara and Bucciarelli, 1998). It should also be noted that Bara et al. (1997) presented two classical tests on false belief to CHI patients in order to measure their theory of mind, but did not find any significant difference with the control group of children who were not brain damaged. Thus, the patients’ poor performance on pragmatic tasks cannot be ascribed to a deficit of the Theory of Mind; that is, their poor performance cannot be ascribed to an inability to understand another person’s mental states.
Moreover, Bara et al. (2000) used an extralinguistic version of the same pragmatic experimental protocol to evaluate the comprehension of standard communication, i.e., simple and complex communicative acts, and nonstandard communication, i.e., deceit and irony. This protocol consists of videotaped scenes in which the pragmatic phenomena are presented using extralinguistic means, such as pointing or clapping. The authors tested first a group of children 2–6 years of age and then a group of Alzheimer’s disease patients, and found that the children show the same tendency in the development of extralinguistic competence that was observed by Bara and Bucciarelli (1998) in the linguistic domain. In addition, the authors observed a similar tendency toward decay in the Alzheimer’s patients’ extralinguistic competence: the nonstandard extralinguistic tasks are understood less well than are the standard communicative tasks. Finally, the trend of decaying pragmatic competence in the Alzheimer’s patient group matched the results obtained with CHI patients, when tested according to the same extralinguistic protocol (Bara et al., 2001). The CHI subjects were also given several neuropsychological tests, but no statistical correlation between the subjects’ performance on the pragmatic protocol and their performance on these collateral neuropsychological tests was found. Thus, the patients’ poor performance cannot be ascribed to a deficit in their executive functioning. As already observed for the development of pragmatic linguistic and extralinguistic competence, the empirical data concerning brain-damaged subjects seem to be in favor of the existence of a unified pragmatic competence which is independent of the input, whether linguistic or extralinguistic.
That is, the comprehension of speech acts and extralinguistic communicative acts shares the most relevant mental processes when tested on different pragmatic phenomena, and the pragmatic competence seems to be independent of the expressive means used to realize it.
Cognitive Pragmatics and the Executive Function
While the literature provides empirical evidence that the mental processes involved in various pragmatic tasks can be ordered according to increasing difficulty, as we have seen above, in order to fully comprehend pragmatic competence from a cognitive perspective we also need to consider a further factor affecting the human ability to communicate: the executive functions. The Executive Function is a cognitive construct used to describe the goal-directed behaviors that are mediated by the frontal lobes. The Executive
Function guides a person’s actions and enables him to behave adaptively and flexibly; it includes cognitive capacities such as planning, inhibition of dominant responses, flexibility, and working memory. Barnes and Dennis (2001) have shown that, in addition to a deficient inferential ability, also a reduction of working memory and metacognitive skills may be invoked to explain closed-head injured children’s problems in comprehending stories. Working memory provides the necessary resources for computing inference in ongoing text comprehension; metacognitive skills are used when checking if, and when, an inference needs to be made. The authors tested children with severe to mild head injury on their ability to comprehend brief written stories, and found inferencing deficits in children with severe (but not with mild) head injury; these children had problems linking their general knowledge to the particular wording of the text. In general, when the metacognitive demands and the pressure on working memory were reduced, children with severe head injuries did not show any deficiencies in inferencing compared to the development in normal children or their mildly head-injured peers. Working memory also plays a role in explaining the poor ability to comprehend written stories that is observed in children with hydrocephalus, a neuro-developmental disorder accompanied by increased pressure of the cerebrospinal fluid on the brain tissue. Children with hydrocephalus, when compared to the control group, show increasing difficulty drawing on information from an earlier read sentence when trying to understand a new sentence, the greater the distance between the two texts. Thus, while these children do not seem to have a fundamental problem in making inferences, their poor performance is mainly due to a deficit in their working memory (Barnes et al., 2004). 
As to the role of other executive functions, Channon and Watts (2003) examined the ability of CHI patients to comprehend brief vignettes involving pragmatic judgement and the relationship between this activity and some executive functions: working memory, inhibition, and the ability to organize and plan appropriate responses in a certain context. The authors found that only the ability to solve the inhibition task, which required the subjects to inhibit dominant words and generate words that completed sentences with nonsensical endings, correlates with the pragmatic comprehension task. No association was found with the other executive skills. From a neuropsychological perspective, intact frontal lobes are critical to executive functioning, and because traumatic brain injury often results in damage to these areas, pragmatic deficits shown by these patients can be explained by a principal
Executive Function impairment. From this perspective, the deficits in planning and monitoring of behavior that are usually observed in such patients seem to explain the difficulty these subjects have in adhering to the structure of conventional discourse (McDonald and Pearce, 1998). To conclude, theoretical and empirical studies in the literature seem to suggest that in order to explain people’s pragmatic competence, it is necessary to take into account the role played by at least three elements: mental processes, namely, the inferential load and the complexity of the mental representations; the Theory of Mind; and the Executive Function. Whereas the empirical studies mainly focus on the linguistic competence that is needed to realize various pragmatic tasks, the perspective should be widened to include a methodical comparison with extralinguistic competence, in order to establish whether or not the cognitive components that make up these two different means of communication are the same in both cases. Finally, a complete theory in the cognitive pragmatic domain should be able to explain not only normal adult subjects’ ability to communicate, but also the development of this capacity and its decay in brain-damaged patients.
See also: Cognitive Grammar; Cognitive Linguistics; Cognitive Semantics; Communicative Principle and Communication; Gestures: Pragmatic Aspects; Irony; Language in the Nondominant Hemisphere; Meaning: Overview of Philosophical Theories; Metaphor: Psychological Aspects; Pragmatics: Overview; Relevance Theory; Shared Knowledge; Speech Acts; Speech Acts, Literal and Nonliteral.
Bibliography
Airenti G, Bara B G & Colombetti M (1993a). ‘Conversation and behavior games in the pragmatics of dialogue.’ Cognitive Science 17, 197–256.
Airenti G, Bara B G & Colombetti M (1993b). ‘Failures, exploitations and deceits in communication.’ Journal of Pragmatics 20, 303–326.
Bara B G (1995). Cognitive science: a developmental approach to the simulation of the mind. Hillsdale, NJ: Lawrence Erlbaum Associates.
Bara B G (2005). Cognitive pragmatics. Cambridge, MA: MIT Press.
Bara B G, Bosco F M & Bucciarelli M (1999a). ‘Simple and complex speech acts: what makes the difference within a developmental perspective.’ In Hahn M & Stoness S C (eds.) Proceedings of the XXI Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates. 55–60.
Bara B G, Bosco F M & Bucciarelli M (1999b). ‘Developmental pragmatics in normal and abnormal children.’ Brain and Language 68, 507–528.
Bara B G & Bucciarelli M (1998). ‘Language in context: the emergence of pragmatic competence.’ In Quelhas A C & Pereira F (eds.) Cognition and context. Lisbon: Instituto Superior de Psicologia Aplicada. 317–345.
Bara B G, Bucciarelli M & Geminiani G (2000). ‘Development and decay of extralinguistic communication.’ Brain and Cognition 43, 21–27.
Bara B G, Cutica I & Tirassa M (2001). ‘Neuropragmatics: extralinguistic communication after closed head injury.’ Brain and Language 77, 72–94.
Bara B G & Tirassa M (1999). ‘A mentalist framework for linguistic and extralinguistic communication.’ In Bagnara S (ed.) Proceedings of the 3rd European Conference on Cognitive Science. Roma: Istituto di Psicologia del Consiglio Nazionale delle Ricerche.
Bara B G & Tirassa M (2000). ‘Neuropragmatics: brain and communication.’ Brain and Language 71, 10–14.
Bara B G, Tirassa M & Zettin M (1997). ‘Neuropsychological constraints on formal theories of dialogue.’ Brain and Language 59, 7–49.
Barnes M A & Dennis M (2001). ‘Knowledge-based inferencing after childhood head injury.’ Brain and Language 76, 253–265.
Barnes M A, Faulkner H, Wilkinson M & Dennis M (2004). ‘Meaning construction and integration in children with hydrocephalus.’ Brain and Language 89, 47–56.
Bosco F M, Bucciarelli M & Bara B G (2004a). ‘The fundamental context categories in understanding communicative intentions.’ Journal of Pragmatics 36(3), 467–488.
Bosco F M, Sacco K, Colle L, Angeleri R, Enrici I, Bo G & Bara B G (2004b). ‘Simple and complex extralinguistic communicative acts.’ In Forbus K, Gentner D & Regier T (eds.) Proceedings of the Twenty-Sixth Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates. 44–49.
Bucciarelli M, Colle L & Bara B G (2003). ‘How children comprehend speech acts and communicative gestures.’ Journal of Pragmatics 35, 207–241.
Channon S & Watts M (2003). ‘Pragmatic language interpretation after closed head injury: relationship to executive functioning.’ Cognitive Neuropsychiatry 8, 243–260.
Dennis M & Barnes M A (1990). ‘Knowing the meaning, getting the point, bridging the gap, and carrying the message: aspects of discourse following closed head injury in childhood and adolescence.’ Brain and Language 39, 428–446.
Giora R (2003). On our mind: salience, context and figurative language. New York: Oxford University Press.
Lucariello J & Mindolovich C (1995). ‘The development of complex meta-representational reasoning: the case of situational irony.’ Cognitive Development 10, 551–576.
McDonald S & Pearce S (1996). ‘Clinical insights into pragmatic theory: frontal lobe deficits and sarcasm.’ Brain and Language 53, 81–104.
McDonald S & Pearce S (1998). ‘Requests that overcome listener reluctance: impairment associated with executive dysfunction in brain injury.’ Brain and Language 6, 88–104.
McDonald S (1999). ‘Exploring the process of inference generation in sarcasm: a review of normal and clinical studies.’ Brain and Language 68, 486–506.
Searle J R (1975). ‘Indirect speech acts.’ In Cole P & Morgan J L (eds.) Syntax and semantics, vol. 3: Speech acts. New York: Academic Press. 59–82.
Sperber D & Wilson D (1986/1995). Relevance. Oxford: Blackwell.
Sullivan K, Zaitchik D & Tager-Flusberg H (1994). ‘Preschoolers can attribute second-order beliefs.’ Developmental Psychology 30, 395–402.
Tirassa M (1999). ‘Communicative competence and the architectures of the mind/brain.’ Brain and Language 68, 419–441.
Cognitive Science and Philosophy of Language
S Scott, Washington University in St. Louis, St. Louis, MO, USA
© 2006 Elsevier Ltd. All rights reserved.
Much contemporary philosophy of language can be viewed as a synthesis of three major traditions: ideal language philosophy, ordinary language philosophy, and cognitivism. In the first three-quarters of the 20th century, philosophers in both the ordinary and ideal language traditions sought to solve or dissolve traditional philosophical problems through careful exegesis of the meanings of words and sentences. For ideal language philosophers, the project was to formally describe how words and sentences ought to be
interpreted in scientific and philosophical discourse. For ordinary language philosophers, the project was to characterize the conventions underlying the actual use of words and sentences in ordinary speech. Philosophers in both traditions made a number of lasting contributions to the philosophical and scientific study of language, but they were not just studying language for its own sake. Many philosophers in this period considered the philosophy of language to be first philosophy, the foundation on which other philosophical inquiries are built, and they had other philosophical issues in mind when developing their accounts of language (see Epistemology and Language; Metaphysics, Substitution Salva Veritate and the Slingshot Argument).
As the limitations of the ordinary and ideal language traditions became apparent and their influence began to decline, the cognitivist tradition in the scientific study of language was growing. Cognitivists view the mind as a computational and representational system and bring a wide variety of empirical evidence to bear on their investigations into the structure and processing of linguistic knowledge in the mind. The synthesis of cognitive science and philosophy of language, or as I shall call it, the new philosophy of language, integrates the formalisms of the ideal language tradition with the careful attention to the nuances of use that characterized the ordinary language tradition. But as cognitivists, many contemporary philosophers of language also take results from linguistics into account and share with other cognitive scientists a commitment to producing theories that are consistent with available psychological and neuroscientific evidence. What follows is a very brief account of the three traditions and their synthesis into the new philosophy of language, ending with a review of some recent work on proper names that exemplifies this new synthesis.
The Ideal Language Tradition
Ordinary speech is a rich source of vagueness, ambiguity, puzzles, and paradoxes, most of which go unnoticed by most speakers. This may not matter all that much for the purposes of ordinary conversation, but in scientific and philosophical discourse the imprecision of ordinary language is not to be tolerated. So said Bertrand Russell, Gottlob Frege, W. V. O. Quine, and the philosophers of the ideal language tradition (see Frege, Gottlob (1848–1925); Quine, Willard van Orman (1908–2000); Russell, Bertrand (1872–1970)). According to them, ordinary language contains certain deficiencies and the philosopher’s job is to provide remedies (Russell, 1919: 172, describes one such “deficiency” as a “disgrace to the human race”). The goal of these philosophers was to standardize and regiment language, explain away puzzles and paradoxes, and formally characterize ambiguities. Their aim was to transform ordinary language into something closer to an ideal language, one that scientists and philosophers could use to express their hypotheses about the world. The strengths and weaknesses of their approach can be illustrated using Russell’s theory of proper names.
Example: Proper Names
The idea that scientific hypotheses are about the world was key for the ideal language philosophers. Sentences in science and philosophy, not to mention ordinary conversation, often attribute properties to objects in the real world (see Objects, Properties, and Functions). Accordingly, a defining feature of ideal language philosophy was the idea that the relationship of reference is a basic unit of meaning (see Reference: Philosophical Theories), and the starting point was the analysis of simple property attribution sentences such as:
(1a) Venus is round.
(1b) Venus is a star.
Here are some basic intuitions: Sentence (1a) is true because the planet Venus has the property of being round, and sentence (1b) is false because the planet Venus does not have the property of being a star. Here is a simple analysis that respects those intuitions: In both sentences, the proper name Venus refers to an object (see Proper Names: Philosophical Aspects; Proper Names: Semantic Aspects), the remaining words is round and is a star attribute properties to that object, and the sentences refer to the propositions that Venus is round and that Venus is a star, respectively (see Propositions). This analysis is shown more formally in (2), where VENUS denotes the actual object Venus, not a word or an idea.
Simple Analysis of (1)
(2a) round(VENUS)
(2b) star(VENUS)
This analysis of simple sentences can be developed into a powerful system for characterizing the semantics of much more complex and interesting sentences. But, unfortunately, it also runs into fatal problems with certain sentences that seem just as simple as those in (1). For instance, it is not easy to see how to extend the analysis to cover:
(3) Vulcan is round.
This sentence was once thought to be true by astronomers who postulated the existence of a planet, tentatively named Vulcan, to explain the observed perturbations in Mercury’s orbit. It is now known that there is no such planet or, to put it another way, that Vulcan is an empty name (see Empty Names). So, although (3) is clearly meaningful and has a grammatical form that parallels the sentences in (1), the simple analysis will not work in this case. Recall that (1a) is true because the object referred to by the name Venus has the property of roundness. But in (3), there is no object named Vulcan and therefore nothing to which any such property can be applied. Here we have the makings of a puzzle: if reference is as basic to meaning as it appears to be, then how is it possible to say meaningful things using words that have no referents? One option is to allow that nonexistent things such as Vulcan, Santa Claus, unicorns, and so on really do have some kind of objecthood. But most philosophers would reject this option because, as Russell (1919: 169) put it, “logic . . . must no more admit a unicorn than zoology can; for logic is concerned with the real world just as truly as zoology.” Another option is to just bite the bullet and accept that (3) does not express a proposition and is therefore meaningless. Although some contemporary philosophers of language have taken this route (e.g., Adams and Stecker, 1994), the ideal language philosophers did not want to take that way out either because to do so would be to render many important scientific and philosophical hypotheses meaningless.
Russell’s Theory of Descriptions
Russell found a solution to the problem of empty names (and other philosophical puzzles) in his theory of descriptions (see Descriptions, Definite and Indefinite: Philosophical Aspects). Briefly, Russell held that names such as Vulcan and Venus do not directly refer but instead are shorthand for definite descriptions such as the planet causing perturbations in Mercury’s orbit and the second planet from the sun, respectively. That is, names are disguised definite descriptions. So, when scientists utter sentences such as those in (1) and (3), what they assert is something more like:
Step One of Russell’s Analysis of (1) and (3)
(4a) The second planet from the sun is round.
(4b) The second planet from the sun is a star.
(4c) The planet causing perturbations in Mercury’s orbit is round.
On the face of it, it looks like (4c) has the same problem as (3): descriptions such as The planet causing perturbations in Mercury’s orbit seem like they should be interpreted as namelike referring expressions. But Russell did not think so. He thought that descriptions such as these should be analyzed as general, quantificational statements about what exists in the world. In the case of (4c), the correct interpretation, according to Russell, is that there is exactly one planet causing perturbations in Mercury’s orbit and all such planets are round. This analysis is expressed in quantificational notation in (5), where pm() stands for the property of being a planet that causes perturbations in Mercury’s orbit. (Some of the inessential details have been simplified in a way that Russell might have objected to, but that does not matter for current purposes.)
Step Two of Russell’s Analysis of (3)
(5a) There exists exactly one planet x that is the cause of the perturbations in Mercury’s orbit, and it is round.
(5b) ∃x ((∀y (pm(y) ↔ y = x)) & round(x))
In this final analysis, there is no longer any element in the proposition corresponding to the name Vulcan and no role available for any referent, and thus the puzzle of empty names disappears. To recap: Names are shorthand for disguised definite descriptions, and sentences that contain definite descriptions express general propositions about the world and the things in it rather than singular propositions about particular entities.
Limitations of the Ideal Language Approach
Russell’s analysis of proper names, as clever and influential as it is, runs afoul of ordinary intuitions. Sentence (3) seems to have a very simple subject-predicate form, but the proposition in (5) that provides the meaning for (3) bears no resemblance to that form. Furthermore, (5) is false because it asserts the existence of something that does not exist (i.e., it asserts the existence of a planet that causes perturbations in Mercury’s orbit, but there is no such planet). But it is not clear to everybody that (3) really is false (see Strawson, 1950; and the reply by Russell, 1957). To many people, questions such as Is Vulcan round? have the same kind of problem as questions such as Have you stopped cheating on exams yet?: to answer either “yes” or “no” would be to accept a problematic premise. Russell was not driven to this analysis of simple sentences as an attempt to characterize how ordinary speech works but as an attempt to dissolve an apparent logico-scientific puzzle that arises when we take the referential commitments of ordinary speech seriously. But the analysis ends up providing no account of the fact that people seem quite capable of making what appear to be true claims about nonexistent things.
(6a) Santa Claus usually wears a red suit.
(6b) Pegasus looks a lot like a horse.
Russell’s theory of disguised definite descriptions makes the sentences in (6) come out false, contrary to most people’s intuitions. His theory preserves the apparent meaningfulness of these sentences, and does so without maintaining any problematic commitments to entities such as Pegasus and Santa Claus, but at the price of a theory that may not have much to say about their ordinary use.
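The truth conditions that Russell’s analysis assigns can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the toy domain of celestial bodies, the Body record, and the helper named the are hypothetical names introduced here and are not part of Russell’s apparatus. The helper evaluates the quantificational pattern of (5b) over a finite domain: a sentence of the form "the D is P" counts as true just in case exactly one object satisfies the description D and that object also satisfies P.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Body:
    """A hypothetical record for one object in the toy domain."""
    name: str
    kind: str               # "planet" or "star"
    orbit_rank: int         # position counted outward from the sun (0 for the sun)
    is_round: bool
    perturbs_mercury: bool


# Assumed toy domain: it deliberately contains no planet that perturbs
# Mercury's orbit, mirroring the fact that Vulcan does not exist.
DOMAIN = (
    Body("Sun", "star", 0, True, False),
    Body("Mercury", "planet", 1, True, False),
    Body("Venus", "planet", 2, True, False),
    Body("Earth", "planet", 3, True, False),
)

Predicate = Callable[[Body], bool]


def the(description: Predicate, predicate: Predicate,
        domain: Iterable[Body] = DOMAIN) -> bool:
    """Evaluate 'the D is P' in the manner of (5b):
    there is an x such that (for all y, D(y) iff y = x) and P(x)."""
    matches = [x for x in domain if description(x)]
    # Exactly one object may satisfy D, and that object must satisfy P.
    return len(matches) == 1 and predicate(matches[0])


# Descriptions standing in for the names Venus, Vulcan, and Santa Claus.
second_planet = lambda b: b.kind == "planet" and b.orbit_rank == 2
pm = lambda b: b.kind == "planet" and b.perturbs_mercury
santa = lambda b: b.name == "Santa Claus"

# Properties attributed by the predicates of the example sentences.
is_round = lambda b: b.is_round
is_star = lambda b: b.kind == "star"

print(the(second_planet, is_round))   # (4a)
print(the(second_planet, is_star))    # (4b)
print(the(pm, is_round))              # (4c)
# Even a predicate true of everything cannot rescue an empty description.
print(the(santa, lambda b: True))
```

On this toy domain, (4a) comes out true, while (4b) and (4c) come out false. Sentence (4c) fails for a different reason than (4b): nothing satisfies the description at all, so the existence claim is false whatever the predicate is. The same holds for the last call, which is why sentences such as those in (6) come out false on Russell’s account.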
Cognitive Science and Philosophy of Language 555
The Ordinary Language Tradition

As vague, ambiguous, and rife with semantic puzzles as ordinary language is, it also contains a wealth of information that philosophers cannot afford to ignore. In order to discover anything meaningful about important philosophical topics such as Truth, Knowledge, and Justice, philosophers need to know what truth, knowledge, justice, and other related words actually mean in ordinary language. This was the perspective of Gilbert Ryle, H. P. Grice, J. L. Austin, P. F. Strawson, Ludwig Wittgenstein (in his later works), and the philosophers of the ordinary language tradition (see Austin, John Langshaw (1911–1960); Grice, Herbert Paul (1913–1988); Strawson, Peter Frederick (b. 1919)). According to them, philosophers must pay careful attention to the nuances of ordinary language use and must be particularly wary of misusing ordinary language expressions in their philosophical theories. In many ways, this tradition was radically opposed to the ideal language tradition: whereas the ideal language project was a prescriptive project, concerned with legislating how language ought to be understood, the ordinary language approach was purely descriptive, concerned with the investigation of how language is actually used; whereas ideal language philosophers sought to construct a theory of meaning based on reference to things in the world, ordinary language philosophers sought to construct a theory of meaning based on conventions of ordinary use (see Use Theories of Meaning). But despite these differences, both traditions shared a common motivation, namely, the analysis of language in order to help solve or dissolve philosophical problems. It is just that in pursuing this common aim, ideal language philosophers were busy constructing a new language while ordinary language philosophers were busy pointing out how philosophers tended to misuse the old one.

Example: Ryle on Free Will
Ordinary language philosophers thought that the meaning of an expression is the conventions governing its use. Thus, to get at the meaning of an expression, we have to examine how it is ordinarily used. The standard technique is to assemble a list of sentences containing a given expression and then try to find conditions under which it would be inappropriate or nonsensical to use those sentences. Whatever those conditions turn out to be, their negation must be part of the meaning of the word in question. (Notice that this makes short work of the puzzle of empty names. Because the meaning of a word is the conventions governing its use, names can have meaning whether they have a referent or not.)
As an example of ordinary language analysis in action, consider Ryle’s (1949) investigation of the word voluntary. Ryle noted that philosophers often characterize free will (another important philosophical topic) based on the distinction between voluntary and involuntary actions – free will is said to be involved in an action when it is performed voluntarily and not when it is performed involuntarily. So voluntary (along with grammatical variants such as voluntarily) is an important word in philosophy, but what does it actually mean in ordinary language? Consider the following sentences:

(7a) Benazir went to school voluntarily.
(7b) Hussein ate the sandwich voluntarily.
(7c) Ahmad watched Seinfeld voluntarily.
As Ryle observed, such uses of voluntary and its grammatical variants seem odd or wrong in any situation in which there is no reason to believe that the person in question ought not to have performed the action. So if Benazir has been banned from campus or hates school or is supposed to be doing something else, then (7a) might make sense. But if there is no reason to suppose anything like that, then the word voluntarily should be left out. Ditto for (7b) and (7c). From these sorts of considerations, Ryle concluded, part of the meaning of the word voluntary must include the condition that it can only be used in the description of an action that for some reason ought not to have been performed. To bring this back to the philosophical problem of free will, Ryle noted that philosophers who worry about what it could mean to eat a sandwich or watch Seinfeld voluntarily, absent any kind of context, are systematically misusing ordinary English. As he put it, they engage in an “unwitting extension of the ordinary sense of ‘voluntary’ and ‘involuntary’” (Ryle, 1949: 69). The conclusion that Ryle drew from these and other considerations was that there is no problem of free will. The appearance of the problem arises only when philosophers misuse ordinary language words such as voluntary. The whole problem just dissolves under ordinary language analysis.

Limitations of the Ordinary Language Approach
The ordinary language philosophers were less inclined to use formalisms for characterizing the meanings of words or sentences, and their analyses produced accounts of word or sentence meaning that tended to be less rigorous than those produced by philosophers working in the ideal language tradition. Furthermore, the use theories of meaning pursued by ordinary language philosophers had little to say about the relationship between language and reality,
and were thereby limited in their ability to account for reference and truth conditions, whether in scientific, philosophical, or ordinary discourse. The ordinary language philosophers demonstrated many of the important and subtle ways in which philosophically interesting words are employed in ordinary language, but they did so at the price of having neither a systematic, precise account of meaning nor a theory of the relationship between language and the world.

The ordinary language tradition ultimately met its demise at the hands of its own adherents. In his 1967 lectures on ‘Logic and Conversation,’ Grice (1989) gave a strong voice to many philosophers’ growing misgivings about the project. He argued for a sharp distinction between what is said by a speaker on a particular occasion and what the speaker might have meant by what was said. For Grice, what is said is the literal, truth-evaluable, relatively invariant portion of meaning. To use one of his examples, suppose Alyssa happens upon Cliff, who has run out of gas on the highway, and utters:

(8) There’s a gas station around the corner.
What Alyssa has said, in Grice’s sense (literally expressed, truth-conditional meaning), is the proposition that around the indicated corner is a gas station. Alyssa said nothing further about whether the gas station is open, has gas to sell, and so on. But assuming she is sincerely trying to help Cliff out, it will be inappropriate for her to use that sentence unless she believes that the gas station is open and has gas to sell. Based on this latter observation, an ordinary language philosopher might be tempted to conclude that these further conditions are part of the meaning of (8). But that, Grice argues, is a mistake. Grice’s alternative is that the further propositional content about the gas station being open and having gas to sell is not part of the literal meaning of (8), but is what he called a conversational implicature (see Implicature). This conversational implicature is part of what Alyssa means to communicate with (8), but she expects Cliff to be able to pick up on it without requiring her to state it explicitly. The details of how Cliff might do that are beyond the scope of the current discussion (see Grice, 1989; Sperber and Wilson, 1995), but to get a sense of the reasonableness of the distinction between what is said and what is conversationally implicated, consider how Alyssa could have tacked an extra clause onto (8) to take back both what she implicated and what she said.

Clauses That Cancel Implicatures
(9a) There’s a gas station around the corner, but it’s not open.
(9b) There’s a gas station around the corner, but it’s out of gas.
The sentences in (9) both have (8) embedded in them, and the fact that they do not seem contradictory indicates that the material in the final clause must not be opposed to any part of the meaning of (8). Now suppose Alyssa had instead uttered one of the sentences in (10).

Clauses That Contradict What Is Said
(10a) There’s a gas station around the corner, but it’s not a gas station.
(10b) There’s a gas station around the corner, but it’s not around the corner.
The fact that these sentences are clearly contradictory indicates that the added clauses must be opposed to some part of the literal meaning of (8). So there is strong intuitive support for the distinction between what Alyssa has said, as shown by the contradictory clauses in (10), and what she conversationally has implicated, as shown by the noncontradictory clauses in (9). On the basis of this distinction, Grice argued for caution when moving from facts about how words are used to facts about the meanings of those words. It would have been inappropriate for Alyssa to utter (8) if she thought the gas station was closed, but that does not tell us anything about what (8) means. Evidence about use can, in principle, indicate something about the literal meaning of words and sentences, but not always in such a simple way. Ryle, in particular, was probably wrong to jump from facts about the use of the word voluntary to facts about its meaning (and then to the denial of the problem of free will). Grice thought that ordinary language analysis could still be useful but that philosophers needed to pay more attention to separating what an expression can be used to communicate from what that expression actually means in the language – a project that turns out to be exceedingly difficult (see Semantics–Pragmatics Boundary).
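The cancellation test applied to (9) and (10) can be mimicked in a few lines of Python. This is only a toy sketch: the Utterance class, the classify_added_clause function, and the way the propositions are spelled out as strings are all assumptions made for illustration, not part of Grice's own apparatus, and nothing here explains how a hearer actually works out an implicature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """Toy model: literal content kept apart from defeasible implicatures."""
    sentence: str
    said: frozenset        # propositions literally expressed (hypothetical labels)
    implicated: frozenset  # propositions conversationally implicated


def classify_added_clause(utterance: Utterance, denied: str) -> str:
    """Classify a follow-up clause of the form 'but not <denied>'."""
    if denied in utterance.said:
        # Denying literal content clashes with what was said, as in (10).
        return "contradiction"
    if denied in utterance.implicated:
        # Denying an implicature merely takes it back, as in (9).
        return "cancelled implicature"
    return "unrelated"


gas_station = Utterance(
    sentence="There's a gas station around the corner.",
    said=frozenset({"it is a gas station", "it is around the corner"}),
    implicated=frozenset({"it is open", "it has gas to sell"}),
)

print(classify_added_clause(gas_station, "it is open"))           # like (9a)
print(classify_added_clause(gas_station, "it is a gas station"))  # like (10a)
```

The design choice worth noticing is that the two sets are stored separately. An analysis that merged them into a single "meaning" could not tell the sentences in (9) from those in (10), which is the worry Grice raised about moving directly from facts about use to facts about meaning.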
The Cognitivist Tradition

Language is a fascinating topic of study in its own right, regardless of its role in helping philosophers do their work. It is now clear that the production of even very simple speech behaviors is far more complex than was once thought, and working out how linguistic knowledge is structured and processed in the human mind should be a central goal in the scientific study of language. That is what linguists working in the cognitivist tradition tend to think. According to them, the goal of linguistic inquiry is not primarily to account for reference and truth or to characterize conventions of use but rather to find out what it is
about the human mind that makes language what it is. Cognitivism is actually a cross-disciplinary tradition concerned with the study of the human mind in general, not just language. Leading figures in the birth and early development of the cognitivist tradition included computer scientists (e.g., Marvin Minsky), psychologists (e.g., George Miller), linguists (e.g., Noam Chomsky; see Chomsky, Noam (b. 1928)), and philosophers (e.g., Hilary Putnam, Jerry Fodor, Daniel Dennett; see Fodor, Jerry (b. 1935)). There are four features that, taken together, loosely define the cognitivist approach to the study of mind and language: (1) an adherence to computational and representational theories of mind, (2) a rejection of most forms of behaviorism, (3) an openness to empirical evidence from a wide variety of sources, and (4) a tendency toward identifying linguistic meanings with mental states rather than with things in the world or patterns of ordinary use. Each of these aspects is discussed next.

Computational and Representational Theories of Mind
Cognitivists model the mind/brain as an information processing system that performs computations on structured representations of the world. In other words, the mind/brain is a kind of computer, analogous in many ways to a digital computer. Many people find this claim jarring at first, but actually it is quite natural to suppose that, at least in some circumstances, people use computers to do their thinking for them. Whenever an accountant uses a spreadsheet to prepare tax forms, a pilot flies using an automatic guidance system, or a librarian searches an electronic catalog, computers are being used to perform tasks that would require mental effort if performed by human beings. When people use a computer to perform a task, they avoid some of the thinking that would have been required if they had performed the task unaided. Digital computers accomplish their apparently mental feats by executing algorithms that manipulate data structures. An algorithm is a set of fully explicit, step-by-step instructions for accomplishing a given task, and a data structure is a package of information about some aspect of the world. For example, a data structure might contain information about a social hierarchy, the layout of a city, or the structure of a sentence. Algorithms contain instructions for how to use those data structures to decide, for example, who to approach for a loan, how to get from downtown to the suburbs, or what a speaker might mean by uttering a particular sentence. Cognitivists claim that human thought consists of computational
processes (analogous to algorithms) that operate on mental representations of the external world (analogous to data structures), although there remains much debate over the nature of those processes and representations. Like a digital computer, the mind/brain can be analyzed at a number of different levels (Dawson, 1998; Marr, 1982). At the physical level, digital computers are instantiated in electronic circuitry and minds are instantiated in brains. By investigating the brain, we can figure out what kinds of mental representations and computational processes it supports and what parts of it may or may not be involved in language. At the algorithmic level, digital computers run programs that specify the details of their behavior. The bold conjecture of cognitive science is that minds are the programs that run on the physical circuitry of the brain. By performing psychological experiments, we can shed light on how linguistic knowledge is represented in the mind and what computational processes are involved in using that knowledge (see Psycholinguistics: Overview). Finally, there is the task level. The programs that digital computers run can only be made sense of in light of knowledge about their connections to the world and the tasks they were designed to solve. Similarly, in order to understand how the mind uses language, it is necessary to have a theory of what language is and what knowledge is involved in language use. These three levels of analysis thus define a multidisciplinary program of research into the nature of human language, with different research questions posed at each level (see Table 1). 
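The distinction drawn above between a data structure and an algorithm can be made concrete with a minimal Python sketch built on the text's own example of getting from downtown to the suburbs. The city layout, the place names, and the function find_route are hypothetical, and breadth-first search is simply one convenient algorithm chosen for illustration; nothing here is a claim about how minds actually represent or search spatial knowledge.

```python
from collections import deque

# Data structure: a package of information about some aspect of the world,
# here a made-up city layout recording which districts adjoin which.
CITY = {
    "downtown": ["midtown", "harbor"],
    "midtown": ["downtown", "university", "suburbs"],
    "harbor": ["downtown", "university"],
    "university": ["midtown", "harbor"],
    "suburbs": ["midtown"],
}


def find_route(layout, start, goal):
    """Algorithm: fully explicit steps (breadth-first search) that use the
    data structure to find a route with the fewest hops, or None."""
    queue = deque([[start]])   # partial routes still to be extended
    visited = {start}          # districts already reached
    while queue:
        path = queue.popleft()
        here = path[-1]
        if here == goal:
            return path
        for neighbour in layout.get(here, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None                # goal cannot be reached from start


print(find_route(CITY, "downtown", "suburbs"))
```

The same dictionary could be paired with a different algorithm (for instance, one that minimizes travel time instead of hops), and the same algorithm could be run over a different layout. That separability is what the cognitivist analogy relies on when it distinguishes mental representations from the computational processes that operate on them.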
Cognitivist linguists focus most of their attention on the algorithmic and task levels, concentrating on the difficult problems of identifying the knowledge required to produce well-formed grammatical utterances, determining how that knowledge must be represented in the minds of the speakers, and identifying which elements of that knowledge are learned and which are innate (see Innate Knowledge). But as cognitivists, they remain open to, and sometimes make use of, evidence from the physical level as well.

Table 1 Three-level research program
Level         Questions
Task          How are natural languages structured? What must people know and what must they know how to do in order to produce and understand human speech?
Algorithmic   How is knowledge of language represented in the mind? What computational processes are involved in producing and understanding speech?
Physical      How are these representations and computational processes implemented in the hardware of the brain?
Table 2 Sources of evidence for the three levels
Level         Example sources of evidence
Task          Judgments of native speakers: Which strings of words are grammatical and which are not? What meanings can a sentence have and not have?
Algorithmic   Developmental psychology: How do children acquire language? What are the common patterns of language development?
              Cognitive psychology: How do adults react to linguistic stimuli under controlled conditions?
Physical      Clinical studies: What kinds of brain injuries and diseases cause language deficits? What specific language deficits are caused by specific brain injuries and diseases?
              Anatomical and functional studies: What parts of the brain are involved in language use? How are these parts interconnected?
The Rejection of Linguistic Behaviorism
Prior to the establishment of the cognitivist tradition in the 1960s and 1970s, the dominant approach to the study of the mind and language was behaviorism. Many philosophers at the time endorsed or were influenced by this approach, including prominent representatives of both the ideal language and ordinary language traditions. Behaviorism comes in a number of varieties (see Behaviorism: Varieties), but what all behaviorists agree on is a rejection of internal mental states as something that can be scientifically studied or appealed to in explanations of language and behavior. For psychologists such as B. F. Skinner, this meant that linguistic behavior was to be explained as a complex pattern of responses to environmental stimuli. Verbal responses were thought of as being under the control of certain stimuli in the environment (Skinner, 1957). Skinner’s view of language was subjected to ruthless criticism from Chomsky, who pointed out the complexity of linguistic behavior and the wide variety of possible responses to a given stimulus:

A typical example of stimulus control for Skinner would be the response to . . . a painting with [the utterance] Dutch. . . Suppose instead of saying Dutch we had said, Clashes with the wallpaper, I thought you liked abstract work, Never saw it before, Tilted, Hanging too low, Beautiful, Hideous, Remember our camping trip last summer? (Chomsky, 1959, p. 31)
Once the nonstimulus-bound nature of linguistic behavior is fully appreciated, said Chomsky, the prospect of arriving at an account of linguistic behavior without involving an appeal to mental states is completely hopeless. Cognitivism pointed the way out of behaviorism by providing a method of formally characterizing those mental states.

The Open Evidence Base
The cognitivist tradition is an empirical tradition. The sources of evidence available to the linguist include the judgments of native speakers, the process of first-language acquisition, the controlled psychological study of speech production and comprehension, the study of acquired and genetic language deficits, and the study of the neurological features of language use in healthy adults, to name but a few. These sources of evidence can be used to investigate language at the task, algorithmic, and physical levels (see Table 2). This is not to say that it is the current practice of linguists to make use of all of these sources of evidence. Indeed much work in theoretical linguistics proceeds using only the grammaticality judgments of the linguists themselves. But there is a general commitment both to the idea that a complete theory of language has to be consistent with all these sources of evidence and to the idea that the evidence base for linguistics is open – that is, there are no principled limits on the kinds of evidence that might bear on the structure of linguistic knowledge.

The commitment to an open evidence base has important consequences. For behaviorists, the study of language had to be grounded in observable behavior only. As Quine (1960) pointed out, it turns out that this leads to the conclusion that linguistic knowledge and meaning cannot be unambiguously determined. From this, he drew the conclusion that there is simply no fact of the matter about how to specify the mapping from words and sentences to their meanings (see Indeterminacy, Semantic). A famous response to Quine, again from Chomsky (1969), is based on the notion of the open evidence base. According to Chomsky, Quine reached his radical conclusions about semantic indeterminacy by accepting in advance the behaviorist notion that only observable behavior and responses to environmental stimuli may be used as the data for theories of linguistic meaning. But, as Chomsky points out, no other science places such a priori limits on the kinds of evidence that can be used to decide between competing theories. As long as the evidence base in linguistics remains open, the possibility of discovering further evidence that will help determine linguistic meaning is open as well.
Meanings as Mental States
The establishment of a viable theory about mental states and mental processing opened the door to a new class of theories of linguistic meaning based on the pairing of words in the public language with mental states of speakers. The general idea of a mental state theory of meaning is at least as old as Aristotle (see Aristotle and Linguistics), but the computational and representational theory of mind gave it new life by providing a story about what mental states might be like and how they might be processed in the mind. In addition to endorsing a mental state account of meaning, some cognitivists also harbor a deep mistrust of the reference-based theories pursued in the ideal language tradition. The semanticist Ray Jackendoff (2002) argues that the only kind of reference a cognitivist theory of language can countenance is reference to other mental states (see Jackendoff, Ray S. (b. 1945)), whereas Chomsky (2000) suggests that reference, as originally construed by ideal language philosophers, is not a suitable topic for scientific inquiry at all. Jerry Fodor (1975) has proposed that words and sentences come by their meaning through being paired with internally represented formulae in what he calls the Language of Thought (see Language of Thought), or Mentalese (see Mentalese). Mentalese is not a public language such as English. It is more like a computer language – a formal system with a combinatorial syntax and an expressive power that equals or surpasses that of a public language. Fodor proposes that words and sentences express mental states, but, unlike Chomsky and Jackendoff, he takes the further step of attempting to scientifically characterize the meanings of expressions in Mentalese as relationships to objects and properties in the external world (see Representation in Language and Mind; Causal Theories of Reference and Meaning). 
Fodor’s theory of meaning thus has two parts: (1) words inherit their meanings from the mental states they express, and (2) most of those mental states get their meanings through reference to the external world. An important alternative cognitivist account of meaning as mental states is offered by connectionism, although a full discussion of that approach is beyond the scope of this article (see Human Language Processing: Connectionist Models).

The Limitations of Cognitive Science
It is not yet clear how far cognitive science can go, and there are philosophers who dispute the claim that studying the structure and processing of linguistic knowledge in the human mind can tell us much about the nature of language itself (see Barber, 2003). But the computational and representational theory of mind, as a working hypothesis, has given rise to a productive research program producing theories of mind and language rich and predictive enough that, at the very least, they should not be ignored. The cognitivist approach to the study of mind and language is widely regarded by philosophers as the only approach currently worth taking seriously.
The New Philosophy of Language

The new philosophy of language emerged in the 1970s as a synthesis of the ideal language, ordinary language, and cognitivist traditions. From the ideal language tradition comes the use of rigorous formalisms and a concern for the connection between language and reality. From the ordinary language tradition comes the descriptive nature of the project and careful attention to the nuances of ordinary use, as well as Grice’s distinction between what is said and what is implicated by an utterance. And from the cognitivist tradition comes an adherence to computational and representational theories of the mind, a rejection of linguistic behaviorism, an attention to the mental states of the language user, and a concern with making semantic and pragmatic theories consistent with the relevant empirical results concerning language and the mind.

The boundaries between linguistics and the philosophy of language have become blurred in this new synthesis. Whereas phonology (the sounds of language), morphology (the structure of words), and syntax (the structure of sentences) remain a concern mostly of linguists, semantics (the meaning of language) and pragmatics (the communicative use of language) are studied by both linguists and philosophers. There has also been considerable cross-fertilization between linguistics and philosophy. Linguists have adopted the formalisms of the ideal language tradition and the Gricean view of the relation between semantics and pragmatics that arose out of the ordinary language tradition. Philosophers, on the other hand, have adopted the linguistic account of syntax and feel an obligation to relate the semantic interpretation of a sentence to its syntactic form.
In addition, the cognitivist approach to linguistics throws up a host of difficult conceptual issues that demand a rigorous philosophical treatment (see Philosophy of Linguistics), for example, the place of reference in semantic theory (see Externalism about Content), the nature of linguistic knowledge (see Innate Knowledge; Tacit Knowledge), and the connection between language and thought (see Thought and Language: Philosophical Aspects).
Two More Theories of Proper Names
How might a practitioner of the new philosophy of language tackle a traditional semantic problem such as the content of proper names? Two theories of proper names are presented by Tyler Burge (1973) and Larson and Segal (1995). These two theories agree with one another in many important respects – so much so that we might be tempted to suppose that they are merely variants of one another. But, as Gabriel Segal (2001) points out, there are a number of pieces of relevant evidence from the task, algorithmic, and physical levels of cognitive analysis that may be used to adjudicate between the theories. (A caution: The semantic issue is actually more technical than the following discussion suggests, concerning points of difference between semanticists working in the formal framework of truth-theoretic semantics. Because there is no room to introduce the details of that framework here, the accounts of the rival theories are somewhat sketchy, although, I hope, detailed enough to make it clear how empirical evidence can be used to decide between them.)

Burge’s approach to proper names is a variation on Russell’s disguised definite descriptions. Burge proposes that proper names are actually a kind of common noun, that is, words such as table and cat that encode properties that apply to large numbers of objects. In Burge’s account, if we have a cat named Sylvester, then that object has both the property of being a cat (a property it shares with other cats) and the property of being a Sylvester (a property it shares with other Sylvesters). In defense of this idea, Burge points out that, like common nouns, names can be pluralized and paired with determiners such as the and a:

(11a) There are very few Sylvesters in the world.
(11b) There were three Madelines at the party.
(11c) There’s a Bartholomew Kropotnik here to see you.
(11d) The Jessica I met today was a real jerk.
This idea encounters an immediate difficulty. Burge says that names are common nouns, even when they occur unmodified and on their own:

(12) Fido wants to chase Sylvester.

But other common nouns cannot be used that way in English:

(13) *Dog wants to chase cat.
Sentence (13) only works if we interpret dog and cat as unusual names rather than as common nouns. So proper names seem to be unlike common nouns in at least this respect. Burge resolves the discrepancy by suggesting that bare, unmodified names actually have hidden determiners attached. A name such as Fido, when used on its own, is, unbeknown to the speaker, actually the phrase That Fido or The Fido in disguise.

The rival view is Segal’s contention that proper names are not common nouns but instead are a special kind of word, paired in each speaker’s mind with a special kind of mental representation – an individual concept. These individual concepts are mental representations that encode information about the individuals named. So the name David Bowie is paired with an individual concept of David Bowie, perhaps containing the information that he sings, plays the saxophone, is married to a runway model, has probably had plastic surgery, and so on. Names, in Segal’s account, are not at all like common nouns, which encode predicates that can apply to more than one person. Rather, they are labels that attach to conceptual information about particular individuals. There are not many David Bowies sharing one name. Rather, there are potentially many names David Bowie, each linked to a different individual concept.

Empirical Evidence
It might seem that in the end, the differences between the two theories do not amount to much. Burge says that the name Fido can be applied to anything that is a Fido, whereas Segal says that it only applies to one individual and that the reason why there seem to be so many Fidos is that there are many names for distinct individuals that happen to sound the same (call these names Fido1, Fido2, etc.). Is there any real difference between these two theories? A behaviorist such as Quine might be inclined to think that, as long as each can be integrated into a larger theory of equal power in predicting linguistic behavior, then there is no fact of the matter about which is correct. But a cognitivist would rather suppose that there is a way to tell how the language system works, reflected in the biology and psychology of language, and that at most only one of the two suggestions can be correct. And it seems, at least at first glance, that the evidence from the task, algorithmic, and physical levels supports Segal’s theory over Burge’s. At the task level, cognitivists consult the intuitions of native speakers to determine the characteristics of the language that they speak. In the case of proper names, the two theories under consideration make different predictions about the syntax of English. Burge’s theory predicts that bare names actually have a hidden determiner word attached to them. But this view has some trouble accounting for common intuitions about how names and common nouns
can be used. For example, why is it that determiners can go unpronounced when attached to names, but not when attached to common nouns, as shown by (13)? And why is it that sometimes special contexts are required to insert the determiner in front of a name? For example, to the question “Where do you live?” the response in (14a) seems natural whereas (14b) sounds awful. If Saint Louis is really short for a phrase such as that Saint Louis, then why can we not say (14b)?

(14a) I live in Saint Louis.
(14b) *I live in that Saint Louis.
At the algorithmic level, cognitivists look at psychological evidence regarding how linguistic knowledge is represented and processed. Again, the two theories make different predictions about the psychology of names. Burge predicts that names that sound the same are the same name, whereas Segal predicts that each individual’s name is distinct. If Segal is right, there should be evidence that people tend to expect identical-sounding names to apply only to a single individual. Again, there is some evidence that supports Segal’s prediction. It seems that children learning English as a first language expect there to be a class of nouns that refer to only one thing and make use of syntactic clues such as the presence or absence of determiners to decide whether to apply new words to other objects or not. For example, when told that a novel object is wuzzle (with no determiner), children are reluctant to apply the new word to other novel objects, even when they are highly similar to the original. But when told that the novel object is a wuzzle, they will happily generalize the term to other objects that seem to share some salient properties with the original – just like ordinary common nouns. Burge’s theory also predicts that names are not a special kind of noun, whereas Segal predicts that they are. If Segal is right, we should expect to find psychological differences between names and common nouns. We might also expect some physical-level differences. (Recall that at the physical level, cognitivists look to neurological evidence for or against the kinds of representation and processing they propose in their algorithmic-level theories.) Again, the evidence seems to support Segal’s view over Burge’s. As previously noted, children seem to be prewired to look for names as well as common nouns. In addition, psychological studies on adults reveal that proper names are much harder to recall than common nouns, suggesting distinct storage and/or processing. 
And at the physical level, certain kinds of brain damage can cause people to lose their ability to use proper names while leaving their ability to use common nouns intact, and vice versa (see Aphasia Syndromes). This strongly suggests
that names are a special kind of word stored in a separate area of the brain. In fact, things are not as bad as all that for Burge’s theory. Segal (2001), in his much more complete and sober account, correctly points out that the psychological and neurological evidence is still quite sketchy and open to interpretation. It is quite possible that a committed Burgian could find a way to keep the common noun theory of names alive. The main point of this example has been to show how, in principle, multidisciplinary evidence from all three levels of cognitive analysis can bear on an issue in semantics. Whereas a behaviorist might be content with two theories that are equally good at describing some aspect of linguistic behavior, the new philosopher of language looks deeper to try and find out which theory does a better job of accounting for the cognitive aspects of language. Final Words
The work on proper names reviewed here nicely illustrates the main features of the new philosophy of language. Burge and Segal’s truth-theoretic approach to semantics is as rigorously formal as any theory in the ideal language tradition; the attention to ordinary speaker intuitions in mediating between semantic theories echoes the approach of the ordinary language philosophers; the mentalistic nature of the theory, the formal, computational nature of truth theories, and the openness to evidence from all levels of cognitive analysis clearly place the work in the cognitivist tradition. But is this new hybrid approach really philosophy of language, or is it just a branch of linguistics or psychology? There are still those who hold out the hope that analysis of language will eventually help with the resolution of issues in other branches of philosophy, even if only in providing a starting point, and most contemporary philosophers of language remain sensitive to the philosophical puzzles and paradoxes that drove the ideal and ordinary language philosophers. Indeed, one of the selling points of both Burge’s and Segal’s theories of proper names is that they can account for the meanings of empty names. But heeding Grice’s lesson about the difficulties of determining what is said and heeding the lessons from contemporary linguistics about the complexities of ordinary language, few still believe any philosophical problem will be solved or dissolved with just a little bit of armchair reflection on conventions of use. The new philosophy of language promises progress on some of the difficult traditional problems in philosophy of language (and perhaps on more general philosophical problems) by combining careful
conceptual analysis with detailed attention to empirical results from the scientific study of language, the mind, and the brain. See also: Aphasia Syndromes; Aristotle and Linguistics; Austin, John Langshaw (1911–1960); Behaviorism: Varieties; Causal Theories of Reference and Meaning; Chomsky, Noam (b. 1928); Cognitive Science: Overview; Congo, Republic of: Language Situation; Descriptions, Definite and Indefinite: Philosophical Aspects; Empty Names; Epistemology and Language; Externalism about Content; Fodor, Jerry (b. 1935); Frege, Gottlob (1848–1925); Grice, Herbert Paul (1913–1988); Human Language Processing: Symbolic Models; Implicature; Indeterminacy, Semantic; Innate Knowledge; Jackendoff, Ray S. (b. 1945); Language of Thought; Mentalese; Metaphysics, Substitution Salva Veritate and the Slingshot Argument; Objects, Properties, and Functions; Proper Names: Philosophical Aspects; Proper Names: Semantic Aspects; Propositions; Psycholinguistics: Overview; Quine, Willard van Orman (1908–2000); Reference: Philosophical Theories; Representation in Language and Mind; Russell, Bertrand (1872–1970); Semantics–Pragmatics Boundary; Strawson, Peter Frederick (b. 1919); Tacit Knowledge; Thought and Language: Philosophical Aspects; Use Theories of Meaning.
Bibliography Adams F & Stecker R (1994). ‘Vacuous singular terms.’ Mind and Language 9(4), 387–401. Barber A (ed.) (2003). The epistemology of language. Oxford, UK: Oxford University Press. Burge T (1973). ‘Reference and proper names.’ Journal of Philosophy 70(14), 425–439. Chomsky N (1959). ‘A review of B F Skinner’s Verbal Behavior.’ Language 35(1), 26–58. Chomsky N (1969). ‘Quine’s empirical assumptions.’ In Davidson D & Hintikka J (eds.) Words and objections: Essays on the work of W. V. Quine. Dordrecht: D. Reidel. 53–68.
Chomsky N (2000). New horizons in the study of mind and language. Cambridge, UK: Cambridge University Press. Dawson M R W (1998). Understanding cognitive science. Malden, MA: Blackwell. Fodor J A (1975). The language of thought. Cambridge, MA: Harvard University Press. Fodor J A (1990). A theory of content and other essays. Cambridge, MA: MIT Press. Gazzaniga M S, Ivry R B & Mangun G R (2002). Cognitive neuroscience: The biology of the mind, second edition. New York: W. W. Norton. Grice H P (1989). Studies in the way of words. Cambridge, MA: Harvard University Press. Jackendoff R (2002). Foundations of language: Brain, meaning, grammar, evolution. Oxford: Oxford University Press. Larson R K & Segal G (1995). Knowledge of meaning: An introduction to semantic theory. Cambridge, MA: MIT Press. Marr D (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco, CA: W. H. Freeman. Quine W V O (1960). Word and object. Cambridge, MA: MIT Press. Russell B (1919). Introduction to mathematical philosophy. London: George Allen & Unwin. Russell B (1957). ‘Mr. Strawson on referring.’ Mind 66, 385–389. Ryle G (1949). The concept of mind. New York: Barnes & Noble. Segal G (2001). ‘Two theories of proper names.’ Mind and Language 16(5), 547–563. Skinner B F (1957). Verbal behavior. New York: Appleton-Century-Crofts. Soames S (2003). Philosophical analysis in the 20th century (vols 1–2). Princeton, NJ: Princeton University Press. Sperber D & Wilson D (1995). Relevance: Communication and cognition. Cambridge, MA: Blackwell. Strawson P F (1950). ‘On referring.’ Mind 59, 320–344. Trask R L (1999). Language: The basics (2nd edn.). New York: Routledge. Valentine T, Brennan T & Brédart S (1996). The cognitive psychology of proper names: The importance of being earnest. New York: Routledge.
Cognitive Science: Overview J Oberlander, University of Edinburgh, Edinburgh, UK © 2006 Elsevier Ltd. All rights reserved.
Introduction Cognitive science is the interdisciplinary scientific study of the mind. Many questions therefore fall within its scope. For instance, how do people perceive the world through their senses? How do they manage
to act in a timely fashion in a changing world? How do they solve novel problems? How do they manage to learn new skills? And how do they understand one another? In addressing these questions, most researchers assume that the human mind is some kind of computational device, containing representations. Modeling human language capacities has been a central goal within cognitive science; relevant research draws on a wide range of empirical and
computational methods. This brief overview first characterizes the subject and then sketches a brief history of it. In indicating the current state of play, key issues in mental representation, modularity, and computational architecture are noted, and some current directions in cognitive research are indicated.
Characterizing the Discipline Christopher Longuet-Higgins is credited with inventing the term ‘cognitive science’ in 1973 (Darwin, 2004). The term was apparently used in 1975 during a meeting in Abbaye de Royaumont, which hosted a celebrated debate between Noam Chomsky and Jean Piaget. The term covered a then-emerging research field that drew together researchers from diverse backgrounds, including artificial intelligence and computer science, linguistics, psychology, philosophy, the neurosciences, and anthropology. One thing that united these researchers was an interest in the underpinnings of intelligent human behavior and a recognition that researchers in different disciplines, such as linguistics and psychology, were already studying common phenomena, albeit from distinct perspectives. The idea was that, by pooling expertise, deeper understanding could be achieved. The core objects of cognitive science study are the normal cognitive capacities of a typical adult (Von Eckardt, 1993), so it is most common to experiment on, and model, such individuals. In practice, the majority of experimental participants are probably undergraduate students in the United States, although cognitive science has also benefited greatly from comparative techniques. For instance, comparisons can be made regarding linguistic competence in adults and children (Tomasello, 2000), ability to attribute mental states to others in normal and autistic children (Baron-Cohen et al., 2000), conscious awareness of visual objects in normal adults and in blindsight persons with specific brain lesions (Weiskrantz et al., 1974), problem solving in adults with varying working-memory capacities (Carpenter et al., 1990), generalization abilities in humans and cotton-top tamarins (Hauser et al., 2002), and reading processes in Hebrew and English (Pollatsek et al., 1981). 
Whether working with typical normal adult subjects or carrying out one or other type of comparative study, the vast majority of cognitive scientists can be seen to share two substantive assumptions (Von Eckardt, 1993): (1) the human cognitive mind or brain is a computational device (computer) and (2) it is a representational device. Exactly what kind of computer it is remains more open to dispute, as we shall see. However, most would agree with Von Eckardt that a computer is a device ‘‘capable of
automatically inputting, storing, manipulating, and outputting information in virtue of inputting, storing, manipulating and outputting representations of that information’’ (Von Eckardt, 1993: 50). A device is taken to be representational if it has states or entities inside it that function as representations; however, there is rather less agreement in the field about what constitutes a representation. Given that computation and representation are basic to cognitive science, there is clearly a special link between the analytic goals of cognitive science and the synthetic goals of artificial intelligence (AI). Most researchers in AI aim to develop computer programs that help a machine exhibit behavior that, if it were the behavior of a human, would be called intelligent. For instance, some researchers experiment with systems that can participate in natural language dialogues to help users buy products, such as airline tickets. By contrast, most researchers in cognitive science are not in the business of building smarter software (or hardware) agents; rather, they want to know more about human cognitive capacities. However, an important tool in this effort is the use of computational modeling. If we have a theory of how someone achieves certain performance (such as understanding written words, or solving a novel problem), then a computer-based model can be built to test the theory. If the theory is expressed in terms of algorithms and data structures, a computer model can be given data as input, and both its output behavior and its internal processes can be compared with human behavior and information about the internals underlying that behavior. A computational model has virtues, both theoretical and practical: it requires a theory to be made explicit enough to implement, and it renders that theory testable. 
Indeed, with explicit theories of sufficient complexity or power, it may well be that a computational simulation is the only way to determine in a timely fashion what the predictions are in some given scenario. It can therefore be seen that there is two-way traffic between artificial intelligence and cognitive science. AI supplies both terminology and tools, such as algorithms, programs, and simulation environments. Cognitive scientists can use these in their modeling. In the other direction, cognitive scientists develop models of human cognitive capacities that demonstrate that certain computational tasks can be solved under particular conditions, and how these tasks are solved. Such demonstrations can help AI researchers build better systems. It is worth noting, incidentally, that cognitive models of human performance can help builders of computer systems in ways other than by inspiring the design of AI systems. For instance,
by taking human memory limitations into account, specialists in human–computer interaction can build more usable computer interfaces, and they can also use empirical methods borrowed from cognitive science to help evaluate the usability of their systems. There is one final point to make about the goals of cognitive science. Most cognitive scientists (whatever their disciplinary background) are interested in how people carry out particular cognitive tasks. That is, they want to find out what representations and processes underlie the acquisition and execution of the relevant skilled behavior. Some are also interested in where the representations and processes are located. On the one hand, this includes neuroscientists, who consider individual behavior and seek to locate the relevant brain areas, neural circuits, and chemical pathways. On the other hand, researchers who study distributed cognition are interested in group behavior, and they seek to locate external representations (such as documents) and processes (e.g., manipulating objects) occurring in the physical environments in which people work and play (Hutchins, 1995). There are connections between this approach and that which focuses on embodied cognition, whereby the (possibly changing) physical properties of an agent strongly influence the development of its cognitive capacities (Clark, 1997). Yet there is also a recent move toward studying why people have specific representations and processes. This is the province of evolutionary psychology (Barkow et al., 1992), which casts the interdisciplinary net even wider than before, and draws on anthropology, archaeology, and paleontology to try to explain why modern humans have, for instance, acquired the kinds of reasoning biases that seem to make them depart from the canons of probability theory. Given the incompleteness of our knowledge of even recent human history, evolutionary approaches remain necessarily speculative. 
However, advances in genetics and bioinformatics may broaden the interdisciplinary range still further. But before considering current directions in cognitive science, it is worth sketching some history.
The Rise of Cognitivism In the 19th century, researchers such as Hermann von Helmholtz and Hermann Ebbinghaus studied human thought and developed systematic methods for measuring relevant processes, such as the conduction of nerve signals, or the rate of forgetting. Others, such as Wilhelm Wundt, maintained that controlled introspection could also deliver useful insights into the workings of people’s minds. However, in the late 19th and early 20th centuries, introspection fell into
disrepute. It took with it most theories involving ‘unobservable’ mental entities, whether they had been objectively observed or not. It is true that the cognitive tradition did continue, thanks to Russian researchers such as Alexander Luria and Lev Vygotskii. But in North America and most of Europe, the behaviorist school, led by researchers such as John Watson, Edward Thorndike, and B. F. Skinner, argued that the only proper objects of psychological study were the externally observable stimuli and responses of humans and other animals. In the mid- to late 20th century, behaviorism fell from favor. One reason was the perception that – although it had developed some sophisticated experimental methods, such as various forms of conditioning for studying learning – it was not actually delivering useful psychological generalizations. Another reason was that it came under heavy attack: Noam Chomsky (1959), for example, argued that skilled behavior – linguistic behavior in particular – required mediating mental entities to explain it. Behaviorism might still have survived, but by then there was a respectable alternative. The alternative arose from the invention, during World War II, of computers. Alan Turing’s prewar mathematical and metamathematical work had laid the theoretical foundations for modern computing. According to (one formulation of) the thesis of the American mathematician Alonzo Church, all computable functions are Turing computable; what are now known as ‘Turing machines’ are theoretical devices for effectively computing mathematical functions. During the war, actual computing machines were built for code-breaking purposes and for other numerically intensive calculations; Turing was instrumental in this effort in the United Kingdom, along with John von Neumann in the United States. After the war, Turing (1950) laid out a vision of machine (or artificial) intelligence. 
It maintained that what mattered for attributing intelligence to an unknown agent was (as before) its observable behavior; it would be considered intelligent if it passed what subsequently became known as the ‘Turing test.’ But now, this behavior could be generated by a machine that transformed input data into output data by following an internally stored algorithm. These ideas gave birth to the fields of computer science and artificial intelligence, the latter of which was nurtured by, among others, John McCarthy and Marvin Minsky, who proposed an artificial intelligence summer project for 1956 at Dartmouth College, New Hampshire. Turing’s ideas also had a huge impact in the philosophy of mind, a growing influence in psychology, and soon led to significant interactions between linguistics and computer science (Chomsky,
1957). Turing machines had a relatively simple architecture, but alternative ways of designing computers were soon being developed. von Neumann’s architecture was slightly more complex: a single central processor consulted a special part of the computer’s memory to find which program instruction to carry out next; on the basis of the instruction, it manipulated other parts of its memory, which were dedicated to storing data. At about the same time, McCulloch and Pitts (1943) developed a very different computer architecture, inspired by the relations between neurons in brains: rather than a single (powerful) processor, with access to large amounts of program and data memory, they perceived that a useful computer can be composed of a large number of rather simple processors with small amounts of memory, so long as the processing nodes are properly interconnected; simple rules could be followed for updating the nodal states, on the basis of the neighboring states. The von Neumann machine architecture dominated both computer science and cognitive science for decades. As a result, researchers focused on the types of representations that were naturally manipulated within this kind of architecture. Whereas, at low levels, computers might store information in binary digital form, they could be programmed to interpret and generate sequences of symbols. The logical roots of computing reach back at least to Gottlob Frege (1848–1925), and the language and logic of first-order predicate calculus came to function as a symbolic lingua franca for many researchers, providing models for both language and reason. The idea that computers could run programs that led to intelligent behavior was a special gift to philosophers, such as Hilary Putnam. Computers appear to furnish solutions to a number of crucial problems, including the mind–body problem and the homunculus problem. The mind–body problem concerns the relation between mental states and events and bodily states and events. 
For instance, are all mental events really just physical events, differently described? Is there a special kind of mental substance? If so, how does it interact with physical substances? The homunculus problem concerns the relation between mental representations and the minds that contain them. If my representation of a pig is a mental image of a pig, who looks at the mental image? Traditionally, it was suggested that the mind would have to contain a viewer who inspected the image; the ‘homunculus,’ or ‘little man,’ was needed to fulfill this role. But since the homunculus also had to contain a mental image, he also contained another homunculus. With an infinite regress threatening, the homunculus looks like a nonexplanation.
Computers appear to help solve the mind–body problem because they provide a beautiful analogy: the brain is to the mind as the computer is to the program. A program has a purely physical instantiation, because it is stored in the computer’s memory. But when the program runs, interesting – sometimes even intelligent – behavior can occur. Taken further, perhaps the human brain (or the whole body containing the brain) really is just a computer, and the mind is therefore the product of programs running on the computing machine. Computers appear to solve the homunculus problem because they contain internal representations that do not need a smart little man to read them. A very complex program can be decomposed into a finite set of simple instructions. Each of those instructions can be carried out by a very simple processor of limited powers. There is no little man in the machine.
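The claim that a complex program can be decomposed into simple instructions, each carried out by a processor of limited powers, can be illustrated with a toy stored-program machine. The Python sketch below is not drawn from the article: the opcodes (LOAD, ADD, STORE, HALT), the single accumulator, and the function name run are assumptions made purely for illustration, and for brevity the program and the data are held in separate lists rather than in one shared memory.

```python
# Toy stored-program machine. The instruction set is hypothetical and
# exists only to illustrate the fetch-and-execute cycle.
def run(program, data):
    """Execute a list of (opcode, operand) pairs against a data memory."""
    memory = list(data)  # data memory (copied so the caller's list is untouched)
    acc = 0              # single accumulator register
    pc = 0               # program counter: index of the next instruction
    while pc < len(program):
        op, arg = program[pc]  # fetch the next instruction
        pc += 1
        if op == "LOAD":       # copy a memory cell into the accumulator
            acc = memory[arg]
        elif op == "ADD":      # add a memory cell to the accumulator
            acc += memory[arg]
        elif op == "STORE":    # write the accumulator back to memory
            memory[arg] = acc
        elif op == "HALT":     # stop execution
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# Add the contents of cells 0 and 1 and store the sum in cell 2.
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
result = run(program, [2, 3, 0])
```

Each step is a trivial table lookup followed by one arithmetic or copy operation; nothing in the loop needs to understand what the program as a whole is for, which is the sense in which no little man is required.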
Central Issues in Cognitive Science With the computational metaphor to empower it, the study of human psychology was liberated from the constraints of behaviorism. From the 1950s to the 1970s, information-processing models were developed to cover a range of human cognitive capacities, such as memory, attention, reasoning, and problem solving (Miller, 1956; Broadbent, 1958; Johnson-Laird and Wason, 1970; Newell and Simon, 1972). Marr’s (1982) work on vision encapsulated the strength of cognitive science, framing three key levels of explanation: the computational, algorithmic, and implementational. The first of these involves the behavior of a human agent given a specific task, the second involves the cognitive or affective systems underlying a given computational task (it is at this level that mental processes and representations are traditionally located), and the third involves the basic biological systems underlying the algorithmic level (such as the brain). Sometimes computational behavior can be explained directly in terms of implementational things and events (explanations involving localized brain damage sometimes take this form), but more frequently, the algorithmic level is required to identify common causes of computational-level behavior. Modeling human language processing was a central goal for cognitive scientists, and in this context, a number of larger issues emerged. Jerry Fodor played an important role in bringing at least three key issues into focus. One issue concerns the nature of internal mental representations. Fodor (1975) argued that to explain human conceptual learning, we had to posit the existence of ‘mentalese,’ a language of thought having properties that reflected those of natural
languages. Others, such as Pylyshyn (1973), revived an old debate about the nature of mental imagery, to argue that evidence in favor of image-like mental representations could be explained purely in terms of language-like mental representation. Most recently, however, the pendulum has swung in the opposite direction, and many researchers are now pursuing the idea that language processing draws on mental representations that have imagistic (perceptually oriented) components and motoric (action-oriented) components (Pulvermüller, 1999). A second issue concerns the relationships between our various mental capacities. Fodor (1983) revisited the notion of faculty psychology, arguing that the human mind is modular in nature, with relatively limited communication between specialized modules. Language processing, in particular, was taken to involve modules that operated automatically, very different in kind from the operation of central (conscious) reasoning. Ideas about the extent and origin of modularity of mind have been very influential and are related to the nativist/empiricist debate (Pinker, 1994; Elman et al., 1996). Modularity has been adopted by evolutionary psychologists such as Leda Cosmides. The massive modularity hypothesis suggests that the mind is highly modularized and that the domain-specific modules correspond to evolutionary adaptations, acquired by our ancestors in solving persistent problems in the relatively recent past. Although these modules may be adaptations, they may not remain adaptive, and this constitutes part of the explanation as to why humans do not always reason in accordance with the canons of probability theory. Evolutionary psychology has its problems, however, not the least of which is the paucity of data concerning the environment of evolutionary adaptation. 
In the face of this, alternative explanations for apparent anomalies in human reasoning are still being developed, some of which draw explicitly on the idea that reasoning and language interpretation are intimately intertwined (Stenning and van Lambalgen, 2004). A third and final issue concerns the computational architecture underlying the human mind. Although the von Neumann architecture has prevailed in computer science and cognitive science, the neural architecture proposed by McCulloch and Pitts has had a more checkered career. The neural architecture lends itself very naturally to learning how to compute functions that transform stimuli into responses. Turing had emphasized that for practical purposes, an intelligent machine would have to be a learning machine. But results from Marvin Minsky and Seymour Papert in the 1960s suggested that there were fundamental limits to what neurally inspired perceptrons could learn
to compute. It was not until the early 1980s that work by David Rumelhart and James McClelland and collaborators revived the approach, under the banner of ‘parallel distributed processing,’ or more generally, ‘neo-connectionism.’ The development of effective learning algorithms allowed connectionist machines to alter the strengths of connections between nodes, either with or without supervision from outside, and to accomplish a much broader range of computational tasks. The distinction between traditional and neural architectures is sometimes drawn in terms of the representations used by the machines. As already noted, traditional architectures usually store (at least some) information about the world in sentence-like symbolic structures, composed of symbols representing real-world entities and the relations between them. By contrast, connectionist architectures are sometimes considered nonsymbolic or subsymbolic: in some systems, processing nodes may correspond directly to specific real-world entities; but in distributed representation systems in particular, a given node may participate in representing many different real-world objects or relations, and representing a given object may require the activation of many processing nodes. There has been a vast amount of subsequent work on connectionist modeling of human cognitive tasks. The models have two special virtues. First, unlike symbolic approaches, learning is built in. This is good, because learning is core to many cognitive abilities. Second, unlike symbolic approaches, damaging a network leaves residual function. This means that it is possible to simulate the effects of lesions in the brain. Many successful models do just this, such as the models of dyslexia by Plaut and Shallice (1994). On the other hand, there are some things that symbolic systems do better than neural systems. 
For instance, certain properties of language, such as constituency and recursion, are taken to be dealt with quite naturally by symbolic systems, but to pose problems for connectionist systems (Fodor and Pylyshyn, 1988). Considerable effort has consequently been devoted to showing that connectionist systems can indeed process languages with such properties. It has also been argued that connectionist systems are better models of humans because neural networks are more similar to brains than von Neumann architectures are. Against this, neural networks abstract away from many (in fact, nearly all) features of real brain circuitry, and many of the learning algorithms used are biologically implausible. On the other hand, some algorithms are indeed deliberately modeled on processes in neural circuits, such as Hebbian reinforcement (Hebb, 1949).
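The limits on perceptron learning mentioned above can be illustrated with a small sketch. It is not taken from the works cited here: the function names, the zero starting weights, and the number of training passes are all assumptions chosen for illustration. A single threshold unit trained with the classic perceptron rule adjusts its connection strengths after each error. It learns logical AND, which a straight line can separate, but no setting of its weights can classify all four cases of exclusive-or (XOR).

```python
from typing import List, Tuple

# Each sample pairs two binary inputs with a target output.
Sample = Tuple[Tuple[int, int], int]

AND_DATA: List[Sample] = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA: List[Sample] = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def predict(weights: List[int], bias: int, inputs: Tuple[int, int]) -> int:
    """Threshold unit: output 1 when the weighted sum exceeds zero."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(data: List[Sample], epochs: int = 25) -> Tuple[List[int], int]:
    """Perceptron rule: shift each connection strength by the signed error.

    The epoch count and zero initial weights are illustrative assumptions.
    """
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in data:
            error = target - predict(weights, bias, inputs)
            weights = [w + error * x for w, x in zip(weights, inputs)]
            bias += error
    return weights, bias


def accuracy(data: List[Sample], weights: List[int], bias: int) -> float:
    """Proportion of samples the trained unit classifies correctly."""
    hits = sum(predict(weights, bias, inputs) == target for inputs, target in data)
    return hits / len(data)


and_accuracy = accuracy(AND_DATA, *train(AND_DATA))
xor_accuracy = accuracy(XOR_DATA, *train(XOR_DATA))
print("AND accuracy:", and_accuracy)  # reaches 1.0
print("XOR accuracy:", xor_accuracy)  # stays below 1.0 however long it trains
```

Networks with intermediate layers of nodes, of the kind revived in the 1980s, are not subject to this particular limitation; the sketch only shows why the single-layer case looked so restrictive.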
Current Directions The past decade has seen an explosion in the number of studies that use at least one brain imaging technique. These include positron emission tomography, functional magnetic resonance imaging, and transcranial magnetic stimulation. The last of these goes beyond straightforward (but computationally intensive) imaging and actively alters brain states in experimental participants (for an introduction to work relevant to language, see, for instance, Poeppel and Hickok (2004)). Cognitive neuroscience started from an interest in pathologies, such as blindsight, but imaging technologies have progressed to the point where many studies are carried out on normal individuals executing simple cognitive tasks. As a result, there is a popular conception that cognitive science is brain science. But, given Marr’s three levels of explanation, implementation in brains is only part of the overall picture, and, equally, it is perfectly possible to study the brain without being interested in cognitive processes. What cognitive studies bring to brain science is the ability to demarcate a coherent set of cognitive capacities, and it is these that can (sometimes) be localized within particular brain areas. Imaging is also, at least in part, responsible for the rehabilitation in recent years of the study of consciousness and emotion. New measurement technologies allow even those researchers who are suspicious of subjective or introspective reports to begin to investigate empirically brain states and events associated with consciousness. The study of emotion and other affective states has also progressed considerably, allowing interactions between affective and cognitive processes to be traced with increasing precision (Davidson and Irwin, 1999). 
The rapprochement between affect and cognition is particularly notable; for instance, new work on personality and individual differences takes account of imaging results in explicitly advocating cognitive models of personality traits such as anxiety (Matthews et al., 2000). Another line of development generalizes from one aspect of connectionism: learning from data. The increasing availability of online corpora (of images, speech, and text) has encouraged researchers to explore how far statistical learning techniques can model cognitive performance in real-world domains. From an engineering point of view, substantial successes have been achieved. For instance, both speech processing and, more recently, text processing have been revolutionized by the recruitment of statistical techniques. These successes are mirrored by developments in cognitive science, in language processing, and beyond. On the language side, approaches such
as latent semantic analysis have been developed to form the basis of psychological theories of textual meaning (Landauer and Dumais, 1997). More generally, the rational analysis of cognition proposes that a cognitive system operates as a probabilistic engine, to optimize the adaptation of the organism’s behavior to its environment (Anderson, 1990; Chater and Oaksford, 1999). A final area of burgeoning interest arises from progress in genetic research enabled by developments in bioinformatics. Most researchers do not expect to find a one-to-one mapping from genes to specific cognitive capacities or dispositions. Thus, the idea that there might be a single ‘language gene’ is increasingly considered unsophisticated. However, as genealogy, molecular biology, behavioral genetics, and language research are brought together, significant results are already beginning to emerge. In 2001, FOXP2 was isolated as a gene in which a point mutation correlates with language (and other) disorders in affected members of the intensively studied three-generation KE family (Lai et al., 2001). The gene codes for a transcription factor, and changes in its structure may therefore have a broad effect on the expression of genes during cognitive development. As might be expected, imaging techniques have also been brought to bear on the KE family (Liégeois et al., 2003), but for current purposes, this recent progress on FOXP2 is merely the harbinger of changes to come. Cognitive science has always had the computational metaphor at its core. But with increasingly sophisticated hardware, ever larger online corpora, and more powerful software for processing that data, it seems that computational power is more important than ever for future progress in the interdisciplinary scientific study of the mind.
See also: Cognitive Linguistics; Cognitive Pragmatics; Cognitive Science and Philosophy of Language; Cognitive Technology; Computer-Mediated Communication: Cognitive Science Approach; Consciousness, Thought and Language; Distributed Cognition and Communication; Helmholtz, Hermann Ludwig Ferdinand von (1821–1894); Human Language Processing: Connectionist Models; Human Language Processing: Symbolic Models; Human Reasoning and Language Interpretation; Language Development: Overview; Language, Visual Cognition and Motor Action; Latent Semantic Analysis; Modularity of Mind and Language; Natural Language Processing: Overview; Piaget, Jean (1896–1980); Psycholinguistics: Overview; Rational Analysis and Language Processing; Stroop Effect in Language; Turing, Alan Mathison (1912–1954); Vygotskii, Lev Semenovich (1896–1934); Writing and Cognition; Wundt, Wilhelm (1832–1920).
Bibliography
Anderson J R (1990). The adaptive character of thought. Hillsdale, NJ: Erlbaum.
Barkow J H, Cosmides L & Tooby J (eds.) (1992). The adapted mind: evolutionary psychology and the generation of culture. New York: Oxford University Press.
Baron-Cohen S, Tager-Flusberg H & Cohen D J (eds.) (2000). Understanding other minds: perspectives from developmental cognitive neuroscience (2nd edn.). Oxford: Oxford University Press.
Broadbent D E (1958). Perception and communication. London: Pergamon.
Carpenter P A, Just M A & Shell P (1990). ‘What one intelligence test measures: a theoretical account of the processing in the Raven Progressive Matrices Test.’ Psychological Review 97, 404–431.
Chater N & Oaksford M (1999). ‘Ten years of the rational analysis of cognition.’ Trends in Cognitive Sciences 3, 57–65.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1959). ‘A review of B. F. Skinner’s “verbal behavior.”’ Language 35, 26–58.
Clark A (1997). Being there: putting brain, body and world together again. Cambridge, MA: MIT Press.
Cummins R & Cummins D D (eds.) (2000). Minds, brains and computers: the foundations of cognitive science. Oxford: Blackwell.
Darwin C J (2004). ‘Obituary: Christopher Longuet-Higgins.’ The Guardian, 10th June, 2004.
Davidson R J & Irwin W (1999). ‘The functional neuroanatomy of emotion and affective style.’ Trends in Cognitive Sciences 3, 11–21.
Elman J L, Bates E, Johnson M H, Karmiloff-Smith A, Parisi D & Plunkett K (1996). Rethinking innateness: a connectionist perspective on development. Cambridge, MA: MIT Press.
Fodor J A (1975). The language of thought. Cambridge, MA: Harvard University Press.
Fodor J A (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fodor J A & Pylyshyn Z W (1988). ‘Connectionism and cognitive architecture: a critical analysis.’ Cognition 28, 3–71.
Gardner H (1985). The mind’s new science. New York: Basic Books.
Green D W (1996). Cognitive science: an introduction. Oxford: Blackwell.
Gregory R L (ed.) (1998). The Oxford companion to the mind. Oxford: Oxford University Press.
Hauser M D, Weiss D J & Marcus G (2002). ‘Rule learning by cotton-top tamarins.’ Cognition 86, B15–B22.
Hebb D O (1949). The organization of behavior: a neuropsychological theory. New York: Wiley.
Hutchins E (1995). ‘How a cockpit remembers its speeds.’ Cognitive Science 19, 265–288.
Johnson-Laird P N & Wason P C (1970). ‘A theoretical analysis of insight into a reasoning task.’ Cognitive Psychology 1, 134–148.
Lai C S, Fisher S E, Hurst J A, Vargha-Khadem F & Monaco A P (2001). ‘A forkhead-domain gene is mutated in a severe speech and language disorder.’ Nature 413, 519–523.
Landauer T & Dumais S (1997). ‘A solution to Plato’s problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge.’ Psychological Review 104, 211–240.
Larkin J H & Simon H A (1987). ‘Why a diagram is (sometimes) worth 10 000 words.’ Cognitive Science 11, 65–99.
Liégeois F, Baldeweg T, Connelly A, Gadian D G, Mishkin M & Vargha-Khadem F (2003). ‘Language fMRI abnormalities associated with FOXP2 gene mutation.’ Nature Neuroscience 6, 1230–1237.
Marr D (1982). Vision. San Francisco: Freeman.
Matthews G, Derryberry D & Siegle G (2000). ‘Personality and emotion: cognitive science perspectives.’ In Hampson S (ed.) Advances in personality psychology, vol. 1. London: Routledge. 199–237.
McCulloch W S & Pitts W (1943). ‘A logical calculus of the ideas immanent in nervous activity.’ Bulletin of Mathematical Biophysics 5, 115–133.
Miller G A (1956). ‘The magical number seven, plus or minus two: some limits on our capacity for processing information.’ Psychological Review 63, 81–97.
Newell A & Simon H A (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Pinker S (1994). The language instinct. London: Penguin Books.
Plaut D C & Shallice T (1994). Connectionist modelling in cognitive neuropsychology: a case study. Hillsdale, NJ: Erlbaum.
Poeppel D & Hickok G (2004). ‘Towards a new functional anatomy of language.’ Cognition 92, 1–12.
Pollatsek A, Bolozky S, Well A D & Rayner K (1981). ‘Asymmetries in the perceptual span for Israeli readers.’ Brain and Language 14, 174–180.
Pulvermüller F (1999). ‘Words in the brain’s language.’ Behavioral and Brain Sciences 22, 253–336.
Pylyshyn Z W (1973). ‘What the mind’s eye tells the mind’s brain: a critique of mental imagery.’ Psychological Bulletin 80, 1–24.
Stenning K & van Lambalgen M (2004). ‘A little logic goes a long way: basing experiments on semantic theory in the cognitive science of conditional reasoning.’ Cognitive Science 28, 481–529.
Tomasello M (2000). ‘Do young children have adult syntactic competence?’ Cognition 74, 209–253.
Turing A M (1950). ‘Computing machinery and intelligence.’ Mind 59, 433–460.
Von Eckardt B (1993). What is cognitive science? Cambridge, MA: MIT Press.
Weiskrantz L, Warrington E K, Saunders M D & Marshall J (1974). ‘Visual capacity in the hemianopic field following a restricted occipital ablation.’ Brain 97, 709–728.
Cognitive Semantics J R Taylor, University of Otago, Dunedin, New Zealand © 2006 Elsevier Ltd. All rights reserved.
Cognitive Linguistics and Cognitive Semantics Cognitive semantics is part of a wider movement known as ‘cognitive linguistics.’ Prior to surveying the main characteristics of cognitive semantics, it will be advisable to clarify what is meant by cognitive linguistics. As a matter of fact, the term is open to different interpretations. On a broad understanding, any approach that views language as residing in the minds of its speakers and a linguistic description as a hypothesis about a speaker’s mental state would merit the designation ‘cognitive.’ Chomsky’s career has been devoted to pursuing cognitive linguistics on this broad understanding. On the narrower, and more specialized interpretation intended here, cognitive linguistics refers to a movement that emerged in the late 1970s and early 1980s, mainly as a reaction to certain tendencies of Chomskyan, and, more generally, formalist linguistics. Linguists who were prominently associated with the emergence of cognitive linguistics, in this narrow sense, were George Lakoff, Ronald Langacker, and Leonard Talmy. Rather than a specific theory, cognitive linguistics can best be described as an approach, or cluster of approaches to language study, whose practitioners nevertheless share a basic outlook on the nature of language. Several common aspects can be identified:
• Cognitive linguists are skeptical of the idea, promoted within Chomskyan linguistics, that human language might be associated with a language-specific module of the mind. Their starting point, rather, is that language is embedded in more general cognitive abilities and processes. According to the editorial statement of the monograph series Cognitive linguistics research (published by Mouton de Gruyter, Berlin), the guiding assumption is that ‘language is an integral facet of cognition which reflects the interaction of social, cultural, psychological, communicative and functional considerations, and which can only be understood in the context of a realistic view of acquisition, cognitive development and mental processing.’ Special attention, therefore, has been directed towards studying language, its structure, acquisition, and use, from the perspective of such topics as perception, categorization, concept formation, spatial cognition, and imagery. Although these capacities might well be subject to highly specialized elaboration in human language, they are not per se linguistic capacities.
• Cognitive linguistics signaled a return to the basic Saussurean insight that language is a symbolic system, which relates signifiers (that is, language in its perceptible form, whether as sound, marks on paper, or gesture) and signifieds (that is, meanings). Indeed, Langacker (1987: 11) characterized a language as ‘an open-ended set of linguistic signs [. . .], each of which associates a semantic representation of some kind with a phonological representation.’ Importantly, semantic representations, i.e., ‘meanings,’ are taken to be mental entities, or, perhaps more appropriately, mental processes. Thus, Langacker prefers to refer not to ‘concepts’ (a term that suggests that meanings are static, clearly individuated entities) but to ‘conceptualizations,’ where the deverbal nominal emphasizes the dynamic, processual character of the phenomenon.
• A third feature of cognitive linguistics follows from the view of language as a symbolic system, namely that syntax and morphology – patterns for the combination of words and morphemes into larger configurations – are themselves symbolic, and hence inherently meaningful. The same goes for the elements over which syntax and morphology operate – lexical and phrasal categories, for example – as well as the kinds of relations that can hold between these elements, i.e., relations such as subject (of a clause), modification, complementation, apposition, subordination. The view, current in many linguistic theories, that syntax and morphology constitute autonomous levels of linguistic organization is therefore rejected. Indeed, a major thrust of cognitive linguistic research over the past couple of decades has been, precisely, the attempt to offer a conceptual characterization of formal aspects of language organization.
It will be apparent that the orientation of cognitive linguistics, as characterized above, was bound to have considerable influence on the ways in which meanings (whether of words, sentences, syntactic patterns, etc.) have been studied. One aspect has already been mentioned, namely, that meanings are taken to be mental entities. In this, cognitive linguistics contrasts strikingly with other approaches, such as logical approaches, which have focused on logical aspects of sentences and the propositions they express; with truth-conditional approaches, which focus on the relation between propositions and states of affairs in the world; with structuralist approaches, which view meaning in terms of semantic relations within the
language; with behaviorist approaches, which view meaning in terms of stimulus-response associations; and, more generally, with theories of meaning as use. What these alternative approaches to meaning have in common is their avoidance of mentalism, i.e., the characterization of meanings as ‘things in the head.’ The remainder of this article surveys some important themes and research topics in cognitive semantics. It should be mentioned that the survey is by no means comprehensive; for broader coverage, the reader is referred to the introductions to cognitive linguistics listed at the end of this article. Some topics, such as metaphor and metonymy, are dealt with elsewhere in this encyclopedia and for this reason are discussed only briefly. It should also be borne in mind that cognitive semantics, like cognitive linguistics itself, does not constitute a unified theory, but is better regarded as a cluster of approaches and research themes that nevertheless share a common outlook and set of assumptions.
Meaning Is Encyclopedic in Scope Many semanticists, especially those who see the language faculty as an encapsulated module of the mind, insist on the need to make a distinction between the dictionary and the encyclopedia, that is, between what one knows in virtue of one’s knowledge of a language and what one knows in virtue of one’s knowledge of the world. Cognitive semantics denies the validity of such a distinction. On the contrary, meaning is taken to be essentially encyclopedic in scope. A person’s linguistic knowledge would therefore, in principle, be coextensive with the person’s total world knowledge. An individual word, to be sure, provides access to only a small segment of encyclopedic knowledge. No clear bounds, however, can be set on how far the relevant knowledge network extends. The encyclopedic nature of linguistic semantics is captured in the notions of profile, base, domain, and Idealized Cognitive Model (or ICM). The terms ‘profile’ and ‘base’ are due to Langacker (1987). A linguistic expression intrinsically evokes a knowledge structure, some facet of which is profiled. Take the word hypotenuse. The word designates a straight line. Whatever we predicate of hypotenuse is predicated of a hypotenuse qua straight line, as when we assert The hypotenuse is 3 cm. long. Obviously, the notion of a straight line does not exhaust the meaning of the word. The straight line in question is part of a larger structure, namely, a right-angled triangle. Although hypotenuse does not designate the triangle, the notion of a triangle is essential for the understanding of the word (Figure 1).
Figure 1 Notion of hypotenuse.
Notice that the concept designated by the word cannot be identified with the profile – as mentioned, the profile is simply a straight line. The concept resides in the profiling of a facet of the base. For other examples that illustrate the profile-base relation, consider words such as thumb (profiled against the conception of a human hand), top (profiled against a schematic notion of a three-dimensional entity), island (a mass of land profiled against the surrounding water). In fact, it is axiomatic, in cognitive semantics, that all expressions achieve their meaning through profiling against the relevant background knowledge. Returning to the hypotenuse example, it will be apparent that the base – the notion of a triangle – itself presupposes broader knowledge configurations, namely, those pertaining to planar geometry, which themselves are based in notions of space and shape. These broader knowledge configurations are referred to as ‘domains.’ Some domains may be basic, in the sense that they are not reducible to other domains. Examples include time, space, color, temperature, weight, etc. Otherwise, a knowledge structure of any degree of complexity can function as a domain, for example, the rules of a game, a scientific theory, kinship networks, gender stereotypes, educational, political, and legal systems. Domains may also be constituted by deeply held beliefs about life, nature, causation, the supernatural, and so on. Most concepts are characterized against a ‘matrix’ of more than one domain. Uncle, for example, profiles a male human being against the base of a (portion of a) kinship network, specifically, that part of the network that relates the uncle to his nephews/nieces. The notion of kinship itself rests on notions of gender, procreation, marriage, inheritance, etc. At the same time, uncle profiles a human being, which is understood against multiple domains pertaining to
life forms, to three-dimensional bodies and their various parts, with their features of weight, extension, shape, and so on. If we add to this the fact that, in many societies, uncles may have special rights and obligations vis-à-vis their nephews/nieces, we may appreciate that even a single word, if its meaning is fully explored, can take us into the farthest reaches of our knowledge and cultural beliefs. It will be apparent that the distinction between base and domain is not a clear-cut one. The base may be defined as a knowledge structure that is inherently involved in profiling, whereas domains constitute background, more generalized knowledge. Terminology in this area is also confusing because different authors have favored a range of terms for domain-based knowledge. Some scholars have used the not always clearly distinguishable terms ‘scene,’ ‘scenario,’ ‘script,’ and ‘frame’ to refer in particular to knowledge about expected sequences of events. Thus, anger refers not just to an emotional state, but is understood against an expected scenario that includes such stages as provocation, response, attempts at control, likely outcomes, and so on. Likewise, paying the restaurant bill evokes the ‘restaurant script’ – knowledge of the kinds of things one does, and the things that happen, when one visits culturally instituted establishments known as ‘restaurants.’ The notion of paying also invokes the frame of a commercial transaction, with its various participants, conventions, and activities. Mention might also be made of Searle’s (1992) notions of ‘the Network’ and ‘the Background,’ whereby a particular belief takes its place within a network of other beliefs, and against the background of capacities, abilities, and general know-how. Of special importance is Lakoff’s (1987) notion of ‘Idealized Cognitive Model,’ or ICM – a notion that bears some affinity with the concept of ‘folk theory’ (again, different scholars prefer different terms).
ICMs capture the fact that knowledge about a particular domain may be to some extent idealized and may not fit the actual states of affairs that we encounter on specific occasions. Consider the words bachelor and spinster. We might define these as ‘adult unmarried male’ and ‘adult unmarried female,’ respectively. The concepts, thus defined, presuppose an ICM of marriage practices in our society. According to the ICM, a person reaches a more-or-less clearly definable marriageable age. People who pass the marriageable age without marrying are referred to as bachelors and spinsters, as the case may be. The ICM attributes different motives to men and women who do not marry. Men do so out of choice, women out of necessity. As will be appreciated, the ICM is idealized, in that it presupposes that all citizens are heterosexual
and that all are equally available for marriage. It thus ignores the existence of celibate priests and of couples who live together without marrying. The discrepancy between model and reality can give rise to prototype effects. The fact that the Pope is not a representative example of the bachelor category derives from the fact that Catholic clergy are not covered by the ICM. Appeal to the ICM can also explain the different connotations of bachelor and spinster. Although one might not want to subscribe to the sexist framing of the ICM, it does offer an explanation for why eligible bachelor is an accepted collocation, whereas eligible spinster is not. As mentioned, the meaning of a word may need to be characterized against a matrix of several domains. However, not all uses of a word need invoke each of the domains in equal measure. Certain uses may activate only some domains whereas others are backgrounded or eclipsed. The notion of a kinship network is likely to be prominent in most uses of uncle, yet when parents use the word to introduce one of their adult male friends to their child, the kinship domain is eclipsed. For another example of selective domain activation, consider the concept of a book. When you drop a book, the status of the book as a (heavy) material object is activated; when you read a book, the status of the book as a printed text is activated; when you translate a book, the status of the book as a text in a given language is foregrounded. Note that begin a book can be interpreted in various ways, according to which of the domains is activated. The activity that one begins with respect to the book could be reading, writing, editing, translating, or even (if you are a bookworm, literally!) eating. The above examples not only illustrate the importance of domains and related notions in the study of word meanings, they also show why it has been deemed necessary to favor an encyclopedic approach to semantics.
The reason is that we need to appeal to domain-based knowledge in order to account for how words are used and for the ways in which complex expressions are judged. Often, the very possibility of interpreting an expression, and of accepting it as semantically well-formed, can only be explained by reference to appropriate background knowledge. A common objection to an encyclopedic semantics is that one cannot reasonably claim that everything a person knows about the concept designated by a word is relevant to the use of the word. It is certainly true that some facets of background knowledge may be central and more intrinsic to a concept, while others might be more peripheral or even idiosyncratic to an individual speaker. Nevertheless, even extrinsic knowledge might become relevant to a word’s use,
for example, in discourse between intimates or family members. Moreover, the study of semantic change teaches us that even highly peripheral and circumstantial knowledge pertaining to a concept can sometimes leave its mark on the semantic development of a word. Langacker (1987: 160) has remarked that Jimmy Carter’s presidency had a notable, if transient, effect on the semantics of peanut. Equally, Margaret Thatcher’s premiership probably influenced the semantic development of handbag, at least for British speakers. The notion of domain is relevant to two important themes in cognitive semantic research, namely metaphor and metonymy. ‘Metaphor’ has been analyzed in terms of the structuring of one domain of experience (usually, a more abstract, intangible domain) in terms of a more concrete, and more directly experienced domain. For example, time is commonly conceptualized in terms of space and motion, as when we speak of a long time, or say that Christmas is approaching, or even that it is just around the corner. More recently, metaphor has been studied under the more general rubric of ‘conceptual blending,’ whereby components of two or more input domains are incorporated into a new conceptualization, the blend. Whereas metaphor involves elements from more than one domain, ‘metonymy,’ in contrast, concerns elements within a single domain. Thus, we can use the name of an author to refer to books written by the author, as when we enquire whether someone has read any Dickens. The transfer of reference from person to product is possible because both are linked within domain-based knowledge pertaining to books and their authorship.
Categorization Every situation and every entity that we encounter is uniquely different from every other. In order to be able to function in our physical and social worlds, we need to reduce this information overload. We do this by regarding some situations and some entities as being essentially ‘the same.’ Having categorized an entity in a certain way, we know how we should behave towards it and what properties it is likely to have. It is significant that whenever we encounter something whose categorization is unclear we typically feel uneasy. ‘What is it?’, we want to know. Categorization is not a peculiarly human ability. Any creature, if it is to survive, needs at the very least to categorize its environment in terms of edible or inedible, harmful or benign. Humans have developed phenomenal categorization abilities. We operate with literally thousands, if not hundreds of thousands of categories. Moreover, our categories are flexible
enough to accommodate new experiences, and we are able to create new categories as the need arises. To know a word is to know, among other things, the range of entities and situations to which the word can be appropriately applied. To this extent, the study of word meanings is the study of the categories that these words denote. And it is not only words that can be said to designate categories. It can be argued that syntactic configurations, for example, those associated with intransitive, transitive, and ditransitive constructions, designate distinct categorizations of events and their participants. What is the basis for categorization? Intuitively, we might want to say that things get placed in the same category because of their similarity. Similarity, however, is a slippery notion. One approach would be to define similarity in terms of the sharing of some common feature(s) or attribute(s). Similarity, then, would reduce to a matter of partial identity. Feature-based theories of categorization often require that all members of a category share all the relevant features. A corollary of this approach is that categories are well-defined, that is, it is a clear-cut matter whether a given entity does, or does not, belong in the category. It also follows that all members have equal status within the category. There are a number of problems associated with this approach. One is that the categories designated by linguistic expressions may exhibit a prototype structure. Some members of the category might be more representative than others, while the boundary of the category may not be clearly defined. In a well-known passage, though without introducing the prototype concept, Wittgenstein (1953: §66) drew attention to categorization by family resemblance. Imagine a family photograph. Some members of the family might have the family nose, others might have the family chin, others might have the family buck teeth.
No member of the family need exhibit all the family traits, yet each exhibits at least one; moreover, some members might exhibit different traits from others. Wittgenstein illustrated the notion with the example of the kinds of things we call ‘games,’ or Spiele (Wittgenstein was writing in German). Some (but not all) games are ‘amusing,’ some require skill, some involve luck, some involve competition and have winners and losers. The family resemblance notion has been usefully applied to the study of word meaning. Thus, some uses of climb (as in The plane climbed to 30 000 feet) exhibit the feature ‘ascend,’ some (such as The mountaineers climbed along the cliff) exhibit the feature ‘move laboriously using one’s limbs.’ Considered by themselves, these two uses have very little in common. We see the relation, however, when we
consider some further uses of climb (as in The boy climbed the tree), which exhibit both of the features. A fundamental problem with feature-based theories of categorization concerns the nature of the features themselves. As Wittgenstein pointed out, skill in chess is not the same as skill in tennis. The concept of skill therefore raises the very same issues of how categories are to be defined as were raised by the notion of game, which the notion of skill is supposed to explicate. Understanding similarity in terms of partial identity is problematic for another reason. Practically any two objects can be regarded as similar in some respect (for example, both may weigh less than 100 kg., or both may cost between $5 and $5000), but this similarity does not mean that they constitute a viable or useful category. An alternative approach would be that categorization is driven by the role of the entities within broader knowledge configurations, that is, by domain-based knowledge and ICMs. Sometimes, apparently similar activities might be categorized differently, as when making marks on paper might be called, in some cases, ‘writing’, in other cases, ‘drawing.’ The distinction is based on knowledge pertaining to the nature and purpose of ‘writing’ and ‘drawing.’ On the other hand, seemingly very different activities might be brought under the same category. In terms of the actions performed, making marks with a pen on a piece of paper has little in common with depressing small, square-shaped pads on a keyboard. But given the appropriate domain-based knowledge, both can be regarded as instances of ‘writing.’ Categories, as Murphy and Medin (1985) have aptly remarked, are ultimately based in ‘theories’ (that is, in ICMs). 
The matter may be illustrated by the distinction (admittedly, not always a clear-cut one) between ‘natural kinds’ and ‘nominal kinds.’ Natural kinds are believed to be given by nature and are presumed to have a defining ‘essence’; moreover, we are inclined to defer to the scientists for an elucidation of their defining essence. Nominal kinds, in contrast, are often defined vis-à-vis human concerns, and their perceptual properties and/or their function are often paramount in their categorization. Remarkably, even very young children are sensitive to the difference (Keil, 1989). Suppose a zebra had its stripes painted out; would it thereby become a horse? Or suppose a giraffe had its neck surgically shortened; would it cease to be a giraffe? Even very young children respond: ‘No.’ Changes to the appearance of the entities would not alter their defining essence. But suppose you saw off the back of a chair. Does the chair become a stool? Arguably, it does. In this case, a ‘superficial’ aspect is crucial to categorization.
The dynamics of categorization may be illustrated by considering the relationship between a linguistic expression (e.g., the word fruit) and its possible referents (e.g., an apple). We can address the relation from two perspectives. We can ask, for this word, what are the things in the world to which the word can be applied? Alternatively, we can ask, for this thing, what are the linguistic expressions that can refer to it? The first perspective (the ‘referential’ perspective: ‘To what does this word apply?’) operationalizes the notion of prototype. Fruit designates, primarily, such things as apples, pears, and bananas – these are the fruit prototypes. Less commonly, the word might be used to refer to olives and tomatoes. The second perspective (the ‘onomasiological,’ or naming perspective: ‘What is this thing to be called?’) operationalizes the notion of basic level. It is evident that one and the same thing can be named by terms that differ in their specificity vs. generality. For example, the thing you are now sitting on might be called a chair, an office chair, a piece of furniture, an artifact, or even a thing. All of these designations could be equally ‘correct.’ Yet, in the absence of special reasons to the contrary, you would probably call the thing a chair. (This, for example, is probably the answer you would give if a foreign learner wanted to know what the thing is called in English.) Chair is a basic level term, the basic level being the level in a taxonomy at which things are normally named. The basic level has this special status because categorization at this level provides maximum information about an entity. Thus, at the basic level, chairs contrast with tables, beds, and cupboards – very different kinds of things, in terms of their appearance, use, and function. Terms at a lower level in a taxonomy, e.g., kitchen chair vs. 
office chair, do not exhibit such a sharp contrast while terms at a higher level are too general to give much information at all about an entity. Not surprisingly, basic level terms turn out to be of frequent use, they are generally quite short and morphologically simple, and they are learned early in language acquisition.
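The family-resemblance analysis of climb given earlier in this section can be made concrete with a small sketch. The feature labels are taken from the discussion above, but the representation of each use as a set of features, and the function names, are assumptions made purely for illustration. The sketch shows that no feature is shared by all three uses (so a classical, all-features definition fails), while the uses still cohere through chains of overlapping features.

```python
# Illustrative feature sets for three uses of "climb" (an assumed encoding).
PLANE = "The plane climbed to 30 000 feet"
CLIFF = "The mountaineers climbed along the cliff"
TREE = "The boy climbed the tree"

USES = {
    PLANE: {"ascend"},
    CLIFF: {"move laboriously using limbs"},
    TREE: {"ascend", "move laboriously using limbs"},
}


def shared_by_all(uses):
    """Features common to every use, as a classical definition would require."""
    return set.intersection(*uses.values())


def linked(a, b, uses):
    """Two uses are directly linked if they share at least one feature."""
    return bool(uses[a] & uses[b])


def forms_family(uses):
    """True if every use can be reached from any other via chains of shared features."""
    names = list(uses)
    seen, frontier = {names[0]}, [names[0]]
    while frontier:
        current = frontier.pop()
        for other in names:
            if other not in seen and linked(current, other, uses):
                seen.add(other)
                frontier.append(other)
    return len(seen) == len(names)
```

On this encoding the plane and cliff uses are not directly linked; it is the tree use, which exhibits both features, that holds the category together.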
The Usage-Basis of Cognitive Semantics
Langacker has described cognitive linguistics as a ‘usage-based’ approach. The claim can be understood in two ways. On the one hand, it could be a statement about the methodology of cognitive linguistic research. Usage-based research would be research based on authentic data, as documented in a corpus, recorded in the field, or elicited in controlled situations, rather than on invented, constructed data. Although different researchers might prefer different
methodologies, a glance at practically any publication by leading figures in the field, such as Lakoff, Langacker, and Talmy, will show that cognitive linguistics, as a movement, cannot reasonably be said to be ‘usage-based’ in this sense. On a second interpretation, usage-based refers to the presumed nature of linguistic knowledge and the manner in which it is acquired, mentally represented, and accessed. The claim, namely, is that a language is learned ‘bottom-up’ through exposure to usage events. A usage event presents the language user/learner with an actual vocalization in association with a fine-grained, context-dependent conceptualization. Acquisition proceeds through generalization over usage events. Necessarily, many of the context-dependent particularities of the usage events will be filtered out, leaving only a schematic representation of both the phonology and the semantics. In this respect, cognitive linguistics contrasts strikingly with ‘top-down’ theories of acquisition, whereby the basic ‘architecture’ of a language is presumed to be genetically given, exposure to usage data being needed merely to trigger the appropriate settings of innately given parameters. The usage-based approach raises two questions, which have loomed large in cognitive semantics research. These concern (a) the units over which schematization occurs, and (b) the extent of schematization. Let us first consider the second of these issues. One of the most vibrant areas of cognitive semantic research has been the study of lexical polysemy. It is a common observation that words exhibit a range of different meanings according to the contexts in which they are used. Indeed, the extent of polysemy appears to be roughly proportional to the frequency with which a word is used. Not surprisingly, among the most highly polysemous words in English are the prepositions. Consider the preposition on. 
Given such uses as the book on the table and the cat on the mat, it is easy to see how a schematic, de-contextualized image of the on-relation could emerge. It involves locating one object with respect to another in terms of such aspects as contact, verticality, and support. But the preposition has many other uses, as exemplified by the fly on the ceiling, the picture on the wall, the leaves on the tree, the writing on the blackboard, the washing on the clothes-line, the shoes on my feet, the ring on my finger. Do we proceed with further abstraction and schematization, coming up with a characterization of the on-relation that is compatible with all of these uses? Or do we identify a set of discrete meanings, which we may then attempt to relate in a prototype or a family resemblance category? If we adopt this latter
approach, another question arises, namely, just how many distinct meanings are to be postulated. Three? Ten? Several dozen? Do we want to say that the water on the floor and the cat on the mat exemplify different senses of on, on the grounds that the relation between cat and mat is not quite the same as that between the water and the floor? Needless to say, the issue becomes even more critical when we take into consideration the vast range of nonspatial uses of the preposition: on television, be on a diet, be on drugs, on Monday, and countless more. In general, as is consistent with a usage-based orientation, cognitive semanticists have tended to focus on the particularities of low-level generalizations, an approach that has frequently been censured for the ‘polysemy explosion’ that it engenders. Nevertheless, the role of more schematic representations is not denied. Langacker, in this connection, draws attention to the ‘rule-list fallacy.’ The fallacy resides in the notion that rules (high-level generalizations), once acquired, necessarily expunge knowledge of the lower-level generalizations on whose basis the rules have been abstracted. It is entirely plausible that high- and low-level generalizations might co-exist in the mental grammar. Indeed, knowledge of low-level generalizations – not too far removed, in terms of their schematicity, from actually encountered usage-events – may be needed in order to account for speakers’ fluency in their language. The topic interacts with a more general issue, namely, the relative roles of ‘computation’ vs. ‘storage’ in language knowledge and language use. Humans are not generally very good at computation, but we are quite adept at storing and retrieving specific information. Consider arithmetical operations. We can, to be sure, compute the product of 12 by 12 by applying general rules, but the process is slow and laborious and subject to error, and some people may need the help of pencil and paper. 
It is far easier, quicker, and more reliable to access the ready-made solution, if we have learned it, namely, that 12 × 12 = 144. The point of the analogy is that in order for speech production and understanding to proceed smoothly and rapidly, it may well be the case that we access ready-made patterns and preformed chunks, which have been learned in their specific detail, even though these larger units could be assembled in accordance with general principles. The role of formulaic language in fluency and idiomaticity has been investigated especially by linguists engaged in corpus-based lexicography and second language acquisition research. Their findings lend support to the view that linguistic knowledge may indeed be represented at a relatively low level. We might suppose, therefore, that the ring on my finger is judged to be
acceptable, not because some highly schematic, underspecified sense of on has been contextually elaborated, nor because some rather specific sense of on has been selected, but simply because speakers have encountered, and learned, such an expression. These considerations lead into the second aspect of a usage-based model: what are the units over which schematization takes place? The study of lexical semantics has typically been based on the assumption that schematization takes place over word-sized units. Indeed, the above discussion was framed in terms of how many meanings the preposition on might have. The study of idioms and related phenomena, such as collocations, constructions, and formulaic expressions, casts doubt on the validity of this assumption. Corpus-based studies, in particular, have drawn attention to the fact that words may need to be characterized in terms of the constructions in which they occur and, conversely, that constructions need to be characterized in terms of the words that are eligible to occur in them. It might be inappropriate, therefore, to speak of the ‘mental lexicon,’ understood as a list of words with their phonological and semantic properties. A more appropriate concept might be the ‘mental phrasicon,’ or the ‘mental constructicon.’ It would certainly be consistent with a usage-based model to assume that language is represented as schematizations over the units in terms of which language is encountered – not individual words as such, but phrases, constructions, and even utterance-length units.
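The arithmetic analogy for ‘computation’ vs. ‘storage’ used in this section can be sketched in a few lines. This is only an illustration of the analogy, not a model of language processing: the table of learned products, the use of repeated addition as the ‘general rule,’ and all names are assumptions introduced here.

```python
# Products learned by rote: an assumed times table up to 12 x 12.
# (The table is filled in programmatically only for brevity; the point
# is that, once stored, each answer is retrieved rather than worked out.)
LEARNED = {(a, b): a * b for a in range(1, 13) for b in range(1, 13)}


def compute_product(a, b):
    """Apply a general rule (repeated addition); return the result and the step count."""
    total, steps = 0, 0
    for _ in range(b):
        total += a
        steps += 1
    return total, steps


def product(a, b):
    """Retrieve a stored answer if there is one; otherwise fall back on computation."""
    if (a, b) in LEARNED:
        return LEARNED[(a, b)], "retrieved"
    return compute_product(a, b)[0], "computed"
```

Here `product(12, 12)` is answered by a single lookup, whereas `product(13, 12)`, which is not in the assumed table, has to be assembled step by step. The same division of labor is what the text suggests for preformed chunks such as the ring on my finger as against novel combinations.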
Construal
Linguistic meaning has often been approached in terms of the correspondence between an expression and the situation that it designates. Given the expression The cat is on the mat, and a situation in which there is a mat with a cat on it, we might be inclined to say that the linguistic expression fully and accurately describes the observed situation. The matter, however, is not so straightforward. For any conceived situation, certain facets will have been ignored for the purpose of its linguistic expression. Where was the mat? How big was it? What color was it? Was it laid out flat or was it rolled up? Was the cat in the center of the mat? Was the cat sitting or lying? And so on. Secondly, the speaker is able to categorize the situation at different levels of schematicity. Instead of saying that the cat is on the mat, the speaker could have stated that the animal is sprawled out on my new purchase. The speaker’s decisions to include or exclude certain facets of the scene, and to categorize the scene and its participants in a certain way, are symptomatic of the broader phenomenon of
‘construal,’ namely, the way in which a conceived situation is mentally structured for the purpose of its linguistic expression. There is a sense in which the whole cognitive semantics enterprise is concerned with how speakers construe a conceived situation and how this construal receives linguistic expression, as a function of the conventional resources of a particular language. Some important facets of construal are discussed below.
Figure-Ground Organization
A feature of our perceptual mechanism is that a perceived scene is structured in terms of ‘figure’ and ‘ground.’ Certain aspects of a scene are likely to be especially prominent and specifically attended to, whereas others are relegated to the background context. Given the situation of the cat and the mat, we are likely to say that the cat is on the mat, rather than that the mat is under the cat. Both wordings might be equally true in terms of their correspondence with the situation. Yet one would normally be preferred over the other. This preference is because we would most likely select the cat as the figure, whose location is described with respect to the mat, rather than the other way round. Figure-ground organization is ubiquitous in perception, most obviously in visual perception, but also in other modalities. When we listen to a lecture, the speaker’s voice is (hopefully) the auditory figure, which stands out against the sound of the air conditioning and of people coughing and shuffling. A number of aspects influence the figure-ground alignment. The figure, as the primary object of attention, is likely to be moveable and variable, it can act, or be acted on, independently of the ground, and it is likely to be more information-rich (for the perceiver) than the ground. Moreover, animate entities – especially if human – are likely to attract our attention as figure vis-à-vis inanimate entities. The ground, in contrast, is likely to be static relative to the figure, it is presupposed, and provides the context for the characterization of the figure. It must be emphasized, however, that while certain inherent features of a scene may strongly suggest a certain figure-ground alignment, we can often choose to reverse the relation. While at a lecture, we could consciously direct our attention to a background noise, relegating the speaker’s voice to the ground. Figure-ground organization is built into language at many levels. 
The contrast between an active clause and its passive counterpart can be understood in such terms. The farmer shot the rabbit presents the farmer as the figure – we are interested in what the farmer did. The rabbit was shot (by the farmer)
presents the rabbit as figure – we are interested in what happened to the rabbit. Note that what is at issue in these examples is not so much how the scene as such might be visually perceived, but how it is mentally organized by the speaker for its linguistic encoding. Figure-ground asymmetry is also relevant to the encoding of reciprocal relations. If A resembles B, then B obviously resembles A. Yet we would be far more likely to observe that a boy resembles his grandfather than to say that an old man resembles his grandson. We take the old man as the ground, against which the growing boy is assessed, rather than vice versa.
Force Dynamics
Another aspect of construal is illustrated by the contrast between The ball rolled along the floor and The ball kept rolling along the floor. There would be no way to differentiate these sentences in terms of objective features of the situations that they designate. Whenever the one sentence can truthfully be applied to a situation, so can the other. Yet the two sentences construe the situation differently. The difference was investigated by Talmy in terms of his notion of ‘force dynamics.’ We view entities as having an inherent tendency either for motion (or change) or for rest (or inaction). When entities interact, their inherent force dynamic tendencies also interact. The force of one entity may overcome, or fail to overcome, the force of another, or the two forces may be in equilibrium. Typically, in a force-dynamic interaction, our attention is on a figure entity (the agonist), whose behavior is tracked relative to an antagonist. The ball rolled along the floor presents the motion of the ball as resulting from its inherent tendency towards motion. But if we say that the ball kept rolling along the floor, we assume a force opposing the ball’s activity, which, however, was not strong enough to overcome the ball’s tendency towards motion. It is the verb keep that introduces a force-dynamic interaction into the situation, as we construe it. It conveys that the tendency towards motion of the agonist (i.e., the ball) was able to overcome an (unnamed) opposing force. The opposing force may, of course, be explicitly stated: The ball kept rolling, despite our attempt to halt it. Force-dynamic interaction holds even with respect to a ‘static’ situation. I kept silent designates the continuation of a static situation. The stasis, however, results from the fact that an (unnamed) antagonist was not powerful enough to cause the situation to change. 
Quite a few lexical items have an implicit force-dynamic content, such as keep, prevent, despite, and even finally and (to) manage. Thus, I finally managed to start my car not only conveys that I did start my
car, but also that I had to overcome an opposing force. Force dynamics offers an interesting perspective on causation. Prototypically, causation (as expressed by verbs such as cause or make) involves the agonist (the causer) exerting force that overcomes the inactivity of the antagonist. Variants of this scenario include letting and preventing. Let conveys that the agonist fails to engage with the antagonist, while prevent conveys that the agonist overcomes the disposition towards action of the antagonist. Another fruitful field of application has been in the study of modality (Sweetser, 1990). Thus, I couldn’t leave conveys that an unnamed antagonist (whether this be another person, a law or proscription, an ethical consideration, a broken leg, or even the fact of a locked door) overcame my disposition to leave. Similarly, I had to leave presents my leaving as resulting from a force that overcame my disposition to remain where I was.
Objective vs. Subjective Construal
Any conceptualization involves a relation between the subject of conceptualization (the person entertaining the conceptualization) and the object of conceptualization (the situation that is conceptualized). In The cat is on the mat, the object of conceptualization is, obviously, the location of the cat vis-à-vis the mat. Although not explicitly mentioned in the sentence, the subject of conceptualization is relevant to the conceptualization in a number of ways. Firstly, the use of the definite noun phrases the cat and the mat conveys that the referents of these expressions are uniquely identifiable to the speaker and, also, that the speaker expects the hearer to be able to uniquely identify the referents. (It’s not just a cat, but the cat.) Also, the use of the tensed verb is conveys that the situation is claimed to hold at the time the speaker utters the expression. Since the speaker’s role is not itself the object of conceptualization, we may say that the speaker is being construed subjectively. Langacker has illustrated the notion of objective vs. subjective construal by means of an analogy. For persons who need to wear them, their spectacles are not usually the object of their visual experience. Spectacles function simply as an aid to the seeing process but are not themselves seen. Their role is therefore a subjective one. A person can, to be sure, take off their spectacles and visually examine them, in which case, the spectacles are viewed objectively. ‘Objectification,’ then, is the process whereby some facet of the subject of conceptualization becomes the object of conceptualization. ‘Don’t talk to your mother like that,’ a woman says to her child. Here, the speaker makes herself the object of conceptualization by referring to herself in the third person. ‘Subjectification,’
in contrast, is the process whereby some facet of the object of conceptualization gets to be located in the subject of conceptualization. Take, as an example, the contrast between Jim walked over the hill and Jim lives over the hill. The first sentence profiles the motion of the figure entity vis-à-vis the ground. The second merely designates the location of the figure. The location, however, is presented as one that lies at the end of a path that goes over the hill. Importantly, the path is not traced by the object of conceptualization, that is, by Jim. Rather, it is the subject of conceptualization who mentally traces the path. Subjectification has been identified as an important component of grammaticalization. Consider the use of (be) going to as a marker of the future. Ellen is going to the store can be construed objectively – Ellen is currently engaged in the process of moving towards the store. If we continue to observe Ellen’s motion, we will probably find that she ends up at the store. We can easily see how (be) going to is likely to take on connotations of prediction. Indeed, Ellen is going to the store might be interpreted in just such a way, not as a statement about Ellen’s current activity, but as a prediction about the future. Similarly, It’s going to rain and You’re going to fall have the force of a prediction, extrapolated from the observation of current circumstances. Notice, in these examples, that in spite of the use of the verb go, there is no objective movement, whether literal or metaphorical, towards the future situation. Rather, it is the conceptualizer who mentally traces the future evolution of the present situation. The idea of motion, contained in the verb go, has been subjectified, that is, it has been located in the subject of conceptualization. 
A special manifestation of subjectification is the phenomenon of ‘fictive motion.’ This typically involves the use of a basically dynamic expression to designate an objectively static situation. Go, we might say, is basically a motion verb, or, more generally, a change of state verb (I went to the airport, The milk went sour, The lights went red). But consider a statement that the road goes through the mountains. No motion is involved here – the road is merely configured in a certain way, it does not (objectively) go anywhere. The idea of motion implied by go can, however, be attributed to the subject of conceptualization. One mentally traces the path followed by the road through the mountains. Mental motion on the part of the conceptualizer is also invoked in reference to the road from London to Oxford, which, of course, could be the very same entity, objectively speaking, as the road from Oxford to London. Similarly, one and the same entity could be referred to, either as the gate into the garden or the gate out of the garden.
Linguistic Conventions
Although speakers may construe a situation in many alternate ways, their options are to some extent constrained by the linguistic resources available to them. The matter can be illustrated with respect to language-specific lexicalization patterns. Talmy has drawn attention to alternative ways in which a motion event can be linguistically encoded. Consider the English expression I flew across the Atlantic. In English (and in other Germanic languages), we prefer to encode the manner of motion by means of the verb (fly), the path of the motion being expressed in a prepositional phrase (across the Atlantic). In Romance languages, an alternative construal is preferred. Path is encoded by the verb, manner by means of an adverbial phrase: J’ai traversé l’Atlantique en avion ‘I crossed the Atlantic by plane.’ Notice that, in the French sentence, the statement of the manner of motion is optional; the French speaker does not have to state how the Atlantic was crossed, merely that it was crossed. Comparison of the ways in which speakers of different languages give linguistic expression to visually presented situations, and of the ways in which texts in one language are translated into another, supports the notion that situations tend to be construed in a manner that is compatible with the construals made available by the conventional resources of different languages (Slobin, 1996). For example, speakers of English (and Germanic languages) will tend to specify the manner of motion in much finer detail than speakers of Romance languages.
Embodiment
An important theme in cognitive semantic research has been the insight that the relation between words and the world is mediated by the language user him/herself. The language user is a physical being, with its various parts, existing in time and space, who is subject to a gravitational field, and who engages in bodily interaction with entities in the environment. Quite a number of our concepts are directly related to aspects of our bodily experience. To put the matter somewhat fancifully: if we humans were creatures with a different mode of existence, if, for example, we were gelatinous, airborne creatures, floating around in the stratosphere, it is doubtful whether we could ever have access to many of the concepts that are lexicalized in presently existing human languages. Thus, to understand the concept of what it means for an object to be heavy, we have to have experienced the sensation of holding, lifting, or trying to move, a heavy object. The notion of heavy cannot be fully explicated
in purely propositional terms, nor in terms of verbal paraphrase. A characteristic of basic level terms, in particular, is that, very often, they are understood in terms of how we would typically interact with the entities in question. Consider the concept of chair. We understand the concept, not simply in terms of what chairs look like, nor even in terms of their various parts and how they are interrelated, but in terms of what we do with our bodies with respect to them, namely, we sit on them, and they support our body weight. We have no such ‘embodied’ conceptualization of more schematic concepts such as ‘thing’ or ‘artifact.’ We do not understand these categories in terms of how we characteristically interact with them. The role of bodily experiences has been elaborated in the theory of image schemas (Johnson, 1987; Lakoff, 1987). ‘Image schemas’ are common recurring patterns of bodily experience. Examples include notions of containment, support, balance, orientation (up/down), whole/part, motion along a path from a source to a goal, and many more. (Force dynamic interactions, discussed above, may also be understood in image schematic terms.) Take the notion of balance. We experience balance when trying to stand on one leg, when learning to ride a bicycle, or when trying to remain upright in a strong wind. The notion involves the distribution of weights around a central axis. (Balance, therefore, is understood in force-dynamic terms.) The notion can be applied to many domains of experience. We can speak of a balanced diet, a balanced argument, a political balance of power, and of the balance of a picture or photograph. One could, no doubt, analyze these expressions as examples of metaphor. This approach, however, might be to miss the embodied, nonpropositional nature of the concept. Our experience of balancing provides a primitive, experiential schema that can be instantiated in many different domains.
Compositionality
A particularly contentious issue in semantics concerns the question of compositionality. According to the compositionality principle, the properties (here: the semantic properties) of the whole can be computed from the properties of the parts and the manner of their combination. From one point of view, compositionality is a self-evident fact about human language. The cat is on the mat means what it does in virtue of the meanings of the component words, and the fact that the words stand in certain syntactic configurations. Speakers of English can work out what the sentence means; they do not have to have specifically learned this sentence. Unless compositionality were
a feature of language, speakers would not be able to construct, and to understand, novel sentences. The very fact of linguistic creativity suggests that compositionality has got to be the case. Not surprisingly, therefore, in many linguistic theories, the compositionality of natural languages is axiomatic, and the study of semantics is to a large extent the study of the processes of semantic composition. Cognitive linguists, however, have drawn attention to some serious problems with the notion. It is, of course, generally accepted that idioms are problematic for the compositionality principle. Indeed, idioms are commonly defined as expressions that are not compositional. The expression spill the beans ‘inadvertently reveal confidential information’ is idiomatic precisely because the expression is not compositional, that is, its meaning cannot be worked out on the basis of the meanings that spill and beans have elsewhere in the language. Leaving aside obviously idiomatic expressions – which, by definition, are noncompositional in their semantics – it is remarkable that the interpretation of an expression typically goes beyond, and may even be at variance with, the information that is linguistically encoded. Langacker (1987: 279–282) discussed the example the football under the table. The expression is clearly not idiomatic, neither would it seem to be problematic for the compositionality principle. Take a moment, however, to visualize the described configuration. Probably, you will imagine a table standing in its canonical position, with its legs on the floor, and the football resting on the floor, approximately in the center of the polygon defined by the bottom of the table’s legs. Note, however, that these specific details of the visualization were not encoded in the expression – they have been supplied on the basis of encyclopedic knowledge about tables. The purely compositional meaning of the expression has been enriched by encyclopedic knowledge. 
There is more to this example, however. If you think about it carefully, you will see that the enriched interpretation is in an important sense at variance with the compositional meaning. If by ‘X is under Y,’ we mean that X is at a place lower than the place of Y, the football, strictly speaking, is not actually under the table at all. Specifically, the football is not at a place that is lower than the lowest part of the table. In interpreting even this seemingly unproblematic expression, we have had to go beyond, and to distort, its strictly compositional meaning.
This state of affairs is not unexpected on a usage-based model. The resources of a language – lexical, syntactic, phraseological – are abstractions over encountered uses. The meanings abstracted from previous usage events are necessarily schematic, and may not fit precisely the requirements of the situation at hand. In giving linguistic expression to a conceptualization, we search for the linguistic resources that most closely match our intentions, accepting that some discrepancies and imprecisions are likely to occur. We trust to the inferencing powers of our interlocutors to achieve the fit between the expression and the intended conceptualization.
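The football example can be made concrete with a small sketch. The Python below is an illustration only and is not part of Langacker's analysis: the Extent class, both predicates, and all measurements are hypothetical assumptions. It contrasts a strictly compositional reading of under (lower than the lowest part of the landmark) with an enriched reading that draws on encyclopedic knowledge of tables (beneath the tabletop and within its footprint).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Hypothetical 2-D extent of an object: horizontal span and vertical span in metres."""
    x_min: float
    x_max: float
    z_min: float  # lowest point of the object
    z_max: float  # highest point of the object


def strictly_under(x: Extent, y: Extent) -> bool:
    """Literal reading of 'X is under Y': X is lower than the lowest part of Y."""
    return x.z_max < y.z_min


def enriched_under(x: Extent, top: Extent) -> bool:
    """Enriched reading (an assumption): X is beneath the tabletop and inside its footprint."""
    within_footprint = top.x_min <= x.x_min and x.x_max <= top.x_max
    return within_footprint and x.z_max <= top.z_min


# Assumed measurements: a table standing on the floor and a football resting on the floor.
table = Extent(x_min=0.0, x_max=1.2, z_min=0.0, z_max=0.75)      # whole table, legs included
tabletop = Extent(x_min=0.0, x_max=1.2, z_min=0.70, z_max=0.75)  # the top surface only
football = Extent(x_min=0.5, x_max=0.72, z_min=0.0, z_max=0.22)

print(strictly_under(football, table))     # False: the legs reach the floor too
print(enriched_under(football, tabletop))  # True: the reading speakers actually arrive at
```

On these assumed measurements the strict predicate fails while the enriched one succeeds, which mirrors the discrepancy between the compositional meaning and the interpretation that speakers construct.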
The Conceptual Basis of Syntactic Categories In many linguistic theories, syntax constitutes an autonomous level of organization, which mediates between phonology and semantics. As pointed out, cognitive linguistics rejects this approach. Rather, syntactic organization is itself taken to be inherently meaningful. Several things flow from this conception of syntactic organization. First, the notion of ‘meaningless’ morphemes gains little support. It is sometimes said, for example, that the preposition of is a dummy element in expressions such as the destruction of the city, inserted by the syntax in order to satisfy the constraint that a noun cannot take a noun phrase as its complement. The cognitive semantic view of the matter would be that of does indeed have a meaning, albeit a fairly abstract one; specifically, of profiles an intrinsic relation between entities. Just as talk of a student entails some subject matter that is studied, and talk of a photograph entails some thing that was photographed, so talk of destruction entails some entity that was destroyed. These inherent relations between entities are profiled by the same preposition: destruction of the city, a student of physics, a photograph of me. More far-reaching, perhaps, are the implications of the cognitive linguistic approach for the study of word classes. It is almost a truism, in modern linguistics, that word classes – noun, verb, adjective, etc. – must be defined, not in semantic terms, but in terms of their distribution. The word explosion is a noun, not because of what it means, but because it distributes like a noun – it can take a determiner, it pluralizes, and so on. Such an approach is tantamount to claiming that syntax constitutes an autonomous level of linguistic organization, independent of semantics. 
Many cognitive linguists, committed to the symbolic view of language, have been skeptical of this approach and have reexamined the traditional view that word classes are indeed semantically based. There are a number of ways in which the conceptual approach can be implemented. One is a prototype approach. Prototypically, nouns designate concrete, enduring, individuated objects, while verbs designate rapid changes of state (Givón, 1984). A problem with this approach is that explosion, while semantically not at all a prototypical noun, is nevertheless a noun, whose distributional properties are fully comparable with those of supposedly prototypical nouns, such as table and chair. A second approach is functional (Croft, 1991). Nominals designate what is being talked about; adjectivals specify nominals in greater detail; verbal predications make assertions about nominals; while adverbials specify verbal predications more precisely. Each of these functionally defined categories has prototypical instances. Less prototypical instances often bear distinctive morphological markings. Thus, explosion betrays its nonprototypical status as an entity to be talked about by its derivational morphology.
Langacker’s aim has been more ambitious. It is to offer unified conceptual definitions of the major lexical and syntactic categories. Essentially, the claim is that the syntactic category of a word is determined by the nature of its profile. Conversely, the status of a word as noun, verb, adjective, etc., imposes a certain kind of profile on the associated semantic representation. A first distinction is between nominal vs. relational profiles. A good way to understand the distinction is by reference to autonomous vs. dependent conceptualizations. A concept is ‘autonomous’ if it can be entertained without necessary reference to other entities. Of course, there can be no such thing as a fully autonomous concept, given the ubiquity of domain-based knowledge and of the profile-base relation in the understanding of concepts. Nevertheless, relatively autonomous concepts can be proposed, for example, the concept of hypotenuse. As stated earlier, the word hypotenuse profiles a straight line. Although the concept is understood against the base of a right-angled triangle, the word does not profile the triangle, nor the relation of the hypotenuse to the triangle. It is in this sense that nominal profiles are autonomous.
Compare, now, the preposition on. The word profiles a kind of (prototypically: spatial) relation between two entities, often referred to as the ‘trajector’ and the ‘landmark’ of the relation. The trajector can be thought of as the figure, i.e., the more prominent element in the relation, the landmark as the ground, i.e., the less prominent participant. Without some schematic notion of the trajector and landmark, the notion of ‘on’ lacks coherence. It is in this sense that the conceptualization associated with on is ‘dependent’ – it inherently requires reference to other entities. Relational profiles are subject to further distinctions. On designates an atemporal relation – the time at which the relation holds, or over which it holds, is not profiled. Verbs, on the other hand, inherently designate temporal relations. Like, as a verb, designates a relation between a trajector (the verb’s subject) and a landmark (its direct object). The temporality of the relation is a facet of the profile. Another distinction concerns the nature of the trajector and landmark. These may themselves be either nominal or relational, and, if relational, temporal or atemporal. Prepositions (before lunch) take a nominal landmark, whereas subordinating conjunctions (before we had lunch) take as their landmark a temporal relation (i.e., a clause). Figure 2 depicts a taxonomy of the major lexical categories based on the nature of their profiles.
Figure 2 Taxonomy of main lexical categories. Reproduced from Taylor J R (2002) Cognitive grammar. Oxford: Oxford University Press, with permission from Oxford University Press.
The combination of smaller units into larger configurations can now be understood in terms of the way in which the profiles of the smaller units can be combined. Figure 3 illustrates the assembly of the book on the table. (The role of the determiners is ignored so as not to unduly complicate the discussion.) In accordance with conventions established by Langacker, nominal profiles are represented by circles, relations by lines between circles, while profiled entities (whether nominal or relational) are depicted in bold. The table, having a nominal profile, is able to function as the landmark of on. The resulting expression, on the table, inherits the relational profile of on. The book is able to function as the trajector of this expression, whereby the resulting expression, the book on the table, inherits the nominal profile of book. The composite expression thus designates a book; the book, however, is one that is taken to be on the table. The pattern illustrated in Figure 3 is valid, not only for the assembly of the specific expression in question, but also, mutatis mutandis, for the assembly of any nominal modified by a prepositional phrase. The pattern, therefore, is able to function as a schema that sanctions expressions of a similar internal structure.
Figure 3 Combination of smaller units into larger configurations. Reproduced from Taylor J R (2002) Cognitive grammar. Oxford: Oxford University Press, with permission from Oxford University Press.
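The assembly of the book on the table can be sketched as a pair of typed operations. The Python below is a minimal illustration under stated assumptions, not Langacker's formalism: the class names Nominal and Relational and the functions elaborate_landmark and elaborate_trajector are hypothetical labels chosen here. The point it shows is profile inheritance: filling the landmark of on yields an expression that is still relational, whereas combining a nominal trajector with that expression yields one that is nominal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nominal:
    """An expression with a nominal profile, e.g. 'the table' (hypothetical class)."""
    form: str


@dataclass(frozen=True)
class Relational:
    """An expression with a relational profile, e.g. the preposition 'on'."""
    form: str
    temporal: bool = False              # prepositions are atemporal; verbs are temporal
    landmark: Optional[Nominal] = None  # filled in during assembly


def elaborate_landmark(relation: Relational, landmark: Nominal) -> Relational:
    """on + the table -> on the table: the composite inherits the relational profile."""
    if not isinstance(landmark, Nominal):
        raise TypeError("in this sketch a preposition takes a nominal landmark")
    return Relational(f"{relation.form} {landmark.form}", relation.temporal, landmark)


def elaborate_trajector(trajector: Nominal, relation: Relational) -> Nominal:
    """the book + on the table -> the composite inherits the nominal profile of book."""
    if relation.temporal:
        raise ValueError("this sketch covers atemporal (prepositional) modifiers only")
    return Nominal(f"{trajector.form} {relation.form}")


on = Relational("on")
on_the_table = elaborate_landmark(on, Nominal("the table"))
the_book_on_the_table = elaborate_trajector(Nominal("the book"), on_the_table)

print(type(on_the_table).__name__, "|", on_the_table.form)
print(type(the_book_on_the_table).__name__, "|", the_book_on_the_table.form)
```

Because the two functions are stated over types rather than over particular words, they play the role of the schema described above: any nominal modified by a prepositional phrase can be assembled in the same way.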
Relativism vs. Nativism The cognitive semantics program highlights the tension between relativism and nativism. The relativist position is that a language brings with it certain categorizations and conceptualizations of the world. The language that one speaks therefore imposes certain construals of the world. It will be evident that a number of themes in cognitive semantics are liable to emphasize the language-specific, and indeed the culture-specific character of semantic structures. For example, emphasis on the symbolic nature of language, in particular the proposal to ground syntactic categories and syntactic relations in conceptual terms, would lead one to suppose that different syntactic structures, as manifested in different languages, would be based in different conceptualizations of the world. Equally, a focus on the role of domain-based knowledge in the characterization of meaning is likely to accentuate the culture-specific nature of linguistic semantics.
On the other hand, several themes in cognitive linguistics are likely to be compatible with the nativist position, according to which the commonalities of human languages reflect a common, universal cognitive endowment. For example, the claim that language is embedded in general cognitive processes and abilities – if combined with the not unreasonable assumption that all humans share roughly the same cognitive capacities – would tend to highlight what is common to the conceptualizations symbolized in different languages. All languages, it may be presumed, manifest embodiment, image schematic and force dynamic construals, Figure-Ground asymmetries, and nominal vs. relational profiling.
Aware of the tension between relativism and nativism, Langacker has scrupulously distinguished between ‘conceptual structure’ and ‘semantic structure.’ Conceptual structure – how people perceive and cognize their world (including the inner world of the imagination) – is taken to be universal and based in shared capacities. Semantic structure, on the other hand, pertains to the way in which conceptual structure is formatted so that it is consistent with the conventionalized resources of a given language. Compare the ways in which English and French speakers refer to bodily sensations of cold. English requires an attributive adjectival construction (I am cold), whereas French requires a possessive construction (J’ai froid ‘I have cold’).
Although the experience is construed differently in the two languages, one cannot on this basis alone draw the inference that English and French speakers differ in their phenomenological experience of ‘being cold.’ In order to substantiate the claim that the different semantic, syntactic, and lexical resources of different languages do influence conceptualizations of the world, it would be necessary to go beyond the purely linguistic evidence and document correlations between linguistic organization and nonlinguistic cognition. Currently, the matter is hotly debated. Evidence is, however, emerging that the different construals conventionalized in different languages may sometimes have repercussions in nonlinguistic domains, giving some support to the relativist position. For example, in English (as in many other languages), we may state the location of an object by saying that it is in front of us, to our left, or behind another object or person. In some languages, for example, in Guugu Yimithirr (Guguyimidjir; spoken in Northern Queensland, Australia) and Tzeltal (spoken in Mexico), these resources are not available. Rather, an object’s location has to be stated with respect to the cardinal points (to the north, etc.) or to some fixed geophysical landmark (upstream, mountain-wards). Such differences in the linguistic construal of spatial relations have been shown to correlate with nonlinguistic spatial cognition, for example, speakers’ proficiency in dead-reckoning, that is, their ability to track their current location in terms of distance and direction from their home base (Levinson, 2003).
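Dead-reckoning, as glossed above, amounts to keeping a running sum of displacements expressed in absolute (cardinal) terms. The Python below is a minimal arithmetic sketch of that idea, not a model of the experimental tasks reported by Levinson: the function name dead_reckon, the convention of bearings in degrees clockwise from north, and the sample journey are all assumptions made for illustration.

```python
import math


def dead_reckon(legs):
    """Return (distance, bearing) from the current position back to the home base.

    legs is a sequence of (distance, bearing) pairs, one per stretch travelled,
    with bearings in degrees clockwise from north (an assumed convention).
    """
    # Accumulate the total displacement from home in cardinal components.
    east = sum(d * math.sin(math.radians(b)) for d, b in legs)
    north = sum(d * math.cos(math.radians(b)) for d, b in legs)
    distance_home = math.hypot(east, north)
    # The way home is the reverse of the accumulated displacement.
    bearing_home = math.degrees(math.atan2(-east, -north)) % 360
    return distance_home, bearing_home


# Hypothetical journey: 3 units due north, then 4 units due east.
distance, bearing = dead_reckon([(3, 0), (4, 90)])
print(round(distance, 2), round(bearing, 2))  # 5.0 233.13
```

The computation never refers to the traveller's own front, back, left, or right; only cardinal directions enter into it, which is the kind of bookkeeping that an absolute system of spatial description keeps in constant use.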
Conclusion Meaning is central to linguistic enquiry. Meaning, after all, is what language is all about. Yet meaning is a notoriously difficult topic to analyze. What is meaning, and how are we to study it? Some semanticists have studied meaning in terms of relations between language and situations in the world. Others have focused on relations within a language, explicating meanings in terms of paradigmatic relations of contrast, synonymy, hyponymy, entailment, and so on, and syntagmatic relations of collocation and co-occurrence. Yet others have tried to reduce meaning to matters of observable linguistic behavior. Cognitive semanticists have grasped the nettle and taken seriously the notion that meanings are ‘in the head,’ and are to be equated with the conceptualizations entertained by language users. Cognitive semantics offers the researcher a theoretical framework and a set of analytical tools for exploring this difficult issue.
See also: Cognitive Grammar; Cognitive Linguistics; Concepts; Construction Grammar; Corpus Lexicography; Frame Semantics; Grammaticalization; Idioms; Langacker, Ronald (b. 1942); Metaphor: Psychological Aspects; Metonymy; Modularity of Mind and Language; Onomasiology and Lexical Variation; Polysemy and Homonymy; Prototype Semantics; Relativism; Saussure: Theory of the Sign; Semantics of Spatial Expressions; Spatiality and Language.
Bibliography
Barlow M & Kemmer S (2000). Usage based models of language. Stanford: CSLI Publications.
Croft W (1991). Syntactic categories and grammatical relations: the cognitive organization of information. Chicago: University of Chicago Press.
Croft W & Cruse D A (2004). Cognitive linguistics. Cambridge: Cambridge University Press.
Cuyckens H, Dirven R & Taylor J (eds.) (2003). Cognitive approaches to lexical semantics. Berlin: Mouton de Gruyter.
Givón T (1984). Syntax: a functional-typological approach 1. Amsterdam: John Benjamins.
Johnson M (1987). The body in the mind: the bodily basis of meaning, imagination, and reason. Chicago: University of Chicago Press.
Kay P (1997). Words and the grammar of context. Chicago: University of Chicago Press.
Keil F (1989). Concepts, kinds, and conceptual development. Cambridge, MA: MIT Press.
Lakoff G (1987). Women, fire, and dangerous things: what categories reveal about the mind. Chicago: University of Chicago Press.
Lakoff G & Johnson M (1980). Metaphors we live by. Chicago: University of Chicago Press.
Langacker R W (1987). Foundations of cognitive grammar 1: Theoretical prerequisites. Stanford: Stanford University Press.
Langacker R W (1990). Concept, image, and symbol: the cognitive basis of grammar. Berlin: Mouton de Gruyter.
Langacker R W (1991). Foundations of cognitive grammar 2: Descriptive application. Stanford: Stanford University Press.
Langacker R W (1999). Grammar and conceptualization. Berlin: Mouton de Gruyter.
Lee D (2001). Cognitive linguistics: an introduction. Oxford: Oxford University Press.
Levinson S (2003). Space in language and cognition: explorations in cognitive diversity. Cambridge: Cambridge University Press.
Murphy G & Medin D (1985). ‘The role of theories in conceptual coherence.’ Psychological Review 92, 289–316.
Searle J (1992). The rediscovery of the mind. Cambridge, MA: MIT Press.
Slobin D (1996). ‘From “thought and language” to “thinking for speaking”.’ In Gumperz J & Levinson S (eds.) Rethinking linguistic relativity. Cambridge: Cambridge University Press. 70–96.
Sweetser E (1990). From etymology to pragmatics: metaphorical and cultural aspects of semantic structure. Cambridge: Cambridge University Press.
Talmy L (2000). Towards a cognitive semantics 1: Conceptual structuring systems. Cambridge, MA: MIT Press.
Talmy L (2003). Towards a cognitive semantics 2: Typology and process in concept structuring. Cambridge, MA: MIT Press.
Taylor J R (2002). Cognitive grammar. Oxford: Oxford University Press.
Taylor J R (2003). Linguistic categorization (3rd edn.). Oxford: Oxford University Press. First edition: 1989.
Ungerer F & Schmid H-J (1996). An introduction to cognitive linguistics. London: Longman.
Wittgenstein L (1953). Philosophical investigations. Anscombe G E M (trans.). Oxford: Blackwell.
Cognitive Technology B Gorayska, University of Cambridge, Cambridge, UK © 2006 Elsevier Ltd. All rights reserved.
Scholarly Discipline The primary area of inquiry is Technological Cognition (TC), which examines what happens to humans when they augment themselves with technologies, either physically or cognitively, to amplify their natural capabilities. The aim of CT/TC is to formulate and to test theories of human cognitive processes that interact with technological artifacts and are partially formed by those interactions. As a fluid, symbiotic hybrid of the embodied mind and its tools, such a technologized cognition has epistemic effects: It affords an understanding and control of the external world that otherwise would not have been possible. As rational agents, humans develop and use tools to empower themselves in the real world which exists independently of them. This places CT/TC firmly in the realist tradition, in direct contrast with the assumptions of postmodernism (see Postmodernism): if the external world were merely a social construct, where reality is reduced to, and interpreted as, text, tool augmentation other than that related to natural language (e.g., metaphor) would be superfluous and CT/TC would lose its raison d’être. The belief that the human mind is molded by tools and open to scientific scrutiny also dissociates CT/TC from any theoretical framework that is essentially behaviorist in nature (see Behaviorism: Varieties).
Methodology for Tool Design Dialectic adaptation of the mind to the operations of its tools – a process that is often tool-coerced – leads to technological change. Increased technological sophistication forces us to make ethical choices. Of interest here is a search for design methods and practice capable of eliminating, before they even arise, any undesirable effects of tool use on users. The main question is “Which design methods and practice will result in tool-mind-world hybrids that optimally benefit humankind?” CT, understood as a methodology for design, is thus a process: an approach to design, not a product of such a design. We can design
tools in accordance with CT principles, informed by the theoretical developments of TC, but those tools are not in themselves instances of CT. The search for ethical factors in tool design means that studies undertaken within CT are primarily about people and not about the technologies that augment them. This distinguishes CT from investigations of the so-called human factors in designing human-centered systems that promote ‘naturalness’ in human-tool exchanges and aim at designing ergonomic, user-friendly tools, in order to fully accommodate the limits of human performance and fully exploit the advantages of the user (as practiced in areas such as Cognitive Ergonomics, Cognitive Engineering, and Engineering Psychology). What such practices do not overtly address is the question of who really stands to benefit. Nor do they explicitly consider that every tool is a prosthetic device (some mental functions become redundant), that the user/tool relation is one-to-many (with the dangers of stress overload), that sophisticated tools can obscure dignity at work, that they separate people from their natural habitat or change their perception of their own competence. By contrast, CT aims to bring humane factors to bear on design. The tool-mind-world hybrids designed according to the principles of CT are intended to be essentially anthropocentric, i.e., they maximize human (user) benefits (see Figure 1).
Figure 1 Anthropocentric CT design versus traditional human-centered design.
Mind-Amplifying Tool CT takes a broad view of technology. All artifacts are in some measure cognitive tools. We need to understand them in terms of their goals and computational constraints as well as in terms of the external physical and social environments that shape and afford cognition. Any technology that serves that purpose by providing a tool has implications for CT/TC. Cognitive tools can be situated on a continuum of purposeful use between the extremes of raw material and the brain. They can be more or less externalized or detached from the brain or the body (cf. bionic prostheses incorporating direct brain implants with word processors vs. spades). The greater the detachment, the lower the brain-tool connectivity and alignment. The lower the connectivity, the lower the ability to attend to the feedback the brain receives at the interface; consequently, the less the need for people to adapt to the tool. Within this continuum, we can distinguish between natural technologies and fabricated technologies.
Natural Technologies
Natural technologies are instances of mental techné, i.e., learned skills and mental competencies. Examples proposed within CT include a body image generator responsible for perceiving one’s own body size, the cognitive processes within the prefrontal cortex responsible for our social competence, or the cognitive processes evolved in autism. Here also belong learned aspects of natural language (see Whorf, Benjamin Lee (1897–1941)), narratives, mnemonic systems for improving memory, task-dedicated cognitive processes in logic and arithmetic, or metaphors in cultural heritage. The mind itself can be viewed as a complex, goal-driven toolkit of natural cognitive artifacts that bootstrap on what is hard-wired, in order to make available a much bigger set of such artifacts. The mind thus facilitates interactions with the environment through the organization and integration of perceptual data, memory, and goal-management.
Fabricated Technologies
While some fabricated technologies are tangible (pens, paper, computers, mechanical gadgets, telecommunication devices, medical equipment, or flight simulators are obvious examples), others are not, but nonetheless they affect our mental capabilities and competences: e.g., writing systems, narratives, artificial languages, and so on. Cybernetic systems such as robots can become a scientist’s cognitive tools if they are built to further scientific progress. They have been applied to understand how categorical perception works, how intelligence of natural systems is embodied, and to the question of whether there is a fundamental distinction between humans and machines. In artificial intelligence, computation has been employed to formulate and to test theories of intelligence or the nature of consciousness. In computational linguistics (see Computational Linguistics: History), automated grammars have aided linguists in their efforts to understand the workings of natural language.
Any systematized environment that constrains cognition results in cognitive change. Perhaps the best example is a prison system where people become habituated to its inherent operations; many are unable to function effectively upon release. Or, take an organizational merger: Its most challenging aspect is consolidating the diversity of employees’ mental cultures.
Perspectives on Mind Change The simplest form of cognitive change is learning, and many technologies are purposefully developed to this end. While people will adapt to any technology, the extent of their adaptation depends on the extent of the cognitive fit between the human mind and the augmenting technology. When some mental processes are taken over by technology, the relevant natural technologies augment, too. A subsequent absence of the tool may then render the acquired techné ineffective. (As an example, consider the impact of word processors on generating texts.) Such a technological augmentation is not permanent and can be unlearned. Technological augmentation can also have lasting effects. Longitudinal studies in archaeology reveal that cognitive fluidity can be directly linked to tool use. Studies in primatology and developmental psychology show that language constructs, in particular storytelling, played a crucial role in handling the complexity of the social dynamics responsible for the evolution of the primate brain. Social and narrative intelligence requires a larger neocortex, hence a bigger brain size. Can we predict a priori the long-term effects of a given technology on the human user? The difficulty of this question has been recognized with respect to language. The meaning of a linguistic symbol depends on how, and in which context, it is used (see Context, Communicative; Wittgenstein, Ludwig Josef Johann (1889–1951)). Language becomes alive and unfolds its history via its interaction with people. Meanings transform across different epochs and different cultures as a result of interpreting and reinterpreting language. The context of its use, and the medium through which it is used, have a bearing on language itself. Consider the impact of the Internet and mobile phones (texting) on the art of conversation or the spelling of words. The same is true of any artifact.
The internal structure of a design and the form perceived at the interface are insufficient to derive the social and personal significance or the subsequent development of a piece of technology. The latter can only happen in the context in which it is used. In the absence of such context, there is no unique answer to the question “What is this artifact for?” Its affordances for action (the toolness of a tool) only become manifest when the mind comes in contact with it in some context. To predict the consequences of tool use, it is necessary (though not sufficient) to understand the social, psychological, and cognitive mechanisms that create the need for technological augmentation. A theoretical framework with a potential for such investigations must necessarily deal with motives and benefits, in other words, with relevance. Within CT, the Theory of Relevance, originally developed with respect to language use (see Relevance Theory), has been broadened to encompass all cognitive processes (symbolic and connectionist) involved in action planning and goal management. This extended framework can be invoked to explain the modularity of mind and remove a variety of difficulties experienced in the symbol-driven acquisition of natural technologies or the design of fabricated ones. It provides grounds for classifying a dedicated inferential comprehension module as an instance of natural CT. By recognizing a multitude of cognitive interfaces (e.g., between perception, consciousness, knowledge, motivation, emotion, action, natural/fabricated technologies, external situations), the extended framework can assist the exploration of (1) the extent to which the constraints on our mental life are biologically or technologically determined, and (2) how language techné interacts with other aspects of cognition, facilitating, ultimately, the choice between various proposals for developing a humanized linguistic technology. Even so, the big question, whether, and to what extent, technology can be humanized, remains open (see Adaptability in Human-Computer Interaction).
See also: Adaptability in Human-Computer Interaction; Behaviorism: Varieties; Computational Linguistics: History; Postmodernism; Relevance Theory; Whorf, Benjamin Lee (1897–1941); Wittgenstein, Ludwig Josef Johann (1889–1951).
Bibliography Beynon M, Nehaniv C L & Dautenhahn K (eds.) (2001). Cognitive Technology: instruments of mind, CT01. Lecture Notes in AI. Berlin: Springer. 2117. Brooks R (2002). Robot: the future of flesh and machines. London: Penguin Books. Clark A (2000). Mindware: introduction to Cognitive Science. Oxford: Oxford University Press. Clark A (2003). Natural-born cyborgs: minds, technologies, and the future of human intelligence. New York: Oxford University Press. CT ‘99 Conference, Proceedings (1999). Gorayska B & Lindsay R (1989). Metasemantics of relevance. The First International Congress on Cognitive Linguistics. Print A265. L. A. U. D. (Linguistic Agency at the
University of Duisburg) Catalogue. Pragmatics. http://www.linse.uni-essen.de:16080/linse/laud/shop_laud. Gorayska B & Lindsay R (1993). ‘The roots of relevance.’ Journal of Pragmatics 19(4), 301–323. Gorayska B & Mey J L (eds.) (1996a). Cognitive Technology: in search of a humane interface. Amsterdam: North Holland. Gorayska B & Mey J L (eds.) (1996b). ‘Special Issue on Cognitive Technology.’ AI & Society 10. Gorayska B & Mey J L (eds.) (2002). International Journal of Cognition and Technology 1(1 & 2). Gorayska B & Mey J L (eds.) (2004). Cognition and Technology: co-existence, convergence, co-evolution. Amsterdam: John Benjamins. Lindsay R (2001). ‘Perception and Language.’ In Verschueren J, Östman J-O, Blommaert J & Bulcaen J (eds.) Handbook of Pragmatics. Amsterdam/Philadelphia: John Benjamins. 1–20. Marsh J, Gorayska B & Mey J L (eds.) (1999). Humane interfaces: questions of methods and practice in Cognitive Technology. Amsterdam: North Holland.
Marsh J, Nehaniv C L & Gorayska B (eds.) (1997). Proceedings of the second international Cognitive Technology conference CT’97: humanizing the information age. Palo Alto, CA: IEEE Computer Society Press. Mey J L (1998). ‘Adaptability.’ In Concise encyclopedia of pragmatics. Oxford: Elsevier Science. 5–7. Mey J L (2000). ‘The computer as prosthesis: reflections on the use of a metaphor.’ Hermes: Journal of Linguistics 24, 15–30. Norman D A (1993). Things that make us smart. Reading, MA: Addison-Wesley. Norman D A (1999). The invisible computer. Cambridge, MA: MIT Press. Norman D A & Draper S W (eds.) (1986). User-centered system design. Hillsdale, NJ: Erlbaum. Mithen S J (1996). The prehistory of the mind: a search for the origins of art, religion and science. London: Orion Books Ltd. Wickens C (1992). Engineering psychology and human performance (2nd edn.). New York: Harper Collins.
Coherence: Psycholinguistic Approach
A Sanford, University of Glasgow, Glasgow, Scotland, UK
© 2006 Elsevier Ltd. All rights reserved.
Coherence in Text and in the Mind
A text is coherent to the extent that it is intelligible, that there are no aspects of the text that do not relate to the message, and that there is no sense that things are missing from the text. We may judge a text as incoherent if these conditions are not met. There are two important sources of information that contribute to coherence: text cues and psychological constraints. Text cues are simply those cues that are in the text itself, while psychological constraints are processes of thought or inference that add to what is given by the text. Of course, if as a result of the way it is written we have to make too many poorly guided inferences to understand a message, then we may say that the text itself appears incoherent. From a psychological perspective, a coherent text may be thought of as one that brings to mind just the right things to facilitate easy understanding, while an incoherent text is one that fails to do that, leaving the reader or listener with no sense of understanding the message. Texts that present the reader or listener with a difficult task may be judged more or less coherent. This raises an interesting question: how to define a text as distinct from a random concatenation of sentences. There has been a tradition in text linguistics that claims that coherence is an intrinsic defining property of a text. Pieces of writing that do not conform to the principles underlying coherence are taken either to be defective (suboptimal) or not texts at all. First, a text that is coherent must have clauses that are clearly connected to one another. Second, the clauses must logically relate to one another, and third, each sentence must somehow be relevant to the overall topic of the discourse. Some of these requirements can be met from what is actually written in the text itself. For instance, texts can contain explicit cohesion markers that provide links between the clauses of a text. But the other requirements, such as clauses logically relating to one another and being relevant to the overall topic of the discourse, are plainly psychological; they require the reader/listener to perceive relevance. We shall amply illustrate this point in this article.
The psychological view is that coherence is something that depends on the mental activity of the reader or listener, on their capacity to understand the message that the producer of the text is trying to convey. The text can be thought of as providing cues as to what the message is, but the reader has to use these cues. So, from a psychological perspective, we may ask what mental processes lead to the development of a coherent mental representation of the text (knowledge of the message), and what cues in texts help to guide these processes appropriately (see Gernsbacher and Givon, 1995, for a broad perspective).
One thing that can be seen in texts is a so-called cohesion marker (see Halliday and Hasan, 1976). This marker may be a connective, like and, but, when, while, because, so, therefore, hence, and so on. Another form of connection is anaphora – using a term that relates a concept back to one that was previously introduced, as in (1):
(1) John came home because he was missing his mother.
Here, because is a connective that links the two clauses, and he is an anaphoric pronoun that refers back to John. Both of these devices provide some glue to connect the two clauses, and help bind them into a coherent whole. The devices are visible in the text itself. There are many other cues that signal relationships between the parts of text, expressions like first (which cues that there will be a successor), next, later, finally, after that (signaling temporal progressions), similarly, and in the same way (signaling various ways in which clauses or phrases may be related to one another). Such cues are only cues, of course, and they are neither sufficient nor necessary for a text to appear coherent. So, a text with ample cohesion markers may be quite incoherent:
(2) John ate a banana. The banana that was on the plate was brown, and brown is a good color for hair. The hair of the dog is a drink to counteract a hangover.
Such texts are not truly coherent, in the sense that they do not produce an obvious message. So the presence of cohesion markers is not enough to guarantee coherence. The clauses in a text need to be sensibly related and to form a sensible whole. Of course, what is sensible depends upon comparing what is being said in the text with what the reader knows about the world. It is clearly a matter of psychology, not just of what is in the text. Cohesion markers are not necessary for finding a text to be coherent, either. For instance, consider the following:
(3) Mr. Smith was killed the other night. The steering on the car was faulty.
Although there is no stated connection between the two sentences, readers infer that the second sentence provides the reason for the state of affairs depicted in the first sentence, and that makes the text coherent. There is no cue to this in the text itself. In (4) there is such a cue:
(4) Mr. Smith was killed the other night, because the steering on the car was faulty.
So, although explicit connectives may indicate the relationship between different clauses and propositions, a text may be quite coherent even in the absence of such markers, as shown in (3). What psychological studies have shown is that for some connectives, their presence does indeed aid comprehensibility. For instance, if people read short stories where the last sentence either did or did not begin with a connective (for instance, However, the pilot made a safe landing), they spend less time reading the final sentence when an explicit connective is used than when it is not. The mental representation of sentences that have clauses linked by causal connectives seems to be more stable as well, since such sentences are better remembered than those whose clauses are not directly linked. So although it may be possible to infer the link between two clauses, an explicit cue does help, and of course, may sometimes be necessary.
The Psychological Concept of a Connected, Coherent, Discourse Representation
An almost universal view, within psychology, of how text is processed is that the text expresses ideas that become connected to form a coherent whole. A parallel idea in text linguistics is that each part of a text is linked to at least one other part by some sort of relation (called rhetorical predicates, see Rhetorical Structure Theory). The idea is similar: that coherence results from being able to appropriately relate each bit of the discourse to some other, so that a connected whole results; however, the psychological approach is concerned with studying how the connections are established, and what the mental entities are that come to be related to each other. The end point is the mental representation of the discourse and is part of memory. Because memory is fallible, the final representation will be incomplete too. As discussed above, connectives (explicit or inferred) are partly responsible for the local coherence of pieces of text (adjacent pairs of sentences, say). But a text will give rise to a coherent mental representation at several levels. It is possible to illustrate some aspects of connectivity with the following example:
(5) (a) Harry was trying to clean up the house.
(b) He asked Mary to put the book in the box on the table.
(c) She said she was too busy to do that.
(d) She had to write out a check for her friend Jill because she had no money left.
(e) Harry nearly exploded.
(f) He thought that they spent too much money on charities as it was.
(g) Mary suddenly burst into tears.
The same individuals appear over and again in this simple story. Harry in (a) is He in (b), Harry in (e), and He in (f). Identifying Harry in this way is important, since that way we can connect the actions and reactions of that individual. Sometimes Harry is used in preference to He, as in (e). Using a name like this occurs especially when the individual concerned has not been at the center of the unfolding text (in focus) for a while; psycholinguistic work has shown that the use of a name is especially useful when a character is being ‘reintroduced,’ as Harry is in (e). Use of He would still be intelligible, but slower processing of the sentence would result, showing a difficulty with using a pronoun. Processing difficulties are minimized with the character Harry because there is only one male character in the story. But the situation with Mary is different because there are two female characters, with Jill introduced in (d). In fact, in sentence (d), the reader has to work out who she is from general knowledge, not from the text. In this case, because it would be a person with money who could give money to someone without money, the second she is treated as coreferential with Jill, not Mary. Psychological work has shown that such inferential processes, although they seem automatic, are time-consuming. A further anaphoric connector in the passage is worthy of note: in (c), there is that. Here the expression refers to the event Mary puts the book in the box on the table. Terms like this and that can refer to huge conglomerations of events, as in a very complex story leading to the statement This was too much for Harry. So, to summarize, anaphoric devices are vital to producing a coherent mental representation of who did what. A major review of psychological work on anaphora is Garnham (2000).
Causal Connectivity
With narrative texts especially, the reader has to establish causal links between the various parts of the text, and the whole structure gives rise to global coherence. So, in (a), Harry is given a goal. In (b), a further action is introduced. How is this interpreted? Most people interpret the action as being part of realizing this goal. However, there is nothing in the text to indicate this. In our example, there is hardly anything to tell the reader what the causal structure is. In the passage below, we have included some connectives that fill out the causal structure:
(6) Harry was trying to clear up the house.
TO HELP WITH HIS GOAL He asked Mary to put the book in the box on the bookshelf.
HOWEVER (BLOCKING HIS GOAL) She said that she was too busy to do that.
THE REASON WAS She had to write out a check for her friend Jill because she had no money left.
AS A RESULT OF THIS REASON Harry nearly exploded.
THE REASON WAS He thought they spent too much on charities as it was.
AS A RESULT Mary suddenly burst into tears.
In order to achieve a very minimal understanding of this text, the information provided in capitals, or something like it, must be inferred by the reader and incorporated into their mental representation of the discourse. A number of studies have shown that judgments of how clearly the clauses of texts are causally related predict a number of performance measures during reading, including the time taken to read sentences, judgments of how well sentences fit into texts, the judged coherence of texts as a whole, and the ease of recalling texts (see Langston and Trabasso, 1999). When people understand texts, they appear to do so by forming mental representations consisting of causal chains, and the robustness of the causal chains reflects coherence.
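The causal-chain idea can be illustrated with a toy sketch. The Python below is only an illustration under stated assumptions: the clause labels (a) to (g) follow example (5), the list of links is one hypothetical reading of the inferred connectives in (6), and treating the connectedness of the resulting graph as a stand-in for global coherence is a simplification, not a description of the models used in the studies cited.

```python
from collections import deque

# Clauses of the Harry and Mary story, labelled as in example (5).
CLAUSES = {
    "a": "Harry was trying to clear up the house.",
    "b": "He asked Mary to put the book in the box.",
    "c": "She said she was too busy to do that.",
    "d": "She had to write out a check for her friend Jill.",
    "e": "Harry nearly exploded.",
    "f": "He thought they spent too much on charities.",
    "g": "Mary suddenly burst into tears.",
}

# Hypothetical inferred links (reason or cause -> consequence).
# These are illustrative assumptions, loosely based on the capitals in (6).
CAUSAL_LINKS = [
    ("a", "b"),  # the goal motivates the request
    ("b", "c"),  # the request prompts the refusal
    ("d", "c"),  # writing the check is the reason for the refusal
    ("d", "e"),  # the check leads to Harry's anger
    ("f", "e"),  # his view on charity spending is the reason for the anger
    ("e", "g"),  # the anger results in Mary's tears
]


def is_causally_connected(clauses, links):
    """Return True if every clause can be reached from the first one
    by following causal links in either direction."""
    neighbours = {label: set() for label in clauses}
    for cause, effect in links:
        neighbours[cause].add(effect)
        neighbours[effect].add(cause)

    start = next(iter(clauses))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == set(clauses)
```

With all the inferred links in place, `is_causally_connected(CLAUSES, CAUSAL_LINKS)` holds; dropping the link from (e) to (g) leaves Mary's tears unattached to the chain, which is the kind of gap a reader would have to repair by inference.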
Studies of Inferential Activity
Necessity and Elaboration
Everywhere in discourse, readers are called upon to make inferences; without them, there would be no coherence. A key distinction is made between inferences that are necessary for coherence and inferences that are not necessary, but rather just fill out the picture being portrayed by the writer. Inferring causal relations, and anaphoric relations, are generally considered to be necessary inferences. In general, when a necessary inference is made, it can be shown to take time. One classic case (Haviland and Clark, 1974) is:
(7) Harry took the picnic things from the trunk.
(8) The beer was warm.
On reading (8), to understand how The beer fits into things, the reader must infer that beer was part of the picnic supplies. Sentence (8) took longer to read after (7) than it did after (9):
(9) Harry took the beer from the trunk.
Thus measurable time is needed to make the necessary bridging inference. Bridging inferences are assumed to be made only when necessary, when a gap in the connectivity of clauses is detected. There is no inference that beer might be part of the picnic things when just sentence (7) is read; rather, the inference is triggered when (8) is encountered. So such inferences are also called backwards inferences.
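The timing of a backwards inference can be made concrete with a small bookkeeping sketch. It is a toy under stated assumptions, not a processing model: the function name `process_sentence`, the split of each sentence into newly introduced entities and definite references, and the use of a plain set as the discourse representation are all hypothetical choices made for illustration.

```python
def process_sentence(discourse_entities, new_mentions, definite_references):
    """Update the discourse representation for one sentence.

    discourse_entities: set of entities already represented (updated in place).
    new_mentions: entities the sentence introduces explicitly.
    definite_references: entities the sentence treats as already given.

    Returns the definite references that had no antecedent when the sentence
    was read, i.e. those that would need a backward bridging inference.
    """
    needs_bridge = [
        ref for ref in definite_references if ref not in discourse_entities
    ]
    # After the (possibly inferred) link is made, everything is represented.
    discourse_entities.update(new_mentions)
    discourse_entities.update(definite_references)
    return needs_bridge


# Sentence (7) followed by (8): "The beer" has no antecedent.
after_7 = set()
process_sentence(after_7, {"Harry", "picnic things", "trunk"}, [])
bridges_7_8 = process_sentence(after_7, set(), ["beer"])

# Sentence (9) followed by (8): "the beer" was mentioned explicitly.
after_9 = set()
process_sentence(after_9, {"Harry", "beer", "trunk"}, [])
bridges_9_8 = process_sentence(after_9, set(), ["beer"])
```

In this sketch nothing is inferred while (7) is read; the gap is only detected when (8) arrives, so `bridges_7_8` contains the beer while `bridges_9_8` is empty, mirroring the reading-time difference described above.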
Necessary backwards-bridging inferences are contrasted with forward elaborative inferences. For instance, given (10), what might one infer?
(10) Unable to control his rage, the angry husband threw the valuable antique vase at the wall.
There are many possibilities, but a highly plausible one is that the vase broke. If on reading (10) this inference were made, then it would be an elaboration over what was said, and as it is not made because it is needed, it is called a forward inference. If this were followed by (11), the forward inference would facilitate comprehension:
(11) It cost well over $100 to replace.
But such an inference would not help us understand (12):
(12) He had been keeping his emotions bottled for weeks.
There has been much debate over whether such elaborative inferences are typical, and if and when they are made. Clearly there are many such inferences that might be made. For instance, given (10), one might infer that the wife of the angry husband might be in some danger, that the husband might become more violent, that he felt ashamed afterwards, etc. Do we make all, or any, of such plausible inferences? Because such inferences are indeed so plausible, it might be supposed that they are routinely made. In order to test whether an inference has been made, a variety of priming tasks have been used. With these, a test word is presented after a short passage. For instance, after (10), the test word BROKE might be presented. This word would also be presented after a sentence in which information pertinent to breaking is absent. Subjects are asked to read out loud the test word when it appears. The critical question is whether the word is read more rapidly after a priming sentence such as (10), when compared with a nonpriming sentence. If the word is read more rapidly, then it has been primed, and this suggests that the inference that the vase had broken had been made. Several different tests have been devised based on this idea. Under some circumstances, there has been weak evidence for such forward inferences happening immediately, though the general view is that they are made only under very constrained conditions and are not typical. The paucity of evidence for elaborative inferences was summed up in McKoon and Ratcliff (1992), who put forward the idea that during reading, immediate forwards elaborative inferences are typically minimal, and inferences are largely restricted to the necessary, backward, variety. However, in the long term, elaborative inferences have to be made, since comprehension is only possible when a mental model of what the text is about is constructed. We shall go on to look at some aspects of such a model.
Situation-Specific Information: Scenario-Theory
For a discourse to be understood, it has to be situated with respect to background knowledge. For instance:
(13) John drove to London yesterday.
(14) The car broke down halfway.
Superficially, this is similar to examples (7) and (8), in that a backwards-bridging inference could be made to link The car in (14) to drove in (13). Such a backwards inference would be time-consuming for reading (14). However, several studies have shown that the time to read (14) is no greater after (13) than it is after (15), where the car is mentioned explicitly:
(15) John took his car to London yesterday.
The key difference between The car in (14) and The beer in (8) is that car is typically definitional of driving, whereas beer is just a plausible option for picnic things. So, for entities that are part of the meaning of actions, those entities appear to be included in the representation of the sentence depicting the action. The concept car is part of the representation of the action drove to a place. Sanford and Garrod (1981, 1998) put forward the idea that when we read a text, we identify as quickly as possible the background situation in which the propositions in the text are grounded; they further assumed that much of what we know is represented in a situation-specific way, in structures that they termed scenarios. Driving is one example, where the concept car, and expected actions, are represented. Another well-known illustration is of having a meal in a restaurant, where the events, the order of events (find table, get waiter, order meal, eat courses in expected order, get bill, pay, leave, etc.), and the principal actors (customer, waiter, wine-waiter) are represented. In general, if a new entity is introduced into a text, either it will already be part of the prior representation (scenario), or a backwards inference will have to be made. Using situation-based knowledge is essential for developing a coherent representation, and a simple example is:
(16) Fred put the wallpaper on the table. Then he rested his mug of coffee on the paper.
This pair of sentences is immediately coherent; nothing seems out of place. However, (17) depicts an unrealistic state of affairs, and this is immediately recognized:
(17) Fred put the wallpaper on the wall. Then he rested his mug of coffee on the paper.
Sentences (16) and (17) depict different situations: putting wallpaper on a table leaves the paper as a horizontal surface, while putting it on the wall leaves it in a vertical plane, so that the cup would just fall down. The implication is that people try to build a representation of what is being depicted with each pair of sentences, and in order to do that, they have to use situation-specific knowledge.
Keeping Track of Things: Situation Models
The kind of situation-specific knowledge discussed above is stereotyped, and connecting language input to representations of specific situations is essential for adequate understanding, and hence coherence. But this is plainly not enough, in that texts do not simply refer to static situations; rather, as they unfold they depict a dynamic passage of events. Even the simple example (6) serves to illustrate that, which is why the development of a causal chain is so important for a coherent representation. A bold attempt to grasp the nettle of this more dynamic aspect of comprehension is found in the concept of the situation model (see Zwaan and Radvansky, 1998, for a detailed overview). They propose that as texts unfold, readers may keep track of a number of things. Consider first space. In some stories, people move around in space, and there is evidence that readers keep track of these movements. So, it turns out that readers have faster access to mental representations of rooms where protagonists are, or toward which protagonists are heading, than to representations of other rooms. This suggests two things: readers take the perspective of protagonists (see Duchan et al., 1995, for an introduction to this issue), and they update the focus of interest to where the protagonist is, or is said to be heading. Of course, not all stories are about events in space, but when they are, readers generally update where the protagonist is in their mental representation of the text.
Several researchers have suggested that there are at least five dimensions to situations that could be encoded by the reader: space (as above), time, causation (briefly discussed above), intentionality, and protagonist. It has been plausibly argued that the comprehension of narratives revolves around keeping track of the goals and plans of protagonists (intentionality). It has been shown by probing readers that they appear to make inferences based on what motivates the actions of protagonists, even when this information is not directly available. The
protagonist category refers to the people and things in the mental representation. They are the entities that are being updated with respect to place, time, and intentionality. This category leads to a further aspect that has to be understood about characters if coherence is to be achieved: the emotions of the protagonists. Experimental evidence suggests that the emotional states of protagonists might be inferred as forward inferences (see the classic work of Gernsbacher et al., 1992).
Multiple Viewpoints
A further aspect of coherence is that with many texts, alternative versions of reality may have to be entertained if the text is to be understood. Consider (18):
(18) John put his wallet on the dresser. Unbeknownst to John, Mary later moved it to his bedside table. When he came in from gardening, John went to get his wallet. First he tried his bedside table.
John’s action doesn’t make sense in our situation model, because we represent what John believes is the location of his wallet, and we also represent where the wallet actually is. The capacity to capture the beliefs of others and how these beliefs relate to what we know to be reality is called having a ‘Theory of Mind.’ Without the capacity to make these representations, texts like (18) could not be understood as anomalous, and would display apparent coherence that was unwarranted. Dealing with multiple viewpoints like this has received some attention, and given its prevalence in narratives and real-life situations, deserves more. However, one major discovery is that some people do not have the capacity to handle these multiple perspectives (it is particularly a feature of autism; see Baron-Cohen, 1995; for a very detailed analysis of multiple viewpoints and coherence, see Fauconnier, 1994).
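The need to represent both reality and a character's belief about it can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the class name `SituationModel`, its methods, and the idea that only witnesses of a move update their beliefs are hypothetical simplifications introduced here, not part of any model cited in this article.

```python
class SituationModel:
    """Toy situation model that tracks where objects really are and
    where each character believes them to be."""

    def __init__(self):
        self.actual = {}   # object -> real location
        self.beliefs = {}  # (character, object) -> believed location

    def move(self, obj, location, witnesses):
        """Record that obj is now at location; only characters who
        witnessed the move update their belief."""
        self.actual[obj] = location
        for character in witnesses:
            self.beliefs[(character, obj)] = location

    def search_is_expected(self, character, obj, location):
        """Is it sensible for character to look for obj at location,
        given what that character believes?"""
        return self.beliefs.get((character, obj)) == location


# Example (18): John puts the wallet down; Mary moves it without his knowledge.
model = SituationModel()
model.move("wallet", "dresser", witnesses=["John"])
model.move("wallet", "bedside table", witnesses=["Mary"])

john_tries_bedside = model.search_is_expected("John", "wallet", "bedside table")
john_tries_dresser = model.search_is_expected("John", "wallet", "dresser")
```

Here `john_tries_bedside` is false even though the wallet really is on the bedside table, which is why John's first move in (18) reads as anomalous; a model that stored only `actual` would wrongly treat the text as coherent.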
Coherence and Selective Processing
We have portrayed coherence as the establishment of connections between things mentioned in a text, actions mentioned in a text, and world knowledge supporting interpretation. There is an extra ingredient that has to be included, and that is selectivity in processing. Not all parts of texts are equally important, and a key aspect of coherence is being able to sort out what is important from what is mere detail (or even irrelevant). Coherence derives not merely from being able to join up what is in a text and relate it to knowledge, but from being able to selectively attend to what is salient – the gist of a message.
Ideas as to what constitutes the gist of a text have been around since the seminal work of Walter Kintsch and his colleagues (see Kintsch, 1977, for a sample of this important work). Essentially, the idea was that what was in texts could be expressed as propositions (or idea units). Furthermore, some propositions were dependents of others, and the less dependent a proposition, the closer it corresponded to gist. Consider the following:
(19) Harry was tired. He planned to go on a package holiday to Greece.
In the second sentence, we learn that Harry planned to go on holiday (the key proposition), that the type of holiday was a package holiday (a dependent proposition: there can be no package holiday without there being a holiday), and that the holiday was in Greece (another dependent proposition). Many experiments on summarizing texts, and on remembering texts, showed that summaries and memories tended to lose dependent propositions. Thus a ‘gist’ representation of (19) might include Harry planned a holiday, but might exclude the package-deal component, or the fact that he planned to take it in Greece (unless these details become more salient later). There are cues in the structure of texts themselves as to what might be the most important information. For instance, discourse focus is a device for indicating where salient information lies. In (20), the salient information is with the hat, because that specifies which man was arrested. In (21), however, with the hat is merely an incidental attribute of the man.
(20) Q: Which man was it who was in trouble? A: The man with the hat was arrested.
(21) Q: What was it that happened last night? A: The man with the hat was arrested.
Recent experiments have shown that if subjects read a text similar to (20) once, and then immediately afterwards read the same thing again, but with hat changed to cap, they tend to notice the change. But if they read (21), they tend not to notice the change. Focus influences what is important, and how deeply the text is processed, showing how it controls patterns of understanding (see Sanford and Sturt, 2002, for a review of shallow processing). Other aspects of text structure influence what is taken into account in producing a coherent representation. Emphasis may be put on quite different aspects of a state of affairs by simple devices in language, such as negation and quantification. For instance, here are two ways of depicting the fat content of a product:
(22) This product is 90% fat free. (23) This product contains 10% fat.
Experiments have shown that people judge products to be less greasy and more healthy if the first of these descriptions is used (even if they taste the product). This is because the first description focuses on the amount of nonfat, whereas the second focuses on the amount of fat. Such focusing can happen implicitly, if terms are negative. Experimental work has shown that both of these statements are coherent, but in the fat-free case, people do not interpret the amount of fat-freeness against how much fat would be good or bad in a product. The fat-free formulation inhibits the use of world knowledge, while the %-fat formulation does not. Thus 75% fat free and 95% fat free are both considered to be more or less equally a good thing, while 25% fat is considered to be much less healthy than 5% fat (see Sanford and Moxey, 2003, for a review of these arguments for a variety of situations).
Linking together different elements in a text and linking text to knowledge are important for a coherent interpretation of a text to be made; so too is being selective by choosing perspectives to build coherent structures around, and being selective to avoid overelaborating inferences that are not relevant. To achieve these goals, the writer or speaker has to choose the right linguistic devices and forms of words to guide the reader/listener into making the right inferences, using the sort of knowledge the producer intended. The capacity of a producer to do this effectively is what makes the discourse they are producing appear coherent.
See also: Cohesion and Coherence: Linguistic Approaches; Discourse Processing; Rhetorical Structure Theory.
Bibliography
Baron-Cohen S (1995). Mindblindness: an essay on autism and Theory of Mind. Cambridge, MA: MIT Press.
Duchan J F, Bruder G A & Hewitt L E (1995). Deixis in narrative: a cognitive science perspective. Hillsdale, NJ: Lawrence Erlbaum Associates.
Fauconnier G (1994). Mental spaces. New York: Cambridge University Press.
Garnham A (2000). Mental models and the interpretation of anaphora. Hove: Psychology Press.
Gernsbacher M A & Givón T (eds.) (1995). Coherence in spontaneous text. Philadelphia: John Benjamins.
Gernsbacher M A, Goldsmith H H & Robertson R R W (1992). ‘Do readers mentally represent fictional characters’ emotional states?’ Cognition and Emotion 6, 89–111.
Halliday M A K & Hasan R (1976). Cohesion in English. London: Longman.
Haviland S E & Clark H H (1974). ‘What’s new? Acquiring new information as a process in comprehension.’ Journal of Verbal Learning and Verbal Behavior 13, 512–521.
Kintsch W (1977). Memory and cognition. New York: Wiley.
Langston M & Trabasso T (1999). ‘Modelling causal integration and availability of information during comprehension of narrative texts.’ In van Oostendorp H & Goldman S R (eds.) The construction of mental representations during reading. Mahwah, NJ: Lawrence Erlbaum Associates.
McKoon G & Ratcliff R (1992). ‘Inferences during reading.’ Psychological Review 99, 440–466.
Sanford A J & Garrod S C (1981). Understanding written language: explorations beyond the sentence. Chichester: John Wiley and Sons.
Sanford A J & Garrod S C (1998). ‘The role of scenario mapping in text comprehension.’ Discourse Processes 26, 159–190.
Sanford A J & Moxey L M (2003). ‘New perspectives on the expression of quantity.’ Current Directions in Psychological Science 12, 240–242.
Sanford A J & Sturt P (2002). ‘Depth of processing in language comprehension: not noticing the evidence.’ Trends in Cognitive Sciences 6, 382–386.
Zwaan R A & Radvansky G A (1998). ‘Situation models in language comprehension and memory.’ Psychological Bulletin 123, 162–185.
Cohesion and Coherence: Linguistic Approaches
T Sanders and H Pander Maat, Utrecht Institute of Linguistics OTS, Utrecht University, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.
Discourse is more than a random set of utterances: it shows connectedness. A central objective of linguists working on the discourse level is to characterize this connectedness. Linguists have traditionally approached this problem by looking at overt linguistic elements and structures. In their famous Cohesion in English, Halliday and Hasan (1976) describe text connectedness in terms of reference, substitution, ellipsis, conjunction, and lexical cohesion. According to Halliday and Hasan (1976: 13), these explicit clues make a text a text. Cohesion occurs “when the interpretation of some element in the discourse is dependent on that of another” (Halliday and Hasan, 1976: 4). The following types of cohesion are distinguished.
• Reference: two linguistic elements are related in what they refer to.
Jan lives near the park. He often goes there.
• Substitution: a linguistic element is not repeated but is replaced by a substitution item.
Daan loves strawberry ice-creams. He has one every day.
• Ellipsis: one of the identical linguistic elements is omitted.
All the children had an ice-cream today. Eva chose strawberry. Arthur had orange and Willem too.
• Conjunction: a semantic relation is explicitly marked.
Eva walked into town, because she wanted an ice-cream.
• Lexical cohesion: two elements share a lexical field (collocation).
Why does this little boy wriggle all the time? Girls don’t wriggle (Halliday and Hasan, 1976: 285).
It was hot. Daan was lining up for an ice-cream.
While lexical cohesion is obviously achieved by the selection of vocabulary, the other types of cohesion are considered as grammatical cohesion. The notion of lexical cohesion might need some further explanation. Collocation is the most problematic part of lexical cohesion (Halliday and Hasan, 1976: 284). The analysis of the first example of lexical cohesion above would be that girls and boys have a relationship of complementarity and are therefore related by lexical cohesion. The basis of lexical cohesion is in fact extended to any pair of lexical items that stand next to each other in some recognizable lexicosemantic relation. Let us now consider the second example of lexical cohesion mentioned above. Do hot weather and ice-cream belong to the same lexical field? Do they share a lexicosemantic relationship? If we want to account for the connectedness in this example, we would have to assume that such a shared lexicosemantic relationship holds, since the other forms of cohesion do not hold. The clearest cases of lexical cohesion are those in which a lexical item is replaced by another item that is systematically related to the first one. The class of general noun, for instance, is a small set of nouns having generalized reference within the major noun classes, such as ‘human noun’:
people, person, man, woman, child, boy, girl. Cohesion achieved by anaphoric reference items like the man or the girl is very similar to cohesion achieved by reference with pronouns like he or she, although Halliday and Hasan (1976: 276) state explicitly what the difference is: “the form with general noun, the man, opens up another possibility, that of introducing an interpersonal element into the meaning, which is absent in the case of the personal pronoun.” This interesting observation points forward to similar observations formulated in theories developed much later, as in Accessibility Theory (Ariel, 1990) and Mental Space Theory (Fauconnier, 1994; Fauconnier and Sweetser, 1996; Sanders and Redeker, 1996). This is only one example in which Cohesion in English shows itself to be a seminal work, in some respects ahead of its time.
After the publication of Cohesion in English, the notion of cohesion was widely accepted as a tool for the analysis of text beyond the sentence level. It was used to characterize text structure, but also to study language development and written composition (Lintermann-Rygh, 1985). Martin’s English text (1992) is a more recent elaboration of the cohesion work. It also starts from a systemic functional approach to language and claims to provide a comprehensive set of discourse analyses for any English text.
Useful and seminal as the cohesion approach may be, there seem to be some principled problems with it. For instance, the notion of lexical cohesion is hard to define. The intuition that ‘hot weather’ and ‘ice-cream’ belong to the same lexical field may be shared by many people in modern Western culture, but now consider example (1).
(1) The winter of 1963 was very cold. Many barn owls died.
Here it is much harder to imagine that ‘cold winters’ and ‘barn owls,’ or even ‘dying barn owls,’ should be related by a lexical field. Still, relating these items is necessary to account for the connectedness in (1). This problem is hardly solved by Halliday and Hasan’s (1976: 290) advice “to use common sense, combined with the knowledge that we have, as speakers of a language, of the nature and structure of its vocabulary.” Examples like (1) constitute a major problem for a cohesion approach: this short text presents no interpretation difficulties whatsoever, but there is no overt linguistic signal either. This suggests that cohesion is not a necessary condition for connectedness. Such a conclusion is corroborated by cases like (2), from a Dutch electronic newspaper (Sanders and Spooren, in press), to which we added the segment-indices (a) and (b).
(2a) Greenpeace heeft in het Zuid-Duitse Beieren een nucleair transport verstoord. (2b) Demonstranten ketenden zich vast aan de rails. (Telegraaf-i, April 10, 2001) (2a) ‘Greenpeace has impeded a nuclear transportation in the Southern German state Bayern.’ (2b) ‘Demonstrators chained themselves to the rails.’
This short electronic news item does not create any interpretative difficulties. However, in order to understand the fragment correctly, a massive amount of inferencing has to take place. For instance, we need to infer that the nuclear transportation was not disturbed by the organization Greenpeace, but by members of that organization; that the protesters are members of the organization; that the nuclear transportation took place by train, etc. Some of these inferences are based on world knowledge, for instance that organizations consist of people and that people, but not organizations, can carry out actions like the one described here. Others are based on discourse structural characteristics. One example is the phrase the rails. This definite noun phrase suggests that its referent is given in some way. But because there is no explicit candidate antecedent, the reader is invited to link it up with transportation, the most plausible interpretation being that the transportation takes place by a vehicle on rails, i.e., a train.
It is clear by now that the cohesion approach to connectedness is inadequate. Instead, the dominant view has come to be that the connectedness of discourse is a characteristic of the mental representation of the text rather than of the text itself. The connectedness thus conceived is often called coherence (see Coherence: Psycholinguistic Approach). Language users establish coherence by actively relating the different information units in the text. Generally speaking, there are two respects in which texts can cohere:
1. Referential coherence: smaller linguistic units (often nominal groups) may relate to the same mental referent (see Discourse Anaphora);
2. Relational coherence: text segments (most often conceived of as clauses) are connected by coherence relations like Cause–Consequence between them (see Clause Relations).
Although there is a principled difference between the cohesion and the coherence approaches to discourse, the two are more closely related than one might think. Coherence phenomena may be of a cognitive nature, but their reconstruction is often based on linguistic signals in the text itself. Both coherence phenomena under consideration – referential and relational coherence – have clear linguistic indicators that can be taken as processing instructions. For referential coherence these are devices such as pronouns and demonstratives, and for relational coherence these are connectives and (other) lexical markers of relations, such as cue phrases and signaling phrases.
A major research issue is the relation between the linguistic surface code (what Givón, 1995, calls ‘grammar as a processing instructor’) and aspects of the discourse representation. In the domain of referential coherence, this relation can be illustrated by the finding that different referential devices correspond to different degrees of activation for the referent in question. For instance, a discourse topic may be referred to quite elaborately in the first sentence but once the referent has been identified, pronominal forms suffice. This is not a coincidence. Many linguists have noted this regularity (e.g., Ariel, 1990; Givón, 1992; Chafe, 1994). Ariel (1990, 2001), for instance, has argued that this type of pattern in grammatical coding should be understood to guide processing. In her accessibility theory, ‘high accessibility markers’ use little linguistic material and signal the default choice of continued activation. By contrast, ‘low accessibility markers’ contain more linguistic material and signal the introduction of a new referent (see Accessibility Theory).
We now turn to (signals of) relational coherence. Coherence relations are taken to account for the connectedness in readers’ cognitive text representation (cf. Hobbs, 1979; Sanders et al., 1992). They are also termed rhetorical relations (Mann and Thompson, 1986, 1988, 1992) or clause relations, which constitute discourse patterns at a higher text level (Hoey, 1983; see Problem-Solution Patterns). Coherence relations are meaning relations connecting two text segments. A defining characteristic of these relations is that the interpretation of the related segments needs to provide more information than is provided by the sum of the segments taken in isolation. Examples are relations like Cause–Consequence, List, and Problem-Solution. These relations are conceptual and they can, but need not, be made explicit by linguistic markers, so-called connectives (because, so, however, although) and lexical cue phrases (for that reason, as a result, on the other hand) (see Connectives in Text).
In the last decade, a significant part of research on coherence relations has focused on the question of how the many different sets of relations should be organized (Hovy, 1990; Knott and Dale, 1994). Sanders et al. (1992) have started to define the ‘relations among the relations,’ relying on the intuition that some coherence relations are more alike than others. For instance, the relations in (3), (4), and (5) all express (a certain type of) causality; they express relations of Cause–Consequence/Volitional result (3), Argument–Claim/Conclusion (4) and Speech Act Causality (5): ‘This is boring watching this stupid bird all the time. I propose we go home now!’ The relations expressed in (6) and (7), however, are not causal but additive. Furthermore, a negative relation is expressed in (6). All other examples express positive relations, and (7) expresses an enumeration relation.
(3) The buzzard was looking for prey. The bird was soaring in the air for hours.
(4) The bird has been soaring in the air for hours now. It must be a buzzard.
(5) The buzzard has been soaring in the air for hours now. Let’s finally go home!
(6) The buzzard was soaring in the air for hours. Yesterday we did not see it all day.
(7) The buzzard was soaring in the air for hours. There was a peregrine falcon in the area, too.
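The groupings just described for examples (3) to (7) can be written down as a small feature table. The sketch below is illustrative only: the field names basic_operation and polarity and the helper group_by are labels chosen here, and the table records nothing beyond what the preceding paragraph states about each example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Relation:
    example: int          # example number in the running text
    basic_operation: str  # "causal" or "additive"
    polarity: str         # "positive" or "negative"
    label: str            # relation name as given in the text


RELATIONS = [
    Relation(3, "causal", "positive", "Cause-Consequence / Volitional result"),
    Relation(4, "causal", "positive", "Argument-Claim / Conclusion"),
    Relation(5, "causal", "positive", "Speech Act Causality"),
    Relation(6, "additive", "negative", "negative additive relation (not named in the text)"),
    Relation(7, "additive", "positive", "Enumeration"),
]


def group_by(attribute: str) -> Dict[str, List[int]]:
    """Group example numbers by the value of one feature."""
    groups: Dict[str, List[int]] = {}
    for relation in RELATIONS:
        groups.setdefault(getattr(relation, attribute), []).append(relation.example)
    return groups


print(group_by("basic_operation"))
print(group_by("polarity"))
```

Grouping on one feature at a time shows why some relations feel more alike than others: (3), (4), and (5) share the causal value, while (6) stands apart from all the rest on polarity.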
Sweetser (1990) introduced a distinction dominant in many existing classification proposals, namely that between content relations (also sometimes called ideational, external, or semantic relations), epistemic relations, and speech act relations. In the first type of relation, segments are related because of their propositional content, i.e., the locutionary meaning of the segments. They describe events that cohere in the world. If this distinction is applied to the set of examples above, the causal relation (3) is a content relation, whereas (4) is an epistemic relation, and (5) a speech act relation. This systematic difference between types of relation has been noted by many students of discourse coherence (see Connectives in Text). Still, there is a lively debate about whether this distinction should be conceived of in terms of domains, or rather in terms of subjectivity; often, semantic differences between connectives are used as linguistic evidence for proposals [see contributions to special issues and edited volumes like Spooren and Risselada (1997); Risselada and Spooren (1998); Sanders, Schilperoord and Spooren (2001); and Knott, Sanders and Oberlander (2001); further see Pander Maat (1999)]. Others have argued that coherence is a multilevel phenomenon, so that two segments may be simultaneously related on different levels (Moore and Pollack, 1992; Bateman and Rondhuis, 1997); see Sanders and Spooren (1999) for discussion.
So far, we have discussed connectedness as it occurs in both spoken/dialogical discourse and written/monological text. However, the connectedness of spoken discourse is established by many other means than the ones discussed so far. Aspects of discourse structure that are specific to spoken language include the occurrence of adjacency pairs, i.e., minimal pairs like Question-Answer and Summons-Response (Sacks, Schegloff and Jefferson, 1974), and prosody. These topics are subject to ongoing investigations (see especially Ford, Fox and Thompson, 2001) that we consider important because they relate linguistic subdisciplines like grammar and the study of conversation. In addition, it is clear that linguistic signals of coherence, such as connectives, have additional functions in conversations. For instance, connectives function to express coherence relations between segments, like but in example (8), which expresses a contrastive relation.
(8) The buzzard was soaring in the air for hours. But yesterday we did not see it all day.
In conversations, this use of connectives is also found, but at the same time, connectives frequently function as sequential markers: for instance, they signal the move from a digression back to the main line of the conversation or even signal turn-taking. In this type of use, connectives are often referred to as discourse markers (Schiffrin, 2001) (see Particles in Spoken Discourse).
In sum, we have discussed the principled difference between two answers to the question ‘how to account for connectedness of text and discourse?’ We have seen that, while cohesion seeks the answer in overt textual signals, a coherence approach considers connectedness to be of a cognitive nature. A coherence approach opens the way to a fruitful interaction between text linguistics, discourse psychology, and cognitive science, but at the same time does not neglect the attention to linguistic detail that characterizes the cohesion approach. The coherence paradigm is dominant in most recent work on the structure and the processing of discourse (see, among many others, Hobbs, 1990; Garnham and Oakhill, 1992; Sanders, Spooren and Noordman, 1992; Gernsbacher and Givón, 1995; Noordman and Vonk, 1997; Kintsch, 1998; Kehler, 2002). In our view it is this type of paradigm, located at the intersection of linguistics and discourse-processing research, that will lead to significant progress in the field of discourse studies.
See also: Accessibility Theory; Clause Relations; Coherence: Psycholinguistic Approach; Connectives in Text; Discourse Anaphora; Discourse Processing; Particles in Spoken Discourse; Problem-Solution Patterns.
Bibliography
Ariel M (1990). Accessing noun-phrase antecedents. London: Routledge.
Ariel M (2001). ‘Accessibility theory: an overview.’ In Sanders T, Schilperoord J & Spooren W (eds.) Text representation: linguistic and psycholinguistic aspects. Amsterdam: John Benjamins. 29–87.
Bateman J A & Rondhuis K J (1997). ‘Coherence relations: towards a general specification.’ Discourse Processes 24, 3–49.
Chafe W L (1994). Discourse, consciousness, and time. The flow and displacement of conscious experience in speaking and writing. Chicago: Chicago University Press.
Fauconnier G (1994). Mental spaces: Aspects of meaning construction in natural language. Cambridge: Cambridge University Press.
Fauconnier G & Sweetser E (eds.) (1996). Spaces, worlds and grammar. Chicago: The University of Chicago Press.
Ford C E, Fox B A & Thompson S A (eds.) (2001). The language of turn and sequence. Oxford: Oxford University Press.
Garnham A & Oakhill J (eds.) (1992). Discourse Representation and Text Processing. A Special Issue of Language and Cognitive Processes. Hove, UK: Lawrence Erlbaum Associates.
Gernsbacher M A & Givón T (eds.) (1995). Coherence in spontaneous text. Amsterdam: John Benjamins.
Givón T (1992). ‘The grammar of referential coherence as mental processing constructions.’ Linguistics 30, 5–55.
Givón T (1995). ‘Coherence in text vs. coherence in mind.’ In Gernsbacher M A & Givón T (eds.) Coherence in spontaneous text. Amsterdam: John Benjamins. 59–115.
Halliday M A K & Hasan R (1976). Cohesion in English. London: Longman.
Hobbs J R (1979). ‘Coherence and coreference.’ Cognitive Science 3, 67–90.
Hobbs J R (1990). Literature and cognition. Menlo Park, CA: CSLI.
Hoey M (1983). On the surface of discourse. London: George Allen & Unwin.
Hovy E H (1990). ‘Parsimonious and profligate approaches to the question of discourse structure relations.’ In Proceedings of the 5th International Workshop on Natural Language Generation.
Lintermann-Rygh I (1985). ‘Connector density – an indicator of essay quality?’ Text 5, 347–357.
Kehler A (2002). Coherence, reference and the theory of grammar. Chicago: Chicago University Press.
Kintsch W (1998). Comprehension. A paradigm for cognition. Cambridge: Cambridge University Press.
Knott A & Dale R (1994). ‘Using linguistic phenomena to motivate a set of coherence relations.’ Discourse Processes 18, 35–62.
Knott A, Sanders T & Oberlander J (eds.) (2001). Levels of Representation in Discourse Relations. Special Issue of Cognitive Linguistics. Berlin: Mouton de Gruyter.
Martin J R (1992). English text. System and structure. Philadelphia: John Benjamins.
Mann W C & Thompson S A (1986). ‘Relational propositions in discourse.’ Discourse Processes 9, 57–90.
Mann W C & Thompson S A (1988). ‘Rhetorical Structure Theory: toward a functional theory of text organization.’ Text 8, 243–281.
Mann W C & Thompson S A (eds.) (1992). Discourse description. Diverse analyses of a fund-raising text. Amsterdam: John Benjamins.
Moore J D & Pollack M E (1992). ‘A problem for RST: the need for multi-level discourse analysis.’ Computational Linguistics 18, 537–544.
Noordman L G M & Vonk W (1997). ‘The different functions of a conjunction in constructing a representation of the discourse.’ In Fayol M & Costermans J (eds.) Processing interclausal relationships in production and comprehension of text. Hillsdale, NJ: Erlbaum. 75–93.
Noordman L G M & Vonk W (1998). ‘Memory-based processing in understanding causal information.’ Discourse Processes 26, 191–212.
Pander Maat H L W (1999). ‘The differential linguistic realization of comparative and additive coherence relations.’ Cognitive Linguistics 10(2), 147–184.
Risselada R & Spooren W (eds.) (1998). The function of discourse markers. Special Issue of Journal of Pragmatics. Amsterdam: Elsevier.
Sacks H, Schegloff E A & Jefferson G (1974). ‘A simplest systematics for the organization of turn-taking for conversation.’ Language 50, 696–735.
Sanders J & Redeker G (1996). ‘Perspective and the representation of speech and thought in narrative discourse.’ In Fauconnier G & Sweetser E (eds.) Spaces, worlds and grammar. Chicago: University of Chicago Press. 290–317.
Sanders T, Schilperoord J & Spooren W (eds.) (2001). Text representation: linguistic and psycholinguistic aspects. Amsterdam: John Benjamins.
Sanders T & Spooren W (1999). ‘Communicative intentions and coherence relations.’ In Bublitz W, Lenk U & Ventola E (eds.) Coherence in text and discourse. Amsterdam: John Benjamins. 235–250.
Sanders T & Spooren W (in press). ‘Discourse and text structure.’ In Geeraerts D & Cuyckens H (eds.) Handbook of cognitive linguistics. Oxford: Oxford University Press.
Sanders T, Spooren W & Noordman L (1992). ‘Toward a taxonomy of coherence relations.’ Discourse Processes 15, 1–35.
Schiffrin D (2001). ‘Discourse markers: language, meaning, and context.’ In Schiffrin D, Tannen D & Hamilton D (eds.) The handbook of discourse analysis. Malden, MA: Blackwell. 54–75.
Spooren W & Risselada R (eds.) (1997). Discourse markers. Special Issue of Discourse Processes. Mahwah, NJ: Erlbaum.
Sweetser E E (1990). From etymology to pragmatics. Cambridge: Cambridge University Press.
Collitz, Hermann (1855–1935)
S Kürschner, Albert-Ludwigs-Universität Freiburg im Breisgau, Freiburg im Breisgau, Germany
© 2006 Elsevier Ltd. All rights reserved.
Hermann Collitz, one of the most influential comparative Indo-European linguists from Germany, spent much of his career in the United States. He was among the first linguistic scientists to move to the United States with the specific intention of working at American universities.
Hermann Collitz was born on February 4, 1855 in Bleckede, Germany. He completed his linguistic studies at Halle, Berlin, and Göttingen, where he obtained a doctor’s degree for his work on “the emergence of the Indo-Iranic row of palatals” (1879). His postdoctoral thesis (‘habilitation’) on nominal inflection in Old Indian and Greek was published in 1885 at the University of Halle, where he taught Sanskrit and comparative linguistics. As a professor of German philology – and later as a professor of comparative philology – Collitz went abroad to teach at Bryn Mawr College in Philadelphia, Pennsylvania, in 1886. From 1907 to 1927, he was a professor at Johns Hopkins University, Baltimore, Maryland. In addition, Collitz stayed active in Germany and was among the editors of several books on Indo-European linguistics and dialectology. He worked on a dictionary of his own Low German dialect from Waldeck and concentrated on the dialectal history of Greek in several projects. In general philology, Collitz took part in the collection and editing of Greek dialectal inscriptions and did research on Indo-European mythology.
With his diachronic studies, Collitz was one of the pioneers in comparative linguistics. He specialized in the phonology and morphology of the Indo-European languages. Although he was supposed to be part of the group of Neogrammarians at Leipzig, which included Sievers, Paul, and Braune, these linguists could not persuade Collitz to take part in their activities. In contradiction to their radical theoretical approaches, Collitz stuck to the importance of sound changes in
comparative linguistics. Independently of the German linguistic movements, Collitz became a strong advocate of the American philologies, which were widely ignored in Europe. Hermann Collitz was honored in a festschrift (Studies in honor of Hermann Collitz, 1930) and – after his death – in several obituaries that cited his long list of works on Indo-European linguistics and especially the history of German, English, the other Germanic languages, and also Greek and Latin (cf. Sehrt, 1936). A list of four major monographs is supplemented by editorial work for several outstanding American journals such as Modern Language Notes and the Journal of English and Germanic Philology. Collitz founded the book series Hesperia with American work on Germanic philology and he published more than 70 articles and more than 30 book reviews. He received the honorary degree of L.H.D. from the University of Chicago in 1916. In 1925, he was the president of the Linguistic Society of America and the MLA. After retiring in 1927, Collitz
remained an active philologist until his death on May 13, 1935. See also: Germanic Languages; Historical and Comparative Linguistics in the 19th Century; Neogrammarians.
Bibliography
Collitz H (1879). Die Entstehung der indoiranischen Palatalreihe. Diss., Göttingen.
Collitz H (1912). Das schwache Präteritum und seine Vorgeschichte. Hesperia 1. Göttingen: Vandenhoeck & Ruprecht.
Sehrt E H (1936). ‘Hermann Collitz 1855–1935.’ Modern Language Notes 51(2), 69–80.
Studies in honor of Hermann Collitz (1930). Professor of Germanic philology, Emeritus in the Johns Hopkins University, Baltimore, Maryland. Presented by a group of his pupils and friends on the occasion of his seventy-fifth birthday, February 4, 1930. Baltimore: Johns Hopkins Press.
Collocations
R Krishnamurthy, Aston University, Birmingham, UK
© 2006 Elsevier Ltd. All rights reserved.
Historical Use of the Term Collocation
The fact that certain words co-occurred frequently was noticed in Biblical concordances (e.g., Cruden listed the occurrences of dry with ground in 1769). Style and usage guides in the 19th and 20th centuries (e.g., Fowler’s The King’s English) addressed only the overuse of collocations, labeling them clichés and criticizing their use, especially by journalists (e.g., Myles na Gopaleen (see O’Nolan, 1977: 225–6), in a more humorous vein: ‘When and again have I asked you not to do that? Time . . . What is our civilization much? Vaunted. What is the public? Gullible. What interests? Vested.’).
Collocation in Modern Linguistics
In modern linguistics, collocation refers to the fact that certain lexical items tend to co-occur more frequently in natural language use than syntax and semantics alone would dictate. Collocation was first given theoretical prominence by J. R. Firth, who separated it from cognitive and semantic ideas of word meaning, calling it an “abstraction at the syntagmatic level” (Firth 1957a: 196), and accorded it a distinct status in his account of the linguistic levels at which meaning
can arise. Firth implicitly indicated that collocation required a quantitative basis, giving actual numbers of co-occurrences in some texts. Halliday (1976) saw collocation as a cohesive device, identified the need for a measure of significant proximity between collocating items, and said that collocation could only be discussed in terms of probability, thus validating the need for quantitative analyses and the use of statistics. Sinclair (Sinclair et al., 1970) performed the first computational investigation of collocation, comparing written and spoken corpora, identifying ± five words as the span of significant proximity, and experimenting with statistical measures and lemmatization. Halliday (1966) and Sinclair (1966) thought that collocation could enable a lexical analysis of language independent of grammar. Sinclair (1991) suggested that lexical items could be defined by their collocational environments, and saw collocation as part of the idiom principle (lexically determined choices), as opposed to the open choice principle (grammatically determined choices). Leech (1974: 20) included ‘collocative’ in his categories of meaning, but marginalized it as an idiosyncratic property of individual words, incapable of contributing to generalizations. Sinclair (1987c) and Stubbs (1996) suggested that all lexical items have collocations; and Hoey (2004) accommodated collocation within a model of ‘lexical priming,’ suggesting that most sentences are made
up of interlocking collocations, and can therefore be seen as reproductions of earlier sentences.
Collocation and Lexicography
The pedagogical value of collocation was recognized by English teachers in the 1930s. English collocations were described in detail by Harold Palmer in a report on phraseology research with A. S. Hornby, using the term fairly loosely to cover longer phrases, proverbs, and so on, as well as individual word combinations. Palmer and Hornby showed a major interest in the classification of collocations in grammatical and semantic terms but also used collocations to indicate the relevant senses of words in word lists (draw 1. e.g., a picture 2. e.g., a line), and in their dictionary examples (a practice continued in Hornby’s (1948) and subsequent editions of the Oxford advanced learner’s dictionary). Early EFL dictionaries avoided using the term collocation, e.g., Hornby (1974) referred to “special uses of an adjective with a preposition” (liable: ~ for, be ~ to sth), and a “special grammatical way in which the headword is used” (meantime: in the ~). Procter (1978), in the Longman dictionary of contemporary English, referred to “ways in which English words are used together, whether loosely bound or occurring in fixed phrases” and “special phrases in which a word is usually (or always) found”; however, the dictionary also had a section headed ‘Collocations,’ defined as “a group of words which are often used together to form a natural-sounding combination,” and stated that they are shown in three ways: in example sentences, in explanations in Usage Notes, or in heavy black type inside round brackets if they are very frequent or almost a fixed phrase (“but not an idiom”). These are signaled by ‘in the phr.’ or similar rubrics, and Procter (1978) gave the example a mountain fastness. Later EFL dictionaries (Cobuild, Cambridge, Macmillan, etc.) continued to incorporate collocations, including them in definitions and examples and typographically highlighting them in phrases.
Sinclair’s Introduction to the Cobuild dictionary (1987b), in the section on ‘Word and Environment,’ speaks of “the way in which the patterns of words with each other are related to the meanings and uses of the words” and says that “the sense of a word is bound up with a particular usage . . . a close association of words or a grouping of words into a set phrase” and “(a word) only has a particular meaning when it is in a particular environment.” Examples such as hard luck, hard facts, hard evidence, strong evidence, tough luck, and sad facts are discussed.
In Sinclair (1987b), collocates are defined as “words which co-occur significantly with headwords,” and regular or significant collocation as “lexical items occurring within five words . . . of the headword” with a greater frequency than expected, which “was established only on the basis of corpus evidence.” For the first time in lexicography, a statistical notion of collocation had been introduced. Collocation is used to distinguish senses: “Different sets of collocates found with these different senses pinpoint the fact that they are different senses”; “Collocation . . . frequently reinforces meaning distinctions”; and lexical sets used in disambiguation are “signalled by coincidence of collocation” (Sinclair, 1987a). Collocation can also be a marker of metaphoricity: the presence of modifiers and qualifiers indicates metaphorical uses of treadmill and blanket, e.g., . . . the corporate treadmill; . . . the treadmill of office life; a security blanket for new democracies; a blanket of snow (ibid.). Collocation is the “lexical realisation of the situational context” (ibid.). In the central patterns of English, “meaning was only created by choosing two or more words simultaneously” (ibid.). However, the flexibility of collocation (sometimes crossing sentence boundaries) can cause problems in the wording of definitions: often, “no particular group of collocates occurs in a structured relationship with the word” and therefore “there is no suitable pattern ready for use as a vehicle of explanation” (ibid.). The difficulty of eliciting collocates by intuition is discussed; we tend to think of semantic sets: feet suggests “legs, toes, head or shoe, sandals, sock, or walk, run,” whereas significant corpus collocates of feet are “tall, high, long, and numbers” (ibid.).
Prompted by hint, we produce “subtle, small, clue”; the corpus indicates “give, take, no.” The difference between left-hand and right-hand collocates is exemplified by open: the most frequent words before open are “the, to, an, is, an, wide, was, door, more, eyes” and after open are “to, and, the, for, up, space, a, it, in, door” (ibid.). Lexicographers can also use collocations to distinguish between near-synonyms, e.g., the difference between electric (collocates: specific devices such as guitar, chair, light, car, motor, windows, oven, all ‘powered by electricity’), and electrical (collocates: more generic terms such as engineering, equipment, goods, appliances, power, activity, signals, systems, etc., ‘concerning or involving electricity’).
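The distinction between left-hand and right-hand collocates can be illustrated with a small sketch. The function name and the toy sentence below are hypothetical and are not taken from the Cobuild project; the sketch only counts the word immediately before and immediately after each occurrence of a node word, which is one simple way of separating the two sides.

```python
from collections import Counter


def left_right_collocates(tokens, node):
    """Count the words immediately before and after each occurrence of `node`.

    Hypothetical helper for illustration; real concordancers usually
    examine several positions on each side, not just the adjacent word.
    """
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        if i > 0:
            left[tokens[i - 1]] += 1
        if i + 1 < len(tokens):
            right[tokens[i + 1]] += 1
    return left, right


# Toy data, invented for this example.
tokens = "the door is open to all and the open door".split()
left, right = left_right_collocates(tokens, "open")
print(left.most_common())
print(right.most_common())
```

On a real corpus the two counters would be sorted by frequency to give lists comparable to those quoted for open above.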
Finding Collocations in a Corpus
Initially, collocates for dictionary headwords were identified manually by lexicographers wading through pages of printouts of concordance lines. This was
clearly unsatisfactory, and only impressionistic views were feasible. Right-sorted concordances obscured left-context collocates and vice versa. The fixed-length context of printouts prevented the observation of collocates beyond a few words. Subsequent software developments have enabled the automatic measurement of statistically significant co-occurrences. These are within a specifiable and adjustable span or window of context, using different measures of statistical significance, principally mutual information (or MI-score) and t-score. MI-score privileges lower-frequency, high-attraction collocates (e.g., dentist with hygienist, optician, and molar) while t-score favors higher-frequency collocates (e.g., dentist with chair), including significant grammatical words (e.g., dentist with a, and your). The software can also display the collocate’s positional distribution if required, and recursive options are available to investigate the detailed phraseology of collocating items. Software has also become more publicly available, from MicroConcord to Wordsmith Tools and Collocate. Kilgarriff and Tugwell’s WordSketch (Kilgarriff et al., 2004) was used in creating the Macmillan English dictionary (Rundell, 2002) and offers clause-functional information about collocations, e.g., wear + objects: suit, dress, hat, etc. + prepositional phrases (after of: armor, clothing, jeans, etc.; after with: pride, sleeve, collar, etc.; after on: sleeve, wrist, finger, etc.; after over: shirt, head, dress, etc.); similarly, fish is the subject of the verbs swim, catch, fry, etc.; the object of the verbs catch, eat, feed, etc.; and modified by the adjectives tropical, bony, oily, and so on. Lexicographers are in general less concerned about the detailed classification of collocations, although their judgments affect both the placement and specific treatment of the combinations.
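As a rough illustration of how MI-score and t-score can be computed within an adjustable span, the following minimal sketch uses one common textbook formulation: the expected co-occurrence count is taken as f(node) × f(collocate) × window size / corpus size, MI is log2(observed / expected), and t is (observed − expected) / √observed. These formulas and the function names are assumptions made for illustration; the tools named above may define expected frequency differently.

```python
import math
from collections import Counter


def window_counts(tokens, node, span=5):
    """Count words occurring within `span` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - span)
        end = min(len(tokens), i + span + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def association_scores(tokens, node, span=5):
    """Return {collocate: (observed, expected, mi, t)} for one node word.

    Assumed formulation (not taken from the article):
      expected = f(node) * f(collocate) * 2 * span / N
      mi       = log2(observed / expected)
      t        = (observed - expected) / sqrt(observed)
    """
    n = len(tokens)
    freq = Counter(tokens)
    scores = {}
    for word, observed in window_counts(tokens, node, span).items():
        expected = freq[node] * freq[word] * 2 * span / n
        mi = math.log2(observed / expected)
        t = (observed - expected) / math.sqrt(observed)
        scores[word] = (observed, expected, mi, t)
    return scores


# Toy data, invented for this example.
tokens = ["x", "dentist", "chair", "y", "z", "w"]
for word, (o, e, mi, t) in association_scores(tokens, "dentist", span=1).items():
    print(word, o, round(e, 3), round(mi, 3), round(t, 3))
```

Because MI divides observed by expected frequency, rare words that occur almost only near the node score highly, whereas t-score subtracts the expected count and so rewards collocates with many observed co-occurrences. Sorting the same table by each measure in turn reproduces the contrast described above.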
Hornby’s attempts (e.g., Hornby, 1948, 1974) at classification (focusing on verbs) later used transformations and meaning distinctions as well as surface patterns, and Hunston and Francis (2000) listed the linguistic and lexicological terminology that has developed subsequently for collocational units: lexical phrases, composites, gambits, routine formulae, phrasemes, etc., and referred to the work of Moon (e.g., 1998) and Mel’čuk (e.g., 1998) in discussing degrees of fixity and variation, which does impact on lexicography. However, one of Firth’s (1957b) original terms, ‘colligation,’ used to describe the habitual co-occurrence of grammatical elements, has not achieved the same widespread usage as ‘collocation.’ One manifestation of colligation, phrasal verbs, the combination of verb and particle (adverb or preposition) to form semantic units, has been highlighted in EFL dictionaries.
Several EFL publishers have produced separate dictionaries of phrasal verbs. There have been some dictionaries of collocations, but so far each has had its own limitations: not wholly corpus-based (e.g., Benson et al., 1986; Hill and Lewis, 1997), based on a small corpus (e.g., Kjellmer, 1994), or limited coverage (the recent Oxford collocations dictionary for students of English (Lea, 2002)).
Collocation in Computational Linguistics, Pedagogy, and Translation
Interest in collocation has increased substantially in the past decade, as evidenced by workshops at lexicographical, linguistic, pedagogical, and translation conferences. For computational purposes, the relevant features of collocation are that they are “arbitrary, domain independent, recurrent, and cohesive lexical clusters” (Smadja, 1993), and “of limited semantic compositionality” (Manning and Schütze, 1999). But the greatest interest has been generated in the language-teaching profession, with numerous conference and journal papers. Lewis (2000) encapsulates the main concerns: students do not recognize collocations in their input, and hence fail to produce them; collocation represents fluency (which precedes accuracy, represented by grammar); transparent versus ‘arbitrary’ (or idiomatic) combinations, with familiar words in rarer combinations (a heavy smoker is not a fat person); transformation can be misleading (extremely disappointed but rarely extreme disappointment); students may generalize more easily from corpus concordance examples than from canonical versions in dictionaries (exploring versus explaining); collocation as a bridge between the artificial separation of lexis and grammar; collocation extends knowledge of familiar words (easier than acquiring new words in isolation); and longer chunks are more useful and easier to store than isolated words.
Conclusions and the Future
For many fields, it seems that collocation has a great future. The applications of collocation in language teaching have been one of the notable recent successes. Its more detailed exploration in large language corpora requires a significant advance in software. The exact parameters are not fully established, and the statistical measures can be improved. Research to identify word-senses by the clustering of collocates was initiated in the 1960s (Sinclair et al., 1970), but has still not become sufficiently robust for automatic processing. The identification of lexical sets by collocation, signaled in Sinclair (1966; Sinclair et al.,
1970) and Halliday (1966), is yet to be achieved, as is a corpus-generated thesaurus. The theoretical impetus of collocation has yet to reach the level of a language-pervasive system, although Hoey’s notion of Lexical Priming heads in that direction. See also: Computational Lexicons and Dictionaries;
Computational Stylistics; Computers in Lexicography; Concordances; Corpus Approaches to Idiom; Corpus Linguistics; Corpus Lexicography; Data and Evidence; Disambiguation, Lexical; Firth, John Rupert (1890–1960); Halliday, Michael A. K. (b. 1925); Idiom Dictionaries; Idioms; Lexicon Grammars; Palmer, Harold Edward (1877–1949); Phraseology; Polysemy and Homonymy; Selectional Restrictions; Statistics.
Bibliography
Benson M, Benson E & Ilson R (1986). The BBI combinatory dictionary of English. New York: John Benjamins.
Church K W & Hanks P (1989). ‘Word association norms, mutual information, and lexicography.’ In Proceedings of the 27th annual meeting of the Association for Computational Linguistics, reprinted in Computational Linguistics 16(1), 1990.
Church K W, Gale W, Hanks P & Hindle D (1990). ‘Using statistics in lexical analysis.’ In Zernik U (ed.) Lexical acquisition: using on-line resources to build a lexicon. Lawrence Erlbaum Associates.
Clear J (1993). ‘From Firth principles: computational tools for the study of collocation.’ In Baker M, Francis G & Tognini-Bonelli E (eds.) Text and technology. Amsterdam: John Benjamins.
Collocate (2005). Written by Michael Barlow. Houston: Athelstan. For details see http://www.nol.net/~athel/on.html and http://athel.com/product_info.php?products_id=29&osCsid=8c5d654da554afcb0348ee65eb143265.
Cowie A P (1999). English dictionaries for foreign learners – a history. Oxford: Clarendon Press.
Firth J R (1957a). ‘Modes of meaning.’ In Papers in linguistics 1934–51. London: Oxford University Press.
Firth J R (1957b). ‘A synopsis of linguistic theory 1930–55.’ In Studies in linguistic analysis. (Special volume of the Philological Society). Oxford: Blackwell. Reprinted in Palmer F (ed.) (1968) Selected papers of J. R. Firth 1952–59.
Halliday M A K (1966). ‘Lexis as a linguistic level.’ In Bazell C E, Catford J C, Halliday M A K & Robins R H (eds.) In memory of J. R. Firth. London: Longman.
Halliday M A K & Hasan R (1976). Cohesion in English. London: Longman.
Hill J & Lewis M (1997). LTP Dictionary of Selected Collocations. Hove: LTP.
Hoey M (2004). ‘Textual colligation – a special kind of lexical priming.’ Language and Computers 49(1), 171–194.
Hornby A S (ed.) (1948). Oxford advanced learner’s dictionary of current English (1st edn.). Oxford: Oxford University Press.
Hornby A S (ed.) (1974). Oxford advanced learner’s dictionary of current English (3rd edn.). Oxford: Oxford University Press.
Kenny D (1998). ‘Creatures of habit? What translators usually do with words.’ Meta 43(4), 515–523.
Kilgarriff A, Rychly P, Smrz P & Tugwell D (2004). ‘The sketch engine.’ In Williams G & Vessier S (eds.) Proceedings of Euralex 2004. Lorient, France: Université de Bretagne Sud. For more details and access to software, please see http://www.sketchengine.co.uk/.
Kjellmer G (1994). A dictionary of English collocations. Oxford: Clarendon Press.
Lea D (ed.) (2002). Oxford collocations dictionary for students of English. Oxford: Oxford University Press. For details see http://www.oup.com/elt/catalogue/isbn/0-19-431243-7?cc=gb.
Leech G (1974). Semantics. London: Penguin.
Lewis M (2000). Teaching collocation. Hove: Language Teaching Publications.
Louw B (1993). ‘Irony in the text or insincerity in the writer? The diagnostic potential of semantic prosodies.’ In Baker M et al. (eds.) Text and technology. Amsterdam: John Benjamins.
Manning C D & Schütze H (1999). Foundations of statistical natural language processing. Cambridge, MA: MIT Press.
Mel’čuk I (1998). ‘Collocations and lexical functions.’ In Cowie A P (ed.) Phraseology. Theory, analysis, and applications. Oxford: Clarendon Press. 23–53.
MicroConcord (1993). Written by Scott M & Johns T. Oxford: OUP. See http://users.ox.ac.uk/~ctitext2/resguide/resources/m125.html for details and http://www.liv.ac.uk/~ms2928/software/ for free download.
Moon R (1998). Fixed expressions and idioms in English: a corpus-based approach. Oxford: O.U.P.
O’Nolan K (ed.) (1977). The best of Myles – a selection from ‘Cruiskeen Lawn’. London: Pan Books.
Palmer H E (1933). Second interim report on English collocations. Tokyo: Kaitakusha.
Procter P (ed.) (1978). Longman dictionary of contemporary English (1st edn.). Harlow: Longman.
Rundell M (ed.) (2002). Macmillan English dictionary. Basingstoke: Macmillan.
Sinclair J M (1966). ‘Beginning the study of lexis.’ In Bazell C E, Catford J C, Halliday M A K & Robins R H (eds.) In memory of J. R. Firth. London: Longman.
Sinclair J M (ed.) (1987a). Looking up – an account of the COBUILD project in lexical computing. London: Collins ELT.
Sinclair J M (1987b). ‘Introduction.’ In Sinclair J M (ed.) Collins Cobuild English language dictionary, 1st edn. London/Glasgow: Collins.
Sinclair J M (1987c). ‘Collocation: a progress report.’ In Steele R & Threadgold T (eds.) Language topics. Amsterdam/Philadelphia: Benjamins.
Sinclair J M (1991). Corpus, concordance, collocation. Oxford: O.U.P.
Sinclair J M, Jones S & Daley R (1970). English lexical studies. Report to OSTI on Project C/LP/08. Now published (2004) as Krishnamurthy (ed.). English collocation studies: the OSTI Report. London: Continuum.
Stubbs M (1996). Text and Corpus Analysis. Oxford: Blackwell.
Smadja F (1993). ‘Retrieving collocations from text: Xtract.’ Computational Linguistics 19(1), 143–177.
Smadja F, McKeown K & Hatzivassiloglou V (1996). ‘Translating collocations for bilingual lexicons: a statistical approach.’ Computational Linguistics 22(1), 1–38.
Wordsmith Tools (1996). Written by Scott M. Oxford: OUP. For details and downloads, see http://www.lexically.net/wordsmith/ and http://www.oup.co.uk/isbn/0-19459400-9.
Colombia: Language Situation
J Landaburu, Centre d’Etudes des Langues Indigènes d’Amérique, Villejuif, France
© 2006 Elsevier Ltd. All rights reserved.
Colombia’s geography has greatly influenced its present-day linguistic situation. Its position at the end of the isthmus of Panama forced the pre-Columbian peoples migrating southward from North America to pass through it. The extreme diversity of its ecological niches, which include coastal areas along the Pacific and Atlantic Oceans, three Andean mountain ranges with climates varying according to altitude, savannas of the Orinoco Plain, Amazonian forests, torrid deserts, and cold plateaus, allowed many of these populations to settle there. As a result, Colombia has an exceptionally large variety of the South American continent’s indigenous language families. On the other hand, in the 16th century, the islands of the Caribbean Sea and the Atlantic coast of what is today Colombia saw the earliest Spanish settlements and the first shipments of black slaves, historical factors that would greatly influence the sociolinguistic configuration of the country. The languages spoken in Colombia today include 69 Amerindian languages; two creole languages spoken by black populations of African descent in the Caribbean; and the Indo-European language Spanish, represented by a great number of regional variants. This linguistic reality is demographically highly unequal. Of the total Colombian population of over 40 million, there are fewer than 700 000 indigenous language speakers, and the speakers of creole languages number fewer than 35 000. Spanish is therefore the dominant language, and, except in some isolated indigenous zones, most Colombians speak it. In spite of this scarcity of linguistic minorities, there fortunately exists today a greater awareness and acceptance of linguistic diversity. In 1991, Colombia adopted a new constitution, Article 10 of which says: “Castilian [Spanish] is the official language of Colombia. The languages and dialects of the ethnic groups are also official in their
territories. The education that is imparted in communities with their own linguistic traditions will be bilingual.” This text has allowed the acknowledgment and fostering of many initiatives, especially in scholastic circles, to use and revitalize the vernacular languages. These initiatives have developed over the last 30 years, and it would be premature to predict their outcome. In any event, the future of these languages remains worrisome since, of the 71 vernacular languages, 30 have fewer than a thousand speakers.
Afro-American Languages There are two creole languages, spoken in the Caribbean areas by populations of black African origin: the creole of San Basilio de Palenque, near Cartagena de Indias, spoken by 3000 people; and the creole of the islands of San Andre´ s and Providencia (Old Providence) off the coast of Nicaragua, spoken by 30 000 people. These two languages are new. They were created by slaves of diverse African ethnolinguistic origin (more clearly Bantu in the case of the creole of Palenque) in the period of the slave trade. The creole of San Basilio, known as Palenquero, was born in a Hispanic context, and a majority of its lexical roots come from Spanish, thus making it the only creole of Hispanic base in the Americas. The creole of San Andre´ s and Providencia was born in an English context (migrations from Jamaica) and its lexical base is mainly English.
Indigenous Languages Studies of the indigenous languages of Colombia have developed substantially in recent decades (for a bibliography and a characterization of these advances, see Landaburu, 2003). Relying on these works, it is now possible to group the 69 Amerindian languages present in Colombia into 13 different language families, to which may be added 8 isolated languages whose affiliation with others is as yet undemonstrated, giving us 21 different genetic groups. Greenberg’s proposed classification about languages of the Americas
published (2004) as Krishnamurthy (ed.). English collocation studies: the OSTI Report. London: Continuum. Stubbs M (1996). Text and Corpus Analysis. Oxford: Blackwell. Smadja F (1993). ‘Retrieving collocations from text: Xtract.’ Computational Linguistics 19(1), 143–177.
Smadja F, McKeown K & Hatzivassiloglou V (1996). ‘Translating collocations for bilingual lexicons: a statistical approach.’ Computational Linguistics 22(1), 1–38. Wordsmith Tools (1996). Written by Scott M. Oxford: OUP. For details and downloads, see http://www.lexically.net/wordsmith/ and http://www.oup.co.uk/isbn/0-19459400-9.
Colombia: Language Situation
J Landaburu, Centre d’Etudes des Langues Indigènes d’Amérique, Villejuif, France © 2006 Elsevier Ltd. All rights reserved.
Colombia’s geography has greatly influenced its present-day linguistic situation. Its position at the end of the Isthmus of Panama forced the pre-Columbian peoples migrating southward from North America to pass through it. The extreme diversity of its ecological niches, which include coastal areas along the Pacific and Atlantic Oceans, three Andean mountain ranges with climates varying according to altitude, savannas of the Orinoco Plain, Amazonian forests, torrid deserts, and cold plateaus, allowed many of these populations to settle there. As a result, Colombia has an exceptionally large variety of the South American continent’s indigenous language families. In addition, in the 16th century, the islands of the Caribbean Sea and the Atlantic coast of what is today Colombia saw the earliest Spanish settlements and the first shipments of black slaves, historical factors that would greatly influence the sociolinguistic configuration of the country.
The languages spoken in Colombia today include 69 Amerindian languages; two creole languages spoken by black populations of African descent in the Caribbean; and the Indo-European language Spanish, represented by a great number of regional variants. This linguistic reality is demographically highly unequal. Of the total Colombian population of over 40 million, there are fewer than 700 000 indigenous language speakers, and the speakers of creole languages number fewer than 35 000. Spanish is therefore the dominant language, and, except in some isolated indigenous zones, most Colombians speak it.
Although speakers of minority languages are few, there is today a greater awareness and acceptance of linguistic diversity. In 1991, Colombia adopted a new constitution, Article 10 of which says: “Castilian [Spanish] is the official language of Colombia. The languages and dialects of the ethnic groups are also official in their territories. The education that is imparted in communities with their own linguistic traditions will be bilingual.” This text has allowed the acknowledgment and fostering of many initiatives to use and revitalize the vernacular languages, especially in schools; these initiatives have developed over the last 30 years, and it is too early to predict their outcome. In any event, the future of these languages remains worrisome since, of the 71 vernacular languages, 30 have fewer than a thousand speakers.
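The demographic imbalance described here can be made concrete with a short calculation. The sketch below simply turns the rounded bounds quoted in the text into percentages; the variable and function names are illustrative, and because the inputs are bounds ("over", "fewer than"), the first two results are upper limits rather than measured shares.

```python
# Figures quoted in the text; each is a bound, not an exact count.
TOTAL_POPULATION = 40_000_000    # "over 40 million"
INDIGENOUS_SPEAKERS = 700_000    # "fewer than 700 000"
CREOLE_SPEAKERS = 35_000         # "fewer than 35 000"
VERNACULAR_LANGUAGES = 69 + 2    # Amerindian languages plus the two creoles
SMALL_LANGUAGES = 30             # languages with fewer than a thousand speakers


def percent(part: float, whole: float) -> float:
    """Return part / whole expressed as a percentage."""
    return 100 * part / whole


indigenous_share = percent(INDIGENOUS_SPEAKERS, TOTAL_POPULATION)
creole_share = percent(CREOLE_SPEAKERS, TOTAL_POPULATION)
small_language_share = percent(SMALL_LANGUAGES, VERNACULAR_LANGUAGES)

print(f"indigenous-language speakers: under {indigenous_share:.4f}% of the population")
print(f"creole speakers: under {creole_share:.4f}% of the population")
print(f"languages below 1000 speakers: {small_language_share:.1f}% of {VERNACULAR_LANGUAGES}")
```

Read this way, well under 2 percent of the population speaks any of the 71 vernacular languages, while more than two fifths of those languages fall below the thousand-speaker mark.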
Afro-American Languages
There are two creole languages, spoken in the Caribbean areas by populations of black African origin: the creole of San Basilio de Palenque, near Cartagena de Indias, spoken by 3000 people; and the creole of the islands of San Andrés and Providencia (Old Providence) off the coast of Nicaragua, spoken by 30 000 people. These two languages are new. They were created by slaves of diverse African ethnolinguistic origin (more clearly Bantu in the case of the creole of Palenque) in the period of the slave trade. The creole of San Basilio, known as Palenquero, was born in a Hispanic context, and a majority of its lexical roots come from Spanish, thus making it the only creole of Hispanic base in the Americas. The creole of San Andrés and Providencia was born in an English context (migrations from Jamaica) and its lexical base is mainly English.
Indigenous Languages
Studies of the indigenous languages of Colombia have developed substantially in recent decades (for a bibliography and a characterization of these advances, see Landaburu, 2003). Relying on these works, it is now possible to group the 69 Amerindian languages present in Colombia into 13 different language families, to which may be added 8 isolated languages whose affiliation with others is as yet undemonstrated, giving us 21 different genetic groups. Greenberg’s (1987) proposed classification of the languages of the Americas is insufficiently documented and argued; a more solid classification has been made by investigators in direct contact with these families (see Rodriguez de Montes, 1993; González de Pérez, 2000). For demographic data on the indigenous populations, Arango Ochoa and Sánchez Gutierrez (1999) is a reliable, albeit not yet definitive, authority. However, as very few serious sociolinguistic surveys have been done up to now, the present data refer more to ethnic populations than to speakers of languages as such.
Classification of Languages
The linguistic families of Colombia can be classified according to their geographic scope. Three groupings can be observed:
1. Five genetic groups present throughout the continent:
a. The Chibcha family (seven languages). This linguistic family, probably of Central American origin, is also present in Panama, Costa Rica, and Nicaragua. ‘Chibcha’ was the name of the people encountered by the Spaniards in the region of Bogotá. In Colombia today, there are Chibchan languages in Darién (Cuna [Kuna], with 1000 speakers in Colombia and more than 30 000 in Panama), in the Sierra Nevada of Santa Marta (Kogui [Cogui], with 10 000 speakers; Arhuaco or Ika [Ica], with 14 000 speakers; Damana, spoken by 1800 Wiwa or Arsario people; and Chimila, 900 people but very few speakers), in Catatumbo (Barí [Motilón], 3500 speakers), and in western Arauca (Uwa or Tunebo, 7000 speakers). A Chibchan affiliation of languages in the south of Colombia (e.g., Páez, Guambiano, Awa or Kwaiker [Cuaiquer]) has been proposed, but there is not sufficient evidence to support it.
b. The Arawak family (nine languages). This is the most geographically extensive family in South America. Probably of central Amazonian origin, it spread along the tributaries of the Amazon and Orinoco and along the coast of the Caribbean over the past two millennia. In Colombia, Arawak languages are found in three areas: the Guajira (Wayuu or Guajiro, with 144 000 people in Colombia and more than 180 000 in Venezuela); the eastern plains of the Orinoco and the area of the Negro River (Achagua, 280 people; Piapoco, 4500; Curripaco and Baniva of the Isana, 7000; Baniva of the Guainía, Tariano, 330); and the area of the Caquetá River (Yucuna, 500 people; Cabiyarí, 280).
c. The Carib family (two languages). This genetic group also spread before the arrival of the Spaniards, from the Guianas throughout the north of South America and south of the Amazon. In Colombia, it was represented in the Atlantic areas, the Magdalena River drainage, the Amazon, and probably in other regions as well. Today a group in the mountain range of Perijá subsists partly in Colombia and partly in Venezuela (the department of Zulia). They speak a Carib language called Yuko or Yukpa, with about 3530 people in Colombia and an equal number in Venezuela. Colombia’s other extant Carib language is Carijona, found in the Middle Caquetá region. Its population decreased greatly during the first decades of the 20th century and has nearly disappeared: today, fewer than 30 people speak Carijona.
d. The Quechua family (three languages). The presence of languages of the Quechua family in Colombia seems to be modern. Today, Inga or Ingano is spoken in the department of Nariño (Aponte), in the valley of Sibundoy (Putumayo), and in the department of Caquetá (along the upper Caquetá River, the Fragua River, the Yuruyacu River, and the Orteguaza River) by 18 000 people. Another variety of Quechua is spoken near Puerto Asís and along the San Miguel River. Both varieties are comparable with the dialects of Ecuadoran Quichua, especially with the Ecuadoran forest dialects. It is quite possible that the presence and expansion of Quechua in Colombia are due to its diffusion as a ‘lengua general’ by the Catholic missionaries of the 17th century. There are also speakers of Peruvian varieties of Amazonian Quechua on the lower Putumayo River.
e. The Tupí family (two languages). This great language family is found mainly in Brazil, Bolivia, Paraguay, and Argentina, but Tupí languages have a few speakers in the tiny community of Cocama, on the border between Colombia, Brazil, and Peru. Hundreds of speakers of ‘Lengua Geral’ (Nheengatú) [Nhengatu] have been reported on the Guaviare River.
2. Eight genetic groups with a regional distribution, present in several areas of the northwest:
a. The Barbacoa family (two languages). This group is found in the Andean southwest, with possible prolongations in the Ecuadoran west (Chachi [Cayapa], Tsachila [Colorado, Tsafiki]). It includes Guambiano in Cauca (21 000 people), and Awa or Kwaiker [Cuaiquer] in the Pacific piedmont of Nariño (130 000 people).
b. The Chocó family (two languages). It is found on the Pacific coast, from Panama to Ecuador, with incursions in both countries. Its languages are Embera, with much dialectal variation (Embera-Catío, Embera-Chamí, Tadó, Epena, with more than 70 000 people), and the well-differentiated Waunana [Woun Meu] (8000 people) along the San Juan River.
c. The Guahibo family (three languages). This family is found in the eastern plains of the Orinoco in Colombia and also in Venezuela, spoken by formerly nomadic populations who are today mostly settled. In Colombia, two very distinct languages are found at the extreme ends of the area: Hitnu or Macaguane [Macaguán] in the north (500 people), and Guayabero in the south (1200 people). Between these, a more homogeneous space is occupied by Guahibo proper or Sikuani (25 000 people in Colombia), with dialectal differences that are not very marked (e.g., Cuiba, Hamorúa).
d. The Sáliba-Piaroa family (two languages). The peoples of the Orinoco plains were catechized early, by the Jesuits in the 17th century. Sáliba is spoken in the west (1300 people); Piaroa is spoken in the east and also in Venezuela, close to the Orinoco River (800 people in Colombia, 5000 in Venezuela).
e. The Macú-Puinave family (five languages). Small groupings of nomadic forest communities along the Inírida River and in the forests of Guaviare and Vaupés speak the languages Yuhup, Hupda [Hupdë], Nukak [Nukak Makú], and Kakua. A more sedentary group along the Inírida River speaks Puinave (5400 people).
f. The Tucano family (eighteen languages). These languages are distributed in two areas: the upper Caquetá and the upper Putumayo in the west, and the upper Negro River and Vaupés in the east. Languages of this family are also spoken in Brazil, Ecuador, and Peru. In Colombia, the languages of the western area (Coreguaje [Koreguaje], Siona; 3000 people) are threatened by recent colonization; the eastern area is characterized by systematic practices of multilingualism. This latter area has 16 languages spoken by its fewer than 30 000 people: Cubeo, Tanimuca [Tanimuca-Retuarã], Tucano, Desano, Macuna, Tatuyo, Barasana, Carapana, Tuyuca, Yurutí, Siriano, Piratapuyo, Bará [Waimaha], Taiwano, Wanano [Guanano], and Pisamira.
g. The Huitoto family (three languages). The Uitoto language with its three dialects is spoken along the rivers Caquetá and Putumayo (6200 people), as is the Ocaina language, spoken by fewer than 100 people (though it is also spoken in Peru), and the Nonuya language, which is now moribund with only three living speakers.
h. The Bora family (three languages). Located in the Caquetá–Putumayo area, its languages are Muinane (550 people), Bora (650 people), and Miraña (660 people); the latter two are very similar.
3. Eight genetically unaffiliated languages:
a. Andoque (500 people), spoken in Araracuara (Amazonas).
b. Cofán (1460 people), spoken along the upper Putumayo and in a few communities across the border in Ecuador.
c. Kamsá (3500 people), spoken in the valley of Sibundoy (Andean–Amazonian piedmont).
d. Páez (100 000 people), spoken in the Andean southeast (eastern Cauca).
e. Tinigua (moribund, with two speakers), found in the Sierra de la Macarena.
f. Yaruro (3000 people total), found on the border with Venezuela (Arauca River); its speakers are occasionally present in Colombia.
g. Ticuna (6580 people in Colombia; more than 30 000 total), spoken at the edge of the Amazon River and extending beyond the border with Brazil and Peru.
h. Yagua (300 people in Colombia; 3000 in Peru), found on the border with Peru and along the rivers Putumayo and Amazon.
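The counts in the classification above can be cross-checked with a little arithmetic. The sketch below is only a bookkeeping aid: the per-family figures are copied from the list, while the variable names and the short labels for the two creoles are illustrative and not part of the source.

```python
# Languages per family, copied from the classification above.
CONTINENTAL = {"Chibcha": 7, "Arawak": 9, "Carib": 2, "Quechua": 3, "Tupí": 2}
REGIONAL = {
    "Barbacoa": 2, "Chocó": 2, "Guahibo": 3, "Sáliba-Piaroa": 2,
    "Macú-Puinave": 5, "Tucano": 18, "Huitoto": 3, "Bora": 3,
}
# Genetically unaffiliated languages (each counts as its own genetic group).
ISOLATES = ["Andoque", "Cofán", "Kamsá", "Páez", "Tinigua", "Yaruro", "Ticuna", "Yagua"]
# The two creoles described under Afro-American Languages (labels are shorthand).
CREOLES = ["Palenquero", "San Andrés and Providencia creole"]

families = {**CONTINENTAL, **REGIONAL}
genetic_groups = len(families) + len(ISOLATES)
amerindian_languages = sum(families.values()) + len(ISOLATES)
vernacular_languages = amerindian_languages + len(CREOLES)

print(len(families), "families +", len(ISOLATES), "isolates =", genetic_groups, "genetic groups")
print(amerindian_languages, "Amerindian languages;",
      vernacular_languages, "vernacular languages including the creoles")
```

The totals agree with the figures given in the article: 13 families and 8 isolates make 21 genetic groups, the family sizes plus the isolates add up to 69 Amerindian languages, and the two creoles bring the vernacular total to 71.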
Some Structural Features of the Indigenous Languages
We here outline only some simple characteristics of the indigenous languages. The more important typological differences are probably those found between the lowland languages (in Amazonas and Orinoquía, and on the Pacific and Atlantic coasts) and the highland or Andean languages (associating with these latter the languages of the Chibcha family that are sometimes found in the lowlands).
At the phonetic–phonological level, we find complex consonantal systems with simple vocalic systems in the Andes, whereas in the lowlands the tendency is the opposite: complex vocalic systems with simpler consonantal systems. Remarkable consonantal characteristics include the retroflexes of Guambiano and Kamsá; the six consonant series of Páez, where the simple occlusives can receive either a feature of palatalization and/or aspiration or a feature of palatalization and/or prenasalization; the use in many languages of the opposition tense/lax, rather than the opposition voiceless/voiced; the existence of implosives in the Embera of the Pacific, in the Arawakan languages of the plains (Achagua, Piapoco, Curripaco), and in the Witoto family (Nonuya, Uitoto-Nepode); the importance of consonantal prenasalization; the postnasals of Yuhup (Makú); the lateral affricates of Kogui; and the existence of aspirated flaps (Barí, Cabiyarí) or nasalized labiovelars (Kogui).
The most common vocalic system is one of six vowels: the five cardinal vowels plus a vowel which can be mid-central in the Andes, or is frequently close, unrounded, and back in the Amazon (note also the rounded front vowel of the Embera-Chamíes). In the Andes or the Chibchan languages, the system can be reduced to four vowels (Páez, Awa-Kwaiker) or to five (Guambiano, Cuna, Chimila). In the Amazon region, there are greater complexities not only in the number of vocalic qualities (nine qualities in Andoque and Yuhup; eight qualities in Cuiba) but also in coarticulations. It is common to have, along with the simple system, a system of nasal vowels and/or a system of glottalized or of aspirated vowels. The handling of vocalic nasality among the eastern Tucano languages of Vaupés is remarkable (morphemic nasality, word harmony).
In the same area of Vaupés, and also along the lower Caquetá-Putumayo and the Amazon River, there are tonal languages of two or three registers (the Tucano family, the Bora family, the Makú-Puinave family, Andoque, Ticuna). At the border of this area are pitch-accent languages that keep the oppositions of register, but only on the accented unit (syllable or mora): Piapoco, Yucuna, Barasana, Nonuya, etc.
The most widely attested type of word morphology is agglutinating, although there are tendencies toward inflection in the classic sense among the languages of the Sierra Nevada of Santa Marta, and isolating tendencies occur in Embera, Cuna, Uwa, and others. The agglutination can go to the point of polysynthesis or holophrasis (Páez, Kamsá), with nominal incorporation, yielding utterances composed solely of a predicate word. The verb is commonly synthetic, but there are also analytical constructions with aspectual, negative, modal, and deictic auxiliaries.
At the syntactic level, the regressive order (determiner–determined) is dominant, with a strong tendency to locate the verb at the end of the sentence, preceded by its complements. Many languages can choose between different predicate structures to direct the attention to an event or to some entity of the event (Sáliba, Piapoco, Achagua, Cofán, Muinane, Puinave, etc.). The nominal or adjectival predication is frequently expressed as a verbal predicate (the noun or the adjective is ‘conjugated’ in Cofán, Páez, Piapoco, etc.) since the verb–noun opposition is frequently questionable at a higher syntactic level. Regarding hierarchical structuring and the classification of participants, there are languages that are clearly ergative (Embera, Uwa), partially ergative (Kogui, Wiwa [Malayo] or Damana), and accusative (Ika, Andoque, Eastern Tucanoan, etc.). The active–stative type is very common in the lowlands (Arawak). Among the Chibchan languages and in the highlands, morphological topicalization is common (Awa or Kwaiker, Guambiano, Páez, Uwa or Tunebo, Arhuaco, Cuna, etc.). The nominal function is frequently marked by declension suffixes. The representation of the main participants in an event is commonly made by means of integrated personal or generic markers on the verb. Nevertheless, there are also languages without personal inflection in the verb (e.g., Embera, Uwa-Tunebo, Yuhup).
At the syntactic–semantic level, and for the representation of entities, it is common in the lowlands to find classifiers of shape and/or gender markers with functions of syntactic agreement (Sáliba, Tucano family, Bora family, Andoque, etc.), whereas the absence of class and gender markers is dominant in the highlands (Guambiano, Páez, Sierra Nevada, Cuna, Uwa, etc.). There are also numeral classifiers (Cuna). The categorization of person typically opposes 1st and 2nd to the 3rd, although there is also the opposition of 1st versus 2nd and 3rd (in the Andean south, Guambiano, Awa-Kwaiker, Cofán). The Páez language also distinguishes feminine 1st person from masculine 1st person and feminine 2nd person from masculine 2nd person.
The Andoque language distinguishes an impregnable (i.e., a potentially pregnant woman) 2nd person (pluralized) from an unimpregnable (i.e., a young girl or old woman) 2nd person. The opposition between inalienable and alienable nouns is generalized; the inalienable ones (body parts, spatial relations, kinship, etc.) appear with obligatory possessive prefixes. The categorization of space is often complex, and the systems of deixis or of event orientation are highly elaborated (Sierra Nevada of Santa Marta, Kamsá, Páez, Sikuani or Guahibo, Andoque, etc.), combining criteria of proximity with criteria of movement, nominal class, and the directionality of the sun, the rivers, and so on. Grammaticalization of temporal location also occurs, although it is generally marked by a combination of aspectual and deictic markers. The systems of epistemic modality are also noteworthy, and they highlight a sensitivity to the source of information (Tucano family, Páez, Guambiano, Uwa, Andoque, Kamsá, languages of the Sierra Nevada, etc.).
The Spanish Spoken in Colombia
We have deliberately focused on the Amerindian linguistic diversity of Colombia because it is qualitatively richer and less well known. Nevertheless, we must remember that Colombia is not today a country with majority or near-majority Indian areas like Peru, Bolivia, Ecuador, and Guatemala. Nor was it affected by demographically significant immigration of Europeans after the 18th century, as was the case in Argentina, Uruguay, and Chile. The three basic population components – Indians, blacks, and whites – were gradually mixed from the 16th century on in diverse proportions in different regions, producing a locally differentiated but globally continuous Spanish speech. This continuum represents nearly the totality of the population of 40 million inhabitants. It is found above all in the three Andean mountain ranges, their inter-Andean valleys, and on the Atlantic coast. In these regions are the great cities in which most of today’s Colombians are concentrated, including Bogotá, Medellín, Cali, Barranquilla, Cartagena, and Bucaramanga. The linguistic variation of this Spanish-speaking population is remarkable, but it does not impede communication. Mutual intelligibility is lowest between the speakers of the coastal varieties (found along the Atlantic and in the Caribbean hinterland) and the others. Among the many different features which distinguish the ‘Costeños’ from the ‘Andeans,’ we can note, at the phonetic level, the aspiration or loss of syllable-final /s/, the loss of final /r/, and the velar pronunciation of final /n/. These same phonetic features are found in Andalusian Spanish, the variety of Spanish that began the American conquest on the islands and the Caribbean coast.
In the use of 2nd person markers, the Costeños of the Caribbean prefer the ‘tuteo’ (i.e., the use of ‘tú’), whereas the Costeños of the Pacific prefer the ‘voseo’ (i.e., the use of ‘vos’), which is generalized in the southeast of Colombia. The Atlantic coastal varieties can be subdivided into Cartagenero, Samario (of Santa Marta), and Guajiro. The more easterly speech varieties of the Orinoco plains are of a Costeño type, possibly owing to the influence of Venezuelan speech, itself a coastal variety. Andean Spanish may be divided into western (Nariño, Cauca, Caldas, and Antioquia) and eastern (Cundinamarca and Boyacá, Santander, Tolima and Huila) varieties. Among the western dialects, we can distinguish Antioqueña or ‘Paisa,’ Valluna of the area of Cali, and ‘Pastusa’ of the southwest border with Ecuador. The Paisa variety is noticeable to other Colombians for its apico-palatal pronunciation of /s/ (similar to Castilian) and the ‘yeísmo’ (ll>y) that it shares with the people of Valle del Cauca. In Valle del Cauca, the voseo is generalized and the labialization of final /n/ is also noteworthy. In Nariño, as also in Valle del Cauca, ‘quechuismos’ (i.e., elements of Quechua) are frequent. The pronunciation resembles that of Andean Ecuador, with tense consonants and a short syllabic rate. On the eastern side of the Andes, the distinctions appear mainly in the lexicon (isoglosses). Of note in Boyacá, though less so in Cundinamarca, is the form ‘Su Merced’ for the polite 2nd person. There is also an assibilation of /r/ in these varieties (‘rolo’ or traditional speech of Bogotá, ‘opita’ of the department of Huila), a feature probably originating in the indigenous substrate (also present in Nariño). We cannot here discuss the many differences of lexical usage. For particular studies, see the monumental Atlas lingüístico-etnográfico de Colombia, 1981–1983, compiled at the Instituto Caro y Cuervo under the direction of Luis Flórez. It is also important to mention that with the intensification of telecommunications and considerable internal migration over the past 40 years, some of the specificities are disappearing, and many others, mainly lexical, are appearing in areas different from their origin.
See also: Minorities and Language; Spanish; Venezuela: Language Situation.
Language Maps (Appendix 1): Maps 56, 57.
Bibliography
Arango Ochoa R & Sánchez Gutierrez E (1999). Los pueblos indígenas de Colombia 1997 (población y territorio). Bogotá: Departamento Nacional de Planeación.
González de Pérez M S (ed.) (2000). Lenguas indígenas de Colombia: Una visión descriptiva. Bogotá: Instituto Caro y Cuervo.
Greenberg J H (1987). Language in the Americas. Stanford: Stanford University Press.
Instituto Colombiano de Antropología (1987). Introducción a la Colombia amerindia. Bogotá.
Landaburu J (2003). ‘État des lieux de la linguistique colombienne en Amérique latine.’ In Faits de Langue1: Méso-Amérique, Caraïbes, Amazonie. Paris: Ophrys.
Montes Giraldo J J (1985). Estudios sobre el español de Colombia. Bogotá: Instituto Caro y Cuervo.
Rodriguez de Montes M L (ed.) (1993). Estado actual de la clasificación de las lenguas indígenas de Colombia. Bogotá: Instituto Caro y Cuervo.
Color Terms
D L Payne, University of Oregon, Eugene, OR, USA © 2006 Elsevier Ltd. All rights reserved.
Color Perception
Color is a shorthand way of referring to the psychological interpretation of retinal and neuronal perception of reflected visible light (Lenneberg and Roberts, 1956; Hardin, 1988). Colors are commonly thought of as being composed of three properties: 1) hue (perception of wavelength interactions), 2) brightness or luminescence on a dark-light scale (based on reflectivity of a surface), and 3) saturation (perception of purity of one dominant wavelength). The highest degree of luminescence is ‘white’ or ‘bright,’ while the lowest degree (no reflectivity) is ‘black’ or ‘dark’ (Figure 1). If there is very low or no saturation, the color is interpreted as ‘gray’ (Figure 2).
Figure 1 Luminescence.
Figure 2 Saturation.
Color Vocabulary
Color terms are not the same thing as the psychophysical perception of wavelength and reflectivity, but are Saussurean ‘signs’ which name color concepts. Individuals from two distinct language-culture groups may perceive given light-wave experiences similarly but use very distinct patterns of color terms to talk about their experiences. For example, it is unlikely that a native English speaker would use a single color term to name the entire range of colors that are named by the term nirô, or a single term for the range named by pôs in Maa, the language of the Maasai, in Kenya and Tanzania (Figure 3). Conversely, many English speakers might use the single word brown for the hues that Maa speakers divide into nirô, múgíé, moríjoi, and several other categories.
Figure 3 Maa color naming. See http://darkwing.uoregon.edu/~dlpayne/maasai/MaaColorNaming-.htm. This figure reflects a color-naming task done by Vincent Konene Ole-Konchellah, a Maa (Maasai) speaker of Kenya, il-Wuasinkishu section. When the task was done, the color circles were randomized within a field. They are re-arranged here according to the names applied to the colors. In other Maa-speaking areas some terms, e.g., si0 nteˆt and pôs, may designate different colors. Maa has many additional color terms which Ole-Konchellah just did not employ in this task.
Essentially, all languages have two or more lexical items that name color concepts as their basic sense (but see Levinson, 2002). The Dani (Irian Jaya) word mola names a color concept roughly corresponding to a combination of ‘red + white + yellow.’ The Yagua (Peru) rúuną́y names ‘red.’ Some color terms may derive from the names of objects, such as English olive, which names a tree and its fruit; only by metonymic extension does it name the grayish-green color corresponding to the prototypical fruit of the olive tree. Some color terms are contextually restricted. Thus, English blond primarily applies to human hair colors, and cannot be used for the same hue range in paint found, for example, on cars or walls. The Maa ómò is restricted to the color of certain light-brown sheep. Even for terms that are not contextually restricted, their reference on particular occasions of use is likely to be severely affected by context. The meaning of black in black sky is not likely to be the same ‘black’ as in black crow. Red is unlikely to designate the same hue-saturation-brightness values in red lipstick and in red hair (under natural circumstances). Color terms often have emotional or social connotations, such as the widely attested association of ‘red’ with anger. Color terms are common in idioms for human beings. Sometimes languages include in their ‘color’ category words that cannot be defined only by hue, saturation, and brightness parameters. The Maa emúá ‘color’ category contains both hue-saturation-brightness terms and color-plus-design terms such as arôs ‘spotted black and white,’ keshúroi ‘red and white/brown and white’ with ‘white’ on or near the face, sámpù ‘thinly striped, typically with tan and white’ (Figure 4), etc. Pukótì ‘blend of black and white, so well blended that from a distance the whole may appear blue’ is a hyponym (subcase) of pôs ‘blue’, parallel to sagárarámì ‘light blue/purple’ (from the name of a seed pod), and kií ‘blue’ (from ‘whetting stone’) (Payne et al., 2003).
On different occasions, the same speaker may name a given hue-saturation-brightness value with different terms. In part, this led MacLaury (1996; 2002) to argue that speakers may switch perspectives in observing a phenomenon; they may look at two items from the vantage point of either how similar, or how different, they are. Perspective-switching allows for flexible cognitive categorizations, hence alternative namings, and eventually may lead to different lexicalizations across speech communities.
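The three properties named under Color Perception (hue, brightness, saturation) have rough computational analogues in the HLS color model. The sketch below uses Python's standard `colorsys` module to decompose a few RGB values; it is an illustration only, since HLS is a device-oriented approximation rather than the psychophysical definition used in the article, and the helper name `describe` is an assumption made for this example.

```python
import colorsys


def describe(r: float, g: float, b: float) -> tuple[int, float, float]:
    """Return (hue in degrees, lightness, saturation) for RGB components in 0..1."""
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    return round(hue * 360), round(lightness, 2), round(saturation, 2)


samples = {
    "white": (1.0, 1.0, 1.0),   # highest lightness
    "black": (0.0, 0.0, 0.0),   # lowest lightness
    "gray": (0.5, 0.5, 0.5),    # no saturation
    "red": (1.0, 0.0, 0.0),     # fully saturated
}

for name, rgb in samples.items():
    hue, lightness, saturation = describe(*rgb)
    print(f"{name:5s} hue={hue:3d} lightness={lightness} saturation={saturation}")
```

In this model white and black sit at the two ends of the lightness scale, and gray has zero saturation, which mirrors the description of Figures 1 and 2. How a speaker names any one of these value triples is, as the article stresses, a separate question.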
Color Term Universals
An enduring question concerns whether universal constraints underlie inventories of color terms. If so, do explanations lie in physiology or the nature of cognition? Bloomfield (1933: 140) advanced the relativist idea that languages can ‘mark off’ different portions of the wavelength continuum quite arbitrarily. On this view, color naming would be entirely culture-specific. A related question concerns to what extent color vocabulary may affect individuals’ cognitive perceptions of color (cf. Whorf, 1956; Kay and Kempton, 1984). Scientific cross-cultural studies of color terms began with the ophthalmologist Magnus (1880), who drew evolutionary conclusions about vocabulary development.
The anthropologist Rivers (1901) drew evolutionary conclusions about social and mental development. Employing Lenneberg and Roberts’s (1956) procedures for researching Zuni (New Mexico) color terms, Berlin and Kay (1969) (henceforth BK) addressed the universals question. They distinguished between basic color terms (BCTs) and color terms generally, and argued against an extreme relativist position, instead positing universal constraints on the evolution of basic terms. BK defined a BCT as a word that refers to color first and foremost; is not a composite of other color terms; is not a sub-case hyponym of a more general term; is not contextually restricted; and is salient, as judged by being readily used and widely known throughout a language community. By these criteria, we identify Yagua as having four basic color roots (though of differing parts of speech): pupá- ‘white,’ dakuuy ‘be dark, black,’ rúuną́y ‘red colored,’ súnų- ‘green-blue.’ A concept partially corresponding to ‘yellow’ can be expressed, but this involves modifying súnų- ‘green-blue’ with a suffix that probably derives from -diiy ‘near’ (súnųdiipó ‘pale, yellowish,’ súnųdíway ‘be yellowish, pale, anemic’; Powlison, 1995). Secondary criteria, appealed to in problematic cases, include whether the term (a) has the same grammatical properties as other BCTs; (b) is not derived from the name of an object; and (c) is not recently borrowed. Secondary criteria can be synchronically irrelevant for determining basic status, even if historically true.
Figure 4 Animal hide displaying the Maa (Maasai) color term sámpù ‘thinly striped, typically with tan and white.’
English orange was borrowed from French and still is the name of a fruit tree, but orange is considered a BCT in modern English because it meets the primary criteria. BK tested the hypothesis that there are constraints on development of BCTs using an array of about 330 Munsell color chips and 20 languages, relying on bilingual speakers living in California. The BCTs of each language were identified and elicited from the speakers. They were then asked to use the color chips to identify the best example (focal hue) of each term identified as a BCT in their respective languages. In a separate step speakers plotted the range of each BCT on an array of the color chips. The 20-language sample was supplemented by data on 78 more languages extracted from dictionaries and field-workers’ notes. BK concluded that though BCTs could show marked differences in range, there was a high degree of stability for focal hues across languages: only about 30 of the chips were nominated as focal hues. These concentrated around the focal hues of English black, white, red, green, yellow, blue, gray, brown, orange, purple, pink. Some languages had a term that covered blue + green (cf. Yagua súnų-), but BK’s results showed that the focal hue of this term tended to be either ‘blue’ or ‘green,’ but not half-way in between. They concluded that languages could be placed along a continuum of seven stages of BCT development, and that an implicational hierarchy governed the order in which new BCTs could be added, ending with a maximum of 11 BCTs (Figure 5). These claims opposed the view that languages could vary without limit.
Further empirical evidence argued that, for people with normal trichromatic vision, certain focal centers are psychologically salient even when a person’s language has no BCT corresponding to those focal colors (Heider, 1972; Rosch, 1975). Rosch showed that in Dani, with just two BCTs, speakers were better able to hold certain colors in memory than others, even when the memorable colors did not correspond to a focal center of one of the two Dani color terms. Importantly, the memorable colors corresponded quite closely to the BK ‘best examples’ from other languages. This result argues that the focal colors BK identified are psychologically salient, with the implication that at least the centers of color term categories were not dependent on culture or language. Again this concept countered a strong form of the Whorfian hypothesis. Subsequent scholars have challenged the BK study on several grounds, including Western cultural bias, non-random sampling procedures, bilingual interference, transcription and data errors, and inadequate experimental methodologies (Hickerson, 1971; Saunders and van Brakel, 1997). Dedrick (1998) provides an even-handed review of the research from a philosophy of science perspective. The BK study was nevertheless hugely influential in initiating an enduring research tradition, spurring investigation of hundreds of additional languages (Borg, 1999). Major cross-language studies include MacLaury (1996) and the World Color Survey (Kay et al., forthcoming). Together these motivated revisions to the universalist claims (cf. Kay et al., 1997), including the following.
Figure 5 Berlin and Kay’s (1969) hypothesized stages in development of BCTs. If a language has any BCT to the right on the hierarchy, it was predicted to have all BCTs to the left. (A Stage VII language need have only some of ‘gray, pink, orange, purple.’)
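The implicational reading given in the Figure 5 caption can be made concrete with a small check. The sketch below is illustrative only: the figure itself is not reproduced here, so the grouping of terms into levels follows the commonly cited form of BK's hierarchy and should be read as an assumption, as should the function names. It also ignores composite categories, which the later revisions introduce.

```python
# Levels of the hierarchy, leftmost first. The grouping is an assumption based
# on the commonly cited form of Berlin and Kay's (1969) sequence.
BK_LEVELS = [
    {"white", "black"},
    {"red"},
    {"green", "yellow"},
    {"blue"},
    {"brown"},
    {"purple", "pink", "orange", "gray"},
]

def level_of(term):
    """Return the index of the hierarchy level that contains `term`."""
    for i, level in enumerate(BK_LEVELS):
        if term in level:
            return i
    raise ValueError(f"{term!r} is not one of the 11 BK basic color terms")

def missing_prerequisites(inventory):
    """Terms predicted to be present (everything to the left) but absent."""
    inventory = set(inventory)
    if not inventory:
        return set()
    highest = max(level_of(t) for t in inventory)
    required = set().union(*BK_LEVELS[:highest])  # all levels strictly to the left
    return required - inventory

def is_consistent(inventory):
    """True if the inventory violates no implication of the hierarchy."""
    return not missing_prerequisites(inventory)

print(is_consistent({"white", "black", "red", "green"}))   # only one of green/yellow
print(missing_prerequisites({"white", "black", "blue"}))   # blue without what precedes it
```

Because terms on the same level are never required of one another, an inventory with only ‘green’ (or only some of ‘gray, pink, orange, purple’) still passes, which matches the caption's parenthetical about Stage VII.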
Figure 6 Kay and McDaniel’s (1978) revised BCT color sequence. Arrows represent splitting of composite categories. Gray is ‘wild,’ able to appear anywhere, but later is more likely.
• In addition to ‘blue + green,’ the developmental sequence was revised to include more composites (Kay and McDaniel, 1978) (Figure 6). This was partially based on the discovery that ‘white’ was not a focal hue in all two-color BCT systems. For example, though the range of the Dani mola includes ‘white + red + yellow,’ it had a focal hue within the ‘red’ range. A more insightful characterization is that mola is a WARM color term, and neither a ‘white’ nor a ‘red’ term. The complementary term is mili, which is a ‘black + green + blue,’ or DARK-COOL composite. ‘Yellow + green,’ ‘white + yellow,’ and ‘black + blue’ composites have also been documented. In some languages a ‘green + blue’ composite may persist even after ‘brown,’ ‘purple,’ or both have achieved BCT status. Acknowledging composites accounted for how speakers can use BCTs to name any hue-saturation-brightness value, whereas BK would have predicted that some phenomenological color values would go unnamed.
• Composite color categories may have their foci in one salient hue or another, or may have multiple foci. This difference may vary by speaker.
• In the revised developmental sequence, the colors of Stages VI and VII were viewed as derived. The developmental sequence thus contained category types: composite, unique hue, and achromatic (‘red, yellow, green, blue, white, black’), binary hue (‘orange’ as a combination of ‘yellow’ and ‘red,’ ‘purple’ as a combination of ‘red’ and ‘blue’), and derived (‘brown,’ ‘pink’).
• Developmentally, ‘brown, purple, pink, orange’ and especially ‘gray’ may appear earlier than predicted by BK (Greenfield, 1986). Indeed, the supposition that BCTs always come about by splitting hue-based categories into smaller hue-based categories is wrong, as brightness and saturation parameters can play a role. For example, a desaturated ‘gray’ might surface early in the sequence, and subsequently be reinterpreted as ‘blue’ (independently of any ‘green + blue’ composite) (MacLaury, 1999).
• Languages may lexicalize BCTs along a brightness parameter. The Bellonese (Solomon Islands) system has three ‘mothers’ or ‘big names’ of colors: susungu for bright, light colors (other than light greens and green-yellows), ‘ungi for dark colors (except pitch-black), and unga for the rest of the spectrum (plus other non-BCTs) (Kuschel and Monberg, 1974; cf. MacLaury, 1996).
• Though color categories cannot be defined by their boundaries, there are still restrictions on boundaries. Suppose one color category has its focus in ‘red’ and another has its focus in ‘yellow.’ If a speaker of such a language moves gradually from the red focus to the yellow one, there will be some point after which the speaker simply can no longer affirm that the hue could be considered ‘red’: a hue boundary has been passed (Dedrick, 1998).
• Some languages have more than 11 BCTs. Russian has 12, including goluboj ‘light, pale blue’ and sinij ‘dark, bright blue.’ Hungarian has both piros ‘light red’ and vörös ‘dark red’ BCTs (MacLaury et al., 1997).
Explaining Basic Color Terms
The claim that universals partially govern development of BCTs appears to receive strong statistical support (Kay et al., 1997; and the forthcoming World Color Survey). Even so, what can ultimately explain the constrained developmental patterns remains unresolved. Kay and McDaniel (1978) argued that unique hue terms like white, black, red, green, yellow, and blue could be explained by an opponency theory, derived from the nature of the human eye and basic neural responses (which concerns whether a given retinal cell is maximally excited or inhibited by a given wavelength; Hering, 1920/1964; Hardin, 1988). Appeal was then made to fuzzy set theory (Zadeh, 1965) to account for binary and derived color terms like brown, orange, purple, pink and gray. But this set of explanations cannot account well for composite color terms that combine fundamental perceptual categories such as ‘yellow + red,’ ‘green + blue,’ and ‘white + yellow.’ ‘Yellow + green + blue’ composites are particularly troubling, since certain retinal cells appear to be maximally excited by focal blue hues but maximally inhibited by focal yellow. Disconcertingly, the proposal did not explain how categories change over time – one of the principal claims of the BK research paradigm was precisely that systems do change. Rosch’s findings led to explanations for color categorization in terms of central prototypes grounded in perception. Such an explanation works well for perceptually salient focal colors, but does not account for BCTs like purple, which tend not to have a salient focus; nor does it account for category boundary phenomena in color naming tasks. Arguments have been advanced that composite color terms for LIGHT-WARM and DARK-COOL may be linked to colors typically associated with day and night (Goddard, 1998); and other color terms may develop based on the color of culturally important objects (Saunders and van Brakel, 1997) (the position of cultural relativists). But troubling data for a culturally-grounded explanation of DARK-COOL and LIGHT-WARM terms is that BCTs for these notions do not often correspond to lexical terms for ‘night,’ and ‘day’ or ‘sun,’ respectively.
Most troubling, these accounts have no way of accounting for the strong statistical patterns seen in large data sets such as the World Color Survey or MacLaury’s Mesoamerican study. Almost certainly any reductionist one-factor explanation will ultimately fail in explaining all of the patterns of BCT development in the world’s languages. See also: Categorizing Percepts: Vantage Theory; Cognitive Semantics; Lexicalization; Prototype Semantics.
Bibliography
Berlin B & Kay P (1969). Basic color terms, their universality and evolution. Berkeley: University of California Press [Reprinted 1991/1999. Stanford: CSLI Publications, with expanded bibliography by Luisa Maffi, and color chart by Hale Color Consultants.].
Bloomfield L (1933). Language. New York: Holt.
Borg A (ed.) (1999). The language of color in the Mediterranean. Stockholm: Almqvist & Wiksell.
Dedrick D (1998). Naming the rainbow: colour language, colour science, and culture. Dordrecht: Kluwer.
Goddard C (1998). Semantic analysis: a practical introduction. Oxford: Oxford University Press.
Greenfield P J (1986). ‘What is grey, brown, pink, and sometimes purple: the range of ‘wild card’ color terms.’ American Anthropologist 24, 908–916.
Hardin C L (1988). Color for philosophers: unweaving the rainbow. Indianapolis/Cambridge, MA: Hackett.
Heider E R (1972). ‘Universals in color naming and memory.’ Journal of Experimental Psychology 93, 1–20.
Hering E (1920/1964). Outlines of a theory of the light sense. Cambridge, MA: Harvard University Press.
Hickerson N P (1971). ‘Review of Berlin and Kay (1969).’ International Journal of American Linguistics 37, 257–270.
Kay P, Berlin B, Maffi L & Merrifield W (1997). ‘Color naming across languages.’ In Hardin C L & Maffi L (eds.) Color categories in thought and language. Cambridge: Cambridge University Press. 21–55.
Kay P, Berlin B, Maffi L & Merrifield W (forthcoming). World color survey. Chicago: University of Chicago Press (Distributed by CSLI).
Kay P & Kempton W (1984). ‘What is the Sapir-Whorf Hypothesis?’ American Anthropologist 86, 65–79.
Kay P & McDaniel C K (1978). ‘The linguistic significance of basic color terms.’ Language 54, 610–646.
Kuschel R & Monberg T (1974). ‘‘We don’t talk much about colour here’: a study of colour semantics on Bellona Island.’ Man 9, 213–242.
Lenneberg E H & Roberts J M (1956). The language of experience: a study in methodology, Memoir 13, International Journal of American Linguistics. Baltimore: Waverly.
Levinson S C (2002). ‘Yélî Dnye and the theory of basic colour terms.’ Journal of Linguistic Anthropology 10, 3–55.
MacLaury R E (1996). Color and cognition in Mesoamerica: constructing categories as vantages. Austin: University of Texas Press.
MacLaury R E (1999). ‘Basic color terms: twenty-five years after.’ In Borg A (ed.) The language of color in the Mediterranean. Stockholm: Almqvist & Wiksell. 1–37.
MacLaury R E (2002). ‘Introducing vantage theory.’ Language Sciences 24, 493–536.
MacLaury R E, Almási J & Kövecses Z (1997). ‘Hungarian Piros and Vörös: color from points of view.’ Semiotica 114, 67–81.
Magnus H (1880). Untersuchung über den Farbensinn der Naturvölker. Jena: Gustav Fischer.
Payne D L, Ole-Kotikash L & Ole-Mapena K (2003). ‘Maa color terms and their use as human descriptors.’ Anthropological Linguistics 45, 169–200.
Powlison P (1995). Nijyami Niquejadamusiy-May Niquejadamuju. (Diccionario Yagua – Castellano) [Yagua-English Dictionary]. Lima: Instituto Lingüístico de Verano.
Rivers W H R (1901). ‘Introduction: colour vision.’ In Haddon A C (ed.) Reports of the Cambridge Anthropological Expedition to Torres Straits 2: Physiology and Psychology. Cambridge: Cambridge University Press. 1–132.
Rosch E H (1975). ‘Cognitive reference points.’ Cognitive Psychology 4, 328–350.
Saunders B & van Brakel J (1997). ‘Are there nontrivial constraints on colour categorization?’ Behavioral and Brain Sciences 20, 167–228.
Whorf B L (1956). ‘The relation of habitual thought and behavior to language.’ In Carroll J B (ed.) Language, thought and reality: selected writings of Benjamin Lee Whorf. Cambridge, MA: MIT Press. 134–159.
Zadeh L (1965). ‘Fuzzy sets.’ Information and Control 8, 338–353.
Relevant Website
http://www.icsi.berkeley.edu – World Color Survey Site.
Combinatory Categorial Grammar M Steedman, University of Texas, Austin, TX, USA J Baldridge, University of Edinburgh, Edinburgh, UK © 2006 Elsevier Ltd. All rights reserved.
Introduction
Combinatory Categorial Grammar (CCG), like other varieties of Categorial Grammar (CG) discussed by Wood (1993), is a radically lexicalized grammar in which all language-specific grammatical information is specified in the lexicon and the application of syntactic rules is entirely conditioned on the syntactic type, or category, of their inputs. No rule is structure-dependent. In this respect CCG is to be contrasted with Transformational Grammar and its descendants. It is further distinguished from them and most other theories of natural grammar by its radically free conception of derivational constituency, uniting intonation structure and surface structure, and its distinctive account of the long-range dependencies involved in relative clauses and coordination. The latter account avoids the use of syntactic variables and eschews movement and deletion as syntactic operations. CCG is also distinguished by its use of a fixed inventory of type-driven rules from nonfinitely axiomatizable categorial logics such as the Lambek calculus and Type-Logical Grammar. Categories identify the syntactic type of a constituent as either a primitive category or a function category. Primitive categories, such as N, NP, PP, and S, may be regarded as further distinguished by features such as number, case, and inflection (including features of some version of the X-bar theory), where appropriate. Functions (such as verbs) bear categories identifying the type of their result (such as S) and that of their argument(s)/complement(s), both of which may themselves be either function categories or primitive categories. For example, the English transitive verb married bears the following category:
(1) married := (S\NP)/NP
This syntactic category identifies the transitive verb as a function and specifies the type and directionality of its arguments and the type of its result. We here use the ‘result leftmost’ notation, in which a rightward-combining functor over a domain b into a range a is written a/b; the corresponding leftward-combining functor is written a\b, where a and b may themselves be function categories. (There is an alternative ‘result on top’ notation due to Lambek, according to which the latter category is written b\a. The use of slashes in both notations should be distinguished from the quite different use of slash notation in Generalized Phrase Structure Grammar.) The transitive verb category in (1) also reflects its semantic type, which we write (following the article Semantics in Categorial Grammar) as ((t e) e), where e is the type of an entity and t is the type of a proposition. We can make this semantics explicit by pairing the category with a term of the lambda calculus, via a colon operator:
(2) married := (S\NP)/NP : λxλy.marry′xy
(Primes mark constants; nonprimes are variables. The notation uses concatenation to mean function application under a left-associative convention, so that the expression marry′xy is equivalent to (marry′x)y.) Pure CG limits syntactic combination to rules of functional application of functions to arguments to the right or left, which in the present notation can be written as:
(3a) X/Y : f   Y : a   ⇒   X : fa   (>)
(3b) Y : a   X\Y : f   ⇒   X : fa   (<)
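The two application rules can be tried out directly. The following is a minimal sketch, not an implementation of any particular CCG system: the class names `Prim` and `Func`, the encoding of logical forms as nested tuples, and the constants `anna'` and `manny'` are all illustrative assumptions. Categories are printed in the ‘result leftmost’ notation used above.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Prim:
    name: str                      # a primitive category such as S or NP
    def __str__(self):
        return self.name

@dataclass(frozen=True)
class Func:
    result: "Cat"
    slash: str                     # "/" seeks the argument to the right, "\\" to the left
    arg: "Cat"
    def __str__(self):
        def wrap(c):               # bracket nested function categories
            return f"({c})" if isinstance(c, Func) else str(c)
        return f"{wrap(self.result)}{self.slash}{wrap(self.arg)}"

Cat = Union[Prim, Func]

def forward_apply(left, right):
    # Rule (3a): a rightward-seeking function followed by its argument.
    (lcat, f), (rcat, a) = left, right
    if isinstance(lcat, Func) and lcat.slash == "/" and lcat.arg == rcat:
        return (lcat.result, f(a))
    return None

def backward_apply(left, right):
    # Rule (3b): an argument followed by a leftward-seeking function.
    (lcat, a), (rcat, f) = left, right
    if isinstance(rcat, Func) and rcat.slash == "\\" and rcat.arg == lcat:
        return (rcat.result, f(a))
    return None

S, NP = Prim("S"), Prim("NP")
# Category (2): the lambda term is a curried Python function.
married = (Func(Func(S, "\\", NP), "/", NP), lambda x: lambda y: ("marry'", x, y))
anna = (NP, "anna'")
manny = (NP, "manny'")

vp = forward_apply(married, manny)       # married Manny
sentence = backward_apply(anna, vp)      # Anna married Manny
print(married[0], "|", vp[0], "|", sentence[0])
```

Currying the Python function mirrors the left-associative convention: the verb first consumes its object to the right and then its subject to the left.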
The application rules in (3) allow derivations equivalent to those in traditional Context-Free Phrase Structure Grammar (CFPSG), such as the following: (4)
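Since the derivation itself is not reproduced here, a rough sketch of how an application-only derivation for Anna married Manny might be found is given below. It uses a CKY-style chart, which is an assumption of this example rather than anything the article prescribes; categories are encoded as strings (primitive) or `(result, slash, argument)` tuples (function), and the names `combine`, `cky` and `LEXICON` are illustrative.

```python
def combine(left, right):
    """Categories derivable from two adjacent categories by application alone."""
    out = []
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        out.append(left[0])                  # forward application (>)
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        out.append(right[0])                 # backward application (<)
    return out

def cky(words, lexicon):
    """Fill a chart mapping each span (i, j) to the set of categories over it."""
    n = len(words)
    chart = {(i, i + 1): set(lexicon[w]) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):        # every split point of the span
                for l in chart[(i, k)]:
                    for r in chart[(k, j)]:
                        cell.update(combine(l, r))
            chart[(i, j)] = cell
    return chart

LEXICON = {
    "Anna": ["NP"],
    "Manny": ["NP"],
    "married": [(("S", "\\", "NP"), "/", "NP")],   # (S\NP)/NP
}

chart = cky(["Anna", "married", "Manny"], LEXICON)
print(chart[(1, 3)])   # categories spanning "married Manny"
print(chart[(0, 3)])   # categories spanning the whole string
```

With application as the only rule, the span "Anna married" receives no category, so the only bracketing is the one a context-free phrase structure grammar would assign.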
The restriction to rules of functional application alone limits pure CG to the level of context-free grammar. CCG generalizes the context-free core by introducing further rules for combining categories. Because of their strictly type-driven character and their semantic correspondence to the simplest of the combinators identified by Curry and Feys (1958) these rules are called combinatory rules – the distinctive ingredient of CCG that gives it its name. They are strictly limited to certain directionally specialized instantiations of a very few basic operations, of which the most important for the present purposes are Type-Raising and functional Composition. A third class of combinatory rules related to Substitution, Curry and Feys’s S combinator, is also crucially involved. Some variants of CCG discussed in this article use further combinatory operations. Examples include ‘wrapping’ or commutative combinatory rules related to Curry and Feys’s C (Bach, Dowty, and Jacobson) and the product combinator of Lambek (Pickering and Barry, and Dowty), sometimes eschewing composition and/or type-raising entirely. All of these combinatory versions of categorial grammar are strictly distinct from Type-Logical Grammar (TLG) and related generalizations of CGs related to the Lambek calculus and/or Martin-Löf Type-Theory. Type-logical grammars equate function categories with implicative formulæ in a substructural logic. Accordingly, TLG treats grammatical derivation as proof rather than as reduction in an applicative system. Logics and applicative systems stand in a very close relation under the ‘Curry-Howard isomorphism,’ and some of the combinatory rules of CCG are in fact theorems of the (pure, associative) Lambek calculus. However, others are not. Because not even the pure associative Lambek calculus is finitely axiomatizable and it has been shown by Pentus to be not only weakly context-free but to also have an NP-hard decision problem, these are really radically different kinds of systems. In practice, the emphasis of the two approaches has been rather different. Work in TLG has attended more to mathematical foundations, whereas work in CCG has concentrated on linguistic explanation and computational applications. The relation of TLG to both CCG and theories such as TG that make use of syntactic variables is discussed by Dowty (1993).
Much work in CCG has centered on the problem of capturing unbounded dependencies – that is, on constructions such as relative clauses and reduced conjunctions, in which elements that are syntactically and semantically dependent may be separated by arbitrary amounts of intervening linguistic material. Such dependent elements are indicated by subscripts in the following examples:
(5a) a man that_i Anna said Manny thinks I like_i
(5b) I wrote to_i, and you heard from_i, Manny_i
Such dependencies are to be contrasted with bounded dependencies, which hold between function and argument categories such as verbs and the noun phrases in their immediate domain, as in the case of binding and control.
The Categorial Lexicon
In order to capture languages with freer word order, such as Turkish and Tagalog, the notation introduced in (1) and (2) must be understood as a special case of a more general notation allowing categories to be schematized over a number of orders of combination and directionalities. Baldridge, following Hoffman, proposed a Multi-Modal Set-CCG notation, according to which the single arguments of a rigid directional category such as (1) are replaced by one or more multisets of one or more argument types, each bearing its own directionality slash, and allowed to combine in any order. For example, the transitive verb category of a completely free-word-order accusative language with nominal case (such as Latin) is written S{|NPnom, |NPacc}, where | indicates that either leftward or rightward combination is allowed. The set brackets indicate that the subject and object can combine in either order. For a language such as Tagalog, which is verb-initial but otherwise freely ordered and cased, the corresponding accusative–transitive active voice category is written S{/NPnom, /NPacc}. Verb-final Japanese accusative transitives are written S{\NPnom, \NPacc}. In this extended notation, the English transitive verb can be written in full as (S{\NPnom}){/NPacc}. However, we adopt a convention that suppresses set brackets when argument sets are singletons, so that we continue to write this category as (S\NP)/NP, as in (1). We can generalize the semantic notation introduced in (2) using a parallel argument set notation for lambda terms and a convention that pairs the unordered syntactic arguments with the unordered semantic arguments in the left-to-right order in which they appear on the page. The transitive verb categories then appear as shown in Table 1. All such schemata cover only a finite number of deterministic categories like (1) and can only generate the language that will be generated by compiling the schema into explicit multiple deterministic lexical categories. (Baldridge showed that schematization of this kind does not increase the expressive power of the theory.)
Table 1 Notation for transitive verb categories
The lexicon of a given language is a finite set of categories subject to quite narrow restrictions that ultimately stem from limitations on the variety of semantic types with which the syntactic categories are paired in the lexicon. In particular, we can assume that lexical function categories are limited to finite – in fact, very small – numbers of arguments. (For English at least, the maximum appears to be four, required for a small number of verbs such as bet as in I bet you 5 dollars I can spit further than you.) The present paper further follows Jacobson, Hepple, and Baldridge and Kruijff in assuming that rules and function categories are ‘modalized,’ as indicated by a subscript on slashes. Baldridge further assumed that slash modalities are features in a type hierarchy, drawn from some finite set M (the modalities used here are M = {⋆, ◇, ×, ·}). The effect of each of these modalities is described here as the related combinator rules are introduced. The basic intuition is as follows.
• The ⋆ modality is the most restricted and allows only the most basic applicative rules.
• The ◇ permits order-preserving associativity in derivations.
• The × allows limited permutation.
• The · is the most permissive, allowing all rules to apply.
The relation of these modalities to one another can be compactly represented via the hierarchy given in Figure 1. By convention, we write the maximally permissive slashes /· and \· as plain slashes / and \. This allows us to continue writing the categories that bear this modality, such as the transitive verbs in Table 1, as before.
Figure 1 CCG type hierarchy for slash modalities (from Baldridge and Kruijff, 2003).
Combinatory Rules
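The modalized rules that follow depend on the subtype ordering among slash modalities. A minimal sketch of that ordering is given below. The hierarchy figure is not reproduced here, so the shape assumed in `PARENTS` (the most restricted modality at the top, the order-preserving and permuting modalities as siblings beneath it, and the most permissive modality beneath both) is an assumption consistent with the description above, and the ASCII names `star`, `diamond`, `cross` and `dot` are stand-ins chosen for this example.

```python
# Each modality maps to its immediate supertypes (assumed shape of the hierarchy).
PARENTS = {
    "star": set(),                  # most restricted: basic application only
    "diamond": {"star"},            # order-preserving associativity
    "cross": {"star"},              # limited permutation
    "dot": {"diamond", "cross"},    # most permissive: all rules apply
}

def is_subtype(modality, ancestor):
    """True if `modality` equals `ancestor` or lies below it in the hierarchy."""
    if modality == ancestor:
        return True
    return any(is_subtype(parent, ancestor) for parent in PARENTS[modality])

def licenses(slash_modality, rule_modality):
    """A slash can feed a rule stated for `rule_modality` if it is that modality or a subtype."""
    return is_subtype(slash_modality, rule_modality)

for m in PARENTS:
    print(m, "application:", licenses(m, "star"), "composition:", licenses(m, "diamond"))
```

On this reading every slash can take part in application, while only the `diamond` and `dot` slashes can take part in a rule stated for the order-preserving modality.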
The function application rules can now be written as follows:
(6a) X/⋆Y : f   Y : a   ⇒   X : fa   (>)
(6b) Y : a   X\⋆Y : f   ⇒   X : fa   (<)
Because ⋆ is the supertype of all other modalities, the /⋆ and \⋆ slashes in the rules are interpreted as covering ⋆ and every modality below it; that is, all functional categories can combine by these most basic rules, allowing the derivation (4) as before. For example, Japanese transitive verbs such as tazuneta ‘visited’ have the following category, in which the λ notation is generalized as in Table 1 and the modalities are · (suppressed by convention): (7) tazuneta := S{\NPnom, \NPacc} : λ{y, x}.visit′xy
This category is clause-final, and supports multiple derivations, which are guaranteed to yield identical semantic representations: (8)
(9)
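A set category such as the one in (7) stands for a finite number of rigid categories, one for each order in which its arguments may combine. The sketch below compiles a single argument set into that explicit list. It works on plain strings and leaves out the semantics and the slash modalities; the function name `expand` and the string encoding are assumptions made for illustration.

```python
from itertools import permutations, product

def expand(result, args):
    """
    Compile a set category into explicit rigid categories.
    `args` is a list of (slash, category) pairs, where slash is "/", "\\",
    or "|" (either direction allowed).
    """
    rigid = set()
    for order in permutations(args):
        # "|" leaves the direction open, so both choices are enumerated.
        choices = [("/", "\\") if slash == "|" else (slash,) for slash, _ in order]
        for slashes in product(*choices):
            cat = result
            for slash, (_, arg) in zip(slashes, order):
                needs_brackets = "/" in cat or "\\" in cat
                cat = f"({cat}){slash}{arg}" if needs_brackets else f"{cat}{slash}{arg}"
            rigid.add(cat)
    return sorted(rigid)

# Verb-final pattern as in (7): both arguments are sought to the left.
print(expand("S", [("\\", "NPnom"), ("\\", "NPacc")]))
# Completely free order: either direction for each argument.
print(len(expand("S", [("|", "NPnom"), ("|", "NPacc")])))
```

The list is always finite, which is the sense in which the set notation abbreviates ordinary lexical categories.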
CCG includes a number of further more restricted operations for combining (in the terms of the Minimalist Program (MP), ‘Merging’) categories. These operations correspond semantically to combinators – that is, type-driven operators over functions of the kind developed by Curry and Feys. For present purposes, they can be regarded as limited to operations of type-raising (corresponding semantically to the combinator T), composition (corresponding to the combinator B), and substitution (corresponding to the combinator S). Type-raising turns argument categories such as NP into functions over the functions that take them as arguments, such as the verbs discussed previously, into the results of such functions. Thus, NPs such as Anna can take on such categories as:
Combinatory Categorial Grammar 613 (10a) Anna :¼ S/(S\NP) : lp"p anna0 (10b) Anna :¼ S\(S/NP) : l p"p anna0 (10c) Anna :¼ (S\NP)\((S\NP)/NP): lp"p anna0
and so on. This operation must be limited to ensure decidability and, in practice, can be strictly limited to argument categories NP, AP, PP, VP and S. One way to do this is to specify it in the morpho-lexicon in the categories for proper names, determiners, and the like, in which case their original ground types such as NP and NP/N can be eliminated. Type-raising therefore resembles the traditional operation of case; NPs and other arguments are specified as to the case slot that they may fill in the verb. In English, case is specified directionally rather than morphologically, but in Japanese the relation of type-raising and case is completely transparent: (11a) Anna-ga :¼ S/(S{\NPnom}) : lp"p anna0 (11b) Anna-o :¼ S/(S{\NPacc}) : lp"p anna0 (11c) Anna-o :¼ (S{\NPnom})/ (S{\NPnom, \NPacc}) : lp"p anna0
and so on. CCG also includes rules of functional composition, such as the following forward composition (>B) rule. (12) X/eY : f  Y/eZ : g ⇒B X/eZ : λx.f(gx)
(The modalities on X/eY and Y/eZ in the rule are interpreted as ‘#e’, that is, " or e. The two modalities on these inputs need not be the same, but the modality on X/eZ in the output has to be the same as that on Y/eZ, in accordance with the Principle of Inheritance, to be defined later.) In interaction with simple functional application and lexicalized type-raising, composition engenders a potentially very freely reordering and rebracketing calculus and generalizes the notion of surface or derivational constituency. For example, the simple transitive sentence of English has two equally valid surface constituent derivations, each yielding the same logical form: (13)
In (13), Anna and married compose, as indicated by the annotation >B, to form a nonstandard constituent of type S/NP, which the object NP commands. In (14), there is a more traditional derivation involving a verb phrase of type S\NP commanded by the subject. More complex sentences may have many semantically equivalent derivations. However, all yield identical logical forms, and all are legal surface derivational constituent structures. As we see directly, the point of allowing nonstandard constituents such as Anna married (of type S/NP) is that they occur as the residue of relativization and coordination, as in the man that Anna married and Frankie divorced, and Anna married, Manny. It immediately follows that properties dependent on traditional command relations, notably including binding asymmetries such as the following, cannot be defined over surface CCG derivations and must be defined over logical forms: (15a) Manny likes himself (15b) *himself likes Manny
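The claim that the two derivations yield the same logical form can be checked with a minimal sketch of the semantic side of the combinators. The tuple encoding of logical forms and the helper names `B`, `T`, and `married` are assumptions made for illustration only; they are not part of the article's notation.

```python
# Semantic side of two combinators, written as curried Python functions.
def B(f, g):
    """Composition: B f g = lambda x: f(g(x))."""
    return lambda x: f(g(x))


def T(a):
    """Type-raising: T a = lambda p: p(a)."""
    return lambda p: p(a)


# Hypothetical encoding: the transitive verb takes its object first, then
# its subject, and returns a tuple standing in for the logical form.
def married(obj):
    return lambda subj: ("married", subj, obj)


anna, manny = "anna", "manny"

# Derivation with the nonstandard constituent [Anna married] of type S/NP:
# the type-raised subject composes with the verb, then takes the object.
anna_married = B(T(anna), married)
lf_composed = anna_married(manny)

# Traditional derivation: [married Manny] of type S\NP is built first,
# then the type-raised subject applies to it.
married_manny = married(manny)
lf_traditional = T(anna)(married_manny)

print(lf_composed == lf_traditional)  # True
```

Both routes reduce to the same term because type-raising and composition only rearrange the order in which the verb receives its arguments.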
In fact, surface derivations do not constitute a representational level at all in CCG. They are merely an uninterpreted record of different ways in which the same typed logical form can be put together. In CCG, unlike certain other generalizations of CG, logical form is the sole grammatical representational level. Substitution, a further species of combinator related to Curry and Feys’s S, was proposed by Szabolcsi under the name ‘connection’ for the analysis of ‘parasitic gaps.’ It completes the set of core combinator species used in all forms of CCG. Its role is somewhat specialized, and we defer further discussion until derivation (19), the book that Anna burned without reading. The (16) backward crossed substitution (
⇒ X/$Z : λz.fz(gz)
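The semantics λz.fz(gz) of the substitution rule can be illustrated with a small sketch based on the parasitic-gap example the book that Anna burned without reading. The tuple encoding and the particular meanings given to `burned` and `without_reading` are hypothetical placeholders, chosen only to show how a single argument fills both gaps.

```python
def S(f, g):
    """Substitution: S f g = lambda z: f(z)(g(z))."""
    return lambda z: f(z)(g(z))


# Hypothetical meanings. 'burned' is a transitive verb (object, then subject).
def burned(obj):
    return lambda subj: ("burned", subj, obj)


# 'without reading' takes its own object, then the verb phrase it modifies,
# then the subject shared by both predicates.
def without_reading(obj):
    return lambda vp: lambda subj: ("without", vp(subj), ("reading", subj, obj))


# One argument z is passed to both functions, so both gaps are filled at once.
burned_without_reading = S(without_reading, burned)
lf = burned_without_reading("book")("anna")
print(lf)
```

The resulting term contains "book" as the object of both predicates, which is the dependency pattern that parasitic gaps require.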
Before moving to a more formal definition of the space of possible CCG grammars, including the principle that again requires that the modalities on Y/$Z and X/$Z must be the same, we briefly review the kind of linguistic analyses that CCG allows.
This theory has been applied to the linguistic analysis of coordination, relativization, and intonational structure in English and many other languages by Szabolcsi, Dowty, Hepple, Jacobson, Baldridge, Bozsahin, Hoffman, Kang, Komagata, Oehrle, Prevost, Steedman, Trechsel, and others. It has been successfully applied
to the wide-coverage parsing of English newspaper text using a grammar automatically extracted from the Penn Wall Street Journal Treebank by Hockenmaier and by Clark and Curran, with state-of-the-art accuracy in dependency recovery, including such unbounded dependencies.
Relativization
Because substrings such as Anna married are now fully interpreted derivational constituents, complete with compositional semantic interpretations, they can be used to define relativization without movement or empty categories, as in (18) and (19), via the following category for the relative pronoun.
(17) that := (N\N)/(S/NP) : λpλnλx.(nx) ∧ (px)
(18)
Crossing Dependencies
The forward composition rule (12) is restricted by the e modality, which means that it cannot apply to categories bearing the " or w modalities of Figure 1. Crucially, crossing composition rules, in which the directionality of the composed functions differs, are also allowed in CCG under the Principles of Consistency and Inheritance (discussed later), unlike in the pure associative Lambek calculus. An example is the following forward crossed composition (>B") rule.
(21) X/"Y : f  Y\"Z : g ⇒B X\"Z : λx.f(gx)
There is a natural generalization of all the composition rules to composition into functions with n arguments for some small fixed n, including the following forward crossed composition (>B2") rule.
(22) X/"Y : f  (Y\"Z)/W : g ⇒B (X\"Z)/W : λyλx.f(gyx)
These rules are restricted by the " modality because they have a reordering effect. Most nominal
(19)
(20)
Such extractions are correctly predicted to be unbounded because composition can operate across clause boundaries. It is the lexical category (17) of the relative pronoun that establishes the long-range dependency between the noun and verb (via the nonessential use of the variable x in the present notation). This relation too is established in the lexicon: syntactic derivation merely projects it onto the logical form. In the terms of the Minimalist Program (MP) of Chomsky, in which such relationships are established by the operation ‘Move,’ it should be clear that CCG reduces this operation to the other major MP operation, Merge, because composition and type-raising, as well as application, correspond to the latter, more basic operation.
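How the relative pronoun category (17) projects the dependency onto the logical form can be sketched against a toy model. The sets `MARRIED` and `MAN` and all function names are assumptions introduced for illustration; only the shape of `that`, λpλnλx.(nx) ∧ (px), comes from the text.

```python
# Hypothetical toy model: (subject, object) marriage facts and a set of men.
MARRIED = {("anna", "manny")}
MAN = {"manny", "danny"}


def B(f, g):
    """Composition: B f g = lambda x: f(g(x))."""
    return lambda x: f(g(x))


def T(a):
    """Type-raising: T a = lambda p: p(a)."""
    return lambda p: p(a)


def married(obj):
    """Transitive verb: object first, then subject."""
    return lambda subj: (subj, obj) in MARRIED


def man(x):
    return x in MAN


def that(p):
    """Relative pronoun (17): takes an S/NP meaning p, then a noun meaning n."""
    return lambda n: lambda x: n(x) and p(x)


# [Anna married] is an S/NP constituent built by type-raising and composition.
anna_married = B(T("anna"), married)

# [man that Anna married] is an N: a predicate true of the extracted entity.
man_that_anna_married = that(anna_married)(man)

print([x for x in sorted(MAN) if man_that_anna_married(x)])  # ['manny']
```

No movement or empty category is involved: the missing object is simply the argument that the S/NP function is still waiting for.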
functor categories such as determiners NP/eN bear the e modality of harmonic composition, preventing these rules from applying to allow examples such as the following. (23)
It is the inclusion of such rules that increases the expressive power of the formalism beyond context-free languages. (Vijay-Shanker and Weir showed that CCG is weakly equivalent to TAG and Linear Indexed Grammar (LIG). This equivalence gives rise to a polynomial-time worst-case parsing complexity result and, more important, means that standard parsing algorithms can apply.)
For example, the availability of these rules allows crossing dependencies in Dutch and certain Swiss dialects of German, which cannot be captured by CFG and have given rise to proposals for verb-raising transformational operations, as in the following example (from Shieber):
into S and, in the latter case, ∧ schematizes over the usual pointwise recursion over logical conjunction (Partee and Rooth): (27) and := (S\wS)/wS : λpλq.p ∧ q
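The pointwise recursion that ∧ schematizes over in (27) can be sketched as a recursive function that conjoins truth values directly and conjoins functions argument by argument. The toy facts and the names `conj`, `anna_married`, and `frankie_divorced` are hypothetical and serve only to exercise the definition.

```python
def conj(p, q):
    """Generalized 'and': conjoin truth values directly, functions pointwise."""
    if callable(p) and callable(q):
        return lambda x: conj(p(x), q(x))
    return p and q


# Hypothetical toy model of (subject, object) facts.
MARRIED = {("anna", "manny")}
DIVORCED = {("frankie", "manny"), ("frankie", "danny")}


def anna_married(obj):        # meaning of an S/NP constituent
    return ("anna", obj) in MARRIED


def frankie_divorced(obj):    # meaning of another S/NP constituent
    return ("frankie", obj) in DIVORCED


# Coordinating two S/NP meanings yields another S/NP meaning.
both = conj(anna_married, frankie_divorced)
print(both("manny"), both("danny"))  # True False
```

Because the recursion bottoms out only when both conjuncts are truth values, the same definition covers conjuncts of any function type, provided both conjuncts have the same type.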
(24)
The ! modality on the verbs hälfed and aastriichte permits the forward crossed composition rule (21) to apply. The tensed verb is distinguished as the head of a subordinate clause via the feature SUB. The type-raised NP categories are abbreviated as NP"case because the fact that they are raised is not essential to understanding the point about crossing dependencies. It is correctly predicted that the following word orders are also allowed in at least some dialects (Shieber, 1985: 338–339): (25a) das mer em Hans hälfed es huus aastriiche (25b) das em Hans mer es huus hälfed aastriiche
The construction is completely productive, so the dependencies are not only intersective but unbounded. For example, we have the following (also from Shieber):
This category allows a movement- and deletion-free account of right node raising, as in (28): (28)
The w modality on the conjunction category (27) means that it can only combine like types by the application rules (3). Hence, as in GPSG (Gazdar), this type-dependent account of extraction and coordination, as opposed to theories such as TG, LFG, and HPSG that use structure-dependent rules, makes the across-the-board condition (ATB) on extractions from coordinate structures (including the ‘same case’ condition) a prediction or theorem rather than a
(26)
Again the unbounded dependencies are projected from the lexical frame of the verb, without syntactic movement.
Coordination
The nonstandard constituents that CCG engenders, such as Anna married and Anna says he married, can also undergo coordination. We can assume that English conjunctions such as and bear the following category, in which S is S or any function category
stipulation. A consideration of the types involved in the following examples reveals how.
(29a) A saxophonist [that(N\N)/(S/NP) [[Anna married]S/NP and [Manny detests]S/NP]S/NP]N\N
(29b) A saxophonist that(N\N)/(S/NP) *[[Anna married]S/NP and [detests Manny]S\NP]
(29c) A saxophonist that(N\N)/(S/NP) *[[Anna married]S/NP and [Manny detests him]S]
(29d) A saxophonist that(N\N)/(S/NP) *[[Anna married him]S and [Manny detests]S/NP]
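The way the types in (29) rule out the starred examples can be made concrete with a small type check. This is a sketch under simplifying assumptions: modalities are omitted, categories are encoded with a hypothetical `Fn` tuple, and `coordinate` models only the requirement that a conjunction combines like types.

```python
from typing import NamedTuple, Optional, Union


class Fn(NamedTuple):
    """A function category: result, slash direction, argument."""
    result: "Cat"
    slash: str      # "/" seeks its argument to the right, "\\" to the left
    arg: "Cat"


Cat = Union[str, Fn]

S_FWD_NP = Fn("S", "/", "NP")     # S/NP, e.g. [Anna married]
S_BWD_NP = Fn("S", "\\", "NP")    # S\NP, e.g. [detests Manny]
THAT = Fn(Fn("N", "\\", "N"), "/", S_FWD_NP)   # (N\N)/(S/NP), as in (17)


def coordinate(left: Cat, right: Cat) -> Optional[Cat]:
    """The conjunction combines only like types and returns that type."""
    return left if left == right else None


def apply_forward(fn: Cat, arg: Cat) -> Optional[Cat]:
    """Forward application: X/Y applied to Y gives X."""
    if isinstance(fn, Fn) and fn.slash == "/" and fn.arg == arg:
        return fn.result
    return None


def relativize(left: Cat, right: Cat) -> Optional[Cat]:
    """Coordinate two conjuncts, then feed the result to the relative pronoun."""
    conjoined = coordinate(left, right)
    return None if conjoined is None else apply_forward(THAT, conjoined)


print(relativize(S_FWD_NP, S_FWD_NP) is not None)  # (29a): True
print(relativize(S_FWD_NP, S_BWD_NP) is not None)  # (29b): False
print(relativize(S_FWD_NP, "S") is not None)       # (29c): False
print(relativize("S", S_FWD_NP) is not None)       # (29d): False
```

Nothing in the sketch mentions extraction or coordinate structures as such; the across-the-board pattern falls out of type identity alone, which is the sense in which the text calls it a prediction rather than a stipulation.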
the following by embodying traditional notions of command at the level of logical form:
In Japanese, the interaction of type-raising (which is specified by morphological case) and composition similarly allows multiple derivations. In particular, we have:
(34a) I shall introduce the participants to each other (34b) *I shall introduce each other to the participants
(30)
(31)
(The conjunction category in Japanese is enclitic, unlike English proclitic (27).) The prediction of nonstandard argument cluster constituents such as Anna-ga Manny-o in derivation (30) correctly predicts that such clusters can coordinate.
We have already seen that the nonstandard derivations of CCG force us to distinguish logical form from derivation. However, this is a departure from some other categorial accounts, which we turn to next.
The possibility of similar argument cluster coordinations in the Dutch and Swiss German verb-raising construction (26) and in English in examples such as (33) is similarly immediate, as Steedman and Dowty pointed out, if we assume the following category for ditransitive give, in which the e modality prevents overgenerations such as I will give a bone a very heavy dog from arising via the backward crossed composition rule:
The fact that substrings such as Anna married and Manny says that Anna married are accorded the full status of derivational constituents in CCG means that intonation structure and surface structure can be reunited in a single level of derivational structure rather than being consigned to different tiers. Consider the following minimal pair of dialogs, in which intonational tunes are indicated both informally, via parentheses and small capitals, and in the standard notation of Pierrehumbert, in which prosodic phrases are specified solely in terms of two kinds of
(32) give := (VP/NP)/eNP : λxλy.give′ yx
English Intonation and Information Structure
(33)
However, the category (32) makes the order of arguments for the syntactic category of ditransitives the reverse of that of the predicate in the corresponding logical form. The reason for doing this is to allow a standard account of binding asymmetries like
elements, the pitch accent(s) and the boundary:
(35) Q: I know who married DANNY. But who married MANNY? A: (ANNA) (married MANNY). H*L L+H* LH%
(36) Q: I know which man Anna DATED. But which one did she MARRY? A: (Anna MARRIED) (MANNY). L+H* LH% H* LL%
The intuition that these tunes strongly convey systematic distinctions in discourse meaning is inescapable. For example, exchanging the answer tunes between the two contexts in (35) and (36) yields complete incoherence. Prevost and Steedman claimed that the tunes L+H* LH% and H*L (or H* LL%) are, respectively, associated with the topic (or theme) and comment (or rheme) of the sentence, where the theme can be thought of as linking the utterance to the preceding context and the rheme can be thought of as the part of the utterance that moves the discourse forward to a new information state. The fact that CCG allows alternative derivations such as (13) and (14) offers an obvious way to bring intonation structure and its interpretation (information structure) into the same syntactic system as everything else. Crucially, these alternative derivations are guaranteed to yield the same predicate argument relations, as exemplified by the logical form that results from the two derivations. However, the derivations build this logical form via different routes that construct lambda terms corresponding semantically to the theme and rheme. In particular, the derivation (14) corresponds to the information structure associated with the intonation contour in (35), whereas derivation (13) corresponds to that in (36).
Bounded Constructions A number of accounts stemming from work by Bach, Dowty, and Shaumyan proposed combinatory accounts of bounded constructions such as passive, dative alternation, anaphor binding, raising, and control. Although these have often been presented as syntactic accounts, with combinatory rules applying during syntactic derivation and typically under the control of slash-typing, as in the previous account of unbounded constructions, the very fact that these phenomena are clause-bounded means that they can equally well be regarded as applying presyntactically – that is, as lexical rules or parameters – as was originally proposed by Shaumyan and Dowty.
Rather than rectifying the command relations in a logical form, Dowty proposed to instead assign the category (VP/NP)/PP to introduce and to allow phrasal verbs such as introduce to each other to combine with the second argument the participants via the right-wrap combinatory rule proposed by Bach in order to handle a number of bounded constructions, notably including Control. Crucially, if we are to avoid erroneous predictions of preposition-stranding extractions such as *a man that I gave to _ a book, right-wrapping phrasal verbs such as introduce to each other must be specified as only combining via right-wrap, a result that we can accomplish by the use of a further slash type, of the kind Jacobson proposed, writing the category for introduce as follows: (37) introduce := (VP/wNP)/PP : introduce′
The rule as proposed by these authors was expressed in various notations separating immediate dominance (ID) and linear precedence (LP) relations, as is made explicit in the version proposed by Zwicky and Dowty. In the present notation it can be expressed as the (weakly context-free) Bach/Dowty/ Jacobson right-wrap rule. (38) (X/wY) / Z:f
Y: y Z : z
) X/Z : fz Y : y (>WRAP)
The point of this rule is that it affords a derivation structure in which the participants commands its anaphor each other, as in derivation (39), in which the interpretation indicated for the reciprocal is merely a placeholder for the semantics (see Semantics in Categorial Grammar). We are therefore in a position to define binding conditions over derivation structures, say as in Chierchia (1988). (39)
Such rules express a very significant crosslinguistic generalization – for example, as Dowty pointed out, simple transitive verbs in VSO languages such as Irish (Irish Gaelic) must obligatorily wrap the subject and object arguments under these assumptions.
Binding
Raising
The syntactic category that we would first be tempted to assign to introduce is (VP/PP)/NP, reflecting the linear order of the verb and its complements. However, if we make the standard assumption that anaphors, like reciprocals, have to fall in the scope of their binder, the syntactic derivation implicit in the category makes the PP command the NP, apparently making the wrong prediction concerning (34).
Jacobson proposed to account for raising verbs such as seem in terms of function composition, using a composition rule and a further slash modality, here written # , distinct from those involved in the earlier account of extraction in that the modality limited categories to only combining via that rule, as in: (40) X/#Y : f  Y\Z : g ⇒ X\Z : λx.f(gx) (>RAISE)
Note that this is a form of crossing composition. This allows derivations such as the following: (41)
Again it is crucial that this category combine only via rule (40).
Control
Bach proposed an account of Control in terms of rule (38), which in present terms made object control verbs such as persuade wrapping verbs with category (VP/wNP) /VPto–inf, giving rise to derivations such as (42), in which the responsibility for establishing the control relation itself is devolved to semantics (see Semantics in Categorial Grammar). (42)
He further assumed that subject control verbs such as promise were not wrapping verbs but had the category (VP/VPto-inf)/NP.
An Argument for Lexicalizing the Bounded Constructions
Despite the elegance of this account, the costs that follow from including the Bach/Dowty/Jacobson wrapping categories in the CCG lexicon and introducing the corresponding wrapping and raising combinators at the same level as the other CCG combinatory rules are quite high. In particular, WRAP alone does not explain why derivations parallel to (33) exist not only for promise but also for the wrapping verbs persuade, introduce, and give itself. (43a) promise Anna to come and Manny to go (43b) persuade Anna to come and Manny to go (43c) introduce the participants to each other and Anna to Manny (43d) give the piano player a drink and the singer a cigar
There appears to be no easy way to allow cluster coordination analogous to (33) within WRAP-CCG without adding otherwise unmotivated slash types, lifted types, and/or combinatory rules. This is perhaps why Dowty in later work abandoned the combinatory approach entirely in favor of a very expressive form of Type–Logical Grammar, in which the
responsibility for supporting word order for nonserial dependencies in Dutch and English is transferred to structural rules. The alternative is to reinterpret the raising and wrapping combinators (as did Shaumyan and many other lexicalized grammars, such as HPSG) as lexical relations between syntactic types and the related logical forms, as in the earlier analysis of give (32) and following categories, in which the wrap relations proposed by Bach et al. are already represented at the level of logical form and which, as (33) shows, require no additional raising or wrapping combinatory rules: (44a) seem := VP/VP : λpλy.seemingly′ (py) (44b) promise :=
Opinions currently differ among combinatory categorial grammarians as to whether it is better to extend the syntactic component with extra combinatory rule types to save syntactic raising and/or wrap, or to compile raising and wrap into the lexicon.
Anaphora
A number of theories of pronominal anaphora have been proposed in CCG, notably by Szabolcsi and Jacobson. These theories depart in a number of respects from the forms of combinatory grammar that have been motivated by the syntactic constructions discussed here (they are discussed at length, together with the syntax and semantics of quantifiers, elsewhere; see Semantics in Categorial Grammar).
Principles and Relations to Other Theories of Grammar
Lexicon
The most basic assumptions concerning the transparency of syntactic types to semantics in all versions of CCG are the following:
The Principle of Combinatory Type-Transparency. All syntactic combinatory rules are type-transparent versions of one of a small number of simple semantic operations over functions.
The Principle of Categorial Type-Transparency. All syntactic categories are type-transparent to semantic interpretation.
A further principle adhered to by all the theories discussed here (although it is breached by Jacobson’s account of anaphora) is that the responsibility for specifying all dependencies, whether unbounded or
bounded, resides in the lexical specifications of syntactic categories for the heads of those dependencies, that is, the words corresponding to predicate-argument structural functors, such as verbs. This principle, which is related to the Projection Principle of GB, can be more formally stated as follows:
The Principle of Lexical Head Government. Both bounded and unbounded syntactic dependencies are entirely determined by the lexical syntactic type of their head.
This is simply to say that the present theory of grammar is ‘radically lexicalized,’ a property that makes it akin to lexicalized Tree-Adjoining Grammar (TAG). This is a stronger sense of lexicalization than is embodied in Lexical-Functional Grammar (LFG), Head-Driven Phrase Structure Grammar (HPSG), and certain recent minimalist versions of TG. Radically lexicalized grammars make the lexical entries for words do all of the language-specific grammatical work of mapping the strings of the language to their interpretations. The size of the lexicon involved is therefore an important measure of a grammar’s complexity. Other things being equal, one lexicalized grammar is simpler than another if it captures the same pairing of strings and interpretations using a smaller lexicon. A more restrictive principle of CCG, which it shares with LFG and GB and which sets it apart from TAG, GPSG, and HPSG (which in other respects are more closely related), is that it attempts to minimize the size of the lexicon by adhering as closely as possible to the following stronger principle:
The Principle of Head Categorial Uniqueness. A single nondisjunctive lexical category for the head of a given construction entirely determines both bounded and unbounded dependencies upon that head.
This is not to say that a given word may not be the head of more than one construction and hence be associated with more than one category.
Nor (as we have seen in the cases of Tagalog and Japanese) does it exclude the possibility that a given word-sense pair may permit more than one canonical order and, hence, have more than one category per sense, possibly schematized using the set CCG notation in Table 1. The claim is simply that each of these categories specifies both canonical order and all varieties of extraction for the clause type in question. For example, a single lexical syntactic category (2) for the word married, which does not distinguish between ‘antecedent,’ ‘θ,’ or any other variety of government, is involved in all of the dependencies illustrated in (4), (18), (20), and (28). By contrast, in both TAG and
GPSG these dependencies are mediated by different initial trees or categories, and in HPSG they are mediated by a disjunctive category. Unlike the Principle of Lexical Head Government, exceptions to the Principle of Head Categorial Uniqueness are sometimes forced. An example of such a necessary exception is the treatment of subject extraction in English by Steedman. It is a prediction of CCG (rather than a stipulation via a Fixed Subject constraint or Empty Category Principle) that a fixed SVO word order language such as English cannot permit complement subjects to extract under the Head Categorial Uniqueness Principle, as illustrated by the anomaly of (45a). The exceptional possibility of extracting subjects from English bare complements, as in (45b), has therefore been argued to require an extra antecedent-governed category for bare-complement-taking verbs such as think, in violation of that principle. (45a) *who do you think that married Manny? (45b) who do you think married Manny?
However, each such exception complicates the grammar by expanding the lexicon and makes it compare less favorably with an otherwise equivalently valued grammar that requires no such exceptions, if one could be found. Baldridge offered a grammar with a single category for such verbs.
The Combinatory Projection Principle
Even quite small sets of functional combinators, including the set {BTS} implicit in CCG, can yield calculi of the full expressive power of Turing machines and the simply typed lambda calculus. However, CCG syntax is subject to a number of principles that limit its expressive power to weak equivalence to TAG and Linear Indexed Grammar (LIG), the least more powerful natural class of languages known (Vijay-Shanker and Weir, 1994). These principles can be summed up as a ‘projection principle’ that says that syntax must project (and may not override) directional information specified in the lexicon and, conversely, that the lexicon should not do syntax’s job of unbounded projection. This principle is expressed as a number of subsidiary principles. We have given examples of several rules that encode the syntactic reflex of a few basic semantic functions (combinators). However, a larger set of possible rules could be derived from the combinators. CCG restricts the set to only those that obey the following principles:
The Principle of Adjacency. Combinatory rules may only apply to finitely many phonologically realized and string-adjacent entities.
The Principle of Consistency. All syntactic combinatory rules must be consistent with the directionality of the principal function.
The Principle of Inheritance. If the category that results from the application of a combinatory rule is a function category, then the slash type of a given argument in that category will be the same as the one(s) of the corresponding argument(s) in the input function(s).
The first of these principles is merely the definition of a combinator, to which all the rules discussed here (including WRAP) conform. The other principles say that combinatory rules may not override, but must rather ‘project,’ the directionality specified in the lexicon. More concretely, the Principle of Consistency excludes the following kind of rule:
(46) X\wY  Y ⇒ X (disallowed)
The Principle of Inheritance excludes rules such as the following hypothetical instances of composition:
(47a) X/eY  Y/Z ⇒ X\Z (disallowed)
(47b) X/eY  Y/eZ ⇒ X/xZ (disallowed)
It is also the Principle of Inheritance that requires that any modality on the slash that unifies with the /Z slash in the input to the combinatory rules previously instanced has to be the same modality on the output /Z. These principles do allow rules such as the following crossing functional composition rules, as instances of rules of a kind already seen in their general form in (21):
(48a) X/xY  Y\.Z ⇒ X\.Z (>BX)
(48b) Y/.Z  X\xY ⇒ X/.Z (
Such crossing rules are not theorems of type calculi such as that of Lambek and its descendant Type– Logical Grammar and, in fact, cause the collapse of such calculi into permutation completeness if added as axioms (Moortgat), a fact that has motivated the development of multimodal varieties of categorial grammar within the type–logical tradition by Hepple and others. Although such rules do not cause the same collapse in CCG even without the modalities, the present use of modalities to provide finer control over the rules is directly inspired by multimodal categorial grammar.
Conclusion
CCG abandons traditional notions of surface constituency in favor of flexible surface structure, in which most contiguous substrings of a grammatical sentence are potential constituents, complete with a compositional semantic interpretation, for the purposes of the application of grammatical rules. The benefits of this move are the following.
1. Word order, relativization, coordination, and intonation structure can all be handled via a single mechanism, using strictly type-driven syntactic combinatory rules with low expressive power.
2. These combinatory rules are universal and invariant. All language-specific information is specified in the lexicon.
3. The traditional modules of surface structure, S-structure, and intonation structure are unified into a single surface derivational module. Derivation is not a representational level; the only representational levels are phonetic and logical form.
It follows that everything that depends on relations of c-command (e.g., binding and control, and quantifier domains) must be expressed at the level of logical form, with a consequent transfer of responsibility for the grammar of bounded constructions and the phenomena that have led to proposals for covert movement or quantifier-raising at that level, to the lexicon. The ways in which these matters also can be handled in CCG without syntactic movement are discussed elsewhere (see Semantics in Categorial Grammar).
See also: Categorial Grammars: Deductive Approaches; Coordination; Long-Distance Dependencies; Semantics in Categorial Grammar; Syntactic Variables and Variable-free Syntax.
Bibliography Bach E (1976). ‘An extension of classical transformational grammar.’ In Problems in linguistic metatheory: Proceedings of the 1976 Conference at Michigan State University. Lansing, MI: Michigan State University. 183–224. Bach E (1979). ‘Control in Montague Grammar.’ Linguistic Inquiry 10, 513–531. Baldridge J (2002). Lexically specified derivational control in Combinatory Categorial Grammar. Ph.D. diss., University of Edinburgh. Baldridge J & Kruijff G J (2003). ‘Multi-modal Combinatory Categorial Grammar.’ In Proceedings of the 11th Annual Meeting of the European Association for Computational Linguistics, Budapest. Cambridge: ACL. 211–218. Bozsahin C (1998). ‘Deriving predicate-argument structure for a free word order language.’ In Proceedings of COLING-ACL ’98. Cambridge, MA: MIT Press. 167– 173. Chierchia G (1988). ‘Aspects of a categorial theory of binding.’ In Oehrle R T, Bach E & Wheeler D (eds.) Categorial grammars and natural language structures. Dordrecht: Reidel. 125–151.
Clark S & Curran J R (2004). ‘Parsing the WSJ using CCG and log-linear models.’ In Proceedings of the 42nd Meeting of the ACL. 104–111. Barcelona, Spain. Curry H B & Feys R (1958). Combinatory logic (Vol. 1). Amsterdam: North Holland. Dowty D (1978). ‘Governed transformations as lexical rules in a Montague Grammar.’ Linguistic Inquiry 9, 393–426. Dowty D (1982). ‘Grammatical relations and Montague Grammar.’ In Jacobson P & Pullum G K (eds.) The nature of syntactic representation. Dordrecht: Reidel. 79–130. Dowty D (1988). ‘Type-raising, functional composition, and nonconstituent coordination.’ In Oehrle R T, Bach E & Wheeler D (eds.) Categorial grammars and natural language structures. Dordrecht: Reidel. 153–198. Dowty D (1993). ‘“Variable-free” syntax, variable-binding syntax, the natural deduction Lambek calculus, and the crossover constraint.’ In Proceedings of the 11th West Coast Conference on Formal Linguistics, 1992. Stanford, CA: SLA. 161–176. Dowty D (1996). ‘Towards a Minimalist Theory of Syntactic Structure.’ In Bunt H & van Horck A (eds.) Discontinuous constituency. The Hague: Mouton de Gruyter. 11–62. Dowty D (1997). ‘Nonconstituent coordination, wrapping, and multimodal categorial grammars: syntactic form as logical form.’ In Dalla Chiara M L (ed.) Proceedings of the 10th International Congress of Logic, Methodology, and Philosophy of Science, 1995. Amsterdam: North Holland. 347–368. Available at: http://www.ling.ohiostate.edu/!dowty/. Gazdar G (1981). ‘Unbounded dependencies and coordinate structure.’ Linguistic Inquiry 12, 155–184. Hepple M (1990). The grammar and processing of order and dependency: a categorial approach. Ph.D. diss., University of Edinburgh. Hockenmaier J (2003). Data and models for statistical parsing with CCG. Ph.D. diss., University of Edinburgh. Hoffman B (1995). Computational analysis of the syntax and interpretation of ‘free’ word-order in Turkish. Ph.D. diss., University of Pennsylvania.
Jacobson P (1990). ‘Raising as function composition.’ Linguistics and Philosophy 13, 423–476. Jacobson P (1992a). ‘Flexible categorial grammars: questions and prospects.’ In Levine R (ed.) Formal grammar. Oxford: Oxford University Press. 129–167. Jacobson P (1992b). ‘The lexical entailment theory of control and the tough construction.’ In Sag I & Szabolcsi A (eds.) Lexical matters. Stanford, CA: CSLI Publications. 269–300. Jacobson P (1999). ‘Towards a Variable-Free Semantics.’ Linguistics and Philosophy 22, 117–184. Kang B-M (1995). ‘On the treatment of complex predicates in Categorial Grammar.’ Linguistics and Philosophy 18, 61–81.
Komagata N (1999). Information structure in texts: a computational analysis of contextual appropriateness in English and Japanese. Ph.D. diss., University of Pennsylvania. Lambek J (1958). ‘The mathematics of sentence structure.’ American Mathematical Monthly 65, 154–170. Moortgat M (1989). Categorial investigations. Dordrecht: Foris. Oehrle R T (1988). ‘Multidimensional compositional functions as a basis for grammatical analysis.’ In Oehrle R T, Bach E & Wheeler D (eds.) Categorial grammars and natural language structures. Dordrecht: Reidel. 349–390. Partee B & Rooth M (1983). ‘Generalised conjunction and type ambiguity.’ In Bäuerle R, Schwarze C & von Stechow A (eds.) Meaning, use, and interpretation of language. Berlin: de Gruyter. 361–383. Pickering M & Barry G (1993). ‘Dependency Categorial Grammar and coordination.’ Linguistics 31, 855–902. Pierrehumbert J (1980). The Phonology and Phonetics of English Intonation. Bloomington, IN: Indiana University Linguistics Club. Prevost S & Steedman M (1994). ‘Specifying intonation from context for speech synthesis.’ Speech Communication 15, 139–153. Shaumyan S (1974). Applikativnaja grammatika kak semantičeskaja teorija estestvennyx jazykov. Moscow: Nauka. [English edition: Applicational grammar as a semantic theory of natural language (1977), Edinburgh: Edinburgh University Press.] Shieber S (1985). ‘Evidence against the context-freeness of natural language.’ Linguistics and Philosophy 8, 333–343. Steedman M (1985). ‘Dependency and coordination in the grammar of Dutch and English.’ Language 61, 523–568. Steedman M (1996). Surface structure and interpretation. Cambridge, MA: MIT Press. Steedman M (2000). The syntactic process. Cambridge, MA: MIT Press. Szabolcsi A (1989). ‘Bound variables in syntax: are there any?’ In Bartsch R, van Benthem J & van Emde Boas P (eds.) Semantics and contextual expression. Dordrecht: Foris. 295–318. Trechsel F (2000).
‘A CCG account of Tzotzil pied piping.’ Natural Language and Linguistic Theory 18, 611–663. Vijay-Shanker K & Weir D (1994). ‘The equivalence of four extensions of context-free grammar.’ Mathematical Systems Theory 27, 511–546. Wood M M (1993). Categorial grammar. London: Routledge. Zwicky A (1986). ‘Concatenation and liberation.’ In CLS 22: Papers from the General Session of the Chicago Linguistic Society. Chicago, IL: University of Chicago. 65–74.
Comenius, Johann(es) Amos (Jan Amos Komenský) (1592–1670)
W Hüllen, University of Duisburg-Essen, Essen, Germany
© 2006 Elsevier Ltd. All rights reserved.
Johannes Amos Comenius (Jan Amos Komenský) called himself a theologian. Later centuries saw the pedagogue in him. The linguistic education of young people was his main concern; thus his relevant works, in particular his textbooks, are of a linguistic nature. With the growing interest in his last (unfinished) work, De rerum humanarum emendatione consultatio catholica, which was rediscovered only in 1923, he is now considered a philosopher with a strong inclination to language. Orphaned when only 11 years old, Jan Komenský spent his childhood and youth in Moravia, educated by a relative and a guardian in the spirit of the Unity of the Bohemian Brethren (Unitas Fratrum), a Hussite denomination in which Comenius was to become priest, senior, and last bishop in 1616, 1632, and 1648, respectively. From 1611 to 1614 he studied at the Universities of Herborn and Heidelberg, meeting there, among others, the encyclopedist Johann Heinrich Alsted (1588–1638) and the irenicist David Pareus (1548–1622), who both had a lifelong influence on Comenius’s thinking. Between 1614 and 1628 he taught at Unitarian schools in his home country. After that, the turmoil of the Thirty Years’ War and the persecution of the Protestants in the course of the so-called Counter-Reformation devastated him and exiled him for good. From 1628 to 1641 he lived in Lezno, Poland. Following an invitation by the Puritan reformer and pedagogue Samuel Hartlib (d. 1662), he visited England, where he contacted the founding members of the Royal Society. After only one year, he left the country because of the civil war. After traveling in the Low Countries and in Sweden, he spent the years 1642–1648 in Elbing, then Swedish territory, and 1648–1650 in Lezno. Accepting an invitation by the Hungarian nobleman Rákóczi, he traveled to Sárospatak, staying there until 1654. 
He spent the two succeeding years in Lezno again, where in 1656 a fire destroyed almost all his papers, among them the manuscript of a Czech alphabetical dictionary on which he had worked since his time in Heidelberg. Because of the Swedish–Polish war, he emigrated to his last abode, Amsterdam. During all these years he spent as a European refugee for religious reasons, he constantly published books, wrote innumerable letters to the great thinkers and statesmen, met influential people, and supervised schools. Of a theological character were Comenius’s lifelong intentions to restitute the world as willed by God in the unity of the creation, not least through
knowledge and education (pansophia). Of a philosophical character was his encyclopedia in which he integrated all human knowledge, including the findings of the (then) new sciences. Of a didactic character were his books on teaching, mainly the Didactica magna (1657), and his textbooks for learning languages, mainly Janua linguarum reserata (1631) and Orbis sensualium pictus (1658). For Comenius, language was the most important means of education, besides belief and piety. Repeatedly, e.g., in Novissima linguarum methodus (1648), he developed the triadic arrangement of ratio-oratio-operatio, marking the stages of education, and the more elaborated chain res-mens-lingua-manus-res, marking the process of general reform that eventually leads to pansophy. In both cases, language, and with it linguistic education, was the central link. The two works that propagated Comenius’s ideas with much more success than his other ones and that are linguistic in the narrow sense are Janua linguarum reserata and Orbis sensualium pictus; they are both special types of what would today be called onomasiological dictionaries. The former is a collection of exactly 1000 Latin sentences, broken down into exactly 100 sections. Each section provides the complex definition of a term, mostly nouns or adjectives. The latter is a collection on a much smaller scale of 150 sections consisting of defining sentences in Latin and German and of concomitant pictures that semanticize various lexemes incorporated into those sentences. In both books, the arrangement of sentences follows the encyclopedic order as had been popular since the 12th century, not least in the works of Comenius’s teacher Alsted. Both books, whose success led to literally hundreds of editions in almost all the European languages, were didactic because they were schoolbooks geared to the capacity of learning children. 
They were philosophical because they presented all the extant knowledge of the world, and they were theological because they paved the way to pansophy. For Comenius, the tongue adequate to the envisaged perfect state of mankind was a universal language as a perfect means of cognition and communication for all. He described it in Via lucis (1641) and in Panglottia (the linguistic part of the Consultatio) in the way that was commonly accepted in the British universal language movement. See also: Applied Linguistics: Overview and History; Lexicography: Overview; Religious Language.
Bibliography
Academia Scientiarum (1969 ff.). Johannis Amos Comenii Opera Omnia. Prague: Academia.
Acta Comeniana. Internationale Revue für Studien über J. A. Comenius. International Review of Comenius Studies. Prague (biannually).
Blekastad M (1969). Comenius. Versuch eines Umrisses von Leben, Werk und Schicksal des Jan Amos Komenský. Oslo/Prag: Universitetsforlaget/Academia.
Comenius Jahrbuch. Hohengehren: Schneider (annually).
Hüllen W (1999). English dictionaries 800–1700: the topical tradition. Oxford: Clarendon Press, 361–430.
Studia Comeniana et Historica. Uherský Brod (biannually).
Turnbull G H (1947). Hartlib, Dury and Comenius. London: University Press of Liverpool.
Comics: Pragmatics
K-A L Mey, Zürich, Switzerland
© 2006 Elsevier Ltd. All rights reserved.
A comic (plural: comics), also known as (a) comic strip(s), is a narrative form that combines written text (see Text and Text Analysis) and pictorial elements. A comic consists of a series of interrelated picture/text combinations. Each single picture stands in direct relation to the preceding units; this sequential order constitutes a chain of reference. Comparable to other serial productions of mass media such as soap operas or book and film series, comics have a continuous cast of main characters. The fact that the reader is familiar with the protagonists’ background serves as another important point of reference. Comics either appear as regular strips in printed media (comic strips), in comic magazines with contributions by various authors, or as comic books featuring a main character and his or her story or episodes.
Origins of the Comic
Although the historic roots of comics can be traced back to the 18th and 19th centuries, to political cartoons and illustrated narratives such as Max und Moritz (1865) by Wilhelm Busch, comics in their modern form are a relatively recent phenomenon. At the end of the 19th century, American newspapers included comic strips in their Sunday supplement to attract more readers. These humorous picture-stories, also known as ‘the funnies’ or comic strips, gave their name to a new genre (see Genre and Genre Analysis), the comics, which is not restricted to funny stories only.
Narrative Means: How Comics Tell a Story
Is there a language specific to comics? At first sight, expressions like ‘zooom,’ ‘grrowr!’ and ‘splash!’ are likely to be identified as typical expressions of comic language. While such expressions are certainly characteristic of the medium, they do not touch its
essence. Among the great variety of narrative means of the comic, the most important feature is the interdependence of the illustrations and written text. Both elements are bearers of meaning, but it is their combination that makes up the narrative. While the narrative may have any imaginable content – there are adventure stories, political satire, family series, the classical Greek mythologies and the Bible retold, comics for children, so-called adult comics with erotic or pornographic components, etc. – what all comics have in common is the use of this specific means of telling a story. A comic consists of minimally two picture–text units, called panels. Panels are usually square boxes containing an image and sometimes text, bounded by a thin frame line. These panels are to be read in sequential order, comparable to a normal text, and this sequential relation distinguishes a comic from a mere accumulation of pictures. This sequence distinguishes a comic from a cartoon, which consists of a single picture-frame only. If a comic is to be read like a text, the author has to create coherence within the story. He achieves this by forming a ‘chain of reference.’ This chain of reference will enable the reader to recognize the different panels as narrative elements of the same story, comparable to the process of reading a text – the reader knows that the preceding words of a sentence are connected to the following words and will create a coherent narrative (see Pragmatics of Reading). To illustrate this, let’s imagine an episode with the world’s most famous duck, Walt Disney’s Donald Duck. A first panel might show Donald Duck sitting on a bench; the second one, Donald walking through a park; the last one, Donald in front of a house. 
Theoretically, the reader could interpret these pictures as three separate pictures: ‘Donald sits on a bench’ / ‘Donald takes a stroll in the park’ / ‘Donald stands in front of his/a house.’ But the reader knows, by means of identical reference, that every Donald appearing in the panels following the first panel is the same protagonist in the same story. That way, the reader can fill in the
narrative gaps and verbalize this sequence of panels as ‘Donald sat on a bench and walked home through a park.’ As most readers will be familiar with the small suburban house that Donald lives in, the author can draw on this familiarity with Donald’s surroundings as a further point of reference. This system of reference is based on two narrative strings. One string refers to the space or environment where the action occurs, the other refers to the action itself. In this way, once introduced, the environment ‘park’ remains valid until a new environment appears. The depiction of the park can be reduced to a single tree, a meadow, a flower – it may even disappear entirely, with no harm done to text comprehensibility, as the reader will still know the setting to be a park. The other string deals with the action. Here, too, an element, mostly a living being such as Donald Duck in the example above, refers to its first introduction and can be reduced in various ways, e.g., to a silhouette, a hand, a hat floating on the water (as will happen to such an unlucky person as Donald). The separation between these two narrative strings, environment and action, is not absolute, though; an element of the environment can become bearer of an action, for instance, when a rock gives up its function as environment and falls down to block the road instead. Conversely, Donald’s car can change function: when he parks it in front of his house and walks away, the car becomes part of the environment and is no longer an element of the action string. In comics, a story can do without a description of the environment, but not without action. However, even though the description of action thus has priority over that of the environment, the latter has another, equally important function: it determines the rhythm of the narrative. 
An environment drawn in every detail will slow down the narrative rhythm, since the reader is likely to spend more time contemplating the picture and studying all the details shown, whereas a picture stripped of all environmental details will speed up the pace of the narration (see below, Narrative Rhythm).
Playful Conventions: How to Read the Narrative Codes
Most readers of comics have been familiar with the genre since childhood and hence know how to decipher the conventions of the codes specific to comics. There is no prescriptive list of given codes, comparable to punctuation in a written text, such as periods or commas. While certain conventions have been established, each author still has the freedom to disregard them, to play with them, and to invent new means of structuring a story – the only constraint being that the reader still must be able to grasp the meaning.
Common Narrative Codes in Comics
Arrangement of Panels
Panels are usually arranged in sequential order, to be read from left to right and top to bottom, according to the usual direction of reading of the Latin alphabet. Deviations are marked by numbers or by arrows indicating the new direction. Deviations from the ordinary sequential pattern are often used to express the rupture of the normal (i.e., linear) flow of narrated time and space, for example, to illustrate simultaneous action or a particular protagonist’s train of thought, as in daydreaming. Panels may have subpanels. A large panel may take up an entire page in a comic book, or be divided into subpanels forming a whole; so-called split-panels are arranged to show the details of an action happening, comparable to slow motion in a film.
Panel Frame
The panel frame usually consists of a straight line forming a square frame. It indicates the boundaries of the image-text component. The form of the frame is included in the playful way that comics handle narrative conventions. The frame will become more than a mere designating line and start being a bearer of meaning, for instance, when a story within a story is being told. A flashback is marked by wavy or punctuated frame lines; zigzagging lines express strong emotions or pain. Irrespective of its shape, the frame’s function becomes clear within the entire context. Some authors will occasionally omit the frame altogether. By doing so, they strip the topic of its environmental context. Thus, they create a moment of concentration, the effect resembling a close-up in film. Because almost anything goes in this genre (as long as the readers can construct meaning from the context), there are even authors that do without frames altogether.
Balloons
Balloons are another vital constitutive element of the comic’s narrative codes. They contain words or thoughts attributed to figures in the panel, and indicate who is speaking or thinking. 
The basic form of the balloon is a round or square frame containing the text; it usually hovers above the speaker’s head like a small cloud, a small tail pointing to the speaker’s head. Balloons containing speech are conventionally drawn with a continuous line; balloons containing thoughts replace the balloon-tail by a line of bubbles. Thought-balloons are mostly used for characters that cannot speak in real life, such as animals, e.g., Jim Davis’ ‘Garfield,’ the cat. There are, of course, infinite
Figure 1 In Walt Kelly’s ‘Pogo,’ the lettering is masterfully used to carry meaning. (Kelly, 1972: 120). Copyright 2000 OGPI.
variations on this theme: whispering is illustrated by an interrupted line; a zigzagging balloon-tail indicates a voice as heard over a telephone; a balloon in zigzag shape shows the speaker to be very angry; little icicles hanging on the balloon’s lower frame lines indicate words spoken in a very frosty mood; a balloon wreathed in flowers shows the character’s effort to sweet-talk someone. Colored comics will add color to their balloons and thus enhance the emotional impact of the text spoken or thought. A green balloon will signify envy, a red one anger or pain. A black balloon may even be drawn to resemble a somber storm cloud looming above the character’s head, showing his or her dark mood. Several balloons in a panel, as in a dialog, are to be read according to the direction of reading. Balloons can also contain symbols such as a light bulb (inspiration), a heart (love), or a saw (emulating the sound of snoring).
Written Text
There are three main groups of written text in comics:
a. text within a balloon
b. text within the panel
c. text at the edge of, or between panels, so-called caption texts.
Text Within a Balloon
Written texts in comics not only transmit their message by the words themselves, but also through the typographical appearance of the lettering. The latter is especially the case for the first two kinds of text mentioned, (a) and (b). One important feature of the balloon text is its size. Small letters in a relatively oversized balloon indicate a low voice or a whisper; big letters, almost bursting out of their balloon, indicate a loud voice or scream. The size of the lettering thus compensates for the absence of sound in the comic medium. Various kinds of typography can be used to characterize the speaker. Comic author Walt Kelly, for instance, does this with great artistic subtlety in his story ‘Pogo’ (cf. Figure 1). This example shows three types of balloons as well as three types of lettering. The tortoise shown in the first panel is communicating in thought-balloons, according to the convention that animals do not speak. The letters are written in the widely used conventional capitals. The second panel contains the balloon with the monologue of the deacon, a stiff-upper-lip persona forced to do kitchen-chores (“me, an administrative advisor, put to work peeling knockwursts and other vegetables”). His speech is contained by an ornamental balloon-frame, resembling ancient parchments. The letters are written in an accordingly old-fashioned way, made to resemble Gothic type, using capital and small letters. Note, in the third panel, how small the letters ‘ – sigh – ’ appear in the balloon. The last panel introduces yet another type of balloon; it is drawn to resemble a small cloud emerging from the bag containing the sausages and it contains the word ‘chomp!’ (an expression that combines the verb and the sound – see the next section). So, just like the form of the panel or the balloon, the form of the letters, too, can bear meaning. Words cried in anguish will appear shaky or fragmented; old-fashioned typography and ornamental lettering are used to evoke an atmosphere of once-upon-a-time.
Text Outside Balloons
Since comics cannot represent sound, they make it visible. This is achieved with the aid of sound-imitating or -describing words, also called onomatopoeia – the ‘zooom,’ ‘grrowr!’ and ‘splash!’ mentioned above. Whenever the text is not confined to the balloon, there are even fewer limits on the imagination of the author as to its typography. As in the balloon texts, large lettering indicates loudness. The source of the sound can be inferred from its position in the panel. ‘Plitch,’ a sound describing a dripping faucet, will appear near the surface that the water drop falls on. Some authors draw onomatopoeia with such expressiveness that these become pictures in themselves, e.g., a ‘bouumm!’ with exploding letters. These expressions are often based on the imitation of sound, such as the just mentioned ‘bouumm!’ that evokes the sound of an explosion. 
Often, a verb is shortened into a descriptive form that describes the action, as in ‘drip,’ ‘sob,’ ‘cracklerattlebash!’ Nor should one forget the innumerable possible combinations of sound, verb, and description, such as the above-mentioned ‘chomp!’ or the sound of a starting racing car: ‘vroummmroarr!’ Note that a loud cry can take on the quality of a sound word, drawn without a balloon (‘yikes!!’).
Caption Texts
Caption texts are explanatory texts located at the edge of the panel (or between panels), often in a small, square frame of their own. They comment on the progress of the story in the panel and give information that has not been conveyed by the panels. The function of the caption text is to link the panels, sum up or comment on the action, or provide any information the author wants to communicate to the reader. They frequently deal with time factors, e.g., they could read ‘later,’ ‘meanwhile,’ or ‘ten years ago.’
Pictorial Signs
Apart from the narrative means of structuring a story listed above, comics have at their disposal a large variety of pictorial signs. These signs appear as illustrations of the action taking place in the panel; often they are used to show a protagonist’s emotional state or his or her general condition. Such illustrations are often graphical translations of a figure of speech, such as ‘having a broken heart’ or ‘if looks could kill’ – the lovelorn protagonist will have a splintered heart hovering above his or her head in the first case, whereas small daggers will be drawn on their way from the protagonist’s eyes towards his or her adversary in the latter. Great effort, embarrassment, and alarm are universally shown by little drops of perspiration flying from the protagonist’s head (as in sweating, due to physical exertion, or breaking out in cold sweat). Pain is depicted by stars appearing above the hurting part of his or her body; feeling dizzy, being drunk or knocked-out, by spirals around the head (cf. above about ‘balloons’: hearts, light bulbs, etc., within the balloon). Apart from these mostly figurative illustrations, comics have developed a specific graphic feature to show movement. These are called ‘speed lines’ and refer to the blurring of vision that occurs when an object or person moves fast. Speed lines will trail along a speeding object, telling the onlooker ‘it was here just a second ago, but it moved over there within the blink of an eye.’ Speed lines will show the course of the moving object; often, they are accompanied by small dust clouds to enhance the effect.
Doing without Sound and Motion: Narrative Rhythm
Every narrative is told in segments. An author will select which segments of a progression he or she will show or tell and leave gaps in between for the reader to fill in and make up a continuous narrative flow (see Narrative Means: How Comics Tell a Story, above). The pace of a narration is directly related to the number of panels; an event illustrated by many panels will naturally slow down the narrative rhythm, whereas inserting a caption reading ‘two weeks later’ above a panel speeds it up. The narrative rhythm is not related to the time narrated (see Narrativity and Voice).
Apart from the number of panels, the narrative rhythm can also be varied by other means: by drawing a detailed environment (usually in a panel that is comparatively larger than the others), or by zooming out into a wide angle, as is often used in films to mark a moment of introduction or contemplation at the beginning or ending of a film. These wide angle shots are, for instance, used as recurring features in Goscinny & Uderzo’s ‘Asterix the Gaul.’ The story usually starts with a large introductory panel, a wide angle shot of the ‘small village in Gaul,’ depicting a pastoral idyll, and it ends invariably with a panoramic view of the villagers enjoying themselves at a big banquet under the starry sky (with the unmusical bard being tied to a tree in the foreground, in most of the cases). Equally, comics will make use of the other possibilities of film language, as it is expressed in the way a camera shot is taken. One of the factors involved is distance. A panel can show a small human silhouette in the distance, in the vast landscape of a desert plain. Or it can show only a detail of that person’s face, e.g., a pair of frightened eyes, seen from a very short distance. The close-up shot will let the reader be part of the protagonist’s emotional state of mind, whereas the first example keeps the reader more at a distance. Apart from distance, the virtual camera can choose a particular angle to convey the narrative’s message. A character shown from below will appear as someone superior and in control; someone looked down upon will appear as just that. Clever authors even make use of the subjective camera, known from experimental films. Thus, in ‘Asterix and the Normans’ a teenage boy from the capital Lutetia is sent to a remote small village ‘to become a man.’ He gets caught by the fear-inspiring Normans, is knocked over the head, and falls unconscious. The Normans splash water on him and in the following panel we see what the frightened boy sees: a close-up of a row of awe-inspiring beards as seen from lying on the ground, all nasty smiles and helmets, looking very grim indeed.
See also: Genre and Genre Analysis; Language: Semiotics; Media and Language: Overview; Narrativity and Voice; Pragmatics of Reading; Text and Text Analysis.
Bibliography
Holtz C (1980). Comics – ihre Entwicklung und Bedeutung. Munich: Saur.
Kelly W (1972). Pogo: we have met the enemy and he is us. New York: Simon and Schuster. 120.
Krafft U (1978). Comics lesen. Stuttgart: Klett-Cotta.
McCloud S (1993). Understanding comics. Northampton, MA: Kitchen Sink Press.
Reitberger R C & Fuchs W J (1972). Comics: anatomy of a mass medium. Toronto: Little, Brown.
Varnum R & Gibbons C T (2001). The language of comics: word and image. Jackson: University Press of Mississippi.
Comics: Semiotic Approaches
M D’Angelo, European Design Institute, Rome, Italy
L Cantoni, University of Lugano, Lugano, Switzerland
© 2006 Elsevier Ltd. All rights reserved.
The goal of this article is to present the ways comics have been and are studied by semiotics. In particular, the meaning of a semiotic approach will be dealt with, along with an overview of its history; then a preliminary approach to comics considered as a literary genre is depicted, and a wider approach, which considers comics as a language with its own semiotic code(s), is proposed. Various research perspectives on comics are then presented, addressing their narrative form; their mixed language, always in between pure images and pure texts; and their sequential form. Finally, further and future research directions are discussed.
What Does ‘Semiotic Approach’ Mean?
Forty years ago a young Italian academic intervened in an intellectual philosophic congress on myth with his collection of Superman comic books under his arm. With great cultural boldness, the academic maintained that this superhero was a renovation of myth in the context of the mass society. At the conclusion of his speech, he awaited the reaction of the other academics with some trepidation, given the novelty of his proposal. What followed was a lively discussion, and the audience was enthusiastic, so much so that, at the end of the meeting, the young academic realized that at least half of his precious comic books had disappeared! The name of that young academic was Umberto Eco (the anecdote is in Eco, 1964a: XI), and his essay ‘The myth of Superman,’ first published in 1964 (Eco 1964b), was to shortly become a classic in studies on
mass media. Paradoxically, the fame of ‘The myth of Superman’ overshadowed the most innovative essay on comics, published in that same year: ‘A reading of Steve Canyon’ (Eco, 1964c). This reading of a Sunday comic strip marks a theoretical watershed in the approach to this medium and can be considered the forefather of the ‘semiotic approach to comics,’ the subject of this article. The dates must be considered conventional, in the same way we maintain that America was discovered in 1492. As far as we know, ‘A reading of Steve Canyon’ is the first example of the semiotic analysis of comics. In fact, while the discussion on Superman was a sociological one, of a kind not yet fully consolidated but already present in the academic literature of those years, ‘A reading of Steve Canyon’ concentrated – as had never been done before – on language. Eco asked himself neither “what role do comics have in society?” nor “what effects do they have on young minds?” He analyzed the graphic art of panels and balloons, the decoupage inside the page, the narrative mechanisms. Eco wanted to understand which characteristics of expression allow comics to involve the reader, tell a story, make sense. In short, Eco was doing semiotics. Easy to say today! In fact, in 1964 semiology (both in Europe and America) had not yet stabilized into precise research paradigms, nor had it been institutionalized as an academic study (for an excursus, see Bailey and Matejka, 1980). But this is of relative importance. When we speak of ‘doing semiotics,’ before even considering a defined discipline, we are referring to a cultural approach through which to investigate the possibilities of meaning in languages. In this light, ‘A reading of Steve Canyon’ is, without any doubt, a semiotic analysis.
It is so in the pioneering fashion of all the essays of the period dedicated to the exploration of semiosis in the different forms of language: literature (Tzvetan Todorov), cinema (Christian Metz), photography (Roland Barthes), and so on. These pioneers of the sign, as they were inventing new ways of seeing things, had to invent a new (meta)language to explain them and a new method to investigate them.
The Pioneers of the Sign
This can be seen, aside from Eco’s work, in that of two other forerunners in the field, Roman Gubern and Pierre Fresnault-Deruelle. Both, at the start of the 1970s, wrote essays programmatically dedicated to the investigation of the language of comics (Gubern, 1972; Fresnault-Deruelle, 1972). While Eco used an interpretative key to locate the instructions for reading a comics text, Gubern analyzed, from a cognitive perspective, the mechanisms of expression of the medium. Even more different is the approach of Fresnault-Deruelle, indebted to Saussurian linguistics and intent on delineating the structural framework of the language. The variety of approaches has made the field of study rich in classifications, research methods, and definitions right from the start. With the passage of years, instead of diminishing, the phenomenon has amplified in proportion to the number of essays and articles. Even today, paradoxically, it is impossible to think of a ‘semiotics of comics’ in the sense of a coherent theoretical picture, as can be done for the cinema or for painting. Therefore the expression ‘semiotic approach to comics’ remains the most fitting to describe the variety of orientations and instruments adopted.
Misunderstanding ‘Comics’: Comics as a Genre
As we have said, what is common to the works of Eco, Gubern, and Fresnault-Deruelle, despite their different orientations, is the idea that comics are language. Today this affirmation seems obvious, but in those days it was revolutionary. Proof of this is the ‘shyness’ in the terms used by the academics. Eco did not use the term ‘language.’ Rather, he defined comics as a “particular literary genre” (Eco, 1964c: 151). Fresnault-Deruelle spoke of “paraliterature,” explaining that he intended that the term convey no negative connotations (Fresnault-Deruelle, 1972: 29). All this caution should be understood within the context of the culture of that age. The reason is suggested by the anglophone term ‘comics,’ which has always been the most popular to define the medium. It recalls the narrative genre that triumphed, initially, on the first American Sunday pages (Couch, 2000). The gags of terrible young brats, clumsy gentlemen, and talking animals were amusing. They were funny: com(ed)ic situations repeated week after week. So the term quickly came to designate the entire editorial universe of comics. Even with the appearance of adventure stories, the name continued to be used, both in the Sunday pages and in the new types of publications (strips and books). Dick Tracy, Flash Gordon, and Batman were not funny, but they were termed ‘serious comics.’ They were laying the foundations for the misunderstanding that would dog the medium for a long time. In fact, the continued use of this same term for different types of stories resulted in the idea that ‘comics’ did not have the dignity of a language but were a ‘macrogenre.’ In the terms of Louis T. Hjelmslev, it was impossible to think of ‘the form’ of expression without certain narrative ‘contents’ and certain readers, generally children.
Are Comics ‘Innocent’? From Genre to Language
This misunderstanding was echoed even in the pages of certain academics, in particular the affirmations of Arthur Asa Berger in his famous essay ‘Comic stripped America’ (Berger, 1974). The sociologist found a connection between the forms of comics and the American imagination. For example, the simple graphics and the loud spoken gags of the first Sunday pages were an extension of the naivety and innocence of the readers of the day – a tempting thesis, but it continued the error that comics are a unique genre of storytelling (see Frezza, 1978). Berger named only one exception to this theory: Little Nemo in Slumberland, by Winsor McCay, the most refined and complex comic of the age, in both the narrative and the graphical senses. In reality, the so-called exceptions (aside from Little Nemo, we could cite, from those same years, Krazy Kat, by Herriman, and Kin-der-Kids, by Feininger) prove that comics, from the beginning, demonstrated specific expressive capacities. In the succeeding decades, ‘exceptional’ comics have proliferated, putting an end to any misunderstanding. Works like La Ballata del Mare Salato [Ballad of the salt sea], by Hugo Pratt, A contract with God, by Will Eisner, Maus, by Art Spiegelman, Le garage hermétique [The airtight garage], by Moebius, The dark knight returns, by Frank Miller, Fuochi [Fires], by Lorenzo Mattotti, and many others remind us that comics are not a unique genre but a language with its own semiotic code (Barbieri, 1991). However, as far as we are concerned, the language/genre misunderstanding is above all proof of the difficulty of defining, in any exhaustive way, the semiotic specifics of the medium. In other words, all of us, looking at comics, realize that they are something different from a movie or a novel, but this difference is not easy to describe; it is entangled with many similarities to other languages. The most obvious similarities are those with literature, on the one hand, and with cinema on the other.
Eco and Fresnault-Deruelle addressed the similarities of comics and literature.
Graphic Literature: Comics as a Narrative Form
No doubt comics are related to literature. First of all, like books, they are printed on paper and therefore need the same type of willing reader, with the intention of activating their meaning by reading. Secondly, comics can include a series of verbal elements in their story content (balloons, sound effects, captions), and so they give a literary rhythm to the succession of panels. The idea of making literature, with the added value of images, has appealed to many cartoonists. Hugo Pratt, for instance, liked to define his works as “Letteratura Disegnata” [graphic literature]. In effect, in richness of narrative, the vicissitudes of his Corto Maltese can compete with the works of great adventure novelists (Melville, Conrad, London, etc.). But it is a resemblance induced by a precise type of story. When the Italian author spoke of graphic literature, he renewed the misunderstanding mentioned earlier. This time it is the literary language that is mistaken for a genre: romantic literature. This, as we have seen, happens with all the narrative languages, in which many things can be told in many ways. This is one way in which literature, comics, and cinema resemble each other! We could compare them to different automobile models. Each has its own bodywork, its own motor, its own optional extras. But in the end they run on the same fuel: the story.
The Analysis of the Narrative Form in Comics
The first to confront the analysis of the narrative dimension of comics was Fresnault-Deruelle, in the essay he published in 1972. The study was born in the francophone academic context, at that time very sensitive to the subject of the narrative. Morfologiya Skazki [Morphology of the folktale], an original work by Vladimir J. Propp first published back in 1928, had come into circulation in Europe after years of oblivion. It was a proposal for a comparative analysis of the story by locating a reduced number of recurring structural narrative elements, called ‘functions.’ In the wake of Propp, semiology set forth on several lines of research, manifested by the various European pioneers (Roland Barthes, Claude Bremond, Umberto Eco, Algirdas J. Greimas, and Tzvetan Todorov) in issue number 8 of the magazine Communications (1966). Fresnault-Deruelle took his approach from this analytic heritage. He focused his analysis on the bande dessinée, comics in the French language published in albums (for a classification, see Couch, 2000), and in particular on the vast production in series of three celebrated cartoonists: Hergé (Georges Remi), J. Martin, and Ed P. Jacobs (authors, respectively, of Tintin, Alix, and Blake and Mortimer). Fresnault-Deruelle found a tight net of plot-points in their comics, and he classified them in terms of recurrence. He even managed to find in all the stories a common structural scheme: (1) threat, (2) unrolling of the plot, and (3) result. Today the analytic approach of Fresnault-Deruelle shows several limitations. Nevertheless, he blazed a trail; only the method was yet to be refined.
In this respect a decisive role was played by the renowned Paris School of Semiotics (for an overview see Collins and Perron, 1989). The definition is associated with a group of semiologists of the Ecole de Paris, reunited around the figure and teachings of Algirdas J. Greimas. To summarize the Greimasian theory, the story represents the fundamental structure of any process of meaning and, consequently, of any text. Une lecture de Tintin au Tibet, by Jean-Marie Floch (1997), is an exemplary application of the Greimasian method to comics. The monographic reading of a famous Hergé album becomes an occasion, for the French semiologist, to dig among the panels and balloons in an attempt to bring deep values and meanings to light. The most significant pages of the essay are dedicated to the comparison of all the adventures of Tintin. Floch draws an interesting connection between serial comics and ancient myth: “These had for material an unlimited stock of figures, scenes and motives . . . . The adventures of Tintin are cut out, responding and inverted in the manner of myths . . . . The analysis of the Adventures of Tintin in Tibet implicates the analysis of another adventure that, in turn, implicates another” (Floch, 1999: 185–189). The recent works of Alvise Mattozzi follow this same Greimasian pattern, but they develop the discussion in a social semiotic perspective. Starting from superhero comic books (Mattozzi, 2000) and underground comics, the Italian semiologist analyzed the evolution of the ‘superficial’ modalities (narratives and graphics) of comics in relation to the ‘profound’ changes in the sociocultural context. This is a new aspect of research, and of particular interest. Generally, all the essays on comics that are done in the Greimas style have their merits and limits. The deeper you dig, the more you ‘discriminate,’ and the greater the probability of finding some principles of identity.
But, at the same time, the deeper you dig, the more you ‘simplify,’ and the greater the probability of losing sight of the specifics of the texts. So the method is effective if the objective is to study the ‘genetic’ values of the story, independently of the fact that they appear in Tintin, in David Copperfield, or in Red Hood. On the contrary, it is minimally productive if the aim is to explain why, for instance, Melville’s Moby Dick (literature) and Dino Battaglia’s Moby Dick (comics) are two very different texts despite the fact that they are based on an identical narrative plot. In other words, if the Greimasian approach helps us define what comics have in common with literature (and with narrative forms in general), one needs to look at other semiotic methods to understand their differences.
The Strange Case of Dr Image and Mrs Word: Comics as Mixed Language
We have already hinted at the possibilities of language, presented in comics through visual components (balloons, sound effects, captions). Above all, balloons, containers of the thoughts and words of the characters, are the naturalized element of language of comics (or at least they are traditionally perceived to be so). The Italians have even elevated them to the name of the medium: fumetti [little smoke clouds]. This way of labeling probably convinced even the American historian Robert C. Harvey: “In speech balloons, we have something that is unique to the medium . . . . In all other graphic representations – in all other pictorial narratives – characters are doomed to wordless posturing and pantomime. In comics, they speak” (quoted in Groensteen, 2000: 1). Even other American historians (Bill Blackbeard, Les Daniels, and Maurice Horn) recognized the value of verbal components, so much so that they inserted verbal components into the definition of the medium. According to this approach, the uniqueness of comics lies exactly in the possibility of articulating words and images together in panels. Art Spiegelman has even proposed to change the term ‘comics’ into ‘commix’: a commixing of words and images.
Silent Comics: Verbal and Visual Forms in Comics
The Belgian-French theorist Thierry Groensteen (2000) commented, a little maliciously, on the insistence of American theorists on the mixing of words and images; according to him, this is mainly a way of endorsing the fact that the medium was ‘Made in the U.S.A.’ In fact, the official date of birth of comics is made to coincide with the publication, in 1896, of the American Sunday page Yellow Kid. The adventures of the hairless boy are distinguished exactly because they were the first comic with a verbal component integrated into the graphic composition. Of course, Groensteen was not interested in the controversy that surrounds the argument. Rather, what was of interest to him was the linguistic question. Although a student of Fresnault-Deruelle, Groensteen kept his distance from traditional semiology, convinced that, as far as comics are concerned, a specific theory was needed. As he explained in his essay ‘Système de la bande dessinée’ (Groensteen, 1999), comics (strips, pages, and books) are a mainly visual narrative space. While you can imagine comics without words, you cannot imagine comics without images. Groensteen cited, as an example, Arzach, by Moebius (Jean Giraud). With the abolition of balloons, captions, sound effects, and even kinetic signs, the silent stories of this surreal character live in the essentiality of juxtaposed panels. Something similar, with grotesque connotations, was done by the Argentineans Carlos Trillo and Domingo Mandrafina with their cycle of mute ‘short novels.’ But stories without dialogues and captions can be found, occasionally, in more popular serial comics (from Batman to Ken Parker). With these cases at hand, one could agree with Groensteen and Scott McCloud, who excluded from his famous definition of the medium any reference to the verbal component. In the opinion of McCloud, comics are “Juxtaposed pictorial and other images in deliberate sequence, intended to convey information and/or to produce an aesthetic response in the viewer” (McCloud, 1993: 9). McCloud’s definition is especially interesting. His essays represented an original methodological frontier for studies on this subject. Understanding comics (McCloud, 1993) and Reinventing comics are theoretical works on the language of comics, by a cartoonist, using the language of comics. The verbal component has a fundamental role: the protagonist (a caricature of McCloud) constantly uses balloons to explain the theoretical steps. In effect, rather than completely denying the importance of words, the author reminds us of how difficult it is to measure their exact expressive importance. To illustrate this, McCloud used a graphic picture/word scale (McCloud, 1993, Chap. 6). In comics differing in format, genre, and nationality, the scale can tip from one side to the other without altering the expressive quality of the story. McCloud, therefore, relinquished any theoretical certainty and rather put his trust in experience: the word/image relationship of comics is more an alchemy than a science. This solution reminds us of certain cooking recipes that say “add salt as necessary,” leaving the cooks to their own devices.
Perhaps ‘words as necessary’ could satisfy able comic gourmets such as Scott McCloud or Will Eisner, but it does not satisfy in the semiotic sense. If, as Groensteen says, the specificity of the medium resides in the visual juxtaposition of images, may we continue to consider graphic texts such as Prince Valiant, Peanuts, superhero comic books, and others in which words have a preponderant role, as comics? Would it not be more appropriate to define them as illustrated stories?
Blind Comics: The Analysis of Verbal Form in Comics
Examples of verbose comics are not lacking, but a typical example, mentioned in every essay, is Prince Valiant. In the Sunday page of this chivalric hero, created by Harold Foster in 1937, the captions beneath the illustrations are the essence of the story (see the examples of Peeters, 1998). Paradoxically, the events that are narrated would be understandable even without the images. As opposed to the silent comics, so praised by Groensteen, Valiant could be termed a ‘blind’ comic, in which the word completely dominates over the visual aspect. For this reason, some academics (e.g., Will Eisner, Robert C. Harvey, and Scott McCloud) go so far as to exclude Valiant from the comics club and consider it an illustrated story. But this reclassification is not totally convincing. Even if we were to remove these precious captions from Prince Valiant, it would still remain a story told by images, perhaps not a very fluid or effective one, but nonetheless a visual story. In effect, as Benoit Peeters pointed out (Peeters, 1998: 97–98), in Foster’s work the word is not a substitute for images; they are used alongside each other as a stylistic effect. Barthes’ ‘Rhétorique de l’image’ of 1964 had already codified redundancy as one of the possible verbal/iconic combinations. According to Barthes, the image is polysemic, a fluctuating chain of meanings with which words can have various types of relationships. In Prince Valiant, the captions reinstate the epic-chivalric tone of the world that is being represented in a modern chanson de geste. In other comics, the same dynamics are used for different stylistic effects. Alack Sinner (by Carlos Sampayo and José Muñoz) constantly sets out in its panels the ‘voice-over’ of the protagonist to reinforce the Chandlerian tone of the detective story. In the comics by Frank Miller, David McKean, and Bill Sienkiewicz, the thought-captions of the characters, a sort of ‘stream of consciousness,’ triumph. And so we can identify the fundamental semiotic difference between the use of words in comics and in literature.
In literary works, and verbal stories in general, words need somebody to utter them, and that somebody, by consequence, is present in the text as a personal or impersonal narrator. On the contrary, in comics (and even in cinema and paintings) the image tells itself, and no subtitles are needed. Words, if present, are intended as components of a vast system of expressions. Another consequence is the so-called metaphonological language (Cigada, 1989): if in a book the mood of the characters and its influences on their utterances are to be described through other words (“said the man in a sad voice,” “the girl cried,” “proclaimed aloud,” etc.), in comics mood and quality of voice are expressed by means other than words (by the images themselves and by other iconic tools). From this point of view, ‘blind’ and ‘silent’ comics are not incompatible categories but rather the extreme ends of a graduated continuum on which every text is situated according to the particular combination of its components. It is therefore not very advisable to determine the ‘purity’ of a particular comic only by means of the juxtaposition of words and images. As Daniele Barbieri wrote, “Even if we were dealing with a ‘simple’ juxtaposition, the global effect would result neither from the words themselves alone nor from the images themselves alone, but from their relationship” (Barbieri, 1991: 203). The study of the way this relationship functions is the primary objective of the modern semiotic approach to comics. This is also evident in the volume The language of comics: word and image, edited by Christina Gibbons and Robin Varnum (2002). For instance, Frank L. Cioffi proposed a study on the “disturbing dissonances” triggered in the reader by the image/word relationship of underground commix. Robert C. Harvey stressed, instead, how the close complementarities of verbal and graphic components can become the distinctive trait of a particular genre of comics, the so-called one-panel cartoons. Even more original was the contribution offered by Gene Kannenberg, Jr., through his analysis of the experimental comics of Chris Ware. Kannenberg stressed the innovative composition of the pages of this cartoonist, which couple graphic and verbal elements in an indissoluble ‘visual totality.’ The works cited above give proof of the variety of points of view and of the research being done on what continues to be one of the main topics in the semiotic debate on comics.
From Panel to Panel: Comics as a Sequential Language
The discussion on the words/images relationship has brought to light the differences between literature and comics, but it has also highlighted comics’ resemblances to cinema, a medium born around the same period and which has in common with comics the juxtaposition of images in sequence. This resemblance in expressions has often promoted an intertwining of semiological analyses. Even the historic dating coincides. The year 1964, the conventional date of birth of semiotic research on comics, is also the year of publication of essays that have become classics on cinema research, such as ‘Le Cinéma: langue ou langage?’ by Metz and ‘Rhétorique de l’image’ by Barthes. As a matter of fact, those suggestive studies gave rise to a heterogeneous line of research (some of it sensible, some rather dubious), with the intention of giving iconic texts universal morphologies and syntaxes, in some cases adopted from verbal language, in other cases created from scratch. For example, right from the start, the frame shots of movies and the panels of comics were equated to the linguistic notion of syntagmas. In the same way that syntagmas are the basic units of verbal discourse, so are frames and panels to visual stories. These are the foundations of the particular dynamics of the shots. In fact, unlike paintings or photographs, the shots of comics and of the cinema are never closed or ends in themselves. Each image is a partial step in the story, constantly ‘in search of’ a mate in order to complete its sense. The elements present in each single shot (situations, characters, backdrops) are always organized in respect to what is outside the particular shot and perhaps present in the one that follows. Roman Gubern (1972) was the first to stress how the cognitive discrepancy generated by a shot with respect to its preceding one becomes the principal center of interest to the spectator/reader. The most effective definition of these dynamics is that of Eisner’s ‘sequential art’ (Eisner, 1985).
The Analysis of Sequential Dimension in Comics
Sequentiality bases its process of signifying on what Roman Jakobson defined as the two polar axes of language: contiguity and similarity, or in rhetorical terms, metonymy and metaphor. In fact, on the one hand, the sequential story works by metonymic combination, on the basis of a principle of contiguity, through the juxtaposition of images along temporal paths (actual in cinema, suggested in comics). On the other hand, a metaphoric selection intervenes: every single shot ‘replaces’ the preceding one, and the perception of the narrative continuum (logical or chronological) lies in the principle of similarity, in seeing the same forms that vary from one frame to the next. Contiguity and similarity cooperate in suggesting connections between the images to the spectator/reader. The repertoire of these connections, which constitute the grammar of sequential montage (alternated, parallel, etc.), has been the object of several classifications in comics, among which the most famous is still that of McCloud (1993). His catalogue is based on the psychological concept of closure: a completion operated by inference by the reader on the basis of experience. The different types of connections inferred by the reader correspond to the different types of closures. Although McCloud did not make an explicit distinction between the montage of cinema and that of comics, we must stress that the greatest differences between the two languages lie at the level of closures.
The sequentiality of the frames, unlike in cinema, is not the product of the mechanical running of photograms but an instruction from the text to the reader to read the static images as a series. We could say that the story is completely in the eye of the reader. However effective the visual suggestion may be, there remains the physical limit made by the white lines that outline the frames. Groensteen stressed the importance of the “intericonic white” as the site of a further articulation of the language of comics (Groensteen, 1999). The white hiatus creates a discontinuity, a jump, great or small as it may be in terms of time and space, between one shot and the next. There is another substantial difference between the sequentiality of the two languages. It is captured by the concept of ‘iconic solidarity,’ coined by Groensteen to stress that the panels are to be interpreted, simultaneously, both as isolated units and as parts of a whole, determined by the form of the comics (strips, pages, books). As Jakobson put it, the axis of similarity prevails in cinematographic sequences: one shot frame ‘chases away’ the preceding one. In comics, on the contrary, the axis of contiguity is dominant: the constant reiteration of panels. Iconic solidarity is manifested by the planar (a succession of panels along the same strip) and vertical (a succession of strips in the same page) directions together in the same space.
Comics as Polyphonic Language
This is a very particular prerogative because it allows comics, more than any other language, to have meaning both with the narrative process (intradiegetic), given the succession of events in the story, and the paranarrative process, given the succession of graphic forms on the page. Usually the latter is at the service of the former: a panel of greater size indicates a narrative event of special importance. But there is no universal law on the subject. For example, in superhero comic books, splash panels (giant frames) are used even for the less important parts of the story not to mention all those stylistic variables that cannot be included in the intentions of the single cartoonist, but that instead derive from general cultural codes. In fact, the meaning of comics is activated, aside from the story line, by the perceptive architectures that catch the eye of the reader. From this reflection, the semiologist Daniele Barbieri developed an analysis paradigm that considered the ‘rhythmic’ architecture of comics. A student of Eco, Barbieri borrowed his interpretative approach and extended the notion of rhythm, traditionally tied to a musical context, to any text (verbal, graphic, audiovisual) that develops in a sequence (Barbieri, 2004). In a semiotic sense, rhythm is the recurrence
inside a (textual) structure of homologous elements that are perceptible sequentially. In the case of comics, the layout, the dialogues, the graphics, the story, etc., are rhythmical forms. On each of these rhythms, any element in contrast to the order set previously is perceived as unexpected, as a novelty. Narrative and paranarrative forms work to build an architecture of tension made of climaxes and downfalls, accelerations and slowing downs, so as to feed the interest of the reader. This rhythmical paradigm gives rise to a modern semiotic conception of comics. This medium is made rich by the exchanges with other forms of expression, but it is also an autonomous language, able to arrange meaning in polyphonic ways, articulating various levels of expression, all of them tightly interwoven.
The Semiotic Approach to Comics, Today and Tomorrow

Almost 40 years after ‘A reading of Steve Canyon,’ Eco wondered at the appropriateness of scientific studies (especially semiotic ones) of comics. In his article ‘Quattro modi di parlare di fumetti’ [Four ways of speaking about comics], he distinguished, with subtle irony, four analytical tendencies. A couple of them are more concerned with the critique of the sector than with research, while the remaining two are relevant here. In the first (‘Saggi sui Comics come medium’ [essays about comics as medium]), he included himself and the other pioneers of this field, along with the young students (whom he criticizes), among those who still today in their dissertations passionately “rewrite, for the umpteenth time, that there is a syntax of shots, a semantics of balloons, a textual theory on footage, or that the first comics was the mosaic of San Clemente . . . . However it is clear that this phase is over, and at most can now generate useful archive research, perhaps a critical edition of Yellow Kid” (Eco, 1999: 1). In the second (‘Dimenticare il medium’ [forgetting the medium]), Eco praised all those who dedicate themselves to the study of the texts and ignore the analysis of expressive forms in general (“as they are no longer in question”), to concentrate on the evolution of “genres, themes, techniques and topics in the universe of the comic-medium” (Eco, 1999: 1). Eco’s words demonstrate the evolution of the semiotics of the past, from the ‘abstract’ discussions on signs and codes, into the concrete analysis of the text (built with signs and codes). The change to the textual approach has permitted, especially in the cases of languages with aesthetic goals (literature, cinema,
comics), the maturing of more precise theoretical schemes. Therefore, he is right in maintaining that it makes no sense to study comics in the same manner as was done in the past. Nevertheless, the general theory on comics does not seem to have reached its ‘Nirvana.’ In fact, this article has highlighted that, despite the indubitable progress made, uncertainties and errors remain. Paradoxically, however, this circumstance must not be seen as a limit of research but rather a stimulus for the future. After all, are there any languages for which we can say with absolute certainty that we know all the grammar and possible meanings? Languages live in their usages and in the creativity of daily use. This is the case with comics: many love them still, read them still, create them still. Styles are evolving, genres are becoming more mature, and reading practices are becoming more refined. In the face of such a dynamic scenario, various research perspectives are opening up. Of course, as Eco suggests, it is important to focus the analysis of the text on specific themes and techniques, ‘forgetting’ the medium; but it is also important to remember it from time to time and reevaluate the theoretical perspectives.

See also: Barthes, Roland: Theory of the Sign; Eco, Umberto: Theory of the Sign; Iconicity; Russian Formalism; Visual Semiotics.
Bibliography

Abel R H & White D M (eds.) (1963). The funnies. Glencoe: Free Press.
Baetens J & Lefèvre P (1993). Pour une lecture moderne de la bande dessinée. Bruxelles: Centre Belge de la Bande Dessinée.
Bailey R & Matejka L (1980). The sign – semiotics around the world. Ann Arbor: Michigan Slavic Contributions.
Barbieri D (1991). I linguaggi del fumetto. Milano: Bompiani.
Barbieri D (2004). Nel corso del testo. Una teoria della tensione e del ritmo. Milano: Bompiani.
Becker S D (1959). Comic art in America: a social history of the funnies, the political cartoons, magazine humor, sporting cartoons, and animated cartoons. New York: Simon and Schuster.
Berger A A (1974). The comic-stripped America: What Dick Tracy, Daddy Warbucks, and Charlie Brown tell us about ourselves. Baltimore: Penguin.
Christiansen H-C & Magnussen A (eds.) (2000). Comics and culture: analytical and theoretical approaches to comics. Copenhagen: Museum Tusculanum.
Cigada S (1989). ‘Il Linguaggio Metafonologico e le sue Applicazioni Stilistica e Linguistica.’ In Il linguaggio metafonologico: ricerche sulle tecniche retoriche nell’opera narrativa di G. Cazotte, M. G. Lewis, E. A. Poe, G. Flaubert, O. Wilde. Brescia: La Scuola.
Collins F & Perron P (eds.) (1989). The Paris school semiotics. Amsterdam: John Benjamins.
Couch C (2000). ‘The publication and formats of comics, graphic novels, and tankobon.’ http://www.imageandnarrative.be/.
D’Angelo M (2001). Con la testa fra le nuvole. La Cooperazione interpretativa nella serie a fumetti. Diss., ‘La Sapienza’ University.
Eco U (1964a). Apocalittici e integrati. Milano: Bompiani. [revised edition with new introduction, 1980].
Eco U (1964b). ‘Il mito di Superman.’ In Eco U (ed.) Apocalittici e integrati. Milano: Bompiani. 219–263. [English translation: ‘The Myth of Superman.’ In Eco U (ed.) (1979). The role of the reader: explorations in the semiotics of texts. Bloomington: Indiana University Press. 107–124.]
Eco U (1964c). ‘Lettura di Steve Canyon.’ In Eco U (ed.) Apocalittici e integrati. Milano: Bompiani. 131–186. [English translation in Wagstaff S (ed.) (1987). Comic iconoclasm. London: Institute of Contemporary Arts. 20–25.]
Eco U (1999). ‘Quattro modi di parlare di fumetti.’ www.fucinemute.com. [Original edition in La Cappella Underground (ed.) (1999). Claire Bretécher: il disegno del fumetto. Trieste: Comune di Trieste, Associazione Italo-Francese.]
Eisner W (1985; revised 2001). Comics and sequential art. Tamarac: Poorhouse Press.
Floch J-M (1997). Une lecture de ‘Tintin au Tibet.’ Paris: Presses Universitaires de France.
Fresnault-Deruelle P (1972). La bande dessinée. L’univers et les techniques de quelques ‘comics’ d’expression française. Paris: Hachette.
Fresnault-Deruelle P (1977). Récits et discours par la bande: essais sur les comics. Paris: Hachette.
Frezza G (1978). L’immagine innocente. Cinema e fumetto americano delle origini. Roma: Napoleone.
Gasca L & Gubern R (1988). El discurso del comic. Madrid: Càtedra.
Groensteen T (1999). Système de la bande dessinée. Paris: Presses Universitaires de France.
Groensteen T (2000). ‘Des rapports entre la sémiologie de la bande dessinée et son étude historique.’ http://www.editionsdelan2.com.
Gubern R (1972). El Lenguaje de los Comics. Barcelona: Peninsula.
Harvey R C (1994). The art of the funnies: an aesthetic history. Jackson: University Press of Mississippi.
Kannenberg Gene Jr (2002). ‘The comics of Chris Ware: text, image, and visual narrative strategies.’ In Varnum T & Gibbons C T (eds.) The language of comics: word and image. Jackson: University Press of Mississippi. 174–197.
Mattozzi A (2003). ‘Innovating superheroes.’ Reconstructions 3(2). http://www.reconstrunction.ws.
McCloud S (1993). Understanding comics. Northampton, Massachusetts: Kitchen Sink Press.
Peeters B (1998). Case, planche, récit. Lire la bande dessinée. Paris: Casterman.
Varnum R & Gibbons C T (eds.) (2002). The language of comics: word and image. Jackson: University Press of Mississippi.

Comics Cited

Eisner W (words, art) (1985). A contract with God: and other tenement stories. New York: Titan Books.
Feineinger L (words and art) (1906–1980). The Kin-der-Kids: the complete run of the legendary comic strip. New York: Dover.
Foster H (words and art) (1934–1984). Prince Valiant vol. 1–20. Seattle: Fantagraphics Books.
Herriman G (words and art) (1935–1990). The komplete kolor Krazy Kat. Abington: Remco Worldservice Books.
Mattotti L (words and art) (1985). Fires. Barcelona: Catalan Edition. [English translation by Tom Leighton.]
McCay W (words and art) (1905–1989). Marschall R (ed.) The complete Little Nemo in Slumberland: vol. I: 1905–1907. Seattle: Fantagraphics.
Miller F (words and art) (1986). ‘The Dark Knight returns.’ In Batman: the dark knight. New York: DC Comics.
Moebius (Giraud J) (words and art) (1977–1993). The airtight garage. New York: Epic Comics.
Pratt H (words and art) (1967). Ballad of the Salt Sea (Corto Maltese Adventure). New York: Harvill Publishing. [English translation by Ian Monk.]
Spiegelman A (words and art) (1983). Maus: a survivor’s tale. New York: Pantheon Books.
Ware C (words and art) (1993). The ACME novelty library No. 1. Seattle: Fantagraphics Books.
Command Relations
T Reinhart, Utrecht University, Utrecht, The Netherlands
T Siloni, Tel-Aviv University, Tel-Aviv, Israel
© 2006 Elsevier Ltd. All rights reserved.
Since the earliest stages of transformational grammar, it has been observed that linguistic rules are sensitive to the structural relations between nodes in the syntactic tree. Historically, this was noted first for interpretative rules, such as those governing anaphora and quantifier scope. The syntactic relations discussed in the 1960s include ‘command,’ ‘clause-mate,’ and the linear relation ‘precede.’ The relation found most useful at this stage was the composed relation ‘precede and command’ that was introduced in Langacker (1966) to capture the relations between pronouns and their antecedents (which he viewed as governed by a transformation of pronominalization). The relation was later applied to the analysis of the scope of negation in Ross (1967) and of quantifier scope in Jackendoff (1972). We illustrate the development of the view of command relations with the problem of coreference (between pronouns and antecedents) since this was the problem around which most of the discussion centered historically. The relation command is defined, following Langacker (1966), as follows:
(1) A node a commands a node b iff neither a nor b dominates the other and the S node most immediately dominating a also dominates b.
(2a) *[S She denied that Lucie stole the diamond].
(2b) The man who [S t traveled with her] denied that Lucie stole the diamond. (2c) [NP Her husband] denied that Lucie stole the diamond.
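Definition (1) can be checked mechanically on small trees. The following is only a minimal sketch: the `Node` class, the helper names, and the simplified bracketings of (2a) and (2c) are assumptions made for illustration, not structures given in the article. The `bounding` parameter is likewise an addition; passing both S and NP gives the cyclic-node variant that the next paragraph attributes to Jackendoff and Lasnik.

```python
class Node:
    """A labelled tree node that remembers its parent (hypothetical helper)."""
    def __init__(self, label, children=(), word=None):
        self.label = label
        self.word = word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def commands(a, b, bounding=("S",)):
    """Definition (1), with the set of bounding labels left adjustable."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and node.label not in bounding:
        node = node.parent
    return node is not None and dominates(node, b)


def embedded_clause():
    """Return (S node, Lucie node) for 'Lucie stole the diamond'."""
    lucie = Node("NP", word="Lucie")
    clause = Node("S", [lucie, Node("VP", word="stole the diamond")])
    return clause, lucie


# (2a), assumed structure: [S she [VP denied [S Lucie ...]]]
clause_a, lucie_a = embedded_clause()
she = Node("NP", word="she")
Node("S", [she, Node("VP", [Node("V", word="denied"), clause_a])])

# (2c), assumed structure: [S [NP her husband] [VP denied [S Lucie ...]]]
clause_c, lucie_c = embedded_clause()
her = Node("Det", word="her")
subject = Node("NP", [her, Node("N", word="husband")])
Node("S", [subject, Node("VP", [Node("V", word="denied"), clause_c])])

print(commands(she, lucie_a))                        # True
print(commands(her, lucie_c))                        # True
print(commands(her, lucie_c, bounding=("S", "NP")))  # False
```

With S as the only bounding node the pronoun commands Lucie in both sentences; once NP also counts, the pronoun inside the subject NP no longer does.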
Combined with the relation of ‘precedence,’ Langacker’s generalization on coreference (stated in current terminology) was that a pronoun cannot both precede and command its antecedent. Thus, in (2a) the pronoun both precedes and commands Lucie; hence, coreference (marked with bold type) is impossible. In (2b), however, the preceding pronoun does not command Lucie (the first sentence (S) node dominating it does not dominate Lucie); hence, coreference is not excluded. By this definition of command, (2c) should be blocked as well, because the first S node dominating her also dominates Lucie. Hence, the pronoun both precedes and commands its antecedent. However, Jackendoff (1972: 140) and Lasnik (1976) suggested a modification of the definition (1) that makes use of the notion ‘cyclic node’ rather than just S. At the time, the cyclic nodes were believed to be both S and noun phrase (NP). (Lasnik proposed the term ‘kommand’ for the modified relation.) Their view on the restriction on anaphora remained the same: A pronoun cannot corefer with a full NP that it both precedes and commands. In (2c), the first cyclic node that dominates the pronoun is the NP, which does not dominate Lucie. Hence, the pronoun precedes, but does not command, the antecedent, and coreference is permitted. Reinhart (1976, 1981) argued that structural relations like command can be best understood as defining the syntactic domain of a given node – roughly,
the portion of the tree consisting of those nodes that a given node bears the structural relation to. The structural conditions on linguistic rules, which are based on these relations, restrict these rules to operate on two nodes just in case one of them is in the domain of the other. In the case of semantic interpretation rules of the type mentioned previously, the domain of a given node corresponds to the portion of the syntactic tree in which that node can affect the interpretation of other nodes. For example, in the case of quantifiers’ scope, the syntactic domain of a given quantified NP determines its potential scope over other NPs. Characterized in terms of syntactic domains, the relation ‘precede and command’ defines the domain in (3). The coreference restriction assumed previously can now be stated as in (4):
(3) The domain of a node a consists of a together with all and only the nodes that a precedes and commands.
(4) A pronoun cannot corefer with (non-pronoun) NPs in its domain.
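As a rough illustration of definition (3), the sketch below computes precede-and-command domains on a small tree. Everything in it is an assumption made for the example: the `Node` class, the helper names, and the flat verb phrase assigned to the sentence Ben introduced Max to Rosa in September, which the following paragraph uses to make the same point.

```python
class Node:
    """A labelled tree node that remembers its parent (hypothetical helper)."""
    def __init__(self, label, children=(), word=None):
        self.label = label
        self.word = word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def subtree(node):
    """Yield node and all its descendants, left to right."""
    yield node
    for child in node.children:
        yield from subtree(child)


def leaves(node):
    return [n for n in subtree(node) if not n.children]


def precedes(a, b, root):
    """True if everything a spans comes before everything b spans."""
    order = leaves(root)
    return order.index(leaves(a)[-1]) < order.index(leaves(b)[0])


def commands(a, b):
    """Definition (1): the closest S above a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and node.label != "S":
        node = node.parent
    return node is not None and dominates(node, b)


def domain(a, root):
    """Definition (3): a plus every node that a precedes and commands."""
    return [n for n in subtree(root)
            if n is a or (precedes(a, n, root) and commands(a, n))]


def words(nodes):
    return " ".join(n.word for n in nodes if n.word)


# Assumed flat structure: [S Ben [VP introduced Max [PP to Rosa] [PP in September]]]
ben = Node("NP", word="Ben")
verb = Node("V", word="introduced")
max_ = Node("NP", word="Max")
vp = Node("VP", [verb, max_,
                 Node("PP", [Node("P", word="to"), Node("NP", word="Rosa")]),
                 Node("PP", [Node("P", word="in"), Node("NP", word="September")])])
root = Node("S", [ben, vp])

print(words(domain(ben, root)))   # Ben introduced Max to Rosa in September
print(words(domain(max_, root)))  # Max to Rosa in September
```

The domain of the subject is the whole sentence, but the domain of Max is a string that matches no constituent of the assumed tree.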
From this perspective, Reinhart argued that if linguistic rules operate indeed within the domains defined by (3), it would be rather mysterious why this is so since these domains are arbitrary units – chunks of the tree that do not necessarily correspond to independently established syntactic units. The source of the problem is the incorporation of the relation precede in the definition of domains. By this definition, any sequence that follows a given node within the same S (or cyclic node) forms a domain. For instance, given a sentence such as Ben introduced Max to Rosa in September, the domains defined are, first, the domain of the subject (the whole sentence) and, next, the domain of the verb (the verb phrase (VP)). These domains correspond to syntactic constituents, but the next domains would be [Max to Rosa in September], [to Rosa in September], and [Rosa in September]. If such arbitrary chunks of the tree constitute syntactic domains, it is difficult to see what content the notion could have. Next, linear order varies across languages. If (3) is the definition of the domains linguistic rules operate on, different languages may have dramatically different domains. Reinhart (1976: 41, ex. 31) illustrated this with the following example from Malagasy, a VOS language:
(5a) namono azy ny anadahin-d Rakoto
     hit/killed him the sister-of Rakoto
     ‘Rakoto’s sister killed him’
(5b) *namono ny anadahin-d Rakoto izy
     hit/killed the sister-of Rakoto he
     ‘*He killed Rakoto’s sister’
The domain of the subject pronoun in the English translation of (5b) is the whole sentence. However, in the subject-final Malagasy, the domain of the pronoun includes only the pronoun since nothing follows it. In (5a), it is the other way around: Whereas in English the object pronoun has nothing in its domain, in Malagasy its domain is the arbitrary chunk [him the sister of Rakoto]. As shown in (5), the coreference options are precisely the same in Malagasy and English. However, if the coreference rule operates on domains based on linear order, it should incorrectly allow coreference in (5b) and rule it out in (5a). Along with the principled questions, Reinhart argued that there are many empirical problems with rules based on the relation precede and command, even for right branching languages such as English. In the area of coreference, for instance, coreference is permitted in (6) (from Reinhart, 1976), although the pronoun precedes and commands its antecedent: (6a) The chairman hit him on the head before the lecturer had a chance to say anything. (6b) Rosa won’t like him anymore, with Ben’s mother hanging around all the time.
Reinhart concluded that linear order is universally irrelevant for the definition of syntactic domains, and hence for linguistic rules of the type under consideration. On the other hand, the domains for the relevant linguistic rules must be narrower than those defined by the relation command, and they consist of constituents rather than S or cyclic nodes. The initial definition of Reinhart (1976: 32, (36)) for the relation c-command (constituent command) is given in (7): (7) A node a c-commands a node b iff neither a nor b dominates the other and the first branching node dominating a also dominates b. (8) The domain of a node a consists of a together with all and only the nodes that a c-commands.
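Because definition (7) refers only to dominance, it gives the same answer whatever the order of sister nodes. The sketch below tries to make that concrete. The `Node` class, the function names, and the two bracketings are illustrative assumptions: the subject-final order is modelled simply by swapping the daughters of S, and English words stand in for the Malagasy ones.

```python
class Node:
    """A labelled tree node that remembers its parent (hypothetical helper)."""
    def __init__(self, label, children=(), word=None):
        self.label = label
        self.word = word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    """Definition (7): the first branching node above a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def clause(subject_first):
    """Build 'he killed Rakoto's sister' with the subject first or last."""
    pronoun = Node("NP", word="he")
    rakoto = Node("NP", word="Rakoto")
    obj = Node("NP", [rakoto, Node("N", word="sister")])
    vp = Node("VP", [Node("V", word="killed"), obj])
    Node("S", [pronoun, vp] if subject_first else [vp, pronoun])
    return pronoun, rakoto


svo_pronoun, svo_rakoto = clause(subject_first=True)
vos_pronoun, vos_rakoto = clause(subject_first=False)

print(c_commands(svo_pronoun, svo_rakoto))  # True
print(c_commands(vos_pronoun, vos_rakoto))  # True
print(c_commands(svo_rakoto, svo_pronoun))  # False
```

In both orders the subject pronoun c-commands Rakoto, so a restriction stated over c-command domains treats the two languages alike.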
As observed in Reinhart (1976), this relation is similar to (the converse of) the relation ‘in construction with,’ which was suggested by Klima (1964) for the analysis of the scope of negation. It also bears resemblance to ‘superiority’ suggested by Chomsky (1973), the difference being that superiority is asymmetric. Thus, sister nodes cannot be superior to each other, whereas (7) allows them to c-command each other. The formal properties of c-command were further investigated in Reinhart (1981), where she argued that the requirement that neither a nor b dominate the other is empirically superfluous and leads to formal complications. If it is dropped from the definition, then c-command is a reflexive relation (namely, nodes c-command themselves), which has formal advantages. This line was
further pursued in Baker and Pullum (1990). However, the definition that is most widely assumed remains the one in (7). Given this definition, syntactic domains must be defined as in (8). (For a simpler definition, see Reinhart, 1981.) Given the c-command domains in (7), the coreference restriction (4) captures all cases of coreference discussed here. In the Malagasy (5b), just as in its English translation, the first branching node dominating the subject pronoun is S (inflection phrase (IP)); hence, its domain is the whole S, regardless of whether the other nodes in S precede or follow it. Thus, (4) equally rules out its coreference with Rakoto in both English and Malagasy. In (5a) the pronoun does not c-command the antecedent. The domain of the object pronoun for both English and Malagasy is only the VP, which does not dominate the subject. Hence, (4) does not rule out coreference. In the sentences of (6), the pronoun is an object, and the NP with which it corefers is in a clausal prepositional phrase (PP) attached higher than the VP. The syntactic tree of (6a) is given in (9):
The first branching node dominating the pronoun is V’, which does not dominate the PP containing the lecturer. Hence, the pronoun does not c-command the potential antecedent, which means that the latter is not in the syntactic domain of the pronoun, so the coreference restriction (4) does not rule out coreference here. Given the command relation, by contrast, the relevant node for defining the domain of the pronoun is not the VP but, rather, the first IP (S) that dominates it. That IP also dominates the potential antecedent; hence, (4) would rule out coreference if it applies in the command domain. For the sentences of (2), the c-command and the precede and command domains give the same results for English. In (2a) the pronoun both c-commands and commands its antecedent (since the first branching node is S). In (2b) and (2c) the pronoun neither commands nor c-commands its antecedent. The coreference restriction (4), applying at c-command domains, was later reformulated as condition C in the binding theory of Chomsky (1981). As mentioned previously, command relations were first introduced for rules governing semantic dependencies. However, Reinhart (1976) argued that semantic dependencies are just a specific instance of a broader generalization, which she labeled the ‘C-command domain condition’: All sentence-level linguistic rules and operations are restricted to apply only within the c-command domain. Within this domain, the dependent node must be c-commanded by the node it depends on. Syntactic movement can be viewed as an instance of this dependency – the trace left by the moved element fully depends for its interpretation upon the moved element. Correspondingly, syntactic movement is allowed only into a c-commanding position, namely the target position of the moved element must c-command its source position. Among the illustrations of this principle that Reinhart provides is (10) (couched here in current syntactic terms):
(10a) Felix did not realize he is a failure until whose remarks (underlying)
(10c) (Remind me) [CP1 until whose remarks_i [IP Felix did not realize [CP2 [IP he is a failure]] t_i]]
(10d) *[CP1 [IP Felix did not realize [CP2 until whose remarks_i [IP he is a failure]] t_i]]
In the underlying (10b), the wh-phrase has to move to a Spec of CP (COMP). If we look at the sequence linearly, the first available Spec is that of the embedded CP2. However, this Spec does not c-command it, and indeed, it cannot move there, as witnessed in (10d). The only permitted option is movement to
the top Spec of CP1, which c-commands the wh-phrase, as in (10c). No independent principle, known at the time, could explain why (10d) should be blocked. The subjacency condition on movement, introduced in Chomsky (1973), disallows movement that crosses more than one cyclic node (IP or NP), but the movement in (10d) does not cross any such node. For examples such as (10), the same results would be obtained if the domains were to be defined in terms of command, rather than c-command, because the lower Spec also does not command the wh-phrase. However, Reinhart (1976, 1983) cited extraposition from NP subjects and result clause extraposition as examples in which the command condition is not sufficient and concluded that the relevant domain for movement is the c-command domain. The domain condition also provided an explanation for why there can be no syntactic ‘lowering’ movement, a hypothesis advocated by Chomsky (1977) at the time. Such movement violates the principle that the target must c-command the source position. This condition is still broadly assumed in syntax theory, at least for overt syntactic movement, and it is believed to hold universally. Let us now pay closer attention to the definition of c-command. The definition in (7) counts any branching node as relevant for c-command. However, it was noted already in Reinhart (1976) that this definition is too restrictive for interpretative rules (e.g., the anaphora rules). In current terms, the reason is that it disregards the special status of adjuncts. This can be illustrated with the VP adjuncts in (11):
(11a) *We met him in Ben’s office.
(11b) We met every researcher in his office.
For the anaphora rules, the objects in (11) should c-command the PP: The coreference restriction prevents the pronoun from coreferring with an NP in that PP, as in (11a). The bound anaphora restriction (which we did not discuss here) allows a quantified antecedent to bind a pronoun only in its c-command domain. As seen in (11b), the quantified object can, indeed, bind the pronoun in the PP. However, the actual representation of, for example, (11a) is (12), where the PP is an adjunct rather than a complement (sister) of the verb. If what determines c-command is the first branching node, the object pronoun does not c-command Ben in (12), and coreference would be incorrectly permitted. It was later found that the branching node definition of c-command is too restrictive also in certain instances of syntactic movement, which also involves adjunction. Head movement (e.g., the movement of V (Chomsky, 1986)) adjoins a given head to another head, as illustrated in the French (13a) and (13b):
(13a) Jean souvent voit Marie (underlying)
      Jean often sees Marie
(13b) Jean voit souvent t Marie
The domain condition allows only movement to a c-commanding position, but the first branching node dominating the verb in its new position in (13c) is the higher I, which does not dominate the trace. Hence, by the branching definition, the moved verb does not c-command its original position. Reinhart (1976) argued that in view of facts like (11), intermediate categories of the same type should be ignored in the full definition of c-command. At the time, X-bar theory was not available (and the PP of (12) was analyzed as adjoined to VP), so capturing this extension of c-command required the rather complex definition in (14) (Reinhart, 1976: 148, (4)):
(14) A node a c-commands a node b iff (neither a nor b dominates the other and) the first branching node g1 dominating a either dominates b or is immediately dominated by a node g2 which dominates b, and g2 is of the same category type as g1.
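The difference between (7) and (14) can be seen on a tree for (11a). The sketch below is an illustration under assumptions: the `Node` class and function names are hypothetical, and the tree is reconstructed from the prose description of (12), with the PP adjoined to a second V’ above the V’ that contains the verb and its object.

```python
class Node:
    """A labelled tree node that remembers its parent (hypothetical helper)."""
    def __init__(self, label, children=(), word=None):
        self.label = label
        self.word = word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def first_branching(a):
    """Closest node above a that has at least two daughters, or None."""
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node


def c_commands(a, b):
    """Definition (7): branching-node c-command."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    g1 = first_branching(a)
    return g1 is not None and dominates(g1, b)


def c_commands_extended(a, b):
    """Definition (14): also look one node up if it has the same category."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    g1 = first_branching(a)
    if g1 is None:
        return False
    if dominates(g1, b):
        return True
    g2 = g1.parent
    return g2 is not None and g2.label == g1.label and dominates(g2, b)


# Assumed tree: [IP we [VP [V' [V' met him] [PP in [NP Ben's office]]]]]
him = Node("NP", word="him")
ben = Node("NP", word="Ben")
pp = Node("PP", [Node("P", word="in"), Node("NP", [ben, Node("N", word="office")])])
inner = Node("V'", [Node("V", word="met"), him])
outer = Node("V'", [inner, pp])
Node("IP", [Node("NP", word="we"), Node("VP", [outer])])

print(c_commands(him, ben))           # False
print(c_commands_extended(him, ben))  # True
```

Under (7) the pronoun fails to c-command Ben, so restriction (4) would wrongly allow coreference; under (14) Ben falls inside the pronoun's domain.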
In (12), the first V’ dominating the object pronoun is immediately dominated by another V’, which dominates the PP. Since the two instances of V’ are of the same category type, the first V’ does not count, so given (14), the pronoun c-commands the PP. With the development of X-bar theory, in Jackendoff (1977) and Chomsky (1981), it became possible to simplify the definition in (14). Aoun and Sportiche (1983) argued that the intuition behind this definition is that the nodes relevant for c-command are not arbitrary branching nodes but, rather, maximal projections (the maximal categorical expansion of a head, e.g., NP, VP, and IP). More broadly, linguistic rules are restricted to operate within constituents that are full projections of a given head. Aoun and Sportiche (1983) coined the term ‘m-command’ (maximal-command) for the relation under consideration, defined in (15): (15) A node a m-commands a node b iff (neither a nor b dominates the other and) the first maximal projection dominating a also dominates b.
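Definition (15) can be sketched in the same style. The code below is illustrative only: the `Node` class and function names are hypothetical, maximal projections are identified by the assumed convention that their labels end in "P", and both trees are simplified reconstructions. The second tree is an intransitive clause, the case in which the article notes that branching-node c-command reaches up to the clause node.

```python
class Node:
    """A labelled tree node that remembers its parent (hypothetical helper)."""
    def __init__(self, label, children=(), word=None):
        self.label = label
        self.word = word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def is_maximal(node):
    # Assumption: a label ending in "P" (NP, VP, PP, IP) is a maximal projection.
    return node.label.endswith("P")


def c_commands(a, b):
    """Definition (7): the first branching node above a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def m_commands(a, b):
    """Definition (15): the first maximal projection above a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and not is_maximal(node):
        node = node.parent
    return node is not None and dominates(node, b)


# Assumed tree for (11a): [IP we [VP [V' [V' met him] [PP in [NP Ben's office]]]]]
him = Node("NP", word="him")
ben = Node("NP", word="Ben")
pp = Node("PP", [Node("P", word="in"), Node("NP", [ben, Node("N", word="office")])])
inner = Node("V'", [Node("V", word="met"), him])
Node("IP", [Node("NP", word="we"), Node("VP", [Node("V'", [inner, pp])])])

# Assumed intransitive clause: [IP Ben [VP slept]]
slept = Node("V", word="slept")
subject = Node("NP", word="Ben")
Node("IP", [subject, Node("VP", [slept])])

print(c_commands(him, ben), m_commands(him, ben))            # False True
print(c_commands(slept, subject), m_commands(slept, subject))  # True False
```

The two relations diverge in both directions: m-command extends the object's domain over the adjunct PP, and it keeps an intransitive verb's domain inside its own VP.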
In (12), the first maximal projection dominating the pronoun is the VP; hence, it m-commands the PP. In (13c), the first maximal projection dominating the moved V is IP; hence, V m-commands its trace. As Aoun and Sportiche pointed out, there is further motivation to replace the definition of c-command with (15). In the case of an intransitive verb, the first branching node that dominates the verb turns out to be the clause node (IP or I’) because the VP in that case is not branching. This may be irrelevant for anaphora but would cause problems in other areas, such as government and Case. Other extensions of the m-command domain have been proposed throughout the years: May (1985) proposed an extension of the domain of adjuncts, which was needed for quantifier raising (QR) and quantifier scope. This rests on the definition (16) for the relation ‘dominate’: (16) a dominates b iff every segment of a dominates b.
In the case of adjunction, which creates identical nodes, each occurrence of these nodes is viewed as a segment. In the structure in (17b), QR has applied to adjoin everyone to IP. IP does not dominate the NP everyone because there is a segment of IP that does not dominate it.
(17a) Who does everyone like t?
Hence, the first maximal projection that dominates this NP is only the CP, which means that everyone m-commands who, which is why it can take wide scope over who. Chomsky (1986) applied the segment definition of dominate to other areas of syntax. Under Kayne’s (1994) approach to phrase structure, specifiers are an instance of adjunction. Combined with the definition of dominate in (16), this means that specifiers have the same c-command domain as their maximal projection. This has broad implications in Kayne’s system; regarding the problems discussed here, Kayne’s definition derives the fact that quantifier binding is possible in examples such as Every girl’s father thinks she’s a genius, discussed by Reinhart (1983: 177). Reinhart assumed that there is only one relation relevant to linguistic rules, namely the extended c-command in (14), which is better captured in (15). (She viewed (7) as a simplified definition of c-command needed only for expository reasons.) In that view, branching node c-command is just a subcase, or a specific instance, of m-command. However, in subsequent research since Chomsky (1981), two distinct relations have been assumed – the basic branching node c-command defined in (7), repeated here, and m-command, as previously defined: (7) A node a c-commands a node b iff (neither a nor b dominates the other and) the first branching node dominating a also dominates b.
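The difference between branching-node c-command in (7) and m-command in (15) can be made concrete with a small sketch. The `Node` class, the helper names, and the toy tree below are illustrative assumptions, not part of the original analysis; the tree only mimics the configuration described for (12), with a PP attached to a higher V’ and a non-branching VP above it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)  # identity-based equality: two nodes may share a label
class Node:
    label: str
    maximal: bool = False  # True for maximal projections (NP, VP, PP, IP)
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b, i.e. a is an ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def first_ancestor(a: Node, test: Callable[[Node], bool]) -> Optional[Node]:
    """Lowest node dominating a that satisfies test, or None."""
    node = a.parent
    while node is not None and not test(node):
        node = node.parent
    return node


def unrelated(a: Node, b: Node) -> bool:
    """Neither node dominates the other (and they are distinct)."""
    return a is not b and not dominates(a, b) and not dominates(b, a)


def c_commands(a: Node, b: Node) -> bool:
    """Definition (7): the first branching node dominating a dominates b."""
    g = first_ancestor(a, lambda n: len(n.children) > 1)
    return unrelated(a, b) and g is not None and dominates(g, b)


def m_commands(a: Node, b: Node) -> bool:
    """Definition (15): the first maximal projection dominating a dominates b."""
    g = first_ancestor(a, lambda n: n.maximal)
    return unrelated(a, b) and g is not None and dominates(g, b)


# Hypothetical tree: [IP NP-subj [I' I [VP [V' [V' V NP-obj] PP]]]]
v, obj = Node("V"), Node("NP-obj", maximal=True)
pp = Node("PP", maximal=True)
inner_vbar = Node("V'", children=[v, obj])
outer_vbar = Node("V'", children=[inner_vbar, pp])
vp = Node("VP", maximal=True, children=[outer_vbar])
infl, subj = Node("I"), Node("NP-subj", maximal=True)
ibar = Node("I'", children=[infl, vp])
ip = Node("IP", maximal=True, children=[subj, ibar])
```

In this toy tree the object fails to c-command the PP under (7), because the first branching node above it is the lower V’, but it does m-command the PP, because the first maximal projection above it is VP. The same contrast holds between I and the subject.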
This left open the question of whether different rules may be sensitive to different command relations – a question we return to later. In the Principles and Parameters framework (‘Government and Binding,’ (Chomsky, 1981)), the relation
m-command played a central role. It was used to define the relation of ‘government,’ which was believed at the time to determine the domain of many local linguistic processes. A version of the definition of government, adapted from Chomsky (1986), is given in (18): (18) a governs b iff a m-commands b and there is no maximal projection which dominates b and not a.
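As a rough sketch of how (18) narrows the m-command domain, the following encodes a toy tree as a child-to-parent table and checks government directly from the definition. The tree, the node labels, and the function names are assumptions made for illustration only; dominance is taken to be proper dominance, so a node does not dominate itself.

```python
# Hypothetical tree: [IP NP-subj [I' I [VP V NP-obj [PP P NP-in-PP]]]]
PARENT = {
    "NP-subj": "IP", "I'": "IP",
    "I": "I'", "VP": "I'",
    "V": "VP", "NP-obj": "VP", "PP": "VP",
    "P": "PP", "NP-in-PP": "PP",
}
MAXIMAL = {"IP", "VP", "PP", "NP-subj", "NP-obj", "NP-in-PP"}


def ancestors(n):
    """All nodes properly dominating n, lowest first."""
    out = []
    while n in PARENT:
        n = PARENT[n]
        out.append(n)
    return out


def dominates(a, b):
    return a in ancestors(b)


def m_commands(a, b):
    """Definition (15): first maximal projection above a dominates b."""
    if a == b or dominates(a, b) or dominates(b, a):
        return False
    for g in ancestors(a):
        if g in MAXIMAL:
            return dominates(g, b)
    return False


def governs(a, b):
    """Definition (18): a m-commands b, and every maximal projection
    that dominates b also dominates a."""
    if not m_commands(a, b):
        return False
    return all(dominates(g, a) for g in ancestors(b) if g in MAXIMAL)
```

Here `governs("V", "NP-obj")` and `governs("I", "NP-subj")` both hold, while `governs("V", "NP-in-PP")` fails even though V m-commands that NP: the PP dominates the NP but not V, so it intervenes.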
The m-command domain of a given node a, along the lines examined previously, is a together with all the nodes it m-commands. Assuming that m-command is the relevant command relation of natural language, this means that no linguistic rule can operate on any two nodes if one of them is not in the m-command domain of the other. However, the domains defined by m-command (just as those defined by c-command) may still be quite large, extending down the tree from the m-commanding node. Many linguistic processes, known as ‘local,’ are restricted to apply in a smaller subdomain of the m-command domain. At this stage of linguistic theory, government was believed to be the relation defining these local domains. The government domain of a consists only of the nodes it m-commands within the same maximal projection, with a few local extensions built into the theory. (Most notably, the verb can govern the subject of its clausal complement.) To understand why it seemed necessary to take m-command, rather than the branching c-command, as the relation defining government, let us examine case assignment.
(19a) Der Student hat den Mann gesehen. (German)
the student.NOM has the man.ACC seen
‘The student saw the man’
The basic assumption in this framework is that in (19), the object gets its accusative case from V, and
the subject gets its nominative case from the functional head I. More broadly, heads of the relevant type (V, N, P, A, and tensed I) are case-assigners. The question was what is the structural relation that must hold between the head and an NP to enable case assignment. In terms of c-command, V and I do not bear the same relation to the NP to which they assign case. In (19), V c-commands the object, but I does not c-command the subject. However, V and I equally m-command the relevant NP. Hence, it was concluded that m-command is the relevant relation for case assignment. The domain for case assignment is further restricted by the government requirement. For example, if we added a PP to (19), such as mit einem Feldstecher (‘with binoculars’), the NP in that PP cannot be assigned accusative case because it is not in the government domain of the verb (the PP being an intervening maximal projection). Its governor is only the P, which assigns it dative. However, at the stage of the minimalist program (MP), the relation government was found to be a superfluous relation (Chomsky, 1995). Chomsky (2005) argued that core syntactic processes are sensitive only to c-command, so from this perspective m-command is superfluous as well. Syntactic movement of XPs is permitted only to a c-commanding position. This is the case with wh-movement and NP-movement, which were traditionally labeled ‘substitution,’ but also XP movement by adjunction (topicalization and extraposition at the overt structure (SS)) and QR at the covert structure (LF) are possible only into a c-commanding position. Regarding the problem of nominative case assignment exemplified in (19), Chomsky (2001) handled it without resort to m-command. Under the VP-internal subject hypothesis (Sportiche (1988), among others), the original position of the subject in (19) is VP internal, as in (20):
At this position, I c-commands the subject and can therefore check its nominative case (the mechanism is called ‘agree’). Subsequently, the subject moves to the Spec IP position (to satisfy the extended projection principle, which requires that Spec IP be filled). Within
the MP, syntactic structure is built gradually, and case checking applies immediately upon the introduction of the relevant case checking head. It follows that an NP can have its case checked only by the closest c-commanding case checker. Thus, the ban against intervention added by the government requirement turned out to be superfluous since it is derived independently. Head movement, illustrated in (13), is the only instance of a syntactic process that still seems to be restricted by m-command rather than c-command. Pointing out that head movement differs from the core rules of syntax in important respects (not only does head movement violate c-command but also it does not conform to the extension principle, discussed later), Chomsky (2001) suggested that it is a phonological process resulting from the affixal character of the inflectional heads. This leaves us with the question of semantic interpretation rules. These must still operate in m-command domains. This was illustrated here only with the anaphora problems in (11), but Reinhart (1983) argued that precisely the same holds for relative quantifier scope and other interpretative procedures. The upshot so far seems to be that two distinct syntactic domains must be assumed – the c-command domain for core syntactic processes and the m-command domain for interpretative processes. An alternative perspective on the c-command domain is offered in Epstein (1999) and Epstein et al. (1998). They argued that what appears to be the c-command restriction on syntactic movement is in fact an entailment of an independent linguistic factor. In current MP views, there is one primitive syntactic operation, ‘merge,’ which constructs a new object from two objects. There are two types of merge: ‘external merge,’ which puts together two distinct categories, and ‘internal merge,’ which attaches to a category a phrase that is part of that category (hence, the label ‘internal’). Internal merge is movement. 
There is a general principle governing merge, the extension principle, which determines that a node can be merged only to the root (the topmost node available). Since movement is a subtype of merge, its landing site, the position it merges into, must always c-command its original position. This is so because the moved element must be merged to the root, which obviously dominates the original position. Under the assumption that case checking is done upon merge – namely, case is checked as soon as the head with the case feature is merged – case checking is also possible only when the head c-commands the argument it assigns case to. In this view, then, only one command relation must be assumed – m-command, which governs mainly interpretative rules. Syntactic processes are executed derivationally and are governed by merge.
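The entailment described above can be illustrated with a deliberately simplified sketch. Nested pairs stand in for syntactic objects, copies of a moved phrase are identified by equality, and c-command is computed as "the sister of a contains b", which matches (7) only under strictly binary branching. All of these are assumptions made for the illustration, and the function names are hypothetical.

```python
def contains(tree, part):
    """True if part is tree itself or occurs anywhere inside it."""
    if tree == part:
        return True
    return isinstance(tree, tuple) and any(contains(t, part) for t in tree)


def external_merge(x, y):
    """Combine two distinct objects into a new one."""
    return (x, y)


def internal_merge(root, part):
    """Movement: re-merge a proper subpart of root at the root,
    as the extension principle requires."""
    if root == part or not contains(root, part):
        raise ValueError("internal merge needs a proper subpart of the root")
    return (part, root)


def c_commands_in(root, a, b):
    """True if some occurrence of a in root has a sister containing b."""
    if not isinstance(root, tuple):
        return False
    left, right = root
    if left == a and contains(right, b):
        return True
    if right == a and contains(left, b):
        return True
    return c_commands_in(left, a, b) or c_commands_in(right, a, b)


# Toy derivation loosely modeled on (17a); labels are placeholders.
vp = external_merge("like", "who")
clause = external_merge("everyone", vp)
moved = internal_merge(clause, "who")
```

Because `internal_merge` can only attach the moved phrase as sister to the old root, and the old root contains the original position, the landing site c-commands that position without any separate c-command condition being stated.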
Interpretative rules must take into consideration the full derivation tree, including adjoined constituents. In the MP, adjunction (‘pair-merge’ in MP terms) (as in the case of the PP in (11)) is a different process than argument merge (‘set-merge’). It applies freely and is not restricted by the extension principle. Hence, a constituent that does not belong to the c-command domain of a given node, because it was inserted by adjunction, may still belong to its interpretative domain if it is inserted in the same maximal projection. See also: Configurationality; Island Constraints; Principles
and Parameters Framework of Generative Grammar; Subjects and the Extended Projection Principle; Transformational Grammar: Evolution; X-Bar Theory.
Bibliography Aoun J & Sportiche D (1983). ‘On the formal theory of government.’ Linguistic Review 2(3), 211–236. Barker C & Pullum G K (1990). ‘A theory of command relations.’ Linguistics and Philosophy 15, 1–34. Chomsky N (1973). ‘Conditions on transformations.’ In Anderson S & Kiparsky P (eds.) A Festschrift for Morris Halle. New York: Holt, Rinehart & Winston. 232–286. Chomsky N (1977). ‘On wh-movement.’ In Culicover P, Wasow T & Akmajian A (eds.) Formal syntax. New York: Academic Press. Chomsky N (1981). Lectures on government and binding. Dordrecht, The Netherlands: Foris. Chomsky N (1986). Barriers. Cambridge: MIT Press. Chomsky N (1995). The minimalist program. Cambridge: MIT Press. Chomsky N (2001). ‘Derivation by phase.’ In Kenstowicz M (ed.) Ken Hale: a life in language. Cambridge: MIT Press. Chomsky N (2005). ‘Three factors in language design.’ Linguistic Inquiry 36, 1–22. Epstein S (1999). ‘Unprincipled syntax: the derivation of syntactic relations.’ In Epstein S & Hornstein N (eds.) Working minimalism. Cambridge: MIT Press. 317–355. Epstein S D, Groat E M, Kawashima R & Kitahara H (1998). A derivational approach to syntactic relations. Oxford: Oxford University Press. Jackendoff R S (1972). Semantic interpretation in generative grammar. Cambridge: MIT Press. Jackendoff R S (1977). X’-syntax: a study of phrase structure. Cambridge: MIT Press. Kayne R (1994). The antisymmetry of syntax. Cambridge: MIT Press. Klima E S (1964). ‘Negation in English.’ In Fodor J A & Katz J J (eds.) The structure of language. Englewood Cliffs, NJ: Prentice Hall. Langacker R (1966). ‘On pronominalization and the chain of command.’ In Reibel W & Schane S (eds.) Modern studies in English. Englewood Cliffs, NJ: Prentice Hall. Lasnik H (1976). ‘Remarks on coreference.’ Linguistic Analysis 2, 1–22.
May R (1985). Logical form. Cambridge: MIT Press. Reinhart T (1976). The syntactic domain of anaphora. Ph.D. diss., MIT. (Distributed by MIT Working Papers in Linguistics). Reinhart T (1981). ‘Definite NP anaphora and c-command domains.’ Linguistic Inquiry 12, 605–635. Reinhart T (1983). Anaphora and semantic interpretation. London/Chicago: Croom Helm/University of Chicago Press.
Ross J R (1967). Constraints on variables in syntax. Ph.D. diss., MIT. (Published in 1986 as Infinite syntax. Norwood, NJ: Ablex). Sportiche D (1988). ‘A theory of floating quantifiers and its corollaries for constituent structure.’ Linguistic Inquiry 19, 425–449.
Communication in Grey Parrots I M Pepperberg, Brandeis University, Waltham, MA, USA © 2006 Elsevier Ltd. All rights reserved.
For over 25 years, I have taught Grey parrots the meaningful use of English speech. Using English speech, the oldest, Alex, matches certain cognitive capacities of apes, marine mammals, and sometimes 4- to 6-year-old children (Pepperberg, 1999). His abilities are inferred not from operant tasks common in animal research, but from vocal responses to vocal questions; that is, he also demonstrates intriguing communicative parallels with young humans, despite his phylogenetic distance. I doubt I taught Alex and other parrots these abilities de novo; their achievements likely derive from existent cognitive and neural architectures. My data thus suggest an avian role in the evolution of intelligence and communication.
Animals and Human Language Historically, animal–human communication studies have focused on cetaceans and apes. ‘Talking’ birds were rarely included (except by a very small number of researchers, e.g., Mowrer, 1954), birds being viewed as unintelligent mimics. Operant experiments with pigeons had demonstrated capacities inferior to those of mammals and those results were extrapolated to all birds, despite separate intriguing evidence for parrots’ cognitive and possibly communicative feats (Pepperberg, 1999). Correlates of intelligence – relatively large brain to body weight ratios, highly social natures, and long lives during which they must continuously be able to both remember and update information about their physical and social environments – do exist in parrots, and proper training induces language-like abilities matching those of other species.
What Greys Can Learn Alex’s accomplishments are impressive (Pepperberg, 1999). He vocally labels more than 50 objects, seven colors, five shapes, quantities to 6, and three categories (color, shape, material), and appropriately uses ‘‘no,’’ ‘‘come here,’’ ‘‘wanna go X,’’ and ‘‘want Y’’ (X and Y are, respectively, locations or items). He answers queries concerning relative size, categories, quantity, presence or absence of attribute similarity/difference, and label comprehension. He combines labels to identify, request, comment upon, or refuse more than 100 items and to alter his environment. He can state color, shape, material, and object name of an exemplar. Given two items, he answers the questions ‘‘What toy?’’, ‘‘How many?’’, ‘‘What’s same/different?’’, ‘‘What color bigger/smaller?’’, and ‘‘What matter bigger/smaller?’’ Shown collections of mixed numbers of, for example, red and blue balls and blocks, he answers the question ‘‘How many blue block?’’, thus comprehending recursive, conjunctive queries. He uses very limited forms of segmentation – sentence frames and consistent adjective + noun constructions. He semantically separates labeling from requesting. He requests absent objects and refuses substitutes, demonstrating that his labels are representational; that is, he recognizes dissonance between the concept encoded in the request and proffered items. He exhibits solitary practice, apparently building new labels from sounds already in his repertoire; for children, Bruner (1977) referred to this behavior as ‘scaffolding.’ Thus, after learning ‘‘grey’’ he spontaneously produced ‘‘grape,’’ ‘‘grate,’’ ‘‘grain,’’ ‘‘cane,’’ and ‘‘chain,’’ subsequently mapping these labels referentially onto relevant objects. Ongoing studies with younger parrots show that he is not exceptional. Alex’s behavior is not simple object-label association. He is equally accurate on items related, but not identical, to training objects, and he transfers
utterances to novel contexts: After learning to answer ‘‘none’’ to the question ‘‘What’s different?’’ with respect to two identical items, he spontaneously responded ‘‘none’’ when asked ‘‘What color bigger?’’ regarding two equally sized, differently colored blocks (Pepperberg and Brezinsky, 1991). Alex lacks all but a very few verbs, but nevertheless exhibits certain communicative capacities that were once presumed limited to primates.
How Greys Learn: Parallels with Humans My Greys’ learning sometimes parallels human processes, providing insights into how acquisition of complex communication may have evolved. Referential, contextually applicable (functional), and socially rich input allows parrots, like young children – as shown by researchers such as Hollich et al. (2000) – to acquire communication skills effectively (Pepperberg, 1999). Reference, the object-word connection, is exemplified by rewarding birds with objects they label. Context/function involves the situation in which an utterance is used and effects of its use; initially using labels as requests motivates birds to learn human speech. Social interaction engages subjects directly, accents environmental components (e.g., contextual explanations for actions and their consequences), emphasizes common attributes – and possible underlying rules – of diverse actions, and allows continuous adjustment of input to learners’ levels. I next describe the training technique of our laboratory, then experiments to determine the necessary and sufficient input for engendering learning. Model/Rival (M/R) Training
My model/rival (M/R) system (background in Pepperberg, 1999) uses three-way social interactions among two humans and a parrot to demonstrate targeted vocal behavior. The parrot watches the trainer display, and query the second human about, one or more item(s) (e.g., ‘‘What’s here?’’, ‘‘How many?’’) and reward correct answers referentially with praise and the object(s). Incorrect responses (like those the bird may make) are punished by scolding and temporarily removing the item(s) from sight. The second human is a model for the parrot’s responses and its rival for the trainer’s attention, and illustrates error consequences by trying again or talking more clearly if the responses were first (deliberately) incorrect or garbled. Because the bird is also queried and rewarded for successive approximations to correct responses, training is adjusted to its level. Unlike the modeling procedures of other researchers (i.e., those reviewed in Pepperberg and Sherman,
2000), ours (a) interchanges the roles of the trainer and the model to emphasize that one is not always the questioner and the other the respondent and that the procedure can effect environmental change, and (b) exclusively uses intrinsic rewards. Todt (1975), for example, showed that birds whose trainers never reverse roles responded only to the questioner; our birds, however, interact with and learn from all humans. Intrinsic rewards – uttering ‘‘X’’ gets X – ensure the closest possible correlations of labels or concepts to be learned with their appropriate referents. M/R training demonstrated which input elements enabled referential acquisition, not what was necessary and sufficient. What if elements were lacking? Answering that question required additional parrots: Alex might cease learning because training had changed, not because of how it changed. With three naive Greys – Kyaaro, Alo, and Griffin – my laboratory tested the importance of reference, context/function, and social interaction in training. Eliminating Aspects of Input
We performed eight experiments (Pepperberg, 1999; Pepperberg et al., 2000). First, we concurrently presented Alo and Kyaaro with (a) audiotapes of Alex’s sessions, which were nonreferential, not contextually applicable, and noninteractive; (b) videotapes of Alex’s sessions, which were referential, minimally contextually applicable, and noninteractive; and (c) standard M/R training. In (a) and (b), birds were socially isolated. Condition (a) paralleled early allospecific song acquisition studies such as those of Marler (1970); (b) involved issues about avian vision and video: Ikebuchi and Okanoya (1999) demonstrated that birds might not see standard video output as do humans. Second, because Rice and her colleagues (Rice et al., 1990) found that interactive coviewers could increase young children’s learning from video, a coviewer now provided social approbation for viewing and pointed to the screen with comments like ‘‘Look what Alex has!’’ Birds’ labeling attempts would garner only vocal praise. Social interaction was limited; referentiality and functionality matched earlier video sessions. Third, because the extent of coviewer interaction might be relevant, he or she now uttered targeted labels and asked questions. Fourth, so that lack of reward would not deter learning, socially isolated parrots watched videos while a student in another room monitored utterances and could deliver rewards remotely. Fifth, because birds might habituate to the single videotape used per label (although each tape depicted numerous Alex–trainer interactions), we used live video from Alex’s sessions. Sixth, because Baldwin (1995) had
demonstrated that labels were not acquired if adult– child duos failed to focus jointly on objects being labeled, a single trainer faced away from the bird (who could reach, e.g., a key) and chatted about the item (‘‘Look, a blue key!’’, ‘‘You want key?’’, etc.; sentence frames, Pepperberg, 1999) but had no visual or physical contact with either the parrot or the object; a bird’s labeling attempt would receive only vocal praise, thereby eliminating some functionality and considerable social interaction. Parrots failed at referential targeted label acquisition in any non-M/R condition, but succeeded in concurrent M/R sessions (Pepperberg, 1999). Seventh, we eliminated some aspects of modeling: one trainer jointly focused with the bird on objects, using labels and making queries. Griffin did not utter labels during 50 such sessions, but clearly produced labels after two or three subsequent M/R sessions. (NB: Birds that were switched to M/R training after 50 video sessions needed ~20 sessions before producing labels.) We suspected latent learning: Griffin apparently stored but could not use labels without observing their use modeled. Finally, we used a liquid crystal monitor to see if cathode ray tube flicker-fusion hindered video learning (Ikebuchi and Okanoya, 1999); parrots still failed at video learning (Pepperberg and Wilkes, 2004). The results emphasized that training must include reference, social interaction, and functionality/contextual use if parrots are to communicate with humans rather than mimic speech. Mutual Exclusivity: Studying Subtle Changes in Input
Parrots’ learning processes may also parallel young children’s mutual exclusivity (ME) (Pepperberg and Wilcox, 2000). This term, used by researchers such as Liittschwager and Markman (1994), refers to many children’s early belief that each object has one, and only one, label. Along with the ‘whole object assumption’ (that a label identifies the entire object, not some feature), ME supposedly guides initial word acquisition. Liittschwager and Markman (1994), for example, suggested that ME eventually helps children interpret novel words as feature labels (overcome the whole object assumption), but very young children may initially reject second labels as unacceptable alternatives. Input, however, affects ME: Gottfried and Tonks (1996) showed that children, and I showed that parrots like Alex, who receive inclusivity data (X is a kind of Y; e.g., colors taught as additional, not alternative, labels: ‘‘Here’s a key; it’s a green key’’), accept multiple labels for items and easily form hierarchical relations. Given an object, Alex answers the questions ‘‘What color?’’, ‘‘What shape?’’, ‘‘What matter?’’, and ‘‘What toy?’’
(Pepperberg, 1999). But parrots taught colors or shapes as alternative labels (e.g., ‘‘Here’s key’’, later, ‘‘It’s green’’) have difficulty using these modifiers for previously labeled items. Griffin, thusly trained, initially answered the question ‘‘What color?’’ with object labels. Similarly, he had difficulty learning an object label – cup – and answered the question ‘‘What toy?’’ with colors (Pepperberg and Wilcox, 2000). Even small input differences affect acquisition as much for parrots as they do for young children. Combinatory Learning
Researchers (e.g., Johnson-Pynn et al., 1999) have argued, pointing primarily to behavioral data, that a common neural substrate initially underlies young children’s parallel development of communicative and object (manual) combinations and that a homologous substrate in apes allows similar, limited, parallel development, thus suggesting a shared evolutionary history for linguistic and physical behavior. But Heather Shive and I (Pepperberg and Shive, 2001) demonstrated that Griffin showed comparable limited, parallel combinatorial development of three-item and three-label combinations. The percentages of physical and vocal combinations were roughly equal; vocal three-label combinations emerged only when Griffin initiated three-object combinations; only one of 14 vocal combination types that Griffin produced had been trained; and physical combinations were performed with his beak, not with his feet. Moreover, unlike Johnson-Pynn et al.’s Cebus (1999), Griffin’s physical tasks were untrained. Although Griffin’s – or even Alex’s (Pepperberg, 1999) – behavior matches neither human language nor combinatory behavior in complexity, our data show that parallel combinatory development is not limited to primates and that mammalian brains are not a prerequisite for such behavior. According to some researchers, responsible substrates are likely analogous, arising independently under similar evolutionary pressures; other researchers believe that the structures may indeed be homologous (see discussion in Medina and Reiner, 2000).
Parallel Evolution of Avian and Mammalian Abilities? Although human and animal behavior are not isomorphic, many species provide information on evolutionary pressures that helped shape existent systems (Pepperberg, 1999). Such pressures were not exclusive to primates; hence we see analogous complex avian communication systems and likely analogous neural architectures. Moreover, complex
communication apparently requires or coevolves with complex cognition: Although communication is functionally social, its complexity is based on the complexity of information communicated, processed, and received, thus contingencies that shape cognition (social, ecological, etc.) likely shape communication. If, as researchers such as Rozin (1976) and Humphrey (1976) have claimed, intelligence indeed correlates with primates’ complicated social systems and long lives – the outcome of selection processes favoring animals that flexibly transfer skills across domains and that remember/act upon knowledge of detailed group social relations – these patterns might drive parrot cognition and communication: long-lived birds with complex primate-like social systems might use abilities honed for social gains to direct information processing and vocal learning. When you add the need for categorical classes (e.g., to distinguish neutral stimuli from predators), the ability to recognize/remember environmental regularities yet adapt to unpredictable changes over extensive lifetimes, and a primarily vocal communication system, then parrots’ capacities are not surprising. Whether similar adaptive responses evolved independently for birds and humans under comparable environmental pressures is unclear, but a common core of skills likely underlies complex cognitive and communicative behavior across species, even if specific skills manifest differently. By looking for species commonalities, we can develop theories about behavioral elements essential to, and evolutionary pressures that shape, complex capacities (Pepperberg, 1999). 
See also: Animal Communication: Dialogues; Animal Communication Networks; Animal Communication: Overview; Animal Communication: Vocal Learning; Apes: Gesture Communication; Birdsong; Categorical Perception in Animals; Cognitive Basis for Language Evolution in Nonhuman Primates; Development of Communication in Animals; Individual Recognition in Animal Species; Nonhuman Primate Communication; Traditions in Animals; Vocal Production in Birds.
Bibliography
Baldwin D A (1995). ‘Understanding the link between joint attention and language.’ In Moore C & Dunham P J (eds.) Joint attention. Hillsdale, NJ: Erlbaum. 131–158.
Bruner J S (1977). ‘Early social interaction and language acquisition.’ In Schaffer H R (ed.) Studies in mother–infant interaction. London: Academic. 271–289.
Cheney D L & Seyfarth R M (1992). ‘Precis of ‘How monkeys see the world.’’ Behavioral and Brain Sciences 15, 135–182.
Gottfried G M & Tonks J M (1996). ‘Specifying the relation between novel and known: Input affects the acquisition of novel color terms.’ Child Development 67, 850–866.
Hollich G J, Hirsh-Pasek K & Golinkoff R M (2000). ‘Breaking the language barrier.’ Monographs of the Society for Research in Child Development 262, 1–138.
Humphrey N K (1976). ‘The social function of intellect.’ In Bateson P P G & Hinde R A (eds.) Growing points in ethology. Cambridge, UK: Cambridge University Press. 303–317.
Ikebuchi M & Okanoya K (1999). ‘Male zebra finches and Bengalese finches emit directed songs to the video images of conspecific females projected onto a TFT display.’ Zoological Science 16, 63–70.
Johnson-Pynn J, Fragaszy D M, Hirsh E M, Brakke K E & Greenfield P M (1999). ‘Strategies used to combine seriated cups by chimpanzees (Pan troglodytes), bonobos (Pan paniscus), and capuchins (Cebus apella).’ Journal of Comparative Psychology 113, 137–148.
Liittschwager J C & Markman E M (1994). ‘Sixteen- and 24-month olds’ use of mutual exclusivity as a default assumption in second-label learning.’ Developmental Psychology 30, 955–968.
Marler P (1970). ‘A comparative approach to vocal learning: Song development in white-crowned sparrows.’ Journal of Comparative and Physiological Psychology 71, 1–25.
Medina L & Reiner A (2000). ‘Do birds possess homologues of mammalian primary visual, somatosensory, and motor cortices?’ Trends in Neurosciences 23, 1–12.
Mowrer O H (1954). ‘A psychologist looks at language.’ American Psychologist 9, 660–694.
Pepperberg I M (1999). The Alex studies. Cambridge, MA: Harvard University Press.
Pepperberg I M (2002). ‘Cognitive and communicative abilities of Grey parrots.’ Current Directions in Psychological Science 11, 83–87.
Pepperberg I M & Brezinsky M V (1991). ‘Relational learning by an African Grey parrot (Psittacus erithacus): Discriminations based on relative size.’ Journal of Comparative Psychology 105, 286–294.
Pepperberg I M & Sherman D (2000). ‘Proposed use of two-part interactive modeling as a means to increase functional skills in children with a variety of disabilities.’ Teaching and Learning in Medicine 12, 213–220.
Pepperberg I M & Shive H A (2001). ‘Hierarchical combinations by a Grey Parrot (Psittacus erithacus): Bottle caps, lids, and labels.’ Journal of Comparative Psychology 115, 376–384.
Pepperberg I M & Wilcox S E (2000). ‘Evidence for a form of mutual exclusivity during label acquisition by Grey parrots (Psittacus erithacus)?’ Journal of Comparative Psychology 114, 219–231.
Pepperberg I M & Wilkes S (2004). ‘Lack of referential vocal learning from LCD video by Grey parrots (Psittacus erithacus).’ Interaction Studies 5, 75–97.
Pepperberg I M, Sandefer R M, Noel D & Ellsworth C P (2000). ‘Vocal learning in the Grey Parrot (Psittacus erithacus): Effect of species identity and number of trainers.’ Journal of Comparative Psychology 114, 371–380.
Rice M L, Huston A C, Truglio R & Wright J (1990). ‘Words from ‘‘Sesame Street’’: Learning vocabulary while viewing.’ Developmental Psychology 26, 421–428.
Rozin P (1976). ‘The evolution of intelligence and access to the cognitive unconscious.’ In Sprague J M & Epstein A N (eds.) Progress in psychobiology and physiological psychology, vol. 6. New York: Academic Press. 245–280.
Todt D (1975). ‘Social learning of vocal patterns and modes of their applications in Grey Parrots.’ Zeitschrift für Tierpsychologie 39, 178–188.
Communication in Marine Mammals V M Janik, University of St Andrews, Fife, UK © 2006 Elsevier Ltd. All rights reserved.
Marine mammals are not a monophyletic group but include species as phylogenetically different as whales, manatees, seals, and polar bears. Traditionally marine mammals have been treated as one group because of their adaptations to the marine environment. These adaptations can make distantly related marine species appear physically more similar than closely related terrestrial species. Behaviorally, marine mammals can differ tremendously. This allows for interesting comparisons of evolutionary pathways in animals that live in the same environment. This section concentrates on cetaceans (whales, dolphins, and porpoises, comprising 85 species), pinnipeds (seals and sea lions, comprising 35 species) and sirenians (sea cows, comprising 4 species). While polar bears, otters, and other mammals feeding on marine species have sometimes been considered to be marine mammals, they are not covered in this section. I will also focus on acoustic communication. While marine mammals use visual signals at close range, most species are rather limited in their physical expressiveness, due to the evolution of streamlined bodies suitable for swimming. Furthermore, most marine habitats have very limited underwater visibility, which favors the evolution of acoustic communication since it is not affected by visibility.
Types of Sounds Cetacean sounds are often split into three categories: tonal calls, clicks, and burst-pulsed sounds. Additionally, cetaceans use a variety of other mammalian-type sounds that are described as barks, grunts, squeals, or similar onomatopoetic descriptors. Little is known about the production mechanisms of these sounds. Tonal calls include many elements of baleen whale songs and whistles of dolphins (reviewed in Tyack and Clark, 2000). The most famous case of a marine mammal acoustic display is probably the song of the humpback whale (Megaptera novaeangliae), which
typically consists of five to nine different themes (Payne and Payne, 1985) (Figure 1). Each theme consists of a repeated phrase. A phrase consists of a stereotypical sequence of elements. Each phrase lasts for around 15 seconds, but it can be repeated multiple times before the animal switches to the next theme. A whole song usually lasts between 8 and 16 minutes. The sequence of themes in a song is fairly stable, but around 10% of theme transitions in songs skip themes in the sequence. Humpback whales repeat their song many times in singing bouts that can last for several hours. It appears that only males produce song, primarily during the breeding season. All individuals in one population share the same song. However, the song sung by all males is different at the start and the end of the breeding season. In the following breeding season they start out with the last song they sang in the previous year. Thus, song changes over the years in synchrony among all males in a population. The only explanation for this change is that males use vocal learning to alter their own song in relation to what other whales produce. A similar pattern seems to exist in bowhead whales (Balaena mysticetus) with a somewhat simpler song organization and less change in the song. Blue (Balaenoptera musculus) and fin whales (Balaenoptera physalus) produce even simpler sequences, consisting of a repeated, single element. In fin whales it appears that only males sing while this is not known for blue whales. Once baleen whales come together in social groups, they produce a variety of burst-pulsed sounds often called grunts or snorts. Some species like the northern (Eubalaena glacialis) and southern right whales (Eubalaena australis) and the gray whale (Eschrichtius robustus) do not appear to sing at all but only engage in sound production during such social encounters. Dolphins and other toothed whales also do not produce song. 
Their signals are either used in social interactions akin to the way primates use their signals or for echolocation. Echolocation is a specific adaptation to explore the environment, as is found in bats. An animal produces a click sound and listens to echoes from sound-reflecting objects extracting object feature information from the acoustic parameters
Figure 1 Spectrogram of a section of a humpback whale song. Themes are separated by thick vertical lines. Phrases within themes are separated by thin vertical lines. Recording kindly provided by P. Miller. (FFT size 1024, Flat top window, DF: 44 Hz, DT: 23 ms.)
of the echo. Most cetacean echolocation signals consist of high-frequency clicks that can extend to above 200 kHz. It has also been argued that the repetitive sequences of fin and blue whale calls could be used for long-range echolocation (Tyack and Clark, 2000). Similarly, sperm whales (Physeter macrocephalus) produce click trains with a bandwidth of 0.1–30 kHz and temporal patterning similar to that of the echolocation click trains used by dolphins. However, this does not mean that every click is an echolocation signal. Sperm whales, porpoises, and some dolphin species have almost exclusively click signals in their repertoires. These clicks are used in social contexts as
well and the distinction between echolocation and social signal can be difficult in these species. Most dolphins use tonal whistles extensively as social signals (Tyack, 1999). Many of these have harmonics extending to more than 100 kHz, but the fundamental frequency is mostly between 2 and 30 kHz. Burst-pulsed sounds are signals that consist of very fast click trains that result in a tonal quality to the human ear. Killer whales (Orcinus orca) use these signals as their main social signal and bottlenose dolphins (Tursiops truncatus) have a wide variety of burst-pulsed sounds at their disposal. However, many signals produced by cetaceans do not fall into any of
Figure 2 Spectrogram of a harbor seal song. It consists of only one element, which is repeated many times. (FFT size 1024, Flat top window, DF: 37 Hz, DT: 27 ms.)
the three traditional categories of clicks, burst-pulsed sounds and tonal sounds (Janik, 1999). Bottlenose dolphins, for example, produce gunshot-like sounds, pops, squeals, and other sounds that reflect their versatility in sound production. Little is known about these signals because they are common in close social encounters when it is difficult to observe the exact call context. Pinnipeds produce sounds underwater and in air. Most of their sounds lie in a frequency band between 0.1 and 10 kHz (review in Richardson et al., 1995). Like baleen whales, the walrus (Odobenus rosmarus) and several of the so-called true seals (Phocids) produce underwater song. Examples are the Weddell seal (Leptonychotes weddellii), the bearded seal (Erignathus barbatus), the harp seal (Pagophilus groenlandicus), and the harbor seal (Phoca vitulina). In all of these species, calling activity peaks in the mating season. The simplest song comes from the harbor seal. Males produce a low-frequency growl sound that is repeated many times (Figure 2). The call itself lasts between 5 and 15 seconds. A similar pattern can be found for the walrus, where males produce repetitive sequences of a sound reminiscent of a church bell. Bearded seal songs consist of very tonal trill sounds that are around 14–30 seconds long. There are six basic trill types, which most often occur alone but can be produced in longer sequences. Only males are believed to produce these calls. In harp and Weddell seals, both sexes have been confirmed to produce underwater calls. However, the general behavior during the breeding season suggests that male calls play a role in mate attraction and territorial defense, the main two song functions in other species. Males also appear to have more call types in their repertoire than
females. Repertoire sizes are difficult to determine, since little is known about how animals classify signals they perceive. Thus, researchers can look at spectrograms and sort calls into types, but it is unclear to what extent this reflects classification by a marine mammal. Not surprisingly though, learning seems to lead to larger and more complex repertoires. Sea lions and fur seals do not appear to produce song. Their sound repertoires consist of bark and moan sounds as well as noisy growls, very similar to those produced by dogs. Most of these are produced in air at haul-out sites. To date there is no evidence for vocal learning in these groups. Hybrids of Antarctic (Arctocephalus gazella) and Subantarctic fur seals (Arctocephalus tropicalis) display a hybrid repertoire of calls suggesting a strong genetic influence on call development (Page et al., 2000). Northern elephant seals (Mirounga angustirostris) have been thought to have different dialects between colonies but these differences could be traced back to a founder effect when small groups of animals repopulated colony sites after heavy depletion (Janik and Slater, 1997). Once the number of animals increased, vocal differences disappeared. Sirenians are relatively quiet animals. Manatees produce very tonal whistle sounds, while dugongs appear to have a larger repertoire of sounds used in social interactions.
Sound Transmission and Active Space Sound propagation in the sea is much better than in air. In addition to normal spreading loss, an underwater sound of 1 kHz loses around 0.04 dB/km through absorption while the same sound in air loses 4 dB/km. The result is that marine mammal calls have a much larger active space than those of most terrestrial animals. The active space is defined as the area over which a receiver can detect the call of a conspecific. Most marine mammals can produce underwater sounds that have an active space of more than 10 km (reviewed in Janik, 2005). Calls of baleen whales can even be detected over several hundred kilometers. The most detailed information is available for bottlenose dolphins and killer whales. Both species appear to have a maximum active space of around 25 km. However, high-frequency components are attenuated more rapidly and are often more directional than low-frequency ones. Thus, while it is likely that animals can detect each other over these distances, recognition of a particular call or individual may only be possible at closer range. Data on the active space of pinniped or sirenian calls are not available, but most of the seals that produce underwater song, like bearded seals, Weddell seals, and harp seals, can be heard over more than 20 km.
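As a rough illustration of the absorption figures above, the following Python sketch compares the one-way transmission loss of a 1 kHz sound in water and in air. It assumes spherical spreading (20 log10 of the range in meters) as the spreading term, which is a simplification not taken from the text; real propagation also depends on depth, temperature, and boundary reflections. The function and constant names are hypothetical, and only the two absorption coefficients come from the article.

```python
import math

# Absorption coefficients for a 1 kHz sound, as quoted in the text (dB per km).
ABSORPTION_WATER_DB_PER_KM = 0.04
ABSORPTION_AIR_DB_PER_KM = 4.0


def transmission_loss_db(range_m: float, absorption_db_per_km: float) -> float:
    """One-way transmission loss in dB relative to a 1 m reference distance.

    ASSUMPTION: spherical spreading, 20 * log10(range in meters). The text
    only speaks of "normal spreading loss", so this term is illustrative.
    """
    if range_m < 1.0:
        raise ValueError("range_m must be at least 1 m (the reference distance)")
    spreading_db = 20.0 * math.log10(range_m)
    absorption_db = absorption_db_per_km * (range_m / 1000.0)
    return spreading_db + absorption_db


if __name__ == "__main__":
    for range_km in (1, 10, 25):
        range_m = range_km * 1000.0
        water = transmission_loss_db(range_m, ABSORPTION_WATER_DB_PER_KM)
        air = transmission_loss_db(range_m, ABSORPTION_AIR_DB_PER_KM)
        print(f"{range_km:>3} km: water {water:6.1f} dB, air {air:6.1f} dB")
```

Under these assumptions the spreading term is identical in both media, so the whole difference comes from absorption: over 10 km it adds about 0.4 dB in water but about 40 dB in air.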
While active spaces of low-frequency signals underwater tend to be large, there are many marine mammal underwater signals that are much quieter and do not travel nearly as far (reviewed in Janik, 2005). Most marine mammals produce sounds at different source levels, many of which are only audible to conspecifics within 100 m or less. Furthermore, several species, such as porpoises and dolphins belonging to the genus Cephalorhynchus, hardly use low-frequency sounds (i.e., <20 kHz), but instead use click sounds to communicate. Thus, their signals are subject to much larger transmission loss. Communicative clicks are very similar to echolocation clicks and travel only a few hundred meters, making the active space of these signals relatively small. Communication networks in these species are therefore much smaller and more comparable to those found in some terrestrial species. The difference between sound transmission in air and in water is demonstrated best by comparing large underwater active spaces with those of in-air calls of elephant seals (Southall et al., 2003). The in-air female attraction call of a pup only reaches animals within 5 to 70 m, depending on ambient noise conditions. The largest active space was found for the male clap threat call, which can travel up to 507 m in quiet conditions. In air, seal colonies often cause an impressive amount of noise. Such colonies can be detected over a few kilometers, but the recognition of single individuals is largely masked by the noise caused by other members of the colony. Thus, recognition of call types or individuals is likely to be compromised by biological background noise.
Social Contexts Marine mammal sounds can also be split by function into echolocation signals and social signals, including calls and song. Echolocation has been demonstrated in several toothed whale species, and larger whales may be able to extract information about objects or whole shorelines using low-frequency sounds. Bottlenose dolphins can also extract information from echolocation signals of conspecifics (Xitco and Roitblat, 1996). Several authors have speculated about how dolphins swimming side by side perceive echoes from signals sent out by the animal next to them. This may lead to a wider field of coverage, letting individuals share the same perceptual experience. However, clicks are also used in communication and different types of clicks have been identified that are used in different social contexts. Bottlenose dolphins can be trained to modify click as well as whistle parameters, giving them the potential to produce a vast number of different signals (review in Janik and Slater, 1997).
As in terrestrial mammals, many marine mammals have been shown to use calls if a mother and a calf become separated. Examples are manatees, harbor seals, southern right whales, California sea lions, and bottlenose dolphins, but it is likely to be much more widespread. Bottlenose dolphins are particularly interesting in this context, since they do not simply produce a shared isolation call, but develop individually distinctive signature whistles (review in Janik, 1999) that are used if closely bonded animals are separated from each other (Janik and Slater, 1998) (Figure 3). Each individual develops its own signature whistle in the first few months of life and uses it into adulthood to indicate identity and position. Signature whistle development is strongly influenced by learning. It appears that individuals use other animals’ whistles as models and develop their own whistle by modifying an existing model. Signature whistles are more plastic in males than in females. Initially, the signature whistle of many males is similar to that of the mother but it may change later in life when a male enters into an alliance with other males. Males in an alliance tend to have similar whistles, too, suggesting a change by at least some of the animals (Watwood et al., 2004). The signature whistle of a female, on the other hand, is very dissimilar to that of her mother from the start and it is remarkably stable even over decades (Sayigh et al., 1990). Fifty-two percent of all whistles used by wild, unrestrained bottlenose dolphins are signature whistles (Cook et al., 2004). Janik and Slater (1998) showed that dolphins tend to produce signature whistles when isolated from their group, demonstrating the importance of signature whistles in group cohesion. Bottlenose dolphins have also been found to imitate each other’s signature whistles, most likely in an attempt to address a specific individual (Janik, 2000). 
Given that this is a learned signal, signature whistles could be considered names of animals. However, we do not know to what extent a receiving dolphin has a mental representation of an individual if it hears that individual’s signature whistle. Several marine mammal calls have group-specific features that allow a researcher to identify which group it came from. Killer whales, for example, use group-specific repertoires of burst-pulsed sounds (review in Janik and Slater, 2003). These calls appear to be acquired through vocal learning and show only subtle changes over long periods of time. Killer whale groups off British Columbia that specialize in preying on fish socialize frequently and share many of their calls. Others in the same geographic area specialize in preying on smaller marine mammals and do not socialize or share calls with the fish eaters. Different killer whale call repertoires are often referred to as
Figure 3 Spectrograms of signature whistles of four bottlenose dolphins. Each line shows three examples of the signature whistle of one individual. Signature whistles are very different among the four individuals but highly stereotypical within individuals. (FFT size 1024, Hamming window, DF: 56 Hz, DT: 18 ms.)
dialects. Sperm whales and blue whales display a similar pattern, but here animals belong to much larger groups that, unlike killer whales, do not appear to be purely matrilineal (review in Janik, 2005). These groups are characterized by shared vocal features and are called vocal clans. Different clans are sympatric but rarely socialize with each other. It appears likely that closed groups like those of killer whales use the degree of sharing to recognize their own and other known groups. It is less clear whether acoustic differences are functional in clans. Clan variation could be used for recognition or be a byproduct of genetically or culturally inheriting calls from relatives that show a particular association or mating pattern. In pinnipeds, group specific calls have not been described, largely because they have not been shown to form the same individualized social groups that can be found in cetaceans. Several pinniped species show geographic variation in their call repertoires (review in Janik and Slater, 1997), but this is most likely caused by genetic drift and environmental differences. In some species geographic variation has been
described for haul-out sites that were only a few tens of kilometers apart. However, most of these did not stand up to closer scrutiny of longer-term data. Other calls that are common in animals are alarm and food calls. There are no specific alarm calls in marine mammals. In fact, most of them become remarkably quiet once they detect a predator. This can be explained by the behavior of the most common predator, the killer whale, which seems to use passive listening to detect prey. Food-related calls have been described for bottlenose dolphins. Many observations exist from other dolphin species rushing toward a feeding site from several kilometers away, suggesting a similar cue. Such calls can be beneficial to the sender for different reasons. They may indicate a food source to relatives. If attracted animals are not related to the caller, they may aid in prey capture by involuntarily chasing fish toward the caller. Finally, such calls may be directed at prey, perhaps modifying prey behavior in a predictable way to facilitate capture. Other dolphins may use these calls as a cue to a food source, but the signal did not evolve to attract them. Alarm and
food calls in marine mammals are reviewed in Janik (2005). Finally, territory defense and mate attraction are very prominent contexts in marine mammal sound production. As in birds, it is difficult to distinguish between these two functions. Many male marine mammals produce elaborate song displays during the breeding season (see above). Playback studies have been conducted with humpback whales and Weddell seals. In both species, males react to the playback of song while females appear less interested. However, while males may want to repel an intruder immediately, females may need much longer listening periods before they decide on a mate. Thus, their reactions would be less apparent in a playback study where stimuli are only played for a few minutes. Territorial defense is also an obvious calling context at pinniped haul-out sites.
Traditions in Marine Mammal Calls In animal communication research, the term ‘tradition’ refers to vocal displays that are learned socially and remain in a population for a set amount of time. This time window has been defined differently by different authors, so that most studies concentrate on whether a display was acquired through social learning or not and how many individuals share it at any one time. Recently, some authors have used the term ‘culture’ instead of tradition. However, the definition used for ‘culture’ was the same as the one given for ‘tradition’ here. Vocal traditions have been described in a variety of animal species. They occur by definition in all vocal learners, like birds and marine mammals. Examples are the group-specific call repertoires of killer whales or sperm whale clans, the geographic variation of whistle parameters found in bottlenose dolphins, and seemingly the small-scale geographic variation in calls of some phocid seals (review in Janik and Slater, 2003). Even the changing songs of humpback whales can be considered to be traditions, albeit short-lived ones. While usually 63% of all themes in a humpback whale song are shared between all individuals in subsequent years, this rate is not stable. An observed immigration of a few individuals from the west to the east coast of Australia led to a complete adoption of the west coast song by east coast animals in only one year. While the description of such patterns is a useful first step, the question remains whether such differences have a function. In killer whales, the stability of differences between call repertoires of sympatric groups suggests that variation helps in group recognition. Larger scale variation, however, may just be a byproduct of copying errors when social
learning occurs. This can lead to a drift in the trait that could explain the geographic variation found in calls of seals and bottlenose dolphins.
Cognitive Abilities Affecting Communication Marine mammals have been shown to pay careful attention to their acoustic environment and they have remarkable sound production and perception abilities (review in Janik, 2005). Harbor seals in the wild, for example, learn to recognize calls of mammal-eating killer whales but ignore those of fish-eating killer whales. Harbor seals are also capable of vocal learning (Ralls et al., 1985), the ability to modify signals in form as a result of experience with those of other individuals (Janik and Slater, 1997). Several captive individuals have been found to copy human speech sounds, probably one of the most convincing demonstrations of vocal learning since it requires copying sounds from another species. However, evidence for vocal learning in other pinnipeds is sparse. Cetaceans also show impressive vocal learning skills (review in Janik and Slater, 1997). A careful investigation of the ability to copy computer-generated tonal signals has been conducted with bottlenose dolphins (Richards et al., 1984). Once the animal had learned the paradigm it was able to produce close imitations of novel sounds in the first trial they were used. The patterns of change in humpback whale song also indicate that learning plays a major role in their maintenance. Gray seals can learn to produce existing signals of their repertoire in novel contexts (Shapiro et al., 2004), and bottlenose dolphins can be taught to use learned signals to label objects in their tank. Once some objects have been removed dolphins can report presence or absence of an object by pressing one of two paddles if artificial labels are used to request this information (Herman and Forestell, 1985). Bottlenose dolphins also seem to be capable of responding correctly to referential pointing gestures performed by humans (Herman et al., 1999), even if an object is placed in a novel location. 
In pointing studies on other species the experimental setup was usually more restricted so that an apparently correct response to pointing could often be explained by a simple conditioning effect causing the animal to move to the left or the right. This summary demonstrates that marine mammals have advanced cognitive abilities that can facilitate complex communication.
Animal Language Studies Before the discovery that each individual dolphin has its own signature whistle type, dolphin whistle
repertoires were thought to be boundless, since every investigation of a new individual added novel whistle types. This led to fairly naïve attempts to apply linguistic methods to the study of dolphin communication. For example, several researchers applied Zipf’s analysis to dolphin whistles and found that the resulting slope is comparable to that found in humans (e.g., Dreher, 1961). This was taken as evidence that dolphin communication is as complex as human language. However, it is widely accepted now that Zipf’s law can apply to a large variety of processes and does not necessarily imply transmission of complex information (Suzuki et al., 2005). John Lilly tried to establish interspecies communication between humans and dolphins by constructing a living environment that allowed a volunteer to live with a dolphin over prolonged periods of time. He hoped that this situation would lead to the development of complex communication between the two. The study failed. It also lacked scientific rigor since the volunteer was also the observer for the study (review in Tyack, 1999). Early experiments on information transmission in dolphins were conducted by Bastian (1967). He placed two individuals in the same pool with a visual barrier between them. Each animal had a set of two paddles in its enclosure. Bastian then trained them in a task where only one individual could see a light that indicated which paddle to press. However, only if both animals pressed the correct paddle did they receive a reward. The animals performed successfully in this task. Bastian argued that this demonstrated referential communication about which paddle to press between the two individuals. However, it does not imply that the first dolphin invented a signal for left and right. It is more likely that the second individual received acoustic cues from the position of the first one. Dolphins usually echolocate as they approach a target. 
When the first animal approached the two different paddles in different locations in the pool, the resulting echoes from pool walls audible to the second animal could have been distinctive enough to serve as conditioning stimuli for that second animal in the paddle pressing task. This means that the result achieved by Bastian can be explained without any intentional communication taking place. The first animal was trained by the experimenter to press one of two paddles, and the second one could have learned to use acoustic cues to achieve the same result. Unfortunately, the necessary tests to identify the cue used by the second animal have not been conducted. Bastian did notice a large number of click sounds during the performance of the task. Nevertheless, the explanation given here could still be too simple. As we have seen earlier, Richards et al. (1984) demonstrated that dolphins can learn to use novel sounds as labels for objects. Thus, a more
complicated explanation that involves referential communication with novel signals is theoretically still possible. Animal language studies have been conducted on two marine mammal species, the bottlenose dolphin (Herman et al., 1984; Herman et al., 1993) and the California sea lion (Schusterman and Krieger, 1986; Gisiner and Schusterman, 1992). In 1984 Herman and his colleagues published their results on the comprehension of sentences by bottlenose dolphins. This long-term study used artificial communication systems to instruct two female dolphins what tasks to perform. In the first instance it consisted of signals for objects and actions that could be combined in different ways. One individual was exposed to a purely acoustic system, while the other one received hand signals. Each dolphin was taught a different syntax. While the first dolphin received sentences structured as OBJECT-ACTION-OBJECT, the second one would receive the same message as OBJECT-OBJECT-ACTION. An example in which the dolphin was supposed to bring a surfboard to a hoop, two of many items in the pool, would read SURFBOARD-FETCH-HOOP in the first system and SURFBOARD-HOOP-FETCH in the second. Once these systems were established, Herman and his colleagues added additional signals called modifiers that indicated object location or direction. The vocabulary in each of these systems consisted of 35 to 40 signals. Novel sequences of these signals were used to test the dolphins’ ability to process syntax. Herman and colleagues found that the animals reacted correctly to almost all novel sequences. Another test involved presenting incomplete or incorrect sequences. These tests seemed to demonstrate that the dolphins had a concept of the artificial language system and its rules. 
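The two word orders can be illustrated with a minimal sketch. It assumes a toy vocabulary of three objects and one action (the real systems used 35 to 40 signals), hypothetical names (`TEMPLATES`, `Command`, `parse`), and, following the surfboard example in the text, that the first object is the item to be moved and the second is its destination in both systems. It is not a reconstruction of the procedure used in the study; it only shows how one instruction maps onto two different category-based sequences, and how an incomplete or misordered sequence fails to match.

```python
from dataclasses import dataclass

# Hypothetical toy vocabulary; the systems described in the text had 35 to 40 signals.
OBJECTS = {"SURFBOARD", "HOOP", "BALL"}
ACTIONS = {"FETCH"}

# Category patterns for the two word orders described in the text.
TEMPLATES = {
    "acoustic": ("OBJECT", "ACTION", "OBJECT"),
    "gestural": ("OBJECT", "OBJECT", "ACTION"),
}


@dataclass(frozen=True)
class Command:
    action: str
    item: str         # object to be moved (assumed: first object in the sequence)
    destination: str  # object it is taken to (assumed: second object)


def category(signal):
    """Return the category of a signal, or None if it is not in the vocabulary."""
    if signal in OBJECTS:
        return "OBJECT"
    if signal in ACTIONS:
        return "ACTION"
    return None


def parse(sequence, system):
    """Map a hyphen-separated sequence onto a Command, or None if it breaks the rule."""
    signals = sequence.split("-")
    if tuple(category(s) for s in signals) != TEMPLATES[system]:
        return None  # incomplete, misordered, or unknown signals
    objects = [s for s in signals if category(s) == "OBJECT"]
    action = next(s for s in signals if category(s) == "ACTION")
    return Command(action=action, item=objects[0], destination=objects[1])


if __name__ == "__main__":
    print(parse("SURFBOARD-FETCH-HOOP", "acoustic"))
    print(parse("SURFBOARD-HOOP-FETCH", "gestural"))
    print(parse("SURFBOARD-FETCH", "acoustic"))
```

Both well-formed sequences yield the same `Command`, while the incomplete sequence yields `None`. A rule of this kind can be applied by simple category matching, which is one reason success on such tasks need not imply human-like syntax.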
Kako (1999) compared the syntactical abilities of three language-trained species, the gray parrot, the bonobo, and the bottlenose dolphin, concentrating on discrete combinatorics, category-based rules, argument structure, and closed-class items. Discrete combinatorics refers to the situation when word meanings do not blend into each other when they are combined, but stay discrete. Category-based rules are rules that determine where a word of a particular category can occur in a sequence. To understand argument structure an animal needs to know how many arguments each verb can have, where these arguments are positioned in a sequence and what relationship they have to the verb in these positions. Finally, closed-class items are items that provide structure in a syntax system rather than referring to actions or objects. Examples are prepositions, quantifiers and determiners like A or THE. Kako found evidence for correct processing of the first three aspects in bonobos, parrots, and dolphins.
Closed-class items were only investigated in dolphins, where it has been shown that they can process items such as the demonstrative pronoun THAT and the conjunction AND. It is important, though, to keep in mind that using syntactical category names is problematic in the interpretation of the performances of language-trained animals. Animals may use their learning skills to perform successfully in such tasks without necessarily using anything like human syntax. Schusterman and colleagues tried to replicate the dolphin results with a female California sea lion using the OBJECT-FETCH-OBJECT structure (review in Schusterman and Gisiner, 1997). They found that the sea lion was capable of processing comparable to that demonstrated in Herman’s study (although closed-class items were not tested). However, while Herman argued that dolphins have linguistic capacities, Schusterman and colleagues used a more cautious approach. They suggested that the performances by dolphins and sea lions can be explained by the formation of functional equivalence classes between signals and objects and conditional discrimination learning abilities, and that the development of linguistic concepts was unnecessary to produce the required action sequences. There were several differences between the studies on dolphins and sea lions. The dolphins were taught contrasting terms in relational sentences, for example different signals for taking one object to another or placing one object inside another one. The sea lion, on the other hand, was taught a different set of modifiers that included object features such as size and color. One major difference between the dolphins’ and the sea lion’s performances was in their responses to incomplete or incorrect sequences. While the dolphins often used parts of the sequence to perform a behavior sequence, the sea lion frequently refused to perform at all. However, Gisiner and Schusterman (1992) argue that this can be explained by different training regimes. 
While the dolphins were rewarded for any performance in a test trial, the sea lion would never be rewarded in test trials. Thus, it paid for the dolphins to produce a behavioral sequence while it did not for the sea lion. If the dolphins or the sea lion did respond, the way they performed in these trials suggested a close attention to the components of the command sequences they received. However, the extensive exposure to similar sequences in their training history makes it difficult to interpret performances. Thus, it is useful to see the results of these experiments as a remarkable demonstration of rule-learning skills that might also be relevant in humans for language comprehension, but not necessarily as evidence for a linguistic capacity as such, since the animals may use these skills in their own communication system in a very different way.
Language studies that required the animals to produce sounds themselves have been sparse, and an attempt to teach a dolphin to use a keyboard for interspecies communication had little success. Thus, it is difficult to compare the production side of the system with results found in other animals like parrots and great apes. See also: Animal Communication: Dialogues; Animal Communication: Long-Distance Signaling; Animal Communication Networks; Animal Communication: Vocal Learning; Development of Communication in Animals; Individual Recognition in Animal Species; Traditions in Animals.
Bibliography Bastian J (1967). ‘The transmission of arbitrary environmental information between bottlenose dolphins.’ In Busnel R G (ed.) Animal sonar systems – biology and bionics. Jouy-en-Josas: Laboratoire de Physiologie Acoustique. 803–873. Cook M L H, Sayigh L S, Blum J E & Wells R S (2004). ‘Signature-whistle production in undisturbed free-ranging bottlenose dolphins (Tursiops truncatus).’ Proceedings of the Royal Society of London B 271, 1043–1049. Dreher J J (1961). ‘Linguistic considerations of porpoise sounds.’ Journal of the Acoustical Society of America 33, 1799–1800. Gisiner R C & Schusterman R J (1992). ‘Sequence, syntax, and semantics: responses of a language-trained sea lion (Zalophus californianus) to novel sign combinations.’ Journal of Comparative Psychology 106, 78–91. Herman L M, Richards D G & Wolz J P (1984). ‘Comprehension of sentences by bottlenosed dolphins.’ Cognition 16, 129–219. Herman L M & Forestell P H (1985). ‘Reporting presence or absence of named objects by a language-trained dolphin.’ Neuroscience and Biobehavioral Reviews 9, 667–681. Herman L M, Kuczaj S & Holder M D (1993). ‘Responses to anomalous gestural sequences by a language-trained dolphin: evidence for processing of semantic relations and syntactic information.’ Journal of Experimental Psychology: General 122, 184–194. Herman L M, Abichandani S L, Elhajj A N, Herman E Y K, Sanchez J L & Pack A A (1999). ‘Dolphins (Tursiops truncatus) comprehend the referential character of the human pointing gesture.’ Journal of Comparative Psychology 113, 347–364. Janik V M (1999). ‘Origins and implications of vocal learning in bottlenose dolphins.’ In Box H O & Gibson K R (eds.) Mammalian social learning: comparative and ecological perspectives. Cambridge: Cambridge University Press. 308–326. Janik V M (2000). ‘Whistle matching in wild bottlenose dolphins (Tursiops truncatus).’ Science 289, 1355–1357. Janik V M (2005). 
‘Acoustic communication networks in marine mammals.’ In McGregor P K (ed.) Animal communication networks. Cambridge: Cambridge University Press. 390–415.
Janik V M & Slater P J B (1997). ‘Vocal learning in mammals.’ Advances in the Study of Behavior 26, 59–99. Janik V M & Slater P J B (1998). ‘Context-specific use suggests that bottlenose dolphin signature whistles are cohesion calls.’ Animal Behaviour 56, 829–838. Janik V M & Slater P J B (2003). ‘Traditions in mammalian and avian vocal communication.’ In Perry S & Fragaszy D (eds.) The biology of tradition: models and evidence. Cambridge: Cambridge University Press. 213–235. Kako E (1999). ‘Elements of syntax in the systems of three language-trained animals and “commentaries.”’ Animal Learning and Behavior 27, 1–27. Page B, Goldsworthy S D & Hindell M A (2000). ‘Vocal traits of hybrid fur seals: intermediate to their parental species.’ Animal Behaviour 61, 959–967. Payne K & Payne R (1985). ‘Large scale changes over 19 years in songs of humpback whales in Bermuda.’ Zeitschrift für Tierpsychologie 68, 89–114. Ralls K, Fiorelli P & Gish S (1985). ‘Vocalizations and vocal mimicry in captive harbor seals, Phoca vitulina.’ Canadian Journal of Zoology 63, 1050–1056. Richards D G, Wolz J P & Herman L M (1984). ‘Vocal mimicry of computer-generated sounds and vocal labeling of objects by a bottlenosed dolphin, Tursiops truncatus.’ Journal of Comparative Psychology 98, 10–28. Richardson W J, Greene C R, Malme C I & Thomson D H (1995). Marine mammals and noise. San Diego: Academic Press. Sayigh L S, Tyack P L, Wells R S & Scott M D (1990). ‘Signature whistles of free-ranging bottlenose dolphins, Tursiops truncatus: mother-offspring comparisons.’ Behavioral Ecology and Sociobiology 26, 247–260. Schusterman R J & Krieger K (1986). ‘Artificial language comprehension and size transposition by a California sea
lion (Zalophus californianus).’ Journal of Comparative Psychology 100, 348–355. Schusterman R J & Gisiner R C (1997). ‘Pinnipeds, porpoises, and parsimony: animal language research viewed from a bottom-up perspective.’ In Mitchell R W, Thompson N S & Miles H (eds.) Anthropomorphism, anecdotes, and animals. New York: State University of New York Press. 370–382. Shapiro A D, Slater P J B & Janik V M (2004). ‘Call usage learning in gray seals (Halichoerus grypus).’ Journal of Comparative Psychology 118, 447–454. Southall B L, Schusterman R J & Kastak D (2003). ‘Acoustic communication ranges for northern elephant seals (Mirounga angustirostris).’ Aquatic Mammals 29, 202–213. Suzuki R, Buck J R & Tyack P L (2005). ‘The use of Zipf’s law in animal communication analysis.’ Animal Behaviour 69, F9–F17. Tyack P L (1999). ‘Communication and cognition.’ In Reynolds J E & Rommel S A (eds.) Biology of marine mammals. Washington: Smithsonian Institution Press. 287–323. Tyack P L & Clark C W (2000). ‘Communication and acoustic behavior of dolphins and whales.’ In Au W W L, Popper A N & Fay R R (eds.) Hearing by whales and dolphins. New York: Springer Verlag. 156–224. Watwood S L, Tyack P L & Wells R S (2004). ‘Whistle sharing in paired male bottlenose dolphins, Tursiops truncatus.’ Behavioral Ecology and Sociobiology 55, 531–543. Xitco M J & Roitblat H L (1996). ‘Object recognition through eavesdropping: passive echolocation in bottlenose dolphins.’ Animal Learning and Behavior 24, 355–365.
Communication, Understanding, and Interpretation: Philosophical Aspects D Hunter, State University of New York at Buffalo, NY, USA © 2006 Elsevier Ltd. All rights reserved.
Philosophers have asked questions about both the nature and the extent of communication. What, for instance, is the difference between expressing a belief and communicating one, and what role does language play in communication? What must one know to interpret another person, and can we ever really understand another person? Philosophers are still at work refining these questions and considering answers. This discussion aims to sketch the direction of this work.
The Nature of Communication There might seem nothing especially puzzling about the nature of communication. After all, we seem to communicate with each other all the time and succeed, largely without having to reflect on the nature of our success. We tell others what we think or believe, and they in turn tell us what they think or believe. One way to begin to see what philosophers have found puzzling about the nature of communication is to consider whether nonhuman animals – dogs and cats, for instance – ever communicate with each other or with us. Suppose that Fido knocks his food bowl over whenever he is hungry, and he does it only in the presence of his master, whom he then looks at
intently. Is Fido trying to communicate with his master? To answer this, we need to know more about what it is to try to communicate. One initially helpful distinction is between communicating a belief and merely manifesting or revealing one. Agents manifest or reveal something about their beliefs, desires, and intentions whenever they act, and ordinarily we can explain and predict these actions by reference to what (we think) they believe, want, and intend. This applies as much to dogs and cats as to people. We might, for instance, speculate that Fido is knocking his bowl over in part because he knows it is empty and wants it to be full. We might even, somewhat more ambitiously perhaps, say that Fido knows that if he knocks the bowl around his master will fill it for him. These attitudes are manifested by Fido’s actions, in the sense that we can infer from the actions that he has these beliefs and desires. But, of course, it is one thing for an action to reveal what one believes or desires and another to communicate what one believes or desires. After all, sometimes our actions reveal more than we want, as when we let something slip or when someone is spying on us. Genuine communication, it seems, requires intending to reveal what one believes; it requires manifesting one’s beliefs or desires on purpose. Communication is thus an intentional activity. Did Fido intend to communicate his beliefs and desires when he knocked the bowl over? Part of the difficulty in answering this stems from an ambiguity in the phrase ‘communicate what he believes’. In one sense, for Fido to communicate what he believes is for him to communicate, in this case, that his bowl is empty, that it should be full, and that he is hungry. In this sense, what is communicated are the facts, or at least the facts as Fido takes them to be. But in another sense, for Fido to communicate what he believes is for him to communicate the fact that he has those beliefs and desires. 
In this sense, what is communicated is the fact that Fido takes or wants the world to be some way. Plausibly, to communicate what one believes or desires in the second sense, one must be aware that one has those beliefs and desires; one might doubt whether dogs and cats are aware of their own beliefs and desires. To doubt this is not to doubt whether dogs and cats have beliefs and desires or whether their actions are caused by their beliefs and desires. One can admit all of this while still doubting whether dogs and cats have the level of self-awareness necessary for intending to reveal that they have certain beliefs and desires. It is an empirical question whether dogs and cats have this level of self-awareness. In any event, the two senses of ‘communicate what one believes’ differ only over what is communicated and not over what it is to
communicate. So even if it is true that Fido is incapable of communicating that he believes that his bowl of food is empty, it might still be true that he can communicate that his bowl of food is empty. But before we can decide this, there is still more we need to know about what Fido is trying to do. More specifically, is Fido trying through his action to influence his master’s beliefs or actions? Is Fido trying to get his master to believe something, or is he trying to get him to do something? To try to influence a person’s beliefs requires, it seems, some awareness that that person has beliefs that can be influenced. I think it is clear that this is something that humans do when they communicate with one another. But if one doubted that dogs and cats are aware of their own beliefs and desires, then one would likely doubt that they are aware that people have beliefs and desires. Still, we sometimes communicate with others with the intention of influencing their actions, as when we give warnings or orders. But perhaps we do this with the primary intention of influencing their beliefs, hoping that this will lead to the desired action. What is clear, in any event, is that we humans often do communicate with the intent of influencing the beliefs and desires of other people, whether or not this intention is required for genuine communication. It is also clear that, whether it is necessary or not, acting with an intention to influence someone’s beliefs is not enough for genuine communication. I might leave the milk carton on the counter, intending that when my wife notices that it is empty she will plan to buy one on her way home. If my plan succeeds, I will have influenced her beliefs. But it seems wrong to say that I would have communicated to her that the milk carton is empty, or that we need more milk, even though I deliberately caused her to believe these things. 
If, instead, I had made a show of holding the carton upside down in her presence I might well have communicated these things to her. But what is the relevant difference? Part of the difference, in this example anyway, is that in the second case my wife would know that I am trying to influence her beliefs. She would recognize that in making a show of holding the carton upside down I was trying to make her see that the milk is gone (or that I believe that it is gone). Genuine communication, it seems, may require that the audience recognize one’s intention to influence his or her beliefs or actions. This gives rise to a possible asymmetry in the case of dog-human communication. Plausibly, in knocking his food bowl around, Fido is intending to show his master that the bowl is empty, and no doubt his master can recognize this intention. This could be so even if Fido himself is not aware of having this intention or of his master’s recognizing it. And of course
the master can act with the intention of influencing Fido’s beliefs or actions. But can Fido, or any dog, recognize such an intention in his master? Can Fido figure out that when his master pulls back on the leash he or she is trying to get Fido to heel? Again, if one doubts that dogs and cats are aware of their own beliefs and intentions then one might well doubt whether they are aware of the beliefs and intentions of others. And if they cannot recognize intentions and beliefs in others, and if this recognition is needed for the audience in genuine communication, then dogs and cats cannot be the audience of genuine communication. But so long as trying to communicate does not itself require being aware of intentions, then dogs and cats might be able to communicate, even if they are incapable of being the audience of a communication. This is the potential asymmetry. One might think – perhaps with some justification – that the question of whether Fido is communicating or not is, at this point in the discussion, more than a little terminological. After all, all sides can agree – supposing certain empirical questions about the self-awareness of dogs and cats to have been settled – about what dogs and cats and humans can and cannot do to try to influence beliefs and actions. Deciding whether to call what nonhuman animals do ‘communication’ may seem less important than recognizing the differences and similarities between what all sides agree that human and nonhuman animals can do. In any event, progress in understanding animal communication requires further empirical study, not terminological decision. The discussion until now has left language out of the picture. The examples have all been of nonlinguistic communication. It is undeniable that we do communicate nonlinguistically with others using waves, winks, and kicks under the table (although there are terminological questions about just how to draw the line between linguistic and nonlinguistic communication). 
When we do so, we hope that our audience will be able to recognize our intention to communicate. There is nothing, I think, essentially new about linguistic communication except that it involves speech acts – acts done with words having a conventional meaning. However, reliance on conventions is not unique to linguistic communication, since nonlinguistic communication using signals and codes may also involve conventions. There is considerable current research about the nature or essence of human language and how it differs from codes or signal systems. Some leading philosophers also question whether human language is in any interesting sense meant for or designed for communication. Some philosophers have argued, though, that nonlinguistic forms of communication are in a way
dependent on linguistic forms. Some, such as René Descartes in the 17th century and Donald Davidson in the 20th century, held that genuine communication is essentially linguistic. On this view, having beliefs and intentions requires having language, and since (as we have seen) communication requires having beliefs and intentions, communication requires having language. Since dogs and cats have no language, and hence no beliefs, they are, on this view, incapable of communication. This is, however, a minority opinion. The rough definition of communication I have sketched applies just as much to linguistic communication as to nonlinguistic communication. To communicate is to perform an action, perhaps a speech act, with the intention of influencing an audience’s beliefs or actions and whose success requires that the audience recognize this intention. This is just a rough sketch of a complete picture. Considerable ongoing philosophical research is aimed at filling in the details. In particular, research is focused on the precise nature of the relevant intentions.
Interpretation and Understanding Communication, whatever it is, succeeds only when the audience correctly understands the communicator, when he or she correctly interprets what the communicator intended him or her to come to believe or do. Philosophers have asked various epistemological questions about the extent to which we can and do understand each other. Some are quite skeptical that communication ever succeeds. One kind of philosophical question concerns the possibility that some of an agent’s thoughts are essentially private, in the sense that only that agent can think them. Such thoughts would be essentially incommunicable, ones an agent could never communicate to anyone else. One purported kind of example includes thoughts about the character of an agent’s own conscious experience. If no one else can know what it is like for me to taste chocolate or to see red, perhaps no one else can truly understand what I say when I try to describe these experiences. Perhaps what it is like to be me, from the inside, is something I can never fully communicate to another. A related kind of purportedly private thought is the so-called first-person thought: a thought an agent has about her own place in the world. Perhaps what I think when I think that I am in Buffalo is not what you (or I) think when you (or I) think that David Hunter is in Buffalo. Perhaps thoughts that locate my own position for me are not thoughts that others can share. If some thoughts are private in this way, then they would mark one principled limit to communication. But what such private thoughts might be
like – and indeed whether they are even possible – are areas of ongoing philosophical research. A more generalized skepticism about communication derives from the fact that what a person means by his or her words can never neatly be separated from what he or she believes. We use our words to express our beliefs, but what we intend to say depends on what we believe our words can be used to say. So we cannot understand what someone is saying without knowing what they believe, but our best insight into what they believe is through our understanding of what they say. This fact about the interdependence of meaning and belief has led some philosophers to suggest that what a person means depends on their entire cultural milieu. Different cultures, according to this position, have different systems of belief, or different worldviews, and interpreting or understanding an agent from another culture requires sharing or at least knowing that worldview. A related position is that scientists working within different scientific paradigms, such as pre- and post-Einsteinian physics, cannot genuinely understand each other, because the meanings of their shared words derive from different theoretical structures. It is not just that what Newton meant by ‘energy’ is not what Einstein meant by it. Rather, the claim is, Einstein could not even understand what Newton meant by it, since he did not share Newton’s scientific paradigm. This skepticism conflicts with the apparent ease of cross-cultural communication and ordinary interpretation. Perhaps this skepticism rests on mistaken semantic assumptions. But it may be that the appearance of easy communication stems from the fact that we typically assume that other people generally share our beliefs and meanings. Perhaps this ‘principle of charity’ in interpretation creates an illusion of successful communication. 
In any event, the general point that there is some interdependence between what a person means by their words and what they believe can hardly be doubted and is enough to raise some doubt about just how successful ordinary communication really is. A more severe skeptical worry starts from the fact that any theory is under-determined by evidence. It is a general fact about the nature of theories that very different, even conflicting, accounts of some phenomena will be compatible with all the available evidence. In the case of communication, this means that very different interpretations of someone’s speech act will make equally good sense of all available (indeed, of all possible) evidence. Just as a scientific theory can be adjusted in countless ways to accommodate new evidence, so our interpretation of a speech act can be varied in countless ways by
varying our interpretation of the agent’s beliefs or meanings. By itself, this under-determination suggests that we might never be in a position to know that our interpretations are correct, since no amount of evidence could identify a single best interpretation. Meaning might forever transcend our ability to know. Some doubt whether this brand of skepticism constitutes a special problem for communication, since all of our theories are under-determined by evidence. Perhaps if our epistemic position with respect to meaning is no worse than that with respect to, say, atomic physics, then we can live with this much skepticism about interpretation. However, the American philosopher W. V. O. Quine argued powerfully that there is a special problem in the case of communication. In early work, he stressed the idea that the under-determination of translation would occur, even if there were no under-determination of physics. Even if we agreed on all the physical facts, we might not, he argued, be able to agree on a unique best interpretation of an agent’s speech act, since there would still be room to vary the agent’s beliefs and desires. The physical facts, in his view, do not determine the semantic ones. In later work, Quine stressed the idea that in the case of, say, physics, we are prepared to admit that the physical facts might transcend our cognitive capacities. We might, he held, simply lack the intellectual resources needed to discover those facts, since nothing in the facts themselves guarantees that we can know them. As a result, he said, we are prepared to say that even though conflicting physical theories might be equally compatible with all available evidence, at most one of them can be true. This means that we might not be able to tell which theory is true. However, Quine argued, it makes no sense to suppose that what someone means by his or her words or what he or she believes could transcend the evidence we have at our disposal. 
Facts about meaning and belief are, he held, essentially public and knowable by us. So, he concluded, the special problem for interpretation and understanding is that it makes no sense to say that one interpretation is truer than any other, so long as they each make equally good sense of the evidence. It is not that facts about meaning could go beyond what we can know; it is that there is nothing more to meaning than what we can know. And because what we can know fails to determine a unique translation, this means that translation is indeterminate. While Quine’s writings on this topic have been extremely influential, there is little consensus about just what his arguments are, let alone what the consequences of his view would be. And while he has won few converts, there is no consensus about where his arguments go wrong. Some have responded that
Quine unjustly adopted behaviorist limits on the available evidence or that he overlooked other sources of evidence at our disposal. Still, the thesis of the indeterminacy of translation is one of the most significant contributions to the philosophy of communication of the 20th century. See also: Behaviorism: Varieties; Causal Theories of Reference and Meaning; Conventions in Language; Empiricism; Epistemology and Language; Indeterminacy, Semantic; Radical Interpretation, Translation and Interpretationalism.
Bibliography Bach K & Harnish R M (1979). Linguistic communication and speech acts. Cambridge, MA: MIT Press. Castaneda H (1966). ‘“He”: a study in the logic of self-consciousness.’ Ratio 8, 130–157. Chomsky N (1969). ‘Quine’s empirical assumptions.’ In Davidson D & Hintikka J (eds.) Words and objections. Dordrecht: Reidel. Davidson D (1984). Inquiries into truth and interpretation. Oxford: Oxford University Press. Davidson D (1984). ‘Thought and talk.’ In Davidson D (ed.) Inquiries into truth and interpretation. Oxford: Oxford University Press.
Dummett M (1993). ‘Language and communication.’ In The seas of language. Oxford: Oxford University Press. Frege G (1918/1977). ‘Thoughts.’ Geach P T & Stoothoff R H (trans.). In Geach P T (ed.) Logical investigations. New Haven: Yale University Press. xxx–xxx. George A (1986). ‘Whence and whither the debate between Quine and Chomsky?’ Journal of Philosophy. Grice H P (1967/1989). ‘Logic and conversation.’ In Grice H P (ed.) Studies in the way of words. Cambridge, MA: Harvard University Press. Kripke S (1982). Wittgenstein on rules and private language. Cambridge, MA: Blackwell. Kuhn T (1962). The structure of scientific revolutions (2nd edn., enlarged). Chicago: University of Chicago Press. Perry J (1993). The problem of the essential indexical. Oxford: Oxford University Press. Quine W V O (1960). Word and object. Cambridge, MA: MIT Press. Quine W V O (1992). Pursuit of truth. Cambridge, MA: Harvard University Press. Searle J (1969). Speech acts. Cambridge: Cambridge University Press. Sperber D & Wilson D (1988). Relevance: communication and cognition. Cambridge, MA: Harvard University Press. Whorf B J (1956). Language, thought and reality. Carroll J M (ed.). Cambridge, MA: MIT Press. Wittgenstein L (1953). Philosophical investigations. Anscombe G E M (trans.). Oxford: Blackwell.
Communication: Semiotic Approaches E Rigotti and S Greco, University of Lugano, Lugano, Switzerland © 2006 Elsevier Ltd. All rights reserved.
The Rise of a Controversy The topic at issue in this article presents several problematic aspects. Communication research is presently going through a phase of intensive innovation, in which the paradigm and the roles of the different disciplines are changing remarkably. For a long time, the leading role in this area was played by the sciences du langage, in particular by semiotics and linguistics. Nowadays, this role is played by a complex epistemological interplay, where other human and social sciences – focusing on the organizational arrangement of the communication context – as well as technological disciplines contribute to the study of real communicative events. Thanks to these contributions, it has become evident that real communicative events are not only influenced, but functionally governed by their actual context (enterprises, institutions,
communities, and other social organizations . . .) and by the media, by which they are not only broadcast, but also structured. Moreover, even the linguistic sciences, which are expected to explain the internal structure of a communicative event, are largely adopting a model of communication whose conceptual frame is no longer essentially semiotic, but rather pragmatic. The prevalence of a pragmatic paradigm seems to have strongly scaled down the semiotic claim. More specifically, both major trends – Speech Act Theory and Relevance Theory (i.e., the ostensive-inferential model of communication) – propose a vision of communication that does not focus on semiotic aspects. While the former of these trends has developed its own model essentially ignoring the semiotic approach (Austin, 1962 and Searle, 1969), the latter has created a proper controversy, initiating a sort of campaign against the semiotic approach and its academic power. At this point, it is useful to outline concisely the ostensive-inferential model of Relevance Theory, in
order to specify its criticism of the semiotic model of communication, and also what it justly presupposes a semiotic model to be. In fact, the ostensive-inferential model, whose roots lie in the works of Paul Grice and David Lewis (Sperber and Wilson, 1995/1986: 2), is introduced by means of those aspects that oppose it to the semiotic model (Sperber and Wilson, 1986: 6): “The semiotic approach to communication (as Peirce called it and we will call it ourselves), or the semiological approach (as Saussure and his followers called it), is a generalization of the code model of verbal communication to all forms of communication,” and is thus to be abandoned, since it does not seem to explain the real functioning of communicative events: “The code model of verbal communication is only a hypothesis, with well-known merits and rather less well-known defects. [. . .] Its main defect, as we will shortly argue, is that it is descriptively inadequate: comprehension involves more than the decoding of a linguistic signal” (Sperber and Wilson, 1995/1986: 6). In other words, the semiotic approach appears to interpret communication as a process where a speaker constructs a message by coding a certain meaning by means of a linguistic system, and transfers it to a hearer who simply decodes it, thus retrieving its original meaning. The roles of the speaker and the hearer in a communicative event are thus reduced to coding and decoding respectively. The scholars of the ostensive-inferential approach to communication, relying on wide and unquestionable evidence, argue that the process of interpreting a message by the hearer is far more complex, and that the semiotic component represents a rather short stretch of the communicative process. The semiotic component is neither necessary nor sufficient to explain the process of communication.
Firstly, it is not necessary because many messages do not make use of a linguistic system; very often, the communicator addresses the hearer not through words of a certain natural language, or through another semiotic system, but through traces by which the hearer is expected to be guided to infer the communicative intention of the message. Sperber and Wilson (1995/1986: 25) argued in fact that Grice’s originality consisted in suggesting that the identification of the communicator’s intentions is sufficient for the achievement of successful communication, and the mediation of a verbal code is not necessarily needed. The authors give an example (Sperber and Wilson, 1995/1986: 25–26) that shows how communication may succeed even without the help of the coding-decoding process. If Peter asks Mary: ‘How are you feeling today?’, Mary may answer by pulling a bottle of aspirin out of her bag and showing it to him. Although there is no code or convention that
rules the interpretation of her behavior, this action can be taken as strong evidence that she wants to inform Peter that she does not feel well. In this sense, Mary and Peter can be said to have communicated, even if they have not made use of any verbal or nonverbal code. Secondly, the semiotic component is not sufficient, even in the very usual cases where it is present. However large its share may be, interpretation requires that various contextual aspects be involved in order to complete the information carried by the semiotic component: “Verbal communication is a complex form of communication. Linguistic coding and decoding is involved, but the linguistic meaning of an uttered sentence falls short of encoding what the speaker means: it merely helps the audience infer what she means” (Sperber and Wilson, 1995/1986: 27). Within this complex form of communication, the results of the decoding process are considered a piece of evidence from which the hearer, through a noncoded mechanism, can infer the speaker’s intentions. In this sense, the semiotic component becomes subservient to the inferential process. In the terminology of Relevance Theory, an enrichment of the linguistic form of the message is indispensable to obtain the semantic and pragmatic interpretation of a message. It is crucial here to distinguish between ‘sentence’ and ‘utterance of a sentence.’ According to these authors, generative grammars fail to consider that a certain sentence may appear in an enormous variety of utterances that, though sharing a ‘core of meaning’ (Sperber and Wilson, 1995/1986: 9) bound to the linguistic code, each include a different nonlinguistic, context-bound meaning that can be neither predicted nor ‘calculated’ through a decoding process. Therefore, an inference process is required in order to grasp a complete representation of the communicator’s intentions.
To give just one example (adapted from Sperber and Wilson, 1995/1986: 11), a sentence like ‘You’re leaving’ contains different levels of noncoded meaning: (1) an indexical (you) whose interpretation depends on the actual communicative event where the sentence is uttered; and (2) a set of possible interpretations: is the speaker informing the hearer that he is to leave? Is she making a guess? Or is she rather expressing disappointment because he is leaving? Thus, the process of comprehension, through which the hearer reconstructs the communicator’s intentions, is not a decoding process, but rather an inferential process. Whereas the decoding process “starts from a signal and results in the recovery of a message which is associated to the signal by an underlying code” (Sperber and Wilson, 1995/1986: 13), an inferential process starts from a set of premises and
reaches a conclusion warranted by the premises themselves. Among the possible interpretations of an utterance, the hearer chooses the one that best meets certain expectations of truthfulness, informativeness, and comprehensibility. The inferential process of comprehension is an essential component of communication, which is nonetheless often supplemented by the employment of a code. A common code between the interlocutors turns out to be the most powerful, though not indispensable, tool for communicating. Sperber and Wilson’s critical remarks are generally convincing and acceptable where they criticize the attempt to explain the interpretative process merely in terms of decoding. Less convincing is the more general criticism of all semiotic models of communication, accusing them of reducing communication to a coding and decoding process.
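The contrast between decoding and inference can be made concrete with a deliberately small sketch. It is not drawn from Sperber and Wilson: the one-entry code table, the `Context` fields, and the selection rules are all invented for illustration, and real pragmatic inference is in no way reducible to three conditionals. The sketch only shows that a fixed code returns the same ‘core of meaning’ for ‘You’re leaving’ in every situation, whereas an inferential step takes that core as one premise among others and reaches different conclusions in different contexts.

```python
from dataclasses import dataclass

# Hypothetical toy code: one signal paired with one coded "core" meaning.
CODE = {"You're leaving": "ADDRESSEE LEAVE"}


def decode(signal: str) -> str:
    """Code model: return the message that the code associates with the signal."""
    return CODE[signal]


@dataclass(frozen=True)
class Context:
    """Invented contextual premises available to the hearer."""
    addressee: str           # who 'you' picks out in this communicative event
    speaker_decides: bool    # the speaker is in a position to send him away
    departure_evident: bool  # the hearer is visibly about to go


def infer(signal: str, ctx: Context) -> str:
    """Inferential model: the decoded content is one premise among others."""
    core = decode(signal)                               # the short semiotic stretch
    content = core.replace("ADDRESSEE", ctx.addressee)  # resolve the indexical
    # Illustrative rules only: contextual premises select the interpretation.
    if ctx.speaker_decides:
        return f"informing: {content}"
    if ctx.departure_evident:
        return f"disappointment: {content}"
    return f"guess: {content}"
```

Calling `decode("You're leaving")` yields the same string whoever speaks, while `infer` applied to three different `Context` values yields three different readings of the same sentence, which is the sense in which the decoded meaning underdetermines what the speaker means.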
Saussurean ‘Signification’ as Keyword and Sign of Contradiction Our thesis is that Sperber and Wilson’s criticism, which is legitimate in relation to certain semiotic models, is unacceptable for others. Furthermore, in our opinion, their reductive vision of the function of the semiotic component within a communicative process is by no means convincing. For both points, we should briefly reconsider some of the communication models more or less explicitly proposed by semioticians and linguists in the past century. It is almost compulsory to start by referring to Ferdinand de Saussure, with whom the beginning of modern linguistics in its structuralist phase is usually connected. His representation of the communication process seems to constitute a typical coding and decoding model, Figure 1. Here the speaker, having in mind a particular signifié, correlates it to the corresponding signifiant of her linguistic system (langue), which is perceived by the hearer, who correlates it to the corresponding signifié of the same linguistic system. Nonetheless, Saussure’s Cours has a problematic nature; its real function is to witness a deep and complex meditation rather than to expound a theory systematically. Thus, beyond the approximate presentation of the discourse circle (circuit de la parole), the Saussurean text introduces the fundamental but problematic distinction between signifié – defined as a meaning carried by an element of a langue (a linguistic system) – and signification – a term denoting a notion that remains rather opaque in the Saussurean text. Its interpretation and the evaluation of its role in Saussure’s doctrine are nevertheless crucial, and turned out to characterize the two main divergent trends that emerged within post-Saussurean structuralism. It is worth noticing that these two trends have developed considerably different attitudes toward communication in their theoretical elaboration. Relevant representatives of both trends being numerous, only those scholars who cover significant and universally acknowledged cruxes will be mentioned here.
Figure 1 Ferdinand de Saussure’s model.
The Functionalist Reading
Let us consider Saussure’s text. In a passage in chapter IV of the second part of the Cours de linguistique générale (1916/1995: 158–159), Saussure seems to employ the term signification as equivalent to signifié: signification seems to be nothing but the counterpart of the auditive image, “un des aspects de la valeur linguistique” or, better, the value of the conceptual component of the linguistic sign. Nonetheless, in the following passages, Saussure opposes signification to signifié through a series of interlinguistic comparisons (mouton vs. sheep and mutton, French plural vs. Sanskrit plural and dual, etc.). So, without explicitly saying it, Saussure employs signification as opposed to signifié; interlinguistic comparisons between different language-bound signifiés are possible thanks to a conceptualization of reality that is formed somehow independently of these signifiés. This distinction lets us guess the existence of a complex correlation between the two semantic dimensions (reasonably understood as interpretation), which goes from the signifiés obtained through the coding to the significations, which articulate the parole (the speaker’s actual message). Without this conceptualization, evoked by the use of the term signification, such a comparison between different languages would be simply impossible (on this point, see Rigotti & Rocci, in press). Signification, thus, has to be interpreted as an inter- or translinguistic category independent of the linguistic code, though correlated to it. If we integrate this notion into the Saussurean circuit de la parole, we obtain a more comprehensive model of communication, where the correlation of signifiant and signifié is only a stretch of a more complex path, starting with the actual meaning intended by the speaker, and ending with the reconstruction of this meaning tentatively operated by the hearer.
The interpretation of the Saussurean text presupposed by this model is explicitly adopted by N. Trubetzkoy in his Grundzüge der Phonologie (1939), where the signifiés are considered, at the level of langue, as abstract rules and conceptual schemes, which need to be related to the actual significations emerging from language use (see also Rigotti & Rocci, in press: 5). On this point, M. Bréal (1844–1995: 552b) observes that, when we need to employ a certain word in communication, we ‘forget’ all possible meanings of that word except the one that corresponds to our thought (“s’accorde avec notre pensée”). Although the other meanings are still somehow present to our mind, we choose the one that corresponds to the meaning we want to express – i.e., to the signification. Here, the relation between signifiant and signifié is certainly not a coding-and-decoding one, since it is mediated by the speaker’s choice according to her communicative intention. The same approach to communication may be found in Karl Bühler’s Organonmodell, as outlined in his Sprachtheorie (1934). Among the numerous pages of Bühler’s text that could be useful to elucidate his position on this issue, one passage seems particularly revealing (1934: 63), where Bühler argues that no code can ensure the correct interpretation of the word ‘horse’ as it is used in a text, where it can refer to a single entity or to the species of horses in general. The use within a text is not “morphologisch erkennbar,” i.e., it cannot be decoded by means of morphological aspects of the language, neither in Latin, a language that has no articles, nor in the Indo-Germanic languages that do have articles.
What allows us to correctly interpret the use of the word ‘horse’ is a ‘detective attitude’ towards the context of the communicative event, which aims at evaluating what the speaker has in mind: “Man muss es detektivisch gleichsam dem Kontexte oder den Umständen der Sprechsituation entnehmen, ob der Sprecher das eine oder das andere im Auge hat und meint.” Moreover, an author to whom Bühler is quite indebted, Philipp Wegener, had also stressed the interpretative aspect of communication 50 years previously. Wegener argues that the hearer has the complex task of understanding the speaker’s action; for this purpose, he has to figure out what the ‘goal’ of the communicative action may be. Comprehension of verbal messages is achieved through ‘inferences’ (Schlüsse), which rely both on the meaning of the verbal signs and on the experience of reality. So, where experience is lacking, comprehension is impossible (Wegener, 1885–1991: 128). For instance, one could not understand a sentence such as ‘a whistle of the train, and my brother was gone,’ if one had no experience of a train setting off from a station.
If, speaking of a semiotic approach to communication, we refer to this research tradition in linguistics and semiotics, the criticism put forward by the scholars of Relevance Theory loses its bite: in fact, the process of communication is not referred to as a coding-decoding process within this trend. Rather, one should acknowledge that the process of communicative inference is constantly associated with the concept of interpretation. Nor would the objection be acceptable that, in these models, inference only plays the subservient role of integrating the semiotic process. Here, it must be noticed that, if inference is acknowledged as a necessary integration of the semiotic component, it follows that the semiotic component itself is not considered sufficient for the accomplishment of the communicative event. Therefore, the inferential component becomes essential for communication. More specifically, in this first tradition, neither the speaker’s coding nor the hearer’s decoding holds supremacy in the communicative process; the crucial moment is rather when it becomes clear what the speaker intended to communicate, and the hearer understands it. As Bühler claims, against the backdrop of a Husserlian philosophical vision, language always appeals to the speaker’s knowledge of reality; and each time we understand the meaning of a communicative event, we deeply and unavoidably rely on a ‘reality-driven selection’ (sachgesteuerte Selektion, Bühler, 1934: 65), which constitutes the core of communication. Not by chance, a large part of Bühler’s research is devoted to the study of the specific semantic mechanism of the ‘indexicals’ or ‘deictics’ (Zeigwörter). This term refers to linguistic units and structures whose meaning is reconstructed through the identification of an aspect of the communicative situation (Bühler, 1934; see in particular p. 107).
Here, given the importance that Bühler attributes to reality in the process of communication, it becomes clear why he adopts a ‘triadic’ notion of the sign, which is rather innovative if we compare it to other structuralist models. In his Organonmodell (1934: 24), the sign is conceived as an ‘instrument’ for communicating; and communication is interpreted pragmatically, as an action accomplished by the speaker and the hearer. According to Bühler (1934: 52), communication must be viewed as a human ‘action,’ vitally bound to other meaningful human behaviors. Communication is related to other actions, and is an action in itself. In particular, Bühler distinguishes Sprechhandlung (1934: 53), which is the human activity of communicating, i.e., the Saussurean parole, as opposed to Sprachgebilde (the langue, 1934: 57); moreover, with the notion of Sprechakt (1934: 62), he focuses on a single communicative action, and with Sprachwerk (1934: 53), he denotes the linguistic products resulting from a single human action of communicating. Within the model, the sign is related to the speaker (Sender), the addressee (Empfänger) and the objects and states of affairs in reality (Gegenstände und Sachverhalte). The sign is bound to each dimension by a specific relation: with regard to the speaker, the sign is a ‘symptom,’ bound by a relation of ‘expression’ (Ausdruck); with regard to the addressee, the sign is a ‘signal,’ and stands in the relation of ‘appeal’ (Appell); and, finally, with regard to the object, the sign is a ‘symbol,’ and stands in the relation of ‘representation.’ The following diagram, Figure 2, illustrates Bühler’s model (1934: 28):
Figure 2 Karl Bühler’s Organonmodell.
The distinction between code dimension and discourse dimension of semantics, implied by the Saussurean terms signifié and signification, is tackled and deepened by another linguist: E. Benveniste, who introduced the terms ‘semiotic’ and ‘semantic’ (Benveniste, 1966a). He underlines that the content dimension of code units is a semiotic one; while the content dimension of the same units, insofar as they are used within a discourse, is truly semantic (on this point, see also Rocci, 2003). Moreover, among the indexicals investigated by Bühler, he focuses on personal pronouns, by which the communicative act and its constituents are mirrored in specific linguistic
structures (Benveniste, 1966b). The study of personal pronouns on both the diachronic and the synchronic axes leads Benveniste to single out the essential role played by subjectivity (I and You) in communication. On the basis of the Saussurean notion of ‘signification,’ conceived as the actual, situation-bound meaning of the sign in the communicative process, and of Bühler’s interpretation of the sign as an instrument for communicating, we could modify Saussure’s diagram and build a model of communication that is shared in its fundamental aspects by all the authors within the research tradition we have examined so far (Figure 3).
Figure 3 The model of communication within the functionalist reading.
Despite its visual diversity, the well-known model proposed by Roman Jakobson (Jakobson, 1960/1995) is, in many respects, reminiscent of Bühler’s sign model. Evidently influenced by Shannon and Weaver’s model, it brings to light the process of transmitting a message, thus offering a rather obvious metaphor of the communicative process (Figure 4).
Figure 4 Roman Jakobson’s model of the fundamental factors of communication.
Jakobson’s model has two indubitable merits: first, it takes into account, and represents synthetically, a complex set of factors; second, it elaborates many of the specific functions of the message in relation to each of these factors in the communication process. The Russian linguist draws on his earlier participation in Russian formalism by introducing the poetic function into his model, as an autotelic orientation of the message towards itself (Figure 5).
Figure 5 Roman Jakobson’s model of the textual functions.
The graphic representation of Jakobson’s model appears to be richer than the sign scheme provided by Bühler. However, if we consider the model implicit in the theory of the latter, we have to recognize that Bühler’s model is richer in important respects: indeed, in Jakobson’s perspective, the pragmatic dimension is weakened; the essential role of inference in interpretation is ignored, as is the relevance of context for interpretation. Another important aspect concerns the distinction between signifié and signification, reflecting the more general difference between language (langue) and speech (parole), which remains outside the graphic model outlined by Jakobson, even though it is adumbrated in some significant research (Jakobson, 1957).
Some Code-model Approaches
The precise definition of the Saussurean model was a core issue for a large number of semioticians of the past century. Indeed, besides the tradition we have examined so far, another tradition of semiotic studies starts from a different interpretation of Saussure’s signification. This second trend does not concentrate on the notion of signification, and therefore does not focus on the textual and discursive dimension of the parole; the point of view of the code (langue) is preferred instead. This position can be found not only in Hjelmslev’s Prolegomena to a theory of language (1961), but also in various scholars belonging to French structuralism – among whom R. Barthes plays a paradigmatic role – and in Umberto Eco’s first semiotic theory, expounded in his work Trattato di semiotica generale (1979). It is worth noting that it is quite difficult to infer a model of communication from these positions. Barthes, for instance, stresses the interpretation of language as a system, whereby the individual performing a particular act of parole (a discourse) simply selects and actualizes one of the possible states of the system (Barthes, 1964). As the semantic dimension is exhaustively represented by the system of the signifiés, the meaning of communicative messages is not built by a speaker for an addressee; it is rather one possible product the system can generate. The human subject is excluded from the communication process; communication itself, conceived as a communicative interaction between two human beings, i.e., as the junction of the communicative action of the speaker with the interpretative action of the addressee, fails to be considered at all. Umberto Eco (1979: 8) defined communication as “the passage of a signal (not necessarily a sign) from a source (through a transmitter, along a channel) to a destination.” This definition is meant to include both cases of
machine-to-machine passages of information (see also 1979: 32), and cases where the destination (and not necessarily the source) is a human being. In the latter case, communication involves the process of signification, “provided that the signal is not merely a stimulus but arouses an interpretive response in the addressee” (1979: 32). The process of signification is not conceived as a communicative action; the focus here is on the signification system, “an autonomous semiotic construct that has an abstract mode of existence independent of any possible communicative act it makes possible” (1979: 9). Thus, it is the system that guarantees communication, and the existence of the system does not presuppose the existence of actual communicative events. On the contrary, communication between human beings necessarily presupposes a signification system (thus excluding cases of nonverbal, ostensive communication). It must be observed that Eco explicitly discusses the problem of what the place of the human being, i.e., “the ‘acting subject’” (1979: 314), within semiotics should be. He concludes that what is outside the signification system – its “material expressions” (1979: 317) – might even be “tremendously important,” but it is beyond the subject of semiotics. In fact, as Eco argues, the proper subject of signification is “nothing more than the continuously unaccomplished system of systems of signification that reflects back on itself,” whereas individual material subjects only “obey, enrich, change and criticize” the signification system (1979: 315). As emerges from our survey of some theories within the second trend of Saussurean semiotics, speaking of a proper ‘model of communication’ in relation to them turns out to be quite difficult. In fact, communication in itself is intrinsically ignored. What they hypothesize is the mysterious working of an autonomous semiotic program, which would auto-install and run on a mass of undifferentiated terminals, thus defining their individual or network sign production.
Charles Sanders Peirce
The model of communication of the first trend inspired by Saussurean semiotics, which we found in Bühler, and which is confirmed by recent pragmatic models, shows interesting analogies with another tradition, often considered an alternative to the Saussurean one: the semiotic model of Charles Sanders Peirce. As Bühler would do in the 1930s, Peirce had already proposed a triadic notion of the sign at the end of the 19th century (Figure 6).
Figure 6 Charles Sanders Peirce’s model of sign.
According to what Peirce wrote in 1897 (1897–1935–1958: 2.228), “A sign, or ‘representamen’ is something which stands to somebody for something in some respect or capacity. It addresses somebody, that is, creates in the mind of that person an equivalent sign, or perhaps a more developed sign. That sign which it creates I call the interpretant of the first sign.” Although Peirce is often considered one of the founders of semiotics, it must not be forgotten that his contribution is particularly relevant from the logical and philosophical points of view, and that his interest in semiotics concerns the cognitive rather than the communicative dimension. Nonetheless, his contribution is also significant for semiotics and for a theory of communication. Concerning semiotics, we have to underline that Peirce’s notion of sign includes ‘symbols’ as well as ‘indexes’ (bound to the object through a real connection) and ‘icons,’ which recall objects by reproducing their features. Semiotics thus turns out to include both verbal and nonverbal dimensions. We should also consider that, within Peirce’s enormous scientific production, there are significant cues for a comprehensive communication model. Firstly, the correlation of the sign with both subjectivities involved in communication is highlighted by the above-quoted definition, where the subject to whom the sign is addressed is explicitly mentioned and the addresser is presupposed. On this point, M. Hansen (2002) argued that Peirce’s approach implies an active involvement of the speaker and the addressee in the process of interpretation. In fact, the ‘representamen’ does not univocally imply a certain ‘interpretant’; rather, it suggests several possible interpretations. Here, the interaction of the speaker and the addressee is necessary to evaluate the interpretation to be chosen: the context of interpretation is actively constructed by the interlocutors, on the basis of the experience of the knowledge community.
Secondly, we find in the Peircean text a truly pragmatic reading of the process of interpretation, as the ‘final interpretant’ of a sign is the ‘habit change,’ i.e., “a modification of a person’s tendencies toward action” (Peirce, 1897–1935–1958: 5.476; on this point, see also Rigotti & Rocci, 2001: 48).
Conclusive Remarks
We might conclude by arguing that the criticism leveled against the semiotic tradition by the scholars of Relevance Theory is only valid for those semiotic approaches which can be defined as code-driven, and which depend on a reductive interpretation of Saussure. They conceive of the sign as a binary unit, and thus reduce communication to a coding and decoding process. The criticism does not hold for all those, indeed rather numerous, approaches (Peirce and the functionalist interpretation of the Saussurean Cours: Troubetzkoy, Bühler, Jakobson, Bally, Sechehaye, Karcevskij . . .), where a pragmatic (in the sense of the Organonmodell of language) and triadic representation of the sign allows us to understand communicative events in an adequately comprehensive perspective. Our short survey of semiotic approaches to communication in the 20th century shows that not all of them can be considered code-models; it also puts forward – this concerns in particular authors such as Peirce, Bühler, and Benveniste – the possibility and even the reasonableness of constructing a semiopragmatic model of communication (Searle, 1969; Clark, 1996), concerning both the wording and the interpretation side, the latter being based on semiotic, metaphoric (Lakoff, 1980; Danesi, 2004), and inferential processing (Sperber and Wilson, 1995/1986); comprehending both verbal and nonverbal communication (Rocci, 2003); and including a theory of subjectivity as one of its relevant components (see Rigotti and Cigada, 2004). The semiotic tradition of the 20th century could be shown to be helpful in this endeavor.
See also: Austin, John Langshaw (1911–1960); Barthes, Roland (1915–1980); Benveniste, Emile (1902–1976); Bühler, Karl (1879–1963); Bickerton, Derek (b. 1926); Grice, Herbert Paul (1913–1988); Hjelmslev, Louis Trolle (1899–1965); Jakobson, Roman (1896–1982); Peirce, Charles Sanders (1839–1914); Saussure, Ferdinand (-Mongin) de (1857–1913); Trubetskoy, Nikolai Sergeievich, Prince (1890–1938); Wegener, Philipp (1848–1916); Relevance Theory; Structuralism; Speech Acts; Eco, Umberto: Theory of the Sign; Jakobson, Roman: Theory of the Sign; Context, Communicative; Nonverbal Communication; Saussure: Theory of the Sign; Semiosis.
Bibliography
Austin J L (1962). How to do things with words. Oxford: Oxford University Press.
Bally Ch (1950). Linguistique générale et linguistique française. Bern: Francke.
Barthes R (1964). Eléments de sémiologie. Paris: Editions du Seuil.
Benveniste E (1966a). ‘Les niveaux de l’analyse linguistique.’ In Benveniste E (ed.) Problèmes de linguistique générale. Paris: Gallimard.
Benveniste E (1966b). ‘La subjectivité dans le langage.’ In Benveniste E (ed.) Problèmes de linguistique générale. Paris: Gallimard.
Bréal M (1884–1995). ‘Comment les mots sont classés dans notre esprit.’ In Desmet P & Swiggers P (eds.) De la grammaire comparée à la sémantique. Textes de Michel Bréal publiés entre 1864 et 1898. Leuven/Paris: Peeters. 283–291.
Bühler K (1934). Sprachtheorie. Die Darstellungsfunktion der Sprache (2nd edn.). Stuttgart/New York: Gustav Fischer.
Clark H (1996). Using language. Cambridge: Cambridge University Press.
Danesi M (2004). Poetic logic: the role of metaphor in thought, language and culture. Madison: Atwood Publishing.
Eco U (1975). Trattato di semiotica generale. Milan: Bompiani. English translation (1976), A theory of semiotics. Bloomington: Indiana University Press.
Grice H P (1957). ‘Meaning.’ Philosophical Review 66, 377–388.
Hansen M B M (2002). ‘Sémiotique peircéenne et analyse des interactions verbales.’ In Andersen H L & Nølke H (eds.) Macro-syntaxe et macro-sémantique. Actes du colloque international d’Århus, 17–19 mai 2001. Bern: Lang. 361–381.
Hjelmslev L (1943–1961). Prolegomena to a theory of language. Madison: University of Wisconsin Press. [Omkring sprogteoriens grundlæggelse.]
Jakobson R (1957). ‘Shifters, Verbal Categories, and the Russian Verb.’ In Rudy S (ed.) Selected Writings. The Hague: Mouton.
Jakobson R (1960–1995). ‘The speech event and the function of language.’ In Waugh L R & Monville-Burston M (eds.) On language. Cambridge/London: Harvard University Press.
Karcevskij S O (1929). ‘Du dualisme asymétrique du signe linguistique.’ Travaux du cercle linguistique de Prague 1, 88–93.
Lakoff G (1980). Metaphors we live by. Chicago: The University of Chicago Press.
Lewis D (1969). Convention. Cambridge: Harvard University Press.
Peirce Ch S (1897–1935–1958). Collected papers of Charles Sanders Peirce (8 vols). Hartshorne C & Weiss P (eds.). Cambridge: The Belknap Press of Harvard University Press.
Rigotti E & Cigada S (2004). La comunicazione verbale. Milano: Apogeo.
Rigotti E & Rocci A (2001). ‘Sens – non-sens – contresens.’ Studies in communication sciences 1, 45–80.
Rigotti E & Rocci A (in press). ‘Le signe linguistique comme structure intermédiaire.’ In Saussure L de (ed.) (in press). Nouvelles perspectives sur Saussure. Mélanges offerts à René Amacker, Publications du Cercle Ferdinand de Saussure. Genève: Droz.
Rocci A (2003). ‘La testualità.’ In Bettetini G, Cigada S, Raynaud S & Rigotti E (eds.) Semiotica II. Configurazione disciplinare e questioni contemporanee. Brescia: La Scuola. 257–319.
Sechehaye A (1926). Essai sur la structure logique de la phrase. Paris: Champion.
Saussure F de (1916/1995). ‘Cours de linguistique générale.’ In Bally Ch & Sechehaye A (ed.) with the collaboration of Riedlinger A. Paris: Payot.
Searle J (1969). Speech acts. Cambridge: Cambridge University Press.
Sperber D & Wilson D (1995/1986). Relevance: communication and cognition (2nd edn.). Blackwell Publishers.
Troubetzkoy N S (1939). Grundzüge der Phonologie. Prague: Travaux du Cercle Linguistique de Prague.
Wegener Ph (1885–1991). ‘Untersuchungen über die Grundfragen des Sprachlebens.’ In Knobloch C & Koerner K (eds.). Amsterdam/Philadelphia: John Benjamins.
Communicative Competence
T M Lillis, The Open University, Milton Keynes, UK
© 2006 Elsevier Ltd. All rights reserved.
The phrase ‘communicative competence’ was introduced by the North American linguist and anthropologist Dell Hymes in the late 1960s (Hymes, 1962/1968, 1971). He used it to reflect the following key positions on knowledge and use of language:
• The ability to use a language well involves knowing (either explicitly or implicitly) how to use language appropriately in any given context.
• The ability to speak and understand language is not based solely on grammatical knowledge.
• What counts as appropriate language varies according to context and may involve a range of modes – for example, speaking, writing, singing, whistling, drumming.
• Learning what counts as appropriate language occurs through a process of socialization into particular ways of using language through participation in particular communities.
Hymes’s juxtaposition of the word ‘communicative’ with ‘competence’ stood in sharp contrast at the time with Noam Chomsky’s influential use of the term ‘linguistic competence,’ which Chomsky used to refer to a native speaker’s implicit knowledge of the grammatical rules governing her/his language (Chomsky, 1957, 1965). Such knowledge, Chomsky argued, enables speakers to create new and grammatically correct sentences and accounts for the fact that speakers are able to distinguish grammatically correct sentences from incorrect ones such as, in English, She book the read, or, in Spanish, plaza yo a la voy (‘square I am going to’). While accepting the importance of grammatical knowledge, Hymes argued that in order to communicate effectively, speakers had to know not only what was grammatically correct/incorrect, but also what was communicatively
appropriate in any given context. A speaker therefore must possess more than just grammatical knowledge; for example, a multilingual speaker in a multilingual context knows which language to use in which context and users of a language where there are both formal and informal forms of address know when to use which, such as vous (formal) and tu (informal) in French. Hymes famously stated that a child who produced language without due regard for the social context would be a monster (1974b: 75). The emphasis that Hymes placed on appropriateness according to context, in his use of the term competence, challenged Chomsky’s view about what exactly counts as knowledge of a language – knowledge of conventions of use in addition to knowledge of grammatical rules. In addition, and more fundamentally, Hymes problematized the dichotomy advanced by Chomsky between ‘competence’ and ‘performance’ and the related claim about what the study of linguistics proper should be. Chomsky’s interest was in the universal psycholinguistics of language, the human capacity for generating the syntactic rules of language. His interest in knowledge, captured in his use of ‘competence,’ was therefore at an ideal or abstract level rather than in any actual knowledge that any one speaker or group of speakers might possess. For Chomsky, the focus of linguistics as a discipline should be on understanding and describing the general and abstract principles that make the human capacity for language possible. In contrast, ‘performance’ or actual utterances – that is, what people actually say and hear with all the errors, false starts, unfinished sentences – could add little to an understanding of the principles underlying language use and was therefore not deemed to be a relevant focus of linguistic study. 
Hymes acknowledged the value of the more abstract and idealized approach that Chomsky advocated, not least because such a universalistic approach challenged any theories of language based on genetic differences or notions of racial hierarchy (Hymes, 1971: 4). However, he argued that there
were other important dimensions to the study of language that should not be so readily excluded from linguistics as a scientific field. Hymes’s own interest in language was in large part driven by a concern for language questions arising in real-life contexts, such as why children from economically advantaged and disadvantaged social backgrounds differ in the language they use. Chomsky’s and Hymes’s different aims for developing language theory are nowhere more clearly evident than in Hymes’s comment on Chomsky’s (1965: 3) now famous statement on the purpose of linguistic theory: “Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly . . .” Hymes (1971: 4) comments: “The theoretical notion of the ideal speaker-listener is unilluminating from the standpoint of the children we seek to understand and to help.” Hymes was highly critical of a theory that explicitly set out to ignore the impact of social context on how language is used, and hence of the competence/performance dichotomy set up by Chomsky (echoing in some ways the langue and parole distinction made by Saussure, 1916). At a specific level, his key reasons for challenging such a dichotomy can be summarized as follows (based on Hymes, 1962/1968; 1971; 1974b):
• Given the focus on knowledge as a set of abstract rules underlying use, actual use is relegated to only a marginal position in the scientific study of language.
• The dichotomy itself is problematic. It presupposes that knowledge can be understood without reference to use, yet analyzing actual use of language is key to exploring underlying principles for such use. Hymes argued that “performance data” should be considered a legitimate focus for linguistic study both in its own right and as data that reflects knowledge underlying any performance.
• The dichotomy is built on a series of abstractions: ideal speaker-listener, homogeneous speech-community, perfect knowledge of language.
• Chomsky’s notion of speaker-listener does not acknowledge or account for the differences in reception competence and production competence evident in many contexts, as in children from some social backgrounds understanding formal school language yet not producing it.
• What counts as knowledge of language is reduced to only one aspect of knowledge, namely grammatical knowledge, when there are clearly other aspects to knowledge of language that are important, such as when to use which language, or varieties of languages, and in which contexts.
• Within an approach that focuses on competence as idealized knowledge, it is the abstract system of language that becomes the focus rather than speakers’/groups of speakers’ use of language.
Hymes (1972a: 282) offers communicative competence as a more general and superordinate term to encompass the language capabilities of the individual that include both knowledge and use: “competence is dependent upon both (tacit) knowledge and (ability for) use.” While Hymes argued against the foundational dichotomy between competence and performance proposed by Chomsky, he was not dismissing the value of the distinction entirely. Hymes refers to communicative competence as “abilities in a broad sense” of how to use language, whereas performance is always a specific use of language that reflects some of that competence (2003: 321). Thus any specific performance may partially reflect the nature of the conventions governing an individual or a community’s knowledge of language. In setting up a framework for developing an adequate theory of language, Hymes argued that both what is known (competence) and what is actually done (performance) must be taken into account. Such a framework involves exploring and accounting for the following:
1. Whether (and to what degree) something is formally possible
2. Whether (and to what degree) something is feasible
3. Whether (and to what degree) something is appropriate
4. Whether (and to what degree) something is in fact done, actually performed (Hymes 1972a: 284–286).
Questions 3 and 4 are central to the socially oriented approach to the study of language advocated by Hymes. In contrast to Chomsky and his claim to linguistics as a subfield of psychology and philosophy, Hymes seeks to claim a space for the study of language within “a science of social man” (Hymes, 1971: 6).
A Key Concept in an Emerging Sociolinguistic Tradition
Emphasis on the notion of communicative competence formed part of Dell Hymes’s call for a new field of study, the ethnography of communication, sometimes called the ethnography of speaking (Hymes, 1962/1968; Gumperz and Hymes, 1972/1986). There are a number of concepts and categories presupposed by the notion of communicative
competence, which continue to be highly influential in sociolinguistics and in many socially oriented approaches to the study of language.
Sociocultural Context
Given the importance attached to knowledge of the social conventions governing language use, understanding the context of language use is considered to be central. Exploring such context, that is, the cultural, historical, and social practices associated with the language use of any particular group or community of people, involves detailed descriptions and classification of language use organized around the following key questions. What are the communicative events, and their components, in a community? What are the relationships among them? What capabilities and status do they have, in general and in particular cases? How do they work? (Hymes, 1974b: 25).
Ethnography of Communication
In order to explore how language is used in context, Hymes argued for an ethnographic approach to the study of communication or ways of speaking (Hymes, 1974a). This involves researchers setting out to systematically observe the activities of any given community, through immersing themselves in such activities and collecting a range of data, such as recordings, field notes, and documentation. In this methodology both ‘etic’ and ‘emic’ approaches are considered important and complementary; the etic approach refers to observation from the outside as it were, that is, the researcher seeks to observe in detail the communicative activities – or speech events – of participants in a community; the emic involves exploring such events, from the inside, to determine how participants make sense of and understand such events and interactions. Ethnographers emphasize the importance of emic accounts to any theory of language; for example, only an emic perspective would enable a researcher to understand that a clap of thunder may in some cultural contexts be considered to be a communicative act (as in the case of the Ojibwa reported by Hymes, 1974b: 13), or that certain types of communication are permitted to men in some contexts while proscribed in others, such as the disciplining of children (as reported by Philipsen, 1975). In an attempt to build a descriptive framework of how language is used in different contexts, Hymes, drawing on anthropologists such as Malinowski (1923, 1935), developed a series of categories to map out the relevant contextual aspects to language use, such as speech event and speech community.
Speech Event
This is a category (after Jakobson, 1960) that reflects the idea that all interaction is embedded in sociocultural contexts and is governed by conventions emerging from those contexts. Examples of speech events are interviews, buying and selling goods in a shop, sermons, lectures, and informal conversation. The speech event involves a number of core components identified by Hymes, which are signaled in his mnemonic device SPEAKING (see Table 1).
Table 1 SPEAKING – acronym invented by Dell Hymes (1972b) to specify relevant features of a speech event
S-settings and scenes: Setting refers to time, place, physical circumstances. Scene refers to the psychological or cultural definitions of the event: for example, what ‘counts’ as a formal event varies from community to community.
P-participants: Who is involved, as either speaker/listener, audience.
E-ends: Ends can be defined in terms of goals and outcomes. Goals refer to what is expected to be achieved in any event; outcomes refer to what is actually achieved. Goals and outcomes exist at both community and individual participant level: for example, the conventional goal of a wedding ceremony may be marriage; however, individuals within that event may have other goals.
A-acts: Speech events involve a number and range of speech acts, particular types of utterances such as requests, commands, and greetings.
K-keys: The tone, manner, and spirit in which acts are done, for example, serious or playful. Specific keys may be signaled through verbal and/or non-verbal means.
I-instrumentalities: The particular language/language varieties used and the mode of communication (spoken, written).
N-norms: Norms of interaction refer to rules of speaking, who can say what, when, and how. Norms of interpretation refer to the conventions surrounding how any speech may be interpreted.
G-genres: Categories or types of language use, such as the sermon, the interview, or the editorial. May be the same as ‘speech event’ but may be a part of a speech event. For example, the sermon is a genre and may at the same time be a speech event (when performed conventionally in a church); a sermon may be a genre, however, that is invoked in another speech event, for example, at a party for humorous effect.
Speech Community
While the term speech community was not coined by Hymes (the most notable earlier use being that of Bloomfield, 1933), Hymes’s elaboration of the term certainly contributed to its prominence in sociolinguistic approaches to the study of language. The acquisition of communicative competence takes place within speech communities: speech communities are constituted not just by a shared variety or language, but shared sets of norms and conventions about how those varieties can and should be used. Through everyday interaction with others in a speech community, a child learns how to use language appropriately, that is, according to the norms of any given speech community. Some events inevitably involve people from different speech communities, which may create tensions, as in, for example, school classrooms where participants share a common language but may not be members of the same speech community (Hymes, 1972c).
Diversity
Acknowledgement of diversity and variety between and across language use, in communities and individuals, is a basic position in Hymes’s work and is a central tenet in sociolinguistics. Such diversity manifests itself in countless ways: the very existence of language varieties, both as languages and varieties within languages; the range of conventions governing the use of such varieties in different contexts (such differences have been documented in relation, notably, to social class, ethnic group, gender); the different values attached to particular usages (for example, the values attached in different communities to such phenomena as silence, eloquence, and interruptions). Privileging diversity as a universal of language shifts the emphasis away from any differential status attached to varieties, or the notion that difference signals deficiency in any way. All varieties are seen as equally valid, although some are acknowledged to be more appropriate in particular contexts.
Appropriateness
This is a key presupposition to the notion of communicative competence and is a central notion in sociolinguistics. As discussed, communicative competence presupposes the following: that a language user’s knowledge – competence – is more than just grammar-based; that knowledge of language requires knowledge of the appropriate social conventions governing what and how something can be said, to whom and in what contexts. Appropriateness thus involves both linguistic and cultural knowledge (Hymes, 1971: 14). Within sociolinguistics, a focus on appropriateness of language use is said to indicate a descriptive (how language is used) rather than a prescriptive (how language should be used) approach to language diversity.
Socialization
People learn the rules of use through everyday interaction within speech communities. It is through such interaction that children acquire knowledge about appropriate language use, that is, communicative competence (Hymes, 1971: 10). Hymes indicates that socialization is not constituted by a rigid trajectory and suggests that both “a long and short range view of competency should be adopted” (1972a: 287). From his perspective, the short range view concerns innate capacities as they emerge in the first years of life, and the long range concerns continuing socialization through life. What this short/long range implies is that competence is not static. In some instances, quite drastic changes can be made to an individual’s competence, as in the case of a child whose home language variety is significantly different from the school variety. Of course, as Hymes emphasizes, such extensions or shifts in competence are not necessarily straightforward; there are plenty of opportunities for misunderstanding to occur when receivers/listeners accustomed to the language varieties of one community engage in communication with those from another.
Communicative Competence in Other Domains
The notion of communicative competence has been highly influential in fields beyond linguistics, such as education, sociology, and psychology. In some instances the basic assumptions surrounding the term have been maintained, and in others extended or problematized. Probably nowhere has the impact of the notion been more powerful than in the teaching of languages, including the teaching of English as a second or foreign language. Whereas the emphasis in language teaching had been on grammatical and syntactic accuracy, following the work of Hymes and others (Gumperz and Hymes, 1972/1986), there was a significant turn towards communicative language teaching: this shift involved the teaching and learning of language considered to be appropriate to specific situations, based on what speakers actually use, rather than what they are presumed to use (Paulston, 1992). Assessment of language learning has been influenced accordingly, with a focus on students’ capacity to communicate, rather than the ability to produce grammatically correct sentences (Hall and Eggington, 2000). The extent to which this more situational approach to second and foreign language teaching prevails is a matter of debate, but the impact of communicative competence is widely acknowledged (Firth and Wagner, 1997). The use of the term has also been extended and modulated in other domains. For example, Culler (1975) developed the influential notion of literary competence to describe readers’ knowledge of the conventions required in order to interpret literary texts. Academic communicative competence has been used to refer to knowledge of the conventions governing the use of language in academic communication (Berkenkotter et al., 1991). Both uses refer to knowledge of specific textual features, such as metaphor in the case of literary competence and argument in academic competence, as well as knowledge about what counts as specific text types or genres (academic, literary) in particular cultural contexts. Other uses of ‘communicative competence’ have developed, alongside and in contradistinction to the Hymesian term. Habermas (1970) uses the term communicative competence more in line with Chomsky’s linguistic competence, to the extent that he is interested in theorizing an ideal speech situation, rather than elaborating a sociolinguistic description of actual situations and utterances. In contrast, Bernstein’s interest was in an elaboration of actual use of language, particularly within the context of schooling. However, he offered a critique of the way in which ‘competence’ models implied an exaggerated capacity of individual rational choice and control over language use, without due attention to “distribution of power and principles of control which selectively specialize modes of acquisition and realizations” (Bernstein, 1996: 56).
The need to theorize power in relation to competence and language use is a key strand in other studies re-examining the notion of communicative competence in more recent times.
Re-examining Communicative Competence
The work of Hymes is central in sociolinguistics as a field and continues to reverberate across socially oriented approaches to the study of language in a range of disciplines, including applied linguistics, education, communication studies, and social psychology. In recent times, there have also been significant re-examinations of communicative competence and related notions, as they have come to be used in sociolinguistics, from both critical and post structuralist approaches.
Re-examining Appropriateness
The notion of appropriateness is central to communicative competence and central to the field of sociolinguistics whose empirical goal has been to explore patterns of language use, according to the norms of any given community. However, the use of such a notion has been critiqued by some because it serves to emphasize norms and underplay differences within any given community or communicative context. Fairclough (1995), for example, like Bernstein mentioned above, argues that a model of language based on appropriateness assumes shared views among all users about what counts as appropriate, ignoring struggles and tensions in any given interaction; for example, tensions evident in interactions between institutional representatives and clients, men and women, or speakers from different cultural and linguistic backgrounds. Research in some socially oriented approaches to language, such as feminist linguistics and critical discourse analysis, has made visible the power dynamics in communicative events, within and across communities (Cameron, 1992; Wodak, 1992; Chouliaraki and Fairclough, 1999). In the same vein, emphasis on a normative notion of communicative competence in second and foreign language teaching has been critiqued by theorists of second language acquisition. Norton (2000) states that although it is important for learners to understand the conventions of the target language, it is also important for them to explore “whose interests these rules serve” (2000: 15). She argues that any definition of communicative competence should include an acknowledgement of the importance of the right to speak (Bourdieu, 1977); such a right to speak, or be heard, is not granted to all speakers in all contexts. Thus for example, immigrants using a foreign language may find that, although familiar with the conventions governing a particular use of that language, they may not be granted the right to speak or be heard in some contexts.
Re-examining Speech Event and Speech Community
While Hymes always indicated that he used the word ‘speech’ to mean all types of communicative modes/channels, sociolinguistic research has tended to focus on the spoken word. In more recent times, explicit attention has been paid to other modes of communication, thus extending the use of core concepts. For example, those working within literacy studies have used existing terms to signal a specific focus, such as “writing event” (Basso, 1974), “literacy event” (Heath, 1983; Barton and Hamilton, 1998). Likewise, Swales (1990) has argued that the term discourse community is more useful than speech community, as a term for describing and accounting for practices around written texts. Some theorists have argued that the word ‘speech’ signals that language is considered more significant than other practices, or that language is somehow divorced from other social purposes and activities, and have argued that the notion of practice, including the notion of “community of practice” is more all encompassing and powerful (Eckert and McConnell-Ginet, 2003; see also discussion about ways in which ‘practice’ is used in Schultz and Hull, 2002). A more fundamental challenge to the notion of speech community comes from theorists emphasizing the ways in which recent historical changes, notably globalization, powerfully influence the ways in which people engage in the world and disrupt traditional notions of community and community membership. Through a whole range of technological, social, and economic developments – shaping modes of labor, travel, and communication – individuals’ relations to others are more diverse and fluid, less restricted by time and space. The extent to which speech community with any presumed identifiable boundaries continues to be a meaningful category of observation and analysis is debatable within the context of a rapidly changing world (Rampton, 1998; Collins, 2003).
Re-examining the Notion of Speaker
Just as the notion of speech community has been challenged, so too have prominent labels used to categorize individuals in relation to communities – such as social class, ethnicity, linguistic repertoire, and gender. Such terms, because they often denote fixed sets of attributes and capacities, have been recognized as problematic, particularly by post structuralist writers who stress that identity is always in process. Indeed, the relationship between language and identity has established itself as a key area for research. Such work tends to challenge the idea that language use reflects categories of identity (I speak as I do because I am a working class woman) and emphasizes, rather, how individuals actively construct aspects of social and personal identity through their use of language in specific contexts (in speaking as I am, I am constructing and representing myself as a working class woman). While it is recognized that such constructions of identity are not free floating but are regulated by the specific contexts and interactions in which they occur (Cameron, 1997a), the fluidity of identity tends to be emphasized. In these approaches, the term ‘performativity’ rather than ‘performance’ is used, in order to signal how identity is enacted or performed through interaction (Cameron, 1997b; Butler, 1990/1999).
Re-examining Context
The work of Hymes placed the importance of context centrally within the concern of linguistics and advocated ethnography as the key organizing methodological tool with which to observe language use. However, there has been considerable debate about what constitutes context and how context should be conceptualized and explored. Two significant and quite distinct approaches to the study of context can be found in conversation analysis and critical discourse analysis: the former orients inwards as it were towards language, the latter orients outwards towards the social world. Conversation analysts argue that speakers construct and represent relevant aspects of context through their actual interaction and that these can be empirically observed (Schegloff, 1997). In contrast, critical discourse analysts (Fairclough, 1995) and feminist linguists (Cameron, 1992) have signaled the limitations to approaches that seek to understand context through empirical observation alone: there have been calls to draw on social, critical, and post structuralist theorists and philosophers such as Foucault, Habermas, Bourdieu, and Bakhtin, in order to explore the ways in which language use is related to ideology and power, and in order to explore how phenomena such as globalization are influencing communicative practices. Some of this work tends to explore language use through the lens of such theory and pays only minimal attention to examining contexts empirically (Chouliaraki and Fairclough, 1999), whereas others drawing on ethnographic traditions such as Hymes’s, aim to establish an approach that draws on both empirical observation and specific aspects of social theory (Rampton, 1995; Lee, 1996; Maybin, 1999). 
Attempts have been made to integrate levels of analysis at the macro level of society with micro levels of actual utterances; Gee (1996), for example, uses the term big ‘D’ discourse to refer to the former and little ‘d’ discourse to refer to the latter; Fairclough (1992) has developed a three-layered framework to explore such relations, which he refers to as a textually oriented discourse analysis (TODA).
See also: Assessment of Second Language Proficiency; Chomsky, Noam (b. 1928); Codes, Elaborated and Restricted (Bernstein); Communicative Language Teaching; Context, Communicative; Discourse, Foucauldian Approach; Ethnomethodology; Habermas, Jürgen (b. 1929); Identity in Sociocultural Anthropology and Language; Identity: Second Language; Intercultural Pragmatics and Communication; Speech and Language Community.
Bibliography
Barton D & Hamilton M (1998). Local literacies: reading and writing in one community. London: Routledge.
Basso K (1974). ‘The ethnography of writing.’ In Bauman R & Sherzer J (eds.) Explorations in the ethnography of speaking. Cambridge: CUP. 425–432.
Berkenkotter C, Huckin T & Ackerman J (1991). ‘Social context and socially constructed texts.’ In Bazerman C & Paradis J (eds.) Textual dynamics and the professions. Madison, WI: University of Wisconsin Press. 191–215.
Bernstein B (1996). Pedagogy, symbolic control and identity: theory, research, critique. London: Taylor and Francis.
Bloomfield L (1933). Language. New York: Holt, Rinehart and Winston.
Bourdieu P (1977). ‘The economics of linguistic exchanges.’ Social Science Information 16, 645–668.
Butler J (1990/1999). Gender trouble: feminism and the subversion of identity. New York/London: Routledge.
Cameron D (1992). Feminism and linguistic theory (2nd edn.). Basingstoke/London: Macmillan.
Cameron D (1997a). ‘Theoretical debates in feminist linguistics: questions of sex and gender.’ In Wodak R (ed.) Gender and discourse. London: Sage Publications.
Cameron D (1997b). ‘Performing gender identity: young men’s talk and the construction of heterosexual masculinity.’ In Johnson S & Meinhof U (eds.) Language and masculinity. Oxford: Blackwell.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chouliaraki L & Fairclough N (1999). Discourse in late modernity: rethinking critical discourse analysis. Edinburgh: Edinburgh University Press.
Collins J (2003). ‘Language, identity, and learning in the era of “expert guided” system.’ In Wortham S & Rymes B (eds.) The linguistic anthropology of education. Connecticut: Praeger. 31–60.
Culler J (1975). Structuralist poetics. London: Routledge/Kegan Paul.
Eckert P & McConnell-Ginet S (2003). Language and gender. Cambridge: CUP.
Fairclough N (1992). ‘The appropriacy of appropriateness.’ In Fairclough N (ed.) Critical language awareness. London: Longman. 233–252.
Fairclough N (1995). Critical discourse analysis: the critical study of language. London: Longman.
Firth A & Wagner J (1997). ‘On discourse, communication and (some) fundamental concepts in SLA research.’ Modern Language Journal 81(3), 285–300.
Gee J (1996). Social linguistics and literacies: ideologies in discourses (2nd edn.). Basingstoke: Falmer Press.
Gumperz J J & Hymes D (eds.) (1972/1986). Directions in sociolinguistics: the ethnography of communication. Oxford: Basil Blackwell.
Habermas J (1970). ‘Towards a theory of communicative competence.’ In Dreitzel H P (ed.) Recent sociology No. 2. New York: Macmillan. 114–148.
Hall J K & Eggington W G (2000). The sociopolitics of English language teaching. Clevedon: Multilingual Matters.
Heath S B (1983). Ways with words: language life and work in communities and classrooms. Cambridge: CUP.
Hymes D (1962/1968). ‘The ethnography of speaking.’ In Fishman J A (ed.) The ethnography of communication. The Hague: Mouton. 99–138.
Hymes D (1971). ‘Competence and performance in linguistic theory.’ In Huxley R & Ingram E (eds.) Language acquisition: models and methods. New York: Academic Press.
Hymes D (1972a). ‘On communicative competence.’ In Pride J B & Holmes J (eds.) Sociolinguistics. Harmondsworth: Penguin. 269–285.
Hymes D (1972b). ‘Models of the interaction of language and social life.’ In Gumperz J & Hymes D (eds.) Directions in sociolinguistics: the ethnography of communication. Oxford: Basil Blackwell. 35–71.
Hymes D (1972c). ‘Introduction.’ In Cazden C, John V & Hymes D (eds.) Functions of language in the classroom. New York: Teachers College Press. xi–lvii.
Hymes D (1974a). ‘Ways of speaking.’ In Bauman R & Sherzer J (eds.) Explorations in the ethnography of speaking. Cambridge: CUP.
Hymes D (1974b). Foundations in sociolinguistics: an ethnographic approach. Philadelphia: University of Pennsylvania Press.
Hymes D (2003). Now I know only so far: essays in ethnopoetics. Lincoln: University of Nebraska Press.
Jakobson R (1960). ‘Closing statement: linguistics and poetics.’ In Sebeok T A (ed.) Style in language. Cambridge, MA: MIT Press. 398–429.
Lee A (1996). Gender, literacy and the curriculum: rewriting school geography. London: Taylor and Francis.
Malinowski B (1923). ‘The problem of meaning in primitive languages.’ In Ogden C K & Richards I A (eds.) The meaning of meaning. London: Kegan Paul. 451–510.
Malinowski B (1935). Coral gardens and their magic (2 vols.). London: Allen and Unwin.
Maybin J (1999). ‘Framing and evaluation in 10–12 year old school children’s use of appropriated speech, in relation to their induction into educational procedures and practices.’ Text 19, 4.
Norton B (2000). Identity and language learning: gender, ethnicity and educational change. London: Longman.
Paulston C B (1992). Linguistic and communicative competence: topics in ESL. Clevedon: Multilingual Matters.
Philipsen G (1975). ‘Speaking “like a man” in Teamsterville: culture patterns of role enactment in an urban neighbourhood.’ Quarterly Journal of Speech 61, 13–22.
Rampton B (1995). Crossing: language and ethnicity among adolescents. London: Longman.
Rampton B (1998). ‘Speech community.’ In Verschueren J et al. (eds.) Handbook of pragmatics. Amsterdam: John Benjamins.
Saussure F de ([1916] 1959). In Bally C & Sechehaye A (eds.) Course in general linguistics. Baskin W (trans.). New York: McGraw Hill.
Schegloff E A (1997). ‘Whose text, whose context?’ Discourse and Society 8, 165–187.
Schultz K & Hull G (2002). ‘Locating literacy theory in out-of-school contexts.’ In Hull G & Schultz K (eds.) School’s out: bridging out-of-school literacies with classroom practice. New York/London: Teachers College Press Columbia University. 11–31.
Swales J (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press.
Wodak R (ed.) (1992). Language power and ideology. Oxford: Blackwell.
Communicative Language Teaching
S J Savignon, Pennsylvania State University, University Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.
Communicative language teaching (CLT) is best understood within the broader historical spectrum of methods or approaches to language teaching. Seen from a 21st-century modernist perspective that views teaching as rather more science than art, the theoretical grounding for the epistemology of practice offered by CLT can be found in (1) the second- or foreign language acquisition research that began to flourish in the 1970s and (2) a long-standing functional view of language and language use as social behavior. The interpretation or implementation of practice in language teaching contexts around the world is, of course, yet another matter. A consideration of these various influences highlights the major issues that confront CLT on the threshold of the 21st century.
Linguistic Theory and Classroom Practice
The essence of CLT is the engagement of learners in communication to allow them to develop their communicative competence. Use of the term ‘communicative’ in reference to language teaching refers to both the process and goals of learning. A central theoretical concept in CLT is communicative competence, a term introduced in the early 1970s into discussions of language (Habermas, 1970; Hymes, 1971) and second-language learning (Jakobovits, 1970; Savignon, 1971). Competence is defined as the expression, interpretation, and negotiation of meaning and looks to second-language acquisition research to account for its development (Savignon, 1972, 1983, 1997). The identification of learner communicative needs provides a basis for curriculum design. Descriptors sometimes used to refer to features of CLT include process-oriented, task-based, and inductive or discovery-oriented.
The elaboration of what has come to be known as CLT can be traced to concurrent developments in linguistic theory and language learning curriculum design, both in Europe and in North America. In Europe, the language needs of a rapidly increasing group of immigrants and guest workers, along with a rich British linguistic tradition that included social as well as linguistic context in the description of language behavior, led to the development of a syllabus for learners based on notional-functional concepts of language use. This notional-functional approach to curriculum design derived from neo-Firthian systemic or functional linguistics that views language as meaning potential and maintains the centrality of context of situation in understanding language systems and how they work (Firth, 1937; Halliday, 1978). With sponsorship from the Council of Europe, a Threshold Level of language ability was proposed for each of the languages of Europe in terms of what learners should be able to do with the language (van Ek, 1975). Functions were based on the assessment of learner needs and specified the end result or goals of an instructional program. The term ‘communicative’ was used to describe programs that followed a notional-functional syllabus based on needs assessment, and the language for specific purposes (LSP) movement was launched. Concurrently, development within Europe focused on the process of classroom language learning. In Germany, against a backdrop of social democratic concerns for individual empowerment articulated in the writings of philosopher Jürgen Habermas (1970), language teaching methodologists took the lead in the development of classroom materials that encouraged learner choice (Candlin, 1978). A collection of exercise types for communicatively oriented English language teaching was used in teacher in-service courses and workshops to guide curriculum change. Exercises
were designed to exploit the variety of social meanings contained within particular grammatical structures. A system of ‘chains’ encouraged teachers and learners to define their own learning path through a principled selection of relevant exercises (Piepho, 1974; Piepho and Bredella, 1976). Similar exploratory projects were also initiated by Candlin at his academic home, the University of Lancaster in England, and by Holec (1979) and his colleagues at the University of Nancy in France. Supplementary teacher resource materials promoting classroom CLT became increasingly popular (for example, see Maley and Duff, 1978). There was also a renewed interest in learner vocabulary building. The widespread promotion of audiolingual methodology with a focus on accuracy in terms of so-called native grammatical or syntactic form had resulted in the neglect of learner lexical resources (Coady and Huckin, 1997). At about the same time, paradigm-challenging research on adult classroom second-language acquisition at the University of Illinois (Savignon, 1971, 1972) used the term ‘communicative competence’ to characterize the ability of classroom language learners to interact with other speakers and to make meaning, as distinguished from their ability to recite dialogues or to perform on discrete-point tests of grammatical knowledge. At a time when pattern practice and error avoidance were the rule in language teaching, this study of adult classroom acquisition of French looked at the effect of practice in the use of coping strategies as part of an instructional program. By encouraging learners to ask for information, to seek clarification, to use circumlocution and whatever other linguistic and non-linguistic resources they could muster to negotiate meaning, and to stick to the communicative task at hand, teachers were invariably leading learners to take risks, to speak in other than memorized patterns. 
Consistent with the process of language development that was being documented in first-language and untutored or ‘natural’ second-language acquisition research, the communicative activities offered learners an opportunity to focus on meaning as opposed to form. Achievement tests administered at the end of the 18-week introductory-level instructional period showed conclusively that learners who had engaged in communication in lieu of repeating laboratory pattern drills performed with no less accuracy on discrete-point tests of grammatical structure. In fact, their communicative competence as measured in terms of fluency, comprehensibility, effort, and amount of communication in unrehearsed communicative tasks significantly surpassed that of learners who had had no such practice. Learner reactions to the test formats lent further support to the view that even beginners respond
well to activities that let them focus on meaning as opposed to formal features. A collection of role plays, games, and other communicative classroom activities was developed subsequently for inclusion in the adaptation of the French CREDIF materials, Voix et Visages de la France. The accompanying guide (Savignon, 1974) described the purpose of these activities as involving learners in the experience of communication. Teachers were encouraged to provide learners with the French equivalent of such expressions as ‘What’s the word for. . .?,’ ‘Please repeat,’ and ‘I don’t understand,’ expressions that would help them participate in the negotiation of meaning. Not unlike the efforts of Candlin and colleagues working in Europe, the focus was on classroom process and learner autonomy. The use of games, role plays, pair, and other small group activities gained acceptance and was subsequently recommended for inclusion in language teaching programs generally. The coping strategies identified in the Savignon (1971, 1972) study became the basis for the subsequent identification by Canale and Swain (1980) of strategic competence in their three-component framework for communicative competence, along with grammatical competence and sociolinguistic competence. Grammatical competence represented sentence-level syntax, forms that remain the focus of Chomskyan theoretical linguistic inquiry and were a primary goal of both grammar-translation and audiolingual methodologies. Consistent with a view of language as social behavior, sociolinguistic competence represented a concern for the relevance or appropriateness of those forms in a particular social setting or context. There is now widespread recognition of the importance of these various dimensions of language use and of the need for learners to be involved in the actual experience of communication if they are to develop communicative competence. 
Inclusion of sociolinguistic competence in the Canale and Swain framework reflected the challenge within American linguistic theory to the prevailing focus on syntactic features. Dell Hymes (1971) had reacted to Noam Chomsky’s (1965) characterization of the linguistic competence of the ‘‘ideal native speaker’’ and had used the term ‘communicative competence’ to represent the use of language in social context and the observance of sociolinguistic norms of appropriateness. His concern with speech communities and the integration of language, communication, and culture was not unlike that of Firth and Halliday in the British linguistic tradition. Hymes’s communicative competence may be seen as the equivalent of Halliday’s meaning potential. Social interaction rather than the abstract psycholinguistic
functioning of the human brain would become an identifying feature of CLT. Interpreting the significance of Hymes’s perspective for language learners, some U.S. methodologists tended to focus on ‘native speaker’ cultural norms and the difficulty, if not impossibility, of representing these norms in a classroom of ‘non-natives.’ In light of this difficulty, the appropriateness of communicative competence as an instructional goal for classroom learners was questioned (Paulston, 1974). CLT thus can be seen to derive from a multidisciplinary perspective that includes linguistics, anthropology, philosophy, sociology, psychology, and educational research. Its focus has been the elaboration and implementation of programs and methodologies that promote the development of functional language ability through learner participation in communicative events. Central to CLT is the understanding of language learning as both an educational and a political issue. Language teaching is inextricably tied to language policy. Viewed from a multicultural, intranational, and international perspective, diverse sociopolitical contexts mandate not only a diverse set of language learning goals but also a diverse set of teaching strategies. Program design and implementation depend on negotiation among policymakers, linguists, researchers, and teachers. The evaluation of program success requires a similar collaborative effort. The selection of methods and materials appropriate to both the goals and context of teaching begins with an analysis of socially defined language learner needs, as well as the styles of learning that prevail in a given educational setting (Berns, 1990).
Emergence of English as a Global Language
Along with a better understanding of the second-language acquisition process itself, the emergence of English as a global or international language has had a profound influence on language teaching, confronting language teacher education with new challenges worldwide. With specific reference to English, CLT recognizes that the norms followed by those in the ‘inner circle’ of English language users, to adopt the terminology proposed by Kachru (1992), may not be an appropriate goal for learners (Pennycook, 2001; Savignon, 2001, 2002). In a post-colonial, multicultural world where users of English in the outer and expanding circles outnumber those in the inner circle by a ratio of more than two to one, the use of such terms as ‘native’ or ‘native-like’ in the evaluation of communicative competence has become increasingly inappropriate. Learners moreover have been found to differ markedly in their reactions to learning a language for communication. Although some may welcome
apprenticeship in a new language, viewing it as an opportunity, others experience feelings of alienation and estrangement. Such phenomena may be individual or general to a community of learners. In Spanish-speaking Puerto Rico, for example, a long-standing general resentment of U.S. domination exerts a powerful negative influence on English language instruction. Not only learners but sometimes teachers also may consciously or subconsciously equate communicative English language learning with disloyalty to the history and culture of the island. Studying the rules of grammar and memorizing vocabulary lists is one thing. Using English for communication in other than stereotypical classroom exercises is quite another. Where they exist, such feelings are a strong deterrent to second- or foreign language use, even after 10 or more years of instruction. With respect to the documentation of cross-varietal differences of English, research to date has focused most often on sentence-level lexical and syntactic features. Consequently, such attempts as the Educational Testing Service (ETS) Test of English for International Communication (TOEIC) to represent norms for a standard English for international communication reflect a primarily lexical and syntactic emphasis (Lowenberg, 1992). The hegemony of essentially Western conventions at the levels of discourse and genre is represented or challenged less easily. Differences in the way genres are constructed, interpreted, and used clearly extend beyond lexical and syntactic variation. Such differences are currently thought of as discursive in nature and are included in discourse competence, a fourth component of communicative competence identified by Canale (1983). Pressures for a ‘democratization’ of discursive practices have in some settings resulted in genre mixing and the creation of new genres.
In professional communities, however, conformity to the practices of an established membership continues to serve an important gate-keeping function. The privilege of exploiting generic conventions becomes available only to those who enjoy a certain stature or visibility (Foucault, 1981; Fairclough, 1992; Bhatia, 1997).
Sociocultural Competence for a Dialogue of Cultures
Consistent with a view of language as social behavior, sociolinguistic competence is as we have seen integral to overall communicative competence. Second- or foreign language culture and its teaching have of course long been a concern of language teachers. Yet, if early research addressed the possibility of including some aspects of culture in a foreign language curriculum (for example, see Lado, 1957), recent
discussion has underscored the strong links between language and culture and their relevance for teaching and curriculum design (Valdes, 1986; Byram, 1989; Damen, 1990; Kramsch, 1993). So mainstream now is the view of culture and language as inseparable that the term ‘‘sociocultural’’ has come to be substituted for the term ‘‘sociolinguistic’’ in representing the components of communicative competence (Byram, 1997; Savignon, 2002; Savignon and Sysoyev, 2002). Interest in teaching culture along with language has led to the emergence of various integrative approaches. The Russian scholar Victoria Saphonova (1996:62) has introduced a sociocultural approach to teaching modern languages that she has described as ‘‘teaching for intercultural L2 communication in a spirit of peace and a dialogue of cultures.’’ Given the dialogic nature of culture (Bakhtin, 1981), we cannot fully understand one culture in the absence of contact with other cultures. Thus, dialogue can be seen to be at the very core of culture, where culture is understood as a dialogical self-consciousness of every civilization. The emergence of a focus on sociocultural competence can be seen in other European nations as well. The free flow of people and knowledge within the European Union has increased both the need and the opportunity for language learning and intercultural understanding. Brammerts (1996:121) described the creation of the International E-Mail Tandem Network, a project funded by the European Union that brings together universities from more than 10 countries to promote ‘‘autonomous, cooperative, and intercultural learning.’’ The project is an extension of the tandem learning initiated in the 1970s in an effort to unite many states in a multicultural, multilingual Europe. Collaboration between entire classrooms of learners is a focus of ongoing research (Savignon and Roithmeier, 2004; Kinginger, 2004).
Interpretations of CLT
Although the term CLT may be recognized worldwide, theoretical understanding and interpretations of it vary widely. Some methodologists have suggested that CLT is an essentially Western concept, inappropriate in other than Western contexts (Richards and Rogers, 2001; Rao, 2002). In addition, there are those who consider discussions of CLT to be passé (Bhatia, 2003; Kumaravadivelu, 2003; Savignon, 2003, 2004). Discouraged by the failure of both grammar-translation and audiolingual methods to prepare learners for the interpretation, expression, and negotiation of meaning and yet encouraged to adopt a variety of commercial materials and strategies increasingly labeled ‘communicative,’ many teachers and even teacher educators have been left
confused or disillusioned. Substantive revision of teaching practice appropriate to a given context is ultimately of course the responsibility of classroom teachers. Yet, they cannot be expected to change their practices without considerable administrative and governmental support along with extensive guided experiential pre-service and in-service professional development. Given the current widespread uncertainty as to just what are and are not essential features of CLT, a summary description would be incomplete without brief mention of what CLT is not. CLT is not concerned exclusively with face-to-face oral communication; principles of CLT apply equally to literacy. Whether written or oral, activities that involve readers and writers in the interpretation, expression, and negotiation of meaning are in and of themselves communicative. The goals of CLT depend on learner needs in a given context. Although group tasks have been found helpful in many contexts as a way of providing increased opportunity and motivation for communication, classroom group or pair work should not be considered an essential feature of CLT and may well be inappropriate in some settings. Finally, CLT does not exclude metalinguistic awareness or conscious knowledge of rules of syntax, discourse, and social appropriateness. However, knowing a rule is no substitute for using a rule. The creative use of interpretive and expressive skills in both reading and writing requires practice. CLT cannot be found in any one textbook or set of curricular materials inasmuch as strict adherence to a given text is not likely to be true to the process and goals of CLT. In keeping with the notion of context of situation, CLT is properly seen as an approach or theory of intercultural communicative competence to be used in developing materials and methods appropriate to a given context of learning.
No less than the means and norms of communication they are designed to reflect, communicative teaching methods will continue to be explored and adapted. Considerable resources, both human and monetary, are being deployed around the world to respond to the need for language teaching that is appropriate for the communicative needs of learners. In the literature on CLT, teacher education has not received adequate attention. What happens when teachers try to make changes in their teaching in accordance with various types of reform initiatives, whether top-down ministry of education policy directives or teacher-generated responses to social and technological change? Several recent reports of reform efforts in different nations provide a thought-provoking look at language teaching today as the collaborative and context-specific human activity that it is.
Redirection of English language education by Mombusho, the Japan Ministry of Education, includes the introduction of a communicative syllabus, the Japan Exchange and Teaching (JET) Program, and overseas in-service training for teachers. Previous encouragement to make classrooms more ‘communicative’ through the addition of ‘communicative activities’ led to the realization by Mombusho that teachers felt constrained by a structural syllabus that continued to control the introduction and sequence of grammatical features. With the introduction of a new national syllabus, structural controls were relaxed, and teachers found more freedom in the introduction of syntactic features. The theoretical rationale underlying the curriculum change in Japan includes both the well-known Canale and Swain (1980) model of communicative competence and the hypothetical classroom model of communicative competence, or the ‘‘inverted pyramid,’’ proposed by Savignon (1983: 46). Minoru Wada, senior advisor to Mombusho, described these efforts as ‘‘a landmark in the history of English education in Japan. For the first time it introduced into English education at both secondary school levels the concept of communicative competence. . . . The basic goal of the revision was to prepare students to cope with the rapidly occurring changes toward a more global society’’ (Wada, 1994:1). Following the research model adapted by Kleinsasser (1993) to understand language teachers’ beliefs and practices, Sato (2002) reported on a yearlong study of teachers of English in a private Japanese senior high school. Multiple data sources, including interviews, observations, surveys, and documents, offered insight into how EFL teachers learn to teach in this particular context. Among the major findings was the context-specific nature of teacher beliefs, which placed an emphasis on managing students, often to the exclusion of opportunities for English language learning.
Cheng (2002) has documented the influence of a new, more communicative English language test on the classroom teaching of English in Hong Kong, a region that boasts a strong contingent of applied linguists and language teaching methodologists and has experienced considerable political and social transformation in recent years. In keeping with curricular redesign to reflect a more task-based model of learning, alternative public examinations were developed to measure learners’ ability to make use of what they have learned, to solve problems, and to complete tasks. Cheng’s ambitious multiyear study found the effect of washback of the new examination on classroom teaching to be limited. There was a change in classroom teaching at the content level, but not at the more important methodological level.
The role of washback in Costa Rica, a small nation with a long democratic tradition of public education, offers a contrast with the Hong Kong study. Quesada-Inces (2001), a teacher educator with many years of experience, reported the findings of a multicase study that explored the relationship between teaching practice and the Bachillerato test of English, a national standardized reading comprehension test administered at the end of secondary school. Although teachers expressed strong interest in developing learner communicative ability in both written and spoken English, a reading comprehension test was seen to dominate classroom emphasis, particularly in the final two years of secondary school. The findings illustrate what Messick (1996) has called ‘‘negative washback,’’ produced by construct under-representation and construct irrelevance. The Bachillerato test of English does not reflect the content of the curriculum, assessing skills less relevant than those that go unmeasured. The English testing situation in Costa Rica is not unlike that described by Shohamy (1998) in Israel, where two parallel systems exist – one the official national educational policy and syllabus and the other reflected in the national tests of learner achievement. English language teaching has also been a focus of curricular reform in both Taiwan and South Korea. Adopting a sociocultural perspective on language use and language learning as a prerequisite to pedagogical innovation, Wang (2002) noted the efforts that have been made to meet the demand for competent English language users in Taiwan. They include a change in college entrance examinations, a new curriculum with a goal of communicative competence, and the island-wide implementation of English education in the elementary schools.
However, she noted that despite learner preference for a more communication-focused curriculum, grammar teaching continued to prevail and much more needed to be done to ensure quality classroom teaching and learning: ‘‘Further improvements can be stratified into three interrelated levels . . . teachers, school authorities, and the government. Each is essential to the success of the other efforts’’ (Wang, 2002: 145).
CLT in the 21st Century
In each of the studies sketched above, the research was both initiated and conducted by local educators in response to local issues. Although each is significant in its own right, together they can only suggest the dynamic and contextualized nature of language teaching in the world today. Nonetheless, the settings that have been documented constitute a valuable resource for understanding the current global status of
CLT. Viewed in kaleidoscopic fashion, they appear as brilliant multilayered bits of glass, tumbling about to form different yet always intriguing configurations. From these data-rich records of language teaching reform on the threshold of the 21st century three major themes emerge, suggestive of the road ahead:
1. The highly contextualized nature of CLT is underscored again and again. It would be inappropriate to speak of CLT as a teaching method in any sense of that term as it was used in the 20th century. Rather, CLT is an approach that understands language to be inseparable from individual identity and social behavior. Not only does language define a community but a community, in turn, also defines the forms and uses of language. The norms and goals appropriate for learners in a given setting, and the means for attaining these goals, are the concern of those directly involved.
2. Related both to the understanding of language as culture in motion and to the multilingual reality in which most of the world population finds itself is the futility of any definition of a ‘native speaker,’ a term that came to prominence in descriptive structural linguistics and was adopted by teaching methodologists to define an ideal for language learners.
3. Time and again, assessment seems to be the driving force behind curricular innovations. Increasing demands for accountability along with a positivistic stance that one cannot teach that which cannot be described and measured by a common yardstick continue to influence program content and goals. Irrespective of their own needs or interests, learners prepare for the tests they will be required to pass. High-stakes language tests often determine future access to education and opportunity.
See also: Habermas, Jürgen (b. 1929); Halliday, Michael A. K. (b. 1925); Language Teaching Traditions: Second Language; Second and Foreign Language Learning and Teaching; Teacher Preparation: Second Language.
Bibliography
Bakhtin M (1981). The dialogic imagination: four essays. Holquist M (ed.); Emerson C & Holquist M (trans.). Austin, TX: University of Texas Press.
Berns M (1990). Contexts of competence: English language teaching in non-native contexts. New York: Plenum.
Bhatia V (1997). ‘The power and politics of genre.’ World Englishes 16, 359–371.
Bhatia V (2003). ‘Response to S. J. Savignon, Teaching English as communication: a global perspective.’ World Englishes 22, 69–71.
Brammerts H (1996). ‘Language learning in tandem using the internet.’ In Warschauer M (ed.) Telecollaboration in foreign language learning. Manoa: Second Language Teaching and Curriculum Center, University of Hawaii.
Byram M (1989). Cultural studies in foreign language education. Clevedon: Multilingual Matters.
Byram M (1997). Teaching and assessing intercultural communicative competence. Clevedon: Multilingual Matters.
Canale M (1983). ‘From communicative competence to communicative language pedagogy.’ In Richards J & Schmidt R (eds.) Language and communication. London: Longman.
Canale M & Swain M (1980). ‘Theoretical bases of communicative approaches to second language teaching and testing.’ Applied Linguistics 1, 1–47.
Candlin C (1978). Teaching of English: principles and an exercise typology. London: Langenscheidt-Longman.
Cheng L (2002). ‘The washback effect on classroom teaching of changes in public examinations.’ In Savignon S J (ed.) Interpreting communicative language teaching: contexts and concerns in teacher education. New Haven, CT: Yale University Press. 91–111.
Chomsky N (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Coady J & Huckin T (eds.) (1997). Second language vocabulary acquisition. Cambridge: Cambridge University Press.
Damen L (1990). Culture learning: the fifth dimension in the language classroom. Reading, MA: Addison-Wesley.
Fairclough N (1992). Discourse and social change. Cambridge: Polity Press.
Firth J R (1937). Tongues of men. London: Watts & Co.
Foucault M (1981). The archeology of knowledge. New York: Pantheon.
Habermas J (1970). ‘Toward a theory of communicative competence.’ Inquiry 13, 360–375.
Halliday M A K (1978). Language as social semiotic: the social interpretation of language and meaning. Baltimore, MD: University Park Press.
Holec H (1979). Autonomy and foreign language learning. Strasbourg: Council of Europe.
Hymes D (1971). ‘Competence and performance in linguistic theory.’ In Huxley R & Ingram E (eds.) Language acquisition: models and methods. London: Academic Press.
Jakobovits L (1970). Foreign language learning: a psycholinguistic analysis of the issues. Rowley, MA: Newbury House.
Kachru B (1992). ‘World Englishes: approaches, issues, and resources.’ Language Teaching, 1–14.
Kinginger C (2004). ‘Communicative foreign language teaching through telecollaboration.’ In Van Esch C & Saint John O (eds.) New insights in foreign language learning and teaching. Frankfurt am Main: Peter Lang Verlag.
Kleinsasser R (1993). ‘A tale of two technical cultures: foreign language teaching.’ Teaching and Teacher Education 9, 373–383.
Kramsch C (1993). Context and culture in language teaching. Oxford: Oxford University Press.
Kumaravadivelu B (2003). Beyond methods: macrostrategies for language teaching. New Haven, CT: Yale University Press.
Lado R (1957). Linguistics across cultures: applied linguistics for language teachers. Ann Arbor: University of Michigan Press.
Lowenberg P (1992). ‘Testing English as a world language: issues in assessing non-native proficiency.’ In Kachru B B (ed.) The other tongue: English across cultures, 2nd edn. Urbana, IL: University of Illinois Press.
Maley A & Duff A (1978). Drama techniques in language learning. Cambridge: Cambridge University Press.
Messick S (1996). ‘Validity and washback in language testing.’ Language Testing 13, 241–256.
Paulston C (1974). ‘Linguistic and communicative competence.’ TESOL Quarterly, 347–362.
Pennycook A (2001). Critical applied linguistics. Hillsdale, NJ: Erlbaum.
Piepho H E (1974). Kommunikative Kompetenz als übergeordnetes Lernziel im Englischunterricht. Dornburg-Frickhofen, West Germany: Frankonius.
Piepho H E & Bredella L (eds.) (1976). ‘Contacts.’ Integriertes Englischlehrwerk für Klassen 5–10.
Quesada-Inces R (2001). ‘‘Washback overrides the curriculum: an exploratory study on the washback effect of a high-stakes standardized test in the Costa Rican EFL high school context.’’ Ph.D. diss. Pennsylvania State University.
Rao Z (2002). ‘Bridging the gap between teaching and learning styles in East Asian contexts.’ TESOL Journal 1(2), 5–11.
Richards J & Rogers T (2001). Approaches and methods in language teaching (2nd edn.). Cambridge: Cambridge University Press.
Saphonova V (1996). Teaching languages of international communication in the context of dialogue of cultures and civilizations. Voronezh: Istoki.
Sato K (2002). ‘Practical understandings of communicative language teaching and teacher development.’ In Savignon S J (ed.) Interpreting communicative language teaching: contexts and concerns in teacher education. New Haven, CT: Yale University Press. 41–81.
Savignon S J (1971). ‘‘A study of the effect of training in communicative skills as part of a beginning college French course on student attitude and achievement in linguistic and communicative competence.’’ Ph.D. diss. University of Illinois, Urbana-Champaign.
Savignon S J (1972). Communicative competence: an experiment in foreign language teaching. Philadelphia: Center for Curriculum Development.
Savignon S J (1974). ‘Teaching for communication.’ In Coulombe R et al. (eds.) Voix et visages de la France: level 1 teachers’ guide. Chicago: Rand-McNally. Reprinted in English Teaching Forum 16 (1978). 13–20.
Savignon S J (1983). Communicative competence: theory and classroom practice. Reading, MA: Addison-Wesley.
Savignon S J (1997). Communicative competence: theory and classroom practice (2nd edn.). New York: McGraw Hill.
Savignon S J (2001). ‘Communicative language teaching for the twenty-first century.’ In Celce-Murcia M (ed.) Teaching English as a second or foreign language. Boston: Heinle & Heinle. 13–28.
Savignon S J (ed.) (2002). Interpreting communicative language teaching: contexts and concerns in teacher education. New Haven, CT: Yale University Press.
Savignon S J (2003). ‘Teaching English as communication: a global perspective.’ World Englishes 22, 55–66.
Savignon S J (2004). ‘Review of Kumaravadivelu, Beyond methods: macrostrategies for language teaching.’ World Englishes 23(2), 328–330.
Savignon S J & Roithmeier W (2004). ‘Computer-mediated communication: texts and strategies.’ CALICO Journal 21, 265–290.
Savignon S J & Sysoyev P (2002). ‘Sociocultural strategies for a dialogue of cultures.’ Modern Language Journal 86, 508–524.
Shohamy E (1998). ‘Testing methods, testing consequences: Are they ethical? Are they fair?’ Language Testing, 14–15.
Valdes J (1986). Culture bound: bridging the cultural gap in language teaching. New York: Cambridge University Press.
Van Ek J (1975). Systems development in adult language learning: the Threshold Level in a European unit credit system for modern language learning by adults. Strasbourg: Council of Europe.
Wada M (ed.) (1994). The course of study for senior high school: foreign languages (English version). Tokyo: Kairyudo.
Wang C (2002). ‘Innovative teaching in EFL contexts: the case of Taiwan.’ In Savignon S J (ed.) Interpreting communicative language teaching: contexts and concerns in teacher education. New Haven, CT: Yale University Press. 131–153.
Communicative Principle and Communication
I Kecskés, State University of New York, Albany, NY, USA
© 2006 Elsevier Ltd. All rights reserved.
Inferential Nature of Communication
The main characteristic of human communicative behavior is that its major part takes place between the explicitly expressed words. Most of the time, there is a significant difference between what we say and what we mean. In spite of this, we usually have no difficulty figuring out what the speaker tries to communicate implicitly. Why is this so? According to Grice, the decisive feature of pragmatic interpretation is its inferential nature (Grice, 1961). He argued that most aspects of utterance interpretation that traditionally are regarded as conventional, or semantic, should be treated as conversational, or pragmatic. This means that the hearer constructs and evaluates a hypothesis about the speaker’s meaning. In this process she/he relies on the meaning of utterances, contextual and background assumptions, and general communicative principles that speakers are supposed to observe in normal circumstances. At the center of the Gricean approach are the so-called implicatures, which are aspects of the speaker’s meaning inferred on the basis of contextual assumptions and principles of communication. Of the three main elements of inferential intention-recognition (meaning of utterances, contextual and background assumptions, and principles of communication), the last has generated the most debate. Pragmaticians have been engaged in searching for communicative principles that govern communication. A principle is the formalized expression of the behavior of a system. It is not a statistical generalization but a causal, mechanical explanation, a general law. Models of communication, especially cognitive ones, have made serious efforts to identify the principles that govern different aspects of use and understanding of language.
Cooperative Principle
Grice regarded cooperation as the ruling element of verbal communicative interaction. He argued that utterances automatically create expectations that guide the hearer toward the speaker’s meaning. He considered communication to be both rational and cooperative, and claimed that the inferential intention-recognition is governed by a cooperative principle and maxims of quality, quantity, relation, and manner (truthfulness, informativeness, relevance
and clarity), which speakers are expected to observe (Grice, 1961, 1989: 368–372). The interpretation that a hearer should choose is the one that best satisfies his/her expectations. Inferential communication can be considered successful when the communicator provides evidence of her/his intention to convey a certain thought, and the audience infers this intention from the evidence provided by the communicator. Grice’s inferential approach to communication has been so fundamental that all subsequent pragmatic theories have been influenced by it. Researchers have accepted and relied on the inferential nature of communication, but some have questioned the cooperative principle and maxims as the governing principle of communication. Several critics of the Gricean view (e.g., Keenan, 1979; Traunmüller, 1991) expressed their skepticism about the universality of maxims, arguing that different cultures have different principles or maxims. According to Gumperz (1978), culturally colored interactional styles create culturally determined expectations and interpretive strategies and can lead to breakdowns in intercultural and interethnic communication. Others, such as Sperber and Wilson (1995), argued that cooperation is not essential to communication and suggested a reduction of Grice’s maxims to a single principle of relevance. According to this view, a rational speaker will choose an utterance that will provide the hearer with a maximum number of contextual implications for a minimum of processing effort. In recent years, two approaches have emerged as most influential in the debate about the communicative principle: the neo-Gricean view and the theory of relevance.
Neo-Gricean View Neo-Gricean pragmatists such as Horn (1984) and Levinson (2000) retained the view that cooperation is essential to communication. Whether generalized or particularized, they argue, conversational implicature derives from the shared presumption that speaker and hearer are interacting rationally and cooperatively to reach a common goal. Although they kept the cooperative principle as a decisive factor of communication, they revised the maxims to account for a range of generalized implicatures, which Grice described as carried in all normal contexts and contrasted with more context-dependent particularized implicatures. In his book, Levinson addressed the problem of ‘‘generalized conversational implicatures (GCI)’’ as opposed to ‘‘particularized conversational implicatures (PCI).’’ He claimed that only the former
are truly linguistic in that only they do not rely on ‘‘specific contextual assumptions’’ (Levinson, 2000: 16). He emphasized that ‘‘ . . . a theory of GCIs has to be supplemented with a theory of PCIs that will have at least as much, and possibly considerably more, importance to a general theory of communication. It is just to a linguistic theory that GCIs have an unparalleled import’’ (Levinson, 2000: 22). Levinson listed three principles that guide generalized conversational implicatures: Q-Principle: Speaker: Choose the maximally informative expression alternative (that still is true). Addressee: Assume that speaker has chosen the maximally informative expression alternative (that still is true). I-Principle: Speaker: Produce only as much linguistic information as necessary to satisfy the communicative purpose. Addressee: Enrich the given linguistic information, identify the most specific information relative to the communicative purpose. M-Principle (Modality/Manner/Markedness) Speaker: Communicate non-normal, non-stereotypical situations by expressions that contrast with those that you would choose for normal, stereotypical situations. Addressee: If something is communicated by expressions that contrast with those that would be used for normal, stereotypical meanings, then assume that the speaker wants to communicate a non-normal, non-stereotypical meaning.
Relevance Theory Supporters of Relevance Theory share Grice’s intuition that utterances raise expectations of relevance. However, they question several other aspects of his approach, including the need for a cooperative principle and maxims, the focus on pragmatic processes that contribute to implicatures rather than to explicit, truth-conditional content, the role of deliberate maxim violation in utterance interpretation, and the treatment of figurative utterances as deviations from a maxim or convention of truthfulness (Sperber and Wilson, 1995, 2004). Building on the central insights of Grice’s contribution but advancing beyond him in significant ways, Sperber and Wilson (1995) argued that cooperation is not crucial for ostensive communication. However, it is fundamental for all speakers to form their contributions so that the audience will not only attend to them but will be able to infer the intended meaning without unjustifiable processing effort. This approach is grounded in a general view of human cognition according to which human cognitive processes are geared to achieving the greatest possible cognitive effect for the smallest possible processing effort. In order for individuals to achieve this, they must focus their attention on what seems to be
the most relevant information available to them. A feature of Sperber and Wilson’s theory that is significantly different from Grice’s is the idea that the processing of an utterance involves the construction of a context in which the effects of the utterance are evaluated. The context is not given, but is enriched in such a way that the processing of the utterance is facilitated. (See Relevance Theory). Sperber and Wilson believe that to communicate verbally is to claim an individual’s attention: hence, to communicate is to imply that the information communicated is relevant. This fundamental idea, according to which communicated information comes with a guarantee of relevance, is the communicative principle of relevance. They argued that the principle of relevance is essential to explaining human communication because human cognition tends to be geared to the maximization of relevance (cognitive principle of relevance). According to relevance theory, utterances raise expectations of relevance not because speakers are expected to obey a cooperative principle and maxims or some other specifically communicative convention, but because the search for relevance is a basic feature of human cognition which communicators may utilize (Sperber and Wilson, 2004). In their book, Sperber and Wilson (1995) demonstrated how these principles are enough on their own to account for the interaction of linguistic meaning and contextual factors in utterance interpretation. Their claim is that the expectations of relevance raised by an utterance are precise enough, and predictable enough, to guide the hearer toward the speaker’s meaning. The aim is to explain in cognitively realistic terms what these expectations of relevance amount to and how they might contribute to an empirically plausible account of comprehension.
Criticism of the Principle of Relevance Levinson (2000) criticized Relevance Theory (RT), while making the case that generalized conversational implicatures comprise a distinct domain within pragmatics. He relies on the Gricean distinction between generalized and particularized conversational implicature and claims that an approach such as RT (Sperber and Wilson, 1995), which does not give any theoretical weight to the distinction and uses the same communicative principle and comprehension procedure in the derivation of all conversational implicatures, cannot really give an adequate account of the nature of these generalized inferences. According to Levinson, RT does not allow for an intermediate level of generalized conversational implicatures in between ‘literal meaning’ (semantics) and once-off (‘nonce’) inferences.
Some researchers have described the relevance-theoretic approach to communication as psychological rather than sociological. Mey said that ‘‘relevance theory . . . does not include, let alone focus on, the social dimensions of language’’ (2001: 89). Talbot (1994: 3525–3526) called relevance theory ‘‘an asocial model,’’ pointing out that, within the RT framework, there is no way of discussing any divergence of assumptions according to class, gender, or ethnicity. In their response to this criticism, Sperber and Wilson (1997) argued that sociological aspects are not left out of RT because the theory considers human communication inferential, and it presupposes and exploits an awareness of self and others. Inferential communication is fundamentally social, not just because it is a form of interaction but also because it exploits and enlarges the scope of basic forms of social cognition. Verbal communication usually conveys much more than is linguistically encoded. Pragmatic theories such as RT take an interest mainly in the enrichment of linguistic meaning, derivation of standard implicatures, and principles governing the process of communication. Sociolinguists are more interested in the ostensive or non-ostensive uses of the act of communication itself to convey claims and attitudes about the social relationship between the interlocutors. Sperber and Wilson accepted that RT largely ignored these issues. They emphasized, however, that they did not mean to deny the importance of these issues. They merely felt that, at that stage of the theory’s development, they could best contribute to the study of human communication by taking it at its most elementary level and abstracting away from these more complex aspects. Another issue is that communication can hardly be restricted to what people intend to communicate. People usually communicate more than they intend and, according to Mey and Talbot, Sperber and Wilson’s model rests on the exclusion of precisely this (Mey and Talbot, 1988: 746).
In their rebuttal, Sperber and Wilson (1997) pointed out that the issue is not whether non-ostensive forms of information transmission exist but whether they should be treated as communication. In RT, it is argued that unintentionally transmitted information is subject merely to general cognitive, rather than specifically communicative, constraints. Consequently, it falls under the first, or cognitive, principle of relevance rather than the second, or communicative, principle.
Communicative Principle Mey (1993, 2001) introduced an inclusive term ‘Communicative Principle’ that basically comprises
both the Cooperative Principle and the Principle of Relevance. He argued as follows: People talk with the intention to communicate something to somebody; this is the foundation of all linguistic behavior. I call this the Communicative Principle; even though this principle is not mentioned in the pragmatics literature (at least not under this name – a variant, the ‘Principle of Relevance’, will be discussed later in this chapter), it is nevertheless the hidden condition for all human pragmatic activity, and the silently agreed-on premise of our investigation into such activity. (Mey, 2001: 68–69).
According to the Communicative Principle, intention, cooperation, and relevance are all responsible for communicative action in a concrete context.
Conclusion In recent years pragmatics has been thriving in both the social and the inferential paradigms. However, only an integrated model that unifies the linguistic, cognitive, and social aspects of communication has a realistic prospect of accounting for what is universal and what is culture-specific in human verbal interaction. See also: Context and Common Ground; Context, Communicative; Cooperative Principle; Inference: Abduction, Induction, Deduction; Intercultural Pragmatics and Communication; Nonstandard Language Use; Relevance Theory.
Bibliography Grice H P (1961). ‘The causal theory of perception.’ Proceedings of the Aristotelian Society, Supplementary Volume 35: 121–152. Partially reprinted in Grice, 224–247. Grice H P (1989). Studies in the way of words. Cambridge, MA: Harvard University Press. Gumperz J (1978). ‘The conversational analysis of interethnic communication.’ In Ross E L (ed.) Interethnic communication: proceedings of the Southern Anthropological Society. Athens, GA: University of Georgia Press. 223–238. Horn L (1984). ‘Towards a new taxonomy for pragmatic inference: Q- and R-based implicature.’ In Schiffrin D (ed.) Meaning, form, and use in context. Washington, DC: Georgetown University Press. 11–42. Keenan E O (1979). ‘The universality of conversational postulates.’ Language in Society 5, 67–80. Levinson S (2000). Presumptive meanings: the theory of generalized conversational implicature. Cambridge, MA: MIT Press. Mey J (2001). Pragmatics: an introduction (2nd edn.). Oxford: Blackwell.
Mey J & Talbot M (1988). ‘Computation and the soul.’ Journal of Pragmatics 12, 743–789. Sperber D & Wilson D (1995). Relevance: communication and cognition. Oxford: Blackwell. [1986]. Sperber D & Wilson D (1997). ‘Remarks on relevance theory and social sciences.’ Multilingua 16, 145–151. Sperber D & Wilson D (2004). ‘Relevance theory.’ In Ward G & Horn L (eds.) Handbook of pragmatics. Oxford: Blackwell. 607–632.
Talbot M (1994). ‘Relevance.’ In Asher R (ed.) The encyclopedia of language and linguistics 8. Oxford: Elsevier. 3524–3527. Traunmüller H (1991). ‘Conversational maxims and principles of language planning.’ PERILUS XII, 25–47. Ward G & Horn L (eds.) (2004). Handbook of pragmatics. Oxford: Blackwell.
Communities of Practice P Eckert, Stanford University, Stanford, CA, USA © 2006 Elsevier Ltd. All rights reserved.
The notion ‘community of practice’ was developed by Jean Lave and Etienne Wenger (Lave and Wenger, 1991; Wenger, 2000) as the basis of a social theory of learning. A community of practice is a collection of people who engage on an ongoing basis in some common endeavor: a bowling team, a book club, a friendship group, a crack house, a nuclear family, a church congregation. The construct was brought into sociolinguistics (Eckert and McConnell-Ginet, 1992a, 1992b) as a way of theorizing language and gender – most particularly, of responsibly connecting broad categories to on-the-ground social and linguistic practice. The value of the notion to sociolinguistics and linguistic anthropology lies in the fact that it identifies a social grouping not in virtue of shared abstract characteristics (e.g., class, gender) or simple copresence (e.g., neighborhood, workplace), but in virtue of shared practice. In the course of regular joint activity, a community of practice develops ways of doing things, views, values, power relations, ways of talking. And the participants engage with these practices in virtue of their place in the community of practice, and of the place of the community of practice in the larger social order. The community of practice is thus a rich locus for the study of situated language use, of language change, and of the very process of conventionalization that underlies both. Two conditions of a community of practice are crucial in the conventionalization of meaning: shared experience over time and a commitment to shared understanding. A community of practice engages people in mutual sense making – about the enterprise they are engaged in, about their respective forms of participation in the enterprise, about their orientation to other communities of practice and to the world around them more generally. Whether this mutual sense making is
consensual or conflictual, it is based in a commitment to mutual engagement, and to mutual understanding of that engagement. Participants in a community of practice collaborate in placing themselves as a group with respect to the world around them. This includes the common interpretation of other communities, and of their own practice with respect to those communities, and ultimately the development of a style – including a linguistic style – that embodies these interpretations. Time, meanwhile, allows for greater consistency in this endeavor – for more occasions for the repetition of circumstances, situations, and events. It provides opportunities for joint sense making, and it deepens participants’ shared knowledge and sense of predictability. This not only allows meaning to be exercised, but it provides the conditions for setting down convention (Lewis, 1969). The community of practice offers a different perspective from the traditional focus on the speech community as an explanatory context for linguistic heterogeneity. The speech community perspective views heterogeneity as based in a geographically defined population, and structured by broad and fundamental social categories, particularly class, gender, age, race, and ethnicity. The early survey studies in this tradition (Labov, 1966; Wolfram, 1969; Trudgill, 1974; Macaulay, 1977) provided the backbone of variation studies, mapping broad distributions across large urban communities. What these studies could not provide is the link between broad, abstract patterns and the meanings that speakers are constructing in the concrete situated speech that underlies them. The search for local explanations of linguistic variability has spurred a range of ethnographic studies over the years (Labov, 1963; Gal, 1979; Eckert, 2000), and in recent decades the ethnographic trend has intensified. 
A major challenge in such studies is to find local settings in which speakers engage the most intensely in making sense of their place in the wider social world, and in which they articulate their linguistic behavior with this sense.
The construct ‘community of practice’ is a way of locating language use ethnographically so as to create an accountable link between local practice and membership in extralocal and broad categories. What makes a community of practice different from just any group of speakers (e.g., a bunch of kids found hanging out on the street, or a group of undergraduates assembled for an experiment) is not the selection of the speakers so much as the nature of the accountability for this selection. While every community of practice offers a window on the world, the value of this approach relies on the analyst’s ability to seek out communities of practice that are particularly salient to the sociolinguistic question being addressed. It is this selection that makes the difference between particularism and a close-up study with far-reaching significance. Explanation for broad patterns is to be found in speakers’ experience, understanding, and linguistic development as they engage in life as members of important overarching categories. A white working-class Italian-American woman does not develop her ways of speaking directly from the larger categories ‘working class,’ ‘Italian-American’ and ‘female,’ but from her day-to-day experience as a person who combines those three (and other) memberships. Her experience will be articulated by her participation in activities and communities of practice that are particular to her place in the social order. It is in these communities of practice that she will develop an identity and the linguistic practices to articulate this identity. Thus communities of practice are fundamental loci for the experience of membership in broader social categories – one might say that it is the grounded locus of the habitus (Bourdieu, 1977). Survey studies show us that working-class speakers lead in the adoption of local phonological change.
While one can speculate about the motivations for this early adoption on the basis of general knowledge about class, the actual dynamics of social meaning can only be found through direct examination of working-class linguistic practice. Ethnographic work in suburban Detroit high schools (Eckert, 2000) sought to understand the salience of class in adolescents’ day-to-day practice. The study uncovered an opposition between two large communities of practice, the jocks and the burnouts, that constitute class cultures in the context of the high school. The working-class culture of the burnouts and the middle-class culture of the jocks are specifically adolescent, and class consciousness and conflict take the form of a highlighted social opposition in school and the maximization of resources in constructing this opposition. Linguistic variables, a prime resource, correlated significantly with participation in these
communities of practice, rather than with parents’ social class. The jocks’ and burnouts’ contrasting orientation to such things as school, the urban area, relationships, and the future provided direct explanations for the burnouts’ lead in the adoption of new local changes. Another important aspect of the communities of practice approach is its focus on the fluidity of social space and the diversity of experience. The speech community perspective’s focus on demographic categories implies a center and a periphery (Rampton, 1999). The focus on average behavior for categories suggests a ‘typical’ speaker, erasing the important activity of speakers at the borders of categories. This also produces a static view of the relation between the linguistic and the social, since change tends to come from the borders (Pratt, 1988). Studies of communities of practice, therefore, can capture the interaction between social and linguistic change. Qing Zhang, for example (Zhang, 2001), has captured the role of stylistic practice among the new Beijing ‘yuppies’ in the development of new dialect features, and Andrew Wong (Wong, 2005) has traced semantic change in the differential use of the term tongzhi ‘comrade’ between the activist and nonactivist gay communities. Mary Bucholtz’s study (Bucholtz, 1996) of a group of girls who were fashioning themselves as geeks – a persona normally reserved for males – provided direct observation of girls pushing the envelope of gender in their daily linguistic practice. A community of practice that is central to many of its participants’ identity construction is an important locus for the setting down of joint history, allowing for the complex construction of linguistic styles. Such history also sets the stage for change. Emma Moore’s study of teenage girls in northern England (Moore, 2003) traced the gradual split of a group of somewhat rebellious ‘populars’ as some of them emerged as the tougher ‘townies’ in their ninth year. 
In the process, the vernacular speech patterns of the townies intensified in opposition to those of their more conservative friends. The enterprise of sociolinguistics (and linguistic anthropology) is to relate ways of speaking to ways of participating in the social world. This is not simply a question of discovering how linguistic form correlates with social structure or activity, but of how social meaning comes to be embedded in language. Meaning is made in the course of local social practice (McConnell-Ginet, 1989) and conventionalized on the basis of shared experience and understanding (Lewis, 1969). The importance of the community of practice lies in the recognition that identity is not fixed, that convention does not pre-exist use, and
that language use is a continual process of learning. The community of practice is a prime locus of this process of identity and linguistic construction. Communities of practice emerge in response to common interest or position, and play an important role in forming their members’ participation in, and orientation to, the world around them. It should be clear that the speech community and the community of practice approaches are both necessary and complementary, and that the value of each depends on having the right abstract categories and finding the communities of practice in which those categories are most salient. In other words, the best analytic process would involve feedback between the two approaches. See also: Gender; Identity and Language; Interactional
Sociolinguistics; Sociolinguistic Crossing.
Bibliography Bourdieu P (1977). Outline of a theory of practice. Cambridge: Cambridge University Press. Bucholtz M (1996). ‘Geek the girl: language, femininity and female nerds.’ In Warner N, Ahlers J, Bilmes L, Oliver M, Wertheim S & Chen M (eds.) Gender and belief systems: proceedings of the Fourth Berkeley Women and Language Conference. Berkeley, CA: Berkeley Women and Language Group. 119–132. Eckert P (2000). Linguistic variation as social practice. Oxford: Blackwell. Eckert P & McConnell-Ginet S (1992a). ‘Communities of practice: where language, gender and power all live.’ In Hall K, Bucholtz M & Moonwomon B (eds.) Locating power: Proceedings of the Second Berkeley Women and Language Conference. Berkeley, CA: Berkeley Women and Language Group. 89–99. Eckert P & McConnell-Ginet S (1992b). ‘Think practically and look locally: language and gender as community-based practice.’ Annual Review of Anthropology 21, 461–490.
Gal S (1979). Language shift: social determinants of linguistic change in bilingual Austria. New York: Academic Press. Labov W (1963). ‘The social motivation of a sound change.’ Word 18, 1–42. Labov W (1966). The social stratification of English in New York City. Washington, DC: Center for Applied Linguistics. Lave J & Wenger E (1991). Situated learning: legitimate peripheral participation. Cambridge: Cambridge University Press. Lewis D (1969). Convention. Cambridge, MA: Harvard University Press. Macaulay R K S (1977). Language, social class and education: a Glasgow study. Edinburgh: Edinburgh University Press. McConnell-Ginet S (1989). ‘The sexual (re)production of meaning: a discourse-based theory.’ In Frank F W & Treichler P A (eds.) Language, gender and professional writing: theoretical approaches and guidelines for nonsexist usage. New York: MLA. 35–50. Moore E (2003). Learning style and identity: a sociolinguistic analysis of a Bolton high school. Ph.D. thesis, University of Manchester. Pratt M L (1988). ‘Linguistic utopias.’ In Fabb N, Attridge D, Durant A & MacCabe C (eds.) The linguistics of writing: arguments between language and literature. New York: Methuen. 48–66. Rampton B (1999). Speech community. London: Centre for Applied Linguistic Research, Thames Valley University. Trudgill P (1974). The social differentiation of English in Norwich. Cambridge: Cambridge University Press. Wenger E (2000). Communities of practice. New York: Cambridge University Press. Wolfram W (1969). A sociolinguistic description of Detroit negro speech. Washington, DC: Center for Applied Linguistics. Wong A (2005). ‘The re-appropriation of tongzhi.’ Language in Society 34(5). Zhang Q (2001). Changing economics, changing markets: a sociolinguistic study of Chinese yuppies. Ph.D. thesis, Stanford University.
Comoros: Language Situation W Full, University of Mainz, Mainz, Germany © 2006 Elsevier Ltd. All rights reserved.
The Comoros consist of four main islands situated halfway between the East African coast and the northern tip of Madagascar. Linguistic and archaeological evidence show that the ancestors of today’s Comorians arrived here from East Africa, probably in
the 8th or 9th century (Nurse and Hinnebusch, 1993). Over the next 1000 years the Comoros were part of a great trading network in the Indian Ocean dominated by Arab merchants. In the 19th century France gradually increased its influence on the islands and finally declared them a French colony in 1912. Political demand for independence arose later than in most African countries, but in 1975 the three islands of Grande Comore (Ngazija/Shingazija), Anjouan
that language use is a continual process of learning. The community of practice is a prime locus of this process of identity and linguistic construction. Communities of practice emerge in response to common interest or position, and play an important role in forming their members’ participation in, and orientation to, the world around them. It should be clear that the speech community and the community of practice approaches are both necessary and complementary, and that the value of each depends on having the right abstract categories and finding the communities of practice in which those categories are most salient. In other words, the best analytic process would involve feedback between the two approaches.
See also: Gender; Identity and Language; Interactional Sociolinguistics; Sociolinguistic Crossing.
Bibliography
Bourdieu P (1977). Outline of a theory of practice. Cambridge: Cambridge University Press.
Bucholtz M (1996). ‘Geek the girl: language, femininity and female nerds.’ In Warner N, Ahlers J, Bilmes L, Oliver M, Wertheim S & Chen M (eds.) Gender and belief systems: proceedings of the Fourth Berkeley Women and Language Conference. Berkeley, CA: Berkeley Women and Language Group. 119–132.
Eckert P (2000). Linguistic variation as social practice. Oxford: Blackwell.
Eckert P & McConnell-Ginet S (1992a). ‘Communities of practice: where language, gender and power all live.’ In Hall K, Bucholtz M & Moonwomon B (eds.) Locating power: proceedings of the Second Berkeley Women and Language Conference. Berkeley, CA: Berkeley Women and Language Group. 89–99.
Eckert P & McConnell-Ginet S (1992b). ‘Think practically and look locally: language and gender as community-based practice.’ Annual Review of Anthropology 21, 461–490.
Gal S (1979). Language shift: social determinants of linguistic change in bilingual Austria. New York: Academic Press.
Labov W (1963). ‘The social motivation of a sound change.’ Word 18, 1–42.
Labov W (1966). The social stratification of English in New York City. Washington, DC: Center for Applied Linguistics.
Lave J & Wenger E (1991). Situated learning: legitimate peripheral participation. Cambridge: Cambridge University Press.
Lewis D (1969). Convention. Cambridge, MA: Harvard University Press.
Macaulay R K S (1977). Language, social class and education: a Glasgow study. Edinburgh: Edinburgh University Press.
McConnell-Ginet S (1989). ‘The sexual (re)production of meaning: a discourse-based theory.’ In Frank F W & Treichler P A (eds.) Language, gender and professional writing: theoretical approaches and guidelines for nonsexist usage. New York: MLA. 35–50.
Moore E (2003). Learning style and identity: a sociolinguistic analysis of a Bolton high school. Ph.D. thesis, University of Manchester.
Pratt M L (1988). ‘Linguistic utopias.’ In Fabb N, Attridge D, Durant A & MacCabe C (eds.) The linguistics of writing: arguments between language and literature. New York: Methuen. 48–66.
Rampton B (1999). Speech community. London: Centre for Applied Linguistic Research, Thames Valley University.
Trudgill P (1974). The social differentiation of English in Norwich. Cambridge: Cambridge University Press.
Wenger E (2000). Communities of practice. New York: Cambridge University Press.
Wolfram W (1969). A sociolinguistic description of Detroit negro speech. Washington, DC: Center for Applied Linguistics.
Wong A (2005). ‘The re-appropriation of tongzhi.’ Language in Society 34(5).
Zhang Q (2001). Changing economics, changing markets: a sociolinguistic study of Chinese yuppies. Ph.D. thesis, Stanford University.
Comoros: Language Situation
W Full, University of Mainz, Mainz, Germany
© 2006 Elsevier Ltd. All rights reserved.
The Comoros consist of four main islands situated halfway between the East African coast and the northern tip of Madagascar. Linguistic and archaeological evidence shows that the ancestors of today’s Comorians arrived here from East Africa, probably in the 8th or 9th century (Nurse and Hinnebusch, 1993). Over the next 1000 years the Comoros were part of a great trading network in the Indian Ocean dominated by Arab merchants. In the 19th century France gradually increased its influence on the islands and finally declared them a French colony in 1912. Political demand for independence arose later than in most African countries, but in 1975 the three islands of Grande Comore (Ngazija/Shingazija), Anjouan
(Ndzwani/Shindzwani), and Moheli (Mwali) became independent, while the majority on the fourth island, Mayotte (Maore), voted to remain a French overseas territory (see Mayotte: Language Situation). Despite many declarations by the United Nations and the Organization of African Unity condemning France’s behavior as neocolonial, the political division of the archipelago continues today. Since 1975 the political situation of the independent islands has been mostly unstable, with many coups d’état, often assisted by foreign mercenaries. In 1997 Anjouan and Moheli declared their secession from the dominant island of Grande Comore, and only in 2001, under the mediation of the Organization of African Unity, did all three islands agree to a new, strongly federal constitution that guaranteed each island autonomy in many political fields.
On the three islands of the Union of Comoros, the first language of nearly everybody is a form of Comorian, a Bantu language that in the past was often regarded as a mere dialect of Swahili. Today it is generally accepted that Comorian is an independent language, classified within the genetic subgroup of Sabaki languages together with Swahili and four other languages from the East African mainland (Nurse and Hinnebusch, 1993). It is estimated that there are more than 700 000 speakers of Comorian on the Comoro islands (including Mayotte), apart from large groups of Comorian emigrants in neighboring countries and in France. The language is subdivided into three dialects, following the geographical and political separation within the Union of Comoros: Ngazija/Shingazija (on Grande Comore), Ndzwani/Shindzwani (on Anjouan), and Shimwali (on Moheli). The biggest differences occur between Shingazidja and Shindzwani, while Shimwali stands in the middle of the linguistic continuum.
Although Comorian is the everyday language of the whole population, it was only in the constitution of 2001 that it gained the status of an official language alongside French and Arabic. French can be acquired only through formal education, but most Comorians attend school for only a few years, which is not enough to acquire a thorough knowledge of the language. The main function of French is that of an administrative language. Arabic is used only in religious contexts (more than 95% of Comorians are Muslims), but real knowledge of the language is restricted to specialists, most of whom have studied abroad.
See also: Bantu Languages; Mayotte: Language Situation; Swahili.
Bibliography
Ahmed-Chamanga M (1992). Lexique Comorien (Shinzuani)-Français. Paris: L’Harmattan.
Ahmed-Chamanga M (1997). Dictionnaire Français-Comorien. Paris: L’Harmattan.
Full W (forthcoming). Dialektologie des Komorischen.
Heepe M (1920). Die Komorendialekte Ngazidja, Nzwani und Mwali. Hamburg: L. Friederichsen.
Lafon M (1991). Lexique Français-Comorien (Shingazidja). Paris: L’Harmattan.
Nurse D & Hinnebusch T (1993). Swahili and Sabaki: a linguistic history. Berkeley & Los Angeles: University of California Press.
Ottenheimer M & Ottenheimer H (1994). Historical dictionary of the Comoro Islands. Metuchen & London: Scarecrow.
Comparative Constructions
L Stassen, Radboud University, Nijmegen, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.
Definition of the Domain
In semantic or cognitive terms, comparison can be defined as a mental act by which two objects are assigned a position on a predicative scale. Should this position be the same for both objects, then we have a case of the comparison of equality. If the positions on the scale are different, then we speak of the comparison of inequality. In both cases, however, the notion essentially involves three things: a predicative scale, which, in language, is usually encoded as a gradable predicate, and two objects. Although these objects can, in principle, be complex, the practice of typological linguistic research has been to restrict them to primary objects, which are typically encoded in the form of noun phrases. Thus, a comparative construction typically contains a predicate and two noun phrases, one of which is the object of comparison (the comparee NP), while the other functions as the ‘yardstick’ of the comparison (the standard NP). In short, prototypical instances of comparative constructions in the languages of the world
are sentences that are equivalent to the English sentences in (1), in which the noun phrase following the items as and than is the standard NP:
(1) English (Indo-European, Germanic)
(1a) John is as tall as Lucy
(1b) John is taller than Lucy
The Comparison of Inequality: Parameters
Modern literature on the typology of comparison has concentrated largely on the comparison of inequality. Relevant publications include Ultan (1972), Andersen (1983), and Stassen (1984, 1985). The last of these authors presents a typology of comparative constructions that is based on a sample of 110 languages and that boils down to four major types. A basic parameter in this typology is the encoding of the standard NP. First, one can make a distinction between instances of fixed-case comparatives and derived-case comparatives. In the former type, the standard NP is always in the same case, regardless of the case of the comparee NP. In the latter type, the standard NP derives its case assignment from the case of the comparee NP. Classical Latin is an example of a language in which both types were allowed. The sentences in (2) illustrate a construction type in which the standard NP is dependent on the comparee NP for its case marking. In contrast, sentence (3) shows a construction type in which the standard NP is invariably in the ablative case. As a result, sentence (3) is ambiguous between the readings of (2a) and (2b).
(2) Latin (Indo-European, Italic)
(2a) Brutum  ego      non  minus  amo            quam  Caesar
     B.-ACC  1SG.NOM  not  less   love.1SG.PRES  than  C.-NOM
     ‘I love Brutus no less than Caesar (loves Brutus)’ (Kühner and Stegmann, 1955: 466)
(2b) Brutum  ego      non  minus  amo            quam  Caesarem
     B.-ACC  1SG.NOM  not  less   love.1SG.PRES  than  C.-ACC
     ‘I love Brutus no less than (I love) Caesar’ (Kühner and Stegmann, 1955: 466)
(3) Latin (Indo-European, Italic)
     Brutum  ego      non  minus  amo            Caesare
     B.-ACC  1SG.NOM  not  less   love.1SG.PRES  C.-ABL
     (Kühner and Stegmann, 1955: 466)
Both types of comparative constructions can be subcategorized further, on the basis of additional parameters. Within the fixed-case comparatives, a first distinction is that between direct-object comparatives and locational comparatives. Direct-object comparatives (or, as Stassen [1985] calls them, Exceed-Comparatives) have as their characteristic that the standard NP is constructed as the direct object of a transitive verb with the meaning ‘to exceed’ or ‘to surpass.’ Thus, the construction typically includes two predicates, one of which is the comparative predicate, and another which is the ‘exceed’ verb. The comparee NP is the subject of the ‘exceed’ verb. Concentrations of the Exceed comparative are found in Sub-Saharan Africa, in China and Southeast Asia, and in Eastern Austronesia. Duala, a Bantu language from Cameroon, presents an instance of the Exceed comparative, as does Thai.
(4) Duala (Niger-Kordofanian, Northwest Bantu)
     nin   ndabo  e   kolo  buka    nine
     this  house  it  big   exceed  that
     ‘This house is bigger than that’ (Ittmann, 1939: 187)
(5) Thai (Austro-Asiatic, Kam-Tai)
     kǎw  sǔuŋ  kwàa    kon  túk   kon
     he   tall  exceed  man  each  man
     ‘He is taller than anyone’ (Warotamasikkhadit, 1972: 71)
Locational comparatives, on the other hand, are characterized by the fact that the standard NP is invariably constructed in a case form that has a locational/adverbial function. Depending on the exact nature of this function, locational (adverbial) comparatives can be divided into three further subtypes. Separative comparatives mark the standard NP as the source of a movement, with a marker meaning ‘from’ or ‘out of.’ Allative comparatives construct the standard NP as the goal of a movement (‘to, toward,’ ‘over, beyond’) or as a benefactive (‘for’). Finally, locative comparatives encode the standard NP as a location in which an object is at rest (‘in,’ ‘on,’ ‘at,’ ‘upon’). Concentrations of (the various subtypes of) the Locational Comparative are found in Africa north of the Sahara, in Eurasia (including the Middle East and India, but with the exception of the modern languages of Continental Europe), Eskimo, some Western North American languages, Mayan, Quechuan, Carib, Polynesian, and some (but not many) Australian and Papuan languages. Illustrations of the various subtypes of locational comparatives are:
(6) Mundari (Austro-Asiatic, Munda)
     sadom-ete   hati      mananga-i
     horse-from  elephant  big-3SG.PRES
     ‘The elephant is bigger than the horse’ (Hoffmann, 1903: 110)
(7) Estonian (Uralic, Balto-Finnic)
     kevad   on  sügis-est  ilusam
     spring  is  fall-from  more.beautiful
     ‘The spring is more beautiful than the fall’ (Oinas, 1966: 140)
(8) Maasai (Nilo-Saharan, Nilotic)
     sapuk  olkondi     to  lkibulekeny
     big    hartebeest  to  waterbuck
     ‘The hartebeest is bigger than the waterbuck’ (Tucker and Mpaayi, 1955: 93)
(9) Tamachek’ (Afro-Asiatic, Berber)
     kemmou  tehousid        foull  oult    ma  m
     you     pretty.2SG.FEM  upon   sister  of  you
     ‘You are prettier than your sister’ (Hanoteau, 1896: 52)
(10) Tubu (Nilo-Saharan, Saharan)
     sa-umma  gere   du  mado
     eye-his  blood  on  red
     ‘His eye is redder than blood’ (Lukas, 1953: 45)
Turning now to the derived-case comparatives, in which the case marking of the standard NP is derived from – or ‘parasitic on’ – the case marking of the comparee NP, we note that, again, two subtypes can be distinguished. First, there is the conjoined comparative. Here, the comparative construction consists of two structurally independent clauses, one of which contains the comparee NP, while the other contains the standard NP. Furthermore, the two clauses show a structural parallelism, in that the grammatical function of the comparee NP in one of the clauses is duplicated by the grammatical function of the standard NP in the other clause. If, for example, the comparee functions as the grammatical subject in its clause, the standard NP will also have subject status in its clause. Since the construction has two clauses, it follows that the construction will also have two independent predicates. In other words, the comparative predicate is expressed twice. There are two ways in which this double expression may be effected. The language may employ antonymous predicates in the two clauses (‘good-bad,’ ‘strong-weak’). Alternatively, the two predicates may show a positive-negative polarity (‘good-not good,’ ‘strong-not strong’). An example of the first variant is found in Amele (11); the second variant has been attested for Menomini (12). Sentence (13) illustrates one of the comparative constructions in Malay. Here the standard NP and the comparee NP are conjoined as sentence topics, and the following clause predicates the property of the comparee NP only; that is, in this (rather infrequent) variant of the Conjoined Comparative, the comparative predicate is expressed only once. In geographical terms, the conjoined comparative seems to be concentrated in the Southern Pacific, including Australian, Papuan, and Eastern Austronesian languages, but it is also common in large parts of the Americas, and there are also some cases in Eastern Africa.
(11) Amele (Papuan, Madang)
     jo     i     ben ,  jo     eu    nag
     house  this  big ,  house  that  small
     ‘This house is bigger than that house’ (Roberts, 1987: 135)
(12) Menomini (Algonquian)
     Tata’hkes-ew ,  nenah  teh  kan
     strong-3SG   ,  I      and  not
     ‘He is stronger than me’ (Bloomfield, 1962: 506)
(13) Malay (Austronesian, West Indonesian)
     kayu,  batu,   běrat  batu
     wood,  stone,  heavy  stone
     ‘Stone is heavier than wood’ (Lewis, 1968: 157)
A second subtype of derived-case comparison is defined negatively, in that the standard NP has derived case, but the construction does not have the form of a coordination of clauses. Instead, the construction features a specific comparative particle that accompanies the standard NP. With a few, mostly West-Indonesian exceptions, this particle comparative appears to be restricted to Europe. The English than comparative is a case in point. Other examples are the comparative construction in French, with its comparative particle que, and the comparative construction in Hungarian, which features the particle mint ‘than, like.’
(14) French (Indo-European, Romance)
     tu   es   plus  jolie   que   ta    soeur
     you  are  more  pretty  than  your  sister
     ‘You are prettier than your sister’ (B. Bichakjian, personal communication)
(15) Hungarian (Uralic, Ugric)
     Istvan  magasa-bb  mint  Peter
     I.NOM   tall-more  than  P.NOM
     ‘Istvan is taller than Peter’ (E. Moravcsik, personal communication)
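In summary, the typology of comparison of inequality developed in Stassen (1984, 1985) can be presented as follows:
(16) FIXED CASE
       Exceed comparative
       Locational comparative
         Separative
         Allative
         Locative
     DERIVED CASE
       Conjoined comparative
       Particle comparative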
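
The typology summarized above can be read as a small decision procedure over the parameters discussed so far. The following Python sketch is only an illustration of that reading: the `Construction` class, its field names, and the `classify` function are hypothetical conveniences that do not come from Stassen's work, and each construction is reduced to a handful of yes/no features.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Construction:
    """Hypothetical feature bundle for one comparative construction."""
    fixed_case: bool                      # standard NP always takes the same case
    exceed_verb: bool = False             # standard NP is the object of an 'exceed' verb
    standard_role: Optional[str] = None   # "source", "goal" or "location"
    coordinated: bool = False             # comparee and standard are conjoined


# Locational subtypes, keyed by the role the standard NP is marked for.
LOCATIONAL = {"source": "separative", "goal": "allative", "location": "locative"}


def classify(c: Construction) -> str:
    """Return the type label that the typology assigns to a feature bundle."""
    if c.fixed_case:
        if c.exceed_verb:
            return "exceed"
        if c.standard_role not in LOCATIONAL:
            raise ValueError("fixed-case construction needs an exceed verb or a locational role")
        return LOCATIONAL[c.standard_role]
    # Derived case: the particle type is defined negatively, as "not coordinated".
    return "conjoined" if c.coordinated else "particle"


# Feature values below are read off the numbered examples in the text.
examples = {
    "Duala (4)": Construction(fixed_case=True, exceed_verb=True),
    "Estonian (7)": Construction(fixed_case=True, standard_role="source"),
    "Maasai (8)": Construction(fixed_case=True, standard_role="goal"),
    "Tubu (10)": Construction(fixed_case=True, standard_role="location"),
    "Amele (11)": Construction(fixed_case=False, coordinated=True),
    "French (14)": Construction(fixed_case=False),
}
for name, construction in examples.items():
    print(name, "->", classify(construction))
```

The order of the tests mirrors the order of the parameters in the text: case assignment of the standard NP first, then the subtype within each branch.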
Predicate Marking in Comparative Constructions
Apart from, or in addition to, case assignment of the standard NP, a further possible parameter in the typology of comparative constructions is the presence or absence of comparative marking on the predicate. In the vast majority of languages, such overt marking is absent; predicative adjectives in comparatives retain their unmarked, ‘positive’ form. Some languages, however, mark a predicative adjective in a comparative construction by means of a special affix (e.g., -er in English, German, and Dutch, -ior in Latin, -bb in Hungarian, -ago in Basque) or a special adverb (more in English, plus in French). Especially in the case of comparative affixes, the etymological origin is largely unknown. As for the areal distribution of this predicate marking in comparatives, it can be observed that it is an almost exclusively European phenomenon, and that it is particularly frequent in languages that have a particle comparative construction. For a tentative explanation of this latter correlation, see Stassen (1985, Chap. 15).
Explanation of the Typology of Comparative Constructions
Stassen (1985) advances the claim that the typology of comparative constructions is derived from (and hence predicted by) the typology of temporal sequencing. That is, the type(s) of comparative construction that a language may employ is argued to be limited by the options that the language has in the encoding of (simultaneous or consecutive) sequences of events. A first indication in favor of this hypothesis is that at least one of the attested comparative types, viz., the conjoined comparative, has the overt form of a temporal sequence (in this case, a simultaneous coordination). Moreover, for most of the other comparative types a correlation with a possible encoding of some temporal sequence can be established as well. Stassen (1985) produces detailed evidence for the correctness of the following set of universals of comparative type choice:
a. If a language has an adverbial comparative, then that language allows deranking (i.e., nonfinite subordination) of one of the clauses in a temporal sequence, even when the two clauses in that sequence have different subjects.
b. If a language has an Exceed-Comparative, then that language allows deranking of one of the clauses in a temporal sequence only if the two clauses have identical subjects.
c. If a language has a conjoined comparative, then that language does not allow deranking of clauses in temporal sequences at all.
The parallelism between these various options in temporal sequence encoding and corresponding comparative types is illustrated by examples from Naga, Dagbane, and Kayapó:
(17) Naga (Sino-Tibetan, Tibeto-Burman)
(17a) A  de     kepu   ki  themma  lu    a   vu-we
      I  words  speak  on  man     that  me  strike-INDIC
      ‘As I spoke these words, that man struck me’ (Grierson (ed.), 1903: 417)
(17b) Themma  hau   lu    ki  vi-we
      man     this  that  on  good-INDIC
      ‘This man is better than that man’ (Grierson (ed.), 1903: 415)
(18) Dagbane (Niger-Kordofanian, Gur)
(18a) Nana      san-la    o-suli    n-dum       nira
      scorpion  take-HAB  his-tail  PREF-sting  people
      ‘The scorpion stings people with its tail’ (Fisch, 1912: 32)
(18b) O-make  dpeoo     n-gare-ma
      he-has  strength  PREF-exceed-me
      ‘He is stronger than me’ (Fisch, 1912: 20)
(19) Kayapó (Ge)
(19a) Ga-ja      nium-no
      2SG-stand  3SG-lie.down
      ‘You are standing, and/while he is lying down’ (Maria, 1914: 238)
(19b) Gan  ga-prik  ba   i-pri
      2SG  2SG-big  1SG  1SG-small
      ‘You are bigger than me’ (Maria, 1914: 237)
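
Universals (a) to (c) are one-way implications from a comparative type to a deranking option, so they can be checked mechanically against a language description. The sketch below is a simplified illustration under stated assumptions: the `Deranking` enum, the `PRESUPPOSED` table, and `is_consistent` are hypothetical names, and each language is assumed to have a single deranking profile, which is a simplification of the actual claim.

```python
from enum import Enum


class Deranking(Enum):
    """How freely a language may derank a clause in a temporal sequence."""
    NONE = "no deranking"
    SAME_SUBJECT_ONLY = "deranking only under identical subjects"
    DIFFERENT_SUBJECT = "deranking even with different subjects"


# Universals (a)-(c): comparative type -> deranking option it presupposes.
PRESUPPOSED = {
    "adverbial": Deranking.DIFFERENT_SUBJECT,
    "exceed": Deranking.SAME_SUBJECT_ONLY,
    "conjoined": Deranking.NONE,
}


def is_consistent(comparative_type: str, deranking: Deranking) -> bool:
    """True if the pairing does not violate universals (a)-(c)."""
    presupposed = PRESUPPOSED.get(comparative_type)
    # Types the universals say nothing about (e.g. "particle") are never violations.
    return presupposed is None or presupposed is deranking


# Pairings as illustrated by examples (17)-(19) in the text.
print(is_consistent("adverbial", Deranking.DIFFERENT_SUBJECT))   # Naga
print(is_consistent("exceed", Deranking.SAME_SUBJECT_ONLY))      # Dagbane
print(is_consistent("conjoined", Deranking.NONE))                # Kayapó
```

Because the table has no entry for the particle comparative, the check is silent about it, which matches the fact that universals (a) to (c) do not mention that type.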
Given that the universals listed above meet with very few and ‘incidental’ counterexamples, Stassen (1985) concludes that the typology of comparative constructions is modeled on the typology of temporal sequencing, so that, in effect, comparative constructions appear to be a special case of the encoding of temporal sequences. A residual problem for this modeling analysis of comparative types is presented by the particle comparatives. Like conjoined comparatives, particle comparatives form a case of derived-case comparison, but unlike conjoined comparatives their surface structure form is not that of a coordination. Nonetheless, there are indications that even particle comparatives are coordinate in origin. In a number of cases, the particle used in particle comparatives has a clear source in a coordinating conjunction or adverb (e.g., karo ‘than/but’ in Javanese, dan ‘than/then’ in Dutch,
baino ‘than/but’ in Basque, asa ‘than/then’ in Toba Batak, noria ‘than/after that’ in Goajiro, ngem ‘than/but’ in Ilocano, na ‘than/nor’ in Scottish Gaelic, nor ‘than/nor’ in Scottish English, è ‘than/or’ in Classical Greek). Furthermore, particle comparatives in at least some languages share a number of syntactic properties with coordinations. For example, the Dutch comparative allows Gapping, a rule which is commonly thought to be restricted to coordinate structures.
(20) Dutch (Indo-European, Germanic)
(20a) Ik  verzamel  boeken  en   mijn  broer    verzamelt  platen
      I   collect   books   and  my    brother  collects   records
      ‘I collect books and my brother collects records’ (own data)
(20b) Ik verzamel boeken en mijn broer Ø platen (own data)
(21) Dutch (Indo-European, Germanic)
(21a) Ik  koop  meer  boeken  dan   mijn  broer    platen   koopt
      I   buy   more  books   than  my    brother  records  buys
      ‘I buy more books than my brother buys records’ (own data)
(21b) Ik koop meer boeken dan mijn broer platen Ø (own data)
One might argue, then, that particle comparatives must be seen as grammaticalizations from an underlying sequential construction. In this way, the particle comparative does not have to present a counterexample to the modeling analysis of comparative constructions, although it certainly forms a recalcitrant case.
Bibliography
Andersen P K (1983). Word order typology and comparative constructions. Amsterdam: Benjamins.
Bloomfield L (1962). The Menomini language. New Haven: Yale University Press.
Fisch R (1912). Grammatik der Dagomba-Sprache. Berlin: Reimer.
Grierson G A (1903). Linguistic survey of India, vol. III: Tibeto-Burman family, part II: Specimens of the Bodo, Naga and Kachin groups. Calcutta: Government Printing Office.
Hanoteau A (1896). Essai de grammaire de la langue tamachek’. Algiers: Jourdan.
Hoffmann J (1903). Mundari grammar. Calcutta: Bengal Secretariat Press.
Ittmann J (1939). Grammatik des Duala. Berlin: Reimer.
Kühner R & Stegmann C (1955). Ausführliche Grammatik der lateinischen Sprache: Satzlehre. Leverkusen: Gottschalk.
Lewis M B (1968). Malay. London: St. Paul’s House.
Lukas J (1953). Die Sprache der Tubu in der zentralen Sahara. Berlin: Akademie-Verlag.
Maria P A (1914). ‘Essai de grammaire Kaiapó, langue des Indiens Kaiapó, Brésil.’ Anthropos 9, 233–240.
Oinas F J (1966). Basic course in Estonian. Bloomington: Indiana University.
Roberts J R (1987). Amele. London: Croom Helm.
Stassen L (1984). ‘The comparative compared.’ Journal of Semantics 3, 143–182.
Stassen L (1985). Comparison and universal grammar. Oxford: Blackwell.
Tucker A N & Tompo ole Mpaayi J (1955). A Maasai grammar with vocabulary. London: Longmans, Green.
Ultan R (1972). ‘Some features of basic comparative constructions.’ Working Papers on Language Universals (Stanford) 9, 117–162.
Warotamasikkhadit U (1972). Thai syntax. The Hague: Mouton.
See also: Antonymy and Incompatibility; Comparatives: Semantics.
Comparatives, Semantics
C Kennedy, Northwestern University, Evanston, IL, USA
© 2006 Elsevier Ltd. All rights reserved.
Introduction
The ability to establish orderings among objects and make comparisons between them according to the amount or degree to which they possess some property is a basic component of human cognition. Natural languages reflect this fact: all languages have syntactic categories that express gradable concepts, and all languages have designated comparative constructions, which are used to express explicit orderings between two objects with respect to the degree or amount to which they possess some property (Sapir, 1944). In many languages, comparatives are based on specialized morphology and syntax. English exemplifies this type of system. It uses the morphemes more/-er, less, and as specifically for the purpose of establishing orderings of superiority, inferiority, and equality, respectively, and the morphemes than and as to mark the standard against which an object is compared:
690 Comparative Constructions
baino ‘than/but’ in Basque, asa ‘than/then’ in Toba Batak, noria ‘than/after that’ in Goajiro, ngem ‘than/ but’ in Ilocano, na ‘than/nor’ in Scottish Gaelic, nor ‘than/nor’ in Scottish English, e` ‘than/or’ in Classical Greek). Furthermore, particle comparatives in at least some languages share a number of syntactic properties with coordinations. For example, the Dutch comparative allows Gapping, a rule which is commonly thought to be restricted to coordinate structures. (20) Dutch (Indo-European, Germanic) (20a) Ik verzamel boeken en mijn I collect books and my broer verzamelt platen brother collects records ‘I collect books and my brother collects records’ (own data) (20b) Ik verzamel boeken en mijn broer Ø platen (own data) (21) Dutch (Indo-European, Germanic) (21a) Ik koop meer boeken dan I buy more books than mijn broer platen koopt my brother records buys ‘I buy more books than my brother buys records’ (own data) (21b) Ik koop meer boeken dan mijn broer platen Ø (own data)
One might argue, then, that particle comparatives must be seen as grammaticalizations from an underlying sequential construction. In this way, the particle comparative does not have to present a counterexample to the modeling analysis of comparative constructions, although it certainly forms a recalcitrant case.
Bibliography Andersen P K (1983). Word order typology and comparative constructions. Amsterdam: Benjamins. Bloomfield L (1962). The Menomini language. New Haven: Yale University Press. Fisch R (1912). Grammatik der Dagomba-Sprache. Berlin: Reimer. Grierson G A (1903). Linguistic survey of India, vol. III: Tibeto-Burman family, part II: Specimens of the Bodo, Naga and Kachin groups. Calcutta: Government Printing Office. Hanoteau A (1896). Essai de grammaire de la langue tamachek’. Algiers: Jourdan. Hoffmann J (1903). Mundari grammar. Calcutta: Bengal Secretariat Press. Ittman J (1939). Grammatik des Duala. Berlin: Reimer. Ku¨hner R & Stegmann C (1955). Ausfu¨hrliche Grammatik der lateinischen Sprache: Satzlehre. Leverkusen: Gottschalk. Lewis M B (1968). Malay. London: St. Paul’s House. Lukas J (1953). Die Sprache der Tubu in der zentralen Sahara. Berlin: Akademie-Verlag. Maria P A (1914). ‘Essai de grammaire Kaiapo´, langue des Indiens Kaiapo´, Bre´sil.’ Anthropos 9, 233–240. Oinas F J (1966). Basic course in Estonian. Bloomington: Indiana University. Roberts John R (1987). Amele. London: Croom Helm. Stassen L (1984). ‘The comparative compared.’ Journal of Semantics 3, 143–182. Stassen L (1985). Comparison and universal grammar. Oxford: Blackwell. Tucker A N & Tompo ole Mpaayi J (1955). A Maasai grammar with vocabulary. London: Longmans, Green. Ultan R (1972). ‘Some features of basic comparative constructions.’ Working Papers On Language Universals (Stanford) 9, 117–162. Warotamasikkhadhit U (1972). Thai syntax. The Hague: Mouton.
See also: Antonymy and Incompatibility; Comparatives: Semantics.
Comparatives, Semantics C Kennedy, Northwestern University, Evanston, IL, USA © 2006 Elsevier Ltd. All rights reserved.
Introduction
The ability to establish orderings among objects and make comparisons between them according to the amount or degree to which they possess some property is a basic component of human cognition. Natural languages reflect this fact: all languages have syntactic categories that express gradable concepts, and all
languages have designated comparative constructions, which are used to express explicit orderings between two objects with respect to the degree or amount to which they possess some property (Sapir, 1944). In many languages, comparatives are based on specialized morphology and syntax. English exemplifies this type of system. It uses the morphemes more/-er, less, and as specifically for the purpose of establishing orderings of superiority, inferiority, and equality, respectively, and the morphemes than and as to mark the standard against which an object is compared:
(1a) Mercury is closer to the sun than Venus. (1b) The Mars Pathfinder mission was less expensive than previous missions to Mars. (1c) Uranus doesn’t have as many rings as Saturn.
In the case of properties for which specific measure units are defined, it is also possible to express differences between objects with respect to the degree to which they possess some property, even when the predicate from which the comparative is formed does not permit explicit measurement: (2a) Mercury is 0.26 AU closer to the sun than Venus. (2b) ??Mercury is 0.46 AU close to the sun.
Languages such as English also allow for the possibility of expressing more complex comparisons by permitting a range of phrase types after than and as. For example, (3a) expresses a comparison between the degrees to which the same object possesses different properties, (3b) compares the degrees to which different objects possess different properties, and (3c) relates the actual degree that an object possesses a property to an expected degree. (3a) More meteorites vaporize in the atmosphere than fall to the ground. (3b) The crater was deeper than a 50-story building is tall. (3c) The flight to Jupiter did not take as long as we expected.
Finally, many languages also have related degree constructions that do not directly compare two objects but instead provide information about the degree to which an object possesses a gradable property by relating this degree to a standard based on some other property or relation. The English examples in (4) using the morphemes too, enough and so exemplify this sort of construction. (4a) The equipment is too old to be of much use to us. (4b) Current spacecraft are not fast enough to reach the speed of light. (4c) The black hole at the center of the galaxy is so dense that nothing can escape the pull of its gravity, not even light.
Example (4b), for instance, denies that the speed of current spacecraft is as great as the speed required to equal the speed of light.
Gradability
A discussion of the semantics of comparison must begin with the semantics of gradable predicates more generally. Not all properties can be used in comparatives, as shown by the contrast between the examples in (1) and (5).
(5a) ??Giordano Bruno is deader than Galileo. (5b) ??The new spacecraft is more octagonal than the old one. (5c) ??Carter is as former a president as Ford.
The crucial difference between predicates such as expensive and close, on the one hand, and dead, octagonal, and former, on the other, is that the first, but not the second, are gradable – they express properties that support (nontrivial) orderings. Comparatives thus provide a test for determining whether a predicate is inherently gradable or not. The most common analysis of gradable predicates assigns them a unique semantic type that directly represents their order-inducing feature; they are analyzed as expressions that map their arguments onto abstract representations of measurement, or scales. Scales have three crucial parameters, the values of which must be specified in the lexical entry of particular gradable predicates: a set of degrees, which represent measurement values; a dimension, which indicates the property being measured (cost, temperature, speed, volume, height, etc.); and an ordering relation on the set of degrees, which distinguishes between predicates that describe increasing properties (e.g., tall) and those that describe decreasing properties (e.g., short) (see Sapir, 1944; Bartsch and Vennemann, 1973; Cresswell, 1977; Seuren, 1978; von Stechow, 1984a; Bierwisch, 1989; Klein, 1991; Kennedy, 1999; Schwarzschild and Wilkinson, 2002). The standard implementation of this general view claims that gradable predicates have (at least) two arguments: an individual and a degree. Gradable predicates further contain as part of their meanings a measure function and a partial ordering relation such that the value of the measure function applied to the individual argument returns a degree on the relevant scale that is at least as great as the value of the degree argument. The adjective expensive, for example, expresses a relation between an object x and a degree of cost d such that the cost of x is at least as great as d. In order to derive a property of individuals, it is necessary to first saturate the degree argument. 
In the case of the positive (unmarked) form, the value of the degree argument is contextually fixed to an implicit norm or standard of comparison, whose value may vary depending on a number of different contextual factors (such as properties of the subject, the type of predicate, and so forth; see Vagueness). For example, the truth conditions of a sentence such as (6a) can be represented as in (6b), where size is a function from objects to degrees of size and ds is the contextually determined standard – the cutoff point for what counts as large in the context of utterance.
In the context here, the various objects in the solar system, the value of ds is typically such that (6a) is false. If we are talking about Saturn’s moons, however, then ds is such that (6a) is true. This sort of variability is a defining feature of gradable adjectives as members of the larger class of vague predicates.
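The context dependence of the positive form can be sketched in a few lines of Python. This is only a toy illustration, assuming the running case is a sentence along the lines of 'Titan is large'. The names SIZE, at_least, and positive, the two cutoff values, and all numbers are hypothetical and sit on an arbitrary scale; none of them are measurements.

```python
# Toy model of the positive form of a gradable adjective.
# All sizes and cutoffs are made-up illustrative numbers, not measurements.

SIZE = {"titan": 50, "hyperion": 3, "jupiter": 1400}  # hypothetical degrees of size


def at_least(measure, x, d):
    """Gradable predicate as a relation: x's degree is at least as great as d."""
    return measure[x] >= d


def positive(measure, x, standard):
    """Positive form: the degree argument is fixed to a contextual standard."""
    return at_least(measure, x, standard)


# Hypothetical standards for two different contexts of utterance.
D_S_SOLAR_SYSTEM = 500   # cutoff when talking about objects in the solar system
D_S_SATURN_MOONS = 10    # cutoff when talking about Saturn's moons

titan_large_solar = positive(SIZE, "titan", D_S_SOLAR_SYSTEM)
titan_large_moons = positive(SIZE, "titan", D_S_SATURN_MOONS)
```

The same object and the same measure yield different truth values only because the standard changes, which is the variability the text attributes to vague predicates.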
Comparison
In contrast to the positive form, comparatives (and degree constructions in general) explicitly fix the value of the degree argument of the predicate. There are a number of implementations of this basic idea (see von Stechow, 1984a, for a comprehensive survey), but most share the core assumption that the comparative morphemes fix the value of the degree argument of the comparative-marked predicate by requiring it to stand in a particular relation – $>$ for more, $<$ for less, and $\geq$ for as – to a second degree, the comparative standard, which is provided by the comparative clause (the complement of than or as). One common strategy is to assign the comparative morpheme essentially the same semantic type as a quantificational determiner – it denotes a relation between two sets of degrees. One of these sets is derived by abstracting over the degree argument of the comparative predicate; the second is derived by abstracting over the degree argument of a corresponding predicate in the comparative clause. This analysis presupposes that the comparative clause contains such a predicate. In some cases, it is present in the surface form (see (3b)), but typically, in particular whenever it is identical to the comparative predicate, it is eliminated from the surface form by an obligatory deletion operation. For example, in the analysis developed in Heim (2000), more (than) denotes a relation between two sets of degrees such that the maximal element of the first (provided by the main clause) is ordered above the maximal element of the second (provided by the comparative clause). At the relevant level of semantic representation, a sentence such as (7) has the constituency indicated in (8a) (where material elided from the surface form is enclosed in angle brackets) and the truth conditions in (8b). (7) Titan is larger than Hyperion. (8a) [Titan is d large] more than [Hyperion ⟨is d′ large⟩] (8b) $\max\{d \mid \mathit{large}(t) \geq d\} > \max\{d' \mid \mathit{large}(h) \geq d'\}$
Note that because the truth conditions of the comparative form do not involve reference to a contextual norm, the comparative does not entail the corresponding positive. Thus (8a), for example, can be true even in a context in which (6a) is false.
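A small Python sketch can make the maximality-based truth conditions concrete. It assumes a finite integer scale and made-up sizes; the names SCALE, SIZE, degrees, more, and positive are illustrative only and are not drawn from any of the cited analyses.

```python
# Sketch of a max-based comparative on a finite toy scale.
# The scale and the sizes are made-up illustrative values.

SCALE = range(0, 101)                  # hypothetical degrees 0..100
SIZE = {"titan": 50, "hyperion": 3}    # illustrative only


def degrees(x):
    """The set {d | large(x) >= d}: every degree that x's size reaches."""
    return {d for d in SCALE if SIZE[x] >= d}


def more(main_set, standard_set):
    """'more (than)': the max of the first set is ordered above the max of the second."""
    return max(main_set) > max(standard_set)


def positive(x, d_s):
    """Positive form relative to a contextual standard d_s."""
    return SIZE[x] >= d_s


comparative_true = more(degrees("titan"), degrees("hyperion"))
positive_true = positive("titan", 80)   # 80 is a hypothetical contextual cutoff
```

With these toy values the comparative comes out true while the positive form, evaluated against the hypothetical cutoff of 80, comes out false, which mirrors the point that the comparative does not entail the positive.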
Differential comparatives such as (2a) can be accounted for by modifying the basic semantics to include a measure of the difference between the respective (maximal) degrees contributed by the two arguments of the comparative morpheme (von Stechow, 1984a; Schwarzschild and Wilkinson, 2002). Such differences always correspond to closed intervals on a scale and so are measurable even if the degrees introduced by the base-gradable predicate themselves are not (Seuren, 1978; von Stechow, 1984b; Kennedy, 2001). Because the standard of comparison is derived by abstracting over a degree variable in the comparative clause, this approach allows for the expression of arbitrarily complex comparisons such as those in (3). There are some limits, however. First, the comparative clause is a wh-construction, so the syntactic operation that builds the abstraction structure is constrained by the principles governing long-distance dependencies (see Kennedy, 2002, for an overview). Second, it is also constrained by its semantics; because the comparative clause is the argument of a maximalization operator, it must introduce a set of degrees that has a maximal element. Among other things, this correctly predicts that negation (and other decreasing operators) are excluded from the comparative clause (von Stechow, 1984a; Rullmann, 1995): (9a) ??Venus is brighter than Mars isn’t. (9b) $\max\{d \mid \mathit{bright}(v) \geq d\} > \max\{d' \mid \neg[\mathit{bright}(m) \geq d']\}$
The set of degrees $d'$ such that Mars is not as bright as $d'$ includes all the degrees of brightness greater than the one that represents Mars’s brightness. Because this set has no maximal element, the maximality operator in (9b) fails to return a value. The hypothesis that the comparative clause is subject to a maximalization operation has an additional logical consequence (von Stechow, 1984a; Klein, 1991; Rullmann, 1995); for any (ordered) sets of degrees $D$ and $D'$, if $D \subseteq D'$, then $\max(D') \geq \max(D)$. The comparative clause is thus a downward-entailing context and so is correctly predicted to license negative-polarity items and conjunctive interpretations of negation (Seuren, 1973; Hoeksema, 1984; but cf. Schwarzschild and Wilkinson, 2002): (10a) The ozone layer is thinner today than it has ever been before. (10b) We observed more sunspot activity in the last 10 days than anyone has observed in years. (11a) Jupiter is larger than Saturn or Uranus. (11b) Jupiter is larger than Saturn, and Jupiter is larger than Uranus.
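The subset property and the conjunctive reading in (11) can be checked mechanically on a toy scale. The sketch below is a simplification under stated assumptions: the scale is finite, the sizes are made-up values, and the disjunction 'Saturn or Uranus' is modelled as the union of the two degree sets, which is a choice made here for illustration and not an analysis taken from the cited literature.

```python
# Toy check of the downward-entailing behaviour of the comparative clause.
# Scale and sizes are made-up illustrative values, not measurements.

SCALE = range(0, 101)
SIZE = {"jupiter": 90, "saturn": 75, "uranus": 30}


def degrees(x):
    """The set {d | large(x) >= d} on the finite toy scale."""
    return {d for d in SCALE if SIZE[x] >= d}


def more(main_set, standard_set):
    """True if the max of the main-clause set exceeds the max of the standard set."""
    return max(main_set) > max(standard_set)


saturn, uranus = degrees("saturn"), degrees("uranus")
either = saturn | uranus   # assumption: 'Saturn or Uranus' modelled as set union

# If D is a subset of D', then max(D') >= max(D).
subset_holds = uranus <= either and max(either) >= max(uranus)

larger_than_either = more(degrees("jupiter"), either)
larger_than_each = more(degrees("jupiter"), saturn) and more(degrees("jupiter"), uranus)
```

Because the maximum of the union is at least as great as the maximum of each member set, exceeding it guarantees exceeding both, which is the pattern relating (11a) to (11b).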
Finally, the assumption that the comparative is a type of quantificational expression leads to the
expectation that it should participate in scopal interactions with other logical operators. The ambiguity of (12), which has the (sensible) de re interpretation in (13a) and an (unlikely) de dicto interpretation in (13b), bears out this prediction. (12) Kim thinks Earth is larger than it is. (13a) $\max\{d \mid \mathit{think}(\mathit{large}(e) \geq d)(k)\} > \max\{d' \mid \mathit{large}(e) \geq d'\}$ (13b) $\mathit{think}(\max\{d \mid \mathit{large}(e) \geq d\} > \max\{d' \mid \mathit{large}(e) \geq d'\})(k)$
The extent to which comparatives interact with other operators and the implications of such interactions for the compositional semantics of comparatives and gradable predicates is a focus of current investigation (see Larson, 1988; Kennedy, 1999; Heim, 2000; Bhatt and Pancheva, 2004).
Comparison Cross-Linguistically
As previously noted, there are in fact several distinct semantic analyses of comparatives that differ in their details but share the core assumption that gradable adjectives map objects to ordered sets of degrees. For example, one alternative analyzes the truth conditions of a sentence such as (7) as in (14); roughly, there is a degree d such that Titan is at least as large as d but Hyperion is not as large as d (Seuren, 1973; Klein, 1980; Larson, 1988). (14) $\exists d[[\mathit{large}(t) \geq d] \wedge \neg[\mathit{large}(h) \geq d]]$
Analysis (14) does not express an explicit ordering between two degrees but instead takes advantage of the implicit ordering on the scale of the predicate to derive truth conditions equivalent to (8b) – given the inherent ordering, (14) holds whenever the maximal degree of Titan’s largeness exceeds that of Hyperion (and vice versa). The fact that the underlying semantics of gradable predicates supports multiple equivalent logical analyses of comparatives appears at first to be a frustrating obstacle to the discovery of the ‘right’ semantics of the comparative. In fact, however, this may be a positive result when we take into account the extremely varied syntactic modes of expressing comparison in the world’s languages (see Stassen, 1985), which include forms that superficially resemble the logical representation in (14), such as the example from Hixkaryána in (15).
(15) Kaw-ohra naha Waraka, kaw naha Kaywerye
tall-NOT he-is Waraka tall he-is Kaywerye
‘Kaywerye is taller than Waraka’
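The claimed equivalence between the existential analysis in (14) and the maximality analysis in (8b) can be verified by brute force on a small scale. The following sketch assumes a finite scale of six integer degrees; SCALE, max_analysis, and exists_analysis are hypothetical names introduced only for this check, and the result says nothing about dense or unbounded scales.

```python
from itertools import product

SCALE = range(0, 6)  # small hypothetical scale of degrees 0..5


def max_analysis(size_t, size_h):
    """(8b)-style: max{d | t >= d} > max{d' | h >= d'}."""
    max_t = max(d for d in SCALE if size_t >= d)
    max_h = max(d for d in SCALE if size_h >= d)
    return max_t > max_h


def exists_analysis(size_t, size_h):
    """(14)-style: there is a d such that t >= d and not (h >= d)."""
    return any(size_t >= d and not (size_h >= d) for d in SCALE)


# Compare the two analyses for every pair of sizes on the toy scale.
agree = all(
    max_analysis(t, h) == exists_analysis(t, h)
    for t, h in product(SCALE, repeat=2)
)
```

On this toy scale the two functions agree for every pair of values, so a choice between them cannot be made on truth-conditional grounds alone.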
Although it may turn out to be difficult to find clear empirical evidence to choose between competing, equivalent logical representations of comparatives within a particular language such as English, it may also turn out that a study of the various expressions of comparison in different languages will show that all the possible options provided by the underlying semantics of gradability are in fact attested. Comparatives, therefore, provide a potentially fruitful and important empirical domain for investigating broader typological questions about the mapping between (universal) semantic categories and (language-specific) syntactic ones.
See also: Antonymy and Incompatibility; Comparative Constructions; Monotonicity and Generalized Quantifiers; Negation: Semantic Aspects; Quantifiers: Semantics; Vagueness.
Bibliography
Bartsch R & Vennemann T (1973). Semantic structures: A study in the relation between syntax and semantics. Frankfurt: Athenäum Verlag. Bhatt R & Pancheva R (2004). ‘Late merger of degree clauses.’ Linguistic Inquiry 35, 1–46. Bierwisch M (1989). ‘The semantics of gradation.’ In Bierwisch M & Lang E (eds.) Dimensional adjectives. Berlin: Springer-Verlag. 71–261. Cresswell M J (1977). ‘The semantics of degree.’ In Partee B (ed.) Montague grammar. New York: Academic Press. 261–292. Heim I (2000). ‘Degree operators and scope.’ In Jackson B & Matthews T (eds.) Proceedings of semantics and linguistic theory, 10. Ithaca, NY: CLC Publications. 40–64. Hoeksema J (1984). ‘Negative polarity and the comparative.’ Natural Language & Linguistic Theory 1, 403–434. Kennedy C (1999). Projecting the adjective: The syntax and semantics of gradability and comparison. New York: Garland Press. Kennedy C (2001). ‘Polar opposition and the ontology of “degrees.”’ Linguistics and Philosophy 24, 33–70. Kennedy C (2002). ‘Comparative deletion and optimality in syntax.’ Natural Language & Linguistic Theory 20.3, 553–621. Klein E (1980). ‘A semantics for positive and comparative adjectives.’ Linguistics and Philosophy 4, 1–45. Klein E (1991). ‘Comparatives.’ In von Stechow A & Wunderlich D (eds.) Semantik: Ein internationales Handbuch der zeitgenössischen Forschung. Berlin: Walter de Gruyter. 673–691. Larson R K (1988). ‘Scope and comparatives.’ Linguistics and Philosophy 11, 1–26. Rullmann H (1995). Maximality in the semantics of wh-constructions. Ph.D. diss., University of Massachusetts: Amherst. Sapir E (1944). ‘Grading: A study in semantics.’ Philosophy of Science 11, 93–116. Schwarzschild R & Wilkinson K (2002). ‘Quantifiers in comparatives: A semantics of degree based on intervals.’ Natural Language Semantics 10, 1–41.
Seuren P A (1973). ‘The comparative.’ In Kiefer F & Ruwet N (eds.) Generative grammar in Europe. Dordrecht: Reidel. 528–564. Seuren P A (1978). ‘The structure and selection of positive and negative gradable adjectives.’ In Farkas D, Jacobsen W J & Todrys K (eds.) Papers from the parasession on the lexicon. Chicago: Chicago Linguistic Society. 336–346.
Stassen L (1985). Comparison and universal grammar. Oxford: Basil Blackwell. von Stechow A (1984a). ‘Comparing semantic theories of comparison.’ Journal of Semantics 3, 1–77. von Stechow A (1984b). ‘My reply to Cresswell’s, Hellan’s, Hoeksema’s and Seuren’s comments.’ Journal of Semantics 3, 183–199.
Complement Clauses M Noonan, University of Wisconsin–Milwaukee, Milwaukee, WI, USA © 2006 Elsevier Ltd. All rights reserved.
In approaches to linguistics within, or influenced by, the generative tradition, the term ‘complementation’ has come to refer to the syntactic situation that arises when a notional sentence or predication is an argument of a predicate. For practical purposes, a predication can be viewed as an argument of a predicate if it functions as the subject or object of that predicate. (1) That Zeke eats leeks is surprising. (2) Zelda knows that Zeke eats leeks.
In (1), the clause that Zeke eats leeks functions as a subject and is referred to as a subject complement; in (2), that Zeke eats leeks functions as an object and is referred to as an object complement. Complements are subordinate (or co-subordinate) clauses, but not all subordinate clauses are complements: relative clauses, converbals, and clauses of time, manner, purpose, and place are not considered complements because they are not arguments. Within a given language, various grammatical constructions can serve as complements. Such constructions are referred to as complement types. (3) a. That Zeke eats leeks is surprising. b. Zeke’s eating leeks is surprising. c. For Zeke to eat leeks would be surprising. (4) a. Zelda remembered that Zeke eats leeks. b. Zelda remembered Zeke’s eating leeks. c. Zelda remembered to eat leeks.
Predicates such as be surprised, know, and remember, which take complement arguments, are referred to as complement-taking predicates [or ctps]. Every language has a complement type that is identical grammatically to an independent clause; such complements are used to express direct quotes of say, (5) Zelda said ‘‘Zeke eats leeks.’’
and may be found in other contexts as well. Such complements are referred to as sentencelike
complements. The complements illustrated in (1), (2), (3a), and (4a) are also sentencelike. Sentencelike complements exhibit the same possibilities for the expression of tense, aspect, and mood; case marking of subjects and objects; and argument-verb agreement phenomena as independent clauses for any given language. In addition, languages also have one or more complement types which are reduced or desententialized: such clauses lack some features associated with main clauses. The gerunds [nominalizations] in (3b) and (4b) and the infinitives in (3c) and (4c) lack some features associated with sentence-like complements and main clauses. For example, neither can be inflected for primary tense [past or non-past], though secondary, or relative, tenses [e.g., the perfect] are possible with both.
Complement Types
Complement types are identified by the following criteria:
• Whether they are sentence-like or reduced;
• The part of speech of the predicate [or the grammatical head of the predicate complex], i.e., whether it is a verb, a noun, or an adjective;
• The sorts of grammatical relations the predicate has with its arguments, e.g., whether the agent has a subject relation to the predicate, as in the sentencelike complements above, or whether it has a genitive relation, as in (3b) and (4b);
• The external grammatical relations of the complement construction as a whole, e.g., whether the complement has a subordinate or coordinate relation to the main (or matrix) clause.
Grammatical constructions that function as complements may have other grammatical functions as well. For example, infinitives may be complements (as in [3c] and [4c]), but they may also be adverbials of purpose (Zeke came to eat leeks), relatives (The leeks to eat are Zeke’s), etc. For a construction to be considered a complement, it must meet the semantic test of functioning as an argument of a predicate.
Some complement types are regularly accompanied by a complementizer, a word or clitic that marks the construction as subordinate and a complement. The sentencelike complements illustrated above are all accompanied by the complementizer that; the infinitives are accompanied by the complementizer to. The gerunds ([3b] and [4b]) lack a complementizer: neither the derivational morpheme -ing nor the genitive -’s are properly complementizers. The reduction undergone by some complement types may be associated with changes or limitations on the grammatical relations that the complement predicate can have with its logical arguments. This most commonly affects the relation of the predicate to its subject or, in languages lacking subject relations, its agent. For example, with English infinitives the notional subject is either raised (6), equi-deleted (7), or made into an object of an adposition (8):
(6) Zelda wanted Zeke to eat leeks. (7) Zelda wanted Ø to eat leeks. (8) For Zeke to eat leeks would amaze Zelda.
Raising refers to a situation whereby an argument of a complement predicate assumes a grammatical relation in the matrix clause. In (6), Zeke is generally analyzed as the direct object of wanted: note that if Zeke is replaced by a pronoun, it is the objective case him, not the subjective case he, that is used. In (7), the notional subject of eat is Zelda, coreferential with the subject of the matrix verb wanted: the second mention of Zelda is said to be equi-deleted under identity with the matrix subject.
Classification of Complement Types
Some of the more typical features of commonly encountered complement types are described below. These characterizations are ideal types, and it is quite possible to find examples having characteristics that are intermediate between certain of the types listed below. In the discussion below, ‘predicate’ refers to the head of the verb complex.
• Categories in the verb complex may be reduced. The term ‘subjunctive’ has traditionally been used for a sentence-like complement type that is specialized for subordinate clauses, though it may have main clause use with hortative or imperative sense.
• Paratactic: The predicate is a verb. The subject is an argument shared by the paratactic complement and the clause which contains the complement-taking predicate. The range of inflectional categories is the same as for independent clauses, though it will always be identical to that of the clause containing the complement-taking predicate. Syntactically, the paratactic complement and its accompanying clause are like two coordinate clauses asyndetically juxtaposed, though forming an intonational unit like that of main-subordinate clause pairs. As a result, the paratactic complement is never accompanied by a complementizer.
• Infinitive: The predicate is a verb, but cannot form a constituent with its notional subject, nor can it agree with it if the language permits subject-verb agreement. The range of inflectional categories is reduced.
• Nominalization: The predicate is a noun. Grammatical relations between the predicate and its arguments are expressed in ways characteristic of noun-modifier relations in the language with the predicate as the head (e.g., the notional subject may be expressed as a genitive), but if there is deviation from this pattern, the subject is more likely to retain the noun-modifier mode than is the object. Since it is a noun, the predicate may be marked for nominal categories like case and number. There is often a gradation between nominalizations and infinitives; diachronically, nominalizations often evolve into infinitives.
• Participial: The predicate is an adjective. The notional subject is the head, while the rest of the predication takes the form of a modifier, a participial phrase (or, rarely, a clause identical to ordinary relative clauses) modifying the notional subject NP. Inflectional categories are reduced, and the construction may take adjectival inflections, e.g., concord class morphology, agreeing with the notional subject.
Languages vary in the number of complement types they employ, the number ranging from two (a reduced vs. a non-reduced complement type) to five or six.
Reduction or Desententialization of Complements
As noted, non-indicative complements are in various ways reduced or desententialized. This is a consequence of two distinct factors. The first is the pragmatic backgrounding of the complement predication: when the information contained within the complement is not the focus of the assertion for the entire sentence, the complement may be reduced. (9) Dale regretted that Roy fell off his horse. (10) Dale regretted Roy’s falling off his horse. (11) Dale regretted it.
In (9) the information coded by the object complement is given full expression by an indicative complement. But it is also possible to express the complement as a nominalization as in (10), in which case some information (e.g., tense) is eliminated. In (11), the reduction is such that the direct object is no longer considered a complement. The second and, for our purposes, more interesting reason for reduction is that the meaning of the complement-taking predicate may limit the semantic possibilities of the complement predication. So, for example, the things we want to occur necessarily occur after our wanting them, so that (12) is possible, but not (13): (12) I want Zuma to leave tomorrow. (13) *I want Zuma to leave yesterday.
The greater the degree to which the semantics of the complement is bound to elements of the meaning of the complement-taking predicate, the greater the degree of reduction will be possible. With the exception of agreement of the notional subject with the predicate, which varies with individual complement types, the retention of inflectional categories associated with predicates in independent clauses can be arranged as follows:
(14) 1: full range of tenses; 2: past vs. non-past (morphologically may correspond to the perfect/non-perfect distinction in the indicative)
Generally speaking, the further to the left an item is on this scale, the less likely it is to be coded on a nonindicative complement. The categories in set 4 are almost always coded on infinitive and subjunctive complements if they are coded on indicatives. Associated with reduction or desententialization is a phenomenon we can refer to as clause merger. Raising and equi-deletion, referred to earlier, are modes of clause-merger: the erasure of the grammatical ‘boundaries’ between the complement clause and the matrix clause. Degrees of clause merger are arrayed on a continuum ranging from no merger all the way to clause union. Raising is a mode of clause merger, since it removes arguments from the predications with which they are logically associated and assigns them grammatical roles in the matrix clause: (15) It’s tough for Melvin to please Melba. (16) Melba is tough for Melvin to please.
Notionally, Melba is an argument of please, but in (16) Melba is expressed as the subject of be tough. With clause union, the matrix and complement predicates share the arguments of both matrix and complement predications. We can see an example of this from French:
(17) Roger laissera Marie manger les pommes.
Roger let-3sg-fut Marie eat-inf the apples
‘Roger will let Marie eat the apples.’
(18) Roger laissera manger les pommes à Marie.
Roger let-3sg-fut eat-inf the apples to Marie
‘Roger will let Marie eat the apples.’
In (17), laissera has as its direct object Marie, and manger has as its direct object les pommes. In (18), however, the merged predicate laissera manger has a direct object les pommes and an indirect object à Marie: the clauses have merged and the arguments are shared by the merged predicate.
Choice of Complement Type
Not only the possibility of reduction, but also the choice of a particular complement type is determined by the meaning of the complement-taking predicate. For example, in English, nominalizations [gerunds] are used to express complement predicates taken as facts, whereas infinitives are used to express complement predications treated as potential, projected events. The complement-taking predicate remember is compatible with both, since one can remember both a fact and a projected event:
(15) Gus remembered paying the bill. [nominalization/gerund]
(16) Gus remembered to pay the bill. [infinitive]
Want, however, is compatible only with projected events; therefore, want is compatible with the infinitive but not with the nominalization:
(17) *Gus wants paying the bill. [nominalization/gerund]
(18) Gus wants to pay the bill. [infinitive]
The meanings and uses of a given complement type will vary with each language. Few grammatical principles, if any, are specific to complementation, and though complementation can be given a workable definition, the definition is semantic, not grammatical. For example, all the grammatical constructions described as complement types have uses outside the realm of complementation proper, so their properties cannot be characterized solely by reference to complementation.
Complex Predicates 697
Complementation can be viewed as one mode of clause-combining. See also: Constituent Structure; X-Bar Theory.
Bibliography
Horie K (2001). ‘Complement clauses.’ In Haspelmath M et al. (eds.) Language typology and language universals. Berlin & New York: Walter de Gruyter. Noonan M (2005). ‘Complementation.’ In Shopen T (ed.) Language typology and syntactic description. Cambridge: Cambridge University Press.
Dixon R M W (1995). ‘Complement clauses and complement strategies.’ In Palmer F R (ed.) Meaning and grammar. Cambridge: Cambridge University Press.
Complex Predicates
St Müller, Universität Bremen, Bremen, Germany
© 2006 Elsevier Ltd. All rights reserved.
Complex predicates usually are defined as predicates that are multiheaded; they are composed of more than one grammatical element (either morphemes or words), each of which contributes part of the information ordinarily associated with a head. In the following discussions, several phenomena that are explained by complex predicate analyses are presented. Several analyses of these phenomena are then suggested in various frameworks.
Phenomena
In dealing with language from a cross-linguistic perspective, it becomes apparent that languages differ in the ways that they express properties such as tense, aspect, and agreement. These properties can be expressed either synthetically or analytically. As an example, consider the French and the German sentences in Examples (1a) and (1b). French expresses the future tense synthetically, whereas German uses a combination of the infinitive of a main verb and an inflected form of the auxiliary werden:
(1a) Je le verrai.
     I him will.see
     ‘I will see him.’
(1b) weil ich ihn sehen werde
     because I him see will
     ‘because I will see him’
Such periphrastic constructions are often analyzed as complex predicates, i.e., it is assumed that the auxiliary forms a complex with the embedded verb that has a status similar to a verb combined with the future morpheme in other languages. In addition to periphrastic constructions, certain verbal complexes, particle verbs, and combinations of a resultative secondary predicate and a verbal
element are treated as complex predicates. The evidence for assuming a closer connection between two heads is discussed in the following subsections. German examples are used for the illustration, but some pointers to literature regarding similar cases in other languages are given.
Topological Properties
German is a subject-object-verb language, and particle verbs, complex-forming verbs, and resultative constructions form a topological unit at the right periphery of the clause. In the descriptive literature, the part in which the respective elements are located is called the ‘right sentence bracket’ (see Bech (1955) for a brilliant description and analysis of verbal constructions in German). Abbreviations in the following examples: NOM, nominative; ACC, accusative; PART, particle.
(2a) weil jemand ihn anlacht
     because somebody.NOM him.ACC PART(to).laughs
     ‘because somebody smiles at him’
(2b) weil jemand ihn zu reparieren versucht
     because somebody.NOM him.ACC to repair tries
     ‘because somebody tries to repair it’
(2c) weil jemand ihn klug findet
     because somebody.NOM him.ACC smart finds
     ‘because somebody finds him smart’
(2d) weil jemand den Teich leer fischt
     because somebody.NOM the pond.ACC empty fishes
     ‘because somebody fishes the pond empty’
The accusatives in Examples (2a)–(2d) are dependents of the particle an (‘toward’), the infinitive zu reparieren (‘to repair’), and the resultative predicate
leer (‘empty’), respectively; lachen (‘laugh’) is an intransitive verb, as evidenced by Examples (3a) and (3b):
(3a) Er lacht.
     he laughs
(3b) *Er lacht sie.
     he laughs her
The additional argument in Example (2a) is licensed by the particle (Stiebels and Wunderlich, 1994; Stiebels, 1996). The finite verb + particle/infinitive/resultative predicate forms a topological unit in Examples (2a)–(2d), but this is not necessarily the case, since the finite verb can be serialized in clause-initial position in languages such as German and Dutch. Similarly, it is possible to front the embedded infinitive and the resultative predicate in verb-second (V2) sentences. Particle fronting is possible under certain circumstances (Müller, 2002b), thus the constructions in Examples (2a)–(2d) should be analyzed in syntax. That predicates form a topological unit in some variant of a clause that could be assumed to be basic is not a necessary condition for predicate complex formation. Butt (1997) discussed constructions in Urdu that she analyzed as complex predicates and which nevertheless were discontinuous.
Constituent Order
German is a language with relatively free constituent order. Arguments of a single head can be reordered with respect to each other in the so-called Mittelfeld (the area between the complementizer and the finite verb in verb-last sentences, but the area between the finite verb and other verbs or verb particles in verb-initial sentences). The sentences in Examples (4a)–(4d) show that the arguments that are introduced by different heads in Examples (2a)–(2d) may be reordered:
(4a) weil ihn jemand anlacht
     because him.ACC somebody.NOM PART(to).laughs
     ‘because somebody smiles at him’
(4b) weil ihn jemand zu reparieren versucht
     because him.ACC somebody.NOM to repair tries
     ‘because somebody tries to repair it’
(4c) weil ihn jemand klug findet
     because him.ACC somebody.NOM smart finds
     ‘because somebody finds him smart’
(4d) weil den Teich jemand leer fischt
     because the pond.ACC somebody.NOM empty fishes
     ‘because somebody fishes the pond empty’
The important thing to notice about these examples is that the heads and the accusative elements, which are arguments of the respective heads, appear discontinuously. If it is assumed that anlacht (‘smiles at’), zu reparieren versucht (‘tries to repair’), and leer fischt (‘fishes empty’) form a complex head that requires all arguments of the matrix and the embedded element, the data in Examples (4a)–(4d) are explained automatically: since arguments of simplex heads can be reordered in German, it would follow automatically that the nominative and the accusative arguments of the complex heads could be reordered in similar sentences.
Remote Passive
Examples (5a)–(5d) show that the argument of the embedded predicate can be realized as the subject in passive constructions (see Höhle (1978: 175–176) on the remote passive in verbal complexes; corpus examples are provided in Müller (2002a: chap. 3.1.4.1)):
(5a) weil er angelacht wurde
     because he.NOM PART(to).laughed was
     ‘because he was smiled at’
(5b) weil er zu reparieren versucht wurde
     because he.NOM to repair tried was
     ‘because somebody tried to repair it’
(5c) weil er klug gefunden wurde
     because he smart found was
     ‘because he was found smart’
(5d) weil der Teich leer gefischt wurde
     because the pond.NOM empty fished was
     ‘because the pond was fished empty’
Again, such data can be explained by assuming that the particle verb combination, the combination of infinitive and embedding verb, and the combination of verb and resultative predicate act like a simplex head. The subject of the respective complexes is suppressed and the accusative object is realized as subject. (See also Rizzi (1982) and Monachesi (1998) for long passives in Italian, Manning (1992) for passives of verbal complexes in Romance languages, and Grimshaw and Mester (1985) for passives in Inuit Eskimo.)
Other Phenomena
Due to space limitations, not all phenomena related to complex predicate formation can be discussed
here. Briefly, however, Example (6) shows a verbal complex construction that has two readings:
(6) daß Karl den Roman nicht zu lieben scheint
    that Karl.NOM the novel.ACC not to love seems
    ‘that Karl does not seem to love the novel’
    ‘that Karl seems not to love the novel’
The negation can scope over the zu infinitive or over the matrix verb, although it is placed between parts of what would normally be analyzed as an infinitival verb phrase, i.e., between den Roman and zu lieben. If zu lieben and scheint form a complex, nicht may attach to it before combination of arguments, and the wide scope reading can be explained. Furthermore, binding-theoretic effects may be observed: reflexives that are arguments of the embedded predicate can be bound by the subject (or by another argument) of the matrix verb. Apart from the phenomena that were demonstrated using German examples, there is a phenomenon called ‘clitic climbing’ in Romance languages. Usually a clitic is attached to a verb that it depends on, but with certain auxiliary verbs and causative verbs it is possible that a clitic that corresponds to an embedded verb attaches to the matrix verb. Again, such clitic constructions can be analyzed as involving complex predicate formation. The matrix verb selects both its own arguments and the arguments of the embedded verb. Since the arguments of the embedded verb are treated as arguments of the matrix predicate, it can be explained why they can be realized as a clitic to the matrix predicate (Monachesi, 1998; Abeillé et al., 1997).
Analyses
There have been various analytical approaches to the phenomena of complex predicates. The analyses can be ordered into two groups. One approach assumes that two predicates form a syntactic (or morphological) constituent and the other approach assumes that two heads project as they would do normally as simplex heads. In the latter approach, the complex predicate effects are explained by restructuring or by movements of heads that result in monoclausal structures. The latter approach is discussed first here.
Verb Phrase Embedding and Small Clauses and Incorporation
One way to analyze the phenomena of complex predicates is to assume that verbal heads uniformly embed maximal projections of a certain type. In the case of complex-forming control verbs and/or
raising verbs, it is assumed that the embedded constituent is a complementizer phrase (CP), inflection phrase (IP), or verb phrase (VP) (for particle verbs and for resultative constructions, small clause analyses have been applied; see, for instance, Hoekstra (1988) and den Dikken (1995), and references therein). Structures with monoclausal properties are explained by restructuring, reanalysis, or incorporation. An initial structure that contains the full CP, IP, VP, or small clause is mapped to another structure with different properties, accounting for the fact that a subject of an embedded predicate behaves like an object, or that arguments of embedded heads may scramble with respect to arguments of higher heads (Evers, 1975; Grewendorf, 1994; Grewendorf and Sabel, 1994; Wurmbrand, 2001). For instance, verbal particles are said to incorporate into their matrix verb (see Baker (1988) for a detailed discussion of incorporation). Such accounts are attractive since they can assume that there is just one underlying structure for a certain thematic relation. All other configurations are derived from this configuration by movement. Baker (1988) formulated this as the ‘uniformity of theta-assignment hypothesis’ (UTAH): “Identical thematic relationships between items are represented by identical structural relationships between those items at the level of D-structure.” (See also den Dikken (1995) for other formulations of the UTAH and further discussion.) Usually, so-called small clauses, i.e., verbless predication structures, are assumed for particle verbs (den Dikken, 1995), for consider predication, and for resultative constructions (Hoekstra, 1988). For instance, Example (2c) would be analyzed in the following way:
(7) weil jemand [sc ihn klug] findet
    because somebody.NOM him.ACC smart finds
    ‘because somebody finds him smart’
The matrix verb finden selects a small clause (SC) that contains the adjective klug and the subject over which klug predicates. Small clause analyses have been widely criticized (Bresnan, 1982: sect. 9.6; Williams, 1983; Booij, 1990: 56; Hoeksema, 1991; Neeleman and Weerman, 1993; Neeleman, 1995; Pollard and Sag, 1994: chap. 3.2; Stiebels, 1996: chap. 10.2.3; Winkler, 1997: chap. 2.1). One problematic aspect was discussed by Demske-Neumann (1994: 63) (see also Fanselow (1991: 70) for discussion of German, and Hoekstra (1987: 232) for a discussion of Dutch). Noun phrases, adjectives, and prepositional phrases (PPs) can be used predicatively in copula constructions (Examples (8a)–(8c)), but not all of these predicative constructions can
be used in all small clause environments (Examples (9a)–(9c) and (10a)–(10c)). Therefore, the category of the predicative element has to be available for selection by the governing verb, i.e., machen (‘to make’) or nennen (‘to call’), respectively.
(8a) Herr K. ist kein Verbrecher.
     Mr. K. is not.a criminal.
(8b) Herr K. ist unschuldig.
     Mr. K. is innocent.
(8c) Herr K. ist in Berlin.
     Mr. K. is in Berlin.
(9a) *Der Richter macht Herrn K. einen Verbrecher.
     the judge makes Mr. K. a criminal.
(9b) Das Gericht macht Herrn K. müde.
     the court makes Mr. K. tired.
(9c) Der Richter macht Herrn K. zum Verbrecher.
     the judge makes Mr. K. to.the criminal.
(10a) Herr K. nennt den Richter einen Idioten.
      Mr. K. calls the judge an idiot.
(10b) Herr K. nennt den Richter voreingenommen.
      Mr. K. calls the judge biased.
(10c) *Herr K. nennt den Richter als/zum Idioten.
      Mr. K. calls the judge as/to.the idiot.
Demske concluded that the elements that are predicated over have to be treated as specifiers of noun, adjective, and preposition projections in a small clause analysis. However, this is incompatible with X-bar-theoretic assumptions. In particular, the relation between den Richter and einen Idioten is unclear (see Hoekstra (1987: 296–297) on this point). The specifier of Idioten is einen, so there is no slot for another specifier (see also Pollard and Sag (1994: chap. 3.2) for English examples that are parallel to Examples (9a)–(9c) and (10a)–(10c)). One way out of this is to introduce an additional projection on top of the NP, but then the category features of the predicative phrase inside the small clause have to be made available for selection by heads governing the small clause (den Dikken, 1995: 26). There have been many proposals for dealing with the mapping from bisentential to monosentential structures. These include Baker’s incorporation (1988), which may take place overtly or nonovertly, and the approach by Haegeman and van Riemsdijk (1986), which assumed simultaneous representations: it is not that one underlying structure is mapped to another one, but rather that several analyses together (so-called co-analyses) constitute the analysis of a sentence. Frameworks that use multiple strata to represent grammatical information can account for the monoclausal status on one or several levels. For instance, Butt (1997), who worked in the framework of lexical functional grammar, suggested a complex predicate analysis for Urdu, in which the complex predicate is not formed in the constituent structure but rather in the functional structure. See also Rosen (1997) for a multistratal analysis in the framework of relational grammar.
Complex Predicates
The alternative to an analysis that assumes that maximal projections are embedded and that these structures are reanalyzed, have co-analyses, or similar things, is to assume that the two predicates form a close unit at some level of representation, right from the start. Such analyses have been suggested across frameworks in transformational grammar, government and binding, categorial grammar, lexical-functional grammar, and head-driven phrase structure grammar. The question is how the selectional properties of the heads that take part in complex formation are described. One option is to assume that fischen (‘to fish’) is an intransitive verb in Example (2d) and that the subject of leer (‘empty’) becomes the object of the complete predicate complex leer fischen. Such approaches were suggested, for instance, by Chomsky (1985: 100–101), for English particle verbs and consider + predicate constructions; by Dowty (1979: chap. 4.7), for English resultatives; and by Neeleman and Weerman (1993) and Neeleman (1995), for English and Dutch resultative constructions. Alternatively, the fact that there will be additional arguments could be encoded in the lexical entry of fischen. Such approaches have been suggested for resultative constructions and for all of the other phenomena discussed here. Argument attraction approaches for verbal complexes were suggested by Geach (1970) in the framework of categorial grammar; by Karttunen (1986), for Finnish, in the framework of categorial unification grammar; by Haider (1986) and Bierwisch (1990), for German in the government and binding (GB) framework; and in the framework of head-driven phrase structure grammar (HPSG), by Hinrichs and Nakazawa (1989, 1994), Kiss (1995), Ackerman and Webelhuth (1998), Müller (1999, 2002a), and Meurers (2000) for German, and by van Noord and Bouma (1994, 1997) and Rentier (1994) for Dutch.
Przepiórkowski and Kupść (1997) suggested a complex predicate analysis for Polish, Monachesi (1998) used argument attraction to account for restructuring verbs in Italian, Abeillé et al. (1997) dealt with complex predicate formation in French, and Manning et al. (1999) suggested a complex predicate analysis of Japanese causatives. Verspoor (1997), Wechsler (1997), Wechsler and Noh (2001), and Müller (2002a) suggested HPSG analyses for resultative constructions in English, Korean, and German. Winkler (1997: chap. 6.2.2) proposed a corresponding analysis
for resultative constructions in the government and binding (GB) framework. What follows demonstrates how so-called argument attraction approaches work and sketches the analysis of the phenomena discussed previously. In frameworks such as categorial grammar or head-driven phrase structure grammar, functors are specified together with descriptions of the syntactic properties of their dependents. These descriptions are cancelled during syntactic combination. In the case of HPSG, the arguments are specified in a list. (This is a simplification; contemporary approaches assume two lists, one for the subject and one for the remaining arguments. For languages such as German, it is assumed that the subject of finite verbs is treated like the other arguments, because it can be permuted with them.) Identity of elements is indicated by identical numbers in boxes (see Examples (12a)–(12b)). Hinrichs and Nakazawa (1994) developed an argument attraction approach for auxiliary verbs and modals.
(11) weil er ihn reparieren will
     because he him repair wants
     ‘because he wants to repair it’
In this analysis, reparieren (‘to repair’) and will (‘to want’) form a close unit that functions as the head of the whole clause. The syntactic information contained in the valence specifications of the respective verbs is given in Example (12):
(12a) reparieren: SUBCAT ⟨NP[str], NP[str]⟩
(12b) will: SUBCAT [1] ⊕ ⟨V[SUBCAT [1]]⟩
(12c) reparieren will: SUBCAT ⟨NP[str], NP[str]⟩
NP[str] represents a noun phrase with structural case. Case is assigned according to the following principle: The first argument in a SUBCAT list with structural case is realized as nominative unless it is raised to a higher head (Meurers, 1999b). All other NPs with structural case are realized as accusative. The specification for will shows how argument attraction works: will selects a verb and attracts all elements of the SUBCAT list of the embedded verb. The identity of the attracted elements and the arguments of the embedded verb is indicated by the [1]. Since the arguments of reparieren will are not raised by a higher predicate, the first one is assigned nominative case and the second one is assigned accusative case. This kind of analysis was extended to infinitival constructions involving zu infinitives, such as the one in Example (2b), by Kiss (1995). As Kathol (1998) noted, remote passive cases as shown in Example (5b) fall out automatically: If versuchen is analyzed as an argument attraction verb, the accusative object of reparieren is simultaneously an object of the embedded verb zu reparieren and of the complex head zu reparieren versucht:
(13a) reparieren: SUBCAT ⟨NP[str]_i, NP[str]_j⟩
(13b) versucht: SUBCAT ⟨NP[str]_k⟩ ⊕ [1] ⊕ ⟨V[SUBCAT ⟨NP[str]_k⟩ ⊕ [1]]⟩
(13c) zu reparieren versucht (finite): SUBCAT ⟨NP[str]_k, NP[str]_j⟩
(13d) zu reparieren versucht wurde (passive): SUBCAT ⟨NP[str]_j⟩
Here versuchen is a subject control verb, therefore the referential index of the subject (k) is identified with the referential index of the subject of the embedded predicate in Example (13b). The nonsubject arguments of the embedded verb ([1]) are attracted by the matrix verb. Therefore, the object of the embedded verb is simultaneously the object of the matrix verb. Because both the downstairs object and the upstairs subject are dependents of the same (complex) head, the possibility of reordering is expected, since this phenomenon also occurs with simplex heads in German. If the matrix verb is passivized as in Example (13d), the subject (NP[str]_k) is suppressed and the second argument becomes the first one in the SUBCAT list. Since it is the first argument in this list, it is realized as nominative, and the remote passive example in Example (5b) is explained. Examples (2c), (4c), and (5c) and similar constructs can be explained similarly: verbs such as finden embed an adjective and attract the subject of this adjective. As Manning (1992) pointed out, the passive examples seem to be problematic for theories that assume that verbal complex formation is a syntactic process, since passive is treated as a lexical process in many frameworks (for instance, lexical-functional grammar and HPSG). If argument composition happens at the point where the actual combination takes place, lexical processes cannot access arguments that are selected by other predicates. The argument composition approach that was sketched previously does not have the problems mentioned by Manning. The reason is that the argument composition is done in the lexicon, albeit in an underspecified way. The attracting head does not specify the exact form of the elements that are attracted.
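The valence specifications in Examples (12) and (13) can be mimicked in a few lines of Python. The following is only a minimal sketch under simplifying assumptions, not an implementation of any HPSG grammar: SUBCAT lists are plain Python lists, coindexation under control is reduced to leaving the embedded subject unattracted, and all names (`NP`, `Verb`, `AttractingVerb`, `combine`, `passivize`, `assign_case`) are hypothetical. Passive applies to the lexical entry of the attracting verb before it combines with the embedded verb, in line with the point that composition is specified in the lexicon in an underspecified way.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class NP:
    """An NP[str] argument; `index` stands in for the referential index (i, j, k)."""
    index: str


@dataclass
class Verb:
    """A verb or verbal complex with a fully specified SUBCAT list."""
    form: str
    subcat: List[NP]


@dataclass
class AttractingVerb:
    """Lexical entry of an argument-attracting verb (illustrative only).

    own_args: arguments the verb contributes itself (e.g. the controller).
    skip:     number of leading embedded arguments that are NOT attracted
              (0 for `will` in (12b), 1 for subject control in (13b)).
    """
    form: str
    own_args: List[NP]
    skip: int


def passivize(verb: AttractingVerb, form: str) -> AttractingVerb:
    """Lexical passive: suppress the verb's own subject; attraction stays underspecified."""
    return AttractingVerb(form, verb.own_args[1:], verb.skip)


def combine(embedded: Verb, matrix: AttractingVerb) -> Verb:
    """Form the complex head: the matrix verb's own arguments, then the attracted ones."""
    attracted = embedded.subcat[matrix.skip:]
    return Verb(f"{embedded.form} {matrix.form}", matrix.own_args + attracted)


def assign_case(verb: Verb) -> List[Tuple[str, str]]:
    """First structural argument is nominative, all others accusative
    (assuming no higher head raises the arguments)."""
    return [(np.index, "nom" if pos == 0 else "acc")
            for pos, np in enumerate(verb.subcat)]


reparieren = Verb("reparieren", [NP("i"), NP("j")])       # (12a)/(13a)
will = AttractingVerb("will", [], skip=0)                 # (12b)
versucht = AttractingVerb("versucht", [NP("k")], skip=1)  # (13b)

zu_reparieren = Verb("zu reparieren", reparieren.subcat)
active = combine(zu_reparieren, versucht)                               # (13c)
passive = combine(zu_reparieren, passivize(versucht, "versucht wurde"))  # (13d)

print(assign_case(combine(reparieren, will)))  # both arguments of reparieren
print(assign_case(active))                     # controller plus attracted object
print(assign_case(passive))                    # attracted object alone
```

In this sketch the passive entry has an empty list of own arguments, so the attracted object ends up first in the SUBCAT list of the complex head and receives nominative by the case principle, which is the remote passive pattern.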
If lexical processes are applied to the higher verb, these lexical processes can impose requirements on the raised arguments and make the list more specific (see, for instance, Müller (2003) for adjectival derivation with -bar ‘-able’). Bobaljik and Wurmbrand (2004) (and Susanne Wurmbrand, in an unpublished manuscript) argued that modification data and fronting data show that a complex predicate analysis for verbal complexes is
not adequate. Wurmbrand discussed the sentence in Example (14):
(14) Sie haben den Fisch eine Woche lang in zwei Minuten zu fangen versucht.
     they have the fish one week long in two minutes to catch tried
     ‘They tried for a week to catch the fish in two minutes.’
This example shows that both verbs must be available for modification, i.e., a fusion of the two events is not tenable. This sentence is not problematic for complex predicate approaches if it is assumed that adverbials can attach to the verb directly. The adjunct does not change the projection level and therefore in zwei Minuten zu fangen has the same status as zu fangen. There are also examples in which the adjunct is not adjacent to the verb. To analyze these examples, discontinuous head-adjunct structures could be assumed (Müller, 1999: chap. 17.6), as could an analysis that introduces adjuncts lexically. This was suggested by van Noord and Bouma (1994) for Dutch: a lexical rule introduces an adjunct into the valence list of a head. Argument composition works as previously outlined. If adjuncts are combined with the complex head, they scope over the verb as whose dependent they were introduced (see also Manning et al. (1999) for an analysis of Japanese causatives that assumes a lexical introduction of adjuncts). The third possibility is to assume that the event variables of the verbs involved in complex formation are available at the predicate complex and that adverbials attach to verbal complexes and pick one of the available event variables, as was suggested by Crysmann (2004). Wurmbrand also argued against the complex predicate analysis on the basis of fronting as in Examples (15a) and (15b):
(15a) Reparieren wird er den Wagen müssen.
      repair will he the car must
(15b) Den Wagen wird er reparieren müssen.
      the car will he repair must
Wurmbrand pointed out that reparieren and müssen are not adjacent and that the verb can be fronted without its object. That the verbs are not adjacent is not a problem if there is some device that mediates between the fronted constituent and the place where argument composition is assumed to take place. In the GB framework, movement is usually assumed for such dislocations; in HPSG, this phenomenon is handled by percolation of feature bundles. Example (15a) has the structure indicated in Example (16):
(16) Reparieren_i wird er den Wagen [_i müssen].
The _i is a trace that corresponds to the fronted reparieren, i.e., it has the same syntactic and semantic properties. The argument composition of the arguments of _i and müssen works exactly parallel to the composition of arguments of reparieren and müssen (see also Haider (1990) for a parallel treatment in the GB framework). Wurmbrand argued that Example (15a) is evidence for the XP status of reparieren, since reparieren is fronted and only maximal projections can be fronted, but this is a theory-internal assumption that is not universally shared. Since X-bar theory does not restrict the set of possible grammars if empty elements are allowed (Kornai and Pullum, 1990), there is no reason to stick to X-bar-theoretic assumptions. Analyses of partial verb phrase fronting that allow projections of different projection levels to be fronted were developed by Haider (1990) in the GB framework and by Müller (1999, 2002a) and Meurers (1999a) (see also Bierwisch (1990) for remarks on the necessity to admit phrasal and lexical material in front of the finite verb). The same argument attraction technique that is used for verbal complexes can be used to account for particle verbs: for the particle an (‘toward’), the valence list contains one argument with structural case:
(17) an: SUBCAT ⟨NP[str]⟩
The verb lachen has one argument, which also has structural case:
(18) lach-: SUBCAT ⟨NP[str]⟩
Müller (2002a: 344) suggested a lexical rule licensing an additional lexical item for lach- that is subcategorized for a particle in addition to the normal arguments of lach-. The result of the rule application is a lexical item with the following subcategorization list:
(19) lach-: ⟨NP[str]⟩ ⊕ [1] ⊕ ⟨PART[SUBCAT [1]]⟩
When lacht and an are combined, the resulting complex head selects both the subject of the intransitive base verb lachen and the argument of the particle: (20) anlacht: h NP[str], NP[str] i
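The list composition in (17)–(20), together with the passive case discussed in the text, can be sketched in a few lines of Python. This is only an illustrative sketch, not Müller's formalization: the names `Arg`, `attract`, and `passivize` are hypothetical, SUBCAT lists are modeled as plain Python lists, and passivization is reduced to dropping the first (subject) element.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Arg:
    """One valence-list element, e.g. NP[str] (hypothetical encoding)."""
    cat: str   # syntactic category, e.g. "NP"
    case: str  # "str" stands for structural case


def attract(verb_subcat: List[Arg], particle_subcat: List[Arg]) -> List[Arg]:
    """SUBCAT of the complex head: the verb's own arguments followed by
    the arguments attracted from the particle."""
    return list(verb_subcat) + list(particle_subcat)


def passivize(verb_subcat: List[Arg]) -> List[Arg]:
    """Simplified passive: suppress the first (subject) argument."""
    return list(verb_subcat[1:])


lach = [Arg("NP", "str")]   # (18)
an = [Arg("NP", "str")]     # (17)

anlacht = attract(lach, an)               # two NP[str] arguments, as in (20)
angelacht = attract(passivize(lach), an)  # only the particle's NP[str] remains
```

In the passive case the particle's argument ends up first on the list, which is the position the text associates with nominative realization.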
Since both noun phrases depend on the same head, scrambling of these noun phrases, as in Example (4a), is expected. If lach- is passivized, the subject of lach- is suppressed and whatever is contributed by the particle (the particle's own SUBCAT list) will occupy the first position in the SUBCAT list. If the passivized form of lach- is combined with the particle an, the first element of the SUBCAT list of angelacht will be the NP[str] contributed by an. This element is realized as nominative. The sentence in Example (5a) is thus accounted for. Verspoor (1997), Wechsler (1997), Wechsler and Noh (2001), and Müller (2002a) suggested a lexical rule for resultative constructions in English, Korean, and German. The lexical rule licenses additional lexical items that select for a resultative predicate. The subject of the resultative predicate is attracted from the embedded predicate. The matrix verb and the resultative predicate form a complex head, and therefore the subject of the resultative predicate can be permuted with the subject of the matrix verb, and the subject of the embedded predicate can be realized as the subject of the matrix predicate if the matrix predicate is passivized.
See also: Argument Structure; Binding Theory; Control and Raising; Head-Driven Phrase Structure Grammar; Lexical Functional Grammar; Long-Distance Dependencies; Periphrasis; Predication; Principles and Parameters Framework of Generative Grammar; Word Order and Linearization; X-Bar Theory.
Bibliography
Abeillé A, Godard D, Miller P H & Sag I A (1997). ‘French bounded dependencies.’ In Balari S & Dini L (eds.) Romance in HPSG, CSLI lecture notes, no. 75. Stanford: CSLI Publications. 1–54.
Ackerman F & Webelhuth G (1998). A theory of predicates. CSLI lecture notes, no. 76. Stanford: CSLI Publications.
Alsina A, Bresnan J & Sells P (eds.) (1997). Complex predicates. CSLI lecture notes, no. 64. Stanford: CSLI Publications.
Baker M C (1988). Incorporation. A theory of grammatical function change. Chicago, London: University of Chicago Press.
Bech G (1955). Studien über das deutsche Verbum infinitum. Linguistische Arbeiten, no. 139. (2nd edn.) (1983). Tübingen: Max Niemeyer Verlag.
Bierwisch M (1990). ‘Verb cluster formation as a morphological process.’ In Booij & van Marle (eds.). 173–199.
Blight R C & Moosally M J (eds.) (1997). Texas linguistic forum 38: the syntax and semantics of predication. Proceedings of the 1997 Texas Linguistics Society conference. Austin, TX: University of Texas Department of Linguistics.
Bobaljik J & Wurmbrand S (2004). ‘Anti-reconstruction effects are anti-reconstruction effects.’ In Burelle S & Somesfalean S (eds.) Proceedings of the 2003 annual meeting of the Canadian Linguistic Association (CLA). 13–24.
Booij G E (1990). ‘The boundary between morphology and syntax: separable complex verbs in Dutch.’ In Booij & van Marle (eds.). 45–63.
Booij G E & van Marle J (eds.) (1990). Yearbook of morphology (vol. 3). Dordrecht, Providence, RI: Foris Publications.
Bresnan J (1982). ‘Control and complementation.’ Linguistic Inquiry 13(3), 343–434.
Butt M (1997). ‘Complex predicates in Urdu.’ In Alsina et al. (eds.) 107–149.
Chomsky N (1985). The logical structure of linguistic theory. Chicago/London: University of Chicago Press.
COLING Staff (eds.) (1994). Proceedings of Conference on Computational Linguistics (COLING) 94, Kyoto, Japan. Cambridge: Computational Linguistics ACL – Association of MIT Press.
Crysmann B (2004). ‘Underspecification of intersective modifier attachment: some arguments from German.’ In Müller S (ed.) Proceedings of the HPSG-2004 conference, Center for Computational Linguistics, Katholieke Universiteit Leuven. Stanford: CSLI Publications.
Demske-Neumann U (1994). Modales passiv und tough movement. Zur strukturellen Kausalität eines syntaktischen Wandels im Deutschen und Englischen. Linguistische Arbeiten, no. 326. Tübingen: Max Niemeyer Verlag.
den Dikken M (1995). Particles. On the syntax of verb-particle, triadic, and causative constructions. New York, Oxford: Oxford University Press.
Dowty D R (1979). Word meaning and Montague grammar. Synthese language library, no. 7. Dordrecht, Boston, London: D. Reidel Publishing Company.
Evers A (1975). ‘The transformational cycle in Dutch and German.’ Ph.D. thesis, University of Utrecht.
Fanselow G (1991). ‘Minimale syntax.’ Groninger Arbeiten zur Germanistischen Linguistik 32.
Geach P T (1970). ‘A program for syntax.’ Synthese 22, 3–17.
Grewendorf G (1994). ‘Kohärente infinitive und inkorporation.’ In Steube A & Zybatow G (eds.) Zur Satzwertigkeit von Infinitiven und Small Clauses, Linguistische Arbeiten, no. 315. Tübingen: Max Niemeyer Verlag. 31–50.
Grewendorf G & Sabel J (1994). ‘Long scrambling and incorporation.’ Linguistic Inquiry 25(2), 263–308.
Grimshaw J & Mester R-A (1985). ‘Complex verb formation in Eskimo.’ Natural Language and Linguistic Theory 3, 1–19.
Haegeman L & Riemsdijk H van (1986). ‘Verb projection raising, scope, and the typology of rules affecting verbs.’ Linguistic Inquiry 17(3), 417–466.
Haider H (1986). ‘Fehlende Argumente: vom Passiv zu kohärenten Infinitiven.’ Linguistische Berichte 101, 3–33.
Haider H (1990). ‘Topicalization and other puzzles of German syntax.’ In Grewendorf G & Sternefeld W (eds.) Scrambling and Barriers. Amsterdam, Philadelphia: John Benjamins Publishing Company. 93–112.
Hinrichs E W, Kathol A & Nakazawa T (eds.) (1998). Complex predicates in nonderivational syntax; vol. 30, syntax and semantics. San Diego: Academic Press.
Hinrichs E W & Nakazawa T (1989). ‘Subcategorization and VP structure in German.’ In Aspects of German VP structure, SfS-Report-01-93. Tübingen: Eberhard-Karls-Universität.
Hinrichs E W & Nakazawa T (1994). ‘Linearizing AUXs in German verbal complexes.’ In Nerbonne et al. (eds.) 11–38.
Hoeksema J (1991). ‘Complex predicates and liberation in Dutch and English.’ Linguistics and Philosophy 14(6), 661–710.
Hoekstra T (1987). Transitivity. Grammatical relations in government-binding theory. Dordrecht, Cinnaminson, NJ: Foris Publications.
Hoekstra T (1988). ‘Small clause results.’ Lingua 74, 101–139.
Höhle T N (1978). Lexikalische Syntax: Die Aktiv-Passiv-Relation und andere Infinitkonstruktionen im Deutschen. Linguistische Arbeiten, no. 67. Tübingen: Max Niemeyer Verlag.
Karttunen L (1986). Radical lexicalism. Report no. CSLI-86-68. Stanford: CSLI Publications.
Kathol A (1998). ‘Constituency and linearization of verbal complexes.’ In Hinrichs et al. (eds.). 221–270.
Kiss T (1995). Infinite Komplementation. Neue Studien zum deutschen Verbum infinitum. Linguistische Arbeiten, no. 333. Tübingen: Max Niemeyer Verlag.
Kornai A & Pullum G K (1990). ‘The X-bar theory of phrase structure.’ Language 66(1), 24–50.
Manning C D (1992). Romance is so complex. Technical report CSLI-92-168. Stanford: CSLI Publications.
Manning C D, Sag I A & Iida M (1999). ‘The lexical integrity of Japanese causatives.’ In Levine R D & Green M (eds.) Studies in contemporary phrase structure grammar. Cambridge: Cambridge University Press. 39–79.
Meurers D (1999a). ‘German partial-VP fronting revisited – back to basics.’ In Webelhuth G, Koenig J P & Kathol A (eds.) Lexical and constructional aspects of linguistic explanation. Stanford: CSLI Publications. 129–144.
Meurers D (1999b). ‘Raising spirits (and assigning them case).’ Groninger Arbeiten zur Germanistischen Linguistik (GAGL) 43, 173–226.
Meurers D (2000). Lexical generalizations in the syntax of German non-finite constructions. Arbeitspapiere des SFB 340, no. 145. Tübingen: Eberhard-Karls-Universität.
Monachesi P (1998). ‘Italian restructuring verbs: a lexical analysis.’ In Hinrichs et al. (eds.). 313–368.
Müller S (1999). Deutsche Syntax deklarativ. Head-driven phrase structure grammar für das Deutsche. Linguistische Arbeiten, no. 394. Tübingen: Max Niemeyer Verlag.
Müller S (2002a). Complex predicates: verbal complexes, resultative constructions, and particle verbs in German. Stanford: CSLI Publications.
Müller S (2002b). ‘Syntax or morphology: German particle verbs revisited.’ In Dehé N, Jackendoff R S, McIntyre A & Urban S (eds.) Verb-particle explorations, Interface explorations, no. 1. Berlin, New York: Mouton de Gruyter. 119–139.
Müller S (2003). ‘The morphology of German particle verbs: solving the bracketing paradox.’ Journal of Linguistics 39(2), 275–325.
Neeleman A (1995). ‘Complex predicates in Dutch and English.’ In Haider H, Olsen S & Vikner S (eds.) Studies in comparative Germanic syntax, vol. 31, Studies in Natural Language and Linguistic Theory. Dordrecht, Boston, London: Kluwer Academic Publishers. 219–240.
Neeleman A & Weerman F (1993). ‘The balance between syntax and morphology: Dutch particles and resultatives.’ Natural Language and Linguistic Theory 11, 433–475.
Nerbonne J, Netter K & Pollard C J (eds.) (1994). German in head-driven phrase structure grammar. CSLI lecture notes, no. 46. Stanford: CSLI Publications.
Pollard C J & Sag I A (1994). Head-driven phrase structure grammar. Studies in Contemporary Linguistics. Chicago, London: University of Chicago Press.
Przepiórkowski A & Kupść A (1997). ‘Verbal negation and complex predicate formation in Polish.’ In Blight & Moosally (eds.). 247–261.
Rentier G (1994). ‘Dutch cross serial dependencies in HPSG.’ In COLING Staff (ed.). 818–822.
Rizzi L (1982). ‘A restructuring rule.’ In Issues in Italian Syntax. Dordrecht, Cinnaminson, NJ: Foris Publications. 1–48.
Rosen C (1997). ‘Auxiliation and serialization: on discerning the difference.’ In Alsina et al. (1997) (eds.). 175–202.
Stiebels B (1996). Lexikalische Argumente und Adjunkte: Zum semantischen Beitrag verbaler Präfixe und Partikeln. Studia grammatica XXXIX. Berlin: Akademie Verlag.
Stiebels B & Wunderlich D (1994). ‘Morphology feeds syntax: the case of particle verbs.’ Linguistics 32(6), 913–968.
van Noord G & Bouma G (1994). ‘The scope of adjuncts and the processing of lexical rules.’ In COLING Staff (ed.). 250–256.
van Noord G & Bouma G (1997). ‘Dutch verb clustering without verb clusters.’ In Blackburn P & de Rijke M (eds.) Specifying syntactic structures. Stanford: CSLI Publications/Folli. 123–153.
Verspoor C M (1997). ‘Contextually-dependent lexical semantics.’ Ph.D. thesis, University of Edinburgh.
Wechsler S (1997). ‘Resultative predicates and control.’ In Blight & Moosally (eds.). 307–321.
Wechsler S & Noh B (2001). ‘On resultative predicates and clauses: parallels between Korean and English.’ Language Sciences 23, 391–423.
Williams E (1983). ‘Against small clauses.’ Linguistic Inquiry 14(2), 287–308.
Winkler S (1997). Focus and secondary predication. Studies in generative grammar, no. 43. Berlin, New York: Mouton de Gruyter.
Wurmbrand S (2001). Infinitives. Restructuring and clause structure. Studies in generative grammar, no. 55. Berlin, New York: Mouton de Gruyter.
Complex Segments
W Kehrein, Philipps University Marburg, Marburg, Germany
© 2006 Elsevier Ltd. All rights reserved.
A complex segment is a single speech unit with a nonhomogeneous phonetic structure. The term will be used here in a broad sense covering two major subclasses:
A. ‘contour segments,’ i.e., sounds produced with intrinsic sequential properties, such as affricates [ , ], pre- and postnasalized stops [nd, dn], and short diphthongs [ , ];
B. ‘multiply articulated consonants,’ i.e., sounds with (more or less) simultaneous articulations at different places, such as ‘doubly articulated’ labiovelars [ , ] and clicks [8, !], and ‘secondary-articulated’ consonants [pj, kw, ] (palatalized consonants, labialized consonants, and velarized consonants).
Besides its meaning as a cover term, phonologists also use ‘complex segment’ to denote either (A) or (B). The former usage is based on Hoard’s (1971) original definition; the latter follows Sagey’s (1986) terminology.
Complex Segments as Single Speech Units
Many phonologists have observed that affricates, prenasalized stops, doubly articulated consonants, and secondary-articulated consonants behave like single units and unlike clusters for PHONOTACTIC reasons (see Trubetzkoy, 1939, Martinet, 1939 on affricates; Herbert, 1986 for a critical review on prenasalized stops). Dagbani, Boazi, and Chipewyan, for instance, tolerate single consonants and complex segments, though no clusters, in syllable onsets: Dagbani has syllable-initial affricates [ , ] and labiovelars [ , , ] (Ladefoged, 1964), Boazi has prenasalized stops [mb, nd, Ng, NG] (Foley, 1986), and Chipewyan has quite a number of affricates and labialized velars, e.g., [ , , , , kw, xw] (Maddieson, 1984). Languages with more complex syllable types show a parallel pattern. Verb stems in Éwé, for instance, allow for initial C+liquid clusters, where C can be a simple consonant, a labiovelar [ , ], an affricate [ , ], or a palatalized nasal [nj] (Ladefoged, 1964; see (1d) below). Similarly, word-initial C+sonorant clusters in Standard German can start with a single obstruent, e.g., [pl]anke ‘plank,’ [fl]anke ‘flank,’ [ku]al ‘pain,’ or an affricate, as in [ l]anze ‘plant,’ [ u]ei ‘two.’ The parallel phonotactic behavior of simple and complex segments is supported by a number of further observations.
First, complex segments are TAUTOSYLLABIC in intervocalic position, i.e., [a a, a.nda, a. a, a.pja] etc., whereas clusters are typically heterosyllabic, [as.ta, ar.ma, at.fa, am.sa]. Verbal nouns in Dera (a.k.a. Kanakuru), for instance, have a high-low tone pattern if their initial syllable is closed, but two high tones if the initial syllable is open, cf. [ja´ h.JeA k] ‘sift’ and [mo´ .neB k] ‘forget.’ A word such as Ki´ndeB k ‘squeeze’ illustrates that prenasalized stops pattern as onsets, i.e., [Ki´ndeB k] (Clements, 2000). Similarly, vowel lengthening in open syllables identifies affricates in Faroese as onsets; cf. [e:.ta] ‘to eat’ and [ve:. a] ‘wake up,’ but [hEs.tor] ‘horse’ (Lockwood, 1977).
Second, complex segments have PHONETIC DURATIONS comparable to single segments but significantly shorter than clusters. This has been shown for labiovelars in Eggon, Éwé, Idoma, Yoruba, and Igbo, for affricates in English, Polish, and Kabardian, for prenasalized stops in languages such as Ganda and Sinhala, and for palatalized consonants in Russian (see e.g., Sagey, 1986; Ladefoged and Maddieson, 1996).
Third, complex segments differ from clusters by having a FIXED ORDER no matter what their position in the syllable is, e.g., German [ ao] ‘peacock’ and [tO ] ‘pot,’ but [kla:!] ‘clear’ vs. [kalk] ‘lime.’ Analyzing complex segments as single speech units explains why they are not affected by the ‘sonorancy sequencing generalization,’ the principle which determines the order of consonants in syllable onsets and codas.
Fourth, complex segments are INSEPARABLE UNITS with regard to processes such as vowel epenthesis, infixation, or reduplication. In Éwé, for instance, reduplication copies the first consonant and vowel of a stem, as shown in (1a) vs. (1b). Complex segments are copied as units (1c); and clusters of complex segment and liquid are split up after the complex segment, but not after its first component (1d, e) (Sagey, 1986: 86).
(1a) fo ‘to beat’ fo-fo ‘beating’
(1b) fle ‘to buy’ fe-flee ‘bought’
(1c) i ‘to grow’ i- ii ‘grown up’ *t ii
(1d) lo ‘to lead’ o- lo ‘leading’ *ko lo
(1e) njra ‘to rave’ nja-njrala ‘a raver’ *nanjrala
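The copying pattern in (1) can be illustrated with a small Python sketch. It is a simplification under stated assumptions rather than an analysis from the source: stems are given as lists of segments in which a complex segment counts as one item, the vowel inventory `VOWELS` is a hypothetical stand-in, and any extra final material in the cited reduplicated forms (as in fe-flee or nja-njrala) is ignored.

```python
from typing import List

# Hypothetical vowel inventory, used only to locate the first vowel of a stem.
VOWELS = {"a", "e", "i", "o", "u"}


def reduplicate(stem: List[str]) -> str:
    """Copy the first consonant (simple or complex) and the first vowel
    of the stem, and prefix that copy to the stem."""
    first_consonant = stem[0]
    first_vowel = next(seg for seg in stem if seg in VOWELS)
    return first_consonant + first_vowel + "-" + "".join(stem)


simple = reduplicate(["f", "o"])               # compare (1a)
cluster = reduplicate(["f", "l", "e"])         # compare (1b): the liquid is skipped
complex_seg = reduplicate(["nj", "r", "a"])    # compare (1e): [nj] copied as a unit
split_seg = reduplicate(["n", "j", "r", "a"])  # the starred pattern in (1e)
```

Treating [nj] as one list item yields the attested shape of the copy, whereas treating it as two segments yields the shape that the source marks as ungrammatical.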
Finally, complex segments must be distinguished from their corresponding (though phonetically longer) clusters because in some languages at least both are in CONTRAST with each other. Some examples are given in Table 1.
Table 1 Contrasts of complex segments and clusters (columns: Language, Complex segment, Cluster)
Eggon; Russian; Polish; Kabardian: u ‘die’; pjotr ‘Peter’; $i ‘whether’; a:s ‘it has been thrown’; landa ‘blind’

Different Types of Complex Segments
Following the criteria from the previous section, phonologists have identified quite a number of complex segments. Some of them are listed in Table 2 below along with their frequency in Maddieson’s (1984) sample of 317 languages. Several conclusions can be drawn from crosslinguistic investigations such as Maddieson’s: first and most obviously, some complex segments are widespread among the languages of the world, while others are exceedingly rare. The palatoalveolar affricate [ ] is by far the most common, occurring in 141 (≈ 44.5%) of the languages in the database (80 languages have [ ], 43 languages have [ h], and 35 have [ ’]). Second, the frequency of a particular complex segment is not entirely determined by its ‘type.’ Thus, the affricates [ ] and [ ] are extremely frequent compared to [ ], which occurs only in Luo and Chipewyan; [kw] is common, but only Nambakaengo (Santa Cruz) has [pw]; labiovelars [ ] (and [ ], in 18 languages) are relatively frequent, but coronovelars, such as the alveolar click [!] in Nama and !Xũ (Kung-Ekoka), or the rather obscure dentopalatal [ ] of Maung, are not; and while prenasalized stops [mb, nd, Ng] occur in 18 languages of the sample, their postnasalized counterparts [bm, dn, gN] are restricted to a single language, Aranda (Western Arrarnta). Third, complex segments tend to be homogeneous except for a single phonological dimension. This is evident from comparing the frequencies of [ ], [mb], [ ], or [tj] with doubly complex segments such as [ j], [n ], and [ ]. Fourth, many potential complex segments do not occur at all in Maddieson’s database, nor in any other known language. Restrictions apply to both contour segments and doubly articulated consonants: the former display internal changes in continuancy ([ ]), nasality ([mb, bm]), and laterality ([ ]), but never in, e.g., voicing (*[ , ]), glottal width (*[ , ]), or sonorancy (*[ , , ]). The latter are always composed of two, independently movable articulators, i.e., labial and dorsal ([ , , , 8]), coronal and dorsal ([|, !, ] and possibly [ ]), or coronal and labial ([ , ]) (in Yeletnye [Yele]; Ladefoged and Maddieson, 1996), but never bilabial and labiodental (*[ ]), alveolar and retroflex (*[ ]), palatal and velar (*[ ]), or velar and uvular (*[ ]) (see Halle, 1983). The same restrictions apply to more heterogeneous types: [ j], [n ], and [ ] are possible (and attested) doubly complex segments because their individual components are (cf. [ ], [tj], [nd], [ ]); but *[ ], *[ ], *[ ], and *[ ] are impermissible for the general absence of voicing contours (*[ ]), glottal width contours (*[ ]), multiple coronals (*[ ]), and multiple dorsals (*[ ]), respectively. There is no general agreement on how ‘complex’ a complex segment can get; but some phonologists have argued that the Chadic language Margi (Marghi Central) carries things to extremes. According to Sagey’s (1986) analysis at least, Margi has affricates, prenasalized consonants, labialized consonants, and labiocoronals ([ , ]), as well as many of their combinations, yielding prenasalized, labialized stops ([mbw, Ngw]), ‘heterorganic affricates’ ([ , , ]), stop–affricate combinations [ , ], and even some ‘ternary-complex segments,’ such as [n w] and [ ].
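The percentages quoted for Maddieson’s sample follow from the language counts and the sample size of 317 given in the text. The following Python lines are only a quick arithmetic check; the function name `share` is hypothetical, and rounding to one decimal place is an assumption about how the figures were reported.

```python
SAMPLE_SIZE = 317  # languages in Maddieson's (1984) sample, as stated in the text


def share(count: int, sample_size: int = SAMPLE_SIZE) -> float:
    """Percentage of the sample, rounded to one decimal place."""
    return round(100 * count / sample_size, 1)


most_common_affricate = share(141)  # languages with the palatoalveolar affricate
prenasalized_stops = share(18)      # languages with prenasalized stops
single_language = share(1)          # a segment attested in one language only
```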
Table 2 Frequency of some complex segments (number of languages / percentage of the 317-language sample). Source: Maddieson (1984).
141 / 44.5; 95 / 30.0; 38 / 12.0; 19 / 6.0; 18 / 5.7; 12 / 3.8; 8 / 2.5; 5 / 1.6; 3 / 0.9; 2 / 0.6; 2 / 0.6; 1 / 0.3; 1 / 0.3; 1 / 0.3; 1 / 0.3

Phonological Representations for Complex Segments
Complex segments have played quite a role in various developments in theoretical phonology. Doubly articulated consonants have served as major evidence for a fundamental shift from place-of-articulation features to the so-called articulator-based theory (Halle, 1983; Clements and Hume, 1995). In this model, simple and multiply articulated consonants can be represented with different numbers and types of active articulators: simple consonants receive a single consonantal articulator, as shown in (2a); doubly articulated consonants have two consonantal articulators (2b); and secondary-articulated consonants are specified for consonantal and vocalic articulators (2c).
[Figure (2a)–(2c): articulator representations of simple, doubly articulated, and secondary-articulated consonants]

The feature geometrical structures in (2) are interpreted as phonologically unordered. This explains why no language contrasts, e.g., [ ] and [ ], or [kw] and [wk]. And it also accounts for the observation that multiply articulated consonants can spread both place specifications to a preceding consonant, as shown below for the labiovelar in Kpelle (Sagey, 1986: 37).
(3) N-polu [ḿ.bo.lu] ‘my back’
N-kO: [ .gO:] ‘my foot’
N- iN [ . iN] ‘myself’ *[ iN], *[ḿ. iN]

Contour segments, on the other hand, seem to act as one type of sound to their left, but as another type to their right, a phenomenon known as the ‘edge effect.’ Prenasalized stops, for instance, often nasalize preceding vowels, e.g., /and/ → [ãnd] (in Guaraní, Kaingáng, and other languages), but never following vowels (/nda/ → *[ndã]; Anderson, 1976). Likewise, epenthesis in English plural forms such as buses ([-sɪz]), bushes ([-ʃɪz]), churches ([-tʃɪz]), and edges ([-dʒɪz]) seems to treat affricates as continuant sounds at their right edge (though see ‘Affricates as Strident Stops’ below). On the strength of these findings, contour segments are usually represented with phonologically ordered properties: affricates contain a sequence of [−continuant] and [+continuant] specifications (Hoard, 1971; Sagey, 1986), prenasalized stops are [+nasal] then [−nasal] sounds, and postnasalized stops display a change from [−nasal] to [+nasal] (Anderson, 1976). Most phonologists, including Hoard, Anderson, and Sagey, analyze such ‘feature contours’ as parts of single segments, as shown in (4).

[Figure (4a)–(4c): feature contours represented as parts of single segments]

An alternative to segment-internal feature contours is Clements and Keyser’s (1983) bisegmental analysis. They treat contour segments as combinations of two (closely related) simple segments sharing a single position in the syllable. This is sketched in (5).

[Figure (5a)–(5c): bisegmental representations of contour segments]

In Clements and Keyser’s view, contour segments represent one of the three fundamental ways in which segments can be linked to the syllable template (the so-called skeletal tier): short segments are linked to single skeletal units (6a); long segments are linked to two positions (6b); and contour segments are represented as two segments linked to a single slot on the skeleton (6c).

[Figure (6a)–(6c): short, long, and contour segments linked to the skeletal tier]

The unitary function of contour segments, in this model, follows from their monopositional status at the skeletal tier. This is shown in (7). (7a) illustrates the parallel phonotactic behavior of simple consonants and affricates in German; (7b) represents an intervocalic prenasalized stop in Dera; and (7c) shows how affricates can be distinguished from clusters in a language such as Polish.

[Figure (7a)–(7c): skeletal-tier representations for German, Dera, and Polish]
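The three linking types in (6), and the contrast with clusters in (7c), can be mimicked with a toy data structure. This is a rough sketch under simplifying assumptions, not Clements and Keyser’s formalism: an association is reduced to a tuple of melodic segments plus a count of skeletal slots, and the class name `Linking`, the function `classify`, and the placeholder segment labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Linking:
    """Association between melodic segments and skeletal slots (toy model)."""
    segments: Tuple[str, ...]  # melodic units
    slots: int                 # number of skeletal positions they occupy


def classify(link: Linking) -> str:
    """Name the linking type according to the segment-to-slot ratio."""
    n = len(link.segments)
    if n == 1 and link.slots == 1:
        return "short"    # one segment, one slot, as in (6a)
    if n == 1 and link.slots == 2:
        return "long"     # one segment, two slots, as in (6b)
    if n == 2 and link.slots == 1:
        return "contour"  # two segments, one slot, as in (6c)
    if n == 2 and link.slots == 2:
        return "cluster"  # two segments, each with its own slot
    return "other"


# "t", "a", and "s" are placeholder labels, not transcriptions from the source.
short_consonant = classify(Linking(("t",), 1))
long_vowel = classify(Linking(("a",), 2))
affricate = classify(Linking(("t", "s"), 1))
stop_fricative_cluster = classify(Linking(("t", "s"), 2))
```

On this view an affricate and the corresponding cluster share their melodic content and differ only in the number of skeletal positions, which is the distinction (7c) is said to express.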
Affricates as Strident Stops
In an early feature analysis, Jakobson et al. (1952) proposed to treat affricates as ‘strident stops,’ i.e., [+strident, −continuant] sounds. Proponents of the contour analysis of affricates have advanced three major problems with this view: (a) the existence of nonstrident affricates, such as [ ] in Chipewyan and Luo; (b) the natural class of affricates and fricatives; and (c) the observation of edge effects, suggesting that affricates act as stops only to their left, but as fricatives to their right. More recent work, however, has raised serious objections to all these points, thereby putting the strident stop approach back on the map. First, contrary to what the internal ordering of features would predict, affricates may well act as [−continuant] sounds to their right (the so-called anti-edge effects; see Lombardi, 1990). Second, on closer examination, the alleged natural class of affricates and fricatives as [+continuant] sounds turns out to be a natural class of [+strident] sounds, because it includes stridents, [s, s, §, , , ] etc., but excludes nonstrident continuants (see LaCharité, 1993, Kim, 1997, Kehrein, 2002). The English plural, mentioned in the previous section, is a typical example in this regard: only [+strident] explains why epenthesis shows up in bushes and churches though not in, e.g., cliffs ([-fs]) and moths ([-θs]) (see also Rubach, 1994, on the active role of stridency in the phonology of Polish). Finally, Clements (1999) and Kehrein (2002) rejected the idea of nonstrident affricates as independent phonological entities because these sounds never contrast with stops. Rather, they occur either (a) as phonetic variants of simple stops, as in Diyari (Dieri) [t9] # [ ], for instance, or (b) as the phonetic realization of a particular laryngeal series of stops, e.g., Tahltan [q, q’] but [qwh], or (c) in cases of minor place distinctions, such as Chipewyan and Luo, which have laminodental [ ] and apicoalveolar [t], though neither [t9] nor [ ] (see Clements, 1999; and Kehrein, 2002 for more examples and references). The findings suggest that affricates are, though phonetically complex, rather simple sounds at the phonological level: strident affricates such as [ , , ] are specified for [−continuant, +strident], lateral affricates [ , ] are specified for [−continuant, +lateral] (see Kehrein, 2002), and nonstrident affricates ([ , , ] etc.) are ordinary simple stops phonologically, i.e., [−continuant] sounds. Their affricated phonetic forms, in this view, follow exclusively from requirements of perceptibility. Laterality and stridency are necessarily sequenced with respect to oral closure; and affrication of simple stops might be explained as a strategy to increase the perceptibility of other phonological distinctions, laryngeal series and minor place contrasts in particular.
See also: Autosegmental Phonology; Diphthongs; Distinc-
Bibliography
Anderson S R (1976). ‘Prenasalized consonants and the internal organisation of segments.’ Language 52, 326–344.
Clements G N (1999). ‘Affricates as noncontoured stops.’ In Fujimura O, Joseph B D & Palek B (eds.) Proceedings of LP ’98. Prague: Karolinum Press. 271–299.
Clements G N (2000). ‘Phonology.’ In Heine B & Nurse D (eds.) African languages: an introduction. Cambridge: Cambridge University Press. 123–160.
Clements G N & Hume E V (1995). ‘The internal organization of speech sounds.’ In Goldsmith J A (ed.) The handbook of phonological theory. Oxford: Blackwell. 245–306.
Clements G N & Keyser S J (1983). CV-phonology: a generative theory of the syllable. Cambridge, MA: MIT Press.
Foley W A (1986). The Papuan languages of New Guinea. Cambridge: Cambridge University Press.
Halle M (1983). ‘On distinctive features and their articulatory implementation.’ Natural Language and Linguistic Theory 1, 91–105.
Herbert R K (1986). Language universals, markedness theory, and natural phonetic processes. Berlin: Mouton de Gruyter.
Hoard J E (1971). ‘The new phonological paradigm.’ Glossa 5, 222–268.
Jakobson R, Fant G & Halle M (1952). Preliminaries to speech analysis: the distinctive features and their correlates. Cambridge, MA: MIT Press.
Kehrein W (2002). Phonological representation and phonetic phasing: affricates and laryngeals. Linguistische Arbeiten 466. Tübingen: Niemeyer.
Kim H (1997). ‘The phonological representation of affricates: evidence from Korean and other languages.’ Unpublished Ph.D. dissertation. Cornell University.
LaCharité D (1993). ‘The internal structure of affricates.’ Unpublished Ph.D. dissertation. University of Ottawa.
Ladefoged P (1964). A phonetic study of West African languages. West African Languages and Monographs 1. Cambridge: Cambridge University Press.
Ladefoged P & Maddieson I (1996). The sounds of the world’s languages. Oxford: Blackwell.
Lockwood W B (1977). An introduction to modern Faroese. Tórshavn: Føroya Skúlabókagrunnur.
Lombardi L (1990). ‘The nonlinear organisation of the affricate.’ Natural Language and Linguistic Theory 8, 375–425.
Maddieson I (1984). Patterns of sounds. Cambridge: Cambridge University Press.
Martinet A (1939). ‘Un ou deux phonèmes?’ Acta Linguistica 1, 94–103.
Rubach J (1994). ‘Affricates as strident stops in Polish.’ Linguistic Inquiry 25, 119–143.
Sagey E (1986). ‘The representation of features and relations in nonlinear phonology.’ Ph.D. Dissertation, MIT.
Trubetzkoy N S (1939). Grundzüge der phonologie. Göttingen: Vandenhoeck and Ruprecht.
Componential Analysis
D Geeraerts, University of Leuven, Leuven, Belgium
© 2006 Elsevier Ltd. All rights reserved.
Componential Analysis Componential analysis is an approach that describes word meanings as a combination of elementary meaning components called semantic features or semantic components. The set of basic features is supposed to be finite. These basic features are primitive in the sense that they are the undefined building blocks of lexical-semantic definitions. Hence, the term ‘semantic primitives’ (or sometimes ‘atomic predicates’) is used to refer to the basic features. The advantage of having definitional elements that themselves remain undefined resides in the possibility of avoiding circularity: if the definitional language and the defined language are identical, words are ultimately defined in terms of themselves – in which case the explanatory value of definitions seems to wholly disappear. More particularly, definitional circularity would seem to imply that it is impossible to step outside the realm of language and to explain how language is related to the world. This motivation for having undefined primitive elements imposes an important restriction on the set of primitive features. In fact, if achieving noncircularity is the point, the set of primitives should be smaller than the set of words to be defined: there is no
reductive or explanatory value in a set of undefined defining elements that is as large as the set of concepts to be defined. Furthermore, the idea was put forward that the restricted set of primitive features might be universal, just like in phonology. This universality is not, however, a necessary consequence of the primitive nature of features: the definitional set of features could well be language specific.
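The idea that word meanings are combinations of a small set of undefined primitives can be made concrete with a toy lexicon. The sketch below is purely illustrative: the two features `MALE` and `ADULT` and the four words are hypothetical textbook-style choices that do not come from the article, and the function names are likewise invented for the example.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical primitives and lexicon, chosen only for illustration.
PRIMITIVES: Tuple[str, ...] = ("MALE", "ADULT")
LEXICON: Dict[str, Dict[str, bool]] = {
    "man": {"MALE": True, "ADULT": True},
    "woman": {"MALE": False, "ADULT": True},
    "boy": {"MALE": True, "ADULT": False},
    "girl": {"MALE": False, "ADULT": False},
}


def distinguishing_features(word_a: str, word_b: str) -> List[str]:
    """Primitives on which the definitions of two words differ."""
    a, b = LEXICON[word_a], LEXICON[word_b]
    return [feature for feature in PRIMITIVES if a[feature] != b[feature]]


def is_reductive() -> bool:
    """True if every pair of words is kept apart by some primitive and
    there are fewer primitives than words to be defined."""
    all_distinct = all(
        distinguishing_features(x, y) for x, y in combinations(LEXICON, 2)
    )
    return all_distinct and len(PRIMITIVES) < len(LEXICON)
```

With two primitives defining four words, the toy lexicon satisfies the size restriction described above: the defining set is smaller than the set of words it defines.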
The European Tradition of Componential Analysis Componential analysis was developed in the second half of the 1950s and the beginning of the 1960s by European and American linguists, at least to some extent independently of each other. Although the first step in the direction of componential analysis can be found in the work of Louis Hjelmslev (Hjelmslev, 1953), its full development does not emerge in Europe before the early 1960s, in the work of Bernard Pottier (Pottier, 1964; Pottier, 1965), Eugenio Coseriu (Coseriu, 1964; Coseriu, 1967) and Algirdas Greimas (Greimas, 1966). The fundamental idea behind these studies is that the items in a lexical field are mutually distinguished by functional oppositions. In this sense, componential analysis grew out of a desire to provide a systematic analysis of the semantic relations within a lexical field.
Componential Analysis
D Geeraerts, University of Leuven, Leuven, Belgium
© 2006 Elsevier Ltd. All rights reserved.
Componential Analysis Componential analysis is an approach that describes word meanings as a combination of elementary meaning components called semantic features or semantic components. The set of basic features is supposed to be finite. These basic features are primitive in the sense that they are the undefined building blocks of lexical-semantic definitions. Hence, the term ‘semantic primitives’ (or sometimes ‘atomic predicates’) is used to refer to the basic features. The advantage of having definitional elements that themselves remain undefined resides in the possibility of avoiding circularity: if the definitional language and the defined language are identical, words are ultimately defined in terms of themselves – in which case the explanatory value of definitions seems to wholly disappear. More particularly, definitional circularity would seem to imply that it is impossible to step outside the realm of language and to explain how language is related to the world. This motivation for having undefined primitive elements imposes an important restriction on the set of primitive features. In fact, if achieving noncircularity is the point, the set of primitives should be smaller than the set of words to be defined: there is no
reductive or explanatory value in a set of undefined defining elements that is as large as the set of concepts to be defined. Furthermore, the idea was put forward that the restricted set of primitive features might be universal, just like in phonology. This universality is not, however, a necessary consequence of the primitive nature of features: the definitional set of features could well be language specific.
The European Tradition of Componential Analysis Componential analysis was developed in the second half of the 1950s and the beginning of the 1960s by European and American linguists, at least to some extent independently of each other. Although the first step in the direction of componential analysis can be found in the work of Louis Hjelmslev (Hjelmslev, 1953), its full development does not emerge in Europe before the early 1960s, in the work of Bernard Pottier (Pottier, 1964; Pottier, 1965), Eugenio Coseriu (Coseriu, 1964; Coseriu, 1967) and Algirdas Greimas (Greimas, 1966). The fundamental idea behind these studies is that the items in a lexical field are mutually distinguished by functional oppositions. In this sense, componential analysis grew out of a desire to provide a systematic analysis of the semantic relations within a lexical field.
Methodologically speaking, componential analysis has a double background. First, it links up with the traditional lexicographical practice of defining concepts in an analytical way, by splitting them up into more basic concepts; thus, a definition of ram as ‘male sheep’ uses the differentiating feature ‘male’ to distinguish the term ram from other items in the field of words referring to sheep. In the Aristotelian and Thomistic tradition, this manner of defining is known as a definition per genus proximum et differentias specificas, i.e., (roughly) ‘by stating the superordinate class to which something belongs, together with the specific characteristics that differentiate it from the other members of the class.’ Second, the background of the componential idea can be traced to structural phonology, where the sound inventory of natural languages had been successfully described by means of a restricted number of oppositions. On the basis of this phonological model, the structuralist semanticians set out to look for functional oppositions within a lexical field, oppositions that are represented, as in phonology, by means of a binary plus/minus notation. Pottier (1964) provides an example in his analysis of a field consisting (among others) of the terms pouf, tabouret, chaise, fauteuil, and canapé; the term that delimits the field as a superordinate term is siège, ‘sitting equipment with legs.’ These five words can be contrasted mutually by means of distinctive oppositions. Consider the following set:
s1 ‘for sitting’
s2 ‘with legs’
s3 ‘with back’
s4 ‘for a single person’
s5 ‘with arms’
s6 ‘made from hard material’
We can then define the items in the field: S1 S2 S3 S4 S5
The work of the structuralist semanticians of the European school tends to be rich in terminological distinctions, and this is also the case in Pottier’s work. The values of the oppositional dimensions (s1, s2, etc.) are called sèmes, and the meaning of a lexème (lexical item) is a sémème (S1, S2, etc.). Siège, then, is the archilexème, and the meaning of this archilexème (in this case, features s1 and s2) is the archisémème. The archisémème is present in the sémèmes of any of the separate lexèmes in the field. This is not yet the whole story, since fonctèmes (relevant for the description of grammatical meaning aspects, such as word class) and classèmes (sèmes that recur throughout the entire vocabulary) should also be taken into account. This terminological abundance has, however, hardly found its way to the customary semantic vocabulary (although the English counterparts of the French terms, such as ‘sememe’ and ‘seme,’ may occasionally be met with). This illustrates the fact that, as mentioned before, the European branch of componential analysis has remained more or less isolated. Specifically, it has not played an important role in the developments that grew out of the American branch, such as the incorporation of componential analysis into generative grammar. Beside the ones mentioned above, other names that are of importance within the European tradition are those of Horst Geckeler (Geckeler, 1971), who specifically continues the lines set out by Coseriu, Klaus Heger (Heger, 1964), Kurt Baldinger (Baldinger, 1980), and Leonhard Lipka (Lipka, 2002). Through the work of Greimas, European structuralist semantics has had a considerable impact outside linguistics, especially in literary studies.
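The relation between sememes and the archisememe can be made concrete with a small sketch. The Python fragment below treats each sememe as a set of semes and computes the archisememe as the features shared by all items in a field. The feature inventory is a hypothetical illustration built around the ram ‘male sheep’ example; apart from that definition, none of the assignments come from the sources cited here, and a set-based encoding is only one possible stand-in for the binary plus/minus notation.

```python
# Hypothetical sememes for a small field of sheep terms.
# Only ram = 'male sheep' is taken from the text; the other
# entries and all feature names are illustrative assumptions.
SEMEMES = {
    "ram": frozenset({"SHEEP", "MALE"}),
    "ewe": frozenset({"SHEEP", "FEMALE"}),
    "lamb": frozenset({"SHEEP", "YOUNG"}),
}


def archisememe(sememes):
    """Return the semes shared by every item in the field."""
    return frozenset.intersection(*sememes.values())


def opposition(a, b, sememes):
    """Return the semes that occur in exactly one of the two items."""
    return sememes[a] ^ sememes[b]


if __name__ == "__main__":
    print(sorted(archisememe(SEMEMES)))
    print(sorted(opposition("ram", "ewe", SEMEMES)))
```

On this encoding, the archisememe falls out as the intersection of the sememes, and a functional opposition between two items is their symmetric difference.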
The American Tradition of Componential Analysis In America, the componential method emerged from anthropological linguistic studies. In a rudimentary way, this is the case with Conklin (1955), whereas a thorough empirical, formal, and theoretical elaboration is provided by Goodenough (1956) and especially Lounsbury (1956). The major breakthrough of componential analysis did not, however, occur until the appearance of Jerrold J. Katz and Jerry A. Fodor’s seminal article ‘‘The structure of a semantic theory’’ (Katz and Fodor, 1963). It was Katz in particular who extended and defended the theory afterward; see especially Katz (1972). Rather than analyzing a lexical field, Katz and Fodor gave an example of the way in which the meanings of a single word, when analyzed componentially, can be represented as part of a formalized dictionary. Such a formalized dictionary (to distinguish it from ordinary dictionaries, it is sometimes referred to by the term ‘lexicon’) would then be part of a formal grammar. What the entry for the English word bachelor would look like is demonstrated in Figure 1. Next to word form and word class, two kinds of semantic components can be found in the diagram: markers and distinguishers (indicated with parentheses and square brackets respectively). Markers constitute what is called the systematic part
of the meaning of an item. Like Pottier’s classèmes, they recur throughout the lexicon. Specifically, they are supposed to represent those features in terms of which selection restrictions (semantic restrictions on the combinatory possibilities of words) are formulated. Distinguishers represent what is idiosyncratic rather than systematic about the meaning of an item; they only appear on the lowest level of the formalized representation. The Katzian approach has had to endure heavy attacks (among others from Bolinger, 1965, Weinreich, 1966, and Bierwisch, 1969), and Katz’s views gradually moved to the background of the ongoing discussions. The Katzian distinction between markers and distinguishers, for instance, was generally found not to be well established, and was consequently abandoned. Conversely, various other distinctions between types of features were proposed, two kinds of which may be mentioned separately. To begin with, binary features of the plus/minus type were supplemented with nonbinary features, which represent cases where the distinctive dimension can have more than two values. Leech (1974), for instance, suggested a distinctive dimension ‘metal’ with multiple values, in order to distinguish between gold, copper, iron, mercury, and so on. Further, a distinction between elementary and complex features was drawn to stress the fact that a concept with distinctive value in one lexical field might itself have to be subjected to further decomposition, until the ultimate level of basic features was reached. Other developments triggered by the Katzian approach included attempts to combine componential analysis with other forms of semantic analysis, e.g., with lexical field theory (Lehrer, 1974; Lutzeier, 1981). One should bear in mind that suggestions such as those enumerated here, although leading away from the original Katzian model, were by and large situated within the very framework that was designed by Katz and Fodor, i.e., that of a formalized componential meaning representation as part of a formal grammar.
Figure 1 Componential analysis of bachelor (after Katz and Fodor, 1963).
The Contemporary Situation
Basically, the contemporary attitude of linguists towards componential analysis takes one of three forms: componential analysis may be used as a descriptive formalism, as an epistemological necessity, or as a heuristic instrument. To begin with, there are various approaches in formal grammar that use some form of semantic decomposition as a descriptive device: see for instance Dowty (1979) and Pustejovsky (1995), which incorporate ideas from componential analysis in the framework of logical semantics. With the exception of researchers such as Ray Jackendoff (Jackendoff, 2002), who engages actively with cognitive psychology, the approaches mentioned here tend to pay minimal attention to the methodological question of how to establish the basic, primitive nature of semantic features. If the original Katzian approach combines the idea of primitiveness with the idea of formalization, most of the approaches in this first contemporary group stress the formalization aspect more than the systematic quest for primitives. Second, the converse holds for Anna Wierzbicka’s natural semantic metalanguage approach (see Natural Semantic Metalanguage), which shows little interest in the formalization of lexical and grammatical analyses, but which systematically tries to establish the basic set of primitive concepts. Third, at the other extreme, cognitive semantics and related approaches within contemporary semantics question the componential approach itself: what is the justification for assuming that lexical meanings are to be represented in a fragmented way, as a collection of more basic semantic elements? The antidecompositional reasoning takes many forms (see Fillmore, 1975 for one of the most influential statements), but one of the basic arguments is the following.
The appeal of noncircular definitions seemed to be that they could explain how the gap between linguistic meaning and extralinguistic reality is bridged: if determining whether a concept A applies to thing B entails checking whether the features that make up the definition of A apply to B as an extralinguistic entity, then words are related to the world through the intermediary of primitive features. But obviously, this does not explain how the basic features themselves bridge the gap. More generally, the ‘referential connection’ problem for words remains unsolved as long as it is not solved for the primitives. And
conversely, if the ‘referential connection’ problem could be solved for primitive features, the same solution might very well be applicable to words as a whole. So, if noncircularity does not solve the referential problem as such, decomposition is not a priori to be preferred over nondecompositional approaches, and psychological evidence for one or the other can be taken into account (see Aitchison, 2003 for an overview of the psychological issues). However, even within those approaches that do not consider semantic decomposition to be epistemologically indispensable, componential analysis may be used as a heuristic device. For instance, in Geeraerts et al. (1994), a work that is firmly situated within the tradition of cognitive semantics, the internal prototypical structure of lexical categories is analyzed on the basis of a componential analysis of the referents of the words in question. It would seem, in other words, that there is widespread agreement in linguistics about the usefulness of componential analysis as a descriptive and heuristic tool, but the associated epistemological view that there is a primitive set of basic features is generally treated with much more caution. See also: Cognitive Semantics; Lexical Fields; Natural Semantic Metalanguage; Semantic Primitives.
Bibliography
Aitchison J (2003). Words in the mind: an introduction to the mental lexicon (3rd edn.). Oxford: Blackwell.
Baldinger K (1980). Semantic theory. Oxford: Blackwell.
Bierwisch M (1969). ‘On certain problems of semantic representations.’ Foundations of Language 5, 153–184.
Bolinger D (1965). ‘The atomization of meaning.’ Language 41, 555–573.
Conklin H (1955). ‘Hanunóo color categories.’ Southwestern Journal of Anthropology 11, 339–344.
Coseriu E (1964). ‘Pour une sémantique diachronique structurale.’ Travaux de Linguistique et de Littérature 2, 139–186.
Coseriu E (1967). ‘Lexikalische Solidaritäten.’ Poetica 1, 293–303.
Dowty D (1979). Word meaning and Montague grammar. Dordrecht: Reidel.
Fillmore C (1975). ‘An alternative to checklist theories of meaning.’ In Cogen C, Thompson H & Wright J (eds.) Proceedings of the First Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society. 123–131.
Geckeler H (1971). Zur Wortfelddiskussion. Munich: Fink.
Geeraerts D, Grondelaers S & Bakema P (1994). The structure of lexical variation. Berlin: Mouton de Gruyter.
Goodenough W (1956). ‘Componential analysis and the study of meaning.’ Language 32, 195–216.
Greimas A (1966). Sémantique structurale. Paris: Larousse.
Heger K (1964). Monem, Wort, Satz und Text. Tübingen: Niemeyer.
Hjelmslev L (1953). Prolegomena to a theory of language. Bloomington: Indiana University Press.
Jackendoff R (2002). Foundations of language. Oxford: Oxford University Press.
Katz J J (1972). Semantic theory. New York: Harper and Row.
Katz J J & Fodor J A (1963). ‘The structure of a semantic theory.’ Language 39, 170–210.
Leech G (1974). Semantics. Harmondsworth, England: Penguin.
Lehrer A (1974). Lexical fields and semantic structure. Amsterdam: North Holland.
Lipka L (2002). English lexicology. Tübingen: Niemeyer.
Lounsbury F (1956). ‘A semantic analysis of Pawnee kinship usage.’ Language 32, 158–194.
Lutzeier P (1981). Wort und Feld. Tübingen: Niemeyer.
Pottier B (1964). ‘Vers une sémantique moderne.’ Travaux de Linguistique et de Littérature 2, 107–137.
Pottier B (1965). ‘La définition sémantique dans les dictionnaires.’ Travaux de Linguistique et de Littérature 3, 33–39.
Pustejovsky J (1995). The generative lexicon. Cambridge, MA: MIT Press.
Weinreich U (1966). ‘Explorations in semantic theory.’ In Sebeok T A (ed.) Current Trends in Linguistics 3. The Hague: Mouton. 395–477.
Compositionality: Philosophical Aspects
F J Pelletier, Simon Fraser University, Burnaby, BC, Canada
© 2006 Elsevier Ltd. All rights reserved.
There are three different but loosely related conceptions that are associated with the term ‘compositionality’ in the literature of philosophical and linguistic semantics.
One conception, taking its lead from the more literal sense of this technical term, concerns the manner of composition of objects in the world. In this sense, an object or type of object is compositional if it is identical with its parts when they are assembled in some specified way. A slogan for this notion of compositionality is: ‘‘An object is the sum of its parts.’’ However, this is a slightly misleading slogan, because
it does not distinguish between two different types of objects made of the same parts but put together differently. This notion of compositionality is metaphysical in nature: it provides a characterization of the ontology of objects in the world, saying that they can all be described in terms of some basic atomic elements and their combinations. Along with this ontological feature often goes an epistemological feature: that one can know objects in the world by understanding what the atomic items are and the ways they can be assembled. Both the ontological and the epistemological aspects here are further associated with reductionism: the view that objects are ‘‘nothing more than’’ their parts. In this meaning of compositionality, the compositionalists are often called ‘atomists,’ and anti-compositionalists are called ‘holists’ or sometimes ‘wholists.’ These latter theorists deny that all objects can be described and known in terms of their parts and the arrangement of the parts – for instance, they might deny that a corporation, a nation, or a group is ‘‘nothing more than’’ the class of individuals making them up together with their relationships – and hence they are antireductionistic. They might also hold that there are emergent properties and gestalt properties that cannot be described and known in the way required by atomism. A slogan for these theories is: ‘‘The whole is more than the sum of its parts.’’ In the field of semantics, whether semantics of natural language or of mental items, there is a somewhat different conception of compositionality in play. In this realm, it is meaning that is claimed to be compositional; but since meaning is always meaning of something, it is this other something that defines the parts and the whole, unlike the case of the first sort of compositionality. 
The slogan for this second conception of compositionality is: ‘‘The meaning of a whole is determined by the meaning of its parts and the way these parts are combined.’’ What we see here is that a feature of a whole (its meaning) is claimed to be determined by the similar feature in the parts of the whole, plus the mode of combination of these parts – unlike the case of the first type of compositionality, in which it was the whole itself that was alleged to be ‘‘nothing more than’’ its parts. In the second type of compositionality, the notions of ‘part’ and ‘whole’, as well as their mode of combination, are presupposed to be already defined in terms of an independent syntax (in the case of language) or an independent mental economy (in the case of concepts). So the realm of syntax or mental economy is presupposed to be compositional in the first sense, and the issue is whether the property of meaning that is associated with the parts and wholes will likewise compose. Since the second conception assumes that
the first conception applies to the background syntax, this second conception presupposes basic or primitive meanings for the atomic (syntactic or mental) parts out of which all other (syntactic or mental) items are composed. (Once this second notion of compositionality is acknowledged, where there is a presupposed part-whole structure and it is then asked whether a feature of the whole is somehow determined by the similar features in the parts, one can see questions of compositionality arising in many fields, not just in semantics. For example, one might wonder whether the intrinsic value of an action is determined by the values of the parts of the action and the way the parts are ordered. One might wonder whether the beauty of a whole is determined by the beauty of its parts and the way the parts are combined. One might wonder whether the duties and obligations of a corporation or a society are determined by those of its members and the way these members fit together to form the corporation or society.) Obviously, whether semantic compositionality is true or false depends upon the presupposed syntax or mental economy, the conception of meaning under consideration, and what is meant by the phrase ‘‘is determined by.’’ Indeed, many theorists have thought that this indeterminism inherent in semantic compositionality shows that its truth or falsity is merely ‘‘a methodological matter.’’ For a small alteration in the underlying syntax or mental economy might make a given semantics become non-compositional; a slight change in the assumed notion of ‘determination’ might make it become compositional again; an inclusion or exclusion of some property as ‘‘being semantic meaning’’ (as opposed, say, to ‘‘being pragmatics’’) makes it become non-compositional again; and there might be no reason to make these changes other than to keep or deny compositionality. 
The most popular explanation of ‘‘is determined by’’ in the semantic compositionalist’s slogan is that it means ‘is a (mathematical) function of’; so the slogan becomes: ‘‘The meaning of a complex syntactic unit is a (mathematical) function of the meanings of its syntactic parts and the way in which they are syntactically combined.’’ But according to some, this notion allows too much: it is claimed that if no constraints are put upon the function, nearly any meanings of parts and syntactic combination can be compositionally related to the meaning of a whole. Some theorists would want to make the function be natural or systematic (and so on), without saying much about what, exactly and in the abstract, would make a function be natural or systematic. More usual is to be given examples of what sort of mathematical function should be ruled out. Consider the idea that an adjective like red means something
different depending on what noun it modifies. For example, according to this view, red wine vs. red rose vs. red hair vs. red skin vs. red grapefruit all employ a different meaning of red. And then compositionality is false, because these phrases are all constructed by the same syntactic rule and yet the meaning of red changes as a result of some syntactic item (viz., the noun being modified) that is not a part of the lexical item red. But a defender of compositionality could respond that the meaning of red is constant throughout, by being disjunctive (‘‘when modifying wine it means r1; when modifying hair it means r2; etc.’’). This is a perfectly good mathematical function and would obviously yield the right meanings of wholes if there were enough disjuncts. Those who object to the mathematical notion of function in the definition of compositionality might claim here that disjunctive functions are ‘‘not natural.’’ The notion opposed to semantic compositionality is ‘semantic holism’. However, this notion means different things to different theorists, and it is not always just taken to mean merely that there is no mathematical function that will generate the required meanings. For example, some people call semantic holism the view that ‘‘words have meaning only in the context of a sentence’’ or that no word or other syntactic unit (including sentences, paragraphs, and discourses) has meaning in itself, but only in the setting of an entire theory or worldview or form of life. Others take semantic holism to be that the meaning of a syntactically defined item is determined not only by the meanings of its syntactic parts and their syntactic combination but also by the nonlinguistic context in which the utterance is made. (For example, it might be thought that the meaning of There is no money depends on who is speaking, whether the audience knows which business deal is being discussed, and so forth.) 
And still other holists, not necessarily wanting to bring these nonlinguistic items into meaning, nonetheless might hold that there are cases where the meaning of a syntactically complex item depends on meanings of linguistic items that are not syntactic parts of the complex. (For example, in The first man landed on the moon in 1969, we cannot take the meaning of the first man and combine it with landed on the moon in 1969 to get the right meaning, for there is no sense in which the sentence really is talking about the first man. Rather, the relevant meaning of the subject term is that of the first man who landed on the moon. But to obtain that meaning, we need to get information from the verb phrase. Hence, to get the meaning of the subject term we need information of items that are not syntactic parts of the subject term.) A third conception for (semantic) compositionality is less definite than the preceding, and comes through
considerations that might be called ‘the magic of language’. A set of closely related considerations has been pointed to at various times in the history of philosophy, both Western and Indian:
• We can understand an infinite number of novel sentences, so long as they employ words we already understand. We understand sentences and combinations that we have never encountered.
• We can create new sentences that we have never heard or used before, and we know that they are appropriate to the situation in which we use them.
• We are finite creatures who are exposed to a finite amount of information concerning our language. Nonetheless, we learn a system that is capable of infinite expression.
These considerations all point to the same features: (1) that language is something special (infinite, novel, creative, or whatever) and (2) that people manage to use/learn/understand language despite their finite nature. It is natural to see compositionality as an explanation of this ability: people have a finite stock of atomic items whose meanings are learned primitively, and there is a finite number of rules of combination whose effects on meaning are learned. But given that the rules are recursive in nature, this allows for an infinite number of sentences whose meanings are finitely knowable. (The opening paragraph of Frege [1923] is often taken to be an endorsement of this argument for compositionality, but it is a matter of scholarly dispute as to whether or not Frege actually believed in semantic compositionality. See Pelletier, 2001 and Janssen, 2001 for discussion and further references.) This third conception of (semantic) compositionality is a ‘functional’ one and thus less definite than the preceding two. It amounts to saying that compositionality is whatever accounts for the magic of language. It might be the second conception of compositionality, with its mathematical functions, that will do the trick, or it might be some other, more exotic type of function.
Or it may be some function that operates on items that are not necessarily syntactic subparts of the expression to be evaluated, and somehow thereby brings in information from context (of both linguistic and nonlinguistic varieties). The ‘magic of language’ considerations are the only arguments in favor of compositionality that do not seem merely to turn on such methodological considerations as the aesthetics of the syntax-semantics interface. However, it should be noted that they are not conclusive in relation to compositionality-as-mathematical-function. The second notion of compositionality does not guarantee the magic, nor does
non-compositionality in this second notion necessarily deny the magic. For it might be that the meaning of every syntactic whole is a function of the meanings of its parts and its syntactic mode of combination, but if these functions are not computable functions, then the language cannot be learned/used/understood in the way required by the magic. On the other hand, even if there is no function defined solely by the meanings of the parts and their modes of combination that will yield the meanings of the wholes, it could nonetheless be true that these meanings are computable in some other way . . . and then the magic would still be there. (An example of this possibility is Pelletier’s 1994/2004 ‘semantic groundedness’.)
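The ‘magic of language’ considerations above can be illustrated with a minimal sketch, assuming a toy fragment invented here purely for illustration: two atomic sentences, three recursive rules of combination, and truth values standing in for meanings. None of these names or choices comes from the literature cited; the point is only that a finite lexicon plus finitely many recursive rules yields computable meanings for unboundedly many novel sentences.

```python
# Toy illustration (all names and values are assumptions made for this sketch).
# A finite stock of atomic items whose meanings are "learned primitively".
LEXICON = {
    "it rains": True,
    "it snows": False,
}

# A finite number of rules of combination, each with a learned effect on meaning.
RULES = {
    "not": lambda p: not p,
    "and": lambda p, q: p and q,
    "or": lambda p, q: p or q,
}


def meaning(tree):
    """Meaning of a whole = the rule's operation applied to the meanings of the parts."""
    if isinstance(tree, str):      # atomic item: look it up
        return LEXICON[tree]
    rule, *parts = tree            # complex item: (rule, part, part, ...)
    return RULES[rule](*(meaning(part) for part in parts))


def nest(tree, depth):
    """Build a never-before-seen sentence by wrapping `tree` in `depth` negations."""
    for _ in range(depth):
        tree = ("not", tree)
    return tree


# A novel sentence of arbitrary depth is still interpretable by the same finite means.
novel = ("and", "it rains", nest("it snows", 3))
print(meaning(novel))
```

Because `meaning` here is a computable function, the sketch covers only the favorable case; as the text notes, a function from parts to wholes that was not computable would not account for the magic.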
Considerations Against Semantic Compositionality
The linguistic semantics literature is rife with demonstrations of how some linguistic phenomenon can or cannot be given a compositional description. It often seems that these works would more accurately be described as demonstrating how a phenomenon can or cannot be given a compositional description employing some particular syntactic-semantic device or within some specific syntactic-semantic theory. There are, however, three more general arguments that have been presented against semantic compositionality.
The first is an argument from (nonlinguistic) context, of the sort mentioned above, where it is claimed that the meaning of a sentence in a context just cannot be derived from the meanings of the words and their combinations. In evaluating this argument, scholars need to distinguish between (what might be called) ‘literal meaning’ and ‘occasion meaning’. The former is thought of as the meaning-in-language, while the latter is thought of as the meaning-in-a-context. If there is such a distinction, then there will be two principles of semantic compositionality, one for each type of meaning. And it is not so clear that either of them is overturned by considerations of context. The only casualty would be a mixed principle that no one believes, i.e., that the occasion meaning of a complex expression is a mathematical function of the literal meanings of its parts and their manner of combination.
The second general argument against compositionality comes from the existence of synonymy and Mates-like (Mates, 1950) considerations. Given that there is synonymy, so that x1 and x2 mean the same, then there are two sentences, S1 and S2, that differ only in that one contains x1 while the other contains x2. Given compositionality, it follows that S1 and S2 are synonymous too; and by compositionality again, it follows that Mary believes S1 and Mary believes S2 are synonymous. But for any such S1 and S2, it can be the case that the former is true, while the latter is false. However, it cannot be the case that, of two synonymous sentences, one is true and the other false. Hence, either there is no synonymy or else compositionality is wrong. And the existence of synonymy is more secure than that of compositionality.
The third general argument comes from the existence of ambiguity. If compositionality implies that the meaning of a whole is a mathematical function of the meanings of its parts (and combination), then there cannot be any ambiguity of the sort where one and the same item has two or more meanings, for that would deny that it was a function that computed meaning. As with synonymy, one could of course deny the existence of ambiguity; but most theorists find that this is too lavish a position to take. So it is usually admitted by compositionalists that individual words can be ambiguous; therefore, sentences using these ambiguous words may also be ambiguous (but the ambiguities are always traceable to the ambiguity of the words). Also, it is pointed out that strings of words such as Visiting professors can be fun are ambiguous (is it the professors or the activity of visiting the professors that can be fun?), but this ambiguity is traceable to the fact that the words are put together in different ways; that is, there are different structural descriptions that can be associated with this string of words. Hence, this ambiguity is not a challenge to compositionality. However, Pelletier (1994/2004) points to a number of examples that seem neither to have ambiguous words nor to have different structural descriptions but which are nonetheless ambiguous. For example:
When Alice rode a bicycle, she went to school.
This seems to have but one syntactic analysis within any particular theory, but its meaning is ambiguous between the following two readings:
On those occasions where Alice rode a bicycle, she took it to school.
Back in the days when Alice was a bicyclist, she was a student.
Formal Considerations
There have been a number of works concerned with the question of whether compositionality is a nonempirical issue on the grounds of certain formal features that are required by compositionality. A review article that surveys this work is Westerståhl (1998). More recent work on the formal features of compositional semantics can be found in the important paper by Hodges (2001) and in material based on it.
History
Although the general principle of compositionality seems to have been around for some time, as mentioned earlier, it is not clear when the term ‘compositionality’ came into the linguistic semantics literature (unlike ‘holism,’ which was introduced by Smuts, 1926). ‘Compositionality’ is used by Katz (1973) and Thomason (1974).
See also: Context Principle; Holism, Semantic and Epistemic; Human Language Processing: Connectionist Models; Reductionism; Representation in Language and Mind; Systematicity.
Bibliography
Davidson D (1965). ‘Theories of meaning and learnable languages.’ In Bar-Hillel Y (ed.) Logic, methodology and philosophy of science. Amsterdam: North Holland. 383–394.
Dever J (in press). ‘Compositionality.’ In Lepore E & Smith B (eds.) Oxford handbook of the philosophy of language. Oxford: Oxford University Press.
Fodor J & Lepore E (1992). Holism: a shopper’s guide. Oxford: Blackwell.
Fodor J & Lepore E (2002). The compositionality papers. New York: Oxford University Press.
Frege G (1923/1963). ‘Compound thoughts.’ Stoothoff R (trans.). Mind 72, 1–17.
Hodges W (2001). ‘Formal features of compositionality.’ Journal of Logic, Language and Information 10, 7–28.
Janssen T (1997). ‘Compositionality.’ In van Benthem J & ter Meulen A (eds.) Handbook of logic and language. Amsterdam: Elsevier. 417–473.
Janssen T (2001). ‘Frege, contextuality and compositionality.’ Journal of Logic, Language and Information 10, 115–136.
Kamp H & Partee B (1995). ‘Prototype theory and compositionality.’ Cognition 57, 129–191.
Katz J (1973). ‘Compositionality, idiomaticity, and lexical substitution.’ In Anderson S & Kiparsky P (eds.) A festschrift for Morris Halle. New York: Holt, Rinehart, and Winston. 357–376.
Mates B (1950). ‘Synonymity.’ California University Publications in Philosophy 25. Reprinted in Linsky L (1952). Semantics and the philosophy of language. Urbana: University of Illinois Press. 111–136.
Pagin P (1997). ‘Is compositionality compatible with holism?’ Mind and Language 12, 11–23.
Partee B (1984). ‘Compositionality.’ In Landman F & Veltman F (eds.) Varieties of formal semantics. Dordrecht: Foris. 281–311.
Partee B (2003). Compositionality in formal semantics: selected papers by Barbara H. Partee. Oxford: Blackwell.
Pelletier F J (1994/2004). ‘The principle of semantic compositionality.’ Topoi 13, 11–24. [Reprinted with new appendices in Davis S & Gillon B (eds.) Semantics: a reader. New York: Oxford University Press. 133–158.]
Pelletier F J (2001). ‘Did Frege believe Frege’s principle?’ Journal of Logic, Language and Information 10, 87–114.
Smuts J (1926). Holism and evolution. London: Macmillan.
Szabo Z (2000). Problems of compositionality. New York: Garland.
Thomason R (1974). ‘Introduction.’ In Thomason R (ed.) Formal philosophy: selected papers of Richard Montague. New Haven: Yale University Press. 1–69.
Westerståhl D (1998). ‘On mathematical proofs of the vacuity of compositionality.’ Linguistics and Philosophy 21, 635–643.
Westerståhl D (2002). ‘Idioms and compositionality.’ In Barker-Plummer D, Beaver D, van Benthem J & Scotto di Luzio P (eds.) Words, proofs, and diagrams. Stanford: CSLI Publications. 241–271.
Compositionality: Semantic Aspects
G Sandu and P Salo, University of Helsinki, Helsinki, Finland
© 2006 Elsevier Ltd. All rights reserved.
According to the principle of compositionality, the meaning of a complex expression depends only on the meanings of its constituents and on the way these constituents have been put together. The kind of dependence involved here is usually a functional one.
Principle of Compositionality (PC): The meaning of a complex expression is a function of the meanings of its constituents and of the rule by which they were combined.
PC is rather vague unless one specifies the meanings of ‘is a function of’ and ‘meaning(s)’, something that is easier said than done. A more rigorous formulation of these notions is possible for formal languages and is due to Richard Montague. Montague (1974) defined compositionality as the requirement of the existence of a homomorphism between syntax and semantics, both to be understood as ‘structures’ in the mathematical sense. To keep technicalities down to a minimum, Montague’s requirement of a compositional interpretation was that for each syntactic operation ‘Oi’ that applies to n expressions e1, ..., en in order to form the complex expression ‘Oi(e1, ..., en)’, the interpretation of the complex expression ‘Oi(e1, ..., en)’ is the result of the application of the semantic operation ‘Ci’, which is the interpretation of ‘Oi’, to the interpretations m1, ..., mn of ‘e1’, ..., ‘en’, respectively. In other
words, the interpretation of ‘Oi(e1, ..., en)’ is Ci(m1, ..., mn).
An immediate consequence of PC is the ‘Substitutivity Condition’: Substituting a constituent with its synonym in a given expression does not change the meaning of the resulting expression. Thus, PC is violated if a complex expression has meaning but some of its component expressions do not (the Domain Condition) or if the Substitutivity Condition fails.
As one can see, PC is by itself rather weak, and so it comes as no surprise that in the case of formal languages, one can always devise a trivial compositional interpretation by assigning some arbitrary entities to the primitive expressions of the language and then associating arbitrarily the syntactic operations of the language with corresponding operations on the domain of those entities. This way of implementing the principle can hardly be of any interest, although it has led some philosophers and logicians to claim that PC is methodologically empty.
A slightly more interesting case is the one in which one has an intended semantic interpretation in mind, that is, an interpretation with an intended domain of entities for the primitive expressions of the language to be mapped into, and a class of intended operations to serve as the appropriate interpretations of the syntactic operations of the language. A case in point is Horwich’s (1998) interpretation. His formal language was intended to serve as a regimentation for a fragment of English that contains proper names (‘John,’ ‘Peter,’ etc.), common nouns (‘dogs,’ ‘cows,’ etc.), and verb phrases (‘talks,’ ‘walks,’ ‘bark,’ etc.) as primitive expressions together with grammatical operations on them. For simplicity, let us assume predication is such a grammatical operation marked in this case by an empty space. Thus the syntax contains clauses of the form:
If ‘n’ is a proper name and ‘v’ is a verb phrase, then ‘n v’ is a complex expression.
The intended semantic interpretation consists of a domain of entities that serve as the intended meanings of the proper names and verb phrases (whatever they are; they are marked by capitals), together with an operation, say P, that interprets the grammatical operation of predication (whatever that is). The only thing one needs to worry about in this case is to see to it that the operation of predication is defined for the entities mentioned above. The relevant semantic clauses now have this form:
The interpretation of ‘n v’ is the result of the application of P to the entities assigned to ‘n’ and ‘v’, respectively.
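The syntactic and semantic clauses above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the class `Pred`, the table `ENTITIES`, the choice of pairing as the operation `P`, and the extra verb ‘speaks’ (stipulated here to be a synonym of ‘talks’) are all hypothetical and are not taken from Horwich or Montague. The sketch shows the homomorphism requirement, namely that the interpretation of a complex expression is the semantic operation applied to the interpretations of its parts, and the Substitutivity Condition that follows from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pred:
    """Syntactic operation of predication: the complex expression 'n v'."""
    name: str
    verb: str


# Intended meanings of the primitive expressions, marked by capitals.
# 'speaks' is a hypothetical synonym of 'talks', added only for this sketch.
ENTITIES = {
    "John": "JOHN",
    "Peter": "PETER",
    "talks": "TALKS",
    "speaks": "TALKS",
    "walks": "WALKS",
}


def P(name_meaning, verb_meaning):
    """Semantic operation interpreting predication.

    Pairing is an arbitrary stand-in: any operation defined on these
    entities would make the interpretation compositional.
    """
    return (name_meaning, verb_meaning)


def interpret(expr):
    """Interpretation of 'n v' = P applied to the interpretations of 'n' and 'v'."""
    if isinstance(expr, str):  # primitive expression
        return ENTITIES[expr]
    return P(interpret(expr.name), interpret(expr.verb))


print(interpret(Pred("John", "talks")))
# Substitutivity: replacing a constituent by a synonym leaves the meaning unchanged.
print(interpret(Pred("John", "talks")) == interpret(Pred("John", "speaks")))
```

Since `interpret` is defined by recursion on the syntax, compositionality holds by construction, which is the sense in which the text calls such an interpretation trivially compositional.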
Thus, the interpretation of the sentence ‘John talks’ is the result of the application of P to JOHN and TALKS. This interpretation is trivially compositional in that the interpretation of every compound ‘n v’ has been defined as the result of the application of the operation assigned to the syntactic operation of concatenation to the interpretations of ‘n’ and ‘v’, respectively.
The more challenging cases for PC are those in which one has an intended interpretation for the complex expressions and would like to find a compositional interpretation that agrees with it. In contrast to the previous case, the meanings of the complex expressions are no longer defined but are given at the outset. We have here a typical combination of PC with the Context Principle (CP): An expression has a meaning only in the context in which it occurs. The combination was largely explored in the work of Gottlob Frege and in Donald Davidson’s theory of meaning, which assumed the form of a theory of truth. Davidson took whole sentences to be the meaning-carrying units in language, and truth to be a primitive, undefinable semantic property that is best understood. Truth being undefinable, the strategy applied above, which ensured a trivial implementation of PC, is no longer available. Instead, PC acquires the status of a methodological constraint on an empirical theory of truth for the target language: the division of a sentence into parts and their association with appropriate semantic entities in a compositional theory becomes a theoretical business that has no other role except to show how they contribute to the computation of the truth of the sentences of the target language in which they occur.
The literature on formal semantics for natural language has plenty of cases of the application of the Context Principle. We consider just two examples. In game-theoretical semantics (GTS), one starts with a standard first-order language and defines truth only for the sentences of that language.
The truth of every such sentence (in a prenex normal form) is defined via a second-order sentence, known as its Skolem form. This interpretation is clearly not compositional, since it violates the Domain Condition. One can now ask whether there is a compositional interpretation that agrees with the given game-theoretical interpretation of sentences. It is known that the answer is positive, but only assuming certain nontrivial mathematical principles (the Axiom of Choice). The second example concerns Dynamic Predicate Logic. The starting point is the same language as in GTS, that is, a standard first-order language, but we now want a compositional interpretation in which, e.g., an existential quantifier occurring in the antecedent of a conditional binds a free variable occurring in the consequent of the conditional and in addition has the force of a universal quantifier.
There is a compositional interpretation that has the required property, that of Dynamic Predicate Logic (Groenendijk and Stokhof, 1991).
From a technical point of view, the situation described in the two examples may be depicted as an extension problem (Hodges, 1998). One starts with an intended interpretation I, which either (a) fixes only the interpretation of certain complex expressions (e.g., sentences) or (b) puts some relational constraints on the interpretation of complex expressions. One then wants to find a compositional interpretation I′ that agrees with the independently understood interpretation I. Hodges’s Extension Theorem solves case (a). It shows that any partial interpretation for a grammar can be extended to a total compositional interpretation. This shows that the combination of PC with CP (in its form [a]) is trivially satisfiable.
The more interesting cases are those falling under (b). This is the situation that typically arises in the case of empirical linguistics where the intended interpretation is supposed to be motivated by empirical argument. As an illustration, consider the much-discussed ‘pet fish’ problem. There is some empirical evidence to the effect that the meanings of concept words are prototypes. A prototype is either a good exemplar of the category or a statistical average of all or some instances of the category (Smith and Medin, 1981). A given instance x is then categorized as X if x resembles the prototype of X more than any other prototype. Given two expressions X (e.g., ‘pet’) and Y (‘fish’), one asks whether there is an interpretation that assigns to the complex concept word XY (‘pet fish’) a prototype that is the composition of the prototype assigned to X and the prototype assigned to Y. One also wants the meaning function to satisfy certain basic properties that are required for explanatory purposes; e.g., it should be the case that if x is XY, it must also be X and Y.
We thus want every x to resemble the prototype of XY no less than it resembles the prototypes of X and Y. It has been argued that there is no such interpretation, that is, there is no operation of composition that yields a prototype as the interpretation of XY with the desired properties when applied to the two prototypes that are the interpretations of X and Y respectively (Fodor, 1998; Osherson and Smith, 1981). The moral to be drawn from all this should have been somehow anticipated from our discussion of formal languages. When the intended interpretation puts constraints only on the meanings of primitive expressions and on the operations governing them, PC follows rather trivially, provided the semantic entities of complex expressions are not constrained in any way.
When the intended interpretation concerns only the meanings of complex expressions, Hodges’s extension theorem shows that a compositional semantics can still be found, at least in some cases, provided that one does not constrain the meanings of the primitive expressions or syntactical operations on them. In natural language, however, the situation is hardly so simple, as one meets constraints at every level. It is no wonder, then, that Fodor and Lepore (2002) argued that most theories of concepts or mental architecture in cognitive science are in contradiction with PC. The case of prototype semantics was only one example, but the same considerations apply to the theory that the meaning of a word is its use or the criteria for its application, etc. PC is often defended as the best explanation of the empirical phenomenon of systematicity: Any competent speaker of a given language who has in his repertoire the complex expressions P, R, and Q has also in his repertoire the complex expressions in which P, R, and Q are permuted (provided they are grammatical). For instance, anybody who understands the sentence ‘Mary loves John’ also understands the sentence ‘John loves Mary’. Fodor and his collaborators argued extensively that PC is the best explanation of the systematicity of language, but this is an issue that will not be tackled here (cf. Fodor and Pylyshyn, 1988; Fodor, 2001; Fodor and Lepore, 2002; Fodor, 2003; Aizawa, 2002). PC should not be confused with the principles of productivity or generativity of language, which require that the expressions of a language be generated from a finite set of basic expressions and syntactical rules. Although it presupposes that the language under interpretation has a certain syntactic structure, PC does not take a stand on how that structure should be specified (phrase structure rules, derivational histories, etc.), as long as it is given a compositional interpretation. 
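To make the ‘pet fish’ problem discussed above more concrete, here is a minimal sketch, assuming that prototypes are points in a small feature space and that resemblance is measured by Euclidean distance. The features, the numerical values, the two instances, and the choice of feature-wise averaging as the candidate composition operation are all invented for illustration; the sketch shows one naive operation going wrong and does not establish that no operation could succeed.

```python
import math

# Feature order (invented for this sketch): furry, aquatic, kept at home, small.
PROTOTYPES = {
    "pet": (0.9, 0.0, 1.0, 0.3),   # roughly dog- or cat-like
    "fish": (0.0, 1.0, 0.0, 0.3),  # roughly trout-like
}


def distance(a, b):
    """Euclidean distance; a smaller distance means greater resemblance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def compose_by_averaging(p, q):
    """One naive candidate for composing two prototypes: the feature-wise mean."""
    return tuple((x + y) / 2 for x, y in zip(p, q))


# Candidate prototype for the complex concept word 'pet fish'.
pet_fish = compose_by_averaging(PROTOTYPES["pet"], PROTOTYPES["fish"])

guppy = (0.0, 1.0, 1.0, 1.0)  # hypothetical typical pet fish
otter = (0.5, 0.5, 0.5, 0.3)  # hypothetical furry, semi-aquatic animal; not a fish

for label, instance in [("guppy", guppy), ("otter", otter)]:
    print(label, round(distance(instance, pet_fish), 3))
```

Under these assumptions the otter-like instance lies closer to the averaged prototype than the guppy does, so the averaged prototype fails to agree with the intended interpretation of ‘pet fish’. This is the kind of mismatch that arises when constraints are imposed both on the meanings of the primitive expressions and on the meanings of the complex ones.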
See also: Compositionality: Philosophical Aspects; Context Principle; Game-Theoretical Semantics; Montague, Richard (1931–1971); Prototype Semantics; Systematicity.
Bibliography
Aizawa K (2002). The systematicity argument. Amsterdam: Kluwer.
Bloom P (1994). ‘Generativity within language and other domains.’ Cognition 51(2), 177–189.
Chomsky N (1957). Syntactic structures. The Hague: Mouton.
Fodor J A (1998). Concepts: where cognitive science went wrong. Oxford: Clarendon Press.
Fodor J A (2001). ‘Language, thought and compositionality.’ Mind and Language 16(1), 1–15.
Fodor J A (2003). Hume variations. Oxford: Oxford University Press.
Fodor J A & Lepore E (2002). The compositionality papers. Oxford: Clarendon Press.
Fodor J A & Pylyshyn Z (1988). ‘Connectionism and cognitive architecture: a critical analysis.’ Cognition 28, 3–71.
Groenendijk J & Stokhof M (1991). ‘Dynamic predicate logic.’ Linguistics and Philosophy 14, 39–100.
Hintikka J & Kulas J (1983). The game of language. Dordrecht: Reidel.
Hodges W (1998). ‘Compositionality is not the problem.’ Logic and Philosophy 6, 7–33.
Horwich P (1998). Meaning. Oxford: Clarendon Press.
Janssen T M V (1997). ‘Compositionality.’ In van Benthem J & ter Meulen A (eds.) Handbook of logic and language. Amsterdam: Elsevier. 417–473.
McLaughlin B (1993). ‘The classicism/connectionism battle to win souls.’ Philosophical Studies 70, 45–72.
Montague R (1974). Formal philosophy: selected papers of Richard Montague. New Haven: Yale University Press.
Osherson D N & Smith E E (1981). ‘On the adequacy of prototype theory as a theory of concepts.’ Cognition 9, 35–58.
Pelletier F J (1994). ‘The principle of semantic compositionality.’ Topoi 13, 11–24.
Rips L J (1995). ‘The current status of research on concept combination.’ Mind and Language 10(1/2), 72–104.
Smith E E & Medin D L (1981). Categories and concepts. Cambridge: Harvard University Press.
Smolensky P (1987). ‘The constituent structure of mental states: a reply to Fodor and Pylyshyn.’ Southern Journal of Philosophy 26, 137–160.
Zadrozny W (1994). ‘From compositional to systematic semantics.’ Linguistics and Philosophy 17, 329–342.
Compound
L Bauer, Victoria University of Wellington, Wellington, New Zealand
© 2006 Elsevier Ltd. All rights reserved.
Definitions
A compound is usually defined (somewhat paradoxically) as a word that is made up of two other words. This basic definition requires a certain amount of modification, some of it for all languages, some of it for specific languages. For example, there may be more than two ‘words’ involved in the formation of a compound, though there must be at least two. ‘Word’ is to be understood in this definition as meaning ‘lexeme.’ The implication of this is that the forms in which the individual subwords appear may be differently defined in different languages: a citation form in one, a stem in another, a specific compounding form in yet a third, a word form in a fourth. Even this leaves room for a certain amount of disagreement about what a compound is in particular languages. Perhaps the rider should be added that the construction created by the two or more lexemes should not be a normal noncompound phrasal structure of the language: ‘well done’ and ‘in time’ are not compounds.
There appear to be two fundamental approaches to the nature of a compound. The first sees a compound as a particular construction type, an entity with a formal definition. The second views a compound as a lexical unit meeting certain criteria. Because the two overlap to a large extent, it may not be clear in which of these two senses the term is being used. For example, under either approach ‘blackbird’, ‘windmill’, and ‘combination lock’ would count as compounds of English, as would ‘sky-blue’, ‘onto’, and ‘freeze-dry’. But the things we find in everyday headlines (such as ‘PM backs mercy killings bill’, where the relevant unit is ‘mercy killings bill’) are not included as compounds by those who view compounds as lexical units on the grounds that they arise through the productive use of syntactic rules, but are included as compounds by those who view compounds as a construction type. The argument in favor of the latter view would be that the major distinction between the lexical-item compounds and the headline compounds is frequency of occurrence and that there is no formal distinction to be drawn between the two groups.
Although this leaves us in the unfortunate situation of not necessarily being able to recognize compounds in a given language, there are a number of criteria that are generally accepted as correlating with compound status, at least to a certain degree. These will be considered immediately below.
Orthographic Criteria
Although orthographic criteria cannot be robust, given the number of languages that still are not written or, if written, have had an orthography imposed by nonnative speakers of the language, or, if neither of these, may not indicate word breaks in the orthography at all, nevertheless they are taken as a powerful indicator in those languages for which they are
Compound 719 Fodor J A (2001). ‘Language, thought and compositionality.’ Mind and Language 16(1), 1–15. Fodor J A (2003). Hume variations. Oxford: Oxford University Press. Fodor J A & Lepore E (2002). The compositionality papers. Oxford: Clarendon Press. Fodor J A & Pylyshyn Z (1988). ‘Connectionism and cognitive architecture: a critical analysis.’ Cognition 28, 3–71. Groenendijk J & Stokhoff M (1991). ‘Dynamic predicate logic.’ Linguistics and Philosophy 14, 39–100. Hintikka J & Kulas J (1983). The game of language. Dordrecht: Reidel. Hodges W (1998). ‘Compositionality is not the problem.’ Logic and Philosophy 6, 7–33. Horwich P (1998). Meaning. Oxford: Clarendon Press. Janssen T M V (1997). ‘Compositionality.’ In van Benthem J & Meulen A T (eds.) Handbook of logic and language. Amsterdam: Elsevier. 417–473.
McLaughlin B (1993). ‘The classicism/connectionism battle to win souls.’ Philosophical Studies 70, 45–72. Montague R (1974). Formal philosophy: selected papers of Richard Montague. New Haven: Yale University Press. Oshershon D N & Smith E E (1981). ‘On the adequacy of prototype theory as a theory of concepts.’ Cognition 9, 35–58. Pelletier F J (1994). ‘The principle of semantic compositionality.’ Topoi 13, 11–24. Rips L J (1995). ‘The current status of research on concept combination.’ Mind and Language 10(1/2), 72–104. Smith E E & Medin D L (1981). Categories and concepts. Cambridge: Harvard University Press. Smolensky P (1987). ‘The constituent structure of mental states: a reply to Fodor and Pylyshyn.’ Southern Journal of Philosophy 26, 137–160. Zadrozny W (1994). ‘From compositional to systematic semantics.’ Linguistics and Philosophy 17, 329–342.
Compound L Bauer, Victoria University of Wellington, Wellington, New Zealand © 2006 Elsevier Ltd. All rights reserved.
Definitions A compound is usually defined (somewhat paradoxically) as a word that is made up of two other words. This basic definition requires a certain amount of modification, some of it for all languages, some of it for specific languages. For example, there may be more than two ‘words’ involved in the formation of a compound, though there must be at least two. Word is to be understood in this definition as meaning ‘lexeme.’ The implication of this is that the forms in which the individual subwords appear may be differently defined in different languages: a citation form in one, a stem in another, a specific compounding form in yet a third, a word form in a fourth. Even this leaves room for a certain amount of disagreement about what a compound is in particular languages. Perhaps the rider should be added that the construction created by the two or more lexemes should not be a normal noncompound phrasal structure of the language: well done and in time are not compounds. There appear to be two fundamental approaches to the nature of a compound. The first sees a compound as a particular construction type, an entity with a formal definition. The second views a compound as a lexical unit meeting certain criteria. Because the two overlap to a large extent, it may not be clear in which
of these two senses the term is being used. For example, under either approach blackbird, windmill, and combination lock would count as compounds of English, as would sky-blue, onto, and freeze-dry. But the things we find in everyday headlines (such as PM backs mercy killings bill, where the relevant unit is mercy killings bill) are not included as compounds by those who view compounds as lexical units on the grounds that they arise through the productive use of syntactic rules, but are included as compounds by those who view compounds as a construction type. The argument in favor of the latter view would be that the major distinction between the lexical-item compounds and the headline compounds is frequency of occurrence and that there is no formal distinction to be drawn between the two groups. Although this leaves us in the unfortunate situation of not necessarily being able to recognize compounds in a given language, there are a number of criteria that are generally accepted as correlating with compound status, at least to a certain degree. These will be considered immediately below.
Orthographic Criteria Although orthographic criteria cannot be robust, given the number of languages that still are not written or, if written, have had an orthography imposed by nonnative speakers of the language, or, if neither of these, may not indicate word breaks in the orthography at all, nevertheless they are taken as a powerful indicator in those languages for which they are
relevant. Compounds are assumed to show their status by being written as single words. This type of criterion is employed linguistically especially by corpus linguists, who may have no other way of isolating compounds. For such people, it is clearly a less than ideal default. Unfortunately, it is taken as a serious definition by some outside corpus linguistics. There are a number of problems with such a criterion. First, in a language like English, there is so much variation in the writing of two-word lexical items that even standard dictionaries (supposed arbiters of the prescribed norm) are unable to agree. Rainforest, rain-forest, and rain forest are all easily attestable, for example. It might be argued, however, that such variation merely shows the progress of an item like rain forest from syntactic sequence to lexical item. Unfortunately, such drift is not easily correlated with the relative age of dictionaries showing a particular orthography. More serious, though, are items like a New York–Los Angeles flight, which, on strict orthographic criteria, give compounds such as York–Los. We should also note that orthographic conventions can change, as where recent orthographic changes have made the writing of noun + noun sequences as two words a rather more common event in Danish than it used to be.
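The weakness of a strictly orthographic criterion can be made concrete with a small sketch. It assumes a hypothetical helper, orthographic_candidates, that does what a naive corpus search might do: split the text on white space and treat any token with an internal hyphen or en dash as a candidate compound. Neither the helper nor its punctuation handling comes from the literature discussed here; they are illustrative assumptions only.

```python
# Hypothetical helper (an assumption of this sketch): a purely orthographic
# test that flags whitespace-delimited tokens joined by a hyphen or en dash.
JOINERS = "-\u2013"  # hyphen and en dash

def orthographic_candidates(text):
    # Split on white space and drop surrounding punctuation from each token.
    tokens = [t.strip(".,;:!?") for t in text.split()]
    # Keep tokens that contain a joiner somewhere inside the token.
    return [t for t in tokens if any(j in t[1:-1] for j in JOINERS)]

print(orthographic_candidates("a New York\u2013Los Angeles flight"))
print(orthographic_candidates("a rain-forest, a rainforest and a rain forest"))
```

On the first phrase the only candidate returned is York–Los, which is not a unit of the language at all. On the second, only the hyphenated spelling is flagged, so the three attested spellings of the same item receive different treatment.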
Phonological Criteria There are a number of phonological ways in which status as single words may be indicated. These may be segmental or suprasegmental. In English, for example, stress is sometimes taken to be criterial, distinguishing most clearly between examples like blackbird and black bird. Where noun + noun sequences are concerned, stress is less consistent in English, church-warden, for example, being reported with various stress patterns. And for examples like sky-blue, stress as the sole criterion would suggest that they are compounds when used attributively (a sky-blue dress) and not when used predicatively (her dress was sky-blue). In Danish, the stød (or glottalization, corresponding to the use of tones in other Scandinavian varieties) is generally lost in the first element of a compound (Basbøll, 1985). In Japanese, there is a process known as rendaku, whereby the initial consonant of the second element of a certain class of compounds becomes voiced, as in the examples in (1).
(1) iro ‘color’   kami ‘paper’   → irogami ‘colored paper’
e ‘picture’   tako ‘kite’   → edako ‘picture kite’
ike ‘arrange’   hana ‘flower’   → ikebana ‘flower arranging’
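The voicing alternation in (1) can be sketched as a simple rewrite of the first consonant of the second element. The sketch below is illustrative only: it works on a romanized transcription, its voicing map (the name VOICED is hypothetical) covers just a few consonants, and it applies the change unconditionally, whereas the process affects only a certain class of compounds.

```python
# Simplified voicing map over a romanized transcription (an assumption of
# this sketch; it is not a full description of the Japanese consonant system).
VOICED = {"k": "g", "t": "d", "h": "b", "s": "z"}

def rendaku(first, second):
    """Join two elements, voicing the initial consonant of the second."""
    initial = second[:1]
    return first + VOICED.get(initial, initial) + second[1:]

for first, second in [("iro", "kami"), ("e", "tako"), ("ike", "hana")]:
    print(first, "+", second, "->", rendaku(first, second))
```

Run on the three pairs in (1), the function yields irogami, edako, and ikebana. A second element that begins with a vowel or with a consonant outside the map is simply left unchanged.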
In Sengseng, geminate consonants are permitted within a word, but are always broken up by an epenthetic vowel if they occur adjacent at word boundaries.
Morphological Criteria It is sometimes claimed that since compounds are single lexemes, the only inflection allowed in them must be the inflection that allows that single word’s functioning in the sentence as a whole to be seen. Accordingly, internal words (words acting as modifiers within the compound) are said to disallow inflections. There are instances in which this appears to be true. In peninsular Scandinavian, for example, where definiteness is marked by a suffix, no marking for definiteness can occur on the modifying element of a compound. There are other instances in which this rule appears to hold most of the time, but not all of the time. In English, for example, plural is not usually marked on the modifying elements in compounds so that even a noun like trousers, which rarely appears in the singular, has no plural marking in the compound trouserpress. However, there is an apparently increasing set of items in which internal plural marking is found: official secrets act, suggestions box, weapons inspector. The example of mercy killings bill cited earlier is an instance. Although we might want to argue that plural in English is not entirely inflectional, we appear to have sporadic counterexamples here. Then there are instances in which this rule clearly does not hold. Sometimes these are sporadic, as in Danish nytår (new.NEUT-year), where a clearly inflectional neuter agreement occurs contrary to usual practice. In other languages, though, we may find systematic violations of the rule. Consider, for instance, the Finnish examples in (2) (from Sulkala and Karjalainen, 1992) or the Sanskrit examples in (3) (from Whitney, 1889).
(2) auto-n-ikkuna
car-GEN-window
‘car window’
maa-lta-pako
country-ABL-flee.NOM
‘rural depopulation’
(3) bhayaṁkartṛ
fear.ACC.causer
‘causer of fear’
divikṣít
sky.LOC.dwelling
‘dwelling in the sky’
Syntactic Criteria Syntactic criteria are attempts to find ways of indicating that the compound is being treated in the syntax as a single unit and not as a sequence of two distinct units. This usually means that anaphora cannot pick out the modifying element in a compound, but can in a syntactic phrase. Thus, for English, it is noted that we can say I thought this house had aluminum windows, not wooden ones (where ones refers to windows, not aluminum windows, and thus shows aluminum windows to be made up of two elements), and also The aluminum windows look good but I’m afraid that it may not be strong enough (where it, in referring to the aluminum rather than to the whole compound, again shows that the two words are separate). In parallel instances with lexical compounds such as combination lock, the claim would be that I wanted a combination lock but they only had Yale ones and I installed a combination lock and now I can’t remember it (where it refers to the combination) are not possible, indicating the unity of the sequence. Because these tests appear to be robust some of the time, they are hard to argue against, but we can make some comments here. First, intuitions are not always as secure as the tests seem to predict; second, speakers do produce the ‘impossible’ constructions on occasion, as is shown in (4); third, there are constraints on these uses that have not been fully explored; and fourth, because the tests deal with the degree to which the constituent words of a compound can be recognized, they are again tests of lexicalization. (4) Here he found that the Greatest Story Ever Told had stopped for a tea-break. Dorothy Horncastle was dispensing mugs of the stuff from a large copper urn . . . (Reginald Hill, Bones and Silence (London: Collins, 1990: 275)) I want to give myself a headache by banging it on the floor! (attested)
Semantic Criteria Semantic criteria sometimes invoked to indicate that something has become a compound are fundamentally indications of lexicalization. A celebrated example concerns the pair push-chair and wheel-chair. Although it is possible to push a wheel-chair and a push-chair (‘baby buggy’) has wheels, the two are uniquely identified by their labels that, therefore, imply a lot more than they state. This specialization of meaning is sometimes taken to indicate compound status. There are two possible counterarguments here. The first is that lexicalization (or idiomatization) is
something that affects not only noun + noun sequences such as wheel-chair, but also syntactic sequences such as a red herring or how do you do? If how do you do? is still a sentence, so wheel-chair should have the same status before and after lexicalization. The second is that meaning specialization is not something that comes only with frequent use (and gradual movement from syntax to the lexicon) but something that comes with first use. Downing (1977) provides us with the celebrated attested example of apple-juice seat. In the abstract this could mean a number of things, including the seat in which you have to sit to win some apple juice, the seat on which apple juice has been spilled, the seat on which a can of apple juice has been placed, the seat on which I usually sit when I drink apple juice, and so on. The attested meaning of ‘seat with a glass of apple juice placed before it’ is already a specialization from the large number of potential meanings that construction could have had. That being the case, we could argue that any relevant sequence that has actually been used has already had a specialized meaning. All that may remain in question is the degree of specialization.
The Universality of Compounds Because of the two types of definition of compound referred to earlier, it is not clear whether or not all languages have compounds. Claims that they have can be found in the literature; so can claims in grammars of individual languages that compounds are not found in that language. The two need not be incompatible if they depend on different types of definition. It may be, for example, that compounds viewed as a construction type are universal, but compounds viewed as lexical entities are not. Because the problem has not been recognized in the literature, it is impossible to be sure. There is also the problem that in individual languages, things may be called compounds that would not normally be so termed in other languages. For example, Glinert (1989) describes as compounds of Hebrew, things that sound like blends (q.v.) to the reader more used to languages such as English. Discussions of French sometimes refer to units such as chemin de fer (‘way of iron’ = ‘railway’) or pomme de terre (‘apple of earth’ = ‘potato’) as compounds. It is clear that these items are lexical items (listemes in the terminology of Di Sciullo and Williams, 1987). The use of the label compound may thus be intended to indicate the lexicalized nature of such constructions (or at least, of some such constructions) and their near-syntactic productivity rather than be intended to attribute any particular structure to them. The use of the label compound in such
instances certainly adds to the confusion surrounding the whole issue. Perhaps just as confusing is the way in which noun incorporation is sometimes not included under compounding. Incorporation is an important enough and theoretically interesting enough method of word formation to demand separate discussion, as is done in this encyclopedia (see Incorporation). But since both compounding and incorporation involve the close binding together of two stems into a new morphosyntactic element, there is also a sense in which the two need to be viewed as related processes. Neoclassical compounds, also dealt with separately here (see Neoclassical Compounding), have many of the features of compounds, but are compounded according to borrowed patterns rather than native ones.
The Semantics of Compounds Speakers of European languages, at least, seem to view compounds made up of two nouns as the prototypical type of compound (although there are languages that appear to prefer verb–verb compounds). A noun–noun compound such as rain–cloud is an ideal construction for providing a subcategorization. The element cloud (the head element; see below) tells us what kind of entity we are dealing with, and the modifying element (here, rain) tells us something about the subtype the compound denotes. Similarly, an adjective–noun compound such as blackbird provides not so much a description of the bird (female blackbirds are brown) as a label for a subtype of bird. What is perhaps strange in English is that this subcategorizational use of adjectives need not be restricted to compounds: neither a red squirrel nor red wine is prototypically red, yet their labels contrast with gray squirrel and white wine, respectively, and thus show a subcategorization of precisely the same type that is found in compounds. Much is made in the literature of the superficial ambiguity of noun–noun compounds. Although a hayfever pill may be intended to relieve hayfever, a sleeping pill is intended to induce sleep, a sugar pill neither relieves nor induces sugar but contains it, and a morning-after pill provides a fourth logical link between the elements of the compound. Several approaches to the descriptive problem posed by such apparent ambiguity have been taken in the literature. In no particular order, these include the following:
i. relating the various logical links to the meanings of prepositions available in the language or inflectional cases available in the language (so ‘pill against hayfever,’ ‘pill for sleeping,’ and so on) and assuming that the compound arises through the deletion of such marking;
ii. relating the various logical links to the syntactic role the elements might play in sentences glossing the link (where ‘the pill relieves hayfever’ and ‘the pill induces sleeping’ would both be of the same subject–predicate type, opposed to ‘the pill is taken on the morning after,’ which would be a subject–adverbial type);
iii. relating the various logical links to specific predicates that are assumed to be deleted in the course of the syntactic derivation of the compound structure (for example, the actual lexemes RELIEVE and INDUCE might be considered to be present at some underlying level of analysis, but not at the surface);
iv. relating the various logical links to a limited set of semantically basic predicates that are deleted in the process of derivation (this solution is similar to the last, but assumes some set of universal Aristotelian categories rather than language-specific lexemes);
v. some mixture of the above.
Quite apart from the theoretical problems that beset all of these approaches, in the final analysis all of these suggestions fall foul of the fact that there are some compounds that are remarkably resistant to any of these classifications. For example, spaghetti western requires some lengthy paraphrase (‘western made in a country that can be characterized by the amount of spaghetti that is consumed there’), and yet is not unique: goulash communism seems to reflect precisely the same relationship and the relationship underlying milk tooth remains obscure. A preferable solution may be to see the relationship between the elements not as an ambiguity but as a vagueness and to deny that the specific links between the elements of compounds are strictly grammatical at all. Rather, the specificity that speakers read into the meanings of these compounds can be seen as the result of the lexicalization process (q.v.), starting with the context of first use and becoming more specific with further use. Even this solution has problems associated with it, however. Though the ambiguity or vagueness that has been discussed here is found with one type of compound, there are other types that do not show the same variable meaning relationship. Some of these other types will be discussed below.
Compounds and Headedness In the last section, the point was made that compounds provide a suitable structure for reflecting
subcategorization. Compounds like rain–cloud show a modifier–head structure, with the head denoting the superordinate of the thing denoted by the compound, and the modifying element denoting the important feature for subclassification. Compounds of this type denote hyponyms of their head elements. Whereas the headedness can be defined in semantic terms like this, it is typically the case that headedness can be used to predict rather more about the structure of the compound. First, we can note that compounds of this type all seem to have a binary structure. Even very complex compounds of the headline type can be broken down into a number of binary compounds embedded within each other. At each division we can distinguish a modifier and a head. Second, we note that in languages that have grammatical gender or different inflectional classes, the head of a compound of this type determines the inflectional class for the compound as a whole. Given a German (Standard German) compound like Zeitgeist in which the modifying element (Zeit ‘time’) is grammatically feminine and the head element (Geist ‘spirit’) is grammatically masculine, we can tell that the compound will be grammatically masculine and will make its plural in the same way that Geist makes its plural. Though there are some apparent counterexamples to this generalization (highlight has the past tense form highlighted and never *highlit, despite the inflectional class of light), these occur in very special circumstances (here, for example, the verb highlight is derived from the noun highlight; it is not a compound verb created by the joining of high and light; compare also grandstanded). Also, the head element tends to carry the inflections for the word as a whole, not the modifying element. There has been some speculation in the literature as to the regularity of the order of modifier and head in compounds of individual languages. 
There does not appear to be necessary consistency, with languages like Vietnamese and French showing both orders (consider French homme-grenouille ‘man frog’ = ‘frogman’ as opposed to chauve-souris ‘bald mouse’ = ‘bat’). Even English, which is largely right-headed, does not appear to be exclusively so, as is shown by isolated examples such as whomever (where the inflections for the word as a whole are carried on the leftmost element) and Model T (which denotes a type of model, not a type of T). This lack of necessary consistency can make it difficult to determine what is treated as the head element in compounds such as Alsace–Lorraine, where the meaning is the addition of the two elements rather than a hyponym of one of the elements. Compounds like redwood, which denote not a type of wood but a tree that has red wood, are sometimes
said not to have a head. Although it is true that redwood is not a hyponym of wood, but of an unexpressed element tree, it is nonetheless the case that red is the modifier of wood and that wood is the element that carries the inflections for the word as a whole. Thus, redwood may be seen as a headed structure, just as much as rain–cloud. Similar arguments seem to hold with examples such as breath-taking, chimneysweep, trouble-free where there may be problems in applying the hyponymy criterion in a straightforward way.
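The way in which the head fixes the grammatical properties of a right-headed compound such as Zeitgeist can be sketched as follows. The Noun record and the compound function are hypothetical names introduced for illustration, the plural forms Zeiten and Geister are supplied here rather than taken from the discussion above, and the sketch ignores any linking material between the elements as well as the left-headed and headless types just mentioned.

```python
from dataclasses import dataclass

@dataclass
class Noun:
    form: str
    gender: str   # "m", "f" or "n"
    plural: str

def compound(modifier, head):
    """Build a right-headed compound: gender and plural come from the head."""
    return Noun(
        form=modifier.form + head.form.lower(),
        gender=head.gender,                           # inherited from the head
        plural=modifier.form + head.plural.lower(),   # the head carries the inflection
    )

zeit = Noun("Zeit", "f", "Zeiten")
geist = Noun("Geist", "m", "Geister")
zeitgeist = compound(zeit, geist)
print(zeitgeist.form, zeitgeist.gender, zeitgeist.plural)
```

The feminine gender of the modifier plays no part in the result: the compound is masculine and forms its plural as Geist does.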
Classifying Compounds Compounds have been classified in a number of ways, none of which appears to be totally satisfactory. The oldest classification is that provided by the Sanskrit grammarians. This provides four fundamental classes of compound, a classification that continues to be used, in whole or in part, today. Tatpurusa compounds are the type in which there is a clear modifier–head structure, as discussed above. These are sometimes termed ‘determinative’ compounds. In rain-cloud, rain determines what kind of cloud is denoted. The Sanskrit grammarians give kharmadharaya compounds as a subtype of tatpurusa compounds. In more modern times, the kharmadharaya compounds have been divided into two distinct groups. In the first, we have adjective–noun compounds like blackbird. The second type seems very different: it is the compound made up of two elements, each of which refers independently to some facet of the thing denoted by the compound as a whole: man-servant, secretary-treasurer. This latter type is sometimes confused with the next category, dvandva compounds. Dvandvas denote an entity that is the sum of the entities in the two elements: Alsace–Lorraine is an example of this type; in many other languages we find dvandvas meaning ‘parents’ made up of the words for mother and father (for example, in Kashmiri, Marathi, and Tamil). The third main Sanskrit type of compound is the bahuvrihi compound. In Sanskrit, these were adjectival in nature, with bahuvrihi being an example of the type and meaning ‘having much rice’. In more modern descriptions, this label has been appropriated for compounds like redwood discussed above, and they are sometimes called possessive compounds or, in German, Dickkopfkomposita, again illustrating the type. The final type in the Sanskrit classification is the avyayibhava compound, which is the label given to adjectival compounds used adverbially. The label has tended to be ignored by modern scholars.
Many more modern classifications of compounds are in effect reinterpretations of the Sanskrit labels.
For example, dvandvas are sometimes relabeled as copulative compounds, and the man-servant type of kharmadharaya is called an appositional compound. This leads to a proliferation of labels without any particular insights. For example, the confusion between appositional and copulative compounds in the literature appears to arise because both can be glossed by inserting the word ‘and’ between the elements: a man-servant is a man and a servant; Alsace–Lorraine is made up of Alsace and Lorraine. This masks the original insight by focusing on a different superficial similarity, the apparent coordination. But here we can recognize not only the original distinction and instances that may be hard to classify on the borderline, we can recognize further subtypes as well (see Table 1).
Table 1 Types of compound whose elements can be understood to be coordinated
Examples | Semantic description | Label
man-servant, writer-director, bitter-sweet | The elements denote different aspects of the same individual | Appositional compounds
Alsace–Lorraine, mother–father (e.g., in Tamil) | The compound denotes a unit made up of the individuals denoted by the elements | Dvandva compounds
Vietnamese bàn ghế ‘table chair = furniture’ | The elements denote different individuals that act as prototypical members of the set denoted by the compound | Usually seen as dvandvas; may be termed co-hyponymic compounds
London–Los Angeles (flight), Greek–English (dictionary) | The elements denote the extremes of some real or metaphorical journey between two points | May be termed translative compounds
American–Australian (talks) | The elements denote the participants; there is no movement between extremes | May be termed participative compounds
Bloomfield (1935) introduces a different fundamental distinction into the classification of compounds. For him, compounds are basically endocentric (they contain their heads) or exocentric (the true head is unexpressed). The tatpurusa compounds are endocentric, as are probably the dvandvas. The bahuvrihi compounds are exocentric. Comment was made above about the headedness of bahuvrihi compounds like redwood. But there are many other compounds that have no overt head element. English examples like pickpocket or their equivalents in Romance languages (e.g., French garde-robe ‘keep dress’ = ‘wardrobe’) also lack overt heads. One unfortunate result of this is that these have sometimes been classified as bahuvrihi, which, traditionally, they were not. Rather they are a type not readily encompassed by the Sanskrit terminology. Other exocentric types include sit-in (a noun made up of a verb and a preposition), up-keep (if this is not to be viewed as a nominalization of a phrasal verb through processes of reordering and stress-change), and compounds like roll-neck (sweater), red-brick (university), go-go (dancer), and high-rise (block), which appear to create nonprototypical adjectives although they do not have adjectival heads, and things like army-fashion, which are most often used adverbially. A more recent classification sees all the compounds that have been discussed so far as primary compounds (sometimes misleadingly termed root compounds), in contrast to compounds such as bus-driver, which are synthetic compounds (sometimes called verbal compounds or verbal–nexus compounds). Synthetic compounds are built around a verb (in bus-driver, the verb is drive) with arguments of the verb taking up other structural positions in the verb. So bus is the direct object of the verb, and the final -er suffix represents the subject or external argument of the verb. There are restrictions on what arguments of the verb can occur in the various positions in a synthetic compound structure. Generally adverbial elements and subject arguments do not occur in the first element of such compounds (*He is a fast-driver of buses, *Driver-cleaning of buses), except where the second element is a past participle (home made, self-driven). Apparent exceptions to this rule are interesting, in that they may help delimit the notion of synthetic compound. For some authorities, synthetic compounds are found only when very productive affixes are involved (-er, -ing, -ed, where English is concerned) so that speech-synthesizer is a synthetic compound but speech-synthesis is not. For such authorities, a compound such as insect flight is not a counterexample to the above generalization, though consumer spending is. Some authorities are also unwilling to allow spatial or temporal locative elements as parts of synthetic compounds, but others present wider definitions. An example like Sunday driver is thus a synthetic compound for some but not for others. Although it is clear that this whole area still needs more work, it is of particular interest since it appears to show the interaction of syntactic and morphological principles in the creation of new lexical items.
The Limits of Compounding Finally, it must be mentioned that there are some things that look like compounds but are not generally accepted as compounds in the literature. The most obvious type is the type where the compound is the base of some subsequent process. For example, things that look like compound verbs in English are usually not created by straight compounding but by back-formation (baby-sit from baby-sitter) or conversion/zero derivation (to grandstand from a grandstand). There is some equivocation here between the final form and the route by which that final form has been achieved. We have already seen that words like walk-out may be viewed as nominalizations of phrasal verbs rather than as compounds. There are some multilexeme lexical items that appear to derive from the lexicalization of a syntactic structure. English examples include forget-me-not and toad-in-the-hole. Although these are sometimes called compounds, it seems that their formation is completely distinct from that of the constructions that have been discussed here and that they really belong under a different heading. The same is true of so-called compounds whose primary motivation appears to be phonetic/phonological: things like namby-pamby and shilly-shally. Those are dealt with separately in this work, since they are not made up of two lexemes. See also: Back-Formation; Conversion; Hyponymy and Hyperonymy; Incorporation; Lexicalization; Neoclassical Compounding; Panini; Word.
Bibliography Adams V (1973). An introduction to modern English word-formation. London: Longman. Basbøll H (1985). ‘Stød in modern Danish.’ Folia Linguistica 19, 1–50. Bauer L (1978). The grammar of nominal compounding. Odense: Odense University Press. Bauer L (1979). ‘On the need for pragmatics in the study of nominal compounding.’ Journal of Pragmatics 3, 45–50. Bauer L (1983). English word-formation. Cambridge, UK: Cambridge University Press. Bauer L (1983). ‘Stress in compounds: A rejoinder.’ English Studies 64, 47–53. Bauer L (1998). ‘When is a sequence of noun + noun a compound in English?’ English Language and Linguistics 2, 65–86. Bauer L (2001). ‘Compounding.’ In Haspelmath M, König E, Oesterreicher W & Raible W (eds.) Language universals and language typology. Berlin/New York: de Gruyter. 695–707.
Bauer L & Renouf A (2001). ‘A corpus-based study of compounding in English.’ Journal of English Linguistics 29, 101–123. Benveniste E (1966). ‘Formes nouvelles de la composition nominale.’ Bulletin de la Société Linguistique de Paris 61, 82–95. Bloomfield L (1935). Language. London: Allen & Unwin. Botha R P (1984). Morphological mechanisms. Oxford: Pergamon. Brekle H E (1970). Generative Satzsemantik im System der Englischen Nominalkomposition. Munich: Wilhelm Fink. Carr C T (1939). Nominal compounds in Germanic. London: Oxford University Press. Darmsteter A (1875). Formation des mots composés en français. Paris. Di Sciullo A-M & Williams E (1987). On the definition of word. Cambridge, MA: MIT Press. Downing P (1977). ‘On the creation and use of English compound nouns.’ Language 53, 810–842. Fabb N (1998). ‘Compounding.’ In Spencer A & Zwicky A M (eds.) The handbook of morphology. Oxford, UK/Malden, MA: Blackwell. 66–83. Farnetani E, Torsello C T et al. (1988). ‘English compound versus non-compound noun phrases in discourse: An acoustic and perceptual study.’ Language and Speech 31, 157–180. Gleitman L R & Gleitman H (1970). Phrase and paraphrase. New York: Norton. Glinert L (1989). The grammar of Modern Hebrew. Cambridge, UK: Cambridge University Press. Hatcher A G (1960). ‘An introduction to the analysis of English noun compounds.’ Word 16, 356–373. Jespersen O (1942). A modern English grammar on historical principles. Part VI. Morphology. London: George Allen and Unwin; Copenhagen: Munksgaard. Kvam A M (1990). ‘Three-part noun combinations in English, composition–meaning–stress.’ English Studies 71, 152–161. Ladd D R (1984). ‘English compound stress.’ In Gibbon D & Richter H (eds.) Intonation, accent and rhythm. Berlin/New York: de Gruyter. 253–266. Lees R B (1960). The grammar of English nominalizations. The Hague: Mouton. Lees R B (1970). ‘Problems in the grammatical analysis of English nominal compounds.’ In Bierwisch M & Heidolph K E (eds.) Progress in linguistics. The Hague: Mouton. 174–186.
Levi J N (1978). The syntax and semantics of complex nominals. New York: Academic Press. Lewicka H (1963). ‘Réflexions théoriques sur la composition des mots en ancien et en moyen français.’ Kwartalnik Neofilologiczny 10, 131–149. Marchand H (1969). The categories and types of present-day English word-formation (2nd edn.). Munich: Beck. Meys W J (1975). Compound adjectives in English and the ideal speaker–listener. Amsterdam: North Holland.
Olsen S (2000). ‘Composition.’ In Booij G, Lehmann C & Mugdan J (eds.) Morphologie/morphology, vol. 1. Berlin and New York: de Gruyter. 897–916. Rohrer C (1977). Die Wortzusammensetzung im modernen Französisch. Tübingen: Narr. Ryder M E (1994). Ordered chaos: The interpretation of English noun–noun compounds. Berkeley: University of California Press. Sandra D (1990). ‘On the representation and processing of compound words: Automatic access to constituent morphemes does not occur.’ Quarterly Journal of Experimental Psychology 42A, 529–567.
Scalise S (1992). ‘The morphology of compounding.’ Italian Journal of Linguistics/Rivista di Linguistica 4/1. Sulkala H & Karjalainen M (1992). Finnish. London and New York: Routledge. Warren B (1978). Semantic patterns of noun–noun compounds. Göteborg: Acta Universitatis Gothoburgensis. Whitney W D (1889). Sanskrit grammar (2nd edn.). Cambridge, MA: Harvard University Press/London: Oxford University Press.
Computational Approaches to Language Acquisition
J Elman, University of California San Diego, La Jolla, CA, USA
© 2006 Elsevier Ltd. All rights reserved.
Why Computational Models?
One of the striking developments in the field of child language acquisition within the past two decades has been the dramatic increase in the use of computational models as a way of understanding the acquisition process. In part, this has been driven by the widespread availability of inexpensive but powerful computers and the development of software that has made modeling more widely accessible. But there is a more interesting and scientifically significant reason for this trend. Computational theories of learning themselves have matured significantly since the middle of the 20th century. The renaissance in neural network (or connectionist) approaches and, more recently, the linkages with mathematical approaches such as Bayesian inference, information theory, and statistical learning have provided a much more sophisticated perspective on a number of issues relevant to language acquisition.
These models are necessarily used in conjunction with empirical approaches, but provide an important complement to such approaches. At the very least, computational models can be thought of as enforcing a level of detail and specificity in a theory or account that a verbal description might not possess. Furthermore, even in simple models, there may be interactions among the model’s components that are sufficiently complex that only by running a simulation is it possible to know how the model will behave. Computer models also afford the opportunity to explore aspects of a phenomenon that may not be easily tested in the real world (either because the corresponding situation has not yet been studied, or because it may be infeasible to test). By systematically exploring the full parameter space of a theory, one can sometimes gain insight into the deeper principles that underlie a behavior. And of course, a model may be amenable to analytic techniques that are not practical with real children. With children we can never do more than make inferences about the internal mechanisms that drive a behavior. Computer models, on the other hand, can in principle be completely understood. Finally, such models can serve as hypothesis generators. They often suggest novel ways of understanding a phenomenon. Of course, the validity of the hypothesis ultimately depends on empirical testing with real children.
In general, there have emerged two complementary approaches to modeling. In the first, the goal is to determine that a problem can be solved without making specific claims that the solution implemented in the computer model is the same as it would be for the child. These approaches tend to be more mathematical in nature. Work involving Bayesian inference, information theory, and statistical learning is of this sort. The second approach attempts to model the acquisition process a bit more directly. Learning plays a central role in these approaches, and the models’ behavior at intermediate stages is as much a focus as the ability to ultimately master the task. Connectionist models are examples of this second sort of model.
Because the field of computational approaches to language acquisition has grown so explosively – and cannot be fully covered in the present short review – what follows will be organized around the major issues that have been addressed (leaving aside a large number of interesting but less central phenomena). For excellent discussion of related computational approaches, see Brent (1996), Munakata and McClelland (2003), and MacWhinney (1999).
Issues and Results
It is useful to focus on the modeling work in terms of three major questions that have been addressed (bearing in mind the caveat above, that just as the field of language acquisition is itself large and diverse, there are many models that fall outside the scope of this taxonomy). These questions have to do with: (1) Oddities in the ‘shape of change’ (e.g., discontinuities or nonlinearities in acquisition, as in U-shaped curves); (2) What information is available in the input a child receives, and what she can infer from it (e.g., the problem of segmenting words or discovering grammatical categories or syntactic regularities); and (3) How learning can proceed in the face of putatively insufficient information (e.g., ‘Baker’s paradox’ or the so-called ‘Poverty of the Stimulus’ problem). We shall consider each of these in turn, first setting out the issues and then the computational models that have been used to address them.
Explaining the Shape of Change
The simplest, and possibly most natural, assumption about development would be a linear increase in performance over time. Such a pattern would be consistent with the assumption that the mechanisms that subserve learning remain relatively constant, and thus that the amount learned in each time increment should also remain constant. In fact, few developmental patterns illustrate such linear tendencies. Development seems to proceed in fits and spurts, sometimes interrupted by long periods where little appears to change, and sometimes even by phases where performance temporarily deteriorates.
Noteworthy examples of such nonlinearities abound in the realm of language acquisition, and they have played a major role in theories about the mechanisms that make language acquisition possible. The special ability of children to learn languages (the Critical Period) is a notable example of such a nonlinearity. One influential explanation of this effect is that it reflects the existence of a specialized neural mechanism, the Language Acquisition Device, which is operative only during childhood.
Another well-documented set of nonlinearities is exemplified by the rapid increases in word comprehension, production, and knowledge of grammar that occur in young children during their second year of life (as in Figure 1, from Bates and Goodman, 1997). Clearly, something dramatic seems to be happening at the point where, for example, the child manifests a burst in the rate at which she learns new words. Many theorists have interpreted such bursts as evidence that something new has appeared in the child.
Figure 1 Median growth scores for word comprehension expressed as a percentage of available items. (Reproduced with permission from Bates E & Goodman J (1997). ‘On the inseparability of grammar and the lexicon.’ Language and Cognitive Processes 12(5/6), 507–584. © Psychology Press Ltd (http://www.psypress.co.uk/journals.asp).)
A final example has played a particularly important role in the theoretical literature: the apparent U-shaped curve that characterizes children’s mastery of the past tense of the English verbal system. At the earliest stage, children know a small number of verbs, mostly of high frequency and tending to be irregular; they typically produce the past tense correctly. At the second stage, the number of verbs in the child’s productive vocabulary increases and includes a larger number of regulars, some of which may be lower in frequency. At this stage, both observational evidence (overgeneralization of the ‘+ed’ pattern for regular verbs) and experimental data (ability to generate the regular version of nonce verbs) suggest that the child has learned a rule for forming the past tense. During the second stage, the rule is incompletely learned and misapplied to many (previously correctly produced) irregulars, resulting in a decline in overall performance. Finally, at the third stage, the correct forms for both regulars and irregulars are produced, and the child appears to have learned not only the rule – which applies to regulars – but also the exceptions. These data have provided a powerful argument in favor of the psychological reality of rules.
The Critical Period
A number of computational models have addressed these issues and in many cases provided alternative hypotheses for the phenomena. In attempting to understand how neural networks might deal with complex recursive structure in language, Elman (1993) discovered that the network was able to process complex sentences only when it was initially exposed to simple sentences (a kind of neural network ‘motherese’), or when it began the learning process with a restricted working memory (similar to the limited WM found in young children). Elman called this the ‘starting small’ effect. It is similar in spirit to Newport’s ‘less is more’ hypothesis (Newport, 1990). In both accounts, the limitation on processing resources acts like a filter that temporarily hides the more complex aspects of language from the network (or child). Learning the simpler phenomena first creates a foundation of knowledge that makes it possible to subsequently learn more complex regularities. These accounts suggest that rather than being enabled by a special mechanism (the LAD) that is lost in adulthood, the explanation for the Critical Period is that – paradoxically – it is maturational limitations that facilitate the learning of language.
However, it is also possible that there are multiple factors that result in Critical Period effects. Using a model based on Hebbian learning (a computational paradigm closely related to the changes in Long Term Potentiation of synaptic junctions that result from synchronous firing of neurons), Munakata and Pfaffly (2004) demonstrated that even though the mechanism for plasticity did not change, what was learned early in a network’s life constrained what it could learn later. Marchman (1993) has demonstrated similar effects in networks that learn the past tense; networks that suffer simulated brain damage early in life recover much better than networks that are lesioned after much learning has occurred.
The Vocabulary Burst
A number of models have been used to attempt to understand what factors might lead to the rapid acceleration in learning of new words that typically occurs in the middle of the second year of life. Plunkett et al. (1992) trained networks to associate linguistic labels with visual images and observed that a burst-like increase in ability to learn labels occurred after early training. They also found that comprehension performance in the networks always exceeded (and preceded) production, that the networks exhibited prototype effects, and that they also showed the underextension and overextension phenomena found in children. Plunkett et al. (1992) attribute these behaviors to the network’s need to develop robust conceptual categories. Before these categories are in place, learning is slow and errorful. Once categories are learned, they facilitate the learning of new words. A similar effect was found in Elman (1998), who also found that there was a direct, causal connection between vocabulary growth and the later emergence of grammar (cf. Bates and Goodman, 1997). The effect arose because grammar was essentially understood as a generalization over the commonalities in syntactic behavior of many words; with a small vocabulary these patterns are not evident, and so vocabulary growth is a necessary prerequisite to discovery of grammatical patterns (cf. Tomasello, 2000 for a similar account in the acquisition literature).
The English Past Tense: Rules or Connections?
The final example of nonlinearities in language acquisition is the U-shaped performance shown in English by many children as they learn the correct past tense forms of verbs. This phenomenon has long been seen as demonstrating the psychological reality of rules, insofar as we appear to be observing the moment in time when the rule for the past tense is being acquired (Pinker, 1991). Rumelhart and McClelland (1986) challenged this assumption by showing that when a neural network was trained, on a verb-by-verb basis, to produce the past tense of English verbs, it not only manifested a similar U-shaped performance curve, but also replicated in detail many of the more specific empirical phenomena found in children. Rumelhart and McClelland suggested that the network account provided an alternative to the traditional interpretation involving rules. Not surprisingly, this claim provoked a controversy that continues to this day (Prince and Pinker, 1988). The debate has been lively, if at times acrimonious. And although the theoretical interpretation remains controversial, one of the most important outcomes of this debate has been the broadening – in terms of both languages studied and level of detail – of empirical research, not only in English but also in other languages, including German, Hebrew, Icelandic, Italian, Norwegian, Polish, and Spanish. This is an excellent example of how computational models can refine the questions that are addressed and inspire new avenues of empirical investigation. The debate has also led to a more sophisticated understanding of the implications of the competing accounts not only for acquisition but also for other aspects of language processing and historical change.
What Information Is Available to a Child, and What Can Be Learned?
Although a child’s experience obviously plays a critical role in the learning process, the relationship between what the child hears and what she ultimately knows is in many cases not transparent. Indeed, it has been claimed that in some cases there is no evidence at all for this knowledge (Crain, 1991). The putative insufficiency of the evidence available to a child – the poverty of the stimulus problem – has led to the conclusion that significant amounts of linguistic knowledge must be ‘preknown’ by a child. This knowledge constitutes a universal grammar that is part of the biological endowment every child uses as she learns the specific features of her own language.
There are three issues that must be considered when evaluating such a hypothesis. The first involves what the actual input is that is available to children. Although that input is in fact massive in terms of word tokens, there is now reason to believe that it reflects a restricted range of the adult language (Cameron-Faulkner et al., 2003). Second, it is also clear that for a long period of time, children are actually much more conservative in their productions and stick closely to what they hear (Cameron-Faulkner et al., 2003; Lieven et al., 2003; Theakston et al., 2003). Nonetheless, it is also obviously true that at some point children venture into uncharted territory, so the problem of what motivates such creative use of language is real. This leads to the third question, which is what theory of learning is assumed. At least some of the Nativist accounts have assumed a weak kind of learning, essentially little more than a mental tabulation of utterances (e.g., Pinker, 1984: 49ff.). Computational models have been most successful in addressing this third question, by exploring the properties of more powerful – but hopefully psychologically plausible – learning mechanisms.
Discovering Where the Words Are: The Segmentation Problem
Unlike written language, in which words are delimited by white space or punctuation characters, spoken language yields few explicit clues as to where the boundaries between words are. For the infant, this poses a serious challenge, complicated by the fact that even the definition of what counts as a word differs dramatically across languages. How, then, does the child learn (a) what can serve as a word, and (b) where the words are in continuous speech? A number of computational approaches have converged on a similar insight, which is that at least to a first approximation, sequences of sounds that are highly associated are good candidates to be words. The manner in which this hypothesis is implemented varies (Brent and Cartwright, 1996; Christiansen et al., 1998; Elman, 1990), but the essential idea is that word boundaries are locations where the conditional probability of the next sound, given what has preceded it, is low. This can be seen in Figure 2, which shows the errors made by a network that has learned to predict the next letter in a sequence of words (white space removed) that make up a child’s story (Elman, 1990). Error tends to be greatest at the onsets of words, and decreases as more of a word is heard. Error maxima thus constitute likely word boundaries.
Another issue that concerns word learning is the problem of determining the syntactic and semantic categories of words. Here again, strong claims have been advanced that at least the categories must be innate, as must the principles that guide the child in making such determinations. The arguments have included the claim that the kind of distributional information available to a child (e.g., words in the same category tend to have similar distributional properties) will fail given the complexity of real language input. However, a number of computational models have suggested otherwise. Considerably more information of this sort appears to be available to a child than might be imagined (Cartwright and Brent, 1997; Elman, 1995; Mintz, 2002; Redington et al., 1998). Again, these models differ in their details, but share the same insight that a word’s privilege of occurrence is a powerful indicator of its category. Importantly, there is a growing empirical literature involving learning of artificial languages by infants and young children that is highly consistent with the type of learning embodied in the computational models (see Gomez and Gerken, 2000 and Saffran, 2001 for a discussion of this work).
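The claim that a word’s privilege of occurrence indicates its category can be illustrated with a toy calculation. The following Python sketch is not a reimplementation of any of the cited models; the six-sentence corpus, the function names, and the choice of immediate left and right neighbors as the context are all assumptions made for illustration. It records, for each word, how often each neighbor precedes and follows it, and then compares words by the cosine similarity of those counts.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus (an assumption for illustration, not child-directed speech data).
TOY_CORPUS = [
    "the dog runs", "the cat runs", "a dog sleeps",
    "a cat sleeps", "the dog sleeps", "a cat runs",
]

def context_vectors(sentences):
    """Count, for every word, the words immediately before and after it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("prev", tokens[i - 1])] += 1
            vectors[tokens[i]][("next", tokens[i + 1])] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vectors = context_vectors(TOY_CORPUS)
noun_noun = cosine(vectors["dog"], vectors["cat"])   # two nouns
noun_verb = cosine(vectors["dog"], vectors["runs"])  # a noun and a verb
```

In this toy corpus the two nouns share most of their contexts while the noun and the verb share none, so `noun_noun` is much larger than `noun_verb`. Real input is far noisier, which is exactly the point at issue in the debate described above.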
Figure 2 Performance of a simple recurrent network that has learned to predict the next letter in a short story. Error maxima are highly correlated with the onsets of a new word. (Reproduced with permission from Elman J L (1990). ‘Finding structure in time.’ Cognitive Science 14(2), 179–211. © Taylor & Francis.)
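The idea that word boundaries fall where the next sound is hard to predict can be sketched without a neural network. The Python fragment below is a deliberately simplified stand-in, not the simple recurrent network of Elman (1990): it conditions only on the single preceding syllable, and the three nonsense words, the stream, and the threshold of 0.75 are hypothetical choices made for this example. A boundary is posited wherever the estimated conditional probability of the next syllable falls below the threshold.

```python
from collections import Counter

# Hypothetical three-word artificial language (an assumption for illustration).
TOY_WORDS = {"A": ["go", "la", "bu"], "B": ["tu", "pi", "ro"], "C": ["bi", "da", "ku"]}
WORD_ORDER = "ABCBACAB"
# Continuous stream of syllables with no boundary markers.
toy_stream = [syllable for w in WORD_ORDER for syllable in TOY_WORDS[w]]

def transitional_probabilities(stream):
    """Estimate P(next | current) for every adjacent pair in the stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(stream, threshold):
    """Insert a boundary wherever the transitional probability is below threshold."""
    tp = transitional_probabilities(stream)
    words, current = [], stream[0]
    for prev, nxt in zip(stream, stream[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append(current)   # low predictability: close the current word
            current = nxt
        else:
            current += nxt          # high predictability: stay inside the word
    words.append(current)
    return words

segmented = segment(toy_stream, threshold=0.75)
```

Here every within-word transition has probability 1.0 and every between-word transition at most about 0.67, so the threshold recovers the original words. A fixed threshold is a crude device; looking for local minima in predictability, as the error maxima in Figure 2 suggest, would be one alternative design.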
Discovering Grammar? The Poverty of the Stimulus Problem
Perhaps the strongest claims regarding the necessity for children’s innate linguistic knowledge arise in the context of grammar. As with the past tense debate, the controversy has been heated. It has also been complex, because it interacts not only with the long-standing nature vs. nurture debate but also with questions regarding the extent to which linguistic knowledge is modular and independent from other cognitive processes (i.e., domain-specific), and whether the uniqueness of language to our species also reflects specialized neural – and presumably also genetic – substrates that are entirely absent in other species. For two different answers to these questions, see Elman et al. (1996) and Pinker (1994).
One basic question that arose early in the discussion is whether connectionist models were capable at all of capturing some of the apparently recursive nature of natural language. Even if recursion in human language is only partial, there is good evidence that some kind of abstract internal representations must underlie the surface forms. Symbolic accounts that make use of syntactic trees provide one mechanism that might explain why, for example, the verb is in (1) is in the singular, agreeing with woman, rather than with any of the other nouns in the sentence.
(1) The woman who Mary and Bob introduced me to last summer while I was visiting them in Paris on my way to Prague is really quite interesting.
Similarly, tree-structured representations provide a formalism that makes possible hypotheses about why (2) is an acceptable sentence, whereas (3) – which is similar in meaning – is ungrammatical. (It should be noted, however, that accounts of such differences are elusive, and there is still not complete agreement within any framework about the explanation for these sorts of differences.)
(2) Who did you believe Annie saw? (Possible answer: I believed Annie saw Elvis.)
(3) *Who did you believe the claim Annie saw? (Possible answer: I believed the claim Annie saw Elvis.)
Claims that neural networks are in principle unable to deal with such linguistic complexities may be premature. Their solution to the problem of recursion differs from classical discrete automata, but recurrent neural networks definitely have sufficient power to deal with complex grammatical constructions (Siegelmann, 1995). More relevant to issues in language acquisition is whether these complex grammatical regularities can actually be learned, given the input to which a child might be exposed. A number of computational models suggest a positive answer (e.g., Christiansen and Chater, 1999; Elman, 1993). One particularly challenging problem, and the one we will conclude with, was posed by Crain (1991) and concerns what has been called Aux Inversion as a hypothesis to explain how certain kinds of questions are formed.
Crain argued that based on the evidence available to a child, such as question-answer pairs shown in (4) and (5), any account of grammar acquisition that relies solely on learning would be expected to produce the incorrect generalization that, for any declarative, the corresponding interrogative involves inversion of the first verb and first noun, as captured schematically by the rule shown in (6).
(4a) Mary is happy.
(4b) Is Mary happy?
(5a) Timmy can swim awfully fast.
(5b) Can Timmy swim awfully fast?
(6) Declarative: Noun AUX . . . Interrogative: AUX Noun . . .
But this rule would be wrong, because it predicts incorrectly that the interrogative that corresponds to (7a) would be (7b). In reality, the correct interrogative is (7c). (For convenience, underlining shows the location of the auxiliary prior to inversion.)
(7a) The boy who is smoking is crazy.
(7b) *Is the boy who __ smoking is crazy?
(7c) Is the boy who is smoking __ crazy?
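The contrast between the two generalizations can be made concrete in a few lines of Python. This is only an illustration of the two candidate rules, not a learner and not the simulation discussed in this article: the auxiliary list, the function names, and above all the hand-supplied split of the sentence into subject noun phrase and predicate are assumptions introduced here. The first function fronts whichever auxiliary comes first in the string; the second fronts the auxiliary of the main clause, which presupposes that the subject is treated as a constituent.

```python
# Hypothetical auxiliary inventory, limited to the examples in (4), (5), and (7).
AUXILIARIES = {"is", "can"}

def linear_rule(tokens):
    """Structure-independent rule: front the first auxiliary in the string."""
    i = next(idx for idx, tok in enumerate(tokens) if tok in AUXILIARIES)
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]

def structural_rule(subject_np, predicate):
    """Structure-dependent rule: front the auxiliary that begins the main predicate.

    The split into subject_np and predicate is given by hand (an assumption);
    discovering that split is the hard part of the learning problem.
    """
    aux, rest = predicate[0], predicate[1:]
    if aux not in AUXILIARIES:
        raise ValueError("predicate does not begin with a known auxiliary")
    return [aux] + subject_np + rest

simple = linear_rule("mary is happy".split())
wrong = linear_rule("the boy who is smoking is crazy".split())
right = structural_rule("the boy who is smoking".split(), "is crazy".split())
```

On the simple declarative both rules agree, which is why examples like (4) and (5) cannot distinguish them. On (7a) the linear rule yields the word order of (7b), while the structural rule yields that of (7c).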
Crain argues that children do not hear the sort of data (e.g., questions of the form in (7c)) until well past the period where they can be shown – by experimentally eliciting them – to correctly produce these forms. He concludes that this is strong evidence for the existence of an innate constraint that requires that abstract constituent structure be the basis for learning grammatical regularities. He calls this the ‘parade case’ for an innate constraint.
To test this claim, Lewis and Elman (2001) constructed a simulation in which a recurrent neural network was trained on examples of well-formed sentences; the training data were generated to mimic the types and frequencies of sentences found in the Manchester corpora from the CHILDES databank (MacWhinney, 2000; Theakston et al., 2001). Crucially, although there were many sentences of the form shown in (4) and (5), no sentences of the forms shown in (7) were included. The network was then tested on both ungrammatical (7b) and grammatical (7c) inputs. Its clear preference was for the grammatical questions, despite never having seen similar sentences during training.
How did the network learn the true grammatical generalization? It turns out that there are many other sentences present in the input (to children as well as to these networks) that provide ample evidence for the fact that noun phrases act as constituents. These include sentences such as those shown in (9a) and (9b).
(9a) The bike with wheels belongs to me. (Not: The bike with wheels *belong to me.)
(9b) The cats my dog chases belong to our neighbor. (Not: The cats my dog chases *belongs to our neighbor.)
The input to the network is thus sufficient to motivate a number of generalizations. These involve learning about different grammatical categories (nouns, verbs, prepositions, complementizers, etc.); selectional restrictions imposed by verbs on their arguments; the form of simple declaratives; the form of simple interrogatives; and the fact that agreement relations (among others) must respect constituenthood. Although these are logically independent generalizations, they have the opportunity to interact. The critical interaction occurs when a complex sentence is also an interrogative. The network has never seen such interactions, but its ability to partial out independent generalizations also makes it possible to combine generalizations as they may interact. There is an important lesson here, and it is a clear demonstration of the ways in which computational models – particularly those that involve learning – can yield new insights into old problems. To a large degree, the question of what can be learned from the available input hinges crucially on what counts as input. Many of the claims regarding poverty of the stimulus have taken a straightforward and literal view of the input. If the target generalization to be learned involves strings of the form X, then the relevant input consists of strings of the form X. But this is a limited view of the relationship between our experience and what we make of it. The Lewis and Elman simulation suggests that some of the more complex aspects of language learning may involve a good deal of what is really indirect evidence, and that inductive mechanisms of the sort instantiated in neural networks are capable of combining that evidence in novel ways to yield outcomes that are not transparently related to the input. Whether this is in fact also true of children of course can only be determined through empirical research. 
The importance of the computational simulations is that they open up a logical possibility that previously had been ruled out.
See also: Associationism and Connectionism; Chart Parsing and Well-Formed Substring Tables; CHILDES Database; Corpora; Developmental Relationship between Language and Cognition; Formal Models and Language Acquisition; Infancy: Sensitivity to Linguistic Form; Information Theory; Language Acquisition Research Methods; Language Development: Morphology; Language Development: Overview; Lexical Acquisition; Syntactic Development.
Bibliography
Bates E & Goodman J C (1997). ‘On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia, and real-time processing.’ Language and Cognitive Processes 12, 507–584.
Brent M R (1996). ‘Advances in the computational study of language acquisition.’ Special Issue on Compositional Language Acquisition. Cognition 61(1–2), 1–38.
Brent M & Cartwright T (1996). ‘Distributional regularity and phonotactic constraints are useful for segmentation.’ Cognition 61(1–2), 93–125.
Cameron-Faulkner T, Lieven E & Tomasello M (2003). ‘A construction based analysis of child directed speech.’ Cognitive Science 27(6), 843–874.
Cartwright T & Brent M (1997). ‘Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis.’ Cognition 63(2), 121–170.
Christiansen M H, Allen J & Seidenberg M S (1998). ‘Learning to segment speech using multiple cues: A connectionist model.’ Special Issue on Language Acquisition and Connectionism. Language and Cognitive Processes 13(2&3), 221–268.
Christiansen M H & Chater N (1999). ‘Toward a connectionist model of recursion in human linguistic performance.’ Cognitive Science 23(2), 157–205.
Crain S (1991). ‘Language acquisition in the absence of experience.’ Brain and Behavioral Sciences 14, 597–611.
Elman J L (1990). ‘Finding structure in time.’ Cognitive Science 14(2), 179–211.
Elman J L (1993). ‘Learning and development in neural networks: The importance of starting small.’ Cognition 48(1), 71–99.
Elman J L (1995). ‘Language as a dynamical system.’ In Port R F & van Gelder T (eds.) Mind as motion. Cambridge, MA: MIT Press. 195–225.
Elman J L (1998). Generalization, simple recurrent networks, and the emergence of structure. Mahwah, NJ: Lawrence Erlbaum.
Elman J L, Bates E A, Johnson M H, Karmiloff-Smith A et al. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.
Gomez R L & Gerken L A (2000). ‘Infant artificial language learning and language acquisition.’ Trends in Cognitive Science 4(5), 178–186.
Lewis J D & Elman J L (2001). ‘A connectionist investigation of linguistic arguments from poverty of the stimulus: Learning the unlearnable.’ In Moore J D & Stenning K (eds.) Proceedings of the Twenty-Third Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum. 552–557.
Lieven E, Behrens H, Speares J & Tomasello M (2003). ‘Early syntactic creativity: A usage-based approach.’ Journal of Child Language 30(2), 333–367.
MacWhinney B (2000). The CHILDES project: Tools for analyzing talk, vol. 1: Transcription format and programs (3rd edn.). Mahwah, NJ: Lawrence Erlbaum.
MacWhinney B (ed.) (1999). The emergence of language. Mahwah, NJ: Lawrence Erlbaum.
Marchman V (1993). ‘Constraints on plasticity in a connectionist model of the English past tense.’ Journal of Cognitive Neuroscience 5, 215–234.
Mintz T H (2002). ‘Category induction from distributional cues in an artificial language.’ Memory and Cognition 30(5), 678–686.
Munakata Y & McClelland J L (2003). ‘Connectionist models of development.’ Developmental Science 6(4), 413–429.
Munakata Y & Pfaffly J (2004). ‘Hebbian learning and development.’ Developmental Science 7(2), 141–148.
Newport E L (1990). ‘Maturational constraints on language learning.’ Cognitive Science 14, 11–28.
Pinker S (1984). Language learnability and language development. Cambridge, MA: Harvard University Press.
Pinker S (1991). ‘Rules of language.’ Science 253(5019), 530–535.
Pinker S (1994). The language instinct. New York, NY: William Morrow.
Plunkett K, Sinha C, Moller M F & Strandsby O (1992). ‘Symbol grounding or the emergence of symbols? Vocabulary growth in children and a connectionist net.’ Connection Science: Journal of Neural Computing, Artificial Intelligence and Cognitive Research Special Issue: Philosophical issues in connectionist modeling 4(3–4), 293–312.
Prince A & Pinker S (1988). ‘Rules and connections in human language.’ Trends in Neurosciences 11(5), 195–202.
Redington M, Chater N & Finch S (1998). ‘Distributional information: A powerful cue for acquiring syntactic categories.’ Cognitive Science 22(4), 425–469.
Rumelhart D E & McClelland J L (1986). On learning the past tenses of English verbs. Cambridge, MA: MIT Press.
Saffran J (2001). ‘The use of predictive dependencies in language learning.’ Journal of Memory and Language 44(4), 493–515.
Siegelmann H T (1995). ‘Computation beyond the Turing limit.’ Science 268(5210), 545–548.
Theakston A L, Lieven E V M, Pine J M & Rowland C F (2001). ‘The role of performance limitations in the acquisition of verb-argument structure: An alternative account.’ Journal of Child Language 28(1), 127–152.
Theakston A L, Lieven E V M & Tomasello M (2003). ‘The role of the input in the acquisition of third person singular verbs in English.’ Journal of Speech, Language, Hearing Research 46(4), 863–877.
Tomasello M (2000). ‘The item-based nature of children’s early syntactic development.’ Trends in Cognitive Science 4, 156–163.
Computational Language Systems: Architectures
H Cunningham and K Bontcheva, University of Sheffield, Sheffield, UK
© 2006 Elsevier Ltd. All rights reserved.
Software Architecture
Every building, and every computer program, has an ‘architecture’: structural and organizational principles that underpin its design and construction. The garden shed once built by one of the authors had an ad hoc architecture, extracted (somewhat painfully) from the imagination during a slow and nondeterministic process that, luckily, resulted in a structure that keeps the rain on the outside and the mower on the inside (at least for the time being). As well as being ad hoc (i.e., not informed by analysis of similar practice or relevant science or engineering) this architecture is implicit: no explicit design was made, and no records or documentation were kept of the construction process. The pyramid in the courtyard of the Louvre, by contrast, was constructed in a process involving explicit design performed by qualified engineers with a wealth of theoretical and practical knowledge of the properties of materials, the relative merits and strengths of different construction techniques, and the like. So it is with software: sometimes it is thrown together by enthusiastic amateurs, and sometimes it is architected, built to last, and intended to be ‘not something you finish, but something you start’ (to paraphrase Brand, 1994).
Several researchers argued in the early and middle 1990s that the field of computational infrastructure or architecture for human language computation merited increased attention. The reasoning was that the increasingly large-scale and technologically significant nature of language processing science was placing increasing burdens of an engineering nature on research and development (R&D) workers seeking robust and practical methods (as was the increasingly collaborative nature of research in this field, which puts a large premium on software integration and interoperation). Since then, several significant systems and practices have been developed in what may be called software architecture for language engineering (SALE).
Language engineering (LE) may be defined as the production of software systems that involve processing human language with quantifiable accuracy and predictable development resources (Cunningham, 1999). LE is related to but distinct from the fields of computational linguistics, natural language processing, and artificial intelligence, with its own priorities
and concerns. Chief among these are (1) dealing with large-scale tasks of practical utility, (2) measuring progress quantitatively relative to performance on examples of such tasks, (3) a growing realization of the importance of software engineering in general, and (4) reusability, robustness, efficiency, and productivity, in particular. Software architectures can contribute significantly toward achieving these goals (Maynard et al., 2002; Cunningham and Scott, 2004).
This article gives a critical review of the various approaches that have been taken to the problem of software architecture for language engineering (SALE). The prime criterion for inclusion in this article is that the approaches are infrastructural – work that is intended to support language engineering (LE) R&D in some way that extends beyond the boundaries of a single time-limited project. This article presents categories of work that range over a wide area. To provide an organizing principle for the discussion, we extrapolate a set of architectural issues that represent the union of those addressed by the various researchers cited. This approach has the advantage of making it easier to see how certain problems have been addressed and the disadvantage that multipurpose infrastructures appear in several categories.
The following section discusses infrastructures aimed at algorithmic resources, including the issues of component integration and execution. The article then analyzes data resources infrastructure, including the issues of access and the representation of information about text and speech. It concludes with a discussion on future directions for work on SALE.
Software Architectures for Language Engineering
The problem addressed by the systems reviewed here is the construction of software infrastructure for language processing: software that is intended to apply to whole families of problems within this field and to be like a craftsman’s toolbox in the service of construction and experimentation. We consider three types of infrastructural systems: frameworks, architectures, and development environments. A ‘framework’ typically means an object-oriented class library that has been designed with a certain domain in mind and that can be tailored and extended to solve problems in that domain. A framework may also be known as a platform or a component system. All software systems have an architecture. Sometimes, the architecture is explicit, perhaps conforming
to certain standards or patterns, and sometimes it is implicit. Where an architecture is explicit and targeted on more than one system, it is known as a ‘reference architecture’ or a ‘domain-specific architecture.’ The former is “a software architecture for a family of application systems” (Tracz and Mar, 1995). The term ‘domain-specific software architecture (DSSA),’ the subject of an eponymous ARPA research program, “applies to architectures designed to address the known architectural abstractions specific to given problem domains” (Clements and Northrop, 1996). An implementation of an architecture that includes some graphical tools for building and testing systems is a ‘development environment’. One of the benefits of an explicit and repeatable architecture is that it can give rise to a symbiotic relationship with a dedicated development environment. In this relationship, the development environment can help designers conform to architectural principles and visualize the effect of various design choices and can provide code libraries tailored to the architecture.
The most significant issues addressed by SALE systems include the following.
• enabling a clean separation of low-level tasks, such as data storage, data visualization, location and loading of components, and execution of processes, from the data structures and algorithms that actually process human language
• reducing integration overheads by providing standard mechanisms for components to communicate data about language and using open standards, such as Java and XML, as the underlying platform
• providing a baseline set of language processing components that can be extended and/or replaced by users as required
• providing a development environment or at least a set of tools to support users in modifying and implementing language processing components and applications
• automating measurement of performance of language-processing components.
This article focuses on the first two sets of issues, because they are issues that arise in every single NLP system or application and are prime areas where SALE can make a contribution. For a discussion of other requirements, see Cunningham (2000).
Categories of Work on SALE
As with other software, LE programs comprise data and algorithms. The current trend in software development is to model both data and algorithms together, as ‘objects.’ (Older development methods, such as structured analysis, kept them largely separate; Yourdon, 1989.) Systems that adopt the new approach are referred to as ‘object-oriented’ (OO), and there are good reasons to believe that OO software is easier to build and maintain (see Booch, 1994). In the domain of human language processing R&D, however, the choice is not quite so clear cut. Language data, in various forms, are of such significance in the field that they are frequently worked on independently of the algorithms that process them. Such data have even come to have their own term: ‘language resources’ (LRs; LREC-1, 1998), covering many data sources, from lexicons to corpora. In recognition of this distinction, this article uses the following terminology.
• Language resource (LR) refers to data-only resources, such as lexicons, corpora, thesauri, or ontologies. Some LRs come with software (e.g., WordNet has both a user query interface and C and Prolog APIs), but resources in which software is only a means of accessing the underlying data are still defined as LRs.
• Processing resource (PR) refers to resources that are principally programmatic or algorithmic, such as lemmatizers, generators, translators, parsers, or speech recognizers. For example, a part-of-speech (POS) tagger is best characterized by reference to the process it performs on text. PRs typically include LRs (e.g., a tagger often has a lexicon).
PRs can be viewed as algorithms that map between different types of LR and that typically use LRs in the mapping process. An MT (Machine Translation) engine, for example, maps a monolingual corpus into a multilingual aligned corpus using lexicons, grammars, and the like. Adopting the PR/LR distinction is a matter of conforming to established domain practice and terminology. It does not imply that one cannot model the domain (or build software to support it) in an object-oriented manner. This distinction is used to categorize work on SALE.
The next section surveys infrastructural work on processing resources, and the following section reviews the much more substantial body of work on language resources.
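The LR/PR terminology can be made concrete with a small sketch. The Python below is purely illustrative and is not taken from any of the systems surveyed: the names `Lexicon`, `Corpus`, `ProcessingResource`, and `LexiconTagger` are assumptions chosen for this example. It shows two data-only resources (LRs) and a toy tagger (a PR) that contains a lexicon and maps an untagged corpus to a tagged one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol, Tuple

Token = Tuple[str, Optional[str]]  # (word form, POS tag or None)


@dataclass
class Lexicon:
    """A language resource (LR): data only, here word form -> POS tag."""
    entries: Dict[str, str] = field(default_factory=dict)


@dataclass
class Corpus:
    """Another LR: a list of documents, each a list of tokens."""
    documents: List[List[Token]] = field(default_factory=list)


class ProcessingResource(Protocol):
    """A PR is characterized by the process it performs on LRs."""

    def run(self, corpus: Corpus) -> Corpus:
        ...


@dataclass
class LexiconTagger:
    """A toy POS tagger: a PR that includes an LR (its lexicon)."""
    lexicon: Lexicon
    default_tag: str = "UNK"  # hypothetical tag for unknown words

    def run(self, corpus: Corpus) -> Corpus:
        # Map one LR (untagged corpus) to another LR (tagged corpus).
        tagged = [
            [(word, self.lexicon.entries.get(word.lower(), self.default_tag))
             for word, _ in document]
            for document in corpus.documents
        ]
        return Corpus(documents=tagged)


lexicon = Lexicon({"the": "DET", "dog": "NOUN", "barks": "VERB"})
raw = Corpus([[("The", None), ("dog", None), ("barks", None), ("loudly", None)]])
tagged = LexiconTagger(lexicon).run(raw)
print(tagged.documents[0])
```

The lexicon and corpora carry no behavior of their own, whereas the tagger is defined by what it does to them, which is the sense in which the two kinds of resource are kept apart in this article.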
Processing Resources
Often, a language processing system follows several discrete steps. For example, a translation application must first analyze the source text to arrive at some representation of meaning before it can begin deciding upon target language structures that parallel that meaning. A typical language analysis process
follows such stages as text structure analysis, tokenization, morphological analysis, syntactic parsing, and semantic analysis. The exact breakdown varies widely and is to some extent dependent on method; some statistical work early in the second wave of the application of these types of method completely ignored the conventional language analysis steps in favor of a technique based on a memory of parallel texts (Brown et al., 1990). Later work has tended to accept the advantages of some of these stages, however, though they may be moved into an off-line corpus annotation process, such as the Penn Treebank (Marcus et al., 1993).
Each of these stages is represented by components that perform processes on text and use components containing data about language, such as lexicons and grammars. In other words, the analysis steps are realized as a set of processing resources (PRs). Several architectural questions arise in this context:
1. Is the execution of the PRs best done serially or in parallel?
2. How should PRs be represented such that their discovery on a network and loading into an executive process are transparent to the developer of their linguistic functions?
3. How should distribution across different machines be handled?
4. What information should be stored about components, and how should it be represented?
5. How can commonalities among component sets be exploited?
6. How should the components communicate information between each other? (This question can also be stated as, ‘How should information about text and speech be represented?’)
This section reviews work that addresses questions 1–5. The issue of representing information about language is addressed in the following section.
Locating and Loading
There are several reasons why PR components should be separate from the controlling application that executes them:
• There will often be a many-to-one relation between applications and PRs. Any application using language analysis technology needs a tokenizer component, for example.
• A PR may have been developed for one computing platform, such as UNIX, but the application wishing to use it may operate on another (e.g., Windows).
• The processing regime of the application may require linear or asynchronous execution; this choice should be isolated from the component structures as far as possible to promote generality and encourage reuse.
• PR developers should not be forced to deal with application-level software engineering issues, such as how to manage installation, distribution over networks, exception handling, and so on.
• Explicit modeling of components allows exploitation of modern component infrastructures, such as JavaBeans or ActiveX.
Accordingly, many papers on infrastructural software for LE separate components from the control executive (e.g., Boitet and Seligman, 1994; Edmondson and Iles, 1994; Koning et al., 1995; Wolinski et al., 1998; Poirier, 1999; Zajac, 1998b; Lavelli et al., 2002; Cunningham et al., 2002a). The term ‘executive’ is used here in the sense of a software entity that executes, or runs, other entities. The questions then are how components become known to control processes or applications and how they are loaded and initialized. A related question is what data should be stored with components to facilitate their use by an executive; see the discussion on metadata below. Much work ignores component-related issues; the rest of this section covers those SALE systems for which the data are available.
The TIPSTER architecture (Grishman, 1997) recognized the existence of the locating and loading problems, but did not provide a full solution to them. The architecture document includes a placeholder for such a solution – in the form of a ‘register annotator’ Application Programmers’ Interface (API) call, which an implementation could use to provide component loading – but the semantics of the call were never specified.
The TalLab architecture “is embedded in the operating system,” which allows its authors to “reuse directly a huge, efficient and reliable amount of code” (Wolinski et al., 1998).
The precise practicalities of this choice are unclear, but it seems that components are stored in particular types of directory structure, which are presumably known to the application at startup time.
The Intarc Communication Environment (ICE) is an “environment for the development of distributed AI systems” (Amtrup, 1995) and part of the Verbmobil real-time speech-to-speech translation project (Kay et al., 1994). ICE provides distribution based around Parallel Virtual Machine (PVM) and a communication layer based on channels. ICE is not specific to LE because the communication channels do not use data structures specific to NLP needs and because document handling issues are left to the individual modules. ICE’s answer to the locating and
loading problem is the Intarc License Server, which is a kind of naming service or registry that stores addressing information for components. Components must themselves register with the server by making an API call (Ice_Attach). The components must therefore link to the ICE libraries and know the location of the license server, as must applications using ICE services.
Following from the ICE work, Herzog et al. (2004) presented the latest in three generations of architecture to arise from the Verbmobil and Smartkom projects, in the shape of the Multiplatform system. This architecture supports multiple distributed components from diverse platforms and implementation languages running asynchronously and communicating via a message-passing substrate.
Corelli (Zajac, 1997) and its successor, Calypso (Zajac, 1998b), are also distributed systems that cater for asynchronous execution. The initial Corelli system implemented much of the CORBA standard (Object Management Group, 1992), and component discovery used a naming and directory service. All communication and distribution were mediated by an object request broker (ORB). Components ran as servers and implemented a small API to allow their use by an executive or application process. In the later Calypso incarnation, CORBA was replaced by simpler mechanisms because of efficiency problems (for a usage example, see Amtrup, 1999). In Calypso, components are stored in a centralized repository, which sidesteps the discovery problem. Loading is handled by requiring components to implement a common interface.
Another distributed architecture based on CORBA is SiSSA (Lavelli et al., 2002). The architecture comprises processors (PRs in our terms), servers for their execution, data containers (LRs), and a manager component called SiSSA Manager, which establishes and removes connections between the processors, according to a user-designed data flow.
SiSSA uses a processor repository to keep information about processors registered with the architecture. Carreras and Padró (2002) reported a distributed architecture specifically for language analyzers.
GATE version 1 (Cunningham et al., 1997) was a single-process, serial execution system. Components had to reside in the same file system as the executive; location was performed by searching a path stored in an environment variable. Loading was performed in three ways, depending on the type of component and which of the GATE APIs it used. GATE version 2 (Cunningham et al., 2002a,b) supports remote components; location is performed by providing one or more component repositories called Collection of REusable Objects for Language Engineering (CREOLE) repositories, which contain XML
definitions of each resource and the types of its parameters (e.g., whether it works with documents or corpora). The user can then instantiate a component by selecting it from the list of available components and choosing its load-time parameters. GATE makes a distinction between load-time and run-time parameters; the former are essential for the working of the module (e.g., a grammar) and need to be provided at load time, whereas the latter can change from one execution to the next (e.g., a document to be analyzed). Components can also be re-initialized, which enables users to edit their load-time data (e.g., grammars) within the graphical environment and then reload the component to reflect the changes. GATE also supports editing of remote language resources and execution of remote components using remote method invocation (RMI); that is, it provides facilities for building client-server applications.
Execution
It seems unlikely that people process language by means of a set of linear steps involving morphology, syntax, and so on. More likely, we deploy our cognitive faculties in a parallel fashion; hence, the term ‘parallel distributed processing’ in neural modeling work (McClelland and Rumelhart, 1986). These kinds of ideas have motivated work on nonlinear component execution in NLP; von Hahn (1994) gave an overview of a number of approaches, and a significant early contribution was the Hearsay speech understanding system (Erman et al., 1980). Examples of asynchronous infrastructural systems include Kasuga (Boitet and Seligman, 1994), Pantome (Edmondson and Iles, 1994), Talisman (Koning et al., 1995), Verbmobil (Görz et al., 1996), TalLab (Wolinski et al., 1998), Xelda (Poirier, 1999), Corelli (Zajac, 1997), Calypso (Zajac, 1998b), SiSSA (Lavelli et al., 2002), Distributed Inquery (Cahoon and McKinley, 1996), and the Galaxy Communicator Software Infrastructure (GCSI-MITRE, 2002). Motivations include the desire for nonlinear execution and for feedback loops in ambiguity resolution (see Koning et al., 1995). In the Inquery and Verbmobil systems, an additional motivation is efficiency.
ICE, the Verbmobil infrastructure, addressed two problems: distributed processing and incremental interpretation. Distribution is intended to contribute to processing speed in what is a very computer-intensive application area (speech-to-speech translation). Incremental interpretation is designed both for speed and to facilitate feedback of results from downstream modules to upstream ones (e.g., to inform the selection of word interpretations from phone lattices using POS
information). ICE’s PVM-based architecture provides for distributed asynchronous execution.
GCSI is an open source architecture for constructing dialogue systems. This infrastructure concentrates on distributed processing, hooking together sets of servers and clients that collaborate to hold dialogues with human interlocutors. Data get passed between these components as attribute/value sets or ‘frames,’ the structuring and semantics of which must be agreed upon on a case-by-case basis. Communication between modules is achieved using a hub. This architectural style tends to treat components as black boxes that are developed using other tool sets. To solve this problem, other support environments can be used to produce GCSI server components, using GCSI as a communication substrate to integrate with other components.
The model currently adopted in GATE is that each PR may run in its own thread if asynchronous processing is required (by default, PRs will be executed serially in a single thread). The set of LRs being manipulated by a group of multithreaded PRs must be synchronized (i.e., all their methods must have locks associated with whichever thread is calling them at a particular point). Synchronization of LRs is performed in a manner similar to the Java collections framework. This arrangement allows the PRs to share data safely. Responsibility for the semantics of the interleaving of data access (who has to write what in what sequence for the system to succeed) is a matter for the user, however.
Metadata
A distinction may be made between the data that language processing components use (or language resources) and data that are associated with components for descriptive and other reasons. The latter are sometimes referred to as ‘metadata’ to differentiate them from the former. In a similar fashion, web content is largely expressed in HTML; data that describe web resources, such as ‘this HTML page is a library catalogue,’ are also called metadata. Relevant standards in this area include the Resource Description Framework (RDF; Lassila and Swick, 1999; Berners-Lee et al., 1999). There are several reasons why metadata should be part of a component infrastructure, including the following:
• to facilitate the interfacing and configuration of components
• to encode version, author, and availability data
• to encode purpose data and allow browsing of large component sets.
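The three roles listed above can be illustrated with a small metadata record. The following Python is a minimal sketch under stated assumptions: the field names, the format labels, and the toy components are all hypothetical, and the record is not RDF and does not follow the schema of any system discussed in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ComponentMetadata:
    """Descriptive data about a component, kept apart from the data it uses."""
    name: str
    version: str        # version, author, and availability data
    author: str
    available: bool
    purpose: str        # purpose data, used for browsing
    input_format: str   # interfacing data: what the component expects
    output_format: str  # interfacing data: what the component produces


# Hypothetical component set; none of these names refer to real software.
REGISTRY = [
    ComponentMetadata("toy-tokenizer", "1.0", "A. Author", True,
                      "tokenization", "plain-text", "token-list"),
    ComponentMetadata("toy-tagger", "0.3", "B. Author", True,
                      "pos-tagging", "token-list", "tagged-token-list"),
    ComponentMetadata("toy-markup-tagger", "2.1", "C. Author", False,
                      "pos-tagging", "marked-up-text", "tagged-token-list"),
]


def browse(registry: List[ComponentMetadata], purpose: str) -> List[str]:
    """Return the names of all components declared for a given purpose."""
    return [m.name for m in registry if m.purpose == purpose]


def can_feed(producer: ComponentMetadata, consumer: ComponentMetadata) -> bool:
    """True if the producer's declared output matches the consumer's input."""
    return producer.output_format == consumer.input_format


tokenizer, tagger, markup_tagger = REGISTRY
print(browse(REGISTRY, "pos-tagging"))
print(can_feed(tokenizer, tagger), can_feed(tokenizer, markup_tagger))
```

Comparing declared formats as plain strings only detects exact matches; it says nothing about whether a mismatch could be bridged by a conversion step, which is a separate design question.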
When components are reused across more than one application or research project, often their input/output (I/O) characteristics have not been designed alongside the other components forming the language-processing capability of the application. For example, one POS tagger may require tokens as input in a one-per-line encoding. Another may require the Standard Generalized Markup Language (SGML) input (Goldfarb, 1990). To reuse the tagger with a tokenizer that produces some different flavor of output, that output must be transformed to suit the tagger’s expectations. In cases where there is an isomorphism between the available output and the required input, a straightforward syntactic mapping of representations is possible. In cases where there is a semantic mismatch, additional processing is necessary. Busemann (1999) addressed component interfacing and described a method for using feature structure matrices to encode structural transformations on component I/O data structures. These transformations essentially reorder the data structures around pre-existing unit boundaries; therefore, the technique assumes isomorphism among the representations concerned. The technique also allows for type checking of the output data during restructuring. TIPSTER (Grishman, 1997), GATE (Cunningham, 2002), and Calypso (Zajac, 1998b) deal with interfacing in two ways. First, component interfaces share a common data structure (e.g., corpora of annotated documents), thus ensuring that the syntactic properties of the interface are compatible. Component wrappers are used to interface to other representations as necessary; for example, a Brill tagger (Brill, 1992) wrapper writes out token annotations in the required one-per-line format, then reads in the tags, and writes them back to the document as annotations. 
Second, where there is semantic incompatibility between the output of one component and the input of another, a dedicated transduction component can be written to act as an intermediary between the two. In Verbmobil a component interface language is used, which constrains the I/O profiles of the various modules (Bos et al., 1998). This language is a Prolog term that encodes logical semantic information in a flat list structure. The principle is similar to that used in TIPSTER-based systems, but the applicability is somewhat restricted by the specific nature of the data structure. Provision of descriptive metadata has been addressed by the Natural Language Software Registry (NLSR; DFKI, 1999) and by the EUDICO distributed corpora project (Brugman et al., 1998a,b). In each case, web-compatible data (HTML and XML, respectively) are associated with components. The NLSR is
purely a browsable description; the EUDICO work links the metadata with the resources themselves, allowing the launching of appropriate tools to examine them. Note that EUDICO has only dealt with language resource components to date. GATE 2 (Cunningham et al., 2002b) uses XML for describing the metadata associated with processing resources in its CREOLE repositories. These metadata are used for component loading and also for launching the corresponding visualization and editing tools.
In addition to the issue of I/O transformation, in certain cases it may be desirable to be able to identify automatically which components are plug-compatible with which other ones, so as to identify possible execution paths through the component set. GATE 1 (Cunningham et al., 1997) addresses automatic identification of execution paths by associating a configuration file with each processing component that details the input (preconditions) and output (post-conditions) in terms of TIPSTER annotation and attribute types (see the section on reference attribution). This information is then used to autogenerate an execution graph for the component set.
Commonalities
To conclude this survey of infrastructural work related to processing, this section looks at the exploitation of commonalities between components. For example, both parsers and taggers have the characteristics of language analyzers. One of the key motivating factors for SALE is to break the ‘software waste cycle’ (Veronis and Ide, 1996) and promote reuse of components. Various researchers have approached this issue by identifying typical component sets for particular tasks (Hobbs, 1993; TIPSTER, 1995; Reiter and Dale, 2000). Work is continuing on providing implementations of common components (Ibrahim and Cummins, 1989; Cheong et al., 1994). The rest of this section describes these approaches. Reiter and Dale have reviewed and categorized Natural Language Generation (NLG) components and systems in some detail. Reiter (1994) argued that a consensus component breakdown has emerged in NLG (and that there is some psychological plausibility for this architecture); the classification was extended in Reiter and Dale (2000). They also discussed common data structures in NLG (as does the RAGS project; see below) and appropriate methods for the design and development of NLG systems. Reiter (1999) argued that the usefulness of this kind of architectural description is to ‘make it easier to describe functionalities and data structures’ and thus facilitate research by creating a common vocabulary
among researchers. He stated that this is a more limited but more realistic goal than supporting the integration of diverse NLG components in an actual software system. The term he used for this kind of descriptive work is a ‘reference architecture,’ which is also the subject of the workshop at which the paper was presented (Mellish and Scott, 1999). The TIPSTER research program developed descriptive or reference architectures for information extraction and for information retrieval. Hobbs (1993) described a typical module set for an IE system. The architecture comprises 10 components, dealing with such tasks as pre-processing, parsing, semantic interpretation, and lexical disambiguation (for a description of the full set, see Gaizauskas and Wilks, 1998). For IR, TIPSTER (1995) describes two functions, search and routing, each with a typical component set (some of which are PRs and some LRs). An architecture for spoken dialogue systems, which divides the task into dialogue management, context tracking, and pragmatic adaptation, is presented in LuperFoy et al. (1998). This in turn leads to an architecture in which various components (realized as agents) collaborate in the dialogue. Some example components are speech recognition, language interpretation, language generation, and speech synthesis. In addition, a dialogue manager component provides high-level control and routing of information among components. The preceding discussion illustrates that there is considerable overlap among component sets developed for various purposes. A SALE that facilitated multipurpose components would cut down on the waste involved in the continual reimplementation of similar components in different contexts. The component model given in Cunningham (2000) is made available in the GATE framework (Cunningham et al., 2002b). This model is based on inheritance: A parser is a type of language analyzer that is a type of processing resource.
Language engineers can choose, therefore, between implementing a more specific interface and adhering to the choices made by the GATE developers for that type, or implementing a more general interface and making their own choices about the specifics of their particular resource. In several cases, work on identifying component commonalities has led to the development of toolkits that aim to implement common tasks in a reusable manner. For example, TARO (Ibrahim and Cummins, 1989) is an OO syntactic analyzer toolkit based on a specification language. A toolkit for building IE systems, exemplified in the MFE IE system, is presented in Cheong et al. (1994).
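The inheritance-based component model described above can be illustrated with a minimal sketch. The class and method names below are hypothetical and chosen only for illustration; they are not the GATE API. The sketch shows the choice open to a language engineer: subclass a specific interface and accept its conventions, or implement only the most general interface.

```python
from abc import ABC, abstractmethod


class ProcessingResource(ABC):
    """Most general interface (hypothetical): anything that can be run on a text."""

    @abstractmethod
    def execute(self, text: str) -> dict:
        ...


class LanguageAnalyzer(ProcessingResource):
    """A more specific type: fixes one convention for all analyzers."""

    def execute(self, text: str) -> dict:
        # Convention chosen at this level: results are keyed by the analyzer's class name.
        return {type(self).__name__: self.analyze(text)}

    @abstractmethod
    def analyze(self, text: str) -> list:
        ...


class WhitespaceTokenizer(LanguageAnalyzer):
    """Toy analyzer that adheres to the choices made by LanguageAnalyzer."""

    def analyze(self, text: str) -> list:
        return text.split()


class CharacterCounter(ProcessingResource):
    """Implements only the general interface and makes its own choices."""

    def execute(self, text: str) -> dict:
        return {"length": len(text)}


tokenizer = WhitespaceTokenizer()
print(tokenizer.execute("a toy sentence"))
print(isinstance(tokenizer, ProcessingResource))  # an analyzer is also a processing resource
```

Code that only needs to run components can be written against the general interface, while code that needs analyzer-specific behavior can rely on the narrower one.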
Language Resources

As described above, language resources are data components, such as lexicons, corpora, and language models. They are the raw materials of language engineering. This section covers five issues relating to infrastructure for LRs:
1. computational access (local and nonlocal)
2. managing document formats and document collections (corpora), including multilingual resources
3. representing information about corpora (language data or performance modeling)
4. representing information about language (data about language or competence modeling)
5. indexing and retrieval of language-related information.
Note also that the advantages of a component-based model presented (in relation to PRs) in the section on locating and loading PRs also apply to LRs.

Programmatic Access
LRs are of worth only inasmuch as they contribute to the development and operation of PRs and the language processing research prototypes, experiments, and applications that are built from them. A key issue in the use of LRs for language processing purposes is that of computational access. Suppose that a developer is writing a program to generate descriptions of museum catalogue items; this program may require synonyms, for example, in order to lessen repetition. Several sources for synonyms are available, such as WordNet (Miller, 1990) or Roget’s Thesaurus. To reuse these sources, the developer needs to access the data in these LRs from their program. Although the reuse of LRs has exceeded that of PRs (Cunningham et al., 1994), in general, there are still two barriers to LR access and hence LR reuse: (1) each resource has its own representation syntax and corresponding programmatic access mode (e.g., SQL for Celex, C or Prolog for WordNet); and (2) resources must generally be installed locally to be usable, and how this is done depends on what operating systems are available, what support software is required, and the like, which vary from site to site. A consequence of the first barrier is that, although resources of the same type usually have some structure in common (for example, at one of the most general levels of description, lexicons are organized around words), this commonality cannot be exploited when it comes to using a new resource. In each case, the user has to adapt to a new data structure; this adaptation is a significant overhead. Work that seeks to investigate or exploit commonalities among
resources has first to build a layer of access routines on top of each resource. So, for example, if one wished to do task-based evaluation of lexicons by measuring the relative performance of an IE system with different instantiations of lexical resource, one would typically have to write code to translate several different resources into SQL or some other common format. Similarly, work, such as Jing and McKeown (1998) on merging large-scale lexical resources (including WordNet and Comlex) for NLG, must deal with this problem. There have been two principal responses to this problem: standardization and abstraction. The standardization solution seeks to impose uniformity by specifying formats and structures for LRs. So, for example, the EAGLES working groups have defined standards for lexicons, corpora, and so on (EAGLES, 1999). More recently, Ide and Romary (2004) reported the creation of a framework for linguistic annotations as part of the work of ISO standardization Technical Committee 37, Sub-Committee 4, whose objective is to prepare various standards by specifying principles and methods for creating, coding, processing and managing language resources, such as written corpora, lexical corpora, speech corpora, dictionary compiling and classification schemes. These standards will also cover the information produced by natural language processing components in these various domains.
The work reported here is from Working Group 1 of the committee, which has developed a linguistic annotation framework based on XML (eXtensible Markup Language), RDF(S) (Resource Description Framework (Schema)), and OWL (Web Ontology Language). Although standardization would undoubtedly solve the representation problem, there remains the question of existing LRs (and of competing standards). Peters et al. (1998) and Cunningham et al. (1998) described experiments with an abstraction approach based on a common object-oriented model for LRs that encapsulates the union of the linguistic information contained in a range of resources and encompasses as many object hierarchies as there are resources. At the top of the resource hierarchies are very general abstractions; at the leaves are data items specific to individual resources. Programmatic access is available at all levels, allowing the developer to select an appropriate level of commonality for each application. Generalizations are made over different object types in the resources, and the object hierarchies are linked at whatever levels of description are appropriate. No single view of the data is imposed on the user, who may choose to stay with the ‘original’
representation of a particular resource or to access a model of the commonalities among several resources, or a combination of both. A consequence of the requirement for local installation – the second barrier to LR access – is that users may have to adjust their computing environments to suit resources tailored to particular platforms. In addition, there is no way to ‘try before you buy,’ no way to examine an LR for its suitability for one’s needs before licensing it in toto. Correspondingly, there is no way for a resource provider to give limited access to their products for advertising purposes or to gain revenue through piecemeal supply of sections of a resource. This problem of nonlocal access has also attracted two types of responses, which can be broadly categorized as web browsing and distributed databases. Several sites now provide querying facilities from HTML pages, including the Linguistic Data Consortium and the British National Corpus server. So, for example, all occurrences of a particular word in a particular corpus may be found via a web browser. This is a convenient way to access LRs for manual investigative purposes, but it is not suited to (or intended for) access by programs. Moving beyond browsing, several papers report work on programmatic access using distributed databases. Fikes and Farquhar (1999) showed how ontologies may be distributed, Brugman et al. (1998a,b) described the EUDICO distributed corpus access system, and Peters et al. (1998) and Cunningham et al. (1998) proposed a system similar to EUDICO, generalized to other types of LR. Some new directions in sharing language resources are discussed in the section on trends. Other issues in the area of access to LRs include that of efficient indexing and search of corpora (see the section, ‘Indexing and Retrieval’), and that of annotation of corpora (see the section on annotation).
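The abstraction approach to programmatic access discussed in this section (a common object-oriented model layered over resources with different native representations) can be sketched as follows. This is a minimal illustration under stated assumptions: the two toy resources, their storage formats, and all class and function names are invented here, and they do not reproduce the formats or APIs of WordNet, Roget’s Thesaurus, or the model of Peters et al. (1998).

```python
from abc import ABC, abstractmethod


class LexicalResource(ABC):
    """General abstraction at the top of the hierarchy: a lexicon is organized around words."""

    @abstractmethod
    def synonyms(self, word: str) -> set:
        ...


class SynsetLexicon(LexicalResource):
    """Hypothetical resource stored as a list of synonym sets."""

    def __init__(self, synsets):
        self.synsets = [set(s) for s in synsets]

    def synonyms(self, word):
        found = set()
        for synset in self.synsets:
            if word in synset:
                found |= synset
        found.discard(word)  # a word is not its own synonym
        return found


class HeadwordThesaurus(LexicalResource):
    """Hypothetical resource stored as headword -> list of related words."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def synonyms(self, word):
        return set(self.entries.get(word, []))


def all_synonyms(word, resources):
    """Client code written once against the common abstraction."""
    result = set()
    for resource in resources:
        result |= resource.synonyms(word)
    return result


resources = [
    SynsetLexicon([["vase", "urn"], ["old", "ancient"]]),
    HeadwordThesaurus({"vase": ["jar", "urn"]}),
]
print(sorted(all_synonyms("vase", resources)))  # ['jar', 'urn']
```

A developer who needs resource-specific detail can still work with the concrete classes directly, which mirrors the idea that no single view of the data is imposed on the user.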
The issue of how to access SGML documents in an efficient manner is discussed in Olson and Lee (1997), who investigated the use of object-oriented databases for storing and retrieving SGML documents. Their conclusions were essentially negative due to the slowness of the databases used. Hendler and Stoffel (1999) discussed how ontologies may be stored and processed efficiently using relational databases, and here the results were more positive.

Documents, Formats, and Corpora
Documents play a central role in LE. They are the subject of analysis for such technologies as IE, and
they are both analyzed and generated in technologies such as MT. In addition, a large amount of work uses annotated documents as training data for machine learning of numerical models. Previous work on LE infrastructure has developed models for documents and corpora, provided abstraction layers for document formats, and investigated efficient storage of documents in particular formats. Documents may contain text, audio, video or a mixture of these formats; documents with a mixture of formats are referred to as multimedia documents. The underlying data are frequently accompanied by formatting information (delineating titles, paragraphs, areas of bold text, etc.) and, in the LE context, by annotation (storing linguistic data, such as gesture tags, POS tags, or syntax trees). Both formatting and annotation come in a wide variety of formats, including proprietary binary data, such as MS Word’s .doc or Excel’s .xls; semi-open, semi-readable formats, such as Rich Text Format (Word’s exchange format); and nonproprietary standardized formats, such as HTML, XML, or GIF (Graphics Interchange Format). The Text Encoding Initiative (TEI; Sperberg-McQueen and Burnard, 1994, 2002), the Corpus Encoding Standard (CES; Ide, 1998), and XCES (Ide et al., 2000) are models of documents and corpora that aim to standardize the representation of structural and linguistic data for textual documents. The general approach is to represent all information about document structure, formatting, and linguistic annotation using SGML/XML. The issue of document formats has been addressed by several TIPSTER-based systems, including GATE and Calypso, and by the HTK speech recognition toolkit (Young et al., 1999). In the HTK toolkit, the approach is to provide API calls that deal with documents in various known formats (e.g., Windows audio format, MPEG) independent of those formats.
For example, a speech recognizer can access the raw audio from these documents without knowing anything about the representation format. The TIPSTER systems deal with formats by means of input filters that contain knowledge about the format encoding and use that knowledge to unpack format information into annotations. TIPSTER also supplies a model of corpora and data associated with both corpus and documents (Grishman, 1997). Note that the two approaches are not mutually exclusive: Ogden (1999) has defined a mapping between TEI/CES and TIPSTER annotations. Another important issue that needs to be dealt with in infrastructures supporting LRs in multiple languages is the problem of editing and displaying multilingual information. It is often thought that the
character sets problem has been solved by use of the Unicode standard. This standard is an important advance, but in practice the ability to process text in a large number of the world’s languages is still limited by (1) incomplete support for Unicode in operating systems and applications software, (2) languages missing from the standard, and (3) difficulties in converting non-Unicode character encodings to Unicode. To deal with all these issues, including displaying and editing of Unicode documents, GATE provides a Unicode Kit and a specialized editor (Tablan et al., 2002). In addition, all processing resources and visualization components are Unicode-compliant.

Annotation
One of the key issues for much of the work done in this area is how to represent information about text and speech. This kind of information is sometimes called ‘language data,’ distinguishing it from ‘data about language’ in the form of lexicons, grammars, etc. Two broad approaches to annotation have been taken: to use markup (e.g., SGML/XML) or to use annotation data structures with references or pointers to the original (e.g., TIPSTER, ATLAS). Interestingly, the differences between the two kinds of approaches have become less pronounced in recent work. SGML used to involve embedding markup in the text; TIPSTER (and related systems) use a referential scheme where the text remains unchanged and annotation refers to it by character offsets. The embedding approach has several problems, including the difficulty of extending the model to cope with multimedia data (Nelson, 1997; Cunningham et al., 1997; Bird and Liberman, 1999a). Partly in response to these difficulties and as part of the rebirth of SGML as XML (Goldfarb and Prescod, 1998), the ‘ML’ community has adopted a referential scheme itself, which is now known as ‘stand-off markup.’ The data models of the various systems are now much closer than they were before XML existed, and the potential for interoperation between referential systems, such as GATE and XML-based architectures, is greater as a result. GATE exploits this potential by providing input from and output to XML in most parts of the data model (Cunningham et al., 2002a,b).

Markup-Based Architectures

Language data can be represented by embedding annotation in the document itself, at least in the case of text documents; users of embedding typically transcribe speech documents before markup or use ‘stand-off markup.’ The principal examples of embedded markup for language data use the Standard Generalized Markup Language (SGML; Goldfarb, 1990). SGML is a
‘meta-language,’ a language used to create other languages. The syntax of SGML is therefore abstract, with each document filling in this syntax to obtain a concrete syntax and a particular markup language for that document. In practice, certain conventions are so widespread as to be de facto characteristics of SGML itself. For example, annotation is generally delimited by <TAG> and </TAG> pairs, often with some attributes associated, such as <TAG ATTRIBUTE=value>. The legitimate tags (or ‘elements’) and their attributes and values must be defined for each class of document, using a Document-Type Definition (DTD). The DTD does not specify what the markup means; it is the grammar that defines how the elements may be legally combined and in what order in a particular class of text; see Goldfarb (1990). A good example of SGML used for corpus annotation is the British National Corpus (BNC; Burnard, 1995). The HyperText Markup Language (HTML) is an application of SGML and is specified by its own DTD. A difference from ordinary SGML is that the DTD is often cached with software, such as web browsers, rather than being a separate file associated with the documents that instantiate it. In practice, web browsers have been lenient in enforcing conformance to the HTML DTD, which has led to diversity among web pages; this means that HTML DTDs now represent an idealized specification of the language that often differs from its usage in reality. Partly in response to this problem, the eXtensible Markup Language (XML; Goldfarb and Prescod, 1998) was developed. SGML is a complex language: DTDs are difficult to write, and full SGML is difficult to parse. XML made the DTD optional and disallowed certain features of SGML, such as markup minimization. For example, the American National Corpus (ANC; Macleod et al., 2002) uses XML and XCES (Ide et al., 2000) to encode linguistic annotations. One of the problems in the SGML/XML world is that of computational access to and manipulation of markup information.
Addressing this problem, the Language Technology group at the University of Edinburgh developed an architecture and framework based on SGML called the LT Normalized SGML Library (LT NSL; McKelvie et al., 1998). This in turn led to the development of LT XML (Brew et al., 1999), following the introduction of the XML standard. Tools in an LT NSL system communicate via interfaces specified as SGML DTDs (essentially tag set descriptions), using character streams on pipes: a pipe-and-filter arrangement modeled after UNIX-style shell programming. To avoid the need to deal with certain difficult types of SGML (e.g., minimized
markup), texts are converted to a normal form before processing. A tool selects what information it requires from an input SGML stream and adds information as new SGML markup. LT XML is an extension of LT NSL to XML; it makes the normalization step unnecessary. Other similar work in this area includes the XDOC workbench (Rösner and Kunze, 2002), stand-off markup for NLP tools (Artola et al., 2002), and the multilevel annotation of speech (Cassidy and Harrington, 2001).

Reference Annotation I: TIPSTER

The ARPA-sponsored TIPSTER program in the United States, which was completed in 1998, produced a data-driven architecture for NLP systems (Grishman, 1997). Several sites implemented the architecture, for example, in GATE version 1 (Cunningham et al., 1999) and ELLOGON (Petasis et al., 2002); the initial prototype was written by Ted Dunning at the Computing Research Lab of New Mexico State University. In contrast to the embedding approach, in TIPSTER, the text remains unchanged while information about it is stored in a separate database. The database refers to the text by means of offsets; that is, the data are stored by reference. Information is stored in the database in the form of annotations, which associate arbitrary information (attributes) with portions of documents (identified by sets of start/end character offsets or spans). Attributes are often the result of linguistic analysis (e.g., POS tags). In this way, information about texts is kept separate from the texts themselves. In place of an SGML DTD (or XML XSchema), an ‘annotation type declaration’ defines the information present in annotation sets (though few implementations instantiated this part of the architecture). Figure 1 gives an example of TIPSTER annotation; it “shows a single sentence and the result of three annotation procedures: tokenization with part-of-speech assignment, name recognition, and sentence boundary recognition. Each token has a single attribute, its part of speech (POS); . . .; each name also has a single attribute, indicating the type of name: person, company, etc.” (Grishman, 1997).

Figure 1 Example of a TIPSTER annotation.

Documents are grouped into collections (or corpora), each with an associated database storing annotations and such document attributes as identifiers, headlines, etc. The definition of documents and annotations in TIPSTER forms part of an object-oriented model that can deal with inter- as well as intratextual information by means of reference objects that can point at annotations, documents, and collections. The model also describes elements of IE and IR systems relating to their use, providing classes representing queries and information needs. TIPSTER-style models have several advantages and disadvantages. Texts may appear to be one-dimensional, consisting of a sequence of characters, but this view is incompatible with such structures as tables, which are inherently two-dimensional. Their representation and manipulation are easier in a referential model like TIPSTER than in an embedding one like SGML, in which markup is stored in a one-dimensional text string. In TIPSTER, a column of a table can be represented as a single object with multiple references to parts of the text (an annotation with multiple spans, or a document attribute with multiple references to annotations). Marking columns in SGML requires a tag for each row of the column, and manipulation of the structure as a whole necessitates traversal of all the tags and construction of some other, non-SGML data structure.
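The referential scheme described above, in which the text stays unchanged and annotations point into it by character offsets, can be sketched as follows. This is an illustrative sketch only: the `Annotation` class, the sample sentence, and the tag values are assumptions made here and are not taken from the TIPSTER specification or from Figure 1.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A TIPSTER-style annotation record (hypothetical field names)."""
    id: str
    type: str
    spans: list                      # list of (start, end) character offsets
    attributes: dict = field(default_factory=dict)


def covered_text(text, annotation):
    """Return the pieces of the unchanged text that an annotation refers to."""
    return [text[start:end] for start, end in annotation.spans]


text = "Sarah savored the soup."

annotations = [
    Annotation("1", "token", [(0, 5)], {"pos": "NP"}),
    Annotation("2", "token", [(6, 13)], {"pos": "VBD"}),
    Annotation("3", "token", [(14, 17)], {"pos": "DT"}),
    Annotation("4", "token", [(18, 22)], {"pos": "NN"}),
    Annotation("5", "token", [(22, 23)]),
    Annotation("6", "name", [(0, 5)], {"name_type": "person"}),
    Annotation("7", "sentence", [(0, 23)]),
]

for ann in annotations:
    print(ann.id, ann.type, covered_text(text, ann), ann.attributes)
```

Because `spans` is a list, a single annotation can refer to several noncontiguous stretches of text, which is how a table column could be represented as one object.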
Distributed control has a relatively straightforward implementation path in a database-centered system like TIPSTER: the database can act as a blackboard, and implementations can take advantage of well-understood access control technology. In TIPSTER, in contrast to the hyperlinking used in LT XML, there is no need to break up a document into smaller chunks, as the database management system (DBMS) in the document manager can deal efficiently with large data sets and visualization tools can give intelligible views into these data. To cross-refer between annotations is a matter of citing ID numbers, which are themselves indexes into database records and can be used for efficient data access. It is also possible to have implicit links: Simple API calls find all the token annotations subsumed by a
sentence annotation, for example, via their respective byte ranges without any need for additional cross-referencing information. Another advantage of TIPSTER over embedded markup is that an SGML structure like <w id=p4.w1> has to be parsed in order to extract the fact that there is a ‘w’ tag whose ‘id’ attribute is ‘p4.w1’. A TIPSTER annotation is effectively a database record with separate fields for type (e.g., ‘w’), ID, and other attributes, all of which may be indexed and none of which ever requires parsing. There are three principal disadvantages of the TIPSTER approach:
1. Editing of texts requires offset recalculation.
2. TIPSTER specifies no interchange format, and TIPSTER data are weakly typed. There is no effective DTD mechanism, though this may also to an extent be an advantage, as a complex typing scheme can inhibit unskilled users.
3. The reference classes can introduce brittleness in the face of changing data: Unless an application chases all references and updates them as the objects they point to change, the data can become inconsistent. This problem also applies to hyperlinking in embedded markup.

Reference Annotation II: Linguistic Data Consortium

The Linguistic Data Consortium (LDC) has proposed the use of Directed Acyclic Graphs (DAGs) or just Annotation Graphs (AGs) as a unified data structure for text and speech annotation (Bird et al., 2000b). Bird and Liberman (1999b) provided an example of using these graphs to mark up discourse-level objects. This section compares the structure of TIPSTER annotations with the graph format. As discussed above, TIPSTER annotations are associated with documents and have four elements:
1. a type, which is a string
2. an ID, which is a string unique among annotations on the document
3. a set of spans that point into the text of the document
4. a set of attributes.
TIPSTER attributes, which are associated with annotations and with documents and collections of documents, have a name, which is a string, and a value, which may be one of several data types including a string; a reference to an annotation, document, or collection; or a set of strings or references. Some implementors of the architecture, including GATE and Corelli, have relaxed the type requirements on attribute values, allowing any object as a value.
This has the advantage of flexibility and the disadvantage that it makes viewing, editing, and storage of annotations more complex. TIPSTER explicitly models references between annotations with special reference classes. These classes rely on annotations, documents, and collections of documents having unique identifiers. LDC annotations are arcs in a graph, the nodes of which are time points or, by extension, character offsets in a text. Each annotation has a type and a value, which are both atomic. A document may have several different graphs, and graphs can be associated with more than one document; this is not specified in the model. There are no explicit references. Rather, references are handled implicitly by equivalence classes: if two annotations share the same type and value, they are considered co-referential. To refer to particular documents or other objects, an application or annotator must choose some convention for representing those references as strings and use those as annotation values. This seems problematic: an annotation of type Co-reference Chain and value Chain23 should be equivalent to another of the same type and value, but this is not true for an annotation of type PartOfSpeech and value Noun. Because LDC annotation values are atomic, any representation of complex data structures must define its own reference structure to point into some other representation system. TIPSTER has a richer formalism, both because of the complexity of the annotation/attribute part of the model and because documents and collections of documents are an explicit part of the model, as are references among all these objects. The inherent problems with developing a model of a task to be solved in software in isolation from the development of instances of that software are evident in the work of Cassidy and Bird (2000), who discussed the properties of the LDC AG model when stored and indexed in a relational database. 
At that point the authors added identifier fields to annotations to allow referencing without the equivalence class notion.

Reference Annotation III: GATE

GATE version 2 has a reference annotation model that was designed to combine the advantages of the TIPSTER and LDC models:
• Annotation sets are more explicitly graph-based. This feature allows increased efficiency of traversal and simpler editing because offsets are moved from the annotations into a separate node object. In addition, the offsets can be both character and time offsets, thus enabling annotation of multimodal data.
• Multiple annotation sets are allowed on documents. Consider the situation when two people are adding annotations to the same document and later wish to compare and merge their results. TIPSTER would handle this by having an ‘annotator’ attribute on all the annotations. It is much simpler to have disjoint sets.
• Documents and collections are an essential part of the model, and information can be associated with them in similar fashion to that on annotations.
• All annotations have unique identifiers to allow for referencing.
• An annotation has only two nodes, which means that the multiple-span annotations of TIPSTER are no longer supported; the workaround is to store noncontiguous data structures as features of the document and point from there to the multiple annotations that make up the structures.
• The annotation values are extensible (i.e., any classes of object can be added to the model and be associated with annotations).
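The properties listed above can be made concrete with a small sketch of a node-based annotation model. All class and method names here are hypothetical and serve only to illustrate the design; this is not the GATE API. The sketch assumes that nodes are shared per offset, that each annotation has exactly two nodes, and that each annotator writes to a separate, named annotation set.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Node:
    offset: float  # a character offset or a time offset


@dataclass
class Annotation:
    id: int
    type: str
    start: Node
    end: Node
    features: dict = field(default_factory=dict)  # values may be any object


class AnnotationSet:
    def __init__(self, name):
        self.name = name
        self.annotations = {}  # annotation id -> Annotation

    def add(self, annotation):
        self.annotations[annotation.id] = annotation

    def contained_in(self, outer, type_):
        """Annotations of a given type lying within the span of `outer`."""
        return [a for a in self.annotations.values()
                if a.type == type_
                and outer.start.offset <= a.start.offset
                and a.end.offset <= outer.end.offset]


class Document:
    def __init__(self, text):
        self.text = text
        self.features = {}         # information attached to the document itself
        self.annotation_sets = {}  # set name -> AnnotationSet
        self._ids = count(1)       # unique annotation identifiers
        self._nodes = {}           # offset -> shared Node

    def node(self, offset):
        # Offsets live in node objects, so annotations at the same position share a node.
        return self._nodes.setdefault(offset, Node(offset))

    def annotate(self, set_name, type_, start, end, **features):
        aset = self.annotation_sets.setdefault(set_name, AnnotationSet(set_name))
        ann = Annotation(next(self._ids), type_, self.node(start), self.node(end), features)
        aset.add(ann)
        return ann


doc = Document("Sarah savored the soup.")
doc.annotate("alice", "Token", 0, 5, pos="NNP")   # first annotator's set
doc.annotate("bob", "Token", 0, 5, pos="NN")      # second annotator's disjoint set
sentence = doc.annotate("alice", "Sentence", 0, 23)
tokens = doc.annotation_sets["alice"].contained_in(sentence, "Token")
print([(t.id, t.features) for t in tokens])
```

Keeping the two annotators’ results in disjoint sets means they can be compared or merged later without an ‘annotator’ attribute on every annotation.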
In addition, both LDC and TIPSTER need an annotation meta-language to describe – for purposes of validation or configuration of viewing and editing tools – the structure and permissible value set of annotations. GATE uses the XML schema language supported by W3C as an annotation meta-language (Cunningham et al., 2002b). These annotation schemas define which attributes and optionally which values are permissible for each type of annotation (e.g., POS, named entity). For instance, a chosen tag set can be specified as permissible values for all POS annotations. This meta-information enables the annotation tools to control the correctness of the user input, thus making it easier to enforce annotation standards.

Data about Language
The preceding sections described language data, information related directly to examples of the human performance of language. This section considers work on data about language or the description of human language competence. Much work in this area has concentrated on formalisms for the representation of the data and has advocated declarative, constraint-based representations (using feature-structure matrices manipulated under unification) as an appropriate vehicle with which “many technical problems in language description and computer manipulation of language can be solved” (Shieber, 1992). One example of an infrastructure project based on
Attribute-Value Matrices (AVMs) is ALEP, the Advanced Language Engineering Platform. ALEP aims to provide “the NLP research and engineering community in Europe with an open, versatile, and general-purpose development environment” (Simkins, 1992). ALEP, although open in principle, is primarily an advanced system for developing and manipulating feature structure knowledge bases under unification. It also has several parsing algorithms, as well as algorithms for transfer, synthesis, and generation (Schütz, 1994). As such, it is a system for developing particular types of LRs (e.g., grammars, lexicons) and for doing a particular set of tasks in LE in a particular way. The system, despite claiming to use a theory-neutral formalism (in fact an HPSG (Head-driven Phrase Structure Grammar)-like formalism), is still committed to a particular approach to linguistic analysis and representation. It is clearly of utility to those in the LE community who use that class of theories and to whom those formalisms are relevant, but it excludes or at least does not support actively those who are not, including an increasing number of researchers committed to statistical and corpus-based approaches. Other systems that use AVMs include a framework for defining NLP systems based on AVMs (Zajac, 1992); the Eurotra architecture, an ‘open and modular’ architecture for MT promoting resource reuse (Schütz et al., 1991); the DATR morphological lexicon formalism (Evans and Gazdar, 1996); the Shiraz MT Architecture, a chart and unification-based architecture for MT (Amtrup, 1999); a unified FST (Finite State Transducer)/AVM formalism for morphological lexicons (Zajac, 1998a); and the RAGS architecture. A related issue is that of grammar development in an LE context (see Netter and Pianesi, 1997; Estival et al., 1997). Fischer et al. (1996) presented an abstract model of thesauri and terminology maintenance in an OO framework.
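Unification over attribute-value matrices, the operation on which the systems above rely, can be illustrated with a deliberately simplified sketch. It assumes that feature structures are nested Python dictionaries with atomic string or number values; it omits reentrancy (structure sharing) and typing, which real unification-based formalisms provide, and it is not the algorithm of ALEP or any other system named here.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None if they clash.

    Simplifying assumptions: structures are nested dicts, atoms are compared
    by equality, and None is reserved to signal failure.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None  # incompatible values for the same attribute
                result[key] = merged
            else:
                result[key] = value  # information present only in b is added
        return result
    return a if a == b else None


noun_phrase = {"cat": "NP", "agr": {"num": "sg"}}
subject_constraint = {"agr": {"num": "sg", "per": 3}}
plural_constraint = {"agr": {"num": "pl"}}

print(unify(noun_phrase, subject_constraint))
# {'cat': 'NP', 'agr': {'num': 'sg', 'per': 3}}
print(unify(noun_phrase, plural_constraint))
# None
```

The first call succeeds because the two structures carry compatible information, which is merged; the second fails because singular and plural number cannot both hold.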
ARIES is a formalism and development tool for Spanish morphological lexicons (Goni et al., 1997). The Reference Architecture for Generation Systems (RAGS) project (Cahill et al., 1999a,b) has concentrated on describing structures that may be shared among NLG component interfaces. This choice is motivated by the fact that the input to a generator is not a document, but a meaning representation. RAGS describes component I/O using a nested feature matrix representation, but does not describe the types of LR that an NLG system may use or the way in which components may be represented, loaded, and so on. More recently, Mellish et al. (2004) presented the RAGS conceptual framework and Mellish and
Evans (2004) discussed the implementation of this framework in several experimental systems and how these systems illustrate a wider range of issues for the construction of SALE for generation.
Indexing and Retrieval
Modern corpora, and annotations upon them, frequently run to many millions of tokens. To enable efficient access to this data, the tokens and annotation structures must be indexed. In the case of raw corpora, this problem equates to information retrieval (IR; also known as document detection), a field with a relatively well-understood set of techniques based on treating documents as bags of stemmed words and retrieving based on relative frequency of these terms in documents and corpora (see van Rijsbergen, 1979). Although these processes are well understood and relatively static, IR is an active research field, partly because existing methods are imperfect and partly because that imperfection becomes more and more troubling in the face of the explosion of web content. There have been several attempts to provide SALE systems in this context. As noted above, the TIPSTER (1995) program developed a reference model of typical IR component sets. More concretely, this program also developed a communication protocol based on Z39.50 for the detection of interactions between the querying application and search engine (Buckley, 1998). The annotation and attribute data structures described earlier were also applied for IR purposes, although the practical applications of the architecture were found in general to be too slow for the large data sets involved. GATE (Cunningham et al., 2002a,b) uses an extendable, open-source IR engine, Lucene, to index documents and corpora for full-text retrieval. Lucene also allows indexing and retrieval by custom-provided fields like annotations. The model used to wrap Lucene in GATE is designed for extensibility to other IR systems when required. Whereas the problem of indexing and retrieving documents is well understood, the problem of indexing complex structures in annotations is more of an open question.
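The bag-of-stemmed-words approach mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any system cited here: the suffix-stripping `crude_stem` function stands in for a real stemmer, the scoring is a simple TF-IDF variant with smoothed inverse document frequency, and the sample documents are invented.

```python
import math
from collections import Counter
from typing import Dict, List


def crude_stem(token: str) -> str:
    """Toy stemmer (an assumption): strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def bag_of_stems(text: str) -> Counter:
    """Represent a text as an unordered bag (multiset) of stemmed tokens."""
    return Counter(crude_stem(tok) for tok in text.lower().split())


def rank(query: str, docs: Dict[str, str]) -> List[str]:
    """Return document ids ordered by a simple TF-IDF score for the query."""
    bags = {doc_id: bag_of_stems(text) for doc_id, text in docs.items()}
    n_docs = len(bags)

    def idf(term: str) -> float:
        # Smoothed inverse document frequency over the whole collection.
        df = sum(1 for bag in bags.values() if term in bag)
        return math.log((1 + n_docs) / (1 + df)) + 1.0

    query_terms = bag_of_stems(query)
    scores = {}
    for doc_id, bag in bags.items():
        length = sum(bag.values()) or 1
        # Relative term frequency in the document, weighted by rarity.
        scores[doc_id] = sum((bag[t] / length) * idf(t) for t in query_terms)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Invented sample collection.
docs = {
    "d1": "parsing algorithms for unification grammars",
    "d2": "indexing annotated corpora for retrieval",
    "d3": "retrieval of indexed documents and corpora",
}
ranking = rank("indexing corpora", docs)
print(ranking)
```

Because word order is discarded, "indexing" and "indexed" fall together after stemming, which is the intended effect of the representation and also the source of its imprecision.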
The Corpus Query System (Christ, 1994, 1995) is the most-cited source in this area, providing indexing and search of corpora and later of WordNet. Similar ideas have been implemented in CUE (Mason, 1998) for indexing and search of annotated corpora and at the W3-Corpora site (University of Essex, 1999) for searchable on-line annotated corpora. Some work on indexing in the LT XML system was reported in McKelvie and Mikheev (1998). Bird et al. (2000a) proposed a query language for the LDC annotation graph model, called AGQL.
Cassidy (2002) discussed the use of XQuery as an annotation query language and concluded that it is good for dealing with hierarchical data models like XML, but needs extending with better support for sequential data models, such as annotation graphs. GATE indexes and retrieves annotations by storing them in a relational database, indexed by type, attributes, and their values. In this way, it is possible to retrieve all documents that contain a given attribute and/or value or to retrieve all annotations of a given type in a corpus, without having to traverse each document separately (Bontcheva et al., 2002; Cunningham et al., 2002b). The query language used is SQL.
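The relational approach described above can be illustrated with a small, self-contained sketch using Python's built-in `sqlite3` module. The schema is an assumption made for this example and is not GATE's actual database layout: the table names `annotation` and `feature`, the column names, and the helper functions are all hypothetical, and the sample rows are invented.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical schema: one row per annotation, one row per attribute/value pair.
SCHEMA = """
CREATE TABLE annotation (
    id           INTEGER PRIMARY KEY,
    doc_id       TEXT    NOT NULL,
    type         TEXT    NOT NULL,
    start_offset INTEGER NOT NULL,
    end_offset   INTEGER NOT NULL
);
CREATE TABLE feature (
    annotation_id INTEGER NOT NULL REFERENCES annotation(id),
    name          TEXT    NOT NULL,
    value         TEXT    NOT NULL
);
-- Indexes by type and by attribute/value, so queries avoid scanning documents.
CREATE INDEX idx_annotation_type ON annotation(type);
CREATE INDEX idx_feature_name_value ON feature(name, value);
"""


def build_index() -> sqlite3.Connection:
    """Create an empty in-memory annotation store."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_annotation(conn: sqlite3.Connection, doc_id: str, ann_type: str,
                   start: int, end: int, features: Dict[str, str]) -> int:
    """Store one annotation and its attribute/value pairs; return its id."""
    cur = conn.execute(
        "INSERT INTO annotation (doc_id, type, start_offset, end_offset) "
        "VALUES (?, ?, ?, ?)",
        (doc_id, ann_type, start, end),
    )
    ann_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO feature (annotation_id, name, value) VALUES (?, ?, ?)",
        [(ann_id, name, value) for name, value in features.items()],
    )
    return ann_id


def documents_with(conn: sqlite3.Connection, ann_type: str,
                   name: str, value: str) -> List[str]:
    """All documents containing an annotation of this type with this value."""
    rows = conn.execute(
        "SELECT DISTINCT a.doc_id FROM annotation a "
        "JOIN feature f ON f.annotation_id = a.id "
        "WHERE a.type = ? AND f.name = ? AND f.value = ? "
        "ORDER BY a.doc_id",
        (ann_type, name, value),
    ).fetchall()
    return [row[0] for row in rows]


def annotations_of_type(conn: sqlite3.Connection,
                        ann_type: str) -> List[Tuple[str, int, int]]:
    """All annotations of a type across the corpus, as (doc, start, end)."""
    return conn.execute(
        "SELECT doc_id, start_offset, end_offset FROM annotation "
        "WHERE type = ? ORDER BY doc_id, start_offset",
        (ann_type,),
    ).fetchall()


conn = build_index()
add_annotation(conn, "doc1", "Person", 0, 4, {"name": "Bush"})
add_annotation(conn, "doc2", "Person", 10, 15, {"name": "Blair"})
add_annotation(conn, "doc3", "Person", 3, 7, {"name": "Bush"})
add_annotation(conn, "doc3", "Location", 20, 26, {"name": "London"})
print(documents_with(conn, "Person", "name", "Bush"))
```

Keeping attribute/value pairs in their own table is one simple way to let arbitrary features be indexed without changing the schema; it trades some query complexity (a join per constraint) for that flexibility.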
Recent Trends and Future Directions
As has become evident from the work reviewed here, there are many tools and architectures, and many of these are focused on subareas of NLP (e.g., dialog speech) or specific formalisms (e.g., HPSG). Each of these infrastructures offers specialized solutions, so it is not likely that there will ever be only one universal architecture or infrastructure. Instead, the focus in recent work has been on ‘inter-operability’, allowing infrastructures to work together, and reusability, enabling users to reuse and adapt tools with a minimum effort. We review some of these new trends here to see how they are likely to influence the next period of research on SALE.
Toward Multipurpose Repositories
To support the reusability of resources, several repositories have been established; some describe NLP tools (e.g., ACL Natural Language Software Registry), and others distribute language resources, such as corpora and lexicons (e.g., ELRA and LDC). To date, these repositories have remained largely independent of each other, with the exception of such repositories as TRACTOR (Martin, 2001), which contain both corpora in a number of languages and specialized tools for corpus analysis. As argued in Declerck (2001), there is a need to link the two kinds of repositories to allow corpus researchers to find the tools they need to process corpora and vice versa. The idea is to create a multipurpose infrastructure for the storage and access of both language data and the corresponding processing resources. One of the cornerstones of such an infrastructure is metadata, associated with each resource and pointing at other relevant resources (e.g., tools pointing at the language data that they need and
can process). The following section discusses recent research on metadata descriptions for tools and language resources, including handling of multimodal and multilingual data.
Resource Metadata and Annotation Standards
As discussed earlier, there are several reasons why metadata should be part of a component infrastructure (i.e., why it is useful beyond the more narrow scope of providing descriptions of resources in a repository). One dimension that affects the kinds of metadata needed to describe resources is their type: whether they are documents in a corpus, a lexicon, or a tool working on language data. For example, the ISLE Computational Lexicon working group has developed a modular architecture, called MILE, designed to factor out linguistically independent primitive units of lexical information; deal with monolingual, bilingual, and multilingual lexicons; and avoid theoretical bias (Calzolari et al., 2001). Some of these desiderata are relevant also to the problem of resource distribution, as discussed in the section on programmatic access and in Cunningham et al. (2000). Multimedia/multimodal language resources (MMLR) pose a different set of problems, and existing standards for tagging textual documents (e.g., XCES; Ide et al., 2000) are not sufficient. Broeder and Wittenburg (2001) provided a metadata vocabulary for MMLR, which encodes information related to the media files (e.g., format and size) and the annotation units used (e.g., POS), as well as the basic information on creator, content, and so on. Another aspect of improving resource reusability and interoperability is the development of standards for encoding annotation data. Ide and Romary (2002) described a framework for linguistic annotations based on XML and the XML-based RDF and DAML+OIL standards for defining the semantics of the annotations. It provides a link with recent work on formal ontologies and the semantic web and enables the use of the related knowledge management tools to support linguistic annotation. For example, Collier et al. (2002) used the popular Protégé ontology editor as a basis for an annotation tool capable of producing RDF(S) annotations of language data in multiple languages.
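To make the kind of resource metadata discussed above more concrete, the following Python sketch builds a small XML description of a multimodal resource, recording media file format and size, the annotation units used, and basic creator and content information. The element and attribute names (`resource`, `mediaFile`, `annotationUnit`, and so on) and the sample values are hypothetical; they are not taken from the vocabulary of Broeder and Wittenburg (2001) or from any published standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class MediaFile:
    path: str
    media_format: str  # e.g. a MIME type
    size_bytes: int


@dataclass
class ResourceMetadata:
    title: str
    creator: str
    content: str  # short free-text description of the content
    media_files: List[MediaFile] = field(default_factory=list)
    annotation_units: List[str] = field(default_factory=list)  # e.g. "POS"


def to_xml(meta: ResourceMetadata) -> str:
    """Serialize a metadata record to XML (hypothetical element names)."""
    root = ET.Element("resource")
    ET.SubElement(root, "title").text = meta.title
    ET.SubElement(root, "creator").text = meta.creator
    ET.SubElement(root, "content").text = meta.content
    for media in meta.media_files:
        ET.SubElement(root, "mediaFile", {
            "path": media.path,
            "format": media.media_format,
            "size": str(media.size_bytes),
        })
    for unit in meta.annotation_units:
        ET.SubElement(root, "annotationUnit").text = unit
    return ET.tostring(root, encoding="unicode")


# Invented example record.
example = ResourceMetadata(
    title="Sample dialogue recording",
    creator="Example Lab",
    content="Two-speaker dialogue with part-of-speech annotation",
    media_files=[MediaFile("session01.wav", "audio/wav", 1048576)],
    annotation_units=["POS"],
)
record = to_xml(example)
print(record)
```

A record of this kind is what a repository would index for resource discovery; expressing the same fields in RDF rather than ad hoc XML would additionally allow them to be linked to ontologies, as in the work cited above.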
Open Archives
One of the new research directions is toward ‘open archives,’ archives aiming to make resources easily discoverable, accessible, and identifiable. This work not only includes language resources, such as corpora and lexicons, but also software tools (i.e., processing
resources and development environments). Resource discovery is made possible by metadata associated with each resource and made available in a centralized repository. The recently established Open Language Archives Community (OLAC; Bird and Simons, 2001; Bird et al., 2002) aims to create a worldwide virtual library of language resources through the development of inter-operating repositories and tools for their maintenance and access. OLAC also aims to establish and promote best practices in archiving for language resources. The OLAC infrastructure is based on two initiatives from digital library research: the Open Archives Initiative and the Dublin Core initiative for resource metadata. Currently, OLAC comprises 12 archives with a cross-archive searching facility. As argued in Wynne (2002), the current trends toward multilinguality and multimodality suggest that the language resources of the future will span languages and modalities, will be distributed over many repositories, and will form virtual corpora, supported by a diverse set of linguistic analysis and searching tools. As already discussed, metadata and annotation standards play a very important role here. The other major challenge lies in making existing processing resources accessible over the web and enhancing their reusability and portability.
Component Reusability, Distributed Access, and Execution
To enable virtual corpora and collaborative annotation efforts spanning country boundaries, software infrastructures and tools need to control user access to different documents, types of annotations, and metadata. Ma et al. (2002) discussed how this access can be achieved by using a shared relational database as a storage medium, combined with a number of annotation tools based on the annotation graph formalism discussed in the section on the Linguistic Data Consortium. The same approach has been taken in GATE (Cunningham et al., 2002b), in which all LRs and their associated annotations can be stored in Oracle or PostgreSQL. This feature enables users to access remote LRs, index LRs by their annotations, and construct search queries retrieving LRs given annotations or metadata constraints (e.g., find all documents that contain person entities called Bush). User access is controlled at the individual and group level, with read/write access rights specified at LR creation time by their owner (the user who has first stored the LR in the database). Because the storage mechanisms in GATE are separate from the API used for accessing LRs and annotations, the visualization tools and processing resources work on both local
and remote data in the same way. Ma et al. (2002) discussed a special version of the AGTK TableTrans tool created to work with the database annotations. In addition, GATE’s database storage model supports other LRs, such as lexicons and ontologies. The recent development of web services enables integration of different information repositories and services across the Internet and offers a new way of sharing language resources. Dalli (2002) discussed an architecture for web-based inter-operable LRs based on SOAP and web services. Work in progress extends this approach to processing resource execution in the context of on-line adaptive information extraction (see Tablan et al., 2003). Both make extensive use of XML for metadata description. However, the benefits of the relational database storage mechanism can still be maintained by providing a conversion layer, which transforms the stored LRs and annotations into the desired XML format when needed. Similarly, Todirascu et al. (2002) described an architecture that uses SOAP to provide distributed processing resources as services on the web, both as a protocol for message passing and as a mechanism for executing remote modules from the client. Bontcheva et al. (2004) reported recent work in upgrading GATE to meet challenges posed by research on the semantic web, large-scale digital libraries, and machine learning for language analysis. Popov et al. (2004) presented an application that combines several SALE systems, including GATE and Sesame, to create a platform for semantic annotation called KIM (Knowledge and Information Management). Their paper covered several issues relating to building scalable ontology-based information extraction.
Measurement
A persistent theme in SALE work has been measurement, quantitative evaluation, and the relationship between engineering practice and scientific theory. To quote Lord Kelvin, in a lecture to the Institution of Civil Engineers in London in 1883:
When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the stage of science.
On the other hand, Einstein tells us:
Not everything that counts can be counted, and not everything that can be counted counts (from a sign hanging in Einstein’s office at Princeton University).
Researchers have taken similarly varied approaches to measurement, both of component systems developed using SALE systems and of the success of those systems themselves. The presentation of IBM’s TEXTRACT architecture by Neff et al. (2004) included an illustration of how the same mechanism can be used both for producing quantitative metrics and for giving users visual feedback on the results of automated processing. Ferrucci and Lally (2004) reported a successor to TEXTRACT called UIMA (Unstructured Information Management Architecture), which is in active development to support the work of several hundred R&D staff working in areas as diverse as question answering and machine translation. The significant commitment of IBM to SALE development indicates the success of the TEXTRACT concept and of architectural support for language processing research.
Prognosis
The principal defining characteristic of NLE work is its objective: to engineer products that deal with natural language and that satisfy the constraints in which they have to operate. This definition may seem tautologous or a statement of the obvious to an engineer practicing in another, well-established area (e.g., mechanical or civil engineering), but is still a useful reminder to practitioners of software engineering, and it becomes near-revolutionary when applied to natural language processing. This is partly because of what, in our opinion, has been the ethos of most Computational Linguistics research. Such research has concentrated on studying natural languages, just as traditional linguistics does, but using computers as a tool to model (and, sometimes, verify or falsify) fragments of linguistic theories deemed of particular interest. This is of course a perfectly respectable and useful scientific endeavor, but does not necessarily (or even often) lead to working systems for the general public (Boguraev et al., 1995).
Working systems for public consumption require qualities of robustness that are unlikely to be achieved at zero cost as part of the normal development of experimental systems in language computation research (Maynard et al., 2002). Investing the time and energy necessary to create robust reusable software is not always the right thing to do, of course; sometimes what is needed is a quick hack to explore some simple idea with as little overhead as possible. To conclude that this is always the case is a rather frequent error, however, and is of particular concern at a time when web-scale challenges to language processing are common.
Also problematic for SALE is the fact that it is not always easy to justify the costs of engineered systems when developers of more informal and short-term solutions have been known to make claims for their power and generality that are, shall we say, somewhat optimistic. The fact that the majority of the language processing field continues to use a SALE system of one type or another indicates that this has been a fruitful pursuit.
Acknowledgments
The authors were partly supported by EPSRC grant GR/N15764/01 (AKT) and by EU grants SEKT, PrestoSpace, and Knowledge Web.
See also: Human Language Technology; Language Processing: Statistical Methods; Natural Language Processing: System Evaluation; Text Retrieval Conference and Message Understanding Conference.
Bibliography
All websites have been confirmed as live before publication, but may change post-publication.
Amtrup J (1995). ICE – INTARC Communication Environment user guide and reference manual version 1.4. University of Hamburg.
Amtrup J (1999). ‘Architecture of the Shiraz Machine Translation System.’ http://crl.nmsu.edu/shiraz/archi.html.
Artola X, de Ilarraza A D, Ezeiza N, Gojenola K, Hernández G & Soroa A (2002). ‘A class library for the integration of NLP tools: definition and implementation of an abstract data type collection for the manipulation of SGML documents in a context of stand-off linguistic annotation.’ In Proceedings of LREC 2002 Third International Conference on Language Resources and Evaluation. Gran Canaria, Spain. 1650–1657.
Berners-Lee T, Connolly D & Swick R (1999). ‘Web architecture: describing and exchanging data.’ Tech. rep., W3C Consortium. http://www.w3.org/1999/04/WebData.
Bird S & Liberman M (1999a). A formal framework for linguistic annotation. Technical report MS-CIS-9901. Philadelphia: University of Pennsylvania. http://xxx.lanl.gov/abs/cs.CL/9903003.
Bird S & Liberman M (1999b). ‘Annotation graphs as a framework for multidimensional linguistic data analysis.’ In Towards standards and tools for discourse tagging. Proceedings of the ACL-99 Workshop. 1–10.
Bird S & Simons G (2001). ‘The OLAC metadata set and controlled vocabularies.’ In Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources. 27–38.
Bird S, Buneman P & Tan W (2000a). ‘Toward a query language for annotation graphs.’ In Proceedings of the Second International Conference on Language Resources and Evaluation. Athens, Greece.
Bird S, Day D, Garofolo J, Henderson J, Laprun C & Liberman M (2000b). ‘ATLAS: a flexible and extensible architecture for linguistic annotation.’ In Proceedings of the Second International Conference on Language Resources and Evaluation.
Bird S, Uszkoreit H & Simons G (2002). ‘The open language archives community.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation.
Boguraev B, Garigliano R & Tait J (1995). ‘Editorial.’ Natural Language Engineering 1(1).
Boitet C & Seligman M (1994). ‘The “Whiteboard” architecture: a way to integrate heterogeneous components of NLP systems.’ In Proceedings of COLING ’94. 426–430.
Bontcheva K, Cunningham H, Tablan V, Maynard D & Saggion H (2002). ‘Developing reusable and robust language processing components for information systems using GATE.’ In Proceedings of the 3rd International Workshop on Natural Language and Information Systems. Aix-en-Provence, France: IEEE Computer Society Press.
Bontcheva K, Tablan V, Maynard D & Cunningham H (2004). ‘Evolving GATE to meet new challenges in language engineering.’ Natural Language Engineering 10(3/4), 349–373.
Booch G (1994). Object-oriented analysis and design (2nd edn.). Amsterdam: Benjamin/Cummings.
Bos J, Rupp C, Buschbeck-Wolf B & Dorna M (1998). ‘Managing information at linguistic interfaces.’ In Proceedings of the 36th ACL and the 17th COLING (ACL-COLING ’98). 160–166.
Brand S (1994). How buildings learn. London: Penguin.
Brew C, McKelvie D, Tobin R, Thompson H & Mikheev A (1999). The XML Library LT XML version 1.1 User documentation and reference guide. Edinburgh: Language Technology Group. http://www.ltg.ed.ac.uk.
Brill E (1992). ‘A simple rule-based part-of-speech tagger.’ In Proceedings of the Third Conference on Applied Natural Language Processing.
Broeder D & Wittenburg P (2001). ‘Multimedia language resources.’ In Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources. 47–51.
Brown P, Cocke J, Pietra S D, Pietra V D, Jelinek F, Lafferty J, Mercer R & Roossin P (1990). ‘A statistical approach to machine translation.’ Computational Linguistics 16, 79–85.
Brugman H, Russel A, Wittenburg P & Piepenbrock R (1998a). ‘Corpus-based research using the Internet.’ In Workshop on Distributing and Accessing Linguistic Resources. Granada, Spain. 8–15. http://www.dcs.shef.ac.uk/~hamish/dalr/.
Brugman H, Russel H & Wittenburg P (1998b). ‘An infrastructure for collaboratively building and using multimedia corpora in the humaniora.’ In Proceedings of the ED-MEDIA/ED-TELECOM Conference.
Buckley C (1998). ‘TIPSTER Advanced Query (DN2). TIPSTER program working paper.’ (Unpublished).
Burnard L (1995). ‘Users reference guide for the British National Corpus.’ http://info.ox.ac.uk/bnc.
Busemann S (1999). ‘Constraint-based techniques for interfacing software modules.’ In Proceedings of the AISB’99 Workshop on Reference Architectures and Data Standards for NLP. Edinburgh: Society for the Study of Artificial Intelligence and Simulation of Behaviour.
Cahill L, Doran C, Evans R, Mellish C, Paiva D, Reape M, Scott D & Tipper N (1999a). ‘Towards a reference architecture for natural language generation systems.’ Tech. Rep. ITRI-99-14; HCRC/TR-102. Edinburgh and Brighton: University of Edinburgh and Information Technology Research Institute.
Cahill L, Doran C, Evans R, Paiva D, Scott D, Mellish C & Reape M (1999b). ‘Achieving theory-neutrality in reference architectures for NLP: to what extent is it possible/desirable?’ In Proceedings of the AISB’99 Workshop on Reference Architectures and Data Standards for NLP. Edinburgh: Society for the Study of Artificial Intelligence and Simulation of Behaviour.
Cahoon B & McKinley K (1996). ‘Performance evaluation of a distributed architecture for information retrieval.’ In Proceedings of SIGIR ’96. 110–118.
Calzolari N, Lenci A & Zampolli A (2001). ‘International standards for multilingual resource sharing: the ISLE computational lexicon working group.’ In Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources. 39–46.
Carreras X & Padró L (2002). ‘A flexible distributed architecture for natural language analyzers.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation. 1813–1817.
Cassidy S (2002). ‘XQuery as an annotation query language: a use case analysis.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation.
Cassidy S & Bird S (2000). ‘Querying databases of annotated speech.’ In Eleventh Australasian Database Conference. Canberra: Australian National University.
Cassidy S & Harrington J (2001). ‘Multi-level annotation in the Emu speech database management system.’ Speech Communication 33, 61–77.
Cheong T, Kwang A, Gunawan A, Loo G, Qwun L & Leng S (1994). ‘A pragmatic information extraction architecture for the message formatting export (MFE) system.’ In Proceedings of the Second Singapore Conference on Intelligent Systems (SPICIS ’94). B371–B377.
Christ O (1994). ‘A modular and flexible architecture for an integrated corpus query system.’ In Proceedings of the Third Conference on Computational Lexicography and Text Research (COMPLEX ’94). http://xxx.lanl.gov/abs/cs.CL/9408005.
Christ O (1995). ‘Linking WordNet to a corpus query system.’ In Proceedings of the Conference on Linguistic Databases.
Clements P & Northrop L (1996). Software architecture: an executive overview. Tech. Rep. CMU/SEI-96-TR-003. Pittsburgh: Software Engineering Institute, Carnegie Mellon University.
Collier N, Takeuchi K, Nobata C, Fukumoto J & Ogata N (2002). ‘Progress on multilingual named entity annotation guidelines using RDF(s).’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation.
Cunningham H (1999). ‘A definition and short history of language engineering.’ Journal of Natural Language Engineering 5(1), 1–16.
Cunningham H (2000). Software architecture for language engineering. Ph.D. diss., University of Sheffield. http://gate.ac.uk/sale/thesis/.
Cunningham H (2002). ‘GATE, a general architecture for text engineering.’ Computers and the Humanities 36, 223–254.
Cunningham H & Scott D (2004). ‘Introduction to the special issue on software architecture for language engineering.’ Natural Language Engineering 10, 205–211.
Cunningham H, Freeman M & Black W (1994). ‘Software reuse, object-oriented frameworks and natural language processing.’ In New methods in language processing (NeMLaP-1). Manchester.
Cunningham H, Humphreys K, Gaizauskas R & Wilks Y (1997). ‘Software infrastructure for natural language processing.’ In Proceedings of the Fifth Conference on Applied Natural Language Processing (ANLP-97). http://xxx.lanl.gov/abs/cs.CL/9702005.
Cunningham H, Peters W, McCauley C, Bontcheva K & Wilks Y (1998). ‘A level playing field for language resource evaluation.’ In Workshop on Distributing and Accessing Lexical Resources at Conference on Language Resources Evaluation.
Cunningham H, Gaizauskas R, Humphreys K & Wilks Y (1999). ‘Experience with a language engineering architecture: three years of GATE.’ In Proceedings of the AISB’99 Workshop on Reference Architectures and Data Standards for NLP. Edinburgh: Society for the Study of Artificial Intelligence and Simulation of Behaviour.
Cunningham H, Bontcheva K, Peters W & Wilks Y (2000). ‘Uniform language resource access and distribution in the context of a General Architecture for Text Engineering (GATE).’ In Proceedings of the Workshop on Ontologies and Language Resources (OntoLex’2000). Sozopol, Bulgaria. http://gate.ac.uk/sale/ontolex/ontolex.ps.
Cunningham H, Maynard D, Bontcheva K & Tablan V (2002a). ‘GATE: a framework and graphical development environment for robust NLP tools and applications.’ In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL’02).
Cunningham H, Maynard D, Bontcheva K, Tablan V & Ursu C (2002b). ‘The GATE user guide.’ http://gate.ac.uk/.
Dalli A (2002). ‘Creation and evaluation of extensible language resources for Maltese.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation.
Declerck T (2001). ‘Introduction: extending NLP tool repositories for the interaction with language data resource repositories.’ In Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources. 3–6.
DFKI (1999). ‘The Natural Language Software Registry.’ http://www.dfki.de/lt/registry/.
EAGLES (1999). EAGLES recommendations. http://www.ilc.pi.cnr.it/EAGLES96/browse.html.
Edmondson W & Iles J (1994). ‘A non-linear architecture for speech and natural language processing.’ In Proceedings of International Conference on Spoken Language Processing, vol. 1. 29–32.
Eriksson M (1996). ALEP. http://www.sics.se/humle/projects/svensk/platforms.html.
Erman L, Hayes-Roth F, Lesser V & Reddy D (1980). ‘The Hearsay II speech understanding system: integrating knowledge to resolve uncertainty.’ Computing Surveys 12.
Estival D, Lavelli A, Netter K & Pianesi F (eds.) (1997). ‘Computational environments for grammar development and linguistic engineering.’ Madrid: Association for Computational Linguistics.
Evans R & Gazdar G (1996). ‘DATR: a language for lexical knowledge representation.’ Computational Linguistics 22(1).
Ferrucci D & Lally A (2004). ‘UIMA: an architectural approach to unstructured information processing in the corporate research environment.’ Natural Language Engineering 10, 327–349.
Fikes R & Farquhar A (1999). ‘Distributed repositories of highly expressive reusable ontologies.’ IEEE Intelligent Systems 14(2), 73–79.
Fischer D, Mohr W & Rostek L (1996). ‘A modular, object-oriented and generic approach for building terminology maintenance systems.’ In TKE ’96: Terminology and Knowledge Engineering. 245–258.
Gaizauskas R & Wilks Y (1998). ‘Information extraction: beyond document retrieval.’ Journal of Documentation 54(1), 70–105.
Goldfarb C & Prescod P (1998). The XML handbook. New York: Prentice Hall.
Goldfarb C F (1990). The SGML handbook. Oxford: Oxford University Press.
Goni J, Gonzalez J & Moreno A (1997). ‘ARIES: a lexical platform for engineering Spanish processing tools.’ Journal of Natural Language Engineering 3(4), 317–347.
Görz G, Kessler M, Spilker J & Weber H (1996). ‘Research on architectures for integrated speech/language systems in Verbmobil.’ In Proceedings of COLING-96.
Grishman R (1997). ‘TIPSTER architecture design document version 2.3.’ Tech. rep., DARPA. http://www.itl.nist.gov/div894.02/related_projects/tipster/.
Hendler J & Stoffel K (1999). ‘Back-end technology for high-performance knowledge representation systems.’ IEEE Intelligent Systems 14(3), 63–69.
Herzog G, Ndiaye A, Merten S, Kirchmann H, Becker T & Poller P (2004). ‘Large-scale software integration for spoken language and multimodal dialog systems.’ Natural Language Engineering 10, 283–307.
Hobbs J (1993). ‘The generic information extraction system.’ In Proceedings of the Fifth Message Understanding Conference (MUC-5). http://www.itl.nist.gov/div894/894.02/related_projects/tipster/gen_ie.htm.
Ibrahim M & Cummins F (1989). ‘TARO: an interactive, object-oriented tool for building natural language systems.’ In IEEE International Workshop on Tools for Artificial Intelligence. 108–113.
Ide N (1998). ‘Corpus encoding standard: SGML guidelines for encoding linguistic corpora.’ In Proceedings of the First International Language Resources and Evaluation Conference. 463–470.
Ide N & Romary L (2002). ‘Standards for language resources.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation.
Ide N & Romary L (2004). ‘Standards for language resources.’ Natural Language Engineering 10, 211–227.
Ide N, Bonhomme P & Romary L (2000). ‘XCES: an XML-based standard for linguistic corpora.’ In Proceedings of the Second International Language Resources and Evaluation Conference (LREC). 825–830.
Jing H & McKeown K (1998). ‘Combining multiple, large-scale resources in a reusable lexicon for natural language generation.’ In Proceedings of the 36th ACL and the 17th COLING (ACL-COLING ’98). 607–613.
Kay M, Gawron J & Norvig P (1994). Verbmobil, a translation system for face-to-face dialog. Stanford: CSLI.
Koning J, Stefanini M & Demazeau Y (1995). ‘DAI interaction protocols as control strategies in a natural language processing system.’ In Proceedings of IEEE Conference on Systems, Man and Cybernetics.
Lassila O & Swick R (1999). ‘Resource description framework (RDF) model and syntax specification.’ Tech. Rep. 19990222, W3C Consortium. http://www.w3.org/TR/REC-rdf-syntax/.
Lavelli A, Pianesi F, Maci E, Prodanof I, Dini L & Mazzini G (2002). ‘SiSSA: an infrastructure for developing NLP applications.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation.
LREC-1 (1998). Conference on Language Resources Evaluation (LREC-1).
LuperFoy S, Loehr D, Duff D, Miller K, Reeder F & Harper L (1998). ‘An architecture for dialogue management, context tracking, and pragmatic adaptation in spoken dialogue systems.’ In Proceedings of the 36th ACL and the 17th COLING (ACL-COLING ’98). 794–801.
Ma X, Lee H, Bird S & Maeda K (2002). ‘Models and tools for collaborative annotation.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation.
Macleod C, Ide N & Grishman R (2002). ‘The American National Corpus: standardized resources for American English.’ In Proceedings of the LREC Second International Conference on Language Resources and Evaluation. 831–836.
Marcus M, Santorini B & Marcinkiewicz M (1993). ‘Building a large annotated corpus of English: the Penn Treebank.’ Computational Linguistics 19(2), 313–330.
Martin W (2001). ‘An archive for all of Europe.’ In Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources. 11–14.
Mason O (1998). ‘The CUE corpus access tool.’ In Workshop on Distributing and Accessing Linguistic Resources. 20–27. http://www.dcs.shef.ac.uk/~hamish/dalr/.
Maynard D, Tablan V, Cunningham H, Ursu C, Saggion H, Bontcheva K & Wilks Y (2002). ‘Architectural elements of language engineering robustness.’ Journal of Natural Language Engineering Special Issue on Robust Methods in Analysis of Natural Language Data 8(2/3), 257–274.
McClelland J & Rumelhart D (1986). Parallel distributed processing. Cambridge, MA: MIT Press.
McKelvie D & Mikheev A (1998). ‘Indexing SGML files using LT NSL, IT Index documentation.’ http://www.ltg.ed.ac.uk/.
McKelvie D, Brew C & Thompson H (1998). ‘Using SGML as a basis for data-intensive natural language processing.’ Computers and the Humanities 31(5), 367–388.
Mellish C & Evans R (2004). ‘Implementation architectures for natural language generation.’ Natural Language Engineering 10, 261–283.
Mellish C & Scott D (1999). ‘Workshop preface.’ In Proceedings of the AISB’99 Workshop on Reference Architectures and Data Standards for NLP. Edinburgh: Society for the Study of Artificial Intelligence and Simulation of Behaviour.
Mellish C, Scott D, Cahill L, Evans R, Paiva D & Reape M (2004). ‘A reference architecture for generation systems.’ Natural Language Engineering.
Miller G A (ed.) (1990). ‘WordNet: an on-line lexical database.’ International Journal of Lexicography 3(4), 235–312.
MITRE (2002). ‘Galaxy communicator.’ http://communicator.sourceforge.net/.
Neff M S, Byrd R J & Boguraev B K (2004). ‘The talent system: TEXTRACT architecture and data model.’ Natural Language Engineering.
Nelson T (1997). ‘Embedded markup considered harmful.’ In Connolly D (ed.) XML: principles tools and techniques. Cambridge, MA: O’Reilly. 129–134. Netter K & Pianesi F (1997). ‘Preface.’ In Proceedings of the Workshop on Computational Environments for Grammar Development and Linguistic Engineering. iii–v. Ogden B (1999). ‘TIPSTER annotation and the Corpus Encoding Standard.’ http://crl.nmsu.edu/Research/ Projects/tipster/annotation. Olson M & Lee B (1997). ‘Object databases for SGML document management.’ In IEEE International Conference on Systems Sciences. Petasis G, Karkaletsis V, Paliouras G, Androutsopoulos I & Spyropoulos C (2002). ‘Ellogon: a new text engineering platform.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation. Peters W, Cunningham H, McCauley C, Bontcheva K & Wilks Y (1998). ‘Uniform Language resource access and distribution.’ In Workshop on Distributing and Accessing Lexical Resources at Conference on Language Resources Evaluation. Poirier H (1999). ‘The XeLDA Framework.’ http:// www.dcs.shef.ac.uk/!hamish/dalr/baslow/xelda.pdf. Popov B, Kiryakov A, Kirilov A, Manov D, Ognyanoff D & Goranov M (2004). ‘KIM – semantic annotation platform.’ Natural Language Engineering. Reiter E (1994). ‘Has a consensus NL generation architecture appeared, and is it psycholinguistically plausible?’ In Proceedings of the Seventh International Workshop on Natural Language Generation (INLGW-1994). http:// xxx.lanl.gov/abs/CS.cl/9411032. Reiter E (1999). ‘Are reference architectures standardisation tools or descriptive aids?’ In Proceedings of the AISB’99 Workshop on Reference Architectures and Data Standards for NLP. Edinburgh: Society for the Study of Artificial Intelligence and Simulation of Behaviour. Reiter E & Dale R (2000). Building natural language generation systems. Cambridge: Cambridge University Press. Ro¨ sner D & Kunze M (2002). 
‘An XML-based document suite.’ In Proceedings of the 19th International Conference on Computational Linguistics (COLING’02). Schu¨ tz J (1994). ‘Developing lingware in ALEP.’ ALEP User Group News 1(1). Schu¨ tz J, Thurmair G & Cencioni R (1991). ‘An architecture sketch of Eurotra-II.’ In MT Summit III. 3–11. Shieber S (1992). Constraint-based grammar formalisms. Cambridge, MA: MIT Press. Simkins N K (1992). ALEP user guide. Luxemburg: cEC. Simkins N K (1994). ‘An open architecture for language engineering.’ In First CEC Language Engineering Convention. Sperberg-McQueen C & Burnard L (1994). ‘Guidelines for electronic text encoding and interchange (TEI P3). ACH, ACL, ALLC.’ http://etext.virginia.edu/TEI.html.
752 Computational Language Systems: Architectures Sperberg-McQueen C & Burnard L (eds.) (2002). Guidelines for electronic text encoding and interchange (TEI P4). TEI Consortium. Tablan V, Bontcheva K, Maynard D & Cunningham H (2003). ‘OLLIE: on-line learning for information extraction.’ In Proceedings of the HLT-NAACL Workshop on Software Engineering and Architecture of Language Technology Systems. Tablan V, Ursu C, Bontcheva K, Cunningham H, Maynard D, Hamza O, McEnery T, Baker P & Leisher M (2002). ‘A Unicode-based environment for creation and use of language resources.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation. Object Management Group (1992). The common object request broker: architecture and specification. New York: John Wiley. TIPSTER (1995). ‘The generic document detection system.’ http://www.itl.nist.gov/div894/894.02/related_projects/ tipster/gen_ir.htm. Todirascu A, Kow E & Romary L (2002). ‘Towards reusable nlp components.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation. Tracz W (1995). ‘Domain-specific software architecture (DSSA) frequently asked questions (FAQ).’ http:// www.oswego.com/dssa/faq/faq.html. University of Essex (1999). ‘Description of the W3-Corpora web-site.’ http://clwww.essex.ac.uk/w3c/. van Rijsbergen C (1979). Information retrieval. London: Butterworths. Veronis J & Ide N (1996). ‘Considerations for the reusability of linguistic software. Tech. rep., EAGLES.’ http:// w3.lpl.univ-aix.fr/projects/multext/LSD/LSD1.html. von Hahn W (1994). ‘The architecture problem in natural language processing.’ Prague Bulletin of Mathematical Linguistics 61, 48–69. Wolinski F, Vichot F & Gremont O (1998). ‘Producing NLP-based on-line contentware.’ In Natural Language
and Industrial Applications. http://xxx.lanl.gov/abs/ cs.CL/9809021. Wynne M (2002). ‘The language resource archive of the 21st century.’ In Proceedings of the LREC 2002 Third International Conference on Language Resources and Evaluation. Young S, Kershaw D, Odell J, Ollason D, Valtchev V & Woodland P (1999). The HTK book (Version 2.2). Cambridge: Entropic Ltd. ftp://ftp.entropic.com/pub/htk/. Yourdon E (1989). Modern structured analysis. New York: Prentice-Hall. Zajac R (1992). ‘Towards computer-aided linguistic engineering.’ In Proceedings of COLING ’92. 828–834. Zajac R (1997). ‘An open distributed architecture for reuse and integration of heterogenous NLP components.’ In Proceedings of the 5th Conference on Applied Natural Language Processing (ANLP-97). Zajac R (1998a). ‘Feature structures, unification and finite-state transducers.’ In International Workshop on Finite State Methods in Natural Language Processing. Zajac R (1998b). ‘Reuse and integration of NLP components in the Calypso architecture.’ In Workshop on Distributing and Accessing Linguistic Resources. 34–40. http://www.dcs.shef.ac.uk/!hamish/dalr/.
Relevant Websites http://www.tc37sc4.org – ISO standardization. http://www.ldc.upenn.edu – Linguistic Data Consortium. http://www.info.ox.ac.uk – British National Corpus. http://www.openarchives.org – Open Archives Initiative. http://www.dublincore.org – Dublin Core Initiative for Resource Metadata. http://www.openrdf.org – Knowledge and information management.
Computational Lexicons and Dictionaries
K C Litkowski, CL Research, Damascus, MD, USA
© 2006 Elsevier Ltd. All rights reserved.
What Are Computational Lexicons and Dictionaries?
Computational lexicons and dictionaries (henceforth lexicons) include manipulable computerized versions of ordinary dictionaries and thesauruses. Computerized versions designed for simple lookup by an end user are not included, since they cannot be used for computational purposes. Lexicons also include any electronic compilations of words, phrases, and concepts, such as word lists, glossaries, taxonomies, terminology databases (see Terminology and Terminological Databases), wordnets (see WordNet(s)), and ontologies. While simple lists may be included, a key characteristic of computational lexicons is that they contain at least some additional information associated with the words, phrases, or concepts. One small list frequently used in the computational community is a list of about 100 of the most frequent words (such as a, an, the, of, and to), called a stoplist, because some applications ignore these words in processing text. In general, a lexicon includes a wide array of information associated with entries. An entry in a lexicon is usually the base form of a word: the singular for a noun and the present tense for a verb. Using an ordinary dictionary as a reference point, an entry in a computational lexicon contains all the information found in the dictionary: inflectional and variant forms, pronunciation, parts of speech, definitions, grammatical properties, subject labels, usage examples, and etymology (see Lexicography: Overview). More specialized lexicons contain additional types of information. A thesaurus or wordnet contains synonyms, antonyms, or words bearing some other relationship to the entry. A bilingual dictionary contains translations of an entry into another language. An ontology (loosely including thesauruses or wordnets) arranges concepts in a hierarchy (e.g., a horse is an animal), frequently including other kinds of relationships as well (e.g., a leg is part of a horse).
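The role of a stoplist can be illustrated with a minimal sketch. The six words below are the examples given above, not a complete stoplist, and the function name content_words is invented for this illustration.

```python
# Illustrative stoplist: only the six example words from the text,
# not the roughly 100-word lists used in practice.
STOPLIST = {"a", "an", "the", "of", "and", "to"}


def content_words(text):
    """Return the lowercased tokens of text that are not on the stoplist."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOPLIST]


print(content_words("The leg of a horse"))
```

An application that ignores stoplist words would process only the tokens returned by such a filter.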
The term ‘computational’ applies to computational lexicons in several senses. Most basically, the lexicon is in electronic form. Firstly, the lexicon and its associated information may be studied to discover patterns, usually for enriching entries. Secondly, the lexicon can be used computationally in a wide variety of applications; frequently, a lexicon may be constructed to support a specialized computational linguistic theory or grammar. Thirdly, written or spoken text may be studied to create or enhance entries in the lexicon. Broadly, these activities comprise the field known as computational lexicology, the computational study of the form, meaning, and use of words (see also Lexicology).
History of Computational Lexicology
Computational lexicology as the study of machine-readable dictionaries (MRDs) (Amsler, 1982) emerged in the mid-1960s and received considerable attention until the early 1990s. ‘Machine-readable’ does not mean that the computer reads the dictionary, but only that it is in electronic form and can be processed and manipulated computationally. Computational lexicology then went into decline as researchers concluded that MRDs had been fully exploited and could not be used effectively for NLP applications (Ide and Veronis, 1993). However, since that time, many dictionary publishers have taken the early research into account to include more information that might be useful. Thus, practitioners of computational lexicology can expect to contribute to the further expansion of lexical information. To provide the basis for this contribution, the results of the early history need to be kept in mind. MRDs evolved from keyboarding a dictionary onto punchcards, largely through the efforts of Olney (1968), who was instrumental in getting G. & C. Merriam Co. to permit computer tapes to be distributed to the computational linguistics research community. The ground-breaking work of Evens (Evens and Smith, 1978) and Amsler (1980) provided the impetus for a considerable expansion of research on MRDs, particularly using Webster’s seventh new collegiate dictionary (W7; Gove, 1969). These efforts stimulated the widespread use of the Longman dictionary of contemporary English (LDOCE; Proctor, 1978) during the 1980s; this dictionary is still the primary MRD today. Initially, MRDs were faithful transcriptions of ordinary dictionaries, and researchers were required to spend considerable time interpreting typesetting codes (e.g., to determine how a word’s part of speech was identified). With advances in technology, publishers eventually came to separate the printing and the database components of MRDs.
Today, the various fields of an entry are specifically identified and labeled, increasingly using eXtensible Markup Language (XML), such as shown in Figure 1. As a result, researchers can expect that MRDs will be in a form that is much easier to understand, access, and manipulate, particularly using XML-related technologies developed in computer science.
Figure 1 Sample entry for the word double using XML.
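To suggest how labeled fields make an entry easy to access, the following minimal sketch reads a small XML entry with Python's standard xml.etree.ElementTree module. The markup is hypothetical: the element names (entry, headword, sense, pos, def) and the two definitions are invented for illustration and are not taken from Figure 1 or from any publisher's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for this sketch (not a publisher's format).
SAMPLE = """
<entry>
  <headword>double</headword>
  <sense n="1">
    <pos>adjective</pos>
    <def>twice as much or as many</def>
  </sense>
  <sense n="2">
    <pos>verb</pos>
    <def>to make twice as great</def>
  </sense>
</entry>
"""


def read_entry(xml_text):
    """Turn one <entry> element into a plain dictionary of labeled fields."""
    root = ET.fromstring(xml_text)
    senses = [
        {
            "n": sense.get("n"),
            "pos": sense.findtext("pos"),
            "definition": sense.findtext("def"),
        }
        for sense in root.findall("sense")
    ]
    return {"headword": root.findtext("headword"), "senses": senses}


entry = read_entry(SAMPLE)
print(entry["headword"], [s["pos"] for s in entry["senses"]])
```

Because each field is explicitly labeled, no typesetting codes have to be interpreted to find, for instance, the part of speech of a sense.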
The Study of Computational Lexicons
Making Lexicons Tractable
An electronic lexicon provides the resource for examination and use, but requires considerable initial work on the part of the investigator, specifically to make the contents tractable. The investigator needs (1) to understand the form, structure, and content of the lexicon, and (2) to ascertain how the contents will be studied or used. Understanding involves a theoretical appreciation of the particular type of lexicon. While dictionaries and thesauruses are widely used, their content is the result of considerable lexicographic practice; an awareness of lexicographic methods is extremely valuable in studying or using these resources. Wordnets require an understanding of how words may be related to one another. Ontologies require an understanding of conceptual relations, along with a formalism for capturing properties in slots and their fillers. A full ontology may also involve various principles for ‘reasoning’ with objects in a knowledge base. Lexicons that are closely tied to linguistic theories and grammars require an understanding of the underlying theory or grammar. The actual study or use of the lexicons is essentially the development of procedures for manipulating the content, i.e., making the contents tractable. A common objective is to transform or extract some part of the content into a form that will meet the user’s needs. This can usually be accomplished by recognizing patterns in the content; a considerable amount of lexical semantics research falls into this category. Another common objective is to map some or all of the content in one format or formalism into another. The general idea of these mappings is to take advantage of content developed under one formalism and to use it in another. The remainder of this section focuses on defining patterns that have been observed in MRDs.
What Can Be Extracted From Machine-Readable Dictionaries?
Lexical Semantics
Olney (1968), in his groundbreaking work on MRDs, laid out a series of computational aids for studying affixes, obtaining lists of semantic classifiers and components, identifying semantic primitives, and identifying semantic fields. He also examined defining patterns (including their syntactic and semantic characteristics) to identify productive lexical processes (such as the addition of -ly to adjectives to form adverbs). Defining patterns are essentially regular expressions that specify string, syntactic, and semantic elements that occur frequently within definitions. For example, the pattern in (a|an) [adj] manner, applied to adverb definitions, can be used to characterize the adverb as manner, to establish a derived-from [adj] relation, and to characterize a productive lexical process. The program Olney initiated in studying these patterns is still incomplete. There is no systematic compilation that details the results of the research in this area. Moreover, in working with the dictionary publishers, he was provided with a detailed list of defining instructions used by lexicographers. Defining instructions, usually hundreds of pages long, guide the lexicographer in deciding what constitutes an entry and what information the entry should contain, and frequently provide formulaic details on how to define classes of words. Each publisher develops its own idiosyncratic set of guidelines, again underscoring the point that a close working relationship with the publishers can provide a jump-start to the study of patterns. Amsler (1980) and Litkowski (1978) both studied the taxonomic structure of the nouns and verbs in dictionaries, observing that, for the most part, definitions of these words begin with a superordinate or hypernym (flax is a plant, hug is to squeeze). They both recognized that a dictionary is not fully consistent in laying out a taxonomy, because it contains defining cycles (where words may be used to define
Figure 2 Illustrations of definition cycles for (aerify, aerate), (aerate, ventilate), and (air, aerate, ventilate) in a directed graph anchored by oxygenate.
themselves when all links are followed). Litkowski, applying the theory of labeled directed graphs to the dictionary structure, concluded that primitives had to be concept nodes lexicalized by one or more words and verbalized with a gloss (identical to the synonym set encapsulated in the nodes in WordNet). He also hypothesized that primitives essentially characterize a pattern of usage in expressing their concepts. Figure 2 shows an example of a directed graph with three defining cycles; in this example, oxygenate is the base word underlying all the others and is only relatively primitive. Evens and Smith (1978), in considering lexical needs for a question-answering system, presented a description of approximately 45 syntactic and semantic lexical relations. Lexical semantics is the study of these relations and is concerned with how meanings of words relate to one another (see Lexical Semantics: Overview). Evens and Smith grouped the lexical relations into nine categories: taxonomy and synonymy, antonymy, grading, attribute relations, parts and wholes, case relations, collocation relations, paradigmatic relations, and inflectional relations. Each relation was viewed as an entry in the lexicon itself, with predicate properties describing how to use the relations in a first-order predicate calculus. The study of lexical relations is distinguished from the componential analysis of meaning (Nida, 1975), which seeks to analyze meanings into discrete semantic components (or features). In this form of analysis, semantic features (such as maleness or animacy) are used to contrast the meanings of words (such as father and mother). These features proved to be extremely important among field anthropologists in understanding and translating among many languages. These features can be useful in characterizing lexical preferences, e.g., indicating that the subject of a verb should have an animate feature. 
Their importance has faded somewhat, particularly as the meanings of words have been seen to have fuzzy boundaries and to depend very heavily on the contexts in which they appear.
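The defining cycles discussed earlier in this section (Figure 2) can be located mechanically once definitions are represented as a directed graph. The sketch below is only an illustration: the edge list is an assumption chosen to reproduce the three cycles named in the Figure 2 caption, with an arrow from each word to the words used in its definition, and the function names are invented.

```python
# Assumed "is defined using" links, chosen to reproduce the cycles
# (aerify, aerate), (aerate, ventilate), and (air, aerate, ventilate).
DEFINED_BY = {
    "aerify": ["aerate"],
    "aerate": ["aerify", "ventilate", "oxygenate"],
    "ventilate": ["aerate", "air"],
    "air": ["aerate"],
    "oxygenate": [],
}


def reachable(graph, start):
    """Return every word reachable from start by following one or more links."""
    seen = set()
    stack = list(graph.get(start, []))
    while stack:
        word = stack.pop()
        if word not in seen:
            seen.add(word)
            stack.extend(graph.get(word, []))
    return seen


def words_in_cycles(graph):
    """A word lies on a defining cycle if following links leads back to it."""
    return {word for word in graph if word in reachable(graph, word)}


print(sorted(words_in_cycles(DEFINED_BY)))
```

In this toy graph oxygenate lies on no cycle but is reachable from every other word, which corresponds to its role as the relatively primitive base word.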
Ahlswede (1985), Chodorow et al. (1985), and others engaged in large-scale efforts for automatically extracting lexical semantic relations from MRDs, particularly W7. Evens (1988) provides a valuable summary of these efforts; a special issue of Computational Linguistics on the lexicon in 1987 also provides considerable detail on important theoretical and practical perspectives on lexical issues. One focus of this research was on extracting taxonomies, particularly for nouns. In general, noun definitions are extended noun phrases (e.g., including attached prepositional phrases), in which the head noun of the initial noun phrase is the hypernym. Parsing the definition provides the mechanism for reliably identifying the hypernym. However, the various studies showed many cases where the head is effectively empty or signals a different type of lexical relation. Examples of such heads include a set of, any of various, a member of, and a type of. Experience with extracting lexical relations other than taxonomy was similar. Investigators examined defining patterns for regularities in signaling a particular relation (e.g., a part of indicating a part-whole relation). However, the regularities were generally not completely reliable and further work, sometimes manual, was necessary to separate good results from bad results. Several observations can be made. First, there is no repository of the results; new researchers must reinvent the processes or engage in considerable effort to bring together the relevant literature. Second, few of these efforts have benefited directly from the defining instructions or guidelines used in creating the definitions. Third, as outcomes emerge that show the benefit of particular types of information, dictionary publishers have slowly incorporated some of this additional information, particularly in electronic versions of the dictionaries. 
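Defining patterns of the kind described above can be approximated with ordinary regular expressions. The following sketch is a simplification under stated assumptions: it works on raw definition strings rather than parsed definitions, the relation labels and the function name match_pattern are invented, and the sample definitions are made up. As the text notes, real patterns are not fully reliable, so the output of such matching would still need checking.

```python
import re

# Adverb definitions of the form "in a/an [adj] manner".
MANNER = re.compile(r"^in an? (\w+) manner\b")
# "a part of ..." as a signal of a part-whole relation.
PART_OF = re.compile(r"^a part of (?:an? |the )?(\w+)")
# Heads that are effectively empty and do not name the true hypernym.
EMPTY_HEAD = re.compile(r"^(a set of|any of various|a member of|a type of)\b")


def match_pattern(definition):
    """Return (relation, matched text) for the first pattern that applies, else None."""
    text = definition.lower().strip()
    m = MANNER.match(text)
    if m:
        return ("derived-from", m.group(1))
    m = PART_OF.match(text)
    if m:
        return ("part-of", m.group(1))
    m = EMPTY_HEAD.match(text)
    if m:
        return ("empty-head", m.group(1))
    return None


for definition in ["in a hasty manner", "a part of a horse", "any of various small birds"]:
    print(definition, "->", match_pattern(definition))
```

A definition flagged as having an empty head would need further analysis (for example, looking at the noun that follows) before a hypernym is recorded.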
Research Using Longman’s Dictionary of Contemporary English
Beginning in the early 1980s, the Longman’s dictionary of contemporary English (LDOCE; Proctor, 1978) became the primary MRD
used in the research community. LDOCE is designed primarily for learners of English as a second language. It uses a controlled vocabulary of about 2000 words in its definitions. LDOCE uses about 110 syntactic categories to characterize entries (e.g., noun and noun/count/followed-by-infinitive-with-TO). The electronic version includes box codes that provide features such as abstract and animate for entries; it also includes subject codes, identifying the subject specialization of entries where appropriate. Wilks et al. (1996) provide a thorough overview of research using LDOCE (along with considerable philosophical perspectives on meaning and a detailed history of research using MRDs). In using LDOCE, many researchers have built upon the research that used W7. In particular, they have reimplemented and refined procedures for identifying the dictionary’s taxonomy and for investigating defining patterns that reveal lexical semantic relations. In addition to string pattern matching, researchers began parsing definitions, necessarily taking into account idiosyncratic characteristics of definition text as compared to ordinary text. A significant problem emerged when parsing definitions: the difficulty of disambiguating the words making up the definition. This problem is symptomatic of working with MRDs, namely, that almost any pattern that is investigated will not have complete reliability and will require some amount of manual intervention. Boguraev and Briscoe (1987) introduced a new task into the analysis of MRDs, using them to derive lexical information for use in NLP applications. In particular, they used the box codes of LDOCE to create “lexical entries containing grammatical information compatible with” parsing using different grammatical theories (see Symbolic Computational Linguistics: Overview). The derivational task has been generalized into a considerable number of research efforts to convert, map, and compare lexical entries from one or more sources.
Since 1987, these efforts have grown and constitute an active area of research. Conversion efforts generally involve creation of broad-coverage lexicons from lexical resources within particular formalisms. Mapping efforts attempt to exploit and capture particular lexical properties from one lexicon into another. Comparison efforts examine multiple lexicons. Comparison of lexical entries from multiple sources led to a crisis in the use of MRDs. Ide and Veronis (1993), in surveying the results of research using MRDs, noted that lexical resources frequently were in conflict with one another and could not be used reliably for extracting information. Atkins (1991) described difficulties in comparing entries
from several dictionaries because of lexicographic exigencies and editorial decisions (particularly the dictionary size). She noted that lexicographers could variously lump senses together, split them apart, or combine elements of meaning in different ways. These papers, along with others, seemed to slow the research on using MRDs and other lexical resources. They also underscore the major difficulty that there is no comprehensive theory of meaning, i.e., an organization of the semantic content of definitions. This difficulty may be characterized as the problem of paraphrase, or determining the semantic equivalence of expressions (discussed in detail below).
Semantic Networks
Quillian (1968) considered the question of “how semantic information is organized within a person’s memory.” He described semantic memory as a network of nodes interconnected by associative links. In explicating this approach, he visualized a dictionary as a unified whole, where conceptual nodes (representing individual definitions) were connected by paths to other nodes corresponding to the words making up the definitions. This model envisioned that words would be properly disambiguated. Computer limitations at the time precluded anything more than a limited implementation. A later implementation by Ide and Veronis (1990) added the notion that nodes within the semantic network would be reached by spreading activation. WordNet (Fellbaum, 1998) was designed to capture several types of associative links, although the number of such links was limited by practical considerations. WordNet was not designed as a lexical resource, so that its entries do not contain the full range of information that is found in an ordinary dictionary. Notwithstanding these limitations, WordNet has found widespread use as a lexical resource, both in research and in NLP applications. WordNet is a prime example of a lexical resource that is converted and mapped into other lexical databases.
MindNet (Dolan et al., 2000) is a lexical database and a set of methodologies for analyzing linguistic representations of arbitrary text. It combines symbolic approaches to parsing dictionary definitions with statistical techniques for discriminating word senses using similarity measures. MindNet began by parsing definitions and identifying highly reliable semantic relations instantiated in these definitions. The set of 25 semantic relations includes Hypernym, Synonym, Goal, Logical_subject, Logical_object, and Part. A distinguishing characteristic of MindNet is that the inverses of all relations identified by pattern-matching heuristics are propagated throughout the lexical database. As a result, both direct and indirect paths between entries and words contained in their definitions
exist in the database. Given two words (such as pen and pencil), the database is examined for all paths between them (ignoring any directionality in the paths). The path lengths and weights on different kinds of connections lead to a measure of similarity (or dissimilarity), so that a strong similarity is indicated between pen and pencil because both of them appear in various definitions as means (or instruments) linked to draw. Originally, MindNet was constructed from LDOCE; subsequently, American Heritage (3rd edn., 1992) was added to the lexical database. Patterns used in recognizing semantic relations from definitions can be used as well in parsing and analyzing any text, including corpora. Recognizing this, the MindNet database was extended by processing the full text of Microsoft Encarta. In principle, MindNet can be continually extended by processing any text, essentially refining the weights showing the strength of relationships. MindNet provides a mechanism for capturing the context within which a word is used and hence is a database that characterizes a word’s usage, in line with Firth’s (1957) argument that “the meaning of a word could be known by the company it keeps.” MindNet is a significant departure from traditional dictionaries, although it essentially encapsulates the process by which a lexicographer constructs definitions. This process involves the collection of many examples of a word’s usage, arranging them with concordances, and examining the different contexts to create definitions. The MindNet database could be mined to facilitate the lexicographer’s processes. Traditional lexicography is already being extended through automated techniques of corpus analysis very similar in principle to MindNet’s techniques.
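The idea of scoring two words by the paths that connect them can be sketched as follows. This is not MindNet's actual algorithm or data: the relation triples and weights are invented, the Means label is only a stand-in for the means (or instrument) link mentioned above, and the scoring rule (multiply the weights along each path, then sum over paths) is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail, weight) triples; all values are invented.
TRIPLES = [
    ("draw", "Means", "pen", 0.8),
    ("draw", "Means", "pencil", 0.8),
    ("write", "Means", "pen", 0.9),
    ("write", "Means", "pencil", 0.6),
    ("pencil", "Part", "graphite", 0.5),
]


def build_graph(triples):
    """Store every link in both directions, so that directionality is ignored."""
    graph = defaultdict(list)
    for head, relation, tail, weight in triples:
        graph[head].append((tail, relation, weight))
        graph[tail].append((head, relation, weight))
    return graph


def paths(graph, source, target, max_edges=3):
    """Yield (path, score) for each simple path with at most max_edges links."""
    def walk(node, visited, score):
        if node == target:
            yield visited, score
            return
        if len(visited) - 1 >= max_edges:
            return
        for neighbor, _relation, weight in graph.get(node, []):
            if neighbor not in visited:
                yield from walk(neighbor, visited + [neighbor], score * weight)
    yield from walk(source, [source], 1.0)


def similarity(graph, a, b, max_edges=3):
    """Sum the path scores; longer paths contribute less because weights are below 1."""
    return sum(score for _path, score in paths(graph, a, b, max_edges))


graph = build_graph(TRIPLES)
print(similarity(graph, "pen", "pencil"), similarity(graph, "pen", "graphite"))
```

With these invented weights, pen and pencil score higher than pen and graphite, because they are joined by two short paths through draw and write.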
Using Lexicons
Language Engineering
Research on computational lexicons, even with a resultant propagation of additional information and formalisms throughout the entries, is inherently limited. While a dictionary publisher makes decisions on what to include based on marketing considerations, the design and development of computational lexicons have not been similarly driven. In recent years, the new field of language engineering has emerged to fill this void (see Human Language Technology). Language engineering is primarily concerned with NLP applications and includes the development of supporting lexical resources. The following sections examine the role of lexicons, particularly WordNet, in word-sense disambiguation, information extraction, question answering, text summarization,
and speech recognition and speech synthesis (see also Text Mining).
Word-Sense Disambiguation
Many entries in a dictionary have multiple senses. Word-sense disambiguation (WSD) is the process of automatically deciding which sense is intended in a given context (see Disambiguation, Lexical). WSD presumes a sense inventory, and as noted earlier, there can be considerable controversy about what constitutes a sense and how senses are distinguished from one another. Hirst (1987) provides a basic introduction to the issues involved in WSD, framing the problem as taking the output of a parser and interpreting the output into a suitable representation of the text. WSD requires a characterization of the context and mechanisms for associating nearby words, handling syntactic disambiguation cues, and resolving the constraints imposed by ambiguous words, all of which pertain to the content of lexicons. (See also Saint-Dizier and Viegas [1995] for an updated view of lexical semantics.) To understand the relative significance of lexical information, a community-wide evaluation exercise known as Senseval (word-sense evaluation) was developed to assess WSD systems. Senseval exercises were conducted in 1998 (Kilgarriff and Palmer, 2000), 2001, and 2004. WSD systems fall into two categories: supervised (where hand-tagged data are used to train systems using various statistical techniques) and unsupervised (where systems make use of various lexical resources, particularly MRDs). Supervised systems make use of collocational, syntactic, and semantic features used to characterize training data. The extent of the characterization depends on the ingenuity of the investigators and the amount of lexical information they use. Unsupervised systems require substantial information, not always available, in the lexical resources.
In Senseval, supervised systems have consistently outperformed unsupervised systems, indicating that computational lexicons do not yet contain sufficient information to perform reliable WSD. The use of WordNet in Senseval, both as the sense inventory and as a lexical resource for disambiguation, emphasized the difference between the two types of WSD systems, since it does not approach dictionary-based MRDs in the amount of lexical information it contains. Close examination of the details used by supervised systems, particularly the use of WordNet, can reveal the kind of information that is important and can guide the evolution of information contained in computational lexicons. Dictionary publishers are increasingly drawing on results from Senseval and other exercises to expand the content of electronic versions of their dictionaries.
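As an illustration of how an unsupervised system can draw on dictionary definitions, the sketch below counts the words shared between the context and each sense's definition. This is a simplified form of the well-known Lesk overlap method, which is not described in the text above and is used here only as an example; the two-sense inventory for bank, its glosses, and the short stoplist are invented.

```python
# Small illustrative stoplist (assumed, not a standard list).
STOPLIST = {"a", "an", "the", "of", "and", "to", "in", "or", "that", "for", "is", "on", "by"}

# Hypothetical sense inventory; the glosses are written for this sketch.
SENSES = {
    "bank/1": "an institution that accepts deposits of money and makes loans",
    "bank/2": "the sloping land along the edge of a river or lake",
}


def tokens(text):
    """Lowercase the words, strip surrounding punctuation, and drop stoplist words."""
    return {word.strip(".,;:!?").lower() for word in text.split()} - STOPLIST


def disambiguate(context, senses):
    """Pick the sense whose gloss shares the most content words with the context."""
    context_words = tokens(context)
    scores = {sense: len(context_words & tokens(gloss)) for sense, gloss in senses.items()}
    best = max(scores, key=scores.get)
    return best, scores


print(disambiguate("She sat on the bank of the river and watched the lake.", SENSES))
```

The sketch also shows why such systems depend on the amount of information in the lexicon: with short glosses, many contexts share no words with any sense, and the choice then falls arbitrarily on the first sense listed.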
Information Extraction
Information extraction (IE; Grishman, 2002; see also Information Extraction, Automatic and Named Entity Extraction) is “the automatic identification of selected types of entities, relations, or events in free text.” IE grew out of the Message Understanding Conferences (see Text Retrieval Conference and Message Understanding Conference), in which the main task was to extract information from text and put it into slots of predefined templates. Template filling does not require full parsing, but can be accomplished by pattern-matching using finite-state automata (which may be characterized by regular expressions). Template filling fills slots with a series of words, classified, for example, as names of persons, organizations, locations, chemicals, or genes. Patterns can use computational lexicons; some of these can be quite basic, such as a list of titles and abbreviations that precede a person’s name. Frequently, the lists can become quite extensive, as with lists of company names and abbreviations or of gazetteer entries. Names can be identified quite reliably without going beyond simple lists, since they usually appear in noun phrases within a text. Recognizing and characterizing events can also be accomplished by using patterns, but more substantial lexical entries are necessary. Events typically revolve around verbs and can be expressed in a wide variety of syntactic patterns. Although these patterns can be expressed with some degree of reliability (e.g., company hired person or person was hired by company) as the basis for string matching, this approach does not achieve a desired level of generality. Characterization of events usually entails a level of partial parsing, in which major sentence elements such as noun, verb, and prepositional phrases are identified. Additional generality can be achieved by extending patterns to require certain semantic classes.
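The two hiring patterns mentioned above can be written as regular expressions that fill a small template. The sketch below rests on several assumptions: the title and company-suffix lists are tiny stand-ins for the extensive lists used in practice, names are recognized only by capitalization, and the template fields and function name are invented.

```python
import re

# Tiny stand-ins for lexicon lists of personal titles and company suffixes.
TITLES = r"(?:Mr|Ms|Mrs|Dr)\."
SUFFIXES = r"(?:Inc|Corp|Ltd)\."

PERSON = rf"(?P<person>{TITLES} [A-Z][a-z]+(?: [A-Z][a-z]+)*)"
COMPANY = rf"(?P<company>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)* {SUFFIXES})"

# "company hired person" and "person was hired by company".
PATTERNS = [
    re.compile(rf"{COMPANY} hired {PERSON}"),
    re.compile(rf"{PERSON} was hired by {COMPANY}"),
]


def fill_hiring_template(sentence):
    """Return a filled template for the first pattern that matches, else None."""
    for pattern in PATTERNS:
        m = pattern.search(sentence)
        if m:
            return {
                "event": "hire",
                "company": m.group("company"),
                "person": m.group("person"),
            }
    return None


print(fill_hiring_template("Acme Widgets Inc. hired Dr. Jane Smith last week."))
print(fill_hiring_template("Ms. Maria Lopez was hired by Globex Corp. in May."))
```

Each additional way of expressing a hiring event would need its own pattern, which is the lack of generality that partial parsing and semantic classes are meant to address.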
For example, in uncertain cases of classifying a noun phrase as a person or thing, the fact that the phrase is the subject of a communication verb (said or stated) would rule out classification as a thing. WordNet is used extensively in IE, particularly using hypernymic relations as the basis for identifying semantic classes. Continued progress in IE is likely to be accompanied by the use of increasingly elaborate computational lexicons, balancing needs for efficiency and particular tasks. Question Answering Although much research in question answering has been conducted since the 1960s, this field was much advanced with the introduction of the question-answering track in the Text Retrieval Conferences (see Text Retrieval Conference and Message Understanding Conference) beginning
in 1998 (see Question Answering from Text, Automatic and Voorhees and Buckland, 2004 and earlier volumes for papers relating to question answering). From the beginning, researchers viewed this NLP task as one that would involve semantic processing and provide a vehicle for deeper study of meaning and its representation. This has not generally proved to be the case, but many nuances have emerged in handling different types of questions. Use of the WordNet hierarchy as a computational lexicon has proved to be a key component of virtually all question-answering systems. Questions are analyzed to determine what type of answer is required; e.g., “what is the length . . .?” requires an answer with a number and a unit of measurement; candidate answers are checked against WordNet to determine whether a measurement term is present. Exploration of ways to use WordNet in question answering has demonstrated the usefulness of hierarchical and other types of relations in computational lexicons. At the same time, however, lexicographical shortcomings in WordNet have emerged, particularly the use of highly technical hypernyms in between common-sense terms in the hierarchy. Many questions can be answered with string-matching techniques. In the first year, most of the questions were developed directly from texts (a process characterized as back-formation), so that answers were easily obtained by matching the question text. IE techniques proved to be very effective in answering the questions. Some questions can be transformed readily into searches for string patterns, without any use of additional lexical information. More elaborate string-matching patterns have proved to be effective when pattern elements specify semantic classes, e.g., ‘accomplishment’ verbs in identifying why a person is famous. Over the 6 years of the question-answering track, the task has been continually refined to present more difficult questions that would require the use of more sophisticated techniques. 
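The answer-type check mentioned above, in which a candidate answer to a length question must contain a number and a measurement term, can be sketched as follows. The hierarchy `HYPERNYM` is a small hypothetical table standing in for WordNet's hypernym links, and the function names and the crude plural stripping are assumptions made for illustration only.

```python
# Toy hypernym links (child -> parent); hypothetical data standing in
# for a WordNet-style hierarchy.
HYPERNYM = {
    "meter": "linear_unit",
    "mile": "linear_unit",
    "linear_unit": "unit_of_measurement",
    "kilogram": "mass_unit",
    "mass_unit": "unit_of_measurement",
    "river": "stream",
    "stream": "body_of_water",
}

def has_hypernym(word, target):
    """True if `target` is `word` itself or one of its ancestors in HYPERNYM."""
    while word is not None:
        if word == target:
            return True
        word = HYPERNYM.get(word)
    return False

def is_measurement_answer(candidate):
    """Accept a candidate that contains both a number and a measurement term."""
    tokens = candidate.lower().split()
    has_number = any(t.replace(".", "", 1).isdigit() for t in tokens)
    # rstrip("s") is a deliberately crude plural stripper for this sketch.
    has_unit = any(has_hypernym(t.rstrip("s"), "unit_of_measurement") for t in tokens)
    return has_number and has_unit
```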
Many questions have been devised that require at least shallow parsing of texts that contain the answer. Many questions require more abstract reasoning to obtain the answer. One system has made use of logical forms derived from WordNet glosses in an abductive reasoning procedure for determining the answer. Improvements in question answering will continue to be fueled in part by improvements in the content and exploitation of computational lexicons. Text Summarization The field of automatic summarization of text has also benefited from a series of evaluation exercises, known as the Document Understanding Conferences (see Over, 2004 and references
to earlier research). Again, much research in summarization has been performed (see Mani, 2001 and Summarization of Text, Automatic for an overview). Extractive summarization (in which highly salient sentences in a text are used) does not make significant use of computational lexicons. Abstractive summarization seeks a deeper characterization of a text. It begins with a characterization of the rhetorical structure of a text, identifying discourse units (roughly equivalent to clauses), frequently with the use of cue phrases (see Discourse Parsing, Automatic). Cue phrases include subordinating conjunctions that introduce clauses and sentence modifiers that indicate a rhetorical unit. Generally, this overall structure requires only a small list of words and phrases associated with the type of rhetorical unit. Attempts to characterize texts in more detail involve a greater use of computational lexicons. First, texts are broken down into discourse entities and events; information extraction techniques described earlier are used, employing word lists and some additional information from computational lexicons. Then, it is necessary to characterize the lexical cohesion of the text, by understanding the equivalence of different entities and events and how they are related to one another. Many techniques have been developed for characterizing different aspects of a text, but no trends have yet emerged in the use of computational lexicons in summarization. The overall discourse structure is characterized in part by the rhetorical relations, but these do not yet capture the lexical cohesion of a text. The words used in a text give rise to lexical chains based on their semantic relations to one another (i.e., such as the type of relations encoded in WordNet). The lexical chains indicate that a text activates templates (via the words) and that various slots in the templates are filled. 
For example, if word1 ‘is a part of’ word2, the template activated by word2 will have a slot part that will be filled by word1. When the various templates activated in a text are merged via synonymy relations, they will form a set of concepts. The concepts in a text may also be related to one another, particularly instantiating a concept hierarchy for the text. This concept hierarchy may then be used as the basis for summarizing a text by focusing on the topmost elements of the hierarchy. Speech Recognition and Speech Synthesis The use of computational lexicons in speech technologies is limited (see Van Eynde and Gibbon [2000] for several papers on lexicon development for speech technologies). MRDs usually contain pronunciations, but this information only provides a starting point for the recognition and synthesis of speech. Speech
computational lexicons include the orthographic word form and a reference or canonical pronunciation. A full-form lexicon also contains all inflected forms for an entry; rules may be used to generate a full-form lexicon, but it is generally more accurate to use a full-form lexicon. The canonical pronunciations are not sufficient for spoken language processing. Lexical needs must reflect pronunciation variants arising from regional differences, language background of nonnative speakers, position of a word in an utterance, emphasis, and function of the utterance. Some of these difficulties may be addressed programmatically, but many can be handled only through a much more extensive set of information. As a result, speech databases provide empirical data on actual pronunciations, containing spoken text and a transcription of the text into written form. These databases contain information about the speakers, type of speech, recording quality, and various data about the annotation process. Most significantly, these databases contain speech signal data recorded in analog or digital form. The databases constitute a reference base for attempting to handle the pronunciation variability that may occur. In view of the massive amounts of data involved in implementing basic recognition and synthesis systems, they have not yet incorporated the full range of semantic and syntactic capabilities for processing the content of the spoken data.
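The remark above that rule-generated full-form lexicons are less accurate than listed ones can be illustrated with a small sketch. The suffix rules, the `IRREGULAR` table, and the function names are hypothetical and cover only a few English verb forms; they are not drawn from any of the speech lexicons discussed.

```python
# Hypothetical listed forms for a verb that simple suffix rules get wrong.
IRREGULAR = {
    "go": ["go", "goes", "going", "went", "gone"],
}

def inflect_by_rule(verb):
    """Generate verb forms with naive English suffix rules (illustrative only)."""
    stem = verb[:-1] if verb.endswith("e") else verb
    third = verb + ("es" if verb.endswith(("s", "sh", "ch", "x", "o")) else "s")
    return [verb, third, stem + "ing", stem + "ed"]

def full_forms(verb):
    """Prefer explicitly listed forms; fall back to rule generation."""
    return IRREGULAR.get(verb, inflect_by_rule(verb))
```

Here the rules handle a regular verb such as walk, but applied to go they produce the non-word goed, which is why a listed entry takes precedence in `full_forms`.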
The Semantic Imperative In considering the NLP applications of word-sense disambiguation, information extraction, question answering, and summarization, there is a clear need for increasing amounts of semantic information. The main problem facing these applications is an inability to identify paraphrases, that is, identifying whether a complex string of words carries more or less the same meaning as another string. Research in the linguistic community continues to refine methods for characterizing, representing, and using semantic information. At the same time, researchers are investigating properties of word use in large corpora (see Corpus Linguistics and Lexical Acquisition). As yet, the symbolic content of traditional dictionaries has not been merged with the statistical properties of word usage revealed by corpus-based methods. Dictionary publishers are increasingly recognizing the value of electronic versions and are putting more information in these versions than appears in the print versions (see Computers in Lexicography). McCracken (2003) describes several efforts to enhance a dictionary database as a resource for computational applications. These efforts include much
greater use of corpus evidence in creating definitions and associated information for an entry, particularly variant forms, morphology and inflections, grammatical information, and example sentences (see Corpus Lexicography; Concordances; and Idiom Dictionaries). The efforts also include the development of a semantic taxonomy based on lexicographic principles and statistical measures of definitional similarity. The statistical measures are also used for automatic assignment of domain indicators. Collocates for senses are being developed based on various clues in the definitions (e.g., lexical preferences for the subject and object of verbs, see Collocations). Corpus-based methods have also been used in the construction of a thesaurus. A lexicon of a person, language, or branch of knowledge is inherently a very complex entity, involving many interrelationships. Attempting to comprehend a lexicon within a computational framework reveals the complexity. Despite the considerable research using computational lexicons, the computational understanding of meaning still presents formidable challenges.
See also: Collocations; Computers in Lexicography; Concordances; Corpus Lexicography; Dictionaries and Encyclopedias: Relationship; Disambiguation, Lexical; Discourse Parsing, Automatic; Frame Semantics; Human Language Technology; Idiom Dictionaries; Information Extraction, Automatic; Learners’ Dictionaries; Lexical Conceptual Structure; Lexical Semantics: Overview; Lexicography: Overview; Lexicology; Lexicon: Structure; Meronymy; Named Entity Extraction; Natural Language Understanding, Automatic; Polysemy and Homonymy; Question Answering from Text, Automatic; Selectional Restrictions; Semantic Primitives; Summarization of Text, Automatic; Symbolic Computational Linguistics: Overview; Synonymy; Terminology and Terminological Databases; Text Mining; Text Retrieval Conference and Message Understanding Conference; Thesauruses; WordNet(s).
Bibliography Ahlswede T (1985). ‘A tool kit for lexicon building.’ Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics. Chicago, Illinois: Association for Computational Linguistics. June 8–12. Amsler R A (1980). ‘The structure of the Merriam-Webster pocket dictionary.’ Ph.D. diss., Austin: University of Texas. Amsler R A (1986). ‘Computational lexicology: a research program.’ In Maffox A (ed.) American Federated Information Processing Societies Conference Proceedings. National Computer Conference, Arlington, VA: AFIPS Press. 397–403.
Atkins B T S (1991). ‘Building a lexicon: the contribution of lexicography.’ International Journal of Lexicography 4(3), 167–204. Boguraev B & Briscoe T (1987). ‘Large lexicons for natural language processing: utilising the grammar coding system of LDOCE.’ Computational Linguistics 13(3–4), 203–218. Chodorow M, Byrd R & Heidorn G (1985). ‘Extracting semantic hierarchies from a large on-line dictionary.’ Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics. Chicago, IL: Association for Computational Linguistics. Dolan W, Vanderwende L & Richardson S (2000). ‘Polysemy in a broad-coverage natural language processing system.’ In Ravin Y & Leacock C (eds.) Polysemy: theoretical and computational approaches. Oxford: Oxford University Press. 178–204. Evens M (ed.) (1988). Relational models of the lexicon: representing knowledge in semantic networks. Cambridge: Cambridge University Press. Evens M & Smith R (1978). ‘A lexicon for a computer question-answering system.’ American Journal of Computational Linguistics 4, 1–96. Fellbaum C (ed.) (1998). WordNet: an electronic lexical database. Cambridge: MIT Press. Firth J R (1957). ‘Modes of meaning.’ In Firth J R (ed.) Papers in linguistics 1934–1951. Oxford: Oxford University Press. 190–215. Gove P (ed.) (1972). Webster’s seventh new collegiate dictionary. Springfield, MA: G. & C. Merriam Co. Grishman R (2003). ‘Information extraction.’ In Mitkov R (ed.) The Oxford handbook of computational linguistics. Oxford: Oxford University Press. Hirst G (1987). Semantic interpretation and the resolution of ambiguity. Cambridge: Cambridge University Press. Ide N & Veronis J (1990). ‘Very large neural networks for word sense disambiguation.’ Proceedings of the 9th European Conference on Artificial Intelligence. Stockholm. Ide N & Veronis J (1993). ‘Extracting knowledge bases from machine-readable dictionaries: have we wasted our time?’ Proceedings of Knowledge Bases and Knowledge Structures 93. Tokyo. 
Kilgarriff A & Palmer M (2000). ‘Introduction to the special issue on SENSEVAL.’ Computers and the Humanities 34(1–2), 1–13. Litkowski K C (1978). ‘Models of the semantic structure of dictionaries.’ American Journal of Computational Linguistics 4, 25–74. Mani I (2001). Automatic summarization. Amsterdam: John Benjamins. McCracken J (2003). ‘Oxford dictionary of English: current developments.’ Companion volume of the 10th conference of the European Association for Computational Linguistics. Budapest, Hungary. Nida E A (1975). Componential analysis of meaning. The Hague: Mouton. Olney J, Revard C & Ziff P (1968). Toward the development of computational aids for obtaining a formal
semantic description of English. Santa Monica, CA: System Development Corporation. Over P (ed.) (2004). Document understanding workshop. Human Language Technology/North American Association for Computational Linguistics Annual Meeting. Association for Computational Linguistics. Proctor P (ed.) (1978). Longman dictionary of contemporary English. Harlow, Essex: Longman Group. Quillian M R (1968). ‘Semantic memory.’ In Minsky M (ed.) Semantic information processing. Cambridge: MIT Press. 216–270. Saint-Dizier P & Viegas E (eds.) (1995). Computational lexical semantics. Cambridge: Cambridge University Press.
Soukhanov A (ed.) (1992). The American heritage dictionary of the English language (3rd edn.). Boston: Houghton Mifflin Company. Van Eynde F & Gibbon D (eds.) (2000). Lexicon development for speech and language processing. Dordrecht: Kluwer Academic Publishers. Voorhees E M & Buckland L P (eds.) (2004). National Institute of Standards and Technology Special Publication 500-255. The Twelfth Text Retrieval Conference (TREC 2003). Gaithersburg, MD: National Institute of Standards and Technology. Wilks Y A, Slator B M & Guthrie L M (1996). Electric words: dictionaries, computers, and meanings. Cambridge: The MIT Press.
Computational Linguistics: History Y Wilks, University of Sheffield, Sheffield, UK © 2006 Elsevier Ltd. All rights reserved.
Introduction A remarkable feature of the 50-year history of natural language processing (NLP) by computer, alias computational linguistics (CL), is how much of what we now take for granted in terms of topics of interest was there at the very beginning; all the pioneers lacked were computers. In the 1950s and 1960s, King was arguing for statistical machine translation, Masterman for the power of a semantic thesaurus, Ceccato for conceptual codings (Ceccato, 1961), and Yngve, still working at the time of writing, had designed COMIT, a special programming language for NLP, and had refined his famous claim about the effect of a limitation on processing resources on permissible syntactic structures in a language (Yngve, 1960). The latter project brought him into direct conflict with Chomsky over the permissible ways of drawing syntactic tree structures, which can now be seen to have constituted a defining moment of schism in the history of NLP in its relationship to mainstream linguistics. It was the foundational schism, not healed until decades later when Gazdar became the first major linguist to embrace a computational strategy explicitly. Machine Translation (MT) is the subject of a separate article and will be described only indirectly here, but it must always be remembered that it was the original task of NLP and remains a principal one; however, there is now a wide range of other NLP tasks that researchers are investigating and for which companies sell software solutions: question
answering, information extraction, document summarization, etc. Thus, NLP does require a task: it is not in itself a program of scientific investigation, which is what CL normally claims to be, and that remains a significant difference between two very close terms. It is also important to distinguish major tasks, such as those just mentioned, from a wide range of tasks that are defined only in terms of linguistic theories, and whose outcomes can only be judged by experts, as opposed to naïve users of the results of the major tasks above. These non-major tasks include word-sense disambiguation (e.g., Yarowsky, 1995), part-of-speech tagging, syntactic analysis, parallel text alignment, etc. CL is more associated with these tasks than with the very general tasks listed earlier, and they can be taken as ways of testing theories rather than producing useful artifacts. Linguists are not the only scientists wishing to test theories of language functioning – there are also psychologists and neurophysiologists – and the dominant linguistic paradigm of the last half century, Chomsky’s, has never believed that CL was the way to test linguistic theories. This dispute is over what constitutes the data of language study: it very clearly separates NLP and CL on the one hand, from linguistics proper on the other, where data is intimately connected with the intuitions of a speaker rather than with computable processes. Since 1990, emphasis has shifted to the use of corpora, of actual texts, rather than those imagined or written by linguists. Corpora are now normally gleaned from the Web, and have become the canonical data of NLP and CL. An element in the history of NLP/CL that cannot be overemphasized is the effect of hardware developments that have produced extraordinary increases in
the storage and processing power available for experiments. This is obvious, and its effect on the field’s development can be seen by considering the case of Sparck Jones’ thesis (1966/1986), which was almost certainly the first work to apply statistical clustering techniques to semantic issues and the first to make use of a large lexical resource, namely Roget’s Thesaurus. Her statistical ‘clump’ algorithms required the computation of large matrices that simply could not be fully computed with the tiny machines in use in 1964, with the result that this work’s significance was not appreciated at the time, and it has been rediscovered, usually without knowledge of the original, at regular intervals ever since. The first piece of work to capture attention outside mainstream NLP was Winograd’s SHRDLU thesis at MIT in 1971 (Winograd, 1971). One reason for the interest it aroused in the wider AI community was its choice of domain: the MIT Blocks World used for robotics and planning research, which consisted of blocks of different shapes that could be stacked, and were either real or simulated (simulated in Winograd’s case) as well as a crane and a box for putting blocks in, all on a table top. It was a small world about which it was possible to know every fact. Winograd designed a dialogue program that discussed this world and manipulated it by responding to requests such as ‘‘put the red block on the green block into the box.’’ This system had many sophisticated features, including an implementation of a Halliday grammar in a procedural language, PROGRAMMAR, that prefigured LISP, the language designed specifically for processing strings of symbols, such as sentences. It also had a method of forming up truth conditions in a form in LISP that could then be evaluated against the state of the Blocks World. 
These conditions expressed the semantic content of an utterance and their value, when run, gave the denotation of the sentence, which might be the name of a block, or false if nothing satisfied them. This was an elegant and procedural implementation of the Fregean distinction of sense and reference. Like most systems of its time, it was not available for general testing and performed on only a handful of sentences. SHRDLU’s virtues and failings can be seen by contrasting it with a contemporary system from Stanford: Colby’s PARRY dialogue system (Colby, 1973). This, also programmed in LISP, was made available over the then young Internet and tested by thousands of users, who often refused to believe they had not been typing to a human being. It simulated a paranoid patient in a Veterans’ Hospital, and had all the interest and conversational skills that Weizenbaum’s more famous but trivial ELIZA lacked. It was very robust,
appeared to remember what was said to it, and reacted badly when internal parameters called FEAR and ANGER became high. It did not repeat itself and appeared anxious to contribute to the conversation when subjects about which it was paranoid were touched on: horses, racing, Italians, and the Mafia. It had no grammar, parsing or logic like SHRDLU, but only a very fast table of some six thousand patterns that were matched onto its input. Contrasts between these two systems show issues that became more important later in NLP: widely available and robust systems versus toy ones; grammar parsing, which was cumbrous and rarely successful, versus surface pattern matching (later to be called information extraction); systems driven by world knowledge versus those which were not, such as PARRY, and which essentially ‘knew’ nothing, although it would have been a far better choice as a desert island companion than SHRDLU. We began this historical essay by looking briefly at samples of important and prescient early work, then showing two contrasting, slightly later, approaches to the extraction of content, evaluation, representation, and the role of knowledge. We shall now consider five types of system based on their own theoretical and methodological assumptions, and in this way try to get a picture of the range of influences that have been brought to bear on CL/NLP since the early 1970s.
Systems in Relation to Linguistics Explicit links between CL/NLP and linguistics proper are neither as numerous nor as productive as one might imagine. We have already referred to the early schism between Yngve and Chomsky over the nature of tree representations and, more importantly, over the role of procedures and processing resources in the computation of syntactic structure. Yngve claimed that such computation had to respect limits on storage capacity for intermediate structures, which he assumed corresponded to innate constraints on human processing of languages, such as George Miller’s contemporary claim about the depth of human linguistic processing. Chomsky, on the other hand, assigned all such considerations to mere language performance. In the 1960s, there were a number of attempts to program Chomsky’s transformational grammars to parse sentences: the largest and longest running was at IBM in New York. These were uniformly unsuccessful in that they parsed little or nothing beyond the sentences for which they had been designed, and even then produced a large number of readings between which it was impossible to choose. This last was the fate of virtually all syntactic analyzers until the more
recent statistical developments described below, including the original Harvard analyzer of Kuno and Oettinger (1962), and the parsers based on the more sophisticated linguistic grammars of the 1970s and 1980s. The last were linguistically motivated but designed explicitly as the basis for parsers, unlike linguistic grammars; the best known was GPSG from Gazdar and colleagues (Gazdar, 1982), which constituted a return to phrase structure, together with procedures for access to deeply nested constituents that owed nothing to transformations. Later came LFG (lexical-functional grammar) from Kaplan and Bresnan (1982) and FUG (functional unification grammar) from Martin Kay (1984) which, like Winograd earlier, was inspired by Halliday’s grammars (Halliday, 1976), as well as the unification logic paradigm for grammar processing that came in with the programming language Prolog. These researchers shared with Chomsky, and linguists in general, the belief that the determination of syntactic structure was not only an end in itself, in that it was a self-sufficient task, but was also necessary for the determination of semantic structure. It was not until much later, and the development of techniques such as information extraction, that this link was questioned with large-scale experimental results. However, it was questioned very early by those in NLP who saw semantic structure as primary and substantially independent of syntactic structure as far as the determination of content was concerned; these researchers, such as Schank and Wilks in the 1960s and 1970s, drew some inspiration and support from the case grammar of Fillmore (1968). He had argued, initially within the Chomskyan paradigm, that the case elements of a verb are crucial to sentence structure (e.g., agents, patients, recipients of actions), an approach which came to emphasize the semantic content of language more than its grammatical structure, since these case elements could appear under many grammatical forms. 
There have been hundreds of attempts to parse sentences computationally into case structure and Fillmore remains almost certainly the linguist with the most explicit influence on NLP/CL as a whole. Syntactic and semantic structure can be linked in another way to procedures by considering the traditional issue of the center-embedding of sentences in English. The rule: S → aSb,
where ab is a sentence, is generally considered a rule of English, producing sentences such as the cat the man bit died. The problem is that repeated applications of the rule rapidly produce sentences that are
well-formed but incomprehensible, such as the cat the man the dog chased bit died and so on. Evidence suggests there may be resource limitations on repeated applications of rules, corresponding in some way to syntactic processing limitations in the human, which is no surprise within NLP, but which has no place within linguistics. However, the situation is more complex: DeRoeck and colleagues (DeRoeck et al., 1982) found the following perfectly comprehensible sentence: isn’t it more likely that example sentences that people that you know produce are more likely to be accepted which, give or take the isn’t it true that, has the same depth of syntactic center-embedding as the (incomprehensible) cat-dog sentence above. This seems to show that, even given some depth limitation on the comprehension of center-embeddings, there may be another effect at work, namely that the sentence above is understood not by means of syntactic analysis at all but by some other, possibly more superficial, surface semantic coherence, which the cat-dog sentence fails to possess. This is precisely the sort of consideration that motivated the semantics-based understanding movement of the 1960s and 1970s.
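The center-embedding rule discussed above, which wraps a noun phrase and its verb around an inner sentence, can be sketched as a short recursive function. The word pairs in `PAIRS` come from the cat-dog example in the text; the function names and the pairing of each noun phrase with its verb are assumptions made for illustration.

```python
def center_embed(pairs):
    """Apply the embedding rule once per (a, b) pair, outermost pair first."""
    if not pairs:
        return []
    (a, b), inner = pairs[0], pairs[1:]
    # a and b wrap around whatever the inner pairs produce.
    return [a] + center_embed(inner) + [b]

# Noun phrase and verb pairs taken from the example sentences in the text.
PAIRS = [("the cat", "died"), ("the man", "bit"), ("the dog", "chased")]

def sentence(depth):
    """Build the sentence with `depth` levels of center-embedding."""
    return " ".join(center_embed(PAIRS[: depth + 1]))

for depth in range(3):
    print(depth, sentence(depth))
```

Each extra level adds one noun phrase on the left and one verb on the right, so the distance between a subject and its verb grows with depth, which is one way to picture the processing load at issue.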
Representation Issues: Logic, Knowledge, and Semantics There is an extreme view of NLP, held by AI researchers for whom logic and knowledge representation are still its main technique, that, in Hewitt’s words, ‘‘language is just a side-effect’’ (Hewitt, 1971). By that he meant that, since AI could be seen as knowledge-based processing, a full computer-based representation of knowledge would alone effect the understanding of human language, a matter which then has no intrinsic interest of its own. Unsurprisingly, this view has little support in NLP/CL, but it does capture a core AI view about the universal power of logic-based knowledge representation, a vision of some antiquity, going back at least to Carnap’s Logische Aufbau der Welt, the logical structure of the world (Carnap, 1928). The central AI vision, a standard view since McCarthy and Hayes (1969), is that some version of the first-order predicate calculus (FOPC), augmented by whatever mechanisms are necessary, will be found sufficient for this task of representing language and knowledge. This position, and its parallel movement in linguistic semantics, claims that logic can and should provide the underlying semantics of natural language, and it has had a profound and continuing effect on CL/NLP. In linguistics, the view is usually ascribed first to Lakoff’s generative semantics movement, in some
764 Computational Linguistics: History
ways a natural extension to transformational grammar, albeit never acknowledged by Chomsky, given the logical origins of that movement in Carnap’s rules of transformation as part of what he called logical syntax. Its high point was Montague’s model-theoretic semantics (Montague, 1970) for English in the late 1960s, which aimed to formalize language semantics independently of Chomsky’s theories. Although these movements, in AI and linguistics, have many formal achievements in print, they have had little success in producing any general and usable program to translate English to formal logic, nor indeed any demonstration from psychology that such a translation into logic would correspond to the human storage and manipulation of meaning. In more surface-oriented and recent movements such as information extraction, a task driven largely by evaluation competitions run by the US agency DARPA, the translation of English to FOPC structures remains a goal, but no one has yet set realistic standards for its achievement. Part of the problem that any such translation scheme raises is the following: logical structure is not a mere decoration but something designed to take part in proofs. There will undoubtedly be NLP applications that require logical inferences to be established between sentence representations but, if those are only part of an application (e.g., the consistency of times in an airline reservation system), it is not clear they have anything to do with the underlying meaning structure of natural language, and hence with CL/NLP proper. At this point, there are a number of possible routes to take: one can say (a) that logical inferences are intimately involved in the meaning of sentences, since to know their meanings is to be able to draw inferences, and logic is the best way to do that. A recent survey of such approaches in linguistics is in Pulman (2005).
One can also say (b) that there can be meaning representation outside logic, and this can be found in linguistics back to the semantic marker theories of Fodor and Katz (1963), developed within the transformational paradigm, as well as, quite independently, in NLP as forms of computational semantics. There is also a more extreme position (c) that the predicates of logic, and formal systems generally, such as ontologies, only appear to be different from human language (often accentuated by writing their predicates in capital letters), but this is an illusion, and their terms are in fact the language words they appear to be, as prone to ambiguity and vagueness as other words; both sides of this are argued by Nirenburg and Wilks (2001). Under (a) in the preceding paragraph, one should note the highly original work of Perrault and colleagues at Toronto in the late 1970s (Perrault et al.,
1980) who were the first group to compute over beliefs represented in FOPC so as to assign speech acts to utterances in a dialogue system. Speech acts are a notion drawn from Searle’s work in philosophy, which has become the central concept in computational pragmatics, one that might enable a system to distinguish a request for information from an apparent question that is really a command, such as Can you close the door? The Toronto system was designed as a railway advisory system for passengers, and made use of limited logical reasoning to establish, for example, that the system knew when a given train arrived, and the passenger knew it did, so the question Do you know when the next train from Montreal arrives? would not be, as it might appear, about the system’s own knowledge. Under (b) above, one can indicate the NLP tradition of the 1970s and 1980s of conceptual/semantic codings of meaning (already mentioned in the last section) by means of a language of primitive elements and the drawing of (nonlogical) inferences from structures based on them. The best known of such systems were Schank’s conceptual dependency system (1975) and Wilks’s (Wilks and Fass, 1992) preference semantics system; both were implemented in interlingual MT systems, and a range of other applications. Schank’s system was based on a set of 14 primitive verbs and Wilks’s on a set of about 80 primitives of various types. Schank asserted firmly that his primitives were not English words, in spite of similarities of appearance (e.g., with INGEST), whereas Wilks argued there could be many sets of primitives and that they were no more than privileged words, as in dictionary definitions (see ‘Corpora, Resources, and Dictionaries’ below). Wilks’s notion of preference became well known: that verbs and adjectives have preferred agents, objects, etc. and that knowledge of these default preferences is the major method of ambiguity resolution. 
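The idea that default preferences can resolve ambiguity lends itself to a short illustration. What follows is a minimal sketch, assuming a hypothetical inventory of word senses, semantic classes, and verb preferences; it is not a reconstruction of Wilks’s system, and none of the names below come from it.

```python
# Hypothetical semantic classes for a handful of word senses.
SENSE_CLASS = {
    "crook/criminal": "HUMAN",
    "crook/staff": "OBJECT",
    "bank/institution": "ORGANIZATION",
    "bank/riverside": "PLACE",
}

# Hypothetical preferences: the class each verb prefers for agent and object.
VERB_PREFS = {
    "rob": {"agent": "HUMAN", "object": "ORGANIZATION"},
}


def satisfied(verb, agent_sense, object_sense):
    """Count how many of the verb's preferences a pair of senses satisfies."""
    prefs = VERB_PREFS[verb]
    return (int(SENSE_CLASS[agent_sense] == prefs["agent"])
            + int(SENSE_CLASS[object_sense] == prefs["object"]))


def best_reading(verb, agent_senses, object_senses):
    """Return the sense pair satisfying the most preferences.

    Preferences are defaults, not hard constraints: a reading that
    violates some of them can still win if no alternative does better.
    """
    candidates = [(a, o) for a in agent_senses for o in object_senses]
    return max(candidates, key=lambda pair: satisfied(verb, *pair))


print(best_reading("rob",
                   ["crook/staff", "crook/criminal"],
                   ["bank/riverside", "bank/institution"]))
```

For the invented sentence the crook robbed the bank, the sketch selects the criminal sense of crook and the institution sense of bank, because that pair satisfies both preferences of rob. Scoring readings rather than rejecting them outright is the design choice that distinguishes preferences from selection restrictions treated as absolute constraints.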
Such preferences were later computed statistically when NLP became larger scale and more empirical (see ‘Statistical and Quantitative Methods in NLP’ below). Schank later developed larger-scale structures called scripts that became highly influential as a way of capturing the overall meaning of texts and dialogues. There are strong analogies between this strand of NLP work and contemporary work in linguistics, particularly with Fillmore and Lakoff, but there was at that time little or no direct contact between researchers in NLP and linguistics proper. That is one of the most striking changes over the last 20 years, and the simplest explanation is distance from Chomsky’s distaste for all things computational, and the realization by linguists, at least since the work of Gazdar, that computational methods could be central
for them. In spite of this distance, there were undoubtedly influences across the divide: no one can see the semantic structures of Jackendoff (1983), involving structured sequences of primitives such as: CAUSE GO LIQUID TO IN MOUTH OF
as representing drink, without feeling their similarity to the earlier NLP structures mentioned above.
Corpora, Resources, and Dictionaries In the 1960s, Masterman (1957) and Sparck Jones (1966/1986) had made use of Roget’s Thesaurus, punched onto IBM cards, as a device for word sense disambiguation and semantic primitive derivation, respectively, even though they could not do serious computations on them with the computers then available. Subsequently, large-scale linguistic computation was found only in MT, and in the era of the influence of AI methods in CL/NLP, the vocabularies of working systems were found to average about 35, which gave rise to the term ‘toy systems’ to refer to most of the systems described above. But there were movements to bring together substantial corpora of texts for experiments, although these were driven largely from the humanities and in the interests of stylistic studies and statistical measures of word use and distribution. The best-known of these was the Brown-Oslo-Bergen corpus of English (Francis and Kucera, 1964), but the British National Corpus was constructed explicitly with the needs of NLP in mind, and the University of Lancaster team, under Geoffrey Leech, played a key role in its construction. This group had already created the first effective piece of corpus-based statistical NLP, the part-of-speech tagger CLAWS4 (Garside, 1987). At very much the same time, in the early 1980s, interest arose in the value to NLP, not only of text corpora, but specifically of the texts that are dictionaries, both monolingual and bilingual. Bran Boguraev in Cambridge was one of the first researchers (since very early work on Webster’s Third Dictionary at Systems Development Corporation in the 1960s; Olney et al., 1968) to seek to make use of a dictionary via its electronic printing tape, in this case of the Longman Dictionary of Contemporary English, a dictionary specifically designed for foreign learners of the language. This had definitions with restricted syntax drawn from a vocabulary of only 2000 words. 
In the 1980s, there was a great deal of activity devoted to extracting computational meaning on a large scale from such machine-readable dictionaries (see Wilks et al., 1996): it seemed a sensible way to overcome the toy system problem, and after all dictionaries contained meaning, did they not, so
why not seek it there? Substantial and useful semantic databases were constructed automatically from LDOCE and a range of other dictionaries, again usually dictionaries for learners of English since they expressed themselves more explicitly than traditional dictionaries for scholars and the broadly educated. Hierarchical ontologies were constructed automatically, and these databases of definitions remain, along with thesauri, a component database for many major systems for resolving word sense ambiguity. But such dictionaries were not a panacea that cured the problem of meaning, and it became clear that dictionaries themselves require substantial implicit knowledge to be of computational use, knowledge both of the world and of the primitive vocabulary contained in their definitions. Brief mention should be made here of systematic annotation codings – the automatic attachment of tags representing linguistic information to the words of a text – which began, again, in the humanities with the language SGML for marking up corpora. This type of annotation has now become a huge range of annotations in differing modalities, the best known of which are HTML and XML, the annotations underlying the World Wide Web. A curious effect of all this has been to bring programs, once thought of as quite disjoint from texts, into the space of objects that are themselves annotated texts, which is an unexpected new universality for linguistics, taken broadly. Another quite independent source of annotated corpus resources were tree banks, of which the Penn Tree Bank (Marcus, 1993) is the best known: a corpus syntactically structured by hand, with the syntactic structure being added to the text as annotations, indicating structure and not merely categories. 
One effect of the wide use of the Penn Tree Bank for experiments was to enshrine the texts used for it, in particular sections of the Wall Street Journal, as ueber-corpora, used so much and so often that some believed their particular features had distorted NLP research. In the recent past, great energy and discussion have been put into selecting and balancing corpora (dialogue, novels, memoranda, etc.), but this activity is becoming irrelevant because of the growing use of very large parts of the World Wide Web itself as a corpus that can be annotated. The so-called Semantic Web project (Berners-Lee et al., 2001) has as one of its aims the annotation of the whole Web-as-a-corpus, so that machines as well as humans can read its content. This is a project that envisages such annotations as reaching further than traditional linguistic annotations, of say syntactic or semantic type, right up to annotating logical structure. This goal brings the project back to the traditional AI one of automatically translating the whole of human
language into logic. The value of this translation, even if possible, has yet to be shown in practice.
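The automatic construction of hierarchies from learners’ dictionaries, mentioned earlier in this section, can be pictured with a deliberately crude sketch. The definitions below are invented in the style of a restricted defining vocabulary and are not quoted from LDOCE; the heuristic of taking the first word after a leading article as the genus term is an assumption made for illustration only, as are the function names.

```python
# Hypothetical definitions in the style of a learners' dictionary
# (invented for this sketch, not quoted from any real dictionary).
DEFINITIONS = {
    "spaniel": "a dog with long ears",
    "dog": "an animal that barks",
    "oak": "a tree with hard wood",
    "tree": "a plant with a trunk",
}

ARTICLES = {"a", "an", "the"}


def genus(definition):
    """Crude heuristic: the first word after a leading article is the genus."""
    words = definition.split()
    if words and words[0] in ARTICLES:
        words = words[1:]
    return words[0] if words else None


def hypernym_chain(word):
    """Follow genus terms upward until a word has no definition of its own."""
    chain = [word]
    while chain[-1] in DEFINITIONS:
        parent = genus(DEFINITIONS[chain[-1]])
        if parent is None or parent in chain:  # empty or circular definition
            break
        chain.append(parent)
    return chain


print(hypernym_chain("spaniel"))
print(hypernym_chain("oak"))
```

Each chain stops at a word, here animal or plant, that the toy dictionary does not itself define. That stopping point is a small-scale picture of the difficulty noted above: the definitions presuppose knowledge of the primitive vocabulary in which they are written.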
Statistical and Quantitative Methods in NLP This movement is the most difficult to survey briefly, largely because it is currently on-going (see Manning and Schuetze, 1999). In the 1960s, Gilbert King predicted that MT could be done by statistical methods, on the grounds of the well-known 50% redundancy of characters and words in Western languages, though it is not easy to see why the second justified the first. Later, and as we saw earlier, Sparck Jones pioneered what were essentially IR methods to produce semantic classifications, intended ultimately for use in MT. We noted earlier that the first clear example of modern statistical NLP was the work by Leech and his colleagues on the CLAWS4 part-of-speech tagger in the late 1970s. At the time, few could see the interest of assigning part-of-speech categories to text words. Yet now, only two decades later, almost all text processing work starts with a part-of-speech assignment phase, since this is now believed (even at about 98% accuracy, the usual level achieved) to simplify all subsequent linguistic processes, by filtering out a large range of possibilities that used to overtax syntactic analyzers. The undoubted success of such methods showed that analysis decisions previously believed to require high-level syntactic or semantic information could in fact be taken at a low level by methods such as n-gram statistics over sequences of words. The greatest impetus for statistical NLP, however, came from work on MT, a research program of Jelinek (Brown et al., 1990) and his group at IBM, who were applying methods that had been successful in automatic speech recognition (ASR) to what had been considered a purely symbolic (linguistic or AI) problem. Jelinek began asking what phenomenon should be modeled (answer, translation) and then sought examples of that human skill for the application of machine learning. The most obvious case was parallel corpora: texts expressing the same meaning in more than one language. 
These were widely available and he took the Canadian Hansard texts in English and French. We can already see some of the major forms machine learning (ML) in NLP can take: in the CLAWS4 work, part-of-speech tagging had been annotated onto text by humans and the ML algorithms were then set to recapitulate those annotations, in the sense of being able to tag new unseen texts at some acceptable level of accuracy.
This is called supervised ML; in Jelinek’s work, on the other hand, although the targets to be learned are given, namely the translations in the parallel texts, the training material had not been produced specifically for this task, but consisted of naturally occurring texts, albeit produced by people. Many would call this weakly supervised ML. In the work of Sparck Jones, however, the clusters found were not set up in advance, which is normally called unsupervised ML. Jelinek’s work produced an accuracy level of about 50% of sentences translated correctly, a remarkable fact given that the system had no linguistic knowledge of any kind. When applied to new, unseen texts, it failed to beat the traditional, hand-coded MT system SYSTRAN, which had not been trained for specific kinds of text. Jelinek’s CANDIDE system was a benchmark in that it suggested there were limits to purely ASR-derived statistical methods applied to a linguistic task such as MT, and he himself began a program for the derivation of linguistic structures (lexicons, grammars, etc.) by those same statistical ML methods, in an attempt to raise the level of CANDIDE’s success. In doing so he set in motion a movement throughout NLP to learn traditional NLP/CL structures at every linguistic level by those methods. There are now far too many such applications to cite here: ML methods have been applied to the alignment of texts, syntactic analysis, semantic tagging, word-sense disambiguation (Yarowsky, 1995), speech act assignment, and even dialogue management. In the case of some of these traditional tasks, the nature of the task has changed with the evaluation and scoring regimes that have come along with the paradigm shift.
For example, it was conventional to say, only a few years ago, that syntactic parsers had failed, at least for languages like English, and that there simply was no parser that could be relied on to produce a correct parse for an unseen English sentence, or at least not one that could be reliably picked out, by probabilities or other ordering, from a forest of alternatives. However, now that statistically based parsers learn over tree banks and are scored by the number of brackets they can correctly insert, and the appropriate phrase structure annotations they can assign, the issue is merely quantitative and it is no longer considered essential that a full parse (i.e., to the S symbol) is produced. Charniak currently produces the best figures (2001) for doing this. There is a general perception that statistical, or corpus-driven (also known as empirical), linguistics has resulted in a shift to surface considerations in language: the shallower syntactic structures just mentioned that have allowed syntactic analysis to become more useful in linguistic processing, because they are
more successful and reliable. One could also point to the success of the independent task of information extraction (IE; Gaizauskas and Wilks, 1997), which consists, in broad terms, in extracting fact-like structures from texts on a large scale for practical purposes, e.g., all those whom IBM promoted in the 1990s, extracted from public source newspapers. At the 95+% level (this is the norm of acceptability in empirical linguistics), IE has become an established technology, and this has been achieved largely by surface pattern matching, rather than by syntactic analysis and the use of knowledge structures, although the latter have played a role in some successful systems. However, many of the more recent successes of empirical linguistics, again based on ML over corpora, have been in areas normally considered semantic or less superficial in nature, such as word-sense disambiguation and the annotation of dialogue utterances with their dialogue or speech acts, indicating their function in the overall dialogue. It may well be that raising the currently low figure for tagging dialogue acts (80%) to an acceptable level does require more complex structures to be modeled, as was shown to be the case in Jelinek’s approach to MT, e.g., the modeling of dialogue managers and agent belief systems, but it is proving much harder to model and evaluate these independently than was the case for components of an MT system.
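The remark in this section that statistically based parsers are scored by the brackets they insert correctly can be illustrated with a small sketch. It assumes a hypothetical representation of trees as nested tuples and reports the share of matching labelled brackets in each direction (usually called precision and recall); actual evaluation regimes apply further conventions that are not modelled here.

```python
def brackets(tree, start=0):
    """Return (set of labelled spans, end position) for a nested-tuple tree.

    A tree is (label, child, child, ...); a leaf is a plain word string.
    Each span is (label, start, end) over word positions. Because a set is
    used, identical duplicate spans are counted only once (a simplification).
    """
    label, children = tree[0], tree[1:]
    spans, pos = set(), start
    for child in children:
        if isinstance(child, str):
            pos += 1                      # a word advances the position
        else:
            child_spans, pos = brackets(child, pos)
            spans |= child_spans
    spans.add((label, start, pos))
    return spans, pos


def bracket_scores(gold, predicted):
    """Return (precision, recall) of predicted brackets against gold ones."""
    gold_spans, _ = brackets(gold)
    pred_spans, _ = brackets(predicted)
    correct = len(gold_spans & pred_spans)
    return correct / len(pred_spans), correct / len(gold_spans)


# Hypothetical hand annotation and parser output for "the cat saw the dog".
GOLD = ("S", ("NP", "the", "cat"), ("VP", "saw", ("NP", "the", "dog")))
PREDICTED = ("S", ("NP", "the", "cat"), ("VP", "saw"), ("NP", "the", "dog"))

print(bracket_scores(GOLD, PREDICTED))
```

In this invented case the parser output fails to attach the object noun phrase inside the verb phrase, so three of its four brackets match the hand annotation. The score degrades gradually rather than marking the whole parse as wrong, which is the sense in which the issue has become merely quantitative.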
Computational Linguistics as an Independent Paradigm? In conclusion, let us consider briefly to what extent CL/NLP is an independent paradigm (see Cole et al., 1996), rather than being just a subdivision of linguistics (or even AI). It is certainly the case that a small number of linguists have had a disproportionate and continuing influence on the development of CL/NLP: Halliday’s and Fillmore’s work continues to appear in computational paradigms, and Halliday’s influence on Kay’s functional unification grammar is clear. Chomsky, by contrast, has had little influence in CL since the unsuccessful attempts in the 1960s to program transformational grammars. It is also clear that much of the development described in this article can be traced to the influence on CL/NLP of some combination of the following movements:
1. linguistics itself,
2. logic and knowledge representation in AI,
3. statistical methods: speech research, neural net/connectionists, the evaluation community, and information retrieval,
4. lexicographers and corpus experts.
But there is another strand of influence, one hard to describe, but coming directly from computation itself, namely procedure-based theories: those in which the procedures are essential, and not merely the programming of rules constraining some domain. Some elements of NLP constitute a kind of core NLP, definitive of the subject. In such a list one could include:
• Winograd’s procedural expressions of grammar, truth conditions and the movement content of verbs;
• Marcus’s syntactic parser (1980), which put a resource bound on searching structures in an attempt to capture the notion of ‘garden path sentences’;
• Charniak’s (1983) attempt to limit searches of semantic nets by means of a finite resource, on the assumption that correct results are defined partly by the resources available;
• Wilks’ preference semantics (Wilks and Fass, 1992), an attempt to define the best semantic structure for an utterance as the maximally coherent one in terms of satisfied preferences;
• Woods’ display of grammars as a path tracking procedure (ATNs) augmented by recursive pushdown stacks and registers (Woods et al., 1974);
• A number of authors, including Gazdar (Evans and Gazdar, 1996) and Pustejovsky (1996), who attempted to define appropriate dictionary entries by some level of maximal compression of information;
• Waltz and Pollack’s (1985) connectionist model of word-sense choice in terms of affinity and repulsion;
• Much of the work of Yngve referred to at the beginning of the article, especially the notion of limiting syntactic depth;
• Hirst (1990) and others who have attempted to define semantic structure as one progressively revealed and specified by incoming information;
• Grosz’s definition of the accessibility of discourse constituents with a network where partitions are progressively closed off (Grosz and Sidner, 1986).
One could continue with this list, but it might not be especially revealing, and it would certainly not include all or most of NLP/CL.
The act of making it does seek, however, to raise the question of whether there is some distinctive core of CL/NLP that captures human language behavior, as well as machine behavior, by some set of procedures based on information compression and the minimization of effort, a component in several theories on that list. All science is information compression, in a wide sense, and it is certainly plausible that the brain, and any other
language machine, will have available distinctive procedures to do this, as opposed to the brute force methods of statistics, which are implausible as models of human language processing. About this last, Chomsky was probably right. Finally, it is not possible to understand the history of NLP/CL over the last half century without seeing the crucial role of its funders, particularly the U.S. Defense Department, which created MT from nothing in the United States and which, through DARPA and ARPA, has continued to shape the field in the United States and, to some extent, worldwide. In recent years, it has been the DARPA evaluation competitions, open to all, that created information extraction and then the entire empirical linguistics movement we are still participating in. Whether all this effort defended anybody or anything is, of course, another question. See also: Corpora; Fillmore, Charles J. (b. 1929); Fodor, Jerry (b. 1935); Information Extraction, Automatic; Jackendoff, Ray S. (b. 1945); Katz, Jerrold J. (1932–2002); Kay, Martin (b. 1935); Lexicography: Overview; Machine Translation: History; Mark-up Languages: Speech; Meaning, Sense, and Reference; Montague Semantics; Part-of-Speech Tagging; Propositional and Predicate Logic: Linguistic Aspects; Symbolic Computational Linguistics: Overview; Text Retrieval Conference and Message Understanding Conference; Treebanks and Tagsets.
Bibliography Berners-Lee T, Hendler J & Lassila O (2001). ‘The semantic web.’ Scientific American. 25–35. Brown P F, Cocke J, Della Pietra S, Della Pietra V, Jelinek F, Lafferty J, Mercer R L & Roossin P (1990). ‘A statistical approach to machine translation.’ Computational Linguistics 16(2), 79–85. Carnap R (1928). Der Logische Aufbau der Welt. Berlin: Weltkreis. Ceccato S (1961). ‘Operational linguistics and translation.’ In Ceccato S (ed.) Linguistic analysis and programming for mechanical translation. New York: Gordon & Breach. 117–129. Charniak E (1983). ‘Passing markers: a theory of contextual influence in language comprehension.’ Cognitive Science 7, 171–190. Charniak E (2001). ‘Immediate-head parsing for language models.’ In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, New York. 116–123. Colby K M (1973). ‘Simulation of belief systems.’ In Schank R & Colby K M (eds.) Computer models of thought and language. San Francisco: W. H. Freeman. Cole R, Mariani J, Uszkoreit H, Zaenen A & Zue V (eds.) (1996). Survey of the state of the art in human language technology. Cambridge University Press.
Cooper R P (1996). ‘Head-driven phrase structure grammar.’ In Brown P F & Miller J (eds.) Concise encyclopedia of syntactic theories 191–196. Oxford: Pergamon. 152–179. De Roeck A et al. (1982). ‘A myth about centre-embedding.’ Lingua 58, 327–340. Evans R & Gazdar G (1996). ‘DATR: a language for lexical knowledge representation.’ Computational Linguistics 22(2), 167–216. Fillmore C (1968). ‘The case for case.’ In Bach E & Harms T (eds.) Universals in linguistic theory. New York: Holt, Rinehart and Winston. 1–90. Fillmore C (1977). ‘The case for case reopened.’ In Cole R & Sadock J (eds.) Syntax and semantics 8: grammatical relations. New York: Academic Press. 59–81. Fodor J A & Katz J (1963). ‘The structure of a semantic theory.’ Language 39, 170–210. Francis W & Kucera H (1964). A standard corpus of present day edited American English, for use with digital computers. Providence, Rhode Island Department of Linguistics Brown University. Gaizauskas R & Wilks Y (1997). ‘Information extraction: beyond document retrieval.’ Journal of Documentation 36, 70–105. Garside R (1987). ‘The CLAWS word-tagging system.’ In Garside R, Leech G & Sampson G (eds.) The computational analysis of English. London: Longman. Gazdar G (1982). ‘Phrase structure grammar.’ In Jacobson R & Pullum G (eds.) The nature of syntactic representation. Dordrecht: Reidel. 131–186. Reprinted in Kulas J, Fetzer J H & Rankin T L (eds.) (1988) Philosophy, language, and artificial intelligence. Dordrecht: Kluwer. 163–218. Grosz J B & Sidner C (1986). ‘Attention, intentions and the structure of discourse.’ Computational Linguistics 12(3), 175–204. Halliday M A K (1976). ‘Halliday: system and function in language.’ In Kress G (ed.) Selected papers. London: Oxford University Press. Hewitt C (1971). ‘Procedural semantics.’ In Rustin R (ed.) Natural Language Processing Courant Computer Science Symposium 8. New York: Algorithmics Press. 180–198. Hirst G (1990). 
‘Mixed-depth representations for natural language text.’ AAAI Spring Symposium on Text-Based Intelligent Systems, Stanford, March 25–29. Jackendoff R (1983). Semantics and cognition. Cambridge: MIT Press. Kaplan R M & Bresnan J (1982). ‘Lexical-functional grammar: a formal system for grammatical representation.’ In Bresnan J (ed.) The mental representation of grammatical relations. Cambridge: MIT Press. 173–281. Kay M (1984). ‘Functional unification grammar: a formalism for machine translation.’ In Proceedings of the 22nd Conference on Association for Computational Linguistics. Stanford: Association for Computational Linguistics. 75–78. King G W (1961/2003). ‘Stochastic methods of mechanical translation.’ In Nirenburg S, Somers H & Wilks Y (eds.) Readings in machine translation. Cambridge: MIT Press. 45–51.
Computational Stylistics 769 Kuno S & Oettinger A (1962). ‘Multiple-path syntactic analyzer.’ In Proceedings of IFIP Congress ‘62. Munich: 1143–1162. Manning C D & Schuetze H (1999). Foundations of statistical natural language processing. Cambridge, MA: MIT Press. Marcus M (1980). A theory of syntactic recognition for natural language. Cambridge: MIT Press. Marcus M (1993). ‘Building a large annotated corpus of English: the Penn Treebank.’ Computational Linguistics 19, 87–105. Masterman M (1957). ‘The thesaurus in syntax and semantics.’ Mechanical Translation 4(1–2), 35–43. McCarthy J & Hayes P (1969). ‘Some philosophical problems from the standpoint of artificial intelligence.’ In Meltzer B & Michie D (eds.) Machine Intelligence 4. Edinburgh: Edinburgh University Press. Montague R (1970). ‘English as a formal language.’ In Visentini B et al. (eds.) Linguaggi nella Societa e nella Tecnica. Milan: Edizioni di Comunita. 98–119. Nirenburg S & Wilks Y (2001). ‘What’s in a symbol: ontology, representation and language.’ Journal of Experimental and Theoretical Artificial Intelligence 13, 9–23. Olney J, Revard C & Ziff P (1968). ‘Some monsters in Noah’s Ark.’ Research memorandum SP-2698. Systems Development Corp., Santa Monica, CA. Perrault R, Cohen P & Allen A (1980). ‘A plan-based analysis of indirect speech acts.’ Computational Linguistics 6(3–4), 167–182. Pulman S G (2005). ‘Lexical decomposition: for and against.’ In Tait J (ed.) Charting a new course: natural language processing and information retrieval. Cambridge: Cambridge University Press.
Pustejovsky J (1996). The generative lexicon. Cambridge: MIT Press. Schank R (1975). Conceptual information processing. Amsterdam: North Holland. Sparck Jones K (1966/1986). Synonymy and semantic classification. Edinburgh: Edinburgh University Press. Waltz D L & Pollack J (1985). ‘Massively parallel parsing: a strongly interactive model of natural language interpretation.’ Cognitive Science 9(1), 57–84. Wilks Y & Fass D (1992). ‘Preference semantics: a family history.’ Computing and Mathematics with Applications 23(2), 53–74. Wilks Y, Slator B & Guthrie L (1996). Electric words: dictionaries, computers and meanings. Cambridge: MIT Press. Winograd T (1971). Understanding natural language. Cambridge: MIT Press. Woods W, Kaplan R & Nash-Webber B (1974). ‘The lunar sciences natural language information system.’ Final Report 2378. Cambridge, MA: Bolt, Beranek & Newman, Inc. Yarowsky D (1995). ‘Unsupervised word-sense disambiguation rivalling supervised methods.’ In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. Cambridge, MA, 189–196. Yngve V H (1960). ‘A model and an hypothesis for language structure.’ Proceedings of the American Philosophical Society 104(5), 444–466.
Relevant Website http://www.natcorp.ox.ac.uk – British National Corpus.
Computational Stylistics L L Stewart, The College of Wooster, Wooster, OH, USA © 2006 Elsevier Ltd. All rights reserved.
Computational stylistics is the study of the features of literary or nonliterary texts using quantitative, particularly algorithmic, means. As such, it is a subfield both of computational linguistics and of stylistics. Although traditionally style has been identified as relating to the form rather than the content of a text, a more helpful definition in this context may be Birch’s in ELL1 ‘‘the sum of linguistic features which distinguish one text from another’’ (ELL1: 4378). In considering these features, computational stylisticians attempt to replace subjective impressions with more nearly objective analyses based on empirical data. Generally, such study has one of two main
emphases: (1) determining the special or unique features of the writing of a given author and, thus, differentiating that author from others, and (2) determining differences or distinctions within or among the texts of a single writer. The first emphasis raises the question of whether one can identify writers on the basis of stylistic habits or traits and, if so, what these distinguishing traits may reveal about the writer. The assumption of many computational stylisticians is that each writer does indeed have unique stylistic tendencies, a kind of stylistic fingerprint that differentiates that writer from all others. This study of unique features is used in two main ways. First, it is used to classify texts and has become central in the field of nontraditional authorship attribution – nontraditional being the term used to differentiate attribution studies utilizing quantitative and statistical procedures from those using more
traditional historical methods (see Authorship Attribution: Statistical and Computational Methods). If each author’s style is unique, it should be possible to attribute a text of unknown authorship to the writer whose style it matches. Second, the study of a writer’s individual stylistic features is used descriptively insofar as it may be a means of understanding and commenting on the writer’s mind and personality. The author’s characteristic stylistic choices can be seen as a reflection of his or her mental or conceptual framework. The second main emphasis of computational stylistics is the determination of differences or distinctions within or among the texts of a single writer and again may aim toward either classification or description. This kind of study may be used in descriptive literary-critical analyses insofar as it allows the researcher to interpret and comment on meanings within a text. For example, one might attempt to determine whether there are stylistic differences among the dialogues of different characters in a particular play or whether the style changes markedly in certain chapters of a novel. If, for instance, one found that female characters used such words as perhaps, if, possibly, etc., at a significantly higher frequency than male characters, the text might be described as one that viewed women as being less certain than men or as being more affected by the contingencies of the world. Determining differences among the texts of a given writer has also been used for purposes of classification; for instance, techniques developed by Richard Forsyth and others and referred to as stylochronometry are utilized for the dating of texts. The assumption here is that a writer’s stylistic habits may gradually change over time. The purposes indicated above may not appear to differ markedly from the aims of traditional or noncomputational stylistics.
Traditional stylistics also considers textual features in order to indicate the unique traits of a given writer and to interpret textual meaning. As well, traditional stylistics has been used in order to attribute texts of unknown authorship and to establish the dates of texts. However, many computational stylisticians suggest that human observation alone is incapable of processing the many variations and features that make up a style and that studies based on unaided human observation are particularly susceptible to the bias of the observer.
The Methods of Computational Stylistics As the above discussion indicates, computational stylistics is primarily concerned with measuring difference. Determining the unique or special traits of a writer is a matter of finding what traits differ
from those of other writers. Attributing a work of unknown authorship to a given writer is a matter of measuring the text’s relative differences from and similarities to works of possible authors. Commenting on the stylistic traits of characters within a novel depends upon measuring differences in the styles of those characters. The primary questions for computational stylistics then are what differences to measure and how best to measure them. There seem to be two main approaches to determining which features to measure. The first is simply to begin with features that appear significant. These might be features that traditional stylisticians or literary critics have noted or ones that seem significant from the analyst’s own close reading. The other, frequently favored by statisticians and scientists as Horton (ELL1) indicates, is to measure ‘‘many features (perhaps chosen arbitrarily) in control texts, and [use] statistics to find those features that produce statistically significant differences’’ (ELL1: 4384). Computational stylisticians have traditionally recognized these two different approaches. For instance, Milic in one of the early computerized studies, a consideration of Jonathan Swift’s prose style, directly confronted the problem of approach, recognizing the need to measure something that is ‘‘significant, not something which is merely measurable’’ (Milic, 1967: 82). His solution for much of his analysis was to use the first approach, which he spoke of as one that ‘‘begins with an intuition’’ but ends with ‘‘concrete data’’ (Milic, 1967: 83). He simply read through Swift’s works attempting to observe features present more frequently in Swift’s prose than in that of his contemporaries and then counted those features. For instance, he observed that Swift appeared very often to begin sentences with connectives (coordinating conjunctions, subordinating conjunctions, and conjunctive adverbs). 
He then counted the number of initial connectives in 2000 sentence samples of Swift and three other 18th century writers and found Swift to use initial connectives at a rate more than double that of the next closest writer. The intuitive sense that Swift began sentences in a certain way was confirmed by empirical data. Although this first approach has the advantage of measuring features that appear significant to unaided human observation and, thus, may be more readily accepted by traditional literary critics, it is particularly susceptible to human bias and may be viewed as circular or tautological, a well-known and devastating charge made by Fish against Milic and computational stylistics in general. The point is that by deciding to measure a feature that already appears ‘different,’ the researcher has biased the study and found only what was already known. Such an
approach is sometimes said merely to repeat subjective impressions in mathematical language. The second approach seeks to avoid such bias by using statistical means to determine the significant features to measure. A fuller discussion of some of these statistical procedures is given later in this article, but the assumption is that variables should be ‘self-declared.’ Milic, in fact, recognized this need and, in one part of his study, turned to the second method. For this analysis, he created a classification system for parts of speech and manually tagged all words in a number of samples of Swift’s prose and that of several other 18th-century writers. Having designated each word as one of the classes, he ran a series of analyses to test whether the distribution of word types would statistically differentiate the texts of one writer from those of another and, thus, make it possible to determine individual stylistic characteristics. The aim was to have the analysis rather than the analyst determine which word types and distributions were significant. In recent years, this second method has clearly predominated in computational stylistics, and most of the more specific procedures and techniques discussed here are examples of that method.
Measuring Difference Measuring difference raises a number of complicated problems, but several basic statistical procedures described by Hockey, Burrows, and Kenny have often been used by computational stylisticians and can serve as relatively simple examples. The use of normal distribution, standard deviation, and z-scores is one way of measuring comparative data. Normal distribution, the well-known bell-shaped curve, suggests that sets of data are likely to arrange themselves with the majority near the average or mean of all the scores and fewer near either extremity above or below the average. The measurement of distance from the mean is calculated in terms of standard deviation, defined as “the amount by which a set of values differs from the arithmetical mean.” This amount can be stated as a so-called z-score; for example, a z-score of .77 would represent .77 of a standard deviation above the mean and a z-score of −.52 would represent .52 of a standard deviation below the mean. Obviously, z-scores allow the analyst to determine the degree to which any result departs from the norm. For instance, in looking at the distribution of the word ‘but’ among different characters in a given novel, one might find that one character has a z-score of +.63 and another −.78. The significance of difference is frequently calculated with the chi-square test, a procedure that determines the probability of a given result. If, for
example, it is found that one writer uses the word but at a rate of 3.7 times per thousand words and another writer uses the word 4.1 times per thousand, it is important to determine whether this variation is meaningful or simply random. The chi-square test measures the difference between an expected occurrence and the observed occurrence with the formula $\chi^2 = \sum \left[ (\text{observed} - \text{expected})^2 / \text{expected} \right]$, and the resulting statistic is converted into a probability. For instance, a probability of .07 indicates a seven percent chance that a difference this large would arise at random. Normally, statistical significance is determined by a result of less than .05, that is, a figure indicating a less than one-in-twenty probability of the result occurring by chance; highly significant differences are at .01 or less. A third procedure, one now widely used in the investigation of texts, is multivariate analysis. As the name implies, multivariate techniques allow the analysis of many different variables rather than a single one. For instance, principal component analysis reduces any number of variables to a small number of underlying components or factors, each factor being composed of ‘information’ from the different variables. The analysis allows one to determine which variables account for the most variation and, because the many variables can frequently be reduced to two significant factors, makes it possible to graph the results and display them in visual form. Although even more sophisticated statistical procedures are currently being utilized in computational stylistics, the purpose is still to measure difference and to determine whether or not the difference is significant.
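The z-score and chi-square calculations described above can be sketched in a few lines of Python. This is a minimal illustration rather than a procedure taken from any of the studies cited: the function names, the per-character rates, and the 100,000-word sample sizes are all assumptions made for the example, and the probability formula used is valid only for a two-by-two table (one degree of freedom).

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Express each value as a distance from the mean in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation; assumed non-zero
    return [(v - mu) / sigma for v in values]


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Chi-square test for one word's counts in two texts.

    Returns (chi2, p), where p is the probability of a difference at least
    this large arising by chance if both writers used the word at one rate.
    """
    observed = [
        [count_a, total_a - count_a],  # writer A: the word, all other words
        [count_b, total_b - count_b],  # writer B: the word, all other words
    ]
    grand = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b, grand - (count_a + count_b)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p = math.erfc(math.sqrt(chi2 / 2))  # exact for 1 degree of freedom
    return chi2, p


# Hypothetical rates of 'but' per thousand words for five characters.
rates = [3.1, 4.4, 2.7, 3.9, 3.4]
print([round(z, 2) for z in z_scores(rates)])

# The rates from the text (3.7 vs. 4.1 per thousand), assuming 100,000-word
# samples from each writer, i.e. 370 and 410 occurrences.
chi2, p = chi_square_2x2(370, 100_000, 410, 100_000)
print(round(chi2, 2), round(p, 3))
```

Under these assumed sample sizes the probability comes out well above .05, so the difference between 3.7 and 4.1 per thousand would not count as significant; with much larger samples the same rates could.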
Computational Studies Although tests may indicate which differences are statistically significant, obviously it remains the researcher’s task to determine what features to measure. Any attempt to list the many features analyzed by computational stylisticians is doomed to be incomplete and rapidly outdated, but Holmes (1985) offers an important treatment of various studies of different ‘analytic units’ including, among others, studies of word length, syllables, sentence length, distribution of parts of speech, function words, type-token ratio, entropy, and word frequencies. The consideration of several of these procedures may indicate the kind and range of measures being utilized. Word length and sentence length at first glance seem particularly amenable to statistical analysis, and in fact one of the earliest quantitative studies employed word length. Mendenhall (1887) attempted to solve questions of Shakespearean authorship by measuring word length – i.e., the number of letters per word – in texts of Shakespeare, Bacon, Jonson,
and Marlowe. Although word length continues occasionally to be measured, its appropriateness, at least as a single measure, has been called into question, partly because word length is likely to be a reflection of subject matter and genre rather than a characteristic of a writer’s entire style. The measure of sentence length is also problematic, in this case because of its reliance on conventions of punctuation. For example, two texts by the same author could show very different sentence lengths due to punctuation choices made by different editors. A somewhat more reliable measure is the type-token ratio, sometimes referred to as lexical density. The type-token ratio is simply the ratio of the number of different words in a text to the total number of words. For instance, the preceding sentence contains a total of 22 words but only 14 different words. That is, the, ratio, of, and number are all used more than once; each the, for instance, is seen as a token of the word type the. The type-token ratio would be figured by dividing the number of different words by the total number of words and multiplying the answer by 100, in this case, (14/22) × 100 = 63.64. It is usually considered that the more different words that are used and, thus, the higher the type-token ratio, the more difficult or dense the text. However, as Holmes (1985) and others point out, the longer a text, the greater the increase in individual words (tokens) in comparison to the word types. Therefore, any comparison of texts of different lengths must build in a statistical procedure to account for those various lengths. In computational stylistics, the type-token ratio is seldom used by itself but most frequently utilized as one of several measures to characterize or distinguish a text or writer.
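The type-token calculation worked through above can be reproduced with a short Python sketch. The tokenizing rule used here (lowercase everything and treat a hyphenated form such as type-token as one word) is an assumption made for the example; a different rule would give different counts.

```python
import re


def type_token_ratio(text):
    """Return (types, tokens, ratio), with the ratio scaled by 100."""
    # Lowercase the text; keep hyphenated and apostrophized forms as one word.
    tokens = re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())
    types = set(tokens)
    return len(types), len(tokens), 100 * len(types) / len(tokens)


sentence = ("The type-token ratio is simply the ratio of the number of "
            "different words in a text to the total number of words.")
n_types, n_tokens, ttr = type_token_ratio(sentence)
print(n_types, n_tokens, round(ttr, 2))  # prints: 14 22 63.64
```

Because the ratio falls as a text grows longer, a sketch like this is only comparable across samples of equal length, for example by computing it over fixed-size slices of each text.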
Currently, among the more widely used kinds of analysis in computational stylistics is the measure of word frequencies, a measure Burrows (1987) used with great effectiveness in his attempt to determine the idiolects of various characters in Jane Austen’s novels. Most often, researchers focus on the most common or frequent words in the literary text, usually personal pronouns, conjunctions, auxiliary forms, prepositions, adverbs, and articles. These grammatical words, which may constitute half of the total words in a given text, are often ignored in favor of lexical words, that is, words viewed as having semantic content. The argument, however, is that the distribution of these grammatical words reveals marked differences in the language and style of texts. Using principal component analysis to measure the distribution of these common words, a procedure frequently referred to as the Burrows technique, has now become standard both in author-attribution and in literary-critical analyses.
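As a rough illustration of principal component analysis applied to common-word frequencies, the sketch below extracts only the first component, using power iteration on the correlation matrix of word rates. It is not Burrows's own procedure: the function name, the four words, and the rates for two hypothetical writers are invented for the example, and a real study would normally use a statistics package and examine at least two components.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for the first principal component.

    rows: one list of word rates per text, all of the same length.
    """
    n, m = len(rows), len(rows[0])
    # Standardize each column (word) to mean 0 and unit variance.
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - mu) ** 2 for x in c) / n) or 1.0
           for c, mu in zip(cols, means)]
    z = [[(x - mu) / sd for x, mu, sd in zip(row, means, sds)]
         for row in rows]
    # Correlation matrix of the words.
    corr = [[sum(z[k][i] * z[k][j] for k in range(n)) / n for j in range(m)]
            for i in range(m)]
    # Power iteration converges to the dominant eigenvector, provided the
    # (deliberately uneven) starting vector is not orthogonal to it.
    vec = [float(i + 1) for i in range(m)]
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Each text's score is its standardized rates projected on the component.
    scores = [sum(zi * v for zi, v in zip(row, vec)) for row in z]
    return vec, scores


# Hypothetical rates per thousand words for: the, and, of, but.
rows = [
    [62.0, 31.0, 35.0, 4.1],  # writer A, text 1
    [60.5, 30.2, 36.1, 4.4],  # writer A, text 2
    [48.3, 38.9, 28.7, 6.0],  # writer B, text 1
    [49.9, 40.1, 27.5, 5.7],  # writer B, text 2
]
loadings, scores = first_principal_component(rows)
print([round(v, 2) for v in loadings])
print([round(s, 2) for s in scores])
```

With data like these, the two texts by each writer receive scores of the same sign and the two writers fall on opposite sides of zero. The sign of a component is arbitrary, so only the grouping is meaningful.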
The number of words used as variables in this kind of analysis traditionally has ranged from 12 to 50, although Hoover has recently demonstrated that the use of more variables (up to 500 or 600) significantly increases the ability of the method to differentiate among writers. Some researchers include only function words and eliminate nouns and pronouns from their analyses on the grounds that content words tend to reflect subject matter and content rather than style. Still another measure used either singly or in combination with others is the consideration of collocations, usually defined as the frequent co-occurrence of words or lexical items. Just as writers use words at various rates of frequency, so too they may use sequences of words at greater or lesser frequency. Hoover (2002) refers briefly to scholars who have used word sequences in the study of style and, in the same article, argues that combining word frequency with word-sequence frequency produces more reliable results in author attribution research. In a corpus of 870 000 words (29 novels by 17 writers), he finds the most frequent two-word sequences to be of the, in the, to the, it was, he was, and the, and on the. Although these phrases may seem to be meaningless because of their commonality, the fact that different writers use them at significantly different frequencies allows them to be utilized in the study of the unique stylistic features of given writers. Collocations are also used in studies that more directly consider matters of content. For instance, David Miall analyzed Coleridge’s notebooks, showing what words (body, heart, love) collocated with words associated with emotion at various times. Although analyses of word and sentence length, word frequency, collocation, and type-token ratio are among the more frequently used methods in computational stylistics today, the mention of several other methods may indicate some of the range of present-day approaches. 
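Counting two-word sequences of the kind Hoover reports can be sketched as follows. The function name and the sample sentence are illustrative assumptions, and for simplicity the sketch lets sequences run across sentence boundaries, which a careful study might not.

```python
import re
from collections import Counter


def bigram_rates(text, per=1000):
    """Return each two-word sequence's rate per `per` sequences in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pairs = Counter(zip(tokens, tokens[1:]))
    total = max(len(tokens) - 1, 1)  # number of adjacent word pairs
    return {" ".join(pair): per * count / total
            for pair, count in pairs.items()}


sample = "It was the best of the times and it was the worst of the times"
rates = bigram_rates(sample)
# List the most frequent sequences first.
for pair, rate in sorted(rates.items(), key=lambda kv: -kv[1])[:4]:
    print(pair, round(rate, 1))
```

Rates computed this way for several writers could then serve as variables alongside single-word frequencies, which is the combination the passage above describes.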
Unlike the methods mentioned above, all of which count words in one way or another, some computational approaches focus on grammatical or syntactic sequence. For instance, Jackson (2002) analyzes the pauses in Shakespeare’s iambic pentameter lines. Using multivariate analysis to produce correlation coefficients, he demonstrates correlations among the frequencies of pauses in Shakespeare’s plays and the dates of those plays. Such results could of course serve as supplementary evidence in the dating of plays, but they also are evidence of Shakespeare’s prosodic development. A different kind of analysis is the identification of themes in various texts. Early in the use of computers for literary study, Fortier and McConnell produced a program to detect the presence of different themes in
texts by locating words associated with those themes and producing frequency and distribution tables for various parts of the text. More recently, Fortier has examined the way in which several themes appear and interact in the works of Celine, Gide, and others.
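The general idea of theme detection described above, locating theme-associated words and tabulating them across parts of a text, can be sketched as below. This is not a reconstruction of Fortier and McConnell's program: the function name, the theme word lists, and the division of the text into equal parts are all assumptions made for the example.

```python
import re
from collections import Counter


def theme_table(text, themes, parts=4):
    """Count theme-associated words in each of `parts` equal slices of text.

    themes: mapping from a theme name to the set of words associated with it.
    Returns one {theme: count} dictionary per slice, in text order.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    size = max(1, -(-len(tokens) // parts))  # ceiling division
    table = []
    for start in range(0, len(tokens), size):
        segment = Counter(tokens[start:start + size])
        table.append({name: sum(segment[w] for w in words)
                      for name, words in themes.items()})
    return table


# Hypothetical theme lists and a toy text.
themes = {"light": {"bright", "sun"}, "darkness": {"dark", "night"}}
text = "dark night fell " * 2 + "bright sun rose " * 2
for row in theme_table(text, themes, parts=2):
    print(row)
```

Here the first half of the toy text registers only the darkness theme and the second half only the light theme, which is the kind of distribution table such a program would report for the sections of a real work.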
The Results of Computational Stylistics The question for computational stylistics ultimately is whether the impressive measurements of many linguistic features and the careful attention to statistical procedures have produced worthwhile results. Has computational stylistics made a difference to textual study in general and added to our understanding of both literary and nonliterary texts? Although computational stylisticians frequently lament their field’s lack of impact on mainstream or traditional literary and historical study, there have been a number of promising results, in the areas both of authorship attribution and of literary criticism. While stylometric attribution has certainly not yet reached the level of scientific proof, it has come to the point where Burrows can claim, ‘‘Where only two or three writers are eligible candidates for the authorship of a particular text and where that text is of a sufficient length, we are now well equipped to form strong inferences about their rival claims’’ (Burrows, 2002: 267). Mosteller’s and Wallace’s (1964) attribution of several disputed Federalist Papers to Madison is a frequently cited success of computational authorship attribution, but Holmes et al. (2001) and a number of other recent studies also demonstrate how traditional and nontraditional approaches can successfully work in tandem on problems of attribution (see Authorship Attribution: Statistical and Computational Methods). The results of computational stylistics in the area of literary criticism and interpretation are more ambiguous than those in the field of attribution; certainly, many computational stylisticians believe traditional literary scholars do not take computational study seriously. Even here, though, numerous examples demonstrate what computational stylistics is capable of. 
Several of these including Milic’s (1967) relatively early work have already been mentioned in the discussion of computational studies, but Burrows (1987) is one of the fullest applications of computational methods to traditional literary interpretation. In his study of the characters in Jane Austen’s novels, Burrows’ main purpose was to demonstrate ‘‘that exact evidence, often couched in the unfamiliar language of statistics, does have a distinct bearing on questions of importance in the territory of literary interpretation and judgment’’ (Burrows, 1987: 2). Although recognizing the central significance of
close and intelligent readings of literary texts, Burrows argued also for the need for the kind of computational evidence “to which the unassisted human mind could never gain consistent, conscious access,” noting, for example, the 26 000 uses of the in the novels of Jane Austen, a number that defies “the most accurate memory and the finest powers of discrimination” (Burrows, 1987: 2, 3). Burrows is able to demonstrate how apparently inconsequential words help to define characters, noting, for instance, that the imperious Lady Catherine de Bourgh of Pride and Prejudice uses the first-person plural pronoun we less frequently than any of the 47 other major characters in Austen’s novels, a stylistic trait apparently connected to her insolence and exclusivity. As Burrows considers simply this one pronoun, he skillfully moves back and forth from the text, showing how the use of we gives insight into various characters. A particularly interesting and, from a literary-critical point of view, significant set of analyses in Burrows uses multivariate analysis of the 12 most common words to chart changes in the idiolects of three characters from Emma: Mr. Knightley, Mrs. Elton, and Emma herself. In the discussion, Burrows is able to demonstrate how changes in their idiolects reflect changes in the characters as they move through the novel. Overall, Emma and Mrs. Elton are shown in what Burrows calls a ‘parodic’ relationship, their idiolects converging at one point and then moving in opposition. Conversely, the idiolects of Emma and Mr. Knightley, though beginning in proximity, diverge at various places in the novel only to move toward convergence again near the conclusion, as do the characters themselves.
Although Burrows’ text is an especially rich, almost classic, example of the merging of computational stylistics and literary criticism, work continues in this area, much of it to be found in the journal Literary and Linguistic Computing and, until it recently ceased publication, in Computers and the Humanities.
Questions and Controversy If computational stylistics has shown the ability to comment meaningfully on literary texts, the obvious question is why it appears to have had so little impact on mainstream literary study. Part of the answer may be simply that most literary scholars do not understand statistical procedures and do not trust their use on literary problems. This, however, may not be wholly a matter of ignorance on the part of traditional scholars. Rather, it may be that computational stylisticians have too infrequently shown the relationship of their statistical findings to central critical
issues in a text. Although Burrows and others mentioned above do move back and forth between their statistics and the text, many stylistic studies seem reluctant to do so, staying, as Craig (1999) says, ‘‘within the safe confines of the statistical results themselves.’’ This reluctance probably comes in part from a dilemma implicit in the earlier reference to Fish’s criticism of Milic. If the researcher begins an analysis with certain features in mind (e.g., Swift’s use of initial connectives or Austen’s use of a certain kind of word), the charge of circularity or tautology is likely to be leveled. Even though these features may be very much at issue in critical discussions, the researcher is seen as biasing the analysis. On the other hand, if the researcher selects lower-level items and a statistical procedure designed to determine their significance, it may be extremely difficult to move from, say, the relative frequency of the to significant critical commentary. Although some argue that readers are affected by and at least unconsciously are aware of even such apparently low-level differences, the statistics on the surface simply do not appear to have a connection with features that seem meaningful to the reader. This dilemma may be less troubling in attribution studies where the primary question is the pragmatic one of whether the procedure works: Does the analysis allow us to determine that text A is the product of author B rather than author C? For instance, one could hypothesize the discovery of an algorithm based on the relative distribution of the letters x, z, and m that would unerringly assign authorship in every case. It would be the fingerprint or the DNA of a writer’s style and would reflect the unique nature of every writer. 
Such a procedure would seem to be all that is necessary for authorship attribution, but it would appear to tell the stylistician nothing about the writer’s mind or habits and would seem to be of no help to the literary scholar interested in a particular text or writer. Although this hypothetical example may state the dilemma in unrealistically stark terms, the relationship between the method of analysis on the one hand and observable and more apparently interesting and significant textual features on the other remains a problem both real and theoretical. A second aspect of the problem concerns the issue of whether the kind of objectivity for which computational stylistics is thought to strive is, in fact, possible. This issue was at the heart of Fish’s critique of Milic and was again raised in a series of exchanges in scholarly periodicals during the late 1990s. The argument maintains that objectivity is a myth, as is the belief that one can approach textual analysis in a scientific manner. In these arguments, the specter of
circularity is nearly always raised with the charge that quantitative analyses yield no more objective proof than do apparently subjective impressions. Computational stylistics, on this view, simply counts features already labeled prominent. One answer to these charges, however, was given as early as Burrows, who argued that the so-called circularity or tautology of which the field is sometimes accused “is actually a convergence of two mutually supportive lines of argument, each of which would generally stand in its own right” (Burrows, 1987: 218). That is, it is no more circular to discover quantitative evidence validating a point previously made by scholars using, for instance, the methods of historical criticism than it would be to realize that the assumptions of genre or myth criticism lead to the same understanding. Scholarship frequently uses different approaches to the same question; computational stylistics is one of those approaches. As well, although those involved in authorship attribution studies may need, as Joseph Rudman has pointed out, to make even more rigorous use of scientific methods, certainly most computational stylisticians do not claim absolute objectivity, recognizing that new evidence in any field is likely to modify what appears to be true at any given time. Rather, they argue that their methods and results are systematic, explicit, and verifiable. However, even given the cogency of such arguments, there are some signs of movement in new directions in computational stylistics. Ramsay (2003), for instance, suggests that the field has gone wrong with its reliance on “hypothesis testing and empirical validation.” He calls instead for a new kind of “algorithmic criticism,” a more ludic endeavor in which the powers of the computer are enlisted not simply to validate and test but to bring to the fore patterns, insights, and understandings not otherwise available.
Certainly, a number of researchers suggest that computational stylistics should not be simply a more empirical version of traditional literary criticism and particularly that it not be associated primarily with formalist and structuralist literary theory. Instead, computerized textual analysis might lead to a new kind of criticism or at least be used in conjunction with more recent theoretical understandings of literature.
See also: Authorship Attribution: Statistical and Computational Methods; Stylistics; Stylistics: Corpus Approaches.
Bibliography Binongo J N G & Smith M W A (1999). ‘The application of principal component analysis to stylometry.’ Literary and Linguistic Computing 14, 445–465.
Bradley J (2003). ‘Finding a middle ground between “determinism” and “aesthetic indeterminacy”: a model for text analysis tools.’ Literary and Linguistic Computing 18, 185–207. Burrows J F (1987). Computation into criticism: a study of Jane Austen’s novels and an experiment in method. Oxford: Clarendon Press. Burrows J F (2002). ‘“Delta”: a measure of stylistic difference and a guide to likely authorship.’ Literary and Linguistic Computing 17, 267–287. Butler C (ed.) (1992). Computers and written texts. Oxford: Blackwell. Corns T N (1990). Milton’s language. Oxford: Blackwell. Craig H (1999). ‘Authorial attribution and computational stylistics: if you can tell authors apart, have you learned anything about them?’ Literary and Linguistic Computing 14, 103–113. Fish S (1980). ‘What is stylistics and why are they saying such terrible things about it?’ In Fish S (ed.) Is there a text in this class? Cambridge: Harvard University Press. 68–96. Forsyth R S (1999). ‘Stylochronometry with substrings, or a poet young and old.’ Literary and Linguistic Computing 14, 467–477. Fortier P A (1996). ‘Categories, theory, and words in literary texts.’ In Perissinotto G (ed.) Research in humanities computing 5: papers from the 1995 ACH ALLC Conference. Oxford: Oxford University Press. 91–109. Hockey S (2000). Electronic texts in the humanities: principles and practice. Oxford: Oxford University Press. Holmes D I (1985). ‘The analysis of literary style: a review.’ Journal of the Royal Statistical Society 148, 328–341. Holmes D I, Robertson M & Paez R (2001). ‘Stephen Crane and the New York Tribune: a case study in traditional and nontraditional authorship attribution.’ Computers and the Humanities 35, 315–331. Hoover D L (2002). ‘Frequent word sequences and statistical stylistics.’ Literary and Linguistic Computing 17, 157–180. Hoover D L (2003).
‘Multivariate analysis and the study of style variation.’ Literary and Linguistic Computing 18, 341–360.
Jackson M P (2002). ‘Pause patterns in Shakespeare’s verse: canon and chronology.’ Literary and Linguistic Computing 17, 37–46. Kenny A (1982). The computation of style. Oxford: Pergamon. McCarty W (2002). ‘Humanities computing: essential problems, experimental practice.’ Literary and Linguistic Computing 17, 108–125. Mendenhall T C (1887). ‘The characteristic curves of composition.’ Science IX, 237–249. Miall D S (1992). ‘Estimating changes in collocations of key words across a large text: a case study of Coleridge’s notebooks.’ Computers and the Humanities 26, 1–12. Milic L (1967). A quantitative approach to the style of Jonathan Swift. The Hague: Mouton. Mosteller R & Wallace D L (1984). Applied Bayesian and classical inference: the case of the Federalist Papers. New York: Springer-Verlag. Oakman R L (1980). Computer methods for literary research. Columbia, SC: University of South Carolina Press. Opas L L & Tweedie F J (1999). ‘The magic carpet ride: reader involvement in romantic fiction.’ Literary and Linguistic Computing 14, 89–101. Potter R G (ed.) (1989). Literary computing and literary criticism: theoretical and practical essays on theme and rhetoric. Philadelphia: University of Pennsylvania Press. Ramsay S (2003). ‘Toward an algorithmic criticism.’ Literary and Linguistic Computing 18, 167–174. Siemens R G (2002). ‘A new computer-assisted literary criticism.’ Computers and the Humanities 36, 259–267. Tallentire D R (1972). ‘An appraisal of methods and models in computational stylistics, with particular reference to author attribution.’ Ph.D. Thesis, University of Cambridge, UK. Yule G U (1938). ‘On sentence-length as a statistical characteristic of style in prose, with application to two cases of disputed authorship.’ Biometrika 30, 363–390. Zipf G K (1932). Selected studies of the principle of relative frequency in language. Cambridge: Harvard University Press.
Computer-Mediated Communication: Cognitive Science Approach
S E Brennan and C B Lockridge, Stony Brook University (SUNY), Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.
Human languages and the conventions for using them evolved with people interacting face-to-face. Likewise, face-to-face interaction is the key setting in which children acquire language. Despite these origins, more and more communication now takes place
between people who are not copresent in the same space at the same time, via technologies such as e-mail, instant messaging, cell phones, voice mail, and videoconferencing. How do people adjust when communication is mediated? How is language processing affected? And how is conversation shaped by the medium in which it is conducted? Consider this example: early one morning, Calion is typing an e-mail message to his wife Aisha, who will soon be in her office in the English Department across
campus. If Calion wants Aisha to meet him later for a bite to eat, he cannot simply say, “Meet me for Indian after class.” Many things can go wrong. For instance, Calion needs to be confident that Aisha can receive the message (will she remember to plug her laptop into the campus network?), will be attentive enough to notice that a message has arrived (will she be too busy meeting with undergraduates to check e-mail?), will figure out what Calion intends (their common ground will likely enable her to figure out what he intends by “Indian” and “after class”), and is willing and able to commit herself to the action he proposes (or will she have a meeting or other commitment at the time he’s proposing?). So after hitting the send key, Calion must await evidence that Aisha has received, understood, and committed to his invitation. For her part, Aisha doesn’t simply read Calion’s message and resolve to head out to the food court at the appropriate time; she sends an e-mail reply that gives evidence that she has received, understood, and accepted the invitation. Or if she needs to negotiate or clarify the plan, she may switch media and try to instant-message him; this will work only if they can both attend to their screens at the same time. If the expected e-mail response is not forthcoming soon enough, Calion may take the initiative to actively seek out evidence by calling Aisha on her cell phone.
The point is that communication does not succeed automatically, just because two people speak the same language, possess the same cognitive architecture, and know the same things. Regardless of the mode of communication, people jointly construct meanings by engaging in an active process of ‘grounding’, or seeking and providing evidence that they understand one another (Brennan, 1990, 2004; Clark and Brennan, 1991; Clark and Schaefer, 1989; Clark and Wilkes-Gibbs, 1986; Schober and Clark, 1989).
Contributions to conversations are coordinated in two phases: a presentation phase and an acceptance phase (Clark and Schaefer, 1989). As Calion’s invitation illustrates, an utterance does not count as an actual contribution to a conversation (nor as part of the interlocutors’ common ground) until its acceptance phase is complete. After (or depending on the modality of communication, even while) one person presents an utterance, the addressee provides evidence of attention, understanding, and uptake. This evidence may be implicit, in the form of continued eye contact or a relevant next turn (as when an answer follows a question), or explicit, in the form of a rephrasing, a request for clarification, or a modification of what came before (Clark and Schaefer, 1989). Both speaker and addressee take responsibility for seeking and providing evidence; often who takes the initiative at any given moment depends on who can
do so more easily (Brennan, 1990). In this way, interlocutors in a collaborative task adjust their individual effort so as to minimize the effort they expend jointly in reaching a grounding criterion, that is, a degree of certainty that they understand one another sufficiently well for current purposes (Clark and Wilkes-Gibbs, 1986).
In the rest of this article, we will briefly present some robust findings about mediated communication and discuss them in the context of the grounding framework. The grounding framework conceptualizes mediated communication as a coordinated activity constrained by costs and affordances (Clark and Brennan, 1991). This framework is compatible with both experimental and descriptive findings about communication (whether electronic or face-to-face) and can be used to predict and explain how communication media shape language use.
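The two-phase view of a contribution can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class `Contribution`, its methods, and the three evidence labels (`received`, `understood`, `committed`, borrowed from the wording of the Calion and Aisha example) are hypothetical names rather than anything defined in the grounding literature, and treating the grounding criterion as a fixed set of required evidence is a deliberate simplification.

```python
from dataclasses import dataclass, field

# Hypothetical evidence labels, taken from the e-mail example:
# Calion awaits evidence that Aisha has received, understood,
# and committed to his invitation.
REQUIRED_EVIDENCE = frozenset({"received", "understood", "committed"})


@dataclass
class Contribution:
    """An utterance that has been presented but may not yet be grounded."""
    utterance: str
    criterion: frozenset = REQUIRED_EVIDENCE   # simplified grounding criterion
    evidence: set = field(default_factory=set)

    def accept(self, *kinds: str) -> None:
        """Acceptance phase: the addressee supplies evidence of uptake."""
        self.evidence.update(kinds)

    def missing(self) -> frozenset:
        """Evidence the presenter may still need to seek out."""
        return self.criterion - self.evidence

    def is_grounded(self) -> bool:
        """True once all evidence required by the criterion is in hand."""
        return self.criterion <= self.evidence


invite = Contribution("Meet me for Indian after class.")
print(invite.is_grounded())        # False: presented, not yet accepted

invite.accept("received", "understood")   # e.g., Aisha replies with a question
print(sorted(invite.missing()))    # ['committed']

invite.accept("committed")         # e.g., Aisha confirms the plan
print(invite.is_grounded())        # True
```

In this sketch, a non-empty result from `missing()` corresponds to the situation in which the presenter might take the initiative, for example by switching to another medium to seek the outstanding evidence.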
Basic Findings About Mediated Communication: Speech and Visual Evidence
The richness associated with face-to-face conversation diminishes when communication goes electronic: for instance, prosody is absent when text is the currency of exchange rather than speech; spontaneous facial expressions and gestures are lost when an interlocutor can’t be seen; and conversational turns grow longer with voice mail or e-mail messages than with media that support more fine-grained interaction, such as electronic chat or telephone conversations. Yet perhaps surprisingly, people are able to communicate quite clearly and easily over a wide variety of media, including those with relatively low bandwidth (e.g., text-based media); in fact, cognitive tasks tend to be accomplished just as well over lower-bandwidth media as face-to-face (for a comprehensive review, see Whittaker, 2002). Despite the common expectation that the more similar a medium is to face-to-face communication, the better communication should be, study after study has failed to confirm this ‘bandwidth hypothesis’ (Brennan, 1990, 1998; Brennan and Lockridge, 2004; Karsenty, 1999; Ohaeri, 1998; Whittaker, 2002). Clearly, more bandwidth is not necessarily better. In fact, mediated communication sometimes offers tangible advantages over face-to-face conversation, especially when it is of value to be able to edit utterances, review them, or save them as a paper trail; when it is useful to broadcast them to many addressees at once; or when interlocutors’ schedules prevent them from attending to a message at the same time.
Some studies have documented media-based differences in efficiency among task-oriented conversations (a conversation is more efficient when the same task is accomplished just as well in less time or with fewer words). In comparisons of different configurations of speech, handwriting, teletyping, and video, Chapanis and colleagues found early on that remote communication is much less efficient without speech; the only way to substantially improve a medium’s efficiency is to add a voice channel (Chapanis et al., 1972; Chapanis et al., 1977; Ochsman and Chapanis, 1974). The ability to coordinate using speech typically makes a task more efficient by a factor of two or more. Yet adding a video channel to a medium that already includes speech may do nothing to improve either performance or efficiency (Chapanis et al., 1972; Chapanis et al., 1977; Ochsman and Chapanis, 1974; Whittaker, 2002; Williams, 1977). Of course, this depends on what visual information is transmitted: For cognitive or physical tasks where the focus is on the task activity, there are few if any benefits to seeing a partner’s face (Fish et al., 1993; Gaver et al., 1993; Whittaker, 1995, 2002), despite repeated attempts by telephone companies and teleconferencing researchers to supply disembodied talking heads along with people’s voices. (Seeing the face of a remote interlocutor can, however, have effects upon interpersonal social judgments, affiliation, or adversarial situations involving negotiation; see Whittaker, 2002 for a review.)
Visual information other than faces, such as views of the objects or task under discussion, can be very useful in task-directed communication (Anderson et al., 2000; Brennan and Lockridge, 2004; Clark and Krych, 2004; Kraut et al., 2002; Whittaker, 1995, 2002). The impact of a particular kind of visual information can be explained by the role it plays in grounding. Consider the task of giving someone driving directions. This is easiest when both partners can see and point at the same map. In one study of remote communication (Brennan, 1990, 2004), two partners had the same map displayed on their screens and could speak freely to one another. One, the director, knew the target location, and directed the other, a matcher, to move his car icon to the target. Half the time the director could see on her map where the matcher’s car was, and half the time she could not (the situation was asymmetrical; the matcher saw his own car icon in both conditions). When directors had visual evidence about matchers’ understanding, matchers quite literally came to use icon motion to replace their turns in the conversation. And directors could quickly tell when matchers understood where the target location was, so directors took responsibility for deciding when it was time to move on to the next trial. In trials without such evidence, directors waited for matchers to tell them when they understood well enough to move on. Trials with visual evidence also took less than half as long as those without, because pairs could ground in parallel; that is, while the director presented a description, the partner conducted the acceptance phase simultaneously by silently moving his icon (see Figure 1). Without visual evidence (see Figure 2), he had to give verbal evidence, speaking after the director’s description, which made the granularity of interaction much larger.
Figure 1 In this example, D can see M’s icon, which provides immediate visual evidence about how M understands D’s description. The exchange occurs as M moves his icon toward the target location described by D. D takes the initiative to propose that M has the right location using a deictic cue (‘right there’) after only 6 seconds. The graph shows the convergence of icon to target over time, with the point at which the icon reaches the target marked on the graph by an arrow.
Figure 2 In this example, D cannot see M’s icon and so grounding depends on the verbal evidence of understanding sought by D and provided by M. After the icon reaches the correct location there follows a lengthy period of grounding before they reach their grounding criteria and can conclude that they understand one another.
Grounding in Mediated Communication
In mediated communication, interlocutors typically inhabit different times and/or different places, so some aspects of coordination can be more difficult than in face-to-face conversation, particularly if people are limited to a medium that does not facilitate grounding or if techniques for grounding within the medium are unknown. Table 1, adapted from Clark and Brennan (1991), compares key affordances of face-to-face conversation with those of four other communication media. Grounding in communication can be decomposed into various sub-tasks bearing distinct costs (Clark and Brennan, 1991; Brennan, 1998), with the idea that people must adapt techniques for grounding to the affordances and constraints of the current medium in order to meet these costs. Sub-tasks that incur grounding costs include getting a partner’s attention in order to initiate communication (startup costs); producing a presentation by speaking or typing or in some other manner (production costs); timing the placement of feedback (asynchrony costs) or of a conversational turn (speaker-change costs); pointing, demonstrating, or gesturing in order to refer or clarify content (display costs); awaiting, reading, or listening to a partner’s utterance (reception costs); monitoring the partner’s focus of attention and, if the dialog is task-oriented, any relevant activities or tangible products that make up the task (monitoring costs); preventing misunderstandings or repairing errors caused by self or partner (repair costs); and maintaining politeness (face-management costs) [based on Clark and Brennan, 1991 and Brennan and Ohaeri, 1999]. Discussing a few of these costs will help show how grounding shapes behavior.

Table 1 Affordances of communication media (a)

Affordances of media:
(1) Physical co-presence: Participants share a physical environment, including a view of what each is doing and looking at.
(2) Visibility: One participant sees another, but not necessarily what the other is doing or looking at.
(3) Audibility: One participant can hear another.
(4) Cotemporality: Messages are received without delay (close to the time that they are produced and directed at addressees), permitting fine-grained interactivity.
(5) Simultaneity: Participants can send and receive messages at the same time, allowing communication in parallel.
(6) Sequentiality: Participants take turns in an orderly fashion in a single conversation at a time; one turn’s relevance to another is signaled by adjacency.
(7) Reviewability: Messages do not fade over time.
(8) Revisability: Messages can be revised before being sent.

Affordance                  Face-to-face  Video conference  Telephone  Instant messaging or chat  Email
(1) Physical co-presence    ++            ??                --         --                         --
(2) Visibility              ++            +                 --         --                         --
(3) Audibility              ++            ++                ++         --                         --
(4) Cotemporality           ++            ??                ++         +                          --
(5) Simultaneity            ++            ??                ++         ??                         --
(6) Sequentiality           ++            ++                ++         --                         --
(7) Reviewability           --            --                --         ++                         ++
(8) Revisability            --            --                --         ++                         ++

(a) Adapted from Clark & Brennan, 1991. Present in a particular medium: ++; present to a limited extent: +; present in some systems: ??; absent: --. Physical co-presence (1), the hallmark of face-to-face communication, nearly always includes affordances (2) through (5).
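The comparison in Table 1 lends itself to a simple lookup structure. The following Python sketch is offered as an illustration only: the names `MEDIA`, `AFFORDANCES`, and `media_with` are hypothetical, the cell values are transcribed from Table 1 as read here, and the mark `--` for "absent" is a transcription choice made for this sketch.

```python
# Column order of Table 1 (adapted from Clark & Brennan, 1991).
MEDIA = [
    "face-to-face",
    "video conference",
    "telephone",
    "instant messaging or chat",
    "email",
]

# One row per affordance; marks follow the table legend:
# "++" present, "+" present to a limited extent,
# "??" present in some systems, "--" absent (transcription choice).
AFFORDANCES = {
    "physical co-presence": ["++", "??", "--", "--", "--"],
    "visibility":           ["++", "+",  "--", "--", "--"],
    "audibility":           ["++", "++", "++", "--", "--"],
    "cotemporality":        ["++", "??", "++", "+",  "--"],
    "simultaneity":         ["++", "??", "++", "??", "--"],
    "sequentiality":        ["++", "++", "++", "--", "--"],
    "reviewability":        ["--", "--", "--", "++", "++"],
    "revisability":         ["--", "--", "--", "++", "++"],
}


def media_with(affordance, marks=("++",)):
    """Return the media whose mark for `affordance` is one of `marks`."""
    row = AFFORDANCES[affordance]
    return [medium for medium, mark in zip(MEDIA, row) if mark in marks]


# Media in which messages do not fade over time.
print(media_with("reviewability"))
# ['instant messaging or chat', 'email']

# Media that are cotemporal at least to a limited extent.
print(media_with("cotemporality", marks=("++", "+")))
# ['face-to-face', 'telephone', 'instant messaging or chat']
```

A query of this kind makes one pattern in the table easy to see: the media that are reviewable and revisable are the same media that lack audibility and sequentiality.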
Startup and monitoring costs are low for people who are physically copresent because they can easily monitor what a partner is doing, assess when the time is right for an interruption, and initiate a conversation by speaking to get the partner’s attention (for review of physical proximity effects, see Kraut et al., 2002). Startup is more costly for a video conference, since participants must arrange to be present in appropriately equipped facilities at the same time. Starting up a telephone call is unpredictable on a landline, as people are often away from such telephones; but with proliferating cell phones, calls find addressees regardless of their locations and so startup costs are somewhat lower. Production costs are typically higher for text than for speech because most people find it harder to type than to speak, so typed utterances tend to be shorter than spoken utterances. In one study, people were more likely to sacrifice politeness when typing than when speaking when it took more words to frame a polite utterance (e.g., inviting a partner’s input using hedges), but not when it took the same number of words to be polite (e.g., inviting the partner’s input with questions); moreover, individuals with faster typing speeds used more politeness devices per 100 words than those who typed slowly (Brennan and Ohaeri, 1999). This finding demonstrates that people who communicate remotely do not actually become depersonalized or cease to care about politeness (as some social psychological theories have suggested), but that when they must struggle to meet production costs
they do this at the expense of something else, such as face-management. It also illustrates that grounding costs are not independent of one another; often one cost must be traded off against another, and such trade-offs are made differently in different media.
As another example, consider repair costs: When communication is cotemporal, such as with voice, text-based chat, and instant messaging, the grain of interaction is small, and turns tend to be shorter, less formal, and more numerous than in larger-grained text-based media (such as letters or e-mail). So any errors or misunderstandings can be addressed quickly, and repair costs are relatively low (more so for speech than for text, since production costs are higher for text).
In closing, the grounding framework is a useful vantage point from which to view, understand, and predict the effects of new media upon communication. The abundance and portability of new communication programs and devices (PDAs, added cell phone functionality such as digital photography, more extensive wireless networks, unobtrusive methods for eye-tracking, multimedia Internet content, etc.) will continue to make it even easier for mediaphiles to switch mid-conversation from one medium to another, as in our opening example of Calion’s e-mail invitation to Aisha. Recently the New York Times chronicled a man and his BlackBerry (a portable wireless device for e-mail and instant messaging): “He once saw a romantic interest walk into a bar and immediately called her on her cell phone. ‘I saw her look at the phone and put me right to voice mail,’ he said, still indignant. But then he sent her a BlackBerry message, which made her laugh and prompted her to walk over and find him” (Lee, 2004). The ability to spontaneously switch media within the same conversation enables increasingly flexible and innovative techniques for grounding.
See also: Context and Common Ground; Dialogue and Interaction; E-mail, Internet, Chatroom Talk: Pragmatics; Language in Computer-Mediated Communication; Multimodal Interaction with Computers; Pauses and Hesitations: Psycholinguistic Approach; Psycholinguistics: Overview.
Bibliography
Anderson A H, Smallwood L & MacDonald R (2000). ‘Video data and video links in mediated communication: What do users value?’ International Journal of Human-Computer Studies 52, 165–187.
Brennan S E (1990). ‘Seeking and providing evidence for mutual understanding.’ Unpublished doctoral dissertation, Stanford University, Stanford, CA.
Brennan S E (1998). ‘The grounding problem in conversation with and through computers.’ In Fussell S R & Kreuz R J (eds.) Social and cognitive psychological approaches to interpersonal communication. Hillsdale, NJ: Erlbaum. 201–225.
Brennan S E (2004). ‘How conversation is shaped by visual and spoken evidence.’ In Trueswell J & Tanenhaus M (eds.) Approaches to world situated language use: Psycholinguistic, linguistic, and computational perspectives on bridging the product and action traditions. Cambridge, MA: MIT Press. 95–130.
Brennan S E & Lockridge C B (2004). ‘Monitoring an addressee’s visual attention: Effects of visual co-presence on referring in conversation.’ Unpublished manuscript.
Brennan S E & Ohaeri J O (1999). ‘Why do electronic conversations seem less polite? The costs and benefits of hedging.’ Proceedings of the International Joint Conference on Work Activities, Coordination, and Collaboration (WACC ’99). San Francisco, CA: ACM. 227–235.
Chapanis A, Ochsman R & Parrish R (1972). ‘Studies in interactive communication I: The effects of four communication modes on the behavior of teams during cooperative problem solving.’ Human Factors 14, 487–509.
Chapanis A, Ochsman R & Parrish R (1977). ‘Studies in interactive communication II: The effects of four communication modes on the linguistic performance of teams during cooperative problem solving.’ Human Factors 19, 101–126.
Clark H H & Brennan S E (1991). ‘Grounding in communication.’ In Resnick L B, Levine J & Teasley S D (eds.) Perspectives on socially shared cognition. Washington, DC: APA. 127–149. [Reprinted in Baecker R M (ed.) (1992). Groupware and computer-supported cooperative work: Assisting human-human collaboration. San Mateo, CA: Morgan Kaufmann. 222–233.]
Clark H H & Krych M A (2004). ‘Speaking while monitoring addressees for understanding.’ Journal of Memory and Language 50, 62–81.
Clark H H & Schaefer E F (1989). ‘Contributing to discourse.’ Cognitive Science 13, 259–294.
Clark H H & Wilkes-Gibbs D (1986). ‘Referring as a collaborative process.’ Cognition 22, 1–39.
Fish R, Kraut R & Root R (1993). ‘Video as a technology for informal communication.’ Communications of the ACM 36, 48–61.
Gaver W, Sellen A & Heath C (1993). ‘One is not enough: Multiple views in a media space.’ In Proceedings of CHI ’93: Human Factors in Computing Systems. New York: ACM Press. 335–341.
Hanna J E & Brennan S E (2004). ‘Using a speaker’s eyegaze during comprehension: A cue both rapid and flexible.’ Abstract, 17th Annual CUNY Conference on Human Sentence Processing. College Park, MD.
Karsenty L (1999). ‘Cooperative work and shared context: An empirical study of comprehension problems in side-by-side and remote help dialogues.’ Human-Computer Interaction 14(3), 283–315.
Kraut R E, Fussell S R, Brennan S E & Siegel J (2002). ‘Understanding effects of proximity on collaboration: Implications for technologies to support remote collaborative work.’ In Hinds P & Kiesler S (eds.) Distributed work. Cambridge, MA: MIT Press. 137–162.
Lee J (2004). ‘A BlackBerry throbs, and a wonk has a date.’ New York Times Sunday Styles, Section 9, May 30. 1–2.
Ochsman R B & Chapanis A (1974). ‘The effects of 10 communication modes on the behavior of teams during cooperative problem-solving.’ International Journal of Man–Machine Studies 6, 579–619.
Ohaeri J O (1998). ‘Group processes and the collaborative remembering of stories.’ Unpublished doctoral dissertation, State University of New York at Stony Brook.
Schober M F & Clark H H (1989). ‘Understanding by addressees and overhearers.’ Cognitive Psychology 21, 211–232.
Whittaker S (1995). ‘Rethinking video as a technology for interpersonal communications: theory and design implications.’ International Journal of Man–Machine Studies 42, 501–529.
Whittaker S (2002). ‘Theories and methods in mediated communication.’ In Graesser A, Gernsbacher M & Goldman S (eds.) The Handbook of Discourse Processes. Hillsdale, NJ: Erlbaum. 243–286.
Whittaker S J, Brennan S E & Clark H H (1991). ‘Coordinating activity: An analysis of interaction in computer-supported cooperative work.’ In Proceedings of CHI ’91: Human Factors in Computing Systems. New Orleans, LA: Addison-Wesley. 361–367.
Williams E (1977). ‘Experimental comparisons of face-to-face and mediated communication.’ Psychological Bulletin 16, 963–976.
Computers in Field Linguistics
N Thieberger, The University of Melbourne, Melbourne, Victoria, Australia
© 2006 Elsevier Ltd. All rights reserved.
Computers have been associated with field linguistics from their earliest days, as witness the enthusiasm with which computers were embraced by linguists, from mainframe computers in the 1960s to personal computers in the 1980s. While initially it was common to force our efforts into the framework provided by particular software, we are now more aware of the need to see the data itself as the primary concern of the analyst and not the software that we use to manipulate the data. Inasmuch as it allows us to carry out the main functions desired by a field linguist, software is a tool through which our data passes, the data becoming transformed in some way, but surviving the journey sufficiently to live on, independent of any software, into the future. In this article, I discuss ways in which computers can assist field linguists whose chief concerns I take to be language documentation, including recording a previously unrecorded or little recorded language in order to write a grammatical description. Field linguistics has been going through a change in focus over the past few years. There is increasing recognition of the need to record languages with few speakers, and to support such speakers with materials such as text collections, dictionaries, and multimedia (e.g., text, audio, images, and video). Computers are central to this effort, especially as we move to digital recording in which there will be no analog original. Laptop and palm computers are common
tools for the first-world linguist, as are solid-state digital recorders and digital video cameras, which produce digital files for access on computers. Processing power of computers keeps increasing as does storage and RAM, which means we are now able to deal with real-time media (audio and video) in ever larger quantities, raising crucial issues for data management. A typical workflow engaged in by a field linguist is presented below, together with a description of methods for working with small and perhaps endangered languages, and for managing the data so that it can be analyzed. Further analytical tools, like morphological parsers, are considered in the article on Natural Language Processing (NLP) (see Natural Language Processing: Overview). An interest in supporting endangered languages, and the efforts of speakers or their descendants to learn about them, encourages us to focus on archival methods and on producing the best quality material for access in the future. Thus, the focus here will be on computer-based tools for analyzing linguistic material in ways that allow it to be safely stored, retrieved, and reused by others, as discussed by Bird and Simons (2003) in a work that is central to the present discussion. For the linguistic fieldworker, the usual workflow involves recording, transcribing, and interlinearizing a corpus so that there is a base of information for analysis. This analysis is written as a grammar and may be accompanied by a collection of texts and a dictionary of the language. There may also be a set of media files that are linked to by their transcripts, allowing readers to hear audio or see video in the
Computer-Supported Writing
K Lunsford, University of California, Santa Barbara, CA, USA
© 2006 Elsevier Ltd. All rights reserved.
Introduction
The concept of computer-supported writing has been evolving and continues to evolve as new computer technologies emerge. As a result, it has at least three current meanings. In some contexts, computer-supported writing refers to various hardware and software tools or aids that often allow writers to be more efficient. For example, this perspective would highlight the claim that writers can revise texts more easily by moving words around in a word processor file than they can by using pen and paper. From a second perspective, computer-supported writing refers to the technologies that allow people at a distance to collaborate on texts. In other words, the concept refers to a particular type of
Computer-Supported Cooperative Work (CSCW) or Computer-Mediated Communication (CMC) in which people use technologies such as electronic mail (e-mail), online archives, and file-sharing programs to compose documents together. Most recently, computer-supported writing encompasses the idea that computer technologies have allowed the creation of new genres (such as personal Web pages), new contexts for writing (such as the worldwide audience available on the Internet), and new expectations about what it means to read and write (such as the ability to compose and interpret texts that may combine words, numbers, sounds, hyperlinks, and visuals). From this perspective, to speak of computer-supported writing is to speak of the development of new literacies or multiliteracies. All of these perspectives continue to influence computer-supported writing in educational settings.
To provide a broad sketch of this concept and its implications, the following discussion is divided into five sections. The first section provides a concise history of computers and writing, briefly highlights the idea of computers as writing aids, and comments on how computers and writing have mutually evolved. The second section focuses on computer-supported writing as a collective activity. In the third section, several new literacies are discussed. The fourth section addresses computer-supported writing in the classroom, and the final section comments on unresolved issues generated by today’s writing technologies.
Brief History of Computers and Writing
Computers were not originally conceived of as writing technologies. Rather, before the late 1970s, punchcard-munching, number-crunching mainframes were primarily associated with accounting and with scientific, mathematical, and military calculations. Indeed, according to some, writing was a support for computers. Language capacities began to be programmed to allow humans to document software and to interpret data. By the late 1970s and early 1980s, those language capacities were being developed further by technology manufacturers to reach a growing market of business, government, and academic writers. Since then, computer technologies to support writing have expanded to include not only writing aids, but also electronic networks, and mobile and embedded tools.
Writing Aids
As Hawisher et al. (1996) detail, when programmers fully turned to computer-supported writing in the 1970s, the early heyday of Computer-Assisted
Instruction (CAI) influenced their assumptions. Computers were not seen primarily as devices for composing texts. Instead, many were programmed as tools to ease the (so-called) drudgery associated with teaching and learning how to write. A computer could drill novice writers on grammar and punctuation through tutorials, exercises, and educational games. A human teacher or editor (presumably) then had more time to address the higher-level skills of argument. Similarly, some computer programs were designed to prompt novice writers through activities associated with the writing process: prewriting, drafting, revising, and editing. The assumptions behind some of these aids have been challenged – that writing can be divided between higher and lower functions to be divvied up between humans and machines. However, products that allow novice writers to practice writing skills remain a strong component of computer-supported writing. The early 1980s were marked by technologies that initiated a widespread change in writing practices: word processing programs on personal computers. The programs for drilling novice writers morphed into spelling and style-checkers that were incorporated as standard features of word processors. More importantly, word processors allowed writers to replicate, manipulate, and store texts in ways that they had not before. Writers also took advantage of new electronic reference aids such as dictionaries, writing handbooks, and concordance programs that could analyze multiple documents to identify Key Words In Context (KWIC). As software and hardware developed throughout the 1980s and early 1990s, writers began to experiment with new multimedia capacities (see below, Hypertexting and Multimediating). They could more easily combine words with illustrations, background colors, links to other documents and files, and audio and visual clips. As a result, writers and composers could create new kinds of texts and could reinterpret older, print-based genres. 
From a novelty, computer-supported writing became an expected activity; from a gadget for replicating pen-and-paper writing practices and making them more efficient, the computer became for many writers a necessity associated with its own genres and practices.
Networks
Also by the early 1990s, computer-supported writing commonly included writing across networks (Hawisher et al., 1996). The Local Area Networks (LANs), Wide Area Networks (WANs), and BITNET (Because It’s Time Network) of the 1980s were joined and then overshadowed by the Internet. As servers and personal computers worldwide joined the Internet, writers discovered faster and often cheaper
means of communication. Some of these technologies included e-mail, bulletin boards, Internet Relay Chat (IRC), and early chat rooms. Many documents no longer had to be sent through snail mail, or physical postal services, but could be e-mailed or uploaded and downloaded from online locations via various file transfer programs. By the late 1990s, the Internet had become nearly synonymous with the World Wide Web (WWW or Web), as the connective infrastructure provided by the Internet was adapted to new uses, and as standards such as Hypertext Markup Language (HTML) were developed. The Web brought multimedia writing to the forefront, because it employed graphical, linked Web pages (as opposed to command lines) as the preferred interface between humans and machines. Although early networks had been primarily used in military, government, research, and educational settings, the Web was also embraced by the wider public and commercial interests. Personal Web pages existed side-by-side with more official sites published by institutions and businesses. The new writing profession of ‘Web designer’ soon took hold. Technologies began to be developed to better enable collaboration on Web sites and across these networks (see below, ‘Collective Writing’).
Mobile and Embedded Technologies
Although personal computers and the Internet/Web remain the focus of writing theorists in the first decade of the 2000s, the next wave of computer-supported writing is becoming apparent: mobile and embedded technologies. Some mobile technologies are essentially smaller forms of desktop computer technologies. Yet, because computer components can now be miniaturized, they also can be incorporated into other handheld, mobile devices. Personal Digital Assistants (PDAs), for example, record notes. Cell phones have become multimedia writing devices. In addition to transmitting sound, they can send textual Instant Messages (IM) and digital pictures just as programs on the Internet/Web can. Even more, computer components are being embedded into other appliances to create smart versions, or ‘things that think.’ For instance, smartboards are whiteboards that can electronically record anything handwritten on them with a special stylus. The future for many nations, scholars speculate, may be cultures in which technologies that enable computer-supported writing become ubiquitous and thus invisible.
Collective Writing
As networks and especially the Internet have grown in popularity, considerable funding, time, and human resources have been devoted to improving the technologies that support writers who collaborate on documents or contribute to collective projects, especially writers who live at a distance from each other. Even so, as of the first decade of the 2000s, computer-supported collective writing remains a relatively new concept. In fact, programmers commonly complain that many writers tend to be fairly conservative in adopting new collective writing technologies, preferring instead to send documents as e-mail attachments and to converse on the phone. Early adopters of technologies are still experimenting with the new tools and social norms that enable people to compose texts together. Again, these initial experiments have often attempted to transfer existing writing practices to new media (part of a process that Bolter and Grusin, 1999, have called remediation). As early adopters have grown more familiar with the technologies, they have developed new practices that have in turn required new technologies, in a continuing cycle. The following entries represent only a sampling of the collective writing technologies still under development.
E-zines and Fanfic Sites
E-zines (electronic magazines) and fanfic (fanfiction) sites have evolved from a long tradition. Historically, writers often have shared their work with each other in small groups and have self-published to larger audiences. For example, zines are amateur, often collectively written, home-grown print magazines. They became especially popular with grassroots organizers, reporters, and fiction writers when photocopiers became widely available. E-zines update this common practice, for they are distributed relatively cheaply on the Internet/Web and potentially reach worldwide communities. E-zines are often about serious topics such as political activism or health issues, but they can be integral components of a fanfic Web site as well. Fans of popular culture icons congregate on shared bulletin boards, document centers, and blogs (see following section) to write scripts for new episodes of their favorite television series, create characters for a popular book series or computer game, discuss their favorite bands and lyrics, hold contests for the best adaptation of a published story, and so on.
Webrings, Blogs, Wikis, and MOOs
In addition to e-zines and fanfic sites, many specialized social protocols and software programs have arisen to support collective writing. Among the most popular have been Webrings, blogs, wikis, and MOOs.
A Webring is a collection of Web sites linked to one another, usually with a template that provides ‘back’ and ‘forward’ arrows in a footer to lead visitors from one site to the next. The sites may engage in direct dialogue and conversation, as they comment on each other’s content, or the links may merely provide easy access to resources on related topics.
The term blog began as the shortened name for a Web log, or an annotated record of the Web sites a Web designer had consulted. A blogging program’s ability to record and publish chronological entries on the Web, however, soon attracted the attention of diarists, essayists, and journalists. Today, blogs range from online diaries read by groups of friends, to bibliographic entries collected by a team working on a research project, to political or editorial commentary written by news reporters.
Like blogs, wikis combine database and Web-publishing capabilities, although wikis tend to be associated with large-scale, collective writing projects. They have been used, for example, to communally generate dictionaries and other reference materials. When a contributor submits a written entry on a topic, others can comment on and revise the original entry and subsequent comments. Wikis thus may preserve a record of contributors’ debates and corrections, offering different viewpoints on the topic. Their approach to written text is modeled on the Open Source movement, in which software programmers collectively work on programs that are shared, commented upon, and continually tweaked.
A MOO (Multi-user domain, Object Oriented) is somewhat like a chat room where multiple guests, owners, and wizards can verbally create virtual identities, describe their environments, sometimes represent these realms or domains through graphics, and create computerized objects that interact with humans. For example, on Connections MOO (1994–2004), the Tuesday Café was a virtual meeting space for writing specialists, and it was complete with tables, benches, and a server (a bot – a computer-generated character) named Rhet who filled drink and food orders. Like Webrings, blogs, and wikis, MOOs have been used for multiple purposes, and they have included both formal and informal writing initiatives.
Collaboratories
Collaboratories (from ‘collaborative laboratories’) represent a formal attempt to foster collective research and writing. In the mid-1990s, the U.S. National Science Foundation (NSF) sponsored a grant initiative to fund Internet- and Web-based sites to support scientific research. These sites allowed scientists in a specialized area to share access
to expensive scientific instruments, bibliographic databases, bulletin boards, and writing resources. The concept of collaboratories soon spread to nonscientific venues, especially education. The Inquiry Page and iLabs, for example, provide educators, students, civic leaders, and researchers with suites of collaborative communication and information technologies to support individuals and groups as they pursue various inquiries.
Interactive Publication Systems
While collaboratories have begun to change research groups, publishers and academics have also started a revolution in publication practices. Newspapers, magazines, academic journals, books, and other print-based publications have been translated to digital media. In some cases, an online publication is simply a digitized version of the print format (often a PDF file). In other cases, the publication exists only online. In all cases, though, these digitized archives plus increasingly powerful search engines have significantly changed reading practices because they often allow readers to browse and find appropriate materials more easily. Beyond making the search for materials more efficient, experimental sites and journals have taken further steps to make publications more interactive. Like wikis (see above, Webrings, Blogs, Wikis, and MOOs), they allow readers and writers to engage in dialogues over articles. Similarly, some publication sites, such as the Los Alamos preprint archive, sidestep traditional journal practices by allowing researchers to post the penultimate or conference versions of their articles (preprints) directly to the site for discussion. These open dialogues, some scholars propose, might replace traditional forms of peer review, or the processes by which manuscripts are vetted for publication. The more open digital sites and journals, too, may allow writers to hyperlink their alphanumeric texts to other online, collective resources, such as image databases. As a result, these sites do not simply replicate traditional, paper-based publications, but change how writers and readers connect texts with larger contexts.
New Literacies
To capture this understanding that computer-supported writing is not merely pen-and-paper writing made more efficient, but something different, several recent theorists have proposed that computers require multiliteracies or new literacies. In other words, they propose that today’s technologies require different ways of knowing how to compose and interpret texts. In particular, today’s texts require knowledge
of new genres and the social contexts in which they are used. Writing theorists have proposed several lists of new literacies. What follows is a selection of recent proposals to extend traditional ideas of what counts as reading and writing.
Hypertexting and Multimediating
Both hypertexts and multimedia texts may break the conventions of traditional print texts, and thus may require different composing and reading practices. For example, a traditional academic article presents an argument in a linear and hierarchical manner, with each main point supported in turn by evidence. A hypertext, however, contains hyperlinks that enable writers and readers to jump from one section of a document to another, or to other documents entirely. A multimedia document contains non-alphanumeric elements such as visuals or audio clips, and, of course, a hypertext may also be multimediated. As a result, arguments presented as hypertexts and/or multimedia may be more associative, nonlinear, and nonhierarchical. Although this last claim is sometimes questioned, most writing theorists believe that the explosion of hypertexts and multimedia on CD-ROMs and the WWW has created new conventions for reading and writing.
Employing Visual Rhetoric
An outgrowth of the interest in multimedia has been a specific interest in visual rhetoric. Although people have communicated through visuals for centuries (art, illustrations, graphs, etc.), today’s technologies have made visuals easier and often cheaper to produce, copy, manipulate, and distribute. They are being used extensively in texts that once relied more on words, such as textbooks and advertisements. As a result, visuals are inspiring several theoretical questions about how they communicate: How are people persuaded by visuals? How do people interpret a visual’s elements? Must viewers be able to translate a visual into words before it can be said to be offering an argument? Most important, can a more sophisticated language be developed to describe and to teach writers systematically about visuals’ effects?
Designing and Manipulating Information Architectures
To sort through electronic resources quickly is a learned skill, and it often depends on knowledge of information architectures. How does a database organize and retrieve its various entries? What constitutes an appropriate interface or Web site design? How do search engines work, what constitutes an appropriate keyword for a search, and how can
users best retrieve information from outdated or legacy systems? Questions such as these address how information is categorized. Because categorizing something is a rhetorical choice, and often a politically and socially charged activity (as when people are categorized according to different races and ethnicities), information scientists today see a desperate need for more computer users to become literate in information architectures.
Understanding Netiquette, Viruses, and Urban Legends
To participate on the Internet/Web effectively, many users have had to learn to be cautious. Certain conventions have developed, for example, to maintain relatively polite conversations on electronic lists, chatrooms, and other interactive spaces. These netiquette conventions often directly address the use of text, as when words in all capital letters are declared to be the equivalent of shouting, or when emoticons (such as smiley faces) are used as punctuation marks to indicate irony and tone. Similarly, many users have become more savvy about the rhetorical strategies (such as misleading subject lines) that hackers may employ to cause them to open a virus-laden e-mail message. The same strategies often characterize urban legends, more-or-less plausible but usually false stories, which often spread rapidly through Web sites and forwarded e-mails. All of these topics have become the subject of academic scrutiny, as they represent new social conventions for texts.
Gaming
Although some scholars and pundits may resist the idea that computer gaming constitutes a new literacy, other academics see games as a new genre, or set of genres, with specific conventions. Computer games (e.g., Zork) began as entirely text-based entities, as graphics and sound were not immediately possible. Gamers learned to accommodate fragmented and often scrolling text. More important, this community learned how to compose and to interpret the standard elements still found in today’s multimedia games: familiar plotlines, puzzles, character types, and ways the player can manipulate the virtual environment. Currently, many game manufacturers (most notably, Electronic Arts, the publisher of SimCity) even invite fans to participate in creating the game’s environment by proposing new storylines on online fanfic sites (see above, E-zines and Fanfic Sites). Some scholars have proposed that games represent the next wave of literature, as they are attracting not only substantial profits but also the next generation’s creative energy.
Computer-Supported Writing in Educational Settings
Computer-supported writing appears in all three forms (as aids, collective writing sites, and new literacies) in educational settings, often in combination. For instance, teachers might refer students to an online writing handbook as they contribute to a collaborative course blog, and in turn, learn the students’ Instant Messaging netiquette. Yet, in addition to deciding which forms of computer-supported writing might be the most appropriate for different instructional goals, educators must consider several other aspects involved with using computer technologies to teach writing. This section highlights the recent questions that have most concerned writing specialists.
Computer Classrooms
Computer classrooms may be physical, virtual, or both. How classrooms are arranged has proven to be a key concern, as different stakeholders attempt to make these expensive investments both cost-effective and educationally sound. Experienced educators know that a classroom’s arrangement has varying effects on the student-teacher relationship. For example, many physical classrooms are arranged so that students sit at desktop computers placed in rows. However, this arrangement tends to place the instructor at the front of the room, as a ‘sage on the stage.’ Many writing specialists instead recommend turning attention onto students and their texts. A computer classroom might be arranged with desktop computers placed around the perimeter to allow individual or small group work, and with a central seminar table to allow the entire class to work face-to-face together. Even more flexible are islands of desktop computer carrels that can be rearranged at will, or wireless laptops that can be used throughout a room. As yet, few conversations about physical classroom arrangements are taking into account students’ own handheld devices (PDAs, cell phones, etc.) and how they might be incorporated into writing instruction.
Like physical classrooms, virtual writing environments may take many forms. One influential software package has been the Daedalus Integrated Writing Environment (DIWE). Created first for Local Area Networks (LANs) that connected the computers within a physical classroom, DIWE is now accessible through a Web interface. DIWE provides various features, including writing prompts, screens where writers compose texts, and a chatroom environment that allows an entire class or groups within the class to collaborate. Other virtual writing environments may include specialized areas on wikis, blogs, and MOOs (see above, Webrings, Blogs, Wikis, and MOOs) or collective Web sites that provide both tools for active collaboration and places to self-publish student work.
Course-Management and Commenting Software
In recent years, courseware and commenting programs have become common, especially as schools and campuses have mandated their use. Several commercial products, among them Blackboard and WebCT, allow educators to keep track of writing assignments students have completed, their grades, and notes about the students’ progress. Other products, such as Comment, primarily allow peers and teachers to write responses to student work. Still other programs combine both functions. For example, TOPIC (Texas Tech Online-Print Integrated Curriculum), developed at Texas Tech University and now available commercially, can manage the many class sections that constitute an entire writing department. A fully digital environment, it allows students to upload their papers, to comment on peers’ work, to access writing advice, and to read and respond to comments from several writing instructors. Educators have responded to these various programs with mixed judgments, both liking the organizational power of computers and expressing concern over the potential for excessive surveillance.
Assessment Software
Also currently earning mixed reactions are tools whose sophisticated algorithms are used to assess student texts. For example, testing services have developed software to rate students’ college entrance essays. Although a computer does not read a text’s content in the way a human would, it can analyze textual features such as sentence and essay length, punctuation and grammar usage, word choice, and overall organization. As a result, some scholars claim that these products predict scores that compare favorably to the assessment scores that human readers provide. Similarly, other products and online services can assess a student’s use of other texts. By comparing strings of words between a student essay and other documents, some services claim that they identify plagiarism. Writing teachers often use online search engines (such as Google) to do the same. Ultimately, the debates over whether or not to use these assessment tools depend on questions that are decided locally, such as whether the writing forms these tools privilege and measure are appropriate for the instructors’ educational goals.
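The idea of comparing strings of words between a student essay and other documents can be illustrated with a minimal sketch. It assumes a simple word n-gram overlap measure; the function names `word_ngrams` and `overlap_ratio` and the default window of five words are hypothetical choices made for illustration, and the sketch does not describe how any particular commercial service works.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of lowercase word n-grams (as tuples) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # A text shorter than n words yields an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(essay, source, n=5):
    """Fraction of the essay's word n-grams that also occur in the source."""
    essay_grams = word_ngrams(essay, n)
    if not essay_grams:
        return 0.0
    shared = essay_grams & word_ngrams(source, n)
    return len(shared) / len(essay_grams)
```

A high ratio only signals shared wording. Whether that wording is a properly cited quotation or an unacknowledged borrowing is still a judgment for a human reader, which is in keeping with the point that these tools raise questions decided locally.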
Significant Unresolved Issues
In addition to the questions that educators ask when they turn to computer-supported writing, several significant issues remain unresolved. Despite the expectation among authors worldwide that writing will be computer-supported, there remain questions about who really will be able to write and for whom, who owns texts, and how long documents will ultimately survive.
Access to Computers
Scholars have argued that although computer technologies are widespread, a digital divide exists between the haves and have-nots. Where to draw the line to mark the divide remains a source of contention: into which category should someone with institutional (school, church, community service) access to computers, but no home access, be placed? Nonetheless, because government officials and ruling classes tend to have access to these technologies, they incorporate them into their definitions of literacy and of who counts as capable workers and citizens. In particular, access to the Web remains an issue, both for economic and political reasons. Some regions have difficulty maintaining the infrastructures (phone and cable lines, access to satellite communications, and so on) needed to support the Internet. Other regions object to some of the content available online and so take measures to block access. Computer-supported writing is not a universal norm.
Intellectual Property
With new writing media have come reconsiderations of intellectual property conventions. Designed for print media, copyright and patent laws in most nations have not kept pace with social conventions that digital information ought to be copied and shared, or with technological capacities to instantly copy (or pirate) anything posted online. In fact, as lawmakers debate new provisions to extend copyright and patent protections, movements such as Open Source have developed an alternative set of conventions for how to share text as well as code. Arguments over intellectual property are likely to shape discussions about computer-supported writing for many years to come.
Preservation of Documents
The conversations about preserving computer-supported writing swing between a concern that digitized writing persists without authors realizing it, and a concern that digital media cannot be easily stored in archives. On the one hand, many writers may not realize that materials posted on the Web can be copied by mirror Web sites, readers worldwide, and Internet archives. Removing the material from a personal Web site, for example, does not guarantee that the writing is gone. On the other hand, digital media do not age well. Web sites often contain broken links, and digital files may be corrupted. Moreover, if the software programs that created certain files are not maintained, or if the files are not continually updated, then these documents could become irretrievable. Finally, if the preservation of texts is at issue, so is the purpose of citations. Footnotes, parenthetical notes, and works cited are intended to allow readers to locate the referenced materials. However, if those materials are moved, lost, copied, altered, or corrupted, then the citations become unstable as well. These questions, too, are being discussed widely by academics, businesses, and government officials, as new practices are slowly evolving.
Conclusion
Computer-supported writing has grown exponentially in its various forms since its beginning and promises to continue to evolve. Yet technologies in themselves are rarely solutions for problems; rather, they bring both benefits and challenges that writers and readers need to assess carefully. The questions ultimately to ask are what do emergent technologies enable and disallow; and what social, writing, and reading practices must be altered to make them useful?
See also: Language Education, Computer-Assisted; Language in Computer-Mediated Communication; Writers’ Aids.
Comrie, Bernard (b. 1947)
F Katada, Waseda University, Tokyo, Japan
© 2006 Elsevier Ltd. All rights reserved.
Bernard Comrie, one of the world’s leading figures in the field of language universals and linguistic typology, was born on May 23, 1947, in Sunderland, England. He studied at the University of Cambridge, from which he received a B.A. in modern and medieval languages (1968) and a Ph.D. in linguistics (1972). At Cambridge, he was Junior Research Fellow at King’s College (1970–1974) and subsequently became University Lecturer (1974–1978). In 1978 he joined the faculty of the University of Southern California, Los Angeles, first as Associate Professor (1978–1981), then as Full Professor (1981–1998), of linguistics.
Comrie is currently Director of the Department of Linguistics of the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany (since 1997). He is also Honorary Professor of Linguistics at the University of Leipzig (since 1999), and Distinguished Professor of Linguistics at the University of California, Santa Barbara (since 2002). Honors conferred upon him are Member of the Saxon Academy of Sciences, Leipzig (1999), Corresponding Member of the British Academy (1999), Foreign Member of the Royal Netherlands Academy of Arts and Sciences (2000), and Doctor of Letters Honoris Causa, La Trobe University, Australia (2004). Comrie’s intellectual interests have centered around general questions of language universals, with emphases on syntax and semantics. From his
Computers in Field Linguistics
N Thieberger, The University of Melbourne, Melbourne, Victoria, Australia
© 2006 Elsevier Ltd. All rights reserved.
Computers have been associated with field linguistics from their earliest days, as witness the enthusiasm with which computers were embraced by linguists, from mainframe computers in the 1960s to personal computers in the 1980s. While initially it was common to force our efforts into the framework provided by particular software, we are now more aware of the need to see the data itself as the primary concern of the analyst and not the software that we use to manipulate the data. Inasmuch as it allows us to carry out the main functions desired by a field linguist, software is a tool through which our data passes, the data becoming transformed in some way, but surviving the journey sufficiently to live on, independent of any software, into the future. In this article, I discuss ways in which computers can assist field linguists whose chief concerns I take to be language documentation, including recording a previously unrecorded or little recorded language in order to write a grammatical description. Field linguistics has been going through a change in focus over the past few years. There is increasing recognition of the need to record languages with few speakers, and to support such speakers with materials such as text collections, dictionaries, and multimedia (e.g., text, audio, images, and video). Computers are central to this effort, especially as we move to digital recording in which there will be no analog original. Laptop and palm computers are common
tools for the first-world linguist, as are solid-state digital recorders and digital video cameras, which produce digital files for access on computers. Processing power of computers keeps increasing as does storage and RAM, which means we are now able to deal with real-time media (audio and video) in ever larger quantities, raising crucial issues for data management. A typical workflow engaged in by a field linguist is presented below, together with a description of methods for working with small and perhaps endangered languages, and for managing the data so that it can be analyzed. Further analytical tools, like morphological parsers, are considered in the article on Natural Language Processing (NLP) (see Natural Language Processing: Overview). An interest in supporting endangered languages, and the efforts of speakers or their descendants to learn about them, encourages us to focus on archival methods and on producing the best quality material for access in the future. Thus, the focus here will be on computer-based tools for analyzing linguistic material in ways that allow it to be safely stored, retrieved, and reused by others, as discussed by Bird and Simons (2003) in a work that is central to the present discussion. For the linguistic fieldworker, the usual workflow involves recording, transcribing, and interlinearizing a corpus so that there is a base of information for analysis. This analysis is written as a grammar and may be accompanied by a collection of texts and a dictionary of the language. There may also be a set of media files that are linked to by their transcripts, allowing readers to hear audio or see video in the
language. In addition, this material is housed in a suitable repository, a digital archive which preserves the data for future use. The types of tasks that we will need to carry out in the analysis of a previously unrecorded language are outlined below. Assuming that we begin with recordings (digital, or analog converted to digital) that are the primary data, we first need to label them clearly, so that they are identifiable from the moment of recording, and to establish a database of metadata, the who/what/where/when information that is easily forgotten in a short time without good descriptive notes. It is useful at this stage to have considered a naming convention, so that the tapes can be permanently identified in both our own documentation and in any archive in which we lodge the data. (Filenames should persist over time so that any reference to them can be resolved, for example by someone looking through the data in the future. Filenames should not contain unusual characters that various computer systems find difficult to recognize.) Maintaining a good database of the items (tapes, transcripts, texts, images, etc.) and of the relationships between them allows us to keep track of derived forms and the context from which they are derived. We then need to transcribe the media to produce a textual index in whatever form we require. Transcription can be undertaken with tools that capture time-alignment, so that the resulting file has timecodes associated with chunks of text. We should be clear from the outset that we are engaging in a data management task, in which complex relationships between types of ethnographic data need to be tracked, both for our own use of them and for assisting in retrieving information in the future. Database structures can assist here, but only if they do not lock up the data in a proprietary format (one that is owned by a company rather than being ‘open source’ or publicly and freely available). 
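The advice above on persistent, portable filenames and on who/what/where/when metadata can be made concrete with a small sketch. The naming convention, the field names, and the sample values below are all hypothetical, chosen only for illustration; an archive may well prescribe its own scheme.

```python
import re
from dataclasses import dataclass

# Hypothetical naming convention (an assumption, not one the article prescribes):
#   <collector initials>-<language code>-<YYYYMMDD>-<item number>.<extension>
# Only lowercase ASCII letters, digits, hyphens, and a single dot are allowed,
# so the name avoids characters that some computer systems handle badly.
FILENAME_PATTERN = re.compile(
    r"[a-z]{2,4}-[a-z]{3}-[0-9]{8}-[0-9]{3}\.[a-z0-9]{2,4}"
)


@dataclass(frozen=True)
class Item:
    """Who/what/where/when metadata for one recording or derived file."""
    filename: str
    speakers: tuple
    place: str
    recorded_on: str   # ISO 8601 date
    description: str


def is_portable(filename: str) -> bool:
    """Return True if the whole filename follows the convention above."""
    return FILENAME_PATTERN.fullmatch(filename) is not None


def derived_name(source: str, extension: str) -> str:
    """Name a derived file (e.g. a transcript) after its source recording."""
    stem = source.rpartition(".")[0]
    return f"{stem}.{extension}"


# Invented sample values.
recording = Item(
    filename="nt-xyz-20040612-001.wav",
    speakers=("Speaker A",),
    place="(village name)",
    recorded_on="2004-06-12",
    description="Narrative, first session",
)
transcript_name = derived_name(recording.filename, "txt")
```

Deriving the transcript's name from the recording's name keeps the relationship between a primary item and its derived forms visible in the filenames themselves, in addition to whatever the metadata database records.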
Relational databases allow us to reflect relationships in the data and to avoid duplication by listing, for example, items on a tape linked to the names of speakers and their characteristics (age, sex, etc.), and the derived information (such as texts, media files, and lexicons). In the late 1980s, Lancashire (1991) listed a number of software tools for various aspects of linguistic analysis, many of them aimed at working with large corpora of metropolitan languages. Not all of these are useful from the point of view of a fieldworker recording a small language (one with relatively few speakers and typically with no written record), as the programs deal with what we can characterize as ‘high-end’ applications such as NLP or analysis based on very large datasets.
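As a rough illustration of the relational approach described above, the following sketch uses SQLite through Python's standard `sqlite3` module. The table layout, column names, and sample rows are assumptions made for the example; a real project would design its own schema and, as the article stresses, should still be able to export the data to an open format.

```python
import sqlite3

# Hypothetical schema: speakers and tapes are listed once each, and items and
# derived files refer to them by key, so nothing has to be typed twice.
SCHEMA = """
CREATE TABLE speaker (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      age INTEGER, sex TEXT);
CREATE TABLE tape    (id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);
CREATE TABLE item    (id INTEGER PRIMARY KEY,
                      tape_id INTEGER NOT NULL REFERENCES tape(id),
                      speaker_id INTEGER NOT NULL REFERENCES speaker(id),
                      title TEXT NOT NULL);
CREATE TABLE derived (id INTEGER PRIMARY KEY,
                      item_id INTEGER NOT NULL REFERENCES item(id),
                      kind TEXT NOT NULL, filename TEXT NOT NULL);
"""


def build_catalogue() -> sqlite3.Connection:
    """Create an in-memory catalogue holding a few invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO speaker (id, name, age, sex) VALUES (1, 'Speaker A', 62, 'f')"
    )
    conn.execute("INSERT INTO tape (id, label) VALUES (1, 'nt-xyz-20040612')")
    conn.executemany(
        "INSERT INTO item (id, tape_id, speaker_id, title) VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "Narrative"), (2, 1, 1, "Word list")],
    )
    conn.executemany(
        "INSERT INTO derived (item_id, kind, filename) VALUES (?, ?, ?)",
        [
            (1, "transcript", "nt-xyz-20040612-001.txt"),
            (1, "media", "nt-xyz-20040612-001.wav"),
            (2, "lexicon", "nt-xyz-20040612-002.txt"),
        ],
    )
    return conn


def files_for_speaker(conn: sqlite3.Connection, name: str) -> list:
    """List every derived file that traces back to the named speaker."""
    rows = conn.execute(
        """
        SELECT derived.filename
        FROM derived
        JOIN item    ON item.id = derived.item_id
        JOIN speaker ON speaker.id = item.speaker_id
        WHERE speaker.name = ?
        ORDER BY derived.filename
        """,
        (name,),
    )
    return [row[0] for row in rows]


catalogue = build_catalogue()
speaker_files = files_for_speaker(catalogue, "Speaker A")
```

Because the speaker's details are stored in one row only, correcting an age or a spelling is a single change, and every item and derived file linked to that speaker reflects it.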
An issue that was dealt with extensively in the late 1980s was representation of orthographic typefaces by fonts, and it may not be too optimistic to say that we are about to overcome these problems by means of the international standard, Unicode, in which most character sets have found a home. While field linguistics is not addressed as a subject heading in Lancashire’s compilation, more recent work by Johnston (1995) and Antworth and Valentine (1998) is devoted to just this topic and surveys the relevant software of the time in some detail. Some of the tools described in these two sources are still used by field linguists, but this is partly because there is no choice. Shoebox is an example of a fine piece of software that is the mainstay of lexicographic and textual analysis; it was last updated in 2000, although it has recently been replaced by Toolbox on Windows platforms. A number of tools have not been updated and are now unable to run on recent operating systems.
Bearing in mind that the data is our primary concern and not the software we use to manipulate it, it is nevertheless critical that the software enables us to perform the kinds of tasks we routinely require in order to assist us in our fieldwork. It is the function of a software tool to transform data, or to allow us to interact with the data. We take it as given that the tools discussed here may soon be superseded. The kinds of functions that we need as linguists will continue to be addressed in new ways in the future. As there is no one tool that will do all that we require, we need ways of allowing our data to flow between the tools. This typically involves the use of text manipulation software or regular expression parsers. Most of the examples of tools listed below can be found on the Internet, and searching for the major headings here will locate any more recent items.
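To give a sense of what letting data flow between tools can involve, here is a minimal sketch that reads plain text in a backslash-coded field format, of the general kind used by Shoebox, and re-expresses it as JSON. The marker names (`lx`, `ps`, `ge`) and the sample entries are assumptions made for the example, and the parser deliberately ignores continuation lines that lack a marker.

```python
import json

# Invented sample in a backslash-coded format; the markers are assumed to mean
# lexeme (lx), part of speech (ps), and English gloss (ge).
SAMPLE = r"""
\lx tana
\ps n
\ge ground

\lx waka
\ps n
\ge canoe
"""


def parse_records(text: str, record_marker: str = "lx") -> list:
    """Split marker-coded text into records of (marker, value) pairs.

    A new record starts at each line carrying record_marker. Lines that do not
    begin with a backslash (blank or continuation lines) are skipped.
    """
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = []
            records.append(current)
        if current is not None:
            current.append((marker, value.strip()))
    return records


records = parse_records(SAMPLE)
# Pairs are kept in order rather than folded into a dictionary, so that a
# field occurring more than once in an entry is not silently lost.
as_json = json.dumps(records, ensure_ascii=False, indent=2)
```

Once the data is held as ordered pairs, writing it out in another tool's format is a matter of a different output step, which is the sense in which the data outlives any one piece of software.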
There is an enormous possibility for new uses of linguistic data, both in the exploration of its internal links and in the representation of the data itself, to accompany our analyses or to assist in language reintroduction programs. Given that this is the case, it would be foolhardy to suggest that we could provide all the answers in a fixed time or location. Rather, there are major sources of information on these topics, as given in the list of web links below, that should be consulted by anyone wanting to locate current information on these topics. They should also get in touch with the local linguistic archive that will be keeping abreast of the best emerging practices.
Transcribing
Producing a textual index (or transcript) of a media file, with timecodes inserted into the resulting file.
Elan, http://www.mpi.nl/tools/elan.html
Transcriber, http://www.ldc.upenn.edu/mirror/Transcriber/
Clan, http://childes.psy.cmu.edu/clan/
TASX, http://tasxforce.lili.uni-bielefeld.de/
(cf. the Annotations page, which has a list of many of these kinds of tools: http://www.ldc.upenn.edu/annotation/)
Interlinearizing Text
Providing an annotation of the transcript, in a morpheme-level correspondence, typically with reference to a controlled vocabulary that will become a lexicon of the language.
Shoebox, http://www.sil.org/computing/shoebox/
Building a Dictionary Based on the Corpus
Database programs are, in general, not recommended for building dictionaries as they are too restrictive on the form in which an entry can be represented. A major benefit of Shoebox is that it provides a means for glossing texts linked to a dictionary, a function that is not available with other tools. Dictionary presentation tools are a useful way of getting structured lexical information into a public form, for example:
Building a Corpus of Media Material
Amassing transcripts linked to media files to allow navigation through the media via the textual index. Instantiating links established with transcription tools.
Audiamus, http://www.linguistics.unimelb.edu.au/thieberger/audiamus.htm
Concordancing the Corpus
Establishing a list of all words in the corpus in their context. Ideally this concordance interacts with the corpus to allow you to move between the concordance and the corpus (McEnery and Wilson 2001: 209ff. give a list of tools for corpus research).
Conc, http://www.sil.org/computing/conc/
Wordsmith, http://www.lexically.net/
Conversion of Linguistic Data
To restructure our data for use in the tools listed here we need conversion methods that can take the data from one format to another. Regular expressions allow the linguist to query the data on structure rather than content. So, for example, the expression ‘\r.’ will find any carriage return and following character, regardless of what it is. Similarly, ‘\r[0-9]’ finds a carriage return followed by any numeral. Regular expressions assist in structuring textual data to move it between applications. A general search on ‘regular expression’ will give more information; see, for example, http://www.regular-expressions.info. Tools that use regular expressions include:
Emacs, http://www.emacs.org
BBEdit, http://www.barebones.com/products/bbedit/index.shtml
Perl, http://www.perl.com
ECONV (http://www.mpi.nl/tools/econv.htm) does conversions between Shoebox, Transcriber and Elan textual formats without the need to learn regular expressions.
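The two expressions quoted in this section can be tried out directly. The sketch below uses Python's standard `re` module, which is not one of the tools the article lists; the sample text and the `#` marker written by the substitution are invented for illustration.

```python
import re

# Invented sample text whose lines are separated by carriage returns.
text = "1 first line\rcontinued\r2 second line\r3 third line"

# '\r.' matches a carriage return plus the character after it, whatever it is.
after_cr = re.findall(r"\r.", text)

# '\r[0-9]' matches only where the following character is a numeral.
digit_after_cr = re.findall(r"\r[0-9]", text)


def mark_numbered_lines(raw: str) -> str:
    """Rewrite '<CR><number><space>' as '<CR>#<number><TAB>'.

    The query is on structure (a number at the start of a line), not on the
    content of any particular line.
    """
    return re.sub(r"\r([0-9]+) ", lambda m: "\r#" + m.group(1) + "\t", raw)


marked = mark_numbered_lines(text)
```

Here `after_cr` picks up all three line starts, while `digit_after_cr` picks up only the two that begin with a number, which is the distinction the prose draws between the two patterns.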
Spectral Analysis
Acoustic analysis of segments of field recordings can be accomplished with these two widely used tools.
Praat, http://www.fon.hum.uva.nl/praat/
Emu, http://emu.sourceforge.net/
Archiving Data
These archives are both repositories for field recordings and derived forms of data and analysis and clearinghouses for relevant information on linguistic methods and tools.
Digital Endangered Languages and Musics Archive Network (DELAMAN), http://delaman.org/
Open Language Archives Community (OLAC), http://www.language-archives.org/
Aboriginal Studies Electronic Data Archive (ASEDA), http://www.aiatsis.gov.au/rsrch/rsrch_pp/ased_abt.htm
Archive of the Indigenous Languages of Latin America, http://www.ailla.utexas.org
Documentation of Endangered Languages (DOBES), http://www.mpi.nl/DOBES
Endangered Languages Archive (ELAR), http://www.hrelp.org/archive/
Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC), http://paradisec.org.au
Linguistic Computing Directories
General sources of information on linguistics and computing tools.
http://www.sil.org/linguistics/computing.html
http://www.linguistlist.org/sp/Software.html
See also: Character Sets; Natural Language Processing: Overview; Phonetics: Field Methods; Semantics: Field Work Methods.
Bibliography
Antworth E & Valentine R J (1998). ‘Software for doing field linguistics.’ In Lawler J & Dry H A (eds.) Using computers in linguistics: a practical guide. London; New York: Routledge. 170–196.
Bird S & Simons G (2003). ‘Seven dimensions of portability for language documentation and description.’ Language 79, 557–582.
Johnston E C (1995). ‘Computer software to assist linguistic field work.’ Cahiers des sciences humaines 31(7), 103–129.
Lancashire I (1991). The humanities computing yearbook 1989–90. Oxford: Clarendon Press.
Lawler J & Dry H A (eds.) (1998). Using computers in linguistics: a practical guide. London; New York: Routledge.
Leech G N, Myers G & Thomas J (eds.) (1995). Spoken English on computer: transcription, mark-up, and application. Harlow, Essex, England; New York: Longman.
McEnery T & Wilson A (2001). Corpus linguistics: an introduction. Edinburgh: Edinburgh University Press.
Computers in Lexicography
A Kilgarriff, Lexicography MasterClass, Brighton, UK
© 2006 Elsevier Ltd. All rights reserved.
Computers can be used in lexicography to support the analysis of the language and to support the synthesis of the dictionary text. There are, of course, many other interactions between computing and lexicography, including the preparation and presentation of electronic dictionaries, the use of dictionaries in language technology systems (see Computational Lexicons and Dictionaries), and the automatic acquisition of lexical information (see Controlled Languages; Lexical Acquisition). They will not be covered here. In technologically advanced dictionary-making, the lexicographer works with two main systems on their computer: the corpus query system (CQS) for analysis and the dictionary writing system (DWS) for synthesis. Currently, these are always independent, with communication between the two via cut and paste. We describe requirements, and the state of the art, for each.
Dictionary Writing Systems (DWSs)
Anyone producing a dictionary needs to (a) write it, and (b) store it. Each can be done on either paper or computer. ‘Dictionary writing system’ means the software used where either or both are done on a computer. Producing a dictionary is a large and complex operation. The DWS can facilitate the operation at many points. Dictionary production usually involves a team whose members include lexicographers, a chief editor, a project manager, and a publisher. The DWS will be a key tool for all of them, each from a different perspective. The lexicographer wants the tool to facilitate writing and editing text. The chief editor wants it to support quality checking and consistency, including ensuring that dictionary policies are observed. The project manager wants it to support progress monitoring, including the process of allocating packages of work to lexicographers, distributing them, and checking that they are returned on time. The publisher wants it to deliver a versatile database that can readily be used for producing various dictionaries (electronic and paper, large and small) and potentially for licensing for a range of other purposes, such as spell-checking or automatic translation.
The Dictionary Grammar
A dictionary is a highly structured document. An entry typically contains a headword, pronunciation and part-of-speech code, optional labels, and information about inflectional class and morphological and spelling variants, then a sequence of senses, each with definition or translation and optional examples. Each of these is a different information field. There are constraints on which fields are required or allowed where. Fields are often distinguished by font or use of bold or italics. Some fields, like part of speech, may only take one of a small set of values; others play a specific role in sorting or cross-referencing. A lexicographer or user of an electronic
version of the dictionary data may wish to specify particular fields in a search. For all these reasons, they need to be explicit; all data in the dictionary database must be within a particular field. When lexicographers write or edit an entry, they must not only input the text; they must also specify the field it falls within. The ‘dictionary grammar’ is at the center of the project. It names the different fields of information and says how they are to be nested and ordered, and which are obligatory and which are optional. When a dictionary project is planned, decisions must be made about the different fields and entry structures. These policies, along with many more, go to form the ‘style manual,’ an extensive document detailing how all the many varieties of lexical fact are to be classified and presented. The dictionary grammar implements the style manual. It tells the computer what a dictionary entry needs to look like. The computer can then make sure entries have appropriate structures and can guide the lexicographer through the compilation process. There should be a one-to-one mapping between the information fields in the style manual and those in the dictionary grammar (and it is sensible to give them the same names in both). The human-readable rules for entry structures in the style manual and the computer-readable ones in the dictionary grammar should correspond. If policies change, with new information types or entry types added, corresponding changes must be made to the dictionary grammar. The lexicographer will need to become highly expert on dictionary style, and this will mean knowing the dictionary grammar, as well as the style manual, very well.
Database and XML
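A small sketch may make the idea of a dictionary grammar concrete, using XML (taken up in this section) as the entry format. The field names, the nesting, and the closed set of part-of-speech values are all hypothetical, and the hand-written checker below merely stands in for the validation that a real DWS would derive from a document type definition or XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary grammar. For each structured field: its child fields
# in the required order, as (name, obligatory, repeatable).
GRAMMAR = {
    "entry": [
        ("headword", True, False),
        ("pron", False, False),
        ("pos", True, False),
        ("sense", True, True),
    ],
    "sense": [
        ("definition", True, False),
        ("example", False, True),
    ],
}

# Fields that may only take one of a small, closed set of values.
CLOSED_VALUES = {"pos": {"n", "v", "adj", "adv"}}


def check(element: ET.Element) -> list:
    """Return a list of grammar violations found in element (empty if none)."""
    errors = []
    spec = GRAMMAR.get(element.tag)
    if spec is None:
        # A plain text field: only its value can be wrong.
        allowed = CLOSED_VALUES.get(element.tag)
        text = (element.text or "").strip()
        if allowed is not None and text not in allowed:
            errors.append(
                f"<{element.tag}> has value {text!r}, not one of {sorted(allowed)}"
            )
        return errors

    children = list(element)
    position = 0
    for name, obligatory, repeatable in spec:
        count = 0
        while position < len(children) and children[position].tag == name:
            errors.extend(check(children[position]))
            position += 1
            count += 1
            if not repeatable:
                break
        if obligatory and count == 0:
            errors.append(f"<{element.tag}> is missing obligatory <{name}>")
    # Anything left over is either unknown or out of order.
    for extra in children[position:]:
        errors.append(f"<{element.tag}> has unexpected or misplaced <{extra.tag}>")
    return errors


good = ET.fromstring(
    "<entry><headword>cat</headword><pos>n</pos>"
    "<sense><definition>a small domesticated feline</definition>"
    "<example>The cat sat on the mat.</example></sense></entry>"
)
bad = ET.fromstring(
    "<entry><headword>cat</headword><pos>noun</pos>"
    "<sense><example>The cat sat on the mat.</example></sense></entry>"
)
good_errors = check(good)
bad_errors = check(bad)
```

The second entry is reported for two reasons: its part of speech is outside the closed set, and its sense lacks the obligatory definition. These are the kinds of check the article says the computer can take over from the lexicographer.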
At the heart of a DWS is a database, which stores the growing dictionary. Standard database functions that a DWS needs are fast access, locking items when a user is working on them, backup, and crossreference checking. (Crossreference checking and sorting are two particular tasks where a dictionary project makes demands that go beyond what many generic systems offer.) The database view of the dictionary is a ‘nuts and bolts’ view, critical for the working of the project but not supporting a view of the dictionary as structured text. For this, a language for representing texts is needed (see Mark-up Languages: Text). XML (eXtensible Markup Language) is designed for this purpose (and is now the language of choice throughout the publishing industry). XML can be used to specify the dictionary grammar. (XML provides two mechanisms for specifying the structure of a document: a
document type definition, or XML schema. Both are suitable for specifying the dictionary grammar.) The XML version of the dictionary database will then serve both as an ‘exchange format’ (for delivering the dictionary to printers and other customers) and as a guarantee of its longevity. (Any database system will in due course be superseded, so it is important for the publisher to have access to a version of the dictionary that does not become unusable if the database system is no longer supported.) XML and associated standards support many of the processes of dictionary production, including style sheets (so the format of paper and electronic products can be efficiently specified, in a process that the lexicographer need not worry about), linking to other resources, and transforming the data (so a range of derived dictionaries and other variants may be produced automatically). The database view and the XML view are complementary. The database must be one that allows the data structures to be specified by a dictionary grammar and that can both input and output XML.
The Lexicographer’s Perspective
Lexicographers will spend most of their working week with the DWS, so it is crucial that it helps them rather than hinders them. It must do jobs for them, not give them extra jobs to do. It must be robust (so that it does not crash and lose a week’s work) and fast (so there is no time wasted waiting for the computer to respond). It should be intuitive; lexicographers will need training in lexicography and linguistics, but should not need to commit much time to learning the software. It should support working from home (with or without a high-bandwidth Web connection). It should give them read-only access to an up-to-date version of the whole database, so they can see how other entries related to the one they are working on have been handled. When typing, it will tend to be most intuitive for the lexicographers to ‘fill in boxes’ for different information fields, but they will also want to be able to check how the overall entry looks as they proceed. So the DWS should support both of these views, and possibly others. Some fields, like part-of-speech labels, will have a closed set of possible contents. In such cases, the options should be offered in a drop-down list, both so the lexicographer need not remember details of capitalization, punctuation etc., and so that consistency is guaranteed. Lexicographers often want to restructure long entries, including changing the ordering or nesting of senses and other units. This will be a hard intellectual
Computers in Lexicography 785
Figure 1 Dictionary Writing System, standard lexicographer’s view.
task; the DWS can at least make it a technically easy one. Figure 1 presents a screenshot of a leading DWS. It shows the ‘structure view’ (the default for data input) in the top half of the screen with the WYSIWYG view, simulating what the entry will look like on the dictionary page, in the bottom half. (Editing can take place in either window.)
Checks and Searches
The DWS, in its database functionality, needs to support the many checks to be made before a dictionary text is ready to publish. It must report any failures to comply with the dictionary grammar, or unresolved crossreferences, or spelling errors, or use of words which are not themselves defined (for many, though not all, dictionary types). These checks can be fully automatic. A further range of checks requires specific database searches combined with human judgment: ‘find me all the sports-domain words with examples using a country name’ (to ensure we do not repeatedly talk about the same country winning or losing) or ‘find me all the phrasal-verb
entries with a grammar field which specifies a preposition other than to or for’ (to check that they are all consistent with one of the dictionary policies on grammatical description of phrasal verbs). One important checking role is to assess the quality of revised entries when they are received back from lexicographers, prior to replacing the earlier versions of those entries. For that, a tool which supports quick comparison is needed. Such a tool from one leading DWS is shown in Figure 2.
History
The earliest use of computers to support English dictionary compilation was by Laurence Urdang on the Random House Dictionary of the English Language (1966). A computer was used to sort and classify words and senses, so that contributors could work in logical rather than alphabetical order: for example, the medical editor wrote all the entries for words denoting diseases systematically, leaving alphabetical sorting to the computer. At that time computer typesetting was not advanced, and when compilation was complete, the whole text of that dictionary had to be printed onto paper and rekeyed by the printer.
Figure 2 ‘Merge’ tool for comparing the new version of an entry with the old, prior to replacing the old. Typically for the use of a senior team member, reviewing the work of a more junior colleague.
Urdang’s system was improved and refined for the compilation of the first edition of Collins English Dictionary (1979), which was able to give far more extensive coverage to special-subject vocabulary (sciences, technologies, sports, etc.) than previous one-volume dictionaries. General-language editors, special-subject contributors, grammarians, etymologists, and pronunciation editors worked in parallel, again leaving alphabetical sorting to the computer. Sorting problems were corrected during a final copyediting pass. On this work, computer-aided typesetting came into its own with the use of the ‘flying spot’ Fototronic typesetting machine, resulting in an exceptionally economical use of page space, so that more information could be packed into clear and readable pages than would have been possible using conventional hot-metal or film setting.
Systems and the Marketplace
While it is possible to assemble a DWS from off-the-shelf components – database, editor, and project
management tools – there will be substantial wasted time and effort associated with any failures of the tools to work smoothly together, and there is a strong case for a DWS being developed as a single application, meeting all the requirements sketched above. Several systems have now been developed which do meet all, or most, of the desiderata. An early one was Gestorlex, developed in the late 1980s by a Danish company working with the Danish Dictionary Project and Longman Dictionaries. However it worked with the OS/2 operating system, and when, in the late 1990s, OS/2 was no longer supported, there were for a while no high-specification DWSs publicly available for sale. The situation has recently improved, and there are, as of late 2004, at least three systems meeting all the desiderata on the market. Several large publishers have developed their own systems, which may meet all the desiderata listed but are not available for sale or for inspection. The emphasis in this section has been on large dictionary projects involving whole teams. A different
scenario has been addressed by the Summer Institute of Linguistics (SIL) (see Bilingual Lexicography), an organization with its roots in Bible translation, which works particularly on the documentation of languages without a written tradition. The prototypical case is that of a field linguist visiting a remote community to learn and record their language. SIL have produced the widely-used tools ShoeBox and LinguaLinks with this situation in mind. They offer less flexibility than a system which allows a new dictionary grammar to be developed for each new project, because they work with a fixed set of information fields, but this is suitable for a scenario in which the field linguist does not have a support team and would not know how to prepare a dictionary grammar. A recent entry into the field, Tshwanelex (Joffe and de Schryver, 2004) aims to meet the needs of both the field linguist and larger dictionary teams. Another variable is the type of dictionary. Since the late 1980s, one brand of lexicographic work has been the production of WordNets in various languages (see WordNet(s)), and several dedicated DWSs have been developed for them, with their particular requirements of hierarchical structure and interoperability with other WordNets. One recent development is a series of international workshops on DWSs, sponsored by EURALEX (European Association for Lexicography). More details can be found online.
Corpus Query Systems (CQSs)
How should the lexicographer approach the core task of working out what to say about a word? Two possibilities are to look (1) in their own head (introspection), and (2) in other dictionaries. The former is central to lexicography, and any good lexicographer needs a keen awareness of how words behave and what they mean, but it suffers the limitations that, first, it is very easy to miss things, and second, it is subjective: different individuals will have different ideas of what is important or central or salient. The latter is obviously derivative. The third possibility is to look at a corpus (see Corpus Lexicography; Corpora; Corpus Linguistics). People writing dictionaries have a greater and more pressing need for a corpus than most other linguists, and have long been in the forefront of corpus development. The first age of corpus lexicography was precomputer. Dictionary compilers such as Samuel Johnson and James Murray worked from vast sets of index cards, their ‘corpus.’ The data lying behind the Oxford English Dictionary comprised over 20
million index cards, each with a citation exemplifying a use of a word.
KWIC Concordances
The second age commenced with the COBUILD project, in the late 1970s (Sinclair, 1987). Sinclair and Atkins, its devisers, saw the potential for the computer to do the storing, sorting, and searching that was previously the role of readers, filing cabinets, and clerks and, at the same time, to make it far more objective; human readers would only make a citation for a word if it was rare, or where it was being used in an interesting way, so citations focused on the unusual but gave little evidence of the usual. The computer would be blindly objective, and show norms as well as the exceptions, as required for an objective account of the language. We call the piece of software which holds the corpus, and which allows the user to extract data and reports from it, the Corpus Query System (CQS). The KWIC (Key Word in Context) Concordance is the basic tool for using a corpus. It shows a line of context for each occurrence of the word, with the word centered, as in Figure 3. The lexicographer can now scan the data and quickly get an idea of the patterns of usage of the word, quite likely spotting meanings, compounds, etc., that they might have missed had they relied on introspection. There are several additional functions that make the CQS more useful, including sorting, sampling, filtering, ‘more context,’ and complex searches.
Sorting
Sorting the concordance lines will often bring a number of instances of the same pattern together, making it easier for the lexicographer to spot it. In Figure 3 the corpus lines are sorted by the beginning of the first word to the left of the nodeword, which brings together the six instances of foreign language(s), indicating that it is an expression worthy of mentioning in the dictionary entry (depending, of course, on dictionary size, function, etc.).
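A KWIC concordance with a left sort, as described above, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the sample text, the function names (tokenize, kwic, left_sorted, render) and the four-token context span are all invented here, and the set of node forms is listed by hand rather than found by lemmatization.

```python
import re

def tokenize(text):
    """Split text into word tokens (letters and apostrophes only)."""
    return re.findall(r"[A-Za-z']+", text)

def kwic(tokens, node_forms, span=4):
    """Collect (left context, nodeword, right context) for every occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() in node_forms:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            hits.append((left, tok, right))
    return hits

def left_sorted(hits):
    """Sort by the first word to the left of the nodeword, ignoring case."""
    return sorted(hits, key=lambda h: h[0][-1].lower() if h[0] else "")

def render(hits, width=30):
    """Right-align the left context so the nodewords line up in one column."""
    return [f"{' '.join(left):>{width}}  {node}  {' '.join(right)}"
            for left, node, right in hits]

# Hypothetical sample text, standing in for a corpus.
text = ("She studies foreign languages at school. Early language development "
        "varies. He teaches a foreign language abroad. Body language matters.")

hits = left_sorted(kwic(tokenize(text), {"language", "languages"}))
for line in render(hits):
    print(line)
```

After the sort, the two instances preceded by foreign sit on adjacent lines, which is the effect the left sort in Figure 3 relies on.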
Different patterns will be highlighted according to how we sort; a sort according to the word to the right of the nodeword throws up language development, language learning, and language teaching as common collocations. The three buttons next to the word ‘Sort,’ in Figure 3, allow the user to sort the concordance according to left context, nodeword (since some searches will match a number of nodewords), and right context, while if they click on the word ‘Sort’ itself, they are taken to an ‘advanced sort’ dialogue box where other sorting strategies, for example sorting according to word endings or according to the word two to the left of the nodeword, can be specified.
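The different sorting strategies amount to different sort keys over the same concordance lines. The sketch below assumes each line is already a (left tokens, nodeword, right tokens) tuple; the strategy names and the sample lines are hypothetical, and sorting ‘according to word endings’ is interpreted here as sorting on the reversed nodeword, which is only one possible reading.

```python
def sort_key(strategy):
    """Return a key function for one (hypothetical) sorting strategy."""
    keys = {
        # first word to the left of the nodeword
        "left": lambda h: h[0][-1].lower() if h[0] else "",
        # the nodeword itself (useful when a search matches several forms)
        "node": lambda h: h[1].lower(),
        # first word to the right of the nodeword
        "right": lambda h: h[2][0].lower() if h[2] else "",
        # the word two to the left of the nodeword
        "left2": lambda h: h[0][-2].lower() if len(h[0]) >= 2 else "",
        # word ending: compare nodewords from their last letter backwards
        "node_ending": lambda h: h[1].lower()[::-1],
    }
    return keys[strategy]

def sort_concordance(hits, strategy="left"):
    """Sort concordance lines; Python's sort is stable, so ties keep corpus order."""
    return sorted(hits, key=sort_key(strategy))

# Hypothetical concordance lines: (left context, nodeword, right context).
hits = [
    (["early"], "language", ["development", "is"]),
    (["a", "foreign"], "languages", ["teaching", "post"]),
    (["second"], "language", ["learning", "and"]),
    (["of", "body"], "language", ["development", "in"]),
]

by_right = sort_concordance(hits, "right")
by_left2 = sort_concordance(hits, "left2")
by_ending = sort_concordance(hits, "node_ending")
print([h[2][0] for h in by_right])
```

With the right sort, the two language development lines become adjacent, followed by language learning and language teaching.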
Figure 3 A CQS showing KWIC concordances, sampled and left-sorted.
Sampling
Sampling is useful because there will frequently be too many instances for the lexicographer to inspect them all. When this is the case, it is hazardous just to look at the first ones shown by the CQS, because they will, in general, all come from the first part of the corpus. If, arbitrarily, there are a few texts about language development near the beginning of the corpus, then it is all too likely that the user gets an exaggerated view of the role of that term, while missing others. The sampling button allows the user to take a manageable-sized sample from the whole corpus. In Figure 3, the extract is from a left-sorted sample of 250 instances taken from a population of 21 955. The first/previous/next/last buttons are for navigating around the 13 pages of results for this sample, and here, at ‘e’ and ‘f’ in the alphabet, we are on the fifth of those pages.
Filtering
Filtering functions relate to the classification of the documents in the corpus. If one part of a corpus is, for example, spoken language, then the CQS should allow the lexicographer to view just the concordance lines for that part. Many words show different meanings and patterns of use in different varieties of language, and the lexicographer needs to be able to explore this kind of variation.
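Sampling and filtering can both be sketched as simple operations on a list of concordance lines. The code below is a minimal illustration, not a description of any particular CQS: the document identifiers (K23, A01), the dictionary layout of lines and headers, and the function names are hypothetical. The page size of 20 lines is inferred from the figures quoted for Figure 3 (250 lines over 13 pages) rather than stated in the text.

```python
import math
import random

def sample_lines(lines, k, seed=0):
    """Draw k lines uniformly from the whole result set, keeping corpus order."""
    if k >= len(lines):
        return list(lines)
    rng = random.Random(seed)  # fixed seed so the same sample can be revisited
    chosen = sorted(rng.sample(range(len(lines)), k))
    return [lines[i] for i in chosen]

def filter_lines(lines, headers, feature, value):
    """Keep only lines whose source document header has feature == value."""
    return [ln for ln in lines if headers[ln["doc"]].get(feature) == value]

def page_count(n_lines, per_page):
    """Number of result pages needed for n_lines."""
    return math.ceil(n_lines / per_page)

# Hypothetical document headers and a stand-in for 21 955 concordance lines.
headers = {"K23": {"w/s": "spoken"}, "A01": {"w/s": "written"}}
lines = [{"doc": "K23" if i % 2 == 0 else "A01", "pos": i} for i in range(21955)]

sample = sample_lines(lines, 250)
spoken = filter_lines(sample, headers, "w/s", "spoken")
print(len(sample), page_count(len(sample), 20), len(spoken))
```

Because the sample is drawn from index positions across the whole result set, it is not biased toward whatever documents happen to come first in the corpus.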
The prerequisites are that
• all the text in the corpus comes packaged in ‘documents,’
• each document comes with a ‘header,’
• the header states facts about the type of text contained in the document, and does so in a way that the CQS can interpret.
Each corpus will have its own scheme of text types. (Classifying all the corpus documents according to the scheme is a large corpus development task.) In general the scheme will be specified as a number of features, with each feature having a range of possible values. Thus, the feature ‘w/s’ may have values , the feature ‘mode,’ , and the feature ‘time,’ . In Figure 3, the left-hand column contains an identifier for the document, and by clicking it, the user can see a brief description of the document that the line is taken from. (It is also possible to specify that the value for a particular feature be shown, for each concordance line, in an additional column.) Searches can be constrained according to text type by first defining a subcorpus, for example ‘all the spoken
Figure 4 A CQS screen used for complex searches.
material’ (all documents for which w/s = ‘spoken’), and then searching in that subcorpus.
More Context
‘More context’ functions allow the user to see more of the text, where the truncated KWIC line does not tell them enough. (This happens only occasionally in lexicography, but is a common issue for linguists using the CQS to study syntax, prosody, or discourse structure.) In the CQS used for Figure 3, if the word is clicked, a window opens up at the bottom of the screen showing additional context, which can then be scrolled left or right. Another option is for the user to click the KWIC/Sentence button, which toggles between showing the KWIC line and the full sentence. (For this, it is a prerequisite that the sentences are identified.)
Complex Searches
In addition to simple searches for single words, users may often want to search for a phrase or some other more complex structure. A good CQS will support complex searches, while keeping the interface simple and user friendly for the simple searches that users most often want to do. One solution is shown in Figure 4. This is the screen at which the user specifies the search. If they want to simply specify a word, they input it in the first box and hit ‘return’ (or click ‘Make concordance’) to get
the concordance. If they wish to specify an exact phrase, they insert it in the phrase box. If they want to specify a pair of collocates more loosely, they can give one of the pair in the first box, and then specify the other in either the ‘right context’ or ‘left context’ box (depending on whether it is expected to fall to the right or the left of the first term). Searching is much improved if the corpus is, first, lemmatized and, second, part-of-speech tagged. Lemmatization (also known as morphological analysis) is, for current purposes, the process of identifying the dictionary headword and part of speech for a corpus instance of a word. Thus, in Figure 3, the node word sometimes has the form languages, sometimes language. The lemmatization process has identified language (noun) as the lemma, and the search has found all examples of the lemma language. In a language such as English, many corpus words may be instances of more than one lemma. Thus, brushes may be the plural of the noun, or the present tense third person singular form of the verb. The process of identifying which one applies in a particular context, by computer, is called part-of-speech tagging (see Part-of-Speech Tagging). Once all words in a corpus are lemmatized and part-of-speech tagged (and this information is made available to the CQS),
each word in the corpus can be thought of as a triple, <wordform, lemma, POS-tag>, and searches can be specified in terms of any of these. Thus, in Figure 4, the user can specify a lemma or a word form (either with an associated word class or without, which would cover, for example, both noun brush and verb brush). In the left and right context, the user may specify word classes, as well as lemmas or wordforms. (The contents of the text items are interpreted as either lemmas or wordforms.) The one box unexplained so far is the CQL box. CQL (for Corpus Query Language) is a formalism for corpus querying developed at the University of Stuttgart (Christ, 1995), which approaches the status of a standard for the field. It allows one to build sophisticated structured searches, matching all- or part-strings, for as many fields of information for the word as are provided. (We have only, to date, seen wordform, lemma, and POS-tag, but there could be more.) The CQL box allows the advanced user to interact with the corpus directly in CQL.
Summary
Since COBUILD, lexicographers have been using KWIC concordances as their primary tool for finding out how a word behaves. This has been a revolution in lexicography (see Corpus Lexicography; Corpus Approaches to Idiom). For a lexicographer to look at the concordances for a word is a most satisfactory way to proceed, and any new and ambitious dictionary project will buy, borrow, or steal a corpus and use one of a number of CQSs to check the corpus evidence for a word prior to writing the entry. But corpora get bigger and bigger. As more and more documents are produced electronically, as the Web makes so many documents easily available, so it becomes easy to produce ever larger corpora. Most of the first COBUILD dictionary was produced using a corpus of under 8 million words. Several of the leading English dictionaries of the 1990s were produced using the British National Corpus (BNC), of 100M words. The Linguistic Data Consortium has recently announced its Gigaword (1000 M word) corpus, and the Web is perhaps 10 000 times bigger than that, in terms of English-language text alone. This is good. The more data we have, the better placed we are to present a complete and accurate account of a word’s behavior. But it does present problems. Given 50 corpus occurrences of a word, the lexicographer can simply read them. If there are 500, it is still a possibility but might well take longer than an editorial schedule permits. Where there are 5000, reading all of them is no longer at all viable. Having more data is good – but the data then needs summarizing.
Collocation Statistics
The third age of corpus lexicography began with the paper that also inaugurated the new subfield of collocation statistics, by Church and Hanks (1989). They proposed the measure (from information theory) of mutual information (MI) as an automatic way of finding a word’s lexicographically interesting collocates. Given a node word, we find all the words that occur within, for example, a five-word window of it in any of its corpus occurrences. We count how often each of these words occurs in the window. We can then compute how much more often the word was found than it would have been by chance, if there was no systematic relation between the two words. The ratio of how many more times the word is found than it would have been by chance (strictly, the logarithm of the ratio) is the mutual information that each word holds for the other. Table 1, adapted from Church and Hanks (1989), shows the highest-mutual-information collocates found in a window of one to five words to the right of save (with a frequency threshold of five; collocates are only shown if they occurred with save more than five times). Thus, we save forests, lives, jobs, money (in various forms) and face. This is useful lexicographic information and shows that we can automatically summarize the corpus data, presenting just a list of salient collocates to the lexicographer. The approach generated a good deal of interest among lexicographers, and leading CQSs such as WordSmith (Scott, 1999) and the Stuttgart tools (Christ, 1995) now provide functionality for identifying salient collocates, along these lines. One flaw of the original work is that MI favors rare words (and an ad hoc frequency threshold has to be imposed or the list would be dominated by very rare items). This problem can be solved by changing the statistic, and a number of proposals have been made. A range of proposals are evaluated in Evert and Krenn (2001) (though the evaluation is from a linguist’s rather than a lexicographer’s perspective).
Table 1 Collocates within a 1–5 word window to the right of save, from Church and Hanks (1989)
Columns: Word, f(x + y), f(y)
Word: forests, $1.2, lives, enormous, annually, jobs, money, life, dollars, costs, thousands, face, estimated, your
There are several other points at which Table 1 could be improved. First, it contains both life and lives. As discussed above, we could count lemmas rather than word forms; then these two would be merged. Other concerns include:
• the arbitrariness of deciding to look at the five words to the right; in practice, lexicographers often try a number of window sizes and positions, to capture different kinds of collocates. Some CQSs include a tool which shows the user the highest-salience collocates in each position between –5 and +5, though this solution gives the user a lot of information to wade through and fails to merge information about the same word occurring in different positions.
• assorted noise, of no linguistic interest ($1.2, your).
• the inclusion in the same list of words that might be the object of the verb (forests, lives, jobs, money and face), an adverb (annually), another associated verb, or a preposition, or – of less interest – modifiers of the direct object (enormous, estimated).
Word Sketches
These three limitations can be addressed at once by applying grammar. Up to this point, collocation-finding has been grammatically blind. It has considered only proximity. However, lexicographically interesting collocates are, in most cases, words occurring with the node word in a particular grammatical relation. An alternative to looking at a window of words is to look for all words standing in a specific grammatical relation to the headword. This task is parsing, and parsing is hard. It has been the core problem for Natural Language Processing (NLP) since the field was born (see Parsing and Grammar Description, Corpus-Based), and the best current parsers still make many mistakes. However, this application is error-tolerant, since statistics are applied to the output of the parser, and collocates will only be shown to the lexicographer if they occur repeatedly and with high salience; it is less likely that errors will occur repeatedly with the same collocates. Once a corpus is parsed, a word sketch (Kilgarriff et al., 2004) can be produced. A word sketch is a one-page summary of a word’s grammatical and collocational behavior. Figure 5 shows the word sketch for the noun language produced from the British National Corpus (BNC, 1995).
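The idea of applying collocation statistics to grammatical relations rather than to a window can be sketched as follows. The code assumes the parser output is available as (head, relation, collocate) triples of lemmas; the triples, the relation name ‘object’ and the function name are invented for illustration. The score is the plain logarithm of the observed-to-expected ratio, in the spirit of the mutual information measure described under Collocation Statistics; it is not claimed to be the salience statistic used by any actual word sketch tool.

```python
import math
from collections import Counter

def relation_collocates(triples, head, relation, min_freq=2):
    """Rank collocates of `head` in `relation` by log2(observed / expected)."""
    pair = Counter()       # (head, relation, collocate)
    head_rel = Counter()   # (head, relation)
    rel_coll = Counter()   # (relation, collocate)
    rel_total = Counter()  # relation
    for h, r, c in triples:
        pair[(h, r, c)] += 1
        head_rel[(h, r)] += 1
        rel_coll[(r, c)] += 1
        rel_total[r] += 1

    scored = []
    for (h, r, c), freq in pair.items():
        # The frequency threshold keeps one-off items (often parser errors) out.
        if h != head or r != relation or freq < min_freq:
            continue
        # Expected count if head and collocate were unrelated within this relation.
        expected = head_rel[(h, r)] * rel_coll[(r, c)] / rel_total[r]
        scored.append((c, freq, math.log2(freq / expected)))
    return sorted(scored, key=lambda item: item[2], reverse=True)

# Hypothetical parser output: (verb lemma, relation, noun lemma).
triples = (
    [("save", "object", "money")] * 3
    + [("save", "object", "life")] * 2
    + [("save", "object", "time")] * 1
    + [("spend", "object", "money")] * 3
    + [("spend", "object", "time")] * 4
    + [("lose", "object", "life")] * 1
    + [("lose", "object", "time")] * 2
)

sketch = relation_collocates(triples, "save", "object")
for collocate, freq, score in sketch:
    print(f"{collocate:8} {freq:3d} {score:6.2f}")
```

Because only one relation is considered at a time, objects of the verb are no longer mixed with adverbs or modifiers, and counting lemmas merges forms such as life and lives.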
Word sketches were first used for the Macmillan English Dictionary (Kilgarriff and Rundell, 2002; Rundell, 2002). In the CQS which provides word sketches, the Sketch Engine, the word sketches are integrated with concordancing functions, so the user can move easily between sketch and concordances. The tool also provides a data-driven thesaurus and the ‘sketch diff’ function which contrasts collocates of near-synonyms. For word sketches to be built, the system must be told what the grammatical relations are for the language, and where in the corpus they are instantiated. There are two ways to do this. The input corpus may already be parsed, with grammatical relations given in the input corpus. Such a corpus is occasionally available. The other way is to define the grammatical relations and parse the corpus within the tool. To do this, the input corpus must be part-of-speech tagged. (It should also be lemmatized.) Then each grammatical relation is defined as a regular expression over part-of-speech tags, using the CQP formalism (Christ, 1995). The regular expressions are used to parse the corpus at the compiling stage, giving a database of tuples such as