Lecture 8: Split-ergative and Inverse Systems



In this lecture we will examine various syntactic phenomena which reflect the deictic centrality of the speaker and addressee in the speech event.  Deixis and its expression in various linguistic categories is an old and well-known concept in semantic analysis, but it is seldom invoked in syntactic analysis.  After all, though the difference between the shifting reference of I and you and the fixed reference (within a given discourse) of other NP's is obvious, it has no obvious syntactic consequences--the 1st and 2nd person pronouns have exactly the same syntactic privileges as other NP's.  But, as we will see, a great many languages manifest some kind of syntactic alternation which directly reflects the deictic status of the various core arguments.  And, indeed, in the context of this demonstration, we will be able to see that the category is indistinctly but unmistakably reflected even in a language like English.



1.0 Split Ergative and Inverse Marking


Among the linguistic phenomena which pose long-standing problems for theories of grammatical relations are split ergative marking and direct/inverse marking in the verb.  There are several types of ergative "split", in which case marking is sometimes according to an ergative pattern, and sometimes not; our interest here is in the nominal split pattern, in which the place of the A argument on a hierarchy of nominal types determines whether or not it will be marked as ergative.  This, as we will see, is responsive to the same functional parameters as direct/inverse marking, where a transitive verb is marked to reflect whether the A or O is higher on the same nominal hierarchy.



1.1 Split-ergative Case Marking and Indexation


It has been generally recognized since Silverstein's seminal paper on the topic (1976) that one widely-attested pattern of split ergative marking reflects a hierarchy of nominal types.  Dixon (1994:84ff) summarizes this "Nominal Hierarchy" of eligibility to be subject of a transitive verb as:


1st person    2nd person    Demonstratives    Proper    Common
pronouns      pronouns      & 3rd person      nouns     nouns



Following Silverstein, Dixon notes that:


     Those participants at the left-hand end of the hierarchy are most likely to be agents, to be in A function, and those at the right-hand end are most likely to be patients, to be in O function.  (Dixon 1994:85)


However, this interpretation of the facts, though standard, is somewhat misleading.  In reality almost[1] all split ergative languages make the "split" between 1st and 2nd person pronouns (the Speech Act Participants, or SAP's), which do not distinguish A from S forms, and all other arguments, which do (DeLancey 1981a, cf. Dixon 1994:88).

     This type of split is very common in Australian languages (Silverstein 1976, Heath 1976, Blake 1987), and attested in North America (e.g. Silverstein 1976, Mithun 1999:230-3, and below), Siberia (Comrie 1979a, b, 1980), and in a number of Tibeto-Burman languages (Bauman 1979).  An example of the last is Sunwar, a language of the Kiranti branch of Tibeto-Burman spoken in Nepal.[2]  Lexical nouns and 3rd person pronouns (which in Sunwar are the demonstratives méko 'that' and mére 'yon') are unmarked in S function, and take ergative case (-Vm) as A's:


     1)     méko ?àl   hí-t-a

          DEM  child come.down-PAST-3sg

          'The child came down.'


     2)     méko ?àl-am    tà-t-i

          DEM  child-ERG see-PAST-3sg→1sg

          'The child saw me.'


     3)     méko hí-t-a

          DEM  come.down-PAST-3sg

          'He came down.'


     4)     méko-m  tà-t-i

          DEM-ERG see-PAST-3sg→1sg

          'He saw me.'


But there is no such alternation for 1st and 2nd person pronouns, which do not have ergative forms:


     5)     go hí-t-i

          I  come.down-PAST-1sgINTR

          'I came down.'


     6)     go méko ?àl   tá-t-a

          I  DEM  child see-PAST-1sgTR

          'I saw the child.'


     There is also a head-marking version of the same phenomenon, with distinct ergative and absolutive verbal indices for 3rd, but not 1st or 2nd, person arguments, as in the Chinookan (Penutian) languages of the lower Columbia River.  Consider these forms from the Kiksht[3] language (Dyk 1933, cf. Silverstein 1976):[4]


     6)     ni-n-i-waqw       'I killed him.'


     7)     gal-i-tí           'He came.'


     8)     a-tc-n-dwágwa         'He will kill me.'


     9)     a-n-kdáyu               'I will go.'


The Kiksht forms (6-7) show the masculine singular 3rd person absolutive index i- as O and as S.  (8) shows the masculine singular 3rd person ergative index tc- as A.  (6, 8, and 9) show the undifferentiated 1st person singular index n- as, respectively, A, O, and S.  The structure of the agreement paradigm is the same as that of the Sunwar case paradigm:  3rd person forms distinguish ergative and absolutive, 1st and 2nd person forms do not.

     The essential facts about split ergative marking are the special status of the SAP's, and the pattern of the split--it is not only that it is always the SAP's that get special treatment, but that the special treatment is always the same, with SAP A arguments unmarked where 3rd person A is a morphologically marked category.
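The split just described can be stated as a one-line rule. The following is a minimal sketch, assuming persons are coded as the integers 1, 2, 3 and that only the S/A contrast is modelled; the function name `a_case` and the labels "ERG" and "unmarked" are illustrative choices, not forms from any of the languages cited.

```python
SAP = {1, 2}  # speech act participants: 1st and 2nd person


def a_case(person: int, role: str) -> str:
    """Case label for an S or A argument under the SAP / 3rd split.

    role must be "S" or "A"; O arguments are outside this sketch.
    """
    if role not in ("S", "A"):
        raise ValueError("only S and A are modelled here")
    # Only a 3rd person A is morphologically marked; SAP A's pattern with S.
    if role == "A" and person not in SAP:
        return "ERG"
    return "unmarked"


for person in (1, 2, 3):
    print(person, a_case(person, "S"), a_case(person, "A"))
```

The same rule covers both the Sunwar case paradigm and the Kiksht verbal indices, since the two differ in where the marking appears rather than in which arguments receive it.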



1.2 Inverse systems


Another type of grammatical system which manifests exactly the same person hierarchy is found in inverse-marking languages.[5]  In the usual sense of the term,[6] an inverse‑marking system is one in which there is a ranking of person in which SAP's outrank all 3rd persons (while ranking among SAP's is language‑specific, see DeLancey 1981a), and a transitive verb is marked to reflect whether or not the O argument outranks the A on the hierarchy.  The configuration in which the O outranks the A is called inverse, and that in which the A outranks the O is direct.

     Direct-inverse marking, like dative-subject marking, ergativity, and active-stative typology before it, is an "exotic" typological pattern which, once recognized, turns out to be far more common than anyone ever suspected.  A generation ago it was considered, by those linguists who were aware of it at all, to be a strange idiosyncratic feature of Algonquian languages.  As our database expanded, a handful of similar examples began to be pointed out (Comrie 1980, DeLancey 1980, 1981a, b, Whistler 1985, Grimes 1985).  In the 1970's the phenomenon was of considerable theoretical interest to practitioners of Relational Grammar (e.g. LeSourd 1976, Jolley 1981), for the same reason that it is relevant to our present investigation--the fact that it involves different morphosyntactic indications of subjecthood being associated with different arguments.  Recent years have seen a substantial number of analyses of inverse or inverse-like constructions in a range of languages (e.g. Ebert 1991, Payne 1994, Zavala 1994, 1996, Bickel 1995, Watkins 1996), and increasing interest in the topic in both formal and functional frameworks (Jelinek 1990, Arnold 1994, Givón 1994b, Payne 1994, Rhodes 1994).



1.2.1 Nocte


A maximally simple example of the system is found in Nocte (or Namsangia), a language of the Konyak branch of Tibeto-Burman spoken in Arunachal Pradesh (adapted from Weidert ms.; cf. Konow 1903, Das Gupta 1971, DeLancey 1981a, b, Weidert 1985):


     10)     ŋaa-mE ə1te1-nəŋ2 vaat-əŋ1

          I-ERG  he-ACC     beat-1sg

          'I beat him.'


     11)     ə1te1-mE ŋaa-nəŋ2 vaat-h-əŋ1

          he-ERG   I-ACC    beat-INV-1sg

          'He beat me.'


     12)     nəŋ-mE  ə1te1-nəŋ2 vaat-o?

          you-ERG he-ACC     beat-2sg

          'You beat him.'


     13)     ə1te1-mE nəŋ-nəŋ2 vaat-h-o?

          he-ERG   you-ACC  beat-INV-2sg

          'He beat you.'


The first thing to notice about this system is the fact that agreement is not always with the same grammatical role.  The verbs in (10) and (11) both have 1st person agreement, although the 1st person participant is an ergative-marked A in (10) and an accusative-marked O in (11).  (12) and (13) show the same pattern, with the 2nd person argument attracting agreement regardless of its grammatical role.  The second interesting feature of the system is the ‑h- suffix found in some forms.  These two phenomena are clearly related--we find the -h- suffix in just those forms where agreement is not with the subject.

     These forms illustrate the basic structure of an inverse-marking system.  Agreement is always with a SAP in preference to a 3rd person, regardless of grammatical role.  When this results in the verb indexing a non-subject argument, a special inverse morpheme is added to the verb.  Thus, although both 'I beat him' and 'he beat me' have 1st person agreement, the verb forms are not ambiguous, but distinguished by the presence or absence of the inverse -h-.  The verb forms in (10) and (12), which lack the ‑h‑, are direct forms; in Nocte the direct category is unmarked.

     In anticipation of discussion to come, note the behavior of Nocte verbs when both arguments are SAP's:


     14)     ŋaa-mE nəŋ-nəŋ2 vaat-E1

          I-ERG  you-ACC  beat-1→2

          'I beat you.'


     15)     nəŋ-mE  ŋaa-nəŋ2 vaat-h-əŋ1

          you-ERG I-ACC    beat-INV-1sg

          'You beat me.'


The inverse marker is absent in (14), with 1st person A and 2nd person O, and present in (15).  The overall verbal indexing system is illustrated below (imperfective paradigm with singular participants, adapted from Weidert 1985:925):




               A \ O    1st       2nd       3rd

               1st                ‑E        ‑aŋ
               2nd      h‑aŋ                ‑o
               3rd      h‑aŋ      h‑o       ‑a



The distribution of the inverse marker follows a simple formula:  there is a person hierarchy in which 1st person outranks 2nd, and both outrank 3rd, which we will symbolize as 1 > 2 > 3.  When an O argument outranks the A on this hierarchy, the verb is in its inverse form.  We can almost capture the indexation pattern by simply saying that the verb indexes the argument which is highest on the person hierarchy, but this does not account for the anomalous agreement suffix in (14).  By analogy with the rest of the paradigm we would expect 1st person indexation here; what we have instead is a suffix which occurs nowhere else in the singular paradigm--in fact, it is identical to the 1st person plural index.[7]
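The formula can be checked mechanically against the paradigm. What follows is a minimal sketch, assuming singular participants coded as 1, 2, 3; the function name `nocte_verb` and the abstract index labels ("1", "2", "3", "1→2") are illustrative and stand in for the actual Nocte suffixes.

```python
RANK = {1: 3, 2: 2, 3: 1}  # 1 > 2 > 3: a larger value is higher on the hierarchy


def nocte_verb(a: int, o: int) -> tuple[bool, str]:
    """Return (inverse?, index label) for a transitive verb with A -> O.

    Index labels are abstract placeholders, not Nocte forms.
    """
    if a == o and a != 3:
        raise ValueError("1->1 and 2->2 are not cells of the paradigm")
    # Inverse marking: the O outranks the A on the person hierarchy.
    inverse = RANK[o] > RANK[a]
    # The one cell that the 'index the highest person' rule gets wrong.
    if (a, o) == (1, 2):
        return inverse, "1→2"
    highest = a if RANK[a] >= RANK[o] else o
    return inverse, str(highest)


for a in (1, 2, 3):
    for o in (1, 2, 3):
        if a == o and a != 3:
            continue
        print(f"{a}→{o}", nocte_verb(a, o))
```

The special case for 1→2 is stipulated rather than derived, which is exactly the point made in the text: the hierarchy predicts the direction marking throughout, but not the index in that one cell.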



1.2.2 The classic direction system:  Cree


In this section we will briefly examine a typically complex system from the Algonquian family, where the direction marking phenomenon was first recognized.  The Algonquian systems are the most elaborate that I am aware of, most of them making all of the distinctions found in any other direction system; they represent a prototype in terms of which other systems are easily analyzable.[8]  A straightforward example is Plains Cree (Wolfart 1973, cp. Dahlstrom 1986) which overtly marks four direction categories‑‑direct, inverse, and the two local (Hockett 1966) categories 1→2 and 2→1‑‑with morphemes from a single paradigm, and consistently indexes the principal participant in all configurations.  The verb forms with both arguments singular in the independent order of the transitive animate paradigm (i.e. verbs with animate objects) can be schematized as follows (where V represents the verb stem):



                    1st         2nd       3rd


               1st              ki‑V‑in     ni‑V‑aawa


               2nd     ki‑V‑etin              ki‑V‑aawa


               3rd     ni‑V‑ekw   ki‑V‑ekw     V‑ekw  /  V‑eewa



The prefixes and second position suffixes[9] are person indices:  ki‑ '2nd', ni‑ '1st', ‑wa '3rd proximate', and ‑n '1st or 2nd'.  The first position suffixes are direction markers:  ‑ekw marks unambiguously inverse, and ‑aa unambiguously direct, configurations, while the two local categories each have their own direction marker, ‑i '1→2' and ‑eti '2→1'.

     The distribution of the personal prefixes clearly reflects a 2nd > 1st > 3rd person hierarchy.  Such a hierarchy should, as in Nocte, define every configuration except 3→3 as clearly either direct, i.e. with subject higher on the hierarchy than object, or inverse, with object higher than subject.  In Cree, however, we find not a two‑ but a four‑term direction system; as I have argued at greater length in DeLancey 1981a, this reflects the fact that the language‑particular ranking among SAPs is of a different order from the universal SAP > 3rd ranking.

     The other significant respect in which the Cree system differs from that of Nocte is in the subdivision of the 3→3 category according to the relative topicality of the two participants.  The form ‑ee‑wa, which Wolfart glosses as 'direct‑3rd',[10] is used with proximate, i.e. more topical, subject and obviative, i.e. less topical, object, and the clearly inverse form ‑ekw with obviative subject and proximate object.  Thus in Algonquian relative topicality can define the principal participant when hierarchical ranking fails to do so.  This constitutes the major functional point of contact between direction and voice systems, as we will discuss below.



1.2.3 The direction-marking prototype


The deictic nature of these patterns is self-evident--in both Nocte and Cree, verbal morphology is obligatorily responsive to a fundamental distinction between the speech act participants and all other participants.  But there are certain other distinctions which occur in either Cree or Nocte, but not both.  In Cree (and all Algonquian languages, but cf. DeLancey 1981a:643) verbal indexation reflects a ranking of 2nd person above 1st.  Direction marking, on the other hand, appears to treat them as equal; in any case it shows that the ranking of 2>1 is of a different order from the ranking SAP>3.  Nocte explicitly ranks 1st above 2nd in direction marking, marking 2→1 but not 1→2 as inverse, but the odd personal index in the 1→2 form (and the fact that both SAP's, but not 3rd person, are indexed) suggest again that 1st person outranks 2nd by much less than both of them outrank 3rd.  Cree treats both local categories as special direction categories, but both show normal hierarchical indexation.  Nocte treats 2→1 as inverse and 1→2 as non-inverse, but the indexation paradigm treats 1→2 as a special category.  Taken together, then, Nocte and Cree imply a universal schema in which SAP arguments are clearly distinguished from and ranked above all others, and there is no universal ranking of the two SAP's (since Cree shows one of the possible rankings, and Nocte the other).



2.0 Variations on a Theme


2.1 Hierarchical agreement


We are used to thinking of verb agreement as tied to grammatical relations:  a common claim about the typology of verb agreement is that if a language has verb agreement it will index the subject; some languages index both subject and object, and a rare handful index only objects (e.g. Keenan 1976:316).  However, there are languages in which indexation of arguments in the verb reflects not grammatical relations, but the person hierarchy.  In these languages a verb will always agree with a SAP argument, regardless of its grammatical role.  (Typically such languages have no 3rd person index.)  This is, of course, exactly the typical indexation pattern of a direct/inverse-marking language; in the languages which we will discuss in this section, however, we find the hierarchical indexation pattern without inverse marking on the verb.  In earlier work I have described this pattern as a variation of split ergative marking (DeLancey 1980, 1981a, b), on the grounds that verb agreement and zero case marking serve the same function, of identifying one argument of the clause as the most topical or "starting point" (DeLancey 1981a, see below).  But this terminological extension is somewhat misleading; it is better to reserve the term "split ergative" for the Chinookan-type pattern.  Nevertheless this pattern represents one more way of encoding exactly the same functional domain that we have discussed in the previous sections.

     The hierarchical agreement pattern has not received as much theoretical or descriptive attention as split ergative or direction marking.  I don't have a clear sense of how widespread it may be, but it is fairly common in Tibeto-Burman (Bauman 1979, DeLancey 1980, 1988, 1989, Sun 1983).  One example which has been discussed elsewhere is Tangut (Kepping 1979, 1981, Comrie 1980, DeLancey 1981a, b).  A slightly more complicated case is found in the Nungish (Tibeto-Burman) languages of Yunnan (Tarong (Dulung) data from Sun 1982, 1983:25-6; cp. Lo 1945):


                    O:     1st            2nd            3rd

               1st                         -ŋ             -ŋ
               2nd         nə- -ŋ                         nə-
               3rd         nə- -ŋ         nə-            --

      The transitive paradigm of Trung



There is no 3rd person index.  The 1st person suffix -ŋ occurs on any verb with a 1st person argument.  The nə- prefix occurs on intransitive 2nd person subject verbs, and in all transitive forms with a 2nd person argument except for the 1→2 form.  (Recall that this form gets special marking also in Nocte).  The synchronic identification of this prefix as a 2nd person index is complicated by its occurrence in the 3→1 form, which has no 2nd person argument, but this is demonstrably a secondary development involving the merger of a previously distinct prefix with the original 2nd person form.[11]  Despite this complication the hierarchical nature of the indexation pattern is clear:  any 1st person argument must be indexed; any 2nd person argument is indexed unless there is a 1st person actor.
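The distribution of the two affixes in the transitive paradigm can be sketched as follows. This is a minimal illustration, assuming persons coded as 1, 2, 3; the function name `trung_affixes` is hypothetical, and the function returns only whether the prefix and the 1st person suffix are present, not the forms themselves.

```python
def trung_affixes(a: int, o: int) -> tuple[bool, bool]:
    """Return (prefix present?, 1st person suffix present?) for A -> O."""
    if a == o and a != 3:
        raise ValueError("1->1 and 2->2 are not cells of the paradigm")
    args = (a, o)
    # The suffix appears whenever either argument is 1st person.
    suffix = 1 in args
    # The prefix appears with any 2nd person argument unless the actor is
    # 1st person, and additionally in the 3->1 form (the secondary
    # development mentioned in the text).
    prefix = (2 in args and a != 1) or (a, o) == (3, 1)
    return prefix, suffix


for a in (1, 2, 3):
    for o in (1, 2, 3):
        if a == o and a != 3:
            continue
        print(f"{a}→{o}", trung_affixes(a, o))
```

The 3→1 clause has to be listed separately, which mirrors the descriptive complication: it is the one cell where the prefix cannot be read as a 2nd person index.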



2.2 Sahaptian


The Sahaptian languages Nez Perce and Sahaptin[12] show a fascinating combination of hierarchical indexation and a unique pattern of split-ergative case marking.  In these languages pronominal clitics, ordinarily in sentence-second position, occur in a purely hierarchical indexation pattern.  In Nez Perce these occur primarily in subordinate clauses; in Sahaptin they occur in main clauses (see exx. 22-24 below).  The Nez Perce paradigm is (Aoki 1970, Rude 1985):


     Intransitive 1st ‑x, 2nd ‑m, Inclusive ‑nm, 3rd 0



                    1st       2nd       3rd


               1st               ‑m‑ex     ‑x


               2nd     ‑m                  ‑m


               3rd     ‑x        ‑m        ‑‑


Note that the 1→2 configuration is once again exceptional, in this case in having both arguments independently indexed.  The same pattern occurs in Sahaptin (Jacobs 1931, Rude 1985, p.c.).
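The clitic paradigm can be summarized by a short rule. The sketch below is illustrative only: the function name `nez_perce_clitic` and the labels "1", "2", and "2+1" are assumptions, and reading the 1→2 clitic as a 2nd person index followed by a 1st person index is an inference from the intransitive forms given above rather than something stated in the sources cited.

```python
def nez_perce_clitic(a: int, o: int) -> str:
    """Return an abstract label for the clitic in configuration A -> O.

    "2+1" means both SAP's are indexed; "" means no clitic.
    """
    if a == o and a != 3:
        raise ValueError("1->1 and 2->2 are not cells of the paradigm")
    args = (a, o)
    if (a, o) == (1, 2):
        return "2+1"  # the exceptional cell: both arguments indexed
    if 2 in args:
        return "2"    # a 2nd person argument is indexed in either role
    if 1 in args:
        return "1"    # otherwise a 1st person argument is indexed
    return ""         # 3->3: no clitic


for a in (1, 2, 3):
    for o in (1, 2, 3):
        if a == o and a != 3:
            continue
        print(f"{a}→{o}", repr(nez_perce_clitic(a, o)))
```

Grammatical role plays no part in the rule except in the 1→2 cell, which is what makes the pattern hierarchical rather than subject-oriented.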

     Along with this indexation pattern, the Sahaptian languages show interesting variations on the split ergative pattern (Rude 1991).  Nez Perce has a typical pattern, with ergative case marking on 3rd person (18) but not SAP (19) A's:


     16)     hi-páayn-a       háama

          3NOM-arrive-PAST man

          'The man arrived.'


     17)     'iin páayn-a

          I    arrive-PAST

          'I arrived.'


     18)     wewúkiye-ne pée-'wi-ye     háama-nm

          elk-ACC     3→3-shoot-PAST man-ERG

          'The man shot an elk.'


     19)     'iin 'ew-'wii-ye      wewúkiye-ne

          I    SAP→3-shoot-PAST elk-OBJ

          'I shot an elk.'


Note, however, that O arguments are consistently case-marked (-ne in exx. 18-19).  Sahaptin shows the same ergative split--3rd person A arguments take ergative marking, SAP A's do not (exx. from Rigsby and Rude 1996 and Rude 1991):


     20)     iwínš i-wínan-a

          man   3NOM-go-PAST

          'The man went.'


     21)     iwínš-in  pá-tuXnana yáamaš-na

          man-ERG1  3→3-shot   mule.deer-ACC

          'The man shot a deer.'


     22)     ín=aš[13] tuXnana yáamaš-na

          I=1sg shot    mule.deer-OBJ

          'I shot a deer.'


But Sahaptin has two distinct ergative forms.  Both occur only on 3rd person A arguments, but the -in morpheme seen in (21) occurs only when the O argument is also 3rd person.  When the O is a SAP, there is a different ergative marker, -nm (which Rude, for obvious reasons, calls the "inverse ergative"):


     23)     áw=naš  i-nák-wina    k'waali-nm

          now=1sg 3NOM-carry=go dangerous.one-INV.ERG

          'Now the dangerous one has taken me along.'


     24)     iwínš-nm=nam   i-q'ínu-ša

          man-INV.ERG=2sg 3NOM-see-IMPF

          'The man sees you.'


Thus, where Nez Perce, in the typical split ergative pattern, distinguishes two transitive configurations:


          SAP   →

          3-ERG →


Sahaptin distinguishes three:


          SAP    →

          3-ERG1 → SAP

          3-ERG2 → 3


And the additional distinction which Sahaptin makes once again reflects the SAP / 3rd division.
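The two-way and three-way contrasts can be put side by side. This is a minimal sketch, assuming persons coded as 1, 2, 3; the function names and the return labels are illustrative, with the Sahaptin labels taken from the suffixes -in and -nm cited above.

```python
SAP = {1, 2}  # speech act participants


def nez_perce_a_marking(a: int, o: int) -> str:
    """Two-way split: only the person of the A matters."""
    return "unmarked" if a in SAP else "ergative"


def sahaptin_a_marking(a: int, o: int) -> str:
    """Three-way split: a 3rd person A is further divided by the person of O."""
    if a in SAP:
        return "unmarked"
    # 3rd person A: the choice of ergative suffix depends on the O.
    return "inverse ergative (-nm)" if o in SAP else "ergative (-in)"


for a, o in [(1, 3), (3, 3), (3, 1), (3, 2)]:
    print(f"{a}→{o}", nez_perce_a_marking(a, o), "|", sahaptin_a_marking(a, o))
```

In both functions every test is a test of membership in the SAP set, so the extra Sahaptin distinction introduces no new parameter, only a second application of the same one.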



2.3 Inverse with non‑hierarchical agreement


Inverse marking languages came into theoretical prominence during the heyday of Relational Grammar.  Their peculiar use of verb agreement, which is normally thought of as a perquisite of subjecthood, and the formal similarity of inverse and passive constructions posed a particularly intriguing puzzle for that framework, one which still captures the attention of formal theoreticians.  Since I want to argue that inverse marking is a direct expression of a deictic category, it is significant that we find languages with very similar deictic systems where it is clearly distinct from grammatical relations.



2.3.1 Expansion of the Cislocative in Kuki-Chin


In several languages of the Kuki‑Chin branch of Tibeto‑Burman, spoken in western Burma and eastern India and Bangladesh, a simple inverse marking system has developed from the marking of deictic orientation on motion verbs (DeLancey 1980).  In most of these languages a motion verb *hong 'come' has become partially or completely grammaticalized as a cislocative 'hither' prefix on motion verbs (see DeLancey 1985 for details).  In the closely related Tiddim (Henderson 1965), Sizang (Stern 1963), and Paite (Konow 1904) dialects, this morpheme has developed the additional function of optionally marking some transitive or ditransitive configurations with SAP object.

     In Tiddim and Sizang, we find the cislocative marker used at least optionally with any transitive or ditransitive verb with 1st or 2nd person object or goal, as in (exx. from Stern 1984:52, 56):


     25)     k‑ong   thûk  kí:k  lâ‑lê:u  hî:

          1st‑CIS  reply again  once more FIN

          '... I in turn reply to you.'


     26)     hong  sá:t  thê:i  lê:

          CIS  beat   ever  INTER

          'Do [they] ever beat you?'


     27)     hong  sá:t lé:  ká‑pe:ng  tál  do*ng ká‑ta:i tû:

          CIS  beat  if  1st‑leg  break until 1st‑flee FUT

          'If [they] beat me I'll run till my legs break.'


The (h)ong in all of these examples occurs when there is a SAP goal or object, even when, as in (25), the subject is the other SAP.  Its distribution in the transitive paradigm is thus almost identical to that of the Nocte inverse marker, except that it marks both local categories rather than only one:



               1st       2nd       3rd


          1st            hong      ‑‑


          2nd     hong                ‑‑


          3rd     hong      hong      ‑‑


This pattern suggests a natural category of marked direction which includes all configurations with SAP objects or goals, and provides further evidence for the non‑universality of any ranking of SAPs.

     In Sizang‑Tiddim, unlike the languages that we have previously considered, personal indexation in the transitive verb, if present, is consistently with the subject rather than with the principal participant.  (This is clear in ex. (25); in (26-27), with 3rd person agent, there is no subject index, but in most cases there would be a 3rd person prefix: a‑hong sa:t '3rd‑CIS beat' = 's/he beat you/me'; see Stern 1963:254‑5).  Thus while inverse forms with SAP subject, such as (25), are unambiguous in isolation, 3rd person subject forms depend upon context for the identification of the SAP object.



2.3.2 The Dravidian "Special Base"


In two Dravidian languages, Kui (Winfield 1929) and Pengo (Burrow and Bhattacharya 1970), we find a similar system of inverse marking with consistent subject agreement, which appears to have the same cislocative origin as the Chin and Loloish inverse constructions.  The morpheme in question is a suffix which forms what Burrow and Bhattacharya call the "special base" (glossed SB in the examples below) of the verb, after which are suffixed ordinary negative, tense/aspect, and personal index morphemes.  It occurs "when the object, direct or indirect, is the first or second person" (Burrow and Bhattacharya p. 70), regardless of the person of the subject, as in:


     28)     huR‑d‑av‑at‑an


          'He did not see (me, us).'


     29)     huR‑d‑av‑at‑ang


          'I did not see you.'


As in Kuki-Chin, the subject is always indexed by the subject suffix.  We can represent the distribution of the Pengo "special" morpheme /d/ and the personal indices as follows:



          object        1st       2nd       3rd


               1st                 d‑1st    -1st


               2nd       d‑2nd              -2nd


               3rd       d‑3rd     d‑3rd    -3rd


Thus the distribution of Pengo ‑d is identical to that of Tiddim Chin ‑hong.
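The relation between this SAP-object marking and the Nocte inverse can be checked cell by cell. The sketch below assumes persons coded as 1, 2, 3 and uses hypothetical function names; it models only the presence or absence of the marker, not subject indexation or the optionality noted for Tiddim and Sizang.

```python
SAP = {1, 2}
RANK = {1: 3, 2: 2, 3: 1}  # the Nocte hierarchy 1 > 2 > 3


def sap_object_marker(a: int, o: int) -> bool:
    """Tiddim hong / Pengo d: present whenever the O is a SAP."""
    return o in SAP


def nocte_inverse(a: int, o: int) -> bool:
    """Nocte h: present whenever the O outranks the A."""
    return RANK[o] > RANK[a]


# All cells of the transitive paradigm (1->1 and 2->2 are excluded).
configs = [(a, o) for a in (1, 2, 3) for o in (1, 2, 3)
           if not (a == o and a in SAP)]

differences = [(a, o) for a, o in configs
               if sap_object_marker(a, o) != nocte_inverse(a, o)]
print(differences)  # [(1, 2)]
```

The only cell on which the two rules disagree is 1→2, the local configuration that Nocte treats as non-inverse.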

     Emeneau (1945), on the basis of deictically‑specified verbs of motion and giving elsewhere in Dravidian, reconstructs essentially the Kui-Pengo inverse marking for Proto‑Dravidian, where it marked not only inverse transitive forms, but also, like the Chin *hong reflexes, motion verbs with SAP or deictic center as goal.  Analogy with the Chin system, as well as the general tendency for historical development to proceed from more concrete to more abstract grammaticalized functions, suggest that this morpheme probably originated in a cislocative marker which later developed an inverse function.  In the modern languages which retain a form of this system, the inverse marker no longer has a cislocative function (which further suggests that the Chin‑like stage reconstructed by Emeneau was a transitional stage between an originally exclusively motional function and the exclusively inverse function found in modern Kui and Pengo) but it still occupies the same suffixal slot as other motional orientation morphemes.


2.3.3 Subject and Deictic Center


The important difference between the Kuki-Chin and Dravidian systems which we have looked at in this section and more typical direction systems is that while in the latter argument indexation is hierarchically determined, often completely independent of case or grammatical relations, in Chin and Kui-Pengo the verb indexes the subject, with hierarchical status irrelevant.  In other words, in these languages there is a subject relation which is independent of the deictic center, and this demonstrates the independence of this variety of direction-marking from any sort of subject‑selection process.



3.0 A Unified Approach to Hierarchical systems


The typological patterns exemplified above vary in their structural expression, which is some combination of case marking on A arguments, indexation in the verb or a pronominal clitic bundle, and morphological marking of the verb for inverse (and, sometimes, direct and local) status.  But there is obviously a deep functional unity; all of these patterns reflect the same hierarchy of person, and serve to distinguish between transitive configurations with SAP and non-SAP A arguments.



3.1 Viewpoint and Attention Flow


In DeLancey 1981, I suggested an interpretation of these patterns in terms of two putatively cognitive categories, viewpoint and attention flow.  The concept of viewpoint has been recognized by many researchers in psychology and linguistics (as well as literary theory and elsewhere); the same idea is often called perspective.  Just as an actual scene must be observed by an actual observer from a specific actual location, which determines a certain perspective on the scene, so a virtual scene, being described by a speaker, must be scanned from a specific virtual perspective.  In real life, one's perspective as an observer is always one's physical location,[14] and this is thus the most natural perspective from which to render any description.  Of course, perspective can be manipulated for various discourse purposes.  But surely my own perspective is the most natural for me to take in relating an event in which I was actually a participant.  My addressee, likewise, can be expected to have a personal involvement in the perspective from which an event is related if it is one which she was a participant in.  Thus it seems natural to interpret hierarchical agreement as simply an index of this concrete viewpoint, tagging a clause as being presented from the perspective of the deictic center, because one or both SAP's are participants in it.

     Attention flow is equivalent to the notion of scanning in Cognitive Grammar:  in actual perception, our attention begins with one element of a scene, and scans through the perceptual field, taking its various elements in decreasing order of their intrinsic or contextual interest.  When we present a mental image, we perform an imaginary scan of the same type, and present the elements of the image to our hearer so as to help him recreate not only the image, but the image scanned as we scan it.  (We will return to this in the next lecture).  The beginning of the scan we can call the starting point.  In observing--and therefore in relating--a transitive event, an objective observer can be expected to attend first to the Agent, on no more elaborate grounds than that, in an Agentive event, there is nothing to attend to until the Agent acts.  Put in other words, causes precede effects, thus an Agentive event necessarily starts with the Agent, and so the default attention flow (or direction of scanning) starts with the Agent.

     Then both split ergative and inverse marking can be interpreted as devices to signal a conflict between starting point and viewpoint.  In a transitive configuration with a SAP A argument, viewpoint and starting point coincide.  In both split ergative and inverse systems, this is the unmarked pattern--no case marking on the Agent, no marking (in a simple, Nocte-type system) on the verb.  When the Agent is not a SAP, the starting point (the 3rd person Agent) and the viewpoint (the speaker) do not coincide, and in this situation split ergative languages mark the Agent to show that it is not the viewpoint, and inverse languages flag the verb to indicate that there is a conflict.  Languages of the Chin or Kui type are flagging a slightly more specific conflict, where the starting point and the viewpoint are both arguments of the clause, but different arguments.  Sahaptin takes the final step of doing both, marking both the conflict between starting point and viewpoint and the special situation in which viewpoint is a non-Agent argument.

     It has often been noted (Halliday 1967:217, R. Lakoff 1969, Ross 1970) that there is something odd about English passives with pronominal, and especially 1st or 2nd person pronominal (Kuno and Kaburaki 1977) agents.  While this is indubitably true, it is relatively easy to convincingly demonstrate that there is nothing in principle ungrammatical about such sentences in English (Kato 1979).  As a result, this very interesting interaction of person and voice does not count as a legitimate topic of investigation within most formal frameworks.  To many linguists the fact that passives with SAP Agents are ungrammatical in languages like Nootkan (Whistler 1985, Dahlstrom 1983) or Jacaltec (Craig 1977) is legitimate syntactic data, but the fact that such constructions are conspicuously rare and seem to be highly marked in a language like English is not.

     Nevertheless it is self-evident that the functional motivation for both facts is the same, and clearly in the same functional domain as the hierarchical systems which we examined in the last section.  In the terms of DeLancey 1981a, passive is a device for indicating marked viewpoint, i.e. viewpoint associated with a non-Agent argument.  Thus passives with SAP subjects are entirely natural, and in Nootka are obligatory.  In contrast, passives with SAP Agents are highly marked, since presenting in passive voice an event with a SAP Agent amounts to deliberately shifting viewpoint away from the one participant with whom it is most naturally and intimately associated.  The difference between English and Nootkan is simply that what is a functionally-motivated tendency in English has been grammaticalized into a structural restriction in Nootkan.



3.2 The "Pragmatic Inverse"


With increased attention to the phenomenon of direction marking have come proposals to recognize a wider range of phenomena as "inverse".   Klaiman (1991, 1992) suggests including under the category of inverse the notorious Apachean yi-/bi- alternation.  Like the deictic inverse, this alternation involves no change in transitivity, and is triggered by the relative ranking of two arguments of a transitive verb in terms of the Empathy Hierarchy, which Klaiman sees as a hierarchy of "ontological salience".  In DeLancey 1981a both of these constructions are interpreted as devices for managing conflicts of viewpoint and attention flow, an analysis operationally equivalent to Klaiman's.  They are not conflated into a single category, however, on the same grounds to be argued here--essentially, that the deictic inverse is an intrinsically deictic category, and in this respect differs from other related alternations.


3.2.1 Expansion of the Concept


Thompson (1989a, b, 1990, 1994) considers the cognate construction in other Athabaskan languages to likewise be inverses.  In Koyukon and other northern Athabaskan languages, the alternation between the two 3rd-person object forms is not nearly as strictly hierarchically governed as is reported for the Apachean languages.  However, Thompson shows by quantitative discourse analysis techniques that in Koyukon the bi- form is used when the object is unusually topical relative to the subject.  Since in earlier work (1987) he had shown a similar discourse function for the direct/inverse alternation with proximate and obviative 3rd person arguments in the Algonquian language Plains Cree, he argues that direction in Algonquian and the yi-/bi- alternation in Athabaskan represent the same functional category.

     Givón (1990, 1994a) further extends the concept of inverse to include any construction whose function appears to be to code unusual topicality of an object without suppressing the subject argument, including even constituent order alternations in which an unusually topical object NP is fronted (the "word-order inverse"):


     ... it can be shown that the functional characteristics of these object-topicalizing clauses are the same as those of all other types of inverse clauses:  They all topicalize non-agent arguments ... And they do not involve a drastic pragmatic demotion of the agent.  There is no principled reason for not attaching to these construction their rightful functional label. (1994a:18)


While there may well be grounds for attaching some functional label to this entire range of constructions, it is certainly debatable whether inverse is the best one, as the term has always been applied to a) a structural configuration (and one which is much more explicitly definable than, say, "passive"), and b) one whose function is quite tightly tied to the category of person.  The label thus appears to have considerable validity and utility in its traditional use.  A convincing argument for the expansion of its use to the extent that Givón advocates would require a demonstration that the traditionally-recognized structural-functional unity of the classic deictic inverse can be seen to be somehow chimerical.  Givón asserts this claim, but does little to demonstrate it.

     The Thompson-Givón interpretation of the inverse presents a functional definition of the category which considerably expands the set of inverse constructions, to include constructions from a number of languages which have little or nothing in common structurally with the classic inverse:


     An inverse construction indicates a deviation from the normal degree of relative topicality between agent and non-agent.  Traditional uses of the word "inverse" have been limited to those languages in which inverse constructions are based on the ranking of persons ...  One of the contributions of this paper and Thompson (1989a) is to extend the term "inverse" to languages such as Koyukon, where the direction system can operate between third person arguments, but not obligatorily between speech act participants and third persons.

     (Thompson 1994:60-1)


The terminological innovation here is the application of the term "inverse" to languages which mark "Contextual or generic ranking between third persons (e.g. Athabaskan)" (Thompson 1994:61).  The more important issue is the assumption that the differences of behavior across person observed in inverse systems are just a particular case of topicality, i.e. of "contextual ... ranking between third persons."


3.2.2 "Semantic" and "Pragmatic" Inverse


Givón, in recognition of the discreteness of the deictic inverse, distinguishes it as the "semantic inverse", a recognizable subcategory of the broader inverse category.  Thompson's Athabaskan-type "inverse" which involves only ranking between third persons is then labelled the "pragmatic inverse".  Especially in light of the discussion in Givón 1990 (pp. 611-18) it is hard to see how this notion of "pragmatic inverse" differs from the notion of obviation, which has long been known to be often associated with inverse, but nevertheless recognized as a distinct phenomenon.

     Part of the empirical argument for this hypothesis is that in some inverse-marking languages, such as the Algonquian languages (see above and Dahlstrom 1986) and the Tibeto-Burman language Chepang (Caughley 1978, 1982, cf. Thompson 1990), the same mechanism is used to encode a person-based direct/inverse system and to categorize 3→3 configurations according to whether the subject or object is more topical.  Thus Givón notes:


     One can, of course, detect a fundamental unity in the use of the semantic and pragmatic inverse in a language that unites both functions in the very same construction.  (1994:22-3)


One problem with this claim is that there seem to be relatively fewer such languages than we might expect to find if there were in fact the "fundamental unity" which Thompson and Givón suggest; at least in the literature I have seen, the "semantic inverse" with no "pragmatic" extension into the realm of 3rd person arguments seems to be the commonest type.

     But Givón's broader argument is not empirical in this sense.  Rather, it assumes that the difference between speech act participants and 3rd person arguments which is reflected in inverse marking is simply a special case of the broad category of differences in topicality.  A clear statement of this assumption is provided by Doris Payne:


     Because 1st and 2nd person participants are already, simply by the pragmatics of the speech act, individuated from the world of things "out there" to be talked about, they are inherently more topical than 3rd persons.  The speech act participants are also always available in memory; by definition, if a hearer is attending to a speaker, the hearer must always have an "open file" for the speaker.  There is also a natural sense in which speech act participants are generally taken for granted as "more important" or the "natural center of interest", over 3rd persons.  Thus, regardless of any particular discourse context, the hierarchy in (8) [i.e. 1 > 2 > 3] can be taken as an inherent topicality hierarchy.  (Payne 1994:316)


Thus the expression of topicality relations among 3rd person participants and the deictic SAP/3rd opposition are seen as belonging to the same functional domain (in the sense of Givón 1981).  Since topicality maintenance in the broad sense is the larger and more functionally central domain, the implication--made explicit by Givón--is that the "semantic" inverse is only a variant of the broader inverse category:


     In many languages with a direct vs. inverse voice contrast, the inverse clause must be used obligatorily under certain grammatical conditions.  Most commonly, such obligatory 'inversion' occurs when the agent is third person but the patient is first/second person, or when the agent is inanimate/non-human but the patient is animate/human.  One may consider this an inherent topicality inversion:  Universally, speaker and hearer outrank 3rd persons in topicality, and animates/humans outrank inanimates/non-humans.  These cases of obligatory inversion are in essence grammaticalized uses of the inverse voice under the same basic conditions--the patient outranks the agent in topicality.  (Givón 1990:617; emphasis original)


But there is another typological context in which inverse constructions can be viewed.  Inverses have both typological and diachronic connections with other phenomena which are fundamentally concerned with the functional domain of deictic orientation.  Viewed from this perspective, the "semantic" inverse can be seen to be primary, and to the extent that some "pragmatic" constructions may be connected to the classic inverse pattern, it is probable that they represent extension of an originally deictic pattern.

     Givón further makes the tentative suggestion that the "semantic" inverse may be seen universally as arising diachronically from a grammaticalization of a pragmatic inverse construction:


     ... it was suggested that word-order inversion precedes--and gives rise to--pronominal morphological inversion.  Since all word-order inverses known to us are purely pragmatic, the inference is strong that pragmatic inversion is the diachronically early, general ('unmarked') phenomenon, and that semantic inversion is the more special ('marked') sub-phenomenon within it ... However, if this hypothesis is to prevail, the existence of the purely-semantic, purely pronominal inverses ... must be interpreted as a vestigial survival of an erstwhile mixed semantic-pragmatic inverse. (Givón 1994:29)


I know of no evidence for such an inference.  Indeed, Gildea (1994) documents a language with both functions grammaticalized in entirely different structural systems.  And there is substantial evidence against the claim that the classic inverse pattern necessarily arises from a "pragmatic" inverse. In fact, the opposite line of development--from a grammatical restriction on passives into an inverse system--seems much more plausible.  As we have seen (section XXX), there is also reason to think that an inverse system may sometimes arise from a system of morphological marking of deixis for motion verbs; this source too is rooted in the fundamental deictic distinction.


Lecture 9: Subject and Topic: Starting Points


Tibetan is a striking example of a language in which surface morphological and syntactic phenomena directly reflect underlying thematic relations.  As we saw in Lecture 6, there is no morphological, and very little syntactic, evidence in Tibetan for subject or object roles distinct from thematic relations.  But the majority--probably the overwhelming majority--of languages are not like Tibetan in this respect.  If we compare the English and Tibetan in the following examples:


     1)     blo=bzang shi-ba red      'Lobsang died.'

          Lobsang   die-PERF


     2)     blo=bzang-la ngul  dgos      'Lobsang needs money.'

          Lobsang-LOC  money need


     3)     blo=bzang-gis nga gzhus-byung     'Lobsang hit me.'

          Lobsang-ERG   I   hit-PERF


We see that the noun blo=bzang in the Tibetan examples is marked, respectively, as THEME (with zero marking), LOC (-la), and AGENT (‑gis).  In the English equivalents, Lobsang has the same form in all three sentences.  In Tibetan, as we have seen, there is no verb agreement per se; but certain alternations in auxiliaries depend in part upon the person of certain arguments (DeLancey 1990, 1992a): the byung perfective in (3), for example, is there because there is a 1st person non-Agent argument.  However, this is entirely independent of any putative objecthood; byung occurs just as easily with 1st person arguments in S and A functions:


     4)     nga 'khags‑byung

          I    cold-PERF

          'I'm cold.'


     5)     nga-r deb de    rnyed-byung

          I-DAT book that found-PERF

          'I found the book.'


     6)     nga hab=brid brgyab-byung

          I   sneeze   throw-PERF

          'I sneezed.'


     Unlike the initial arguments in the Tibetan sentences (0-0), which share nothing in the way of case marking, verb agreement, or anything else, the initial arguments in the English translations share a great deal of morphosyntactic behavior, and thus constitute a robust category.  This is the well-known and notoriously problematic subject category.  We have seen that there is no directly corresponding structural category in many languages, including not only anomalously transparent systems such as Tibetan, but more conventionally ergative, split ergative, and active-stative languages, and languages with inverse systems or other varieties of hierarchical agreement.

     One lesson we need to learn from the structural patterns that we examined in the last lecture is the separability of different aspects of "subject"hood.  In these languages, there is not a subject, as there is in almost all English sentences.  There is a starting point, and there is a viewpoint, and what is tracked is the relation between them, not an enforced compromise.  Then to ask "what is the real subject" can only be to ask what is the subject for purposes of a particular construction, and the answer is, whichever is demanded by the function of that construction.

     But a tremendous number of languages--all those of truly "nominative" alignment--grammaticalize something very similar to English subject.  Our purpose in this lecture will be to analyze the functional determinants of subjecthood in English, i.e. to try and explain why a language (and, a fortiori, why so many languages) should have such a category.



1.0 Approaches to Subjecthood


1.1 Formal definitions


In many formal approaches, subject and object are taken as given by the theory.  They may either be simply stipulated, as in Relational Grammar, or defined configurationally.  In early Transformational Grammar subject was defined as the NP directly dominated by S, and object as an NP directly dominated by VP; a more current interpretation defines subject as the "external" and direct object as the "internal" argument.

     Connecting grammatical relations to the concepts of "external" and "internal" arguments is a non-explanation, unless it is accompanied by some (presumably functional) explanation of why there should be these two kinds of arguments.  To the extent that an old-fashioned Phrase-Structure grammar represents a correct understanding of the structure of any given language, then there is indeed an NP in each clause which is directly dominated by S, and up to two in each clause which are directly dominated by VP.  Of course, this interpretation of the subject relation implies that if there are languages which are not accurately represented by such a PS grammar, then the theory does not define subject for them--a conclusion to which some typologists would be quite sympathetic.  A common and important objection to this interpretation of grammatical relations is to cite the fact that so many languages pay as much attention as they do to subjects and objects.  In English the subject relation is clearly central to the syntax; a great many important construction types, including such basic syntactic functions as complementation and yes/no questions, simply cannot be accurately described except in terms of subjecthood.  And there are a very great many languages of which this is true.  The existence of such languages, and of so many of them, is taken by Functionalists as compelling evidence that the category of subject must carry some significant functional load.



1.2 Typological approaches


The fundamental task for typology is to establish whether or not subject is in fact a universal linguistic category, and if so how we know one when we see it.  The primary task for functional research is to explain why the subject relation has the prominence which it has in so many languages.  A fundamental difficulty in defining subject in terms of function is that structurally, there is really no one such thing as "subject".  In subject-forming languages[15] like English, subject is defined by a complex of behavioral properties, as it is in German, in Irish, in Swahili, Japanese, and Klamath.  And there is considerable overlap in those behavioral properties:  case in German, Irish, Japanese, Klamath, and marginally in English, verb agreement in Swahili, German, Irish, and marginally in English, initial position in English, German, Swahili, and Japanese (but with differing degrees of predictability--more often initial in English than in German, for example).  But other properties may be more language-particular--English Subject-Aux inversion, for example, is often regarded as one of the crucial tests for subjecthood--hence its relevance to the problem of the subject of presentational there is sentences.  But this is very much English-specific.

     In a seminal paper Keenan (1976) assembled sets of behavioral and functional properties which are associated with the subject relation in many languages.  Prominent among these are least-marked case form, verb agreement, privileged accessibility to relativization, control of reflexivization and other anaphoric phenomena, Agenthood, and topicality.  Keenan noted that there is no associated subset of properties consistently associated with subject in all languages, and thus no possible criterial definition of subject.  Instead, he proposed (without calling it that) a "family resemblance" definition of subject, in which the subject of a sentence is that argument which has the largest number of properties from his lists.  Putting the matter this way still presupposes that there is such a thing as a subject in every sentence, and/or in every language, a question on which there remains some divergence of opinion among functionalists:  Dixon (1994, cp. DeLancey 1996) assumes the universality of subject with little argument, and Givón (1997) with none, while Dryer (1997) and Mithun and Chafe (1999) express strong scepticism about the universal relevance of the category.  To some extent these may be differences of definition.  If by Subject we mean a category of argument that can be structurally defined and equated, on structural or functional grounds, with an analogous category in other languages, there does not seem to be such a relation in every language.  At the other extreme, there is no doubt that in any language at least one of Keenan's properties will identify a category with some functional similarities to subject in a language like English.  But the argument between, say, Givón and Dryer is more a question of functional vs. structural definition.  Givón claims that there is a universal subject function, while Dryer denies that there is a universal structural category of subject; both could quite possibly be correct.
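     Keenan's counting procedure, as characterized above, can be sketched as follows.  This is a hedged illustration and not Keenan's own formulation: the property labels abbreviate the properties listed in the text, the function name is hypothetical, and the property assignments in the example are invented purely to show the mechanics.

```python
# Illustrative sketch only; labels, function name, and example data are assumptions.
KEENAN_PROPERTIES = {
    "least_marked_case",
    "verb_agreement",
    "relativization",
    "reflexive_control",
    "agenthood",
    "topicality",
}


def family_resemblance_subject(arguments):
    """Return the argument showing the largest number of listed properties.

    `arguments` maps an argument label to the set of properties it shows.
    In a tie, the first argument in the mapping is returned.
    """
    def score(label):
        return len(arguments[label] & KEENAN_PROPERTIES)

    return max(arguments, key=score)


# Invented property assignments for a two-argument clause.
example = {
    "NP1": {"least_marked_case", "verb_agreement", "topicality"},
    "NP2": {"agenthood"},
}
print(family_resemblance_subject(example))  # prints NP1
```

Note that the procedure always returns some argument, which is exactly the presupposition questioned above: it assumes that every sentence has a subject to be found.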

     A further consideration is that in certain senses of the term, at least, "subject" is a multifactorial category.  The syntactic properties which define categories have functional motivations which are associated with the functional basis of the category.  In the case of subject, there is a substantial range of syntactic properties associated with it, and considerable cross-linguistic variation in how they bundle.  There is every reason to expect, as is the case with other categories, that different properties may in fact reflect different functional motivations, which coincide in the same structural category in some languages but not in others (cf. Mithun and Chafe 1999, Croft 1991:16).



1.3 Basic and Derived Subjects


In discussing where subjects come from, we need to distinguish between basic and derived subjects.  By this I mean subjects of basic (in the sense of Keenan 1976--these are Keenan's b-subjects-- or Givón 1979) or derived sentences, the latter being passives and other constructions which can be perspicuously described only in terms of some other more basic pattern.  Such derived constructions are well-known to have a function in the organization of discourse (Givón 1979), so that in effect we are distinguishing between what a verb lexically expects to be its subject and what the speaker actually chooses as a subject for a particular utterance.  The referents of these terms are more or less the same as those of the old terms "deep" and "surface" subject, but, despite my use of the convenient term "derived" here, there is no need to appeal to distinct syntactic levels in order to distinguish between the two phenomena.

     The problem of basic subject selection is, are there general principles which will tell us which of the arguments of a verb is its default subject?  We notice immediately, for example, that if a verb has an Agent argument, that is the unmarked subject.  This pattern is quite robust across languages with a recognizable subject category.  Since we have refuted the idea that there are distinct Instrument or Force roles which can compete with Agent for the subject role, we do not immediately have to resort to a "case hierarchy" (Fillmore 1968, Givón 1984, Bresnan XXX, inter alia) in order to guarantee primacy of Agent.

     The only other thematic relation which we have seen in the subject role is the Loc argument of possessional constructions and experiencer verbs like like.  This is, indeed, the original motivation for the distinct case role Experiencer, to provide a label for those Loc (or "Dative") arguments which occur as subjects.  This then makes it possible to say that subject is selected according to a hierarchy of case roles.  In our terms this would probably look like (cf. Givón 1984):


          Agent > Experiencer > Theme > Locative


This is a neat statement of the facts, but without more of a story it is not yet an explanation.  We need to explain why this hierarchy, rather than some other, is universal.
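As a purely descriptive device, the hierarchy amounts to a ranking lookup.  The following minimal sketch assumes that a verb's argument structure can be given as a list of role labels; the function name is hypothetical.

```python
# Illustrative sketch only; the function name and the list encoding are assumptions.
HIERARCHY = ["Agent", "Experiencer", "Theme", "Locative"]


def default_subject(roles):
    """Return the highest-ranked role among a verb's argument roles."""
    return min(roles, key=HIERARCHY.index)


print(default_subject(["Theme", "Agent"]))        # prints Agent
print(default_subject(["Theme", "Experiencer"]))  # prints Experiencer
print(default_subject(["Locative", "Theme"]))     # prints Theme
```

The lookup merely restates the ranking; nothing in it says why the roles should be ordered this way, which is the explanatory gap at issue.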



2.0 Theories of Subject


Functional accounts of the subject category have, from time immemorial, revolved around two functional categories:  Agent (the traditional "doer of the action") and Topic (the traditional "what the sentence is about").  Neither of these is an adequate basis for a theory without further refinement.  Agent is a reasonably well-defined concept (see Lecture 4), but accounts for only a subset of subjects.  Topic, in contrast, is a vaguely-defined category which therefore can be applied to many different things, some of which are and some of which clearly are not subjects.



2.1 Subject and Topic


For generations discussions of derived subjecthood have revolved around a notion referred to as thematicity (Mathesius 1975) or topicality.  English speakers have a clear intuition that the motivation for a passive sentence is that the non-Agent argument which is selected as subject is so selected because it is "more important".  The problem is specifying exactly what we mean by "important", or "topic".  Topic, though used as a technical term, often ends up meaning little more than "whatever it is that makes a non-Agent argument eligible for subject status in a passive sentence".

     Givón motivates the case role hierarchy of eligibility for basic subject status in terms of relative inherent topicality.  In the study of topicality, certain types of referent are considered to be inherently more topical than others--in particular, humans than non-humans, and animates than inanimates.  Givón claims that Agents are inherently topic-worthy, as well as being typically animate, and prototypically human.  Experiencers are necessarily human or anthropomorphized non-humans, and thus inherently more topical than any non-human stimulus (i.e. Theme) argument.  Since experiencer verbs are typically indifferent to the animacy of their Theme argument, this means that the Experiencer will be the most inherently topical, and thus the default subject.

     This account of basic subject selection then ties neatly into a story for derived subjects, which are likewise responsive to topicality--but actual, discourse-based topicality, rather than inherent.  This part of the hypothesis is in principle open to empirical verification.  Assuming that we can provide some operationalizable definition of topic (and if we can't, then any use of it as an explanatory construct severely weakens our theory), then we can look at actual discourse, and see whether or not there is a correlation between derived subjecthood and topicality.

     Givón (1983a) attempted to provide, if not a definition, at least a replicable measure of topicality, based on the presumption that a more topical referent in a discourse will be mentioned more often than a less topical one.  This suggests the simple expedient of taking the number of times a referent is referred to (explicitly or anaphorically) within a given span of text as an index of its topicality.  The utility of the measure was confirmed when a number of grammatical factors hypothesized to reflect topicality--most importantly, for our purposes, voice alternations--turned out to correlate quite significantly with topicality as measured by text counts (Givón 1983b).
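     The text-count index described here is easy to state procedurally.  The following is a minimal sketch of the measure as characterized in this paragraph, not a reproduction of Givón's actual coding protocol: it assumes that the text has already been annotated clause by clause with the referents mentioned (explicitly or anaphorically), and the function name, parameters, and toy data are all hypothetical.

```python
# Illustrative sketch only; function name, parameters, and data are assumptions.
from collections import Counter


def topicality_index(clauses, start=0, span=None):
    """Count mentions of each referent within a span of clauses.

    `clauses` is a list in which each clause is a list of referent labels.
    `span` is the number of clauses to examine from `start` (default: all).
    """
    end = len(clauses) if span is None else start + span
    counts = Counter()
    for clause in clauses[start:end]:
        counts.update(clause)
    return counts


# Invented toy annotation of a four-clause text.
text = [["hunter", "bear"], ["hunter"], ["bear", "hunter"], ["dog"]]
counts = topicality_index(text)
print(counts.most_common())
```

On this toy annotation the hunter is mentioned three times, the bear twice, and the dog once, so the hunter would count as the most topical referent in the span.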

     But the notoriously nebulous character of the topic category remains problematic.  It is very reassuring to find an objective, quantifiable variable which correlates so neatly with our intuitions about topicality and its relation to grammar, but it would still be nice to have a clear picture of exactly what it is we are talking about.



2.2 Subject as Starting Point


In DeLancey (1981a), following out suggestions by Ertel (1977) and MacWhinney (1977), I presented an account of subject choice in terms of a putatively cognitive category of starting point.  Similar suggestions have been made by Chafe (1994, cf. Mithun and Chafe 1999) and others.  Put in Langacker's terms, this is the start of a mental scan of a scene or event.  In actually observing an event, one's attention moves from one participant to another.  There may be any number of factors which will determine where an actual individual actually begins to actually scan an actual scene--besides inherent attractors of attention such as size, motion, and humanness, a particular perceiver has individual predispositions and interests, and potentially some personal or emotional involvement with all or some aspects of what is happening.  But there are certain default values.  As we noted in the last lecture, the Agent is the natural starting point for an observer with no interest in the event beyond observing it.  Thus the natural direction of scanning, or attention flow, is from the Agent to the other arguments.  And, just like the topicality hypothesis, this account is easily extended from basic to derived subjects.  Passive voice is interpreted as a syntactic device to signal the hearer that an inherently Agentive event is being presented with something other than the Agent as the starting point, and hence subject.

     Starting point can be thought of in the following terms: while the theoretical sentence may be constructed around the verb, actual sentences are built around a NP.  That is, a speaker begins to construct a sentence by choosing a referent, and constructs the sentence with that referent as starting point.

     The formulation of subjecthood in terms of starting point has certain elements of superiority to the topicality approach.  For one thing, it provides a framework for understanding a range of models of clause organization, including both nominal and aspectual split ergativity (DeLancey 1981a, 1982), true inverse patterns, and passive constructions.  Most importantly, it represents a more precise, operationalizable construct than topic.  In the form in which it is presented in DeLancey 1981a, however, it remains speculative, in the sense that the cognitive categories which are invoked as explanatory devices are inferred from the linguistic facts which they are intended to explain.  As we will see in the next section, more recent work has provided an operationalizable version of starting point and demonstrated its relevance to subject formation experimentally.



2.3 Attention and subject formation


In an elegant series of studies, Russell Tomlin and his students have demonstrated that in on-line discourse production, subject selection in English and (provisionally) several other languages is driven primarily, if not entirely, by attention.  In an early study (Tomlin 1983), he looks at the alternation of active and passive sentences in the (English) play-by-play description of a televised hockey game.  The bulk of the data turn out to be easily describable in terms of a simple model in which the speaker is tracking the puck, and the puck, the shot, or the player handling the puck are the automatic choices for subject status.  This is hardly surprising in terms of our everyday experience.  But it is not directly predicted by the way that we typically talk about topicality and subjecthood, since it is not intuitively obvious that the puck itself is the "topic" of the play-by-play, in the sense that, for example, my brother is the topic of a story about him.  In more recent studies, Tomlin and his students have pursued an experimental strategy of eliciting discourse under controlled conditions, with the primary variable being where the subject's attention is directed (Tomlin 1995, 1997, Tomlin and Pu 1991, Kim 1993, Forrest 1999).  Attention is used here in a very explicit sense.  At any given moment, an individual is giving primary attention to one element within the visual field; Tomlin shows that when the speaker formulates a sentence to describe what is in the visual field, the attended element will be selected as subject, and the rest of the sentence--in terms of both syntactic construction and lexical choice--is constructed accordingly.



2.3.1 Attention and subject selection in controlled discourse production


Forrest (1992, 1999) demonstrates the association between attention and subject selection in reporting static scenes.  In her study, subjects were presented with a computer screen showing two figures, for example a cross and a circle, one above the other.  The subject's task is to describe the spatial relation of the two figures; thus subjects are producing sentences like The cross is above the circle or The circle is below the cross.  Forrest was very consistently able to determine the form of subjects' output by directing their attention to either the upper or lower part of the screen.  For example, in one version of the task, the subject first sees a blank screen.  Then a cue flashes in either the upper or lower part of the screen, which quite reliably attracts the subject's attention.  Immediately (so as to be within the very short amount of time--on the order of 150 milliseconds--which is required for humans to reorient their visual attention) the task screen is presented.  If the screen has a cross in the upper part, and a circle in the lower, subjects who have been primed to attend to the upper screen will report The cross is above the circle; if they have been primed to attend to the lower screen, they will report The circle is below the cross.  An important aspect of this study is that the subjects are never told that they should be attending to one figure or another.  Once a subject's attention is directed to one location on the screen, he will automatically attend to the figure which then occupies that location, and the attentional focus is robustly coded as subject.  (Cf. the discussion of semantic Theme as perceptual Figure in Lecture 3).

     Tomlin's experiments involve the reporting of events.  In one study (Tomlin 1995), subjects are shown an animated sequence in which differently-colored but otherwise identical fish swim toward one another from opposite sides of the screen.  They meet in the middle, and one opens its mouth and swallows the other.  The subject's task is to describe what is happening on the screen.  The crucial question, of course, is the form of the climactic sentence--does the subject report Then the red fish eats the blue fish or Then the blue fish is eaten by the red fish?  Again, Tomlin is able to control subjects' output by drawing their attention to one or the other of the fish as they first emerge onto the screen.  In this experiment a small arrow appears briefly on the screen pointing at one of the fish as it emerges.  Consistently, English-speaking subjects make the cued fish the subject of the climactic sentence, choosing active or passive voice according to whether the attended fish is the eater or the eaten; similar results were found in other languages studied.

     Tomlin's interpretation of these studies is that in a performance grammar of English, subject is the linguistic reflection of attention.  He further suggests that other putatively subject-associated properties such as humanness, animacy, agentivity, and size have no direct connection to subject formation.  Rather, all of these factors are well-known determinants of attention, which in Tomlin's model (Tomlin 1997) is the sole direct determinant of subject selection.
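
     The selection procedure implied by the fish experiment can be put in schematic form.  What follows is only a toy sketch under stated assumptions, not Tomlin's own formalization: the names Event and describe and the verb-form fields are hypothetical, invented for illustration.  It shows nothing more than the claim that the attended participant becomes subject and that voice follows from that choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A two-participant event (hypothetical representation, for illustration)."""
    agent: str          # e.g. "the red fish" (the eater)
    patient: str        # e.g. "the blue fish" (the eaten)
    active_verb: str    # e.g. "eats"
    passive_verb: str   # e.g. "is eaten by"


def describe(event: Event, attended: str) -> str:
    """Make the attended participant the subject; voice follows from that choice."""
    if attended == event.agent:
        # Attended participant is the agent: active voice.
        return f"{event.agent} {event.active_verb} {event.patient}"
    if attended == event.patient:
        # Attended participant is the patient: passive voice.
        return f"{event.patient} {event.passive_verb} {event.agent}"
    raise ValueError("attended participant is not part of the event")


swallowing = Event("the red fish", "the blue fish", "eats", "is eaten by")
print(describe(swallowing, attended="the red fish"))
print(describe(swallowing, attended="the blue fish"))
```

     Note that humanness, animacy, agentivity, and size do not appear as inputs to the function; on Tomlin's account they would matter only insofar as they help determine which participant is attended.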



2.3.2 Attention and topicality


Tomlin's results present a very plausible picture of how subjects emerge in on-line descriptive discourse.  There are various possible objections to his hypothesis as an explanation for the subject category.  One which need not detain us for long is the argument that while this may be part of a performance grammar, it is not, nor could Tomlin's experimental methods lead us toward, a grammar of competence.  But this presupposes the correctness of a theory in which there is a discrete, autonomous linguistic competence which is distinct from and prior to performance.  We have no commitment to such a concept--if a grammar of linguistic production and comprehension is able to account for linguistic structure, there is no theoretical or metaphysical reason to insist that it is somehow underlain by an inaccessible grammar of competence.

     A more concrete problem is the obvious fact that most linguistic use is not on-line description--that is, most sentences which are actually produced in language use are not descriptions of anything in the speaker's immediate perceptual field.  To take a simple thought experiment, you can certainly imagine yourself watching a cloud, or a bird, or some other pleasant distraction, while talking to someone about linguistics, or family problems, or anything at all.  When I return home and tell a mutual friend Susan said hi, it is probably the friend who is my actual attention focus; in any case it is not Susan.

     Still, Tomlin's results cannot be irrelevant to the general problem of subjects.  For one thing, his proposal to replace the hopelessly fuzzy concept of topic with the well-studied and easily measurable concept of attention is too attractive to dismiss offhandedly.  More important, Tomlin's hypothesis seems to be the correct account for subject formation in his experimental tasks (and in more naturalistic discourse such as sportscasting).  If that is true, we do not want to posit a completely different subject-formation mechanism for other modes of discourse.

     Once again, we come face to face with the essential mystery of human cognition.  Discourse which is not a description of the immediate context evokes and/or builds a virtual world in which virtual events are described as taking place.  As Langacker, Chafe, and others argue, the mechanisms by which we portray this world are virtual analogues to the perceptual mechanisms by which we build our representation of the real world.  Thus a description of an imaginary event does have exactly the same kind of scanning sequence as the perception and description of an event in real time, and is done from a virtual viewpoint which defines a virtual perspective on the scene.



3.0 Basic Subjects


Obviously Tomlin's results cannot be directly extended to basic subject selection, since a verb is a label for a concept, not a specific (real or virtual) scene being scanned.  But if we think of a verb as representing a generic scene, we will think of it presented in the most generic, i.e. the most natural, scan.  There are well-known psychological principles of attention allocation, including preferential attention to human over non-human arguments, and to moving objects in preference to immobile ones.  As I have suggested above, the second of these (and, pragmatically, the first as well) implies Agents in preference to other arguments.  Preferential attention to humans is enough to explain why experiencers are basic subjects in preference to their frequently inanimate Theme "stimulus" argument.

     This might sound like an argument in favor of Experiencer as an underlying case role, which I argued in Lecture 3 should be dispensed with in favor of a more general and basic role of Location.  In fact recognizing Experiencer as a distinct and coherent role creates larger problems than any that it might solve.  For example, Pesetsky (1995:18-19) adduces pairs such as (7-8) and (9-10) as evidence that there cannot be a strict hierarchy of eligibility for subjecthood:


     7)     The paleontologist liked/loved/adored the fossil.


     8)     The fossil pleased/delighted/overjoyed the paleontologist.


     9)     Bill disliked/hated/detested John's house.


     10)     John's house displeased/irritated/infuriated Bill.


The claim is that predicates like like and dislike take an Experiencer subject and a Theme object, and predicates like please and displease take a Theme subject and an Experiencer object.  Thus there cannot be any general principle which requires Experiencer rather than Theme to be selected as basic subject.

     The verbs in (7) and (9) are ordinary Experiencer-subject verbs of the sort which we discussed in Lecture 3.  It is the "Experiencer object" verbs in (8) and (10) that are problematic for a functional account of basic subjects such as we are trying to develop.  If we apply Fillmore's tests to these verbs, we find that they act like change-of-state verbs.  They are not labile, but they clearly have stative as well as eventive passives; indeed to my intuition this is by far the most natural use for most of these verbs:[16]


     11)     The paleontologist was overjoyed.


     12)     I'm just delighted over your success.


     13)     He seems irritated.


     14)     I'm bored!



Thus the argument structures of the two types of verb are quite different:


     15)     The paleontologist liked the fossil.

               LOC                   THEME


     16)     The fossil pleased the paleontologist.

            AGENT      LOC         THEME


     The semantic difference between the paired verbs is a difference of construal.  A situation in which a person experiences some cognitive or emotional state can be construed in three ways: as a state which the individual enters into, parallel to sick or grown-up; as a force which enters into the individual, like a disease; or as a proposition entertained in the individual's mind.  The last of these is grammaticalized as dative-subject predicates like like; the first is grammaticalized as a species of change-of-state predicate like please.  Even if the like and please sentences should be truth-conditionally equivalent (though Pesetsky (1995:56-60) perspicuously shows that they are not), this is irrelevant to the actual semantics which inform their syntactic behavior.

     Still, it is true that English is able to lexicalize these two alternate construals of the same "objective" situation-type only by contravening the principle that intrinsically human arguments, as inherently natural foci of attention, should automatically be basic subjects.  In that context it is worth repeating the observation that these verbs seem to be most naturally used in the passive--which in this instance is being used to rectify this less-than-optimal choice of basic subject (cf. DeLancey 1981a:XXX).





    [1]I hedge this with "almost", because Silverstein's, Dixon's, and others' discussion of the topic implies the existence of languages with a different split.  However, I am not aware of an example of a language with the split anywhere else except between SAP and 3rd person arguments.

    [2]Sunwar data were provided by Tangka Raj Sunawar in Eugene, Oregon, 1988-9.

    [3]Kiksht, also known as Wasco or Wishram, is a Chinookan language of the middle Columbia River, still spoken by a few elderly speakers in Oregon and Washington.

    [4]For the sake of clarity I present Chinookan verbs morphologically analyzed only as far as is necessary to illustrate the point under discussion.

    [5]Like much linguistic terminology, inverse shows up in the literature in at least one sense quite different from this one.  I have sometimes referred to this pattern as direction marking (following Hockett 1966), but to many readers this term suggests the morphological marking of deixis with respect to motional rather than transitive predicates.  Direct/inverse marking is less ambiguous, but still misleading, as in languages of this type the direct category is typically unmarked.

    [6]A significantly different definition will be discussed below.

    [7]It is not clear whether this suffix is originally a plural form or a distinct 2nd person form which is only secondarily homophonous with the 1st person plural.  However, even if the homophony should be secondary, other cross-linguistic evidence that 1st person plural marking in such a form is not unnatural (see below) suggests that the fact that the homophony has survived is probably not coincidental.

    [8]For a sampling of the extensive descriptive and analytical work on Algonquian direct/inverse systems, see Hockett 1966, Goddard 1979, Wolfart 1973, LeSourd 1976, Jolley 1981, Dahlstrom 1986, Rhodes 1994.

    [9]My reference to first and second position suffixes counts only those which we examine here; what I am calling first and second position are actually second and third, first being occupied by an optional obviative suffix ‑em.

    [10]The ‑ee is analyzed as an allomorph of the direct morpheme.  The fact that this allomorphy is not phonologically conditioned should be the cause for some discomfort, as it suggests the possibility that Cree does not treat these configurations as truly direct.

    [11]It is possible that this earlier prefix might have been an inverse marker (DeLancey 1981b, 1988), but this question requires further investigation.

    [12]Sahaptian belongs to the Plateau branch of Penutian; Sahaptin is spoken along the upper Columbia, Nez Perce in adjacent areas of Washington, Oregon, and Idaho.

     The reader is warned to note the terminological distinction between the family name Sahaptian and Sahaptin, which is one of its two daughter languages.

    [13]=aš is a pronominal clitic, occurring in sentence-second position, as noted above; note the occurrence of these clitics in the next two examples as well.

    [14]We can safely consider the phenomena associated with periscopes, remote cameras and so forth to be irrelevant here.

    [15]I take this explicitly tendentious term, which presupposes the claim that there are languages which do not form syntactic subjects, from J. Anderson (1979).

    [16]This is a different argument from that presented by Pesetsky (1995:22-3), who uses the passivizability of these verbs (without regard to stative or eventive interpretation) as an argument that they are not unaccusative predicates.