date: 19 June 2021

Syntactic Typology


  Masayoshi Shibatani


The major achievements in syntactic typology garnered nearly 50 years ago by acclaimed typologists such as Edward Keenan and Bernard Comrie continue to exert enormous influence in the field, deserving periodic appraisals in the light of new discoveries and insights. With an increased understanding of them in recent years, typologically controversial ergative and Philippine-type languages provide a unique opportunity to reassess the issues surrounding the delicately intertwined topics of grammatical relations and relative clauses (RCs), perhaps the two foremost topics in syntactic typology.

Keenan’s property-list approach to the grammatical relation subject yields the wrong results for ergative and Philippine-type languages, both of which have at their disposal two primary grammatical relations: subject and absolutive in the former, and subject and topic in the latter. Ergative languages are characterized by their deployment of arguments according to both the nominative (S=A≠P) and the ergative (S=P≠A) pattern. Phenomena such as nominal morphology and relativization are typically controlled by the absolutive relation, defined as a union of {S, P} resulting from a P-based generalization. Other phenomena, such as second person imperative deletion and gap control in compound (coordinate) sentences, involve as a pivot the subject relation, defined as an {S, A} grouping resulting from an A-based generalization. Ergative languages thus clearly demonstrate that grammatical relations are phenomenon/construction specific. Philippine-type languages reinforce this point by their possession of subjects, as defined above, and a pragmatico-syntactic relation of topic correlated with the referential prominence of a noun phrase (NP) argument. As in ergative languages, certain phenomena, for example, the control of a gap in the want-type control construction, operate in terms of the subject, while others, for example, relativization, revolve around the topic.

With regard to RCs, the points made above bear directly on the claim by Keenan and Comrie that subjects are universally the most relativizable of NPs, justifying the high end of the Noun Phrase Accessibility Hierarchy. A new nominalization perspective on relative clauses reveals that grammatical relations are actually irrelevant to the relativization process per se, and that the widely embraced typology of RCs, recognizing so-called headless and internally headed RCs and others as construction types, is misguided in that RCs in fact do not exist as independent grammatical structures; they are merely epiphenomenal to the usage patterns of two types of grammatical nominalizations.

The so-called subject relativization (e.g., You should marry a man who loves you) involves a head noun and a subject argument nominalization (e.g., [who [Ø loves you]]) that are joined together forming a larger NP constituent in a manner similar to the way a head noun and an adjectival modifier are brought together in a simple attributive construction (e.g., a rich man) with no regard to grammatical relations. The same argument nominalization can head an NP (e.g., You should marry who loves you). This is known as a headless RC, while it is in fact no more than an NP use of an argument nominalization, as opposed to the modification use of the same structure in the ordinary restrictive RC seen above. So-called internally headed RCs involve event nominalizations (e.g., Quechua Maria wallpa-ta wayk’u-sqa-n-ta mik”u-sayku [Maria chicken-acc cook-P.nmlzr-3sg-acc eat-prog.1pl], lit. “We are eating Maria cook a chicken,” and English I heard John sing in the kitchen) that evoke various substantive entities metonymically related to the event, such as event protagonists (as in the Quechua example), results (as in the English example), and abstract entities such as facts and propositions (e.g., I know that John sings in the kitchen).


1. Introduction

While language typology (or linguistic typology) had been practiced within the framework of “comparative philology” much earlier than the beginning of the 20th century, a first explicit formulation of its program is found in the following quotation from the posthumous publication of the German linguist Georg von der Gabelentz (1840–1893), who also christened the enterprise “typology”:

Aber welcher Gewinn wäre es auch, wenn wir einer Sprache auf den Kopf zusagen dürften: Du hast das und das Einzelmerkmal, folglich hast du die und die weiteren Eigenschaften und den und den Gesammtcharakter!—wenn wir, wie es kühne Botaniker wohl versucht haben, aus dem Lindenblatte den Lindenbaum construiren könnten. Dürfte man ein ungeborenes Kind taufen, ich würde den Namen Typologie wählen.

[But what an achievement would it be were we to be able to confront a language and say to it: You have such and such a specific property hence also such further properties and such and such an overall character!—were we able, as daring botanists must have tried, to construct the entire lime tree from its leaf. If an unborn child could be baptized, I would choose the name typology.]

(Von der Gabelentz, 1901, p. 481)

Modern typological studies leading to the present-day flourishing of the field date back to the 1960s, when the seminal works, known as “word order typology,” by the American anthropologist and linguist Joseph Greenberg (1915–2001) were published. Greenberg’s influential works helped shift the perspective on language typology from the traditional morphological classification of languages1 to the syntactic domain, coinciding in time with the laying of the foundation of generative syntax by Noam Chomsky. Although the two approaches to grammar did not see eye to eye in the beginning, much contemporary work in generative grammar, though characterized by its unique formalism and theory-internal argumentation, is typologically oriented.

Greenberg (1963) also established the quantitative typological method, which ascertains crosslinguistic generalizations based on the number of languages attesting specific combinations of structural patterns out of the logical possibilities. Greenberg, following in von der Gabelentz’s footsteps, wanted to find out whether knowing that a language has the constituent order of S(ubject)-O(bject)-V(erb), for example, would lead to knowing “further properties and such and such an overall character” of the language in question. With a limited sample of some 30 languages, Greenberg came up with a surprising number of crosslinguistic generalizations, such as the following for SOV languages:

With overwhelmingly greater than chance frequency, languages with normal SOV order are postpositional. (Greenberg’s linguistic universal 4)

If a language has dominant SOV order and the genitive follows the governing noun, then the adjective likewise follows the noun. (Greenberg’s linguistic universal 5)

If in a language with dominant SOV order there is no alternative basic order, or only OSV as the alternative, then all adverbial modifiers of the verb likewise precede the verb. (Greenberg’s linguistic universal 7)

In languages with dominant order SOV, an inflected auxiliary always follows the main verb. (Greenberg’s linguistic universal 16)

These implicational statements are known as “Greenberg’s linguistic universals,” as they set the boundaries of possible human languages. Greenberg’s universal 5, if it turns out to be 100% true, excludes from possible human languages those having SOV, N+GEN, and ADJ+N orders.
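The logic of such an implicational universal can be made concrete by treating it as a predicate over word-order profiles. The following Python sketch is illustrative only: the `WordOrderProfile` record, its field names, and the four toy profiles are assumptions introduced here, not data from Greenberg's sample. It encodes universal 5 and picks out the one combination that the universal rules out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordOrderProfile:
    """Hypothetical record of three word-order features of a language."""
    name: str
    basic_order: str              # e.g. "SOV", "SVO"
    genitive_follows_noun: bool   # True for N+GEN order
    adjective_follows_noun: bool  # True for N+ADJ order


def violates_universal_5(lang: WordOrderProfile) -> bool:
    """True if the profile is SOV and N+GEN but has ADJ+N order."""
    antecedent = lang.basic_order == "SOV" and lang.genitive_follows_noun
    return antecedent and not lang.adjective_follows_noun


# Toy profiles (invented for illustration, not real languages).
sample = [
    WordOrderProfile("toy-1", "SOV", True, True),    # conforms
    WordOrderProfile("toy-2", "SOV", False, False),  # antecedent not met
    WordOrderProfile("toy-3", "SOV", True, False),   # excluded combination
    WordOrderProfile("toy-4", "SVO", True, False),   # antecedent not met
]

counterexamples = [lang.name for lang in sample if violates_universal_5(lang)]
print(counterexamples)  # ['toy-3']
```

Only profiles that satisfy the antecedent (SOV with N+GEN) can count against the universal; profiles such as toy-2 and toy-4 are simply outside its scope.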

While the quantitative typological method and its results, including Greenberg’s, continue to be further developed and refined with expanded and areally and genetically balanced data (e.g., Dryer, 1992, 2009; Hawkins, 1983), many have pursued another line of inquiry, qualitative typology, which focuses on specific topics, such as ergativity, word classes, classifiers, causative constructions, and valency patterns (e.g., Aikhenvald, 2000; Dixon, 1994; Haspelmath, 2015). By going deeper into the analysis of a narrowly defined phenomenon, qualitative typology addresses issues not only of typological nature, that is, ascertaining and understanding the possible structural variation of a morphological category and a syntactic phenomenon within and across languages, but also of both theoretical and descriptive relevance, probing into the crosslinguistic viability of theoretical notions such as adjective, subject, and verb phrase, and developing a framework for the informed description and comparison of specific categories and constructions in individual languages. As the works by Edward Keenan and Bernard Comrie examined in this article amply show, qualitative typology is also intimately bound up with language universals research, seeking answers to questions such as whether all languages have a lexical category of adjective and whether the grammatical relation subject is a universally applicable notion.

This article demonstrates the qualitative typological method, focusing on the two interlocking topics of grammatical relations (section 2, “Grammatical Relations”) and RC formation (section 3, “Relative Clauses”), both of which received intensive scrutiny in the 1970s, and which have continued to dominate the field as the foremost topics of controversy in both descriptive and theoretical linguistics. These topics are orthogonal to the controversy surrounding the typological profiling of two special types of languages, namely ergative- and Philippine-type languages. In the course of unraveling the relevant issues, some unique features emerge that characterize these types of language and set them apart from each other and from other, more familiar languages such as English and Spanish.

The well-defined focus of this article affords an extensive and critical review of the past achievements and their limitations, including two kinds of broader issues in qualitative typology. One is the limitations of the qualitative method itself when applied to the analysis of certain groups of languages (sections 2.2, “Grammatical Relations in Western Malayo-Polynesian Languages,” and 2.3, “Methodological Issues”), and the other is the very foundational problem of defining the grammatical constructions to be typologized (section 3.4, “Typology of Relative Clauses?”). In response to theoretical interest in relative clauses (RCs), no modern descriptive grammar fails to mention them. An important question, but one rarely asked in either descriptive or theoretical studies, is whether the different types of RCs recognized, such as headless RCs and internally headed RCs, and even ordinary RCs themselves, instantiate independent linguistic structures or whether they are merely epiphenomenal to some other basic structures under different usage patterns. Taking the word “research” in the title of this encyclopedia seriously, the following narrative addressing these issues weaves into it both original research and new insights.

2. Grammatical Relations

The question of grammatical relations, especially the notion of subject, has been one of the perennial issues in linguistics. “What is a grammatical subject?” and “Do all languages have a subject?” are two prime questions that have captivated generations of grammarians. In the 1970s these questions surfaced anew in the light of a renewed interest in ergative languages and the rise of a theoretical framework known as Relational Grammar (Perlmutter, 1983), which recognized grammatical relations as theoretical primitives needed in capturing universal aspects of syntactic phenomena. Keenan’s (1976) work “Towards a Universal Definition of ‘Subject’” was written against the backdrop of Relational Grammar, but it continues to have enormous influence on all works dealing with the question of subject, whether theoretical or descriptive, well beyond that specific framework.

Keenan’s (1976) goal was to provide a list of grammatical properties of the subject of a basic sentence, the b(asic)-subject, so that anyone confronted with the task of finding a subject in any language can identify it. In his words:

[A]n NP [noun phrase] in a b[asic]-sentence (in any L[anguage]) is a subject of that sentence to the extent that it has the properties in the properties list . . . If one NP in the sentence has a clear preponderance of the subject properties then it will be called the subject of the sentence. (p. 312; emphasis original)

Keenan’s thirty odd “properties that subjects characteristically possess” are divided into several types. The following from Keenan are the ones most often invoked by grammarians in identifying a subject in a language:



A non b-subject may often be eliminated from a sentence with the result still being a complete sentence. But this is often not true of b-subjects.

Semantic properties


b-subjects normally express the agent of the sentence, if there is one.


Subjects normally express the addressee phrase of imperatives.

Coding properties


b-subjects are normally the left-most occurring NP in b-sentences.


b-subjects of intransitive sentences are usually not case marked if any of the NPs in the L are not case marked.

Behavior and control properties


The NPs which control verb agreement, if any, include b-subjects.


The NPs which can be relativized, questioned, and cleft include b-subjects.


NPs which can be coreferentially deleted in sentence complement when coreferent with matrix NPs always include b-subjects.


b-subjects are the easiest NPs to stipulate the coreference of across clause boundaries.

Keenan’s heuristics or similar procedures have been widely practiced in identifying subjects in various languages, especially those in which the status of the subject relation has been controversial. There are two types of languages in which this has been the case, namely ergative languages and so-called “Philippine-type” languages belonging to the Western Malayo-Polynesian group.2
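The property-list heuristic can be pictured as a simple tally. The sketch below is a hypothetical operationalization, not Keenan's own procedure: the function name `preponderant_np`, the numeric `margin` standing in for a "clear preponderance," and the property counts are all assumptions made for illustration, since the quotation above gives no numerical threshold.

```python
from typing import Dict, Optional


def preponderant_np(property_counts: Dict[str, int], margin: int = 2) -> Optional[str]:
    """Return the NP whose number of subject properties exceeds that of
    every other NP by at least `margin` (an assumed threshold), else None."""
    ranked = sorted(property_counts.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (best, best_n), (_, second_n) = ranked[0], ranked[1]
    return best if best_n - second_n >= margin else None


# Invented counts: one NP clearly ahead versus an even split.
print(preponderant_np({"NP1": 7, "NP2": 1}))  # NP1
print(preponderant_np({"NP1": 4, "NP2": 4}))  # None
```

The second call shows the situation that matters for the languages discussed next: when the properties are divided between two NPs, a tally of this kind returns no subject at all.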

2.1 Grammatical Relations in Ergative Languages

Ergative languages are so identified if they show a phenomenon in which the argument of an intransitive sentence (S hereafter) and the patientive argument of a transitive sentence (P hereafter) are morphosyntactically treated alike to the exclusion of the agentive argument of a transitive sentence (A hereafter).3 The most obvious and widespread manifestation of “ergativity” is in nominal morphology, whereby S and P are treated alike, typically being unmarked, in opposition to A, which typically receives an extra marker. The ergative grouping pattern is opposed to the “accusative” pattern, where S and A are treated alike, to the exclusion of P, as in figure 1.

Figure 1. Accusative and ergative patterns.

Source: Author.

The morphological manifestations of the accusative {S, A} and the ergative {S, P} grouping pattern are illustrated in case marking of Bolivian Quechua and Warlpiri in (1) and (2).4



As the examples in (1) show, the S=A marking in the (nominative-) accusative pattern, known as the nominative (NOM) case, is typically zero (unmarked), with P being morphologically marked by the “accusative (ACC)” case form. In the typical (absolutive-) ergative marking pattern, the unmarked S and P are said to be in the absolutive case, glossed as ABS in the interlinear glossing of example sentences as in (2). The marked A is singled out by an ergative case marker, glossed as ERG. The morphology of ABS and ERG parallels that of NOM and ACC in that ABS is typically unmarked like NOM and ERG is similar to ACC in being marked. The typically unmarked NOM arguments (S and A) are subjects in accusative languages (see below), suggesting that the unmarked counterparts, the ABS arguments (S and P), are subjects in ergative languages. This morphologically motivated identification of NOM and ABS and the attendant assumption that ABS arguments are subjects in ergative languages have not sat well with many grammarians, because what is semantically patientive (P), which is aligned with an object in accusative languages, is identified as a subject.
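The two marking patterns just described can be stated as a small decision procedure over the case forms of S, A, and P. The following Python sketch is offered under stated assumptions: the function name `classify_alignment` and the labels "neutral" and "other" are introduced here for completeness, and the strings "Ø", "ACC", and "ERG" are placeholder markers, not forms from any particular language.

```python
def classify_alignment(s: str, a: str, p: str) -> str:
    """Classify a case-marking pattern from the markers on S, A, and P."""
    if s == a == p:
        return "neutral"      # no distinction among S, A, and P
    if s == a:
        return "accusative"   # S=A≠P: NOM for S and A, ACC for P
    if s == p:
        return "ergative"     # S=P≠A: ABS for S and P, ERG for A
    if a == p:
        return "other"        # A=P≠S: not discussed in the text
    return "tripartite"       # S, A, and P all marked differently


# Placeholder markers: "Ø" stands for zero (unmarked).
print(classify_alignment(s="Ø", a="Ø", p="ACC"))  # accusative
print(classify_alignment(s="Ø", a="ERG", p="Ø"))  # ergative
```

The procedure looks only at which arguments share a form, so it says nothing about which grouping, if any, should be called the subject; that is the question taken up in what follows.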

Against the background above, Anderson (1976) demonstrates that a morphologically ergative language such as Warlpiri has subjects just like English subjects because it has phenomena paralleling those in which English subjects play a pivotal role. Consider the controller and the gap function in the Warlpiri subordinate clause in (3) and compare them with the while adverbial pattern in the English translations, where the subjects of both intransitive and transitive main clauses control the interpretation of the gaps, which also occur in subject position.5


Despite having an ERG case marker, the A argument in Warlpiri shows a subject property (that is, it behaves like a subject in English) by controlling a gap in a subordinate clause (3a) or by functioning as the gap in the subordinate clause (3b). Anderson (1976) thus argues that it is the S and A arguments that are subjects in Warlpiri, as in English, not the morphologically suggested grouping of S and P, as many have thought.

Anderson’s subject-property approach to grammatical relations would further show that in some ergative languages the ergative case marking pattern and a syntactic pattern may be isomorphic in that absolutive arguments, S and P, to the exclusion of ERG-marked A, form a natural class, as in (4).


Other grammarians (e.g., Comrie, 1981; Kroeger, 2004) follow Anderson’s (1976) (and Keenan’s, 1976) decision to identify the ABS arguments (S and P) in ergative languages as subjects as long as they display some subject properties. Interestingly, Dyirbal shows that the control of the gap in a compound (coordinate) sentence is independent of morphological ergativity, since first and second person pronouns in the accusative case marking pattern also group the nominative S and the accusative P as a unit, displaying the “subject” property under discussion. Observe:


Notice that the controller in (5d) is the accusative marked P, not the zero marked nominative A argument as in English. The controllers are S and P irrespective of their case marking: they can be zero marked (either as NOM or as ABS) or marked as ACC (5d).

The “split-ergative” phenomenon shown in (4) and (5), where full nouns are in the ergative case marking pattern while pronouns show the accusative pattern, actually poses a major problem for identifying ABS arguments (or the members of the {S, P} grouping) as subjects for languages like Dyirbal. While Anderson (1976) appears to dismiss morphological ergativity as a superficial phenomenon, morphology is as much part of grammar as syntax is and should not be lightly dismissed. Accounting for the ergative case marking pattern for Dyirbal full nouns, as in (4), can invoke the proposed subject relation embracing S and P and say that full noun subjects receive absolutive (zero) marking and that an A full noun object (?) receives ergative marking. Accounting for the accusative case marking pattern for Dyirbal pronouns, as in (5), requires another set of grammatical relations; one accounting for the nominative (zero) marking for S and A and another for the accusative marking for P. Appropriating the term subject to the {S, P} grouping does not allow one to use the subject relation in accounting for the accusative case marking pattern in the language, despite it being the same pattern found in accusative languages such as Quechua and English, where the subject and object relations are invoked in accounting for the opposition in case forms (Ø vs. -ta for Quechua and e.g., he vs. him in English).

This kind of situation where two sets of grammatical relations are called for is actually found at the level of syntax in a number of ergative languages. Central Alaskan Yup’ik (CAY) is a representative of such languages. Besides the ergativity in the case marking pattern and the verb agreement morphology in the main clause, CAY organizes several syntactic phenomena with ABS arguments functioning as a unit. The process of causativization, for example, treats S and P uniformly and turns them into an ABS argument of a causative sentence to the exclusion of A, which is realized as a terminalis (TERM) oblique phrase. Observe the following:6


Compare the above pattern with English causativization, which treats S and A alike, as observed in (7).


Relativization in CAY, as in many, if not most, ergative languages, treats S and P (marked ABS) as a pivot, with a ban on relativizing an ergative A argument altogether. Observe:


As in (8b″), the A argument of a transitive clause cannot be relativized on. In order for it to be relativized on, it must be made a derived S via antipassivization (see [8c] and [8c′]).

Keenan’s (1976) property-list approach to subjects would point to the union of S and P in CAY as forming a subject category (“[t]he NPs which can be relativized, questioned, and cleft include b-subjects”), as Anderson (1976) and others would do with the {S, P} grouping in Dyirbal. However, CAY also has other syntactic phenomena in which S and A form a unit. Observe the pattern of gap control in compound sentences in CAY in (9), and compare it with the Dyirbal pattern seen earlier in (4) and (5), where S and P function as both the controller and gap position.


The gaps are controlled by S (9a) and A (9b), as well as the derived S of a passive sentence (9c), in exactly the same way as subjects control the gaps in the parallel phenomenon in English; cf. the English translations in (9).

Another CAY phenomenon calling for the {S, A} grouping is the deletion of the addressee NP in imperatives. Both S and A can be deleted without any problem (10a′, b′). But deleting P cannot give rise to an imperative (10c′). In order for such a P to undergo deletion, it must be made a derived S via passivization, as in (10d, d′). In other words, imperative addressee deletion applies to S (including a derived S) and A exactly like English imperatives.


It is sometimes argued that a phenomenon such as imperative formation is universally agent-oriented and therefore semantically controlled and does not count as a syntactic phenomenon (Otsuka, 2000, chp. 2). Such an argument does not hold here because in CAY, as in English, the derived patientive S of a passive sentence undergoes imperative deletion, as shown in (10d′). It is also the case that the addressee agent of a passive sentence cannot be deleted in forming an imperative (Mary is kissed by you > *Mary is kissed!).7

It is clear, then, that the syntax of ergative languages typically requires two sets of grammatical relations, one grouping S and P as a unit, and the other grouping S and A. Confronted with a situation like this, Keenan’s property-list approach to subjects fails to give an answer as to which grouping of argument types defines the subject. Despite Keenan’s aspiration toward a universal definition of subject, the subject properties recognized by him are merely symptomatic, rather than definitional. Cough, fever, and sore throat do not define influenza, nor are the influenza viruses the only ones that cause these symptoms; with these symptoms, one should not lightly assume that one has influenza.

What is a grammatical subject then? Subject, as used in the European grammatical tradition, is the notion developed to capture a syntactic generalization displayed by a language, when S is assimilated to and is treated like A for morphosyntactic purposes. Transitive sentences typically involve an agentive (A) and a patientive (P) argument as two separate NPs. The sole argument (S) of an intransitive sentence can be either agentive (John ran) or patientive (John fell asleep). In principle, S can assimilate to either A or P. In European languages, S assimilates to A and they pattern alike in nominal morphology and syntax, as the pronominal case forms and verb agreement show in English.


The understanding of a subject in reference to the {S, A} grouping resulting from an A-based generalization does not assume the universality of this grammatical relation. Indeed, it allows the use of this term only when a language displays a phenomenon showing the relevant A-based generalization, namely one that treats S and A alike to the exclusion of P. Compared with Keenan’s (1976) properties-based approach, this understanding of the subject more straightforwardly answers the question of whether a given language has a subject or whether a subject is a universal category. One only needs to look for a phenomenon that treats S and A as a unit. If a language shows no such phenomenon, then it does not have a grammatical subject and the notion of subject is not universal. It is entirely empirical.8

Limiting the use of the term subject derived from the European grammatical tradition to the {S, A} argument grouping curbs the Eurocentrism that has dominated the field. But it does so successfully only when the term is understood and intended, as in this article, as a handy mnemonic capturing partial but significant crosslinguistic similarities; types of grammatical relations and similarly labeled grammatical relations are not entirely uniform either within or across languages, as amply demonstrated in this article (see also Croft, 2001; Dryer, 1997; Haspelmath, 2007).

The above understanding of subject also answers the question of which grouping of arguments the term subject should be applied to when a language shows two different groupings of arguments, as in CAY and other ergative languages. Designating the members of the ergative grouping of S and P as subjects, as proposed for Dyirbal by Anderson (1976), Keenan (1976), and others, is at best a misappropriation of the term subject. The {S, P} grouping results from a P-based generalization of treating S like P, and identifying it as subject is similar to a misdiagnosis of COVID-19 as influenza based on the initial symptoms.

If the members of the S and A grouping are identified as subjects, then how can the ergative grouping resulting from a P-based generalization be best designated? Since the description of the case marking pattern employs the labels “absolutive” and “ergative,” the most straightforward labels for the relations borne by the {S, P} group and the A argument are respectively the “absolutive relation” and the “ergative relation” (with the caution that morphological ergativity is not necessarily isomorphic to these syntactic relations), which parallel the subject and the object relation in both morphological markedness and syntactic primacy. Then, the ergative syntactic phenomena treating S and P alike to the exclusion of A (e.g., Dyirbal control of a gap in compound sentences and relativization and CAY causativization and relativization) are to be described in terms of the absolutive relation, whereas the accusative syntactic phenomena treating S and A alike (e.g., Warlpiri control of a gap in subordinate clauses and CAY control of a gap in compound sentences, as well as deletion of the addressee NP in imperatives) are to be analyzed in terms of the subject relation.9

What is unique to most, if not all, ergative languages, then, is having two sets of grammatical relations, the ABS/ERG and the SUB/OBJ relations, at their disposal, allowing them to employ either ABS or SUB as a pivotal relation in different morphosyntactic phenomena. That is, CAY transitive and intransitive sentences, for example, lend themselves to deployments of arguments according to two different alignment patterns, as shown in (12).


Depending on the phenomenon/construction, arguments are deployed according to either the ergative or the accusative alignment pattern, showing that the alignment pattern is phenomenon/construction-specific.
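The phenomenon-specific view can be sketched by recording, for each phenomenon, which arguments it treats as a pivot, and then naming the relation from that grouping. In the Python sketch below, the entries in `PIVOTS` restate only the phenomena listed in the preceding discussion; the names `PIVOTS`, `relation_name`, and `relations_by_language` are hypothetical, and the table is not a full analysis of any of the three languages.

```python
SUBJECT = frozenset({"S", "A"})     # A-based generalization
ABSOLUTIVE = frozenset({"S", "P"})  # P-based generalization

# (language, phenomenon) -> pivot grouping, as described in the text.
PIVOTS = {
    ("Dyirbal", "gap control in compound sentences"): ABSOLUTIVE,
    ("Dyirbal", "relativization"): ABSOLUTIVE,
    ("CAY", "causativization"): ABSOLUTIVE,
    ("CAY", "relativization"): ABSOLUTIVE,
    ("Warlpiri", "gap control in subordinate clauses"): SUBJECT,
    ("CAY", "gap control in compound sentences"): SUBJECT,
    ("CAY", "imperative addressee deletion"): SUBJECT,
}


def relation_name(pivot: frozenset) -> str:
    """Name the grammatical relation defined by a pivot grouping."""
    if pivot == SUBJECT:
        return "subject"
    if pivot == ABSOLUTIVE:
        return "absolutive"
    raise ValueError(f"unrecognized pivot grouping: {sorted(pivot)}")


def relations_by_language(pivots: dict) -> dict:
    """Collect the set of pivotal relations attested for each language."""
    result = {}
    for (language, _phenomenon), pivot in pivots.items():
        result.setdefault(language, set()).add(relation_name(pivot))
    return result


summary = relations_by_language(PIVOTS)
for language in sorted(summary):
    print(language, sorted(summary[language]))
# CAY ['absolutive', 'subject']
# Dyirbal ['absolutive']
# Warlpiri ['subject']
```

Because the relation is computed per phenomenon rather than per language, a language such as CAY can appear under both relations without contradiction.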

The complexity of ergative languages lies in the variegated patterns of the relationship seen between ergativity at the morphological level and syntactic ergativity of treating S and P alike for syntactic purposes. To begin with, morphological phenomena themselves (nominal inflection, case marking, and agreement) may be highly complex in some languages, mixing the ergative and the accusative pattern, as in Dyirbal (see [4] and [5]), or mixing the accusative (for first and second person pronouns), the ergative (for proper names and common nouns), and the tripartite pattern (for third person pronouns, which have three distinct forms for S, A, and P roles) as in the Cashinawa language of Peru (Dixon, 1979, p. 85ff.), or even combining this kind of person-split pattern with a split conditioned by tense/aspect, as in Nepali. Ergativity at the morphological level (or at least some manifestation of morphological ergativity) may be paralleled by largely ergative syntax revolving around the absolutive (S and P) pivot, as in Dyirbal. Morphological ergativity may belie the accusative orientation in syntax with subject (S and A) pivots, as in the case of Warlpiri and Basque. And morphological ergativity may coexist with split syntax, as in Central Alaskan Yup’ik and Tongan, where some syntactic phenomena are organized according to the ergative alignment pattern, while others are accusatively oriented. How a given language fits in, or does not fit in, with any one of these possibilities is determined by a thorough syntactic analysis, which is still a long way off even for many documented ergative languages.

The possibility of allowing two different construals of argument alignment for a single language, as shown in (12), raises the interesting question of whether there is a phenomenon calling for an argument that is simultaneously specified with grammatical relations from the two different sets, as ERG=SUB or ABS=OBJ, for example. The need for a compound specification of an argument with two separate types of grammatical relations (distinguishing SUB=TOP (subject-topic) from OBJ=TOP (object-topic), for example) is seen in Western Malayo-Polynesian languages, taken up next.

2.2 Grammatical Relations in Western Malayo-Polynesian Languages

Controversy over the nature of grammatical relations in Western Malayo-Polynesian languages (aka Western Austronesian languages) has been raging as intensely as it has been with ergative languages, with some arguing for a European-style subject relation, while others deny such a grammatical relation and opt for another type of relation dubbed “topic.”

Western Malayo-Polynesian languages are recognized as a distinct type of language for their possession of what is known as “focus morphology,” which is “the feature of a verbal predicate that determines the semantic relationship between a predicate verb and its topic [the ang-marked NPs in (13)]” (Schachter & Otanes, 1972, p. 69). Because Tagalog and other Philippine languages generally have reflexes of the four-way morphological contrast of Proto-Austronesian (AF=Actor focus, PF=Patient focus, LF=Locative focus, CF=Circumstantial focus), as in (13), these Austronesian languages are often referred to as “Philippine-type” languages.


The extent to which the original four-way contrast is maintained varies considerably among the Western Malayo-Polynesian languages. Reduced three-way systems are seen in Taiwan, where Kavalan shows a morphological merger between PF and LF, and where Thao has dropped the CF (or Referential focus [RF]) construction altogether. Tondano in Northwest Sulawesi maintains a four-way contrast, while Lundayeh and Sa’ban in Northern Sarawak, respectively, have a three- and a two-way system (Clayre, 2014).

Like standard Malay (Bahasa Melayu/Malaysia) and standard Indonesian (Bahasa Indonesia), many other Austronesian languages in Indonesia maintain a two-way morphological contrast for AF (e.g., standard Indonesian men-beli ‘AF buy’, nulis ‘AF write’) and PF (beli ‘PF buy’, tulis ‘PF write’), but some Sasak dialects on Lombok Island in eastern Indonesia and those farther east, belonging to the Central-Eastern Malayo-Polynesian group, have lost the morphological contrast, though the AF/PF contrast is structurally preserved (see Shibatani, 2008, and below).10 The focus morphology in these originally verb-initial languages indicates that one of the argument nominals of the sentence to follow has a special pragmatico-syntactic status (see below). In the AF construction the A argument has such a status; in PF it is the P argument that is accorded such a status, and so on. In Tagalog these nominals bearing the special status are marked by the preposition ang (for common nouns) or si (for personal names) and are identified as “topic” nominals by Schachter and Otanes (1972); AF morphology is associated with an A topic nominal, PF morphology with a P topic nominal, and so on. The central issue is what these ang-marked topic nominals are: are they subjects or something else?

The pioneering studies of Tagalog such as those by Blake (1916) and Bloomfield (1917) considered the focus alternations as exemplified in (13) to be a voice phenomenon with the assumption that the AF construction is active and that PF and the other non-AF constructions are passive, with the concomitant assumption that the ang-marked nominal phrases are subjects. Later works such as Bell (1983), Comrie and Keenan (1979), Keenan (1985a), and Keenan and Comrie (1977) follow this line of thinking, which was fortified by the observation that relativizable NPs across languages include subjects and that Tagalog ang-marked NPs and their equivalents in these Austronesian languages are the only ones that can be relativized (see section 3.2). Other students of the Philippine-type languages (e.g., Arka, 2003; Artawa, 1994; Kroeger, 1993) generally identify the equivalents of the Tagalog ang-marked NPs as subjects following Keenan’s (1976) property-list approach.

Schachter and Otanes (1972), perhaps following the earlier works on the Philippine language Maranao by McKaughan (1958, 1962), use the term “topic” for the Tagalog ang-phrase, as mentioned above.11 Continuing the use of this term, Schachter (1976) challenged the prevailing view on the subjecthood in Philippine languages by showing that Keenan’s property-list approach fails to single out a unique NP type as subject in these languages. As summarized in table 1, some phenomena would identify topics as subjects, while some others point to what Schachter called actor nominals as subjects.12

Table 1. Subject-Property Split in Philippine Languages According to Schachter (1976)

Topics as subjects: Quantifier floating; Agreement in Kapampangan

Actors as subjects: Gap-control in subordinate clause; Imperative addressee deletion; Word order in Pangasinan

Schachter’s conclusions from this are that “there is in fact no single syntactic category in Philippine languages that corresponds to the category identified as subjects in other languages” (1976, p. 513) and that “[they] CAN be analyzed quite satisfactorily as NOT having subjects” (1977, p. 304; emphasis original).

More than the problem of the Philippine subject itself, Schachter’s study reveals the limitations of the property-list (symptom-based) approach to subjects. While Schachter’s discovery that Philippine languages have two primary grammatical relations is a major contribution to the field, his allegiance to Keenan’s heuristics and his lack of a clear, independent understanding of what a subject is have helped muddle the situation, leading many to believe that Philippine languages do not have a subject. Had he understood the notion of subject along the line of thinking outlined earlier (i.e., that subject is a cover term for the grouping of S and A arguments, which a language may display in its morphosyntactic patterning), his conclusions would have been different; namely, (a) Philippine languages also have subjects, as demonstrated by those phenomena controlled by the actor nominals, which are in fact S and A; and (b) they have a separate grammatical relation, apart from subjects, that is, a topic relation exemplified by the Tagalog ang-marked NP, which displays some other properties associated with subjects in some other languages. Conclusions like these are no longer surprising in the light of the earlier discussion in this article showing that most ergative languages have two primary grammatical relations of subject and absolutive, each of which displays some symptoms associated with subjects in European languages.

The past discussions on the grammatical relations in Western Malayo-Polynesian languages have been handicapped by the fact that Malagasy and the languages of the Philippines and Taiwan generally lack passive constructions, depriving the researchers of an opportunity to compare PF and other non-AF constructions with true passive constructions. Therefore, the existence of passive constructions in standard Malay and standard Indonesian, Sasak, and some others in Indonesia, as shown in (14) and (15), provides a rare window to the structural properties of PF and other non-AF constructions, which have been treated as passive by many (see above). Notice that the equivalents of the Tagalog ang-marked topics come in sentence-initial position in Malay/Indonesian and Sasak.



Unfortunately, those who recognize a passive construction separately from a PF construction have failed to appreciate the differences between them, either treating both as types of passive constructions (such as “passive type one” and “passive type two” by Sneddon [1996] and many other Indonesian specialists) or both as PF constructions (Verhaar, 2000). Even those who distinguish PF and passive constructions in Malay/Indonesian fail to differentiate the PF patient topic and the patient topic of a passive sentence, treating both as subjects. This is what Kroeger (2014, p. 12) states:

The subject tests . . . show that the patient of the OV [PF] construction is the grammatical subject, just as we found for the di-V [passive] construction. [Both the patient of the PF and that of the passive construction] can be clefted, can be controlled, and can be expressed by the pronoun ia [third person singular].

The examples in (16) illustrate how the two patientive NPs pattern alike with regard to the controllee gap function in the untuk construction.


Actually, the situation is far more complex and intriguing than Kroeger and others have imagined. That the PF and the passive construction are fundamentally different, at least in a number of Malaysian and Indonesian languages of the Austronesian stock, is already shown by the Sasak examples in (15). Sasak dialects show a much clearer picture of the cliticization phenomenon than others. The S of an intransitive cliticizes on the auxiliary-like past tense morpheme (15a), and so do the A’s of transitive constructions (15b, c). The P topic of a PF construction does not cliticize (15c), while the P topic of a passive construction does (15d). What cliticize as a second-position clitic in Sasak are S, A, and derived S (of a passive), indicating that the second-position cliticization pattern parallels the subject-controlled agreement pattern of English. Other Sasak phenomena that make reference to the subject include the “try”-control phenomenon, the addressee deletion in imperatives, the control of a gap in compound sentences, and the relativizer selection in the Bagu Sasak dialect (Shibatani, 2008). What is interesting about Sasak (and the neighboring Sumbawa) is that non-topic A is marked the same way by the preposition (i)siq in both PF and passive constructions. The similarity here is deceptive; the (i)siq-marked (léng, ling, ning in Sumbawa) A in a PF construction is a subject and triggers second-position cliticization (15c), while the same in a passive is not (15d). That a topic relation distinct from a subject is required in Sasak is shown by a number of phenomena calling for it. As discussed in section 3.2, relativization in Sasak targets the position of a topic, as in Tagalog, Malay/Indonesian, and others.

Although the cliticization phenomenon in Malay and Indonesian is not as clear-cut as in Sasak in demonstrating the existence of a subject relation apart from topics, there are still other phenomena that require a separation between the two and that are at odds with Kroeger’s and others’ position of identifying the topic as a grammatical subject in these languages. Observe the summary of the ingin ‘want’ control phenomenon in table 2, which is based on questionnaire surveys conducted by the present author, involving 90 Malay and 63 Indonesian native speakers.13

Table 2. Ingin ‘Want’ Control in Malay/Indonesian

It is clear that a large majority of speakers of both Malay (71 out of 90 speakers) and Indonesian (49 out of 63 speakers) align the patient of a passive sentence with S of an intransitive and A of a transitive sentence in contrast to the patient of a PF construction, whose gap form is permitted by only 17 Malay speakers and two Indonesian speakers (cf. [3] and [4] of table 2), indicating that the patient (topic) of a passive sentence is a subject and that the patient (topic) of a PF construction is not. Largely similar patterns are observed with regard to coba ‘try’ control and the controlling of a gap in the purposive untuk construction.
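The proportions behind these counts can be checked with a short calculation. The following is a minimal sketch that uses only the figures quoted above (71 of 90 and 49 of 63 speakers for the passive patient; 17 of 90 and 2 of 63 for the PF patient). The variable and function names are illustrative assumptions and are not part of the original survey.

```python
# Counts quoted in the text for ingin 'want' control (table 2).
# All names below are illustrative assumptions, not from the original study.
SURVEYED = {"Malay": 90, "Indonesian": 63}
ACCEPT_PASSIVE_PATIENT_GAP = {"Malay": 71, "Indonesian": 49}
ACCEPT_PF_PATIENT_GAP = {"Malay": 17, "Indonesian": 2}


def share(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def summarize() -> dict:
    """Map each language to (passive-patient %, PF-patient %)."""
    return {
        lang: (
            share(ACCEPT_PASSIVE_PATIENT_GAP[lang], total),
            share(ACCEPT_PF_PATIENT_GAP[lang], total),
        )
        for lang, total in SURVEYED.items()
    }


if __name__ == "__main__":
    for lang, (passive_pct, pf_pct) in summarize().items():
        print(f"{lang}: passive patient {passive_pct}%, PF patient {pf_pct}%")
```

On these figures, roughly four in five speakers of either language accept the passive patient as a controllee, whereas acceptance of the PF patient stays below one in five for Malay and is marginal for Indonesian.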

What is intriguing is that those who accept the passive patientive subject as a possible Ø controllee form three subgroups on the basis of their treatment of P of a PF and A of a PF construction. Observe the summary in table 3:

Table 3. Controllee Arguments (SUB=TOP, TOP, SUB Groups of Speakers)

Nearly one half of the Malay speakers align the patientive subject of a passive sentence with neither P nor A of PF constructions ([1] of table 3), while a fair number of people align it with either P ([2] of table 3) or A ([3] of table 3) of PF constructions.

Now, in order to account for the control patterns in tables 2 and 3, it is necessary to recognize two grammatical relations of topic and subject, which are distinct yet may converge on a single argument, as indicated in the tables. First of all, the alignment of the patient of a passive construction is P=SUB=TOP, while that of a PF construction is P=OBJ=TOP. This is clearly reflected in the difference between (3) of table 2 and (4) of table 2, where the patientive P=SUB=TOP of a passive sentence, in contradistinction to P=OBJ=TOP of a PF construction, behaves like S=SUB=TOP of an intransitive sentence and A=SUB=TOP of a transitive sentence.

On the other hand, A of a PF construction, as in (5) of table 2, is aligned like A=SUB=NTOP, where NTOP means “non-Topic,” contrasting it with A of an AF construction in (2) of table 2, whose alignment pattern is A=SUB=TOP. The S of an intransitive sentence is aligned as S=SUB=TOP. Given these alignment patterns, the three groups of speakers shown in table 3 can be described in a straightforward manner. The largest group of speakers in both Malay and Indonesian permit only a gap in the SUB=TOP position to be controlled, as in (1) of table 3. One of the smaller groups of Malay speakers (11 people) permits a gap in TOP position, allowing both SUB=TOP and OBJ=TOP positions to contain a gap, as in (2) of table 3. This is the pattern consistent with Kroeger’s analysis illustrated in (16), but only small numbers of Indonesian/Malay speakers treat the P=SUB=TOP of a passive and P=OBJ=TOP of PF alike. The other smaller group, composed of 12 Malay speakers, allows a gap in SUB position, permitting both SUB=TOP and SUB=NTOP positions to contain a gap, as shown in (3) of table 3. The relatively small numbers of Indonesian speakers showing the patterns in (2) of table 3 and (3) of table 3 indicate some difference in grammar between Indonesian and Malay speakers.
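The alignment patterns and the three speaker groups just described can be stated compactly as a small model. The sketch below is only an illustration of the reasoning: the class, field, and function names are assumptions introduced here, and the model encodes nothing beyond the five alignments named in the text (S=SUB=TOP, A=SUB=TOP, P=SUB=TOP, P=OBJ=TOP, A=SUB=NTOP) and the three licensing conditions of table 3.

```python
from dataclasses import dataclass

# Illustrative encoding of the alignment patterns described in the text.
# Class, field, and function names are assumptions made for this sketch.


@dataclass(frozen=True)
class Argument:
    label: str      # e.g. "P of passive"
    role: str       # "S", "A", or "P"
    relation: str   # semantico-syntactic relation: "SUB" or "OBJ"
    topic: bool     # True for TOP, False for NTOP


ARGUMENTS = [
    Argument("S of intransitive", "S", "SUB", True),   # S=SUB=TOP
    Argument("A of AF", "A", "SUB", True),             # A=SUB=TOP
    Argument("P of passive", "P", "SUB", True),        # P=SUB=TOP
    Argument("P of PF", "P", "OBJ", True),             # P=OBJ=TOP
    Argument("A of PF", "A", "SUB", False),            # A=SUB=NTOP
]

# Each speaker group licenses a controlled gap under a different condition.
GROUPS = {
    "SUB=TOP": lambda a: a.relation == "SUB" and a.topic,
    "TOP": lambda a: a.topic,
    "SUB": lambda a: a.relation == "SUB",
}


def permitted_gaps(group: str) -> list:
    """Return labels of the arguments whose gap the given group permits."""
    licenses = GROUPS[group]
    return [a.label for a in ARGUMENTS if licenses(a)]


if __name__ == "__main__":
    for group in GROUPS:
        print(group, "->", permitted_gaps(group))
```

Under this encoding the three groups differ only in which single condition they check, which is the point of treating subject and topic as distinct relations that may converge on one argument.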

As if distinguishing subject and topic were not enough, there appear to be phenomena controlled by the agentive semantic role. Observe the controller function in compound sentences with a gap in the second conjunct, as summarized in table 4.

Table 4. Controllers of a Gap in a Compound Sentence

The consistent questionnaire responses divide Malay speakers into three groups, with 36 speakers choosing subjects as controllers of the gap, 27 speakers opting for agents (including the adjunctive agent of a passive sentence) as controllers, and 19 speakers using topics as controllers. Indonesian speakers, on the other hand, mostly choose either agents or topics as controllers, with only five speakers opting for subject controllers, indicating a clearer difference between Malay and Indonesian grammars.

The study above indicates not only the need for recognizing both subject and topic as primary grammatical relations in Malay and Indonesian, but also that (a) grammatical relations are phenomenon/construction-specific; (b) the two grammars differ as to which of the two grammatical relations is primarily used for specific phenomena/constructions (in the control of a gap in a compound sentence, Malay speakers prefer to use subject controllers, whereas Indonesian speakers opt for topic controllers; see table 4); and finally (c) there is a great deal of variation in the choice of a primary relation for specific constructions among speakers of each language. Altogether the situations surrounding the issues of grammatical relations in Malay and Indonesian are far more complex than hitherto recognized.

The following summarizes the alignment pattern of the intransitive construction, the AF and PF transitive constructions, and the passive construction.14


The second point made above that “grammars [of Western Malayo-Polynesian languages] differ as to which of the two grammatical relations [of topic and subject] is primarily used for specific phenomena/constructions” must be taken seriously in describing these languages, since some languages appear to choose topics as a pivot for a greater range of phenomena than Malay/Indonesian and Sasak. For example, in Malagasy the two types of control phenomenon studied above appear to be more strongly oriented toward topics, although there are still some phenomena in which a reference to subjects must be made, such as AF focus morphology (m- in the present and n- in the past) that is triggered by SUB=TOP (S and A topics): M/N-atory ilay zaza (PRS/PST-fall the child) ‘The child falls/fell asleep’; M/N-amaky boky ilay zaza (PRS/PST-read book the child) ‘The child reads/read a book’.

Does “a clear preponderance of the properties characteristic of b(asic)-subjects” (Keenan, 1976, p. 307) of topics make them subjects, as assumed in the studies on Malagasy grammatical relations by Keenan (1976) and on Balinese grammatical relations by Artawa (1994) and Arka (2003)?15 The answer would be: “Not at all.” Section 2.1 pointed out the infelicity of identifying the absolutive relation with the subject. As highlighted earlier, influenza and COVID-19 (or SARS-CoV-2) are both viral pathogens that show similar symptoms, yet their defining characteristics differ, and they are accordingly assigned to two different virus families.

The topic in Western Malayo-Polynesian languages is not rooted in syntactic generalizations such as the A-based generalization of treating S like A for the subject and the P-based generalization of treating S like P for the absolutive.16 As Schachter’s (1976, 1977) studies show, the defining characteristic of the Philippine topic lies in its correlation with the referential prominence of an NP. Philippine topics, as well as the parallel NPs in other Western Malayo-Polynesian languages, constitute a pragmatically motivated NP category that is “regularly definite [or at least specific], i.e. regularly [has] a presupposed referent” (Schachter, 1977, p. 290), showing a closer affinity to the Japanese-style topic, marked by the particle wa.17

The fact that Philippine languages and others under consideration have multiple transitive constructions effected by focus alternations surprises many in the field as typologically unique, but it would not be a surprise if the focus alternations were considered a case of alternative topic choice seen in a language such as Japanese, which has a grammatical topic in addition to a subject and which would translate the Tagalog sentences in (13) into multiple transitive constructions depending on the topic choice. Observe:


The variation in sentence structure due to topicalization like the above does not alter the pattern of the syntactic relations such as subject and object. The subject sono onna ‘that woman’ is consistently marked by the nominative particle ga in all these sentences, except when the subject itself is also a topic, as in (18a). While the object case marker is also lost under topicalization, as in (18b), the oblique kara and the indirect object marker ni are retained under topicalization, as in (18c, d).18 There are also syntactic phenomena that show subjects and objects do not change their grammatical status under topicalization. For example, a SUB=TOP (subject topic) still triggers the subject honorification process and an OBJ=TOP (object topic) still triggers the object honorific process, just as non-topic subjects and non-topic objects respectively do (see Shibatani, 1990).

The grammatical topic is a pragmatically determined “pragmatico-syntactic” grammatical relation, while the subject is a “semantico-syntactic” relation resulting from an A-based generalization, as described earlier. While both play syntactic roles to a varying degree depending on the construction and on the language, the two types of grammatical relations are motivated by different principles.19 And this explains why the topic relation can be superposed on semantico-syntactic relations, as the alignment of a subject or an object with a topic results in compound argument specifications such as SUB=TOP and OBJ=TOP, as seen above. On the other hand, it is unlikely that compound specifications such as SUB=ERG and SUB=ABS are called for, since subjects and objects, on the one hand, and absolutives and ergatives, on the other, arise from mutually exclusive syntactic generalizations, as described above.

Topicalization, meaning an alignment/linking of an argument with the pragmatico-syntactic relation of topic, is an additive, superposing process that does not affect semantico-syntactic relations.20 A SUB=TOP argument in both Western Malayo-Polynesian and Japanese, for example, does not lose subject syntactic properties. On the contrary, a SUB=TOP argument displays the properties of both subjects and topics (e.g., a SUB=TOP argument of Philippine languages displays both the Subject/Actor properties and the Topic properties in table 1), and an OBJ=TOP accrues topic properties on top of object properties. It is in this respect that topicalization is distinct from voice alternations, which are historically defined as alternations at the level of semantico-syntactic relations. Though it is currently fashionable to speak of focus alternations in Western Malayo-Polynesian languages as a symmetrical VOICE system, following Foley (2008), they do not actually constitute a voice phenomenon. Figure 2, illustrating the A=SUB=TOP alignment of AF/subject topic constructions in Western Malayo-Polynesian, shows a distinction between the topicalization and the voice domain, maintained also in the treatments of those languages (e.g., Japanese, Chinese) having distinct passive and topic constructions.21

Figure 2. Topicalization and voice domains.

Source: Author.

Despite the similarities between the Western Malayo-Polynesian topicalization and the Japanese counterpart, there are important differences between them. Topics in Western Malayo-Polynesian languages are much more grammaticalized than the Japanese (and Korean) counterparts in the sense that they have become an integral, indispensable argument, occurring in both main and subordinate clauses, while Japanese topics are “optional,” allowing topicless sentences, and in the sense that the Austronesian topic plays a greater role in syntactic phenomena than the Japanese counterpart (Shibatani, 1991). Just as the syntactic status of subjects and absolutives varies crosslinguistically, topics, even if they show the underlying pragmatic unity (for being referentially prominent), are not entirely uniform in their syntactic properties across languages. Absolutives play a greater syntactic role in Dyirbal than in Warlpiri in that they are pivotal in a greater number of syntactic phenomena in the former than in the latter. Subjects in Indonesian play a greater role than in Malagasy in a similar sense. At the construction level, subjects in Malay play a greater role than in Indonesian in the gap control phenomenon in compound sentences in the sense that a greater number of Malay speakers invoke this relation than Indonesian speakers; see table 4. Syntactic properties of the grammatical relations are uniform neither within nor across languages.

2.3 Methodological Issues

The qualitative method in linguistics usually relies on grammaticality judgments on a limited range of phenomena by a handful of native speakers, typically including the investigator. The danger of such a method is clear, especially when dealing with languages that have several options at their disposal in selecting a grammatical relation for different phenomena/constructions. To underscore the point that the methodological issues raised in the preceding section are not unique to Western Malayo-Polynesian languages and to pit the proposed analysis of grammatical relations against some past practices, the status of grammatical subject in Mandarin Chinese is discussed next.

Chinese, having a grammatical relation of topic, similar to Japanese but lacking a special topic marker like Japanese wa and Korean nun, is another language in which the status of the grammatical relation subject has been a contentious issue. In an influential paper, Li and Thompson (1976) offer a new typology of languages based on the syntactic prominence of subject and topic, characterizing Chinese as a Topic-prominent language, in opposition to the Subject-prominent Indo-European languages such as English.22 Their characterization of a Topic-prominent language includes a claim that “[i]n a Tp [Topic-prominent] language . . . subject does not play a prominent role” (p. 467). This point is reiterated in Li and Thompson’s (1981) reference grammar of Mandarin Chinese, which, while recognizing subject in Mandarin, comments that: “In Mandarin . . . the concept of subject seems to be less significant, while the concept of topic appears to be quite crucial in explaining the structure of ordinary sentences in the language” (p. 16).

In demonstrating the grammatical prominence of the topic in Mandarin, Li and Thompson (1976) draw on phenomena such as the abundance of “double-subject” sentences (Li & Thompson, 1981, p. 92) of the type illustrated in (19a, b) and a gap control phenomenon seen in (19c).


Notice that in (19a, b) the lexical predicates dà ‘big’ and duō ‘many’ do not respectively predicate over the topics neì-xie shùmu ‘those trees’ and Zhāngsān ‘Zhangsan’, which are instead asserted by the clausal predicates bracketed in the examples. Example (19c) shows that these topics take precedence over a lexically predicated subject in the interpretation of a gapped argument; (19c) does not allow the interpretation that the speaker does not like the leaves. Indeed, the presence of a topic interferes with the interpretation of a gap that is controlled by a subject (a subject-topic, more precisely). Compare the following two examples, where (20b) with the presence of a topic distinct from a subject renders the sentence incoherent or at best confusing (see endnote 5).


These examples clearly support Li and Thompson’s (1976) claim that a topic plays an important role in Chinese grammar. On the other hand, their claim that in such a language subject does not play a prominent role is not justified. Some researchers go one step further and argue, for example, that “the view of Chinese clause structure as simply topic and comment, with no grammaticalized categories we might call ‘subject’ or ‘direct object,’ can explain all of the clause patterns found in Chinese” (LaPolla, 2009, p. 10). This highly ambitious claim is not substantiated either. Indeed, Chinese also displays phenomena (clause patterns) that motivate positing a relational category of subject as a way of capturing a grammatical generalization over S and A arguments. Observe the following data that parallel the English patterns, whose description is facilitated by recognizing a subject:





Notice that the relevant baseline sentences above can be construed as a topicless, “sentence focus” type in the framework of Lambrecht’s (1994) study of information structure that LaPolla follows. They can be understood as event-reporting sentences, answering questions like “What’s going on?” and “What happened?” whose constituents all carry new information. There are, thus, phenomena in Mandarin that call for a subject/object distinction.

What is interesting, and perhaps eye-opening to the students of European languages, is that pragmatic considerations (e.g., world knowledge, common sense) exert a stronger influence than in European languages on the processing of certain types of Chinese sentences, oftentimes even overriding the relevant grammatical rules.23 This tendency is also reflected in the native-speaker reactions to the sentences made up by linguists during an elicitation session. Where speakers of European languages, for example, would outright reject ungrammatical sentences, Chinese speakers would try hard to make sense of them, even overlooking grammatical mistakes. Consider the following examples by LaPolla:


The sentence “The man dropped the watermelon and burst” is hardly acceptable to English speakers with the interpretation that the watermelon burst. This is not the case for sentence (25a), which appears to be far more readily accepted by Mandarin speakers with the interpretation intended by LaPolla, even ignoring the grammatical mistake of using the intransitive verb diào ‘drop’ in the sentence, which literally says that “the man dropped to the ground.”24 LaPolla turns this kind of generosity on the part of his consultant(s) to his advantage in supporting his claim that “in Chinese there are no constraints” (LaPolla, 1993, p. 774) like those of English that force the speaker to choose the subject of the first clause as the controller of a gap in the second clause of a compound sentence.

Like many other qualitative studies in linguistics, LaPolla’s work fails to take into consideration a wider range of data, let alone “all of the clause patterns found in Chinese.” Consider the earlier example (21b), in which the controller of the missing argument in the second clause is limited to the subject. When both subject and object of the first clause refer to an entity that is semantically and pragmatically construable as the missing argument in the second clause, the subject-oriented interpretation would be the one chosen by Mandarin speakers. The fact that Chinese also has the relevant grammatical constraints referring to the subject as in English is also revealed by the native-speaker reactions to the following kind of sentences, where the grammatical constraints clash with the commonsense knowledge of the world.


When informed Mandarin speakers are asked about these sentences, they tend to offer some explanations along the following lines: As for (26a), the normal interpretation would be that the mother cried, but since the child is the one who suffered directly from being dropped on the ground, it makes more sense if we understand that it was the child who cried. Mandarin speakers would also say that in (26b) it is the truck that exploded, but when two cars collide, both might explode, and so it is also possible for the sentence to mean that both the truck and the car exploded. If Mandarin did not have the grammatical constraints similar to English, this kind of explanation would not be expected. Notice that for the sentences where pragmatic considerations do not clash with the grammatical constraints, a straight answer would be given along the lines of the subject-oriented interpretation of (21b) without any comments like the above.

LaPolla’s (1990, 1993) examination of another type of construction reveals some additional issues in the handling of the data in qualitative research. Consider the following sentences (and also [24]):


LaPolla considers both (27b) and (27c) grammatical, but actually they are not grammatically equal. Consider first the grammatical difference between the following two, where (28a) has topicalized the subject NP and (28b) the object NP:


When a subject is topicalized as in (28a), the gap created by the topicalization cannot be filled by a pronoun; repeating the subject is felt to be highly redundant and is advised against. On the other hand, when the object is topicalized as in (28b), the preferred pattern is the one with the object gap filled by a pronoun.

Besides this syntactic difference, there is a qualitative difference in the acceptability of LaPolla’s sentences. Sentence (27b), where the subject has been topicalized, is unequivocally and unanimously judged grammatical, but reactions to (27c) vary from “O.K.” to “a bit weird,” and to “bad,” indicating that reactions to a sentence violating a grammatical constraint vary considerably. Problematic examples like (27c) can be made much more easily acceptable by providing a context that bolsters the topicality of an object NP. Consider the following sentence, for example:


This sentence is easily accepted even by those who reject (27c), when told that it was uttered by someone to the owner of a car that had been parked in the garage and had apparently been driven away. Again, the point is that in a sentence like (27b), in which a subject has been topicalized, no such bolstering of topicality is needed. The bottom line here is that NPs forming a sentence are not uniform in topicality, with those occurring in subject position generally higher in topicality than those in object position. The grammatical constraint against bringing objects to the topic position in front of sìhū/hǎoxiàng ‘seemingly’ is rooted in the subject/object asymmetry in topicality. Similar to the case discussed earlier, Chinese speakers can apparently override this grammatical constraint when there is a prevailing pragmatic consideration to do so. In this regard, one may claim that the grammatical rules, at least some of them, in Chinese are less rigid than those in European languages and some others, whose rules are more difficult to breach. But such a claim is different from asserting that there are no grammatical rules/constraints referring to grammatical relations such as subject and object in Chinese.

Qualitative research demands high quality in both the data themselves and their interpretations. Since the grammaticality judgment of sentences (especially those infringing the grammatical constraints) by native speakers varies considerably, depending on the power of imagination and on the level of effort in marshaling pragmatic wits that might mitigate the grammatical offence, dealing with one or even a handful of native speakers is often insufficient for obtaining quality data. One way to alleviate this limitation in qualitative research is to augment it with a quantitative method, as demonstrated in the preceding section. Another is to turn to a corpus of natural data, if available. In the case of Chinese, there are several available, the “Academia Sinica Balanced Corpus of Modern Chinese” and the “CCL Corpus” developed jointly by Peking University and the Chinese Academy of Social Sciences being perhaps the most reliable and widely used. A quick check of the Sinica corpus turns up many results matching the pattern of (24a’), (24b’), (28b), as well as LaPolla’s example (27b), where S or A has been topicalized and placed before sìhū/hǎoxiàng ‘seemingly’, but none that matches the pattern of (24b”), (28c), (29), and LaPolla’s (27c), where P has been topicalized. A one-hour survey of the CCL Corpus has yielded only three cases where objects are placed in front of hǎoxiàng ‘seemingly’, two of which actually topicalize both object and subject, placing these in this order before hǎoxiàng.25 The sources for both these corpora are written materials, including some sampling of less formal writings from blog sites, in which the third of the examples was found. The scarcity of the examples in which objects are topicalized in these large-scale corpora shows that the relevant phenomenon is rule governed and that the rule is strictly adhered to in the writing context, where authors tend to be more conscious about obeying grammatical rules, since they can rely less on the reader’s active and cooperative participation in the manipulation of the patterns of information structure.

3. Relative Clauses

As shown in several parts of section 2, relative clause formation plays a role in the discussion of grammatical relations. Indeed, Keenan and Comrie’s (1977) work, the most influential typological study of RCs, advances a claim that the patterns of RC formation are universally determined by the hierarchy of grammatical relations, highlighting the intertwined relationship between the two topics. Before scrutinizing Keenan and Comrie’s claim, a brief mention is made of a few studies focusing on some other aspects of RC constructions.

The early typological studies of RCs such as those by Downing (1978) and Greenberg (1963), as well as more recent ones such as by Dryer (2011), Hawkins (1983), Mallinson and Blake (1981), and Tomlin (1986) were couched within the framework of Greenberg’s word order typology and were mainly concerned with the implicational universals gleaned from the correlations between the ordering of verb (V) and object (O) and the position of RC vis-à-vis its head noun, that is, pre- or postnominal RCs. Major findings include the following generalizations: In VO languages RCs almost always follow the noun they modify, VO ⊃ NRel (postnominal RC). However, the other pattern expected from Greenberg’s study, namely the OV ⊃ RelN (prenominal RC) pattern, was not found true in later works (e.g., Dryer, 2011; Hawkins, 1983; Tomlin, 1986), which show that OV languages can have both NRel order and RelN order.

Several previous studies also observed that the positioning of the relative marker correlates with the positioning of the RC (Downing, 1978; Lehmann, 1984). Head final (prenominal) RCs are usually marked by a final relative marker and the head initial RCs are likely marked by an initial relative marker, though a fair number of them appear to have a marker in final position (Diessel, n.d.). More recent efforts (e.g., De Vries, 2001; Diessel, n.d.; Lehmann, 1984) along this line of research take a more holistic approach in attempting to explicate the nature of the head final and head initial RC structures, for example, whether or not there is any correlation between the use of overtly marked nominalizations with the head final vs. head initial patterns, whether pre- or postnominal RCs show greater structural resemblances with other noun modifying structures (e.g., noun complements), etc.

3.1 NP Accessibility Hierarchy

Research on RCs took a new turn with the formal publication of Keenan and Comrie (1972) as Keenan and Comrie (1977), which remains the most influential work in the typological literature from the 1970s. This work, based on the observation briefly summarized below, is typological in several respects. Firstly, individual languages or language groups are characterized in terms of the types of RCs that they permit. For example, under the heading of “Subjects Only,” Keenan and Comrie (1977, p. 70) state that “[i]n many Western Malayo-Polynesian languages, only subjects can be relativized,” and note that, besides Malagasy discussed in the text, Javanese, Iban, Minang-Kabau, and Toba Batak (all Western Malayo-Polynesian languages) represent this type of language (see below for a follow-up on this). This group of languages is also characterized as those obeying the “subject-only restriction/constraint” (Beguš, 2017; Polinsky & Potsdam, in press).

Similarly, under the heading “Subject–Direct Objects,” languages such as Welsh, Finnish, and Malay are discussed. Languages such as Basque, Tamil, and Roviana are said to represent the Subject–Indirect Object type. The overall pattern of relativization possibilities discovered by this kind of typological grouping of the world’s languages is shown in table 5, with a single-language illustration of each group.

Table 5. Crosslinguistic Distribution Patterns of Two Relativization Strategies

As table 5 shows, languages may have more than one relativization strategy (RC constructions). Accordingly, it is actually misleading to characterize Welsh, for example, as a Subject–Direct Object language, since it has RCs involving those positions low in the hierarchy, namely RCs that mark the case relation of the relativized position. While Keenan and Comrie (1977) talk about languages having different relativization strategies and about the different patterns of their utilization in a language, it is sometimes difficult to determine whether a single strategy or two different strategies are involved. For example, in the Indo-Aryan language Marathi, the participial RCs are distributed differently depending on the aspect. In the imperfective aspect, a gap ([-case]) strategy applies only to subject position, whereas in the perfective aspect, the same strategy applies further down in the AH (see next paragraph). The upshot of the above discussion is that, as in the case of the issues surrounding grammatical relations, each construction, for example, a participial RC in the imperfective, must be taken into consideration in trying to come up with an accurate language profile.

The grouping of language types in terms of the relativizability of argument positions led Keenan and Comrie (1977) to posit what is known as the NP Accessibility Hierarchy:


SU > DO > IO > OBL > GEN > OCOMP

(SU = subject, DO = direct object, IO = indirect object, OBL = oblique object, GEN = genitive, OCOMP = object of comparison)

A number of specific claims are made on this basis. The AH says that the subject position is most accessible to relativization (can be relativized in all languages), followed by the direct object and other positions down the hierarchy.

Based on the observation of the distribution pattern of RCs summarized in table 5 and the AH derived from it, Keenan and Comrie (1977, p. 67) propose the following Hierarchy Constraints (HC):


HC1. A language must be able to relativize subjects.

HC2. Any RC-forming strategy must apply to a continuous segment of the AH.

HC3. Strategies that apply at one point of the AH may in principle cease to apply at any lower point.

HC2, which “justifies the actual ordering of terms in the AH” (Keenan & Comrie, 1977, p. 67), is of particular importance in demonstrating how Keenan and Comrie’s work is in line with the original goal of language typology, namely knowing the typological characteristic of a language on the basis of some correlative variables. The AH and HC2 jointly predict that any relativization strategy “is free to treat adjacent positions on the AH as the same, but it cannot ‘skip’ positions. Thus, if a given strategy can apply to both subjects and locatives, it can also apply to DOs and IOs” (p. 67). A series of implicational universals that can be drawn from the AH would be powerful constraints on human language, such that there would be no natural language that allows a given relativization strategy to apply to subjects and indirect objects but not to direct objects.

Naturally, the achievements made by Keenan and Comrie depend on the correctness of the AH and HCs. As it turns out, there are indications that the AH and HC2 may not be entirely correct. There have been reports that HC2 cannot be maintained in some languages. Joseph (1983) shows that many speakers of Modern Greek allow RCs with a gap headed by the pu marker in subject, object, and oblique positions, but skipping indirect object position. Similarly, Koul and Hook (in press) point out that past participial prenominal RCs in the Indo-Aryan languages Hindi-Urdu, Kashmiri, and Panjabi are found in subject and direct object positions, on the one hand, and oblique and possessor positions, on the other, but not in indirect object position (see also Subbārāo, 2012, pp. 285, 331–332). Sindhi data in (30) illustrate how the questionable status of relativizing on indirect object position compares with the ease of relativization into oblique position.


Shibatani (2008) reports that Sasak, discussed earlier and again in section 3.3, has a pronoun retention ([+case]) relativization strategy that applies to subject and obliques, but not direct object. Similarly, in Tongan, relativization on the subject of a transitive sentence and on obliques requires a pronoun retention strategy, but object position cannot retain a pronoun (see section 3.2). Perhaps more cases like these will be discovered as exceptions to HC2 with expanded data and a more detailed analysis of individual languages.
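
The continuity requirement of HC2 can be made concrete with a small check. The following Python sketch is illustrative only: the function names `skipped_positions` and `satisfies_hc2` are hypothetical, the position labels follow the AH as published by Keenan and Comrie (1977), including the genitive (GEN) position, and the Modern Greek input encodes just the positions reported by Joseph (1983) as summarized above.

```python
# NP Accessibility Hierarchy, most accessible position first
# (labels as in Keenan & Comrie, 1977; GEN = genitive).
AH = ["SU", "DO", "IO", "OBL", "GEN", "OCOMP"]


def skipped_positions(positions):
    """Return the AH positions that lie between the highest and the
    lowest position a strategy applies to but that the strategy skips."""
    covered = {AH.index(p) for p in positions}
    if not covered:
        return []  # assumption: an empty strategy skips nothing
    return [AH[i] for i in range(min(covered), max(covered) + 1)
            if i not in covered]


def satisfies_hc2(positions):
    """HC2: a strategy must apply to a continuous segment of the AH."""
    return not skipped_positions(positions)


# Keenan and Comrie's own illustration: a strategy that reaches
# subjects and locatives (OBL) must also cover DO and IO.
print(satisfies_hc2(["SU", "DO", "IO", "OBL"]))  # True

# The gap strategy with pu in Modern Greek as reported by Joseph (1983):
# subject, object, and oblique positions, skipping indirect object.
print(skipped_positions(["SU", "DO", "OBL"]))    # ['IO']
print(satisfies_hc2(["SU", "DO", "OBL"]))        # False
```

A check of this kind treats each relativization strategy separately, in keeping with the point made earlier that each construction must be profiled on its own.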

The more serious issue in Keenan and Comrie’s studies of RCs has to do with the validity of the proposed AH and the role of the “Subjects Only” languages in the universal claims they make. Justifying the top end of the AH, Keenan and Comrie make the following claims about the relativizability of subjects:


(a) “All languages can relativize Subjects.” (Comrie & Keenan, 1979, p. 652)

(b) “. . . in absolute terms Subjects are the most relativizable of NP’s.” (Comrie & Keenan, 1979, p. 653)

(c) “Subjects are universally the most relativizable of NPs.” (Keenan, 1985b, p. 158)

While claim (a) is rather benign, since it does not assume that all languages have a subject, (b) and (c) are very powerful claims that need to be scrutinized. As in the case of grammatical relations, there are two groups of languages that challenge such strong claims, namely, ergative languages and Western Malayo-Polynesian languages.

While noticing issues these groups of languages raise, Comrie and Keenan (1979), Keenan (1976), and Keenan and Comrie (1977) consistently downplay the seriousness of the issues and assume that the pertinent languages, once viewed in some alternative perspectives, for example, ergative constructions and Western Malayo-Polynesian PF (patient focus) constructions as passives, align with the familiar nominative-accusative languages. This is actually not true with either ergative languages or Malayo-Polynesian languages, where clear cases can be made that many of these languages have a distinct grammatical relation best characterized as subject but that does not relativize or that is not the easiest to relativize.

3.2 Relativization in Ergative Languages

Section 2.1 demonstrated that Central Alaskan Yup’ik has a robust subject relation (a grouping of S and A), in addition to an absolutive relation (a grouping of S and P), both of which play a central role in different morphosyntactic phenomena. If relativization were to be described in terms of the subject/object alignment pattern, according to which the AH is formulated, the subject of a transitive sentence (an A argument) in Central Alaskan Yup’ik would not be relativizable at all (see [8]). An A argument must be made a derived S via antipassivization in order for it to be relativized. The same holds true in other ergative languages, including Dyirbal, Chukchi, and some others (see Polinsky, 2017).26
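
The contrast between the two groupings can be sketched in a few lines of Python. This is a schematic illustration, not a description of any particular grammar: the names `SUBJECT`, `ABSOLUTIVE`, `relativizable`, and `antipassivize` are hypothetical, and the sketch assumes, as described above, that relativization takes the absolutive relation as its pivot and that antipassivization turns an A argument into a derived S.

```python
# Grammatical relations as groupings of the argument types S, A, and P.
SUBJECT = frozenset({"S", "A"})      # A-based generalization: S treated like A
ABSOLUTIVE = frozenset({"S", "P"})   # P-based generalization: S treated like P


def relativizable(role, pivot):
    """An argument can be relativized if its role belongs to the pivot."""
    return role in pivot


def antipassivize(role):
    """Antipassivization makes an A argument a derived S; others are unchanged."""
    return "S" if role == "A" else role


# With an absolutive pivot, a transitive subject (A) is not directly relativizable.
print(relativizable("A", ABSOLUTIVE))                 # False
# After antipassivization it is a derived S and falls within the pivot.
print(relativizable(antipassivize("A"), ABSOLUTIVE))  # True
# Under a subject pivot, as the AH assumes, A would be relativizable directly.
print(relativizable("A", SUBJECT))                    # True
```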

Tongan relativization shows the less accessible nature of the subject of a transitive sentence in ergative languages in a unique way. Otsuka (2000) shows that Tongan also requires two sets of grammatical relations of ABS/ERG and SUB/OBJ. The former needs to be invoked in accounting for the case marking pattern and relativization, and the latter for the interpretation of the gap in one type of compound sentence. If Tongan relativization were to be described in terms of the SUB/OBJ alignment, the A subject of a transitive sentence would have to be said to be less accessible than the P object of a transitive sentence in the sense that the former requires the pronoun retention ([+case]) strategy in subject position, while the latter can be relativized simply with the gap ([-case]) strategy. Observe:27


The point that subject position of a transitive clause is less accessible or more difficult to relativize on can be seen from the fact that the pronoun retention strategy applies to the less accessible lower positions in the AH, as demonstrated in the relativization of IO and Adjunct above (also see the distribution of the [+case] strategy in table 5).

The fact above showing that the A of a transitive clause is treated similarly to those positions low in the AH does not mean that it is an oblique object or an adjunct. This is so because the A of a transitive clause, despite its ergative marking, functions as a subject. Consider the mo-coordination sentence in Tongan, which, just like English, requires the controller of a gap in the second conjunct clause to be a subject, as observed in (32):


As the comparison between the Tongan expressions and their English translations indicates, the A argument of a transitive clause and S form a natural class, exactly like A and S in English. But it was shown above that such a subject is not easily relativizable, contrary to Keenan and Comrie’s claims.

The simplest analysis of relativization in these ergative languages is not in terms of subject/object relations but in terms of the absolutive/ergative relations, pointing to the correctness of the Ergativity Hierarchy, as proposed by David Johnson (1974):

ABS > ERG > IO > OBL, etc.

3.3 Relativization in Western Malayo-Polynesian Languages

Can Western Malayo-Polynesian languages be characterized as “Subjects Only” languages permitting relativization only on subjects, as Keenan and Comrie (1977) and their other works have it? This has already been answered by Schachter (1977), who showed that topics are what can be relativized in Tagalog. The conclusion that some subjects in Western Malayo-Polynesian languages are actually more difficult to relativize than objects was argued for by Shibatani (2008) in a clearer manner in terms of Sasak data. As pointed out in section 2.2, Sasak has a cliticization phenomenon that works exactly like English verb agreement; S and A trigger second-position cliticization in Sasak just like S and A agree with verbs in English. However, a cliticizing subject of a transitive clause in Sasak cannot be relativized on, unless it is aligned with a topic, as shown in (33b, b′).


The NP dengan mame=no in (33b) is a subject triggering cliticization in the past tense form. Yet, it cannot be relativized (33b′). The NP dengan ine=no in (33c) is an object, but it can be relativized (33c′).

The description of Sasak relativization in terms of subject/object relations would have to say that some subjects (non-topic subjects) are harder to relativize than some objects (object topics), and this would be true of all those Western Malayo-Polynesian languages that Keenan and Comrie (1977) classified as “Subjects Only” languages. A true generalization for these languages is that only topics can be relativized regardless of their subject status; subjects (and other arguments) can be relativized only when they are aligned with the topic relation.

3.4 Typology of Relative Clauses?

Besides the position of the RC vis-à-vis the head noun (pre- vs. postnominal RCs), pursued from the beginning of the typological studies in the 1960s, and the distinction between the gap type and the pronoun retention type, which plays a major role in Keenan and Comrie (1977), RCs of the world’s languages have been typologized in terms of some additional structural properties of the head noun (e.g., Andrews, 2007a; Keenan, 1985b; Kuroda, 1992; Lehmann, 1984). Following C. Huang (2008), who demonstrates that the Tibeto-Burman language Qiang has all of them, figure 3 summarizes different types of RCs widely recognized in the field.

Figure 3. Types of relative clauses.

Adapted from diagram 1 of C. Huang (2008, p. 762).

The single headed, externally headed type, the headless type (aka “free relatives,” “nominal RCs,” “fused RCs”), the internally headed type, and the head final type are illustrated by the Bolivian Quechua examples in (34).


The head initial type is illustrated in the English translation for (34a), and the double-headed type by the Qiang example in (35).


The widely embraced typology of RCs summarized in figure 3 has been challenged by Shibatani (2019), who questions whether the different types of RCs recognized are indeed distinct structures, raising at the same time the much more fundamental question of whether RCs themselves exist as independent grammatical structures. Compare, for example, the externally headed RC (34a) and the headless RC (34b). The two structures identified as an RC are clearly the same. Perhaps two different RC constructions, rather than the RCs themselves, are intended by “externally headed” and “headless” RCs; but it then begs the question whether the headless type is really an RC, whose function is either to restrict the denotation of the head noun to its subset (the restrictive RC) or to identify the head noun (non-restrictive RC).28 Headless RC constructions by definition do not have a head noun.

Similar questions can be raised about the internally headed RC from a slightly different angle. The structure found in (34c) is the same as the verb complement structure seen in example (36). If the complement structure is not an RC, then the same structure identified as an internally headed RC is not an RC either.


Functionally, it is again not clear whether the internally headed RC serves the same purposes as a restrictive or non-restrictive RC.29

The upshot of these observations is that what is identified as a headless RC is the same structure as an externally headed RC, and what is identified as an internally headed RC is the same structure as a verbal complement. The only difference between an externally headed RC and a headless RC is the usage pattern and the attendant function. So-called RCs are structures used to modify a head noun, while the same structures by themselves can be used as the head of an NP referring to a nominal (thing and thing-like) entity. In the case of a so-called internally headed RC and a verb complement, the structure itself is again the same in many languages. The difference is what is being denoted/referred to; a so-called internally headed RC typically denotes an event protagonist, while a verb complement denotes an event itself, a fact, a proposition, and other more abstract nominal entities metonymically related to an event. In other words, so-called RCs are nominal structures that denote substantive entities and function as a referential expression when they head an NP, exactly like ordinary nouns, which can both modify another noun and head an NP (see figure 6).30

The nominal nature of the complex structures identified as RCs or (clausal) complements is indicated in many languages by nominalization morphology that derives nouns, as in the case of Bolivian Quechua, where suffixes -q and -sqa derive both lexical and grammatical nominalizations. Observe:


While many languages of the world involve clear morphology in nominalization like Quechua, many other languages do not, as English lexical nominalization known as conversion does not; cf. speak > (a) speaker, run > (a) runner; cook > (a) cook, judge > (a) judge. Similarly grammatical nominalizations may not involve any obvious nominalizing morphology, though what is nominalizing and what is not nominalizing morphology depends on one’s understanding of what nominalization is (see the discussion of German later on in this section). Shibatani (2019, p. 21) offers this working definition of nominalization:

Nominalization is a metonymy-based grammatical derivation process yielding constructions associated with a denotation comprised of entity (thing-like) concepts that are metonymically evoked by the nominalization structures, such as events, facts, propositions, resultant products and event participants. Nominalizations, as grammatical structures, are similar to nouns by virtue of their association with an entity-concept denotation; they both denote thing-like concepts, which provide a basis for the referential function of an NP headed by these nominals.

The two types of grammatical nominalization relevant to the discussion of RCs are event nominalizations and argument nominalizations. The former, represented by the Quechua examples (34c) and (36), typically denote abstract nominal concepts like an event, a fact, and a proposition metonymically evoked by the structure, as in (36), but they may also evoke a concrete object such as an event protagonist, as in (34c). They may also denote a resultant product, as in example (38) (cf. the lexical resultative nominalizations, [a] building, [a] painting, [a] writing).


The resultative nominalization in (38) points out the limitation of the analysis of the construction involved in (34c) as an internally headed RC; both have exactly the same structure, but the nominalization structure in the former does not permit an interpretation where the internal argument laranjas ‘oranges’ functions as a head argument of the main clause verb ujya-ni ‘I drink’.

When event nominalizations metonymically evoke an event protagonist, ambiguity may arise as to which argument is being intended by the speaker. Observe Navajo example (39):


Languages avoid this kind of ambiguity by creating a gap in the position of an intended argument, as in (40).


The gap in an argument position uniquely points to a specific argument that is intended as a metonymically denoted event protagonist, an agent in example (39). The gap marks a variable argument whose specific denotation is determined by the context. This is a major property of metonymically motivated structures, as observed in ordinary metonymic expressions such as Drink a glass a day and India won the World Cup in 1983 and 2011, and lexical nominalizations, such as I bought a half-pounder, We need a three-wheeler. Interpretations of these and grammatical nominalizations are thus contextually determined along the line of the Gricean Cooperative Principle.

It is these argument nominalizations with a gap pointing to the type of argument to be metonymically evoked that are used as a modifier in so-called externally headed RCs. Compare (34b) and (34a), (40), and (41).


While in many languages (e.g., Chinese, Japanese) a gap in argument position is the only clue in ascertaining an intended argument, many others have special morphology pointing to the argument represented by a gap. This has already been shown in the Bolivian Quechua examples. The nominalizer -q points to an agentive argument and -sqa a patientive argument, which respectively occur together with a gap in subject and object position in grammatical argument nominalizations. Other languages have more elaborate systems, finely distinguishing different types of argument. In Western Malayo-Polynesian languages, the focus markers synchronically play two functions. One is to mark the semantics of the sentential topic nominal, as discussed in the beginning of section 2.2. The other, likely the original function of the focus morphology, is to mark different types of argument nominalization in both lexical and grammatical argument nominalizations, as exemplified by the Formosan language Mayrinax Atayal in (42).31



Similarly, German argument nominalizations use nominalizers derived from articles, which mark different argument types, in addition to a gap in an argument position.


Like other languages, both Mayrinax Atayal and German make use of these argument nominalizations as a modifier of a noun, in so-called (externally headed) RCs, as shown in (45) and (46).



Reanalyzing what are conventionally analyzed as RCs in German as argument nominalizations and reanalyzing so-called relative pronouns as nominalizers both find support in the way these forms function. First, the fact that the relevant structures are syntactically nominal is indicated by article marking. Compare the marking patterns for nominalizations and ordinary nouns in (47):32


The reason that these argument nominalizations occur in a nominal position (as the head of an NP) like ordinary nouns is precisely because they denote nominal entities. This fact is indicated both internally and externally vis-à-vis nominalization structures. Internally, the nominalizers der, die, and das respectively indicate that what is denoted is a masculine, feminine, and neuter thing-like entity. Externally, the determiners der, die, and das also indicate that the nominals they mark respectively denote a masculine, feminine, and neuter thing-like entity. These facts for German argument nominalizations are completely in line with argument nominalizations in other languages, which have been traditionally analyzed as nominalizations. There are many languages (e.g., Hmong, Dravidian, Barasano; see Shibatani, 2019) that mark argument nominalizations with classifiers, showing that they denote nominal entities that are conventionally classified, just like ordinary nouns. In sum, nominalizations qua RCs denote nominal entities rather than predicating and asserting like clauses and declarative sentences.

It is by now clear that RC constructions are combinations of argument nominalizations and a head noun. A subject RC construction is a combination of a subject argument nominalization and a head noun, as in (45a) and (46a), and an object RC construction is a combination of an object argument nominalization and a head noun, as in (47a) and (47b), and so on. What are known as RCs do not exist as independent structures apart from grammatical nominalizations in two different uses: the NP use, where they head an NP and perform a referential function, and the modification use, in which they function as a modifier either restricting the denotation of the head noun (restrictive RC) or identifying the denotation of the head noun (non-restrictive or appositive RC). The typology of RC constructions, as in figure 3, is no more than an epiphenomenon of these usage patterns of grammatical nominalizations. Figures 4–6 show, with English examples, how the two major uses of grammatical nominalizations parallel those of ordinary nouns.33

Figure 4. Usage pattern of nouns.

Source: Author.

Figure 5. Usage pattern of event nominalizations.

Source: Author.

Figure 6. Usage pattern of argument nominalization.

Source: Author.

4. Conclusion

Language typology enjoys immense popularity among contemporary practicing linguists. The field is thriving, with typologically oriented researchers bringing a range of fresh data and analysis of descriptive and theoretical significance from a wide spectrum of little-known languages, many of which are quite marginalized with the high risk of extinction. Many of these praiseworthy endeavors are breaking away from the Eurocentric perspectives and the time-honored descriptive and analytic frameworks that have dominated the field. By closely focusing on the two typologically prominent topics of grammatical relations and RCs and by mainly examining them in the light of two groups of languages that challenge the conventional wisdom of the field, this article critically appraised the achievements garnered in the span of the last 50 years.

The past studies on grammatical relations focusing on so-called subject properties suffer from the lack of a clear understanding of the subject of European languages as a reference point and from a failure to see whether the grammatical relation in question shares the defining characteristic of the subject, rather than merely observing in it some properties symptomatic of the subject. The subject results from an A-based generalization of treating S like A, while the P-based generalization of S and P defines the absolutive relation found in ergative languages. So-called Philippine-type languages, as well as Japanese and Chinese, have a grammaticalized topic relation that is rooted in the pragmatic notion of referential prominence and that is superposed on semantico-syntactic relations of subject and object, resulting in the alignment patterns such as SUB=TOP (subject topic) and OBJ=TOP (object topic). Subjects, absolutives, and topics are primary grammatical relations that languages choose as pivots in specific morphosyntactic phenomena, with some languages (e.g., English) choosing one type of grammatical relation more consistently across different phenomena/constructions than some others, which divide the labor among different types of grammatical relations. Ergative languages are characterized by their use of both the subject/object (accusative) alignment and the absolutive/ergative (ergative) alignment pattern. The Philippine-type languages make use of a superposed system involving both semantico-syntactic relations of subject and object that are further linked with the pragmatico-syntactic relations of topic and non-topic.

That different grammatical relations serve different functions and bear a different functional load across languages was shown by their role in the formation of so-called RCs. Contrary to the widely embraced NP Accessibility Hierarchy proposed by Keenan and Comrie (1977), relativization revolves around the absolutive relation in many ergative languages and around the topic relation in the Philippine-type languages. Perhaps more surprisingly, it has turned out to be the case that grammatical relations have no direct relevance to RC formation per se, which simply joins together a head noun and a grammatical argument nominalization, just as a head noun and a modifying adjective are brought together with no regard to grammatical relations in the formation of a simple attributive construction. What is known as subject relativization, for example, forms an NP by simply bringing together a head noun and an argument nominalization with a gap in subject position. There are no RCs as independent structures apart from the grammatical nominalizations in two different uses, let alone a typology of them.


Many thanks are due to Niranjan Uppoor, Yukishige Tamura, and Haowen Jiang, who, besides supplying some crucial example sentences from pertinent languages, carefully went over earlier versions of this article and provided comments and discussions useful in improving the manuscript. The research reported in this article was partly supported by the project “Noun Modifying Expressions” (PI: Prashant Pardeshi) of the National Institute for Japanese Language and Linguistics in Tokyo, Japan.

Further Reading

Notes

  • 1. The attempts to classify languages in terms of the manner of word formation; e.g., isolating (Old Chinese), agglutinating (Turkish), inflectional (Sanskrit), polysynthetic (Inuit). Besides the morphological typology of this kind, typological studies may focus on lexical categories and their internal structures (kinship terms, classifiers, body-part vocabulary, etc.), phonological inventories and structures (vowels, consonants, prosodic features), as well as morphosyntactic phenomena, exemplified in this article. See the entries in the Further Reading section for broader overviews of language typology.

  • 2. Section 2.3 examines the status of subject in Chinese.

  • 3. See Dixon (1979, 2009) for the rationale for recognizing these semantico-syntactic roles. Dixon and some others use O for P.

  • 4. Language data whose sources are not identified are from the author’s personal database.

  • 5. In this article all missing arguments whose semantic interpretation depends on another noun phrase are identified as a “gap” and are represented by Ø in the examples, although different kinds of missing argument are actually involved depending on different constructions. The noun phrase that determines the semantic interpretation of a missing argument is called a “controller.” An important and difficult question, not pursued in this article, concerns a distinction between the gaps sanctioned by specific grammatical constructions and those due to the pragmatically or grammatically controlled phenomenon of “pro-drop,” where an argument position in certain languages (e.g., Chinese, Japanese, Italian) is not filled by a pronoun as in some other languages (e.g., English, French). This problem is particularly acute in dealing with many Asian languages, including Chinese, demanding a more detailed examination of the various gaps involved in the discussion on Mandarin in section 2.3.

  • 6. The discussions here and below owe a great deal to Yuki-Shige Tamura, who also provided all the Central Alaskan Yup’ik examples in this article.

  • 7. Cf: Be dazzled tonight by Cirque Diablo at The Lexington Opera House!

  • 8. The definition of subject here in terms of the union of S and A agrees with Dixon’s (1979), but it differs from his stipulation that the subject is “a universal deep structure category” (p. 109). On the other hand, the position advocated in this article does not license researchers to cavalierly declare that a given language does not have a subject without demonstrating that there is no phenomenon in the language that is oriented toward the {S, A} grouping of argument types. See the relevant discussion on the Chinese subject in section 2.3.

  • 9. There are languages, e.g., the Pomoan language Eastern Pomo, the Caucasian language Bats, the Arawak language Baniwa, that split Ss into two groups, with an agentive Sa being treated like A and a patientive Sp like P (see Dixon, 1979). The nature of the grammatical relations of this type of language, known as the “active type,” must be carefully ascertained. Not showing the {S, A} or the {S, P} (morphological) grouping of arguments does not mean that there are no grammatical relations; there might be syntactic phenomena defined in terms of these groupings, or there might be other types of grammatical relations, e.g., those defined in terms of the {Sa, A} and the {Sp, P} grouping.

  • 10. In recent literature dealing with Western Malayo-Polynesian languages, AF and PF constructions and others are identified as AV (actor voice) and UV (patient/undergoer voice), etc., as if these are voice constructions. See the relevant discussion later in this section.

  • 11. McKaughan (1973) apologized for calling the Maranao equivalents of the Tagalog ang- phrases “topic” and confusing the issue, which, he thought, could have been avoided had he used the term “subject.”

  • 12. See Shibatani (1991) on Cebuano, and Lin (2010) on Tsou and other Formosan languages for similar patterns of distribution of the subject properties.

  • 13. Thanks are due to the local collaborators of this project: Junko Hiyama of Tun Hussein Onn University of Malaysia, Johor, Malaysia, and Bambang Kaswanti Purwo of Atma Jaya Catholic University Indonesia, Jakarta, Indonesia.

  • 14. The OBJ relation is needed in these languages in accounting for the AF/PF patterning and applicativization that turns oblique arguments (e.g., locative and instrumental arguments) into objects, which can then be made the topic of a PF construction just like an ordinary object.

  • 15. Arka (2019), still maintaining the treatment of focus alternations as a voice phenomenon (see figure 2 and the pertinent discussion), avoids the term “subject” in the description of Balinese grammatical relations, opting for the term “Pivot.” He recognizes different types of Pivot; for example, Pivot (S/A) (e.g., for the raising phenomenon) and Pivot (S/P) (e.g., for relativization). Under this approach, the raising and the want-control phenomenon are described as involving “S Pivot” and “A Pivot” (pp. 265, 266), and relativization as involving “S Pivot” and “P Pivot” (p. 264). Confronted with situations like this, the linguist’s task is to capture and state the generalizations that the relevant phenomena show, rather than simply listing the elements involved. Rather than stating that “/i/, /e/, /ɛ/, and /æ/ palatalize the preceding /t/, /d/, /s/, and /z/,” a more revealing description would capture the generalization displayed by the phenomenon, e.g., as “front vowels palatalize the preceding coronal consonants.” Grammatical relations such as “subject,” “absolutive,” and “topic” are posited precisely as a way of capturing this kind of generalization.

  • 16. Identifying the Austronesian topic with the absolutive of ergative languages at the SYNTACTIC level, advocated by those who analyze the relevant languages as ergative, is also misguided for the reasons discussed here.

  • 17. Observe the changes in the “a/the” article choice in the translations for the Tagalog examples in (13).

  • 18. In some Ryukyuan languages (sister languages of Japanese), the nominative and the accusative marker are retained even when they are made a topic.

  • 19. Japanese topics also play a role in certain syntactic phenomena (Shibatani, 2018). See section 2.3, where the Chinese topic is discussed.

  • 20. See Andrews (1985/2007b), who also proposes to treat the grammatical relations in the Philippine-type languages as a “superposed” system.

  • 21. If a language does not have semantico-syntactic relations, semantico-syntactic roles may be directly linked to the topic relation, if it has this pragmatically defined relation.

  • 22. Without justification, Li and Thompson (1976, p. 46) group Indonesian and Malagasy as Subject-prominent languages along with Dyirbal, and Tagalog and another Philippine language, Ilocano, as neither Topic-prominent nor Subject-prominent languages.

  • 23. The discussion on Chinese below greatly benefited from consultations with Hideki Kimura, Professor Emeritus of Chinese Linguistics at the University of Tokyo, Japan, and Haowen Jiang, both of whom went beyond the call of duty by reaching out to larger groups of Mandarin speakers to check on the relevant data and their interpretations.

  • 24. Hideki Kimura (personal communication) points out that if a Chinese pupil ever wrote a sentence like (25a) intending to express the meaning that LaPolla had in mind, the teacher would immediately correct it to something like the one below, where the verb for “drop” is changed to a transitive one and the intended subject of the second clause is overtly expressed, indicating the difficulty in having a gap in reference to the object of the first clause.

    Nèi gè rén bǎ xīguā shuāi zài dìshàng, xīguā suì le.

    that CLF person BA watermelon drop.TR LOC ground watermelon burst ASP

    ‘That man dropped the watermelon on the ground, (and) the watermelon burst.’

  • 25. Thanks are due to Haowen Jiang and Hideki Kimura for undertaking these tasks.

  • 26. There are ergative languages that relativize a subject of a transitive sentence, e.g., Basque (Hualde & Ortiz de Urbina, 2003) and the Australian language Kaititj (Hale, 1976).

  • 27. The pronoun ai glossed as ‘there’ in examples (d) and (e) occurs after a preposition and can refer to inanimate objects, humans, and locations.

  • 28. See Keenan and Comrie’s (1977, pp. 63–64) definition of a restrictive RC: “We consider any syntactic object to be an RC if it specifies a set of objects (perhaps a one-member set) in two steps: a larger set is specified, called the domain of relativization, and then restricted to some subset of which a certain sentence, the restricting sentence, is true. The domain of relativization is expressed in surface structure by the head NP, and the restricting sentence by the restricting clause, which may look more or less like a surface sentence depending on the language.”

  • 29. An informed native speaker of Bolivian Quechua consulted would rather use the externally headed RC (23a) in answering the question “Which chicken are you eating?”

  • 30. Denotation is a relationship between a linguistic structure and a mental representation of a meaning/concept. Reference is a speech act of identifying some discourse entity in terms of a nominal expression.

  • 31. See Starosta, Pawley, and Reid (1982) for a hypothesis on the rise of the predication function of nominalizations in Austronesian languages.

  • 32. The articles marking grammatical nominalizations differ slightly from the ones—presumably the historically older ones—that mark ordinary nouns. See Shibatani (2020) for further discussions on German relative clauses.

  • 33. As for the form You should marry [who loves you] in figure 6, Modern English prefers to mark the NP use/referential function of an argument nominalization by the NP-use marker one, as in You should marry one [who loves you]. This is a widely observed phenomenon where an NP use of grammatical nominalizations is marked by an NP-use marker, which may be a determiner, a classifier, a particle of unknown origin (e.g., Japanese no, Korean kes), or some other item grammaticalized from nouns of general meaning such as “person” and “thing” (see Shibatani, 2019).