Conversation Analysis

Jack Sidnell, University of Toronto


Conversation analysis is an approach to the study of social interaction and talk-in-interaction that, although rooted in the sociological study of everyday life, has exerted significant influence across the humanities and social sciences, including linguistics. Drawing on recordings (both audio and video) of naturalistic interaction (unscripted, non-elicited, etc.), conversation analysts attempt to describe the stable practices and underlying normative organizations of interaction by moving back and forth between the close study of singular instances and the analysis of patterns exhibited across collections of cases. Four important domains of research within conversation analysis are turn-taking, repair, action formation and ascription, and action sequencing.


  • Pragmatics
  • Sociolinguistics


Conversation analysis (CA) is an approach to the study of social interaction that emerged through the collaborative research of Harvey Sacks, Emanuel Schegloff, Gail Jefferson, and their students in the 1960s and early 1970s. In 1974, Sacks, Schegloff, and Jefferson published a landmark paper in Language titled, “A Simplest Systematics for the Organization of Turn-Taking for Conversation.” Not only did this paper lay out an account of turn-taking in conversation and provide a detailed exemplification of the conversation analytic method, it also articulated with concerns in linguistics and brought CA to the attention of linguists and others engaged in the scientific study of language. The paper remains the most cited and the most downloaded paper ever published in the history of the journal (Joseph, 2003, see also Google citation index). Since the publication of the turn-taking paper, researchers in this area have continued to identify ways in which the study of conversation and social interaction relates to the concerns of linguistic science.

Interaction as the Home of Language

An underlying, guiding assumption of research in conversation analysis is that the home environment of language is co-present interaction and that its structure is in some basic ways adapted to that environment. This distinguishes CA from much of linguistic science, which generally understands language to have its home in the human mind and to reflect in its structure the organization of mind. For the most part these can be seen as complementary rather than opposed perspectives (depending, perhaps, on the model of mind involved). Language is both a cognitive and an interactional phenomenon, and its organization must certainly reflect this fact.

What do we mean by interaction or co-present interaction? Goffman (who supervised the PhD studies of both Sacks and Schegloff) described interaction as a normatively organized structure of attention (see inter alia 1957, 1964)—when people interact they are, however fleetingly, attending to one another’s attention. While drawing on these and other ideas from Goffman, conversation analysts tend to emphasize the fact that interaction is the arena for human action. In order to accomplish the business of everyday life—for instance checking to see that a neighbor received the newspaper, updating a friend about a recent event, asking for a ride to work—we interact with one another. Conversation analysis seeks to discover and describe (formally and in a rigorous, generalizable way) the underlying norms and practices that make interaction the orderly thing that it is. For instance, one fundamental aspect of the orderliness of interaction has to do with the distribution of opportunities to participate in it. How, that is, does a participant determine when it is her turn to speak, or her turn to listen? Another aspect of orderliness concerns the apparatus for addressing problems of hearing, speaking, or understanding. How, that is, do participants in conversation remedy problems that inevitably arise in the course of interaction and how do they do this in an effective yet efficient way, such that they are able to resume whatever activity they were engaged in before the trouble arose? A third aspect of orderliness has to do with the way in which speakers produce, and recipients understand, stretches of talk so as to constitute them as actions by which they can achieve their interactional goals. A final aspect of the orderliness of interaction has to do with the way these actions are organized into sequences in such a way as to construct an architecture of intersubjectivity—a basis for mutual understanding in conversation.
Each of these four domains of conversational organization will be briefly sketched out and ways in which research in each area connects with the concerns of linguists and other scholars of language will be highlighted.

Turn-Taking


We can begin by noting, as the authors of Sacks et al. (1974) do, that there are various ways in which turn-taking for conversation (and indeed the distribution of opportunities to participate in interaction more generally) could be organized. For instance, turns could be pre-allocated so that every potential participant was entitled to talk for two minutes and the order of speakers was decided in advance (by their age, gender, status, first initial, height, weight, etc.). There are speech exchange systems (as Sacks et al., 1974 calls them) that operate more or less in this way, such as debate. But there are reasons that such a system would not work for conversation. If, for instance, we imagine that in such a system participants A, B, C, D each get an opportunity to talk and in that order, what will happen if B asks A a question? B now has to wait for C and D to speak before A can answer. But what if C and D also ask A a question? Or what if D does not hear the question that B has asked and so on? Of course, although this kind of pre-allocated system obviously won’t work for conversation, there are many other ways in which turn-taking might be organized (and, indeed, is organized for activities other than conversation). We need not review all the possibilities here. We can already see, in light of these considerations and common sense, that turn-taking for conversation must be organized locally, by the participants themselves. As Sacks et al. (1974) puts it, turn-taking in conversation is “locally managed, party-administered, interactionally controlled.”

The model these authors describe has two components and a set of “rules” that coordinate their operation. The “turn constructional component” determines the shape and extent of possible turns by specifying a sharply delimited set of units from which turns can be composed. Specifically, in English, turn constructional units (TCUs) can be lexical items, phrases, clauses, and sentences. In the following case, Shelley’s declaratively formatted question at line 01, “you were at the Halloween thing.” is a sentential TCU while her “the Halloween party” at line 03 is a phrasal TCU. Debbie’s turns at lines 02 and 04 are lexical TCUs.

(1) Debbie & Shelley

Instances of these TCUs “allow a projection of the unit-type under way, and what, roughly, it will take for an instance of that unit-type to be completed” (Sacks et al., 1974, p. 702). This feature of projectability allows a recipient to anticipate possible completion of the current TCU and to target this “point of possible completion” as a place to begin his or her own talk. We can see how this works in example (1). Debbie is able to position her talk at line 02 so that it begins just as Shelley reaches possible completion and, in the case of line 04, just before Shelley reaches possible completion. As Sacks et al. write “we find sequentially appropriate starts by next speakers after turns composed of single-word, single-phrase, or single-clause constructions, with no gap—i.e., with no waiting for possible sentence completion” (Sacks et al., 1974, p. 702). The precise timing of these starts thus provides evidence for the projectability of possible completion of a TCU. At the same time the fact that participants target these points as appropriate places to begin their own talk indicates that such points are treated as transition-relevant. Points of possible completion constitute transition relevance places (TRPs), which are, as Schegloff (1992, p. 116) puts it, “discrete places in the developing course of a speaker’s talk ( . . . ) at which ending the turn or continuing it, transfer of the turn or its retention become relevant.”

The “turn allocation component” specifies techniques by which turns are allocated among parties to a conversation. For current purposes the most important of these techniques are those by which a current speaker selects a next speaker. A basic technique in this respect involves combining an address term (or other method of address such as directed gaze) with a sequence-initiating action such as a question, request, invitation, complaint, and so on. Consider (2), in which Michael and Nancy are guests for dinner at the home of Shane and Vivian. In the fragment below, Michael addresses his talk to Nancy by using her name (or a short form of it) and produces a question that is also a request. In this way he selects her to speak next, which she does at line 03.

(2) Chicken Dinner p. 3 (Address term)

According to Sacks et al. (1974), a set of rules coordinates the use of the turn constructional and turn allocation components. These rules apply at the first transition relevance place of any turn.

Rule 1 (C = current speaker, N = next speaker):

(a) If C selects N in current turn, then C must stop speaking, and N must speak next, transition occurring at the first possible completion after N-selection.

(b) If C does not select N, then any party (other than C) may self-select at a first point of possible completion, first speaker gaining rights to the next turn.

(c) If C has not selected N, and no other party self-selects under option (b), then C may (but need not) continue (i.e., claim rights to a further TCU).

Rule 2 applies at all subsequent TRPs:

When Rule 1(c) has been applied by C, then at the next TRP Rules 1 (a)–(c) apply, and recursively at the next TRP, until speaker change is effected.
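
Because the rules are stated as an ordered set of options, they can be read as a small decision procedure. The following Python sketch is one possible rendering, offered only as an illustration and not as part of the account in Sacks et al. (1974). The names `TRPState`, `apply_rule_1`, and `apply_rules` are hypothetical, as is the outcome label `"no speaker"` for the case in which C declines to continue, and each transition relevance place is idealized as a snapshot of who has been selected and who starts up.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class TRPState:
    """Idealized snapshot of one transition relevance place (hypothetical type)."""
    current: str                   # C, the current speaker
    selected_next: Optional[str]   # N, if C has selected a next speaker
    self_selectors: Sequence[str]  # parties who start up, in order of starting
    current_continues: bool        # whether C opts to produce a further TCU


def apply_rule_1(state: TRPState) -> Tuple[Optional[str], str]:
    """Apply Rule 1 at a single TRP; return (next speaker, option applied)."""
    # Rule 1(a): C has selected N, so N must speak next.
    if state.selected_next is not None:
        return state.selected_next, "1a"
    # Rule 1(b): otherwise the first party other than C to self-select
    # gains rights to the next turn.
    for party in state.self_selectors:
        if party != state.current:
            return party, "1b"
    # Rule 1(c): otherwise C may, but need not, continue.
    if state.current_continues:
        return state.current, "1c"
    # Assumption of this sketch: nobody speaks at this TRP.
    return None, "no speaker"


def apply_rules(trps: Sequence[TRPState]) -> List[Tuple[Optional[str], str]]:
    """Rule 2: after option (c), re-apply Rule 1 at each next TRP
    until something other than option (c) occurs."""
    trace = []
    for state in trps:
        outcome = apply_rule_1(state)
        trace.append(outcome)
        if outcome[1] != "1c":
            break
    return trace


# Invented three-party illustration: at the first TRP nobody starts up and
# A continues; at the second TRP, C starts up before B and gains the turn.
trace = apply_rules([
    TRPState("A", None, [], True),
    TRPState("A", None, ["C", "B"], False),
])
print(trace)  # [('A', '1c'), ('C', '1b')]
```

The sketch treats the options as strictly ordered, so that selection by the current speaker takes precedence over self-selection. It leaves out everything that makes the system work in real time, in particular the projection of possible completion on which the timing of next turns depends.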

This “simplest systematics” allows us to see how turn-taking in ordinary conversation is accomplished in such a way as to minimize both gap and overlap. It also allows us to see why (and to predict where) many cases of overlap occur. Consider (3).

(3) Parky

Here Tourist’s turn at line 01, being formatted as a polar interrogative, selects some other party who is knowledgeable about the park. Parky is the first to respond and his answer is precision timed to begin at just the point where Tourist’s turn reaches completion. Parky’s turn does not select a next speaker and, after a delay of one second, Old man self-selects, elaborating the answer that Parky has provided. Parky apparently means to agree with this elaboration and produces a turn (line 06) that is, again, precisely timed to begin at just the point that Old man reaches possible completion with no gap and no overlap. However, we can see that the talk at line 06 is in fact Parky’s third attempt to articulate the agreement. What is important to see for present purposes is that the first two attempts to self-select actually target points of possible, though not actual, completion within the emerging course of Old man’s turn. That is to say, “Th’ Funfair changed it” is in fact a possibly complete turn in this context, as is “Th’Funfair changed it’n ahful lot.” This example, which is in no way unusual, provides clear evidence that Parky is able to parse the talk as it emerges so as to project points of possible completion within it and thus be prepared to begin his own turn at just these places. Overlap of the kind produced here provides further evidence of the projectability of possible completion and, moreover, of the fact that participants orient to such possible completion as transition relevant.

Two implications follow from what has been said so far. First, the turn-taking system for conversation operates over only two turn constructional units at a time: current and next. Second, a current speaker is initially entitled to produce only one TCU, and at the first point of possible completion transition to a next speaker becomes a relevant possibility. Thus, if a current speaker is to talk for more than one TCU, some effort to secure additional opportunity will have to be made. One set of practices involves foreclosing the possibility of another self-selecting at possible completion by, for instance, reducing the extent and recognizability of that point of possible completion. Another practice involves issuing a bid to produce a longer stretch of talk. If the other participants buy in and provide a go-ahead response to such a bid, the result is to effectively suspend the association between possible completion and transition relevance for the duration of the telling. So, for example, when speakers produce stories they often begin with a short sequence in which a bid is made with “Guess what happened to me today?” and a recipient responds with “What.” etc. (see Sidnell, 2010). Another implication of the foregoing discussion is that the turn-taking mechanism for ordinary conversation is, as Goodwin (1979) writes, “coercive” rather than “permissive.” A number of other models of turn-taking propose that speakers employ “turn-ending signals” or “completion cues,” and that a listener must wait to hear one of these cues before beginning his or her own talk. Such a system would be “permissive” in that it would allow a current speaker to continue talk as long as he or she wished. But the system described by Sacks et al. (1974) is not like this. Rather it is “spring-loaded” with a number of pressures encouraging shorter turns, most important the fact that a current speaker is initially entitled to produce only a single TCU.

This analysis of turn-taking draws upon basic ideas about language structure. For instance, in their description of the turn-constructional component, the authors of Sacks et al. (1974) suggest that grammar plays a key role in determining what can count as a possible TCU—these are lexical items, phrases, clauses, sentences. Subsequent researchers have developed these ideas and have sought to determine the relative role of intonation, prosody, grammar, and pragmatics in shaping possible completion (see Ford, Fox, & Thompson, 1996). Other research has addressed the question of whether the turn-taking system described by Sacks et al. (1974) applies to English only or, rather, applies generally to all languages (see, e.g., Sidnell, 2001). Stivers et al. (2009) draws on a sample of 10 languages, showing that there was clear evidence in all of them for a general avoidance of overlapping talk and a minimization of silence between conversational turns. Focusing on transitions between Yes-No (or polar) questions and their responses, Stivers et al. provides evidence that in all the languages they compared the same factors account for the variation in speed of response. Answers were produced with significantly less delay than non-answer responses. Within the set of answers, those that were confirmations were delivered with less delay than those that were disconfirmations. When a response included a visible (nonverbal) component this was produced with less delay than those responses without. Finally, in 9 of the 10 languages studied, responses were delivered faster if the speaker was looking at the recipient while asking the question. This study then also provides strong evidence that turn-taking for conversation is organized in ways that are independent of the language being spoken.

Repair


A second important area of research within conversation analysis concerns the systematically organized set of practices of “repair” that participants use to address troubles of speaking, hearing, and understanding. Episodes of repair are composed of parts (Schegloff, 1997; Schegloff, Jefferson, & Sacks, 1977). A repair initiation marks a “possible disjunction with the immediately preceding talk,” while a repair outcome results either in a “solution or abandonment of the problem” (Schegloff, 2000, p. 207). That problem, the particular segment of talk to which the repair is addressed, is termed the “trouble source” or “repairable.”

Repair can be initiated either by the speaker of the repairable item or by some other participant (e.g., the recipient). Likewise the repair itself can be done either by the speaker of the trouble source or someone else. In describing the organization of repair it is usual to use the term “self” for the speaker of the trouble source and “other” for any other participant. Thus we can identify cases of self-initiated, self-repair (see [4]), other-initiated, self-repair (see [5]) and self-initiated, other-repair, etc. In these examples, the arrow labeled (a) indicates the position of the repairable item or “trouble source,” the arrow labeled (b) indicates the position of the repair initiator, and the arrow labeled (c) indicates the position of the repair or correction.

(4) XTR (1.2)

(5) NB 1.1

We can immediately see that the components of the repair episode (a, b, c) cluster in one turn in (4), whereas in (5) they are distributed across a sequence of three turns. In (4) we see that B initiates repair with a cut-off on “Fri:-” and then subsequently provides the repair by replacing what was presumably going to be “on Friday” with “on Sunday.” Several other observations are that the word to be replaced is framed by repeated material (“on”), and that the problem is pre-monitored by delay (“ah” in line 02). In (5) when Guy asks for the first time “Is Cliff dow:n by any chance?=do you know?” Jon responds not with an answer to the question (which it can be observed he knows) but rather with “↑Ha:h?” thereby indicating trouble with some aspect of what Guy has said and initiating repair of the prior turn. In response, Guy re-asks the question, hesitating slightly before substituting “Brown” for “Cliff” (a surname for a first name). At line 06 Jon answers the question, affirmatively, saying “Yeah he’s down.” (“down” here refers to being at the beach rather than in town).
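
The two-way distinction between who initiates repair and who carries it out can be made explicit in a few lines of Python. This is only an illustrative sketch: the function name `classify_repair` and its three arguments are hypothetical, and the speaker labels are taken from the descriptions of (4) and (5) above.

```python
def classify_repair(trouble_source_speaker: str, initiator: str, repairer: str) -> str:
    """Label a repair episode by who initiates it and who carries it out.

    "self" is the speaker of the trouble source; "other" is any other participant.
    """
    initiation = "self" if initiator == trouble_source_speaker else "other"
    outcome = "self" if repairer == trouble_source_speaker else "other"
    return f"{initiation}-initiated {outcome}-repair"


# (4): B produces the trouble source, initiates repair, and does the repair.
print(classify_repair("B", "B", "B"))        # self-initiated self-repair
# (5): Guy produces the trouble source, Jon initiates, Guy does the repair.
print(classify_repair("Guy", "Jon", "Guy"))  # other-initiated self-repair
```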

When repair is initiated by a participant other than the speaker of the trouble source, this is typically done in the turn subsequent to that which contains the trouble source by one of the available next-turn-repair-initiators (NTRI). The various NTRIs “have a natural ordering, based on their relative strength or power on such parameters as their capacity to locate a repairable” (Schegloff et al., 1977, p. 369). At one end of the scale, NTRIs such as what? and huh? indicate only that a recipient has detected some trouble in the previous turn; they do not locate any particular repairable component within that turn. Question words such as who, where, when are more specific in that they indicate what part of speech is repairable (e.g., who—a person referring noun phrase, etc.). The power of such question words to locate trouble in a previous turn is increased when appended to a partial repeat. Repair may also be initiated by a partial repeat without any question word.
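
The ordering just described can be sketched as an ordinal scale. The Python below is a toy illustration only: the names `NTRIStrength` and `ntri_strength`, the word lists, and the example utterances are all invented for the purpose, and the crude word matching stands in for what is, in practice, a matter of analysis. Because this passage does not place the bare partial repeat on the scale, the sketch leaves that format unclassified.

```python
from enum import IntEnum


class NTRIStrength(IntEnum):
    """Ordinal scale: higher values locate the repairable more precisely."""
    OPEN_CLASS = 1                         # "what?", "huh?"
    QUESTION_WORD = 2                      # "who", "where", "when"
    PARTIAL_REPEAT_PLUS_QUESTION_WORD = 3  # question word appended to a partial repeat


OPEN_CLASS_FORMS = {"what", "huh"}          # illustrative, not exhaustive
QUESTION_WORDS = {"who", "where", "when"}   # illustrative, not exhaustive


def _words(utterance: str) -> list:
    """Lower-case the utterance and strip simple punctuation from each word."""
    stripped = (w.strip("?.,!").lower() for w in utterance.split())
    return [w for w in stripped if w]


def ntri_strength(initiator: str, prior_turn: str) -> NTRIStrength:
    """Place a repair initiator on the scale, relative to the prior turn."""
    words = _words(initiator)
    prior = set(_words(prior_turn))
    if len(words) == 1 and words[0] in OPEN_CLASS_FORMS:
        return NTRIStrength.OPEN_CLASS
    has_question_word = any(w in QUESTION_WORDS for w in words)
    repeats_prior = any(w in prior for w in words if w not in QUESTION_WORDS)
    if has_question_word and repeats_prior:
        return NTRIStrength.PARTIAL_REPEAT_PLUS_QUESTION_WORD
    if has_question_word:
        return NTRIStrength.QUESTION_WORD
    raise ValueError("format not covered by this sketch")


prior = "I saw him at the market"  # invented trouble-source turn
for form in ["Huh?", "Who?", "At the where?"]:
    print(form, ntri_strength(form, prior).name)
```

Using an `IntEnum` means the formats can be compared directly, which mirrors the claim that they form a scale of increasing power to locate the repairable.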

Recent research has sought to describe the linguistic practices and resources used in initiating repair from a cross-linguistic, comparative perspective. Fox, Hayashi, and Jasperson (1996) notes differences between self-repair in English and Japanese and links these to the different “syntactic practices” of the two languages. The authors of Hayashi and Hayano (2013) describe a particular format used in Japanese conversation, which they term “proffering an insertable element” (PIE), in which a next speaker articulates a candidate understanding of the prior utterance, but does so with an item that is understood to be inserted into rather than appended onto the preceding turn. In a comparison of a diverse set of languages, Dingemanse, Blythe, and Dirksmeyer (2014) describes various formats for other-initiation of repair, suggesting that “different languages make available a wide but remarkably similar range of linguistic resources for this function,” and noting that repair initiation formats are adapted to deal with different contingencies of trouble in interaction. Specifically, repair initiation formats respond to the problems of characterizing the trouble encountered, managing responsibility for the trouble, and displaying their speaker’s understanding of the distribution of knowledge. Thus a form such as “huh?” indicates trouble but does not characterize it, includes no on-record position with respect to responsibility for the trouble, and also claims no knowledge of what has been said. In contrast, a repair initiation format such as “you mean the one around the corner?” locates (e.g., the expression “the coffee stand”) and characterizes (as a problem of reference or understanding) the trouble. Although such a format again includes no explicit indication of which participant is responsible for the trouble, it nevertheless suggests that the one initiating repair takes responsibility for finding a solution. And, finally, by displaying a candidate understanding of what has been said, it thereby shows that its speaker is knowledgeable in this respect (and has heard what was said).

Action in Interaction

A basic question addressed by research within linguistic pragmatics concerns how saying something can count as doing something. Much of the work in this area has drawn on the ideas of John Searle and others who have argued for a solution to the problem based on a theory of speech acts. While there are different versions of the theory, some common assumptions seem to be that actions are relatively discrete and can therefore be classified or categorized. Applied to interaction, the theory suggests that recipients listen for cues (or clues) that allow for the identification of whatever act the talk is meant to be doing (e.g., greeting, complaining, requesting, inviting). Moreover, the theory seems to presume a closed set or inventory of actions that are cued by a delimited range of linguistic devices. On this formulation, the basic problem to be accounted for by scholars of interaction is how participants are able to recognize so quickly what action is being done (see Levinson, 2012). As we have already seen, participants in interaction are able to respond to prior turns with no waiting, no gap, and so on (indeed they routinely respond in overlap). Operating with the standard assumptions of psycholinguistics (i.e., that speech recognition and language comprehension require “processing time,” that speech production requires “planning time,” and so on), this creates something of a mystery—how are participants able not only to parse the turn at talk into TCUs (and thereby anticipate points of possible completion), but also to recognize what action is being done in and through those TCUs, and somehow be prepared to respond to that action with little or no latency (indeed, in cases of overlapped response, with less than zero latency)?

Sidnell and Enfield (2014) offers a critique of the underlying assumptions of speech act theory as applied to action in interaction, describing it as a “binning” approach,

in which the central problem is taken to involve recipients of talk (or other participants) sorting the stream of interactional conduct into the appropriate categories or bins. . . . These accounts appear to involve a presumption about the psychological reality of action types that is somewhat akin to the psychological reality of phonemes. . . . That is, for the binning account to be correct, there must be an inventory of actions just as there is a set of phonemes in a language. Each token bit of conduct would be put into an appropriate pre-existing action-type category. The binning approach thus also suggests that it would be reasonable to ask how many actions there are. But we think that to ask how many actions there are is more like asking how many sentences there are.

An alternative account treats “action” as, always, a formulation or a construal of some configuration of practices in interaction. For the most part, formulations are not required to ensure the orderly flow of interaction. Participants respond on the fly and infer what a speaker is doing from a broad range of evidence. However, on occasion (such as in some cases of reported speech and in some cases of third position repair), a speaker formulates, using the vernacular metalinguistic terms available to her, the action that she or another participant is understood to have accomplished (e.g., “I requested that he get off the table!,” “I’m not asking you to come down, I’m just saying you’re welcome if you want,” etc.). And, of course, in various kinds of post hoc reporting contexts and in scholarly analysis, persons outside of an interaction routinely formulate the actions that were done within it. So an alternative to the binning or speech act account is one in which producing an “action” (in quotation marks to indicate that this is merely a heuristic use of the word) involves putting together, configuring, or orchestrating a range of distinct practices of conduct to allow for the inference that the speaker is doing “x” or “y” where “x” or “y” are possible formulations or descriptions.

It is often suggested by conversation analysts that there is no necessary “one-to-one mapping” between a given practice of speaking (e.g., “do you want me to come over and get her?”) and some specific action (such as “an offer”), and this is usually taken to imply a many-to-one relation running in both directions; that is, there are multiple practices available to accomplish any given action, and any given practice can, in context, be understood to accomplish a range of different actions (see, e.g., Schegloff, 1997; Sidnell, 2010). But, while this is no doubt true (insofar as the terms in which it formulates the problem are adequate, e.g., “context,” “an action,” etc.), matters are a good deal more complicated than this, because any determination of “what a speaker is doing” is an inference from a complex putting together of distinct practices of composition and positioning.

Levinson (2012), puzzled as to how recipients are seemingly able to determine what action is being done so early on in the production of a turn and somehow able to respond without delay, distinguishes two major types of information that can be gleaned from a turn-at-talk. On the one hand there is the “front-loaded” information of prosody (e.g., pitch reset), gaze, and turn-initial tokens (such as “oh,” “look,” “well,” and so on) that can potentially tip off the recipient as to what is being done. On the other hand there is the detailed linguistic information that is revealed only as the turn-at-talk unfolds. This includes much of the information available through grammatical formatting (e.g., morphological cues, syntactic inversion, imperative forms, etc.), as well as through richly informative linguistic formulations (e.g., “the deal,” “my boss,” “stupid trial thing,” etc.). While Levinson thus recognizes that the passage from a turn-at-talk to “action” involves a recipient putting together various strands of evidence, he argues that the solution must involve a delimited inventory of actions, recognition of which these practices, solely or in combination, are able to trigger. Alternatively, Sidnell and Enfield (2014) argue that a model involving inference from a complex set of features implies an inevitable degree of indeterminacy in action ascription, which is always merely an inference from evidence. For the most part, participants in interaction get along just fine: such inference-based action ascriptions are good enough for all practical purposes and, because no formulation is typically required, problems typically do not arise.

It is well established in CA that one can look to subsequent turns in order to ground an analysis of previous ones—this is called the “next turn proof procedure” (Sacks et al., 1974). In the analysis of single cases we can ground our analysis of some turn as, for instance, an “accusation” by looking to see how the recipient responds to it (e.g., with an excuse or justification). Sacks et al. (1974) proposes along these lines that:

while understandings of other turns’ talk are displayed to co-participants, they are available as well to professional analysts, who are thereby afforded a proof criterion . . . for the analysis of what a turn’s talk is occupied with. Since it is the parties’ understandings of prior turns’ talk that is relevant to their construction of next turns, it is their understandings that are wanted for analysis. The display of those understandings in the talk of subsequent turns affords both a resource for the analysis of prior turns and a proof procedure for professional analyses of prior turns—resources intrinsic to the data themselves.

This “data-internal evidence” is used, for instance, to ground the claim that when Debbie says “what is the deal” in line 15 of example (6), which comes from the opening of a telephone call, she is not simply asking a question but is, in doing so, accusing Shelley of wrong-doing:

(6) Debbie and Shelley

“What is the deal” is hearable as an accusation, as conveying that Shelley has done or is otherwise responsible for something that Debbie is unhappy about. What aspects of the talk convey that? First, the positioning of the question, pre-empting “how are you” type inquiries, provides for a hearing of this as “abrupt” and in some sense interruptive of the usual niceties with which a call’s opening is typically occupied (e.g., “how are you?”). Second, by posing a question that requires Shelley to figure out what is meant by “the deal,” Debbie thereby suggests that Shelley should already know what she is talking about and thus that there is something in the “common ground,” something to which both Debbie and Shelley are already attending (have “on their minds”). Third, by selecting the idiom “the deal” Debbie reveals her stance toward what she is talking about as “a problem” or as something that she is not happy about. Fourth, with the prosody, including the stress on “is” so that it is not contracted, the emphasis on “dea::l.”, and the apparent pitch reset with which the turn begins, Debbie conveys heightened emotional involvement. Putting all this together we can hear in what Debbie says here something other than a simple request for information—this is an accusation. It seems clear that Debbie is upset and the implication is that Shelley is responsible for this. But how can we ground the analysis of the turn in question in the displayed orientations of the participants themselves? To do this we look to Shelley’s response.

That Shelley hears in this more than a simple question is evidenced first by her plea of innocence with “whadayou ↑mean.” and secondly by her excuse. All other-initiations of repair indicate that the speaker has encountered a trouble of hearing or understanding in the previous turn. Among these “What do you mean” appears specifically adapted to indicate a problem of understanding based on presuppositions about common ground (Hayashi, Raymond, & Sidnell, 2013). Here “what do you mean?,” which is produced with a noticeably higher pitch, suggests Shelley does not understand what Debbie means by the clearly allusive, in-the-know expression, “the deal.” More narrowly, it conveys that the expression “what is the deal” has asked Shelley to search for a possible problem that she is perhaps responsible for, and that no such problem can be identified. It is thus hearable as claiming “innocence.”

When Debbie redoes the question, in response to the initiation of repair by “What do you mean,” she does it with a yes-no (polar) question that strongly suggests she already knows the answer. “You’re not going to go” is what Pomerantz (1988) calls a candidate answer question that presents, in a declarative format, to Shelley what Debbie suspects is the answer, and requests confirmation of this. This then reveals the problem that Debbie had in mind and meant to refer to by “the deal.” And when Shelley responds to the repaired question she does so with what is recognizable as an excuse. This is a “type non-conforming” response (i.e., one that contains no “yes” or “no” token; see Raymond, 2003), in which Shelley pushes the responsibility for not going (which is implied, not stated) onto “her boss” (invoking the undeniable obligations of work in the district attorney’s office), and suggesting that the obstacle here is an inconvenience for her (as well as for Debbie) by characterizing the impediment to her participation as a “stupid trial-thing.”

As the quote from Sacks et al. (1974) makes clear and as the foregoing discussion is meant to explicate, the most important data-internal evidence we have comes in subsequent talk. In the case we have considered, subsequent talk reveals how Shelley herself understood the talk that has been addressed to her because this understanding is embodied in the way she responds.

It is important to clarify what exactly is being claimed. Subsequent talk, and data-internal evidence, allow us to ground the analysis of this question, “What is the deal,” as projecting an accusation of Shelley by Debbie. They do not, however, tell us what specific features of the talk cue, convey, or carry that complaint/accusation. As the pioneers of conversation analysis demonstrated, in order to address that question, the question as to which specific features or practices provide for an understanding of what a given turn is doing, we need to look across different cases. We need to isolate these practices in order to discern their generic, context-free, cohort-independent character. So case-by-case analysis (single case analysis using data-internal evidence) inevitably leaves us with a question: specifically, what particular aspects of a turn convey (allow for an inference as to) what the speaker is doing (i.e., what action is being done)? What are the particular practices of speaking that result in that consequence? What are the generic features of the practice that are independent of this particular context, situation, group of participants, etc.?

In order to attempt an answer to these questions we have to move beyond the analysis of a single case to look at multiple instances. However, and this is the key point in the context of the present discussion, when we do this we inevitably find that each practice that is put together with others in some particular instance (to effect some particular action outcome) can be used in other ways, combined with other practices, to result in other outcomes. We can take any particular “practice” from the Debbie and Shelley case and work out from there. We can look for questions that, like Debbie’s “What is the deal?,” occur in this position, pre-empting what normatively happens in the opening turns of a telephone call. If we do this we find that some are like this one and seem to deliver or imply an accusation, but others do not. We can look at other cases in which a speaker refers to something as “the deal” or asks “what is the deal” and again find some cases in which an accusation is inferred but others in which it is not. And we can find other instances in which similar prosody is used in the formation of a question or instances in which a question is delivered with an initial pitch reset. The result is always the same: no single feature is associated with some particular outcome. The conclusion we must then draw is that “action” is an inference from a diverse set of pieces of evidence that a speaker puts together or orchestrates within a single TCU or utterance (see also Robinson, 2007).

Action Sequencing

As we have already seen, in conversation, actions are organized into sequences. The most basic form such sequences can take is as a set of two paired actions, a first and a second, known as an adjacency pair. For instance, production of a question establishes a next position within which an answer is relevant and expected next. In order to capture this aspect of organization, Schegloff (1968, p. 1083) introduced the concept of conditional relevance:

By the conditional relevance of one item on another we mean: given the first, the second is expectable; upon its occurrence it can be seen to be a second item to the first; upon its nonoccurrence it can be seen to be officially absent—all this provided by the occurrence of the first item.

Although questions are not always followed by answers, the conditional relevance that a question activates ensures that participants will inspect any talk that responds to a question to see if and how it might be an answer, or might account for why an answer is not being produced. In response to questions the most common account for not answering is “I don’t know.” So, in the following, when Guy asks Jon if a mutual acquaintance might like to go golfing with them, Jon replies with “I don’t know,” and follows up by suggesting that he “go by and see,” thereby indicating a willingness to obtain the information that has been requested.

(7) NB 1.1 1:05

So even where second speakers do not (for whatever reason) actually produce the second pair part that is called for, they typically exhibit some orientation to its relevance and often account for its non-occurrence and even, in some cases, apologize for an inability to deliver it. The same example also provides evidence that questioners orient to the conditional relevance exerted by a sequence-initiating action such as Guy’s “Think he’d like to go?” in line 07. Thus, when Jon does not answer the question posed, Guy reissues it at line 12, thereby pursuing a response.

Schegloff and Sacks (1973) identify four defining characteristics of the adjacency pair. It is composed of two utterances that are:

  • Adjacent.
  • Produced by different speakers.
  • Ordered as a first pair part (FPP) and second pair part (SPP).
  • “Typed,” so that a particular first pair part provides for the relevance of a particular second pair part (or some delimited range of seconds, e.g., a complaint can be relevantly responded to by a remedy, an excuse, a justification, a denial, and so on).
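For readers who find it useful to see the four characteristics stated operationally, the following Python sketch checks them for two turns. It is an illustration under strong assumptions, not an analytic procedure: the `Turn` class, the `RELEVANT_SECONDS` table, and the action labels are all hypothetical, and the labels are supplied by hand by an analyst. Nothing here recognizes actions, which, as argued above, participants infer from many features of a turn taken together.

```python
from dataclasses import dataclass

# Hypothetical and deliberately tiny table of pair types. Each first pair
# part (FPP) type maps to the delimited range of second pair part (SPP)
# types it makes relevant. The complaint row follows the example in the text;
# the other rows are assumptions added for illustration.
RELEVANT_SECONDS = {
    "question": {"answer"},
    "complaint": {"remedy", "excuse", "justification", "denial"},
    "invitation": {"acceptance", "declination"},
}


@dataclass(frozen=True)
class Turn:
    index: int    # position of the turn in the transcript
    speaker: str
    action: str   # analyst-assigned action label (an assumption of this sketch)


def is_adjacency_pair(first: Turn, second: Turn) -> bool:
    """Check the four characteristics for two already-labelled turns."""
    adjacent = second.index == first.index + 1
    different_speakers = first.speaker != second.speaker
    # "Ordered": the earlier turn must be a recognizable first pair part.
    ordered = first.action in RELEVANT_SECONDS
    # "Typed": the later turn must fall within the range that FPP makes relevant.
    typed = second.action in RELEVANT_SECONDS.get(first.action, set())
    return adjacent and different_speakers and ordered and typed


question = Turn(0, "Guy", "question")
print(is_adjacency_pair(question, Turn(1, "Jon", "answer")))
print(is_adjacency_pair(question, Turn(1, "Jon", "denial")))
```

The first call satisfies all four checks; the second fails only the typing check, because a denial is not among the seconds that a question makes relevant in this toy table.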

Adjacency pairs are sequences composed of only two turns—a first and second pair part. But talk-in-interaction and conversation in particular is not composed solely of paired actions, produced one after the other. Rather, an adjacency pair may be expanded so as to result in a much more complex sequence. An adjacency pair can be expanded prior to the occurrence of its first part, after the occurrence of its first part but before the occurrence of its second, or after its second pair part. These expansions are themselves often built out of paired actions and can themselves serve as the bases upon which further expansion takes place.

Pre-expansions involve an expansion of a base adjacency pair prior to the occurrence of the first pair part and are preparatory to the action the base pair part is meant to accomplish. So, for instance, a pre-invitation “hey, are you busy tonight?” checks on the availability of the recipient. A pre-request such as “You wouldn’t happen to be going my way would you?” checks on the degree of inconvenience a projected request is likely to impose, and so on. Such pre-expansions check on a condition for the successful accomplishment of the base first pair part. Consider the following phone call excerpt:

(8) HS:STI,1

Judy’s “why” at line 07 displays an orientation to the preceding turn as something more than an information-seeking question and John’s answer at lines 8–11 confirms this inference.

As just noted, an adjacency pair consists of two adjacent utterances, with the second selected from some range of possibilities defined by the first. However, on some occasions, the two utterances of an adjacency pair are not, in fact, adjacent. In some cases this is because another sequence has been inserted between the first and second pair part of an adjacency pair. Such insert expansions can be divided into post-firsts and pre-seconds (Schegloff, 2007) according to the kind of interactional relevancy they address.

Post-expansions are highly variable with respect to their complexity. Schegloff (2007) suggests that they can be divided into minimal and non-minimal types. Minimal post-expansions consist of one turn. “Oh,” for instance, can occur after the response to a question, thereby registering that the questioner has been informed by that response and minimally expanding the sequence with a single turn of post-expansion. Other forms of post-expansion are more elaborate and are addressed to a range of interactional contingencies.
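The three positions in which a base adjacency pair can be expanded can be pictured as a small recursive data structure. The Python sketch below is one possible way of doing so, offered as an aid to reading and not as a model used in conversation analysis: the `Sequence` class, its field names, and the helper `_flatten` are hypothetical, and apart from the pre-invitation quoted above, the utterances are invented for illustration and come from no transcript.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Sequence:
    """A base adjacency pair plus optional expansions in three positions."""
    first: str    # base first pair part (FPP)
    second: str   # base second pair part (SPP)
    pre: List["Expansion"] = field(default_factory=list)     # before the base FPP
    insert: List["Expansion"] = field(default_factory=list)  # between FPP and SPP
    post: List["Expansion"] = field(default_factory=list)    # after the base SPP

    def turns(self) -> List[str]:
        """Flatten the sequence into the order in which the turns occur."""
        return (
            _flatten(self.pre)
            + [self.first]
            + _flatten(self.insert)
            + [self.second]
            + _flatten(self.post)
        )


# An expansion is either a paired sub-sequence, which may itself be expanded,
# or a single turn, as in a minimal post-expansion.
Expansion = Union[Sequence, str]


def _flatten(items: List[Expansion]) -> List[str]:
    out: List[str] = []
    for item in items:
        out.extend(item.turns() if isinstance(item, Sequence) else [item])
    return out


# Invented example: an invitation with a pre-expansion, an insert expansion,
# and a one-turn (minimal) post-expansion.
invitation = Sequence(
    first="A: Want to come over for dinner?",
    second="B: Sure.",
    pre=[Sequence("A: Hey, are you busy tonight?", "B: No, why?")],
    insert=[Sequence("B: What time?", "A: Around seven.")],
    post=["A: Great."],
)

for turn in invitation.turns():
    print(turn)
```

Because each expansion may itself be a `Sequence`, the structure allows an expansion to serve as the base for further expansion, which is the property the recursion is meant to make visible.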

This brief overview of conversation analysis has discussed four domains of organization: turn-taking, repair, action formation, and action sequencing. Research in each of these four domains has consequences for our understanding of language and language structure (see Couper-Kuhlen & Selting, in press; Thompson, Fox, & Couper-Kuhlen, 2015). While work to date has drawn connections primarily between linguistics and turn-taking and repair, there are obvious ways in which the work on action and action sequencing bears on the concerns of linguistics. For instance, work on action formation intersects with research within linguistics on mood and with the analysis of speech acts. Work on action sequencing bears on problems of anaphora resolution and inter-sentential grammatical relations. In order to fully explore these and other themes, we will likely require a robustly cross-linguistic, comparative, and interdisciplinary program of research.

  • Couper-Kuhlen, E., & Selting, M. (in press). Interactional Linguistics: Studying language in social interaction. Cambridge, U.K.: Cambridge University Press.
  • Dingemanse, M., Blythe, J., & Dirksmeyer, T. (2014). Formats for other-initiation of repair across languages: An exercise in pragmatic typology. Studies in Language, 38, 5–43.
  • Ford, C. E., Fox, B. A., & Thompson, S. A. (1996). Practices in the construction of turns: The TCU revisited. Pragmatics, 6(3), 427–454.
  • Fox, B. A., Hayashi, M., & Jasperson, R. (1996). Resources and repair: A cross-linguistic study of syntax and repair. In E. Ochs, E. A. Schegloff, & S. A. Thompson (Eds.), Interaction and grammar (pp. 185–237). Cambridge, U.K.: Cambridge University Press.
  • Goffman, E. (1957). Alienation from interaction. Human Relations, 10, 47–60.
  • Goffman, E. (1964). The neglected situation. American Anthropologist, 66(6, Pt. 2), 133–136.
  • Goodwin, C. (1979). Review of Starkey Duncan Jr. and Donald W. Fiske, Face-to-face interaction: Research methods and theories. Language in Society, 8(3), 439–444.
  • Hayashi, M., & Hayano, K. (2013). Proffering insertable elements: A study of other-initiated repair in Japanese. In M. Hayashi, G. Raymond, & J. Sidnell (Eds.), Conversational repair and human understanding (pp. 293–321). Cambridge, U.K.: Cambridge University Press.
  • Hayashi, M., Raymond, G., & Sidnell, J. (2013). Conversational repair and human understanding: An introduction. In M. Hayashi, G. Raymond, & J. Sidnell (Eds.), Conversational repair and human understanding (pp. 1–40). Cambridge, U.K.: Cambridge University Press.
  • Heritage, J. (1984). A change of state token and aspects of its sequential placement. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action: Studies in conversation analysis (pp. 299–345). Cambridge, U.K.: Cambridge University Press.
  • Joseph, B. D. (2003). The Editor’s Department: Reviewing our contents. Language, 79(3), 461–463.
  • Levinson, S. C. (2013). Action formation and ascription. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 103–130). Malden, MA: Wiley-Blackwell.
  • Pomerantz, A. M. (1988). Offering a candidate answer: An information seeking strategy. Communication Monographs, 55(4), 360–373.
  • Raymond, G. (2003). Grammar and social organization: Yes/no interrogatives and the structure of responding. American Sociological Review, 68, 939–967.
  • Robinson, J. (2007). The role of numbers and statistics within conversation analysis. Communication Methods and Measures, 1(1), 65–75.
  • Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735.
  • Schegloff, E. A. (1968). Sequencing in conversational openings. American Anthropologist, 70(6), 1075–1095.
  • Schegloff, E. A. (1992). To Searle on conversation: A note in return. In H. Parret & J. Verschueren (Eds.), (On) Searle on conversation (pp. 113–128). Amsterdam: John Benjamins.
  • Schegloff, E. A. (1997). Practices and actions: Boundary cases of other-initiated repair. Discourse Processes, 23(3), 499–545.
  • Schegloff, E. A. (2000). When “others” initiate repair. Applied Linguistics, 21(2), 205–243.
  • Schegloff, E. A. (2007). Sequence organization in interaction: A primer in conversation analysis. Cambridge, U.K.: Cambridge University Press.
  • Schegloff, E. A., Jefferson, G., & Sacks, H. (1977). The preference for self-correction in the organization of repair in conversation. Language, 53(2), 361–382.
  • Schegloff, E. A., & Sacks, H. (1973). Opening up closings. Semiotica, 8, 289–327.
  • Sidnell, J. (2001). Conversational turn-taking in a Caribbean English Creole. Journal of Pragmatics, 33(8), 1263–1290.
  • Sidnell, J. (2010). Conversation analysis: An introduction. Oxford: Wiley-Blackwell.
  • Sidnell, J., & Enfield, N. J. (2014). The ontology of action in interaction. In N. J. Enfield, P. Kockelman, & J. Sidnell (Eds.), The Cambridge handbook of linguistic anthropology (pp. 423–446). Cambridge, U.K.: Cambridge University Press.
  • Stivers, T., Enfield, N. J., Brown, P., Englert, C., Hayashi, M., Heinemann, T., et al. (2009). Universals and cultural variation in turn-taking in conversation. Proceedings of the National Academy of Sciences, 106(26), 10587–10592.
  • Thompson, S., Fox, B., & Couper-Kuhlen, E. (2015). Grammar in everyday talk: Building responsive actions. Cambridge, U.K.: Cambridge University Press.