Actions That Aren't: features of speech events in textual conversation

By Julia Orth for Prof. Vesperi,
Language, Culture, and Society 18 April 2000

Historically, written language was thought to represent spoken language, and in fact to be close enough to a frozen form of a language to provide most of the information needed for the study of language. More recent thought has contested this idea heartily, and with good reason. As can easily be determined with an audio recorder, written language does not directly represent spoken language, and indeed, the two forms are structured quite differently because of their different natures and purposes. Speech is common to all human societies, whereas writing is possessed by only some. Speech is learnt first and used most often, in everyday interactions, and it almost always occurs in a context rich with paralinguistic/kinesic information--information that is provided through such means as stress, intonation, physical movement, cultural gestures, and emotional expressions (Salzmann, 1998, pp. 234-236). The instant feedback in speech situations, in which the addresser hears the speech as it is produced and sees the reactions of the addressee at the same time, makes dialogue possible. Because of its immediate, transient nature and frequent, necessary use, speech changes across space and time more quickly than does writing. Since one of the purposes of writing is in fact to overcome the transient quality of speech and to cross distances of space and time, it tends to be standardized and conservative, and to represent only one dialect of a language. Since written language is presented without the wealth of contextual information present in speech, it also tends to be carefully structured so as not to be ambiguous (Milroy & Milroy, 1991; Duranti, 1997, p. 125).

This paper proposes that there is a new form of writing, which might be characterized as written conversation, that possesses some social features otherwise confined to speech. A particular focus of this paper is the direct reference to, and use of, paralinguistic features such as actions and expressions in this written conversation. English, particularly Standard American English, is the language I am considering in this discussion, and great caution must be exercised in generalizing these thoughts to other languages.

"Written conversation" may seem to be oxymoronic, but I contend that it is an accurate description of on-line written communication in real time, which I will refer to here as "chatting." Chatting takes place on computer networks when two people can send text or chunks of text to the other's screen, at which it arrives almost immediately, and the other participant is present to reply. Chatting is a special subset of writing. I say that it is a "subset" of writing because most of the salient features of written language, as opposed to spoken language, apply. Like other writing, chatting makes use of a standardized form of the language in question (though there is more variation in spelling, grammar, and punctuation than in formal writing, this is largely not systematic or regular in the sense that dialects and accents are), permanent, and lacking natural paralinguistic/kinesic features. At the same time, chatting also has important features that set it apart from formal writing. The primary difference is that while most other writing (as poetry, books, letters, speeches, papers, presentations, &c.) is necessarily a solitary activity, with the writer seperated from the audience by time and space alike, on-line chatting is necessarily a shared activity. It is by definition communication that takes place in "real time" (meaning that when a message is sent, it arrives almost instantly) and it is not an activity that one can engage in without partners. It is, therefore, a social language act, in a way that most speech events are and most written works are not. Jakobson represents speech events as consisting of six factors, the addresser, addressee, context, message, contact, and code. He continues to describe six corresponding functions of language, emotive, conative, referential, poetic, phatic, and metalingual. 
The referential function, which describes something, the poetic function, which focuses on the form of the message, and the metalingual function, which allows language to be used in reference to itself, are all possible in normal writing. What is of interest here are the emotive function, which expresses information about the addresser's feelings and attitude, the conative function, which signals a message specifically directed at the addressee (as vocatives, imperatives), and the phatic function, which is meant to establish, prolong, or discontinue communication, as greetings, farewells, and "idle talk." (Duranti 1997, pp 284 - 286) These three functions are not usually appropriate in written language, but it seems that all three have their places in chatting. This is exemplified very clearly by the use, in chatting, of lexical items that serve these particular functions.

Greetings are lexical items of this sort, serving, among other things, a phatic function. In the interest of clarity, I will define my own approximation of this usage of the English "hello" using Wierzbicka's "natural semantic metalanguage" (Wierzbicka), a set of words representing proposed universal basic elements used to describe the social/semantic values of specific speech acts and language constructions:

Hello =
I know you are here
I want you to know that I am here
I want you to know that I know you are here.
Because we know these things, we can say things to one another (if we want)
It is good for people to know these things.

Clearly "hello" with this meaning, like other phatic signals, has no place in most writing. Since writing is utilized for communicating across time, in those circumstances "hello" cannot be an acknowledgement of the fact that one shares an environment with someone else that could potentially permit further interaction. There is no interaction to begin with, since the addresser and addressee are not sharing the communicative event in the same temporal space. I posit that when "hello" is used to open notes or letters, it has only a fraction of the meaning it has when it is used as a speech event. It serves to express pleasantness or politeness or to mark the beginning of a piece of writing, or it is used out of the habits of spoken language. In chatting language, however, "hello" possesses something very much akin to its full speech meaning. When an individual logs into a chat room s/he is given greetings such as "hello" as a way of showing that others in the same "space" or channel of communication are aware of his/her presence and their capacity to engage in social interaction with one another. In some ways this phatic signal serves an even more important function on-line, because unlike in real life, when chatting one can never tell if a person is actually present on-line unless one is receiving information from him or her. This fact makes the 'active listening' components of chatting, which can serve both emotive and phatic functions, particularly important as well. During speech acts, signals like nodding and expressions of attention or agreement ("Yes", "Right", "Oh!", "uh-huh", "Awww...", &c.) all serve to signify that the addressee is attending to what the addresser has to say, and often also to express the addressee's attitudes or feelings towards that subject. The same holds true on-line, with the added importance of the fact that it is the only way the addresser can be sure that an audience exists at all. (This urgency is countered to a degree by the fact that the written words of chatting continue to exist after they are produced, unlike the spoken word.) This too is obviously dependent on the more-or-less real-time nature of speech acts.

Here is an example of active-listening emotive/phatic signals such as "hrm" and "ack," given by the addressee ("Platypus") during a short narrative. Note that while the times between segments of text are longer than would be necessary in speech, an interactive conversation is still taking place, and timing still provides information concerning the conversational event.

[00:16] <Eclipse> Anyway, as you probably know Lucas and I were going to meet our friends Stephen and Mary from Santa Rosa for dinner today..

[00:17] <Eclipse> Well, this afternoon I got a call from Mary..

[00:18] <Eclipse> Saying that they were still in California.

[00:18] <Platypus> hmm.. heh, I didn't know, but go on..

[00:18] <Eclipse> Apparently a truck carrying nuclear waste was in an accident on the major road to the airport..

[00:18] <Eclipse> They had to shut the whole thing down, and Stephen and Mary couldn't get to the airport for their flight..

[00:18] <Platypus> ack.. :/

[00:19] <Eclipse> They called in an explained, and were put on stand-by for every flight this evening.. I haven't heard from them since..

[00:19] <Platypus> hrm..

[00:20] <Eclipse> Heh, if you're on the web and you're bored, see if you can find anything about a truck carrying nuclear waste having an accident somewhere vaguely near Santa Rosa, CA..

[00:23] <Platypus> k.. hmm.. I think Santa Rosa is somewhere near where Tia is...

(personal archives, 1999. Used with permission.)

Chatting language, like all written language, lacks the paralinguistic and kinesic features that characterize spoken communication and help to fulfill emotive and conative functions. These include intonation, volume, and stress in speech itself, as well as physical demeanor and actions, facial expressions (smiles, laughs, grimaces), and cultural gestures (shrugging, waving, nodding). Much work has been done on the significance of these signals to communication. Signals such as these are in fact so valuable to communication that they have been "transposed" after a fashion for use in chatting.

The most common social signals utilized in chatting are known as "smiley faces", and look something like this: :-) ;-) The first is an indication of friendliness, happiness, or pleasantness, and the second (a "winking" smiley face) a sign that one is teasing or flirting. These signals often seem to function, to a very limited degree, the way that one's tone of voice functions in speech. B's response in an interaction such as:

A: Oh no, I just spilled cola all over my desk!
B: That was dumb.

could easily be interpreted as serious and mean-spirited, depending on the context, whereas in an interaction such as:

A: Oh no, I just spilled cola all over my desk!
B: That was dumb. ;-)

it is clear that B is only teasing, if perhaps not appropriately, and means no harm. (That is, it is nearly as clear as it would be were the conversation taking place in real life, with B using a tone of voice or facial expression clearly associated with teasing; both cases potentially allow for misinformation.)

Perhaps more interestingly, "actions" (also known as "poses" or "emotes") are ubiquitous in chatting. This author has yet to participate in a medium for chatting or a group of chatters that does not employ some standard for conveying emotions and attitudes (emotive functions) by means of reference to physical kinesic signals. Some examples of common actions are emotional expressions, such as <smile>, <laugh>, and <blush>, and socially defined signals such as <shrug>, <nod>, and <wave>.

These are always set off from other discussion in some way, sometimes by enclosure in brokets (angle brackets), as <grin>, or in asterisks, as *giggle*. Longer descriptions of actions almost always employ the third person, as *sits on the couch and fidgets*, and shorter ones are sometimes marked this way as well: <grins>, <waves>. This is likely because a third-person description is closest to how the addressee would think of real-world kinesic information. The third person may also be another way of setting actions off from discussion. Many chat clients (programs that allow chatting to take place) have special commands which both set off actions and put them in the appropriate third-person context (e.g., the command "/me" in IRC and ":" in TinyMUCK).
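
As an illustrative aside, the marking conventions described above are regular enough that a program could pick actions out of a chat message. The following is only a rough sketch in Python, not a description of how any actual chat client works. The function names (extract_actions, render_me) are hypothetical, and the "* nick action" display format for the IRC "/me" command is an assumption, since clients differ in how they show actions.

```python
import re

# Hypothetical pattern covering only the two conventions named in the text:
# actions enclosed in brokets (<grin>) or in asterisks (*giggle*).
ACTION_PATTERN = re.compile(r"<([^<>]+)>|\*([^*]+)\*")


def extract_actions(message: str) -> list[str]:
    """Return the action descriptions set off by brokets or asterisks."""
    # findall yields (broket_text, asterisk_text) pairs; one side is empty.
    return [broket or starred for broket, starred in ACTION_PATTERN.findall(message)]


def render_me(nick: str, message: str) -> str:
    """Approximate how a client might display an IRC-style "/me" command.

    Assumption: actions are shown as "* nick action" and ordinary
    messages as "<nick> message"; real clients vary.
    """
    prefix = "/me "
    if message.startswith(prefix):
        return f"* {nick} {message[len(prefix):]}"
    return f"<{nick}> {message}"


if __name__ == "__main__":
    print(extract_actions("That was dumb. <grin>"))
    print(render_me("Platypus", "/me waves"))
```

The pattern is deliberately naive: it would also treat other uses of asterisks or angle brackets (emphasis, for instance) as actions, so telling these apart would still require a human reader's judgment.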

It would appear that speech events and chatting share some significant features. Due to their different natures, however, there are slight differences in how these features are employed and what they mean, and this could provide worthwhile information regarding both events.

The use of emotive actions in particular is a unique feature of chatting. Expressions and social actions in real life, such as blushing, crying, and shivering, and even to some extent laughing, smiling, and tone of voice, are not always under conscious control. In chatting, all such social signals are not only voluntary, but explicitly voluntary, despite being set off from the body of the conversation. While a participant in spoken conversation may find it believable that a burst of laughter or deep blush was an unintended reflex, perhaps even an exposure of the individual's raw emotions, s/he can harbor no such illusions about the nature of those signals which his/her conversational partner chooses to display in chatting. This is especially intriguing when the actions in question are indeed ones that would not normally be under conscious control. The speaker may be accurately reporting reflexive responses in real life, though s/he could clearly choose not to, or s/he may simply be trying to express a lesser degree of the emotions usually symbolized by those reactions (embarrassment/coyness, sadness, fear/excitement, surprise, &c.). Either way, the producer knows what s/he is choosing to express, the receiver knows that s/he knows this, and s/he knows that the receiver knows.

This feature makes chatting a potentially useful venue for examining attitudes towards particular expressions/actions, once some kind of relationship between actions in chats and their counterparts in the "real world" is experimentally determined. Questions such work might address include: when do people choose to utilize actions in chats? Is this related to when they express the related actions in face-to-face interactions? How often are actions reported on-line being physically expressed by the person at the keyboard? If reported actions and physical actions contradict, could this be a way of separating more voluntary actions from less voluntary actions, or determining what is thought to be socially desirable in given circumstances? In a broader sense, now that it is clear that speech events and chatting are in some ways comparable, their relationship deserves to be addressed in more detail. What does chatting allow that speech events do not? What do speech events permit that chatting does not? By considering both situations, it is hoped that more can be learned about each.

