Information Structure and Spoken Language : Cross-Linguistic Comparative Studies

International Workshop

July 9-10, 2011, in conjunction with the 2011 LSA Linguistic Institute, Boulder, Colorado


The goal of this workshop is to analyze and compare some of the main processes involved in Information Structuring in several typologically different languages (Indo-European, Finno-Ugric and Uralic, African, Amerindian, Austronesian, Indonesian, Semitic languages ; Basque, Thai, Vietnamese) and from distinct theoretical and methodological approaches, mainly

• the formal semantic approach that gives a central role to syntax (e.g. configurational languages and their relation to truth-conditional ambiguity)

• the functional and enunciative approach, which hypothesizes a double tripartite organization of information for both sentences and discourses. Initial and Final Detachments (= « left/right » dislocations) and discourse markers will be scrutinized as core processes of oral information strategies, as well as the structure of Question-Answer pairs and its effect on the recycling of emotions into an intentional meaning

• Role and Reference Grammar, which investigates the role played by Information Structuring in explaining cross-linguistic differences in grammatical systems.


M.M.Jocelyne Fernandez-Vest, CNRS-LACITO & TUL, Université Paris 3-Sorbonne Nouvelle, scientific coordinator of the CNRS-TUL program ISTY (« Information Structuring and Typology : Detachment constructions in languages and discourses ») ;

• in collaboration with Robert Van Valin Jr, coordinator of the research group « Syntax, typology and information structure » (Max Planck Institute for Psycholinguistics, Nijmegen).

Registration Information :

There is no registration fee for the workshop, but we would appreciate it if participants pre-registered to help with planning. Please do so no later than July 2, 2011 by emailing the organizers your name and affiliation.

Contact and Links :

For more information, contact the organizers.

Potentially of interest to our participants are the LSA2011 Courses LING 7800-025, LING 7800-073, LING 7800-075.


Saturday, July 9th
9 :00-9 :15
Introduction :
Information Structuring and Typology (ISTY), a program of the CNRS - Typologie et Universaux Linguistiques (TUL) Research Federation

9 :15-11 :00
– Annie MONTAUT (INALCO, CNRS-SeDyL) :
Topicalization in Hindi : Word Order, discourse particle ‘to’ and intersubjectivity

– Fida BIZRI (INALCO, CNRS-SeDyL) :
Information Structuring Strategies in Pidgin Madam (an Arabic-based pidgin with Sinhala substratum)

11 :15-13 :00
– Dejan MATIC (Max Planck Institute, Nijmegen) & Irina NIKOLAEVA (SOAS, London) :
Verb focus and realis marking in Tundra Yukaghir

– Robert VAN VALIN Jr (University of Düsseldorf ; Max Planck Institute, Nijmegen ; University of Buffalo)
The Role and Reference Grammar approach to information structure and its application to selected Amazonian languages

13 :00-14 :15

14 :15-16 :00
– Saskia VAN PUTTEN (Max Planck Institute, Nijmegen) :
Marked topics and contrast in Avatime (Kwa)

– Ricardo ETXEPARE (CNRS-IKER, Université de Bayonne)
Basque focus at the interfaces

16 :00-17 :00
Social hour

Sunday, July 10th
9 :00-10 :45
– Peter SLOMANSON (University of Aarhus, Denmark)
Focus change as a precursor to morphosyntactic change : the case of Sri Lankan Malay

– M.M.Jocelyne FERNANDEZ-VEST (CNRS-LACITO, Paris 3 & Paris 4) :
Detachment constructions as fundaments of Information Structuring – evidence from French, Finnic, Samic

11 :00-12 :45
– Danh-Thành DO-HURINVILLE (INALCO, CNRS-MoDyCo) :
Focus Particles in Vietnamese

– Jirasak ACHARIYAYOS (Paris 3, CNRS-LACITO) :
Information Structuring : Thematization in Thai

12 :45-14 :00

14 :00-15 :15
– Marie-Ange JULIA-SOULETIS (CPGE Henri IV, Paris 4-Centre Ernout):
Thematization in Latin illustrated through Drama

– Pablo KIRTCHUK (CNRS-LACITO, INALCO) :
Pragmatic vs. grammatical mode : utterance internal hierarchy (UIH) in Hebrew and beyond

15 :15-15 :30

15 :30-16 :30
Round table discussion :
Information Structuring between Typology and Universals

CNRS : Centre National de la Recherche Scientifique
EVA : Max Planck Institute for Evolutionary Anthropology
IKER : Centre de Recherche sur la Langue et les Textes Basques
INALCO : Institut National des Langues et Civilisations Orientales
LACITO : Laboratoire de Langues et Civilisations à Tradition Orale
MoDyCo : Modèles, Dynamiques, Corpus
Paris 3 : Université Sorbonne-Nouvelle
Paris 4 : Université Paris-Sorbonne
SeDyL : Structures et Dynamiques des Langues
SOAS : School of Oriental and African Studies


Jirasak ACHARIYAYOS (Paris 3, CNRS-LACITO) : Information structuring : Thematization in Thai

Information structure is generally understood as the management and organization of information in discourse. Speakers and writers have at their disposal a variety of techniques for controlling the presuppositions that they wish to maintain and the new relationships that they wish to assert about them. These techniques vary cross-linguistically : prosody, morphology, and syntactic structure (Slayden, 2010). In this talk I will show that Thai, as an isolating language with strict word order, uses a number of discourse particles to organize information. The focus will be on Detachment in Thai. Detachment generally concerns the movement of an element to the initial or final position in an utterance. This movement can be either unmarked or marked. Some particles, such as nà and ná, are used to mark the detached element.

(1) unmarked initial detachment
năŋsɯ̆ː luŋ sɯ́ː maː sɔ̆ːŋ lêm
book uncle buy come two CL
(2) marked initial detachment
năŋsɯ̆ː nà luŋ sɯ́ː maː sɔ̆ːŋ lêm
book TOP uncle buy come two CL
« (As for) the book, Uncle bought two books »
(3) marked final detachment
luŋ sɯ́ː maː sɔ̆ːŋ lêm năŋsɯ̆ː nà
uncle buy come two CL book TOP
« Uncle bought two, the books »

Cheung, L.Y.L. (2009). Dislocation focus construction in Chinese. Journal of East Asian Linguistics, 18(3), 197-232.
Fernandez-Vest, M.M.J. (1994). Les Particules Enonciatives dans la construction du discours. Paris: PUF.
Lambrecht, K. (1994). Information Structure and Sentence Form. Cambridge: CUP.
Slayden, G. (2010). An Information Structure Annotation of Thai Narrative Fiction. Washington.
Vassawanont, P. (1996). Datchaniiparitchet na naibotsonthanabaepkaneng phasathai khrungthep (Discourse markers in Bangkok Thai casual conversations). Chulalongkorn University, Bangkok.
Warotamasikkhadit, U. (1997). Fronting and backing topicalization in Thai. Mon-Khmer Studies, 27, 303-306.

Fida BIZRI (INALCO, CNRS-SeDyL) : Information structuring strategies in Sinhala and in Pidgin Madam (an Arabic-based pidgin with Sinhala substratum)

Sinhala is the only Indo-Aryan language (along with the related Divehi of the Maldives) where the focused construction involves a special marking of the tensed verb, leading to a reorganization of the word order, and interacting with some important grammatical processes (such as Wh-question formation and negation). Although verb-final statements are statistically preferred, Sinhala cannot be considered a rigidly verb-final language. Pidgin Madam is a new Arabic-based pidgin with Sinhala substratum born in the context of (female) labor migration from Sri Lanka to Middle Eastern countries. The language exhibits features of both OV and VO orders, in different contexts. Data from Pidgin Madam raise questions about information structuring strategies in mixed languages and about the impact of both the substratum and the superstratum on the selection of morpho-syntactic features.

Danh-Thành DO-HURINVILLE (INALCO, CNRS-MoDyCo) : Focus Particles in Vietnamese

After briefly presenting some points of view on the focus information (Halliday, 1967, Lambrecht, 1994, Dik, 1997), I will examine Vietnamese focus structures such as « argument focus », « predicate focus » and « sentence focus » by means of the following particles: chinh, cai, thi, toi, nhung, là, rang, mà.
The « argument focus » is used with the markers chinh, cai, toi, nhung, thi. More precisely, chinh, cai, thi focus on the subject, while toi, nhung focus on the direct object. The « predicate focus » functions with là, which focuses on the whole predicate or on part of it. As for the « sentence focus », it works with là and mà.
Most of these focus markers derive from lexemes, which are grammaticalized or pragmaticalized. They are divided into three groups. The first group is composed of cai, thi, rang, toi, which are both grammaticalized and pragmaticalized. The second group includes chinh, nhung, which are only pragmaticalized. As for the third group, it contains là and mà.

Ricardo ETXEPARE (CNRS-IKER, Université de Bayonne) : Basque focus at the interfaces

In this paper I will provide an overview of some of the fundamental syntactic properties of focus constructions in Basque, by reviewing a number of classical observations (see Etxepare and Ortiz de Urbina, 2003 for an introductory work): (i) the adjacency between the focused constituent and the verb; (ii) the parallelism between focused constituents and wh-words; (iii) the existence of possibly different focus positions and of multiple focus constructions (Etxepare and Uribe-Etxebarria, 2008); (iv) the involvement of focus in quantificational constructions of a distributive nature (Etxepare, 2002, 2011). I will then use those data as a basis to compare two broad lines of analysis that have been applied to Basque focus constructions in syntactic theory: (i) the idea that focus constructions are operator constructions targeting a given syntactic position in the clause structure (the so-called “syntactocentric” or derivational analysis, see recently Irurtzun 2007 for Basque); and (ii) the idea that focus is an interface phenomenon, whose syntactic distribution follows from comparison between representations pertaining to different modules of the linguistic faculty (for a recent proposal in this sense, see Arregi 2003).

Ref. • Arregi, K. (2003) Focus on Basque Movements. Unpublished doctoral dissertation, MIT. • Etxepare, R. (2002) “Bare indefinites and distributivity in Basque” In X. Artiagoitia, P. Goenaga and J. Lakarra (eds) Erramu Boneta: Festschrift for Rudolf P.G. De Rijk. Supplements of the International Journal of Basque Language and Linguistics. Bilbao: University of the Basque Country. 231-246. • Etxepare, R. (2011) “Minimal Correlatives” Ms. IKER-UMR5478. • Etxepare, R. and J. Ortiz de Urbina (2003) “Focalization” In J.I. Hualde and J. Ortiz de Urbina (eds) A Grammar of Basque. Berlin: Mouton. 459-515. • Etxepare, R. and M. Uribe-Etxebarria (2008) “Negation and focus in Basque” In X. Artiagoitia, and J. Lakarra (eds) A Festschrift for Patxi Goenaga. Supplements of the International Journal of Basque Language and Linguistics, 51. Bilbao: University of the Basque Country. 287-310. • Irurtzun, A. (2007) The Grammar of Focus at the Interfaces. Unpublished doctoral dissertation, University of the Basque Country.

M.M.Jocelyne FERNANDEZ-VEST (CNRS-LACITO, Paris 3 & Paris 4) : Detachment constructions as fundaments of Information Structuring – evidence from French, Finnic, Samic

This talk illustrates, in a functional, typological and evolutionary perspective, the natural segmentation of spoken language which is manifested by pre- and post-rhematic detached constructions. The methodology, originally elaborated in the 1970s for the study of an interlanguage, Finnish spoken by bilingual Sami, was then applied to Northern Sami (a corpus of conversations and life stories collected in the border region of the Deatnu Valley), still prototypical of pure orality in the 1980s, and later extended to several European languages. With a view to a typology of Questions and Answers, two basic operations are analyzed : how Initial Detachments (ID) and Final Detachments (FD) – formerly « left / right dislocations » (Lambrecht 1994, 2001) – are used for building binary strategies (1/ Theme-Rheme // 2/ Rheme-Mneme). It is argued that, while the pivot of IS is the Minimal Communicative Utterance (MCU, a Rheme), in impromptu speech ID and FD regularly combine into iconic figures of cohesion, modulated by Discourse Particles. A comparison is made, from the point of view of internal contrastivity (oral vs. written style), with another Finno-Ugric (FU) language, Finnish. The role of ID and FD in different types of discourses and texts is addressed, as well as their evolution when the oral language is grammaticized and/or influenced by contact languages. As a counterpoint to suspicions of exoticism, French is included in the demonstration.

Ref. : Fernandez-Vest M.M.J., 1987, La Finlande trilingue, 1. Le discours des Sames. Oralité, contrastes, énonciation, Préface de Claude Hagège, Paris, Didier Erudition. • 2009, « Typological evolution of Northern Sami : spatial cognition and Information Structuring », The Quasquicentennial of the Finno-Ugrian Society, Jussi Ylikoski (ed.), Helsinki, Mémoires de la Société Finno-Ougrienne – SUST 258, 33-55. • 2011a, « Typological evolution of information grammar in Uralic », in Csúcs Sándor (ed.), Papers from the XIth International Congress of Finno-Ugric Studies, 08-15 August 2010, Piliscsaba, Budapest, Reguly Társaság. • 2011b, Detachments for cohesion. Toward an information grammar of oral languages, 250 p., Ms.

Marie-Ange JULIA (CPGE Henri IV, Paris 4-Centre Ernout) : Thematization in Latin illustrated through Drama

This paper intends to show that the Information Structuring (IS) approach can be applied to classical languages, which did indeed have different registers, and, conversely, that this study is fruitful for our knowledge of modern languages. Latin Drama represents life and features a spoken language. The line – often short – is mainly based on thematization or focalization devices. In the context of a dialogical activity, these devices provide thematic continuity and organize or maintain the interaction. We shall concentrate on the IS role of case endings and discourse markers, one of the most salient features of Plautus’ theater, e.g. :

Mulier profecto natast ex ipsa Mora.
woman DM to be born Ind.perf.PS3 PREP itself slowness
« The woman really, she was born from Slowness itself. » (Plautus, Miles gloriosus, v. 1292)

Language is inherently dialogical: the relational and pragmatic challenge of the spoken language is essential, as shown by the multiplicity of thematization devices. Similarities between classical corpora and modern corpora are not only related to the colloquial register. They identify interlocutive strategies which, although partly specific to theater (“simulated speech”), have much in common with the ordinary spoken language. « All the world’s a stage, And all the men and women merely players. » (Shakespeare, As You Like It, II, 7).

Pablo KIRTCHUK (CNRS-LACITO, INALCO, Paris) : Pragmatic vs. grammatical mode : Utterance Internal Hierarchy (UIH) in Hebrew and beyond

I show (a) the correlations between intonation, prosody and pragmatic constituent order as far as UIH is concerned, and the iconic link between them; (b) that those factors and their linguistic expressions override and determine grammatical forms and roles, not the other way round; (c) that the relative importance attributed to each part of the utterance, as well as its communicative and expressive values, depend first and foremost on the speakers’ intention, idiosyncrasy, state of mind, context, relative urgency and the like, and that grammar is not the starting point of speech, in other words that the syntax-first hypothesis is dead wrong. What is at stake is UIH, not structure, as the communication mode we are dealing with is pragmatic-deictic, not grammatical-semantic.
Keywords:
Pragmatic-Deictic mode: (Topic-)Focus, Hierarchy, Utterance, Intonation / Prosody, Motivated, Imposed, Iconic, Pre-rational, Biology, Non-Formal, Tendencies, Induction / Abduction, ‘Hardware’, Ontogeny, Creologeny, Phylogeny, Oral, Spontaneous, Communication, Interaction, Context-dependent, Concrete, Dialogic, 1-2 Person (+ non-person), Deixis, Gestures, (Linguistic cum) Gestural, Lamarck, Darwin, Bühler, Bolinger, Greenberg, Givón, Ochs, Kimura, Lieberman, Maturana, Kirtchuk.
Grammatical-Semantic mode: Subject-Predicate, Structure, Sentence, Syntax, Arbitrary, Conventional, Symbolic, Rational, Mathematics, Formal, Rules, Deduction, ‘Software’, Adult Language, Systematized Language, Present-day Language, Written, Planned, Conceptualization, Thought, Context-free, Abstract, Dialogic or not, Non-Person (+ 1st and 2nd p.), Nouns, Lexemes, Solely linguistic, Saussure, Jakobson, Chomsky.

Dejan MATIC (Max Planck Institute, EVA, Leipzig) & Irina NIKOLAEVA (SOAS, London) :
Verb focus and realis marking in Tundra Yukaghir

Tundra Yukaghir has a verbal particle mə(r)=, which has been described in the literature as a marker of declarative illocutionary force, positive polarity, and/or predicate focus. We show that these approaches are inadequate and propose a modal analysis of mə(r)=. This analysis nonetheless highlights the intrinsic connection between realis modality (realis assertion) and focus. In our analysis mə(r)= is a realis marker which existentially binds propositions (provides existential anchoring). The locus of realis marking in Tundra Yukaghir is the main semantic predicate, which, in its turn, must be associated with focus. If focus is on a nominal element, then this element is the main semantic predicate; if it is on the verb, then the verb is the locus of the predication and must be marked by mə(r)=. We show that this analysis is supported by a number of syntactic and semantic arguments.

Annie MONTAUT (INALCO, CNRS-SeDyL) : Topicalization in Hindi : Word Order, discourse particle ‘to’ and intersubjectivity

Word order in Hindi is deemed rigidly SOV/head-final (according to Greenberg’s and Dryer’s criteria), although the language is sometimes described as displaying free order, since all orders are indeed observable without morpho-syntactic restructuring, in various enunciative situations all involving intersubjectivity. Statements with a non-final verb may then represent the postposition (post-comment) of an element already known, or considered secondary in the communicative act, but may in specific contexts and with a specific intonational pattern represent the focalization of the postponed element. As for OSV, it most often corresponds to the focalization of S, but not if O is topicalized. The paper will bear on statements involving a marked topic and the enunciative conditions required for moving V or O, since the predicate too can be topicalized as well as focalized. Besides intonational parameters, the intersubjective setting plays a crucial role, particularly when the topicalized NP is also marked by the topic particle to, in statements where the contrastive topic has scope over either the NP only or the whole statement, depending on the discursive context (polemical or ironic rephrasing, introduction of an argument that is not relevant, introduction of an element that is relevant but not crucial, etc.).

Peter SLOMANSON (University of Aarhus, Denmark) : Focus change as a precursor to morphosyntactic change : the case of Sri Lankan Malay

In conventional sentence organization, events in complex sentences appear as subclauses in the order in which they occur. In many languages, this linear ordering and context are sufficient to convey event sequence. In a smaller number of languages, such as those spoken in Sri Lanka, emphatic focus is typically marked by constituent dislocation to the sentence periphery. In such an information structure system, one function of contrastive finiteness marking is to facilitate the interaction of these two structural patterns. Bilinguals, one of whose languages lacks a finiteness contrast, may be motivated to incorporate a finiteness contrast from the other language by the need to preserve marking of temporal sequence under dislocation for emphatic focus. This constitutes a pragmatic-discourse motivation for a structural accretion that is otherwise accounted for as a case of morphosyntactic complexification, attributed to the not particularly explanatory term "language contact" or to other metaphors that suggest random processes.
Sri Lankan Malay underwent extensive grammatical change during a protracted period of close cultural contact and bilingualism in Sonam, a variety of Tamil spoken natively by Sri Lankan Muslims. Whereas the original Malay varieties spoken in Sri Lanka did not feature a finiteness contrast, languages such as Sonam treat the tensed matrix verb and its clause as in focus by default, and the non-finite participles representing subsequent events as backgrounded. Reassigning focus cannot occur at the expense of the tense asymmetry marked with morphology that conveys finiteness status. That morphology is preserved even when the linear order of constituents shifts. Temporal sequence, verb morphology, and focus marking now work this way in Sri Lankan Malay too. Since three-way tense marking in Sri Lankan Malay is suppressed under negation, the finiteness contrast is restored through the use of distinct plus-finite and minus-finite negation markers. Predictably, a non-finite negation marker will be retained even when its verbal predicate is dislocated for emphatic focus.

Saskia VAN PUTTEN (Max Planck Institute for Psycholinguistics, Nijmegen & International Max Planck Research School for Language Sciences, Radboud University, Nijmegen) : Marked topics and contrast in Avatime (Niger-Congo)

Avatime is a Kwa (Niger-Congo) language spoken in Ghana. In this language, sentence topics are frequently followed by particles indicating some form of contrast. I will describe the semantics and pragmatics of the two particles most frequently used for this purpose: tsyɛ and kɔ.
The particle kɔ indicates in most cases that there is an alternative to the topic it associates with, and that there is an opposition between what is claimed about the associate of kɔ and this alternative.
The particle tsyɛ 'also/too' also marks contrastive topics, usually indicating similarity between alternative topics. Contrary to e.g. the English particle too, there is no strict condition that what is said about the associate of tsyɛ is identical to what is said about an alternative. In cases where this identity condition is not met, tsyɛ functions on the discourse level, contributing to discourse coherence.

Robert VAN VALIN Jr (University of Düsseldorf ; Max Planck Institute, Nijmegen ; University of Buffalo) : The Role and Reference Grammar approach to information structure and its application to selected Amazonian languages

This paper will present the Role and Reference Grammar [RRG] approach to information structure and illustrate it by examining aspects of information structure in three Amazonian languages: Banawá (Arawan; Reinbold 2004, 2007), Wari’ (Chapakuran; Turner 2006), and Karitiâna (Tupi; Everett 2008). The RRG approach is based on Lambrecht’s (1994) theory of information structure, and it integrates information-structural notions into the analysis of clause structure and into the linking between syntax and semantics. This approach will be applied to data from three Amazonian languages, in order to capture the similarities and highlight the differences in the syntax-semantics-pragmatics interface in them.