The Intonational Variation in Arabic (IVAr) corpus uses a multi-layered set of data collection instruments, following in the footsteps of the Intonational Variation in English (IViE) project. A range of different tools is used to collect speech recordings, systematically varying certain variables of interest and controlling others. The table below lists the tools used and the variables and/or speech style that each is designed to yield.
Data in the IVAr database
| scripted dialogue|| The scripted dialogue yields multiple read speech realisations of different utterance types, in a controlled dialogue context to elicit the intended meaning: |
- broad focus declarative (dec)
- wh-question (whq)
- yes-no-question (ynq)
- coordinated question (coo)
- information focus declarative (inf)
- identification focus declarative (idf)
- confirmation focus declarative (con)
The position of the stressed syllable in the last lexical item in each sentence is systematically varied (final/penult/antepenult). The last lexical item in each sentence is (near-)identical in all dialects, permitting comparison of nuclear accent contours across utterance types/dialects.
| narrative|| A read narrative yields data in which different speakers of the same dialect all produce the same sentences, within a narrative sequence.|
Later, the speaker is asked to tell the story again from memory. The retold narrative yields at least some instances of the same or similar sentence produced semi-spontaneously by different speakers of the same dialect.
| map task|| The map task yields semi-spontaneous realisations of different utterance types; mismatches are included in the maps in order to naturally generate questions in the conversation. The names of landmarks on the map contain mostly sonorant speech sounds, and the position of the stressed syllable is systematically varied in the final word of each landmark name.|
| Sense Relation Network|| This tool collects local variants of vocabulary items known to vary across Arabic dialects. The data permits independent confirmation of which dialect is spoken by the participants. |
| free conversation|| Free conversation between two participants, on one or more of the following topics: what is shared/unique about your dialect of Arabic, cooking/food, fashion, cars or sport. |
To support our analysis of the core database materials, we also collected some read speech experimental sentences to elicit phonetic variables and a short passage in Modern Standard Arabic. Participants were optionally invited to provide recordings in English, for use in our work on second language acquisition of phonology. Finally, we collected data with 2-4 speakers of each dialect using an Arabic version of a Dialogue Completion Task tool, based on those used in prior work on Spanish and Portuguese. A subset of this additional data will be made available to researchers on request after the IVAr database is launched.
- Blum-Kulka, S., House, J. & Kasper, G. 1989. Investigating cross-cultural pragmatics: An introductory overview. In S. Blum-Kulka, J. House & G. Kasper (eds.), Cross-cultural pragmatics: Requests and apologies, pp. 1-34. Norwood, NJ: Ablex.
- Llamas, C. 2007. A new methodology: Data elicitation for regional and social language variation studies. York Papers in Linguistics Series 2, 8: 138-163.