Live interaction sessions on Teams:

Apr 9, 15:30-16:30 (CEST). Please click here: https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZDcxMjQ4NjctZGZjNy00ODkxLWI3MjUtMzhiYzUwN2E5MTZk%40thread.v2/0?context=%7b%22Tid%22%3a%227ef3035c-bf11-463a-ab3b-9a9a4ac82500%22%2c%22Oid%22%3a%22d0ffeb31-e265-4176-8b28-97ebe0a940be%22%2c%22IsBroadcastMeeting%22%3atrue%7d

Apr 15, 15:30-16:30. Please click here: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MjMxNTNiNzYtNDEzYi00YTJlLTg5NGYtY2UyZDBhYjM4MWYw%40thread.v2/0?context=%7b%22Tid%22%3a%227ef3035c-bf11-463a-ab3b-9a9a4ac82500%22%2c%22Oid%22%3a%22d0ffeb31-e265-4176-8b28-97ebe0a940be%22%2c%22IsBroadcastMeeting%22%3atrue%7d

Feel free to contact us at zygis@leibniz-zas.de and fuchs@leibniz-zas.de if you have any questions.

Aside from the general consensus that gestures are related to the process of speaking (Kendon 1972, McNeill 1992), the motivation for using gestures while speaking is still debated. For instance, according to the trade-off hypothesis, when speaking becomes more difficult, a gesture is more likely to 'take over' some of the communicative load; conversely, when gesturing becomes harder, speakers rely more on speech (De Ruiter et al. 2012). However, this hypothesis is based on voiced speech and focuses on hand gestures. It remains entirely unclear what happens to orofacial expressions when (i) the speech signal becomes degraded, and (ii) speakers do not see each other. More specifically, what happens to oral gestures if the fundamental frequency (F0), a crucial parameter of speech, is not produced, as is the case in whispered speech? How are questions and statements realized if F0 is absent? To what extent do acoustics and orofacial expressions change if speakers whisper and do not see each other? According to the trade-off hypothesis, visible orofacial motion should compensate for the absence of F0.
This is in line with Dohen & Loevenbruck (2008), who showed that orofacial gestures produced while whispering decidedly enhance the perception of prosodic focus in French. We further hypothesize that orofacial motion is less pronounced when speakers do not see each other.

To test these hypotheses, we conducted a motion capture experiment with 17 native speakers of German (7 male), recording movements of their eyebrows and lip openings (see Fig. 1) in parallel with the acoustic signal in four randomized blocks: (1) normal speech, visible mode; (2) normal speech, invisible mode; (3) whispered speech, visible mode; (4) whispered speech, invisible mode. In the invisible mode, the confederate and the informant were separated by an artificial wall (see Fig. 2). The informant's task was to ask a question or produce a statement in response to a sentence previously pronounced by the confederate. The sentences differed only in their final word, which was strictly controlled and consisted of a bilabial initial consonant followed by an unrounded high, mid or low vowel, e.g. /Mandel/ "almond". Several linear mixed-effects models based on 2566 observations analysed the effects of speech mode [normal, whispered], visibility [visible, invisible], vowel [low, high, mid] and sentence type [question vs. statement], and their interactions, on left and right eyebrow motion, lip aperture, and the duration and intensity of stressed syllables. A random-effects structure was included.

The results reveal that both left and right eyebrows are raised higher in whispered than in normal speech. The eyebrows are also higher when speakers are invisible to each other, and highest in the whispered invisible condition. Furthermore, the right eyebrow is more raised in questions than in statements. As for lip aperture, the results reveal a larger lip opening in questions than in statements. In contrast to eyebrow movement, however, lip aperture is not larger when the interlocutors do not see each other.
Male speakers show larger lip aperture than female speakers. Regarding acoustics, the duration of stressed syllables, sentence-final words and sentences is longer in whispered than in normal speech. Sentence duration is also longer in invisible than in visible conditions. Finally, the intensity of stressed vowels is higher in invisible conditions. Overall, our results suggest compensation effects between degraded acoustic signals and articulatory gestures on various levels: (a) the lack of fundamental frequency is compensated for by raised eyebrows and a larger lip aperture, which may enhance visual cues for the interlocutor, and (b) the lack of interlocutor visibility is also compensated for by larger lip aperture and a raised right eyebrow.
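As a rough illustration of the kind of linear mixed-effects analysis described above, the following Python sketch fits a random-intercept model with statsmodels on simulated data. All variable names, effect sizes and the simplified two-factor design (speech mode and visibility only, random intercepts for speakers) are illustrative assumptions, not the study's actual data or full model specification:

```python
# Hypothetical sketch of a mixed-effects analysis of eyebrow raising,
# using simulated data; NOT the study's actual data or model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_speakers, n_trials = 17, 40  # 17 speakers, as in the experiment

rows = []
for spk in range(n_speakers):
    spk_offset = rng.normal(0, 0.5)  # per-speaker random intercept
    for _ in range(n_trials):
        mode = rng.choice(["normal", "whispered"])
        visibility = rng.choice(["visible", "invisible"])
        # Simulated eyebrow raise (arbitrary units): assumed higher
        # in whispered speech and in the invisible condition.
        raise_amt = (1.0
                     + 0.6 * (mode == "whispered")
                     + 0.4 * (visibility == "invisible")
                     + spk_offset
                     + rng.normal(0, 0.3))
        rows.append(dict(speaker=spk, mode=mode,
                         visibility=visibility, eyebrow_raise=raise_amt))
df = pd.DataFrame(rows)

# Fixed effects for mode, visibility and their interaction;
# random intercepts grouped by speaker.
model = smf.mixedlm("eyebrow_raise ~ mode * visibility",
                    df, groups=df["speaker"])
result = model.fit()
print(result.summary())
```

In the simulated data the fitted coefficient for whispered mode comes out positive and the coefficient for the visible condition negative (relative to the invisible baseline), mirroring the direction of the effects reported above. The actual analysis additionally included vowel and sentence type as predictors.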