**Vienna Talking Faces (ViTaFa) database**
------------------------------------------
This Wiki provides comprehensive information and the codebooks needed to understand the contents of the uploaded files.
### **Access to the database** ###
For access to the ViTaFa database, sign the confidentiality agreement and terms of use ("ViTaFa_Confidentiality_Agreement_Terms_of_Use.pdf") and send it to the corresponding author of the database (Christina Krumpholz; chrissikrumpholz@gmail.com), who will then provide you with access to the database.
### **Files overview** ###
This section refers to the file "File_overview.xlsx", a colour-coded overview specifying which files are available in the database. It applies to three formats of the database: audio files, audiovisual video files, and video-only video files. If a file is labelled as missing, it is missing across all three formats; if it is labelled as available, it is available across all three formats. The overview does not apply to the images in the database.
Please read the following codebook to understand the colour coding and the column and row names.
#### **Codebook** ####
- green-coloured/no cell entry = files (audio, audiovisual video, video-only video) are available
- red-coloured/cell entry "NA" = files (audio, audiovisual video, video-only video) are not available
- ID = unique stimulus person identifier code
- Angry = this section refers to all files from the condition "angry".
- Flirty = this section refers to all files from the condition "flirty".
- Happy = this section refers to all files from the condition "happy".
- Neutral = this section refers to all files from the condition "neutral".
- Sad = this section refers to all files from the condition "sad".
- No. of missing files = specifies how many files are missing in the given row (per subject) or in the given column (per content per condition); see the sketch after the content list below
The following column names describe the different content that was spoken.
- A = "A"
- B = "bido"
- D = "Die Leute sitzen vor der Tür"
- E = "E"
- G = "gali"
- H = "Hallo, ich bin's"
- I = "I"
- L = Reading extract from the German version of Snow White: "Es war einmal mitten im Winter, und die Schneeflocken fielen wie Federn vom Himmel herab. Da saß eine Königin an einem Fenster, das einen Rahmen von schwarzem Ebenholz hatte, und nähte. Und wie sie so nähte und nach dem Schnee aufblickte, stach sie sich mit der Nadel in den Finger, und es fielen drei Tropfen Blut in den Schnee."
- K = "Magst du mit mir einen Kaffee trinken gehen?"
- M = "Morgens ist auf den Straßen viel los"
- O = "O"
- S = "Straße"
- T = "Tür"
- U = "U"
- W = "Wie geht's dir?"
### **Example Stimuli** ###
This folder contains example stimuli that serve demonstration purposes only and are not included in the database. They were recorded and edited under conditions comparable to those of the database stimulus material.
### **Objective measurement data** ###
Objective measurements are available for the ViTaFa database (see the manuscript section "Objective measurements"). They include:
- facial landmarks
- measurements of sexual dimorphism
- measurements of distinctiveness
- measurements of fundamental frequency
All measurements can be found in the file "Norming_data.csv". Please read the following codebook to understand column names.
#### **Codebook** ####
- ID = unique stimulus person identifier code
- Gender = stimulus person gender
- image_file = name of the image file that was used for objective measurements; the following measurements of sexual dimorphism and distinctiveness, as well as the facial landmarks, are based on these image files
- sd.vector = sexual dimorphism values calculated using the vector analysis method (Holzleitner et al., 2014)
- sd.discrim = sexual dimorphism values calculated using the discriminant analysis method (Lee et al., 2014)
- distinct_scores = face-shape distinctiveness values calculated following Lee et al. (2016)
- audio_file = name of the audio file that was used for objective measurements; the following fundamental frequency measurements are based on these audio files. The audio files are spoken sentences covering a broad language range, which is important for reliably estimating a person's pitch and fundamental frequency
- audio_duration = duration of the respective audio file
- fund_freq = fundamental frequency, estimated with Praat's autocorrelation method (Boersma, 1993); a conceptual sketch follows this codebook
- landmark_file = name of the file containing 189 facial landmarks placed with Webmorph (DeBruine, 2018). These files can be found in the folder "Facial landmarks" in this registry. All landmark files are based on the images listed in image_file and can be imported into Webmorph or Psychomorph for further manipulation.
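The fund_freq values themselves were computed in Praat; purely to illustrate the autocorrelation idea behind them, here is a minimal R sketch. The file name is a placeholder, and the 75-500 Hz search range is an assumption, not the setting used for the database.

```r
# Conceptual sketch of autocorrelation-based F0 estimation (the actual
# fund_freq values come from Praat's autocorrelation method).
library(tuneR)

wav <- readWave("example.wav")   # placeholder file name
x   <- as.numeric(wav@left)      # mono signal
fs  <- wav@samp.rate

# Search lags corresponding to a plausible speech F0 range (assumed 75-500 Hz).
min_lag <- floor(fs / 500)
max_lag <- ceiling(fs / 75)
ac <- acf(x, lag.max = max_lag, plot = FALSE)$acf[-1]  # drop lag 0

best_lag <- which.max(ac[min_lag:max_lag]) + min_lag - 1
fs / best_lag   # F0 estimate in Hz
```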
### **Ratings of social perception** ###
ViTaFa was validated for several dimensions of social perception through subjective measurements (in the manuscript section "Ratings of social perception"). These dimensions include:
- Attractiveness
- Distinctiveness
- Beauty
- Sexual Dimorphism
- Dominance
- Health
- Sexual Attractiveness
- Sympathy
- Trustworthiness
Rating data for each stimulus person across all dimensions are summarized in the file "Ratings_Social_Perception.csv". Each rating is a combined value based on ratings of neutral images and audiovisual videos with the phrase "Hallo, ich bin's" (between-subjects design; for more information refer to the linked publication).
Please read the following codebook to understand column names.
#### **Codebook** ####
All ratings were given on a 7-point Likert scale ranging from 1 to 7.
- ID = unique stimulus person identifier code
- variable = which dimension was rated
- n = number of ratings per stimulus
- min = minimum rating given for this stimulus
- max = maximum rating given for this stimulus
- median = median value
- iqr = interquartile range between first and third quartile
- mean = mean value
- sd = standard deviation
- se = standard error of the mean
- ci = 95% confidence interval of the mean (see the sketch at the end of this section)
Analyses are available as an HTML-rendered R Markdown file: R_Codes/Ratings_of_Social_Perception.html
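For orientation, here is a minimal R sketch of how summary statistics of this kind are typically computed from raw ratings. The long-format input and the reading of ci as the half-width of the 95% confidence interval are assumptions; the authoritative values are those in "Ratings_Social_Perception.csv" and the analysis file above.

```r
# Minimal sketch of the summary statistics listed in the codebook.
summarise_ratings <- function(x) {
  n  <- length(x)
  s  <- sd(x)
  se <- s / sqrt(n)                      # standard error of the mean
  data.frame(
    n = n, min = min(x), max = max(x),
    median = median(x), iqr = IQR(x),
    mean = mean(x), sd = s, se = se,
    ci = qt(0.975, df = n - 1) * se      # assumed: half-width of the 95% CI
  )
}

# Example with simulated 7-point ratings:
set.seed(1)
summarise_ratings(sample(1:7, 30, replace = TRUE))
```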
### **Emotion validation ratings** ###
ViTaFa was validated in terms of emotion recognizability. The file "Emotion_Recognition_per_File.csv" summarizes validation data for each file, i.e. how well the emotion in each file was recognized by raters.
Please read the following codebook to understand column names.
#### **Codebook** ####
The experiment was a forced-choice task in which participants were asked which emotion they thought was shown and had to select the answer option they considered most suitable. They could choose between neutral, angry, sad, happy, flirting, and not applicable (in which case they had to indicate an alternative).
- file_name = unique file name identifier
- n_recognized = number of times that the emotion was correctly recognized
- proportion_correct_score = proportion correct score per file, i.e. the number of times the emotion expressed in the respective file was recognized correctly, divided by the total number of responses given for that file (n_recognized / n_total; see the sketch after this codebook)
- sd = Standard deviation of proportion correct scores
- target_emotion = emotion category that was intended by actor
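As an illustration of the n_recognized / n_total computation, here is a minimal R sketch based on a hypothetical raw response table; the file names and responses in it are made up.

```r
# Hypothetical raw response table: one row per rating.
responses <- data.frame(
  file_name      = rep(c("f01_angry", "f01_happy"), each = 5),
  target_emotion = rep(c("angry", "happy"), each = 5),
  response       = c("angry", "angry", "sad", "angry", "angry",
                     "happy", "happy", "happy", "flirting", "happy")
)

responses$correct <- responses$response == responses$target_emotion

n_recognized             <- tapply(responses$correct, responses$file_name, sum)
n_total                  <- tapply(responses$correct, responses$file_name, length)
proportion_correct_score <- n_recognized / n_total
proportion_correct_score
```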
When participants chose the option "not applicable", they had to indicate the emotion that they thought best fit the expression. Some of these responses were synonyms of, or very similar to, the target emotions and were therefore recoded post hoc by us. An overview of this recoding can be found in the file "other_emotions.xlsx".
Please read the following codebook to understand column names.
#### **Codebook** ####
- Var1 = participants' free responses
- Freq = how often participants gave this response
- recode = if the response was close enough to one of the target emotions, this column indicates which target emotion it was recoded to
- Cluster = for recoding, we defined clusters (= categories) comprising the target emotions plus additional categories for responses that were mentioned more often
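A minimal R sketch of such a recoding step; the lookup entries and free responses below are hypothetical examples, not the actual contents of "other_emotions.xlsx".

```r
# Recode free responses via a lookup table (Var1 -> recode); hypothetical entries.
lookup <- data.frame(
  Var1   = c("wütend", "verliebt", "fröhlich"),  # German free responses
  recode = c("angry", "flirting", "happy")
)

free_responses <- c("fröhlich", "wütend", "gelangweilt")
lookup$recode[match(free_responses, lookup$Var1)]
# Unmatched responses (here "gelangweilt", i.e. "bored") stay NA:
# no target emotion fits closely enough.
```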
File "recognition_per_actor.csv" summarizes validation data for each actor (= stimulus person), i.e. how well the emotion expression for each actor respectively was recognized by raters.
#### **Codebook** ####
- actor = unique stimulus person identifier code
- mean = proportion correct score, i.e. the number of correctly categorized emotions for this actor divided by the number of responses given for this actor in total
- sd = standard deviation of proportion correct score
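The per-actor score pools all responses for an actor; grouping additionally by target emotion yields the per-actor-per-emotion scores described next. A minimal R sketch, reusing the hypothetical responses table from the sketch above and adding a made-up actor column:

```r
# Pooled proportion correct per actor (and per actor per emotion).
responses$actor <- "S001"   # hypothetical stimulus person identifier

aggregate(correct ~ actor, data = responses, FUN = mean)

# Grouping by actor AND target emotion gives the per-emotion scores:
aggregate(correct ~ actor + target_emotion, data = responses, FUN = mean)
```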
File "recognition_per_actor_per_emotion.csv" summarizes validation data for each actor (= stimulus person) across target emotion categories, i.e. how well the emotion expression for each actor in each emotion category resepctively was recognized by raters.
#### **Codebook** ####
- file_name = unique stimulus person identifier code
- target_emotion = the intended target emotion category
- mean = proportion correct score, i.e. the number of correctly categorized emotions for this actor within a specific emotion category divided by the number of responses given for this actor in this emotion category
- sd = standard deviation of proportion correct score
Analyses are available as an HTML-rendered R Markdown file: R_Codes/Emotion_Validation_Ratings.html
### **Actor demographics** ###
We assessed demographic data of the actors represented in ViTaFa. Please read the following codebook to understand column names.
#### **Codebook** ####
- ID = unique stimulus person identifier code
- Age = age of the actor at the time of recording
- Relationship_status = whether, and in what kind of relationship, the actor was involved at the time of recording
- Menstruation = for female actors, the stage of their menstrual cycle at the time of recording; the question asked was "When was the first day of your last period?"
- Practice = Actors self-reported how often they had practiced the script before the recording.
- Gender = self-reported gender
### **Available databases overview** ###
The file "Available_Databases_Overview.xlsx" is meant to give a broad overview of existing databases used for person perception research, specifically face and voice. However, the authors give no guarantee that this overview is complete, it is solely a guide that others researchers can draw from when searching for existing databases and their characteristics.
Please read the following codebook to understand column names.
#### **Codebook** ####
- Name = Name of the database
- Face images = Does the database contain face images and if yes, how many (and how many stimulus individuals)?
- Measurements = Does the database (or a related publication) provide any subjective or objective measurement data?
- Body images = Does the database contain body images?
- Face videos_muted = Does the database contain dynamic video-only stimulus material of the faces (i.e. muted videos)?
- Face_videos_with_sound = Does the database contain dynamic audiovisual stimulus material of the depicted individuals (i.e. audiovisual videos)?
- Voice recordings = Does the database contain voice recordings and if yes, how many (and how many stimulus individuals)?
- Creation Date = When was the database created or published/described?
- Website = Where can the database be retrieved from?
- Related publication = In which scientific contribution was the database (or its validation etc.) described; how can it be cited?
- colour/bw = If the database contains visual material, is this in colour or black and white?
- purpose = If the database specifies them, what is/are the main usage purpose(s) of the database?